Keywords
Omic Sciences, Volatile Organic Compounds, Biomarkers, Volatilomic, Xenovolatilomic.
This article is included in the Agriculture, Food and Nutrition gateway.
Volatilomics is an omics science, a specific sub-branch of metabolomics, that studies the volatile organic compounds (VOCs) present in a given biological matrix. It has contributed to the identification of new natural compounds and to food safety, since xenovolatilomic studies allow emerging contaminants in food matrices to be evaluated through the biomarkers generated in response to these xenobiotic compounds.
Accordingly, this review examines the scientific advances reported in volatilomic studies. It surveys primary research according to the main instrumental techniques used to characterize the different types of VOCs reported in our country between 2012 and 2022.
Using a qualitative methodology, a search was carried out in the Scopus database to obtain the bibliometric information of the primary research reported during this period. The retrieved studies were then analyzed with respect to the use of volatilomic studies and their current fields of application, the techniques used to obtain these compounds, and the data analysis methodologies established for processing this type of research.
Finally, the present review shows the applicability of volatilomic studies and the inroads this science has made into verifying food safety in different types of matrices. Such studies also allow the examination of the volatile profile formed by the volatile organic compounds that a matrix expresses, together with the ecological role these compounds play in the environment in which they are found.
The term “omics” began to be adopted by various scientific communities in 1980. From then on, the importance of studying diverse sets of biomolecules found in different types of biological matrices became apparent.1 Much research holds that the advent of omics sciences occurred alongside the development of the Human Genome Project (HGP), which facilitated the complete sequencing of human DNA.2 Since then, numerous branches of omics sciences have emerged, each with its own field of action. These include genomics (which studies the collection of genes in DNA), transcriptomics (which investigates different types of mRNA transcripts), proteomics (which analyzes diverse sets of proteins), and metabolomics (which examines different sets of metabolites).1 Scientific advances, showcased through a wide array of research on the application of omics sciences, highlight the significance of technological tools for the ongoing progress of science and society at large. Accordingly, a vast body of research documents the use of omics to probe the biodiversity of our planet. This research covers plants, animals, soil biodiversity, environmental sustainability, and various types of coexisting microorganisms, with each omics discipline contributing its own information.3
At present, genomics and transcriptomics are commonly utilized disciplines. Both are approached through comparative study, yet genomics offers a process of genome characterization, revealing structural and functional conditions, as well as evolutionary processes and similarities among gene families.4 As such, genomics has served as a launching pad in agriculture for the creation of novel biotechnological developments, for instance, in the development of plant improvement programs.4 Transcriptomics, conversely, investigates the various biological processes that occur within an organism to identify genes that are associated with tissues and genetic responses expressed under specific conditions. Likewise, it offers insights into macro-scale evolutionary processes across different species over time.4 Proteomics, distinct from transcriptomics, is charged with understanding proteins at both the structural and functional levels. These biomolecules are vital to life due to their role in various metabolic pathways.5 It includes several sub-disciplines, such as expression proteomics (the identification of the relative abundance of proteins), structural proteomics (three-dimensional analysis of protein conformation), and functional proteomics (the study of interactions, functions, and distribution of proteins within each organism). Therefore, the application of proteomics aids in the analysis of the protein profile by identifying patterns of presence, absence, and changes through the proteome.5
Another facet of omics sciences is metabolomics, which focuses on the analysis and characterization of the many metabolites found within a specific organism.6 Metabolomics is intrinsically connected with the chemical analysis of individual molecules, and it provides various technological tools to characterize more accurately the biomolecules constituting a given matrix.6 Metabolites vary in polarity, functional groups, molecular mass, and chemical stability, among other characteristics.7 Given this diversity, a range of tools is needed to support their identification. Both gas chromatography (GC) and liquid chromatography (LC) enable the separation of the compounds present within a matrix.7 Furthermore, pairing these types of chromatography with identification techniques such as mass spectrometry (MS), nuclear magnetic resonance (NMR), and matrix-assisted laser desorption/ionization (MALDI) has led to techniques like GC – MS, LC – MS, and MALDI – TOF. These techniques enable nuanced characterization of this assortment of chemical compounds.7
Research conducted using omics sciences resonates profoundly with the academic community and society at large. This is attributable to the fact that these contributions pave the way for the discovery of the bioprospecting potential that omics sciences possess for exploring biological characteristics, from the genetic to the molecular level, with significant benefits for health.8 In this context, metabolomics plays a pivotal role in understanding the complex interactions within the metabolite environment as a response to the connection formed between various types of organisms and their surrounding environment.8 Therefore, in a logical progression, a metabolomics study enables an exploratory analysis of the chemical potential that a particular biological matrix may hold. This is performed through a series of stages, such as defining the experimental conditions of the study, extracting metabolites, and then separating and detecting them.8
In the present day, due to numerous technological advancements, new omics sciences have emerged. These include epigenomics,9 metagenomics,10 nutrigenomics,11 connectomics,12 speciomics,13 foodomics,14 petroleomics,15 fluxomics,16 lipidomics,17 and volatilomics.18 Each of these disciplines has a distinct field of study.19 Turning our attention to volatilomics, this discipline arises as a subfield of metabolomics. It is characterized by its analysis of the metabolites that constitute the volatile profile within a given matrix.20 Of late, volatilomic analysis has been gaining traction in the research arena (Figure 1), given the number of research studies reported that employ this tool for quality control and early, non-invasive identification of certain diseases.20 Moreover, volatilomic studies are being conducted for the detection of toxic substances present in certain foods. As such, it ensures the safety of these products, thereby circumventing issues of adulteration and food fraud.20
Consequently, it is crucial to comprehend the practical role that volatilomic studies play in identifying volatile organic compounds (VOCs). These compounds enable the non-invasive and efficient assessment of food quality through the extraction of metabolites that constitute the volatile profile. Moreover, these VOCs are responsible for generating the aroma in a broad variety of foods consumed by society. These compounds are also tied to the ripeness level that food may exhibit,21 or they may be influenced by the presence of external compounds, such as those generated by the use of emerging pollutants—for instance, the presence of pesticides and heavy metals in food. All of this culminates in an analysis of the importance of VOCs in the sensory identity that each type of food can possess concerning its volatile profile.21 Therefore, in this review, our objective is to comprehend the scientific advancements reported in the field of volatilomic studies. To that end, we present a variety of primary research concerning the principal instrumental techniques utilized for the separation and characterization of diverse VOCs reported and applied within our country.
Volatilomics can be defined as a branch of metabolomics dedicated to the study of VOCs and the roles they play in the biological interactions of each organism. Extensive chemodiversity is now known among VOCs, which underscores the importance of studying the volatilome in different matrices.22 VOCs are a diverse group of molecules that are structurally composed of carbon, hydrogen, oxygen, nitrogen, and occasionally sulfur. These compounds have low boiling points, so they volatilize readily at room temperature.23 Volatilomic studies involve a range of steps, encompassing sample preparation, extraction techniques, analytical parameters, instrumentation, and multivariate data analysis.24 These steps are typically executed by combining techniques such as solid-phase microextraction (SPME) and gas chromatography coupled with mass spectrometry (GC – MS).24 Volatilomics currently has diverse applications, including evaluating authenticity, comparing compositions across geographical origins,25 and assessing food quality.24 It is also employed to analyze the ideal conditions of foods and the influence of physicochemical variables on the volatilome, which can trigger the expression of VOCs.26 Furthermore, it permits the non-invasive identification of diseases by characterizing potential biomarkers for early detection in patients at different pathological stages.27 It also contributes to the identification of biomarkers for monitoring toxicity and ecotoxicology in biological matrices exposed to toxic agents.28 Additionally, volatilomics finds application in various chemical-environmental studies.29
Recent research has centered on the early identification of diseases, such as cancer, through non-invasive techniques. These methods allow for the detection of biomarkers, molecules that are expressed during the development of a given disease within an organism.30 Accordingly, research on the chromatographic analysis of exhaled air has expanded with the aim of identifying biomarkers, which in turn enables reference points to be established for early disease diagnosis through the analysis of VOCs.30 To date, a wide array of VOCs has been identified. Table 1 presents various types of compounds currently being studied, including major atmospheric pollutants (MAPs) and harmful pollutants such as polycyclic aromatic hydrocarbons (PAHs).31
Additionally, volatile organic compounds (VOCs) can be classified into semi-volatile organic compounds (SVOCs) and very volatile organic compounds (VVOCs).32 Moreover, the identification of biogenic volatile organic compounds (BVOCs),18,33,34 and microbial volatile organic compounds (MVOCs) is reported.35 The latter are characterized by their role as mediators in both inter- and intraspecific communication processes, serving in signaling mechanisms as well as in the production of metabolites for plant defense.35,36 Furthermore, the prospect of utilizing MVOCs as ecological alternatives to mitigate the use of various types of harmful chemical pesticides is being explored.36,37
There are various analytical techniques for extracting VOCs. Among these, solid-phase extraction (SPE), a sample preparation technique developed in the 1970s, is well documented.38 Several methodologies derived from it have since been reported, including matrix solid-phase dispersion (MSPD), miniaturized solid-phase extraction (M-SPE), and solid-phase microextraction (SPME),38 among others. In 1990, Arthur and Pawliszyn introduced solid-phase microextraction, whose headspace mode (HS – SPME) streamlined the isolation of VOCs from various biological samples.23 Its advantages include the elimination of solvent use, ease of handling, minimal equipment requirements, and speed. It is also amenable to automation, provides good linearity, and exhibits high sensitivity.38 It operates through sorption of the analytes onto the fiber coating followed by their desorption, thereby delivering the compounds of interest for subsequent analysis using various chromatographic techniques, as depicted in Figure 2.
When conducting HS – SPME analysis, it is important to consider several factors that influence the extraction of VOCs. These include the amount of sample used, the extraction time, the desorption time, and the material constituting the microextraction fiber. Examples of fiber coatings include polydimethylsiloxane (PDMS), polyethylene glycol (PEG), polyacrylate (PA), and combinations such as polydimethylsiloxane and Carboxen (PDMS/CAR), divinylbenzene and polydimethylsiloxane (DVB/PDMS), and polydimethylsiloxane, Carboxen, and divinylbenzene (PDMS/CAR/DVB).23 The type of fiber is selected based on the polarity of the analyte of interest, because each combination of materials yields a different polarity and analytes with an affinity for the coating are retained by it. Within SPME, three distinct sub-techniques of analysis can also be identified: multiple headspace solid-phase microextraction (MHS-SPME),39 dynamic headspace solid-phase microextraction (DHS-SPME),21 and static headspace solid-phase microextraction (SHS-SPME).40,41
In a volatilomic study, metabolites are extracted, separated through instrumental techniques such as GC and LC, and detected via techniques like MS, nuclear magnetic resonance (NMR), ultraviolet (UV) spectroscopy, or infrared (IR) spectroscopy. The resulting data then undergo a preprocessing step that performs peak deconvolution, which resolves overlapping chromatographic and spectral signals so that each signal can be attributed to a single chemical compound.42 Given present-day scientific and technological advances, these extraction techniques have been coupled with novel methodologies for the separation and identification of VOCs. Some examples are presented in Table 2.
Analysis matrix | Extraction technique | Separation and identification technique | Reference |
---|---|---|---|
Merlot wine | SPME | GC – FID; GC – MS; GC – O – FID | (Welke et al., 2022)131 |
Ginger | HS – SPME | HS – GC – MS; E-nose | (D. Yu et al., 2022)132 |
Meat fat | SPME | GC × GC – QTOF/MS; Flash E-nose | (J. Wang et al., 2022)133 |
Figs | FlavourSpec | GC – IMS | (Liu et al., 2022)134 |
Dairy fermentation (yogurt and kefir) | HS – SPME | HS – GC – HTIMS | (Capitain et al., 2022)72,20 |
Cabernet Sauvignon wines | HS – SPME | GC – Quadrupole/Orbitrap – MS; FPD; PFPD; NPD | (Liu, Yaran et al., 2022)135 |
Commercial bog blueberry wines | SPME | GC – Quadrupole/Orbitrap – MS | (Lin et al., 2022)55 |
Ossau-Iraty DOP cheese | Tedlar bags | SIFT – MS | (Reyrolle et al., 2022)136 |
Soy sauce | SPME | GC – Q – MS – O; GC – Orbitrap – MS | (Feng et al., 2020)137 |
Citrus juice | HS – SPME | GC – MS – IMS | (Brendel et al., 2021)138 |
Feijoa | HS – SPME | GC – MS | (Baena-Pedroza et al., 2020)139 |
Lulo | HS – SPME | GC – MS | (Corpas-Iguaran et al., 2017)140 |
Cherry tomato | HS – SPME | GC – MS | (Londoño-Giraldo et al., 2021)141; (Baena-Pedroza et al., 2021)128 |
Fine cocoa beans | HS – SPME | GC – MS | (Escobar et al., 2021)142 |
Natural waters (trihalomethanes) | HF – SBME | GC – ECD | (Ladino et al., 2014)143; (Correa et al., 2015)144 |
Concurrently, new combined techniques that leverage various technological advancements have been reported. These techniques often employ sensor-based tools such as the electronic nose (E-nose).24 The E-nose facilitates the evaluation and identification of VOCs through electronic processes and has supported the growth of specific disciplines such as olfactometry (O).43 This field has been explored using E-noses,44,45 including E-noses of the metal oxide semiconductor (MOS) type46 and Flash E-noses, which combine an electronic nose with flash gas chromatography.47 In addition, extraction systems like SPME have been combined with gas chromatography, mass spectrometry, and olfactometry (HS – SPME – GC – MS – O).48 New couplings involving two-dimensional chromatography with olfactometry (GC × GC – O) are also being reported.49,50 Studies on the diverse applications of the electronic nose for identifying these compounds underscore the use of chemometric tools for analyzing the resulting data. Such tools include the support vector machine (SVM),51 the extreme learning machine (ELM),52 backpropagation (BP) neural networks,52 and rapid discrimination models based on artificial neural networks (ANN), such as Kohonen self-organizing maps (SOM).53–55 These tools facilitate the design of classification models for pattern recognition in the E-nose system and the identification of VOCs.52 All of this contributes to progress in characterizing biomarkers for the non-invasive diagnosis of diseases such as cancer and chronic obstructive pulmonary disease (COPD).56 These analyses therefore enable experimentally obtained information to be cross-checked against various active metabolic pathways. These pathways occur in different types of organisms and may be altered in the presence of emerging contaminants, which allows response metabolites and their role within biochemical pathways to be analyzed.57
Once data from an E-nose, GC × GC – O, or GC – MS analysis are obtained, they undergo preprocessing and data treatment. The aim is to align, identify, and validate the data through either supervised or unsupervised multivariate statistical analysis techniques.42 A variety of techniques have been developed for analyzing this type of volatilomic data, with reports citing the use of Principal Component Analysis (PCA),58 Linear Discriminant Analysis (LDA),59 Partial Least Squares regression (PLS),60 Partial Least Squares Discriminant Analysis (PLS-DA),61 Orthogonal Partial Least Squares analysis (OPLS) and Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA),62 Hierarchical Cluster Analysis (HCA),63 Support Vector Machine (SVM),51 Non-Targeted Screening Workflow (NTS-Workflow),64 Random Forest (RF),65,47 Multilayer Perceptron (MLP),47 Multivariate Curve Resolution with Alternating Least Squares (MCR-ALS),66 Partial Least Squares Discriminant Analysis with a kernel algorithm (Kernel – PLS),67 the k-Nearest Neighbors method (k-NN),68 Principal Component Analysis combined with the k-Nearest Neighbors algorithm (PCA-kNN),69 Quantitative Descriptive Sensory Analysis (QDA),70 Artificial Neural Network analysis (ANN),71 and Non-Negative Matrix Factorization (NNMF).72 A volatilomic analysis conducted using E-nose technologies often incorporates chemometric methods such as PCA, LDA, DA, QDA, SVM, and ANN as mathematical models for the identification of VOCs73 (Figure 3).
Principal Component Analysis (PCA), Discriminant Analysis (DA), and Partial Least Squares regression (PLS) are widely used multivariate analysis tools in volatilomic studies.24 These statistical tools reduce the dimensionality imposed by the large number of variables. This allows study groups to be discriminated based on the obtained metabolic profile and enables the identification of the most representative metabolites.74 These methods are categorized as either supervised or unsupervised: PCA is unsupervised because it does not use class labels, whereas DA and PLS are supervised because they are fitted against known classes or responses.
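As a rough illustration of how PCA condenses a peak table into a few components, the following minimal sketch autoscales a small matrix of peak areas and extracts the first principal component by power iteration. The peak areas, the variable and function names, and the choice of power iteration are assumptions made purely for illustration; they are not taken from any study cited in this review, where dedicated chemometric software is normally used.

```python
import math

# Hypothetical peak-area table: rows are samples, columns are three VOCs.
# The first three samples stand for one group, the last three for another.
peak_areas = [
    [10.0, 5.0, 1.0],
    [11.0, 5.5, 1.2],
    [9.5, 4.8, 0.9],
    [2.0, 1.0, 6.0],
    [2.5, 1.2, 6.5],
    [1.8, 0.9, 5.8],
]

def autoscale(matrix):
    """Mean-center each column and scale it to unit variance.

    Assumes no column is constant (a constant column would give a zero
    standard deviation).
    """
    n = len(matrix)
    columns = list(zip(*matrix))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in matrix]

def first_principal_component(scaled, iterations=200):
    """Return (loadings, scores) of the first component via power iteration."""
    n, p = len(scaled), len(scaled[0])
    # Covariance matrix of the autoscaled data (equals the correlation matrix).
    cov = [
        [sum(row[i] * row[j] for row in scaled) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    v = [1.0] + [0.0] * (p - 1)  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores are the projections of each sample onto the loading vector.
    scores = [sum(x * l for x, l in zip(row, v)) for row in scaled]
    return v, scores

loadings, scores = first_principal_component(autoscale(peak_areas))
for index, score in enumerate(scores):
    print(f"sample {index}: PC1 score = {score:.2f}")
```

In this toy table the two groups fall on opposite sides of zero along the first component, which is the kind of separation a PCA score plot is used to reveal.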
To distill essential information from sample analyses across GC – MS, LC – MS, and NMR, chemometric and bioinformatic tools are required. The conventional approach has typically employed unsupervised methods such as PCA, clustering, and self-organizing maps (SOM), as well as supervised techniques including RF, kNN, Principal Component Regression (PCR), PLS, and SVM.75 In addition, novel methods such as Pathway Analysis (PA) have been introduced. These techniques enrich metabolomic data and facilitate interpretation in terms of statistically significant pathways. They are classified as either topology-based (TB) methods or non-topology-based (nTB) methods, the latter comprising over-representation analysis (ORA) and functional class scoring (FCS) methods.75
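Over-representation analysis, mentioned above, is commonly framed as a hypergeometric test: given a background set of metabolites, it asks how likely it is to see at least the observed number of significant metabolites in a pathway by chance. The sketch below is one minimal way to compute that tail probability; the function name, argument names, and the counts in the usage line are hypothetical and not drawn from the cited work.

```python
from math import comb

def ora_p_value(background_size, pathway_size, hits_total, hits_in_pathway):
    """Upper-tail hypergeometric probability P(X >= hits_in_pathway).

    background_size: number of metabolites in the reference set
    pathway_size:    number of those metabolites annotated to the pathway
    hits_total:      number of significant metabolites in the experiment
    hits_in_pathway: number of significant metabolites inside the pathway
    """
    upper = min(pathway_size, hits_total)
    total = comb(background_size, hits_total)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, hits_total - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / total

# Hypothetical counts: 200 background metabolites, 10 in the pathway,
# 20 significant overall, 5 of them in the pathway.
p = ora_p_value(200, 10, 20, 5)
print(f"ORA p-value: {p:.4g}")
```

In practice the resulting p-values would be corrected for multiple testing across all pathways examined; that step is omitted here to keep the sketch short.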
Recent scientific advances have produced learning methods that enable more efficient execution of both supervised and unsupervised analyses through Artificial Intelligence (AI). AI comprises algorithms designed to emulate human learning processes, thereby facilitating data-driven decision-making.75 In line with the advancement of AI, models for machine learning, deep learning, and neural networks of varying depth have been formulated. These are designed to build, from a base dataset, a model that can make decisions with high predictive accuracy. Neural networks in particular rely on algorithms that mimic the way a neuron in the human brain responds to stimuli.75 Deep learning encompasses supervised learning, which serves as discriminative analysis and includes the Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU).75 Conversely, unsupervised learning, which serves as generative analysis, features the Generative Adversarial Network (GAN), Autoencoder (AE), Sparse Autoencoder (SAE), Denoising Autoencoder (DAE), Self-Organizing Map (SOM), Restricted Boltzmann Machine (RBM), Artificial Neural Networks (ANN), and the Deep Belief Network (DBN).75 In addition, hybrid learning leverages both supervised and unsupervised methods, generating composite data models through Deep Transfer Learning (DTL) and Deep Reinforcement Learning (DRL).75
As a result, studies on metabolomic and volatilomic data modeling have been conducted. For instance, NMR metabolic data have been analyzed with deep learning techniques based on convolutional neural network architectures, with relevance to food analysis and various biomedical research fields.75 Capitain et al. (2022) analyzed yogurt and kefir samples using HS – GC – HTIMS. The volatile profiles of 33 traditional kefir samples, 13 commercial kefir samples, and 15 commercial yogurt samples were obtained and subsequently analyzed using PCA and NNMF, leading to the identification of compounds such as ethanol, methyl-1-butanol, ethyl acetate, 3-methylbutyl acetate, and aldehydes.72 In a similar vein, other research reports the use of machine learning to identify biomarkers by classifying specific patterns in VOC data, together with the compounds that participate in the metabolic pathways of VOC biosynthesis.76 Such analyses also make it possible to differentiate samples characterized by HS – SPME – GC – MS – IMS, for example distinguishing grapefruit juice from orange juice using techniques such as LDA, SVM, kNN, and PCA. These methods, in turn, provide alternatives for assessing the authenticity of various types of citrus fruits through machine learning.77
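To make the classification step concrete, the sketch below shows a bare-bones k-nearest-neighbors classifier of the kind listed above, applied to two invented signal intensities per juice sample. The feature values, labels, and function name are hypothetical placeholders rather than data from the cited study, and real workflows would use many more variables together with cross-validation.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical intensities of two marker signals per sample.
train_features = [
    (8.1, 1.2), (7.9, 1.0), (8.4, 1.5),   # labelled as orange juice
    (2.1, 6.3), (1.8, 6.0), (2.5, 6.8),   # labelled as grapefruit juice
]
train_labels = ["orange"] * 3 + ["grapefruit"] * 3

prediction_a = knn_predict(train_features, train_labels, (7.5, 1.4))
prediction_b = knn_predict(train_features, train_labels, (2.0, 6.1))
print(prediction_a, prediction_b)
```

Euclidean distance is used here for simplicity; when variables have very different scales, the features are usually autoscaled first so that no single signal dominates the distance.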
Just as metabolomic and volatilomic studies facilitate the identification of new natural products and probe their chemical-ecological relationship with the environment, these omic sciences have also been applied to the discovery of biomarkers. This is achieved by characterizing the xenometabolome or exposome, whereby endogenous metabolites are analyzed together with the xenometabolome induced by chemical xenobiotics that trigger a biological alteration.78 The identification of these potential biomarkers has been extensively explored using techniques such as NMR, GC – MS, LC – MS, LC – HRMS,78 and trace-level xenobiotic analysis using LC – nanospray MS (nUHPLC – nESI – TOFMS). This last technique offers high analytical sensitivity, a low injection volume, and an extensive metabolic analysis range.79 It provides an alternative approach for small-molecule profiling, with good repeatability and reproducibility reported for the analysis of urine and plasma samples.80 An illustration of this application is the study by Al-Salhi et al. (2012), which determined the xenometabolic profile in the plasma of fish of the species Oncorhynchus mykiss after exposure to wastewater. Their work underscores the significance of metabolomic and xenometabolomic studies for identifying potential biomarkers in plasma as a means of monitoring long-term exposure to contaminants.81
Through these types of studies, it is feasible to examine the various metabolic pathways that become disrupted in the presence of a broad array of xenobiotic agents, which can trigger alterations in the subject organism.78 Consequently, research is delving into the creation of platforms for xenometabolite identification, termed as “XenoScan,” employing techniques such as LC – MS. This has led to the development of novel metabolomic approaches using xenobiotic metabolite enrichment. Such an approach enables the analysis of response metabolites produced in the presence of these external agents and the various formation pathways associated with the ecological analysis of intestinal microbiology.82 By utilizing various analytical tools in conjunction with big data analysis and multivariate statistics, it is possible to distinguish grouped samples and identify specific biomarker compounds.24 This demonstrates the fundamental role of biomarkers in evaluating the quality, originality, and predictive analysis of diverse factors affecting food.24 Currently, a range of biomarker types exists, based on the omic science through which their identification has been realized. These include genetic, epigenetic, transcriptomic, proteomic, and metabolomic biomarkers (within which lipidomic and volatilomic biomarkers are found).83,84
In a pathological state, VOCs undergo significant alterations; hence, the study of the volatilome in the body secretions of ill individuals is employed to identify potential biomarkers.85 In this realm, volatilomic studies have been implemented across various biological fluids for pathology detection, aiming towards the early diagnosis of such conditions. For lung cancer, VOC samples were sourced from exhaled breath,85 and VOCs present in urine samples were used for diagnosing breast cancer,86 liver,87 and prostate diseases.88 VOC studies in blood were utilized to diagnose colorectal cancer,89 ovarian cancer,90 and polycystic ovary syndrome.91 In Colombia, a study was conducted on VOCs present in exhaled breath for detecting gastric cancer using GC – MS, presenting a faster and more effective diagnostic approach compared to traditional tests.92 This method was also used in the identification of compounds such as 2-pentanone, 2-heptanone, ethyl propionate, p-xylene, 1,2,4-trimethylbenzene, and o-cresol, considered potential biomarkers for the detection of breast cancer through urine samples.93 Similarly, using GCxGC-TOF/MS, it was possible to identify VOCs such as butyrolactone, 2-methoxyphenol, 3-methoxy-5-methylphenol, 1-(2,6,6-trimethylcyclohexa-1,3-dien-1-yl)-2-buten-1-one, nootkatone, and 1-(2,6,6-trimethyl-1-cyclohexenyl)-2-buten-1-one as potential biomarkers for bladder cancer detection.94
Furthermore, these types of studies are also being implemented for the identification of biomarkers in animals, facilitating disease diagnosis in livestock through non-invasive tests, which utilize breath for the capture of these compounds.95 Consequently, diagnoses of conditions such as ketosis and bovine respiratory disease (BRD) are made feasible, using sample preparation techniques like SPME and VOC identification methodologies such as GC-MS and E-nose.95 Thus, volatilomic studies also offer a swift diagnostic alternative for early disease detection, finding response metabolites to disruptions or alterations induced in the environment. This allows the identification of biomarkers related to senescence,96 maturation, susceptibility, exposure, effect, efficacy, circularity, therapeutic use, safety, omics, and toxicity.97 This broad applicability enables risk assessment in patients and monitoring disease progression in several aspects such as treatment, prevention, diagnosis, therapy response, experimental evaluation, and environmental and epidemiological risk measurement.98
However, these studies are not limited to the determination of VOCs in animals and humans. Such research also extends to medical diagnostic tests, biological analysis of various ecosystems, phytopathogen control, and ecosystem biomonitoring, thereby fostering cross-disciplinary research with chemical ecology.99 These studies also play a critical role in food safety analysis, which is vital to the food industry because flavor and aroma are critical criteria for consumers. For instance, an analysis using HS – SPME – GC – MS/MS identified the volatile fingerprint of apricot samples from Xinjiang, China, along with volatile biomarkers such as damascenone, acetophenone, eucalyptol, myrcenol, 7-hexadecenal, 2,4-dimethyl-cyclohexanol, and salicylaldehyde.100 Regarding flavor analysis, Bai et al. (2013) evaluated off-flavor compounds in fish that arise from the absorption of substances produced by microorganisms. Using in vivo SPME, they identified geosmin and 2-methylisoborneol, which are produced by cyanobacteria and actinobacteria. These compounds contribute earthy and musty flavors to food, particularly fish.101
In volatilomics, rapid VOC detection has been advanced by the advent of electronic noses and tongues. These devices mimic human responses to various tastes and smells and work in conjunction with multivariate, machine learning, and deep learning techniques such as ANN, CNN, PCA, PLS, and SVM.102 These tools are cost-effective, fast, and accurate, and they are highly useful for food quality and safety.102 The devices often employ metal oxide semiconductor sensors (E-nose MOS) to identify volatile gases.103 The Figaro-type E-nose MOS offers excellent sensitivity, rapid response, low cost, and robust durability. It is well suited to monitoring bad smells,104,105 and its data can be interpreted through multivariate analysis to develop classification systems.103 The development of an HS-MS E-nose has also been reported. In this setup, the electronic nose is coupled directly to the headspace, and the compounds are analyzed by mass spectrometry. However, this process must be conducted in an inert atmosphere to facilitate normalization against fragment abundances and the addition of an internal standard.106
Another notable aspect of the E-nose is that it relies on chemical sensors that are not selective for a specific compound. Instead, they detect groups of compounds, which makes the device useful in fields such as medicine, where it captures VOCs present in human breath.107 An odor footprint is obtained as an analysis pattern and then compared with a database, enabling the diagnosis of diseases such as pneumonia107 and lung cancer from exhaled breath.108 These tools are also employed to evaluate the volatile profile of various food matrices109 and any changes that occur during drying processes. For example, the E-nose Flash GC identified compounds such as furaneol, 2-pentylfuran, 3-methylheptane, 3-furaldehyde, and 2,5-dimethylfuran in sautéed tea leaves of Flos Sophorae Immaturus.110,111
Moreover, it is crucial to consider how E-nose data will be interpreted, because multivariate analyses require machine learning methodologies capable of handling large, multidimensional signal data sets. For instance, one study112 combined a sensor array with instrumentation hardware and machine learning methods to analyze samples from wastewater treatment. The authors reduced the dimensionality of the multidimensional sensor-array measurements with the t-distributed Stochastic Neighbor Embedding (t-SNE) method. Like PCA, t-SNE projects data into fewer dimensions, but unlike PCA it builds probability distributions that represent the similarity between neighboring data points in the original space and in the lower-dimensional space. In that study, this resulted in better data separation and clearer grouping into clusters than PCA.112
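As a hedged illustration of the probability distributions mentioned above, the standard t-SNE formulation (not taken from the cited study) can be written as follows. Here x_i denotes an original sensor-array measurement, y_i its low-dimensional counterpart, N the number of samples, and sigma_i a per-point bandwidth set through the perplexity parameter; all of these symbols are notation assumed for this sketch.

```latex
% Similarity of point j to point i in the original (high-dimensional) space
p_{j|i} = \frac{\exp\!\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}

% Similarity in the low-dimensional embedding (Student t kernel, one degree of freedom)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Cost minimized over the embedding coordinates y_i
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

PCA, by contrast, seeks a linear projection that maximizes variance, which is one reason the two methods can separate clusters differently.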
Research has also established methods for monitoring VOCs emitted by the composting process.113 These employ complementary techniques such as the E-nose, GC-MS, and the Odile olfactometer (Odotech). The E-nose identifies compound families, GC-MS characterizes individual compounds, and the olfactometer quantifies odor concentration in odor units per cubic meter (OuE/m3).113 However, the energy consumption and size of the device are often problematic. As a solution, small, low-energy E-noses have been reported. These use a low-power micro-hotplate (MHP) with Pd-SnO2 and Pd-WO3 microparticles that reaches the optimal temperature of 300 °C for detection and proper sensor functioning. A wavelet transform is applied to process the signal and reduce its noise, after which the data are analyzed with the k-NN algorithm and a back-propagation neural network (pN-BPNN).114 Moreover, an E-nose alone is not enough: enhancing flavor in various food matrices requires integrating it with an electronic tongue (E-tongue). Combining GC-MS with these techniques and the corresponding sensory analyses allows continuous improvements in food quality.115 The combination of E-nose, E-tongue, and colorimeters has enabled joint analysis of their data through machine learning methods such as ANN, Extreme Gradient Boosting (XGBoost), Random Forest Regression (RFR), and SVR. These methods are useful for the sensory evaluation of matrices such as Japanese horse mackerel (Trachurus japonicus) during its cooling process over several days, based on the combined electronic sensory information obtained through these techniques.116
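The k-NN classification step mentioned above can be illustrated with a minimal sketch. It is not the implementation used in the cited work: the sensor readings, the class labels (`fresh`, `spoiled`), and the function name `knn_predict` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    train: list of (feature_vector, label) pairs
    query: feature vector with the same length as the training vectors
    """
    # Sort training samples by Euclidean distance to the query and keep k.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up responses of a two-sensor array (arbitrary units), for illustration only.
training_set = [
    ((0.10, 0.20), "fresh"),
    ((0.15, 0.25), "fresh"),
    ((0.20, 0.15), "fresh"),
    ((0.80, 0.90), "spoiled"),
    ((0.85, 0.80), "spoiled"),
    ((0.90, 0.95), "spoiled"),
]

print(knn_predict(training_set, (0.12, 0.22)))
```

In practice the feature vectors would come from the denoised sensor signals, and k would be tuned by cross-validation rather than fixed in advance.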
As the world becomes more globalized and scientific output grows, the volume of published primary research continues to increase.117 This creates a need to synthesize existing information in a well-organized and digestible manner, which is the role of systematic reviews (SRs).118 These reviews follow established procedures that yield succinct and orderly summaries. Their goal is to guide readers through the available information on a specific topic,119 thereby providing a high-quality synthesis of research.120 This is achieved with a replicable methodology that collates emerging primary research on a topic. Notable among these methodologies is PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), which aids in the identification, selection, and synthesis of information.121,122
Creating an SR requires a research question to guide the analysis of pertinent information.123 A standard language is also required, which is often achieved by searching for terms indexed in Medline’s thesaurus, known as Medical Subject Headings (MeSH), or by using ScienceDirect Topics.124 Once the terms are obtained, a search equation is created, typically based on the PICO mnemonic (Population, Intervention, Comparison, Outcome),125 or PICOT (Population, Intervention, Comparison, Outcome, Time).123 The literature search is then conducted, and inclusion and exclusion criteria are applied to the information gathered. These criteria may involve the language, document type, and timeframe, among others. The process concludes with the extraction of information, evaluation of study quality, and synthesis or presentation of findings.123 A bibliometric analysis of the retrieved information is often conducted as well. This analysis identifies trends within the subject matter based on the research and publication of articles, thereby facilitating the examination of different indices of scientific activity.126 It also evaluates how widely the topic is applied across knowledge areas, encouraging interdisciplinary work and the formation of collaboration networks among countries, authors, and co-authors.127,128
The information search was conducted on June 13, 2022, with the objective of analyzing documents published from 2012 to 2022. The search equation, with all its metadata, was entered into the Scopus (Elsevier) database. Figure 4 illustrates the filters used to extract the primary research: time period (2012–2022), document type (Article), publication stage (final), language (English), and country (Colombia) (Figure S1). This resulted in the following search equation for this Systematic Review (SR):
TITLE-ABS-KEY ((“Volatilomic Fingerprint” OR volatilomic OR “volatile profiling” OR “Volatile metabolites”)) AND (LIMIT-TO (PUBYEAR, 2022) OR LIMIT-TO (PUBYEAR, 2021) OR LIMIT-TO (PUBYEAR, 2020) OR LIMIT-TO (PUBYEAR, 2019) OR LIMIT-TO (PUBYEAR, 2018) OR LIMIT-TO (PUBYEAR, 2017) OR LIMIT-TO (PUBYEAR, 2016) OR LIMIT-TO (PUBYEAR, 2015) OR LIMIT-TO (PUBYEAR, 2014) OR LIMIT-TO (PUBYEAR, 2013) OR LIMIT-TO (PUBYEAR, 2012)) AND (LIMIT-TO (DOCTYPE, “ar”)) AND (LIMIT-TO (PUBSTAGE, “final”)) AND (LIMIT-TO (LANGUAGE, “English”)) AND (LIMIT-TO (AFFILCOUNTRY, “Colombia”)).
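To make the structure of the search equation above easier to reuse, the following sketch assembles an equivalent string from its parts. It is only an illustration: the function `build_scopus_query` and its parameter names are hypothetical, the sketch emits straight quotation marks, and it does not contact Scopus.

```python
def build_scopus_query(terms, years, doctype, pubstage, language, country):
    """Assemble a Scopus-style search string from keyword terms and filters."""

    def quote(term):
        # Multi-word phrases are quoted; single words are left bare.
        return f'"{term}"' if " " in term else term

    keyword_clause = " OR ".join(quote(term) for term in terms)
    year_clause = " OR ".join(f"LIMIT-TO (PUBYEAR, {year})" for year in years)
    clauses = [
        f"TITLE-ABS-KEY (({keyword_clause}))",
        f"({year_clause})",
        f'(LIMIT-TO (DOCTYPE, "{doctype}"))',
        f'(LIMIT-TO (PUBSTAGE, "{pubstage}"))',
        f'(LIMIT-TO (LANGUAGE, "{language}"))',
        f'(LIMIT-TO (AFFILCOUNTRY, "{country}"))',
    ]
    return " AND ".join(clauses)


query = build_scopus_query(
    terms=[
        "Volatilomic Fingerprint",
        "volatilomic",
        "volatile profiling",
        "Volatile metabolites",
    ],
    years=range(2022, 2011, -1),  # 2022 down to 2012, as in the equation above
    doctype="ar",
    pubstage="final",
    language="English",
    country="Colombia",
)
print(query)
```

Keeping the keyword terms and the filters as separate arguments makes it straightforward to rerun the same search for another period or country.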
Following this, a bibliometric analysis of the primary studies obtained from Scopus was performed. The records were exported to a reference list file (.bib) and analyzed with the R software (https://www.r-project.org/) and RStudio (https://rstudio.com/) using bibliometrix (https://github.com/massimoaria/bibliometrix), a free and open-source package that extracts bibliometric records of articles, quantifies the number of articles per year (Figure 5), and identifies research trends in the field. In parallel, the records were exported to a comma-separated file (.csv) for analysis in the freely accessible software VOSviewer (https://www.vosviewer.com/). This was used to construct co-authorship networks between countries (Figure 6), keyword networks (Figure 7), and researcher networks (Figure 8), along with an additional analysis of the impact of these researchers (Figure 9).
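The review performed this step with the bibliometrix R package. As a simplified stand-in, and not a reproduction of that workflow, the sketch below counts articles per year directly from the text of a .bib export. The function name `articles_per_year` and the three sample entries are hypothetical placeholders.

```python
import re
from collections import Counter

# Matches BibTeX fields such as: year = {2021}, year = "2021", or year = 2021
YEAR_FIELD = re.compile(r'^\s*year\s*=\s*[{"]?(\d{4})', re.IGNORECASE | re.MULTILINE)


def articles_per_year(bibtex_text):
    """Return a {year: number_of_entries} dict, sorted by year."""
    counts = Counter(int(year) for year in YEAR_FIELD.findall(bibtex_text))
    return dict(sorted(counts.items()))


# Placeholder records, not real references.
sample_bib = """
@ARTICLE{hypothetical2019,
  title = {Placeholder title A},
  year = {2019},
}
@ARTICLE{hypothetical2021a,
  title = {Placeholder title B},
  year = 2021,
}
@ARTICLE{hypothetical2021b,
  title = {Placeholder title C},
  year = "2021",
}
"""

print(articles_per_year(sample_bib))
```

A regular expression is enough for a quick count, but a dedicated BibTeX parser is the safer choice for exports with unusual formatting.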
Figure 5 shows the number of articles published from 1962 to June 13, 2022. The trend indicates an increase in the number of articles addressing the concepts of volatilomics, volatomics, volatile profile, and volatile footprint, which are the primary concepts reported as keywords in volatilomic studies. The analysis also reveals that few articles use the term volatilomics, given that it is a relatively new science within the scientific community. Its field of action is transversal, ranging from health sciences to chemical sciences, with greater application in analytical and instrumental chemistry. As previously mentioned, research in this science has focused on the identification of Volatile Organic Compounds (VOCs), chemical-ecological analysis, the evaluation of food safety, and the potential characterization of biomarkers associated with alterations in metabolic processes.
After applying the filters in Scopus and obtaining the bibliometric information on volatilomics research in Colombia, a co-authorship network graph was produced to highlight collaboration between countries (Figure 6). Primary research on this topic was conducted jointly with countries such as the United Kingdom, China, Germany, Italy, India, Brazil, Portugal, the United States, Spain, Egypt, France, and the Netherlands. The frequency of collaboration with each country is reflected in the size of its node in the graph. Countries with larger nodes therefore appear more actively engaged in this type of research, with the United States and China emerging as the leading contributors. Colombia is represented by a small node, suggesting that only a limited number of studies have explored volatilomics and the advancements it provides.
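The idea behind such a country co-authorship graph can be sketched as follows: node size follows the number of documents per country, and link strength follows the number of documents two countries share. This is a minimal illustration with plain full counting, which may differ from the settings used in VOSviewer; the function `collaboration_network` and the three sample papers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collaboration_network(papers):
    """Count documents per country (node size) and shared documents per pair (link strength).

    papers: iterable of collections of country names, one collection per paper
    """
    documents = Counter()
    links = Counter()
    for countries in papers:
        unique = sorted(set(countries))  # each country counts once per paper
        documents.update(unique)
        links.update(combinations(unique, 2))  # every unordered pair of co-authoring countries
    return documents, links


# Made-up affiliation data, for illustration only.
papers = [
    {"Colombia", "Spain"},
    {"Colombia", "Spain", "Brazil"},
    {"Colombia"},
]

documents, links = collaboration_network(papers)
print(documents)
print(links)
```

Sorting the countries before pairing them ensures that each pair is stored under a single key regardless of the order in which affiliations appear.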
The co-occurrence network of indexed keywords with a minimum occurrence of fifteen (Figure 7) shows the key concepts that are widely used in volatilomics research. These terms reflect the use of chemistry, mass spectrometry, and chromatography to capture different VOCs, and the interpretation of the resulting large data sets through chemometric tests and multivariate analysis techniques. These include principal component analysis (PCA), which is pivotal for identifying possible biomarkers and the role they play in alterations of metabolic pathways. All of this is aimed at identifying such compounds through volatilomics and xenovolatilomics in different types of food matrices. In addition, this research is important for early disease diagnosis.
Figure 8 shows the co-authorship network among the primary researchers who have contributed to the development of the omics sciences in Colombia. Five large groups were found whose members cite one another frequently. Figure 9 then compares, for each author identified in the bibliometric analysis of volatilomic studies in Colombia, the number of articles on volatilomics against the author's impact as measured by the h-index. This analysis revealed researchers with a high impact in metabolomics but few publications directed towards volatilomic studies.
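For readers unfamiliar with the h-index used in Figure 9, it is the largest number h such that an author has h articles with at least h citations each. The sketch below computes it from a list of citation counts; the function name `h_index` and the sample counts are illustrative and are not data from this review.

```python
def h_index(citations):
    """Return the largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` articles all have at least `rank` citations
        else:
            break
    return h


# Made-up citation counts for one author.
print(h_index([10, 8, 5, 4, 3]))
```

Because the index depends on both the number of articles and their citations, an author can score highly in metabolomics as a whole while having few articles specifically on volatilomics.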
Among these authors, the work of Baena-Pedroza, A.M., Corpas-Iguarán, E., Londoño-Giraldo, L.M., Martínez-Seidel, F., and Taborda-Ocampo, G. is particularly notable. These researchers are key contributors to volatilomic studies within the Research Group in Chromatography and Related Techniques (GICTA), which is affiliated with the University of Caldas, Manizales, Colombia. Additionally, Stashenko, E.E. is a member of the CENIVAM Center of Excellence in Research and the CROM-MASS Laboratory of Chromatography and Mass Spectrometry, both associated with the Industrial University of Santander, Bucaramanga, Colombia. Together, these authors have made substantial contributions to thematic areas such as agriculture and biological sciences, biochemistry, genetics and molecular biology, chemistry, medicine, immunology and microbiology, as well as pharmacology, toxicology, and pharmaceutical sciences.
This review shows that volatilomic studies have significant applicability, particularly for ensuring food safety in different types of food matrices. They also allow the study of the volatile profile formed by the VOCs that a matrix expresses and of the ecological role these compounds play in their environment. In the same vein, volatilomic studies have driven scientific advancement through new methods for the extraction, separation, and identification of these compounds. These advancements are tied to the integration of new chemometric methodologies that use supervised and unsupervised analysis techniques to identify variations in the data, validate established methodologies, and discover potential biomarkers of different types, among others.
The review underscores the importance of the SPME methodology coupled with GC-MS for obtaining the volatilome and the volatile profile affected by xenobiotic agents in different types of matrices. This methodology contributes not only to the development of omic sciences, but also to the evaluation of food safety, the early detection of diseases, the identification of new natural products, and the understanding of compounds within metabolic and biochemical processes. Beyond these methodologies, there has been a rise in innovative research that uses E-nose and machine learning to build statistical classification models from data obtained through HS-SPME-GC-MS and E-nose. With machine learning, results on contaminated matrices or on different types of biomarkers can be obtained much more quickly, which helps in evaluating safety and in identifying impacts on human health through rapid diagnostic tests for the early and effective detection of diseases.
It is our hope that research in volatilomics continues to grow, contributing to the field through the study of diverse matrices and finding applications in health, with the aim of facilitating early and timely diagnosis of diseases. We also anticipate that machine learning will pave the way for robust classification models that deliver rapid results in the verification and validation of information obtained from this type of primary research on volatilomics.
JPBA: authored the document and developed the bibliometric analysis; EEVS: reviewed primary literature and provided support in document preparation; JAFL: conducted the interpretation of the bibliometric analysis; GTO: managed the writing, reviewing, and editing. All authors have read and accepted the final version of the manuscript.
The authors declare that the results of the present research are not currently available in a repository. For any questions, please contact juan.betancourt@ucaldas.edu.co.
Figshare: Figure S1. Inclusion and exclusion criteria for the search of scientific articles on volatilomic studies. https://doi.org/10.6084/m9.figshare.26530738.v1. 129
The data are available under the Public Domain Dedication (CC0) license.
Figshare: PRISMA Checklist for Volatilomics: An emerging discipline within Omics Sciences – A systematic review. https://doi.org/10.6084/m9.figshare.26530693.v4. 130
The data are available under the Public Domain Dedication (CC0) license.
The authors thank the Ministry of Science, Technology and Innovation for the support received through the call 907: “Call for young researchers and innovators in the context of economic reactivation 2021”.
Are the rationale for, and objectives of, the Systematic Review clearly stated?
No
Are sufficient details of the methods and analysis provided to allow replication by others?
No
Is the statistical analysis and its interpretation appropriate?
Partly
Are the conclusions drawn adequately supported by the results presented in the review?
Partly
If this is a Living Systematic Review, is the ‘living’ method appropriate and is the search schedule clearly defined and justified? (‘Living Systematic Review’ or a variation of this term should be included in the title.)
No
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Volatilomics, Metabolomics, Bibliometrics
Version 1: 02 Sep 24 (one invited reviewer).