Scientists at the University of Milan-Bicocca, Milan, Italy, have proposed INTEGRATE, a computational pipeline that integrates metabolomics and transcriptomics data using constraint-based stoichiometric metabolic models as a scaffold. The pipeline might aid in developing personalized treatments for people with complex disorders.

From cancer to neurodegeneration and aging, many physio-pathological states and multifactorial disorders include a metabolic component. Metabolism is inextricably linked to most, if not all, biological functions by its very nature. As a result, metabolism may serve as a specific integrative readout of a cell’s or organism’s physio-pathological status. At least two interconnected regulatory layers are required for each metabolic flux: On the one hand, the catalyzing enzyme’s expression level determines the maximum theoretical flux level (i.e., the reaction’s net rate) for every enzyme-controlled reaction. Metabolic regulation, on the other hand, regulates metabolic flux by interacting with the responsible enzyme through metabolites (substrates, cofactors, and allosteric modulators).

In several sectors, such as health, wellness, and biotransformations, it is critical to understand the landscape of metabolism and how it is regulated. Knowledge of metabolic fluxes is the prerequisite for this characterization. Constraint-based steady-state models are a valuable tool for predicting metabolic fluxes from other high-throughput omics data. When metabolomics and transcriptomics data are analyzed independently, the hierarchical regulation of metabolism is not adequately characterized. To disentangle the distinct regulatory layers controlling metabolism, metabolomics and transcriptomics must be integrated.
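To make the steady-state idea concrete, here is a minimal sketch of the two conditions a flux vector has to satisfy in a constraint-based model: mass balance (the stoichiometric matrix times the flux vector is zero) and per-reaction flux bounds. The three-reaction network, the bounds, and the function name `is_feasible` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Toy network (hypothetical): uptake -> A, A -> B, B -> export.
# Rows are metabolites, columns are reactions.
S = [
    [1, -1, 0],   # metabolite A: produced by uptake, consumed by conversion
    [0, 1, -1],   # metabolite B: produced by conversion, consumed by export
]
# (lower, upper) flux bound for each reaction, in arbitrary units.
BOUNDS = [(0.0, 10.0), (0.0, 5.0), (0.0, 10.0)]


def is_feasible(S, bounds, v, tol=1e-9):
    """Return True if v satisfies S.v = 0 and every flux bound."""
    # Steady state: net production of every metabolite must be zero.
    balanced = all(
        abs(sum(coef * flux for coef, flux in zip(row, v))) <= tol
        for row in S
    )
    # Capacity constraints: each flux must stay inside its bounds.
    bounded = all(
        lo - tol <= flux <= hi + tol
        for (lo, hi), flux in zip(bounds, v)
    )
    return balanced and bounded
```

A flux vector such as `[3, 3, 3]` passes both checks, while `[3, 2, 2]` fails because metabolite A would accumulate. Tightening or loosening the bounds is how omics-derived information typically enters a model of this kind.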

In this study, the scientists describe the INTEGRATE (Model-based multi-omics data INTEGRAtion to characterize mulTi-level mEtabolic regulation) pipeline, which uses metabolomics and transcriptomics data to properly characterize the landscape of metabolic regulation in various biological samples.  

From transcriptomics data, INTEGRATE first computes the differential expression of reactions (transcriptional regulation only). It then uses constraint-based modeling to predict how global relative variations in expression translate into consistent differences in metabolic fluxes. In principle, any available approach can be used to achieve this goal. To improve model predictions, INTEGRATE uses exo-metabolomics data to constrain selected extracellular fluxes. In parallel, INTEGRATE uses intracellular metabolomics datasets and the mass action law formulation to predict how variations in substrate availability translate into differences in metabolic fluxes (metabolic regulation alone), ignoring enzymatic activity. The intersection of the two output datasets distinguishes fluxes regulated at the metabolic level, at the gene expression level, or both.
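The two reaction-level quantities described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the expression score assumes a common convention for gene-protein-reaction rules (minimum over genes joined by AND, sum over genes joined by OR), and the propensity score assumes the plain mass action form, the product of substrate abundances raised to their stoichiometric coefficients. The function names and the nested-tuple rule format are hypothetical.

```python
from math import prod


def reaction_activity_score(rule, expression):
    """Score a reaction from gene expression (assumed convention).

    rule is either a gene id (str) or a tuple (op, [subrules]) with
    op in {"and", "or"}. AND takes the minimum (the scarcest subunit
    limits a complex); OR takes the sum (isoenzymes add up).
    """
    if isinstance(rule, str):
        return expression[rule]
    op, parts = rule
    values = [reaction_activity_score(part, expression) for part in parts]
    if op == "and":
        return min(values)
    if op == "or":
        return sum(values)
    raise ValueError(f"unknown operator: {op!r}")


def reaction_propensity_score(substrates, abundance):
    """Mass action term: product of substrate abundance ** coefficient.

    substrates maps metabolite id -> positive stoichiometric coefficient.
    Enzyme activity is deliberately left out, as in the text above.
    """
    return prod(abundance[met] ** coef for met, coef in substrates.items())
```

For example, with expression values `{"g1": 4.0, "g2": 2.0, "g3": 1.0}` and the rule `("or", [("and", ["g1", "g2"]), "g3"])`, the complex contributes 2.0 and the isoenzyme 1.0. A reaction consuming one unit of A and two of B, with abundances 2.0 and 3.0, gets a propensity of 2.0 times 3.0 squared.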

The core step of the INTEGRATE methodology is to map the input experimental datasets, which are centered on heterogeneous objects (i.e., genes, metabolites, and fluxes), onto the input metabolic network to produce three datasets, each centered on the same object, the reaction: Reaction Activity Scores (RAS), Feasible Flux Distributions (FFD), and Reaction Propensity Scores (RPS). After obtaining the three reaction-oriented datasets, INTEGRATE determines whether the value of each reaction in a given cell line is significantly higher or lower than in another. If the null hypothesis is rejected by any appropriate statistical test and the variation exceeds a threshold value, the authors consider the variation to be significant.
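The two-part criterion (a rejected null hypothesis plus a minimum effect size) can be sketched as follows. The source leaves the choice of test open, so this example assumes an exact two-sided permutation test on the difference of means, a significance level of 0.05, and a minimum fold change of 1.2; all three are illustrative choices, and the function names are hypothetical. Scores are assumed to be strictly positive.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of means."""
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    n, k = len(pooled), len(a)
    extreme = total = 0
    # Enumerate every way to relabel k of the n values as group "a".
    for idx in combinations(range(n), k):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def variation_sign(a, b, alpha=0.05, min_fold=1.2):
    """Return +1 if a is significantly higher than b, -1 if lower, else 0."""
    mean_a, mean_b = mean(a), mean(b)
    fold = max(mean_a, mean_b) / min(mean_a, mean_b)  # assumes positive scores
    if permutation_p_value(a, b) > alpha or fold < min_fold:
        return 0
    return 1 if mean_a > mean_b else -1
```

Applied to each reaction and each pair of cell lines, a function like this reduces every one of the three datasets to a table of signs, which is the form the comparison between datasets needs. Exhaustive enumeration is only practical for small sample sizes; a sampled permutation test or a rank-based test would be the usual substitute for larger ones.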

INTEGRATE then assigns two scores to each metabolic reaction. The first score measures the degree of agreement between the RAS and RPS datasets in terms of the sign of the variation (for reactions in common). Highly concordant reactions correspond to fluxes with coordinated metabolic and transcriptional regulation, while weakly concordant reactions correspond to the opposite. The second score evaluates whether flux changes are consistent with metabolic control by comparing FFD and RPS (for reactions in common). Metabolically controlled reactions are those with a low RAS-RPS agreement but a high FFD-RPS agreement.
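The scoring logic can be illustrated with a deliberately simple agreement measure: for one reaction, the fraction of cell-line comparisons in which two datasets report the same sign of variation. This is an assumption made for the sketch; the published pipeline may use a different concordance statistic, and the 0.8 cutoff, the labels, and the function names below are hypothetical.

```python
def agreement(signs_x, signs_y):
    """Fraction of shared comparisons with matching variation signs.

    Each argument maps a cell-line comparison (e.g. "A-B") to a sign
    in {-1, 0, +1} for a single reaction. Returns None if the two
    datasets share no comparison.
    """
    shared = signs_x.keys() & signs_y.keys()
    if not shared:
        return None
    return sum(signs_x[c] == signs_y[c] for c in shared) / len(shared)


def classify(ras, rps, ffd, high=0.8):
    """Label one reaction from its two agreement scores."""
    ras_rps = agreement(ras, rps)
    ffd_rps = agreement(ffd, rps)
    if ras_rps is not None and ras_rps >= high:
        return "transcriptional and metabolic"
    if ffd_rps is not None and ffd_rps >= high:
        return "metabolic"
    return "unresolved"
```

A reaction whose RPS signs disagree with its RAS signs but match its FFD signs across all comparisons would be labeled `"metabolic"` here, mirroring the rule stated above.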

Image Description: Graphical representation of INTEGRATE pipeline.
Image Source: INTEGRATE: Model-based multi-omics data integration to characterise multi-level metabolic regulation.

The ENGRO2 metabolic model

Recon3D and other genome-wide human metabolism reconstructions are valuable sources of detailed and multi-level information about human metabolism. In principle, they can be used directly as a scaffold model in this workflow. To integrate the experimental input data for the five breast cell lines, the scientists rebuilt the ENGRO2 metabolic network, a constraint-based core model of central carbon metabolism and critical amino acid metabolism. The ENGRO2 core model has 494 reactions, 410 metabolites, and 494 genes in its final form.

Image Description: Evaluation of the effect of the different types of constraints on ENGRO2 feasible solutions.
Image Source: INTEGRATE: Model-based multi-omics data integration to characterise multi-level metabolic regulation.

To test their workflow, the researchers employed the ENGRO2 metabolic model, a carefully curated, simulation-ready reconstruction of human central carbon metabolism. After applying the pipeline to one non-tumorigenic and four distinct breast cancer cell lines, they were able to attribute some flux differences to transcriptional control, metabolic control, or both. Surprisingly, they identified reactions for which flux variations and RPS variations coincide quite well. The scientists emphasize that the two datasets are obtained completely independently of each other.

INTEGRATE incorporates high-throughput data to capture dynamic aspects of the metabolic state of distinct cells or tissues, providing complementary views on their stable profile. Transcriptomics and metabolomics data are more expressive when combined using a model. The information on the differential activity of reactions acquired from gene expression data is supplemented by information on the direction of the observed change, which is provided by metabolic fluxes predicted via constraint-based modeling.

INTEGRATE also adds information on the compartment in which a reaction takes place, complementing the information on the differential propensity of reactions obtained from metabolomics. For example, metabolomics data alone would not have allowed the researchers to distinguish between the contribution of the aconitase reaction (ACONT) substrate in the cytosol compartment and that of its mitochondrial counterpart. INTEGRATE, in contrast, found that the cytosolic reaction is subject to metabolic regulation. Interestingly, transcriptomics data were used to obtain this information, although the aconitase flux is not regulated at the transcriptional level. This finding suggests that indirect transcriptional regulation most likely accounts for the observed flux variations, which requires further investigation.

Many different analyses can be devised and carried out downstream of this pipeline. For example, one could look at the metabolism of amino acids to see if each cell line prefers to synthesize or metabolize them differently. It’s also possible to look at fluxes that better distinguish amongst the five cell lines.

The main novelty of this approach is the direct use of metabolomics information to assess whether a flux is regulated at the metabolic level. The pipeline can be further extended to integrate proteomics and phosphoproteomics: along with transcriptomics-derived RASs, (phospho)proteomics-derived RASs could be computed. “We believe that our approach will stimulate discussion in the COBRA and cancer metabolism communities,” the researchers said.

Identifying the metabolic reaction(s) that indirectly affect a target reaction may make it possible to design effective therapeutic interventions. By integrating high-throughput omics data through mathematical models, INTEGRATE dissects the intricate and interconnected regulation of metabolic networks and informs targeted strategies to counter the metabolic rewiring and/or malfunction underlying distinct pathological conditions. Expanding the toolbox of computational tools should improve the success rate of efforts to apply engineering concepts to living organisms, resulting in the development of predictable, scalable, and efficient biological devices whose performance is not hampered by a lack of understanding of the underlying design principles.

Story Source: Di Filippo, M., Pescini, D., Galuzzi, B. G., Bonanomi, M., Gaglio, D., Mangano, E., … & Damiani, C. (2022). INTEGRATE: Model-based multi-omics data integration to characterize multi-level metabolic regulation. PLOS Computational Biology, 18(2), e1009337. DOI:

Data Availability:

