GEMINI: Gene Expression and Metabolism Integrated for Network Inference

There is a strong need for computational frameworks that integrate different biological processes and data-types to unravel cellular regulation. Current efforts to reconstruct transcriptional regulatory networks (TRNs) focus primarily on proximal data such as gene co-expression and transcription factor (TF) binding. While such approaches enable rapid reconstruction of TRNs, the overwhelming combinatorics of possible networks limits identification of mechanistic regulatory interactions. Utilizing growth phenotypes and systems-level constraints to inform regulatory network reconstruction is an unmet challenge. We presented an approach called Gene Expression and Metabolism Integrated for Network Inference (GEMINI) that links a compendium of candidate regulatory interactions with the metabolic network to predict their systems-level effect on growth phenotypes. We then compared predictions with experimental phenotype data to select phenotype-consistent regulatory interactions.

GEMINI makes use of the observation that only a small fraction of regulatory network states are compatible with a viable metabolic network, and outputs a regulatory network that is simultaneously consistent with the input genome-scale metabolic network model, gene expression data, and TF knockout phenotypes. GEMINI preferentially recalls gold-standard interactions (p-value = 10−172), significantly better than using gene expression alone. We applied GEMINI to create an integrated metabolic-regulatory network model for Saccharomyces cerevisiae involving 25,000 regulatory interactions controlling 1597 metabolic reactions. The model quantitatively predicts TF knockout phenotypes in new conditions (p-value = 10−14) and reveals potential condition-specific regulatory mechanisms.

Our results suggest that a metabolic constraint-based approach can be successfully used to help reconstruct TRNs from high-throughput data, and highlights the potential of using a biochemically-detailed mechanistic framework to integrate and reconcile inconsistencies across different data-types. The algorithm and associated data are available at​data/

Data and Software File(s):