The Eli and Edythe L. Broad Institute: A Collaboration of Massachusetts Institute of Technology, Harvard University and affiliated Hospitals, and Whitehead Institute for Biomedical Research
Lessons learned from the genome-scale metabolic reconstruction and curation of Neurospora crassa
Jeremy Zucker, Jonathan Dreyfuss, Heather Hood, James Galagan
Capture Metabolic Knowledge: Pathway-tools/BioCyc, KEGG, reactions, interactions, literature
Visualizing 'omics Data: Provide a visually intuitive metabolic framework for interpreting large 'omics datasets
In silico Predictions: Algorithmically interpret expression data in a metabolic context?
Example: Plasmodium
Validation
– KO phenotype predictions: 90% accuracy
– External metabolite changes: 70% accuracy
New predictions
– 40 enzymatic drug targets
– Experimental validation of novel target
E-Flux*
*Colijn, C., A. Brandes, J. Zucker, et al. (2009). PLoS Comput Biol
Modeling in the Neurospora P01
– Clock
– Visualization and Analysis
– Profiling: RNA-Seq, ChIP-Seq
– Interpretation of expression profiling and regulatory network data in a metabolic context
– Inform experiments
BUILDING THE MODEL
Manual reconstruction protocol Nature Protocols, Vol. 5, No. 1. (07 January 2010), pp
Automated Model SEED reconstruction pipeline Nature biotechnology, Vol. 28, No. 9. (29 September 2010), pp
Genome sequence to metabolic model: Pathways, Literature, Nutrient media (Vogel's), NeurosporaCyc, Elements, Metadata, Complexes, Reactions, Transporters, Biomass composition
EFICAz2 predicts enzymes
– Input: 9934 protein sequences
– Components: … decision tree, databases, HMMs, FDR, SVM
– Output: 1993 enzymes, 1770 reactions
BMC Bioinformatics 2009, 10:107
Protein Complex editor
– 182 reactions with isozymes or complexes
– 31 complexes experimentally validated through literature search
– Identify multiple genes of a reaction: 2-oxoisovalerate alpha subunit, 2-oxoisovalerate beta subunit, …, fatty acid synthase beta subunit dehydratase, fatty acid synthase alpha subunit reductase
– Allow curator to validate potential complexes: 2-oxoisovalerate complex, fatty acid synthase complex, …
– Present all possible combinations of complexes
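The "present all possible combinations" step can be illustrated with a small sketch. This is only an assumption about how such candidates might be enumerated, not the editor's actual implementation; the function name `candidate_complexes` and the subunit labels are hypothetical.

```python
from itertools import combinations


def candidate_complexes(genes, min_size=2):
    """Return every subset of `genes` with at least `min_size` members.

    Each subset is one candidate complex that a curator could accept or reject.
    """
    genes = sorted(set(genes))
    return [
        combo
        for size in range(min_size, len(genes) + 1)
        for combo in combinations(genes, size)
    ]


# Hypothetical subunit labels for a single reaction.
subunits = ["alpha", "beta", "gamma"]
for combo in candidate_complexes(subunits):
    print(" + ".join(combo))
```

The number of candidates grows exponentially with the number of genes per reaction, which is one reason a curator review step is useful after enumeration.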
Transport inference parser (TIP)
– Input: 9934 free-text protein annotations (MFS glucose transporter, ATP synthase, …, sucrose transporter)
– Filter proteins for transporters
– Infer multimeric complex
– Infer substrate
– Infer energy-coupling mechanism
– …
– Output: 176 transporters assigned to 97 transport reactions
Bioinformatics (2008) 24 (13): i259-i267.
Pathologic predicts pathways
– 1770 enzyme-catalyzed reactions, 265 pathways
– X = number of reactions in the MetaCyc pathway
– Y = number of reactions with enzyme evidence
– Z = number of unique reactions in the pathway
– P(X|Y|Z) = probability of the pathway in Neurospora
Science 293:2040-4, 2001.
Literature curation validates predictions
1212 citations associated with:
– 307 pathways
– 31 complexes
– 168 genes
– …
Neurospora Cellular overview
NEUROSPORACYC
New feature on Broad website
NeurosporaCyc Cellular overview
Google Maps-like zoomable interface
Highlight genes on overview
NeurosporaCyc Omics Viewer
Omics data mapped onto metabolism
Omics data mapped onto Genome
DEBUGGING THE BUG
The problem with EC numbers
Reaction class: number of reactions in Neurospora (MetaCyc)
– Balanced normal reactions: 993 (4585)
– Generic reactions: 198 (688)
– Protein modification reactions: 82 (469)
– Reactions with instanceless classes: 80 (228)
– Generic redox reactions: 36 (212)
– Polymeric reactions: 24 (91)
– Polymerization pathway reactions: 11 (17)
Generic Reactions
instance of ?
Protein Modification reactions
Reactions with instanceless classes
Solution: Instantiate classes
Generic Redox reactions
Polymeric reactions
Polymerization Pathway reactions
Solution: Instantiate polymerization steps
POLYMER-INST-Fatty-Acids-C16 + coenzyme A + ATP -> POLYMER-INST-Saturated-Fatty-Acyl-CoA-C16 + diphosphate + AMP + H+
POLYMER-INST-Fatty-Acids-C14 + coenzyme A + ATP -> POLYMER-INST-Saturated-Fatty-Acyl-CoA-C14 + diphosphate + AMP + H+
…
POLYMER-INST-Fatty-Acids-C0 + coenzyme A + ATP -> POLYMER-INST-Saturated-Fatty-Acyl-CoA-C0 + diphosphate + AMP + H+
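Instantiating a polymeric reaction amounts to filling a reaction template once per chain length. The sketch below is illustrative only: the function name `instantiate_acyl_coa_reactions` is hypothetical, and the chain lengths passed in are limited to the two written out on the slide, since the elided middle of the series is not specified here.

```python
def instantiate_acyl_coa_reactions(chain_lengths):
    """Expand the fatty-acid activation template for each chain length.

    Compound names follow the POLYMER-INST-... pattern shown on the slide.
    """
    template = (
        "POLYMER-INST-Fatty-Acids-C{n} + coenzyme A + ATP -> "
        "POLYMER-INST-Saturated-Fatty-Acyl-CoA-C{n} + diphosphate + AMP + H+"
    )
    return [template.format(n=n) for n in chain_lengths]


# Only the lengths spelled out on the slide are used in this example.
for reaction in instantiate_acyl_coa_reactions([16, 14]):
    print(reaction)
```

Each generated reaction has concrete, mass-balanced participants, which is what a flux balance model needs in place of the generic polymer classes.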
What happens when the metabolic network is infeasible?
Add a “reaction” with the smallest number of reactants and products that results in a feasible model:
minimize card(r)
subject to Sv + r = 0
l ≤ v ≤ u
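Minimizing card(r) directly is a combinatorial problem. One standard way to hand it to a solver is a big-M mixed-integer program, sketched below. This is an assumption about how the problem could be posed, not necessarily the formulation used in this work; the binary indicators y_i, the constant M, and the metabolite count m are introduced here for illustration.

```latex
\begin{aligned}
\min_{v,\,r,\,y}\quad & \sum_{i=1}^{m} y_i \\
\text{subject to}\quad & S v + r = 0, \\
& l \le v \le u, \\
& -M\, y_i \le r_i \le M\, y_i, \qquad i = 1, \dots, m, \\
& y_i \in \{0, 1\}, \qquad i = 1, \dots, m.
\end{aligned}
```

Setting y_i = 0 forces r_i = 0, so at the optimum the objective counts the nonzero entries of r, provided M is large enough to bound every |r_i|.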
Fast Automated Reconstruction of Metabolism
Input:
– EFICAz probabilities for each reaction
– Biomass components
– Experimental growth / no growth phenotypes in different nutrient conditions
– Gene essentiality
– Manual curation of pathways
Output:
– Metabolic network of MetaCyc reactions maximally consistent with input
VALIDATING THE MODEL WITH IN SILICO KNOCKOUT PREDICTIONS
Neurospora phenotypes for validation
Neurospora e-Compendium
– 29 mutants essential on minimal media
– Non-essential on supplemental media
P01 Phenotype Collection
– 79 non-essential KOs under minimal media
– Additional phenotypes are observed.
Used FBA with Neurospora model to simulate gene knockouts in minimal medium
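Before FBA can simulate a gene knockout, the deleted gene has to be translated into the reactions it disables. A common convention, assumed here rather than taken from the slides, is that isozymes are alternatives (any one suffices) while a complex needs all of its subunits. The gene and reaction identifiers below are hypothetical.

```python
def reaction_active(isozymes, knocked_out):
    """Return True if at least one enzyme for the reaction is still intact.

    `isozymes` is a list of gene sets; each set is one complex (or a
    single-gene enzyme). A reaction with no gene association stays active.
    """
    if not isozymes:
        return True
    return any(not (complex_genes & knocked_out) for complex_genes in isozymes)


def disabled_reactions(gene_rules, knocked_out):
    """List the reactions that lose all their enzymes under a knockout."""
    knocked_out = set(knocked_out)
    return sorted(
        rxn for rxn, isozymes in gene_rules.items()
        if not reaction_active(isozymes, knocked_out)
    )


# Hypothetical gene-reaction associations.
gene_rules = {
    "R1": [{"g1", "g2"}],    # one complex with two subunits
    "R2": [{"g3"}, {"g4"}],  # two isozymes
    "R3": [{"g1"}, {"g5"}],  # two isozymes, one shared with R1
    "R4": [],                # spontaneous, no gene
}
print(disabled_reactions(gene_rules, {"g1"}))
```

In a knockout simulation the flux bounds of the disabled reactions would then be set to zero before re-solving the FBA problem; a gene is predicted essential when the resulting biomass flux is zero.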
Neurospora phenotype prediction results
                         Predicted essential   Predicted non-essential
Observed essential       22 (TN)               7 (FP)
Observed non-essential   14 (FN)               65 (TP)

Precision     TP / (TP + FP)                    90%
Recall        TP / (TP + FN)                    82%
Specificity   TN / (TN + FP)                    76%
Accuracy      (TP + TN) / (TP + TN + FP + FN)   81%
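The reported percentages can be recomputed from the four counts. In this sketch "non-essential" is the positive class, as in the slide's TP/TN labels, and specificity is computed as TN / (TN + FP), which reproduces the 76% reported (22/29). The function name `knockout_metrics` is hypothetical.

```python
def knockout_metrics(tp, tn, fp, fn):
    """Compute standard classification metrics from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Counts from the slide: non-essential (viable) is the positive class.
metrics = knockout_metrics(tp=65, tn=22, fp=7, fn=14)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```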
Comparison of model organisms under minimal media
                                 Yeast (iND750) [1]   E. coli (iAF1260) [2]   Neurospora
Viable, predicted/observed       439/455 = 96%        993/1022 = 97%          65/79 = 82%
Essential, predicted/observed    35/109 = 32%         159/238 = 67%           22/29 = 76%
Overall accuracy                 84%                  91%                     81%
[1] Genome Res :
[2] Molecular Systems Biology :121
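The overall accuracy row follows from the two rows above it: correct predictions in both classes divided by all observations. A short check, using only the counts on the slide (the function name `overall_accuracy` is hypothetical):

```python
def overall_accuracy(viable, essential):
    """Combine (correct, observed) pairs for both classes into one accuracy."""
    (viable_ok, viable_total), (essential_ok, essential_total) = viable, essential
    return (viable_ok + essential_ok) / (viable_total + essential_total)


# (correctly predicted, observed) for viable and essential knockouts.
models = {
    "Yeast (iND750)": ((439, 455), (35, 109)),
    "E. coli (iAF1260)": ((993, 1022), (159, 238)),
    "Neurospora": ((65, 79), (22, 29)),
}
for name, (viable, essential) in models.items():
    print(f"{name}: {overall_accuracy(viable, essential):.0%}")
```

Because viable knockouts dominate the yeast and E. coli datasets, their overall accuracy stays high even where essential-gene prediction is weak, so the per-class rows are the more informative comparison.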
MODELING THE EFFECT OF OXYGEN LIMITATION ON XYLOSE FERMENTATION
Biofuels from Neurospora?
– Growing interest in obtaining biofuels from fungi
– Neurospora crassa has more cellulolytic enzymes than Trichoderma reesei
– N. crassa can degrade cellulose and hemicellulose to ethanol [Rao83]
– Simultaneous saccharification and fermentation means that N. crassa is a possible candidate for consolidated bioprocessing
Xylose -> Ethanol
Effects of Oxygen limitation on Xylose fermentation in Neurospora crassa
Zhang, Z., Qu, Y., Zhang, X., Lin, J., March Effects of oxygen limitation on xylose fermentation, intracellular metabolites, and key enzymes of Neurospora crassa as Applied biochemistry and biotechnology 145 (1-3),
[Diagram labels: Xylose, Glycolysis, Pyruvate, TCA (Respiration), Ethanol (Fermentation)]
[Plot axes: Oxygen level (mmol/L*g), Ethanol conversion (%); regions: Low O2, Intermediate O2, High O2]
Model of Xylose Fermentation
[Diagram labels: Pentose phosphate, Aerobic respiration, Fermentation, TCA Cycle; Xylose, Oxygen, Ethanol, ATP]
Two paths from xylose to xylitol
High Oxygen
Oxygen=5, ATP=16.3
[Diagram labels: Pentose phosphate, Aerobic respiration, Fermentation, TCA Cycle; NADPH Regeneration, NADPH & NAD+ Utilization, NAD+ Regeneration]
Low Oxygen
Oxygen=0
[Diagram labels: Pentose phosphate, Aerobic respiration, Fermentation, TCA Cycle, Ethanol]
Intermediate Oxygen: Optimal Ethanol
Oxygen=0.5, ATP=2.8
All O2 used to regenerate NAD used in first step
[Diagram labels: Pentose phosphate, Aerobic respiration, Fermentation, TCA Cycle, Ethanol; NADPH & NAD Utilization, NAD Regeneration, NADPH Regeneration]
Intermediate Oxygen: Optimal Ethanol (continued)
Bottleneck: Pyruvate decarboxylase
Improve NADH enzyme
USING E-FLUX TO PREDICT DRUG TARGETS BY INTEGRATING EXPRESSION DATA WITH FBA
E-Flux explanation
Application of E-Flux to TB
Next Steps
– Annotation: use phenotype predictions to improve model
– NeurosporaCyc: use E-Flux to interpret the effect of the clock genetic regulatory program on metabolism
– Validation: add additional phenotypes
Acknowledgements
Neurospora P01 Project: Heather Hood, Jonathan Dreyfuss, James Galagan
SRI: Peter Karp, Mario Latendresse, Markus Krummenacker, Ingrid Keseler, Tomer Altman, Suzanne Paley, Ron Caspi, Mike Travers
Fast Automated Reconstruction of Metabolism (FARM)
– Gene calls (Broad)
– Enzyme prediction (EFICAz)
– Protein complex prediction
– Transport predictor (TIP)
– Pathway prediction (Pathologic)
– Literature curation (CAP)
– Nutrient media (Vogel's)
– NeurosporaCyc
Fast Automated Reconstruction of Metabolism (FARM)
Inputs: EFICAz predictions, pathway predictions, nutrient conditions, biomass composition, protein complexes, transport
Result: 846 reactions, 640 metabolites, 564 genes