Seminar in bioinformatics: Computation of elementary modes: a unifying framework and the new binary approach. Elad Gerson, Spring 2006, Technion.


Seminar in bioinformatics: Computation of elementary modes: a unifying framework and the new binary approach. Elad Gerson, Spring 2006, Technion. Paper: Julien Gagneur and Steffen Klamt, BMC Bioinformatics 2004, 5:175.

Agenda: Quick overview of last week's lecture. Extension of the EP concept. – Enter EM. General framework for EM computation. – Splitting reversible reactions. – Network compression. – Post-processing. Some implementation tweaks.

Last week on bioinformatics seminar! Given a metabolic network, we wish to find all possible flux distributions that result in a steady state, meaning that the net production of every internal metabolite is 0.

Last week on bioinformatics seminar! This is done by describing the network as a stoichiometric matrix S and solving the equation S·v = 0, where v is the vector of reaction fluxes.

Last week on bioinformatics seminar! Notice that we are interested only in solutions where v_i ≥ 0 for every reaction i (the sign indicates the reaction's direction). The solution space is spanned by linearly independent vectors. We look for a spanning set such that every solution can be written as a linear combination of the spanning vectors in which all coefficients are non-negative (systemically independent). These spanning vectors are called extreme pathways (EPs). They can be found using the Null Space Approach (NSA) algorithm.
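The two conditions recalled above (steady state and non-negative fluxes) can be checked directly on a candidate flux vector. The following is a minimal sketch, assuming the usual formulation S·v = 0 with v ≥ 0; the three-reaction chain and the function names are hypothetical and are not the network shown on the slides.

```python
# Hypothetical toy network (not the one on the slides):
#   R1: -> A,  R2: A -> B,  R3: B ->
# Rows are internal metabolites (A, B); columns are reactions (R1, R2, R3).
S = [
    [1, -1, 0],   # A: produced by R1, consumed by R2
    [0, 1, -1],   # B: produced by R2, consumed by R3
]

def is_steady_state(S, v):
    """True if S*v = 0, i.e. every internal metabolite is balanced."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)

def is_feasible(S, v):
    """Steady state plus non-negativity of every flux."""
    return is_steady_state(S, v) and all(x >= 0 for x in v)

print(is_feasible(S, [2, 2, 2]))     # balanced, all fluxes non-negative
print(is_feasible(S, [2, 1, 1]))     # A accumulates, so not a steady state
print(is_feasible(S, [-1, -1, -1]))  # balanced, but runs the reactions backwards
```

The third vector shows why the sign constraint matters: it balances every metabolite, yet it is rejected because all reactions here are treated as irreversible.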

Problem: Biology suggests some reactions are reversible. Consider the following network, for instance: R5 can work in both directions (not simultaneously!).

Solution? Remove the restriction and let the sign indicate the direction. Bad idea: not all reactions are reversible, and the solutions no longer take the form of a pointed polyhedral cone.

Solution! Split each reversible reaction into two irreversible ones (here R5 becomes R5a and R5b). Find the extreme pathways using the NSA algorithm. Post-process the EPs found: merge the split reactions (the “opposite direction” is given a negative sign). The post-processed EPs are now called elementary modes (EMs).
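The split-and-merge idea can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the one-metabolite network, the convention of appending backward columns at the end, and the function names are all hypothetical.

```python
def split_reversible(S, reversible):
    """For each reversible column j, append a backward column equal to -S[:, j]."""
    split = [list(row) for row in S]
    for j in reversible:
        for row_in, row_out in zip(S, split):
            row_out.append(-row_in[j])
    return split

def merge_split(v_split, n_reactions, reversible):
    """Fold each backward flux into its original reaction with a negative sign."""
    v = list(v_split[:n_reactions])
    for k, j in enumerate(reversible):
        v[j] -= v_split[n_reactions + k]
    return v

# Hypothetical network with one metabolite A:
#   R1: -> A (irreversible), R2: A -> (irreversible), R3: A <-> (reversible)
S = [[1, -1, -1]]
reversible = [2]                      # column index of R3

S_split = split_reversible(S, reversible)
print(S_split)                        # the backward copy of R3 is the last column

# A pathway of the split network: import via backward R3, export via R2.
print(merge_split([0, 1, 0, 1], 3, reversible))

# Forward and backward copies used together cancel to the zero vector.
print(merge_split([0, 0, 1, 1], 3, reversible))
```

The last call illustrates one reason post-processing is needed: a pathway that only runs both copies of a split reaction merges to zero flux and carries no information.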

Compressing the network: removing redundancies. Can be united.

Compressing the network: removing redundancies. R1 is null in any feasible steady state.

Compressing the network: removing redundancies. Contradict each other. Can be eliminated.

Compressing the network: removing redundancies. Active in any steady state.

Compressing the network: removing redundancies. Some redundancies can be detected as linearly dependent rows in the kernel matrix. Iterative approach: remove redundancies until none are detected. – Produces better results.
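Two of the redundancies above can be read off a kernel (null space) matrix once it is available. The sketch below assumes a hypothetical kernel matrix K whose rows correspond to reactions and whose columns are basis vectors of the null space of S; it only shows the detection step, not the compression itself, and the function names are illustrative.

```python
def zero_rows(K):
    """Reactions whose kernel row is all zero carry no flux in any steady state."""
    return [i for i, row in enumerate(K) if all(x == 0 for x in row)]

def is_multiple(u, v):
    """True if the non-zero rows u and v are scalar multiples of each other."""
    if not any(u) or not any(v):
        return False
    n = len(u)
    # All 2x2 minors vanish exactly when the two rows are proportional.
    return all(u[i] * v[j] == u[j] * v[i] for i in range(n) for j in range(n))

def proportional_pairs(K):
    """Pairs of reactions whose fluxes are always in a fixed ratio."""
    return [(a, b)
            for a in range(len(K))
            for b in range(a + 1, len(K))
            if is_multiple(K[a], K[b])]

# Hypothetical kernel matrix: 5 reactions, 2 basis vectors.
K = [
    [1, 0],
    [2, 0],   # always twice the flux of reaction 0: candidates for merging
    [0, 1],
    [0, 0],   # never active at steady state
    [2, 2],
]

print(zero_rows(K))
print(proportional_pairs(K))
```

After merging or removing such reactions the kernel changes, which is consistent with the iterative approach described on the slide.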

General framework: Preprocessing (split reversible reactions, compress the network). Apply the NSA algorithm. Post-process.

One more tweak: The authors offer an efficient implementation of the NSA and CBA (Combined basis, Schuster et al.) algorithms. – Using a binary representation for vectors. Fast bit operators. Efficient memory usage (up to 1.6% of original!).
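To give a feel for why a binary representation is fast and compact, the sketch below stores only the support of each vector (which reactions are non-zero) as the bits of an integer, so that a subset test becomes a single AND and a comparison. It is an illustration of the general idea, not the authors' data structure; the example modes and function names are hypothetical.

```python
def support_mask(v):
    """Encode which entries of v are non-zero as the bits of an integer."""
    mask = 0
    for j, x in enumerate(v):
        if x != 0:
            mask |= 1 << j
    return mask

def is_superset(a, b):
    """True if support a contains support b (one AND plus one comparison)."""
    return (a & b) == b

# Hypothetical candidate modes over four reactions.
modes = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
masks = [support_mask(v) for v in modes]

# Keep only modes whose support does not contain another mode's support.
minimal = [m for m in masks
           if not any(o != m and is_superset(m, o) for o in masks)]

print([bin(m) for m in masks])
print([bin(m) for m in minimal])
```

Each mode needs one bit per reaction instead of one number per reaction, which is where the memory saving comes from.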

Seminar in bioinformatics Minimal cut sets in biochemical reaction networks Elad Gerson, Spring 2006, Technion. Steffen Klamt and Ernst Dieter Gilles Bioinformatics Vol. 20 no , pages 226–234

Abstract. Motivation: Metabolic networks yield deeper insight into an organism's metabolism. Failure mode analysis will provide: identification of crucial parts; suitable targets for repressing undesired metabolic functions. Results: The concept of minimal cut sets (MCSs) in biochemical networks. An algorithm which computes the MCSs with respect to an objective reaction. Potential applications include: phenotype predictions; network verification; structural robustness and fragility assessment; metabolic flux analysis; target identification in drug discovery.

Introduction: Assume we wish to prevent the production of metabolite X, i.e. no balanced flux distribution involving the objective reaction obR (the reaction producing X) should remain possible. This can be done by gene deletion or enzyme inhibition.

Introduction Definition - We call a set of reactions a cut set (with respect to a defined objective reaction) if after the removal of these reactions from the network no feasible balanced flux distribution involves the objective reaction.

Introduction: That's easy. Consider C0 = {obR}. One might wish to cut the reaction at the beginning. What if there are numerous obRs? Simultaneous failure might be achieved more efficiently.

Introduction: Take two. Remove all reactions except for obR. Not efficient. Not intelligent.

Introduction: Consider C1 = {R5, R8}. Removing both is sufficient. Removing R5 alone or R8 alone is not. No proper subset of C1 is a valid cut set → C1 is minimal.

Introduction: Definition - A cut set C (related to a defined objective reaction) is a minimal cut set (MCS) if no proper subset of C is a cut set. Can you spot all the MCSs in the network?
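Both definitions can be tested mechanically once the elementary modes that use obR are known: a set of reactions is a cut set exactly when it shares at least one reaction with every such mode. The sketch below relies on that characterization; the two modes and the reaction names are hypothetical and do not reproduce the network on the slides.

```python
def is_cut_set(cut, target_modes):
    """A cut set must share a reaction with every EM that uses obR."""
    return all(cut & mode for mode in target_modes)

def is_minimal_cut_set(cut, target_modes):
    """Minimal: a cut set that stops being one when any single reaction is dropped.

    Checking single removals is enough, because every superset of a cut set
    is again a cut set.
    """
    return is_cut_set(cut, target_modes) and not any(
        is_cut_set(cut - {r}, target_modes) for r in cut)

# Hypothetical EMs that involve the objective reaction.
target_modes = [
    {"R1", "R2", "obR"},
    {"R1", "R3", "obR"},
]

for candidate in [{"obR"}, {"R1"}, {"R2", "R3"}, {"R1", "R2"}, {"R2"}]:
    print(sorted(candidate),
          is_cut_set(candidate, target_modes),
          is_minimal_cut_set(candidate, target_modes))
```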

Introduction: Is C2 = {R2, R4, R6} minimal?

Introduction: Is C3 = {R2, R5, R7} minimal?

Introduction: How about C4 = {R1}?

Introduction: OK, what about graph disconnectivity algorithms? No good: they don't take the hypergraph nature of metabolic pathways into account.

The algorithm: Initialization
(1) Calculate the EMs in the given network.
(2) Define the objective reaction obR.
(3) Choose all EMs in which reaction obR is non-zero and store them in the binary array em_obR (em_obR[i][j]==1 means that reaction j is involved in EM i).
(4) Initialize the arrays mcs and precutsets as follows (each array contains sets of reaction indices): append {j} to mcs if reaction j is essential (em_obR[i][j]==1 for each EM i), otherwise to precutsets.

The algorithm
(5) FOR i = 2 TO MAX_CUTSETSIZE
  (5.1) new_precutsets = [ ];
  (5.2) FOR j = 1 TO q (q: number of reactions)
    (5.2.1) Remove all sets from precutsets where reaction j participates
    (5.2.2) Find all sets of reactions in precutsets that do not cover at least one EM in em_obR where reaction j participates; combine each of these sets with reaction j and store the new preliminary cut sets in temp_precutsets
    (5.2.3) Drop all temp_precutsets which are a superset of any of the already determined minimal cut sets stored in mcs
    (5.2.4) Find all retained temp_precutsets which do now cover all EMs and append them to mcs; append all others to new_precutsets
  ENDFOR
  (5.3) IF isempty(new_precutsets)
    (5.3.1) Break
  ELSE
    (5.3.2) precutsets = new_precutsets
  ENDIF
ENDFOR
(6) result: mcs contains the MCSs
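Steps (4) to (6) translate fairly directly into code. The following is a sketch under stated assumptions rather than the published implementation: EMs are given as sets of reaction indices instead of a binary array, the loop runs only over reactions that occur in at least one EM of em_obR (a reaction outside all of them can never belong to an MCS), and the function name and the example modes are hypothetical.

```python
def minimal_cut_sets(em_obR, max_size=None):
    """em_obR: list of sets of reaction indices, one per EM that uses obR."""
    reactions = sorted(set().union(*em_obR))
    max_size = max_size or len(reactions)

    def covers_all(cut):
        # "Covers" an EM: the cut contains at least one reaction of that EM.
        return all(cut & em for em in em_obR)

    mcs, precutsets = [], []
    for j in reactions:                                   # step (4)
        target = mcs if covers_all({j}) else precutsets
        target.append(frozenset({j}))

    for _ in range(2, max_size + 1):                      # step (5)
        new_precutsets = []                               # (5.1)
        for j in reactions:                               # (5.2)
            precutsets = [c for c in precutsets if j not in c]        # (5.2.1)
            ems_with_j = [em for em in em_obR if j in em]
            temp = [c | {j} for c in precutsets
                    if any(not (c & em) for em in ems_with_j)]        # (5.2.2)
            temp = [c for c in temp
                    if not any(m <= c for m in mcs)]                  # (5.2.3)
            for c in temp:                                            # (5.2.4)
                (mcs if covers_all(c) else new_precutsets).append(c)
        if not new_precutsets:                            # (5.3)
            break
        precutsets = new_precutsets
    return mcs                                            # step (6)

# Hypothetical EMs containing obR (reaction indices 1..5).
ems = [{1, 2, 3}, {1, 4, 5}, {1, 2, 5}]
for cut in minimal_cut_sets(ems):
    print(sorted(cut))
```

Using sets keeps the superset test in step (5.2.3) to a single `<=` comparison; a bit-mask encoding would serve the same purpose for larger networks.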

Running example: Initialization. Calculate the EMs. We are only interested in EMs containing obR.

Running example: Initialization. mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}. [Table em_obR: columns R8 … R1, rows EM2, EM3, EM4; entries not preserved in the transcript. The same table accompanies every step below.]

Running example: i = 2, j = 1. mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {}

Running example: i = 2, j = 1. mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {{1 2}}

Running example: i = 2, j = 1. mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {{1 2},{1 3},{1 4},{1 5},{1 6}}

Running example: i = 2, j = 1. mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {{1 2},{1 3},{1 4},{1 5},{1 6},{1 7},{1 8}}

Running example: i = 2, j = 2. mcs = {{1}}, precutsets = {{2},{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {}

Running example: i = 2, j = 2. mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {}

Running example: i = 2, j = 2. mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {{2 4}}

Running example: i = 2, j = 2. mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}, new_precutsets = {}, temp_precutsets = {{2 4},{2 6},{2 7},{2 8}}

Running example: i = 2, j = 2. mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}, new_precutsets = {{2 4}}, temp_precutsets = {{2 6},{2 7},{2 8}}

Running example: i = 2, j = 2. mcs = {{1}}, precutsets = {{3},{4},{5},{6},{7},{8}}, new_precutsets = {{2 4},{2 6},{2 7},{2 8}}, temp_precutsets = {}

Running example: i = 2, j = 5. mcs = {{1}}, precutsets = {{5},{6},{7},{8}}, new_precutsets = {{2 4},{2 6},{2 7},{2 8},..}, temp_precutsets = {}

Running example: i = 2, j = 5. mcs = {{1}}, precutsets = {{6},{7},{8}}, new_precutsets = {{2 4},{2 6},{2 7},{2 8},..}, temp_precutsets = {{5 6},{5 7},{5 8}}

Running example: i = 2, j = 5. mcs = {{1},{5 6}}, precutsets = {{6},{7},{8}}, new_precutsets = {{2 4},{2 6},{2 7},{2 8},..}, temp_precutsets = {{5 7},{5 8}}

Running example: i = 2, j = 5. mcs = {{1},{5 6},{5 7}}, precutsets = {{6},{7},{8}}, new_precutsets = {{2 4},{2 6},{2 7},{2 8},..}, temp_precutsets = {{5 8}}

Running example: i = 2, j = 8. mcs = {{1},{5 6},{5 7},{5 8}}, precutsets = {}, new_precutsets = {{2 4},{2 6},{2 7},{2 8},..}, temp_precutsets = {}

Running example: i = 3, j = 2. mcs = {{1},{5 6},{5 7},{5 8}}, precutsets = {{2 4},{2 6},{2 7},{2 8},…{4 6},…}, new_precutsets = {}, temp_precutsets = {}

Running example: i = 3, j = 2. mcs = {{1},{5 6},{5 7},{5 8}}, precutsets = {…{4 6},…}, new_precutsets = {}, temp_precutsets = {…}

Running example: i = 3, j = 2. mcs = {{1},{5 6},{5 7},{5 8}}, precutsets = {…{4 6},…}, new_precutsets = {…}, temp_precutsets = {{2 4 6},…}

Running example: i = 3, j = 2. mcs = {{1},{5 6},{5 7},{5 8},{2 4 6}}, precutsets = {…{4 6},…}, new_precutsets = {…}, temp_precutsets = {…}

Running example: i = 3, j = 8. mcs = {{1},{5 6},{5 7},{5 8},{2 4 6},…}, precutsets = {}, new_precutsets = {…}, temp_precutsets = {…}

Complexity: Let q be the number of reactions, and assume |EM| << q. In the initialization, q singletons are generated and tested. In the i-th iteration: – Overall number of temp_precutsets generated. – O(p) comparisons are made. Hence, in the worst case all subsets of q items are considered. – Yes, exponential. If the maximal MCS size is << q, the running time is bounded by a polynomial.
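The expressions on this slide did not survive in the transcript. As a rough counting argument (an assumption about what the slide conveys, not a transcription of it), at most $\binom{q}{i}$ candidate sets of size $i$ exist, which gives the exponential worst case and the polynomial bound when the cut set size is capped at a fixed $k \ll q$:

```latex
\sum_{i=1}^{q} \binom{q}{i} = 2^{q} - 1,
\qquad
\sum_{i=1}^{k} \binom{q}{i} = O\!\left(q^{k}\right) \quad \text{for fixed } k \ll q .
```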

MCS in central metabolism of E. coli: MCSs were calculated with ‘biomass synthesis’ (growth) as the objective reaction. – The network comprises 110 reactions and 89 metabolites. – The catabolic (material breakdown) part is modeled in detail. It enables excretion of 5 metabolites and uptake of glucose, acetate, glycerol and succinate. Growth on each substrate was tested separately.

MCS in central metabolism of E. coli

Possible applications: Structural fragility and robustness. MCSs can be used for “risk assessment” in metabolic pathways. – More EMs suggest a more robust and less fragile pathway. The number of EMs and the size of the MCSs are strongly correlated (more elements must fail). We seek a better criterion. Glucose is known to be the least fragile growth substrate, having the most EMs and apparently the longest MCSs. ‘Dangerous’ MCSs.

Possible applications: Structural fragility and robustness. Definition – The reaction fragility factor Fi is the reciprocal of the average size of all the MCSs in which reaction i participates.

Possible applications: Structural fragility and robustness. Fi may suggest the reaction's importance.

Possible applications: Structural fragility and robustness. Is there a correlation between Fi and the number of EMs in which the reaction participates?

Possible applications Structural fragility and robustness

Possible applications: Structural fragility and robustness. Definition – Network fragility F is defined as the average of the reaction fragility factors, F = (1/q) · Σ Fi (sum over i = 1 … q), where q is the number of reactions.
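Both fragility measures are simple to compute from a list of MCSs. The sketch below makes two assumptions that the transcript does not confirm: a reaction that appears in no MCS is given Fi = 0, and the network fragility F is taken as the mean of the Fi over all q reactions. The MCS list and the function names are hypothetical.

```python
def fragility_factors(mcs_list, reactions):
    """F_i = 1 / (average size of the MCSs containing reaction i)."""
    factors = {}
    for r in reactions:
        sizes = [len(c) for c in mcs_list if r in c]
        # Assumption: a reaction that is in no MCS gets F_i = 0.
        factors[r] = len(sizes) / sum(sizes) if sizes else 0.0
    return factors

def network_fragility(factors):
    """Assumption: F is the mean of the F_i over all q reactions."""
    return sum(factors.values()) / len(factors)

# Hypothetical MCSs over reactions 1..5.
mcs_list = [{1}, {2, 4}, {2, 5}, {3, 5}]
factors = fragility_factors(mcs_list, reactions=range(1, 6))

print(factors)
print(network_fragility(factors))
```

An essential reaction (an MCS of size one) gets the maximal value Fi = 1, which matches the intuition that a single failure there is enough.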

Possible applications: Network verification and mutant phenotype predictions. Cutting an MCS is predicted to leave a metabolic pathway dysfunctional. Apply the algorithm with ‘growth’ as obR. – If a set of gene deletions (or mutations) contains an MCS, a non-viable phenotype is expected. A viable phenotype would be a false negative. – Evidence of an incorrect or incomplete network. – Otherwise growth is predicted to be possible. A non-viable phenotype would be a false positive. – May suggest a false assumption in the network structure. » One of the reactions in the MCS might be of a regulatory nature.
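The verification logic above amounts to a subset test followed by a comparison with the observed phenotype. A minimal sketch, using the slide's own labels for the two kinds of mismatch; the MCS list, the deletion sets and the function names are hypothetical.

```python
def predict_viable(deleted, mcs_list):
    """Growth is predicted to be possible unless the deletions contain an MCS."""
    return not any(m <= deleted for m in mcs_list)

def classify(deleted, mcs_list, observed_viable):
    """Compare the prediction with an observed phenotype."""
    if predict_viable(deleted, mcs_list) == observed_viable:
        return "consistent"
    # Predicted non-viable but the mutant grows: false negative.
    # Predicted viable but the mutant does not grow: false positive.
    return "false negative" if observed_viable else "false positive"

# Hypothetical MCSs with 'growth' as the objective reaction.
mcs_list = [frozenset({1}), frozenset({2, 4})]

print(classify({2, 4, 7}, mcs_list, observed_viable=True))
print(classify({2}, mcs_list, observed_viable=False))
print(classify({1}, mcs_list, observed_viable=False))
```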

Possible applications: Target identification and repressing cellular functions. MCSs offer a theoretical tool for target identification in drug discovery. – An MCS is an irreducible set of interventions needed for pathway dysfunction. – Usually we will look for MCSs of minimal size. – Other pathways should be only weakly affected. This can be checked easily via the set of untouched EMs. MCS 0, 2, 3, 4 will not affect EM1.
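Checking side effects reduces to finding the EMs that share no reaction with a candidate MCS. The sketch below ranks candidates by size first and then by how many other EMs they leave untouched; this ranking rule, the EM names and the reaction indices are hypothetical choices for illustration.

```python
def untouched_modes(cut, other_modes):
    """Names of the EMs that share no reaction with the cut set."""
    return [name for name, mode in other_modes.items() if not (cut & mode)]

def rank_targets(mcs_list, other_modes):
    """Prefer small MCSs; among equals, prefer those sparing more EMs."""
    return sorted(
        mcs_list,
        key=lambda cut: (len(cut), -len(untouched_modes(cut, other_modes))))

# Hypothetical EMs that should keep working (they do not involve obR).
other_modes = {"EM1": {6, 7}, "EM5": {2, 6}}
mcs_list = [frozenset({2, 4}), frozenset({5, 6}), frozenset({5, 8})]

for cut in rank_targets(mcs_list, other_modes):
    print(sorted(cut), untouched_modes(cut, other_modes))
```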