Ferhat Ay, Tamer Kahveci & Valerie de-Crecy Lagard
4/17/2015 Ferhat Ay
Metabolic Pathways
What and Why?
Metabolic Pathway Alignment: finding a mapping of the entities of the pathways.
[Figure: two pathways built from compounds C1-C5, reactions R1, R2 and enzymes E1, E2, with their entities mapped onto each other]
Applications
○ Drug Target Identification
○ Metabolic Reconstruction
○ Phylogeny Prediction
Challenges
Graph Alignment: pathway alignment is hard! Even after abstraction, the metabolic pathway alignment problem is NP-complete.
Existing Algorithms: Heymans et al. (2003), Clemente et al. (2005), Pinter et al. (2005), Singh et al. (2007), …
Abstraction is a problem!
[Figure: pathways E1 C1 C2 E2 C3 C4 E3 E4 and E1 C1 E2 C3 E3 abstracted to the enzyme-only chains E1 E2 E3 E4 and E1 E2 E3]
- Where are the compounds?
- E1 C1 E2 or E1 C2 E2?
Outline
○ Graph Model of Pathways
○ Consistency of an Alignment
○ Homological & Topological Similarities
○ Eigenvalue Problem
○ Similarity Score
○ Experimental Results
Non-Redundant Graph Model
[Figure: example pathway graph; recoverable node labels: Pyruv, Lip-E, ThPP, S-Ac, 2-ThP, A-CoA, Di-hy, R0014, R7618, R3270]
Consistency
1- Align only entities of the same type (compatible). [Figure: reactions R1, R2 and compounds C1, C2 in one pathway; R1, C1 in the other]
2- The overall mapping should be one-to-one. [Figure: reactions R1, R2, R3]
Consistency
3- Align two entities u_i, v_i only if there exists an aligned entity pair u_j, v_j such that u_j and v_j are on the reachability paths of u_i and v_i, respectively.
[Figure: two pathways (compounds C1-C5, reactions R1, R2). Legend: Aligned Entities, Backward Reachability Path, Forward Reachability Path]
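
The three consistency conditions can be sketched as a checker. This is a minimal illustration, not the authors' implementation: pathways are assumed to be directed graphs stored as adjacency dicts, `kind` is a hypothetical helper returning an entity's type, and condition 3 is read literally as "some other aligned pair lies on the forward paths of both entities, or on the backward paths of both".

```python
from collections import deque

def reachable(graph, start):
    """Return every node reachable from `start` along directed edges."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def reversed_graph(graph):
    """Flip every edge so that backward reachability becomes forward."""
    rev = {}
    for src, targets in graph.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev

def is_consistent(g1, g2, kind, alignment):
    """Check the three conditions for a list of (u, v) aligned pairs."""
    # Condition 1: only entities of the same type may be aligned.
    if any(kind(u) != kind(v) for u, v in alignment):
        return False
    # Condition 2: the mapping must be one-to-one.
    lefts = [u for u, _ in alignment]
    rights = [v for _, v in alignment]
    if len(set(lefts)) != len(lefts) or len(set(rights)) != len(rights):
        return False
    # Condition 3 (literal reading, an assumption): each pair needs another
    # aligned pair on matching reachability paths. Under this reading a
    # single-pair alignment is rejected.
    r1, r2 = reversed_graph(g1), reversed_graph(g2)
    for ui, vi in alignment:
        fwd1, fwd2 = reachable(g1, ui), reachable(g2, vi)
        bwd1, bwd2 = reachable(r1, ui), reachable(r2, vi)
        supported = any(
            (uj, vj) != (ui, vi)
            and ((uj in fwd1 and vj in fwd2) or (uj in bwd1 and vj in bwd2))
            for uj, vj in alignment
        )
        if not supported:
            return False
    return True
```

For example, with pathway 1 as C1 → R1 → C2 → R2 → C3 and pathway 2 as c1 → r1 → c2, aligning (C1, c1) and (R1, r1) passes all three checks, while aligning C1 with r1 fails condition 1.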
Problem Statement
Given a pair of metabolic pathways, our aim is to find the consistent alignment (mapping) of the entities (enzymes, reactions, compounds) such that the similarity between the pathways (SimP score) is maximized.
Pairwise Similarities (Homology of Entities)
Pairwise Similarities (Homology)
Enzyme Similarity (SimE)
○ Hierarchical Enzyme Similarity - Webb EC. (2002)
○ Information-Content Enzyme Similarity - Pinter et al. (2005)
Compound Similarity (SimC)
○ Identity Score for compounds
○ SIMCOMP Compound Similarity - Hattori et al. (2003)
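
The slide names these scores without giving formulas, so the following is only a sketch under stated assumptions: the hierarchical enzyme score is taken to be 0.25 per shared leading level of the four-level EC number, and the identity score is 1 for identical compound identifiers and 0 otherwise. Both function names are hypothetical.

```python
def hierarchical_sim_e(ec_a, ec_b):
    """Assumed scale: 0.25 for each leading EC level the two enzymes share."""
    shared = 0
    for x, y in zip(ec_a.split("."), ec_b.split(".")):
        # Stop at the first mismatch or at an unspecified level ("-").
        if x != y or x == "-":
            break
        shared += 1
    return shared / 4

def identity_sim_c(compound_a, compound_b):
    """Identity score: 1.0 for the same compound identifier, else 0.0."""
    return 1.0 if compound_a == compound_b else 0.0
```

Under this assumed scale, EC 1.1.1.1 and EC 1.1.1.2 share three levels and score 0.75, while enzymes from different top-level classes score 0.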
Pairwise Similarities
Reaction Similarity (SimR)
[Figure: reaction R1 with enzymes E1, E2, input compounds C1, C2 and output compound C3; reaction R2 with enzyme E3, input compound C4 and output compounds C5, C6, C7]
$$\mathrm{SimR}(R_1,R_2) = \underbrace{\max\big(\mathrm{SimE}(E_1,E_3),\ \mathrm{SimE}(E_2,E_3)\big)}_{\text{Enzymes}} + \underbrace{\max\big(\mathrm{SimC}(C_1,C_4),\ \mathrm{SimC}(C_2,C_4)\big)}_{\text{Input Compounds}} + \underbrace{\max\big(\mathrm{SimC}(C_3,C_5),\ \mathrm{SimC}(C_3,C_6),\ \mathrm{SimC}(C_3,C_7)\big)}_{\text{Output Compounds}}$$
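
The reaction similarity on this slide can be sketched as a sum of three best-pair terms. The dict layout of a reaction (`enzymes`, `inputs`, `outputs`) and the helper `best_pair` are assumptions made for illustration; taking the maximum over all cross pairs generalizes the slide's example, where R2 has a single enzyme and a single input compound.

```python
def best_pair(sim, xs, ys):
    """Highest similarity over all cross pairs; 0.0 if either side is empty."""
    return max((sim(x, y) for x in xs for y in ys), default=0.0)

def sim_r(r1, r2, sim_e, sim_c):
    """Unweighted sum of the enzyme, input-compound and output-compound terms."""
    return (
        best_pair(sim_e, r1["enzymes"], r2["enzymes"])
        + best_pair(sim_c, r1["inputs"], r2["inputs"])
        + best_pair(sim_c, r1["outputs"], r2["outputs"])
    )

# The two reactions drawn on the slide.
R1 = {"enzymes": ["E1", "E2"], "inputs": ["C1", "C2"], "outputs": ["C3"]}
R2 = {"enzymes": ["E3"], "inputs": ["C4"], "outputs": ["C5", "C6", "C7"]}
```

Any pairwise scores can be plugged in as `sim_e` and `sim_c`, for example a hierarchical enzyme score and an identity or SIMCOMP compound score.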
Topological Similarity (Topology of Pathways)
Neighborhood Graphs
[Figure: a pathway with reactions R1-R4, enzymes E1-E3 and compounds C1-C9, split into three neighborhood graphs: Enzymes (E1, E2, E3), Reactions (R1, R2, R3, R4), Compounds (C1-C9)]
Topological Similarities
Pathway 1: reactions R1, R2, R3, R4, |R| = 4; BN(R3) = {R1, R2}, FN(R3) = {R4}
Pathway 2: reactions R1, R3, R4, R5, |R̄| = 4; BN(R3) = {R1}, FN(R3) = {R4, R5}
A_R matrix: (|R| |R̄|) x (|R| |R̄|) = 16 x 16, indexed by reaction pairs R1-R1, …, R2-R1, …, R3-R3, …, R4-R4, R4-R5
$$A_R[R_3,R_3][R_2,R_1] = \frac{1}{2\cdot 1 + 1\cdot 2} = \frac{1}{4}$$
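
The worked entry above suggests how the whole A_R matrix could be filled in. The sketch below assumes that the entry for a pair (u, v) and a neighbor pair (u', v') is 1 / (|BN(u)||BN(v)| + |FN(u)||FN(v)|) when u' and v' are both backward or both forward neighbors, and 0 otherwise. The edge lists are inferred from the BN and FN sets on the slide; any other edges in the drawn pathways are unknown and left out.

```python
def neighbors(nodes, edges):
    """Backward (BN) and forward (FN) neighbor sets of every node."""
    bn = {n: set() for n in nodes}
    fn = {n: set() for n in nodes}
    for src, dst in edges:
        fn[src].add(dst)
        bn[dst].add(src)
    return bn, fn

def topology_matrix(nodes1, edges1, nodes2, edges2):
    """Sparse A matrix as a dict: A[(u, v)][(u2, v2)] -> weight."""
    bn1, fn1 = neighbors(nodes1, edges1)
    bn2, fn2 = neighbors(nodes2, edges2)
    A = {}
    for u in nodes1:
        for v in nodes2:
            support = len(bn1[u]) * len(bn2[v]) + len(fn1[u]) * len(fn2[v])
            entries = {}
            if support:
                # Pairs of backward neighbors, then pairs of forward neighbors.
                for side1, side2 in ((bn1[u], bn2[v]), (fn1[u], fn2[v])):
                    for u2 in side1:
                        for v2 in side2:
                            entries[(u2, v2)] = 1 / support
            A[(u, v)] = entries
    return A

# Edges inferred from the slide (an assumption, see above).
nodes1, edges1 = ["R1", "R2", "R3", "R4"], [("R1", "R3"), ("R2", "R3"), ("R3", "R4")]
nodes2, edges2 = ["R1", "R3", "R4", "R5"], [("R1", "R3"), ("R3", "R4"), ("R3", "R5")]
A_R = topology_matrix(nodes1, edges1, nodes2, edges2)
```

With these inputs, `A_R[("R3", "R3")][("R2", "R1")]` reproduces the 1/4 from the slide, and the dict has one entry per reaction pair, 16 in total.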
Problem Formulation
[Figure: two pathways, one with reactions R1-R8 and one with reactions R1, R2, R3, R5, R7, R8]
Focus on R3 - R3 matching
Iteration 0: Only pairwise similarity of R3 and R3
Iteration 1: Support of aligned first degree neighbors added
Iteration 2: Support of aligned second degree neighbors added
Iteration 3: Support of aligned third degree neighbors added
Problem Formulation
Initial Reaction Similarity Matrix → H_R^0 Vector → Power Method Iterations → H_R^s Vector → Final Reaction Similarity Matrix
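
The power method step can be sketched as follows. The update rule used here, H^(k+1) = alpha * A * H^k + (1 - alpha) * H^0 followed by normalization, is an assumption about how the topology matrix and the initial pairwise similarities are combined; `alpha`, `tol` and `max_iter` are illustrative parameter names, and the matrix is a plain list of lists to keep the example dependency-free.

```python
def power_iterate(A, h0, alpha=0.7, tol=1e-9, max_iter=1000):
    """Iterate h <- alpha * A h + (1 - alpha) * h0 until h stops changing."""
    n = len(h0)
    total = sum(h0)  # assumed positive
    h0 = [x / total for x in h0]  # work with a normalized start vector
    h = list(h0)
    for _ in range(max_iter):
        nxt = [
            alpha * sum(A[i][j] * h[j] for j in range(n)) + (1 - alpha) * h0[i]
            for i in range(n)
        ]
        scale = sum(nxt)
        nxt = [x / scale for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, h)) < tol:
            return nxt
        h = nxt
    return h
```

With `alpha=0` the function returns the normalized initial similarities unchanged, so no topology is used; larger values let support from aligned neighbors propagate over more iterations.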
Max Weight Bipartite Matching
Six possible orderings, only 3 are unique:
○ Reactions First
○ Enzymes First
○ Compounds First
R First Pruning
[Figure: bipartite graphs between the two pathways' reactions (R1, R2, R3), compounds (C1-C4 vs. C2, C3) and enzymes (E1, E2, E3 vs. E1, E2). Legend: Weighted Edges, Aligned Entities, Inconsistent Edges]
Consistency Assured!
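
A maximum weight bipartite matching can be illustrated with an exhaustive search. This is a sketch for small inputs only, assuming non-negative weights; a real implementation would use a polynomial-time method such as the Hungarian algorithm, and the pruning of inconsistent edges described on the slide is not shown.

```python
from itertools import permutations

def max_weight_matching(weights):
    """Return (total, pairs) of the best one-to-one assignment.

    `weights[r][c]` is the similarity between entity r of pathway 1 and
    entity c of pathway 2. Every entity on the smaller side is matched.
    """
    n_rows = len(weights)
    n_cols = len(weights[0]) if weights else 0
    if n_rows > n_cols:
        # Solve the transposed problem, then swap the indices back.
        transposed = [[weights[r][c] for r in range(n_rows)] for c in range(n_cols)]
        total, pairs = max_weight_matching(transposed)
        return total, sorted((r, c) for c, r in pairs)
    best_total, best_pairs = float("-inf"), []
    for cols in permutations(range(n_cols), n_rows):
        total = sum(weights[r][c] for r, c in enumerate(cols))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(cols))
    return best_total, best_pairs
```

Running the matching on the reaction similarity matrix first, and only afterwards on compounds and enzymes, corresponds to the "Reactions First" ordering.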
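Alignment Score (SimP)
[Figure: two aligned pathways; the aligned entities are compounds C1, C2, C4 and enzymes E1, E2]
0 <= SimP <= 1
SimP = 1 for identical pathways
$$\mathrm{SimP} = \beta\,\frac{\mathrm{Sim}(C_1)+\mathrm{Sim}(C_2)+\mathrm{Sim}(C_4)}{3} + (1-\beta)\,\frac{\mathrm{Sim}(E_1)+\mathrm{Sim}(E_2)}{2}$$
where β in [0, 1] weights the compound term against the enzyme term.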
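
The alignment score on this slide reads as a weighted combination of the average similarity of the aligned compounds and the average similarity of the aligned enzymes. The sketch below assumes that form; the weight name `beta` and its default of 0.5 are placeholders, not values taken from the slides.

```python
def sim_p(compound_sims, enzyme_sims, beta=0.5):
    """Weighted mean of aligned-compound and aligned-enzyme similarities."""
    def mean(values):
        return sum(values) / len(values) if values else 0.0
    return beta * mean(compound_sims) + (1 - beta) * mean(enzyme_sims)
```

For the slide's example, `compound_sims` would hold Sim(C1), Sim(C2), Sim(C4) and `enzyme_sims` would hold Sim(E1), Sim(E2). If every similarity is 1, the score is 1 for any `beta`, matching the identical-pathways case.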
Impact of Alpha
α = 0: Only pairwise similarities of entities - no iterations
α = 1: Only topology of the graphs
α = 0.7 is good!
Alternative Entities & Paths
Eukaryotes (e.g. H. sapiens): Mevalonate Path
Bacteria (e.g. E. coli): Non-Mevalonate Path
Kim J. et al. (2007); Kuzuyama T. et al. (2006)
Phylogeny Prediction
[Figure: NCBI Taxonomy compared with Our Prediction; labeled groups: Thermoprotei, Eukaryota, Archaea, Deuterostomia]
Effect Of Consistency Restriction
Running Time
For source code and more information:
Error Tolerance
Phylogenetic Reconstruction
Z-Score Calculation
Challenges
Abstraction is a Problem!
[Figure: with no abstraction, Pathway 1 is E1 C1 C2 E2 C3 C4 E3 E4 and Pathway 2 is E1 C1 E2 C3 E3; after abstraction, Pathway 1 becomes E1 E2 E3 E4 and Pathway 2 becomes E1 E2 E3]
- Where are the compounds?
- E1 C1 E2 or E1 C2 E2?
Alignment Problem is NP-complete!