05/02/2008 Jae Hyun Kim Genome scale enzyme-metabolite and drug-target interaction predictions using the signature molecular descriptor Faulon, J. L.,


1 05/02/2008 Jae Hyun Kim Genome scale enzyme-metabolite and drug-target interaction predictions using the signature molecular descriptor Faulon, J. L., M. Misra, et al. (2008), Bioinformatics 24(2): 225-33.

2 Contents
- Terminology
- Motivation
- Method
  - Molecular Signature
  - Signature Kernel
  - Signature Product Kernel
- Results
- Conclusion
jaekim@ku.edu

3 Terminology (1)
Catalyst
- Increases the rate of a chemical reaction or biological process
- Remains unchanged
Enzyme
- Biomolecules that catalyze chemical reactions
- Usually proteins
Metabolite
- Intermediates and products of metabolism
- Restricted to small molecules
Reference: www.wikipedia.org

4 Terminology (2)
Inhibitor
- Molecules that decrease enzyme activity
- Compete with substrates
- Most drugs and poisons are inhibitors
Reference: www.wikipedia.org

5 Enzyme Commission (EC) Number
EC Number
- Numerical classification scheme for enzyme-catalyzed reactions
- Four levels of hierarchy
Example: EC 3.4.11.4 (tripeptide aminopeptidases)
- EC 3: hydrolases (enzymes that use water to break up another molecule)
- EC 3.4: hydrolases that act on peptide bonds
- EC 3.4.11: hydrolases that cleave off the amino-terminal amino acid from a polypeptide
- EC 3.4.11.4: hydrolases that cleave off the amino-terminal end from a tripeptide
Reference: www.wikipedia.org
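
The four-level hierarchy can be read directly off the number. A minimal sketch, assuming a fully specified four-field EC number; the function name `ec_hierarchy` is hypothetical and not part of the presentation:

```python
from typing import List


def ec_hierarchy(ec_number: str) -> List[str]:
    """Return the nested EC classes, from the top-level class to the full number."""
    text = ec_number.strip()
    # Accept both "EC 3.4.11.4" and "3.4.11.4".
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated fields, got %r" % ec_number)
    # Each level is the prefix made of the first 1, 2, 3, and 4 fields.
    return [".".join(parts[:i]) for i in range(1, 5)]


levels = ec_hierarchy("EC 3.4.11.4")
print(levels)
```

For the example on the slide, the prefixes correspond to hydrolases, peptide-bond hydrolases, aminopeptidases, and tripeptide aminopeptidases.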

6 Motivation
Genome scale enzyme-metabolite and drug-target interaction predictions using the signature molecular descriptor
- Protein-chemical interaction
- Large-scale
- Machine-learning technique

7 Molecular Signature
G = (V, E): molecular graph
- V: vertex (atom) set
- E: edge (bond) set
Atomic signature
- Canonical representation of the subgraph surrounding a particular atom
- Includes atoms and bonds up to a predefined distance (height)
Molecular signature of G: $^{h}\sigma(G)$
- $^{h}\sigma_G(x)$: atomic signature in G rooted at x of height h
Height
- Chemicals: 0 to 6
- Proteins: 6 to 18 (1 to 7 amino acid residues)

8 Molecular Signature: Example
Molecules shown: Leucine, Isoleucine, Glycine
- Depth-first search up to "height" levels deep
- '(' going down, ')' going back up
- c_, n_: sp3 carbon/nitrogen atom
- c=, o=: sp2 (double-bond) carbon/oxygen atom
- h_: hydrogen
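
The depth-first construction can be sketched as follows. This is a simplified illustration, not the authors' implementation: it canonicalizes by sorting branch strings, only avoids stepping straight back to the parent atom, and treats the molecular signature as a count of atomic signatures over all atoms. The graph encoding (`atoms`, `bonds`), the hand-built glycine graph, and the `o_` label for the hydroxyl oxygen are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional


def atomic_signature(atoms: Dict[int, str], bonds: Dict[int, List[int]],
                     root: int, height: int,
                     parent: Optional[int] = None) -> str:
    """Depth-first string for the neighbourhood of `root` up to `height` bonds."""
    label = atoms[root]
    if height == 0:
        return label
    # '(' opens when going down a level, ')' closes when coming back up.
    branches = sorted(
        atomic_signature(atoms, bonds, nbr, height - 1, parent=root)
        for nbr in bonds[root]
        if nbr != parent
    )
    if not branches:
        return label
    return label + "(" + "".join(branches) + ")"


def molecular_signature(atoms: Dict[int, str], bonds: Dict[int, List[int]],
                        height: int) -> Counter:
    """Count each atomic signature of the given height over all atoms."""
    return Counter(atomic_signature(atoms, bonds, x, height) for x in atoms)


# Glycine (NH2-CH2-COOH) with explicit hydrogens; atom indices are arbitrary.
atoms = {0: "n_", 1: "c_", 2: "c=", 3: "o=", 4: "o_",
         5: "h_", 6: "h_", 7: "h_", 8: "h_", 9: "h_"}
bonds = {0: [1, 5, 6], 1: [0, 2, 7, 8], 2: [1, 3, 4], 3: [2], 4: [2, 9],
         5: [0], 6: [0], 7: [1], 8: [1], 9: [4]}

print(atomic_signature(atoms, bonds, root=1, height=1))
print(molecular_signature(atoms, bonds, height=1))
```

Sorting the branches is one simple way to make the string independent of the order in which neighbours are listed; the published descriptor uses its own canonicalization.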

9 Reaction Signature
General form of an enzymatic reaction:
R: $s_1 S_1 + s_2 S_2 + \dots + s_n S_n \rightarrow p_1 P_1 + p_2 P_2 + \dots + p_m P_m$
Height-h signature of reaction R
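
The formula for the reaction signature is not preserved in this transcript. The sketch below assumes the usual construction: the stoichiometry-weighted signatures of the products minus those of the substrates. The function name and the hand-written height-1 signatures are illustrative assumptions, and the toy reaction (2 H2 + O2 → 2 H2O) is not enzymatic; it is used only because it is small.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A reaction side is a list of (stoichiometric coefficient, molecular signature).
Side = List[Tuple[int, Counter]]


def reaction_signature(substrates: Side, products: Side) -> Dict[str, int]:
    """Assumed form: sum_j p_j * sig(P_j) - sum_i s_i * sig(S_i)."""
    total: Counter = Counter()
    for coeff, sig in products:
        for key, count in sig.items():
            total[key] += coeff * count
    for coeff, sig in substrates:
        for key, count in sig.items():
            total[key] -= coeff * count
    # Drop entries that cancel; keep negative counts (consumed environments).
    return {key: value for key, value in total.items() if value != 0}


# Hand-written height-1 signatures for the toy molecules.
h2 = Counter({"h_(h_)": 2})
o2 = Counter({"o=(o=)": 2})
water = Counter({"o_(h_h_)": 1, "h_(o_)": 2})

toy_reaction = reaction_signature(substrates=[(2, h2), (1, o2)],
                                  products=[(2, water)])
print(toy_reaction)
```

Atomic environments that appear unchanged on both sides cancel, so under this assumption the reaction signature records only what the reaction creates and destroys.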

10 Pairwise Kernel
To predict/classify protein-protein interactions
- Measure the similarity between two pairs of proteins
- Kernel function $K((X_1, X_2), (X'_1, X'_2))$
How to measure similarity between pairs?

11 Kernel Types
Pairwise similarity by component similarity
- If $X_1 \sim X'_1$ and $X_2 \sim X'_2$, then $(X_1, X_2) \sim (X'_1, X'_2)$
Assess similarity between pairs directly
- $x_{12} = (x_{1i} x_{2j} + x_{2i} x_{1j})$: pairwise representation of $(X_1, X_2)$
Similarity inside the pair
- Similarity between pairs
From Ben-Hur, A. and W. S. Noble (2005). "Kernel methods for predicting protein-protein interactions." Bioinformatics 21 Suppl 1: i38-46.
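
The pairwise representation above leads to a kernel on pairs that needs only a kernel on single proteins. The sketch below assumes a linear base kernel and small integer vectors purely for illustration; the function names are hypothetical. It also computes the explicit features $x_{1i} x_{2j} + x_{2i} x_{1j}$ to show that their inner product is twice the kernel-only form, a constant factor that does not affect learning.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Linear base kernel on single objects (an assumption for this sketch)."""
    return sum(x * y for x, y in zip(a, b))


def pairwise_kernel(x1, x2, y1, y2, base=dot) -> float:
    """K((x1, x2), (y1, y2)) = K(x1, y1) K(x2, y2) + K(x1, y2) K(x2, y1)."""
    return base(x1, y1) * base(x2, y2) + base(x1, y2) * base(x2, y1)


def pairwise_features(x1: Sequence[float], x2: Sequence[float]) -> List[float]:
    """Explicit pair representation with entries x1[i]*x2[j] + x2[i]*x1[j]."""
    n = len(x1)
    return [x1[i] * x2[j] + x2[i] * x1[j] for i in range(n) for j in range(n)]


x1, x2 = [1, 2], [0, 1]   # first pair
y1, y2 = [2, 1], [1, 1]   # second pair

k_implicit = pairwise_kernel(x1, x2, y1, y2)
k_explicit = dot(pairwise_features(x1, x2), pairwise_features(y1, y2))
print(k_implicit, k_explicit)
```

Because both orderings of the second pair are included, the value does not depend on which protein in a pair is listed first.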

12 Signature Kernel
Definition
- Applies to chemicals, proteins, and reactions
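
The definition itself is not preserved in this transcript. A minimal sketch, assuming the signature kernel is the inner product of two signatures viewed as occurrence-count vectors; the function name and the hand-written height-1 signatures for methane and ethane are illustrative assumptions:

```python
from collections import Counter


def signature_kernel(sig_a: Counter, sig_b: Counter) -> int:
    """Inner product of occurrence counts over the signatures both objects share."""
    return sum(count * sig_b[key] for key, count in sig_a.items() if key in sig_b)


# Hand-written height-1 molecular signatures (atomic signature -> occurrences).
methane = Counter({"c_(h_h_h_h_)": 1, "h_(c_)": 4})
ethane = Counter({"c_(c_h_h_h_)": 2, "h_(c_)": 6})

print(signature_kernel(methane, ethane))   # only "h_(c_)" is shared
print(signature_kernel(methane, methane))
```

Since the function only needs counts keyed by signature strings, the same code applies whether the signature comes from a chemical, a protein, or a reaction.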

13 Signature Product Kernel (1/2)
P: protein, C: chemical
Definition: signature of complex $P \otimes C$
Two pairs of P-C interactions: (P, C) and (Q, D)

14 Signature Product Kernel (2/2)
Similarly,
Therefore,
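
The equations of this derivation are not preserved in the transcript. The sketch below assumes that the signature of a complex is the tensor product of the protein and chemical signatures, in which case the kernel between two complexes factorizes into the product of the protein kernel and the chemical kernel. All names and the toy counts are hypothetical.

```python
from collections import Counter


def count_dot(a: Counter, b: Counter) -> int:
    """Inner product of two occurrence-count signatures."""
    return sum(count * b[key] for key, count in a.items() if key in b)


def complex_signature(protein: Counter, chemical: Counter) -> Counter:
    """Assumed tensor-product signature: one entry per (protein, chemical) key pair."""
    return Counter({(p, c): pc * cc
                    for p, pc in protein.items()
                    for c, cc in chemical.items()})


def signature_product_kernel(p: Counter, c: Counter,
                             q: Counter, d: Counter) -> int:
    """K((P, C), (Q, D)) = k(P, Q) * k(C, D), without building the tensor product."""
    return count_dot(p, q) * count_dot(c, d)


# Toy signatures: keys stand in for atomic signature strings.
P, Q = Counter({"a": 2, "b": 1}), Counter({"a": 1, "b": 3})
C, D = Counter({"x": 1, "y": 2}), Counter({"x": 4, "y": 1})

explicit = count_dot(complex_signature(P, C), complex_signature(Q, D))
factored = signature_product_kernel(P, C, Q, D)
print(explicit, factored)
```

The factored form matters at genome scale: the tensor-product signature has as many entries as the product of the two signature sizes, while the factored kernel only touches each signature once.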

15 Signature Kernel: Example (height 1)
Number of occurrences

16 Signature Product Kernel: Example

17 Signature Similarity vs. Sequence Alignment Scores
- Computed for every pair of amino acids
- Correlation: chemically similar → high BLOSUM62 score

18 EC Number Classification
Positive examples
- Downloaded from KEGG
- More than 50, at most 500
Negative examples
- Equal number, randomly selected
Signature kernel, 5-fold cross-validation
Panels: using only reactions; using only protein sequences
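
The balanced sampling and 5-fold split described above can be sketched with the standard library alone. This is an illustration of the protocol as stated on the slide, not the authors' pipeline; the helper names, the fixed seed, and the toy identifiers are assumptions.

```python
import random
from typing import Hashable, List, Sequence


def sample_negatives(positives: Sequence[Hashable],
                     candidates: Sequence[Hashable],
                     rng: random.Random) -> List[Hashable]:
    """Randomly pick as many non-positive candidates as there are positives."""
    positive_set = set(positives)
    pool = [item for item in candidates if item not in positive_set]
    # random.Random.sample raises ValueError if the pool is too small.
    return rng.sample(pool, len(positives))


def k_fold_indices(n_items: int, k: int, rng: random.Random) -> List[List[int]]:
    """Shuffle indices 0..n_items-1 and deal them into k folds of near-equal size."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


rng = random.Random(0)  # fixed seed so the split is reproducible
positives = [("e1", "r1"), ("e2", "r2")]
candidates = [(e, r) for e in ("e1", "e2", "e3") for r in ("r1", "r2", "r3")]
negatives = sample_negatives(positives, candidates, rng)
folds = k_fold_indices(len(positives) + len(negatives), 5, rng)

# Each fold serves once as the held-out set; the other four are used for training.
for held_out in folds:
    train = [i for fold in folds if fold is not held_out for i in fold]
```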

19 EC Classification
Using both sequences and reactions (signature product kernel)
Panels: Class 1, Class 1.1, Class 1.1.1, Class 1.1.1.1

20 Comparison with Other Methods
- Accuracy = (TP + TN) / (TP + TN + FP + FN)
- AUC = area under the curve
- Precision = TP / (TP + FP)
- Sensitivity = TP / (TP + FN)
- Specificity = TN / (TN + FP)
- Jaccard coefficient = TP / (TP + FP + FN)
A larger number indicates a better result.
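
The count-based measures above follow directly from the confusion matrix. A minimal sketch with made-up counts, assuming every denominator is non-zero; AUC is left out because it needs ranked prediction scores rather than four counts. The function name is hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute the count-based measures listed on the slide."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "jaccard": tp / (tp + fp + fn),
    }


# Illustrative counts only; they are not results from the paper.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print("%-12s %.3f" % (name, value))
```

The Jaccard coefficient ignores true negatives, which makes it the strictest of these measures when negatives are plentiful.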

21 Predicting New Enzyme Interactions
Prediction
- Test set: EC numbers accepted in September 2006
- Predict whether or not a given enzyme will catalyze a given reaction
Signature product kernel

22 Predict DRUGBANK Using KEGG
Signature product kernel; area under ROC = 0.74
- Class I: both in training set
- Class II: different partners
- Class III: only target
- Class IV: only drug
- Class V: none
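
One way to read the five classes is by how much of a test pair was already seen during training. The sketch below encodes that reading, which is an interpretation of the slide labels rather than a definition taken from the paper: Class I means the pair itself is in the training set, and Class II means both members are present but only with other partners. The function name and identifiers are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (drug, target)


def pair_class(drug: str, target: str, training_pairs: Set[Pair]) -> str:
    """Assign a test pair to Class I-V by its overlap with the training set."""
    known_drugs = {d for d, _ in training_pairs}
    known_targets = {t for _, t in training_pairs}
    if (drug, target) in training_pairs:
        return "I"    # both in the training set, as a pair
    if drug in known_drugs and target in known_targets:
        return "II"   # both seen, but with different partners
    if target in known_targets:
        return "III"  # only the target was seen
    if drug in known_drugs:
        return "IV"   # only the drug was seen
    return "V"        # neither was seen


training = {("d1", "t1"), ("d2", "t2")}
for test_pair in [("d1", "t1"), ("d1", "t2"), ("d9", "t1"), ("d1", "t9"), ("d9", "t9")]:
    print(test_pair, pair_class(test_pair[0], test_pair[1], training))
```

Splitting the evaluation this way separates performance on familiar drugs and targets from performance on pairs where one or both partners are new to the model.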

23 Conclusion
- Unified method for predicting protein-chemical interactions
- Atomistic structure representation of proteins encompasses the information stored in substitution matrices

