Bioinformatics, Vol.17 Suppl.1 (ISMB 2001) Weekly Lab. Seminar


1 Bioinformatics, Vol.17 Suppl.1 (ISMB 2001) Weekly Lab. Seminar
Summarized by Jae-Hong Eom

2 Overview
© 2001, SNU Biointelligence Lab, http://bi.snu.ac.kr/
The ‘interaction-domain pair profile’ method is a technique to predict protein-protein interaction maps across organisms. It uses a high-quality protein interaction map with interaction domain information as input to predict an interaction map in another organism. It combines sequence similarity searches with clustering. The approach is applied to the prediction of an interaction map of Escherichia coli. Results are compared with the predictions of a naïve method based only on full-length protein sequence similarity. The comparison shows improved sensitivity and identification of false positives.

3 Introduction
The recent emergence of high-throughput techniques to systematically identify physical interactions between proteins has opened new prospects. Protein interaction maps can provide detailed functional insights on characterized as well as not yet characterized proteins. They also provide an information base for the identification of biological complexes and of metabolic or signal transduction pathways.

4 Introduction (2)
On the experimental front: high-throughput techniques derived from the yeast two-hybrid system have been used to build protein interaction maps for several organisms, e.g. Saccharomyces cerevisiae, Caenorhabditis elegans, etc. On the computational front: protein linkage maps have also been predicted ab initio using algorithms based on sequence data from completely sequenced genomes, e.g. the ‘Rosetta stone’ / ‘gene fusion’ method, the ‘phylogenetic profile’ method, the ‘gene neighbor’ method, and the mRNA expression level correlation method.

5 Introduction (3)
This paper presents a computational approach aimed at predicting the protein interaction map of a target organism from a large-scale “reference” interaction map of a source organism that includes interaction domain information. The ‘Interacting Domain Profile Pairs’ (IDPP) approach is based on a combination of interaction data and sequence data, and uses a combination of homology searches and clustering. The method is applied to the inference of a protein interaction map of the model organism Escherichia coli from a Helicobacter pylori reference interaction map. It is compared with the ‘naïve method’, a prediction method based only on full-length protein sequence similarities.

6 Introduction (4)
[Figure: schematic with labels IDs, di, dj, In, Pi, Pj.]

7 Algorithm #1 – Naïve method
The correspondence is meant to link a protein in the source organism with any number of proteins sharing the same interaction capabilities in the target organism.

8 Algorithm #1 – Naïve method (2)
Construction of a correspondence between PS and PT: the library of target sequences is screened against the full-length sequences of the proteins connected in MS. A protein xT of PT is termed homologous to a protein xS of PS if there is a significant similarity between their sequences. Prediction of interactions: the target interaction map MT is completed by linking the proteins in PT. An interaction is predicted between two different target proteins xT and yT if there are two different proteins xS and yS, respectively homologous to xT and yT, that interact in MS.
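The prediction rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the homology search is assumed to have been run already and is represented by a plain dictionary, and the names `predict_naive`, `source_map`, `homologs`, and the protein identifiers are hypothetical, not taken from the paper.

```python
from itertools import product

def predict_naive(source_map, homologs):
    """Naive prediction sketch.

    source_map: iterable of (xS, yS) pairs that interact in MS.
    homologs:   dict mapping a source protein to the set of target
                proteins judged homologous to it (assumed precomputed).
    Returns a set of unordered target pairs {xT, yT}.
    """
    predicted = set()
    for x_s, y_s in source_map:
        if x_s == y_s:
            continue  # the rule asks for two different source proteins
        # every homolog of xS is paired with every homolog of yS
        for x_t, y_t in product(homologs.get(x_s, ()), homologs.get(y_s, ())):
            if x_t != y_t:  # and for two different target proteins
                predicted.add(frozenset((x_t, y_t)))
    return predicted

# Toy data (made up for illustration)
source_map = [("HP1", "HP2"), ("HP2", "HP3")]
homologs = {"HP1": {"b0001"}, "HP2": {"b0002", "b0003"}}
naive_predictions = predict_naive(source_map, homologs)
```

In the toy data, HP3 has no homolog in the target, so the HP2/HP3 interaction contributes no prediction.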

9 Algorithm #1 – Naïve method (3)
Two major a priori weaknesses: the method does not take into account the range of interactions, and it does not fully exploit the network structure of MS. → These two properties of the input data MS can be exploited to increase the sensitivity of the prediction algorithm.

10 Algorithm #2 – IDPP method
The aim of the IDPP algorithm: predict a protein interaction map on a target proteome PT from a source map MS. It fully exploits the properties of the available instance of MS: the availability of domain information for each interaction, and the fact that, from a given interaction domain (ID) d of a protein x, the protein interaction map will typically provide several instances of domain interaction with d. It uses an additional step that transforms the source map into an abstract interaction map (MDS). A correspondence is built between this abstract interaction map and the target proteome, and interactions are inferred along this correspondence.

11 Naïve method vs. IDPP method

12 Algorithm #2 – IDPP method (2)
First step: cluster the domains of different proteins that interact with a common region. For each protein xS in PS, examine the IDs of all proteins interacting with xS. These IDs can be clustered into interacting clusters (ICs), where an IC of protein xS is defined as a set of MS IDs interacting with a common region of xS.
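One simple way to group IDs by a "common region" is sketched below. The slide does not say how a common region is determined, so this sketch assumes that each interaction record gives the stretch of xS that the partner ID contacts as a (start, end) interval, and that IDs belong together when all their intervals share at least one position. The function name and the record layout are hypothetical.

```python
def interacting_clusters(records):
    """Greedy grouping of partner IDs by a shared region of xS.

    records: list of (partner_id, start, end), where [start, end] is the
             assumed region of xS contacted by that partner ID.
    Returns a list of sets; within each set all regions overlap in at
    least one common position.
    """
    clusters = []
    common_end = None  # right edge of the region shared by the open cluster
    for pid, start, end in sorted(records, key=lambda r: (r[1], r[2])):
        if clusters and start <= common_end:
            # still overlaps the stretch shared by every member so far
            clusters[-1].add(pid)
            common_end = min(common_end, end)
        else:
            clusters.append({pid})
            common_end = end
    return clusters

# Toy regions on one source protein (made up for illustration)
records = [("d1", 10, 80), ("d2", 50, 120), ("d3", 200, 260)]
ics = interacting_clusters(records)
```

Here d1 and d2 share positions 50 to 80 and form one IC, while d3 forms its own.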

13 Algorithm #2 – IDPP method (3)
An interacting cluster (IC) can be viewed as a clique (a sub-graph where each vertex is linked to all others). All the IDs are pair-connected by links called I-links, meaning ‘interact with the same part of the x protein’. Clustering of homologous IDs: within each IC, domains are regrouped by sequence similarity. ID sequences are compared pairwise, and alignments above a certain threshold are considered significant. If domains d1 and d2 show a significant similarity → a sequence similarity link (S-link) is generated between d1 and d2.
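S-link generation within one IC can be sketched as a thresholded all-against-all comparison. The similarity measure and the threshold value are not given on the slide, so the scoring function is passed in as a parameter; `s_links`, `toy_similarity`, and the scores below are hypothetical stand-ins for a real alignment score.

```python
from itertools import combinations

def s_links(ic, similarity, threshold):
    """Return the S-links inside one interacting cluster.

    ic:         iterable of ID names belonging to the same IC.
    similarity: function (d1, d2) -> score; a stand-in for an alignment score.
    threshold:  scores strictly above this value count as significant.
    """
    links = set()
    for d1, d2 in combinations(sorted(ic), 2):
        if similarity(d1, d2) > threshold:
            links.add((d1, d2))
    return links

# Toy scores (made up for illustration)
scores = {("d1", "d2"): 0.9, ("d1", "d3"): 0.2, ("d2", "d3"): 0.8}

def toy_similarity(a, b):
    return scores[(a, b)]

links = s_links({"d1", "d2", "d3"}, toy_similarity, threshold=0.5)
```

In the toy data d1 is similar to d2 and d2 to d3, but d1 is not similar to d3, which shows why S-links need not be transitive.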

14 Algorithm #2 – IDPP method (4)
Domains are clustered on the basis of S-links. The clustering is non-transitive and non-exclusive. The resulting clusters are cliques both in terms of S-links and I-links. Such a cluster is called an n-SIC (Similarity & Interacting Clique), where n is the number of IDs in the cluster. Construction of MDS edges (interactions between domain clusters): an ID Profile Pair (IDPP) is an interaction between SICs. All possible pairs of SICs are analyzed.
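A non-transitive, non-exclusive clustering whose clusters are cliques can be read as enumerating the maximal cliques of the S-link graph inside one IC. That reading is an assumption; the slide does not name an algorithm. The sketch below uses the basic Bron-Kerbosch recursion, and `maximal_cliques` and the domain names are hypothetical.

```python
def maximal_cliques(nodes, links):
    """Enumerate maximal cliques of the S-link graph within one IC.

    nodes: iterable of ID names in the IC (all already I-linked).
    links: iterable of (d1, d2) S-links.
    Returns a list of frozensets; a domain may appear in several cliques
    (non-exclusive), and an isolated domain yields a 1-SIC.
    """
    adj = {n: set() for n in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already explored vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

# Toy IC (made up for illustration): d1~d2, d2~d3, d4 similar to nothing
sics = maximal_cliques(["d1", "d2", "d3", "d4"], [("d1", "d2"), ("d2", "d3")])
```

The domain d2 ends up in two 2-SICs, and d4 forms a 1-SIC on its own.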

15 © 2001, SNU Biointelligence Lab, http://bi.snu.ac.kr/
Threshold T = 75%. Distinguished types of IDPP: ‘1:1’, ‘1:n’, ‘n:m’ (n > 0).

16 Algorithm #2 – IDPP method (5)

17 Algorithm #2 – IDPP method (6)
Searching for similarities to domain profiles in the target proteome PT: for each n-SIC, a library containing the target protein sequences is screened, using as a probe a single ID sequence if n = 1, else an ID profile. Significant hits define homologies between target protein domains and source ID profiles. The correspondence between the vertices of MDS and PT is defined by associating to each n-SIC the set of PT proteins similar to its profile. Inference from MDS to MT (final step): prediction of interactions from the IDPP collection. The property “x interacts with y” is transported along the correspondence. The inference step is similar to the one described for the ‘naïve’ method.
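The final inference step can be sketched as follows, assuming the profile searches have already been run and each SIC is associated with its set of target hits. Excluding self-pairs mirrors the "two different target proteins" condition of the naïve method and is an assumption here; `infer_target_map`, `idpps`, `sic_hits`, and the identifiers are hypothetical.

```python
from itertools import product

def infer_target_map(idpps, sic_hits):
    """Transport "x interacts with y" from MDS to the target proteome.

    idpps:    iterable of (sic_a, sic_b) pairs, the edges of MDS.
    sic_hits: dict mapping a SIC label to the set of target proteins
              similar to its ID sequence or profile (assumed precomputed).
    Returns a set of unordered target pairs {xT, yT}.
    """
    predicted = set()
    for sic_a, sic_b in idpps:
        for x_t, y_t in product(sic_hits.get(sic_a, ()), sic_hits.get(sic_b, ())):
            if x_t != y_t:  # assumption: self-pairs are not reported
                predicted.add(frozenset((x_t, y_t)))
    return predicted

# Toy data (made up for illustration)
idpps = [("SIC1", "SIC2")]
sic_hits = {"SIC1": {"b0010", "b0011"}, "SIC2": {"b0011", "b0020"}}
idpp_predictions = infer_target_map(idpps, sic_hits)
```

The structure is the same as in the naïve rule; the difference lies in what the hits are computed from (domain sequences or profiles of SICs rather than full-length proteins).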

18 Results


