1
Protein-protein interactions
Protein Analysis Workshop 2010
Bioinformatics group, Institute of Biotechnology, University of Helsinki
Hung Ta, xuanhung.ta@helsinki.fi
2
Outline
- Why are protein-protein interactions (PPIs) so important?
- High-throughput experimental methods for discovering PPIs: yeast two-hybrid (Y2H), AP-MS
- PPI databases: DIP, BioGRID, IntAct, HPRD, …
- Computational prediction of PPIs: genomic methods, biological context methods, integrative methods
- STRING (EMBL)
3
Why are PPIs so important?
- The gene is the basic unit of heredity, and genomes are available.
- genome → proteome → interactome
- Proteins, the working molecules of a cell, carry out many biological activities.
- Proteins function by interacting with other proteins, DNA, RNA, and small molecules.
4
Search for drug molecules:
[Diagram: body proteins P1 to PN, pathogen protein X, drug molecule Y]
- The body produces a list of proteins: P1, P2, P3, …, PN.
- A pathogen (virus or bacterium) enters the body and produces its own protein, say X.
- X interacts with one of the body's proteins, say P1, and keeps it from its routine activities, so disease emerges.
- Introduce into the body a new molecule Y such that X is more attracted to Y than to P1, freeing P1 to return to its routine work.
5
Search for drug molecules:
Bringing an effective drug to market could:
- take 10-15 years
- cost up to US$800 million
- require testing up to 30,000 candidate molecules
Databases of molecular interactions or linkages could help to narrow the search for drug molecules.
6
The types of PPIs
- Binary (physical) interactions: binding between two proteins whose residues are in contact at some point in time.
- Functional linkages: pairwise relationships between proteins that work together (participating in a common structural complex or pathway) to carry out biological tasks.
7
Protein physical interactomes and functional linkage maps are available for:
- S. cerevisiae (Uetz et al. 2000; Ito et al. 2001; Ho et al. 2002; Gavin et al. 2002, 2006; Krogan et al. 2006; Tarassov et al. 2008; Yu et al. 2008)
- E. coli (Butland et al. 2005; Arifuzzaman et al. 2006)
- C. elegans (Li et al. 2004)
- D. melanogaster (Giot et al. 2003)
- Humans (Rual et al. 2005; Stelzl et al. 2005; Ewing et al. 2007)
- …
8
High-throughput experimental methods for discovering PPIs
- Yeast two-hybrid (Y2H): Ito T. et al., 2001; Uetz P. et al., 2000; Yu H. et al., 2008; Rual et al. 2005; Stelzl et al. 2005
- Affinity purification followed by mass spectrometry (AP-MS): Gavin AC et al., 2002, 2006; Ho Y. et al., 2002; Krogan NJ et al., 2006
9
Y2H experiments
- Idea: use a protein of interest as bait to discover the proteins that physically interact with it; these are called prey.
- A single transcription factor is split into two pieces, the binding domain (BD) and the activation domain (AD).
- The bait protein is fused to the BD and the prey protein to the AD.
- If bait and prey interact, the BD and AD are brought together and transcription of the reporter gene is initiated.
- High-throughput screening tests the bait against a library of preys.
10
AP-MS experiments
- Fuse a TAP tag, consisting of protein A and a calmodulin-binding peptide (CBP) separated by a TEV protease cleavage site, to the target protein.
- The first AP step, using an IgG matrix, eliminates many contaminants.
- In the second AP step, CBP binds tightly to calmodulin-coated beads.
- After washing, which removes the remaining contaminants and the TEV protease, the bound material is released under mild conditions with EGTA.
- Proteins are identified by mass spectrometry.
11
AP-MS experiments
- The MS output is a set of lists, each containing a bait protein and its co-purified partners (preys), each accompanied by a reliability score.
- A scoring system combining the spoke and matrix models generates a network of binary PPIs, in which each interaction has a confidence score.
- Low-scoring links are eliminated to obtain a high-confidence network.
- The network is partitioned into densely connected regions, which are called complexes.
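The spoke and matrix models differ in which protein pairs they derive from a single purification. A minimal sketch, assuming a purification is given as one bait plus a list of preys; the function names and the toy proteins A to D are hypothetical, and the confidence scoring that combines the two models is not shown:

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: link the bait to each co-purified prey."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_pairs(bait, preys):
    """Matrix model: link every pair of proteins found in the purification."""
    members = sorted(set(preys) | {bait})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical purification: bait A pulls down preys B, C and D.
spoke = spoke_pairs("A", ["B", "C", "D"])
matrix = matrix_pairs("A", ["B", "C", "D"])
print(len(spoke), len(matrix))
```

The spoke pairs are always a subset of the matrix pairs, which is why the matrix model tends to add more candidate links (and more false positives) per purification.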
12
Computational methods of prediction
- Comparative genomic methods: gene neighbourhood, gene fusion, domain-based method, phylogenetic profiles
- Integrative methods
- Biological context methods: co-expression, GO, text mining
13
Gene neighbourhood based method
Proteins a and b whose genes lie close together in several different genomes are predicted to interact.
Dandekar, T. et al. (1998). Conservation of gene order: A fingerprint of proteins that physically interact. Trends in Biochemical Sciences, 23(9), 324–328.
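The gene neighbourhood idea can be sketched as counting the genomes in which two genes sit close together. This is only an illustration: it assumes each genome is an ordered list of ortholog identifiers, and the function name, the `max_gap` parameter, and the toy genomes are hypothetical.

```python
def neighbourhood_support(genomes, gene_a, gene_b, max_gap=1):
    """Count genomes in which gene_a and gene_b lie within max_gap positions."""
    support = 0
    for gene_order in genomes.values():
        if gene_a in gene_order and gene_b in gene_order:
            distance = abs(gene_order.index(gene_a) - gene_order.index(gene_b))
            if distance <= max_gap:
                support += 1
    return support


# Hypothetical gene orders in three genomes.
toy_genomes = {
    "genome1": ["a", "b", "x", "y"],
    "genome2": ["y", "b", "a", "x"],
    "genome3": ["a", "x", "y", "b"],
}
print(neighbourhood_support(toy_genomes, "a", "b"))
```

A pair supported in many distantly related genomes is a stronger candidate than one that is adjacent in a single genome.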
14
Gene fusion (Rosetta stone)
Proteins a and b are predicted to interact if they are fused into a single protein in another organism.
Enright, A. J. et al. (1999). Protein interaction maps for complete genomes based on gene fusion events. Nature, 402(6757), 86–90.
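A toy version of the Rosetta stone test, assuming each query protein has already been assigned to a single sequence family and each protein of the other organism is described by the set of families it contains. Real pipelines detect this by sequence similarity searches; the names and data below are hypothetical.

```python
def rosetta_stone_pairs(families, composite_proteins):
    """Return query protein pairs whose families co-occur in one composite protein.

    families: dict mapping query protein name -> family id
    composite_proteins: list of sets of family ids, one set per protein
        of the other organism
    """
    names = sorted(families)
    predicted = set()
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            fam_1, fam_2 = families[first], families[second]
            if fam_1 == fam_2:
                continue  # same family: not a fusion of two distinct proteins
            if any(fam_1 in comp and fam_2 in comp for comp in composite_proteins):
                predicted.add((first, second))
    return predicted


# Hypothetical example: families F1 and F2 are fused in the other organism.
query_families = {"a": "F1", "b": "F2", "c": "F3"}
other_organism = [{"F1", "F2"}, {"F3"}]
fusion_pairs = rosetta_stone_pairs(query_families, other_organism)
print(fusion_pairs)
```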
15
Domain based methods
[Diagram: well-known experimental PPI data → AS, MLE, PE → inferred domain-domain interactions (DDIs) → interact/non-interact prediction for Protein A and Protein B]
AS: association; MLE: maximum likelihood estimation; PE: parsimony explanation.
- Validation of inferred DDIs remains difficult because sufficient and unbiased benchmark datasets are lacking.
- The methods show limited performance at predicting PPIs.
H.X. Ta, L. Holm, Biochem. Biophys. Res. Commun. (2009)
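One common formulation of the association (AS) score for a domain pair is the fraction of protein pairs containing that domain pair that are known to interact. The sketch below assumes that formulation; it is not necessarily the exact scoring used in the cited study, and the function name and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def association_scores(protein_domains, protein_pairs, interacting):
    """Score each domain pair as (interacting protein pairs) / (all protein pairs)."""
    total = Counter()
    positive = Counter()
    for prot_a, prot_b in protein_pairs:
        domain_pairs = {
            frozenset((dom_a, dom_b))
            for dom_a, dom_b in product(protein_domains[prot_a], protein_domains[prot_b])
        }
        for domain_pair in domain_pairs:
            total[domain_pair] += 1
            if frozenset((prot_a, prot_b)) in interacting:
                positive[domain_pair] += 1
    return {dp: positive[dp] / total[dp] for dp in total}


# Hypothetical proteins with one domain each, and two known interactions.
toy_domains = {"P1": {"X"}, "P2": {"Y"}, "P3": {"X"}, "P4": {"Z"}}
all_pairs = list(combinations(sorted(toy_domains), 2))
known_ppis = {frozenset(("P1", "P2")), frozenset(("P2", "P3"))}
as_scores = association_scores(toy_domains, all_pairs, known_ppis)
print(as_scores[frozenset(("X", "Y"))])
```

A new protein pair can then be called interacting if any of its domain pairs scores above a chosen threshold, which is where the benchmark problem noted above becomes visible.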
16
Phylogeny based methods
Proteins a and c are predicted to interact if they have similar phylogenetic profiles.
Pellegrini, M. et al. (1999). Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles. PNAS, 96(8), 4285–4288.
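A phylogenetic profile can be represented as a presence/absence vector over a fixed list of genomes. The sketch below compares profiles by the fraction of genomes in which they agree; this similarity measure, the function name, and the toy profiles are assumptions for illustration (other measures such as mutual information are also used).

```python
def profile_similarity(profile_1, profile_2):
    """Fraction of genomes in which two presence/absence profiles agree."""
    if len(profile_1) != len(profile_2):
        raise ValueError("profiles must cover the same genomes")
    matches = sum(x == y for x, y in zip(profile_1, profile_2))
    return matches / len(profile_1)


# Hypothetical profiles over five genomes (1 = homolog present, 0 = absent).
profiles = {
    "a": (1, 0, 1, 1, 0),
    "b": (1, 1, 0, 0, 1),
    "c": (1, 0, 1, 1, 0),
}
print(profile_similarity(profiles["a"], profiles["c"]))
print(profile_similarity(profiles["a"], profiles["b"]))
```

Here a and c have identical profiles and would be predicted to be functionally linked, while a and b would not.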
17
Biological context methods
- Gene expression: two proteins whose genes show very similar expression patterns across multiple states or experiments may be considered candidates for functional association and possibly direct physical interaction.
- GO (Gene Ontology) annotations: two interacting proteins are likely to share GO term annotations.
- Text mining: extract information on interacting proteins from the literature (e.g. PubMed): "Is protein K mentioned together with protein I in publications?"
These techniques are used to validate PPIs discovered by other approaches, or are combined with other evidence in integrative approaches.
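Similarity of expression patterns is often measured with the Pearson correlation coefficient. A minimal sketch in plain Python; the choice of Pearson correlation and the toy expression values are assumptions, not something the slides specify.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (norm_x * norm_y)


# Hypothetical expression levels of three genes across four conditions.
gene_k = [1.0, 2.0, 3.0, 4.0]
gene_i = [2.0, 4.0, 6.0, 8.0]   # rises together with gene_k
gene_j = [8.0, 6.0, 4.0, 2.0]   # falls as gene_k rises
print(pearson(gene_k, gene_i), pearson(gene_k, gene_j))
```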
18
Integrative methods
- Naive Bayes
- Random forest
- Decision tree
- Kernels
- Logistic regression
- Support vector machines
Jansen R. et al., Science 2003
Bader J.S. et al., Nat Biotech 2004
Lin N. et al., BMC Bioinformatics 2004
Zhang L. et al., BMC Bioinformatics 2004
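Under the naive Bayes assumption that evidence sources are conditionally independent, the posterior odds of interaction are the prior odds multiplied by the likelihood ratio of each source. The sketch below shows only that arithmetic; the prior odds and likelihood ratios are made-up illustrative values, not figures from the cited papers.

```python
from math import prod


def posterior_odds(prior_odds, likelihood_ratios):
    """Combine independent evidence: posterior odds = prior odds * product of LRs."""
    return prior_odds * prod(likelihood_ratios)


# Hypothetical values: 1 true interaction per 600 random pairs,
# co-expression evidence with LR = 20 and shared GO annotation with LR = 50.
odds = posterior_odds(1 / 600, [20, 50])
print(odds > 1)  # odds above 1 mean the pair is more likely to interact than not
```

Neither evidence source alone lifts the odds above 1 in this example (20/600 and 50/600), which is the motivation for integrating several weak sources.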
19
Databases of PPIs
- DIP (http://dip.doe-mbi.ucla.edu): 71,275 interactions, 23,200 proteins, 372 organisms
- BioGRID (http://www.thebiogrid.org): 247,366 non-redundant interactions, 31,254 unique proteins, 17 organisms
- IntAct (http://www.ebi.ac.uk/intact): 232,793 interactions, 69,335 proteins
- MINT (http://mint.bio.uniroma2.it): 89,956 interactions, 31,631 proteins
- SGD (http://www.yeastgenome.org): Saccharomyces Genome Database
- HPRD (http://www.hprd.org/): 39,194 interactions, 30,047 proteins
- MIPS: interactions, complexes
- STRING: known and predicted protein-protein interactions
20
DIP
- Protein function
- Protein-protein relationship
- Evolution of protein-protein interaction
- The network of interacting proteins
- Unknown protein-protein interaction
- The best interaction conditions
21
DIP-Searching information
22
Find information about your protein
23
DIP Node (DIP:1143N)
24
Graph of PPIs around DIP:1143N
- Nodes are proteins; edges are PPIs.
- The center node is DIP:1143N.
- Edge width encodes the number of independent experiments identifying the interaction.
- Green edges are core interactions; red edges are unverified interactions.
- Click on a node or an edge to learn more about the protein or the interaction.
25
List of interacting partners of DIP:1143N
26
STRING: Search Tool for the Retrieval of Interacting Genes/Proteins
- A database of known and predicted protein interactions
- Direct (physical) and indirect (functional) associations
- The database currently covers 2,590,259 proteins from 630 organisms
[Figures: data sources the database is derived from; supporting institutions]
27
Searching information
Query by protein name or protein sequence.
28
Graph of PPIs
- Nodes are proteins.
- Each colored line is a piece of evidence for an interaction between two proteins; the color encodes the method used to detect the interaction.
- Click on a node for information about the corresponding protein.
- Click on an edge for information about the interaction between the two proteins.
29
List of predicted partners
- Partners are listed with a description and a confidence score.
- Choose among the different views to see more detail.
30
Neighborhood View
- The red block is the queried protein; the other blocks are its neighbors in each organism.
- Click on a block for information about the corresponding protein.
- Closely related organisms show similar protein neighborhood patterns.
- The view helps to find genes/proteins that lie close together in a genomic region.
31
Occurrence View
- Represents the phylogenetic profiles of proteins.
- The color of a box indicates the sequence similarity between the protein and its homologous protein in the organism.
- The size of a box shows how many members of the family have the reported sequence similarity.
- Click on a box to see the sequence alignment.
32
Gene Fusion View
- This view shows the individual gene fusion events per species.
- Two differently colored boxes next to each other indicate a fusion event.
- Hovering over a region in a gene gives the gene name; clicking on a gene gives more detailed information.
34
Why protein-protein interactions (PPIs)?
PPIs are involved in many biological processes:
- Signal transduction
- Protein complexes or molecular machinery
- Protein carriers
- Protein modifications (phosphorylation)
- …
PPIs help to decipher the molecular mechanisms underlying biological functions and improve approaches to drug discovery.
35
Assessment of large-scale datasets of PPIs
Yu H, et al. (2008). Science 322: 104-110
- Benchmarking high-throughput interactions: Y2H (Uetz et al. 2000; Ito et al. 2001) and AP-MS (Gavin et al. 2006; Krogan et al. 2006)
- Binary gold standard (Binary-GS): positive reference set (PRS) and random reference set (RRS)
- MIPS co-complex gold standard (MIPS-GS)
- Measure the large-scale datasets against Binary-GS and MIPS-GS
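Measuring a dataset against a reference set can be as simple as computing the fraction of reference pairs that the dataset contains. A minimal sketch with made-up protein pairs; the function name and all data are hypothetical, and the published assessment is more involved than this.

```python
def recovered_fraction(dataset, reference_set):
    """Fraction of reference pairs that also appear in the dataset (order-insensitive)."""
    reference = {frozenset(pair) for pair in reference_set}
    if not reference:
        raise ValueError("reference set is empty")
    data = {frozenset(pair) for pair in dataset}
    return len(data & reference) / len(reference)


# Hypothetical high-throughput dataset and reference sets.
toy_dataset = [("A", "B"), ("C", "D"), ("E", "F")]
toy_prs = [("B", "A"), ("C", "D"), ("G", "H"), ("I", "J")]   # positive reference set
toy_rrs = [("A", "Z"), ("E", "F"), ("K", "L"), ("M", "N")]   # random reference set
prs_rate = recovered_fraction(toy_dataset, toy_prs)
rrs_rate = recovered_fraction(toy_dataset, toy_rrs)
print(prs_rate, rrs_rate)
```

A good dataset recovers a large fraction of the PRS and a small fraction of the RRS.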
36
Assessment of large-scale datasets of PPIs
Yu H, et al. (2008). Science 322: 104-110
- AP-MS performs well at detecting co-complex associations according to MIPS-GS.
- Y2H performs well at detecting binary interactions according to Binary-GS.
[Figure: Y2H and AP-MS panels]