Protein Interactions
Michel Dumontier, Ph.D., Carleton University (michel@bioinfocg.com)
June 17, 2005. Lecture 4.1. (c) 2005 CGDN
Outline
- Protein interactions
- Discovery
- Storage
- Experimental
Molecular Interactions
- Between two molecular objects (A and B): DNA, RNA, gene, protein, molecular complex, small molecule, photon
- Binding sites
- Under some experimental condition
- With a particular cellular location
- Possibly having a chemical action
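The attributes of a molecular interaction listed above map naturally onto a small record type. The following is a minimal sketch only: the class and field names are hypothetical, chosen for illustration, and are not taken from BIND or any other schema. The example instance borrows the SNF1/SNF4 pair that appears on the two-hybrid slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Molecule types as listed on the slide.
MOLECULE_TYPES = {
    "DNA", "RNA", "gene", "protein",
    "molecular complex", "small molecule", "photon",
}


@dataclass(frozen=True)
class Molecule:
    """One of the two interacting objects (hypothetical structure)."""
    name: str
    kind: str

    def __post_init__(self):
        # Reject types that are not in the slide's list.
        if self.kind not in MOLECULE_TYPES:
            raise ValueError(f"unknown molecule type: {self.kind}")


@dataclass
class Interaction:
    """An interaction between molecules A and B; every field
    beyond a and b is optional, mirroring the slide's wording."""
    a: Molecule
    b: Molecule
    binding_sites: List[str] = field(default_factory=list)
    experimental_condition: Optional[str] = None
    cellular_location: Optional[str] = None
    chemical_action: Optional[str] = None


# Illustrative record: only the two molecules and the method are filled in.
example = Interaction(
    a=Molecule("SNF1", "protein"),
    b=Molecule("SNF4", "protein"),
    experimental_condition="yeast two-hybrid",
)
```

Leaving most fields optional reflects that a single experiment rarely reports binding sites, location and chemical action all at once.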
Interaction Discovery
- Databases: fully electronic; easily computer readable
- Literature: increasingly electronic; human readable
- Biologists' brains: richest data source; limited-bandwidth access
- Experiments: basis for models
Yeast Two-Hybrid Assay
The two-hybrid system is a molecular genetic tool that facilitates the study of protein-protein interactions.
- If two proteins interact, a reporter gene is transcriptionally activated (e.g. gal1-lacZ, the beta-galactosidase gene).
- A colour reaction can be seen on specific media.
You can use this to:
- Study the interaction between two proteins that you expect to interact
- Find proteins (prey) that interact with a protein you already have (bait)
Two-hybrid assay
[Figure: four-step schematic. Labels: A (SNF1), B (SNF4), GAL4-DBD, transcription activation domain, UASG, GAL1 (allows growth on galactose).]
Fields S, Song O. Nature. 1989 Jul 20;340(6230):245-6. PMID: 2547163
Some two-hybrid caveats
- Does the DNA-binding domain fusion (A) have activity by itself?
- Is the 'interaction' between A and B mediated by some other protein (C)?
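Both caveats say the same thing: the reporter can switch on for reasons other than a direct A-B contact. A toy truth-table sketch makes that explicit. The function and parameter names are hypothetical, and real assays are not this clean.

```python
def reporter_on(direct_interaction,
                bait_autoactivates=False,
                bridged_by_third_protein=False):
    """Toy model: the reporter is activated if ANY of the three
    situations holds, so a positive signal alone is ambiguous."""
    return (direct_interaction
            or bait_autoactivates
            or bridged_by_third_protein)


def supports_direct_interaction(signal,
                                bait_autoactivates,
                                bridged_by_third_protein):
    """A positive signal supports a direct A-B interaction only when
    controls have ruled out the two alternative explanations."""
    return (signal
            and not bait_autoactivates
            and not bridged_by_third_protein)


# A self-activating bait gives a signal with no interaction at all.
false_positive = reporter_on(direct_interaction=False,
                             bait_autoactivates=True)
```

In this model a bait-only control (no prey) is what detects the first caveat; the second needs an independent method.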
Some two-hybrid questions
- Are the proteins expressed?
- Are they over-expressed?
- Are they in-frame?
- Are the interacting domains defined?
- Was the observation reproducible?
- Was the strength of interaction significant?
- Was another method used to back up the conclusion?
- Are the two proteins from the same compartment?
Affinity purification
- The protein of interest (A) carries a tag modification (e.g. HA/GST/His); a second molecule will bind the 'tag'.
- In the cell, A sits among lots of other untagged proteins, including B, a naturally binding protein.
- Ruptured membranes release A and B into the cell extract.
- Untagged proteins go through fastest (flow-through).
- Tagged complexes are slower and come out later (eluate).
Some affinity purification questions
- Is the bait protein expressed and in frame?
- Is the bait protein observed?
- Is the bait protein over-expressed?
- Are the interacting domains defined?
- Was the observation reproducible?
- Was the interactor found in the background?
- Was the strength of interaction significant?
- Was the interaction saturable?
- Was the interactor stoichiometric with the bait protein?
- Was another method used to back up the conclusion?
- Was tandem-affinity purification (TAP) used?
- Was the interaction shown using an extract or a purified protein?
- Is the inverse interaction observable?
- Are the two proteins from the same compartment?
- Are the two proteins known to be involved in the same process?
- Is the interactor likely to be physiologically significant?
Some affinity purification caveats
First and most importantly, this is only a representation of the observation. You can only tell which proteins are in the eluate; you can't tell how they are connected to one another. If only one other protein (B) is present, then it's likely that A and B interact directly. But what if two other proteins (B and C) were present along with A?
Complexes with unknown topology
[Figure: three alternative arrangements of A, B and C.]
Which of these models is correct? The complex described by this experimental result is said to have an unknown topology.
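The ambiguity can be made concrete by listing every way the co-purified proteins could be wired together. The sketch below is an illustration, not a method from the lecture. It assumes each protein occurs exactly once and that the complex is held together by pairwise contacts (a connected graph); the function names are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Depth-first search: True if every node is reachable from the first."""
    nodes = list(nodes)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        current = stack.pop()
        for u, v in edges:
            if current in (u, v):
                other = v if current == u else u
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return len(seen) == len(nodes)


def candidate_topologies(proteins):
    """All sets of pairwise contacts that link the proteins into one complex."""
    pairs = list(combinations(sorted(proteins), 2))
    models = []
    # A connected graph on n nodes needs at least n - 1 edges.
    for k in range(len(proteins) - 1, len(pairs) + 1):
        for edges in combinations(pairs, k):
            if is_connected(proteins, edges):
                models.append(edges)
    return models


models = candidate_topologies(["A", "B", "C"])
for edges in models:
    print(", ".join(f"{u}-{v}" for u, v in edges))
```

Under these assumptions three proteins already admit four arrangements (three chains and one triangle), and the count grows quickly with complex size, which is why eluate contents alone cannot settle the topology.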
Complexes with unknown stoichiometry
[Figure: a complex drawn with two copies of A together with B and C.]
Here's another possibility. The complex described by this experimental result is also said to have unknown stoichiometry.
High-throughput Mass Spectrometric Protein Complex Identification (HMS-PCI)
Mike Tyers, SLRI. [Figure label: Ste12.]
Ho et al. Nature. 2002 Jan 10;415(6868):180-3
Synthetic Genetic Interactions
- Synthetic genetic interactions (lethal, slow growth): mate two mutants without phenotypes to get a daughter cell with a phenotype
- Synthetic lethal (SL) and slow-growth screens use robotic mating with the yeast deletion library
- Genetic interactions provide functional data on protein interactions or redundant genes
- About 23% of known SLs (1295, YPD+MIPS) are known protein interactions in yeast
Tong et al. Science. 2001 Dec 14;294(5550):2364-8
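A figure such as "about 23% of known SLs are known protein interactions" is an overlap between two sets of unordered gene pairs. A minimal sketch of that calculation follows; the gene names and toy data are hypothetical placeholders, not the YPD or MIPS data.

```python
def unordered(pairs):
    """Treat (x, y) and (y, x) as the same pair."""
    return {frozenset(pair) for pair in pairs}


def overlap_fraction(genetic_pairs, physical_pairs):
    """Fraction of genetic-interaction pairs that are also
    reported as physical (protein-protein) interactions."""
    genetic = unordered(genetic_pairs)
    physical = unordered(physical_pairs)
    if not genetic:
        return 0.0
    return len(genetic & physical) / len(genetic)


# Toy data with made-up gene names (assumption, for illustration only).
synthetic_lethals = [("g1", "g2"), ("g3", "g4"), ("g5", "g6"), ("g7", "g8")]
protein_interactions = [("g2", "g1"), ("g9", "g10")]

fraction = overlap_fraction(synthetic_lethals, protein_interactions)
print(f"{fraction:.0%} of SL pairs are also physical interactions")
```

Using `frozenset` for each pair avoids missing a match only because the two sources list the genes in opposite order.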
Working overtime: Charlie Boone's robots
Synthetic Genetic Interactions in Yeast (Tong, Boone)
[Figure legend categories: Cell Polarity, Cell Wall Maintenance, Cell Structure, Mitosis, Chromosome Structure, DNA Synthesis, DNA Repair, Unknown, Others.]
SGA Synthetic Genetic Interaction Network 2004
~1000 genes, ~4000 interactions, 132 SGA screens. Tong, Boone, Science, Feb 2004
A measure of confidence?
How do you know if the interaction really exists?
- Each method has its advantages and disadvantages.
- Be aware of systematic errors (e.g. tag effects).
- Be aware of contaminating proteins.
- Each method observes interactions under slightly different experimental conditions.
- Support from many different sources is certainly better than support from just one.
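One simple way to act on the last point is to count how many distinct methods report each interaction. The sketch below does only that; it is a crude proxy under stated assumptions, not a confidence score from the lecture, and the observation tuples are made-up examples.

```python
from collections import defaultdict


def support_by_method(observations):
    """Map each unordered pair to the set of methods that reported it.
    Repeats of the same method count once."""
    support = defaultdict(set)
    for a, b, method in observations:
        support[frozenset((a, b))].add(method)
    return support


# Hypothetical observations: (molecule, molecule, experimental method).
observations = [
    ("A", "B", "two-hybrid"),
    ("B", "A", "affinity purification"),
    ("A", "B", "two-hybrid"),
    ("A", "C", "affinity purification"),
]

support = support_by_method(observations)

# Rank pairs by number of independent methods, best supported first.
ranked = sorted(support.items(), key=lambda item: len(item[1]), reverse=True)
for pair, methods in ranked:
    print(sorted(pair), len(methods), sorted(methods))
```

Counting methods rather than raw reports keeps one heavily repeated assay from looking like independent confirmation.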
Outline
- Molecular interactions
- Discovery
- Storage
- Data Mining
- Databases
- File Formats
- Data Mining
Interaction/Pathway Databases
- Arguably the most accessible data source, but...
- Varied formats, representation, coverage
- Pathway data extremely difficult to combine and use
- Pathway Resource List (http://cbio.mskcc.org/prl/)
BIND (http://bind.ca)
A free, open-source database for archiving and exchanging molecular assembly information. BIND is managed by the Blueprint Initiative at Mount Sinai Hospital in Toronto.
The database contains:
- Interactions/reactions
- Molecular complexes
- Pathways
BIND has an extensive data model and GNU software tools, and is based on the NCBI toolkit; it was recently extended to XML/Java.
The ~175000 BIND records are curated and validated.
Bader GD, Betel D, Hogue CW. (2003) BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res. 31(1):248-50 PMID: 12519993
BIND Interaction Types
Interaction Experimental Evidence in BIND (figure label: Remaining 1%)
55 Identifier Searches Supported!
GI Pair - CSV Export
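A pairwise export like this is convenient because it loads directly into an adjacency map. The sketch below assumes a layout the slide does not specify: two comma-separated columns holding the GI numbers of the interacting molecules, possibly preceded by a header row. The GI values in the sample are made up.

```python
import csv
import io
from collections import defaultdict


def read_gi_pairs(handle):
    """Build an undirected adjacency map from rows of 'gi_a,gi_b'.
    Rows that do not start with two integers (e.g. a header) are skipped."""
    network = defaultdict(set)
    for row in csv.reader(handle):
        if len(row) < 2:
            continue
        try:
            gi_a, gi_b = int(row[0]), int(row[1])
        except ValueError:
            continue  # header line or malformed row
        network[gi_a].add(gi_b)
        network[gi_b].add(gi_a)
    return network


# Hypothetical sample; a real export would be opened with open(path, newline="").
sample = io.StringIO("gi_a,gi_b\n111,222\n111,333\n")
network = read_gi_pairs(sample)
print(dict(network))
```

Storing each pair in both directions makes neighbour lookups symmetric, which suits interaction data where A-B and B-A mean the same thing.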
BIND Record Header
- BIND record identifier
- Description & Division
- Publications that support or dispute interaction
- Export Options
- Network Visualization
BIND Record View
The Interacting Molecules (A and B)
- Main identifier: GI
- Organism
- Cross-references and aliases
- Gene Ontology terms
- Proteoglyphs: graphical representations of domain and protein structure
- Ontoglyphs: graphical representations of molecule function, localization and binding
Gene Ontology: functional protein annotation (http://www.geneontology.org)
Controlled vocabulary for protein function and localization:
- Molecular function, e.g. DNA helicase
- Biological process, e.g. mitosis
- Cellular component, e.g. nucleus
Thousands of terms…
Ontoglyph Summary View
Ontoglyph Filtering
Other Interaction Databases
- DIP: http://dip.doe-mbi.ucla.edu
- MINT: http://mint.bio.uniroma2.it/mint
- MIPS: http://mips.gsf.de/proj/yeast/tables/interaction/
- IntAct (EBI's interaction database): http://www.ebi.ac.uk/intact/
- Human Protein Interaction Database: http://www.hpid.org/
- TRANSFAC (transcription factors): http://www.gene-regulation.com/
Information Exchange
[Figure labels: Database, Software, User with Data, Exchange Format.]
>100 DBs and tools: a Tower of Babel
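A quick count shows why a shared exchange format matters at this scale. The sketch assumes, for illustration only, that every resource uses its own format and that converters are one-directional.

```python
def pairwise_converters(n):
    """Every format converted directly to every other: n * (n - 1)."""
    return n * (n - 1)


def hub_converters(n):
    """Each format gets one importer and one exporter
    to a single shared exchange format: 2 * n."""
    return 2 * n


n = 100  # the slide's ">100 DBs and tools", rounded down
print(pairwise_converters(n), "direct converters vs", hub_converters(n), "via a common format")
```

With 100 resources that is 9900 direct converters against 200 through a common format, and the gap widens quadratically as resources are added.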
Data Exchange File Formats
- BIND (http://bind.ca): peer reviewed but closed process (Spec v3.1); ASN.1 or XML DTD/Schema
- PSI-MI (http://psidev.sourceforge.net): peer reviewed, HUPO community standard; widely adopted
- BioPAX (http://www.biopax.org): community schema (Sloan Kettering, BioPathways Consortium); XML Schema, OWL, Protégé and GKB
- SBML: widely adopted for representing models of biochemical reaction networks
BIND: ASN.1 (text), XML, Flat File
PSI level 2
PSI Record Format
BioPAX (http://www.biopax.org)
Represent:
- Metabolic pathways
- Signaling pathways
- Protein-protein, molecular interactions
- Gene regulatory pathways
- Genetic interactions
Accommodate representations used in existing databases such as BioCyc, BIND, WIT, aMAZE, KEGG, Reactome, etc.
Community effort (open meetings)
Conclusion
- Many experimental techniques generate interaction data.
- Interaction databases like BIND are a great resource for building up interaction networks into pathways.
- Common standards for file formats are imperative for making use of all this data!