Two stories: 1) Reconstructing the evolution of a complex 2) Adding qualitative labels to predicted interactions
Paulien Smits & Thijs Ettema
Department of Paediatrics, NCMD

Introduction: MRPs
Human mitoribosome
- 2 rRNAs, encoded by mtDNA
- 79 MRPs, encoded by nDNA
Select candidate MRPs for genetic disease
- Conservation
- Function
- Location
Figure: the 55S mitoribosome, with its 28S small subunit (12S rRNA) and 39S large subunit (16S rRNA). Source: Science at a Distance.

Objectives
- Detection of MRPs
- Orthology relations between MRPs from different species
- New human MRPs based on comparison with MRPs in other species
- Specific functions of MRPs based on comparison with MRPs in other species
- Extra domains in MRPs
- Find MRP-associated proteins

New orthology relations (profile-to-profile)
Human MRP    Yeast MRP
MRPS25       Mrp49
MRPS33       Rsm27
MRPL9        Mrpl50
MRPL24       Mrpl40
MRPL40       Mrpl28
MRPL45       Mba1
MRPL53       Mrpl44
Human MRP    Bacterial MRP
MRPS24       S3
MRPL47       L29

New mammalian MRPs: Rsm22
- Small subunit protein in the yeast mitoribosome
- Orthologs in eukaryotes and prokaryotes
- Homologous to rRNA methylase
- S. pombe: fusion protein Rsm22+Cox11
- Yeast: Cox11 attached to the mitoribosome
=> Rsm22 is a novel mammalian MRP with an rRNA methylase function

New mammalian MRPs: Mrp10
- Small subunit protein in the yeast mitoribosome
- Yeast mutant has a mitochondrial translation defect
- Orthologs in eukaryotes
- Distant homology with Cox19
=> Mrp10 orthologs in mammals are novel candidate MRPs

Proteome data available Smits et al, NAR 2007

Origins of supernumerary subunits
- MRPL43, MRPS25 & complex I subunit
- MRPL39 & threonyl-tRNA synthetase
- MRPL44, dsRNA-binding proteins
- Mrp1, Rsm26 & superoxide dismutase

Where do the supernumerary subunits come from?
Addition of "new" paralogous subunits in the large and the small subunit in the metazoa:
- Triplication of the S18 protein in the metazoa
- One new, metazoa-specific protein of the large subunit (L48) has been obtained by duplication of a protein from the small subunit (S10)

Addition of a new subunit (L45 / MBA1) that is homologous to TIM44 (protein import) and bacterial proteins of unknown function

Homology between Mba1/MRPL45 and TIM44 Dolezal P, Likic V, Tachezy J, Lithgow T. Evolution of the molecular machines for protein import into mitochondria. Science 2006;313:314-8

MRPL45, Mba1 & Tim44
- Mba1 is physically associated with the LSU
- Transcription of Mba1 and MRPs is co-regulated
- Function of MRPL45 is unknown
- COG4395 (MRPL45 & Tim44) has a phylogenetic distribution similar to that of COG3175 (Cox11)
=> Alpha-proteobacterial Tim44 is the ancestor of MRPL45 and its yeast ortholog Mba1, which lost the N-terminus and acquired a function in translation and COX assembly as a constituent of the mitoribosome
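The comparison of phylogenetic distributions mentioned on this slide can be illustrated with a small sketch. It is not the authors' method: it assumes each gene family is represented by the set of genomes in which it occurs and scores two families by Jaccard similarity. The family names and genome labels (`family_A`, `g1`, ...) are hypothetical placeholders, not real COG data.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two sets of genomes containing a gene family."""
    union = profile_a | profile_b
    if not union:
        return 0.0  # two empty profiles carry no evidence either way
    return len(profile_a & profile_b) / len(union)


# Hypothetical presence sets (placeholder genome labels, not real data).
profiles = {
    "family_A": {"g1", "g2", "g3", "g5"},
    "family_B": {"g1", "g2", "g3", "g4"},
    "family_C": {"g6", "g7"},
}

# Families that co-occur in most genomes score high; disjoint ones score zero.
sim_ab = jaccard(profiles["family_A"], profiles["family_B"])
sim_ac = jaccard(profiles["family_A"], profiles["family_C"])
print(f"A vs B: {sim_ab:.2f}, A vs C: {sim_ac:.2f}")
```

A high score only says that two families tend to be present and absent together; whether that reflects a functional link is a separate question that the sketch does not address.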

Extra domains

MRP interactors
- Translation
- Protein import
- Acyl carrier proteins
- Other: "hypothetical gene", essential in bacteria, mitochondrial phenotype in yeast

Conclusions
- Established orthology relations between bacterial, fungal and metazoa-specific ribosomal proteins
- Highly dynamic evolution of a mitochondrial protein complex
- Two potential novel human MRPs
- Homologies show diverse origins of supernumerary MRPs
- Some MRPs have extra domains
- Identification of novel MRP interactors

Acknowledgements
Paulien Smits, Thijs Ettema, Bert van den Heuvel, Jan Smeitink

Exploration of the omics evidence landscape to distinguish metabolic from physical interactions
Vera van Noort, Berend Snel, Martijn Huynen

Interactome Networks
It is important to know not only that two proteins interact but also how.
Figure labels: "the cell", "the network", the genome (Snel, Bork, Huynen, PNAS)

Genomic data sets
- Comprehensive complex purification data (Krogan, Gavin)
- Shared synthetic lethality
- Co-regulation (ChIP-on-chip)
- Co-expression
- Conserved co-expression (orthologous, paralogous, four species)
- Gene neighborhood conservation (STRING pink)
- Gene co-occurrence (STRING pink)

Complex purifications
- Fuse query protein with a hook
- Pull down hook from in vivo extracts
- Identify proteins that co-purify
- Socio-affinity score

Synthetic lethality
- One knock-out is not lethal, a second knock-out is not lethal, but knocking out both is lethal
- Points to complementary pathways
- Shared synthetic lethality (two genes with the same synthetic lethal partners) points to the same pathway
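A minimal sketch of how shared synthetic lethality could be scored, assuming the input is a plain list of synthetic lethal gene pairs. The function name `shared_sl_partners` and the gene names are hypothetical, and the talk may have used a different scoring scheme.

```python
from collections import defaultdict
from itertools import combinations


def shared_sl_partners(sl_pairs):
    """Map each gene pair to the synthetic lethal partners the two genes share."""
    partners = defaultdict(set)
    for a, b in sl_pairs:
        partners[a].add(b)
        partners[b].add(a)

    shared = {}
    for a, b in combinations(sorted(partners), 2):
        common = partners[a] & partners[b]
        if common:  # keep only pairs with at least one shared partner
            shared[(a, b)] = common
    return shared


# Hypothetical screen results: geneA and geneB are both lethal with geneX and geneY.
sl_pairs = [
    ("geneA", "geneX"),
    ("geneB", "geneX"),
    ("geneA", "geneY"),
    ("geneB", "geneY"),
    ("geneC", "geneZ"),
]
shared = shared_sl_partners(sl_pairs)
for pair, common in shared.items():
    print(pair, sorted(common))
```

In this toy input geneA and geneB are not synthetic lethal with each other, yet they share both partners, which is the pattern the slide reads as membership of the same pathway.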

Objective: distinguish physical from metabolic in omics data
We integrate omics data sets for the budding yeast S. cerevisiae because of its many high-quality data sets as well as classical knowledge about protein functions.
We construct two separate reference sets: one for physical interactions and one for metabolic interactions.
Physical interactions (MIPS complexes)
- Remove cytosolic ribosomes
- Remove "possible", "hypothetical", "predicted"
- Remove "other"
Metabolic interactions (KEGG pathways < 2000)
- Remove paralogs
- Remove interactions between same EC numbers
- Remove interactions that are already physical
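The three filters on the metabolic reference set can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function `build_metabolic_reference`, the protein identifiers and the EC labels are all hypothetical, and pairs are treated as unordered.

```python
def build_metabolic_reference(pathway_pairs, paralog_pairs, ec_numbers, physical_pairs):
    """Filter pathway-derived protein pairs down to a metabolic reference set."""
    paralogs = {frozenset(p) for p in paralog_pairs}
    physical = {frozenset(p) for p in physical_pairs}

    kept = set()
    for a, b in pathway_pairs:
        pair = frozenset((a, b))
        if len(pair) < 2:
            continue  # skip self-pairs
        if pair in paralogs:
            continue  # remove paralogs
        ec_a, ec_b = ec_numbers.get(a), ec_numbers.get(b)
        if ec_a is not None and ec_a == ec_b:
            continue  # remove interactions between same EC numbers
        if pair in physical:
            continue  # remove interactions that are already physical
        kept.add(pair)
    return kept


# Hypothetical inputs (placeholder identifiers and EC labels).
pathway_pairs = [("p1", "p2"), ("p1", "p3"), ("p2", "p4"), ("p3", "p4"), ("p4", "p5")]
paralog_pairs = [("p1", "p3")]
ec_numbers = {"p1": "ec_x", "p2": "ec_y", "p3": "ec_x", "p4": "ec_y", "p5": "ec_z"}
physical_pairs = [("p5", "p4")]

metabolic_reference = build_metabolic_reference(
    pathway_pairs, paralog_pairs, ec_numbers, physical_pairs
)
print(sorted(sorted(pair) for pair in metabolic_reference))
```

Storing pairs as `frozenset` objects makes the comparison independent of the order in which the two proteins are listed, so ("p5", "p4") in the physical set still removes ("p4", "p5") from the metabolic one.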

Metabolic and Physical accuracy
          Positive metabolic   Negative metabolic   Positive physical   Negative physical
in bin    TP_meta              FP_meta              TP_phys             FP_phys
A_meta = TP_meta / (TP_meta + FP_meta + TP_phys + FP_phys)
A_phys = TP_phys / (TP_meta + FP_meta + TP_phys + FP_phys)
A_total = A_meta + A_phys

Physical and metabolic accuracy No single data set

Differential accuracy
Good at predicting metabolic + bad at predicting physical interactions
          Positive metabolic   Negative metabolic   Positive physical   Negative physical
in bin    TP_meta              FP_meta              TP_phys             FP_phys
A_meta = TP_meta / (TP_meta + FP_meta + TP_phys + FP_phys)
A_phys = TP_phys / (TP_meta + FP_meta + TP_phys + FP_phys)
A_total = A_meta + A_phys
A_diff = A_meta - A_phys
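The accuracy definitions on this slide translate directly into code. The sketch below computes them for a single bin; the function name `bin_accuracies` and the counts are hypothetical and serve only to make the arithmetic concrete.

```python
def bin_accuracies(tp_meta, fp_meta, tp_phys, fp_phys):
    """Compute A_meta, A_phys, A_total and A_diff for one bin of protein pairs."""
    total = tp_meta + fp_meta + tp_phys + fp_phys
    if total == 0:
        raise ValueError("bin contains no reference pairs")
    a_meta = tp_meta / total
    a_phys = tp_phys / total
    return {
        "A_meta": a_meta,
        "A_phys": a_phys,
        "A_total": a_meta + a_phys,
        "A_diff": a_meta - a_phys,
    }


# Hypothetical counts for one bin: mostly true metabolic pairs, few physical ones.
acc = bin_accuracies(tp_meta=30, fp_meta=10, tp_phys=5, fp_phys=5)
for name, value in acc.items():
    print(f"{name} = {value:.2f}")
```

Because both accuracies share the same denominator, A_total never exceeds 1, and A_diff is positive exactly when the bin contains more true metabolic pairs than true physical pairs.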

Evidence Landscape 1
Absence of physical interactions
- Metabolic relations are found in areas where proteomic approaches report no co-purification while there are strong indications for co-regulation. Logical in hindsight?
- We should not only use integrations based on the top-scoring proteins but also use non-scoring proteins.
- We need physical protein interaction data sets where the nulls are really true nulls rather than the absence of results.
Figure labels: Krogan, Gavin, Krogan+Gavin, CoExp2Sp

Evidence Landscape 2
Figure labels: Krogan+Gavin, CoExp2Sp, sTF*CoExp, CoOcc, GeNe

Network
- PPI: C = 0.53, k = 4.1
- Met: C = 0.031, k = 2.0
- Figure label: Threonine biosynthesis
- Some pathway links between complexes
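The slide does not define C and k. Assuming they denote the average clustering coefficient and the mean degree, which is the common convention in network biology, both statistics can be computed as in the sketch below. The function `network_stats` and the two toy networks are hypothetical and are not the networks from the talk.

```python
from itertools import combinations


def network_stats(edges):
    """Return (average clustering coefficient, mean degree) of an undirected network."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        raise ValueError("network has no edges")

    n = len(neighbors)
    mean_degree = sum(len(nbrs) for nbrs in neighbors.values()) / n

    local = []
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            local.append(0.0)  # clustering is taken as 0 for degree < 2
            continue
        # Count links among the neighbors of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        local.append(2 * links / (k * (k - 1)))
    return sum(local) / n, mean_degree


# Toy examples: a fully connected complex versus a linear pathway.
complex_like = [("a", "b"), ("b", "c"), ("a", "c")]
pathway_like = [("x", "y"), ("y", "z"), ("z", "w")]
print(network_stats(complex_like))
print(network_stats(pathway_like))
```

Under this reading, the toy complex is fully clustered while the toy pathway has no triangles at all, which matches the direction of the contrast between the PPI and metabolic values on the slide.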

Conclusion & Discussion
- We can in principle distinguish metabolic from physical interactions, if two reference sets are available and if the data are comprehensive
- Yet the data are sparse (a problem for multi-dimensional analysis)
- Novel ways of integration and more types of omics data will allow extraction of more qualitative predictions on the nature of protein interactions

Acknowledgements
EMBL
- Peer Bork
- Lars Juhl Jensen
- Christian von Mering
Department of Biology, Utrecht University
- Berend Snel