1 Permissions Statement
This Presentation is copyright Mark Gerstein, Yale University, 2006. Feel free to use images in it with PROPER acknowledgement (via citation to relevant papers or link to gersteinlab.org).
Do not reproduce without permission. Gerstein.info/talks (c) 2005; (c) Mark Gerstein, 2002, Yale, bioinfo.mbb.yale.edu

2 Understanding Protein Function on a Genome-scale using Networks
Mark B Gerstein, Yale (Comp. Bio. & Bioinformatics)
Orfeome 2006, 2006.11.16, 16:30-17:00

3 The problem: Grappling with Function on a Genome Scale?
250 of ~530 originally characterized on chr. 22 [Dunham et al.]
>25K Proteins in Entire Human Genome (with alt. splicing)

4 Traditional single molecule way to integrate evidence & describe function
Descriptive Name: Elongation Factor 2
Summary sentence describing function: This protein promotes the GTP-dependent translocation of the nascent protein chain from the A-site to the P-site of the ribosome.
EF2_YEAST
Lots of references to papers

5 Toward Systematic Ontologies for Function, using Networks
– General Networks [Eisenberg et al.]
– Hierarchies & DAGs [Enzyme, Bairoch; GO, Ashburner; MIPS, Mewes, Frishman]
– Interaction Vectors [Lan et al, IEEE 90:1848]

6 Networks occupy a midway point in terms of level of understanding
– 1D: Complete Genetic Partslist
– ~2D: Bio-molecular Network Wiring Diagram
– 3D: Detailed structural understanding of cellular machinery
[Jeong et al.]

7 Networks as a universal language
– Disease Spread [Krebs]
– Protein Interactions [Barabasi]
– Social Network
– Food Web
– Neural Network [Cajal]
– Electronic Circuit
– Internet [Burch & Cheswick]

8 Outline
Background
– Why Study Networks?
– Interaction Networks and their properties
3-D Structural Analysis of Protein Interaction Networks Gives New Insight Into Protein Function, Network Topology and Evolution
– 3-D structural point of view
– Network properties revisited
Genomic analysis of the hierarchical structure of regulatory networks
– Construction
– Characteristics
TopNet & tYNA

9 PROTEIN INTERACTION NETWORKS IN YEAST
A snapshot of the current interactome (ILLUSTRATIVE): DIP (Database of Interacting Proteins)
Description and methodologies
Determined by:
– Large-scale yeast two-hybrid
– TAP-tagging
– Literature curation
Currently over 20,000 unique interactions are available in yeast
Spawned a field of computational “graph theory” analyses that view proteins as “nodes” and interactions as “edges”
Source: Gavin et al. Nature (2002), Uetz et al. Nature (2000), Cytoscape and DIP

10 TINY GLOSSARY: DEGREE AND HUBS
A: Degree = 5
C: Degree = 1
A is a “Hub”*
*The definition of hubs is somewhat arbitrary; usually a cutoff is used
Source: PMK
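
A minimal sketch of the glossary above, assuming the network is given as an undirected list of interacting pairs. The function names, the toy edge list and the cutoff value are illustrative assumptions, not taken from the slides; the slide only states that a hub is defined by a somewhat arbitrary degree cutoff.

```python
from collections import defaultdict


def degrees(edges):
    """Return {node: degree} for an undirected edge list.

    Self-interactions and duplicate edges are ignored, so the degree is
    the number of distinct interaction partners.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


def hubs(edges, cutoff):
    """Return the nodes whose degree is at least `cutoff` (an arbitrary choice)."""
    return {node for node, k in degrees(edges).items() if k >= cutoff}


# Toy network (hypothetical): A has five partners, each partner has one.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "F")]
toy_degrees = degrees(toy_edges)
toy_hubs = hubs(toy_edges, cutoff=5)
```

Changing `cutoff` changes which proteins count as hubs, which is the arbitrariness the footnote on the slide points out.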

11 INTERESTING PROPERTIES OF INTERACTION NETWORKS
OVERVIEW: Examples of studies
Network topology
– What distribution does the degree (number of interaction partners) follow?
Relationship of topology and genomic features
– What is the relationship between the degree and a protein's essentiality?
– Is there a relationship between a protein's connectivity and expression profile?
– What is the relationship between a protein's evolutionary rate and its degree?
Network evolution
– How did the observed network topology evolve?
Source: Various, see following slides

12 INTERACTION NETWORKS ARE SCALE-FREE – THEIR TOPOLOGY IS DOMINATED BY SO-CALLED HUBS
– So-called scale-free topology has been observed in many kinds of networks (among them interaction networks)
– Scale-freeness: a small number of hubs and a large number of poorly connected nodes (“power-law behavior”)
– Topology is dominated by “hubs”
– Scale-freeness is in stark contrast to a normal (Gaussian) distribution
p(k) ~ k^(-γ)
Source: Barabasi, A. and Albert, R., Science (1999)
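
As a rough illustration of the power-law statement above, the exponent γ in p(k) ~ k^(-γ) can be read off as the negative slope of log p(k) against log k. The sketch below is an assumption-laden toy: the function names are hypothetical, and an unweighted least-squares fit on a log-log plot is a crude estimator that is used here only because it is short (it is not claimed to be the method behind the slide).

```python
import math
from collections import Counter


def degree_distribution(degree_list):
    """Return {k: fraction of nodes with degree k}."""
    counts = Counter(degree_list)
    n = len(degree_list)
    return {k: c / n for k, c in sorted(counts.items())}


def fit_gamma(pk):
    """Estimate gamma in p(k) ~ k^(-gamma) by a least-squares line
    through the points (log k, log p(k)). Needs at least two distinct k."""
    points = [(math.log(k), math.log(p)) for k, p in pk.items() if k > 0 and p > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx  # slope is -gamma


# Synthetic check: an exact power law with gamma = 2.5 (unnormalised).
synthetic = {k: k ** -2.5 for k in range(1, 21)}
gamma_hat = fit_gamma(synthetic)
```

On real interaction data the tail of the distribution is noisy, so the fitted value depends strongly on binning and on the range of k included.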

13 HUBS TEND TO BE IMPORTANT PROTEINS, THEY ARE MORE LIKELY TO BE ESSENTIAL PROTEINS AND TEND TO BE MORE CONSERVED
– By now it is well documented that proteins with a large degree tend to be essential proteins in yeast. (“Hubs are essential”)
– Likewise, it has been found that hubs tend to evolve more slowly than other proteins (“Hubs are slower evolving”)
– Some debate on this
Source: Jeong et al. Nature (2001), Yu et al. TiG (2004) and Fraser et al. Science (2002)

14 OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE
EXAMPLES
Positions: “Yes, hubs are more conserved” vs. “No, the relationship is unclear”
Papers cited:
– Fraser et al. Science (2002)
– Fraser et al. BMC Evol. Biol. (2003)
– Wuchty Genome Res. (2004)
– Jordan et al. Genome Res. (2002)
– Hahn et al. J. Mol. Evol. (2004)
– Jordan et al. BMC Evol. Biol. (2003)
But the “Yes” side appears to be winning … Fraser Nature Genetics (2005)
Source: See text

15 THERE IS A RELATIONSHIP BETWEEN NETWORK TOPOLOGY AND GENE EXPRESSION DYNAMICS
[Figure: frequency vs. co-expression correlation]
Source: Han et al. Nature (2004) and Yu*, Kim* et al. (Submitted)

16 SCALE FREENESS GENERALLY EVOLVES THROUGH PREFERENTIAL ATTACHMENT (THE RICH GET RICHER)
Description
– Theoretical work shows that a mechanism of preferential attachment leads to a scale-free topology (“The rich get richer”)
– In interaction networks, gene duplication followed by mutation of the duplicated gene is generally thought to lead to preferential attachment
– Simple reasoning: The partners of a hub are more likely to be duplicated than the partners of a non-hub
The Duplication Mutation Model (ILLUSTRATIVE)
– Gene duplication: The interaction partners of A are more likely to be duplicated
Source: Albert et al. Rev. Mod. Phys. (2002) and Middendorf et al. PNAS (2005)
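
The reasoning above can be made concrete with a toy simulation. This is a minimal sketch under stated assumptions, not the model used in the cited papers: the seed network, the uniform choice of the gene to duplicate, and the single `loss_prob` parameter (probability that the copy loses an inherited interaction) are all illustrative choices. A protein with k partners gains a link whenever one of its k partners is duplicated and the copied link is retained, so well-connected proteins gain links faster.

```python
import random


def duplication_mutation(steps, loss_prob, seed=0):
    """Grow an undirected network by repeated gene duplication.

    Each step copies a randomly chosen node together with its
    interactions ("duplication"), then drops each copied interaction
    with probability `loss_prob` ("mutation").
    Returns an adjacency dict {node: set of partners}.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # seed network: a single interacting pair
    for _ in range(steps):
        parent = rng.choice(sorted(adj))
        child = len(adj)  # next unused integer label
        kept = {p for p in sorted(adj[parent]) if rng.random() >= loss_prob}
        adj[child] = set(kept)
        for partner in kept:
            adj[partner].add(child)  # the partner gains a link
    return adj


network = duplication_mutation(steps=200, loss_prob=0.3, seed=1)
max_degree = max(len(partners) for partners in network.values())
```

Inspecting the degrees in `network` for different values of `loss_prob` gives a feel for how a few nodes accumulate many partners while most stay poorly connected.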

18 MOTIVATION
[Figure (ILLUSTRATIVE): hub A with partners B1-4, shown for a Cdk/cyclin complex and for part of the RNA-pol complex. Network perspective: the two are the same (=). Structural biology perspective: they are not (≠).]
There remains a rich source of knowledge unmined by network theorists!

19 THERE IS A PROBLEM WITH SCALE-FREENESS AND REALLY BIG HUBS IN INTERACTION NETWORKS
[Figure (ILLUSTRATIVE): a really big hub (>200 interactions)]
Gedankenexperiment
– What is the maximum number of neighbors a protein can have?
– Clearly, a protein is very unlikely to have >200 simultaneous interactors.
– Some of the >200 are most likely false positives.
– Some others are going to be mutually exclusive interactors (i.e. binding to the same interface).
Conclusion
– There appears to be an obvious discrepancy between >200 and 12.
Wouldn’t it be great to be able to see the different binding interfaces?
Source: DIP, Institut fuer Festkoerperchemie (Univ. Tuebingen)

20 UTILIZING PROTEIN CRYSTAL STRUCTURES, WE CAN DISTINGUISH THE DIFFERENT BINDING INTERFACES
[Figure (ILLUSTRATIVE): PDB]
– Map all interactions to available homologous structures of interfaces
– Distinguish overlapping from non-overlapping interfaces
Source: Kim et al. Science (in press)

21 SHORT DIGRESSION: THIS ALLOWS US TO DISTINGUISH SYSTEMATICALLY BETWEEN SIMULTANEOUSLY POSSIBLE AND MUTUALLY EXCLUSIVE INTERACTIONS
[Figure panels: simultaneously possible interactions; mutually exclusive interactions]
Source: Kim et al. Science (in press)
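
A minimal sketch of the distinction above, assuming each interaction has already been mapped to a labelled interface on the protein of interest. The data layout, the interface labels and the function names are hypothetical; how interfaces are actually mapped and tested for overlap is described in the cited paper, not here. Two partners that bind the same interface are treated as mutually exclusive, and partners on different interfaces as simultaneously possible.

```python
from collections import defaultdict
from itertools import combinations


def classify_partner_pairs(interface_of):
    """Label every pair of partners of each protein.

    `interface_of` maps (protein, partner) -> label of the interface on
    `protein` that binds `partner` (hypothetical input format).
    Returns {(protein, partner1, partner2): label}.
    """
    by_protein = defaultdict(dict)
    for (protein, partner), interface in interface_of.items():
        by_protein[protein][partner] = interface
    labels = {}
    for protein, partners in by_protein.items():
        for p, q in combinations(sorted(partners), 2):
            if partners[p] == partners[q]:
                labels[(protein, p, q)] = "mutually exclusive"
            else:
                labels[(protein, p, q)] = "simultaneously possible"
    return labels


def interface_count(interface_of, protein):
    """Number of distinct interfaces that `protein` uses."""
    return len({i for (prot, _), i in interface_of.items() if prot == protein})


# Toy example: A binds B1 and B2 through interface i1, and B3 through i2.
toy = {("A", "B1"): "i1", ("A", "B2"): "i1", ("A", "B3"): "i2"}
pair_labels = classify_partner_pairs(toy)
```

The same interface count is what separates a hub that uses one interface for many partners from a hub that uses several interfaces at once.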

22 SIMULTANEOUSLY POSSIBLE INTERACTIONS (“PERMANENT”) MORE OFTEN LINK PROTEINS THAT ARE FUNCTIONALLY SIMILAR, COEXPRESSED AND CO-LOCATED
[Figure panels, each comparing mutually exclusive with simultaneously possible interactions: fraction same biological process (p<<0.001); fraction same molecular function (p<<0.001); fraction same cellular component (p<<0.001); co-expression correlation (p<<0.001)]
Source: Kim et al. Science (in press)

23 THIS IS WHAT THE RESULTING NETWORK LOOKS LIKE
The Structural Interaction Network (SIN)
Properties
– Represents a “very high confidence” network
– Total of 873 nodes and 1269 interactions, each of which is structurally characterized
– 438 interactions are classified as mutually exclusive and 831 as simultaneously possible
– While much smaller than DIP, it is similar in size to other high-confidence datasets
Source: PDB, Pfam, iPfam and Kim et al. Science (in press)
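
A quick arithmetic check of the figures quoted above. The node and interaction counts come from the slide; the mean-degree formula 2E/N is standard for undirected networks and assumes every interaction links two distinct proteins, which the slide does not state. The variable names are illustrative.

```python
# Figures quoted on the slide.
nodes = 873
interactions = 1269
mutually_exclusive = 438
simultaneously_possible = 831

# The two classes should account for every interaction.
classes_total = mutually_exclusive + simultaneously_possible

# Mean degree of an undirected network: each interaction adds 1 to the
# degree of two proteins (assumes no self-interactions).
mean_degree = 2 * interactions / nodes

# Share of interactions in each class.
fraction_exclusive = mutually_exclusive / interactions
fraction_simultaneous = simultaneously_possible / interactions
```

The two classes do sum to the stated total, and the mean degree comes out at just under three partners per protein.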

25 REMEMBER THE NETWORK PROPERTIES AS WE DESCRIBED BEFORE?
OVERVIEW: Examples of studies
Network topology
– What distribution does the degree (number of interaction partners) follow?
– Does the network easily separate into more than one component?
Relationship of topology and genomic features
– What is the relationship between the degree and a protein's essentiality?
– Is there a relationship between a protein's connectivity and expression profile?
– What is the relationship between a protein's evolutionary rate and its degree?
Network evolution
– How did the observed network topology evolve?
Source: Various, see following slides

26 THERE DO NOT APPEAR TO BE THE KINDS OF REALLY BIG HUBS AS SEEN BEFORE – IS THE TOPOLOGY STILL SCALE-FREE?
Degree distribution
Properties
– With the maximum number of interactions at 13, there are no “really big hubs” in this network
– Note that in other high-confidence datasets (of similar size), there are still proteins with a much higher degree
– The degree distribution appears to top out much earlier and to be less scale-free than that of other networks
Source: Kim et al. Science (in press)

27 IT’S REALLY ONLY THE MULTI-INTERFACE HUBS THAT ARE SIGNIFICANTLY MORE LIKELY TO BE ESSENTIAL
[Figure: percentage of essential proteins for entire genome; all proteins in our dataset; single-interface hubs only; multi-interface hubs only]
Source: Kim et al. Science (in press)

28 DATE-HUBS AND PARTY-HUBS ARE REALLY SINGLE-INTERFACE AND MULTI-INTERFACE HUBS
[Figure: frequency vs. expression correlation for all proteins in our dataset; single-interface hubs only; multi-interface hubs only]
Source: Han et al. Nature (2004) and Kim et al. Science (in press)

29 AND ONLY MULTI-INTERFACE PROTEINS ARE EVOLVING SLOWER; SINGLE-INTERFACE HUBS ARE NOT
[Figure: evolutionary rate (dN/dS) for entire genome; all proteins in our dataset; single-interface hubs only; multi-interface hubs only]
Source: Kim et al. Science (in press)

30 OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE
Positions: “Yes, hubs are more conserved” vs. “No, the relationship is unclear”
Papers cited:
– Fraser et al. Science (2002)
– Fraser et al. BMC Evol. Biol. (2003)
– Wuchty Genome Res. (2004)
– Jordan et al. Genome Res. (2002)
– Hahn et al. J. Mol. Evol. (2004)
– Jordan et al. BMC Evol. Biol. (2003)
But the “Yes” side appears to be winning …
This debate may have arisen because the two different sides were all looking at the wrong variable!
Source: See text

31 IN FACT, EVOLUTIONARY RATE CORRELATES BEST WITH THE FRACTION OF INTERFACE AVAILABLE SURFACE AREA
[Figure: DATA IN BINS]
– Small portion of surface area involved in interfaces – fast evolving
– Large portion of surface area involved in interfaces – slow evolving
Source: Kim et al. Science (in press)

32 IS THERE A DIFFERENCE BETWEEN SINGLE-INTERFACE HUBS AND MULTI-INTERFACE HUBS WITH RESPECT TO NETWORK EVOLUTION?
The Duplication Mutation Model
In the structural viewpoint
If these models were correct, there would be an enrichment of paralogs among B
Source: Kim et al. Science (in press)

33 MULTI-INTERFACE HUBS DO NOT APPEAR TO EVOLVE BY A GENE DUPLICATION – THE DUPLICATION MUTATION MODEL CAN ONLY EXPLAIN THE EXISTENCE OF SINGLE-INTERFACE HUBS
[Figure: fraction of paralogs between pairs of proteins for random pair; same partner; same partner, different interface; same partner, same interface]
But that also means that the duplication-mutation model cannot explain the full current interaction network!
Source: Kim et al. Science (in press)

35 Yeast Regulatory Network: a platform for integration [Yu et al (2003), TIG]
Transcription Factors -> Target Genes
– 142 transcription factors
– 3,420 target genes
– 7,074 regulatory interactions
From integrating data from Snyder, Young, Kepes, and TRANSFAC

36 Determination of "Level" in Regulatory Network Hierarchy with Breadth-first Search
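
The slide names breadth-first search but the transcript does not preserve the procedure, so the following is only one plausible reading, labelled as an assumption: transcription factors that regulate no other transcription factor form the bottom level (level 1), and every other factor is placed one level above the nearest factor it regulates, i.e. at its shortest regulatory distance from the bottom. The function name, the input format and the choice to ignore autoregulation are all hypothetical.

```python
from collections import deque


def hierarchy_levels(tf_edges):
    """Assign levels by breadth-first search from the bottom up.

    `tf_edges` is an iterable of (regulator, target) pairs in which both
    members are transcription factors (assumed input format).
    Returns {tf: level}; factors with no path to the bottom level
    (for example, those only inside a regulatory loop) are left out.
    """
    regulators_of = {}
    regulates_some_tf = set()
    for regulator, target in tf_edges:
        regulators_of.setdefault(regulator, set())
        regulators_of.setdefault(target, set())
        if regulator == target:
            continue  # ignore autoregulation (an assumption)
        regulators_of[target].add(regulator)
        regulates_some_tf.add(regulator)

    bottom = [tf for tf in regulators_of if tf not in regulates_some_tf]
    level = {tf: 1 for tf in bottom}
    queue = deque(bottom)
    while queue:
        current = queue.popleft()
        for regulator in regulators_of[current]:
            if regulator not in level:
                level[regulator] = level[current] + 1
                queue.append(regulator)
    return level


# Toy hierarchy: T1 regulates M1 and M2, which regulate B1 and B2.
toy_edges = [("T1", "M1"), ("T1", "M2"), ("M1", "B1"), ("M2", "B1"), ("M2", "B2")]
toy_levels = hierarchy_levels(toy_edges)
```

Because breadth-first search visits nodes in order of distance, each factor receives the smallest level compatible with the factors it regulates.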

37 Yeast Regulatory Hierarchy: the Middle-managers Rule

38 Example of Path Through Regulatory Network

39 Yeast Network Similar in Structure to Government Hierarchy with Respect to Middle-managers

40 Yeast and E. coli Networks similar in Structure

42 Characteristics of Regulatory Hierarchy: Middle Managers are Information Flow Bottlenecks

43 Characteristics of Regulatory Hierarchy: The Paradox of Influence and Essentiality

44 Characteristics of Regulatory Hierarchy: Topmost proteins sit at center of protein interaction network
[Figure: avg. closeness vs. level]
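
The figure plots average closeness, a measure of how central a protein is in the interaction network. The sketch below uses one common definition, (number of reachable nodes) divided by (sum of shortest-path distances to them), computed by breadth-first search; the transcript does not say which variant was used, so the definition, the function name and the toy network are assumptions.

```python
from collections import deque


def closeness(adj, source):
    """Closeness of `source` in an undirected, unweighted network.

    `adj` maps each node to the set of its neighbors. Only nodes
    reachable from `source` are counted; an isolated node scores 0.0.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in adj[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Toy path network a - b - c: the middle node is the most central.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
closeness_b = closeness(path, "b")
closeness_a = closeness(path, "a")
```

Averaging this score over the proteins at each hierarchy level would give a quantity of the kind shown on the slide's axes.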

46 TopNet – an automated web tool (vers. 2: "TopNet-like Yale Network Analyzer")
[Yu et al., 2004; Yip et al. (2005); similar tools include Cytoscape.org, Ideker, Sander et al]
Normal website + Downloaded code (Java) + Web service (SOAP) with Cytoscape plugin

47 SVGA visualization, Network Mgt. (Multiple Network Support, tagging with DB)

49 Conclusions
3D Analysis of Interaction Network
– The topology of a direct physical interaction network is much less dominated by hubs than previously thought
– Several genomic features that were previously thought to be correlated with the degree are in fact related to the number of interfaces and not the degree
– Specifically, a protein's evolutionary rate appears to depend on the fraction of surface area involved in interactions rather than on the degree
– The current network growth model can only explain a part of currently known networks
Regulatory Network Hierarchies
– Middle managers dominate, sitting at info. bottlenecks
– Paradox of influence and essentiality
– Topmost proteins sit at center of interaction network

53 Acknowledgements
TopNet.GersteinLab.org
MS, MG, H Yu, P Kim, K Yip, Y Xia, A Paccanaro, J Lu, S Douglas
NIH, NSF, Keck

