Identifying functional subnetworks in large-scale datasets Benno Schwikowski Institut Pasteur – Systems Biology Group


1 Identifying functional subnetworks in large-scale datasets Benno Schwikowski Institut Pasteur – Systems Biology Group http://systemsbiology.fr

2 Benno Schwikowski The three levels of this talk 1.Discovery of pathways active in HepC infection 2.Cytoscape plug-ins 3.Cytoscape platform

3 Benno Schwikowski Hepatitis C infection One person out of 30 is infected No vaccine exists In 20% of chronic infections, liver fibrosis and cirrhosis Frequently requires liver transplants

4 Benno Schwikowski Studying HepC infection mRNA changes
50% of transplant livers become re-infected with Hepatitis C
Study expression of 7000 genes in re-infected livers after transplantation
–1-24 months post-transplant
–Samples at 3-6 month intervals
28 biopsies from 11 patients
–Mixture of hepatocytes, hepatic stellate cells, Kupffer cells, various types of blood cells
Compare against pre-transplant reference pool

5 Benno Schwikowski Result of mRNA expression analysis Most genes (5968 of 7000) were significantly under- or overexpressed in one or more experiments High patient-to-patient variation

6 Benno Schwikowski Our approach 1.Construct seed network among known molecular players 2.Expand seed network to include differentially expressed genes 3.Identify putative pathways by the Active Modules approach

7 Seed network Protein-protein Protein-DNA Phosphorylation Activation Repression Covalent bond Methylation Types of interactions

8 Benno Schwikowski InteractionFetcher plug-in
Purpose
Dynamically retrieves remote information for selected nodes
–From SQL database
–Requests data via XML-RPC protocol
Currently implemented types
Protein/gene synonyms
Orthologs
Sequences (DNA, protein, DNA upstream)
–Gene, protein
Interactions/associations
Options
Cross-species queries
Ortholog information from Homologene
Inferred interactions (interologs)
Interactive links to source Web pages
100% open-source (client and server)
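To make the XML-RPC request step concrete, here is a minimal Python sketch of how a client could serialize a query for selected nodes. The method name `getInteractions`, the parameter layout, and the example species string are hypothetical; the slide does not document the plug-in's actual remote interface. Only the standard-library `xmlrpc.client` module is used.

```python
import xmlrpc.client


def build_request(method, node_ids, species):
    """Serialize an XML-RPC method call as an XML string.

    The method name and the (node_ids, species) parameter layout are
    assumptions made for illustration only.
    """
    return xmlrpc.client.dumps((list(node_ids), species), methodname=method)


# Build a request for one selected node and decode it again to inspect it.
payload = build_request("getInteractions", ["YDR216W"], "Saccharomyces cerevisiae")
params, method = xmlrpc.client.loads(payload)
```

Against a live server, the same call would normally go through `xmlrpc.client.ServerProxy` pointed at the server's URL, which is not given in the slides.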

9 Benno Schwikowski 2. Expand seed network Purpose Bring significantly up-/downregulated genes “into the picture” Approach Add interactions with differentially expressed genes (“in silico pull-down”) –Use BIND, HPRD databases –Only human-curated interactions
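The expansion step can be sketched in Python as follows. This is an illustration under stated assumptions, not the plug-in's implementation: the function name `expand_seed_network`, the representation of interactions as pairs, and the toy node names are all hypothetical.

```python
def expand_seed_network(seed_nodes, curated_interactions, diff_expressed):
    """Return (nodes, edges) after one round of "in silico pull-down".

    A curated interaction is added when one endpoint is already in the seed
    network and the other endpoint is a differentially expressed gene.
    """
    seed = set(seed_nodes)
    hits = set(diff_expressed)
    nodes = set(seed)
    edges = set()
    for a, b in curated_interactions:
        # Check both orientations, since interactions are treated as undirected.
        for inside, outside in ((a, b), (b, a)):
            if inside in seed and outside in hits:
                nodes.add(outside)
                edges.add(frozenset((a, b)))
    return nodes, edges


# Toy example: X and Y are pulled in; Z has no seed partner, Q is not regulated.
nodes, edges = expand_seed_network(
    {"A", "B"},
    [("A", "X"), ("Y", "B"), ("X", "Z"), ("A", "Q")],
    {"X", "Y", "Z"},
)
```

Restricting `curated_interactions` to human-curated records, as the slide requires, would happen before this function is called.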

10 Network after InteractionFetcher expansion

11 Benno Schwikowski Identifying putative pathways
Why clustering can be problematic
Many clustering methods are not model-based → significance of clusters is unclear
Any given cluster may not be supported by all experiments – noise problem
Clusters tend to contain unrelated genes with vaguely similar profiles

12 Benno Schwikowski The three levels of this talk 1.Discovery of pathways active in HepC infection 2.Cytoscape plug-ins 3.Cytoscape platform

13 Benno Schwikowski How can the clustering issues be addressed? The ActiveModules Plug-in
Define “up-/downregulated” on the basis of a well-defined statistical model
Also derive clusters from some of the input experiments
Use additional evidence to focus on “plausible” clusters → protein interactions

14 Benno Schwikowski Interaction networks Schwikowski, Uetz, Fields Nature Biotechnology (2000)

15 Benno Schwikowski Modular organization of interaction networks

16 Benno Schwikowski A lot of interaction data is becoming available Databases on... Protein-protein interactions Protein-DNA interactions Genetic interactions Metabolic pathways Cell signaling pathways, similarity relationships, literature-based relationships

17 Benno Schwikowski Multi-criteria detection of modules
1. Interaction network between genes/proteins
2. Differential gene/protein abundances/activities (matrix of genes × experiments)

18 Scoring a module candidate
Ideker, Ozier, Schwikowski, Siegel (2002): Bioinformatics 18, S233-240
Perturbations/conditions
Rank adjustment (binomial summation): $p_z = 1 - \Phi(z_{A(j)})$
$r_{A(j)} = \Phi^{-1}(1 - p_{A(j)})$
$m$ = total number of conditions
$j$ = size of subset of conditions
Final score
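The scoring of a module candidate A can be sketched in Python. The two transformations with the standard normal distribution come from the slide. The aggregate z-score (sum of gene z-scores divided by the square root of the module size), the binomial sum over h = j..m, and taking the maximum over j as the final score are my reading of the cited Ideker et al. (2002) paper and should be treated as assumptions. The function names and the clamping constant `EPS` are hypothetical, and the paper's calibration against randomly sampled gene sets is omitted.

```python
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1
EPS = 1e-12                # assumption: keeps inv_cdf away from exactly 0 or 1


def aggregate_z(gene_z_scores):
    """Combine the z-scores of a module's genes in one condition."""
    return sum(gene_z_scores) / sqrt(len(gene_z_scores))


def rank_adjusted_scores(condition_z_scores):
    """Return r_A(j) for j = 1..m, given one aggregate z-score per condition."""
    m = len(condition_z_scores)
    ordered = sorted(condition_z_scores, reverse=True)  # z_A(1) >= ... >= z_A(m)
    scores = []
    for j, z in enumerate(ordered, start=1):
        p_z = 1.0 - STD_NORMAL.cdf(z)
        # Probability that at least j of m conditions score above z by chance.
        p_a = sum(comb(m, h) * p_z**h * (1.0 - p_z) ** (m - h) for h in range(j, m + 1))
        p = min(max(1.0 - p_a, EPS), 1.0 - EPS)
        scores.append(STD_NORMAL.inv_cdf(p))
    return scores


def final_score(condition_z_scores):
    """Best rank-adjusted score over all subset sizes j."""
    return max(rank_adjusted_scores(condition_z_scores))
```

With a single condition the adjustment is the identity, so `rank_adjusted_scores([1.5])` returns a value very close to 1.5. With more conditions the top-ranked score is pulled down, which corrects for having picked the best of m conditions.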

19 Benno Schwikowski Pathways in Rosetta’s compendium (300 conditions)

20 Benno Schwikowski The three levels of this talk 1.Discovery of pathways active in HepC infection 2.Cytoscape plug-ins 3.Cytoscape platform

21 Benno Schwikowski Active Modules plug-in applied to HCV re-infection data Iterative application results in four significant highly overlapping subnetworks Repeat analysis only retaining “late-active” re-infection experiments –Eliminates pathways activated by transplant operation –Cutoff: 8 months

22 Which observations can we make locally? Network after InteractionFetcher expansion Bold: Differentially regulated subnetwork Red/Green: Late-active subnetwork

23 Benno Schwikowski Cytotalk plug-in
Analysis, using the Cytotalk plug-in and R, of overrepresentation of genes in Gene Ontology classes
Cytotalk enables interactive communication with
–C/C++ programs
–Java processes
–Python
–UNIX shell scripts
–R, R scripts
Can be run on same machine or any other Internet-connected machine
Can function as Cytoscape plug-in
100% open-source
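Overrepresentation of a Gene Ontology class in a subnetwork is commonly assessed with a hypergeometric tail probability. The slides do not say which test was run in R, so the sketch below is a plain-Python stand-in under that assumption; the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the class,
    n: genes in the subnetwork, k: subnetwork genes annotated to the class.
    """
    upper = min(n, K)
    # comb() returns 0 when more unannotated genes are requested than exist.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy example: both genes of a 2-gene subnetwork fall in a class of 2 out of 4.
p_value = hypergeom_upper_tail(2, 2, 2, 4)
```

A small p-value suggests the class is overrepresented in the subnetwork. When many classes are tested, a multiple-testing correction would normally follow.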

24 Benno Schwikowski The three levels of this talk 1.Discovery of pathways active in HepC infection 2.Cytoscape plug-ins 3.Cytoscape platform

25 Benno Schwikowski Some Network Visualization Tools Pajek - Slovenia Osprey - SLRI, Toronto VisANT - BU Biolayout - EBI GraphViz PowerPoint Others Cytoscape (only open-source biology)

26 Cytoscape

27 Benno Schwikowski Cytoscape Basic Concepts
Objects visualized as nodes
Relationships visualized as edges
Attributes (name, sequence, source,...)
Mapping attributes → drawing customizable through visual mapper

28 Cytoscape file formats
Sample interaction file:
YDR216W pd YIL056W
YDR216W pd YKR042W
YDR216W pd YGL096W
YDR216W pd YDR077W
[...]
Expression data:
GENE   DESC  exp0.sig  exp1.sig  exp0.sig  exp1.sig
GENE0  G0    0.0       0.0       23.2      11.5
GENE1  G1    0.0       0.0       34.6      5.2
GENE2  G2    0.0       0.0       10.0      28.0
GENE3  G3    0.0       0.0       1.64      4.77
[...]
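The sample interaction file is simple enough to read with a few lines of Python. This is a minimal sketch, not Cytoscape's own parser: the function name `parse_sif` is hypothetical, and it assumes whitespace-separated fields of the form source, interaction type, and one or more targets.

```python
def parse_sif(text):
    """Parse simple-interaction-format text into (source, type, target) triples."""
    edges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank lines and lines that name only a node
        source, interaction, *targets = fields
        edges.extend((source, interaction, target) for target in targets)
    return edges


sample = """YDR216W pd YIL056W
YDR216W pd YKR042W
YDR216W pd YGL096W
YDR216W pd YDR077W"""
edges = parse_sif(sample)
```

Each triple corresponds to one edge; here `pd` labels all four edges leaving YDR216W.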

29 Benno Schwikowski Display gene & protein expression protein interactions (physical and non-physical) protein classifications Analysis plug-in modules http://www.cytoscape.org/ Java: platform independent + web-start 100% open-source Cytoscape

30 Visual Styles Display gene expression as clear text

31 Visual Styles Map expression values to node colors using a continuous mapper
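A continuous mapper can be thought of as interpolation between two colors anchored at two attribute values. The Python sketch below illustrates the idea only and is not Cytoscape's visual mapper API; the function name, the green-to-red default colors, and the clamping behavior are assumptions.

```python
def continuous_color(value, low, high, low_rgb=(0, 255, 0), high_rgb=(255, 0, 0)):
    """Linearly interpolate an RGB color for value between low and high.

    Values outside [low, high] are clamped to the nearest endpoint color.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    t = (value - low) / (high - low)
    t = min(1.0, max(0.0, t))  # clamp to the mapped range
    return tuple(round(a + t * (b - a)) for a, b in zip(low_rgb, high_rgb))


# Map log expression ratios in [-2, 2] to node colors.
down = continuous_color(-2.0, -2.0, 2.0)
up = continuous_color(2.0, -2.0, 2.0)
```

Genes at the low end of the range receive the first color and genes at the high end the second, with intermediate ratios blended between them.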

32 Visual Styles Expression data mapped to node colors

33 Multidimensional attributes Cytoscape, pre-release plug-in Data from Ideker et al., Science (2001)

34 Layout 16 algorithms available through plug-ins Zooming, hide/show, alignment

35 yFiles Circular

37 Cytoscape Core – Differences to most other approaches Emphasis on data analysis & integration No built-in semantics (added by plug-ins) Very simple concepts Human-readable input formats Extensibility

38 Benno Schwikowski Cytoscape extensibility Core: 100% open source Java –Plug-in API –Plug-ins are independently licensed “Just need to do the biology” Template code samples Plug-in

39 Biomodules plug-in Prinz S, Avila-Campillo I, Aldridge C, Srinivasan A, Dimitrov K, Siegel AF, and Galitski T Genome Res. 2004 14: 380-390

40 Benno Schwikowski Cytoscape Plugins
Modules in Complex Networks: Iliana Avila-Campillo, Tim Galitski
Discovering Regulatory and Signaling Circuits in Molecular Interaction Networks: Trey Ideker, Owen Ozier, Benno Schwikowski, Andrew Siegel
Data Integration in Juvenile Diabetes Research: Marta Janer, Paul Shannon
A network motif sampler: David Reiss, Benno Schwikowski

41 Benno Schwikowski Cytoscape Core Features Visualize and lay out networks Display network data using visual styles Easily organize multiple networks Bird’s eye view navigation of large networks Supports SIF and GML, molecular profiling formats, node/edge attributes Functional annotation from GO + KEGG Metanode support (hierarchical groupings) Extensible through plugins (20 developed)

42 Benno Schwikowski Baliga et al. Genome Research June 2004

43 Benno Schwikowski Collaborators: HCV Institute for Systems Biology, Seattle, WA David Reiss Iliana Avila-Campillo Vesteinn Thorsson Tim Galitski

45 Collaborators: Cytoscape
ISB: Leroy Hood, Rowan Christmas
UCSD: Trey Ideker, Chris Workman
Memorial Sloan-Kettering Cancer Center: Chris Sander, Gary Bader, Ethan Cerami
Pasteur: Melissa Cline, Andrea Splendiani, Tero Aittokallio
Agilent Technologies
Unilever PLC
Long-term funding from NIH and participating institutions

46 Shannon, P., et al. (2003). Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res 13, 2498-504.

47 Benno Schwikowski Collaborators: Active Networks Trey Ideker Owen Ozier Andrew Siegel Richard Karp

49 Benno Schwikowski Levels of Biological Information DNA mRNA Protein Pathways Networks Cells Tissues Organs Individuals Populations Ecologies

