Building biological networks from diverse genomic data Chad Myers Department of Computer Science, Lewis-Sigler Institute for Integrative Genomics Princeton.

Presentation transcript:

Building biological networks from diverse genomic data Chad Myers Department of Computer Science, Lewis-Sigler Institute for Integrative Genomics Princeton University PRIME Workshop on Pathway Databases and Modeling Tools June 16, 2006

2 Motivation: building biological networks from experimental data. Explosion of functional genomic DATA → ? → KNOWLEDGE of components and inter-relationships that lead to function. Find missing pathway components; detect uncharacterized crosstalk between pathways; discover novel pathways.

3 Motivation: building biological networks from experimental data. The data are noisy: how can we harness this information without sacrificing precision?

4 Directed network discovery: involving the biologist in the search process. Previous approaches to network analysis from genomic data are largely undirected, global approaches that detect interesting network features. Incorporating expert direction can improve sensitivity and precision by using context information, and can focus on information relevant to the biologist user (allows interactivity). Figure: two-hybrid interaction network, yeast (SH3 domain), Boone lab. Previous work: Bader et al. (2003), Asthana et al. (2004), Yamanishi et al. (2004, 2005), Kato et al. (2005).

5 bioPIXIE system overview bioPIXIE: Pathway Inference from eXperimental Interaction Evidence

6 Overview. How do we integrate heterogeneous evidence? Expert-driven network discovery. Making it usable: practical visualization and other interface considerations. Does it work? (evaluation experiments and biological validation). Challenges/opportunities and future work.

7 Heterogeneous data integration. Diverse forms of data: what's a unifying framework? Data types: physical binding, genetic interaction, cellular localization, expression, sequence (TF motifs, coding, …). Variable coverage, reliability, and relevance: the integration scheme should use the information in the data when it is available, but be robust when it is missing. → Bayes net. → Map to associations of genes/proteins.

8 Bayes net for evidence integration. We infer: functional relationship. Input evidence, grouped by lab (source) and by type: microarray correlation, shared transcription factors, purified complex, affinity precipitation, two-hybrid, synthetic lethality, synthetic rescue, co-localization, … Structure: naïve Bayes (~60 nodes) (also tried TAN). CPTs: learned from GO gold standard. Result: fully connected, weighted graph of proteins.
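
The naïve Bayes combination on this slide can be sketched as follows. This is a minimal illustration, not the bioPIXIE implementation: it assumes binary evidence values, and the function name, the source names, and all probabilities in the usage note are hypothetical. Evidence sources with no data for a protein pair are skipped, which is how a naïve Bayes model stays robust to missing data.

```python
def naive_bayes_posterior(prior, likelihoods, observed):
    """Posterior P(FR = 1 | evidence) under a naive Bayes model.

    prior       -- P(FR = 1), prior probability of a functional relationship
    likelihoods -- {source: (P(e = 1 | FR = 1), P(e = 1 | FR = 0))}
                   (hypothetical CPT entries, one pair per evidence source)
    observed    -- {source: 0 or 1}; sources without data are simply absent
    """
    p_related = prior
    p_unrelated = 1.0 - prior
    for source, value in observed.items():
        p1, p0 = likelihoods[source]
        # Multiply in the likelihood of the observed value under each hypothesis.
        p_related *= p1 if value else 1.0 - p1
        p_unrelated *= p0 if value else 1.0 - p0
    # Normalize so the two hypotheses sum to one.
    return p_related / (p_related + p_unrelated)
```

For example, with a prior of 0.5 and a single source whose (hypothetical) likelihoods are 0.8 and 0.2, observing that source gives a posterior of 0.8; with no observations the posterior equals the prior.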

9 Overview. How do we integrate heterogeneous evidence? Expert-driven network discovery. Making it usable: practical visualization and other interface considerations. Does it work? (evaluation experiments and biological validation). Challenges/opportunities and future work.

10 Expert-driven network discovery. Local search in the PPI network centered at the query. Which proteins should we extract as a single, functionally coherent group? Should consider: confidence in links and the topology surrounding the query group.

11 Extracting relevant proteins. Basic idea: compute the expected linkage to the query set. $e_{ij} = P(\text{protein } i \text{ is functionally related to protein } j \mid \text{evidence})$. $X_{ij}$: binary random variable with $P(X_{ij} = 1) = e_{ij}$. $S_Q(p_i)$: number of links from protein $i$ to the query set $Q$. Find proteins that maximize the expected value of $S_Q(p_i)$. What about indirect links to the query set?
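
One way to read the direct-link score: if the number of links is the sum of the binary link variables over the query set, then by linearity of expectation the expected linkage of a protein is the sum of its edge probabilities to the query proteins. The sketch below ranks candidates on that basis. The dict-of-dicts graph layout and the function names are assumptions made for illustration.

```python
def expected_linkage(edge_prob, query):
    """Return {protein: expected number of links to the query set}.

    edge_prob -- {protein: {other_protein: probability of a functional link}}
    query     -- set of query proteins
    """
    scores = {}
    for protein, neighbors in edge_prob.items():
        if protein in query:
            continue  # only score candidates outside the query set
        # E[S_Q(p_i)] = sum of e_ij over query proteins j (linearity of expectation)
        scores[protein] = sum(neighbors.get(q, 0.0) for q in query)
    return scores


def top_candidates(edge_prob, query, k):
    """Return the k non-query proteins with the highest expected linkage."""
    scores = expected_linkage(edge_prob, query)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

This score only sees direct links, which is the limitation the closing question on the slide raises.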

12 Graph search: handling indirect links. Solution: an iterative expanding search in which indirect links to the query through high-confidence neighbors are counted.
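
The slide does not give the details of the expanding search, so the following is only one plausible sketch of the idea. In each round, candidates are scored by expected linkage to the current group, the best ones above a threshold are added, and later rounds then count links to those added proteins as well. The parameters `rounds`, `per_round`, and `min_score` are hypothetical, as is the dict-of-dicts graph layout.

```python
def expanding_search(edge_prob, query, rounds=3, per_round=1, min_score=0.8):
    """Grow a protein group outward from the query set, round by round.

    edge_prob -- {protein: {other_protein: probability of a functional link}}
    query     -- set of query proteins
    Returns the added proteins in the order they were selected.
    """
    group = set(query)
    added = []
    for _ in range(rounds):
        # Score every outside protein by its expected linkage to the current group.
        scores = {
            protein: sum(neighbors.get(member, 0.0) for member in group)
            for protein, neighbors in edge_prob.items()
            if protein not in group
        }
        best = sorted(scores, key=scores.get, reverse=True)[:per_round]
        best = [protein for protein in best if scores[protein] >= min_score]
        if not best:
            break  # nothing is confidently linked to the group any more
        added.extend(best)
        group.update(best)  # links to these proteins count in the next round
    return added
```

A protein that is weakly linked to the query but strongly linked to a high-confidence neighbor of the query can be picked up in a later round, which a single pass over direct links would miss.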

13 Overview. How do we integrate heterogeneous evidence? Expert-driven network discovery. Making it usable: practical visualization and other interface considerations. Does it work? (evaluation experiments and biological validation). Challenges/opportunities and future work.

14 Making bioPIXIE usable. Guiding principles: accessibility (users can access the most recent data with little effort); simplicity vs. flexibility; drill-down (details, e.g. supporting experimental data, hidden until requested); browsable.

15 Graph visualization

16 Overview. How do we integrate heterogeneous evidence? Expert-driven network discovery. Making it usable: practical visualization and other interface considerations. Does it work? (evaluation experiments and biological validation). Challenges/opportunities and future work.

17 Evaluation experiments. Recovering known network components: how much does integration help? Results are averaged over 31 pathways, processes, and complexes (KEGG, GO, MIPS). Protocol: use 10 random proteins as the query set and try to recover the remaining members.
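
The hold-out protocol on this slide can be sketched as below: sample 10 members of a known pathway as the query, rank all other proteins with whatever search method is under test, and trace precision against recall over the held-out members. The function names and the fixed seed are assumptions for illustration; the ranking itself is left to the caller.

```python
import random


def holdout_split(members, n_query=10, seed=0):
    """Split a known pathway into a random query set and held-out members."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    query = set(rng.sample(sorted(members), n_query))
    return query, set(members) - query


def precision_recall(ranked, held_out):
    """Return (recall, precision) after each position of a ranked protein list."""
    points = []
    hits = 0
    for rank, protein in enumerate(ranked, start=1):
        if protein in held_out:
            hits += 1
        points.append((hits / len(held_out), hits / rank))
    return points
```

Averaging such curves over many pathways and random query draws gives the kind of summary the slide describes.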

18 Evaluation experiments (2). Recovering known network components: do naïve methods of integration/search work just as well? Results are averaged over 31 pathways, processes, and complexes (KEGG, GO, MIPS). Protocol: use 10 random proteins as the query set and try to recover the remaining members.

19 Biological validation: finding new components. Using bioPIXIE to characterize unknown genes: the uncharacterized S. cerevisiae gene YPL077C is predicted to be involved in chromosome segregation.

20 Biological validation: finding new components. P-value based on blind counting: $1.98 \times 10^{-7}$ (Fisher's exact test).
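
For readers unfamiliar with the test named on this slide, here is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table of counts. The slide does not say which counts were tabulated or whether the test was one- or two-sided, so the table layout and the one-sided choice here are assumptions, and no attempt is made to reproduce the reported p-value.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X follows the hypergeometric distribution
    obtained by holding the row and column totals of the table fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)  # ways to choose the first column
    upper = min(row1, col1)          # largest value the top-left cell can take
    tail = sum(
        comb(row1, k) * comb(row2, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / total
```

With a hypothetical table such as mutant versus wild-type rows and abnormal versus normal columns, a small return value indicates that the excess in the top-left cell is unlikely under independence.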

21 (Helmut Pospiech) Biological validation: novel links between pathways. DNA replication initiation: Cdc7 is the "switch" that starts replication (activated by Dbf4); it is linked to the Hsp90 complex by our method. Hsp90 (yeast: hsc82, hsp82): cytosolic molecular chaperone that participates in the folding of several signaling kinases and hormone receptors.

22 Genetic analysis of DNA replication-Hsp90 link. YKO Dbf4 vs. hsp82, hsc82 and co-chaperones: cpr7, sti1, cdc37. [Figure panels: $10^5$ cells at RT, 30°C, and 37°C; strains: wt, dbf4Δ, hsp82Δ, dbf4Δhsp82Δ, hsc82Δ, dbf4Δhsc82Δ, cpr7Δ, dbf4Δcpr7Δ.]

23 Overview. How do we integrate heterogeneous evidence? Expert-driven network discovery. Making it usable: practical visualization and other interface considerations. Does it work? (evaluation experiments and biological validation). Challenges/opportunities and future work.

24 Practical challenges/opportunities. Visualizing complex networks of interactions in a meaningful way: how does it scale with added data? Easy user navigation around the network. Data-centric vs. established knowledge views: how do we overlay current knowledge of pathways with predictions derived from experimental data?

25 Future work. An observation: the more specific we can be about the end goal, the better the accuracy of our prediction.

26 Future work. Exploiting relevance and reliability variation: context-specific integration.

27 Summary. bioPIXIE can facilitate precise network discovery from experimental data using Bayesian data integration, expert-directed search, and a web-based dynamic interface. bioPIXIE is an effective tool for browsing genomic evidence and generating specific, testable hypotheses.

28 Acknowledgements. Olga Troyanskaya, Drew Robson, Adam Wible, Kara Dolinski, Camelia Chiriac, Matt Hibbs, Curtis Huttenhower, David Botstein Lab, Leonid Kruglyak Lab. Thank you!

29 Evaluation experiments (3): what about noise in the query set? [Plot: AUPRC vs. number of random proteins out of 20 total query proteins.]

30 Evaluation experiments (4). Comparing with existing approaches. SEEDY: proteins ranked by max. direct connection to query. Complexpander:
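
Going only by the one-line description above, a SEEDY-style baseline can be sketched as ranking each candidate by its single strongest direct link to any query protein. This is an illustration of that description, not the published SEEDY code; the function name and the dict-of-dicts graph layout are assumptions.

```python
def rank_by_max_direct_link(edge_prob, query):
    """Rank non-query proteins by their strongest direct link to the query set.

    edge_prob -- {protein: {other_protein: link confidence}}
    query     -- set of query proteins
    """
    scores = {
        protein: max((neighbors.get(q, 0.0) for q in query), default=0.0)
        for protein, neighbors in edge_prob.items()
        if protein not in query
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Unlike a score that sums over the whole query set, the maximum ignores how many query proteins a candidate is linked to.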

31 Hydroxyurea sensitivity (replication inhibitor). [Figure panels: $10^6$ cells at 30°C and 37°C with HU at 0 mM, 50 mM, and 100 mM; strains: wt, cpr7Δ, sti1Δ, dbf4Δ, hsp82Δ, hsc82Δ, dbf4Δhsc82Δ, dbf4Δsti1Δ, dbf4Δcpr7Δ, dbf4Δhsp82Δ.]

32 Is this interaction specific to DNA replication? MMS sensitivity (induces DNA damage): MMS treatment has no apparent effect at RT, 30°C or 37°C (shown). [Figure panel: $10^6$ cells at 37°C; strains: wt, cpr7Δ, sti1Δ, dbf4Δ, hsp82Δ, hsc82Δ, dbf4Δhsc82Δ, dbf4Δsti1Δ, dbf4Δcpr7Δ, dbf4Δhsp82Δ.] Conclusions: the Hsp90 complex plays a specific role in DNA replication; Hsc82 and hsp82 do not have identical function; possible new link between signaling cascades, stress, and DNA replication; our system generates specific, testable hypotheses.
