Published by Leon Cummings. Modified over 9 years ago.
1
Making sense of large amounts of molecular data
Jason E. McDermott, PhD
Research Scientist, Computational Biology and Bioinformatics Group
Pacific Northwest National Laboratory
2
How do components of biological systems interact to produce behavior?
Figure labels: Proteins; Nucleic Acids; Macromolecular Complex
3
Molecular pathways
Figure panels: mTOR pathway; EGFR pathway (source: http://biocarta.com)
4
A Mammoth Problem
5
Scientific Method Overview
Hypothesis -> Experimental design -> Data generation -> Analysis/modeling -> Predictions -> Interpretation -> Hypothesis
6
Circumstantial Evidence
Traditional experimental approach:
- Cigarette butt on street
- Neighbor was eyewitness to crime
- Missing jewelry from the house
- Fingerprints on doorknob
High-throughput experimental approach:
- Cigarette sales in city
- Testimony from everyone on the block
- All diamonds sold over last year in 10 mile radius
- Fingerprints on every surface in the house
7
Problem
- New methods generating mountains of data
- Very complex systems
- Traditional methods fail in some cases
- Progress will be made through better use of this data
Objectives
- Formulate hypotheses for further investigation
- Identify gene/protein ‘targets’
- Identify pathways that drive disease
- Develop systems-level biological understanding
8
What is a ‘target’?
- ‘Critical nodes’
- Regulators of important processes
- Outcome of modeling (a prediction) that can be used to formulate a hypothesis
What are targets used for?
- Mechanistic understanding of disease processes
- Potential biomarkers of disease
- Potential therapeutic treatments: drug development
9
Examples I’ll be talking about
- Bacterial virulence (Salmonella Typhimurium)
- Viral pathogenesis (avian flu and SARS)
- Ovarian cancer
Approaches I’ll be talking about
- Machine learning
- Biological networks
- Data integration
10
Salmonella Typhimurium: pathogen and host
Figure (pathogen-directed and host-directed processes).
Process labels: Bacterial detection; Host defense; Environmental response; Environmental modulation; Virulence activation; Bacterial survival; Invasion
Component labels: LPS, TLR4, MEK, ERK, Egr-1, iNOS, NRAMP, pH, Mg2+, Fe2+, ROS/RNS, SPI1+, SPI2-T3S, SCV, ssrA/B, phoP/Q, ompR/envZ, ydgT, effectors (e.g. SifA, SlrP, SseJ, SspH2)
11
Type-III secretion system secreted effectors
Karou Geddes
Known effectors: SlrP, SspH2, SseI, SseJ, SifA, SifB, SpvB, SseK-1, SopD-1, InvJ, SipC
+25 other known effectors
+??? other unknown effectors
Image source: http://en.wikipedia.org/
12
Overview of the SVM-based Identification and Evaluation of Virulence Effectors (SIEVE) Method
13
SVM-based Discrimination
Figure labels: axes D1 and D2; classes Positive and Negative
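The discrimination idea on this slide can be illustrated with a minimal sketch: a toy linear SVM trained by subgradient descent on the hinge loss, separating positive from negative examples in two dimensions. This is not the SIEVE implementation. The two features (standing in for D1 and D2), the data points, and the parameters `lam`, `lr`, and `epochs` are all hypothetical.

```python
# Toy linear SVM (hinge loss + L2 penalty) trained by subgradient descent.
# Assumption: two made-up features per example; not the SIEVE feature set.

def score(w, b, x):
    """Signed distance-like score of point x for the hyperplane (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Return (w, b) after repeated passes over the training data."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * score(w, b, x)
            # The L2 penalty shrinks the weights slightly at every step.
            w = [wi * (1 - lr * lam) for wi in w]
            # The hinge loss only contributes when the margin is violated.
            if margin < 1:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Classify x as +1 (positive) or -1 (negative)."""
    return 1 if score(w, b, x) >= 0 else -1

# Hypothetical training set: +1 = known effector, -1 = non-effector.
points = [(2.0, 2.0), (3.0, 3.0), (2.5, 3.0), (3.0, 2.0),
          (0.0, 0.0), (-1.0, 0.0), (0.0, -1.0), (-1.0, -1.0)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
new_positive = predict(w, b, (2.8, 2.6))
new_negative = predict(w, b, (-0.5, -0.2))
```

In practice a library SVM with a kernel and cross-validation would replace this hand-rolled trainer; the sketch only shows how a margin-based classifier splits the two classes.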
14
SIEVE Validation Using CyaA Fusions
McDermott, et al. 2011. Infection and Immunity. 79(1):23-32
Niemann, et al. 2011. Infection and Immunity. 79(1):33-43
15
Biological Networks
Types of networks
- Regulatory networks
- Protein-protein interaction networks
- Biochemical reaction networks
- Association networks
Network
- Node = gene/protein or other component
- Edge = inferred relationship between components
McDermott JE, et al. 2010. Disease Markers, 28(4):253-66.
16
Merging disparate observations of a system to produce a single, more informative view
Data types: SNVs, CNVs, mRNA, methylation, protein, phosphorylation, miRNA, metabolome
Analysis labels: Genome Comparison; Pathway enrichment; LEAP; Network analysis
17
Can we infer a relationship between two genes or proteins based on their expression profiles over a large number of different conditions?
Figure labels: network inference method; expression matrix (gene x conditions); inferred network nodes A, B, C
Faith, J., et al. 2007. “Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles.” PLoS Biology 5:e8
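As a rough illustration of inferring edges from expression profiles, the sketch below links two genes when the absolute Pearson correlation of their profiles across conditions exceeds a threshold. This is a deliberately simpler stand-in, not the method of Faith et al. (which is based on mutual information). The gene names, expression values, and the 0.9 cutoff are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def infer_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges

# Hypothetical expression values: one list entry per experimental condition.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises and falls with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],  # no clear relationship
}

edges = infer_edges(profiles)
for g1, g2, r in edges:
    print(f"{g1} -- {g2} (r = {r:.3f})")
```

Edges found this way are associations only; they say nothing about direction or causality.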
18
What are networks useful for?
Networks can be used for:
- Pretty figures
- Hypothesis generation
- Functional modules and their organization
- Topological identification of target critical nodes
- Predicting future states of the network
Networks are NOT useful for:
- Final mechanistic insight
- Fine distinction of types of interactions between components
- Causality
19
Hubs
- High centrality, highly connected
- Exert regulatory influences
- Vulnerable
Bottlenecks
- High betweenness
- Regulate information flow within network
- Removal could partition network
Yu H et al. PLoS Comp Biol 2007, 3(4):e59
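The difference between hubs and bottlenecks can be made concrete with a small sketch that computes degree and betweenness centrality (Brandes' algorithm) on an unweighted, undirected graph. The toy graph is hypothetical: two fully connected groups of four nodes joined only through node `x`.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each node to a list of its neighbors.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2 for v, value in bc.items()}

def build_graph(edge_list):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

# Hypothetical network: clique {a,b,c,d}, clique {e,f,g,i}, bridge node x.
edge_list = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
             ("c", "d"), ("e", "f"), ("e", "g"), ("e", "i"), ("f", "g"),
             ("f", "i"), ("g", "i"), ("d", "x"), ("x", "e")]
adj = build_graph(edge_list)

degree = {v: len(neighbors) for v, neighbors in adj.items()}
bc = betweenness(adj)
top_bottleneck = max(bc, key=bc.get)
top_hubs = sorted(v for v in degree if degree[v] == max(degree.values()))
```

Here `x` has the lowest degree in the graph yet the highest betweenness, because every shortest path between the two groups runs through it; removing it would split the network in two. Nodes `d` and `e` have the highest degree.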
20
Bottlenecks in Salmonella are essential for virulence
McDermott J, et al. 2009. J. Comp. Bio. 16(2):169-180
21
Discovery of a novel class of effectors by integrating transcriptomic and proteomic networks
22
Respiratory virus pathogenesis
What are the causes of pathogenesis in respiratory viruses?
- Goal: Identify and prioritize potential mediators of pathogenesis that are common and unique to influenza and SARS
- Goal: Identify and prioritize potential mediators of high-pathogenicity viral infection
Approach:
- Mouse models of infection
- Transcriptomics
- Network-based approach
- Topological network analysis to define targets
- Validation studies
23
SARS-CoV-infected Wild type Mouse Inferred Network
Figure labels: Ido1/Tnfrsf1b Module; Kepi Module
24
Hypotheses for Validation
KO Mouse Infection
Phenotype: Survival; Death; Negative
Network: Altered; Negative
25
Predicted targets abrogate influenza pathogenesis
Tnfrsf1b (a.k.a. Tnfr2)
- Predicted common regulator for influenza and SARS pathogenesis
- Tnf binding
- Negatively regulates TNFR1 signaling, which is proinflammatory
- Promotes endothelial cell activation/migration
- Activation and proliferation of immune cells
Figure panels: H5N1 infection; SARS infection
27
Biological Drivers in Ovarian Cancer
- What genomic characteristics of ovarian cancer are executed at the protein level?
- Can protein expression be used to identify the most important genomic changes?
- How can we improve the survival of women with ovarian cancer?
- Can proteomics provide insight into the biological processes associated with poor survival?
- Can we use a pathway-based approach to suggest novel therapeutic targets?
28
Proteomics
- Chemoresistance in ovarian and breast cancer
- Tumor samples from The Cancer Genome Atlas
- Depth of genomic characterization
- Many tumors
- Proteomics and phosphoproteomics characterization of these tumors
- Pathway/network analysis to reveal patterns and biomarkers
- Integrate data into single view of the system
29
Clustering of Proteins and Phosphoproteins
Figure panels: Proteins; Phosphoproteins
Annotation labels: iTRAQ Batch; Proteomic Subtypes; Transcriptomic Subtype
Scale: Log2 abundance relative to universal reference pool
30
A Subset of Proteins and Phosphopeptides Correlate with Patient Survival
Linear regression of abundance versus days-to-death suggests possible correlations with patient survival
Figure panels: Protein Abundance; Phosphorylation (normalized to abundance)
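The regression described on this slide can be sketched as an ordinary least-squares fit of abundance against days-to-death for a single protein. The values below are made up for illustration, and the slide does not say how the actual analysis handled censored patients, normalization, or multiple testing.

```python
from math import sqrt

def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical data for one protein: one entry per patient.
days_to_death = [200, 400, 600, 800, 1000]
log2_abundance = [1.0, 0.8, 0.5, 0.3, 0.1]

slope, intercept, r = linear_fit(days_to_death, log2_abundance)
print(f"slope = {slope:.5f} per day, r = {r:.3f}")
```

A negative slope, as in this toy example, would mean that higher abundance goes with shorter survival.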
31
PDGFRB Pathway
Figure legend: Correlated with short survival; Correlated with long survival; Weak correlation; Not observed
Data layers: mRNA abundance; protein abundance; phosphorylation
32
Integrated Co-abundance Network for Ovarian Cancer
Module 1 (short survival)
- AP-1 pathway
- NFAT TF pathway
Module 2 (long survival)
- CD8 T cell receptor downstream pathway
- Il12-2 pathway
- Il12-STAT4 pathway
Figure legend: Correlated with short survival; Correlated with long survival; node types: Protein, Phosphorylated protein, mRNA
33
Survival Analysis from Network Targets
Kaplan-Meier plots from integrated CNV, mRNA expression, and mutations (axes: % survival vs. months survival)
- IGKV1-5, LAX1, AMPD1, IGHM, SLAMF7: P-value 0.007
- ATF3, DUSP1, FOSB, ZFP36: P-value 0.005
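For readers unfamiliar with Kaplan-Meier plots, the following minimal sketch computes the survival curve for one group of patients. The follow-up times and event flags are hypothetical, and the P-values on the slide would come from a separate group comparison (for example a log-rank test) that is not shown here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times:  follow-up time per patient (e.g. months)
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical cohort of six patients.
months = [6, 12, 12, 20, 30, 45]
died = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(months, died)
for t, s in curve:
    print(f"month {t}: {100 * s:.1f}% survival")
```

Censored patients (flag 0) do not cause a drop in the curve, but they shrink the number at risk for later time points.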
34
Conclusions
Several effective ways of big data integration:
- Machine learning approaches
- Biological network representation
- Data integration
Understanding of disease requires system-level views
Relatively simple approaches can yield novel insight
Combining different views of a system can improve insight
Data analysis and modeling is a starting point, not an end point
35
Acknowledgements
- SysBEP (http://www.sysbep.org): NIAID/NIH Y1-AI-8401; PI: Josh Adkins, PNNL
- Systems Virology (http://www.systemsvirology.org): NIAID/NIH HHSN272200800060C; PI: Michael Katze, UW
- Clinical Proteomics Tumor Analysis Consortium: NCI/NIH 1U24CA160019; PIs: Richard Smith, PNNL; Karin Rodland, PNNL
- Many, many people in these and other projects who helped with this work and made it possible
36
About Me
Email: Jason.McDermott@pnnl.gov
About: http://www.jasonya.com/wp/about/
Twitter: @BioDataGanache
Blog: The Mad Scientist’s Confectioner’s Club, http://www.jasonya.com/wp/