Richard H. Scheuermann, Ph.D. November 5, 2012 Support for Systems Biology Data in IRD/ViPR - Proteomics.

Presentation transcript:


Projects with Host Factor Data

Four systems biology groups funded by NIAID, including:
– Systems Virology (Michael Katze group, Univ. Washington)
  Influenza H1N1 and H5N1 and SARS Coronavirus
  Statistical models, algorithms and software, raw and processed gene expression data, and proteomics data
– Systems Influenza (Alan Aderem group, Institute for Systems Biology/Seattle Biomed)
  Various influenza viruses
  Microarray, mass spectrometry, and lipidomics data

ViPR Driving Biological Projects
– Abraham Brass, Mass. General Hospital
  Dengue virus host factor database from RNAi screen
– Lynn Enquist / Moriah Szpara, Princeton University
  Deep sequencing and neuronal microarrays for functional genomic analysis of Herpes Simplex Virus
– Richard Kuhn, Purdue University
  Metabolomics data of Dengue virus infection of human cells and mosquitos
– Mike Diamond, Washington University
  Identification of inhibitory interferon-stimulated genes against flaviviruses and noroviruses using shRNA knockdown
  Determine the mechanism of action of individual inhibitory ISGs

Strategy for Handling “Omics” Data

“Omics” data management (MIBBI vs MIBBI-DB)
– Project metadata (1 template)
  Title, PI, abstract, publications
– Experiment metadata (~6 templates)
  Biosamples, treatments, reagents, protocols, subjects
– Primary results data
  Raw expression values
– Data processing metadata (1 template)
  Normalization and summarization methods
– Processed data
  Data matrix of fold changes and p-values
– Data interpretation metadata (1 template)
  Fold change and p-value cutoffs used
– Interpreted results (Host factor biosets)
  Interesting gene, protein and metabolite lists
Visualize biosets in context of biological pathways and networks
Statistical analysis of pathway/sub-network overrepresentation
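The step from processed data to an interpreted bioset can be illustrated with a minimal sketch. It assumes the processed data matrix holds one row per gene with a log2 fold change and a p-value; the `Row` type, the `derive_bioset` function, the gene names, and the default cutoffs are all hypothetical and are not taken from the IRD/ViPR implementation.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One row of a processed data matrix (hypothetical layout)."""
    gene: str
    log2_fold_change: float
    p_value: float


def derive_bioset(rows, min_abs_log2_fc=1.0, max_p=0.05):
    """Return gene IDs that pass both cutoffs, in input order.

    The cutoffs play the role of the "data interpretation metadata":
    they have to be recorded alongside the resulting list.
    """
    return [
        r.gene
        for r in rows
        if abs(r.log2_fold_change) >= min_abs_log2_fc and r.p_value <= max_p
    ]


# Made-up example values, for illustration only.
processed = [
    Row("GENE_A", 2.3, 0.001),   # passes both cutoffs
    Row("GENE_B", 0.4, 0.0005),  # fold change too small
    Row("GENE_C", -1.8, 0.02),   # passes both cutoffs (down-regulated)
    Row("GENE_D", 3.1, 0.2),     # p-value too large
]
bioset = derive_bioset(processed)
```

Storing the cutoffs with the list is what lets a reader of the bioset reproduce it from the processed matrix.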

Data Submission Workflows

[Workflow diagram. Labels: Systems Biology sites; Study metadata; Experiment metadata; Primary results; Analysis metadata; Processed data matrix; Free text metadata; GEO/PRIDE/PNNL/SRA/MetaboLights; ViPR/IRD/PATRIC; Host factor bioset; pointer; submission; pointer]

Metadata Submission Template Examples

Host Factor Data

8 Studies To Date

Host Factor Bioset

Transcriptomics => Proteomics

Metadata fields are largely re-usable, with some exceptions
– Exp_sample_template (protein).xls
Results data differences
– Peptide-level and protein-level
  IM005_Peptide_normalization_matrix.V2.xlsx
  IM005_Protein Normalization matrix.xlsx
– Statistical measures
  Results_matrix_ IM005_sig Protein_RM.xlsx

Metadata Field Changes

GEO GSM ID => Primary Data Archive + Primary Data Archive ID
Semi-structured Experiment Variable to Structured Experiment Variable
– Free text (1 day) => value unit pairs in separate fields (1/day; 10^4/plaque forming units)
Multiple processed data matrix files
– Concatenated IDs separated by (; |)
Reagents and protocols are different but should not require submission template changes
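The two field changes above can be sketched as follows. This is an illustration under assumptions, not the submission template's actual parsing rule: the free-text input is assumed to be a number followed by a unit, the value is kept as a string so that notation such as 10^4 is not reinterpreted, and the ID separator is assumed to be either a semicolon or a pipe. The function names and the example IDs are hypothetical.

```python
import re

# A leading number (optionally with a decimal part and a ^exponent),
# then the unit as the rest of the string.
_VALUE_UNIT = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?(?:\^[0-9]+)?)\s*(.+?)\s*$")


def split_value_unit(text):
    """Split free text such as '1 day' into a (value, unit) pair of strings."""
    match = _VALUE_UNIT.match(text)
    if match is None:
        raise ValueError(f"cannot split into value and unit: {text!r}")
    return match.group(1), match.group(2)


def split_ids(field):
    """Split a concatenated ID field on ';' or '|' and drop empty parts."""
    return [part.strip() for part in re.split(r"[;|]", field) if part.strip()]


time_point = split_value_unit("1 day")
dose = split_value_unit("10^4 plaque forming units")
matrix_ids = split_ids("ID_1; ID_2|ID_3")  # hypothetical IDs
```

Keeping value and unit in separate fields lets a database filter or sort on the value without re-parsing text at query time.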

Normalized Data

Archive at BRC (standard format?)
– Peptide normalized data
– Protein normalized data
– Results matrix of significant proteins
BRCs derive bioset lists from results matrix
– Handling different significance measures
  t-test flag, t-test p-value, g-test flag, g-test p-value, log10 ratio
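One way a bioset list might be derived from a results matrix that carries several significance measures is sketched below. The slides do not say how the flags are encoded or combined, so everything here is an assumption: a nonzero flag is treated as "called significant", a protein qualifies if either test calls it and its p-value is within the cutoff, and direction is taken from the sign of the log10 ratio. Column names, protein IDs, and values are made up.

```python
def passes(flag, p_value, max_p):
    """True if a test flagged the protein and its p-value is within the cutoff."""
    return flag != 0 and p_value is not None and p_value <= max_p


def derive_protein_biosets(rows, max_p=0.05):
    """Split significant proteins into 'up' and 'down' lists.

    Assumed rule (not from the source): significant by t-test OR g-test;
    direction taken from the sign of the log10 ratio.
    """
    biosets = {"up": [], "down": []}
    for row in rows:
        significant = (
            passes(row["t_flag"], row["t_p"], max_p)
            or passes(row["g_flag"], row["g_p"], max_p)
        )
        if not significant:
            continue
        direction = "up" if row["log10_ratio"] > 0 else "down"
        biosets[direction].append(row["protein"])
    return biosets


# Made-up rows, for illustration only.
results_matrix = [
    {"protein": "P1", "t_flag": 1, "t_p": 0.01, "g_flag": 0, "g_p": None, "log10_ratio": 0.6},
    {"protein": "P2", "t_flag": 0, "t_p": None, "g_flag": -1, "g_p": 0.003, "log10_ratio": -1.2},
    {"protein": "P3", "t_flag": 0, "t_p": 0.4, "g_flag": 0, "g_p": 0.7, "log10_ratio": 0.05},
]
biosets = derive_protein_biosets(results_matrix)
```

Whatever combination rule is chosen, it belongs in the data interpretation metadata so that the derived lists remain reproducible.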

Host Factor Bioset

On Deck

Metabolomics and lipidomics data
Integration of RNA expression, protein abundance and metabolite abundance
Pathway/network visualization and analysis

Acknowledgement

Lynn Law, U. Washington
Richard Green, U. Washington
Peter Askovich, Seattle Biomed
Brett Pickett, U.T. Southwestern/JCVI
Jyothi Noronha, U.T. Southwestern
Eva Sadat, U.T. Southwestern
Entire Systems Biology Data Dissemination Task Force, especially Jeremy Zucker
NIAID (Alison Yao and Valentina DiFrancesco)

Future Development Plans

GO enrichment
Network visualization
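The slides do not say which statistic is planned for GO enrichment. A common choice for overrepresentation analysis is the one-sided hypergeometric test, sketched here as an assumption rather than as the IRD/ViPR method. The function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, bioset_size, overlap):
    """P(X >= overlap) under the hypergeometric distribution.

    population  : number of genes in the background set
    annotated   : background genes carrying the GO term
    bioset_size : number of genes in the bioset
    overlap     : bioset genes carrying the GO term
    """
    upper = min(annotated, bioset_size)
    total = comb(population, bioset_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, bioset_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 10 background genes, 3 annotated, a bioset of 3 genes,
# all 3 of which carry the term.
p = hypergeom_enrichment_p(population=10, annotated=3, bioset_size=3, overlap=3)
```

When many GO terms are tested against the same bioset, the resulting p-values would normally need a multiple-testing correction before any term is reported as enriched.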