Statistical Tool for Identifying Sequence Variations That Correlate with Virus Phenotypic Characteristics in the Virus Pathogen Resource (ViPR), July 22, 2013

Presentation transcript:

Meta-CATS
Statistical Tool for Identifying Sequence Variations That Correlate with Virus Phenotypic Characteristics in the Virus Pathogen Resource (ViPR)
July 22, 2013

Overview
– Overview of the Meta-CATS algorithm
– Metadata grouping
– Statistical testing
– Two similar integrated web toolkits:
  – The Virus Pathogen Resource (ViPR – viprbrc.org)
  – The Influenza Research Database (IRD – fludb.org)
– Review results from two use cases

The Meta-CATS Algorithm
A flexible web-based tool with a few basic steps:
1. Collect a set of virus strains (search the database or upload a file)
2. Group strains by a metadata attribute, or upload a spreadsheet that defines the groups
3. Perform multiple sequence alignment
4. Automatically identify residue positions where there are statistically significant differences between the groups
5. Report results

Grouping Based on Metadata
Examples of metadata that may be of interest:
– Host of isolation
– Severity of disease
– Drug resistance
– Geographical location
– Date of isolation
– Phylogenetic clade assignment
– Other taxonomic assignments (serotype, genotype, etc.)
– Or any user-defined attribute in a spreadsheet
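The grouping step can be illustrated with a minimal Python sketch. It assumes the metadata arrives as a spreadsheet exported to CSV with one row per strain; the column names (`strain`, `host`, `country`), the helper `group_by_attribute`, and the rows themselves are made up for illustration and are not the ViPR/IRD schema or data.

```python
import csv
import io
from collections import defaultdict


def group_by_attribute(records, attribute):
    """Map each value of a metadata attribute to the strains that carry it.

    Hypothetical helper: records are dicts with a 'strain' key (assumed name).
    Rows with a missing or empty value for the attribute are left out.
    """
    groups = defaultdict(list)
    for rec in records:
        value = (rec.get(attribute) or "").strip()
        if value:
            groups[value].append(rec["strain"])
    return dict(groups)


# Toy spreadsheet content, invented for this example only.
sheet = io.StringIO(
    "strain,host,country\n"
    "S1,Human,China\n"
    "S2,Civet,China\n"
    "S3,Human,Canada\n"
    "S4,,Canada\n"
)
records = list(csv.DictReader(sheet))

by_host = group_by_attribute(records, "host")
by_country = group_by_attribute(records, "country")
print(by_host)
print(by_country)
```

The same records can be regrouped by any other column, which mirrors the idea that any user-defined attribute can define the groups.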

The Meta-CATS Computation
– Perform a multiple sequence alignment of all strains.
– At each residue position (nucleotide or amino acid), perform a chi-squared test of independence between group membership and residue.
– When there are more than 2 groups, at each significant position perform a pairwise chi-squared test to determine which pairs of groups contribute to the significant result.
– Computed results can be viewed directly or downloaded as a CSV file.
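The per-position test described above can be sketched in plain Python. This is an illustration under stated assumptions, not the Meta-CATS implementation: sequences are assumed to be pre-aligned and of equal length, every group is assumed to be non-empty, gap characters are counted like any other residue, the 0.05 threshold is arbitrary, and the function names and toy sequences are invented. The p-value helper uses a simple series for the incomplete gamma function and is only meant for modest statistics; a statistics library would normally be used instead.

```python
import math
from collections import Counter
from itertools import combinations


def chi2_sf(stat, df):
    """Upper-tail chi-squared probability via the lower incomplete gamma series."""
    if stat <= 0 or df <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    term = total = 1.0 / a
    n = 1
    while term > total * 1e-16 and n < 10000:
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi2_independence(table):
    """Chi-squared test of independence on a groups-by-residues count table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df, chi2_sf(stat, df)


def column_table(groups, pos, names):
    """Count residues at one alignment column for the named groups."""
    counts = {g: Counter(seq[pos] for seq in groups[g]) for g in names}
    residues = sorted(set().union(*counts.values()))
    return [[counts[g][r] for r in residues] for g in names]


def significant_positions(groups, alpha=0.05):
    """Test every column; for more than 2 groups, also test each pair of groups."""
    names = sorted(groups)
    length = len(groups[names[0]][0])
    hits = []
    for pos in range(length):
        stat, df, p = chi2_independence(column_table(groups, pos, names))
        if df == 0 or p >= alpha:
            continue  # invariant column or no significant difference
        pairwise = {}
        if len(names) > 2:
            for a, b in combinations(names, 2):
                pairwise[(a, b)] = chi2_independence(
                    column_table(groups, pos, [a, b]))[2]
        hits.append({"position": pos + 1, "chi2": stat, "df": df,
                     "p": p, "pairwise": pairwise})
    return hits


# Toy aligned sequences, invented for illustration (not real isolates).
groups = {
    "civet": ["ACGT", "ACGT", "ACGT", "ACGA"],
    "human": ["ATGT", "ATGT", "ATGT", "ATGA"],
}
hits = significant_positions(groups)
for hit in hits:
    print(hit["position"], hit["chi2"], hit["p"])
```

With counts this small the chi-squared approximation is unreliable, so the toy data only demonstrates the mechanics. The sketch also applies no multiple-testing correction across positions.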

The ViPR / IRD Toolkits
Location of the new Meta-CATS algorithm

Workbench and Metadata Attributes

First Use Case: SARS Coronavirus
– The “Host” metadata field was used to find positional differences between strains isolated from humans and strains isolated from civets.
– The Meta-CATS algorithm identified 117 nucleotide positions that significantly differed between the civet and human isolates.
– The raw p-values ranged from 2.49x10^-2 to 4.33x
“Virus Pathogen Database and Analysis Resource (ViPR): A Comprehensive Bioinformatics Database and Analysis Resource for the Coronavirus Research Community”. Pickett et al., Viruses Nov 19;4(11)

Second Use Case: Dengue Virus
– The “Geographic Location” metadata was used to identify 61 significant differences in the polyprotein between strains of Dengue-3 virus isolated from the Eastern Hemisphere and the Western Hemisphere.
  – Further inspection of the group-specific amino acid composition found a clade of “outlier” sequences, likely due to an international transmission event.
– A separate analysis identified distinct NS1 amino acid residue variations correlating with DENV serotypes.
  – The Meta-CATS algorithm identified 19 positions where the 4 serotypes differed. At 3 of these positions, which lie within experimentally determined antibody epitopes, the p-values were less than 7.07x
“Metadata-driven Comparative Analysis Tool for Sequences (meta-CATS): an Automated Process for Identifying Significant Sequence Variations Dependent on Differences in Viral Metadata.” Pickett et al., J. of Virology (submitted)

Summary
Through the readily accessible search interface and integrated comparative genomics tools such as Meta-CATS, researchers can easily generate hypotheses that can then be tested in the lab and applied to the development of therapeutics and vaccines.
ViPR IRD

Acknowledgements
J. Craig Venter Institute: Richard H. Scheuermann, Brett E. Pickett, Brian Aevermann, Yun Zhang, Rick Stanton
SMU: Mengya Liu, Eva Sadat, Monnie McGee
Northrop Grumman Health Solutions: Edward B. Klem, Sherry He, Sam Zaremba, Sanjeev Kumar, Liwei Zhou, Wei Jen
Vecna: Christopher N. Larsen