Statistical Tool for Identifying Sequence Variations that Correlate with Virus Phenotypic Characteristics in the Virus Pathogen Resource (ViPR)


Brett E. Pickett¹, Douglas S. Greer¹, Yun Zhang¹, Liwei Zhou², Sanjeev Kumar², Sam Zaremba², Chris Larsen³, Edward B. Klem², Richard H. Scheuermann¹
¹ J. Craig Venter Institute, San Diego, CA; ² Northrop Grumman Health Solutions, Rockville, MD; ³ Vecna Technologies, Greenbelt, MD

Figure 2: Screenshots of the Ortholog Group Component. Users can search for orthologs using various criteria (left) and then browse the results according to ortholog group (right). Of the 493 gD protein orthologs predicted by ViPR, 39 (HHV-1) and 25 (HHV-2) non-redundant sequences were included in this analysis.

Figure 6: 3D Protein Structure Viewer in ViPR. A display of a 3D protein structure for HHV-1 glycoprotein D complexed with Nectin-1. Residue 48 (cyan) and an epitope comprising residues 77-87 (green) are highlighted (PDB ID: 3U82).

ViPR can assist in various comparative genomics analyses. As an example use case, we identified 2 significant sequence variations that:
- Have diverged through speciation between HHV-1 and HHV-2
- Overlap with known B-cell epitopes
- Could vary in response to external pressure(s) while retaining the ability to bind and enter host cells following speciation

In conclusion, the ViPR resource combines a powerful database with integrated bioinformatics tools to perform computational analyses and assist in hypothesis generation.

We would like to thank the primary data providers for the data that was used throughout this study. We also recognize the scientific and technical personnel responsible for supporting and developing ViPR, which has been wholly supported with federal funds from the NIH/NIAID (N01AI2008038 and N01AI40041 to R.H.S.).

[1] Pickett, B.E., et al. (2012) ViPR: an open bioinformatics database and analysis resource for virology research. Nucl. Acids Res. 40(D1): D593-D598.

Introduction
The uniqueness of ViPR lies in:
- integrating data from various sources
- capturing unique data on the host response to virus infection
- combining necessary tools to perform analytical workflows
- allowing data sharing and storage with collaborators

Figure 1: A screenshot of the ViPR homepage. The ViPR homepage is the portal used to access the various types of data and advanced functionality for any supported virus family.

The Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org), sponsored by the National Institute of Allergy and Infectious Diseases, serves as a single publicly accessible repository of integrated datasets and analysis tools for 14 different virus families, including Herpesviridae. ViPR supports wet-bench virology research focusing on the development of diagnostics, prophylactics, vaccines, and treatments for these pathogens [1]. The usefulness of the ViPR system can be demonstrated with a scientific use case. Here we examine the sequence variation within the glycoprotein D (gD) protein of Human Herpesvirus-1 (HHV-1) and Human Herpesvirus-2 (HHV-2).
ViPR Supports 14 Virus Families

ViPR Integrates Data from Many Sources
- GenBank sequence records, gene annotations, and strain metadata
- Protein Data Bank (PDB) 3D protein structures
- Immune epitopes from the Immune Epitope Database (IEDB)
- Clinical data
- Host factor data generated from the NIAID Systems Biology projects and the ViPR-funded Driving Biological Projects
- UniProtKB protein annotations
- Gene Ontology (GO) classifications
- Additional data derived from computational algorithms

ViPR Provides Analysis and Visualization Tools
- Multiple Sequence Alignment
- Phylogenetic Tree Construction
- Sequence Polymorphism Analysis
- Metadata-driven Comparative Genomics Statistical Analysis
- Genome Annotator
- GBrowse Genome Viewer
- Sequence Format Conversion
- BLAST Sequence Similarity Search
- 3D Protein Structure Visualization
- Sequence Feature Variant Types
- Ortholog Group Assignments

ViPR enables you to store and share data and results through the ViPR Workbench.

Figure 4: Alignment of gD Amino Acid Sequences. HHV-1 (white) and HHV-2 (gray) gD sequences show a high degree of divergence towards the N-terminus of the protein. Blue arrows highlight a subset of significant positions.

[Poster panel titles: Phylogenetic Tree; 3D Protein Structure Viewer; Summary; Acknowledgements; References]

Protein Ortholog Search
ViPR groups viral proteins together based on their predicted orthology within a virus taxon to facilitate gene/protein search, gene function inference, and virus evolution research. These orthologous groups can then be queried intuitively. A search for orthologs of the US6 gene, which codes for glycoprotein D (gD), was performed. Non-redundant HHV-1 and HHV-2 sequences were selected for more in-depth analysis.

Metadata-driven Comparative Genomics Statistical Analysis
Figure 5: Results from Metadata-driven Comparative Analysis Tool for Sequences (Meta-CATS).
Shows abridged output using Meta-CATS to compare HHV-1 and HHV-2, with residues located within experimentally determined positive B-cell epitopes (underlined) found in ViPR.

Position  Chi-square value  P-value   Degrees of freedom  Residue diversity (Group 1)  Residue diversity (Group 2)
3         59.864            1.02E-14  1                   39 G                         25 R
17 *      63.993            1.27E-14  2                   36 I, 3 L                    25 A
32        59.864            1.02E-14  1                   39 A                         25 P
46        59.864            1.02E-14  1                   39 D                         25 N
354       32.708            1.07E-08  1                   39 A                         8 A, 17 V

ViPR uses an automated pipeline to identify sequence variations that significantly differ between groups of strains.

Multiple Sequence Alignment
A multiple sequence alignment (MSA) can be calculated directly from search results, a working set, or custom uploaded sequences.

Figure 3: A Phylogenetic Tree Reconstruction of HHV-1 and HHV-2 sequences. The distance-based FastME algorithm, as implemented in ViPR, was used to generate a phylogenetic tree of all 64 HHV-1 (red) and HHV-2 (blue) amino acid sequences.

ViPR can generate phylogenetic trees from search results, a multiple sequence alignment, a working set, or custom sequences via upload.

ViPR provides multiple data types for viewing on a 3D structure.

Supported virus families: Arenaviridae, Bunyaviridae, Caliciviridae, Coronaviridae, Filoviridae, Flaviviridae, Hepeviridae, Herpesviridae, Paramyxoviridae, Picornaviridae, Poxviridae, Reoviridae, Rhabdoviridae, Togaviridae
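
The per-position statistics in Figure 5 can be illustrated with a small, self-contained sketch. It assumes, and the poster does not confirm, that each alignment column is tested with a Pearson chi-square test of independence between group (HHV-1 vs HHV-2) and residue, with an optional Yates continuity correction for 2x2 tables. The function names `chi2_sf` and `column_test` and the toy two-column sequences are hypothetical; the toy data only mimic the residue counts reported for positions 3 and 17.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))   # closed form for df = 1
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def column_test(group1, group2, pos, yates=False):
    """Chi-square test of residue-by-group independence at 1-based column pos.

    group1 and group2 are lists of aligned sequences of equal length.
    Returns (statistic, degrees_of_freedom, p_value), or None when the
    column is invariant and no test is possible.
    """
    counts = [Counter(seq[pos - 1] for seq in grp) for grp in (group1, group2)]
    residues = sorted(set(counts[0]) | set(counts[1]))
    if len(residues) < 2:
        return None
    table = [[c[r] for r in residues] for c in counts]
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    df = (len(table) - 1) * (len(residues) - 1)
    # Assumption: continuity correction only for 2x2 tables, and only on request.
    correction = 0.5 if (yates and df == 1) else 0.0
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            diff = max(abs(observed - expected) - correction, 0.0)
            stat += diff * diff / expected
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Toy alignment: column 1 mimics position 3 (39 G vs 25 R),
    # column 2 mimics position 17 (36 I and 3 L vs 25 A).
    hhv1 = ["GI"] * 36 + ["GL"] * 3
    hhv2 = ["RA"] * 25
    print(column_test(hhv1, hhv2, 1, yates=True))
    print(column_test(hhv1, hhv2, 2))
```

With these toy inputs, the second column gives a statistic of about 64 with 2 degrees of freedom, and the first column gives about 59.87 with the continuity correction, both close to the Figure 5 values for positions 17 and 3. Whether Meta-CATS applies such a correction is an assumption.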

