Operated by Los Alamos National Security, LLC for NNSA Bioscience Discovering virulence genes present in novel strains and metagenomes Chris Stubben IC.

1 Operated by Los Alamos National Security, LLC for NNSA. Bioscience. Discovering virulence genes present in novel strains and metagenomes. Chris Stubben, IC postdoc, B-7

3 Overview. Review current functional classification systems. Discuss the Virulence Factor Ontology. Identify virulence genes in novel strains and metagenomes.

4 Functional classification systems: EC numbers for enzymes (1956); Swiss-Prot keywords (1986); E. coli gene functions, M. Riley (1993); TIGR role categories (1995); Gene Ontology (1998). Figure label: gene function

5 What functions are related to virulence? Some systems have a few terms – Swiss-Prot keywords = virulence, toxin, antibiotic resistance – TIGR roles = pathogenesis, toxin production and resistance. Gene Ontology (GO) also has pathogenesis, resistance to antibiotics, plus many more. Figure: GO terms related to the enzymatic activity of toxins

6 Gene Ontology (GO). 25,688 terms in three structured controlled vocabularies (ontologies) – 15,098 biological processes – 2,186 cellular components – 8,404 molecular functions. Standard for eukaryotic gene annotation. Increasingly used for prokaryotes – TIGR (2002) – Plant pathogens by PAMGO at VBI (2005) – Human pathogens at 8 BRCs (2006)

7 Bioinformatics Resource Centers (BRC). NIAID-funded, $100 million effort to create eight bioinformatics centers for human pathogens. Goal is to provide easy access to genomic data from multiple strains, like eukaryotic model organism databases. Figure label: BRCs = ?

8 Example: Toxin annotation in GO. Step 1, assign GO terms, maybe – activation of Rho GTPase activity – N-terminal peptidyl-glutamine deamination – actin cytoskeleton reorganization – stress fiber formation

9 Step 2, add references and evidence codes. Diagram labels: Virulence Protein; Experimental; Sequence similarity; Genomic context; Computational; Function. Evidence types and codes: knockout mutants (IMP); overexpression phenotypes (IDA); genetic interactions (IGI); microarrays (IEP or RCA); BLAST alignments (ISA); orthologous proteins (ISO); hidden Markov models of protein families or domains (ISM); phylogenetic profiles, conserved neighborhoods, gene fusion, shared regulatory sites, etc. (IGC)
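The evidence codes on this slide can be used to separate experimentally supported annotations from computationally inferred ones. The following is a minimal sketch, assuming annotations are available as (gene, GO term, evidence code) tuples; the function names and that tuple layout are hypothetical and not taken from the presentation.

```python
# Evidence codes named on the slide, grouped by how the annotation was produced.
# IMP, IDA, IGI and IEP are experimental codes; RCA, ISA, ISO, ISM and IGC
# are computational-analysis codes.
EXPERIMENTAL = {"IMP", "IDA", "IGI", "IEP"}
COMPUTATIONAL = {"RCA", "ISA", "ISO", "ISM", "IGC"}


def evidence_class(code):
    """Classify one evidence code as experimental, computational, or unknown."""
    code = code.upper()
    if code in EXPERIMENTAL:
        return "experimental"
    if code in COMPUTATIONAL:
        return "computational"
    return "unknown"


def split_annotations(annotations):
    """Group (gene, go_term, evidence_code) tuples by evidence class."""
    groups = {"experimental": [], "computational": [], "unknown": []}
    for gene, go_term, code in annotations:
        groups[evidence_class(code)].append((gene, go_term, code))
    return groups


# Hypothetical annotations, for illustration only.
example = [
    ("geneA", "adenylate cyclase activity", "IDA"),
    ("geneB", "adenylate cyclase activity", "ISO"),
]
grouped = split_annotations(example)
```

Here `grouped["experimental"]` holds the annotation backed by a direct assay and `grouped["computational"]` the one inferred from orthology.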

10 Example: Toxin searches in GO. If a gene is annotated to ‘adenylate cyclase activity’, how do you know it’s a toxin? It may also be annotated to ‘cell killing’ or a related term, but is that enough? An alternative is to define virulence factors and toxins (both outside the scope of GO) in a new ontology
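The search problem on this slide can be illustrated with a small filter: a molecular function term alone matches many non-toxin genes, so a candidate is kept only if it also carries a supporting term such as ‘cell killing’. This is a sketch under assumptions; the gene names and the dictionary layout are hypothetical, and the slide itself questions whether such a co-annotation is sufficient evidence.

```python
def candidate_toxins(annotations, activity, supporting_terms):
    """Return genes annotated to `activity` AND to at least one supporting term.

    annotations: dict mapping gene name -> set of GO term names.
    """
    hits = []
    for gene, terms in sorted(annotations.items()):
        if activity in terms and terms & supporting_terms:
            hits.append(gene)
    return hits


# Hypothetical annotations, for illustration only.
annotations = {
    "geneA": {"adenylate cyclase activity"},                  # enzyme only
    "geneB": {"adenylate cyclase activity", "cell killing"},  # possible toxin
    "geneC": {"cell killing"},                                # no enzymatic term
}
toxins = candidate_toxins(annotations, "adenylate cyclase activity", {"cell killing"})
```

Only `geneB` passes the filter, which shows how much the result depends on curators having added the second term.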

11 Why we need a Virulence Factor ontology. Lots of effort to characterize pathogenic processes and systems (e.g., BRCs). Many different definitions of pathogen, virulence and virulence factors. Not clear what terms in GO may be related to toxins and virulence (BRCs have already assigned 750,000 GO terms to 300,000 genes)

12 Virulence Factor Ontology working group. Goal is to combine existing toxin and virulence terms from various groups into a single ontology – TVFac and antibiotic resistance (AR) terms at LANL – Gemina virulence factors and AR terms at U. of Maryland – PAMGO terms in GO. Participants – MITRE: Lynette Hirschman, Marc Colosimo, and others – LANL: Chris Stubben, Murray Wolinsky and Jian Song – U. of Maryland IGS: Lynn Schriml and Michelle Gwinn

13 Virulence Factor Ontology (VFO). Three new ontologies, one very simple that points to additional terms in GO or to new ontologies. Virulence factor (definition needed!) – toxin associated processes – antibiotic resistance – adhesion – entry into host – acquisition of nutrients from host – avoidance of host defenses – growth within host – modification of host morphology – dissemination from host. Figure: New simplified GO trees (slims)
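A slim groups specific ontology terms under a small set of high-level categories by following parent links upward. The sketch below uses the nine categories from this slide as the slim; the child terms and the parent table are hypothetical, invented only to show the traversal, and do not come from the VFO or GO.

```python
from collections import deque

# The nine top-level categories listed on the slide.
SLIM_TERMS = {
    "toxin associated processes", "antibiotic resistance", "adhesion",
    "entry into host", "acquisition of nutrients from host",
    "avoidance of host defenses", "growth within host",
    "modification of host morphology", "dissemination from host",
}

# Hypothetical child -> parents table, for illustration only.
PARENTS = {
    "toxin secretion": ["toxin associated processes"],
    "iron uptake from host": ["acquisition of nutrients from host"],
    "siderophore-mediated iron uptake": ["iron uptake from host"],
}


def map_to_slim(term, parents, slim_terms):
    """Return the nearest slim terms reachable from `term` via parent links."""
    found = set()
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current in slim_terms:
            found.add(current)
            continue  # do not climb past the nearest slim term on this path
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return found


category = map_to_slim("siderophore-mediated iron uptake", PARENTS, SLIM_TERMS)
```

A term with no path to any slim category returns an empty set, which flags annotations the slim does not yet cover.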

14 Virulence genes in novel strains. Emerging, engineered and novel strains will most likely be sequenced quickly using next-generation sequencing technologies, and then compared to near-neighbor strains using sequence similarity (BLAST) or models (HMMs like Pfams, TIGRFams, FIGFams, EnteroFams, etc.).

15 Compare novel strains to what? Very few manual annotations are available for prokaryotes, especially in public databases like NCBI and UniProt. “Curated information from the literature serves as the gold-standard data set for comparative analyses” (Nature, Sep 10, 2008). Table 1. Percentage of genes in UniProt with functional assignments to Gene Ontology terms based on experimental evidence in the primary literature. Use BRCs!

16 BRC annotations. Genome annotations should have references and evidence codes signifying whether annotations were produced experimentally or computationally. 3.8% of Y. pestis CO92 with manual annotations

17 Y. pestis CO92 annotations at ERIC. Tables 1 and 2. Sequence features and coding sequence annotations for Y. pestis CO92 at ERIC

18 Yersinia antibiotic resistance genes. Tables 1 and 2. Antibiotic resistance genes found using the Swiss-Prot keyword search ‘antibiotic resistance’ in UniProt and using the GO term search ‘response to antibiotic’ in ERIC. Only one gene in common!
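The overlap between the two searches is a set comparison. The sketch below assumes each search result has been reduced to a list of gene identifiers; the identifiers shown are hypothetical placeholders, not the genes from the slide's tables.

```python
def compare_hits(first, second):
    """Compare two gene lists and report shared and source-specific genes."""
    first, second = set(first), set(second)
    return {
        "common": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


# Hypothetical gene identifiers, for illustration only.
uniprot_keyword_hits = ["gene1", "gene2", "gene3"]
eric_go_term_hits = ["gene3", "gene4"]
overlap = compare_hits(uniprot_keyword_hits, eric_go_term_hits)
```

A short `common` list relative to the two source-specific lists is the pattern the slide reports: the keyword system and the GO term search retrieve largely different genes.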

19 Vibrio toxins in GO, UniProt, and NMPDR

20 Virulence genes in metagenomes. A recent comparison of virulence genes in chicken, cow, mouse and human gut metagenomes (metavirulomes) was based on SEED subsystem categories at NMPDR. An alternative is to use GO term mappings to protein family and domain databases like Pfam

21 IMG/metagenomes from JGI. Select metagenomes and save

22 Create abundance profiles. Compare using Pfam, COG, or TIGRfam abundance profiles
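One simple way to compare two abundance profiles is a log2 ratio of relative abundances per family. The presentation does not say which statistic was used, so the scoring below is an assumption chosen for illustration; the family names and counts are hypothetical, and the pseudocount is there only to avoid division by zero for families absent from one sample.

```python
import math


def log2_enrichment(counts_a, counts_b, pseudocount=1.0):
    """Score each family by log2(relative abundance in A / relative abundance in B).

    counts_a, counts_b: dicts mapping family name -> gene count in each metagenome.
    Positive scores mean the family is overrepresented in A.
    """
    families = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + pseudocount * len(families)
    total_b = sum(counts_b.values()) + pseudocount * len(families)
    scores = {}
    for family in families:
        freq_a = (counts_a.get(family, 0) + pseudocount) / total_a
        freq_b = (counts_b.get(family, 0) + pseudocount) / total_b
        scores[family] = math.log2(freq_a / freq_b)
    return scores


# Hypothetical counts, for illustration only.
air = {"famX": 30, "famY": 10}
soil = {"famX": 10, "famY": 30}
scores = log2_enrichment(air, soil)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Sorting by score puts the families most overrepresented in the first sample at the top of `ranked`; a real analysis would also need a significance test and correction for multiple comparisons.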

23 Find virulence genes. Use GO term mappings to the Pfam database to find virulence genes
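The lookup can be sketched as parsing a Pfam-to-GO mapping file and keeping the families mapped to virulence-related terms. The line layout assumed here (`Pfam:<accession> <name> > GO:<term name> ; <GO id>`, with `!` starting a comment line) follows the style of the pfam2go mapping file and should be checked against the actual file. The sample lines, accessions and GO identifiers below are made-up placeholders.

```python
def parse_pfam2go(lines):
    """Parse pfam2go-style lines into {pfam_accession: [(go_id, go_name), ...]}."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blanks and comment lines
        left, right = line.split(" > ", 1)
        accession = left.split()[0].split(":", 1)[1]
        name_part, go_id = right.rsplit(" ; ", 1)
        go_name = name_part[3:] if name_part.startswith("GO:") else name_part
        mapping.setdefault(accession, []).append((go_id.strip(), go_name.strip()))
    return mapping


def families_for_terms(mapping, term_names):
    """Return Pfam accessions mapped to any of the given GO term names."""
    return sorted(
        accession
        for accession, terms in mapping.items()
        if any(name in term_names for _, name in terms)
    )


# Made-up sample lines in the assumed layout, for illustration only.
sample = [
    "! comment line",
    "Pfam:PF90001 ExampleToxin > GO:pathogenesis ; GO:0000001",
    "Pfam:PF90002 ExampleKinase > GO:kinase activity ; GO:0000002",
    "Pfam:PF90003 ExamplePump > GO:response to antibiotic ; GO:0000003",
]
mapping = parse_pfam2go(sample)
virulence_families = families_for_terms(mapping, {"pathogenesis", "response to antibiotic"})
```

The resulting accessions can then be matched against the Pfam abundance profile of each metagenome.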

24 Need better mappings to virulence genes. Current GO term mappings miss most virulence-associated genes. Tables 1 and 2. Pfams and TIGRfams overrepresented in air compared to soil

