

IslandPath: A computational aid for identifying genomic islands that may play a role in microbial pathogenicity

William Hsiao 1*, Nancy Price 2, Ivan Wan 3, Steven J. Jones 3, and Fiona S. L. Brinkman 1

1 Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby; 2 Department of Medical Genetics, University of British Columbia, Vancouver; and 3 Genome Sequence Centre, B.C. Cancer Agency, British Columbia, Canada

Abstract

As more genomes from bacterial pathogens are sequenced, it is becoming apparent that a significant proportion of virulence factors are encoded in clusters of genes, termed Pathogenicity Islands (reviewed in [1]). These islands, and other genomic islands, tend to have atypical guanine and cytosine content (%G+C), contain mobility genes (e.g. transposases and integrases), and are associated with tRNA sequences. We have developed a web-based computational tool, IslandPath, to aid the visualization of these features in a full genome display, in order to facilitate the identification of genes in new genome sequences that may be involved in virulence or have horizontal origins. The ability to visualize these features within the genomic context can facilitate better detection of genomic island borders and neighbouring genes. Atypical %G+C by itself is not indicative of a horizontal origin for the sequence involved; however, the predictive power increases when such regions are associated with mobile elements or direct repeats, or contain genes with similarity to known virulence factors. Therefore, we are incorporating into IslandPath algorithms to detect partial tRNAs in new genomic sequences that are likely to be remnants of phage insertion events, and we are also comparing the genomic sequences to a custom-built database of a subset of known virulence factors.
Preliminary results are encouraging: IslandPath visualizes known Pathogenicity Islands as distinct regions within the genomes. This computational tool also permitted us to perform a more in-depth analysis of %G+C variance in genomes and enabled us to detect correlations not previously reported. As more and more genome data become available, tools like IslandPath, which can be updated in an automated fashion, will become valuable for genomic research.

Acknowledgements

This project is funded by the Peter Wall Institute for Advanced Studies. We wish to thank Tatiana Tatusov of NCBI for providing helpful files for IslandPath and acknowledge the efforts of the many genome projects that have made our analysis possible.

Methods:

- Core scripts written in Perl and CGI/Perl
- Sequence data: NCBI Genome FTP site
- Potential mobility elements: COG analysis [2,3] plus keyword scan
- RNA locations: NCBI data plus tRNAscan-SE [4]
- %G+C calculated for each ORF
- Mean and Std. Dev. calculated for all ORFs in the genome
- File containing all ORF information used to generate a graphical representation
- Virulence Gene Subset (VGS) database developed through literature analysis of genes identified as virulence factors using the "Molecular Koch's Postulates" (i.e. gene knockout affects virulence)

%G+C Analysis for Complete Genome Sequences:

Bacterial Pathogens | Primary Diseases | Cellular Localization | # of ORFs | %G+C Mean (ORFs >300bp) | %G+C S.D. (ORFs >300bp)
Neisseria meningitidis serogroup B strain MC58 | meningitis | extracellular | | |
Neisseria meningitidis serogroup A strain Z2491 | meningitis | extracellular | | |
Xylella fastidiosa | citrus variegated chlorosis | extracellular | | |
Escherichia coli O157:H7 (E. coli O157:H7_EDL933) | diarrhoea | facultative intracellular | 5361 (5349) | 51.1 (51.9) | 5.3 (5.3)
Mycoplasma pneumoniae M129 | mycoplasmal pneumonia ("walking pneumonia") | extracellular | | |
Yersinia pestis strain CO92 | bubonic plague and pneumonic plague | facultative intracellular | | |
Streptococcus pneumoniae TIGR4 (S. pneumoniae R6) | bacterial pneumonia, meningitis, sepsis, and otitis media | extracellular | 2094 (2043) | 40.3 (40.4) | 4.4 (4.3)
Treponema pallidum Nichols | syphilis | extracellular | | |
Mycoplasma pulmonis | murine respiratory mycoplasmosis | extracellular | | |
Pseudomonas aeruginosa PAO1 | variety of mucosal infections (opportunistic) | extracellular | | |
Rickettsia conorii Malish 7 | Mediterranean spotted fever | obligate intracellular | | |
Ureaplasma urealyticum serovar 3 | urethritis | extracellular | | |
Vibrio cholerae N16961 | cholera | extracellular | I: 2736 II: 1092 | I: 48.1 II: 46.9 | I: 3.7 II: 4.3
Borrelia burgdorferi B31 | Lyme disease | facultative intracellular | | |
Streptococcus pyogenes | scarlet fever, toxic shock like syndrome | extracellular | | |
Mycoplasma genitalium G37 | urethritis (opportunistic, usually HIV patients) | extracellular | | |
Campylobacter jejuni NCTC11168 | gastroenteritis | extracellular | | |
Helicobacter pylori (H. pylori J99) | peptic ulcers and gastritis | extracellular | 1566 (1491) | 39.4 (39.7) | 3.4 (3.3)
Haemophilus influenzae Rd-KW20 | upper respiratory infection, meningitis | extracellular | | |
Mycobacterium tuberculosis CDC1551 (M. tuberculosis H37Rv) | tuberculosis | facultative intracellular | 4187 (3918) | 65.5 (65.6) | 3.3 (3.3)
Pasteurella multocida PM70 | fowl cholera, cattle septicemia, etc. | extracellular | | |
Rickettsia prowazekii Madrid E | epidemic typhus | obligate intracellular | | |
Staphylococcus aureus Mu50 (S. aureus N315) | food poisoning, toxic shock syndrome, necrotizing fasciitis | extracellular | 2714 (2595) | 33.3 (32.2) | 3.0 (3.0)
Mycobacterium leprae | leprosy | obligate intracellular | | |
Agrobacterium tumefaciens C58 (Cereon) | crown gall (in plants) | extracellular | c: 2721 l: 1833 | c: 59.8 l: 59.7 | c: 2.7 l: 2.9
Chlamydophila pneumoniae AR39 (C. pneumoniae J138) [C. pneumoniae CWL029] | chlamydial pneumonia | obligate intracellular | 1110 (1070) [1052] | 41.1 (41.1) [41.1] | 2.6 (2.6) [2.6]
Chlamydia trachomatis D | chlamydia | obligate intracellular | | |
Chlamydia muridarum MoPn | chlamydia | obligate intracellular | | |

Non-pathogens | # of ORFs | %G+C Mean (ORFs >300bp) | %G+C S.D. (ORFs >300bp)
Escherichia coli K | | |

Discussion:

IslandPath appears to be an effective automated tool to visualize and detect genomic islands. Previous reports have expressed concern about the use of %G+C to detect horizontal gene transfer (HGT); however, these reports were examining %G+C for individual genes. We propose that %G+C analysis is effective if clusters of genes containing motifs associated with mobility elements are considered. Foreign genes with %G+C similar to that of the organism's genome are not detected, and due to gene amelioration, only "recent" HGT can be detected. This tool represents one approach that can be complemented with others to prioritize particular genomic islands that merit further research.

Future developments:

- Virulence factor homology search (based on comparison to our VGS dataset)
- Alternative DNA signatures (e.g. codon usage)
- Allow users to input their own sequences for analysis

%G+C Analysis General Observations:

- High %G+C variance is associated with species with evidence of recent horizontal gene transfers (e.g. N. meningitidis).
- Low %G+C variance is associated with highly clonal species and species with no evidence of horizontal gene transfers (e.g. Chlamydia species, which are obligate intracellular microbes thought to have been ecologically isolated from other bacteria for a longer period than other obligate intracellular bacteria).
- %G+C variance is similar among genomes of a single species, with the exception of the two V. cholerae chromosomes and the two E. coli strains. However, chromosome II of V. cholerae appears to have originated from a megaplasmid captured by Vibrio [5]. For E. coli, pathogenic strain O157:H7 has higher %G+C variance.
This might be due to the presence of PAIs and other potentially horizontally transferred genetic elements.

Frequencies of ORF %G+C in Genomes:

Histograms of frequencies of %G+C were plotted for several organisms.

Observations:

- Lowest kurtosis occurs most commonly with a mode of 33.33% for %G+C values of ORFs in a genome (e.g. M. jannaschii DSM2661). This G+C value corresponds to maximum A/T in synonymous sites for the standard codon usage table.
- Long tails in the frequency plots occur more frequently downward (e.g. H. pylori J99 and N. meningitidis) than upward.
- These observations likely reflect either a bias in gene identification in high G+C genomes or selection toward higher A+T content.

Detection of Proposed or Potential Genomic Islands:

Escherichia coli O157:H7: The area displayed in the white rectangle is ~28 kb in size (from 3708 kbp to 3736 kbp) and contains Type III Secretion proteins (Epr's, Epa's, and Eiv's) and numerous hypothetical proteins with unknown functions.

Vibrio cholerae chromosome I: The area displayed in the red rectangle is ~34 kb in size (from 1896 kbp to 1930 kbp) and contains a tRNA-Ser in the same orientation as the phage integrase downstream of it. The ORFs include one putative helicase, one chemotaxis protein MotB-related protein, one putative type I restriction enzyme HsdR, one putative DNA methylase, one putative N-acetylneuraminate lyase, one C4-dicarboxylate-binding periplasmic protein, and numerous hypothetical proteins and conserved hypothetical proteins.

A tRNA adjacent to an abnormal %G+C region is often observed to be in the same orientation as the stretch. This might be an artefact of phage insertion and excision events, as the 3' ends of tRNAs are common phage attachment (att) sites.

Horizontal Gene Transfer and Bacterial Pathogenicity:

Several types of mobile elements have been shown to carry virulence factors:

- Transposons: ST enterotoxin genes in E. coli
- Prophages: Shiga-like toxins in EHEC; diphtheria toxin gene; cholera toxin; botulinum toxins
- Plasmids: Shigella, Salmonella, Yersinia
- Pathogenicity Islands: uro/entero-pathogenic E. coli, Salmonella typhimurium, Yersinia spp., Helicobacter pylori, Vibrio cholerae

References

1. Hacker J and Kaper JB, 2000, Annu Rev Microbiol. 54
2. Tatusov RL, et al., 1997, Science 278(5338)
3. Tatusov RL, et al., 2001, Nucleic Acids Res. 29(1)
4. Lowe TM and Eddy SR, 1997, Nucleic Acids Res. 25(5)
5. Heidelberg JF, et al., 2000, Nature 406

Whole Genome (predicted) ORF Display:

Genome ORFs are displayed to allow interesting regions (rich in mobility genes, abnormal %G+C, close to structural RNAs) to be viewed in a genome context.

E.g. H. pylori genome. Several low %G+C regions can be seen in the graphic display:

- CAG island
- plasticity zone (contains different genes in J99 and 26695)
- region containing virB homologues; not present in strain J99

Detection of Known Pathogenicity Islands:

Vibrio cholerae chromosome I: VPI (toxin regulated pili). The VPI is delineated as a stretch of low %G+C region flanked by mobility genes.

Yersinia pestis strain CO92: High Pathogenicity Island core (in red rectangle). Mean: 47.9; Std. Dev.: 4.9.

Annotation table columns: %GC | S.D. | Location | Orientation | Product

Products listed for the island:

- pesticin/yersiniabactin receptor protein
- yersiniabactin siderophore biosynthetic protein
- yersiniabactin biosynthetic protein YbtT
- yersiniabactin biosynthetic protein YbtU
- yersiniabactin biosynthetic protein
- yersiniabactin biosynthetic protein
- transcriptional regulator YbtA
- lipoprotein inner membrane ABC-transporter
- inner membrane ABC-transporter YbtQ
- putative signal transducer
- putative salicylate synthetase
- integrase

IslandPath Graphical Display:

Each dot in a graphic corresponds to a predicted protein-coding ORF in the genome. Dot colours indicate whether an ORF has a higher or lower %G+C than the cutoffs you set (default settings are +/- 3.48* around the mean %G+C).
You may click on a dot to view a portion of an annotation table presented below the graphic.

* = 1.5 S.D. of the mean for Chlamydia genomes, which are proposed to have undergone no recent horizontal gene transfer (data not shown).
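
The per-ORF %G+C calculation and the cutoff-based dot colouring described in this poster can be illustrated with a short sketch. IslandPath's core scripts are written in Perl; the Python below is not the authors' code, only a minimal illustration under stated assumptions. The function names `gc_percent` and `classify_orfs`, the labels "high", "low" and "typical", the use of the sample standard deviation, and restricting the mean and S.D. to ORFs longer than `min_len` bases (mirroring the ">300bp" table columns) are all assumptions made here for illustration. The default cutoff of 3.48 is the default stated in the poster.

```python
from statistics import mean, stdev


def gc_percent(seq: str) -> float:
    """Return the %G+C of a nucleotide sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / acgt


def classify_orfs(orfs, cutoff=3.48, min_len=300):
    """Label each ORF by its %G+C relative to the genome-wide ORF mean.

    orfs    -- dict mapping a (hypothetical) ORF name to its DNA sequence
    cutoff  -- distance from the mean, in %G+C units (poster default: 3.48)
    min_len -- only ORFs longer than this contribute to the mean and S.D.
               (an assumption based on the ">300bp" table columns)

    Returns (mean, sample S.D., {name: "high" | "low" | "typical"}).
    """
    gc = {name: gc_percent(seq) for name, seq in orfs.items()}
    reference = [gc[name] for name, seq in orfs.items() if len(seq) > min_len]
    if len(reference) < 2:
        raise ValueError("need at least two ORFs longer than min_len")
    mu = mean(reference)
    sd = stdev(reference)  # sample S.D.; the poster does not say which form it uses
    labels = {}
    for name, value in gc.items():
        if value > mu + cutoff:
            labels[name] = "high"
        elif value < mu - cutoff:
            labels[name] = "low"
        else:
            labels[name] = "typical"
    return mu, sd, labels


if __name__ == "__main__":
    # Toy sequences, far shorter than real ORFs, so min_len is set to 0 here.
    toy = {"orf1": "ATGC" * 3, "orf2": "GGCC" * 3, "orf3": "ATAT" * 3}
    mu, sd, labels = classify_orfs(toy, min_len=0)
    print(f"mean={mu:.1f} sd={sd:.1f} labels={labels}")
```

A fixed cutoff in %G+C units follows the default described in the graphical display section. To express the cutoff as a multiple of the genome's own standard deviation instead, one could call `classify_orfs` once to obtain the S.D. and then call it again with `cutoff=1.5 * sd`.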