The Integrated Microbial Genome (IMG) systems

Presentation transcript:

The Integrated Microbial Genome (IMG) systems
Nikos Kyrpides, Genome Biology Program (GBP), DOE Joint Genome Institute

IMG Systems Data Types
Genes, Genomes, Functions, Metadata, Clusters, SNPs, Proteomics, Regulons, Transcriptomes

Gene/Genome context analysis tools
Gene context tools: gene fusion, gene neighborhood, gene co-occurrence.
Genome synteny tools: VISTA (predetermined genome set), DotPlot (two user-specified genomes), ACT (multiple user-specified genomes).

Gene Fusions

Conserved chromosomal cassettes
Genes are replaced by protein families (COGs, Pfams, IMG ortholog families). One gene → multiple families.
A conserved chromosomal cassette contains: the cassettes that share at least TWO protein families, and the protein families that those cassettes have in common.
The definition of a conserved chromosomal cassette does not take into account the order of the protein families on the cassette.
Mavromatis et al. (2009) PLoS ONE
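The cassette definition on this slide can be illustrated with a minimal sketch. It is not the IMG implementation or the method of Mavromatis et al.; it only applies the stated rule: map each gene to its protein families, ignore gene order, and report cassette pairs that share at least two families. The function names, the cassette names, and the gene and family identifiers are all hypothetical.

```python
from itertools import combinations


def cassette_families(genes, gene_to_families):
    """Collapse a cassette (list of gene ids) into an unordered set of families.

    A gene may belong to several families, so all of them are added.
    """
    families = set()
    for gene in genes:
        families.update(gene_to_families.get(gene, ()))
    return frozenset(families)


def conserved_cassettes(cassettes, gene_to_families, min_shared=2):
    """Return (cassette_a, cassette_b, shared_families) for every pair of
    cassettes that has at least `min_shared` protein families in common.
    Gene order on the cassette is deliberately ignored."""
    fam_sets = {
        name: cassette_families(genes, gene_to_families)
        for name, genes in cassettes.items()
    }
    conserved = []
    for a, b in combinations(sorted(fam_sets), 2):
        shared = fam_sets[a] & fam_sets[b]
        if len(shared) >= min_shared:
            conserved.append((a, b, shared))
    return conserved


# Hypothetical example data (identifiers are made up for illustration).
gene_to_families = {
    "g1": {"COG0001"},
    "g2": {"COG0002", "pfam00001"},  # one gene -> multiple families
    "g3": {"COG0003"},
    "h1": {"COG0002"},
    "h2": {"COG0001"},
    "h3": {"COG0009"},
    "k1": {"COG0003"},
    "k2": {"COG0042"},
}
cassettes = {
    "genomeA_c1": ["g1", "g2", "g3"],
    "genomeB_c1": ["h1", "h2", "h3"],  # same families as A, different order
    "genomeC_c1": ["k1", "k2"],        # shares only one family with A
}

for a, b, shared in conserved_cassettes(cassettes, gene_to_families):
    print(a, b, sorted(shared))
```

In this toy input only the genomeA/genomeB pair qualifies: the two cassettes share two families even though the genes appear in a different order, while genomeC shares a single family with genomeA and is excluded.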

Missing function: context-based analysis
Missing function from the fatty acid biosynthesis pathway. No known gene for this function has a homolog in Streptococci.

Genome Synteny tools

IMG Function Curation (a) public & automatic: 1. Protein Product 2. Protein Family

IMG Function Curation (b) manual: 3. IMG Term 4. MyIMG

IMG Function Curation, automatic and manual: 1. Protein Product 2. Protein Family 3. IMG Term 4. MyIMG

Who is there?

Finding organisms

What is the role of the organism in the community?

What is the metabolic potential of the community? Function Abundance

Relative abundance of functions
Cloning bias. PCR bias. Assembly coverage. Misassemblies. Erroneous gene prediction.
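As a rough illustration of why assembly coverage matters for relative function abundance, the sketch below compares plain gene counts with counts weighted by the read depth of the contig each gene sits on. The weighting scheme is an assumption made for illustration, not a description of how IMG computes abundance, and the function name and input values are hypothetical.

```python
from collections import defaultdict


def relative_function_abundance(genes, use_depth=True):
    """Compute the fraction of the community's genes assigned to each function.

    `genes` is an iterable of (function_id, contig_read_depth) pairs.
    With use_depth=False every gene counts once; with use_depth=True each
    gene is weighted by the read depth of its contig (an assumed proxy for
    how many copies were collapsed into that contig during assembly).
    """
    totals = defaultdict(float)
    for function_id, depth in genes:
        totals[function_id] += depth if use_depth else 1.0
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {function_id: value / grand_total for function_id, value in totals.items()}


# Hypothetical genes: one COG0001 gene lies on a deeply covered contig.
genes = [
    ("COG0001", 10.0),
    ("COG0001", 1.0),
    ("COG0002", 1.0),
    ("COG0002", 1.0),
]

unweighted = relative_function_abundance(genes, use_depth=False)
weighted = relative_function_abundance(genes, use_depth=True)
print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

With raw counts the two functions look equally abundant; once contig depth is taken into account, the function carried by the highly covered contig dominates. The other factors on the slide (cloning bias, PCR bias, misassemblies, erroneous gene prediction) are not modeled here.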

IMG curation

Curation check

Gene annotation curation: allows overview comparisons between cluster (family) annotations and gene annotations to identify over- and underclassified genes.

Gene page: main gene detail page