Computational genomic strategies for natural product discovery

Presentation transcript:

Computational genomic strategies for natural product discovery
Dr. Marnix H. Medema, Bioinformatics Group, Wageningen University, The Netherlands
EBI Course: Exploiting Metagenomics
Thursday, December 3rd, 2015, 11:00h

Microbial biosynthetic pathways: a great source of valuable molecules

Specialized metabolites play key roles in microbiomes Donia et al. (2014) Cell 158: 1402-1414.

Diverse and complex enzymology produces chemical diversity: RiPPs
RiPPs: Ribosomally synthesized and Post-translationally modified Peptides
Huge assembly lines, but: not the only mechanism
Ortega et al. (2015) Nature 517: 509-512.

Diverse and complex enzymology produces chemical diversity: nonribosomal peptides
Key enzyme class: nonribosomal peptide synthetase (NRPS)
NRPSs can introduce nonproteinogenic amino acids into peptides!

Diverse and complex enzymology produces chemical diversity: nonribosomal peptides
Schmartz et al. (2014) Nat. Prod. Rep. 12: 5574-5577.

Diverse and complex enzymology produces chemical diversity: polyketides
Key enzyme class: polyketide synthase (PKS)
Menzella et al. (2005) Nat. Biotechnol. 23: 1171-1176.

Diverse and complex enzymology produces chemical diversity: polyketides
Not all polyketide synthases are modular, some are iterative!
Iterative PKS types: fungal type I, type II, type III, etc.
Shen et al. (2003) Curr. Opin. Chem. Biol. 7: 285-295.

Diverse and complex enzymology produces chemical diversity: terpenes
Key enzyme classes: terpene synthases / cyclases
These turn isoprene precursors into mature terpenoids
Gao et al. (2012) Nat. Prod. Rep. 29: 1153-1175.

Diverse and complex enzymology produces chemical diversity: saccharides
Key enzyme class: glycosyltransferase
McCranie & Bachmann (2014) Nat. Prod. Rep. 31: 1026-1042.

Biosynthetic gene clusters: the genetic basis of molecular diversity
So if we can find new gene clusters, we can find new chemicals! Now, how do we find new gene clusters?

Modularity of biosynthetic gene clusters
Second strategy
Cacho et al. (2015) Front. Microbiol. 5: 774.

Modularity of biosynthetic gene clusters
Medema, Cimermancic et al. (2015) PLoS Comp. Biol. 10: e1004016.

antiSMASH: a web server for the detection and analysis of biosynthetic gene clusters
Medema et al. (2011) Nucl. Acids Res. 39: W339-W346. Blin, Medema et al. (2013) Nucl. Acids Res. 41: W204-W212.
http://antismash.secondarymetabolites.org

Core structure prediction for polyketide synthase and nonribosomal peptide synthetase gene clusters
Medema et al. (2011) Nucl. Acids Res. 39: W339-W346. Blin, Medema et al. (2013) Nucl. Acids Res. 41: W204-W212.
http://antismash.secondarymetabolites.org

Comparative analysis and subcluster detection
Medema et al. (2011) Nucl. Acids Res. 39: W339-W346. Blin, Medema et al. (2013) Nucl. Acids Res. 41: W204-W212.
http://antismash.secondarymetabolites.org

Another method to detect biosynthetic gene clusters in prokaryotic genomes
Training set consisted of 732 biosynthetic gene clusters of known compounds:
136 type I polyketides; 100 nonribosomal peptides; 76 type II polyketides; 82 polyketide-peptide hybrids; 93 oligo- and polysaccharides; 38 aminoglycosides; 36 terpenoids; 27 ribosomal peptides; 23 lantibiotics; 13 indolocarbazoles; 11 type III polyketides; 9 fatty acids; 9 siderophores; 8 nucleosides; 6 beta-lactams; 4 aminocoumarins; 61 others
Cimermancic, Medema, Claesen et al. (2014) Cell 158: 412-421.
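As a quick consistency check on the training set, the per-class counts can be tallied in a few lines. The sketch below only re-enters the numbers from the slide; the names `TRAINING_SET` and `class_fractions` are illustrative and not part of any published tool.

```python
# Per-class counts copied from the slide; the dictionary name is illustrative.
TRAINING_SET = {
    "type I polyketides": 136,
    "nonribosomal peptides": 100,
    "type II polyketides": 76,
    "polyketide-peptide hybrids": 82,
    "oligo- and polysaccharides": 93,
    "aminoglycosides": 38,
    "terpenoids": 36,
    "ribosomal peptides": 27,
    "lantibiotics": 23,
    "indolocarbazoles": 13,
    "type III polyketides": 11,
    "fatty acids": 9,
    "siderophores": 9,
    "nucleosides": 8,
    "beta-lactams": 6,
    "aminocoumarins": 4,
    "others": 61,
}


def class_fractions(counts):
    """Return the share of the whole training set taken by each class."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total = sum(TRAINING_SET.values())
fractions = class_fractions(TRAINING_SET)

print("total clusters:", total)
# List the classes from most to least represented.
for name, share in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.1%}")
```

The counts add up to the stated 732, and the tally makes the class imbalance visible: a handful of classes dominate, while several are represented by fewer than ten clusters.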

Large metagenomic datasets may contain very large numbers of biosynthetic gene clusters
Now there are of course both rare and frequently occurring classes of gene clusters / compounds. What we had not expected was to find large clusters within this network that contain no known gene clusters. We chose one of these regions, which contained two related families of hundreds of gene clusters encoding, amongst others, very unusual ketosynthases and CoA-ligases.
Cimermancic, Medema, Claesen et al. (2014) Cell 158: 412-421.
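The idea of spotting regions of a gene cluster network that contain no known clusters can be illustrated with a small sketch. It assumes that pairwise similarities have already been computed and thresholded into a list of edges; the function names, the toy identifiers and the `min_size` parameter are all hypothetical, and this is not the procedure used in the cited study.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group nodes into connected components using a depth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def families_without_known_members(nodes, edges, known, min_size=2):
    """Return components of at least min_size nodes with no known cluster."""
    return [
        component
        for component in connected_components(nodes, edges)
        if len(component) >= min_size and not (component & known)
    ]


# Toy data, invented purely for illustration.
nodes = ["bgc1", "bgc2", "bgc3", "bgc4", "bgc5", "bgc6"]
edges = [("bgc1", "bgc2"), ("bgc2", "bgc3"), ("bgc4", "bgc5")]
known = {"bgc1"}  # clusters linked to a characterized compound

families = families_without_known_members(nodes, edges, known)
print(families)
```

In the toy network, the component containing bgc4 and bgc5 is reported because none of its members is known, while the singleton bgc6 is dropped by the size cutoff.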

Data on BGCs are scattered and not systematically stored

The minimum information about a biosynthetic gene cluster (MIBiG)
Medema et al. (2015) Nature Chem. Biol., under review.

A rich set of annotations and metadata on biosynthetic gene clusters

General MIBiG parameters: Biosynthetic class; MIxS environmental / taxonomic information; Number of loci; Complete / partial cluster; Nucleotide sequence accession; 16S accession / sequence; Custom gene names; Functional sub-clusters; Biosynthetic genes; Transport-related genes; Regulatory genes; Resistance/immunity genes; Operon architecture; Knockout mutant phenotypes; Compound name; Synonyms for compound name; Exact molecular mass; Molecular formulae of the compound(s); Compound structure; Chemical moieties; Compound activity; Compound molecular target; Publications on activity/toxicity/target; Tailoring reactions; Evidence for compound-cluster connection

Polyketide-specific: Polyketide synthase type; Polyketide subclass; Linear / cyclic; PKS genes; Number of PKS modules; Ketide unit sequence; Starter unit; Reductive domains; KR stereochemistries; AT domain substrate specificities; Non-reductive modifying PKS domains; Module skipping / iteration; Number of iterations (if iterative); Iterative PKS subtype (if iterative); Trans-acyltransferase genes; Inactive / atypical domains; TE domain type; Cyclization / termination type

Nonribosomal peptide-specific: NRP subclass; Linear / cyclic; NRPS genes; Number of NRPS modules; NRP amino acid sequence; A domain substrate specificities; Variable A domain specificities; Condensation domain subtypes; Modifying domains (Me/Ox/Red/Epi); Module skipping / iteration; TE domain type; Cyclization / termination type

RiPP-specific: RiPP subclass; Linear / cyclic; Precursor-encoding gene(s); Precursor peptide length; Leader peptide length; Follower peptide length; Core peptide length; Core peptide sequence; Cleavage recognition site; Number of crosslinks; Crosslink positions; Type of crosslinks/cyclizations; Recognition motif in leader peptide

Terpenoid-specific: Terpene subclass; Precursor carbon chain length; Final isoprenoid precursor; Terpene synthases / cyclases; Prenyltransferases

Saccharide-specific: Saccharide subclass; Glycosyltransferase (GT) genes; GT substrate specificities

Alkaloid-specific: Alkaloid subclass

Specific for other classes: Biosynthetic class specification

Again, MIBiG has an important role to play here, as standardized data submission and storage will allow us to build up a parts registry that can function as a trustworthy repository for designing new pathways.
Medema et al. (2015) Nature Chem. Biol., under review.
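To make the idea of a standardized record concrete, here is a minimal sketch of how a few of the general and class-specific parameters could be held in one structure and checked for consistency. The class name, field names and validation rules are assumptions made for illustration; they do not reproduce the actual MIBiG schema or submission format, and the example values are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClusterRecord:
    """Hypothetical, simplified record; NOT the official MIBiG schema."""

    compound_name: str
    biosynthetic_classes: List[str]
    nucleotide_accession: str
    complete_cluster: bool
    number_of_loci: int = 1
    exact_molecular_mass: Optional[float] = None
    number_of_pks_modules: Optional[int] = None   # polyketide-specific
    number_of_nrps_modules: Optional[int] = None  # nonribosomal peptide-specific

    def problems(self) -> List[str]:
        """Return human-readable consistency problems (empty list if none)."""
        issues = []
        if not self.compound_name.strip():
            issues.append("compound name is empty")
        if not self.biosynthetic_classes:
            issues.append("no biosynthetic class given")
        if self.number_of_loci < 1:
            issues.append("number of loci must be at least 1")
        if (self.number_of_pks_modules is not None
                and "polyketide" not in self.biosynthetic_classes):
            issues.append("PKS module count given for a non-polyketide cluster")
        if (self.number_of_nrps_modules is not None
                and "nonribosomal peptide" not in self.biosynthetic_classes):
            issues.append("NRPS module count given for a non-NRP cluster")
        return issues


# Placeholder values only; no real compound or accession is implied.
ok = ClusterRecord(
    compound_name="placeholder compound",
    biosynthetic_classes=["polyketide"],
    nucleotide_accession="PLACEHOLDER_ACCESSION",
    complete_cluster=True,
    number_of_pks_modules=6,
)
bad = ClusterRecord(
    compound_name="",
    biosynthetic_classes=[],
    nucleotide_accession="PLACEHOLDER_ACCESSION",
    complete_cluster=False,
    number_of_loci=0,
    number_of_nrps_modules=3,
)

print(ok.problems())
print(bad.problems())
```

Keeping class-specific fields optional mirrors the layout of the parameter list: general parameters apply to every cluster, while the polyketide or NRP fields are only meaningful when the matching biosynthetic class is declared.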

Community annotation of biosynthetic gene clusters using MIBiG
>75 research groups worldwide participated
Result: detailed annotation of about 400 BGCs, essential annotations for about another 900 BGCs
So we currently have a draft version of MIBiG, on which between 60 and 70 PIs in the field have already commented through an online survey. Later this week, I will organize a discussion session, to which I would like to invite you all to discuss this further. A standard has to be carried by the community.

An online repository for MIBiG information
http://mibig.secondarymetabolites.org

Integration with antiSMASH: KnownClusterBlast

Finding more variants of known enzymatic parts using MultiGeneBlast
Medema et al. (2013) Mol. Biol. Evol. 30: 1218-1223.
http://multigeneblast.sf.net

Finally: some suggestions for analyzing metagenomes using antiSMASH
Assemble first!
Only run contigs > 2 kb; use other tools for very fragmented assemblies, e.g. http://napdos.ucsd.edu/
Sort contigs by size; if >1000 contigs: run locally or contact us to run it on the public server
Local installations: Docker container available, see http://phdops.kblin.org/2015-running-antismash-standalone-from-docker.html
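The contig-selection advice can be sketched as a small pre-processing step run on an assembly before submission. This is a plain-Python illustration under stated assumptions: the function names are hypothetical, the 2 kb cutoff is read as "longer than 2000 bp", and the demo assembly is synthetic. It is not part of antiSMASH itself.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_contigs(records, min_length: int = 2000) -> List[Tuple[str, str]]:
    """Keep contigs longer than min_length bp, sorted from longest to shortest."""
    kept = [(header, seq) for header, seq in records if len(seq) > min_length]
    kept.sort(key=lambda record: len(record[1]), reverse=True)
    return kept


def to_fasta(records, width: int = 80) -> str:
    """Serialize records back to FASTA text with wrapped sequence lines."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Synthetic demo assembly: three contigs of 2500, 1500 and 3000 bp.
demo = (
    ">contig_a\n" + "A" * 2500 + "\n"
    ">contig_b\n" + "C" * 1500 + "\n"
    ">contig_c\n" + "G" * 3000 + "\n"
)

selected = select_contigs(read_fasta(demo.splitlines()))
too_many_for_server = len(selected) > 1000  # rule of thumb from the slide

print([header for header, _ in selected])
print("run locally or get in touch:", too_many_for_server)
```

For a real assembly, the same functions can be fed an open file handle instead of `demo.splitlines()`, and the output of `to_fasta(selected)` written to a new file for upload or for a local run.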