Published by Barrie Patrick. Modified over 9 years ago.
1
Evolution of regulatory interactions in bacteria
Mikhail Gelfand
Research and Training Center “Bioinformatics”, Institute for Information Transmission Problems, RAS, Moscow, Russia
Singapore, 17-18 July 2006
2
Comparative genomics of regulation
Why
– Functional annotation of genes
– Metabolic modeling
– Practical applications in genetic engineering, drug targeting etc.
How
– Close genomes: phylogenetic footprinting. Regulatory sites are seen as conservation islands in alignments of gene upstream regions
– Distant genomes: consistency filtering. Candidate sites in one genome may be unreliable, but independent occurrence upstream of orthologous genes in many genomes yields reliable predictions
Caveats
– Presence of (predicted) binding sites does not immediately imply functional regulation
– Operon structure has to be taken into account
– Need to verify presence of orthologous transcription factors in the studied genomes
– Orthologous factors may have different binding motifs
– One functional system may be regulated by different factors within and between genomes
Many genomes
– Taxon-specific regulation
– Evolution of: individual sites; transcription-factor families; transcription factors and their binding motifs; simple and complex regulatory systems
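The consistency-filtering idea for distant genomes can be sketched in a few lines. This is a minimal illustration, not the procedure used in the talk: the function name `consistency_filter`, the `min_genomes` threshold, and the toy genome, gene, and ortholog-group labels are all assumptions made for the example.

```python
from collections import defaultdict

def consistency_filter(candidates, orthologs, min_genomes=3):
    """Keep ortholog groups with candidate sites in at least `min_genomes` genomes.

    candidates: dict genome -> set of genes that have a candidate site upstream
    orthologs:  dict (genome, gene) -> ortholog group id
    Returns a dict group id -> sorted list of supporting genomes.
    """
    support = defaultdict(set)
    for genome, genes in candidates.items():
        for gene in genes:
            group = orthologs.get((genome, gene))
            if group is not None:
                support[group].add(genome)
    return {g: sorted(gs) for g, gs in support.items() if len(gs) >= min_genomes}

# Hypothetical toy input: three genomes, three ortholog groups.
candidates = {
    "genomeA": {"a1", "a2"},
    "genomeB": {"b1"},
    "genomeC": {"c1", "c9"},
}
orthologs = {
    ("genomeA", "a1"): "OG1", ("genomeB", "b1"): "OG1", ("genomeC", "c1"): "OG1",
    ("genomeA", "a2"): "OG2", ("genomeC", "c9"): "OG3",
}
reliable = consistency_filter(candidates, orthologs, min_genomes=3)
```

Only the group supported by all three toy genomes survives the filter; a candidate site seen in a single genome is discarded as unreliable.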
3
How it works: Two simple examples
Biotin regulator of alpha-proteobacteria
Universal regulator of ribonucleotide reductases: reconstruction of the regulatory system and the mechanism of regulation
4
BirA (biotin regulator in eubacteria and archaea): conserved signal, changed spacing
Profile 2: Gram-negative bacteria
Profile 1: Gram-positive bacteria, Archaea
5
BirA (biotin regulator in eubacteria and archaea): conserved signal, changed spacing
Profile 2: Gram-negative bacteria
Profile 1: Gram-positive bacteria, Archaea
BirA of alpha-proteobacteria: no DNA-binding domain
6
Identification of the candidate regulator (BioR) in alpha-proteobacteria
1. Candidate binding sites: similar palindromes upstream of biotin biosynthesis and transport genes in different genomes
TTATAGATAA TTATCTATAA TTATAGATAg TTATCTATAA TTATAGATAg TTATCTATAA TcATATATtA TcATAGATAg TTATCTATAA TTATCTATtA TTATCTAcAA TTATCTATAA TcATAGATtA cTATAGATAA TTATCTAcAA
7
2. Positional clustering: candidate transcription factor from the GntR family is often found in the same loci (black arrows)
3. Phyletic patterns: phyletic distribution of candidate sites (red circles) exactly coincides with the phyletic distribution of the candidate regulator
4. Autoregulation: in many cases there are candidate sites upstream of the bioR gene itself
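The candidate sites in step 1 can be compared to one reference in both orientations. The sketch below is an illustration only: the choice of the first listed 10-mer as `HALF_SITE` and the helper names are assumptions, and the sites are copied from the slide as printed.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]

def mismatches(a, b):
    """Case-insensitive Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a.upper(), b.upper()))

# Assumption: the first site listed on the slide is used as the reference.
HALF_SITE = "TTATAGATAA"

# Candidate sites as printed on the slide (lower case marks deviations).
SITES = [
    "TTATAGATAA", "TTATCTATAA", "TTATAGATAg", "TTATCTATAA", "TTATAGATAg",
    "TTATCTATAA", "TcATATATtA", "TcATAGATAg", "TTATCTATAA", "TTATCTATtA",
    "TTATCTAcAA", "TTATCTATAA", "TcATAGATtA", "cTATAGATAA", "TTATCTAcAA",
]

def distance_to_reference(site, ref=HALF_SITE):
    """Mismatches to the reference or to its reverse complement, whichever is fewer."""
    return min(mismatches(site, ref), mismatches(site, reverse_complement(ref)))

distances = [distance_to_reference(s) for s in SITES]
```

Most of the listed 10-mers match the reference or its reverse complement exactly or with one mismatch.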
8
Conserved signal upstream of nrd genes
9
Identification of the candidate regulator by the analysis of phyletic patterns
COG1327: the only COG with exactly the same phylogenetic pattern as the signal
– “large scale” on the level of major taxa
– “small scale” within major taxa:
  absent in small parasites among alpha- and gamma-proteobacteria
  absent in Desulfovibrio spp. among delta-proteobacteria
  absent in Nostoc sp. among cyanobacteria
  absent in Oenococcus and Leuconostoc among Firmicutes
  present only in Treponema denticola among four spirochetes
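The phyletic-pattern search described above amounts to finding protein families whose presence/absence profile across genomes equals that of the signal. A minimal sketch follows, assuming presence is given as sets of genome names. The function name and all toy labels (`g1`..`g5`, `COG_X` etc.) are hypothetical and are not data from the talk.

```python
def matching_cogs(signal_genomes, cog_genomes):
    """Return COG ids whose set of genomes is identical to the signal's set.

    signal_genomes: iterable of genomes where the conserved signal is found
    cog_genomes:    dict COG id -> iterable of genomes containing a member
    """
    target = frozenset(signal_genomes)
    return sorted(cog for cog, genomes in cog_genomes.items()
                  if frozenset(genomes) == target)

# Hypothetical toy data: the signal is present in g1, g2 and g4 only.
signal = {"g1", "g2", "g4"}
cogs = {
    "COG_X": {"g1", "g2", "g4"},              # same pattern as the signal
    "COG_Y": {"g1", "g2", "g3", "g4", "g5"},  # universal, not informative
    "COG_Z": {"g1", "g4"},                    # missing in g2
}
hits = matching_cogs(signal, cogs)
```

An exact match is a strict criterion; in noisy or incomplete genome data one might instead rank families by the number of disagreeing genomes.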
10
COG1327 “Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains”: regulator of the riboflavin pathway?
11
Additional evidence – 1
nrdR is sometimes clustered with nrd genes or with replication genes dnaB, dnaI, polA
12
Additional evidence – 2
In some genomes, candidate NrdR-binding sites are found upstream of other replication-related genes
– dNTP salvage
– topoisomerase I, replication initiator dnaA, chromosome partitioning, DNA helicase II
13
Multiple sites (nrd genes): FNR, DnaA, NrdR
14
Mode of regulation
Repressor (overlaps with promoters)
Co-operative binding:
– most sites occur in tandem (> 90% of cases)
– the distance between the copies (centers of palindromes) equals an integer number of DNA turns:
  mainly (94%) 30-33 bp, in 84% 31-32 bp – 3 turns
  21 bp (2 turns) in Vibrio spp.
  41-42 bp (4 turns) in some Firmicutes
– experimental confirmation in Streptomyces (Borovok et al., 2004)
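The "integer number of DNA turns" observation can be checked with simple arithmetic. The sketch assumes a helical repeat of about 10.5 bp per turn for B-form DNA; the constant and function names are chosen for this example only.

```python
HELICAL_REPEAT_BP = 10.5  # assumed helical repeat of B-DNA, in bp per turn

def helical_phase(distance_bp, repeat=HELICAL_REPEAT_BP):
    """Return (nearest whole number of turns, deviation from it in turns)."""
    turns = distance_bp / repeat
    nearest = round(turns)
    return nearest, abs(turns - nearest)

# Center-to-center distances quoted on the slide.
for distance in (21, 31, 32, 41, 42):
    nearest, deviation = helical_phase(distance)
    print(f"{distance} bp: about {nearest} turns (off by {deviation:.2f} turn)")
```

A deviation close to zero means the two sites lie on the same face of the double helix, which is what co-operative binding of adjacent repressor molecules would require.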
15
Evolutionary processes that shape regulatory systems
Expansion and contraction of regulons
Duplications of regulators with or without regulated loci
Loss of regulators with or without regulated loci
Re-assortment of regulators and structural genes … especially in complex systems
Horizontal transfer
16
Loss of regulators, and cryptic sites
Loss of RbsR in Y. pestis (the ABC transporter is also lost)
[Figure labels: start codon of rbsD; RbsR binding site]
17
Regulon expansion: how FruR has become CRA
[Figure: metabolic map. Genes: icdA, aceA, aceB, aceEF, pckA, ppsA, pykF, adhE, gpmA, pgk, tpiA, gapA, pfkA, fbp, fruK, fruBA, eda, edd, epd, ptsHI-crr, manXYZ, mtlD, mtlA. Substrates: Fructose, Glucose, Mannose, Mannitol. Taxon label: Gamma-proteobacteria]
18
Common ancestor of Enterobacteriales
[Figure: metabolic map. Genes: icdA, aceA, aceB, aceEF, pckA, ppsA, pykF, adhE, gpmA, pgk, tpiA, gapA, pfkA, fbp, fruK, fruBA, eda, edd, epd, ptsHI-crr, manXYZ, mtlD, mtlA. Substrates: Fructose, Glucose, Mannose, Mannitol. Taxon labels: Gamma-proteobacteria, Enterobacteriales]
19
Common ancestor of Escherichia and Salmonella
[Figure: metabolic map. Genes: icdA, aceA, aceB, aceEF, pckA, ppsA, pykF, adhE, gpmA, pgk, tpiA, gapA, pfkA, fbp, fruK, fruBA, eda, edd, epd, ptsHI-crr, manXYZ, mtlD, mtlA. Substrates: Fructose, Glucose, Mannose, Mannitol. Taxon labels: Gamma-proteobacteria, Enterobacteriales, E. coli and Salmonella spp.]
20
Trehalose/maltose catabolism, alpha-proteobacteria
Duplicated LacI-family regulators: lineage-specific post-duplication loss
21
The binding signals are very similar (the blue branch is somewhat different: to avoid cross-recognition?)
22
Utilization of an unknown galactoside, gamma-proteobacteria
Loss of regulator and merger of regulons: it seems that LacI-X was present in the common ancestor (Klebsiella is an outgroup)
Yersinia and Klebsiella: two regulons, GalR (not shown, includes genes galK and galT) and LacI-X
Erwinia: one regulon, GalR
23
Utilization of maltose/maltodextrin, Firmicutes
Two different ABC transporters (shades of red)
PTS (pink)
Glucoside hydrolases (shades of green)
Two regulators (black and grey)
24
Modularity of the functional subsystem
Two different ABC systems
Three hydrolases in one operon (E. faecalis) or separately
25
Changes of regulation
Displacement: invasion of a regulator from a different subfamily (horizontal transfer from a related species?) – blue sites
26
Orthologous TFs with completely different regulons (alpha-proteobacteria and Xanthomonadales)
27
Catabolism of gluconate, proteobacteria
28
Extreme variability of regulation of “marginal” regulon members
[Figure labels: γ, Pseudomonas spp., β]
29
Combined regulatory network for iron homeostasis genes in α-proteobacteria
[Figure: regulatory network; condition labels: [-Fe], [+Fe], FeS status of cell]
The connecting lines denote regulatory interactions, with the thickness reflecting the frequency of the interaction in the analyzed genomes. The suggested negative or positive mode of operation is shown by a dead end or an arrow end of the line, respectively.
30
Fe and Mn regulons: distribution of Irr, Fur/Mur, MntR, RirA, and IscR regulons in α-proteobacteria
[Table: one row per genome, with + / - marks in the columns Irr, MntR, Mur/Fur, RirA, IscR; genomes are split into groups A, B, C, D]
Taxon labels: Rhizobiales, Rhizobiaceae, Bradyrhizobiaceae, Rhodobacterales, Rhodobacteraceae, Hyphomonadaceae, Caulobacterales, Parvularculales, Sphingomonadales, Rhodospirillales, Rickettsiales (Rickettsia and Ehrlichia species), SAR11 cluster
Organisms (abbreviations): Sinorhizobium meliloti (SM), Rhizobium leguminosarum (RL), Rhizobium etli (RHE), Agrobacterium tumefaciens (AGR), Mesorhizobium loti (ML), Mesorhizobium sp. BNC1 (MBNC), Brucella melitensis (BME), Bartonella quintana and spp. (BQ), Bradyrhizobium japonicum (BJ), Rhodopseudomonas palustris (RPA), Nitrobacter hamburgensis (Nham), Nitrobacter winogradskyi (Nwi), Rhodobacter capsulatus (RC), Rhodobacter sphaeroides (Rsph), Silicibacter sp. TM1040 (STM), Silicibacter pomeroyi (SPO), Jannaschia sp. CC51 (Jann), Rhodobacterales bacterium HTCC2654 (RB2654), Roseobacter sp. MED193 (MED193), Roseovarius nubinhibens ISM (ISM), Roseovarius sp. 217 (ROS217), Loktanella vestfoldensis SKA53 (SKA53), Sulfitobacter sp. EE-36 (EE36), Oceanicola batsensis HTCC2597 (OB2597), Oceanicaulis alexandrii HTCC2633 (OA2633), Caulobacter crescentus (CC), Parvularcula bermudensis HTCC2503 (PB2503), Erythrobacter litoralis (ELI), Novosphingobium aromaticivorans (Saro), Sphingopyxis alaskensis RB2256 (Sala), Zymomonas mobilis (ZM), Gluconobacter oxydans (GOX), Rhodospirillum rubrum (Rrub), Magnetospirillum magneticum (Amb), Pelagibacter ubique HTCC1002 (PU1002)
“#?” in the RirA column denotes the absence of the rirA gene in an unfinished genomic sequence and the presence of candidate RirA-binding sites upstream of the iron uptake genes.
31
Phylogenetic tree of the Fur family of transcription factors in α-proteobacteria - I
[Figure: phylogenetic tree with leaves labeled by locus tag and species. Clade labels: Fur in γ- and β-proteobacteria; Fur in ε-proteobacteria; Fur in Firmicutes; Mur/Fur in α-proteobacteria; Irr in α-proteobacteria]
Mur: regulator of manganese uptake genes (sit, mntH)
Fur: regulator of iron uptake and metabolism genes
Reference Fur sequences: Escherichia coli (ECOLI, sp|P0A9A9), Pseudomonas aeruginosa (PSEAE, sp|Q03456), Neisseria meningitidis (NEIMA, sp|P0A0S7), Helicobacter pylori (HELPY, sp|O25671), Bacillus subtilis (BACSU, sp|P54574)
32
Sequence logos for the identified Mur-binding sites in the A, B, and C groups of α-proteobacteria
Sequence logos for the identified Fur-binding sites in the D group of α-proteobacteria
Sequence logos for the known Fur-binding sites in Escherichia coli and Bacillus subtilis
Logo labels: Caulobacter crescentus, Zymomonas mobilis, Gluconobacter oxydans, Erythrobacter litoralis, Novosphingobium aromaticivorans, Rhodospirillum rubrum, Magnetospirillum magneticum, Sphingopyxis alaskensis, Parvularcula bermudensis, Oceanicaulis alexandrii, Escherichia coli, Bacillus subtilis
33
Phylogenetic tree of the Fur family of transcription factors in α-proteobacteria - II
[Figure: phylogenetic tree with leaves labeled by locus tag and species. Clade labels: Fur in γ- and β-proteobacteria; Fur in ε-proteobacteria; Fur in Firmicutes; Mur/Fur in α-proteobacteria; Irr in α-proteobacteria (regulator of iron homeostasis)]
Reference Fur sequences: Escherichia coli (ECOLI, sp|P0A9A9), Pseudomonas aeruginosa (PSEAE, sp|Q03456), Neisseria meningitidis (NEIMA, sp|P0A0S7), Helicobacter pylori (HELPY, sp|O25671), Bacillus subtilis (BACSU, sp|P54574)
34
Sequence logos for the identified Irr-binding sites in α-proteobacteria
The A group (8 species) - Irr
The B group (4 species) - Irr
The C group (12 species) - Irr
35
Phylogenetic tree of the Rrf2 family of transcription factors in α-proteobacteria
Legend: proteins with the conserved C-X(6-9)-C(4-6)-C motif within effector-responsive domain; proteins without a cysteine triad motif
Characterized family members:
– Iron repressor RirA (Rhizobium leguminosarum)
– Nitrite/NO-sensing regulator NsrR (Nitrosomonas europaea, Escherichia coli)
– Cysteine metabolism repressor CymR (Bacillus subtilis)
– Iron-sulfur cluster synthesis repressor IscR (Escherichia coli)
– Cytochrome complex regulator Rrf2 (Desulfovibrio vulgaris)
Positional clustering of rrf2-like genes with: iron uptake and storage genes; Fe-S cluster synthesis operons; genes involved in nitrosative stress protection; sulfate uptake/assimilation genes; thioredoxin reductase; carboxymuconolactone decarboxylase-family genes; hmc cytochrome operon
36
Sequence logos for the identified RirA-binding sites in α-proteobacteria
The A group (8 species) - RirA
The C group (12 species) - RirA
37
Distribution of the conserved members of the Fe- and Mn-responsive regulons and the predicted RirA, Fur/Mur, Irr, and DtxR binding sites in α-proteobacteria
Gene functions: iron uptake, iron storage, FeS synthesis, iron usage, heme biosynthesis, regulatory genes, manganese uptake
38
An attempt to reconstruct the history
39
Regulators and their signals
Subtle changes at close evolutionary distances
Cases of motif conservation at surprisingly large distances
Correlation between contacting nucleotides and amino acid residues
40
DNA signals and protein-DNA interactions
CRP, PurR, IHF, TrpR
Entropy at aligned sites and the number of contacts (heavy atoms in a base pair at a distance < cutoff from a protein atom)
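The "entropy at aligned sites" measure can be illustrated with the per-column Shannon entropy of an ungapped site alignment. This is a generic sketch, not the exact statistic behind the slide (no pseudocounts or background correction), and the four toy sites are invented for the example.

```python
from collections import Counter
from math import log2

def column_entropies(sites):
    """Shannon entropy (in bits) of each column of equal-length aligned sites."""
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all sites must have the same length")
    entropies = []
    for column in zip(*(s.upper() for s in sites)):
        n = len(column)
        counts = Counter(column)
        entropies.append(sum(-(c / n) * log2(c / n) for c in counts.values()))
    return entropies

# Hypothetical toy alignment: position 2 is fully variable, the rest are invariant.
toy_sites = ["ACGT", "AAGT", "ATGT", "AGGT"]
entropies = column_entropies(toy_sites)
```

Low entropy marks conserved positions; the slide compares this quantity with the number of protein contacts per base pair.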
41
Specificity-determining positions in the LacI family
Training set: 459 sequences, average length: 338 amino acids, 85 specificity groups
44 SDPs in total, including:
– 10 residues contact NPF (analog of the effector)
– 6 residues in the intersubunit contacts
– 7 residues contact the operator sequence
– 7 residues in the effector contact zone (5 Å < d_min < 10 Å)
– 5 residues in the intersubunit contact zone (5 Å < d_min < 10 Å)
– 6 residues in the operator contact zone (5 Å < d_min < 10 Å)
Structure shown: LacI from E. coli
42
The LacI family: subtle changes in signals at close distances
[Figure: sequence logos; surviving labels: G, A, n, CGCG, GnGn, GCGC]
43
CRP/FNR family of regulators
44
Correlation between contacting nucleotides and amino acid residues
Regulators: CooA in Desulfovibrio spp.; CRP in Gamma-proteobacteria; HcpR in Desulfovibrio spp.; FNR in Gamma-proteobacteria
Aligned sequences:
DD COOA  ALTTEQLSLHMGATRQTVSTLLNNLVR
DV COOA  ELTMEQLAGLVGTTRQTASTLLNDMIR
EC CRP   KITRQEIGQIVGCSRETVGRILKMLED
YP CRP   KXTRQEIGQIVGCSRETVGRILKMLED
VC CRP   KITRQEIGQIVGCSRETVGRILKMLEE
DD HCPR  DVSKSLLAGVLGTARETLSRALAKLVE
DV HCPR  DVTKGLLAGLLGTARETLSRCLSRMVE
EC FNR   TMTRGDIGNYLGLTVETISRLLGRFQK
YP FNR   TMTRGDIGNYLGLTVETISRLLGRFQK
VC FNR   TMTRGDIGNYLGLTVETISRLLGRFQK
Binding motifs: TGTCGGCnnGCCGACA; TTGTgAnnnnnnTcACAA; TTGTGAnnnnnnTCACAA; TTGATnnnnATCAA
Contacting residues: REnnnR
TG: 1st arginine
GA: glutamate and 2nd arginine
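The REnnnR contact pattern can be read off the aligned sequences programmatically. In this sketch the three contact columns are located by finding the RETVGR segment in the E. coli CRP sequence; that anchoring step, and the names `TRIAD_COLUMNS` and `contact_triad`, are assumptions made for illustration. The sequences are copied from the slide.

```python
# Aligned helix-turn-helix fragments as printed on the slide (all 27 residues long).
ALIGNMENT = {
    "DD COOA": "ALTTEQLSLHMGATRQTVSTLLNNLVR",
    "DV COOA": "ELTMEQLAGLVGTTRQTASTLLNDMIR",
    "EC CRP":  "KITRQEIGQIVGCSRETVGRILKMLED",
    "YP CRP":  "KXTRQEIGQIVGCSRETVGRILKMLED",
    "VC CRP":  "KITRQEIGQIVGCSRETVGRILKMLEE",
    "DD HCPR": "DVSKSLLAGVLGTARETLSRALAKLVE",
    "DV HCPR": "DVTKGLLAGLLGTARETLSRCLSRMVE",
    "EC FNR":  "TMTRGDIGNYLGLTVETISRLLGRFQK",
    "YP FNR":  "TMTRGDIGNYLGLTVETISRLLGRFQK",
    "VC FNR":  "TMTRGDIGNYLGLTVETISRLLGRFQK",
}

# Assumption: the REnnnR instance in E. coli CRP (RETVGR) fixes the three columns.
start = ALIGNMENT["EC CRP"].find("RETVGR")
TRIAD_COLUMNS = (start, start + 1, start + 5)

def contact_triad(sequence):
    """Residues at the three REnnnR columns of an aligned fragment."""
    return "".join(sequence[i] for i in TRIAD_COLUMNS)

triads = {name: contact_triad(seq) for name, seq in ALIGNMENT.items()}
```

On these sequences CRP and HcpR carry the full R-E-R triad, CooA keeps only the first arginine (R-Q-T), and FNR lacks the first arginine (V-E-R).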
45
The correlation holds for other factors in the family
46
Open problems
Model the evolution of regulatory systems (a catalog of elementary events, estimates of probabilities)
– Birth of a binding site; what are the mechanisms?
– Loss of a binding site
– Duplication of a regulated gene and/or a regulator
– Horizontal transfer of a regulated gene and/or a regulator
– Loss of a structural gene and/or a regulator
– General properties?
Distribution of TF family and regulon sizes
Stable cores and flexible margins of functional systems (in terms of gene presence and regulation)
Co-evolution of TFs and DNA sites:
– “Neutral” model for the evolution of binding sites (with invariant functional pressure from the bound protein)
– How do the signals evolve? What is the driving force – changes in TFs?
– TF-family, position-specific protein-DNA recognition code?
All of this needs to take into account the incompleteness and noise in the data
47
Acknowledgements
Andrei A. Mironov (algorithms and software)
Alexandra B. Rakhmaninova (SDPs)
Dmitry Rodionov (now at Burnham Institute) (BioR, NrdR, iron)
Olga Laikova (LacI, sugars)
Dmitry Ravcheev (FruR)
Olga Kalinina (SDPs/LacI)
Leonid Mirny, MIT (protein/DNA contacts, SDPs)
Andy Johnston, University of East Anglia (iron)
Howard Hughes Medical Institute
Russian Fund of Basic Research
Russian Academy of Sciences, program “Molecular and Cellular Biology”
INTAS