RNA – A New Role Amy Anderson
Introduction – How’d they do that?
How does gene expression get turned on and off? 40+ years, still no complete answer…
[Figure: a row of genes being told "On You!" and "Off You!"]
Anabaena Heterocysts – How do the cells differentiate?
So we need something new.
Originally researchers looked for proteins, but nothing was found to account for everything.
[Figure: transcription of …GCCAATGTCAAAGATTTAGTAGAGACTTGCATATCGTTCACTCCGTGAGTAAGTTTTTGTAATTAACGT…]
Where to look? E. coli!
E. coli’s Regulatory Mechanism – small RNA
- A small DNA sequence sits in between two genes: Gene 1 …AATGCGCTA…GAATTATCGCG… Gene 2
1. It is transcribed into RNA: GCUA…GAAUUA
2. The sRNA binds to mRNA containing the targeted gene: …GCUCUUACGUCGUCGAUUGCAGCGUA…
3. Binding blocks translation of the mRNA into protein.
This small RNA is called rhyB. It regulates the expression of proteins associated with the presence of iron.
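The three steps can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: pairing is treated as perfect Watson-Crick complementarity over a short window (real sRNA pairing is imperfect), the function names are made up for this sketch, and the toy sRNA is built from the slide's mRNA fragment rather than taken from any real sequence.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are ignored here).
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def transcribe(dna_coding):
    """Step 1: RNA copy of a coding-strand DNA sequence (T becomes U)."""
    return dna_coding.upper().replace("T", "U")


def antisense_rna(rna):
    """Perfectly complementary RNA, written 5'->3' (reverse complement)."""
    return "".join(RNA_PAIR[base] for base in reversed(rna.upper()))


def find_pairing_sites(srna, mrna, window=8):
    """Step 2: 0-based mRNA positions where some window-long stretch
    of the sRNA could pair perfectly with the mRNA."""
    mrna = mrna.upper()
    sites = set()
    for i in range(len(srna) - window + 1):
        target = antisense_rna(srna[i:i + window])
        pos = mrna.find(target)
        while pos != -1:
            sites.add(pos)
            pos = mrna.find(target, pos + 1)
    return sorted(sites)


if __name__ == "__main__":
    mrna = "GCUCUUACGUCGUCGAUUGCAGCGUA"  # mRNA fragment shown on the slide
    toy_srna = antisense_rna(mrna[6:16])  # made-up sRNA, built to pair
    print(transcribe("AATGCGCTA"))
    # Step 3: any site found here is where translation would be blocked.
    print(find_pairing_sites(toy_srna, mrna))
```

The window length of 8 is an arbitrary choice for the sketch; a shorter window finds more sites, and more of them by chance.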
Regulation of rhyB in E. coli
[Figure: RNA polymerase, Fe2+ ions, and fur at the rhyB promoter site. Sequence shown: …ACTAAAGTTGCCAATGTCAAAGATTTAGTAGAGACTTGCATATCGA…TTCACTCCGTGAGTAAGTTTTTGTAATTAACGT…]
Secondary Structure of rhyB
[Figure: folded rhyB with labeled segments GGAGAAC, UAUUACUU, and UUU, next to the mRNA of the targeted gene and its rhyB binding site: ….GUUAACUAUUTUTTGAUGUGCCAA….]
DNA of multiple targeted genes mRNA of targeted genes with rhyB binding site
Does Anabaena have a similar mechanism?
What I was given:
- A set of motifs (sequences) that frequently showed up upstream of possible heterocyst-related genes
- A list of possible heterocyst-related genes
What I was looking for:
- Sequences that looked like an RNA (lollipops on a bubble)
My Strategy:
- BLAST the intergenic region surrounding each motif.
- Look for partial matches that lie antisense to (on the opposite strand of) a gene, preferably one from my list.
Motifs:
M10 - CACGTTATCTGTTGAGACCGGGTGTAAGGGTTT
M11 - TACACCCTTTTCCAAACCCTTGATCTTTCGTTTTCATGCGTAAGT
M12 - AAAACTCTACCCACAAGGGGATAGAGTTTTGTCAGTGGTCAGTGG
……
[Figure: Gene 1 and Gene 2 on double-stranded DNA; top strand …AATGCGCTA…GAATTATCGCG…, bottom strand …TTACGCGAT…CTTAATAGCGC…]
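The "antisense" part of the strategy comes down to checking both strands. The sketch below is a stand-in for that idea, not the BLAST search itself: it looks only for exact matches, whereas BLAST also reports partial ones. The function names and the example regions are made up for illustration; only the M10 sequence comes from the slide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_on_both_strands(region, motif):
    """Exact occurrences of motif in region.

    Returns (strand, start) pairs: '+' if the motif itself is present,
    '-' if its reverse complement is (the motif lies on the opposite
    strand). start is 0-based on the strand given as region.
    """
    region = region.upper()
    hits = []
    for strand, query in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = region.find(query)
        while start != -1:
            hits.append((strand, start))
            start = region.find(query, start + 1)
    return hits


if __name__ == "__main__":
    m10 = "CACGTTATCTGTTGAGACCGGGTGTAAGGGTTT"  # motif M10 from the slide
    same_strand = "ACGT" + m10 + "AGGT"              # hypothetical region
    opposite = "GG" + reverse_complement(m10) + "CC"  # hypothetical region
    print(find_on_both_strands(same_strand, m10))
    print(find_on_both_strands(opposite, m10))
```

A hit reported on the "-" strand inside a gene's coordinates is the kind of antisense match the strategy is after.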
Intergenic regions
Motif 10 was found between these gene pairs (each region drawn as ACGT…M10…AGGT):
Label  Flanking genes      Label  Flanking genes
D5     asl1111  alr1112    P13    alr2744  alr2745
D7     asl1274  asr1275    P16    all2909  all2908
D15    asl2850  alr2851    P19    all3558  all3557
D17    all3239  alr3240    P20    alr3693  asr3694
D24    all4668  asr4669    P21    all4219  all4218
P6     all1124  all1123    P22    alr4468  alr4469
P10    all1683  all1682    C26    alr4788  all4789
P11    all1782  all1781
BLAST hit
Most intergenic regions containing motif 10 hit this spot.
[Figure: genes alr0787 and all0788, with the hit marked "12-15 nucleotide overlap"]
Took an alignment of the portions that hit all0788.

Seq 3       0  ---------- ---------- ---------- ---------- ----------
Seq 6       1  ---------- ---------- ---------- ---------- ---GGGTGTA
Seq 10      0  ---------- ---------- ---------- ---------- ----------
Seq 7       0  ---------- ---------- ---------- ---------- ----------
Seq 4       0  ---------- ---------- ---------- ---------- ----------
Seq 5       0  ---------- ---------- ---------- ---------- ----------
Seq 1       1  ---------- ---------- ---------- ---------- ---GGGTGTA
Seq 8       0  ---------- ---------- ---------- ---------- ----------
Seq 9       1  CAGGTTATCT GTTGAGACCG GGTGTAAGGG TTTAAGGGTA CAGGGGTGTA
Seq 11      1  -AGGTTATCT GTTGAGACCG GGTGTAAGGG TTTAAGGGTA CAGGGGTGTA
Seq 2       1  ---------- ---------- ---------- ---------- ---GGGTGTA
Seq 12      1  ---------- ---------- ---------- ---------- ---GGGTGTA
consensus   1

Seq 3       1  ---------- ---------- ------ACCC CTAC------ -ACCCTTCTC
Seq 6       8  AGGGTTTCAA AAATTTATAC CCCTATACCC CTAC------ -ACCCTTGTC
Seq 10      1  ---------- ---------- ----ATACCC CTAC------ -AACCTTTTC
Seq 7       1  ---------- ---------- ----ATACCC CTAC------ -ACCCTTGTC
Seq 4       1  ---------- ---------- -------CCC CTAT------ -ACCCTTGTC
Seq 5       1  ---------- ---------- ---------- ---------- -ACCCTTGTC
Seq 1       8  AGGGTTTCAA GCATTTATAC CCTTATACCC CTAC------ -ACCCTTGTC
Seq 8       1  ---------- ---------- ----ATACCC CCAT------ -ACCCTTGTT
Seq 9      51  AGGGTTTCAA GCATTTATAC TCCTACATCC CTAC------ -ATCCTTGTC
Seq 11     50  AGGGTTTCAA GCATTTATAC CCTTATACCC CTAC------ -ACCCTTGTC
Seq 2       8  AGGGTTTCAA GCATTTATAC CCTTGTATCC CTATCCCATA CACCCTTATC
Seq 12      8  AGGGTTTCAA GCATTTATAC CCTCATACCC CTAT------ -ACCCTTGTC
consensus  51                                               * **** *

Seq 3      18  CAAACCCTTG ATTTCTC-GT TTTCATGCGT AAGTCCTA-
Seq 6      51  TAAACCCTTG ATTTTTC-GT TTTCATGCGT AAGTCCTA-
Seq 10     20  CAAAACCTTG ATCTTTC-GT TTTCATGTGT AAGTCCTA-
Seq 7      20  CAAACCCTCG ATCTTTC-GT TTTCATGCGT AAGTCCTA-
Seq 4      17  CAAACCCTTG ATGTTTT-AT TTTTCTGCGT AAGTCCTAT
Seq 5      10  CAAACCCTTG ATCTTTC-GT TTTCATGCGT AAGTCCTA-
Seq 1      51  CAAACCCTTG ATCTTTCCGT TTTCATGCGT AAGTCCTA-
Seq 8      20  CAAACCCTTG ATCTTTC-GG TTTCATGCGT AAGTC----
Seq 9      94  CAAACCCTTG ATCTTTC-GT TTTCATGCCT AAGTCCTA-
Seq 11     93  CAAACCCTCG ATCTTTC-GT TTTCATGCGT AAGT-----
Seq 2      58  CAAAGCCTTG ATATTTC-TT TTTCATGGGT AAGTC----
Seq 12     51  CAAACCCTTT ATCTTTC-GT TTTCATGCGT AAGTC----
consensus 101   *** ***   ** * *     ***  **  * ****
Alignment of sub-sequences

Seq 3       1  GGGTGTAAGG GTTTCAAGCA TTTATACCCT TGTATCCCTA TCCCATACAC
Seq 7       1  GGGTGTAAGG GTTTCAAGCA TTTATACCCT CATACCCCTA T-------AC
Seq 6       1  GGGTGTAAGG GTTTCAAGCA TTTATACCCT TATACCCCTA C-------AC
Seq 1       1  GGGTGTAAGG GTTTCAAGCA TTTATACCCT TATACCCCTA C-------AC
Seq 4       1  GGGTGTAAGG GTTTCAAAAA TTTATACCCC TATACCCCTA C-------AC
Seq 2       1  GGGTGTAAGG GTTTCAAACA TTTATACCCC TAAACCCCTA T-------AT
Seq 5       1  GGGTGTAAGG GTTTCAAGCA TTTATACTCC TACATCCCTA C-------AT
consensus   1  ********** *******  * ******* *     * *****         *

Seq 3      51  CCTTATCCAA AGCCTTGATA TTTCT-TTTT CATGGGTAAG TC----
Seq 7      44  CCTTGTCCAA ACCCTTTATC TTTCG-TTTT CATGCGTAAG TC----
Seq 6      44  CCTTGTCCAA ACCCTCGATC TTTCG-TTTT CATGCGTAAG T-----
Seq 1      44  CCTTGTCCAA ACCCTTGATC TTTCCGTTTT CATGCGTAAG TCCTA-
Seq 4      44  CCTTGTCTAA ACCCTTGATT TTTCG-TTTT CATGCGTAAG TCCTA-
Seq 2      44  CCTTCTCCAA ACCCTTAATC TTTAG-TTTT CATACGTAAG TCCTAT
Seq 5      44  CCTTGTCCAA ACCCTTGATC TTTCG-TTTT CATGCCTAAG TCCTA-
consensus  51  **** ** ** * ***  **  ***   **** ***   **** *
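The consensus line marks the columns where every aligned sequence has the same base. A minimal sketch of how such a line can be computed is below. It assumes the convention that a column with any gap is never marked; the function name is made up, and the alignment program actually used for these slides may apply different rules.

```python
def conservation_line(rows):
    """Return a string with '*' under each column in which all rows
    carry the same non-gap character, and ' ' elsewhere."""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all alignment rows must have the same length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        conserved = first != "-" and all(base == first for base in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


if __name__ == "__main__":
    # Fourth 10-column block of the second half of the sub-sequence
    # alignment (Seq 3, 7, 6, 1, 4, 2, 5 in slide order).
    block = [
        "CATGGGTAAG",
        "CATGCGTAAG",
        "CATGCGTAAG",
        "CATGCGTAAG",
        "CATGCGTAAG",
        "CATACGTAAG",
        "CATGCCTAAG",
    ]
    print(repr(conservation_line(block)))
```

Running it on one block at a time keeps the rows equal in length, which the function requires.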
Secondary structures of aligned sequences
BLASTed this bubble sequence
Some BLAST Results
About 80% landed in between parallel genes.
[Figure: three gene neighborhoods, alr2966 alr2967 alr2968; all4019 all4020 all4021 all4022 all4023; alr4810 alr4811 alr4812; one gene labeled patN]
Summary and Conclusions
- More work needs to be done.
- It is possible that Anabaena employs small RNA as a regulatory mechanism (I found nothing that suggested otherwise).
- When looking for any kind of pattern or sequence, you must always consider the probability that it would happen by chance.
- The data I showed were just a fraction of everything I had gathered.
- I didn’t even scratch the surface of this new topic, but I learned a lot trying.
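The point about chance can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a random sequence in which all four bases are equally likely, which real genomes are not, and the genome length is a round hypothetical number chosen for illustration, not a measured value for Anabaena. The function names are also made up.

```python
import math


def expected_chance_hits(match_len, search_space, both_strands=True, base_freq=0.25):
    """Expected number of exact matches to one specific sequence of
    length match_len in a random sequence of length search_space."""
    positions = max(search_space - match_len + 1, 0)
    strands = 2 if both_strands else 1
    return positions * strands * base_freq ** match_len


def prob_at_least_one(expected):
    """Chance of one or more hits, using a Poisson approximation."""
    return 1.0 - math.exp(-expected)


if __name__ == "__main__":
    HYPOTHETICAL_GENOME_LENGTH = 7_000_000  # assumption, for illustration
    for k in (12, 15):
        e = expected_chance_hits(k, HYPOTHETICAL_GENOME_LENGTH)
        print(f"{k}-nt match: expected {e:.3f} chance hits, "
              f"P(at least one) = {prob_at_least_one(e):.3f}")
```

Under these assumptions each extra matching nucleotide makes a chance hit four times less likely, so a few bases of overlap matter a great deal when judging whether a match means anything.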
Thanks to: Jeff Elhai and Peter Wolk, for being so patient with me :^)