CRISPR Direct Repeat Sequences Olivia Ho-Shing 22 November 2009
QUESTIONS:
Do different halophile species share similar direct repeat sequences?
Can direct repeats indicate phylogeny?
Is there any structure to the direct repeats for some potential function?
Direct repeats (DR):
21 – 37 bp in length
Surround spacers that may contain viral sequences
Not palindromic, some dyad symmetry
----GACTAC----CTG----GTAGTC---- degenerate DR
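The dyad symmetry in the example repeat can be checked mechanically: the right arm (GTAGTC) should be the reverse complement of the left arm (GACTAC). The following is a minimal sketch, not the method used in the slides. The helper names `reverse_complement` and `find_dyad_arms` are hypothetical, and `toy_repeat` is an assumed sequence built only from the bases shown in the example, with the unspecified `----` positions left out.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_dyad_arms(seq: str, arm_len: int) -> list[tuple[int, int, str]]:
    """Find (left_start, right_start, left_arm) triples where the right arm
    is the reverse complement of the left arm and starts after it ends."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - arm_len + 1):
        arm = seq[i:i + arm_len]
        target = reverse_complement(arm)
        # Search only downstream of the left arm so the arms do not overlap.
        j = seq.find(target, i + arm_len)
        while j != -1:
            hits.append((i, j, arm))
            j = seq.find(target, j + 1)
    return hits


# Assumed toy sequence: only the bases shown in the slide example.
toy_repeat = "GACTAC" + "CTG" + "GTAGTC"
arms = find_dyad_arms(toy_repeat, arm_len=6)
for left_start, right_start, arm in arms:
    print(left_start, right_start, arm, reverse_complement(arm))
```

A true palindrome would have no spacer between the arms; a central spacer such as CTG is what makes the pattern a dyad symmetry rather than a palindrome.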
CRISPR Finder H. mukohataei CRISPR 1_12 H. californiae H. denitrificans H. mediterranei H. mucosum H. sinaiiensis H. sulfurifontis H. volcanii H. vallismortis*
Do different halophile species share similar direct repeat sequences?
H. mukohataei direct repeats match hits in: Haloarcula marismortui, Halorhabdus utahensis, Natronomonas pharaonis
Do different halophile species share similar direct repeat sequences?
CLUSTAL multiple sequence alignment
Sulfurifontis_17_   GCTTCAATCCCACAAGGGTTCGTCTGAAAC
Denitrificans_10_x  GCTTCAATCCCACAAGGGTTCGTCTGAAAC
Mukohataei_2_       GCTTCAATCCCACAAGGGTCCGTCTGAAAC
Sinaiiensis_116_    GCTTCAATCCCACATGGGTTCGTCTGAAAC
Californiae_86_     GCTTCAACCCCACGAGGGTCCGTCTGTAAC
Utahensis_1_        GCTTCAACCCCACGAGGGTCCGTCTGAAAC
Marismortui_1_      GCTTCAACCCCACAAGGGTCCGTCTGAAAC
Californiae_86_     GCTTCAACCGCCCAAGGGTCCGTCTGAAAC
Mediterranei_5_x    GCTTCAACCCAACTAGGGTTCGTCTGTAAC
Mucosum_17_         GCTTCAACCCAACTAGGGTTCGTCTGTAACC
Mediterranei_13_x   GCTTCAACCCAACTAGGGTTCGTCTGTAAC
Mucosum_10_x        GCTTCAACCCAACTAGGGTTCGTCTGTAAC
Mukohataei_1_       GTTTCAGACGGACCCTTGTGGGATTGAAGC
Californiae_65_     GTTTCAGACGGACCCTTGGGCGGTTGAAGC
Mucosum_4_x         GTTACAGACGAACCCTAGTTGGGTTGAAGC
Mucosum_11_         GTTACAGACGAACCCTAGTTGGGTTGAAGC
Californiae_108_    GTTACAGACGGACCCTCGTGGGGTTGAAGC
Denitrificans_5_x   GTTTCAGACGAACCCTTGTGGGGTTGAAGC
Npharaonis_1_       GTTTCAGACGAACCCTTGTGGGGTTGAAGC
Volcanii_16_x       GTTTCAGACGAACCCTTGTGGGGTTGAAGC
Sulfurifontis_12_   AGTTTCAGACGAACCCTTGTGGGATTGAAGC
Sinaiiensis_116_    GTTTCAGACGAACCCTTGTGGGATTGAAGC
Volcanii_72_        GGTTTCAGACGAACCCTTGTGGGTTTGAAGC
Sulfurifontis_21_   GTTTCAGACGAACCCTTGTTGGGTTGAAGT
Sulfurifontis_26_   GTTTCAATC---CCGTTCTGGGTTTCTACCGCATCGCGAC 37
Sinaiiensis_46_     AACCAGAGCGAACAGGGACCACC
Sulfurifontis_19_1  GTCGCGATGCGGTAGAAAC---CCAGAACGGGATTGAAAC
Sulfurifontis_26_2  GTCGCAGGGCAATAGAAAC---CCAGAACGGGATTGAAAC
Npharaonis_1_4      GTCGAGACGGACTGAAAAC---CCAGAACGGGATTGAAAC
Sulfurifontis_13_   CCGACACCGACGGCGACGGTCTCGACGACGG
Californiae_37_     CTTGTCCTTGACCTCGGTCGTCTTGTCTTT
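One thing worth checking in the alignment above is strand orientation: the block of repeats beginning GCTTCA and the block beginning GTTTCAGACG look like reverse complements of one another, which would mean the same repeat family was reported on opposite strands. This is an observation on the listed sequences, not a conclusion stated in the slides. The sketch below tests it for four sequences copied from the alignment; the helper names and the dictionary names `block_gcttca` and `block_gtttca` are hypothetical labels introduced here.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Sequences copied verbatim from the alignment (two from each block).
block_gcttca = {
    "Mukohataei_2_": "GCTTCAATCCCACAAGGGTCCGTCTGAAAC",
    "Sulfurifontis_17_": "GCTTCAATCCCACAAGGGTTCGTCTGAAAC",
}
block_gtttca = {
    "Mukohataei_1_": "GTTTCAGACGGACCCTTGTGGGATTGAAGC",
    "Sinaiiensis_116_": "GTTTCAGACGAACCCTTGTGGGATTGAAGC",
}


def reverse_complement_pairs(block_a: dict, block_b: dict) -> list:
    """List (name_a, name_b) pairs where sequence b, read on the opposite
    strand, is identical to sequence a."""
    return sorted(
        (name_a, name_b)
        for name_a, seq_a in block_a.items()
        for name_b, seq_b in block_b.items()
        if reverse_complement(seq_b) == seq_a
    )


pairs = reverse_complement_pairs(block_gcttca, block_gtttca)
for name_a, name_b in pairs:
    print(f"{name_a} is the reverse complement of {name_b}")
```

If this holds across the whole table, putting every repeat in a single orientation before aligning would avoid splitting one repeat family into two apparent clusters.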
Do different halophile species share similar direct repeat sequences?
ClustalW alignment grouped by species (figure). Labels: H. sinaiiensis, H. sinaiiensis, H. sinaiiensis 46-1, N. pharaonis 1-1, N. pharaonis 1-4, H. californiae, H. californiae, H. californiae, H. californiae, H. californiae 37-1, H. mukohataei 1-12, H. mukohataei 2-1, H. mediterranei 13(x2), H. mediterranei 5(x3), H. mucosum 10(x4), H. mucosum 17-1, H. mucosum 11-1, H. mucosum 4(x4), H. denitrificans 10(x2), H. denitrificans 5(x3), H. volcanii 16(x2), H. volcanii, H. sulfurifontis 17-1, H. sulfurifontis 19-1, H. sulfurifontis 26-2, H. sulfurifontis 13-1, H. sulfurifontis 26-1, H. sulfurifontis 12-2, H. sulfurifontis 21-1
H. californiae H. denitrificans H. marismortui H. mediterranei H. mucosum H. mukohataei H. sinaiiensis H. sulfurifontis H. utahensis H. volcanii N. pharaonis H. vallismortis
Can direct repeats indicate phylogeny?
Phylogram based on direct repeat sequences (figure). Species: H. californiae, H. denitrificans, H. marismortui, H. mediterranei, H. mucosum, H. mukohataei, H. sinaiiensis, H. sulfurifontis, H. utahensis, H. volcanii, N. pharaonis, H. vallismortis
Can direct repeats indicate phylogeny?
Phylogram based on 16S rRNA sequences (figure). Species: H. californiae, H. denitrificans, H. marismortui, H. mediterranei, H. mucosum, H. mukohataei, H. sinaiiensis, H. sulfurifontis, H. utahensis, H. volcanii, N. pharaonis, H. vallismortis
Is there any structure to the direct repeats for some potential function? Characterizing a halophile consensus direct repeat sequence G T T T C A A A C G A A C C [AC] [GT] G G T G G G T T T G A A [AG] C
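The bracketed positions in the consensus ([AC], [GT], [AG]) are degenerate, so the consensus can be read directly as a regular expression. The sketch below is one assumed way to test repeats against it and to count how many positions disagree; the function names `matches_consensus` and `count_mismatches` are hypothetical, and the two test sequences are copied from the alignments elsewhere in the slides.

```python
import re

# Consensus exactly as written on the slide, one token per position.
CONSENSUS = "G T T T C A A A C G A A C C [AC] [GT] G G T G G G T T T G A A [AG] C"

# Removing the spaces yields a valid regular expression.
CONSENSUS_PATTERN = re.compile(CONSENSUS.replace(" ", ""))

# Allowed bases at each position, e.g. {"G"} or {"A", "C"}.
ALLOWED = [set(token.strip("[]")) for token in CONSENSUS.split()]


def matches_consensus(seq: str) -> bool:
    """True if the whole sequence fits the degenerate consensus."""
    return CONSENSUS_PATTERN.fullmatch(seq.upper()) is not None


def count_mismatches(seq: str) -> int:
    """Count positions that fall outside the consensus.
    Assumes an ungapped sequence of the same length as the consensus."""
    if len(seq) != len(ALLOWED):
        raise ValueError("sequence length differs from consensus length")
    return sum(base not in allowed for base, allowed in zip(seq.upper(), ALLOWED))


consensus_halophile = "GTTTCAAACGAACCCGGGTGGGTTTGAAAC"  # Consensus_Halophile row
volcanii_16 = "GTTTCAGACGAACCCTTGTGGGGTTGAAGC"          # Volcanii_16_x row

print(matches_consensus(consensus_halophile), count_mismatches(consensus_halophile))
print(matches_consensus(volcanii_16), count_mismatches(volcanii_16))
```

Counting mismatches rather than requiring a full match is the more useful check here, since most individual repeats differ from the consensus at a few positions.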
Is there any structure to the direct repeats for some potential function?
Comparing consensus halophile sequence to other species
CLUSTAL multiple sequence alignment
Consensus_Halophile -GTTTCAAACGAACCCGGGTGGGTTTGAAAC
Hmarismortui_2      -GCTTCAACCCCACAAGGGTCCGTCTGAAAC
NostocPCC_6         -GTTTCCATCCCCGTGAGGGGTA--AAGGAATTAAAAC- 35
NostocPCC_12        -GTTTCCATCCCCGTGAGGGGTA--AGAGATTAAAAAC- 35
NostocPCC_14        -GTTTCAATCCCTGATAGGGATTTTTGTTAGTTAAAAC- 37
NostocPCC_15        -GTTTCAATCCCTGATAGGGATTTTTGTTAGTTAAAAC- 37
Rxylan_2            -GTTTCAATCCCTTATAGGTAGGCTCAAAAC
Ecoli_4             -CGGTTTATCCCCGCTGGCGCGGGGAACTC
Ecoli_5             --GGTTTATCCCCGCTGGCGCGGGGAACAC
Hmarismortui_C1_2   ---GGCGGTCCCTGTTCGCTCTGGTT
NostocPCC_2         GTTACTTACCATCACTTCCCCGCAAGG-GGATGGAAAC- 37
NostocPCC_18        -CTTTCAACCCTCCCATTACTGGAAGGAGGGTTGCAACG 38
NostocPCC_7         -GTTTTAATTCCTTTACCCCT-CACGG-GGATGGAAAC- 35
NostocPCC_8         GTTTCTATTAACACA-AATCCCTATCAGGGATTGAAAC- 37
NostocPCC_17        -GTTGCAACACCATATAATCCCTATTAGGGATTGAAAC- 37
Rxylan_3            --TACCAGGCGTGGATCTTGCCCTCGGACAC
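To put a number on how close two rows of this alignment are, matching columns can be counted directly. The sketch below does this for the Consensus_Halophile and Hmarismortui_2 rows, copied as they appear in the alignment. The function name `column_identity` is hypothetical, and the approach assumes both rows have the same length and share the same column coordinates, which may not hold for rows whose trailing gaps were trimmed.

```python
def column_identity(row_a: str, row_b: str, gap: str = "-") -> tuple[int, int]:
    """Return (matching_columns, compared_columns) for two aligned rows.
    Columns where either row has a gap are skipped."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = [
        (a, b) for a, b in zip(row_a.upper(), row_b.upper())
        if a != gap and b != gap
    ]
    matches = sum(a == b for a, b in compared)
    return matches, len(compared)


# Rows copied from the alignment, including the leading gap column.
consensus_row = "-GTTTCAAACGAACCCGGGTGGGTTTGAAAC"
hmarismortui_row = "-GCTTCAACCCCACAAGGGTCCGTCTGAAAC"

matches, compared = column_identity(consensus_row, hmarismortui_row)
print(f"{matches}/{compared} columns identical ({100 * matches / compared:.0f}%)")
```

Percent identity over aligned columns is a rough measure only; it does not replace the E-values reported in the conclusions.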
Is there any structure to the direct repeats for some potential function?
Conclusions
Halophile direct repeats are similar to each other (significant E-values)
– swapping, functionality
Interesting triplet motifs in dyad symmetry of direct repeats
– Binding site for CRISPR-associated proteins?
– Folding site for siRNA?
Direct repeats may be more indicative of phylogeny in a larger, more widespread group of species