Analysis: Tools for directly examining sequence


1 Scenario 6. Analysis: Tools for directly examining sequence
What follows is a simulation of the proposed sequence interface. A PC-based prototype exists, but the interface has not yet been ported to the web. As you go through the simulation, please consider what capabilities you would want to serve your research and annotation interests. A narrative to help you go through the simulation appears in a red-bordered box, such as the one below.
To begin:
1. Click on Slide Show (on the upper toolbar)
2. Click View Show
3. Click the Continue button

2 You’re intrigued by the motif you found in front of Anabaena PCC 7120 all4312 and its cyanobacterial orthologs (see Scenarios 1 and 5). You’d like to look more deeply into it by examining the sequence near the orf. You’re not sure what you’re looking for, and you’re open to anything.

3 Anabaena PCC 7120: all4312
Anab7120:all4312 NostPunc:618.077 TricEryt:5.6053 Syny6803:sll1330 TherElon:tlr1330
Options Annotate Main Menu History
Replicon: Chromosome
Coordinates: 5166997 (stop) <- 5167767 (start-GTG) System
Length = 256 amino acids
Strand: Complementary
Function: Two-component response regulator System
Syny6803:sll1330: Expression data (click to expand) Experiment
Mutant: None
Syny6803:sll1330: Failed to segregate Experiment
Cyanobacterial orthologs: NostPunc TricEryt Syny6803 TherElon
A Lawrence/Collier conserved motif set
Scenario 1 left us with the provocative finding that all five cyanobacterial orthologs of all4312 are preceded by the same motif. What is that motif and what might it mean? To answer that question, click on the coordinates of all4312 to get to the sequence interface.
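The coordinates and the length shown on the annotation page can be checked against each other. The sketch below assumes that the two coordinates are inclusive and that the stop codon lies inside the span; the function name is made up for illustration and is not part of the interface.

```python
def orf_length(stop_coord: int, start_coord: int) -> tuple[int, int]:
    """Return (span in bp, protein length in amino acids) for an orf.

    Assumes both coordinates are inclusive and that the span includes
    the stop codon, which is not translated.
    """
    span_bp = abs(start_coord - stop_coord) + 1
    codons = span_bp // 3
    return span_bp, codons - 1


# Coordinates shown for all4312 on the annotation page.
span_bp, amino_acids = orf_length(5166997, 5167767)
print(span_bp, amino_acids)
```

Under these assumptions the span comes to 771 bp, or 257 codons including the stop, which agrees with the 256 amino acids listed on the page.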

4 Anabaena Chromosome (6413771 bp): 5166951-5167950
Contig GoTo Block Find Display PgUp/PgDn Help Quit
all4312 two-component system 5166997 <- 5167767
        .........|.........|.........|.........|.........|
5166951 GCTGAGTTAGGAGTAAAAATCATTATTTTTCCTCCCTCTGCCTCCTCTAC
5167001 ACCCTCTGCCCACTTAGAGTTGAGCGTTGGTTGCTAAATCTTTCTTTTGT
5167051 TAACTTTGCTTGTGTTTGTGGAGGATTAGCATTCAAAATTTCCATGTTAA
5167101 ATCGGTATCCAACATTGCGGATAGTCTGAATGAGGCTAGGTTGGCGGGGA
5167151 TCAAGTTCTACTTTTTTACGTAACGATAGAACATGAGTGTCAATGGTACG
5167201 CGGATTGTCGATAGCGTCAGGCCACGCACGACGTAGCAACTCTGATCGGC
5167251 TCAAAGGTACTCCACCAGCTTGCGCCAAAACGTACAACAAACTAAATTCC
5167301 TGTGGAGTCAGGTCGATAAACTCCCCTTGGAATCGTACACGGCGTTGGAC
5167351 TAAATCGATTTGCAAAGTACCATAATCCAAATAAGCAGGAGCAGTAGGTG
5167401 TGCGCTTGCGGCGGATTAATGCCTCTACCCTAGCCAAAAACTCCTGCATC
5167451 CCAAATGGTTTGCTCAAGTAATCATCAGCTCCCGCCTTCAACCCGGCAAC
5167501 GATATCAGCCTCATTAGTCCGAGCAGATAACATGAGAATTAGCGGCTGTT
5167551 GCTGACGATGCAGCCAACGGCAAAATTCAATACCGTCACCATCTGGCAAA
5167601 TCAGCATCCAGAATCACTAGAGTTGGCTGATGGCTCAAAAAGGCTTCCCT
5167651 TGCTTGATATATGCTGGCGGCTTGATGCACACGGTATTCCAATTGTTGCA
5167701 AGTGCCAACCCAGCAACGACCTCAGATGGGGATTCCCCTCAACGATTTCA
5167751 ATACAAACCGAACCCACGGCGGCAAGACCCCTTAGCGTCGATGAATTTCA
5167801 AAGTAACAAGCCCATGCCCAAGGTTTTGTAGTCTTTGTTACTCTAGCCAC
5167851 AATTACCTCTACGTTTTTTTATAACCTATGTTGGCAAAATGTTGAGTTTA
5167901 TGCCTTGTCCAGAGCGAAGTTGATAATATTATTAAATTTTTGTTATAGTT
The interface places you in the Anabaena chromosome in the region surrounding all4312, with the orf highlighted as a block. Clicking on all4312 would get us back to the annotation page. Our goal was to look at the motif preceding the orf, so click on Display.

5 [Sequence display as on slide 4. Display menu open: Alternate starts, Annotated features, Predicted features, Private features, Tandem repeats, Inverted repeats, Base symbols, Invert display. Highlighted: Predicted features.]
We want to display the motif predicted by Lawrence/Collier, so click on Predicted features.

6 [Sequence display as on slide 4.]
I was hoping to see sequences I recognized, but that’s made more difficult by the orf being on the wrong strand. I could invert the entire display, but instead I’ll just work on a segment. Click Block.

7 [Sequence display as on slide 4. Block menu open: Define, Invert, Translate, Save, Tools. Highlighted: Define.]
The highlighted orf sequence could now be downloaded, or first translated and then downloaded, but for now I’m interested only in the region preceding the gene. Click Define to highlight a new block of sequence.

8 [Sequence display as on slide 4.]
Define the beginning of the block by clicking on base 5167751 (4th line up). Then click on the last base on the page (lower right corner).
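Finding a base such as 5167751 on the page is simple arithmetic once the layout is known. The sketch below assumes a page of 20 lines of 50 bases whose first base is 5166951 (the start of the displayed range); the function and constant names are hypothetical and are not taken from the prototype.

```python
PAGE_START = 5166951   # assumed coordinate of the first base on the page
LINE_WIDTH = 50        # bases per displayed line
LINES_PER_PAGE = 20    # lines per displayed page


def locate(coord: int) -> tuple[int, int]:
    """Return (line, column), both 1-based, of a coordinate on the page."""
    offset = coord - PAGE_START
    if not 0 <= offset < LINE_WIDTH * LINES_PER_PAGE:
        raise ValueError("coordinate is not on this page")
    return offset // LINE_WIDTH + 1, offset % LINE_WIDTH + 1


line, column = locate(5167751)
lines_from_bottom = LINES_PER_PAGE - line + 1
print(line, column, lines_from_bottom)
```

With this layout, base 5167751 is the first base of line 17, the fourth line from the bottom, and the annotated start of all4312 (5167767) falls on the same line at column 17.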

9 [Sequence display as on slide 4, with the bottom four lines blocked.]
Now that the bottom four lines are blocked, click on Block and then Invert.

10 [Sequence display as on slide 4, with the bottom four lines blocked. Block menu open: Define, Invert, Translate, Save, Tools. Highlighted: Invert.]
Now that the bottom four lines are blocked, click on Block and then Invert.
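Inverting a block amounts to taking the reverse complement of the blocked bases. A minimal sketch, assuming plain A/C/G/T input and not taken from the prototype's code, applied to the last line of the page (5167901-5167950):

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Last displayed line of the page, as shown on slide 4.
last_line = "TGCCTTGTCCAGAGCGAAGTTGATAATATTATTAAATTTTTGTTATAGTT"
print(reverse_complement(last_line))
```

The result is the first line of the inverted block shown on the next slide, since the last base of the block becomes the first base of the inverted display.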

11 Anabaena Chromosome (6413771 bp): 5167950-5167751 (inverted)
Contig GoTo Block Find Display PgUp/PgDn Help Quit
all4312 two-component system 5167767 -> 5166997
        .........|.........|.........|.........|.........|
5167950 AACTATAACAAAAATTTAATAATATTATCAACTTCGCTCTGGACAAGGCA
5167900 TAAACTCAACATTTTGCCAACATAGGTTATAAAAAAACGTAGAGGTAATT
5167850 GTGGCTAGAGTAACAAAGACTACAAAACCTTGGGCATGGGCTTGTTACTT
5167800 TGAAATTCATCGACGCTAAGGGGTCTTGCCGCCGTGGGTTCGGTTTGTAT
That’s more like it. Now a person attuned to such things can recognize the elements of a binding site for the transcriptional regulator NtcA, followed by the -10 region of a promoter, properly spaced. The gene comes shortly after that, now in the direct (blue) orientation. To get back to the full sequence, click on Block and then unInvert.

12 [Inverted display as on slide 11. Block menu open: Invert, Translate, Save, Tools. Highlighted: unInvert.]
That’s more like it. Now a person attuned to such things can recognize the elements of a binding site for the transcriptional regulator NtcA, followed by the -10 region of a promoter, properly spaced. The gene comes shortly after that, now in the direct (blue) orientation. To get back to the full sequence, click on Block and then unInvert.

13 [Sequence display as on slide 4.]
Had we suspected such a site, we could have found it by a direct search for its consensus sequence (though there are better ways than this): click on Find, then Sequence, and type in the NtcA/promoter consensus sequence.

14 [Sequence display as on slide 4. Find menu open: Gene name, Description, Sequence.]
Had we suspected such a site, we could have found it by a direct search for its consensus sequence (though there are better ways than this): click on Find, then Sequence, and type in the NtcA/promoter consensus sequence.

15 [Sequence display as on slide 4. Find menu open: Gene name, Description, Sequence.]
Search string: GTA.{8}TAC.{20,24}TA...T
The NtcA binding sequence is flexible, like most sequences of biological interest. Search tools need to be similarly flexible. This search string says: look for “GTA”, followed by 8 nucleotides of any sort, followed by “TAC”, followed by 20 to 24 nucleotides, followed by “TA”, three nucleotides, then a final “T”. Press Enter to find a matching sequence.
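The search string has the form of an ordinary regular expression, so its behaviour can be tried outside the interface. The sketch below runs it with Python's re module over the inverted upstream block shown on slide 11. It searches that one orientation only, and the coordinate conversion assumes the inverted display starts at 5167950 and counts downward; how the interface itself handles strands is not stated in the slides.

```python
import re

# Inverted block as displayed on slide 11 (four lines of 50 bases).
INVERTED_BLOCK = (
    "AACTATAACAAAAATTTAATAATATTATCAACTTCGCTCTGGACAAGGCA"
    "TAAACTCAACATTTTGCCAACATAGGTTATAAAAAAACGTAGAGGTAATT"
    "GTGGCTAGAGTAACAAAGACTACAAAACCTTGGGCATGGGCTTGTTACTT"
    "TGAAATTCATCGACGCTAAGGGGTCTTGCCGCCGTGGGTTCGGTTTGTAT"
)
BLOCK_START = 5167950  # assumed coordinate of the first displayed base

# The search string from the slide, used unchanged as a regular expression.
NTCA_PATTERN = re.compile(r"GTA.{8}TAC.{20,24}TA...T")

match = NTCA_PATTERN.search(INVERTED_BLOCK)
if match:
    # Coordinates decrease along the inverted display.
    first_coord = BLOCK_START - match.start()
    last_coord = BLOCK_START - (match.end() - 1)
    print(match.group(0), first_coord, last_coord)
else:
    print("no match")
```

In this block the pattern matches once, beginning at the GTA on the third displayed line. A search over the uninverted display would need the reverse complement of the pattern, or of the sequence, to find the same site.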

16 [Sequence display as on slide 4.]
It is sometimes easier to see patterns in DNA sequences if we can engage our visual recognition abilities. Click Display and then Base Symbols to try it out for yourself.

17 [Sequence display as on slide 4. Display menu open: Alternate starts, Annotated features, Local features, Tandem repeats, Inverted repeats, Base symbols.]
It is sometimes easier to see patterns in DNA sequences if we can engage our visual recognition abilities. Click Display and then Base Symbols to try it out for yourself.

18 [Sequence display as on slide 4, in base-symbol mode: bases outside the orf are drawn as symbols, while the all4312 orf (5166997 <- 5167767) remains as letters.]
Purines are represented as open symbols and pyrimidines as filled-in symbols. A and T are purple, G and C are green. Fortunately, you don’t have to remember any of this to recognize patterns. Look at the top line. It’s immediately evident (as it probably was not before) that all4312 is followed by a string of... pyrimidines and then a string of purines. Possibly a termination region? Let’s look beyond. Press the right arrow key to move the display one line down.
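The symbol scheme described above is easy to reproduce in text. This sketch maps purines to an open square and pyrimidines to a filled square and reports the colour separately, since plain text cannot carry it; the names are illustrative and any character other than A, C, G or T is shown as "?".

```python
# Shape encodes purine (open) versus pyrimidine (filled).
SHAPES = {"A": "□", "G": "□", "C": "■", "T": "■"}
# Colour encodes A/T (purple) versus G/C (green).
COLORS = {"A": "purple", "T": "purple", "G": "green", "C": "green"}


def to_symbols(seq: str) -> str:
    """Render a DNA string as open/filled squares, '?' for other characters."""
    return "".join(SHAPES.get(base, "?") for base in seq.upper())


def to_colors(seq: str) -> list[str]:
    """Return the display colour of each base, 'none' for other characters."""
    return [COLORS.get(base, "none") for base in seq.upper()]


# First 14 bases of the top line of the display on slide 4.
print(to_symbols("GCTGAGTTAGGAGT"))
print(to_colors("GCTG"))
```

Runs of filled or open squares then stand out as pyrimidine or purine stretches, which is the effect the narrative relies on.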

19 [Base-symbol display moved by one line, now starting at 5166901. The end of alr4311 (ABC transporter, 5166172 -> 5166927) appears as letters at the start of the top line: AACCAAGCCGATGAAGAATGGAACTAA. The bases between it and all4312 (two-component system, 5166997 <- 5167767) are drawn as symbols.]
From the change in color from yellow to blue, we’ve evidently run into a gene on the other strand, this one also ending in a string of pyrimidines. Let’s look further by clicking on PgUp.

20 [Base-symbol display of Anabaena Chromosome (6413771 bp): 5165951-5166950. The end of alr4310 (hypothetical protein, 5165532 -> 5166086) and all of alr4311 (ABC transporter, 5166172 -> 5166927) appear as letters; the intergenic region between them and the bases after alr4311 are drawn as symbols.]
The intergenic region between alr4310 and alr4311 shows a remarkable pattern. I’ll give you a few seconds to try to find it yourself......a series of tandem repeats. Now that we see it by eye, we can ask the computer to find them in a more systematic fashion. Click on Display and then Tandem repeats.

21 [Base-symbol display as on slide 20. Display menu open: Alternate starts, Annotated features, Local features, Tandem repeats, Inverted repeats, Base symbols. Highlighted: Tandem repeats.]
The intergenic region between alr4310 and alr4311 shows a remarkable pattern. I’ll give you a few seconds to try to find it yourself......a series of tandem repeats. Now that we see it by eye, we can ask the computer to find them in a more systematic fashion. Click on Display and then Tandem repeats.

22 [Base-symbol display as on slide 20, with the tandem repeats found by the program marked in the intergenic region.]
The machine saw more than we did! Not only are the repeats we saw more extensive, but there is also another set of repeats nearby. What do they mean? Hard to say, but certainly our chances of figuring them out are better if we can engage our visual imagination and if we can see them in a biological context.
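The slides do not say how the Tandem repeats option works. As one simple possibility, the sketch below scans for exact, back-to-back copies of a short unit. It is not the prototype's method, the thresholds are arbitrary, and the test sequence is made up, because the repeat region itself appears only as symbols in the slides.

```python
def find_tandem_repeats(seq, min_unit=2, max_unit=10, min_copies=3):
    """Find exact tandem repeats; return a list of (start, unit, copies).

    At each position the unit length giving the longest repeated stretch
    is kept, and scanning resumes after that stretch.
    """
    hits = []
    i = 0
    while i < len(seq):
        best = None
        for unit_len in range(min_unit, max_unit + 1):
            unit = seq[i:i + unit_len]
            if len(unit) < unit_len:
                break  # ran off the end of the sequence
            copies = 1
            # Count consecutive exact copies of the unit.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                if best is None or copies * unit_len > best[2] * len(best[1]):
                    best = (i, unit, copies)
        if best:
            hits.append(best)
            i += len(best[1]) * best[2]
        else:
            i += 1
    return hits


# Made-up sequence: four copies of ACT flanked by unrelated bases.
demo = "GGA" + "ACT" * 4 + "GG"
print(find_tandem_repeats(demo))
```

Real repeats are often imperfect, so a practical tool would probably also need to tolerate mismatches between copies; this sketch reports exact copies only.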

23 Analysis: Tools for directly examining sequence
Summary (article of faith)
The freshest insights and most fundamental discoveries require intimate contact with the basic phenomenon. In genomic analysis, the basic phenomenon is often the genome.
The sequence interface makes it possible to view DNA features within a biological context.
The interface provides tools to aid discovery of features within noncoding DNA.
Software that does most of what you saw already exists, but it would need to be rewritten before it could serve as a web interface.

