Published by Solomon Caldwell. Modified over 9 years ago.
2
Lives of the Scientist
3
Genetic Basis of Differentiation Events in time and space...
4
... driven by patterned gene expression Genetic Basis of Differentiation
5
Events in time and space...... driven by patterned gene expression Genetic Basis of Differentiation
6
Events in time and space... driven by patterned gene expression Genetic Basis of Differentiation NH3 N2 Nostoc
7
Genetic Basis of Differentiation NH3 Environmental Signal Developmental Response Histidine Kinase How?
8
Genetic Basis of Differentiation Developmental Response Histidine Kinase How? NH3 Environmental Signal P AT
9
Genetic Basis of Differentiation Developmental Response Histidine Kinase P Response Regulator How? NH3 Environmental Signal P histidine
10
Genetic Basis of Differentiation Developmental Response Histidine Kinase P Response Regulator How? NH3 Environmental Signal P ? ? ? NpR3010
11
AATAAAGCTTTACAAACCAA ACTCTGGCTTCAATTGTGTAA CCCAAGCTTTGATTCTTTCCT CTGTTAAATCGGATTGATTAT CTTCATCAAGGGCAAGACCT ACAAATTTACCATCACGAAC AGCTTTAGACTCACTGAATT CATAACCTTCTGTAGGCCAA TAGCCAACTGTTTCACCACC ATTTTCTGAAATTTTTTCCTCT AGAATACCGAGGGCATCTTG AAATGTATCAGGATAACCAA CCTGGTCTCCAGGAGCAAAA TAAGCAACTTTTTTGCCGATG AAGTCAATGTTATCTAACTC ATCATAAAAATTTTCCCAAT CACTTTGCAATTCTCCAACAT TCCAGGTAGGACAACCAAC AACGATATAATCGTAGTTAT TGAAATCACTTGGTTCAGCTT GTGAAATATCATATAAAGTT ACAACACTATCACCACCAAA CTCCTTCTGAATTATTTCTGA TTCAGTTTGGGTATTGCCTGT TTGAGTACCAAAAAATAAAC CAATATTAGACATTTTTACTC CTTTTATGTATTTGCAAAATT ATTTCAATTAAAATATTTAGT AATAATTAATTGTTAGCTAG CTAATAATTAAATTTTTATTA CAATCATTGTAAAAGGCATT GAAAAAGTAAATAAAAATT TTTATTCTACGTTATTTCAAA AATATTTACTTACATATACTT AACCTTTATAGTGATGTAAT ATACTCTAATTCCTATTTTAC TTATAAATACCATCTCAGCTT AATGTAACGAATTTTTCTGTT TATCTTTAAATACAAAAAAT TCAACAAAACTACAGAAAA TTAATCTTAATAACACAAAA CAAGTATCAATCTGTAATAC AACTAAGCTTAAATAAATTA ATAGAAAGCTTCATCTATCT AATAGGTTGAGAATAGTTTA TGTCTAATGACATAAATTCA TTCGTGTTGATTTCATTTGGG TATATTCATCTGATTTAGGAT TTACTCCATTAAGTTTGTACT CATCAATGCCCGCCTGTTGG TATCCACAATTCTCATACAG TGCGCGAGCAAAGTAATCA ATCGTTCGTCGCCATATCTA ACTTTGAGTCAAACAAACCA GTTGGATTACCAACCCTCAA CTAATCGCTTCTTTAAGGCG AGCGATCGCACATTTAACTG TTGGTTGTCACAAGAGAACT AATACTACAGCAGTATATTT AACAACTAAGGGTGGTTCAA CTTTCGCTGCGACTCCTCCAA CGCGCTGAAATACACAGGA CTGATGCGATCGCAAACTCT TTGACTAAATTCCATACATT ATCATGACCATCTCCCAAAC AAACAAGTGGGTTAACCAG ATGCTGACTATTAACATCCC CTGAGTTCGGAGTTGTAGGT CTATTTGACTGGTTCAAAGC GATGATGGAACGGCTTTGTT GCATGAATTAAAAAAAGAC ACACCATCACCTACTTCTAG GATAGACACATCAAACGTCC CACCGCCTAAGTCAAATACC AAGATAATTTCGTTAGTTTTC TTGTCAAGTCCGTAAGCGAG GGCCGCCGCCGTGGGCTAGT TGATAATTCGCAGAACTTTA ATCCCGGCAATTCTACTGGC ATCTTTGGTAGCCTGCCGTTG AGAGTCATTGAAATAGGCAG GGGTGGTAATTACCGCTTGC CTCACTGGTTCCCCCAGATA TGTGCTGGCATCATCTATCA GCTTGCGGACTACCTCATAC CATTTCACGAAAAACCTGAT ACACATGTAAACTCTGAAAC CCTTGCTGTATCAAAGTTTTG TAATTACGAATTACGAATTA CGAATTGATATCAGCCGAGA TTTCTTCGGGTGAAAATTCCT TGTTCAGAGCGGGACAGTGT AGCTTGACATTGCCATTACT GTCACGTACCACTTTGTAAG 
TAACTTGTTTTGCCTCTTGCG TAACTTCATCATACCTGCGC CCGATGAACCGCTTCACAGA ATAAAAAGTGTTTTCTGGGT TCATTACACCCTGGCGCTT Genetic Basis of Differentiation Developmental Response Histidine Kinase P Response Regulator How? NH 3 Environmental Signal P ? ? ?NpR3010
17
Histidine Kinase NpR3010 Nostoc punctiforme Genes Functionally Related to His Kinase Anabaena PCC 7120 Trichodesmium Synechocystis PCC 6803... (13 total) Find similar genes Blast Conserved
22
>npun_22dec03_Contig1_revised_geneNpR3010 MWHIQDSIITLSNHNQYLTFYKNQVKNPERFCRNVNQFDSQIDFVSCDIL ELKDGRFFEQYSKPLRLAEEIIGTVWSFRDITESQQAKEENRRIIQQEKQ LAEDRAYFTSMIFHEFRNPLNIISYSTSLLKRHSHHWSEEKKLQCLQNLQ TAVEQINQFTDEVLIIESVEAGKLQYELKPIDLNLFCREVLAEMSLYTKG ASQFLLFQNK*
25
MWHIQDSIITLSNHNQYLTFYKNQVKNPERFCRNVNQFDSQIDFVSCDIL ELKDGRFFEQYSKPLRLAEEIIGTVWSFRDITESQQAKEENRRIIQQEKQ LAEDRAYFTSMIFHEFRNPLNIISYSTSLLKRHSHHWSEEKKLQCLQNLQ TAVEQINQFTDEVLIIESVEAGKLQYELKPIDLNLFCREVLAEMSLYTKG ASQFLLFQNK
30
>npun_22dec03_Contig1_revised_geneNpR3008 LSPYLEACCLRISASVSYQRAAEDIEYLTGVEVSKSVQQRLVHRQNFELP QVESTVEELSVDGGNIRIRTIKGQVCDWKGYKATCLHEKQAIAASFQENS LVIDWVKSQSIAPILTCLGDGHDGIWNIVRDFAPEHQRREVLDWFHLMEN LHKIGGSNQRLNQAKILLWQGKVDDAIAVFADCQLKQAFNFCTYLEKHRH RIVNYQYYQAEQICSIGSGAIESTVKQIDRRTKISGAQWKSDNVPQVLAQ RQSLSQWINLCSLNKNWDAPMKSSVERLSDYPVAR*
36
A new family of proteins?! A type of transposase? transposase...ATTTCTCTAGAAAGGCTGAAGGGGGGACAAGCACCCGAAAGCCTTTGTGCT......TAAAGAGATCTTTCCGACTTCCCCCCTGTTCGTGGGCTTTCGGAAACACGA......ATACAGTCAGCTTTATAGGCTTCATGTCGCCCCTTCAGCTAGAAAGGTACATA......TATGTCAGTCGAAATATCCGAAGTACAGCGGGGAAGTCGATCTTTCCATGTAT... TRANSPOSON
38
A new family of proteins?! A type of transposase?...ATTTCTCTAGAAAGGCTGAAGGGGGGACAAGCACCCGAAAGCCTTTGTGCT......TAAAGAGATCTTTCCGACTTCCCCCCTGTTCGTGGGCTTTCGGAAACACGA......ATACAGTCAGCTTTATAGGCTTCATGTCGCCCCTTCAGCTAGAAAGGTACATA......TATGTCAGTCGAAATATCCGAAGTACAGCGGGGAAGTCGATCTTTCCATGTAT... transposase TRANSPOSON
39
A new family of proteins?! A type of transposase? transposase TRANSPOSON Is NpR3008 a transposase?
50
AATAAAGCTTTACAAA CCAAACTCTGGCTTCA ATTGTGTAACCCAAGC TTTGATTCTTTCCTCTG TTAAATCGGATTGATT ATCTTCATCAAGGGCA AGACCTACAAATTTAC CATCACGAACAGCTTT AGACTCACTGAATTCA TAACCTTCTGTAGGCC AATAGCCAACTGTTTC ACCACCATTTTCTGAA ATTTTTTCCTCTAGAAT ACCGAGGGCATCTTGA AATGTATCAGGATAAC CAACCTGGTCTCCAGG AGCAAAATAAGCAAC TTTTTTGCCGATGAAGT CAATGTTATCTAACTC ATCATAAAAATTTTCC CAATCACTTTGCAATT CTCCAACATTCCAGGT AGGACAACCAACAAC GATATAATCGTAGTTA TTGAAATCACTTGGTT CAGCTTGTGAAATATC ATATAAAGTTACAACA CTATCACCACCAAACT CCTTCTGAATTATTTCT GATTCAGTTTGGGTATT GCCTGTTTGAGTACCA AAAAATAAACCAATA TTAGACATTTTTACTCC TTTTATGTATTTGCAAA ATTATTTCAATTAAAA TATTTAGTAATAATTA ATTGTTAGCTAGCTAA TAATTAAATTTTTATTA CAATCATTGTAAAAGG CATTGAAAAAGTAAAT AAAAATTTTTATTCTAC GTTATTTCAAAAATAT TTACTTACATATACTTA ACCTTTATAGTGATGT AATATACTCTAATTCC TATTTTACTTATAAATA CCATCTCAGCTTAATG TAACGAATTTTTCTGTT TATCTTTAAATACAAA AAATTCAACAAAACTA CAGAAAATTAATCTTA ATAACACAAAACAAG TATCAATCTGTAATAC AACTAAGCTTAAATAA ATTAATAGAAAGCTTC ATCTATCTAATAGGTT GAGAATAGTTTATGTC TAATGACATAAATTCA TTCGTGTTGATTTCATT TGGGTATATTCATCTG ATTTAGGATTTACTCC ATTAAGTTTGTACTCAT CAATGCCCGCCTGTTG GTATCCACAATTCTCA TACAGTGCGCGAGCAA AGTAATCAATCGTTCG TCGCCATATCTAACTTT GAGTCAAACAAACCA GTTGGATTACCAACCC TCAACTAATCGCTTCTT TAAGGCGAGCGATCGC ACATTTAACTGTTGGTT GTCACAAGAGAACTA ATACTACAGCAGTATA TTTAACAACTAAGGGT GGTTCAACTTTCGCTG CGACTCCTCCAACGCG CTGAAATACACAGGA CTGATGCGATCGCAAA CTCTTTGACTAAATTCC ATACATTATCATGACC ATCTCCCAAACAAACA AGTGGGTTAACCAGAT GCTGACTATTAACATC CCCTGAGTTCGGAGTT GTAGGTCTATTTGACT GGTTCAAAGCGATGAT GGAACGGCTTTGTTGC ATGAATTAAAAAAAG ACACACCATCACCTAC TTCTAGGATAGACACA TCAAACGTCCCACCGC CTAAGTCAAATACCAA GATAATTTCGTTAGTTT TCTTGTCAAGTCCGTA AGCGAGGGCCGCCGC CGTGGGCTAGTTGATA ATTCGCAGAACTTTAA TCCCGGCAATTCTACT GGCATCTTTGGTAGCC TGCCGTTGAGAGTCAT TGAAATAGGCAGGGG TGGTAATTACCGCTTG CCTCACTGGTTCCCCC AGATATGTGCTGGCAT CATCTATCAGCTTGCG GACTACCTCATACCAT TTCACGAAAAACCTGA TACACATGTAAACTCT GAAACCCTTGCTGTAT CAAAGTTTTGTAATTA CGAATTACGAATTACG AATTGATATCAGCCGA GATTTCTTCGGGTGAA AATTCCTTGTTCAGAG CGGGACAGTGTAGCTT GACATTGCCATTACTG 
TCACGTACCACTTTGT AAGTAACTTGTTTTGC CTCTTGCGTAACTTCAT CATACCTGCGCCCGAT GAACCGCTTCACAGAA TAAAAAGTGTTTTCTG GGTTCATTACACCCTG GCGCTT
54
Observation * Photos courtesy of www.webshots.com and Peter Smallwood
58
Filters: Information reducers Squirrel filter
59
Filters: Information reducers Molecular filter
60
TCTACTTATA TTCAATCCAC AGGGCTACAC CTAGTTCTTG AAGAGTCTGT TGAATGAACA CATACATGGT TTATCTGTTT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC CACTAGTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC TTAGATAAAC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCACGCCC CTCCGTAAAC CTCTAACATG ATGTCAGCAA ATATTAAAAA TGAATAAACT TTGTTAAAGG TACAAATGAA AATTAGCAAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT CATTCTAGGG AAACCTGTAT GGTTACATGA ACTGCCTAAA AAACAAGCTA TTATATATTT TAAGAAATTA ATTGCAATTA ATTTCCTGGG CCCCAGCTGT CATTAAAAAG AGGCAAATAC AGCCAAGGAC GACAGCACTG ACCCTCAAGA AGGCACCGGC TGACAGACAG GCTGAAATTC CGCTGAGAGC AGAGTGGTAC ATTGAACCCT CCCTGCACCA GGTCTTTCCT GTGGGCACTG AGTGCAGACA ATGAATGACT GAACGAACGA TTGAATGAAA AGAAATGAGA TATGAGGCAA TCACAGCATC AGGTGACCTT AGTATCTATT CTCGGGAGCG CACGGCTCTA AAGAGGCCCA TATCCAGGCA CCTTTAGATG CAAGAAGGAG GAAACAGCTC GAAATCCCTG AGGCCGGAGG GTCAAGAACT CTCCACCGGC GGCAGCGGCC CCCCGGCCTA AGGCTGCCTG TGCTATAAAT ACGCGGCCCA TTCCCTGGGC TCGGCGGGAC AGATAACATG AATGTGCCCT CTCCGTAAAC CTCTAAC... Filters: Information reducers Sequence filter
61
How do Biologists use Bioinformation? Candidate genesPredicted genes Interpolated Markov model Gene finder TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA What genes are in my organism?
62
Predicted genesCandidate genesPredicted genes Conform to standard model Challenge accepted beliefs How do Biologists use Bioinformation? Gene finder TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA What genes are in my organism? Interpolated Markov model
63
Predicted genesCandidate genesPredicted genes Conform to standard model How do Biologists use Bioinformation? Gene finder TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA What genes are in my organism? Interpolated Markov model
65
Filters are powerful globin Highly filtered output Easy to grasp High-level insights
66
Filters Constrain New Discovery globin Highly filtered output Easy to grasp High-level insights Unfiltered output Confusing Basic insights
67
Filters are tempting
68
Globin Filters are tempting
73
The Death of Science
74
Current State of Affairs 1. Need high-level filters
75
2. Need access to raw phenomena AATAAAGCTTTACAAACCAAACTCTGGCTTCAA TTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTT AAATCGGATTGATTATCTTCATCAAGGGCAAGA CCTACAAATTTACCATCACGAACAGCTTTAGAC TCACTGAATTCATAACCTTCTGTAGGCCAATAG CCAACTGTTTCACCACCATTTTCTGAAATTTTTT CCTCTAGAATACCGCAACACTATCACCACCAA ACTCCTTCTGAATTATTTCTGATTCAGTTTGGGT ATTGCCTGTTTGAGTACCAAAAAATAAACCAAT ATTAGAC Current State of Affairs
76
Current State of Affairs
1. Need high-level filters
2. Need access to raw phenomena
3. Need ability to build new tools
ASSIGN K12-set FROM Gene-finder (K12-DNA)
ASSIGN O157-set FROM Gene-finder (O157-DNA)
CONSIDER EACH protein IN O157-set
WHEN Constituent-of (K12-set, protein) = FALSE
COLLECT protein
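The pseudocode on this slide collects every protein predicted in O157 that is absent from K12. A minimal Python sketch of that filter follows. It assumes the gene finder has already produced a list of protein identifiers for each genome, and it treats "constituent of" as exact identifier equality. The function name and the toy identifiers are illustrative and do not come from the talk; a real comparison would more likely test sequence similarity than names.

```python
def proteins_missing_from(reference_set, candidate_set):
    """Return the candidates that do not occur in the reference collection.

    Membership is tested by exact equality, which is an assumption made
    for this sketch; order of the candidates is preserved.
    """
    reference = set(reference_set)
    return [protein for protein in candidate_set if protein not in reference]


# Toy identifiers standing in for gene-finder output (illustrative only).
k12_set = ["thrA", "lacZ", "recA"]
o157_set = ["thrA", "stx1A", "recA", "eae"]

unique_to_o157 = proteins_missing_from(k12_set, o157_set)
print(unique_to_o157)
```

The loop mirrors the slide line for line: CONSIDER EACH becomes the iteration, WHEN ... = FALSE becomes the `not in` test, and COLLECT becomes the resulting list.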
77
We need… Biologists...... and Programmers
79
Current State of Affairs
1. Need high-level filters
2. Need access to raw phenomena
3. Need ability to build new tools
Need biologist programmers
80
AATAAAGCTTTACAAACCAAA CTCTGGCTTCAATTGTGTAACC CAAGCTTTGATTCTTTCCTCTG TTAAATCGGATTGATTATCTTC ATCAAGGGCAAGACCTACAAA TTTACCATCACGAACAGCTTTG ARYGACTCACTGAATTCLARAT AACCTTCTGTAGGCCASONATA GCCAACTGTTTCACCACCATTT TCTGAAATTTTTTCCTCT
81
TATTCAAAATGAATTATATCGGTAACTTTAGTACAGAAAATGACGTTAAGA ATATCTGCAACTTTAAACCTGAATGATATTATTATTGGCGGGCCTCCATGCCAG GGATTTAGTATTGCTGGGCCAGCCCAAAEALAVGIASTCCTAAAGATCCTAGAAATG GTTTAGAATTTTCATCAACTTTGCACAATGGATAAAATTTCTTGAACCTAAAGCGTTTGTC ATGGAAAACGTGAATTCAAAAGGATTGCTATCAAGGAAAAATGCAGAAGGTTTTAAAGTTATAG ATATTATTAAGAAAACATTTGGAATTCGAGAACTTGGTTATTTTGTCGAAGTATGGGTTTTAAATGCTG CGGAATATGGCATTCCGCAAATTAGAGAACGGAATTCGATTTTTATTGTTGGCAATAAAAAAGGTAAAGTACT AGGTATTCCTAAAAAAACACATTCTCTGCAATTTTTAAGAATTCGATTTAAATAGGTCTCAATTATCGATCTTCGATGAT ATGAGTATTATACCTGCACTAACTTTGTGGGACGCAATATCAGACTTACGAATTCGACAGAACTTAATGCGCGTGAAGGAAGTGAA GAGCAACCCTATCATTTAAAACCTCAAAATACTTATCAGACTTGGGCTAGAAATGGTAGTGGAATTCGATACGCTTTACAATCATGTTGCAAT GGAACATTCTGACCGTTTAGTAGAACGTTTCCGGCATATAAAATGGGGTGAATCCAGTTCGGATGTATCTAAAGAAGAATTCGACATGGAGCTAGACGACGT AGTGGTAATGGTGAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCATAAACCGGAATTCGAATTCTCACACTATTGCTGCGTCATTCTATGCTAATTTTG TCCATCCTTTTCAACATCGAAATTTAACAGCCCGTGAAGGAGCTAGAATCCAATCTTTTCCAGATAACTATAGATTTTTTGGAAAAGAATTCGAATTCAAACTGTCGTATCTCATAAACTATTGCATCGA GAAGAAAGATTTGATGAAAAATTTCTTTGTCAATATAATCAAATCGGTAATGCTGTACCCCCTCTTCTCGCTAAAGTAATTGCACATCATCTTCTAGAGAAATTAGGAATTCGAATTCAGTTATGCCAACAACTGATAGAAATCCTCTA GTGCATGGATCAAATCTTGAACAAAAAGAGAATCATCGTACAAAATACAGAGATACTGAAAGCAGGACTTTCCTTAGAGAAATCAGAACTGAATATGACAAATGGCATAAAGCAAATATGAACCTGGAATTCGAATTCGAGTTGGACCAAAATCAGAAATTACTGACCA AGATGATTCAATTATTACTCAAAGAGTGGAACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAAATTTGATTCAAGATCCAACCTTCATTCTAGTGTTTTAGAGACCATTTATAAAGTAAATCTTTAGACGACTAGACGACGTAGCGAATTCGAATTCGAATTCATAATACGAGTCATAACGGCATATATG GCAGCCTCACTCATTTCTGGGAGACGCTCATAATCCTTACTGAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTTGGGCAAAACCTGAAATTCTTGATTAGTACGCCGGATTACCTCAACATGAGCTTGAATCATCAGCCAAACAGAGAGCGCAAATTTATCACCGTCATAGCCGGAATCAACCCAGATGACTTGAATTCGAATTCGAATTCGAACAACTTTTTCCAGTAATTCTGGAC GCTCTTCTAACAGTTCCATCAAAGTATA 
GGCGGCAAGTAATCTTTCTCCAGCATTTGCTTCACTTACAACCACTTTTAACAAAAGTCCCAGACTATCAACCAAAGTTTGCCGCTTTCGTCCTTTTACCTTCTTGCCACCATCAAAACCGTACACATCCCCCTTTTTTCAGTCGTTTTTACCGACTGGCTGTCTGCCGCGATCGCCGTGGGTTGAGTTGACTTCCCCATTTTTTGACGAACTTGATCGCGCAAAGTATGATTCATTTCAGTTGAACTAGGAGGAAAATCCCCTGGAAGCATATCCCACTGAATTCGAATTCGAATTCGAATTCGAATTCGA C AACCTGTTTTCAGATGGTAGTAGATAGCGTTGCATACTTCTCGCATATCAGTTGTTCGGGGATGCCCACCGCATTTAGCGGGTGGAATCAAAGGAGCTAAAATTGCCCATTCTGAGTCATTAAGGTCTGTAGAATAAGACTTTCGTCTCATTGTTTCCTATGTAAATACACTCTACAAACAGTATCTTATCGCTGCCTTTTTATCTTAGCTCTCCTTTAGATTTACTTTATAAATAGCCTCTTAGAAGAATTTCTTTATTATTTATTTAAAGATTTAGTACAAGATTTCGGGCAGAACGCTCTTATTGGTAAGTCACACACGTTCAAAGATATTTTCTTCGTACCACCAAAATATTCTGAAATGCTCAAGCGACCTTATGCGCGAATTGAGAGAAAAGATCATGATTTCGTAATTGGTGCAACTGTTCAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAACCATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAA AAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT
82
Why hasn’t this happened? Part of bioinformatic program written in C
if (pcInFile == NULL)
    pfInFile = stdin;
else
    pfInFile = fopen(pcInFile, "r");
pfOutFile = fopen( pcOutFile, "w" );
if (pfInFile == NULL) {
    fprintf( stderr, "ERROR opening %s\n", pcInFile );
    exit(1);
}
if (pfOutFile == NULL) {
    fprintf( stderr, "ERROR opening %s\n", pcOutFile );
    exit(1);
}
fputc( fgetc(pfInFile), pfOutFile ); /* deal with first '>' in file */
for ( ; ; ) {
    if (processIdentifier( pfInFile, pfOutFile )) { } else { break; }
    if (processSequence( pfInFile, pfOutFile )) { } else { break; }
}
fclose( pfInFile );
fclose( pfOutFile );
83
Why hasn’t this happened? Part of bioinformatic program written in Perl
sub match_positions {
    my $pattern;
    local $_;
    ($pattern, $_) = @_;
    my @results;
    local $matchStart;
    my $instrumentedPattern = qr/(?{ $matchStart = pos() })$pattern/;
    while (/$instrumentedPattern/g) {
        my $nextStart = pos();
        push @results, "[$matchStart..$nextStart)";
        pos() = $matchStart+1;
    }
    return @results;
}
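For readers who do not read Perl, the routine above reports the half-open span of every match of a pattern in a string, restarting the search one position after each match start so that overlapping matches are found too. The Python sketch below shows the same idea. It is an illustrative rewrite, not code from the original program, and returning tuples instead of formatted strings is a choice made here.

```python
import re


def match_positions(pattern, text):
    """Return (start, end) spans of all matches, including overlapping ones.

    Spans are half-open, like the "[start..end)" strings built by the
    Perl version.
    """
    regex = re.compile(pattern)
    spans = []
    start = 0
    while start <= len(text):
        match = regex.search(text, start)
        if match is None:
            break
        spans.append((match.start(), match.end()))
        # Resume one character after this match began, not after it ended,
        # so a match that overlaps the current one is still reported.
        start = match.start() + 1
    return spans


print(match_positions("AA", "AAAA"))
```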
84
Why hasn’t this happened? Biologists will not come to programming Programming must come to biologists
85
BioLingua
86
Genetic Basis of Differentiation NH3 Environmental Signal Developmental Response Histidine Kinase P Response Regulator ? ? ? NpR3010
87
Genetic Basis of Differentiation NpR3010 RR HK HK-upstream HK-downstream
88
HK-upstream HK-downstream HK RR Genetic Basis of Differentiation NpR3010
89
BioLingua :: (#$Npun.NpF0304 #$Npun.NpR0355 #$Npun.NpR0450 #$Npun.NpF0484 #$Npun.NpR0589 #$Npun.NpF0832 #$Npun.NpF0906 #$Npun.NpR0956 #$Npun.NpF1084 #$Npun.NpF1085 #$Npun.NpR1109 #$Npun.NpF1184 #$Npun.NpF1278 #$Npun.NpR1450 #$Npun.NpF1453 #$Npun.NpF1516 #$Npun.NpR1633 #$Npun.NpR1678 #$Npun.NpR1683 #$Npun.NpR1688 #$Npun.NpF1776 #$Npun.NpR1779 #$Npun.NpF1800 #$Npun.NpR1903 #$Npun.NpR2091 #$Npun.NpF2162 #$Npun.NpR2263 #$Npun.NpF2346 #$Npun.NpF2364 #$Npun.NpR2420 #$Npun.NpR2902 #$Npun.NpF2972 #$Npun.NpR3053 #$Npun.NpF3084 #$Npun.NpR3197 #$Npun.NpR3241 #$Npun.NpF3659 #$Npun.NpF3676 #$Npun.NpR3733 #$Npun.NpF3829 #$Npun.NpR3907 #$Npun.NpR3959 #$Npun.NpF3972 #$Npun.NpR4101 #$Npun.NpR4160 #$Npun.NpR4165 #$Npun.NpF4214 #$Npun.NpR4435 #$Npun.NpF4460 #$Npun.NpR4503 #$Npun.NpR4743 #$Npun.NpR4768 #$Npun.NpF4909 #$Npun.NpR5015 #$Npun.NpF5034 #$Npun.NpF5044 #$Npun.NpR5135 #$Npun.NpR5136 #$Npun.NpR5316 #$Npun.NpF5361 #$Npun.NpF5636 #$Npun.NpF5682 #$Npun.NpF5759 #$Npun.NpF5763 #$Npun.NpF5788 #$Npun.NpR6014 #$Npun.NpR6015 #$Npun.NpR6228 #$Npun.NpF6321 #$Npun.NpR6360 #$Npun.NpF6363 #$Npun.pNpAF075 #$Npun.pNpBR039 #$Npun.pNpBF139 #$Npun.pNpBF146 #$Npun.pNpBR169 #$Npun.pNpBR170 #$Npun.pNpBF205 #$Npun.pNpEF003) (GENES-DESCRIBED-BY "response regulator" IN Npun) > (GENE-UPSTREAM-OF NpF0304)
90
BioLingua :: #$Npun.NpF0303 (GENE-UPSTREAM-OF NpF0304) > (DESCRIPTIONS-OF *) (GENES-UPSTREAM-OF (RESULT 1)) :: (#$Npun.NpF0303 #$Npun.NpF0356 #$Npun.NpF0451 #$Npun.NpF0483 #$Npun.NpR0590 #$Npun.NpF0831 #$Npun.NpF0905 #$Npun.NpF0957 #$Npun.NpR1083 #$Npun.NpF1084 #$Npun.NpR1110 #$Npun.NpF1183 #$Npun.NpF1277 #$Npun.NpR1451 #$Npun.NpR1452 #$Npun.NpR1515 #$Npun.NpF1634 #$Npun.NpR1679 #$Npun.NpF1684 #$Npun.NpR1689 #$Npun.NpF1775 #$Npun.NpF1780 #$Npun.NpF1799 #$Npun.NpR1904 #$Npun.NpR2092 #$Npun.NpF2161 #$Npun.NpR2264 #$Npun.NpR2345 #$Npun.NpF2363 #$Npun.NpR2421 #$Npun.NpR2903 #$Npun.NpR2971 #$Npun.NpR3054 #$Npun.NpR3083 #$Npun.NpR3198 #$Npun.NpF3242 #$Npun.NpR3658 #$Npun.NpF3675 #$Npun.NpR3734 #$Npun.NpR3828 #$Npun.NpF3908 #$Npun.NpR3960 #$Npun.NpF3971 #$Npun.NpF4102 #$Npun.NpR4161 #$Npun.NpF4166 #$Npun.NpR4213 #$Npun.NpR4436 #$Npun.NpF4459 #$Npun.NpR4504 #$Npun.NpR4744 #$Npun.NpR4769 #$Npun.NpR4908 #$Npun.NpF5016 #$Npun.NpF5033 #$Npun.NpF5043 #$Npun.NpR5136 #$Npun.NpF5137 #$Npun.NpF5317 #$Npun.NpF5360 #$Npun.NpR5635 #$Npun.NpF5681 #$Npun.NpF5758 #$Npun.NpR5762 #$Npun.NpR5787 #$Npun.NpR6015 #$Npun.NpR6016 #$Npun.NpR6229 #$Npun.NpR6320 #$Npun.NpF6361 #$Npun.NpF6362 #$Npun.pNpAF074 #$Npun.pNpBR040 #$Npun.pNpBF138 #$Npun.pNpBF145 #$Npun.pNpBR170 #$Npun.pNpBR171 #$Npun.pNpBR204 #$Npun.pNpER002)
91
BioLingua :: ("two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25531611|pir||AD2200 two- "unknown protein [Nostoc sp. PCC 7120] gi|25534386|pir||AH1981 hypothetical protein alr1403 "tmRNA-binding protein [Nostoc sp. PCC 7120] gi|22096164|sp|Q8YM70|SSRP_ANASP SsrA-binding protein "GTP-binding protein era homolog" "unknown protein [Nostoc sp. PCC 7120] gi|25533156|pir||AF2229 hypothetical protein asr3389 "ORF_ID:tlr0160~similar to ferredoxin [Thermosynechococcus elongatus BP-1] "hypothetical protein [Nostoc sp. PCC 7120] gi|25367067|pir||AH2295 hypothetical protein alr3919 "two-component hybrid sensor and regulator [Nostoc sp. PCC 7120] gi|25532444|pir||AE2276 two- "hypothetical protein [Nostoc sp. PCC 7120] gi|25358966|pir||AG2158 hypothetical protein alr2822 "two-component response regulator [Nostoc sp. PCC 7120] gi|25533086|pir||AF2158 two-component "probable two-component sensor histidine kinase [Gloeobacter violaceus] gi|35214672|dbj|BAC92039.1| "phytochrome-like protein [Tolypothrix sp. PCC 7601]" "two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25530471|pir||AC1860 two-component NIL NIL NIL "hypothetical protein [Nostoc sp. PCC 7120] gi|25535333|pir||AI2179 hypothetical protein all2992 NIL "unknown protein [Nostoc sp. PCC 7120] gi|25535440|pir||AI2275 hypothetical protein alr3760 "transcriptional regulator [Nostoc sp. PCC 7120] gi|25302898|pir||AB2544 transcription regulator "similar to two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25531791|pir||AD2385 "putative gluconolactonase precursor [Sinorhizobium meliloti] gi|25369832|pir||G95274 probable "similar to two-component sensor histidine kinase [Nostoc sp. PCC 7120] gi|25531791|pir||AD2385 "hypothetical protein [Nostoc sp. PCC 7120] gi|25530521|pir||AC1903 hypothetical protein asr0773... DESCRIPTIONS-OF *) >
92
BioLingua
> (DEFINE RR-class AS (GENES-DESCRIBED-BY "response regulator" IN Npun) DISPLAY off)
:: "List of length 79 suppressed"
> (DEFINE HK-class AS (GENES-DESCRIBED-BY "histidine kinase" IN Npun) DISPLAY off)
:: "List of length 89 suppressed"
> (DEFINE HK-upstream AS (GENES-UPSTREAM-OF HK-class) DISPLAY off)
:: "List of length 89 suppressed"
> (DEFINE HK-downstream AS (GENES-DOWNSTREAM-OF HK-class) DISPLAY off)
:: "List of length 89 suppressed"
> (DEFINE HK-adjacent AS (UNION-OF (HK-upstream HK-downstream)) DISPLAY off)
:: "List of length 178 suppressed"
> (INTERSECTION-OF (HK-adjacent RR-class))
93
BioLingua
> (INTERSECTION-OF (HK-adjacent RR-class))
:: 22 elements in INTERSECTION
(#$Npun.pNpBF205 #$Npun.pNpBF139 #$Npun.NpR6228 #$Npun.NpR5316 #$Npun.NpF4214 #$Npun.NpF3676 #$Npun.NpF3084 #$Npun.NpR3053 #$Npun.NpR1779 #$Npun.NpR0589 #$Npun.NpF0304 #$Npun.NpR1109 #$Npun.NpF1278 #$Npun.NpF1776 #$Npun.NpF1800 #$Npun.NpR2420 #$Npun.NpR2902 #$Npun.NpR3197 #$Npun.NpR4503 #$Npun.NpF5763 #$Npun.NpF6363 #$Npun.pNpBF146)
> (DEFINE RR-candidates AS (SET-DIFFERENCE RR-class (RESULT 10)) DISPLAY off)
:: "List of length 57 suppressed"
>
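The session on these slides is a chain of set operations: take the genes next to each histidine kinase, intersect them with the response regulators to find the paired ones, then subtract those from all response regulators to leave the unpaired candidates (79 - 22 = 57 on the slide). A rough Python sketch of the same logic follows. It assumes a genome can be represented as a simple ordered list of gene names, so that "upstream" and "downstream" mean the previous and next list entries; strand, replicon boundaries and circularity are ignored. All names here are hypothetical and are not BioLingua functions.

```python
def neighbors(gene_order, genes, offset):
    """Return the genes located `offset` positions away from each given gene."""
    index = {gene: i for i, gene in enumerate(gene_order)}
    found = set()
    for gene in genes:
        j = index[gene] + offset
        if 0 <= j < len(gene_order):  # skip neighbors that fall off either end
            found.add(gene_order[j])
    return found


def split_response_regulators(gene_order, hk_class, rr_class):
    """Split response regulators into HK-adjacent ones and the remainder."""
    hk_adjacent = neighbors(gene_order, hk_class, -1) | neighbors(gene_order, hk_class, +1)
    paired = hk_adjacent & set(rr_class)   # INTERSECTION-OF
    candidates = set(rr_class) - paired    # SET-DIFFERENCE
    return paired, candidates


# Toy gene order (illustrative only): hk = histidine kinase, rr = response regulator.
order = ["g1", "hk1", "rr1", "g2", "rr2", "g3", "hk2"]
paired, candidates = split_response_regulators(order, {"hk1", "hk2"}, {"rr1", "rr2"})
print(sorted(paired), sorted(candidates))
```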
94
Histidine Kinase NpR3010 Nostoc punctiforme Genes Functionally Related to His Kinase Anabaena PCC 7120 Trichodesmium Synechocystis PCC 6803... (13 total) Find similar genes Conserved
95
BioLingua :: 24 elements in INTERSECTION > (#$Npun.pNpBF205 #$Npun.pNpBF139 #$Npun.NpR6228 #$Npun.NpR5316 #$Npun.NpF4214 #$Npun.NpF3676 #$Npun.NpF3084 #$Npun.NpR3053 #$Npun.NpR1779 #$Npun.NpR0589 #$Npun.NpF0304 #$Npun.NpR1109 #$Npun.NpF1278 #$Npun.NpF1776 #$Npun.NpF1800 #$Npun.NpR2420 #$Npun.NpR2902 #$Npun.NpR3197 #$Npun.NpR4503 #$Npun.NpF5763 #$Npun.NpF6363 #$Npun.pNpBF146) (INTERSECTION-OF (RR-adjacent HK-class)) > (DEFINE RR-candidates AS (SET-DIFFERENCE RR-class (RESULT 10)) DISPLAY off) > :: "List of length 57 suppressed" (CONTEXT-OF NpF0304) > (ALL-ORTHOLOGS-OF *) :: ( #$Npun.NpF0303 two-component sensor histidine) 85 (-> #$Npun.NpF0304 two-component response regulat) 473 (-> #$Npun.NpF0305 hypothetical protein glr0895 [) 85 (<- #$Npun.NpR0306 primosomal protein N' [Nostoc ) > (#$Npun.NpR0302 #$Npun.NpF0303 #$Npun.NpF0304 #$Npun.NpF0305 #$Npun.NpR0306)
96
BioLingua :: ((#$S7942.sef0159 #$Npun.NpR0302 #$Gvi.glr0573 #$A29413.Av?3368 #$A7120.all3154) (#$S6803.sll1590 #$Npun.NpF0303 #$Gvi.gll0572 #$A29413.Av?1247 #$A7120.alr3155) (#$S6803.sll1592 #$P9313.PMT1405 #$Npun.NpF0304 #$Gvi.gll0571 #$A29413.Av?1248 #$A7120.alr3156) (#$Tery.Te?7017 #$Npun.NpF0305 #$Cwat.Cw?3050) (#$Tery.Te?2243 #$TeBP1.tll0415 #$S6803.sll0270 #$S8102.SynW1782 #$S7942.sef1895 #$PRO1375.Pro0497 #$P9313.PMT1271 #$PMED4.PMM0497 #$Npun.NpR0306 #$Gvi.gll0025 #$Cwat.Cw?3016 #$A29413.Av?5206 #$A7120.all4248)) (ALL-ORTHOLOGS-OF *) > (CONTEXT-OF NpF0304) > :: ( #$Npun.NpF0303 two-component sensor histidine) 85 (-> #$Npun.NpF0304 two-component response regulat) 473 (-> #$Npun.NpF0305 hypothetical protein glr0895 [) 85 (<- #$Npun.NpR0306 primosomal protein N' [Nostoc ) > (#$Npun.NpR0302 #$Npun.NpF0303 #$Npun.NpF0304 #$Npun.NpF0305 #$Npun.NpR0306)
97
A new family of proteins?! A type of transposase? transposase TRANSPOSON Is NpR3008 a transposase?
98
BioLingua
> (DEFINE extended-NpR3008 AS (SEQUENCE-OF NpR3008 FROM -700 TO-END +700) DISPLAY off)
:: "Results suppressed"
> (BLAST extended-NpR3008 Npun)
::
     Query    Q-Start  Q-End  Subject            S-Start  S-End    E-value  %ID
 1.  "Seq 1"  1        2258   #$Npun.chromosome  3706846  3704589  0.0      100.0
 2.  "Seq 1"  293      1511   #$Npun.chromosome  4008429  4009647  0.0      100.0
 3.  "Seq 1"  293      1512   #$Npun.chromosome  7932036  7930817  0.0      99.92
 4.  "Seq 1"  293      1510   #$Npun.chromosome  4228111  4229328  0.0      99.92
 5.  "Seq 1"  293      1510   #$Npun.chromosome  3971285  3972502  0.0      99.92
 6.  "Seq 1"  293      1510   #$Npun.chromosome  4027833  4029050  0.0      99.75
 7.  "Seq 1"  293      1511   #$Npun.chromosome  2121987  2123204  0.0      99.67
 8.  "Seq 1"  293      1510   #$Npun.chromosome  2136737  2135521  0.0      99.67
 9.  "Seq 1"  397      1510   #$Npun.chromosome  2030748  2031861  0.0      99.64
10.  "Seq 1"  1537     2258   #$Npun.pNpB        42015    42737    4.6d-83  80.5
11.  "Seq 1"  1331     1420   #$Npun.chromosome  8036134  8036045  1.8d-8   83.33
12.  "Seq 1"  1319     1385   #$Npun.chromosome  5915424  5915358  2.7d-4   83.58
13.  "Seq 1"  1319     1385   #$Npun.chromosome  2577387  2577453  2.7d-4   83.58
(#$Temp27 #$Temp28 #$Temp29 #$Temp30 #$Temp31 #$Temp32 #$Temp33 #$Temp34 #$Temp35 #$Temp36 #$Temp37 #$Temp38 #$Temp39)
>
99
BioLingua
> (DEFINE extended-NpR3008 AS (SEQUENCE-OF NpR3008 FROM -700 TO-END +700) DISPLAY off)
:: "Results suppressed"
> (BLAST extended-NpR3008 Npun)
::
     Query    Q-Start  Q-End  Subject            S-Start  S-End    E-value  %ID
 1.  "Seq 1"  1        2258   #$Npun.chromosome  3706846  3704589  0.0      100.0
 2.  "Seq 1"  293      1511   #$Npun.chromosome  4008429  4009647  0.0      100.0
...
> (FOR-EACH hit IN *
    AS (subj S-start) = (GET-ELEMENTS (subject Subject-start) FROM hit)
    AS start = (- S-start 15)
    AS end = (+ S-start 40)
    AS left-end = (SEQUENCE-OF subj FROM start TO end)
    COLLECT left-end)
100
BioLingua
> (DEFINE extended-NpR3008 AS (SEQUENCE-OF NpR3008 FROM -700 TO-END +700) DISPLAY off)
:: "Results suppressed"
> (BLAST extended-NpR3008 Npun)
::
     Query    Q-Start  Q-End  Subject            S-Start  S-End    E-value  %ID
 1.  "Seq 1"  1        2258   #$Npun.chromosome  3706846  3704589  0.0      100.0
 2.  "Seq 1"  293      1511   #$Npun.chromosome  4008429  4009647  0.0      100.0
...
> (FOR-EACH hit IN *
    AS (subj S-start) = (GET-ELEMENTS (subject Subject-start) FROM hit)
    AS start = (- S-start 15)
    AS end = (+ S-start 40)
    AS left-end = (SEQUENCE-OF subj FROM start TO end)
    COLLECT left-end)
::
("TACGCTCTATCTTCAGCAAGTTGTTTTTCTTGCTGTATAATTCGGCGATTCTCTTC"
 "AAAGAAACGCTAGAGGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA"
 "AAACTGGGATGCACCCCTTATTAATGCTCTTTGGAGTCAATACTAATTTTGCCAAA"
 "TACCTTTGTGATAGGGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA"
 "AAATTAGTTTATTATGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA"
 "CACCGATTCACTAATGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA"
 "ACTATTGTAGAGACTGGGTGCATCCCAGTTTTTATTATTCCAAAACAAATAAATAA"...
>
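The FOR-EACH loop above cuts a short window around the start of each BLAST hit: 15 positions before S-start through 40 positions after it, which gives the 56-letter strings in the result. A minimal Python sketch of that windowing step follows. It assumes each replicon is available as a plain string, that coordinates are 1-based and inclusive, and that every hit is read on the forward strand (the slide's hit 1 runs in reverse, which this sketch does not handle). The function names and the toy sequence are made up for illustration.

```python
def flank(sequence, s_start, before=15, after=40):
    """Return the window from s_start - before to s_start + after.

    Coordinates are 1-based and inclusive; the window is clipped at the
    ends of the sequence.
    """
    first = max(1, s_start - before)
    last = min(len(sequence), s_start + after)
    return sequence[first - 1:last]


def left_ends(hits, replicons):
    """Collect one window per hit; each hit is a (subject_name, s_start) pair."""
    return [flank(replicons[subject], s_start) for subject, s_start in hits]


# Toy data (illustrative only): one 200-letter replicon and two hit positions.
replicons = {"chromosome": "ACGT" * 50}
hits = [("chromosome", 20), ("chromosome", 100)]
for window in left_ends(hits, replicons):
    print(len(window), window)
```

Lining up the collected windows, as the next slide does, is what exposes the shared sequence at the left boundary of the repeated element.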
101
BioLingua
> (ALIGNMENT-OF * LINE-LENGTH 60 SEGMENT-LENGTH 60)
::
Seq 4      1  TACCTTTGT-GATAGGGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 7      1  -ACTATTGTAGAGACTGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 2      1  -AAAGAAACGCTAGAGGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 5      1  AAATTAGTTTATTA-TGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 6      1  -CACCGATTCACTAATGGGTGCATCCCAGTTTTTATTAT--TCCAAAACAAATAAATAA---------------
Seq 8      1  ----------AAACTGGGATGCA-CCCAGTCTCTACAATAGTTCTAGA-GAACACATAACGTAAATAC------
Seq 3      1  ----------AAACTGGGATGCACCCC--TTATTAATGCTCTTTGGAGTCAATAC-TAATTTTGCCAAA-----
Seq 9      1  -----------CATTGTCGCCCCTTGAAGTCATCAAGAC-----TAGGTGTATCAATGACTCCTGAAGAAGA--
Seq 12     1  ------------------GTTCAGCTTGGTAATAGCTGTAGTTAATAATGCGAGAGCGATGTTTTTCGAGATAA
Seq 1      1  ---------TACGCTCTATCTTCAGCAAGTTGTTTTTCT--TGCTGTATAATTCGGCGATTCTCTTC-------
Seq 10     1  --------------GGTCGGGAAATTGCGAGATTATTCAGTGGCGAAGTAGTGGGAGAACTACCATTGAT----
Seq 11     1  ------------TTGAACAAATTTGTTCGTGGAAATGGTAATTGGAAATTTGCTGCGGAATGCGGTGA------
Seq 13     1  ------------ATTATTAACTACAGCTATTACCAAGCTGAACAACTGTGTTCTATTGGTTCTGGTTC------
consensus  1
102
Genetic Basis of Differentiation NH3 N2 Nostoc + Anabaena Not Synechocystis, Trichodesmium,…
103
BioLingua
> (DEFINE diff-cb AS (Npun Avar A7120) DISPLAY off)
:: "List of length 3 suppressed"
> (DEFINE non-diff-cb AS (REMOVE-FROM-SET *loaded-organisms* diff-cb) DISPLAY off)
:: "List of length 10 suppressed"
> (DEFINE diff-cb-specific AS (COMMON-ORTHOLOGS-OF diff-cb NOT-IN non-diff-cb) DISPLAY off)
:: "List of length 661 suppressed"
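The last query asks for genes shared by all three differentiating cyanobacteria and absent from every other loaded organism. One way to picture that filter in Python is shown below. The sketch assumes orthology has already been computed and is stored as a mapping from an ortholog-group label to the set of organisms containing a member. That data layout, the function name and the group labels are assumptions for illustration, not a description of how BioLingua stores orthologs.

```python
def specific_ortholog_groups(groups, required, excluded):
    """Return labels of groups present in every required organism and in no excluded one."""
    required = set(required)
    excluded = set(excluded)
    selected = []
    for label, organisms in groups.items():
        organisms = set(organisms)
        if required <= organisms and not (organisms & excluded):
            selected.append(label)
    return selected


# Toy ortholog groups (illustrative only), using organism nicknames from the slides.
groups = {
    "og1": {"Npun", "Avar", "A7120"},            # shared by all three, nowhere else
    "og2": {"Npun", "Avar", "A7120", "S6803"},   # also in a non-differentiating strain
    "og3": {"Npun", "Avar"},                     # missing from A7120
}
diff_cb = ["Npun", "Avar", "A7120"]
non_diff_cb = ["S6803", "Tery"]
print(specific_ortholog_groups(groups, diff_cb, non_diff_cb))
```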
104
BioLingua
Provides knowledge in accessible form
Provides tools accessed in common way
Provides results that can be manipulated
Provides a programming language that speaks to biologists
105
The Death of Science
107
Credits
West Coast: Jeff Shrager, JP Massar, Mike Travers
VCU: Austin Hess, James Mastros, Sarah Cousins, Yue Zhao
BioLingua: http://ramsites.net/~biolingua/help
Jeff Elhai: Center for the Study of Biological Complexity, Virginia Commonwealth University
Phone: 828-0794
E-mail: ElhaiJ@VCU.Edu