WSSP Chapter 7 BLASTN: DNA vs DNA searches
© 2014 WSSP
DSAP: BLASTn Page p. 7-1
p. 7-1 NCBI BLAST Home Page
p. 7-2 NCBI BLASTN search page
p. 7-2 Copy the sequence from DSAP or the waveform program
p. 7-3 Choose a database (nr/nt or est)
p. 7-4 Search options (use defaults)
p. 7-5 BLASTN progress report (the search may take a few minutes)
p. 7-5 Format options (use defaults)
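The same choices made on the search page (program, database, query sequence) can also be expressed as a scripted request. The sketch below only builds the request string and does not contact NCBI. It assumes the parameter names of the NCBI BLAST URL API (CMD, PROGRAM, DATABASE, QUERY); the helper name build_blastn_request is hypothetical, and the database names accepted by that interface may differ from the ones on the web form.

```python
from urllib.parse import urlencode

# Assumption: endpoint and parameter names follow the NCBI BLAST URL API.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastn_request(query_seq, database="nt"):
    """Return URL-encoded form data for submitting a BLASTN search.

    Hypothetical helper: it only assembles the parameters, it sends nothing.
    """
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastn",   # DNA query vs DNA database
        "DATABASE": database,  # e.g. the nucleotide collection
        "QUERY": query_seq,    # sequence copied from DSAP
    }
    return urlencode(params)


# First bases of the EX1.14 query shown in the alignments of this chapter.
form_data = build_blastn_request(
    "GATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGG"
)
print(BLAST_URL + "?" + form_data)
```

All other search and format options are left at their defaults, as on the slides.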
p. 7-6 EX1.14 BLASTN nr/nt database
Graphic report of EX2.09 p. 7-7
p. 7-7 BLASTN list of matches for EX1.14
EX2.09 BLASTN p. 7-9
Clicker Question: Which match is the most meaningful? A) B) C) D) E) None
Clicker Question: Which part of the gene appears to be the most conserved? A) Bp B) Bp C) Bp D) All E) None
Clicker Question: The entire insert of a clone was sequenced and a BLASTN search was performed. Are these matches likely to be significant? A) Yes B) No C) Cannot tell from data
Question: Which of the following E values indicates the best match? A) 1e-10 B) 5e-91 C) 5.3 D) 0.0 E) Cannot tell from this data
Best match to EX1.14 p. 7-9
Figure labels: Our Seq., Database Seq., Length of sequence, Mismatch, Match
Perfect, but short, matches are not usually meaningful
>gi| |emb|AL |CNS07EFY Human chromosome 14 DNA sequence BAC R-736L22 of library RPCI-11 from chromosome 14 of Homo sapiens (Human), complete sequence
Score = 40.1 bits (20), Expect = 4.6
Identities = 20/20 (100%)
Query: 189  ttttctgaatattcataata  208
            ||||||||||||||||||||
Sbjct:      ttttctgaatattcataata
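To see why a perfect 20-base match can still have an Expect value near 5, it helps to relate the bit score to the E value. The sketch below uses the standard relation E = (search space) × 2^(−bit score). The search space is not given on the slide, so it is back-calculated here from the reported score and Expect value; that figure is an assumption for illustration only, and the function names are hypothetical.

```python
def expect_value(bit_score, search_space):
    """Expected number of chance hits scoring at least bit_score."""
    return search_space * 2.0 ** (-bit_score)


def implied_search_space(bit_score, e_value):
    """Search space (query length x database length) implied by a reported hit."""
    return e_value * 2.0 ** bit_score


# Back-calculate the search space from the short perfect match on the slide
# (Score = 40.1 bits, Expect = 4.6). Illustrative assumption, not an NCBI figure.
space = implied_search_space(40.1, 4.6)
print(f"implied search space: {space:.2e}")

# Bit scores reported elsewhere in this chapter, evaluated in the same space.
for bits in (40.1, 71.6, 219.0, 246.0):
    print(f"{bits:6.1f} bits -> E = {expect_value(bits, space):.1e}")
```

Because E falls by half for every additional bit of score, the long 77% identical matches later in the chapter end up with far smaller E values than this short 100% identical one. The printed values will not reproduce the reported Expect values exactly, since each search has its own effective search space.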
Examine the best alignments: are they significant? p. 7-9
Mismatches
i) Bad sequence on our part
ii) Bad sequence on their part
iii) Differences in the sequence of the two organisms
       C   R   E   L   L   I   L   D   A
Query  TGT CGT GAA CTC CTA ATT CTC GAC GCC
       ||| ||| ||| ||  ||  ||  ||  ||  ||
Sbjct  TGT CGT GAA CTT CTG ATC CTT GAT GCA
       C   R   E   L   L   I   L   D   A
Query: 383  AGCGTTGCCGTTCGTCAGCTTGATGTTAAGCTGGGCAGCGCGCTCGACGATTCCTTTGCG  324
            |||||| |||||||||||||||||||| | ||| || ||||||||||||||||| |||||
Sbjct: 6152 AGCGTTTCCGTTCGTCAGCTTGATGTTCAACTGAGCGGCGCGCTCGACGATTCCCTTGCG  6211
Wobble position: same amino acid, but different codon (the genetic code is degenerate)
       C R R T P D P *
Query  TGTCGT-CGAACTCCTGATCCTTGA
       |||||| ||||||||||||||||||
Sbjct  TGTCGTCCGAACTCCTGATCCTTGA
       C R E L L I L D
Small gaps alter the reading frame of the protein
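The two effects above (silent wobble differences and frame-shifting gaps) can be checked by translating the sequences. This is a minimal sketch that assumes the standard genetic code and reads each sequence from its first base; the translate helper is written for this example and is not part of DSAP or BLAST.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate DNA in frame 1, ignoring spaces and alignment gaps ('-')."""
    dna = dna.upper().replace("-", "").replace(" ", "")
    usable = len(dna) - len(dna) % 3  # drop an incomplete final codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Wobble example: six third-position differences, identical protein.
query = "TGT CGT GAA CTC CTA ATT CTC GAC GCC"
sbjct = "TGT CGT GAA CTT CTG ATC CTT GAT GCA"
print(translate(query), translate(sbjct))

# Gap example: the query as sequenced (gap character removed) reads in a
# shifted frame after the sixth base and runs into a stop codon.
frameshifted = "TGTCGT-CGAACTCCTGATCCTTGA"
print(translate(frameshifted))
```

Both wobble sequences translate to the same peptide, while the query from the gap slide translates to the shifted peptide shown above it.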
An example of a match with and without gaps.
Query: 179  TTCGAGCTACCAGATGATC-GATTGGAACAT-T-C--TGTCATTG-AC-CTTC-AGGTAA  230
            |||||||  || | | ||  ||||  || || | |  | | |||  |  |||| |||| |
Sbjct: 4684 TTCGAGCG-CC-GTTAATATGATTACAATATCTACAATATTATTATATGCTTCCAGGTGA  4741
Query: 231  TCAACCATGACCGTGTCAACCGAAACGACGTTATCGGCCGTGCACTATTGAACATGGAGG  290
            |||| ||||||||||| ||||| || || || || |||||||| ||  | || ||||| |
Sbjct: 4742 TCAATCATGACCGTGTTAACCGTAATGATGTAATTGGCCGTGCCCTTCTTAATATGGAAG  4801
Alignment of the third best match to EX1.14
>gi| |dbj|AK | Triticum aestivum cDNA, clone: SET5_E05, cultivar: Chinese Spring
Length=650
Score = 219 bits (242), Expect = 2e-53
Identities = 211/271 (77%), Gaps = 0/271 (0%)
Query  10   GATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGA  69
            |||| ||||||||| |||||  || || |||||||||||||||   |||||||||  | |
Sbjct  78   GATGCTGGAAGGGAAGGCGACGGTGGAGGACACCGACATGCCGGCCAAGATGCAGCTGCA  137
Query  70   GGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCAAGAGCCT  129
            |||||     || || ||    |||||||| |  |||||||||   ||||||  |||| |
Sbjct  138  GGCCACCTCGGCGGCGTCCAGGGCGCTCGAACGCTTCGACGTCCTCGACTGCCGGAGCAT  197
Query  130  CGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGTGCGTCGT  189
            ||| ||||| ||||||||||| || || | |||| |||| ||||| ||||||||||| ||
Sbjct  198  CGCGGCGCACATCAAGAAGGAGTTCGACACGATCCACGGCCCGGGGTGGCAGTGCGTGGT  257
Query  190  CGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACTTCCGCCT  249
             |||| |||||||||||| | |||||| ||||  || || |||||||| ||||||   ||
Sbjct  258  GGGCTGCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATATACTTCAAGCT  317
Query  250  GGAGACGCTCCACTTCCTCATCTTCAAAGGC  280
             ||| ||||||  |||||| |||||||||||
Sbjct  318  CGAGTCGCTCCGGTTCCTCGTCTTCAAAGGC  348
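The Identities and Gaps figures in a BLAST report can be recomputed from the aligned Query and Sbjct strings. The sketch below is a minimal column-by-column count, using the first row of the Triticum aestivum alignment and the first row of the gapped example; the function names are made up for this illustration.

```python
def alignment_stats(query, sbjct):
    """Return (identities, gap_columns, alignment_length) for two aligned rows."""
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have the same length")
    identities = gaps = 0
    for q, s in zip(query.upper(), sbjct.upper()):
        if q == "-" or s == "-":
            gaps += 1          # a gap in either row
        elif q == s:
            identities += 1    # identical bases
    return identities, gaps, len(query)


def match_line(query, sbjct):
    """Rebuild the '|' line that BLAST prints between Query and Sbjct."""
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query.upper(), sbjct.upper())
    )


# First row of the third best match to EX1.14 (no gaps).
q1 = "GATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGCAGGCGGA"
s1 = "GATGCTGGAAGGGAAGGCGACGGTGGAGGACACCGACATGCCGGCCAAGATGCAGCTGCA"

# First row of the gapped example.
q2 = "TTCGAGCTACCAGATGATC-GATTGGAACAT-T-C--TGTCATTG-AC-CTTC-AGGTAA"
s2 = "TTCGAGCG-CC-GTTAATATGATTACAATATCTACAATATTATTATATGCTTCCAGGTGA"

for q, s in ((q1, s1), (q2, s2)):
    ident, gaps, length = alignment_stats(q, s)
    print(f"Identities = {ident}/{length} ({100 * ident // length}%), "
          f"Gaps = {gaps}/{length}")
    print(match_line(q, s))
```

Summing the identities over all rows of an alignment gives the total in the report header, for example 211/271 for the match above.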
Alignments near the end of EX1.13
>gi| |ref|NG_ | Homo sapiens glypican 4 (GPC4), RefSeqGene on chromosome X
Length=
Score = 71.6 bits (78), Expect = 6e-09
Identities = 42/44 (95%), Gaps = 0/44 (0%)
Query  665  CTAGCTTTTCTTAACaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  708
            || ||||||||||| |||||||||||||||||||||||||||||
Sbjct       CTTGCTTTTCTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
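Most of this match is a run of A's, the kind of poly-A stretch found at the end of many cDNAs, so a low E value here says little about the gene itself. One rough way to flag such hits is to measure how much of the aligned query is a single-base run. The sketch below is only a heuristic; the helper names are hypothetical and any cut-off a reader applies to its output is their own assumption, not a BLAST setting.

```python
from itertools import groupby


def longest_run(seq):
    """Return (base, length) of the longest run of one repeated base."""
    runs = ((base, sum(1 for _ in group)) for base, group in groupby(seq.upper()))
    return max(runs, key=lambda run: run[1])


def base_fraction(seq, base):
    """Fraction of the sequence made up of the given base."""
    seq = seq.upper()
    return seq.count(base.upper()) / len(seq)


# Aligned query region from the end of EX1.13 (lowercase as shown in the report).
query_region = "CTAGCTTTTCTTAACaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

base, run_length = longest_run(query_region)
print(f"longest run: {run_length} x {base}")
print(f"fraction A: {base_fraction(query_region, 'A'):.2f}")
```

A hit dominated by one long single-base run deserves a closer look before it is entered in the table of best matches.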
Question: Is this match biologically significant? A) Yes B) No C) Cannot tell from data
Question: Is this match biologically significant? A) Yes B) No C) Cannot tell from data
Clicker Question: Is this match likely in a protein coding region? A) Yes B) No C) Cannot tell from data
Clicker Question: What is the likely explanation for the gap? A) Sequence error in cDNA B) Error in making the cDNA C) Start of an intron region D) Cannot tell from data E) A, B or C
Clicker Question: Is this match likely in a protein coding region? A) Yes B) No C) Cannot tell from data
Fill in the table listing the best matches from three different organisms. List Landoltia if there is a match.
Use the clone report to obtain more information about the gene
Is this a significant match? a) Yes b) No
3) Perform a BLASTn search of the est database: change the database
BLASTn report of the EX1.14 search of the est database
Alignment of the best match to EX1.14 from the est search
>gi| |gb|GD | CCHY28888.g1 CCHY Panicum virgatum callus (N) Panicum virgatum cDNA clone CCHY ', mRNA sequence.
Length=624
Score = 246 bits (272), Expect = 1e-61
Identities = 226/286 (79%), Gaps = 0/286 (0%)
Strand=Plus/Minus
Query  3    GAGAGAAGATGTTGGAAGGGAGGGCGAGAGTAGAAGACACCGACATGCCGAGGAAGATGC  62
            |||| |  ||| ||||||||| |||||  || || ||||| |||||||||  ||||||||
Sbjct  527  GAGACACCATGCTGGAAGGGAAGGCGATGGTGGAGGACACGGACATGCCGGCGAAGATGC  468
Query  63   AGGCGGAGGCCATGAACGCCGCCTCTCACGCGCTCGATCTGTTCGACGTCGCGGACTGCA  122
            ||||| |||| |||   || || ||    || ||||| |  |||||||||   ||||||
Sbjct  467  AGGCGCAGGCGATGGCGGCGGCGTCCAGGGCCCTCGACCGCTTCGACGTCCTCGACTGCC  408
Query  123  AGAGCCTCGCCGCGCATATCAAGAAGGAATTTGATAAGATCTACGGTCCGGGATGGCAGT  182
             |||| |||| ||||| ||||||||||| ||||| | |||| |||| || || ||||| |
Sbjct  407  GGAGCATCGCGGCGCACATCAAGAAGGAGTTTGACACGATCCACGGCCCCGGGTGGCAAT  348
Query  183  GCGTCGTCGGCTCCAGCTTCGGCTGTTTCTTCACTCACAAGAAAGGCAGCTTCATCTACT  242
            |||| || ||||||||||||||||| | |||||| ||||  || || |||||||||||||
Sbjct  347  GCGTGGTGGGCTCCAGCTTCGGCTGCTACTTCACGCACAGCAAGGGGAGCTTCATCTACT  288
Query  243  TCCGCCTGGAGACGCTCCACTTCCTCATCTTCAAAGGCGCGGCCGC  288
            |||| || ||| |||||   ||||||||||||||||| ||||| ||
Sbjct  287  TCCGGCTCGAGTCGCTCAGGTTCCTCATCTTCAAAGGGGCGGCAGC  242
Fill out the DSAP table of the BLASTn search of the est database
Query  61   CAAGGTCTAAGTACTGAAAAGGAAAGTCTACTAATTACAAAGAAGTTATTGTTTGTACCT  120
            |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Sbjct       CAAGGTCTAAGTACTGAAAAGGAAAGTCCACTAATTACAAAGAAGTTATTGTTTGTACCT
Query  121  TTTGTATCAGGGTTTATTAAATTTCAATCTTTATTGCTGAATCCCGAAACAAGGTGATCT  180
            |||||||||||||||||||||||| |||||| ||||||||||||||||||||||||||||
Sbjct       TTTGTATCAGGGTTTATTAAATTTTAATCTTCATTGCTGAATCCCGAAACAAGGTGATCT
Open Question: Why are there differences in the sequences?
Q5. BLASTn Analysis: Is your cDNA similar to genes in other organisms?
Q6. BLASTn Analysis: Is your cDNA similar to genes in different kingdoms? That is, are there any matches to organisms from the eubacteria, archaebacteria, protist, fungi, or animal kingdoms, or are they all matches to other plants?
Is the sequence found in many other organisms?