Baseline: Are we at the same stage?
Cygwin installed
BLAST installed
Data files: TA496Seq1.txt, PhytophSeq1.txt, TomatoSequence.txt
Were the files completely downloaded? In Cygwin:
Try: grep -c ">" PhytophSeq1.txt
    3,921
Try: grep -c ">" TA496Seq1.txt
    116,711
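The download check can be wrapped in a small helper. This is a minimal sketch: the function name count_fasta_records and the stand-in demo file are illustrative and not part of the course materials. It anchors the pattern at the start of the line so that only FASTA header lines are counted.

```shell
#!/bin/sh
# Count FASTA records by counting header lines (lines that start with ">").
# count_fasta_records is a hypothetical helper name, not part of BLAST or Cygwin.
count_fasta_records() {
    # The quotes keep the shell from treating > as output redirection.
    grep -c '^>' "$1"
}

# Small stand-in file so the sketch runs without the course data.
demo=$(mktemp)
printf '>seq1\nACGT\n>seq2\nGGCC\nTTAA\n>seq3\nAC\n' > "$demo"
count_fasta_records "$demo"    # the demo file has three header lines
rm -f "$demo"
```

Run against the course files, count_fasta_records PhytophSeq1.txt and count_fasta_records TA496Seq1.txt should report 3921 and 116711 if the downloads are complete.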
Format the database:
    /cygdrive/c/Blast/bin/formatdb -i ./TA496Seq1.txt -p F
Run nucleotide BLAST (blastn):
    /cygdrive/c/Blast/bin/blastall -p blastn -d ./TA496Seq1.txt -i ./TomatoSequence.txt -o TomatoSeqOut.txt
    /cygdrive/c/Blast/bin/blastall -p blastn -d ./TA496Seq1.txt -i ./PhytophSeq1.txt -o PhytOut.txt
NOTE: The second BLAST, which compares 3,921 sequences to a database of 116,711 sequences, will take some time (about 15 minutes on my laptop).
OUTPUT of BLAST of TA496Seq1.txt with TomatoSequence.txt

                                                 Score    E
Sequences producing significant alignments:      (bits)  Value

gi| |gb|BE |BE EST tomato flower buds,
gi| |gb|BI |BI EST tomato flower, anth
gi| |gb|AI |AI EST tomato ovary, TAMU S
gi| |gb|AW |AW EST tomato germinating s
gi| |gb|AW |AW EST tomato flower buds
gi| |gb|BI |BI EST tomato flower, anth
gi| |gb|BI |BI EST tomato flower, anth
OUTPUT of BLAST of TA496Seq1.txt with TomatoSequence.txt

>gi| |gb|BE |BE EST tomato flower buds, anthesis, Cornell University
 Solanum lycopersicum cDNA clone cTOD9L3, mRNA sequence
 Length = 632

 Score = 1237 bits (624), Expect = 0.0
 Identities = 630/632 (99%)
 Strand = Plus / Plus

Query: 1504 gactggctagaatggctgcaatcatggcatctacttacaaggcttatcttggcgtcggac 1563
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1    gactggctagaatggctgcaatcatggcatctacttacaaggcttatcttggcgtcggac 60

Query: 1564 ttggtccactatcatttttgacgcagtatagaataccacatcctggaagagttggtggaa 1623
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61   ttggtccactatcatttttgacgcagtatagaataccacatcctggaagagttggtggaa 120
In Cygwin:
Try: grep -c "Strand =" ./TomatoSeqOut.txt
    82
Try: grep -c "Strand =" ./PhytOut.txt
    292,568
Try: grep -c "Expect = 0.0" ./TomatoSeqOut.txt
    3
Try: grep -c "Expect = 0.0" ./PhytOut.txt
    54,643
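These counts can be gathered in one pass per file. The following is a minimal sketch under stated assumptions: summarize_blast is a hypothetical helper name, and it assumes the plain-text report written by blastall, where each query section begins with "Query=" and each alignment carries one "Strand =" line. Note that the plain pattern "Expect = 0.0" also matches values such as 0.003, so the sketch uses a stricter pattern that requires the value to end the line; whether that matches the intent of the counts above is an assumption.

```shell
#!/bin/sh
# summarize_blast is a hypothetical helper; it assumes blastall plain-text output.
summarize_blast() {
    # grep -c still prints 0 when nothing matches but returns non-zero, hence "|| true".
    queries=$(grep -c '^Query=' "$1" || true)
    alignments=$(grep -c 'Strand =' "$1" || true)
    # Stricter than "Expect = 0.0": the value must end the line (a trailing CR is allowed).
    expect_zero=$(grep -c 'Expect = 0\.0[[:space:]]*$' "$1" || true)
    echo "queries=$queries alignments=$alignments expect_zero=$expect_zero"
}

# Stand-in report so the sketch runs without BLAST installed.
demo=$(mktemp)
cat > "$demo" <<'EOF'
Query= demo1
 Score = 1237 bits (624), Expect = 0.0
 Strand = Plus / Plus
 Score = 40.1 bits (20), Expect = 0.003
 Strand = Plus / Minus
Query= demo2
EOF
summarize_blast "$demo"
rm -f "$demo"
```

Because grep -c counts matching lines rather than individual matches, the "Strand =" count is the number of alignments, not the number of distinct database sequences hit.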
When we have a large output file from BLAST, how can we find out what is inside? How can we organize and interpret this output when the file is too large to open in a text editor?
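One way to start answering these questions is to stay on the command line, where a file is streamed rather than loaded whole. The sketch below is illustrative only: peek and top_subjects are hypothetical helper names, and top_subjects assumes that each alignment section in the report starts with ">" followed by the database sequence identifier.

```shell
#!/bin/sh
# peek and top_subjects are hypothetical helper names used only for illustration.

# Print the line count of a file, then its first lines (default 20).
peek() {
    wc -l < "$1"
    head -n "${2:-20}" "$1"
}

# Tally how often each database sequence appears as a hit, most frequent first.
# Assumes alignment sections start with ">" followed by the subject identifier.
top_subjects() {
    grep '^>' "$1" | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Stand-in report so the sketch runs without BLAST installed.
demo=$(mktemp)
printf '>subjA first hit\n Length = 10\n>subjB other hit\n>subjA first hit\n' > "$demo"
peek "$demo" 2
top_subjects "$demo"
rm -f "$demo"
```

For paging through the real files, less ./PhytOut.txt reads the file as you scroll, so it stays usable even on reports that a text editor cannot open.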