David Wishart February 18th, 2004 Lecture 3 BLAST (c) 2004 CGDN.

1 David Wishart February 18th, 2004 Lecture 3 BLAST (c) 2004 CGDN

2 BLAST Basic Local Alignment Search Tool
A heuristic method for performing local alignments by searching for high-scoring segment pairs (HSPs)
First to use statistics to predict the significance of initial matches
Offers both sensitivity and speed
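The HSP idea can be illustrated with a short sketch. This is not BLAST's actual implementation: it assumes a simple +1/-1 match/mismatch score in place of a substitution matrix, and the function name `extend_hsp` and the `x_drop` cutoff are hypothetical choices made for illustration. Starting from an exact word hit, it extends in both directions without gaps and keeps the best-scoring segment pair.

```python
def extend_hsp(query, subject, q_start, s_start, word_size,
               match=1, mismatch=-1, x_drop=3):
    """Extend an exact word hit in both directions without gaps.

    Returns (score, query_segment, subject_segment).
    The scoring values and x_drop cutoff are illustrative assumptions.
    """
    def score(a, b):
        return match if a == b else mismatch

    # Score of the seed word itself.
    seed_score = sum(score(query[q_start + i], subject[s_start + i])
                     for i in range(word_size))

    def extend(q_pos, s_pos, step):
        # Walk outward from the seed; remember the best gain seen so far
        # and stop once the running score falls x_drop below that best.
        running = best_gain = best_len = length = 0
        while 0 <= q_pos < len(query) and 0 <= s_pos < len(subject):
            running += score(query[q_pos], subject[s_pos])
            length += 1
            if running > best_gain:
                best_gain, best_len = running, length
            elif best_gain - running > x_drop:
                break
            q_pos += step
            s_pos += step
        return best_gain, best_len

    right_gain, right_len = extend(q_start + word_size, s_start + word_size, 1)
    left_gain, left_len = extend(q_start - 1, s_start - 1, -1)

    q_lo = q_start - left_len
    q_hi = q_start + word_size + right_len
    s_lo = s_start - left_len
    total = seed_score + left_gain + right_gain
    return total, query[q_lo:q_hi], subject[s_lo:s_lo + (q_hi - q_lo)]


if __name__ == "__main__":
    print(extend_hsp("CCCACGTACGTGGG", "TTTACGTACGTAAA", 5, 5, 3))
```

In this sketch the extension is trimmed back to the highest-scoring point, so mismatching flanks do not lower the reported HSP score.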

3 BLAST
Looks for clusters of nearby or locally dense “similar or homologous” k-tuples
Uses “look-up” tables to shorten search time
Uses a larger “word size” than FASTA to accelerate the search process
Performs local (not global) alignment
Fastest and most frequently used sequence alignment tool -- THE STANDARD
Lecture 3.1
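The look-up table idea can be sketched as follows. This is a simplified illustration under stated assumptions, not BLAST's own code: it indexes only exact words from the query, whereas protein BLAST also considers similar "neighbourhood" words, and the function names `build_word_index`, `find_seed_hits` and `hits_per_diagonal` are hypothetical.

```python
from collections import Counter, defaultdict


def build_word_index(query, word_size):
    """Map every word (k-tuple) in the query to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(query) - word_size + 1):
        index[query[i:i + word_size]].append(i)
    return index


def find_seed_hits(query, subject, word_size=3):
    """Return (query_pos, subject_pos) pairs where an exact word is shared."""
    index = build_word_index(query, word_size)
    hits = []
    for j in range(len(subject) - word_size + 1):
        # One dictionary lookup per subject word replaces a scan of the query.
        for i in index.get(subject[j:j + word_size], ()):
            hits.append((i, j))
    return hits


def hits_per_diagonal(hits):
    """Count hits on each diagonal (subject_pos - query_pos).

    Several hits on one diagonal indicate a locally dense cluster of words.
    """
    return Counter(j - i for i, j in hits)


if __name__ == "__main__":
    hits = find_seed_hits("ACGTAC", "TTACGT", word_size=3)
    print(hits)
    print(hits_per_diagonal(hits))
```

A larger word size produces fewer chance hits and so fewer candidate regions to examine, which is the speed-up the slide refers to, at some cost in sensitivity.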

4 Uses of BLAST
Identifying species: With BLAST, you may be able to identify a species and/or find homologous species.
Locating domains: When working with a protein sequence, you can submit it to BLAST to locate known domains within the sequence of interest.
Establishing phylogeny: Using the results received through BLAST, you can create a phylogenetic tree on the BLAST web page. Phylogenies based on BLAST alone are less reliable than purpose-built computational phylogenetic methods, so they should only be relied upon for "first pass" phylogenetic analyses.
DNA mapping: When working with a known species and looking to sequence a gene at an unknown location, BLAST can compare the chromosomal position of the sequence of interest to relevant sequences in the database(s).
Comparison: When working with genes, BLAST can locate common genes in two related species and can be used to map annotations from one organism to another.

5 BLAST Access
NCBI BLAST
Canadian Bioinformatics Resource BLAST
European Bioinformatics Institute BLAST

9 Different Flavours of BLAST
BLASTP - protein query against protein DB
BLASTN - DNA/RNA query against GenBank (DNA)
BLASTX - 6-frame translated DNA query against protein DB
TBLASTN - protein query against 6-frame translated GenBank
TBLASTX - 6-frame translated DNA query against 6-frame translated GenBank
PSI-BLAST - protein ‘profile’ query against protein DB
PHI-BLAST - protein pattern against protein DB
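The "6-frame translation" used by BLASTX, TBLASTN and TBLASTX can be sketched as below. This is an illustrative sketch only: it assumes the standard genetic code, marks any codon it cannot resolve as `X`, and uses hypothetical names (`six_frame_translations`, `translate`, `reverse_complement`) that are not part of any BLAST interface.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate three reading frames on each strand.

    Keys are '+1', '+2', '+3' for the given strand and
    '-1', '-2', '-3' for the reverse complement.
    """
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCTAA").items():
        print(name, protein)
```

Each of the six protein strings is then compared at the amino acid level, which is why the translated searches can find similarities that a DNA-level comparison misses.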

10 Other BLAST Services
MEGABLAST - for comparison of large sets of long DNA sequences
RPS-BLAST - Conserved Domain Detection
BLAST 2 Sequences - for performing pairwise alignments for 2 chosen sequences
Genomic BLAST - for alignments against select human, microbial or malarial genomes

11 Running NCBI BLAST

12 MT0895 MMKIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIMGRVASKEEIKKILS

13 Running NCBI BLAST
Paste in the sequence (FASTA format, raw sequence, or a GI or accession number)
>Mysequence MT0895
KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS
OR
>
KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS
OR
KIQIYGTGCANCQMLEKNAREAVKELGIDAEFEKIKEMDQILEAGLTALPGLAVDGELKIDS
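The accepted input styles can be illustrated with a small parser. This is a hedged sketch, not how the NCBI form processes input: the function name `parse_query` is hypothetical, and it handles only FASTA and raw sequence text, not GI or accession numbers (which would require a database lookup).

```python
def parse_query(text):
    """Split pasted input into (header, sequence).

    Accepts a FASTA record (a '>' line followed by sequence lines) or a raw
    sequence. The header is '' when none is given. Whitespace inside the
    sequence is dropped and letters are uppercased.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty query")

    header = ""
    if lines[0].startswith(">"):
        header = lines[0][1:].strip()
        lines = lines[1:]

    sequence = "".join("".join(line.split()) for line in lines).upper()
    if not sequence:
        raise ValueError("no sequence found after the header line")
    return header, sequence


if __name__ == "__main__":
    print(parse_query(">Mysequence MT0895\nKIQIYGTGCANC\nQMLEKNAREAV"))
    print(parse_query(">\nKIQIYGTGCANC"))
    print(parse_query("kiqiyg tgcanc"))
```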

14 Running NCBI BLAST
Choose a range of interest in the sequence with “set subsequences” (not usually used)
Select the database from the pull-down menu (usually choose nr = non-redundant)
Leave “Options” unchanged (use defaults)

15 Running NCBI BLAST Select Database

16 Running NCBI BLAST Click BLAST!

17 Formatting Results

18 BLAST Format Options

19-24 BLAST Output

25 BLAST Parameters
Identities - number and % of exact residue matches
Positives - number and % of similar plus identical matches
Gaps - number and % of gaps introduced
Score - summed HSP score (S)
Bit Score - a normalized score (S’)
Expect (E) - expected number of HSP alignments arising by chance
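These quantities can be computed with a short sketch. It relies on the standard Karlin-Altschul relations, S' = (lambda * S - ln K) / ln 2 and E = m * n * 2^(-S'), where m and n are the query and database lengths. The statistical parameters lambda and K depend on the scoring system and must be supplied by the caller; the function names are hypothetical, and "Positives" is omitted because it requires a substitution matrix.

```python
import math


def alignment_stats(aligned_query, aligned_subject, gap="-"):
    """Count identities and gaps in a pairwise alignment of equal-length strings."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    length = len(aligned_query)
    pairs = list(zip(aligned_query, aligned_subject))
    identities = sum(a == b and a != gap for a, b in pairs)
    gaps = sum(a == gap or b == gap for a, b in pairs)
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "pct_identity": 100.0 * identities / length,
        "pct_gaps": 100.0 * gaps / length,
    }


def bit_score(raw_score, lam, k):
    """Normalize a raw score S to a bit score S' = (lam * S - ln k) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, query_len, db_len):
    """Expected number of chance HSPs with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    print(alignment_stats("ACG-T", "ACGAT"))
    # lam and k below are placeholder values chosen so that S' equals S.
    bits = bit_score(10, lam=math.log(2), k=1.0)
    print(bits, expect_value(bits, query_len=100, db_len=1000))
```

Because the bit score already accounts for lambda and K, bit scores from searches using different scoring systems can be compared directly, while E also depends on the size of the search space.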

26 Conclusions
BLAST is the most important program in bioinformatics (maybe all of biology)
BLAST is based on sound statistical principles (key to its speed and sensitivity)
A basic understanding of its principles is key for using/interpreting BLAST output
Use BLASTN or MEGABLAST for DNA
Use PSI-BLAST for protein searches

