1 Improved tools for biological sequence comparison Authors: William R. Pearson, David J. Lipman Publisher: Proc. Natl. Acad. Sci. USA, 1988 Presenter: Hsin-Mao Chen Date: 2010/04/28
2 Outline Introduction Step 1 Step 2 Step 3 Step 4 Result
3 Introduction FASTA is a heuristic method for sequence comparison. The FASTA program is a more sensitive derivative of the FASTP program (1985).
4 Introduction [Figure: SW (Smith-Waterman) compared with FASTA]
5 Step 1 Use a lookup table to locate all identities between two DNA or amino acid sequences. Figure panels: dot matrix, lookup table.
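6 Step 1 Dot matrix (1 marks an identity, . marks no identity)
      F L W R T W S
    T . . . . 1 . .
    W . . 1 . . 1 .
    K . . . . . . .
    T . . . . 1 . .
    W . . 1 . . 1 .
    T . . . . 1 . .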
7 Step Lookup table, ktup (k-tuple) = 2
    A: TCGGA TTCGT ACGGT ACGGA TC
    B: GTAAA CCACA
10 Step 1
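The lookup-table idea in Step 1 can be sketched as follows. This is a minimal illustration, not the FASTA source: the function names `build_lookup` and `diagonal_hits` are hypothetical, and counting hits per diagonal is a simplification of how the densest regions are located. Sequence A is indexed by k-tuple, then sequence B is scanned once, and every shared k-tuple is recorded on its diagonal (offset i - j).

```python
from collections import defaultdict


def build_lookup(seq, ktup):
    """Map each k-tuple in seq to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - ktup + 1):
        table[seq[i:i + ktup]].append(i)
    return table


def diagonal_hits(a, b, ktup):
    """Count shared k-tuples on each diagonal, keyed by offset i - j."""
    table = build_lookup(a, ktup)
    hits = defaultdict(int)
    for j in range(len(b) - ktup + 1):
        # .get avoids creating empty entries for k-tuples absent from a
        for i in table.get(b[j:j + ktup], []):
            hits[i - j] += 1
    return dict(hits)


# Sequences from the lookup-table slide, with spaces removed.
a = "TCGGATTCGTACGGTACGGATC"
b = "GTAAACCACA"
lookup = build_lookup(a, 2)
print(lookup["CG"])            # start positions of "CG" in A
print(diagonal_hits(a, b, 2))  # diagonal offset -> number of 2-tuple hits
```

Building the table costs one pass over A and scanning costs one pass over B plus the number of hits, which is why a larger ktup makes the search faster and less sensitive.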
11 Step 2 Rescore the 10 best regions from Step 1 using a scoring matrix (PAM250, BLOSUM50) and save the best initial regions that score above a threshold.
12 Step 2 PAM250
13 Step 2
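A rough sketch of the Step 2 rescoring, under stated assumptions: `score_pair` is a toy match/mismatch function standing in for a real PAM250 or BLOSUM50 lookup, and the names `best_region_on_diagonal` and `initial_regions` as well as the threshold value are hypothetical. For each candidate diagonal it finds the ungapped stretch with the highest substitution score, then keeps only regions above the threshold.

```python
def score_pair(x, y, match=5, mismatch=-4):
    """Toy substitution score; a stand-in for a PAM250/BLOSUM50 lookup."""
    return match if x == y else mismatch


def best_region_on_diagonal(a, b, diag):
    """Return (score, start_i, end_i) of the best ungapped run on the
    diagonal i - j == diag; end_i is exclusive, positions index into a."""
    i = max(diag, 0)
    j = i - diag
    best = (0, i, i)  # empty region
    run, run_start = 0, i
    while i < len(a) and j < len(b):
        if run <= 0:  # a non-positive prefix never helps, so restart here
            run, run_start = 0, i
        run += score_pair(a[i], b[j])
        if run > best[0]:
            best = (run, run_start, i + 1)
        i += 1
        j += 1
    return best


def initial_regions(a, b, diagonals, threshold):
    """Rescore each diagonal and keep (diag, score, start_i, end_i)
    for regions scoring above threshold, best first."""
    regions = [(d, *best_region_on_diagonal(a, b, d)) for d in diagonals]
    kept = [r for r in regions if r[1] > threshold]
    return sorted(kept, key=lambda r: -r[1])


print(initial_regions("AAACGTAAA", "TTTCGTTTT", [0, 1], threshold=10))
```

Trimming to the maximal-scoring run means the ends of a region contain only residues that contribute to its score.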
14 Step 3 FASTA calculates an optimal alignment of initial regions as a combination of compatible regions with maximal score.
15 Step 3
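One way to picture Step 3 is as a chaining problem. The sketch below is an assumption-laden illustration, not the FASTA implementation: regions are tuples `(score, a_start, a_end, b_start, b_end)` with exclusive ends, two regions are treated as compatible when one ends before the other begins in both sequences, and each join costs a constant `join_penalty` (a hypothetical parameter). Dynamic programming then picks the combination with the maximal total score.

```python
def best_chain_score(regions, join_penalty):
    """Best total score of a chain of compatible regions.

    regions: list of (score, a_start, a_end, b_start, b_end), ends exclusive.
    """
    order = sorted(regions, key=lambda r: (r[1], r[3]))
    best = []  # best[k] = best chain score that ends with order[k]
    for k, (score, a_s, _a_e, b_s, _b_e) in enumerate(order):
        prev = 0  # option: start a new chain at this region
        for m in range(k):
            _, _, pa_e, _, pb_e = order[m]
            # compatible: predecessor ends before this region in both sequences
            if pa_e <= a_s and pb_e <= b_s:
                prev = max(prev, best[m] - join_penalty)
        best.append(score + prev)
    return max(best, default=0)


regions = [
    (50, 0, 10, 0, 10),
    (40, 15, 25, 18, 28),
    (30, 5, 12, 30, 40),
]
print(best_chain_score(regions, join_penalty=20))
```

With a large enough joining penalty the best chain degenerates to the single best initial region, so joining only helps when the regions are individually strong.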
16 Step 4 This final comparison considers all possible alignments of the query and library sequence that fall within a band centered around the highest scoring initial region.
17 Step 4
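The banded comparison in Step 4 can be sketched as a Smith-Waterman recurrence restricted to cells near one diagonal. This is a simplified illustration under assumptions: the match, mismatch, and linear gap values are placeholders rather than FASTA's parameters, the function name `banded_local_score` is hypothetical, and only the score is returned, not the alignment. Cells outside the band are treated as score 0, which in a local alignment is the same as starting a fresh alignment.

```python
def banded_local_score(a, b, center, width, match=5, mismatch=-4, gap=-8):
    """Local alignment score using only cells with |(i - j) - center| <= width."""
    prev = {}  # column j -> score for row i - 1 (1-based indices)
    best = 0
    for i in range(1, len(a) + 1):
        cur = {}
        lo = max(1, i - center - width)
        hi = min(len(b), i - center + width)
        for j in range(lo, hi + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            s = max(
                0,
                prev.get(j - 1, 0) + sub,  # extend along the diagonal
                prev.get(j, 0) + gap,      # a[i-1] aligned to a gap
                cur.get(j - 1, 0) + gap,   # b[j-1] aligned to a gap
            )
            cur[j] = s
            best = max(best, s)
        prev = cur
    return best


# The match CGT lies on diagonal i - j = 3, so the band is centered there.
print(banded_local_score("AAACGTAAA", "CGT", center=3, width=1))
```

The work is proportional to the sequence length times the band width instead of the product of both lengths, at the cost of missing alignments that leave the band.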
18 Result ktup = 4, IBM PC/AT microcomputer: completed in under 15 min