8
Expect value (E-value)
The expected number of hits, of equivalent or better score, found by random chance in a database of the size searched.
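To make the dependence on database size concrete, here is a minimal sketch that uses the standard approximation E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the database length. The function name and the example numbers are hypothetical, and real BLAST additionally applies effective-length corrections that are ignored here.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value: E = m * n * 2**(-S'), with S' the bit score.

    Simplification: uses raw lengths, not BLAST's effective lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical numbers, chosen only for illustration.
e_small_db = expect_value(bit_score=40.0, query_len=250, db_len=1_000_000)
e_large_db = expect_value(bit_score=40.0, query_len=250, db_len=100_000_000)

print(f"small database: E = {e_small_db:.3g}")
print(f"large database: E = {e_large_db:.3g}")
```

The same score yields a proportionally larger E-value in the larger database, which is why an E-value is only meaningful relative to the size of the database searched.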
12
Conserved domains
Domain: a sequence of amino acids that typically folds into a stable tertiary structure. Many proteins are multi-domain.
16
BLAST to PSI-BLAST
BLAST makes use of a general scoring matrix derived from a large number of proteins. What if you want to find homologs based upon a specific gene product? Develop a position-specific scoring matrix (PSSM).
17
PSSM
Counts of each amino acid (columns) observed at each position (rows: M, G, A, S, F):

     M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
M    5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
G    1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
A    1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
S    0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
F    0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Determine the frequency of each substitution at each position, then convert it to a log-odds score.
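The conversion from counts to log-odds scores can be sketched as follows. The counts are read off the slide's table as five rows (M, G, A, S, F) of twenty columns. The add-one pseudocount and the uniform background frequency of 1/20 are assumptions made for illustration; the slide does not say how the frequencies are smoothed or which background is used.

```python
import math

AMINO_ACIDS = "MFWYGAPVILCRKENDQSTH"  # column order used on the slide

# Observed counts per position, read off the slide's table (zeros omitted).
COUNTS = [
    {"M": 5},
    {"M": 1, "G": 1, "L": 1, "E": 1, "Q": 1},
    {"M": 1, "A": 4},
    {"S": 3, "T": 2},
    {"F": 4, "Y": 1},
]


def log_odds_pssm(counts, pseudocount=1.0):
    """Turn per-position counts into log2(observed frequency / background).

    Assumptions: add-`pseudocount` smoothing and a uniform background.
    """
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((column.get(aa, 0) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


pssm = log_odds_pssm(COUNTS)
for position, row in zip("MGASF", pssm):
    best = max(row, key=row.get)
    print(position, best, round(row[best], 2))
```

Residues seen often at a position get positive scores, and residues never seen there get negative ones.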
18
PSSM
Counts of each amino acid (columns) observed at each position (rows: M, G, A, S, F):

     M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
M    5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
G    1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
A    1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
S    0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
F    0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

A PSSM can also include a score for permitting insertions and deletions (indels). Perhaps this position is at a turn, where indels are common.
Indel: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
19
PSSM
In evaluating (scoring) alignments, PSSM approaches typically:
– Reward matches to columns that have conserved amino acids
– Penalize mismatches to columns with a conserved amino acid more than mismatches in a variable column
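The two rules above can be illustrated with a small scoring sketch. The two-column profile and all of its score values are hypothetical, chosen only so that the conserved column has a large match reward and a large mismatch penalty while the variable column has mild scores. The "default" key is an illustrative shortcut for every residue not listed.

```python
# Hypothetical log-odds scores for a two-column profile (illustrative only).
PSSM = [
    {"M": 6, "default": -4},                  # conserved column: M strongly preferred
    {"G": 1, "L": 1, "E": 1, "default": -1},  # variable column: mild scores
]


def score(seq, pssm):
    """Sum the position-specific scores of an ungapped, equal-length sequence."""
    if len(seq) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    return sum(col.get(res, col["default"]) for res, col in zip(seq, pssm))


print(score("MG", PSSM))  # match in both columns
print(score("MW", PSSM))  # mismatch in the variable column
print(score("WG", PSSM))  # mismatch in the conserved column
```

With these assumed values, a mismatch in the variable column lowers the score by 2 relative to the full match, while a mismatch in the conserved column lowers it by 10.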
20
PSI-BLAST
Input a single query sequence. The program executes a BLAST run, takes the significant hits, and incorporates the matches into a PSSM. Sequences >98% similar are not included (to avoid biasing the PSSM).
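The redundancy filter can be sketched as below. This is a simplified reading of the slide, not PSI-BLAST's actual code: it assumes the sequences are already aligned to equal length, uses percent identity as a stand-in for "similar", and keeps a sequence only if it is no more than 98% identical to every sequence already kept. The function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions with identical residues (equal lengths assumed)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def drop_near_duplicates(seqs, threshold=98.0):
    """Greedy filter: skip any sequence more than `threshold` % identical to a kept one."""
    kept = []
    for s in seqs:
        if all(percent_identity(s, k) <= threshold for k in kept):
            kept.append(s)
    return kept


# Artificial sequences, built only to land on either side of the threshold.
query = "A" * 100
near_copy = "A" * 99 + "G"       # 99% identical to the query
diverged = "A" * 90 + "G" * 10   # 90% identical to the query

filtered = drop_near_duplicates([query, near_copy, diverged])
print(len(filtered))
```

Without such a filter, many near-identical copies of one sequence would dominate the column counts and bias the PSSM toward that sequence.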
21
Power of approach: PSI-BLAST is iterative. Each round takes the best hits and uses them to improve the scoring matrix.
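The iteration can be sketched with a toy loop. Everything here is a simplifying assumption rather than PSI-BLAST's real procedure: sequences are ungapped and the same length as the query, hits are accepted by a fixed score threshold instead of an E-value, the background is uniform, and add-one pseudocounts are used. The database sequences and function names are made up for illustration.

```python
import math
from collections import Counter

ALPHABET = "MFWYGAPVILCRKENDQSTH"
BACKGROUND = 1.0 / len(ALPHABET)  # assumed uniform background
PSEUDOCOUNT = 1.0                 # assumed add-one smoothing


def build_profile(seqs):
    """Log-odds profile from equal-length, ungapped sequences."""
    n = len(seqs)
    denom = n + PSEUDOCOUNT * len(ALPHABET)
    profile = []
    for column in zip(*seqs):
        counts = Counter(column)
        profile.append({
            aa: math.log2(((counts[aa] + PSEUDOCOUNT) / denom) / BACKGROUND)
            for aa in ALPHABET
        })
    return profile


def score(seq, profile):
    """Sum of position-specific scores for an ungapped sequence."""
    return sum(col[res] for res, col in zip(seq, profile))


def iterative_search(query, database, threshold=3.0, max_rounds=10):
    """Search, rebuild the profile from the hits, and repeat until hits stop changing."""
    hits, history = [], []
    for _ in range(max_rounds):
        profile = build_profile([query] + hits)
        new_hits = [s for s in database if score(s, profile) >= threshold]
        history.append(len(new_hits))
        if new_hits == hits:
            break
        hits = new_hits
    return hits, history


# Made-up database: increasingly diverged relatives plus one unrelated sequence.
database = ["MLASF", "MEASY", "MQTSY", "WWWWW"]
hits, history = iterative_search("MGASF", database)
print(hits, history)
```

In this toy run each round admits a more distant relative because the profile has learned which positions tolerate substitutions. The same mechanism also means that one wrongly included hit can pull later rounds toward unrelated sequences.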
24
Original Blast had 84 hits.
26
The PSSM will skew towards this region