Expect value (E-value): expected number of hits, of equivalent or better score, found by random chance in a database of the size searched.


8 Expect value (E-value) The expected number of hits, of equivalent or better score, found by random chance in a database of the size searched.
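The slide gives only the definition. As a rough illustration, the sketch below uses the standard Karlin-Altschul relationship E = m × n × 2^(−S'), where m is the query length, n the total database length, and S' the bit score. The function name and the example lengths are assumptions made for illustration, and real BLAST applies effective-length corrections that are omitted here.

```python
def evalue_from_bit_score(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with m = query length, n = total database
    length, and S' = normalized (bit) score. Edge-effect corrections that
    BLAST applies to m and n are deliberately left out of this sketch.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# The same bit score is less significant when the database is larger.
small_db = evalue_from_bit_score(40.0, query_len=250, db_len=10**6)
large_db = evalue_from_bit_score(40.0, query_len=250, db_len=10**9)
print(small_db, large_db)
```

Because E scales linearly with database size, a hit with a given score in a database 1000 times larger is expected 1000 times more often by chance.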

12 Conserved domains Domain: a sequence of amino acids that typically folds into a stable tertiary structure. Many proteins are multi-domain.

16 BLAST to PSI-BLAST BLAST uses a scoring matrix derived from a large number of proteins. What if you want to find homologs based upon a specific gene product? Develop a position-specific scoring matrix (PSSM).

17 PSSM
Rows are the query positions (M G A S F); columns are the amino acids observed at each position.
    M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
M   5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
G   1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
A   1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
S   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
F   0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Determine the frequency of each substitution, then convert it to a log-odds score.
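The conversion from counts to log-odds scores can be sketched as follows. This is a minimal illustration, not the scheme PSI-BLAST actually uses: it assumes a uniform background frequency for all 20 amino acids and a flat pseudocount of 1, and the function name `log_odds_column` is hypothetical. The counts are those that appear on the slide for the five query positions.

```python
import math

AMINO_ACIDS = "MFWYGAPVILCRKENDQSTH"  # column order used on the slide


def log_odds_column(counts, pseudocount=1.0, background=None):
    """Convert one column of residue counts into log-odds scores (bits)."""
    if background is None:
        # Assumption: uniform background; real tools use observed frequencies.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    observed = sum(counts.get(aa, 0) for aa in AMINO_ACIDS)
    total = observed + pseudocount * len(AMINO_ACIDS)
    scores = {}
    for aa in AMINO_ACIDS:
        # The pseudocount keeps unobserved residues from scoring -infinity.
        freq = (counts.get(aa, 0) + pseudocount) / total
        scores[aa] = math.log2(freq / background[aa])
    return scores


# Counts for the query positions M, G, A, S, F as shown on the slide.
count_columns = [
    {"M": 5},
    {"M": 1, "G": 1, "L": 1, "E": 1, "Q": 1},
    {"M": 1, "A": 4},
    {"S": 3, "T": 2},
    {"F": 4, "Y": 1},
]
pssm = [log_odds_column(col) for col in count_columns]
print(round(pssm[0]["M"], 2), round(pssm[0]["F"], 2))
```

Residues seen more often than the background expects get positive scores; residues never seen in a column get negative scores.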

18 PSSM
Rows are the query positions (M G A S F); columns are the amino acids observed at each position.
    M  F  W  Y  G  A  P  V  I  L  C  R  K  E  N  D  Q  S  T  H
M   5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
G   1  0  0  0  1  0  0  0  0  1  0  0  0  1  0  0  1  0  0  0
A   1  0  0  0  0  4  0  0  0  0  0  0  0  0  0  0  0  0  0  0
S   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  2  0
F   0  4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Can include a score for permitting insertions and deletions. Perhaps this position is at a turn, where indels are common.
Indel 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

19 PSSM In evaluating (scoring) alignments, PSSM approaches typically:
– Reward matches to columns that have conserved amino acids
– Penalize mismatches in columns with a conserved amino acid more than mismatches in a variable column
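The two points above can be made concrete with a small scoring sketch. The profile values below are hypothetical numbers chosen only to show the pattern (large reward and large penalty in a conserved column, small ones in a variable column), and `score_ungapped` is an illustrative helper, not part of any BLAST tool.

```python
# Hypothetical log-odds scores for a 3-column profile. "*" holds the default
# score for any residue not listed explicitly in that column.
PSSM = [
    {"M": 6, "*": -4},                  # conserved column: strong reward, strong penalty
    {"G": 1, "L": 1, "E": 1, "*": -1},  # variable column: mild reward, mild penalty
    {"A": 4, "M": 0, "*": -3},          # mostly conserved column
]


def score_ungapped(pssm, segment):
    """Sum the per-column scores for a segment aligned to the profile without gaps."""
    if len(segment) != len(pssm):
        raise ValueError("segment and profile must have the same length")
    return sum(col.get(res, col["*"]) for col, res in zip(pssm, segment))


print(score_ungapped(PSSM, "MGA"))  # 6 + 1 + 4 = 11, all columns match
print(score_ungapped(PSSM, "MWA"))  # 6 - 1 + 4 = 9, mismatch in the variable column
print(score_ungapped(PSSM, "WGA"))  # -4 + 1 + 4 = 1, mismatch in the conserved column
```

The same single substitution costs 2 points in the variable column but 10 points in the conserved one.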

20 PSI-BLAST Input a single query sequence. The program executes a BLAST run, takes the significant hits, and incorporates the matches into a PSSM. Sequences >98% similar are not included, to avoid biasing the PSSM.

21 Power of the approach: PSI-BLAST is iterative. Each round takes the best hits and uses them to improve the scoring matrix.
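The iterative idea can be sketched with a toy profile search. This is not PSI-BLAST itself: it assumes equal-length sequences aligned without gaps, a uniform background, a flat pseudocount, and a fixed raw-score threshold in place of an E-value cutoff. All names (`iterative_search`, `build_pssm`, `DATABASE`) and the example sequences are hypothetical.

```python
from collections import Counter
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_pssm(seqs, pseudocount=1.0):
    """Column-wise log-odds scores (bits) against a uniform background."""
    pssm = []
    for column in zip(*seqs):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        pssm.append({aa: math.log2((counts[aa] + pseudocount) / total * len(ALPHABET))
                     for aa in ALPHABET})
    return pssm


def score(pssm, seq):
    return sum(col[res] for col, res in zip(pssm, seq))


def iterative_search(query, database, threshold, max_rounds=5, max_identity=0.98):
    members = [query]  # sequences currently feeding the profile
    hits = set()
    for _ in range(max_rounds):
        pssm = build_pssm(members)
        new_hits = {s for s in database if score(pssm, s) >= threshold}
        if new_hits == hits:  # no change since the last round: converged
            break
        hits = new_hits
        # Leave out near-identical hits so they do not bias the profile.
        members = [query] + sorted(s for s in hits if identity(query, s) <= max_identity)
    return hits


DATABASE = ["MGASF", "MGASY", "MGATF", "MLATY", "WWWWW"]
first_round = iterative_search("MGASF", DATABASE, threshold=3.0, max_rounds=1)
converged = iterative_search("MGASF", DATABASE, threshold=3.0)
print(sorted(first_round))
print(sorted(converged))
```

In this toy data set the distant sequence MLATY scores below the threshold against the query alone, but is picked up once the profile has absorbed the closer hits from the first round.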

24 The original BLAST search had 84 hits.

26 The PSSM will skew towards this region

