1 Identifying property based sequence motifs in protein families and superfamilies: application to DNase-1 related endonucleases
Venkatarajan S. Mathura et al.
Presented by Mr. Hat

2 Motivation
“Statistically derived matrices based on allowed substitution of amino acids are not designed to detect conservation of physical–chemical properties”
HMMER, PSI-BLAST, PHI-BLAST, and RPS-BLAST, to name a few
MASIA could complement these existing gene mining tools

3 Methods
Created quantitative descriptors E1–E5 that describe amino acid properties and their physical interpretation
Derived from a comprehensive list of 237 physical–chemical properties (PCPs)
Conservation is measured by the standard deviation and relative entropy of the E1–E5 values
Venkatarajan and Braun 2001
Defined a minimum length cutoff and a maximum gap threshold
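
The two conservation measures named on this slide can be written out as below. This is a sketch in assumed notation, not the authors' exact formulation: the symbols (a_ij, N, the bins b, and the background distribution q) are introduced here for illustration, and the slides do not say how the descriptor values are binned or which background is used.

```latex
% Assumed notation: a_{ij} is the residue of sequence i (of N) at alignment
% column j, and E_k(a) is the k-th descriptor value (k = 1..5) of amino acid a.
\[
\bar{E}_k(j) = \frac{1}{N}\sum_{i=1}^{N} E_k(a_{ij}), \qquad
\sigma_k(j) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(E_k(a_{ij}) - \bar{E}_k(j)\bigr)^2}
\]
% Relative entropy (Kullback-Leibler divergence) of the column's binned
% distribution p against a background distribution q over the same bins b.
\[
D_k(j) = \sum_{b} p_{k,b}(j)\,\log\frac{p_{k,b}(j)}{q_{k,b}}
\]
```

A small standard deviation for a descriptor at a column means the residues there have similar values of that property, even when the amino acids themselves differ.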

5 Experiment
Used APE family sequences from 42 organisms
Both prokaryotes and eukaryotes
Used taxonomic classification to remove much of the redundant data
Each motif is represented as a “profile”
Consisting of average values, standard deviations, and relative entropies for each vector E1–E5
MASIA MOTIF MAKER
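
To make the "profile" idea concrete, here is a minimal Python sketch that turns an ungapped motif alignment into per-position averages, standard deviations, and relative entropies for five descriptors. It is an illustration only, not the MASIA implementation: the descriptor numbers are made-up placeholders (not the published Venkatarajan and Braun values), and the function names, the three-bin discretization, and the uniform background over the table are all assumptions.

```python
import math
from statistics import mean, pstdev

# HYPOTHETICAL E1..E5 values for a few amino acids. Placeholder numbers for
# illustration only, NOT the published Venkatarajan and Braun (2001) values.
DESCRIPTORS = {
    "A": (-0.6, -0.2, 0.1, 0.3, -0.1),
    "D": (0.9, 0.4, -0.3, 0.2, 0.5),
    "E": (0.8, 0.1, -0.2, -0.4, 0.3),
    "K": (0.7, -0.5, 0.6, 0.1, -0.3),
    "L": (-0.9, 0.3, 0.2, -0.1, 0.0),
    "V": (-0.8, 0.2, -0.1, 0.4, 0.2),
}


def bin_index(value, lo, hi, n_bins):
    """Map a descriptor value onto one of n_bins equal-width bins."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)


def relative_entropy(column_values, background_values, n_bins=3):
    """KL divergence (in bits) of the binned column vs. the binned background."""
    lo, hi = min(background_values), max(background_values)

    def distribution(values):
        counts = [0] * n_bins
        for v in values:
            counts[bin_index(v, lo, hi, n_bins)] += 1
        return [c / len(values) for c in counts]

    p = distribution(column_values)
    q = distribution(background_values)
    # Column residues come from the same table, so q > 0 wherever p > 0.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def build_profile(motif_rows):
    """Return, per position and per descriptor, mean, std, and relative entropy."""
    background = list(DESCRIPTORS.values())
    profile = []
    for pos in range(len(motif_rows[0])):
        position = []
        for k in range(5):
            column = [DESCRIPTORS[row[pos]][k] for row in motif_rows]
            bg = [vec[k] for vec in background]
            position.append({
                "mean": mean(column),
                "std": pstdev(column),
                "rel_entropy": relative_entropy(column, bg),
            })
        profile.append(position)
    return profile


# Toy motif: three aligned two-residue rows.
profile = build_profile(["AD", "AE", "AD"])
```

In the toy motif the first column is always alanine, so every descriptor there has zero standard deviation, while the second column mixes aspartate and glutamate and shows a small spread.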

6 Experiment (cont.)
Used these profiles to search the ASTRAL40 database
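
The slides do not describe how a profile is scored against a database sequence, so the following Python sketch only illustrates the general idea of sliding a motif profile along a sequence. The scoring rule (negative sum of squared z-scores with a floor on the standard deviation) is an assumption chosen for simplicity, not MASIA's scoring function, and the descriptor numbers, `STD_FLOOR`, `window_score`, and `scan` are hypothetical.

```python
# HYPOTHETICAL E1..E5 values, placeholder numbers for illustration only.
DESCRIPTORS = {
    "A": (-0.6, -0.2, 0.1, 0.3, -0.1),
    "D": (0.9, 0.4, -0.3, 0.2, 0.5),
    "E": (0.8, 0.1, -0.2, -0.4, 0.3),
    "K": (0.7, -0.5, 0.6, 0.1, -0.3),
    "L": (-0.9, 0.3, 0.2, -0.1, 0.0),
}

# Assumed floor so that perfectly conserved positions do not divide by zero.
STD_FLOOR = 0.1


def window_score(window, profile):
    """Score one window: 0.0 is a perfect match, more negative is worse."""
    total = 0.0
    for residue, position in zip(window, profile):
        for value, (mu, sigma) in zip(DESCRIPTORS[residue], position):
            z = (value - mu) / max(sigma, STD_FLOOR)
            total -= z * z
    return total


def scan(sequence, profile):
    """Return (best_score, start_index) over all windows of the profile width."""
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif profile")
    scored = [
        (window_score(sequence[i:i + width], profile), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Toy profile for a two-residue motif centred on "LD": each position holds a
# (mean, std) pair for each of the five descriptors.
PROFILE = [
    [(v, 0.05) for v in DESCRIPTORS["L"]],
    [(v, 0.05) for v in DESCRIPTORS["D"]],
]

best_score, best_start = scan("AKLDE", PROFILE)
```

Because windows are compared in property space, a window can score well without sharing exact residues with the motif, which is the kind of match substitution-matrix searches are not designed to find.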

10 Example score matrix for motif 2 and its corresponding E1–E5 values
* means low relative entropy
+ means significant component
- means not a significant component

11 Results
MASIA tool found all DNase-like superfamily members in ASTRAL40
But this doesn’t show specificity??
PSI-BLAST, default parameters
Used all 42 sequences to seed PSI-BLAST
Performed local and NCBI PSI-BLAST
Searched “non-redundant sequence database” – NR/NT???
Found no DNase-I or IPP sequences after several iterations

12 Results (cont.)
PSI-BLAST (cont.)
E-value threshold was increased to 0.1
DNase-I was found after four iterations, but it also brought in 500 other junk sequences
Failed to find DNase-I in the ASTRAL40 database

14 How ’bout them PCP motifs and MASIA! This could possibly improve my gene hunting capabilities! Now if I just had fingers to type! By the way, where are those fat BBS mice, I’m getting hungry!

