1
Evaluating alignments using motif detection: Let’s evaluate alignments by searching for motifs. If alignment X reveals more functional motifs than alignment Y using technique Z, then X is better than Y w.r.t. Z. Motifs could be functional sites in proteins or functional regions in non-coding DNA.
2
Protein Functional Site Prediction: Identifying the protein regions responsible for stability and function is an especially important post-genomic problem. With the explosion of genomic data from recent sequencing efforts, predicting protein functional sites from sequence alone is an increasingly important bioinformatics endeavor.
3
What is a “Functional Site”? Defining what constitutes a “functional site” is not trivial. Residues that include and cluster around known functionality are clear candidates for functional sites. We define a functional site as catalytic residues, binding sites, and the regions that cluster around them.
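The definition above can be sketched in code. This is only an illustration: the slides do not say how "clustering around" is measured, so the distance cutoff, the coordinate format, and the function name `functional_site` are all assumptions made here.

```python
import math


def functional_site(coords, known_residues, cutoff=5.0):
    """Return the known residues plus every residue close to one of them.

    coords: dict mapping residue number -> (x, y, z) of a representative atom.
    known_residues: residue numbers annotated as catalytic or binding.
    cutoff: assumed distance threshold (same units as coords); not from the slides.
    """
    site = set(known_residues)
    for residue, xyz in coords.items():
        # A residue "clusters around" the site if it lies within the cutoff
        # of at least one annotated residue.
        if any(math.dist(xyz, coords[k]) <= cutoff for k in known_residues):
            site.add(residue)
    return site


# Toy coordinates, purely illustrative.
toy_coords = {1: (0.0, 0.0, 0.0), 2: (3.0, 0.0, 0.0), 3: (10.0, 0.0, 0.0)}
print(sorted(functional_site(toy_coords, {1})))
```

In this toy input, residue 2 is 3 units from the annotated residue 1 and is included; residue 3 is 10 units away and is not.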
4
Protein
5
Protein + Ligand
6
Functional Sites (FS)
7
Regions that Cluster Around FS
8
Phylogenetic motifs: PMs are short sequence fragments that conserve the overall familial phylogeny. Are they functional? How do we detect them?
9
Phylogenetic motifs: PMs are short sequence fragments that conserve the overall familial phylogeny. Are they functional? How do we detect them? First we design a simple heuristic to find them. Then we see whether the detected sites are functional.
10-27
Scan for Similar Trees (animated figure). Each step compares a Windowed Tree with the Whole Tree and reports a Partition Metric Score. Scores shown on slides 13-27, in order: 6, 8, 4, 6, 8, 6, 6, 0, 6, 6, 8, 0, 6, 6, 6.
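The tree comparison on these slides can be sketched as follows. The sketch assumes the partition metric is the usual symmetric-difference count of bipartitions (often called the Robinson-Foulds distance), which is consistent with the even scores shown. The nested-tuple tree encoding and the function names are assumptions made for illustration, not the authors' implementation.

```python
def leaves(tree):
    """Leaf names of a tree given as a leaf string or a tuple of subtrees."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the leaf set induced by internal edges."""
    all_leaves = leaves(tree)
    ref = min(all_leaves)  # fixed reference leaf gives each split one canonical side
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - clade
        # Skip trivial splits (a single leaf on one side, or the whole tree).
        if len(clade) > 1 and len(other) > 1:
            found.add(clade if ref in clade else other)
        return clade

    visit(tree)
    return found


def partition_metric(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must be built over the same sequences")
    return len(splits(tree_a) ^ splits(tree_b))


# Toy trees over six sequences, purely illustrative.
whole = ((("A", "B"), "C"), ("D", ("E", "F")))
windowed = ((("A", "C"), "B"), ("D", ("E", "F")))
print(partition_metric(whole, windowed))
```

A score of 0 means the windowed tree has the same topology as the whole tree; larger scores mean more bipartitions disagree.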
28
Phylogenetic Motif Identification: Compare every windowed tree with the whole tree and record the partition metric scores. Normalize the partition metric scores by converting them to z-scores; call these normalized scores Phylogenetic Similarity Z-scores (PSZ). Set a PSZ threshold to identify the windows that represent phylogenetic motifs.
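The normalization and thresholding steps above can be sketched as below. The input is the sequence of partition metric scores shown on the Scan for Similar Trees slides, used only as sample data. Two things are assumptions here: the threshold value of -1.5, and the direction of the cut (windows at or below the threshold are kept, because a partition metric of 0 means the windowed tree matches the whole tree). The function names are hypothetical.

```python
from statistics import mean, pstdev


def psz_scores(partition_scores):
    """Convert raw partition metric scores into z-scores (PSZ)."""
    mu = mean(partition_scores)
    sigma = pstdev(partition_scores)  # population standard deviation
    if sigma == 0:
        raise ValueError("all windows have the same score; z-scores are undefined")
    return [(score - mu) / sigma for score in partition_scores]


def motif_windows(partition_scores, threshold=-1.5):
    """Indices of windows whose PSZ is at or below the (assumed) threshold."""
    return [
        index
        for index, z in enumerate(psz_scores(partition_scores))
        if z <= threshold
    ]


# Scores as shown on the slides, in order; window indices are 0-based.
slide_scores = [6, 8, 4, 6, 8, 6, 6, 0, 6, 6, 8, 0, 6, 6, 6]
print(motif_windows(slide_scores))
```

With this sample input, only the two windows that scored 0 fall below the assumed threshold. A looser threshold would admit more windows, which is the trade-off the Set PSZ Threshold slides illustrate.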
29
Set PSZ Threshold
30
Regions of PMs
31
Map PMs to the Structure
32
Set PSZ Threshold
33
Map PMs to the Structure Map Set PSZ Threshold
35
PMs in Various Structures
36
PMs and Traditional Motifs
37-47
Plots of Phylogenetic Similarity and False Positive Expectation for TIM (slides 37-40), Cytochrome P450 (slides 41-42), Enolase (slide 43), Glycerol Kinase (slides 44-45), and Myoglobin (slides 46-47).
48
Evaluating alignments: For a given alignment, compute the PMs. Determine the number of functional PMs. Alignments that identify more functional PMs are classified as better alignments.
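The comparison rule above can be sketched as follows. The slides do not say how a PM is judged functional, so this sketch assumes a PM counts as functional when its window overlaps at least one annotated functional-site position. The function names, the half-open window format, and the toy inputs are all hypothetical.

```python
def count_functional_pms(pm_windows, functional_positions):
    """Count PM windows that overlap at least one functional position.

    pm_windows: list of (start, end) column ranges, end exclusive.
    functional_positions: iterable of annotated functional-site columns.
    """
    functional = set(functional_positions)
    return sum(
        1
        for start, end in pm_windows
        if any(pos in functional for pos in range(start, end))
    )


def rank_alignments(pms_by_alignment, functional_positions):
    """Return per-alignment counts and the name(s) with the highest count."""
    counts = {
        name: count_functional_pms(windows, functional_positions)
        for name, windows in pms_by_alignment.items()
    }
    best = max(counts.values())
    winners = sorted(name for name, count in counts.items() if count == best)
    return counts, winners


# Toy input, purely illustrative; not results from the presentation.
toy_pms = {
    "alignment_X": [(10, 15), (40, 45), (90, 95)],
    "alignment_Y": [(10, 15), (60, 65)],
}
toy_sites = [12, 42, 43]
print(rank_alignments(toy_pms, toy_sites))
```

Ties are returned together rather than broken arbitrarily, since the rule on the slide only orders alignments that differ in their functional PM count.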
49
Protein datasets
50
Running time
51
Functional PMs. Color key: PAl = blue, MUSCLE = red, both = green. Panels: (a) enolase, (b) ammonia channel, (c) tri-isomerase, (d) permease, (e) cytochrome.