
A New Approach for Alignment of Multiple Proteins Adam Hebdon Zhang, Xu, Kahveci, Tamer, 2006, “A New Approach for Alignment of Multiple Proteins”, Pacific.


1 A New Approach for Alignment of Multiple Proteins
Adam Hebdon
Zhang, Xu, Kahveci, Tamer, 2006, “A New Approach for Alignment of Multiple Proteins”, Pacific Symposium on Biocomputing, 11:339-350.

2 Why Do We Need A New Approach?
Multiple sequence alignment is one of the most fundamental problems in bioinformatics.
–Used to make predictions, infer relationships between protein sequences, etc.
–More difficult when sequences are dissimilar
Traditional alignment methods add sequences one by one.
–Progressive
–Iterative
–Anchor-based
–Probabilistic
The number of sequences significantly affects the quality of the resulting alignment.

3 Why Do We Need A New Approach?
Progressive
–Never more than two sequences are aligned simultaneously
–Allows alignments of any size
Iterative
–Start with an initial alignment
–Repeatedly refine the alignment through a series of iterations until no further improvement is made

4 Why Do We Need A New Approach?
Anchor-Based
–Use local motifs as anchors for the alignment
–Unaligned segments are aligned using other methods
MAFFT, Align-m, L-align, Mavid, PRRP, etc.
Probabilistic
–Analyze known multiple alignments & precompute substitution probabilities
–Maximize alignment for the given sequences

5 Horizontal Sequence Alignment (HSA)
All proteins are considered at once and aligned simultaneously.
Particularly accurate in the “twilight zone”, where other methods are not.
–Twilight zone = percent identity below 25%
HSA performs similarly to traditional methods at high identity percentages.
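The twilight-zone threshold on the slide above is stated in terms of percent identity. As a minimal sketch, assuming identity is counted only over columns where neither row has a gap (other definitions exist, and the slides do not say which one is meant), a pairwise check might look like the following. The function names `percent_identity` and `in_twilight_zone` are hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where both rows are ungapped.

    Assumption: this denominator is one common convention, not necessarily
    the one used in the paper.
    """
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def in_twilight_zone(a: str, b: str, threshold: float = 25.0) -> bool:
    """True when the pair falls below the 25% identity threshold from the slide."""
    return percent_identity(a, b) < threshold
```

For two aligned rows that share one residue out of eight compared columns, `percent_identity` gives 12.5, so the pair would count as twilight-zone under this definition.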

6 Horizontal Sequence Alignment (HSA)
1. Construct Initial Directed Graph from AA Sequence, Secondary Structure, etc.
2. Group Vertices Based on Residue Type
3. Insert Gap Vertices to Align Similar Vertices Topologically
4. Determine Alignment from High Scoring Cliques with Sliding Window
5. Adjust Gap Vertices of Initial Alignment

7 Structure of Graph
[Figure panels: Sequences & Associated Secondary Structure; Vertices Representing Secondary Structure & Gaps; Horizontal Graph of Sequence & Color of Different Protein]

8 HSA Step 1:
A directed edge corresponds to consecutive AAs in each protein.
An undirected edge connects vertices whose substitution score is sufficient.
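The two edge types in Step 1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: vertices are `(protein_index, position)` pairs, undirected edges are only drawn between different proteins, `toy_score` stands in for a real substitution matrix, and the `threshold` value is made up.

```python
from itertools import combinations


def toy_score(x: str, y: str) -> int:
    # Assumption: placeholder for a real substitution matrix (e.g. BLOSUM62).
    return 1 if x == y else -1


def build_graph(seqs, score=toy_score, threshold=1):
    """Return (directed_edges, undirected_edges) for a list of sequences."""
    # Directed edges: consecutive residues within each protein.
    directed = [
        ((p, i), (p, i + 1))
        for p, s in enumerate(seqs)
        for i in range(len(s) - 1)
    ]
    # Undirected edges: residue pairs from different proteins whose
    # substitution score reaches the (hypothetical) threshold.
    undirected = []
    for (p, s), (q, t) in combinations(enumerate(seqs), 2):
        for i, x in enumerate(s):
            for j, y in enumerate(t):
                if score(x, y) >= threshold:
                    undirected.append(((p, i), (q, j)))
    return directed, undirected
```

The quadratic loop over residue pairs is kept for readability; it is only meant to show which pairs receive an edge.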

9 HSA Step 2:
Group Fragments Most Likely To Be Aligned Together
totalScore = typeScore - positionPenalty - lengthPenalty
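The slide gives the scoring formula but not how its three terms are computed. The sketch below therefore treats them as already-known numbers and only shows how the formula would rank candidate groupings; the `Candidate` class, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate grouping of fragments with its three score components."""
    name: str
    type_score: float
    position_penalty: float
    length_penalty: float

    @property
    def total_score(self) -> float:
        # totalScore = typeScore - positionPenalty - lengthPenalty
        return self.type_score - self.position_penalty - self.length_penalty


def best_candidate(candidates):
    """Pick the grouping with the highest total score."""
    return max(candidates, key=lambda c: c.total_score)


# Illustrative values only; they are not taken from the paper.
options = [
    Candidate("g1", type_score=10.0, position_penalty=2.0, length_penalty=3.0),
    Candidate("g2", type_score=9.0, position_penalty=0.5, length_penalty=1.0),
]
```

With these made-up numbers `g1` scores 5.0 and `g2` scores 7.5, so `best_candidate(options)` would select `g2` even though its type score is lower.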

10 HSA Step 3:
Update the graph with gaps so that like vertices are close topologically.

11 HSA Step 4:
Place a window over all sequences, where w = number of vertices to cover.
–Ex: w = 3, window covers the first 3 vertices of each sequence
For each clique, align the letters of the clique and find the next best clique.
–Clique = complete sub-graph consisting of one vertex of each color (1 column of the multiple alignment)
Slide the window down & find the next clique that:
–Doesn’t conflict with the previous clique
–Has 1 letter next to a letter in the previous clique
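A brute-force sketch of the clique search inside one window is given below. It assumes that "complete" means every pair of chosen letters scores at least `threshold`, that the best clique is the one with the largest sum of pairwise scores, and that the window starts at the same offset in every sequence; `toy_score` and `threshold` are placeholders. The conflict and adjacency conditions for the next clique are left out.

```python
from itertools import combinations, product


def toy_score(x: str, y: str) -> int:
    # Assumption: placeholder for a real substitution matrix.
    return 2 if x == y else -1


def best_clique(seqs, start, w, score=toy_score, threshold=1):
    """Return one position per sequence forming the best clique, or None."""
    # Window of up to w vertices per sequence, beginning at `start`.
    windows = [range(start, min(start + w, len(s))) for s in seqs]
    best, best_total = None, None
    for positions in product(*windows):  # one vertex of each "color"
        letters = [s[i] for s, i in zip(seqs, positions)]
        pair_scores = [score(x, y) for x, y in combinations(letters, 2)]
        if any(ps < threshold for ps in pair_scores):
            continue  # some pair lacks an edge, so this is not a clique
        total = sum(pair_scores)
        if best_total is None or total > best_total:
            best, best_total = positions, total
    return best
```

Enumerating every combination costs on the order of w to the power of the number of sequences, which is acceptable only because w is small; the sketch says nothing about how the paper searches the window.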

12 HSA Step 5:
Move gaps found inside a fragment of an alpha-helix or beta-sheet outside the fragment.
The final alignment is obtained by mapping each vertex in the final graph back to its original residue.
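The gap adjustment in Step 5 could be sketched for a single aligned row as below. The slides do not say to which side the gaps are moved, so this sketch assumes they are pushed to just after the end of the fragment. It also assumes one secondary-structure label per residue (`H` for helix, `E` for strand, anything else for coil) and treats a run of identical `H` or `E` labels as one fragment. The function name `move_gaps_out` is hypothetical.

```python
def move_gaps_out(row: str, ss: str, gap: str = "-") -> str:
    """Shift gaps lying inside a helix/strand fragment to just after it.

    row: one aligned sequence, including gap characters.
    ss:  one label per residue of that sequence (gaps carry no label).
    """
    if len(ss) != sum(1 for ch in row if ch != gap):
        raise ValueError("ss must have one label per residue")
    out = []
    pending = 0   # gaps seen after a helix/strand residue, not yet placed
    prev = None   # label of the most recent residue
    labels = iter(ss)
    for ch in row:
        if ch == gap:
            if prev in ("H", "E"):
                pending += 1      # might be inside a fragment; decide later
            else:
                out.append(gap)   # gap in coil or at the row start: keep it
            continue
        label = next(labels)
        if label != prev and pending:
            # The fragment has ended: place the held gaps at its boundary.
            out.extend(gap * pending)
            pending = 0
        out.append(ch)
        prev = label
    out.extend(gap * pending)
    return "".join(out)
```

For the row `AC-DEF` with labels `CHHHC`, the gap sits between two helix residues and is moved to the helix boundary, giving `ACDE-F`. The row length is unchanged, so the other rows of the alignment keep their columns.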

13 Observations of HSA
Identity = 0 – 20%
–HSA method significantly outperforms traditional alignment methods
Identity = 20 – 40%
–HSA method is comparable to traditional alignment methods
Identity = 40 – 100%
–Little improvement can be made over traditional alignment
–HSA performs slightly below traditional methods

