Published by Radovan Bařtipán. Modified over 6 years ago.
A Hybrid Algorithm for Multiple DNA Sequence Alignment
Dr. C. Thusangi Wannige and G. Kokila K. Perera
Department of Computer Science, University of Ruhuna, Sri Lanka
9/18/2018
AGENDA
Introduction
Problem Definition
Algorithm
Evaluation
Conclusion & Future Improvements
Introduction
Bioinformatics is defined as the application of computational techniques to organize and understand the information associated with biomolecules (N. M. Luscombe et al., 2001).
Center Star Method for Multiple Sequence Alignment
Multiple Sequence Alignment: infer the homologous regions within the biomolecular sequences.
Step 1: Identify the center sequence (Sc).
Step 2: Merge the pairwise alignments between the center sequence and the remaining sequences.
[Figure: star tree with Sc at the center, joined to S1, S2, S3, ..., Si (i≠c), ..., Sn. The pairwise alignments S’c1, S’c2, S’c3, ..., S’c(n-1) are merged by summing up the gaps in Sc, and the alignment is updated to S’c, S’1, S’2, S’3, ..., S’n.]
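The two steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the scoring values (`MATCH`, `MISMATCH`, `GAP`), the choice of the center as the sequence with the highest summed pairwise score, and the helper names `global_align` and `center_star` are all assumptions made for the example.

```python
from itertools import combinations

MATCH, MISMATCH, GAP = 1, -1, -2  # assumed scoring scheme, not from the slides


def global_align(a, b):
    """Needleman-Wunsch global alignment: returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP
    for j in range(1, m + 1):
        score[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + GAP,
                              score[i][j - 1] + GAP)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = MATCH if i > 0 and j > 0 and a[i - 1] == b[j - 1] else MISMATCH
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def center_star(seqs):
    """Returns (index of the center sequence, list of aligned rows)."""
    n = len(seqs)
    # Step 1: pick the sequence with the highest summed pairwise score (assumed rule).
    totals = [0] * n
    for i, j in combinations(range(n), 2):
        s = global_align(seqs[i], seqs[j])[0]
        totals[i] += s
        totals[j] += s
    c = max(range(n), key=lambda idx: totals[idx])

    # Step 2: merge each pairwise alignment into the growing alignment,
    # keeping every gap that has ever been opened in the center row.
    rows = {c: seqs[c]}
    for k in range(n):
        if k == c:
            continue
        _, new_c, new_k = global_align(seqs[c], seqs[k])
        old_c = rows[c]
        merged = {idx: [] for idx in rows}
        merged[k] = []
        i = j = 0
        while i < len(old_c) or j < len(new_c):
            old_gap = i < len(old_c) and old_c[i] == "-"
            new_gap = j < len(new_c) and new_c[j] == "-"
            if old_gap and not new_gap:      # gap column already in the alignment
                for idx in rows:
                    merged[idx].append(rows[idx][i])
                merged[k].append("-")
                i += 1
            elif new_gap and not old_gap:    # gap column introduced by sequence k
                for idx in rows:
                    merged[idx].append("-")
                merged[k].append(new_k[j])
                j += 1
            else:                            # columns agree in both center rows
                for idx in rows:
                    merged[idx].append(rows[idx][i])
                merged[k].append(new_k[j])
                i, j = i + 1, j + 1
        rows = {idx: "".join(col) for idx, col in merged.items()}
    return c, [rows[idx] for idx in range(n)]
```

Calling `center_star(["ACGT", "ACT", "AGT"])` returns the index of the chosen center and one gapped row per input sequence, all of the same length.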
Research Problem
Existence of more mutually related sequences
When the sequences exist as subsets of mutually related sequences, the sequences in each subset have partially conserved regions among them.
When the Center Star method is used in these scenarios, the partially conserved regions are hidden in the alignment.
[Figure legend: subset of sequences; sequence.]
Algorithm
The main output of this research is a more accurate algorithm for multiple sequence alignment (MSA).
Step 1: Identify subsets of sequences
Quantify distance between sequences: k-mers
Use statistics of the k-length substrings in the sequences to measure the distance between them.
Clustering method: bisecting K-means algorithm
A divisive hierarchical clustering method.
Initial cluster centroids: the two sequences with minimum similarity.
Terminating criterion: intra-cluster similarity > threshold (0.25).
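A rough Python sketch of Step 1 is given below. The slides do not state which k-mer statistic or which definition of intra-cluster similarity is used, so the shared k-mer fraction in `similarity`, the minimum pairwise similarity in `intra_similarity`, the value `k=3`, and the single assignment pass in `bisect` are assumptions chosen for illustration. Only the seeding rule (the two least similar sequences) and the 0.25 threshold come from the slides. Every sequence is assumed to be at least k characters long.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(seq, k=3):
    """Count every substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def similarity(a, b, k=3):
    """Shared k-mer fraction in [0, 1] (an assumed similarity measure)."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    shared = sum((ca & cb).values())  # multiset intersection of the two profiles
    return shared / min(sum(ca.values()), sum(cb.values()))


def intra_similarity(cluster, k=3):
    """Lowest pairwise similarity in the cluster (assumed definition)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 1.0  # a single sequence is trivially coherent
    return min(similarity(a, b, k) for a, b in pairs)


def bisect(cluster, k=3):
    """Split a cluster around its two least similar sequences."""
    seed_a, seed_b = min(combinations(cluster, 2),
                         key=lambda pair: similarity(pair[0], pair[1], k))
    left, right = [], []
    for seq in cluster:
        if similarity(seq, seed_a, k) >= similarity(seq, seed_b, k):
            left.append(seq)
        else:
            right.append(seq)
    return left, right


def bisecting_clusters(seqs, threshold=0.25, k=3):
    """Keep splitting until every cluster exceeds the similarity threshold."""
    done, todo = [], [list(seqs)]
    while todo:
        cluster = todo.pop()
        if intra_similarity(cluster, k) > threshold:
            done.append(cluster)
        else:
            todo.extend(bisect(cluster, k))
    return done
```

A full bisecting K-means would recompute centroids and reassign members until the split stabilises; the single pass here keeps the sketch short.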
Step 2: Obtain alignments in each subset
The sequences within each subset are aligned separately using the Center Star method.
Step 3: Merge the alignments from each subset
Align pairs of subsets together using profile alignment.
Measure the distance between two subsets using the alignment.
Apply UPGMA, a hierarchical clustering method, to build a guide tree that represents the relationships between the subsets.
Align the outputs of Step 2 for each cluster according to the order of the guide tree.
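The guide-tree construction in Step 3 can be illustrated with a small UPGMA sketch. It assumes the distances between subsets have already been measured from the profile alignments and are supplied as a dictionary; the function name `upgma`, the nested-tuple tree representation, and the example labels are hypothetical choices for this example.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a guide tree as nested tuples.

    names: list of subset labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    sizes = {name: 1 for name in names}  # tree node -> number of leaves under it
    d = dict(dist)
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Distance to every other cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    (tree,) = sizes
    return tree


distances = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 8.0,
}
guide_tree = upgma(["A", "B", "C"], distances)
```

Here subsets A and B are joined first, and C is attached last, so the subset alignments would be merged in that order.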
Evaluation
A comparative approach is used for evaluation.
Evaluation
Data set:
13 sequences corresponding to the envelope glycoprotein (env) gene, V3 region of the HIV-1 genome (Accession no: U U68508).
The transmission history of the patients from whom these sequences were obtained is known (Leitner T, Escanilla D, Franzén C, et al.).
Method:
The alignment output of the MSA algorithm is used for a phylogenetic analysis.
The resulting phylogeny is compared with the true transmission history using the Split Distance measure.
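As an illustration of the comparison measure, the sketch below computes a normalised split distance between two trees over the same set of leaves. The slides do not spell out the exact formula, so the normalisation used here (splits found in only one tree, divided by the total number of splits in both trees) is an assumption, as are the nested-tuple tree representation and the function names.

```python
def leaves(tree):
    """Set of leaf labels under a nested-tuple tree node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the leaves induced by the internal edges."""
    all_leaves = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_leaves - side
        # Edges that separate a single leaf appear in every tree, so skip them.
        if len(side) > 1 and len(other) > 1:
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def split_distance(tree_a, tree_b):
    """Fraction of splits that are not shared by the two trees (assumed normalisation)."""
    splits_a, splits_b = splits(tree_a), splits(tree_b)
    total = len(splits_a) + len(splits_b)
    if total == 0:
        return 0.0
    return len(splits_a ^ splits_b) / total


true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
distance = split_distance(true_tree, inferred_tree)
```

A value of 0 means the two trees contain exactly the same splits; larger values mean the inferred phylogeny departs further from the reference.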
Evaluation Results
Algorithm        Split Distance
New algorithm    0.3
Center Star      0.4
HAlign
ClustalW2        0.5
MUSCLE
MAFFT
Evaluation Results
[Figure: the actual phylogeny shown beside the phylogeny obtained from the alignment of the new algorithm. Split Distance = 0.3]
Conclusion
The new algorithm focuses on preserving the partially conserved regions in the sequences.
The alignment reflects the relationships between sequences more accurately.
The algorithm is more useful for phylogenetic studies.
Future improvement: reduce the memory and processor requirements of the algorithm, which employs dynamic programming on several occasions.
Questions and Suggestions