Parallel Characteristics of Sequence Alignments Kyle R. Junik.

Overview
- Introduction to papers
- Five sequence alignment algorithms
  - Needleman, Wunsch, and Sellers (NWS)
  - Fickett’s algorithm
  - Parallel NWS
  - Parallel Fickett’s
  - Wilbur and Lipman’s algorithm
- Case study: Pittsburgh Supercomputing Center

Needleman, Wunsch, & Sellers (NWS)
- Uses an alignment matrix with sequence S1 across the top and sequence S2 down the left
- Sequential dynamic-programming algorithm that calculates each matrix position in turn, starting in the upper-left corner
- A move down or right in the matrix incurs a gap penalty, gp
- A move along a diagonal is a match or substitution, with cost subs
- The value at any index (i,j) is min( (i-1,j) + gp, (i,j-1) + gp, (i-1,j-1) + subs )
- Running time = O( |S1| * |S2| )
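A minimal Python sketch of the matrix fill described on this slide may help. It assumes unit gap and mismatch costs (the parameter values and the function name `nws_distance` are illustrative, not from the slides) and takes the diagonal term to be the value at (i-1,j-1) plus subs.

```python
def nws_distance(s1, s2, gp=1, mismatch=1):
    """Fill the NWS matrix row by row and return the global alignment distance.

    gp and mismatch are assumed unit costs, chosen only for illustration.
    """
    rows, cols = len(s2) + 1, len(s1) + 1  # s1 across the top, s2 down the left
    d = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):               # first row: gaps only
        d[0][j] = j * gp
    for i in range(1, rows):               # first column: gaps only
        d[i][0] = i * gp
    for i in range(1, rows):
        for j in range(1, cols):
            subs = 0 if s2[i - 1] == s1[j - 1] else mismatch
            d[i][j] = min(d[i - 1][j] + gp,        # move down: gap
                          d[i][j - 1] + gp,        # move right: gap
                          d[i - 1][j - 1] + subs)  # diagonal: match or substitution
    return d[-1][-1]
```

Every cell is visited exactly once, which is where the O( |S1| * |S2| ) running time comes from.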

Fickett’s Algorithm
- Uses the same techniques as the NWS algorithm
- Adds an extra parameter, d_max, an upper bound on the value of any alignment in the matrix
- If the value at some index is greater than or equal to d_max, the indices that depend solely on that index need not be evaluated
- Worst-case running time is still O( |S1| * |S2| ), but it generally runs faster than NWS because it skips those indices
- Drawback: Fickett’s algorithm does not guarantee an alignment
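The pruning idea can be sketched in Python as follows. This is an illustration under stated assumptions rather than Fickett’s published procedure: costs are unit costs, the function name `fickett_distance` is hypothetical, and cells are pruned only when their value is strictly greater than d_max so that an alignment of distance exactly d_max is still found.

```python
import math

def fickett_distance(s1, s2, d_max, gp=1, mismatch=1):
    """Return the alignment distance if it is <= d_max, otherwise None.

    Cells whose value exceeds d_max are marked as pruned (infinity), and a
    cell whose three predecessors are all pruned is never evaluated.
    """
    INF = math.inf
    cols = len(s1) + 1
    prev = [j * gp if j * gp <= d_max else INF for j in range(cols)]
    for i in range(1, len(s2) + 1):
        cur = [INF] * cols
        if i * gp <= d_max:
            cur[0] = i * gp
        for j in range(1, cols):
            if prev[j] == INF and prev[j - 1] == INF and cur[j - 1] == INF:
                continue  # depends solely on pruned cells: skip evaluation
            subs = 0 if s2[i - 1] == s1[j - 1] else mismatch
            value = min(prev[j] + gp, cur[j - 1] + gp, prev[j - 1] + subs)
            if value <= d_max:
                cur[j] = value
        prev = cur
    return None if prev[-1] == INF else prev[-1]
```

In this sketch, `fickett_distance("kitten", "sitting", 3)` returns 3, while the same call with `d_max=2` returns `None`, showing how a bound set too low yields no alignment.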

Parallel NWS
- Inherently simple parallel implementation
- Each element in an anti-diagonal depends only on the two previous anti-diagonals
- Therefore all elements of an anti-diagonal can be calculated in parallel
- Running time is reduced to O( |S1| + |S2| ), given enough processors to handle a whole anti-diagonal at once
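The anti-diagonal ordering can be sketched in Python. The code below runs serially, but it computes each anti-diagonal from earlier ones only, so the cells inside one diagonal are independent and could be handed to separate processors. Unit costs and the name `nws_wavefront` are assumptions made for illustration.

```python
def nws_wavefront(s1, s2, gp=1, mismatch=1):
    """Fill the NWS matrix one anti-diagonal at a time.

    Returns (distance, waves), where waves is the number of anti-diagonals
    processed, i.e. the number of parallel steps an ideal machine would need.
    """
    rows, cols = len(s2) + 1, len(s1) + 1
    d = [[0] * cols for _ in range(rows)]
    waves = 0
    for k in range(rows + cols - 1):            # anti-diagonal: i + j == k
        i_lo, i_hi = max(0, k - (cols - 1)), min(k, rows - 1)

        def cell(i):
            j = k - i
            if i == 0:
                return j * gp
            if j == 0:
                return i * gp
            subs = 0 if s2[i - 1] == s1[j - 1] else mismatch
            return min(d[i - 1][j] + gp, d[i][j - 1] + gp, d[i - 1][j - 1] + subs)

        # Each cell reads only diagonals k-1 and k-2, so this list could be
        # computed in parallel; here it is evaluated serially.
        values = [cell(i) for i in range(i_lo, i_hi + 1)]
        for i, value in zip(range(i_lo, i_hi + 1), values):
            d[i][k - i] = value
        waves += 1
    return d[-1][-1], waves
```

The number of waves is |S1| + |S2| + 1, which matches the O( |S1| + |S2| ) bound when every cell of a diagonal is computed simultaneously.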

Parallel Fickett’s
- Uses the anti-diagonal concept from parallel NWS
- Redefines the process for eliminating elements from evaluation
- Similar speedup over parallel NWS as regular Fickett’s had over NWS
- Similar drawbacks to Fickett’s algorithm

Wilbur and Lipman’s Algorithm
- Heuristic algorithm: does not guarantee an optimal alignment
- Searches for k-tuple matches
- Uses hash tables to search for k-tuples in parallel
- Finds the best path among the k-tuple matches, using a restricting parameter, w2, which limits how far the path may leap between diagonals
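The k-tuple lookup step can be sketched in Python with a hash table. This covers only the matching stage, not the path search or the w2 restriction; the function name `ktuple_matches` and the choice to group matches by diagonal offset i - j are illustrative assumptions.

```python
from collections import defaultdict

def ktuple_matches(s1, s2, k=2):
    """Find all positions where a k-tuple of s1 also occurs in s2.

    Returns a dict mapping each diagonal offset (i - j) to the list of
    (i, j) start positions of the k-tuple matches on that diagonal.
    """
    index = defaultdict(list)                 # hash table: k-tuple -> positions in s1
    for i in range(len(s1) - k + 1):
        index[s1[i:i + k]].append(i)

    diagonals = defaultdict(list)
    for j in range(len(s2) - k + 1):          # each lookup is independent of the others
        for i in index.get(s2[j:j + k], ()):
            diagonals[i - j].append((i, j))
    return dict(diagonals)
```

For example, `ktuple_matches("ACGT", "CGTA", 2)` returns `{1: [(1, 0), (2, 1)]}`: both shared 2-tuples lie on the same diagonal, which is the kind of cluster a path search would then try to extend.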

Pittsburgh Supercomputing Center
- Hardware: Thinking Machines CM-2 and Cray Y-MP
- Uses the CM-2 to filter pairs of sequences and determine whether further processing is warranted
- Uses the Cray to analyze alignments using parallel NWS configured to utilize the Cray’s vector processing capabilities
- Coordination is handled by the Distributed Code Manager (DCM), which manages operation across heterogeneous computing environments