FPGA Acceleration of Gene Rearrangement Analysis Jason D. Bakos Dept. of Computer Science and Engineering University of South Carolina Columbia, SC USA.

Slides:



Advertisements
Similar presentations
Reconstructing Phylogenies from Gene-Order Data Overview.
Advertisements

An Algorithm for Constructing Parsimonious Hybridization Networks with Multiple Phylogenetic Trees Yufeng Wu Dept. of Computer Science & Engineering University.
Parsimony Small Parsimony and Search Algorithms Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.
A Separate Analysis Approach to the Reconstruction of Phylogenetic Networks Luay Nakhleh Department of Computer Sciences UT Austin.
Traveling Salesperson Problem
School of CSE, Georgia Tech
DCJUC: A Maximum Parsimony Simulator for Constructing Phylogenetic Tree of Genomes with Unequal Contents Zhaoming Yin Bader-Polo Joint Group Meeting, Nov.
CS171 Introduction to Computer Science II Graphs Strike Back.
Parsimony based phylogenetic trees Sushmita Roy BMI/CS 576 Sep 30 th, 2014.
Phylogenies Preliminaries Distance-based methods Parsimony Methods.
Advanced Topics in Algorithms and Data Structures 1 Rooting a tree For doing any tree computation, we need to know the parent p ( v ) for each node v.
High-Performance Algorithm Engineering for Computational Phylogenetics [B Moret, D Bader] Kexue Liu CMSC 838 Presentation.
Tree Reconstruction.
Genome Halving – work in progress Fulton Wang ACGT Group Meeting.
Computes the partial dot products for only the diagonal and upper triangle of the input matrix. The vector computed by this architecture is added to the.
Heterogeneous Computing at USC Dept. of Computer Science and Engineering University of South Carolina Dr. Jason D. Bakos Assistant Professor Heterogeneous.
. Phylogeny II : Parsimony, ML, SEMPHY. Phylogenetic Tree u Topology: bifurcating Leaves - 1…N Internal nodes N+1…2N-2 leaf branch internal node.
Whole Genome Alignment using Multithreaded Parallel Implementation Hyma S Murthy CMSC 838 Presentation.
Chapter 9: Greedy Algorithms The Design and Analysis of Algorithms.
CISC667, F05, Lec15, Liao1 CISC 667 Intro to Bioinformatics (Fall 2005) Phylogenetic Trees (II) Distance-based methods.
FPGA Acceleration of Phylogeny Reconstruction for Whole Genome Data Jason D. Bakos Panormitis E. Elenis Jijun Tang Dept. of Computer Science and Engineering.
Seven Minute Madness: Reconfigurable Computing Dr. Jason D. Bakos.
Ch 13 – Backtracking + Branch-and-Bound
Processing Rate Optimization by Sequential System Floorplanning Jia Wang 1, Ping-Chih Wu 2, and Hai Zhou 1 1 Electrical Engineering & Computer Science.
. Comput. Genomics, Lecture 5b Character Based Methods for Reconstructing Phylogenetic Trees: Maximum Parsimony Based on presentations by Dan Geiger, Shlomo.
Building Phylogenies Parsimony 2.
Building Phylogenies Parsimony 1. Methods Distance-based Parsimony Maximum likelihood.
Approximation Algorithms Motivation and Definitions TSP Vertex Cover Scheduling.
Seven Minute Madness: Reconfigurable Computing Dr. Jason D. Bakos.
High-Performance Reconfigurable Computing for Genome Analysis Jason D. Bakos Dept. of Computer Science and Engineering University of South Carolina Columbia,
CIS786, Lecture 4 Usman Roshan.
Phylogeny Estimation: Why It Is "Hard", and How to Design Methods with Good Performance Tandy Warnow Department of Computer Sciences University of Texas.
Combinatorial and Statistical Approaches in Gene Rearrangement Analysis Jijun Tang Computer Science and Engineering University of South Carolina
Genome Rearrangement By Ghada Badr Part II. 2  Genomes can be modeled by each gene can be assigned a unique number and is exactly found once in the genome.
Gene expression & Clustering (Chapter 10)
Busby, Dodge, Fleming, and Negrusa. Backtracking Algorithm Is used to solve problems for which a sequence of objects is to be selected from a set such.
Operations Research Assistant Professor Dr. Sana’a Wafa Al-Sayegh 2 nd Semester ITGD4207 University of Palestine.
Phylogenetics Alexei Drummond. CS Friday quiz: How many rooted binary trees having 20 labeled terminal nodes are there? (A) (B)
1 Generalized Tree Alignment: The Deferred Path Heuristic Stinus Lindgreen
Computer Science Research for The Tree of Life Tandy Warnow Department of Computer Sciences University of Texas at Austin.
FPGA FPGA2  A heterogeneous network of workstations (NOW)  FPGAs are expensive, available on some hosts but not others  NOW provide coarse- grained.
Binary Encoding and Gene Rearrangement Analysis Jijun Tang Tianjin University University of South Carolina (803)
Molecular phylogenetics 1 Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections
PERFORMANCE ANALYSIS cont. End-to-End Speedup  Execution time includes communication costs between FPGA and host machine  FPGA consistently outperforms.
A Clustering Algorithm based on Graph Connectivity Balakrishna Thiagarajan Computer Science and Engineering State University of New York at Buffalo.
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2010.
1 Exploring Custom Instruction Synthesis for Application-Specific Instruction Set Processors with Multiple Design Objectives Lin, Hai Fei, Yunsi ACM/IEEE.
Using traveling salesman problem algorithms for evolutionary tree construction Chantal Korostensky and Gaston H. Gonnet Presentation by: Ben Snider.
Ch.6 Phylogenetic Trees 2 Contents Phylogenetic Trees Character State Matrix Perfect Phylogeny Binary Character States Two Characters Distance Matrix.
Evolutionary tree reconstruction
CSE 589 Part VI. Reading Skiena, Sections 5.5 and 6.8 CLR, chapter 37.
Parallel & Distributed Systems and Algorithms for Inference of Large Phylogenetic Trees with Maximum Likelihood Alexandros Stamatakis LRR TU München Contact:
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2015.
Ricochet Robots Mitch Powell Daniel Tilgner. Abstract Ricochet robots is a board game created in Germany in A player is given 30 seconds to find.
GRAPPA: Large-scale whole genome phylogenies based upon gene order evolution Tandy Warnow, UT-Austin Department of Computer Sciences Institute for Cellular.
1 Microarray Clustering. 2 Outline Microarrays Hierarchical Clustering K-Means Clustering Corrupted Cliques Problem CAST Clustering Algorithm.
Lecture 19 Minimal Spanning Trees CSCI – 1900 Mathematics for Computer Science Fall 2014 Bill Pine.
CS 395T: Computational phylogenetics January 18, 2006 Tandy Warnow.
Distance-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2010.
Distance-based methods for phylogenetic tree reconstruction Colin Dewey BMI/CS 576 Fall 2015.
CSE280Stefano/Hossein Project: Primer design for cancer genomics.
Ning Jin, Wei Wang ICDE 2011 LTS: Discriminative Subgraph Mining by Learning from Search History.
WABI: Workshop on Algorithms in Bioinformatics
BackTracking CS255.
Inferring phylogenetic trees: Distance and maximum likelihood methods
BNFO 602 Phylogenetics Usman Roshan.
Multiple Genome Rearrangement
Phylogeny.
CS 394C: Computational Biology Algorithms
Algorithms for Inferring the Tree of Life
Presentation transcript:

FPGA Acceleration of Gene Rearrangement Analysis Jason D. Bakos Dept. of Computer Science and Engineering University of South Carolina Columbia, SC USA

FCCM 2007 Napa, CAApril 23, 2007 Talk Outline Application: –Phylogenetic Reconstruction –Gene Rearrangement Data Breakpoint Median Computation Core Architecture Extracting Parallelism Performance Results –Core Computation Speedup –Application Speedup Conclusion and Future Work

FCCM 2007 Napa, CAApril 23, 2007 Phylogenies genus Drosophila

FCCM 2007 Napa, CAApril 23, 2007 Phylogenetic Analysis Phylogenies are used to infer common characteristics among related species

FCCM 2007 Napa, CAApril 23, 2007 Phylogeny Data Structure g1 g2 g5 g4g6 g3 g5 g4 g1g3 g2 g5 g6 g5 g2 g1 g6 g3 g4 Unrooted binary tree n leaf vertices n - 2 internal vertices (degree 3) Tree configurations = (2n - 5) * (2n - 7) * (2n - 9) * … * trillion trees for 16 leaves

FCCM 2007 Napa, CAApril 23, 2007 Phylogenetic Reconstruction Given biological data for a set of species, compute an accurate phylogeny Distance-based Methods –Construct tree using pair-wise distances (UPGMA, neighbor-joining) –Fast but inaccurate Maximum Parsimony Methods –Goal: Accuracy –Construct tree using evolutionary model –Search for most parsimonious tree –Candidate trees must be “scored” All internal vertices are labeled with ancestral data Edge distances are computed between vertex pairs and summed

FCCM 2007 Napa, CAApril 23, 2007 GRAPPA Performance Bottleneck GRAPPA –“Genome Rearrangement Analysis under Parsimony and other Phylogenetic Algorithms” –Exhaustive branch-and-bound or heuristic search through tree space –Labeling computation is performance bottleneck 95+% of execution time is spent computing labels for internal vertices Diameter (Ev. Rate) of Inputs Execution Time Ratio for Labeling 1 0

FCCM 2007 Napa, CAApril 23, 2007 Gene Rearrangement Data Genome data is represented as a circular ordering of genes –“Gene orders” Each gene represents a known nucleotide sequence –Each gene has a positive or negative orientation Rare evolutionary events cause gene rearrangements –Inversion g 0 g 1 g 2 g 3 g 4 g 5 g 0 g 1 –g 4 –g 3 –g 2 g 5 –Transposition g 0 g 1 g 2 g 3 g 4 g 5 g 0 g 2 g 3 g 4 g 1 g 5 –Inverted Transposition g 0 g 1 g 2 g 3 g 4 g 5 g 0 –g 4 –g 3 –g 2 g 1 g 5

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Distance Metric Estimation of number of rearrangement events between gene orders A and B # of adjacencies: g h in A that doesn’t correspond to g h or –h –g in B Example: –A = –B = –Breakpoint distance = 2

FCCM 2007 Napa, CAApril 23, 2007 Median ABC M d(A,M) d(B,M) d(C,M) Ancestral vertices are computed using a median computation All internal vertices have degree 3 Find M that optimally minimizes median score score = d(A,M) + d(B,M) + d(C,M) Breakpoint median: –d() is breakpoint distance

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm as a TSP Construct a fully connected graph containing all g and –g for each gene w(g,-g) = - Initialize all other weights to be 3 For each adjacency gh in the three genomes, decrement weight between vertex –g and h Solve TSP, median is the even-indices of the tour (score = tour cost) A = B = C = Edges not shown have cost = 3 cost = -  cost = 0 cost = 1 cost = An optimal solution corresponding to genome

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Implementation Optimal TSP is feasible due to small graph Implemented as a depth-first branch-and-bound search Upper bound is the current best tour Lower-bound is computed using a linear greedy algorithm –Select a set of minimal-weight edges to complete a partially- constructed tour –To tighten: edges not considered that… have been pruned at or above the current level of the search tree that would create a cycle not including all cities

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm solution otherEnd 1 => => 1 2 => => 2 3 => => 3 4 => => 4 excluded used cost=0 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4 excluded solution cost=0 stack solution otherEnd 1 => => 1 2 => => 2 3 => => 3 4 => => 3 cost=0 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4, 2, 3 otherEnd 1 => => 1 2 => => -4 3 => => 3 4 => => -2 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) excluded solution cost=1 solution cost=0 solution cost=0 Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4, 2, 3 otherEnd 1 => => 1 2 => => -4 3 => => 3 4 => => -2 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) excluded solution cost=0 solution cost=0 solution cost=1 Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4 otherEnd 1 => => 1 2 => => 2 3 => => 3 4 => => 3 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) excluded (2,3) stack solution cost=0 solution cost=0 Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4, 1, 2 otherEnd 1 => => -2 2 => => -1 3 => => 3 4 => => 3 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) excluded (2,3) solution cost=2 solution cost=0 solution cost=0 Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4, 1, 2 otherEnd 1 => => -2 2 => => -1 3 => => 3 4 => => 3 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) excluded (2,3) solution cost=2 solution cost=0 solution cost=0 Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4, 1, 2, -2, -4 otherEnd 1 => => 3 2 => => -1 3 => => 3 4 => => 3 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) excluded (2,3) solution cost=4 solution cost=2 solution cost=0 solution cost=0 Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Breakpoint Median Algorithm used -3, 4, 1, 2, -2, -4, -1, 3 otherEnd 1 => => 3 2 => => -1 3 => => 3 4 => => 3 edges (-3,4,w=0) (2,3,w=1) (1,2,w=2) (-1,-2,w=2) (1,-2,w=2) (-2,-4,w=2) (-1,3,w=2) (-1,-4,w=2) (1,-4,w=2) excluded (2,3) solution cost=4 tour is -1, 1, 2, -2, -4, 4, -3, 3 median is -1, 2, -4, -3 solution cost=2 solution cost=0 solution cost=0 solution cost=6 Edge list:Search state:

FCCM 2007 Napa, CAApril 23, 2007 Median Core Architecture Computation is sequential add edge 7 cycles prune edge 2 or 10 cycles compute lower bound 2n+2 cycles 56 MHz top levelcontroller 21/444 BRAMs/core 1% slices/core

FCCM 2007 Napa, CAApril 23, 2007 Accelerator Architecture Fill FPGAs with median cores Fan-outs and fan-ins are pipelined to meet PCI-X timing Platform: –Annapolis Wild-Star II Pro –Virtex-2 Pro I/O –Programmed I/O –Hosts polls each core for state –Comm. overhead is significant for easy medians

FCCM 2007 Napa, CAApril 23, 2007 Phylogeny Scoring Steps g5 g4 g1g3 g2 g5 g6 1.Initialize unlabeled tree Use 3 nearest labels Initialize upper bound from inputs 2.Iteratively refine tree to convergence Use 3 immediate neighbors Initialize upper bound using score of previous label g5 g4 g1g3 g2 g5 g6

FCCM 2007 Napa, CAApril 23, 2007 Extracting Parallelism: Initialization core 0 core 1 core 2 core n-1 initial upper bound = ub = d(A,B) + d(A,C) d(B,A) + d(B,C) d(C,A) + d(C,B) A, B, C … ub ub - 1 ub - 2 ub - n - 1 Core with a lower initial upper bound will converge on solution fastest A d(A,B) 0 B C A d(A,C) A 0 B C B d(B,C) d(A,B) B 0 C C A d(A,C) d(B,C)

FCCM 2007 Napa, CAApril 23, 2007 Performance Results: Median Computation Average over 1000 median computations 12 cores => 25X speedup

FCCM 2007 Napa, CAApril 23, 2007 Extracting Parallelism: Re-Labeling g5 g4 g1g3 g2 g5 g6 m0 m1 m2 m3 core 0 core 1 core 2 core 3 g4, m1, m2 score(m0) m0, g1, g3 score(m1) m0, g2, m3 score(m2) m2, g5, g6 score(m3) All medians are dispatched in parallel for each iteration Score of previous median used as initial upper bound

FCCM 2007 Napa, CAApril 23, 2007 Performance Results: Accelerated GRAPPA Replace software median with driver for FPGA card Initialization phase: –Use 12 median cores Re-labeling phase: –Parallel labeling –Use n - 2 median cores Average over 10 GRAPPA runs

FCCM 2007 Napa, CAApril 23, 2007 Conclusions Breakpoint median computation: combinatorial optimization algorithm –Control-dependent, memory-intensive Median core consistently achieves speedup over software and only requires 4% of FPGA resources Initial results show that median-level parallelism can be achieved (take advantage of hardware) Up to 25X application speedup

FCCM 2007 Napa, CAApril 23, 2007 Future Work More efficient technique to parallelize median computation Inversion median core Design tree generation and bounding cores –Bottleneck for larger input sets –Initial design requires only 2 BRAMs Develop and package library of phylogenetic cores Develop tool that generates architecture based on characteristics of the input set

FCCM 2007 Napa, CAApril 23, 2007 Raw Data for Median Core

FCCM 2007 Napa, CAApril 23, 2007 Raw Data for Accelerated-GRAPPA