Improving performance of Multiple Sequence Alignment in Multi-client Environments
Aaron Zollman
CMSC 838T Presentation

Overview

- Overview of talk
  - CLUSTALW algorithm and speedup opportunities
  - Problems with caching
  - Parallelizing technique
  - Weaknesses
  - Applying the technique to other bioinformatics problems

Motivation

- Overlap among queries submitted to MSA tools
  - Single researcher: new sequences vs. a database
  - Multiple researchers: similar subsets
- CLUSTALW: a progressive algorithm
  - Three steps
  - Progressive refinement
- Opportunities for speedup
  - Caching
  - Query ordering

CLUSTALW: Progressive global alignment

- Step 1: Pairwise alignment, distance matrix (sketch below)
  - A fast technique calculates the distance between two sequences
  - Calculated for all sequence pairs
  - Cost: O(q²l²)
- Step 2: Guide tree
  - Group nearest sequences first
  - Build the tree sequentially
  - Cost: O(q³)
- Step 3: Progressive alignment
  - Align, starting at the leaves of the tree
  - Cost: O(ql²)

* q sequences of mean length l
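
To make the costs concrete, here is a minimal Python sketch of step 1 (not from the paper): a plain edit-distance dynamic program stands in for CLUSTALW's fast pairwise scoring, and the loop over all pairs makes the O(q²) × O(l²) = O(q²l²) structure explicit.

    from itertools import combinations

    def pairwise_distance(a: str, b: str) -> float:
        # Classic O(l^2) edit-distance dynamic program, used here only as a
        # simplified stand-in for CLUSTALW's fast pairwise scoring.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                # deletion
                                curr[j - 1] + 1,            # insertion
                                prev[j - 1] + (ca != cb)))  # substitution / match
            prev = curr
        return prev[-1] / max(len(a), len(b), 1)            # normalize to [0, 1]

    def distance_matrix(seqs):
        # Step 1: O(q^2) pairs, each costing O(l^2), giving O(q^2 l^2) overall.
        return {(i, j): pairwise_distance(seqs[i], seqs[j])
                for i, j in combinations(range(len(seqs)), 2)}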

Optimization: Query caching

- Step 1: Pairwise alignment, building the distance matrix
  - Many requests are partially duplicated
  - An individual distance calculation does not depend on the rest of the query
  - Observation: the dominant step in execution time
- Steps 2, 3: Output depends on the results of the entire query
  - Results less reusable
- Technique: cache the output of step 1 (individual distances; sketch below)

[Slide diagram: Query 1 and Query 2 share sequences (MLI…, GIS…), so their cached pairwise distances can be reused across queries.]
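
A minimal sketch of the caching idea, reusing the pairwise_distance stand-in from the previous sketch (an in-memory dict here; the paper's cache is persistent and disk-backed):

    from itertools import combinations

    class DistanceCache:
        # Maps an unordered pair of sequences to its cached distance.
        def __init__(self):
            self._store = {}

        def distance(self, a: str, b: str) -> float:
            key = frozenset((a, b))          # order-independent pair key
            if key not in self._store:       # compute only on a cache miss
                self._store[key] = pairwise_distance(a, b)
            return self._store[key]

    def cached_distance_matrix(seqs, cache):
        # Step 1 with caching: pairs already seen in earlier queries are reused.
        return {(i, j): cache.distance(seqs[i], seqs[j])
                for i, j in combinations(range(len(seqs)), 2)}

Because each pair's distance is independent of the rest of the query, a cache hit is valid for any later query containing the same two sequences; the guide tree and progressive alignment (steps 2 and 3) depend on the whole query and offer no such per-item reuse.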

Challenges to cache implementation

- I/O and filesystem overhead
  - Large cache vs. the 2GB file-size limit
  - High seek times within a single file
- Search and insertion overhead
  - Sequences make lengthy keys
  - Keyed on each pair of sequences

Technique: 2-level B-Tree cache

- Level 1: Map sequence text to a sequence ID (sketch below)
  - Hash of the sequence? A sequentially assigned number
  - Cache size: O(ql)
- Level 2: Map ID pairs to the calculated distance
  - Concatenate the IDs from level 1
  - The lower level-1 ID forms the upper half of the level-2 key
  - Cache size: O(q²)
- Distribute the level-2 cache across bins
  - Round-robin or block allocated
  - Distribute bins across machines

* q sequences of mean length l
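
A minimal in-memory sketch of the two-level scheme. Assumptions not in the paper: 32-bit sequence IDs, sequential ID assignment, and a simple modulo bin assignment standing in for round-robin or block allocation; plain dicts stand in for the on-disk B-Trees.

    class TwoLevelCache:
        def __init__(self, n_bins: int = 4):
            self.level1 = {}                             # sequence text -> small integer ID
            self.bins = [dict() for _ in range(n_bins)]  # level 2: packed ID pair -> distance

        def _seq_id(self, seq: str) -> int:
            # Level 1 lookup; an unseen sequence gets the next sequential ID.
            return self.level1.setdefault(seq, len(self.level1))

        def _pair_key(self, a: str, b: str) -> int:
            lo, hi = sorted((self._seq_id(a), self._seq_id(b)))
            return (lo << 32) | hi                       # lower ID -> upper half of the key

        def _bin(self, key: int) -> dict:
            return self.bins[key % len(self.bins)]       # modulo bin assignment

        def get(self, a: str, b: str):
            key = self._pair_key(a, b)
            return self._bin(key).get(key)               # None on a miss

        def put(self, a: str, b: str, dist: float) -> None:
            key = self._pair_key(a, b)
            self._bin(key)[key] = dist

Keying level 2 on short fixed-width integer pairs rather than full sequence text is what keeps the B-Tree keys small, addressing the lengthy-key problem from the previous slide.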

SMP

- Parallelizable: pairwise searches are performed independently (sketch below)
  - Farmed out to query threads

[Slide diagram: a web server dispatches work to query threads, which use per-machine level-1 maps and distributed level-2 bins.]
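
A minimal sketch of farming the independent pairwise lookups out to a pool of query threads, reusing the DistanceCache sketch above (cache coherence is deliberately ignored here; the next slide discusses it):

    from concurrent.futures import ThreadPoolExecutor
    from itertools import combinations

    def parallel_distance_matrix(seqs, cache, n_threads: int = 4):
        # Each pairwise distance is independent, so pairs can simply be
        # handed to worker threads in any order.
        pairs = list(combinations(range(len(seqs)), 2))
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            dists = list(pool.map(lambda p: cache.distance(seqs[p[0]], seqs[p[1]]), pairs))
        return dict(zip(pairs, dists))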

SMP

- Challenge: cache coherence
  - Read-only cache?
    - Requires advance knowledge of query details
  - Online update and serialization?
    - Locking, duplicate updates
  - Offline updates? (sketch below)
    - Per-thread list of cache changes
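
A minimal sketch of the offline-update option (names are illustrative, not from the paper): during a batch each worker reads the shared cache freely but records new distances only in a private change log, so the hot path needs no locks; the logs are merged afterwards, and duplicate updates simply overwrite each other with the same value.

    from concurrent.futures import ThreadPoolExecutor
    from itertools import combinations

    def _worker(chunk, seqs, shared, compute):
        # Read the shared cache, but write only to a private per-thread log.
        log = {}
        for i, j in chunk:
            if (i, j) not in shared and (i, j) not in log:
                log[(i, j)] = compute(seqs[i], seqs[j])
        return log

    def batch_with_offline_updates(seqs, shared, compute, n_threads: int = 4):
        pairs = list(combinations(range(len(seqs)), 2))
        chunks = [pairs[t::n_threads] for t in range(n_threads)]
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            logs = list(pool.map(lambda c: _worker(c, seqs, shared, compute), chunks))
        for log in logs:            # offline step: fold per-thread logs into the shared cache
            shared.update(log)
        return shared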

Evaluation: Implementation

- Public B-Tree implementation: the GiST library
- First evaluation on an Intel PC (Pentium III 650 MHz, 75GB disks)
  - q = … sequences; l = 450 amino acids per sequence
- Second evaluation on a Sun Fire 6800 (48 × 750MHz CPUs, 48GB main memory)
  - q = … sequences; l = 417 amino acids per sequence
  - Seeded the cache with dummy values
- Future work: architectural impact

Evaluation: Results

Observations

- Simple technique
  - Cheap and easy to implement
  - Cheap and easy to deploy
  - Unsupported claim: are the queries really similar?
- Concerns about distribution across processors
  - The paper mentions latency and workload balancing
  - Also the reliability of distributed bins
  - Cache lifetimes?
- Proposed solution: a "component-based system"
  - "Hand-wavy"; would like to see more