High performance bioinformatics

Presentation transcript:

High performance bioinformatics
Group May 09-06: Bryan McCoy, Kinit Patel, Tyson Williams

Problem/Need Statement Current ways to solve Bioinformatics problems are either slow or very expensive. There is a need for a way to reduce cost and still deliver high performance in a computer system that can solve Bioinformatics problems.

What is Bioinformatics? Genetic sequencing. Massive amounts of data. Simple operations but many of them. Perfect for distributed computing.

Proposed Solution Use a cluster of PS3s with their embedded Cell processors.

Cell Broadband Engine Has 1 central PowerPC based PPE. Has 8 surrounding SPEs. The 8 SPEs are connected via the element interconnect bus.

Cell Broadband Engine (diagram)

Functional Requirements
FR1. Ported applications shall run on the Cell B.E.
FR2. The results returned shall be the same as those of the original program.
FR3. The applications shall return their runtime.
FR4. The applications shall execute in parallel on multiple Cell B.E.s.

Non-Functional Requirements
NF1. The Cells shall all run the Linux OS.
NF2. The runtimes of the ported applications shall be shorter than those of the original applications.
NF3. The ported applications shall be coded in the C language.

Operating Environment Use Fedora 9 as the OS, as it is currently supported by the Cell SDK 3.1. Use the command line for the user interface. Use the IBM XLC compiler and/or the current GCC compiler.

Market Survey The survey results point to large speedups for computationally intensive programs on PS3 clusters. Dr. Gaurav Khanna at the University of Massachusetts Dartmouth used a cluster of 8 PS3s to replace a supercomputer. In 2007, Universitat Pompeu Fabra in Barcelona deployed a BOINC system called PS3GRID for collaborative biological computing.

Deliverables The Source Code. Compiled Executable. Runtime Comparisons. Project Final Report. Project Poster. Project Final Presentation.

Work Breakdown Structure
Port Apps to Cluster PS3s
  Problem Definition
    Research Cell/B.E.
    Research Bioperf Suite
    Research Distributed Parallel Algorithms
    Research Previously Done Work
  End Product Design
    Design Requirements
    Design Process
    Design Documents
  Considerations and Selections
    Decide Which Linux to Install
    Decide Which Applications to Port
  End Product Implementation
    Hardware Implementation
    Prototyping Implementation
    Software Implementation
  End Product Testing
    Ensure Correctness of Output Results
    Benchmarking
  Final Documentation and Demonstration
    Create Final Report
    Create Project Poster
    Prepare for Presentation

Costs
Time: approximately 555 man hours total, freely donated. Total cost $0.
Equipment: 3 PS3s and a crossbar router, provided for us by the client. Total cost $0.

Resource Requirements 3 PlayStation 3s. High performance network switch. Books on distributed computing on Cell. Time.

Work Schedule Gantt chart

Risk Assessment Slow network speed. Software support. Limited RAM. Hardware Failure. Lower quality entertainment hardware. Limited prior experience. Software development schedule.

Design Further divide the application into multiple threads for SPE execution on multiple PS3s, alter the functional logic, and vectorize the code where possible.

Software Decomposition Diagram

System Requirements
SR1. The system shall allow the user to input multiple DNA sequences in FASTA format through a file interface.
SR2. The system shall output all of the most parsimonious trees implied by the input data to the screen.
SR3. The system shall share computational work among the PPE and SPEs available to each client/server process.
SR4. The front-end shall share computational work with available back-end processes.
SR5. The front-end shall be able to connect to at least 2 back-end processes via a high performance router.
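As an illustration of the FASTA input named in SR1, here is a minimal Python sketch of a FASTA reader. It is not the project's code (the ported application is specified as C), and the function name `parse_fasta` and the sample records are hypothetical.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first token after '>' as the record ID.
            parts = line[1:].split()
            current_id = parts[0] if parts else f"record{len(records) + 1}"
            records[current_id] = ""
        elif current_id is not None:
            # Sequence line: append to the record currently being read.
            records[current_id] += line.upper()
    return records


# Hypothetical two-record input; a sequence may span several lines.
example = """>taxonA sample
ACGT
ACGA
>taxonB
ACGTACGC
""".splitlines()

sequences = parse_fasta(example)
```

To read from a file instead, an open file object can be passed in place of the list of lines.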

System Analysis The key is data flow, which is broken into 3 stages. First, the DNA sequences are distributed to the PPEs and from there down to the SPEs. Next, each SPE searches the possible trees for the best parsimony score using a branch and bound search. Finally, the results are aggregated back to the main PPE and output.
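The three stages follow a scatter, search, gather pattern, which can be sketched in Python under loud assumptions: threads stand in for the PPEs and SPEs, a plain exhaustive scan stands in for the branch and bound search, and every name here (`split_round_robin`, `search_chunk`, `distributed_best`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def split_round_robin(items: Sequence, n_workers: int) -> List[list]:
    """Stage 1: deal the candidates out to the workers in round-robin order."""
    return [list(items[i::n_workers]) for i in range(n_workers)]


def search_chunk(chunk: list, score: Callable) -> Tuple[float, object]:
    """Stage 2: each worker keeps the lowest-scoring candidate in its chunk.

    Assumption: this is an exhaustive scan, not branch and bound.
    """
    best = None
    for candidate in chunk:
        s = score(candidate)
        if best is None or s < best[0]:
            best = (s, candidate)
    return best


def distributed_best(items: Sequence, score: Callable, n_workers: int):
    """Stage 3: gather the per-worker results and keep the overall best."""
    chunks = [c for c in split_round_robin(items, n_workers) if c]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(lambda c: search_chunk(c, score), chunks))
    return min(partial, key=lambda p: p[0])


# Toy stand-in for tree scoring: the candidate closest to 7 wins.
best_score, best_candidate = distributed_best(
    list(range(20)), lambda x: (x - 7) ** 2, n_workers=3
)
```

On the real hardware the scatter and gather steps would cross the network between PS3s and then the element interconnect bus between the PPE and SPEs, so communication cost matters in a way this sketch does not model.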

Specifications
Input: DNA sequence files in FASTA format.
Output: Runtime of the application. The most parsimonious phylogenetic tree. The parsimony score of the phylogenetic tree.
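To make the parsimony score concrete, the sketch below scores one fixed tree with Fitch's small-parsimony algorithm, a standard textbook method. It is an illustration only: whether DNAPenny computes its score this way is not stated in the slides, and the tree encoding, function names, and sample sequences are assumptions.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (possible states, minimum changes) for one alignment column."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:
        # The children agree on at least one state: no change needed here.
        return common, lcost + rcost
    # The children disagree: one change is required at this node.
    return lset | rset, lcost + rcost + 1


def parsimony_score(tree: Tree, sequences: Dict[str, str]) -> int:
    """Sum the per-column Fitch cost over equal-length aligned sequences."""
    length = len(next(iter(sequences.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in sequences.items()}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical aligned sequences for four taxa.
aligned = {"A": "ACGT", "B": "ACGA", "C": "ATGA", "D": "ATGT"}
score_ab_cd = parsimony_score((("A", "B"), ("C", "D")), aligned)
score_ac_bd = parsimony_score((("A", "C"), ("B", "D")), aligned)
```

A lower score means fewer implied mutations, so of these two trees the first is the more parsimonious.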

Specifications User Interface No changes to the user interface. Uses a command line interface.

Specifications Hardware 3 PlayStation 3s High performance Cross-Bar network switch.

Specifications Software
Fedora 9 with Linux 2.6.25 kernel for the PowerPC.
IBM Cell SDK 3.1.
IBM XLC 9.0 and GCC 4.3 compilers.
DNAPenny 3.6.
Bioperf Suite.

Specifications Testing Benchmark runtimes over several iterations and inputs to get averages. Compare these runtimes with the previous group's runtimes on a single Cell processor. Compare these runtimes with the previous group's runtimes on a high performance server (quad-core Intel Xeon 3.0GHz, 6GB RAM).
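Averaging runtimes over several iterations could look like the following Python sketch. The helper name `benchmark` and the toy workload are hypothetical; the real measurements would time the ported application itself.

```python
import statistics
import time
from typing import Callable, List


def benchmark(run: Callable[[], object], iterations: int = 5) -> List[float]:
    """Run `run` repeatedly and return the wall-clock time of each run."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return timings


# Toy workload standing in for one execution of the application.
timings = benchmark(lambda: sum(range(100_000)), iterations=3)
mean_seconds = statistics.mean(timings)
```

Keeping the individual timings, not just the mean, makes it possible to spot outliers such as a slow first run.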

Acknowledgements
May08-24 group: Kyle Byerly, Shannon McCormick, Matt Rohlf, Bryan Venteicher.
Bioperf developers: David A. Bader (Georgia Tech), Yue Li (Univ. of Florida), Tao Li (Univ. of Florida), Vipin Sachdeva (IBM Austin).

Questions?

Previous Results and Projected Results

Code revision                            4-Way 3.0GHz Machine (seconds)  X Speedup   PlayStation 3 (seconds)  X Speedup
dnapenny_orig                            823.568                         1           7793.915                 -
dnapenny_slimmer                         360.131                         2.28685673  941.981                  8.273962
parallel_dnapenny_1.0                    221.432                         3.71928177  780.867                  9.9811043
supplement_spe_parallel_1SPE             -                               -           1111.471                 7.0122522
supplement_spe_parallel_3SPE             -                               -           443.521                  17.572821
supplement_spe_parallel_6SPE             -                               -           277.233                  28.11323
supplement_parallel_vector_1SPE          -                               -           260.952                  29.867236
supplement_parallel_vector_3SPE          -                               -           153.656                  50.723141
supplement_parallel_vector_6SPE          -                               -           130.59                   59.682326
Cluster with 3 PlayStations (Projected)  -                               -           ~54.8                    ~142.224
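The speedup figures are consistent with dividing each platform's dnapenny_orig runtime by the runtime of the later revision. A short Python check, using only runtimes listed in the results (the function and variable names are illustrative):

```python
def speedup(baseline_seconds: float, seconds: float) -> float:
    """Speedup of a revision relative to the baseline runtime."""
    return baseline_seconds / seconds


# Baselines: the dnapenny_orig runtime on each platform, in seconds.
XEON_BASELINE = 823.568
PS3_BASELINE = 7793.915

# A few PlayStation 3 runtimes from the results, in seconds.
ps3_runtimes = {
    "dnapenny_slimmer": 941.981,
    "supplement_spe_parallel_6SPE": 277.233,
    "supplement_parallel_vector_6SPE": 130.59,
    "cluster_3_ps3_projected": 54.8,
}

ps3_speedups = {name: speedup(PS3_BASELINE, t) for name, t in ps3_runtimes.items()}
xeon_slimmer_speedup = speedup(XEON_BASELINE, 360.131)
```

Note that each platform is compared against its own baseline, so the two speedup columns are not directly comparable; the raw runtimes are.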

Summary Cost: $0. Equipment provided. Time: approximately 555 man hours, freely donated. Results: 4x the performance of a similarly priced system.