Reconfigurable Computing (EN2911X, Fall07)


Reconfigurable Computing (EN2911X, Fall07)
Lecture 18: Application-Driven Hardware Acceleration (4/4)
Prof. Sherief Reda, Division of Engineering, Brown University
http://ic.engin.brown.edu

Status
We have covered popular examples of application-driven hardware acceleration using reconfigurable computing:
  FFT for signal and image processing, as an example of divide-and-conquer algorithms
  Speech recognition applications
  Viterbi algorithm for digital communication, as an example of dynamic programming algorithms
In this lecture we overview some of the algorithms for bioinformatics.

Quick introduction to molecular biology & bioinformatics

DNA
Can be thought of as the “blueprint” for an organism.
Composed of small molecules called nucleotides: four different nucleotides, distinguished by the four bases adenine (A), cytosine (C), guanine (G) and thymine (T).
DNA is digital information: a single strand of DNA can be thought of as a string composed of the four letters A, C, G, T, for example ACGTTCTA.
DNA molecules usually consist of two strands arranged in a double helix structure, where A bonds to T and C bonds to G.
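
The string view of DNA and the pairing rule (A with T, C with G) can be illustrated with a minimal Python sketch. The function names `complement` and `reverse_complement` are chosen here for illustration and are not from the lecture; the input is the example strand from the slide.

```python
# Watson-Crick base pairing from the slide: A bonds to T, C bonds to G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


print(complement("ACGTTCTA"))          # TGCAAGAT
print(reverse_complement("ACGTTCTA"))  # TAGAACGT
```

The reversal reflects that the two strands of the double helix run in opposite directions.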

Genes
Genes are the basic units of heredity.
A gene is a sequence of bases that carries the information required for constructing a particular protein. Such a gene is said to encode a protein.
The human genome comprises ~20K-25K genes.
Those genes encode > 100,000 proteins.

Proteins
[Figure: a folded protein structure; amino acids]
Proteins perform most life functions and even make up the majority of cellular structures.
Proteins are large, complex molecules made up of smaller subunits called amino acids.
Chemical properties that distinguish the 20 different amino acids cause the protein chains to fold up into specific three-dimensional structures that define their particular functions in the cell.
Proteins can be thought of as a string composed from a 20-character alphabet.

Central dogma of molecular biology
RNA is like DNA except that it is usually single-stranded and the base uracil (U) is used in place of thymine (T).
A strand of RNA can be thought of as a string composed of the four letters A, C, G, U.
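
In string terms, going from DNA to RNA amounts to replacing T with U. The sketch below assumes the common bioinformatics convention of transcribing the coding strand directly (in the cell, RNA polymerase reads the complementary template strand, which yields the same RNA sequence). The name `transcribe` is illustrative, not from the lecture.

```python
def transcribe(coding_strand: str) -> str:
    """Return the RNA string for a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


print(transcribe("ACGTTCTA"))  # ACGUUCUA
```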

Translation

Translation
There are six possible reading frames when translating a DNA sequence into proteins: three offsets on each of the two strands.
In many cases, FPGAs are used to translate a DNA sequence into the 6 frames in parallel and then concurrently apply any subsequent processing.
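
A software sketch of six-frame translation, for illustration only: it uses the standard genetic code, drops trailing bases that do not form a full codon, and marks stop codons with `*`. The function names are assumptions made here. The six frames are computed one after another, whereas an FPGA design would produce them in parallel.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def reverse_complement(dna: str) -> str:
    """Opposite strand, read in its own direction."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate consecutive codons; an incomplete last codon is ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> list[str]:
    """Three offsets on the given strand, then three on its reverse complement."""
    strands = (dna, reverse_complement(dna))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


print(six_frames("ATGGCCTAA"))
# ['MA*', 'WP', 'GL', 'LGH', '*A', 'RP']
```

Each frame depends only on the input sequence, not on the other frames, which is what makes the parallel hardware version straightforward.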

DNA string alignment
A sequence alignment is a way of arranging the primary sequences of DNA (or RNA or protein) to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
If two sequences in an alignment share a common ancestor, mismatches can be interpreted as point mutations and gaps as insertion or deletion mutations introduced in one or both lineages in the time since they diverged from one another.
At each position, one of three cases can occur:
  A match, when the same character is present in both strings
  A mismatch, or substitution, when the two characters differ
  A gap, when a character is inserted in only one string or, symmetrically, deleted in the other
How can we find the best alignment between two DNA strings?

Finding the best global alignment
Needleman and Wunsch (NW) dynamic programming algorithm
Costs: +4 for a match, -2 for a mismatch, -3 for a gap
[Figures from slides 11-14 from Bioinformatics Applications by D. Lavenier and M. Giraud]
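
Since the figures are not reproduced here, the following minimal sketch shows the Needleman-Wunsch recurrence with the costs listed on the slide. It assumes a linear gap penalty (each gap position costs -3) and returns only the best score, not the alignment itself. The name `nw_score` is illustrative.

```python
def nw_score(a: str, b: str, match: int = 4, mismatch: int = -2, gap: int = -3) -> int:
    """Best global alignment score of a and b (Needleman-Wunsch)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        H[i][0] = i * gap
    for j in range(1, cols):
        H[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                H[i - 1][j - 1] + s,  # match or mismatch
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
    return H[-1][-1]


print(nw_score("ACGT", "AGT"))  # 9: three matches (+12) and one gap (-3)
```

Recovering the alignment itself would additionally require a traceback from the bottom-right cell, which is omitted here.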

Local alignment: finding the most similar subsequences
Smith and Waterman (SW) algorithm
Costs: +4 for a match, -2 for a mismatch, -3 for a gap
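
A minimal Smith-Waterman sketch with the same costs, again assuming a linear gap penalty. Compared with global alignment, each cell is clamped at zero (an alignment may start anywhere) and the answer is the maximum over all cells (it may end anywhere). The name `sw_score` is illustrative; only two rows of the matrix are kept because the traceback is omitted.

```python
def sw_score(a: str, b: str, match: int = 4, mismatch: int = -2, gap: int = -3) -> int:
    """Best local alignment score between any substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # row i-1 of the score matrix
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                # start a fresh local alignment here
                prev[j - 1] + s,  # match or mismatch
                prev[j] + gap,    # gap in b
                cur[j - 1] + gap, # gap in a
            )
            best = max(best, cur[j])
        prev = cur
    return best


print(sw_score("TTTACGTTT", "GGGACGGGG"))  # 12: the shared "ACG"
```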

Dynamic programming advantage on FPGAs
All cells on the same anti-diagonal can be computed simultaneously.
What is the runtime on a general-purpose CPU?
What is the runtime on an FPGA?
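
The anti-diagonal property can be checked in software. In the sketch below (an illustration, not the lecture's design), the global-alignment matrix for strings of lengths m and n is filled one anti-diagonal at a time. Every cell on diagonal d = i + j reads only cells on diagonals d-1 and d-2, so the inner loop could run fully in parallel. A sequential CPU performs m*n cell updates, while an idealized array with one computational cell per anti-diagonal position needs only m+n-1 wavefront steps. This assumes enough cells and ignores I/O.

```python
def nw_wavefront(a: str, b: str, match: int = 4, mismatch: int = -2, gap: int = -3):
    """Fill the NW matrix by anti-diagonals; return (score, wavefront steps)."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = i * gap
    for j in range(1, n + 1):
        H[0][j] = j * gap
    if m == 0 or n == 0:
        return H[m][n], 0

    steps = 0
    for d in range(2, m + n + 1):  # d = i + j indexes the anti-diagonal
        # Cells on this diagonal are mutually independent; hardware would
        # evaluate all of them in the same clock cycle.
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
        steps += 1
    return H[m][n], steps


score, steps = nw_wavefront("ACGT", "AGT")
print(score, steps)  # 9 6  (12 cells filled in 4 + 3 - 1 = 6 wavefront steps)
```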

Required number of computational cells

Examples of commercial products
Bioceleration Ltd.
Each BioXL/H board contains eight FPGA modules and 128 MB of global memory.
Each module is programmed to calculate four matrix cells per clock cycle (for the Smith-Waterman algorithm).
An eight-board BioXL/H executes these applications at a speed of 6 billion matrix cells per second.
The clock rate of the system is 25-33 MHz (programmable).
Examples of applications supported:
  Smith-Waterman algorithm
  Translation of nucleic acid sequences into 6 reading frames and search of each frame against an amino acid database
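
As a back-of-the-envelope check of these figures, peak throughput can be estimated as boards x modules per board x cells per cycle x clock rate. This is an idealized estimate that assumes every module is busy on every cycle; the function name is illustrative.

```python
def peak_cells_per_second(boards: int, modules_per_board: int,
                          cells_per_cycle: int, clock_hz: int) -> int:
    """Idealized peak matrix-cell throughput, ignoring I/O and idle cycles."""
    return boards * modules_per_board * cells_per_cycle * clock_hz


# Eight boards, eight modules each, four cells per module per cycle.
low = peak_cells_per_second(8, 8, 4, 25_000_000)
high = peak_cells_per_second(8, 8, 4, 33_000_000)
print(low, high)  # 6400000000 8448000000
```

At 25 MHz the estimate is 6.4 billion cells per second, which is in line with the quoted 6 billion.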

More examples: TimeLogic
“CodeQuest is a biocomputing workstation that processes large genomics searches and sophisticated informatics workflows. Using its FPGA-based DeCypher Engines, the quad-core CodeQuest workstation speeds Tera-BLAST, Smith-Waterman, Hidden Markov Model (HMM) and gene modeling searches at the speed of a mid-sized cluster.”
“It brings several fold the performance of a 64-CPU cluster, yet costs less than 10 CPUs”