DNA Sequence Assembly. Advisor: Professor R. C. T. Lee. Speaker: Jui Peng Lu (盧瑞鵬).

Presentation transcript:

1 DNA Sequence Assembly. Advisor: Professor R. C. T. Lee. Speaker: Jui Peng Lu (盧瑞鵬).

2 DNA Sequence Assembly Problem. We are given a set of strings S = {s_1, s_2, …, s_n} that were cut from an original sequence by the shotgun method. Our job is to reconstruct the original string. [Figure: the original sequence with a first cutting and a second cutting; fragments are labeled s_1 to s_5.]

3 Basic Ideas of Our Algorithm. For each input string s_i, there is a string s_j whose prefix is equal to a suffix of s_i. [Figure: the original sequence with a first cutting and a second cutting; fragments are labeled s_1 to s_5.]
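
The suffix/prefix relation described on this slide can be illustrated with a small helper. This is only a sketch: the function name `overlap` is hypothetical and the slides do not say how the speaker computes overlaps. It returns the length of the longest suffix of one string that is also a prefix of another.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    # Try the longest possible overlap first and shrink until one matches.
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


if __name__ == "__main__":
    # Fragments taken from the example on slide 4.
    print(overlap("AGCCT", "CTGCC"))
    print(overlap("GCCTAGCCC", "TAGCCCTA"))
```

A quadratic scan like this is fine for short fragments; real assemblers typically rely on suffix structures or hashing instead.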

4 Example. Suppose we are given the following sequence: AGCCTGCCTAGCCCTAATCTG. Assume the first shotgun cutting splits the sequence into the following segments: AGCCT, GCCTAGCCC, TAATCTG. The second cutting produces the following segments: AGC, CTGCC, TAGCCCTA, ATCTG.
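
As a quick sanity check of this example, the snippet below confirms that each cutting concatenates back to the original sequence and lists where the cuts fall. The helper name `cut_positions` is an assumption made for illustration and does not come from the slides.

```python
from itertools import accumulate

ORIGINAL = "AGCCTGCCTAGCCCTAATCTG"
FIRST_CUTTING = ["AGCCT", "GCCTAGCCC", "TAATCTG"]
SECOND_CUTTING = ["AGC", "CTGCC", "TAGCCCTA", "ATCTG"]


def cut_positions(segments):
    """Positions (0-based offsets) at which the sequence was cut."""
    # Running sum of segment lengths; the last value is the total length,
    # which is the end of the sequence rather than a cut, so drop it.
    return list(accumulate(len(s) for s in segments))[:-1]


if __name__ == "__main__":
    assert "".join(FIRST_CUTTING) == ORIGINAL
    assert "".join(SECOND_CUTTING) == ORIGINAL
    print(cut_positions(FIRST_CUTTING))
    print(cut_positions(SECOND_CUTTING))
```

The two cuttings share no cut position, which is why each fragment overlaps a fragment from the other cutting.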

5 Example. Input strings: S = {AGCCT, GCCTAGCCC, TAATCTG, AGC, CTGCC, TAGCCCTA, ATCTG}. [Figure: the seven input strings aligned under the original sequence AGCCTGCCTAGCCCTAATCTG.]
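
One way to see that the input set S determines the original sequence is a small backtracking search. The sketch below is not the speaker's algorithm, which the slides do not spell out. It assumes the fragments come from exactly two complete cuttings of the same sequence, and it grows two concatenations side by side, always extending the shorter one with an unused fragment that stays consistent with the longer one. The name `assemble` is hypothetical.

```python
def assemble(fragments):
    """Return a sequence spelled by two disjoint orderings of `fragments`, or None."""
    n = len(fragments)

    def extend(a, b, used):
        # Invariant: the shorter of a and b is a prefix of the longer.
        if len(used) == n:
            return a if a == b else None
        short, long_ = (a, b) if len(a) <= len(b) else (b, a)
        for i, frag in enumerate(fragments):
            if i in used:
                continue
            cand = short + frag
            # Keep the candidate only if the two tracks still agree.
            if cand.startswith(long_) or long_.startswith(cand):
                result = extend(cand, long_, used | {i})
                if result is not None:
                    return result
        return None

    return extend("", "", frozenset())


if __name__ == "__main__":
    S = ["AGCCT", "GCCTAGCCC", "TAATCTG", "AGC", "CTGCC", "TAGCCCTA", "ATCTG"]
    print(assemble(S))
```

The search is exponential in the worst case, so it only suits toy inputs like the one on this slide.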

6 Experimental Results. Table columns: LOCUS in NCBI; length of the original DNA sequence (base pairs); number of input strings; time (sec.). LOCUS entries (truncated): NC_, NC_, NC_, BX, AP, AP, AP, BX.

7 2-Matching Double Digest Problem. Given three sets of distances: A = {2, 9, 5}, B = {7, 3, 6}, C = {1, 4, 2, 7, 2}. Our job is to find the following solution: [Figure: the blocks of A, B and C laid out along the same sequence.] i_1 = 1, 2, …, p; i_2 = 1, 2, …, q; i_3 = 1, 2, …, r.
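
To make the relation between A, B and C concrete, the sketch below merges the cut sites implied by one ordering of A and one ordering of B and returns the resulting block lengths. The function name `double_digest` is hypothetical. The orderings used in the demo were worked out by hand for this illustration and are not taken from the slides.

```python
from itertools import accumulate


def double_digest(a_order, b_order):
    """Block lengths obtained when the cut sites of both orderings are combined."""
    if sum(a_order) != sum(b_order):
        raise ValueError("A and B must cover the same total length")
    # Running sums give the cut sites; both sets also contain the total length.
    cuts = set(accumulate(a_order)) | set(accumulate(b_order))
    positions = [0] + sorted(cuts)
    return [hi - lo for lo, hi in zip(positions, positions[1:])]


if __name__ == "__main__":
    c_order = double_digest([2, 5, 9], [6, 3, 7])
    print(c_order)
    # Same multiset as C = {1, 4, 2, 7, 2} from the slide.
    print(sorted(c_order) == sorted([1, 4, 2, 7, 2]))
```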

8 Basic Ideas of Our Algorithm. There are two blocks in A or B whose lengths are equal to the lengths of the starting and ending blocks in C. For every two adjacent blocks in C, there is a block in either A or B whose length is equal to the sum of the lengths of those two adjacent blocks. [Figure: the blocks of A, B and C.]
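
The two conditions on this slide can be tested for a candidate ordering of C. The sketch below is a simplified check under stated assumptions: it uses plain membership in the union of A and B and does not track how many times each block is used, so it verifies necessary conditions only. The name `satisfies_two_matching` is hypothetical.

```python
def satisfies_two_matching(c_order, a, b):
    """Check the end-block and adjacent-sum conditions for an ordering of C."""
    pool = set(a) | set(b)
    # Condition 1: the first and last blocks of C match a block of A or B.
    if c_order[0] not in pool or c_order[-1] not in pool:
        return False
    # Condition 2: every pair of adjacent C blocks sums to a block of A or B.
    return all(x + y in pool for x, y in zip(c_order, c_order[1:]))


if __name__ == "__main__":
    A, B = [2, 9, 5], [7, 3, 6]  # data from slide 7
    print(satisfies_two_matching([2, 4, 1, 2, 7], A, B))
    print(satisfies_two_matching([1, 4, 2, 7, 2], A, B))
```

A check like this could prune candidate orderings of C before a full search over the orderings of A and B.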

9 Example. Input: A = {3, 4, 7}, B = {6, 3, 5}, C = {1, 2, 3, 4, 4}. [Figure: the blocks of A, B and C.]
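
For an input this small, a solution can be found by exhaustive search. The sketch below is a brute-force baseline, not the algorithm presented in the talk: it tries every ordering of A and B and keeps the first pair whose combined cut sites reproduce C as a multiset. The function names are hypothetical.

```python
from itertools import accumulate, permutations


def merged_blocks(a_order, b_order):
    """Block lengths after combining the cut sites of both orderings."""
    cuts = set(accumulate(a_order)) | set(accumulate(b_order))
    positions = [0] + sorted(cuts)
    return [hi - lo for lo, hi in zip(positions, positions[1:])]


def solve_double_digest(a, b, c):
    """Return (a_order, b_order, c_order) consistent with the input, or None."""
    if not (sum(a) == sum(b) == sum(c)):
        return None
    target = sorted(c)
    for a_order in permutations(a):
        for b_order in permutations(b):
            c_order = merged_blocks(a_order, b_order)
            if sorted(c_order) == target:
                return list(a_order), list(b_order), c_order
    return None


if __name__ == "__main__":
    print(solve_double_digest([3, 4, 7], [6, 3, 5], [1, 2, 3, 4, 4]))
```

More than one arrangement can satisfy the same input, so the search reports the first one it encounters.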

10 Experimental Results. We designed a visualization tool to display our experimental results.