Class 01 – Fragment assembly

DNA sequence data
DNA sequence data is the motherlode of molecular biology.
10^10 base pairs.
One human genome/year.
It is our portal to protein sequences.
It is fast, cheap and reliable.
How do we get it?

Where the fragments come from
Make many copies of a chromosome, using PCR (polymerase chain reaction).
Break it up into short pieces. (We can sequence short pieces only.)
Reassemble the short pieces.

Simplest version
Like a jigsaw puzzle, except that we match overlaps rather than adjacencies.
Assume that the shortest assembled string (the shortest superstring) is the correct solution.
Assume that we know the orientation of each fragment, and the approximate length of the correct answer.
(Real-world considerations.)

Toy example
ACCGT
CGTGC
TTAC
TACCGT

Solution
--ACCGT--
----CGTGC
TTAC-----
-TACCGT--
_________
TTACCGTGC
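The toy solution can be checked mechanically. The following is a minimal sketch, not part of the original slides: the helper name `overlap` and the variable names are chosen here for illustration. It computes suffix/prefix overlaps between fragments and confirms that the proposed answer contains every fragment.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


# Fragments and proposed answer taken from the toy example.
fragments = ["ACCGT", "CGTGC", "TTAC", "TACCGT"]
candidate = "TTACCGTGC"

# A superstring must contain every fragment as a contiguous substring.
is_superstring = all(f in candidate for f in fragments)

# Offset of each fragment inside the candidate (matches the alignment rows).
offsets = {f: candidate.find(f) for f in fragments}

print(is_superstring)   # True
print(offsets)          # {'ACCGT': 2, 'CGTGC': 4, 'TTAC': 0, 'TACCGT': 1}
print(overlap("TTAC", "TACCGT"), overlap("TACCGT", "CGTGC"))  # 3 3
```

Note that ACCGT is wholly contained in TACCGT, so it adds nothing to the length of the answer.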

Real-world complications
This model is too optimistic to be realistic.
Problems:
Errors in reading fragments
Contamination (chimeras)
Fragments could come from either strand
Repeats
Inverted repeats
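The "either strand" complication means that a fragment may have to be matched as read or as its reverse complement. A minimal sketch follows, assuming fragments use only the letters A, C, G, T; the function names are illustrative and do not come from the slides.

```python
# Translation table for Watson-Crick base pairing (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_orientations(fragment: str) -> set[str]:
    """Both readings an assembler would have to consider for one fragment."""
    return {fragment, reverse_complement(fragment)}


print(reverse_complement("ACCGT"))       # ACGGT
print(candidate_orientations("GAATTC"))  # {'GAATTC'}
```

A sequence such as GAATTC equals its own reverse complement, which is a small instance of why inverted repeats make orientation ambiguous.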

The coverage problem
Incomplete coverage (leaving ‘contigs’)
We may have complete coverage, but not know it (for sure!)
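To see how incomplete coverage leaves contigs, consider a simulation in which the true position of every fragment is known (in a real experiment it is not). The sketch below is an illustration under that assumption: the function name `contigs`, the half-open `(start, end)` interval convention, and the `min_overlap` parameter are all hypothetical choices, not from the slides.

```python
def contigs(intervals, min_overlap=1):
    """Merge fragment intervals (start, end) into contigs.

    Intervals are half-open positions on the target sequence. Two
    fragments are joined only if they share at least `min_overlap`
    bases, since an assembler cannot join fragments that merely touch.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and merged[-1][1] - start >= min_overlap:
            # Extend the current contig.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # A gap (or too little overlap) starts a new contig.
            merged.append([start, end])
    return [tuple(m) for m in merged]


# Four hypothetical fragments on a target of length 20.
print(contigs([(0, 5), (3, 9), (12, 18), (16, 20)]))
# [(0, 9), (12, 20)]
```

Positions 9 to 11 are covered by no fragment, so the assembly falls into two contigs with nothing in the data to say how far apart they are.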

Shortest superstring problem (SSP)
Input: A collection F of strings.
Output: A shortest possible string S such that, for every f in F, S is a superstring of f.
Theorem: SSP is NP-hard (its decision version is NP-complete).
Fact: Approximation algorithms for SSP are of no known practical value.
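The problem statement can be made concrete with two small sketches: an exact solver that tries every ordering of the fragments (feasible only for a handful of strings, since the number of orderings grows factorially) and the classic greedy heuristic that repeatedly merges the pair with the largest overlap. Both are illustrations written for this example, with hypothetical function names; they are not presented as the lecture's method, and the greedy version is not guaranteed to return a shortest superstring.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(fragments):
    """Remove duplicates and fragments that are substrings of another."""
    unique = sorted(set(fragments))
    return [f for f in unique if not any(f != g and f in g for g in unique)]


def exact_superstring(fragments):
    """Exhaustive search over all orderings; exponential time."""
    frags = drop_contained(fragments)
    if not frags:
        return ""
    best = None
    for order in permutations(frags):
        s = order[0]
        for f in order[1:]:
            s += f[overlap(s, f):]   # glue on only the non-overlapping tail
        if best is None or len(s) < len(best):
            best = s
    return best


def greedy_superstring(fragments):
    """Heuristic: merge the best-overlapping pair until one string remains."""
    frags = drop_contained(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j and overlap(a, b) > best_k:
                    best_k, best_i, best_j = overlap(a, b), i, j
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags[0] if frags else ""


toy = ["ACCGT", "CGTGC", "TTAC", "TACCGT"]
print(exact_superstring(toy))   # TTACCGTGC
print(greedy_superstring(toy))  # TTACCGTGC
```

On the toy example the two agree, but that is a property of this input rather than of the heuristic.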

Does motivation trump solution?
Biologist: ‘Find an efficient algorithm which solves my problem.’
Computer scientist: ‘Give me a problem which I can solve efficiently.’
Culture clash: What happens when neither is possible?