Welcome to Introduction to Bioinformatics. Wednesday, 10 February. Genome Sequencing/Assembly

Presentation transcript:

What to do for summer vacation?

Deadline, SUNday Feb 28!

Target, Monday Mar 1!

Deadline, ???

Deadline, FRIday Feb 26!

Global Viral Genome Project Deadline, whenever!

Learn more about… HHMI: BBSI: VCU-USF: GVGP: (News)

What is the sequence (5' to 3') represented by the gel? (Myers et al., SQ2) Gel lanes: G A T C

Dideoxy sequencing (= Sanger sequencing)

Dideoxy sequencing

What is the sequence (5' to 3') represented by the gel? (Myers et al., SQ2) Gel lanes: G A T C. ddC. TCGTGTACATCGTAACACGGTTAAGT

Sequencing process: Drosophila genome (~100 million nt). Sequence it. Technical limitation: reads are limited to 100's of nt.

Sequencing process Drosophila genome (~100 million nt)... How many possible 500 nt fragments are there?
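A minimal sketch of one way to count the fragments, assuming the genome is treated as a single linear molecule and that a fragment is identified only by its start position on one strand. The function name `count_fragments` is hypothetical. A real genome is split across several chromosomes, which lowers the count slightly.

```python
def count_fragments(genome_length: int, fragment_length: int) -> int:
    """Number of distinct start positions for a fragment on one linear strand."""
    if fragment_length > genome_length:
        return 0  # the fragment cannot fit anywhere
    # A window of length L can start at positions 1 .. G - L + 1.
    return genome_length - fragment_length + 1


genome_nt = 100_000_000  # ~100 million nt, the figure given on the slide
fragment_nt = 500        # fragment length given on the slide

print(count_fragments(genome_nt, fragment_nt))
```

Under these assumptions the number of possible fragments is almost the same as the number of nucleotides, because nearly every position can serve as a start.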

Sequencing process: Drosophila genome (~100 million nt)... SAMPLE. How many 500 nt samples are needed to total ~100 million nt? Is this enough? Oversampling … coverage?
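The arithmetic behind this question can be sketched as follows. The function names `reads_needed` and `fold_coverage` are hypothetical, and the sketch assumes every read has exactly the same length. Coverage here means total nucleotides sequenced divided by genome length.

```python
import math


def reads_needed(genome_length: int, read_length: int, coverage: float = 1.0) -> int:
    """Reads required so that total sequenced nt = coverage x genome length."""
    return math.ceil(coverage * genome_length / read_length)


def fold_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Total sequenced nt divided by genome length (1.0 means 1x)."""
    return num_reads * read_length / genome_length


genome_nt = 100_000_000  # ~100 million nt, from the slide
read_nt = 500            # sample length, from the slide

n = reads_needed(genome_nt, read_nt)           # reads for 1x
print(n, fold_coverage(n, read_nt, genome_nt))
```

Matching the genome length in total nucleotides is not the same as covering every position, since randomly placed reads overlap and leave gaps. That is why the slide asks whether this is enough.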

Paint the wall: how long will this take? (Study Questions 8 & 9: "oversampling"? "coverage"? Shotgun sequencing?)

Paint the wall: how long will this take? The wall is 40" x 25"; each paint ball covers 1 sq ". 1000 paint balls?

Oversampling and completeness: how much is painted with 1x oversampling? What fraction won't be painted? (Study Questions 8 & 9: "oversampling"? "coverage"? Shotgun sequencing?)
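One way to explore the question is a small simulation. This is a sketch under simplifying assumptions: the wall is a grid of 40 x 25 one-square-inch cells, and each paint ball lands squarely on one cell chosen uniformly at random. Real paint balls would straddle cell boundaries. The function name and the number of repeated runs are hypothetical choices.

```python
import random


def simulate_unpainted_fraction(width=40, height=25, balls=1000, seed=0):
    """Throw `balls` paint balls at random cells; return the unpainted fraction."""
    rng = random.Random(seed)   # seeded so each run is repeatable
    cells = width * height      # 40 x 25 = 1000 one-square-inch cells
    painted = set()
    for _ in range(balls):
        painted.add(rng.randrange(cells))  # each ball paints one random cell
    return 1 - len(painted) / cells


# Average over many walls to smooth out run-to-run variation.
trials = [simulate_unpainted_fraction(seed=s) for s in range(200)]
mean_unpainted = sum(trials) / len(trials)
print(f"mean unpainted fraction at 1x oversampling: {mean_unpainted:.3f}")
```

With 1000 balls on 1000 cells (1x oversampling), the average should land near (1 - 1/1000)^1000, about 0.37. So roughly a third of the wall stays bare even though enough paint was thrown to cover it once.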

Probability that two coins both come up tails: gets T from the first AND gets T from the second. Intersection of possibilities (rule of multiplication, for independent events): P(TT) = 1/2 x 1/2 = 1/4

Probability that either of two coins comes up tails: is it 1/2 x 1/2 = 1/4? Is it 1/2 + 1/2 = 1? Neither. Gets HT or TH or TT. Union of possibilities (rule of addition, for mutually exclusive events): P(at least 1 T) = 1/4 + 1/4 + 1/4 = 3/4

Probability that either of two coins comes up tails, by complementation: P(at least 1 T) = 1 - P(no T) = 1 - P(HH) = 1 - 1/4 = 3/4. In general, Probability(X) = 1 - Probability(NOT X): an event and its complement add to 1 (rule of complementation; yin-yang).
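The three rules can be checked by listing every outcome of two fair coin tosses. This is a minimal sketch that assumes the four outcomes are equally likely. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All outcomes of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))
p_each = Fraction(1, len(outcomes))  # 1/4 each, since the coins are fair

# Rule of multiplication (intersection): T on the first AND T on the second.
p_tt = Fraction(1, 2) * Fraction(1, 2)

# Rule of addition (union of mutually exclusive outcomes): HT or TH or TT.
p_at_least_one_t = sum(p_each for o in outcomes if "T" in o)

# Rule of complementation: 1 minus the probability of no tails at all (HH).
p_hh = sum(p_each for o in outcomes if "T" not in o)
p_by_complement = 1 - p_hh

print(p_tt, p_at_least_one_t, p_by_complement)
```

Addition and complementation give the same answer. Complementation is the more convenient route when the number of tosses (or reads) is large, because only one outcome has to be counted.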

Sequencing process: Drosophila genome (~100 million nt)... Focus on one nucleotide… What’s the probability that it’s covered by one read? What’s the probability that it’s covered by two reads? What’s the probability that it’s covered by 200,000 reads?
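A sketch of how these probabilities might be worked out, under stated assumptions: reads are 500 nt long, each read starts at a uniformly random position, reads are independent, and end effects are ignored. The slide's questions are read here as "covered by one given read", "covered by both of two given reads", and "covered by at least one of 200,000 reads". Other readings are possible. All function names are hypothetical.

```python
def p_covered_by_one_read(read_length: int, genome_length: int) -> float:
    """Chance that a single random read includes a chosen nucleotide."""
    return read_length / genome_length


def p_missed_by_all(num_reads: int, read_length: int, genome_length: int) -> float:
    """Chance that none of the reads includes the nucleotide.

    Rule of multiplication: each independent read misses with probability 1 - p.
    """
    p = p_covered_by_one_read(read_length, genome_length)
    return (1 - p) ** num_reads


genome_nt = 100_000_000  # ~100 million nt, from the slide
read_nt = 500            # assumed read length, as on the earlier slides

p_one = p_covered_by_one_read(read_nt, genome_nt)
p_both_of_two = p_one ** 2                         # rule of multiplication
p_missed = p_missed_by_all(200_000, read_nt, genome_nt)
p_at_least_once = 1 - p_missed                     # rule of complementation

print(p_one, p_both_of_two, p_missed, p_at_least_once)
```

Under these assumptions, a nucleotide is missed at 1x coverage with probability close to 1/e. This is the same result as for the unpainted fraction of the wall.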

Problem Set 3, Problem 2 Statistics of mini-plasmid assembly

Why read pairs? Scaffolds? (Myers et al., SQ6) Figure labels: DNA; Contig 1; Contig 2

Why read pairs? Scaffolds? (Myers et al., SQ6) Figure labels: plasmid insert, ~2000 nt; primer; mates; x 1000's; gel lanes G A T C

Why read pairs? Scaffolds? (Myers et al., SQ6) Figure labels: Bacterial Artificial Chromosome or P1-derived Artificial Chromosome insert, ~150,000 nt; mates

Myers et al SQ6 Why read pairs? Scaffolds?

SQ14. From figures given in the text and in Table 1, check the accuracy of each of the following statements (Myers et al., 2000):
a. "We produced million reads that yielded 1.76 Gbp of sequence..."
b. "...trillions of overlaps between reads are examined."
c. "...to produce 654,000 of the 2-kbp mates and 497,000 of the 10-kbp mates."