DNA Sequencing. DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT.

Slides:



Advertisements
Similar presentations
Applications of genome sequencing projects 1) Molecular Medicine 2) Energy sources and environmental applications 3) Risk assessment 4) Bioarchaeology,
Advertisements

applications of genome sequencing projects
Genomics – The Language of DNA Honors Genetics 2006.
CS273a Lecture 5, Win07, Batzoglou Quality of assemblies—mouse N50 contig length Terminology: N50 contig length If we sort contigs from largest to smallest,
Connection Between Alignment and HMMs. A state model for alignment -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC--- TAG-CTATCAC--GACCGC-GGTCGATTTGCCCGACC IMMJMMMMMMMJJMMMMMMJMMMMMMMIIMMMMMIII.
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. CHAPTER 18 LECTURE SLIDES.
Some new sequencing technologies. Molecular Inversion Probes.
Physical Mapping I CIS 667 February 26, Physical Mapping A physical map of a piece of DNA tells us the location of certain markers  A marker is.
DNA Sequencing.
DNA Sequencing. CS273a Lecture 3, Spring 07, Batzoglou DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT.
DNA Sequencing. Next few topics DNA Sequencing  Sequencing strategies Hierarchical Online (Walking) Whole Genome Shotgun  Sequencing Assembly Gene Recognition.
CS273a Lecture 9, Aut08, Batzoglou CS273a Lecture 9, Fall 2008 Quality of assemblies—mouse N50 contig length Terminology: N50 contig length If we sort.
DNA Sequencing and Assembly
DNA Sequencing.
DNA Sequencing. CS262 Lecture 9, Win07, Batzoglou DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT.
DNA Sequencing and Assembly. DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA.
$399 Personal Genome Service $2,500 Health Compass service $985 deCODEme (November 2007) (April 2008) $350,000 Whole-genome sequencing (November 2007)
DNA Sequencing.
Conditional Random Fields 1 2 K … 1 2 K … 1 2 K … … … … 1 2 K … x1x1 x2x2 x3x3 xKxK 2 1 K 2.
DNA Sequencing. CS262 Lecture 9, Win06, Batzoglou DNA Sequencing – gel electrophoresis 1.Start at primer(restriction site) 2.Grow DNA chain 3.Include.
DNA Sequencing. DNA sequencing … ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT ACTGATGACTAGATTACAG ACTGATTTAGATACCTGAC.
DNA Sequencing. CS262 Lecture 9, Win06, Batzoglou DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT.
CS273a Lecture 1, Autumn 10, Batzoglou DNA Sequencing.
CS273a Lecture 2, Autumn 10, Batzoglou DNA Sequencing (cont.)
DNA Sequencing. CS273a Lecture 3, Autumn 08, Batzoglou DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT.
CS273a Lecture 4, Autumn 08, Batzoglou DNA Sequencing.
Genomes summary 1.>930 bacterial genomes sequenced. 2.Circular. Genes densely packed Mbases, ,000 genes 4.Genomes of >200 eukaryotes (45.
DNA Sequencing. Next few topics DNA Sequencing  Sequencing strategies Hierarchical Online (Walking) Whole Genome Shotgun  Sequencing Assembly Gene Recognition.
Genome sequencing. Vocabulary Bac: Bacterial Artificial Chromosome: cloning vector for yeast Pac, cosmid, fosmid, plasmid: cloning vectors for E. coli.
Genome Analysis Determine locus & sequence of all the organism’s genes More than 100 genomes have been analysed including humans in the Human Genome Project.
Basic Molecular Biology Many slides by Omkar Deshpande.
DNA basics DNA is a molecule located in the nucleus of a cell Every cell in an organism contains the same DNA Characteristics of DNA varies between individuals.
DNA Technology and Genomics
Chapter 20~DNA Technology & Genomics. Who am I? Recombinant DNA n Def: DNA in which genes from 2 different sources are linked n Genetic engineering:
HAPLOID GENOME SIZES (DNA PER HAPLOID CELL) Size rangeExample speciesEx. Size BACTERIA1-10 Mb E. coli: Mb FUNGI10-40 Mb S. cerevisiae 13 Mb INSECTS.
The Clone Age Human Genome Project Recombinant DNA Gel Electrophoresis DNA fingerprints
Trends in Biotechnology
1 Genetics Faculty of Agriculture Instructor: Dr. Jihad Abdallah Topic 13:Recombinant DNA Technology.
Chapter 16 Gene Technology. Focus of Chapter u An introduction to the methods and developments in: u Recombinant DNA u Genetic Engineering u Biotechnology.
CS273a Lecture 4, Autumn 08, Batzoglou CS273a 2011 DNA Sequencing.
Genome sequencing Haixu Tang School of Informatics.
20.1 Structural Genomics Determines the DNA Sequences of Entire Genomes The ultimate goal of genomic research: determining the ordered nucleotide sequences.
Genome sequencing Haixu Tang School of Informatics.
Biological Motivation for Fragment Assembly Rhys Price Jones Anne R. Haake.
DNA Technology. Overview DNA technology makes it possible to clone genes for basic research and commercial applications DNA technology is a powerful set.
Used for detection of genetic diseases, forensics, paternity, evolutionary links Based on the characteristics of mammalian DNA Eukaryotic genome 1000x.
Review from last week. The Making of a Plasmid Plasmid: - a small circular piece of extra-chromosomal bacterial DNA, able to replicate - bacteria exchange.
Basic Molecular Biology Many slides by Omkar Deshpande.
Studijní obor Bioinformatika. LAST LECTURE SUMMARY.
DNA Technology Chapter 11. Genetic Technology- Terms to Know Genetic engineering- Genetic engineering- Recombinant DNA- DNA made from 2 or more organisms.
Alu Polymorphysims as Identity Markers Using PCR and Gel Electrophoresis.
Linkage and Mapping. Figure 4-8 For linked genes, recombinant frequencies are less than 50 percent.
CS273a Lecture 4, Autumn 08, Batzoglou CS273a 2013 DNA Structure.
Human Genome.
GENE SEQUENCING. INTRODUCTION CELL The cells contain the nucleus. The chromosomes are present within the nucleus.
GENETIC ENGINEERING CHAPTER 20
13-1 Copyright  2005 McGraw-Hill Australia Pty Ltd PPTs t/a Biology: An Australian focus 3e by Knox, Ladiges, Evans and Saint Chapter 13: Genetic engineering.
DNA Sequencing.
Genomics Chapter 18.
Whole Genome Sequencing (Lecture for CS498-CXZ Algorithms in Bioinformatics) Sept. 13, 2005 ChengXiang Zhai Department of Computer Science University of.
Genome Analysis. This involves finding out the: order of the bases in the DNA location of genes parts of the DNA that controls the activity of the genes.
Title: Studying whole genomes Homework: learning package 14 for Thursday 21 June 2016.
15.2 Recombinant DNA. Copying DNA – How do scientists copy the DNA of living organisms? –The first step in using the polymerase chain reaction method.
DNA Sequencing.
DNA Sequencing Project
Relationship between Genotype and Phenotype
CLONING VECTORS Shumaila Azam.
CSCI 1810 Computational Molecular Biology 2018
Introduction to Sequencing
Relationship between Genotype and Phenotype
Presentation transcript:

DNA Sequencing

DNA sequencing How we obtain the sequence of nucleotides of a species …ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT ACTGATGACTAGATTACAG ACTGATTTAGATACCTGAC TGATTTTAAAAAAATATT…

Which representative of the species? Which human? Answer one: Answer two: it doesn’t matter Polymorphism rate: number of letter changes between two different members of a species Humans: ~1/1,000 Other organisms have much higher polymorphism rates  Population size!

Why humans are so similar A small population that interbred reduced the genetic variation Out of Africa ~ 40,000 years ago Out of Africa Heterozygosity: H H = 4Nu/(1 + 4Nu) u ~ 10 -8, N ~ 10 4  H ~ 4  N

DNA Sequencing Goal: Find the complete sequence of A, C, G, T’s in DNA Challenge: There is no machine that takes long DNA as an input, and gives the complete sequence as output Can only sequence ~900 letters at a time

DNA Sequencing – vectors + = DNA Shake DNA fragments Vector Circular genome (bacterium, plasmid) Known location (restriction site)

Different types of vectors VECTORSize of insert Plasmid 2,000-10,000 Can control the size Cosmid40,000 BAC (Bacterial Artificial Chromosome) 70, ,000 YAC (Yeast Artificial Chromosome) > 300,000 Not used much recently

DNA Sequencing – gel electrophoresis 1.Start at primer(restriction site) 2.Grow DNA chain 3.Include dideoxynucleoside (modified a, c, g, t) 4.Stops reaction at all possible points 5.Separate products with length, using gel electrophoresis

Current Sanger Sequencing Capacity 3730xl: 690 Kbp/day, 900bp reads … 1550 Kbp/day, 700bp reads … 2100 Kbp/day, 550bp reads

Reconstructing the Sequence (Fragment Assembly) Cover region with high redundancy Overlap & extend reads to reconstruct the original genomic region reads

Repeats Bacterial genomes:5% Mammals:50% Repeat types: Low-Complexity DNA (e.g. ATATATATACATA…) Microsatellite repeats (a 1 …a k ) N where k ~ 3-6 (e.g. CAGCAGTAGCAGCACCAG) Transposons  SINE (Short Interspersed Nuclear Elements) e.g., ALU: ~300-long, 10 6 copies  LINE (Long Interspersed Nuclear Elements) ~4000-long, 200,000 copies  LTR retroposons (Long Terminal Repeats (~700 bp) at each end) cousins of HIV Gene Families genes duplicate & then diverge (paralogs) Recent duplications ~100,000-long, very similar copies

Sequencing and Fragment Assembly AGTAGCACAGA CTACGACGAGA CGATCGTGCGA GCGACGGCGTA GTGTGCTGTAC TGTCGTGTGTG TGTACTCTCCT 3x10 9 nucleotides 50% of human DNA is composed of repeats Error! Glued together two distant regions

What can we do about repeats? Two main approaches: Cluster the reads Link the reads

What can we do about repeats? Two main approaches: Cluster the reads Link the reads

What can we do about repeats? Two main approaches: Cluster the reads Link the reads

Pyrosequencing on a chip $10K 400,000 reads/day bp/read => ~100Mbp/day $10K 1M reads/day 500 bp/read => ~500Mbp/day

Single Molecule Array for Genotyping—Solexa ~$10K ~1.5 Mbp, 36bp reads 3 days

Polony Sequencing

Nanopore Sequencing

Molecular Inversion Probes

Illumina Genotype Arrays

Summary – Sequencing in 2008 Mb /day Read Length Paired?Cost Sanger ? Pyrosequencing 100 (400) 200 (400) ½ length$10K Solexa/ABI 50036Yes$3K Polony ?No$10K Genotyping Array 1M SNP /chip N.A. $300