Sequencing tutorial Peter HANTZ EMBL Heidelberg
Dideoxy (Sanger) sequencing
Principle:
- Synthesis: starts with a DNA oligo (an oligonucleotide is needed to initiate it) and stops after incorporating a (labeled) ddNTP; only a small fraction of the nucleotides are dideoxy
- Gel electrophoresis: discrimination of 1 bp below ~1000 bp
- First ~60 bp uncertain (high relative mass of the fluorescent dye)
- Radiolabeling: 4 reactions
- Dye-termination: 4 fluorescent dyes, one reaction
Source: Uni Osnabruck, M. Waterman
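The read-out logic of this principle can be sketched in code. This is only a toy model under simplifying assumptions: the function names and the example sequence are hypothetical, every possible termination product is assumed to be present, and the gel is assumed to resolve all lengths perfectly (the uncertain first ~60 bp are ignored).

```python
import random

def terminated_fragments(new_strand):
    """Every chain-terminated product as (length, labeled terminal base).

    Assumption: in some molecule the synthesis stops at each position,
    because only a small fraction of the nucleotides are ddNTPs.
    """
    return [(i + 1, base) for i, base in enumerate(new_strand)]

def read_gel(fragments):
    """Order the fragments by length (1 bp resolution) and read the labels."""
    return "".join(base for _, base in sorted(fragments))

fragments = terminated_fragments("GATTACA")
random.shuffle(fragments)      # in the tube the products are mixed
print(read_gel(fragments))     # sorting by length restores the order
```

The sequence is recovered from fragment lengths alone, which is why a 1 bp resolution of the gel is essential.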
Pyrosequencing (Roche / 454)
Bead I. Library construction
- A, B: short DNA oligos fused with genomic DNA segments; B is biotinylated
- Selection of dsDNA: streptavidin-coated magnetic beads
- Denaturation: AB strands collected. The strand complementary to the bead-bound one comes off and is used later; the beads are discarded
Source: www.454.com, wiki
Pyrosequencing (Roche / 454)
Bead II.
- Simple agarose beads coated with B oligos
- A single sstDNA (single-stranded template DNA) with cA and cB oligos is immobilized on each bead: the templates stick to the oligo chains on the bead
- Bead-bound library emulsified (water-in-oil): water droplets!
- PCR reaction: one strand will be covalently bound to the bead. For now the DNA is double-stranded: one strand is covalently attached, the other is held only by hybridization
Source: www.454.com, wiki
Pyrosequencing (Roche / 454)
- Denaturation: one strand is released, so that a single strand remains (sequencing by synthesis works only with a single strand)
- Selection of DNA-positive beads (enrichment)
- Beads + reactants are then placed in wells with a diameter of ca. 40 µm
Source: www.454.com, wiki
Pyrosequencing (Roche / 454)
The reaction:
- Addition of dNTPs: incorporation releases pyrophosphate (PPi, two phosphates; only one phosphate is needed for the backbone)
- ATP sulfurylase converts PPi to ATP
- Luciferase acts in the presence of ATP: the emitted light is detected (single color)
- Unincorporated nucleotides and ATP are degraded by apyrase
- 400,000 reads in parallel
- Multiple consecutive incorporations of the same nucleotide: higher signal intensity, problematic...
Source: www.454.com, wiki
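The signal logic of the reaction can be illustrated with a small simulation. This is a sketch under idealized assumptions: the light signal is taken to be exactly proportional to the number of nucleotides incorporated in one flow, the flow order "TACG" is an arbitrary choice, and the function names are hypothetical.

```python
def flowgram(new_strand, flow_order="TACG"):
    """Ideal signals for a strand synthesized by flowing one dNTP at a time.

    Returns a list of (nucleotide, signal) pairs; the signal is the number
    of bases incorporated in that flow (0 means no light).
    """
    if any(base not in flow_order for base in new_strand):
        raise ValueError("strand contains a base that is never flowed")
    signals, pos = [], 0
    while pos < len(new_strand):
        for nt in flow_order:
            run = 0
            # A run of identical bases is incorporated in one single flow.
            while pos < len(new_strand) and new_strand[pos] == nt:
                run += 1
                pos += 1
            signals.append((nt, run))
            if pos == len(new_strand):
                break
    return signals

def decode(signals):
    """Turn the signal intensities back into a sequence."""
    return "".join(nt * run for nt, run in signals)

print(flowgram("TTAGGGC"))
```

In this ideal model a run of three G bases gives a signal of 3. In practice the intensities of long runs are hard to tell apart, which is the problem the last bullet refers to.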
Illumina (Solexa) sequencing
- Making a DNA library (~300 bp fragments)
- Ligation of adapters A and B to the fragments
- Denaturation, then binding the ssDNA randomly to the flow cell surface (the strands are held only by hybridization)
- Complementary primers are attached to the surface
Source: Illumina-Fasteris
Illumina (Solexa) sequencing
Bridge amplification: initiation
- On the surface: complementary oligos
- At this stage the synthesis is not yet done for sequencing
- The new strand is covalently attached to the glass; the original strand is lost
Source: GeneCore
Illumina (Solexa) sequencing
- Still no sequencing at this stage, only amplification (linear)
Source: EMBL Gene Core
Illumina (Solexa) sequencing
Data acquisition: sequencing by synthesis (this is where the actual sequencing happens)
- "Reversible terminator" nucleotides: blocked + fluorescently labeled
- Dye cleavage + elimination
- De-blocking to enable the synthesis: cutting off the dangling blocking group restarts the extension
- Wash step + repeat
Source: illumina.com
Illumina (Solexa) sequencing
Mate-pair sequencing
- A tricky method for sequencing both ends of a fragment
- The part of interest is captured with biotin
Single Molecule Real Time Sequencing
Principle:
- Fluorescent label on the terminal phosphate of the dNTPs
- DNA polymerase cleaves this off (pyrophosphate release!); incorporation lasts ~ms
- Detection: "Zero-Mode Waveguide" holes: near-field standing waves (~Total Internal Reflection)
- Key point: a single molecule is enough (video)
- Present performance: read lengths of 1,500 bp
Source: Wiki, Pacific Biosciences
Assembling
Shotgun sequencing
- The genome is fragmented randomly (sonication)
- No positional and orientation information is available
- The fragments are sequenced (both of their ends)
- The results have to be assembled: merging reads into contigs
- Analogy: a picture cut into pieces. The content of the sequence is not of interest here (there is no translation)
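The missing orientation information has a simple consequence for any assembly program: a read may come from either strand, so a read and its reverse complement have to be treated as the same piece of DNA. A minimal sketch (the function names and the `canonical` convention are illustrative assumptions, not part of the tutorial):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(read):
    """The same fragment as read from the opposite strand."""
    return read.translate(COMPLEMENT)[::-1]

def canonical(read):
    """One agreed-upon representative of a read and its reverse complement.

    Assumption: we simply pick the alphabetically smaller of the two.
    """
    return min(read, reverse_complement(read))

print(reverse_complement("ATGC"))
print(canonical("TGCACGC") == canonical("GCGTGCA"))
```

With such a convention two reads of the same fragment taken from opposite strands map to the same string before overlaps are searched.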
Graphs
- A graph: a set of nodes and a set of edges that connect pairs of nodes
- Used to model pairwise relations between certain objects
Bridges of Königsberg (Leonhard Euler, 1735)
- Find a path that visits each bridge (= edge) once!
- Eulerian path problem: visit each edge once and only once; a linear-time algorithm exists
Source: www.bioalgorithms.info
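Euler's answer can be checked in a few lines: in a connected undirected graph an Eulerian path exists only if zero or two nodes have an odd number of edges. The sketch below assumes the usual textbook layout of the seven bridges; the land-mass labels A-D and the function names are hypothetical, and connectivity is assumed rather than checked.

```python
from collections import Counter

# Seven bridges between four land masses (labels are illustrative).
KONIGSBERG = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]

def odd_degree_nodes(edges):
    """Nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2)

def has_eulerian_path(edges):
    """Degree test only; assumes the graph is connected."""
    return len(odd_degree_nodes(edges)) in (0, 2)

print(odd_degree_nodes(KONIGSBERG))
print(has_eulerian_path(KONIGSBERG))
```

All four land masses have an odd degree, so no walk crosses every bridge exactly once. The test needs a single pass over the edges, which is why the problem is computationally easy.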
Hamiltonian Path Problem
- Find a route that visits each node (= each airport) exactly once
- This is an NP-complete problem (NP: nondeterministic polynomial time): no polynomial-time algorithm is known, so it requires enormous computational capacity
- The amount of computation necessary, using the most efficient algorithms known at present, grows exponentially with the size of the route map
Source: www.ams.org
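To make the cost tangible, here is a naive brute-force search that simply tries every ordering of the nodes. It is a sketch for small undirected graphs only (function name and examples are hypothetical); with n nodes it may inspect up to n! orderings.

```python
from itertools import permutations

def hamiltonian_path(nodes, edges):
    """Return one ordering that visits every node exactly once, or None."""
    adjacent = {frozenset(edge) for edge in edges}   # undirected edges
    for order in permutations(nodes):
        steps = zip(order, order[1:])
        if all(frozenset(step) in adjacent for step in steps):
            return list(order)
    return None

# A chain of four airports has such a route ...
print(hamiltonian_path("ABCD", [("A", "B"), ("B", "C"), ("C", "D")]))
# ... a hub with three spokes does not.
print(hamiltonian_path("ABCD", [("A", "B"), ("A", "C"), ("A", "D")]))
```

Already for 20 nodes there are about 2.4 * 10^18 orderings, so this approach is hopeless for realistic inputs.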
Traveling Salesman Problem
- Find the shortest path which visits every vertex exactly once, that is, the shortest Hamiltonian path
- This is also an NP-hard problem...
Source: www.wolfram.com
The Shortest Superstring Problem (SSP)
Given a set of strings, find a shortest string that contains all of them
- Input: strings s1, s2, …, sn
- Output: a string s that contains all strings s1, s2, …, sn as substrings, such that the length of s is minimized
- Equivalent to: finding the shortest Hamiltonian path, i.e. the TSP
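A brute-force sketch of the problem statement, for illustration only: it tries every order of the input strings and merges neighbours with their maximal overlap. The function names and the example reads are hypothetical, and the code assumes that no input string is a substring of another.

```python
from itertools import permutations

def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def shortest_superstring(strings):
    """Exact answer by exhaustive search; cost grows like n! (tiny inputs only)."""
    best = None
    for order in permutations(strings):
        merged = order[0]
        for nxt in order[1:]:
            merged += nxt[overlap(merged, nxt):]   # append the non-overlapping tail
        if best is None or len(merged) < len(best):
            best = merged
    return best

print(shortest_superstring(["AGGT", "GTCA", "CAAG"]))
```

Each ordering of the strings corresponds to one Hamiltonian path through them, which is where the link to the TSP comes from.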
Graph Theory helps DNA assembly
"Translation" of the problem: a model
- Nodes: reads
- Edges: connect two nodes if the corresponding reads overlap
Example: assembling a bacterial genome
- Red lines: wrong assembly
- Bold black lines: good assembly
Assembling the reads = finding the shortest Hamiltonian path = TSP = SSP: NP-hard... impossible...?
Source: University of Maryland
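The model can be written down directly: reads become nodes, and a directed, weighted edge is added whenever the end of one read matches the start of another. A minimal sketch; the reads, the `min_len` threshold and the function names are illustrative assumptions.

```python
def overlap(a, b, min_len):
    """Longest suffix of a equal to a prefix of b (0 if shorter than min_len)."""
    longest = min(len(a), len(b))
    return next((n for n in range(longest, min_len - 1, -1)
                 if a[-n:] == b[:n]), 0)

def overlap_graph(reads, min_len=2):
    """Edges (a, b) -> overlap length, for every ordered pair of distinct reads."""
    graph = {}
    for a in reads:
        for b in reads:
            if a != b:
                n = overlap(a, b, min_len)
                if n:
                    graph[(a, b)] = n
    return graph

for (a, b), n in overlap_graph(["ATGGC", "GGCGT", "CGTGCA"]).items():
    print(a, "->", b, "overlap", n)
```

Building the graph is cheap (all pairs of reads); the expensive part is finding the path that visits every read once.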
The Way Out: Constructing and analyzing de Bruijn Graphs
- Finding Eulerian paths in the de Bruijn graph can lead to sequence reconstruction
- Linear-time problem!
Source: J. Kaptcianos
Thank You for Your attention!
Second-generation DNA sequencing
"Sequencing by synthesis" methods: single-stranded DNA, incorporation of bases, light emission
- Size selection by gel electrophoresis (DNA of a given size range): (Solexa) 300 bp [normal] - 10 kb [mate-pair]; (454) 1-10 kb, and 20 kb in experimental stage
- DNA colonies amplified by PCR ("polonies"): (Solexa) isothermal extension, "bridge PCR" (note: even PCR-free!); (454) emulsion PCR
- Fluorescent imaging of the entire array
- Reads: (Solexa) ~50-80 bp; (454) ~200-300 bp
Source: Nature Biotech, vol. 26
Illumina (Solexa) sequencing flow cell: Paired-end sequencing EMBL GeneCore
ABI: capillary electrophoresis sequencing and SOLiD
Directed graphs
- We assign a direction to each edge
- The Eulerian Path Problem can be re-formulated accordingly: visit each edge exactly once while passing along the edges in their direction
- Note: an Eulerian path might not exist!
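Whether a directed Eulerian path can exist is decided by counting incoming and outgoing edges: every node must be balanced, except possibly one start node (one extra outgoing edge) and one end node (one extra incoming edge). The sketch below checks only this degree condition; the function name is hypothetical and the graph is assumed to be connected.

```python
from collections import Counter

def eulerian_path_possible(edges):
    """Degree test for a directed graph given as (tail, head) pairs.

    Assumption: all edges belong to one connected component (not checked).
    """
    balance = Counter()                 # out-degree minus in-degree
    for tail, head in edges:
        balance[tail] += 1
        balance[head] -= 1
    unbalanced = sorted(b for b in balance.values() if b != 0)
    return unbalanced in ([], [-1, 1])

print(eulerian_path_possible([("A", "B"), ("B", "C"), ("C", "A")]))  # a cycle
print(eulerian_path_possible([("A", "B"), ("A", "C")]))              # a fork
```

The fork fails because after following one edge out of A there is no way back to use the other one.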
(also known as the Overlap-Layout-Consensus method)
Examples: is the result really the shortest? (start)
Red: repeats
Source: M. Waterman
The Way Out: Constructing and analyzing de Bruijn Graphs
- de Bruijn graph: a directed graph representing overlaps between sequences of symbols
- Given sequences of symbols (~reads): ATG, TGG, TGC, GTG, GGC, GCA, GCG, CGT; these are the "k-length fragments" (k = 3)
- Nodes: fragments of length k-1 (k-1 = 2)
- Edges: k-length fragments connecting overlapping nodes
- Finding Eulerian paths in the de Bruijn graph can lead to sequence reconstruction (Superpath problem, Merging transformation, etc.)
- Linear-time problem!
Source: J. Kaptcianos
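The example from this slide can be worked through in code. The sketch below builds the de Bruijn graph from the eight 3-mers and walks an Eulerian path with Hierholzer's algorithm, one standard linear-time method (the slide does not say which algorithm is meant). Function names are hypothetical, and the input is assumed to be non-empty and to admit an Eulerian path.

```python
from collections import defaultdict

KMERS = ["ATG", "TGG", "TGC", "GTG", "GGC", "GCA", "GCG", "CGT"]

def de_bruijn(kmers):
    """Each k-mer becomes an edge from its (k-1)-prefix to its (k-1)-suffix."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph

def eulerian_path(graph):
    """Hierholzer's algorithm; every edge is used exactly once."""
    in_degree = defaultdict(int)
    for heads in graph.values():
        for head in heads:
            in_degree[head] += 1
    # Start where one more edge leaves than enters, else at any node.
    start = next((u for u in graph if len(graph[u]) - in_degree[u] == 1),
                 next(iter(graph)))
    remaining = {u: list(heads) for u, heads in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())   # follow an unused edge
        else:
            path.append(stack.pop())              # dead end: node is final
    return path[::-1]

def reconstruct(kmers):
    """Spell the sequence along the path: first node plus one base per step."""
    path = eulerian_path(de_bruijn(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])

print(reconstruct(KMERS))
```

The result has 10 bases and contains all eight 3-mers. This graph has more than one Eulerian path (ATGGCGTGCA and ATGCGTGGCA are both valid), so the k-mers alone do not always determine the sequence uniquely.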