- DNA sequencing in the last century - Current technologies (Illumina, Ion Torrent) - New developments (PacBio, Nanopore) Topics.


1 - DNA sequencing in the last century - Current technologies (Illumina, Ion Torrent) - New developments (PacBio, Nanopore) Topics

2 Sanger sequencing - Developed by Fred Sanger in the 1970s (1918-2013, two-time Nobel laureate: 1958 for the protein structure of insulin, 1980 for sequencing of nucleic acids) - Sequencing by synthesis: DNA polymerase synthesizes a complementary strand by adding single nucleotides (new strand TTGCACTGAGTCG on template AACGTGACTCAGCATAGGCTCAGATAGAT) - Random incorporation of blocked nucleotides -> at any position, the reaction stops in a small fraction of the reads (new strand TTGCACTTGAGTCGT on template AACGTGAACTCAGCATAGGCTCAGATAGAT, blocked by ddATP) - A-Reaction: add dATP (elongation) and ddATP (block); analogous: C-, G-, T-Reaction

3 Sanger sequencing - new strand TTGCACTTGAGTCG on template AACGTGAACTCAGCATAGGCTCAGATAGAT, terminated by ddNTP - A-Reaction: TTGCA, TTGCACTTGA - C-Reaction: TTGC, TTGCAC, TTGCACTTGAGTC - G-Reaction: TTG, TTGCACTTG, TTGCACTTGAG, TTGCACTTGAGTCG - T-Reaction: TT, TTGCACT, TTGCACTT, TTGCACTTGAGT - ladder of DNA fragments -> electrophoresis -> sequence (lanes T, G, C, A)
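
The ladder logic on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a model of the chemistry: it assumes every possible termination product is present, and the function names are made up for the example. Unlike the slide, it also lists the single-base fragment T in the T-Reaction.

```python
def sanger_fragments(product):
    """Group all chain-terminated fragments by the ddNTP that ended them."""
    reactions = {base: [] for base in "ACGT"}
    for end, base in enumerate(product, start=1):
        # A fragment ending at this position was blocked by dd<base>TP.
        reactions[base].append(product[:end])
    return reactions


def read_ladder(reactions):
    """Read the sequence off the ladder: shortest fragment first."""
    bands = sorted(
        (len(fragment), base)
        for base, fragments in reactions.items()
        for fragment in fragments
    )
    return "".join(base for _, base in bands)


product = "TTGCACTTGAGTCG"  # newly synthesized strand from the slide
reactions = sanger_fragments(product)
sequence = read_ladder(reactions)

for base in "ACGT":
    print(f"{base}-Reaction:", ", ".join(reactions[base]))
print("Read from ladder:", sequence)
```

Sorting by fragment length plays the role of electrophoresis here: each band's lane gives the base at that position.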

4 GATTGATAGTTGC CTAACTATCAACGTATAGGCTCAGATAGAT G GA GAT GATT GATTG GATTGA GATTGAT GATTGATA GATTGATAG GATTGATAGT GATTGATAGTT GATTGATAGTTG GATTGATAGTTGC - labeled ddNTPs, capillary sequencing Sanger sequencing

5 Pyrosequencing - immobilize DNA on beads, pyrosequencing in microreactors - incorporation of dTTP (new strand TTGCACTGAGTCGT on template AACGTGACTCAGCATAGGCTCAGATAGAT) -> PPi -> ATP -> oxyluciferin + light 454 technology

6 DNA-loaded beads + primer + polymerase + sulfurylase + luciferase -> flowgram (new strand TTGCACTGAGTCGT on template AACGTGACTCAGCAAGTCTATTCACCCAC...) - Problem: homopolymers difficult to detect 454 technology
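
A flowgram can be illustrated with a short, idealized Python sketch. It assumes a cyclic flow order T, A, C, G and a noise-free signal equal to the number of bases incorporated per flow; both are simplifications, and the function names are hypothetical.

```python
def flowgram(read, flow_order="TACG", max_flows=200):
    """Return (nucleotide, signal) per flow for an idealized pyrosequencing run."""
    signals = []
    pos = 0
    flow = 0
    while pos < len(read) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        # The polymerase incorporates the whole homopolymer run in one flow.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        flow += 1
    return signals


def decode(signals):
    """Turn signals back into bases by rounding each intensity."""
    return "".join(base * round(intensity) for base, intensity in signals)


read = "TTGCACTGAGTCGT"  # newly synthesized strand from the slide
signals = flowgram(read)
print(signals[:4])
print(decode(signals))
```

With real, noisy intensities the rounding step is where the homopolymer problem appears: the relative difference between a run of n and n+1 bases shrinks as n grows.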

7 increase throughput: - DNA gel electrophoresis, single genes in a few days - capillary electrophoresis, 96 capillaries per machine, human genome in a few years - sequencing on microbeads: 454 technology Parallelisation & Miniaturisation

8 Illumina sequencing: - sequencing by synthesis - massive parallelisation and miniaturisation by self-organising DNA microarrays on a glass surface - several hundred Gb, >10^9 reads per run Illumina technology

9 - generate libraries - grow clusters on a flowcell - sequence by addition and imaging of blocked & fluorescence-labeled nucleotides Illumina technology

10 library preparation: DNA fragments -> blunting by fill-in and exonuclease -> phosphorylation -> addition of A-overhang -> ligation to adapters Illumina technology

11 cluster generation: 1. flowcell Illumina technology

12 cluster generation: 1. flowcell 2. hybridize template Illumina technology

13 cluster generation: 1. flowcell 2. hybridize template 3. immobilize template Illumina technology

14 cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification Illumina technology

19 cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation Illumina technology

20 cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand Illumina technology

21 cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand 7. block 3'-ends Illumina technology

22 cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand 7. block 3'-ends 8. hybridize primer Illumina technology

23 Imaging & Sequencing: Illumina technology Nucleotide + fluorescent dye + terminator

24 reversible terminators: Illumina technology

25 fluorescently labelled clusters: Illumina technology

26 what can we do with short reads? - RNA-seq: identify transcripts, count #reads per transcript -> assessment of differential expression - problem: reads are too short to establish connectivity of all exons; difficult/impossible to quantify multiple isoforms of a gene Sequencing Applications

27 Stefan Krebs, 30.09.2013 Single end: ambiguous mapping Paired end sequencing: read fragment from both ends -> resolve ambiguities Improvements: Paired end Reads
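
How a mate read resolves an ambiguous mapping can be shown with a toy Python sketch. The genome, the reads, and the fragment-length limit are invented for illustration. For simplicity, both reads are given on the forward strand and matched exactly, whereas real aligners handle reverse-complemented mates and mismatches.

```python
def find_all(genome, read):
    """Return every start position at which the read matches exactly."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)
    return hits


def resolve_pair(genome, read1, read2, max_fragment):
    """Keep only placements where both reads fit on one fragment."""
    pairs = []
    for p1 in find_all(genome, read1):
        for p2 in find_all(genome, read2):
            fragment_length = p2 + len(read2) - p1
            if p2 >= p1 and fragment_length <= max_fragment:
                pairs.append((p1, p2))
    return pairs


repeat = "GATTACA"   # occurs twice in the toy genome
unique = "CGTCGTA"   # occurs once
genome = "CC" + repeat + "GG" + "T" * 12 + repeat + "AA" + unique + "GG"

single_end_hits = find_all(genome, repeat)
paired_hits = resolve_pair(genome, repeat, unique, max_fragment=20)
print("single end:", single_end_hits)
print("paired end:", paired_hits)
```

The single-end read has two equally good positions; only one of them lies within a plausible fragment length of its uniquely mapping mate.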

28 further improvements: long-jumping mate-pair libraries - circularize large fragments (2-10 kb) and read the junctions -> resolve large repeats in genome assembly Improvements: Circularization

29 Third generation Sequencing

30 - single molecule detection - several kilobases read length - moderate output (150,000 wells) - expensive instrument and high cost per base Pacific Biosciences

33 Read length distribution

34 Pacific Biosciences

35 Applications - everything that can be converted to a DNA strand can be sequenced - even long-term data storage by encoding in synthetic DNA is possible - BIOLOGICAL APPLICATIONS: sequencing of genomes, transcriptomes, population diversity, composition of microbial communities, ChIP-seq, methyl-Seq, translating RNA from ribosomes, ... - MEDICAL APPLICATIONS: whole genome sequencing, exome sequencing, tumor diagnostics, sequencing of T-cell receptor diversity, identification of pathogens, ... - FORENSICS, FOOD SAFETY, ARCHEOLOGY, …

36 Chromatin Immunoprecipitation (ChIP)

37 Motivation: Regulation of gene expression - Transcriptional: activation, repression (DNA, Pol II) - Post-transcriptional: translation, localization, stability (mRNA, 3’UTR) - [Figure: DNA, mRNA, protein]

38 Motivation: Regulation of gene expression - At which loci does a protein bind the DNA? - Are there cell-type or environment-specific variations of binding affinity? - Which histone modifications determine chromatin structure? - To which motifs does a transcription factor bind? - What is the “cis-regulatory code” of a gene? - [Figure: DNA with enhancer and promoter, activation and repression]

39 Chromatin Immunoprecipitation (ChIP) - [Figure: DNA binding protein of interest, antibody, sequencing]

40 Chromatin Immunoprecipitation (ChIP) - Control: input DNA - [Figure: sequencing]

41 ChIP-Seq Analysis Workflow - Alignment: ELAND, Bowtie, SOAP, SeqMap, … - Peak Detection: SISSRs, QuEST, MACS, CisGenome, … - Annotation: STAN, chromHMM, … - Visualization: IGV, Ensembl GB, UCSC GB, … - Motif Analysis: cERMIT, HMMer, XXmotif, … Chromatin Immunoprecipitation (ChIP)

42 ACCAATAATCAGCTAAGCCGTTAGCCACAGATGGAA Protein of interest Chromatin Immunoprecipitation (ChIP) Sonication crosslink site

43 Read Alignment

44 Read Alignment - read count along the genome compared with the expected read count - Expected read count = total number of reads * extended fragment length / chr length
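
The expected read count formula can be tried out with a small Python sketch. All numbers are invented for illustration. The Poisson tail probability is an added assumption about how an observed count might be compared with the expectation; the slide itself only gives the expected count.

```python
import math


def expected_read_count(total_reads, fragment_length, chr_length):
    """Expected number of reads covering one position under uniform coverage."""
    return total_reads * fragment_length / chr_length


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected); assumed null model."""
    cdf = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - cdf


# Hypothetical experiment: 1 million reads, fragments extended to 200 bp,
# chromosome length 100 Mb.
lam = expected_read_count(
    total_reads=1_000_000, fragment_length=200, chr_length=100_000_000
)
print("expected read count:", lam)

for observed in (2, 10, 20):
    print(observed, "reads: fold", observed / lam,
          "tail probability", poisson_tail(observed, lam))
```

Positions where the observed count lies far above the expectation are the candidate binding sites.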

45 Read direction provides extra information (Hongkai Ji et al., Nature Biotechnology 26: 1293-1300, 2008) Read Alignment

46 The ENCODE Project - Goal: define all functional elements in the human genome - How: lots of groups, lots of assays, lots of cell lines, lots of communication/consortium analysis, standardization of methods, reagents and analysis, genome-wide, a lot of money

47 The ENCODE Project - 2 Tier 1 cell lines: GM12878 (B cell), K562 (CML cells) - 5 Tier 2 cells: HeLa S3, HepG2, HUVEC, primary keratinocytes, hESC - Many Tier 3 cells - RNA profiling (Scott Tenenbaum): inter-cell-line differences are greater than inter-lab differences

48 Lots of data and data types generated by The ENCODE Project: RNA-seq, RNA-array, TF ChIP-seq, histone modification ChIP-seq, DNase-seq, bisulfite-seq, 1M SNP genotyping

49 Integrative Data Analysis - Data: open chromatin, transcription factor ChIP-seq, histone modification ChIP-seq, RNA - Std. peaks, region calls, active regions, … - Methods: dynamic Bayesian networks, HMM segmentation, PCA analysis -> biological interpretation

50 Integrative Data Analysis - 25-state HMM - Input: 12 histone modifications and 2 transcription factors in GM12878 and K562 - “Standard” EM training, posterior probability decoding, Viterbi path along the genome (State F, State I, State A, State C, State E) - Data: entire ENCODE Consortium - Analysis: Jason Ernst/Manolis Kellis
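
Viterbi decoding, as named on this slide, can be sketched in Python on a toy model. The two states, the single binary mark, and all probabilities are invented for illustration. The model on the slide has 25 states, is trained with EM, and uses many more input tracks.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path, computed in log space."""
    score = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this position.
            best_prev = max(
                states, key=lambda p: score[p] + math.log(trans_p[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(trans_p[best_prev][s])
                + math.log(emit_p[s][obs])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-state model: 1 = histone mark present in a genomic bin.
states = ["active", "quiet"]
start_p = {"active": 0.5, "quiet": 0.5}
trans_p = {
    "active": {"active": 0.9, "quiet": 0.1},
    "quiet": {"active": 0.1, "quiet": 0.9},
}
emit_p = {"active": {1: 0.8, 0: 0.2}, "quiet": {1: 0.1, 0: 0.9}}

observations = [0, 0, 1, 1, 1, 0, 0]
path = viterbi(observations, states, start_p, trans_p, emit_p)
print(path)
```

Each genomic bin receives one state label, which is what a segmentation track along the genome shows.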

