Published by Emily Fleming. Modified over 8 years ago.
1
Topics
- DNA sequencing in the last century
- Current technologies (Illumina, Ion Torrent)
- New developments (PacBio, Nanopore)
2
Sanger sequencing
- Developed in the 1970s by Fred Sanger (1918-2013), two-time Nobel laureate (1958: protein structure of insulin; 1980: sequencing of nucleic acids)
- Sequencing by synthesis: DNA polymerase synthesizes a complementary strand by adding single nucleotides
  Synthesized strand: TTGCACTGAGTCG
  Template: AACGTGACTCAGCATAGGCTCAGATAGAT
- Random incorporation of blocked nucleotides (ddNTPs) at any position; at each position the reaction stops in a small fraction of the strands
- A-reaction: add dATP (elongation) and ddATP (block); the C-, G-, and T-reactions are analogous
  Synthesized strand: TTGCACTTGAGTCGT
  Template: AACGTGAACTCAGCATAGGCTCAGATAGAT
3
Sanger sequencing
Synthesized strand: TTGCACTTGAGTCG
Template: AACGTGAACTCAGCATAGGCTCAGATAGAT
- A-reaction: TTGCA, TTGCACTTGA
- C-reaction: TTGC, TTGCAC, TTGCACTTGAGTC
- G-reaction: TTG, TTGCACTTG, TTGCACTTGAG, TTGCACTTGAGTCG
- T-reaction: TT, TTGCACT, TTGCACTT, TTGCACTTGAGT
ddNTP termination -> ladder of DNA fragments -> electrophoresis (lanes T, G, C, A) -> sequence
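The ladder on this slide can be reproduced with a short Python sketch. It is an idealised illustration, not a model of the chemistry: the function names are made up for this example, every possible termination point is assumed to yield a fragment (so the one-base fragment "T", which the slide omits, also appears), and fragment length stands in for migration distance on the gel.

```python
def sanger_ladder(extension):
    """Group terminated fragments by the ddNTP reaction that produced them."""
    reactions = {"A": [], "C": [], "G": [], "T": []}
    for end in range(1, len(extension) + 1):
        fragment = extension[:end]
        # A fragment always ends with the blocked nucleotide of its reaction.
        reactions[fragment[-1]].append(fragment)
    return reactions


def read_gel(reactions):
    """Read the sequence off the ladder, shortest fragment first."""
    bands = [(len(frag), base)
             for base, frags in reactions.items()
             for frag in frags]
    return "".join(base for _, base in sorted(bands))


ladder = sanger_ladder("TTGCACTTGAGTCG")
print(ladder["A"])       # fragments terminated by ddATP
print(read_gel(ladder))  # sequence reconstructed from fragment lengths
```

Sorting the bands by length is the computational equivalent of reading the gel from bottom to top across the four lanes.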
4
Sanger sequencing
- labeled ddNTPs, capillary sequencing
Synthesized strand: GATTGATAGTTGC
Template: CTAACTATCAACGTATAGGCTCAGATAGAT
Fragment ladder: G, GA, GAT, GATT, GATTG, GATTGA, GATTGAT, GATTGATA, GATTGATAG, GATTGATAGT, GATTGATAGTT, GATTGATAGTTG, GATTGATAGTTGC
5
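454 technology
Pyrosequencing
- immobilize DNA on beads, pyrosequencing in microreactors
- incorporation of a nucleotide (here dTTP) releases PPi, which is converted to ATP; the ATP drives the reaction that yields oxyluciferin + light
Synthesized strand: TTGCACTGAGTCGT
Template: AACGTGACTCAGCATAGGCTCAGATAGAT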
6
454 technology
- DNA-loaded beads + primer + polymerase + sulfurylase + luciferase
- readout: flowgram
- Problem: homopolymers difficult to detect
Synthesized strand: TTGCACTGAGTCGT
Template: AACGTGACTCAGCAAGTCTATTCACCCAC...
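A flowgram can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the nucleotide flow order "TACG" and the function name are choices made for this example, and the signal is modelled as an exact integer (the number of bases incorporated per flow), whereas a real instrument measures an analog light intensity.

```python
def flowgram(read, flow_order="TACG", max_flows=100):
    """Return (nucleotide, signal) per flow for an idealised pyrosequencing run."""
    signals = []
    pos = 0
    flow = 0
    while pos < len(read) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        # All consecutive identical bases are incorporated in a single flow,
        # so the light signal scales with the homopolymer length.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        flow += 1
    return signals


signals = flowgram("TTGCACTGAGTCGT")
print(signals[:4])  # first cycle of four flows
```

Because a run of n identical bases produces one signal of height n rather than n separate signals, telling a run of 5 from a run of 6 depends on resolving a small relative difference in intensity, which is the homopolymer problem named on the slide.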
7
Parallelisation & Miniaturisation
Increase throughput:
- DNA gel electrophoresis: single genes in a few days
- capillary electrophoresis: 96 capillaries per machine, human genome in a few years
- sequencing on microbeads: 454 technology
8
Illumina technology
Illumina sequencing:
- sequencing by synthesis
- massive parallelisation and miniaturisation by self-organising DNA microarrays on a glass surface
- several hundred Gb, >10^9 reads per run
9
Illumina technology
- generate libraries
- grow clusters on a flowcell
- sequence by addition and imaging of blocked & fluorescence-labeled nucleotides
10
Illumina technology
Library preparation:
- DNA fragments
- blunting by fill-in and exonuclease
- phosphorylation
- addition of A-overhang
- ligation to adapters
11
cluster generation: 1. flowcell Illumina technology
12
cluster generation: 1. flowcell 2. hybridize template Illumina technology
13
cluster generation: 1. flowcell 2. hybridize template 3. immobilize template Illumina technology
14
cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification Illumina technology
19
cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation Illumina technology
20
cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand Illumina technology
21
cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand 7. block 3'-ends Illumina technology
22
cluster generation: 1. flowcell 2. hybridize template 3. immobilize template 4. bridge amplification 5. linearisation 6. cleave reverse strand 7. block 3'-ends 8. hybridize primer Illumina technology
23
Illumina technology
Imaging & Sequencing: nucleotide + fluorescent dye + terminator
24
reversible terminators: Illumina technology
25
fluorescently labelled clusters: Illumina technology
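The cycle structure of sequencing with reversible terminators can be sketched in Python. This is a toy model under stated assumptions: the function name is hypothetical, each template string is written in the order the polymerase reads it, every cluster incorporates exactly one base per cycle without error, and imaging is reduced to recording the complementary base.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(templates, cycles):
    """Extend every cluster by exactly one blocked, labelled base per cycle."""
    reads = [""] * len(templates)
    for cycle in range(cycles):
        # Incorporation and imaging: the terminator allows only one base per
        # cluster, and the dye colour identifies which base was added.
        for i, template in enumerate(templates):
            if cycle < len(template):
                reads[i] += COMPLEMENT[template[cycle]]
        # In the instrument, dye and terminator are cleaved here so that the
        # next cycle can add the following base.
    return reads


reads = sequence_by_synthesis(["AACGT", "TTGAC"], cycles=3)
print(reads)
```

All clusters advance in lockstep, so the read length equals the number of cycles and the number of reads equals the number of clusters imaged.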
26
Sequencing Applications
What can we do with short reads?
- RNA-seq: identify transcripts, count reads per transcript
- assessment of differential expression
- problem: reads are too short to establish the connectivity of all exons, so quantifying multiple isoforms of a gene is difficult or impossible
27
Improvements: Paired end Reads
- Single end: ambiguous mapping
- Paired end sequencing: read fragment from both ends -> resolve ambiguities
Stefan Krebs, 30.09.2013
28
Improvements: Circularization
Further improvements:
- long jumping mate-pair libraries: circularize large fragments (2-10 kb) and read the junctions
- resolve large repeats in genome assembly
29
Third generation Sequencing
30
Pacific Biosciences
- single molecule detection
- several kilobases read length
- moderate output (150,000 wells)
- expensive instrument and high cost per base
33
Read length distribution
34
Pacific Biosciences
35
Applications
Everything that can be converted to a DNA strand can be sequenced; even long-term data storage by encoding in synthetic DNA is possible.
- BIOLOGICAL APPLICATIONS: sequencing of genomes, transcriptomes, population diversity, composition of microbial communities, ChIP-seq, methyl-seq, translating RNA from ribosomes, ...
- MEDICAL APPLICATIONS: whole genome sequencing, exome sequencing, tumor diagnostics, sequencing of T-cell receptor diversity, identification of pathogens, ...
- FORENSICS, FOOD SAFETY, ARCHEOLOGY, ...
36
Chromatin Immunoprecipitation (ChIP)
37
Motivation: Regulation of gene expression
Figure: DNA -> mRNA -> protein
- Transcriptional: activation, repression (Pol II)
- Post-transcriptional: translation, localization, stability (3'UTR)
38
Motivation: Regulation of gene expression
- At which loci does a protein bind the DNA?
- Are there cell-type or environment-specific variations of binding affinity?
- Which histone modifications determine chromatin structure?
- To which motifs does a transcription factor bind?
- What is the "cis-regulatory code" of a gene?
Figure labels: DNA, enhancer, promoter, activation, repression
39
Chromatin Immunoprecipitation (ChIP)
Figure labels: DNA-binding protein of interest, antibody, sequencing
40
Chromatin Immunoprecipitation (ChIP)
Figure labels: control (input DNA), sequencing
41
Chromatin Immunoprecipitation (ChIP)
ChIP-Seq Analysis Workflow
- Alignment: ELAND, Bowtie, SOAP, SeqMap, …
- Peak Detection: SISSRs, QuEST, MACS, CisGenome, …
- Annotation: STAN, chromHMM, …
- Visualization: IGV, Ensembl GB, UCSC GB, …
- Motif Analysis: cERMIT, HMMer, XXmotif, …
42
Chromatin Immunoprecipitation (ChIP)
Figure labels: protein of interest, crosslink site, sonication
Sequence: ACCAATAATCAGCTAAGCCGTTAGCCACAGATGGAA
43
Read Alignment
44
Read Alignment
Figure: read count along the genome compared with the expected read count
Expected read count = total number of reads * extended fragment length / chr length
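The expected read count formula can be written as a small Python function. The numbers in the example are invented for illustration, and the fold-enrichment helper is an addition for this sketch rather than something stated on the slide.

```python
def expected_read_count(total_reads, fragment_length, chr_length):
    """Reads expected to cover one position if reads were spread uniformly."""
    return total_reads * fragment_length / chr_length


def fold_enrichment(observed, total_reads, fragment_length, chr_length):
    """Ratio of observed coverage to the uniform expectation (illustrative)."""
    return observed / expected_read_count(total_reads, fragment_length, chr_length)


# Hypothetical values: 10 million reads, 200 bp extended fragments,
# a 100 Mb chromosome, and 100 reads observed at one locus.
expected = expected_read_count(10_000_000, 200, 100_000_000)
enrichment = fold_enrichment(100, 10_000_000, 200, 100_000_000)
print(expected, enrichment)
```

With these values a uniform background gives 20 reads per position, so a locus covered by 100 reads is enriched five-fold.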
45
Read Alignment
Read direction provides extra information
(Hongkai Ji et al., Nature Biotechnology 26: 1293-1300, 2008)
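One common way to use read direction is to estimate the offset between forward-strand and reverse-strand reads around a binding site. The Python sketch below illustrates that general idea only; it is not the method of the cited paper, the function name and positions are invented, and reads are reduced to their 5' start coordinates.

```python
from collections import Counter


def estimate_strand_shift(forward_starts, reverse_starts, max_shift=300):
    """Find the shift that best aligns forward starts with reverse starts."""
    reverse_counts = Counter(reverse_starts)
    best_shift, best_score = 0, -1
    for shift in range(max_shift + 1):
        # Count forward reads that land on a reverse read after shifting.
        score = sum(reverse_counts[pos + shift] for pos in forward_starts)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Hypothetical read starts flanking one binding site.
forward = [100, 101, 102]
reverse = [250, 251, 252]
shift = estimate_strand_shift(forward, reverse)
print(shift)
```

Under these assumptions the shift approximates the fragment length, and the binding site is expected roughly midway between the forward and reverse read pile-ups.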
46
The ENCODE Project
Goal: Define all functional elements in the human genome
How:
- Lots of groups
- Lots of assays
- Lots of cell lines
- Lots of communication/consortium analysis
- Standardization of methods, reagents, analysis
- Genome-wide
- A lot of money
47
The ENCODE Project
- 2 Tier 1 cell lines: GM12878 (B cell), K562 (CML cells)
- 5 Tier 2 cells: HeLa S3, HepG2, HUVEC, primary keratinocytes, hESC
- Many Tier 3 cells
- RNA profiling (Scott Tenenbaum): inter-cell line differences are greater than inter-lab differences
48
The ENCODE Project
Lots of data and data types generated:
- RNA-seq
- RNA-array
- TF ChIP-seq
- Histone modification ChIP-seq
- DNase-seq
- Bisulfite-seq
- 1M SNP genotyping
49
Integrative Data Analysis
- Data: open chromatin, transcription factor ChIP-seq, histone modification ChIP-seq, RNA
- Std. calls: peaks, region calls, active regions, …
- Methods: dynamic Bayesian networks, HMM segmentation, PCA analysis
- Result: biological interpretation
50
Integrative Data Analysis
- Data: 12 histone modifications, 2 transcription factors; GM12878, K562 (entire ENCODE Consortium)
- Analysis: 25-state HMM (Jason Ernst/Manolis Kellis)
- "Standard" EM training
- Decoding along the genome: posterior probability, Viterbi path
- Figure labels: State A, State C, State E, State F, State I
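Viterbi decoding can be illustrated with a two-state toy HMM in Python. This sketch is far smaller than the 25-state model on the slide: the state names, the probabilities, and the binary "mark present / absent" observations are all invented for illustration, and the parameters are fixed by hand rather than learned by EM.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path, computed in log space."""
    score = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
              for s in states}]
    back = [{}]
    for obs in observations[1:]:
        prev = score[-1]
        current, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this position.
            best = max(states, key=lambda p: prev[p] + math.log(trans_p[p][s]))
            current[s] = (prev[best] + math.log(trans_p[best][s])
                          + math.log(emit_p[s][obs]))
            pointers[s] = best
        score.append(current)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical model: one chromatin mark observed as absent (0) or present (1).
states = ["inactive", "active"]
start_p = {"inactive": 0.5, "active": 0.5}
trans_p = {"inactive": {"inactive": 0.9, "active": 0.1},
           "active": {"inactive": 0.1, "active": 0.9}}
emit_p = {"inactive": {0: 0.9, 1: 0.1},
          "active": {0: 0.2, 1: 0.8}}

path = viterbi([0, 0, 1, 1, 1, 0, 0], states, start_p, trans_p, emit_p)
print(path)
```

The decoded path labels each genomic bin with one state, which is how an HMM segmentation turns many signal tracks into a single annotation track.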