BioSci D145 Lecture #3 Bruce Blumberg

1 Bruce Blumberg (blumberg@uci.edu)
BioSci D145 Lecture #3
Bruce Blumberg
  4103 Nat Sci 2 - office hours Tu, Th 3:30-5:00 (or by appointment)
  phone
TA – Riann Egusquiza
  4311 Nat Sci 2 – office hours W 9:30-11:30
  Phone
check regularly for announcements, etc.
Updated lectures will be posted on web pages after lecture
Don’t forget to discuss term paper topics with me
Last year’s midterm is now posted
BioSci D145 lecture 1 page 1 ©copyright Bruce Blumberg All rights reserved

2 Why should any funding agency give you money to pursue this research?
Term paper outline
Title of your proposal
A paragraph introducing your topic and explaining why it is important, i.e., what impact the knowledge gained will have.
  Why should any funding agency give you money to pursue this research?
  NIH now requires a statement of human health relevance for all grant applications
  NSF wants to know the intellectual merit of your proposed research and its broader impacts
Present your hypothesis
  A supposition or conjecture put forth to account for known facts; esp. in the sciences, a provisional supposition from which to draw conclusions that shall be in accordance with known facts, and which serves as a starting-point for further investigation by which it may be proved or disproved and the true theory arrived at.
Enumerate 2-3 specific aims in the form of questions that test your hypothesis
  At least one of these aims needs to have a strong “whole genome” component
This is not a review article – propose something new.

3 DNA sequencing = determining the nucleotide sequence of DNA
DNA sequence analysis
DNA sequencing = determining the nucleotide sequence of DNA
Originally two main methods, which shared the Nobel Prize in 1980
  Chemical cleavage – Maxam and Gilbert
  Enzymatic sequencing (based on polymerization reaction) – Sanger
Nobel Prize in Chemistry 1980: Walter Gilbert (Harvard) & Frederick Sanger (MRC Labs)
  (Sanger also won the Nobel in 1958 for protein sequencing)

4 One of the first reasonable sequencing methods
DNA sequence analysis
Maxam and Gilbert
  One of the first reasonable sequencing methods
  Very popular in late 70s and early 80s
  VERY TEDIOUS!!
  Totally superseded by dideoxy sequencing now

5 DNA sequence analysis (contd)
Dideoxy sequencing – Sanger 1977
  Virtually all routine sequencing is done this way now
  Requires a modified nucleotide: 2',3'-dideoxynucleoside triphosphate (ddNTP)
    DNA polymerase incorporates the ddNTP and chain elongation terminates (no 3'-OH is available for the next nucleotide)
  Original method used 4 separate elongation reactions
  Products separated by denaturing PAGE and visualized by autoradiography

6 DNA sequence analysis (contd)
Dideoxy sequencing (contd) – Sanger 1977
  Dideoxy NTPs present at ~1% of [dNTP]
  Each reaction has an identified end (the chain-terminating ddNTP)
  In principle, all possible chain lengths are represented
    the distribution varies with [dNTPs], [ddNTPs], [primer] and [template] and their ratios
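The four-reaction logic above can be sketched in a few lines. This is an illustrative toy, not a model of the real chemistry: it assumes every possible termination position is represented in each lane (the "in principle" case on the slide), and the function names and the example strand are made up for the illustration.

```python
def sanger_ladder(new_strand):
    """Return {base: [fragment lengths]} for the four ddNTP reactions.

    Assumption: every position where a base is incorporated is represented
    by at least one terminated chain in the matching ddNTP reaction.
    """
    ladder = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        # In the dd<base> reaction a chain can stop wherever <base> goes in.
        ladder[base].append(position)
    return ladder


def read_gel(ladder):
    """Read the sequence from the shortest band to the longest across lanes."""
    bands = sorted(
        (length, base) for base, lengths in ladder.items() for length in lengths
    )
    return "".join(base for _, base in bands)


synthesized = "GATTACAGGC"           # hypothetical newly synthesized strand
ladder = sanger_ladder(synthesized)
print(ladder)                        # band positions per lane
print(read_gel(ladder))              # sequence read off the gel
```

Reading the lanes from the shortest fragment upward recovers the synthesized strand, which is the complement of the template.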

7 DNA sequence analysis (contd)
Figure: sequencing gel, lanes labeled A C G T (three sets)

8 Automated DNA sequence analysis
How to improve throughput of sequencing?
  Incorporate fluorescent ddNTPs, separate products by PAGE
    Base calling and lane calling issues
  Key advance was capillary sequencers
    Separate DNA in a thin capillary instead of gel
    Very accurate, no tracking errors, much more automation friendly
Trace files (dye signals) are analyzed and bases called to create chromatograms.
Chromatograms from opposite strands are reconciled with software to create double-stranded sequence data.
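A minimal sketch of what "reconciling opposite strands" means, assuming two already base-called reads that cover exactly the same region. Real software also aligns the reads and weighs base-quality scores; the function names and example reads here are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def reconcile(forward_read, reverse_read):
    """Consensus of two reads of the same region; disagreements become N."""
    flipped = reverse_complement(reverse_read)
    if len(flipped) != len(forward_read):
        raise ValueError("this sketch assumes both reads cover the same region")
    return "".join(f if f == r else "N" for f, r in zip(forward_read, flipped))


forward = "ATGCCGTTAG"
print(reconcile(forward, "CTAACGGCAT"))   # strands agree everywhere
print(reconcile(forward, "CTAACGCCAT"))   # one disagreement is flagged as N
```

Positions flagged as N are the ones a person would go back and inspect in the chromatograms.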

9 Automated DNA sequence analysis
Capillaries vs gels
  Capillaries much faster – higher field strength possible
  Fully automated = higher throughput

10 Applied Biosystems PRISM 377 (Gel, 34-96 lanes)
(Capillary, 96 capillaries)

11 PCR – polymerase chain reaction amplification of DNA
PCR is the most routinely used method to amplify DNA
Exponential amplification of DNA by polymerases – Saiki et al., 1985
  2^n-fold amplification, n = # cycles
  35 cycles = 2^35 = 3.4 x 10^10-fold
Originally used DNA polymerase I
  Needed to add fresh enzyme at every cycle because heat denaturation of template killed the enzyme
  Not widely used – too painful to do manually
Nobel Prize in Chemistry to Kary Mullis in 1993 for inventing PCR; the switch to heat-stable Taq DNA polymerase is what made the method practical
  He was middle author on paper!
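The amplification arithmetic on this slide can be checked directly. The `efficiency` parameter is an addition for illustration (1.0 means perfect doubling every cycle, which is what the slide assumes); real reactions fall short of that and plateau.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Theoretical fold amplification after a number of PCR cycles.

    efficiency=1.0 reproduces the ideal 2^n case; lower values are a
    hypothetical way to model imperfect doubling.
    """
    return (1 + efficiency) ** cycles


print(f"{fold_amplification(35):.1e}")        # ideal case from the slide
print(f"{fold_amplification(35, 0.9):.1e}")   # assumed 90% efficiency per cycle
```

Even a modest drop in per-cycle efficiency changes the final yield by a large factor, because the loss compounds over every cycle.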

12 PCR – polymerase chain reaction amplification of DNA (contd)
Hot water bacteria: Thermus aquaticus
Taq DNA polymerase
Life at High Temperatures by Thomas D. Brock
Biotechnology in Yellowstone
© 1994 Yellowstone Association for Natural Science

13 Cycle sequencing – fusion of PCR and fluorescent ddNTP sequencing
Combine PCR amplification with dideoxy sequencing – cycle sequencing
  Linear amplification of template in the presence of fluorescent ddNTPs
  When nucleotides are used up, the reaction is over
  Separate on capillary electrophoresis instrument
Advantages
  Fast, single-tube reaction
  Works with small amounts of starting material
Disadvantages
  Still need to prepare high quality template to sequence
    Cost and time
    Many sequencing centers spend time, $$ on template prep
    Automation requirements

14 Isothermal amplification – the solution to template preparation
How to make template preparation faster, easier and more reliable?
  Eliminate automation requirement, amplify starting material in some other way
Φ29 DNA polymerase (aka TempliPhi)
  Enzyme has high processivity and strand displacement activity
  Isothermal reaction produces huge quantities of DNA from tiny amount of input
  More efficient than PCR (no temp change, no machine, no cleanup)

15 Modern DNA sequence analysis
Cycle sequencing
  Virtually all routine DNA sequencing today is done by cycle sequencing with fluorescent ddNTPs
  ABI BigDye chemistry
Template preparation still tedious for small scale
  TempliPhi used in genome centers (no need for most automation)
Capillary sequencers predominant for small-scale sequencing
  Retrogen and similar companies
But next-generation sequencing has already rapidly displaced the old technology in genome centers
  454 sequencing (Roche)
  Solexa (Illumina) *dominant player at the moment*
  SOLiD (Applied Biosystems) (dead technology due to poor support)
3rd generation sequencing (individual DNA molecules) now available
  e.g., Pacific Biosciences (sequence reads of 1,000-10,000 bases)
  Oxford Nanopore (read length 5-200 kb)

16 Landmarks in DNA sequencing
DNA sequence analysis
Landmarks in DNA sequencing
  Sanger, Nicklen and Coulson. Sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. 74, (1977).
  Sanger, F. et al. The nucleotide sequence of bacteriophage ΦX174. J Mol Biol 125, (1978).
  Sutcliffe, J. G. Complete nucleotide sequence of the Escherichia coli plasmid pBR322. Cold Spring Harb Symp Quant Biol 43, (1979).
  Sanger et al. Nucleotide sequence of bacteriophage lambda DNA. J Mol Biol 162, (1982).
  Messing, J., Crea, R. & Seeburg, P. H. A system for shotgun DNA sequencing. Nucl. Acids Res 9, (1981).
  Anderson, S. et al. Sequence and organization of the human mitochondrial genome. Nature 290, (1981).
  Deininger, P. L. Random subcloning of sonicated DNA: application to shotgun DNA sequence analysis. Anal Biochem 129, (1983).
  Baer et al. DNA sequence and expression of the B95-8 Epstein-Barr virus genome. Nature 310, (1984). (189 kb)
  Innis et al. DNA sequencing with Taq DNA polymerase and direct sequencing of PCR-amplified DNA. Proc. Natl. Acad. Sci. 85, (1988).

17 DNA sequence analysis (contd)
Landmarks in DNA sequencing (contd)
  Haemophilus influenzae (1.83 Mb): first bacterium sequenced, human pathogen
  Mycoplasma genitalium (0.58 Mb): smallest free-living organism
  Saccharomyces cerevisiae genome (13 Mb)
  Methanococcus jannaschii (1.66 Mb): first Archaebacterium
  Escherichia coli (4.6 Mb)
  Bacillus subtilis (4.2 Mb)
  Borrelia burgdorferi (1.44 Mb): Lyme disease
  Archaeoglobus fulgidus (2.18 Mb): first sulfur-metabolizing organism
  Helicobacter pylori (1.66 Mb): first bacterium proven to cause cancer

18 DNA sequence analysis (contd)
Landmarks in DNA sequencing (contd)
  Treponema pallidum (1.14 Mb)
  Caenorhabditis elegans genome (97 Mb)
  Deinococcus radiodurans (3.28 Mb): resistant to radiation, starvation, oxidative stress
  Drosophila melanogaster (120 Mb)
  Arabidopsis thaliana (115 Mb)
  Escherichia coli O157:H7 (4.1 Mb): pathogenic variant of E. coli
  2001 – draft human “genome”
  2002 – mouse genome
  2002 – Ciona intestinalis (primitive chordate)
  2003 – “complete” human genome
  2004 – rat genome
  2006 – human “genome”: complete sequence of all chromosomes
Many more genomes underway; check JGI, Sanger and other web sites

19 Complete DNA sequence (all nts both strands, no gaps)
DNA sequence analysis
Complete DNA sequence (all nts, both strands, no gaps)
  complete sequence is desirable but takes time
    how long depends on size and strategy employed
Which strategy to use depends on various factors
  how large is the clone?
    cDNA
    genomic
  how fast is the sequence required?
Sequencing strategies
  primer walking
  cloning and sequencing of restriction fragments
  progressive deletions
    bidirectional, unidirectional
  shotgun sequencing
    whole genome
    with mapping
      map first (C. elegans)
      map as you go (many)

20 DNA Sequence analysis (contd)
Primer walking - walk from the ends with oligonucleotides
  sequence, back up ~50 nt from the end, make a primer and continue
Why back up?
  Need to see overlap to be sure about the sequence you are reading
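The "back up ~50 nt" step can be sketched as a simple slice of the read. The 20-nt primer length and the function name are assumptions for illustration; real primer design also considers melting temperature, GC content and secondary structure, none of which is modeled here.

```python
def next_walking_primer(read, back_up=50, primer_length=20):
    """Pick the next primer from known sequence, ~back_up nt before the read's end.

    back_up follows the slide (~50 nt); primer_length=20 is an assumed,
    typical oligo length.
    """
    if len(read) < back_up:
        raise ValueError("read is shorter than the back-up distance")
    start = len(read) - back_up
    return read[start:start + primer_length]


read = "ACGT" * 25                      # hypothetical 100-nt read
primer = next_walking_primer(read)
print(primer)
# The next read starts just past the primer, so it re-reads the last
# back_up - primer_length nt of the previous read: the overlap.
print(50 - len(primer), "nt of overlap with the previous read")
```

The overlap is what lets you confirm that the new read really continues the previous one.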

21 DNA Sequence analysis (contd)
Primer walking (contd)
advantages
  very simple
  no possibility to lose bits of DNA, as can happen with restriction mapping and deletion methods
  no restriction map needed
  best choice for short DNA
disadvantages
  slowest method
    about a week between sequencing runs
  oligos are not free (and not reusable)
  not feasible for large sequences
applications
  cDNA sequencing when time is not critical
  targeted sequencing
    verification
  closing gaps in sequences

22 DNA Sequence analysis (contd)
Cloning and sequencing of restriction fragments
  once the most popular method
  make a restriction map, subclone fragments, sequence
advantages
  straightforward
  directed approach, can go quickly
  cloned fragments often useful otherwise
    RNase protection, nuclease mapping, in situ hybridization
disadvantages
  possible to lose small fragments
    must run high quality analytical gels
  depends on quality of restriction map
    mistaken mapping -> wrong sequence
  restriction site availability
applications
  sequencing small cDNAs
  isolating regions to close gaps

23 DNA Sequence analysis (contd)
Nested deletion strategies - sequential deletions from one end of the clone
  cut, close and sequence
Approach
  make restriction map
  use enzymes that cut in polylinker and insert
  religate, sequence from end with restriction site
  repeat until finished, filling in gaps with oligos
advantages
  fast, simple, efficient
disadvantages
  limited by restriction site availability in vector and insert
  need to make a restriction map

24 DNA Sequence analysis (contd)
Nested deletion strategies (contd)
Exonuclease III-mediated deletion
  cut with polylinker enzyme
    protect ends: 3' overhang or phosphorothioate
  cut with enzyme between first cut and the insert
    can't leave 3' overhang
  timed digestions with Exonuclease III
  stop reactions, blunt ends
  ligate and size select recombinants
  sequence
advantages
  unidirectional
  processivity of enzyme gives nested deletions

25 DNA Sequence analysis (contd)
Nested deletion strategies
Exonuclease III-mediated deletion (contd)
disadvantages
  need two unique restriction sites flanking insert on each side
  best used successively to get > 10 kb total deletions
  may not get complete overlaps of sequences
    fill in with restriction fragments or oligos
applications
  method of choice for moderate size sequencing projects
    cDNAs
    genomic clones
  good for closing larger gaps
Small-scale sequence analysis – how is it practiced today?
  Primer walking
  ExoIII-mediated deletion with primer walking

26 Genome sizes for most eukaryotes are large (10^8-10^9 bp)
Genome sequencing
The problem
  Genome sizes for most eukaryotes are large (10^8-10^9 bp)
  High quality sequences only about bp per run (Sanger)
  Nextgen sequences ~150 bp/read
The solution
  Break genome into lots of bits and sequence them all
  Reassemble with computer
The benefit
  Rapid increase in information about genome size, gene comparisons, etc.
The cost
  3 x 10^9 bp (human haploid genome) ÷ 600 bp/reaction = 5 x 10^6 reactions for 1x coverage!
  Need both strands (x2), need overlaps and need to be sure of sequences
  ~ reactions/runs required for a human-sized genome
  About $1-2 per reaction these days, ~$8 commercially.
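The cost arithmetic on this slide, as a small calculation. Genome size, read length and the $1-2 per reaction price come from the slide; the `coverage` parameter and the function name are additions for illustration, so any coverage value other than 1x is a hypothetical input rather than a figure from the lecture.

```python
import math

HUMAN_HAPLOID_BP = 3_000_000_000   # from the slide: 3 x 10^9 bp
READ_BP = 600                      # from the slide: 600 bp per reaction


def reactions_needed(genome_bp, read_bp, coverage=1):
    """Reads whose combined length equals `coverage` genome equivalents."""
    return math.ceil(genome_bp * coverage / read_bp)


one_x = reactions_needed(HUMAN_HAPLOID_BP, READ_BP)
print(f"1x coverage: {one_x:,} reactions")
for dollars in (1, 2):             # slide: about $1-2 per reaction
    print(f"  at ${dollars}/reaction: ${one_x * dollars:,}")
```

Because both strands, overlaps and error checking multiply the 1x figure, the real number of reactions is several-fold higher; passing a larger `coverage` shows how quickly the total grows.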

27 Genome sequencing (contd)
Shotgun sequencing
  NOT invented by Craig Venter
  Messing 1981: first description of shotgun sequencing
  Sanger lab developed current methods in 1983
approach
  blast genome into small chunks
  clone these chunks
    3-5 kb, 8 kb: plasmid
    40 kb: fosmid (jump repetitive sequences)
  sequence + assemble by computer
A priori difficulties
  how to get nice uniform distribution
  how to assemble fragments
  what to do about repeats?
  how to minimize sequence redundancy?
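To make "assemble by computer" concrete, here is a toy greedy overlap assembler. It is a deliberately simplified sketch, not the method used by any real assembler: it assumes error-free reads from one strand, ignores fragments contained inside others, and will mis-join sequences when repeats are longer than the overlaps, which is exactly the repeat problem listed above. The function names and fragments are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(fragments))   # drop exact duplicates
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs                     # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


reads = ["ATGGCGTAC", "GTACGTTA", "CGTTAGC"]   # hypothetical shotgun reads
print(greedy_assemble(reads))
```

With these three reads the overlaps are unambiguous, so a single contig comes back; reads that share no overlap of at least `min_len` stay as separate contigs, which is how gaps show up.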

28 Genome sequencing (contd)

29 Genome sequencing (contd)

30 Genome sequencing (contd)
Shotgun sequencing (contd)
How to minimize sequence redundancy?
  Best way to minimize redundancy is to map before you start
    C. elegans was done this way - when the sequence was finished, it was FINISHED
    mapping took almost 10 years
  Mapping much too tedious and nonprofitable for Celera
    who cares about redundancy, let’s sequence and make $$
    There is scientific value to draft genomes, too.
Why does redundancy matter?
  Finished sequence today costs about $0.50/base
  Note that 10x, % coverage leaves at least 150 kb unsequenced
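One common way to estimate how much sequence is missed at a given redundancy is the Poisson (Lander-Waterman) approximation, in which the probability that a base is never covered at c-fold coverage is e^(-c). The slide does not name this model, so treating it as the source of the slide's figure is an assumption; the 3 x 10^9 bp genome size is taken from the earlier cost slide.

```python
import math


def expected_unsequenced_bp(genome_bp, coverage):
    """Expected uncovered bases under a Poisson model of random read placement."""
    return genome_bp * math.exp(-coverage)


genome_bp = 3_000_000_000          # human haploid genome, 3 x 10^9 bp
for coverage in (1, 5, 10):
    missed = expected_unsequenced_bp(genome_bp, coverage)
    print(f"{coverage:>2}x: ~{missed:,.0f} bp expected to remain unsequenced")
```

Under these assumptions, 10x coverage still leaves on the order of 10^5 bp uncovered, the same order of magnitude as the figure on the slide. Real libraries are not perfectly random, so the true gap total is usually larger.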

31 Other sequencing technologies
Sequencing by hybridization
  Construct a high-density microchip with all possible combinations of a short oligonucleotide
    Up to 25-mers
    By photolithography
    Synthesized on chip directly
  Label and hybridize fragment to be sequenced
  Wash stringently
  Read fluorescent spots
  Reconstruct sequence by computer
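The "reconstruct sequence by computer" step can be illustrated with an idealized sketch. It assumes perfect hybridization (the chip reports exactly the k-mers present in the target, with no false signals) and that every (k-1)-nt overlap in the target is unique; repeats break that assumption and make the reconstruction ambiguous. Function names and the example target are hypothetical.

```python
def spectrum(sequence, k):
    """Set of all k-mers in the sequence: the spots that would light up."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reconstruct(kmers):
    """Rebuild a sequence from its k-mer spectrum by chaining unique overlaps."""
    kmers = set(kmers)
    suffixes = {kmer[1:] for kmer in kmers}
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting k-mer; overlaps are ambiguous")
    by_prefix = {}
    for kmer in kmers:
        if kmer[:-1] in by_prefix:
            raise ValueError("repeated (k-1)-mer; reconstruction is ambiguous")
        by_prefix[kmer[:-1]] = kmer
    current = starts[0]
    sequence = current
    for _ in range(len(kmers) - 1):
        current = by_prefix.get(current[1:])
        if current is None:
            raise ValueError("spectrum does not form a single chain")
        sequence += current[-1]
    return sequence


print(4 ** 8, "probes are needed for all possible 8-mers")
target = "ATGCGTACCA"
print(reconstruct(spectrum(target, 4)))
```

The number of probes grows as 4^k, while short k makes repeated overlaps more likely, so probe length trades chip size against ambiguity.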

32 Other sequencing technologies (contd)
Sequencing by hybridization is rarely used for de novo sequencing
Extremely fast and useful for resequencing: you already know the sequence but want to identify mutations
  Disease-causing changes, e.g., in mitochondrial DNA
  SNP discovery
Works best for examining sequences of <10 kb

33 Other sequencing technologies (contd)
SNP discovery
  Photo shows mitochondrial chip
  Right panel shows pairs of normal (top) vs disease (bottom) (Leber’s Hereditary Optic Neuropathy)
    Top 3: disease mutations
    Bottom: control with no change

34 Other sequencing technologies – Next Generation sequencing
2nd generation = high throughput, short sequences
3rd generation = single molecule sequencing
  Small number of sequence templates (thousands) but very long reads (~10^5 bp)
  What is the immediate implication of this technology for genome assembly?
    We should now be able to completely sequence large insert clones directly and avoid fragmentation by repetitive elements!
See Metzker, M.L. (2010) Sequencing technologies - the next generation, Nature Reviews Genetics 11,

