Computational Molecular Biology Biochem 218 – BioMedical Informatics 231 Doug Brutlag Professor.

Presentation transcript:

Computational Molecular Biology Biochem 218 – BioMedical Informatics Doug Brutlag Professor Emeritus Biochemistry & Medicine (by courtesy) The Human Genome Project

Maeve’s Office Hours: Monday and Friday, 3:30 to 5:00 PM, Beckman Center B403A. Please contact Maeve to set up an appointment if you want to meet with her. Take the elevator to the fourth floor of Beckman and turn left. The hallway leads directly to the lab B403, and my office is inside the lab at B403A.

Current Topics in Genome Analysis

Hierarchical Sequencing vs. Whole Genome Shotgun Sequencing (from Gibson & Muse, A Primer of Genome Science)

The Human Genome Project: How should we do it?

Weber, J. L., & Myers, E. W. (1997). Human whole-genome shotgun sequencing. Genome Res, 7(5).
– Use clones of multiple lengths: 2 kb, 10 kb and 50 kb
– Use clone-end sequencing to generate mate-pairs
– Able to use long clones to leap over repeated regions
– Clone length permits one to measure the length of repeated regions
– Will find more polymorphisms (SNPs)
– Costs less
– BAC clone artifacts:
  – Differential amplification
  – BACs that are not stable in bacteria will be lost
  – Repeated regions will recombine and be lost

Green, P. (1997). Against a whole-genome shotgun. Genome Res, 7(5).
– Preferred clone-by-clone BAC sequencing
– Distributed versus monolithic organization
– BACs linked to genetic maps
– Costs less (sequence 4x human genome)
– Finishing simplified and fewer gaps
– Haplotyping automatic
– Lengths of longer repeat regions measured
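The point about long clones leaping over repeated regions can be illustrated with a minimal sketch. It assumes a simple criterion that is not stated on the slide: a mate pair bridges a repeat only if both end reads fall entirely in unique flanking sequence, so the insert must be at least the repeat length plus two read lengths. The function name, the 500 bp read length, and the example repeat lengths are hypothetical; only the 2 kb, 10 kb and 50 kb insert sizes come from the slide.

```python
def libraries_spanning_repeat(repeat_len_bp,
                              insert_sizes_bp=(2_000, 10_000, 50_000),
                              read_len_bp=500):
    """Return the insert sizes whose mate pairs could bridge a repeat.

    Assumed criterion (illustrative only): both end reads must sit in
    unique sequence on either side of the repeat, so the insert must be
    at least repeat_len_bp + 2 * read_len_bp long.
    """
    needed = repeat_len_bp + 2 * read_len_bp
    return [size for size in insert_sizes_bp if size >= needed]


if __name__ == "__main__":
    # Hypothetical repeat lengths, chosen only to exercise each library.
    for repeat in (1_000, 6_000, 60_000):
        print(repeat, libraries_spanning_repeat(repeat))
```

Under this assumption, a repeat longer than the largest insert cannot be bridged by any library, which is one way to read the case for mixing several clone lengths.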

History of Whole Genome Assembly (thanks to Serafim Batzoglou)
1997:
– Gene Myers: “Let’s sequence the human genome with the shotgun strategy.”
– Phil Green: “That is impossible, and a bad idea anyway.”

Public Human Genome Project Strategy

Contig Formation as Mapping Progresses (Lander & Waterman)
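The slide's figure is not reproduced in this transcript. As a rough sketch of the standard Lander-Waterman model it refers to: with N clones of length L drawn at random from a genome of length G, the coverage is c = NL/G, the expected number of contigs is N·exp(−cσ), where σ = 1 − T/L and T is the minimum overlap needed to detect that two clones join, and the expected fraction of the genome left uncovered is exp(−c). The function name and the example numbers below are hypothetical and are not taken from the lecture.

```python
import math


def lander_waterman(genome_len, n_clones, clone_len, min_overlap):
    """Return (coverage, expected_contigs, fraction_uncovered).

    Uses the Lander-Waterman approximations:
      coverage c          = N * L / G
      sigma               = 1 - T / L   (T = minimum detectable overlap)
      expected contigs    = N * exp(-c * sigma)
      fraction uncovered  = exp(-c)
    """
    c = n_clones * clone_len / genome_len
    sigma = 1.0 - min_overlap / clone_len
    expected_contigs = n_clones * math.exp(-c * sigma)
    fraction_uncovered = math.exp(-c)
    return c, expected_contigs, fraction_uncovered


if __name__ == "__main__":
    # Hypothetical mapping project: 3 Gb genome, 150 kb clones,
    # 30 kb minimum detectable overlap, increasing numbers of clones.
    for n in (20_000, 100_000, 300_000):
        c, contigs, gap = lander_waterman(3_000_000_000, n, 150_000, 30_000)
        print(f"N={n}: coverage={c:.1f}, contigs={contigs:.1f}, uncovered={gap:.2e}")
```

In this model the contig count first rises as clones are added and then falls once cσ exceeds 1, as overlapping clones merge contigs.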

Public Genome Assembly Process

BAC and PAC Libraries in Public Human Genome Project

Total Genome Sequence Information

Comparing Chromosome 2 Sequence Versus Genetic Maps

Synteny Between Human and Mouse

Celera Sequencing

Celera Scaffolds

Celera Assembler

Chromosome 21: Public vs. Celera Assemblies

Chromosome 8: Public vs. Celera

Finishing Strategy for the Public Genome Project

Finished Sequence in 2004 (Build 35)

Chromosome 7: Draft versus Finished Sequence

Substitutions in BAC Overlaps with BACs from Same or Different Libraries

Gaps in BAC Overlaps with BACs from Same or Different Libraries

Duplications and Deletions in the Human Genome

Percentage of Chromosomes Duplicated

Available Next Generation Sequencing Technologies
– Illumina (Solexa): Illumina Technology
– 454 Life Sciences: 454 Life Sciences Technology
– Applied Biosystems Inc. (ABI) SOLiD Sequencing: ABI SOLiD Sequencing Technology, http://www3.appliedbiosystems.com/AB_Home/applicationstechnologies/SOLiDSystemSequencing/OverviewofSOLiDSequencingChemistry/index.htm

Next Generation Sequencing Technologies in Development
– Pacific BioSciences: Pacific BioSciences Technology
– Helicos: Helicos Technology
– Complete Genomics: Complete Genomics Technology