Mojavensis: Issues of Polymorphisms Chris Shaffer GEP 2009 Washington University.



Mojavensis assembly
- Isolate genomic DNA
- Create 3 libraries
  - Plasmid library: ~3.5 kb inserts (1 million clones)
  - Fosmid library: ~37 kb inserts (250,000 clones)
  - BAC library: ~150 kb inserts (30,000 clones)
- Sequence both ends of each insert
- Take all data and attempt whole genome assembly

Mojavensis assembly
- After whole genome shotgun and assembly:
  - 6842 scaffolds; 319,616,621 total bases
  - 5033 gaps; estimated 13,608,479 bases
  - Average gap size: 2705 bases
- Room for improvement
- Better quality needed for repeat analysis
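The assembly figures above can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it is an assumption that the "total bases" figure can simply be divided by the scaffold count to give a meaningful mean (the slide does not say whether that total includes the estimated gap bases).

```python
# Figures quoted on the slide.
scaffolds = 6842
total_bases = 319_616_621
gaps = 5033
estimated_gap_bases = 13_608_479

# Mean gap size implied by the gap count and the estimated gap bases.
mean_gap_size = estimated_gap_bases / gaps

# Mean bases per scaffold (assumption: total_bases is spread over all scaffolds).
mean_bases_per_scaffold = total_bases / scaffolds

print(f"Mean gap size: {mean_gap_size:.0f} bases")
print(f"Mean bases per scaffold: {mean_bases_per_scaffold:.0f}")
```

The implied mean gap size comes out at roughly 2704 bases, within a couple of bases of the 2705 quoted on the slide, and the mean scaffold works out to about 46.7 kb.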

Mojavensis assembly
- Differences from the mouse and virilis examples:
  - No vector sequences have been seen, so don't expect them
  - There is significant overlap between fosmids, so it is not vital that you finish the ends of your sequence to high quality
  - The digests algorithm will add two extra cut sites, one at each "vector"/"insert" junction. See the wiki for the workaround.

Mojavensis assembly
- Since the libraries were made from whole genomic DNA, there are two dot chromosomes (maternal and paternal) that give rise to sequence data
- Sequence differences between the two homologs can confound phred/phrap/consed, which were originally designed for assembly of clones (BACs and fosmids)
- For mojavensis and the other Drosophila species, siblings were crossed for 10 generations to remove polymorphisms
- Unfortunately, we have found a higher than expected rate of polymorphisms (about one third of projects in spring 07 had polymorphisms)
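As a rough illustration of why sibling crosses reduce polymorphism without eliminating it, the textbook recurrence for the inbreeding coefficient under full-sib mating can be iterated for 10 generations. This is a sketch under assumptions the slides do not state: strict full-sib mating every generation, an outbred starting population, and no selection. The function name is hypothetical.

```python
def sib_mating_inbreeding(generations):
    """Inbreeding coefficient F after each generation of full-sib mating.

    Uses the textbook recurrence F[t] = (1 + 2*F[t-1] + F[t-2]) / 4,
    starting from an outbred population (F = 0 in the two prior generations).
    """
    f_prev2, f_prev1 = 0.0, 0.0
    history = []
    for _ in range(generations):
        f_now = (1 + 2 * f_prev1 + f_prev2) / 4
        history.append(f_now)
        f_prev2, f_prev1 = f_prev1, f_now
    return history


f_by_generation = sib_mating_inbreeding(10)
residual_heterozygosity = 1 - f_by_generation[-1]
print(f"F after 10 generations: {f_by_generation[-1]:.3f}")
print(f"Fraction of initial heterozygosity remaining: {residual_heterozygosity:.3f}")
```

Under these assumptions about a tenth of the starting heterozygosity would be expected to remain, so some polymorphic sites are not surprising in themselves; the point on the slide is that the observed rate was higher than expected.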

Duplication or Polymorphism
- When you find two nearly identical sequences, they could be either a polymorphism or a duplication
- How can you tell which you have? Restriction mapping data!

Case 1: single polymorphism
- Small, isolated single-base polymorphisms will usually be found as high quality discrepancies
- Use your restriction digest data for the region containing the putative polymorphism to guide you
  - Sizes match: tag as a polymorphism
  - Sizes do not match: tag "tell phrap not to overlap discrepant reads"
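The decision rule above can be written down as a small helper. This is a minimal sketch, not part of consed: the function names and the 2% size tolerance are assumptions chosen for illustration, and the real call on whether fragment sizes "match" is made by eye in the digest display.

```python
def sizes_match(predicted, observed, tolerance=0.02):
    """True if the predicted fragment size is within a fractional
    tolerance of the observed size (tolerance is a hypothetical value)."""
    return abs(predicted - observed) <= tolerance * observed


def suggest_tag(predicted, observed, tolerance=0.02):
    """Map the Case 1 rule onto the tag named on the slide."""
    if sizes_match(predicted, observed, tolerance):
        return "polymorphism"
    return "tell phrap not to overlap discrepant reads"


# Fragment sizes from the contig 5 demo in these slides: predicted 3982, observed 3977.
print(suggest_tag(3982, 3977))
```

With the demo's fragment pair the sizes differ by only 5 bases, so the helper suggests the polymorphism tag, which agrees with the conclusion reached in the demo.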

Demo
- Computers: open X11; start a terminal if needed
  - cd Desktop/465-C16/edit_dir
  - Start consed
  - Open contig 5
  - Open high quality discrepancy navigator
  - Notice all the discrepancies at 4522

Demo (cont'd)
- Is this a misassembly or a polymorphism? Check the digest
  - Click "find main window"
  - Click "digests"
  - In the "Huge list of enzymes" select EcoRI and SacI to add these to HindIII and EcoRV
  - Under what to digest select "entire single contig" and enter 5 in the box labeled "single entire contig:"
  - Click "OK"

Demo (cont'd)
- For the error message that it cannot link vector ends to the end of the contig, click "Dismiss"
- Select "Display digests" from the Window menu
- Select zoom from the Window menu
- The problem area is around base 4522
- Select "position" in sort by (lower left)
- Note that base 4522 is within the 3982 base predicted fragment, which matches an observed fragment of 3977
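The idea behind the predicted fragments in the digest display can be sketched as an in silico digest: find each enzyme's recognition site in the contig sequence, cut there, and report which fragment contains the position of interest. This is an illustration only, not consed's implementation. The recognition sites and cut offsets are the standard ones for these four enzymes, but the function names and the toy sequence are made up, and the extra cut sites consed adds at the "vector"/"insert" junctions are not modeled.

```python
# Recognition site and cut offset (bases from the start of the site, top strand).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRV": ("GATATC", 3),    # GAT^ATC
    "SacI": ("GAGCTC", 5),     # GAGCT^C
}


def cut_positions(seq, enzyme):
    """0-based cut coordinates for one enzyme on a linear sequence.
    The sites are palindromic, so searching one strand is enough."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    return cuts


def fragments(seq, enzyme):
    """Half-open (start, end) intervals, 0-based, for a single-enzyme digest."""
    bounds = [0] + cut_positions(seq, enzyme) + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def fragment_containing(seq, enzyme, position):
    """Return (start, end, size) of the fragment holding a 1-based position."""
    index = position - 1
    for start, end in fragments(seq, enzyme):
        if start <= index < end:
            return (start, end, end - start)
    raise ValueError("position is outside the sequence")


# Toy 50-base sequence with two EcoRI sites (hypothetical, not the demo contig).
toy = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 8
print(fragments(toy, "EcoRI"))
print(fragment_containing(toy, "EcoRI", 20))
```

Applied to a real contig, the size returned for the fragment around the discrepancy is the number you would compare against the observed band, as with the 3982 versus 3977 comparison above.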

Demo (cont'd)
- Since the sizes match, this is more likely a polymorphism
- The more digests that match, the better
- Check again later, when assembled into a larger contig, for other digests
- If at the end it still looks like a polymorphism, add a polymorphism tag

Case 2: polymorphism clusters
- Consed will not assemble these together
- Will look like a spanned gap in assembly view, with sequence matches at each end
- Force join the two contigs
- Check the digest for this region
  - Sizes match: tag all polymorphisms
  - Sizes do not match: undo the join (or quit and do not save), then attempt to call oligos to span the gap
- Seek help if needed for oligo design

Case 3: Insert Polymorphism
- Very complicated
- Very unlikely in green and low yellow
- Want to confirm assembly with digests
- If needed, pull out all discrepant reads into a new contig to get digests to match
- Seek help!