Genomics. Gene expression DNA (Genome) pre-mRNA mRNA mRNA (Transcriptome) Proteins (Proteome) Metabolites (Metabolome) Regulation Nucleus Cytoplasm Chromatography.

Presentation transcript:

Genomics

Gene expression and the methods used to study each level (diagram)
- DNA (genome), nucleus: pre-mRNA, then mRNA
- mRNA (transcriptome), cytoplasm: DNA arrays and chips, (semi-)qRT-PCR, Northern blot and hybridization, transcriptional fusions
- Proteins (proteome): 2D electrophoresis, mass spectrometry, protein sequencing, translational fusions, immunodetection, enzyme activities
- Metabolites (metabolome): chromatography, mass spectrometry, NMR
- Regulation acts between the levels
Structural genomics: genome mapping, genome sequencing, genome annotation
Functional genomics: analysis of the transcriptome, proteome and metabolome

History of genome sequencing
- 1977 bacteriophage øX174 (5386 bp, 11 genes)
- 1981 mitochondrial genome (16,568 bp; 13 proteins; 2 rRNAs; 22 tRNAs)
- 1986 chloroplast genome (120, ,000 bp)
- 1992 Saccharomyces chromosome III (315 kb; 182 ORFs)
- 1995 Haemophilus influenzae (1.8 Mb)
- 1996 Saccharomyces whole genome (12.1 Mb; over 600 people, 100 laboratories)
- 1997 E. coli (4.6 Mb; 4200 proteins)
- 1998 Caenorhabditis elegans (97 Mb; 19,000 genes)
- 2000 Arabidopsis thaliana (115 Mb; 25,000-30,000 genes)
- 2001 mouse (1 year!)
- 2001 Homo sapiens (2 projects)
- 2005 Pan, rice
- 2006 Populus
Technological improvements

DNA sequencing: principle (Sanger's method)
- Polymerization from a primer in the presence of a low concentration of a terminator (dideoxynucleotide, ddNTP)
- Random termination at every position where the given nucleotide occurs

Original arrangement
- radioisotope (RI) labelled primer
- 4 separate reactions, each with one ddNTP
- ddNTP:dNTP ratio approx. 1:20 (up to 1:100)
- PAGE separation (separation by size)
Example sequence (figure): A T C G C C C T G T T G A G A
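A minimal Python sketch of how the four-lane readout works, assuming an idealized gel with single-base resolution. The function names are illustrative and do not come from the slides. Each ddNTP reaction yields fragments that end wherever its base occurs in the newly synthesized strand; sorting all fragments by size recovers the sequence.

```python
def termination_lanes(synthesized):
    """Fragment lengths expected in each of the four ddNTP reactions."""
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(synthesized, start=1):
        # A fragment terminated by ddN at this position has length `position`.
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the sequence by ordering all bands from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# Example strand taken from the slide.
lanes = termination_lanes("ATCGCCCTGTTGAGA")
print(lanes["A"])       # lengths of the bands in the ddATP lane
print(read_gel(lanes))  # sequence read from the bottom of the gel upwards
```

The same logic applies to the automated variant: the four lanes collapse into one capillary and the dye colour replaces the lane identity.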

Automated sequencing with fluorescence-labelled ddNTPs
- Every ddNTP is labelled with a different fluorescent dye, so all four run together in one reaction
- Separation by size in a capillary, with fluorescence detection

Genome sequencing is more than sequencing of DNA
- 1 sequencing reaction: 300 to 800 bp
- Typical genome: hundreds of millions to billions of bp
How to manage?

Strategies of genome sequencing
Classical strategy (map-based assembly):
- minimal amount of DNA sequencing: large DNA fragments are ordered first and then read successively (the original strategy for human genome sequencing)
- provides a scaffold for assembling the genome sequence
- time consuming
Whole-genome shotgun (WGS):
- random (7-9x redundant) sequencing, followed by sorting of the sequence data (Haemophilus)
- problems with repetitive DNA
Combination: "hierarchical shotgun", "chromosome shotgun"
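To give a feel for the scale behind the 7-9x redundancy, here is a rough Python sketch of the arithmetic. The 500 bp read length and 8x coverage are example values chosen from the ranges quoted in the slides. The coverage formula is the idealized Lander-Waterman estimate, which is an assumption added here and not part of the original deck.

```python
import math


def reads_needed(genome_bp, read_bp, coverage):
    """Number of reads required to reach the requested average redundancy."""
    return math.ceil(genome_bp * coverage / read_bp)


def expected_fraction_covered(coverage):
    """Idealized estimate (uniformly random reads): 1 - e^(-coverage)."""
    return 1.0 - math.exp(-coverage)


# Arabidopsis thaliana (115 Mb) with 500 bp reads at 8x redundancy.
print(reads_needed(115_000_000, 500, 8))
print(round(expected_fraction_covered(8), 5))
```

Even at high redundancy the model leaves a small uncovered fraction, which is one reason gap filling is treated as a separate step.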

Hierarchical shotgun sequencing vs. whole-genome shotgun sequencing (figure: Green (2001) Nature Reviews Genetics 2:)
- Production of overlapping clones (e.g. BACs, YACs) and construction of a physical map
- Shearing of DNA and sequencing of subclones
- Assembly

Hierarchical shotgun sequencing
First step: library of large DNA inserts (= genome fragments)
- phage ( ) vectors: 30 kb
- cosmids: 50 kb
- BACs (bacterial artificial chromosomes): kb
- YACs (yeast artificial chromosomes): approx. 0.5-1 Mb

Physical "BAC" map of the genome
- Arrangement (position, orientation) of the individual BACs in the genome
- Fundamental for classical sequencing
- Very useful for the assembly of "shotgun" sequences
How can the map be made from BACs of unknown sequence?

Map construction: BAC fingerprinting
- x more bp in BACs than in the genome for map construction (Arabidopsis – , rice )
- Restriction sites
- Sequencing of DNA ends
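A hedged Python sketch of the idea behind fingerprinting: two BACs that overlap in the genome should share restriction fragments of the same size. The fragment sizes, clone names and the simple tolerance-based matching are hypothetical illustrations. Real fingerprinting software uses statistical scoring that is not described in the slides.

```python
def shared_bands(fingerprint_a, fingerprint_b, tolerance=0):
    """Count fragments of A that match a fragment of B within `tolerance` bp.

    Each fragment of B can be matched only once.
    """
    remaining = sorted(fingerprint_b)
    shared = 0
    for size in sorted(fingerprint_a):
        for index, other in enumerate(remaining):
            if abs(size - other) <= tolerance:
                shared += 1
                del remaining[index]
                break
    return shared


# Hypothetical restriction fragment sizes (bp) for three BAC clones.
bac_1 = [1200, 3400, 5100, 800, 2600]
bac_2 = [3400, 5100, 800, 9100, 450]
bac_3 = [7000, 150, 620]

print(shared_bands(bac_1, bac_2))  # many shared bands: likely overlap
print(shared_bands(bac_1, bac_3))  # no shared bands: no evidence of overlap
```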

BAC fingerprinting ANIMATION of HIERARCHICAL SHOTGUN:

Minimum tiling path = the smallest possible set of BACs covering the whole sequence
Physical map arrangement, mapping and clone selection:
- by restriction fragment analysis
- using terminal sequences and hybridization
- by hybridization with markers of known position in the genetic map
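Once the clones are placed on the physical map, choosing a minimum tiling path is an interval-covering problem. The Python sketch below uses the standard greedy approach (always extend with the clone that reaches furthest). It assumes every clone has known start and end coordinates on the map. The clone names and coordinates are invented for illustration.

```python
def minimum_tiling_path(clones, region_start, region_end):
    """Pick the fewest clones covering [region_start, region_end].

    `clones` is a list of (name, start, end) tuples on the physical map.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    path = []
    covered = region_start
    index = 0
    while covered < region_end:
        best = None
        # Among clones starting inside the covered part, keep the one
        # that extends furthest to the right.
        while index < len(ordered) and ordered[index][1] <= covered:
            if best is None or ordered[index][2] > best[2]:
                best = ordered[index]
            index += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical clones with map coordinates in kb.
clones = [
    ("BAC_A", 0, 150), ("BAC_B", 50, 180), ("BAC_C", 100, 300),
    ("BAC_D", 160, 320), ("BAC_E", 290, 450), ("BAC_F", 310, 400),
]
print(minimum_tiling_path(clones, 0, 450))
```

In practice neighbouring clones must overlap by enough sequence to be joined reliably, so a real selection would require a minimum overlap and not just touching ends.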

Shotgun sequencing
- random cleavage + direct sequencing (NGS)
- BAC / chromosome / whole genome
- sequencing of clone ends (with a known distance between them)
- cosmids (40 kbp): ~500 bp

Genome (chromosome, BAC...) assembly
..ACGATTACAATAGGTT..
1. Looking for overlaps in the primary sequences
2. Assembly into contigs to get short consensus sequences
3. Assembly into supercontigs using the information from sequence pairs (ends + distance)
4. Complete consensus sequence
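Steps 1 and 2 can be illustrated with a toy greedy assembler in Python. This is a simplified sketch, not the algorithm of any particular assembler: it assumes error-free reads, a single strand, and no repeats, and it repeatedly merges the pair of sequences with the longest suffix-prefix overlap. The three reads are hypothetical fragments of the example sequence on the slide.

```python
def overlap(left, right, min_len):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for length in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads into contigs, always joining the best-overlapping pair."""
    contigs = list(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    length = overlap(left, right, min_len)
                    if length > best_len:
                        best_len, best_i, best_j = length, i, j
        if best_len == 0:
            return contigs  # no overlaps left: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            contig for k, contig in enumerate(contigs) if k not in (best_i, best_j)
        ]
        contigs.append(merged)


reads = ["ACGATTAC", "TTACAATA", "CAATAGGTT"]
print(greedy_assemble(reads))
```

Reads that share no sufficient overlap stay as separate contigs, which is where paired-end information (step 3) comes in.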

Repetitive sequences and contig assembly
Repeats are a serious problem for assembly if they are conserved and longer than a single sequencing read.

Use of markers for whole-genome assembly
- STS (sequence-tagged sites) = short sequences with a known position on the chromosomes
- Supercontigs with a scaffold (BAC-end sequences with a known distance between them)

Filling of gaps: shorter clones are better
- optimal: libraries with different insert sizes (2, 10 and 50 kbp)
- sequencing the linker clone = filling the gap

What to do with the genome sequence? Annotate it!
Searching for genes:
- automatic prediction of coding sequences
- prediction of introns/exons
- prediction based on related sequences
- confirmation by cDNA and ESTs
Prediction of function: from experimentally characterized homologues
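As a very reduced illustration of automatic prediction of coding sequences, the Python sketch below scans the three forward reading frames for open reading frames (ATG to the next in-frame stop codon). It is an assumption-laden toy: it ignores introns, the reverse strand and all statistical signals that real gene predictors use. The example sequence is made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, orf) for ORFs on the forward strand.

    `start` and `end` are 0-based, end-exclusive; `min_codons` counts the
    codons before the stop codon, including the initial ATG.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for position in range(frame, len(sequence) - 2, 3):
            codon = sequence[position:position + 3]
            if start is None and codon == "ATG":
                start = position
            elif start is not None and codon in STOP_CODONS:
                if (position - start) // 3 >= min_codons:
                    orfs.append((start, position + 3, sequence[start:position + 3]))
                start = None
    return orfs


print(find_orfs("GGATGAAACCCGGGTAACC"))
```

Candidate ORFs found this way would then be checked against cDNA/EST evidence and related sequences, as the slide lists.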

Fragment of GenBank BAC clone annotation

Graphical interface of BAC annotation

Large genomes: alternative sequencing strategies
- isolation of individual chromosomes, e.g. wheat: allows assembly of homeologous chromosomes (allohexaploid)
- shotgun sequencing of non-methylated DNA (maize)
- sequencing of ESTs (potato)

Expressed Sequence Tags (ESTs)
- short sequenced regions of cDNA ( nt)
- usually gene fragments (they originate primarily from mRNA)
- highly redundant, but also incomplete!
- problems:
  - no regulatory sequences (promoters, introns, ...)
  - only transcripts of certain genes

Preparation of an EST library
- mRNA
- RT with oligo(T) primer → cDNA
- cleavage of RNA from the heteroduplex by RNase H
- 2nd strand cDNA synthesis
- cleavage with a restriction endonuclease
- adaptor ligation
- cloning
- sequencing → Expressed Sequence Tags (ESTs)

Assembly of EST contigs - Unigenes

Next generation sequencing
- faster and cheaper!!!
- parallel sequencing of high numbers of sequences!
- no handling of individual sequences!
Examples of recently developed or developing technologies:
- 454 sequencing, pyrosequencing (Roche): complementary strand synthesis
- Illumina, sequencing by synthesis: complementary strand synthesis
- SOLiD, Sequencing by Oligonucleotide Ligation and Detection: ligation of labelled oligonucleotides
- Oxford nanopore technology: exonuclease degradation, detection of changes in electric current

NGS: comparison of basic parameters
Method | Read length | Reads per run | Cost per 1 million bases (US$)
Single-molecule real-time sequencing (Pacific Bio) | (30,000) bp | 50,000 | $0.33-$1.00
Ion semiconductor (Ion Torrent sequencing) | up to 400 bp | up to 80 million | $1
Pyrosequencing (454) | 700 bp | 1 million | $10
Sequencing by synthesis (Illumina) | 50 to 300 bp | up to 3 billion | $0.05 to $0.15
Sequencing by ligation (SOLiD sequencing) | 50+50 bp | 1.2 to 1.4 billion | $0.13
Chain termination (Sanger sequencing) | 400 to 900 bp | N/A | $2400

454 technology - pyrosequencing
up to 1 million reads (length bp) in one day (23-hour procedure) = Mbp

454 technology - pyrosequencing

454 technology

Illumina – sequencing by synthesis (Solexa)

Illumina – sequencing by synthesis (Solexa)

SOLiD™ System (Applied Biosystems): 2 Base Encoding
Sequencing by Oligonucleotide Ligation and Detection
- reads up to 75 b
- Gb per day!
- high accuracy, up to 99.99%
- initial step: clonal multiplication (similar to 454)

SOLiD™ System
- Mix of 1024 octamers: 64 variants of the degenerate trinucleotide NNN × 16 known dinucleotides
- Z = nucleotides pairing universally with any nucleotide (prolongation), cleaved off after ligation
- Labelling: 4 fluorescent dyes, each for 256 octamers (that is, for just 4 of the known middle dinucleotides)
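The probe counts can be checked by enumeration. The Python sketch below assumes the usual description of SOLiD two-base encoding, in which the dye of a dinucleotide is the XOR of 2-bit base codes (A=0, C=1, G=2, T=3). The numbering of the dyes 0 to 3 is arbitrary and chosen here only for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
CODE = {base: index for index, base in enumerate(BASES)}  # A=0, C=1, G=2, T=3

# Every probe is defined by 3 degenerate bases (NNN) and 1 known dinucleotide;
# the universal Z positions do not add variants.
probes = [
    ("".join(nnn), "".join(dinucleotide))
    for nnn in product(BASES, repeat=3)
    for dinucleotide in product(BASES, repeat=2)
]

# Assign a dye to each probe from its known dinucleotide.
probes_per_dye = Counter(CODE[di[0]] ^ CODE[di[1]] for _, di in probes)
dinucleotides_per_dye = {
    dye: sorted({di for _, di in probes if CODE[di[0]] ^ CODE[di[1]] == dye})
    for dye in range(4)
}

print(len(probes))               # 64 x 16 probes in total
print(dict(probes_per_dye))      # probes sharing each dye
print(dinucleotides_per_dye[0])  # the 4 dinucleotides read as dye 0
```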

5 independent reactions, each consisting of 10 to 15 repeated ligations of labelled octamers, starting from a primer with a shifted end

Knowledge of the first nucleotide allows translation of the color sequence into the nucleotide sequence. A different first nucleotide gives an alternative translation (figure: A A T G C A G G C A T G C C G T A C).
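A short Python sketch of why the first nucleotide matters. It assumes the commonly described two-base encoding in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3); the color numbers 0 to 3 are an illustrative labelling, not the dye names. The same color string decodes to a different sequence for each possible starting base.

```python
BASES = "ACGT"
CODE = {base: index for index, base in enumerate(BASES)}  # A=0, C=1, G=2, T=3


def encode_colors(sequence):
    """Color-space representation: one color per pair of adjacent bases."""
    return [CODE[a] ^ CODE[b] for a, b in zip(sequence, sequence[1:])]


def decode_colors(first_base, colors):
    """Translate colors back to bases, given the known first nucleotide."""
    decoded = [first_base]
    for color in colors:
        # XOR is its own inverse, so the next base is previous XOR color.
        decoded.append(BASES[CODE[decoded[-1]] ^ color])
    return "".join(decoded)


colors = encode_colors("ATGCA")
print(colors)
for first in BASES:
    # Each possible first nucleotide yields a different translation.
    print(first, decode_colors(first, colors))
```

Because every base is covered by two colors, a single wrong color changes all downstream bases in the naive decoding, which is what makes isolated measurement errors detectable.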

Oxford nanopore technologies: direct sequencing of one DNA strand
- protein nanopore in a membrane (alpha-hemolysin)
- covalently bound exonuclease
- monitoring of the specific decrease in current (metC!)