BioSci 145B lecture 1: mRNA frequency and cloning. © copyright Bruce Blumberg 2004. All rights reserved.

Presentation transcript:

mRNA frequency and cloning
mRNA frequency classes
  – classic references
      Bishop et al., 1974 Nature 250,
      Davidson and Britten, 1979 Science 204,
  – abundant
      mRNAs that together represent 10-20% of the total RNA mass
      abundance of each is > 0.2% of the total
  – intermediate
      1,000-2,000 mRNAs together comprising 40-45% of the total
      abundance of each lies between the rare and abundant classes
  – rare
      15,000-20,000 mRNAs comprising 40-45% of the total
      abundance of each is less than 0.05% of the total
      some of these might only occur at a few copies per cell
How does one go about identifying genes that might only occur at a few copies per cell?
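To get a feel for why rare mRNAs are hard to find, here is a minimal sketch of the standard sampling estimate N = ln(1 - P) / ln(1 - f): the number of random clones N needed to see, with probability P, at least one copy of a transcript present at fractional abundance f. The function name, the 99% target and the 0.001% "very rare" abundance are assumptions chosen for illustration, not values from the lecture.

```python
import math

def clones_needed(abundance: float, probability: float = 0.99) -> int:
    """Random clones to sample so that a transcript at fractional
    `abundance` shows up at least once with the given probability."""
    if not 0 < abundance < 1 or not 0 < probability < 1:
        raise ValueError("abundance and probability must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - probability) / math.log(1 - abundance))

# Thresholds from the frequency classes; the last abundance is hypothetical.
examples = [
    ("abundant threshold (0.2%)", 0.002),
    ("rare threshold (0.05%)", 0.0005),
    ("hypothetical very rare mRNA (0.001%)", 0.00001),
]
for label, abundance in examples:
    print(f"{label}: {clones_needed(abundance):,} clones")
```

The number of clones grows roughly in proportion to 1/f, which is why screening an unmodified library for a few-copies-per-cell transcript quickly becomes impractical.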

Normalization and subtraction
How to identify genes that might only occur at a few copies per cell?
  – alter the representation of the cDNAs in a library or probe
  – Normalization: the process of reducing the frequency of abundant mRNAs and increasing the frequency of rare mRNAs
      Bonaldo et al., 1996 Genome Research 6,
  – Subtraction: removing cDNAs (mRNAs) expressed in both of two populations, leaving only the differentially expressed ones
      Sagerström et al. (1997) Ann Rev. Biochem 66,
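The effect of the two strategies on library representation can be sketched with a toy model. This is only bookkeeping on clone counts, not the laboratory procedure: capping counts stands in for normalization, and dropping shared transcripts stands in for subtraction. All gene names and counts are hypothetical.

```python
from collections import Counter

def normalize(library: Counter, cap: int) -> Counter:
    """Crude stand-in for normalization: cap every transcript's clone count."""
    return Counter({gene: min(count, cap) for gene, count in library.items()})

def subtract(tester: Counter, driver: Counter) -> Counter:
    """Crude stand-in for subtraction: drop transcripts also present in the driver."""
    return Counter({gene: count for gene, count in tester.items() if gene not in driver})

def fraction(library: Counter, gene: str) -> float:
    """Share of the library taken up by one transcript."""
    return library[gene] / sum(library.values())

# Hypothetical clone counts for two mRNA populations.
embryo = Counter({"actin": 900, "tubulin": 90, "rare_factor": 10})
adult = Counter({"actin": 800, "tubulin": 200})

normalized = normalize(embryo, cap=10)
subtracted = subtract(embryo, adult)

print(f"rare_factor in original library:   {fraction(embryo, 'rare_factor'):.1%}")
print(f"rare_factor in normalized library: {fraction(normalized, 'rare_factor'):.1%}")
print(f"transcripts left after subtraction: {sorted(subtracted)}")
```

Either way, the rare transcript ends up as a much larger share of what remains, so fewer clones need to be screened to find it.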

Adams et al. (1991) Science 252,
The problem: completion of the human genome sequence was very far off in the distance
  – The big debate (circa 1989)
      Sequence entire genome
        – Will take a long time and lots of money
      Or sequence mRNAs (cDNAs)
        – Will get coding sequences but how to be sure you have every one?
        – How to get rare cDNAs?

Adams et al. (1991) Science 252,
In 1991 only a few thousand mRNA sequences had been identified
  – Brain mRNAs < 200
  – Not good for solving neurological diseases
      Venter and colleagues were at the National Institute of Neurological Disorders and Stroke
How to get rapid sequence to use for
  – Mapping
  – Studying diseases
  – Gene identification
The solution?
  – High throughput sequencing of random cDNAs (96/day!)
      Modern machines 8 x 384 /day each
  – These Expressed Sequence Tags (ESTs) have many uses
  – Venter proposes that they be used in place of STS (sequence tagged sites)
      Provide more information with less cost and effort (no extensive validation required)

Adams et al. (1991) Science 252,
What do you get from EST sequencing?
  – Rapid survey of expressed genes in cell, tissue, organ or embryo
  – Information for gene identification
  – Tags for gene mapping
Table 1 tests how to improve the frequency of new genes

Adams et al. (1991) Science 252, Table 2
  – Table 2 shows that they identified a number of already known human genes
  – Unsaid is that these are all relatively abundant transcripts
      At least in the intermediate class
  – Suggests what subsequent EST sequencing shows to be the case
      Random EST sequencing overrepresents abundant and intermediate frequency sequences
      Underrepresents the rare frequency class
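The bias toward abundant transcripts follows directly from random sampling, and a rough calculation makes it concrete. The sketch below assumes every mRNA within a class has the same abundance and uses the chance 1 - (1 - f)^n that a transcript at fractional abundance f appears at least once among n random reads. The class sizes and mass fractions are assumptions picked to sit inside the ranges quoted for the frequency classes (the count of 10 abundant mRNAs in particular is a placeholder), and 600 reads matches the number of sequences mentioned in the conclusions.

```python
def detection_probability(abundance: float, reads: int) -> float:
    """Chance that a transcript at fractional `abundance` is seen at least once."""
    return 1 - (1 - abundance) ** reads

def expected_coverage(classes: dict, reads: int) -> dict:
    """Expected fraction of each class's genes seen at least once in `reads` ESTs."""
    return {
        name: detection_probability(mass_fraction / n_genes, reads)
        for name, (n_genes, mass_fraction) in classes.items()
    }

# (number of distinct mRNAs, share of total mRNA mass); illustrative assumptions.
classes = {
    "abundant": (10, 0.15),
    "intermediate": (1_500, 0.425),
    "rare": (17_500, 0.425),
}

coverage = expected_coverage(classes, reads=600)
for name, seen in coverage.items():
    print(f"{name}: about {seen:.1%} of genes expected among 600 random ESTs")
```

Under these assumptions nearly every abundant gene is sampled, while only a small minority of the rare class is touched at all, which is the pattern the slide describes.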

Adams et al. (1991) Science 252, Table 3
Table 3 shows the relationship of ESTs to other non-identical genes in the database
  – Putative relatives, depending on degree of sequence similarity
  – Ranges from nearly identical to about 57% (still fairly closely related)
  – Conclude that EST sequencing can identify relatives of genes known in other species

Adams et al. (1991) Science 252, Table 4
  – Compare sequences with ProSite motif database
      This categorizes patterns seen in sequences
        – NLS
        – Zinc fingers
        – ATP binding cassette
        – Etc
  – Found several that appear to be new members of particular classes
  – Conclude that EST sequencing and analysis allows one to identify unknown members of known gene families

Adams et al. (1991) Science 252, Table 5
  – Evaluated accuracy of sequencing
      92-98% depending on read length
      Limitation of separation technology (slab gels)
  – Very poor by today’s standards (99+% at 600 bases)
      High error rate means one must sequence at greater redundancy to get the correct sequence
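The link between error rate and redundancy can be illustrated with a simple majority-vote model. This is a sketch under strong assumptions: reads are independent, every base has the same accuracy, and the consensus counts as correct only when more than half of the reads agree on the right base (which is pessimistic, since wrong reads need not agree with each other). The function name and the read depths are illustrative.

```python
from math import comb

def consensus_accuracy(per_read_accuracy: float, reads: int) -> float:
    """Probability that strictly more than half of `reads` independent
    reads call a given base correctly."""
    needed = reads // 2 + 1
    return sum(
        comb(reads, k)
        * per_read_accuracy ** k
        * (1 - per_read_accuracy) ** (reads - k)
        for k in range(needed, reads + 1)
    )

# Accuracies from Table 5 (92% and 98%) plus the modern figure quoted above (99%).
for accuracy in (0.92, 0.98, 0.99):
    row = ", ".join(
        f"{depth} reads: {consensus_accuracy(accuracy, depth):.4%}"
        for depth in (1, 3, 5)
    )
    print(f"per-read accuracy {accuracy:.0%} -> {row}")
```

Even in this crude model, a 92% read needs several-fold coverage to approach the per-base accuracy that a single modern read delivers.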

Adams et al. (1991) Science 252, Figure 1
Figure 1: identification of human relatives of Drosophila neurogenic genes
  – These are responsible for neuronal differentiation in Drosophila
  – Demonstrates that genes of known function from a model organism can be used to identify interesting human genes to study

Adams et al. (1991) Science 252, Figure 2
  – Mapped ESTs to chromosomes
  – Used PCR to check which members of an RH panel corresponded to the EST
      Maps the EST to a chromosome (provided that the RH has been so mapped)
  – Why is this important?
      This enables mRNAs to be mapped to genomic loci and provides a quick entry point to gene identification
        – Diseases
        – Mutations
        – Translocations
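The logic of panel-based mapping can be sketched as a concordance test: an EST maps to the chromosome whose presence or absence across the panel matches the PCR results exactly. The panel below is entirely hypothetical (four hybrid lines, each assumed to retain a known set of human chromosomes), and real panels tolerate some discordance rather than demanding a perfect match.

```python
def map_est(panel: dict, pcr_positive: dict) -> list:
    """Return chromosomes whose retention pattern across the panel
    matches the PCR result for the EST in every hybrid line."""
    chromosomes = set().union(*panel.values())
    return [
        chrom
        for chrom in sorted(chromosomes)
        if all(
            (chrom in retained) == pcr_positive[hybrid]
            for hybrid, retained in panel.items()
        )
    ]

# Hypothetical panel: hybrid line -> human chromosomes it retains.
panel = {
    "hybrid_1": {"1", "7"},
    "hybrid_2": {"7", "X"},
    "hybrid_3": {"1", "X"},
    "hybrid_4": {"2"},
}
# Hypothetical PCR results for one EST-specific primer pair.
pcr_positive = {"hybrid_1": True, "hybrid_2": True, "hybrid_3": False, "hybrid_4": False}

print(map_est(panel, pcr_positive))
```

The panel is designed so that each chromosome has a distinct retention pattern; with too few lines, several chromosomes would match and the assignment would be ambiguous.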

Adams et al. (1991) Science 252, Conclusions
  – EST sequencing is a rapid and efficient way to generate sequence tags with numerous uses
      … bp of sequence is enough to identify the sequence, map it to a chromosome, and determine homology with distant organisms
      Claimed matches with yeast and Neurospora sequences
        – In fact, these were contaminants in the library from yeast RNA used as carrier for precipitations during library construction
            » very sloppy
  – 337/600 sequences were putative new genes: a good method to quickly identify genes
  – Way too many abundant genes: suggested that libraries must be normalized or subtracted to minimize redundancy
  – Pioneered large scale automated sequence entry
  – Suggested that in a few years, they would have mapped all of the mRNAs from human brain
      Overly optimistic

dbEST Summary
Organism | May 1999 | August 2000 | August 2002
Homo sapiens (human) | 1,380,737 | 2,232,809 | 4,533,427
Mus musculus + domesticus (mouse) | 521,672 | 1,604,115 | 2,624,752
Rattus sp. (rat) 112, , ,827
Glycine max (soybean) 8,236 96, ,299
Drosophila melanogaster (fruit fly) 83,197 90, ,583
Danio rerio (zebrafish) 24,567 71, ,334
Hordeum vulgare + subsp. vulgare (barley) ,877
Bos taurus (cattle) , ,495
Xenopus laevis , ,132
Triticum aestivum (wheat) 4 39, ,047
Caenorhabditis elegans (nematode) 72, , ,632
Arabidopsis thaliana (thale cress) 37, , ,624
Ciona intestinalis ,272
Zea mays (maize) 13,177 70, ,610
Medicago truncatula (barrel medic) , ,917
Dictyostelium discoideum 15,199 19, ,197
Lycopersicon esculentum (tomato) 9,088 87, ,346
Chlamydomonas reinhardtii ,324
Sus scrofa (pig) 4,136 33, ,213
Oryza sativa (rice) 40,499 60, ,019
Silurana+Xenopus tropicalis ,619
Solanum tuberosum (potato) 85 94,420
Anopheles gambiae (African malaria mosquito) 86 94,032
Sorghum bicolor (sorghum) ,738 84,712
Gallus gallus (chicken) ,840 62,476
total public ESTs | 2,464,337 | 5,462,530 | 12,190,151

dbEST release April 9, 2004
Organism | Entries
Homo sapiens (human) | 5,484,645
Mus musculus + domesticus (mouse) | 4,088,831
Rattus sp. (rat) | 592,060
Triticum aestivum (wheat) | 555,472
Ciona intestinalis | 492,511
Danio rerio (zebrafish) | 484,827
Gallus gallus (chicken) | 481,956
Bos taurus (cattle) | 409,104
Zea mays (maize) | 395,955
Xenopus laevis (African clawed frog) | 368,783
Hordeum vulgare + subsp. vulgare (barley) | 356,856
Xenopus tropicalis | 349,052
Glycine max (soybean) | 346,582
Sus scrofa (pig) | 287,741
Oryza sativa (rice) | 283,989
Drosophila melanogaster (fruit fly) | 274,367
Saccharum officinarum | 246,301
Caenorhabditis elegans (nematode) | 231,096
Arabidopsis thaliana (thale cress) | 204,396
Sorghum bicolor (sorghum) | 190,864
Dictyostelium discoideum | 155,032
Lycopersicon esculentum (tomato) | 150,519
Oryzias latipes (Japanese medaka) | 149,697
Solanum tuberosum (potato) | 149,227
Oncorhynchus mykiss (rainbow trout) | 142,967
Schistosoma mansoni (blood fluke) | 139,135
Vitis vinifera | 137,660
Anopheles gambiae (African malaria mosquito) | 134,784
Bombyx mori (domestic silkworm) | 116,541
Pinus taeda (loblolly pine) | 110,622
Lotus corniculatus var. japonicus | 110,563
Number of public entries: 20,685,791
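The growth of dbEST across the releases listed above is easy to quantify. The short calculation below uses only the total public EST counts and the April 2004 human count from the two summaries; the variable and function names are illustrative.

```python
# Total public EST counts taken from the dbEST summaries above.
totals = [
    ("May 1999", 2_464_337),
    ("August 2000", 5_462_530),
    ("August 2002", 12_190_151),
    ("April 2004", 20_685_791),
]

def fold_changes(series: list) -> list:
    """Fold increase between each pair of consecutive releases."""
    return [
        (earlier[0], later[0], later[1] / earlier[1])
        for earlier, later in zip(series, series[1:])
    ]

for start, end, fold in fold_changes(totals):
    print(f"{start} -> {end}: {fold:.2f}-fold")

overall = totals[-1][1] / totals[0][1]
human_share = 5_484_645 / totals[-1][1]  # human entries, April 2004 release
print(f"overall: {overall:.2f}-fold; human share in April 2004: {human_share:.1%}")
```

The database grew several-fold in under five years, with human sequences still making up roughly a quarter of all entries in 2004.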