Introduction to Genomics and the Tree of Life Friday, October 22, 2010 (part 1) Monday, October 25, 2010 (part 2) Genomics 260.605.01 J. Pevsner

Presentation transcript:


Copyright notice: Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics, 2nd edition, by J. Pevsner (© 2009 by Wiley-Blackwell). These images and materials may not be used without permission from the publisher (instructors, me at Visit

We meet 3 times a week, from 10:30 to 11:50 am: W4013 (lecture/discussion and occasional computer lab) Announcements: where/when we meet

Announcements: book, website. Textbook: Bioinformatics and Functional Genomics (2nd edition, Wiley-Blackwell, 2009) by J. Pevsner, ISBN We’ll cover chapters in this course. For those who don’t want to buy a copy, I will share PDFs of all the chapters with the class. You can buy a copy at the website and get a nice discount ($80). It’s $80 at Amazon.com. The JHU bookstore may have copies. Welch Library may have copies. Book’s website: Course website: or visit 13

Outline of this course: Introduction to genomics; Viruses; Bacteria and archaea (Egbert Hoiczyk); Eukaryotes; The eukaryotic chromosome; Fungi, yeast functional genomics (Jef Boeke); Protozoans (David Sullivan); Nematodes (Al Scott); Mosquitoes (George Dimopoulos); Rodents: mouse and rat; Primates; The human genome (Dave Valle); Human disease

Outline of today’s lecture. Introduction: 5 perspectives, history of life; Genome-sequencing projects: chronology; Genome analysis: criteria, resequencing, metagenomics; DNA sequencing technologies: Sanger, 454, Solexa; Process of genome sequencing: centers, repositories; Genome annotation: features, prokaryotes, eukaryotes

Five approaches to genomics. As we survey the tree of life, consider these perspectives:
Approach I: cataloguing genomic information. Genome size; number of chromosomes; GC content; isochores; number of genes; repetitive DNA; unique features of each genome.
Approach II: cataloguing comparative genomic information. Orthologs and paralogs; COGs; lateral gene transfer.
Approach III: function; biological principles; evolution. How genome size is regulated; polyploidization; birth and death of genes; neutral theory of evolution; positive and negative selection; speciation.
Approach IV: human disease relevance.
Approach V: bioinformatics aspects. Algorithms, databases, websites.
Page 519
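GC content is one of the catalogue items listed under Approach I, and it is simple enough to compute by hand. The following is a minimal sketch, not a method taken from the slides: the function names `gc_content` and `windowed_gc` and the toy sequence are illustrative assumptions, and ambiguous bases such as N are simply ignored.

```python
def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return gc / total


def windowed_gc(seq: str, window: int, step: int) -> list:
    """GC content of successive windows; a crude way to look for isochore-like regions."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence, used only to exercise the functions.
toy = "ATATATATGCGCGCGCATATGCGC"
print(gc_content(toy))
print(windowed_gc(toy, window=8, step=8))
```

A window made up entirely of ambiguous bases raises an error here; a real pipeline would probably skip or flag such windows instead.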

Two projects for this course Option [1] Select a genome and describe it in detail. Option [2] Select a gene and describe it in detail. For each, follow the five approaches just outlined, and apply the principles that we learn in this course.

Reading: Webb Miller et al. (2004) Comparative genomics. Section headings:
Introduction
Lessons learned from comparative genomics
What have we learned about genes by comparing genomic sequences?
What have we learned about regulation?
About 5% of the human genome is under purifying selection
Positively regulated regions
Mechanisms and history of mammalian evolution
Nonuniformity of neutral evolutionary rates within species
Nonuniformity of evolution along the branches of phylogeny
Learning more from existing data
Choice of species
Choice of tools
Future of comparative genomics

Levels of analysis in genomics
level         topics                   databases
DNA           genes, chromosomes       GenBank
RNA           ESTs, ncRNA              UniGene, GEO
protein       ORFs, composition        UniProt
complexes     binary, multimeric       BIND
pathways                               COGs, KEGG
organelles
organs
individuals   variation and disease    HapMap
species       speciation               TaxBrowser; SGD
genus                                  JAX mouse
phylum                                 FishBase
kingdom                                TOL

Definitions of terms. Genomics is the study of genomes (the DNA comprising an organism) using the tools of bioinformatics. Bioinformatics is the study of proteins, genes, and genomes using computer algorithms and databases. Systematics is the scientific study of the kinds and diversity of organisms and of any and all relationships among them. Classification is the ordering of organisms into groups on the basis of their relationships. The relationships may be evolutionary (phylogenetic) or may refer to similarities of phenotype (phenetic). Taxonomy is the theory and practice of classifying organisms.

Outline of today’s lecture. Introduction: 5 perspectives, history of life: trees; Genome-sequencing projects: chronology; Genome analysis: criteria, resequencing, metagenomics; DNA sequencing technologies: Sanger, 454, Solexa; Process of genome sequencing: centers, repositories; Genome annotation: features, prokaryotes, eukaryotes

Fig Page 521 Pace (2001) described a tree of life based on small subunit rRNA sequences. This tree shows the main three branches described by Woese and colleagues.

Ernst Haeckel ( ), a supporter of Darwin, published a tree of life (1879) including Monera (formless clumps, later named bacteria). Introduction: Systematics Page 520

Figure: Five kingdom system (Haeckel, 1879). Labels: plants, animals, monera, fungi, protists, protozoa, invertebrates, vertebrates, mammals. Page 516

Chatton (1937) distinguished prokaryotes (bacteria that lack nuclei) from eukaryotes (having nuclei). Whittaker (1969) and others described the five-kingdom system: animals, plants, protists, fungi, and monera. In the 1970s and 1980s, Carl Woese and colleagues described the archaea, thus forming a tree of life with three main branches: archaea, bacteria, eukaryotes. Introduction: Systematics Page 520

Whittaker (1969): The two-kingdom system as it might have appeared in the early 1900s: Plantae, Animalia. Source: Whittaker RH (1969) New concepts of kingdoms of organisms. Evolutionary relations are better represented by new classifications than by the traditional two kingdoms. Science 163(3863):

The Copeland four-kingdom system of the 1930s-1950s: Monera, Metaphyta, Metazoa, Protoctista. Figure axes: prokaryotic, eukaryotic; unicellular, multicellular. Source: Whittaker RH (1969) New concepts of kingdoms of organisms. Evolutionary relations are better represented by new classifications than by the traditional two kingdoms. Science 163(3863):

Whittaker (1969): The five-kingdom system: Plantae, Fungi, Animalia, Monera, Protista. Levels: prokaryotic (Monera); eukaryotic unicellular; eukaryotic multicellular. Source: Whittaker RH (1969) New concepts of kingdoms of organisms. Evolutionary relations are better represented by new classifications than by the traditional two kingdoms. Science 163(3863):

Molecular sequences as basis of trees. Historically, trees were generated primarily using characters provided by morphological data. Molecular sequence data are now commonly used, including sequences (such as small-subunit rRNAs) that are highly conserved. Visit the European Small Subunit Ribosomal RNA database for 20,000 SSU rRNA sequences. Page 523

Pace (2001) described a tree of life based on small subunit rRNA sequences. This tree shows the main three branches described by Woese and colleagues. It is the best currently accepted model of the tree of life. Fig Page 521

Figure: Tree of life from David Hillis’ lab (based on ~3000 rRNAs). Labels: animals, plants, fungi, protists, bacteria, archaea, "you are here". 10-10


Ribosomal RNA Database; Ribosomal Database Project. Santos, S. R. and Ochman H. Identification and phylogenetic sorting of bacterial lineages with universally conserved genes and proteins. Environmental Microbiology Jul;6(7):
► Download fusA (translation elongation factor 2 [EF-2])
► Obtain DNA in the FASTA format
► Align by ClustalW in MEGA
► Create a neighbor-joining tree
Page 524
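The last two steps of the workflow above (align, then build a neighbor-joining tree) are done in MEGA on the slide. As a rough illustration of what the tree-building step computes, here is a minimal sketch that assumes the sequences are already aligned and uses a simple p-distance (proportion of differing sites). The function names, the choice of p-distance, and the toy alignment are assumptions made for illustration; they are not the settings used in the lecture, and the sequences are not real fusA data.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences (gap columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Build an unrooted neighbor-joining tree and return it as a Newick string.

    names: list of unique labels; dist: dict keyed by frozenset({a, b}).
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        # r[i] is the summed distance from node i to every other active node.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r[i] - r[j].
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))  # branch length to i
        lj = dij - li                                    # branch length to j
        new = f"({i}:{li:.4f},{j}:{lj:.4f})"
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    a, b = nodes
    return f"({a}:{d[frozenset((a, b))]:.4f},{b});"


# Hypothetical toy alignment (not real fusA data).
alignment = {
    "taxonA": "ATGGCTCGTACA",
    "taxonB": "ATGGCTCGTACT",
    "taxonC": "ATGGTTCGAACA",
    "taxonD": "ATGATTCGAACG",
}
distances = {
    frozenset((x, y)): p_distance(alignment[x], alignment[y])
    for x, y in combinations(alignment, 2)
}
print(neighbor_joining(list(alignment), distances))
```

The resulting Newick string can be pasted into most tree viewers. Real analyses would normally use a substitution model rather than raw p-distance, and would assess support with bootstrapping.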

European Small Subunit Ribosomal RNA database

Figure: Neighbor-joining tree of ~150 fusA (GTPase) DNA sequences. Labeled taxa: Rickettsia, Treponema, Mycobacterium, Aquifex aeolicus, Yersinia pestis, Clostridium, Mycoplasma, Bacillus anthracis.

Fig Page 603

Eukaryotes (Baldauf et al. 2000) Fig Page 730

Outline of today’s lecture. Introduction: 5 perspectives, history of life: time lines; Genome-sequencing projects: chronology; Genome analysis: criteria, resequencing, metagenomics; DNA sequencing technologies: Sanger, 454, Solexa; Process of genome sequencing: centers, repositories; Genome annotation: features, prokaryotes, eukaryotes

History of life on earth
4.55 BYA   formation of earth (violent 100 MY period)
     BYA   last ocean-evaporating impacts
3.9 BYA    oldest dated rocks
3.8 BYA    sun brightened to 70% of today’s luminosity
Ammonia, methane, or carbon dioxide atmosphere. Earliest life: RNA, protein.
Source: Schopf J.W. (ed.), Life’s Origins (U. Calif. Press, 2002) Page 521

History of life on earth: two major eons. Precambrian eon: extends from the formation of the planet to the appearance of fossils of hard-shelled animals 550 MYA. Phanerozoic eon: from the Cambrian explosion to the present. Source: Schopf J.W. (ed.), Life’s Origins (U. Calif. Press, 2002)

Timeline figure, 4 to 0 billions of years ago (BYA). Eons: Hadean, Archean, Proterozoic, Phanerozoic. Events marked: origin of life; origin of eukaryotes; insects; fungi/animal; plant/animal; earliest fossils. Page 522

Timeline figure, millions of years ago (MYA), Proterozoic eon to Phanerozoic eon. Events marked: insects; Cambrian explosion; Age of Reptiles ends; land plants; deuterostome/protostome; echinoderm/chordate. Page 522

Timeline figure, millions of years ago (MYA). Events marked: dinosaurs extinct; mammalian radiation; human/chimp divergence; mass extinction. Page 522

Timeline figure, millions of years ago (MYA). Events marked: Homo sapiens/chimp divergence; emergence of Homo erectus; earliest stone tools; Australopithecus (Lucy). Page 522

Timeline figure, years ago (1,000,000 to 100,000). Events marked: Homo erectus emerges in Africa; Mitochondrial Eve. Page 523

Timeline figure, years ago (100,000 to 10,000). Events marked: Neanderthal and Homo erectus disappear; emergence of anatomically modern H. sapiens. Page 523

Timeline figure, years ago (from 10,000). Events marked: “Ice Man” from Alps; Aristotle; earliest pyramids. Page 523

Timeline figure, years ago (from 1,000). Events marked: algebra; calculus; Darwin, Mendel; Gutenberg. Page 523

Page 524 Today’s continents derive from earlier land masses (Laurasia, Gondwana), affecting evolution of species


We will next summarize the major achievements in genome sequencing projects from a chronological perspective. Chronology of genome sequencing projects Page 525

1976: first viral genome. Fiers et al. sequence bacteriophage MS2 (3,569 base pairs, accession NC_001417). 1977: Sanger et al. sequence bacteriophage φX174. This virus is 5,386 base pairs (encoding 11 genes). See accession J02482; NC_ Chronology of genome sequencing projects Page 527

Fig Page 528 Entrez nucleotide record for bacteriophage φX174 (graphics display)

1981: Human mitochondrial genome, 16,500 base pairs (encodes 13 proteins, 2 rRNA, 22 tRNA). Today (10/10), over 2200 mitochondrial genomes sequenced. 1986: Chloroplast genome, 156,000 base pairs (most are 120 kb to 200 kb). Chronology of genome sequencing projects Page 527

Figure labels: mitochondrion; chloroplast; lack mitochondria (?)

Entrez Genomes organelle resource at NCBI

There are ~2500 eukaryotic organelles (10/10)

MitoDat: resource for organelle genomes “This database is dedicated to the nuclear genes specifying the enzymes, structural proteins, and other proteins, many still not identified, involved in mitochondrial biogenesis and function. MitoDat highlights predominantly human nuclear-encoded mitochondrial proteins.” Not updated recently.

MitoMap: resource for organelle genomes

It is possible to map mutations in human mitochondrial DNA that are responsible for disease

1995: first genome of a free-living organism, the bacterium Haemophilus influenzae Chronology of genome sequencing projects Page 530

1995: genome of the bacterium Haemophilus influenzae is sequenced Fig Page 531

How to find information about a genome: NCBI → All databases → Genome → follow link to Bacteria

Overview of bacterial complete genomes (2000) n=30

Overview of bacterial complete genomes (2010) n=3,330
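The two overview slides give counts of 30 complete bacterial genomes in 2000 and 3,330 in 2010. As a back-of-the-envelope illustration, the sketch below converts those two numbers into a fold increase and an implied doubling time. It assumes steady exponential growth over the decade, which is a simplification introduced here and not a claim made in the slides; `doubling_time` is a hypothetical helper name.

```python
import math


def doubling_time(n_start: float, n_end: float, years: float) -> float:
    """Doubling time in years, assuming constant exponential growth between two counts."""
    if n_start <= 0 or n_end <= n_start or years <= 0:
        raise ValueError("need 0 < n_start < n_end and years > 0")
    return years * math.log(2) / math.log(n_end / n_start)


fold = 3330 / 30
t2 = doubling_time(30, 3330, 10)
print(f"{fold:.0f}-fold increase; doubling roughly every {t2:.2f} years")
```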

Fig Page 411 You can find functional annotation through the COGs database (Clusters of Orthologous Genes)

Click the circle to access the genome sequence

You can find functional annotation through the COGs database (Clusters of Orthologous Genes) Entrez Genome view of H. influenzae (October 2009)

Click the circle to access the genome sequence Genes are color-coded according to the COGs scheme

1996: first eukaryotic genome The complete genome sequence of the budding yeast Saccharomyces cerevisiae was reported. We will describe this genome soon. Also in 1996, TIGR reported the sequence of the first archaeal genome, Methanococcus jannaschii. Chronology of genome sequencing projects Page 532

1996: a yeast genome is sequenced

To learn about a genome of interest, visit NCBI → TaxBrowser → Genome Projects

To learn about a genome of interest, follow the TaxBrowser → Genome Projects links. Size (in megabases) and number of chromosomes are given here.

To place the sequencing of the yeast genome in context, these are the eukaryotes…

Tree of eukaryotes (Baldauf et al. 2000) Fungi

1997: More bacteria and archaea. Escherichia coli: 4.6 megabases, 4200 proteins (38% of unknown function). 1998: first multicellular organism. Nematode Caenorhabditis elegans: 97 Mb; 19,000 genes. 1999: first human chromosome. Chromosome 22 (49 Mb, 673 genes). Chronology of genome sequencing projects Page 532
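The sizes and gene counts on this slide make it easy to compare gene density across a bacterium, a nematode, and a human chromosome. The sketch below uses only the figures quoted above; the dictionary layout and the `genes_per_mb` helper are illustrative assumptions, and note that the E. coli figure is a protein count, used here as a stand-in for gene number.

```python
# (size in megabases, gene or protein count) as quoted on the slide.
genomes = {
    "Escherichia coli": (4.6, 4200),          # count is proteins
    "Caenorhabditis elegans": (97.0, 19000),
    "Human chromosome 22": (49.0, 673),
}


def genes_per_mb(size_mb: float, n_genes: int) -> float:
    """Average number of genes per megabase of sequence."""
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    return n_genes / size_mb


density = {name: genes_per_mb(size, count) for name, (size, count) in genomes.items()}
for name, value in density.items():
    print(f"{name}: {value:.1f} genes per Mb")
```

The spread of roughly two orders of magnitude between the bacterium and the human chromosome illustrates the contrast between compact prokaryotic genomes and gene-sparse eukaryotic ones.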

See the article by Webb Miller et al. (2004), “Comparative genomics” for a discussion of annotation and analysis progress made since 1998

1999: Human chromosome 22 sequenced

49 Mb, 701 genes

2000: Fruitfly Drosophila melanogaster (13,000 genes) Plant Arabidopsis thaliana Human chromosome : draft sequence of the human genome (public consortium and Celera Genomics) Chronology of genome sequencing projects Page 534

To explore human chromosome 21 at NCBI → Find MapViewer → Choose human → Click chromosome 21

2000

2001: draft human genome sequence. 2002: S. pombe (just 4,800 genes). 2004: “finished” human genome. 2007: first individual human genome. Genomes Project

Outline of Monday’s lecture (Chapter 13). Introduction: 5 perspectives, history of life: time lines; Genome-sequencing projects: chronology; Genome analysis: criteria, resequencing, metagenomics; DNA sequencing technologies: Sanger, 454, Solexa; Process of genome sequencing: centers, repositories; Genome annotation: features, prokaryotes, eukaryotes