Eukaryotic Genomes: From Parasites to Primates (part 2 of 2) Monday, November 3, 2003 Introduction to Bioinformatics ME:440.714 J. Pevsner



Copyright notice
Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by J Pevsner (ISBN ). Copyright © 2003 by Wiley. These images and materials may not be used without permission from the publisher.

Individual eukaryotic genomes: Introduction
We will next survey eukaryotic genomes. Basic issues are:
-- description of the complete sequence of the chromosomes
-- annotation of the DNA to characterize noncoding DNA
-- annotation to identify protein-coding genes
-- chromosome structure
-- comparative genomics analyses
-- molecular evolution
-- relation of genotype to phenotype
-- disease relevance
Page 567

Individual eukaryotic genomes: Introduction
We will explore the eukaryotic tree of Baldauf et al. (2000), moving from the bottom upwards.
Baldauf SL, Roger AJ, Wenk-Siefert I, Doolittle WF (2000). A kingdom-level phylogeny of eukaryotes based on combined protein data. Science 290(5493).
Page 567

Individual eukaryotic genomes: Protozoans at the base of the tree
Giardia lamblia is a water-borne parasite.
Disease relevance: giardiasis (causes diarrhea)
Distinguishing features: lack of mitochondria and peroxisomes
Genome size: 12 Mb
Chromosomes: 5 (range 0.7 to >3 Mb)
Website: (sequencing in progress)
-- The genome has just three retrotransposons.
-- It appears to have a single intron (in the ferredoxin gene).
Page 570

Individual eukaryotic genomes: trypanosomes and Leishmania
Trypanosoma brucei causes sleeping sickness (Africa).
Trypanosoma cruzi causes Chagas’ disease (S. America).
Distinguishing features: T. brucei is transmitted by tsetse flies
Genome size: 35 Mb (+/- 25% in various isolates)
Chromosomes: 11 (range 1 to >6 Mb); also has intermediate chromosomes and 100 linear minichromosomes
Website:
-- Trypanosomes have kinetoplast DNA (circular rings of mitochondrial DNA), studied by Paul Englund’s lab here.
Page 571

Individual eukaryotic genomes: trypanosomes and Leishmania
Leishmania major causes leishmaniasis.
Genome size: 34 Mb
Chromosomes: 36 (range 0.3 to 2.5 Mb)
Genes: about 9800
Website:
-- Leishmania chromosome 1 has 79 protein-coding genes. The first 29 (from the left telomere) are all transcribed from one strand, and the next 50 from the opposite strand.
Page 571
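
The strand arrangement on Leishmania chromosome 1 can be summarized by collapsing the gene order into runs of same-strand genes. The following is a minimal sketch: the "+" and "-" labels and the function name `strand_runs` are illustrative assumptions, since the slide states only that the two blocks lie on opposite strands.

```python
from itertools import groupby


def strand_runs(strands):
    """Collapse a left-to-right list of gene strands into (strand, count) runs."""
    return [(strand, sum(1 for _ in group)) for strand, group in groupby(strands)]


# Assumption: "-" labels the strand of the first 29 genes and "+" the next 50.
# The slide only says the two blocks are transcribed from opposite strands.
chromosome_1 = ["-"] * 29 + ["+"] * 50

runs = strand_runs(chromosome_1)
print(runs)
print("strand switches:", len(runs) - 1)
```

With real annotation, the strand list would come from a gene table sorted by start coordinate; a single switch point is what the slide describes for this chromosome.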

Individual eukaryotic genomes: malaria parasite Plasmodium falciparum
Plasmodium falciparum causes malaria, killing 2.7 million people each year.
Distinguishing features: Four Plasmodium species infect humans: P. falciparum, P. vivax, P. ovale, P. malariae. The life cycle is extremely complex.
Genome size: 22.8 Mb
Chromosomes: 14 (range 0.6 to 3.3 Mb)
Genes: 5268 (comparable to S. pombe) (1 gene/4300 bp)
Website:
-- P. falciparum has an adenine+thymine (AT) content of 80.6%.
-- The P. yoelii yoelii genome was also sequenced (infects rats).
Page 573
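
The gene density and AT content quoted above are simple ratios. The sketch below shows one way to compute them; the function names and the toy sequence are illustrative assumptions, not part of the lecture, and the toy sequence is not real Plasmodium DNA.

```python
def bp_per_gene(genome_size_bp: int, gene_count: int) -> float:
    """Average number of base pairs per annotated gene."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return genome_size_bp / gene_count


def at_content(sequence: str) -> float:
    """Fraction of A and T among the A/C/G/T bases in a sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("A") + seq.count("T")) / counted


# Figures from the slide: 22.8 Mb genome, 5268 genes.
density = bp_per_gene(22_800_000, 5268)
print(f"{density:.0f} bp per gene")

# Toy sequence, made up for illustration only.
toy = "ATATTTAAGCATATAATTTA"
print(f"AT content: {at_content(toy):.1%}")
```

Rounded to the nearest hundred, the density agrees with the slide's figure of 1 gene per 4300 bp.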

Individual eukaryotic genomes: malaria parasite Plasmodium falciparum
Bioinformatics approaches to Plasmodium falciparum:
-- The apicoplast (relic plastid; fatty acid and isoprene metabolism) is a potential drug target. Apicoplast signal sequences have been found.
-- Comparative genomics defines some gene functions and identifies genes lacking in closely related species.
-- Genes implicated in antigenic variation and immune system evasion can be identified (e.g. copies of vir).
-- Proteomics has been applied to four stages of the life cycle (sporozoites, merozoites, trophozoites, gametocytes).
-- Atypical metabolic pathways may be exploited, e.g. use of 1-deoxy-D-xylulose 5-phosphate (DOXP) in isoprene biosynthesis.
Page 573

Individual eukaryotic genomes: overview of plants
-- Plants form a distinct clade in the eukaryotic tree.
-- All plants are multicellular.
-- Plants are sessile, and depend on photosynthesis (Epifagus is an exception).
-- Plants originated about 1.5 billion years ago (BYA), after eukaryotes had acquired a mitochondrion by endosymbiosis.
-- Plants acquired a plastid (i.e. the chloroplast) over 1 BYA.
Page 575

Figure, Page 575. After Meyerowitz (2002) and Wang et al. (1999).

Individual eukaryotic genomes: overview of plants
-- Eudicots (e.g. Arabidopsis) diverged from monocots (e.g. rice) about 200 million years ago (MYA).
-- Dicots include rosids (Arabidopsis, Glycine max [soybean], M. truncatula) and asterids (e.g. Lycopersicon esculentum [tomato]).
-- Monocots include cereals (seeds of flowering plants from the grass family).
Page 578

Figure Page 577

Individual eukaryotic genomes: Arabidopsis thaliana
A. thaliana is a thale cress, sometimes called a weed.
Distinguishing features: Rapid growth rate, extensive genetics. Member of the Brassicaceae (mustard) family. A flowering plant (emerged 200 MYA).
Genome size: 125 Mb (very small for a plant genome). Wheat is 16.5 Gb, barley is 5 Gb.
Chromosomes: 5
Genes: 25,498 (comparable to human)
Website:
-- The entire Arabidopsis genome may have duplicated twice; there are duplicated segments of >100 kilobases.
Page 578

Fig Page 580 The TAIR web browser for Arabidopsis

Individual eukaryotic genomes: rice
Oryza sativa is rice (subspecies indica, japonica).
Distinguishing features: This crop is a staple for half the world’s population. Four groups generated draft versions.
Genome size: 430 Mb (1/8th of human genome). One of the smallest grass genomes.
Chromosomes: 12
Genes: about 50,000? (more than human)
Website: (and other sites)
-- The rice genome displays an unusual gradient in GC content. The mean is 43%. The 5’ end of most genes has a higher GC content than the 3’ end (by 25%). GC-rich regions occur selectively in exons (not introns).
Page 579
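
A 5’-to-3’ GC gradient of the kind described for rice genes can be checked by comparing GC content in the two halves of a gene. This is a minimal sketch under stated assumptions: splitting at the midpoint is a simplification chosen for illustration (not the analysis used in the rice papers), and the toy gene is invented rather than real rice sequence.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C among the A/C/G/T bases in a sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (seq.count("G") + seq.count("C")) / counted


def gc_by_half(gene_sequence: str):
    """Return (GC of the 5' half, GC of the 3' half) for a gene written 5' to 3'."""
    midpoint = len(gene_sequence) // 2
    return gc_content(gene_sequence[:midpoint]), gc_content(gene_sequence[midpoint:])


# Toy gene, made up for illustration: GC-rich 5' half, AT-rich 3' half.
toy_gene = "GCGCGGCCGCATGCGC" + "ATTAGCATTAATGCAT"

five_prime, three_prime = gc_by_half(toy_gene)
print(f"5' half GC: {five_prime:.1%}, 3' half GC: {three_prime:.1%}")
```

A sliding window along the gene would give a finer picture of the gradient than two halves, at the cost of a little more code.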

Individual eukaryotic genomes: overview of the metazoans The metazoans are animals including worms, insects, and vertebrates (e.g. fish and primates). Page 582

Individual eukaryotic genomes: the slime mold Dictyostelium discoideum
Dictyostelium discoideum is a slime mold. This forms an outgroup to the metazoans.
Distinguishing features: The remarkable life cycle includes single-cell and multicellular forms.
Genome size: 34 Mb
Chromosomes: 6
Genes: about 11,000
Website:
-- The Dicty genome has almost 80% AT content (similar to Plasmodium). Thus a whole-chromosome shotgun strategy was employed.
Page 582

Individual eukaryotic genomes: the nematode C. elegans
C. elegans is a free-living soil nematode.
Distinguishing features: Its genome was the first of a multicellular animal to be sequenced (1998).
Genome size: 97 Mb
Chromosomes: 6
Genes: about 19,000 (spanning 27% of genome)
Website:
-- Many worm functional genomics projects have been performed, such as microarrays at multiple developmental stages.
Page 584

Individual eukaryotic genomes: the fruitfly Drosophila
Drosophila’s distinguishing features: Short life cycle, varied phenotypes, model organism in genetics.
Genome size: 180 Mb
Chromosomes: 5
Genes: about 13,000 (spanning 27% of genome)
Website:
-- At the time, this was the largest genome to which whole genome shotgun sequencing had been applied.
-- Each genome annotation improves the gene models.
Page 585

This is Ann: the mosquito Anopheles gambiae
A. gambiae was the second insect genome sequenced.
Distinguishing features: It is the malaria parasite vector.
Genome size: 278 Mb (twice the size of Drosophila)
Chromosomes: 3
Genes: about 14,000
Website:
-- Diverged from Drosophila 250 MYA (average amino acid sequence identity of orthologs is 56%). Compare human and pufferfish (diverged 400 MYA, 61% identity): insect proteins diverge at a faster rate.
-- High degree of genetic variation.
Page 587
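
The claim that insect proteins diverge faster follows from dividing the accumulated sequence difference by the time since divergence. The sketch below uses only the figures on the slide and a deliberately crude linear measure: it ignores multiple substitutions at the same site, so it is an illustration of the reasoning, not a proper evolutionary rate estimate. The function name is an assumption.

```python
def divergence_per_my(identity_percent: float, divergence_time_my: float) -> float:
    """Percent amino acid difference accumulated per million years since the split."""
    if divergence_time_my <= 0:
        raise ValueError("divergence_time_my must be positive")
    return (100.0 - identity_percent) / divergence_time_my


# Figures from the slide.
insect = divergence_per_my(56, 250)      # Anopheles vs. Drosophila
vertebrate = divergence_per_my(61, 400)  # human vs. pufferfish

print(f"insects:     {insect:.4f} % difference per MY")
print(f"vertebrates: {vertebrate:.4f} % difference per MY")
print(f"ratio:       {insect / vertebrate:.2f}")
```

On these numbers the insect pair has accumulated more difference in less time, which is the point the slide makes.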

Individual eukaryotic genomes: the sea squirt Ciona intestinalis
The chordates include vertebrates (fish, amphibians, reptiles, birds, mammals), which have a spinal column. Some chordates are invertebrates, such as the sea squirt.
Genome size: 160 Mb (20 times smaller than human)
Chromosomes: 14
Genes: 15,852
-- Significant for our understanding of vertebrate evolution.
Page 587

Individual eukaryotic genomes: the fish Fugu rubripes
Fugu is a pufferfish (also called Takifugu rubripes).
Distinguishing features: Diverged from humans 450 MYA; has a comparable number of genes in a compact genome.
Genome size: 365 Mb (1/10th of human genome)
Genes: about 30,000
Website:
-- Only 2.7% of the genome is interspersed repeats (compare 45% in human), based on RepeatMasker.
-- Introns are relatively short. 75% of Fugu introns are <425 base pairs (for human, 75% are <2609 base pairs).
Page 588
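
The intron comparison above amounts to asking what fraction of intron lengths fall under a threshold. A minimal sketch follows; the function name and the list of intron lengths are invented for illustration and are not Fugu data.

```python
def fraction_shorter_than(intron_lengths, threshold_bp):
    """Fraction of introns whose length is strictly below threshold_bp."""
    if not intron_lengths:
        raise ValueError("intron_lengths must not be empty")
    shorter = sum(1 for length in intron_lengths if length < threshold_bp)
    return shorter / len(intron_lengths)


# Hypothetical intron lengths in base pairs (not real annotation).
toy_introns = [80, 120, 200, 410, 900, 1500, 95, 300]

print(fraction_shorter_than(toy_introns, 425))   # threshold quoted for Fugu
print(fraction_shorter_than(toy_introns, 2609))  # threshold quoted for human
```

With real data, the lengths would be derived from exon coordinates in a genome annotation file, and the same function could be applied to each species in turn.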

Individual eukaryotic genomes: the mouse Mus musculus
M. musculus is the second mammal to have its genome sequenced. Mouse diverged from human 75 MYA.
Distinguishing features: only 300 of 30,000 annotated genes have no human orthologs
Genome size: 2.5 Gb (euchromatic portion) (cf. 2.9 Gb human)
Chromosomes: 20 pairs (19 autosomal pairs plus the sex chromosomes)
Genes: about 30,000
Website:
-- Dozens of mouse-specific expansions occurred, such as the olfactory receptor gene family.
-- 40% of the mouse genome can be aligned to the human genome at the nucleotide level.
Page 589

Individual eukaryotic genomes: primates Page 591 The phylogenetic tree shows that chimpanzee (Pan troglodytes) and bonobo (pygmy chimpanzee, Pan paniscus) are the two species most closely related to humans. These three species diverged from a common ancestor about 5.4 million years ago, based on an analysis of 36 nuclear genes. Large-scale genome sequencing projects have begun for the chimpanzee. Other genomes under consideration are the rhesus macaque monkey (Macaca mulatta) and the olive baboon (Papio hamadryas anubis).

Perspective and pitfalls Page 531 One of the broadest goals of biology is to understand the nature of each species: what are its mechanisms of development, metabolism, homeostasis, reproduction, and behavior? Sequencing a genome does not answer these questions directly. After genome annotation, we try to interpret the function of the genome’s constituents in the context of various physiological processes. The field of bioinformatics needs continued development of algorithms to find genes, repetitive sequences, genome duplications and other features, as well as tools to identify conserved regions. We may then generate and test hypotheses about genome function.