Published by Candace Ramsey. Modified over 9 years ago.
1
Eukaryotic Genomes: From Parasites to Primates (part 2 of 2) Monday, November 3, 2003 Introduction to Bioinformatics ME:440.714 J. Pevsner pevsner@jhmi.edu
2
Copyright notice Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by J. Pevsner (ISBN 0-471-21004-8). Copyright © 2003 by Wiley. These images and materials may not be used without permission from the publisher. Visit http://www.bioinfbook.org
3
Individual eukaryotic genomes: Introduction We will next survey eukaryotic genomes. Basic issues are: -- description of complete sequence of the chromosomes -- annotation of the DNA to characterize noncoding DNA -- annotation to identify protein-coding genes -- chromosome structure -- comparative genomics analyses -- molecular evolution -- relation of genotype to phenotype -- disease relevance Page 567
4
Individual eukaryotic genomes: Introduction We will explore the eukaryotic tree of Baldauf et al. (2000) moving from the bottom upwards. Baldauf SL, Roger AJ, Wenk-Siefert I, Doolittle WF (2000). A kingdom-level phylogeny of eukaryotes based on combined protein data. Science 290(5493), 972-977. Page 567
9
Individual eukaryotic genomes: Protozoans at the base of the tree Giardia lamblia is a water-borne parasite Disease relevance: giardiasis (causes diarrhea) Distinguishing features: lack of mitochondria, peroxisomes Genome size: 12 Mb Chromosomes: 5 (range 0.7 to >3 Mb) Website: http://www.mbl.edu/Giardia (sequencing in progress) The genome has just three retrotransposons. Also, it appears to have a single intron (ferredoxin gene). Page 570
10
Individual eukaryotic genomes: trypanosomes and Leishmania Page 571 Trypanosoma brucei causes sleeping sickness (Africa) Trypanosoma cruzi causes Chagas’ disease (S. America) Distinguishing features: transmitted by tsetse flies Genome size: 35 Mb (+/- 25% in various isolates) Chromosomes: 11 (range 1 to >6 Mb); also has intermediate chromosomes and 100 linear minichromosomes Website: http://parsun1.path.cam.ac.uk Trypanosomes have kinetoplast DNA (circular rings of mitochondrial DNA)(studied by Paul Englund’s lab here).
11
Individual eukaryotic genomes: trypanosomes and Leishmania Page 571 Leishmania major causes leishmaniasis Genome size: 34 Mb Chromosomes: 36 (range 0.3 to 2.5 Mb) Genes: about 9800 Website: http://www.sanger.ac.uk/Projects/L_major/ Leishmania chromosome 1 has 79 protein-coding genes. The first 29 (from the left telomere) are all transcribed from one strand, and the next 50 from the opposite strand.
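The strand arrangement on Leishmania chromosome 1 can be summarized by collapsing a left-to-right list of gene strands into runs. The sketch below is only illustrative: the "+"/"-" labels are an assumed encoding (the slide does not say which strand comes first), and `strand_runs` is a hypothetical helper, not part of any annotation tool.

```python
from itertools import groupby


def strand_runs(strands):
    """Collapse a left-to-right list of gene strands into (strand, count) runs."""
    return [(strand, sum(1 for _ in group)) for strand, group in groupby(strands)]


# Assumed encoding of chromosome 1, read from the left telomere:
# 29 genes on one strand followed by 50 genes on the opposite strand.
chr1_strands = ["-"] * 29 + ["+"] * 50

runs = strand_runs(chr1_strands)
print(runs)                                   # one run per block of same-strand genes
print("strand switches:", len(runs) - 1)      # number of points where the strand flips
```

A typical eukaryotic chromosome would give many short runs; a single switch across 79 genes is what makes this arrangement unusual.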
12
Individual eukaryotic genomes: malaria parasite Plasmodium falciparum Page 573 Plasmodium falciparum causes malaria, killing 2.7 million people each year. Distinguishing features: Four Plasmodium species infect humans: P. falciparum, P. vivax, P. ovale, P. malariae. The life cycle is extremely complex. Genome size: 22.8 Mb Chromosomes: 14 (range 0.6 to 3.3 Mb) Genes: 5268 (comparable to S. pombe) (1 gene/4300 bp) Website: http://www.plasmodb.org P. falciparum has an adenine+thymine (AT) content of 80.6%. The P. yoelii yoelii genome was also sequenced (infects rats).
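The gene density and AT content quoted for P. falciparum are simple ratios. The following minimal sketch reproduces the density from the slide's own numbers and shows how AT content would be computed for a sequence. The function names and the toy sequence are hypothetical and chosen only for illustration.

```python
def bp_per_gene(genome_size_bp, gene_count):
    """Average number of base pairs per annotated gene."""
    return genome_size_bp / gene_count


def at_fraction(seq):
    """Fraction of A and T among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("A") + seq.count("T")) / acgt


# Figures from the slide: 22.8 Mb genome, 5268 genes.
density = bp_per_gene(22_800_000, 5268)
print(f"about 1 gene per {density:.0f} bp")

# Toy sequence (not real Plasmodium DNA) with 8 of 10 bases A or T.
print(f"AT fraction of toy sequence: {at_fraction('AATTAATTGC'):.1%}")
```

The density comes out close to 4300 bp per gene, consistent with the rounded figure on the slide.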
13
Individual eukaryotic genomes: malaria parasite Plasmodium falciparum Page 573 Bioinformatics approaches to Plasmodium falciparum: -- The apicoplast (relic plastid; fatty acid, isoprene metabolism) is a potential drug target. Apicoplast signal sequences found. -- Comparative genomics defines some gene functions, identifies genes lacking in closely related species -- Genes implicated in antigenic variation and immune system evasion can be identified (e.g. 1000 copies of vir) -- Proteomics applied to four stages of the life cycle (sporozoites, merozoites, trophozoites, gametocytes) -- Atypical metabolic pathways may be exploited, e.g. use of 1-deoxy-D-xylulose 5-phosphate (DOXP) in isoprene biosynthesis.
14
Individual eukaryotic genomes: overview of plants Plants form a distinct clade in the eukaryotic tree All plants are multicellular Plants are sessile, and depend on photosynthesis (Epifagus is an exception) Plants originated about 1.5 billion years ago (BYA), after eukaryotes had acquired a mitochondrion by endosymbiosis. Plants acquired a plastid (i.e. the chloroplast) over 1 BYA. Page 575
15
Figure 16.22 Page 575 After Meyerowitz (2002) and Wang et al. (1999)
16
Individual eukaryotic genomes: overview of plants Eudicots (e.g. Arabidopsis) diverged from monocots (e.g. rice) about 200 million years ago (MYA). Dicots include rosids (Arabidopsis, Glycine max [soybean], M. truncatula) and asterids (e.g. Lycopersicon esculentum [tomato]). Monocots include cereals (seeds of flowering plants from the grass family). Page 578
17
Figure 16.23 Page 577
18
Individual eukaryotic genomes: Arabidopsis thaliana Page 578 A. thaliana is a thale cress, sometimes called a weed. Distinguishing features: Rapid growth rate, extensive genetics. Member of the Brassicaceae (mustard) family. A flowering plant (emerged 200 MYA). Genome size: 125 Mb (very small for a plant genome). Wheat is 16.5 Gb, barley is 5 Gb. Chromosomes: 5 Genes: 25,498 (comparable to human) Website: http://www.arabidopsis.org --The entire Arabidopsis genome may have duplicated twice. -- 24 duplicated segments of > 100 kilobases
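To put "very small for a plant genome" in numbers, the genome sizes quoted on the Arabidopsis slide can be compared directly. This is plain arithmetic on the slide's figures; the dictionary and the `fold_larger` helper are hypothetical names used only for this sketch.

```python
# Genome sizes in megabases, as quoted on the slide.
GENOME_SIZE_MB = {
    "Arabidopsis thaliana": 125,
    "barley": 5_000,
    "wheat": 16_500,
}


def fold_larger(species, reference="Arabidopsis thaliana"):
    """How many times larger the genome of `species` is than that of `reference`."""
    return GENOME_SIZE_MB[species] / GENOME_SIZE_MB[reference]


for species in ("barley", "wheat"):
    print(f"{species}: {fold_larger(species):.0f}x the Arabidopsis genome")
```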
19
Fig. 16.25 Page 580 The TAIR web browser for Arabidopsis
20
Individual eukaryotic genomes: rice Page 579 Oryza sativa is rice (subspecies indica, japonica). Distinguishing features: This crop is a staple for half the world’s population. Four groups generated draft versions. Genome size: 430 Mb (1/8th of the human genome). One of the smallest grass genomes. Chromosomes: 12 Genes: about 50,000? (more than human) Website: http://www.usricegenome.org (and other sites) --The rice genome displays an unusual gradient in GC content. The mean is 43%. The 5’ end of most genes has a higher GC content than the 3’ end (by 25%). GC-rich regions occur selectively in exons (not introns).
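A GC gradient of the kind described for rice genes can be looked for by splitting a gene into consecutive windows from 5' to 3' and computing the GC fraction of each. The sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, the toy sequence is invented rather than a real rice gene, and the window count is arbitrary.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (seq.count("G") + seq.count("C")) / acgt


def gc_by_window(seq, n_windows):
    """GC fraction of n_windows consecutive, nearly equal pieces of seq (5' to 3')."""
    size = len(seq) / n_windows
    bounds = [round(i * size) for i in range(n_windows + 1)]
    return [gc_fraction(seq[start:end]) for start, end in zip(bounds, bounds[1:])]


# Invented toy gene: GC-rich 5' half, AT-rich 3' half.
toy_gene = "GCGCGGCCGA" + "ATGCATATTA"
five_prime, three_prime = gc_by_window(toy_gene, 2)
print(f"5' half GC: {five_prime:.0%}, 3' half GC: {three_prime:.0%}")
```

On real annotations one would run this per gene, oriented by strand, and average the per-window values across genes.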
21
Individual eukaryotic genomes: overview of the metazoans The metazoans are animals including worms, insects, and vertebrates (e.g. fish and primates). Page 582
22
Individual eukaryotic genomes: the slime mold Dictyostelium discoideum Page 582 Dictyostelium discoideum is a slime mold. This forms an outgroup to the metazoans. Distinguishing features: The remarkable life cycle includes single-cell and multicellular forms. Genome size: 34 Mb Chromosomes: 6 Genes: about 11,000 Website: http://dictybase.org --The Dicty genome has almost 80% AT content (similar to Plasmodium). Thus a whole-chromosome shotgun strategy was employed.
23
Individual eukaryotic genomes: the nematode C. elegans Page 584 C. elegans is a free-living soil nematode. Distinguishing features: Its genome was the first of a multicellular animal to be sequenced (1998). Genome size: 97 Mb Chromosomes: 6 Genes: about 19,000 (spanning 27% of genome) Website: http://www.wormbase.org --Many worm functional genomics projects have been performed, such as microarrays at multiple developmental stages.
24
Individual eukaryotic genomes: the fruitfly Drosophila Page 585 Drosophila’s distinguishing features: Short lifecycle, varied phenotypes, model organism in genetics. Genome size: 180 Mb Chromosomes: 5 Genes: about 13,000 (spanning 27% of genome) Website: http://www.fruitfly.org --At the time, largest genome for which whole genome shotgun sequencing was applied. --Each genome annotation improves the gene models
25
Individual eukaryotic genomes: the mosquito Anopheles gambiae Page 587 A. gambiae was the second insect genome sequenced. Distinguishing features: It is the malaria parasite vector. Genome size: 278 Mb (twice the size of Drosophila) Chromosomes: 3 Genes: about 14,000 Website: http://www.ensembl.org/Anopheles_gambiae/ --Diverged from Drosophila 250 MYA (average amino acid sequence identity of orthologs is 56%). Compare human and pufferfish (diverged 400 MYA, 61% identity): insect proteins diverge at a faster rate. --High degree of genetic variation
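The claim that insect proteins diverge faster follows from dividing the observed divergence by the elapsed time. The sketch below works through the two pairs on the mosquito slide. It is a rough back-of-the-envelope calculation, not the method used in the original analysis: it assumes a constant rate, treats 1 minus identity as the distance, and adds the standard Poisson correction (d = -ln(identity)) as an alternative. All function names are hypothetical.

```python
import math


def p_distance(identity):
    """Uncorrected distance: fraction of aligned residues that differ."""
    return 1.0 - identity


def poisson_distance(identity):
    """Poisson-corrected distance, allowing for multiple substitutions per site."""
    return -math.log(identity)


def rate_per_my(distance, divergence_mya):
    """Substitutions per site per million years along one lineage.

    Both lineages have evolved since the split, so the total time is 2 * T.
    """
    return distance / (2.0 * divergence_mya)


# (average ortholog identity, divergence time in MYA), from the slide.
pairs = {
    "Anopheles vs Drosophila": (0.56, 250),
    "human vs pufferfish": (0.61, 400),
}

for name, (identity, mya) in pairs.items():
    naive = rate_per_my(p_distance(identity), mya)
    corrected = rate_per_my(poisson_distance(identity), mya)
    print(f"{name}: {naive:.2e} uncorrected, {corrected:.2e} Poisson-corrected")
```

Under either measure the insect pair comes out roughly 1.8 to 1.9 times faster than the vertebrate pair, which is the comparison the slide is making.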
26
Individual eukaryotic genomes: the sea squirt Ciona intestinalis The chordates include vertebrates (fish, amphibians, reptiles, birds, mammals) which have a spinal column. Some chordates are invertebrates, such as the sea squirt. Genome size: 160 Mb (20 times smaller than human) Chromosomes: 14 Genes: 15,852 Significant for our understanding of vertebrate evolution. Page 587
27
Individual eukaryotic genomes: the fish Fugu rubripes Page 588 Fugu is a pufferfish (also called Takifugu rubripes). Distinguishing features: Diverged from humans 450 MYA; has comparable number of genes in a compact genome. Genome size: 365 Mb (1/10th of the human genome) Genes: about 30,000 Website: http://genome.jgi-psf.org/fugu6/fugu6.info.html --Only 2.7% of genome is interspersed repeats (compare 45% in human), based on RepeatMasker. --Introns are relatively short. 75% of Fugu introns are <425 base pairs (for human, 75% are <2609 base pairs).
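The intron statistic for Fugu (75% of introns shorter than 425 bp) is a statement about the distribution of intron lengths. Given a list of intron lengths from an annotation, it could be checked as sketched below. The lengths here are invented for illustration and the helper name is hypothetical.

```python
def fraction_below(lengths, threshold):
    """Fraction of lengths that are strictly smaller than threshold."""
    if not lengths:
        raise ValueError("no lengths given")
    return sum(1 for n in lengths if n < threshold) / len(lengths)


# Invented intron lengths in base pairs (not real Fugu data).
toy_intron_lengths = [80, 120, 150, 210, 300, 410, 900, 5000]

share = fraction_below(toy_intron_lengths, 425)
print(f"{share:.0%} of toy introns are shorter than 425 bp")
```

Running the same function with the human threshold from the slide (2609 bp) on a human annotation would give the corresponding comparison.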
28
Individual eukaryotic genomes: the mouse Mus musculus Page 589 M. musculus is the second mammal to have its genome sequenced. Mouse diverged from human 75 MYA. Distinguishing features: only 300 of 30,000 annotated genes have no human orthologs Genome size: 2.5 Gb (euchromatic portion) (cf. 2.9 Gb human) Chromosomes: 20 pairs (19 autosomal pairs plus the sex chromosomes) Genes: about 30,000 Website: http://www.informatics.jax.org --Dozens of mouse-specific expansions occurred, such as the olfactory receptor gene family. --40% of the mouse genome can be aligned to the human genome at the nucleotide level.
29
Individual eukaryotic genomes: primates Page 591 The phylogenetic tree shows that chimpanzee (Pan troglodytes) and bonobo (pygmy chimpanzee, Pan paniscus) are the two species most closely related to humans. These three species diverged from a common ancestor about 5.4 million years ago, based on an analysis of 36 nuclear genes. Large-scale genome sequencing projects have begun for the chimpanzee. Other genomes under consideration are the rhesus macaque monkey (Macaca mulatta) and the olive baboon (Papio hamadryas anubis).
31
Perspective and pitfalls Page 531 One of the broadest goals of biology is to understand the nature of each species: what are its mechanisms of development, metabolism, homeostasis, reproduction, and behavior? Sequencing a genome does not answer these questions directly. After genome annotation, we try to interpret the function of the genome’s constituents in the context of various physiological processes. The field of bioinformatics needs continued development of algorithms to find genes, repetitive sequences, genome duplications and other features, as well as tools to identify conserved regions. We may then generate and test hypotheses about genome function.