Eukaryotic genomes: fungi


1 Eukaryotic genomes: fungi
Chapter 18: Eukaryotic genomes: fungi Jonathan Pevsner, Ph.D. Bioinformatics and Functional Genomics (Wiley-Liss, 3rd edition, 2015) You may use this PowerPoint for teaching purposes

2 Learning objectives After reading this chapter you should be able to:
■ provide an overview of how fungi are classified; ■ describe the features of the Saccharomyces cerevisiae genome; ■ discuss genome duplication in S. cerevisiae; ■ describe comparative genomics of the genus Saccharomyces; and ■ describe comparative genomics of other fungal phyla.

3 Outline Introduction Description and classification of fungi
Introduction to budding yeast Saccharomyces cerevisiae Gene duplication and genome duplication Comparative analyses of hemiascomycetes Analysis of fungal genomes Perspective

4 Introduction to fungi: phylogeny
Fungi are eukaryotic organisms that can be filamentous (e.g. molds) or unicellular (e.g. the yeast Saccharomyces cerevisiae). Most fungi are aerobic (but S. cerevisiae can grow anaerobically). Fungi have major roles in the ecosystem in degrading organic waste. They have important roles in fermentation, including the manufacture of steroids and penicillin. Several hundred fungal species are known to cause disease in humans.

5 Fungi are a sister group with the metazoans (animals)
B&FG 3e Fig. 18.1 Page 849 We return to this phylogenetic tree in Chapter 19

6 Outline Introduction Description and classification of fungi
Introduction to budding yeast Saccharomyces cerevisiae Gene duplication and genome duplication Comparative analyses of hemiascomycetes Analysis of fungal genomes Perspective

7 Classification of fungi
About 70,000 fungal species have been described (as of 1995), but 1.5 million species may exist. Four phyla:
-- Ascomycota: yeasts, truffles, lichens
-- Basidiomycota: rusts, smuts, mushrooms
-- Chytridiomycota: Allomyces
-- Zygomycota: feed on decaying vegetation

8 Classification of fungi
About 70,000 fungal species have been described (as of 1995), but 1.5 million species may exist. Four phyla:
-- Ascomycota: yeasts, truffles, lichens
     Hemiascomycetae: Génolevures project
     Euascomycetae: Neurospora
     Loculoascomycetae
     Laboulbeniomycetae: parasites of insects
-- Basidiomycota: rusts, smuts, mushrooms
-- Chytridiomycota: Allomyces
-- Zygomycota: feed on decaying vegetation

9 Alternate classification of fungi

10 Outline Introduction Description and classification of fungi
Introduction to budding yeast Saccharomyces cerevisiae Gene duplication and genome duplication Comparative analyses of hemiascomycetes Analysis of fungal genomes Perspective

11 Introduction to Saccharomyces cerevisiae
-- First species domesticated by humans
-- Called baker’s yeast (or brewer’s yeast)
-- Ferments glucose to ethanol and carbon dioxide
-- Model organism for studies of biochemistry, genetics, molecular and cell biology
     …rapid growth rate
     …easy to modify genetically
     …features typical of eukaryotes
     …relatively simple (unicellular)
     …relatively small genome

12 Sequencing the S. cerevisiae genome
The genome was sequenced by a highly cooperative consortium in the early 1990s, chromosome by chromosome (the whole genome shotgun approach was not used). This involved 600 researchers in > 100 laboratories.
-- A physical map was created for all 16 chromosomes (I to XVI)
-- A library of 10 kb inserts was constructed in phage
-- The inserts were assembled into contigs
The sequence was released in 1996 and published in 1997 (Goffeau et al., 1996; Mewes et al., 1997)

13 Features of the S. cerevisiae genome
Sequenced length: 12,068 kb = 12,068,000 base pairs
Length of repeats: 1,321 kb
Total length: 13,389 kb (~ 13 Mb)
Open reading frames (ORFs): 6,275
Questionable ORFs (qORFs): 390
Hypothetical proteins: 5,885
Introns in ORFs: 220
Introns in UTRs: 15
Intact Ty elements: 52
tRNA genes: 275
snRNA genes: 40

14 Features of the S. cerevisiae genome
A notable feature of the genome is its high gene density (about one gene every 2 kilobases). Most bacteria have about one gene per kb, but most eukaryotes have a much sparser gene density. Also, only 4% of S. cerevisiae genes are interrupted by introns. By contrast, 40% of genes from the fungus Schizosaccharomyces pombe have introns.
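The density figure can be checked against the counts on slide 13. The short sketch below assumes the 12,068 kb sequenced length and the 6,275 ORFs are the right inputs, and it uses "introns in ORFs" (220) as a rough upper bound on the number of intron-containing genes; the variable names are illustrative only.

```python
# Figures taken from the "Features of the S. cerevisiae genome" slide.
sequenced_kb = 12_068      # sequenced length in kilobases
orf_count = 6_275          # annotated open reading frames
introns_in_orfs = 220      # assumption: at most one intron per ORF

# Average spacing between genes, in kb per gene.
kb_per_gene = sequenced_kb / orf_count

# Approximate percentage of ORFs interrupted by an intron.
percent_with_introns = 100 * introns_in_orfs / orf_count

print(f"{kb_per_gene:.2f} kb per gene")
print(f"{percent_with_introns:.1f}% of ORFs contain an intron")
```

Both values agree with the rounded statements on the slide (about one gene every 2 kb, and roughly 4% of genes with introns).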

15 Features of S. cerevisiae genome
B&FG 3e Table 18.1 Page 851 ORF: open reading frame

16 ORFs in the S. cerevisiae genome
How are ORFs defined? In the initial genome analysis, an ORF was defined as >100 codons (thus specifying a protein of ~11 kilodaltons). 390 ORFs were listed as “questionable”, because they were considered unlikely to be authentic genes. For example, they were short, or exhibited unlikely preferences for codon usage. How many ORFs are there in the yeast genome? There are 40,000 ORFs > 20 amino acids; how many of these are authentic?
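The length threshold can be made concrete with a small ORF scanner. This is a minimal sketch, not the method used in the original annotation: it assumes an ORF runs from the first ATG in a frame to the next in-frame stop codon, it scans only the three forward frames (the reverse strand is omitted), and the function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=100):
    """Return (start, end, frame) for forward-strand ORFs longer than min_codons.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    The codon count excludes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3    # codons before the stop
                if n_codons > min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # close it and keep scanning
    return orfs
```

Lowering `min_codons` from 100 to 20 on a real chromosome is one way to see why the number of candidate ORFs rises so sharply, and why extra evidence is needed to decide which ones are authentic.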

17 ORFs in the S. cerevisiae genome
Several criteria may be applied to decide if ORFs are authentic protein-coding genes: [1] evidence of conservation in other organisms [2] experimental evidence of gene expression (microarrays, SAGE, functional genomics) The groups of Elizabeth Winzeler and Michael Snyder each described hundreds of previously unannotated genes that are transcribed and translated.

18 Revising the S. cerevisiae gene count through comparative genomics
By sequencing three additional yeast species (Saccharomyces paradoxus, S. bayanus, S. mikatae), Kellis et al. showed that 503 genes should be deleted from the set of yeast genes (leaving 5,726 including 43 newly discovered genes). See Kellis et al. (2003) Nature 423:241.

19 Proteins in S. cerevisiae 288c
B&FG 3e Fig. 18.3 Page 852 Visit NCBI Genome

20 Ten most common protein domains in S. cerevisiae
B&FG 3e Table 18.2 Page 853 From InterPro

21 S. cerevisiae has very few introns
B&FG 3e Fig. 18.4 Page 853

22 Exploring a typical S. cerevisiae chromosome (XII)
We will next familiarize ourselves with the S. cerevisiae genome by exploring a typical chromosome, XII. This chromosome features:
-- 38% GC content
-- very little repetitive DNA
-- few introns
-- six Ty elements (transposable elements)
-- a high ORF density: 534 ORFs > 100 aa, and 72% of the chromosome has protein-coding genes
B&FG 3e Page 854

23 NCBI Map Viewer site: S. cerevisiae
B&FG 3e Fig. 18.5 Page 854

24 Saccharomyces Genome Database (SGD)
SGD is the main community web resource for studying Saccharomyces. We introduced it in Chapter 14.

25 Genome Workbench
Genome Workbench is a versatile tool from NCBI that includes a browser, a sophisticated query interface, and the opportunity to load files such as BAM and VCF. B&FG 3e Fig. 18.6 Page 855 Search view of Genome Workbench: query Saccharomyces cerevisiae

26 Genome Workbench
B&FG 3e Fig. 18.6 Page 855 Genome Workbench view of genes on chromosome XII

27 Genome Workbench
B&FG 3e Fig. 18.6 Page 855 Additional tracks available on graphical view

28 Genome Workbench display for a single gene on chromosome XII (VPS33/YLR396C)
B&FG 3e Fig. 18.7 Page 856

29 S. cerevisiae gene nomenclature
YKL159c
Y = yeast
K = 11th chromosome
L = left (or right) arm
159 = 159th ORF
c = Crick (bottom) strand
w = Watson (top) strand

RCN1 = wildtype gene
Rcn1p = protein
rcn1 = mutant allele

The research community studying S. cerevisiae established these nomenclature rules, which are among the simplest and most informative in biology. From the name “YKL159c” you know the organism, chromosome and arm, ORF, strand, and molecule type. B&FG 3e Box 18.2 Page 856
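The naming rules above are regular enough to decode mechanically. The sketch below is illustrative: the function name `parse_systematic_name` and the returned dictionary keys are hypothetical, and it handles only the basic seven-character form shown on this slide (no suffixed names).

```python
import re

# Y + chromosome letter (A..P = I..XVI) + arm (L/R) + 3-digit ORF number + strand (C/W)
SYSTEMATIC_NAME = re.compile(r"^Y([A-P])([LR])(\d{3})([CW])$", re.IGNORECASE)

def parse_systematic_name(name):
    """Decode a systematic ORF name such as 'YKL159c' into its components."""
    match = SYSTEMATIC_NAME.match(name)
    if match is None:
        raise ValueError(f"not a basic systematic ORF name: {name!r}")
    chrom_letter, arm, number, strand = match.groups()
    return {
        "chromosome": ord(chrom_letter.upper()) - ord("A") + 1,  # K -> 11
        "arm": "left" if arm.upper() == "L" else "right",
        "orf_number": int(number),
        "strand": "Crick" if strand.upper() == "C" else "Watson",
    }

print(parse_systematic_name("YKL159c"))
```

Applied to YLR396C (VPS33, shown earlier), the same function reports chromosome 12, right arm, ORF 396, Crick strand.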

30 Ensembl includes resources for S. cerevisiae
B&FG 3e Fig. 18.7 Page 856 Resources include genome assembly data, comparative genomics, regulation, annotation, and variation

31 Ensembl includes resources for S. cerevisiae
In addition to yeast, Ensembl offers resources for a vast number of other genomes. The centralized resources of Ensembl are among the most important in the field of genomics. B&FG 3e Fig. 18.7 Page 856 Resources include genome assembly data, comparative genomics, regulation, annotation, and variation

32 Exploring chromosome (XII) on the command line
Visit ensembl.org to obtain a Genome Variation Format (GVF) file. Unpack it, then inspect it. There are ~260,000 rows. B&FG 3e Page 857
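The commands on the original slide are not preserved in this transcript. As a stand-in, here is a minimal Python sketch of the "unpack, then inspect" step. It assumes the usual GVF layout (header lines starting with `#`, then nine tab-separated columns with the variant type in column 3); the file name in the usage comment is hypothetical.

```python
from collections import Counter

def summarize_gvf(lines):
    """Count data rows per variant type, skipping blank and '#' header lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[fields[2]] += 1   # column 3 holds the feature type, e.g. SNV
    return counts

# Usage (hypothetical file name; gzip.open in text mode reads without unpacking):
# import gzip
# with gzip.open("Saccharomyces_cerevisiae.gvf.gz", "rt") as handle:
#     counts = summarize_gvf(handle)
# print(sum(counts.values()), counts)
```

The total of the counts corresponds to the number of data rows reported on the slide.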

33 Exploring chromosome (XII) on the command line
The GVF file lists single nucleotide variants (SNVs) in the genome. B&FG 3e Page 858

34 Exploring chromosome (XII) on the command line
Send the data from chromosome XII to a new file called yeast_chrXII_SNVs.gvf. There are ~22,000 variants. B&FG 3e Page 858
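A comparable filter can be sketched in Python. It assumes the first GVF column holds the chromosome name and that chromosome XII is written as `XII` in the Ensembl file; the input file name is hypothetical, while the output name is the one given on the slide.

```python
def rows_for_chromosome(lines, seqid="XII"):
    """Yield GVF data rows whose first column equals seqid; headers are skipped."""
    for line in lines:
        if line.startswith("#"):
            continue
        if line.split("\t", 1)[0] == seqid:
            yield line

# Usage (hypothetical input file name):
# with open("Saccharomyces_cerevisiae.gvf") as src, \
#         open("yeast_chrXII_SNVs.gvf", "w") as dst:
#     dst.writelines(rows_for_chromosome(src, "XII"))
```

Matching on the whole first column, rather than on a substring, keeps rows from chromosomes such as XIII out of the result.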

35 Exploring chromosome (XII) on the command line
Next we’ll explore a file from ensembl.org that includes a variety of non-coding RNAs. You can download this as well. There are 1977 lines, and 413 separate entries. B&FG 3e Page 859

36 Exploring chromosome (XII) on the command line
Determine how many types of noncoding RNAs are present: B&FG 3e Page 859

37 Exploring chromosome (XII) on the command line
Restrict the output to chromosome XII, and view them: B&FG 3e Page 859
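The two preceding steps (counting ncRNA types, then restricting to chromosome XII) can be sketched as follows. The sketch assumes the download is an Ensembl-style FASTA file whose header lines carry `chromosome:<assembly>:<name>:<start>:<end>:<strand>` and `gene_biotype:<type>` fields; that header layout, the function name, and the file name in the usage comment are assumptions and should be checked against the actual file.

```python
import re
from collections import Counter

BIOTYPE = re.compile(r"\bgene_biotype:(\S+)")
LOCATION = re.compile(r"\bchromosome:[^:]*:([^:]+):")

def count_biotypes(lines, chromosome=None):
    """Count FASTA entries per gene_biotype, optionally for one chromosome."""
    counts = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue                      # sequence lines are ignored
        if chromosome is not None:
            loc = LOCATION.search(line)
            if loc is None or loc.group(1) != chromosome:
                continue
        biotype = BIOTYPE.search(line)
        counts[biotype.group(1) if biotype else "unknown"] += 1
    return counts

# Usage (hypothetical file name):
# with open("Saccharomyces_cerevisiae.ncrna.fa") as handle:
#     lines = handle.readlines()
# print(count_biotypes(lines))                    # all ncRNA types
# print(count_biotypes(lines, chromosome="XII"))  # chromosome XII only
```

The number of header lines counted this way corresponds to the "separate entries" figure on slide 35.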

38 Outline Introduction Description and classification of fungi
Introduction to budding yeast Saccharomyces cerevisiae Gene duplication and genome duplication Comparative analyses of hemiascomycetes Analysis of fungal genomes Perspective

39 Duplication of the S. cerevisiae genome
Analysis of the S. cerevisiae genome revealed that many regions are duplicated, both intrachromosomally and interchromosomally (within and between chromosomes). These duplicated regions include both genes and nongenic regions. Such duplications reflect a fundamental aspect of genome evolution. What are the mechanisms by which regions of the genome duplicate?

40 Duplication of the S. cerevisiae genome
Mechanisms of gene duplication:
-- tandem repeat: slippage during recombination
-- gene conversion
-- segmental duplication
-- lateral gene transfer
-- polyploidy, e.g. genome tetraploidy

41 Whole genome duplication
Hypothetical diploid genome
Autopolyploidy (as occurs in S. cerevisiae): 8 chromosomes became 16!
Allopolyploidy: hybridization between related species
B&FG 3e Fig. 18.9 Page 861

42 Duplication of the S. cerevisiae genome
Fate of duplicated genes:
-- Both copies persist
-- One copy is deleted
-- One copy becomes a pseudogene
-- One copy functionally diverges

43 Duplication of the S. cerevisiae genome
What is the fate of duplicated genes? (see YGOB, below) A duplicated gene (overall in eukaryotes) has a half-life of only several million years (Lynch and Conery, 2000). 50% to 92% of duplicated genes are lost (Wagner, 2001). Consider four possible fates of a duplicated gene:
[1] Both copies persist (gene dosage effect)
[2] One copy is deleted (a common fate)
[3] One copy accumulates mutations and becomes a pseudogene (no functional protein product)
[4] One copy (or both) diverges functionally. The organism can perform a novel function.

44 Duplication of the S. cerevisiae genome
In 1970, Susumu Ohno published the book Evolution by Gene Duplication. He hypothesized that vertebrate genomes evolved by two rounds of whole genome duplication. This provided genomes with the “raw materials” (new genes) with which to introduce various innovations.

45 Duplication of the S. cerevisiae genome
Ohno (1970): “Had evolution been entirely dependent upon natural selection, from a bacterium only numerous forms of bacteria would have emerged. The creation of metazoans, vertebrates, and finally mammals from unicellular organisms would have been quite impossible, for such big leaps in evolution required the creation of new gene loci with previously nonexistent function. Only the cistron that became redundant was able to escape from the relentless pressure of natural selection. By escaping, it accumulated formerly forbidden mutations to emerge as a new gene locus.”

46 Duplication of the S. cerevisiae genome
Wolfe and Shields (1997, Nature) provided support for Ohno’s paradigm. They hypothesized that the yeast genome duplicated about 100 million years ago. There was a diploid yeast genome with about 5,000 genes. It doubled to a tetraploid number of 10,000 genes. Then there was massive gene loss and chromosomal rearrangement to yield the present day 6,000 genes.

47 BLASTP of yeast chromosomes shows 55 blocks of duplicated regions: evidence for S. cerevisiae genome duplication
Matches with scores >200 are shown. These are arranged in blocks of genes. B&FG 3e Fig Page 862

48 Duplication of the S. cerevisiae genome
Evidence of genome duplication in yeast:
-- Systematic BLAST searches show 55 blocks of duplicated sequences.
-- There are 376 pairs of homologous genes.
You can see the results of chromosomal comparisons on Ken Wolfe’s web site (YGOB; see below for examples) and at the SGD web site.

49 Duplication of the S. cerevisiae genome
Two models for the presence of duplication blocks [1] Whole genome duplication (tetraploidy) followed by gene loss and rearrangements [2] Successive, independent duplication events

50 Duplication of the S. cerevisiae genome
Model [1] is favored for several reasons:
-- For 50 of 55 duplicated regions, the orientation of the entire block is preserved with respect to the centromere. The orientation is not random.
-- For model [2] we would expect 7 triplicated regions. We observe only 0 or 1.
-- Gene order is maintained in 14 hemiascomycetes (the Génolevures project)
Page 711

51 Duplication of the S. cerevisiae genome
Why are duplicated genes commonly lost? It might seem highly advantageous to have a second copy of a gene, thus permitting functional divergence. Ohno suggested two reasons:
[1] After duplication, a deleterious mutation in one of the two genes might now persist. Without duplication, an individual carrying such a mutation would have been selected against.
[2] The presence of a new paralogous sequence could lead to unequal crossing over of homologous chromosomes during meiosis.

52 Duplication of the S. cerevisiae genome
To consider the fate of duplicated genes, consider the example of genes involved in vesicle transport. Vesicles carry cargo from one destination to another. Proteins on vesicles (e.g. vesicle-associated membrane protein, VAMP; Snc1p in yeast) bind to proteins on target membranes (e.g. syntaxin in mammalian and other eukaryotic systems, or Sso1p in yeast). In S. cerevisiae, genome duplication appears to be responsible for the presence of two syntaxins (SSO1 and SSO2) and two VAMPs (SNC1 and SNC2).

53 Duplication of the S. cerevisiae genome
Snc1p Snc2p Sso1p Sso2p

54 Search for information on SSO1 (or any yeast gene) at the SGD website

55 The SGD record for SSO1 provides information on function

56 Duplication of the S. cerevisiae genome
The SGD website reveals that the SSO1 gene is nonessential (i.e. the null mutant is viable), but the double knockout of SSO1 and SSO2 is lethal. Thus, these paralogs may offer functional redundancy to the organism. Also, these proteins could participate in distinct (but complementary) intracellular trafficking steps.

57 Outline Introduction Description and classification of fungi
Introduction to budding yeast Saccharomyces cerevisiae Gene duplication and genome duplication Comparative analyses of hemiascomycetes Analysis of fungal genomes Perspective

58 Phylogeny of yeasts showing whole genome duplication (red circle)
We can align fungal genomes and see evidence for whole genome duplication. We can date the occurrence of this duplication event. B&FG 3e Fig Page 866

59 Comparative analyses of hemiascomycetes: Whole genome duplication
You can explore duplicated genome regions using the Yeast Gene Order Browser (YGOB) at:

60 Yeast Gene Order Browser
B&FG 3e Fig Page 867

61 Yeast Gene Order Browser
Top: post-WGD tracks. Middle: pre-WGD tracks (species that did not undergo WGD). Bottom: post-WGD tracks that mirror those at top. The patterns of duplicated genes reflect whole genome duplication. B&FG 3e Fig Page 867

62 Yeast Gene Order Browser: patterns of gene loss after WGD

63 Patterns of gene loss after whole-genome duplication in three species
B&FG 3e Fig Page 868

64 Patterns of gene loss after whole-genome duplication in three species
For three species that underwent whole-genome duplication (C. glabrata, S. cerevisiae, and S. castellii) there are 14 possible fates, including loss of no genes (class 0), loss of one gene from any one of the three lineages (class 1A, 1B, 1C), loss of two genes (class 2), loss of three genes from different loci (class 3), or loss of three genes in a convergent manner (class 4; loss of duplicated orthologs; the most common fate of duplicated genes). B&FG 3e Fig Page 868
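One way to see where the number 14 comes from is to enumerate the patterns. The sketch below rests on a modeling assumption that is not stated on the slide: each species either keeps both copies of a duplicated locus or loses exactly one of them, and the two sister tracks are interchangeable, so a pattern and its mirror image count as the same fate.

```python
from collections import Counter
from itertools import product

STATES = ("both", "copy1_only", "copy2_only")
MIRROR = {"both": "both", "copy1_only": "copy2_only", "copy2_only": "copy1_only"}

def canonical(pattern):
    """Pick one representative for a pattern and its track-swapped mirror."""
    mirrored = tuple(MIRROR[state] for state in pattern)
    return min(pattern, mirrored)

# One state per species, in a fixed order (three species in total).
fates = {canonical(p) for p in product(STATES, repeat=3)}

# Group the fates by how many of the three lineages lost a copy.
by_losses = Counter(sum(state != "both" for state in fate) for fate in fates)

print(len(fates))
print(sorted(by_losses.items()))
```

Under these assumptions there are 27 raw patterns that collapse to 14 distinct fates: 1 with no loss, 3 with a single loss, 6 with two losses, and 4 with three losses (3 in which the lineages lost different copies and 1 convergent pattern).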

65 Comparative analyses of hemiascomycetes: Identification of functional elements
Kellis et al. (2003) compared S. paradoxus, S. mikatae, and S. bayanus to S. cerevisiae (divergence dates: 5 to 20 MYA). There were clear orthologous matches, except at the telomeres. For the Gal4 transcription factor and other functional elements, comparative analyses have helped delineate regulatory regions.

66 Yeast Gal4 transcription factor binding site (between GAL10, GAL1)
B&FG 3e Fig Page 869

67 Outline Introduction Description and classification of fungi
Introduction to budding yeast Saccharomyces cerevisiae Gene duplication and genome duplication Comparative analyses of hemiascomycetes Analysis of fungal genomes Perspective

68 Fungal genome projects: representative examples of the Ascomycetes
B&FG 3e Table 18.3 Page 869 ID refers to the NCBI genome project identifier

69 Fungal genome projects: representative examples of the Basidiomycetes
B&FG 3e Table 18.3 Page 869 ID refers to the NCBI genome project identifier

70 Fungal genome projects: representative examples of fungi other than Ascomycetes and Basidiomycetes
B&FG 3e Table 18.5 Page 870 ID refers to the NCBI genome project identifier

71 Fungal genomes
We now briefly consider a set of prominent fungal genomes, arranged alphabetically:
-- Aspergillus nidulans
-- Candida albicans
-- Cryptococcus neoformans: model fungal pathogen
-- Encephalitozoon cuniculi: atypical fungus
-- Neurospora crassa
-- Phanerochaete chrysosporium: first Basidiomycete
-- Schizosaccharomyces pombe: fission yeast

72 Fungal pathogen: Aspergillus nidulans
--Of 185 Aspergillus species, 20 are human pathogens --A. nidulans has a sexual life cycle (in contrast to A. fumigatus and A. oryzae [sake, miso, soy]). --A. nidulans has animal-like peroxisomal enzymes

73 TaxPlot tool (NCBI) compares proteins in two proteomes

74 Fungal pathogen: Candida albicans
--Diploid sexually reproducing fungus --Causes opportunistic infections in humans --Genome: 14.8 Mb with 8 chromosome pairs. Seven of these are constant, and the 8th varies from 3 to 4 Mb. --No known haploid state; the heterozygous diploid state was sequenced. --Over 7600 open reading frames --CUG is translated as serine (rather than leucine)
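The codon reassignment matters in practice: translating a C. albicans coding sequence with the standard table puts leucine wherever the organism inserts serine. The sketch below builds the standard genetic code from its conventional TCAG ordering and derives a second table with the single CUG (CTG in DNA) change described above; the names `STANDARD`, `CUG_SERINE`, and `translate` are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Same table with CTG (CUG in RNA) read as serine instead of leucine.
CUG_SERINE = dict(STANDARD, CTG="S")

def translate(sequence, table):
    """Translate a DNA or RNA coding sequence codon by codon ('*' marks a stop)."""
    dna = sequence.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3          # ignore a trailing partial codon
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))

print(translate("ATGCTGTAA", STANDARD))    # standard reading
print(translate("ATGCTGTAA", CUG_SERINE))  # CUG read as serine
```

The first call yields `ML*` and the second `MS*`, so any pipeline that predicts C. albicans proteins needs to be told which table to use.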

75 An atypical fungus: Encephalitozoon cuniculi
Microsporidia are single-celled eukaryotes that lack mitochondria and peroxisomes. Consistent with their roles as parasites, the E. cuniculi genome is severely reduced in size (2000 proteins, only 2.9 Mb). They were thought to represent deep-branching protozoans, but recent phylogenetic studies place them as an outgroup to fungi.

76 Fungal origin for the microsporidial parasite Encephalitozoon cuniculi (arrow)
B&FG 3e Fig Page 873 Phylogenetic analysis of vacuolar ATPase subunit A from animals, plants, fungi, protists, bacteria, and archaea

77 Mechanisms of reduction of genome size in microsporidia
B&FG 3e Fig Page 874 An ancestral genome is shown schematically with seven genes (blue, red) and large intergenic regions (black line).

78 Orange bread mold: Neurospora crassa
Beadle and Tatum chose N. crassa as a model organism to study gene-protein relationships. The genome sequence was reported: 39 Mb, 7 chromosomes, 10,082 ORFs (Galagan et al., 2003). N. crassa has only 10% repetitive DNA, and incredibly, only 8 pairs of duplicated genes that encode proteins >100 amino acids. This is because Neurospora uses “repeat-induced point mutation” (RIP), a mechanism by which the genome is scanned for duplicated (repeated) sequences. This appears to serve as a genomic defense system, inactivating potentially harmful transposons.

79 Phanerochaete chrysosporium: first Basidiomycete
Basidiomycota diverged from the better-characterized Ascomycota over 500 million years ago. Genome: 30 Mb of DNA, 10 chromosomes. 11,777 genes predicted. White rot fungi (such as P. chrysosporium) degrade the major components of plant cell walls, including cellulose and lignins, using a series of oxidases and peroxidases. The genome encodes hundreds of enzymes that are able to cleave carbohydrates. Comparative analyses of 31 fungal genomes suggest that white rot, with its wood-decaying capabilities, emerged about 295 MYA.

80 Schizosaccharomyces pombe
The S. pombe genome is 13.8 Mb and encodes ~4900 predicted proteins. Some bacterial genomes encode more proteins (e.g. Mesorhizobium loti with 6752, and Streptomyces coelicolor with 7825 genes).
[Table: chromosome, length (Mb), genes, % coding]

81 Schizosaccharomyces pombe
S. pombe diverged from S. cerevisiae about 330 to 420 million years ago. Many genes are as divergent between these two fungi as either is from its human counterpart.

82 Outline Introduction Description and classification of fungi
Introduction to budding yeast Saccharomyces cerevisiae Gene duplication and genome duplication Comparative analyses of hemiascomycetes Analysis of fungal genomes Perspective

83 Perspectives
The budding yeast S. cerevisiae is one of the most significant organisms in biology:
-- Its genome was the first of a eukaryote to be sequenced
-- Its biology is simple relative to metazoans
-- Through yeast genetics, powerful functional genomics approaches have been applied to study all yeast genes
It is important to note that even for yeast, our knowledge of basic biological questions is highly incomplete. For humans and other eukaryotes we still understand relatively little about how the genotype of an organism leads to its characteristic phenotype. Studies in S. cerevisiae help elucidate these relationships.

84 Features of S. pombe genome
B&FG 3e Table 18.6 Page 875

