FOG: High-Resolution Fungal Orthologous Groups René van der Heijden Project 5.10: Comparative genomics for the prediction of protein function and pathways.


1 FOG: High-Resolution Fungal Orthologous Groups René van der Heijden Project 5.10: Comparative genomics for the prediction of protein function and pathways in Saccharomyces cerevisiae

2 What is this presentation about? What is ‘orthology’? Why do we study gene-ancestry/gene-trees (phylogenies)? Why high-resolution orthology? Automated high-resolution orthology detection The FOG database and some applications

3 Orthology “This gene in that other species …” We don’t have chicken genes! They mean: the corresponding gene? Why that particular gene? Are we sure this actually is the gene? Are we sure that all n orthologs are correct?

4 Orthologous genes The line represents a gene in some ancestral species, a long, long time ago in a land far, far away. There is a speciation event, resulting in two species with the same, orthologous gene. One of the genes gets duplicated, resulting in two paralogous genes. Another speciation event … but one of the paralogous genes is lost in one of the new species. Another speciation event. The result is the current set of genes with its apparent history. [Figure labels: time, speciation event, orthologs, paralogs]

5 Duplications, Speciations, and Orthology Two genes in two species are orthologous if they derive from one gene in their last common ancestor. Orthologous genes are likely to have the same function.

6 Detecting orthologous genes Usual methods are based on BLAST hit quality, e.g. bi-directional best hit (BBH). [Figure labels: BBH, ortholog]
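The bi-directional best hit idea can be sketched in a few lines. This is only an illustration: the score tables, gene names, and function names below are hypothetical, and real pipelines would parse BLAST output and usually apply score or E-value cut-offs that the slides do not specify.

```python
def best_hit(hits):
    """Return the subject gene with the highest score, or None if there are no hits."""
    return max(hits, key=hits.get) if hits else None


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit.

    a_vs_b maps each gene of species A to {gene of species B: score};
    b_vs_a holds the reverse searches.
    """
    pairs = []
    for gene_a, hits in a_vs_b.items():
        gene_b = best_hit(hits)
        # Keep the pair only if the reverse search points back to gene_a.
        if gene_b is not None and best_hit(b_vs_a.get(gene_b, {})) == gene_a:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)


# Toy scores (invented for illustration, not taken from the presentation).
a_vs_b = {"a1": {"b1": 250.0, "b2": 90.0}, "a2": {"b1": 120.0, "b2": 80.0}}
b_vs_a = {"b1": {"a1": 240.0, "a2": 110.0}, "b2": {"a1": 95.0, "a2": 70.0}}

bbh_pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(bbh_pairs)
```

In this toy data, a2 also hits b1 best, but b1's best hit is a1, so only a1 and b1 form a BBH pair.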

7 KOG clusters Based on a triangle of BBHs between genes of three species. In-paralogs are added. Triangles are extended with other genes and other species.
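The seeding step, a triangle of BBHs across three species, might look like the sketch below. It covers only triangle detection; adding in-paralogs and extending triangles are not shown. The gene identifiers and the function name are hypothetical, and the input is assumed to be a list of BBH pairs in which each gene has at most one BBH partner per other species.

```python
from itertools import combinations


def bbh_triangles(bbh_pairs):
    """Return sorted gene triples in which every pair of genes is a BBH."""
    partners = {}
    for g1, g2 in bbh_pairs:
        partners.setdefault(g1, set()).add(g2)
        partners.setdefault(g2, set()).add(g1)

    triangles = set()
    for gene, neighbours in partners.items():
        # Two BBH partners of the same gene close a triangle
        # when they are also BBHs of each other.
        for n1, n2 in combinations(sorted(neighbours), 2):
            if n2 in partners.get(n1, set()):
                triangles.add(tuple(sorted((gene, n1, n2))))
    return sorted(triangles)


# Toy BBH pairs; the prefixes stand for three hypothetical species codes.
pairs = [
    ("sce_g1", "spo_g1"),
    ("spo_g1", "ncr_g1"),
    ("sce_g1", "ncr_g1"),
    ("sce_g2", "spo_g2"),  # no third partner, so no triangle
]

triangles = bbh_triangles(pairs)
print(triangles)
```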

8 KOG statistics These large KOG clusters must have multiple representatives per species. Low resolution: there must be functional specialization within these clusters!

9 High-res versus Low-res Many, Complete, and Closely related genomes Challenge: Automatic Orthology assignment

10 Gene Families Use PSI-BLAST to recognize (distant) homologs. Split the gene set into families of homologous genes. Challenge: promiscuous domains. Multi-domain genes occur very often in eukaryotic genomes.

11 Gene Families Promiscuous domains cause genes to be only partially homologous: –Gene A-B is partially homologous to gene A-C, as is gene B-C. Merging everything with homologous parts generates far too large gene families: –Not possible to obtain proper multiple alignments. More advanced technique for separating multi-domain genes into gene families.
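Why merging on any shared part inflates families can be shown with a small single-linkage sketch. Each gene is reduced to a set of domain labels, and genes sharing at least one domain are merged with a union-find structure. The gene names, domain labels, and function name are hypothetical; this illustrates the problem, not the technique the author was developing.

```python
def merge_by_shared_domain(gene_domains):
    """Single-linkage merge: genes sharing any domain end up in one family."""
    parent = {gene: gene for gene in gene_domains}

    def find(gene):
        # Follow parent links to the family representative (with path halving).
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    first_gene_with = {}
    for gene, domains in gene_domains.items():
        for domain in domains:
            if domain in first_gene_with:
                parent[find(gene)] = find(first_gene_with[domain])
            else:
                first_gene_with[domain] = gene

    families = {}
    for gene in gene_domains:
        families.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(members) for members in families.values())


# Toy multi-domain genes (invented for illustration).
gene_domains = {
    "gene_AB": {"A", "B"},
    "gene_AC": {"A", "C"},
    "gene_BC": {"B", "C"},
    "gene_CE": {"C", "E"},
    "gene_EF": {"E", "F"},
    "gene_D": {"D"},
}

families = merge_by_shared_domain(gene_domains)
print(families)
```

Here gene_AB and gene_EF share no domain at all, yet they land in the same family through the chain of partial homologies, which is the kind of over-merging the slide warns about.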

12 Generating Gene Families The more advanced technique for merging genes into gene families is not functional yet. Fall back on ‘known’ gene families using KOG: –Low-resolution orthology assignments for Eukaryotes –Some inclusive families with many genes per species. Some statistics: 15 fungal species with 104,440 genes in total, divided into 11,020 KOG clusters (gene families) involving 70,867 genes (= 68%).

13 Uncertainty in trees Evolutionary noise –Differing rates of evolution –Convergent evolution (low complexity, coiled coils) –Promiscuous domains (recombination, fusion, fission) Use of heuristic methods –Multiple alignment –Tree making

14 Reading Gene-Trees Although genes spec1,1 and spec2,1 are closer relatives, their distance is larger than that between spec1,1 and spec3,1 The tree suggests at least 2 gene losses

15 Analyze trees … but don’t trust them fully. Rigid analysis suggests many duplications and losses. Presume the scp branch is wrongly placed! If this is correct … this can’t be.

16 Analyze trees … but don’t trust them fully. Three orthologous groups suggest 15 gene losses. And if we accept wrong placement of branches … considering one wrongly placed gene leaves only 2 gene losses.

17 Automatic Orthology assignment LOFT: Levels of Orthology From Trees

18 Result The collection of genes is split into KOG families. KOG families are aligned and phylogenetic trees are derived. Phylogenetic trees are analyzed using LOFT, resulting in high-resolution orthology.
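The slides do not spell out how LOFT analyzes a tree, so the following is only a sketch of one common approach, stated here as an assumption: a species-overlap rule. An internal node whose two subtrees share a species is treated as a duplication, otherwise as a speciation, and each duplication opens a new sub-level in a hierarchical label. The tree encoding (nested 2-tuples, leaves named "species|gene"), the label scheme, and all names are hypothetical.

```python
def species_of(leaf):
    """Leaves are named 'species|gene' (a naming convention assumed here)."""
    return leaf.split("|")[0]


def species_set(node):
    """Collect the species found below a node of a binary tree of nested tuples."""
    if isinstance(node, str):
        return {species_of(node)}
    left, right = node
    return species_set(left) | species_set(right)


def assign_levels(node, label="1", out=None):
    """Give every leaf a hierarchical label; duplications add a sub-level."""
    if out is None:
        out = {}
    if isinstance(node, str):
        out[node] = label
        return out
    left, right = node
    if species_set(left) & species_set(right):
        # Shared species on both sides: treat the node as a duplication.
        assign_levels(left, label + ".1", out)
        assign_levels(right, label + ".2", out)
    else:
        # No shared species: treat the node as a speciation, keep the label.
        assign_levels(left, label, out)
        assign_levels(right, label, out)
    return out


# Toy gene tree with hypothetical species codes sce, cal, and spo.
tree = ((("sce|g1", "cal|g1"), ("sce|g2", "cal|g2")), "spo|g1")

levels = assign_levels(tree)
for gene, level in sorted(levels.items()):
    print(gene, level)
```

Genes that share a label would form one high-resolution group, while a shorter label such as "1" marks a gene that predates the duplication and is co-orthologous to both sub-groups. Because the rule trusts the tree topology completely, it inherits the misplaced-branch problem discussed on the earlier slides.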

19 Result

20 Can LOFT be trusted?

21 It seems okay!

22 Applications We now have FOG: a complete set of high-resolution orthology assignments for fungi. We ‘know’ which orthologous genes are present and absent in which species. Phyletic distribution.
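A phyletic distribution can be represented as a presence/absence string per orthologous group. The sketch below assumes group membership is available as lists of "species|gene" identifiers; the species codes, group names, and function name are hypothetical and the data are invented.

```python
def phyletic_profiles(group_members, species_order):
    """Map each orthologous group to a '1'/'0' string, one position per species."""
    profiles = {}
    for group, genes in group_members.items():
        present = {gene.split("|")[0] for gene in genes}
        profiles[group] = "".join(
            "1" if species in present else "0" for species in species_order
        )
    return profiles


# Toy data: three hypothetical species codes and two orthologous groups.
species_order = ["sce", "ago", "cal"]
group_members = {
    "og1": ["sce|a", "ago|a", "cal|a"],
    "og2": ["sce|b", "cal|b"],  # absent from 'ago'
}

profiles = phyletic_profiles(group_members, species_order)
print(profiles)
```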

23 Complex I

24

25

26 Phyletic distribution of mitochondrial orthologous groups

27 Phylogenetic Tree for Mitochondrial Carrier Proteins

28 Orthologous group 24 is an uncharacterized mitochondrial carrier. It is present in all fungi except Ashbya gossypii. In yeast it is known as YMC1, of unknown function.

29 YMC1: predicted glycine/serine antiporter. There are three S. cerevisiae genes with the same phyletic distribution: –a subunit of glycine decarboxylase –another subunit of glycine decarboxylase –a gene with unknown function
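Finding genes with the same phyletic distribution as a query amounts to comparing presence/absence profiles. A minimal sketch, assuming profiles are stored as '1'/'0' strings over a fixed species order; the profile values and the names other than the idea of a query carrier are invented for illustration and are not the presentation's data.

```python
def same_distribution(profiles, query):
    """Return the other entries whose presence/absence profile equals the query's."""
    target = profiles[query]
    return sorted(
        name for name, profile in profiles.items()
        if profile == target and name != query
    )


# Toy profiles over five hypothetical species (1 = present, 0 = absent).
profiles = {
    "carrier": "11011",
    "subunit_a": "11011",
    "subunit_b": "11011",
    "unknown": "11011",
    "unrelated": "11111",
}

matches = same_distribution(profiles, "carrier")
print(matches)
```

Exact matching is the strictest choice; allowing a small number of mismatches would tolerate annotation gaps at the cost of more false candidates.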

