PrIME Probabilistic Integrated Models of Evolution

1 PrIME Probabilistic Integrated Models of Evolution
Bengt Sennblad, Dept. Plant and Env. Sciences, GU and Stockholm Bioinformatics Center (SBC)

2 Outline: PrIME models
GEM: Gene Evolution Model
SRT: Sequence evolution with Rates and Times
GSR: Gene Sequence evolution with iid Rates
HGM: Gene evolution in hybrid networks

3 Outline What? Why? When? How? So?

4 GEM The Gene Evolution Model
Lars Arvestad, KTH; Ann-Charlotte Berggren, UU; Jens Lagergren, KTH; Bengt Sennblad

5 What? Gene (loci) tree evolution
GEM – gene loci (Coalescence – gene alleles)
Gene duplication and loss process
Models duplications/losses that are fixed in the population

6 What? – genetic mechanism
Recombination errors: unequal crossing-over, tandem repeats, segmental duplication
Retrotransposition: mRNA → cDNA → insertion; loss of introns and regulatory regions
Chromosome/genome duplication

7 What? – the process
Gene tree evolves inside species tree
Events: Speciation, Duplication, Loss
Notes: This is a scenario that shows how a gene tree (in blue) has evolved with respect to a species tree (in black). This is the root of the species tree and this is the root of the gene tree. Three types of gene-affecting events are shown... For instance ... Losses are important, but in some cases, as for orthology analyses, they are not needed. For this reason we use reconciliations to show the evolution of a gene tree.

8 Why? Ohno, S. 1970, Evolution by gene duplication, Springer,
NEOFUNCTIONALIZATION SUBFUNCTIONALIZATION ADAPTIVE EVOLUTION

9 Fitch’s Definition of Orthology (Fitch 1970)
[Figure: gene tree over mouse, chicken and frog, with speciation and duplication vertices marked]
Orthologs: LCA is a speciation; more likely to share function
Paralogs: LCA is a duplication; more likely to have different function
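Fitch's definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `event` labels and the toy tree are hypothetical and do not come from the slides. Each internal vertex of the gene tree is assumed to be already labelled as a speciation or a duplication.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    event: str = "leaf"  # "speciation", "duplication", or "leaf" (assumed labels)
    children: list = field(default_factory=list)


def leaves(node):
    """Set of leaf names below (and including) node."""
    if not node.children:
        return {node.name}
    out = set()
    for child in node.children:
        out |= leaves(child)
    return out


def lca(root, a, b):
    """Deepest vertex whose leaf set contains both a and b (a != b assumed)."""
    node = root
    while True:
        for child in node.children:
            below = leaves(child)
            if a in below and b in below:
                node = child
                break
        else:
            return node


def relation(root, a, b):
    """Orthologs if the LCA is a speciation, paralogs if it is a duplication."""
    return "orthologs" if lca(root, a, b).event == "speciation" else "paralogs"


# Toy gene tree: a duplication at the root, each copy then split by a speciation.
toy_tree = Node("d", "duplication", [
    Node("s1", "speciation", [Node("mouse_1"), Node("frog_1")]),
    Node("s2", "speciation", [Node("mouse_2"), Node("chicken_2")]),
])
```

In this toy tree, `relation(toy_tree, "mouse_1", "frog_1")` looks at a speciation vertex, while `relation(toy_tree, "mouse_1", "mouse_2")` looks at the root duplication.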

10 Why? INCONGRUENCE! (Mushegian et al. 1998, Genome Res)
Biological causes: duplication-loss, lateral gene transfer, allele sorting
Methodological causes: systematic, stochastic

11 Reconciling given species and gene tree
[Figure: species tree and gene tree, each with leaves A to H]

12 Reconciliation ⇒ Reconciled tree (G, γ)
Gene tree vertex on species tree edge is a duplication
Gene tree vertex on species tree vertex is a speciation

13 When?
Parsimony Reconciliation (MPR): Goodman et al. 1979; Page and co-workers, several papers; Guigo 1996; Hallett & Lagergren 2000

14 Most parsimonious reconciliation (Goodman et al. 1979)
[Figure: species tree and gene tree over A, B, C, with speciation and duplication vertices marked]
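A parsimony reconciliation is commonly computed with the LCA mapping: each gene tree vertex is mapped to the lowest species tree vertex that contains all species found below it, and a vertex is called a duplication when it maps to the same species vertex as one of its children. The sketch below assumes that formulation; the nested-tuple tree encoding and the function names are hypothetical choices made for illustration, not taken from the slides.

```python
def clades(tree):
    """Return (leaf set of tree, leaf sets of every vertex in tree).

    Trees are nested tuples of species names, e.g. (("A", "B"), "C").
    """
    if isinstance(tree, str):
        single = frozenset([tree])
        return single, [single]
    here, collected = frozenset(), []
    for child in tree:
        child_set, child_clades = clades(child)
        here = here | child_set
        collected += child_clades
    return here, collected + [here]


def mpr_events(gene_tree, species_tree):
    """Label internal gene tree vertices via the LCA mapping (post-order).

    Gene tree leaves are labelled with the species they were sampled from.
    Returns a list of (species clade mapped to, event) pairs.
    """
    _, species_clades = clades(species_tree)

    def lca(species):
        # Smallest species clade containing all the given species.
        return min((c for c in species_clades if species <= c), key=len)

    events = []

    def walk(vertex):
        if isinstance(vertex, str):
            return lca(frozenset([vertex]))
        child_maps = [walk(child) for child in vertex]
        mapped = lca(frozenset().union(*child_maps))
        kind = "duplication" if mapped in child_maps else "speciation"
        events.append((sorted(mapped), kind))
        return mapped

    walk(gene_tree)
    return events


species = (("A", "B"), "C")
congruent = mpr_events((("A", "B"), "C"), species)
incongruent = mpr_events((("A", "B"), ("A", "C")), species)
```

A gene tree identical to the species tree needs no duplications, whereas the second gene tree forces one duplication at its root.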

21 When?
Parsimony Reconciliation (MPR): Goodman et al. 1979; Page and co-workers, several papers; Guigo 1996; Hallett & Lagergren 2000
Bootstrapped MPR: Storm & Sonnhammer 2002; Zmasek & Eddy 2002

22 Why only MPR?
[Figure: reconciliation with speciation and duplication vertices marked]

24 Why only MPR? → Probabilistic reconciliation
Intuition: Depending on gene family, we might believe more in some reconciliations than in others
→ Probabilistic reconciliation

25 When?
Parsimony Reconciliation (MPR): Goodman et al. 1979; Page and co-workers, several papers; Guigo 1996; Hallett & Lagergren 2000
Bootstrapped MPR: Storm & Sonnhammer 2002; Zmasek & Eddy 2002
Probabilistic models: full reconciliation model, Arvestad et al. 2003, 2004; gene copy number, Hahn 2005, Csürös & Miklós 2006

26 The Gene Evolution Model
How?

27 Generation vs reconstruction – the Birth-Death model
Generation of data from model
Example: BD(λ, μ) → tree T
Repeated generation → distribution
Statistical tests, e.g., ’parametric bootstrapping’
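The idea of generating data from the model and repeating the generation to obtain a distribution can be sketched as follows. This is a simplified illustration, not the PrIME implementation: it tracks only the number of lineages alive at time t in a linear birth-death process rather than the full tree T, and the function names and parameter values are assumptions.

```python
import random


def simulate_bd_count(lam, mu, t, rng, n0=1):
    """Number of lineages alive at time t in a linear birth-death process.

    lam is the per-lineage birth rate, mu the per-lineage death rate
    (lam + mu > 0 assumed). Waiting times are exponential with total
    rate n * (lam + mu); each event is a birth with probability
    lam / (lam + mu), otherwise a death.
    """
    n, now = n0, 0.0
    while n > 0:
        now += rng.expovariate(n * (lam + mu))
        if now > t:
            break
        n += 1 if rng.random() < lam / (lam + mu) else -1
    return n


def count_distribution(lam, mu, t, reps, seed=0):
    """Empirical distribution of lineage counts over repeated generation."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(reps):
        n = simulate_bd_count(lam, mu, t, rng)
        counts[n] = counts.get(n, 0) + 1
    return {n: c / reps for n, c in sorted(counts.items())}
```

An empirical distribution of this kind is what a parametric bootstrap compares an observed statistic against.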

28 Birth-death process gives trees

35 Generation vs reconstruction – the birth-death model
Generation of data from model
Example: BD(λ, μ) → tree T
Repeated generation → distribution
Statistical tests, e.g., ’parametric bootstrapping’
Reconstruction – probability of given data
Example: given T → Pr[T | λ, μ] under BD
Compare different trees → ’reconstruction’

36 Extinct lineages
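Lineages that go extinct leave no observed descendants, so a reconstruction has to account for them. One ingredient is the standard closed-form probability that a single lineage of a linear birth-death process has no descendants after time t. The sketch below evaluates that formula; whether and how the slides' algorithm uses it is not stated in the source, and the function name is hypothetical.

```python
import math


def extinction_probability(lam, mu, t):
    """Probability that one lineage has no descendants alive after time t.

    Linear birth-death process with birth rate lam and death rate mu.
    For lam == mu the limiting form lam*t / (1 + lam*t) is used.
    """
    if math.isclose(lam, mu):
        return lam * t / (1.0 + lam * t)
    growth = math.exp((lam - mu) * t)
    return mu * (growth - 1.0) / (lam * growth - mu)
```

For lam greater than mu the value approaches mu / lam as t grows; for lam below mu it approaches 1.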

38 Isomorphisms

45 Gene evolution model
Gene-affecting events: losses, duplications, and speciations
A gene tree evolves inside the species tree
A gene starts at time t before the root
Along an edge: linear birth-death process
At a speciation, each gene lineage splits into two
Finally, losses are pruned
Biologically sound (Nei et al., 1997)
Notes: The model contains three gene-affecting events, namely the ones we have seen before: duplications, speciations, and losses. In the model, a gene tree evolves inside a species tree according to the following rules: Start. Edge, canonical. Speciation, reaches, (leaf). Finally, losses are pruned, so a reconciliation is obtained. Nei et al. have considered a similar model, without formalizing it mathematically, and concluded that it is biologically sound. Mathematically sound, almost canonical, and biologically sound.
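The generation rules above can be sketched as a small simulation. This is a simplified illustration under assumptions, not the PrIME code: it records only how many gene copies reach each species tree leaf instead of building the gene tree, and the `(name, edge_length, children)` encoding, the function names and the example tree are hypothetical. The edge length of the root stands for the time t before the root at which the gene starts.

```python
import random


def bd_survivors(n, lam, mu, t, rng):
    """Lineages alive after time t, starting from n, linear birth-death."""
    now = 0.0
    while n > 0 and lam + mu > 0:
        now += rng.expovariate(n * (lam + mu))
        if now > t:
            break
        n += 1 if rng.random() < lam / (lam + mu) else -1
    return n


def gene_counts(species, lam, mu, rng, n=1, out=None):
    """Simulate gene copy numbers at the leaves of a species tree.

    species = (name, edge_length, [children]). Along each edge the gene
    lineages follow a birth-death process (duplications and losses); at a
    speciation every surviving lineage continues into each child edge.
    """
    out = {} if out is None else out
    name, length, children = species
    n = bd_survivors(n, lam, mu, length, rng)
    if not children:
        out[name] = n
    for child in children:
        gene_counts(child, lam, mu, rng, n, out)
    return out


example_species_tree = ("root", 0.5, [
    ("AB", 1.0, [("A", 1.0, []), ("B", 1.0, [])]),
    ("C", 2.0, []),
])
```

Calling `gene_counts(example_species_tree, 0.3, 0.2, random.Random(1))` returns one copy number per leaf; lost lineages simply do not contribute, which mirrors the pruning step.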

46 Generating a scenario
Events: Speciation, Duplication, Loss

53 Reconciliation – losses have been pruned
Reconciled tree (G, γ)
Notes: So the end result of this process is a gene tree and a reconciliation of it and the species tree. Probabilities: generation probability; posterior probability of a reconciliation given a gene tree.

54 Reconstruction
Pr[G, γ | S, λ, μ, t] is non-trivial: BD over edges, ghosts, isomorphisms
Sum over scenarios
No. of scenarios is exponential in tree size
Dynamic programming (DP): efficient algorithm

55 GEM
Generation relatively simple
Probability of reconciled tree:
Probability of gene tree:
Posterior of reconciliation:
Max likelihood reconciliation:
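The formulas for these four quantities did not survive in the transcript. As a hedged reading, using the notation Pr[G, γ | S, λ, μ, t] from the Reconstruction slide, they are related by marginalisation over reconciliations and Bayes' rule. The following is a sketch of those standard relations, not a reproduction of the slide's equations; the symbol \hat{\gamma} is introduced here for illustration.

```latex
\begin{align}
\Pr[G \mid S,\lambda,\mu,t]
  &= \sum_{\gamma} \Pr[G,\gamma \mid S,\lambda,\mu,t] \\
\Pr[\gamma \mid G,S,\lambda,\mu,t]
  &= \frac{\Pr[G,\gamma \mid S,\lambda,\mu,t]}{\Pr[G \mid S,\lambda,\mu,t]} \\
\hat{\gamma}
  &= \arg\max_{\gamma} \Pr[G,\gamma \mid S,\lambda,\mu,t]
\end{align}
```

Because the denominator of the posterior does not depend on γ, the reconciliation that maximises the joint probability also maximises the posterior.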

56 Example application: Probabilistic orthology analysis
Sennblad, Lagergren submitted

57 MHC example: MPR orthology
Most parsimonious reconciled trees (MPR) for the MHC data set. (a) MPR for the full four-primate data set. (b) MPR for the reduced data set, simulating that the human genome has not been sampled. The species tree is scaled to the time scale on the left, which is given in millions of years ago. Gene annotation follows [Nei1997]. The blue integers in (b) indicate potential speciation vertices; in (a) the blue zero indicates the vertex corresponding to vertex zero in (b) (i.e., the least common ancestor of the Pongo, Gorilla and all Saguinus genes except Saguinus-3). All illustrations of reconciled trees were made with primetv [Sennblad2007a].

58 MHC example: MPR orthology
1(b): 28 orthology predictions

59 Three other reconciliations
Out of 210 in total

60 Reconciliation probabilities
Reconciliations shown: 2, 1(b)

61 Posterior & posterior mean
Modification of DP for Pr[G | S, λ, μ]
MCMC

62 Speciation probabilities

63 ABCA

64 Speciation probabilities

65 MHC birth-death parameter posterior

66 When is MP-reconciliation incorrect?
1000 (G,) per square Histogram from simulations study of the frequency of cases where \mpr{} makes false orthology predictions. For combinations of birth and death rates 1000 random reconciled trees were generated. The bars indicate the frequency of cases for which \mpr{} make false orthology predictions. 66

67 Performance of probabilistic orthology analysis
Draw from posterior distribution: biological realism
Generate synthetic (G, γ): speciations are positives
Analyze: classify genes as duplication/speciation based on probability and threshold
ROC curves
Sensitivity = TP/(TP+FN)
Specificity = TN/(TN+FP)
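The sensitivity and specificity definitions above can be turned into ROC points with a short function. This is a minimal sketch: the function name, the input layout (one speciation probability and one true label per vertex) and the example numbers are made up for illustration and are not results from the study.

```python
def roc_points(probs, is_speciation, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold.

    A vertex is classified as a speciation when its probability is at
    least the threshold; speciations are the positives.
    """
    pairs = list(zip(probs, is_speciation))
    if not any(y for _, y in pairs) or all(y for _, y in pairs):
        raise ValueError("need at least one speciation and one duplication")
    points = []
    for th in thresholds:
        tp = sum(1 for p, y in pairs if p >= th and y)
        fn = sum(1 for p, y in pairs if p < th and y)
        tn = sum(1 for p, y in pairs if p < th and not y)
        fp = sum(1 for p, y in pairs if p >= th and not y)
        points.append((th, tp / (tp + fn), tn / (tn + fp)))
    return points


# Made-up example: four vertices, three true speciations, one duplication.
example_points = roc_points(
    probs=[0.9, 0.8, 0.4, 0.3],
    is_speciation=[True, True, False, True],
    thresholds=[0.0, 0.5],
)
```

Sweeping the threshold from 0 to 1 traces the curve; plotting sensitivity against specificity matches the axes used on the ROC slides.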

68 ROC for MHC-like data
Speciation? y-axis: sensitivity, x-axis: specificity

69 ROC for ABCA-like data
Speciation? y-axis: sensitivity, x-axis: specificity

70 Programs primeGEM; xprimeGEM xprimeGEM-max xprimeGEM-enum
Probability of gene tree: Orthology analysis: xprimeGEM Probability of reconciled tree Posterior of reconciliation: xprimeGEM-max Max likelihood reconciliation: xprimeGEM-enum Enumerates all γ with Prob.

72 ’Comparison’
Coalescence -- allele evolution
Alleles evolve by random sampling from parental population
Underlying pure birth process
Gene duplication and loss -- gene family evolution
Gene copies are created by duplication events
Gene copies are lost by loss events
Underlying birth-death process

73 Later similar work -- gene copy number

74 Later similar work -- coalescence

