Published by Ezra Wilson. Modified over 6 years ago.
1
PrIME Probabilistic Integrated Models of Evolution
Bengt Sennblad, Dept. Plant and Env. Sciences, GU and Stockholm Bioinformatics Center (SBC)
2
Outline
PrIME models: GEM, SRT, GSR, HGM
GEM: Gene Evolution Model
SRT: Sequence evolution with Rates and Times
GSR: Gene Sequence evolution with iid Rates
HGM: Gene evolution in hybrid networks
3
Outline What? Why? When? How? So?
4
GEM The Gene Evolution Model
Lars Arvestad, KTH; Ann-Charlotte Berggren, UU; Jens Lagergren, KTH; Bengt Sennblad
5
What? Gene (loci) tree evolution
GEM models gene loci (the coalescent models gene alleles). Gene duplication and loss process. Models duplications and losses that are fixed in the population.
6
What? – genetic mechanism
Recombination errors: unequal crossing-over, tandem repeats, segmental duplication
Retrotransposition: mRNA → cDNA → insertion; loss of introns and regulatory regions
Chromosome/genome duplication
7
What? – the process
Gene tree evolves inside species tree. [Figure: gene tree inside species tree; legend: speciation, duplication, loss]
This is a scenario that shows how a gene tree (blue) has evolved with respect to a species tree (black). This is the root of the species tree, and this is the root of the gene tree. Three types of gene-affecting events are shown... For instance... Losses are important, but in some cases, as for orthology analyses, they are not needed. For this reason we use reconciliations to show the evolution of a gene tree.
8
Why? Ohno, S. 1970. Evolution by Gene Duplication. Springer.
NEOFUNCTIONALIZATION SUBFUNCTIONALIZATION ADAPTIVE EVOLUTION
9
Fitch’s Definition of Orthology
[Figure: gene tree for mouse, chicken, and frog, with speciation and duplication vertices marked] Fitch 1970
Orthologs: LCA is a speciation; more likely to share function
Paralogs: LCA is a duplication; more likely to have different function
10
Why? INCONGRUENCE!
Biological causes: duplication-loss, lateral gene transfer, allele sorting
Methodological causes: systematic, stochastic
Mushegian et al. 1998. Genome Res
11
Reconciling given species and gene tree
[Figure: species tree and gene tree, each with leaves A to H]
12
Reconciliation ⇒ Reconciled tree (G, γ)
A gene tree vertex on a species tree edge is a duplication. A gene tree vertex on a species tree vertex is a speciation.
13
When? Parsimony Reconciliation (MPR)
Goodman et al. 1979; Page and co-workers, several papers; Guigo 1996; Hallett & Lagergren 2000
14
Most parsimonious reconciliation
[Figure: species tree and gene tree with leaves A, B, C; speciation and duplication vertices marked] Goodman et al. 1979
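A most parsimonious reconciliation is commonly computed by mapping each gene tree vertex to the least common ancestor (LCA) of the species found below it. The sketch below illustrates that idea only; the nested-tuple tree encoding, the function names, and the gene names are assumptions made for this example and are not taken from the talk or from the PrIME software.

```python
def leaf_set(tree):
    """Set of leaf names below a vertex; a leaf is a string, an inner vertex a tuple."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_clades(tree):
    """All clades (leaf sets of vertices) of the species tree."""
    clades = [leaf_set(tree)]
    if not isinstance(tree, str):
        for child in tree:
            clades.extend(species_clades(child))
    return clades


def lca(clades, species):
    """Smallest species-tree clade that contains every species in `species`."""
    return min((c for c in clades if species <= c), key=len)


def reconcile(gene_tree, clades, gene_to_species):
    """LCA-map the gene tree; return (clade, event) per inner vertex in post-order."""
    events = []

    def visit(node):
        if isinstance(node, str):
            return lca(clades, frozenset([gene_to_species[node]]))
        child_maps = [visit(child) for child in node]
        here = lca(clades, frozenset().union(*child_maps))
        # A vertex mapped to the same place as one of its children is a duplication.
        label = "duplication" if here in child_maps else "speciation"
        events.append((sorted(here), label))
        return here

    visit(gene_tree)
    return events


# Hypothetical example: species A, B, C and four genes.
species_tree = (("A", "B"), "C")
gene_tree = (("a1", "b1"), ("a2", "c1"))
gene_to_species = {"a1": "A", "b1": "B", "a2": "A", "c1": "C"}
for clade, label in reconcile(gene_tree, species_clades(species_tree), gene_to_species):
    print(clade, label)
```

In this example the gene tree root maps to the same species vertex as its right child, so it is labelled a duplication, while the two inner vertices below it are speciations.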
21
When?
Parsimony reconciliation (MPR): Goodman et al. 1979; Page and co-workers, several papers; Guigo 1996; Hallett & Lagergren 2000
Bootstrapped MPR: Storm & Sonnhammer 2002; Zmasek & Eddy 2002
22
Why only MPR?
[Figure: reconciliation with speciation and duplication vertices marked]
24
Why only MPR? → Probabilistic reconciliation
[Figure: reconciliation with speciation and duplication vertices marked]
Intuition: depending on the gene family, we might believe more in some reconciliations than in others → probabilistic reconciliation
25
When?
Parsimony reconciliation (MPR): Goodman et al. 1979; Page and co-workers, several papers; Guigo 1996; Hallett & Lagergren 2000
Bootstrapped MPR: Storm & Sonnhammer 2002; Zmasek & Eddy 2002
Probabilistic models: full reconciliation model, Arvestad et al. 2003, 2004; gene copy number, Hahn 2005, Csürös & Miklós 2006
26
The Gene Evolution Model
How?
27
Generation vs reconstruction – the Birth-Death model
Generation of data from model
Example: BD(λ, μ) → tree T
Repeated generation → distribution
Statistical tests, e.g., 'parametric bootstrapping'
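Generation from BD(λ, μ) can be illustrated with a small simulation. This is a minimal sketch, assuming a single starting lineage, constant rates with λ + μ > 0, and a nested-tuple tree encoding; the function names are hypothetical and the pruning of extinct lineages is a choice made for this example.

```python
import random


def bd_tree(birth, death, time, rng):
    """Simulate one lineage for `time` units under a linear birth-death process.

    Returns "leaf" for a surviving lineage, a (left, right) tuple for a birth
    with surviving descendants on both sides, or None if the lineage went extinct.
    """
    total = birth + death
    wait = rng.expovariate(total)          # waiting time to the next event
    if wait >= time:
        return "leaf"                      # no event before the present
    if rng.random() < birth / total:       # birth: two independent copies
        left = bd_tree(birth, death, time - wait, rng)
        right = bd_tree(birth, death, time - wait, rng)
        if left is None:
            return right                   # prune the extinct side
        if right is None:
            return left
        return (left, right)
    return None                            # death


def count_leaves(tree):
    """Number of surviving lineages in a simulated tree (0 if extinct)."""
    if tree is None:
        return 0
    if tree == "leaf":
        return 1
    return count_leaves(tree[0]) + count_leaves(tree[1])


# Repeated generation gives an empirical distribution of tree sizes.
rng = random.Random(1)
sizes = [count_leaves(bd_tree(1.0, 0.5, 1.0, rng)) for _ in range(1000)]
print(sum(sizes) / len(sizes))
```

Comparing a statistic of observed data against its distribution over many such simulated trees is the idea behind the parametric bootstrap mentioned on the slide.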
28
Birth-death process gives trees
35
Generation vs reconstruction – the birth-death model
Generation of data from model
Example: BD(λ, μ) → tree T
Repeated generation → distribution
Statistical tests, e.g., 'parametric bootstrapping'
Reconstruction: probability of given data
Example: given T → Pr[T | λ, μ] under BD
Compare different trees → 'reconstruction'
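Reconstruction needs closed-form probabilities under the birth-death process. One standard building block, due to Kendall, is the probability that a single lineage has exactly n descendants after time t. The function below is a sketch of that textbook result, not the talk's own derivation; the function and parameter names are assumptions.

```python
import math


def bd_count_probability(n, birth, death, time):
    """Pr[n lineages at `time` | one lineage at time 0], linear birth-death process."""
    if math.isclose(birth, death):
        # Critical case: extinction probability and geometric parameter coincide.
        p0 = birth * time / (1.0 + birth * time)
        u = p0
    else:
        e = math.exp(-(birth - death) * time)
        denom = birth - death * e
        p0 = death * (1.0 - e) / denom     # probability of extinction by `time`
        u = birth * (1.0 - e) / denom      # geometric parameter for n >= 1
    if n == 0:
        return p0
    return (1.0 - p0) * (1.0 - u) * u ** (n - 1)


# The probabilities sum to one, and the mean equals exp((birth - death) * time).
probs = [bd_count_probability(n, 1.0, 0.5, 1.0) for n in range(500)]
print(sum(probs), sum(n * p for n, p in enumerate(probs)))
```

The n = 0 term is the extinction probability, which is what a reconstruction has to account for when lineages that left no sampled descendants are invisible in the data.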
36
Extinct lineages
38
Isomorphisms
39
Gene evolution model
Gene-affecting events: losses, duplications, and speciations
A gene tree evolves inside the species tree
A gene starts at time t before the root
Along an edge: linear birth-death process
At a speciation, each gene lineage splits into two
Finally, losses are pruned
Biologically sound (Nei et al., 1997)
The model contains three gene-affecting events, namely the ones we have seen before: duplications, speciations, and losses. In the model, a gene tree evolves inside a species tree according to the following rules: Start. Edge, canonical. Speciation, reaches (leaf). Finally, losses are pruned, so a reconciliation is obtained. Nei et al. have considered a similar model, without formalizing it mathematically, and concluded that it is biologically sound. The model is mathematically sound, almost canonical, and biologically sound.
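The generation rules above can be sketched as a short simulation. This is an illustration under stated assumptions, not the PrIME implementation: the species tree encoding `(name, edge_length, children)`, the example tree `SPECIES`, the function names, and the `D@`/`S@` vertex labels are all hypothetical, and the root's edge length plays the role of the time t before the root.

```python
import random

# Hypothetical species tree: (name, edge_length, children); leaves have no children.
SPECIES = ("root", 1.0, [
    ("AB", 0.5, [("A", 0.5, []), ("B", 0.5, [])]),
    ("C", 1.0, []),
])


def join(label, kids):
    """Combine child subtrees, pruning lost lineages (None)."""
    kids = [kid for kid in kids if kid is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return (label, kids[0], kids[1])


def evolve(species, birth, death, rng, remaining=None):
    """Evolve one gene lineage along a species edge; return a pruned gene tree or None."""
    name, length, children = species
    remaining = length if remaining is None else remaining
    wait = rng.expovariate(birth + death)
    if wait < remaining:                              # event on this edge
        if rng.random() < birth / (birth + death):    # duplication
            copies = [evolve(species, birth, death, rng, remaining - wait)
                      for _ in range(2)]
            return join("D@" + name, copies)
        return None                                   # loss
    if not children:
        return name                                   # gene present in an extant species
    # Speciation: the gene lineage splits into one copy per child species.
    return join("S@" + name, [evolve(child, birth, death, rng) for child in children])


def leaves(tree):
    """Species labels at the leaves of a generated gene tree."""
    if tree is None:
        return []
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[1]) + leaves(tree[2])


print(evolve(SPECIES, 0.8, 0.4, random.Random(3)))
```

Because `join` drops lost lineages and collapses vertices left with a single child, the returned tree contains only duplications and speciations with surviving descendants on both sides, which mirrors the "losses are pruned" step.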
46
Generating a scenario
[Figure: gene tree generated inside the species tree; legend: speciation, duplication, loss]
53
Reconciliation – losses have been pruned
Reconciled tree (G, γ)
So the end result of this process is a gene tree and a reconciliation of it and the species tree.
Probabilities: generation probability; posterior probability of a reconciliation given a gene tree
54
Reconstruction
Pr[G, γ | S, λ, μ, t] is non-trivial:
BD over edges
Ghosts
Isomorphisms
Sum over scenarios
The number of scenarios is exponential in tree size
Dynamic programming (DP): efficient algorithm
55
GEM
Generation: relatively simple
Probability of reconciled tree; probability of gene tree; posterior of reconciliation; max likelihood reconciliation [equations not preserved]
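The equations for these quantities are not preserved in this transcript. Under the notation used elsewhere in the talk, with the probability of a reconciled tree written Pr[G, γ | S, λ, μ], they would plausibly take the following standard form. This is a reconstruction from the slide labels, not the author's own formulas, and the time t before the root is suppressed.

```latex
\begin{align*}
\Pr[G \mid S, \lambda, \mu]
  &= \sum_{\gamma} \Pr[G, \gamma \mid S, \lambda, \mu]
  && \text{(probability of gene tree)} \\
\Pr[\gamma \mid G, S, \lambda, \mu]
  &= \frac{\Pr[G, \gamma \mid S, \lambda, \mu]}{\Pr[G \mid S, \lambda, \mu]}
  && \text{(posterior of reconciliation)} \\
\hat{\gamma}
  &= \arg\max_{\gamma} \Pr[G, \gamma \mid S, \lambda, \mu]
  && \text{(max likelihood reconciliation)}
\end{align*}
```

The sum in the first line runs over all reconciliations γ of G with S, which is why an efficient dynamic programming algorithm matters.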
56
Example application: Probabilistic orthology analysis
Sennblad, Lagergren submitted
57
MHC example: MPR orthology
Most parsimonious reconciled trees (MPR) for the MHC data set. (a) MPR for the full four-primate data set. (b) MPR for the reduced data set, simulating that the human genome has not been sampled. The species tree is scaled to the time scale on the left, given in millions of years ago. Gene annotation follows Nei et al. (1997). The blue integers in (b) indicate potential speciation vertices; in (a) the blue zero indicates the vertex corresponding to vertex zero in (b) (i.e., the least common ancestor of the Pongo, Gorilla and all Saguinus genes except Saguinus-3). All illustrations of reconciled trees were made with primetv (Sennblad et al., 2007).
58
MHC example: MPR orthology
1(b): 28 orthology predictions
59
Three other reconciliations
Out of 210 in total
60
Reconciliation probabilities
[Figure panels: 2, 1(b)]
61
Posterior & posterior mean
Modification of DP for Pr[G | S, λ, μ]; MCMC
62
Speciation probabilities
63
ABCA
64
Speciation probabilities
65
MHC birth-death parameter posterior
66
When is MP-reconciliation incorrect?
1000 (G, γ) per square. Histogram from a simulation study of the frequency of cases where MPR makes false orthology predictions. For each combination of birth and death rates, 1000 random reconciled trees were generated. The bars indicate the frequency of cases for which MPR makes false orthology predictions.
67
Performance of probabilistic orthology analysis
Draw from posterior distribution: biological realism
Generate synthetic (G, γ): speciations are positives
Analyze: classify genes as duplication/speciation based on probability and threshold
ROC curves: Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP)
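The thresholding and the two ROC quantities can be sketched in a few lines. The function name, the input lists, and the example numbers are made up for illustration; only the sensitivity and specificity formulas come from the slide, with speciations as the positive class.

```python
def roc_points(probabilities, is_speciation, thresholds):
    """Return one (specificity, sensitivity) pair per threshold.

    A vertex is predicted to be a speciation when its speciation
    probability is at least the threshold; true speciations are positives.
    """
    points = []
    for threshold in thresholds:
        tp = fn = tn = fp = 0
        for p, truth in zip(probabilities, is_speciation):
            predicted = p >= threshold
            if truth and predicted:
                tp += 1
            elif truth:
                fn += 1
            elif predicted:
                fp += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn) if tp + fn else float("nan")
        specificity = tn / (tn + fp) if tn + fp else float("nan")
        points.append((specificity, sensitivity))
    return points


# Hypothetical speciation probabilities and true labels for four vertices.
probs = [0.9, 0.8, 0.4, 0.2]
truth = [True, True, False, True]
print(roc_points(probs, truth, [0.0, 0.5, 1.0]))
```

Sweeping the threshold from 0 to 1 traces the curve; each point trades missed speciations against duplications wrongly called speciations.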
68
ROC for MHC-like data. Speciation? Y: sensitivity, X: specificity
69
ROC for ABCA-like data. Speciation? Y: sensitivity, X: specificity
70
Programs primeGEM; xprimeGEM xprimeGEM-max xprimeGEM-enum
Probability of gene tree: Orthology analysis: xprimeGEM Probability of reconciled tree Posterior of reconciliation: xprimeGEM-max Max likelihood reconciliation: xprimeGEM-enum Enumerates all γ with probabilities.
72
'Comparison'
Coalescence: allele evolution
Alleles evolve by random sampling from parental population
Underlying pure birth process
Gene duplication and loss: gene family evolution
Gene copies are created by duplication events
Gene copies are lost by loss events
Underlying birth-death process
73
Later similar work -- gene copy number
74
Later similar work -- coalescence
75
Later similar work -- coalescence