Published by Gabriella Bilberry. Modified over 9 years ago.
1
The Coalescent Theory and Coalescent-Based Population Genetics Programs
2
Overview
  Set up IMa run
  The theory
  Influence
  Computer programs
  IMa tutorial
3
Set up IMa Run
  Download data file from Wiki
  Open terminal
  Type command:
    ima -i IMaEliurus -o IMaEliurus.out -q1 10 -q2 20 -m1 1 -m2 1 -n 10 -t 80 -b 100000 -L0.5 -p 45
  The values for the q’s, t, and m’s can be varied
5
COALESCENT THEORY
  Formalized in 1982 by Kingman in “The Coalescent”
  Main idea: a retrospective model of population genetics
  Dependent on ancestral population size and time since divergence
11
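COALESCENT THEORY
  Terms:
    Coalescence: two lineages tracing back to a common ancestor at a particular time
    Effective population size (N_e): size of the idealized Wright-Fisher population that would experience the same amount of genetic drift; usually smaller than the census size
    Theta, Θ: capacity of a population to maintain genetic variability (Θ = 4N_eμ)
    Incomplete lineage sorting: failure of lineages from one population to coalesce before (going back in time) the preceding divergence event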
13
Wright-Fisher Model
  Describes genetic drift in a finite population
  Assumptions:
    N diploid organisms
    Monoecious reproduction with an infinite number of gametes
    Non-overlapping generations
    Random mating
    No mutation
    No selection
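The drift described by these assumptions can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `wright_fisher` and its parameters are hypothetical, and the model tracks a single biallelic locus with 2N gene copies, each drawn at random from the previous generation.

```python
import random


def wright_fisher(n_diploid, p0, generations, seed=None):
    """Simulate the frequency of one allele under pure drift.

    Assumes the Wright-Fisher conditions listed on the slide: N diploid
    individuals (2N gene copies), non-overlapping generations, random
    mating, no mutation and no selection.
    """
    rng = random.Random(seed)
    copies = 2 * n_diploid
    freq = p0
    trajectory = [freq]
    for _ in range(generations):
        # Every gene copy in the next generation picks its parent copy at
        # random, so the new allele count is a binomial draw with
        # success probability equal to the current frequency.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    traj = wright_fisher(n_diploid=50, p0=0.5, generations=200, seed=1)
    print(f"start={traj[0]:.2f} end={traj[-1]:.2f}")
```

Frequencies 0 and 1 are absorbing: once an allele is lost or fixed, drift alone cannot bring variation back, which is why the model needs no mutation term to show loss of diversity.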
16
Incomplete Lineage Sorting
  Degnan & Salter (2005)
17
COALESCENT THEORY
  Gives the mathematical expectation of the distribution of time back to coalescence
  Seeks to predict the amount of time elapsed between the introduction of a mutation and the arising of a particular allele/gene distribution in the population
18
[Figure sequence, slides 18 to 26: diagram with a time axis labeled Present and Past]
27
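Mathematical Representation
  $\Theta = 4 N_e \mu$
  $P(\text{coalescent event}) = \dfrac{1}{2N_e}$ (per generation, for a given pair of lineages)
  $P_c(t) = \left(1 - \dfrac{1}{2N_e}\right)^{t-1} \dfrac{1}{2N_e}$ (probability that the pair first coalesces $t$ generations back)
  $E(t_k) = \dfrac{2}{k(k-1)}$ (expected time spent with $k$ lineages, in units of $2N_e$ generations)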
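The formulas on this slide can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, E(t_k) is taken to be in units of 2N_e generations, and the pairwise waiting time is simulated generation by generation with coalescence probability 1/(2N_e).

```python
import random


def expected_coalescence_time(k):
    """E(t_k) = 2 / (k (k - 1)), in units of 2 * N_e generations."""
    if k < 2:
        raise ValueError("need at least two lineages")
    return 2.0 / (k * (k - 1))


def expected_tmrca(n):
    """Expected time to the common ancestor of n lineages.

    Sums E(t_k) for k = n down to 2; the sum telescopes to 2 * (1 - 1/n).
    """
    return sum(expected_coalescence_time(k) for k in range(2, n + 1))


def pair_coalescence_prob(t, n_e):
    """P_c(t): a pair of lineages first coalesces exactly t generations back."""
    p = 1.0 / (2 * n_e)
    return (1.0 - p) ** (t - 1) * p


def simulate_pair_time(n_e, rng):
    """Draw one coalescence time (in generations) for a pair of lineages."""
    p = 1.0 / (2 * n_e)
    t = 1
    # Each generation back, the pair fails to coalesce with probability 1 - p.
    while rng.random() >= p:
        t += 1
    return t


if __name__ == "__main__":
    rng = random.Random(0)
    n_e = 10
    draws = [simulate_pair_time(n_e, rng) for _ in range(20000)]
    print("simulated mean (generations):", sum(draws) / len(draws))
    print("expected mean (generations): ", 2 * n_e)
    print("expected TMRCA for 10 lineages (2N_e units):", expected_tmrca(10))
```

Because the waiting time for a pair is geometric with success probability 1/(2N_e), its mean is 2N_e generations, which matches E(t_2) = 1 in coalescent units.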
29
Influence
  Population Genetics
  Phylogenetics
  Statistical Phylogeography
30
Population Genetics
  Theory describes the genealogical relationships among individuals in a Wright-Fisher population
31
Phylogenetics
  Gene tree vs. species tree
  Theory predicts a certain distribution of gene tree frequencies
32
Statistical Phylogeography
  Individual gene trees contain information about past demographic events when the rate of coalescence is different between …
34
Computer Programs (Kuhner, 2008)
  BEAST
  GENETREE
  LAMARC
  MIGRATE-N
  IM/IMa
  IMa2
43
Computer Programs: Coalescent Simulators
  Approximate Bayesian Computation
    DIY-ABC
    PopABC
  Simulation (using “pipeline” approach)
    GENOME
    ms
    COAL
    CoaSim
48
Introduction
  MCMC simulation of gene genealogies
  IM simulates model parameters
  Hey, J. (2006)
49
Introduction cont’d
  Assumptions:
    No other populations are more closely related to the sampled populations than they are to each other
    Selective neutrality
    No recombination within loci
    Free recombination between loci
    Mutation model chosen is correct:
      Infinite sites
      Hasegawa-Kishino-Yano
      Stepwise
      Compound locus
52
Input File
  Example data for IM
  # im test data
  population1 population2
  3
  locus1 1 1 13 I 1 0.0000000008 (0.0000000001, 0.0000000015)
  pop1_1 ACTACTGTCATGA
  pop2_1 AGTACTATCACGA
  hapstrexample 2 1 4 J2 0.75
  pop1_1 13 34 GTAC
  pop1_2 12 35 GTAT
  pop2_1 12 37 GTAT
  strexample 2 2 1 S1 1 0.00001 (0.000001, 0.00005)
  strpop11a 23
  strpop11b 26
  strpop21a 25
  strpop21b 31
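The layout of the example file can be made explicit by reading it programmatically. The Python sketch below is an assumption-laden illustration, not IM's own parser: the field meanings (locus name, sample size in population 1, sample size in population 2, length, mutation model, inheritance scalar, then optional mutation rate and range) are inferred from the example, the function name `parse_im_input` is hypothetical, and fields are split on whitespace rather than on fixed-width columns.

```python
EXAMPLE = """\
Example data for IM
# im test data
population1 population2
3
locus1 1 1 13 I 1 0.0000000008 (0.0000000001, 0.0000000015)
pop1_1 ACTACTGTCATGA
pop2_1 AGTACTATCACGA
hapstrexample 2 1 4 J2 0.75
pop1_1 13 34 GTAC
pop1_2 12 35 GTAT
pop2_1 12 37 GTAT
strexample 2 2 1 S1 1 0.00001 (0.000001, 0.00005)
strpop11a 23
strpop11b 26
strpop21a 25
strpop21b 31
"""


def parse_im_input(text):
    """Split an IM-style input file into a title, population names and loci."""
    lines = [ln.strip() for ln in text.splitlines()]
    # Drop blank lines and '#' comment lines (assumed comment convention).
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    title = lines[0]
    populations = lines[1].split()
    n_loci = int(lines[2])
    loci = []
    pos = 3
    for _ in range(n_loci):
        fields = lines[pos].split()
        n1, n2 = int(fields[1]), int(fields[2])
        # The n1 + n2 lines after the header hold one sampled gene copy each.
        samples = [lines[pos + 1 + i].split() for i in range(n1 + n2)]
        loci.append({
            "name": fields[0],
            "n1": n1,
            "n2": n2,
            "length": int(fields[3]),
            "model": fields[4],
            "inheritance_scalar": float(fields[5]),
            "rate_fields": fields[6:],  # left as raw strings; optional
            "samples": samples,
        })
        pos += 1 + n1 + n2
    return {"title": title, "populations": populations, "loci": loci}


if __name__ == "__main__":
    parsed = parse_im_input(EXAMPLE)
    for locus in parsed["loci"]:
        print(locus["name"], locus["model"], len(locus["samples"]), "samples")
```

Reading the header counts first and then consuming exactly n1 + n2 data lines is what makes the flattened slide text recoverable: each locus block is self-delimiting.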
53
Command Line (terminal)
  Command line:
    ima -i IMaEliurus -o IMaEliurus.out -q1 10 -q2 20 -m1 1 -m2 1 -n 10 -t 80 -b 100000 -L10000 -p 45
54
Command Line (terminal)
  Command line:
    ima -i IMaEliurus -o IMaEliurus.out -q1 10 -q2 20 -m1 1 -m2 1 -n 10 -t 80 -b 100000 -L100000 -p 45
  More complex run line:
    ima -i IMaEliurus -o IMaEliurus.out -q1 10 -q2 10 -qA 300 -m 12 -m 23 -t 80 -n 20 -b 100000 -L 0.5 -fl -g1 0.01 -p 45
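Commands copied out of slide software often pick up typographic dashes, which a shell passes along as ordinary characters instead of option prefixes. The Python sketch below shows one way to sanitize a pasted command before running it. It is a hedged illustration: `normalize_command` is a hypothetical helper, the sample string is the first run line with en dashes written as `\u2013` escapes, and no claim is made here about what each IMa option means.

```python
import shlex

# Typographic characters that word processors substitute for "-".
TYPOGRAPHIC_DASHES = {"\u2013": "-", "\u2014": "-", "\u2212": "-"}

# The first run line as it might be pasted from a slide.
SLIDE_COMMAND = (
    "ima -i IMaEliurus -o IMaEliurus.out -q1 10 \u2013q2 20 -m1 1 -m2 1 "
    "\u2013n 10 -t 80 -b 100000 \u2013L100000 \u2013p 45"
)


def normalize_command(pasted):
    """Replace typographic dashes with ASCII hyphens and split into argv."""
    for bad, good in TYPOGRAPHIC_DASHES.items():
        pasted = pasted.replace(bad, good)
    return shlex.split(pasted)


if __name__ == "__main__":
    argv = normalize_command(SLIDE_COMMAND)
    print(shlex.join(argv))
    # To launch the run (assumes an `ima` executable is on PATH):
    #     import subprocess
    #     subprocess.run(argv, check=True)
```

Passing the argument list to `subprocess.run` rather than a single string avoids a second round of shell quoting.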
55
Important Note!
  To keep a run going indefinitely (or until it crashes or DSCR kicks the job), you need a file named “IMrun” that contains only the word “yes”
56
Output File (.out)
  MCMC information:
    Summary
    Acceptance rates
    Autocorrelation
    ESS
    Chain swapping
57
Output File (.out)
  Marginal peak
  Marginal distributions:
    Minbin
    Maxbin
    HiPt
    HiSmth
    Mean
    95lo/hi
    HPD90lo/hi
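Summaries of this kind can be reproduced from any list of posterior samples. The Python sketch below is a generic illustration and not IMa's own code: it assumes that HiPt is the midpoint of the histogram bin with the most samples, that 95lo/hi are the 2.5% and 97.5% points, and that HPD90lo/hi bound the shortest interval holding 90% of the samples. All function names are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.90):
    """Shortest interval that contains the given fraction of the samples."""
    xs = sorted(samples)
    n = len(xs)
    window = max(1, math.ceil(mass * n))
    # Slide a window of fixed sample count and keep the narrowest one.
    best = min(range(n - window + 1), key=lambda i: xs[i + window - 1] - xs[i])
    return xs[best], xs[best + window - 1]


def percentile_interval(samples, lo=0.025, hi=0.975):
    """Equal-tailed interval from sorted samples (no interpolation)."""
    xs = sorted(samples)
    n = len(xs)
    return xs[int(lo * (n - 1))], xs[int(hi * (n - 1))]


def histogram_peak(samples, n_bins=50):
    """Midpoint of the histogram bin that holds the most samples."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    counts = [0] * n_bins
    for x in samples:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    peak = max(range(n_bins), key=counts.__getitem__)
    return lo + (peak + 0.5) * width


if __name__ == "__main__":
    import random
    rng = random.Random(3)
    draws = [rng.gammavariate(2.0, 1.0) for _ in range(5000)]
    print("mean      :", sum(draws) / len(draws))
    print("peak (bin):", histogram_peak(draws))
    print("95% lo/hi :", percentile_interval(draws))
    print("HPD90     :", hpd_interval(draws, 0.90))
```

For a skewed marginal distribution the HPD interval sits closer to the peak than the equal-tailed interval does, which is why both are worth reporting.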
58
Output File (.out)
  ASCII curves
  Plots
59
Output File (.out.ti)
  No outward information
  Can be used on subsequent runs when in “L mode”
60
How can I get a “good” run?
  Conduct a preliminary run
  Duration? Ideally, until the run reaches stationarity and convergence
  Assess autocorrelation
  Use Metropolis-coupled MCMC
  Run many, many times (well, at least 3)
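Autocorrelation and effective sample size (ESS) can be estimated from any recorded parameter trace. The Python sketch below uses a common generic estimator (sum the autocorrelations up to the first non-positive lag); it is an assumption that this resembles what a given program reports, and the function names are hypothetical.

```python
def autocorrelation(trace, lag):
    """Sample autocorrelation of an MCMC trace at the given lag."""
    n = len(trace)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(trace) - 1")
    mean = sum(trace) / n
    var = sum((x - mean) ** 2 for x in trace)
    if var == 0:
        raise ValueError("trace has zero variance")
    cov = sum((trace[i] - mean) * (trace[i + lag] - mean) for i in range(n - lag))
    return cov / var


def effective_sample_size(trace, max_lag=None):
    """ESS = n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first non-positive autocorrelation,
    a simple rule that keeps noise at long lags out of the estimate.
    """
    n = len(trace)
    if max_lag is None:
        max_lag = n // 2
    total = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorrelation(trace, lag)
        if rho <= 0:
            break
        total += rho
    return n / (1.0 + 2.0 * total)


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    # A strongly autocorrelated chain: each value depends on the last.
    chain = [0.0]
    for _ in range(1999):
        chain.append(0.9 * chain[-1] + rng.gauss(0, 1))
    print("lag-1 autocorrelation:", round(autocorrelation(chain, 1), 3))
    print("ESS of 2000 draws    :", round(effective_sample_size(chain), 1))
```

A low ESS relative to the number of recorded steps suggests the run is too short or mixing poorly, so comparing this figure across the repeated runs is one practical check.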
61
Robustness of the Coalescent
  Possible violations of assumptions:
    Intralocus recombination
    Population structure
    Gene flow from unsampled populations
    Linkage among loci
    Divergent selection
    Different model of substitution
63
Questions?