Maximum-likelihood estimation of admixture proportions from genetic data Jinliang Wang.


2 Maximum-likelihood estimation of admixture proportions from genetic data Jinliang Wang

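3 Model diagram (labels): populations P_0, P_1, P_2, P_h; sizes n_1, n_2 and N_1, N_h, N_2; admixture proportions p_1, p_2; samples S_1, S_h, S_2; times ξ, ψ. Scaled drift times: t_1 = ξ/(2n_1), t_2 = ξ/(2n_2), T_1 = ψ/(2N_1), T_h = ψ/(2N_h), T_2 = ψ/(2N_2). Parameters: Ω = {p_1, t_1, t_2, T_1, T_h, T_2}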

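4 Model diagram (labels): populations P_0, P_1, P_2, P_h; sizes n_1, n_2 and N_1, N_h, N_2; admixture proportions p_1, p_2; samples S_1, S_h, S_2; times ξ, ψ. Scaled drift times: t_1 = ξ/(2n_1), t_2 = ξ/(2n_2), T_1 = ψ/(2N_1), T_h = ψ/(2N_h), T_2 = ψ/(2N_2). Parameters: Ω = {p_1, t_1, t_2, T_1, T_h, T_2}. Allele frequencies: w; x_1, x_h, x_2; y_1, y_h, y_2. Sample labels: c_1, c_h, c_2. Data: C = (c_1, c_2, c_3)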
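
The scaled drift times on this slide follow directly from their definitions. A minimal sketch in Python, assuming ξ and ψ are measured in generations and n_i, N_i are effective population sizes (the slides do not state the units). The function name `scaled_time` and every numeric value are hypothetical, chosen only for illustration:

```python
def scaled_time(generations, effective_size):
    """Drift time in units of 2N generations, e.g. t_1 = xi / (2 * n_1)."""
    if effective_size <= 0:
        raise ValueError("effective size must be positive")
    return generations / (2.0 * effective_size)


# Hypothetical values, not taken from the presentation.
xi, psi = 1000, 20            # generations: split to admixture, admixture to present
n1, n2 = 5000, 2500           # sizes between the split and the admixture event
N1, Nh, N2 = 10000, 500, 10000  # sizes since the admixture event

# The parameter set Omega = {p_1, t_1, t_2, T_1, T_h, T_2}.
omega = {
    "p1": 0.8,                # hypothetical admixture proportion
    "t1": scaled_time(xi, n1),
    "t2": scaled_time(xi, n2),
    "T1": scaled_time(psi, N1),
    "Th": scaled_time(psi, Nh),
    "T2": scaled_time(psi, N2),
}
```

Working with the scaled times rather than with generations and sizes separately reflects that only the ratios appear in Ω.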

5 Likelihood function

6 Likelihood function terms: random sampling; admixture and genetic drift; genetic drift; prior on w

7 Allele frequencies in P_0: w

8 Genetic drift after population split. Labels: P_0, P_1, P_2; n_1, n_2; ξ; w; x_1, x_2. t_1 = ξ/(2n_1), t_2 = ξ/(2n_2)

9 Genetic drift in independent populations

10 Genetic drift: the diffusion approximation. t_i = ξ/(2n_i). Crow and Kimura (1970), p. 382
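
As a rough check on why drift can be summarised by the scaled time t_i = ξ/(2n_i): in an idealised Wright-Fisher population under pure drift, expected heterozygosity falls by a factor (1 - 1/(2n)) per generation, which is close to exp(-t) when n is large. The sketch below compares the two. It is not the transition density from Crow and Kimura that the slide cites, and the function names and values are assumptions made for illustration:

```python
import math


def het_decay_exact(generations, n):
    """Fraction of heterozygosity left after discrete Wright-Fisher drift."""
    return (1.0 - 1.0 / (2.0 * n)) ** generations


def het_decay_diffusion(generations, n):
    """Same quantity on the diffusion time scale, t = generations / (2n)."""
    t = generations / (2.0 * n)
    return math.exp(-t)


def drift_variance(x, generations, n):
    """Approximate variance of an allele frequency that started at x."""
    return x * (1.0 - x) * (1.0 - het_decay_diffusion(generations, n))


# Hypothetical example: 200 generations at size 1000, so t = 0.1.
exact = het_decay_exact(200, 1000)
approx = het_decay_diffusion(200, 1000)
gap = abs(exact - approx)  # small when n is large
```

The two curves depend on generations and size almost entirely through t, which is why the model can be parameterised by t_1 and t_2 alone.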

11 The admixture event. Labels: P_0, P_1, P_2, P_h; p_1, p_2; x_1, x_h, x_2
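
The slide does not spell out how x_h is formed. A minimal sketch, assuming the usual linear mixing rule x_h = p_1 x_1 + p_2 x_2 with p_2 = 1 - p_1, applied allele by allele at one locus. The function name `admix` and the frequencies are hypothetical:

```python
def admix(x1, x2, p1):
    """Allele frequencies in the hybrid population right after admixture.

    x1, x2: allele frequencies at one locus in P_1 and P_2 (same allele order).
    p1: proportion contributed by P_1; P_2 contributes 1 - p1 (assumed).
    """
    if not 0.0 <= p1 <= 1.0:
        raise ValueError("p1 must lie in [0, 1]")
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must list the same alleles")
    p2 = 1.0 - p1
    return [p1 * a + p2 * b for a, b in zip(x1, x2)]


# Hypothetical two-allele locus.
xh = admix([0.9, 0.1], [0.2, 0.8], p1=0.25)
```

Because the mix is a convex combination, x_h sums to one whenever x_1 and x_2 do.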

12 Genetic drift since admixture event. Labels: P_0, P_1, P_2, P_h; N_1, N_h, N_2; ψ; x_1, x_h, x_2; y_1, y_h, y_2. T_1 = ψ/(2N_1), T_h = ψ/(2N_h), T_2 = ψ/(2N_2)

13 Random sampling. Labels: P_h, P_1, P_2; S_h, S_1, S_2; c_1, c_h, c_2; y_1, y_h, y_2. C = (c_1, c_2, c_3)
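
The "random sampling" term can be illustrated with a multinomial log-likelihood. This is a sketch under the assumption that each c is a vector of allele counts drawn at random from a population whose current allele frequencies are y, which is what the slide title suggests but does not state. The function name is hypothetical:

```python
import math


def multinomial_loglik(counts, freqs):
    """Log-probability of allele counts given population allele frequencies."""
    if len(counts) != len(freqs):
        raise ValueError("counts and freqs must have the same length")
    total = sum(counts)
    # Log of the multinomial coefficient, via log-gamma.
    loglik = math.lgamma(total + 1)
    for c, y in zip(counts, freqs):
        loglik -= math.lgamma(c + 1)
        if c > 0:
            if y <= 0.0:
                return float("-inf")  # observed an allele with zero frequency
            loglik += c * math.log(y)
    return loglik


# Hypothetical sample: one copy of each of two equally frequent alleles.
ll = multinomial_loglik([1, 1], [0.5, 0.5])
```

Under the independence assumptions listed in the discussion slides, such terms would be summed over loci and over the three sampled populations.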

14 Likelihood function terms: random sampling; admixture and genetic drift; genetic drift; prior on w

15 African-American Admixture Proportions

16 Profile log-likelihoods for New York: proportion of European ancestry; drift before admixture event; drift since admixture event

17 Application to canid populations: Grey wolf and coyote in North America

18 Diagram labels: common ancestor; grey wolf, coyote, wolf-like hybrid; grey wolf, coyote, coyote-like hybrid

19 Discussion: suitable data; assumptions of the method given the model; comparing the model to other scenarios; aspects of the data used for inference

20 Discussion: suitable data. Human data: genotypes of 10 nuclear loci, chosen because they are either African- or European-specific or highly differentiated between the two. Canid data: 10 microsatellite loci, neither species-specific nor highly differentiated between wolves and coyotes.

21 Discussion: assumptions of the method given the model. Alleles are inherited independently across loci in the admixture event. Drift acts independently on alleles across loci. Alleles in a sampled individual are independent across loci.

22 Discussion: assumptions of the method given the model. The prior distribution on w is flat, not U-shaped. Admixture occurs instantaneously. The effect of mutation in perturbing allele frequencies is negligible.

23 Discussion: comparing the model to other scenarios. Modern ‘pure’ populations need to be sampled, so the ‘structure’ of the population is assumed to be known. If we cannot sample modern ‘pure’ populations, we cannot make inference on the admixture proportions.

24 Discussion: aspects of the data used for inference. Inference proceeds solely on the basis of allele frequencies. Linkage disequilibrium (LD) is, firstly, not used for inference and, secondly, assumed to be negligible. LD might be exploited to enhance inference when modern ‘pure’ populations are sampled, or to relax the necessity to sample modern ‘pure’ populations at all.

