MStruct: A New Admixture Model for Inference of Population Structure in Light of Both Genetic Admixing and Allele Mutations Suyash Shringarpure and Eric.


1 mStruct: A New Admixture Model for Inference of Population Structure in Light of Both Genetic Admixing and Allele Mutations
Suyash Shringarpure and Eric Xing
School of Computer Science, Carnegie Mellon University
ICML 2008
Presented by Haojun Chen

2 Outline
Background
Structure Model
mStruct Model
Experiment Results
Summary

3 Background
Allele: one member of a pair or series of different forms of a gene.
Population structure analysis aims to shed light on the evolutionary history of modern human populations.
Microsatellite and single nucleotide polymorphism (SNP) data are the basis of population structure analysis.
State-of-the-art method: Structure.

4 Structure Model
x: microsatellite alleles
[symbol missing]: unique set of population-specific multinomial distributions
[symbol missing]: vector of multinomial parameters, a.k.a. allele frequency profile (AP), of the allele distribution at locus i in ancestral population k
[symbol missing]: total number of observed marker alleles at locus i
[symbol missing]: total number of marker loci
[symbol missing]: total number of individuals
[symbol missing]: individual-specific admixing coefficient vector

5 Pitfall of Structure
There is no mutation model for modern individual alleles with respect to common prototypes in the modern populations.
Every unique allele in the modern population is assumed to have a distinct ancestral frequency, rather than allowing for the possibility that it is simply a descendant of some common ancestral allele.

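6 mStruct Model
[symbol missing]: set of ancestral alleles
[symbol missing]: mutation parameter associated with locus
[symbol missing]: frequencies of the ancestral alleles
[symbol missing]: total number of ancestral alleles
Microsatellite mutation model: [equation not recovered]
SNP mutation model: [equation not recovered]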
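The microsatellite mutation model on this slide did not survive extraction. As a rough illustration of the idea only, here is a minimal sketch of one possible kernel: the probability of observing an allele decays geometrically with its repeat-count distance from an ancestral allele. The function name `mutation_distribution`, the parameter name `delta`, and the geometric form itself are assumptions made for illustration, not the formula from the slide or the paper.

```python
def mutation_distribution(ancestral_allele, observed_alleles, delta):
    """Return P(observed allele | ancestral allele) over a finite allele set.

    Assumed kernel (hypothetical, for illustration): the weight of an allele
    is delta ** |allele - ancestral_allele|, normalized over observed_alleles.
    Alleles are integer repeat counts; 0 < delta < 1 controls how quickly
    probability falls off with distance from the ancestral allele.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    weights = [delta ** abs(a - ancestral_allele) for a in observed_alleles]
    total = sum(weights)
    return {a: w / total for a, w in zip(observed_alleles, weights)}


if __name__ == "__main__":
    dist = mutation_distribution(10, [8, 9, 10, 11, 12], delta=0.5)
    for allele, prob in dist.items():
        print(allele, round(prob, 3))
```

A small `delta` keeps almost all the mass on the ancestral allele, and a `delta` near 1 spreads it across neighbouring repeat counts. This is the sense in which a per-locus mutation parameter can act as a noise level on the observed alleles.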

7 Generative Process
Generative process for Structure: [steps not recovered], where [definitions not recovered]
Generative process for mStruct: step 2.2 above is replaced by [step not recovered]
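The steps on this slide were not recovered, so the following is only a sketch of the admixture generative process as it is commonly described for Structure-style (LDA-like) models: draw an admixing vector per individual, then for each allele copy at each locus draw an ancestral population and an allele from that population's frequency profile. The names `alpha`, `beta`, `theta`, and `z`, the Dirichlet prior, and the default ploidy of 2 are assumptions and may not match the slide's notation or step numbering.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_admixed_genotypes(alpha, beta, n_individuals, ploidy=2, seed=0):
    """Sample genotypes from an admixture model (illustrative sketch).

    alpha: Dirichlet parameters, one per ancestral population (assumed prior).
    beta:  beta[k][i] is the allele frequency profile of population k at locus i.
    Returns a list of (theta, genotype) pairs, where genotype[i] is a tuple
    of allele indices, one per allele copy at locus i.
    """
    rng = random.Random(seed)
    n_loci = len(beta[0])
    data = []
    for _ in range(n_individuals):
        theta = sample_dirichlet(alpha, rng)  # individual admixing vector
        genotype = []
        for i in range(n_loci):
            copies = []
            for _ in range(ploidy):
                z = sample_categorical(theta, rng)       # ancestral population
                x = sample_categorical(beta[z][i], rng)  # allele from its profile
                copies.append(x)
            genotype.append(tuple(copies))
        data.append((theta, genotype))
    return data


if __name__ == "__main__":
    beta = [
        [[0.9, 0.1], [0.5, 0.5]],  # population 0, loci 0 and 1
        [[0.1, 0.9], [0.2, 0.8]],  # population 1, loci 0 and 1
    ]
    for theta, genotype in generate_admixed_genotypes([1.0, 1.0], beta, 3):
        print([round(t, 2) for t in theta], genotype)
```

In an mStruct-style variant, the direct allele draw `x` would presumably give way to drawing an ancestral allele and then a possibly mutated observed allele from a mutation model. The exact replacement step is not recoverable from this transcript.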

8 mStruct Model Inference
MCMC: slow
Variational inference for hidden variables
Variational EM for hyperparameters

9 Synthetic Data
Twenty microsatellite genotype datasets with 100 individuals from 3 ancestral populations at 50 genotype loci

10 HGDP Microsatellite Data
Model selection by BIC (Bayesian Information Criterion) score
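For readers unfamiliar with BIC-based model selection, here is a minimal sketch using the standard definition BIC = k ln(n) - 2 ln(L), where lower is better. The sign convention used on the slide is not recoverable from this transcript, and the helper names `bic` and `select_by_bic`, and the input layout, are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_observations):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower values are preferred."""
    return n_params * math.log(n_observations) - 2.0 * log_likelihood


def select_by_bic(candidates, n_observations):
    """Pick the candidate with the lowest BIC.

    candidates maps a model label (for example, the number of ancestral
    populations K) to a (log_likelihood, n_params) pair.
    Returns (best_label, scores) where scores maps each label to its BIC.
    """
    scores = {
        label: bic(loglik, n_params, n_observations)
        for label, (loglik, n_params) in candidates.items()
    }
    best_label = min(scores, key=scores.get)
    return best_label, scores
```

Each candidate number of ancestral populations would be fitted separately, and its maximized log-likelihood and free-parameter count passed in. The penalty term grows with the parameter count, which discourages picking a larger K that only marginally improves the fit.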

11 HGDP Microsatellite Data
am-spectrum: spectrums of different ancestral populations
gm-spectrum: spectrums of different geographical populations
1056 individuals from 52 populations at 377 autosomal microsatellite loci

12 Contour of Mutation Rates

13 Summary
mStruct takes into account both genetic admixture and allele mutation effects.
mStruct is an extension of LDA that allows noisy observations.
A variational inference algorithm was developed for mStruct that makes inference tractable.
Other applications: images, text, and so on.

