Published by Sydney Allison. Modified over 6 years ago.
1
Introducing Bayesian Approaches to Twin Data Analysis
Lindon Eaves, VIPBG, Richmond. Boulder, March 2001
2
Outline
Why use a Bayesian approach?
Basic concepts
“BUGS”
Live demo of a simple application
Applications to twin data
3
Why Use Bayesian Approach?
Intellectually satisfying
Get more information out of existing problems (distributions of model parameters, individual “genetic” scores)
Tackle problems other methods find difficult (non-linear mixed models – growth curves; GxE interaction)
4
Some references
Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (1996) Markov Chain Monte Carlo in Practice. Chapman & Hall, London.
Spiegelhalter, D., Thomas, A., Best, N. (2000) WinBUGS Version 1.3, User Manual. MRC BUGS Project, Cambridge.
Eaves, L.J., Erkanli, A. (In preparation) Markov Chain Monte Carlo Approaches to Analysis of Genetic and Environmental Components of Human Developmental Change and GxE Interaction. (For Behavior Genetics.)
5
The Traditional Approach: Via Likelihood
Given data D and parameters θ, the likelihood function, l, is l = P(D|θ). We find the θ that maximizes l.
6
Typically
Maximize likelihood numerically
Fairly easy for linear models and normal variables (“LISREL”)
Mx works well (best!)
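As a rough illustration of what "maximize likelihood numerically" means, the sketch below fits the mean of a normal sample by minimizing the negative log-likelihood with a golden-section search. The simulated data, the known unit variance, and the choice of search method are all assumptions made for this example; it is not a description of how Mx or LISREL work internally.

```python
import math
import random

def neg_log_likelihood(mu, data, sigma=1.0):
    """Negative log-likelihood of the data under Normal(mu, sigma^2)."""
    n = len(data)
    ss = sum((y - mu) ** 2 for y in data)
    return 0.5 * n * math.log(2 * math.pi * sigma ** 2) + ss / (2 * sigma ** 2)

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2

# Hypothetical data: 50 draws from Normal(10, 1).
random.seed(1)
data = [random.gauss(10.0, 1.0) for _ in range(50)]

mu_hat = golden_section_minimize(lambda m: neg_log_likelihood(m, data), 0.0, 20.0)
sample_mean = sum(data) / len(data)
print(mu_hat, sample_mean)
```

For this simple model the numerical maximum should agree with the sample mean, which is the closed-form ML estimate; the numerical route matters when no closed form exists.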
7
BUT…
Some things don’t work so well
8
For example:
Getting confidence intervals etc.
Non-linear models (require integration over latent variables, which is hard for a large number of parameters)
Estimating large numbers of latent variables (e.g. “genetic factor scores”)
9
Markov Chain Monte Carlo (MCMC) Methods
Allow more general models
Obtain confidence intervals and other summary statistics
Estimate missing values
Estimate latent trait values
All as part of the model-fitting process
10
Bayesian approach
ML works with the likelihood: l = P(D|θ).
The Bayesian approach seeks the distribution of parameters given data: B = P(θ|D).
11
How do we get P(θ|D)?
Use Bayes theorem: P(θ|D) = P(θ & D)/P(D) = P(D|θ).P(θ)/P(D)
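Bayes theorem can be worked through numerically on a grid of candidate parameter values. The sketch below is only an illustration: the five observations, the Normal(10, 2²) prior on the mean, and the known unit variance of the data are all hypothetical choices, and the grid sum stands in for the integral that defines P(D).

```python
import math

def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

data = [9.2, 10.4, 10.9, 9.7, 10.3]            # hypothetical observations
step = 0.01
grid = [8.0 + step * i for i in range(401)]    # candidate means from 8 to 12

# Prior P(theta) and likelihood P(D|theta) at each grid point.
prior = [normal_pdf(t, 10.0, 2.0) for t in grid]
likelihood = [math.prod(normal_pdf(y, t, 1.0) for y in data) for t in grid]

# P(D): the product P(D|theta).P(theta) summed over the grid.
p_data = sum(l * p for l, p in zip(likelihood, prior)) * step

# Posterior P(theta|D) = P(D|theta).P(theta) / P(D).
posterior = [l * p / p_data for l, p in zip(likelihood, prior)]
posterior_mean = sum(t * q for t, q in zip(grid, posterior)) * step
print(posterior_mean)
```

A grid like this is workable for one parameter but becomes impractical as the number of parameters grows, which is the motivation for sampling-based methods.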
12
A couple of problems
We don’t know P(θ)
What is P(D)?
13
P(θ): the “prior” distribution
Not known, but we may know (guess?) its form, e.g.:
Means may be normal
Variances may be gamma
14
P(D) = ∫ P(D|θ).P(θ) dθ
15
How do we get the integral?
If we know P(θ), we could sample θ many times and evaluate the function P(D|θ) at each sample.
The integral is approximated to the desired accuracy by the mean over k (large) samples (“Monte Carlo” integration).
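A minimal sketch of Monte Carlo integration for P(D), assuming the prior is fully known so that it can be sampled directly. The data, the Normal(10, 2²) prior, and the unit data variance are hypothetical; a fine grid sum is included only as a reference value to compare against.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def likelihood(theta, data):
    """P(D|theta) for independent Normal(theta, 1) observations."""
    return math.prod(normal_pdf(y, theta, 1.0) for y in data)

def monte_carlo_p_data(data, k, rng):
    """Approximate P(D) by averaging P(D|theta) over k draws from the prior."""
    total = 0.0
    for _ in range(k):
        theta = rng.gauss(10.0, 2.0)   # draw from the assumed prior
        total += likelihood(theta, data)
    return total / k

data = [9.2, 10.4, 10.9, 9.7, 10.3]    # hypothetical observations
rng = random.Random(2001)
estimate = monte_carlo_p_data(data, 100_000, rng)

# Reference value from a fine grid over 0..20 (deterministic check).
step = 0.01
reference = sum(
    likelihood(step * i, data) * normal_pdf(step * i, 10.0, 2.0)
    for i in range(2001)
) * step
print(estimate, reference)
```

The accuracy improves roughly with the square root of k, so the number of samples needed depends on how concentrated the likelihood is relative to the prior.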
16
We still have a problem…
We don’t know P(θ)
We only know its “shape”
17
“Markov Chain” Monte Carlo…
Simulate a sequence of samples of θ that ultimately converges to (non-independent) samples from the desired distribution, P(θ|D).
18
If we succeed…
When the sequence has converged (“stationary distribution”, after “burn in” from the trial θ), we may construct P(θ|D) from the sequence of samples.
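The idea of a chain that converges after burn-in can be sketched with a random-walk Metropolis sampler, which needs the target distribution only up to a constant (its "shape"). This is a swapped-in, simpler algorithm chosen for illustration, not the sampler used in the slides; the data, prior, starting value, and step size are all hypothetical.

```python
import math
import random

def log_shape(theta, data):
    """Log of the unnormalized posterior: Normal(10, 2^2) prior times
    Normal(theta, 1) likelihood, with constants dropped."""
    log_prior = -0.5 * ((theta - 10.0) / 2.0) ** 2
    log_lik = -0.5 * sum((y - theta) ** 2 for y in data)
    return log_prior + log_lik

def metropolis(data, start, n_iter, step_sd, rng):
    """Random-walk Metropolis chain for a single parameter."""
    chain = []
    current = start
    current_lp = log_shape(current, data)
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_shape(proposal, data)
        # Accept with probability min(1, ratio of unnormalized densities).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

data = [9.2, 10.4, 10.9, 9.7, 10.3]    # hypothetical observations
rng = random.Random(2001)
chain = metropolis(data, start=0.0, n_iter=6000, step_sd=0.5, rng=rng)

burn_in = 1000
kept = chain[burn_in:]                  # discard the path from the trial value
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)
```

The chain starts far from the bulk of the distribution on purpose, so the early iterations show the drift that the burn-in period is meant to discard.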
19
One algorithm that can generate chains in a large number of cases…
…The “Gibbs Sampler”, hence: “Bayesian Inference Using Gibbs Sampling”, “BUGS” for short
Spiegelhalter, D., Thomas, A., Best, N. (2000) WinBUGS Version 1.3, User Manual. MRC BUGS Project, Cambridge.
20
Obtaining WinBUGS
Find the MRC BUGS project on the web (search on WinBUGS)
Download educational version (free)
Register by (at site)
Install educational version (instructions at site)
Follow instructions in reply to convert to production version (free)
21
Preview of example: Using BUGS to estimate a mean and variance
22
Data and Initial Values for Mean-Variance Problem
Data: list(n=50)
y[]
Initial values: list(mu=10,tau=0.2)
23
“Doodle” for mean and variance model
24
BUGS Code for Mean and Variance
25
Values of Mean (mu): First 200 iterations of MCMC Algorithm
26
Values of Variance (Sigma2): First 200 iterations of MCMC Algorithm
27
MCMC Estimates of Mean and Variance:
5000 iterations after 1000 iteration “burn in”.
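The mean-and-variance example can be imitated with a hand-written Gibbs sampler. The sketch below is not the BUGS code from the slides: it assumes a normal likelihood with mean mu and precision tau, a vague normal prior on mu, and a vague gamma prior on tau, all of which are hypothetical choices, and it runs on simulated data. Only n=50, the initial values mu=10 and tau=0.2, and the 1000-iteration burn-in followed by 5000 iterations are taken from the slides.

```python
import random
import statistics

def gibbs_mean_precision(y, mu_init, tau_init, n_iter, rng,
                         prior_mean=0.0, prior_precision=1.0e-6,
                         prior_shape=0.001, prior_rate=0.001):
    """Gibbs sampler for y[i] ~ Normal(mu, 1/tau) with assumed priors
    mu ~ Normal(prior_mean, 1/prior_precision), tau ~ Gamma(shape, rate)."""
    n = len(y)
    total = sum(y)
    mu, tau = mu_init, tau_init
    draws = []
    for _ in range(n_iter):
        # Full conditional of mu given tau: normal.
        post_precision = prior_precision + n * tau
        post_mean = (prior_precision * prior_mean + tau * total) / post_precision
        mu = rng.gauss(post_mean, post_precision ** -0.5)
        # Full conditional of tau given mu: gamma.
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        ss = sum((v - mu) ** 2 for v in y)
        tau = rng.gammavariate(prior_shape + n / 2, 1.0 / (prior_rate + ss / 2))
        draws.append((mu, 1.0 / tau))   # store the mean and the variance
    return draws

rng = random.Random(2001)
y = [rng.gauss(10.0, 2.0) for _ in range(50)]   # simulated, hypothetical data

draws = gibbs_mean_precision(y, mu_init=10.0, tau_init=0.2, n_iter=6000, rng=rng)
kept = draws[1000:]                              # drop the burn-in iterations
mu_mean = statistics.fmean(m for m, _ in kept)
sigma2_mean = statistics.fmean(s for _, s in kept)
print(mu_mean, sigma2_mean)
```

With priors this vague, the posterior means should sit close to the sample mean and sample variance of the simulated data.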
28
Application to Twin Data
Fitting the AE model to bivariate twin data
29
Table 1: Population parameter values used in simulation of bivariate twin data and values realized using Mx for ML estimation (N=100 MZ and 100 DZ pairs).

Parameter        ML estimate   Population value
mu[1]            9.993         10.0
mu[2]            10.047
sigma2.g[1,1]    0.704         0.8
sigma2.g[1,2]    0.371         0.4
sigma2.g[2,2]    0.741
sigma2.e[1,1]    0.194         0.2
sigma2.e[1,2]    0.098         0.1
sigma2.e[2,2]    0.254
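To make the AE model concrete, the sketch below computes the covariance matrices it implies from the ML estimates in Table 1. It assumes the standard AE parameterization (MZ pairs share all additive genetic effects, DZ pairs share half on average, and unique environmental effects are uncorrelated across twins); the variable and helper names are illustrative only.

```python
def mat_add(a, b):
    """Element-wise sum of two matrices given as nested lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(a, k):
    """Multiply every element of a matrix by the scalar k."""
    return [[k * x for x in row] for row in a]

# ML estimates from Table 1 (genetic and environmental covariance matrices).
G = [[0.704, 0.371],
     [0.371, 0.741]]
E = [[0.194, 0.098],
     [0.098, 0.254]]

# Within-person (phenotypic) covariance matrix: G + E.
phenotypic = mat_add(G, E)

# Cross-twin covariance matrices under the assumed AE parameterization.
mz_cross = G                   # MZ twins share all additive genetic effects
dz_cross = mat_scale(G, 0.5)   # DZ twins share half, on average

# Proportion of each variable's variance attributed to genetic effects.
heritability = [G[i][i] / phenotypic[i][i] for i in range(2)]
print(phenotypic, dz_cross, heritability)
```

These expected matrices are what a fitted AE model is compared against, whether the fit is done by ML or by MCMC.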
30
Doodle for Multivariate AE Model
31
Start of Data for Bivariate Twin Example
list(N=2, nmz=100, ndz=100, mean=c(0,0),
precis=structure(.Data=c(0.0001,0, 0, ), .Dim=c(2,2)),
omega.g=structure(.Data=c(0.0001,0,0,0.0001), .Dim=c(2,2)),
omega.e=structure(.Data=c(0.0001,0,0,0.0001), .Dim=c(2,2)))
ymz[,1,1] ymz[,1,2] ymz[,2,1] ymz[,2,2]
32
Iteration history for estimates of means
33
Iteration History for Genetic Covariances
34
Summary statistics for 5000 MCMC iterations of bivariate AE model after 2000 iteration "burn in"
Columns: node, mean, sd, MC error, 2.5%, median, 97.5%
Nodes: deviance, mu[1], mu[2], g[1,1], g[1,2], g[2,2], e[1,1], e[1,2], e[2,2]
(numeric values not preserved)
35
Comparison of ML and MCMC Estimates for Bivariate AE model
Parameter   ML      MCMC
Mu(1)       10.00   (only one value given)
Mu(2)       10.05   (only one value given)
G(1,1)      0.704   0.705
G(1,2)      0.371   0.372
G(2,2)      0.741   0.739
E(1,1)      0.194   0.197
E(1,2)      0.098   0.099
E(2,2)      0.254   0.258
36
Illustrative MCMC estimates of genetic effects: first two DZ twin pairs on two variables
Columns: Observation, Est, S.e., MC error, %, Median, %
Observations: g1dz[1,1,1], g1dz[1,1,2], g1dz[1,2,1], g1dz[1,2,2], g1dz[2,1,1], g1dz[2,1,2], g1dz[2,2,1], g1dz[2,2,2]
(numeric values not preserved)