Growth Mixture Modeling
Shaunna L. Clark & Ryne Estabrook
Advanced Genetic Epidemiology Statistical Workshop, October 24, 2012


Outline
- Growth Mixture Model
- Regime Switching
- Other Longitudinal Mixture Models
- OpenMx GMM
- How to extend GMM to FMM
- How to get individual class probabilities from OpenMx
- Exercise

Homogeneity vs. Heterogeneity
- The previous session showed a growth model in which everyone follows the same mean trajectory of use, with some individual variation
- Is this an accurate representation of the development of substance abuse/dependence?
- Probably not

Growth Mixture Modeling (GMM)
Muthén & Shedden, 1999; Muthén, 2001
Setting
- A single item measured repeatedly
  - Example: Number of substances currently using
- Hypothesized trajectory classes
  - Non-users; early initiation; late but consistent use
- Individual trajectory variation within class
Aims
- Estimate trajectory shapes
  - Linear, quadratic, etc.
- Estimate trajectory class probabilities
  - Proportion of sample in each trajectory class
- Estimate variation within class

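Linear Growth Model Diagram
[Path diagram: observed variables x_T1 to x_T5 load on the latent intercept (I) and slope (S) factors. Labeled parameters: factor means (Int, Slope), factor variances σ²_Int and σ²_Slope, covariance σ²_Int,Slope, and residual variances σ²_ε1 to σ²_ε5.]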

Linear GMM Model Diagram
[Path diagram: the linear growth model (x_T1 to x_T5, factors I and S, factor means, σ²_Int, σ²_Slope, σ²_Int,Slope, residual variances σ²_ε1 to σ²_ε5) with an added latent class variable C.]

GMM Example Profile Plot
[Figure: profile plot]

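Growth Mixture Model Equations
$x_{itk} = \mathrm{Intercept}_{ik} + \lambda_{tk}\,\mathrm{Slope}_{ik} + \varepsilon_{itk}$
$\mathrm{Intercept}_{ik} = \alpha_{0k} + \zeta_{0ik}$
$\mathrm{Slope}_{ik} = \alpha_{1k} + \zeta_{1ik}$
for individual $i$ at time $t$ in class $k$, with $\varepsilon_{itk} \sim N(0, \sigma)$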
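To make the equations concrete, here is a minimal base R sketch that simulates data from a two-class linear growth mixture model. Every numeric value (class means, standard deviations, class proportions, sample size) is hypothetical and chosen only for illustration, and the time scores are assumed to be the same in both classes.

```r
# Sketch: simulate data from a two-class linear growth mixture model.
# All parameter values are hypothetical, not taken from the slides.
set.seed(1)
n      <- 200                 # number of individuals
times  <- 0:4                 # time scores (lambda_t), assumed equal across classes
alpha0 <- c(0.5, 3.0)         # class-specific intercept means (alpha_0k)
alpha1 <- c(0.0, 0.8)         # class-specific slope means (alpha_1k)
sd_int <- 0.5                 # SD of intercept deviations (zeta_0ik)
sd_slp <- 0.2                 # SD of slope deviations (zeta_1ik)
sd_eps <- 0.3                 # SD of residuals (eps_itk)

# Latent class membership for each individual
k <- sample(1:2, n, replace = TRUE, prob = c(0.6, 0.4))

# Intercept_ik = alpha_0k + zeta_0ik ; Slope_ik = alpha_1k + zeta_1ik
intercept <- alpha0[k] + rnorm(n, 0, sd_int)
slope     <- alpha1[k] + rnorm(n, 0, sd_slp)

# x_itk = Intercept_ik + lambda_t * Slope_ik + eps_itk
x <- intercept + outer(slope, times) +
  matrix(rnorm(n * length(times), 0, sd_eps), n, length(times))
colnames(x) <- paste0("x_T", seq_along(times))

# Observed mean trajectory within each simulated class
class_means <- aggregate(as.data.frame(x), by = list(class = k), FUN = mean)
print(class_means)
```

Plotting the rows of `class_means` against `times` gives a profile plot of the kind shown on the earlier slide.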

Latent Class Growth Model (LCGA) vs. GMM
Nagin, 1999; Nagin & Tremblay, 1999
- Same as GMM except no residual variance on growth factors
  - No individual variation within class
  - Everyone in a class has the same trajectory
- LCGA is a special case of GMM
[Path diagram: x_T1 to x_T5, growth factors I and S with factor means, latent class variable C.]
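In the notation of the growth mixture model equations, the LCGA restriction can be written as below. This is only a restatement of "no residual variance on growth factors", not an additional result from the slides.

```latex
\zeta_{0ik} = \zeta_{1ik} = 0
\quad\Longrightarrow\quad
x_{itk} = \alpha_{0k} + \lambda_{tk}\,\alpha_{1k} + \varepsilon_{itk}
```

Within a class, the only remaining source of individual variation is the residual term.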

Class Enumeration
- Still cannot use the LRT χ²
- Information criteria: AIC (Akaike, 1974), BIC (Schwarz, 1978)
  - Penalize for number of parameters (and, for BIC, sample size)
  - Prefer the model with the lowest value
- Interpretation and usefulness
  - Profile plot
  - Substantive theory
  - Predictive validity
  - Size of classes
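As a small illustration of comparing models by information criteria, the base R sketch below computes AIC, BIC, and the sample-size adjusted BIC from a -2 log-likelihood. The helper name `info_criteria` and the fit values are hypothetical; they are not results from this workshop.

```r
# Information criteria from -2 log-likelihood (m2ll), number of free
# parameters (np), and sample size (n). Helper name is hypothetical.
info_criteria <- function(m2ll, np, n) {
  c(AIC   = m2ll + 2 * np,
    BIC   = m2ll + np * log(n),
    saBIC = m2ll + np * log((n + 2) / 24))
}

# Hypothetical fit results for a 2-class and a 3-class model
fits <- rbind(
  "2-class" = info_criteria(m2ll = 2450, np = 12, n = 500),
  "3-class" = info_criteria(m2ll = 2405, np = 17, n = 500)
)
print(round(fits, 2))

# For each criterion, the model with the lowest value
best <- rownames(fits)[apply(fits, 2, which.min)]
names(best) <- colnames(fits)
print(best)
```

The criteria are only one input; the interpretability checks listed on the slide still apply.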

Analysis Plan
- Determine growth function
- Determine number of classes
- Examine mean plots, with and without individual trajectories
- Determine whether growth factor variances need to be:
  1. Different from zero (GMM vs. LCGA)
  2. Held equal across classes
- Add covariates and distal outcomes

Modeling Zero

How Do I Model Zeros?
- Particularly relevant for substance abuse (or other outcomes with floor effects), to model non-users
- Some outcomes are right-skewed, so there are many low values of the dependent variable
- However, some outcomes may have more zeros than expected
  - Example: Alcohol consumption; individuals who never drink
  - These individuals will always respond that they consumed zero drinks

When You Have More Zeros Than Expected
- In this case, zeros can be thought of as coming from two populations:
  1. Structural zeros: zeros always occur in this population
     - Example: Never drinkers
  2. Others who produce a zero with some probability at the time of measurement
     - Example: Occasional drinkers
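One way to see the two sources of zeros is a zero-inflated count distribution. The sketch below assumes a zero-inflated Poisson, which the slides do not name. The function `p_zero_zip` and the numbers are hypothetical.

```r
# Probability of observing a zero under a zero-inflated Poisson (an
# assumption for illustration; the slides do not specify a distribution).
# pi_zero: proportion of structural zeros (e.g., never drinkers)
# lambda : mean count among everyone else (e.g., occasional drinkers)
p_zero_zip <- function(pi_zero, lambda) {
  pi_zero + (1 - pi_zero) * dpois(0, lambda)
}

# Hypothetical values: 30% never drinkers, others average 2 drinks
with_structural    <- p_zero_zip(pi_zero = 0.3, lambda = 2)
without_structural <- dpois(0, lambda = 2)
print(c(with_structural = with_structural,
        without_structural = without_structural))
```

The gap between the two probabilities is the excess of zeros that a single count distribution would fail to reproduce.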

One Option
- Identify those individuals in the two populations
  - Structural zeros can then be eliminated
  - Those who could potentially produce zeros are retained
- But it can be very difficult to tell the difference between the two
- Or the population of interest is the entire population, i.e. both drinkers and non-drinkers
  - Stem issue

Zero-Class
- Consider what you mean by a zero
  - Only non-users who have not initiated use, or also those who have initiated but tried only once?
- Fix growth factor means to zero
  - Start not using, stay not using
  - If you fix only the means, it will not be a pure zero-class
  - Likely to pick up people who have tried once or twice but have not moved to regular use
- Fix growth factor means and (co)variances to zero
  - No variance in group
  - Can sometimes cause computational issues

Regime Switching

Is GMM a Good Model for Substance Use Development?
- Maybe not
- Assumes that individuals remain in the same trajectory over time
  - Once a heavy smoker, always a heavy smoker, even if you successfully quit for a period
- May not hold for many substance use outcomes
  - Examples: Switching from moderate to heavy drinking; changing from daily smoker to non-smoker

Individual Trajectory Plots
- Dolan et al. (2005) presented the regime switching model (RSM) as a way to get traction on this issue

Dolan et al. Regime Switching Model (RSM)
- Regime = latent trajectory class
  - Ex: Habitual moderate drinkers, heavy drinkers
- Regime switch = move from one regime to another
  - Ex: A switch from moderate to heavy drinking
- Used latent Markov modeling for normally distributed outcomes (Schmittmann et al., 2005)

RSM with Ordinal Data
- The Dolan RSM was designed to be used with normally distributed data
- Substance abuse measures are often:
  - If continuous, not normally distributed
  - Count
    - Ex: # of drinking/using days per month
  - Categorical
    - Ex: Do you use X substance?
- As we've seen in previous talks, we can use the Mehta, Neale and Flay (2004) method when we have ordinal data

Application: Adolescent Drinking
- From the Dolan et al. paper
- Data: National Longitudinal Survey of Youth (NLSY97)
  - Years 1998, 1999, 2000; white males and females
  - Age 13 or 14 in 1998
  - Indicated that they regularly drank alcohol
- Outcome: "In the past 30 days, on days you drank, how much did you drink?"
  - Made ordinal: 0 = 0-2 drinks; 1 = 3 drinks; 2 = 4-6 drinks; 3 = 7+ drinks
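The recoding of the drinking item into four ordered categories could be sketched in base R as follows. The vector `drinks` is made-up example data, and whole-number responses are assumed.

```r
# Hypothetical responses: number of drinks on drinking days
drinks <- c(0, 2, 3, 4, 6, 7, 12)

# Recode to the slide's categories:
# 0 = 0-2 drinks; 1 = 3 drinks; 2 = 4-6 drinks; 3 = 7+ drinks
# cut() uses right-closed intervals: (-Inf,2], (2,3], (3,6], (6,Inf]
drink_cat <- cut(drinks, breaks = c(-Inf, 2, 3, 6, Inf), labels = FALSE) - 1

# Store as an ordered factor so the category order is explicit
drink_ord <- factor(drink_cat, levels = 0:3, ordered = TRUE)
print(data.frame(drinks, drink_cat, drink_ord))
```

With four categories there are three thresholds, which is the value used for `nthresh` later in the script.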

Model Simplifications for GMM & RSM Application
- Assumed linear model
  - Really quadratic
- No correlation between intercept and slope
  - Where you start drinking at the beginning of the study does not influence how your drinking develops during the study
- Transition probabilities equivalent across time
  - Probability of drinking between age are the same as
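The idea of transition probabilities that are equal across time can be illustrated with a single transition matrix reused at every step. The matrix and the initial class probabilities below are hypothetical values for illustration only; they are not the estimates from the application.

```r
# Hypothetical transition matrix: rows = class at time t, columns = class at t + 1
trans <- matrix(c(0.90, 0.08, 0.02,
                  0.10, 0.70, 0.20,
                  0.05, 0.25, 0.70),
                nrow = 3, byrow = TRUE,
                dimnames = list(from = c("Low", "Moderate", "Heavy"),
                                to   = c("Low", "Moderate", "Heavy")))

# Each row is a probability distribution over destination classes
stopifnot(all(abs(rowSums(trans) - 1) < 1e-12))

# Hypothetical class probabilities at the first occasion
p1 <- c(Low = 0.5, Moderate = 0.3, Heavy = 0.2)

# Time-invariant transitions: the same matrix is applied at every step
p2 <- drop(p1 %*% trans)
p3 <- drop(p2 %*% trans)
print(rbind(p1, p2, p3))
```

Allowing the transitions to differ across time would mean one matrix per pair of adjacent occasions, and so more parameters.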

Comparing GMM and RSM
Model          -2*LL   np   AIC   BIC   saBIC
3-Class GMM
3-Class RSM

3-Class GMM Profile Plot

3-Class RSM Profile Plot

GMM & RSM Combined Profile Plot

RSM Transition Probabilities
- Likely to stay in same class
- Low class unlikely to switch to other classes
- Most likely to switch between moderate and high drinking classes
Class      Low   Moderate   Heavy
Low
Moderate
Heavy

Other Longitudinal Mixture Models
- Longitudinal Latent Class Analysis
  - Models patterns of change over time, rather than functional growth form
  - Lanza & Collins, 2006; Feldman et al.
[Figure: LCA and LCGA panels. Surviving labels: binary item, category item variables, 3 classes, quadratic.]

Latent Transition Analysis
- Models transition from one state to another over time
  - Unlike RSM, does not impose growth structure
  - Ex: Drinking alcohol or not over time
- Graham et al., 1991; Nylund et al., 2006
- Script on the OpenMx forum
[Path diagram: items x1 to x5 measured at Time 1 and at Time 2, with latent class variables C1 and C2.]

Other Longitudinal Mixture Models
- Survival Mixture
  - Multiple latent classes of individuals with different survival functions
  - Ex: Different groups based on age of initiation
  - Kaplan, 2004; Masyn, 2003; Muthén & Masyn

OpenMx: GMM Example
- GMM_example.R
- 2 Classes
- Intercept and Slope

Make Objects for Things We Will Reference Throughout the Script
    # Number of measurement occasions
    nocc <- 4
    # Number of growth factors (intercept, slope)
    nfac <- 2
    # Number of classes
    nclass <- 2
    # Number of thresholds: number of categories of the variable minus 1
    nthresh <- 3
    # Function that will help us label our thresholds
    labFun <- function(name = "matrix", nrow = 1, ncol = 1) {
      matlab <- matrix(paste(rep(name, each = nrow * ncol),
                             rep(rep(1:nrow), ncol),
                             rep(1:ncol, each = nrow), sep = "_"))
      return(matlab)
    }
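To see what the labeling helper produces, the sketch below repeats its definition so that it runs on its own and calls it with the threshold dimensions used in this script. The reshaping at the end is only for display.

```r
# Same helper as in the script, repeated so this example is self-contained
labFun <- function(name = "matrix", nrow = 1, ncol = 1) {
  matlab <- matrix(paste(rep(name, each = nrow * ncol),
                         rep(rep(1:nrow), ncol),
                         rep(1:ncol, each = nrow), sep = "_"))
  return(matlab)
}

# 3 thresholds for each of 4 occasions
labs <- labFun("th", nrow = 3, ncol = 4)
print(dim(labs))   # a single column holding all nrow * ncol labels

# Labels follow the pattern name_row_col, listed column by column
print(matrix(labs, nrow = 3, ncol = 4))
```

Each label therefore encodes the threshold number first and the measurement occasion second.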

Setting Up the Growth Part of the Model
    # Factor loadings: a column of 1s (intercept) and linear time scores 0, 1, 2, ... (slope)
    lamda <- mxMatrix("Full", nrow = nocc, ncol = nfac,
                      values = c(rep(1, nocc), 0:(nocc - 1)), name = "lambda")
    # Factor variances
    phi <- mxMatrix("Diag", nrow = nfac, ncol = nfac, free = TRUE,
                    labels = c("vi", "vs"), name = "phi")
    # Error terms
    theta <- mxMatrix("Diag", nrow = nocc, ncol = nocc, free = TRUE,
                      labels = paste("theta", 1:nocc, sep = ""), values = 1, name = "theta")
    # Factor means
    alpha <- mxMatrix("Full", nrow = 1, ncol = nfac, free = TRUE,
                      labels = c("mi", "ms"), name = "alpha")

Growth Part Cont'd
    # Item thresholds: first two fixed at 0 and 1, third freely estimated
    thresh <- mxMatrix(type = "Full", nrow = nthresh, ncol = nocc,
                       free = rep(c(FALSE, FALSE, TRUE), nocc),
                       values = rep(c(0, 1, 1.1), nocc), lbound = .0001,
                       labels = labFun("th", nthresh, nocc), name = "thresh")
    # Model-implied covariance matrix and mean vector
    cov <- mxAlgebra(lambda %*% phi %*% t(lambda) + theta, name = "cov")
    mean <- mxAlgebra(alpha %*% t(lambda), name = "mean")
    # FIML objective; vector = TRUE returns a likelihood for each row of data
    obj <- mxFIMLObjective("cov", "mean", dimnames = names(ordgsmsData),
                           thresholds = "thresh", vector = TRUE)
    lgc <- mxModel("LGC", lamda, phi, theta, alpha, thresh, cov, mean, obj)
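The two algebras for the implied covariance and mean can be checked by hand in base R. The sketch below uses ordinary matrices in place of the OpenMx objects, and the values in `phi` and `alpha` are hypothetical.

```r
nocc <- 4

# Loadings: intercept column of 1s, slope column of time scores 0..3
lambda <- cbind(1, 0:(nocc - 1))

# Hypothetical parameter values (not estimates from the workshop data)
phi   <- diag(c(0.5, 0.1))               # factor variances (vi, vs)
theta <- diag(rep(1, nocc))              # residual variances
alpha <- matrix(c(0.2, 0.3), nrow = 1)   # factor means (mi, ms)

# Same expressions as the "cov" and "mean" algebras in the script
cov_implied  <- lambda %*% phi %*% t(lambda) + theta
mean_implied <- alpha %*% t(lambda)

print(cov_implied)
print(mean_implied)   # mi + ms * time at each occasion
```

Because the loadings are fixed, the implied means are a straight line in time, and the implied variances grow with the squared time score.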

Class-Specific Model
    class1 <- mxModel(lgc, name = "Class1")
    class1 <- omxSetParameters(class1, labels = c("vi", "vs", "mi", "ms"),
                               values = c(0.01, 0.05, 0.14, 0.32),
                               newlabels = c("vi1", "vs1", "mi1", "ms1"))
As in LCA, repeat for all your latent classes. Just make sure to change the class number and starting values accordingly.

Class Proportions
    # Fixing one probability to 1
    classP <- mxMatrix("Full", nrow = nclass, ncol = 1, free = c(TRUE, FALSE),
                       values = 1, lbound = 0.001, labels = c("p1", "p2"), name = "Props")
    # Rescale the class proportion matrix into a class probability matrix by dividing by their sum
    # (done with a Kronecker product of the class proportions and 1/sum)
    classS <- mxAlgebra(Props %x% (1 / sum(Props)), name = "classProbs")
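The rescaling step can be tried in base R, where `%x%` is also the Kronecker product. The value 1.5 for the free proportion is a hypothetical estimate.

```r
# Hypothetical values: p1 freely estimated (here 1.5), p2 fixed at 1
props <- matrix(c(1.5, 1), ncol = 1)

# Same expression as the "classProbs" algebra: divide each proportion by the sum
class_probs <- props %x% (1 / sum(props))
print(class_probs)
print(sum(class_probs))
```

Fixing one proportion and rescaling keeps the resulting probabilities positive and summing to one.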

Class-Specific Objectives
    # Class-specific likelihoods, weighted by the class probabilities
    sumll <- mxAlgebra(-2 * sum(log(classProbs[1,1] %x% Class1.objective +
                                    classProbs[2,1] %x% Class2.objective)),
                       name = "sumll")
    # Make an mxAlgebraObjective
    obj <- mxAlgebraObjective("sumll")
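The `sumll` algebra is the usual mixture -2 log-likelihood. A base R sketch with made-up numbers: `lik1` and `lik2` stand in for the row-wise likelihood vectors of the two class models, and all values are hypothetical.

```r
# Hypothetical per-person likelihoods under each class model (3 people)
lik1 <- c(0.20, 0.02, 0.10)
lik2 <- c(0.01, 0.15, 0.10)

# Hypothetical class probabilities
class_probs <- c(0.6, 0.4)

# Each person's mixture likelihood is the probability-weighted sum over classes
mix_lik <- class_probs[1] * lik1 + class_probs[2] * lik2

# -2 log-likelihood of the mixture, the quantity minimized by the optimizer
m2ll <- -2 * sum(log(mix_lik))
print(mix_lik)
print(m2ll)
```

The weighting happens on the likelihood scale, before taking logs, which is why the class models need to return one likelihood per row.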

Finish It Off
    # Put it all in a model
    gmm <- mxModel("GMM 2 Class", mxData(observed = ordgsmsData, type = "raw"),
                   class1, class2, classP, classS, sumll, obj)
    # Run it
    gmmFit <- mxRun(gmm, unsafe = TRUE)
    # Run it again using starting values from the previous run
    summary(gmmFit2 <- mxRun(gmmFit))

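Difference Between GMM and FMM?
[Path diagrams: an intercept-only growth mixture model (x_T1 to x_T5, intercept factor I with variance σ²_Int, loadings fixed at 1, latent class variable C) and a factor mixture model (items x1 to x5, factor F with variance σ²_F, latent class variable C).]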

GMM and FMM
- The difference between the two models shown on the previous slide is that the factor loadings are fixed to 1 in the GMM, whereas in the FMM they are freely estimated
  - Adjust the script by letting the values of the lambda matrix be freely estimated
- To run the FMM on the previous slide, as in factor analysis, a parameter must be fixed so that the model is identified
  - Restrict the mean of two of the factors in two classes to set the metric of the factor

FMM & Measurement Invariance
- Clark et al. (in press)
- In the previous version, the item thresholds were measurement invariant across classes
  - Classes were differentiated by differences in the mean and variance of the factor
- Can also have models with measurement non-invariant thresholds
  - Classes arise because of differences in item thresholds
  - Add thresholds to class-specific statements
  - Need to restrict the factor mean to zero because the factor mean and item thresholds cannot both be identified

How Do We Extract Class Probabilities and Calculate Entropy in OpenMx?
Ryne Estabrook
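The slides for this part are not in the transcript, so the following is only a generic base R sketch of the two calculations, not the presenter's script. It assumes the per-person class likelihoods and the class probabilities have already been pulled out of the fitted model into ordinary R objects. The helper names `class_posteriors` and `relative_entropy` and the numbers are hypothetical.

```r
# Posterior class probabilities via Bayes' rule.
# lik: n x K matrix of per-person likelihoods under each class
# class_probs: length-K vector of class probabilities
class_posteriors <- function(lik, class_probs) {
  weighted <- sweep(lik, 2, class_probs, `*`)
  weighted / rowSums(weighted)
}

# Relative entropy: 1 = perfectly separated classes, 0 = no separation
relative_entropy <- function(post) {
  n <- nrow(post)
  k <- ncol(post)
  plogp <- ifelse(post > 0, post * log(post), 0)   # treat 0 * log(0) as 0
  1 - sum(-plogp) / (n * log(k))
}

# Hypothetical likelihoods for 3 people and 2 classes
lik  <- cbind(c(0.20, 0.02, 0.10), c(0.01, 0.15, 0.10))
post <- class_posteriors(lik, class_probs = c(0.6, 0.4))
print(round(post, 3))
print(relative_entropy(post))
```

Each row of `post` sums to one. Assigning each person to the class with the largest posterior probability gives the most likely class membership.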

OpenMx Exercise/Homework
- Adjust the GMM_example.R script to include:
  - A quadratic growth function
  - A third class
- Run it
- Re-run it
- Interpret the output
  - What are the classes?
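As a starting hint for the quadratic part, one common coding adds a third loading column with squared time scores. The base R sketch below shows only the loading values; the name `lambda_quad` is hypothetical, and the rest of the script (number of factors, factor means and variances) would need matching changes.

```r
nocc <- 4
time <- 0:(nocc - 1)

# Loadings for intercept, linear slope, and quadratic slope
lambda_quad <- cbind(intercept = 1, slope = time, quadratic = time^2)
print(lambda_quad)

# The number of growth factors becomes 3
nfac <- ncol(lambda_quad)
print(nfac)
```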