An Introduction to Latent Curve Models


Instructor: Shelley Blozis, UC Davis

Outline: longitudinal panel study design; latent curve models; missing data; some examples of fitting latent curve models using Mplus and SAS PROC MIXED.

Longitudinal Panel Design: A sample of subjects is observed at multiple points in time. The timing of the measurements can be the same for all individuals (a fixed occasions design), or it can vary between individuals (a varying occasions design).

Latent Growth Curve Models: One of several statistical methods for the analysis of longitudinal panel data. These models assume that all individuals in a population share the same functional form of change, but that the parameters of the function vary between individuals. This allows the individual curves to vary.

Brief detour: common factor analysis. The latent curve model is based on a latent variable model.

Common factor model: X1 = λ1ξ + δ1, X2 = λ2ξ + δ2, X3 = λ3ξ + δ3. In matrix form, X = Λξ + δ, where X is the set of manifest variables, Λ is the factor loading matrix, ξ is the set of factors (latent variables), and δ is the set of uniquenesses.

Common factor model with a mean structure: X = τ + Λξ + δ, where X is the set of manifest variables, τ is the set of intercepts, Λ is the factor loading matrix, ξ is the set of factors (latent variables), and δ is the set of uniquenesses.
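The moments implied by this model can be written out as a short sketch. The symbols κ (factor means), Φ (factor covariance matrix), and Θ (covariance matrix of the uniquenesses) do not appear on the slides and are assumed here for illustration, along with the usual assumptions that δ has mean zero and is uncorrelated with ξ:

```latex
\begin{aligned}
E(X) &= \tau + \Lambda\kappa, \\
\operatorname{Cov}(X) &= \Lambda\Phi\Lambda' + \Theta .
\end{aligned}
```

Under those assumptions, the mean structure carries information about the observed means and the covariance structure carries information about the observed variances and covariances.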

Latent Variables: Multiple manifest variables serve as indicators of an underlying, unobserved variable. The indicators are reflective of the construct. Examples: intelligence, self-esteem, life satisfaction.

Like the common factor analysis model with a mean structure, the latent curve model is used to account for the means, variances, and covariances of the observed scores measured over time.

Making the distinction: In factor analysis, the factors represent unobservable variables, such as intelligence or self-esteem. In a latent curve model, the factors represent unobservable characteristics of change; they are not latent variables in the standard sense.

Longitudinal panel study: Intellectual test scores for n = 100 children, each assessed up to four times at different ages. Let yti be an observed test score, where t is the occasion (t = 1,…,4) and i is the individual.

A latent curve model: yi = Ληi + εi. The model is based on a common factor model with a mean structure. Using this framework, specify a form of change for the response.

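If a measured response is modeled by a linear function of time, then a latent curve model could be specified as yti = η0i + η1i·timeti + εti, where timeti is the time of measurement at occasion t for individual i, η0i is individual i’s intercept, η1i is individual i’s slope, and εti is the residual.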

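Model intellectual test scores using a linear function of children’s age. The slope is individual i’s expected rate of change. With age uncentered, the intercept is the expected value of yi at Agei = 0; with age centered at 5 years, that is, using (Agei - 5), the intercept is the expected value of yi at Agei = 5.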
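The effect of centering on the intercept can be checked numerically. The sketch below uses hypothetical coefficients for one child (the values 10.0 and 2.0 are made up, and `predict` is an illustrative helper, not part of any package); the ages are the four assessment ages used in the example.

```python
def predict(intercept, slope, age, center=0.0):
    """Expected score from a straight line in (age - center)."""
    return intercept + slope * (age - center)

# Hypothetical coefficients for one child on the uncentered age scale.
raw_intercept, slope = 10.0, 2.0

# The intercept on the centered scale is the expected score at age 5.
centered_intercept = predict(raw_intercept, slope, age=5.0)

ages = [5.58, 7.83, 10.58, 17.17]
uncentered = [predict(raw_intercept, slope, a) for a in ages]
centered = [predict(centered_intercept, slope, a, center=5.0) for a in ages]
```

Both parameterizations describe the same line, so `uncentered` and `centered` agree at every age; only the meaning of the intercept changes, while the slope is unaffected.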

Linear Growth Model: From this model we can estimate each individual’s set of coefficients, and we can also estimate the population trajectory. But often the intent is to describe variation between individuals in the coefficients and possibly account for this variation.

Use the individual-specific coefficients to estimate each person’s trajectory. We can then work with the model to describe variation between individuals in their response level and rate of change.

Linear growth model: The coefficients are each the sum of a fixed effect (a constant for the population) and a random effect (unique to the individual).
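Written out, this decomposition might look as follows. The intercept and slope are written η0i and η1i and the random effects b0i and b1i, as elsewhere in these slides; the fixed-effect symbols α0 and α1 are an assumption made for this sketch:

```latex
\begin{aligned}
\eta_{0i} &= \alpha_0 + b_{0i}, \\
\eta_{1i} &= \alpha_1 + b_{1i}.
\end{aligned}
```

Here α0 and α1 would describe the population trajectory, while b0i and b1i describe how individual i departs from it.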

Model Assumptions: The residual. Assume that the responses of each individual follow an underlying trajectory; here we assume linear growth. The observations are assumed to be due to this trajectory plus other factors that are captured in the residual. These other factors include possible measurement error, as well as influences apart from time.

An illustration: Observed scores, which include measurement error, are shown against the “inherent” trajectory for an individual, which represents scores free of measurement error.

Model Assumptions: The residual (continued). Possible assumptions about the residuals: independent between individuals, independent within individuals, and constant variance across time.

Model Assumptions: The person-specific coefficients. Assume the coefficients vary between individuals according to the random effects, b0i and b1i.

An illustration: The “typical” trajectory, the “inherent” trajectory for an individual, and the observed scores. Source: Harring & Blozis (2014), Behav Res, 46, 372-384.

Model Assumptions: The person-specific coefficients (continued). Each individual can have a curve that is distinct from those of others. Under the model, b0i and b1i are often assumed to be normally distributed with means equal to 0. Each has a variance that describes individual differences, and the two random effects can covary.
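These distributional assumptions can be illustrated with a small simulation. The sketch below draws pairs (b0i, b1i) from a zero-mean bivariate normal distribution and adds them to fixed effects; every numeric value and the function name `simulate_random_effects` are hypothetical, chosen only for illustration.

```python
import math
import random

def simulate_random_effects(n, var_b0, var_b1, cov_b01, seed=0):
    """Draw n pairs (b0i, b1i) from a zero-mean bivariate normal distribution."""
    rng = random.Random(seed)
    # Cholesky factor of the 2 x 2 covariance matrix
    # [[var_b0, cov_b01], [cov_b01, var_b1]].
    l11 = math.sqrt(var_b0)
    l21 = cov_b01 / l11
    l22 = math.sqrt(var_b1 - l21 ** 2)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((l11 * z1, l21 * z1 + l22 * z2))
    return draws

# Hypothetical population values, not estimates from the slides.
fixed_intercept, fixed_slope = 20.0, 2.0
effects = simulate_random_effects(n=20000, var_b0=4.0, var_b1=1.0, cov_b01=0.5)

# Each person's coefficients are the fixed effect plus that person's random effect.
coefficients = [(fixed_intercept + b0, fixed_slope + b1) for b0, b1 in effects]
```

With a large simulated sample, the sample variances and covariance of the draws should land close to the values passed in, which is one way to see what the variance and covariance parameters describe.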

Model Assumptions: The population model describes the typical response. Figure adapted from Blozis & Harring (2016), Structural Equation Modeling, 23(6), 904-920.

Latent Curve Models versus Multilevel Models: We have options in how we approach the analysis: latent curve models, fit as a structural equation model (SEM), or multilevel models. It is possible to specify equivalent models under the two approaches and, by applying the same estimation method, obtain identical results.

Estimating the model, a latent curve model vs. a multilevel model: Use Mplus to take the latent curve model approach and SAS PROC MIXED (or R) to take the multilevel model approach. Fit a linear growth model to repeated measures of intellectual ability.

Latent Curve Model Approach: The intercept and slope are the latent variables. Time (age) is incorporated into the model as specific, constrained values of the factor loadings, all possibly unique to each person.

The factor loadings are specified as fixed, constrained to specific values. The means of ‘Int’ and ‘Slope’ relate to the population model. The variances of ‘Int’ and ‘Slope’ relate to variation in the level and rate of change across individuals. ‘Int’ and ‘Slope’ are free to covary.

Data model: yi = Ληi + εi. The factor loading matrix, Λ, reflects assumptions about the pattern of change in the response variable. Each column is known as a ‘basis function’. For linear change, Λ has two columns: the first is a column of ones to represent the intercept, and the second is a column with values equal to the times of measurement. In our example, time is represented by the child’s age at each assessment; for a child assessed at ages 5.58, 7.83, 10.58, and 17.17, the rows of Λ are (1, 5.58), (1, 7.83), (1, 10.58), and (1, 17.17).

Data model: yi = Ληi + εi. The factor, ηi, is unknown and varies across individuals. For an individual, the elements of ηi represent different aspects of change in y. For linear change, ηi contains two factors, ηi = (η0i, η1i)', where η0i is individual i’s intercept and η1i is individual i’s slope. The factors are weights, each linked to a basis function defined in the factor loading matrix.

Data model: yi = Ληi + εi. According to the model, an observed score is a weighted linear combination of the basis functions plus a residual. The underlying trajectory for an individual is given by Ληi. The residual is the difference between the observed scores and the individual’s underlying trajectory.
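The pieces of the data model can be computed directly for a single child. In this sketch the ages are the four assessment ages from the example, while the factor scores in `eta` and the scores in `observed` are hypothetical values chosen for illustration.

```python
# Ages at the four assessments for one child, taken from the example.
ages = [5.58, 7.83, 10.58, 17.17]

# Factor loading matrix: a column of ones (intercept) and a column of ages (slope).
Lambda = [[1.0, age] for age in ages]

# Hypothetical factor scores and observed scores for this child.
eta = [10.0, 2.0]                      # (intercept, slope)
observed = [21.0, 26.0, 30.5, 45.0]

# Underlying trajectory: each row of Lambda weighted by the child's factor scores.
trajectory = [sum(l * e for l, e in zip(row, eta)) for row in Lambda]

# Residual: observed score minus the underlying trajectory at each occasion.
residuals = [y - m for y, m in zip(observed, trajectory)]
```

In practice the factor scores are not observed; the model is fit to estimate the population parameters, and the scores for an individual can be estimated afterwards.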

Estimation using Mplus: Obtain maximum likelihood estimates of the model parameters: the factor means, the factor variances and their covariance, and a common variance of the time-specific residuals.
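To see how these parameters account for the means, variances, and covariances of the repeated measures, the model-implied moments can be computed by hand. The sketch below assumes a linear model with a common residual variance; the function name `implied_moments` and all parameter values are hypothetical, not estimates from the slides.

```python
def implied_moments(ages, factor_means, factor_cov, residual_var):
    """Model-implied mean vector and covariance matrix for a linear latent curve model."""
    Lambda = [[1.0, a] for a in ages]
    # Implied means: Lambda times the vector of factor means.
    mean = [sum(l * m for l, m in zip(row, factor_means)) for row in Lambda]
    cov = []
    for r, row_r in enumerate(Lambda):
        cov_row = []
        for c, row_c in enumerate(Lambda):
            # Element (r, c) of Lambda * factor_cov * Lambda'.
            value = sum(row_r[j] * factor_cov[j][k] * row_c[k]
                        for j in range(2) for k in range(2))
            # The common residual variance is added on the diagonal only.
            if r == c:
                value += residual_var
            cov_row.append(value)
        cov.append(cov_row)
    return mean, cov

# Hypothetical parameter values and measurement times.
mean, cov = implied_moments(
    ages=[0.0, 1.0, 2.0, 3.0],
    factor_means=[20.0, 2.0],
    factor_cov=[[4.0, 0.5], [0.5, 1.0]],
    residual_var=3.0,
)
```

Maximum likelihood estimation chooses the parameter values whose implied moments make the observed data most likely under a normal model.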

Setting up the data file: Using the latent variable model approach, set up the data file in wide format.

Save as an ASCII file with no header, and indicate missing data.

Mplus syntax for fitting a linear growth model with random effects: The | symbol is used in conjunction with TYPE=RANDOM to name and define the random effects.

Mplus syntax for fitting a linear growth model: Specify ‘maximum likelihood’ estimation; other estimation options are available. The model assumes that the variances of the residuals are constant across time, and “(1)” constrains the residual variances to be equal.

Results from Mplus

Fitting the model using a multilevel model approach: Use SAS PROC MIXED with the same estimator as was used in Mplus (maximum likelihood). PROC MIXED requires data in long format.
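The difference between the two layouts can be shown with a small restructuring sketch. The column names (`age1`, `score1`, and so on), the scores, and the helper `wide_to_long` are hypothetical; only the `famid` identifier and the ages come from the example, and pairing these ages with these two children is itself an assumption.

```python
def wide_to_long(rows, n_waves):
    """Convert one-row-per-child records into one-row-per-assessment records."""
    long_rows = []
    for row in rows:
        for wave in range(1, n_waves + 1):
            score = row[f"score{wave}"]
            if score is None:
                continue  # a wave without a score contributes no row
            long_rows.append({"famid": row["famid"], "wave": wave,
                              "age": row[f"age{wave}"], "score": score})
    return long_rows

# Hypothetical wide-format records; None marks a missed assessment.
wide = [
    {"famid": 1, "age1": 5.58, "age2": 7.83, "age3": 10.58, "age4": 17.17,
     "score1": 21.0, "score2": 26.0, "score3": 30.5, "score4": 45.0},
    {"famid": 2, "age1": 2.33, "age2": 5.17, "age3": 11.42, "age4": None,
     "score1": 12.0, "score2": 19.5, "score3": 33.0, "score4": None},
]
long_data = wide_to_long(wide, n_waves=4)
```

In the long layout each child contributes as many rows as completed assessments, so a child with three waves simply has three rows.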

Data file in wide format: data for the first child (famid = 1) and data for the 2nd child (famid = 2).

Bring data into SAS

PROC MIXED syntax for fitting a linear growth model with random coefficients

Results from SAS PROC MIXED: fixed intercept and slope, variance of the random intercept, covariance between the random intercept and slope, and variance of the random slope.

Mplus and PROC MIXED comparison

Missing Data: Missing data are often encountered in longitudinal panel studies. Participants may miss one or more of the planned assessments, including those who drop out of a study and do not return.

Intelligence test scores: Only 16% of cases have complete data for all 4 waves.

How latent curve models and multilevel models handle missing data: Even though many participants have incomplete data for the 4 waves, all participants are included in the analysis. This is due to the method of estimation, maximum likelihood, which does not require complete data for the response variable.
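One way to see why complete data are not required is to write the log-likelihood as a sum of individual contributions, each based only on the occasions that person was observed. The symbols α (factor means), Ψ (factor covariance matrix), σ² (common residual variance), and n_i (number of observed occasions for individual i) are assumptions made for this sketch:

```latex
\ell(\alpha, \Psi, \sigma^2) = \sum_{i=1}^{n} \log \mathcal{N}\!\left(y_i \,;\; \Lambda_i \alpha,\; \Lambda_i \Psi \Lambda_i' + \sigma^2 I_{n_i}\right)
```

Because y_i and Λ_i have n_i rows, a person with three observed waves contributes a three-dimensional normal density and a person with four waves contributes a four-dimensional one.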

In the intelligence study, scores are studied according to age, which differs between children. For children who have complete data for all 4 waves, Λ is composed of 4 rows and 2 columns, with ages unique to the child; for example, rows (1, 5.58), (1, 7.83), (1, 10.58), and (1, 17.17). For children who have data for waves 1-3 but not 4, Λ is composed of 3 rows and 2 columns, with ages unique to the child; for example, rows (1, 2.33), (1, 5.17), and (1, 11.42).
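A person-specific loading matrix of this kind can be built from whatever ages are available. The helper name `loading_matrix` and the use of `None` to mark a missed wave are assumptions for this sketch; the ages are those shown in the example.

```python
def loading_matrix(ages):
    """Linear-growth loading matrix using only the occasions a child was observed."""
    return [[1.0, age] for age in ages if age is not None]

# A child with all four waves, and a child who missed wave 4.
complete = loading_matrix([5.58, 7.83, 10.58, 17.17])
three_waves = loading_matrix([2.33, 5.17, 11.42, None])
```

The first matrix has 4 rows and the second has 3, while both keep the same 2 columns, so the same intercept and slope factors apply to every child.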

Because of the way in which the parameters of the models are estimated, cases with missing response data (e.g., intelligence test scores) can still contribute their observed scores to the analysis. This is different from other statistical methods, such as ANOVA, that require complete data for all cases for estimation.

Assumptions about missing data: For complete-case methods (e.g., ANOVA), data are assumed to be missing completely at random (MCAR). MCAR means that whether data are missing is independent of both the missing values and any observed data. For available-case methods (e.g., latent curve models), data are assumed to be missing at random (MAR). MAR means that, given the observed data, whether data are missing is independent of the missing values; missingness may be related to observed data.

Back to our example: Linear growth is one possible model to describe test scores. Can we improve model fit by considering a nonlinear form of change, such as a quadratic growth model?

Interpretation: Age is centered at 5 years. The coefficients are the intercept, the linear slope, and the quadratic slope.
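For quadratic growth, the loading matrix gains a third basis function. The sketch below assumes the columns are 1, (age - 5), and (age - 5) squared, which is a common way to set up a quadratic model with age centered at 5; the helper name `quadratic_loadings` is hypothetical.

```python
def quadratic_loadings(ages, center=5.0):
    """Loading matrix with intercept, linear, and quadratic columns for centered age."""
    return [[1.0, age - center, (age - center) ** 2] for age in ages]

# Ages at the four assessments for one child, taken from the example.
Lambda = quadratic_loadings([5.58, 7.83, 10.58, 17.17])

# At exactly age 5 only the intercept column is nonzero.
at_center = quadratic_loadings([5.0])
```

With this coding, the intercept is the expected score at age 5 and the linear slope is the instantaneous rate of change at age 5, since the quadratic column and its derivative both vanish there.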

Slight modification of the Mplus syntax

Quadratic Growth Model - Results

Comparison of Model Fit Linear Growth Quadratic Growth

For nested models, we can use the likelihood ratio test to compare fit. Calculate the deviance for each model, where deviance = -2*loglikelihood. The test statistic is a chi-square equal to the difference in deviances. Linear growth: deviance = -2*(-387.508) = 775.016. Quadratic growth: deviance = -2*(-363.494) = 726.988. Chi-square = 775.016 - 726.988 = 48.028. The df is the difference in degrees of freedom between the models, here df = 4. A chi-square of 48.028 with 4 df gives p < .0001.
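The arithmetic of this test can be reproduced in a few lines. The log-likelihoods are the values reported above; the helper `chi_square_sf_even_df` is written for this sketch and uses a closed form for the upper-tail probability that holds only when the df is even, which is the case here.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Log-likelihoods reported for the two fitted models.
loglik_linear, loglik_quadratic = -387.508, -363.494

deviance_linear = -2.0 * loglik_linear
deviance_quadratic = -2.0 * loglik_quadratic
chi_square = deviance_linear - deviance_quadratic
p_value = chi_square_sf_even_df(chi_square, df=4)
```

The resulting `p_value` is far below .0001, in line with the conclusion that the quadratic model fits better than the linear model.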

Accounting for between-person differences in the random effects: Mother’s IQ test score is the person-level covariate, and the repeated measures of child test score are the response.
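A conditional model of this kind might be sketched as follows, shown for the intercept and linear slope only. The symbols α0, α1, γ0, γ1, and MomIQ are assumptions for illustration, as is the choice to regress both factors on the covariate; the slides do not show which coefficients were conditioned on mother’s IQ:

```latex
\begin{aligned}
\eta_{0i} &= \alpha_0 + \gamma_0\,\mathrm{MomIQ}_i + b_{0i}, \\
\eta_{1i} &= \alpha_1 + \gamma_1\,\mathrm{MomIQ}_i + b_{1i}.
\end{aligned}
```

In such a model, γ0 and γ1 would describe how the level and rate of change differ with mother’s IQ, and the variances of b0i and b1i would describe the variation that remains after accounting for it.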

Mplus syntax for a conditional growth model

Conditional Growth Model - Results


One more model: Evaluate our assumptions about the within-subject residuals. The typical assumption is that residuals are independent with constant variance across time. Other structures are possible, such as autocorrelation, or independence with heterogeneous variances across time.
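These alternatives correspond to different forms of the within-subject residual covariance matrix. The sketch below builds three such matrices for four waves; the function names and numeric values are hypothetical, and the first-order autoregressive form shown is only one way to model autocorrelation (it treats the waves as equally spaced).

```python
def independent_constant(n_occasions, variance):
    """Independent residuals with one variance shared by all occasions."""
    return [[variance if r == c else 0.0 for c in range(n_occasions)]
            for r in range(n_occasions)]

def independent_heterogeneous(variances):
    """Independent residuals with a separate variance at each occasion."""
    n = len(variances)
    return [[variances[r] if r == c else 0.0 for c in range(n)] for r in range(n)]

def first_order_autoregressive(n_occasions, variance, rho):
    """AR(1) residuals: correlation rho**lag between occasions that are lag waves apart."""
    return [[variance * rho ** abs(r - c) for c in range(n_occasions)]
            for r in range(n_occasions)]

# Hypothetical values for four waves.
constant = independent_constant(4, variance=3.0)
heterogeneous = independent_heterogeneous([2.0, 2.5, 3.0, 4.0])
autoregressive = first_order_autoregressive(4, variance=3.0, rho=0.5)
```

The constant structure uses one parameter, the heterogeneous structure uses one per wave, and the autoregressive structure uses two, so the choice trades flexibility against the number of parameters to estimate.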

Allow for autocorrelation between adjacent residuals

Results

Just an introduction to latent curve models: As a framework for the analysis of longitudinal panel data, these models offer many possibilities. Are there any benefits to a latent curve model approach versus multilevel? Yes!

Resources:
Duncan, T. E., Duncan, S. C., & Strycker, L. A. (2006). An introduction to latent variable growth curve modeling: Concepts, issues, and application (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Bollen, K. A., & Curran, P. J. (2006). Latent curve models: A structural equation perspective. Hoboken, NJ: Wiley.
Singer, J. D. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of Educational and Behavioral Statistics, 23, 323-355.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York: Oxford University Press.