Longitudinal Data Analysis: Why and How to Do It With Multi-Level Modeling (MLM)?
Oi-man Kwok, Texas A&M University


Road Map
Why do we want to analyze longitudinal data under the multilevel modeling (MLM) framework?
– Dependency issue
– Advantages of using MLM over traditional methods (e.g., univariate ANOVA, multivariate ANOVA)
– Review of important parameters in MLM
How can we do it in SPSS?

Regression Model
e.g., DV: test scores in a first-year graduate-level statistics course
IV: GRE_M (GRE Math test score)
150 students (i = 1, …, 150)
What is one of the important assumptions of OLS regression? (Observations are independent of each other.)

Ignoring the clustered structure (or dependency between observations) in the analyses can result in:
– Bias in the standard errors
– Bias in the tests of significance and confidence intervals (Type I errors: inflated alpha level, e.g., nominal α = .05 but actual α = .10)
→ Non-replicable results
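One way to get a feel for how much ignored dependency matters is the Kish design effect, 1 + (m − 1)ρ, for clusters of size m with intraclass correlation ρ. The formula is not on the slides, it strictly applies to estimating a mean with equal cluster sizes and an exchangeable correlation, and the ICC value below is hypothetical. The sketch uses the design of the example later in the talk (78 students, 4 occasions each).

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters with intraclass correlation icc."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_obs, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return n_obs / design_effect(cluster_size, icc)


# 78 students x 4 occasions; icc = 0.5 is a hypothetical value, not an estimate.
n_students, n_occasions, icc = 78, 4, 0.5
n_obs = n_students * n_occasions
deff = design_effect(n_occasions, icc)
n_eff = effective_sample_size(n_obs, n_occasions, icc)
print(f"design effect = {deff:.2f}, effective n = {n_eff:.1f} of {n_obs}")
```

Treating all repeated observations as independent acts as if the full count of observations were available, which is why the standard errors come out too small.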

Advantages of MLM over the traditional methods for analyzing longitudinal data
– Univariate ANOVA: restricts the error structure to compound symmetry (CS); higher statistical power, but the CS assumption is unlikely to be met in longitudinal data
– Multivariate ANOVA: places no restriction on the error structure (unstructured, UN); often too conservative, with lower statistical power, and it can only handle completely balanced data (listwise deletion)
More…
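The contrast between the two error structures can be made concrete by building a CS matrix and counting parameters. This is a minimal sketch: the function names are made up for illustration, and the variance and covariance values are hypothetical.

```python
def compound_symmetry(n_times, variance, covariance):
    """CS structure: one common variance on the diagonal, one common covariance elsewhere."""
    return [
        [variance if r == c else covariance for c in range(n_times)]
        for r in range(n_times)
    ]


def n_params_cs():
    """CS needs two parameters, regardless of the number of occasions."""
    return 2


def n_params_un(n_times):
    """UN estimates every distinct variance and covariance: T(T + 1) / 2."""
    return n_times * (n_times + 1) // 2


# Four occasions; the values 100.0 and 60.0 are hypothetical.
cs = compound_symmetry(4, variance=100.0, covariance=60.0)
for row in cs:
    print(row)
print("CS parameters:", n_params_cs(), "| UN parameters:", n_params_un(4))
```

With four occasions, UN spends ten parameters where CS spends two, which is one way to see why the multivariate approach tends to have lower power.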

Analyzing Longitudinal Data: Example (based on actual data; variable names changed for ease of presentation)
Compare two different teaching methods on achievement over time.
Teaching methods: 78 students were randomly assigned to either
A. Lecture (control group; 39 students) or
B. Computer (treatment group; 39 students)
Four achievement (Ach) scores were collected from each student after the treatment (i.e., the statistics course): right after the course and 1, 2, and 3 years after.

[Figure: Achievement (y-axis) against Time in years (x-axis: 0, 1, 2, 3) for the Computer and Lecture groups. Time = 0 is the immediate posttest measure.]

Multi-Level Model (MLM)
Note: Start with a simple growth model; the treatment is introduced in the example at the end.
A simple regression model for ONE student (Student 36), t = 0, 1, 2, 3:
Ach_t = β0 + β1 Time_t + e_t
[Figure: Ach_t against Time_t (0 to 3) for Student 36, showing the fitted line with intercept β0 and slope β1 and the residuals e0, e1, e2, e3.]
e_t: captures the variation of the individual achievement scores around the fitted regression line WITHIN Student 36; V(e_ti) = σ²
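The single-student regression can be sketched with ordinary least squares. The four scores below are hypothetical (the slide's data points are not available), and `fit_line` is an illustrative helper, not part of any SPSS workflow.

```python
def fit_line(times, scores):
    """Ordinary least squares for one student: returns (intercept, slope)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(scores) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


# Hypothetical achievement scores for one student at t = 0, 1, 2, 3.
times = [0, 1, 2, 3]
scores = [50.0, 54.0, 61.0, 63.0]

b0, b1 = fit_line(times, scores)
# e_t: deviation of each observed score from the student's own fitted line.
residuals = [y - (b0 + b1 * t) for t, y in zip(times, scores)]
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print("residuals e_t:", [round(e, 2) for e in residuals])
```

The residuals are the within-student part of the model; their variance is what σ² summarizes across all students.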

Compare to the Micro Level Model (i = 1, 2, 3, …, 78)
[Figure: separate regression lines of Ach_ti on Time_ti (0 to 3) for Students 27, 36, and 52, with intercepts β0_Student 27, β0_Student 36, β0_Student 52 and slopes β1_Student 27, β1_Student 36.]

[Figure: the same student-specific regression lines (Students 27, 36, and 52), now summarized by four quantities.]
– Grand intercept
– Grand slope
– Variance of the intercepts
– Variance of the slopes
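A rough two-stage illustration of these four summaries: fit one line per student, then average the intercepts and slopes and take their variances. The scores are hypothetical (only the student labels come from the slide). This is not how a multilevel model is estimated: the sample variance of separately fitted coefficients also contains sampling error, so it only approximates what τ00 and τ11 represent.

```python
from statistics import mean, variance


def fit_line(times, scores):
    """OLS intercept and slope for one student's repeated measures."""
    mean_t, mean_y = mean(times), mean(scores)
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores))
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope


times = [0, 1, 2, 3]
# Hypothetical scores; the student labels follow the slide, the numbers do not.
scores_by_student = {
    "Student 27": [40.0, 42.0, 44.0, 46.0],
    "Student 36": [50.0, 55.0, 60.0, 65.0],
    "Student 52": [60.0, 68.0, 76.0, 84.0],
}

fits = {sid: fit_line(times, ys) for sid, ys in scores_by_student.items()}
intercepts = [b0 for b0, _ in fits.values()]
slopes = [b1 for _, b1 in fits.values()]

grand_intercept = mean(intercepts)      # rough analogue of gamma_00
grand_slope = mean(slopes)              # rough analogue of gamma_10
var_intercepts = variance(intercepts)   # rough analogue of tau_00
var_slopes = variance(slopes)           # rough analogue of tau_11
print(grand_intercept, grand_slope, var_intercepts, var_slopes)
```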

Overall Model
[Figure: Ach against Time, with the lines for Students 27, 36, and 52 sharing the intercept γ00 at Time = 0: no variation among the 78 intercepts.]
U0i: captures the deviations of the 78 intercepts from the grand intercept γ00
U1i: captures the deviations of the 78 slopes from the grand slope γ10

[Figure: Ach against Time, with the overall model and the lines for Students 27, 36, and 52 all sharing the slope γ10: no variation among the 78 slopes.]

[Figure: Ach against Time, showing the overall model.]

Summary
Fixed effects: grand intercept and grand slope
G: captures between-student differences (variance of the intercepts, variance of the slopes, covariance between intercepts and slopes)
R: captures within-student random errors, V(e_ti) = σ²

MACRO vs. MICRO
UNITS:
         Educational study   Family study    Longitudinal study
MACRO    School/Class        Family          Individual
MICRO    Student             Family member   Repeated observations

MACRO vs. MICRO (Cont.)
MODELS:
– MICRO level model: the regression model fitted to the observations within each MACRO unit
– MACRO level model: the model that captures the differences between the overall model and the individual regression models of the different macro units

Dependent variable: Math achievement (Achieve; repeated measures, micro level)
Predictors:
– Repeated-measure (MICRO) level predictor: Time (and any time-varying covariates)
– Student (MACRO) level predictor: Computer (different teaching methods) (and any time-invariant variables such as gender)

Data format under the MANOVA approach (SPSS data format):
[Table: columns Student, Treat, T0, T1, T2, T3; one row each for S1, S2, and S3; cell values not preserved]
– S1 has responses at all time points
– S2 has a missing response at time 2 (indicated by "--")
– S3 has a missing response at time 0
MANOVA retains only S1 in the analysis.

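Data format for MANOVA: one row per student
[Table: columns Student, Treat, T0, T1, T2, T3; rows S1 to S3; cell values not preserved]
Data format for the multilevel model: one row per student per observed time point (all 3 students are included in the analyses)
[Table: columns Student, Treat, Time, DV; 10 rows; cell values not preserved]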

[Table: columns Student, Treat, Time, DV; 12 rows; cell values not preserved]
Can you transform this dataset back into the multivariate format?
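The two layouts can be converted into each other mechanically. The sketch below uses hypothetical scores that follow the missing-data pattern described earlier (S1 complete, S2 missing at time 2, S3 missing at time 0); the function names are illustrative. It shows that the long format keeps every observed score, while a complete-case filter keeps only S1.

```python
# Hypothetical wide (multivariate) records; None marks a missing response.
wide = [
    {"Student": "S1", "Treat": 1, "T0": 50, "T1": 54, "T2": 61, "T3": 63},
    {"Student": "S2", "Treat": 0, "T0": 48, "T1": 51, "T2": None, "T3": 55},
    {"Student": "S3", "Treat": 1, "T0": None, "T1": 57, "T2": 60, "T3": 66},
]
TIME_COLUMNS = ["T0", "T1", "T2", "T3"]


def wide_to_long(rows):
    """One output row per observed (student, time) pair; missing occasions are skipped."""
    long_rows = []
    for row in rows:
        for time, column in enumerate(TIME_COLUMNS):
            if row[column] is not None:
                long_rows.append({"Student": row["Student"], "Treat": row["Treat"],
                                  "Time": time, "DV": row[column]})
    return long_rows


def long_to_wide(long_rows):
    """Rebuild one row per student; occasions with no record become None."""
    by_student = {}
    for row in long_rows:
        record = by_student.setdefault(
            row["Student"],
            {"Student": row["Student"], "Treat": row["Treat"],
             **{column: None for column in TIME_COLUMNS}},
        )
        record[TIME_COLUMNS[row["Time"]]] = row["DV"]
    return list(by_student.values())


long_rows = wide_to_long(wide)
complete_cases = [r for r in wide if all(r[c] is not None for c in TIME_COLUMNS)]
round_trip = long_to_wide(long_rows)
print(len(long_rows), "long rows;", len(complete_cases), "complete case for MANOVA")
```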

Questions
1. On average, is there any trend in math achievement over time?
2. Are there any differences between students in the trend of math achievement over time? (Do all students have the same trend of math achievement over time?)

Micro Level (Level 1): MA_ti = β0i + β1i Time_ti + e_ti
Macro Level (Level 2):
β0i = γ00 + U0i (γ00: grand intercept)
β1i = γ10 + U1i (γ10: grand slope)

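Micro Level, Macro Level, and Combined Model
Combined model: MA_ti = γ00 + γ10 Time_ti + U0i + U1i Time_ti + e_ti
Grand intercept: γ00; grand slope: γ10
Between-student differences: U0i and U1i Time_ti, with V(U0i) = τ00 and V(U1i) = τ11
Within-student errors: e_ti, with V(e_ti) = σ²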
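Written out, the combined model comes from substituting the macro-level equations into the micro-level equation. The last line is not on the slides: it is the variance implied by the random part, assuming e_ti is independent of U0i and U1i and writing τ01 for the covariance between U0i and U1i.

```latex
\begin{align*}
MA_{ti} &= \beta_{0i} + \beta_{1i}\,\mathrm{Time}_{ti} + e_{ti} && \text{(micro level)} \\
\beta_{0i} &= \gamma_{00} + U_{0i}, \qquad \beta_{1i} = \gamma_{10} + U_{1i} && \text{(macro level)} \\
MA_{ti} &= \underbrace{\gamma_{00} + \gamma_{10}\,\mathrm{Time}_{ti}}_{\text{fixed part}}
         + \underbrace{U_{0i} + U_{1i}\,\mathrm{Time}_{ti} + e_{ti}}_{\text{random part}} && \text{(combined)} \\
V(MA_{ti} \mid \mathrm{Time}_{ti}) &= \tau_{00} + 2\,\tau_{01}\,\mathrm{Time}_{ti}
         + \tau_{11}\,\mathrm{Time}_{ti}^{2} + \sigma^{2}
\end{align*}
```

Because the implied variance changes with Time whenever τ11 or τ01 is nonzero, the model does not force the constant-variance pattern that compound symmetry requires.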

[Figure legend: red = Computer; blue = Lecture]

MA_ti = γ00 + γ10 Time_ti + U0i + U1i Time_ti + e_ti

SPSS MIXED syntax:

    MIXED mathach WITH Time
      /METHOD = REML
      /FIXED = intercept Time
      /RANDOM = intercept Time | SUBJECT(Subid) COVTYPE(UN)
      /PRINT = G SOLUTION TESTCOV.
    EXECUTE.

Notes on the syntax:
– MIXED line: DV WITH continuous IV BY categorical IV
– /METHOD: the default is REML (restricted maximum likelihood); the other option is ML (maximum likelihood)
– /FIXED: captures the overall model
– /RANDOM: specifies the random effects, which capture the between-student differences
– SUBJECT: identity variable for the macro-level units (e.g., Subid)
– COVTYPE: structure of the G matrix (UN = unstructured)
– /PRINT: G prints the G matrix; SOLUTION requests the regression coefficients; TESTCOV produces asymptotic standard errors and Wald Z-tests for the covariance parameter estimates

SPSS Output: Basic Information


Requested by the SOLUTION keyword in the PRINT statement (Line 5):
– γ00: average MA score at Time = 0
– γ10: average trend of the MA score

Requested by the G keyword in the PRINT statement (Line 5): the G matrix
    τ00  τ01
    τ10  τ11
Requested by the TESTCOV keyword in the PRINT statement (Line 5): asymptotic standard errors and Wald Z-tests for τ00, τ10 (= τ01), τ11, and σ²

Compare models with the likelihood ratio test!
Can I have a simpler G matrix (i.e., τ01 = τ10 = 0)? Compare the -2LL values of the two models.

Syntax for fitting the simpler G
SPSS syntax:
    /RANDOM = intercept Time | SUBJECT(Subid) COVTYPE(DIAG)

Comparison of -2 Res Log Likelihood (or deviance) [values not preserved]:
– Model with τ01 = τ10 = 0 (choose this)
– Model with τ01 = τ10 ≠ 0
χ²(1) = .000, p = 1.00

Compare to the model with τ11 = 0
SPSS syntax:
    /RANDOM = intercept | SUBJECT(Subid) COVTYPE(DIAG)

Comparison of -2 Res Log Likelihood [values not preserved]:
– Model with τ01 = τ10 = 0, τ11 ≠ 0 (choose this)
– Model with τ11 = τ01 = τ10 = 0
χ²(1) = 14.51, p < .001 (halved p-value)
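The p-value for a 1-df likelihood ratio test can be checked by hand. The sketch below takes the χ² value reported on the slide as input (the two deviances themselves are not available). Reading "halved p-value" as the usual adjustment for testing a variance, whose null value of zero lies on the boundary of the parameter space, is an interpretation; the slide does not spell it out.

```python
import math


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


def lrt_pvalues(deviance_difference):
    """Return (naive p, halved p) for a 1-df likelihood ratio test.

    The halved value reflects the boundary adjustment often used when the
    tested parameter is a variance that cannot be negative.
    """
    naive = chi2_sf_df1(deviance_difference)
    return naive, naive / 2.0


# 14.51 is the chi-square reported on the slide for the test of tau_11 = 0.
naive_p, halved_p = lrt_pvalues(14.51)
print(f"chi2(1) = 14.51: p = {naive_p:.5f}, halved p = {halved_p:.5f}")
```

Either way the p-value is well below .001, so the model that keeps the slope variance τ11 is retained.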

Result of the final model
[SPSS output with the estimates of γ00, γ10, τ00, τ11, and σ²; values not preserved]

Q1. On average, is there any trend in math achievement over time?
Q2. Are there any differences between students in the trend of math achievement over time? (Or, do all students have the same trend of math achievement over time?)
τ00 = [value not preserved], τ11 = [value not preserved]
Q3. If yes to Q2, what causes the differences?

Micro Level (Level 1):
MA_ti = β0i + β1i Time_ti + e_ti (variance of e_ti = σ²)
Macro Level (Level 2):
β0i = γ00 + γ01 Comp_i + U0i
β1i = γ10 + γ11 Comp_i + U1i
(variance of U0i = τ00; variance of U1i = τ11)
Combined Model:
MA_ti = γ00 + γ01 Comp_i + γ10 Time_ti + γ11 Time_ti*Comp_i + U0i + U1i Time_ti + e_ti
Null hypothesis: the different teaching methods have the SAME effect on achievement over time (H0: γ11 = 0)

MA_ti = γ00 + γ01 Comp_i + γ10 Time_ti + γ11 Time_ti*Comp_i + U0i + U1i Time_ti + e_ti

SPSS MIXED syntax (Comp is listed after WITH so that it can be used in /FIXED; this assumes Comp is a 0/1 dummy variable):

    MIXED mathach WITH Time Comp
      /METHOD = REML
      /FIXED = intercept Comp Time Time*Comp
      /RANDOM = intercept Time | SUBJECT(Subid) COVTYPE(DIAG)
      /PRINT = G SOLUTION TESTCOV.
    EXECUTE.

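[SPSS output: covariance parameter estimates without Comp in the macro models; values not preserved]
[SPSS output: covariance parameter estimates with Comp in the macro models; values not preserved]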

Comparing the variance estimates WITHOUT and WITH "Comp" in the model:
– Proportion of variance in the intercept (τ00) explained by "Comp" = .13 (or 13%)
– Proportion of variance in the slope (τ11) explained by "Comp" = (14.56 - [τ11 with Comp, value not preserved]) / 14.56 = .33 (or 33%)
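The calculation behind these proportions is a proportional reduction in variance: the drop in a random-effect variance after adding the predictor, divided by the variance before adding it. The numbers in this sketch are purely illustrative and are not the estimates from the slides.

```python
def proportion_explained(var_without, var_with):
    """Proportional reduction in a random-effect variance after adding a predictor."""
    return (var_without - var_with) / var_without


# Illustrative values only; they are NOT the estimates from the SPSS output.
tau11_without_comp = 20.0
tau11_with_comp = 15.0
share = proportion_explained(tau11_without_comp, tau11_with_comp)
print(f"proportion of slope variance explained = {share:.2f}")
```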

Solution for Fixed Effects
Effect       Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept                                               <.0001
time
computer
time*comp
[Only the p-value in the Intercept row survives; the other cell values are not preserved.]

Overall model for students in the Lecture method group: [equation not preserved]
Overall model for students in the Computer method group: [equation not preserved]
Random effect: V(e_ti) = σ² = 90.00
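In symbols, the two group-specific overall models follow from the fixed part of the combined model by plugging in the value of Comp. The coding Comp = 0 for Lecture and Comp = 1 for Computer is an assumption here; the transcript does not state it.

```latex
\begin{align*}
\text{Lecture } (\mathrm{Comp}_i = 0):\quad
  & \widehat{MA}_{ti} = \gamma_{00} + \gamma_{10}\,\mathrm{Time}_{ti} \\
\text{Computer } (\mathrm{Comp}_i = 1):\quad
  & \widehat{MA}_{ti} = (\gamma_{00} + \gamma_{01}) + (\gamma_{10} + \gamma_{11})\,\mathrm{Time}_{ti}
\end{align*}
```

Under this coding, γ01 is the group difference at Time = 0 and γ11 is the difference in slopes, which is why the null hypothesis of equal change over time is H0: γ11 = 0.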

[Figure: Achievement (y-axis) against Time in years (x-axis) for the Computer and Lecture groups. Time = 0 is the immediate posttest measure.]

Conclusion
Advantages of using MLM over traditional ANOVA approaches for analyzing longitudinal data:
– 1. Can flexibly model the variance function
– 2. Retains the meaning of the random effects
– 3. Can explore factors that predict individual differences in change over time (e.g., the treatment effect)
– 4. Takes both unequal spacing and missing data into account

Take Home Exercise
A clinical psychologist wants to examine the impact of the stress level of each family member (STRESS) on his/her level of symptomatology (SYMPTOM). There are 100 families, and families vary in size from three to eight members. The total number of participants is 400.
a) Can you write out the model? (Hint: What is in the micro model? What is in the macro model?)
b) Can you write out the syntax (SPSS) to analyze this model?
c) In designing the study, what possible macro predictors do you think the clinical psychologist should include in her study? (e.g., family size?)
d) In designing the study, what possible micro predictors do you think the clinical psychologist should include in her study? (e.g., participant's neuroticism?)
e) Can you write out the model? (Hint: What is in the micro model? What is in the macro model?)
f) Can you write out the syntax (SPSS) to analyze this model?

a) Micro-level model: SYMPTOM_ij = β0j + β1j STRESS_ij + e_ij
Macro-level model:
β0j = γ00 + U0j
β1j = γ10 + U1j
Combined model: SYMPTOM_ij = γ00 + γ10 STRESS_ij + U0j + U1j STRESS_ij + e_ij

b) SYMPTOM_ij = γ00 + γ10 STRESS_ij + U0j + U1j STRESS_ij + e_ij
SPSS syntax:

    MIXED Symptom WITH Stress
      /FIXED = intercept Stress
      /RANDOM = intercept Stress | SUBJECT(Family) COVTYPE(UN)
      /PRINT = G SOLUTION TESTCOV.
    EXECUTE.

THE END! THANK YOU!