© Willett, Harvard University Graduate School of Education, 5/21/2015. S052/I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data

Presentation transcript:

S052/I.3(b) – Slide 1
S052/I.3(b): Applied Data Analysis
Roadmap of the Course – What Is Today’s Topic Area?

More details can be found in the “Course Objectives and Content” handout on the course webpage.

Multiple Regression Analysis (MRA)
• Do your residuals meet the required assumptions?
    • Test for residual normality.
    • Use influence statistics to detect atypical datapoints.
• If your residuals are not independent, replace OLS by GLS regression analysis. (Today’s Topic Area)
    • Use individual growth modeling.
    • Specify a multilevel model.
• If your sole predictor is continuous, MRA is identical to correlational analysis.
• If your sole predictor is dichotomous, MRA is identical to a t-test.
• If your several predictors are categorical, MRA is identical to ANOVA.
• If time is a predictor, you need discrete-time survival analysis…
• If your outcome is categorical, you need to use…
    • Binomial logistic regression analysis (dichotomous outcome).
    • Multinomial logistic regression analysis (polytomous outcome).
• If you have more predictors than you can deal with,
    • Create taxonomies of fitted models and compare them.
    • Form composites of the indicators of any common construct.
    • Conduct a Principal Components Analysis.
    • Use Cluster Analysis.
• If your outcome vs. predictor relationship is non-linear,
    • Use non-linear regression analysis.
    • Transform the outcome or predictor.
• How do you deal with missing data?

S052/I.3(b) – Slide 2
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Printed Syllabus – What Is Today’s Topic?

Please check inter-connections among the Roadmap, the Daily Topic Area, the Printed Syllabus, and the content of today’s class when you pre-read the day’s materials.

Syllabus Section I.3(b), on Using the Multilevel Model To Analyze Longitudinal Data, includes:
• Introducing a dataset containing longitudinal data (Slides #3-#5).
• Conducting exploratory analyses of the ALCUSE data (Slides #5-#8).
• Why do we need a multilevel model when analyzing longitudinal data (Slide #9)?
• Specifying a multilevel model appropriate for longitudinal data (Slide #10).
• Fitting the multilevel model to longitudinal data, and examining the products of the analysis (Slides #11-#14).
• Constructing prototypical plots in order to interpret the fitted multilevel model as a representation of change (Slide #15).

S052/I.3(b) – Slide 3
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Sample Of Longitudinal Data On Self-Reported Drinking Among Teenagers

Data on self-reported alcohol consumption:
• 82 adolescents, including both boys and girls.
• Three waves of longitudinal data, obtained at ages 14, 15 & 16 years.
• RQ: Is change over time in self-reported alcohol consumption more rapid for boys than for girls, during middle adolescence?

S052/I.3(b) – Slide 4
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
What Variables Are Present In The ALCUSE Dataset

ID:
• The adolescent ID variable is needed so that we can communicate the nesting of time points within adolescents to SAS PROC MIXED.

Level-1 Outcome:
• ALCUSE – A measure of self-reported adolescent alcohol use.
• Differs in value over time (“time-varying”).

Level-1 Question Predictor:
• AGE – The adolescent’s chronological age in years.
• Differs in value over time (“time-varying”).

Level-2 Question Predictor:
• MALE – Is the adolescent male?
• Fixed in value over time (“time-invariant”).

S052/I.3(b) – Slide 5
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Occasions Are Nested Within Adolescents (Not Students Within Schools)!

[Data listing: rows grouped by Adolescent ID (Teenager #1, Teenager #2, Teenager #3, Teenager #4, Teenager #5, etc.), with columns ALCUSE, AGE and MALE.]

• ALCUSE: Level-1 (time-varying) outcome variable.
• AGE: Level-1 (time-varying) question predictor.
• MALE: Level-2 (time-invariant) question predictor.

Notice that the longitudinal data have been organized as a “person-period” dataset …

S052/I.3(b) – Slide 6
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Input and Transformation of the ALCUSE Data In PC-SAS

In Data-Analytic Handout I.3(b).1, I analyze the ALCUSE data …

    * Input, name & label data;
    DATA DRINKING;
      INFILE 'C:\DATA\S052\ALCUSE.txt';
      INPUT ID ALCUSE PEER AGE COA MALE;
      LABEL ID     = 'Adolescent ID Number'
            ALCUSE = 'Self-Reported Alcohol Use'
            PEER   = 'Self-Reported Peer Alcohol Use'
            AGE    = 'Adolescent Age (Years)'
            COA    = 'Child of an Alcoholic?'
            MALE   = 'Is the Adolescent Male?';
      * Take square root of ALCUSE outcome to limit excessive positive skewness;
      SQRT_ALC = SQRT(ALCUSE);
      * Re-center values of adolescent AGE on initial age of 14 years;
      AGE_14 = AGE-14;
      * Create a two-way interaction of predictors AGE_14 and MALE;
      AGE_14xMALE = AGE_14*MALE;
    RUN;

Input the data, label all variables.

ALCUSE is a “count” variable – the self-reported number of drinks consumed by the teenager in a given period:
• Exploratory data analysis – and probability theory – suggest that a square-root transformation is useful for “counted” data:
    • Linearizes the trend line,
    • Pulls in the long upper tail of the ALCUSE distribution.
• Preliminary analysis not included.

Here, I re-centered adolescent age on the initial age of 14 years (AGE-14):
• This forces the intercept parameter in the multilevel model to represent self-reported alcohol use at age 14 (i.e. at the beginning of the data collection period).
• This is useful for subsequent interpretation.

Here I have generated a two-way interaction of AGE_14 and MALE for future inclusion in the multilevel model.
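The appeal to “probability theory” for the square-root transformation can be illustrated with a short delta-method sketch. It assumes, purely for illustration, that the count behaves like a Poisson variable Y with mean λ; the slides do not state a distribution, and the symbols Y, λ and g are introduced here, not taken from the handout.

```latex
% Assumption (not from the slides): Y ~ Poisson(\lambda), so E[Y] = Var(Y) = \lambda.
% First-order delta method with g(y) = \sqrt{y}, hence g'(\lambda) = 1 / (2\sqrt{\lambda}).
\[
  \operatorname{Var}\!\left(\sqrt{Y}\right)
  \approx \left[g'(\lambda)\right]^{2}\operatorname{Var}(Y)
  = \frac{1}{4\lambda}\cdot\lambda
  = \frac{1}{4}.
\]
```

Under that assumption the variance of the transformed outcome is roughly constant rather than growing with the mean, which is consistent with the slide's remark that the transformation pulls in the long upper tail.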

S052/I.3(b) – Slide 7
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Exploratory Data Analysis of the ALCUSE vs. AGE relationship, by GENDER

    * Obtain Univariate & Bivariate Statistics on transformed ALCUSE, by gender;
    * For adolescent girls;
    DATA GIRLS;
      SET DRINKING;
      IF MALE=0;
    PROC SORT DATA=GIRLS;
      BY AGE;
    PROC UNIVARIATE DATA=GIRLS;
      TITLE6 'Descriptive Statistics for Girls';
      VAR SQRT_ALC;
      BY AGE;
    PROC BOXPLOT DATA=GIRLS;
      PLOT SQRT_ALC*AGE / boxstyle=schematic boxconnect=mean vaxis=0 to 2 by .5
           cboxes=black cboxfill=grayee boxwidthscale=1 noframe height=3 font=times;
    * For adolescent boys;
    DATA BOYS;
      SET DRINKING;
      IF MALE=1;
    PROC SORT DATA=BOYS;
      BY AGE;
    PROC UNIVARIATE DATA=BOYS;
      TITLE6 'Descriptive Statistics for Boys';
      VAR SQRT_ALC;
      BY AGE;
    PROC BOXPLOT DATA=BOYS;
      PLOT SQRT_ALC*AGE / boxstyle=schematic boxconnect=mean vaxis=0 to 2 by .5
           cboxes=black cboxfill=grayee boxwidthscale=1 noframe height=3 font=times;
    RUN;

These exploratory analyses check whether the ALCUSE vs. AGE relationship differs by gender in the sample:
• Create a SAS temporary dataset containing the longitudinal data for females:
    • “IF MALE=0” retains only those cases that are female adolescents.
• Sort the data, in order of adolescent age, so that we can obtain descriptive statistics by age.
• Obtain descriptive statistics on the transformed outcome, SQRT_ALC, at ages 14, 15 & 16.
• Obtain a schematic plot of the sample SQRT_ALC vs. AGE relationship.

Same data displays for the sub-sample of males.

S052/I.3(b) – Slide 8
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Is There An Interaction Between AGE and Gender, In Predicting Alcohol Use?

[Figure: two panels, “Females” and “Males”, with “eyeball” trend lines through the sub-sample means at each age, for adolescent boys and girls.]

Perhaps?
• Boys have a lower self-reported alcohol consumption at age 14, but
• Their self-reported alcohol use increases more rapidly over adolescence?

S052/I.3(b) – Slide 9
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
What Exactly Is The Analytic Problem With Longitudinal Data?

With longitudinal data, there is repeated measurement of the same individual over time.
• When individuals are measured repeatedly over time, they carry everything that makes them uniquely themselves forward with them, from occasion to occasion.
• They are therefore similar in unobserved ways, from occasion to occasion.
• So, in analyzing longitudinal data on them, residuals may be correlated from time to time within person, and we will need multilevel modeling again.

S052/I.3(b) – Slide 10
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Familiar-Looking Multilevel Model, But For Analyzing Longitudinal Data

i = adolescent, j = occasion

• Predictors at two levels, with associated fixed effects.
• Total or composite residual, containing a time-invariant random effect for each adolescent and a random effect for each adolescent on each occasion.

Random effect ε_ij:
• Unobserved random shock to the outcome for adolescent i on occasion j.
• Different on each occasion for each individual.
• Normally distributed, homoscedastic, random.
• Just like the regular residual in a regression analysis.

Random effect u_i:
• Time-invariant contribution of adolescent i to the composite residual.
• Identical on all occasions for each adolescent.
• Normally distributed, homoscedastic, random.
• Because each adolescent has the same value of the adolescent-level residual over time, his or her composite residuals may be correlated over time.
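A model of the kind described on this slide can be written out as follows. This is a sketch rather than a copy of the slide's equation: the β symbols and the variance names are assumed notation, and the predictors are the ones that appear in the final PROC MIXED model (AGE_14, MALE and their interaction).

```latex
% i = adolescent, j = occasion.
% \beta_0..\beta_3, \sigma_u^2 and \sigma_\varepsilon^2 are assumed names, not the slide's own.
\[
  \mathit{SQRT\_ALC}_{ij}
  = \beta_0
  + \beta_1\,\mathit{AGE\_14}_{ij}
  + \beta_2\,\mathit{MALE}_{i}
  + \beta_3\,\bigl(\mathit{AGE\_14}_{ij}\times\mathit{MALE}_{i}\bigr)
  + \bigl(u_i + \varepsilon_{ij}\bigr)
\]
\[
  u_i \sim N\!\left(0,\sigma_u^2\right),
  \qquad
  \varepsilon_{ij} \sim N\!\left(0,\sigma_\varepsilon^2\right)
\]
```

The first four terms carry the fixed effects of the level-1 and level-2 predictors; the bracketed sum is the composite residual, in which u_i is shared by all occasions for adolescent i.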

S052/I.3(b) – Slide 11
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Fitting Multilevel Models For Longitudinal Data With PROC MIXED

    * Fit Initial and Final Random-Intercepts Multilevel Models;
    * Sort the data to re-order cases by values of Adolescent ID and age;
    PROC SORT DATA=DRINKING;
      BY ID AGE;
    * Fit Multilevel Model M1: Initial unconditional random-intercepts model;
    PROC MIXED METHOD=ML MAXITER=200 COVTEST DATA=DRINKING;
      TITLE6 'Model M1: Unconditional Random-Intercepts Multilevel Model';
      MODEL SQRT_ALC = / SOLUTION;
      RANDOM INTERCEPT / SUBJECT=ID;
    * Fit Multilevel Model M2: Add main effects & two-way interaction of AGE_14 & MALE;
    PROC MIXED METHOD=ML MAXITER=200 COVTEST DATA=DRINKING;
      TITLE6 'Model M2: Final Random-Intercepts Multilevel Model';
      MODEL SQRT_ALC = AGE_14 MALE AGE_14xMALE / SOLUTION;
      RANDOM INTERCEPT / SUBJECT=ID;
    RUN;

Model M1 is the unconditional model.

“SUBJECT=” specifies the variable that identifies the grouping of occasions within individual:
• Here, adolescent ID.
• Notice I’ve sorted the data by ID before fitting the model!

Random effects are specified in the RANDOM statement, as before:
• Random-intercepts error covariance structure chosen by option “INTERCEPT.”

Fixed effects are specified in the MODEL statement:
• Outcome variable is transformed alcohol use, SQRT_ALC.
• Predictors are:
    • Main effect of gender, MALE.
    • Main effect of re-centered adolescent age, AGE_14.
    • Two-way interaction of age and gender, AGE_14 × MALE.

S052/I.3(b) – Slide 12
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
It Produces The Usual Output … Here’s The Unconditional Model

[PROC MIXED output for the unconditional model (numeric values not preserved): Iteration History, ending “Convergence criteria met.”; Covariance Parameter Estimates for Intercept (Subject = ID) and Residual, each with Pr Z < .0001; Fit Statistics (-2 Log Likelihood, AIC (smaller is better)); Solution for Fixed Effects for Intercept, with Pr > |t| < .0001.]

• Iteration History: Converges rapidly on a solution.
• Covariance Parameter Estimates: Provides estimates of the residual variances and associated tests.
• Fit Statistic: –2LL (value not preserved).
• Solution for Fixed Effects: Provides estimates of the fixed effects parameters and their associated inferential statistics.

S052/I.3(b) – Slide 13
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
It Produces The Usual Output … Here’s The Final Model

[PROC MIXED output for the final model (numeric values not preserved): Iteration History, ending “Convergence criteria met.”; Covariance Parameter Estimates for Intercept (Subject = ID) and Residual, each with Pr Z < .0001; Fit Statistics (-2 Log Likelihood, AIC (smaller is better)); Solution for Fixed Effects for Intercept, AGE_14, MALE and AGE_14xMALE.]

• Iteration History: Converges rapidly on a solution.
• Covariance Parameter Estimates: Provides estimates of the residual variances and tests.
• Fit Statistic: –2LL (value not preserved).
• Solution for Fixed Effects: Provides estimates of the fixed effects parameters and their associated inferential statistics. We’ll interpret them later.

S052/I.3(b) – Slide 14
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Estimates of the Fixed and Random Effects

Estimates of fixed effects can be interpreted in the usual way, using fitted plots (see following).

In the final model, residual variation is split almost equally between the within-adolescent and the between-adolescent levels:
• The intraclass correlation is large, so an OLS analysis, which assumes independent residuals, would have been incorrect.
• The “auto-correlation” among the residuals, from occasion to occasion within an adolescent, is 0.57, a substantial linkage across time.
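The link between the two residual variances and the reported within-adolescent correlation can be sketched as below. The notation is assumed here: σ_u² is the between-adolescent (Intercept) variance, σ_ε² is the within-adolescent (Residual) variance, and the two random effects are taken to be independent of each other, which is the usual random-intercepts assumption rather than something stated on the slide.

```latex
% Composite residual for adolescent i on occasion j: u_i + \varepsilon_{ij}.
% Assumed: u_i and \varepsilon_{ij} are mutually independent.
\[
  \operatorname{Var}\!\left(u_i+\varepsilon_{ij}\right)
  = \sigma_u^2+\sigma_\varepsilon^2,
  \qquad
  \operatorname{Cov}\!\left(u_i+\varepsilon_{ij},\;u_i+\varepsilon_{ij'}\right)
  = \sigma_u^2 \quad (j \neq j')
\]
\[
  \rho
  = \frac{\sigma_u^2}{\sigma_u^2+\sigma_\varepsilon^2}
\]
```

Substituting the estimated Intercept and Residual variances from the Covariance Parameter Estimates table into the last expression gives the intraclass correlation; the slide reports this as 0.57, so a little over half of the residual variation lies between adolescents.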

S052/I.3(b) – Slide 15
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Interpreting Findings By Fitting Trend Lines For Prototypical Population Members

Here’s the fitted multilevel model … [fitted equation shown on the slide], with its four estimates labeled:
• Estimated initial status for females.
• Estimated rate of change for females.
• Estimated increment to intercept for males.
• Increment to estimated rate of change for males.

The prototypical fitted SQRT_ALC vs. AGE trend lines can be obtained in the usual way. Here I create some for prototypical female adolescents and prototypical male adolescents … Plot these.
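The prototypical trend lines follow from substituting MALE = 0 and MALE = 1 into the fitted model. In the sketch below the hatted β symbols are assumed names for the four fixed-effect estimates listed on this slide (intercept, AGE_14, MALE, AGE_14xMALE); no numeric values are filled in because the estimates are not reproduced in the transcript.

```latex
% \hat\beta_0: estimated initial status for females
% \hat\beta_1: estimated rate of change for females
% \hat\beta_2: estimated increment to intercept for males
% \hat\beta_3: increment to estimated rate of change for males
\[
  \text{Females } (\mathit{MALE}=0):\quad
  \widehat{\mathit{SQRT\_ALC}} = \hat\beta_0 + \hat\beta_1\,\mathit{AGE\_14}
\]
\[
  \text{Males } (\mathit{MALE}=1):\quad
  \widehat{\mathit{SQRT\_ALC}}
  = \bigl(\hat\beta_0+\hat\beta_2\bigr)
  + \bigl(\hat\beta_1+\hat\beta_3\bigr)\,\mathit{AGE\_14}
\]
```

Evaluating each line at AGE_14 = 0, 1 and 2 (ages 14, 15 and 16) gives the points to plot for each prototypical adolescent.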

S052/I.3(b) – Slide 16
S052/§I.3(b): Using Multilevel Modeling To Analyze Longitudinal Data
Interpreting Findings By Fitting Trend Lines For Prototypical Population Members

[Figure: two panels, “Transformed” and “De-transformed”, each showing fitted trajectories for Female and Male.]

Figure I.3(b). Fitted age-trajectories of change during middle adolescence, for prototypical boys and girls, in the square root of self-reported alcohol use (left panel), and the self-reported alcohol use (right panel). (Number of adolescents = 82, with 3 waves of data per adolescent.)

On average, the square root of boys’ self-reported alcohol consumption has a greater rate of change over middle adolescence than girls’, but their self-reported consumptions do not differ, in the population, at age 14.

Note the curvilinearity of the de-transformed fitted trajectories.
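The curvilinearity of the de-transformed trajectories can be seen algebraically. The sketch below assumes the de-transformation simply squares the fitted value on the square-root scale; a and b are placeholder names for a prototypical adolescent's fitted intercept and slope, not values from the handout.

```latex
% Assumed: fitted line on the square-root scale is a + b * AGE_14,
% and de-transformation squares the fitted value.
\[
  \widehat{\mathit{ALCUSE}}
  = \bigl(a + b\,\mathit{AGE\_14}\bigr)^{2}
  = a^{2} + 2ab\,\mathit{AGE\_14} + b^{2}\,\mathit{AGE\_14}^{2}
\]
```

Whenever b is non-zero the squared term makes the trajectory quadratic in age, so a straight line in the left panel becomes a curve in the right panel.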