Advanced quantitative methods for social scientists (2017–2018), LC & PVK
Session 2: Multilevel analysis in Stata (with a focus on random slope models for comparative research)
Louis Chauvel
University of Luxembourg, PEARL Institute for Research on Socio-Economic Inequality (IRSEI)

Outline
- Background
  - Standard multiple regressions versus random effects models
  - Fixed effects and random effects
  - Basics on notations in multilevel analysis
  - 2-level models / random effect / random slope
  - Generalization: higher-level models and cross-classified models
- Method with example: the PISA survey
  - Fitting models: random effects and random slopes
  - Post-estimation techniques: BLUPs, multilevel tools (mlt)
  - Understanding and presenting results
- Examples of publication
  - Chauvel L, Leist AK. Socioeconomic hierarchy and health gradient in Europe: the role of income inequality and of social origins. International Journal for Equity in Health. 2015;14:132. doi:10.1186/s12939-015-0263-y. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4647815/
  - Chauvel L, Hartung A. More Inequality, More Viscosity? Intergenerational Mobility in International Comparison. PAA Annual Meeting, Washington DC, March 31 – April 2, 2016. https://paa.confex.com/paa/2016/meetingapp.cgi/Paper/6597
- Further developments on panel analysis
  - xtmixed as a pervasive command

Main references
http://www.bristol.ac.uk/cmm/learning/support/books.html
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks: Sage Publications.
Rabe-Hesketh, S., & Skrondal, A. (2012). Multilevel and longitudinal modeling using Stata. Stata Press. [Stata]
Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press. [R]

Multilevel (2L) data structure
Simple example: 2-level data
That is …
    Level 2 (countries):    Country 1     Country 2     Country 3     Country 4
    Level 1 (individuals):  I1 I2 I3 I4   I1 I2 I3 I4   I1 I2 I3 I4   I1 I2 I3 I4
NB: Minimum 20 level-2 groups

Typical example: PISA 2012
- Educational performance at age circa 15
- Many countries (68)
- Parental/family backgrounds
- Performance variation by country
- Influence of parental background (Social Reproduction) by country
- Explanation of Social Reproduction variation? Country GDP/capita, Gini, etc.
The old solution: a series of standard OLS regressions
To open the dataset and prepare it …
    http://www.louischauvel.org/ML_pisa_2012.do
    * PROGRAM SEGMENT 0
To process the old solution …
    http://www.louischauvel.org/ML_pisa_2012.do
    * PROGRAM SEGMENT 1

[Figure: country-level scatter plot; highlighted labels: FRANCE, LUX, HKG, ALBANIA]

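    * One OLS regression per country; coefficients are stacked in matrix R
    matrix R = J(1,5,.)
    levelsof cco
    foreach i of numlist `r(levels)' {
        di `i'
        ta cnt if `i'==cco
        capture {
            quietly: reg PV1READ ST04Q01 f1 stdage if `i'==cco
            matrix A = e(b)
            noisily matrix li A
            * row = country code, then coefficients on ST04Q01, f1, stdage, _cons
            matrix C = `i', A
            matrix R = R \ C
        }
    }
    mat li R
    preserve
    clear
    svmat R
    * R5 = _cons (country intercept), R3 = coefficient on f1, R1 = country code
    gen CountryScore = R5
    gen SocReproduction = R3
    gen cco = R1
    twoway scatter SocReproduction CountryScore, mlabel(cco)
    reg SocReproduction CountryScore
    reg SocReproduction CountryScore if R1!=1
    restore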

Multilevel Data: why?
Multilevel models respect the structure of the data we have:
1. Clustered data and correlated errors in each cluster
2. ML relaxes the assumption of uncorrelated (independent) errors
3. Partitioning variance-covariance components
Question: At what level is most of the variance?
- Conceptually: Different levels and their effects?
- Statistically: Are your data clustered?
- Empirically: Are there variations both at L1 and L2?
… And we can "easily" refine the models

Fixed Effects Model (FEM) & Random Effects Model (REM)
For i cases within j groups (J groups in total):
- a_j is a separate intercept for each group.
- Equivalent to the "within-group" model: all variables are centered around the mean of their group.
- In practice: FEM = a standard OLS model with one dummy variable per group (J intercepts, common slopes).
- With the dummy variable approach => group differences enter as a fixed effect.
* PROGRAM SEGMENT 2
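As a sketch of the model described on this slide, assuming a single explanatory variable x: the slide's a_j is the group intercept, while β and ε are assumed symbols for the common slope and the case-level error.

```latex
% Fixed effects model: one intercept per group, common slope
y_{ij} = a_j + \beta x_{ij} + \varepsilon_{ij}, \qquad i = 1,\dots,n_j,\quad j = 1,\dots,J

% Equivalent within-group form: group means are subtracted, so a_j drops out
y_{ij} - \bar{y}_{j} = \beta\,(x_{ij} - \bar{x}_{j}) + (\varepsilon_{ij} - \bar{\varepsilon}_{j})
```

Because a_j cancels in the second line, only within-group variation in x identifies β.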

Random Effects
Alternatively, treat the group effects as random:
- No separate estimate for each group; model their distribution instead.
A simple random intercept model (notation from Rabe-Hesketh & Skrondal), where:
- β is the main intercept
- ζ (zeta) is a random effect for each group, allowing each of the j groups to have its own intercept; assumed to be independent & normally distributed
- ε is the error term for each case; also assumed to be independent & normally distributed
NB: Minimum 20 level-2 groups
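A minimal way to write the random intercept model described here. The variance symbols ψ and θ follow the usual Rabe-Hesketh & Skrondal convention and are an assumption, since the slide does not name them.

```latex
% Random intercept model: main intercept + group effect + case-level error
y_{ij} = \beta + \zeta_j + \varepsilon_{ij},
\qquad \zeta_j \sim N(0, \psi),
\qquad \varepsilon_{ij} \sim N(0, \theta)
```

Only two variance parameters (ψ and θ) are estimated, rather than one intercept per group.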

xtreg syntax
    * PROGRAM SEGMENT 4
    *Comparing FE and RE models
    xtreg PV1READ ST04Q01 f1 stdage, i(cco) fe
- PV1READ: dependent variable
- ST04Q01 f1 stdage: X explanatory variables
- i(cco): level-2 group variable
- fe: FE or RE model (fe / re)

Best Model? Fixed or Random Effects
- Fixed effects are the most consistent as N grows very large,
- but less efficient than random effects when within-group variation is low (large between-group variation) and the sample is small (not the case with PISA…).
Usual solution => Hausman Specification Test
Hausman Specification Test: a tool to help evaluate the fit of fixed vs. random effects
Logic:
- Both fixed & random effects models are consistent if the models are properly specified.
- However, some model violations cause random effects models to be inconsistent. Ex: if X variables are correlated with the random error.
In short: the models should give the same results… If not, random effects may be biased.
- If results are similar, use the most efficient model: random effects.
- If results diverge, odds are that the random effects model is biased. In that case use fixed effects…

Hausman Specification Test
Strategy: estimate both fixed & random effects models, save the estimates each time, and finally invoke the Hausman test.
Ex (here with the "old" xtreg Stata command):
    * PROGRAM SEGMENT 4
    *Comparing FE and RE models
    xtreg PV1READ ST04Q01 f1 stdage, i(cco) fe
    est store femod
    xtreg PV1READ ST04Q01 f1 stdage, i(cco) re
    est store remod
    esttab femod remod
    hausman femod remod

Linear Fixed Intercepts Model

    . xtreg PV1READ ST04Q01 parentalbckgrnd stdage, i(cco) fe

    Fixed-effects (within) regression               Number of obs      =    413190
    Group variable: ccode                           Number of groups   =        67

    R-sq:  within  = 0.1440                         Obs per group: min =       259
           between = 0.2496                                        avg =    6167.0
           overall = 0.1750                                        max =     29486

                                                    F(3,413120)        =  23167.57
    corr(u_i, Xb)  = 0.1095                         Prob > F           =    0.0000

    ---------------------------------------------------------------------------------
            PV1READ |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
            ST04Q01 |  -35.36242   .2558221  -138.23   0.000    -35.86382   -34.86101
    parentalbckgrnd |   16.18388   .0727843   222.35   0.000     16.04123    16.32654
             stdage |   3.513675   .1302983    26.97   0.000     3.258295    3.769056
              _cons |   532.6464   .4030946  1321.39   0.000     531.8563    533.4364
    ----------------+----------------------------------------------------------------
            sigma_u |  40.627429
            sigma_e |  82.173407
                rho |  .19642709   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------
    F test that all u_i=0:     F(66, 413120) =  1331.73          Prob > F = 0.0000

sigma_u: SD of u (intercepts); sigma_e: SD of e; rho: intra-class correlation

Linear Random Intercepts Model

    . xtreg PV1READ ST04Q01 parentalbckgrnd stdage, i(cco) re

    Random-effects GLS regression                   Number of obs      =    413190
    Group variable: ccode                           Number of groups   =        67

    R-sq:  within  = 0.1440                         Obs per group: min =       259
           between = 0.2496                                        avg =    6167.0
           overall = 0.1750                                        max =     29486

                                                    Wald chi2(3)       =  69522.37
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

    ---------------------------------------------------------------------------------
            PV1READ |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
            ST04Q01 |  -35.36185   .2558235  -138.23   0.000    -35.86326   -34.86045
    parentalbckgrnd |   16.18568   .0727767   222.40   0.000     16.04305    16.32832
             stdage |   3.512553   .1302975    26.96   0.000     3.257175    3.767932
              _cons |   534.6695    4.79814   111.43   0.000     525.2653    544.0737
    ----------------+----------------------------------------------------------------
            sigma_u |  39.124915
            sigma_e |  82.173407
                rho |  .18480223   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------

Assumes normal u_j, uncorrelated with the X variables
sigma_u: SD of u (intercepts); sigma_e: SD of e; rho: intra-class correlation
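The rho reported above can be checked by hand from sigma_u and sigma_e. This is only a worked arithmetic sketch using the numbers in the output:

```latex
% Intra-class correlation = share of the total variance that lies between countries
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{39.124915^2}{39.124915^2 + 82.173407^2}
     \approx \frac{1530.76}{1530.76 + 6752.47}
     \approx 0.1848
```

So roughly 18% of the residual variance in reading scores lies between countries, after controlling for the three covariates.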

Hausman Specification Test
Example: PISA reading score, fe vs re

    . hausman femod remod

                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |     femod        remod        Difference          S.E.
    -------------+----------------------------------------------------------------
         ST04Q01 |   -35.36242    -35.36185        -.000564               .
    parentalbc~d |    16.18388     16.18568       -.0018003        .0010541
          stdage |    3.513675     3.512553        .0011221        .0004391
    ------------------------------------------------------------------------------
                               b = consistent under Ho and Ha; obtained from xtreg
                B = inconsistent under Ha, efficient under Ho; obtained from xtreg

        Test:  Ho:  difference in coefficients not systematic

                      chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =        7.65
                    Prob>chi2 =      0.0539
                    (V_b-V_B is not positive definite)

Direct comparison of coefficients… A non-significant p-value indicates that the models yield similar results… OK

Within & Between Effects / Centering
Why do we do multilevel models? → To understand the role of inequality between and within countries
So: "centering" variables, with both grand mean and group mean centering
- Grand mean centering: computing variables as deviations from the overall mean
  - Should be systematically done for X variables
- Group mean centering: computing variables as deviations from the group mean
  - Useful for decomposing within vs. between effects → relative role of inequality between and within countries
  - Often in conjunction with aggregate group mean variables

Within & Between Effects
You can estimate BOTH within- and between-group effects in a single model.
Strategy: split a variable (e.g., household possession score) into two new variables…
1. Group mean household possession score
2. Within-group deviation from the mean household possession score
Often called "group mean centering"
Then, put both variables into a random effects model: the model will estimate separate coefficients for between vs. within effects.
Ex:
    * PROGRAM SEGMENT 5
    *Assessing within and between effects
    egen betwparentalbckgrnd=mean(parentalbckgrnd), by(cco)
    gen withinparentalbckgrnd=parentalbckgrnd-betwparentalbckgrnd
    xtreg PV1READ ST04Q01 stdage betw withi, i(cco) re
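The model fitted by the xtreg command above can be sketched as follows, with the other covariates omitted for readability. The symbols β_W and β_B are assumed labels for the coefficients on withinparentalbckgrnd and betwparentalbckgrnd.

```latex
% x_{ij}: parental background of student i in country j; \bar{x}_j: country mean
y_{ij} = \beta_1
       + \beta_W \,(x_{ij} - \bar{x}_j)   % within-country effect
       + \beta_B \,\bar{x}_j              % between-country effect
       + \zeta_j + \varepsilon_{ij}
```

If β_W and β_B were equal, the model would collapse to the ordinary random intercept model with the uncentered x.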

Linear Random Intercepts Model

    . xtreg PV1READ ST04Q01 stdage betw withi, i(cco) re

    Random-effects GLS regression                   Number of obs      =    413190
    Group variable: ccode                           Number of groups   =        67

    R-sq:  within  = 0.1440                         Obs per group: min =       259
           between = 0.2540                                        avg =    6167.0
           overall = 0.1833                                        max =     29486

                                                    Wald chi2(4)       =  69526.50
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

    ---------------------------------------------------------------------------------------
                  PV1READ |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
    ----------------------+----------------------------------------------------------------
                  ST04Q01 |  -35.36199    .255825  -138.23   0.000    -35.86339   -34.86058
                   stdage |   3.512629   .1302982    26.96   0.000     3.257249    3.768009
      betwparentalbckgrnd |   24.26494   4.669297     5.20   0.000     15.11329    33.41659
    withinparentalbckgrnd |   16.18389   .0727852   222.35   0.000     16.04123    16.32655
                    _cons |   533.5929   4.631905   115.20   0.000     524.5145    542.6712
    ----------------------+----------------------------------------------------------------
                  sigma_u |  37.414277
                  sigma_e |  82.173407
                      rho |  .17170966   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------------

Parental background has a huge effect both within and between countries

Generalizing: Random Coefficients (= Random Slopes)
The linear random intercept model allows random variation in the intercept (mean) across groups.
But the same idea can be applied to other coefficients: slope coefficients can ALSO be random!
Random Coefficient Model, where:
- zeta-1 (ζ1) is a random intercept component = differences between countries
- zeta-2 (ζ2) is a random slope component = country-specific inequality effect
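A sketch of the random coefficient model in the notation used above, assuming one explanatory variable x (here, parental background). β1 and β2 are assumed symbols for the average intercept and slope.

```latex
\begin{aligned}
y_{ij} &= \beta_1 + \beta_2 x_{ij} + \zeta_{1j} + \zeta_{2j}\, x_{ij} + \varepsilon_{ij} \\
       &= (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j})\, x_{ij} + \varepsilon_{ij}
\end{aligned}
```

The second line shows why each country j has its own intercept (β1 + ζ1j) and its own slope (β2 + ζ2j).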

Linear Random Coefficient Model (Rabe-Hesketh & Skrondal)
Both intercepts and slopes vary randomly across j groups.
[Figure: country regression lines; y-axis: PV1READ, x-axis: parentalbckgrnd. Annotations: inequality between countries and inequality within country vary randomly.]

xtmixed syntax
    * PROGRAM SEGMENT 6
    * a first random slope model
xtmixed allows random intercepts & slopes. "Mixed" models are models that have both fixed and random components.
    xtmixed [depvar] [fixed equation] || [random eq], options
    xtmixed PV1READ ST04Q01 stdage || cco: parentalbckgrnd , iter(5) diff mle cov(unstr)
- PV1READ: dependent variable
- ST04Q01 stdage: fixed effect variables
- cco: RE level-2 variable
- parentalbckgrnd: slope variable
- iter(5) diff mle cov(unstr): estimation options
cov(unstructured), abbreviated cov(unstr), relaxes the constraints on the covariance among random effects (see Rabe-Hesketh & Skrondal). The Stata default treats the random terms (intercept, slope) as totally uncorrelated… not always reasonable.
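A hedged variant of the command above, not taken from the course do-file. It assumes the PISA data from PROGRAM SEGMENT 0 are in memory and Stata 13 or later, where xtmixed is named mixed. The sketch also puts parentalbckgrnd in the fixed part, so that the random slopes are deviations around an estimated average slope rather than around zero; the stored names rslope and rint are hypothetical.

```stata
* Random slope model, with the slope variable in BOTH the fixed and random parts
mixed PV1READ ST04Q01 stdage parentalbckgrnd || cco: parentalbckgrnd, mle covariance(unstructured)
estimates store rslope

* Random-intercept-only model with the same fixed part, for comparison
mixed PV1READ ST04Q01 stdage parentalbckgrnd || cco: , mle
estimates store rint

* Likelihood-ratio test of the random slope (and its covariance with the intercept)
lrtest rslope rint
```

Because a variance of zero lies on the boundary of the parameter space, Stata flags this likelihood-ratio test as conservative.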

Example: PISA 2012

    . xtmixed PV1READ ST04Q01 stdage || cco: parentalbckgrnd , iter(5) diff mle cov(unstr)

    Mixed-effects ML regression                     Number of obs      =    413190
    Group variable: ccode                           Number of groups   =        67

                                                    Obs per group: min =       259
                                                                   avg =    6167.0
                                                                   max =     29486

                                                    Wald chi2(2)       =  19804.39
    Log likelihood = -2405736.9                     Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
         PV1READ |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         ST04Q01 |  -35.14138   .2542637  -138.21   0.000    -35.63973   -34.64304
          stdage |   3.483322   .1294873    26.90   0.000     3.229531    3.737112
           _cons |   494.2034   4.810792   102.73   0.000     484.7744    503.6324
    ------------------------------------------------------------------------------
    .../...

Ex: PISA 2012 (cont'd)

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    ccode: Unstructured          |
                    sd(parent~d) |   19.28376   1.670908      16.27183     22.8532
                       sd(_cons) |   54.52031   12.20434      35.15745     84.5472
            corr(parent~d,_cons) |   .6958577   .1663099      .2234163    .9035452
    -----------------------------+------------------------------------------------
                    sd(Residual) |   81.63907   .0898211      81.46321    81.81531
    ------------------------------------------------------------------------------
    LR test vs. linear regression:       chi2(3) =  1.5e+05   Prob > chi2 = 0.0000

- "_cons" (constant) refers to the country intercepts; "parent~d" refers to the country slopes.
- Non-zero SDs indicate that both intercepts and slopes vary.
- If some of the estimates are not significant → you can simplify the model.

What about the random slopes? Slopes = within country parental background gradient of inequality

What about the random slopes?
    * PROGRAM SEGMENT 8
    * like 6 with BLUP predictors of intercepts and slopes
Best linear unbiased predictions (BLUPs)
[Figure: country BLUPs; axes: slopes, intercepts]
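A minimal sketch of how such BLUPs can be obtained, assuming it is run immediately after the random slope model (xtmixed or mixed with || cco: parentalbckgrnd). The variable names u_slope and u_cons are hypothetical, and this is not necessarily what PROGRAM SEGMENT 8 does.

```stata
* BLUPs of the random effects; for "|| cco: parentalbckgrnd" the order is
* slope first, then intercept (_cons)
predict u_slope u_cons, reffects

* Keep one row per country and plot predicted slopes against predicted intercepts
preserve
bysort cco: keep if _n == 1
twoway scatter u_slope u_cons, mlabel(cco)
restore
```

When parentalbckgrnd is also in the fixed part of the model, the country-specific slope is _b[parentalbckgrnd] + u_slope; in the specification shown on the slides, where it is not, u_slope is the country slope itself.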

Multilevel Model Notation
The random coefficient (random slope) model can be expressed in a single equation. However, it is common to separate the levels:
- Level 1 equation
- Intercept equation (level 2)
- Slope equation (level 2)
Gamma (γ) = constant; u = random effect.
Here, we specify a random component for the level-1 constant & slope.
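A sketch of the separated-levels notation, assuming one level-1 variable x. The subscripts on γ and u are assumed labels chosen to match "Gamma = constant, u = random effect".

```latex
\begin{aligned}
\text{Level 1:}   \quad & y_{ij} = \beta_{1j} + \beta_{2j}\, x_{ij} + \varepsilon_{ij} \\
\text{Intercept:} \quad & \beta_{1j} = \gamma_1 + u_{1j} \\
\text{Slope:}     \quad & \beta_{2j} = \gamma_2 + u_{2j} \\
\text{Combined:}  \quad & y_{ij} = \gamma_1 + \gamma_2\, x_{ij} + u_{1j} + u_{2j}\, x_{ij} + \varepsilon_{ij}
\end{aligned}
```

Substituting the two level-2 equations into level 1 gives the combined line, which is the single-equation random coefficient model with γ and u in place of β and ζ.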

Cross-Level Interactions Does context (i.e., level-2) influence the effect of level-1 variables? Example: Effect of country inequality (gini) on lower achievements Can you think of others?

Cross-level interactions
Idea: specify a level-2 variable that affects a level-1 slope
- Level 1 equation
- Intercept equation
- Slope equation with interaction
Cross-level interaction: the level-2 variable Z affects the slope (B2) of a level-1 X variable.
The coefficient γ3 reflects the size of the interaction (effect on B2 per unit change in Z).
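A sketch of these three equations, assuming the same notation as the separated-levels model, with Z_j the level-2 variable and β2j standing for the slope written B2 on the slide.

```latex
\begin{aligned}
\text{Level 1:}   \quad & y_{ij} = \beta_{1j} + \beta_{2j}\, x_{ij} + \varepsilon_{ij} \\
\text{Intercept:} \quad & \beta_{1j} = \gamma_1 + u_{1j} \\
\text{Slope:}     \quad & \beta_{2j} = \gamma_2 + \gamma_3\, Z_j + u_{2j} \\
\text{Combined:}  \quad & y_{ij} = \gamma_1 + \gamma_2\, x_{ij} + \gamma_3\, Z_j\, x_{ij}
                          + u_{1j} + u_{2j}\, x_{ij} + \varepsilon_{ij}
\end{aligned}
```

In the combined form, γ3 is simply the coefficient on the product Z·x, which is why the interaction can be entered as an ordinary variable in the fixed part of the model.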

Cross-level Interactions
Cross-level interaction in single-equation form: random coefficient model with cross-level interaction
Stata strategy: manually compute the cross-level interaction variables
- Ex: Poverty*WelfareState, Gender*SingleSexSchool
- Then, put the interaction variable in the "fixed" model
Interpretation: the B3 coefficient indicates the impact of each unit change in Z on the slope B2
- If B3 is positive, an increase in Z results in a larger B2 slope.
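The manual strategy can be sketched as follows. Note the assumptions: gini is a HYPOTHETICAL country-level variable (the slides mention Gini only as a possible explanation and do not confirm that it is in the PISA file), gini_x_parent is an illustrative name, and mixed is the Stata 13+ name for xtmixed.

```stata
* gini is a hypothetical level-2 (country) variable, constant within each cco
gen gini_x_parent = gini * parentalbckgrnd

* Main effects plus the cross-level interaction in the fixed part;
* random intercept and random slope on parentalbckgrnd in the random part
mixed PV1READ ST04Q01 stdage parentalbckgrnd gini gini_x_parent || cco: parentalbckgrnd, mle covariance(unstructured)
```

The coefficient on gini_x_parent plays the role of B3: a positive value would mean a steeper parental background gradient in more unequal countries. Keeping the main effect of gini in the fixed part, as here, is a common modelling choice so that the interaction is not confounded with a level shift.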

Beyond 2-level models
Sometimes data have 3 levels or more
- Ex: School, classroom, individual
- Ex: Family, individual, time (repeated measures)
Can be dealt with in xtmixed
xtmixed syntax: specify the "fixed" equation, then the random effects starting with the "top" level
    xtmixed var1 var2 var3 || schoolid: var2 || classid:var3
Again, specify unstructured covariance: cov(unstr)

Advice about building models (Raudenbush & Bryk 2002)
- Start by building the level 1 model
- Then build the level 2 model
- Keep a close eye on the level 2 N.