Multilevel Models 2 Sociology 229A, Class 18 Copyright © 2008 by Evan Schofer Do not copy or distribute without permission
Multilevel Data Simple example: 2-level data Which can be shown as a nesting diagram: Level 2 units are classrooms (Class 1, Class 2, Class 3); Level 1 units are students (S1, S2, S3) nested within each classroom
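In practice, two-level data like this is stored in "long" format: one row per student, with a group identifier linking each student to a classroom. A minimal sketch in Stata (the variable names classid, studentid, and score are hypothetical, just to show the layout):
clear
input classid studentid score
1 1 72
1 2 85
1 3 90
2 1 64
2 2 78
3 1 88
end
list, sepby(classid)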
Multilevel Data: Problems When is multilevel data NOT a problem? Answer: If you can successfully control for potential sources of correlated error Add a control to OLS model for: classroom, school, and state characteristics that would be sources of correlated error in each group Ex: Teacher quality, class size, budget, etc… But: We often can’t identify or measure all relevant sources of correlated error Thus, we need to abandon simple OLS regression and try other approaches.
Review: Multilevel Strategies Problems of multilevel models Non-independence; correlated error Standard errors = underestimated Solutions: Each has benefits, disadvantages… 1. OLS regression 2. Aggregation (between effects model) 3. Robust Standard Errors 4. Robust Cluster Standard Errors 5. Dummy variables (Fixed Effects Model) 6. Random effects models
Robust Standard Errors Strategy #1: Improve our estimates of the standard errors Option 1: Robust Standard Errors reg y x1 x2 x3, robust The Huber / White / “Sandwich” estimator An alternative method of computing standard errors that is robust to a variety of assumption violations Provides accurate estimates in presence of heteroskedasticity Also, robust to model misspecification Note: Freedman’s criticism: What good are accurate SEs if coefficients are biased due to poor specification?
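For instance, using the pro-environmental values data analyzed later in these slides (a sketch assuming that dataset is in memory), one could compare conventional and robust standard errors; the coefficients are identical, only the SEs change:
reg supportenv age male dmar demp educ incomerel ses
reg supportenv age male dmar demp educ incomerel ses, robust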
Robust Cluster Standard Errors Option 2: Robust cluster standard errors A modification of robust SEs to address clustering reg y x1 x2 x3, cluster(groupid) Note: Cluster implies robust (vs. regular SEs) It is easy to adapt robust standard errors to address clustering in data; See: http://www.stata.com/support/faqs/stat/robust_ref.html http://www.stata.com/support/faqs/stat/cluster.html Result: SE estimates typically increase, which is appropriate because non-independent cases aren’t providing as much information as would a sample of independent cases.
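Applied to the same data, clustering on country would look like this (again a sketch assuming the dataset used in the later examples; country is the level-2 identifier):
reg supportenv age male dmar demp educ incomerel ses, cluster(country)
* SEs are adjusted for non-independence of respondents within countries; they will typically be larger than the plain robust SEs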
Dummy Variables Another solution to correlated error within groups/clusters: Add dummy variables Include a dummy variable for each Level-2 group, to explicitly model variance in means A simple version of a “fixed effects” model (see below) Ex: Student achievement; data from 3 classes Level 1: students; Level 2: classroom Create dummy variables for each class Include all but one dummy variable in the model Or include all dummies and suppress the intercept
Dummy Variables What is the consequence of adding group dummy variables? A separate intercept is estimated for each group Correlated error is absorbed into intercept Groups won’t systematically fall above or below the regression line In fact, all “between group” variation (not just error) is absorbed into the intercept Thus, other variables are really just looking at within group effects This can be good or bad, depending on your goals.
Dummy Variables Note: You can create a set of dummy variables in Stata as follows: xi i.classid – creates dummy variables for each unique value of the variable “classid” Creates variables named _Iclassid_1, _Iclassid_2, etc. These dummies can be added to the analysis by specifying the variable list: _Iclassid* Ex: reg y x1 x2 x3 _Iclassid*, nocons “nocons” removes the constant, allowing you to use a full set of dummies. Alternately, you could drop one dummy.
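An alternative sketch for the three-classroom example above (assuming a class identifier classid, an outcome score, and a hypothetical level-1 predictor skills) uses tabulate with the generate() option and then drops one dummy:
tabulate classid, generate(classdum)
reg score skills classdum2 classdum3
* classdum1 is the omitted reference category; its classroom is absorbed into the constant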
Example: Pro-environmental values Dummy variable model . reg supportenv age male dmar demp educ incomerel ses _Icountry* Source | SS df MS Number of obs = 27807 -------------+------------------------------ F( 32, 27774) = 98.50 Model | 11024.1401 32 344.504377 Prob > F = 0.0000 Residual | 97142.6001 27774 3.49760928 R-squared = 0.1019 -------------+------------------------------ Adj R-squared = 0.1009 Total | 108166.74 27806 3.89005036 Root MSE = 1.8702 ------------------------------------------------------------------------------ supportenv | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0038917 .0008158 -4.77 0.000 -.0054906 -.0022927 male | .0979514 .0229672 4.26 0.000 .0529346 .1429683 dmar | .0024493 .0252179 0.10 0.923 -.046979 .0518777 demp | -.0733992 .0252937 -2.90 0.004 -.1229761 -.0238223 educ | .0856092 .0061574 13.90 0.000 .0735404 .097678 incomerel | .0088841 .0059384 1.50 0.135 -.0027554 .0205237 ses | .1318295 .0134313 9.82 0.000 .1055036 .1581554 _Icountry_32 | -.4775214 .085175 -5.61 0.000 -.6444687 -.3105742 _Icountry_50 | .3943565 .0844248 4.67 0.000 .2288798 .5598332 _Icountry_70 | .1696262 .0865254 1.96 0.050 .0000321 .3392203 … dummies omitted … _Icountr~891 | .243995 .0802556 3.04 0.002 .08669 .4012999 _cons | 5.848789 .082609 70.80 0.000 5.686872 6.010707
Dummy Variables Benefits of the dummy variable approach: It is simple Just estimate a different intercept for each group Sometimes the dummy interpretations can be of interest Weaknesses: Cumbersome if you have many groups Uses up lots of degrees of freedom (not parsimonious) Makes it hard to look at other kinds of group dummies Non-varying group variables = collinear with dummies Can be problematic if your main interest is to study effects of variables across groups Dummies purge that variation… focus on within-group variation If you don’t have much within group variation, there isn’t much left to analyze.
Dummy Variables Note: Dummy variables are a simple example of a “fixed effects” model (FEM) Effect of each group is modeled as a “fixed effect” rather than a random variable Also can be thought of as the “within-group” estimator Looks purely at variation within groups Stata can do a Fixed Effects Model without the effort of using all the dummy variables Simply request the “fixed effects” estimator in xtreg.
Fixed Effects Model (FEM) For i cases within j groups: $Y_{ij} = \alpha_j + \beta X_{ij} + e_{ij}$ Therefore $\alpha_j$ is a separate intercept for each group It is equivalent to looking solely at within-group variation: $(Y_{ij} - \bar{Y}_j) = \beta (X_{ij} - \bar{X}_j) + (e_{ij} - \bar{e}_j)$ Where $\bar{X}_j$ is the mean of X for group j, etc. Model is “within group” because all variables are centered around the mean of each group.
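One way to see the “within group” logic in Stata (a sketch, assuming the country-level dataset used in the examples below, illustrated with a single predictor): demean the outcome and predictor by group and run OLS on the deviations. The slope matches the fixed effects estimate (the manual version’s SEs are slightly off because OLS does not adjust degrees of freedom for the estimated group means):
egen ybar_j = mean(supportenv), by(country)
egen xbar_j = mean(educ), by(country)
gen ydev = supportenv - ybar_j
gen xdev = educ - xbar_j
reg ydev xdev
* compare with the within estimator:
xtreg supportenv educ, i(country) fe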
Fixed Effects Model (FEM) . xtreg supportenv age male dmar demp educ incomerel ses, i(country) fe Fixed-effects (within) regression Number of obs = 27807 Group variable (i): country Number of groups = 26 R-sq: within = 0.0220 Obs per group: min = 511 between = 0.0368 avg = 1069.5 overall = 0.0239 max = 2154 F(7,27774) = 89.23 corr(u_i, Xb) = 0.0213 Prob > F = 0.0000 ------------------------------------------------------------------------------ supportenv | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0038917 .0008158 -4.77 0.000 -.0054906 -.0022927 male | .0979514 .0229672 4.26 0.000 .0529346 .1429683 dmar | .0024493 .0252179 0.10 0.923 -.046979 .0518777 demp | -.0733992 .0252937 -2.90 0.004 -.1229761 -.0238223 educ | .0856092 .0061574 13.90 0.000 .0735404 .097678 incomerel | .0088841 .0059384 1.50 0.135 -.0027554 .0205237 ses | .1318295 .0134313 9.82 0.000 .1055036 .1581554 _cons | 5.878524 .052746 111.45 0.000 5.775139 5.981908 sigma_u | .55408807 sigma_e | 1.8701896 rho | .08069488 (fraction of variance due to u_i) F test that all u_i=0: F(25, 27774) = 94.49 Prob > F = 0.0000 Identical to dummy variable model!
ANOVA: A Digression Suppose you wish to model variable Y for j groups (clusters) Ex: Wages for different racial groups Definitions: The grand mean, $\bar{Y}$, is the mean of all groups The group mean, $\bar{Y}_j$, is the mean of a particular sub-group of the population
ANOVA: Concepts & Definitions Y is the dependent variable We are looking to see if Y depends upon the particular group a person is in The effect of a group is the difference between a group’s mean & the grand mean Effect is denoted by alpha ($\alpha$) If $\bar{Y} = \$8.75$ and $\bar{Y}_{Group\,1} = \$8.90$, then $\alpha_{Group\,1} = \$0.15$ Effect of being in group j is: $\alpha_j = \bar{Y}_j - \bar{Y}$ It is like a deviation, but for a group.
ANOVA: Concepts & Definitions ANOVA is based on partitioning deviation We initially calculated deviation as the distance of a point from the grand mean: $Y_i - \bar{Y}$ But, you can also think of deviation from a group mean (called “e”): $e = Y_i - \bar{Y}_j$ Or, for any case i in group j: $e_{ij} = Y_{ij} - \bar{Y}_j$
ANOVA: Concepts & Definitions The location of any case is determined by: The grand mean, $\mu$, common to all cases The group “effect” $\alpha$, common to members of the group The distance between the group mean and the grand mean: “between group” variation The within-group deviation ($e$): called “error” The distance from the group mean to a case’s value
The ANOVA Model This is the basis for a formal model: For any population with mean $\mu$ Comprised of J subgroups, with $N_j$ cases in each group Each with a group effect $\alpha_j$ The location of any individual can be expressed as follows: $Y_{ij} = \mu + \alpha_j + e_{ij}$ $Y_{ij}$ refers to the value of case i in group j $e_{ij}$ refers to the “error” (i.e., deviation from the group mean) for case i in group j
Sum of Squared Deviation We are most interested in two parts of the model The group effects: $\alpha_j$ Deviation of the group mean from the grand mean Individual case error: $e_{ij}$ Deviation of the individual from the group mean Each is a deviation that can be summed up Remember, we square deviations when summing Otherwise, they add up to zero Remember, variance is just squared deviation
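To see why unsquared deviations are useless as a summary, note that deviations from a mean always cancel:

$$\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right) = \sum_{i=1}^{N} Y_i - N\bar{Y} = 0$$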
Sum of Squared Deviation The total deviation can be partitioned into $\alpha_j$ and $e_{ij}$ components: That is, $\alpha_j + e_{ij}$ = total deviation: $(Y_{ij} - \bar{Y}) = (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j) = \alpha_j + e_{ij}$
Sum of Squared Deviation The total deviation can be partitioned into $\alpha_j$ and $e_{ij}$ components: The total variance ($SS_{total}$) is made up of: $\alpha_j$: between group variance ($SS_{between}$) $e_{ij}$: within group variance ($SS_{within}$) $SS_{total} = SS_{between} + SS_{within}$
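Written out in full, the decomposition is:

$$\underbrace{\sum_{j}\sum_{i}\left(Y_{ij}-\bar{Y}\right)^2}_{SS_{total}} = \underbrace{\sum_{j} N_j\left(\bar{Y}_j-\bar{Y}\right)^2}_{SS_{between}} + \underbrace{\sum_{j}\sum_{i}\left(Y_{ij}-\bar{Y}_j\right)^2}_{SS_{within}}$$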
ANOVA & Fixed Effects Note that the ANOVA model is similar to the fixed effects model But FEM also includes a $\beta X$ term to model linear trend ANOVA: $Y_{ij} = \mu + \alpha_j + e_{ij}$ Fixed Effects Model: $Y_{ij} = \alpha_j + \beta X_{ij} + e_{ij}$ In fact, if you don’t specify any X variables, they are pretty much the same
Within Group & Between Group Models Group-effect dummy variables in a regression model create a specific estimate of group effects for all cases Bs & error are based on remaining “within group” variation We could do the opposite: ignore within-group variation and just look at differences between groups Stata’s xtreg command can do this, too This is essentially just modeling group means!
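The equivalence can be checked by hand (a sketch, assuming the country-level dataset used below): collapse the data to country means, run OLS on the handful of aggregated observations, and compare with the between estimator:
preserve
collapse (mean) supportenv age male dmar demp educ incomerel ses, by(country)
reg supportenv age male dmar demp educ incomerel ses
restore
xtreg supportenv age male dmar demp educ incomerel ses, i(country) be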
Between Group Model . xtreg supportenv age male dmar demp educ incomerel ses, i(country) be Between regression (regression on group means) Number of obs = 27 Group variable (i): country Number of groups = 27 R-sq: within = . Obs per group: min = 1 between = 0.2505 avg = 1.0 overall = 0.2505 max = 1 F(7,19) = 0.91 sd(u_i + avg(e_i.))= .6378002 Prob > F = 0.5216 ------------------------------------------------------------------------------ supportenv | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | .0211517 .0391649 0.54 0.595 -.0608215 .1031248 male | 3.966173 4.479358 0.89 0.387 -5.409232 13.34158 dmar | .8001333 1.127099 0.71 0.486 -1.558913 3.15918 demp | -.0571511 1.165915 -0.05 0.961 -2.497439 2.383137 educ | .3743473 .2098779 1.78 0.090 -.0649321 .8136268 incomerel | .148134 .1687438 0.88 0.391 -.2050508 .5013188 ses | -.4126738 .4916416 -0.84 0.412 -1.441691 .6163439 _cons | 2.031181 3.370978 0.60 0.554 -5.024358 9.08672 Note: Results are identical to the aggregated analysis… Note that N is reduced to 27
Fixed vs. Random Effects Dummy variables produce a “fixed” estimate of the intercept for each group But, models don’t need to be based on fixed effects Example: The error term (ei) We could estimate a fixed value for all cases This would use up lots of degrees of freedom – even more than using group dummies In fact, we would use up ALL degrees of freedom Stata output would simply report back the raw data (expressed as deviations from the constant) Instead, we model e as a random variable We assume it is normal, with standard deviation sigma.
Random Effects A simple random intercept model Notation from Rabe-Hesketh & Skrondal 2005, p. 4-5 Random Intercept Model: $y_{ij} = \beta + \zeta_j + \epsilon_{ij}$ Where $\beta$ is the main intercept Zeta ($\zeta_j$) is a random effect for each group Allowing each of j groups to have its own intercept Assumed to be independent & normally distributed Error ($\epsilon_{ij}$) is the error term for each case Also assumed to be independent & normally distributed Note: Other texts refer to random intercepts as $u_j$ or $\nu_j$.
Random Effects Issue: The dummy variable approach (ANOVA, FEM) treats group differences as a fixed effect Alternatively, we can treat it as a random effect Don’t estimate values for each case, but model it This requires making assumptions e.g., that group differences are normally distributed with a standard deviation that can be estimated from data.
Linear Random Intercepts Model The random intercept idea can be applied to linear regression Often called a “random effects” model… Result is similar to FEM, BUT: FEM looks only at within group effects Aggregate models (“between effects”) look across groups Random effects models yield a weighted average of between & within group effects It exploits between & within information, and thus can be more efficient than FEM & aggregate models… IF distributional assumptions are correct.
Linear Random Intercepts Model . xtreg supportenv age male dmar demp educ incomerel ses, i(country) re Random-effects GLS regression Number of obs = 27807 Group variable (i): country Number of groups = 26 R-sq: within = 0.0220 Obs per group: min = 511 between = 0.0371 avg = 1069.5 overall = 0.0240 max = 2154 Random effects u_i ~ Gaussian Wald chi2(7) = 625.50 corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ supportenv | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0038709 .0008152 -4.75 0.000 -.0054688 -.0022731 male | .0978732 .0229632 4.26 0.000 .0528661 .1428802 dmar | .0030441 .0252075 0.12 0.904 -.0463618 .05245 demp | -.0737466 .0252831 -2.92 0.004 -.1233007 -.0241926 educ | .0857407 .0061501 13.94 0.000 .0736867 .0977947 incomerel | .0090308 .0059314 1.52 0.128 -.0025945 .0206561 ses | .131528 .0134248 9.80 0.000 .1052158 .1578402 _cons | 5.924611 .1287468 46.02 0.000 5.672272 6.17695 sigma_u | .59876138 sigma_e | 1.8701896 rho | .09297293 (fraction of variance due to u_i) Assumes normal uj, uncorrelated with X vars SD of u (intercepts); SD of e; intra-class correlation
Linear Random Intercepts Model Notes: Model can also be estimated with maximum likelihood estimation (MLE) Stata: xtreg y x1 x2 x3, i(groupid) mle Versus “re”, which specifies weighted least squares estimator Results tend to be similar But, MLE results include a formal test to see whether intercepts really vary across groups Significant p-value indicates that intercepts vary . xtreg supportenv age male dmar demp educ incomerel ses, i(country) mle Random-effects ML regression Number of obs = 27807 Group variable (i): country Number of groups = 26 … MODEL RESULTS OMITTED … /sigma_u | .5397755 .0758087 .4098891 .7108206 /sigma_e | 1.869954 .0079331 1.85447 1.885568 rho | .0769142 .019952 .0448349 .1240176 ------------------------------------------------------------------------------ Likelihood-ratio test of sigma_u=0: chibar2(01)= 2128.07 Prob>=chibar2 = 0.000
Choosing Models Which model is best? There is much discussion (e.g, Halaby 2004) Fixed effects are most consistent under a wide range of circumstances Consistent: Estimates approach true parameter values as N grows very large But, they are less efficient than random effects In cases with low within-group variation (big between group variation) and small sample size, results can be very poor Random Effects = more efficient But, runs into problems if specification is poor Esp. if X variables correlate with random group effects Usually due to omitted variables.
Hausman Specification Test Hausman Specification Test: A tool to help evaluate the fit of fixed vs. random effects Logic: Both fixed & random effects models are consistent if models are properly specified However, some model violations cause random effects models to be inconsistent Ex: if X variables are correlated with the random error In short: Models should give the same results… If not, random effects may be biased If results are similar, use the most efficient model: random effects If results diverge, odds are that the random effects model is biased. In that case use fixed effects…
Hausman Specification Test Strategy: Estimate both fixed & random effects models Save the estimates each time Finally invoke the Hausman test Ex: xtreg var1 var2 var3, i(groupid) fe estimates store fixed xtreg var1 var2 var3, i(groupid) re estimates store random hausman fixed random
Hausman Specification Test Example: Environmental attitudes fe vs re . hausman fixed random ---- Coefficients ---- | (b) (B) (b-B) sqrt(diag(V_b-V_B)) | fixed random Difference S.E. -------------+---------------------------------------------------------------- age | -.0038917 -.0038709 -.0000207 .0000297 male | .0979514 .0978732 .0000783 .0004277 dmar | .0024493 .0030441 -.0005948 .0007222 demp | -.0733992 -.0737466 .0003475 .0007303 educ | .0856092 .0857407 -.0001314 .0002993 incomerel | .0088841 .0090308 -.0001467 .0002885 ses | .1318295 .131528 .0003015 .0004153 ------------------------------------------------------------------------------ b = consistent under Ho and Ha; obtained from xtreg B = inconsistent under Ha, efficient under Ho; obtained from xtreg Test: Ho: difference in coefficients not systematic chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B) = 2.70 Prob>chi2 = 0.9116 Direct comparison of coefficients… Non-significant p-value indicates that models yield similar results…
Within & Between Effects What is the relationship between within-group effects (FEM) and between-effects (BEM)? Usually they are similar Ex: Student skills & test performance Within any classroom, skilled students do best on tests Between classrooms, classes with more skilled students have higher mean test scores.
Within & Between Effects Issue: Between and within effects can differ! Ex: Effects of wealth on attitudes toward welfare At the individual level (within group) Wealthier people are conservative, don’t support welfare At the country level (between groups): Wealthier countries (high aggregate mean) tend to have pro-welfare attitudes (ex: Scandinavia) Result: Wealth has opposite between vs within effects! Issue: Such dynamics often result from omitted level-1 variables (omitted variable bias) Ex: If we control for individual “political conservatism”, effects may be consistent at both levels…
Within & Between Effects You can estimate BOTH within- and between-group effects in a single model Strategy: Split a variable (e.g., SES) into two new variables… 1. Group mean SES 2. Within-group deviation from mean SES Often called “group mean centering” Then, put both variables into a random effects model Model will estimate separate coefficients for between vs. within effects Ex: egen meanvar1 = mean(var1), by(groupid) gen withinvar1 = var1 - meanvar1 Include mean (aggregate) & within variable in the model.
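Applied to age in the pro-environmental example on the next slide (assuming the same dataset), the full sequence is:
egen meanage = mean(age), by(country)
gen withinage = age - meanage
xtreg supportenv meanage withinage male dmar demp educ incomerel ses, i(country) mle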
Within & Between Effects Example: Pro-environmental attitudes . xtreg supportenv meanage withinage male dmar demp educ incomerel ses, i(country) mle Random-effects ML regression Number of obs = 27807 Group variable (i): country Number of groups = 26 Random effects u_i ~ Gaussian Obs per group: min = 511 avg = 1069.5 max = 2154 LR chi2(8) = 620.41 Log likelihood = -56918.299 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ supportenv | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- meanage | .0268506 .0239453 1.12 0.262 -.0200812 .0737825 withinage | -.003903 .0008156 -4.79 0.000 -.0055016 -.0023044 male | .0981351 .0229623 4.27 0.000 .0531299 .1431403 dmar | .003459 .0252057 0.14 0.891 -.0459432 .0528612 demp | -.0740394 .02528 -2.93 0.003 -.1235873 -.0244914 educ | .0856712 .0061483 13.93 0.000 .0736207 .0977216 incomerel | .008957 .0059298 1.51 0.131 -.0026651 .0205792 ses | .131454 .0134228 9.79 0.000 .1051458 .1577622 _cons | 4.687526 .9703564 4.83 0.000 2.785662 6.58939 Between & within effects are opposite. Older countries are MORE environmental, but older people are LESS. Omitted variables? Wealthy European countries with strong green parties have older populations!
Within & Between Effects / Centering Multilevel models & “centering” variables Grand mean centering: computing variables as deviations from overall mean Often done to X variables Has effect that baseline constant in model reflects mean of all cases Useful for interpretation Group mean centering: computing variables as deviation from group mean Useful for decomposing within vs. between effects Often in conjunction with aggregate group mean vars.
Generalizing: Random Coefficients The linear random intercept model allows random variation in the intercept (mean) for groups But, the same idea can be applied to other coefficients That is, slope coefficients can ALSO be random! Random Coefficient Model: $y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_{1j} + \zeta_{2j} x_{ij} + \epsilon_{ij}$ Which can be written as: $y_{ij} = (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j}) x_{ij} + \epsilon_{ij}$ Where $\zeta_{1j}$ is a random intercept component $\zeta_{2j}$ is a random slope component.
Linear Random Coefficient Model Rabe-Hesketh & Skrondal 2004, p. 63 Both intercepts and slopes vary randomly across j groups
Random Coefficients Summary Some things to remember: Dummy variables allow fixed estimates of intercepts across groups Interactions allow fixed estimates of slopes across groups Random coefficients allow intercepts and/or slopes to vary across groups randomly! The model does not directly estimate those group-specific effects, just as a model does not estimate a residual for each case BUT, random components can be predicted after the fact (just as you can compute residuals – random error).
STATA Notes: xtreg, xtmixed xtreg – allows estimation of between, within (fixed), and random intercept models xtreg y x1 x2 x3, i(groupid) fe - fixed (within) model xtreg y x1 x2 x3, i(groupid) be - between model xtreg y x1 x2 x3, i(groupid) re - random intercept (GLS) xtreg y x1 x2 x3, i(groupid) mle - random intercept (MLE) xtmixed – allows random slopes & coefs “Mixed” models refer to models that have both fixed and random components xtmixed [depvar] [fixed equation] || [random eq], options Ex: xtmixed y x1 x2 x3 || groupid: x2 Random intercept is assumed. Random coef for X2 specified.
STATA Notes: xtreg, xtmixed Random intercepts xtreg y x1 x2 x3, i(groupid) mle Is equivalent to xtmixed y x1 x2 x3 || groupid: , mle xtmixed assumes a random intercept – even if no other random effects are specified after “groupid” But, we can add random coefficients for all Xs: xtmixed y x1 x2 x3 || groupid: x1 x2 x3 , mle Note: xtmixed can do a lot… but GLLAMM can do even more! “General linear & latent mixed models” Must be downloaded into Stata. Type “search gllamm” and follow instructions to install…
Random intercepts: xtmixed Example: Pro-environmental attitudes . xtmixed supportenv age male dmar demp educ incomerel ses || country: , mle Mixed-effects ML regression Number of obs = 27807 Group variable: country Number of groups = 26 Obs per group: min = 511 avg = 1069.5 max = 2154 Wald chi2(7) = 625.75 Log likelihood = -56919.098 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ supportenv | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0038662 .0008151 -4.74 0.000 -.0054638 -.0022687 male | .0978558 .0229613 4.26 0.000 .0528524 .1428592 dmar | .0031799 .0252041 0.13 0.900 -.0462193 .0525791 demp | -.0738261 .0252797 -2.92 0.003 -.1233734 -.0242788 educ | .0857707 .0061482 13.95 0.000 .0737204 .097821 incomerel | .0090639 .0059295 1.53 0.126 -.0025578 .0206856 ses | .1314591 .0134228 9.79 0.000 .1051509 .1577674 _cons | 5.924237 .118294 50.08 0.000 5.692385 6.156089 [remainder of output cut off] Note: xtmixed yields identical results to xtreg , mle
Random intercepts: xtmixed Ex: Pro-environmental attitudes (cont’d) supportenv | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0038662 .0008151 -4.74 0.000 -.0054638 -.0022687 male | .0978558 .0229613 4.26 0.000 .0528524 .1428592 dmar | .0031799 .0252041 0.13 0.900 -.0462193 .0525791 demp | -.0738261 .0252797 -2.92 0.003 -.1233734 -.0242788 educ | .0857707 .0061482 13.95 0.000 .0737204 .097821 incomerel | .0090639 .0059295 1.53 0.126 -.0025578 .0206856 ses | .1314591 .0134228 9.79 0.000 .1051509 .1577674 _cons | 5.924237 .118294 50.08 0.000 5.692385 6.156089 ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ country: Identity | sd(_cons) | .5397758 .0758083 .4098899 .7108199 sd(Residual) | 1.869954 .0079331 1.85447 1.885568 LR test vs. linear regression: chibar2(01) = 2128.07 Prob >= chibar2 = 0.0000 xtmixed output puts all random effects below main coefficients. Here, they are “cons” (constant) for groups defined by “country”, plus residual (e) Non-zero SD indicates that intercepts vary
Random Coefficients: xtmixed Ex: Pro-environmental attitudes (cont’d) . xtmixed supportenv age male dmar demp educ incomerel ses || country: educ, mle [output omitted] supportenv | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0035122 .0008185 -4.29 0.000 -.0051164 -.001908 male | .1003692 .0229663 4.37 0.000 .0553561 .1453824 dmar | .0001061 .0252275 0.00 0.997 -.0493388 .049551 demp | -.0722059 .0253888 -2.84 0.004 -.121967 -.0224447 educ | .081586 .0115479 7.07 0.000 .0589526 .1042194 incomerel | .008965 .0060119 1.49 0.136 -.0028181 .0207481 ses | .1311944 .0134708 9.74 0.000 .1047922 .1575966 _cons | 5.931294 .132838 44.65 0.000 5.670936 6.191652 ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ country: Independent | sd(educ) | .0484399 .0087254 .0340312 .0689492 sd(_cons) | .6179026 .0898918 .4646097 .821773 sd(Residual) | 1.86651 .0079227 1.851046 1.882102 LR test vs. linear regression: chi2(2) = 2187.33 Prob > chi2 = 0.0000 Here, we have allowed the slope of educ to vary randomly across countries Educ (slope) varies, too!
Random Coefficients: xtmixed What are random coefficients doing? Let’s look at results from a simplified model Only random slope & intercept for education Model fits a different slope & intercept for each group!
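Although the group-specific intercepts and slopes are not shown in the main output, they can be predicted after estimation. A sketch for the simplified model just described (random slope & intercept for education only; the new variable names are hypothetical):
xtmixed supportenv educ || country: educ, mle
predict re_educ re_cons, reffects
* re_educ = predicted (BLUP) deviation of each country’s educ slope from the overall slope
* re_cons = predicted deviation of each country’s intercept from the overall intercept
Adding these predicted deviations to the overall (fixed) educ coefficient and constant gives each country’s own fitted line.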
Random Coefficients Why bother with random coefficients? 1. A solution for clustering (non-independence) Usually people just use random intercepts, but slopes may be an issue also 2. You can create a better-fitting model If slopes & intercepts vary, a random coefficient model may fit better Assuming distributional assumptions are met Model fit compared to OLS can be tested…. 3. Better predictions Attention to group-specific random effects can yield better predictions (e.g., slopes) for each group Rather than just looking at “average” slope for all groups 4. Helps us think about multilevel data Ex: cross-level interactions (we’ll discuss soon!)
Multilevel Model Notation So far, we have expressed random effects in a single equation: Random Coefficient Model: $y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_{1j} + \zeta_{2j} x_{ij} + \epsilon_{ij}$ However, it is common to separate the fixed and random parts into multiple equations: $y_{ij} = \beta_{1j} + \beta_{2j} x_{ij} + \epsilon_{ij}$ Just a basic OLS model… But, intercept & slope are each specified separately as having a random component Intercept equation: $\beta_{1j} = \beta_1 + \zeta_{1j}$ Slope equation: $\beta_{2j} = \beta_2 + \zeta_{2j}$
Multilevel Model Notation The “separate equation” formulation is no different from what we did before… But it is a vivid & clear way to present your models All random components are obvious because they are stated in separate equations NOTE: Some software (e.g., HLM) requires this Rules: 1. Specify an OLS model, just like normal 2. Consider which OLS coefficients should have a random component These could be the intercept or any X variable (slope) 3. Specify an additional formula for each random coefficient.