AMMBR from xtreg to xtmixed (+checking for normality, random slopes)

xtreg (with assumption checking)

We knew already ...
We have the standard regression model (here with only one x), but we think that the data are clustered, and that the intercept (c0) might be different for different clusters … where the S-variables are dummies per cluster. Because the number of clusters k can be large, this is not always feasible to estimate. Instead we estimate: … with delta normally distributed with zero mean and a variance to be estimated.
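The formulas on this slide did not survive in the transcript. A reconstruction that matches the wording is sketched below. The indices i (observation) and j (cluster), the dummy coefficients a_m, the error term and the variance symbol are assumptions, not necessarily the slide's notation.

```latex
\begin{align*}
y_{ij} &= c_0 + c_1 x_{ij} + \varepsilon_{ij}
  && \text{(standard regression model)} \\
y_{ij} &= c_0 + \sum_{m=2}^{k} a_m S_{m,ij} + c_1 x_{ij} + \varepsilon_{ij}
  && \text{(one dummy $S_m$ per cluster, first cluster as reference)} \\
y_{ij} &= c_0 + \delta_j + c_1 x_{ij} + \varepsilon_{ij},
  \qquad \delta_j \sim N(0, \sigma_\delta^2)
  && \text{(dummies replaced by a random intercept)}
\end{align*}
```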

And this you can do with xtreg
    xtset <clustervariable>
    xtreg y x1
… and by doing this, we are trying to take into account the fact that the errors are otherwise not independent.
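To try this without the course data, a minimal sketch on one of Stata's shipped example datasets can help. The dataset pig (48 pigs weighed in successive weeks) and its variables id, week and weight are a stand-in chosen here, not part of the slides.

```stata
* Assumption: the example dataset "pig" stands in for the course data
webuse pig, clear

* Declare the cluster variable (each pig is a cluster of repeated measurements)
xtset id

* Random-intercept model: weight on week, with a normally distributed
* intercept shift (delta) per pig
xtreg weight week, re
```

The output reports sigma_u (the standard deviation of delta) and sigma_e (the residual standard deviation).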

A note on xtreg: Replacing the dummies by a delta
This is only allowed when the cluster effects (the coefficients of the dummies) themselves follow a normal distribution.
TO CHECK THIS: First run your model with all the dummies included (if possible; it might not be feasible). Then check whether the coefficients of these dummies follow a normal distribution, using the following Stata code:

    * Run a regression (with numbered dummies d2 ... d40)
    * (the range d2-d40 assumes the dummies are stored next to each other in the dataset)
    reg y d2-d40 x1 x2
    * Write the coefficients to a new variable
    gen coef = .
    forvalues i = 2/40 {
        replace coef = _b[d`i'] if _n == `i'
    }
    * OR, with the older "for" syntax:
    * for num 2/40: replace coef = _b[dX] if _n==X
    swilk coef   // test for normality

xtmixed

What if c1 varies as well?
The same argument applies. We already had: … and now we make the c1 coefficient dependent on the cluster (“random slopes”). This is not feasible to estimate for large k, so instead we want to model: … with zeta a normally distributed variable with zero mean and a variance to be estimated.
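The equations on this slide are also missing from the transcript. Under the same assumed notation (i for observation, j for cluster, variance symbols chosen here), the two steps described in the text would read roughly as follows.

```latex
\begin{align*}
y_{ij} &= c_0 + \delta_j + c_{1j}\, x_{ij} + \varepsilon_{ij}
  && \text{(a separate slope $c_{1j}$ per cluster)} \\
y_{ij} &= c_0 + \delta_j + (c_1 + \zeta_j)\, x_{ij} + \varepsilon_{ij},
  \qquad \delta_j \sim N(0, \sigma_\delta^2),\;
         \zeta_j \sim N(0, \sigma_\zeta^2)
  && \text{(random intercept and random slope)}
\end{align*}
```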

And this you can do with xtmixed
    xtmixed y x1 || <clustervar>:
is just like the xtreg command, but if you want random slopes for x1, you add x1 after the “:”
    xtmixed y x1 || <clustervar>: x1
Your output then gives you estimates for the variance (or standard deviation) of delta and zeta.
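A small runnable sketch of both variants, again on the example dataset pig (an assumption; any clustered dataset would do). Note that from Stata 13 onwards the same command is called mixed, while xtmixed keeps working.

```stata
* Assumption: the example dataset "pig" stands in for the course data
webuse pig, clear

* Random intercept per pig only (comparable to xtreg)
xtmixed weight week || id:

* Random intercept and random slope for week per pig
xtmixed weight week || id: week
```

In the second model, the random-effects part of the output lists one entry for week (zeta) and one for _cons (delta).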

xtmixed can deal with nested clusters too! (here: “classes within schools”)
Again the same kind of argument applies. We already had: … and we want separate constant terms per class and per school. So we estimate instead: … where delta is again a normally distributed variable at the school level with zero mean and a variance to be estimated, and tau is a normally distributed variable at the class level with zero mean and a variance to be estimated.
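A possible reconstruction of the missing three-level equation, with pupil i in class j in school s. The indices and variance symbols are assumptions; only delta (school level) and tau (class level) come from the slide text.

```latex
\begin{align*}
y_{ijs} &= c_0 + \delta_s + \tau_{js} + c_1 x_{ijs} + \varepsilon_{ijs}, \\
\delta_s &\sim N(0, \sigma_\delta^2) \quad \text{(school level)}, \qquad
\tau_{js} \sim N(0, \sigma_\tau^2) \quad \text{(class level)}
\end{align*}
```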

And this you can do with xtmixed as well
    xtmixed y x1 || school: || class:
Remember to put the bigger cluster on the left!
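As a hedged illustration of the nested syntax, the sketch below uses Stata's example dataset productivity, where states are nested within regions. That dataset and its variables (gsp, unemp, region, state) are a stand-in for "classes within schools", not the course data.

```stata
* Assumption: "productivity" (states within regions) stands in for
* classes within schools
webuse productivity, clear

* Bigger cluster (region) on the left, smaller cluster (state) on the right
xtmixed gsp unemp || region: || state:
```

The output then shows one variance component for region, one for state within region, and the residual.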

[show this in Stata] (compare empty xtmixed with xtreg)
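One way to make this comparison, sketched on the example dataset pig (an assumption). Both commands fit an "empty" model with only a constant and a random intercept, and both are asked to use maximum likelihood so that the estimates are directly comparable.

```stata
* Assumption: the example dataset "pig" stands in for the course data
webuse pig, clear
xtset id

* Empty random-intercept model with xtreg, fitted by maximum likelihood
xtreg weight, mle

* The same empty model with xtmixed, also by maximum likelihood
xtmixed weight || id:, mle
```

The constant, sigma_u versus sd(_cons), and sigma_e versus sd(Residual) should agree between the two outputs.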

Horrors
(First: you now have a wealth of opportunities with clustered data. All effects might depend on any cluster level.)
xtmixed finds its estimates using an iterative process. This can complicate matters:
- it might not converge
- it might converge, but to the wrong values (and you can’t tell)
- it might converge to different estimates for different algorithms in the iterative process
You have only a couple of weapons against that:
- run again using a different algorithm (use option “, mle”)
- allow estimation of correlations as well (use option “, cov(unstr)”)
- run the dummy variant (with lots of dummies) anyway
I do not know if any of these horrors will happen in the data you get! This is also something you can pre-check yourselves.
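The three weapons can be tried out as follows. This is a sketch on the example dataset pig (an assumption); with your own data, replace weight, week and id by your y, x1 and cluster variable.

```stata
* Assumption: the example dataset "pig" stands in for the course data
webuse pig, clear

* Weapon 1: request maximum likelihood explicitly (the slide's ", mle"),
* and restricted maximum likelihood for comparison
xtmixed weight week || id: week, mle
xtmixed weight week || id: week, reml

* Weapon 2: also estimate the correlation between intercept and slope
xtmixed weight week || id: week, cov(unstructured)

* Weapon 3: the dummy variant, one dummy per cluster in a plain regression
reg weight week i.id
```

If the fixed-effect estimates differ noticeably between these runs, that is a sign that the iterative procedure is not giving a stable answer.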

Splitting up variables (within vs across clusters)
Basically this is completely unrelated to the previous topic. The important thing is that it can be done in clustered data, and that it can lead to different interpretations (see before). Do note that if you have three levels (pupils within classes within schools), then you can average out on each level …
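A minimal sketch of such a split: the cluster mean carries the across-cluster part of a variable, and the deviation from that mean carries the within-cluster part. The example dataset nlswork (women observed over several years) and its variables idcode, tenure and ln_wage are an assumption, as are the new variable names.

```stata
* Assumption: "nlswork" stands in for the course data
webuse nlswork, clear

* Across-cluster part: the mean of tenure per person
bysort idcode: egen tenure_mean = mean(tenure)

* Within-cluster part: the deviation from the person's own mean
gen tenure_dev = tenure - tenure_mean

* Both parts enter the model as separate predictors
xtmixed ln_wage tenure_mean tenure_dev || idcode:
```

The coefficient of tenure_mean compares persons with each other, while the coefficient of tenure_dev describes changes within one person over time.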

More to do ...
Multilevel data and Y = binary -> xtlogit
Multilevel data and levels are not nested -> “cross-classified” multilevel models -> xtmixed
The random utility model -> clogit

Exam approaching ... PRACTICE!