Toolkit + “show your skills”

AMMBR from xtreg to xtmixed (+checking for normality, and random slopes, and cross-classified models, and then we are almost done in terms of theory)

xtreg (with assumption checking)

We have the standard regression model (here with only one x), but think that the data are clustered, and that the intercept (c0) might be different for different clusters … where the S-variables are dummies per cluster. Because k (the number of clusters) can be large, this is not always feasible to estimate.

Instead we estimate: … with delta normally distributed with zero mean and a variance to be estimated. We knew this already.
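Written out, the three models described here presumably look as follows. This is only a sketch: the indices i (observation) and j (cluster), the dummy coefficients a_m and the variance symbol sigma_delta^2 are assumed notation, not taken from the slides.

```latex
\begin{align}
% standard regression model with one x
y_{ij} &= c_0 + c_1 x_{ij} + \varepsilon_{ij} \\
% a separate intercept per cluster through the dummies S_2, ..., S_k
y_{ij} &= c_0 + c_1 x_{ij} + \sum_{m=2}^{k} a_m S_{m,ij} + \varepsilon_{ij} \\
% the dummies replaced by a single random term delta
y_{ij} &= c_0 + c_1 x_{ij} + \delta_j + \varepsilon_{ij},
\qquad \delta_j \sim N(0, \sigma_\delta^2)
\end{align}
```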

And this you can do with xtreg (with id the variable that identifies the cluster):
xtset id
xtreg y x1
… and by doing this, we are trying to take into account the fact that the errors are otherwise not independent.

xtreg: replacing the dummies by a delta

This is only allowed when the coefficients of the dummies themselves follow a normal distribution (and when delta and epsilon do not correlate).

CHECK NO 1: First run your model with all the dummies included (if possible; this might not be feasible). Then check whether the coefs of these dummies follow a normal distribution through the following Stata code:

* Run a regression (with numbered dummies; d2-d40 assumes they are stored next to each other)
reg y d2-d40 x1 x2
* Write the coefficients to a new variable
gen coef = .
forvalues i = 2/40 {
    replace coef = _b[d`i'] if _n == `i'
}
* OR, with the older for command:
* for num 2/40: replace coef = _b[dX] if _n==X
* Test for normality
swilk coef

Note: with all the dummies included, you consider the “within-effects” (the d_ variables) only!

CHECK NO 2: Compare the “dummy-estimates” with the “delta-estimates”:
xtset id
xtreg y x1 x2, fe        // “fe” for “fixed effects”
estimates store fixed    // store these estimates
xtreg y x1 x2, re        // “re” for “random effects”
estimates store random   // store these estimates
hausman fixed random     // compare the estimates

Try it yourselves: the THKS data (Tobacco and Health Knowledge Scale)
Variables: PostTHKS, PreTHKS, CC, TV, CCTV
Target variable is PostTHKS

xtmixed (random slopes, and >2 levels)

What if c1 varies as well? The same argument applies. We already had: … and now make the c1 coefficient dependent on the cluster (“random slopes”). This is not feasible to estimate for large k, so instead we want to model: … with zeta a normally distributed variable with zero mean and variance to be estimated.
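As a sketch of what the random-slopes model presumably looks like (the indices i for observations and j for clusters and the variance symbols are assumed notation, not taken from the slides):

```latex
\begin{align}
% random intercept delta_j and random slope zeta_j per cluster j
y_{ij} &= c_0 + (c_1 + \zeta_j)\, x_{ij} + \delta_j + \varepsilon_{ij}, \\
\delta_j &\sim N(0, \sigma_\delta^2), \qquad \zeta_j \sim N(0, \sigma_\zeta^2)
\end{align}
```

So the slope of x in cluster j is c_1 + zeta_j, and only the variance of zeta has to be estimated instead of one slope per cluster.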

xtreg does not do this (it only does random intercepts)

And this you can do with xtmixed (with id the variable that identifies the cluster):
xtmixed y x1 || id:
is just like the xtreg command, but if you want random slopes for x1, you add x1 after the “:”
xtmixed y x1 || id: x1
Your output then gives you estimates for the variance (or standard deviation) of delta and zeta.

The THKS data (Tobacco and Health Knowledge Scale)
Variables: PostTHKS, PreTHKS, CC, TV, CCTV
Target variable is PostTHKS

xtmixed postthks cc || schoolid: cc

xtmixed can deal with nested clusters too! (here: “classes within schools”)

Again the same kind of argument applies. We already had: … and we want separate constant terms per class and per school. So we estimate instead: … where delta is again a normally distributed variable at the school level with zero mean and variance to be estimated, and tau is a normally distributed variable at the class level with zero mean and variance to be estimated.
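A sketch of the nested model this describes, with assumed notation (i for pupils, j for classes, k for schools; the variance symbols are also assumptions):

```latex
\begin{align}
% school-level term delta_k plus class-level term tau_jk
y_{ijk} &= c_0 + c_1 x_{ijk} + \delta_k + \tau_{jk} + \varepsilon_{ijk}, \\
\delta_k &\sim N(0, \sigma_\delta^2), \qquad \tau_{jk} \sim N(0, \sigma_\tau^2)
\end{align}
```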

And this you can do with xtmixed as well:
xtmixed y x1 || school: || class:
Remember to put the bigger cluster on the left!

xtmixed postthks || schoolid: || classid:

[show this in Stata] (compare empty xtmixed with xtreg)
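A minimal sketch of such a comparison, on simulated data rather than the course data. The variable names (school, delta, y), the seed and the chosen parameter values are all hypothetical. Both commands are run with maximum likelihood so that they fit the same empty random-intercept model.

```stata
* Simulated data: 40 schools with 25 pupils each (all names hypothetical)
clear
set seed 2014
set obs 40
gen school = _n
gen delta = rnormal(0, 1)          // school-level random intercept
expand 25                          // 25 pupils per school
gen y = 5 + delta + rnormal(0, 2)  // empty model: constant + delta + epsilon

* Empty random-intercept model with xtreg (maximum likelihood)
xtset school
xtreg y, mle

* The same empty model with xtmixed (also maximum likelihood)
xtmixed y || school:, mle
```

The constant and the two variance components should then agree between the two outputs, up to how each command reports them (xtreg shows sigma_u and sigma_e). In newer Stata releases xtmixed is called mixed.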

Horrors

xtmixed finds its estimates using an iterative process. This can complicate matters:
– it might not converge
– it might converge but to the wrong values (and you can’t tell)
– it might converge to different estimates for different algorithms in the iterative process

You have only a couple of weapons against that:
– run again using a different algorithm (use option “, mle”)
– allow estimation of correlations as well (use option “, cov(unstr)”)
– (run the dummy-variant (with lots of dummies) anyway)

I do not know if any of these horrors will happen in the data you get! This is also something you can pre-check yourselves.

(first: you now have a wealth of opportunities with clustered data. All effects might depend on any kind of cluster-level.)

Splitting up variables (within vs across clusters)

Basically this is completely unrelated to the previous. The important thing is that it can be done in clustered data, and can lead to different interpretations (see before).

HOWEVER: Note that if you have three or more levels (pupils within classes within schools) then you can average out on each level …
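One common way to do this split is to compute the cluster mean of a variable and the deviation from that mean, and to enter both in the model. The sketch below uses simulated data. All names (school, delta, x, y, x_mean, x_within) and parameter values are hypothetical, and x is deliberately made to correlate with the cluster effect so that the two parts can behave differently.

```stata
* Simulated data: 40 schools with 25 pupils each (all names hypothetical)
clear
set seed 12345
set obs 40
gen school = _n
gen delta = rnormal(0, 1)                 // school-level effect
expand 25                                 // 25 pupils per school
gen x = rnormal(0, 1) + 0.5*delta         // x correlates with the school effect
gen y = 1 + 2*x + delta + rnormal(0, 1)

* Across-cluster part: the school mean of x
bysort school: egen x_mean = mean(x)
* Within-cluster part: deviation of x from its school mean
gen x_within = x - x_mean

* Enter both parts separately in a random-intercept model
xtset school
xtreg y x_within x_mean, re
```

The coefficient of x_within describes differences between pupils in the same school, the coefficient of x_mean differences between schools; the two need not coincide. With three or more levels the same egen step can be repeated per level.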

There is more...
Multilevel data and Y = binary → xtlogit
Multilevel data and levels are not nested → “cross-classified” multilevel models → xtmixed
The random utility model → clogit
The cross-classified models with xtmixed are exam material; clogit and xtlogit are not.

Cross-classified multi-level models

You use the xt-commands to “summarize a large set of dummies”, so to speak, and you have seen this happening:
– … with the intercept (xtreg)
– … with the slope (xtmixed)
– … with nested intercepts (xtmixed)

And you can also apply it on non-nested clusters (“cross-classified multilevel models”).

And you do this also with xtmixed:
xtmixed Y X || _all: R.school || _all: R.club
In this example, Y is the target variable, predicted with X, given that there are two non-nested clusters: school and club.

Note: you could try this, for instance, on the motoroccasion.dta data set. (NB: you only need to know this basic option, no more complicated ones.)
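A sketch of the model behind this command, in assumed notation: school(i) and club(i) denote the school and the club of observation i, and gamma is a hypothetical symbol for the club-level term.

```latex
\begin{align}
% one random intercept per school and one per club, not nested in each other
y_i &= c_0 + c_1 x_i + \delta_{\mathrm{school}(i)} + \gamma_{\mathrm{club}(i)} + \varepsilon_i, \\
\delta &\sim N(0, \sigma_\delta^2), \qquad \gamma \sim N(0, \sigma_\gamma^2)
\end{align}
```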

Exam approaching...