Ordinary Least Squares Estimator Using STATA


Evaluation of Public Policy

OLS using STATA

Consider a sample of 250 students in the same school, 48% of whom are female. We are interested in measuring the effect of sex on height. The function we want to estimate is:

$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$$

where $y_i$ is the height of student $i$ in centimeters (variable high) and $x_i$ is a dummy variable (variable female) equal to 0 when the student is male and equal to 1 when the student is female:

$$x_i = \begin{cases} 0 & \text{when } i \text{ is male} \\ 1 & \text{when } i \text{ is female} \end{cases}$$
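To make the role of the dummy explicit, the short derivation below may help. It assumes the usual condition $E[\varepsilon_i \mid x_i] = 0$, which the slide does not state; under that assumption $\beta_1$ is the mean height of males and $\beta_2$ is the female-male difference in mean height.

```latex
\begin{aligned}
E[y_i \mid x_i = 0] &= \beta_1 \\
E[y_i \mid x_i = 1] &= \beta_1 + \beta_2 \\
\beta_2 &= E[y_i \mid x_i = 1] - E[y_i \mid x_i = 0]
\end{aligned}
```

With a single dummy regressor and a constant, the OLS estimates reproduce the sample analogues: $\hat\beta_1$ is the average height of the males in the sample and $\hat\beta_2$ is the difference between the female and male sample averages.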

Command regress [reg]

In STATA, the command that estimates a model by OLS is regress (abbreviated reg). First open the dataset with the command use, then run the regression with reg: the first variable after the command is the dependent variable, the others are the explanatory variables:

    reg y x1 x2

Exercise: open the dataset high_school.dta and replicate the regression specified above with the OLS model.

Stata's first source of help is its online help; for more details you can consult the PDF manual.
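As a rough sketch of what the exercise looks like, the block below runs the regression on simulated data rather than on high_school.dta, which is not reproduced here. The sample size and the 48% share of females come from the slides; the seed, the mean heights, and the noise level are assumptions made up for illustration.

```stata
* Simulated stand-in for high_school.dta (all numeric parameters are assumptions)
clear
set seed 12345
set obs 250
generate female = runiform() < 0.48               // dummy: 1 = female, 0 = male
generate high = 175 - 10*female + rnormal(0, 7)   // height in cm, made-up parameters

* OLS: dependent variable first, explanatory variable(s) after
regress high female

* With one dummy regressor, the slope equals the difference in group means
summarize high if female == 0
scalar mean_male = r(mean)
summarize high if female == 1
scalar mean_female = r(mean)
display "beta2 = " _b[female] "   mean difference = " (mean_female - mean_male)
```

On the real data, the first lines would be replaced by `use high_school.dta, clear`, assuming the file sits in the working directory.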

STATA results 01

The estimate of $\beta_2$ is statistically significant at the 1% level because its p-value is lower than 0.01.

Command regress: options

Notice that Stata automatically adds a constant. To exclude it, add nocons as an option:

    reg y x1 x2, nocons

Like any other command in Stata, regress can be applied to a subset of the observations. Suppose you want to run two separate regressions, one for students under 18 and the other for students aged 18 or over:

    reg y x1 x2 if x3<18
    reg y x1 x2 if x3>=18 & !missing(x3)

The second condition uses >= rather than ==, which would keep only students who are exactly 18. The !missing(x3) part is needed because Stata treats missing values as larger than any number.

Exercise: using the dataset high_school.dta, replicate the regression specified above only for students without illness.
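The two options can be sketched on simulated data as follows. The variable names age and male, and all numeric parameters, are assumptions for illustration and are not taken from high_school.dta. The last regression shows one common use of noconstant: with both sex dummies and no constant, each coefficient is the mean height of its group.

```stata
* Simulated data (all parameters are assumptions)
clear
set seed 2024
set obs 250
generate female = runiform() < 0.48
generate age = 14 + floor(6*runiform())            // integer ages 14 to 19
generate high = 150 + 1.5*age - 10*female + rnormal(0, 6)

* Separate regressions on two subsets of the observations
regress high female if age < 18
regress high female if age >= 18 & !missing(age)

* No constant: include both dummies so each coefficient is a group mean
generate male = 1 - female
regress high male female, noconstant
summarize high if female == 1
scalar mean_female = r(mean)
```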

STATA results 02

The estimate of $\beta_2$ is still statistically significant at the 1% level because its p-value is lower than 0.01.

Command global

You will sometimes find it tedious to copy and paste the list of control variables into each regression you run. An easy and elegant way to save time and space is to define the list of control variables at the beginning of the code with the command global:

    global controls x1 x2 x3

Then simply enter "$controls" each time you need the list of control variables:

    reg y $controls

Exercise: open the dataset high_school.dta and generate the variable olympus, equal to one if the district is Mount Olympus. Replicate the regression, adding the variables age and olympus as explanatory variables, using the command global.
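A possible shape for this exercise is sketched below on simulated data. The name and string type of the district variable, the share of students in Mount Olympus, and every numeric parameter are assumptions, since the structure of high_school.dta is not shown in the slides.

```stata
* Simulated data (variable names and parameters are assumptions)
clear
set seed 99
set obs 250
generate female = runiform() < 0.48
generate age = 14 + floor(6*runiform())
generate str20 district = cond(runiform() < 0.2, "Mount Olympus", "Other")
generate high = 150 + 1.5*age - 10*female + rnormal(0, 6)

* Dummy equal to one if the district is Mount Olympus
generate olympus = (district == "Mount Olympus")

* Define the list of regressors once, then reuse it
global controls female age olympus
regress high $controls
```

Because a global macro stays defined for the whole Stata session, changing the list in one place updates every later regression that uses $controls.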

STATA results 03

The variable age also helps explain the height of the students; the variable olympus is not statistically significant.

Predict 01

After every estimation command (e.g. reg, logit, probit), "predicted" values (fitted values, residuals, etc.) can be stored in a new variable using the command predict. In our example:

    reg y x1 x2 x3
    predict fit

Enter "browse y fit" to see how well the model fits.

Exercise: open the dataset high_school.dta and, using the last regression, show the predicted values of the regression.

STATA results 04

Predict 02

Alternatively, the residuals can be obtained with the option residuals:

    predict res, residuals

Notice that this is the same as generating the residuals by hand:

    gen res_alternative = y - fit

The residuals can be shown in a histogram:

    histogram res

Exercise: open the dataset high_school.dta and, using the last regression, show a graph of the distribution of the residuals.
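The two ways of obtaining residuals can be compared directly. The sketch below does so on simulated data; the variable names and all parameters are assumptions. The default storage type of predict is float, so the two versions agree only up to float precision.

```stata
* Simulated data (all parameters are assumptions)
clear
set seed 7
set obs 250
generate female = runiform() < 0.48
generate age = 14 + floor(6*runiform())
generate high = 150 + 1.5*age - 10*female + rnormal(0, 6)

regress high female age
predict fit                      // default after regress: fitted values
predict res, residuals           // residuals computed by Stata
generate res_alternative = high - fit

* The two residual variables should differ only by rounding error
generate diff = abs(res - res_alternative)
summarize diff

* Distribution of the residuals, with a normal density overlaid
histogram res, normal
```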

STATA results 05

Fixed Effect Estimator in STATA

Before running the regression, we need to declare that we are using panel data with the xtset command:

    xtset panelvar timevar

The fixed-effects (FE) model is estimated in Stata with the option fe of the xtreg command:

    xtreg y x1 x2, fe

To create dummy variables that control for time fixed effects:

    xi: xtreg y x1 x2 i.timevar

Note that xtreg without the fe option fits a random-effects model by default; add fe to keep the panel fixed effects alongside the time dummies.
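The commands above can be tried on a small simulated panel. Everything in the block is hypothetical: the 20 countries, the 10 years, and the data-generating process are assumptions and have nothing to do with panel_wb.dta. The block also runs the dummy-variable (LSDV) regression, whose slope should match the one from xtreg with fe. It uses factor-variable notation (i.year, i.country), which requires Stata 11 or later and makes the xi: prefix unnecessary.

```stata
* Simulated panel: 20 countries observed for 10 years (all assumptions)
clear
set seed 42
set obs 20
generate country = _n
generate alpha = rnormal(0, 2)            // unobserved country effect
expand 10                                 // 10 rows per country
bysort country: generate year = 2000 + _n
generate x1 = rnormal() + 0.5*alpha       // regressor correlated with the effect
generate y = 1 + 2*x1 + alpha + rnormal()

xtset country year

regress y x1                              // pooled OLS, ignores the country effect
xtreg y x1, fe                            // country fixed effects
scalar b_fe = _b[x1]

regress y x1 i.country                    // LSDV: one dummy per country
scalar b_lsdv = _b[x1]

xtreg y x1 i.year, fe                     // country and year fixed effects
display "FE slope = " b_fe "   LSDV slope = " b_lsdv
```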

Fixed Effect Estimator in STATA

Exercise:
Open the dataset panel_wb.dta.
Run a simple regression in Stata in which the dependent variable is GDP per capita.
Replicate the previous regression with country fixed effects.
Replicate the first regression with year fixed effects.

STATA results 06

STATA results 07

STATA results 08