Regression Forced March 17.871 Spring 2006. Regression quantifies how one variable can be described in terms of another.


Black Elected Officials Example I

Stop a second: What is the correlation between beo & bpop? .72, .82, .92?

The Linear Relationship between Two Variables

The Linear Relationship between African American Population & Black Legislators

How did we get that line? 1. Pick a representative value of $Y_i$

How did we get that line? 2. Decompose $Y_i$ into two parts

How did we get that line? 3. Label the points: observed $Y_i$, fitted $\hat{Y}_i$, and $\varepsilon_i = Y_i - \hat{Y}_i$ (the “residual”)

Stop a moment: What is $\varepsilon_i$?
– Vagueness of theory
– Poor proxies (i.e., measurement error)
– Wrong functional form
See the Utts & Heckard discussion of the difference between deterministic relationships and statistical relationships.

The Method of Least Squares
[Figure: observed $Y_i$, fitted $\hat{Y}_i$, and residual $\varepsilon_i = Y_i - \hat{Y}_i$]

Solve for (Utts & Heckard, p. 164)

Solve for (Utts & Heckard, p. 164)
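The equations on these two slides did not survive in the transcript. As a hedged reminder rather than a reproduction of what Utts & Heckard print, the standard least-squares result is sketched below. The symbols $B_0$ (intercept) and $B_1$ (slope) are borrowed from the residuals slide; $\bar{X}$ and $\bar{Y}$ denote sample means and are notation assumed here.

```latex
\min_{B_0,\,B_1} \sum_{i=1}^{n} \left( Y_i - B_0 - B_1 X_i \right)^2
\quad\Longrightarrow\quad
B_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
B_0 = \bar{Y} - B_1 \bar{X}.
```

Setting the two partial derivatives of the sum of squares to zero gives these expressions; the second one says the fitted line passes through the point of means.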

About the Functional Form
Linear in the variables vs. linear in the parameters
– $Y = a + bX + e$ (linear in both)
– $Y = a + bX + cX^2 + e$ (linear in parms.)
– $Y = a + X^b + e$ (linear in variables)
– $Y = a + \ln X^b / Z^c + e$ (linear in neither)
Utts & Heckard pp

Black Elected Officials

Log transformations
– $Y = a + bX + e$: $b = dY/dX$, or b = the unit change in Y given a unit change in X. Typical case.
– $Y = a + b \ln X + e$: $b = dY/(dX/X)$, or b = the unit change in Y given a % change in X. Cases where there’s a natural limit on growth.
– $\ln Y = a + bX + e$: $b = (dY/Y)/dX$, or b = the % change in Y given a unit change in X. Exponential growth.
– $\ln Y = a + b \ln X + e$: $b = (dY/Y)/(dX/X)$, or b = the % change in Y given a % change in X (elasticity). Economic production.
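To make the elasticity case concrete, here is a minimal Python sketch (not from the slides, which use Stata). It assumes hypothetical, noise-free data generated from Y = 3·X^0.5, and the helper `ols` is an illustrative name. Regressing ln Y on ln X should return the exponent as the slope.

```python
import math

def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data generated exactly from Y = 3 * X^0.5 (assumption, no noise).
X = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
Y = [3.0 * xi ** 0.5 for xi in X]

# Double-log model: ln Y = a + b ln X, so b is the elasticity.
a, b = ols([math.log(xi) for xi in X], [math.log(yi) for yi in Y])
print("elasticity b:", b)
print("constant exp(a):", math.exp(a))
```

With real data the fit would not be exact, and b would be read as the approximate % change in Y for a 1% change in X.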

How “good” is the fitted line?

Judging results
Substantive interpretation of coefficients
Technical judgment of regression
– Judgment of coefficients
– Judgment of overall fit

Determining Goodness of Fit I
Coefficients
– Standard error of a coefficient
– t-statistic: coeff./s.e.

Standard error of the regression picture
[Figure: observed $Y_i$, fitted $\hat{Y}_i$, and residuals $\varepsilon_i = Y_i - \hat{Y}_i$]
Add these up after squaring

Determining Goodness of Fit
Standard error of the regression or standard error of estimate (Root mean square error in STATA)
d.f. = n-2
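The formula itself is missing from the transcript. Under the usual definition, the standard error of the regression is the square root of the sum of squared residuals divided by n-2, and the slope's standard error and t-statistic follow from it. A minimal Python sketch, using small hypothetical data invented for illustration:

```python
import math

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residuals: observed minus fitted values.
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Square the residuals, add them up, divide by d.f. = n - 2, take the root.
rmse = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))

# Standard error of the slope and its t-statistic (coeff./s.e.).
se_slope = rmse / math.sqrt(sxx)
t_stat = slope / se_slope

print("Root MSE:", rmse)
print("slope:", slope, "s.e.:", se_slope, "t:", t_stat)
```

The two degrees of freedom are lost because the intercept and slope are both estimated from the same data.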

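R² picture
[Figure: scatter of beo against bpop with fitted values, marking the mean $\bar{Y}$, the total deviation $(Y_i - \bar{Y})$, the residual $(Y_i - \hat{Y}_i)$, and the explained deviation $(\hat{Y}_i - \bar{Y})$]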

Determining Goodness of Fit R-squared “coefficient of determination”
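The formula on this slide was lost in extraction. A hedged reconstruction in terms of the three deviations labeled in the R² picture, using the standard definition rather than the slide's exact layout:

```latex
\sum_{i=1}^{n} (Y_i - \bar{Y})^2
  = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2,
\qquad
R^2 = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}
    = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}.
```

In a bivariate regression with an intercept, $R^2$ equals the squared correlation between X and Y.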

Return to Black Elected Officials Example
. reg beo bpop
[Stata regression output for beo on bpop, with F(1, 39); the numeric values were not preserved in this transcript]

Residuals
$e_i = Y_i - B_0 - B_1 X_i$

[Figure: plot with points labeled AL and IL]

One important numerical property of residuals The sum of the residuals is zero.
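This property is easy to check numerically. The Python sketch below uses hypothetical data and fits a line with an intercept, then forms fitted values and residuals in the way Stata's `predict` and `predict, resid` would. The variable names are illustrative.

```python
# Hypothetical data, invented for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]      # analogous to: predict py
resid = [yi - fi for yi, fi in zip(y, fitted)]     # analogous to: predict ry, resid

# Individual residuals are nonzero, but they cancel out in the sum
# (up to floating-point rounding) because the model includes an intercept.
print(resid)
print(sum(resid))
```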

Regression Commands in STATA
reg depvar indvars
predict newvar
predict newvar, resid

Why It’s Called Regression
[Figure axes: Height of Fathers, Height of Sons]

Some Regressions

Temperature and Latitude

. reg jantemp latitude
[Stata regression output for jantemp on latitude, with F(1, 18); the numeric values were not preserved in this transcript]
. predict py
(option xb assumed; fitted values)
. predict ry, resid

. gsort -ry
. list city jantemp py ry
[Listing with columns city, jantemp, py, ry; the numeric values were not preserved. Cities in descending order of residual:]
1. PortlandOR
2. SanFranciscoCA
3. LosAngelesCA
4. PhoenixAZ
5. NewYorkNY
6. MiamiFL
7. BostonMA
8. NorfolkVA
9. BaltimoreMD
10. SyracuseNY
11. MobileAL
12. WashingtonDC
13. MemphisTN
14. ClevelandOH
15. DallasTX
16. HoustonTX
17. KansasCityMO
18. PittsburghPA
19. MinneapolisMN
20. DuluthMN

Bush Vote and Southern Baptists

. reg bush sbc_mpct
[Stata regression output for bush on sbc_mpct, with F(1, 48); the numeric values were not preserved in this transcript]

Weight by State Population
. reg bush sbc_mpct [aw=votes]
(sum of wgt is e+08)
[Stata regression output for the weighted regression, with F(1, 48); the numeric values were not preserved in this transcript]
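As a rough illustration of what weighting does to the coefficients, the Python sketch below minimizes the weighted sum of squared residuals. It assumes that the point estimates from analytic weights match those of weighted least squares; the data are hypothetical stand-ins for sbc_mpct, bush, and votes, and `wls` is an illustrative helper name. Standard errors are not covered.

```python
def wls(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Hypothetical values, invented for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]
votes = [1.0, 1.0, 1.0, 5.0]

a_eq, b_eq = wls(x, y, [1.0] * len(x))  # equal weights: ordinary least squares
a_w, b_w = wls(x, y, votes)             # last observation counts five times as much

print("unweighted:", a_eq, b_eq)
print("weighted:  ", a_w, b_w)
```

Giving more weight to large states pulls the fitted line toward their observations, which is why the weighted and unweighted slopes can differ.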

Midterm loss & pres’l popularity

. reg loss gallup
[Stata regression output for loss on gallup, with F(1, 15) = 5.70; the other numeric values were not preserved in this transcript]

. reg loss gallup if year>1948
[Stata regression output for the restricted sample, with F(1, 12); the numeric values were not preserved in this transcript]