More on Regression 17.871 Spring 2007. The Linear Relationship between African American Population & Black Legislators.

Slides:



Advertisements
Similar presentations
Dummy Variables and Interactions. Dummy Variables What is the the relationship between the % of non-Swiss residents (IV) and discretionary social spending.
Advertisements

Christopher Dougherty EC220 - Introduction to econometrics (chapter 1) Slideshow: exercise 1.7 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 1) Slideshow: exercise 1.16 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Heteroskedasticity The Problem:
ELASTICITIES AND DOUBLE-LOGARITHMIC MODELS
HETEROSCEDASTICITY-CONSISTENT STANDARD ERRORS 1 Heteroscedasticity causes OLS standard errors to be biased is finite samples. However it can be demonstrated.
Lecture 9 Today: Ch. 3: Multiple Regression Analysis Example with two independent variables Frisch-Waugh-Lovell theorem.
1 Nonlinear Regression Functions (SW Chapter 8). 2 The TestScore – STR relation looks linear (maybe)…
Sociology 601, Class17: October 27, 2009 Linear relationships. A & F, chapter 9.1 Least squares estimation. A & F 9.2 The linear regression model (9.3)
Christopher Dougherty EC220 - Introduction to econometrics (chapter 3) Slideshow: exercise 3.5 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Adaptive expectations and partial adjustment Presented by: Monika Tarsalewska Piotrek Jeżak Justyna Koper Magdalena Prędota.
Valuation 4: Econometrics Why econometrics? What are the tasks? Specification and estimation Hypotheses testing Example study.
Sociology 601 Class 25: November 24, 2009 Homework 9 Review –dummy variable example from ASR (finish) –regression results for dummy variables Quadratic.
Sociology 601 Class 28: December 8, 2009 Homework 10 Review –polynomials –interaction effects Logistic regressions –log odds as outcome –compared to linear.
1 Multiple Regression EPP 245/298 Statistical Analysis of Laboratory Data.
Regression Example Using Pop Quiz Data. Second Pop Quiz At my former school (Irvine), I gave a “pop quiz” to my econometrics students. The quiz consisted.
Introduction to Regression Analysis Straight lines, fitted values, residual values, sums of squares, relation to the analysis of variance.
1 Review of Correlation A correlation coefficient measures the strength of a linear relation between two measurement variables. The measure is based on.
So far, we have considered regression models with dummy variables of independent variables. In this lecture, we will study regression models whose dependent.
1 Michigan.do. 2. * construct new variables;. gen mi=state==26;. * michigan dummy;. gen hike=month>=33;. * treatment period dummy;. gen treatment=hike*mi;
Regression Forced March Spring Regression quantifies how one variable can be described in terms of another.
Sociology 601 Class 23: November 17, 2009 Homework #8 Review –spurious, intervening, & interactions effects –stata regression commands & output F-tests.
A trial of incentives to attend adult literacy classes Carole Torgerson, Greg Brooks, Jeremy Miles, David Torgerson Classes randomised to incentive or.
Interpreting Bi-variate OLS Regression
1 Zinc Data EPP 245 Statistical Analysis of Laboratory Data.
1 Regression and Calibration EPP 245 Statistical Analysis of Laboratory Data.
Sociology 601 Class 26: December 1, 2009 (partial) Review –curvilinear regression results –cubic polynomial Interaction effects –example: earnings on married.
TESTING A HYPOTHESIS RELATING TO A REGRESSION COEFFICIENT This sequence describes the testing of a hypotheses relating to regression coefficients. It is.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 4) Slideshow: semilogarithmic models Original citation: Dougherty, C. (2012) EC220.
EDUC 200C Section 4 – Review Melissa Kemmerle October 19, 2012.
DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES This sequence explains how to extend the dummy variable technique to handle a qualitative explanatory.
1 INTERACTIVE EXPLANATORY VARIABLES The model shown above is linear in parameters and it may be fitted using straightforward OLS, provided that the regression.
Bivariate Relationships Testing associations (not causation!) Continuous data  Scatter plot (always use first!)  (Pearson) correlation coefficient.
1 TWO SETS OF DUMMY VARIABLES The explanatory variables in a regression model may include multiple sets of dummy variables. This sequence provides an example.
Confidence intervals were treated at length in the Review chapter and their application to regression analysis presents no problems. We will not repeat.
Returning to Consumption
Country Gini IndexCountryGini IndexCountryGini IndexCountryGini Index Albania28.2Georgia40.4Mozambique39.6Turkey38 Algeria35.3Germany28.3Nepal47.2Turkmenistan40.8.
1 Estimation of constant-CV regression models Alan H. Feiveson NASA – Johnson Space Center Houston, TX SNASUG 2008 Chicago, IL.
How do Lawyers Set fees?. Learning Objectives 1.Model i.e. “Story” or question 2.Multiple regression review 3.Omitted variables (our first failure of.
MultiCollinearity. The Nature of the Problem OLS requires that the explanatory variables are independent of error term But they may not always be independent.
EDUC 200C Section 3 October 12, Goals Review correlation prediction formula Calculate z y ’ = r xy z x for a new data set Use formula to predict.
F TEST OF GOODNESS OF FIT FOR THE WHOLE EQUATION 1 This sequence describes two F tests of goodness of fit in a multiple regression model. The first relates.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 1) Slideshow: exercise 1.5 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Biostat 200 Lecture Simple linear regression Population regression equationμ y|x = α +  x α and  are constants and are called the coefficients.
. reg LGEARN S WEIGHT85 Source | SS df MS Number of obs = F( 2, 537) = Model |
Christopher Dougherty EC220 - Introduction to econometrics (chapter 5) Slideshow: exercise 5.2 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Panel Data. Assembling the Data insheet using marriage-data.csv, c d u "background-data", clear d u "experience-data", clear u "wage-data", clear d reshape.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 4) Slideshow: exercise 4.5 Original citation: Dougherty, C. (2012) EC220 - Introduction.
Special topics. Importance of a variable Death penalty example. sum death bd- yv Variable | Obs Mean Std. Dev. Min Max
Lecture 5. Linear Models for Correlated Data: Inference.
Christopher Dougherty EC220 - Introduction to econometrics (chapter 6) Slideshow: exercise 6.13 Original citation: Dougherty, C. (2012) EC220 - Introduction.
STAT E100 Section Week 12- Regression. Course Review - Project due Dec 17 th, your TA. - Exam 2 make-up is Dec 5 th, practice tests have been updated.
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION 1 Ramsey’s RESET test of functional misspecification is intended to provide a simple indicator of evidence.
1 CHANGES IN THE UNITS OF MEASUREMENT Suppose that the units of measurement of Y or X are changed. How will this affect the regression results? Intuitively,
SEMILOGARITHMIC MODELS 1 This sequence introduces the semilogarithmic model and shows how it may be applied to an earnings function. The dependent variable.
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL The output above shows the result of regressing EARNINGS, hourly earnings in dollars, on S, years.
1 BINARY CHOICE MODELS: LINEAR PROBABILITY MODEL Economists are often interested in the factors behind the decision-making of individuals or enterprises,
1 REPARAMETERIZATION OF A MODEL AND t TEST OF A LINEAR RESTRICTION Linear restrictions can also be tested using a t test. This involves the reparameterization.
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES 1 We now come to more general F tests of goodness of fit. This is a test of the joint explanatory power.
WHITE TEST FOR HETEROSCEDASTICITY 1 The White test for heteroscedasticity looks for evidence of an association between the variance of the disturbance.
1 COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS When alternative specifications of a regression model have the same dependent variable, R 2 can be used.
VARIABLE MISSPECIFICATION II: INCLUSION OF AN IRRELEVANT VARIABLE In this sequence we will investigate the consequences of including an irrelevant variable.
QM222 Class 16 & 17 Today’s New topic: Estimating nonlinear relationships QM222 Fall 2017 Section A1.
QM222 Class 11 Section A1 Multiple Regression
QM222 Class 8 Section A1 Using categorical data in regression
The slope, explained variance, residuals
QM222 Class 15 Section D1 Review for test Multicollinearity
EPP 245 Statistical Analysis of Laboratory Data
Introduction to Econometrics, 5th edition
Introduction to Econometrics, 5th edition
Presentation transcript:

More on Regression Spring 2007

The Linear Relationship between African American Population & Black Legislators

The Linear and Curvilinear Relationship between African American Population & Black Legislators Stata command: qfit. E.g., scatter beo pop || qfit beo pop

About the Functional Form Linear in the variables vs. linear in the parameters  Y = a + bX + e (linear in both)  Y = a + bX + cX 2 + e (linear in parms.)  Y = a + X b + e (linear in variables)  Y = a + lnX b /Z c + e (linear in neither)

Log transformations Y = a + bX + eb = dY/dX, or b = the unit change in Y given a unit change in X Typical case Y = a + b lnX + eb = dY/(dX/X), or b = the unit change in Y given a % change in X Cases where there’s a natural limit on growth ln Y = a + bX + eb = (dY/Y)/dX, or b = the % change in Y given a unit change in X Exponential growth ln Y = a + b ln X + eb = (dY/Y)/(dX/X), or b = the % change in Y given a % change in X (elasticity) Economic production

How “good” is the fitted line? Goodness-of-fit is not necessarily theoretically relevant Focus on  Substantive interpretation of coefficients (most important)  Statistical significance of coefficients (less important) Standard error of a coefficient t-statistic: coeff./s.e. Nevertheless, you should know about  Standard Error of the Estimate (s.e.e.) Also called Standard Error of the Regression Regrettably called Root Mean Squared Error (Rout MSE) in Stata  R-squared

Standard error of the regression picture YiYi YiYi ^ εiεi Y i -Y i ^ Add these up after squaring

Standard error of the estimate d.f. = n-2

(Y i -Y i ) ^ R 2 picture Y _ (Y i -Y) ^ 0 10 beo bpop beo Fitted values

Y _ (Y i -Y) (Y i -Y i ) ^ (Y i -Y) ^ _ _ 0 10

R-squared Also called “coefficient of determination”

Return to Black Elected Officials Example. reg beo bpop Source | SS df MS Number of obs = F( 1, 39) = Model | Prob > F = Residual | R-squared = Adj R-squared = Total | Root MSE = beo | Coef. Std. Err. t P>|t| [95% Conf. Interval] bpop | _cons |

Residuals e i = Y i – B 0 – B 1 X i

One important numerical property of residuals The sum of the residuals is zero.

Regression Commands in STATA reg depvar expvars predict newvar predict newvar, resid

Some Regressions

Temperature and Latitude

. reg jantemp latitude Source | SS df MS Number of obs = F( 1, 18) = Model | Prob > F = Residual | R-squared = Adj R-squared = Total | Root MSE = jantemp | Coef. Std. Err. t P>|t| [95% Conf. Interval] latitude | _cons | predict py (option xb assumed; fitted values). predict ry,resid

gsort -ry. list city jantemp py ry | city jantemp py ry | | | 1. | PortlandOR | 2. | SanFranciscoCA | 3. | LosAngelesCA | 4. | PhoenixAZ | 5. | NewYorkNY | | | 6. | MiamiFL | 7. | BostonMA | 8. | NorfolkVA | 9. | BaltimoreMD | 10. | SyracuseNY | | | 11. | MobileAL | 12. | WashingtonDC | 13. | MemphisTN | 14. | ClevelandOH | 15. | DallasTX | | | 16. | HoustonTX | 17. | KansasCityMO | 18. | PittsburghPA | 19. | MinneapolisMN | 20. | DuluthMN |

Bush Vote and Southern Baptists

. reg bush sbc_mpct Source | SS df MS Number of obs = F( 1, 48) = Model | Prob > F = Residual | R-squared = Adj R-squared = Total | Root MSE = bush | Coef. Std. Err. t P>|t| [95% Conf. Interval] sbc_mpct | _cons |

Weight by State Population. reg bush sbc_mpct [aw=votes] (sum of wgt is e+08) Source | SS df MS Number of obs = F( 1, 48) = Model | Prob > F = Residual | R-squared = Adj R-squared = Total | Root MSE = bush | Coef. Std. Err. t P>|t| [95% Conf. Interval] sbc_mpct | _cons |

Midterm loss & pres. popularity

. reg loss gallup Source | SS df MS Number of obs = F( 1, 15) = 5.70 Model | Prob > F = Residual | R-squared = Adj R-squared = Total | Root MSE = loss | Coef. Std. Err. t P>|t| [95% Conf. Interval] gallup | _cons |

. reg loss gallup if year>1948 Source | SS df MS Number of obs = F( 1, 12) = Model | Prob > F = Residual | R-squared = Adj R-squared = Total | Root MSE = loss | Coef. Std. Err. t P>|t| [95% Conf. Interval] gallup | _cons |