Interpreting Bi-variate OLS Regression
Stata Regression Output
Regression plots and RSS
R2 -- Coefficient of Determination
Adjusted R2
Sample Covariance/Correlation
Hypothesis Testing
Standard Errors
T-tests and P-values
Data

Use the “caschool.dat” file
Data description: CaliforniaTestScores.pdf
Build a Stata do-file as you go
Model: Test score = f(student/teacher ratio)
Stata Regression Model:
Regressing Test Score on Student/Teacher Ratio

First, examine each variable's distribution:

histogram str, percent normal
histogram testscr, percent normal
Regression Output

regress testscr str

      Source |       SS       df       MS        Number of obs =     420
-------------+---------------------------        F(  1,   418) =
       Model |                                   Prob > F      =
    Residual |                                   R-squared     =
-------------+---------------------------        Adj R-squared =
       Total |                                   Root MSE      =

-------------------------------------------------------------------------
     testscr |      Coef.   Std. Err.      t    P>|t|                Beta
-------------+-----------------------------------------------------------
         str |      -2.28       .4798   -4.75
       _cons |                  9.467
-------------------------------------------------------------------------
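The arithmetic behind a bivariate regress command can be sketched in a few lines. This is a Python illustration (not Stata), using a small made-up sample in the spirit of testscr vs. str, not the actual caschool data:

```python
# Bivariate OLS by hand:
#   b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
#   b0 = ybar - b1 * xbar
def ols_bivariate(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical toy data: test scores fall as the student/teacher ratio rises.
str_x = [14.0, 16.0, 18.0, 20.0, 22.0]
testscr_y = [680.0, 670.0, 665.0, 655.0, 650.0]
b0, b1 = ols_bivariate(str_x, testscr_y)
print(round(b0, 2), round(b1, 2))   # prints: 731.5 -3.75
```

The slope is the same "covariance over variance" ratio that appears on the covariance slide below.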
Regression Descriptive Statistics

cor testscr str, means

Variable |       Mean   Std. Dev.        Min        Max
---------+---------------------------------------------
 testscr |
     str |

         |  testscr      str
---------+------------------
 testscr |
     str |
Regression Plot

twoway (scatter testscr str) (lfitci testscr str)
Measuring “Goodness of Fit”

Root Mean Squared Error (“Root MSE”): measures the spread of the observations around the regression line.
Coefficient of Determination (R2): the ratio of the “model” (explained) sum of squares to the “total” sum of squares: R2 = ESS/TSS = 1 - RSS/TSS.
Explaining R2

For each observation Yi, variation around the mean can be decomposed into that which is “explained” by the regression and that which is not:

(Yi - Ybar) = (Yhat_i - Ybar) + (Yi - Yhat_i)
all deviation = explained deviation + unexplained deviation

Book terminology:
TSS = sum of (all)2
ESS = sum of (explained)2
RSS = sum of (unexplained)2

Stata terminology:
Total = sum of (all)2
Model = sum of (explained)2
Residual = sum of (unexplained)2
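The decomposition above can be verified numerically. A Python sketch (the data and fitted line are made up, with the slope and intercept from an OLS fit of these toy points, not the caschool estimates):

```python
# Split total variation around ybar into explained + unexplained parts.
def anova_decomposition(x, y, b0, b1):
    ybar = sum(y) / len(y)
    yhat = [b0 + b1 * xi for xi in x]
    tss = sum((yi - ybar) ** 2 for yi in y)               # Total: (all)^2
    ess = sum((yh - ybar) ** 2 for yh in yhat)            # Model: (explained)^2
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # Residual: (unexplained)^2
    return tss, ess, rss

x = [14.0, 16.0, 18.0, 20.0, 22.0]
y = [680.0, 670.0, 665.0, 655.0, 650.0]
# b0 = 731.5, b1 = -3.75 is the OLS fit for these toy points.
tss, ess, rss = anova_decomposition(x, y, 731.5, -3.75)
r2 = ess / tss
print(round(tss, 2), round(ess + rss, 2), round(r2, 3))   # prints: 570.0 570.0 0.987
```

For an OLS fit, TSS = ESS + RSS holds exactly, so R2 = ESS/TSS and 1 - RSS/TSS agree.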
Sample Covariance & Correlation

Sample covariance for a bivariate model is defined as:

cov(X,Y) = sum of (Xi - Xbar)(Yi - Ybar) / (n - 1)

Sample correlations (r) “standardize” covariance by dividing by the product of the X and Y standard deviations:

r = cov(X,Y) / (sX * sY)

Sample correlations range from -1 (a perfect negative relationship) to +1 (a perfect positive relationship).
Standardized Regression Coefficients (aka “Beta Weights” or “Betas”)

Formula: beta = b1 * (sX / sY)

In our example, this is the value Stata reports in the Beta column of the regress output above.

Interpretation: the number of standard deviations of change in Y one should expect from a one-standard-deviation change in X.
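A Python sketch of the formula, using the slope and standard deviations from the toy data above (hypothetical numbers, not the caschool estimates):

```python
import math

# Standardized ("beta weight") coefficient: rescale the slope by sd_x / sd_y.
def beta_weight(b1, sd_x, sd_y):
    return b1 * sd_x / sd_y

b1 = -3.75                                   # unstandardized slope (toy fit)
sd_x = math.sqrt(10.0)                       # sample sd of x (toy data)
sd_y = math.sqrt(142.5)                      # sample sd of y (toy data)
print(round(beta_weight(b1, sd_x, sd_y), 3))   # prints: -0.993
```

In the bivariate case the beta weight equals the sample correlation r, which is one way to check the computation.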
Hypothesis Tests for Regression Coefficients

For our model: Yi = b0 + b1*Xi + ei

Another sample of 420 observations would lead to different estimates for b0 and b1. If we drew many such samples, we'd get the sampling distribution of the estimates. Because we usually can't observe that distribution, we estimate it from our sample's size and variance.
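The "many samples" idea can be simulated. A Python sketch: draw repeated samples from a known population model, refit the slope each time, and look at where the estimates land. The population values here (true slope -2.0, error sd 10) are made up for illustration:

```python
import random

# Refit just the slope for each simulated sample.
def fit_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

random.seed(1)
slopes = []
for _ in range(2000):
    # Hypothetical population: y = 700 - 2.0*x + e, e ~ Normal(0, 10).
    x = [random.uniform(14, 26) for _ in range(100)]
    y = [700 - 2.0 * xi + random.gauss(0, 10) for xi in x]
    slopes.append(fit_slope(x, y))

mean_b1 = sum(slopes) / len(slopes)
print(round(mean_b1, 2))   # centers near the true slope, -2.0
```

The spread of these 2000 slopes is the sampling distribution that a standard error is trying to estimate from a single sample.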
To do that, we calculate the standard error of b (bivariate case only):

SEb1 = sqrt( [sum of ei^2 / (n - 2)] / sum of (Xi - Xbar)^2 )
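A Python sketch of that formula, using the toy data and toy OLS fit from above (this is the classical homoskedasticity-only standard error, not Stata's robust option):

```python
import math

# SE of the bivariate slope: sqrt( (RSS / (n - 2)) / sum((x - xbar)^2) ).
def se_b1(x, y, b0, b1):
    n = len(x)
    xbar = sum(x) / n
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return math.sqrt((rss / (n - 2)) / sxx)

x = [14.0, 16.0, 18.0, 20.0, 22.0]
y = [680.0, 670.0, 665.0, 655.0, 650.0]
print(round(se_b1(x, y, 731.5, -3.75), 3))   # prints: 0.25
```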
Interpreting Standard Errors

Assuming that we estimated the standard errors correctly, we can identify how many standard errors our estimate falls from zero. For our model:
b0: SEb0 = 9.467
b1 = -2.28, SEb1 = .4798
The t-test reports the number of standard errors our estimate falls away from zero. Thus the t for b1 is -4.75 for our model (-2.28/.4798, with rounding).
Estimated sampling distribution for b1:
b1 = -2.28 (which is 4.75 SEb1 “units” away from zero)
b1 - SEb1 = -2.76    b1 + SEb1 = -1.80
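The t-statistic is just the ratio of the estimate to its standard error. A one-line Python check using the estimates quoted on this slide:

```python
# t-statistic for H0: b1 = 0, using the slide's caschool estimates.
b1 = -2.28
se = 0.4798
t = b1 / se
print(round(t, 2))   # prints: -4.75
```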
Classical Hypothesis Testing

Assume that b1 is zero (the null hypothesis). What is the probability that your sample would have resulted in an estimate for b1 that is 4.75 SEb1's away from zero? To find out, determine the cumulative density of the estimated sampling distribution that falls more than 4.75 SEb1's away from zero.
See Table 2, page 757, in Stock & Watson. It reports discrete “p-values”, given the sample size and t-values. Note the distinction between one- and two-sided tests.
In general, if the t-stat is above 2 in absolute value, the p-value will be below .05, which is the acceptable upper limit in a classical hypothesis test. Note: in Stata-speak, a p-value is a “P>|t|”.
Null hypothesis: b1 = 0.0
Working hypothesis: estimated b1 = -2.28
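Instead of the printed table, the two-sided p-value can be approximated in Python. With n = 420 the t distribution is close to the standard normal, so a normal approximation is reasonable here:

```python
import math

# Two-sided p-value under a standard-normal approximation:
# p = 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
def p_two_sided(t):
    return math.erfc(abs(t) / math.sqrt(2.0))

print(p_two_sided(4.75) < 0.05)   # True: far below the .05 cutoff, reject H0
print(p_two_sided(1.50) < 0.05)   # False: would not reject at .05
```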
Coming up...

For Next Week:
Use the caschool.dta dataset
Run a model in Stata using Average Income (avginc) to predict Average Test Scores (testscr)
Examine the univariate distributions of both variables and the residuals
Walk through the entire interpretation
Build a Stata do-file as you go
Read Chapter 8 of Stock & Watson