Outline
- Least squares method
- Estimation: least squares
- Interpretation of estimators
- Properties of OLS estimators
- Variance of Y, b, and a
- Hypothesis tests of b and a
- ANOVA table
- Goodness-of-fit and R²
Linear Regression Model
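For reference, the simple linear regression model being estimated here, in the notation of the later slides, is Y = α + βX + ε, where α is the intercept, β is the slope, and ε is a random error with mean 0 and constant variance σ².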
Terminology
- Dependent variable (DV) = response variable = left-hand side (LHS) variable
- Independent variables (IV) = explanatory variables = right-hand side (RHS) variables = regressors (excluding a, or b0)
- a (b0) is an estimator of the parameter α (β0)
- b (b1) is an estimator of the parameter β (β1)
- a and b are the intercept and slope, respectively
Least Squares Method
- How do we draw such a line based on the observed data points?
- Suppose an imaginary line y = a + bx
- Imagine the vertical distance (error) between a data point and the line: e = Y - (a + bX)
- This error (gap) is the deviation of the data point from the imaginary line, the regression line
- What are the best values of a and b? The a and b that minimize the sum of such errors (deviations of individual data points from the line)
Least Squares Method
Least Squares Method
- Raw deviations do not have good properties for computation (positive and negative deviations cancel out)
- This is why we use squared deviations, as in the variance
- Let us get the a and b that minimize the sum of squared deviations rather than the sum of deviations
- This method is called least squares
Least Squares Method
- The least squares method minimizes the sum of squared errors (deviations of individual data points from the regression line)
- Such a and b are called least squares estimators (estimators of the parameters α and β)
- The process of obtaining parameter estimators (e.g., a and b) is called estimation
- "Regress Y on X"
- The least squares method here is the estimation method of ordinary least squares (OLS)
Ordinary Least Squares
- Ordinary least squares (OLS) = linear regression model = classical linear regression model
- Linear relationship between Y and the Xs
- Constant slopes (coefficients of the Xs)
- Least squares estimation method
- Xs are fixed; Y is conditional on the Xs
- Errors are not related to the Xs
- Constant variance of errors
Least Squares Method 1
- How do we get the a and b that minimize the sum of squared errors?
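The quantity being minimized is the sum of squared errors, viewed as a function of a and b:
SSE(a, b) = ∑(yi - a - b·xi)²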
Least Squares Method 2
- Linear algebraic solution
- Compute a and b so that the partial derivatives of the sum of squared errors with respect to a and b are equal to zero
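Setting the two partial derivatives of SSE(a, b) to zero gives the normal equations:
∂SSE/∂a = -2·∑(yi - a - b·xi) = 0
∂SSE/∂b = -2·∑xi·(yi - a - b·xi) = 0
The first equation yields a = Ybar - b·Xbar.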
Least Squares Method 3
- Take the partial derivative with respect to b and plug in the a you got, a = Ybar - b·Xbar
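Substituting a = Ybar - b·Xbar into the second normal equation and solving for b gives the deviation form of the slope estimator:
b = ∑(x - Xbar)(y - Ybar) / ∑(x - Xbar)² = SSxy / SSx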
Least Squares Method 4
- The least squares method is an algebraic solution that minimizes the sum of squared errors (the variance component of the error)
- The alternative computational formula based on raw sums (∑xy, ∑x²; see the "NO!" example below) is not recommended
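For reference, the raw-sum (computational) form referred to here is:
b = [n·∑xy - (∑x)(∑y)] / [n·∑x² - (∑x)²], with a = Ybar - b·Xbar
It is algebraically identical to the deviation form used in Example 10-5 (1); with the sums from the "NO!" table, b = 3249/3369 = .9644.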
OLS: Example 10-5 (1)

No | x  | y   | x-xbar | y-ybar | (x-xbar)(y-ybar) | (x-xbar)²
 1 | 43 | 128 | -14.5  |  -8.5  | 123.25 | 210.25
 2 | 48 | 120 |  -9.5  | -16.5  | 156.75 |  90.25
 3 | 56 | 135 |  -1.5  |  -1.5  |   2.25 |   2.25
 4 | 61 | 143 |   3.5  |   6.5  |  22.75 |  12.25
 5 | 67 | 141 |   9.5  |   4.5  |  42.75 |  90.25
 6 | 70 | 152 |  12.5  |  15.5  | 193.75 | 156.25
Mean | 57.5 | 136.5 |  |  |  |
Sum  | 345  | 819   |  |  | 541.5 | 561.5
OLS: Example 10-5 (2), NO!

No | x  | y   | xy    | x²
 1 | 43 | 128 |  5504 | 1849
 2 | 48 | 120 |  5760 | 2304
 3 | 56 | 135 |  7560 | 3136
 4 | 61 | 143 |  8723 | 3721
 5 | 67 | 141 |  9447 | 4489
 6 | 70 | 152 | 10640 | 4900
Mean | 57.5 | 136.5 |  |
Sum  | 345  | 819   | 47634 | 20399
OLS: Example 10-5 (3)
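Using the deviation sums from Example 10-5 (1): b = 541.5/561.5 = .9644 and a = 136.5 - .9644 × 57.5 = 81.0481, so the fitted line is yhat = 81.05 + .96x.

To make the arithmetic concrete, here is a minimal Python sketch of the same computation (numpy assumed; variable names are illustrative, not from the slides):

```python
import numpy as np

# Example 10-5 data from the slides
x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)

# Deviation sums: SSxy = sum((x - xbar)(y - ybar)), SSx = sum((x - xbar)^2)
ssxy = np.sum((x - x.mean()) * (y - y.mean()))   # 541.5
ssx = np.sum((x - x.mean()) ** 2)                # 561.5

b = ssxy / ssx                  # slope: 541.5 / 561.5 = 0.9644
a = y.mean() - b * x.mean()     # intercept: 136.5 - 0.9644 * 57.5 = 81.0481
print(f"b = {b:.4f}, a = {a:.4f}")
```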
What Are a and b?
- a is an estimator of its parameter α
- a is the intercept, the value of y where the regression line meets the y axis
- b is an estimator of its parameter β
- b is the slope of the regression line
- b is constant regardless of the values of X
- b is more important than a, since the slope is what researchers want to know
How to interpret b?
- For a unit increase in x, the expected change in y is b, holding other things (variables) constant
- For a unit increase in x, we expect y to increase by b, holding other things (variables) constant
- In the example: for a unit increase in x, we expect y to increase by .964, holding other variables constant
Properties of OLS Estimators
- The outcome of the least squares method is the OLS parameter estimators a and b
- OLS estimators are linear
- OLS estimators are unbiased (correct on average: E(a) = α, E(b) = β)
- OLS estimators are efficient (small variance)
- Gauss-Markov Theorem: among linear unbiased estimators, the least squares (OLS) estimator has minimum variance
- BLUE (best linear unbiased estimator)
Hypothesis Test of a and b
- How reliable are the a and b we compute?
- A t-test (a Wald test in general) can answer this
- The test statistic is the standardized effect size (effect size / standard error)
- The effect size is a - 0 and b - 0, assuming 0 is the hypothesized value: H0: α = 0, H0: β = 0
- Degrees of freedom are N - K, where K is the number of regressors + 1 (i.e., the number of estimated parameters)
- How do we compute the standard error (standard deviation)?
Variance of b (1)
- b is a random variable that changes across samples
- b is a weighted sum (a linear combination) of the random variables Yi
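Written out (a standard result matching the slide's description), the slope estimator is:
b = ∑(xi - Xbar)(Yi - Ybar)/SSx = ∑ki·Yi, where ki = (xi - Xbar)/SSx
The weights ki involve only the fixed xi, so b is a linear combination of the Yi.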
Variance of b (2)
- The variance of Y (of the error) is σ²
- Var(kY) = k²·Var(Y) = k²·σ²
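Applying this rule to b = ∑ki·Yi, with the Yi independent:
Var(b) = ∑ki²·σ² = σ²·∑(xi - Xbar)²/SSx² = σ²/SSx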
Variance of a
- a = Ybar - b·Xbar
- Var(b) = σ²/SSx, where SSx = ∑(X - Xbar)²
- Var(∑Y) = Var(Y1) + Var(Y2) + … + Var(Yn) = nσ²
- Now, how do we compute the variance of Y, σ²?
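Putting these pieces together (and using the fact that Ybar and b are uncorrelated), the standard result for the variance of the intercept is:
Var(a) = Var(Ybar - b·Xbar) = σ²/n + Xbar²·σ²/SSx = σ²·(1/n + Xbar²/SSx)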
Variance of Y or Error
- The estimated variance of Y is based on the residuals (errors), Y - Yhat
- "Hat" means an estimator of the parameter; Yhat is the predicted value of Y (a + bX): plug in x, given a and b, to get Yhat
- Since a regression model estimates K parameters (a and b in simple regression), the degrees of freedom are N - K
- The numerator is SSE in the ANOVA table
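In symbols, the estimator of σ² is the mean squared error:
σ̂² = MSE = SSE/(N - K) = ∑(y - yhat)²/(N - K)
In the example that follows, this is 127.2876/4 = 31.8219.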
Illustration (1)

No | x  | y   | x-xbar | y-ybar | (x-xbar)(y-ybar) | (x-xbar)² | yhat   | (y-yhat)²
 1 | 43 | 128 | -14.5  |  -8.5  | 123.25 | 210.25 | 122.52 | 30.07
 2 | 48 | 120 |  -9.5  | -16.5  | 156.75 |  90.25 | 127.34 | 53.85
 3 | 56 | 135 |  -1.5  |  -1.5  |   2.25 |   2.25 | 135.05 |  0.00
 4 | 61 | 143 |   3.5  |   6.5  |  22.75 |  12.25 | 139.88 |  9.76
 5 | 67 | 141 |   9.5  |   4.5  |  42.75 |  90.25 | 145.66 | 21.73
 6 | 70 | 152 |  12.5  |  15.5  | 193.75 | 156.25 | 148.55 | 11.87
Mean | 57.5 | 136.5 |  |  |  |  |  |
Sum  | 345  | 819   |  |  | 541.5 | 561.5 |  | 127.2876

SSE = 127.2876, MSE = SSE/(N-K) = 127.2876/4 = 31.8219
Illustration (2): Test b
- How do we test whether β is zero (no effect)?
- Like Y, the estimators a and b are normally distributed; with σ² estimated, their standardized statistics follow the t distribution
- b = .9644, SE(b) = .2381, df = N - K = 6 - 2 = 4
- Hypothesis testing:
  1. H0: β = 0 (no effect), Ha: β ≠ 0 (two-tailed)
  2. Significance level = .05, CV = 2.776, df = 6 - 2 = 4
  3. TS = (.9644 - 0)/.2381 = 4.0510 ~ t(N - K)
  4. TS (4.051) > CV (2.776), reject H0
  5. β (not b) is not zero; there is a significant effect of X on Y
Illustration (3): Test a
- How do we test whether α is zero?
- As before, the standardized statistic for a follows the t distribution
- a = 81.0481, SE(a) = 13.8809, df = N - K = 6 - 2 = 4
- Hypothesis testing:
  1. H0: α = 0, Ha: α ≠ 0 (two-tailed)
  2. Significance level = .05, CV = 2.776
  3. TS = (81.0481 - 0)/13.8809 = 5.8388 ~ t(N - K)
  4. TS (5.839) > CV (2.776), reject H0
  5. α (not a) is not zero; the intercept is discernible from zero (significant intercept)
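A minimal Python sketch that reproduces these standard errors and t statistics from the example data (numpy and scipy assumed; variable names are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)
n, k = len(x), 2                      # K = number of estimated parameters (a and b)

ssx = np.sum((x - x.mean()) ** 2)
b = np.sum((x - x.mean()) * (y - y.mean())) / ssx
a = y.mean() - b * x.mean()

resid = y - (a + b * x)
mse = np.sum(resid ** 2) / (n - k)    # SSE / (N - K) = 127.2876 / 4 = 31.8219

se_b = np.sqrt(mse / ssx)                               # 0.2381
se_a = np.sqrt(mse * (1 / n + x.mean() ** 2 / ssx))     # 13.8809

t_b, t_a = b / se_b, a / se_a                           # 4.051, 5.839
p_b = 2 * stats.t.sf(abs(t_b), df=n - k)                # two-tailed p-values
p_a = 2 * stats.t.sf(abs(t_a), df=n - k)
print(f"b: t = {t_b:.3f}, p = {p_b:.4f}")
print(f"a: t = {t_a:.3f}, p = {p_a:.4f}")
```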
Questions
- How do we test H0: β1 = β2 = … = 0 (all slope coefficients are zero)?
- Remember that a t-test compares only two group means, while ANOVA compares more than two group means simultaneously; the same logic applies in linear regression
- Construct the ANOVA table by partitioning the variance of Y; the F test examines the above H0
- The ANOVA table provides key information about a regression model
Partitioning Variance of Y (1)
Partitioning Variance of Y (2)
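The identity developed on these two slides, in the notation of the next table, is:
∑(y - ybar)² = ∑(yhat - ybar)² + ∑(y - yhat)²
i.e., SST = SSM + SSE (total = model + error).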
Partitioning Variance of Y (3)

Fitted line: yhat = 81 + .96x

No | x  | y   | yhat   | (y-ybar)² | (yhat-ybar)² | (y-yhat)²
 1 | 43 | 128 | 122.52 |  72.25 | 195.54 | 30.07
 2 | 48 | 120 | 127.34 | 272.25 |  83.94 | 53.85
 3 | 56 | 135 | 135.05 |   2.25 |   2.09 |  0.00
 4 | 61 | 143 | 139.88 |  42.25 |  11.39 |  9.76
 5 | 67 | 141 | 145.66 |  20.25 |  83.94 | 21.73
 6 | 70 | 152 | 148.55 | 240.25 | 145.32 | 11.87
Mean | 57.5 | 136.5 |  | SST | SSM | SSE
Sum  | 345  | 819   |  | 649.5000 | 522.2124 | 127.2876

e.g., 122.52 = 81.0481 + .9644×43 and 148.55 = 81.0481 + .9644×70
SST = SSM + SSE: 649.5 = 522.2 + 127.3
ANOVA Table
- H0: all slope coefficients are zero (here, β1 = 0)
- Ha: at least one coefficient is not zero
- CV is 12.22 at df (1, 4); TS > CV, reject H0

Source   | Sum of Squares | DF  | Mean Square     | F
Model    | SSM            | K-1 | MSM = SSM/(K-1) | MSM/MSE
Residual | SSE            | N-K | MSE = SSE/(N-K) |
Total    | SST            | N-1 |                 |

Source   | Sum of Squares | DF | Mean Square | F
Model    | 522.2124       | 1  | 522.2124    | 16.41047
Residual | 127.2876       | 4  | 31.8219     |
Total    | 649.5000       | 5  |             |
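A short Python check of the F statistic and its p-value for this table (numpy and scipy assumed; variable names are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([43, 48, 56, 61, 67, 70], dtype=float)
y = np.array([128, 120, 135, 143, 141, 152], dtype=float)
n, k = len(x), 2

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
yhat = a + b * x

ssm = np.sum((yhat - y.mean()) ** 2)          # 522.2124
sse = np.sum((y - yhat) ** 2)                 # 127.2876

f_stat = (ssm / (k - 1)) / (sse / (n - k))    # MSM / MSE = 16.41
p_value = stats.f.sf(f_stat, k - 1, n - k)    # between .01 and .02 here, so reject H0 at .05
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```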
R² and Goodness-of-Fit
- Goodness-of-fit measures evaluate how well a regression model fits the data
- The smaller the SSE, the better the model fits
- The F test examines whether all slope coefficients are zero (a large F and a small p-value indicate good fit)
- R² (coefficient of determination) is SSM/SST; it measures how much of the overall variance of Y the model explains
- R² = SSM/SST = 522.2/649.5 = .80
- A large R² means the model fits the data well
Myths and Misunderstandings about R²
- In simple regression, R² is the Pearson correlation coefficient squared: r² = .8967² = .80
- If a regression model includes many regressors, R² is less useful, if not useless
- Adding any regressor always increases R², regardless of the relevance of the regressor
- Adjusted R² imposes a penalty for adding regressors: Adj. R² = 1 - [(N-1)/(N-K)](1 - R²)
- R² is not a panacea, although its interpretation is intuitive; if the intercept is omitted, the usual R² is incorrect
- Check the specification, F, SSE, and individual parameter estimates to evaluate your model; a model with a smaller R² can be better in some cases
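Applying the adjustment formula to this example (N = 6, K = 2, R² = .804):
Adj. R² = 1 - (5/4)(1 - .804) = 1 - 1.25 × .196 = .755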
Interpolation and Extrapolation
- Confidence interval for E(Y|X), where x is within the range of the data: interpolation
- Confidence interval for Y|X, where x is beyond the range of the data: extrapolation
- Extrapolation involves a penalty and a danger: the confidence interval widens, and the result is less reliable
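The widening comes from the interval formula itself; a standard result (not shown on the slide) for the mean response at a new point x0 is:
yhat0 ± t(α/2, N-K) · sqrt(MSE · (1/n + (x0 - Xbar)²/SSx))
The further x0 is from Xbar, the larger the (x0 - Xbar)² term and the wider the interval; for an individual prediction (Y|X), an extra "1 +" is added inside the square root, widening it further.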