Class 22: Understanding Regression (EMBS, part of 12.7; Pfeifer regression note, Sections 1-3 and 7)
What is the regression line? It is a line drawn through a cloud of points: the line that minimizes the sum of squared errors.
– Errors are also known as residuals.
– Error = Actual - Predicted.
– Error is the vertical distance from the point (actual) to the line (predicted).
– Points above the line are positive errors.
The average of the errors will always be zero.
The regression line will always "go through" (average X, average Y).
Error aka residual. Predicted aka fitted.
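These two facts can be checked numerically. A minimal sketch with made-up data (NumPy's polyfit does the least-squares fit):

```python
# Sketch (data are made up for illustration): fit a least-squares line
# and verify two facts from the slide.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 3.7, 5.2, 5.9])

b, a = np.polyfit(x, y, 1)          # slope b, intercept a
residuals = y - (a + b * x)         # error = actual - predicted

print(residuals.sum())              # the errors always sum (average) to zero
print(a + b * x.mean(), y.mean())   # the line passes through (average X, average Y)
```

Any least-squares line with an intercept has both properties, whatever the data.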
Can you draw the regression line?
Which is the regression line? [Chart: a cloud of points with six candidate lines, labeled A through F.]
Answer: D
[Chart: points (1,1), (2,7), (3,1); the line Y = 3, with fitted values (1,3), (2,3), (3,3).]
Error at (2,7): 7 - 3 = 4. Error at (1,1) and at (3,1): 1 - 3 = -2.
Sum of errors is 0!
SSE = (-2)^2 + 4^2 + (-2)^2 = 24 is smaller than from any other line.
The line goes through (2,3), the average.
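A quick check of the slide's arithmetic, using the three points shown; it turns out the flat line Y = 3 is exactly the least-squares line here:

```python
# Sketch: verify the slide's numbers for the points (1,1), (2,7), (3,1).
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 7.0, 1.0])

b, a = np.polyfit(x, y, 1)       # least-squares slope and intercept
print(b, a)                      # slope 0, intercept 3: the line Y = 3

errors = y - (a + b * x)
print(errors)                    # [-2, 4, -2], which sum to 0
print((errors ** 2).sum())       # SSE = 24
```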
Draw in the regression line…
Two points determine a line… and regression can give you the equation. [Chart: Degrees C (X) vs. Degrees F (Y).]
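As a sketch of this idea, fitting a regression line to just two (Celsius, Fahrenheit) points recovers the exact conversion formula; the two points below are the standard freezing/boiling pairs, not data from the note:

```python
# Sketch: with exactly two points, the regression line passes through
# both, so least squares recovers the exact C-to-F conversion.
import numpy as np

c = np.array([0.0, 100.0])    # freezing and boiling point in Celsius
f = np.array([32.0, 212.0])   # the same temperatures in Fahrenheit

b, a = np.polyfit(c, f, 1)
print(a, b)                   # F = 32 + 1.8 * C
```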
Four Sets of X,Y Data: Data Set A, Data Set B, Data Set C, Data Set D, each with an X column and a Y column. [Data table omitted.]
Four Sets of X,Y Data, run through Data Analysis/Regression: identical regression output for A, B, C, and D!!!! [Excel SUMMARY OUTPUT omitted: Multiple R, R Square, Adjusted R Square, Standard Error, the ANOVA table, and the Intercept and X coefficients all match across the four sets; Observations = 11 for each.]
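A classic example of this phenomenon is Anscombe's quartet, which may well be the data behind this slide (it also has four sets of n = 11 points). A sketch:

```python
# Sketch: Anscombe's quartet -- four very different point clouds whose
# least-squares lines are (nearly) identical: Y = 3.00 + 0.50 * X.
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
sets = {
    "A": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "C": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8] * 7 + [19] + [8] * 3,
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (x, y) in sets.items():
    b, a = np.polyfit(x, y, 1)
    print(name, round(a, 2), round(b, 2))   # intercept ~3.0, slope ~0.5 every time
```

The moral is the slide's: the regression output alone cannot tell these clouds apart, so chart the cloud.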
Assumptions
Example: Section 4 IQs. [Excel Descriptive Statistics for IQ:]
– Standard Error: 3.448
– Median: 110
– Mode: 102
– Kurtosis: 0.228
– Range: 85
– Minimum: 57
– Maximum: 142
– Sum: 3582
– Count: 33
(Count is the n, and Standard Deviation is the s, used in the test.)
The CLT tells us this test works even if Y is not normal.
Regression Assumptions
Summary: The key assumption of linear regression…
Y ~ N(μ,σ) (no regression)
Y│X ~ N(a+bX,σ) (with regression)
– In other words, μ = a + b(X), or E(Y│X) = a + b(X).
Without regression, we used data to estimate and test hypotheses about the parameter μ.
With regression, we use (x,y) data to estimate and test hypotheses about the parameters a and b.
In both cases, we use the t because we don't know σ.
With regression, we also want to use X to forecast a new Y.
The mean of Y given X is a linear function of X. (EMBS 12.14)
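A sketch of this model in code, with assumed parameter values: simulate Y│X ~ N(a + bX, σ), then recover a and b by least squares:

```python
# Sketch: simulate the regression model Y|X ~ N(a + b*X, sigma) with
# made-up parameters, then estimate a and b from the (x, y) data.
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true, sigma = 2.0, 0.5, 1.0     # assumed values, for illustration only

x = rng.uniform(0, 10, size=500)
y = a_true + b_true * x + rng.normal(0, sigma, size=500)

b_hat, a_hat = np.polyfit(x, y, 1)        # least-squares estimates
print(round(a_hat, 2), round(b_hat, 2))   # close to the true 2.0 and 0.5
```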
Example: Assignment 22, regressing Hours (Y) on MSF (X). [Excel regression output omitted: Observations = 15 (the n); ANOVA df: Regression 1, Residual 13, Total 14; coefficients for Intercept and MSF; the regression Standard Error is the s.]
Forecasting Y│X=157.3
Plug X=157.3 into the regression equation to get 10.31 as the point forecast.
– The point forecast is the mean of the probability distribution forecast.
Under Certain Assumptions……. GOOD METHOD:
Pr(Y<8) = NORMDIST(8, 10.31, 2.77, true) = 0.202
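The same calculation outside Excel; a minimal sketch using Python's standard library:

```python
# Sketch: the slide's "good method" -- treat Y|X as normal with mean
# 10.31 (the point forecast) and sd 2.77 (the regression standard error).
from statistics import NormalDist

p = NormalDist(mu=10.31, sigma=2.77).cdf(8)   # Excel: NORMDIST(8,10.31,2.77,true)
print(round(p, 3))                            # 0.202
```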
Example: Assignment 22, continued. [Excel worksheet applying the regression to Job A and Job B: each job's point forecast comes from plugging its X (MSF) into the regression equation, sigma = 2.77, and NORMDIST gives each job's probability; the regression output (n = 15) repeats from the previous slide.]
Forecasting Y│X=157.3
Plug X=157.3 into the regression equation to get the point forecast.
– The point forecast is the mean of the probability distribution forecast.
Under Certain Assumptions……. BETTER METHOD:
t = (8 - 10.31)/2.77 = -0.83
Pr(Y<8) = 1 - t.dist.rt(-0.83, 13), where dof = n - 2 = 13
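The better method as a sketch in Python; SciPy's t distribution plays the role of Excel's T.DIST.RT:

```python
# Sketch: the "better method" -- same standardized value, but use the t
# distribution with n - 2 = 13 degrees of freedom instead of the normal.
from scipy import stats

t = (8 - 10.31) / 2.77          # = -0.83 (rounded)
p = stats.t.cdf(t, df=13)       # Excel: 1 - T.DIST.RT(-0.83, 13)
print(round(p, 3))              # a bit larger than the normal's 0.202
```

The fatter tails of the t (vs. the normal) are what make this answer slightly larger.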
Forecasting Y│X=157.3
Plug X=157.3 into the regression equation to get the point forecast.
– The point forecast is the mean of the probability distribution forecast.
Under Certain Assumptions……. PERFECT METHOD:
t = (8 - 10.31)/2.93 = -0.79
Pr(Y<8) = 1 - t.dist.rt(-0.79, 13), where dof = n - 2 = 13
(2.93 is the standard error of prediction, slightly larger than s = 2.77.)
Probability Forecasting with Regression summary
Probability Forecasting with Regression
The perfect method uses the standard error of prediction (EMBS 12.26):
s_pred = s · sqrt( 1 + 1/n + (x - x̄)² / Σ(xᵢ - x̄)² )
– Σ(xᵢ - x̄)² is summed over the n data points.
– x is the X for which we predict Y.
The good and better methods ignore the 1/n and (x - x̄)²/Σ(xᵢ - x̄)² terms… okay the bigger the n.
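A sketch of this formula on made-up (X, Y) data, showing that the prediction standard error always exceeds the regression s:

```python
# Sketch (data are invented, roughly MSF-vs-hours shaped): compute the
# perfect-method standard error of prediction at a new X.
import numpy as np

x = np.array([120.0, 135.0, 150.0, 160.0, 175.0])
y = np.array([7.0, 8.5, 9.8, 10.5, 12.1])

n = len(x)
b, a = np.polyfit(x, y, 1)
resid = y - (a + b * x)
s = np.sqrt((resid ** 2).sum() / (n - 2))     # regression standard error, dof = n - 2

x_new = 157.3                                 # the X we forecast for
s_pred = s * np.sqrt(1 + 1/n + (x_new - x.mean())**2 / ((x - x.mean())**2).sum())
print(s, s_pred)                              # s_pred > s, always
```

The two extra terms under the square root shrink toward zero as n grows, which is why the better method becomes a good approximation for large n.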
BOTTOM LINE
Much ado about nothing? [Chart of the three probability-interval bands: Perfect (widest and curved), Better, and Good (straight and narrowest).]
TODAY
Got a better idea of how the "least squares" regression line goes through the cloud of points.
Saw that several "clouds" can have exactly the same regression line… so chart the cloud.
Practiced using a regression equation to calculate a point forecast (a mean).
Saw three methods for creating a probability distribution forecast of Y│X.
– We will use the better method.
– We will know that it understates the actual uncertainty… a problem that goes away as n gets big.
Next Class
We will learn about "adjusted R square".
– (p 9-10 pfeifer note)
– The most over-rated statistic of all time.
We will learn the four assumptions required to use regression to make a probability forecast of Y│X.
– (Section 5 pfeifer note, 12.4 EMBS)
– And how to check each of them.
We will learn how to test H0: b = 0.
– (p pfeifer note, 12.5 EMBS)
– And why this is such an important test.