1
Simple Linear Regression
Often we want to understand the relationships among variables, e.g.,
  - SAT scores and college GPA
  - car weight and gas mileage
  - amount of a certain pollutant in wastewater and bacteria growth in local streams
  - number of takeoffs and landings and degree of metal fatigue in aircraft structures
Simplest relationship: Y = α + βx
2
Example
The owner of a small harness race track in Florida is interested in understanding the relationship between attendance at the track and the total amount bet each night. The data for a two-week period (10 racing nights) is as follows:

Attendance, x    Amount Bet ($000), Y
117              2.07
128              2.8
122              3.14
119              2.26
131              3.4
135              3.89
125              2.93
120              2.66
130              3.33
127              3.54
3
Estimating the Regression Coefficients
Method of Least Squares
Determine a and b (estimates for α and β) so that the sum of the squares of the residuals is minimized.
Steps:
  Calculate b using
    b = (n Σxiyi − (Σxi)(ΣYi)) / (n Σxi² − (Σxi)²)
  and a using
    a = (ΣYi)/n − b (Σxi)/n
4
For Our Example

Night   Attendance, x   Amount Bet, Y   xiyi     xi²
1       117             2.07            242.19   13689
2       128             2.8             358.4    16384
3       122             3.14            383.08   14884
4       119             2.26            268.94   14161
5       131             3.4             445.4    17161
6       135             3.89            525.15   18225
7       125             2.93            366.25   15625
8       120             2.66            319.2    14400
9       130             3.33            432.9    16900
10      127             3.54            449.58   16129
TOTAL   1254            30.02           ______   157558

b = ((10 * ______) - (1254 * 30.02)) / ((10 * 157558) - 1254^2) = ______
a = (30.02 / 10) - ______ * (1254 / 10) = ______
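The blanks in this worksheet can be checked with a short script. The following is a minimal Python sketch that follows the same arithmetic as the slide (column totals first, then b, then a); the variable names are illustrative and not part of the original deck.

```python
# Data from the harness race track example (10 racing nights).
attendance = [117, 128, 122, 119, 131, 135, 125, 120, 130, 127]
amount_bet = [2.07, 2.80, 3.14, 2.26, 3.40, 3.89, 2.93, 2.66, 3.33, 3.54]  # $000

n = len(attendance)

# Column totals, matching the TOTAL row of the table.
sum_x = sum(attendance)
sum_y = sum(amount_bet)
sum_xy = sum(x * y for x, y in zip(attendance, amount_bet))
sum_x2 = sum(x * x for x in attendance)

# Least-squares slope and intercept from the totals.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)

print(f"sum x = {sum_x}, sum Y = {sum_y:.2f}, sum xY = {sum_xy:.2f}, sum x^2 = {sum_x2}")
print(f"b = {b:.4f}")
print(f"a = {a:.4f}")
```

Running it prints the four totals followed by the slope and intercept, which can be compared against hand calculations.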
5
What does this mean?
We can draw the regression line that describes the relationship between attendance and amount bet. We can also predict amount bet based on attendance.
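As an illustration of prediction, here is a small Python sketch that fits the line to the example data and evaluates it at a chosen attendance. The helper names `fit_line` and `predict` are hypothetical, and the fit uses the centered form of the least-squares formulas, which is algebraically equivalent to the totals-based form.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centered sums of cross-products and squares.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def predict(a, b, x):
    """Predicted amount bet ($000) at attendance x."""
    return a + b * x


attendance = [117, 128, 122, 119, 131, 135, 125, 120, 130, 127]
amount_bet = [2.07, 2.80, 3.14, 2.26, 3.40, 3.89, 2.93, 2.66, 3.33, 3.54]

a, b = fit_line(attendance, amount_bet)

# Attendance of 125 lies inside the observed range (117 to 135);
# values far outside that range would be extrapolation.
y_hat_125 = predict(a, b, 125)
print(f"predicted amount bet at attendance 125: {y_hat_125:.4f} ($000)")
```

The slope b is the estimated change in amount bet (in $000) for each additional person attending.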
6
How good is our prediction?
Estimating the variance:
  SSE = sum(residuals²) = ______
  s² = SSE/(n − 2) = SSE/8 = ______
Coefficient of determination, R²: a measure of the “quality of fit,” or the proportion of the variability explained by the fitted model.
  SST = Σ(Yi − Ȳ)² = ______
  R² = 1 − (SSE/SST) = 1 − (0.639/2.945) = ______
(see next page)
7
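Calculations …

Night   Attendance, x   Amount Bet, Y   xiyi     xi²      yhat     residuals²   (Y − Ȳ)²
1       117             2.07            242.19   13689    ______   ______       ______
2       119             2.26            268.94   14161    ______   ______       ______
3       120             2.66            319.2    14400    ______   ______       ______
4       122             3.14            383.08   14884    ______   ______       ______
5       125             2.93            366.25   15625    2.9673   ______       ______
6       127             3.54            449.58   16129    ______   ______       ______
7       128             2.8             358.4    16384    ______   ______       0.0408
8       130             3.33            432.9    16900    ______   ______       ______
9       131             3.4             445.4    17161    ______   ______       0.1584
10      135             3.89            525.15   18225    ______   ______       ______
TOTAL   1254            30.02           ______   157558            ______       ______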
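The remaining columns of this table can be filled in programmatically. The sketch below, written in Python with illustrative variable names, computes yhat, the squared residual, and (Y − Ȳ)² for each night (sorted by attendance, as in the table), then derives SSE, SST, s², and R² from the column sums.

```python
attendance = [117, 128, 122, 119, 131, 135, 125, 120, 130, 127]
amount_bet = [2.07, 2.80, 3.14, 2.26, 3.40, 3.89, 2.93, 2.66, 3.33, 3.54]

n = len(attendance)
x_bar = sum(attendance) / n
y_bar = sum(amount_bet) / n

# Least-squares fit (centered form).
sxx = sum((x - x_bar) ** 2 for x in attendance)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(attendance, amount_bet))
b = sxy / sxx
a = y_bar - b * x_bar

# One row per night: (x, Y, yhat, residual^2, (Y - Ybar)^2), sorted by x.
rows = []
for x, y in sorted(zip(attendance, amount_bet)):
    y_hat = a + b * x
    rows.append((x, y, y_hat, (y - y_hat) ** 2, (y - y_bar) ** 2))

sse = sum(r[3] for r in rows)   # sum of squared residuals
sst = sum(r[4] for r in rows)   # total sum of squares
s2 = sse / (n - 2)              # variance estimate, n - 2 = 8 degrees of freedom
r2 = 1 - sse / sst              # coefficient of determination

for x, y, y_hat, res2, dev2 in rows:
    print(f"{x:5d} {y:6.2f} {y_hat:8.4f} {res2:8.4f} {dev2:8.4f}")
print(f"SSE = {sse:.3f}, SST = {sst:.3f}, s^2 = {s2:.4f}, R^2 = {r2:.3f}")
```

The SSE and SST it reports should agree with the 0.639 and 2.945 used in the R² formula on the previous slide.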
8
Or … Using Excel
Note the confidence interval … we can also draw a confidence interval around our predictions.
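The slide does not say which interval the Excel output shows, so the following Python sketch rests on an assumption: it computes the standard textbook 95% confidence interval for the mean amount bet at a given attendance x0, namely yhat ± t · s · sqrt(1/n + (x0 − x̄)²/Sxx). The critical value 2.306 (two-sided 95%, 8 degrees of freedom) is taken from a t table rather than computed, and the function name `mean_response_interval` is hypothetical.

```python
import math

attendance = [117, 128, 122, 119, 131, 135, 125, 120, 130, 127]
amount_bet = [2.07, 2.80, 3.14, 2.26, 3.40, 3.89, 2.93, 2.66, 3.33, 3.54]

n = len(attendance)
x_bar = sum(attendance) / n
y_bar = sum(amount_bet) / n

sxx = sum((x - x_bar) ** 2 for x in attendance)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(attendance, amount_bet))
b = sxy / sxx
a = y_bar - b * x_bar

# Residual standard error with n - 2 degrees of freedom.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(attendance, amount_bet))
s = math.sqrt(sse / (n - 2))

# Assumption: two-sided 95% t critical value for 8 degrees of freedom (t table).
T_CRIT = 2.306


def mean_response_interval(x0):
    """95% confidence interval for the mean amount bet at attendance x0."""
    y_hat = a + b * x0
    se = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - T_CRIT * se, y_hat + T_CRIT * se


lower, upper = mean_response_interval(125)
print(f"95% CI for mean amount bet at attendance 125: ({lower:.3f}, {upper:.3f})")
```

The interval is narrowest near the mean attendance and widens as x0 moves away from it, which is why the bands drawn around a regression line curve outward at the ends.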