1
12.3 Least Squares Procedure
The least-squares procedure obtains estimates of the linear equation coefficients $b_0$ and $b_1$ in the model $\hat{y}_i = b_0 + b_1 x_i$ by minimizing the sum of the squared residuals $e_i = y_i - \hat{y}_i$. This results in the following procedure: choose $b_0$ and $b_1$ so that the quantity
$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$
is minimized. We use differential calculus to obtain the coefficient estimators that minimize SSE.
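The calculus step is not shown on the slide. As a sketch, assuming the usual definition $SSE = \sum (y_i - b_0 - b_1 x_i)^2$, setting both partial derivatives to zero gives the standard normal equations and their solution:

```latex
\begin{align*}
SSE &= \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2, \\
\frac{\partial SSE}{\partial b_0} &= -2\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0
  \;\Rightarrow\; b_0 = \bar{y} - b_1 \bar{x}, \\
\frac{\partial SSE}{\partial b_1} &= -2\sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i) = 0
  \;\Rightarrow\; b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.
\end{align*}
```

The second result follows by substituting $b_0 = \bar{y} - b_1 \bar{x}$ into the second equation and using $\sum (x_i - \bar{x}) = 0$.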
2
Least-Squares Derived Coefficient Estimators
The slope coefficient estimator is
$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
and the constant or intercept estimator is
$$b_0 = \bar{y} - b_1 \bar{x}$$
We also note that the regression line always goes through the point of means $(\bar{x}, \bar{y})$.
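A minimal Python sketch of these two estimators, using the standard closed-form expressions. The function name `least_squares` and the five data points are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def least_squares(x, y):
    """Return (b0, b1) for the fitted line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope estimator
    b0 = y_bar - b1 * x_bar    # intercept estimator
    return b0, b1


# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
print(round(b0, 4), round(b1, 4))  # 2.2 0.6

# The fitted line passes through the point of means (x_bar, y_bar)
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
fitted_at_mean = b0 + b1 * x_bar
```

Here `fitted_at_mean` equals `y_bar` up to floating-point rounding, which is the "line goes through the means" property stated above.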
3
Standard Assumptions for the Linear Regression Model
The following assumptions are used to make inferences about the population linear model by using the estimated coefficients:
1. The x's are fixed numbers, or they are realizations of a random variable, X, that are independent of the error terms, $\varepsilon_i$. In the latter case, inference is carried out conditionally on the observed values of the x's.
2. The error terms are random variables with mean 0 and the same variance, $\sigma^2$. The latter is called homoscedasticity or uniform variance.
3. The random error terms, $\varepsilon_i$, are not correlated with one another, so that $E[\varepsilon_i \varepsilon_j] = 0$ for all $i \neq j$.
4
Regression Analysis for Retail Sales (Figure 10.5)
The regression equation is Y (Retail Sales) = 1922 + 0.382 X (Income), with intercept $b_0 = 1922$ and slope $b_1 = 0.382$.
5
Analysis of Variance
The total variability in a regression analysis, SST, can be partitioned into a component explained by the regression, SSR, and a component due to unexplained error, SSE:
$$SST = SSR + SSE$$
with the components defined as follows.
Total sum of squares: $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$
Error sum of squares: $SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$
Regression sum of squares: $SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
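The partition can be checked numerically. The sketch below assumes the standard definitions of the three sums of squares and uses a small hypothetical data set (not the retail sales data from the figures).

```python
# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares coefficients
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained

print(round(sst, 4), round(ssr, 4), round(sse, 4))  # 6.0 3.6 2.4
```

The identity SST = SSR + SSE holds here because the coefficients are the least-squares estimates; for an arbitrary line the cross-product term would not vanish.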
6
Regression Analysis for Retail Sales (Figure 10.7)
The regression equation is Y (Retail Sales) = 1922 + 0.382 X (Income).
7
Coefficient of Determination, $R^2$
The coefficient of determination for a regression equation is defined as
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
This quantity varies from 0 to 1 and gives the proportion of the total variability that is explained by the regression; higher values indicate a better regression. Caution should be used in making general interpretations of $R^2$ because a high value can result from either a small SSE or a large SST or both.
8
Correlation and $R^2$
The coefficient of determination, $R^2$, for a simple regression is equal to the simple correlation squared:
$$R^2 = r_{xy}^2$$
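As an illustration, the following sketch computes $R^2$ from the sums of squares and compares it with the squared sample correlation. The data are hypothetical, and the variable names are chosen for this example only.

```python
import math

# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)  # this is SST
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / s_yy             # R^2 = 1 - SSE/SST
r_xy = s_xy / math.sqrt(s_xx * s_yy)   # sample correlation

print(round(r_squared, 4), round(r_xy ** 2, 4))  # 0.6 0.6
```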
9
Estimation of Model Error Variance
The quantity SSE is a measure of the total squared deviation about the estimated regression line, and $e_i$ is the residual. An estimator for the variance of the population model error is
$$\hat{\sigma}^2 = s_e^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2} = \frac{SSE}{n - 2}$$
Division by n – 2 instead of n – 1 results because the simple regression model uses two estimated parameters, $b_0$ and $b_1$, instead of one.
10
Sampling Distribution of the Least Squares Coefficient Estimator
If the standard least squares assumptions hold, then $b_1$ is an unbiased estimator of $\beta_1$ and has a population variance
$$\sigma_{b_1}^2 = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2}{(n - 1) s_X^2}$$
and an unbiased sample variance estimator
$$s_{b_1}^2 = \frac{s_e^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{s_e^2}{(n - 1) s_X^2}$$
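A short sketch of how the error variance estimate and the estimated variance of the slope might be computed, assuming the usual formulas $s_e^2 = SSE/(n-2)$ and $s_{b_1}^2 = s_e^2 / \sum (x_i - \bar{x})^2$. The data are hypothetical.

```python
import math

# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)

s_e_sq = sse / (n - 2)     # two estimated parameters, hence n - 2
s_b1_sq = s_e_sq / s_xx    # estimated variance of the slope
s_b1 = math.sqrt(s_b1_sq)  # standard error of the slope

print(round(s_e_sq, 4), round(s_b1_sq, 4))  # 0.8 0.08
```

Because $\sum (x_i - \bar{x})^2$ sits in the denominator, x values that are more spread out give a smaller standard error for the slope, other things equal.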
11
Basis for Inference About the Population Regression Slope
Let $\beta_1$ be a population regression slope and $b_1$ its least squares estimate based on n pairs of sample observations. Then, if the standard regression assumptions hold and it can also be assumed that the errors $\varepsilon_i$ are normally distributed, the random variable
$$t = \frac{b_1 - \beta_1}{s_{b_1}}$$
is distributed as Student's t with (n – 2) degrees of freedom. In addition, the central limit theorem enables us to conclude that this result is approximately valid for a wide range of non-normal distributions and large sample sizes, n.
12
Excel Output for Retail Sales Model (Figure 10.9)
The regression equation is Y (Retail Sales) = 1922 + 0.382 X (Income).
[Figure: Excel regression output with SSR, SSE, SST, MSR, MSE, $b_0$, $b_1$, $s_{b_1}$, $t_{b_1}$, and $s_e$ labeled.]
13
Tests of the Population Regression Slope
If the regression errors $\varepsilon_i$ are normally distributed and the standard least squares assumptions hold (or if the distribution of $b_1$ is approximately normal), the following tests have significance level $\alpha$:
1. To test either null hypothesis $H_0: \beta_1 = \beta_1^*$ or $H_0: \beta_1 \le \beta_1^*$ against the alternative $H_1: \beta_1 > \beta_1^*$, the decision rule is: reject $H_0$ if $\frac{b_1 - \beta_1^*}{s_{b_1}} \ge t_{n-2,\alpha}$.
14
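Tests of the Population Regression Slope (continued)
2. To test either null hypothesis $H_0: \beta_1 = \beta_1^*$ or $H_0: \beta_1 \ge \beta_1^*$ against the alternative $H_1: \beta_1 < \beta_1^*$, the decision rule is: reject $H_0$ if $\frac{b_1 - \beta_1^*}{s_{b_1}} \le -t_{n-2,\alpha}$.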
15
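Tests of the Population Regression Slope (continued)
3. To test the null hypothesis $H_0: \beta_1 = \beta_1^*$ against the two-sided alternative $H_1: \beta_1 \ne \beta_1^*$, the decision rule is: reject $H_0$ if $\frac{b_1 - \beta_1^*}{s_{b_1}} \ge t_{n-2,\alpha/2}$ or $\frac{b_1 - \beta_1^*}{s_{b_1}} \le -t_{n-2,\alpha/2}$.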
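The three decision rules for the slope can be sketched in one small function. Everything here is illustrative: `slope_test` is a hypothetical helper, the slope and standard error are made-up values, and the critical values are assumed to be read from a Student's t table (for 3 degrees of freedom, 2.353 at an upper-tail area of 0.05 and 3.182 at 0.025).

```python
import math


def slope_test(b1, s_b1, beta_star, t_crit, alternative):
    """Return (t, reject) for a test of H0: beta_1 = beta_star.

    t_crit is the table value t_{n-2, alpha} for a one-sided test
    and t_{n-2, alpha/2} for the two-sided test.
    """
    t = (b1 - beta_star) / s_b1
    if alternative == "greater":      # H1: beta_1 > beta_star
        reject = t >= t_crit
    elif alternative == "less":       # H1: beta_1 < beta_star
        reject = t <= -t_crit
    elif alternative == "two-sided":  # H1: beta_1 != beta_star
        reject = abs(t) >= t_crit
    else:
        raise ValueError("unknown alternative: " + alternative)
    return t, reject


# Hypothetical estimates from a sample of n = 5 pairs (3 degrees of freedom)
b1 = 0.6
s_b1 = math.sqrt(0.08)

t_two, reject_two = slope_test(b1, s_b1, 0.0, 3.182, "two-sided")
t_up, reject_up = slope_test(b1, s_b1, 0.0, 2.353, "greater")
print(round(t_two, 4), reject_two, reject_up)  # 2.1213 False False
```

With so few observations the slope of 0.6 is not significantly different from zero at the 5% level under either alternative.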
16
Confidence Intervals for the Population Regression Slope $\beta_1$
If the regression errors $\varepsilon_i$ are normally distributed and the standard regression assumptions hold, a $100(1 - \alpha)\%$ confidence interval for the population regression slope $\beta_1$ is given by
$$b_1 - t_{n-2,\alpha/2}\, s_{b_1} < \beta_1 < b_1 + t_{n-2,\alpha/2}\, s_{b_1}$$
where $t_{n-2,\alpha/2}$ is the number for which
$$P(t_{n-2} > t_{n-2,\alpha/2}) = \alpha/2$$
and the random variable $t_{n-2}$ follows a Student's t distribution with (n – 2) degrees of freedom.
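A minimal sketch of the interval calculation, assuming the form $b_1 \pm t_{n-2,\alpha/2}\, s_{b_1}$. The function name `slope_confidence_interval` and the numbers are hypothetical, and the critical value 3.182 is the t table entry for 3 degrees of freedom and upper-tail area 0.025.

```python
import math


def slope_confidence_interval(b1, s_b1, t_crit):
    """Return (lower, upper) for b1 +/- t_crit * s_b1."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin


# Hypothetical estimates from n = 5 pairs, so n - 2 = 3 degrees of freedom
b1 = 0.6
s_b1 = math.sqrt(0.08)
t_crit = 3.182  # t_{3, 0.025} from a t table, for a 95% interval

lower, upper = slope_confidence_interval(b1, s_b1, t_crit)
print(round(lower, 2), round(upper, 2))  # -0.3 1.5
```

The interval contains zero, which agrees with a two-sided test at the 5% level failing to reject $\beta_1 = 0$ for the same numbers.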
17
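F Test for Simple Regression Coefficient
We can test the hypothesis $H_0: \beta_1 = 0$ against the alternative $H_1: \beta_1 \ne 0$ by using the F statistic
$$F = \frac{MSR}{MSE} = \frac{SSR}{s_e^2}$$
The decision rule is: reject $H_0$ if $F \ge F_{1,n-2,\alpha}$. We can also show that the F statistic is
$$F = t_{b_1}^2$$
for any simple regression analysis.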
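The relationship between F and t can be verified numerically. This sketch assumes MSR = SSR/1 and MSE = SSE/(n – 2) for simple regression and uses hypothetical data.

```python
import math

# Hypothetical data, not from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

msr = ssr / 1        # one explanatory variable
mse = sse / (n - 2)  # same quantity as s_e^2
f_stat = msr / mse

# t statistic for H0: beta_1 = 0
t_stat = b1 / math.sqrt(mse / s_xx)

print(round(f_stat, 4), round(t_stat ** 2, 4))  # 4.5 4.5
```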
18
Key Words
- Analysis of Variance
- Assumptions for the Least Squares Coefficient Estimators
- Basis for Inference About the Population Regression Slope
- Coefficient of Determination, $R^2$
- Confidence Intervals for Predictions
- Confidence Intervals for the Population Regression Slope $\beta_1$
- Correlation and $R^2$
- Estimation of Model Error Variance
- F Test for Simple Regression Coefficient
- Least-Squares Procedure
- Linear Regression Outcomes
19
Key Words (continued)
- Linear Regression Population Equation Model
- Population Model
- Sampling Distribution of the Least Squares Coefficient Estimator
- Tests for Zero Population Correlation
- Tests of the Population Regression Slope