Tests of Linearity and Significance of Regression
Session 02
Course: I0174 - Analisis Regresi (Regression Analysis)
Year: Odd Semester 2007/2008
Bina Nusantara
Tests of Linearity and Significance of Regression
- ANOVA in simple regression
- Confidence intervals for the regression parameters
- Test of independence between variables
Measures of Variation: The Sum of Squares
SST = SSR + SSE
Total Sample Variability = Explained Variability + Unexplained Variability
Measures of Variation: The Sum of Squares (continued)
- SST = Total Sum of Squares: measures the variation of the $Y_i$ values around their mean $\bar{Y}$
- SSR = Regression Sum of Squares: explained variation attributable to the relationship between X and Y
- SSE = Error Sum of Squares: variation attributable to factors other than the relationship between X and Y
Measures of Variation: The Sum of Squares (continued)
$SST = \sum (Y_i - \bar{Y})^2$
$SSE = \sum (Y_i - \hat{Y}_i)^2$
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
[Figure: plot of Y against X showing these three deviations at a point $X_i$]
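The decomposition SST = SSR + SSE can be checked numerically. The following is a minimal sketch in plain Python, assuming a simple (one-predictor) least-squares fit; the data values and the helper names `fit_line` and `sums_of_squares` are made up for illustration and do not come from the slides.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for the fitted simple regression line."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]                      # fitted values
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation
    return sst, ssr, sse


# Hypothetical data, chosen only so the arithmetic is easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(x, y)
print(f"SST={sst:.2f}  SSR={ssr:.2f}  SSE={sse:.2f}  SSR+SSE={ssr + sse:.2f}")
```

For these made-up numbers the fitted line is Y = 2.2 + 0.6 X, and the total variation of 6 splits into 3.6 explained and 2.4 unexplained.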
Venn Diagrams and Explanatory Power of Regression
[Figure: Venn diagram with two overlapping circles, Sales and Sizes]
- Overlap: variations in Sales explained by Sizes, or variations in Sizes used in explaining variation in Sales
- Sales only: variations in Sales explained by the error term, or unexplained by Sizes
- Sizes only: variations in store Sizes not used in explaining variation in Sales
The ANOVA Table in Excel
ANOVA
Source       df       SS    MS                  F         Significance F
Regression   k        SSR   MSR = SSR/k         MSR/MSE   P-value of the F test
Residuals    n-k-1    SSE   MSE = SSE/(n-k-1)
Total        n-1      SST
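The entries of the ANOVA table follow mechanically from SSR, SSE, n, and k. A minimal sketch, assuming the sums of squares are already known; the function name `anova_table` and the input numbers are hypothetical. The p-value ("Significance F") is left out because the Python standard library has no F distribution; if SciPy is available, `scipy.stats.f.sf(F, k, n - k - 1)` should give it.

```python
def anova_table(ssr, sse, n, k):
    """Build the ANOVA rows from the sums of squares.

    n is the number of observations, k the number of predictors.
    """
    df_reg = k
    df_res = n - k - 1
    msr = ssr / df_reg          # mean square due to regression
    mse = sse / df_res          # mean square error
    return {
        "Regression": {"df": df_reg, "SS": ssr, "MS": msr, "F": msr / mse},
        "Residuals": {"df": df_res, "SS": sse, "MS": mse},
        "Total": {"df": n - 1, "SS": ssr + sse},
    }


# Hypothetical inputs: simple regression (k = 1) on n = 5 observations.
table = anova_table(ssr=3.6, sse=2.4, n=5, k=1)
for source, row in table.items():
    cells = "  ".join(f"{name}={value:.4g}" for name, value in row.items())
    print(f"{source:<11}{cells}")
```

With these inputs MSR = 3.6 and MSE = 0.8, so F = MSR/MSE works out to 4.5, and the degrees of freedom add up as 1 + 3 = 4 = n - 1.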
Measures of Variation, The Sum of Squares: Example
[Figure: Excel output for Produce Stores, with callouts marking the regression (explained) df, error (residual) df, total df (degrees of freedom), SSR, SSE, and SST]
The Coefficient of Determination
$r^2 = SSR / SST$
- Measures the proportion of variation in Y that is explained by the independent variable X in the regression model
Venn Diagrams and Explanatory Power of Regression
[Figure: Venn diagram of Sales and Sizes]
Coefficients of Determination ($r^2$) and Correlation ($r$)
[Figure: four scatter plots of Y against X, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$]
- $r^2 = 1$, $r = +1$
- $r^2 = 1$, $r = -1$
- $r^2 = 0.81$, $r = +0.9$
- $r^2 = 0$, $r = 0$
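The link between r and r² in simple regression can be illustrated with a short computation: r carries the sign of the slope b1, while r² is the same for an upward and a downward trend of equal strength. This is a sketch in plain Python with made-up data; the helper name `slope_and_r` is hypothetical.

```python
import math


def slope_and_r(x, y):
    """Return the least-squares slope b1 and the correlation coefficient r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return b1, r


# Hypothetical data: the second series is the first one reversed,
# so the trend has the same strength but the opposite direction.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]
y_down = list(reversed(y_up))

b1_up, r_up = slope_and_r(x, y_up)
b1_down, r_down = slope_and_r(x, y_down)
print(f"upward:   b1={b1_up:+.2f}  r={r_up:+.4f}  r^2={r_up ** 2:.2f}")
print(f"downward: b1={b1_down:+.2f}  r={r_down:+.4f}  r^2={r_down ** 2:.2f}")
```

Both series give r² = 0.6, but r is positive for the first and negative for the second, matching the sign of the slope in each case.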
Standard Error of Estimate
$S_{YX} = \sqrt{SSE / (n - 2)}$
- Measures the standard deviation (variation) of the Y values around the fitted regression line
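A minimal sketch of the computation, assuming the usual simple-regression formula S_YX = sqrt(SSE / (n - 2)), where n - 2 is the residual degrees of freedom for one predictor. The data and the function name `standard_error_of_estimate` are made up for illustration.

```python
import math


def standard_error_of_estimate(x, y):
    """S_YX = sqrt(SSE / (n - 2)) for a simple least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    # SSE: squared vertical distances from the observations to the fitted line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
s_yx = standard_error_of_estimate(x, y)
print(f"S_YX = {s_yx:.4f}")
```

Here SSE = 2.4 and n = 5, so S_YX is the square root of 0.8, roughly 0.89, in the same units as Y.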
Measures of Variation: Produce Store Example
[Figure: Excel output for Produce Stores, with callouts marking $r^2$, $S_{YX}$, and n]
- $r^2 = 0.94$: 94% of the variation in annual sales can be explained by the variability in the size of the store as measured by square footage.
Linear Regression Assumptions
- Normality
  - Y values are normally distributed for each X
  - Probability distribution of error is normal
- Homoscedasticity (constant variance)
- Independence of errors
Consequences of Violation of the Assumptions
- Violation of the assumptions
  - Non-normality (error not normally distributed)
  - Heteroscedasticity (variance not constant); usually happens in cross-sectional data
  - Autocorrelation (errors are not independent); usually happens in time-series data
- Consequences of any violation of the assumptions
  - Predictions and estimations obtained from the sample regression line will not be accurate
  - Hypothesis testing results will not be reliable
- It is important to verify the assumptions
Variation of Errors Around the Regression Line
- Y values are normally distributed around the regression line.
- For each X value, the “spread” or variance around the regression line is the same.
[Figure: sample regression line of Y against X, with the same error density f(e) drawn at $X_1$ and $X_2$]
Residual Analysis
- Purposes
  - Examine linearity
  - Evaluate violations of assumptions
- Graphical analysis of residuals
  - Plot residuals vs. X and time
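Before residuals can be plotted they have to be computed. The sketch below, in plain Python with made-up data, pairs each residual e_i = Y_i - (b0 + b1 X_i) with its X value and prints them in X order, which is the table a residuals-vs-X plot would be drawn from. The function name `residuals_by_x` is hypothetical, and the plotting itself is left to whatever tool is at hand.

```python
def residuals_by_x(x, y):
    """Fit a simple least-squares line and return (x, residual) pairs sorted by x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    pairs = [(xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]
    return sorted(pairs, key=lambda pair: pair[0])


# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
pairs = residuals_by_x(x, y)
for xi, ei in pairs:
    print(f"X={xi}  residual={ei:+.2f}")

# With an intercept in the model, least-squares residuals sum to zero
# (up to floating-point rounding).
print(f"sum of residuals = {sum(e for _, e in pairs):.2e}")
```

A residual plot with no visible pattern is consistent with linearity and constant variance; curvature or a fanning spread in the plotted values would point to a violated assumption. If the observations were collected in time order, plotting the same residuals against that order is the analogous check for independence.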