Published by Jeffrey Francis. Modified over 9 years ago.
1
QMS 6351 Statistics and Research Methods
Regression Analysis: Testing for Significance
Chapter 14 (14.5-14.6), Chapter 15 (15.5)
Prof. Vera Adamchik
2
Multiple Regression Model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
Multiple Regression Equation: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$
The unknown parameters are $\beta_0, \beta_1, \beta_2, \dots, \beta_p$.
Sample data: observations on $x_1, x_2, \dots, x_p$ and $y$.
Estimated Multiple Regression Equation: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
The sample statistics $b_0, b_1, b_2, \dots, b_p$ provide estimates of $\beta_0, \beta_1, \beta_2, \dots, \beta_p$.
3
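Hypotheses about $\beta_i$
Two-tailed: $H_0: \beta_i = \text{specific value}$; $H_a: \beta_i \neq \text{specific value}$
Lower-tailed: $H_0: \beta_i \geq \text{specific value}$; $H_a: \beta_i < \text{specific value}$
Upper-tailed: $H_0: \beta_i \leq \text{specific value}$; $H_a: \beta_i > \text{specific value}$
The most common hypothesis is that $\beta_i$ equals zero (that is, there is no relationship between $y$ and $x_i$).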
4
To learn how to test for a significant regression relationship, we will use the “Programmer Salary Survey” example from the “Ch. 14-15 Part 1” Power Point file.
5
Testing for significance Two tests are commonly used: the t test and the F test. In simple linear regression, the F and t tests provide the same conclusion. In multiple regression, the F and t tests have different purposes.
6
The F test The F test is used to determine whether a significant relationship exists between the dependent variable and the set of all the independent variables. The F test is referred to as the test for overall significance.
7
The t test If the F test shows overall significance, the t test is used to determine whether each of the individual independent variables is significant. A separate t test is conducted for each of the independent variables in the model. We refer to each of these t tests as a test for individual significance.
8
Different samples from the same population will produce different values for $b_i$ (that is, $b_0$, $b_1$, $b_2$, $b_3$, etc.). Hence, the estimated regression coefficients are random variables. To test the hypotheses, we need to know the sampling distribution of $b_i$, that is, the sampling distribution of $b_1$, the sampling distribution of $b_2$, etc.
9
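Sampling distribution of $b_i$ Because of the assumption of normally distributed random errors, the sampling distribution of $b_i$ is normal. The mean of $b_i$ is $E(b_i) = \beta_i$. The standard deviation (a.k.a. standard error) of $b_i$, written $\sigma_{b_i}$, depends on $\sigma$, where $\sigma$ is the standard deviation of $\varepsilon$ in the regression model. [The slide's formula for $\sigma_{b_i}$ is not preserved in this transcript.]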
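The idea that a coefficient estimate varies from sample to sample can be illustrated with a small simulation. The sketch below is not from the slides: it assumes a simple (one-predictor) regression with made-up parameter values, for which the standard deviation of the slope is known to be $\sigma / \sqrt{\sum (x - \bar{x})^2}$. All names and numbers in it are hypothetical.

```python
import random
import statistics


def simulate_slopes(beta0, beta1, sigma, xs, n_samples, seed=0):
    """Draw repeated samples from y = beta0 + beta1*x + eps and return each OLS slope."""
    rng = random.Random(seed)
    x_bar = sum(xs) / len(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slopes = []
    for _ in range(n_samples):
        # Normally distributed errors, as the regression model assumes.
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        y_bar = sum(ys) / len(ys)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        slopes.append(sxy / sxx)  # least-squares estimate b1 for this sample
    return slopes


# Hypothetical population: beta0 = 3.0, beta1 = 1.5, sigma = 2.0, n = 20.
xs = list(range(20))
slopes = simulate_slopes(beta0=3.0, beta1=1.5, sigma=2.0, xs=xs, n_samples=2000)

mean_b1 = statistics.mean(slopes)
sd_b1 = statistics.stdev(slopes)
x_bar = sum(xs) / len(xs)
theoretical_sd = 2.0 / sum((x - x_bar) ** 2 for x in xs) ** 0.5

print(f"mean of b1 over samples: {mean_b1:.4f} (true beta1 = 1.5)")
print(f"std. dev. of b1: {sd_b1:.4f} (theory: {theoretical_sd:.4f})")
```

The average of the simulated slopes should land close to the true slope, and their spread close to the theoretical standard error, which is what the normal sampling distribution described above predicts.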
10
Sampling distributions of $b_i$ [Figure: sampling distributions of $b_1$, $b_2$, etc.]
11
Because we do not know the value of $\sigma$, we use an estimate of $\sigma$ (see the next slide).
12
An estimate of $\sigma$ is referred to as the standard error of the estimate, $s = \sqrt{\mathrm{MSE}} = \sqrt{\mathrm{SSE}/(n - p - 1)}$, where $p$ is the number of independent variables in the regression model; MSE stands for "the mean square error" and provides the estimate of $\sigma^2$.
13
Excel's Regression Statistics. Standard error of the estimate: $s = \sqrt{91.88949/(20 - 3 - 1)} = 2.396475$
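The calculation on this slide can be reproduced in a few lines. This is a minimal sketch using the slide's own figures (SSE = 91.88949, n = 20, p = 3); the function name is an illustrative choice, not something from the course materials.

```python
import math


def standard_error_of_estimate(sse, n, p):
    """Return s = sqrt(SSE / (n - p - 1)), the estimate of sigma."""
    df = n - p - 1  # error degrees of freedom
    if df <= 0:
        raise ValueError("need n > p + 1 observations")
    return math.sqrt(sse / df)


# Values taken from the slide: SSE = 91.88949, n = 20 observations, p = 3 predictors.
s = standard_error_of_estimate(sse=91.88949, n=20, p=3)
print(f"s = {s:.6f}")
```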
14
Estimated standard deviation (standard error) of $b_i$, written $s_{b_i}$
15
Consequently, we use the t-distribution to test the hypotheses. The t test for a significant relationship is based on the fact that the test statistic $t = (b_i - \beta_i)/s_{b_i}$ follows a t-distribution with $n - p - 1$ degrees of freedom.
16
Tests for individual significance
17
OUR EXAMPLE: Testing for significance: t Test ($\beta_1$, "Experience")
1. Determine the hypotheses: $H_0: \beta_1 = 0$; $H_a: \beta_1 \neq 0$.
2. Specify the sampling distribution of $b_1$ assuming that the null hypothesis is true.
3. Specify the level of significance: $\alpha = 0.05$.
18
OUR EXAMPLE: Testing for significance: t Test ($\beta_1$, "Experience")
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_1 / s_{b_1}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $|t| > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.
19
OUR EXAMPLE: Testing for significance: t Test ($\beta_1$, "Experience")
5. Compute the value of the test statistic: $t = 3.8561$; p-value = 0.0014.
6. Determine whether to reject $H_0$. The p-value = 0.0014 < $\alpha$ = 0.05. Reject $H_0$. Equivalently, $t = 3.8561 > t_{\text{critical}} = 2.120$. Reject $H_0$.
We conclude that $\beta_1$ is not equal to zero. The evidence is sufficient to conclude that a statistically significant relationship exists between the annual salary and the years of experience.
20
OUR EXAMPLE: Testing for significance: t Test. [Figure: Excel's Regression Equation Output, highlighting the t statistic and p-value used to test for the individual significance of "Experience". Note: Columns F-I are not shown.]
21
OUR EXAMPLE: Testing for significance: t Test ($\beta_2$, "Test Score")
1. Determine the hypotheses: $H_0: \beta_2 = 0$; $H_a: \beta_2 \neq 0$.
2. Specify the sampling distribution of $b_2$ assuming that the null hypothesis is true.
3. Specify the level of significance: $\alpha = 0.05$.
22
OUR EXAMPLE: Testing for significance: t Test ($\beta_2$, "Test Score")
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_2 / s_{b_2}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $|t| > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.
23
OUR EXAMPLE: Testing for significance: t Test ($\beta_2$, "Test Score")
5. Compute the value of the test statistic: $t = 2.1905$; p-value = 0.04364.
6. Determine whether to reject $H_0$. The p-value = 0.04364 < $\alpha$ = 0.05. Reject $H_0$. Equivalently, $t = 2.1905 > t_{\text{critical}} = 2.120$. Reject $H_0$.
We conclude that $\beta_2$ is not equal to zero. The evidence is sufficient to conclude that a statistically significant relationship exists between the annual salary and the score on the programmer aptitude test.
24
OUR EXAMPLE: Testing for significance: t Test. [Figure: Excel's Regression Equation Output, highlighting the t statistic and p-value used to test for the individual significance of "Test Score". Note: Columns F-I are not shown.]
25
OUR EXAMPLE: Testing for significance: t Test ($\beta_3$, "Grad. Degr.")
1. Determine the hypotheses: $H_0: \beta_3 = 0$; $H_a: \beta_3 \neq 0$.
2. Specify the sampling distribution of $b_3$ assuming that the null hypothesis is true.
3. Specify the level of significance: $\alpha = 0.05$.
26
OUR EXAMPLE: Testing for significance: t Test ($\beta_3$, "Grad. Degr.")
4. Select the test statistic and state the rejection rule.
Standardized (t-value) approach: The test statistic is $t = b_3 / s_{b_3}$. For $\alpha = 0.05$ and d.f. = 16, the critical value is $t_{0.025} = 2.120$. Reject $H_0$ if $|t| > 2.120$.
p-value approach: Reject $H_0$ if p-value < 0.05.
27
OUR EXAMPLE: Testing for significance: t Test ($\beta_3$, "Grad. Degr.")
5. Compute the value of the test statistic: $t = 1.1479$; p-value = 0.26789.
6. Determine whether to reject $H_0$. The p-value = 0.26789 > $\alpha$ = 0.05. Do not reject $H_0$. Equivalently, $t = 1.1479 < t_{\text{critical}} = 2.120$. Do not reject $H_0$.
The evidence is insufficient to reject $H_0$. We cannot conclude that $\beta_3$ differs from zero; that is, the data show no statistically significant relationship between the annual salary and whether the individual has a graduate degree in computer science or information systems. (Failing to reject $H_0$ does not prove that $\beta_3$ equals zero.)
28
OUR EXAMPLE: Testing for significance: t Test. [Figure: Excel's Regression Equation Output, highlighting the t statistic and p-value used to test for the individual significance of "Grad. Degr.". Note: Columns F-I are not shown.]
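The p-values that Excel reports for the three t tests can be approximated without a statistics package. The sketch below is one possible approach, not the method Excel uses: it integrates the Student's t density numerically with Simpson's rule. The t statistics and the degrees of freedom (20 - 3 - 1 = 16) come from the slides; the function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| > |t_stat|) with composite Simpson's rule (steps must be even)."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_area = 2 * total * h / 3  # area between -a and a, by symmetry
    return max(0.0, 1.0 - central_area)


alpha = 0.05
df = 20 - 3 - 1  # n - p - 1
t_stats = {"Experience": 3.8561, "Test Score": 2.1905, "Grad. Degr.": 1.1479}

p_values = {}
for name, t_stat in t_stats.items():
    p_values[name] = two_tailed_p_value(t_stat, df)
    decision = "reject H0" if p_values[name] < alpha else "do not reject H0"
    print(f"{name}: t = {t_stat}, p-value = {p_values[name]:.5f} -> {decision}")
```

The decisions should match the slides: reject $H_0$ for "Experience" and "Test Score", and do not reject for "Grad. Degr.".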
29
Confidence interval for $\beta_i$ We can use a $100(1 - \alpha)\%$ confidence interval for $\beta_i$ to test the hypotheses just used in the t test. $H_0$ is rejected if the hypothesized value of $\beta_i$ is not included in the confidence interval for $\beta_i$.
30
Confidence interval for $\beta_i$ The form of a confidence interval for $\beta_i$ is $b_i \pm t_{\alpha/2}\, s_{b_i}$, where $b_i$ is the point estimator, $t_{\alpha/2}\, s_{b_i}$ is the margin of error, and $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with $n - p - 1$ degrees of freedom.
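A short sketch of how the interval and the resulting decision could be computed follows. The coefficient estimates and standard errors in it are hypothetical placeholders chosen only for illustration (the transcript does not preserve the example's actual values); the critical value 2.120 is the one the slides use for $\alpha = 0.05$ and 16 degrees of freedom.

```python
def confidence_interval(b, s_b, t_crit):
    """Return (lower, upper) for b +/- t_crit * s_b."""
    margin = t_crit * s_b  # margin of error
    return b - margin, b + margin


def rejects_zero(interval):
    """H0: beta = 0 is rejected when 0 lies outside the interval."""
    lower, upper = interval
    return not (lower <= 0.0 <= upper)


T_CRIT = 2.120  # t_{0.025} with 16 d.f., as stated in the slides

# Hypothetical estimates, NOT the values from the Programmer Salary example.
ci_a = confidence_interval(b=1.2, s_b=0.3, t_crit=T_CRIT)
ci_b = confidence_interval(b=0.5, s_b=0.4, t_crit=T_CRIT)

for label, ci in (("coefficient A", ci_a), ("coefficient B", ci_b)):
    verdict = "reject H0" if rejects_zero(ci) else "do not reject H0"
    print(f"{label}: ({ci[0]:.3f}, {ci[1]:.3f}) -> {verdict}")
```

The interval excludes zero exactly when $|b_i / s_{b_i}|$ exceeds the critical value, which is why this check always agrees with the two-tailed t test at the same $\alpha$.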
31
t-values in Excel: =TINV(probability, degrees_freedom). Probability is the probability associated with the two-tailed Student's t-distribution. Degrees_freedom is the number of degrees of freedom with which to characterize the distribution. Example: =TINV(0.05,16) returns 2.119905285; the t table in the textbook shows 2.120.
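For readers working outside Excel, the same two-tailed critical value can be approximated with the standard library alone. This is a rough sketch under stated assumptions, not a reimplementation of Excel's algorithm: it integrates the t density with Simpson's rule and finds the critical value by bisection, searching only between 0 and 100. The function names are made up for this example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_prob(t, df, steps=2000):
    """Approximate P(|T| > t) for t >= 0 using composite Simpson's rule."""
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return max(0.0, 1.0 - 2 * total * h / 3)


def t_inv_two_tailed(probability, df):
    """Return t > 0 with P(|T| > t) = probability, in the spirit of TINV."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    lo, hi = 0.0, 100.0  # assumed search range; ample for common alpha levels
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tailed_prob(mid, df) > probability:
            lo = mid  # tail area still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2


t_crit = t_inv_two_tailed(0.05, 16)
print(f"t critical (alpha = 0.05, d.f. = 16): {t_crit:.4f}")
```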
32
OUR EXAMPLE: 95% Confidence interval for $\beta_1$ Conclusion: 0 is not included in the confidence interval. Therefore, reject $H_0$.
33
OUR EXAMPLE: 95% Confidence interval for $\beta_2$ Conclusion: 0 is not included in the confidence interval. Therefore, reject $H_0$.
34
OUR EXAMPLE: 95% Confidence interval for $\beta_3$ Conclusion: 0 is included in the confidence interval. Therefore, do not reject $H_0$.
35
[Figure: Excel's Regression Equation Output, highlighting the confidence intervals for $\beta_1$, $\beta_2$, $\beta_3$. Note: Columns C-E are hidden.]
36
The test for overall significance
37
GENERAL STEPS: Testing for significance: F Test
1. Determine the hypotheses. $H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$; $H_a$: One or more of the parameters is not equal to zero.
2. Select the test statistic and specify its distribution. $F = \mathrm{MSR}/\mathrm{MSE}$ (see the next slide) follows an F distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.
38
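F-statistic: $F = \mathrm{MSR}/\mathrm{MSE}$, where $\mathrm{MSR} = \mathrm{SSR}/p$ and $\mathrm{MSE} = \mathrm{SSE}/(n - p - 1)$.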
39
GENERAL STEPS: Testing for significance: F Test
3. Specify the level of significance $\alpha$.
4. State the rejection rule. p-value approach: Reject $H_0$ if p-value < $\alpha$. F-value approach: Reject $H_0$ if $F > F_{\text{critical}}$.
5. Compute the value of the test statistic.
6. Determine whether to reject $H_0$.
40
OUR EXAMPLE: Testing for significance: F Test
1. Determine the hypotheses. $H_0: \beta_1 = \beta_2 = \beta_3 = 0$; $H_a$: One or more of the parameters is not equal to zero.
2. Select the test statistic and specify its distribution. $F = \mathrm{MSR}/\mathrm{MSE}$ follows an F distribution with 3 d.f. in the numerator and 16 d.f. in the denominator.
41
OUR EXAMPLE: Testing for significance: F Test
3. Specify the level of significance: $\alpha = 0.05$.
4. State the rejection rule. p-value approach: Reject $H_0$ if p-value < 0.05. F-value approach: For $\alpha = 0.05$ and d.f. = 3, 16, $F_{0.05} = 3.24$. Reject $H_0$ if $F > 3.24$.
42
OUR EXAMPLE: Testing for significance: F Test
5. Compute the value of the test statistic. $F = \mathrm{MSR}/\mathrm{MSE} = 169.2987/5.7431 = 29.4787$; p-value = 0.0000009417 (Excel printout).
6. Determine whether to reject $H_0$. The p-value = 0.0000009417 < $\alpha$ = 0.05. Reject $H_0$. Equivalently, $F = 29.4787 > F_{\text{critical}} = 3.24$. Reject $H_0$.
We conclude that a statistically significant relationship is present between the annual salary and the three independent variables: the years of experience, the score on the programmer aptitude test, and whether the individual has a graduate degree in computer science or information systems.
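The F-value approach in steps 5 and 6 amounts to one division and one comparison, sketched below. The inputs are the figures quoted in the slides (MSR = 169.2987, SSE = 91.88949, n = 20, p = 3, critical value 3.24); the helper names are illustrative. The critical value is taken from the slide rather than computed.

```python
def mean_square(sum_of_squares, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def overall_f_test(msr, mse, f_critical):
    """Return the F statistic and whether H0 (all slopes zero) is rejected."""
    f_stat = msr / mse
    return f_stat, f_stat > f_critical


n, p = 20, 3
mse = mean_square(91.88949, n - p - 1)  # SSE / (n - p - 1), about 5.7431
msr = 169.2987                          # MSR as reported in the slides

f_stat, reject = overall_f_test(msr, mse, f_critical=3.24)
print(f"MSE = {mse:.4f}, F = {f_stat:.4f}, reject H0: {reject}")
```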
43
OUR EXAMPLE: Testing for significance: F Test. [Figure: Excel's ANOVA Output, highlighting MSR, MSE, and the F statistic.]
44
OUR EXAMPLE: Testing for significance: F Test. [Figure: Excel's ANOVA Output, highlighting the p-value used to test for overall significance.]
45
Some cautions about the interpretation of significance tests
Just because we are able to reject $H_0: \beta_i = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between $x_i$ and $y$. (See pp. 593-594 in the textbook.)
Rejecting $H_0: \beta_i = 0$ and concluding that the relationship between $x_i$ and $y$ is significant does not enable us to conclude that a cause-and-effect relationship is present between $x_i$ and $y$.