Lesson Testing the Significance of the Least Squares Regression Model
Objectives
Understand the requirements of the least-squares regression model
Compute the standard error of the estimate
Verify that the residuals are normally distributed
Conduct inference on the slope and intercept
Construct a confidence interval about the slope of the least-squares regression model
Vocabulary Bivariate normal distribution – one variable is normally distributed given any value of the other variable and the second variable is normally distributed given any value of the first variable Jointly normally distributed – same as bivariate normal distribution
Least-Squares Regression Model
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
where
$y_i$ is the value of the response variable for the i-th individual
$\beta_0$ and $\beta_1$ are the parameters to be estimated based on the sample data
$x_i$ is the value of the explanatory variable for the i-th individual
$\varepsilon_i$ is a random error term with mean 0 and variance $\sigma^2_{\varepsilon_i} = \sigma^2$
The error terms are independent and normally distributed
$i = 1, \ldots, n$, where n is the sample size (number of ordered pairs in the data set)
Requirements for Inferences
The mean of the responses depends linearly on the explanatory variable
–Verify linearity with a scatter plot (as in Chapter 4)
The response variables are normally distributed with the same standard deviation
–We plot the residuals against the values of the explanatory variable
–If the residuals are spread evenly about a horizontal line drawn at 0, then the requirement of constant variance is satisfied
–If the residuals increasingly spread outward (or decreasingly contract inward) about that line at 0, then the requirement of constant variance may not be satisfied
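The residual check described on this slide can be sketched in a few lines of Python. This is only an illustration: the data set below is hypothetical (invented for this example, not taken from the lesson), and the residuals are printed rather than plotted so the sketch needs nothing beyond the standard library.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]   # explanatory variable
y = [2, 4, 5, 4, 5]   # response variable

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates of the slope (b1) and intercept (b0).
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual = observed y minus predicted y.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Each (x, residual) pair is one point of the residual plot.
for xi, ei in zip(x, residuals):
    print(f"x = {xi}: residual = {ei:+.2f}")
```

Plotting these pairs and looking for an even band around 0 is the visual check for constant variance; with only five points, as here, the check is illustrative rather than convincing.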
Hypothesis Tests
Only after the requirements are checked can we proceed with inferences on the slope, $\beta_1$, and the intercept, $\beta_0$
Tests:
        Two-Tailed            Left-Tailed           Right-Tailed
        $H_0: \beta_1 = 0$    $H_0: \beta_1 = 0$    $H_0: \beta_1 = 0$
        $H_1: \beta_1 \ne 0$  $H_1: \beta_1 < 0$    $H_1: \beta_1 > 0$
Note: these procedures are considered robust (in fact, for large samples (n > 30), inferential procedures regarding $b_1$ can be used even with significant departures from normality)
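Test Statistic
$H_0: \beta_1 = 0$
$$t_0 = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{b_1}{s_e \big/ \sqrt{\sum (x_i - \bar{x})^2}}$$
The second form follows because $\beta_1 = 0$ under $H_0$ and $s_{b_1} = s_e \big/ \sqrt{\sum (x_i - \bar{x})^2}$
Critical values: $t_{\alpha/2}$ for two-tailed ($\ne 0$), $t_\alpha$ for one-tailed ($> 0$ or $< 0$)
Note: degrees of freedom = n – 2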
Standard Error of the Estimate
$$s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}} = \sqrt{\frac{\sum \text{residuals}^2}{n - 2}}$$
Note: we divide by n – 2 because we have estimated two parameters, $\beta_0$ and $\beta_1$
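As a worked illustration of the standard error of the estimate and the test statistic, here is a minimal Python sketch. The data are hypothetical (invented for this example), and the variable names `s_e`, `s_b1`, and `t0` are simply chosen to mirror the symbols on the slides.

```python
import math

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: sum of squared residuals over n - 2.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))

# Standard error of the slope, then the test statistic under H0: beta1 = 0.
s_b1 = s_e / math.sqrt(s_xx)
t0 = b1 / s_b1

print(f"s_e = {s_e:.4f}, s_b1 = {s_b1:.4f}, t0 = {t0:.4f}, df = {n - 2}")
```

For this data set the sum of squared residuals is 2.4, so $s_e = \sqrt{2.4/3} \approx 0.894$, $s_{b_1} \approx 0.283$, and $t_0 \approx 2.121$ with 3 degrees of freedom.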
Conclusions from Test
Rejecting the null hypothesis means that for
The two-tailed alternative hypothesis, $H_1: \beta_1 \ne 0$
–The slope is significantly different from 0
–There is a significant linear relationship between the variables x and y
The left-tailed alternative hypothesis, $H_1: \beta_1 < 0$
–The slope is significantly less than 0
–There is a significant negative linear relationship between the variables x and y
The right-tailed alternative hypothesis, $H_1: \beta_1 > 0$
–The slope is significantly greater than 0
–There is a significant positive linear relationship between the variables x and y
Hypothesis Testing on $\beta_1$
Steps for Testing a Claim Regarding the Slope $\beta_1$
0. Test feasible (check the requirements)
1. Determine the null and alternative hypotheses (and the type of test: two-tailed, left-tailed, or right-tailed)
2. Select a level of significance α based on the seriousness of making a Type I error
3. Calculate the test statistic
4. Determine the p-value, or the critical value from the level of significance (hence the critical or rejection region)
5. Compare the p-value with α, or the test statistic with the critical value (also known as the decision rule)
6. State the conclusion
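Steps 4 and 5 can be sketched in Python as follows. The inputs (`t0`, `n`, `alpha`) are hypothetical values chosen for illustration, and the p-value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed; in practice a statistics package or the calculator would supply it.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p_value(t0, df, steps=2000):
    """P(|T| >= |t0|), via Simpson's rule on the density from 0 to |t0|.

    steps must be even for Simpson's rule.
    """
    a = abs(t0)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area_zero_to_a = total * h / 3
    return 1 - 2 * area_zero_to_a

# Hypothetical inputs for illustration (not from the lesson).
t0 = 2.1213    # test statistic from step 3
n = 5          # sample size, so df = n - 2 = 3
alpha = 0.05   # level of significance from step 2

p_value = two_tailed_p_value(t0, n - 2)
reject_h0 = p_value < alpha   # decision rule for H1: beta1 != 0

print(f"p-value = {p_value:.4f}; reject H0: {reject_h0}")
```

With these inputs the p-value is roughly 0.12, which exceeds α = 0.05, so the sketch does not reject $H_0: \beta_1 = 0$. For a one-tailed test the tail area would be halved, with attention to the sign of the test statistic.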
Example
Confidence Intervals for $\beta_1$
Confidence intervals are of the form: point estimate ± margin of error
$$\text{Lower bound} = b_1 - t_{\alpha/2} \cdot \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = b_1 - t_{\alpha/2} \cdot s_{b_1}$$
$$\text{Upper bound} = b_1 + t_{\alpha/2} \cdot \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = b_1 + t_{\alpha/2} \cdot s_{b_1}$$
Note: $t_{\alpha/2}$ has n – 2 degrees of freedom
Pre-conditions:
1) data randomly obtained
2) residuals normally distributed
3) constant error variance
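The interval formula can be illustrated with a short Python sketch. The inputs (`b1`, `s_b1`, `n`) are hypothetical values for illustration only. Because the standard library has no t-table, the critical value is found here by bisection on a numerically integrated t density; that is an assumption of this sketch, and a table or calculator gives the same number.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule from 0 to t (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3

def t_critical(upper_tail, df):
    """Value t with P(T > t) = upper_tail, found by bisection on [0, 100].

    The bracket [0, 100] is an assumption that covers common alpha and df values.
    """
    lo, hi = 0.0, 100.0
    target = 1 - upper_tail
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical inputs for illustration (not from the lesson).
b1 = 0.6        # sample slope
s_b1 = 0.2828   # standard error of the slope
n = 5           # sample size, so df = n - 2 = 3
alpha = 0.05    # for a 95% confidence interval

t_crit = t_critical(alpha / 2, n - 2)
margin = t_crit * s_b1
lower, upper = b1 - margin, b1 + margin

print(f"t_crit = {t_crit:.3f}; 95% CI for beta1: ({lower:.3f}, {upper:.3f})")
```

With these inputs the critical value is about 3.182 and the interval runs from roughly -0.30 to 1.50. Since it contains 0, it agrees with failing to reject $H_0: \beta_1 = 0$ at the 0.05 level in a two-tailed test.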
Using TI
Enter the explanatory variable in L1 and the response variable in L2
Press STAT, highlight TESTS, and select E:LinRegTTest
Be sure Xlist is L1 and Ylist is L2
Make sure that Freq is set to 1
Set the direction of the alternative hypothesis
Highlight Calculate and press ENTER
Summary and Homework
Summary
–Confidence intervals and prediction intervals quantify the accuracy of predicted values from least-squares regression lines
–Confidence intervals for a mean response measure the accuracy of the mean response of all the individuals in a population
–Prediction intervals for an individual response measure the accuracy of a single individual’s predicted value
Homework
–pg ; 1, 2, 3, 4, 7, 12, 13, 18