Published by Hector Ross. Modified over 9 years ago.
1
STA291 Statistical Methods Lecture 27
2
Inference for Regression
3
The Population and the Sample
The movie budget sample is based on 120 observations. But we know observations vary from sample to sample. So we imagine a true line that summarizes the relationship between x and y for the entire population:
$$\mu_y = \beta_0 + \beta_1 x$$
where $\mu_y$ is the population mean of y at a given value of x. We write $\mu_y$ instead of y because the regression line assumes that the means of the y values for each value of x fall exactly on the line.
4
The Population and the Sample
For a given value x: Most, if not all, of the y values obtained from a particular sample will not lie on the line. The sampled y values will be distributed about $\mu_y$. We can account for the difference between each y and $\mu_y$ by adding the error, ε:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
5
The Population and the Sample
Regression Inference
Collect a sample and estimate the population β’s by finding a regression line (Chapter 6):
$$\hat{y} = b_0 + b_1 x$$
The residuals e = y – ŷ are the sample-based versions of ε. Account for the uncertainties in $\beta_0$ and $\beta_1$ by making confidence intervals, as we’ve done for means and proportions.
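The distinction between the population line and the sample line can be sketched with a small simulation. This is a minimal illustration, not the lecture's movie budget data: the population values `BETA0`, `BETA1`, and `SIGMA`, the x range, and the helper `fit_line` are all assumptions made up for the example.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical population values, chosen only for illustration.
BETA0, BETA1, SIGMA = 10.0, 2.0, 3.0

rng = random.Random(291)  # fixed seed so the sketch is repeatable
xs = [rng.uniform(0, 20) for _ in range(120)]
# Population model: y = beta0 + beta1 * x + epsilon
ys = [BETA0 + BETA1 * x + rng.gauss(0, SIGMA) for x in xs]

# Sample-based estimates b0, b1 and the residuals e = y - y-hat.
b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

print(f"population line: mu_y = {BETA0} + {BETA1} x")
print(f"sample line:     y-hat = {b0:.3f} + {b1:.3f} x")
print(f"sum of residuals (about 0 by construction): {sum(residuals):.2e}")
```

Changing the seed gives a different sample and therefore a slightly different fitted line, which is the sample-to-sample variation the confidence intervals are meant to capture.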
6
Assumptions and Conditions
In this order:
1. Linearity Assumption
2. Independence Assumption
3. Equal Variance Assumption
4. Normal Population Assumption
7
Assumptions and Conditions
Summary of Assumptions and Conditions
8
Assumptions and Conditions
Summary of Assumptions and Conditions
1. Make a scatterplot of the data to check for linearity. (Linearity Assumption)
2. Fit a regression and find the residuals, e, and predicted values ŷ.
3. Plot the residuals against time (if appropriate) and check for evidence of patterns. (Independence Assumption)
4. Make a scatterplot of the residuals against x or the predicted values. This plot should not exhibit a “fan” or “cone” shape. (Equal Variance Assumption)
5. Make a histogram and Normal probability plot of the residuals. (Normal Population Assumption)
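The equal-variance check in step 4 is a visual one, but a rough numeric stand-in can show what a “fan” shape means. The sketch below is an assumption-laden illustration, not a formal test and not part of the lecture: the helper names, the simulated data, and the idea of comparing residual spread in the lower and upper halves of the predicted values are all made up for this example.

```python
import random
import statistics

def residuals_by_prediction(xs, ys):
    """Fit y-hat = b0 + b1*x; return (predicted, residual) pairs sorted by predicted value."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    pairs = [(b0 + b1 * x, y - (b0 + b1 * x)) for x, y in zip(xs, ys)]
    return sorted(pairs)

def spread_ratio(xs, ys):
    """SD of residuals in the upper half of predicted values divided by SD in the lower half."""
    pairs = residuals_by_prediction(xs, ys)
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return statistics.stdev(high) / statistics.stdev(low)

rng = random.Random(27)  # fixed seed so the sketch is repeatable
xs = [1 + 19 * i / 199 for i in range(200)]                # evenly spaced x from 1 to 20
even_ys = [5 + 2 * x + rng.gauss(0, 3) for x in xs]        # constant spread about the line
fan_ys = [5 + 2 * x + rng.gauss(0, 0.5 * x) for x in xs]   # spread grows with x (fan shape)

even_ratio = spread_ratio(xs, even_ys)
fan_ratio = spread_ratio(xs, fan_ys)
print(f"constant-spread data: ratio = {even_ratio:.2f}")
print(f"fan-shaped data:      ratio = {fan_ratio:.2f}")
```

A ratio near 1 is what a plot without a fan would look like in numbers; a ratio well above 1 corresponds to residuals that spread out as the predicted values grow. The residual plot itself remains the recommended check.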
9
The Standard Error of the Slope
10
The Standard Error of the Slope
Which of these scatterplots would give the more consistent regression slope estimate if we were to sample repeatedly from the underlying population? Hint: Compare the $s_e$’s.
11
The Standard Error of the Slope
Which of these scatterplots would give the more consistent regression slope estimate if we were to sample repeatedly from the underlying population? Hint: Compare the $s_x$’s.
12
The Standard Error of the Slope
Which of these scatterplots would give the more consistent regression slope estimate if we were to sample repeatedly from the underlying population? Hint: Compare the n’s.
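The three hints can be explored by actually sampling repeatedly from a made-up population and watching how much the fitted slope varies. This is a sketch under stated assumptions: the population line y = 3 + 1.5x, the noise levels, the x ranges, and the helpers `slope` and `slope_sd` are hypothetical and chosen only to make the comparison visible.

```python
import random
import statistics

def slope(xs, ys):
    """Least-squares slope b1."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx

def slope_sd(n, sigma, x_max, reps=500, seed=0):
    """SD of the fitted slope over `reps` samples of size n from y = 3 + 1.5x + noise."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = [rng.uniform(0, x_max) for _ in range(n)]
        ys = [3 + 1.5 * x + rng.gauss(0, sigma) for x in xs]
        slopes.append(slope(xs, ys))
    return statistics.stdev(slopes)

baseline = slope_sd(n=20, sigma=4, x_max=10)
less_scatter = slope_sd(n=20, sigma=1, x_max=10)  # smaller spread about the line
wider_x = slope_sd(n=20, sigma=4, x_max=40)       # larger spread in x
larger_n = slope_sd(n=80, sigma=4, x_max=10)      # more observations

print(f"baseline:                 {baseline:.3f}")
print(f"less scatter (smaller s_e): {less_scatter:.3f}")
print(f"wider x range (larger s_x): {wider_x:.3f}")
print(f"larger sample (larger n):   {larger_n:.3f}")
```

In each of the three variations the slopes cluster more tightly than in the baseline, which is the pattern the hints point toward.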
13
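A Test for the Regression Slope
When the conditions are met, the standardized estimated regression slope,
$$t = \frac{b_1 - \beta_1}{SE(b_1)}$$
follows a t-distribution with df = n – 2. We estimate $SE(b_1)$ with:
$$SE(b_1) = \frac{s_e}{s_x \sqrt{n-1}}$$
where $s_x$ is the ordinary standard deviation of the x’s and
$$s_e = \sqrt{\frac{\sum (y - \hat{y})^2}{n-2}}$$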
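As a worked illustration, the pieces of the test statistic can be computed directly, assuming the usual textbook formulas SE(b1) = s_e / (s_x √(n – 1)) and s_e = √(Σ(y – ŷ)² / (n – 2)). The function name `slope_inference` and the five data points are made up for this sketch; they are not from the lecture.

```python
import math

def slope_inference(xs, ys, beta1_null=0.0):
    """Return slope, residual SD, SE of the slope, t-ratio, and df for a simple regression."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar

    # s_e: standard deviation of the residuals, with n - 2 degrees of freedom
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))

    # s_x: ordinary standard deviation of the x's
    s_x = math.sqrt(sxx / (n - 1))

    se_b1 = s_e / (s_x * math.sqrt(n - 1))
    t = (b1 - beta1_null) / se_b1
    return {"b1": b1, "s_e": s_e, "se_b1": se_b1, "t": t, "df": n - 2}

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
result = slope_inference(xs, ys)
for name, value in result.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

The t-ratio would then be compared with a t-distribution on `df` degrees of freedom; the P-value lookup is left out here because it needs a t-table or a statistics package.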
14
A Test for the Regression Slope
The usual null hypothesis about the slope is that it’s equal to 0. Why? A slope of zero says that y doesn’t tend to change linearly when x changes. In other words, if the slope equals zero, there is no linear association between the two variables.
$H_0$: $\beta_1 = 0$. This would mean that x and y are not linearly related.
$H_a$: $\beta_1 \neq 0$. This would mean...
15
CI for the Regression Slope
When the assumptions and conditions are met, we can find a confidence interval for $\beta_1$ from
$$b_1 \pm t^*_{n-2} \times SE(b_1)$$
where the critical value t* depends on the confidence level and has df = n – 2.
16
16.4 A Test for the Regression Slope Example: Soap
A soap manufacturer tested a standard bar of soap to see how long it would last. A test subject showered with the soap each day for 15 days and recorded the weight (in grams) remaining. Conditions were met, so a linear regression gave the following:
Dependent variable is: Weight
R squared = 99.5%    s = 2.949
Variable     Coefficient   SE(Coeff)   t-ratio   P-value
Intercept    123.141       1.382       89.1      <0.0001
Day          -5.57476      0.1068      -52.2     <0.0001
What is the standard deviation of the residuals?
What is the standard error of $b_1$?
What are the hypotheses for the regression slope?
At α = 0.05, what is the conclusion?
17
16.4 A Test for the Regression Slope Example: Soap
What is the standard deviation of the residuals? $s_e$ = 2.949
What is the standard error of $b_1$? $SE(b_1)$ = 0.1068
18
16.4 A Test for the Regression Slope Example: Soap
What are the hypotheses for the regression slope? $H_0$: $\beta_1 = 0$ vs. $H_a$: $\beta_1 \neq 0$.
At α = 0.05, what is the conclusion? Since the P-value is small (<0.0001), reject the null hypothesis. There is strong evidence of a linear relationship between Weight and Day.
19
16.4 A Test for the Regression Slope Example: Soap
Find a 95% confidence interval for the slope.
Interpret the 95% confidence interval for the slope.
At α = 0.05, is the confidence interval consistent with the hypothesis test conclusion?
20
16.4 A Test for the Regression Slope Example: Soap
Find a 95% confidence interval for the slope. With df = 15 – 2 = 13, t* = 2.160, so $b_1 \pm t^* \times SE(b_1)$ = -5.57476 ± 2.160(0.1068), or about (-5.81, -5.34).
Interpret the 95% confidence interval for the slope. We can be 95% confident that the weight of the soap decreases by between 5.34 and 5.81 grams per day.
At α = 0.05, is the confidence interval consistent with the hypothesis test conclusion? Yes, the interval does not contain zero, so we reject the null hypothesis.
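The soap example's t-ratio and confidence interval can be re-derived from the reported coefficient and standard error. In this sketch the slope, its standard error, and n = 15 come from the regression output in the example; the critical value 2.160 is the standard two-sided 95% t-table value for 13 degrees of freedom and is typed in by hand as an assumption rather than computed.

```python
# Values reported in the soap regression output.
b1 = -5.57476      # slope for Day (grams per day)
se_b1 = 0.1068     # SE(Coeff) for Day
n = 15             # one weight recorded per day for 15 days
df = n - 2

# t-ratio for H0: beta1 = 0
t_ratio = b1 / se_b1

# Assumption: two-sided 95% critical value for df = 13, taken from a t-table.
T_STAR_95_DF13 = 2.160

margin = T_STAR_95_DF13 * se_b1
lower, upper = b1 - margin, b1 + margin

print(f"df = {df}")
print(f"t-ratio = {t_ratio:.1f}")
print(f"95% CI for the slope: ({lower:.2f}, {upper:.2f})")
print("interval excludes 0:", not (lower <= 0 <= upper))
```

The t-ratio matches the -52.2 in the output table, and the interval lies entirely below zero, in line with rejecting the null hypothesis at α = 0.05.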
21
Don’t fit a linear regression to data that aren’t straight. Watch out for changing spread. Watch out for non-Normal errors. Check the histogram and the Normal probability plot. Watch out for extrapolation. It is always dangerous to predict for x-values that lie far away from the center of the data.
22
Watch out for high-influence points and unusual observations. Watch out for one-tailed tests. Most software packages perform only two-tailed tests. Adjust your P-values accordingly.
23
Looking back
o Know the assumptions and conditions for inference about regression coefficients and how to check them, in this order: LIEN
o Know the components of the standard error of the slope coefficient
o Test statistic
o CI interpretation