Section 11.2 Day 2
Confidence Interval Estimation Whenever you reject the null hypothesis that a slope is 0, it is prudent to construct a confidence interval. If the interval is extremely wide, due to large variation in the residuals, a small sample size, or both, that tells you the estimate b1 is practically useless. A result may have “statistical significance” but no practical significance. Don’t get swept away by the numbers.
Confidence Interval Form statistic ± (critical value) · (standard deviation of statistic)
Confidence Interval For Slope b1 ± t* · s_b1
Components of Confidence Interval for a Slope How many components are there for constructing a confidence interval? 3
1) Check conditions
2) Do computations
3) Give interpretation in context
To get a capture rate equal to the advertised rate (for example, 95% of intervals capturing the true slope when the confidence level is 95%), the conditional distributions of y for fixed values of x must be approximately normal, with means that lie on a line and standard deviations that are relatively constant across all values of x. Thus, we must check 4 conditions.
First Condition Randomness: Verify you have one of these situations.
i. A single random sample from a bivariate population
ii. A set of independent random samples, one for each fixed value of the explanatory variable, x
iii. An experiment with random assignment of treatments
Second Condition Linearity: Make a scatterplot and check to see if the relationship looks linear. Note: On a quiz or test, you must show the scatterplot with labels. Simply saying “based on the scatterplot, the relationship looks linear” gets no credit.
Third Condition Uniform residuals: Make a residual plot to check for departures from linearity and to check that the residuals are of uniform size across all values of x. Note: On a quiz or test, you must show the residual plot you analyzed or you get no credit.
Fourth Condition Normality: Make a univariate plot (dot plot, stemplot, boxplot or histogram) of the residuals to see if it’s reasonable to assume that the residuals came from a normal distribution. Note: On a quiz or test, you must show the plot you analyzed along with your rationale. A superficial statement is not enough.
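The residual plot and the univariate plot of residuals both start from the residuals themselves. A minimal sketch of how they could be computed by hand follows; the data values and the function names `fit_line` and `residuals` are hypothetical, made up only for illustration.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y):
    """Observed y minus predicted y for each point."""
    b0, b1 = fit_line(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical data, not taken from the textbook.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
res = residuals(x, y)

# Plot these against x (residual plot) and on their own (dot plot or histogram).
for xi, ri in zip(x, res):
    print(f"x = {xi}: residual = {ri:+.3f}")
```

Least-squares residuals always sum to 0, so the checks are about their pattern and spread, not their average.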
Do Computations Confidence interval is: b1 ± t* · s_b1 Value of t* depends on: the confidence level and the number of degrees of freedom, df, which is n – 2.
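The computation step is short enough to sketch in code. The function names and the input values below are hypothetical; t* still has to come from a table or software.

```python
def degrees_of_freedom(n):
    """Degrees of freedom for inference about a slope: n - 2."""
    return n - 2


def slope_confidence_interval(b1, se_b1, t_star):
    """Return (lower, upper) for b1 ± t* · s_b1."""
    margin = t_star * se_b1
    return b1 - margin, b1 + margin


# Hypothetical inputs: n = 15 points, slope 2.0, standard error 0.5.
# With df = 13, the 95% critical value from a t table is 2.160.
df = degrees_of_freedom(15)
lower, upper = slope_confidence_interval(b1=2.0, se_b1=0.5, t_star=2.160)
print(f"df = {df}, interval = ({lower:.3f}, {upper:.3f})")
```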
If your calculator does not have LinRegTInt, use b1 ± t* · s_b1. Note: For a quiz or test, everyone will be required to show how they used this formula to compute the CI. If you use LinRegTInt, you get no credit!
If you must use b1 ± t* · s_b1, use LinRegTTest to calculate b1 and t. Then use t = b1 / s_b1, so s_b1 = b1 / t.
How do we determine t* ?
To determine t* , use Table B, t-distribution Critical Values
Find t* for the following confidence intervals:
1) 90% with 3 df: t* = 2.353
2) 90% with 13 df: t* = 1.771
3) 95% with n = 15: df = n – 2 = 13, so t* = 2.160
4) 95% with n = 50: df = 50 – 2 = 48, so t* = 2.021 (the tabled value for 40 df, the conservative choice when the table has no row for 48 df)
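If no table is handy, t* can be approximated numerically. The sketch below is one possible approach using only the Python standard library (Simpson's rule for the area under the t density, then bisection). The function names are hypothetical and this is not how Table B itself was produced.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_star(confidence, df):
    """Critical value with central area `confidence`, found by bisection."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for conf_level, df in [(0.90, 3), (0.90, 13), (0.95, 13), (0.95, 48)]:
    print(f"{conf_level:.0%} with {df} df: t* = {t_star(conf_level, df):.3f}")
```

For 48 df the computed value comes out slightly smaller than the 40 df table entry, which is why using the 40 df row gives a slightly wider, conservative interval.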
Give Interpretation in Context For a 95% confidence interval, you are 95% confident that the slope of the underlying linear relationship lies in the interval ( , ). Remember to put this in context.
Give Interpretation in Context By 95% confidence, we mean that out of every 100 such confidence intervals we construct from random samples, we expect the true slope, β1, to be in about 95 of them.
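A small simulation can make the "95 out of 100" idea concrete. The sketch below assumes a made-up population line y = 5 + 2x with normal errors (standard deviation 3) at x = 1, ..., 15, and uses t* = 2.160 for 13 df. Every one of those values and both function names are hypothetical.

```python
import math
import random


def slope_and_se(x, y):
    """Least-squares slope b1 and its standard error s_b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    return b1, s / math.sqrt(sxx)


def capture_rate(true_slope=2.0, n=15, t_star=2.160, reps=2000, seed=1):
    """Fraction of simulated 95% intervals that contain the true slope."""
    rng = random.Random(seed)
    x = list(range(1, n + 1))
    hits = 0
    for _ in range(reps):
        # Conditions hold by construction: linear means, normal errors, constant SD.
        y = [5.0 + true_slope * xi + rng.gauss(0, 3.0) for xi in x]
        b1, se = slope_and_se(x, y)
        if b1 - t_star * se <= true_slope <= b1 + t_star * se:
            hits += 1
    return hits / reps


rate = capture_rate()
print(f"capture rate over 2000 simulated samples: {rate:.3f}")
```

Because the simulated data meet the conditions, the capture rate should land close to 0.95. Breaking a condition (for example, errors whose spread grows with x) is a way to see the rate drift from the advertised level.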
Page 763, D11 For part a, use Mars rocks data on page 737. For predicting redness from sulfate percentage: explanatory variable is sulfate percentage, response variable is redness.
LinRegTTest: b = 0.5249005835
t* = 3.182 (95% CI with 3 df)
0.5249005835 ± 3.182 ● (0.5249005835 / 3.567005313) = 0.5249005835 ± 0.4682453515
(0.0567, 0.9931)
Interpret this interval. I’m 95% confident that the slope of the true regression line for predicting redness from sulfate percentage for Mars rocks is in the interval (0.0567, 0.9931).
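The arithmetic for part a can be checked with a few lines of code. This sketch reads the divisor 3.567005313 in the slide's computation as the t statistic reported by LinRegTTest, which is an assumption; the variable names are illustrative.

```python
b1 = 0.5249005835      # slope reported by LinRegTTest
t_stat = 3.567005313   # assumed to be the t statistic from LinRegTTest
t_star = 3.182         # 95% critical value for 3 df

# Since t = b1 / s_b1, the standard error is recovered as b1 / t.
se_b1 = b1 / t_stat
margin = t_star * se_b1
lower, upper = b1 - margin, b1 + margin

print(f"s_b1 = {se_b1:.4f}, margin of error = {margin:.4f}")
print(f"95% CI for slope: ({lower:.4f}, {upper:.4f})")
```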
Page 763, D11 b) Use information in Display 11.17 on page 754. (-0.8705, 1.1373) Interpret this interval. I’m 95% confident that the slope of the true regression line for predicting redness from sulfate percentage for Mars soil samples is in the interval -0.8705 to 1.1373. Because 0 is in this interval, it is plausible that the slope of the true regression line is 0.
Page 763, D11 c) For 9 df: (0.3780, 0.8756) I’m 95% confident that the slope of the true regression line for predicting redness from sulfate percentage for both Mars rocks and soil samples is in the interval 0.3780 to 0.8756.
Page 763, D11 Which interval is narrowest? For the combined sample of rocks and soil. Which interval is the widest? Soil samples. Why?
Page 763, D11 The interval for soil samples is widest because the points tend to be relatively far from the regression line. Relate this to what you learned in Section 11.1: points farther from the regression line increase the y-deviations, which increases the standard error and thus the margin of error.
Page 763, D11 The points in the combined sample cluster relatively close to the regression line, which decreases the y-deviations. So the combined sample has the smallest standard error and thus the narrowest interval. The larger sample size and the wider spread among the x’s also contribute to a narrower confidence interval.
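All three effects can be read off the usual least-squares expression for the standard error of the slope. The block below assumes that this standard formula is the one Section 11.1 uses; the notation is the common textbook one and may differ slightly from the book's.

```latex
s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}},
\qquad
s = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}
```

Larger residuals raise s and so raise s_b1. More points and a wider spread among the x's both raise the sum in the denominator and so lower s_b1. A larger n also lowers t*, which narrows the interval further.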
Page 763, D12 Because the P-value for the two-sided test of a slope is less than 0.05, we would reject the null hypothesis that β1 = 0. Thus, a 95% confidence interval estimate of the slope does not include 0.
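The link between the test and the interval can be written as a chain of equivalent statements. This is a sketch of the reasoning, taking t* to be the 95% critical value for n – 2 degrees of freedom and t = b1 / s_b1 to be the test statistic.

```latex
0 \notin \left( b_1 - t^{*} s_{b_1},\; b_1 + t^{*} s_{b_1} \right)
\iff |b_1| > t^{*} s_{b_1}
\iff |t| = \frac{|b_1|}{s_{b_1}} > t^{*}
\iff \text{two-sided } P\text{-value} < 0.05
```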
Page 769, E15 a(i) predicting temperature from chirp rate: (1.9924, 4.5898) a(ii) predicting chirp rate from temperature: (0.12829, 0.29556)
Page 769, E15 b. When you reverse the roles of chirp rate and temperature, the entire regression line changes. The unit of the slope changes from degrees per chirp to chirps per degree. The sizes of the residuals change too because they are measured from a different line and from a different direction. Further, they are measured in different units (difference in temperature versus difference in chirp rate). Note: The two slopes are not reciprocals of one another.
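One way to see why the two slopes are not reciprocals: their product equals r squared, which is 1 only when the points fall exactly on a line. The sketch below checks this on made-up chirp and temperature values. The data and function names are hypothetical, not the E15 data.

```python
import math


def slope(x, y):
    """Least-squares slope for predicting y from x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def correlation(x, y):
    """Pearson correlation coefficient r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, chosen only for illustration.
chirps = [14, 15, 16, 17, 18, 20]
temps = [70, 76, 74, 82, 80, 89]

b_temp_on_chirps = slope(chirps, temps)   # degrees per chirp
b_chirps_on_temp = slope(temps, chirps)   # chirps per degree
r = correlation(chirps, temps)

print(f"degrees per chirp: {b_temp_on_chirps:.4f}")
print(f"chirps per degree: {b_chirps_on_temp:.4f}")
print(f"reciprocal of first slope: {1 / b_temp_on_chirps:.4f}")
print(f"product of slopes: {b_temp_on_chirps * b_chirps_on_temp:.4f}, r^2: {r ** 2:.4f}")
```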
Questions?