Chapter 14 Inference for Regression AP Statistics 14.1 – Inference about the Model 14.2 – Predictions and Conditions
Two Quantitative Variables. Plot and Interpret: Explanatory Variable and Response Variable; FSDD. Numerical Summary: Correlation (r), which describes strength and direction. Mathematical Model: LSRL for predicting.
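As a rough illustration of the numerical summary and the model named on this slide, the sketch below computes the correlation r and the LSRL y-hat = a + b*x from first principles. The function name `correlation_and_lsrl` and the five data points are made up for illustration; they are not from the chapter.

```python
from math import sqrt

def correlation_and_lsrl(x, y):
    """Return (r, a, b) where the LSRL is y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squared deviations and cross-products about the means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)   # correlation: strength and direction
    b = sxy / sxx               # LSRL slope
    a = y_bar - b * x_bar       # LSRL intercept
    return r, a, b

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, a, b = correlation_and_lsrl(x, y)
print(f"r = {r:.3f}, LSRL: y-hat = {a:.2f} + {b:.2f}x")
```

The same slope can also be written as b = r * (s_y / s_x), which is the form usually shown alongside the calculator output.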
Conditions for Regression Inference. For any fixed value of x, the response y varies according to a Normal distribution. Repeated responses y are independent of each other. The standard deviation of y (call it σ) is the same for all values of x. Parameters of interest: the true intercept α, the true slope β, and σ. The value of σ is unknown, so we use t-procedures! Degrees of freedom: n – 2.
Conditions for Regression Inference (Cont’d). Look at the residuals: residual = actual y – predicted y. The true relationship is linear. The response varies Normally about the true regression line. To estimate σ, use the standard error about the line (s).
Inference. Unknown parameters: α, β, and σ. The intercept a and slope b of the least-squares regression line are unbiased estimators of the true intercept α and the true slope β, respectively. There are n residuals, one for each data point. The residuals from an LSRL always have mean zero. This simplifies their standard error.
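The claim that LSRL residuals always have mean zero can be checked numerically. The sketch below fits the line by the usual least-squares formulas and then averages the residuals; the helper name `lsrl_residuals` and the data are hypothetical and chosen only for illustration.

```python
def lsrl_residuals(x, y):
    """Fit y-hat = a + b*x by least squares and return the n residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # residual = actual - predicted, one per data point
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
residuals = lsrl_residuals(x, y)
mean_residual = sum(residuals) / len(residuals)
print(len(residuals), "residuals; mean is zero up to rounding:", abs(mean_residual) < 1e-9)
```

Because the mean is zero, the spread of the residuals depends only on the sum of their squares, with no mean to subtract first.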
Standard Error about the Line: s = sqrt( Σ residual² / (n – 2) ). With two variables we estimate two parameters (intercept and slope), which gives n – 2 df (not n – 1). Call the sample standard deviation (s) a standard error to emphasize that it is estimated from data. The calculator will compute s! Thank you, TI!
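For readers without a TI calculator at hand, here is a minimal sketch of the same computation: the sum of squared residuals divided by n – 2, then a square root. The function name `standard_error_about_line` and the data are assumptions made for this example.

```python
from math import sqrt

def standard_error_about_line(x, y):
    """Return s, the standard error about the LSRL, using n - 2 df."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Sum of squared residuals (actual - predicted)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # Divide by n - 2 because two parameters (a and b) were estimated
    return sqrt(sse / (n - 2))

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
s = standard_error_about_line(x, y)
print(f"s = {s:.4f} with {len(x) - 2} df")
```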
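t-procedures (n – 2 df). CI for the regression slope: b ± t* SE_b. The standard error of the LSRL slope b is: SE_b = s / sqrt( Σ (x – x̄)² ). Testing the hypothesis of no linear relationship: H0: β = 0, with test statistic t = b / SE_b. H0 says that x does not (linearly) predict y, which is the same as saying the population correlation is 0.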
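The slope procedures can be sketched in a few lines. The code below computes SE_b = s / sqrt(Σ(x – x̄)²), the test statistic t = b / SE_b for H0: β = 0, and a confidence interval b ± t* SE_b. The function name `slope_inference` and the data are hypothetical; the critical value t* is passed in by hand (3.182 is the Table value for 95% confidence with 3 df) rather than computed, to keep the sketch free of external libraries.

```python
from math import sqrt

def slope_inference(x, y, t_star):
    """Return (b, se_b, t_stat, ci) for the LSRL slope, using n - 2 df."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))          # standard error about the line
    se_b = s / sqrt(sxx)             # standard error of the slope
    t_stat = b / se_b                # test statistic for H0: beta = 0
    ci = (b - t_star * se_b, b + t_star * se_b)
    return b, se_b, t_stat, ci

# Hypothetical data, for illustration only (n = 5, so df = 3)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, se_b, t_stat, ci = slope_inference(x, y, t_star=3.182)
print(f"b = {b:.3f}, SE_b = {se_b:.4f}, t = {t_stat:.3f}")
print(f"95% CI for the slope: ({ci[0]:.3f}, {ci[1]:.3f})")
```

With these made-up points the interval contains 0, so the data would not give convincing evidence of a linear relationship at the 5% level; the P-value itself would still come from the t distribution with n – 2 df.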
What is the equation of the LSRL? Estimate the parameters. In your opinion, is the LSRL an appropriate model for the data? Would you be willing to predict a student's height if you knew that his arm span is 76 inches?
Construct a 95% CI for the mean increase in IQ for each additional peak in crying.
Scatter Plot and LSRL? Perform a Test of Significance
Checking the Regression Conditions. All observations are Independent. There is a true LINEAR relationship. The Standard Deviation of the response variable (y) about the true line is the Same everywhere. The response (y) varies Normally about the true regression line. * Verifying the conditions uses the Residuals!