Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.

Statistical model for linear regression
• In the population, the linear regression equation is μ_y = β₀ + β₁x.
• Sample data then fit the model:
  Data = fit + residual
  yᵢ = (β₀ + β₁xᵢ) + (εᵢ)
  where the εᵢ are independent and Normally distributed N(0, σ).
• Linear regression assumes equal variance of y (σ is the same for all values of x).
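To make "Data = fit + residual" concrete, here is a minimal sketch that draws sample data from the model. The parameter values, the x values, and the function name `simulate` are assumptions chosen for illustration; they do not come from the slides.

```python
import random

def simulate(beta0, beta1, sigma, xs, seed=0):
    """Draw y_i = beta0 + beta1 * x_i + eps_i, with independent eps_i ~ N(0, sigma)."""
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical population parameters, for illustration only.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = simulate(BETA0, BETA1, SIGMA, xs)

for x, y in zip(xs, ys):
    fit = BETA0 + BETA1 * x          # mean response mu_y at this x
    print(f"x={x}  fit={fit:.2f}  residual={y - fit:+.2f}  y={y:.2f}")
```

Because the same σ is used at every x, the simulated points scatter around the line with the same spread everywhere, which is the equal-variance assumption.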

Estimating the parameters
μ_y = β₀ + β₁x
• The intercept β₀, the slope β₁, and the standard deviation σ of y are the unknown parameters of the regression model. We rely on the random sample data to provide unbiased estimates of these parameters.
• The value of ŷ from the least-squares regression line is really a prediction of the mean value of y (μ_y) for a given value of x.
• The least-squares regression line (ŷ = b₀ + b₁x) obtained from sample data is the best estimate of the true population regression line (μ_y = β₀ + β₁x).
  ŷ: unbiased estimate for mean response μ_y
  b₀: unbiased estimate for intercept β₀
  b₁: unbiased estimate for slope β₁

The regression standard error, s, for n sample data points is calculated from the residuals (yᵢ − ŷᵢ):
  s = √( Σ(yᵢ − ŷᵢ)² / (n − 2) )
s² is an unbiased estimate of the regression variance σ², and s is the corresponding estimate of σ. In JMP, s is reported as Root Mean Square Error.
• The population standard deviation σ for y at any given value of x represents the spread of the Normal distribution of the εᵢ around the mean μ_y.
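The estimates b₀, b₁, and s can be computed by hand from the usual least-squares formulas. The sketch below uses a small made-up data set; the function names `fit_line` and `regression_standard_error` are illustrative and are not taken from JMP or the textbook.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

def regression_standard_error(xs, ys):
    """s = sqrt(sum of squared residuals / (n - 2))."""
    b0, b1 = fit_line(xs, ys)
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    return math.sqrt(sum(e * e for e in residuals) / (len(xs) - 2))

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_line(xs, ys)
s = regression_standard_error(xs, ys)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, s = {s:.3f}")
```

The divisor is n − 2 rather than n because two parameters (intercept and slope) were estimated before computing the residuals.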

Conditions for inference
• The observations are independent.
• The relationship is indeed linear.
• The standard deviation of y, σ, is the same for all values of x.
• The response y varies Normally around its mean.

Using residual plots to check for regression validity
• The residuals (y − ŷ) give useful information about the contribution of individual data points to the overall pattern of scatter.
• We view the residuals in a residual plot.
• We may also look at a Normal quantile plot of the residuals to check the Normality assumption.

• Residuals are randomly scattered → good!
• Curved pattern → the relationship is not linear.
• Change in variability across the plot → σ is not equal for all values of x.
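Without a plotting tool at hand, a rough text version of a residual plot can still show these patterns. The following is only a sketch on made-up data; the helper names and the 41-column layout are arbitrary choices, and a real analysis would use the residual plot from JMP or similar software.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def residuals(xs, ys):
    """Residuals y - y-hat, in the same order as the data."""
    b0, b1 = fit_line(xs, ys)
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
res = residuals(xs, ys)

# One row per observation; the star's column shows the size and sign of the residual.
scale = max(abs(e) for e in res) or 1.0
for x, e in zip(xs, res):
    col = 20 + round(20 * e / scale)   # column 20 corresponds to residual = 0
    print(f"x={x:<4} {' ' * col}*  ({e:+.2f})")
```

Least-squares residuals always sum to zero, so what matters is how they are arranged across x (random scatter, a curve, or a fan shape), not their average.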

Confidence interval for regression parameters
• Estimating the regression parameters β₀, β₁ is a case of one-sample inference with unknown population variance.
• We rely on the t distribution, with n − 2 degrees of freedom.
• A level C confidence interval for the slope, β₁, has a margin of error proportional to the standard error of the least-squares slope:
  b₁ ± t* SE_b1
• A level C confidence interval for the intercept, β₀, has a margin of error proportional to the standard error of the least-squares intercept:
  b₀ ± t* SE_b0
• t* is the critical value for the t(n − 2) distribution with area C between −t* and +t*.
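The slide does not give formulas for the standard errors. The sketch below assumes the standard simple-regression expressions SE_b1 = s / √Σ(xᵢ − x̄)² and SE_b0 = s √(1/n + x̄² / Σ(xᵢ − x̄)²), and takes t* as a value read from Table D. The data and the function name `regression_summary` are made up for illustration.

```python
import math

def regression_summary(xs, ys):
    """Return b0, b1, SE_b0, SE_b1 for the least-squares line."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))                      # regression standard error
    se_b1 = s / math.sqrt(sxx)                        # assumed standard formula
    se_b0 = s * math.sqrt(1 / n + xbar ** 2 / sxx)    # assumed standard formula
    return b0, b1, se_b0, se_b1

# Made-up data, for illustration only: n = 5, so df = n - 2 = 3.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_STAR = 3.182   # t* for C = 95% and 3 degrees of freedom, read from Table D

b0, b1, se_b0, se_b1 = regression_summary(xs, ys)
slope_ci = (b1 - T_STAR * se_b1, b1 + T_STAR * se_b1)
intercept_ci = (b0 - T_STAR * se_b0, b0 + T_STAR * se_b0)
print(f"slope:     {b1:.3f} ± {T_STAR * se_b1:.3f}")
print(f"intercept: {b0:.3f} ± {T_STAR * se_b0:.3f}")
```

With only five points the interval for the slope is wide and includes 0, which is what a small sample with sizeable residuals should be expected to produce.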

Significance test for the slope
• We can test the hypothesis H₀: β₁ = 0 versus a one- or two-sided alternative.
• We calculate t = (b₁ − 0) / SE_b1, which, if H₀ is true, has the t(n − 2) distribution; use Table D to find the p-value of the test.
• JMP provides the numerator and denominator and the p-values when you Fit Y by X.
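As a check on what Table D or JMP reports, the t statistic and a two-sided p-value can be computed directly. This sketch assumes SE_b1 = s / √Σ(xᵢ − x̄)² and obtains the p-value by numerically integrating the t density with Simpson's rule, which is a stand-in for a table lookup rather than the method JMP uses. The data and function names are made up.

```python
import math

def t_density(t, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c) * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|) = 1 - 2 * integral of the density from 0 to |t| (Simpson's rule)."""
    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, df)
    return max(0.0, 1 - 2 * total * h / 3)

def slope_t_statistic(xs, ys):
    """Return (t, df) for testing H0: beta1 = 0."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    b0 = ybar - b1 * xbar
    s = math.sqrt(sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    return (b1 - 0) / (s / math.sqrt(sxx)), n - 2

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
t, df = slope_t_statistic(xs, ys)
p = two_sided_p(t, df)
print(f"t = {t:.3f} on {df} df, two-sided p = {p:.3f}")
```

For a one-sided alternative in the direction of the observed slope, the p-value is half the two-sided value.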

Homework for Inference on Regression
• Read over these notes and be prepared to use JMP to answer the homework questions and to do all the computations.
• Start with exercises #10.9-10.11. Work these in class with JMP.
• Try to work #10.12-10.19 in a similar way.
