Chapter 15 Inference for Regression

How is this similar to what we have done in the past few chapters?
- We have been using statistics to estimate parameters.
- We have been using statistics to determine how likely proposed values for the parameters are to be accurate.
- We now continue that by using ŷ = a + bx to estimate μ_y = α + βx.

The BIG Ideas
1. The model for regression inference says that the overall relationship between the explanatory and response variables in the population is described by a straight line with slope β and intercept α. Individual responses y for different values of the explanatory variable x are independent and deviate from this line according to a Normal distribution with the same standard deviation σ for any x value.
2. The least-squares regression line estimates the population line. The residuals, the deviations of the observations from the least-squares line, combine in the regression standard error s to estimate the population standard deviation σ.

The BIG Ideas (continued)
3. Inference about the population slope β is based on a t statistic with n − 2 degrees of freedom. The statistic is formed by standardizing the least-squares slope b, dividing it by its standard error.
4. Unlike data analysis, inference is legitimate only under certain conditions. Be sure to verify that the model for regression inference does describe your data.

Linear Inference
Using the same principles that we used in the LSRL chapter, we can now apply them to inference.
Assumptions (NOTE: WE WILL NOT VERIFY ALL OF THESE AS WE HAVE DONE IN THE PAST). We have n observations on an explanatory variable x and a response variable y. Our goal is to study the behavior of y for given values of x.
- THE OBSERVATIONS ARE INDEPENDENT: Repeated responses y are independent of each other and come from an SRS.
- THE TRUE RELATIONSHIP IS LINEAR: The mean response μ_y has a straight-line relationship with x: μ_y = α + βx. The slope β and intercept α are unknown parameters.
- THE STANDARD DEVIATION OF THE RESPONSE ABOUT THE TRUE LINE IS THE SAME EVERYWHERE: The standard deviation of y, σ, is the same for all values of x. The value of σ is unknown.
- THE RESPONSE VARIES NORMALLY ABOUT THE TRUE REGRESSION LINE: For any fixed value of x, the response y varies according to a Normal distribution.

Condition Verification
- Due to the complexity of truly verifying each condition, we will not address them all individually.
- You are expected to analyze the residuals to test for any gross violations of the conditions necessary to do inference for regression.
- First, look at the residual plot to see if the spread about the line appears to change as x increases.
- Second, graph the residuals in a stemplot to see if they are approximately Normally distributed.
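These checks are easy to automate. Below is a minimal sketch (not from the slides) in Python with invented data: it fits the least-squares line, plots the residuals against x to look for changing spread or curvature, and draws a histogram of the residuals as a stand-in for the stemplot.

    import numpy as np
    import matplotlib.pyplot as plt

    # Invented example data for illustration only.
    x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
    y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1, 14.2, 15.8, 18.1, 20.3])

    # Fit the least-squares line: polyfit returns the slope first, then the intercept.
    b, a = np.polyfit(x, y, 1)
    residuals = y - (a + b * x)

    # Check 1: residuals vs. x -- the spread should look roughly constant, with no pattern.
    plt.subplot(1, 2, 1)
    plt.scatter(x, residuals)
    plt.axhline(0, color="gray")
    plt.xlabel("x")
    plt.ylabel("residual")

    # Check 2: histogram of residuals -- should look roughly Normal, with no extreme outliers.
    plt.subplot(1, 2, 2)
    plt.hist(residuals, bins=5)
    plt.xlabel("residual")

    plt.tight_layout()
    plt.show()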

Linear Model
ŷ = a + bx
The intercept a and slope b of this least-squares line are unbiased estimators of the parameters in the true population equation μ_y = α + βx.

ESTIMATES
- We use a as an estimate of α.
- We use b as an estimate of β.
- We use s, the regression standard error, as an estimate of σ.
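As a small illustration (again with invented data, not from the slides), the three estimates can be computed directly in Python; s is the square root of the sum of squared residuals divided by n − 2.

    import numpy as np

    # Invented example data.
    x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
    y = np.array([2.0, 4.1, 5.8, 8.3, 9.9, 12.2])
    n = len(x)

    b, a = np.polyfit(x, y, 1)                   # b estimates beta, a estimates alpha
    residuals = y - (a + b * x)
    s = np.sqrt(np.sum(residuals**2) / (n - 2))  # s estimates sigma

    print(f"a = {a:.3f}, b = {b:.3f}, s = {s:.3f}")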

Facts we need to know
- Standard error about the line: s = sqrt( Σ(y − ŷ)² / (n − 2) )
- Confidence interval for β: b ± t*·SE_b
- Test statistic: t = b/SE_b
- P-value = tcdf(t, 1E99, df), or tcdf(−1E99, t, df) for negative t
- Don't forget: P-value = 2·tcdf(t, 1E99, df) or 2·tcdf(−1E99, t, df) when using this method with a two-sided alternative.
- Degrees of freedom = n − 2
- H_0: β = 0. There is no correlation between x and y.
- H_a: β (<, >, ≠) 0. There is a (negative, positive, some) correlation between x and y.
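To make these formulas concrete, here is a hedged Python sketch (invented data, not from the slides) that computes SE_b, the t statistic for H_0: β = 0, a two-sided P-value (the same quantity as the calculator's 2·tcdf command), and a 95% confidence interval for β, then cross-checks the slope, its standard error, and the P-value against scipy.stats.linregress.

    import numpy as np
    from scipy import stats

    # Invented example data.
    x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
    y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.8, 15.2, 16.9])
    n = len(x)
    df = n - 2

    # Least-squares estimates and the standard error about the line.
    b, a = np.polyfit(x, y, 1)
    residuals = y - (a + b * x)
    s = np.sqrt(np.sum(residuals**2) / df)

    # Standard error of the slope, t statistic, and two-sided P-value.
    SE_b = s / np.sqrt(np.sum((x - x.mean())**2))
    t = b / SE_b
    p_two_sided = 2 * stats.t.sf(abs(t), df)

    # 95% confidence interval for beta.
    t_star = stats.t.ppf(0.975, df)
    ci = (b - t_star * SE_b, b + t_star * SE_b)

    print(f"b = {b:.3f}, SE_b = {SE_b:.3f}, t = {t:.2f}, P = {p_two_sided:.4g}, CI = {ci}")
    print(stats.linregress(x, y))   # reports the same slope, stderr, and two-sided P-value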

AN EXAMPLE OF Using Linear Inference
A statistical output reports the following: SE_b = 3.511, r = …, r² = .8157, b = …, n = 12.
- Create a 95% confidence interval for the slope β.
- Given the reported value of a, create an unbiased linear model to express the data.

Answers
- The confidence interval is b ± t*·SE_b = b ± 2.228(3.511) ≈ b ± 7.82, which for these data works out to roughly (−31.2, −15.5). We are 95% confident that the mean of y decreases by between about 15.5 and 31.2 for each unit that x increases.
- The unbiased model is ŷ = … − …x, with the reported values of a and b filled in.
- Don't forget to define your variables ŷ and x when context is available.

More Work - Same Example
IS THERE A CORRELATION BETWEEN x AND y?
- H_0: β = 0. There is no correlation between x and y.
- H_a: β ≠ 0. There is a correlation between x and y.
- t = b/SE_b = b/3.511
- P-value = 2·tcdf(−1E99, t, 10), which is very small here.
- There is significant evidence against our null hypothesis. Due to the low P-value, we reject the null hypothesis. Our results indicate that there is a correlation between x and y.

AP Exam Notes
- It has been suggested that, for the purposes of the exam, students should be able to take computer output and make an inference (hypothesis test or confidence interval) about the slope of a regression line and then interpret the results. Students should also be able to get regression inference results from their calculators. In addition, students should be able to interpret the values of r, r², s, a, b, and SE_b in the context of a regression problem.
- CAN YOU DO THESE THINGS???

Example  Researchers wanted to know if having larger proportions of physicians in developing African countries is associate with a higher average life expectancy. Data for ten African countries were entered into a statistics package, and regression analysis was requested. On the following slides are the results (note that the explanatory variable is population-physician ratio, which is defined as the population divided by the number of physicians).

Predictor    Coef    Stdev    t-ratio    p
Constant     *       *        *          *
Ratio        *       *        -2.88      *

s = 6.310    R-sq = 50.9%    R-sq(adj) = 44.8%

Unusual Observations
Obs.   Ratio   LifeExp   Fit   Stdev.Fit   Residual   St.Resid
*      *       *         *     *           *          *  X
*      *       *         *     *           *          *  R

R denotes an obs. with a large st. resid.
X denotes an obs. whose X value gives it a large influence.

1. What is the equation for the least-squares regression line? Define any variables you use.
ANSWER: predicted average life expectancy = a + b × (population/physician ratio), where a and b are the Constant and Ratio coefficients from the output (the slope here is negative).

2. What is the correlation? Interpret the correlation in the context of these data.
ANSWER: Since R-sq = 50.9% and the slope is negative, r = −√0.509 ≈ −0.71. There is a moderately strong negative linear association between the ratio (population/physician) and average life expectancy in developing African countries.

3. Would you be willing to use this model to predict the life expectancy for a country, given the population-physician ratio? Justify your decision.
ANSWER: Yes. As the population/physician ratio decreases, you would hope that the average life expectancy increases, resulting in a negative association. The association appears strong enough to be useful.

4. Is there sufficient evidence to indicate that average life expectancy in developing African countries has a linear association with the population-physician ratio?
ANSWER: HOW CAN WE ANSWER THIS QUESTION??? USE INFERENCE (completed on the next slide).

- H_0: β = 0. There is no association between average life expectancy and the population/physician ratio.
- H_a: β < 0. There is a negative association between average life expectancy and the population/physician ratio.
- With just the summary statistics available, we cannot adequately verify that our conditions have been satisfied.
- t = b/SE_b = (Coef)/(Stdev) = −2.88, df = n − 2 = 8, P-value = tcdf(−1E99, −2.88, 8) ≈ 0.01.
- Due to the low P-value, we will reject H_0.
- There is reasonably strong evidence to suggest that there is a significant negative association between average life expectancy and the population/physician ratio.
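For reference, the calculator computation tcdf(−1E99, −2.88, 8) can be reproduced in Python; this cross-check is not part of the original slides.

    from scipy import stats

    # One-sided P-value for H_a: beta < 0, using t = -2.88 with df = n - 2 = 8.
    p_value = stats.t.cdf(-2.88, df=8)
    print(round(p_value, 4))   # roughly 0.01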

5. Construct a 90% confidence interval for the slope of the true regression line.
ANSWER: b ± t*·SE_b = b ± 1.86 × SE_b, using t* = 1.86 for 90% confidence with df = 8 and taking b and SE_b from the Ratio row of the output.