Copyright © 2010 Pearson Education, Inc. Slide 27 - 1.

Presentation transcript:


Solution: C

Chapter 27 Inferences for Regression

An Example: Body Fat and Waist Size
Our example revolves around the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set:

Remembering Regression
In regression, we want to model the relationship between two quantitative variables, one the predictor and the other the response. To do that, we imagine an idealized regression line, which assumes that the means of the distributions of the response variable fall along the line even though individual values are scattered around it. We would like to know what the regression model can tell us beyond the individuals in the study. We want to make confidence intervals and test hypotheses about the slope and intercept of the regression line.

The Population and the Sample
We know better than to think that even if we knew every population value, the data would line up perfectly on a straight line. In our sample, there's a whole distribution of % body fat for men with 38-inch waists:

The Population and the Sample (cont.)
This is true at each waist size. We could depict the distribution of % body fat at different waist sizes like this:

The Population and the Sample (cont.)
The model assumes that the means of the distributions of % body fat for each waist size fall along the line even though the individuals are scattered around it. The model is not a perfect description of how the variables are associated, but it may be useful. If we had all the values in the population, we could find the slope and intercept of the idealized regression line explicitly by using least squares.

The Population and the Sample (cont.)
We write the idealized line with Greek letters and consider the coefficients to be parameters: β0 is the intercept and β1 is the slope. Corresponding to our fitted line ŷ = b0 + b1x, we write μy = β0 + β1x. Now, not all the individual y's are at these means; some lie above the line and some below. Like all models, this one has errors.

The Population and the Sample (cont.)
Denote the errors by ε. These errors are random, of course, and can be positive or negative. When we add error to the model, we can talk about individual y's instead of means: y = β0 + β1x + ε. This equation is now true for each data point (since there is an ε to soak up the deviation) and gives a value of y for each x.
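To see the model in action, here is a minimal sketch (not from the chapter) that simulates data from the idealized line y = β0 + β1x + ε with made-up parameter values, then recovers the slope and intercept by least squares. The parameter values, sample size, and error standard deviation are all our own choices for illustration.

```python
# Simulate the idealized regression model y = beta0 + beta1*x + eps
# with hypothetical parameters, then fit a least-squares line.
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 10.0, 3.0            # hypothetical true intercept and slope
x = rng.uniform(0, 10, size=200)    # 200 predictor values
eps = rng.normal(0, 1, size=200)    # random errors: positive or negative
y = beta0 + beta1 * x + eps         # one y for each x, scattered about the line

# polyfit returns the highest-degree coefficient first: (slope, intercept)
b1, b0 = np.polyfit(x, y, 1)        # fitted line: y-hat = b0 + b1*x
```

With a reasonably large sample, the fitted b0 and b1 land close to the true β0 and β1, which is exactly the sense in which the sample line estimates the idealized population line.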

Assumptions and Conditions
In Chapter 8, when we fit lines to data, we needed to check only the Straight Enough Condition. Now, when we want to make inferences about the coefficients of the line, we'll have to make more assumptions (and thus check more conditions). We need to be careful about the order in which we check conditions. If an initial assumption is not true, it makes no sense to check the later ones.

Assumptions and Conditions (cont.)
1. Linearity Assumption: Straight Enough Condition: Check the scatterplot; the shape must be linear or we can't use regression at all.

Assumptions and Conditions (cont.)
1. Linearity Assumption (cont.): If the scatterplot is straight enough, we can go on to some assumptions about the errors. If not, stop here, or consider re-expressing the data to make the scatterplot more nearly linear. Also check the Quantitative Data Condition: the data must be quantitative for this to make sense.

Assumptions and Conditions (cont.)
2. Independence Assumption: Randomization Condition: the individuals are a representative sample from the population. Check the residual plot (part 1); the residuals should appear to be randomly scattered.

Assumptions and Conditions (cont.)
3. Equal Variance Assumption: Does the Plot Thicken? Condition: Check the residual plot (part 2); the spread of the residuals should be uniform.

Assumptions and Conditions (cont.)
4. Normal Population Assumption: Nearly Normal Condition: Check a histogram of the residuals. The distribution of the residuals should be unimodal and symmetric. Outlier Condition: Check for outliers.

Assumptions and Conditions (cont.)
If all four assumptions are true, the idealized regression model would look like this: At each value of x there is a distribution of y-values that follows a Normal model, and each of these Normal models is centered on the line and has the same standard deviation.

Which Come First: the Conditions or the Residuals?
There's a catch in regression: the best way to check many of the conditions is with the residuals, but we get the residuals only after we compute the regression model. To compute the regression model, however, we should check the conditions. So we work in this order: Make a scatterplot of the data to check the Straight Enough Condition. (If the relationship isn't straight, try re-expressing the data. Or stop.)

Which Come First: the Conditions or the Residuals? (cont.)
If the data are straight enough, fit a regression model and find the residuals, e, and predicted values, ŷ. Make a scatterplot of the residuals against x or the predicted values. This plot should have no pattern. Check in particular for any bend, any thickening (or thinning), or any outliers. If the data are measured over time, plot the residuals against time to check for evidence of patterns that might suggest they are not independent.

Which Come First: the Conditions or the Residuals? (cont.)
If the scatterplots look OK, then make a histogram and Normal probability plot of the residuals to check the Nearly Normal Condition. If all the conditions seem to be satisfied, go ahead with inference.
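The fit-first, check-second workflow above can be sketched in a few lines. The data here are hypothetical stand-ins (not the chapter's body-fat data): fit the line, compute ŷ and the residuals e, and those residuals are what go into the residual plot, histogram, and Normal probability plot.

```python
# Fit first, then extract the residuals needed for the condition checks.
import numpy as np

x = np.array([30.0, 32, 34, 36, 38, 40, 42, 44])   # hypothetical waist sizes
y = np.array([12.0, 15, 17, 18, 22, 23, 27, 28])   # hypothetical % body fat

b1, b0 = np.polyfit(x, y, 1)   # least-squares slope and intercept
y_hat = b0 + b1 * x            # predicted values, y-hat
e = y - y_hat                  # residuals: plot these against x or y-hat

# With an intercept in the model, least-squares residuals always sum to
# zero, so any bend or thickening seen in the residual plot is real
# structure, not a leftover offset.
```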

Intuition About Regression Inference
Spread around the line: Less scatter around the line means the slope will be more consistent from sample to sample. The spread around the line is measured with the residual standard deviation se. You can always find se in the regression output, often just labeled s.

Intuition About Regression Inference (cont.)
Spread around the line: Less scatter around the line means the slope will be more consistent from sample to sample.

Intuition About Regression Inference (cont.)
Spread of the x's: A large standard deviation of x provides a more stable regression.

Intuition About Regression Inference (cont.)
Sample size: Having a larger sample size, n, gives more consistent estimates.

Standard Error for the Slope
Three aspects of the scatterplot affect the standard error of the regression slope: the spread around the line, se; the spread of the x values, sx; and the sample size, n. The formula for the standard error (which you will probably never have to calculate by hand) is:
SE(b1) = se / (sx × √(n − 1))
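As a check on the formula above, the sketch below computes SE(b1) = se / (sx × √(n − 1)) by hand on hypothetical data and compares it with the slope standard error that scipy.stats.linregress reports. The data values are invented for illustration.

```python
# Verify SE(b1) = se / (sx * sqrt(n - 1)) against scipy's slope stderr.
import numpy as np
from scipy import stats

x = np.array([1.0, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.3, 17.9, 20.2])

res = stats.linregress(x, y)
n = len(x)
e = y - (res.intercept + res.slope * x)   # residuals from the fitted line
se = np.sqrt(np.sum(e**2) / (n - 2))      # residual standard deviation, se
sx = np.std(x, ddof=1)                    # ordinary SD of the x-values, sx
se_b1 = se / (sx * np.sqrt(n - 1))        # the slide's formula for SE(b1)
```

The two agree because Σ(x − x̄)² = (n − 1)sx², so the textbook form is just an algebraic rearrangement of the usual computational formula.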

Sampling Distribution for Regression Slopes
When the conditions are met, the standardized estimated regression slope,
t = (b1 − β1) / SE(b1),
follows a Student's t-model with n − 2 degrees of freedom.

Sampling Distribution for Regression Slopes (cont.)
We estimate the standard error with
SE(b1) = se / (sx × √(n − 1)),
where:
se = √(Σe² / (n − 2)) is the residual standard deviation,
n is the number of data values, and
sx is the ordinary standard deviation of the x-values.

What About the Intercept?
The same reasoning applies for the intercept. The intercept usually isn't interesting, though. Most hypothesis tests and confidence intervals for regression are about the slope.

Regression Inference
A null hypothesis of a zero slope questions the entire claim of a linear relationship between the two variables, which is often just what we want to know. To test H0: β1 = 0, we find
t = (b1 − 0) / SE(b1)
and continue as we would with any other t-test. The formula for a confidence interval for β1 is
b1 ± t* × SE(b1),
where t* comes from the t-model with n − 2 degrees of freedom.
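The test and interval above can be sketched as follows, again on hypothetical data of our own. The t-statistic divides the fitted slope by its standard error, the P-value comes from a t-model with n − 2 degrees of freedom, and the interval is b1 ± t* × SE(b1).

```python
# Slope t-test and 95% confidence interval for beta1, H0: beta1 = 0.
import numpy as np
from scipy import stats

x = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])
y = np.array([1.2, 2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9])

res = stats.linregress(x, y)
n = len(x)
df = n - 2

t_stat = res.slope / res.stderr            # t = (b1 - 0) / SE(b1)
p_value = 2 * stats.t.sf(abs(t_stat), df)  # two-sided P-value
t_star = stats.t.ppf(0.975, df)            # critical value for a 95% CI
ci = (res.slope - t_star * res.stderr,
      res.slope + t_star * res.stderr)
```

Here the P-value computed by hand matches the one linregress reports, since linregress also tests H0: slope = 0 against a two-sided alternative.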


Example: Below are the times and temperatures for 11 of the top performances in women's marathons during the 1990s. Let's examine the influence of temperature on the performance of elite runners in marathons.
a. Enter the temperatures in L1 and the runners' times in L2. Check the scatterplot. Is it obviously nonlinear?
b. Are the data independent? Does the plot thicken?
c. Is the histogram of the residuals unimodal and symmetric? (Put RESID, from List > Names, as y. What do you have to do to get the residuals?)
d. How many degrees of freedom are there?
e. Perform a regression slope t-test. Use a two-tailed option because we are interested in whether high temperatures enhance or interfere with a runner's performance. You can store the equation in Y1 (Vars > Y-Vars > Function).
f. You are given the t-value and P-value, the coefficients of the regression equation a and b, the value of s (our sample estimate of the common standard deviation of errors around the true line), r squared, and r. Find SE(b). (SE(b) = b/t.)
g. What is your conclusion?

Example (cont.): Create a 95% confidence interval for the slope. What is your conclusion?
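The TI-84 steps in parts (d)–(f) and the confidence interval can be mirrored in Python. The actual marathon times and temperatures are not reproduced in this transcript, so the numbers below are made-up stand-ins with 11 cases, used only to show the mechanics (df = n − 2, the slope t-test, and recovering SE(b) as b/t).

```python
# Python parallel of the TI-84 slope t-test, with hypothetical data.
import numpy as np
from scipy import stats

temp = np.array([55.0, 61, 49, 62, 70, 73, 51, 57, 66, 75, 58])        # stand-in deg F
time = np.array([148.0, 149, 146, 150, 153, 154, 147, 148, 152, 155, 149])  # stand-in minutes

res = stats.linregress(temp, time)
df = len(temp) - 2                 # part (d): degrees of freedom
t_stat = res.slope / res.stderr    # part (e): slope t-statistic
se_b = res.slope / t_stat          # part (f): SE(b) = b / t

t_star = stats.t.ppf(0.975, df)    # 95% CI for the slope
ci = (res.slope - t_star * se_b, res.slope + t_star * se_b)
```

Note that SE(b) = b/t only recovers the standard error because t was computed as b/SE(b) under H0: β1 = 0; it is a back-calculation from the calculator's output, not a separate formula.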