Agresti/Franklin Statistics, 1 of 88  Section 11.4 What Do We Learn from How the Data Vary Around the Regression Line?

Agresti/Franklin Statistics, 2 of 88 Residuals and Standardized Residuals A residual is a prediction error: the difference between an observed outcome y and its predicted value ŷ. The magnitude of these residuals depends on the units of measurement for y. A standardized version of the residual does not depend on the units.

Agresti/Franklin Statistics, 3 of 88 Standardized Residuals Standardized residual = (y − ŷ)/se, where se is the standard error of the residual. The se formula is complex, so we rely on software to find it. A standardized residual indicates how many standard errors a residual falls from 0. Often, observations with standardized residuals larger than 3 in absolute value represent outliers. (This corrects a typo on p. 553 of the text.)
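As a concrete illustration, here is a minimal Python sketch of the same idea, using statsmodels and synthetic data (both are assumptions, not part of the slides): it fits a simple regression and extracts the standardized residuals that software such as MINITAB reports.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic (x, y) data standing in for any simple-regression data set
rng = np.random.default_rng(0)
x = rng.uniform(1, 4, size=59)
y = 0.5 + 0.8 * x + rng.normal(scale=0.4, size=59)

# Fit y = a + b*x by least squares
X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# Raw residuals (y - y_hat) and standardized residuals (residual / its SE)
residuals = fit.resid
std_resid = fit.get_influence().resid_studentized_internal

# Observations more than 3 standard errors from 0 are potential outliers
print(np.where(np.abs(std_resid) > 3)[0])
```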

Agresti/Franklin Statistics, 4 of 88 Example: Detecting an Underachieving College Student Data were collected on a sample of 59 students at the University of Georgia. Two of the variables were CGPA (college grade point average) and HSGPA (high school grade point average). (Example 13 in the text.)

Agresti/Franklin Statistics, 5 of 88 Example: Detecting an Underachieving College Student A regression equation was created from the data, with x = HSGPA (explanatory variable) and y = CGPA (response variable). Equation:

Agresti/Franklin Statistics, 6 of 88 Example: Detecting an Underachieving College Student MINITAB highlights observations that have standardized residuals with absolute value larger than 2:

Agresti/Franklin Statistics, 7 of 88 Example: Detecting an Underachieving College Student Consider the reported standardized residual of −3.14. This indicates that the residual falls 3.14 standard errors below 0: this student's actual college GPA is quite far below what the regression line predicts.
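A sketch of how such a check might be reproduced outside MINITAB; the file name gpa.csv and the column names HSGPA and CGPA are hypothetical, not taken from the slides.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file and column names; substitute your own data set
df = pd.read_csv("gpa.csv")          # columns assumed: HSGPA, CGPA
fit = smf.ols("CGPA ~ HSGPA", data=df).fit()

# Standardized residuals, as reported by regression software
df["std_resid"] = fit.get_influence().resid_studentized_internal

# Flag observations more than 2 standard errors from the line (MINITAB's rule)
print(df[df["std_resid"].abs() > 2])
```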

Agresti/Franklin Statistics, 8 of 88 Analyzing Large Standardized Residuals When an observation has a large standardized residual, ask: Does it fall well away from the linear trend that the other points follow? Does it have too much influence on the results? Note: some large standardized residuals may occur just because of ordinary random variability.

Agresti/Franklin Statistics, 9 of 88 Histogram of Residuals A histogram of residuals or standardized residuals is a good way of detecting unusual observations. A histogram is also a good way of checking the assumption that the conditional distribution of y at each x value is normal. Look for a bell-shaped histogram.

Agresti/Franklin Statistics, 10 of 88 Histogram of Residuals Suppose the histogram is not bell-shaped: the distribution of the residuals is not normal. However, two-sided inferences about the slope parameter still work quite well; the t inferences are robust.
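A minimal sketch of this diagnostic plot, assuming a fitted statsmodels result named fit as in the earlier sketch:

```python
import matplotlib.pyplot as plt

# Histogram of the standardized residuals from a fitted model `fit`
std_resid = fit.get_influence().resid_studentized_internal
plt.hist(std_resid, bins=15, edgecolor="black")
plt.xlabel("Standardized residual")
plt.ylabel("Frequency")
plt.title("Check for a roughly bell-shaped histogram")
plt.show()
```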

Agresti/Franklin Statistics, 11 of 88 The Residual Standard Deviation For statistical inference, the regression model assumes that the conditional distribution of y at a fixed value of x is normal, with the same standard deviation at each x. This standard deviation, denoted by σ, refers to the variability of y values for all subjects with the same x value.

Agresti/Franklin Statistics, 12 of 88 The Residual Standard Deviation The estimate of σ, obtained from the data, is s = √[Σ(y − ŷ)² / (n − 2)], the square root of the sum of squared residuals divided by its degrees of freedom.
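A sketch of this computation in Python, reusing the synthetic x, y, and fit from the first sketch (those names are assumptions carried over from there):

```python
import numpy as np

# Residual standard deviation: s = sqrt( sum of squared residuals / (n - 2) )
n = len(y)
sse = np.sum(fit.resid ** 2)
s = np.sqrt(sse / (n - 2))

# statsmodels reports the same quantity as the square root of mse_resid
print(s, np.sqrt(fit.mse_resid))
```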

Agresti/Franklin Statistics, 13 of 88 Example: How Variable are the Athletes’ Strengths? From MINITAB output, we obtain s = 8.0, the residual standard deviation of y. For any given x value, we estimate the mean y value using the regression equation, and we estimate the standard deviation of y using s = 8.0.

Agresti/Franklin Statistics, 14 of 88 Confidence Interval for µy We estimate µy, the population mean of y at a given value of x, by the predicted value ŷ. We can construct a 95% confidence interval for µy using ŷ ± t.025(se), where software supplies the standard error se.

Agresti/Franklin Statistics, 15 of 88 Prediction Interval for y The estimate for the mean of y at a fixed value of x is also a prediction for an individual outcome y at that value of x. Most regression software will form an interval within which an outcome y is likely to fall. This is called a prediction interval for y (see Figure 11.10).

Agresti/Franklin Statistics, 16 of 88 The Residual Standard Deviation Note the difference between the limits of the confidence interval for µy and the residual standard deviation s.

Agresti/Franklin Statistics, 17 of 88 Prediction Interval for y vs Confidence Interval for µy The prediction interval for y is an inference about where individual observations fall. Use a prediction interval for y if you want to predict where a single observation on y will fall for a particular x value.

Agresti/Franklin Statistics, 18 of 88 Prediction Interval for y vs Confidence Interval for µy The confidence interval for µy is an inference about where a population mean falls. Use a confidence interval for µy if you want to estimate the mean of y for all individuals having a particular x value.
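Both intervals can be obtained from standard regression software. Here is a hedged Python sketch using statsmodels, the synthetic fit from the first sketch, and a hypothetical new x value (x0 = 11, echoing the bench-press example that follows):

```python
import numpy as np
import statsmodels.api as sm

# Intervals at a chosen x value (x0 = 11 is hypothetical)
x0 = 11
new_X = sm.add_constant(np.array([x0]), has_constant="add")
pred = fit.get_prediction(new_X)

frame = pred.summary_frame(alpha=0.05)
print(frame[["mean_ci_lower", "mean_ci_upper"]])  # 95% CI for the mean of y
print(frame[["obs_ci_lower", "obs_ci_upper"]])    # 95% prediction interval for y
```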

Agresti/Franklin Statistics, 19 of 88 Example: Predicting Maximum Bench Press and Estimating its Mean

Agresti/Franklin Statistics, 20 of 88 Example: Predicting Maximum Bench Press and Estimating its Mean Use the MINITAB output to find and interpret a 95% CI for the population mean of the maximum bench press values for all female high school athletes who can do x = 11 sixty-pound bench presses. For all female high school athletes who can do 11 sixty-pound bench presses, we estimate that the mean of their maximum bench press values falls between 78 and 82 pounds.

Agresti/Franklin Statistics, 21 of 88 Example: Predicting Maximum Bench Press and Estimating its Mean Use the MINITAB output to find and interpret a 95% prediction interval for a single new observation on the maximum bench press for a randomly chosen female high school athlete who can do x = 11 sixty-pound bench presses. For all female high school athletes who can do 11 sixty-pound bench presses, we predict that 95% of them have maximum bench press values between 64 and 96 pounds.

Agresti/Franklin Statistics, 22 of 88 Decomposing the Error Regression SS + Residual SS = Total SS, where Regression SS = Σ(ŷᵢ − ȳ)² = Σ(yᵢ − ȳ)² − Σ(yᵢ − ŷᵢ)². The test statistic F = (MS Regression)/(MSE) is more general than the t test (in the cases studied in this class it is effectively t squared). In more complicated models (more explanatory variables), the difference between the two tests and the utility of the F test become apparent.
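A minimal numpy sketch of this decomposition on synthetic data (all names and numbers are illustrative, not from the slides):

```python
import numpy as np

# Synthetic data and a least-squares line y_hat = a + b*x
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=40)
y = 2.0 + 0.5 * x + rng.normal(scale=1.5, size=40)
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x

# Sums of squares: Total SS = Regression SS + Residual SS
total_ss = np.sum((y - y.mean()) ** 2)
reg_ss = np.sum((y_hat - y.mean()) ** 2)
resid_ss = np.sum((y - y_hat) ** 2)
print(np.isclose(total_ss, reg_ss + resid_ss))  # True up to rounding

# F = MS(Regression) / MSE; with one predictor, F equals t squared
n = len(y)
F = (reg_ss / 1) / (resid_ss / (n - 2))
print(F)
```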

Agresti/Franklin Statistics, 23 of 88  Section 11.5 Exponential Regression: A Model for Nonlinearity

Agresti/Franklin Statistics, 24 of 88 Nonlinear Regression Models If a scatterplot indicates substantial curvature in a relationship, then equations that provide curvature are needed. Occasionally a scatterplot has a parabolic appearance: as x increases, y first increases and then decreases. More often, y tends to continually increase or continually decrease, but the trend shows curvature.

Agresti/Franklin Statistics, 25 of 88 Example: Exponential Growth in Population Size Since 2000, the population of the U.S. has been growing at a rate of 2% a year. The population size in 2000 was 280 million. The population size in 2001 was 280 × 1.02, in 2002 it was 280 × (1.02)^2, ..., and in 2010 it is estimated to be 280 × (1.02)^10. This is called exponential growth.
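The arithmetic as a quick Python check, using the numbers taken directly from the slide:

```python
# Exponential growth: population (in millions) after t years at 2% annual growth
pop_2000 = 280
for t in (1, 2, 10):
    print(2000 + t, round(pop_2000 * 1.02 ** t, 1), "million")
```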

Agresti/Franklin Statistics, 26 of 88 Exponential Regression Model An exponential regression model has the formula µy = αβ^x for the mean µy of y at a given value of x, where α and β are parameters.

Agresti/Franklin Statistics, 27 of 88 Exponential Regression Model In the exponential regression equation, the explanatory variable x appears as the exponent of a parameter. The mean µy and the parameter β can take only positive values. As x increases, the mean µy continually increases when β > 1 and continually decreases when 0 < β < 1.

Agresti/Franklin Statistics, 28 of 88 Exponential Regression Model For exponential regression, the logarithm of the mean is a linear function of x: taking logs of µy = αβ^x gives log(µy) = log(α) + x log(β). When the exponential regression model holds, a plot of the log of the y values versus x should show an approximate straight-line relation.
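A hedged sketch of how such a model might be fit by regressing log(y) on x and back-transforming (synthetic data; the slides rely on dedicated regression software instead):

```python
import numpy as np

# Synthetic exponential-growth data: y is roughly alpha * beta**x with noise
rng = np.random.default_rng(2)
x = np.arange(0, 10)
y = 20 * 1.7 ** x * rng.lognormal(sigma=0.05, size=x.size)

# Fit a straight line to log(y): log(y) = log(alpha) + x * log(beta)
slope, intercept = np.polyfit(x, np.log(y), 1)
alpha_hat = np.exp(intercept)
beta_hat = np.exp(slope)
print(alpha_hat, beta_hat)  # back-transform to the exponential model's parameters
```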

Agresti/Franklin Statistics, 29 of 88 Example: Explosion in Number of People Using the Internet (scatterplot and regression output figures not reproduced here)

Agresti/Franklin Statistics, 32 of 88 Example: Explosion in Number of People Using the Internet Using regression software, we can create the exponential regression equation, where x is the number of years since 1995 (x = 0 for 1995, x = 1 for 1996, etc.) and y is the number of Internet users. Equation:

Agresti/Franklin Statistics, 33 of 88 Interpreting Exponential Regression Models In the exponential regression model, the parameter α represents the mean value of y when x = 0; the parameter β represents the multiplicative effect on the mean of y for a one-unit increase in x.

Agresti/Franklin Statistics, 34 of 88 Example: Explosion in Number of People Using the Internet In this model, the predicted number of Internet users in 1995 (for which x = 0) equals the estimate of α, in millions, and the predicted number of Internet users in 1996 is the estimate of β times that value.