Least-Squares Regression


Least-Squares Regression Line
Different regression lines produce different residuals. The regression line we want is the one that minimizes the sum of the squared residuals.
Definition: The least-squares regression line of y on x is the line that makes the sum of the squared residuals as small as possible.
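To make "as small as possible" concrete, here is a minimal Python sketch (not part of the original slides) that compares the sum of squared residuals for the least-squares line against two other candidate lines. It uses the body weight and backpack weight data from the calculator example later in this lesson, and the helper name sum_sq_resid is my own; the fit is done with numpy.polyfit rather than a calculator.

```python
import numpy as np

# Body weight (x) and backpack weight (y) from the example later in this lesson.
body = np.array([120, 187, 109, 103, 131, 165, 158, 116], dtype=float)
pack = np.array([26, 30, 26, 24, 29, 35, 31, 28], dtype=float)

def sum_sq_resid(a, b):
    """Sum of squared residuals for the candidate line y-hat = a + b*x."""
    return float(np.sum((pack - (a + b * body)) ** 2))

b, a = np.polyfit(body, pack, 1)   # least-squares slope and intercept
print(sum_sq_resid(a, b))          # smallest possible value, about 30.9
print(sum_sq_resid(a + 1.0, b))    # shifting the line up does worse (about 38.9)
print(sum_sq_resid(a, b + 0.01))   # so does changing the slope (about 46.4)
```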

Least-Squares Regression Line
We can use technology to find the equation of the least-squares regression line. We can also write it in terms of the means and standard deviations of the two variables and their correlation.
Definition: Equation of the least-squares regression line. We have data on an explanatory variable x and a response variable y for n individuals. From the data, calculate the means x̄ and ȳ, the standard deviations s_x and s_y, and the correlation r. The least-squares regression line is the line ŷ = a + bx with slope b = r(s_y / s_x) and y intercept a = ȳ − b·x̄.
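As a minimal sketch of this definition (the function name lsrl_from_summary is hypothetical, not from the text), the slope and y intercept can be computed directly from the five summary numbers:

```python
def lsrl_from_summary(x_bar, y_bar, s_x, s_y, r):
    """Slope and y intercept of the least-squares line y-hat = a + b*x,
    computed from the means, standard deviations, and correlation."""
    b = r * s_y / s_x       # slope
    a = y_bar - b * x_bar   # y intercept: the line passes through (x-bar, y-bar)
    return a, b
```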

Least-Squares Regression Line
Example: The number of miles (in thousands) for the 11 used Hondas has a mean of 50.5 and a standard deviation of 19.3. The asking prices had a mean of $14,425 and a standard deviation of $1,899. The correlation for these variables is r = -0.874. Find the equation of the least-squares regression line and explain what change in price we would expect for each additional 19.3 thousand miles.
To find the slope: b = r(s_y / s_x) = -0.874 × (1,899 / 19.3) ≈ -86.
To find the y intercept: ȳ = a + b·x̄, so 14,425 = a + (-86)(50.5) and a ≈ 18,768.
The least-squares regression line is ŷ = 18,768 - 86x.
For each additional 19.3 thousand miles, we expect the price to change by (-86)(19.3) ≈ -$1,660. So, the CR-V will cost about $1,660 less for each additional 19.3 thousand miles.
Note: Due to roundoff error, values for the slope and y intercept are slightly different from the values on page 166.
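The arithmetic above can be checked with a few lines of Python (a sketch; the summary values come from the slide, and the rounding is mine):

```python
r, s_x, s_y = -0.874, 19.3, 1899   # correlation and standard deviations
x_bar, y_bar = 50.5, 14425         # mean miles (thousands), mean price ($)

b = r * s_y / s_x                  # slope: about -86 dollars per thousand miles
a = y_bar - b * x_bar              # y intercept: about 18,768 dollars
print(round(b, 1), round(a))       # -86.0 18768
print(round(b * 19.3))             # change per additional 19.3 thousand miles: about -1660
```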

Least-Squares Regression Line
Find the least-squares regression equation for the following data using your calculator.
Body Weight (lb): 120, 187, 109, 103, 131, 165, 158, 116
Backpack Weight (lb): 26, 30, 26, 24, 29, 35, 31, 28
You should get ŷ = 16.3 + 0.0908x, where x is body weight and ŷ is the predicted backpack weight.
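The same answer can be reproduced without a calculator. This is a sketch using Python's statistics module (statistics.correlation requires Python 3.10 or later) together with the slope and intercept formulas from earlier in the lesson:

```python
import statistics as st

body = [120, 187, 109, 103, 131, 165, 158, 116]   # x: body weight (lb)
pack = [26, 30, 26, 24, 29, 35, 31, 28]           # y: backpack weight (lb)

x_bar, y_bar = st.mean(body), st.mean(pack)
s_x, s_y = st.stdev(body), st.stdev(pack)         # sample standard deviations
r = st.correlation(body, pack)                    # needs Python 3.10+

b = r * s_y / s_x                                 # slope
a = y_bar - b * x_bar                             # y intercept
print(f"y-hat = {a:.1f} + {b:.4f}x")              # y-hat = 16.3 + 0.0908x
```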

Residual Plots
One of the first principles of data analysis is to look for an overall pattern and for striking departures from the pattern. A regression line describes the overall pattern of a linear relationship between two variables. We see departures from this pattern by looking at the residuals.
Definition: A residual plot is a scatterplot of the residuals against the explanatory variable. Residual plots help us assess how well a regression line fits the data.

Interpreting Residual Plots
A residual plot magnifies the deviations of the points from the line, making it easier to see unusual observations and patterns.
The residual plot should show no obvious pattern. A pattern in the residuals means a linear model is not appropriate.
The residuals should be relatively small in size.
Definition: If we use a least-squares regression line to predict the values of a response variable y from an explanatory variable x, the standard deviation of the residuals (s) is given by s = √(Σ residuals² / (n − 2)) = √(Σ(yᵢ − ŷᵢ)² / (n − 2)).
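Here is a small sketch of that formula (the helper name residual_sd is mine), applied to the backpack data from earlier and assuming the fitted line ŷ ≈ 16.3 + 0.0908x:

```python
from math import sqrt

body = [120, 187, 109, 103, 131, 165, 158, 116]
pack = [26, 30, 26, 24, 29, 35, 31, 28]

def residual_sd(y, y_hat):
    """s = sqrt(sum of squared residuals / (n - 2))."""
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return sqrt(sse / (len(y) - 2))

pred = [16.3 + 0.0908 * x for x in body]   # LSRL predictions
print(residual_sd(pack, pred))             # about 2.3 lb of typical prediction error
```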

Residual Plots
Here are the scatterplot and the residual plot for the used Hondas data.
For the used Hondas data, the standard deviation of the residuals is s ≈ $972. So, when we use number of miles to estimate the asking price, we will be off by an average of about $972.

The Role of r² in Regression
The standard deviation of the residuals gives us a numerical estimate of the average size of our prediction errors. There is another numerical quantity that tells us how well the least-squares regression line predicts values of the response y.
Definition: The coefficient of determination r² is the fraction of the variation in the values of y that is accounted for by the least-squares regression line of y on x. We can calculate r² using the formula r² = 1 − SSE/SST, where SSE = Σ(yᵢ − ŷᵢ)² is the sum of squared residuals from the regression line and SST = Σ(yᵢ − ȳ)² is the total sum of squared deviations from the mean of y.
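A minimal sketch of this definition (the function name r_squared is mine, not from the text):

```python
def r_squared(y, y_hat):
    """Coefficient of determination: r^2 = 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # squared errors from the regression line
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation about the mean of y
    return 1 - sse / sst
```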

The Role of r² in Regression
r² tells us how much better the LSRL does at predicting values of y than simply guessing the mean ȳ for each value in the dataset. Consider the example on page 179. If we needed to predict a backpack weight for a new hiker but didn't know the hiker's body weight, we could use the average backpack weight as our prediction.
If we use the mean backpack weight as our prediction, the sum of the squared residuals is 83.87: SST = 83.87.
If we use the LSRL to make our predictions, the sum of the squared residuals is 30.90: SSE = 30.90.
SSE/SST = 30.90/83.87 = 0.368. Therefore, 36.8% of the variation in pack weight is unaccounted for by the least-squares regression line.
r² = 1 − SSE/SST = 1 − 30.90/83.87 = 0.632. So 63.2% of the variation in backpack weight is accounted for by the linear model relating pack weight to body weight.
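The backpack numbers above can be reproduced with a short script (a sketch, not from the text; the predictions use the rounded line ŷ ≈ 16.3 + 0.0908x, so SSE and r² come out very slightly different from the quoted 30.90 and 0.632):

```python
body = [120, 187, 109, 103, 131, 165, 158, 116]
pack = [26, 30, 26, 24, 29, 35, 31, 28]

pack_bar = sum(pack) / len(pack)
pred = [16.3 + 0.0908 * x for x in body]              # LSRL predictions

sst = sum((y - pack_bar) ** 2 for y in pack)          # predict with the mean
sse = sum((y - p) ** 2 for y, p in zip(pack, pred))   # predict with the LSRL
print(sst)             # 83.875 -> the SST = 83.87 quoted above
print(sse)             # about 30.9 -> the SSE = 30.90 quoted above
print(1 - sse / sst)   # about 0.63 -> r^2 ≈ 0.632
```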