Least Squares Regression: Textbook section 3.2


Regression Line
A regression line describes how the response variable (y) changes as an explanatory variable (x) changes. We use the regression line to PREDICT the value of y for a given value of x.

Interpreting a regression line
A regression line is a model for the data, like the density curves in Chapter 2. It’s a compact description of the relationship between the variables. The equation for a regression line is $\hat{y} = a + bx$, where $\hat{y}$ (read “y hat”) is the predicted value for the response variable, a is the y-intercept, and b is the slope. The regression line for our data from before would be (equation not preserved in this transcript).

Prediction
The accuracy of predictions from a regression line depends on how much the data are scattered about the line. Using our line, what if we wanted to predict the selling price for a truck with 100,000 miles? This is a fairly reasonable prediction. What if we wanted to find the price of a truck with 300,000 miles? This is called extrapolation because the value is outside the range of our data. Here the line predicts a negative price: we’d have to pay someone else to take our truck!! Always check for a reasonable answer.
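The prediction and extrapolation check can be sketched in a few lines of Python. The intercept, slope, and mileage range below are hypothetical values chosen only for illustration (the slide's actual truck line is not available here), and `predict` is an illustrative helper name rather than anything from the textbook.

```python
def predict(a, b, x, x_min, x_max):
    """Predict y_hat = a + b*x and flag whether x lies outside the observed range."""
    y_hat = a + b * x
    is_extrapolation = not (x_min <= x <= x_max)
    return y_hat, is_extrapolation


# Hypothetical line and data range (illustrative only, not the slide's truck data).
a = 40000.0                    # intercept: predicted price in dollars at 0 miles
b = -0.15                      # slope: predicted change in price per mile driven
x_min, x_max = 5000, 150000    # assumed smallest and largest mileage in the data

results = {}
for miles in (100000, 300000):
    price, extrapolated = predict(a, b, miles, x_min, x_max)
    results[miles] = (price, extrapolated)
    note = "extrapolation, treat with suspicion" if extrapolated else "within the data range"
    print(f"{miles} miles: predicted price ${price:,.0f} ({note})")
```

With these assumed numbers the 300,000-mile prediction is negative, which is exactly the kind of unreasonable answer the slide warns about.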

Least squares regression & Residuals
In most cases, no line will pass exactly through the data. A good regression line minimizes the vertical distance between the actual data points and the line itself. The red lines in the figure denote the residuals: the vertical distances between the actual data and the predicted points on the line (residual = observed y - predicted y). The Least Squares Regression Line minimizes the sum of the squared residuals.
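As a rough illustration of what "minimizes the sum of the squared residuals" means, the sketch below fits a line to a small made-up data set using the standard closed-form formulas for the slope and intercept, then checks that nudging the line in any direction makes the sum of squared residuals larger. The data and function names are assumptions for this example only.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept: the line passes through (x_bar, y_bar)
    return a, b


def sum_squared_residuals(a, b, xs, ys):
    """Sum of (observed y - predicted y)^2 for the line y_hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sse = sum_squared_residuals(a, b, xs, ys)

print(f"y_hat = {a:.2f} + {b:.2f}x, sum of squared residuals = {sse:.2f}")

# Any other line does worse on this data.
for da, db in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    other = sum_squared_residuals(a + da, b + db, xs, ys)
    print(f"shifted line (a{da:+}, b{db:+}): {other:.2f}")
```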

Residual Plots
Residual plots help us determine whether our regression model is appropriate for our data. On a residual plot, if the points are scattered evenly above and below the horizontal line at residual = 0 with no distinct pattern, we can be confident that our linear model is appropriate.
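To see what a "distinct pattern" looks like, here is a small sketch that fits a line to deliberately curved, made-up data (y = x squared) and prints the sign of each residual in order of x. A run of positive, then negative, then positive residuals is the kind of pattern that suggests a linear model is not appropriate. No plotting library is assumed; the signs stand in for the plot.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b


# Made-up curved data: y = x^2, so a straight line is the wrong model.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# One character per point, ordered by x: '+' above the zero line, '-' below.
signs = "".join("+" if r > 0 else "-" for r in residuals)

print("residuals:", residuals)
print("sign pattern:", signs)
```

For an appropriate linear model the signs would be mixed with no obvious runs; here the residuals are positive at both ends and negative in the middle.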

The role of $r^2$
$r^2$ is called the coefficient of determination and is the proportion of the variation in the y values that is accounted for by the least-squares regression line. For our truck prices example, $r^2 = 0.664$. Therefore, we say that “66.4% of the variation in price is accounted for by the linear model relating price to miles driven.” We’re still discussing how well the line fits the data. If $r^2$ is close to 1, then the line fits the data well.
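One way to make "proportion of variation accounted for" concrete is to compute it directly as 1 minus the ratio of the squared residuals about the line to the squared deviations about the mean of y, and to confirm that it matches the square of the correlation r. The sketch below does this on made-up data (not the truck prices); the variable names are illustrative.

```python
import math

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)    # total variation in y about its mean

b = sxy / sxx                              # least-squares slope
a = y_bar - b * x_bar                      # least-squares intercept

# Variation left over after using the line: sum of squared residuals.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / syy                  # proportion of variation accounted for
r = sxy / math.sqrt(sxx * syy)             # correlation coefficient

print(f"r^2 from residuals: {r_squared:.3f}")
print(f"r squared directly: {r ** 2:.3f}")
```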

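Interpreting computer regression output
[Figure: annotated computer regression output. Labeled values: y-intercept (a), slope (b), standard deviation of the residuals, and $r^2$. The remaining values are marked: “We almost always ignore these values. We’ll discuss these in Chapter 12.”]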

Putting it all together
Does the age at which a child begins to talk predict a later score on a test of mental ability? A study of the development of young children recorded the age in months at which each of 21 children spoke their first word and their Gesell Adaptive Score, the result of an aptitude test taken much later. Should we use a linear model to predict a child’s Gesell score from his or her age at first word? If so, how accurate will predictions be?
[Table: Age and Score for each child]

Continued…
How do we answer the questions: Is a linear model appropriate and, if so, how well does the least-squares regression line fit the data?
1. Make a scatterplot of the data.
   1. Describe what you see (FODS): negative, moderately strong, linear pattern. There appear to be two outliers: one child has a very high score for the age at first word, and another child didn’t speak until much later.
2. Check residuals.
   1. No distinct pattern: a linear model seems appropriate.
3. Calculator output gives us $r^2 = 0.41$, which means that only 41% of the variation in Gesell score is accounted for by our linear model of age and score.

Correlation and Regression Wisdom
1. The distinction between response and explanatory variables is important in regression!!
2. Correlation and regression lines describe only linear relationships.
   1. Pictures p.
3. Correlation and least-squares regression lines are not resistant.
4. Strong association does NOT imply causation!!

Outliers vs. Influential points
First point: this point has a large residual, but does not change the LSRL. It is an outlier, but not very influential.
Second point: this point has a small residual, but changes the LSRL quite a bit. It is very influential.
Influential points often have small residuals because they pull the line towards themselves.
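The contrast between the two kinds of points can be checked numerically. The sketch below uses made-up data lying exactly on y = 2x, then adds one unusual point at a time: a point far off in the y direction near the middle of the x-values, and a point far out in the x direction. The specific coordinates are assumptions chosen to make the effect easy to see.

```python
def fit_line(points):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def residual(point, a, b):
    """Observed y minus predicted y for one point."""
    x, y = point
    return y - (a + b * x)


# Made-up base data lying exactly on y = 2x.
base = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]
outlier = (3, 12)        # unusual in y, at the center of the x-values
influential = (15, 10)   # unusual in x, far from the other x-values

a_base, b_base = fit_line(base)
a_out, b_out = fit_line(base + [outlier])
a_inf, b_inf = fit_line(base + [influential])

res_out = residual(outlier, a_out, b_out)
res_inf = residual(influential, a_inf, b_inf)

print(f"base line:        slope {b_base:.2f}")
print(f"with outlier:     slope {b_out:.2f}, its residual {res_out:.2f}")
print(f"with influential: slope {b_inf:.2f}, its residual {res_inf:.2f}")
```

In this example the outlier leaves the slope unchanged (it only lifts the line slightly) yet has the larger residual, while the point that is extreme in x drags the slope well below 2 and ends up with the smaller residual.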