Find the Least Squares Regression Line and interpret its slope, y-intercept, and the coefficients of correlation and determination. Justify the regression model using the scatterplot and residual plot.

Presentation transcript:

AP Statistics Objectives Ch8
• Find the Least Squares Regression Line and interpret its slope, y-intercept, and the coefficients of correlation and determination
• Justify the regression model using the scatterplot and residual plot

Vocabulary
• Model
• Residuals
• Slope
• Regression to the mean
• Intercept
• R²
• Linear model
• Predicted value
• Regression line

Residual Plot Vocabulary Chapter 7 Answers Linear Regression Practice Regression Line Notes Chapter 8 Assignments Chp 8 Part I Day 2 Example Lurking Variable

Lurking Variable

Chapter 8 #1 r a) b) c) d)


Standardized Foot Length vs Height 2011 NOTE: (0,0) represents the mean of x and the mean of y. Slope is the correlation
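The note above can be checked numerically. The following is a minimal sketch, assuming made-up foot-length and height values (the `foot` and `height` lists are hypothetical, not the 2011 class data): after both variables are converted to z-scores, the least squares slope equals the correlation r and the line passes through (0, 0).

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """Correlation r as the sum of z-score products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data for illustration only.
foot = [23.0, 24.5, 25.0, 26.5, 28.0, 29.5]
height = [160.0, 168.0, 165.0, 175.0, 180.0, 183.0]

r = correlation(foot, height)
slope_z, intercept_z = least_squares(z_scores(foot), z_scores(height))
print(f"r = {r:.4f}, standardized slope = {slope_z:.4f}")
```

In the original units the slope is r multiplied by the ratio of the standard deviations (s_y / s_x), which is why it only equals r after standardizing.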

Explanatory or Response? Now interpret the R². R² = 0.697. According to the linear model, 69.7% of the variability in height is accounted for by variation in foot size.

Explanatory or Response

Residual Plot Example

REMEMBER: POSITIVE RESIDUALS are UNDERESTIMATES

Residual Plot Example NEGATIVE RESIDUALS are OVERESTIMATES

Assignment CHAPTER 8 Part I: pp #2,4,8&10,12&14 Part II: pp #16,18,20,28&30

Chapter 7 Answers
a) #1 shows little or no association
b) #4 shows a negative association
c) #2 & #4 each show a linear association
d) #3 shows a moderately strong, curved association
e) #2 shows a very strong association

Chapter 7 Answers a) b)0.736 c)0.951 d)-0.021

Chapter 7 Answers The researcher should have plotted the data first. A strong, curved relationship may have a very low correlation. In fact, correlation is only a useful measure of the strength of a linear relationship.

Chapter 7 Answers If the association between GDP and infant mortality is linear, a correlation of shows a moderate, negative association.

Chapter 7 Answers Continent is a categorical variable. Correlation measures the strength of linear associations between quantitative variables.

Chapter 7 Answers Correlation must be between -1 and 1, inclusive. Correlation can never be 1.22.

Chapter 7 Answers A correlation, no matter how strong, cannot prove a cause-and-effect relationship.

Chapter 8 Vocabulary 1) Regression to the mean – each predicted response value (y) tends to be closer to its mean (in standard deviations) than the corresponding explanatory value (x) is to its own mean

Chapter 8 Vocabulary
3) Residual – the difference between the actual response value and the predicted response value (actual minus predicted)
4) Overestimate – produces a negative residual
5) Underestimate – produces a positive residual
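The residual definitions can be sketched in a few lines. This is only an illustration; the function names `residual` and `classify` are hypothetical, and the sample numbers are made up.

```python
def residual(actual, predicted):
    """Residual = actual response value minus predicted response value."""
    return actual - predicted

def classify(res):
    """Describe what the sign of a residual says about the model's prediction."""
    if res > 0:
        return "underestimate"   # the model predicted too low
    if res < 0:
        return "overestimate"    # the model predicted too high
    return "exact"

# Made-up example: the model predicts 7 but the observed value is 10.
res = residual(actual=10, predicted=7)
print(res, classify(res))
```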

Chapter 8 Vocabulary
6) Slope – rate of change given in units of the response variable (y) per unit of the explanatory variable (x)
7) Intercept – response value when the explanatory value is zero
8) R² – Must also be interpreted when describing a regression model (aka Coefficient of Determination)

Chapter 8 Vocabulary
8) R² – Must also be interpreted when describing a regression model: “According to the linear model, _____% of the variability in _______ (response variable) is accounted for by variation in ________ (explanatory variable).” The remaining variation is due to the residuals.
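As a rough sketch of where R² comes from, it can be computed as one minus the ratio of residual variation to total variation in y, then dropped into the sentence template. The helper names and the sample data below are assumptions for illustration; only the 0.697 value and the height/foot size wording come from the slides.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys):
    """R^2 = 1 - (sum of squared residuals) / (total variation in y)."""
    slope, intercept = fit_line(xs, ys)
    my = mean(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - sse / sst

def interpret(r2, response, explanatory):
    """Fill in the interpretation template from the vocabulary slide."""
    return (f"According to the linear model, {r2:.1%} of the variability in "
            f"{response} is accounted for by variation in {explanatory}.")

# Hypothetical data: a noisy, roughly linear relationship.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(r_squared(xs, ys), 3))
print(interpret(0.697, "height", "foot size"))
```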

Chapter 8 Vocabulary
CONDITIONS FOR USING A LINEAR REGRESSION
1) Quantitative Variables – Check the variables
2) Straight Enough – Check the scatterplot first (should be nearly linear), then check the residual plot (should be random scatter)
3) Outlier Condition – Any outliers need to be investigated

Chapter 8 Vocabulary If you find a pattern in the Residual Plot, that means the residuals (errors) are predictable. If the residuals are predictable, then a better model exists: a LINEAR MODEL IS NOT APPROPRIATE.
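To see what a patterned residual plot looks like in numbers, here is a small sketch that fits a line to deliberately curved data (y = x², chosen only for illustration). The residuals come out positive, then negative, then positive again, which is the kind of predictable pattern that signals a linear model is not appropriate.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(xs, ys):
    """Residuals (actual minus predicted) from the least squares line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Curved data: y = x squared, so a straight line is the wrong model.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]

res = residuals(xs, ys)
print(res)  # U-shaped: positive at the ends, negative in the middle
```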


Did you say 2? Wrong. Try again. So what?

Important Note: The correlation is not given directly in this software package. You need to look in two places for it. Taking the square root of the “R squared” (coefficient of determination) is not enough. You must look at the sign of the slope too. Positive slope is a positive r-value. Negative slope is a negative r-value.

So here you should note that the slope is positive. The correlation will be positive too. Since R² is 0.482, r will be about +0.694 (the positive square root of 0.482).

So here you should note that the slope is negative. The correlation will be negative too. Since R² is 0.482, r will be about -0.694 (the negative square root of 0.482). Variables shown: S/F Ratio, Grad Rate
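The two-step lookup described above (square root of R², then the sign of the slope) can be sketched as follows. The function name `r_from_printout` is hypothetical; the positive slope 0.0786 is taken from the airfare example, and the negative slope is a made-up placeholder used only for its sign.

```python
import math

def r_from_printout(r_squared, slope):
    """Recover r: magnitude from sqrt(R^2), sign from the slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# Positive slope (airfare vs distance): r is positive.
print(round(r_from_printout(0.482, 0.0786), 3))

# Negative slope (placeholder value, only the sign matters): r is negative.
print(round(r_from_printout(0.482, -1.0), 3))
```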

Coefficient of Determination = (0.694)² = 0.4816

With the linear regression model, 48.2% of the variability in airline fares is accounted for by the variation in distance of the flight.

There is an increase of 7.86 cents for every additional mile. There is an increase of $7.86 for every additional 100 miles.
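Both statements describe the same slope in different units. A minimal sketch of the conversion, assuming the slope is reported as 0.0786 dollars per mile (which is what 7.86 cents per mile implies):

```python
# Slope in dollars per mile, inferred from "7.86 cents for every additional mile".
slope_dollars_per_mile = 0.0786

# Same slope expressed in cents per mile.
cents_per_mile = slope_dollars_per_mile * 100

# Same slope expressed in dollars per additional 100 miles.
dollars_per_100_miles = slope_dollars_per_mile * 100

print(f"{cents_per_mile:.2f} cents per mile")
print(f"${dollars_per_100_miles:.2f} per 100 miles")
```

The two numbers match because multiplying by 100 cents per dollar and multiplying by 100 miles scale the slope by the same factor.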


The model predicts a flight of zero miles will cost $ The airline may have built in an initial cost to pay for some of its expenses.

8. Using those estimates, draw the line on the scatterplot.

12. In general, a positive residual means 13. In general, a negative residual means

A linear model should be appropriate, because 1) the scatterplot shows a nearly linear form and 2) the residual plot shows random scatter.

The coefficient of determination is 0.482, so

$150 for a flight of about 700 miles seems low compared to the other fares.

“fare” is the response variable. Not all software will call it the dependent variable. Always look for “Constant” and what is listed beside it. In this printout the column is headed “Variable”, and “dist”, listed below “Constant”, is the explanatory variable.


Recall: For y = 3x + 1 the coefficient of x is ‘3’. For computer printouts this is the key column for your regression model. The “Coefficient” of the “Constant” is the y-intercept for the linear regression. The “Coefficient” of the variable “dist” is the slope for the linear regression.
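Reading a printout this way amounts to pulling two numbers out of the “Coefficient” column. The sketch below uses the slide's own reminder equation y = 3x + 1 as a stand-in printout; the dictionary layout and the function name `predict` are assumptions for illustration, not the format of any particular software package.

```python
# Stand-in for the "Coefficient" column of a printout, using y = 3x + 1.
# "Constant" holds the y-intercept; the variable's row holds the slope.
printout = {"Constant": 1.0, "x": 3.0}

def predict(table, variable, value):
    """Predicted response = intercept + slope * explanatory value."""
    intercept = table["Constant"]
    slope = table[variable]
    return intercept + slope * value

print(predict(printout, "x", 2.0))
```

With a real airfare printout, the same call would use the row for “dist” in place of "x".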

5. Predict the airfare for a 1000-mile flight.

R² doesn’t change, but the equation does.

= miles

8. Residual?

Chp 8 #17 R squared = 92.4% 17a. What is the correlation between tar and nicotine? (NOTE: scatterplot shows a strong positive linear association.)

Chp 8 #17 R squared = 92.4% 17b. What would you predict about the average nicotine content of cigarettes that are 2 standard deviations below average in tar content? Predicted z = r × (-2), with r = √0.924 ≈ 0.961. I would predict that the nicotine content would be about 1.92 standard deviations below the average.

Chp 8 #17 R squared = 92.4% 17c. If a cigarette is 1 standard deviation above average in nicotine content, what do you suspect is true about its tar content? Predicted z = r × 1, with r = √0.924 ≈ 0.961. I would predict that the tar content would be about 0.96 standard deviations above the average.
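The reasoning behind parts 17a to 17c can be sketched with the standardized form of the regression line, where the predicted z-score of one variable is r times the z-score of the other. The function name `predicted_z` is hypothetical; R² = 92.4% and the positive direction of the association are taken from the slides.

```python
import math

r_squared = 0.924

# 17a: the scatterplot shows a positive association, so take the positive root.
r = math.sqrt(r_squared)

def predicted_z(r, z_given):
    """Standardized prediction: predicted z-score = r * given z-score."""
    return r * z_given

# 17b: tar content 2 SDs below average -> predicted nicotine z-score.
z_nicotine = predicted_z(r, -2.0)

# 17c: nicotine content 1 SD above average -> predicted tar z-score.
z_tar = predicted_z(r, 1.0)

print(f"r = {r:.3f}, 17b: {z_nicotine:.3f} SDs, 17c: {z_tar:.3f} SDs")
```

Both predictions land closer to the mean than the given value, which is regression to the mean in action.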