
1 Chapter 10, Part 2 Linear Regression

2 Predictions with Scatterplots. Last Time: A scatterplot gives a picture of the relationship between two quantitative variables. One variable is explanatory, and the other is the response. Today: If we know the value of the explanatory variable, can we predict the value of the response variable?

The Regression Line To make predictions, we’ll find a straight line that is the “best fit” for the points in the scatterplot. This is not so simple…

Regression Line in JMP Start by making a scatterplot. Red Triangle menu -> “Fit Line.” The equation of the regression line appears under the “Linear Fit” group. JMP uses column headings as variable names (instead of x and y), so a fit from the Cars 1993 file takes the form MaxPrice = intercept + slope*MinPrice, with the numeric intercept and slope shown in the output.
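Outside of JMP, the same “Fit Line” step can be sketched in a few lines of Python. This is a minimal sketch using made-up MinPrice/MaxPrice numbers (in $1000s), not the actual Cars 1993 values:

```python
# Sketch of what JMP's "Fit Line" computes, using NumPy.
# The data below are hypothetical, not the real Cars 1993 values.
import numpy as np

min_price = np.array([12.0, 14.5, 15.0, 18.9, 22.0, 25.0])  # $1000s
max_price = np.array([14.0, 18.0, 19.5, 23.0, 27.5, 31.0])  # $1000s

# np.polyfit with degree 1 returns [slope, intercept] of the
# least-squares line through the points.
slope, intercept = np.polyfit(min_price, max_price, 1)
print(f"MaxPrice = {intercept:.2f} + {slope:.2f} * MinPrice")
```

The printed equation plays the same role as the line under JMP’s “Linear Fit” group: it is the formula used for all later predictions.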

Predicted Values We use the equation of the regression line to make predictions about… Individuals not in the original data set. Later measurements of the same individuals. Example: In 1994, a vehicle had a Min. Price of $15,000. Use the previous data to predict the Max. Price. You can do this by hand: substitute MinPrice = 15 into the fitted equation and evaluate.
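Substituting into the fitted equation is one line of arithmetic. The intercept and slope below are placeholders standing in for the values JMP reports, not the actual Cars 1993 coefficients:

```python
# Predicting by hand from a fitted line. The coefficients are
# hypothetical placeholders for the values under "Linear Fit".
intercept = 2.0   # hypothetical intercept, in $1000s
slope = 1.2       # hypothetical slope

min_price = 15.0  # the 1994 vehicle's Min. Price, in $1000s
predicted_max = intercept + slope * min_price
print(f"Predicted Max. Price: {predicted_max:.1f} thousand dollars")
```

Whatever the real coefficients are, the recipe is the same: plug the new explanatory value into the equation and evaluate.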

Are the Predictions Useful? In some cases, the regression line is more useful for predicting values than in others. Consider the following examples (from Cars 1993):

7 Coefficient of Determination If the scatterplot is well-approximated by a straight line, the regression equation is more useful for making predictions. Correlation is one measure of this. The square of the correlation has a more intuitive meaning: What proportion of variation in the Response Variable is explained by variation in the Explanatory Variable? JMP: “RSquare” under “Summary of Fit”

Coefficient of Determination In predicting Max. Price from Min. Price, we had RSquare ≈ 0.82: about 82% of the variation in Max. Price is explained by variation in Min. Price. In predicting Highway MPG from Engine Size, we have RSquare ≈ 0.39: only 39% of the variation in Highway MPG is explained by variation in Engine Size.

Coefficient of Determination RSquare takes values from 0 to 1. For values close to 0, the regression line is not very useful for predictions. For values close to 1, the regression line is more useful for making predictions. RSquare makes no distinction between positive and negative association of variables.
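For a simple linear fit, RSquare is just the square of the correlation r, which is why it cannot distinguish positive from negative association. A small check on made-up data:

```python
# RSquare for a straight-line fit equals r squared, so it ignores
# the sign of the association. Data are made up for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = np.array([2.1, 3.9, 6.2, 8.0, 9.8])  # positive association
y_down = -y_up                              # same pattern, negative slope

r_up = np.corrcoef(x, y_up)[0, 1]           # close to +1
r_down = np.corrcoef(x, y_down)[0, 1]       # close to -1

# The two RSquare values are identical even though the slopes
# point in opposite directions.
print(r_up**2, r_down**2)
```

Both fits report the same RSquare, so the scatterplot (or the sign of the slope) is still needed to tell the direction of the association.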

10 Residuals For each individual in the data set, we can compute the difference (error) between the actual and predicted values of the response variable. This difference is called a residual: Residual = (actual value) – (predicted value) In JMP: Click the red triangle by “Linear Fit” and select “Save Residuals” from the drop-down menu. You can also “Plot Residuals.”
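The same residuals that JMP saves can be computed directly from the definition. A sketch on made-up data:

```python
# Residual = (actual value) - (predicted value), one per data point.
# Data are made up for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 1.9, 3.2, 3.9])          # actual response values

slope, intercept = np.polyfit(x, y, 1)      # least-squares line
predicted = intercept + slope * x
residuals = y - predicted
print(residuals)

# A property of the least-squares line: the residuals sum to
# (essentially) zero.
print(residuals.sum())
```

Plotting these residuals against x is the same picture JMP’s “Plot Residuals” produces, and it is a quick check on whether a straight line is a sensible model.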

11 How does JMP find the Regression Line? JMP uses the most popular method, Ordinary Least Squares (OLS). To measure how well a given line fits the data: Compute all the residuals and square each one. Add up the results to get a “total error.” The closer this total is to zero, the better the line fits the data. OLS chooses the line with the smallest “total error.” (Thankfully) JMP takes care of the details.
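The line with the smallest total squared error has a well-known closed-form solution: slope = Sxy / Sxx and intercept = ȳ − slope·x̄, where Sxy and Sxx are sums of cross-products and squared deviations. A quick check on made-up data that this formula matches what a library fit returns:

```python
# The OLS line minimizing the sum of squared residuals has a
# closed form: slope = Sxy/Sxx, intercept = ybar - slope*xbar.
# Data are made up for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 2.9, 4.2, 4.8, 6.1])

xbar, ybar = x.mean(), y.mean()
slope = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
intercept = ybar - slope * xbar

# Compare against NumPy's least-squares fit -- they should agree.
np_slope, np_intercept = np.polyfit(x, y, 1)
print(slope, intercept)
print(np_slope, np_intercept)
```

This is the calculation JMP performs behind “Fit Line”; no search over candidate lines is needed because the minimizing line can be written down directly.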

12 Limitations of Correlation and Linear Regression: Both describe linear relationships only. Both are sensitive to outliers. Beware of extrapolation: predicting outside of the given range of the explanatory variable. Beware of lurking variables: other factors that may explain a strong correlation. Correlation does not imply causality!

13 Beware Extrapolation! A child’s height was plotted against her age... Can you predict her height at age 8 (96 months)? Can you predict her height at age 30 (360 months)?

14 Beware Extrapolation! Regression line: ŷ = a + bx, fitted to the childhood data. Height at 96 months? y = 94.93 cm (3′ 6″) Height at 360 months? y = 209.8 cm (6′ 10″) Height at birth (x = 0)? y = 71.95 cm (2′ 4″)

Beware Lurking Variables! Although there may be a strong correlation (statistical relationship) between two variables, there might not be a direct practical (cause-and-effect) relationship. A lurking variable is a third variable (not in the scatterplot) that might cause the apparent relationship between explanatory and response variables.

Example: Pizza vs. Subway Fare A scatterplot of these data shows a strong correlation (0.9878) between the cost of a slice of pizza and the cost of a subway fare. Q: Does the price of pizza affect the price of the subway?

17 Caution: Correlation Does Not Imply Causation In a study of emergency services, it was noted that larger fires tend to have more firefighters present. Suppose we used: –Explanatory Variable: Number of firefighters –Response Variable: Size of the fire We would expect a strong correlation. But it’s ludicrous to conclude that having more firefighters present causes the fire to be larger.