Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. Relationships Between Quantitative Variables Chapter 5.

Slides:



Advertisements
Similar presentations
Regression BPS chapter 5 © 2006 W.H. Freeman and Company.
Advertisements

Chapter 4 The Relation between Two Variables
Agresti/Franklin Statistics, 1 of 52 Chapter 3 Association: Contingency, Correlation, and Regression Learn …. How to examine links between two variables.
Warm up Use calculator to find r,, a, b. Chapter 8 LSRL-Least Squares Regression Line.
Chapter 8 Linear Regression © 2010 Pearson Education 1.
AP Statistics Chapters 3 & 4 Measuring Relationships Between 2 Variables.
1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. Summarizing Bivariate Data Introduction to Linear Regression.
LSRL Least Squares Regression Line
Relationships Between Quantitative Variables
Describing the Relation Between Two Variables
Fall 2006 – Fundamentals of Business Statistics 1 Chapter 13 Introduction to Linear Regression and Correlation Analysis.
Linear Regression and Correlation Analysis
Math 227 Elementary Statistics Math 227 Elementary Statistics Sullivan, 4 th ed.
CHAPTER 3 Describing Relationships
Ch 2 and 9.1 Relationships Between 2 Variables
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.
Chapter 5 Regression. Chapter 51 u Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). u We.
Linear Regression.
ASSOCIATION: CONTINGENCY, CORRELATION, AND REGRESSION Chapter 3.
 Chapter 7 Scatterplots, Association, and Correlation.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
1 Chapter 10 Correlation and Regression 10.2 Correlation 10.3 Regression.
1 © 2008 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 5 Summarizing Bivariate Data.
Chapter 3 Section 3.1 Examining Relationships. Continue to ask the preliminary questions familiar from Chapter 1 and 2 What individuals do the data describe?
Chapters 8 & 9 Linear Regression & Regression Wisdom.
Regression BPS chapter 5 © 2010 W.H. Freeman and Company.
Relationships If we are doing a study which involves more than one variable, how can we tell if there is a relationship between two (or more) of the.
Copyright ©2011 Brooks/Cole, Cengage Learning Inference about Simple Regression Chapter 14 1.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 3 Describing Relationships 3.2 Least-Squares.
CHAPTER 5 Regression BPS - 5TH ED.CHAPTER 5 1. PREDICTION VIA REGRESSION LINE NUMBER OF NEW BIRDS AND PERCENT RETURNING BPS - 5TH ED.CHAPTER 5 2.
Chapter 2 Examining Relationships.  Response variable measures outcome of a study (dependent variable)  Explanatory variable explains or influences.
Copyright © 2010 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation.
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Linear Regression Day 1 – (pg )
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 3 Association: Contingency, Correlation, and Regression Section 3.3 Predicting the Outcome.
Ch 5 Relationships Between Quantitative Variables (pg 150) –Will use 3 tools to describe, picture, and quantify 1) scatterplot 2) correlation 3) regression.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Chapter 10 Correlation and Regression 10-2 Correlation 10-3 Regression.
CHAPTER 3 Describing Relationships
1 Regression Line Part II Class Class Objective After this class, you will be able to -Evaluate Regression and Correlation Difficulties and Disasters.
Chapters 8 Linear Regression. Correlation and Regression Correlation = linear relationship between two variables. Summarize relationship with line. Called.
Unit 3 – Association: Contingency, Correlation, and Regression Lesson 3-3 Linear Regression, Residuals, and Variation.
Regression Analysis Presentation 13. Regression In Chapter 15, we looked at associations between two categorical variables. We will now focus on relationships.
Describing Relationships. Least-Squares Regression  A method for finding a line that summarizes the relationship between two variables Only in a specific.
Statistics 200 Lecture #6 Thursday, September 8, 2016
CHAPTER 7 LINEAR RELATIONSHIPS
Statistics 200 Lecture #5 Tuesday, September 6, 2016
Chapter 5 LSRL.
LSRL Least Squares Regression Line
Chapter 4 Correlation.
Lecture Notes The Relation between Two Variables Q Q
CHAPTER 3 Describing Relationships
Elementary Statistics: Looking at the Big Picture
Unit 4 Vocabulary.
Least-Squares Regression
Chapter 3 Describing Relationships Section 3.2
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Correlation and Regression
Least-Squares Regression
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Warmup A study was done comparing the number of registered automatic weapons (in thousands) along with the murder rate (in murders per 100,000) for 8.
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Homework: pg. 180 #6, 7 6.) A. B. The scatterplot shows a negative, linear, fairly weak relationship. C. long-lived territorial species.
CHAPTER 3 Describing Relationships
Honors Statistics Review Chapters 7 & 8
CHAPTER 3 Describing Relationships
Presentation transcript:

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. Relationships Between Quantitative Variables Chapter 5

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 2 Three Tools we will use … Scatterplot, a two-dimensional graph of data values Correlation, a statistic that measures the strength and direction of a linear relationship Regression equation, an equation that describes the average relationship between a response and explanatory variable

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc Looking for Patterns with Scatterplots Questions to Ask about a Scatterplot What is the average pattern? Does it look like a straight line or is it curved? What is the direction of the pattern? How much do individual points vary from the average pattern? Are there any unusual data points?

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 4 Positive/Negative Association Two variables have a positive association when the values of one variable tend to increase as the values of the other variable increase. Two variables have a negative association when the values of one variable tend to decrease as the values of the other variable increase.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 5 Example 5.1 Height and Handspan Data shown are the first 12 observations of a data set that includes the heights (in inches) and fully stretched handspans (in centimeters) of 167 college students.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 6 Example 5.1 Height and Handspan Taller people tend to have greater handspan measurements than shorter people do. When two variables tend to increase together, we say that they have a positive association. The handspan and height measurements may have a linear relationship.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 7 Example 5.2 Driver Age and Maximum Legibility Distance of Highway Signs A research firm determined the maximum distance at which each of 30 drivers could read a newly designed sign. The 30 participants in the study ranged in age from 18 to 82 years old. We want to examine the relationship between age and the sign legibility distance.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 8 Example 5.2 Driver Age and Maximum Legibility Distance of Highway Signs We see a negative association with a linear pattern. We will use a straight-line equation to model this relationship.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 9 Example 5.3 The Development of Musical Preferences The 108 participants in the study ranged in age from 16 to 86 years old. We want to examine the relationship between song-specific age (age in the year the song was popular) and musical preference (positive score => above average, negative score => below average).

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 10 Example 5.3 The Development of Musical Preferences Popular music preferences acquired in late adolescence and early adulthood. The association is nonlinear.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 11 Groups and Outliers Use different plotting symbols or colors to represent different subgroups. Look for outliers: points that have an usual combination of data values.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc Describing Linear Patterns with a Regression Line Two purposes of the regression line: to estimate the average value of y at any specified value of x to predict the value of y for an individual, given that individual’s x value When the best equation for describing the relationship between x and y is a straight line, the equation is called the regression line.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 13 Example 5.1 Height and Handspan (cont) Regression equation: Handspan = Height Estimate the average handspan for people 60 inches tall: Average handspan = (60) = 18 cm. Predict the handspan for someone who is 60 inches tall: Predicted handspan = (60) = 18 cm.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 14 Example 5.1 Height and Handspan (cont) Regression equation: Handspan = Height Slope = 0.35 => Handspan increases by 0.35 cm, on average, for each increase of 1 inch in height. In a statistical relationship, there is variation from the average pattern.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 15 The Equation for the Regression Line is spoken as “y-hat,” and it is also referred to either as predicted y or estimated y. b 0 is the intercept of the straight line. The intercept is the value of y when x = 0. b 1 is the slope of the straight line. The slope tells us how much of an increase (or decrease) there is for the y variable when the x variable increases by one unit. The sign of the slope tells us whether y increases or decreases when x increases.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 16 Regression equation: Distance = Age Example 5.2 Driver Age and Maximum Legibility Distance of Highway Signs (cont) Estimate the average distance for 20-year-old drivers: Average distance = 577 – 3(20) = 517 ft. Predict the legibility distance for a 20-year-old driver: Predicted distance = 577 – 3(20) = 517 ft. Slope of –3 tells us that, on average, the legibility distance decreases 3 feet when age increases by one year

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 17 Extrapolation Usually a bad idea to use a regression equation to predict values far outside the range where the original data fell. No guarantee that the relationship will continue beyond the range for which we have observed data.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 18 Prediction Errors and Residuals Prediction Error = difference between the observed value of y and the predicted value. Residual =

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 19 Regression equation: = 577 – 3x Example 5.2 Driver Age and Maximum Legibility Distance of Highway Signs (cont) Can compute the residual for all 30 observations. Positive residual => observed value higher than predicted. Negative residual => observed value lower than predicted. x = Agey = DistanceResidual – 3(18)= – 523 = – 3(20)= – 517 = – 3(22)= – 511 = 5

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 20 Least Squares Line and Formulas Least Squares Regression Line: minimizes the sum of squared prediction errors. SSE = Sum of squared prediction errors. Formulas for Slope and Intercept:

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc Measuring Strength and Direction with Correlation The strength of the relationship is determined by the closeness of the points to a straight line. The direction is determined by whether one variable generally increases or generally decreases when the other variable increases. Correlation r indicates the strength and the direction of a straight-line relationship.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 22 Interpretation of r and a Formula r is always between –1 and +1 magnitude indicates the strength r = –1 or +1 indicates a perfect linear relationship sign indicates the direction r = 0 indicates a slope of 0 so knowing x does not change the predicted value of y Formula for correlation:

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 23 Example 5.1 Height and Handspan (cont) Regression equation: Handspan = Height Correlation r = => a somewhat strong positive linear relationship.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 24 Regression equation: Distance = Age Example 5.2 Driver Age and Maximum Legibility Distance of Highway Signs (cont) Correlation r = -0.8 => a somewhat strong negative linear association.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 25 Example 5.6 Left and Right Handspans If you know the span of a person’s right hand, can you accurately predict his/her left handspan? Correlation r = => a very strong positive linear relationship.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 26 Example 5.7 Verbal SAT and GPA Grade point averages (GPAs) and verbal SAT scores for a sample of 100 university students. Correlation r = => a moderately strong positive linear relationship.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 27 Example 5.8 Age and Hours of TV Viewing Relationship between age and hours of daily television viewing for 1913 survey respondents. Correlation r = 0.12 => a weak connection. Note: a few claimed to watch more than 20 hours/day!

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 28 Example 5.9 Hours of Sleep and Hours of Study Relationship between reported hours of sleep the previous 24 hours and the reported hours of study during the same period for a sample of 116 college students. Correlation r = –0.36 => a not too strong negative association.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 29 Interpretation of r 2 and a formula Squared correlation r 2 is between 0 and 1 and indicates the proportion of variation in the response explained by x. SSTO = sum of squares total = sum of squared differences between observed y values and. SSE = sum of squared errors (residuals) = sum of squared differences between observed y values and predicted values based on least squares line.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 30 Interpretation of r 2 Example 5.6: Left and Right Handspans r 2 = 0.90 => span of one hand is very predictable from span of other hand. Example 5.8: TV viewing and Age r 2 = => only about 1.4% knowing a person’s age doesn’t help much in predicting amount of daily TV viewing.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 31 Example 5.6: Left and Right Handspans

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc Why the Answers May Not Make Sense Allowing outliers to overly influence the results Combining groups inappropriately Using correlation and a straight-line equation to describe curvilinear data

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 33 Example 5.4 Height and Foot Length (cont) Regression equation uncorrected data: height corrected data: height Correlation uncorrected data: r = 0.28 corrected data: r = 0.69 Three outliers were data entry errors.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 34 Example 5.10 Earthquakes in US Correlation all data: r = 0.73 w/o SF: r = –0.96 San Francisco earthquake of 1906.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 35 Example 5.11 Height and Lead Feet Scatterplot of all data: College student heights and responses to the question “What is the fastest you have ever driven a car?” Scatterplot by gender: Combining two groups led to illegitimate correlation

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 36 Example 5.12 Don’t Predict without a Plot Correlation: r = 0.96 Regression Line: population = – (Year) Poor Prediction for Year 2030 = – (2030) or about 255 million, only 6 million more than Population of US (in millions) for each census year between 1790 and 1990.

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc Correlation Does Not Prove Causation 1.Causation 2.Confounding Factors Present 3.Explanatory and Response are both affected by other variables 4.Response variable is causing a change in the explanatory variable Interpretations of an Observed Association

Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. 38 Case Study 5.1 A Weighty Issue Relationship between Actual and Ideal Weight Females Males