Unit 4 Lesson 2 (5.2) Summarizing Bivariate Data 5.2: Correlation.

1 Unit 4 Lesson 2 (5.2) Summarizing Bivariate Data 5.2: Correlation

2 Describe the Scatterplot. Data collected from students in Statistics classes included their heights (in inches) and weights (in pounds). [Scatterplot of the heights and weights] Here we see a moderate, positive, linear association with an outlier. Wouldn’t it be nice if we could quantify the strength? (Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley)

3 Correlation Coefficient (r): A quantitative assessment of the strength and direction of the linear relationship in bivariate, quantitative data. Pearson’s sample correlation is used the most. Equation: $r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right)$. What are the values in parentheses called? These are the z-scores for x and y.

4 Example 5.1 For the six primarily undergraduate universities in California with enrollments between 10,000 and 20,000, six-year graduation rates (y) and student-related expenditures per full-time student (x) for 2003 were reported as follows:
Expenditures (x): 8011, 7323, 8735, 7548, 7071, 8248
Graduation rates (y): 64.6, 53.0, 46.3, 42.5, 38.5, 33.9
Create a scatterplot and calculate r.

5 Example 5.1 Continued
Expenditures (x): 8011, 7323, 8735, 7548, 7071, 8248
Graduation rates (y): 64.6, 53.0, 46.3, 42.5, 38.5, 33.9
[Scatterplot: Graduation Rates versus Expenditures] r = 0.05. In order to interpret what this number tells us, let’s investigate the properties of the correlation coefficient.
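The calculation in Example 5.1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `pearson_r` and the variable names are assumptions, and the code simply averages the products of z-scores using a divisor of n - 1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation: sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sample standard deviations (divisor n - 1)
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # One z-score product per (x, y) pair
    z_products = [((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Data from Example 5.1
expenditures = [8011, 7323, 8735, 7548, 7071, 8248]
graduation_rates = [64.6, 53.0, 46.3, 42.5, 38.5, 33.9]

r = pearson_r(expenditures, graduation_rates)
print(round(r, 2))  # 0.05
```

A graphing calculator or a statistics package should report the same value, up to rounding.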

6 Properties of r (correlation coefficient) 1) Legitimate values are -1 ≤ r ≤ 1. [Figure: scale of r values labeled No Correlation, Weak correlation, Moderate Correlation, Strong correlation]

7 Handout: Correlation

8 2) Value of r is not changed by any linear transformation (adding a constant or multiplying by a positive constant; a negative multiplier reverses the sign of r). Suppose that the graduation rates were changed from percents to decimals (divide by 100). Transform the graduation rates and calculate r. Then do the following transformations and calculate r: 1) x’ = 5(x + 14) 2) y’ = (y + 30) ÷ 4
Expenditures (x): 8011, 7323, 8735, 7548, 7071, 8248
Graduation rates (y): 64.6, 53.0, 46.3, 42.5, 38.5, 33.9
r = 0.05. It is the same! Why?
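One way to check property 2 numerically is sketched below. The helper `pearson_r` and all variable names are illustrative assumptions; the helper uses the algebraically equivalent form r = Sxy / sqrt(Sxx · Syy) rather than explicit z-scores.

```python
from math import sqrt, isclose

def pearson_r(xs, ys):
    """Sample correlation via r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

expenditures = [8011, 7323, 8735, 7548, 7071, 8248]
graduation_rates = [64.6, 53.0, 46.3, 42.5, 38.5, 33.9]
r_original = pearson_r(expenditures, graduation_rates)

# Percents to decimals: divide each graduation rate by 100
r_decimals = pearson_r(expenditures, [y / 100 for y in graduation_rates])

# The two transformations from the slide, applied together
x_new = [5 * (x + 14) for x in expenditures]
y_new = [(y + 30) / 4 for y in graduation_rates]
r_transformed = pearson_r(x_new, y_new)

# A negative multiplier keeps the magnitude of r but flips its sign
r_negated = pearson_r([-x for x in expenditures], graduation_rates)

print(isclose(r_original, r_decimals, abs_tol=1e-9))
print(isclose(r_original, r_transformed, abs_tol=1e-9))
print(isclose(r_negated, -r_original, abs_tol=1e-9))
```

All three comparisons should hold. The reason is that z-scores are unchanged when a constant is added or the data are rescaled by a positive constant, and r is built entirely from z-scores.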

9 3) Value of r does not depend on which of the two variables is labeled x. Suppose we wanted to estimate the expenditures per student for given graduation rates. Switch x and y, then calculate r.
Expenditures: 8011, 7323, 8735, 7548, 7071, 8248
Graduation rates: 64.6, 53.0, 46.3, 42.5, 38.5, 33.9
r = 0.05. It is the same!
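Property 3 can be checked the same way. In this sketch (the helper name `pearson_r` is an assumption, not from the slides) the two lists are simply passed in the opposite order; the formula is symmetric in x and y, so the result should not change.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation via r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

expenditures = [8011, 7323, 8735, 7548, 7071, 8248]
graduation_rates = [64.6, 53.0, 46.3, 42.5, 38.5, 33.9]

# Expenditures as x, graduation rates as y
r_xy = pearson_r(expenditures, graduation_rates)
# Roles switched: graduation rates as x, expenditures as y
r_yx = pearson_r(graduation_rates, expenditures)

print(round(r_xy, 2), round(r_yx, 2))  # 0.05 0.05
```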

10 4) Value of r is affected by extreme values.
Expenditures (x): 8011, 7323, 8735, 7548, 7071, 8248
Graduation rates (y): 64.6, 53.0, 46.3, 42.5, 38.5, 33.9
Suppose the 33.9 was REALLY 63.9. What do you think would happen to the value of the correlation coefficient? Plot a revised scatterplot and find r. [Scatterplots: Graduation Rates versus Expenditures, original and revised] r = 0.42. Extreme values affect the correlation coefficient.
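The effect of changing a single value can be reproduced with a short sketch. The names `pearson_r`, `original_rates`, and `revised_rates` are illustrative assumptions; only the last graduation rate differs between the two lists.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation via r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

expenditures = [8011, 7323, 8735, 7548, 7071, 8248]
original_rates = [64.6, 53.0, 46.3, 42.5, 38.5, 33.9]
# Same data, except the last value 33.9 is replaced by 63.9
revised_rates = original_rates[:-1] + [63.9]

r_before = pearson_r(expenditures, original_rates)
r_after = pearson_r(expenditures, revised_rates)
print(round(r_before, 2), round(r_after, 2))  # 0.05 0.42
```

With only six observations, one changed point moves r from about 0.05 to about 0.42, which is why it is worth inspecting the scatterplot before trusting the number.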

11 5) Value of r is a measure of the extent to which x and y are linearly related. Find the correlation for these points:
x: -3, -1, 1, 3, 5, 7, 9
y: 40, 20, 8, 4, 8, 20, 40
Compute the correlation coefficient. Sketch the scatterplot. r = 0. Does this mean that there is NO relationship between these points? r = 0, but the data set has a definite relationship!
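The sketch below computes r for these seven points and also checks one curve that fits them exactly. The observation that every point satisfies y = (x - 3)² + 4 is not stated on the slide; it was worked out from the listed values, and the helper name `pearson_r` is likewise an assumption.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation via r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-3, -1, 1, 3, 5, 7, 9]
ys = [40, 20, 8, 4, 8, 20, 40]

r = pearson_r(xs, ys)
# No linear association at all
print(abs(r) < 1e-12)  # True

# Yet the points lie exactly on a parabola: y = (x - 3)^2 + 4
on_parabola = all(y == (x - 3) ** 2 + 4 for x, y in zip(xs, ys))
print(on_parabola)  # True
```

The positive and negative products cancel because the points are symmetric about x = 3, so r measures only the linear part of a relationship.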

12 Recap the Properties of r:
1. Legitimate values of r are -1 ≤ r ≤ 1
2. Value of r is not changed by any linear transformation
3. Value of r does not depend on which of the two variables is labeled x
4. Value of r is affected by extreme values
5. Value of r is a measure of the extent to which x and y are linearly related

13 Example 5.1 Continued
Expenditures (x): 8011, 7323, 8735, 7548, 7071, 8248
Graduation rates (y): 64.6, 53.0, 46.3, 42.5, 38.5, 33.9
Interpret r = 0.05. In order to interpret r, recall the definition of the correlation coefficient: a quantitative assessment of the strength and direction of the linear relationship between bivariate, quantitative data. There is a weak, positive, linear relationship between expenditures and graduation rates.

14 Does a value of r close to 1 or -1 mean that a change in one variable causes a change in the other variable? Consider the following examples: The relationship between the number of cavities in a child’s teeth and the size of his or her vocabulary is strong and positive. So does this mean I should feed children more candy to increase their vocabulary? No: these variables are both strongly related to the age of the child. Consumption of hot chocolate is negatively correlated with crime rate. Should we all drink more hot chocolate to lower the crime rate? No: both are responses to cold weather. Causality can only be shown by carefully controlling values of all variables that might be related to the ones under study; in other words, with a well-controlled, well-designed experiment.

16 Correlation does not imply causation

17 Handout: Quiz A-Chapter 7

18 Homework Pg.163: #5.9-5.11, 5.14, 5.16-5.18

