Presentation is loading. Please wait.

Presentation is loading. Please wait.

BPS - 3rd Ed. Chapter 41 Scatterplots and Correlation.

Similar presentations


Presentation on theme: "BPS - 3rd Ed. Chapter 41 Scatterplots and Correlation."— Presentation transcript:

1 BPS - 3rd Ed. Chapter 41 Scatterplots and Correlation

2 BPS - 3rd Ed. Chapter 42 Variable (X) and Variable (Y) u Prior chapters  one variable at a time u This chapter  relationship between two variables u One variable is an “outcome”: response variable (Y) u The other variable is a “predictor”: explanatory variable (X) u Are X and Y related? X  Y?

3 BPS - 3rd Ed. Chapter 43 Question A study investigates whether the there is a relationship between gross domestic product and life expectancy: Which is the explanatory variable (X)? Which is the response variable (Y)? All other variables that may influence life expectancy are “lurking” and may confound the relation between X and Y. Are there lurking variables in this analysis?

4 BPS - 3rd Ed. Chapter 44 u This chapter considers the case in which both X and Y are quantitative variables u Bivariate data points (x i, y i ) are plotted on graph paper to form a scatterplot Scatterplot

5 BPS - 3rd Ed. Chapter 45 u X = percent of students taking SAT u Y = mean SAT verbal score u What is the relationship between X and Y? Example of a scatterplot

6 BPS - 3rd Ed. Chapter 46 Interpreting scatterplots u Form Can data be described by straight line? [Linearity] u Direction Does the line slope upward or downward v Positive association = above-average values of Y accompany above-average values of X (and vice versa) v Negative association = above-average values of Y accompany below-average values of X (and vice versa) u Strength Do data point adhere to imaginary line?

7 BPS - 3rd Ed. Chapter 47 Form [discuss]

8 BPS - 3rd Ed. Chapter 48 Strength and direction u Direction: positive, negative or flat u Strength: How closely does a non- horizontal straight line fit the points of a scatterplot? Close fitting  strong Loose fitting  weak

9 BPS - 3rd Ed. Chapter 49 Strength cannot be reliably judged visually u These two scatterplots are of the same data (they have the exact same correlation) u The second scatter plot looks like a stronger correlation, but this is an artifact of the axis scaling

10 BPS - 3rd Ed. Chapter 410 Correlation coefficient (r) u Let r denote the correlation coefficient u r is always between -1 and +1, inclusive u Sign of r denotes direction of association u Special values for r :  r = +1  all points on upward sloping line  r = -1  all points on downward sloping line  r = 0  no line or horizontal line  The closer r is to +1 or –1, the better the fit of points to the line

11 BPS - 3rd Ed. Chapter 411 Examples of Correlations u Husband’s versus Wife’s ages v r =.94 u Husband’s versus Wife’s heights v r =.36 u Professional Golfer’s Putting Success: Distance of putt in feet versus percent success v r = -.94

12 BPS - 3rd Ed. Chapter 412 Correlation Coefficient r u Data on variables X and Y for n individuals: x 1, x 2, …, x n and y 1, y 2, …, y n u Each variable has a mean and std dev:

13 BPS - 3rd Ed. Chapter 413 Correlation coefficient r The formula for r can be understood by converting data points to standardized scores: where

14 BPS - 3rd Ed. Chapter 414 Illustrative example (gdp_life.sav) Per Capita Gross Domestic Product and Average Life Expectancy for Countries in Western Europe Does GDP predict life expectancy?

15 BPS - 3rd Ed. Chapter 415 Illustrative example (gdp_life.sav) CountryPer Capita GDP (X)Life Expectancy (Y) Austria21.477.48 Belgium23.277.53 Finland20.077.32 France22.778.63 Germany20.877.17 Ireland18.676.39 Italy21.578.51 Netherlands22.078.15 Switzerland23.878.99 United Kingdom21.277.37

16 BPS - 3rd Ed. Chapter 416 Illustrative example (gdp_life.sav) Scatterplot

17 BPS - 3rd Ed. Chapter 417 Illustrative example (gdp_life.sav) xy 21.477.48-0.078-0.3450.027 23.277.531.097-0.282-0.309 20.077.32-0.992-0.5460.542 22.778.630.7701.1020.849 20.877.17-0.470-0.7350.345 18.676.39-1.906-1.7163.271 21.578.51-0.0130.951-0.012 22.078.150.3130.4980.156 23.878.991.4891.5552.315 21.277.37-0.209-0.4830.101 = 21.52 = 77.754 sum = 7.285 s x =1.532s y =0.795

18 BPS - 3rd Ed. Chapter 418 Illustrative example (gdp_life.sav)

19 BPS - 3rd Ed. Chapter 419 Interpretation of r Direction of association: positive or negative Strength of association: the closer |r| is to 1, the stronger the correlation. Here are guidelines: 0.0  |r| < 0.3  weak correlation 0.3  |r| < 0.7  moderate correlation 0.7  |r| < 1.0  strong correlation |r| = 1.0  perfect correlation

20 BPS - 3rd Ed. Chapter 420 Interpretation of r For GDP / life expectancy example, r = 0.809. This indicates a strong positive correlation

21 BPS - 3rd Ed. Chapter 421 Problems with Correlations u Not all relations are linear u Outliers can have large influence on r u Lurking variables confound relations

22 BPS - 3rd Ed. Chapter 422 Not all Relationships are Linear Miles per Gallon versus Speed u r  0 (flat line) u But there is a non- linear relation

23 BPS - 3rd Ed. Chapter 423 Not all Relationships are Linear Miles per Gallon versus Speed u Curved relationship. u r was misleading.

24 BPS - 3rd Ed. Chapter 424 Outliers and Correlation The outlier in the above graph decreases r If we remove the outlier  strong relation

25 BPS - 3rd Ed. Chapter 425 Exercise 4.15: Calories and sodium content of hot dogs (a) What are the lowest and highest calorie counts? …lowest and highest sodium levels? (b) Positive or negative association? (c) Any outliers? If we ignore outlier,is relation still linear? Does the correlation become stronger?

26 BPS - 3rd Ed. Chapter 426 Exercise 4.13: IQ and school grades (a) Positive or negative association? (b) Is form linear? Does it appear strong? (c) What is the IQ and GPA for the outlier on the bottom there?


Download ppt "BPS - 3rd Ed. Chapter 41 Scatterplots and Correlation."

Similar presentations


Ads by Google