Published by Rose Nicholson. Modified over 9 years ago.
1
Correlation: A statistic to describe the relationship between variables
[Figure: three scatterplots, each with axes labeled Hours Worked and Pay]
2
Univariate vs. Bivariate Statistics
Univariate analyses/graphical representations:
* Frequency histograms
* Measures of central tendency and variability
* Z-scores
Bivariate analyses/graphical representations:
* Scatterplots
* Correlation
How can we define correlation?
* A linear pattern of relationship between one variable (x) and another variable (y)
* An association between two variables
* The relative position of one variable correlates with the relative distribution of another variable
3
Correlations allow us to look for evidence of a relationship between variables.
4
Correlations can vary in strength
5
Correlations can vary in direction
So, how do we QUANTIFY a correlation? We need to come up with a NUMBER that reflects both the strength and direction of the correlation.
[Figure: scatterplot with an axis labeled "flu shots given"]
6
Correlation finds the strength and direction of the best fitting line to the data.
$$ r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}} $$
The number we calculate in statistics is called the correlation coefficient. Developed by Karl Pearson, it is also sometimes referred to as Pearson’s r.
7
Example Calculation: the following data represent the number of emergency room visits per year (x) and cigarettes smoked a day (y) by three individuals recruited from New York Methodist Hospital.
       x     y     x^2   y^2   xy
       2     4     4     16    8
       3     5     9     25    15
       7     6     49    36    42
Sum    12    15    62    77    65
$$ r = \frac{65 - \frac{(12)(15)}{3}}{\sqrt{\left[62 - \frac{(12)^2}{3}\right]\left[77 - \frac{(15)^2}{3}\right]}} = \frac{5}{\sqrt{(14)(2)}} = 0.94 $$
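The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch of the raw-score formula, not code from the lecture; the function name `pearson_r` is an assumption made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r via the raw-score (computational) formula.

    The name and signature are illustrative, not from the lecture.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator


# Data from the example: ER visits per year (x), cigarettes per day (y).
x = [2, 3, 7]
y = [4, 5, 6]
r = pearson_r(x, y)
print(round(r, 2))  # 0.94
```

The intermediate values match the slide: the numerator is 65 - 60 = 5 and the denominator is the square root of 14 * 2.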
8
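Another way to think of the correlation: the sum of the products of the Z-scores for each pair of scores, divided by n - 1.
$$ r = \frac{\sum (Z_x Z_y)}{n - 1} $$
x     y     Zx      Zy
2     4     -.76    -1
3     5     -.38    0
7     6     1.13    1
Here x has mean 4 and standard deviation 2.65, and y has mean 5 and standard deviation 1.
If x = 2, Zx = (2 - 4)/2.65 = -.76 … If y = 4, Zy = (4 - 5)/1 = -1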
9
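Another way to think of the correlation: the sum of the products of the Z-scores for each pair of scores, divided by n - 1.
$$ r = \frac{\sum (Z_x Z_y)}{n - 1} $$
x     y     Zx      Zy    ZxZy
2     4     -.76    -1    .76
3     5     -.38    0     0
7     6     1.13    1     1.13
Sum                       1.89
r = 1.89 / 2 = .945, or about .95. The small difference from the 0.94 obtained in the example calculation comes from rounding the Z-scores to two decimals; with unrounded Z-scores the result is 0.9449.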
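The Z-score version can be sketched in Python as well. This is an illustrative check rather than the lecture's own code; the helper name `z_scores` is an assumption, and `statistics.stdev` is used because it computes the sample standard deviation (dividing by n - 1), which is what the slide's 2.65 corresponds to.

```python
from statistics import mean, stdev

# Data from the example: ER visits per year (x), cigarettes per day (y).
x = [2, 3, 7]
y = [4, 5, 6]


def z_scores(values):
    """Convert raw scores to Z-scores using the sample SD (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


zx = z_scores(x)
zy = z_scores(y)

# Sum the products of the paired Z-scores, then divide by n - 1.
r = sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)
print(round(r, 4))  # 0.9449
```

Because no intermediate rounding happens here, the result agrees with the raw-score formula rather than with the hand-rounded .945.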
10
Interpreting the Pearson r
* Range of values: -1.0 to +1.0
* Direction from the sign
  negative => anticorrelated: as one variable goes up, the other goes down in value.
  positive => correlated: as one variable goes up, so does the other.
* Strength from the magnitude
  | r | = 1.0: perfect relationship
  | r | = 0.0: no evidence of a linear relationship
  0.0 < | r | < 1.0: intermediate strength relationship
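These interpretation rules are simple enough to write down as a small function. The sketch below is only an illustration of the sign and magnitude rules on this slide; the function name `describe_r` and its return labels are hypothetical.

```python
def describe_r(r):
    """Return (direction, strength) for a correlation coefficient r.

    Hypothetical helper: the labels follow the slide's three strength cases.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and +1.0")

    # Direction comes from the sign.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    # Strength comes from the magnitude |r|.
    magnitude = abs(r)
    if magnitude == 1.0:
        strength = "perfect relationship"
    elif magnitude == 0.0:
        strength = "no evidence of relationship"
    else:
        strength = "intermediate strength relationship"

    return direction, strength


print(describe_r(0.94))
print(describe_r(-1.0))
```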
12
When NOT to use a correlation:
* Extreme scores (example plot: r = .97)
* Non-linear relationships (example plot: r = .20)
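Both problems are easy to reproduce with small data sets. The sketch below uses made-up numbers chosen for illustration, not the data behind the plots on this slide, and the function name `pearson_r` is an assumption: one extreme score turns a weak negative correlation into a strong positive one, and a perfect but curved relationship (y = x squared) yields an r of zero.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: four clustered points with a weak negative trend.
cluster_x = [1, 2, 3, 4]
cluster_y = [2, 1, 2, 1]
r_cluster = pearson_r(cluster_x, cluster_y)

# The same points plus one extreme score at (20, 20).
r_with_outlier = pearson_r(cluster_x + [20], cluster_y + [20])

# Made-up data: a perfect but non-linear relationship, y = x squared.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [v * v for v in curve_x]
r_curve = pearson_r(curve_x, curve_y)

print(r_cluster, r_with_outlier, r_curve)
```

In both cases a scatterplot would reveal the problem immediately, which is why it helps to plot the data before relying on r.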
13
Some Issues with Correlation
* NO CAUSATION!
* Spurious correlation
18
* Number of people who drowned in a swimming pool & number of Nicolas Cage films in a given year: r = .67
* Per capita consumption of cheese & number of deaths by becoming tangled in bed sheets: r = .95
* Divorce rate in Maine & consumption of margarine in the United States: r = .99
19
Preview of Next Lecture: Regression, finding the best-fitting line to a data set.