1
Correlations: Correlation Coefficient:
A succinct measure of the strength of the relationship between two variables (e.g. height and weight, age and reaction time, IQ and exam score).
2
There are various types of correlation coefficient, for different purposes:
1. Pearson's "r": Used when both X and Y variables are (a) continuous; (b) (ideally) measurements on interval or ratio scales; (c) normally distributed - e.g. height, weight, I.Q.
2. Spearman's rho: Used in the same circumstances as (1), except that the data need only be on an ordinal scale - e.g. attitudes, personality scores.
3
r is a parametric test: the data have to have certain characteristics (parameters) before it can be used. rho is a non-parametric test - less fussy about the nature of the data on which it is performed.
4
Correlations vary between:
+1 (perfect positive correlation: as X increases, so does Y):
5
... and -1 (perfect negative correlation: as X increases, Y decreases, or vice versa).
r = 0 means no correlation between X and Y: changes in X are not associated with systematic changes in Y, or vice versa.
6
Calculating Pearson's r: a worked example: Do students who perform well on one statistics test also perform well on another?
7
Student   Test 1 (X)   Test 2 (Y)   X²      Y²      XY
A         37           75           1369    5625    2775
B         41           78           1681    6084    3198
C         48           88           2304    7744    4224
D         32           80           1024    6400    2560
E         36           78           1296    6084    2808
F         30           71            900    5041    2130
G         40           75           1600    5625    3000
H         45           83           2025    6889    3735
I         39           74           1521    5476    2886
J         34           74           1156    5476    2516
N = 10    ΣX = 382     ΣY = 776     ΣX² = 14876   ΣY² = 60444   ΣXY = 29832
8
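Using our values (from the bottom row of the table):
ΣX = 382, ΣY = 776, ΣX² = 14876, ΣY² = 60444, ΣXY = 29832, N = 10
$$ r = \frac{\sum XY - \frac{\sum X \sum Y}{N}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{N}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{N}\right)}} = \frac{29832 - \frac{382 \times 776}{10}}{\sqrt{\left(14876 - \frac{382^2}{10}\right)\left(60444 - \frac{776^2}{10}\right)}} $$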
9
$$ r = \frac{29832 - 29643.20}{\sqrt{(14876 - 14592.40)(60444 - 60217.60)}} = \frac{188.80}{\sqrt{283.60 \times 226.40}} = \frac{188.80}{253.391} = 0.745 $$
r is 0.75 (to two decimal places). This is a positive correlation: students who score highly on the first test tend to score highly on the second (and vice versa).
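The worked example can be checked with a short script. This is only a sketch of the raw-score formula used above, not a replacement for a statistics package; the function name `pearson_r` is made up for illustration, and the Test 2 scores for students E, G and J are inferred from the XY column (XY divided by X).

```python
from math import sqrt

# Scores from the worked example (students A to J).
# Y values for E, G and J are inferred as XY / X (an assumption).
test1 = [37, 41, 48, 32, 36, 30, 40, 45, 39, 34]  # X
test2 = [75, 78, 88, 80, 78, 71, 75, 83, 74, 74]  # Y


def pearson_r(xs, ys):
    """Pearson's r via the raw-score (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator


r = pearson_r(test1, test2)
print(round(r, 2))  # 0.75
```

The intermediate sums inside the function should match the bottom row of the table (382, 776, 14876, 60444, 29832).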
10
How to interpret the size of a correlation:
r² is the "coefficient of determination". It tells us what proportion of the variation in the Y scores is associated with changes in X. E.g., if r is 0.2, r² is 4% (0.2 * 0.2 = 0.04 = 4%). Only 4% of the variation in Y scores is attributable to Y's relationship with X. Thus, knowing a person's Y score tells you essentially nothing about what their X score might be.
11
Our correlation of 0.75 gives an r² of 56%.
An r of 0.9 gives an r² of 81% (0.9 * 0.9 = 0.81). Note that correlations become much stronger the closer they are to 1 (or -1). Correlations of .6 or -.6 (r² = 36%) are not merely twice as strong as correlations of .3 or -.3 (r² = 9%): they account for four times as much of the variation.
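To see how quickly r² grows with r, the values quoted above can be tabulated. A minimal sketch; the helper name `variance_shared` is hypothetical.

```python
def variance_shared(r):
    """Coefficient of determination (r squared) as a percentage."""
    return r ** 2 * 100


# The r values discussed in the text; the sign of r does not affect r squared.
for r_value in (0.2, 0.3, 0.6, 0.75, 0.9):
    print(f"r = {r_value:.2f} -> r squared = {variance_shared(r_value):.0f}%")
```

Doubling r from 0.3 to 0.6 multiplies r² by four, which is the point of the comparison above.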
12
Spearman's rho: Measures the degree of monotonicity rather than linearity in the relationship between two variables - i.e., the extent to which Y consistently increases (or consistently decreases) as X increases, whatever the shape of the curve. Hence it copes better than Pearson's r when the relationship is monotonic but non-linear.
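A small illustration of the difference, using made-up data in which Y is the cube of X (perfectly monotonic, but curved). This is a sketch under stated assumptions: the data and the helper names `pearson_r` and `ranks` are hypothetical, and rho is computed as Pearson's r on the ranks, which is valid here because there are no ties.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n, lowest first. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # hypothetical monotonic, non-linear data

r = pearson_r(x, y)                    # below 1: the points do not lie on a line
rho = pearson_r(ranks(x), ranks(y))    # 1: the rank orders agree perfectly
print(round(r, 2), round(rho, 2))
```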
13
Spearman's rho - worked example:
Is there a correlation between the number of vitamin treatments a person has, and their score on a memory test?
14
Subj   No. vitamin treatments (X)   Memory test score (Y)   Vitamin treatment rank (X)   Memory rank (Y)   D (= X rank - Y rank)   D²
A      2                            22                      2                            1                 +1                      1
B      1                            34                      1                            2                 -1                      1
C      3                            36                      3.5                          3                 +0.5                    0.25
D      4                            49                      5                            5                 0                       0
E      3                            42                      3.5                          4                 -0.5                    0.25
F      6                            57                      7                            6                 +1                      1
G      5                            82                      6                            7.5               -1.5                    2.25
H      8                            82                      8                            7.5               +0.5                    0.25
N = 8                                                                                                                              ΣD² = 6.0
15
16
Step 1: assign ranks to the raw data, for each variable separately.
Rules for ranking:
(a) Give the lowest score a rank of 1; next lowest a rank of 2; etc.
(b) If two or more scores are identical, this is a "tie": give them the average of the ranks they would have obtained had they been different. The next score that is different gets the rank it would have had if the tied scores had not occurred.
17
e.g.: if two scores tie for what would have been ranks 2 and 3, the rank for each of the tied scores is (2+3)/2 = 2.5.
If three scores tie for what would have been ranks 2, 3 and 4, the rank for each of the tied scores is (2+3+4)/3 = 3.
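The ranking rules can be sketched in a few lines. The raw scores below are invented purely for illustration, and the function name `average_ranks` is hypothetical.

```python
def average_ranks(scores):
    """Rank scores from lowest (1) to highest; tied scores share the mean rank."""
    ordered = sorted(scores)
    rank_of = {}
    position = 0
    while position < len(ordered):
        value = ordered[position]
        count = ordered.count(value)
        # The tied block would have occupied ranks position+1 .. position+count;
        # the mean of those ranks is position + (count + 1) / 2.
        rank_of[value] = position + (count + 1) / 2
        position += count
    return [rank_of[s] for s in scores]


# Two scores tied for ranks 2 and 3: each gets 2.5, the next score gets 4.
print(average_ranks([10, 20, 20, 30]))      # [1.0, 2.5, 2.5, 4.0]
# Three scores tied for ranks 2, 3 and 4: each gets 3, the next score gets 5.
print(average_ranks([10, 20, 20, 20, 30]))  # [1.0, 3.0, 3.0, 3.0, 5.0]
```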
18
Step 2: Subtract one set of ranks from the other, to get a set of differences, D.
Step 3: Square each of these differences, to get D².
Step 4: Add up the values of D², to get ΣD². Here, ΣD² = 6.0. N = 8.
19
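Step 5: Use the formula for rho, with ΣD² = 6.0 and N = 8 from the table:
$$ \rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 6.0}{8 \times (8^2 - 1)} = 1 - \frac{36}{504} = 0.929 $$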
20
rho = 0.93. There is a strong positive correlation between the number of vitamin treatments a person has, and their memory test score. Pearson's r on the same data = 0.86.
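Steps 1 to 5 can be put together in a short script. Treat this as a sketch: the function names are hypothetical, and the raw scores are reconstructed from the worked-example table (values lost in the flattened table are inferred from the surviving ranks, the ΣD² total of 6.0, and the quoted Pearson's r of 0.86).

```python
# Subjects A to H. Reconstructed values are an assumption (see note above).
treatments = [2, 1, 3, 4, 3, 6, 5, 8]          # X: number of vitamin treatments
memory = [22, 34, 36, 49, 42, 57, 82, 82]      # Y: memory test score


def average_ranks(scores):
    """Step 1: rank from lowest (1) upwards; tied scores share the mean rank."""
    ordered = sorted(scores)
    rank_of = {}
    position = 0
    while position < len(ordered):
        value = ordered[position]
        count = ordered.count(value)
        rank_of[value] = position + (count + 1) / 2
        position += count
    return [rank_of[s] for s in scores]


def spearman_rho(xs, ys):
    """Steps 2 to 5: differences, squares, sum, then the rho formula."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return rho, sum_d2


rho, sum_d2 = spearman_rho(treatments, memory)
print(sum_d2, round(rho, 2))  # 6.0 0.93
```

Statistics packages often handle ties by computing Pearson's r on the ranks instead; with few ties, as here, the result is very close to the simple formula.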
21
Using SPSS to obtain scatterplots: (a) simple scatterplot:
Graphs > Scatter/Dot...
22
Using SPSS to obtain scatterplots: (b) scatterplot with regression line:
Analyze > Regression > Curve Estimation...
In the output, "Constant" is the intercept and "b1" is the slope.
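To see what an intercept and slope look like for a straight regression line, here is a least-squares fit to the statistics-test scores from the Pearson example. This is a sketch, not SPSS output: it assumes a plain linear model, the function name `fit_line` is hypothetical, and the Test 2 scores for students E, G and J are inferred from the XY column of that table.

```python
# Scores from the Pearson worked example (students A to J).
test1 = [37, 41, 48, 32, 36, 30, 40, 45, 39, 34]  # X
test2 = [75, 78, 88, 80, 78, 71, 75, 83, 74, 74]  # Y


def fit_line(xs, ys):
    """Least-squares straight line: returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(test1, test2)
# Predicted Test 2 score = intercept + slope * Test 1 score
print(round(intercept, 2), round(slope, 3))
```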
23
Using SPSS to obtain correlations:
Analyze > Correlate > Bivariate...