Published by Jeffry Patrick. Modified over 9 years ago.
1
SPECIAL CORRELATION
2
Introduction
The Pearson correlation specifically measures the degree of linear relationship between two variables. It is the most commonly used measure of relationship and is used with data from an interval or ratio scale of measurement. However, other correlation measures have been developed for nonlinear relationships and for other types of data (scales of measurement).
3
Other correlation measures
○ Spearman Rank Order Correlation
○ Biserial Correlation
○ Point Biserial Correlation
○ Tetrachoric Correlation
○ Phi Coefficient
4
Spearman Rank Order Correlation
5
Spearman (Rank Order) Correlation
○ The Spearman correlation is designed to measure the relationship between variables measured on an ordinal scale.
○ It can also be used as a valuable alternative to the Pearson correlation, even when the original raw scores are on an interval or ratio scale.
○ The Spearman correlation measures consistency rather than form: when two variables are consistently related, their ranks will be linearly related.
6
Spearman's Rank-Difference Correlation Method
○ It is especially useful when samples are small.
○ It can be applied as a quick substitute for the Pearson r when the number of pairs, N, is less than 30.
○ It should be applied when the data are already in rank order rather than interval measurements.
7
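Pearson correlation from deviation scores (M_X = 5, M_Y = 6)

  Scores       Deviations           Squared deviations           Products
  X     Y      X − M_X   Y − M_Y    (X − M_X)^2   (Y − M_Y)^2    (X − M_X)(Y − M_Y)
  7    11        2         5            4            25              10
  4     3       −1        −3            1             9               3
  6     5        1        −1            1             1              −1
  3     4       −2        −2            4             4               4
  5     7        0         1            0             1               0
                                     SS_X = 10     SS_Y = 40        SP = +16

$$ r = \frac{SP}{\sqrt{(SS_X)(SS_Y)}} = \frac{+16}{\sqrt{(10)(40)}} = \frac{+16}{20} = +0.8 $$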
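The deviation-score computation on this slide can be sketched in a few lines of Python. This is only an illustrative sketch; the function name `pearson_r` is chosen here and does not come from the slides.

```python
import math

# Scores from the slide's worked example
x = [7, 4, 6, 3, 5]
y = [11, 3, 5, 4, 7]

def pearson_r(x, y):
    """Pearson r from deviation scores: r = SP / sqrt(SS_X * SS_Y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)   # sum of squared X deviations
    ss_y = sum((yi - mean_y) ** 2 for yi in y)   # sum of squared Y deviations
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # sum of products
    return sp / math.sqrt(ss_x * ss_y)

r = pearson_r(x, y)
print(round(r, 2))  # 0.8
```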
8
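The Computation of Spearman (Rank Order) Correlation

$$ r_\rho = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} $$

where D is the difference between the two ranks of a pair and n is the number of pairs.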
9
  Scores       Ranks          Difference
  X     Y      R_X   R_Y      R_X − R_Y    (R_X − R_Y)^2
  7    11       1     1           0              0
  4     3       4     5          −1              1
  6     5       2     3          −1              1
  3     4       5     4           1              1
  5     7       3     2           1              1
                                           Σ D^2 = 4

$$ r_\rho = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} = 1 - \frac{6(4)}{5(25 - 1)} = 0.8 $$
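The rank-difference computation on this slide can be sketched as follows. The helper names are hypothetical, and the ranking helper assumes there are no tied scores (rank 1 goes to the largest score, as on the slide).

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value. Assumes no tied values."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman rank-difference formula: 1 - 6*sum(D^2) / (n*(n^2 - 1))."""
    rank_x = ranks_descending(x)
    rank_y = ranks_descending(y)
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Scores from the slide's worked example
x = [7, 4, 6, 3, 5]
y = [11, 3, 5, 4, 7]
rho = spearman_rho(x, y)
print(round(rho, 2))  # 0.8
```

With tied scores the simple formula no longer applies directly; tied observations are usually given the mean of the ranks they occupy.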
10
INTERPRETATION OF A RANK DIFFERENCE COEFFICIENT
○ The rho coefficient is closely equivalent to the Pearson r that would be computed from the original measurements.
○ The r_ρ values are systematically a bit lower than the corresponding Pearson r values; the difference is largest when both coefficients are near .50.
11
LEARNING CHECK
To measure the relationship between anxiety level and test performance, a psychologist obtains a sample of n = 6 college students from an introductory statistics course. The students are asked to come to the laboratory 15 minutes before the final exam. In the lab, the psychologist records physiological measures of anxiety (heart rate, skin resistance, blood pressure, etc.) for each student. In addition, the psychologist obtains the exam score for each student.
12
Compute the Pearson and Spearman correlations for the following data. Test the correlations with α = .05.

  Student   Anxiety Rating   Exam Score
  A               5              80
  B               2              88
  C               7              80
  D               7              79
  E               4              86
  F               5              85
13
The BISERIAL Coefficient of Correlation
14
○ The biserial r is especially designed for the situation in which both of the correlated variables are continuously measurable, but one of the two is for some reason reduced to two categories.
○ This reduction to two categories may be a consequence of the only way in which the data can be obtained, for example when one variable is whether a student passes or fails a certain standard.
15
The COMPUTATION
The formula for the biserial r rests on the principle that with zero correlation there would be no difference between the two group means on the continuous variable, and that the larger the difference between the means, the larger the correlation.
16
THE BISERIAL CORRELATION

$$ r_b = \frac{M_p - M_q}{S_t} \times \frac{pq}{y} $$

Where:
M_p = mean of X values for the higher group in the dichotomized variable, the one having more of the ability on which the sample is divided into two subgroups
M_q = mean of X values for the lower group
p = proportion of cases in the higher group
q = proportion of cases in the lower group
y = ordinate of the unit normal-distribution curve at the point of division between the segments containing the proportions p and q of the cases
S_t = standard deviation of the total sample in the continuously measured variable X
17
THE BISERIAL CORRELATION

$$ r_b = \frac{M_p - M_q}{S_t} \times \frac{pq}{y} $$

OR

$$ r_b = \frac{M_p - M_T}{S_t} \times \frac{p}{y} $$

where M_T is the mean of X for the total sample.
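The two biserial formulas can be checked against each other with a short sketch. The scores and pass/fail labels below are invented for illustration, the function names are hypothetical, and S_t is taken to be the population standard deviation of the total sample, which is an assumption the slides do not spell out. The ordinate y comes from Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist, mean, pstdev

def normal_ordinate(p):
    """Height of the unit normal curve at the point with proportion p above it."""
    z = NormalDist().inv_cdf(1 - p)
    return NormalDist().pdf(z)

def biserial_r(scores, in_higher_group):
    """r_b = (M_p - M_q) / S_t * (p * q / y)."""
    high = [s for s, h in zip(scores, in_higher_group) if h]
    low = [s for s, h in zip(scores, in_higher_group) if not h]
    p = len(high) / len(scores)
    q = 1 - p
    s_t = pstdev(scores)  # assumption: population SD of the total sample
    return (mean(high) - mean(low)) / s_t * (p * q / normal_ordinate(p))

def biserial_r_alt(scores, in_higher_group):
    """Alternative form: r_b = (M_p - M_T) / S_t * (p / y)."""
    high = [s for s, h in zip(scores, in_higher_group) if h]
    p = len(high) / len(scores)
    s_t = pstdev(scores)
    return (mean(high) - mean(scores)) / s_t * (p / normal_ordinate(p))

# Hypothetical data: X scores and whether each student passed a standard
scores = [52, 61, 47, 70, 66, 58, 49, 73, 64, 55]
passed = [False, True, True, True, False, True, False, True, True, False]

r_b = biserial_r(scores, passed)
r_b_alt = biserial_r_alt(scores, passed)
print(round(r_b, 3), round(r_b_alt, 3))  # the two forms agree
```

The forms agree because M_T = p·M_p + q·M_q, so M_p − M_T = q(M_p − M_q).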
18
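[Figure: unit normal curve divided by the ordinate y into two segments, with area q on one side and area p on the other]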
19
The Standard Error of r_b

$$ S_{r_b} = \frac{\sqrt{pq}}{y \sqrt{N}} $$

If the obtained r_b is greater than 1.96 times its standard error, we conclude that, at the .05 level, the obtained correlation would very probably not have arisen by chance from a population in which the correlation is zero.
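The significance check described on this slide can be sketched as below, reading the slide's standard-error formula as √(pq) / (y√N). The function names are hypothetical, and comparing the absolute value of r_b (so that negative coefficients are handled too) is an assumption added here.

```python
import math
from statistics import NormalDist

def biserial_se_null(p, n):
    """Standard error of r_b when the population correlation is zero."""
    q = 1 - p
    y = NormalDist().pdf(NormalDist().inv_cdf(q))  # ordinate at the division point
    return math.sqrt(p * q) / (y * math.sqrt(n))

def significant_at_05(r_b, p, n):
    """True when |r_b| exceeds 1.96 standard errors."""
    return abs(r_b) > 1.96 * biserial_se_null(p, n)

# Hypothetical case: median split (p = .50) with N = 100
se = biserial_se_null(0.5, 100)
print(round(se, 4))
print(significant_at_05(0.30, 0.5, 100))
```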
20
AN EVALUATION OF THE BISERIAL r
○ Before computing r_b, of course, we need to dichotomize each Y distribution. In adopting a division point, it is well to come as near the median as possible. Why?
○ In all these special instances, however, we are not relieved of the responsibility of defending the assumption that the population distribution of Y is normal.
○ It may seem contradictory to suggest that we resort to the biserial r when the obtained Y distribution is skewed, but note that it is the sample distribution that is skewed and the population distribution that must be assumed to be normal.
21
THE BISERIAL r IS LESS RELIABLE THAN THE PEARSON r
○ Whenever there is a real choice between computing a Pearson r and a biserial r, one should favor the former, unless the sample is very large and computation time is an important consideration.
○ The standard error of a biserial r is considerably larger than that of a Pearson r derived from the same sample.
22
The POINT BISERIAL Coefficient of Correlation
23
○ When one of the two variables in a correlation problem is a genuine dichotomy, the appropriate type of coefficient to use is the point biserial r.
○ Examples of genuine dichotomies are male vs. female and being a farmer vs. not being a farmer.
○ Bimodal or other peculiar distributions, although not representing entirely discrete categories, are sufficiently discontinuous to call for the point biserial rather than the biserial r.
24
The COMPUTATION
○ The point biserial r is a product-moment r and could be computed with Pearson's basic formula.
○ If r_pbi were computed from data that actually justified the use of r_b, the coefficient would be markedly smaller than the r_b obtained from the same data.
○ r_b is √(pq)/y times as large as r_pbi.
25
THE POINT BISERIAL CORRELATION

$$ r_{pbi} = \frac{M_p - M_q}{S_t} \sqrt{pq} $$

Where:
M_p = mean of X values for the higher group in the dichotomized variable, the one having more of the ability on which the sample is divided into two subgroups
M_q = mean of X values for the lower group
p = proportion of cases in the higher group
q = proportion of cases in the lower group
S_t = standard deviation of the total sample in the continuously measured variable X
26
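ALTERNATIVE METHODS OF COMPUTATION FOR THE POINT BISERIAL CORRELATION

$$ r_{pbi} = \frac{M_p - M_T}{S_t} \sqrt{\frac{p}{q}} $$

$$ r_{pbi} = \frac{M_p - M_T}{S_t} \sqrt{\frac{N_p}{N_q}} $$

$$ r_{pbi} = \frac{M_p - M_q}{S_t} \cdot \frac{\sqrt{N_p N_q}}{N} $$

where M_T is the mean of X for the total sample, N_p and N_q are the numbers of cases in the higher and lower groups, and N = N_p + N_q.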
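A small sketch can confirm that the main point-biserial formula, (M_p − M_q)/S_t · √(pq), and the alternative forms give the same value. The data and the function name are invented for illustration, the last form is read as including the divisor N, and S_t is taken to be the population standard deviation of the total sample (an assumption).

```python
import math
from statistics import mean, pstdev

def point_biserial_variants(scores, group):
    """Return r_pbi from the main formula and three alternative forms.

    group holds 1 for the higher category and 0 for the lower category.
    """
    high = [s for s, g in zip(scores, group) if g == 1]
    low = [s for s, g in zip(scores, group) if g == 0]
    n, n_p, n_q = len(scores), len(high), len(low)
    p, q = n_p / n, n_q / n
    s_t = pstdev(scores)   # assumption: population SD of the total sample
    m_t = mean(scores)     # mean of the total sample
    main = (mean(high) - mean(low)) / s_t * math.sqrt(p * q)
    alt1 = (mean(high) - m_t) / s_t * math.sqrt(p / q)
    alt2 = (mean(high) - m_t) / s_t * math.sqrt(n_p / n_q)
    alt3 = (mean(high) - mean(low)) / s_t * math.sqrt(n_p * n_q) / n
    return main, alt1, alt2, alt3

# Hypothetical data: X scores and a genuine two-category variable coded 1/0
scores = [52, 61, 47, 70, 66, 58, 49, 73, 64, 55]
group = [0, 1, 1, 1, 0, 1, 0, 1, 1, 0]

main, alt1, alt2, alt3 = point_biserial_variants(scores, group)
print(round(main, 3), round(alt1, 3), round(alt2, 3), round(alt3, 3))
```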
27
POINT-BISERIAL vs BISERIAL
○ When there is no reasonable doubt that the dichotomized variable is normally distributed, it is recommended that r_b be computed and interpreted.
○ If there is little doubt that the distribution is a genuine dichotomy, r_pbi should be computed and interpreted.
○ When in doubt, r_pbi is probably the safer choice.
28
Mathematical relation of r_pbi to r_b

$$ r_b = r_{pbi} \frac{\sqrt{pq}}{y} \qquad r_{pbi} = r_b \frac{y}{\sqrt{pq}} $$
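The conversion between the two coefficients, r_b = r_pbi · √(pq)/y and its inverse, can be sketched as a pair of helper functions. The names are hypothetical, and y is obtained from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def normal_ordinate(p):
    """Ordinate y of the unit normal curve at the point dividing p from q = 1 - p."""
    return NormalDist().pdf(NormalDist().inv_cdf(1 - p))

def pbi_to_biserial(r_pbi, p):
    """r_b = r_pbi * sqrt(pq) / y."""
    q = 1 - p
    return r_pbi * math.sqrt(p * q) / normal_ordinate(p)

def biserial_to_pbi(r_b, p):
    """r_pbi = r_b * y / sqrt(pq)."""
    q = 1 - p
    return r_b * normal_ordinate(p) / math.sqrt(p * q)

# Hypothetical example: r_pbi = .40 with a split at p = .50
r_b = pbi_to_biserial(0.40, 0.5)
print(round(r_b, 3))
print(round(biserial_to_pbi(r_b, 0.5), 3))  # back to the starting value
```

Because √(pq)/y is always greater than 1, the biserial value is always the larger of the two in absolute size.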
29
The TETRACHORIC Correlation
30
TETRACHORIC CORRELATION
○ A tetrachoric r is computed from data in which both X and Y have been reduced artificially to two categories.
○ Under the appropriate conditions it gives a coefficient that is numerically equivalent to a Pearson r and may be regarded as an approximation to it.
31
$$ r_{\text{cos-pi}} = \cos\left( \frac{\pi \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} \right) $$

Where:
a = frequency of cases in the higher group in both variables
b = frequency of cases in the higher group in variable X and the lower group in variable Y
c = frequency of cases in the lower group in variable X and the higher group in variable Y
d = frequency of cases in the lower group in both variables
y = ordinate of the unit normal-distribution curve at the point of division between the segments containing the proportions p and q of the cases
y' = ordinate of the unit normal-distribution curve at the point of division between the segments containing the proportions p' and q' of the cases
32
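[Figure: fourfold table for Variable X and Variable Y with cells b and a in the upper row and d and c in the lower row, marginal proportions p, q and p', q', and the normal-curve ordinates y and y' at the two points of division]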
33
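ALTERNATIVE METHODS OF COMPUTATION FOR THE TETRACHORIC CORRELATION

$$ r_{\text{cos-pi}} = \cos\left( \frac{\pi \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} \right) $$

$$ r_{\text{cos-pi}} = \cos\left( \frac{180^\circ \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} \right) = \cos\left( \frac{180^\circ}{1 + \sqrt{ad/bc}} \right) $$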
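The cosine-pi approximation, cos(π√(bc) / (√(ad) + √(bc))), and its degree form, cos(180° / (1 + √(ad/bc))), can be sketched as below. The cell frequencies are invented for illustration and the function names are hypothetical. The degree form divides by bc, so it assumes that neither b nor c is zero.

```python
import math

def r_cos_pi(a, b, c, d):
    """Cosine-pi approximation, radian form: cos(pi*sqrt(bc) / (sqrt(ad) + sqrt(bc)))."""
    return math.cos(math.pi * math.sqrt(b * c) / (math.sqrt(a * d) + math.sqrt(b * c)))

def r_cos_pi_degrees(a, b, c, d):
    """Degree form: cos(180 / (1 + sqrt(ad/bc))). Requires b*c > 0."""
    angle = 180 / (1 + math.sqrt((a * d) / (b * c)))
    return math.cos(math.radians(angle))

# Hypothetical fourfold frequencies:
# a = high/high, b = high X/low Y, c = low X/high Y, d = low/low
a, b, c, d = 30, 20, 10, 40
r1 = r_cos_pi(a, b, c, d)
r2 = r_cos_pi_degrees(a, b, c, d)
print(round(r1, 3), round(r2, 3))  # the two forms agree
```

When ad = bc the angle is 90° and the coefficient is zero; the more ad exceeds bc, the closer the coefficient comes to +1.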
34
EXAMPLE for THE TETRACHORIC CORRELATION

  Student   Height (cm)   Weight (kg)
  A            165            61
  B            162            54
  C            170            60
  D            174            58
  E            168            50
  F            172            63
  G            163            52
  H            159            49

  Category          Light (<60kg)   Heavy (>60kg)
  Tall (>170cm)           b               a
  Short (<170cm)          d               c

$$ r_t = \frac{ad - bc}{y \, y' \, N^2} $$
35
The Standard Error of r_t

$$ S_{r_t} = \frac{\sqrt{p \, p' \, q \, q'}}{y \, y' \sqrt{N}} $$

If the obtained r_t is greater than 1.96 times its standard error, we conclude that, at the .05 level, the obtained correlation would very probably not have arisen by chance from a population in which the correlation is zero.
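As a rough sketch, the standard error of r_t can be computed as below, reading the slide's formula as √(pp'qq') / (yy'√N). The function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def normal_ordinate(p):
    """Ordinate of the unit normal curve at the point dividing p from 1 - p."""
    return NormalDist().pdf(NormalDist().inv_cdf(1 - p))

def tetrachoric_se_null(p, p_prime, n):
    """Standard error of r_t when the population correlation is zero."""
    q, q_prime = 1 - p, 1 - p_prime
    numerator = math.sqrt(p * p_prime * q * q_prime)
    return numerator / (normal_ordinate(p) * normal_ordinate(p_prime) * math.sqrt(n))

# Hypothetical case: both variables split at the median, N = 100
se_t = tetrachoric_se_null(0.5, 0.5, 100)
print(round(se_t, 4))
```

Moving either split away from the median makes the standard error larger, which is why divisions near the medians are preferred.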
36
TETRACHORIC CORRELATION
The tetrachoric r requires that both X and Y represent continuous, normally distributed, and linearly related variables.
The tetrachoric r is less reliable than the Pearson r. It is more reliable when:
○ N is large, as is true of all statistics
○ r_t is large, as is true of other r's
○ the divisions into the two categories are near the medians
37
The Phi Coefficient r Φ
38
THE Phi COEFFICIENT r_Φ (related to the chi-square from a 2 × 2 table)
○ When the two correlated distributions are genuinely dichotomous, that is, when the two classes are separated by a real gap and the previously discussed correlational methods do not apply, we may resort to the phi coefficient.
○ This coefficient was designed for so-called point distributions, in which the two classes have two point values and merely represent some qualitative attribute.
39
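THE PHI COEFFICIENT r_Φ

$$ r_\Phi = \frac{\alpha\delta - \beta\gamma}{\sqrt{p \, q \, p' \, q'}} $$

with the cell entries α, β, γ, δ and the marginal entries p, q, p', q' expressed as proportions of the total.

Compute the phi coefficient r_Φ for the following data.

  Category   Normal Color Vision   Color Blind   BOTH
  Male            42 (β)             18 (α)      60 (p)
  Female          26 (δ)             14 (γ)      40 (q)
  BOTH            68 (q')            32 (p')     100 (1.00)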
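The phi computation for the color-vision table can be sketched as follows. The function name is hypothetical. The frequencies are converted to proportions of N before the formula (αδ − βγ)/√(pqp'q') is applied, following the cell labels on the slide.

```python
import math

def phi_coefficient(alpha, beta, gamma, delta):
    """Phi from the four cell frequencies of a 2 x 2 table.

    alpha and beta are the cells of the first row, gamma and delta the cells
    of the second row, with alpha and gamma sharing a column (slide labels).
    """
    n = alpha + beta + gamma + delta
    # Convert cell frequencies to proportions of the total
    a, b, g, d = alpha / n, beta / n, gamma / n, delta / n
    p, q = a + b, g + d               # row proportions (Male, Female)
    p_prime, q_prime = a + g, b + d   # column proportions (Color Blind, Normal)
    return (a * d - b * g) / math.sqrt(p * q * p_prime * q_prime)

# Frequencies from the slide: Male 42 normal / 18 color blind,
# Female 26 normal / 14 color blind
phi = phi_coefficient(alpha=18, beta=42, gamma=14, delta=26)
print(round(phi, 2))  # a small negative value, close to zero
```

A value this close to zero indicates almost no association between sex and color blindness in this table.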
40
HOMEWORK
Determine the appropriate correlation technique for describing the relationship between each pair of variables below, then compute the correlation coefficient.
1. GPA and height
2. Gender and GPA
3. Gender and owning more than one mobile phone