LECTURE 6 RELIABILITY
Reliability is a proportion-of-variance measure (a squared correlation), defined as the proportion of observed score (x) variance due to true score (τ) variance:
ρ²_xτ = ρ_xx' = σ²_τ / σ²_x
VENN DIAGRAM REPRESENTATION
[Venn diagram: Var(x) partitioned into Var(τ), the reliable portion, and Var(e)]
PARALLEL FORMS OF TESTS
If two items x₁ and x₂ are parallel, they have:
–equal true scores: τ₁ = τ₂
–equal true score variance: Var(τ₁) = Var(τ₂)
–equal error variance: Var(e₁) = Var(e₂)
–uncorrelated errors: ρ(e₁, e₂) = 0
Reliability: 2 parallel forms
x₁ = τ + e₁, x₂ = τ + e₂
ρ(x₁, x₂) = reliability = ρ_xx' = correlation between parallel forms
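A minimal simulation sketch (not part of the original slides) illustrating this point: two parallel forms built from the same true score correlate at about σ²_τ/σ²_x. The variances below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                       # examinees; large so the estimate is stable
var_tau, var_e = 0.7, 0.3         # assumed true-score and error variances

tau = rng.normal(0.0, np.sqrt(var_tau), n)        # common true score
x1 = tau + rng.normal(0.0, np.sqrt(var_e), n)     # parallel form 1
x2 = tau + rng.normal(0.0, np.sqrt(var_e), n)     # parallel form 2, same error variance

print(var_tau / (var_tau + var_e))    # theoretical reliability: 0.7
print(np.corrcoef(x1, x2)[0, 1])      # observed correlation: approximately 0.7
```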
Reliability: parallel forms
[Path diagram: τ loads on x₁ and x₂ with equal loadings λ_x; each x has error e; ρ_xx' = λ_x · λ_x]
Reliability: 3 or more parallel forms
For 3 or more items xᵢ, the same general form holds:
–the reliability of any pair is the correlation between them
–the reliability of the composite (sum of items) is based on the average inter-item correlation: stepped-up reliability, the Spearman-Brown formula
Reliability: 3 or more parallel forms
Spearman-Brown formula for reliability:
r_xx = k·r̄(i,j) / [1 + (k−1)·r̄(i,j)], where r̄(i,j) is the average inter-item correlation
Example: 3 items; item 1 correlates .5 with item 2 and .6 with item 3, and item 2 correlates .7 with item 3; the average is .6.
r_xx = 3(.6) / [1 + 2(.6)] = 1.8/2.2 = .82
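A short Python sketch of the Spearman-Brown computation; it reproduces the three-item example above and the lengthening and halving examples on the next two slides.

```python
def spearman_brown(r, k):
    """Stepped-up reliability for k parallel parts with average inter-part
    correlation r, or for a test lengthened by a factor of k from reliability r."""
    return k * r / (1 + (k - 1) * r)

print(round(spearman_brown(0.6, 3), 2))     # 3 items, average r = .6   -> 0.82
print(round(spearman_brown(0.7, 2), 3))     # doubling a test with rel .7  -> 0.824
print(round(spearman_brown(0.95, 0.5), 3))  # halving a test with rel .95  -> 0.905
```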
Reliability: tau equivalent scores
If two items x₁ and x₂ are tau equivalent, they have:
–τ₁ = τ₂
–equal true score variance: Var(τ₁) = Var(τ₂)
–possibly unequal error variance: Var(e₁) ≠ Var(e₂)
–uncorrelated errors: ρ(e₁, e₂) = 0
Reliability: tau equivalent scores
x₁ = τ + e₁, x₂ = τ + e₂
ρ(x₁, x₂) = reliability = ρ_xx' = correlation between tau equivalent forms
(same computation as for parallel forms, but the observed score variances differ)
Reliability: Spearman-Brown
It can be shown that the reliability of the parallel-forms or tau-equivalent composite is
ρ_kk' = [k·ρ_xx'] / [1 + (k−1)·ρ_xx']
where k = the number of times the test is lengthened.
Example: a test score has reliability .7; doubling the length produces rel = 2(.7)/[1 + .7] = .824
Reliability: Spearman-Brown
Example: a test score has reliability .95. Halving the test (half length) produces
ρ_xx = .5(.95)/[1 + (.5 − 1)(.95)] = .905
Thus a short form with a random sample of half the items will still produce adequate score reliability.
Reliability: KR-20 for parallel or tau equivalent items/scores
Items are scored 0 or 1 (dichotomous scoring).
Kuder and Richardson (1937); a special case of Cronbach's more general alpha equation.
KR-20 = [k/(k−1)]·[1 − Σpᵢqᵢ / σ²_y]
where pᵢ = the proportion of respondents obtaining a score of 1 on item i (the item difficulty) and qᵢ = 1 − pᵢ.
Reliability: KR-21 (parallel forms assumption)
Items are scored 0 or 1 (dichotomous scoring); Kuder and Richardson (1937).
KR-21 = [k/(k−1)]·[1 − k·p̄·q̄ / σ²_c]
where p̄ is the mean item difficulty and q̄ = 1 − p̄.
KR-21 assumes that all items have the same difficulty (parallel forms), so the item mean gives the best estimate of the population value.
KR-21 ≤ KR-20.
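A sketch of KR-20 and KR-21 for a persons × items matrix of 0/1 scores; the small data matrix is hypothetical. KR-21 uses only the mean item difficulty, so it never exceeds KR-20.

```python
import numpy as np

def kr20(X):
    """KR-20 = [k/(k-1)] * [1 - sum(p_i * q_i) / var(total scores)]."""
    k = X.shape[1]
    p = X.mean(axis=0)                       # item difficulties p_i
    total_var = X.sum(axis=1).var(ddof=1)    # sample variance of total scores
    return (k / (k - 1)) * (1 - np.sum(p * (1 - p)) / total_var)

def kr21(X):
    """KR-21: same form, but assumes every item has the mean difficulty p-bar."""
    k = X.shape[1]
    p_bar = X.mean()
    total_var = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - k * p_bar * (1 - p_bar) / total_var)

X = np.array([[1, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0],   # hypothetical 0/1 responses
              [0, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 1]])
print(kr20(X), kr21(X))                      # KR-21 <= KR-20
```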
Reliability: congeneric scores
If two items x₁ and x₂ are congeneric:
1. τ₁ ≠ τ₂
2. possibly unequal true score variance: Var(τ₁) ≠ Var(τ₂)
3. possibly unequal error variance: Var(e₁) ≠ Var(e₂)
4. errors e₁ and e₂ are uncorrelated: ρ(e₁, e₂) = 0
Reliability: congeneric scores
x₁ = τ₁ + e₁, x₂ = τ₂ + e₂
ρ_x₁x₂ = Cov(τ₁, τ₂) / (σ_x₁·σ_x₂)
This is the correlation between two separate measures that share a common latent variable.
Congeneric measurement structure
[Path diagram: latent variables τ₁ and τ₂ with loadings λ_x₁ and λ_x₂ on x₁ and x₂, errors e₁ and e₂, and latent covariance φ₁₂]
Reliability: Coefficient alpha
Composite = sum of k parts, each with its own true score and variance: C = x₁ + x₂ + … + x_k
α ≤ 1 − σ²_e / σ²_C  (alpha is a lower bound to the reliability of the composite)
α̂ = [k/(k−1)]·[1 − Σs²_k / s²_C]
Reliability: Coefficient alpha
Alpha:
1. = Spearman-Brown for parallel or tau equivalent tests
2. = KR-20 for dichotomous items (tau equivalent)
3. = Hoyt reliability, even when σ²_{person×item} ≠ 0 (congeneric)
Hoyt reliability
Based on ANOVA concepts, extended in the early 1940s by Cyril Hoyt at the University of Minnesota.
Treats items and subjects as factors that are either random or fixed (different models with respect to expected mean squares).
Presaged the more general derivation of coefficient alpha.
Reliability: Hoyt ANOVA
Source             df           Expected Mean Square
Persons (random)   I−1          σ² + σ²_{person×item} + K·σ²_persons
Items (random)     K−1          σ² + σ²_{person×item} + I·σ²_items
Error              (I−1)(K−1)   σ² + σ²_{person×item}

Parallel forms ⇒ σ²_{person×item} = 0
ρ_Hoyt = [E(MS_persons) − E(MS_error)] / E(MS_persons)
ρ̂_Hoyt = [MS_persons − MS_error] / MS_persons
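A sketch of the Hoyt estimate computed directly from a persons × items score matrix (two-way ANOVA with no replication); the data are hypothetical. For complete data this equals coefficient alpha.

```python
import numpy as np

def hoyt_reliability(X):
    """(MS_persons - MS_residual) / MS_persons for a persons x items matrix."""
    n, k = X.shape
    grand = X.mean()
    ss_persons = k * np.sum((X.mean(axis=1) - grand) ** 2)
    ss_items = n * np.sum((X.mean(axis=0) - grand) ** 2)
    ss_resid = np.sum((X - grand) ** 2) - ss_persons - ss_items

    ms_persons = ss_persons / (n - 1)
    ms_resid = ss_resid / ((n - 1) * (k - 1))
    return (ms_persons - ms_resid) / ms_persons

X = np.array([[1, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0],   # hypothetical 0/1 responses
              [0, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 1]])
print(hoyt_reliability(X))
```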
Reliability: Coefficient alpha
Composite = sum of k parts, each with its own true score and variance: C = x₁ + x₂ + … + x_k
Example: s_x₁ = 1, s_x₂ = 2, s_x₃ = 3, s_C = 5
α̂ = [3/(3−1)]·[1 − (1 + 4 + 9)/25] = 1.5·[1 − 14/25] = 16.5/25 = .66
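A one-function check of this worked example (component standard deviations 1, 2, 3 and composite standard deviation 5):

```python
def coefficient_alpha(component_vars, composite_var):
    """alpha = [k/(k-1)] * [1 - sum(component variances) / composite variance]."""
    k = len(component_vars)
    return (k / (k - 1)) * (1 - sum(component_vars) / composite_var)

# Slide example: s_x1 = 1, s_x2 = 2, s_x3 = 3, s_C = 5
print(coefficient_alpha([1**2, 2**2, 3**2], 5**2))   # 0.66
```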
SPSS DATA FILE
JOE      1 1 1 0
SUZY     1 0 1 1
FRANK    0 0 1 0
JUAN     0 1 1 1
SHAMIKA  1 1 1 1
ERIN     0 0 0 1
MICHAEL  0 1 1 1
BRANDY   1 1 0 0
WALID    1 0 1 1
KURT     0 0 1 0
ERIC     1 1 1 0
MAY      1 0 0 0
SPSS RELIABILITY OUTPUT
R E L I A B I L I T Y   A N A L Y S I S  -  S C A L E  (A L P H A)
Reliability Coefficients
N of Cases = 12.0     N of Items = 4
Alpha = .1579
SPSS RELIABILITY OUTPUT
R E L I A B I L I T Y   A N A L Y S I S  -  S C A L E  (A L P H A)
Reliability Coefficients
N of Cases = 12.0     N of Items = 8
Alpha = .6391
Note: same items duplicated
TRUE SCORE THEORY AND STRUCTURAL EQUATION MODELING
True score theory is consistent with the concepts of SEM:
–the latent score (true score) is called a factor in SEM
–error of measurement
–the path coefficient between the observed score x and the latent score is the same as the index of reliability
COMPOSITES AND FACTOR STRUCTURE
3 manifest (observed) variables are required for unique identification of a single factor.
Parallel forms implies:
–equal path coefficients (termed factor loadings) for the manifest variables
–equal error variances
–independence of errors
Parallel forms factor diagram
[Path diagram: τ loads equally (λ_x) on x₁, x₂, x₃, each with error e; ρ(xᵢ, xⱼ) = λ_xᵢ·λ_xⱼ = reliability between variables i and j]
RELIABILITY FROM SEM
TRUE SCORE VARIANCE OF THE COMPOSITE IS OBTAINABLE FROM THE LOADINGS:
σ²_τ = Σᵢ₌₁ᵏ λᵢ² × Var(factor),   k = # of items or subtests
For parallel items this equals k·λ²_x = k times the pairwise average reliability of the items.
RELIABILITY FROM SEM
RELIABILITY OF THE COMPOSITE IS OBTAINABLE FROM THE LOADINGS:
α = [k/(k−1)]·[1 − 1/(k·λ²_x)]
Example: λ²_x = .8, k = 11
α = (11/10)·[1 − 1/8.8] = .975
TAU EQUIVALENCE
ITEM TRUE SCORES DIFFER BY A CONSTANT: τᵢ = τⱼ + c
ERROR STRUCTURE UNCHANGED AS TO EQUAL VARIANCES, INDEPENDENCE
CONGENERIC MODEL
LESS RESTRICTIVE THAN PARALLEL FORMS OR TAU EQUIVALENCE:
–LOADINGS MAY DIFFER
–ERROR VARIANCES MAY DIFFER
MOST COMPLEX COMPOSITES ARE CONGENERIC: WAIS, WISC-III, K-ABC, MMPI, etc.
Congeneric factor diagram
[Path diagram: one factor with loadings λ_x₁, λ_x₂, λ_x₃ on x₁, x₂, x₃ and errors e₁, e₂, e₃; ρ(x₁, x₂) = λ_x₁·λ_x₂]
COEFFICIENT ALPHA
ρ_xx' = 1 − σ²_E/σ²_X = 1 − [Σσ²ᵢ(1 − ρᵢᵢ)]/σ²_X, since errors are uncorrelated
α = [k/(k−1)]·[1 − Σs²ᵢ/s²_C]
where C = Σxᵢ (the composite score), s²ᵢ = variance of subtest xᵢ, and s²_C = variance of the composite.
Alpha does not assume knowledge of the subtest reliabilities ρᵢᵢ.
COEFFICIENT ALPHA: NUNNALLY'S COEFFICIENT
If we know the reliability of each subtest, ρᵢᵢ:
α_N = [K/(K−1)]·[1 − Σs²ᵢ(1 − rᵢᵢ)/s²_X]
where rᵢᵢ = the coefficient alpha of each subtest.
Willson (1996) showed how α_N relates to ρ_xx'.
NUNNALLY'S RELIABILITY CASE
[Path diagram: each xᵢ has a common-factor loading λ_xᵢ, an error eᵢ, and a specificity sᵢ; σ²_Xᵢ = λ²_xᵢ + s²ᵢ]
Reliability Formula for SEM with Multiple Factors (congeneric with subtests)
Single factor model: ρ = (Σλᵢ)² / [(Σλᵢ)² + Σθᵢᵢ + Σθᵢⱼ]
If the error covariances θᵢⱼ = 0, this reduces to ρ = (Σλᵢ)² / [(Σλᵢ)² + Σθᵢᵢ]
= (sum of the factor loadings on the 1st factor)² divided by the sum of the observed variances and covariances.
This generalizes (Bentler, 2004) to the sum of factor loadings on the 1st factor divided by the sum of variances and covariances of the factors for multifactor congeneric tests.
Bentler, P. M. (2004). Maximal reliability for unit-weighted composites. UCLA Statistics Preprint No. 405, October 7, 2004. http://preprints.stat.ucla.edu/405/MaximalReliabilityforUnit-weightedcomposites.pdf
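A sketch of the single-factor version of this formula for a unit-weighted composite with uncorrelated errors; the loadings and error variances below are hypothetical, not taken from Bentler (2004).

```python
import numpy as np

def composite_reliability(loadings, error_vars, factor_var=1.0):
    """rho = (sum of loadings)^2 * factor variance
             / [(sum of loadings)^2 * factor variance + sum of error variances],
    assuming a single factor and uncorrelated errors."""
    lam = np.asarray(loadings, dtype=float)
    true_var = lam.sum() ** 2 * factor_var
    return true_var / (true_var + float(np.sum(error_vars)))

# Hypothetical standardized loadings and their implied error variances
loadings = [0.8, 0.7, 0.6]
errors = [1 - 0.8**2, 1 - 0.7**2, 1 - 0.6**2]
print(composite_reliability(loadings, errors))   # about 0.74
```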
Multifactor models and specificity
Specificity is correlation between two observed items that is independent of the true score; it can be considered another factor.
Cronbach's alpha can overestimate reliability if such specific factors are present.
Correlated errors can also result in alpha overestimating reliability.
CORRELATED ERROR PROBLEMS
Specificities can be misinterpreted as a correlated error model if they are correlated or form a second factor.
[Path diagram: factor with loadings λ_x₁, λ_x₂, λ_x₃ on x₁, x₂, x₃, errors e₁, e₂, e₃, and specificities s₁–s₃]
SPSS SCALE ANALYSIS
ITEM DATA EXAMPLE (Likert items, 0-4 scale):
                               Mean     Std Dev   Cases
1. CHLDIDEAL (0-8)            2.7029    1.4969    882.0
2. BIRTH CONTROL PILL OK      2.2959    1.0695    882.0
3. SEXED IN SCHOOL            1.1451     .3524    882.0
4. POL. VIEWS (CONS-LIB)      4.1349    1.3379    882.0
5. SPANKING OK IN SCHOOL      2.1111     .8301    882.0
CORRELATIONS
Correlation Matrix
            CHLDIDEL   PILLOK   SEXEDUC  POLVIEWS
CHLDIDEL     1.0000
PILLOK        .1074    1.0000
SEXEDUC       .1614     .2985   1.0000
POLVIEWS      .1016     .2449    .1630   1.0000
SPANKING     -.0154    -.0307   -.0901   -.1188
SCALE CHARACTERISTICS
Statistics for Scale:   Mean 12.3900   Variance 7.5798   Std Dev 2.7531   5 Variables

                          Mean    Minimum   Maximum    Range    Max/Min   Variance
Item Means               2.4780    1.1451    4.1349    2.9898    3.6109     1.1851
Item Variances           1.1976     .1242    2.2408    2.1166   18.0415      .7132
Inter-item Correlations   .0822    -.1188     .2985     .4173   -2.5130      .0189
ITEM-TOTAL STATISTICS
             Scale Mean    Scale Variance   Corrected      Squared    Alpha
             if Item       if Item          Item-Total     Multiple   if Item
             Deleted       Deleted          Correlation    R          Deleted
CHLDIDEAL      9.6871         4.4559           .1397        .0342      .2121
PILLOK        10.0941         5.2204           .2487        .1310      .0961
SEXEDUC       11.2449         6.9593           .2669        .1178      .2099
POLVIEWS       8.2551         4.7918           .1704        .0837      .1652
SPANKING      10.2789         7.3001          -.0913        .0196      .3655
ANOVA RESULTS
Analysis of Variance
Source of Variation    Sum of Sq.     DF    Mean Square        F      Prob.
Between People          1335.5664    881         1.5160
Within People           8120.8000   3528         2.3018
  Measures              4180.9492      4      1045.2373     934.9    .0000
  Residual              3939.8508   3524         1.1180
Total                   9456.3664   4409         2.1448
RELIABILITY ESTIMATE
Reliability Coefficients   5 items
Alpha = .2625     Standardized item alpha = .3093
(The standardized item alpha treats all items as parallel after standardization.)
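The mean squares in the ANOVA table above reproduce this alpha through the Hoyt formula:

```python
ms_between_people = 1.5160    # from the ANOVA table
ms_residual = 1.1180

alpha = (ms_between_people - ms_residual) / ms_between_people
print(round(alpha, 4))        # 0.2625, matching the reported Alpha
```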
RELIABILITY: APPLICATIONS
STANDARD ERRORS
s_e = standard error of measurement = s_x·[1 − ρ_xx]^(1/2)
It can be computed whenever ρ_xx is estimable, and provides an error band around an observed score:
[x − 1.96·s_e, x + 1.96·s_e]
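A minimal sketch of the standard error of measurement and the 95% band; the scale SD of 15 and reliability of .90 are assumptions for illustration.

```python
import math

def standard_error_of_measurement(sd_x, rel):
    """s_e = s_x * sqrt(1 - reliability)."""
    return sd_x * math.sqrt(1 - rel)

s_e = standard_error_of_measurement(15, 0.90)   # assumed SD = 15, rel = .90
x = 90
print(round(s_e, 2))                                           # about 4.74
print((round(x - 1.96 * s_e, 1), round(x + 1.96 * s_e, 1)))    # roughly (80.7, 99.3)
```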
[Figure: normal curve centered at x with the 95% band x ± 1.96·s_e]
ASSUMES ERRORS ARE NORMALLY DISTRIBUTED
TRUE SCORE ESTIMATE
τ̂ = ρ_xx·x + [1 − ρ_xx]·x̄
Example: x = 90, mean = 100, reliability = .9
τ̂ = .9(90) + [1 − .9](100) = 81 + 10 = 91
STANDARD ERROR OF THE TRUE SCORE ESTIMATE
s_τ̂ = s_x·[ρ_xx]^(1/2)·[1 − ρ_xx]^(1/2)
Provides an estimate of the range of likely true scores around an estimated true score.
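A sketch covering the last two slides: the regressed true score estimate and its standard error. The SD of 15 is an assumption; the rest matches the x = 90 example above.

```python
import math

def true_score_estimate(x, mean, rel):
    """tau_hat = rel * x + (1 - rel) * mean (regression toward the mean)."""
    return rel * x + (1 - rel) * mean

def se_true_score_estimate(sd_x, rel):
    """s = s_x * sqrt(rel) * sqrt(1 - rel)."""
    return sd_x * math.sqrt(rel) * math.sqrt(1 - rel)

print(true_score_estimate(90, 100, 0.9))   # 91.0, as in the slide example
print(se_true_score_estimate(15, 0.9))     # 4.5, assuming s_x = 15
```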
DIFFERENCE SCORES
Difference scores are widely used in education and psychology:
–Learning disability = Achievement − Predicted Achievement
–Gain score from the beginning to the end of the school year
–Brain injury is detected by a large discrepancy between certain IQ scale scores
RELIABILITY OF D SCORES
D = x − y
s²_D = s²_x + s²_y − 2·r_xy·s_x·s_y
r_DD = [r_xx·s²_x + r_yy·s²_y − 2·r_xy·s_x·s_y] / [s²_x + s²_y − 2·r_xy·s_x·s_y]
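A sketch of the difference-score reliability formula; the values below (both reliabilities .8, correlation .6, equal SDs) are hypothetical and show how a difference score can be much less reliable than its parts.

```python
def difference_reliability(r_xx, r_yy, r_xy, s_x, s_y):
    """r_DD = (r_xx*s_x^2 + r_yy*s_y^2 - 2*r_xy*s_x*s_y)
              / (s_x^2 + s_y^2 - 2*r_xy*s_x*s_y)."""
    num = r_xx * s_x**2 + r_yy * s_y**2 - 2 * r_xy * s_x * s_y
    den = s_x**2 + s_y**2 - 2 * r_xy * s_x * s_y
    return num / den

print(difference_reliability(0.8, 0.8, 0.6, 10, 10))   # 0.5
```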
REGRESSION DISCREPANCY
D = y − y_pred, where y_pred = b·x + b₀
s_D = [(1 − r²_xy)(1 − r_DD)]^(1/2)
where r_DD = [r_yy + r_xx·r²_xy − 2·r²_xy] / [1 − r²_xy]
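A sketch following the formula as reconstructed above; the three correlations are hypothetical.

```python
import math

def regression_discrepancy_reliability(r_xx, r_yy, r_xy):
    """r_DD = (r_yy + r_xx*r_xy^2 - 2*r_xy^2) / (1 - r_xy^2)."""
    return (r_yy + r_xx * r_xy**2 - 2 * r_xy**2) / (1 - r_xy**2)

r_dd = regression_discrepancy_reliability(0.9, 0.85, 0.6)   # hypothetical r_xx, r_yy, r_xy
s_d = math.sqrt((1 - 0.6**2) * (1 - r_dd))                  # SE of the discrepancy, in s_y units
print(round(r_dd, 3), round(s_d, 3))                        # about 0.709 and 0.431
```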
TRUE DISCREPANCY
The estimated true discrepancy is a regression estimate based on both observed scores:
D̂_τ = b_{Dy.x}·(y − ȳ) + b_{Dx.y}·(x − x̄)
with standard error s_D̂ = [b²_{Dy.x} + b²_{Dx.y} + 2·b_{Dy.x}·b_{Dx.y}·r_xy]^(1/2)
and reliability
r_DD = {[2 − (r_xx − r_yy)² + (r_yy − r_xy)² − 2·(r_yy − r_xy)(r_xx − r_xy)·r²_xy] / [(1 − r_xy)(r_yy + r_xx − 2·r_xy)]}⁻¹