Psy 425 Tests & Measurements Furr & Bacharach Chapter 5 Conceptual Basis for Reliability
True Scores? Do scores on a test accurately reflect real psychological differences? Assessment of reliability Detecting the ability of a test to accurately reflect real differences
Classical Test Theory (CTT) Conceptual basis of reliability Outlines procedures for estimating the reliability of psychological measures
CTT True differences vs. measurement error A test’s reliability reflects the extent to which the differences in respondents’ test scores are a function of their true psychological differences, as opposed to measurement error…
Reliability Not all or none Is on a continuum A test may be more or less reliable
Theoretical Reliability is a theoretical notion Not directly observable Can only estimate the reliability
Derivation of Reliability Estimate Estimate is derived based on three factors: Observed scores True scores Measurement error
Observed Scores Values obtained from measurement of some characteristic of an individual
True Scores Real, true amounts of that characteristic
Reliability Extent to which observed scores are consistent with true scores, as opposed to other, often unknown, characteristics of the test and its administration
Measurement Error “Other” characteristics that contribute to differences in observed scores These characteristics create inconsistencies between observed scores and true scores
Sources of Measurement Error? Can all sources be accounted for?
Accurate Measurement? Factors can obscure observed scores… Measurement of physical properties: Height & Weight? Measurement of psychological attributes: Post-partum Depression?
What sources of error might contribute to scores on a test of depression (i.e., inflate or deflate observed scores relative to true scores)? Interpretation of written items Incorrect recording of answers Secondary gain? Defensive or avoidant? Psychological mindedness? Cultural factors?
Test reliability depends on… Extent to which differences in test scores can be attributed to real inter- or intra-individual differences AND Extent to which such differences are a function of measurement error
CTT Person’s observed score on a test is a function of that person’s true score, plus error: Xo = Xt + Xe (observed score = true score + error score)
Fundamental Theoretical Assumption of CTT Observed scores on a psychological measure are determined by respondents’ true scores and by measurement error
CTT assumption about measurement error… RANDOM
Random Error Inflation and deflation caused by error is independent of the individuals’ true levels of the psychological attribute being measured… Interpretation of written items Incorrect recording of answers Secondary gain? Defensive or avoidant? Psychological mindedness? Cultural factors?
Important consequences of assumption of random error: Error cancels itself out across respondents Error scores are uncorrelated with true scores
Error cancels itself out…
Correlation between true scores and error scores = 0.0
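The two consequences of random error can be checked with a small simulation. This is only a sketch under assumed values: the distributions (true scores around 100 with SD 15, error around 0 with SD 5), the sample size, and the seed are hypothetical choices for illustration, not figures from the chapter.

```python
import random
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


rng = random.Random(425)  # fixed seed so the run is repeatable

# Hypothetical respondents: error is drawn independently of the true score,
# which is what "random" means in the CTT assumption.
n_respondents = 10_000
true_scores = [rng.gauss(100, 15) for _ in range(n_respondents)]
error_scores = [rng.gauss(0, 5) for _ in range(n_respondents)]
observed_scores = [t + e for t, e in zip(true_scores, error_scores)]

mean_error = mean(error_scores)                       # should be near 0
r_true_error = pearson_r(true_scores, error_scores)   # should be near 0

print(f"mean error score: {mean_error:.3f}")
print(f"r(true, error):   {r_true_error:.3f}")
```

In any finite sample both values are close to zero rather than exactly zero; the CTT assumption describes what holds in the long run across respondents.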
Four ways to think of reliability
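In CTT the four views are usually stated as: (1) ratio of true score variance to observed score variance, (2) lack of error variance, (3) squared correlation between observed and true scores, and (4) lack of correlation between observed and error scores. The sketch below computes all four on a tiny data set to show that they agree. The four respondents and their scores are hypothetical, constructed so that error averages zero and is uncorrelated with true scores.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance (divide by N), as used in the CTT definitions."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / sqrt(variance(x) * variance(y))


# Hypothetical data: error sums to zero and is uncorrelated with true scores.
true_scores = [110, 120, 130, 140]
error_scores = [5, -5, -5, 5]
observed_scores = [t + e for t, e in zip(true_scores, error_scores)]

var_t = variance(true_scores)
var_e = variance(error_scores)
var_o = variance(observed_scores)

way_1 = var_t / var_o                                       # true / observed variance
way_2 = 1 - var_e / var_o                                   # lack of error variance
way_3 = pearson_r(observed_scores, true_scores) ** 2        # squared r(observed, true)
way_4 = 1 - pearson_r(observed_scores, error_scores) ** 2   # lack of r(observed, error)

for label, value in [("1", way_1), ("2", way_2), ("3", way_3), ("4", way_4)]:
    print(f"way {label}: Rxx = {value:.4f}")
```

All four expressions give the same value here (125/150) because the data satisfy the random-error assumption exactly; with real data, true and error scores are unobservable, so reliability can only be estimated.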
Values:
Worksheet
Size of reliability coefficient Test’s reliability Varies between 0 and 1 Larger values = greater psychometric quality As value increases, a greater proportion of the differences among observed scores can be attributed to differences among true scores
Good vs. poor test reliability No clear cutoff In social science research, .70 to .80 is satisfactory Less than that, marginal to poor What about test reliability = 0; is the test at all useful? What about .43?
Improving reliability… Original test: Rxx = .48 Improved test: Rxx = .74
Error variance Small degree = respondents’ scores are only being slightly affected by measurement error
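The link between error variance and reliability follows from the variance of a sum. A short derivation, using the assumed notation s_o, s_t, s_e for the standard deviations of observed, true, and error scores and r_te for the correlation between true and error scores:

```latex
\begin{aligned}
s_o^2 &= s_t^2 + s_e^2 + 2\, r_{te}\, s_t\, s_e \\
      &= s_t^2 + s_e^2 \qquad \text{(because } r_{te} = 0 \text{ under random error)} \\[4pt]
R_{xx} &= \frac{s_t^2}{s_o^2} = 1 - \frac{s_e^2}{s_o^2}
\end{aligned}
```

So the smaller the error variance is relative to the observed score variance, the closer Rxx is to 1.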
Index of reliability “index of reliability” = unsquared correlation between observed and true scores Usually, “reliability” refers to the coefficient of reliability, i.e., the squared correlation (R²)
Reliability and Standard Error of Measurement Standard error of measurement (sem) = standard deviation of error scores Represents average size of error scores The greater the average difference between observed scores and true scores, the less reliable the test Closely linked to reliability: large sem means poor Rxx If Rxx = 1, then sem = 0
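The standard CTT relationship is sem = s_o × sqrt(1 − Rxx), where s_o is the standard deviation of observed scores. A minimal sketch follows; the function name and the example values (an observed SD of 15) are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import sqrt


def standard_error_of_measurement(sd_observed, reliability):
    """Return sem = s_o * sqrt(1 - Rxx).

    sd_observed: standard deviation of observed scores (s_o)
    reliability: reliability coefficient Rxx, between 0 and 1
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd_observed * sqrt(1.0 - reliability)


# Hypothetical test with an observed-score SD of 15:
for rxx in (1.0, 0.75, 0.43, 0.0):
    sem = standard_error_of_measurement(15, rxx)
    print(f"Rxx = {rxx:.2f} -> sem = {sem:.2f}")
```

At Rxx = 1 the sem is 0, and at Rxx = 0 the sem equals the observed-score standard deviation, meaning all observed variability is error.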