S519: Evaluation of Information Systems Social Statistics Inferential Statistics Chapter 16: reliability and validity
Last week
This week What are reliability and validity
Reliability and validity Leathers, S. (2003). Parental visiting, conflicting allegiances, and emotional and behavioral problems among foster children. Family Relations, 52, 53-63
Your measurement How do I know that the test, scale, instrument, etc., works every time I use it (reliability)? How do I know that the test, scale, instrument, etc., measures what it is supposed to measure (validity)?
Scales of measurement What is measurement: the assignment of values to outcomes following a set of rules. There are four scales of measurement: nominal, ordinal, interval, and ratio
Nominal level of measurement An outcome can fit into one and only one class or category E.g. gender, political affiliation The least precise level of measurement Categories should be mutually exclusive
The ordinal level of measurement Things are ordered E.g., a rank of candidates for a job
The interval level of measurement Based on an underlying continuum with equally spaced points, such that we can talk about how much more a higher performance is than a lesser one. 10 words correct is twice as many as five words correct. 10 words correct is two more than eight correct, which is three more than five correct
Ratio level of measurement Characterized by the presence of an absolute zero on the scale. In the biological sciences, e.g., zero molecular movement or zero light. In the social and behavioral sciences, a true zero is harder to establish
In sum Any outcome can be assigned to one of the four scales of measurement Scales of measurement have an order, from the least precise being nominal, to the most precise being ratio The “higher up” the scale of measurement, the more precise the data being collected, and the more detailed and informative the data are
In sum Characteristics of each scale
Scale      Absolute zero   Equidistant points   Ranked data   Data in categories
Ratio      yes             yes                  yes           yes
Interval   no              yes                  yes           yes
Ordinal    no              no                   yes           yes
Nominal    no              no                   no            yes
Reliability Whether a test, or whatever you use as a measurement tool, measures something consistently
Observed score Observed score = true score + error score
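The observed-score model can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true scores, the error spread, the sample size, and all variable names are hypothetical and do not come from the course. It also reports the share of observed-score variance that is due to true scores, which is the usual classical-test-theory reading of reliability and is not stated on the slide itself.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulated numbers are repeatable

# Hypothetical true scores for 1000 test takers (assumed mean 50, SD 10).
true_scores = [random.gauss(50, 10) for _ in range(1000)]
# Hypothetical random measurement error (assumed mean 0, SD 5).
error_scores = [random.gauss(0, 5) for _ in range(1000)]

# Observed score = true score + error score
observed_scores = [t + e for t, e in zip(true_scores, error_scores)]

# Share of observed variance that is due to true scores.
# The smaller the error, the closer this ratio gets to 1.
reliability = statistics.variance(true_scores) / statistics.variance(observed_scores)
print(f"Estimated reliability: {reliability:.2f}")
```

Shrinking the assumed error SD pushes the ratio toward 1, and enlarging it pushes the ratio toward 0, which matches the idea that a reliable instrument has little error in its observed scores.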
Some ways to assess reliability Test-retest reliability: used to examine whether a test is reliable over time. Compute the Pearson correlation coefficient on scores from the same test at Time 1 and Time 2
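The test-retest computation can be sketched as follows. The scores are made up for illustration, and the names `pearson_r`, `time1`, and `time2` are hypothetical; the function is a plain implementation of the standard Pearson formula, not code from the course.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores for the same eight people at two administrations.
time1 = [54, 67, 67, 83, 87, 89, 84, 90]
time2 = [56, 77, 87, 89, 89, 90, 87, 92]

test_retest_r = pearson_r(time1, time2)
print(f"Test-retest reliability: r = {test_retest_r:.2f}")
```

A coefficient close to 1 suggests that people keep roughly the same relative standing from Time 1 to Time 2, which is what test-retest reliability is meant to capture.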
Validity The property of an assessment tool that indicates that the tool does what it says it does. Content validity: validated through domain experts. Criterion validity: validated by comparing your measure against existing tests or criteria, with support from the literature
More resources Winning textbook: sbk00.htm Good statistics course: main.html Statistical glossary