Dr. Jeffrey Oescher
27 January 2014
Technical Issues
Two technical issues:
- Validity
- Reliability
Technical Issues
Validity – the extent to which inferences made on the basis of scores from an instrument are appropriate, meaningful, and useful
Characteristics:
- Refers to the interpretation of the results
- Is a matter of degree
- Is situation specific
- Is a unitary concept
- Involves an overall judgment
Data Collection – Technical Issues
Validity evidence:
- Content
  - Face
  - Content
- Construct
- Criterion-related
  - Predictive
  - Concurrent
- Situationally specific
Data Collection – Technical Issues
Reliability – the extent to which scores are free from error
- Error is assessed through the consistency of scores
- Two perspectives:
  - Test – the reliability of a test
  - Agreement – the reliability of an observation
Data Collection – Technical Issues
Test reliability evidence:
- Stability (also known as test-retest); measured on a scale of 0 to 1
- Equivalence (also known as parallel forms); measured on a scale of 0 to 1
- Internal consistency: split-half, KR-20, KR-21, and Cronbach's alpha; all measured on a scale of 0 to 1 (see the sketch below)
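To make the internal-consistency coefficients concrete, here is a minimal sketch, assuming a hypothetical item-response matrix of five students answering four dichotomously scored (0/1) items; the data and the resulting values are illustrative, not from the slides.

```python
from statistics import pvariance

# Hypothetical item-response matrix: rows = students, columns = items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
k = len(responses[0])                      # number of items
totals = [sum(row) for row in responses]   # each student's total score
total_var = pvariance(totals)              # variance of the total scores

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)
item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# KR-20 uses p*q (proportion correct times proportion incorrect) in place of the
# item variance; for 0/1 items it gives the same value as alpha.
p = [sum(row[i] for row in responses) / len(responses) for i in range(k)]
kr20 = (k / (k - 1)) * (1 - sum(pi * (1 - pi) for pi in p) / total_var)

print(f"Cronbach's alpha = {alpha:.2f}, KR-20 = {kr20:.2f}")  # both 0.80 for this data
```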
Data Collection – Technical Issues
Score reliability evidence:
- Standard error of measurement (SEM)
  - A statistic that allows one to ascertain the probability that a student's score falls within a given range of scores
  - Usually reported alongside the student's score, e.g., 'SEM = +/- 2.25'
  - You can add and subtract one (1) SEM to a student's score and be confident that the student's true score falls within that range about 68% of the time
  - You can add and subtract two (2) SEM to a student's score and be confident that the student's true score falls within that range about 95% of the time
Agreement reliability evidence:
- Percentage of agreement between observers
- More commonly known as inter-rater reliability
- Ranges on a scale from 0 to 1 (both computations are sketched below)
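A minimal sketch of both computations, borrowing the SEM value from the slide's example (2.25) and using a hypothetical observed score and hypothetical codings from two observers:

```python
# Hypothetical observed score with the slide's example SEM
score, sem = 85, 2.25

# One SEM on either side of the observed score: ~68% confidence band
band_68 = (score - sem, score + sem)
# Two SEMs on either side: ~95% confidence band
band_95 = (score - 2 * sem, score + 2 * sem)
print(f"68% band: {band_68}, 95% band: {band_95}")

# Percent agreement (simple inter-rater reliability): proportion of
# observations on which two observers assigned the same code.
observer_a = ["on-task", "off-task", "on-task", "on-task", "off-task"]
observer_b = ["on-task", "off-task", "on-task", "off-task", "off-task"]
agreements = sum(a == b for a, b in zip(observer_a, observer_b))
print(f"Percent agreement = {agreements / len(observer_a):.2f}")  # 0.80
```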
Score Interpretation
Two types of interpretations: criterion-referenced and norm-referenced
Criterion-referenced:
- You need to know the underlying scale (e.g., 0-100, 1-5, etc.) upon which the scores are based
- The interpretation of the test score is made relative to this underlying scale (e.g., the scores indicated the students mastered about three-fourths of the objectives)
- The scores are interpreted relative to what the students know
- The scores easily communicate some level of performance (e.g., good, bad, moderate, etc.); a short sketch follows below
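A minimal sketch of a criterion-referenced interpretation, assuming a hypothetical 0-100 scale that covers 20 objectives; the score and the scale are illustrative only:

```python
# Hypothetical: a 0-100 scale covering 20 objectives, and one student's score
scale_max = 100
num_objectives = 20
score = 75

proportion_mastered = score / scale_max                             # 0.75 of the scale
objectives_mastered = round(proportion_mastered * num_objectives)   # about 15 of 20
print(f"Mastered about {proportion_mastered:.0%} of the objectives "
      f"(roughly {objectives_mastered} of {num_objectives})")
```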
Score Interpretation
Norm-referenced:
- You need to know the reference group (i.e., norming sample) against which the scores are being compared
- The interpretation of test scores is made in relation to the scores of students in the norming group
- John's score put him in the 85th percentile
- John's score indicates he performed better than 85% of the students in the norming group (see the sketch below)
- John's score doesn't tell us anything about what John knows in terms of content
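A minimal sketch of the percentile-rank idea, using a hypothetical norming group and a hypothetical score for John, chosen so the result matches the slide's 85th-percentile example:

```python
# Hypothetical norming-group scores and John's score
norming_group = [52, 61, 64, 68, 70, 71, 73, 74, 75, 76,
                 77, 78, 79, 80, 81, 82, 83, 87, 88, 95]
johns_score = 86

# Percentile rank: the percentage of norming-group scores that fall below John's score
below = sum(s < johns_score for s in norming_group)
percentile = 100 * below / len(norming_group)
print(f"John scored higher than {percentile:.0f}% of the norming group")  # 85%
```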
Score Interpretation
A note of caution
Which of the following represents a criterion-referenced interpretation, and which a norm-referenced one?
- The scores for the experimental group were significantly higher than those for the control group.
- The scores for the experimental group indicated mastery of about 95% of the objectives, while those for the control group indicated only 65% mastery.
These are common examples from the literature you will be reading.
Be careful with the first interpretation: it only tells us which group performed better, not how well either group performed.