Published by Thomasine Cook, modified over 9 years ago
REVIEW I
- Reliability
  - Index of Reliability: theoretical correlation between observed & true scores
  - Standard Error of Measurement: a reliability measure; the degree to which an observed score fluctuates due to measurement errors
  - Factors affecting reliability
- A test must be RELIABLE to be VALID
REVIEW II
Types of validity:
- Content-related (face): represents important/necessary knowledge; use "experts" to establish
- Criterion-related: evidence of a statistical relationship with the trait being measured; alternative measures must be validated against a criterion measure
- Construct-related: validates unobservable theoretical constructs
REVIEW III
- Standard Error of Estimate: a validity measure; the degree of error in estimating a score based on the criterion
- Methods of obtaining a criterion measure: actual participation, performing the criterion, predictive measures
- Interpreting "r"
Criterion-Referenced Measurement
[Scale graphic: Poor / Sufficient / Better]
It's all about me: did I get "there" or not?
Criterion-Referenced Testing (aka Mastery Learning)
Standard development:
- Judgmental: use experts (typical in human performance)
- Normative: theoretically accepted criteria
- Empirical: cutoff based on available data
- Combination: experts & norms typically combined
Advantages of Criterion-Referenced Measurement
- Represents specific, desired performance levels linked to a criterion
- Independent of the % of the population that meets the standard
- If the standard is not met, specific diagnostic evaluations can be made
- Degree of performance is not important; reaching the standard is
- Performance is linked to specific outcomes
- Individuals know exactly what is expected of them
Limitations of Criterion-Referenced Measurement
- Cutoff scores always involve subjective judgment
- Misclassifications can be severe
- Motivation can be impacted (frustrated/bored)
Setting a Cholesterol "Cut-Off"
[Chart: number of deaths vs. cholesterol (mg/dl)]
Statistical Analysis of CRTs
Nominal data (categorical: major, gender, pass/fail, etc.)
- Contingency table development (2x2)
- Chi-square analysis (used with categorical variables)
- Proportion of agreement (see next slide)
- Phi coefficient (correlation for dichotomous yes/no variables)
Proportion of Agreement (P)
Sum the correctly classified cells over the total:
P = (n1 + n4) / (n1 + n2 + n3 + n4)
Examples on board
Considerations with CRT
The same as norm-referenced testing:
- Reliability (consistency)
  - Equivalence: is the PACER equivalent to the 1-mile run/walk?
  - Stability: does the same test yield consistent findings?
- Validity (truthfulness of measurement)
  - Criterion-related: concurrent or predictive
  - Construct-related: establish cut scores (see Fig. 7.3)
Meeting Criterion-Referenced Standards: Possible Decisions

                           Truly Below Criterion   Truly Above Criterion
  Did not achieve standard Correct Decision        False Positive
  Did achieve standard     False Negative          Correct Decision
CRT Reliability
Test/retest of a single measure:

             Day 1 Fail   Day 1 Pass
  Day 2 Fail n1           n2
  Day 2 Pass n3           n4

P = (n1 + n4) / (n1 + n2 + n3 + n4)
CRT Validity
Use of a field test and a criterion measure:

                  Criterion Fail   Criterion Pass
  Field test Fail n1               n2
  Field test Pass n3               n4
Example 1: FITNESSGRAM Standards (1987)

                                            Below criterion VO2max   Above criterion VO2max
  Did not achieve standard on run/walk test 24 (4%)                  21 (4%)
  Did achieve standard on run/walk test     64 (11%)                 472 (81%)

P = (24 + 472) / (24 + 21 + 64 + 472) = 496/581 = 85%
Example 2: AAHPERD Standards (1988)

                                            Below criterion VO2max   Above criterion VO2max
  Did not achieve standard on run/walk test 130 (22%)                23 (4%)
  Did achieve standard on run/walk test     201 (35%)                227 (39%)

P = (130 + 227) / (130 + 23 + 201 + 227) = 357/581 = 61%

Comparing Examples 1 and 2: the FITNESSGRAM standards (P = 85%) agree with the criterion VO2max better than the AAHPERD standards (P = 61%).
Criterion-Referenced Measurement
Find a friend: explain one thing that you learned today and share WHY IT MATTERS to you as a future professional.