Published by Blaise Reeves. Modified over 9 years ago.
1
Chapter 7 Criterion-Referenced Reliability and Validity
Poor | Sufficient | Better
2
Criterion-Referenced Testing
Mastery learning
Standard development: judgmental, normative, empirical, combination
S | U
3
Guidelines for Writing Behavioral Objectives (Mager, 1962)
Identify the desired behavior by name.
Define the desired behavior.
Specify the criteria of acceptable performance.
4
Advantages of Criterion-Referenced Measurement
Standards represent specific, desired performance levels linked to a criterion.
Standards are independent of the proportion of the population that meets them.
If a standard is not met, specific diagnostic evaluations can be made.
Degree of performance is not important; reaching the standard is.
5
Limitations of Criterion-Referenced Measurement
Cutoff scores always involve subjective judgment.
Misclassifications can be severe.
Students who meet the cutoff may no longer be motivated to do better.
P | F
6
Setting a Cholesterol “Cut-Off”
[Figure; axes: Cholesterol (mg/dl), N of deaths]
7
Setting a Cholesterol “Cut-Off”
[Figure; axes: Cholesterol (mg/dl), N of deaths]
8
Statistical Analysis of CRTs
Nominal data
Contingency table development
Phi coefficient (PPM)
Chi-square analysis
Review chapter 5
9
Considerations With CRT
The same as norm-referenced testing:
Reliability: consistency of measurement
Validity: truthfulness of measurement
10
Figure 7.1a FITNESSGRAM Standards
Cell counts: 24 (4%), 21 (4%), 64 (11%), 472 (81%)
Run/walk test: did not achieve the standard | did achieve the standard
Criterion VO2max: below the criterion | above the criterion
11
Figure 7.1b AAHPERD Physical Best Standards
Cell counts: 130 (22%), 23 (4%), 201 (35%), 227 (39%)
Run/walk test: did not achieve the standard | did achieve the standard
Criterion VO2max: below the criterion | above the criterion
12
Meeting Criterion-Referenced Standards: Possible Decisions
Test result | Truly below criterion | Truly above criterion
Did not achieve standard | Correct decision | False positive
Did achieve standard | False negative | Correct decision
13
Table 7.1 CRT Test-Retest Reliability Example
Day 1 | Day 2: Did not achieve the standard | Day 2: Did achieve the standard | Total
Did not achieve the standard | 80 | 20 | 100
Did achieve the standard | 50 | 250 | 300
Total | 130 | 270 | 400
P = .825, K = .576, phi = .586, chi-square = 137.13, df = 1, p < .001
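The statistics reported under Table 7.1 can be recomputed from the four cell counts. The following is a minimal sketch, assuming the standard textbook formulas for proportion of agreement (P), kappa (K), phi, and an uncorrected chi-square with 1 df; the slides do not show the formulas themselves, and the variable names are illustrative.

```python
import math

# Table 7.1 counts: rows are Day 1, columns are Day 2.
# a = did not achieve on both days, b = did not achieve Day 1 / achieved Day 2,
# c = achieved Day 1 / did not achieve Day 2, d = achieved on both days.
a, b, c, d = 80, 20, 50, 250
n = a + b + c + d

# Proportion of agreement: people classified the same way on both days.
p_agree = (a + d) / n

# Agreement expected by chance, from the row and column totals.
p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2

# Kappa: agreement corrected for chance.
kappa = (p_agree - p_chance) / (1 - p_chance)

# Phi coefficient for a 2 x 2 table.
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# For a 2 x 2 table, the uncorrected chi-square equals n * phi^2.
chi_square = n * phi**2

print(f"P = {p_agree:.3f}, K = {kappa:.3f}, phi = {phi:.3f}, "
      f"chi-square = {chi_square:.2f}")
```

Under these assumptions the script reproduces the values on the slide (P = .825, K = .576, phi = .586, chi-square = 137.13).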
14
Table 7.2 Criterion-Referenced Equivalence Reliability Between the 1 Mile Run/Walk and PACER Tests
Trial | Statistic | Total sample | Boys | Girls
Trial 1 | P | .76 | .83 | .66
Trial 1 | K | .51 | .65 | .33
Trial 2 | P | .71 | .76 | .65
Trial 2 | K | .43 | .52 | .30
15
Figure 7.3 A Theoretical Example of the Divergent Group Method
16
Examples of Criterion-Referenced Standards
Cholesterol < 240 mg/dl
Systolic blood pressure < 130 mmHg
Diastolic blood pressure < 90 mmHg
FITNESSGRAM 1-mile run time for a boy age 10: < 11:30
President’s Challenge Health Fitness curl-ups for a girl age 14: > 24
17
CRT Reliability
2 x 2 table: Day 1 (Fail | Pass) by Day 2 (Fail | Pass)
18
CRT Validity
2 x 2 table: Field Test (Fail | Pass) by Criterion (Fail | Pass)
19
Racquetball Example
Can a wall volley test serve as a good criterion measure to determine who should enter intermediate racquetball?
Example: reliability study, validity study
20
Racquetball Test Illustration
[Court diagram: front wall, broken line, 2 extra racquetballs]
You must always hit the ball from behind the broken line.
The test: Trial 1, 60 seconds; Trial 2, 60 seconds
21
Reliability Study
Set a standard for passing the field test. Our standard is set at 25 hits: you must hit the ball against the front wall at least 25 times in a trial. This meets the “standard” for entry into intermediate racquetball.
You want to see if players can achieve the standard on each trial. Determining how consistently they meet the standard across trials is a criterion-referenced reliability study.
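The procedure just described, scoring each trial against the 25-hit standard and checking whether the pass/fail decision is the same on both trials, can be sketched as follows. The scores below are hypothetical and invented only for illustration; they are not the study data, and the names `STANDARD` and `met_standard` are illustrative.

```python
from collections import Counter

STANDARD = 25  # passing standard from the slide: at least 25 hits in a trial

# HYPOTHETICAL wall-volley scores (hits per 60-second trial) for six players.
trial_1 = [18, 27, 25, 22, 31, 24]
trial_2 = [20, 29, 23, 19, 33, 26]


def met_standard(hits: int) -> bool:
    """Return True when a trial score reaches the passing standard."""
    return hits >= STANDARD


# Cross-tabulate the pass/fail decision on Trial 1 against Trial 2.
# Keys are (met standard on Trial 1, met standard on Trial 2).
table = Counter(
    (met_standard(t1), met_standard(t2)) for t1, t2 in zip(trial_1, trial_2)
)

# Consistent decisions are the (fail, fail) and (pass, pass) cells.
consistent = table[(False, False)] + table[(True, True)]
percent_agreement = 100 * consistent / len(trial_1)

print(dict(table))
print(f"Percent agreement = {percent_agreement:.1f}%")
```

The four `table` entries correspond to the four cells of a 2 x 2 reliability table; the off-diagonal cells hold players whose classification changed between trials.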
22
Reliability: What You Would Like to See
Trial 1 | Trial 2: Failed to meet standard (< 25) | Trial 2: Met the standard (≥ 25)
Failed to meet standard (< 25) | People here on BOTH trials | No one here
Met the standard (≥ 25) | No one here | People here on BOTH trials
23
PASW Output
Meet standard on Trial 1? | Trial 2: Did NOT meet standard of 25 | Trial 2: DID meet standard of 25 | Total
Did NOT meet standard of 25 | 37 | 6 | 43
DID meet standard of 25 | 2 | 11 | 13
Total | 39 | 17 | 56
Chi square = 23.6, p < .001
Phi = 0.65
Percent agreement = (37 + 11)/56 = 48/56 = 85.7%
This field test demonstrates acceptable criterion-referenced reliability.
24
Validity Study
The standard for passing the field test is 25 hits. We need a criterion measure of TRUE racquetball ability. We used self-reported racquetball experience.
Inexperienced = novice player
Experienced = skilled OR completed beginning racquetball class
You want to see if experienced players are more likely to achieve the standard on the field test and inexperienced players are less likely to meet it. This is a criterion-referenced validity study.
25
Criterion-Referenced Validity: What You Would Like to See
Results of field test | Criterion: Inexperienced | Criterion: Experienced
< 25 hits | Many people here | No one here
≥ 25 hits | No one here | Many people here
The criterion is that the student has at least completed a beginning racquetball class.
26
PASW Output: Trial 1 vs. Criterion
Meet standard on Trial 1? | Criterion: Inexperienced | Criterion: Experienced | Total
Did NOT meet standard of 25 | 33 | 10 | 43
DID meet standard of 25 | 5 | 8 | 13
Total | 38 | 18 | 56
Chi square = 6.7, p < .01
Phi = 0.35
Percent agreement = (33 + 8)/56 = 41/56 = 73%
27
PASW Output: Trial 2 vs. Criterion
Meet standard on Trial 2? | Criterion: Inexperienced | Criterion: Experienced | Total
Did NOT meet standard of 25 | 30 | 9 | 39
DID meet standard of 25 | 8 | 9 | 17
Total | 38 | 18 | 56
Chi square = 4.8, p < .03
Phi = 0.29
Percent agreement = (30 + 9)/56 = 39/56 = 70%
The results of the TWO validity studies suggest that this field test, with the standard of 25 hits, is a moderately valid measure of racquetball experience.
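The two validity tables can be checked side by side with one small helper. This is a sketch under the assumption that PASW reported the uncorrected Pearson chi-square (which, for a 2 x 2 table, equals n times phi squared); the function name `validity_stats` is illustrative, not part of any package.

```python
import math


def validity_stats(a, b, c, d):
    """Return (percent agreement, phi, chi-square) for a 2 x 2 table.

    a = did not meet standard and inexperienced
    b = did not meet standard and experienced
    c = met standard and inexperienced
    d = met standard and experienced
    """
    n = a + b + c + d
    # Agreement: inexperienced players who fail plus experienced players who pass.
    percent_agreement = 100 * (a + d) / n
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    chi_square = n * phi**2  # uncorrected chi-square, 1 df
    return percent_agreement, phi, chi_square


# Cell counts taken from the two PASW output tables.
trial_1 = validity_stats(33, 10, 5, 8)
trial_2 = validity_stats(30, 9, 8, 9)

for label, (agree, phi, chi_square) in (("Trial 1", trial_1), ("Trial 2", trial_2)):
    print(f"{label}: agreement = {agree:.0f}%, phi = {phi:.2f}, "
          f"chi-square = {chi_square:.1f}")
```

With the counts above, the results round to the figures shown on the slides (73%, 0.35, 6.7 for Trial 1; 70%, 0.29, 4.8 for Trial 2).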
28
Table 7.8 Research Designs in Epidemiology
Type | Description
Experimental |
Randomized clinical trial | Randomly assign subjects to treatments or exposures
Community trial | Randomly assign whole communities to treatments or exposures
Observational |
Case series | Noting cases at a particular time or place
Cross-sectional | A snapshot of identifiable groups at one point in time
Proportionate mortality or morbidity study | Compare results of a study group to the population
Case-control | Compares known cases of mortality or morbidity with matched noncases
Cohort | Longitudinal, generally long-term tracking of populations
29
Epidemiological Statistics
Incidence: the number, proportion, rate, or percentage of new cases of mortality and morbidity. Incidence could be calculated in a randomized clinical trial or a prospective, longitudinal cohort study.
Prevalence: the number, proportion, rate, or percentage of total cases of mortality and morbidity. Prevalence would be calculated in a cross-sectional study.
30
Estimates of Risk
Absolute risk: the risk (proportion, percentage, rate) of mortality or morbidity in a population that is exposed or not exposed to a risk factor.
Relative risk: the ratio of the risks in the exposed and unexposed populations. This statistic is calculated with incidence measures.
Odds ratio: an estimate of relative risk used in prevalence studies.
Attributable risk: the risk of mortality and morbidity directly related to a risk factor. It can be thought of as the reduction in risk related to removing a risk factor.
31
Table 7.9 Results of a Hypothetical Study Relating Cholesterol and Heart Attack Mortality
Exposure | Outcome: Heart attack deaths | Outcome: No heart attack deaths
High cholesterol | A = 25 | B = 31
No high cholesterol | C = 7 | D = 37
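The risk estimates defined earlier can be worked through with the cell counts in Table 7.9. The sketch below assumes the usual 2 x 2 formulas, and it takes attributable risk to be the simple risk difference between the exposed and unexposed groups, which is one common definition; the slides do not state which formula the chapter uses.

```python
# Cell counts from Table 7.9.
a, b = 25, 31  # high cholesterol: heart attack deaths, no heart attack deaths
c, d = 7, 37   # no high cholesterol: heart attack deaths, no heart attack deaths

# Absolute risk of heart attack death in each exposure group.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)

# Relative risk: ratio of the two absolute risks.
relative_risk = risk_exposed / risk_unexposed

# Odds ratio: cross-product ratio of the 2 x 2 table.
odds_ratio = (a * d) / (b * c)

# Attributable risk, ASSUMED here to be the risk difference.
attributable_risk = risk_exposed - risk_unexposed

print(f"Risk (high cholesterol)    = {risk_exposed:.3f}")
print(f"Risk (no high cholesterol) = {risk_unexposed:.3f}")
print(f"Relative risk              = {relative_risk:.2f}")
print(f"Odds ratio                 = {odds_ratio:.2f}")
print(f"Attributable risk          = {attributable_risk:.3f}")
```

Note that the odds ratio and the relative risk differ noticeably here because the outcome is not rare in this hypothetical sample; the odds ratio approximates the relative risk well only when the outcome is uncommon.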
32
Setting Up An Epidemiological Study in a 2x2 Contingency Table
Outcome (columns): “Bad” thing here | “Good” thing here
Exposure (rows): “Risky” thing here | “Better” thing here
33
Setting Up An Epidemiological Study in a 2x2 Contingency Table
Outcome (columns): Dead | Alive
Exposure (rows): Smoker | Non-smoker
34
Setting Up An Epidemiological Study in a 2x2 Contingency Table
Outcome (columns): “Bad” thing here | “Good” thing here
Exposure (rows): “Risky” thing here | “Better” thing here
Design a Physical Activity Study
35
Setting Up An Epidemiological Study in a 2x2 Contingency Table
Outcome (columns): Dead | Alive
Exposure (rows): Sedentary | Physically active
36
Setting Up An Epidemiological Study in a 2x2 Contingency Table
Outcome (columns): “Bad” thing here | “Good” thing here
Exposure (rows): “Risky” thing here | “Better” thing here
Design a Physical Activity Study about USDHHS PA guidelines
37
Setting Up An Epidemiological Study in a 2x2 Contingency Table
Outcome (columns): Hypertensive | Normotensive
Exposure (rows): Does NOT meet PA guidelines | MEETS PA guidelines