Published by Cameron Conley. Modified over 8 years ago.
1
Chapter 6 Norm-Referenced Reliability and Validity
2
Topics for Discussion
Reliability: consistency, repeatability
Validity: truthfulness
Objectivity: inter-rater reliability
3
Observed, Error, and True Scores
Observed Score = True Score + Error Score
4
Reliability
Reliability is the proportion of observed score variance that is true score variance.
5
Table 6.1 Systolic Blood Pressure Recordings for 10 Subjects
Subject          Observed BP  =  True BP  +  Error BP
1                103             105         -2
2                117             115         +2
3                116             120         -4
4                123             125         -2
5                127             125         +2
6                125             125          0
7                135             125         +10
8                126             130         -4
9                133             135         -2
10               145             145          0
Sum (Σ)          1250            1250         0
Mean (M)         125.0           125.0        0
Std. Dev. (s)    11.6            10.8         4.1
Variance (s²)    133.6        =  116.7    +   16.9
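The definition of reliability on the previous slide can be checked against Table 6.1. The following is a minimal sketch, assuming the summary rows use sample variances (n - 1 in the denominator), which is what reproduces the values 133.6, 116.7, and 16.9; the variable names are illustrative only.

```python
from statistics import variance

# Observed and true systolic blood pressure for the 10 subjects in Table 6.1.
observed = [103, 117, 116, 123, 127, 125, 135, 126, 133, 145]
true = [105, 115, 120, 125, 125, 125, 125, 130, 135, 145]

# Error score = observed score - true score.
error = [o - t for o, t in zip(observed, true)]

# Sample variances (n - 1 denominator), assumed to match the table's summary rows.
var_observed = variance(observed)
var_true = variance(true)
var_error = variance(error)

# Reliability = proportion of observed score variance that is true score variance.
reliability = var_true / var_observed

print(f"observed = {var_observed:.1f}, true = {var_true:.1f}, error = {var_error:.1f}")
print(f"reliability = {reliability:.3f}")
```

For these data the true and error scores happen to be uncorrelated, so the true and error variances add up exactly to the observed variance, and the reliability comes out at roughly .87.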
6
Interclass Reliability (Pearson Product Moment)
Test-retest: Trial 1 vs. Trial 2
Equivalence: Form A vs. Form B
Split halves: Odd vs. Even
7
Table 6.2 Sit-up Performance for 10 Subjects
Subject          Trial 1    Trial 2
1                45         49
2                38         36
3                54         50
4                38         38
5                47         49
6                39         38
7                39         43
8                42         43
9                29         30
10               42         42
Sum (Σ)          413        418
Mean (M)         41.3       41.8
Std. Dev. (s)    6.6        6.5
Variance (s²)    43.6       41.7
r_xx' = .927
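The test-retest coefficient reported for Table 6.2 can be reproduced with a plain Pearson product moment correlation. This is a small sketch, not the authors' own computation; `pearson_r` is a helper written here for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Sit-up scores for the 10 subjects in Table 6.2.
trial_1 = [45, 38, 54, 38, 47, 39, 39, 42, 29, 42]
trial_2 = [49, 36, 50, 38, 49, 38, 43, 43, 30, 42]

r_test_retest = pearson_r(trial_1, trial_2)
print(f"r_xx' = {r_test_retest:.3f}")
```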
8
Spearman-Brown Prophecy Formula
k = (number of items I WANT to estimate the reliability for) / (number of items I HAVE reliability for)
9
Table 6.3 Odd-Even Scores for 10 Subjects
Subject          Odd     Even
1                12      13
2                9       11
3                10      8
4                9       6
5                11      8
6                7       10
7                9       9
8                12      10
9                5       4
10               8       7
Sum (Σ)          92      86
Mean (M)         9.2     8.6
Std. Dev. (s)    2.2     2.6
Variance (s²)    4.8     6.7
r_xx' = .639
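The odd-even correlation in Table 6.3 describes a test half as long as the full test, which is where the Spearman-Brown prophecy formula comes in. The formula itself did not survive in the slide text, so the sketch below assumes its standard form, r_kk = k·r_11 / (1 + (k − 1)·r_11); the function name is illustrative.

```python
def spearman_brown(r_11, k):
    """Estimated reliability after changing test length by a factor of k.

    Assumes the standard Spearman-Brown form:
    r_kk = k * r_11 / (1 + (k - 1) * r_11)
    """
    return k * r_11 / (1 + (k - 1) * r_11)

# Split-half (odd-even) correlation from Table 6.3.
r_half = 0.639

# The full test has twice as many items as each half, so k = 2 / 1 = 2.
r_full = spearman_brown(r_half, k=2)
print(f"stepped-up reliability = {r_full:.2f}")

# Shortening a test lowers the estimate: k = 0.25 means one quarter the length.
print(f"r_11 = .50, k = 0.25 -> {spearman_brown(0.50, 0.25):.2f}")
```

With k = 2 the split-half value of .639 steps up to about .78 for the full-length test.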
10
Table 6.4 Values of r_kk From the Spearman-Brown Prophecy Formula
         k (change in test length)
r_11     .25    .50    1.5    2.0    3.0    4.0    5.0
.10      .03    .05    .14    .18    .25    .31    .36
.22      .07    .12    .30    .36    .46    .53    .59
.40      .14    .25    .50    .57    .67    .73    .77
.50      .20    .33    .60    .67    .75    .80    .83
.60      .27    .43    .69    .75    .82    .86    .88
.68      .35    .52    .76    .81    .86    .89    .91
.80      .50    .67    .86    .89    .92    .94    .95
.92      .74    .85    .95    .96    .97    .98    .98
.96      .86    .92    .97    .98    .99    .99    .99
11
Table 6.5 Effect of a Constant Change in Measures
Subject          Trial 1    Trial 2
1                15         25
2                17         27
3                10         20
4                20         30
5                23         33
6                26         36
7                27         37
8                30         40
9                32         42
10               33         43
Sum (Σ)          233        333
Mean (M)         23.3       33.3
Std. Dev. (s)    7.7        7.7
Variance (s²)    59.1       59.1
r_xx' = 1.00
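Table 6.5 shows that the interclass (Pearson) coefficient does not register a constant shift between trials. A short sketch of that point, with `pearson_r` written here only for illustration:

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Trial 1 scores from Table 6.5; every Trial 2 score is exactly 10 higher.
trial_1 = [15, 17, 10, 20, 23, 26, 27, 30, 32, 33]
trial_2 = [score + 10 for score in trial_1]

r = pearson_r(trial_1, trial_2)
mean_shift = sum(trial_2) / len(trial_2) - sum(trial_1) / len(trial_1)

# The correlation is perfect even though no subject repeated their score.
print(f"r_xx' = {r:.2f}, mean change = {mean_shift:.1f}")
```

Every subject scored 10 points higher on the second trial, yet the correlation is still 1.00, because Pearson's r reflects only the relative ordering and spacing of scores.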
12
Intraclass Reliability
ANOVA model
Cronbach's alpha coefficient
Alpha coefficient
13
Intraclass (ANOVA) Reliabilities
Common terms you will encounter:
Alpha reliability
Kuder-Richardson Formula 20 (KR-20)
Kuder-Richardson Formula 21 (KR-21)
ANOVA reliabilities
14
Table 6.6 Calculating the Alpha Coefficient
Subject    Trial 1    Trial 2    Trial 3    Total
1          3          5          3          11
2          2          2          2          6
3          6          5          3          14
4          5          3          5          13
5          3          4          4          11
ΣX         19         19         17         55
ΣX²        83         79         63         643
s²         2.70       1.70       1.30       9.50
15
Calculating the Alpha Coefficient
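The formula on this slide did not survive in the text. As a hedged sketch, the block below applies the standard form of Cronbach's alpha, alpha = (k / (k − 1)) · (1 − Σs²_trial / s²_total), to the Table 6.6 data; whether the slide used exactly this notation is an assumption.

```python
from statistics import variance

# Scores from Table 6.6: one list per trial, subjects 1-5 in order.
trials = [
    [3, 2, 6, 5, 3],  # Trial 1
    [5, 2, 5, 3, 4],  # Trial 2
    [3, 2, 3, 5, 4],  # Trial 3
]

# Total score for each subject across the three trials.
totals = [sum(scores) for scores in zip(*trials)]

k = len(trials)
sum_trial_variances = sum(variance(t) for t in trials)  # 2.70 + 1.70 + 1.30
total_variance = variance(totals)                       # 9.50

# Standard form of coefficient alpha (assumed to match the slide's formula).
alpha = (k / (k - 1)) * (1 - sum_trial_variances / total_variance)
print(f"alpha = {alpha:.2f}")
```

With the trial variances summing to 5.70 and a total variance of 9.50, alpha works out to (3/2)(1 − 0.60) = .60.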
16
Index of Reliability
The theoretical correlation between observed scores and true scores.
17
Standard Error of Measurement
Reflects the degree to which a person's observed score fluctuates as a result of errors of measurement.
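Neither the index of reliability nor the standard error of measurement (SEM) is given as a formula in the slide text. The sketch below assumes the usual forms, index = sqrt(r_xx') and SEM = s · sqrt(1 − r_xx'), and applies them to the rounded variances from Table 6.1.

```python
import math

# Rounded summary values from Table 6.1 (systolic blood pressure).
var_observed = 133.6
var_true = 116.7

# Reliability as the ratio of true score variance to observed score variance.
r_xx = var_true / var_observed

# Index of reliability: theoretical correlation of observed with true scores.
index_of_reliability = math.sqrt(r_xx)

# Standard error of measurement, assuming SEM = s * sqrt(1 - r_xx').
s_observed = math.sqrt(var_observed)
sem = s_observed * math.sqrt(1 - r_xx)

print(f"index of reliability = {index_of_reliability:.3f}")
print(f"SEM = {sem:.1f}")
```

The SEM of about 4.1 matches the standard deviation of the error scores in Table 6.1, which is the quantity the SEM is meant to estimate.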
18
Factors Affecting Test Reliability
1) Fatigue
2) Practice
3) Subject variability
4) Time between testing
5) Circumstances surrounding the testing periods
6) Appropriate difficulty for testing subjects
7) Precision of measurement
8) Environmental conditions
19
[Figure: Decline in Reliability for the Harvard Alumni Activity Survey as the Time Between Testing Periods Increases. X-axis: Months Between Test-Retest]
20
Validity Types
Content-related validity
Criterion-related validity (statistical or correlational)
  Concurrent
  Predictive
Construct-related validity
21
Standard Error of Estimate
Also called the standard error or the standard error of prediction.
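The slide names the standard error of estimate (SEE) without its formula. The sketch below assumes the usual form, SEE = s_y · sqrt(1 − r_xy²), where s_y is the standard deviation of the criterion and r_xy is the validity coefficient. Both input values are hypothetical and chosen only for illustration.

```python
import math

def standard_error_of_estimate(s_y, r_xy):
    """SEE = s_y * sqrt(1 - r_xy**2), the assumed standard form."""
    return s_y * math.sqrt(1 - r_xy ** 2)

# Hypothetical inputs: criterion standard deviation and validity coefficient.
s_criterion = 10.0
validity = 0.66

see = standard_error_of_estimate(s_criterion, validity)
print(f"SEE = {see:.2f}")

# A perfectly valid predictor leaves no prediction error;
# a predictor with zero validity leaves the full criterion spread.
print(standard_error_of_estimate(s_criterion, 1.0))
print(standard_error_of_estimate(s_criterion, 0.0))
```

Unlike the standard error of measurement, which depends on a reliability coefficient, the SEE depends on a validity coefficient and describes error in predicting the criterion.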
22
Standard Errors
SE of Measurement
SE of Estimate
23
Methods of Obtaining a Criterion Measure
Actual participation (e.g., golf, archery)
Perform the criterion
Known valid criterion (e.g., treadmill performance)
Expert judges
Panel judges
Tournament participation
Round robin
Known valid test
24
Table 6.7 Correlation Matrix for Development of a Golf Skills Test (From Green et al., 1987)
                        Playing golf   Long putt   Chip shot   Pitch shot   Middle distance shot   Drive shot
Playing golf            1.00
Long putt                .59           1.00
Chip shot                .58            .47        1.00
Pitch shot               .54            .37         .35        1.00
Middle distance shot     .66            .55         .61         .40         1.00
Drive shot              -.65           -.62        -.48        -.52         -.79                   1.00
What are these? Concurrent validity coefficients
25
Table 6.8 Concurrent Validity Coefficients for Golf Test
Battery           Items                                                    Validity coefficient
2-item battery    Middle distance shot, Pitch shot                         .72
3-item battery    Middle distance shot, Pitch shot, Long putt              .76
4-item battery    Middle distance shot, Pitch shot, Long putt, Chip shot   .77
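One way to see where battery coefficients like those in Table 6.8 can come from is to compute the multiple correlation R between the criterion (playing golf) and each set of items, using the correlations in Table 6.7. This is a sketch under that assumption; the slides do not state how the published values were obtained, and the helper names are illustrative.

```python
import math

# Correlations from Table 6.7 needed for the three batteries.
pairs = {
    ("putt", "golf"): 0.59,
    ("chip", "golf"): 0.58, ("chip", "putt"): 0.47,
    ("pitch", "golf"): 0.54, ("pitch", "putt"): 0.37, ("pitch", "chip"): 0.35,
    ("middle", "golf"): 0.66, ("middle", "putt"): 0.55,
    ("middle", "chip"): 0.61, ("middle", "pitch"): 0.40,
}

def corr(a, b):
    """Look up a correlation regardless of the order of the pair."""
    if a == b:
        return 1.0
    return pairs.get((a, b), pairs.get((b, a)))

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def multiple_r(criterion, predictors):
    """Multiple correlation of the criterion with a set of predictors."""
    r_xx = [[corr(a, b) for b in predictors] for a in predictors]
    r_xy = [corr(p, criterion) for p in predictors]
    betas = solve(r_xx, r_xy)  # standardized regression weights
    return math.sqrt(sum(b * r for b, r in zip(betas, r_xy)))

r_2 = multiple_r("golf", ["middle", "pitch"])
r_3 = multiple_r("golf", ["middle", "pitch", "putt"])
r_4 = multiple_r("golf", ["middle", "pitch", "putt", "chip"])
print(f"2-item: {r_2:.3f}, 3-item: {r_3:.3f}, 4-item: {r_4:.3f}")
```

Computed from the two-decimal correlations, the results land within about .01 of the values in Table 6.8, a gap that is plausibly due to rounding in the published matrix. Each added item raises R by less than the one before it.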
26
Figure 6.1 Diagram of Validity and Reliability Terms
27
Interpreting the “r” You Obtain
28
Various Correlations
Variables in the matrix (each correlates 1.00 with itself):
Actual Golf Score (Criterion)
Putting Test, Version A (Trial 1)
Putting Test, Version A (Trial 2)
Driving Test, Version A (Trial 1)
Driving Test, Version A (Trial 2)
Swing Form Test, Version A (Rating 1)
Swing Form Test, Version A (Rating 2)
Coefficient types labeled in the matrix, by row:
Putting Test (Trial 1): validity coefficient (r_XY)
Putting Test (Trial 2): reliability coefficient (r_XX')
Driving Test (Trial 1): Pearson product moment correlation coefficients (r)
Driving Test (Trial 2): reliability coefficient (r_XX')
Swing Form Test (Rating 1): Pearson product moment correlation coefficients (r_XY)
Swing Form Test (Rating 2): objectivity coefficient (r_XX')
29
Interpret These Correlations
                     Actual golf score   Putting Trial 1   Putting Trial 2   Driving Trial 1   Driving Trial 2   Observer 1   Observer 2
Actual golf score    1.00
Putting T1            .78                1.00
Putting T2            .74                 .83              1.00
Driving T1            .58                 .21               .25              1.00
Driving T2            .68                 .25               .30               .70              1.00
Observer 1            .48                 .34               .40               .43               .38              1.00
Observer 2            .39                 .30               .41               .47               .35               .50         1.00
What are these? Concurrent validity coefficients: the correlations of each test with the criterion (actual golf score), i.e., the first column (.78, .74, .58, .68, .48, .39).
30
Interpret These Correlations
                     Actual golf score   Putting Trial 1   Putting Trial 2   Driving Trial 1   Driving Trial 2   Observer 1   Observer 2
Actual golf score    1.00
Putting T1            .78                1.00
Putting T2            .74                 .83              1.00
Driving T1            .58                 .21               .25              1.00
Driving T2            .68                 .25               .30               .70              1.00
Observer 1            .48                 .34               .40               .43               .38              1.00
Observer 2            .39                 .30               .41               .47               .35               .50         1.00
What are these? Reliability coefficients: the correlations between two trials of the same test (.83 for putting, .70 for driving).
31
Interpret These Correlations
                     Actual golf score   Putting Trial 1   Putting Trial 2   Driving Trial 1   Driving Trial 2   Observer 1   Observer 2
Actual golf score    1.00
Putting T1            .78                1.00
Putting T2            .74                 .83              1.00
Driving T1            .58                 .21               .25              1.00
Driving T2            .68                 .25               .30               .70              1.00
Observer 1            .48                 .34               .40               .43               .38              1.00
Observer 2            .39                 .30               .41               .47               .35               .50         1.00
What is this? Objectivity coefficient: the correlation between the two observers' ratings (.50).
32
Scatterplot: Two Trials of Leg Press
[Figure: scatterplot showing the prediction line and the line of identity]
33
Correlation: Two Trials of Leg Press
34
Concurrent Validity
This square represents variance in performance in a skill (e.g., golf).
35
Concurrent Validity
The different colors and patterns represent different parts of a skills test battery to measure the criterion (e.g., golf).
36
Concurrent Validity
The orange color represents ERROR, or unexplained variance, in the criterion (e.g., golf).
37
Concurrent Validity
[Figure: four possible skills test batteries, labeled A, B, C, and D]
Consider the concurrent validity of the 4 possible skills test batteries.
38
Concurrent Validity
[Figure: four possible skills test batteries, labeled A, B, C, and D]
Which test battery would you be LEAST likely to use? Why?
D: it has the MOST error and requires 4 tests to be administered.
39
Concurrent Validity
[Figure: four possible skills test batteries, labeled A, B, C, and D]
Which test battery would you be MOST likely to use? Why?
C: it has the LEAST error, but it requires 3 tests to be administered.
40
Concurrent Validity
[Figure: four possible skills test batteries, labeled A, B, C, and D]
Which test battery would you use if you are limited in time?
A or B: each requires only 1 or 2 tests to be administered, but you lose some validity.
41
PASW Examples