Published by Ashlynn Wilkins. Modified over 8 years ago.
1
Utilizing Item Analysis to Improve the Evaluation of Student Performance
Mihaiela Ristei Gugiu
Central Michigan University
2
Item Quality & Grade Distribution
- Reliability
- Point Biserial Correlation
- Item Difficulty
- Item Discrimination
3
CTT (classical test theory) v. IRT (item response theory)
- Cook, Eignor & Taft (1988): CTT estimates were more stable than IRT estimates.
- Lawson (1991): Estimates from CTT and IRT were “almost identical.”
- Ndalichako & Rogers (1997): Estimates from CTT and IRT (1-, 2-, and 3-PL) were almost perfectly correlated.
- Fan (1998): Invariance of estimates under CTT was as good as, if not better than, that of estimates under IRT.
4
Cronbach’s alpha (α)
- Estimates the internal consistency of items
- Poor item: α increases if the item is deleted
- Good item: α decreases if the item is deleted
- Weakness: there is no standard for how large the change from the overall Cronbach’s α needs to be
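The "α if item deleted" check can be sketched in a few lines. This is a minimal Python illustration, not the SAS 9.2 analysis used in the presentation; the function names and the toy data are assumptions made for the example, and items are assumed to be scored 0/1.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per examinee)."""
    k = len(scores[0])  # number of items
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    # Raises ZeroDivisionError if every examinee has the same total score.
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_deleted(scores):
    """Alpha recomputed with each item left out in turn."""
    k = len(scores[0])
    return [
        cronbach_alpha([row[:j] + row[j + 1:] for row in scores])
        for j in range(k)
    ]


# Hypothetical data: items 0-2 agree with each other, item 3 runs against them.
toy = [[1, 1, 1, 0], [0, 0, 0, 1], [1, 1, 1, 0], [0, 0, 0, 1]]
overall = cronbach_alpha(toy)
for j, a in enumerate(alpha_if_deleted(toy)):
    direction = "increases" if a > overall else "does not increase"
    print(f"item {j}: alpha if deleted = {a:.2f} (alpha {direction})")
```

In the toy data only the deletion of item 3 raises α, which is the pattern the slide labels a poor item.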
5
Point Biserial Correlation
Estimates how well a dichotomously scored item correlates with the total test score
- Poor: ρ_bis < 0
- Acceptable: 0 < ρ_bis < .30
- Good: .30 ≤ ρ_bis < .50
- Very Good: ρ_bis ≥ .50
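For a 0/1 item, the point biserial correlation is the Pearson correlation between the item score and the test score. The sketch below is an illustrative Python version with assumed function names. The `corrected` flag removes the item from the total before correlating, which is one common meaning of a "corrected" coefficient. Treating a value of exactly 0 as Poor is also an assumption, since the thresholds above leave that case open.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; undefined if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def point_biserial(scores, item, corrected=True):
    """Correlate one 0/1 item with the total score.

    With corrected=True the item itself is excluded from the total.
    """
    x = [row[item] for row in scores]
    totals = [sum(row) - (row[item] if corrected else 0) for row in scores]
    return pearson(x, totals)


def rate_point_biserial(r):
    """Map a coefficient to the quality labels listed above."""
    if r >= 0.50:
        return "Very Good"
    if r >= 0.30:
        return "Good"
    if r > 0:
        return "Acceptable"
    return "Poor"  # assumption: exactly 0 is grouped with negative values


# Hypothetical data: item 3 is answered correctly only by low scorers.
toy = [[1, 1, 1, 0], [0, 0, 0, 1], [1, 1, 1, 0], [0, 0, 0, 1]]
for j in range(4):
    r = point_biserial(toy, j)
    print(f"item {j}: {r:+.2f} {rate_point_biserial(r)}")
```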
6
Item Difficulty
- Percentage of examinees who answered an item correctly
- Optimal difficulty level: 0.50 (50%)
- Accounting for guessing: 50% + 50%/(no. of choices)
- E.g., 4-choice item, optimal difficulty: 0.625 (62.5%)
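The guessing adjustment is simple arithmetic, shown here as a small Python sketch. The function names are illustrative, and proportions are used instead of percentages.

```python
def item_difficulty(scores, item):
    """Proportion of examinees who answered the item correctly (0/1 scoring)."""
    return sum(row[item] for row in scores) / len(scores)


def optimal_difficulty(n_choices):
    """Optimal difficulty adjusted for guessing: 0.50 + 0.50 / n_choices."""
    return 0.5 + 0.5 / n_choices


print(optimal_difficulty(4))  # 0.625, the 4-choice example above
```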
7
Index of Discrimination (D-index)
Distinguishes between the test performance of high achievers (top 25%) and low achievers (bottom 25%)
Takes values between -1 and +1
- Poor: D < .20
- Acceptable: .20 ≤ D < .30
- Good: .30 ≤ D < .40
- Very Good: D ≥ .40
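A common way to compute D is the proportion correct in the top group minus the proportion correct in the bottom group. The Python sketch below follows that definition under two assumptions not stated on the slide: group size is 25% of the class rounded down, and ties at the group boundary are broken by the order of the input rows. The names and data are illustrative.

```python
def d_index(scores, item, fraction=0.25):
    """Proportion correct in the top group minus the bottom group."""
    ranked = sorted(scores, key=sum, reverse=True)  # highest totals first
    n = max(1, int(len(ranked) * fraction))  # group size, rounded down
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


def rate_d_index(d):
    """Map D to the quality labels listed above."""
    if d >= 0.40:
        return "Very Good"
    if d >= 0.30:
        return "Good"
    if d >= 0.20:
        return "Acceptable"
    return "Poor"


# Hypothetical class of 8, so each group holds 2 examinees.
toy = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
for j in range(5):
    d = d_index(toy, j)
    print(f"item {j}: D = {d:+.2f} ({rate_d_index(d)})")
```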
8
Data & Methodology
- Class on Political Behavior (N=41)
- 3 multiple-choice exams
- SAS 9.2 software
Gender (%): Male 70.7, Female 29.3
Class (%): Freshmen, Sophomore, Junior, Senior (53.7, 26.8, 9.8)
9
Summary of Recommendations: Existing Methods
Method | Retention, Strong (+2 points) | Retention, Weak (+1 point) | Revision (0 points) | Omission (-1 point)
Cronbach’s α | Drop in α ≥ 0.02 | Drop in α < 0.02 | Increase in α ≤ 0.02 | Increase in α > 0.02
ρ_bis | ρ_bis ≥ 0.5 | 0.3 ≤ ρ_bis < 0.5 | 0 < ρ_bis < 0.3 | ρ_bis ≤ 0
D-index 1 | D_1 ≥ 0.4 | 0.3 ≤ D_1 < 0.4 | 0.2 ≤ D_1 < 0.3 | D_1 < 0.2
Item Difficulty 1 | 62.5% ± 5% | 62.5% ± 10% | 62.5% ± 15% | Otherwise
Composite 1: Retention if sum is positive; Revision if sum is 0; Omission if sum is negative
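The point scheme in this summary can be expressed as a small scoring routine. The Python sketch below is one reading of the table, not the author's SAS code. It assumes that Composite 1 is the sum of the four method scores, that an unchanged α earns 0 points, and that the ± bands are in percentage points around 62.5%. All function names are illustrative.

```python
def alpha_points(alpha_overall, alpha_deleted):
    """Score the change in alpha when the item is deleted."""
    drop = alpha_overall - alpha_deleted
    if drop >= 0.02:
        return 2
    if drop > 0:
        return 1
    if drop >= -0.02:  # assumption: no change counts as revision (0 points)
        return 0
    return -1


def rpb_points(r):
    """Score the point biserial correlation."""
    if r >= 0.5:
        return 2
    if r >= 0.3:
        return 1
    if r > 0:
        return 0
    return -1


def d1_points(d):
    """Score the D-index under the existing thresholds."""
    if d >= 0.4:
        return 2
    if d >= 0.3:
        return 1
    if d >= 0.2:
        return 0
    return -1


def difficulty_points(p, target=0.625):
    """Score the distance of the proportion correct from the target."""
    gap = abs(p - target)
    if gap <= 0.05:
        return 2
    if gap <= 0.10:
        return 1
    if gap <= 0.15:
        return 0
    return -1


def composite_1(alpha_overall, alpha_deleted, r, d, p):
    """Sum the four scores and map the sign to a recommendation."""
    total = (
        alpha_points(alpha_overall, alpha_deleted)
        + rpb_points(r)
        + d1_points(d)
        + difficulty_points(p)
    )
    if total > 0:
        return total, "retain"
    if total == 0:
        return total, "revise"
    return total, "omit"


# Hypothetical item statistics, not taken from the presentation.
print(composite_1(0.80, 0.77, 0.35, 0.25, 0.90))
```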
10
Midterm 1 Exam: Raw Scores
11
Midterm 1 Exam: Cronbach’s α
12
Midterm 1 Exam: Corrected ρ_bis
13
Midterm 1 Exam: D-index
14
Midterm 1 Exam: Item Difficulty
15
Midterm 1 Exam: Composite 1
16
Example of a Bad Item
The median can be computed for each of the following levels of measurement, EXCEPT:
a) interval
b) nominal*
c) ratio
d) ordinal
Note: the correct response is marked with an asterisk.
17
Example of a Good Item
A crucial difference between stratified sampling and quota sampling is that the observations in the former are selected:
a) in a purposive manner
b) in a random manner*
c) in a convenient manner
d) there is no difference
Note: the correct response is marked with an asterisk.
18
Summary of Recommendations: Revised Methods
Method | Retention, Strong (+2 points) | Retention, Weak (+1 point) | Revision (0 points) | Omission (-1 point)
D-index 2 | 0.1 ≤ D_2 < 0.25 | 0.05 ≤ D_2 < 0.1 | D_2 ≥ 0.25 | Otherwise
Item Difficulty 2 | Target mean% ± 5% | Target mean% ± 10% | Target mean% ± 15% | Otherwise
Composite 2: Retention if sum is positive; Revision if sum is 0; Omission if sum is negative
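The revised thresholds can be sketched the same way. The slide does not say which method scores enter Composite 2, so the Python sketch below scores only the two revised methods and leaves the list of summed points to the caller. The target mean of 0.80 in the usage line is a hypothetical value, and all names are illustrative.

```python
def d2_points(d):
    """Score the D-index under the revised thresholds."""
    if 0.1 <= d < 0.25:
        return 2
    if 0.05 <= d < 0.1:
        return 1
    if d >= 0.25:
        return 0
    return -1


def difficulty2_points(p, target_mean):
    """Score the distance of the proportion correct from the target mean."""
    gap = abs(p - target_mean)
    if gap <= 0.05:
        return 2
    if gap <= 0.10:
        return 1
    if gap <= 0.15:
        return 0
    return -1


def recommend(points):
    """Map the sign of the summed points to a recommendation."""
    total = sum(points)
    if total > 0:
        return "retain"
    if total == 0:
        return "revise"
    return "omit"


# Hypothetical item: D = 0.15, 78% correct, target mean of 80%.
print(recommend([d2_points(0.15), difficulty2_points(0.78, 0.80)]))
```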
19
Midterm 1 Exam: D-index 2
20
Midterm 1 Exam: Item Difficulty 2
21
Midterm 1 Exam: Composite 2
22
Raw Data: Final Course Grades
23
Composite 1: Final Course Grades
24
Composite 2: Final Course Grades
25
Thank you!