Published by Roderick Willis. Modified over 9 years ago.
1
Examining Rounding Rules in Angoff-Type Standard Setting Methods. Adam E. Wyse, Mark D. Reckase
2
Current Projects
Multidimensional Item Response Theory – Development of methodology for fine-grained analysis of item response data in high-dimensional spaces. Application of methodology to gain understanding of constructs assessed by tests.
Test Design and Construction – Design of content and statistical specifications for tests using the philosophy of item response theory. Use of computerized test assembly procedures to match test specifications.
Portfolio Assessment – Design of portfolio assessment systems, including formal objective scoring of portfolios.
Procedures for Setting Standards – Development and evaluation of procedures for setting standards on educational and psychological tests. Includes extensive work on setting standards on the National Assessment of Educational Progress.
Computerized Adaptive Testing – Developing procedures for selecting and administering test items to individuals using computer technology. In particular, designing systems to match item selection to the specific requirements for test use.
3
Angoff Method Panelists estimate the probability that the minimally competent examinee (MCE) would respond correctly to each item.
4
Modified Angoff Method (1) Round each rating to a whole number of score points (for dichotomous items this is the Yes/No method). Figure panels: polytomous; dichotomous.
5
Modified Angoff Method (2) Rate the MCE's score on each cluster of items. – Round to 1 decimal place – Round to integer
6
Modified Angoff Method (3) How to aggregate the raters' judgments – Mean or median (the median limits the effect of outliers)
mean      median
18.1667   18.4
20.8833   21
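As a small illustration of why the median is used to limit the effect of outliers, the sketch below aggregates a set of panelist cut scores both ways. The values in `panelist_cuts` are hypothetical and are not taken from the presentation.

```python
from statistics import mean, median

# Hypothetical panelist cut scores (illustration only); the last value is an outlier.
panelist_cuts = [17.5, 18.0, 18.4, 18.6, 19.0, 26.0]

group_mean = mean(panelist_cuts)      # pulled upward by the outlier
group_median = median(panelist_cuts)  # stays near the bulk of the panel

print("mean  :", round(group_mean, 4))
print("median:", round(group_median, 4))
```

With one extreme panelist, the mean moves toward the outlier while the median stays close to the other five ratings.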
7
Theoretical Framework (Reckase, 2006) Figure panels: round to integer; round to 0.05. The panelist is assumed to understand perfectly the relation between item difficulty and the cut θ.
8
Theoretical Framework (Reckase, 2006) Figure panels: round to 1 decimal place; round to 2 decimal places.
9
Theoretical Framework Bias – Individual panelist's cut-score – Group-level cut-scores: mean or median. Other evidence for evaluating standard setting – Correlation between the item ratings provided by panelists and the item p-values; this cannot detect a panelist's severity. Errors can be incorporated into the Reckase evaluation approach.
10
Theoretical Framework Assumptions – Only a single round is considered (no training effect) – Ratings contain no error (an ideal setting). Goal: investigate the impact of the Angoff modifications and rounding rules in this ideal situation.
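The ideal-panelist setup described above can be sketched as follows: a panelist with an intended cut θ rates each item with its model probability at that θ, the ratings are rounded, the rounded ratings are summed, and the sum is mapped back to θ through the test characteristic curve. This is a minimal sketch, not the authors' code. The 2PL item parameters, the intended θ of 0.4, and the helper names are all assumptions made for illustration.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def round_to_step(x, step):
    """Round x to the nearest multiple of step (1, 0.05, 0.01, ...)."""
    return round(x / step) * step

def theta_from_cut(cut, items, lo=-6.0, hi=6.0):
    """Invert the (monotone) test characteristic curve by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        tcc = sum(p_2pl(mid, a, b) for a, b in items)
        if tcc < cut:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical item pool: 13 items, a = 1, b from -1.5 to 1.5 (not the NAEP pools).
items = [(1.0, -1.5 + 0.25 * i) for i in range(13)]
intended_theta = 0.4  # hypothetical intended cut score

bias_by_rule = {}
for step in (1, 0.05, 0.01):
    # Error-free ratings at the intended theta, then rounded per the rule.
    ratings = [round_to_step(p_2pl(intended_theta, a, b), step) for a, b in items]
    estimated_theta = theta_from_cut(sum(ratings), items)
    bias_by_rule[step] = estimated_theta - intended_theta
    print(f"step={step}: bias={bias_by_rule[step]:+.4f}")
```

Because the panelist is error-free by construction, any difference between the estimated and intended θ in this sketch comes from rounding alone.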
11
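Data and Method NAEP data – 20 raters, last round – Each panelist's θ cut-score from NAEP was taken as that panelist's intended cut-score. Models: 2PL, 3PL, and GPCM. Expected score for a polytomous item: $E(X \mid \theta) = 1 \cdot P_1(\theta) + 2 \cdot P_2(\theta) + 3 \cdot P_3(\theta) + 4 \cdot P_4(\theta)$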
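The expected-score expression on this slide can be computed directly once the category probabilities are available. The sketch below uses the standard generalized partial credit model (GPCM) form for those probabilities. The discrimination `a` and the step parameters in `steps` are hypothetical values for a five-category item (scores 0 to 4), not NAEP parameters.

```python
import math

def gpcm_probs(theta, a, steps):
    """Category probabilities P_0..P_m under the GPCM.

    P_k is proportional to exp(sum over v <= k of a * (theta - b_v)),
    with an empty sum (0) for category 0.
    """
    cumulative = [0.0]
    for b in steps:
        cumulative.append(cumulative[-1] + a * (theta - b))
    peak = max(cumulative)  # subtract the max before exponentiating, for stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(theta, a, steps):
    """E(X | theta) = sum over k of k * P_k(theta); the k = 0 term vanishes."""
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, steps)))

# Hypothetical item parameters (illustration only).
a, steps = 1.0, [-1.0, -0.3, 0.4, 1.2]
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(expected_score(theta, a, steps), 3))
```

Under the ideal-panelist assumption, the unrounded Angoff rating for a polytomous item is this expected score evaluated at the panelist's intended cut θ.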
12
Simulated conditions Rounding rules – Integer: 1.2345 → 1 – Nearest 0.05: 1.2345 → 1.25 – Nearest 2 decimal places: 1.2345 → 1.23 Item pools – 180, 107, 109, and 53 items
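The three rounding rules can be written as one helper that rounds to the nearest multiple of a step size. The function name `round_to_step` is a hypothetical choice for this sketch.

```python
def round_to_step(x, step):
    """Round x to the nearest multiple of step.

    The outer round(..., 10) only trims floating-point noise such as
    1.2300000000000002; it does not change the rounding rule.
    """
    return round(round(x / step) * step, 10)

rating = 1.2345
rounded = {step: round_to_step(rating, step) for step in (1, 0.05, 0.01)}
for step, value in rounded.items():
    print(f"nearest {step}: {rating} -> {value}")
```

Python's built-in `round` sends exact ties to the nearest even number, so a rating that falls exactly halfway between two steps may round differently than under a round-half-up convention. The slide's example value is not a tie, so the distinction does not arise here.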
13
Simulated conditions Rating unit – Individual items vs. clusters of items Cut-scores – Basic, Proficient, and Advanced Aggregating value – Mean vs. median
14
Evaluation Criteria Bias: – Average absolute bias: – Bias for the group’s intended cut score – mean: – median:
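One plausible reading of these criteria is sketched below. It assumes that a panelist's bias is the estimated cut θ minus the intended cut θ, which matches the sign convention of the worked example in the Discussion (intended θ = 0, estimated θ = -0.438, bias = -0.438). The group-level definitions (difference of means, difference of medians) are an assumption, and all θ values are hypothetical.

```python
from statistics import mean, median

# Hypothetical intended and estimated cut thetas for four panelists.
intended = [0.10, 0.30, 0.50, 0.70]
estimated = [0.05, 0.40, 0.45, 0.90]

# Individual-level bias: estimated minus intended, per panelist.
bias = [e - i for e, i in zip(estimated, intended)]

# Average absolute bias across panelists.
avg_abs_bias = mean(abs(b) for b in bias)

# Group-level bias, assuming the group's intended cut is the mean (or median)
# of the panelists' intended cuts.
mean_bias = mean(estimated) - mean(intended)
median_bias = median(estimated) - median(intended)

print(round(avg_abs_bias, 4), round(mean_bias, 4), round(median_bias, 4))
```

Positive and negative individual biases partly cancel in the group-level measures but not in the average absolute bias.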
15
Result – individual panelist Bias by rounding rule: integer > nearest 0.05 > 2 decimal places
16
Result – individual panelist Bias by cut-score location: Advanced > Basic > Proficient
17
Result – individual panelist Bias by rating unit: individual items > cluster level (clusters involve fewer rounding errors)
18
Result – individual panelist Item pool: the 53-item pool has greater bias than the other pools
19
Result – individual panelist Item pool: for the Proficient cut-score with integer rounding, the 53-item pool shows less bias than the 180-item pool. What matters is the location of the cut-score relative to the distribution of the items.
20
Result – group of panelists In some cases the mean is better; in others the median is better.
21
Result – group of panelists Basic showed negative bias; Proficient and Advanced showed positive bias. At the item-cluster level, Proficient showed negative bias.
22
Result – group of panelists The Advanced cut-score produced greater bias than the other two levels. The bias did not cancel out across a group of panelists.
23
Result – group of panelists Both the mean and the median bias were below 0.01 when rounding to the nearest 0.05 and to 2 decimal places. Again, more test items did not necessarily produce less bias.
24
Result – group of panelists Rating at the cluster level produces less bias than rating individual items.
25
Impact on Percent Above Cut-score (PAC) The PAC was found for the closest value on the NAEP scale in the pilot study. PAC difference = PAC for the estimated θ − PAC for the intended θ. Rounding to the nearest 0.05 or the nearest 0.01 left the PAC essentially unchanged (little or no impact).
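The PAC difference can be illustrated with a simple stand-in for the ability distribution. The sketch below assumes a standard normal distribution of θ, which is an assumption for illustration only; the presentation used the empirical NAEP pilot-study distribution. The θ values are hypothetical.

```python
import math

def pac(theta_cut):
    """Percent of examinees above theta_cut, assuming theta ~ N(0, 1)."""
    normal_cdf = 0.5 * (1.0 + math.erf(theta_cut / math.sqrt(2.0)))
    return 100.0 * (1.0 - normal_cdf)

def pac_difference(estimated_theta, intended_theta):
    """PAC for the estimated cut minus PAC for the intended cut."""
    return pac(estimated_theta) - pac(intended_theta)

# Same-sized bias (0.1) applied near the centre and in the upper tail.
centre_shift = pac_difference(0.1, 0.0)
tail_shift = pac_difference(1.6, 1.5)
print(round(centre_shift, 3), round(tail_shift, 3))
```

A negative bias in θ lowers the cut and so raises the PAC, and the same bias changes the PAC more where examinees are dense than in the tail of the distribution.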
26
Impact on Percent Above Cut-score (PAC) Basic: 5.610~13.010 Proficient: -3.823~-4.387 Advanced: -1.156~-1.262
27
Impact on Percent Above Cut-score (PAC) Basic: 4.490~14.190 Proficient: -4.387~-5.346 Advanced: -1.156~-1.343
28
Impact on Percent Above Cut-score (PAC) Bias: Advanced > Basic and Proficient. PAC impact: Advanced < Basic and Proficient. There are more students near the Basic and Proficient cut-scores.
29
Impact on Percent Above Cut-score (PAC) Rounding to integers does not present a viable alternative in the Angoff method.
30
Discussion Rounding to integers can affect the cut-scores. – Rating at the item-cluster level can mitigate the bias, but some bias remains. Using more test items will not necessarily produce less bias. – What matters is the location of the items in relation to the intended cut-score.
31
Discussion Example: 10 items with difficulties in [-2, +2]; intended cut-score θ = 0 – 5 items rounded to score 1 – 5 items rounded to score 0. Cut total score = 5 → estimated θ = 0; bias = 0
32
Discussion Example: 20 items with difficulties in [-1, +3]; intended cut-score θ = 0 – 5 items rounded to score 1 – 15 items rounded to score 0. Cut total score = 5 → estimated θ = -0.438; bias = -0.438
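The two examples can be reproduced approximately with a short simulation. The sketch assumes a Rasch model and item difficulties evenly spaced over the stated ranges. Neither assumption is stated on the slides, although under them the second case appears to land close to the slide's value of -0.438.

```python
import math

def p_rasch(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def evenly_spaced(lo, hi, n):
    """n difficulties evenly spaced from lo to hi (an assumed layout)."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

def yes_no_cut(theta, difficulties):
    """Integer rounding: an item counts as 1 when P(theta) >= 0.5, else 0."""
    return sum(1 for b in difficulties if p_rasch(theta, b) >= 0.5)

def theta_from_cut(cut, difficulties, lo=-6.0, hi=6.0):
    """Invert the test characteristic curve by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(p_rasch(mid, b) for b in difficulties) < cut:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

intended_theta = 0.0
results = {}
for n_items, (low, high) in ((10, (-2.0, 2.0)), (20, (-1.0, 3.0))):
    difficulties = evenly_spaced(low, high, n_items)
    cut = yes_no_cut(intended_theta, difficulties)
    estimated = theta_from_cut(cut, difficulties)
    results[n_items] = (cut, estimated - intended_theta)
    print(n_items, "items: cut =", cut, "bias =", round(results[n_items][1], 3))
```

In both cases five items round to 1, so the cut total score is 5. The items in the first pool are centred on the intended cut and the bias vanishes, while the second pool sits mostly above the cut and the same total score maps to a lower θ.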
33
Discussion One option is to use an ordered item booklet (OIB), as in the Bookmark method, to design the test so that roughly half of the items lie above the cut-score. – It is impossible to know the location of the cut-score in advance. – Different panelists have different intended cut-scores, so some panelists must show bias. With multiple cut-scores, at least one of the cut-scores would produce bias. Rounding to integers presents many potential problems.
34
Discussion Challenge: in real situations panelists are not completely consistent in their judgments. – Feedback is helpful for reducing rater inconsistency in NAEP. Further development – Examine the bias at the group level.
35
Thank you for your attention