Examining Rounding Rules in Angoff-Type Standard Setting Methods. Adam E. Wyse, Mark D. Reckase.


1 Examining Rounding Rules in Angoff-Type Standard Setting Methods. Adam E. Wyse, Mark D. Reckase

2 Current Projects
Multidimensional Item Response Theory – Development of methodology for fine-grained analysis of item response data in high-dimensional spaces; application of the methodology to gain understanding of the constructs assessed by tests.
Test Design and Construction – Design of content and statistical specifications for tests using the philosophy of item response theory; use of computerized test assembly procedures to match test specifications.
Portfolio Assessment – Design of portfolio assessment systems, including formal objective scoring of portfolios.
Procedures for Setting Standards – Development and evaluation of procedures for setting standards on educational and psychological tests, including extensive work on setting standards on the National Assessment of Educational Progress.
Computerized Adaptive Testing – Developing procedures for selecting and administering test items to individuals using computer technology; in particular, designing systems to match item selection to the specific requirements for test use.

3 Angoff Method Each panelist estimates the probability that a minimally competent examinee (MCE) would respond correctly to the item.
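A minimal sketch of how these probability judgments become a raw cut score in the classic Angoff procedure; the ratings below are hypothetical, not from the study.

```python
# Classic Angoff: a panelist estimates, for each item, the probability that a
# minimally competent examinee (MCE) answers it correctly; the panelist's raw
# cut score is the sum of those probabilities. Ratings below are hypothetical.
ratings = [0.65, 0.40, 0.80, 0.55, 0.70]

raw_cut = sum(ratings)
print(f"Panelist raw cut score: {raw_cut:.2f} out of {len(ratings)}")
```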

4 Modified Angoff Method (1) Round the rating to a whole number of score points (Yes/No method), for both dichotomous and polytomous items.

5 Modified Angoff Method (2) Rate the MCE's expected score on each cluster of items.
– Round to 1 decimal place
– Round to integer

6 Modified Angoff Method (3) How to aggregate the raters' judgments – mean or median (the median excludes the effect of outliers).
Mean      Median
18.1667   18.4
20.8833   21
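A small sketch of the two aggregation rules with hypothetical panelist cut scores, showing why the median resists an outlying rater.

```python
from statistics import mean, median

# Hypothetical raw cut scores from six panelists; the last rater is an outlier.
panelist_cuts = [18.0, 18.3, 18.4, 18.5, 18.6, 22.0]

print("Mean:  ", round(mean(panelist_cuts), 4))    # pulled upward by the outlier
print("Median:", round(median(panelist_cuts), 4))  # robust to the outlier
```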

7 Theoretical Framework (Reckase, 2006) The panelist is assumed to perfectly understand the relation between item difficulty and the cut θ. Rounding conditions: round to integer; round to nearest 0.05.

8 Theoretical Framework (Reckase, 2006) Additional rounding conditions: round to 1 decimal place; round to 2 decimal places.
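A rough sketch of this evaluation logic under the perfectly-consistent-panelist assumption: the ratings are the true item probabilities at the intended cut θ, a rounding rule is applied, and the cut θ implied by the rounded ratings is read back off the test characteristic curve. The Rasch model, the evenly spaced difficulties, and the intended θ of 0.5 are illustrative assumptions, not the NAEP pools or parameters used in the study.

```python
import math

def p_correct(theta, b):
    """Rasch probability of a correct response to an item with difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def implied_theta(target_score, difficulties, lo=-6.0, hi=6.0):
    """Invert the test characteristic curve: find the theta whose expected
    raw score equals target_score (bisection; the curve is increasing in theta)."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if sum(p_correct(mid, b) for b in difficulties) < target_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Illustrative pool: 20 items with difficulties evenly spaced over [-2, +2].
difficulties = [-2.0 + 4.0 * i / 19 for i in range(20)]
intended_theta = 0.5

# A perfectly consistent panelist "rates" each item with its true probability...
true_ratings = [p_correct(intended_theta, b) for b in difficulties]

# ...but the operational procedure forces a rounding rule onto each rating.
rules = [("integer", lambda p: float(round(p))),
         ("nearest 0.05", lambda p: round(p / 0.05) * 0.05),
         ("2 decimals", lambda p: round(p, 2))]

for label, rule in rules:
    rounded_cut = sum(rule(p) for p in true_ratings)      # rounded raw cut score
    est_theta = implied_theta(rounded_cut, difficulties)  # theta implied by that cut
    print(f"{label:>12}: cut {rounded_cut:6.2f} -> theta {est_theta:+.3f}, "
          f"bias {est_theta - intended_theta:+.3f}")
```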

9 Theoretical Framework Bias
– Individual panelist's cut-score
– Group-level cut-scores: mean or median
Other evidence for evaluating standard setting
– Correlation between item ratings and the P values provided by panelists; this cannot detect a panelist's severity or leniency.
Errors can be incorporated into the Reckase evaluation approach.
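The correlation check mentioned above can be computed directly; this sketch uses hypothetical ratings and item p-values and illustrates the stated limitation: a uniformly more severe panelist (every rating shifted down) produces exactly the same correlation.

```python
from statistics import correlation   # Python 3.10+

# Hypothetical panelist ratings and empirical item p-values for six items.
ratings  = [0.55, 0.70, 0.40, 0.80, 0.60, 0.35]
p_values = [0.50, 0.68, 0.45, 0.75, 0.58, 0.30]

print(round(correlation(ratings, p_values), 3))

# A uniformly more severe panelist (all ratings shifted down by 0.20) yields
# the same Pearson correlation, so this check cannot flag severity or leniency.
severe = [r - 0.20 for r in ratings]
print(round(correlation(severe, p_values), 3))
```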

10 Theoretical Framework Assumptions
– Only a single round (no training effect)
– No rater error (an ideal setting)
Investigate the impact of the Angoff modifications and rounding rules in this ideal situation.

11 Data and Method NAEP data
– 20 raters, last round
– Each panelist's θ cut-score from NAEP was treated as his or her intended cut-score.
Item models: 2PL, 3PL, and GPCM, with polytomous expected score E(X|θ) = 1·P_1(θ) + 2·P_2(θ) + 3·P_3(θ) + 4·P_4(θ).
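A small sketch of the GPCM expected-score formula above; the discrimination and step difficulties are made-up illustration values, not NAEP item parameters.

```python
import math

def gpcm_probs(theta, a, steps):
    """Generalized partial credit model category probabilities.
    a is the discrimination; steps are the step difficulties b_1..b_m."""
    logits = [0.0]                      # category 0 has a cumulative logit of 0
    for b in steps:
        logits.append(logits[-1] + a * (theta - b))
    nums = [math.exp(z) for z in logits]
    total = sum(nums)
    return [n / total for n in nums]

def expected_score(theta, a, steps):
    """E(X | theta) = sum_k k * P_k(theta); with four steps this is
    1*P1(theta) + 2*P2(theta) + 3*P3(theta) + 4*P4(theta), as on the slide."""
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, steps)))

# Made-up polytomous item scored 0-4: discrimination 1.0, four step difficulties.
print(round(expected_score(theta=0.5, a=1.0, steps=[-1.0, -0.2, 0.4, 1.1]), 3))
```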

12 Simulated conditions Rounding rules
– Integer: 1.2345 → 1
– Nearest 0.05: 1.2345 → 1.25
– Nearest 2 decimal places: 1.2345 → 1.23
Item pools
– 180, 107, 109, and 53 items
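In Python terms, the three rules applied to the slide's example value:

```python
x = 1.2345

print(round(x))                # 1    (round to integer)
print(round(x / 0.05) * 0.05)  # 1.25 (round to nearest 0.05)
print(round(x, 2))             # 1.23 (round to 2 decimal places)
# Note: Python's round() uses banker's rounding at exact .5 ties,
# which does not matter for this example.
```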

13 Simulated conditions Individual items vs. clusters of items
Cut-scores – Basic, Proficient, and Advanced
Aggregation – mean vs. median

14 Evaluation Criteria Bias
– Average absolute bias across individual panelists
– Bias for the group's cut score, using the mean and using the median
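A sketch of how these criteria could be computed, assuming bias for each panelist is the estimated (rounded) θ cut minus that panelist's intended θ cut, and group bias is the difference of the aggregated cuts; the panelist values below are hypothetical.

```python
from statistics import mean, median

def average_absolute_bias(estimated, intended):
    """Mean absolute difference between each panelist's estimated (rounded)
    theta cut and that panelist's intended theta cut."""
    return mean(abs(e, ) if False else abs(e - i) for e, i in zip(estimated, intended))

def group_bias(estimated, intended, aggregate):
    """Bias of the group cut score: aggregated estimated cut minus aggregated
    intended cut (assumed definition; aggregate is mean or median)."""
    return aggregate(estimated) - aggregate(intended)

# Hypothetical theta cut scores for five panelists.
intended  = [0.10, 0.25, -0.05, 0.40, 0.15]
estimated = [0.20, 0.30, -0.10, 0.55, 0.15]   # after applying a rounding rule

print("Average absolute bias:", round(average_absolute_bias(estimated, intended), 3))
print("Group bias (mean):    ", round(group_bias(estimated, intended, mean), 3))
print("Group bias (median):  ", round(group_bias(estimated, intended, median), 3))
```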

15 Results – individual panelists Bias by rounding rule: integer > nearest 0.05 > 2 decimal places

16 Results – individual panelists Bias by cut-score location: Advanced > Basic > Proficient

17 Results – individual panelists Rating individual items produces greater bias than rating clusters of items (clusters incur fewer rounding errors).

18 Results – individual panelists Item pool: the 53-item pool produces greater bias than the other pools.

19 Results – individual panelists Item pool: for the Proficient cut score with integer rounding, the 53-item pool shows less bias than the 180-item pool. This underscores the importance of the location of the cut-score relative to the distribution of item difficulties.

20 Results – group of panelists In some cases the mean is better; in other cases the median is better.

21 Results – group of panelists Basic showed negative bias; Proficient and Advanced showed positive bias. At the cluster level, Proficient showed negative bias.

22 Results – group of panelists Advanced produced the greatest bias of the three levels. The bias did not cancel out for a group of panelists.

23 Results – group of panelists Both the mean and median bias were < 0.01 for rounding to 0.05 and to 2 decimal places. Again, more test items did not necessarily reduce bias.

24 Results – group of panelists Rating at the cluster level produces less bias than rating individual items.

25 Impact on Percent Above Cut-score (PAC) Find the PAC for the closest value on the NAEP scale in the pilot study. Impact = PAC at the estimated θ minus PAC at the intended θ. Rounding to the nearest 0.05 or the nearest 0.01 did not change the PAC: minimal impact.
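A sketch of the PAC impact computation described above; the examinee θ distribution here is simulated (standard normal) and the cut values are hypothetical, not the NAEP pilot data.

```python
import random

random.seed(0)
# Simulated examinee ability distribution (standard normal), not the NAEP data.
thetas = [random.gauss(0.0, 1.0) for _ in range(100_000)]

def pac(cut_theta):
    """Percent of simulated examinees at or above the cut theta."""
    return 100.0 * sum(t >= cut_theta for t in thetas) / len(thetas)

intended_cut = 0.50    # hypothetical intended theta cut
estimated_cut = 0.35   # hypothetical cut after integer rounding of the ratings

impact = pac(estimated_cut) - pac(intended_cut)
print(f"PAC impact of rounding: {impact:+.2f} percentage points")
```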

26 Impact on Percent Above Cut-score (PAC)
Basic: 5.610 to 13.010
Proficient: -3.823 to -4.387
Advanced: -1.156 to -1.262

27 Impact on Percent Above Cut-score (PAC)
Basic: 4.490 to 14.190
Proficient: -4.387 to -5.346
Advanced: -1.156 to -1.343

28 Impact on Percent Above Cut-score (PAC) Bias: Advanced > Basic and Proficient. PAC impact: Advanced < Basic and Proficient. There are more students near the Basic and Proficient cut scores.

29 Impact on Percent Above Cut-score (PAC) Rounding to the integer does not present a viable alternative in the Angoff method.

30 Discussion Rounding to integer can affect the cut scores.
– Rating at the cluster level can mitigate bias, but bias still remains.
Using more test items will not necessarily produce less bias.
– What matters is the location of the items relative to the intended cut-score.

31 Discussion 10 items in [-2, +2], cut score θ = 0
– 5 items rounded to score 1
– 5 items rounded to score 0
Cut total score = 5 → θ = 0, bias = 0

32 Discussion 20 items in [-1, +3], cut score θ = 0
– 5 items rounded to score 1
– 15 items rounded to score 0
Cut total score = 5 → θ = -0.438, bias = -0.438
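A self-contained sketch that mirrors the two worked examples above, assuming a Rasch model with evenly spaced item difficulties; under those assumptions the asymmetric pool reproduces a value close to the -0.438 quoted on the slide.

```python
import math

def tcc(theta, difficulties):
    """Test characteristic curve under a Rasch model: expected raw score at theta."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)

def theta_for_score(score, difficulties, lo=-6.0, hi=6.0):
    """Solve tcc(theta) = score by bisection (the curve is increasing in theta)."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if tcc(mid, difficulties) < score else (lo, mid)
    return (lo + hi) / 2.0

# Slide 31: 10 items spread evenly over [-2, +2], intended cut theta = 0.
# Integer rounding sends the 5 items below theta = 0 to a rating of 1 and the
# rest to 0, so the rounded cut total score is 5 -> implied theta ~0, bias ~0.
sym = [-2.0 + 4.0 * i / 9 for i in range(10)]
print(round(theta_for_score(5, sym), 3))     # ~0.0

# Slide 32: 20 items spread evenly over [-1, +3], same intended cut theta = 0.
# Again only the 5 items below 0 round to 1, so the cut total score is still 5,
# but the pool is no longer centred on the cut -> implied theta ~-0.438.
asym = [-1.0 + 4.0 * i / 19 for i in range(20)]
print(round(theta_for_score(5, asym), 3))    # ~-0.438
```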

33 Discussion Using the ordered item booklet (OIB) from the Bookmark method, one could roughly design the pool so that half of the items fall above the cut-score, but:
– It is impossible to know the location of the cut-score in advance.
– Different panelists have different intended cut-scores, so some panelists must have bias.
– With multiple cut-scores, at least one of the cut-scores would produce bias.
Rounding to integer presents many potential problems.

34 Discussion Challenge: in real situations panelists are not completely consistent in their judgments.
– Feedback is helpful for reducing rater inconsistency in NAEP.
Further development
– Examine the bias at the group level.

35 Thank you for your attention

