Item pocket method to allow response review and change in computerized adaptive testing (CAT)
Kyung T. Han
Response review
Aims to reduce examinees' anxiety during high-stakes tests, but can make CAT less efficient and bias the score estimates.
Examinees' test-taking strategies:
– Wainer strategy
– Kingsbury strategy
– Generalized Kingsbury (GK) strategy
Wainer strategy
The examinee answers all items incorrectly in round 1, then goes back and tries to answer all items correctly in round 2.
– Results in a positive bias in the θ estimate
– Most plausible for high-ability examinees
Kingsbury strategy
The examinee can tell whether the current item is easier or harder than the previous one, and goes back to change the response if the current item is easier than the previous one.
Assumptions:
– (a) If θ − δ ≤ −1, the examinee guesses on the current item
– (b) If θ − δ > 0.5, the examinee goes back to change the response
Low-ability examinees are the most likely to benefit.
Generalized Kingsbury strategy
The examinee speculates on the difficulty of the next item relative to all previous items, not only those with guessed responses.
– The strategy offered no meaningful improvement in score estimates in most situations
– Examinees were only 61% successful at distinguishing differences in item difficulty
CAT with restricted revision options
Stocking (1997): models intended to reduce the Wainer effect
– Model 1: responses can be changed at the end of the test, for a limited number of items. Failed to control the effect when more than 2 items could be revised.
– Model 2: the test is split into multiple separately timed sections, and responses can be changed within a section.
– Model 3: responses can be revised only within each item set (items sharing a common stimulus).
Drawbacks:
1. Examinees may feel anxious when deciding whether to move on
2. Items cannot be skipped
– Examinees may still use the Kingsbury or GK strategy to find clues
Item pocket method
Items in the pocket must be answered by the end of the test or they are scored as incorrect.
Advantages:
– Reduces anxiety
– Items can be skipped and put in the pocket one time
– Items in the pocket do not affect the interim score or item selection (which in turn makes the Kingsbury and GK strategies ineffective)
– No sections are needed
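The bookkeeping behind the item pocket (IP) can be sketched in a few lines. This is only an illustration of the rules listed on this slide, not the author's implementation: the class `ItemPocket`, its methods, the helper `interim_responses`, and the item IDs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ItemPocket:
    """Hypothetical container for skipped items (illustration only)."""
    capacity: int
    items: dict = field(default_factory=dict)  # item_id -> response (None = unanswered)

    def put(self, item_id):
        """Skip an item into the pocket; returns False if the pocket is full."""
        if len(self.items) >= self.capacity:
            return False
        self.items[item_id] = None
        return True

    def answer(self, item_id, correct):
        """Record a response (True/False) for an item that is in the pocket."""
        self.items[item_id] = correct

    def final_scores(self):
        """At the end of the test, unanswered pocket items count as incorrect."""
        return {item_id: bool(resp) for item_id, resp in self.items.items()}


def interim_responses(administered, pocket):
    """Responses used for interim scoring and item selection.

    Items that sit in the pocket are left out, so they cannot influence
    the interim estimate or the choice of the next item.
    """
    return {i: r for i, r in administered.items() if i not in pocket.items}


# Made-up example: a pocket of size 2 and three skip attempts.
pocket = ItemPocket(capacity=2)
first = pocket.put("item_07")
second = pocket.put("item_12")
third = pocket.put("item_20")      # pocket is already full
pocket.answer("item_07", True)     # item_12 is never answered
scores = pocket.final_scores()
interim = interim_responses({"item_01": True, "item_07": None, "item_03": False}, pocket)
```

Keeping pocketed items out of `interim_responses` is the design point: the adaptive algorithm never reacts to a skipped item, so the next item's difficulty carries no clue about it.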
Simulation 1
Question: is the IP method robust to a Wainer-like strategy?
Settings:
– Item pool: 500 items
– Fixed-length CAT: 40 items
– Scoring: MLE
– Exposure control: Sympson & Hetter (Rmax = 0.2), or none
– Maximum number of items in the IP: 0, 2, 4, 6
– Evaluation: mean absolute error (MAE) and bias
– Replications: 25
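For reference, the two evaluation criteria can be computed as below. This is a minimal sketch under the usual definitions (error = estimated θ minus true θ, so a positive bias means overestimation); the function names and the four made-up examinees are assumptions for illustration, not values from the study.

```python
def mean_absolute_error(theta_hat, theta_true):
    """Average size of the estimation error, ignoring its direction."""
    errors = [est - true for est, true in zip(theta_hat, theta_true)]
    return sum(abs(e) for e in errors) / len(errors)


def mean_bias(theta_hat, theta_true):
    """Average signed error; positive values mean theta is overestimated."""
    errors = [est - true for est, true in zip(theta_hat, theta_true)]
    return sum(errors) / len(errors)


# Made-up values for four examinees (not data from the study).
theta_true = [-1.0, 0.0, 1.0, 2.0]
theta_hat = [-1.5, 0.5, 1.5, 2.5]

print(mean_absolute_error(theta_hat, theta_true))  # 0.5
print(mean_bias(theta_hat, theta_true))            # 0.25
```

The example shows why both are reported: one underestimate partly cancels the overestimates in the bias, while the MAE counts every error at full size.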
Simulation 1
Assumes examinees use a Wainer-like strategy.
– Only IP items can be revised, so examinees keep as many of the easiest items as possible in the pocket, because they believe items put in the pocket will be scored as incorrect
– All non-IP items are answered in the normal way
– The IP size is limited
– Outcome of interest: impact on the final score estimates
– Such behavior is unlikely to occur often in practice
Results for simulation 1
Simulation 2
Examinees are assumed to judge the difficulty of each item relative to their own proficiency.
– An examinee identifies a challenging item and puts it in the pocket with probability 50% if |θ − δ| < 0.5, and 70% otherwise (the aim is to keep challenging items in the pocket)
– If the IP is full, the examinee compares the easiest item in the pocket with the current challenging item. If the "easiest" item is easier than the current item, the examinee answers it and puts the current item in the pocket (again using the 50%/70% rule)
– No time limit and no fatigue are assumed
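The pocketing rule above might be sketched as follows. This is one possible reading of the slide, not the author's simulation code, and it rests on three assumptions that the slide does not spell out: "challenging" is taken to mean δ > θ, only successful identifications are modeled (an easier item is never mistaken for a challenging one), and the 50%/70% rule for comparing two items is applied to the gap between their difficulties. All function names and difficulty values are hypothetical.

```python
import random


def judgment_accuracy(gap):
    """The slide's rule: 50% if the gap is under 0.5, otherwise 70%."""
    return 0.5 if abs(gap) < 0.5 else 0.7


def flags_as_challenging(theta, delta, rng):
    # Assumption: "challenging" means delta > theta, and only hits are modeled.
    return delta > theta and rng.random() < judgment_accuracy(theta - delta)


def next_item_to_answer(theta, delta, pocket, capacity, rng):
    """Return the difficulty of the item answered now, or None if the
    current item was pocketed without a swap.

    `pocket` is a list of item difficulties and is modified in place.
    """
    if capacity == 0 or not flags_as_challenging(theta, delta, rng):
        return delta                      # answer the current item normally
    if len(pocket) < capacity:
        pocket.append(delta)              # room left: pocket the current item
        return None
    easiest = min(pocket)
    truly_easier = easiest < delta
    # Assumption: the same 50%/70% rule governs the comparison of two items.
    judged_correctly = rng.random() < judgment_accuracy(easiest - delta)
    perceived_easier = truly_easier if judged_correctly else not truly_easier
    if perceived_easier:
        pocket.remove(easiest)            # answer the pocketed item instead
        pocket.append(delta)
        return easiest
    return delta


rng = random.Random(0)  # arbitrary seed, for reproducibility only
pocket = []
for delta in [0.2, 1.1, -0.4, 1.6, 0.9]:  # made-up item difficulties
    answered = next_item_to_answer(0.0, delta, pocket, capacity=2, rng=rng)
    print(delta, "->", answered, pocket)
```

Passing the random generator in as `rng` keeps the sketch reproducible and makes the 50%/70% judgments easy to replace with examinee-specific probabilities.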
Results for simulation 2
– MAE increased by .069, .084, and .087 for IP sizes of 2, 4, and 6, respectively
– The corresponding increases in average bias were .057, .075, and .080
– Low-ability examinees were likely to see more difficult items (due to the simulation settings); this was not the case for high-ability examinees
Discussion
– Time limits should be considered
– For low-ability examinees, most items put in the IP were the initial items, because the item selection algorithm chose those items based on the initial θ estimate (around 0)
Conclusion
IP:
– May reduce anxiety
– Minimizes the effect of a Wainer-like strategy
– Immune to the Kingsbury and GK strategies
IP size:
– Should be neither too small nor too large
Questions
– Why is the mean bias not close to zero when the IP size is zero?
– I'm curious why no difference was found between the no-exposure-control condition and the Sympson & Hetter (SH) condition.
Future study
1. Fixed-precision CAT
2. Examinees differ in their ability (probability) to judge item difficulty
– Elapsed time when skipping an item
3. Multiple-choice items
4. Is it possible to trick the IP method?
5. Using the information in IP items (missing not at random, MNAR)