1
Scoring Technology Enhanced Items
Sue Lottridge, Director of Machine Scoring
Amy Burkhardt, Senior Research Associate of Machine Scoring
2
Technology Enhanced Items
Seeing more TEIs in assessments
–Consortia
–Formative assessments
Decisions around TEIs
–Count-based (e.g., 25 MCs, 2 CRs, 3 TEIs)
–Content-based
3
Drag and Drop TEIs
Select
–Drag N objects to a single drop target
–Similar to ‘Check all that apply’ selected response items
Categorize
–Drag N objects to M drop targets
–Limits: whether an object can be dragged to multiple targets, or to none
Order
–Drag N objects to M drop targets in proper order
Composites (multi-part)
–Dependencies
4
TEI Considerations
Claims
–Choice of TEI
–Justification
Creation
–Environment
–Format
–Complexity
–Constraints
Interoperability
–Rendering
–Data storage
–Porting
Performance
–Response time
–Latency
–Efficiency
Cost
–Time to develop
–Permissions
–Storage
–QA
Scoring
–Combinatorics
–Who sets rules
5
TEIs Live in the “Grey Area” between MCs and CRs
(Figure: a continuum running from Multiple Choice Items through TEIs to Constructed Response Items)
6
Evaluating TE Item Scoring
Classical Theory Methods (p-value, score distribution, point-biserial)
Analyze trends in responses
–Frequency of response patterns
–Counts of object choices
–Proportion of ‘blank’ responses
–Frequent, incorrect responses
Analysis may
–Suggest where examinees may not understand the item
–Highlight alternative correct answers
–Suggest need for partial credit or collapsing categories
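A minimal sketch of these classical indices, assuming scored responses are available as (response_pattern, item_score, total_test_score) records; the record layout and function name are illustrative choices, not something described in the presentation.

```python
from collections import Counter
import statistics

def item_analysis(records, max_points):
    """Classical indices for one TEI.

    records: list of (response_pattern, item_score, total_test_score) tuples.
    """
    item_scores = [item for _, item, _ in records]
    total_scores = [total for _, _, total in records]

    # p-value: average proportion of the item's points earned
    p_value = statistics.mean(score / max_points for score in item_scores)

    # Score distribution across the score points
    score_dist = Counter(item_scores)

    # Item-total correlation (point-biserial-style discrimination index)
    discrimination = statistics.correlation(item_scores, total_scores)

    # Most frequent response patterns, to spot common wrong answers or blanks
    pattern_freq = Counter(pattern for pattern, _, _ in records).most_common(10)

    return p_value, score_dist, discrimination, pattern_freq
```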
7
TEI Scoring and Performance Factors
Item Design: Structure, Clarity, Constraints
Examinee: “Gets” the item, Facility with Tools, Experience with Item Type
Scoring: Rubric Alignment, Rubric Clarity, Scoring Quality
8
Item 1
Key: 2 points if response matches key. 1 point if top or bottom row matches key. 0 otherwise.
There are 19,531 ways to answer a single part, and so 381,459,961 ways to answer both parts.
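A sketch of this key-match rubric as a scoring function, assuming each response and key can be compared as a (top_row, bottom_row) pair; that representation is an assumption for illustration only. The last lines just confirm the answer-space arithmetic quoted above.

```python
def score_item1(response, key):
    """2 if the full response matches the key, 1 if only the top or bottom row matches, 0 otherwise.

    response and key are (top_row, bottom_row) pairs; the representation is illustrative.
    """
    top_match = response[0] == key[0]
    bottom_match = response[1] == key[1]
    if top_match and bottom_match:
        return 2
    if top_match or bottom_match:
        return 1
    return 0

# Given 19,531 ways to answer a single part, both parts can be answered in 19,531^2 ways
ways_per_part = 19_531
assert ways_per_part ** 2 == 381_459_961  # matches the count on the slide
```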
9
What do the data tell us?
Response pattern frequencies
More students dragged 2/3 and then 1/3 into boxes than answered the item correctly.
10
Part 1 and Part 2 Frequencies
11
Summation versus expression representation?
12
Score     Original Rubric         New Rubric
          Count    Percent        Count    Percent
0         2432     81%            2257     75%
1         212      7%             335      11%
2         375      12%            427      14%
p-value   .16                     .20

190 examinees would have received a higher score:
138 moved from 0 to 1
37 moved from 0 to 2
15 moved from 1 to 2
13
Item 1 Summary
Item Design
–Clarify question
–Clarify directions
–Review drag target size
–Revisit number of drag objects
Examinee
–Enable practice with infinite wells
–Observe examinees answering the item
Scoring
–Summation versus expression?
–14% of responses are blank, why?
14
Item 2
Score    Correct Objects Present    Incorrect Objects Present
2        4                          0
1        4                          1 or 2
1        3                          1
1        2                          0
0        Otherwise

Ignoring order, there are 2^10 (1024) possible answers. Preserving order, there are about 10,000,000 possible answers.
Ignoring order, there were 573 unique answers. Preserving order, there were 2,961 unique answers.
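A quick check of the answer-space counts on this slide, assuming the item has 10 draggable objects and that a response is any subset of them (ignoring order) or any ordered arrangement of a subset (preserving order).

```python
from math import comb, perm

N_OBJECTS = 10  # assumed number of draggable expressions on this item

# Ignoring order: each object is either dragged or not
unordered = sum(comb(N_OBJECTS, k) for k in range(N_OBJECTS + 1))

# Preserving order: ordered arrangements of every possible subset
ordered = sum(perm(N_OBJECTS, k) for k in range(N_OBJECTS + 1))

print(unordered)  # 1024 (= 2**10)
print(ordered)    # 9,864,101 -- "about 10,000,000" as on the slide
```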
15
Response pattern frequencies
16
What objects are chosen by examinees?

Object     Mean (% selecting)    Correlation with item score
3(x)       87%                    .13
x+x+x      69%                    .26
x^3        65%                   -.52
5x-2x      46%                    .35
x+3        43%                   -.37
3x+3       37%                   -.36
3(2x-x)    33%                    .17
x/3        55%                   -.49
5(x-2)     26%                   -.18
x-x-x      23%                   -.25
17
Object selection by score

Object     Score 0 (N=5814)    Score 1 (N=1212)    Score 2 (N=312)
3(x)       85%                 94%                 100%
x+x+x      62%                 92%                 100%
x^3        78%                 20%                 0%
5x-2x      37%                 73%                 100%
x+3        53%                 7%                  0%
3x+3       46%                 2%                  0%
3(2x-x)    31%                 24%                 100%
x/3        68%                 6%                  0%
5(x-2)     30%                 13%                 0%
x-x-x      28%                 1%                  0%
18
New Scoring Rules
Student needs to drag more correct objects than incorrect objects to earn a score of 1

Score     Original Rubric    New Rubric
0         79%                63%
1         17%                33%
2         4%                 4%
p-value   .12                .21
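A sketch of the revised rule as a function. Which objects count as correct is inferred from the object tables (the four expressions equivalent to 3x all reach 100% at the top score), and the 2-point criterion is assumed to stay as in the original rubric; both are assumptions rather than statements from the slides.

```python
# Inferred from the object tables: the four expressions equivalent to 3x are treated as correct
CORRECT = {"3(x)", "x+x+x", "5x-2x", "3(2x-x)"}
INCORRECT = {"x^3", "x+3", "3x+3", "x/3", "5(x-2)", "x-x-x"}

def score_item2_revised(dragged):
    """Revised rule: a score of 1 only requires more correct than incorrect objects dragged.

    The 2-point rule (all correct objects present, no incorrect ones) is assumed unchanged.
    """
    n_correct = sum(obj in CORRECT for obj in dragged)
    n_incorrect = sum(obj in INCORRECT for obj in dragged)
    if n_correct == len(CORRECT) and n_incorrect == 0:
        return 2
    if n_correct > n_incorrect:
        return 1
    return 0

print(score_item2_revised(["3(x)", "x+x+x", "x+3"]))  # 1: two correct, one incorrect
```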
19
Relationship of parts to item score

Object     Percent    Original Correlation    New Correlation
3(x)       87%         .13                     .12
x+x+x      69%         .26                     .30
x^3        65%        -.52                    -.53
5x-2x      46%         .35                     .29
x+3        43%        -.37                    -.52
3x+3       37%        -.36                    -.50
3(2x-x)    33%         .17                     .04
x/3        55%        -.49                    -.62
5(x-2)     26%        -.18                    -.24
x-x-x      23%        -.25                    -.36
20
Object Selections by Score Point

               Original Rubric                            Revised Rubric
Object     0 (N=5814)  1 (N=1212)  2 (N=312)    0 (N=4624)  1 (N=2402)  2 (N=312)
3(x)       85%         94%         100%         85%         91%         100%
x+x+x      62%         92%         100%         58%         85%         100%
x^3        78%         20%         0%           84%         38%         0%
5x-2x      37%         73%         100%         36%         57%         100%
x+3        53%         7%          0%           64%         10%         0%
3x+3       46%         2%          0%           57%         4%          0%
3(2x-x)    31%         24%         100%         35%         19%         100%
x/3        68%         6%          0%           79%         16%         0%
5(x-2)     30%         13%         0%           34%         15%         0%
x-x-x      28%         1%          0%           35%         2%          0%
21
Item 2 Summary
Item Design
–Review drag target size
–Revisit number of drag objects
Examinee
–Examinees appeared to understand the task
Scoring
–Are more generous rules aligned with the standard/claim?
–Other rules?
22
Item 3
Student earns a 2 if she drags 4 or 5 correct steps in order and the last step is x-3.
Student earns a 1 if she drags 3 correct steps in order and the last step is x-3.
Student earns a 0 otherwise.
There are 19,081 ways to answer this item.
–20 ways to earn a 2
–16 ways to earn a 1
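A sketch of this rubric as a function. The step labels, the assumption that the correct path has five steps, and the reading of "correct steps in order" as a matching prefix of that path are all illustrative choices, not spelled out on the slide.

```python
def score_item3(dragged_steps, correct_path):
    """Item 3 rubric sketch: 2 for 4-5 correct steps in order ending in x-3,
    1 for 3 correct steps in order ending in x-3, 0 otherwise.

    dragged_steps, correct_path: ordered lists of step labels (illustrative).
    """
    if not dragged_steps or dragged_steps[-1] != "x-3":
        return 0

    # One reading of "correct steps in order": the longest matching prefix of the correct path
    n_in_order = 0
    for dragged, expected in zip(dragged_steps, correct_path):
        if dragged != expected:
            break
        n_in_order += 1

    if n_in_order >= 4:
        return 2
    if n_in_order == 3:
        return 1
    return 0
```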
24
Response Frequencies (1108 unique responses)
25
Score distributions

Score     Original Rubric       Revised Rubric
          N        %            N        %
0         3891     75%          3758     73%
1         40       1%           173      3%
2         1227     24%          1227     24%
p-value   .24                   .25

Revised rubric allows partial credit when the student response contains the correct path but the student drags ‘extra’ objects to fill up the remaining spaces.
775 responses (13%) were blank.
26
Item 3 Summary
Item
–Remove infinite wells
–Add ‘distractors’?
–Remove borders around drop targets or make them dynamic
Examinee
–Students seem compelled to drag objects to fill all spaces
–Students do not reduce to the final answer
Scoring
–Combinatorics lead to complicated scoring rules
–Reversals?
–Same-level transformations?
27
Conclusions
A review of responses and frequencies can reveal areas of misunderstanding, potential for item revision, or uncaptured correct responses.
Complexity of the item leads to complexity in scoring
–More ‘objects’ = more possible correct responses!
–Object content influences scoring
Placing constraints on the item can help
–Infinite wells
–Size and number of objects
Changes to scoring don’t always add value.
28
Thank you! slottridge@pacificmetrics.com