Published by Marianne Desroches. Modified over 6 years ago.
1
Frameworks for Considering Item Response Demands and Item Difficulty
Kristen Huff, College Board
Steve Ferrara, CTB/McGraw-Hill
NCSA Session: Theory and Research on Item Response Demands: What Makes Items Difficult? Construct-Relevant?
June 20, Detroit
As Christy addressed, understanding item difficulty, and using that understanding in test and item design, is directly related to improving our interpretations of student achievement from criterion-referenced, standards-based assessments.
2
Thesis
A coherent and comprehensive understanding of the interaction between items and examinees, and the controllable features of items that elicit predictable interactions, is needed to:
- Design items that do a better job of measuring what we’re interested in knowing
- Design tests that are better suited to facilitate valid inferences about student performance
- More generally, bridge the gap between large-scale assessment and teaching & learning
First bullet: Training item writers; creating item specifications (not style guides). Second bullet: Mention/add application of targeting items to ALs.
4
Objective
To conduct research (& review past research) that informs the development of a framework, or conceptual structure, of item response demands.
Research questions:
- How much do existing item response demand schemas cover current achievement constructs?
- What are the features of items that influence item difficulty?
- What student responses are triggered by different item features, and how do these responses influence item difficulty?
Add a slide prior to this: use table from AERA paper. Here is where we are now: complex, with no conceptual structure. We need to investigate how these various coding schema, which represent different perspectives on item response demands, relate to one another, where they overlap, and where the gaps are.
This slide: Research on item response demands is critical; IDM is a means to an end. Interesting: some item response demand coding schemes are important for item writers / inferences, even if they don’t predict item difficulty. Existing: may need more/different. Current: especially as we are expanding constructs.
5
Item Design
[Diagram: Item Design (feature, feature, feature, feature) and Item statistic]
We argue that the current approach to item design and interpretation of item statistics could be improved by such a framework. Currently, we design items that can be described by many different features, characteristics or attributes, such as
6
Item Design
[Diagram: Item Design (feature, feature, feature, feature, …) and Item statistic]
Reading comprehension: # words; reading level; overlap between key and text
Math: number of variables; graphics; fractions vs whole numbers
However, without a context, or conceptual framework from which to draw these features, we leave a lot of these critical design decisions up to the individual item writer.
7
Item Design Item statistic
Next, the student takes the test. We acknowledge that this critical interaction between the student, who brings a host of contextual varieties to the situation, and the item, with all of its largely judgmentally driven, idiosyncratically chosen features, is essentially…
8
Item Design Item statistic A black box.
We acknowledge that a great deal of work has been done to date on item difficulty modeling (Fischer, Embretson, Gorin, Sheehan, others), and we are in no way criticizing this work. What has languished, however, is a robust research agenda regarding item difficulty, and a coherent, comprehensive framework that seeks to synthesize and conceptually organize what we have learned. This framework is needed to guide a more principled approach to item design, and in turn
9
Item Design Item statistic Framework
… inform the elaboration and refinement of the framework. Meaning, once we have the item statistics that result from the examinee/item interaction, we have a framework that aids in interpretation. At the same time, when items are designed in a principled way with features from this framework, the results can help inform the elaboration and refinement of the framework itself.
10
Steve and I have been thinking a lot about this and still are struggling with the organizational structure. For me, the main struggle point is that the framework forces us to think discretely about what is essentially an interaction… but I think this is a good thing! For example, just about any response demand variable can be described in terms of an item feature, and just about any item feature can be described in terms of the response demand it requires of a student. In my mind, the framework can help by forcing us to disentangle and clearly articulate how we describe these various features/response demands.
11
Item Design Item statistic Framework
To wrap up, let’s revisit this model for a second. What this represents is that observed estimates of item performance (that is, estimated difficulty, either in the form of proportion correct or IRT b-estimates) are used to model item difficulty. Meaning, we use the item response demands from the framework as independent variables in a regression equation and use the observed item difficulty statistic as the dependent variable. We contend that our inferences from these analyses would be more robust, more valid, if we had a more systematic way of considering alternative interpretations of observed item difficulty statistics. For example…
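The regression described here can be sketched roughly as follows. This is a minimal illustration, not the presenters' analysis: the two coded features (word count in hundreds, and whether the item uses fractions) are borrowed from the feature examples on an earlier slide, the six b-estimates are made up, and the helper name fit_difficulty_model is hypothetical. It assumes NumPy is available.

```python
import numpy as np

def fit_difficulty_model(features, difficulty):
    """Ordinary least squares of observed item difficulty on coded item features.

    Returns (coefficients with the intercept first, R^2).
    """
    y = np.asarray(difficulty, dtype=float)
    # Prepend a column of ones so the model has an intercept.
    X = np.column_stack([np.ones(len(y)), np.asarray(features, dtype=float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot

# Hypothetical codes for six items: [words in hundreds, uses fractions (0/1)].
features = [[1.0, 0], [2.0, 0], [3.0, 0], [1.0, 1], [2.0, 1], [3.0, 1]]
# Made-up IRT b-estimates, for illustration only.
b_estimates = [-1.0, -0.5, 0.1, -0.2, 0.4, 0.9]

coef, r2 = fit_difficulty_model(features, b_estimates)
print("intercept, words, fractions:", coef)
print("R^2:", r2)
```

Each coefficient estimates how much the difficulty statistic shifts per unit of a coded feature, holding the other constant. The same sketch would apply with proportion correct as the dependent variable, though the sign of the coefficients would flip in interpretation (higher proportion correct means an easier item).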
12
Intended cognitive complexity: Simple, Complex
Observed item difficulty: Easy, Hard
Easy: “Good” item (Good design and OTL); Item design flaw or Exceptionally effective instruction? Or flawed assumptions? Or restriction in range?
Hard: No OTL
I don’t want to spend too much time on this table, but I want to share an anecdote that I still can’t shake. I was working with two very different constructs, science and history, and both groups told me the same thing when I put very difficult items in front of them and asked: What makes this item hard? Sometimes, they would have an explanation for me. Other times, though, they would say: “Nothing. It’s just not taught.” So, what the heck does that do to my IDM studies when I have both kinds of hard items in my model?
13
Observed item difficulty | Opportunity to Learn: Taught | Opportunity to Learn: Not Taught
Easy | Effective Instruction | Item design flaw
Hard | or just tough stuff? | No OTL
The take-home message from these two tables: OTL matters in interpreting item difficulty for most assessments.
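One way to see why OTL matters for item difficulty modeling is a small simulation. The sketch below is purely synthetic and rests on assumptions that are not from the presentation: a hypothetical 1-3 complexity rating, 30% of items flagged as not taught, and an invented generating model in which untaught items are harder regardless of complexity. It compares how much difficulty variance the item feature explains with and without an OTL indicator. It assumes NumPy is available.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least squares fit of y on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n_items = 200

# Hypothetical item feature: a complexity rating of 1, 2, or 3.
complexity = rng.integers(1, 4, size=n_items).astype(float)
# Hypothetical OTL flag: 1.0 if the content was not taught, else 0.0.
not_taught = (rng.random(n_items) < 0.3).astype(float)
# Assumed generating model (illustration only): both complexity and
# lack of OTL make an item harder, plus random noise.
difficulty = 0.5 * complexity + 1.5 * not_taught + rng.normal(0.0, 0.3, n_items)

r2_features_only = r_squared(complexity.reshape(-1, 1), difficulty)
r2_with_otl = r_squared(np.column_stack([complexity, not_taught]), difficulty)
print("R^2, item feature only:", r2_features_only)
print("R^2, item feature + OTL:", r2_with_otl)
```

Under these assumptions, the feature-only model leaves the "hard because not taught" items unexplained, which is the situation described in the anecdote about mixing both kinds of hard items in one model.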
14
Conclusions & Points of Discussion
- Mapping the landscape of response demands, item features and the interaction between the two is messy, difficult work
- Achieving “coherent and comprehensive” frameworks needs to be higher priority in research
- Start now using draft in operational testing programs
- Need more effective and easier ways to gauge opportunity to learn for many reasons, but also to inform the work we are recommending