Scoring Simulation Assessments
Randy Bennett, Frank Jenkins, Hilary Persky, Andy Weiss
rbennett@ets.org
Funded by the National Center for Education Statistics, US Department of Education

What is NAEP?
- National Assessment of Educational Progress
- The only nationally representative and continuing assessment of what US students know and can do in various subject areas
- A paper testing program
- Administered to samples in grades 4, 8, and 12
- Scores reported for groups but not individuals

TRE Study Purpose
Demonstrate an approach to assessing “problem solving with technology” at the 8th grade level that:
- Fits the NAEP context
- Uses extended performance tasks
- Models student proficiency in an evidence-centered way

Conceptualizing Problem Solving with Technology
A design matrix crossing content domain with the technology environment (x marks the tools a task draws on):

Content Domain     | Searchable Database | Text Processor | Simulation Tools | Dynamic Displays | Spreadsheet | Communication Tools
Biology            |                     |                |                  |                  |             |
Ecology            |                     |                |                  |                  |             |
Physics (Balloon)  | x                   | x              | x                | x                | x           | x
Economics          |                     |                |                  |                  |             |
History            |                     |                |                  |                  |             |

What Do the Example Modules Attempt to Measure?
- By scientific-inquiry skill, we mean being able to find information about a given topic, judge what information is relevant, plan and conduct experiments, monitor one’s efforts, organize and interpret results, and communicate a coherent interpretation.
- By computer skill, we mean being able to carry out the largely mechanical operations of using a computer to find information, run simulated experiments, get information from dynamic visual displays, construct a table or graph, sort data, and enter text.

Scoring the TRE Modules
- Develop initial scoring specifications during assessment design
- Represent what is being measured as a graphical model: a proposal for how the components of proficiency are organized in the domain of problem solving in technology-rich environments

TRE Student Model
[Figure not reproduced]

Connecting Observations to the Student Model
A three-step process:
1. Feature extraction
2. Feature evaluation
3. Evidence accumulation

Feature Extraction
- All student actions are logged in a transaction record
- Feature extraction involves pulling out particular observations from the student transaction record
- Example: the specific experiments the student chose to run for each of the simulation problems

A Portion of the Student Transaction Record

#  Action           Value         Time
1  ChooseValues                   30
2  SelectMass       90            35
3  TryIt                          37
4  MakeTable                      55
5  SelectedTabVars  Payload Mass  60
6  MakeGraph                      68
7  VertAxis         Altitude      75
8  HorizAxis        Helium        83
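
To make the extraction step concrete, here is a minimal Python sketch, not the actual TRE code; the tuple-based record format and the rule that a SelectMass followed by TryIt counts as one experiment run are assumptions made for this illustration:

```python
# A minimal feature-extraction sketch (illustrative; not the actual TRE code).
# Each log entry is (action, value, time), mirroring the table above.
transaction_record = [
    ("ChooseValues", None, 30),
    ("SelectMass", 90, 35),
    ("TryIt", None, 37),
    ("MakeTable", None, 55),
    ("SelectedTabVars", "Payload Mass", 60),
    ("MakeGraph", None, 68),
    ("VertAxis", "Altitude", 75),
    ("HorizAxis", "Helium", 83),
]

def extract_masses_tried(record):
    """Return the payload masses for experiments the student actually ran.

    Assumption: a mass counts as 'tried' when a SelectMass action is later
    confirmed by a TryIt action (the student pressed the simulator's run button).
    """
    masses, pending = [], None
    for action, value, _time in record:
        if action == "SelectMass":
            pending = value            # mass chosen but not yet run
        elif action == "TryIt" and pending is not None:
            masses.append(pending)     # experiment actually executed
            pending = None
    return masses

print(extract_masses_tried(transaction_record))  # -> [90]
```
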
Feature Evaluation
- Each extracted observation needs to be judged for correctness
- Feature evaluation involves assigning scores to these observations

A Provisional Feature-Evaluation Rule
Quality of experiments used to solve Problem 1:
- IF the list of payload masses includes the low extreme (10), the middle value (50), and the high extreme (90), with or without additional values, THEN the best experiments were run.
- IF the list omits one or more of the above required values but includes at least 3 experiments having a range of 50 or more, THEN very good experiments were run.
- IF the list has only two experiments but the range is at least 50, OR the list has more than two experiments with a range equal to 40, THEN good experiments were run.
- IF the list has two or fewer experiments with a range less than 50, OR has more than two experiments with a range less than 40, THEN insufficient experiments were run.
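
The rule above translates almost line for line into code. A sketch, assuming the extracted feature is simply the list of payload masses the student ran (the function name and categories are made up for illustration):

```python
# A sketch of the provisional feature-evaluation rule above (illustrative).
def evaluate_experiments(masses):
    """Map a list of payload masses run to a quality category."""
    n = len(masses)
    rng = max(masses) - min(masses) if masses else 0
    if {10, 50, 90} <= set(masses):
        return "best"          # low extreme, middle, and high extreme all present
    if n >= 3 and rng >= 50:
        return "very good"     # misses a required value but spans a wide range
    if (n == 2 and rng >= 50) or (n > 2 and rng == 40):
        return "good"
    return "insufficient"      # too few experiments or too narrow a range

print(evaluate_experiments([10, 50, 90]))  # best
print(evaluate_experiments([10, 30, 70]))  # very good (3 experiments, range 60)
print(evaluate_experiments([40, 90]))      # good (two experiments, range 50)
print(evaluate_experiments([40, 50]))      # insufficient
```

Note that the rule as stated leaves a gap: more than two experiments with a range between 41 and 49 match no clause, so this sketch falls through to "insufficient".
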
An Example of a “Best” Solution
[Figure not reproduced]

An Example of an “Insufficient” Solution
[Figure not reproduced]

Evidence Accumulation
- Feature evaluations (like item responses) need to be combined into summary scores that support the inferences we want to make from performance
- Evidence accumulation entails combining the feature scores in some principled manner
- Bayesian inference networks offer a very general, formal, statistical framework for reasoning about interdependent variables in the presence of uncertainty
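
As a toy illustration of Bayesian evidence accumulation (not the TRE evidence model itself; the skill variable, categories, and probabilities below are invented for this example), consider a single binary skill variable updated by one feature evaluation:

```python
# Toy Bayesian update for evidence accumulation (illustrative numbers only;
# the real TRE networks contain many interdependent variables).
prior = {"high": 0.5, "low": 0.5}  # P(exploration skill)

# Assumed likelihoods: P(feature score | skill level)
likelihood = {
    "high": {"best": 0.6, "very good": 0.25, "good": 0.10, "insufficient": 0.05},
    "low":  {"best": 0.1, "very good": 0.20, "good": 0.30, "insufficient": 0.40},
}

def update(prior, observed_score):
    """Posterior over skill after observing one feature evaluation."""
    unnorm = {s: prior[s] * likelihood[s][observed_score] for s in prior}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

print(update(prior, "best"))  # -> {'high': ~0.857, 'low': ~0.143}
```

A full Bayes net chains many such updates, propagating evidence through intermediate variables so that each feature score revises the whole student model.
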
An Evidence Model Fragment for Exploration Skill in Simulation 1
[Figure not reproduced]

Using Evidence to Update the Student Model
[Figure not reproduced]

TRE Student Model
[Figure not reproduced]

Conclusion
TRE illustrates:
- Measuring problem solving with technology, with emphasis on the integration of the two skill sets
- Using extended tasks like those encountered in advanced academic and work environments
- Modeling student performance in a way that explicitly accounts for multidimensionality and for uncertainty

Conclusion
Important remaining issues:
- Measurement
  - Tools to evaluate model fit are not well developed
  - Extended performance tasks have limited generalizability
- Logistical
  - Adequate school technology is not yet universal
- Cost
  - Task production and scoring are labor-intensive