The Future of Assessment
URSA 2016 Conference, Cedar City, UT, July 13-14, 2016
Changing How and What We Measure
Previously | Now -> Future
What you know and can do | How you do it, how you found a solution, and what we can do to help you do it better/differently: cognitively based
Academic disciplines | Cross-cutting, real world, contextualized, generalized skills
Standardized and linear | Adaptive to individuals’ needs for information, customized
Paper and oral examination | Wherever, whenever, on any device, in the cloud
Written and spoken responses | Multi-modal, new modalities (gesture, expression)
Sequential scoring and analysis processes | Immediate (automated) scoring and score turnaround, continuous analytics
Apprenticeship
- Many things are best assessed by simply asking someone to show them
- This approach has been very effective for several thousand years
- Can we do this at scale and in an equitable way?
Conversations (written, choices)
Multimodal (spoken, expression, gesture)
Insights and Findings
- Oral examination, but now at scale and standardized
- Relies on sophisticated systems
  - Conversation trees and/or conversation managers
  - Automatic writing evaluation
  - Speech, facial, and gesture recognition and evaluation
  - Interactors, avatars
- Some of the biggest challenges
  - Natural conversations
  - Signal versus noise
  - Latency (the time it takes for the system to respond)
  - Spontaneous speech and multiple languages
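To make the idea of a conversation tree concrete, here is a minimal sketch of how a choice-based conversation might be represented and walked. It is not the system shown in the talk: the `Node` class, the `TREE` contents, the evidence tags, and `run_conversation` are all hypothetical names and data invented for illustration, and a real conversation manager would also need to handle free text, speech, and timing.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One turn by the virtual interactor (hypothetical structure)."""
    prompt: str
    # Maps a student's choice to (next node id, evidence tag or None).
    choices: dict = field(default_factory=dict)


# Illustrative content only; not taken from any real assessment task.
TREE = {
    "start": Node("Why do you think the plant wilted?", {
        "no water": ("follow_up", "identifies_cause"),
        "not sure": ("hint", None),
    }),
    "hint": Node("What does a plant need to stay healthy?", {
        "water": ("follow_up", "identifies_cause_with_hint"),
    }),
    "follow_up": Node("How could you test that idea?", {}),
}


def run_conversation(tree, responses, start="start"):
    """Walk the tree with a list of responses and collect evidence tags."""
    node_id, evidence = start, []
    for response in responses:
        choices = tree[node_id].choices
        if response not in choices:
            break  # unrecognized response: stay at the current node
        node_id, tag = choices[response]
        if tag is not None:
            evidence.append(tag)
    return node_id, evidence


final_node, evidence = run_conversation(TREE, ["not sure", "water"])
```

In this sketch the path a student takes (for example, needing the hint) is itself recorded as evidence, which is one way a standardized conversation could capture how a solution was reached rather than only whether it was reached.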
Scenarios and Simulations (from NAEP)
Insights and Findings
- Effective way to ask students to show their work naturally and reduce the inferential distance
- Can be an effective way to engage students
- Effective way to go deeper into a topic and build on things that were established earlier
- Can provide the kinds of open spaces where learning and discovery can take place
  - Balanced with providing constraints, guided walks
- Becoming more affordable (in relative terms)
  - Focus on authoring within relatively open environments
  - E.g., a virtual science lab with many tools and materials
- “Task Effects” remain a significant psychometric issue
  - Focus of the tasks needs to be well understood
  - Combine with more abstract item types for more summative uses
Games
- Adapting Existing Games
- Developing New Games
- Extended Games
- Mini and Micro Games
Microgames and Extended Games
- Microgame: Short, targeted game roughly equivalent to a single item or test question (usually an isolated sub-skill).
- Extended Game: Longer, more immersive and complex game roughly equivalent to a cumulative end-of-year exam (i.e., multiple skills/constructs with multiple items each).
Robot Sorter
Constructs (LP lvl 1-2): Reasons and Evidence
1. Identify relevant and supporting statements for a given argument
2. Argumentation fluency
Action: Sorting
Mechanic: Swipe up/down to send an object down the correct path (pro or con, relevant or irrelevant, supporting or not supporting, etc.).
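As a rough illustration of how a sorting mechanic like this could yield evidence for the two constructs, the sketch below computes an accuracy measure (for identifying relevant and supporting statements) and a correct-sorts-per-minute measure (as one possible proxy for fluency). This is an assumption about how such data might be summarized, not the scoring used in the actual game; `SortEvent`, `score_sorting`, and the sample statements are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SortEvent:
    """One swipe in a sorting microgame (hypothetical log record)."""
    statement: str
    correct_path: str  # e.g. "pro" or "con"
    chosen_path: str   # path the player swiped the object toward
    seconds: float     # time taken for this swipe


def score_sorting(events):
    """Return (accuracy, correct sorts per minute) for a list of events."""
    if not events:
        return 0.0, 0.0
    n_correct = sum(e.chosen_path == e.correct_path for e in events)
    total_seconds = sum(e.seconds for e in events)
    accuracy = n_correct / len(events)
    # Fluency proxy (an assumption): correct responses per minute of play.
    fluency = 60.0 * n_correct / total_seconds if total_seconds > 0 else 0.0
    return accuracy, fluency


# Illustrative data only.
events = [
    SortEvent("Recess improves focus in class.", "pro", "pro", 2.0),
    SortEvent("Recess shortens instruction time.", "con", "con", 3.0),
    SortEvent("Recess lets students exercise.", "pro", "con", 5.0),
]
accuracy, fluency = score_sorting(events)
```

Keeping accuracy and speed as separate measures, rather than folding them into one score, leaves room to report on each construct on its own.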
Text Persuasion
Constructs (LP lvl 1-2): Appeal Building
1. Can apply simple analytical strategies that map out what a particular audience values or desires, and match appeals to the audience on that basis
2. Argumentation fluency
Action: Selection
Mechanic: Select statements that address concerns voiced by a conversational partner.
Social Simulators
- A system that creates social physics, which determine the behaviors and dialog of game characters (NPCs) and the choices human players have available to interact with those characters.
- Social Practices, Intents, Microtheories
A Social Simulator for 3C Assessment
Insights and Challenges
- Game-like and game-based assessments show promise in addressing several challenges in (formative) assessment
  - Design (UI/UX)
  - Engagement through feedback, challenges, adaptation, and achievements
  - Shorten inferential distance
  - More complex constructs and interactions
- Still in infancy
  - Understanding the interaction between design and data in more open environments
  - Relationships with established measures and reliability
  - Scalability and reusability
  - Fairness
Summary
- Virtual performance assessment is a rapidly developing field
  - Cognitive Models
  - Evidentiary Reasoning
  - Data Capture and Architecture
  - UI/UX, Adaptation, and Individualization
  - Reporting and Feedback Loops
- Big questions still outstanding
  - Reliability
  - Context effects
  - Generalizability across contexts
  - Scalability
  - Validity and Fairness
Thank you
We are always interested in partnering with schools and teachers to get these advances into classrooms and to help us learn.
Andreas Oranje