Slide 1: Making Sense of Data from Complex Assessments
Robert J. Mislevy, University of Maryland
Linda S. Steinberg & Russell G. Almond, Educational Testing Service
FERA, November 6, 2001

Slide 2: How much can testing gain from modern cognitive psychology?
Buzz Hunt, 1986: "So long as testing is viewed as something that takes place in a few hours, out of the context of instruction, and for the purpose of predicting a vaguely stated criterion, then the gains to be made are minimal."

Slide 3: Opportunities for Impact
Informal / local use
Conceptual design frameworks (e.g., Grant Wiggins, CRESST)
Toolkits & building blocks (e.g., Assessment Wizard, IMMEX)
Building structures into products (e.g., HYDRIVE, Mavis Beacon)
Building structures into programs (e.g., AP Studio Art, DISC)

Slide 4: For further information, see...

Slide 5: Don Melnick, NBME:
“It is amazing to me how many complex ‘testing’ simulation systems have been developed in the last decade, each without a scoring system.
“The NBME has consistently found the challenges in the development of innovative testing methods to lie primarily in the scoring arena.”

Slide 6: The DISC Project
The Dental Interactive Simulations Corporation (DISC)
The DISC Simulator
The DISC Scoring Engine
Evidence-Centered Assessment Design
The Cognitive Task Analysis (CTA)

Slide 7: Evidence-centered assessment design
The three basic models

Slide 8: Evidence-centered assessment design
What complex of knowledge, skills, or other attributes should be assessed? (Messick, 1992)

Slide 9: Evidence-centered assessment design
What complex of knowledge, skills, or other attributes should be assessed? (Messick, 1992)
Student Model Variables

Slide 10: Evidence-centered assessment design
What behaviors or performances should reveal those constructs?

Slide 11: Evidence-centered assessment design
What behaviors or performances should reveal those constructs?
Work product

Slide 12: Evidence-centered assessment design
What behaviors or performances should reveal those constructs?
Work product
Observable variables

Slide 13: Evidence-centered assessment design
What behaviors or performances should reveal those constructs?
Observable variables

Slide 14: Evidence-centered assessment design
What behaviors or performances should reveal those constructs?
Observable variables
Student Model Variables

Slide 15: Evidence-centered assessment design
What tasks or situations should elicit those behaviors?

Slide 16: Evidence-centered assessment design
What tasks or situations should elicit those behaviors?
Stimulus Specifications

Slide 17: Evidence-centered assessment design
What tasks or situations should elicit those behaviors?
Work Product Specifications

Slide 18: Implications for Student Model
SM variables should be consistent with …
The results of the CTA.
The purpose of assessment: What aspects of skill and knowledge should be used to accumulate evidence across tasks, for pass/fail reporting and finer-grained feedback?

Slide 19: Simplified Version of the DISC Student Model

Slide 20: Implications for Evidence Models
The CTA produced ‘performance features’ that characterize recurring patterns of behavior and differentiate levels of expertise.
These features ground generally-defined, re-usable ‘observable variables’ in evidence models.
We defined re-usable evidence models for recurring scenarios, for use with many tasks.

Slide 21: An Evidence Model

Slide 22: Evidence Models: Statistical Submodel
What's constant across cases that use the EM:
» Student-model parents.
» Identification of observable variables.
» Structure of conditional probability relationships between SM parents and observable children.
What's tailored to particular cases:
» Values of conditional probabilities.
» Specific meaning of observables.
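
As a concrete picture of that split, here is a minimal Python sketch; the class names, variable names, and probability values are illustrative assumptions, not the DISC scoring engine's actual representation. The structure (which student-model parent the fragment reports to, and which observable it carries) is defined once, while each case supplies its own conditional-probability values and its own meaning for the observable.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class EvidenceModelStructure:
    """What stays constant across cases that use this evidence model."""
    student_model_parent: str            # hypothetical SM variable name
    parent_states: Tuple[str, ...]
    observable: str
    observable_states: Tuple[str, ...]

@dataclass
class EvidenceModelInstance:
    """What is tailored to a particular case: CPT values and observable meaning."""
    structure: EvidenceModelStructure
    cpt: Dict[str, Dict[str, float]]     # P(observable state | parent state)
    observable_meaning: str              # case-specific meaning of the observable

structure = EvidenceModelStructure(
    student_model_parent="Information-gathering proficiency",
    parent_states=("Expert", "Competent", "Novice"),
    observable="Findings addressed",
    observable_states=("All", "Some", "None"),
)

case_rule_out_caries = EvidenceModelInstance(
    structure=structure,                 # same structure, reused across cases
    cpt={"Expert":    {"All": 0.70, "Some": 0.25, "None": 0.05},
         "Competent": {"All": 0.40, "Some": 0.45, "None": 0.15},
         "Novice":    {"All": 0.10, "Some": 0.40, "None": 0.50}},
    observable_meaning="Key findings in this virtual patient's initial history",
)
```

Because the structure is shared, the same fragment can be docked to the student model for any case that fits the recurring scenario, with only the tailored pieces changing.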

Slide 23: Evidence Models: Evaluation Submodel
What's constant across cases:
» Identification and formal definition of observable variables.
» Generally-stated "proto-rules" for evaluating their values.
What's tailored to particular cases:
» Case-specific rules for evaluating values of observables: instantiations of proto-rules tailored to the specifics of the case.
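
A minimal sketch of the proto-rule idea, assuming hypothetical work-product contents and finding names; the function and its arguments are illustrative, not the DISC evaluation rules themselves. The proto-rule is written once, and a case-specific rule is an instantiation of it with that case's details filled in.

```python
from functools import partial

def findings_addressed(work_product, required_findings):
    """Proto-rule: how many of a case's required findings appear in the work product?"""
    hits = sum(1 for f in required_findings if f in work_product["findings"])
    if hits == len(required_findings):
        return "All"
    return "Some" if hits > 0 else "None"

# Case-specific rule: the proto-rule instantiated with this case's required findings.
case_rule = partial(findings_addressed,
                    required_findings=["deep caries on #30", "elevated blood pressure"])

print(case_rule({"findings": ["deep caries on #30"]}))   # -> "Some"
```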

Slide 24: "Docking" an Evidence Model
(Figure: Evidence Model, Student Model)

Slide 25: "Docking" an Evidence Model
(Figure: Evidence Model, Student Model)

Slide 26: Initial Status
Student-model variable: Expert .28, Competent .43, Novice .28
Observable: All .33, Some .33, None .33

Slide 27: Status after four 'good' findings
Student-model variable: Expert .39, Competent .51, Novice .11
Observable: All 1.00, Some .00, None .00

Slide 28: Status after one 'good' and three 'bad' findings
Student-model variable: Expert .15, Competent .54, Novice .30
Observable: All .00, Some .00, None 1.00
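
Slides 26 through 28 show the student-model variable being updated as the observable is absorbed. The sketch below walks through the Bayes-rule mechanics using the slide-26 priors and assumed conditional probabilities; the DISC network's actual parameters are not shown on the slides, so the computed posteriors reproduce only the direction of the shifts above, not the exact figures.

```python
# Priors from slide 26; the conditional probabilities are assumed for illustration.
prior = {"Expert": 0.28, "Competent": 0.43, "Novice": 0.28}

p_obs_given_sm = {                      # assumed P(findings addressed | proficiency)
    "Expert":    {"All": 0.65, "Some": 0.25, "None": 0.10},
    "Competent": {"All": 0.40, "Some": 0.40, "None": 0.20},
    "Novice":    {"All": 0.15, "Some": 0.40, "None": 0.45},
}

def posterior(observed):
    """Bayes rule: P(proficiency | observed) is proportional to P(observed | proficiency) * P(proficiency)."""
    unnorm = {sm: prior[sm] * p_obs_given_sm[sm][observed] for sm in prior}
    total = sum(unnorm.values())
    return {sm: round(v / total, 2) for sm, v in unnorm.items()}

print(posterior("All"))    # mass shifts toward Expert, away from Novice
print(posterior("None"))   # probability of Expert drops sharply
```

As on the slides, evidence of 'good' findings raises the probability of Expert, while evidence of 'bad' findings lowers it.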

Slide 29: "Docking" another Evidence Model
(Figure: Evidence Model, Student Model)

Slide 30: "Docking" another Evidence Model
(Figure: Evidence Model, Student Model)

Slide 31: Implications for Task Models
Task models are schemas for phases of cases, constructed around key features that...
» the simulator needs for its virtual-patient data base,
» characterize features we need to evoke specified aspects of skill/knowledge,
» characterize features of tasks that affect difficulty,
» characterize features we need to assemble tasks into tests.
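
One way to picture such a schema, using purely illustrative field and value names rather than the DISC task-model specification, is as a structured record with a slot for each kind of key feature listed above:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class TaskModel:
    """Schema for one phase of a case, with a slot for each kind of key feature."""
    phase: str
    virtual_patient_data: Dict[str, str]   # what the simulator needs for its database
    targeted_skills: List[str]             # features meant to evoke specified skill/knowledge
    difficulty_features: Dict[str, str]    # features of the task that affect difficulty
    assembly_features: Dict[str, str]      # features used to assemble tasks into tests

task = TaskModel(
    phase="initial patient history",
    virtual_patient_data={"age": "62", "chief complaint": "intermittent tooth pain"},
    targeted_skills=["information gathering", "hypothesis generation"],
    difficulty_features={"cue salience": "low", "competing hypotheses": "3"},
    assembly_features={"content area": "periodontics", "estimated time": "20 min"},
)
```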

Slide 32: Implications for Simulator
Once we've determined the kind of evidence we need about targeted knowledge, how must we construct the simulator to provide the data we need?
Nature of problems:
» Distinguish phases in the patient interaction cycle.
» Use typical forms of information & control availability.
» Dynamic patient condition & cross-time cases.
Nature of affordances: examinees must be able to...
» seek and gather data,
» indicate hypotheses,
» justify hypotheses with respect to cues,
» justify actions with respect to hypotheses.
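
To suggest what "providing the data we need" could look like, the sketch below logs examinee actions of the four kinds listed under "nature of affordances"; the class names and fields are hypothetical, not the DISC simulator's actual work-product format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class ActionType(Enum):
    SEEK_DATA = "seek and gather data"
    INDICATE_HYPOTHESIS = "indicate a hypothesis"
    JUSTIFY_HYPOTHESIS = "justify a hypothesis with respect to cues"
    JUSTIFY_ACTION = "justify an action with respect to hypotheses"

@dataclass
class LoggedAction:
    """One entry in the work product the simulator captures for later evaluation."""
    phase: str                       # phase of the patient interaction cycle
    action: ActionType
    target: str                      # e.g., the data source queried or hypothesis named
    rationale: Optional[str] = None  # cues or hypotheses cited as justification

work_product = [
    LoggedAction("initial history", ActionType.SEEK_DATA, "blood pressure reading"),
    LoggedAction("initial history", ActionType.INDICATE_HYPOTHESIS, "periodontal disease",
                 rationale="bleeding on probing"),
]
```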

Slide 33: Payoff
Re-usable student model:
» Can project to overall score for licensing.
» Supports mid-level feedback as well.
Re-usable evidence and task models:
» Can write indefinitely many unique cases using schemas.
» Framework for writing case-specific evaluation rules.
Machinery can generalize to other uses & domains.

Slide 34: Two ways to "score" complex assessments (Part 2 conclusion)
THE HARD WAY: Ask 'how do you score it?' after you've built the assessment and scripted the tasks or scenarios.
A DIFFERENT HARD, BUT MORE LIKELY TO WORK, WAY: Design the assessment and the tasks/scenarios around what you want to make inferences about, what you need to see to ground them, and the structure of the interrelationships.

Slide 35: Grand Conclusion
We can attack new assessment challenges by working from generative principles:
» Principles from measurement and evidentiary reasoning, coordinated with...
» inferences framed in terms of current and continually evolving psychology,
» using current and continually evolving technologies to help gather and evaluate data in that light,
» in a coherent assessment design framework.
