Slide 1: Leverage Points for Improving Educational Assessment
Robert J. Mislevy, Linda S. Steinberg, and Russell G. Almond
Educational Testing Service
February 25, 2000

Presented at the Technology Design Workshop sponsored by the U.S. Department of Education, held at Stanford Research Institute, Menlo Park, CA, February 25-26, 2000. The work of the first author was supported in part by the Educational Research and Development Centers Program, PR/Award Number R305B60002, as administered by the Office of Educational Research and Improvement, U.S. Department of Education. The findings and opinions expressed in this report do not reflect the positions or policies of the National Institute on Student Achievement, Curriculum, and Assessment, the Office of Educational Research and Improvement, or the U.S. Department of Education.

Slide 2: Some opportunities...
Cognitive/educational psychology:
» how people learn,
» organize knowledge,
» put knowledge to use.
Technology to...
» create, present, and vivify “tasks”;
» evoke, capture, parse, and store data;
» evaluate, report, and use results.

Slide 3: A Challenge
_ How the heck do you make sense of rich, complex data, for more ambitious inferences about students?

Slide 4: A Response
Design assessment from generative principles:
1. Psychology
2. Purpose
3. Evidentiary reasoning
Conceptual design LEADS; tasks, statistics & technology FOLLOW.

Slide 5: Principled Assessment Design
The three basic models. [Diagram: the student, evidence, and task models, elaborated in the slides that follow.]

Slide 6: Evidence-centered assessment design
_ What complex of knowledge, skills, or other attributes should be assessed, presumably because they are tied to explicit or implicit objectives of instruction or are otherwise valued by society? (Messick, 1992)

Slide 7: Evidence-centered assessment design
_ What complex of knowledge, skills, or other attributes should be assessed, presumably because they are tied to explicit or implicit objectives of instruction or are otherwise valued by society?
_ What behaviors or performances should reveal those constructs? (Messick, 1992)

Slide 8: The Evidence Model(s)
Evidence rules extract features from a work product and evaluate values of observable variables.
[Diagram: an evidence model with two components, evidence rules and a statistical model; the evidence rules map the work product to observable variables.]

Slide 9: The Evidence Model(s)
The statistical component expresses how the observable variables depend, in probability, on student-model variables.
[Diagram: the statistical model links student-model variables to observable variables.]
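In generic notation (an illustrative sketch, not the deck's own formalism): for a student-model variable θ and an observable X_j, the statistical component supplies the conditional probability p(x_j | θ), and Bayes' theorem carries the evidence back to the student model:

```latex
% Illustrative update of belief about a student-model variable \theta
% given one observed value x_j of an observable variable X_j.
p(\theta \mid x_j) \;\propto\; p(x_j \mid \theta)\, p(\theta)
```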

Slide 10: Evidence-centered assessment design
_ What complex of knowledge, skills, or other attributes should be assessed, presumably because they are tied to explicit or implicit objectives of instruction or are otherwise valued by society?
_ What behaviors or performances should reveal those constructs?
_ What tasks or situations should elicit those behaviors? (Messick, 1992)

Slide 11: The Task Model(s)
Task-model variables describe features of tasks. A task model provides a framework for describing and constructing the situations in which examinees act.
[Diagram: schematic task model as a numbered list of task-model variables.]

Slide 12: The Task Model(s)
Includes specifications for the stimulus material, conditions, and affordances: the environment in which the student will say, do, or produce something.

Slide 13: The Task Model(s)
Includes specifications for the “work product”: the form in which what the student says, does, or produces will be captured.
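To make the task-model idea concrete, here is a hypothetical sketch in code; the class name, fields, and example values are invented for illustration and are not part of any formal ECD specification:

```python
from dataclasses import dataclass, field

@dataclass
class TaskModel:
    """Hypothetical task-model specification (illustrative fields only)."""
    name: str
    # Task-model variables: features that describe tasks and can be
    # varied to construct new ones.
    variables: dict = field(default_factory=dict)
    # Specifications for stimulus material, conditions, and affordances.
    stimulus_spec: str = ""
    environment_spec: str = ""
    # Specification for the work product: the form in which what the
    # student says, does, or produces will be captured.
    work_product_spec: str = ""

# A concrete task is an instantiation of the task model's variables.
reading_task = TaskModel(
    name="verbal reasoning: reading comprehension",
    variables={"passage_topic": "natural science", "n_items": 4},
    stimulus_spec="one 450-word expository passage",
    environment_spec="computer-delivered; no external resources",
    work_product_spec="selected responses to multiple-choice items",
)
print(reading_task.variables)
```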

Slide 14: Leverage Points...
_ For cognitive/educational psychology
_ For statistics
_ For technology

Slide 15: Leverage Points for Cog Psych
_ The character and substance of the student model.

Slide 16: Example a: GRE Verbal Reasoning
The student model is just the IRT ability parameter θ: the tendency to make correct responses in the mix of items presented in a GRE-V.
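For instance, under a two-parameter logistic IRT model (a standard formulation given here for illustration; the operational GRE model may differ), the probability of a correct response to item j depends on the examinee only through θ:

```latex
P(X_j = 1 \mid \theta)
  \;=\; \frac{\exp\big(a_j(\theta - b_j)\big)}{1 + \exp\big(a_j(\theta - b_j)\big)},
```

where a_j is the item's discrimination and b_j its difficulty.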

Slide 17: Example b: HYDRIVE
Student-model variables in HYDRIVE: a Bayes net fragment.
[Bayes net diagram; nodes include Overall Proficiency, Procedural Knowledge, Strategic Knowledge, Use of Gauges, Space Splitting, Electrical Tests, Serial Elimination, and system knowledge nodes for Power System, Landing Gear, Canopy, Electronics, Hydraulics, and Mechanical.]

Slide 18: Leverage Points for Cog Psych
_ The character and substance of the student model.
_ What we can observe to give us evidence (the work product).

Slide 19: Leverage Points for Cog Psych
_ The character and substance of the student model.
_ What we can observe to give us evidence, and how to recognize and summarize its key features.

Slide 20: Leverage Points for Cog Psych
_ The character and substance of the student model.
_ What we can observe to give us evidence, and how to recognize and summarize its key features.
_ Modeling which aspects of performance depend on which aspects of knowledge, in what ways.

Slide 21: Leverage Points for Cog Psych
_ The character and substance of the student model.
_ What we can observe to give us evidence, and how to recognize and summarize its key features.
_ Modeling which aspects of performance depend on which aspects of knowledge, in what ways.
_ Effective ways to elicit the kinds of behavior we need to see.

Slide 22: Leverage Points for Statistics
_ Managing uncertainty with respect to the student model.
» Bayes nets (generalize beyond familiar test theory models; e.g., VanLehn)
» Modular construction of models
» Monte Carlo estimation
» Knowledge-based model construction with respect to the student model.

Slide 23: Leverage Points for Statistics
_ Managing the stochastic relationship between observations in particular tasks and the persistent, unobservable student-model variables.
» Bayes nets
» Modular construction of models (incl. psychometric building blocks)
» Monte Carlo approximation
» Knowledge-based model construction: docking with the student model.
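A one-line sketch of why this modularity pays off (generic notation, assumed rather than quoted from the deck): if observables from different tasks are conditionally independent given the student-model variables, each new evidence-model fragment can be docked by a purely local update, with no need to rebuild the whole model:

```latex
% Sequential docking: absorb task j's evidence into the current posterior.
p(\theta \mid x_1,\dots,x_j) \;\propto\; p(\theta \mid x_1,\dots,x_{j-1})\; p(x_j \mid \theta)
```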

Slide 24: Example a, continued: GRE-V
Sample Bayes net: the student-model fragment (θ) docked with an evidence-model fragment (the IRT model and its parameters for this item).
[Diagram: θ linked to observable X_j, drawn from a library of evidence-model Bayes net fragments for X_1, X_2, ..., X_n.]
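A minimal computational sketch of this docking, with a grid approximation standing in for a real Bayes net engine and item parameters invented for illustration:

```python
import numpy as np

# Grid approximation to the student-model variable theta.
theta = np.linspace(-4, 4, 401)
posterior = np.exp(-0.5 * theta**2)   # standard-normal prior, unnormalized
posterior /= posterior.sum()

def irt_2pl(th, a, b):
    """P(correct response | theta) under a two-parameter logistic model."""
    return 1.0 / (1.0 + np.exp(-a * (th - b)))

# "Library" of evidence-model fragments: one (a, b) pair per item.
# Parameters and responses are made up for the example.
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]
responses = [1, 1, 0]                 # 1 = correct, 0 = incorrect

# Dock each fragment in turn: multiply in its likelihood, renormalize.
for (a, b), x in zip(items, responses):
    p = irt_2pl(theta, a, b)
    posterior *= p if x == 1 else (1.0 - p)
    posterior /= posterior.sum()

print("posterior mean of theta:", (theta * posterior).sum())
```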

Slide 25: Example b, continued: HYDRIVE
Sample Bayes net fragment, drawn from a library of fragments, for a canopy situation in which no split is possible.
[Diagram: fragment involving the student-model variables Use of Gauges, Serial Elimination, Canopy Knowledge, Hydraulics Knowledge, and Mechanical Knowledge.]
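The same docking logic applies with discrete student-model variables, as in HYDRIVE. Below is a hand-rolled sketch in which a conditional probability table plays the role of one evidence-model fragment; the structure and all numbers are invented for illustration, and HYDRIVE's actual networks were far richer:

```python
import numpy as np

# Student-model fragment: prior on one binary proficiency variable,
# say "serial-elimination strategy" (0 = weak, 1 = strong).
prior = np.array([0.5, 0.5])

# Evidence-model fragment: P(observable | proficiency) for one situation.
# Rows index proficiency; columns index the observable
# (0 = action inconsistent with serial elimination, 1 = consistent).
cpt = np.array([[0.7, 0.3],    # weak:   mostly inconsistent actions
                [0.2, 0.8]])   # strong: mostly consistent actions

def dock_and_update(prior, cpt, observed):
    """Absorb one observed value into the posterior over proficiency."""
    post = prior * cpt[:, observed]
    return post / post.sum()

# The student's action in this canopy situation is scored as consistent.
posterior = dock_and_update(prior, cpt, observed=1)
print("P(strong | evidence) =", round(posterior[1], 3))   # ≈ 0.727
```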

Slide 26: Leverage Points for Statistics
_ Extracting features and determining values of observable variables.
» Bayes nets (also neural networks, rule-based logic)
» Modeling human raters for training, quality control, efficiency
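As a toy illustration of rule-based evidence extraction (entirely hypothetical; real evidence rules are domain- and task-specific), here is one rule that turns a logged action sequence, the work product, into a value of a single observable variable:

```python
def score_serial_elimination(actions):
    """Toy evidence rule: did the examinee test components one at a
    time without revisiting components already cleared? (Invented logic.)"""
    tested = []
    for act in actions:
        if act["kind"] == "test":
            if act["component"] in tested:
                return "not_demonstrated"   # retested a cleared component
            tested.append(act["component"])
    return "demonstrated" if len(tested) >= 2 else "insufficient_evidence"

work_product = [
    {"kind": "test", "component": "canopy actuator"},
    {"kind": "test", "component": "hydraulic line"},
]
print(score_serial_elimination(work_product))   # -> demonstrated
```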

Slide 27: Leverage Points for Technology
_ Dynamic assembly of the student model.

Slide 28: Leverage Points for Technology
_ Dynamic assembly of the student model.
_ Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
» Stimulus material
» Work environment

Slide 29: Leverage Points for Technology
_ Dynamic assembly of the student model.
_ Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
» Work product

Slide 30: Leverage Points for Technology
_ Dynamic assembly of the student model.
_ Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
_ Automated extraction and evaluation of key features of complex work.

Slide 31: Leverage Points for Technology
_ Dynamic assembly of the student model.
_ Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
_ Automated extraction and evaluation of key features of complex work.
_ Construction and calculation to guide acquisition of, and manage uncertainty about, our knowledge about the student.

Slide 32: Leverage Points for Technology
_ Dynamic assembly of the student model.
_ Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
_ Automated extraction and evaluation of key features of complex work.
_ Construction and calculation to guide acquisition of, and manage uncertainty about, knowledge about the student.
_ Automated/assisted task construction, presentation, management.

Slide 33: The Cloud behind the Silver Lining
_ These developments will have the most impact when assessments are built for well-defined purposes, and connected with a conception of knowledge in the targeted domain.
_ They will have much less impact for ‘drop-in-from-the-sky’ large-scale assessments like NAEP.