Slide 1 (SRI Technology Evaluation Workshop, RJM, 2/23/00): Leverage Points for Improving Educational Assessment
Robert J. Mislevy, Linda S. Steinberg, and Russell G. Almond, Educational Testing Service. February 25, 2000.
Presented at the Technology Design Workshop sponsored by the U.S. Department of Education, held at Stanford Research Institute, Menlo Park, CA, February 25-26, 2000. The work of the first author was supported in part by the Educational Research and Development Centers Program, PR/Award Number R305B60002, as administered by the Office of Educational Research and Improvement, U.S. Department of Education. The findings and opinions expressed in this report do not reflect the positions or policies of the National Institute on Student Achievement, Curriculum, and Assessment, the Office of Educational Research and Improvement, or the U.S. Department of Education.

Slide 2: Some opportunities...
Cognitive/educational psychology:
  » how people learn,
  » organize knowledge,
  » put knowledge to use.
Technology to...
  » create, present, and vivify "tasks";
  » evoke, capture, parse, and store data;
  » evaluate, report, and use results.

Slide 3: A Challenge
» How the heck do you make sense of rich, complex data, for more ambitious inferences about students?

Slide 4: A Response
Design assessment from generative principles:
  1. Psychology
  2. Purpose
  3. Evidentiary reasoning
Conceptual design LEADS; tasks, statistics & technology FOLLOW.

Slide 5: Principled Assessment Design -- the three basic models.

Slide 6: Evidence-centered assessment design
» What complex of knowledge, skills, or other attributes should be assessed, presumably because they are tied to explicit or implicit objectives of instruction or are otherwise valued by society? (Messick, 1992)

Slide 7: Evidence-centered assessment design
» What complex of knowledge, skills, or other attributes should be assessed, presumably because they are tied to explicit or implicit objectives of instruction or are otherwise valued by society?
» What behaviors or performances should reveal those constructs? (Messick, 1992)

Slide 8: The Evidence Model(s)
Evidence rules extract features from a work product and evaluate values of observable variables.
[Diagram: an Evidence Model comprises a statistical model and evidence rules; the evidence rules map the work product to observable variables.]
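The slide describes evidence rules only abstractly. A minimal sketch of one such rule in Python follows; the work-product format and the rule itself are hypothetical illustrations invented for this sketch, not actual rules from any operational assessment:

```python
# Sketch of an evidence rule: extract the value of one observable variable
# from a work product. Here the (hypothetical) work product is a logged
# sequence of troubleshooting actions.
def evaluate_serial_elimination(actions):
    """Observable: did the examinee test components one at a time,
    without retesting, before repairing (serial elimination)?"""
    tests = [a["component"] for a in actions if a["kind"] == "test"]
    # Illustrative rule: at least two tests, and no component tested twice.
    return len(tests) >= 2 and len(tests) == len(set(tests))

work_product = [
    {"kind": "test", "component": "gauge"},
    {"kind": "test", "component": "valve"},
    {"kind": "repair", "component": "valve"},
]
print(evaluate_serial_elimination(work_product))  # True
```

The point of the sketch is the separation of concerns the slide names: the rule consumes a captured work product and emits an observable-variable value, which the statistical model then interprets.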

Slide 9: The Evidence Model(s)
The statistical component expresses how the observable variables depend, in probability, on student-model variables.
[Diagram: the statistical model links student-model variables to observable variables.]

Slide 10: Evidence-centered assessment design
» What complex of knowledge, skills, or other attributes should be assessed, presumably because they are tied to explicit or implicit objectives of instruction or are otherwise valued by society?
» What behaviors or performances should reveal those constructs?
» What tasks or situations should elicit those behaviors? (Messick, 1992)

Slide 11: The Task Model(s)
Task-model variables describe features of tasks. A task model provides a framework for describing and constructing the situations in which examinees act.

Slide 12: The Task Model(s)
Includes specifications for the stimulus material, conditions, and affordances: the environment in which the student will say, do, or produce something.

Slide 13: The Task Model(s)
Includes specifications for the "work product": the form in which what the student says, does, or produces will be captured.

Slide 14: Leverage Points...
» For cognitive/educational psychology
» For statistics
» For technology

Slide 15: Leverage Points for Cog Psych
» The character and substance of the student model.

Slide 16: Example a: GRE Verbal Reasoning
The student model is just the IRT ability parameter θ: the tendency to make correct responses in the mix of items presented in a GRE-V.
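The slides do not spell out the item response function behind θ. As a sketch, a standard two-parameter logistic (2PL) IRT model can be written as follows; the item parameters below are made-up illustrations, not actual GRE item parameters:

```python
import math

def p_correct(theta, a, b):
    """2PL IRT item response function: probability of a correct response
    given ability theta, item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Illustrative (made-up) item: discrimination 1.2, difficulty 0.5.
print(p_correct(0.5, 1.2, 0.5))   # theta == b, so the probability is exactly 0.5
print(p_correct(2.0, 1.2, 0.5) > p_correct(0.0, 1.2, 0.5))  # higher theta, higher chance
```

This is the sense in which θ is "the tendency to make correct responses": it is the single latent variable on which the probability of success on every item depends.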

Slide 17: Example b: HYDRIVE
Student-model variables in HYDRIVE (a Bayes net fragment): Overall Proficiency; Strategic Knowledge; Procedural Knowledge; Power System Knowledge; Use of Gauges; Space Splitting; Electrical Tests; Serial Elimination; Landing Gear Knowledge; Canopy Knowledge; Electronics Knowledge; Hydraulics Knowledge; Mechanical Knowledge.
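The original slide shows these variables as a directed graph, which the transcript loses. A sketch of how such a fragment might be encoded follows; the edges are plausible assumptions for illustration only, not the published HYDRIVE structure:

```python
# A student-model fragment encoded as a directed graph (child -> parents).
# The edges are assumed for this sketch, not taken from HYDRIVE itself.
parents = {
    "Overall Proficiency": [],
    "Strategic Knowledge": ["Overall Proficiency"],
    "Procedural Knowledge": ["Overall Proficiency"],
    "Power System Knowledge": ["Overall Proficiency"],
    "Space Splitting": ["Strategic Knowledge"],
    "Serial Elimination": ["Strategic Knowledge"],
    "Use of Gauges": ["Procedural Knowledge"],
    "Electrical Tests": ["Procedural Knowledge"],
    "Landing Gear Knowledge": ["Power System Knowledge"],
    "Canopy Knowledge": ["Power System Knowledge"],
    "Electronics Knowledge": ["Power System Knowledge"],
    "Hydraulics Knowledge": ["Power System Knowledge"],
    "Mechanical Knowledge": ["Power System Knowledge"],
}

def ancestors(node):
    """All variables that sit above `node` in the directed graph."""
    seen, stack = set(), list(parents[node])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen

print(sorted(ancestors("Canopy Knowledge")))
```

In a real Bayes net each node would also carry a conditional probability table given its parents; the graph alone fixes which variables bear directly on which.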

Slide 18: Leverage Points for Cog Psych
» The character and substance of the student model.
» What we can observe to give us evidence (the work product).

Slide 19: Leverage Points for Cog Psych
» The character and substance of the student model.
» What we can observe to give us evidence, and how to recognize and summarize its key features.

Slide 20: Leverage Points for Cog Psych
» The character and substance of the student model.
» What we can observe to give us evidence, and how to recognize and summarize its key features.
» Modeling which aspects of performance depend on which aspects of knowledge, in what ways.

Slide 21: Leverage Points for Cog Psych
» The character and substance of the student model.
» What we can observe to give us evidence, and how to recognize and summarize its key features.
» Modeling which aspects of performance depend on which aspects of knowledge, in what ways.
» Effective ways to elicit the kinds of behavior we need to see.

Slide 22: Leverage Points for Statistics
» Managing uncertainty with respect to the student model.
  » Bayes nets (generalizing beyond familiar test-theory models; e.g., VanLehn)
  » Modular construction of models
  » Monte Carlo estimation
  » Knowledge-based model construction with respect to the student model

Slide 23: Leverage Points for Statistics
» Managing the stochastic relationship between observations in particular tasks and the persistent unobservable student-model variables.
  » Bayes nets
  » Modular construction of models (including psychometric building blocks)
  » Monte Carlo approximation
  » Knowledge-based model construction: docking with the student model

Slide 24: Example a, continued: GRE-V
Sample Bayes net: the student-model fragment (θ) docked with an Evidence Model fragment -- the IRT model and parameters for the current item, whose observable X_j is drawn from a library of Evidence Model Bayes net fragments (X_1, X_2, ..., X_n).
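The "docking" idea can be sketched as: the student model maintains a distribution over θ, and each arriving evidence-model fragment contributes an item likelihood that is folded into the posterior. A minimal grid-based sketch, assuming the 2PL response model and made-up item parameters:

```python
import math

def update_theta(prior, grid, a, b, correct):
    """Fold one 2PL item response into a discretized posterior over theta.
    prior: probabilities over the theta grid; a, b: (assumed) item parameters."""
    def p(theta):
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))
    likelihood = [p(t) if correct else 1.0 - p(t) for t in grid]
    posterior = [pr * l for pr, l in zip(prior, likelihood)]
    z = sum(posterior)                      # normalizing constant
    return [x / z for x in posterior]

grid = [-3 + 0.5 * i for i in range(13)]    # theta grid from -3 to 3
prior = [1.0 / len(grid)] * len(grid)       # uniform prior over the grid
post = update_theta(prior, grid, a=1.2, b=0.0, correct=True)

# A correct response shifts probability mass toward higher theta.
posterior_mean = sum(t * p for t, p in zip(grid, post))
print(round(posterior_mean, 3))
```

Docking one fragment per administered item, and discarding it afterward, is what lets the persistent student model stay small while the item pool stays large.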

Slide 25: Example b, continued: HYDRIVE
Sample Bayes net fragment, drawn from a library of fragments: a "Canopy situation -- no split possible" fragment involving Use of Gauges, Serial Elimination, Canopy Knowledge, Hydraulics Knowledge, and Mechanical Knowledge.

Slide 26: Leverage Points for Statistics
» Extracting features and determining values of observable variables.
  » Bayes nets (also neural networks, rule-based logic)
  » Modeling human raters for training, quality control, and efficiency

Slide 27: Leverage Points for Technology
» Dynamic assembly of the student model.

Slide 28: Leverage Points for Technology
» Dynamic assembly of the student model.
» Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction (stimulus material, work environment).

Slide 29: Leverage Points for Technology
» Dynamic assembly of the student model.
» Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction (the work product).

Slide 30: Leverage Points for Technology
» Dynamic assembly of the student model.
» Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
» Automated extraction and evaluation of key features of complex work.

Slide 31: Leverage Points for Technology
» Dynamic assembly of the student model.
» Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
» Automated extraction and evaluation of key features of complex work.
» Construction and calculation to guide acquisition of, and manage uncertainty about, our knowledge about the student.

Slide 32: Leverage Points for Technology
» Dynamic assembly of the student model.
» Complex and realistic tasks that can produce direct evidence about knowledge used for production and interaction.
» Automated extraction and evaluation of key features of complex work.
» Construction and calculation to guide acquisition of, and manage uncertainty about, knowledge about the student.
» Automated/assisted task construction, presentation, and management.

Slide 33: The Cloud behind the Silver Lining
» These developments will have the most impact when assessments are built for well-defined purposes and connected with a conception of knowledge in the targeted domain.
» They will have much less impact for 'drop-in-from-the-sky' large-scale assessments like NAEP.

