
1 Measuring the Link Between Learning and Performance
Eva L. Baker
UCLA Graduate School of Education & Information Studies
National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
Supported by the Naval Education and Training Command, the Office of Naval Research, and the Institute of Education Sciences
July 27, 2005 – Arlington, VA
The findings and opinions expressed in this presentation do not reflect the positions or policies of the Naval Education and Training Command, the Office of Naval Research, or the Institute of Education Sciences.

2 Goals for the Presentation
• Consider methods to strengthen the link between learning and performance
• Use cognitively based assessment to structure and measure objectives during instruction, post-training, and on the job
• Emphasize design of a core architecture of reusable tools to build and measure effective, lifelong competencies
• Identify benefits and savings for the Navy

3 National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
Consortium of R&D performers led by UCLA:
• USC, Harvard, Stanford, RAND, UC Santa Barbara, Colorado
• CRESST partners with other R&D organizations

4 National Center for Research on Evaluation, Standards, and Student Testing (CRESST) [Cont'd]
• Mission
  – R&D in measurement, evaluation, and technology leading to improvement in learning and performance settings
  – Set the national agenda for R&D in the field
  – Validity, usability, credibility
  – Focus on rapidly usable solutions and tools
  – Tools allow reduced cycle time from requirements to use

5 National Center for Research on Evaluation, Standards, and Student Testing (CRESST) [Cont'd]
• President-Elect, AERA; 7 former presidents
• Chair, Board on Testing and Assessment, National Research Council, The National Academies
• Standards for Educational and Psychological Testing (1999)
• Army Science Board, Defense Science Board task forces
• History of DoD R&D: ONR, NETC, OSD, ARI, TRADOC, ARL, U.S. Marine Corps; NATO
• Congressional councils and testimony
• Multidisciplinary staff

6 Assessment in Practice
[Image: © 1965 Fantasy Records]

7 State of Testing in the States
• External, varying standards and tests from States
• Range of targets (AYP)
• Short timeline to serious sanctions
• Raised scores only "OK" evidence of learning
• Are there incentives to measure "high standards"?
• Are there incentives to create assessments that respond to quality instruction?
• Growing enthusiasm for use of classroom assessment for accountability
• Benchmark tests
• Need for new ways to think about the relationship of accountability, long-term learning, and performance

8 Language Check
• Cognitive model: a research synthesis used to create the architecture for tests and measures (and for instruction)
• Ontology: a formal knowledge representation (in software) of a domain of knowledge, showing relationships – sources, experts, text, observation; used in tools for assessment design (see the sketch below)
• Formative assessment: assessment information used to pinpoint needs (gaps, misconceptions) for improvement in instruction or on the job
• Transfer: the ability to use knowledge in different contexts
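
To make the ontology definition concrete, the following is a minimal sketch of how a tiny domain ontology could be encoded in Python. The domain (rifle marksmanship), the concept names, and the relation labels are illustrative assumptions, not CRESST's actual representation or tooling.

    # Minimal sketch of a domain ontology for assessment design.
    # Concepts, relations, and the marksmanship domain are illustrative
    # assumptions, not CRESST's actual representation.

    ontology = {
        "concepts": ["accurate shot", "sight alignment", "sight picture",
                     "trigger control", "breath control"],
        "relations": [
            # (subject, relation, object)
            ("sight picture", "requires", "sight alignment"),
            ("accurate shot", "requires", "sight picture"),
            ("accurate shot", "requires", "trigger control"),
            ("trigger control", "supported_by", "breath control"),
        ],
    }

    def prerequisites(concept, relations):
        """Return concepts the given concept directly requires."""
        return [obj for subj, rel, obj in relations
                if subj == concept and rel == "requires"]

    print(prerequisites("accurate shot", ontology["relations"]))
    # ['sight picture', 'trigger control']

An assessment designer could query such a structure to decide which prerequisite knowledge an item or simulation should probe.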

9 Learning Research
• Efficient learning demands understanding of principles or big ideas (schema) and their relationships (mental models)
• Learning design needs to take into account limits of working memory
• Strong evidence for formative assessment: motivated practice with informative feedback
• Assessment design needs to link pre-, formative, end-of-training, and refresher measures
• Specification of full domain and potential transfer areas

10 Measure Design: Learning Research
• Focus first on what is known about improved learning as the way to design measures: acquisition, retention, expertise, automaticity, transfer
• Science-based, domain-independent cognitive demands (reusable objects) paired with content and context to achieve desired knowledge and skills
• Criterion performance is based on expertise models (not simply rater judgments)
• Design and arrangement of objects is the architecture for learning and measurement

11 Measurement Purposes
System or Program:
• Needs sensing
• System monitoring
• Evaluation
• Improvement
• Accountability
Individual/Team (5-Vector):
• Selection/Placement
• Opt out
• Diagnosis
• Formative/Progress
• Achievement
• Certification/Career
• Skill retention
• Transfer of learning

12 Changes in Measurement/Assessment Policy and Practices
• From: one purpose, one measure
• To: multiple purposes, well-designed measure(s) with proficiency standards
• Difficult to retrofit a measure designed for one purpose to serve another
• Evidence of technical quality? Methods of aggregation? Scaling? Fairness?

13 5-Vector Implications
• More than one purpose for data from tests, performance records, assessments:
  – improvement of trainee KSAs
  – improvement of program effectiveness; evaluation of program or system readiness/effectiveness
  – certification of individual/team performance
  – personnel uses
• Challenge: comparability

14 Multipurpose Measurement/Metrics*
• Place higher demands on technical quality of measures
• Suggest more front-end design, to support adaptation and repurposing
• Full representation (in ontologies or other software-supported structures) to link goals, enabling objectives, and content
• A shift in the way to think about learning and training
* Metrics are measures placed in a framework for interpretation, e.g., a ratio of achievement to time, cost, or benchmarks (a toy calculation follows below)
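
The footnote's idea of a metric as a measure set in an interpretive frame can be shown with a toy calculation; every number and ratio below is hypothetical, chosen only to make the ratios concrete.

    # Toy illustration of the footnote: a "metric" as a measure in a framework
    # for interpretation, e.g., achievement relative to time, cost, or a
    # benchmark. All numbers are hypothetical.

    def metrics(score_gain, training_hours, cost_dollars, benchmark_gain):
        return {
            "gain_per_hour": score_gain / training_hours,
            "gain_per_dollar": score_gain / cost_dollars,
            "ratio_to_benchmark": score_gain / benchmark_gain,
        }

    print(metrics(score_gain=12.0, training_hours=40,
                  cost_dollars=3000, benchmark_gain=10.0))
    # {'gain_per_hour': 0.3, 'gain_per_dollar': 0.004, 'ratio_to_benchmark': 1.2}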

15 CRESST Model-Based Assessment
• Reusable measurement objects to be linked to skill objects
• First, depends upon cognitive analysis (domain independent, e.g., problem solving)
• Essential to institute in a well-represented content or skill area (strategies and knowledge developed from experts)*
• May use different forms of cognitive analysis
• May use behavioral formats and templates:
  – multiple choice, simulated performance, AAR, game settings, written responses, knowledge representations (maps), traces of procedures in technology, checklists

16 Cognitive Human Capital Model-Based Assessment
[Diagram: families of cognitive demands – Content Understanding, Problem Solving, Teamwork and Collaboration, Metacognition, Communication, Learning]

17 CRESST Approach
• Summarize scientific knowledge about learning
• Find cognitive elements that can be adapted and reused in different topics, subjects, and age levels; these elements make a "family" of models
• Embed model in subject matter
• Focus on "big" content ideas to support learning and application
• Create templates, scoring schemes, training, and reporting systems (authoring systems available)
• Conduct research (we do) to assure technical quality and fairness

18 Alignment: Weak
[Images: http://www.fly-ford.com/StepByStep-Front-Series.html; http://www.powerofyoga.com/ (copyright 2004 DK Cavanaugh); U.S. Department of Energy Human Genome Program, http://www.ornl.gov/hgmis; http://www.carinasoft.com]

19 Generally, How HCMBA Works
• Understanding a procedure
  – Knowing what the components of the procedure are
  – Knowing when to execute the procedure, including symptom detection and search strategies to confirm the problem
  – Knowing principles underlying the procedure
  – Knowing how to execute the procedure
  – Knowing when the procedure is off task or not working
  – Repair options
  – Ability to explain the task completed AND describe steps for a different system (transfer)
• Embed in content and context
• Worked example
• Executing procedure with feedback loops
• Criterion testing: comparison benchmarks

20 Content/Skill Ontology

21 Examples of Model-Based Assessment
• Risk Assessment (EDO)
  – Cognitive demands of the skill include problem identification, judging urgency, constraints, and costs
  – Content demands involve prior knowledge in the task (e.g., ship repair), knowledge needed to find alternatives, vendors, conflicting missions, etc., and principles of optimization vs. cycle time

22 EDO Risk Management Simulation*
* CRESST/USC/BTL's iRides

23 Ontology of M-16 Marksmanship

24 Model-Based Example: M-16 Marksmanship
Building on the science of measures of performance:
• Marksmanship Inventory
• Knowledge Assessment / Knowledge Mapping
• Evaluation of Shooter Positions
• Shot-to-Shot Analysis
[Diagram axes: Cognitive Demand, Fidelity]
Current work: performance sensing and diagnosis/prescription, using technologies – sensors, ontologies, and Bayes nets – to identify knowledge gaps and determine remediation and feedback (a minimal sketch of this kind of Bayesian diagnosis follows below).
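
The sketch below illustrates the Bayes-net style of diagnosis the slide describes: updating belief about which knowledge gap best explains observed evidence (here, rounds grouping left of the shooter's call, matching the scenario on the next slide). The gap names, priors, and likelihoods are invented for illustration and are not the project's actual model.

    # Sketch of Bayesian diagnosis: infer which knowledge gap best explains
    # observed evidence (shots hitting left of the call). Priors and
    # likelihoods are invented for illustration.

    priors = {"trigger_jerk": 0.3, "sight_misalignment": 0.3,
              "wind_misread": 0.2, "no_gap": 0.2}

    # P(evidence = "rounds left of call" | gap)
    likelihood = {"trigger_jerk": 0.7, "sight_misalignment": 0.6,
                  "wind_misread": 0.4, "no_gap": 0.05}

    def posterior(priors, likelihood):
        """Bayes rule over a discrete set of candidate gaps."""
        unnorm = {gap: priors[gap] * likelihood[gap] for gap in priors}
        total = sum(unnorm.values())
        return {gap: p / total for gap, p in unnorm.items()}

    post = posterior(priors, likelihood)
    diagnosis = max(post, key=post.get)
    print(post)
    print("Most likely gap:", diagnosis)  # drives remediation and feedback selection

In a fielded system the evidence would come from sensors and assessment items, and the network structure and probabilities would come from the marksmanship ontology and expert data rather than hand-set values.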

25 M-16 Marksmanship Example
Scenario: "The shooter is calling right but his rounds are hitting left of the target."
Task: "Diagnose and then correct the shooter's problem."
Information sources:
• Position
• Target
• Shooter's notebook
• Rifle
• Mental state, gear, fatigue, anxiety
• Wind flags

26 M-16 Marksmanship Improvement
[Diagram: sensing and assessment information feeds diagnosis and prescription, which drive individualized feedback and content]

27 Language Check
• Validity: appropriate inferences are drawn from the test(s)
• Reliability: assessments give consistent and stable findings
• Accuracy: respondents are placed in the categories where they belong (a toy illustration follows below)
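
The accuracy and reliability definitions can be made concrete with a toy calculation; the proficiency categories and the data are hypothetical.

    # Toy illustration of "accuracy": the proportion of respondents placed in
    # the category where they belong. Categories and data are hypothetical.

    true_category = ["proficient", "proficient", "novice", "novice", "proficient", "novice"]
    test_category = ["proficient", "novice",     "novice", "novice", "proficient", "proficient"]

    accuracy = sum(t == c for t, c in zip(true_category, test_category)) / len(true_category)
    print(f"classification accuracy: {accuracy:.2f}")  # 0.67 here

    # A rough reliability check in the same spirit: agreement between two administrations.
    retest_category = ["proficient", "proficient", "novice", "novice", "proficient", "novice"]
    agreement = sum(a == b for a, b in zip(test_category, retest_category)) / len(test_category)
    print(f"test-retest agreement: {agreement:.2f}")

Operational programs would use stronger indices (e.g., kappa or generalizability coefficients), but the underlying questions are the ones on the slide: are classifications correct, and are results stable?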

28 CRESST Evidence-Based Validity Criteria for HC Assessment Models*
• Cognitive complexity
• Reliable or dependable
• Accuracy of content/skill domain
• Instructionally sensitive
• Transfer and generalization
• Learning focused
• Validity evidence reported for each purpose
• Fair
• Credible
* Baker, O'Neil, & Linn, American Psychologist, 1993

29 Interplay of Model-Based Design, Development, and Validity Evidence
• Experiment on prompt specificity
• Studies of extended embedded assessments
• Studies of rater agreement and training
• Studies of collaborative assessment
• Studies of utility across age ranges and subjects
• Reusable models (without CRESST hands-on)
• Scaling up to thousands of examinees in a formal context
• Experimental studies of prior knowledge
• Criterion validity studies
• Studies of generalizability within subject domains
• Studies of L1 (first-language) impact
• Studies of OTL (opportunity to learn)
• Studies of instructor's knowledge
• Cost and feasibility studies*
• Prediction of distal outcomes
• Experimental studies of instructional sensitivity

30 Report Objects

31 Measure Authoring Screenshot

32 Summary of Tools
• Tools include cognitive demands for particular classes of KSAs, to be applied in templates, objects, or other formats represented in authoring systems
• Specific domain or task ontology (knowledge representation of content)
• Ontological knowledge fills slots in the templates or objects (see the sketch below)
• Commercial ontology systems available
• Measurement authoring systems for HC Assessment Models (with evidence)
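
The slot-filling idea can be sketched as follows: a reusable measurement template whose content slots are filled from ontology entries. The template wording, slot names, and domain facts are illustrative assumptions, not the actual authoring system.

    # Minimal sketch of slot-filling: a reusable measurement template whose
    # content slots are filled from a domain ontology. Template wording, slot
    # names, and domain facts are illustrative assumptions.

    template = ("The shooter reports {symptom}. Identify the most likely cause "
                "and describe the first corrective step involving {component}.")

    ontology_entries = [
        {"symptom": "rounds striking left of the call", "component": "rear sight windage"},
        {"symptom": "vertical stringing of shots", "component": "breath control"},
    ]

    items = [template.format(**entry) for entry in ontology_entries]
    for item in items:
        print(item)

Because the cognitive demand lives in the template and the content lives in the ontology, the same template can be repurposed across tasks and domains, which is the reuse argument made in the cost-savings outcome below.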

33 OUTCOME 1: Coherence
• Coherent macro architecture for training, operations, and measurement
• Coherent view across the sailor, management, and system perspectives, to support training, retraining, and assessment as it occurs in new environments (distance learning); consistent with the 5-Vector model

34 OUTCOME 2: Cost Savings
• Each model has reusable templates and objects, empirically validated, to match cognitive requirements
• Freestanding measures do not need to be designed and revalidated anew for each task
• Cost of design drops, cost of measures drops, throughout the life cycle
• Common framework supports retention and transfer of learning
• Common HCA objects will simplify demands on the trainer
• Multiple-purposed measures will need different reporting metrics but should have a common reporting framework

35 OUTCOME 3: More Trustworthy Evidence of Effectiveness, Readiness, or Individual or Team Performance
• Common frameworks for assessment
• Ontology (full representation of content)
• Instructional strategies to support learning and transfer
• Aggregation of outcomes using common metrics
• Standard reporting formats for each assessment purpose

36 OUTCOME 4: Flexibility and Reduced Volatility Within a General Structure
• Plenty of room for differential preferences by leaders of different configurations or those with different training goals
• Evidence in Navy projects, engineering courses, and academic topics, across trainees with different backgrounds, in different settings, and with different levels of instructor skill
• Easy-to-use guidelines and tools as exemplars

37 Social/Organizational Capital in Knowledge Management: 5-Vector Implications
[Diagram: Trust, Efficacy, Networks, Effort, Transparency, Learning Organization, Teamwork Skills]

38 Revolution = Opportunities and Constraints
• Navy needs a common framework so that their work can be easily integrated
• Navy needs common metrics to assess their effectiveness and tools to interpret data
• Navy needs to provide vendors with a framework to permit achievement and performance integration of HCMA from multiple sources

39 CRESST Web Site
http://www.cresst.org
eva@ucla.edu

40 Back Up

41 Marksmanship Knowledge Inventory
Diagnosis and prescription: the output of the recommender identifies areas needing remediation and the prescribed content (a minimal sketch of this mapping follows below).
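
A minimal sketch of the recommender step named on this slide: mapping diagnosed knowledge gaps to prescribed remediation content. The gap names, the threshold, and the content catalog are hypothetical.

    # Sketch of the recommender: map diagnosed knowledge gaps (e.g., posteriors
    # from the Bayesian diagnosis above) to prescribed remediation content.
    # Gap names, the threshold, and the catalog are hypothetical.

    prescriptions = {
        "trigger_jerk": "dry-fire drills with focus on smooth trigger squeeze",
        "sight_misalignment": "review of sight alignment/sight picture fundamentals",
        "wind_misread": "wind-flag reading exercises and data-book practice",
    }

    def recommend(gap_posteriors, threshold=0.35):
        """Return prescribed content for gaps whose posterior exceeds the threshold."""
        return {gap: prescriptions[gap]
                for gap, p in gap_posteriors.items()
                if p >= threshold and gap in prescriptions}

    print(recommend({"trigger_jerk": 0.55, "sight_misalignment": 0.30, "wind_misread": 0.10}))
    # {'trigger_jerk': 'dry-fire drills with focus on smooth trigger squeeze'}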

