Improving the Validity of Measures by Focusing on Learning
Eva L. Baker
CRESST National Conference: Research Goes to School
Los Angeles, September 10, 2002
UCLA Graduate School of Education & Information Studies
Center for the Study of Evaluation
National Center for Research on Evaluation, Standards, and Student Testing
“High stakes should not be associated with the results of any assessment until the qualities of validity, reliability, and fairness have been addressed.”
Raising Standards for American Education, National Council on Education Standards and Testing, 1992 (p. 27)
Tests and Assessments Are Intended to Be:
- The operational arm of reform, directing attention to standards
- The target that productively motivates student, teacher, and administrator performance
- The basis for rewards, help, and sanctions
- Systematic signals to the public that their schools are providing quality education
- An integral part of the process of educational and instructional design and improvement: a major validity issue
Theory of Action of Assessment Systems: “Knowledge Is Power”
- Assessments are standards-based, sensitive to quality instruction, and responsive to legitimate changes in actions
- The results reported are accurate
- The results are validly interpreted
- The responsible individuals are willing to act and can motivate action by team members
- Practical actions to improve the situation are known and available
Theory of Action of Assessment Systems (Cont’d)
- Cognizant individuals and team members possess the requisite knowledge to apply alternative methods
- The selected actions are adequately implemented
- The actions will improve subsequent results
- Barriers to improvement are weaker than the desire to achieve goals, and clear and powerful incentives support positive actions
Checking How Well Tests and Assessments Represent the Underlying Reality of Learning and Performance
- How well do the tests extract key elements known to be essential for competence in the domain?
- What is the relationship of the test design to other evidence of learning in the domain?
- Does performance, or some of its attributes, transfer to other subject matters (generalize)?
- Does performance really predict next-level and/or exit criteria (vertical transfer)?
Imagine That Generalization and Transfer Were Our Real Goals (They Are!)
- Most tests used for accountability are general and sample content only lightly
- Are test results valid for the “Standards” rather than just for the included items?
- Are tests designed using domain-independent and domain-specific research knowledge as well as magical psychometric properties?
- How do multiple measures get used?
Using the Same Measures for Different Purposes
- Instruction, monitoring, accountability
- Too much testing, too much cost
- Little evidence of validity for multiple purposes
- Can the situation be fixed? Options:
  - Design multi-purpose tests
  - Aggregate up from teacher assessment (NRC, 2001), with capacity built within districts and supported by technology
Using Multiple Measures to Improve Validity
- Multiple ways to measure standards: validity, fairness, transfer
- Common framework for all assessments
- Multiple levels: classroom, district, state
- Technology options: models, templates, objects
Ideal Assessment Design Requirements
- Operational specification of the domain
- Domain-independent cognitive demands
- Domain-dependent learning model
- Well-sampled content, including prior knowledge requirements
- Task templates and situation descriptions
- Process and criteria for scoring open-ended performance
Ideal Assessment Design Requirements (Cont’d)
Evidence of:
- Horizontal transfer across situations, formats, and similar content (standards)
- Vertical relationships predicting development and progress
- Standards (cut scores) set at real boundaries
- Differential sensitivity to test prep vs. teaching significant content and intellectual skills
Research Rarely Used in Validity Discussions
Summary of findings from research on learning and instruction:
- Learning is highly specific
- If transfer is expected, it must be taught
- No procedures or strategies are provided for teachers or learners