
AACC Mini Conference, June 8-9, 2011
Developing and Selecting Student Growth Measures for Use in Teacher Evaluation Systems
Joan Herman and Pete Goldschmidt
CCSSO National Conference on Student Assessment, Orlando, Florida, June 21, 2011

Source of Presentation
Developing and Selecting Assessments of Student Growth for Use in Teacher Evaluation Systems, by Joan L. Herman, Margaret Heritage, and Pete Goldschmidt

Focus of Attention
- Sophisticated statistical models have been proposed to estimate the relative value individual teachers add to students' performance (a minimal sketch of one such model follows).
- Little attention has been paid to the quality of the student assessments these models use to estimate student growth.
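As a hypothetical illustration, not part of the original slides, the simplest form of such a value-added estimate regresses students' current scores on their prior scores and averages the residuals by teacher. The data file and column names (`pre`, `post`, `teacher`) are assumptions of the sketch.

```python
# Minimal value-added sketch: regress post-test on pre-test scores,
# then average each teacher's residuals. Data layout is illustrative.
import numpy as np
import pandas as pd

df = pd.read_csv("scores.csv")  # assumed columns: student, teacher, pre, post

# Fit post = b0 + b1 * pre by ordinary least squares.
b1, b0 = np.polyfit(df["pre"], df["post"], deg=1)
df["residual"] = df["post"] - (b0 + b1 * df["pre"])

# A teacher's "value added" is the mean residual of that teacher's students.
value_added = df.groupby("teacher")["residual"].mean().sort_values()
print(value_added)
```

Operational systems use far richer models (multiple prior years, student covariates, shrinkage), which is exactly why the quality of the underlying assessments matters.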

Quality of Measures Matters
- Generating student growth scores requires at least two assessments of student learning (the simplest such growth score is sketched below).
- These assessments must be carefully designed and validated to provide trustworthy evidence for evaluating teacher effectiveness.
- This guidance supports states and districts as they develop and/or select student growth measures.
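To make the two-assessments point concrete, here is a hypothetical sketch of the simplest growth score, a gain score computed from fall and spring administrations. The file and column names are assumptions.

```python
# Simplest growth score: spring minus fall for each student.
# Measurement error from BOTH administrations flows into the gain,
# which is why growth scores are typically less reliable than either score alone.
import pandas as pd

df = pd.read_csv("two_administrations.csv")  # assumed columns: student, fall, spring
df["gain"] = df["spring"] - df["fall"]
print(df[["fall", "spring", "gain"]].describe())
```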

Purpose of Guidance
- Recognize the pressure on states and districts.
- Provide an understanding of what developing or selecting a quality assessment entails.
- Support short-term and long-term planning for continuous improvement of measures.

BASIC ARGUMENT JUSTIFYING USE IN TEACHER EVALUATION

Validity
- Validity is the overarching concept that defines quality in educational measurement.
- It concerns the extent to which an assessment measures what it is intended to measure and provides sound evidence for specific decision-making purposes.
- Validation involves evaluating or justifying specific interpretations or uses of the scores.

Validity Framework
- Establishes the basic argument that justifies the use of student growth measures as part of teacher evaluation.
- Lays out the essential claims within the argument that need to be justified.
- Suggests sources of evidence for substantiating the claims.
- Uses accumulated evidence to evaluate and improve score validity.

ARGUMENTS AND PROPOSITIONS

Propositions to Justify Use of Measures for Evaluating Teacher Effectiveness

CLAIMS AND EVIDENCE

Propositions and Claims in the Validity Evaluation
Proposition 1: The standards clearly define learning expectations for the subject area and each grade level.
Design Claims:
- Clarity
- Feasibility
- Explicit progressions
Evidence:
- Expert reviews

Propositions and Claims in the Validity Evaluation
Proposition 2: The assessment instruments have been designed to yield scores that can accurately and fairly reflect student achievement of the standards.
Design Claims:
- Alignment with standards (specifications and items)
- Fair and accessible
- Replicable procedures
Evidence:
- Expert reviews of alignment
- Sensitivity reviews
- Measurement review of administration and scoring procedures

Propositions and Claims in the Validity Evaluation
Proposition 2b: The assessment instruments have been designed to yield scores that accurately and fairly reflect student growth over the course of the year.
Design Claims:
- Sample the range of where students may start and end the school year
- Designed to be sensitive to instruction
Evidence:
- Expert review

Propositions and Claims in the Validity Evaluation
Proposition 3: There is evidence that the assessment scores accurately and fairly measure the learning expectations.
Design Claims:
- Psychometric analyses confirm the assessment's blueprint
- Scores are sufficiently precise and reliable (see the reliability sketch below)
- Scores are fair/unbiased
Evidence:
- Psychometric analyses
- Content analysis
- Bias analyses
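As a hedged illustration of the kind of psychometric evidence meant here, and not part of the original slides, the sketch below computes Cronbach's alpha, a standard internal-consistency reliability estimate, from a matrix of item scores. The file and column layout are assumptions.

```python
# Cronbach's alpha: a common internal-consistency reliability estimate.
# Rows = students, columns = items; the data layout is an illustrative assumption.
import pandas as pd

items = pd.read_csv("item_scores.csv")  # one column per item, one row per student
k = items.shape[1]                      # number of items

sum_item_variances = items.var(axis=0, ddof=1).sum()
total_score_variance = items.sum(axis=1).var(ddof=1)

alpha = (k / (k - 1)) * (1 - sum_item_variances / total_score_variance)
print(f"Cronbach's alpha = {alpha:.3f}")  # values near 1 indicate high consistency
```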

Propositions and Claims in the Validity Evaluation
Proposition 4: There is evidence that student growth scores accurately and fairly measure student progress over the course of the year.
Design Claims:
- The score scale reflects the full distribution of where students may start and end the year
- Growth scores are sufficiently precise and reliable for all students (see the gain-score reliability sketch below)
- Growth scores are fair/relatively free of bias
Evidence:
- Psychometric modeling and fit statistics
- Sensitivity/bias analyses
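One classical way to probe the "sufficiently precise and reliable" claim for simple gain scores is the reliability-of-differences formula from classical test theory. The sketch below is an illustration under assumed inputs, not the presenters' method; the reliabilities, standard deviations, and correlation are hypothetical values.

```python
# Classical reliability of a gain (difference) score D = Y - X:
# rel_D = (rel_x*sd_x^2 + rel_y*sd_y^2 - 2*r_xy*sd_x*sd_y)
#         / (sd_x^2 + sd_y^2 - 2*r_xy*sd_x*sd_y)
def gain_reliability(rel_x, rel_y, sd_x, sd_y, r_xy):
    num = rel_x * sd_x**2 + rel_y * sd_y**2 - 2 * r_xy * sd_x * sd_y
    den = sd_x**2 + sd_y**2 - 2 * r_xy * sd_x * sd_y
    return num / den

# Even with reliable tests (0.90 each), a high pre/post correlation
# drives gain-score reliability down sharply.
print(gain_reliability(rel_x=0.90, rel_y=0.90, sd_x=10, sd_y=10, r_xy=0.80))  # 0.5
```

This is why the proposition asks for precision evidence specifically about growth scores, not just about each assessment on its own.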

Propositions and Claims in the Validity Evaluation
Proposition 5: There is evidence that assessment scores represent teachers' contributions to student growth.
Design Claims:
- Scores are instructionally sensitive
- Scores representing teacher contributions are sufficiently precise and reliable
- Scores representing teacher contributions are relatively free of bias
Evidence:
- Research studies on instructional sensitivity
- Assumption checking
- Advanced statistical modeling (a mixed-model sketch follows)
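"Advanced statistical modeling" in this context often means a mixed (random-effects) model with teachers as grouping units. Below is a hedged sketch using statsmodels; the formula, column names, and data file are assumptions for illustration, not the presenters' method.

```python
# A random-intercept model: post ~ pre, with a random effect per teacher.
# The shrunken random intercepts give one (simplified) notion of each
# teacher's contribution after adjusting for prior achievement.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("scores.csv")  # assumed columns: post, pre, teacher

model = smf.mixedlm("post ~ pre", data=df, groups=df["teacher"])
result = model.fit()

# Each entry is the model's estimate of a teacher's deviation from the
# overall intercept; shrinkage pulls noisy small-class estimates toward zero.
for teacher, effect in result.random_effects.items():
    print(teacher, float(effect.iloc[0]))
```

Shrinkage is one reason such models are preferred over raw mean residuals when class sizes are small, though the assumption checking named above still applies.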

ACCUMULATED EVIDENCE

Propositions and Claims in the Validity Evaluation
- Validity is a matter of degree.
- It requires an appraisal of all claims and evidence, including strengths and weaknesses.
- Assessment and validity evidence can always be improved.

GETTING STARTED

Where to Start?
- In the beginning, measures will not meet all, or even many, of the criteria.
- Start by being clear about learning expectations.
- Ensure the assessments you develop or select are aligned with those expectations.
- Collect evidence for the propositions during operational administrations.
- Develop a long-term agenda to improve the assessments.

Finally…
- A single assessment cannot adequately capture the multi-faceted domain of teacher effectiveness.
- Multiple measures are essential.