TECHNICAL AND CONSEQUENTIAL VALIDITY IN THE DESIGN AND USE OF VALUE-ADDED SYSTEMS LAFOLLETTE SCHOOL OF PUBLIC AFFAIRS & VALUE-ADDED RESEARCH CENTER, UNIVERSITY OF WISCONSIN-MADISON Robert Meyer, Research Professor and Director

VARC Partner Districts and States  Design of Wisconsin State Value-Added System (1989)  Minneapolis (1992)  Milwaukee (1996)  Chicago (2006)  Department of Education: Teacher Incentive Fund (TIF) (2006 and 2010)  Madison (2008)  Wisconsin Value-Added System (2009)  Milwaukee Area Public and Private Schools (2009)  Racine (2009)  New York City (2009)  Minnesota, North Dakota & South Dakota: Teacher Education Institutions and Districts (2009)  Illinois (2010)  Hillsborough (2010)  Atlanta (2010)  Los Angeles (2010)  Tulsa (2010)  Collier County (2012)  New York (2012)  California Charter Schools Association (2012)  Oklahoma Gear Up (2012)

Minneapolis Milwaukee Chicago Madison Tulsa Atlanta New York City Los Angeles Hillsborough County NORTH DAKOTA SOUTH DAKOTA MINNESOTA WISCONSIN ILLINOIS Districts and States Working with VARC Collier County NEW YORK

Context and Research Questions

Components of Educator Effectiveness Systems  Data Requirements and Data Quality  Professional Development (Understanding and Application)  Evaluating Instructional Practices, Programs, and Policies  Alignment with School, District, State Policies and Practices  Embed within a Framework of Data-Informed Decision-Making  Value-Added System

Uses of a Value-Added System  Evidence that all students can learn  Set school performance standards  Triage: identify low-performing schools  Contribute to district knowledge about “What Works”  Data-informed decision-making / performance management

Development of a Value-Added System  Clarity: What is the objective?  Dimensions of validity and reliability  Why? Achieve accuracy, fairness, improved teaching and learning  How complex should a value-added model be? Possible rule: “Simpler is better, unless it is wrong.”

Dimensions of Validity and Reliability  Accuracy  Criterion validity  Technical (causal) validity  Reliability (precision)  Consequential validity  Transparency

Technical validity  Technical validity measures the degree to which the statistical model and data used in the model (for example, student outcomes, student characteristics, and student-classroom-teacher linkages) provide consistent (unbiased) estimates of performance using the available student outcomes/assessments  Requires development of a quasi-experimental model that captures (to the extent possible) the structural factors that determine student achievement and growth in student achievement

Consequential validity  Consequential validity addresses the incentives and decisions that are triggered by the design and use of performance measures and performance systems

Transparency  Transparency addresses the consequences of simplicity versus complexity in the design (and clarity of explanation) of value-added models and reports

Criterion Validity  Criterion validity captures the degree to which effect estimates based on available student outcome data fully align with estimates based on the complete spectrum of student outcomes valued by stakeholders

Reliability  Reliability (or precision) captures statistical error due to the fact that effectiveness estimates are based on finite samples of students, which in the context of estimating classroom and teacher performance are generally small

Application of Framework  Develop a value-added model that incorporates important structural factors that determine growth in student achievement and specify performance parameters that represent educational units (classrooms) and agents (teachers)  Identify and address threats to validity that could cause bias in the estimation of desired performance parameters  Specify data uses, including the design of reports intended to inform decision making

Technical vs. Consequential Validity I  Consider the consequences of controlling for prior achievement and other predictors – switching from measurement of attainment (as in NCLB) to growth  Positive from the standpoint of technical validity because the estimates are more accurate  Possibly negative from the perspective of consequential validity if controlling for prior achievement and other predictors inevitably leads to reduced expectations for poor and minority students.

Technical vs. Consequential Validity II  Consequences of inclusion of demographic variables?  Possibly positive from the standpoint of technical validity because the estimates are more accurate  Possibly negative from the perspective of consequential validity because the inclusion of these variables inevitably leads to reduced expectations for poor and minority students.  Or, the reverse is true

Value-Added Model

Generally Recommended Value-Added Model Features  Longitudinal student outcome/assessment data  Flexible (data-driven) posttest-on-pretest link, including possible nonlinearities in this relationship  Contextual covariates  Adjust for test measurement error  Address changes in assessments over time  Allow for end-of-grade & end-of-course exams  Dosage/student mobility  Allow differential effects by student characteristics  Statistical shrinkage: address noise due to small samples  Measures of precision and confidence ranges
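A minimal sketch of such a model on simulated data, using only numpy: a posttest-on-pretest regression with one contextual covariate and classroom dummy variables. All names and effect sizes are invented, and most features listed above (measurement-error adjustment, shrinkage, dosage, differential effects) are deliberately omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

n_classrooms, n_students = 40, 25
n = n_classrooms * n_students
classroom = np.repeat(np.arange(n_classrooms), n_students)

pretest = rng.normal(size=n)
poverty = rng.binomial(1, 0.4, size=n)            # contextual covariate
true_effect = rng.normal(0, 0.3, size=n_classrooms)

posttest = (0.8 * pretest - 0.2 * poverty
            + true_effect[classroom] + rng.normal(0, 0.5, size=n))

# Design matrix: pretest, poverty, and one dummy per classroom
# (no separate intercept, so each dummy coefficient is that
# classroom's effect).
X = np.column_stack([pretest, poverty,
                     (classroom[:, None] == np.arange(n_classrooms)).astype(float)])
coef, *_ = np.linalg.lstsq(X, posttest, rcond=None)
va_estimates = coef[2:] - coef[2:].mean()         # center classroom effects
```

With 25 students per classroom the estimates track the true effects closely; with very small classrooms the noise term dominates, which is exactly the reliability problem shrinkage addresses.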

Model Simplifications  Longitudinal data for two time periods (appropriate for early grades)  Model will be defined in terms of true test scores. Estimation method controls for test measurement error  Posttest-on-pretest relationship is assumed to be linear – this can be generalized  Student mobility within the school year is ignored in order to simplify notation

Structural Determinants of Achievement and Achievement Growth  Student level  Prior achievement  Student and family contribution  Within-classroom allocation of resources (including student performance expectations)  School contributions external to classroom (supplemental in-school instruction, after school instruction, summer school)

Structural Determinants of Achievement and Achievement Growth  Classroom level  Peer effects  Contributions external to teacher (school resources, policies, and climate, class size)  Contributions internal to teacher (teacher resources, policies, and instructional practices, alignment with standards implied by assessments) (factors that may be covered by observational rubrics)

Preview of Alternative Performance Parameters  Teacher performance:  Classroom performance:  Includes contributions in classroom from student peers and resources external to teacher (such as other staff and class size)  Factors external to the classroom (supplemental in-school instruction, after school instruction, summer school):  Classroom/school performance:  Includes contributions from classroom and resources external to the classroom

Model Specification Strategy  Include in the model all structural determinants of achievement and achievement growth  Be explicit about how demographic variables and prior achievement contribute directly or indirectly (via other determinants) to achievement and growth  Two types of student and demographic variables:  Level I (Student level):  Level II (Classroom level):  Subscripts: student i, teacher j, and school k

I: Student-Level Equation  Posttest:  Pretest: with durability/decay parameter:  Student and family contribution:  Within classroom contribution:  Supplemental contribution:  Measures of supplemental factors not observed  Subscripts: student i, teacher j, and school k
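The equations on this slide did not survive transcription. A conventional student-level specification consistent with the listed components, in notation that is our assumption rather than the presenter's, is:

```latex
% Posttest as decayed pretest plus student/family, classroom/school,
% and supplemental contributions (student i, teacher j, school k):
Y^{\text{post}}_{ijk} = \lambda\, Y^{\text{pre}}_{ijk}
    + X_{ijk}\beta + \theta_{jk} + s_{ijk} + \epsilon_{ijk}
```

where \lambda is the durability/decay parameter, X_{ijk}\beta the student and family contribution, \theta_{jk} the classroom/school performance effect, s_{ijk} the (unobserved) supplemental contribution, and \epsilon_{ijk} the error term.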

Alternative Student-Level Equation  Include explicit measures of supplemental resources in the model, producing a multiple-input (crossed effects) model  This model is tractable if the crossed effects are not highly collinear. If the crossed effects are highly (or completely) collinear, then it may be possible to address provision of supplemental resources in the second level of the model as a factor external to the teacher.  Our focus is on the conventional one-input model

Condition Factors on Student-Level Demographic Variables  Student and family factor  Within classroom factor  Supplemental factor

Defines a VAM of Student Growth and Classroom/School Performance  Combine student-level structural factors  Pretest coefficient  Effect of student-level characteristics  Classroom/school performance

Decomposition of Average Achievement  Predicted achievement = Prior achievement + Student growth  Average post achievement = Predicted achievement + Classroom/school performance  Teacher subscripts jk dropped
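A worked numeric example of this decomposition (all numbers invented for illustration):

```python
# Decomposition of average achievement, on a common test-score scale.
avg_prior = 30.0          # average prior achievement
student_growth = 5.0      # growth attributable to student-level factors
performance = 2.0         # classroom/school value-added

predicted = avg_prior + student_growth    # predicted achievement
avg_post = predicted + performance        # average posttest achievement

achievement_target = 40.0
shortfall = achievement_target - avg_post # remaining achievement shortfall
```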

Technical Validity  Classroom/school performance from the value-added model that includes demographic variables is the structural parameter of interest:  The performance parameter obtained from a model that excludes demographic variables is (approximately)  This parameter is biased
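The nature of this bias can be shown with a small simulation: two classrooms with identical true performance but different demographic composition. Omitting the demographic variable loads its effect onto the classroom estimate. This is a sketch under assumed effect sizes, not the presenter's model:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 2000
classroom = rng.integers(0, 2, size=n)
# The demographic indicator is much more common in classroom 1.
demo = rng.binomial(1, np.where(classroom == 1, 0.8, 0.2))
pretest = rng.normal(size=n)

# Both classrooms have identical true performance (effect 0).
posttest = 0.8 * pretest - 0.5 * demo + rng.normal(0, 0.3, size=n)

def classroom_gap(include_demo):
    """Estimated classroom-1 effect, with or without the demographic."""
    cols = [pretest, (classroom == 1).astype(float), np.ones(n)]
    if include_demo:
        cols.insert(1, demo.astype(float))
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, posttest, rcond=None)
    return coef[2] if include_demo else coef[1]

biased = classroom_gap(include_demo=False)   # absorbs the demographic gap
unbiased = classroom_gap(include_demo=True)  # close to the true gap of 0
```

The excluding model attributes the demographic achievement gap to the classroom; the including model recovers the (zero) true difference.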

II. Classroom/School Level Equation  Classroom/school performance:  Peer effects:  Contributions external to teacher:  Contributions internal to teacher:

Condition Factors on Average Classroom- Level Demographic Variables  Peer effects:  Contributions external to teacher:  Contributions internal to teacher:

Defines a Model of Classroom/School Performance  Preferred model (but not identified)  Teacher parameter (not identified):  Bias: productivity external to teacher =  Feasible model (biased)  Bias is caused by “over-controlling”

Dilemma in the Choice of Models from the Perspective of Technical Validity  Option A: Use classroom/school performance as a proxy measure of teacher performance; commit an error of “omission”  Option B: Use the feasible, but biased, estimate of teacher performance; commit an error of “commission”  Option C: Use a more complicated model to control for the factors external to the teacher

Consequential Validity: Uses and Decisions  Parental choice of schools  Teachers’ willingness to teach in given schools  Identification of master teachers  Identification of teachers for professional development  Performance-based compensation  Provision of supplemental services  Avoid bubble effects: incentives to deploy resources to students as an artifact of statistical measures (statistics based on means rather than medians can be affected by all students)

Key Point: the Power of Two  Decisions need to be informed by:  Measure of school/classroom or teacher performance  Measures of student achievement: actual average student achievement; student achievement target (e.g., proficiency status)  Options:  Use only information on student attainment (NCLB)  Use only information on value-added performance  Use both pieces of data to inform decisions

Achievement Target, Performance, and Achievement Shortfall – Retrospective View  Example with two teachers  Focus on use of classroom/school indicator  Scale of parameters:  Value-added ratings are centered around zero with a standard deviation of one, and thus range from approximately -3 to 3  All other parameters (average achievement and the average contribution of demographic characteristics) are centered around zero and have been transformed to the value-added scale, although the standard deviations of these parameters are not constrained to equal one
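The centering and scaling described above can be sketched as follows (illustrative values):

```python
import numpy as np

def to_va_scale(ratings):
    """Center ratings at zero with standard deviation one, as described
    above; most values then fall roughly between -3 and 3."""
    r = np.asarray(ratings, dtype=float)
    return (r - r.mean()) / r.std()

scaled = to_va_scale([12.0, 15.0, 18.0, 21.0])
```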

How to Read the Scatter Plots  Axes: Value-Added ( ) vs. Percent Prof/Adv (2009); each point is a school in your district  A. Students know a lot and are growing faster than predicted  B. Students are behind, but are growing faster than predicted  C. Students know a lot, but are growing slower than predicted  D. Students are behind, and are growing slower than predicted  E. Students are about average in how much they know and how fast they are growing
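The five regions amount to a small classification rule on the two indicators; the thresholds for "about average" below are our assumptions, not the presentation's:

```python
def classify(value_added, pct_proficient, avg_pct=50.0, tol=0.5):
    """Assign a school to the A-E regions of the scatter plot
    (value-added on the standardized scale, attainment as percent
    proficient/advanced; thresholds are illustrative)."""
    if abs(value_added) <= tol and abs(pct_proficient - avg_pct) <= 10:
        return "E"  # about average on both dimensions
    if value_added > 0:  # growing faster than predicted
        return "A" if pct_proficient >= avg_pct else "B"
    return "C" if pct_proficient >= avg_pct else "D"
```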

Achievement Target, Performance, and Achievement Shortfall – Retrospective View  Table columns: Achievement Target, Average Prior Achievement, Student Factor, Classroom/School Performance, Average Posttest, Achievement Shortfall (cell values did not survive transcription)

Achievement Target, Performance, and Achievement Shortfall – Prospective View  Table columns: Achievement Target, Average Prior Achievement, Student Factor, Classroom/School Performance, Average Posttest, Achievement Shortfall (of the cell values, only “NA 26 37” survives in the transcript)

Achievement Target, Performance, and Achievement Shortfall – Prospective View  Table columns: Achievement Target, Average Prior Achievement, Student Factor, Classroom/School Performance, Average Posttest, Achievement Shortfall (cell values did not survive transcription)

The Pros and Cons of Using Attainment Only  It is straightforward to connect actual attainment with achievement targets and maintain a universal target  Average achievement and related attainment indicators such as percent proficient are severely biased as measures of classroom/school performance  Given a universal achievement target, the achievement shortfalls vary enormously across teachers and schools

The Pros and Cons of Using Value-Added Only  The value-added model provides an unbiased/consistent estimate of classroom/school performance  High value-added targets do not eliminate achievement shortfalls if prior achievement (or more correctly, predicted achievement, which includes student growth) is extremely low

The Power of Using Both Indicators  The value-added model provides an unbiased/consistent estimate of classroom/school performance  Achievement shortfalls can be identified prospectively and thus can trigger supplemental resource allocations designed to eliminate them

Include Student-Level Demographics?  Yes, to provide more accurate measures of classroom/school performance  Does this reduce expectations?  No, achievement targets are set independently  Predicted achievement shortfalls are not reduced in a model that includes student demographics. In fact, they are identical  Supplemental resource allocations can be triggered to eliminate achievement shortfalls

Does Including Demographic Variables Matter?  Table columns: Value Added Difference, Percent of Schools, Percent of Students; rows: Female, African American, Hispanic, Asian, Indian, White, Free/Reduced Lunch (cell values did not survive transcription)