Seattle’s Teacher Evaluation Reform in Context
Two questions
How does PG&E’s design compare to evaluation systems nationwide?
What can SPS learn about implementing PG&E from similar efforts in other districts?
Bottom lines
Do… | Don’t…
Realign instructional and operational systems to support eval system | Treat evaluation as a stand-alone reform
Communicate constantly about structure and purpose | Assume people understand
Train principals to work with teachers on improving practice | Focus only on “calibrating” observations
Monitor reliability and validity of measures | Assume you’re measuring what you want
Today’s briefing on implementation analysis
Approach
Findings
Implications
Discussion
Our approach
Reviewed empirical studies on implementation of PG&E-like reforms
Looked for evidence on what districts are actually doing, not what they should be doing
Presentation includes information from studies on:
Chicago (2); Denver; Washington, D.C.; Coventry, RI; Washoe County (Reno), NV; Cincinnati
The Measures of Effective Teaching (MET) Project and the Teacher Advancement Program (TAP)
What the research base covers
Studies focus on
–Implementation dynamics and fidelity
–Validity and reliability of performance rating
But generally no evidence on
–Effects on teacher workforce or classroom practice
–Effects on student learning
Four key findings
Evaluation reforms can expose problems in other district-wide systems
Teachers and principals often struggle with understanding and carrying out the reforms
Observation-based ratings can identify “effective” teachers, but there’s room to improve
Observation-based ratings are more reliable when based on multiple observations
Reforms expose problems in other district-wide systems
Teaching focus of the reform highlights misalignments in instructional and operational systems
–Are PD and curriculum aligned with instructional frameworks and assessments?
–Are training, hiring, and payroll aligned in HR?
–Do data systems speak to each other (e.g., compensation and evaluation)?
People struggle to understand and implement the reforms
Teachers struggle to understand the structure and purpose of new evaluation systems
–Especially financial incentives
Principals struggle to work with teachers to improve teaching practice
–Most training focuses on calibrating observations and ratings
–Time constraints are a big issue
Observation-based ratings “work,” but could be better
Teachers who do well on observation ratings also tend to have higher value-added (VAM) scores
Ratings are better at identifying “effective” teachers when combined with other measures
Combining measures adds predictive power (Kane & Staiger, 2012, p. 9)
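To make the idea of "combining measures" concrete, here is a minimal sketch of one common way to build a composite: standardize each measure, then take a weighted average. The measure names, the equal weights, and the scores are all hypothetical and chosen for illustration; this is not the weighting used by Kane & Staiger or by PG&E.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores to mean 0 and standard deviation 1."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("all scores are identical; cannot standardize")
    return [(v - center) / spread for v in values]


def composite_scores(measures, weights):
    """Weighted average of standardized measures, one value per teacher.

    measures: dict of measure name -> list of scores (same teacher order).
    weights:  dict of measure name -> non-negative weight (hypothetical).
    """
    lengths = {len(scores) for scores in measures.values()}
    if len(lengths) != 1:
        raise ValueError("every measure needs one score per teacher")
    n_teachers = lengths.pop()
    total_weight = sum(weights[name] for name in measures)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    standardized = {name: z_scores(scores) for name, scores in measures.items()}
    return [
        sum(weights[name] * standardized[name][i] for name in measures) / total_weight
        for i in range(n_teachers)
    ]


# Made-up scores for four teachers; not data from any study.
measures = {
    "observation": [2.1, 2.8, 3.4, 3.0],
    "student_survey": [3.2, 3.9, 4.5, 4.1],
    "achievement_gain": [-0.2, 0.0, 0.3, 0.1],
}
weights = {"observation": 1.0, "student_survey": 1.0, "achievement_gain": 1.0}
scores = composite_scores(measures, weights)
print([round(s, 2) for s in scores])
```

Standardizing first keeps a measure with a wide raw scale from dominating the composite. How to set the weights is a policy choice the sketch does not settle.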
More observations = more reliable (Kane & Staiger, 2012, p. 37)
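One standard way to see why averaging several observations raises reliability is the Spearman-Brown formula from classical test theory. The sketch below illustrates that formula only; it is not necessarily the method behind the cited figure, and the single-observation reliability of 0.4 is a made-up value. The formula also assumes the observations are interchangeable ("parallel"), which real classroom visits may not be.

```python
def spearman_brown(single_obs_reliability: float, n_observations: int) -> float:
    """Projected reliability of the average of n parallel observations.

    Spearman-Brown: R_n = n * r / (1 + (n - 1) * r),
    where r is the reliability of a single observation.
    """
    r = single_obs_reliability
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if n_observations < 1:
        raise ValueError("need at least one observation")
    return n_observations * r / (1 + (n_observations - 1) * r)


# Hypothetical starting point: one observation has reliability 0.4.
for n in (1, 2, 4):
    print(n, round(spearman_brown(0.4, n), 2))
```

Under these assumptions the gain from each added observation shrinks as the count grows, which is why districts usually weigh reliability against principals' time.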
Implications
Ensure district improvement initiatives complement and support PG&E implementation
–E.g., work of EDs, HR, C&I, and DoTS
Assess how well people understand PG&E and redouble communication efforts
Implications, cont’d
Train principals not only in observation and rating but also in working with teachers to improve practice
–Place a premium on hiring and developing leadership talent
Create a systematic process to monitor the reliability and validity of PG&E evaluations
–Double ratings
–Comparing ratings to VAM
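The two monitoring checks named above could be sketched as follows: Cohen's kappa for agreement between two raters who scored the same lessons (a reliability check), and a Pearson correlation between observation ratings and VAM scores (a validity check). These are common statistics for this purpose, not ones the studies or SPS are known to prescribe, and all ratings and VAM values below are invented for illustration.

```python
from collections import Counter
from statistics import mean


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same lessons."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if the raters assigned categories independently.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category is used
    return (observed - expected) / (1 - expected)


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical double ratings (1-4 scale) for eight teachers.
principal_ratings = [1, 2, 2, 3, 3, 4, 4, 2]
second_rater_ratings = [1, 2, 3, 3, 3, 4, 3, 2]
# Hypothetical VAM scores for the same eight teachers.
vam_scores = [-0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 0.3, -0.2]

kappa = cohens_kappa(principal_ratings, second_rater_ratings)
rating_vam_corr = pearson(principal_ratings, vam_scores)
print(f"kappa = {kappa:.2f}, rating-VAM correlation = {rating_vam_corr:.2f}")
```

Tracking both numbers over time, by school or by rater, is one way to turn "monitor reliability and validity" into a routine rather than a one-off study.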
Inclusion criteria
Research must be on programs with teacher evaluation systems, not simply pay reform systems
Research must evaluate domestic reform at the district, county or state level
Study must examine student outcomes, instructional practice, or effects on staffing (recruitment, retention, dismissal)
Studies must clearly state the methodology that the authors use, the research sample and the sources of data that the research uses
Authors must explain and justify the thoughtful creation of their samples (i.e., reports must not simply use convenience samples)
The study must include quantitative or qualitative data that represents reform outcomes throughout the geographic area of implementation
The research must compare measured outcomes to either a control group, the school’s past performance, or both
Studies in review
Curtis, District of Columbia Public Schools: Defining Instructional Expectations and Aligning Accountability and
Glazerman, S., & Seifullah, A. (2012). An evaluation of the Chicago Teacher Advancement Program (Chicago TAP) after four years. Washington, DC: Mathematica Policy Research.
Kane, T. J., & Staiger, D. O. (2012, January 4). Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains. Seattle, WA: Bill & Melinda Gates Foundation.
Kimball, S. M., White, B., Milanowski, A. T., & Borman, G. (2004). Examining the relationship between teacher evaluation and student assessment results in Washoe County. Peabody Journal of Education, 79(4).
Milanowski, A. T. (2004). The Relationship Between Teacher Performance Evaluation Scores and Student Achievement: Evidence from Cincinnati. Peabody Journal of Education, 79(4).
Proctor, D., Walters, B., Reichardt, R., Goldhaber, D., & Walch, J. (2011). Making a difference in education reform: ProComp external evaluation report. University of Colorado Denver Center for Education Data and Research.
Sartin, L., Stoelinga, S. R., & Brown, E. R. (2011). Rethinking teacher evaluation in Chicago: Lessons learned from classroom observations, principal-teacher conferences, and district implementation. Consortium on Chicago School Research at the University of Chicago Urban Education Institute.
Springer, M. G. (2008). Impact of the Teacher Advancement Program on student test score gains: Findings from an independent appraisal. National Center on Performance Incentives, Peabody College of Vanderbilt University. Retrieved from dProg1.pdf
White, B. (2004). The relationship between teacher evaluation scores and student achievement: Evidence from Coventry, RI. Madison, WI: Consortium for Policy Research in Education.