Presentation transcript:

Center for Education Policy Analysis at Stanford University (cepa.stanford.edu)

Measuring and Enhancing Teacher Effectiveness: Data, Methods, and Policies
Susanna Loeb*
Higher School of Economics, National Research University, Moscow
September 2014
* Content joint with Jim Wyckoff & Allison Atteberry, Ben Master, Matt Ronfeldt, or Luke Miller

Why Measure Teacher Effectiveness?
Better decisions
– Direct, e.g., whom to promote
– Indirect
Improved understanding
– e.g., what experiences improve teacher effectiveness?

Today
A bit of history on teacher effectiveness measures in the US
Considerations of measurement
Four examples of potential uses – focus on the last one

Large-Scale Test Data Availability
Test-based accountability
– State level: first TX, NC, SC, FL and others introduced yearly tests to track school performance.
– Federal level: the No Child Left Behind Act required ELA and math tests in grades 3-8, plus one in high school.
State and district data allowed researchers to assess policy effects and the effects of teachers:
– Teachers vary widely in their ability to improve student achievement (Gordon, Kane, & Staiger 2006; Rivkin, Hanushek, & Kain 2005; Sanders & Rivers 1996)
– Teachers improve with experience, particularly during their first two years (e.g., Rockoff, 2004)

The Widget Effect
2009 study in 12 large school districts
Schools and districts are:
– Not measuring teacher effectiveness: in districts that use binary evaluation ratings (generally "satisfactory" or "unsatisfactory"), more than 99 percent of teachers receive the satisfactory rating. Districts that use a broader range of rating options do little better; in these districts, 94 percent of teachers receive one of the top two ratings and less than 1 percent are rated unsatisfactory.
– Not considering teacher effectiveness in decisions

Push for Evaluation
The combination of
– recognition of teacher importance, and
– recognition of the Widget Effect
led to a strong push for new evaluation systems
– not based solely on subjective assessments, given the forces leading to little variation.
The speed of change was probably due to Obama administration policies
– close ties to entrepreneurial educators: TNTP, TFA, ...

Race to the Top
$4.35 billion competition as part of the American Recovery and Reinvestment Act of 2009
Most points for "Great Teachers and Leaders" (138/500):
– Improving teacher and principal effectiveness based on performance (58 points)
– Ensuring equitable distribution of effective teachers and principals (25 points)
– Providing high-quality pathways for aspiring teachers and principals (21 points)
– Providing effective support to teachers and principals (20 points)
– Improving the effectiveness of teacher and principal preparation programs (14 points)

Improving Teacher Effectiveness Using Performance Measures
Raises questions:
– How to measure effectiveness?
– How to use measures of effectiveness once you have them?
What are the different kinds?
– Output based (e.g., based on student test performance)
– Process based (e.g., based on a structured observational protocol)
– Holistic / subjective (e.g., principal evaluations)
What features do we want?
– Validity (measurement property)
– Reliability (measurement property)
– Stability (effectiveness property)
Focus today on measures based on student test scores
– Similar analyses could be done with other measures

Value-Added
Measures teacher effectiveness by how much students' test performance improves from the spring of the prior year to the spring of the current year
The idea is to isolate the teacher's effect from other effects on learning – hence "value-added"
Can only be calculated for teachers in grades and subject areas with tests in both the prior year and the current year
Clearly better than using test performance levels
Far from perfect – e.g., based on imperfect tests, subject to random fluctuations and potential gaming

VAM: How Are They Calculated?
Student test score gains relative to what we think they would be
Most are a basic regression:
– Predict what a student would score in the spring as a linear function of prior score, demographic characteristics, program participation (maybe), class characteristics, and school characteristics
– Value-added is the average difference between predicted and actual scores
"Colorado Growth Model":
– For each student, how much do they learn relative to other students with the same prior test score (percentiles)?
– Take the median percentile of growth for the class
Do different value-added models tell us the same things?
– Models vary in how they account for student backgrounds, school, and classroom resources, and in whether they compare teachers across a district (or state) or just within schools.
– Correlations between models are often high, but even so, different models will categorize many teachers differently. (Goldhaber & Theobald, 2013)
A sketch of both calculations follows below.
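
The following is a minimal sketch of the two calculations just described: a regression-based value-added estimate and a Colorado-style median growth percentile. The column names (score, prior_score, frl, ell, teacher_id) are hypothetical, and real implementations include many more controls.

```python
# Minimal value-added sketch; column names and controls are illustrative only.
import pandas as pd
import statsmodels.formula.api as smf

def regression_value_added(df: pd.DataFrame) -> pd.Series:
    """Predict spring scores from prior scores and student characteristics,
    then average each teacher's residuals (actual minus predicted)."""
    model = smf.ols("score ~ prior_score + frl + ell", data=df).fit()
    residual = df["score"] - model.predict(df)
    return residual.groupby(df["teacher_id"]).mean()

def median_growth_percentile(df: pd.DataFrame) -> pd.Series:
    """Colorado Growth Model flavor: rank each student's current score among
    students with similar prior scores, then take the class median percentile."""
    prior_bins = pd.qcut(df["prior_score"], 10)   # peers with similar prior scores
    growth_pct = df.groupby(prior_bins)["score"].rank(pct=True)
    return growth_pct.groupby(df["teacher_id"]).median()
```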

A Detailed Example
Test score predicted by prior score, background, and classroom
Use the residual (plus classroom), predicted by classmate and school characteristics
Average the residual for each teacher
NYC standard deviations: ELA 0.24 (0.19 shrunk); Math 0.28 (0.21 shrunk)
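
The "shrunk" standard deviations above refer to shrinking noisy teacher estimates toward the mean. Below is an illustrative empirical-Bayes version of that idea, not NYC's exact procedure; the variance inputs and function signature are assumptions made for the sketch.

```python
# Illustrative empirical-Bayes shrinkage of raw value-added estimates.
# A generic textbook version, not the exact NYC methodology.
import numpy as np

def shrink(raw_va: np.ndarray, n_students: np.ndarray,
           var_teacher: float, var_student: float) -> np.ndarray:
    """Pull each raw estimate toward zero in proportion to its noisiness:
    teachers with fewer students are shrunk more."""
    sampling_var = var_student / n_students              # noise in the raw mean
    reliability = var_teacher / (var_teacher + sampling_var)
    return reliability * raw_va

# e.g., a teacher with 10 students is shrunk far more than one with 60:
# shrink(np.array([0.2, 0.2]), np.array([10, 60]), 0.04, 0.25)
```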

Is VA a "Good" Measure?
Carnegie Knowledge Network:
– Test score measures are an imperfect measure of all we care about for students
– No obvious bias (especially within schools)
– Substantial measurement error – less when considering groups of teachers
– Benefits of use depend on the alternatives

POTENTIAL USES: 2 DIRECT AND 2 INDIRECT
Understanding and Decision Making

Example 1: Simulated Use – the Case of Layoffs
Several school districts confronted teacher layoffs in spring 2010 and 2011
– Some avoided layoffs, e.g., New York City
– Others did not, e.g., LA and DC
Layoffs are nearly always determined by a measure of seniority
Many superintendents raised concerns that seniority layoffs compromise teacher quality

What Might We Expect if We Substituted VA for Seniority?
Seniority layoffs typically affect teachers with two or fewer years of experience
– On average, teachers improve markedly during their first 3-4 years
Large variance in teacher effectiveness within and across experience levels
Many districts have recently focused on recruiting more able teachers

Simulation: Who Is Laid Off by a 5% Salary Savings under Seniority vs. VA?
We simply simulated what would have happened if 5% of the workforce had been laid off two years earlier, by seniority or by value-added
Fewer teachers are laid off under VA layoffs:
– A seniority-based system would lay off 7% of teachers
– A VA-based system would lay off 5% of teachers
Little overlap:
– Only 13% of seniority layoffs would also be laid off by VA
– VA estimates that control for experience reduce the overlap to 5%
VA layoffs are, on average, 7 years more experienced than seniority layoffs
A stylized version of this simulation is sketched below.
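
Here is a stylized version of the simulation logic with entirely made-up data. Because junior teachers earn less, hitting the same salary-savings target requires laying off more of them, which is why the seniority rule removes more teachers than the VA rule.

```python
# Stylized layoff simulation; all data, salaries, and thresholds are invented.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
teachers = pd.DataFrame({"experience": rng.integers(0, 30, 1000),
                         "va": rng.normal(0.0, 0.15, 1000)})
teachers["salary"] = 40_000 + 1_500 * teachers["experience"]  # seniority pay scale
target = 0.05 * teachers["salary"].sum()                      # 5% salary savings

def layoffs(df: pd.DataFrame, order_col: str) -> set:
    """Lay off teachers in ascending order of order_col until savings reach target."""
    ordered = df.sort_values(order_col)
    return set(ordered.index[ordered["salary"].cumsum() <= target])

by_seniority = layoffs(teachers, "experience")   # least senior first
by_va = layoffs(teachers, "va")                  # lowest value-added first
print(len(by_seniority), len(by_va), len(by_seniority & by_va))
```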

Value-Added of Layoffs by Seniority and VA (4th and 5th grade)
[Figure omitted in transcript]

How Would Principals Have Rated Laid-Off Teachers?
2.5% of our sample received an "Unsatisfactory" rating from their principal
– Of these, 16% would have been VA layoffs, but only 8% of VA layoffs would have received a "U" rating
– None would have been seniority layoffs

Effects on Student Learning
[Figure: difference between seniority and VA layoffs, in standard deviations of student achievement and of teacher VA; values omitted in transcript]
Small effect overall, since only 5% are laid off, but large effects on students with the affected teachers.

Layoff Example
Dismissal based on teacher performance measures is likely to have less negative effects on students than dismissal based on experience
In reality, given coverage and reliability concerns, value-added measures would likely be used in combination with other performance measures
The availability of performance measures allowed for simulation of policy effects that could be helpful for policy decisions

Example 2: Actual Use – the Case of Promotion
Teacher tenure: job protection most often received after 3 years
Tenure history
– NJ first tenure law 1909; NY 1917; CA 1921; MI, PA, WI 1937
– 48 states
– Contentious then, contentious now
Policy on two tracks
– Eliminate tenure:
GA: eliminated 2001, reinstated 2003
ID: passed 2011, voters repealed 2012
SD: passed 2012, voters upheld, will eliminate by 2016
FL: eliminated in 2011; NC: will eliminate by 2018
– Make more rigorous:
More than half the states require meaningful evaluation
20 states require student test performance
25 states have multiple categories for evaluation

New York City Tenure Policy
Principal recommends, superintendent decides
Tenure decisions: approve, extend, or deny
Prior to the reform, tenure was largely automatic
The reform encouraged careful review; requirements evolved year by year:
– Classroom observations, evaluations of teacher work products, annual S/D/U ratings
– Teacher data reports (value-added measures for some teachers); in-class assessments aligned with NY standards
– District guidance ("tenure in doubt," "tenure likely"); rationale required for cases that countered district guidance
– Later: all teachers rated as highly effective, effective, developing, or ineffective; district performance flags, but no guidance
– Later: same as before, except value-added measures were not available in time
– Later: same as before, with state-provided growth scores and growth ratings replacing local value-added measures

How Did Tenure Rates Change Following Reform?
[Figure: tenure rates by year, with the new tenure policy marked]

Which Teachers Were Affected by the Policy?
[Table: attributes of teachers by tenure decision (approve, extend, deny): VAM ELA*, VAM Math*, SAT Math, SAT Verbal, LAST Exam, U rated, D rated, low attendance; values omitted in transcript]
* Value-added results, in % of a SD of teacher effectiveness
Extend v. Approve: p<0.05; Extend v. Deny: p<0.05

How Did the Composition of Continuing Teachers Change Following Reform?
[Table: attributes of extended teachers by attrition status (same school, transfer, exit): VAM ELA, VAM Math, SAT Math, SAT Verbal, LAST Cert Exam; most values omitted in transcript; same school VAM ELA = -0.091~]
Notes: ** p<0.01, * p<0.05, ~ p<0.1 – compares same school to transfer/exit

Tenure Example
Effectiveness measures used directly in practice
– A reform of practice, not policy, that worked within the current contract
Imprecision is part of all evaluation measures
– Here, the structure of the reform allows for corrections

Example 3: To Understand Schooling – the Case of Turnover
Nationally, about one-third of teachers leave the profession in their first 5 years
– Higher in high-poverty, urban, and low-performing schools (Hanushek, Kain & Rivkin, 1999)
In NYC, about 14% of 4th and 5th grade teachers leave their school each year: 4% migrate between schools, 10% leave the district
Is this problematic?

Background
Teacher turnover is often assumed to harm student achievement... but does it?
– Little empirical evidence for a direct effect (Guin, 2004)
Turnover rates are higher in lower-performing schools (Guin, 2004; Hanushek et al. 1999)
– Causal? A third factor explaining both (e.g., the principal leaving)?
– Direction? Some turnover can be beneficial – new ideas, person-job match (organizational management literature, e.g., Abelson & Baysinger, 1984)

Consider 2 Theories of Action
Compositional – turnover changes the composition of teachers (especially quality), which, in turn, impacts achievement
Disruption – a disruptive effect beyond changes in the composition of teachers
– Organizational: affects ALL teachers, NOT just leavers and their replacements

Methods
Unique identification strategy – school-by-grade-by-year level turnover (2 measures)
Two classes of fixed-effects regression models (sketched below):
– Grade-by-school: look within the same school and grade across time – is achievement lower in years with more turnover?
– School-by-year: look within the same school and year across grades – is achievement lower in grades with more turnover?
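
A minimal sketch of the first of these models follows; the column names (score, turnover, school, grade, year) are hypothetical, and the actual specification includes student-level controls.

```python
# Grade-by-school fixed-effects sketch: is achievement lower in years when
# a school-grade cell experiences more turnover? Column names are assumed.
import statsmodels.formula.api as smf

def grade_by_school_fe(df):
    # C(school):C(grade) absorbs fixed school-grade differences and C(year)
    # absorbs citywide shocks, so the turnover coefficient is identified
    # from over-time variation within a school-grade cell.
    m = smf.ols("score ~ turnover + C(school):C(grade) + C(year)",
                data=df).fit(cov_type="cluster",
                             cov_kwds={"groups": df["school"]})
    return m.params["turnover"], m.bse["turnover"]
```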

Findings
Student achievement is lower in years/grades when turnover rates were higher
– Math scores are 8-10 percent of a standard deviation lower in years with 100 percent turnover (vs. no turnover); the ELA effect is smaller, 5-6 percent
In a grade level with 5 teachers, reducing turnover from 2 teachers leaving (a 40 percentage point drop) to none increases math achievement by roughly 0.40 × 0.08 ≈ 3% of a SD
– Small but meaningful, and it applies to all students in the grade level
– Roughly the same magnitude as the coefficient on free lunch eligibility
Probably underestimates the effect, since it exploits "idiosyncratic" turnover (ignoring systemic effects)

Is the Effect Compositional?
Control for teaching experience, new-to-the-school status, and value-added
Evidence for the compositional theory of action
– But a significant effect (30-70%) remains unexplained by composition
Also, evidence for a disruptive effect beyond changes in teacher composition
– Students of stayers do worse in years with more turnover

Turnover Example
Student test score measures used to better understand the implications of teacher turnover for students
Value-added measures allowed for distinguishing the compositional effects of turnover from its disruptive effects

Example 4: To Understand Teaching and Learning – the Case of Persistent Learning
This final example explores what students learn in school and how that affects their later achievement

Getting on the Same Page: Knowledge & Skill Type
– Content: subject-specific vs. overlapping/general
– Term: long-term, that builds, vs. short-term or peripheral
– Learning source: teacher vs. other

Getting on the Same Page
[Diagram: knowledge decomposed into short-term, long-term subject-specific, and long-term general components]

Cross-Subject Effects
[Diagram: short-term, long-term subject, and long-term general components, shown for cross-subject effects]

Why Might Teachers Vary in Persistence?
Different students – different forgetting of "long-run" knowledge
Different teachers – different abilities
Different schools – different incentives (e.g., teaching to the test) or supports

Relevant Extant Research
Student test score gains depend on their teacher
Some, but not all, teacher-driven gains persist into future years (about 20%-35%)
Persistence is higher for test-score gains on low-stakes tests
Knowledge gains from teachers result in long-run gains in earnings
Long-term earnings gains are greater for ELA knowledge gained from teachers (though teachers affect ELA less)
Long-term earnings effects are lower for low-income students, even though teachers' effects on test scores are similar

What's Missing (and Interesting)?
Few persistence studies – replication
No cross-subject persistence studies for test performance – distinguishing general and specific knowledge gains
Few studies of variance in persistence

Research Questions
1. What is the persistence of teachers' value-added within and across subject areas?
2. Does value-added persistence vary by teachers' ability?
3. Does value-added persistence vary by students' background or prior achievement?
– Does variation in persistence stem from students' differential rates of forgetting previously acquired long-term knowledge?
4. Do school-level characteristics predict variation in teachers' persistence?

1. What Is the Persistence of Teachers' Value-Added Within and Across Subject Areas?
Use the method from Jacob, Lefgren and Sims (2010)
Predict the current test score with students' prior test score:
– Same subject: gives the observed relationship between prior and current score
– Other subject: gives the observed relationship between the prior score and the current score in the other subject
Instrument the prior score with the twice-lagged score (using only variation in the score that was already there the prior year):
– Same subject: how much long-term knowledge is retained
– Other subject: how much long-term knowledge is general (applies to both subjects)
Instrument prior knowledge with the prior teacher's value-added (using only variation in the score that came from the teacher):
– Same subject: how much of the learning from a teacher is persistent
– Other subject: how much of the learning from a teacher is general
A sketch of the IV step appears below.
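
A minimal sketch of the instrumental-variables step under the logic above. The linearmodels package and all column names (score, prior_score, twice_lagged_score, prior_teacher_va) are choices made here for illustration, not the paper's actual code.

```python
# IV persistence sketch in the spirit of Jacob, Lefgren & Sims (2010).
# Column names and the bare-bones specification are illustrative.
from linearmodels.iv import IV2SLS
import pandas as pd

def persistence(df: pd.DataFrame, instrument: str) -> float:
    """2SLS: regress the current score on the prior score, instrumenting the
    prior score so that only one source of its variation is used."""
    res = IV2SLS.from_formula(
        f"score ~ 1 + [prior_score ~ {instrument}]", data=df).fit()
    return res.params["prior_score"]        # persistence coefficient

# persistence(df, "twice_lagged_score")  -> retention of long-term knowledge
# persistence(df, "prior_teacher_va")    -> persistence of teacher-driven gains
# For cross-subject persistence, swap in the other subject's score as outcome.
```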

Cross-Subject
Replace the outcome measure with the other subject's score (and classroom fixed effects with the other subject's classroom fixed effects)
Long-run knowledge
– The same approach captures the percent of long-term knowledge that is general knowledge
Persistence
– The same approach captures the percent of the teacher effect that persists through general knowledge alone

Context: Correlations of ELA Teachers' Value-Added
[Figure omitted in transcript; annotation: "Not much"]

Research Question 1
What is the persistence of teachers' value-added within and across subject areas?

Persistence of Observed Knowledge, Long-Term Knowledge, and Teacher Value-Added
[Figure omitted in transcript]
Students retain most long-term knowledge, but only about 20% of learned knowledge

Cross-Subject
Learning from ELA teachers affects future math 3+ times as much as math teachers affect ELA (almost as much as math learning affects math)
About 60% of long-term knowledge carries across subjects

Research Question 2
Does value-added persistence vary by teachers' ability?

C ENTER FOR E DUCATION P OLICY A NALYSIS at S TANFORD U NIVERSITY cepa.stanford.edu Table 4: Heterogeneity of ELA Teachers’ Persistence

C ENTER FOR E DUCATION P OLICY A NALYSIS at S TANFORD U NIVERSITY cepa.stanford.edu Table 5: Heterogeneity of Math Teachers’ Persistence

Research Question 3
Does value-added persistence vary by students' background or prior scores?

Heterogeneity of ELA Teachers' Persistence
Poor, Black, Hispanic, and low-performing students retain less of what they learn from teachers

Heterogeneity of Math Teachers' Persistence
Not the same for math, except: math learning has even less of an effect on ELA for Black, Hispanic, and low-scoring students

C ENTER FOR E DUCATION P OLICY A NALYSIS at S TANFORD U NIVERSITY cepa.stanford.edu Does variation in persistence stem from students’ differential rates of forgetting previously acquired long-term knowledge?

C ENTER FOR E DUCATION P OLICY A NALYSIS at S TANFORD U NIVERSITY cepa.stanford.edu Table 6: Heterogeneity in Long-Term Knowledge Persistence

Research Question 4
Do school-level characteristics predict variation in teachers' persistence?

C ENTER FOR E DUCATION P OLICY A NALYSIS at S TANFORD U NIVERSITY cepa.stanford.edu ELA Teacher persistence estimates across multiple school-level characteristics

Summary
1. About 20 percent of what students learn from a teacher is long-term knowledge
– Similar for math teachers and ELA teachers
2. More of ELA teachers' effect works through general knowledge that affects math as well as ELA
– About 15% of learning, vs. 4% for math
3. ELA teacher persistence is higher for high-ability teachers
4. ELA teacher persistence is lower for low-performing and low-income students
– A higher rate of forgetting explains a small part
– Schools explain far more: persistence is lower in schools serving low-performing students with few high-ability teachers

Implications
ELA teaching affects both ELA and math learning
Teachers vary in their persistence in ways not captured by value-added
Likely causes (worth considering when assessing teachers):
– Ability
– Incentives

Examples: VA for Direct and Indirect Use
1. Layoffs – simulating potential policy effects when used for layoffs
2. Tenure – tracing policy effects when used in practice
3. Turnover – understanding the implications of school processes for student learning
4. Persistence – understanding teaching and learning

Measures of Effectiveness
Inherently flawed:
– Do not capture the full range of effectiveness
– Measurement error (affected by unobserved shocks and differences)
– May be biased
Yet, they may be useful in practice:
– Real-time decision making
– Broader understanding
Whether value-added is useful depends on:
– The availability of tests that measure valued outcomes
– The availability of alternative measures of teacher effectiveness
