Are Teacher-Level Value-Added Estimates Biased? An Experimental Validation of Non-Experimental Estimates. Thomas J. Kane (HGSE) and Douglas O. Staiger (Dartmouth College).

LAUSD Data
- Grades 2 through 5.
- Three time periods:
  - Years before random assignment: Spring 2000 through Spring 2003.
  - Years of random assignment: either Spring 2004 or Spring 2005.
  - Years after random assignment: Spring 2005 (or 2006) through Spring 2007.
- Outcomes: California Standards Test (Spring 2004 through 2007), Stanford 9 tests (Spring 2000 through 2002), California Achievement Test (Spring 2003).
- Covariates:
  - Student: baseline math and reading scores (interacted with grade); race/ethnicity (Hispanic, white, black, other or missing); ever retained; Title I; eligible for free lunch; gifted and talented; special education; English language development (levels 1-5).
  - Peers: classroom means of all of the above.
- Fixed effects: school × grade × track × year.
- Sample exclusions: classes with more than 20 percent special education students; classes with fewer than 5 or more than 36 students.
- All scores standardized by grade and year (a small sketch of this step follows below).
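A minimal sketch of the grade-by-year standardization step, assuming a pandas workflow; the column names here are illustrative, not taken from the slides:

```python
import pandas as pd

# Toy data: one test score per student, tagged with grade and year.
df = pd.DataFrame({
    "grade": [2, 2, 2, 3, 3, 3],
    "year":  [2000, 2000, 2000, 2000, 2000, 2000],
    "score": [310.0, 330.0, 350.0, 340.0, 355.0, 370.0],
})

# Standardize within each grade-by-year cell (mean 0, s.d. 1).
df["z"] = df.groupby(["grade", "year"])["score"].transform(
    lambda s: (s - s.mean()) / s.std())
print(df)
```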

Experimental Design
- Sample of NBPTS applicants from the Los Angeles area, plus a sample of comparison teachers working in the same school, grade, and calendar track.
- LAUSD's chief of staff wrote letters to principals, inviting them to draw up two classrooms that they would be willing to assign to either teacher.
- If the principal agreed, classroom rosters (not individual students) were randomly assigned by LAUSD on the day of the switch.
- This yielded 78 pairs of teachers (156 classrooms and roughly 3,500 students) for whom we had estimates of "value-added" impacts from the pre-experimental period.

Step 1: Estimate a Variety of Non-Experimental Specifications Using Pre-Experimental Data
Generate Empirical Bayes estimates (VA_j) of teacher effects using a variety of specifications of A and X.
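The estimating equation on this slide did not survive as text. A minimal sketch of the kind of specification and shrinkage step described, with all notation assumed rather than recovered from the original:

$$A_{it} = \beta_0 A_{i,t-1} + X_{it}\gamma + \mu_{j(i,t)} + \varepsilon_{it}, \qquad VA_j = \hat{\mu}_j \cdot \frac{\hat{\sigma}_{\mu}^2}{\hat{\sigma}_{\mu}^2 + \hat{\sigma}_{\varepsilon}^2 / n_j},$$

where $\hat{\mu}_j$ is teacher $j$'s mean residual, $n_j$ is the number of students contributing to it, and the shrinkage factor pulls noisier estimates toward the mean.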

Step 2: Test Validity of VA_j in Predicting Within-Pair Experimental Differences
At the classroom level, differencing within each pair, p = 1 through 78:
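The regression equation itself appeared as an image and is lost; a plausible reconstruction under the notation above (an assumption, not a verbatim recovery):

$$\bar{A}_{2p} - \bar{A}_{1p} = \alpha + \beta \,(VA_{2p} - VA_{1p}) + u_p, \qquad p = 1, \dots, 78,$$

where $\bar{A}_{kp}$ is mean end-of-year achievement in classroom $k$ of pair $p$. If the non-experimental estimates are unbiased predictors of the experimental differences, $\beta = 1$.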

Summary of Sample Comparisons
- The experimental sample of teachers was more experienced (15 vs. … years in LAUSD).
- The pre-experimental mean and s.d. of VA_j were similar in the experimental and non-experimental samples.
- Could not reject the hypothesis of no relationship between VA_2p − VA_1p and differences in mean baseline characteristics.
- Could not reject the hypothesis of no differential attrition or teacher switching.

Why would student fixed-effect models underestimate differences in teacher value added?
- When we demean student data, we subtract off 1/T of the current teacher's effect (T = number of years of data on each student), so we underestimate the magnitude of the teacher effect by 1/T (i.e., a degrees-of-freedom correction is needed); see the sketch after this list.
- In our data, the typical student had 2 to 4 years of data, so the magnitude is biased down by 1/2 to 1/4.
- We subtract off even more of the teacher effect if some of the current teacher's effect persists into scores in future years (the FE model assumes no persistence):
  - We underestimate the magnitude by 1/T for the teacher in year T, since that teacher's effect appears only in the last year's score.
  - We underestimate the magnitude by more than 1/T for teachers in earlier years, with the downward bias largest for the first teacher. If the first teacher's effect were completely persistent, we would subtract off all of the effect and estimate no variance in first-year teacher effects.
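A short formalization of the demeaning argument, with notation assumed: suppose teacher j teaches student i only in year t and the effect does not persist into later scores. Demeaning over the student's T observations gives

$$\tilde{A}_{it} = A_{it} - \frac{1}{T}\sum_{s=1}^{T} A_{is} = \left(1 - \frac{1}{T}\right)\mu_j + (\text{other demeaned terms}),$$

so the within-student estimator recovers only the fraction $(1 - 1/T)$ of $\mu_j$; with persistence, the subtracted share grows and the attenuation worsens.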

Structural Model for Estimating Fade-out Parameter, δ
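The model on this slide appeared as an image and is lost. A plausible reconstruction, consistent with the error-component notation used later in the deck (an assumption, not the authors' exact equation):

$$A_{ijt} = \delta A_{ij,t-1} + \mu_{j(i,t)} + X_{ijt}\gamma + \varepsilon_{ijt}, \qquad 0 \le \delta \le 1,$$

under which a teacher's effect decays geometrically, with a fraction $\delta^k$ surviving after $k$ years.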

IV Strategy for Estimating the Fade-Out Parameter (δ) in Non-Experimental Data
- We can rewrite the error-component model as a regression of current achievement on lagged achievement.
- OLS estimates of δ are biased, because A_ij,t-1 is correlated with the error term.
- Use prior-year teacher dummies to instrument for A_ij,t-1.
- This assumes that prior-year teacher assignment is not correlated with the current-year error term.
- Control for teacher or classroom fixed effects to capture current teacher/classroom effects. (A simulated sketch of the strategy follows below.)
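A schematic two-stage least squares sketch of the instrumenting step, on simulated data; every variable name and magnitude here is an illustrative assumption, not a value from the study:

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_teachers = 6000, 30
prior_teacher = rng.integers(0, n_teachers, size=n)

# Lagged score = prior-year teacher effect + student-level noise.
mu_prior = rng.normal(scale=0.3, size=n_teachers)
e_prev = rng.normal(size=n)
A_prev = mu_prior[prior_teacher] + e_prev

# Current score: true persistence delta = 0.5, with an error term that is
# correlated with A_prev (through e_prev), so OLS on A_prev is biased.
delta = 0.5
A_curr = delta * A_prev - 0.3 * e_prev + rng.normal(size=n)

# Instruments: prior-year teacher dummies (one dropped to avoid collinearity).
Z = np.eye(n_teachers)[prior_teacher]
W = np.column_stack([np.ones(n), Z[:, :-1]])

# First stage: project A_prev on the instruments; second stage: regress
# A_curr on the fitted values.
fitted = W @ np.linalg.lstsq(W, A_prev, rcond=None)[0]
beta_iv = np.linalg.lstsq(np.column_stack([np.ones(n), fitted]), A_curr, rcond=None)[0]
beta_ols = np.linalg.lstsq(np.column_stack([np.ones(n), A_prev]), A_curr, rcond=None)[0]
print("OLS delta:", beta_ols[1].round(2), "IV delta:", beta_iv[1].round(2))
```

The instruments recover something close to the true δ because the teacher dummies predict the persistent component of the lagged score but, by assumption, not the current-year error.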

Joint Validity of Non-Experimental Estimates of δ and VA_j.

Potential Sources of Fade-out
- Unused knowledge becomes inoperable, and grade-specific content is not entirely relevant for future achievement (e.g., even if you have not forgotten logarithms, that knowledge may not be reflected in a calculus score).
- It takes more effort to keep students at a high performance level than at a low one.
- Students of the best teachers are mixed with students of the worst teachers in the following year, and the new teacher will focus effort on the students who are behind. (There would be no fade-out on this account if all teachers were effective.)

Is Teacher-Student Sorting Different in Los Angeles?

Summary of Main Findings
- All non-experimental specifications provided information regarding experimental outcomes, but those controlling for baseline scores yielded unbiased predictions with the highest explanatory power.
- The experimental impacts in both math and English language arts seem to fade out at an annual rate of roughly 0.5, and similar fade-out was observed non-experimentally.
- Depending on its source, fade-out has important implications for calculations of the long-term benefits of improvements in average teacher effects.

Next steps: Test for "complementarities" in teacher effects across years (e.g., what is the effect of having a high or a low value-added teacher in two consecutive years?). The current experiment won't help here, but the STAR experiment might.

Empirical Methods: 2. Generating Empirical Bayes Estimates of Non-Experimental Teacher Effects.

Why would current gains be related to prior teacher assignments?
- We find teacher effects fading out. Let VA_t denote the value added of the teacher in year t, and let a_k be the fraction of a teacher's effect remaining after k years. Then

$$A_t = VA_t + a_1 VA_{t-1} + a_2 VA_{t-2} + \dots$$

- This implies that gains include a share of prior teacher effects:

$$A_t - A_{t-1} = VA_t + (a_1 - 1)\,VA_{t-1} + (a_2 - a_1)\,VA_{t-2} + \dots$$

- Our estimate of a_1 ≈ 0.5 implies that the prior teacher's effect enters the gain with a negative sign, and that its variance would be roughly 25% of the variance of the current teacher effect (since (a_1 − 1)² = 0.25).
- Does fade-out mean the non-structural approach would be biased? Do we need to estimate the full human capital production function? That depends partly on the correlation among VA_{jt}, VA_{j,t−1}, ….

Why would current gains be related to future teacher assignments?
- Students are assigned to future teachers based on current performance (e.g., tracking and student sorting).
- This is why the unadjusted mean end-of-year score was a biased measure of teacher effects. (If differences in baseline scores were just random noise, mean student scores from the non-experimental period would have been a noisy but unbiased estimator.)
- In a value-added regression, this generates a relationship between future teacher assignment (in t+1) and the current end-of-year score (in t); that is, future teacher assignments are endogenous to current-year gains.
- We would therefore expect future teacher assignments to be related to current gains, as Rothstein (2007) reports. (A simulation sketch follows below.)
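A small simulation of the sorting mechanism described above; all names and parameters are illustrative assumptions. With tracking on current scores, next year's teacher dummies "predict" current gains even though future teachers have no causal effect on them:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
baseline = rng.normal(size=n)
gain = rng.normal(scale=0.5, size=n)   # current-year gain; no future-teacher effect
score = baseline + gain                # end-of-year score

# Tracking: assign next year's teacher by quartile of the current score.
future_teacher = np.digitize(score, np.quantile(score, [0.25, 0.5, 0.75]))

# Mean current gain by future-teacher group: a clear gradient appears
# despite zero causal effect of the future teacher.
for t in range(4):
    print("future teacher", t, "mean current gain:",
          round(gain[future_teacher == t].mean(), 3))
```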

What is the variance in teacher effects on student achievement?

Non-experimental studies:
- Armour (1971), Hanushek (1976), McCaffrey et al. (2004), Murnane and Phillips (1981), Rockoff (2004), Hanushek, Rivkin and Kain (2005), Jacob and Lefgren (2005), Aaronson, Barrow and Sander (2007), Kane, Rockoff and Staiger (2006), Gordon, Kane and Staiger (2006).
- Standard deviation of the teacher effect estimated at .10 to .25 student-level standard deviations.

Experimental study (TN class-size experiment):
- Nye, Konstantopoulos and Hedges (2004): teachers and students were randomly assigned to classes of various sizes in grades K through 3; examined teacher effects net of class-size-category and school effects.
- Standard deviation of the teacher effect estimated at .08 to .11 student-level standard deviations, and even higher (.10 to .18) in low-SES schools.

Interpretation of the Coefficient on Lagged Student Performance
- We estimate several non-experimental specifications, with β₀ = 0 (no controls), β₀ = 1 ("gains"), and β₀ < 1 ("quasi-gains"), and ask: Which yields unbiased estimates of teacher effects (μ_j)? Which minimizes the mean squared error in predicting student outcomes?
- We place no structural interpretation on β₀. It presumably bundles several forces: (i) systematic selection of students to teachers, (ii) fade-out of prior educational inputs, and (iii) measurement error.
- These separate roles are difficult to identify, and the biases they introduce may or may not be offsetting.