
MDE / OEAA 1 Growing Pains: The State of the Art in Value-Added Modeling. Presentation on March 2, 2005, to the Michigan School Testing Conference by Joseph A. Martineau, Psychometrician, Office of Educational Assessment & Accountability, Michigan Department of Education

MDE / OEAA 2 Why Value Added?
Value Added measures of achievement are being discussed as a possible addition to the regulations of No Child Left Behind (NCLB).
– Various ways of implementing Value Added in NCLB are possible
– One likely implementation of Value Added is as another way to make safe harbor if the percent-proficiency targets are not met

MDE / OEAA 3 What is Value Added?
– In accountability, Value Added is a term that describes the part of achievement (or change in achievement) that is attributable to the effectiveness of a unit (teacher or school)
– Positive estimates indicate units that are above average; negative estimates indicate units that are below average
– Defining what is attributable to the effectiveness of a unit is a matter of philosophical debate

MDE / OEAA 4 The Logic of Value Added
Holding educators accountable for student performance has many pitfalls:
– Educators cannot control their students' incoming achievement
– Educators cannot control the effectiveness of their students' previous teachers/schools
– Educators cannot control the effects of non-instructional student characteristics such as…
  – Poverty
  – Parental education
  – Mobility
  – Home environment
  – Etcetera…

MDE / OEAA 5 The Logic of Value Added, Continued…
Value Added Models (VAM) attempt to obtain pure estimates of the contribution of educators to student achievement and/or growth in achievement
– The promise of VAM is that educators are held accountable only for their impact on student learning
– The idea is not rocket science (Sanders), but the implementation is (Reckase)

MDE / OEAA 6 The Idea Is Not Rocket Science
For each school…
1. Estimate the expected average achievement or gain score
2. Calculate the observed average achievement or gain score
3. Subtract the expected from the observed average score
4. Define the resulting difference between expected and observed scores as the value added by the school
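A minimal numeric sketch of these four steps, using invented scores and pandas. The column names and the "statewide average" definition of the expected gain are illustrative assumptions, not the presenter's method:

```python
import pandas as pd

# Hypothetical data: one row per student, with prior/current scores
# and the school attended.
scores = pd.DataFrame({
    "school":  ["A", "A", "B", "B"],
    "prior":   [400, 420, 390, 410],
    "current": [430, 445, 405, 450],
})
scores["gain"] = scores["current"] - scores["prior"]

# Step 1: expected average gain (here, simply the statewide average).
expected_gain = scores["gain"].mean()

# Step 2: observed average gain per school.
observed_gain = scores.groupby("school")["gain"].mean()

# Steps 3-4: the difference is the school's "value added".
value_added = observed_gain - expected_gain
print(value_added)
```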

MDE / OEAA 7 The Idea Is Not Rocket Science Adjusting Achievement Targets to be More Fair to Educators

MDE / OEAA 8 The Idea Is Not Rocket Science Adjusting Gain Targets to be More Fair to Educators (Tennessee Model)

MDE / OEAA 9 The Idea Is Not Rocket Science Adjusting Gain Targets to be More Fair to Educators (Dallas Model)

MDE / OEAA 10 The Idea Is Getting Closer to Rocket Science Adjusting Yearly Gain Targets to Meet a Final Achievement Goal (Thum Model)
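The idea sketched in this slide's title can be stated as simple arithmetic: spread the remaining distance to the final achievement goal evenly over the remaining years, so that lower-scoring students get larger yearly targets. A minimal sketch with invented numbers (my paraphrase of the concept, not Thum's actual model):

```python
def yearly_gain_target(current_score: float, goal_score: float,
                       years_remaining: int) -> float:
    """Gain required each year to reach goal_score on schedule."""
    return (goal_score - current_score) / years_remaining

# A student 60 scale-score points below the goal with 4 years remaining
# needs 15 points per year; students further behind get larger targets.
print(yearly_gain_target(current_score=440, goal_score=500,
                         years_remaining=4))
```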

MDE / OEAA 11 The Implementation IS Rocket Science
In a growth-based VAM, for each school you must…
1. Specify a mixed model (a sophisticated statistical procedure that accounts for the structure of data coming from multiple occasions for each student, and multiple students per unit)
2. Estimate an overall average gain for each school year, and for the entire set of students and schools
3. Estimate a unique expected average gain for each school year and school
4. Estimate the difference between the school's actual average trajectory and the expected average trajectory for each school year and school
5. Keep track of previous schools' effects so that they don't get counted toward later schools
6. Estimate a unique expected gain for each school year, student, and school
7. Estimate the difference between the expected gain and the actual gain for each school year, student, and school
8. Keep track of all differences across years so that a student's high growth in one year is not counted toward all subsequent years
9. Estimate all of these expected and actual gains together so that they are unbiased and reliable
10. Do all of this using a sparse data matrix, which causes ordinary software to choke
11. So, you write your own software and develop new applications of statistical theory to make your idea work
12. Communicate the results in an understandable fashion to stakeholders
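Steps 1–3 can be illustrated with an off-the-shelf mixed model. The sketch below is far simpler than the operational software the slide describes: it models only the school level, on simulated data, using statsmodels; all names and numbers are invented:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for school in range(20):
    school_effect = rng.normal(0, 2)               # true "value added"
    for _ in range(30):                            # students per school
        for year in range(3):                      # yearly gain scores
            rows.append({"school": school, "year": year,
                         "gain": 10 + school_effect + rng.normal(0, 5)})
data = pd.DataFrame(rows)

# Fixed effects: the overall average gain and its trend across years.
# Random intercept per school: the school's departure from that average,
# which serves as a simple value-added estimate.
model = smf.mixedlm("gain ~ year", data, groups=data["school"])
result = model.fit()
for school, effect in result.random_effects.items():
    print(school, round(float(effect.iloc[0]), 2))
```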

MDE / OEAA 12 The Problem with Rocket Science
As with rocket science, many small things can cause large distortions in the results of VAM, including
– Small problems with the scales of measurement
– Small programming errors
– Small errors in assumptions needed for the statistical models to work appropriately

MDE / OEAA 13 Statistical Issues in VAM
– 50 years ago, researchers despaired of ever being able to measure growth validly, because the statistical issues seemed insurmountable
– Most of the statistical issues have been solved by the introduction of statistical mixed models

MDE / OEAA 14 Statistical Issues in VAM, Continued…
For VAM, one very significant statistical issue remains
– The parts of the statistical models that produce estimates of Value Added were originally included in statistical models to account for sources of error, so that other effects were easier to identify. Therefore…
  – Estimates of Value Added can also be classified as error terms
  – Estimates of Value Added are technically the portion of achievement or gains that cannot be explained by anything else included in the model
  – In effect, the implementation of a Value-Added Model says, "whatever portion of achievement and/or growth we do not know how to explain is to be attributed to schools"
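The point that value added is "whatever is left over" can be made concrete. In the sketch below (simulated data and a deliberately simple model, not any state's actual VAM), the value-added estimate is literally the school-average residual from a regression of current on prior scores:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "school": rng.integers(0, 10, 500),
    "prior": rng.normal(450, 30, 500),
})
df["current"] = 50 + 0.9 * df["prior"] + rng.normal(0, 10, 500)

# Fit current ~ prior by least squares.
X = np.column_stack([np.ones(len(df)), df["prior"]])
beta, *_ = np.linalg.lstsq(X, df["current"], rcond=None)
df["residual"] = df["current"] - X @ beta

# Whatever the model cannot explain, averaged within each school, is
# attributed to the school -- the sense in which value added is
# technically an error term.
print(df.groupby("school")["residual"].mean())
```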

MDE / OEAA 15 Statistical Issues in VAM, Continued…
Attributing to schools all achievement/gains that cannot be explained any other way raises philosophical, ethical, and political considerations:
– Do we have to remove differences explained by ethnicity before we can attribute the rest to schools?
– Do we have to remove differences explained by poverty before we can attribute the rest to schools?
– Etcetera…
– Is it possible ever to satisfy the majority of stakeholders that what's left over is pure enough to hold schools accountable for?
No matter how we answer these questions, each answer raises additional philosophical, ethical, and political concerns.

MDE / OEAA 16 Ethical Issues in VAM
In VAMs as currently implemented, the focus lies squarely on being fair to educators
– In TN and OH…
  – All educators are expected to produce the same average gains in their students
  – The achievement gap is expected to remain as it was, because educators of lower-achieving groups of students are not expected to help their students catch up
– In Dallas…
  – All educators are expected to produce gains in their students that are equivalent to the average gains achieved by similar groups of students
  – The achievement gap may be expected to widen, because lower-performing groups of students may achieve lower average gains than other groups of students
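The Dallas-style expectation ("gains equivalent to similar groups of students") can be sketched in a few lines. The data, group labels, and grouping rule below are all invented for illustration; the actual Dallas indices are far more elaborate:

```python
import pandas as pd

students = pd.DataFrame({
    "school": ["A", "A", "B", "B"],
    "group":  ["low_prior", "high_prior", "low_prior", "high_prior"],
    "gain":   [8, 14, 11, 13],
})

# Expected gain = average gain of the student's "similar" group.
students["expected"] = students.groupby("group")["gain"].transform("mean")

# School value added = mean of (observed - expected) over its students.
students["diff"] = students["gain"] - students["expected"]
print(students.groupby("school")["diff"].mean())
```

Note how this construction bakes in the ethical issue above: if a group's average gain is low, low gains for that group count as meeting expectations.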

MDE / OEAA 17 Ethical Issues in VAM, Continued…
Where does VAM take into account fairness for low-performing students?
– Currently implemented VAMs say, basically, "I need to see one year's growth for one year of instruction," where (as in the Dallas model) one year's worth of growth can be less for some groups of students than for others
– Because of concerns about being fair to educators, groups of students that start out behind are left behind by the same amount (or even more)
– The Thum model is a compromise that expects a modest amount more of educators serving low-achieving students, so that the gap is closed over many grades
  – Not really a VAM
  – A mixture of status and growth

MDE / OEAA 18 Political Issues in VAM
Complexity
– Rocket science is a political liability
– As more of the statistical and ethical issues of VAM are addressed, VAMs are likely to become even more inaccessible to the lay audience
– VAM requires an extraordinary amount of trust in those who implement the system
Ethical issues will be decided by a political process that does not necessarily account for the best interests of students and educators, e.g.…
– Dallas: focus on the best interests of educators at the possible price of increasing achievement gaps
– TN, OH: focus on the best interests of educators at the possible price of leaving achievement gaps as they are
– Thum: focus on the best interests of low-performing groups at the possible expense of (1) high-performing groups of students, and (2) making low-achieving schools less attractive to qualified teachers
– The state of the art in VAM is incapable of providing both high achievement for all students and fairness in evaluating educators of lower-performing students

MDE / OEAA 19 Measurement Issues in VAM
With most of the statistical issues in VAM solved, the measurement issues have been forgotten in the excitement

MDE / OEAA 20 Measurement Issues in VAM, Continued…
VAM assumes that the same thing is being measured at every grade level of the test
– This presents a dilemma
  – In order to measure validly, we have to measure what is being taught, which changes over grade levels
  – In order to calculate growth, gains, and value added, we have to measure the same thing every time we measure
– Value-Added Models are being applied to "construct-shifting" scales as if the scales were interval-level measures of student achievement on unchanging content

MDE / OEAA 21 Cautions in Using Vertical Scales
– Scholars have been warning against the use of construct-shifting scales to measure growth for 50+ years
– However, the use of vertical scales in growth models has become increasingly prevalent in the scholarly literature with the advent of recent statistical developments (HLM and SEM)
– So am I just straining at gnats?
  – Can't I just use vertical scales to measure growth?
  – What harm can it do?
  – How big is the effect of changing content on growth models and growth-based value-added models?

MDE / OEAA 22 Hypothetical Example
A vertically scaled mathematics test
– Grades 3–8
– Composed of only two constructs
  – Basic Computation (BC)
  – Problem Solving (PS)
– BC is heavily represented in early grades; PS is heavily represented in later grades
– Only the single, combined math score is available (BC and PS are just in the background)
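The hypothetical can be simulated in a few lines. All weights and growth rates below are invented; the sketch only shows how a composite score whose BC/PS mix shifts across grades produces apparent "gains" that blend real growth with changes in what is being measured:

```python
import numpy as np

grades = np.arange(3, 9)
bc_weight = np.linspace(0.8, 0.2, len(grades))  # BC heavy early...
ps_weight = 1 - bc_weight                       # ...PS heavy late

bc_true = 300 + 20 * (grades - 3)   # steady BC growth
ps_true = 280 + 35 * (grades - 3)   # faster PS growth

composite = bc_weight * bc_true + ps_weight * ps_true
# Apparent yearly gains mix true growth with the construct shift.
print(np.diff(composite))
```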

MDE / OEAA 23 Hypothetical Example

MDE / OEAA 24 Hypothetical Example

MDE / OEAA 25 Hypothetical Example

MDE / OEAA 26 The Effects of Construct Shift
Construct shift affects the estimation of educational effectiveness (the results of Value-Added Models). Specifically, it
– Does not accurately identify effectiveness if student achievement is outside the range measured well by the grade-level test
– Attributes effectiveness of prior teachers/schools to current teachers/schools (violating the promise of Value-Added Models)


MDE / OEAA 28 Reliability
– Reliability is the ratio of construct-related variance to total variance (construct-related plus non-construct-related variance)
– Extend this to Value-Added Models: the ratio of variance in true value added to total variance (true value-added variance plus variance of distortions)
– How important is this distortion, especially when the constructs are correlated?
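Restating the two ratios above in symbols (the notation is mine, not from the deck):

```latex
\rho_{\text{test}} =
  \frac{\sigma^2_{\text{construct}}}
       {\sigma^2_{\text{construct}} + \sigma^2_{\text{non-construct}}},
\qquad
\rho_{\text{VAM}} =
  \frac{\sigma^2_{\text{true VA}}}
       {\sigma^2_{\text{true VA}} + \sigma^2_{\text{distortion}}}
```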

MDE / OEAA 29 Reliability
Martineau (in press) derived an upper bound on the reliability of VAM, which is
– Affected by content balance (more balanced means lower reliability)
– Affected by correlation in value added (higher correlation means higher reliability)
– Affected by grade level (later grades have lower reliability)
– Affected by magnitude of changes in content across grades (larger changes mean lower reliability)

MDE / OEAA 30 Reliability of VAM Results

MDE / OEAA 31 Reliability
– Only in extraordinary circumstances are the results reliable enough for high-stakes use
– For research use, the results may be reliable enough in some limited circumstances

MDE / OEAA 32 Alleviating Low Reliability of Value-Added Analyses
Twice-a-year testing
– Not politically viable
– Completely eliminates the low-reliability problem
Once-yearly testing with a new equating design
– Embed the entire set of below-grade items on the current-grade test by including a small portion of the set on each of multiple test forms
– Calibrate a separate vertical scale for each adjacent pair of grades (e.g., 3/4, 4/5, 5/6…)
– Concurrent calibration of grade 3 and 4 items together, 4 and 5 items together, 5 and 6 items together…
– Should markedly reduce the amount of construct shift and increase reliability to an acceptable degree
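The pairwise design can be sketched structurally. Everything here is a placeholder: concurrent_calibrate stands in for a real concurrent IRT calibration routine and responses_by_grade for actual item-response data; only the adjacent-grade pairing (3/4, 4/5, 5/6, …) is taken from the slide:

```python
def concurrent_calibrate(lower_responses, upper_responses):
    """Hypothetical stand-in for concurrently calibrating two adjacent
    grades' item responses onto one shared vertical scale."""
    return {"n_items": len(lower_responses) + len(upper_responses)}

# Placeholder item-response data, one list per grade (grades 3-8).
responses_by_grade = {grade: [] for grade in range(3, 9)}

# One vertical scale per adjacent grade pair: each scale spans only two
# grades, which limits how far the measured construct can shift within
# any single scale.
pairwise_scales = {
    (g, g + 1): concurrent_calibrate(responses_by_grade[g],
                                     responses_by_grade[g + 1])
    for g in range(3, 8)
}
```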

MDE / OEAA 33 Contact Information
Joseph Martineau
Office of Educational Assessment & Accountability
Michigan Department of Education
P.O. Box
Lansing, MI
(517)