Measuring Teacher Effectiveness: Challenges and Opportunities

Presentation transcript:

Measuring Teacher Effectiveness: Challenges and Opportunities
Laura Goe, Ph.D., Research Scientist, ETS, and Principal Investigator for the National Comprehensive Center for Teacher Quality
National Association of Latino Elected and Appointed Officials (NALEO) Education Fund, NALEO Audio Conference, April 10, 2012

Laura Goe, Ph.D.
- Former teacher in rural & urban schools: special education (7th & 8th grade, Tunica, MS) and language arts (7th grade, Memphis, TN)
- Graduate of UC Berkeley’s Policy, Organizations, Measurement & Evaluation doctoral program
- Principal Investigator for the National Comprehensive Center for Teacher Quality
- Research Scientist in the Performance Research Group at ETS

The National Comprehensive Center for Teacher Quality
A federally funded partnership whose mission is to help states carry out the teacher quality mandates of ESEA. Partners:
- Vanderbilt University
- Learning Point Associates, an affiliate of American Institutes for Research
- Educational Testing Service

Today’s presentation available online
To download a copy of this presentation, go to www.lauragoe.com, open the Publications and Presentations page, and look at the bottom of the page.

The goal of teacher evaluation
The ultimate goal of all teacher evaluation should be… TO IMPROVE TEACHING AND LEARNING

Questions to be considered
- What is teacher effectiveness and why should we measure it?
- How do you measure teacher effectiveness?
- What are strengths and cautions to keep in mind when using these measures?

Differentiating among teachers
“It is nearly impossible to discover and act on performance differences among teachers when documented records show them all to be the same.” (Glazerman et al., 2011, p. 1)

Trends in teacher evaluation
- The policy imperative to change teacher evaluation has outstripped the research: though we don’t yet know which model and combination of measures will identify effective teachers, many states and districts feel compelled to move forward at a rapid pace.
- Inclusion of student achievement growth data represents an important “culture shift” in evaluation; communication and teacher/administrator participation and buy-in are crucial to ensure change.
- The implementation challenges are considerable:
  - Few proven models exist for states and districts to adopt or adapt.
  - Many districts have limited capacity to implement comprehensive systems, and states have limited resources to help them.

It’s an equity issue
- Value-added research shows that teachers vary greatly in their contributions to student achievement (Rivkin, Hanushek, & Kain, 2005).
- The Widget Effect report (Weisberg et al., 2009) found that 90% of teachers were rated “good” or better in districts where students were failing at high levels.

A simple definition of teacher effectiveness
Anderson (1991) stated that “… an effective teacher is one who quite consistently achieves goals which either directly or indirectly focus on the learning of their students” (p. 18).

Race to the Top definition of effective & highly effective teacher
- Effective teacher: students achieve acceptable rates (e.g., at least one grade level in an academic year) of student growth (as defined in this notice). States, LEAs, or schools must include multiple measures, provided that teacher effectiveness is evaluated, in significant part, by student growth (as defined in this notice). Supplemental measures may include, for example, multiple observation-based assessments of teacher performance. (p. 7)
- Highly effective teacher: students achieve high rates (e.g., one and one-half grade levels in an academic year) of student growth (as defined in this notice).

Race to the Top definition of student growth
Student growth means the change in student achievement (as defined in this notice) for an individual student between two or more points in time. A State may also include other measures that are rigorous and comparable across classrooms. (p. 11)

Goe, Bell, & Little (2008) definition of teacher effectiveness
Effective teachers:
- Have high expectations for all students and help students learn, as measured by value-added or alternative measures.
- Contribute to positive academic, attitudinal, and social outcomes for students, such as regular attendance, on-time promotion to the next grade, on-time graduation, self-efficacy, and cooperative behavior.
- Use diverse resources to plan and structure engaging learning opportunities; monitor student progress formatively, adapting instruction as needed; and evaluate learning using multiple sources of evidence.
- Contribute to the development of classrooms and schools that value diversity and civic-mindedness.
- Collaborate with other teachers, administrators, parents, and education professionals to ensure student success, particularly the success of students with special needs and those at high risk for failure.

Measures and models: Definitions
- Measures are the instruments, assessments, protocols, rubrics, and tools used to determine teacher effectiveness.
- Models are the state or district systems of teacher evaluation, including all of the inputs and decision points (measures, instruments, processes, training, scoring, etc.) that result in determinations about individual teachers’ effectiveness.

Multiple measures of teacher effectiveness
- Evidence of growth in student learning and competency:
  - Standardized tests; pre/post tests in untested subjects
  - Student performance (art, music, etc.)
  - Curriculum-based tests given in a standardized manner
  - Classroom-based tests such as DIBELS
- Evidence of instructional quality:
  - Classroom observations
  - Lesson plans, assignments, and student work
  - Student surveys such as Harvard’s Tripod
  - Electronic portfolios/evidence binders
- Evidence of professional responsibility:
  - Administrator/supervisor reports, parent surveys
  - Teacher reflection and self-reports, records of contributions

Teacher observations: strengths and weaknesses
Strengths:
- Great for teacher professional growth:
  - If observation is followed by an opportunity to discuss results
  - If support is provided for those who need it
- Helps the evaluator (principal or others) understand teachers’ needs across a school or district
Weaknesses:
- Essential to have alignment between teaching standards and the observation instrument
- Resource intensive (personnel time, training, calibrating)
- Validity of observation results may vary with who is doing them, depending on how well trained and calibrated they are

Example: University of Virginia’s CLASS observation tool
The CLASS is organized according to three broad areas or domains of classroom quality: Emotional Support, Classroom Organization, and Instructional Support. Within each domain, there are multiple dimensions that contribute to the overall measured quality (see page 2 in the CLASS manual).
Pre-K and K-3 dimensions:
- Emotional Support: Positive Climate, Negative Climate, Teacher Sensitivity, Regard for Student (Adolescent) Perspectives
- Classroom Organization: Behavior Management, Productivity, Instructional Learning Formats
- Instructional Support: Concept Development, Quality of Feedback, Language Modeling
Upper Elementary/Secondary adds: Content Understanding, Analysis and Problem Solving

Example: Charlotte Danielson’s Framework for Teaching
- Domain 1: Planning and Preparation includes comprehensive understanding of the content to be taught, knowledge of the students’ backgrounds, and designing instruction and assessment.
- Domain 2: The Classroom Environment addresses the teacher’s skill in establishing an environment conducive to learning, including both the physical and interpersonal aspects of the environment.
- Domain 3: Instruction is concerned with the teacher’s skill in engaging students in learning the content, and includes the wide range of instructional strategies that enable students to learn.
- Domain 4: Professional Responsibilities addresses a teacher’s additional professional responsibilities, including self-assessment and reflection, communication with parents, participating in ongoing professional development, and contributing to the school and district environment.

Validity of classroom observations is highly dependent on training
- A teacher should get the same score no matter who observes him or her; this requires that all observers be trained on the instruments and processes.
- Occasional “calibrating” should be done, and more often if there are discrepancies or new observers.
- Who the evaluators are matters less than the fact that they are trained to recognize evidence and score it consistently.
- Teachers should also be trained on the observation forms and processes so they can participate actively and fully in the process.

Risk management vs. one-size-fits-all in teacher observations
- Conducting high-quality observations is a resource-intensive process.
- A more efficient use of resources is for teachers who have not yet demonstrated competence to be on a more intensive observation schedule:
  - New teachers
  - Teachers who have changed teaching assignments or schools
- Other measures are less resource intensive and can be used routinely (surveys, student outcomes, portfolios).

Reliability results when using different combinations of raters and lessons
[Figure 2 from Hill et al., 2012 (see references list): Errors and Imprecision: the reliability of different combinations of raters and lessons. Used with permission of the author.]
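The general intuition behind this figure can be sketched numerically. Assuming (hypothetically) that one rater scoring one lesson yields some modest reliability, the classical Spearman-Brown formula estimates how reliability rises when scores are averaged across more independent rater-lesson combinations. This is only an illustration with made-up values; it is not the generalizability analysis Hill et al. actually performed.

```python
# Illustrative sketch: how the reliability of an averaged observation score
# grows with the number of independent rater-lesson combinations.
# Uses the Spearman-Brown prophecy formula; the single-observation
# reliability below is hypothetical, not taken from Hill et al. (2012).

def spearman_brown(single_obs_reliability: float, n_observations: int) -> float:
    """Reliability of the mean of n parallel observations."""
    r = single_obs_reliability
    return (n_observations * r) / (1 + (n_observations - 1) * r)

if __name__ == "__main__":
    r_single = 0.45  # hypothetical reliability of one rater scoring one lesson
    for n in (1, 2, 4, 6):
        print(f"{n} rater-lesson combination(s): reliability ~ {spearman_brown(r_single, n):.2f}")
```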

Formal vs. informal observations
Formal observations are likely to:
- Be announced and scheduled in advance according to a pre-determined yearly schedule
- Include pre- and post-conferences with review of lesson plans and artifacts
- Last an entire class period
- Result in a set of scores on multiple indicators
Informal observations are likely to:
- Be unannounced, drop-in visits
- Last less than an entire class period
- Result in informal verbal or written feedback to the teacher, perhaps on only one indicator

Questions to ask about observations
- How many observations per year? Should the number vary by new vs. experienced teachers? By demonstrated competence? Should there be a combination of formal and informal observations?
- Who should conduct the observations? Will multiple observers be required?
- How will observers be trained? Workshops? Online (video-based)? Will they need to be certified?

Value-added models
- Many variations on value-added models exist.
- TVAAS (Sanders’ original model) typically uses 3+ years of prior test scores to predict the next score for a student. It has been used since the 1990s for teachers in Tennessee, but not for high-stakes evaluation purposes.
- Most states and districts that currently use VAMs use the Sanders model, also called EVAAS.
- There are other models that use less student data to make predictions, with considerable variation in the “controls” used.
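To make the general idea concrete (this is not the TVAAS/EVAAS model, which is far more elaborate), a bare-bones value-added estimate predicts each student's current score from prior scores and attributes the average "surprise" (residual) to the teacher. The sketch below uses a simple least-squares regression on hypothetical data.

```python
# Minimal value-added sketch (hypothetical data; not the TVAAS/EVAAS model).
# Step 1: predict current scores from prior-year scores with a simple regression.
# Step 2: a teacher's "value-added" is the mean residual of his or her students.
import numpy as np

# Hypothetical records: (prior_score, current_score, teacher_id)
records = [
    (48, 55, "A"), (52, 61, "A"), (60, 66, "A"),
    (47, 49, "B"), (55, 56, "B"), (63, 62, "B"),
]
prior = np.array([r[0] for r in records], dtype=float)
current = np.array([r[1] for r in records], dtype=float)
teachers = [r[2] for r in records]

# Fit current = b0 + b1 * prior by least squares.
X = np.column_stack([np.ones_like(prior), prior])
b0, b1 = np.linalg.lstsq(X, current, rcond=None)[0]
residuals = current - (b0 + b1 * prior)

# Average residual per teacher = crude value-added estimate.
for t in sorted(set(teachers)):
    mask = np.array([tid == t for tid in teachers])
    print(f"Teacher {t}: value-added ~ {residuals[mask].mean():+.1f} points")
```

Real models add multiple prior years, student and classroom controls, and shrinkage toward the mean, which is part of why their results require careful interpretation.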

Growth vs. Proficiency Models
[Figure: student achievement plotted from the start of the school year to the end of the year, with the proficiency line marked. Teacher A is a “success” on achievement levels and Teacher B a “failure,” yet in terms of growth, Teachers A and B are performing equally.] Slide courtesy of Doug Harris, Ph.D., University of Wisconsin-Madison

Growth vs. Proficiency Models (2)
[Figure: the same start-of-year to end-of-year achievement plot. A teacher with low-proficiency students can still be high in terms of GROWTH (and vice versa).] Slide courtesy of Doug Harris, Ph.D., University of Wisconsin-Madison
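A small worked example with hypothetical scores makes the point concrete: judged on end-of-year proficiency alone, Teacher A looks stronger, but Teacher B, whose students started far below the proficiency line, actually produced more growth.

```python
# Hypothetical illustration of proficiency vs. growth (made-up scores).
PROFICIENCY_CUT = 60

classes = {
    "Teacher A": [(62, 70), (65, 73), (70, 78)],  # (fall score, spring score)
    "Teacher B": [(38, 50), (42, 54), (45, 57)],
}

for teacher, scores in classes.items():
    growth = sum(spring - fall for fall, spring in scores) / len(scores)
    proficient = sum(spring >= PROFICIENCY_CUT for _, spring in scores) / len(scores)
    print(f"{teacher}: {proficient:.0%} proficient in spring, average growth {growth:+.0f} points")
# Teacher A: 100% proficient, +8 points of growth; Teacher B: 0% proficient, +12 points of growth.
```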

Colorado Growth Model
- Focuses on “growth to proficiency”
- Measures students against “academic peers”
- Also called a criterion‐referenced growth‐to‐standard model
- The student growth percentile is “descriptive,” whereas value-added seeks to determine the contribution of a school or teacher to student achievement (Betebenner, 2008)

Colorado Growth Model (figure)
Slide courtesy of Damian Betebenner at www.nciea.org
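A student growth percentile, as described above, is descriptive: it ranks a student's current score against the scores of "academic peers" who started from a similar place. The sketch below uses hypothetical scores and a deliberately crude definition of academic peers (prior scores within a few points); operational models such as Betebenner's use quantile regression rather than simple binning.

```python
# Crude student-growth-percentile sketch (hypothetical data).
# "Academic peers" are approximated as students whose prior score is within
# +/- 3 points; real SGP models (Betebenner) use quantile regression instead.
from bisect import bisect_left

students = [  # (student_id, prior_score, current_score)
    ("s1", 50, 58), ("s2", 51, 52), ("s3", 49, 61),
    ("s4", 52, 55), ("s5", 48, 50), ("s6", 50, 57),
]

def growth_percentile(student_id: str, band: int = 3) -> float:
    _, prior, current = next(s for s in students if s[0] == student_id)
    peers = sorted(c for sid, p, c in students
                   if sid != student_id and abs(p - prior) <= band)
    # Percentile = share of academic peers the student outscored this year.
    return 100 * bisect_left(peers, current) / len(peers)

print(f"s1 growth percentile ~ {growth_percentile('s1'):.0f}")  # prints 80 for this toy data
```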

What value-added and growth models cannot tell you
- Value-added and growth models are really measuring classroom, not teacher, effects.
- Value-added models can’t tell you why a particular teacher’s students are scoring higher than expected: maybe the teacher is focusing instruction narrowly on test content, or maybe the teacher is offering a rich, engaging curriculum that fosters deep student learning.
- How the teacher is achieving results matters!

Recommendation from NBPTS Task Force (Linn et al., 2011)
Recommendation 2: Employ measures of student learning explicitly aligned with the elements of curriculum for which the teachers are responsible. This recommendation emphasizes the importance of ensuring that teachers are evaluated for what they are teaching.

School-wide VAM illustration

Measuring teachers’ contributions to student learning growth: A summary of current models
- Student learning objectives: Teachers assess students at the beginning of the year and set objectives, then assess again at the end of the year; the principal or a designee works with the teacher and determines success.
- Subject- and grade-alike team models (“Ask a Teacher”): Teachers meet in grade-specific and/or subject-specific teams to consider and agree on appropriate measures that they will all use to determine their individual contributions to student learning growth.
- Content Collaboratives: External content experts identify measures, and groups of content teachers consider the measures from the perspective of classroom use; may not include pre- and post-measures.
- Pre- and post-tests model: Identify or create pre- and post-tests for every grade and subject.
- School-wide value-added: Teachers in tested subjects and grades receive their own value-added score; all other teachers get the school-wide average.

Tripod Survey domains
Harvard’s Tripod Survey – the 7 C’s:
- Caring about students (nurturing productive relationships)
- Controlling behavior (promoting cooperation and peer support)
- Clarifying ideas and lessons (making success seem feasible)
- Challenging students to work hard and think hard (pressing for effort and rigor)
- Captivating students (making learning interesting and relevant)
- Conferring (eliciting students’ feedback and respecting their ideas)
- Consolidating (connecting and integrating ideas to support learning)

Tripod Survey results
- Control is the strongest correlate of value-added gains; however, it is important to keep in mind that a good teacher achieves control by being good on the other dimensions.
- English & Spanish, paper or online versions at three levels: K-2, 3-5, 6-12.
- For more info: http://www.tripodproject.org/index.php/index/

Why you should keep (and provide support to) the less effective teachers
- With the right instructional strategies and guidance, motivated teachers can improve practice and student outcomes.
- The teachers you hire to replace your less effective teachers are not necessarily going to be more effective:
  - You may not be able to find better replacements!
  - You may not be able to find any replacements!
  - The replacements you find may not stay.

Measures that help teachers grow
- Measures which include protocols and processes that teachers can examine and comprehend
- Measures that are directly and explicitly aligned with teaching standards
- Measures that motivate teachers to examine their own practice against specific standards
- Measures that allow teachers to participate in or co-construct the evaluation (such as portfolios)
- Measures that give teachers opportunities to discuss the results for formative purposes with evaluators, administrators, teacher learning communities, mentors, coaches, etc.
- Measures that are aligned with and used to inform professional growth and development offerings

Evaluating Teacher Preparation Programs (TPPs)
1. Evaluate teacher performance (including student outcomes).
2. Use results as a measure of TPP success (for evaluation purposes).
3. Use results to improve TPP curriculum and instruction.
4. K-12 teaching and learning improves as a result of changes made by TPPs.

Meeting the “standards”
- It’s possible to meet accreditation standards (NCATE, TEAC) and still not prepare fully effective teachers.
- If TPPs are not adequately preparing teachers for the contexts and communities they will serve, those teachers’ effectiveness may be hampered.

Final thoughts
The limitations:
- There are no perfect measures.
- There are no perfect models.
- Changing the culture of evaluation is hard work.
The opportunities:
- Evidence can be used to trigger support for struggling teachers and acknowledge effective ones.
- Multiple sources of evidence can provide powerful information to improve teaching and learning.
- Evidence is more valid than “judgment” and provides better information for teachers to improve practice.

References
Anderson, L. (1991). Increasing teacher effectiveness. Paris: UNESCO, International Institute for Educational Planning.
Glazerman, S., Goldhaber, D., et al. (2011). Passing muster: Evaluating evaluation systems. Washington, DC: Brown Center on Education Policy at Brookings. http://www.brookings.edu/reports/2010/1117_evaluating_teachers.aspx
Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality. http://www.tqsource.org/publications/teacherEffectiveness.php
Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56-64.
Linn, R., Bond, L., Darling-Hammond, L., Harris, D., Hess, F., & Shulman, L. (2011). Student learning, student achievement: How do teachers measure up? Arlington, VA: National Board for Professional Teaching Standards. http://www.nbpts.org/index.cfm?t=downloader.cfm&id=1305
Race to the Top application. http://www2.ed.gov/programs/racetothetop/resources.html
Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417-458. http://www.econ.ucsb.edu/~jon/Econ230C/HanushekRivkin.pdf
Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. Brooklyn, NY: The New Teacher Project. http://widgeteffect.org/downloads/TheWidgetEffect.pdf

Questions?

Laura Goe, Ph.D.
609-619-1648
lgoe@ets.org
www.lauragoe.com
https://twitter.com/GoeLaura

National Comprehensive Center for Teacher Quality
1000 Thomas Jefferson Street, NW
Washington, D.C. 20007
www.tqsource.org