Formative evaluation of teaching performance Dylan Wiliam INEE seminar, Mexico City, 5 December 2013


Outline
1. Education matters, for individuals and society
2. Teaching quality is the crucial variable
3. Teaching quality is not the same as teacher quality
4. Predicting who will be good teachers is almost impossible
5. Evaluating teacher quality is inherently difficult
6. Professional development is the key to teacher quality
7. Feedback is more complicated than generally assumed
8. Formative evaluation of teaching performance
9. Strategies for formative evaluation
10. Validity of formative evaluation of teaching
11. Implementing formative evaluation of teaching

Education matters: for individuals and society 3

What is the purpose of education? 4
- Four main philosophies of education:
  - Personal empowerment
  - Cultural transmission
  - Preparation for citizenship
  - Preparation for work
- All are important
- Any education system is a (sometimes uneasy) compromise between these four forces

Raising achievement matters 5
- For individuals:
  - Increased lifetime earnings
  - Improved health
  - Longer life
- For society:
  - Lower criminal justice costs
  - Lower healthcare costs
  - Increased economic growth:
    - Net present value to Mexico of a 25-point increase on PISA: US$5 trillion
    - Net present value to Mexico of getting all students to 400 on PISA: US$26 trillion (Hanushek & Woessmann, 2010)

Teaching quality is the crucial variable 6

We need to focus on classrooms, not schools 7
- In most countries, variability at the classroom level is much greater than that at school level.
- As long as you go to school, it doesn’t matter very much which school you go to.
- But it matters very much which classrooms you are in.

Within-school and between-school variation in student achievement (McGaw, 2008):
- Within-school variation: 64%
- Between-school variation not explained by social background: 18%
- Between-school variation explained by the social background of students: 5%
- Between-school variation explained by the social background of schools: 16%
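To make the within/between distinction concrete, here is a minimal sketch of how such a variance decomposition can be computed. This is my own illustration, not material from the presentation; the school and student standard deviations are invented purely for demonstration.

```python
# Hypothetical illustration: splitting variance in student scores into
# between-school and within-school components. All numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
n_schools, students_per_school = 50, 100

school_effects = rng.normal(0, 2.0, n_schools)            # spread of school means
scores = np.array([rng.normal(500 + e, 8.0, students_per_school)
                   for e in school_effects])               # spread within each school

grand_mean = scores.mean()
between = ((scores.mean(axis=1) - grand_mean) ** 2).mean()  # variance of school means
within = scores.var(axis=1).mean()                          # average within-school variance
total = between + within

print(f"Between-school share: {between / total:.0%}")
print(f"Within-school share:  {within / total:.0%}")
```

With these invented parameters, most of the variance sits within schools rather than between them, which is the pattern the McGaw figures describe.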

Teaching quality is not the same as teacher quality 9

Teaching quality/teacher quality
- Teaching quality depends on a number of factors:
  - The time teachers have to plan teaching
  - The size of classes
  - The resources available
  - The skills of the teacher
- All of these are important, but the quality of the teacher seems to be especially important

Teacher quality 11
- Take a group of 50 teachers all teaching the same subject:
  - In the classroom of the best teacher, students learn in six months what students taught by the average teacher will take a year to learn.
  - In the classroom of the least effective teacher, students will take two years to learn the same amount (Hanushek & Rivkin, 2006)
- And in the classrooms of the best teachers, students from disadvantaged backgrounds learn as much as others (Hamre & Pianta, 2005)
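A quick way to see what these figures imply, as a back-of-the-envelope gloss of my own rather than anything from the presentation, is to convert them into relative learning rates:

```python
# Back-of-the-envelope arithmetic implied by the figures above.
# Assume the average teacher's class covers one year of material per year.
average = 1.0 / 1.0          # one year of material per year
best = 1.0 / 0.5             # a year's material in six months -> 2.0 per year
least_effective = 1.0 / 2.0  # a year's material in two years  -> 0.5 per year

print(best / average)            # 2.0: twice the rate of the average teacher
print(best / least_effective)    # 4.0: four times the rate of the least effective teacher
```

That four-to-one ratio is the "400% more productive" comparison that reappears later when classroom observation ratings are discussed.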

The “dark matter” of teacher quality 12
- Teachers make a difference
- But what makes the difference in teachers?
- In particular, can we predict student progress from:
  - Teacher qualifications?
  - Value-added?
  - Teacher observation?

Predicting who will be good teachers is almost impossible 13

Teacher qualifications and student progress 14
[Table from Harris and Sass (2007): relationships between five measures of teacher preparation (general theory of education courses, teaching practice courses, pedagogical content courses, advanced university courses, aptitude test scores) and student progress in mathematics and reading at primary, middle, and high school levels. The column layout did not survive extraction; most cells are blank, with only a few scattered positive (+) and negative (—) associations.]

Evaluating teacher quality is inherently difficult 15

Framework for teaching (Danielson, 1996) 16
- Four domains of professional practice:
  1. Planning and preparation
  2. Classroom environment
  3. Instruction
  4. Professional responsibilities
- Links with student achievement (Sartain et al., 2011):
  - Domains 1 and 4: no impact on student achievement
  - Domains 2 and 3: some impact on student achievement

A framework for teaching (Danielson, 1996)
- Domain 2: The classroom environment
  - 2a: Creating an environment of respect and rapport
  - 2b: Establishing a culture for learning
  - 2c: Managing classroom procedures
  - 2d: Managing student behavior
  - 2e: Organizing physical space
- Domain 3: Instruction
  - 3a: Communicating with students
  - 3b: Using questioning and discussion techniques
  - 3c: Engaging students in learning
  - 3d: Using assessment in instruction
  - 3e: Demonstrating flexibility and responsiveness

Observations and teacher quality 18
Sartain, Stoelinga, Brown, Luppescu, Matsko, Miller, Durwood, Jiang, and Glazer (2011)
- So the highest-rated teachers are 30% more productive than the lowest-rated
- But the best teachers are 400% more productive than the least effective

We don’t know much about teaching… 19
- We cannot predict how good a teacher will be
- We cannot tell good teaching when we see it:
  - Expert ratings of teaching
  - Student ratings of teaching
- We cannot evaluate teaching with test scores

Traditional approaches to improving teaching
- Two main approaches:
  - Removing ineffective teachers
  - Rewarding good teachers
- Problems:
  - Consume large amounts of management time
  - Technically difficult to do well
  - Create competition between teachers
  - Differentially effective according to task complexity

The story so far
- Improving student achievement is a priority for every country
- Improving student achievement requires improving teacher quality
- Improving teacher quality requires investment in serving teachers

Professional development is the key to teacher quality 22

General conclusions about expertise 23
- Elite performance is the result of at least a decade of maximal efforts to improve performance through an optimal distribution of deliberate practice
- What distinguishes experts from others is the commitment to deliberate practice
- Deliberate practice is:
  - an effortful activity that can be sustained only for a limited time each day
  - neither motivating nor enjoyable—it is instrumental in achieving further improvement in performance

Expertise  According to Berliner (1994), experts:  Excel mainly in their own domain  Often develop automaticity for the repetitive operations that are needed to accomplish their goals  Are more sensitive to the task demands and social situation when solving problems  Are more opportunistic and flexible in their teaching than novices  Represent problems in qualitatively different ways than novices  Have faster and more accurate pattern recognition capabilities  Perceive meaningful patterns in the domain in which they are experienced  Begin to solve problems slower but bring richer and more personal sources of information to bear 24

Effects of experience in teaching 25
[Chart from Rivkin, Hanushek and Kain (2005): effects of teaching experience on student achievement in mathematics and reading.]

Implications for education systems
- Pursuing a strategy of getting the “best and brightest” into teaching is unlikely to succeed
- Currently, all teachers' improvement slows, and for most it actually stops, after two or three years in the classroom
- Expertise research therefore suggests that they are only beginning to scratch the surface of what they are capable of
- What we need is to persuade those with a real passion for working with young people to become teachers, and to continue to improve as long as they stay in the job
- There is no limit to what we can achieve if we support our teachers in the right way

Feedback is more complicated than generally assumed 27

Important caveats about research findings 28
- Educational research can only tell us what was, not what might be.
- Moreover, in education, “What works?” is not the right question, because:
  - everything works somewhere, and
  - nothing works everywhere
- Which is why, in education, the right question is, “Under what conditions does this work?”

Effects of formative assessment
Source | Effect size
Kluger & DeNisi (1996) | 0.41
Black & Wiliam (1998) | 0.4 to 0.7
Wiliam et al. (2004) | 0.32
Hattie & Timperley (2007) | 0.96
Shute (2008) | 0.4 to 0.8
Standardized effect size: differences in means, measured in population standard deviations

Understanding meta-analysis 30
- A technique for aggregating results from different studies by converting empirical results to a common measure (usually effect size)
- Standardized effect size is defined as the difference between the group means divided by the population standard deviation
- Problems with meta-analysis:
  - The “file drawer” problem
  - Variation in population variability
  - Selection of studies
  - Sensitivity of outcome measures
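As a concrete illustration of the two ideas on this slide, the snippet below computes a standardized effect size and then pools several effect sizes with a simple inverse-variance (fixed-effect) average. This is a sketch of my own, not material from the presentation, and all study figures are invented.

```python
# Illustrative sketch: standardized effect size and a simple fixed-effect
# meta-analytic pooling. All numbers are invented for demonstration.
import math

def cohens_d(mean_treatment, mean_control, pooled_sd):
    """Standardized effect size: difference in means, in standard-deviation units."""
    return (mean_treatment - mean_control) / pooled_sd

def d_variance(d, n1, n2):
    """Approximate sampling variance of d (Hedges-Olkin approximation)."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

def pool_fixed_effect(effects):
    """Inverse-variance weighted average of (effect size, variance) pairs."""
    weights = [1.0 / var for _, var in effects]
    pooled = sum(w * es for (es, _), w in zip(effects, weights)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))  # pooled estimate and its standard error

# One invented study: intervention group mean 72, control mean 68, spread 10.
d = cohens_d(72.0, 68.0, 10.0)   # 0.40 standard deviations

# Pool it with two more invented effect sizes; more precise studies get more weight.
studies = [(d, 120, 118), (0.55, 60, 62), (0.32, 200, 195)]
pooled, se = pool_fixed_effect([(es, d_variance(es, n1, n2)) for es, n1, n2 in studies])
print(f"Pooled effect size: {pooled:.2f} (standard error {se:.2f})")
```

The “file drawer” problem and the other caveats listed above concern which studies make it into a pool like this, and with what outcome measures, rather than the arithmetic itself.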

Effects of feedback 31
- Kluger & DeNisi (1996): review of 3,000 research reports
- Excluding those:
  - without adequate controls
  - with poor design
  - with fewer than 10 participants
  - where performance was not measured
  - without details of effect sizes
- left 131 reports, 607 effect sizes, involving individuals
- On average, feedback increases achievement
- Effect sizes highly variable
- 38% (50 out of 131) of effect sizes were negative

Getting feedback right is hard
Response type | Feedback indicates performance falls short of goal | Feedback indicates performance exceeds goal
Change behavior | Increase effort | Exert less effort
Change goal | Reduce aspiration | Increase aspiration
Abandon goal | Decide goal is too hard | Decide goal is too easy
Reject feedback | Feedback is ignored | Feedback is ignored

Kluger and DeNisi’s conclusions… These considerations of utility and alternative interventions suggest that even an FI [feedback intervention] with demonstrated positive effects on performance should not be administered whenever possible. Rather, additional development of FIT [feedback intervention theory] is needed to establish the circumstance under which positive FI effects on performance are also lasting and efficient and when these effects are transient and have questionable utility. This research must focus on the processes induced by FIs and not on the general question of whether FIs improve performance—look at how little progress 90 years of attempts to answer the latter question have yielded. (p. 278)

Formative evaluation of teaching performance 34

The evidence base for formative assessment 35
- Fuchs & Fuchs (1986)
- Natriello (1987)
- Crooks (1988)
- Bangert-Drowns et al. (1991)
- Dempster (1991, 1992)
- Elshout-Mohr (1994)
- Kluger & DeNisi (1996)
- Black & Wiliam (1998)
- Nyquist (2003)
- Brookhart (2004)
- Allal & Lopez (2005)
- Köller (2005)
- Brookhart (2007)
- Wiliam (2007)
- Hattie & Timperley (2007)
- Shute (2008)

Assessment for learning/formative assessment “Assessment for learning is any assessment for which the first priority in its design and practice is to serve the purpose of promoting students’ learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence. An assessment activity can help learning if it provides information that teachers and their students can use as feedback in assessing themselves and one another and in modifying the teaching and learning activities in which they are engaged. Such assessment becomes “formative assessment” when the evidence is actually used to adapt the teaching work to meet learning needs.” (Black, Harrison, Lee, Marshall & Wiliam, 2004, p. 10)

Theoretical questions 37
- Need for clear definitions:
  - So that research outcomes are commensurable
- Theorization and definition
- Possible variables:
  - Category (instruments, outcomes, functions)
  - Beneficiaries (teachers, learners)
  - Timescale (months, weeks, days, hours, minutes)
  - Consequences (outcomes, instruction, decisions)
  - Theory of action (what gets formed?)

Formative assessment: a new definition
“An evaluation of teacher performance functions formatively to the extent that evidence of teacher performance that is elicited by the assessment is interpreted by leaders, teachers, or their peers to make decisions about the professional development of the teacher that are likely to be better, or better founded, than those that would have been taken in the absence of that evidence.”

- Formative evaluation involves the creation of, and capitalization upon, moments of contingency in the regulation of teachers’ learning processes
- Kinds of regulation (Perrenoud, 1998):
  - Proactive
  - Interactive
  - Retroactive
- Agents:
  - Leaders (external regulation)
  - Peers (co-regulation)
  - Teachers (self-regulation)

Strategies of formative evaluation 40

Unpacking formative assessment of teaching
- Leader:
  - Where the teacher is going: Clarifying, sharing and understanding learning intentions
  - Where the teacher is now: Engineering effective situations, tasks and activities that elicit evidence of development
  - How to get there: Providing feedback that moves learners forward
- Peer: Activating teachers as learning resources for one another
- Teacher: Activating teachers as owners of their own learning

Validity of formative evaluation

Validity: an evolving concept 43
- Evolution of the idea:
  - A property of a test
  - A property of students’ results on a test
  - A property of the inferences drawn on the basis of test results
- For any test:
  - some inferences are warranted
  - some are not
- “One validates not a test but an interpretation of data arising from a specified procedure” (Cronbach, 1971; emphasis in original)
- No such thing as a valid assessment!

Validating formative evaluation
- An assessment is a procedure for making inferences:
  - about what the learner knows (summative)
  - about what to do next (formative)
- Summative inferences are validated by consistency of meanings across different readers
- Formative inferences are validated by the consequences for learners

Implementing formative evaluation of teaching performance 45

A model for teacher learning 46
- Content, then process
- Content (what we want teachers to change):
  - Evidence
  - Ideas (strategies and techniques)
- Process (how to go about change):
  - Choice
  - Flexibility
  - Small steps
  - Accountability
  - Support

Choice

A strengths-based approach to change 48
- Talent development requires attending to both strengths and weaknesses
- The question is how to distribute attention between the two:
  - For novices, attention to weaknesses is likely to have the greatest payoff
  - For more experienced teachers, attention to strengths is likely to be more advantageous

Flexibility

Tight, but loose 50
- Two opposing factors in any school reform:
  - Need for flexibility to adapt to local circumstances
  - Need to maintain fidelity to the theory of action of the reform, to minimise “lethal mutations”
- The “tight but loose” formulation:
  - … combines an obsessive adherence to central design principles (the “tight” part) with accommodations to the needs, resources, constraints, and affordances that occur in any school or district (the “loose” part), but only where these do not conflict with the theory of action of the intervention.

Small steps

Expertise  According to Berliner (1994), experts:  Excel mainly in their own domain  Often develop automaticity for the repetitive operations that are needed to accomplish their goals  Are more sensitive to the task demands and social situation when solving problems  Are more opportunistic and flexible in their teaching than novices  Represent problems in qualitatively different ways than novices  Have faster and more accurate pattern recognition capabilities  Perceive meaningful patterns in the domain in which they are experienced  Begin to solve problems slower but bring richer and more personal sources of information to bear 52

Looking at the wrong knowledge 53
- The most powerful teacher knowledge is not explicit:
  - That’s why telling teachers what to do doesn’t work.
  - What we know is more than we can say.
  - And that is why most professional development has been relatively ineffective.
- Improving practice involves changing habits, not adding knowledge:
  - That’s why it’s hard: the hardest bit is not getting new ideas into people’s heads. It’s getting the old ones out.
  - That’s why it takes time.
- But it doesn’t happen naturally:
  - If it did, the most experienced teachers would be the most productive, and that’s not true (Hanushek & Rivkin, 2006).

Hand hygiene in hospitals
Study | Focus | Compliance rate
Preston, Larson, & Stamm (1981) | Open ward | 16%
Preston, Larson, & Stamm (1981) | ICU | 30%
Albert & Condie (1981) | ICU | 28% to 41%
Larson (1983) | All wards | 45%
Donowitz (1987) | Pediatric ICU | 30%
Graham (1990) | ICU | 32%
Dubbert (1990) | ICU | 81%
Pettinger & Nettleman (1991) | Surgical ICU | 51%
Larson et al. (1992) | Neonatal ICU | 29%
Doebbeling et al. (1992) | ICU | 40%
Zimakoff et al. (1992) | ICU | 40%
Meengs et al. (1994) | ER (Casualty) | 32%
Pittet, Mourouga, & Perneger (1999) | All wards | 48%
Pittet, Mourouga, & Perneger (1999) | ICU | 36%
Source: Pittet (2001)

Accountability

Making a commitment 56
- Action planning:
  - Forces teachers to make their ideas concrete and creates a record
  - Makes the teachers accountable for doing what they promised
  - Requires each teacher to focus on a small number of changes
  - Requires the teachers to identify what they will give up or reduce
- A good action plan:
  - Does not try to change everything at once
  - Spells out specific changes in teaching practice
  - Relates to the five “key strategies” of AFL
  - Is achievable within a reasonable period of time
  - Identifies something that the teacher will no longer do or will do less of

Support

Supportive accountability 58
- What is needed from teachers:
  - A commitment to:
    - The continual improvement of practice
    - Focus on those things that make a difference to students
- What is needed from leaders:
  - A commitment to engineer effective learning environments for teachers by:
    - Creating expectations for continually improving practice
    - Keeping the focus on the things that make a difference to students
    - Providing the time, space, dispensation, and support for innovation
    - Supporting risk-taking