Formative assessment in mathematics: opportunities and challenges


Formative assessment in mathematics: opportunities and challenges
Dylan Wiliam (@dylanwiliam)
Seminar at Teachers College, Columbia University, October 2013

A research agenda for formative assessment
- Definitional issues
- Domain-specificity issues
- Effectiveness issues
- Communication issues
- Implementation issues
- Adoption issues

Definitional issues

The evidence base for formative assessment
- Fuchs & Fuchs (1986)
- Natriello (1987)
- Crooks (1988)
- Bangert-Drowns et al. (1991)
- Dempster (1991, 1992)
- Elshout-Mohr (1994)
- Kluger & DeNisi (1996)
- Black & Wiliam (1998)
- Nyquist (2003)
- Brookhart (2004)
- Allal & Lopez (2005)
- Köller (2005)
- Brookhart (2007)
- Wiliam (2007)
- Hattie & Timperley (2007)
- Shute (2008)

Definitions of formative assessment

"We use the general term assessment to refer to all those activities undertaken by teachers—and by their students in assessing themselves—that provide information to be used as feedback to modify teaching and learning activities. Such assessment becomes formative assessment when the evidence is actually used to adapt the teaching to meet student needs" (Black & Wiliam, 1998, p. 140)

"the process used by teachers and students to recognise and respond to student learning in order to enhance that learning, during the learning" (Cowie & Bell, 1999, p. 32)

"assessment carried out during the instructional process for the purpose of improving teaching or learning" (Shepard et al., 2005, p. 275)

"Formative assessment refers to frequent, interactive assessments of students' progress and understanding to identify learning needs and adjust teaching appropriately" (Looney, 2005, p. 21)

"A formative assessment is a tool that teachers use to measure student grasp of specific topics and skills they are teaching. It's a 'midstream' tool to identify specific student misconceptions and mistakes while the material is being taught" (Kahl, 2005, p. 11)

"Assessment for Learning is the process of seeking and interpreting evidence for use by learners and their teachers to decide where the learners are in their learning, where they need to go and how best to get there" (Assessment Reform Group, 2002, pp. 2-3)

"Assessment for learning is any assessment for which the first priority in its design and practice is to serve the purpose of promoting students' learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence. An assessment activity can help learning if it provides information that teachers and their students can use as feedback in assessing themselves and one another and in modifying the teaching and learning activities in which they are engaged. Such assessment becomes "formative assessment" when the evidence is actually used to adapt the teaching work to meet learning needs." (Black, Harrison, Lee, Marshall & Wiliam, 2004, p. 10)

Theoretical questions
- Need for clear definitions, so that research outcomes are commensurable
- Theorization and definition
- Possible variables:
  - Category (instruments, outcomes, functions)
  - Beneficiaries (teachers, learners)
  - Timescale (months, weeks, days, hours, minutes)
  - Consequences (outcomes, instruction, decisions)
  - Theory of action (what gets formed?)

Formative assessment: a new definition “An assessment functions formatively to the extent that evidence about student achievement elicited by the assessment is interpreted and used, by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions that would have been taken in the absence of that evidence.”

Unpacking formative assessment

Three questions (where the learner is going, where the learner is, how to get there) crossed with three agents (teacher, peer, learner) yield five key strategies:
- Clarifying, sharing and understanding learning intentions (teacher, peer, and learner)
- Engineering effective discussions, tasks, and activities that elicit evidence of learning (teacher)
- Providing feedback that moves learners forward (teacher)
- Activating students as learning resources for one another (peer)
- Activating students as owners of their own learning (learner)

Definitional issues: potential research How can formative assessment be defined and what are the consequences of different definitions, for psychometrics, for communication, and for adoption?

Domain specificity issues

Pedagogy and didactics
- Some aspects of formative assessment are generic
- Some aspects of formative assessment are domain-specific
- There is a continuing debate about which aspects of formative assessment are generic (pedagogy) and which are domain-specific (didactics)

Clarifying, sharing and understanding learning intentions

A standard middle school math problem… Two farmers have adjoining fields with a common boundary that is not straight. This is inconvenient for plowing. How can they divide the two fields so that the boundary is straight, but the two fields have the same area as they had before?

How many rectangles?

Engineering effective discussions, activities, and classroom tasks that elicit evidence of learning

Questioning in math: diagnosis

In which of these right-angled triangles is a² + b² = c²?

[Figure: six right-angled triangles, labeled A to F, with the sides labeled a, b, and c in different positions]

Diagnostic item: medians

What is the median for the following data set?

  38  74  22  44  96  22  19  53

A. 22
B. 38 and 44
C. 41
D. 46
E. 70
F. 77
G. This data set has no median

Q8-61-05. Key: C. The distractors target common misconceptions: that the median has to be a number in the data set, and that the median cannot be calculated for a data set with an even number of values.
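The keyed answer can be checked mechanically. A minimal sketch, reading the item's data set as the eight values below (an interpretation, since the source formatting is ambiguous):

```python
from statistics import median

# Data set from the diagnostic item (read here as eight values)
data = [38, 74, 22, 44, 96, 22, 19, 53]

# With an even number of values, the median is the mean of the two
# middle values after sorting: (38 + 44) / 2 = 41
print(median(data))  # 41.0
```

Note that 41 appears nowhere in the data, which is exactly what the "median has to be a number in the data set" distractor exploits.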

Diagnostic item: means

What can you say about the means of the following two data sets?

  Set 1: 10  12  13  15
  Set 2: 10  12  13  15  0

A. The two sets have the same mean.
B. The two sets have different means.
C. It depends on whether you choose to count the zero.

Q4-67-02. Key: B. Options A and C are variations of the same misconception: that adding a zero to a data set does not affect the mean.
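A quick check of the key, using the item's two data sets:

```python
from statistics import mean

set1 = [10, 12, 13, 15]
set2 = [10, 12, 13, 15, 0]

# The zero counts as a data point: it adds to the divisor but not the sum
print(mean(set1))  # 12.5
print(mean(set2))  # 10
```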

Providing feedback that moves learners forward

Getting feedback right is hard

Response type     When feedback indicates performance...
                  falls short of goal        exceeds goal
Change behavior   Increase effort            Exert less effort
Change goal       Reduce aspiration          Increase aspiration
Abandon goal      Decide goal is too hard    Decide goal is too easy
Reject feedback   Feedback is ignored        Feedback is ignored

Activating students: as learning resources for one another as owners of their own learning

+/–/interesting: responses for "+"
- I got that ball-park estimates are supposed to be simple
- I know that you have to look at it and say "OK"
- I know that when I am adding the number I end up with must be bigger than the one I started at
- I get most of the problems
- It was easy for me because on the first one it says 328 so I took the 2 and made it a 12
- I know that we would have to regroup
- I know how to do plus and minus because we have been doing it for a long time
- I get it when you cross out a number and make it a new one
- I know that when you can't – from both colomes you go to the third colome and take that from it
- I know that when my answer is right the ball park estimate is close to it

+/–/interesting: responses for "–"
- I am still a tiny bit confused about subtraction regrouping
- I am a little bit confused about ball park estimates
- I get confused because sometimes I don't get the problem
- I am confused when you subtract really big numbers like 1,000 something
- I'm still a little bit confused about regrouping
- Minus is confusing when you have to regroup twice
- Minus is a little bit hard when you have to regroup
- I don't understand when you borrow which colome you borrow from when both are 0
- I am still confused about showing what I did to solve the problem
- I am a little confused about when you need to subtract

+/–/interesting: responses for "interesting"
- Carrying the number over to the next number
- It's interesting how some people go to the nearest hundred while some go to the nearest ten
- It's interesting how some have to regroup twice
- It's pretty interesting about how you have to work really hard
- I am interested in borrowing because I didn't just get it yet. I want to really get to know it
- I find it weird that you could just keep going from colome to colome when you need to borrow
- On the ball park estimate it is easy but sometimes hard
- I really think that regrouping is pretty amazing
- It is cool how addition and subtraction regrouping is just moving numbers and you could get it right easily

Domain-specificity issues: potential research How much domain-specific knowledge does a teacher need in order to be able to implement high-quality formative assessment routines consistently? Can domain-specific formative assessment tools be independent of a particular curriculum?

The effectiveness issue

Effects of formative assessment

Standardized effect size: differences in means, measured in population standard deviations

Source                       Effect size
Kluger & DeNisi (1996)       0.41
Black & Wiliam (1998)        0.4 to 0.7
Wiliam et al. (2004)         0.32
Hattie & Timperley (2007)    0.96
Shute (2008)                 0.4 to 0.8

Understanding meta-analysis: “I think you’ll find it’s a bit more complicated than that” (Goldacre, 2008)

Understanding meta-analysis

A technique for aggregating results from different studies by converting empirical results to a common measure (usually effect size). The standardized effect size is defined as:

  effect size = (mean of experimental group − mean of control group) / standard deviation

Problems with meta-analysis:
- The "file drawer" problem
- Variation in population variability
- Selection of studies
- Sensitivity of outcome measures
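The definition can be made concrete with a small sketch. The data are hypothetical, and the pooled-standard-deviation variant (Cohen's d) shown here is one common choice among several:

```python
import statistics

def effect_size(experimental, control):
    """Standardized mean difference (Cohen's d with a pooled SD)."""
    n1, n2 = len(experimental), len(control)
    m1, m2 = statistics.mean(experimental), statistics.mean(control)
    v1, v2 = statistics.variance(experimental), statistics.variance(control)
    # Pool the two sample variances, weighted by degrees of freedom
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Hypothetical test scores for an experimental and a control class
d = effect_size([14, 15, 16, 17, 18], [12, 13, 14, 15, 16])
print(round(d, 2))  # 1.26
```

Which standard deviation goes in the denominator (control group, pooled, or population) is precisely one of the sources of incommensurability the slide goes on to discuss.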

The “file drawer” problem

The importance of statistical power

The statistical power of an experiment is the probability that the experiment will yield an effect large enough to be statistically significant. In single-level designs, power depends on:
- the significance level set
- the magnitude of the effect
- the size of the experiment

The power of most social-science experiments is low:
- Psychology: 0.4 (Sedlmeier & Gigerenzer, 1989)
- Neuroscience: 0.2 (Button et al., 2013)
- Education: 0.4

Only lucky experiments get published…
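A rough sense of why this happens can be had from a normal-approximation sketch (a simplification assumed here, not a reproduction of any cited analysis): for a two-sided, two-sample z-test with equal group sizes, power follows directly from the effect size and the sample size.

```python
from statistics import NormalDist

def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with
    equal group sizes, via a normal approximation."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected z statistic: d / sqrt(2/n) = d * sqrt(n/2)
    expected_z = effect_size * (n_per_group / 2) ** 0.5
    return 1 - z.cdf(z_crit - expected_z)

# With a true effect of d = 0.4 and 50 students per group,
# power comes out at roughly one half: a coin flip
p = power_two_sample(0.4, 50)
print(round(p, 2))
```

Under these assumptions, a typical classroom-sized study of a real 0.4-effect intervention fails to reach significance about half the time, which is why "only lucky experiments get published" inflates meta-analytic averages.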

Variation in variability

Annual growth in achievement, by age
- A 50% increase in the rate of learning for six-year-olds is equivalent to an effect size of 0.76
- A 50% increase in the rate of learning for 15-year-olds is equivalent to an effect size of 0.1

Bloom, Hill, Black, and Lipsey (2008)
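The arithmetic behind these equivalences is simple. Assuming annual growth of about 1.52 SD at age 6 and 0.2 SD at age 15 (values implied by the slide's figures; Bloom et al. report growth in these ranges), a 50% boost to the rate of learning is half a year's growth:

```python
# Annual achievement growth in SD units (assumed values, implied
# by the slide's figures; see Bloom et al., 2008)
annual_growth_sd = {"age 6": 1.52, "age 15": 0.2}

# A 50% increase in the rate of learning adds half a year's growth
for age, growth in annual_growth_sd.items():
    print(f"{age}: effect size = {0.5 * growth:.2f}")
```

The same intervention therefore looks roughly eight times more "effective" with six-year-olds than with 15-year-olds, purely because of when it is measured.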

Variation in variability
- Studies with younger children will produce larger effect size estimates
- Studies with restricted populations (e.g., children with special needs, gifted students) will produce larger effect size estimates

Selection of studies

Feedback in STEM subjects

Review of 9,000 papers on feedback in mathematics, science and technology; only 238 papers were retained:
- Background papers: 24
- Descriptive papers: 79
- Qualitative papers: 24
- Quantitative papers: 111
  - Mathematics: 60
  - Science: 35
  - Technology: 16

Ruiz-Primo and Li (2013)

Classification of feedback studies
- Who provided the feedback (teacher, peer, self, or technology-based)?
- How was the feedback delivered (individual, small group, or whole class)?
- What was the role of the student in the feedback (provider or receiver)?
- What was the focus of the feedback (e.g., product, process, or self-regulation for cognitive feedback; goal orientation or self-efficacy for affective feedback)?
- On what was the feedback based (student product or process)?
- What type of feedback was provided (evaluative, descriptive, or holistic)?
- How was feedback provided or presented (written, oral, or video)?
- What was the referent of feedback (self, others, or mastery criteria)?
- How, and how often, was feedback given in the study (one time or multiple times; with or without pedagogical use)?

Main findings

Characteristic of studies included                      Maths    Science
Feedback treatment is a single event lasting minutes    85%      72%
Reliability of outcome measures                         39%      63%
Validity of outcome measures                            24%      3%
Dealing only or mainly with declarative knowledge       12%      36%
Schematic knowledge (e.g., knowing why)                 9%       0%
Multiple feedback events in a week                      14%      17%

Sensitivity to instruction

Sensitivity of outcome measures: distance of the assessment from the curriculum
- Immediate: e.g., science journals, notebooks, and classroom tests
- Close: e.g., where an immediate assessment asked about the number of pendulum swings in 15 seconds, a close assessment asks about the time taken for 10 swings
- Proximal: e.g., if an immediate assessment asked students to construct boats out of paper cups, the proximal assessment would ask for an explanation of what makes bottles float
- Distal: e.g., where the assessment task is sampled from a different domain and the problem, procedures, materials, and measurement methods differ from those used in the original activities
- Remote: e.g., standardized national achievement tests

Ruiz-Primo, Shavelson, Hamilton, and Klein (2002)

Impact of sensitivity to instruction

[Chart: effect sizes of feedback interventions, larger for assessments "close" to instruction than for "proximal" assessments]

Effectiveness issues: potential research Under what kind of conditions does the implementation of formative assessment practices in classrooms lead to student improvement? What kinds of increases in the rate of student learning are possible?

Communication issues

Dissemination models
- Gas-pump attendant
- FedEx
- IKEA
- Sherpa
- Gardener
- PhD supervisor

So much for the easy bit…
- Theorization
- Ideas
- Products
- Evidence of impact
- Advocacy

Communication issues: potential research How can the vision of effective formative assessment practice be communicated to teachers?

Implementation issues

Hand hygiene in hospitals

Study                                  Focus           Compliance rate
Preston, Larson, & Stamm (1981)        Open ward       16%
                                       ICU             30%
Albert & Condie (1981)                                 28% to 41%
Larson (1983)                          All wards       45%
Donowitz (1987)                        Pediatric ICU
Graham (1990)                                          32%
Dubbert (1990)                                         81%
Pettinger & Nettleman (1991)           Surgical ICU    51%
Larson, et al. (1992)                  Neonatal ICU    29%
Doebbeling, et al. (1992)                              40%
Zimakoff, et al. (1992)
Meengs, et al. (1994)                  ER (Casualty)
Pittet, Mourouga, & Perneger (1999)                    48%
Pittet (2001)                                          36%

Implementation issues What are the practical obstacles to the introduction of formative assessment practices, and how can they be overcome? What kinds of tools and supports can be provided for teachers, and what needs to be developed locally?

Adoption issues

The story so far…
- 1993-1998: Review of research on formative assessment
- 1998-2003: Face-to-face implementations with groups of teachers
- 2003-2008: Attempts to produce faithful implementations at scale
- 2008-2013: Creating the conditions for implementations at scale

Adoption issues: potential research How can we support leaders in prioritizing changes that make the most difference to student outcomes?

Comments? Questions? www.dylanwiliam.net