Port Phillip/Bayside Network: Using student achievement data to inform improved teaching programs Presented by Philip Holmes-Smith (School Research Evaluation and Measurement Services)
Overview of the workshop
1. Some definitions and some facts about tests
a) Diagnostic vs. Summative Testing
b) The reliability of summative (standardised) tests
c) Using other summative tests to monitor student progress
2. Using and interpreting NAPLAN Data
a) Assessment of Learning
- Using SPA
b) Assessment for Learning (Using tests diagnostically to improve learning)
- Writing Criteria Report and the Student Response Report
- Item Analysis Report
1(a). Diagnostic vs. Summative Tests
Diagnostic Testing
- Assessment tools such as the Maths Online Interview, Marie Clay Inventory and Probe, together with teacher assessments such as teacher questioning in class, teacher observations and student work (including portfolios), are all examples of “Diagnostic” data.
- Research shows that our most effective teachers, in terms of improving the learning outcomes of students, constantly use diagnostic information to inform their teaching.
- If a teacher uses diagnostic information about what each student can and can’t do to inform their teaching for each student, Hattie (2003) shows that this has the single biggest impact on improving student learning outcomes.
Summative (Standardised) Testing
- AIM/NAPLAN and other tests like TORCH, PAT-R, PAT-Maths, SA Spelling and On-Demand Adaptive Tests are referred to as “Summative” tests.
- Summative testing is essential to monitor the effectiveness of your teaching. (We will look at ways of doing this later.)
- Research shows that summative tests do not lead to improved learning outcomes. As the saying goes: “You don’t fatten a pig by weighing it”.
- So, although it is essential, keep summative testing to a minimum.
1(b). The Reliability of Summative Tests
Three Questions
1. Do you believe that your students’ NAPLAN results accurately reflect their level of performance?
2. If we acknowledge that the odd student will have a lucky guessing day or a horror day, what about the majority?
- Have your weakest students received a low score?
- Have your average students received a score at about expected level?
- Have your best students received a high score?
3. Think about your students who received high and low scores:
- Are your low scores too low?
- Are your high scores too high?
Is this reading score reliable?
Summary Statements about Scores Low scores (i.e. more than 0.5 VELS levels below expected) indicate poor performance but the actual values should be considered as indicative only. High scores (i.e. more than 0.5 VELS levels above expected) indicate good performance but the actual values should be considered as indicative only. Average scores indicate roughly expected levels of performance but the actual values should be considered as indicative only.
Item Difficulties for She’s Crying on the TORCH score scale
Converting Raw Test Scores (She’s Crying) to TORCH scale scores
Test difficulties of the TORCH Tests on the TORCH score scale together with Year Level mean scores
Different norm tables for different tests
Test difficulties of the PAT-Maths Tests on the PATM score scale together with Year Level mean scores (Year 1, Year 2, Year 3, Year 4, Year 5, Year 6&7, Year 8&9, Year 10). Source: ACER, 2006
Summative Testing and Triangulation
- Even if you give the right test to the right student, sometimes the test score does not reflect the true ability of the student: every measurement is associated with some error.
- To overcome this we should aim to get at least three independent measures, what researchers call TRIANGULATION. This may include:
  - Teacher judgment
  - NAPLAN results
  - Other pen & paper summative tests (e.g. TORCH, PAT-R, PAT-Maths)
  - On-line summative tests (e.g. On-Demand ‘Adaptive’ testing, Assessment of English)
- BUT remember, more summative testing does not lead to improved learning outcomes, so keep the summative testing to a minimum.
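One way to picture triangulation is to compare three measures for the same student and see whether they agree. The sketch below is only an illustration, not a method from the workshop: it assumes all three measures have already been converted to one common scale (VELS levels here), and the function name, the 0.5 tolerance and the student values are all hypothetical.

```python
from statistics import median


def triangulate(measures: dict[str, float], tolerance: float = 0.5) -> tuple[float, bool]:
    """Return the median of the measures and whether they all agree within `tolerance`.

    Assumes every measure is already expressed on one common scale
    (for example VELS levels); the 0.5 tolerance is illustrative only.
    """
    values = list(measures.values())
    if len(values) < 3:
        raise ValueError("triangulation needs at least three independent measures")
    spread = max(values) - min(values)
    return median(values), spread <= tolerance


# Hypothetical student, all values in VELS levels.
estimate, consistent = triangulate(
    {"teacher judgment": 3.0, "NAPLAN": 2.4, "TORCH": 3.1}
)
print(estimate, consistent)  # 3.0 False
```

Here two of the three measures sit close together and one is noticeably lower, so the median follows the two that agree and the flag suggests taking a second look at the outlying result.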
Things to look for in a summative test
- Needs to have a single developmental scale that shows increasing levels of achievement over all the year levels at your school.
- Needs to have “norms” or expected levels for each year level (e.g. the National “norm” for Yr 3 students on TORCH is an average of 34.7).
- Needs to be able to demonstrate growth from one year to the next (e.g. the average student grows from a score of 34.7 in Yr 3 to an expected score of 41.4 in Yr 4).
- As a bonus, the test could also provide diagnostic information.
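The growth arithmetic in the example above can be sketched in a few lines. This is a minimal illustration that uses only the two TORCH norms quoted on the slide (34.7 for Yr 3 and 41.4 for Yr 4); the function names and the example student scores are hypothetical.

```python
# TORCH national norms quoted on the slide (average scale score by year level).
TORCH_NORMS = {3: 34.7, 4: 41.4}


def expected_growth(from_year: int, to_year: int) -> float:
    """Expected growth in TORCH scale score between two year levels."""
    return round(TORCH_NORMS[to_year] - TORCH_NORMS[from_year], 1)


def growth_vs_expected(score_before: float, score_after: float,
                       from_year: int, to_year: int) -> float:
    """Student's growth minus the expected growth (positive = above expected)."""
    actual = score_after - score_before
    return round(actual - expected_growth(from_year, to_year), 1)


# Hypothetical student: 30.0 in Yr 3 and 39.0 in Yr 4.
print(expected_growth(3, 4))                 # 6.7
print(growth_vs_expected(30.0, 39.0, 3, 4))  # 2.3
```

A student who starts below the norm can still show above-expected growth, which is why growth is worth tracking separately from the raw score.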
Norms for Year 3 to Year 10 On the TORCH scale
My Recommended Summative Tests (Pen & Paper)
- Reading Comprehension
  - TORCH and TORCH plus
  - Progressive Achievement Test - Reading (PAT-R, 4th Edition)
- Mathematics
  - Progressive Achievement Test - Mathematics (PAT-Maths, 3rd Edition) combined with the I Can Do Maths
- Spelling
  - South Australian Spelling
My Recommended Summative Tests (On-Line)
- On-Demand - Reading Comprehension
  - The 30-item “On-Demand” Adaptive Reading test
- On-Demand - Spelling
  - The 30-item “On-Demand” Adaptive Spelling test
- On-Demand - Writing Conventions
  - The 30-item “On-Demand” Adaptive Writing test
- Assessment of English in the Early Years (2010). Includes:
  - Comprehension
  - Spelling
  - Writing
  - Speaking & Listening
- On-Demand - Mathematics (Number, Measurement, Chance & Data and Space)
  - The 60-item “On-Demand” Adaptive General Mathematics test
2(a). Assessment of Learning: Using SPA
The NAPLAN Data Service Main Menu
The Student Achievement Level Report Menu
The Student Achievement Level Report
The Student Achievement Level Report Menu
The Student Achievement Level Report
Extracting Outcome Level Data for Further Analysis
Cut-points for colour coding
Yr 3        Yr 5        Yr 7        Yr 9
<= 1.675    <= 2.675    <= 3.675    <=
>= 2.675    >= 3.675    >= 4.675    >= 5.675
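Applied to extracted scores, the cut-points above amount to a simple three-way classification. The sketch below is an assumption about how they might be used rather than a description of SPA itself: the band labels are placeholders (the slide does not name the colours), the function name is hypothetical, and Yr 9 is left out because its lower cut-point is missing from the slide.

```python
# Cut-points from the slide, as (lower, upper) per year level.
# Yr 9 is left out because its lower cut-point is missing from the slide.
CUT_POINTS = {3: (1.675, 2.675), 5: (2.675, 3.675), 7: (3.675, 4.675)}


def colour_band(year_level: int, score: float) -> str:
    """Return a band label for a score (labels are placeholders, not the slide's colours)."""
    lower, upper = CUT_POINTS[year_level]
    if score <= lower:
        return "low"
    if score >= upper:
        return "high"
    return "expected"


print(colour_band(3, 1.5))    # low
print(colour_band(5, 3.1))    # expected
print(colour_band(7, 4.675))  # high
```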
Working with the extracted data See SPA
2(b). Assessment for Learning: Using the tests diagnostically to improve student learning
The NAPLAN Data Service Main Menu
The Writing Criteria Report Menu
Writing – The marking rubric
The NAPLAN marking rubric comprises 10 criteria, namely:
- Audience
- Text structure
- Ideas
- Character and setting
- Vocabulary
- Cohesion
- Paragraphing
- Sentence structure
- Punctuation
- Spelling
The marking rubric can be downloaded from: or
This document also contains some annotated marked examples.
Writing – The marking rubric (cont)
Writing – Annotated marked example
The Writing Criteria Report
- Across the State, less than 20% received a score of “1” and about 40% received a score of “3”.
- However, in this school, nearly 40% received a score of “1” and under 20% received a score of “3”.
- This school could benefit from some lessons on how to develop ideas in a Narrative.
The Student Response Report Menu
The Student Response Report – Writing (by criteria)
The NAPLAN Data Service Main Menu
The Item Analysis Report Menu
The Item Analysis Report
Year 5 Reading Correct = ? Most common incorrect = ?
Year 5 Reading Correct = D (93%) Most common incorrect = A, B, C (2-3% each)
Year 5 Reading Correct = ? Most common incorrect = ?
Year 5 Reading Correct = B (56%) Most common incorrect = C (34%)
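The reading of the two items above (find the correct option's share, then look at which wrong option attracted the most students) can be sketched as follows. This is an illustration only: the function names and the 20% follow-up threshold are hypothetical, and the data is limited to the two percentages the slide gives for the second item (B and C).

```python
def most_common_incorrect(responses: dict[str, float], correct: str) -> tuple[str, float]:
    """Return the incorrect option chosen most often and its percentage."""
    wrong = {opt: pct for opt, pct in responses.items() if opt != correct}
    option = max(wrong, key=wrong.get)
    return option, wrong[option]


def needs_follow_up(responses: dict[str, float], correct: str,
                    threshold: float = 20.0) -> bool:
    """Flag an item when one wrong option attracts at least `threshold` percent."""
    _, pct = most_common_incorrect(responses, correct)
    return pct >= threshold


# Percentages from the second Year 5 Reading item; the slide gives only B and C.
item = {"B": 56.0, "C": 34.0}
print(most_common_incorrect(item, correct="B"))  # ('C', 34.0)
print(needs_follow_up(item, correct="B"))        # True
```

When a single wrong option draws a large share of students, as C does here, the item is a candidate for looking at why that option was attractive, which is the diagnostic use of the report.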