The trouble with resits … Dr Chris Ricketts, Sub-Dean (Teaching Enhancement), Faculty of Technology, and Director of Assessment, Peninsula College of Medicine and Dentistry.

Presentation transcript:

The trouble with resits … Dr Chris Ricketts, Sub-Dean (Teaching Enhancement), Faculty of Technology, and Director of Assessment, Peninsula College of Medicine and Dentistry (but School of Mathematics and Statistics)

Outline The coincidences that led me here; something about educational measurement; the literature; some theory; the question to which I don’t know the answer (yet!)

(Mental) health warning Mostly theory and speculation, no results.

Background Had been working on ‘domain referenced testing’ in PCMD. Had been thinking about ‘progress testing’. Chairing university’s assessment review. Received a paper with the title ‘The trouble with resits…’ to review. Started to think … Time to share my problem!

The ‘progress test’ prompt A ‘progress test’ is a test set at graduation level but sat by students in all years.

The ‘progress test’ prompt The concept goes back to Goulet (1955); the first practical application was described by Arnold & Willoughby (1990). My question: How can we use prior information (results on previous tests) to improve our estimate of what a student currently knows?

The referee prompt The paper looked at resits in clinical examinations. It claimed that ‘it would be a brave assessment team which set a higher pass-mark for a resit …’ My question: Why do we not do this?

Educational measurement ‘Educational measurement … is better conceived of as testing student performance on a sample of tasks from the area for purposes of predicting the extent of satisfactory performance in the area as a whole.’ Bock, Thissen & Zimowski (1997)

Educational measurement vs. competency testing? Competency testing means you can do a specific task, so multiple tries are sensible. (Perhaps we need repeated competency assessments? Is a sample of one enough?) This is different from educational measurement. Educational measurement is an inference problem: take a sample of tasks representative of the whole domain and, on the basis of performance on that sample, make inferences about the whole domain.
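
To make the inference idea concrete, here is a minimal sketch (not from the talk) that treats a test as a sample of tasks from a larger domain and estimates domain-level proficiency together with its sampling uncertainty; the scores and sample size are invented for illustration.

```python
# Hedged illustration: a test as a sample of tasks drawn from a larger domain.
import math

def domain_estimate(scores):
    """Estimate the proportion of the domain mastered from a sample of 0/1 task scores."""
    n = len(scores)
    p_hat = sum(scores) / n                  # observed proportion correct on the sample
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of that estimate
    return p_hat, se

# Hypothetical data: 40 sampled tasks, 26 answered correctly
sample = [1] * 26 + [0] * 14
p_hat, se = domain_estimate(sample)
print(f"Estimated domain proficiency: {p_hat:.2f} +/- {1.96 * se:.2f} (95% interval)")
```

The point is simply that the observed mark is an estimate of performance on the whole domain, and that estimate carries uncertainty that depends on the size of the sample of tasks.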

The trouble with resits … A resit is another sample. How should we treat it?

The literature on resits There is some information about what people do, but very little about why. Educational Measurement: Issues and Practice has nothing. Journal of Educational Measurement has nothing. Assessment and Evaluation in Higher Education has nothing. ‘Measurement and Assessment in Teaching’ (Linn & Gronlund) has nothing. Can you help?

Some theory (1) All educational measurements are made with uncertainty. This is usually described by the ‘standard error of measurement’. The aim is to come to reliable decisions, which usually implies a measurement with a small standard error.
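
As an aside, a minimal sketch of the classical-test-theory standard error of measurement, SEM = SD × sqrt(1 − reliability); the score SD and reliability values below are assumptions for illustration, not figures from the talk.

```python
# Hedged illustration of the classical standard error of measurement.
import math

def sem(score_sd, reliability):
    """Classical standard error of measurement: SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1 - reliability)

# Illustrative values only: a 10-mark score SD and a reliability of 0.85
print(sem(score_sd=10.0, reliability=0.85))  # roughly 3.9 marks of uncertainty on any observed score
```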

Inference and uncertainty Uncertainty arises because of (1) the sample and (2) other sources of error. If someone fails, is this because the sample is inappropriate for them (so-called ‘case specificity’)? A resit is another sample. How should we treat it?

Some theory (2) Adaptive testing or multi-stage testing. In classical adaptive testing we give a student a task of average difficulty. If they pass, they then get a harder task, which gives more information at their particular ability level; if they fail, they get an easier task. This assumes ‘unidimensionality’. Students who hover around the pass mark generally sit longer tests, to reduce the standard error of measurement.
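
A toy sketch of the adaptive idea described above, assuming a simple step-up/step-down rule rather than the item-response-theory ability estimation a real computer-adaptive test would use.

```python
# Hedged illustration: difficulty moves up after a correct response, down after an incorrect one.
def adaptive_test(answer, n_items=10, start_difficulty=0.0, step=0.5):
    """answer(difficulty) -> True/False; returns the sequence of (difficulty, correct) pairs."""
    difficulty, trace = start_difficulty, []
    for _ in range(n_items):
        correct = answer(difficulty)
        trace.append((difficulty, correct))
        difficulty += step if correct else -step  # harder after a pass, easier after a fail
    return trace

# A simulated student who can handle tasks easier than difficulty 1.2:
# the test quickly homes in on difficulties around that level.
print(adaptive_test(lambda d: d < 1.2))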

Some theory (3) In multi-stage testing we give a student a sample of tasks. If they are a clear pass or fail, the test ends. Students near the pass mark are given another sample of tasks. If the combined sample gives a clear pass/fail decision, the test stops; if there is still too much uncertainty, another sample of tasks is given. Again, students who hover around the pass mark generally sit longer tests, to reduce the standard error of measurement.
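
A minimal sketch of the multi-stage idea: keep pooling samples of tasks until the estimate sits clearly on one side of the pass mark. The 60% pass mark, the stage size and the two-standard-error stopping rule are all assumptions for illustration.

```python
# Hedged illustration: pool successive samples until the pass/fail decision is clear.
import math

def multi_stage(stages, pass_mark=0.6, z=2.0):
    """Pool successive samples of 0/1 task scores until the decision is clear."""
    pooled = []
    for stage in stages:
        pooled += stage                          # combine all samples seen so far
        n = len(pooled)
        p = sum(pooled) / n
        se = math.sqrt(p * (1 - p) / n)
        if abs(p - pass_mark) > z * se:          # clearly above or below the pass mark: stop
            return ("pass" if p > pass_mark else "fail", n)
    return ("still uncertain", len(pooled))      # ran out of stages without a clear decision

stage1 = [1] * 13 + [0] * 7   # 65% on the first 20 tasks: too close to the 60% pass mark
stage2 = [1] * 18 + [0] * 2   # pooled score 77.5% on 40 tasks: now a clear pass
print(multi_stage([stage1, stage2]))
```

Viewed this way, a resit looks very like the second stage of a multi-stage test: another sample added to the first, rather than a fresh start.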

How should we treat resits?? Is a resit an independent sample? That’s how we (and everyone else) treat it. Or is a resit a second sample? Are we using it to increase the sample size?

Should we use prior information?? After the first test we have an indication that a student who fails has not mastered the content/tasks that we expect. Should we use that information when we assess the resit? If we should, how would it work?

How would it work? The student’s mark on the combined first attempt and resit is used to make the pass/fail decision.

How would it work? Students who narrowly fail on the first attempt would only have to improve slightly to pass. Students who fail badly on the first attempt would have to improve substantially to pass.
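
A minimal sketch of the pooled-mark rule suggested on the previous slide, assuming an equal-weight average of the two attempts and an illustrative 40% pass mark (neither the weighting nor the pass mark comes from the talk).

```python
# Hedged illustration: pass/fail on the resit judged on the pooled mark, not the resit alone.
def resit_decision(first_mark, resit_mark, pass_mark=40.0):
    """Pass/fail decision using the combined first attempt and resit."""
    combined = (first_mark + resit_mark) / 2  # equal-weight pooling of the two samples (an assumption)
    return "pass" if combined >= pass_mark else "fail"

# A narrow first-attempt failure (38%) only has to improve slightly (42%) to pass overall ...
print(resit_decision(38, 42))   # -> pass
# ... while a heavy failure (20%) does not pass even with 55% at the resit.
print(resit_decision(20, 55))   # -> fail
```

The behaviour matches the claim above: the further below the pass mark the first attempt was, the more the resit performance has to improve before the combined evidence supports a pass.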

Implications and discussion I need help! Your thoughts? Anyone know any literature on the ‘Theory of resits’???