How do item writing flaws (IWF), cognitive level and re-use of items (RI) affect the quality of multiple choice questions (MCQ) and students' performance?


Bjørn Mørkedal, Tobias Schmidt Slørdahl, Torstein Vik

Background
MCQs were introduced at our faculty in 2005. In the first four year classes, the end-of-year examination consists of 60% MCQs (in addition to 40% MEQs). As ongoing quality control, we assessed the 460 MCQs delivered in 2008.

MCQ type used in Trondheim
- Vignette-based A-type questions
- 3-5 alternatives, of which 1 is the best
- No penalty for wrong answers
(Slide diagram shows the item layout: vignette, question, alternatives.)

Measurements 1: Cognitive level
Items were classified according to Bloom's six levels of cognition: Knowledge, Understanding, Application, Analysis, Synthesis, Evaluation. On the slide these are grouped into K1 (Knowledge) and K2 (the higher levels, Understanding through Evaluation).

Measurements 2: Item writing flaws (IWF)
- Hand cover test
- Convergence
- Longest answer best
- Longest answer not best
- Word repetition
- Logical clues
- Vague terms («never», «always», etc.)
Two of these cues are illustrated in the sketch below.
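The cues "longest answer best" and vague/absolute terms lend themselves to a simple mechanical screen. The sketch below is purely illustrative, since the slide does not say the flaw review was automated; flag_iwf and ABSOLUTE_TERMS are hypothetical names.

```python
# Illustrative only: two IWF cues expressed as simple string checks.
ABSOLUTE_TERMS = {"never", "always", "all", "none"}

def flag_iwf(options: list[str], key_index: int) -> dict:
    """Flag the 'longest answer is the key' and 'absolute terms' cues."""
    longest = max(range(len(options)), key=lambda i: len(options[i]))
    return {
        "longest_answer_best": longest == key_index,
        "absolute_terms": any(
            term in option.lower().split()
            for option in options
            for term in ABSOLUTE_TERMS
        ),
    }

# Example: the keyed option (index 2) is also the longest alternative.
print(flag_iwf(["Aspirin", "Warfarin", "Low molecular weight heparin", "None of these"], 2))
```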

Measurements 3: Re-used items (RI)
- The faculty allows re-use of items for a proportion of the exam.
- Re-use was analysed objectively using plagiarism software, comparing the 2008 items with previous exams (see the sketch below).
- Items from previous examinations are publicly available to students.
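The plagiarism software is not named on the slide. As a minimal sketch of the underlying idea, the Python standard library's difflib can score pairwise text similarity between a 2008 item and items from earlier exams; the helper is_reused and the 0.9 threshold are illustrative assumptions, not the faculty's actual tool.

```python
from difflib import SequenceMatcher

def is_reused(item_2008: str, previous_items: list[str], threshold: float = 0.9) -> bool:
    """Return True if the item closely matches any item from a previous exam."""
    return any(
        SequenceMatcher(None, item_2008.lower(), old.lower()).ratio() >= threshold
        for old in previous_items
    )
```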

Analyses of outcome
- Item difficulty: the proportion of students with the correct answer.
- Discrimination index (DI): the ability of an item to distinguish less well performing from well performing students.
- Mean test score: equivalent to item difficulty for the entire exam.
This information was gathered after the examinations using item analysis (a sketch of the computation follows).
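A minimal sketch, assuming responses are scored into a 0/1 matrix (rows = students, columns = items). The upper/lower 27% split is one common convention for the discrimination index; the slide does not specify which formula the faculty used, and item_analysis is a hypothetical name.

```python
import numpy as np

def item_analysis(responses: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Classical item analysis on a 0/1 matrix (students x items)."""
    difficulty = responses.mean(axis=0)      # proportion answering each item correctly
    totals = responses.sum(axis=1)           # each student's raw score
    order = np.argsort(totals)               # students sorted from weakest to strongest
    k = max(1, int(0.27 * len(totals)))      # size of the upper and lower groups
    lower, upper = order[:k], order[-k:]
    # DI: difference in item difficulty between strong and weak students
    di = responses[upper].mean(axis=0) - responses[lower].mean(axis=0)
    return difficulty, di

# Demo with simulated data: 100 students, 5 items, roughly 70% answered correctly.
rng = np.random.default_rng(0)
demo = (rng.random((100, 5)) > 0.3).astype(int)
p, di = item_analysis(demo)
```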

Results 1
Adjusted R² of different item analysis indices in relation to the performance of items:

              Difficulty  Discriminating power
Re-used item  0.08        0.07
K2            0.01
Any IWF       0.00
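For reference, the adjusted R² reported here is the ordinary coefficient of determination corrected for model size; with n observations (items) and p predictors it is defined as

\[
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
\]

The slide does not state the underlying regression model, so this is given only as the standard definition of the statistic.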

Results 2
Proportion of re-used items by stage:

Stage    Re-used items  Total items  Proportion
1        38             120          32 %
2        27             120          23 %
3        12             100          12 %
4        17             120          14 %
Overall  94             460          20 %

Results 3
Student score by stage and RI status (P < 0.001):

         Total         New items     Re-used items
Stage    Score   SD    Score   SD    Score   SD
1        86.8    5.6   82.2    7.5   96.8    3.5
2        78.6    9.3   74.7   10.0   92.0    9.4
3        74.8    6.3   72.4    6.9   92.4    8.0
4        76.4    7.8   74.8    8.3   86.3    8.5
Overall  79.3    8.5   76.6    8.8   91.9

Score is the number of correct answers times 100, divided by the number of items. SD: standard deviation.
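As a worked example of the score definition: a student answering 96 of the 120 stage 1 items correctly scores 96 × 100 / 120 = 80.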

Conclusions
- IWFs and cognitive level had little effect on the students' performance.
- Re-use of items influenced the results significantly: students scored significantly higher on re-used items than on new items.
- Increasing the question bank or avoiding re-use of items is important for maintaining the high quality of MCQ examinations.

Backup-slide 1
Adjusted R² in relation to item performance (as in Results 1):

              Difficulty  Discriminating power
Re-used item  0.08        0.07
K2            0.01
Any IWF       0.00

IWF sub-analyses: number of IWFs, hand cover, convergence, short vignette, longest answer best (strict), longest answer not best (strict), longest answer best (Tobias), longest answer best (eyeballing), longest answer not best (eyeballing), word repeat, logical clue, "never", vague terms.

Backup-slide 2
Student score by stage and IWF status:

         Total         IWF items     Non-IWF items
Stage    Score   SD    Score   SD    Score   SD
1        86.8    5.6   85.4    6.9   87.5
2        78.6    9.3   77.9   10.3   78.9    9.4
3        74.8    6.3   77.8    6.8   72.6    7.5
4        76.4    7.8   73.9    9.6   78.0
Overall  79.3    8.5   79.2    9.1

Backup-slide 3
Proportion of items with one or more item writing flaws:

Stage    All items  New items  Re-used items
1        53 %       67 %       76 %
2        43 %       52 %       63 %
3        42 %       50 %
4        38 %       46 %       41 %
Overall  51 %

Backup-slide 4
Proportion of items with K2:

Stage    K2 items  Proportion
1          5        4 %
2          7        6 %
3         41       41 %
4         62       52 %
Overall  115       25 %