Developing a Departmental Examination for a Third Year Family Medicine Clerkship
Judy C. Washington, MD, Jesse Crosson, PhD, Chantal Brazeau, MD
UMDNJ-New Jersey Medical School

Educational Goals and Objectives
Participants attending this lecture-discussion will leave understanding:
- How to develop a departmental exam from an existing database
- How to apply the National Board of Medical Examiners standards to exam kit questions
- How to validate a test instrument using existing resources
- How the process of developing a standardized exam from an exam kit can be simplified with the assistance of a statistician

Our Team
Our team had no experience in designing a test, but…
- Dr. Washington took courses in curriculum design/evaluation and served as a content expert
- Dr. Brazeau has experience with standard setting with the NBME
- Dr. Crosson has statistical expertise, assisted in the analysis of the reports from the testing center, and instructed the team how to do the year-end summary analysis

Rationale
- Current SHELF Exam
  - Not relevant to the 20 common problems and other important concepts in Family Medicine
  - Well standardized
- Clerkship faculty need a reliable examination for the third year clerkship
- Using Sloane's Essentials of Family Medicine (4th edition), the Exam Kit could help solve the dilemma
- Used both exams during the transition (curved)

Developing the question database
- Categorizing and discarding questions
  - Predoc education committee (five faculty)
  - NBME standards were used
- Selecting the items
  - Developed a database of hard/easy questions
  - Rewrote or developed new questions
- Developing the test
  - Allowed the computer to select the questions
  - 60 hard / 40 easy, for a total of 100 questions (see the sketch after this slide)
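
Purely as an illustration of this selection step (this is not the testing center's actual software; the data structure and field names are assumptions), a computer-driven draw of 60 hard and 40 easy items from a categorized bank might look like the following Python sketch:

```python
import random

# Hypothetical item bank: each reviewed question is tagged "hard" or "easy".
item_bank = [
    {"id": 1, "difficulty": "hard", "stem": "A 45-year-old man presents with ..."},
    {"id": 2, "difficulty": "easy", "stem": "Which of the following ..."},
    # ... the rest of the reviewed questions
]

def assemble_exam(bank, n_hard=60, n_easy=40, seed=None):
    """Randomly draw the requested mix of hard and easy items and shuffle them."""
    rng = random.Random(seed)
    hard = [q for q in bank if q["difficulty"] == "hard"]
    easy = [q for q in bank if q["difficulty"] == "easy"]
    if len(hard) < n_hard or len(easy) < n_easy:
        raise ValueError("item bank does not hold enough questions of each type")
    exam = rng.sample(hard, n_hard) + rng.sample(easy, n_easy)
    rng.shuffle(exam)  # interleave hard and easy items in the delivered order
    return exam
```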

NBME Standards
- Testwiseness: avoid absolute terms, grammatical cues, and a long correct answer
- Irrelevant difficulty: avoid complex options, "none of the above," tricky stems, and vague terms
- Other guidelines: avoid negatively phrased items and trivial facts; a long stem with short options is best; test important concepts

Validating the Exam
- Administered the first version in July 2002
  - 100 questions
  - Students were responsible for the entire text
  - Students continued to take the SHELF
- Discarded non-discriminating questions (too easy/too hard)
- Calculated the reliability coefficient
  - Calculated by our testing center; Dr. Crosson determined the reliability of the number

Reliability Coefficient and Discrimination Index
- Reliability coefficient
  - The extent to which the test is likely to produce consistent scores
  - Types: inter-correlations between items, length and content of the test
  - Ranges from 0 (no reliability) to 1.00 (perfect reliability)
- Discrimination index
  - Difference between the % correct in the upper and lower groups
- Point biserial correlation
  - Correlation between examinees' performance on the item (right or wrong) and total test score
- Mean: average score
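
To make these definitions concrete, here is a small illustrative Python sketch with made-up data (not our students' scores). The slide does not say which reliability formula the testing center used, so KR-20 is shown only as one common choice for dichotomously scored items:

```python
# Illustrative only: item statistics from a 0/1 response matrix
# (hypothetical data; one row per examinee, one column per item).
import numpy as np

responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
])
total = responses.sum(axis=1)            # each examinee's total score

# Discrimination index: % correct in the top-scoring group minus
# % correct in the bottom-scoring group (simple top/bottom halves here).
order = np.argsort(total)
half = len(order) // 2
lower, upper = order[:half], order[-half:]
discrimination = responses[upper].mean(axis=0) - responses[lower].mean(axis=0)

# Point-biserial correlation: correlation between the right/wrong (1/0)
# response on each item and the total test score.
point_biserial = np.array([
    np.corrcoef(responses[:, i], total)[0, 1] for i in range(responses.shape[1])
])

# KR-20, one common reliability coefficient for dichotomously scored tests
# (0 = no reliability, 1.00 = perfect reliability).
p = responses.mean(axis=0)               # proportion correct per item
k = responses.shape[1]                   # number of items
kr20 = (k / (k - 1)) * (1 - (p * (1 - p)).sum() / total.var())

print("discrimination:", discrimination)
print("point-biserial:", point_biserial)
print("KR-20 reliability:", round(kr20, 2))
```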

Validating the Exam
- The fourth version of the exam (year 1) had a high reliability coefficient
  - Reliability coefficient was .85
  - Lowest grade was 56 and highest grade was 89, with a mean of 72
  - Test was curved to the mean during 2002-2003 but not 2003-2004
- Administered the same exam January-June 2003 (year 1) and July 2003-June 2004 (year 2)
- Cumulative reliability coefficient was .78 (.8 to .9 was our goal)

Challenges
- Two exams were viewed as jumping through hoops
  - Not using the SHELF in 2004-2005
- A significant number of students failed (18)
- The departmental exam was viewed as not representative of what was covered
  - The exam now covers fewer chapters that include the 20 common problems

Challenges
- Creating an exam that was felt to be more representative
  - We reviewed the old exam
  - Discarded questions
  - Used questions from the database that we created and from the MUSC site
- Validating a new exam
  - Decided to use the Modified Angoff Procedure
  - Convened faculty to review the exam and calculate the score

Modified Angoff Procedure
- A group of experts discusses the characteristics of a "borderline" examinee
- For each item on the test, judges estimate the percentage of borderline examinees who would answer the item correctly
- The pass/fail standard is the average of these percentages across the items
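
A hypothetical worked example of that calculation (the judges' numbers below are invented for illustration, not our faculty's actual estimates):

```python
# Hypothetical illustration: three judges rate five items, estimating the
# percentage of "borderline" examinees who would answer each item correctly.
judge_estimates = [
    [70, 55, 80, 65, 90],   # Judge 1, items 1-5
    [65, 60, 75, 70, 85],   # Judge 2
    [75, 50, 80, 60, 95],   # Judge 3
]

# Average each item across judges, then average across items: that overall
# mean is the Angoff pass/fail standard on the percent-correct scale.
item_means = [sum(col) / len(col) for col in zip(*judge_estimates)]
cut_score = sum(item_means) / len(item_means)
print(item_means)   # [70.0, 55.0, 78.33..., 65.0, 90.0]
print(cut_score)    # ~71.7, the passing standard for this made-up exam
```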

Angoff Scores

Challenges
- The present exam
  - Mean score is the average of the Angoff scores: 71.97
  - Curved to 81.61 (based on the best rotation from last year)
  - Reliability coefficient is lower: 0.66
  - 4 students failed
- Making changes to improve reliability
  - Reconvened faculty to review the 19 questions that are too easy
  - Recalculate the Angoff for these questions and the mean of the test, or…
  - Rewrite the questions, calculate the Angoff, and then the mean of the test
- Next exam to be given February 11, 2005

Discussion
- Questions about the process?
- What resources do you already have to allow you to do this process?
  - Could your testing center provide the analysis?
  - Do you have a statistician?
- Do you find students complain about the content of the exam?
  - Addressing this helps make the exam valid, especially when you have to report to the Dean of Education
- Is anyone using a similar process?

Resources
Constructing Written Test Questions for the Basic and Clinical Sciences, Section IV. National Board of Medical Examiners. http://www.nbme.org/PDF/2001iwgsec4.pdf