Standard Setting Zagreb, July 2009.

Similar presentations
A Systems Approach To Training
Knowledge Dietary Managers Association 1 PART II - DMA Certification Exam Blueprint and Exam Development-
Principles of Standard Setting
Standardized Scales.
Copyright © 2012 Pearson Education, Inc. or its affiliate(s). All rights reserved
Spiros Papageorgiou University of Michigan
MAKING APPROPRIATE PASS-FAIL DECISIONS. DWIGHT HARLEY, Ph.D., DIVISION OF STUDIES IN MEDICAL EDUCATION, UNIVERSITY OF ALBERTA.
© McGraw-Hill Higher Education. All rights reserved. Chapter 3 Reliability and Objectivity.
Comprehensive Models of Formative Assessment. A theory of formative assessment.
Standardized Tests What They Measure How They Measure.
Advanced Topics in Standard Setting. Methodology Implementation Validity of standard setting.
1 New England Common Assessment Program (NECAP) Setting Performance Standards.
Setting Performance Standards Grades 5-7 NJ ASK NJDOE Riverside Publishing May 17, 2006.
Presented by Denise Sibley Laura Jean Kerr Mississippi Assessment Center Research and Curriculum Unit.
Presented at the 2006 CLEAR Annual Conference September Alexandria, Virginia Something from Nothing: Limitations of Diagnostic Information in a CAT.
New Hampshire Enhanced Assessment Initiative: Technical Documentation for Alternate Assessments Standard Setting Inclusive Assessment Seminar Marianne.
Item/Person I1 I2 I3: A 4 4 1; B 3 2 3; C 2 3 2; D 1 1 2. Item I1 I2 I3: A(h) 1 1 0; B(h) 1 1 0; C(l) 0 1 1; D(l) 0 0 0. Item Variance: Rank ordering of individuals. P*Q for dichotomous items.
Standard Setting Different names for the same thing Standard Passing Score Cut Score Cutoff Score Mastery Level Bench Mark.
Setting Alternate Achievement Standards Prepared by Sue Rigney U.S. Department of Education NCEO Teleconference March 21, 2005.
June 23, 2003 Council of Chief State School Officers What Does “Proficiency” Mean for Students with Cognitive Disabilities Dr. Ron Cammaert Riverside Publishing.
Examining Rounding Rules in Angoff-Type Standard Setting Methods Adam E. Wyse Mark D. Reckase.
Wastewater Treatment Plant Operator Exam Setting Performance Standards With The Modified Angoff Procedure.
Standardized Test Scores Common Representations for Parents and Students.
Problem solving in project management
Standard Setting Methods with High Stakes Assessments Barbara S. Plake Buros Center for Testing University of Nebraska.
The BILC BAT: A Research and Development Success Story Ray T. Clifford BILC Professional Seminar Vienna, Austria 11 October.
Standardization and Test Development Nisrin Alqatarneh MSc. Occupational therapy.
1 Establishing A Passing Standard Paul D. Naylor, Ph.D. Psychometric Consultant.
1 An Introduction to Language Testing. Fundamentals of Language Testing. Dr Abbas Mousavi American Public University.
Chapter 7 Item Analysis In constructing a new test (or shortening or lengthening an existing one), the final set of items is usually identified through.
 Closing the loop: Providing test developers with performance level descriptors so standard setters can do their job Amanda A. Wolkowitz Alpine Testing.
Standardizing Testing in NATO Peggy Garza and the BAT WG Bureau for International Language Co-ordination.
Assessment in Education Patricia O’Sullivan Office of Educational Development UAMS.
NRTs and CRTs Group members: Camila, Ariel, Annie, William.
Assessment Training Nebo School District. Assessment Literacy.
Employing Empirical Data in Judgmental Processes Wayne J. Camara National Conference on Student Assessment, San Diego, CA June 23, 2015.
STANDARD SETTING Prepared by Ludmila Kozhevnikova and Viktoria Levchenko Based on material by Anthony Green.
Cut Points ITE Section One. What are Cut Points?
Standard Setting Results for the Oklahoma Alternate Assessment Program Dr. Michael Clark Research Scientist Psychometric & Research Services Pearson State.
Educator’s view of the assessment tool. Contents Getting started Getting around – creating assessments – assigning assessments – marking assessments Interpreting.
Benchmark Advisory Test (BAT) Update BILC Conference Athens, Greece Dr. Ray Clifford and Dr. Martha Herzog June 2008.
Using the Many-Faceted Rasch Model to Evaluate Standard Setting Judgments: An Illustration With the Advanced Placement Environmental Science Exam Pamela.
How was LAA 2 developed?  Committee of Louisiana educators (general ed and special ed) Two meetings (July and August 2005) Facilitated by contractor.
Assessment Assessment is the collection, recording and analysis of data about students as they work over a period of time. This should include, teacher,
RelEx Introduction to the Standardization Phase Relating language examinations to the Common European Framework of Reference for Languages Gilles Breton.
Setting Performance Standards EPSY 8225 Cizek, G.J., Bunch, M.B., & Koons, H. (2004). An NCME Instructional Module on Setting Performance Standards: Contemporary.
Chapter 11: Effective Grading in Physical Education.
EVALUATING EPP-CREATED ASSESSMENTS
Jean-Guy Blais Université de Montréal
Introduction to the Specification Phase
CLEAR 2011 Annual Educational Conference
ECML Colloquium2016 The experience of the ECML RELANG team
Assessments for Monitoring and Improving the Quality of Education
Introduction to the Validation Phase
ARDHIAN SUSENO CHOIRUL RISA PRADANA P.
Types of Tests.
How Psychologists Ask and Answer Questions Statistics Unit 2 – pg
Introduction to the Validation Phase
The All-important Placement Cut Scores
Next-Generation MCAS: Update and review of standard setting
NWEA Measures of Academic Progress (MAP)
RELATING NATIONAL EXTERNAL EXAMINATIONS IN SLOVENIA TO THE CEFR LEVELS
Criterion Referencing Judges Who are the best predictors?
Calculating Reliability of Quantitative Measures
Consistency and Reliability in Rating Student’s Work
Setting Cutoff Scores for Legal Defensibility
From Learning to Testing
Basic Statistics for Non-Mathematicians: What do statistics tell us
Deanna L. Morgan The College Board
Presentation transcript:

Standard Setting Zagreb, July 2009

The Problem:
Setting cut scores for criterion-referenced tests
Not having many test takers
Having no concurrent validity

What is a standard setting process? It is the step in which the policy descriptions and the elaborated descriptions are transformed into a different language: the numerical language of the test score. There is an intention in the standard that is specified by the policy of the agency (Reckase, 2009).

Purpose
The purpose of the standard setting process must be made clear. We must be clear about what we want to achieve in making pass/fail decisions: do we want to exclude or to include, for example?

Procedure
Select a large and representative panel
Choose a standard setting method
Be sure to have good descriptors of the performance categories
Train participants to use the descriptors and the standard setting method
Compile item ratings from the participants
Facilitate discussion among participants about the first round of ratings, individual feedback and the results of pretesting
Conduct a new round of item ratings
Let participants review again, then arrive at the final recommended cut score
Assemble documentation of the standard setting process

About the procedure
Three types of feedback to participants after the first round:
Normative: how their own ratings compare with those of the other raters (mean, standard deviation, etc.)
Reality: how the items actually worked in the test population (which, of course, cannot yet be split into masters and non-masters, since the cut score has not been set)
Impact: what effect the cut score would have on later test takers
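A minimal sketch, in Python, of how these three kinds of feedback could be computed from first-round data. The function and variable names (feedback_after_round_one, item_pvalues, provisional_cut) are illustrative assumptions, not part of the original workshop materials:

```python
from statistics import mean, stdev

def feedback_after_round_one(ratings, item_pvalues, provisional_cut, pretest_scores):
    """Illustrative feedback for panellists; all inputs are hypothetical.

    ratings         : dict mapping rater name -> list of item ratings (0-100)
    item_pvalues    : observed proportion-correct per item from pretesting
    provisional_cut : cut score implied by the round-one ratings (percent)
    pretest_scores  : total scores of the pretest population (percent)
    """
    # Normative feedback: how each rater's mean rating compares with the panel.
    rater_means = {rater: mean(r) for rater, r in ratings.items()}
    panel_mean = mean(rater_means.values())
    panel_sd = stdev(rater_means.values())
    normative = {rater: m - panel_mean for rater, m in rater_means.items()}

    # Reality feedback: how the items actually worked in the pretest population.
    # (No master/non-master split is possible yet: the cut score is not set.)
    reality = item_pvalues

    # Impact feedback: the share of test takers who would pass if the
    # provisional cut score were adopted.
    impact = sum(s >= provisional_cut for s in pretest_scores) / len(pretest_scores)

    return {"panel_mean": panel_mean, "panel_sd": panel_sd,
            "normative": normative, "reality": reality, "impact": impact}
```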

The Benchmark Advisory Test (BAT) and Standard Setting
What is the BAT?
Angoff's footnote suggests that one might "ask each judge to state the probability that the minimally acceptable person would answer each item correctly" (meaning, of course, a minimally competent examinee).
During the first round of rating we answered two questions: What level is this item, and how many people at the threshold level would get this item right?
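A small sketch of how one rater's answers to those two first-round questions might be recorded together; the class name, field names and level label are hypothetical, not taken from the BAT materials:

```python
from dataclasses import dataclass

@dataclass
class FirstRoundJudgment:
    """One rater's answers to the two first-round questions for one item."""
    item_id: str                     # identifier of the test item (illustrative)
    estimated_level: str             # "What level is this item?"
    pct_correct_at_threshold: float  # "How many people at the threshold level
                                     #  would get this item right?" (percent)

# Example: a rater places item 'Q01' at an assumed level label and expects
# 70% of threshold-level examinees to answer it correctly.
judgment = FirstRoundJudgment(item_id="Q01",
                              estimated_level="Level 2",
                              pct_correct_at_threshold=70.0)
```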

Our modified, modified Angoff (example from the last round of rating)
Estimate the correct response rate for each item for examinees at three proficiency levels:
At one level below the target level
At the target level
At one level above the target level
The value entered may range from about 25% to 100%:
25% = the chance of a random response being correct
100% = no chance of answering incorrectly, even due to a lapse in attention or any other reason
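A sketch, under the assumptions stated on this slide, of how one rater's entries could be collected and kept within the 25% to 100% range; the function name, item IDs and level labels are illustrative only:

```python
# The three proficiency levels a rater judges each item against.
LEVELS = ("one level below target", "target level", "one level above target")

def record_rating(sheet, item_id, level, estimate):
    """Store a rater's estimated correct-response rate (in percent) for one item.

    Values outside roughly 25-100 are rejected: 25% is the chance of a random
    response being correct, and 100% means no chance of answering incorrectly,
    even through a lapse in attention.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown proficiency level: {level}")
    if not 25 <= estimate <= 100:
        raise ValueError("estimate must lie between about 25% and 100%")
    sheet.setdefault(item_id, {})[level] = estimate
    return sheet

# Example: one rater judges item 'Q01' for examinees at the target level.
sheet = {}
record_rating(sheet, "Q01", "target level", 70)
```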

The process was iterative:
Three rounds of ratings, one via the internet
The mean and standard deviation of the ratings were calculated (each rater could see his/her own deviation from the mean)
The final passing score was calculated by averaging the rater means
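A worked sketch of that calculation with invented numbers (these are not the actual BAT ratings):

```python
from statistics import mean, stdev

# Hypothetical final-round ratings (expected percent correct for a
# threshold-level examinee) from four raters on five items.
ratings = {
    "rater A": [60, 70, 55, 80, 65],
    "rater B": [65, 75, 50, 85, 70],
    "rater C": [55, 65, 60, 75, 60],
    "rater D": [70, 80, 55, 90, 75],
}

rater_means = {rater: mean(r) for rater, r in ratings.items()}
panel_mean = mean(rater_means.values())
panel_sd = stdev(rater_means.values())

# Each rater can see how far their own mean sits from the panel mean.
deviations = {rater: m - panel_mean for rater, m in rater_means.items()}

# Final passing score: the average of the rater means.
# With these invented numbers it comes out to 68.0%.
print(f"Recommended passing score: {panel_mean:.1f}% (panel SD {panel_sd:.1f})")
```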

Bibliography
Reckase, Mark D.: Standard Setting Theory and Practice: Issues and Difficulties. In: Figueras, N. and Noijons, J. (eds.), Linking the CEFR Levels: Research Perspectives. Arnhem: Cito/EALTA, 2009.
Cizek, Gregory J. and Bunch, Michael B.: Standard Setting. Sage Publications, 2007.