Introduction to the Validation Phase

Introduction to the Validation Phase Relating language examinations to the Common European Framework of Reference for Languages José Noijons APEOICVA/ECML Valencia, 27-28 March 2009

Suggested linking procedures in the Manual:
- Familiarisation with the CEFR
- Linking on the basis of specification of examination content
- Standardisation and benchmarking
- Standard setting
- Validation: checking that exam results relate to CEFR levels as intended

What is validity? Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of a test. Classical models divided the concept into various "validities", such as content validity, criterion validity and construct validity, but the modern view is that validity is a single unitary construct (Wikipedia). In simpler terms: does the test measure what it intends to measure?

Aspects of validity:
- Content validity
- Operational validity: pilots and pretests
- Psychometric aspects
- Procedural validity of standardisation
- Internal validity of standard setting
- External validation

Content validity
- Does the test accurately reflect the syllabus on which it is based AND reflect the descriptors in the CEFR?
- Does the content specification cover all areas to be assessed, in suitable proportions?

Quality criteria for items
An item must be:
- relevant
- at the intended level
- aimed at a specific objective
- acceptable
- transparent
- efficient
- in correct language
- in a clear layout
These criteria contribute to the validity and reliability of a test. The criteria are explained below.

Quality criteria for items: relevance
- Is the item addressing the intended knowledge or skill at the intended CEFR level?
- Does the item avoid testing knowledge and abilities other than the intended ones (e.g. reading skill, general intelligence, knowledge of grammar)?
How to realise this:
- Refer to specific CEFR descriptors for each skill
- Use a test matrix (or test grid)
- Relate each question to the purpose of the test
- Write items that are recognisable for the student

Quality criteria for items: at the intended CEFR level
- Is the question appropriate for the CEFR level?
- Does the question discriminate correctly between those who have reached the level and those who have not?
How to realise this:
- Have experts work in a team
- Assign screeners
- Avoid manipulating the wording to influence the level of difficulty
- Avoid unnecessary information
- Use data analysis (through pretesting or afterwards)
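The data analysis mentioned above can be illustrated with a classical item statistic. This is a hypothetical sketch (not part of the presentation): an upper-lower discrimination index computed from pretest data, showing whether an item separates high-scoring from low-scoring candidates. The function name and all figures are invented for illustration.

```python
# Illustrative sketch: does an item discriminate between stronger and
# weaker candidates? The index is the difference in facility (proportion
# correct) between the top- and bottom-scoring groups of candidates.

def discrimination_index(item_scores, total_scores, fraction=0.27):
    """p(correct | top group) - p(correct | bottom group)."""
    ranked = sorted(zip(total_scores, item_scores), key=lambda p: p[0])
    n = max(1, int(len(ranked) * fraction))       # group size (27% is customary)
    bottom = [item for _, item in ranked[:n]]     # lowest total scores
    top = [item for _, item in ranked[-n:]]       # highest total scores
    return sum(top) / n - sum(bottom) / n

# Invented example: 10 candidates; the item is answered correctly
# mostly by candidates with high total scores.
item   = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
totals = [12, 14, 15, 16, 18, 22, 24, 25, 27, 29]
print(round(discrimination_index(item, totals), 2))
```

A value near 0 (or negative) would flag the item for revision; a clearly positive value supports keeping it.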

Quality criteria for items: transparency
- Does the student know how many answers, details or arguments are expected?
- Does the item relate to what students expect (syllabus, preparatory work)?
- Do students know the maximum score for the item?
- Is the item format known to the students?
How to realise this:
- Use clear instructions for the student
- Use clear terminology, in line with the syllabus and related tests
- Indicate the maximum score for each item
- Use item formats students have been acquainted with

Validation of standard setting
Procedural validity: has the standard-setting procedure had the intended effects? Was the training effective? Did the judges feel free to follow their own insights?
Internal validity: are the judges' judgments to be trusted? Are judges consistent within themselves? Are judges consistent with each other? Is the aggregated standard to be considered the definitive standard? These questions and their answers constitute the internal validity of the standard setting.
Empirical (external) validity: are the results of the standard setting - allocating students to a CEFR level on the basis of their test score - trustworthy? The basic answer to this question comes from independent evidence which corroborates the results of a particular standard setting.
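The judge-consistency questions above can be made concrete with two simple statistics. This is a minimal sketch under invented data, not the procedure used in the presentation: the spread of the judges' recommended cut scores indicates inter-judge agreement, and the standard error of the mean indicates how stable the aggregated standard is.

```python
# Hypothetical internal-validity check on a standard-setting panel:
# how much do judges' recommended cut scores agree, and how stable is
# the aggregated (mean) cut score? All values are invented.
from math import sqrt
from statistics import mean, stdev

judge_cuts = [23, 25, 24, 26, 22, 25]   # each judge's recommended cut score

cut = mean(judge_cuts)                  # aggregated standard
spread = stdev(judge_cuts)              # inter-judge variability
sem = spread / sqrt(len(judge_cuts))    # standard error of the aggregated cut

print(f"cut={cut:.2f}, judge SD={spread:.2f}, SE of cut={sem:.2f}")
```

A large spread relative to the score scale would suggest the training or the judging procedure needs attention before the aggregated standard is adopted.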

Equivalence and equating
How to make sure that exams are equal:
- across sittings
- across years
- across languages
- regarding content
- regarding CEFR level

Maintaining equivalence (1)
How to make sure that exams are equal across sittings, years and languages?
Starting point:
- a syllabus approved by the Generalitat
- a global description, per subject, of the content of the exam

Maintaining equivalence (2)
Regarding content and CEFR level: through a content specification in an exam model:
- number of items and examination time
- item format: multiple choice, open items
- knowledge, skills, etc.
- domains / topics
- test grid
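The test grid mentioned above can be represented as a small data structure. This is an assumed sketch with invented domains and figures, not the actual exam model: each row fixes the item count, format and timing for one domain, so that successive exam forms are built to the same specification.

```python
# Hypothetical test grid: one row per domain, fixing item format,
# number of items and allotted minutes. A new exam form is checked
# against the same totals to keep forms equivalent in content.
test_grid = [
    # (domain/topic,         item format, items, minutes)
    ("Personal life",        "MC",        8,     10),
    ("Work and study",       "MC",        7,     10),
    ("Public announcements", "open",      5,     15),
    ("Media texts",          "open",      5,     15),
]

total_items = sum(row[2] for row in test_grid)
total_minutes = sum(row[3] for row in test_grid)
print(total_items, total_minutes)
```

Any new form whose rows do not reproduce these totals (and proportions) would fail the content-equivalence check.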

Maintaining equivalence (3)
How to realise equivalence in difficulty? Assuming that the overall achievements of the groups of candidates in subsequent years are comparable:
- adaptation of standards
- equalising the mean score
- equalising the percentage of pass/fail
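The two adjustments named above can be sketched in a few lines. This is a minimal illustration under invented score data, assuming (as the slide does) that the candidate groups are comparable across years: mean equating shifts the cut score by the difference in group means, and the pass-rate method chooses the new cut so the same proportion of candidates passes.

```python
# Hypothetical sketch of two simple equating adjustments between an old
# and a new exam form, given comparable candidate groups. All scores
# and the cut score are invented.
from statistics import mean

old_scores = [35, 42, 47, 51, 55, 58, 60, 64, 68, 72]
new_scores = [38, 44, 50, 53, 57, 61, 63, 66, 70, 75]
old_cut = 55

# (1) Mean equating: move the cut by the shift in the group mean.
new_cut_mean = old_cut + (mean(new_scores) - mean(old_scores))

# (2) Equal pass rate: keep the same proportion of candidates above the cut.
old_pass_rate = sum(s >= old_cut for s in old_scores) / len(old_scores)
ranked = sorted(new_scores)
k = round(len(ranked) * (1 - old_pass_rate))  # index of first passing score
new_cut_rate = ranked[k]

print(new_cut_mean, old_pass_rate, new_cut_rate)
```

Both methods only hold under the comparability assumption stated above; if the groups differ in ability, a proper equating design (e.g. with anchor items) is needed instead.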