Introduction to the Validation Phase

Introduction to the Validation Phase
Relating language examinations to the Common European Framework of Reference for Languages
Gábor Szabó
ECML ClassRelEx Workshop, Graz, 24-26 November 2010

Suggested Linking Procedures in the Manual
Familiarization with the CEFR
Linking on the basis of specification of examination content
Standardization and benchmarking
Standard setting
Validation: checking that exam results relate to CEFR levels as intended

What is validity?
Does the test measure what it intends to measure?
The degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests.
Traditional classification of validity:
Content validity
Construct validity
Criterion-related validity
Face validity
More modern approach: validity seen as a single unitary construct

Aspects of validity
Content validity
Operational validity: pilots and pretests
Psychometric aspects
Procedural validity of standardization
Internal validity of standard setting
External validation

Content validity
Does the test accurately reflect both the syllabus on which it is based and the descriptors in the CEFR?
Does the content specification cover all areas to be assessed, in suitable proportions?

Operational validity
Do pilot populations accurately represent the target population of the test?
Is the pilot test takers' performance representative of their true ability? (response validity)
These concepts are explained in what follows.

Psychometric aspects
Do the test's psychometric qualities support validity claims?
CTT-based results:
Test-level data: reliability figures; mean, mode, median; standard deviation; measurement error; score distribution
Item-level data: facility values; discrimination indices
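The two item-level statistics named above can be sketched in a few lines of code. This is a minimal illustration with invented response data (not from the presentation): the facility value is the proportion of test takers answering an item correctly, and the discrimination index is computed here as a point-biserial correlation between the item score and the rest-of-test total.

```python
def facility(item_scores):
    """Facility value: proportion of test takers answering the item correctly."""
    return sum(item_scores) / len(item_scores)

def point_biserial(item_scores, total_scores):
    """Discrimination: correlation between a 0/1 item score and the
    total score on the remaining items (rest-of-test total)."""
    n = len(item_scores)
    rest = [t - i for t, i in zip(total_scores, item_scores)]
    mean_i = sum(item_scores) / n
    mean_r = sum(rest) / n
    cov = sum((i - mean_i) * (r - mean_r)
              for i, r in zip(item_scores, rest)) / n
    sd_i = (sum((i - mean_i) ** 2 for i in item_scores) / n) ** 0.5
    sd_r = (sum((r - mean_r) ** 2 for r in rest) / n) ** 0.5
    return cov / (sd_i * sd_r)

# Toy response matrix: rows = test takers, columns = items (1 = correct)
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
]
totals = [sum(row) for row in responses]
item1 = [row[0] for row in responses]
print(facility(item1))                    # 0.8
print(point_biserial(item1, totals))
```

A low facility value flags an item most candidates miss; a discrimination index near zero (or negative) flags an item that does not separate stronger from weaker candidates.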

Psychometric aspects (continued)
Do the test's psychometric qualities support validity claims?
IRT-based results:
Item difficulty figures
Person ability figures
Fit statistics: items, persons
DIF (Differential Item Functioning; item bias)
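To show how IRT places item difficulty and person ability on a common scale, here is a minimal sketch of the Rasch model's response probability, with illustrative parameter values that are not part of the presentation:

```python
import math

def rasch_p(ability, difficulty):
    """Rasch model: probability that a person of the given ability
    answers an item of the given difficulty correctly. Both parameters
    are expressed in logits on the same scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# When ability equals difficulty, the expected success rate is 50%.
print(rasch_p(0.0, 0.0))                          # 0.5
# A more able person has a higher success probability on the same item.
print(rasch_p(1.0, 0.0) > rasch_p(-1.0, 0.0))     # True
```

Fit statistics then compare observed response patterns with the probabilities this model predicts, flagging items or persons that behave unexpectedly.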

Procedural validity of standardization
Has the standard-setting procedure had its intended effects?
Was the training effective?
Did the judges feel free to follow their own insights?

Internal validity of standard setting
Are the judges' judgments to be trusted?
Are judges consistent within themselves?
Are judges consistent with each other?
Is the aggregated standard to be considered the definitive standard?
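One simple way to check whether judges are consistent with each other can be sketched as the proportion of exact agreement between two judges' level assignments. The ratings below are hypothetical, purely for illustration:

```python
def exact_agreement(judge_a, judge_b):
    """Proportion of performances that two judges place at the same level."""
    matches = sum(1 for a, b in zip(judge_a, judge_b) if a == b)
    return matches / len(judge_a)

# Hypothetical CEFR level assignments for five performances
judge1 = ["B1", "B2", "B1", "A2", "B2"]
judge2 = ["B1", "B2", "B2", "A2", "B2"]
print(exact_agreement(judge1, judge2))   # 0.8
```

In practice, exact agreement is usually complemented by statistics that correct for chance agreement (e.g. kappa) or by many-facet Rasch analysis of judge severity.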

External validation
Establishing the validity of a test in relation to an external point of reference (the CEFR):
Correlation analysis
Validation of standardization
Teacher judgments
Application of anchor tests
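The correlation analysis mentioned above can be sketched as correlating candidates' test scores with an external criterion such as teachers' CEFR judgments, with the levels coded numerically. The data and coding below are hypothetical, chosen only to illustrate the computation:

```python
def pearson(x, y):
    """Pearson correlation coefficient between two lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sd_x = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sd_y = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical data: test scores and teachers' CEFR judgments,
# with levels coded numerically (A2 = 2, B1 = 3, B2 = 4).
level_code = {"A2": 2, "B1": 3, "B2": 4}
test_scores = [41, 55, 62, 48, 70]
teacher_judgments = ["A2", "B1", "B2", "B1", "B2"]
r = pearson(test_scores,
            [level_code[lvl] for lvl in teacher_judgments])
print(round(r, 2))
```

A strong positive correlation would support the claim that the exam's scores order candidates in the same way as the external CEFR-based criterion; since the teacher judgments are ordinal, a rank-based coefficient such as Spearman's rho is often preferred in practice.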