Introduction to the Validation Phase. Relating Language Examinations to the Common European Framework of Reference for Languages (CEFR). Gábor Szabó, ECML ClassRelEx Workshop, Graz, 24-26 November 2010.
Suggested linking procedures in the Manual: familiarization with the CEFR; linking on the basis of specification of examination content; standardization and benchmarking; standard setting; validation (checking that exam results relate to CEFR levels as intended).
What is validity? Does the test measure what it is intended to measure? Validity is the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. Traditional classification of validity: content validity; construct validity; criterion-related validity; face validity. More modern approach: validity is seen as a single unitary construct.
Aspects of validity: content validity; operational validity (pilots and pretests); psychometric aspects; procedural validity of standardization; internal validity of standard setting; external validation.
Content validity: Does the test accurately reflect both the syllabus on which it is based and the descriptors in the CEFR? Does the content specification reflect all areas to be assessed in suitable proportions?
Operational validity: Do pilot populations accurately represent the target population of the test? Is the pilot test takers’ performance representative of their true ability (response validity)? The concepts are explained below.
Psychometric aspects: Do the test’s psychometric qualities support validity claims? Results based on classical test theory (CTT). Test-level data: reliability figures; mean, mode, median; standard deviation; measurement error; score distribution. Item-level data: facility values; discrimination indices.
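The CTT figures listed on this slide can be computed from a matrix of scored responses. The Python sketch below is illustrative only: the response data are invented, the function names are hypothetical, reliability is assumed to mean Cronbach's alpha, and discrimination is assumed to mean the corrected item-total correlation. The slide itself does not specify either choice.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

# Hypothetical pilot data: one row per test taker, one column per item (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def facility_values(data):
    """Proportion of test takers answering each item correctly."""
    return [mean(row[i] for row in data) for i in range(len(data[0]))]


def discrimination_indices(data):
    """Correlation of each item with the total score on the remaining items."""
    indices = []
    for i in range(len(data[0])):
        item = [row[i] for row in data]
        rest = [sum(row) - row[i] for row in data]
        indices.append(pearson(item, rest))
    return indices


def cronbach_alpha(data):
    """Cronbach's alpha, an assumed choice of reliability coefficient."""
    k = len(data[0])
    item_variances = [pvariance([row[i] for row in data]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def standard_error_of_measurement(data):
    """SEM = SD of total scores * sqrt(1 - reliability)."""
    totals = [sum(row) for row in data]
    return pstdev(totals) * sqrt(1 - cronbach_alpha(data))


print("Facility values:", facility_values(responses))
print("Discrimination:", discrimination_indices(responses))
print("Alpha:", cronbach_alpha(responses))
print("SEM:", standard_error_of_measurement(responses))
```

With real pilot data, items whose facility value is 0 or 1 would have to be excluded before computing discrimination, since the correlation is undefined for a constant item.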
Psychometric aspects: Do the test’s psychometric qualities support validity claims? Results based on item response theory (IRT): item difficulty figures; person ability figures; fit statistics (for items and for persons); DIF (differential item functioning, i.e. item bias).
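To make the relationship between item difficulty, person ability, and fit concrete, the Python sketch below assumes the Rasch (one-parameter logistic) model, which the slide does not name. The ability and difficulty values are invented and are assumed to have been estimated beforehand by dedicated IRT software; the function names are hypothetical.

```python
from math import exp


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model (both values in logits)."""
    return 1.0 / (1.0 + exp(-(ability - difficulty)))


def item_outfit_mean_square(item_responses, abilities, difficulty):
    """Outfit mean square for one item: mean squared standardized residual over persons."""
    squared_residuals = []
    for observed, ability in zip(item_responses, abilities):
        expected = rasch_probability(ability, difficulty)
        variance = expected * (1.0 - expected)
        squared_residuals.append((observed - expected) ** 2 / variance)
    return sum(squared_residuals) / len(squared_residuals)


# Hypothetical estimates for five test takers and one item.
abilities = [-1.5, -0.5, 0.0, 0.8, 2.0]
item_difficulty = 0.2
item_responses = [0, 0, 1, 1, 1]

print(rasch_probability(0.8, item_difficulty))
print(item_outfit_mean_square(item_responses, abilities, item_difficulty))
```

Outfit values close to 1 indicate responses that match the model's expectations; how far from 1 is acceptable is a matter of convention and is not fixed by this sketch.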
Procedural validity of standardization: Did the standard-setting procedure have the intended effects? Was the training effective? Did the judges feel free to follow their own insights?
Internal validity of standard setting: Can the judges’ judgments be trusted? Are judges consistent within themselves? Are judges consistent with each other? Can the aggregated standard be regarded as the definitive standard?
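The consistency questions above can be given a numerical form. The Python sketch below assumes an Angoff-style procedure in which each judge estimates, per item, the probability that a borderline candidate answers correctly; the slide does not prescribe this method. The ratings, the empirical facility values, and the function names are all hypothetical. Intra-judge consistency is approximated by correlating each judge's estimates with empirical item facility, and the stability of the aggregated standard by the standard error of the mean cut score across judges.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical ratings: probability that a borderline candidate answers each item correctly.
ratings = {
    "judge_a": [0.8, 0.6, 0.4, 0.3],
    "judge_b": [0.7, 0.6, 0.5, 0.2],
    "judge_c": [0.9, 0.5, 0.4, 0.4],
}
# Hypothetical empirical facility values for the same four items.
empirical_facility = [0.85, 0.65, 0.35, 0.30]


def pearson(x, y):
    """Pearson correlation; undefined if either list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def judge_cut_scores(all_ratings):
    """Each judge's cut score is the sum of that judge's item probabilities."""
    return {judge: sum(items) for judge, items in all_ratings.items()}


def aggregated_standard(all_ratings):
    """Mean cut score across judges and its standard error."""
    cuts = list(judge_cut_scores(all_ratings).values())
    return mean(cuts), stdev(cuts) / sqrt(len(cuts))


def intra_judge_consistency(all_ratings, facility):
    """Correlation between each judge's estimates and empirical item facility."""
    return {judge: pearson(items, facility) for judge, items in all_ratings.items()}


cut, standard_error = aggregated_standard(ratings)
print("Aggregated cut score:", cut, "SE:", standard_error)
print("Intra-judge consistency:", intra_judge_consistency(ratings, empirical_facility))
```

A small standard error relative to the score scale suggests that the judges broadly agree on where the standard lies; a large one is a reason to question whether the aggregated value can serve as the final standard.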
External validation: establishing the validity of a test in relation to an external point of reference (the CEFR). Correlation analysis; validation of standardization; teacher judgments; application of anchor tests.
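As an illustration of correlation analysis against teacher judgments, the Python sketch below compares exam results with CEFR levels assigned independently by teachers. All data, the cut scores, and the function names are hypothetical, and Spearman's rank correlation together with exact level agreement is only one possible choice of statistics.

```python
from math import sqrt

CEFR_ORDER = ["A2", "B1", "B2"]
# Hypothetical cut scores, highest level first: (level, minimum score).
CUT_SCORES = [("B2", 33), ("B1", 20)]

# Hypothetical data for six candidates.
exam_scores = [12, 18, 25, 31, 36, 40]
teacher_levels = ["A2", "A2", "B1", "B2", "B2", "B2"]


def exam_level(score):
    """Map a raw score to a CEFR level using the hypothetical cut scores."""
    for level, cut in CUT_SCORES:
        if score >= cut:
            return level
    return CEFR_ORDER[0]


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def exact_agreement(scores, levels):
    """Proportion of candidates placed at the same level by exam and teacher."""
    matches = sum(exam_level(s) == lvl for s, lvl in zip(scores, levels))
    return matches / len(scores)


teacher_codes = [CEFR_ORDER.index(level) for level in teacher_levels]
print("Spearman correlation:", spearman(exam_scores, teacher_codes))
print("Exact agreement:", exact_agreement(exam_scores, teacher_levels))
```

The correlation shows whether the exam and the teachers order candidates similarly, while the agreement rate shows whether the cut scores place candidates at the same levels; a high correlation with low agreement would point to misplaced cut scores rather than a flawed test.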