Introduction to the Validation Phase


1 Introduction to the Validation Phase
Relating language examinations to the Common European Framework of Reference for Languages (CEFR)
Gábor Szabó, ECML RelEx Workshop, Graz, May 2009

2 Suggested Linking Procedures in the Manual
Familiarisation with the CEFR
Linking on the basis of specification of examination content
Standardisation and benchmarking
Standard setting
Validation: checking that exam results relate to CEFR levels as intended

3 What is validity?
Does the test measure what it is intended to measure?
The degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests.
Traditional classification of validity: content validity, construct validity, criterion-related validity, face validity.
More modern approach: validity seen as a single unitary construct.

4 Aspects of validity
Content validity
Operational validity: pilots and pretests
Psychometric aspects
Procedural validity of standardization
Internal validity of standard setting
External validation

5 Content validity
Does the test accurately reflect the syllabus on which it is based AND reflect the descriptors in the CEFR?
Does the content specification reflect all areas to be assessed in suitable proportions?

6 Operational validity
Do pilot populations accurately represent the target population of the test?
Is the pilot test takers’ performance representative of their true ability? (response validity)
Terms are explained below.

7 Psychometric aspects
Do the test’s psychometric qualities support validity claims?
CTT-based results
Test-level data: reliability figures; mean, mode, median; standard deviation; measurement error; score distribution
Item-level data: facility values; discrimination indices
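The CTT figures listed on this slide can be sketched as follows. This is a minimal illustration rather than the presenter's procedure: the response matrix is invented, KR-20 is assumed as the reliability figure, and the discrimination index is the simple upper-lower version computed on halves of the sample (27% groups are often used instead).

```python
import math
import statistics

# Hypothetical scored responses: one row per test taker, one column per item (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]

def facility_values(matrix):
    """Proportion of test takers who answered each item correctly."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]

def discrimination_indices(matrix):
    """Upper-lower index: facility in the top half minus facility in the bottom half."""
    ranked = sorted(matrix, key=sum, reverse=True)  # highest total scores first
    half = len(ranked) // 2
    upper = facility_values(ranked[:half])
    lower = facility_values(ranked[-half:])
    return [u - l for u, l in zip(upper, lower)]

def kr20(matrix):
    """Kuder-Richardson 20 reliability for dichotomously scored items (assumed here)."""
    k = len(matrix[0])
    totals = [sum(row) for row in matrix]
    pq = sum(p * (1 - p) for p in facility_values(matrix))
    return k / (k - 1) * (1 - pq / statistics.pvariance(totals))

totals = [sum(row) for row in responses]
reliability = kr20(responses)
sd = statistics.pstdev(totals)
sem = sd * math.sqrt(1 - reliability)  # standard error of measurement

print("mean:", statistics.mean(totals), "median:", statistics.median(totals))
print("mode(s):", statistics.multimode(totals), "sd:", round(sd, 3))
print("reliability (KR-20):", round(reliability, 3), "SEM:", round(sem, 3))
print("facility values:", [round(p, 2) for p in facility_values(responses)])
print("discrimination:", [round(d, 2) for d in discrimination_indices(responses)])
```

With real pilot data the matrix would hold hundreds of rows; the same functions apply unchanged.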

8 Psychometric aspects
Do the test’s psychometric qualities support validity claims?
IRT-based results

9 Psychometric aspects IRT
[Figure: a person (Px) and three items located on a common scale running from low ability / low difficulty to high ability / high difficulty]

10 Psychometric aspects IRT
[Figure: item Y and nine persons (the last labelled P9) located on a common scale running from low ability / low difficulty to high ability / high difficulty]

11 Psychometric aspects IRT
[Figure: map of persons and items on a common MEASURE scale, persons on the left and items on the right]

12 Psychometric aspects IRT
[Table: item statistics in entry order. Columns: NUM, SCORE, COUNT, MEASURE, ERROR, INFIT MNSQ, OUTFIT MNSQ, PTBIS, NAME. Only the PTBIS values survive in this transcript: item 1 .25; item 2 .37; item 3 .19; item 4 .26; item 5 .14; item 6 .26; item 7 .11; item 8 .20; item 9 .18; item 10 .06]

13 Psychometric aspects
Do the test’s psychometric qualities support validity claims?
IRT-based results: item difficulty figures; person ability figures; fit statistics (items, persons)
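How item difficulty, person ability and fit statistics relate can be sketched under the one-parameter (Rasch) model. The presentation does not name the IRT model or software used, so the model choice is an assumption, and the abilities, difficulty and responses below are invented. Infit and outfit are computed here with their usual mean-square definitions.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model (both values in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def item_fit(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one item.

    outfit: mean of squared standardized residuals (sensitive to outlying persons).
    infit: sum of squared residuals divided by the sum of model variances.
    """
    expected = [rasch_probability(a, difficulty) for a in abilities]
    variances = [p * (1 - p) for p in expected]
    squared_residuals = [(x - p) ** 2 for x, p in zip(responses, expected)]
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

# Hypothetical person abilities (logits) and their scored responses to one item.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]
responses = [0, 0, 1, 1, 1]
infit, outfit = item_fit(responses, abilities, difficulty=0.0)
print(f"infit MNSQ = {infit:.2f}, outfit MNSQ = {outfit:.2f}")
```

Values near 1 indicate responses about as noisy as the model expects. The invented pattern above is perfectly ordered by ability, so both statistics fall below 1.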

14 Procedural validity of standardization
Has the standard-setting procedure had the intended effects?
Was the training effective?
Did the judges feel free to follow their own insights?

15 Internal validity of standard setting
Are the judges’ judgements to be trusted?
Are judges consistent within themselves?
Are judges consistent with each other?
Is the aggregated standard to be considered the definitive standard?
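One simple way to look at consistency between judges and at the aggregated standard is to summarise the judges' individual cut scores. The sketch below uses invented cut scores and assumes the aggregated standard is the mean across judges; it does not address consistency within a single judge.

```python
import math
import statistics

# Hypothetical cut scores proposed by five judges for the same level boundary.
judge_cut_scores = [24, 26, 25, 27, 23]

# Assumption: the aggregated standard is the mean of the individual cut scores.
aggregated_standard = statistics.mean(judge_cut_scores)

# Spread across judges (sample standard deviation) and the standard error of the mean.
spread = statistics.stdev(judge_cut_scores)
standard_error = spread / math.sqrt(len(judge_cut_scores))

print("aggregated standard:", aggregated_standard)
print("SD across judges:", round(spread, 3))
print("standard error of the cut score:", round(standard_error, 3))
```

A large standard error relative to the score scale would suggest the aggregated value should not yet be treated as final.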

16 External validation
Establishing the validity of a test in relation to an external point of reference (CEFR)
Correlation analysis
Validation of standardization
Teacher judgements
Application of anchor tests

17 External validation Correlation analysis

18 External validation Correlation analysis

19 External validation Correlation analysis
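The figures for the correlation slides are not preserved in this transcript. As a minimal sketch of what such an analysis computes, the snippet below correlates candidates' scores on the examination with their scores on an external reference measure. Both score lists are invented, and Pearson's r is assumed as the coefficient.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical scores of the same six candidates on the two measures.
exam_scores = [12, 15, 19, 22, 25, 28]
reference_scores = [30, 34, 41, 45, 52, 55]

r = pearson_r(exam_scores, reference_scores)
print(f"r = {r:.3f}")
```

A high coefficient shows that the two measures rank candidates similarly; on its own it does not show that the cut scores fall at the same CEFR level.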

20 External validation Validation of standardization
To what extent is the judges’ standard valid?
Decision tables: criterion test; test to be validated; contingency value

21 External validation
Validation of standardization: decision tables
Test 1 against Test 2 at the B2 boundary, cell counts: 152, 12, 8, 128 (total 300)
Agreement: (152 + 128) / 300 = 93.333%
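The percentage on this slide can be reproduced from the four cell counts. The layout of the table is not preserved, so the sketch assumes that 152 and 128 are the cells where the two tests agree and 12 and 8 the cells where they disagree; that reading is the one consistent with the 93.333% shown.

```python
# Counts from the slide; rows are Test 1 decisions, columns are Test 2 decisions.
# Assumption: the diagonal (152, 128) holds the agreeing decisions.
decision_table = [
    [152, 12],
    [8, 128],
]

def agreement_rate(table):
    """Share of candidates classified identically by both tests."""
    total = sum(sum(row) for row in table)
    agreeing = sum(table[i][i] for i in range(len(table)))
    return agreeing / total

rate = agreement_rate(decision_table)
print(f"{rate * 100:.3f}%")  # 93.333%
```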

22 External validation Teacher judgements
To what extent do decisions based on test results coincide with teacher judgements?
Decision tables
Box plots
The least certain form of external empirical validation
Heavily dependent on teachers’ interpretation of CEFR descriptors

23 External validation Application of anchor tests
To what extent do item logit values coincide in the anchor test and the test to be validated?
To what extent do candidates’ ability logit values coincide based on the two tests?
Checking item and person fit
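The first question can be sketched as a comparison of the same items' difficulty estimates in two calibrations. The logit values and the 0.5-logit tolerance below are illustrative assumptions, not figures from the presentation, and `compare_anchor_items` is a hypothetical helper name.

```python
import statistics

# Hypothetical difficulties (logits) of the same items in the anchor test
# calibration and in the calibration of the test to be validated.
anchor_logits = {"A1": -1.2, "A2": -0.4, "A3": 0.3, "A4": 1.1}
new_logits = {"A1": -0.9, "A2": -0.1, "A3": 0.6, "A4": 2.2}

def compare_anchor_items(reference, target, tolerance=0.5):
    """Return the mean shift between calibrations and the items that drift beyond it.

    tolerance is an assumed threshold in logits for flagging unstable items.
    """
    differences = {item: target[item] - reference[item] for item in reference}
    shift = statistics.mean(list(differences.values()))
    flagged = [item for item, d in differences.items() if abs(d - shift) > tolerance]
    return shift, flagged

shift, flagged = compare_anchor_items(anchor_logits, new_logits)
print("mean shift (logits):", round(shift, 2))
print("items drifting beyond tolerance:", flagged)
```

The same comparison can be run on person ability logits to address the second question.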

24 External validation Application of anchor tests
[Figure; surviving label: test to be validated]

25 External validation Application of anchor tests

26 External validation An example
Design
Applying tests already linked to the CEFR as point of reference: ECL task; reference task
Estimating IRT-based item difficulties
Comparing item difficulty logits in the two tasks

27 External validation An example
English B2 reading

28 External validation An example

29 External validation
Potential problems with empirical external validation
Availability of reference tests: “less frequently taught languages” (LFTL)
Acceptance of reference tests: increasing criticism of reference tests
Problems with reference tests: often little or no evidence of actual link between reference tests and CEFR; task properties (item number)

30 External validation German B2 reading

31 External validation
Suggested solutions
Availability: support to LFTL to develop and share tasks (ECML)
Acceptance: setting up transparent, consensus-based criteria for tasks to be accepted as reference points (would handle task quality issue)
Emphasize candidate-centered methods, but only where judges’/teachers’ familiarity with CEFR is documented and guaranteed

