"Interface" Validity: Investigating the potential role of face validity in content validation
Gábor Szabó, Robert Märcz
ECL Examinations
EALTA 9 - Innsbruck, June 2, 2012

Outline
- Questions of face validity
- New approach
- Context, participants and instruments
- Results
- Conclusions

"Post mortem"?
- Educational context: it is important to seem to be testing as well as to be actually doing it
- Test takers' acceptance of the test:
  – contributes to its validity
  – is a source of motivation
- Lay opinion: is it taken seriously?

"Interface" validity
New approach: test takers are asked to
- give their opinion on the test (face validity)
- give their opinion on the content (content validity)

Context and participants
- ECL International Language Examination System
- Level: B2
- Reading comprehension test
- Two tasks: sentence completion and short answer
- Online questionnaire
- 903 answers within the first week (ca. 50%)

The instrument
- Questionnaire of 17 items
- Four-point Likert scale (4: completely true; 1: not true at all)
- 6 items on face validity: general statements concerning difficulty, layout, etc.
- 11 items on content validity: paraphrased CEFR descriptors
- Two negatively worded items (to check for a halo effect)

The questionnaire: examples
- Face validity:
  – 3. I had enough time to complete the tasks.
- Content validity:
  – Original CEFR descriptor: "Can understand articles and reports concerned with contemporary problems in which the writers adopt particular stances or viewpoints."
  – 9. I could understand the viewpoints of the writer.
  – 16. It was difficult to understand the viewpoints of the writer.

Procedure
- Halo effect: analysing the parallel opposite items, we found significant negative correlations (…/-0.670)
- Responses with inconsistent response patterns were deleted
- 791 candidates' responses were found valid and consistent
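
The consistency check on the parallel opposite items can be illustrated with a minimal sketch. The slides do not state which rule was used to flag inconsistent response patterns, so the rule below (a candidate must agree with one statement of the pair and disagree with the other) is an assumption, as are the helper names `pearson` and `is_consistent` and the response data, which are invented for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def is_consistent(positive, negative):
    """Assumed rule: on the 4-point scale, scores of 3-4 count as agreement.
    A pattern is consistent if the candidate agrees with exactly one of the
    two opposite statements."""
    return (positive >= 3) != (negative >= 3)


# Invented (item 9, item 16) score pairs, one tuple per candidate.
responses = [(4, 1), (3, 2), (4, 4), (2, 3), (1, 1), (3, 1), (1, 4)]

# Drop candidates who agreed (or disagreed) with both opposite statements.
kept = [pair for pair in responses if is_consistent(*pair)]

r_all = pearson([p for p, _ in responses], [n for _, n in responses])
r_kept = pearson([p for p, _ in kept], [n for _, n in kept])

print(f"kept {len(kept)} of {len(responses)} responses")
print(f"correlation before filtering: {r_all:.3f}, after: {r_kept:.3f}")
```

In this toy data the two candidates who gave the same extreme answer to both statements are removed, and the negative correlation between the opposite items becomes stronger afterwards.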

Results and analysis: descriptive statistics

Results and analysis: item correlations
– Expectation: significant, probably moderate correlations
  - Descriptors tap into different aspects of the B2 construct
– Actual results: strong, significant correlation (0.807) in one case:
  - "Though the text was long I was able to scan it quickly"
  - "Though the text was complex I was able to scan it quickly"

Results and analysis: item correlations
– Actual results: moderate, significant correlations (…) between
  - "I could quickly identify the content of the text" and "I could understand the viewpoints of the writer"
  - "I could understand the stance of the writer" and "I could quickly identify the content of the text"
  - "I could quickly identify the content of the text" and "Though the text was complex I was able to scan it quickly"
– Most consistent pattern of correlations in the case of item 8: "I could quickly identify the content of the text"

Results and analysis: item correlations
– Actual results: low, sometimes not significant, occasionally negative correlations (<0.4) for
  - "I could rarely find idioms in the text"
  - "A broad active vocabulary was needed to complete the tasks"
  - "The text was concerned with contemporary problems"
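
The pairwise item correlations reported above could be computed along the following lines. This is a sketch under stated assumptions: the slides do not say which correlation coefficient was used (Pearson is assumed here), the 0.7 boundary between "moderate" and "strong" is a hypothetical cut-off (only the <0.4 boundary for "low" comes from the slides), and the item keys and scores are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def label(r, low=0.4, strong=0.7):
    """Classify a coefficient by absolute size.
    low=0.4 follows the slides; strong=0.7 is an assumed cut-off."""
    size = abs(r)
    if size < low:
        return "low"
    if size < strong:
        return "moderate"
    return "strong"


# Invented Likert scores (1-4) of six candidates on three content items.
items = {
    "scan_long": [4, 3, 2, 4, 1, 3],
    "scan_complex": [4, 3, 2, 3, 1, 3],
    "idioms_rare": [2, 3, 2, 2, 3, 2],
}

# Correlate every pair of items once.
pairs = {
    (a, b): pearson(items[a], items[b])
    for a, b in combinations(items, 2)
}

for (a, b), r in pairs.items():
    print(f"{a} vs {b}: r = {r:.3f} ({label(r)})")
```

With real data, a significance test for each coefficient would be needed as well; it is left out here to keep the sketch short.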

Results and analysis: batch correlations
– Correlating face validity items with content validity items
  - Significant, moderate correlation (0.536) found
  - Indication of a relationship between the constructs?
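
One way to obtain such a batch correlation is sketched below. The slides do not describe how the two batches were aggregated, so averaging each candidate's scores per batch and reverse-coding the negatively worded item are assumptions. Items 3, 8, 9 and 16 are assigned as in the slides; treating items 1 and 2 as face validity items, the helper names and all scores are invented for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def reverse(score, scale_max=4):
    """Reverse-code a score on a 1..scale_max scale (4 -> 1, 1 -> 4)."""
    return scale_max + 1 - score


def batch_score(row, item_ids, negative_ids=()):
    """Mean score of one candidate over a batch of items (assumed aggregation)."""
    return mean(
        reverse(row[i]) if i in negative_ids else row[i] for i in item_ids
    )


FACE_ITEMS = (1, 2, 3)       # items 1 and 2 are hypothetical assignments
CONTENT_ITEMS = (8, 9, 16)   # subset of the 11 content items
NEGATIVE_ITEMS = (16,)       # negatively worded, so reverse-coded

# Invented responses: item number -> score, one dict per candidate.
rows = [
    {1: 4, 2: 4, 3: 3, 8: 4, 9: 4, 16: 1},
    {1: 3, 2: 2, 3: 3, 8: 3, 9: 3, 16: 2},
    {1: 2, 2: 2, 3: 1, 8: 2, 9: 3, 16: 3},
    {1: 4, 2: 3, 3: 4, 8: 3, 9: 2, 16: 3},
    {1: 1, 2: 2, 3: 2, 8: 2, 9: 1, 16: 4},
]

face = [batch_score(row, FACE_ITEMS) for row in rows]
content = [batch_score(row, CONTENT_ITEMS, NEGATIVE_ITEMS) for row in rows]

r_batch = pearson(face, content)
print(f"face vs content batch correlation: r = {r_batch:.3f}")
```

Summing instead of averaging would give the same coefficient, since Pearson's r is unaffected by rescaling either variable.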

Conclusions
- Using candidate feedback in content validation is potentially useful
- Further analyses of data in progress
  – Checking for significant differences between sets of responses to different items
- Refinement of reworded descriptors needed
- Further research necessary
  – Relationship between candidate performance and opinion

Thank you for your attention!