14th International GALA conference, Thessaloniki, 14-16 December 2007
Behavioural scales of language proficiency: insights from the use of the Common European Framework of Reference
Spiros Papageorgiou
University of Michigan English Language Institute, Testing and Certification Division
www.lsa.umich.edu/eli
Outline
  Background
  Aims
  Data collection
  Data analysis
  Results
  Implications
Background
  Advent of the CEFR: increased interest in behavioural scales of language proficiency
  Using the CEFR scales: problems
    Designing test specifications (Alderson et al., 2006)
    Measuring progression in grammar (Keddle, 2004)
    Describing the construct of vocabulary (Huhta & Figueras, 2004)
    Designing proficiency scales (Generalitat de Catalunya, 2006)
Background (2)
  Using the CEFR scales: criticism
    Equivalence of tests constructed for different purposes (Fulcher, 2004b; Weir, 2005)
    Danger of viewing a test as not valid because it does not claim relevance to the CEFR (Fulcher, 2004a)
    Progression in language proficiency based not on SLA research but on judgements by teachers (cf. North, 2000; North & Schneider, 1998)
Aims of the study
  Investigation of three research questions:
    Can users of the CEFR rank-order the scaled descriptors in the way they appear in the 2001 volume?
    If the users' scaling differs from that of the 2001 volume, why does this happen?
    Can training contribute to more successful scaling?
Data collection
  12 users of the scales acting as judges in relating two language examinations to the CEFR
  Data collected during Familiarisation sessions described in the Manual for relating examinations to the CEFR
  Part of a doctoral thesis at Lancaster University (Papageorgiou, 2007) and a research project at Trinity College London
  Task: sort descriptors into the six levels
Data collection (2)
Data analysis
  Analysis: FACETS Rasch computer program
    3 facets: descriptors, raters, occasions
    Rank-ordering of the elements of each facet on a common scale
  Fit statistics (Bond and Fox, 2001; McNamara, 1996)
    Overfit: too predictable a pattern
    Misfit: more variance than expected
  Acceptable range of fit statistics
    Descriptors: 0.4-1.2 (Linacre & Wright, 1994)
    Raters: 0.5-1.5 (Weigle, 1998)
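The acceptable ranges on the Data analysis slide can be read as a simple decision rule: a fit statistic below the range signals overfit, one above it signals misfit. The sketch below is only an illustration of that rule. The function name, the dictionary and the example values are hypothetical, the treatment of boundary values as acceptable is an assumption, and none of this reproduces FACETS output.

```python
# Acceptable fit ranges as quoted on the slide; everything else here is
# illustrative and not taken from the study.
FIT_RANGES = {
    "descriptor": (0.4, 1.2),  # Linacre & Wright, 1994
    "rater": (0.5, 1.5),       # Weigle, 1998
}


def classify_fit(value, facet):
    """Label a fit statistic as overfit, acceptable or misfit for a facet."""
    low, high = FIT_RANGES[facet]
    if value < low:
        return "overfit"      # pattern more predictable than expected
    if value > high:
        return "misfit"       # more variance than expected
    return "acceptable"       # boundary values are treated as acceptable


# Hypothetical values, for illustration only.
for facet, value in [("descriptor", 0.3), ("descriptor", 1.3), ("rater", 1.3)]:
    print(facet, value, classify_fit(value, facet))
```

Note that the same value (1.3 here) falls outside the range for a descriptor but inside it for a rater, because the two facets use different ranges.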
Results: Writing Levels A1-B1
Results: Writing Levels B2-C2
Results: Raters
Results: Occasions
Results: Correlations
  Correlations of scaling between the judges and the CEFR volume
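The slide does not say which correlation coefficient was used. As one plausible way to compare a judge's level assignments with the levels in the CEFR volume, the sketch below computes a Spearman rank correlation in plain Python. The choice of Spearman, the helper names and the example data are all assumptions made for illustration; they are not the study's data or method.

```python
from statistics import mean

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def levels_to_numbers(levels):
    """Map CEFR level labels to 1..6 so they can be ranked."""
    return [CEFR_LEVELS.index(level) + 1 for level in levels]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: one judge swaps the C1 and C2 descriptors.
cefr = levels_to_numbers(["A1", "A2", "B1", "B2", "C1", "C2"])
judge = levels_to_numbers(["A1", "A2", "B1", "B2", "C2", "C1"])
print(round(spearman(cefr, judge), 3))
```

Average ranks are used for ties because several descriptors share the same level, so tied values are the normal case rather than the exception in this kind of data.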
Summary of results
  Trained judges perceived language ability as intended in the CEFR
  Almost identical scaling
  Cut-offs between B2-C1 and C1-C2 unclear
  Competences other than linguistic: misfitting descriptors
  Unclear and inconsistent wording resulted in level misplacement by the judges
  Mixed effect of training
Implications of findings
  Common understanding of the construct in the CEFR scales can be achieved, but:
    How valid is it to claim that a test is linked to B2 instead of C1 and C1 instead of C2?
    How can sociolinguistic and strategic competences be tested in relation to the CEFR?
    Can SLA research help better understand these issues?
Contact details
  Spiros Papageorgiou
  University of Michigan English Language Institute
  500 East Washington Street
  Ann Arbor, MI 48104-2028
  USA
  spapag@umich.edu