Spiros Papageorgiou University of Michigan


Using the Common European Framework of Reference to Report Language Test Scores
Spiros Papageorgiou, University of Michigan, spapag@umich.edu

Overview
- The Common European Framework of Reference (CEFR)
- The Manual for relating language examinations to the CEFR
- Standard setting
- An example of a CEFR standard setting study in Colombia

The CEFR
- Reference document, not prescriptive
- Basis for the elaboration of language syllabi, curricula, examinations, and textbooks
- Language objectives: description of what language learners have to learn to do in order to use a language for communication
- Six main levels of proficiency: A1 (lowest), A2, B1, B2, C1, C2 (highest)

The Manual for Relating Examinations to the CEFR It aims to “help the providers of examinations to develop, apply and report transparent, practical procedures in a cumulative process of continuing improvement in order to situate their examination(s) in relation to the Common European Framework” (p. 1).

Stages for Relating Test Content and Test Scores to the CEFR
- Familiarization
- Specification
- Standardization training and benchmarking
- Standard setting
- Validation

Standard Setting
- The decision-making process of classifying examination results into a number of successive levels
- Performance Level Descriptions (PLD): statements describing what learners can do with language (e.g., CEFR descriptors)
- Performance Level Labels (PLL): labels for the PLDs (e.g., A1–C2)
- Cut score: the boundary between two successive levels
- Participation of expert judges (panelists)

PLL | PLD
C2 | Can write clear, smoothly flowing, complex texts in an appropriate and effective style and a logical structure which helps the reader to find significant points.
C1 | Can write clear, well-structured texts of complex subjects, underlining the relevant salient issues, expanding and supporting points of view at some length with subsidiary points, reasons and relevant examples, and rounding off with an appropriate conclusion.
B2 | Can write clear, detailed texts on a variety of subjects related to his field of interest, synthesising and evaluating information and arguments from a number of sources.
B1 | Can write straightforward connected texts on a range of familiar subjects within his field of interest, by linking a series of shorter discrete elements into a linear sequence.
A2 | Can write a series of simple phrases and sentences linked with simple connectors like “and”, “but” and “because”.
A1 | Can write simple isolated phrases and sentences.

An Example of a Standard Setting Study in Colombia
- Reporting scores for the Michigan English Test on the CEFR levels
- 13 participants from the 9 Binational centers in Colombia
- Familiarization with the CEFR
- Training with item difficulty (Pilot Form B)
- Angoff standard setting method
- First round of judgments
- Pilot Form A statistical information
- Second round of judgments
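The slide names the Angoff method without giving details. As a rough sketch of one common variant (an assumption, not necessarily the procedure used in this study): each judge estimates, for every item, the probability that a borderline candidate at the target level answers correctly; a judge's cut score is the sum of those estimates, and the panel cut score is the mean across judges. The function name and the ratings below are hypothetical.

```python
from statistics import mean

def angoff_cut_score(ratings):
    """ratings[j][i] is judge j's estimated probability that a borderline
    candidate answers item i correctly.
    Returns (panel_cut, judge_cuts)."""
    # Each judge's cut score is the expected raw score of a borderline candidate.
    judge_cuts = [sum(judge) for judge in ratings]
    # The panel cut score is the mean of the judges' cut scores.
    return mean(judge_cuts), judge_cuts

# Hypothetical ratings: 3 judges x 4 items (not data from the study).
ratings = [
    [0.50, 0.75, 0.25, 0.50],
    [0.50, 0.50, 0.50, 0.50],
    [0.75, 0.75, 0.50, 0.50],
]
panel_cut, judge_cuts = angoff_cut_score(ratings)
print(judge_cuts, round(panel_cut, 2))
```

With two rounds of judgments, as in the study, the same computation would simply be repeated on the second-round ratings.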

Standard Setting Validity Evidence
- Procedural validity: examining whether the procedures followed were practical and implemented properly, whether the feedback given to the judges was effective, and whether documentation was sufficiently compiled.
- Internal validity: addressing the accuracy and consistency of the standard setting results.
- External validation: collecting evidence from independent sources that supports the outcome of the standard setting meeting.

The Familiarization Task
A1 = 1, A2 = 2, B1 = 3, B2 = 4, C1 = 5, C2 = 6

Procedural Validity: Internalization of the CEFR
Correlation of descriptor level judgments with the CEFR during the Familiarization stage
Descriptors J1 J2 J3 J4 J5 J6 J7 J8 J9 J10 J11 J12 J13
Listening .85 .89 .80 .81 .71 .77 .79 .88 .70 .91 .84
Reading .92 .86 .69 .82 .62 .90
Vocabulary .93 .96 .76 .73 .97
Grammar .94 .87 .95 .78
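A minimal sketch of how such a per-judge correlation could be computed, using the numeric coding from the Familiarization Task (A1 = 1 through C2 = 6). The slide does not say which correlation coefficient was used; Pearson's r is assumed here, and the judge's answers are hypothetical.

```python
from math import sqrt

# Numeric coding of the CEFR levels, as given on the Familiarization Task slide.
LEVELS = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def familiarization_correlation(judged, official):
    """Correlate one judge's level guesses with the descriptors' CEFR levels."""
    return pearson([LEVELS[l] for l in judged], [LEVELS[l] for l in official])

# Hypothetical example: six descriptors, one per level, and one judge's guesses.
official = ["A1", "A2", "B1", "B2", "C1", "C2"]
judged = ["A1", "B1", "A2", "B2", "C2", "C1"]
r = familiarization_correlation(judged, official)
print(round(r, 2))
```

A rank-based coefficient such as Spearman's would be an equally plausible choice for ordinal level data.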

Internal Validity: Method Consistency
Standard error of judgments should be ≤ ½ of the standard error of the test (Section I 1.71 and Section II 1.74)
Cut score, SEj incl. extreme ratings, SEj excl. extreme ratings
Section I B1 1.97 1.57
Section I B2 1.34
Section I C1 1.69
Section II B1 2.00 1.71
Section II B2 2.30 1.62
Section II C1 2.57
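The criterion above can be sketched as follows, assuming the usual definition of the standard error of judgments as the standard deviation of the judges' cut scores divided by the square root of the number of judges. The slide does not state the formula, so that definition, the function names, and the data are assumptions for illustration.

```python
from math import sqrt
from statistics import stdev

def standard_error_of_judgments(judge_cuts):
    """SEj = sample SD of the judges' cut scores / sqrt(number of judges)."""
    return stdev(judge_cuts) / sqrt(len(judge_cuts))

def meets_criterion(judge_cuts, sem):
    """True if SEj is at most half the test's standard error of measurement."""
    return standard_error_of_judgments(judge_cuts) <= 0.5 * sem

# Hypothetical cut scores from four judges (not data from the study).
judge_cuts = [38, 40, 42, 40]
se_j = standard_error_of_judgments(judge_cuts)
print(round(se_j, 3), meets_criterion(judge_cuts, sem=2.0))
```

Dropping extreme ratings before the computation, as the table's second column does, would amount to filtering `judge_cuts` first.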

Internal Validity: Decision Consistency
Calculating agreement coefficient rho (p0; max .98) and kappa (k; max .71)
Cut score | p0 | k
Section I B1 | .90 | .68
Section I B2 | .88 | .70
Section I C1 | .97 | .61
Section II B1 | .95 | .64
Section II B2 | .86 | .71
Section II C1 | .94 | .65
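For readers unfamiliar with the two indices, here is a minimal sketch of how p0 and kappa are defined on a table of repeated classifications. The slide does not say how the coefficients were estimated in the study (single-administration estimates are common in standard setting), so the two-occasion table below is purely hypothetical.

```python
def agreement_and_kappa(table):
    """table[i][j] = number of examinees placed in category i on one occasion
    and category j on the other. Returns (p0, kappa)."""
    k = len(table)
    total = sum(sum(row) for row in table)
    # p0: proportion of examinees classified consistently (diagonal cells).
    p0 = sum(table[i][i] for i in range(k)) / total
    # pc: agreement expected by chance, from the marginal proportions.
    row_m = [sum(table[i]) / total for i in range(k)]
    col_m = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    pc = sum(r * c for r, c in zip(row_m, col_m))
    # kappa: agreement beyond chance, rescaled so that 1 is perfect agreement.
    kappa = (p0 - pc) / (1 - pc)
    return p0, kappa

# Hypothetical below/above-cut classifications on two occasions.
table = [[40, 10],
         [10, 40]]
p0, kappa = agreement_and_kappa(table)
print(round(p0, 2), round(kappa, 2))
```

Because kappa discounts chance agreement, it is always lower than p0 for the same table, which matches the pattern in the slide's values.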

Internal Validity: Intra-judge Consistency
Correlation of mean of judgments with empirical item difficulty
MET section/round of judgments | Correlation
Section I, Round 1 | .42
Section I, Round 2 | .83
Section II, Round 1 | .73
Section II, Round 2 | .92
We based the final decision on the round 2 judgments.

Internal Validity: Inter-judge Consistency
Indices of agreement and consistency
Index, Section I, Section II
ICC .94
W .80 .76
Alpha

External Validity: Reasonableness of the Cut Scores
Classification of Pilot Form A test takers (N = 660) into CEFR levels
Level | Section I | Section II
A2 | 105 (15.91%) | 55 (8.33%)
B1 | 408 (61.81%) | 323 (48.94%)
B2 | 95 (14.39%) | 214 (32.43%)
C1 | 52 (7.88%) | 68 (10.30%)

External Validity: Comparison of Level Classifications
Exact and adjacent level agreement of classifications (N = 302) provided by a test center and the cut score
Agreement | Section I | Section II
Exact level | 122 (40.40%) | 92 (30.46%)
Within 1 level | 290 (96.03%) | 264 (87.42%)
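Exact and adjacent agreement can be counted as in the sketch below, which compares two lists of CEFR level labels for the same test takers. The function name and the sample classifications are hypothetical; only the level ordering comes from the CEFR.

```python
ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]

def level_agreement(center_levels, cut_score_levels):
    """Return (exact, within_one, n): the number of test takers whose two
    classifications match exactly, the number that differ by at most one
    level, and the number of test takers compared."""
    diffs = [abs(ORDER.index(a) - ORDER.index(b))
             for a, b in zip(center_levels, cut_score_levels)]
    exact = sum(d == 0 for d in diffs)
    within_one = sum(d <= 1 for d in diffs)
    return exact, within_one, len(diffs)

# Hypothetical classifications for five test takers.
center = ["A2", "B1", "B1", "B2", "C1"]
by_cut = ["A2", "B1", "B2", "C1", "A2"]
exact, within_one, n = level_agreement(center, by_cut)
print(exact, within_one, n)
```

The percentages in the table are each count divided by N; for example, 122 / 302 is about 40.40%.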

Final Stage Before Reporting Test Scores: Equating
- A statistical procedure used to allow for comparisons of scores obtained on different test forms
- Adjustment of differences in test form difficulty (but not content)
- Scaled scores, not percentages
- Examinee position on the language ability scale
- Scores are comparable across different administrations
- Linked to the CEFR cut scores

Reported Scores
Both section scores should be taken into account when interpreting the test results for use in decision-making
CEFR Level, MET Section I scores, MET Section II scores
C1 64 and above
B2 53–63
B1 40–52
A2 39 or below
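As an illustration of how the score ranges above translate into a level lookup, here is a small sketch. Only one set of ranges appears in the transcript, so the sketch assumes it applies to a single section's scaled score; the function name is hypothetical.

```python
from bisect import bisect_right

# Lowest scaled score for B1, B2 and C1, taken from the ranges on the slide.
CUTS = [40, 53, 64]
LABELS = ["A2", "B1", "B2", "C1"]

def cefr_level(scaled_score):
    """Map one section's scaled score to the reported CEFR level."""
    # bisect_right counts how many cut scores the score has reached.
    return LABELS[bisect_right(CUTS, scaled_score)]

for score in (39, 40, 52, 53, 63, 64):
    print(score, cefr_level(score))
```

Keeping the cut scores in one sorted list means a revised standard setting only requires changing `CUTS`, not the lookup logic.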

For more information visit www.lsa.umich.edu/eli/testing