Using IRT and Many-Facet Rasch Analysis for Test Improvement
"ALIGNING TRAINING AND TESTING IN SUPPORT OF INTEROPERABILITY"
Desislava Dimitrova, Dimitar Atanasov, New Bulgarian University
BILC Seminar, 10-15 October 2010, Varna
Outline
- Examination procedure
- Main concepts and observations
- The socio-cognitive test validation framework (Cyril Weir, 2005) and its criteria
- Scoring validity for the listening and reading parts of the test
- Scoring validity for the essay
Test structure
1. Listening paper, two tasks:
- 15 MCQ
2. Reading paper, five tasks:
- 6 items, matching response format
- 10 items, banked-cloze response format
- 10 items, open-cloze response format
- 16 items, short-answer response format
- 2 open-ended questions
- 5 MCQ
3. Essay: 180-220 words
Too much?
- The concept of communicative language ability (CEFR)
- The concept of test usefulness (Bachman)
- The concept of justifying the use of language assessments in the real world (Bachman)
- The concept of validity
- The Code of Practice (ALTE*, for example)

* Association of Language Testers in Europe
Statements
- The NBU exam is high-stakes.
- The NBU exam is criterion-oriented.
- The NBU exam is 'independent'.
- Evidence for test validation had not been established, BUT there was a routine practice for test development and test administration.
The Socio-cognitive Framework for Test Validation (Cyril Weir, 2005)
Test-taker characteristics and:
- Context validity
- Theory-based validity
- Scoring validity
- Consequential validity
- Criterion-related validity
"Before-the-test event":
- Context validity
- Theory-based validity

"After-the-test event":
- Scoring validity
- Consequential validity
- Criterion-related validity
Scoring validity for the listening and reading parts of the test is established by:
- Item analysis
- Internal consistency
- Error of measurement
- Marker reliability

Not just looking at them: investigate, discuss, learn, and take decisions! (A sketch of two of these statistics follows below.)
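As an illustration only, here is a minimal Python sketch of two of the statistics named above: Cronbach's alpha for internal consistency, and the standard error of measurement derived from it. The 0/1 score matrix is invented for the example and is not NBU exam data.

```python
import numpy as np

# Hypothetical 0/1 item scores: rows = test takers, columns = items.
scores = np.array([
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 1],
])

k = scores.shape[1]                     # number of items
item_vars = scores.var(axis=0, ddof=1)  # variance of each item
totals = scores.sum(axis=1)             # total score per test taker

# Cronbach's alpha: internal consistency of the item set.
alpha = (k / (k - 1)) * (1 - item_vars.sum() / totals.var(ddof=1))

# Standard error of measurement: SD of totals scaled by sqrt(1 - reliability).
sem = totals.std(ddof=1) * np.sqrt(1 - alpha)

print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Standard error of measurement: {sem:.2f} score points")
```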
Analysis: 3-parameter IRT model

Advantages:
- Item parameter estimates are independent of the group of examinees used
- Test-taker ability estimates are independent of the particular set of items used
- A degree of difficulty
- A parameter to specify the discrimination
- A way to specify the content
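In the 3PL model the probability of a correct response is P(theta) = c + (1 - c) / (1 + e^(-a(theta - b))), with discrimination a, difficulty b, and pseudo-guessing c. A minimal Python sketch; the item parameters below are invented for illustration:

```python
import numpy as np

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: test-taker ability
    a: item discrimination
    b: item difficulty
    c: pseudo-guessing (lower asymptote)
    """
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

# Hypothetical easy item (b = -1.2) with some guessing (c = 0.2):
for theta in (-2.0, 0.0, 2.0):
    p = p_correct_3pl(theta, a=1.0, b=-1.2, c=0.2)
    print(f"theta = {theta:+.1f}: P = {p:.3f}")
```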
Summer session, 2010
Values of difficulty by item and test version:

Item   Version 1   Version 2   Version 3   Version 4
  1      -1.7        -1.2         1.6        -0.7
  2      -1.5        -1.2         1.9        -2.2
  3      -1.7        -2.9         2.6        -0.4
  4      -0.5        -2.4        -0.9        -0.2
  5      -3.0        -0.1         2.6        -1.4
  6      -0.7        -0.1        -0.3        -0.2
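A quick check of whether the four versions are comparable in difficulty is to average the estimates per form. A minimal Python sketch over the table above:

```python
import numpy as np

# Difficulty estimates from the table (rows = items 1-6, columns = versions 1-4).
difficulty = np.array([
    [-1.7, -1.2,  1.6, -0.7],
    [-1.5, -1.2,  1.9, -2.2],
    [-1.7, -2.9,  2.6, -0.4],
    [-0.5, -2.4, -0.9, -0.2],
    [-3.0, -0.1,  2.6, -1.4],
    [-0.7, -0.1, -0.3, -0.2],
])

# Mean difficulty per version: a rough indicator of how parallel the forms are.
for v, mean_b in enumerate(difficulty.mean(axis=0), start=1):
    print(f"Version {v}: mean difficulty {mean_b:+.2f}")
```

On these estimates, Version 3 is markedly harder than the other three forms, which is exactly the kind of observation that should feed into the decisions on the next slide.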
Possible decisions
- Remedial procedures
- Classroom assessment
- A certification decision only
Scoring validity for writing is established by:
- Criteria / rating scale
- Rating procedures:
  - Rater training
  - Standardization
  - Rating conditions
  - Rating
  - Moderation
- Statistical analysis of raters and grading (a sketch of a simple rater-agreement check follows below)
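One basic piece of that statistical analysis is checking how far two raters agree. A minimal Python sketch; the essay scores below are invented for the example:

```python
import numpy as np

# Hypothetical scores from two raters for the same eight essays.
rater1 = np.array([14, 11, 17,  9, 15, 12, 18, 10])
rater2 = np.array([13, 12, 16, 11, 15, 10, 17, 12])

# Pearson correlation: do the raters rank the essays consistently?
r = np.corrcoef(rater1, rater2)[0, 1]

# Exact agreement and agreement within one score point.
diff = np.abs(rater1 - rater2)
exact = (diff == 0).mean()
adjacent = (diff <= 1).mean()

print(f"Inter-rater correlation: {r:.2f}")
print(f"Exact agreement: {exact:.0%}, within one point: {adjacent:.0%}")
```

A full Many-Facet Rasch analysis would additionally model rater severity, task difficulty, and test-taker ability jointly; the sketch above is only a first-pass check.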
Conclusions for the essay:

Positive:
- Two raters
- Analytic writing scale
- Rubrics and input

Negative:
- The score depends on the raters
- No task-specific scale
- No standardization
It is now a fact that we will continue our work on:
- item writers' training
- content and statistical specification of the items
- test review and test revision
Sharing:
- Investigation (small steps towards "strong" validity)
- Comparison (language ability of the same population at the same level)
- Cooperation (in research projects)
Thank you
New Bulgarian University
www.nbu.bg