Roadmap Towards a Validity Argument

Presentation transcript:

Roadmap Towards a Validity Argument
Peggy Garza, Associate BILC Secretary

Views of Validity (adapted from Chapelle, 1999)
Traditional: Validity was considered a characteristic of a test: the extent to which a test measures what it is supposed to measure.
Current: Validity is considered an argument concerning test interpretation and use: the extent to which test interpretations and uses can be justified.
Traditional: Reliability was seen as distinct from and a necessary condition for validity.
Current: Reliability can be seen as one type of validity evidence.
Traditional: Validity was often established through correlations of a test with other tests.
Current: Validity is argued on the basis of a number of types of evidence, including the consequences of testing.

Validity Argument
Glossary definition: Interpretive argument that presents evidence to make a case justifying the score-based inferences and the intended uses of the test.

Testing Purpose
Develop a Validity Argument for a particular test use.
NATO Interoperability:
- Capability Targets
- Job Descriptions
- Training or exercise requirements
From STANAG 6001: Participating nations adopt the appended table of language proficiency levels for the purpose of:
- Communicating language requirements for international staff appointments.
- Recording and reporting, in international correspondence, measures of language proficiency.
- Comparing national standards through a standardized table while preserving each nation’s right to maintain its own internal proficiency standards.

How are SLPs Officially Used in NATO?
E1101 outlines English Language Capability Targets.
For NATO Command Structure staff / participation:
- As described by job description, OR
- Officers: SLP 3 3 3 3
- NCOs: SLP 2+ 2+ 2+ 2+
For NATO operations, exercises or training:
- Officers in command positions and principal staff officers: SLP 2+ 2+ 2+ 2+
- All other officers: SLP 2 2 2 2
- NCOs OR-5 and above likely to have contact with personnel from other nations: SLP 2 2 2 2
- Enlisted personnel planned to operate tactical comms, or on NATO comms networks, or who are members of tactical air control elements: SLP 2 2 1 1

Job Description
Requirement for English: SLP 2222
What does this really mean?
- SLP 2 1 2 1: Not Qualified
- SLP 2 2 3 2: Qualified
Test validation is the process of making a case for the interpretation and uses of test scores.
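
The two profiles on this slide suggest one possible reading of "SLP 2222": a candidate qualifies only when every one of the four skill scores meets or exceeds the required level. The sketch below illustrates that reading only. The function names are hypothetical, the treatment of a plus level as half a level above the base level is an assumption made for comparison purposes, and the rule itself is an interpretation of the slide, not an official NATO or BILC procedure.

```python
from typing import List


def parse_slp(profile: str) -> List[float]:
    """Parse an SLP string such as '2222', '2 2 3 2' or '2+ 2+ 2+ 2+'.

    Assumption: a plus level (e.g. '2+') is represented as base + 0.5
    so that it sorts between the base level and the next level.
    """
    levels: List[float] = []
    for ch in profile.replace(" ", ""):
        if ch.isdigit():
            levels.append(float(ch))
        elif ch == "+" and levels and levels[-1].is_integer():
            # A plus sign modifies the skill level that precedes it.
            levels[-1] += 0.5
        else:
            raise ValueError(f"Unexpected character {ch!r} in SLP {profile!r}")
    if len(levels) != 4:
        raise ValueError(f"An SLP needs four skill levels, got {len(levels)}")
    return levels


def meets_requirement(candidate: str, required: str) -> bool:
    """Return True if every skill in `candidate` is at or above `required`.

    Assumption: skills are compared position by position, and a shortfall
    in any single skill means the requirement is not met.
    """
    return all(c >= r for c, r in zip(parse_slp(candidate), parse_slp(required)))


if __name__ == "__main__":
    for profile in ("2 1 2 1", "2 2 3 2"):
        verdict = "Qualified" if meets_requirement(profile, "2222") else "Not Qualified"
        print(f"SLP {profile}: {verdict}")
```

Under this reading, a high score in one skill (the 3 in "2 2 3 2") does not compensate for a shortfall in another, which is why "2 1 2 1" fails against a 2222 requirement.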

Audience for Validity Argument
Determine the audience for your validity argument.
“An argument is made and judged by an audience” Chapelle (2011)
- National stakeholders
- BILC: IAW sound practices in STANAG 6001 testing

[Diagram adapted from Luoma 2004: the test development cycle. Stages: Score need; Test design & development; Administration/performance; Rating/evaluation; Score use. Labels: Purpose of assessment; Tasks, criteria, instruction; Performance; Criteria; Scores.]
Validity evidence is created and collected throughout the test development cycle. Starting at the planning phase, tasks, criteria and instructions must correspond to STANAG 6001 descriptors, CTA statements, and established test development guidelines.

Getting started: collecting & organizing evidence to support a validity argument for National STANAG 6001 Tests
Table of Contents:
- Test Information for Stakeholders
- Testing Personnel Qualifications and Training Records
- Outline or Summary of Training Session Content
- Test Specifications
- Test Moderation Checklists and Documents
- Test Validation Documents and Statistical Data
- Test Administration Procedures
- Test Security Handbook
- Records of Norming Sessions for Raters
- Statistical Data on Rater Reliability
- Records and Summaries of a Priori Research
- Records and Summaries of a Posteriori Research
- _________________

Roadmap Activity
Tasks:
- Identify your audience(s)
- List evidence or documentation in as many Building Blocks as possible
- Share/exchange ideas and examples of evidence

Bibliography
Chapelle, C. A. (1999). Validity in language assessment. Annual Review of Applied Linguistics, 19(1), 254–272.
Chapelle, C. A. (2012). Validity argument for language assessment: The framework is simple. Language Testing, 29(1), 19–27.
Luoma, S. (2004). Assessing Speaking. Cambridge: Cambridge University Press.