NATO BAT Testing: The First 200 BILC Professional Seminar 6 October, 2009 Copenhagen, Denmark Dr. Elvira Swender, ACTFL.


1 NATO BAT Testing: The First 200 BILC Professional Seminar 6 October, 2009 Copenhagen, Denmark Dr. Elvira Swender, ACTFL

2 This Report
1. History of Benchmark Advisory Tests (BAT)
2. 2009 Administration of BAT in 4-Skills
3. BAT Scores
4. Comparing National Scores to Benchmark Scores
5. Observations

3 This Report
1. History of Benchmark Advisory Tests (BAT)
2. 2009 Administration of BAT
3. Combined BAT Scores
4. Comparing National Scores to Benchmark Scores
5. Observations

4 Why Benchmark Testing?
To provide an external measure against which nations can compare their national STANAG test results
To promote relative parity of scale interpretation and application across national testing programs
To standardize what is tested and how it is tested

5 BAT History
Launched as a volunteer, collaborative project
  – The BILC Test Working Group
      13 members from 8 nations
      Contributions received from many other nations
  – The original goal was to develop a Reading test
Later awarded a competitive contract by ACT
  – December, 2006

6 BAT History (cont’d)
ACTFL working with BILC Working Group
  – To develop tests in 4 skill modalities
Reading and Listening tests piloted and validated
Speaking and Writing tests developed
  – Testers and raters trained and certified
Test administration and reporting protocols developed
200 BAT 4-skills tests allocated under the contract
Tests administered and rated
Scores reported to Nations

7 BAT Reading and Listening Tests
Internet-delivered and computer scored
Criterion-referenced tests
  – Allow for direct application of the STANAG Proficiency Scale
Each proficiency level is tested separately
  – Test takers take all items for Levels 1, 2, 3
  – 20 texts at each level; one item with multiple choice responses per text
The proficiency rating is assigned based on two separate scores
  – “Floor” – sustained ability across a range of tasks and contexts specific to one level
  – “Ceiling” – non-sustained ability at the next higher proficiency level
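
The floor-and-ceiling idea on this slide can be sketched in a few lines of Python. The slide gives no cut scores, so the 70% "sustained" threshold and the 40% "plus" threshold below are purely hypothetical placeholders, as are the function and parameter names. This is an illustration of the logic under those assumptions, not the actual BAT scoring rule.

```python
# Hypothetical cut scores: the slides do not state the real ones.
SUSTAINED = 0.70  # assumed share of items correct needed to "sustain" a level
PLUS = 0.40       # assumed share correct at the next level to earn a "plus"


def assign_rating(correct_by_level, items_per_level=20):
    """Return a rating such as '0+', '1', '1+', '2', '2+' or '3'.

    correct_by_level maps a level (1, 2, 3) to the number of items
    answered correctly at that level (20 items per level on the slide).
    """
    share = {lvl: correct_by_level.get(lvl, 0) / items_per_level
             for lvl in (1, 2, 3)}

    # Floor: the highest level sustained, with every lower level sustained too.
    floor = 0
    for lvl in (1, 2, 3):
        if share[lvl] >= SUSTAINED:
            floor = lvl
        else:
            break

    if floor == 3:
        return "3"  # the tests go no higher than Level 3

    # Ceiling: partial, non-sustained success at the next higher level.
    if share[floor + 1] >= PLUS:
        return f"{floor}+"
    return str(floor)


print(assign_rating({1: 18, 2: 15, 3: 5}))  # sustained at Level 2, weak at Level 3
print(assign_rating({1: 18, 2: 9, 3: 2}))   # sustained at Level 1, partial at Level 2
```

Under these assumed thresholds the first call yields a plain level and the second a plus level; different cut scores would of course shift the boundaries.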

8 BAT Speaking Test
Telephonic Oral Proficiency Interview
  – Goal is to produce a speech sample that best demonstrates the speaker’s highest level of spoken language ability across the tasks and contexts for the level
Interview consists of
  – Standardized structure of “level checks” and “probes”
  – NATO-specific role-play situation
Conducted and rated by one certified BAT-S Tester
  – Independently second rated by a separate certified tester or rater
Ratings must agree exactly
  – Level and plus level scores are assigned
  – Discrepancies are arbitrated

9 BAT Writing Test
Internet-delivered
Open constructed response
Four multi-level prompts
  – Prompts target tasks and contexts of STANAG levels 1, 2, 3
  – NATO-specific prompt
Rated by a minimum of two certified BAT-W Raters
  – Ratings must agree exactly
  – Level and plus level scores are assigned
  – Discrepancies are arbitrated
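
The double-rating rule shared by the Speaking and Writing tests (two independent ratings, exact agreement required, discrepancies arbitrated) can be sketched as below. The slides do not say how arbitration works, so `arbitrate` is a hypothetical stand-in supplied by the caller, and the function name is likewise an assumption.

```python
def final_rating(first, second, arbitrate):
    """Return the reported rating for one Speaking or Writing sample.

    first, second: independent ratings such as '2' or '2+'.
    arbitrate: hypothetical callable standing in for the arbitration
               step, whose actual procedure the slides do not describe.
    """
    # Ratings must agree exactly, plus levels included: '2' and '2+' differ.
    if first == second:
        return first
    return arbitrate(first, second)


# Usage with a made-up arbitration policy (an assumption, for illustration only):
# a third rater who happens to side with the lower of the two ratings.
third_rater = lambda a, b: min(a, b)

print(final_rating("2+", "2+", third_rater))  # agreement: no arbitration needed
print(final_rating("2", "2+", third_rater))   # discrepancy: sent to arbitration
```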

10 This Report
1. History of Benchmark Advisory Tests (BAT)
2. 2009 Administration of BAT battery
3. Combined BAT Scores
4. Comparing National Scores to Benchmark Scores
5. Observations

11 2009 BAT Administration
Allocation to 11 Nations
  – 8 Nations have completed testing
Testing began in May, 2009
Tests administered by LTI, the ACTFL Testing Office

12 2009 BAT Administration
Each Nation has a customized client site
  – Request tests
  – View and print test schedules
  – Obtain test administration instructions, passwords, and test codes
  – Retrieve Ratings

14 This Report
1. History of Benchmark Advisory Tests (BAT)
2. 2009 Administration of BAT
3. Combined BAT Scores
4. Comparing National Scores to Benchmark Scores
5. Observations

15 Total Number of BAT Scores
Skill       BAT
Listening   119
Speaking    115
Reading     119
Writing     115

16 BAT Scores by Level Cumulative

17 This Report
1. History of Benchmark Advisory Tests (BAT)
2. 2009 Administration of BAT
3. Combined BAT Scores
4. Comparing National Scores to Benchmark Scores
5. Observations

18 Alignment of National Scores and BAT Scores
          Listening    Speaking     Reading      Writing
Black     (5)  40%     (7)  29%     –    –       –    –
White     (11) 64%     (18) 56%     (13) 92%     (18) 39%
Red       (18) 89%     (18) 83%     (18) 83%     (18) 50%
Blue      (20) 85%     (19) 47%     (20) 55%     (20) 60%
Maroon    (16) 69%     (15) 47%     (14) 64%     (18) 50%
Purple    (12)  8%     –    –       (13) 54%     –    –
Yellow    (17) 24%     (18)  0%     (18) 33%     (18)  0%

19 This Report
1. History of Benchmark Advisory Tests (BAT)
2. 2009 Administration of BAT
3. Combined BAT Scores
4. Comparing National Scores to Benchmark Scores
5. Observations

20 Observations – Listening Scores
Exact agreement of BAT and National Scores is 58%
  – 69 of the 119 Listening scores agree exactly
When the scores disagree, the National score is HIGHER 88% of the time
In 8 cases (7%), disagreement is across two levels
  – 1 vs 3 and 2 vs 4

21 Observations – Speaking Scores
Exact agreement of BAT and National Scores is 46%
  – 53 of 115 Speaking scores agree exactly
When the scores disagree, the National score is HIGHER in all cases
In 6 cases (6%), the disagreement is across two levels
  – 1 vs 3 and 2 vs 4

22 Observations – Reading Scores
Exact agreement of BAT and National Scores is 62%
  – 74 of 119 Reading scores agree exactly
When the scores disagree, the National score is HIGHER in 85% of the cases
In 2 cases, the disagreement is across two levels
  – 1 vs 3

23 Observations – Writing Scores
Exact agreement of BAT and National Scores is 38%
  – 44 of 115 Writing scores agree exactly
When there is disagreement, the National score is HIGHER in all cases
In 15 cases, the disagreement is across two levels
  – 1 vs 3 and 2 vs 4
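
The exact-agreement percentages on the four Observations slides follow directly from the reported counts. A minimal check in Python, using only the figures quoted on the slides (the variable and function names are my own):

```python
# (scores in exact agreement, total scores) as reported on slides 20-23
agreement = {
    "Listening": (69, 119),
    "Speaking": (53, 115),
    "Reading": (74, 119),
    "Writing": (44, 115),
}


def percent(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


rates = {skill: percent(k, n) for skill, (k, n) in agreement.items()}
print(rates)
# {'Listening': 58, 'Speaking': 46, 'Reading': 62, 'Writing': 38}
```

The rounded values match the 58%, 46%, 62% and 38% stated on the slides, with the receptive skills (Reading, Listening) agreeing more often than the productive ones (Speaking, Writing).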

24 Accounting for Strictness or Leniency
Testing rehearsed rather than unrehearsed material
  – Performance vs proficiency
Inconsistencies in interpretation of the STANAG
When “plus” ratings are not used, the tendency to award the next higher level rating to a performance that is substantially better than a baseline performance

25 For Receptive Skills
Compensatory cut score setting
Lack of alignment of author purpose, text type, and reader task at level
Inadequate item response alternatives

26 For Productive Skills
Misalignment of test type and test purpose
  – Ex: list of discrete questions when goal is to measure spoken language proficiency
Inadequate tester/rater norming

27 Plus Ratings
Within the Level 1 Range
  – 60% of ratings are 1
  – 40% of ratings are 1+
Within the Level 2 Range
  – 50% of ratings are 2
  – 50% of ratings are 2+

28 Profiles
Only 12 of 115 profiles (10%) were “flat”
  – 1 1 1 1 (8)
  – 2 2 2 2 (2)
  – 3 3 3 3 (2)
All remaining profiles are mixed
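
A "flat" profile here means the same rating in all four skills. The sketch below shows one way to test for that and re-derives the 12-of-115 figure from the counts on the slide. The function name and the tuple representation of a profile are assumptions, and the slide does not say whether plus levels were collapsed before comparing, so this version compares ratings exactly as given.

```python
def is_flat(profile):
    """True when all four skill ratings in the profile are identical."""
    return len(set(profile)) == 1


# Counts of flat profiles as listed on the slide
flat_counts = {"1 1 1 1": 8, "2 2 2 2": 2, "3 3 3 3": 2}

total_flat = sum(flat_counts.values())       # 8 + 2 + 2 = 12
share_flat = round(100 * total_flat / 115)   # 12 of 115 profiles, about 10%

print(is_flat(("2", "2", "2", "2")))   # True
print(is_flat(("2", "2+", "3", "2")))  # False: a mixed profile
print(total_flat, share_flat)          # 12 10
```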

29 We are all wondering. What will the future bring?

30 Let’s hope it’s not the same kind of anxiety these early linguists experienced.

32 Questions?

33 Extra Slides

34 Side-by-side BAT and National Test Scores
Skill       BAT scores only   BAT Scores and National Scores
Reading     119               103
Listening   119               100
Speaking    115               95
Writing     115               95

35 BAT Scores by Level Reading
Level   BAT-R   % of Total
0+      1       1
1       11      9
1+      13      11
2       16      14
2+      12      10
3       66      55
Total   119

36 BAT Scores by Level Listening
Level   BAT-L   % of Total
0+      3       2
1       12      10
1+      19      16
2       15      13
2+      21      18
3       49      41
Total   119

37 BAT Scores by Level Speaking
Level   BAT-S   % of Total
1       10      9
1+      22      19
2       39      34
2+      29      25
3       15      13
Total   115

38 BAT Scores by Level Writing
Level   BAT-W   % of Total
1       11      10
1+      28      24
2       51      44
2+      22      19
3       3       3
Total   115

39 Comparing Scores by Level Reading
          BAT-R   National Test
Level 1   23      9
Level 2   23      35
Level 3   55      49
Level 4   -       10

              BAT L1   BAT L2   BAT L3
National L1   9        -
National L2   12       17       6
National L3   2        5        40
National L4                     10

40 Comparing Scores by Level Listening
          BAT-L   National Test
Level 1   24      12
Level 2   29      28
Level 3   44      52
Level 4   -       8

              BAT L1   BAT L2   BAT L3
National L1   10
National L2   8        15       5
National L3   6        12       33
National L4            2        6

41 Comparing Scores by Level Speaking
          BAT-S   National Test
Level 1   28      11
Level 2   52      34
Level 3   15      44
Level 4   -       6

              BAT L1   BAT L2   BAT L3
National L1   11
National L2   14       20
National L3   4        28       12
National L4            4        2

42 Comparing Scores by Level Writing
          BAT-W   National Test
Level 1   35      14
Level 2   57      36
Level 3   3       35
Level 4   -       10

              BAT L1   BAT L2   BAT L3
National L1   14
National L2   16       20
National L3   5        27       3
National L4            10

43 Alignment of National Scores and BAT Scores
          Listening    Speaking     Reading      Writing
Black     (5)  40%     (7)  29%     –    –       –    –
White     (11) 64%     (18) 56%     (13) 92%     (18) 39%
Red       (18) 89%     (18) 83%     (18) 83%     (18) 50%
Blue      (20) 85%     (19) 47%     (20) 55%     (20) 60%
Maroon    (16) 69%     (15) 47%     (14) 64%     (18) 50%
Purple    (12)  8%     –    –       (13) 54%     –    –
Yellow    (17) 24%     (18)  0%     (18) 33%     (18)  0%

