Automatic Assessment of Spoken Modern Standard Arabic. Pearson Knowledge Technologies, Palo Alto, California. NAACL 2009, Boulder, Colorado.


1 Automatic Assessment of Spoken Modern Standard Arabic
NAACL, Boulder, Colorado, 5 June 2009
Pearson Knowledge Technologies, Palo Alto, California
Jian Cheng, Jared Bernstein, Ulrike Pado, Masa Suzuki

2 Outline
1. Pearson Knowledge Technologies
2. How Versant tests operate
3. Versant Arabic Test (development)
4. Validation evidence
5. Predictive accuracy

3 Pearson Knowledge Tech. (PKT)
(KAT + Ordinate) are now PKT
KAT ≈ {LSA, Essay Scoring, Write-to-Learn, PTE, etc.}
Ordinate ≈ {Versant, ORF for NCES, VersaReader, PTE, etc.}
PKT is part of Pearson
Pearson ≈ {FT, Economist, Penguin, Longman, PsychCorp, … etc.}
PearsonKT is in Boulder, Colorado and Palo Alto, California.

4 Test delivery
[Diagram. Labels: Anywhere; speech; report; Communication Network; Delivery Interface; California; Scoring system; Database (tests, prompts, responses); ENGLISH, SPANISH, DUTCH, ARABIC]

5 How Versant tests operate
[Diagram. Labels: Versant Database, Test Delivery Server, Scoring. Example utterance: “The train’s been delayed by one hour”]

6 Versant Arabic Test
DLI purpose:
  ~1000 students at DLI need predictive speaking tests
Requirements:
  Accurate test of Arabic listening & speaking
  Convenient to use at DLI and worldwide (ILR is costly)
  Suitable for repeated formative testing
  High peak capacity for mass screening

7 Construct Comparison
OPI Construct: Oral proficiency, as manifest in an Oral Proficiency Interview, is compatible with communicative competence as reflected in the functional level and/or complexity of content accurately produced.
Versant Construct: Facility in spoken language, that is, the ability to understand spoken language and speak appropriately in response at a conversational pace on everyday topics.

8 Versant Arabic Test: Test Structure
Part A: Reading
Part B: Repeat 1
Part C: Short Answers
Part D: Sentence Builds
Part E: Repeat 2
Part F: Passage Retelling

9 Versant Scoring
[Diagram. Task types: Read, Repeat Sentence 1, Sent Build, Repeat Sentence 2, SAQ, Passage. Other label: Human Scoring. Subscores: Vocabulary, Sentence Mastery, Fluency, Pronunciation. Weights as extracted: 20%, 30%, 20%]

10 How Versants are developed (1) (Versant Arabic Test)
[Diagram. Labels: Test Spec; Native Test Developers; Item Text; Recorded Items; Arabic Natives; Arabic Learners; Native Scribes; transcripts; Native Judges; Criteria; scale scores; Scale Estimates; Ordinate System; Versant Scores; Concurrent ILR Interviews; ILR Scores; Validation; Internal; External]

11 Arabic Challenges: Voweling
kutubu al-waladi: the books of the boy
kataba al-waladu: wrote the boy (subj.)
No disambiguating short vowels written
Vowels carry phonetic information
Vowels carry grammar information
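The voweling problem on this slide can be illustrated with a minimal sketch. The Arabic spellings below are supplied here as an assumption (the slide gives only transliterations), and the helper name strip_short_vowels is hypothetical. The sketch removes Unicode combining marks, which include the short-vowel diacritics, and shows that the two voweled words collapse to the same written form.

```python
import unicodedata


def strip_short_vowels(word: str) -> str:
    """Drop combining marks (Unicode category Mn), which include Arabic short-vowel diacritics."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


# Fully voweled spellings, written with escapes for clarity.
# Assumption: these correspond to the slide's transliterations.
KUTUBU = "\u0643\u064F\u062A\u064F\u0628\u064F"  # kutubu, "books (of)"
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kataba, "wrote"

bare_kutubu = strip_short_vowels(KUTUBU)
bare_kataba = strip_short_vowels(KATABA)

# The voweled forms differ, but the unvoweled written forms are identical.
print(KUTUBU != KATABA, bare_kutubu == bare_kataba)
```

A recognizer or lexicon built from ordinary unvoweled text therefore sees one spelling for both words, which is presumably why the acknowledgments mention diacritic markings.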

12 Complex Morphology
li ziyaarat naa: "for visit of us", that is, "for our visit"
Complicates lexicon lookup, frequency estimates…
“Short” Arabic items are harder than English items with the same number of words

13 Development & Run-time Processes
[Diagram: Compilation of expectation and runtime flow]

14 Prompt Voices and Training Samples
Training data sources
Native Data: Egypt 484, Syria 281, Iraq 179, Palestine 187, Other 517, Total 1648
Learner Data: DLI 1120, Non-DLI 552, Total 1672
Prompt Voices
Country: Egypt, Iraq, Jordan, Morocco, Lebanon, Palestine, Syria
Voices (as extracted): F, M M F M

15 Validation Criteria
Reliability: Scores are consistent
Validity:
  Native and non-native speakers should be clearly distinct
  MSA and dialect speakers should be distinct (since we’re testing MSA)
  Machine scores should predict human scores

16 Reliability
Score              Split-Half Reliability (N = 134)   Test-Retest Reliability (N = 100)
Overall            0.98                               0.97
Sentence Mastery   0.97                               0.96
Vocabulary         0.89                               0.82
Fluency            0.97                               0.96
Pronunciation      0.96                               0.94
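For readers unfamiliar with split-half reliability, the following is a minimal sketch of one common way to compute it. The slide does not say how the halves were formed or whether a correction was applied, so the odd/even item split and the Spearman-Brown step-up are assumptions, the function names are hypothetical, and the item scores are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per test taker (assumed odd/even split)."""
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_half, even_half))


# Invented item scores for five test takers, four items each.
scores = [
    [3, 4, 3, 4],
    [1, 2, 2, 1],
    [4, 4, 3, 4],
    [2, 1, 2, 2],
    [0, 1, 1, 0],
]
print(round(split_half_reliability(scores), 3))
```

Test-retest reliability, the second column, needs no split: it is the correlation between scores from two administrations to the same test takers.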

17 Native ~ Non-Native Scores

18 Natives by Countries

19 Educated ~ Uneducated Speakers
[Figure: cumulative density vs. Arabic overall score]

20 Machine-Human Comparison
Score              Correlation (N = 134)
Overall            0.97
Sentence Mastery   0.97
Vocabulary         0.96
Fluency            0.84
Pronunciation      0.83

21 How Versants Compare to OPIs
[Scatter plot: Versant Arabic Overall Score vs. ILR OPI Score (logits); N = 118, r = 0.87]

22 Spanish & English: Versant ~ Human
[Scatter plots: Versant score vs. ILR OPI Score (logits). Spanish: N = 37, r = 0.92. English: N = 151, r = 0.86]

23 Summary
Versant Arabic Test (VAT) is in operation
Based on a large and wide body of transcribed spoken material
VAT is available on demand
Returns consistent, accurate scores that reflect real-time skills with MSA
VAT can triage or screen for OPI tests

24 النهاية (The End)
Thanks to Waheed Samy, Naima Bousofara Omar, Eli Andrews, Mohamed Al-Saffar, Nazir Kikhia, Rula Kikhia, and Linda Istanbulli for item development and data collection/transcription in Arabic, and to Andy Freeman for providing diacritic markings.

