CLEF 2008 Workshop Aarhus, September 17, 2008  ELDA

Overview of QAST 2008 - Question Answering on Speech Transcriptions

J. Turmo, P. Comas (1), L. Lamel, S. Rosset (2), N. Moreau, D. Mostefa (3)
(1) UPC, Spain
(2) LIMSI, France
(3) ELDA, France

QAST Website: http://www.lsi.upc.edu/~qast/

Outline
1. Objectives
2. Description of the tasks
3. Participants
4. Results
5. Future work

Objectives of QAST 2008
- Development of robust QA for speech transcripts
- Measure loss due to ASR inaccuracies: manual transcriptions, automatic transcriptions
- Measure loss at different ASR word error rates
- Test with different kinds of speech: spontaneous speech, prepared speech
- Development of QA for languages other than English: English, French, Spanish

QAST 2008 Organization
Task jointly organized by:
- UPC, Spain (Coordinator): J. Turmo, P. Comas
- ELDA, France: N. Moreau, D. Mostefa
- LIMSI-CNRS, France: S. Rosset, L. Lamel

Evaluation Data

Corpus            Lang.    Description                          Tasks                          WER
CHIL (QAST 2007)  English  Lectures (~25h)                      T1(a): Manual transcriptions   -
                                                                T1(b): ASR transcriptions      20%
AMI (QAST 2007)   English  Meetings (~100h)                     T2(a): Manual transcriptions   -
                                                                T2(b): ASR transcriptions      38%
ESTER             French   Broadcast News (~10h)                T3(a): Manual transcriptions   -
                                                                T3(b): ASR transcriptions      11.9% / 23.9% / 35.4%
EPPS-EN           English  Sessions European Parliament (~3h)   T4(a): Manual transcriptions   -
                                                                T4(b): ASR transcriptions      10.6% / 14.0% / 24.1%
EPPS-ES           Spanish  Sessions European Parliament (~3h)   T5(a): Manual transcriptions   -
                                                                T5(b): ASR transcriptions      11.5% / 12.7% / 13.7%

Questions

                      Development set               Evaluation set
Task                  Data          # questions     Data           # questions
T1 (CHIL, English)    10 seminars   50              15 seminars    100
T2 (AMI, English)     50 meetings   50              118 meetings   100
T3 (ESTER, French)    6 shows       50              12 shows       100
T4 (EPPS, English)    3 sessions    50              3 sessions     100
T5 (EPPS, Spanish)    1 session     50              5 sessions     100

- Factual questions: ~75%
  Expected answers = named entities (10 types: person, location, organization, language, system, measure, time, color, shape, material)
- Definition questions: ~25%
  4 types of answers: person, organization, object, other
- ‘NIL’ questions: ~10%

Submissions
Participants could submit up to:
- 2 submissions per task (and per WER)
- 5 answers per question
Answers for ‘manual transcriptions’ tasks: Answer_string + Doc_ID
Answers for ‘automatic transcriptions’ tasks: Answer_string + Doc_ID + Time_start + Time_end

Assessments
Four possible judgments (as in QA@CLEF): Correct / Incorrect / Inexact / Unsupported
- ‘Manual transcriptions’ tasks: manual assessment with the QASTLE interface
- ‘Automatic transcriptions’ tasks: automatic assessment (script) + manual check
2 metrics:
- Mean Reciprocal Rank (MRR): measures how well right answers are ranked on average
- Accuracy: fraction of correct answers ranked in the first position
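The two metrics named on the Assessments slide can be illustrated with a minimal Python sketch. It assumes that only a "correct" judgment counts as a right answer (inexact and unsupported answers are treated as wrong) and that each question carries at most 5 ranked answers, as on the Submissions slide. The function names and the toy judgments are hypothetical and are not taken from the QAST scoring script.

```python
from typing import Sequence

MAX_ANSWERS = 5  # per-question answer limit stated on the Submissions slide


def reciprocal_rank(judgments: Sequence[str]) -> float:
    """Return 1/rank of the first 'correct' answer, or 0.0 if there is none.

    Assumption: 'inexact' and 'unsupported' are scored like 'incorrect'.
    """
    for rank, judgment in enumerate(judgments[:MAX_ANSWERS], start=1):
        if judgment == "correct":
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(all_judgments: Sequence[Sequence[str]]) -> float:
    """Average the reciprocal rank over all questions."""
    return sum(reciprocal_rank(j) for j in all_judgments) / len(all_judgments)


def accuracy(all_judgments: Sequence[Sequence[str]]) -> float:
    """Fraction of questions whose first-ranked answer is 'correct'."""
    hits = sum(1 for j in all_judgments if j and j[0] == "correct")
    return hits / len(all_judgments)


# Toy data (hypothetical): one list of ranked judgments per question.
EXAMPLE = [
    ["correct", "incorrect"],                            # reciprocal rank 1
    ["incorrect", "correct"],                            # reciprocal rank 1/2
    ["inexact", "unsupported", "incorrect"],             # reciprocal rank 0
    ["incorrect", "incorrect", "incorrect", "correct"],  # reciprocal rank 1/4
]

if __name__ == "__main__":
    print(mean_reciprocal_rank(EXAMPLE))  # 0.4375
    print(accuracy(EXAMPLE))              # 0.25
```

With these definitions MRR is never lower than accuracy, since a correct first answer contributes 1 to both and lower-ranked correct answers add only to MRR.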

Participants
49 submissions from 5 participants:

Participant            T1a  T1b  T2a  T2b  T3a  T3b  T4a  T4b  T5a  T5b
Univ. Chemnitz (CUT)   2    -    -    -    -    -    2    -    -    -
INAOE                  -    -    -    -    -    -    1    2    -    -
LIMSI                  1    1    1    1    2    3    1    3    2    3
Univ. Alicante (UA)    -    -    -    -    -    -    1    3    -    -
UPC                    1    2    1    2    -    -    1    6    1    6
TOTAL:                 4    3    2    3    2    3    6    14   3    9

Best results for manual transcriptions

        Factual           Definitional      All
Task    MRR    Acc(%)     MRR    Acc(%)     MRR    Acc(%)
T1a     0.53   47.4       0.18   18.2       0.45   41.0
T2a     0.47   37.8       0.22   19.2       0.40   33.0
T3a     0.50   45.3       0.47   44.0       0.49   45.0
T4a     0.44   40.0       0.16   16.0       0.37   34.0
T5a     0.32   29.3       0.44   36.0       0.35   31.0

Best results for ASR transcriptions

                 All               All (manual)
Task    WER      MRR    Acc(%)     MRR    Acc(%)
T1b     20.0%    0.34   31.0       0.45   41.0
T2b     38.0%    0.20   18.0       0.40   33.0
T3b     11.9%    0.45   41.0       0.49   45.0
        23.9%    0.30   25.0
        35.4%    0.24   21.0
T4b     10.6%    0.33   30.0       0.37   34.0
        14.0%    0.24   20.0
        24.1%    0.23   19.0
T5b     11.5%    0.26   24.0       0.35   31.0
        12.7%    0.23   20.0
        13.7%    0.25   23.0
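The loss from manual to ASR transcriptions can be read off the "All" accuracy figures of the two results slides. The sketch below is only arithmetic on those published numbers; the dictionary layout and the function name `accuracy_loss` are hypothetical, and the slides themselves do not say how the organizers quantified the loss.

```python
# Best "All" accuracy (%) on manual transcriptions, per task (from the slides).
MANUAL_ACC = {"T1": 41.0, "T2": 33.0, "T3": 45.0, "T4": 34.0, "T5": 31.0}

# Best "All" accuracy (%) on ASR transcriptions as (WER %, accuracy %) pairs.
ASR_ACC = {
    "T1": [(20.0, 31.0)],
    "T2": [(38.0, 18.0)],
    "T3": [(11.9, 41.0), (23.9, 25.0), (35.4, 21.0)],
    "T4": [(10.6, 30.0), (14.0, 20.0), (24.1, 19.0)],
    "T5": [(11.5, 24.0), (12.7, 20.0), (13.7, 23.0)],
}


def accuracy_loss(task: str) -> list[tuple[float, float, float]]:
    """Return (WER, absolute drop in points, relative drop in %) per WER level."""
    base = MANUAL_ACC[task]
    return [
        (wer, round(base - acc, 1), round(100.0 * (base - acc) / base, 1))
        for wer, acc in ASR_ACC[task]
    ]


if __name__ == "__main__":
    for task in MANUAL_ACC:
        for wer, drop, rel in accuracy_loss(task):
            print(f"{task}b  WER {wer:>4}%  -{drop} points  ({rel}% relative)")
```

For example, `accuracy_loss("T3")` reports absolute drops of 4.0, 20.0 and 24.0 points at WER 11.9%, 23.9% and 35.4%.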

Conclusion
- 5 participants (as in 2007)
- 4 different countries (vs. 5 in 2007): Germany, Spain, France, Mexico
- 49 submitted runs (vs. 28 runs in 2007)
- Loss in accuracy with ASR transcribed speech (performance falls when WER rises)
- QAST 2009: Written & Oral Questions...
