
1 Overview of QAST 2007 - Question Answering on Speech Transcriptions
CLEF 2007 Workshop, Budapest, September 19, 2007
J. Turmo, P. Comas (1), C. Ayache, D. Mostefa (2), L. Lamel and S. Rosset (3)
(1) UPC, Spain  (2) ELDA, France  (3) LIMSI, France
QAST website: http://www.lsi.upc.edu/~qast/

2 Outline
1. Task
2. Participants
3. Results
4. Conclusion and future work

3 Task: QAST 2007 Organization
Task jointly organized by:
- UPC, Spain (J. Turmo, P. Comas), coordinator
- ELDA, France (C. Ayache, D. Mostefa)
- LIMSI-CNRS, France (S. Rosset, L. Lamel)

4 Task: Evaluation Protocol
Four tasks were proposed:
- T1: QA on manual transcriptions of lectures
- T2: QA on automatic transcriptions of lectures
- T3: QA on manual transcriptions of meetings
- T4: QA on automatic transcriptions of meetings
Two data collections were used:
- The CHIL corpus: around 25 hours (1 hour per lecture); lecture domain: speech and language processing
- The AMI corpus: around 100 hours (168 meetings); meeting domain: design of a television remote control

5 Task: Questions and answer types
For each task, two sets of questions were provided:
Development set (1 February 2007):
- Lectures: 10 lectures, 50 questions
- Meetings: 50 meetings, 50 questions
Evaluation set (18 June 2007):
- Lectures: 15 lectures, 100 questions
- Meetings: 118 meetings, 100 questions

6 Task: Questions and answer types
Factual questions only, e.g. "Who is a guru in speech recognition?"
Expected answers are named entities, drawn from a closed list of NE types: person, location, organization, language, system/method, measure, time, color, shape, material.
No definition questions were included.
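To make the closed answer-type list concrete, here is a minimal Python sketch of how a checker might represent it. The type labels come from the slide, but the set representation and the helper function are illustrative assumptions, not part of the official QAST tooling:

```python
# The ten named-entity types accepted as expected answers in QAST 2007
# (labels per the slide). Representation is illustrative only.
ANSWER_TYPES = {
    "person", "location", "organization", "language", "system/method",
    "measure", "time", "color", "shape", "material",
}

def is_valid_answer_type(label: str) -> bool:
    """Check whether a question's expected-answer type is in the closed set."""
    return label.strip().lower() in ANSWER_TYPES
```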

7 Task: Human judgment
Assessors used QASTLE, an evaluation tool developed in Perl by ELDA, to judge the submitted answers. Four possible judgments:
- Correct
- Incorrect
- Inexact (too short or too long)
- Unsupported (correct answer but wrong document)
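QASTLE itself was written in Perl and its internals are not shown here; purely as an illustration, the four judgment categories could be modeled as below. The assumption that only fully Correct answers count toward the scores on the next slide follows common CLEF QA practice and is not stated on this slide:

```python
from enum import Enum

class Judgment(Enum):
    """The four assessor judgments used in QAST 2007."""
    CORRECT = "correct"
    INCORRECT = "incorrect"
    INEXACT = "inexact"          # answer string too short or too long
    UNSUPPORTED = "unsupported"  # right answer, but wrong supporting document

def counts_as_right(judgment: Judgment) -> bool:
    # Assumption: only fully Correct answers contribute to MRR and
    # accuracy; Inexact and Unsupported answers are scored as wrong.
    return judgment is Judgment.CORRECT
```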

8 Task: Scoring
Two metrics were used:
- Mean Reciprocal Rank (MRR): measures how highly the first correct answer is ranked.
- Accuracy: the fraction of questions for which a correct answer is ranked first in the list of up to 5 returned answers.
Participants could submit up to 2 submissions per task, with up to 5 answers per question.
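Below is a minimal sketch of the two metrics, assuming each question is represented by its ranked list of up to 5 judged answers (True where the assessor marked the answer Correct). It mirrors the definitions above, not the organizers' actual scoring code:

```python
def mean_reciprocal_rank(runs: list[list[bool]]) -> float:
    """MRR: average over questions of 1/rank of the first correct answer
    (contributes 0 when none of the up-to-5 returned answers is correct)."""
    total = 0.0
    for answers in runs:
        for rank, is_correct in enumerate(answers, start=1):
            if is_correct:
                total += 1.0 / rank
                break
    return total / len(runs)

def accuracy(runs: list[list[bool]]) -> float:
    """Fraction of questions whose top-ranked answer is correct."""
    return sum(1 for answers in runs if answers and answers[0]) / len(runs)

# Toy example: first correct answer at rank 1, rank 2, and never.
runs = [[True, False], [False, True, False], [False, False]]
print(mean_reciprocal_rank(runs))  # (1 + 0.5 + 0) / 3 = 0.5
print(accuracy(runs))              # 1/3 = 0.33...
```

Under these definitions MRR is always at least as large as accuracy, a pattern visible in every row of the result tables that follow.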

9 Participants
Five teams submitted results for one or more QAST tasks:
- CLT, Center for Language Technology, Australia
- DFKI, Germany
- LIMSI-CNRS, Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur, France
- Tokyo Institute of Technology, Japan
- UPC, Universitat Politècnica de Catalunya, Spain
In total, 28 submission files were evaluated:

Corpus            Task          Submissions
CHIL (lectures)   T1 (manual)   8
CHIL (lectures)   T2 (ASR)      9
AMI (meetings)    T3 (manual)   5
AMI (meetings)    T4 (ASR)      6

10 Results
Due to some problems (typos, wrong answer types, and missing word-level time information for some AMI meetings), some questions were removed from the test set for scoring. Final counts:
- T1 and T2: 98 questions
- T3: 96 questions
- T4: 93 questions

11 Results for T1: QA on CHIL manual transcriptions

System     # Questions Returned  # Correct Answers  MRR   Accuracy
clt1_t1    98                    16                 0.09  0.06
clt2_t1    98                    16                 0.09  0.05
dfki1_t1   98                    19                 0.17  0.15
limsi1_t1  98                    43                 0.37  0.32
limsi2_t1  98                    56                 0.46  0.39
tokyo1_t1  98                    32                 0.19  0.14
tokyo2_t1  98                    34                 0.20  0.14
upc1_t1    98                    54                 0.53  0.51

12 Results for T2: QA on CHIL automatic transcriptions

System     # Questions Returned  # Correct Answers  MRR   Accuracy
clt1_t2    98                    13                 0.06  0.03
clt2_t2    98                    12                 0.05  0.02
dfki1_t2   98                    9                  0.09  0.09
limsi1_t2  98                    28                 0.23  0.20
limsi2_t2  98                    28                 0.24  0.21
tokyo1_t2  98                    17                 0.12  0.08
tokyo2_t2  98                    18                 0.12  0.08
upc1_t2    96                    37                 0.37  0.36
upc2_t2    97                    29                 0.25  0.24

13 Results for T3: QA on AMI manual transcriptions

System     # Questions Returned  # Correct Answers  MRR   Accuracy
clt1_t3    96                    31                 0.23  0.16
clt2_t3    96                    29                 0.25  0.20
limsi1_t3  96                    31                 0.28  0.25
limsi2_t3  96                    40                 0.31  0.25
upc1_t3    95                    23                 0.22  0.20

14 Results for T4: QA on AMI automatic transcriptions

System     # Questions Returned  # Correct Answers  MRR   Accuracy
clt1_t4    93                    17                 0.10  0.06
clt2_t4    93                    19                 0.13  0.08
limsi1_t4  93                    21                 0.19  0.18
limsi2_t4  93                    21                 0.19  0.17
upc1_t4    91                    22                 0.22  0.21
upc2_t4    92                    17                 0.15  0.13

15 Conclusion and future work
- 5 participants from 5 different countries (France, Germany, Spain, Australia and Japan), for a total of 28 runs
- Very encouraging results
- QA technology can be useful for dealing with spontaneous speech transcripts
- Accuracy drops sharply on automatically transcribed speech

16 Conclusion and future work
Future work aims at including:
- Languages other than English
- Oral questions
- Other question types: definition, list, etc.
- Other data-collection domains: European Parliament sessions, broadcast news, etc.

17 For more information
The QAST website: http://www.lsi.upc.edu/~qast/

