Slide 1: Cross-Language French-English Question Answering using the DLT System at CLEF 2003
Aoife O’Gorman, Igal Gabbay, Richard F.E. Sutcliffe
Documents and Linguistic Technology Group, University of Limerick
Slide 2: Outline
- Objectives
- System architecture
- Key components
- Task performance evaluation
- Findings
Slide 3: Objectives
- Learn the issues involved in multilingual QA
- Combine the components of our existing English and French monolingual QA systems
Slide 4: System architecture
- Query classification
- Query translation (Google) and re-formulation
- Text retrieval (dtSearch)
- Named entity recognition
- Answer entity selection
(A stub sketch of this pipeline follows below.)
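Below is a minimal Python sketch of the pipeline's shape, purely illustrative: every stage is a stub standing in for the component named above (the real system called Google for translation and dtSearch for retrieval), and all function names are our own, not the DLT code.

    # Illustrative stub pipeline; each stage is a placeholder for a real component.

    def classify(query):                 # query classification (keyword-based)
        return "what_country" if "pays" in query else "Unknown"

    def translate(query):                # Google Language Tools in the original system
        return query                     # stub: pass the text through unchanged

    def reformulate(question):           # tokenisation + selective stopword removal
        stop = {"qui", "de", "la", "le", "a"}
        return [t for t in question.split() if t.lower() not in stop]

    def retrieve(terms):                 # dtSearch Boolean retrieval in the original
        return ["...matching documents..."]

    def answer(french_query):
        qtype = classify(french_query)
        terms = reformulate(translate(french_query))
        return qtype, retrieve(terms)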
Slide 5: Query classification
- Categories based on translated TREC 2002 queries
- Keyword-based classification (see the sketch after this slide)
- Example: "De quel pays le jeu de croquet est-il originaire ?" ("From what country does the game of croquet originate?") -> what_country
- Example: "De quelle nation.. ?" ("From what nation.. ?") -> Unknown
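A hedged sketch of what such keyword-based classification could look like; the actual category set and cue words are not given on the slides, so the patterns below are invented for illustration.

    import re

    # Invented cue-word patterns; the real categories came from translated
    # TREC 2002 queries.
    CATEGORY_PATTERNS = [
        ("what_country", re.compile(r"\bquel pays\b", re.IGNORECASE)),
        ("when",         re.compile(r"\bquand\b", re.IGNORECASE)),
        ("who",          re.compile(r"\bqui\b", re.IGNORECASE)),
    ]

    def classify(question):
        for category, pattern in CATEGORY_PATTERNS:
            if pattern.search(question):
                return category
        return "Unknown"    # fallback when no cue word matches

    print(classify("De quel pays le jeu de croquet est-il originaire ?"))  # what_country
    print(classify("De quelle nation.. ?"))                                # Unknown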
Slide 6: Query translation and re-formulation
- Submit the French query in its original form to the Google Language Tools page
- Tokenisation
- Selective removal of stopwords
- Example (sketched below): "Qui a été élu gouverneur de la California ?" -> "Who was elected governor of California?" -> ['elected', 'governor', 'California']
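A sketch of the re-formulation step alone, starting from the already-translated question (the translation call itself is not reproduced here, and the stopword list below is an assumption, not the system's actual list):

    # Assumed English stopword list, for illustration only.
    STOPWORDS = {"who", "was", "the", "of", "a", "an", "in", "on", "did", "to"}

    def reformulate(translated_question):
        tokens = translated_question.rstrip("?").split()
        return [t for t in tokens if t.lower() not in STOPWORDS]

    print(reformulate("Who was elected governor of California?"))
    # ['elected', 'governor', 'California']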
Slide 7: Text retrieval: submitting queries to dtSearch
- dtSearch indexed the document collection based on tags
- Insert a w/1 proximity connector between two capitalised words
- Submit untranslated quotations for exact match
- Insert an AND connector between all other terms (Boolean)
- Limited verb expansion based on common verbs used in TREC questions
(A sketch of how such a request string might be assembled follows.)
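The w/1 and AND connectors come from the slide; the pairing heuristic below is our guess at the intent, not the DLT implementation.

    def build_request(terms, quotations=()):
        # Untranslated quotations are passed through quoted, for exact match.
        parts, i = [f'"{q}"' for q in quotations], 0
        while i < len(terms):
            # Join two adjacent capitalised words with the w/1 proximity connector.
            if terms[i][0].isupper() and i + 1 < len(terms) and terms[i + 1][0].isupper():
                parts.append(f"({terms[i]} w/1 {terms[i + 1]})")
                i += 2
            else:
                parts.append(terms[i])
                i += 1
        return " AND ".join(parts)   # Boolean AND between all other terms

    print(build_request(["Robert", "Frost", "born"]))
    # (Robert w/1 Frost) AND born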
Slide 8: Named entity recognition: general names
- Captures instances of general names in cases where we are not sure what to look for
- A general_name is defined in our system as up to five capitalised terms interspersed with optional prepositions
- Examples: Limerick City, University of Limerick
(A regex sketch of this pattern follows.)
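A sketch of the general_name definition above as a regular expression; the preposition list is assumed, and note that a naive capitalisation test also catches sentence-initial words such as "He", which the real system would presumably filter out.

    import re

    PREP = r"(?:of|de|du|des|la|the)"    # assumed preposition list
    CAP = r"[A-Z][a-z]+"                  # one capitalised term
    # Up to five capitalised terms, optionally interspersed with prepositions.
    GENERAL_NAME = re.compile(rf"\b{CAP}(?:\s+(?:{PREP}\s+)?{CAP}){{0,4}}\b")

    text = "He studied at the University of Limerick near Limerick City."
    print(GENERAL_NAME.findall(text))
    # ['He', 'University of Limerick', 'Limerick City']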
Slide 9: Answer entity selection (strategies sketched below)
- highest_scoring: "What year was Robert Frost born?"
  in entity(date,[1,8,7,5],[[],[],[],[],[1,8,7,5]],[],[],[]), poet target([Robert]) target([Frost]) was target([born]) in San Francisco
- most_frequent: "When did “The Simpsons” first appear on television?"
  When target([The]) target([Simpsons]) was target([first]) broadcast in entity(date,[1,9,8,9],[[],[],[],[],[],[1,9,8,9],[],[]],[],[],[])
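A sketch of the two strategies named above, operating over candidate answer entities pulled from the retrieved passages; the slide does not spell out how scores were computed, so the scoring here is a placeholder.

    from collections import Counter

    def most_frequent(candidates):
        # Pick the candidate occurring most often across passages,
        # e.g. ["1989", "1989", "1987"] -> "1989".
        return Counter(candidates).most_common(1)[0][0]

    def highest_scoring(scored_candidates):
        # scored_candidates: [(answer, score), ...] with placeholder scores.
        return max(scored_candidates, key=lambda pair: pair[1])[0]

    print(most_frequent(["1989", "1989", "1987"]))          # 1989
    print(highest_scoring([("1875", 3.0), ("1963", 1.5)]))  # 1875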
Slide 10: Task performance evaluation

Group    Run name      MRR (strict / lenient)   Q. with >=1 right answer (strict / lenient)   NIL returned / correct
CS-CMU   lumoex031bf   .153 / .170              38 / 42                                       92 / 8
CS-CMU   lumoex032bf   .131 / .149              31 / 35                                       91 / 7
DLTG     dltgex031bf   .115 / .120              23 / 24                                       119 / 10
DLTG     dltgex032bf   .110 / .115              22 / 23                                       119 / 10
RALI     udemex032bf   .140 / .160              38 / 42                                       3 / 1

Adapted from Magnini (2003).
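For reference (a standard definition, not spelled out on the slide): MRR is the mean reciprocal rank, the average over all questions of 1/rank of the first correct answer, counting 0 for questions with no correct answer returned. For example, correct answers at ranks 1 and 2 on two of four questions give MRR = (1 + 1/2 + 0 + 0) / 4 = 0.375.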
Slide 11: Findings
- Query classification: unexpected formulation of queries; too few categories
- Translation: problems with names and titles
  - We need better query-specific translation
  - Localisation of names/titles
  - Possibly limit translation to the search terms
Slide 12: Findings (continued)
- Text retrieval: allow relaxation and more sophisticated expansion of search queries
- Named entity recognition: find better alternatives for answering questions of type Unknown
- Answer entity selection: take into account the distance and density of query terms
- Usability issue: answers may need to be translated back into French