TRUST & QRISTAL
TRUST = Text Retrieval Using Semantic Technologies
QRISTAL = Questions-Réponses Intégrant un Système de Traitement Automatique des Langues (Question Answering Integrating a Natural Language Processing System)
M-CAST Presentation, 10 January 2005, Synapse Développement, D. LAURENT
1. TRUST Presentation
2. QRISTAL Presentation
3. QRISTAL Evaluation
1. TRUST Presentation
TRUST is an R&D project co-financed by the European Commission, under the technological leadership of Synapse, addressing a multilingual QA system. It was submitted by a consortium of 6 SMEs:
- Synapse Développement, Toulouse, France
- Expert System Solutions, Modena, Italy
- Priberam, Lisbon, Portugal
- TiP, Katowice, Poland
- Convis, Berlin, Germany & Paris, France
- Sémiosphère, Toulouse, France (coordination)
TRUST started in November 2001 and was completed in October. It was designed as an industrial project aiming to commercialise, in B2B and B2C, QA software allowing any user to retrieve one or several answers to a general-purpose or factual question. It was designed to answer questions either on a finite corpus (hard disk, set of documents, ...) or addressed to the Internet via a meta-engine using the most popular engines (Google, MSN, AltaVista, AOL, etc.).
The targeted languages were French, Italian, Polish and Portuguese. English was not part of TRUST but was developed in parallel. The pivot language, which allows a question asked in one language to be answered in another, is English. All partners owned a syntactic analyser and substantial linguistic resources. Synapse, as technology transferor, already had a previously commercialised indexing and retrieval engine (called Chercheur).
[Architecture diagram.
Resources: general ontology, dictionary of derived forms, ontology of question types, documents.
Indexing: block segmentation, spelling correction, syntactic analysis, conceptual analysis, anaphora resolution; indexes of block keywords, named entities, derivation heads, concepts, domains and question-answer types.
Question processing: spelling correction, syntactic analysis, conceptual analysis, keyword extraction, question typing, translation if multilingual, search in the indexes (synonyms + converses), block selection, block ranking, block extraction, answer extraction.
Answer(s): spelling correction, syntactic analysis, conceptual analysis, answer type, block keywords, anaphora resolution, metaphor detection, sentence selection, sentence ranking, coherence and justification, answer extraction.]
TRUST Engine Description
At completion, the TRUST engine has some very original features:
- indexing is carried out not only on words, expressions and named entities, but also on concepts, domains and question-answer types;
- excerpt search and answer extraction rely on deep and precise syntactic, conceptual and semantic analysis.
A Modular Architecture
[Diagram: French, Italian, Portuguese, Polish and English linguistic modules plugged into an indexing engine and a text-block extraction engine, working on the indexes and documents, with visualisation of the results.]
Document Indexing
TRUST indexes numerous document formats (.html, .doc, .pdf, .ps, .sgml, .xml, .hlp, .dbx, etc.) as well as archives (.zip) and ASCII texts. An automated spell check may be carried out beforehand. Beyond the usual indexing of terms, a semantic and syntactic analysis indexes the concepts and the typology of answers (e.g. a date of birth, a title or an occupation for a person, etc.).
Simple words are indexed by « head of derivation », i.e. words such as « symétrie », « symétriques », « asymétrie », « dissymétrique », « symétriseraient » or « symétrisable » are all indexed under the same heading « symétrie ». This technique reduces the size of the indexes and facilitates the grouping of neighbouring notions, thus avoiding the classical « term expansion » step at query time.
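A minimal sketch of this idea, assuming a hypothetical lookup table and helper function (the real TRUST modules use large per-language derivation dictionaries):

```python
# Sketch of head-of-derivation indexing (hypothetical data and names).

DERIVATION_HEADS = {
    # derived form -> head of derivation
    "symétriques": "symétrie",
    "asymétrie": "symétrie",
    "dissymétrique": "symétrie",
    "symétriseraient": "symétrie",
    "symétrisable": "symétrie",
}

def head_of_derivation(word: str) -> str:
    """Map a word to its derivation head; unknown words are their own head."""
    return DERIVATION_HEADS.get(word.lower(), word.lower())

# All derived forms fall under one index entry, so no query-time
# "term expansion" is needed.
print({head_of_derivation(w) for w in ["symétrie", "asymétrie", "symétrisable"]})
# -> {'symétrie'}
```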
Technical characteristics
Indexing is currently performed on 1 KB blocks, i.e. the texts are sliced into 1 KB blocks and each head of derivation is indexed with an occurrence count (e.g. found 3 times in a block, occurrence is 3). Indexing speed differs widely between languages: about 300 MB/hour for French and Polish, about 240 MB/hour for Portuguese, about 100 MB/hour for English and about 10 MB/hour for Italian.
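As a rough illustration of the block model, here is a minimal sketch (block size and tokenisation are simplified; the real indexing operates on derivation heads after full linguistic analysis):

```python
# Sketch of 1 KB block slicing and per-block occurrence counting.
from collections import Counter

BLOCK_SIZE = 1024  # 1 KB blocks

def slice_into_blocks(text: str, block_size: int = BLOCK_SIZE):
    """Cut a document into fixed-size blocks of characters."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]

def index_block(block: str) -> Counter:
    """Count occurrences of each (normalised) token inside one block."""
    tokens = [t.lower() for t in block.split()]
    return Counter(tokens)   # e.g. a head found 3 times -> occurrence 3

document = "symétrie " * 3 + "axe " * 2
for number, block in enumerate(slice_into_blocks(document)):
    print(number, index_block(block).most_common(2))
```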
Conceptual Indexing and Ontology
TRUST shares a common ontology across the linguistic modules of all the languages attached to it. This ontology, developed by Synapse, includes 5 hierarchical levels:
- 28 categories at the top level
- 94 categories at the second level
- 256 categories at the third level
- 3387 categories at the fourth level
- the terms (including the meanings of words) and « syntagmes » at the basic level
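To make the hierarchy concrete, here is a minimal sketch of a 5-level concept path and of matching two senses at a coarser level (the category names are invented for illustration, not taken from the Synapse ontology):

```python
# Sketch of a 5-level concept path in a hierarchical ontology.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ConceptPath:
    levels: Tuple[str, ...]   # from the top level (1) down to the basic level (5)

    def ancestor(self, depth: int) -> Tuple[str, ...]:
        """Category path truncated to the given level (1..5)."""
        return self.levels[:depth]

chat = ConceptPath(("nature", "animal", "mammal", "feline", "chat"))
chien = ConceptPath(("nature", "animal", "mammal", "canine", "chien"))

# Two word senses that differ at the basic level can still be matched at a
# coarser level of the hierarchy:
print(chat.ancestor(3) == chien.ancestor(3))   # True: both are mammals
```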
Indexing and Types of Questions
TRUST indexes the types of questions: when analysing each block of text, each linguistic module attempts to detect the possible answers for each type of question (person, date, event, cause, aim, etc.). The present taxonomy of question types comprises 86 different categories. It goes beyond the « factual », since it includes notions such as « usefulness », « comparison » and « judgment », as well as categories like « yes/no » or classification.
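As a hedged illustration, a toy question typer might use a small rule table like the one below; the real taxonomy has 86 categories and relies on full syntactic analysis, not surface patterns:

```python
# Hypothetical sketch of question typing with a small rule table.
import re

QUESTION_TYPE_RULES = [
    (r"^(qui|who)\b", "PERSON"),
    (r"^(quand|when)\b", "DATE"),
    (r"^(où|where)\b", "PLACE"),
    (r"^(pourquoi|why)\b", "CAUSE"),
    (r"^(combien|how many|how much)\b", "QUANTITY"),
    (r"^est-ce que\b", "YES_NO"),
]

def question_type(question: str) -> str:
    q = question.strip().lower()
    for pattern, qtype in QUESTION_TYPE_RULES:
        if re.search(pattern, q):
            return qtype
    return "OTHER"

print(question_type("Quand est né Victor Hugo ?"))   # DATE
```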
Analysis of the question
When the user keys in a question, its language is detected automatically and the matching linguistic module performs the semantic and syntactic analysis of the question. When some words of the question have several meanings, the most probable meaning is chosen, but the user may force the meaning of each word. The same linguistic module determines the domain, the concepts and, above all, the type of the question.
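The sketch below illustrates, under purely hypothetical names and fields, what such an analysis result and the « forced meaning » override could look like:

```python
# Hypothetical sketch of a question-analysis result with a user sense override.
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class QuestionAnalysis:
    language: str                 # detected language of the question
    question_type: str            # e.g. "PERSON", "DATE", "CAUSE", ...
    domain: str                   # e.g. "geography", "medicine", ...
    senses: Dict[str, str] = field(default_factory=dict)  # word -> chosen sense

def analyse_question(question: str,
                     forced_senses: Optional[Dict[str, str]] = None) -> QuestionAnalysis:
    """Rough stand-in for a linguistic module: pick the most probable sense of
    each word unless the user forces a particular one."""
    analysis = QuestionAnalysis(language="fr", question_type="PLACE", domain="geography")
    for word in question.lower().rstrip(" ?").split():
        analysis.senses[word] = (forced_senses or {}).get(word, word + "#1")  # sense 1 = most probable
    return analysis

print(analyse_question("Où se jette la Loire ?", forced_senses={"loire": "loire#river"}))
```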
The text search
From the data obtained through the analysis of the question (heads of derivation, named entities, domains, concepts, question type), the search engine extracts from the index the blocks of text that best match this set of data. The different available data are weighted so that a disambiguation error on the meaning or on the type of the question cannot prevent retrieval of the blocks of text that may contain an answer.
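A minimal sketch of such a weighted combination is given below; the signal names and weights are assumptions, not the actual TRUST tuning:

```python
# Sketch of weighting several index signals when ranking blocks.

SIGNAL_WEIGHTS = {
    "derivation_heads": 0.35,
    "named_entities": 0.25,
    "concepts": 0.20,
    "domain": 0.10,
    # Kept moderate on purpose, so that a wrong question type alone
    # cannot discard a block that may contain the answer.
    "question_type": 0.10,
}

def block_score(signals):
    """signals: match ratio in [0, 1] for each kind of index, for one block."""
    return sum(weight * signals.get(name, 0.0)
               for name, weight in SIGNAL_WEIGHTS.items())

blocks = {
    "block_12": {"derivation_heads": 1.0, "named_entities": 0.5, "question_type": 0.0},
    "block_47": {"derivation_heads": 0.5, "named_entities": 1.0, "question_type": 1.0},
}
print(sorted(blocks, key=lambda b: block_score(blocks[b]), reverse=True))
```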
Extraction of Answers
For a given question, after a possible spell check and the syntactic, semantic and conceptual analysis, then detection of the question type, the heads of derivation, named entities, concepts, domains and question-answer types are compared with the corresponding indexes. The best-ranked blocks are analysed and the answers extracted. The extraction of the answer searches for the named entities or syntactic groups in « position of use for the answer ».
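For illustration only, a toy extractor for a DATE question could look like this; the patterns are simplistic stand-ins for TRUST's syntactic and named-entity analysis:

```python
# Sketch of typed answer extraction from a selected block.
import re

ANSWER_PATTERNS = {
    "DATE": re.compile(r"\b(\d{1,2}\s+\w+\s+\d{4}|\d{4})\b"),
    "QUANTITY": re.compile(r"\b\d+(?:[.,]\d+)?\b"),
}

def extract_candidates(block: str, question_type: str):
    """Return candidate answers of the expected type found in a block."""
    pattern = ANSWER_PATTERNS.get(question_type)
    return pattern.findall(block) if pattern else []

block = "Victor Hugo est né le 26 février 1802 à Besançon."
print(extract_candidates(block, "DATE"))   # ['26 février 1802']
```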
Response time
For a question keyed in on a closed corpus (hard disk, corpus, Intranet), the answer is provided in French in less than 3 seconds; with the other languages it can take up to 10 seconds. For a question keyed in on the Internet, the response time may be anything between 2 and 14 seconds, depending on the language used, the number of pages analysed (user-definable) and the type of the question (a few answers are retrieved very quickly from just the available summary or short description).
2. QRISTAL Presentation
QRISTAL (acronym for Questions-Réponses Intégrant un Système de Traitement Automatique des Langues) is the B2C version of TRUST. It is priced at 99 € and sold in retail computer outlets and in large consumer-market distributors such as Virgin Stores or FNAC. The fruit of 6 years of development, QRISTAL goes beyond the limits set for TRUST, but it undoubtedly arises from this project.
QRISTAL may be used in 2 major ways:
- Provide exact answers to questions on « closed corpora » (hard disk, Intranet, etc.), these being indexed beforehand so that the answers can be extracted from the blocks of text corresponding to the analysis of the question.
- Provide exact answers to questions addressed to the Internet (web). In this case, QRISTAL converts the questions into « understandable requests » for the standard engines, extracts the returned pages and their short descriptions, analyses them and computes the answers.
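A minimal sketch of this conversion step, assuming an illustrative stop-word list and search URL (not QRISTAL's actual query builder):

```python
# Sketch of turning a natural-language question into a keyword web query.
from urllib.parse import quote_plus

STOP_WORDS = {"quel", "quelle", "est", "la", "le", "les", "de", "du", "qui", "que"}

def to_web_query(question: str) -> str:
    """Keep only content words, since a standard engine ignores the question form."""
    words = [w.strip("?.,").lower() for w in question.split()]
    keywords = [w for w in words if w and w not in STOP_WORDS]
    return " ".join(keywords)

question = "Quelle est la capitale de la Lituanie ?"
print(to_web_query(question))                                  # capitale lituanie
print("https://www.google.com/search?q=" + quote_plus(to_web_query(question)))
```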
Special attention has been given in QRISTAL to « user self-definability ». By design, QRISTAL targets users unfamiliar with SQL or web requests who wish to obtain an answer directly by formulating their questions in plain natural language. The interface must therefore be very user-friendly and as simple as possible, so that users can tailor QRISTAL to their habits and wishes. For more experienced users, question files as well as work on several indexes allow more advanced usage.
Commercialisation
QRISTAL has been commercialised since December 2004 and registered more than 700 users in that single month. Users are satisfied with the results obtained in French, while their judgment of the results in the other languages is (a bit unfairly) critical. QRISTAL appears to be very reliable, stable and user-friendly, as shown by the very few calls to customer support. User expectations are very high, and satisfying them will require a great deal of effort on our part.
Press
Article in « La Dépêche du Midi » of 4 January 2005
Perspectives
QRISTAL will be updated in the coming years, with the following improvements:
- improve the rate of exact answers and eliminate noise
- use the notoriety of the pages to rank them
- carry out more precise inferences to extract the answers
- allow « user profiles »
- include other languages (German, Spanish, ...)
- better differentiate the answer mode (single, all, ...)
- better situate the answers in their context
3. QRISTAL Evaluation
QRISTAL was evaluated within a contest called EQUER, the QA-system evaluation campaign of the EVALDA project. The EVALDA and Technolangue projects were initiated by the French Ministries for Industry, Research and Culture. The EQUER campaign was organised by ELDA (Evaluations and Language resources Distribution Agency) and ran between January 2003 and December.
The EQUER campaign, very similar in its principles to TREC-QA (USA) or NTCIR (Japan), included 2 different tests:
- 500 all-domain, mainly factual questions on a journalistic and administrative corpus of 1.5 GB;
- 200 questions, very often non-factual, on a medical corpus of scientific articles and web pages of about 50 MB.
The 500 general-purpose questions were divided into:
- 407 simple factual questions (e.g. « Comment s'appelle le fils de Juliette Binoche ? », i.e. "What is the name of Juliette Binoche's son?")
- 31 questions having a list as answer (e.g. « Quels sont les trois pays qui bordent la Bosnie-Herzégovine ? », i.e. "Which are the three countries bordering Bosnia and Herzegovina?")
- 32 questions having a definition as answer (e.g. « Qu'est-ce que la NSA ? », i.e. "What is the NSA?")
- 30 binary questions having Yes/No as answer (e.g. « La carte d'identité existe-t-elle au Royaume-Uni ? », i.e. "Does the identity card exist in the United Kingdom?")
The EQUER contestants were:
4 commercial organisations (the first 2 being very large firms):
- Commissariat à l'Énergie Atomique, Saclay, France
- France Telecom, Lannion, France
- Sinequa, Paris, France
- Synapse Développement, Toulouse, France
3 university laboratories:
- LIA & SMART, Avignon, France
- LIMSI, Orsay, France
- Université de Neuchâtel, Switzerland
Procedure and scoring metrics
The metric used to score the results was MRR (Mean Reciprocal Rank): 1 for an exact answer in first position, 1/2 for an exact answer in second position, 1/3 for an exact answer in third position, etc. Only 5 answers were taken into account, except for binary questions, for which a single exact, justified answer was accepted. For the questions having a list as answer, the metric used was NIAP (Non-Interpolated Average Precision).
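A worked sketch of the MRR scoring described above (assuming, as stated, that only the rank of the first exact answer counts):

```python
# Worked example of Mean Reciprocal Rank (MRR) scoring.

def reciprocal_rank(first_correct_rank):
    """first_correct_rank: 1-based rank of the first exact answer, or None if absent."""
    return 0.0 if first_correct_rank is None else 1.0 / first_correct_rank

def mean_reciprocal_rank(first_correct_ranks):
    return sum(reciprocal_rank(r) for r in first_correct_ranks) / len(first_correct_ranks)

# Three questions: exact answer at rank 1, at rank 3, and not found at all.
print(mean_reciprocal_rank([1, 3, None]))   # (1 + 1/3 + 0) / 3 ≈ 0.44
```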
The Synapse QA system evaluated during the EQUER campaign was a « pre-version » of QRISTAL, lacking some of the functionality needed to extract the exact answer. EQUER was Synapse's first participation in a QA evaluation campaign, while many other contestants already had experience of TREC-QA or CLEF-QA for English- or French-language QA.
Technical performance
The full set of 500 questions of the general corpus was processed in 23 minutes and 17 seconds, hence less than 3 seconds per question. The speed of the linguistic analysis of the blocks was about 400 MB/hour for indexing, and the speed of analysis and answer extraction was about 230 MB/hour. On the 500 questions, the correct question type was determined in 98% of the cases. These speed tests were carried out on a 3 GHz Pentium with 1 GB of RAM.
[Graph: comparison of the best results (MRR) obtained in EQUER, TREC and NTCIR.]
As shown in the previous graph, the best EQUER QA system (i.e. Synapse) performs as well as the best ones in TREC or NTCIR (MRR of 0.58 versus 0.68 and 0.61) for exact answers. This level is, in all cases, higher than the second best in TREC or NTCIR. These results confirm the theoretical choices and the quality of the resources developed within TRUST and implemented in QRISTAL.
Other Evaluations
During the evaluation, a set of 100 text references, originating from a standard engine, was provided for each question. With these data, the Synapse engine scored 0.64 (versus 0.70) for the « passages » and 0.48 (versus 0.58) for the exact answers. An in-house test later showed that disabling the « question type » functionality made the MRR fall from 0.70 to 0.46 for the « passages », which underlines the importance of this functionality.
Présentation M-CAST, 10 janvier 2005, Synapse Développement, D. LAURENT After extraction of the texts enclosing the answers of the general corpus of1,5 Go, we achieved to reduce it to 180 Go. It is noticeable that the results of the 500 questions are very near on each of the 2 corpora. This leads us to think that the size of the corpus could be considered as negligeable for the Quality of the results, contrary to an usually admitted idea in information retrieval. The said corpus of questions included « reformulations ». A benchmark comparing the answers of the questions at the « start position » versus the position after « reformulations » has shown that the results are very near to each other (93% of answers in first position are identical).
Future evaluations
Synapse intends to participate in CLEF-QA in 2005, both in the monolingual and multilingual tracks. Currently, no other evaluation campaign is planned in France to follow up EQUER, but an evaluation on transcripts of an oral corpus should take place in the coming months.
End. Thank you!