Comparing syntactic semantic patterns and passages in Interactive Cross Language Information Access (iCLEF at the University of Alicante) Borja Navarro, Fernando Llopis, Miguel Ángel Varó {borja, llopis, Departamento de Lenguajes y Sistemas Informáticos Universidad de Alicante
Outline
1. Introduction and objectives
2. Method of interaction I: passages
3. Method of interaction II: syntactic semantic patterns
4. Description of the experiment
5. Results and conclusions
Introduction and objectives
An important aspect of Interactive Cross Language Information Access is the way in which the system shows the relevant information to the user
– With this information alone, the user must decide whether the document is relevant or not
It is a key point for the correct selection of documents and for future refinements of the query
Problem:
– Multilingualism: the language of the query and the language of the documents are different
Main solutions:
– To show the information in the language of the query
– To show the information in the language of the document
iCLEF 2002 (Llopis et al. 2002):
– A passage-based system was used for the interaction with the user
– This approach is better than showing the whole document
– Main problem: many passages were unreadable for the users because of problems with the machine translation of the passages
Our aim at iCLEF 2003:
– To improve this approach in two aspects:
    The interaction speed: the time the user takes between the loading of the passage and the decision about its relevance
    The recall and precision in the selection of the relevant documents
– But avoiding the use of Machine Translation
– We have defined an interactive approach based on syntactic-semantic patterns (Navarro et al. 2003)
Objectives at iCLEF 2003
To find out whether a user can decide if a document is relevant or not using only the syntactic-semantic patterns extracted from the passages
To find out whether interaction based on syntactic-semantic patterns is better than interaction based on passages only
To find out whether the use of syntactic-semantic patterns is better than the machine translation of the passages
Method I: Approach based on passages
Method I: passages
Developed and presented at iCLEF 2002 (Llopis et al. 2002)
Passage: a relevant piece of text of a document
With the use of passages, only the most relevant information of a document is shown to the user
– Not the whole document
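To make the idea of a passage concrete, here is a minimal sketch that picks the window of consecutive sentences with the most query-term matches. The window size, the term-count score, and the names `split_sentences` and `best_passage` are all assumptions made for illustration; the slides do not describe how the IR-n system delimits or ranks its passages.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def best_passage(text, query, size=2):
    """Return the window of `size` consecutive sentences with the most query-term hits.

    Assumption: the passage is a fixed-size sentence window and the score is a
    plain count of query-term occurrences. This is not the IR-n similarity measure.
    """
    sentences = split_sentences(text)
    terms = set(re.findall(r"\w+", query.lower()))
    best, best_score = "", -1
    # Slide the window over the document; short documents yield a single window.
    for start in range(max(1, len(sentences) - size + 1)):
        window = sentences[start:start + size]
        words = re.findall(r"\w+", " ".join(window).lower())
        score = sum(1 for w in words if w in terms)
        if score > best_score:
            best, best_score = " ".join(window), score
    return best


document = (
    "The weather was mild. Aid to Russia was cut. "
    "The aid cut angered Russia. Sports news followed."
)
print(best_passage(document, "Russia aid", size=2))
```

Only the selected window would be shown to the user, not the whole document.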
Method II: Approach based on syntactic semantic patterns
Syntactic-semantic pattern
Linguistic pattern formed by three components:
– A verb with one sense (necessary)
– The subcategorization frame of that sense
– The selectional preferences of each argument (semantic features)
Automatic extraction of patterns
Parser: MiniPar
Steps:
– Look for a verb
– Look for a noun to the left of the verb
– Look for a noun, or a preposition plus noun, to the right of the verb
– Look for a noun, or a preposition plus noun, to the right of the previous noun
Example
Sentence:
Primakov suggested that the Administration was using the Ames arrest to score domestic political points, to punish Russia for its independent stance on the conflict in Bosnia-Herzegovina and to provide convenient excuse for cutting American aid to Russia, according to journalists who attended.
Patterns:
Primakov suggest Administration
administration use Ames arrest
administration score domestic point
Primakov punish Russia for its stance
Primakov provide convenient excuse for
Primakov cut American aid to Russia
according to journalist
journalist attend
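The extraction steps (verb, noun to its left, then up to two noun or preposition-plus-noun arguments to its right) can be sketched over a tagged token list. This is only an illustration under stated assumptions: the tokens are lemmatized and tagged by hand rather than by MiniPar, the coarse tag set ("V", "N", "P", "O") and the function names are hypothetical, and the sketch skips every word that is not a verb, noun, or preposition, so its output drops "its" from the pattern shown on the slide.

```python
from typing import List, Optional, Tuple

# (lemma, coarse tag): "V" verb, "N" noun, "P" preposition, "O" anything else.
# Hand-assigned tags stand in for the parser output (an assumption).
Token = Tuple[str, str]


def find_left_noun(tokens: List[Token], verb_idx: int) -> Optional[str]:
    """Nearest noun to the left of the verb, stopping at a previous verb."""
    for i in range(verb_idx - 1, -1, -1):
        lemma, tag = tokens[i]
        if tag == "N":
            return lemma
        if tag == "V":
            break
    return None


def find_right_argument(tokens: List[Token], start: int) -> Tuple[Optional[str], int]:
    """Next noun, or preposition plus noun, from `start`; stops at the next verb.

    Returns the argument text and the index just after it.
    """
    prep = None
    for i in range(start, len(tokens)):
        lemma, tag = tokens[i]
        if tag == "V":
            break
        if tag == "P" and prep is None:
            prep = lemma
        elif tag == "N":
            return (f"{prep} {lemma}" if prep else lemma), i + 1
    return None, start


def extract_patterns(tokens: List[Token]) -> List[str]:
    """Build one pattern per verb: [left noun] verb [argument 1] [argument 2]."""
    patterns = []
    for idx, (lemma, tag) in enumerate(tokens):
        if tag != "V":
            continue
        parts = []
        subject = find_left_noun(tokens, idx)
        if subject:
            parts.append(subject)
        parts.append(lemma)
        first, next_idx = find_right_argument(tokens, idx + 1)
        if first:
            parts.append(first)
            second, _ = find_right_argument(tokens, next_idx)
            if second:
                parts.append(second)
        patterns.append(" ".join(parts))
    return patterns


clause = [("Primakov", "N"), ("punish", "V"), ("Russia", "N"),
          ("for", "P"), ("its", "O"), ("stance", "N")]
print(extract_patterns(clause))  # ['Primakov punish Russia for stance']
```

A real implementation would take the verb sense, the subcategorization frame, and the argument heads from the parser instead of scanning a flat token list.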
Automatic extraction of patterns
The patterns are extracted from the passages
The patterns show only the basic information of each sentence:
– the most important words: the verb and its arguments
– the syntactic and semantic relations between them
This is enough to identify the topic of a document and to decide about its relevance
Automatic extraction of patterns
Hypothesis:
– It is possible to decide about the relevance of a document with the patterns alone
– For a searcher with passive language abilities in the foreign language, it is easier to process the patterns than the complete passage, because the searcher pays attention only to the main words of each sentence
Description of the experiment
Experiment
Cross-language document selection
Search group: Spanish speakers with passive language abilities in English
Information Retrieval System: IR-n system (Llopis 2003)
– It uses the complete query
– For each query, it extracts 25 (possibly) relevant documents
Experiment
Each retrieved document is shown to the user:
– System 1 shows only passages (in English)
– System 2 shows the patterns extracted from the passages (in English)
With this, the user must decide whether the document is relevant or not
Through an HTML interface, we save the relevance judgment and the time taken
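The bookkeeping behind such an interface (one relevance judgment plus the elapsed time per document) might look like the sketch below. The class and field names are hypothetical and nothing here is taken from the actual HTML interface; the clock is injectable only so that the example can be run with fixed timestamps.

```python
import time
from dataclasses import dataclass


@dataclass
class Judgment:
    doc_id: str
    system: str      # "passages" or "patterns"
    relevant: bool
    seconds: float   # time between showing the document and the decision


class JudgmentLog:
    """Records one judgment per shown document (illustrative, not the authors' code)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._pending = None
        self.records = []

    def show(self, doc_id, system):
        """Call when the passage or patterns are displayed to the user."""
        self._pending = (doc_id, system, self._clock())

    def decide(self, relevant):
        """Call when the user marks the document as relevant or not."""
        if self._pending is None:
            raise RuntimeError("no document is being shown")
        doc_id, system, shown_at = self._pending
        self._pending = None
        record = Judgment(doc_id, system, relevant, self._clock() - shown_at)
        self.records.append(record)
        return record

    def mean_seconds(self, system):
        """Average decision time for one system, or 0.0 if it has no records."""
        times = [r.seconds for r in self.records if r.system == system]
        return sum(times) / len(times) if times else 0.0


# Fixed timestamps replace the real clock so the run is reproducible.
ticks = iter([0.0, 4.0, 10.0, 12.5])
log = JudgmentLog(clock=lambda: next(ticks))
log.show("doc-1", "patterns")
log.decide(True)
log.show("doc-2", "patterns")
log.decide(False)
print(log.mean_seconds("patterns"))  # 3.25
```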
Results and conclusions
F-alpha average
SYSTEM      F-alpha average
Passages
Patterns
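For reference, F-alpha is read here as the weighted harmonic mean of precision P and recall R, F = 1 / (alpha/P + (1 - alpha)/R). The slides give neither the formula nor the value of alpha, so both are assumptions; the default alpha = 0.8, which weights precision more heavily than recall, is used only for illustration, and the function names are hypothetical.

```python
def precision_recall(selected, relevant):
    """Precision and recall of the documents a user selected for one query."""
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f_alpha(precision, recall, alpha=0.8):
    """Weighted harmonic mean of precision and recall.

    alpha = 0.8 is an assumed default; alpha = 0.5 gives the balanced F1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if precision == 0 or recall == 0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


# Hypothetical judgments for one query: the user picked d1 and d2,
# while d2, d3 and d4 are the truly relevant documents.
p, r = precision_recall(["d1", "d2"], ["d2", "d3", "d4"])
print(p, r, f_alpha(p, r))
```

Averaging `f_alpha` over all queries and searchers for each system would give the per-system figure that the table reports.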
Time consumed
Conclusions
With the syntactic-semantic patterns alone, it is possible to decide about the relevance of a document in a foreign language (if the searcher has passive abilities in that language)
In most cases, the time taken for the relevance judgment is lower with the patterns than with the passages
With the syntactic-semantic patterns and/or passages, it is possible to avoid the use of machine translation systems for users with passive abilities in the language of the document