Comparing syntactic semantic patterns and passages in Interactive Cross Language Information Access (iCLEF at the University of Alicante) Borja Navarro,


Comparing syntactic semantic patterns and passages in Interactive Cross Language Information Access (iCLEF at the University of Alicante) Borja Navarro, Fernando Llopis, Miguel Ángel Varó {borja, llopis, Departamento de Lenguajes y Sistemas Informáticos Universidad de Alicante

iCLEF Outline
1. Introduction and objectives
2. Method of interaction I: passages
3. Method of interaction II: syntactic-semantic patterns
4. Description of the experiment
5. Results and conclusions

Introduction and objectives

Introduction
– An important aspect of Interactive Cross-Language Information Access is the way in which the system shows the relevant information to the user
  – With this information alone, the user must decide whether the document is relevant or not
– This is a key point for the correct selection of documents and for future refinements of the query

Introduction
– Problem:
  – Multilingualism: the language of the query and the language of the documents are different
– Main solutions:
  – To show the information in the language of the query
  – To show the information in the language of the document

Introduction
– iCLEF 2002 (Llopis et al. 2002):
  – A passage-based system was used for the interaction with the user
  – This approach is better than showing the whole document
  – Main problem: many passages were unreadable for the users because of problems with the machine translation of the passages

Introduction
– Our aim at iCLEF 2003:
  – To improve this approach in two aspects:
    – Interaction speed: the time the user spends between the loading of the passage and the decision about its relevance
    – Recall and precision in the selection of the relevant documents
  – But avoiding the use of machine translation
  – We have defined an interactive approach based on syntactic-semantic patterns (Navarro et al. 2003)

Objectives at iCLEF 2003
– To find out whether a user can decide if a document is relevant or not using only the syntactic-semantic patterns extracted from the passages
– To find out whether interaction based on syntactic-semantic patterns is better than interaction based on passages only
– To find out whether the use of syntactic-semantic patterns is better than machine translation of the passages

Method I: Approach based on passages

Method I: passages
– Developed and presented at iCLEF 2002 (Llopis et al. 2002)
– Passage: a relevant piece of text of a document
– With the use of passages, only the most relevant information of a document is shown to the user
  – Not the whole document
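The slide does not say how a passage is delimited or ranked. As a rough illustration only, the sketch below assumes a passage is a window of consecutive sentences and picks the window with the most query-term matches. The function name `best_passage`, the window size, and the overlap score are all hypothetical stand-ins, not the IR-n ranking.

```python
import re


def best_passage(document: str, query: str, size: int = 2) -> str:
    """Return the window of `size` consecutive sentences that shares
    the most word occurrences with the query (illustrative scoring)."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    terms = set(re.findall(r"\w+", query.lower()))

    best, best_score = "", -1
    # Slide a window of `size` sentences over the document.
    for i in range(max(1, len(sentences) - size + 1)):
        window = " ".join(sentences[i:i + size])
        words = re.findall(r"\w+", window.lower())
        score = sum(1 for w in words if w in terms)
        if score > best_score:
            best, best_score = window, score
    return best


doc = ("The weather was mild. Russia cut aid talks. "
       "American aid to Russia fell. Nobody commented.")
passage = best_passage(doc, "American aid Russia", size=2)
```

Only the selected window would be shown to the searcher, which is the point the slide makes: the most relevant piece of text, not the whole document.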

Method II: Approach based on syntactic semantic patterns

Syntactic-semantic pattern
– Linguistic pattern formed by three components:
  – A verb with one sense (necessary)
  – The subcategorization frame of that sense
  – The selectional preferences of each argument (semantic features)
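One way to picture the three components is as a small record type. The sketch below is only an assumed representation: the class names, the sense label `punish#1`, and the semantic feature labels are hypothetical, chosen for illustration rather than taken from the authors' system.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Argument:
    head: str                                  # head noun of the argument
    preposition: Optional[str] = None          # introducing preposition, if any
    semantic_features: Tuple[str, ...] = ()    # selectional preferences (labels assumed)

    def text(self) -> str:
        return f"{self.preposition} {self.head}" if self.preposition else self.head

    def slot(self) -> str:
        return f"PP[{self.preposition}]" if self.preposition else "NP"


@dataclass(frozen=True)
class Pattern:
    verb: str                                  # verb lemma (the only mandatory part)
    sense: str                                 # identifier of the single verb sense
    subject: Optional[Argument] = None
    complements: Tuple[Argument, ...] = ()

    def frame(self) -> Tuple[str, ...]:
        """Subcategorization frame of the complements as slot labels."""
        return tuple(arg.slot() for arg in self.complements)

    def render(self) -> str:
        """Flat string of the kind shown to the searcher."""
        parts = [self.subject.text()] if self.subject else []
        parts.append(self.verb)
        parts.extend(arg.text() for arg in self.complements)
        return " ".join(parts)


example = Pattern(
    verb="punish",
    sense="punish#1",                          # hypothetical sense label
    subject=Argument("Primakov", semantic_features=("person",)),
    complements=(
        Argument("Russia", semantic_features=("country",)),
        Argument("stance", preposition="for"),
    ),
)
```

Keeping the verb mandatory and everything else optional mirrors the slide's note that only the verb is a necessary component.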

Automatic extraction of patterns
– Parser: MiniPar
– Steps:
  – Look for a verb
  – Look for a noun to the left of the verb
  – Look for a noun, or a preposition plus noun, to the right of the verb
  – Look for a noun, or a preposition plus noun, to the right of the previous noun
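The four steps can be sketched over a pre-tagged sentence. This is a simplified illustration, not the authors' implementation: it does not call MiniPar, it assumes the input is already a list of (lemma, tag) pairs with the made-up tags "N", "V", "P" and "O" (other), and it skips modifiers such as adjectives and determiners.

```python
def _next_argument(tagged, start):
    """Find the next noun, or preposition plus noun, from `start`
    without crossing another verb. Returns (text, next_index)."""
    prep = None
    for k in range(start, len(tagged)):
        lemma, tag = tagged[k]
        if tag == "V":
            break
        if tag == "P" and prep is None:
            prep = lemma
        elif tag == "N":
            text = f"{prep} {lemma}" if prep else lemma
            return text, k + 1
    return None, start


def extract_patterns(tagged):
    """Apply the four steps to every verb in a tagged sentence."""
    patterns = []
    for i, (lemma, tag) in enumerate(tagged):
        if tag != "V":                       # step 1: look for a verb
            continue
        parts = []
        for j in range(i - 1, -1, -1):       # step 2: nearest noun to the left
            if tagged[j][1] == "N":
                parts.append(tagged[j][0])
                break
        parts.append(lemma)
        pos = i + 1
        for _ in range(2):                   # steps 3 and 4: up to two arguments to the right
            arg, pos = _next_argument(tagged, pos)
            if arg is None:
                break
            parts.append(arg)
        patterns.append(" ".join(parts))
    return patterns


sentence = [("Primakov", "N"), ("punish", "V"), ("Russia", "N"),
            ("for", "P"), ("its", "O"), ("stance", "N")]
result = extract_patterns(sentence)
```

A real run would take the lemmas, categories and dependency links from the parser output instead of a hand-tagged list, and would also need the verb sense and semantic features that this sketch leaves out.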

Example
Primakov suggested that the Administration was using the Ames arrest to score domestic political points, to punish Russia for its independent stance on the conflict in Bosnia-Herzegovina and to provide convenient excuse for cutting American aid to Russia, according to journalists who attended.
– Primakov suggest Administration
– administration use Ames arrest
– administration score domestic point
– Primakov punish Russia for its stance
– Primakov provide convenient excuse for
– Primakov cut American aid to Russia
– according to journalist
– journalist attend

Automatic extraction of patterns
– The patterns are extracted from the passages
– The patterns show only the basic information of each sentence:
  – the most important words: the verb and its arguments
  – the syntactic and semantic relations between them
– This is enough to identify the topic of a document and to decide about its relevance

Automatic extraction of patterns
– Hypothesis:
  – It is possible to decide about the relevance of a document with the patterns alone
  – For a searcher with passive language abilities in the foreign language, it is easier to process the patterns than the complete passage, because attention goes only to the main words of each sentence

Description of the experiment

Experiment
– Cross-language document selection
– Search group: Spanish speakers with passive language abilities in English
– Information retrieval system: IR-n system (Llopis 2003)
  – It uses the complete query
  – For each query, it extracts 25 (possibly) relevant documents

Experiment
– Each retrieved document is shown to the user:
  – System 1 shows only passages (in English)
  – System 2 shows the patterns extracted from the passages (in English)
– With this, the user must decide whether the document is relevant or not
– Through an HTML interface, we save the relevance judgment and the time taken
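The logging side of the interface can be pictured as one record per judgment, with the time measured from display to decision. The sketch below is an assumption about how such a log might look; `JudgmentLog`, its method names, and the system labels are hypothetical, and the clock is injectable only so the example can be checked without waiting.

```python
import time
from dataclasses import dataclass


@dataclass
class Judgment:
    doc_id: str
    system: str        # e.g. "passages" or "patterns"
    relevant: bool
    seconds: float     # time between display and decision


class JudgmentLog:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._shown_at = {}
        self.records = []

    def shown(self, doc_id):
        """Call when the passage or pattern list is displayed."""
        self._shown_at[doc_id] = self._clock()

    def judged(self, doc_id, system, relevant):
        """Call when the searcher submits the relevance decision."""
        elapsed = self._clock() - self._shown_at.pop(doc_id)
        record = Judgment(doc_id, system, relevant, elapsed)
        self.records.append(record)
        return record

    def mean_seconds(self, system):
        times = [r.seconds for r in self.records if r.system == system]
        return sum(times) / len(times) if times else None


# Fake clock values stand in for real timestamps in this example.
ticks = iter([0.0, 4.0, 10.0, 12.5])
log = JudgmentLog(clock=lambda: next(ticks))
log.shown("d1")
log.judged("d1", "patterns", True)
log.shown("d2")
log.judged("d2", "patterns", False)
```

Averaging `seconds` per system gives the kind of time comparison between passages and patterns that the experiment reports.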

Results and conclusions

F-alpha average
[Table: F-alpha average per system, with rows for Passages and Patterns; the values are missing from this transcript]
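For reference, F-alpha is van Rijsbergen's weighted combination of precision P and recall R, F_alpha = 1 / (alpha / P + (1 - alpha) / R). The sketch below computes it from a searcher's selections. The default alpha = 0.8, which weights precision more heavily, is the value commonly used in iCLEF document selection, but it is an assumption here because the slide does not state it. The function names are illustrative.

```python
def precision_recall(selected, relevant):
    """Precision and recall of the documents a searcher marked as relevant."""
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f_alpha(precision, recall, alpha=0.8):
    """F_alpha = 1 / (alpha / P + (1 - alpha) / R); 0 if P or R is 0."""
    if precision == 0 or recall == 0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


# Made-up document ids, for illustration only.
p, r = precision_recall(selected=["d1", "d2"], relevant=["d1", "d3", "d4", "d5"])
score = f_alpha(p, r)
```

With alpha = 0.5 the measure reduces to the usual harmonic mean of precision and recall.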

Time consumed
[Chart not preserved in this transcript]

Conclusions
– With the syntactic-semantic patterns alone, it is possible to decide about the relevance of a document in a foreign language (if the searcher has passive abilities in this language)
– In most cases, the relevance judgment takes less time with the patterns than with the passages
– With the syntactic-semantic patterns and/or passages, it is possible to avoid the use of machine translation systems for users with passive abilities in the language of the document
