Diana Inkpen, University of Ottawa, CLEF 2005 Using various indexing schemes and multiple translations in the CL-SR task at CLEF 2005 Diana Inkpen, Muath.

Slides:



Advertisements
Similar presentations
SINAI-GIR A Multilingual Geographical IR System University of Jaén (Spain) José Manuel Perea Ortega CLEF 2008, 18 September, Aarhus (Denmark) Computer.
Advertisements

Chapter 5: Introduction to Information Retrieval
Multilingual experiments of CLEF 2003 Eija Airio, Heikki Keskustalo, Turid Hedlund, Ari Pirkola University of Tampere, Finland Department of Information.
IR Models: Overview, Boolean, and Vector
Presented by Li-Tal Mashiach Learning to Rank: A Machine Learning Approach to Static Ranking Algorithms for Large Data Sets Student Symposium.
Modern Information Retrieval Chapter 2 Modeling. Probabilistic model the appearance or absent of an index term in a document is interpreted either as.
Probabilistic IR Models Based on probability theory Basic idea : Given a document d and a query q, Estimate the likelihood of d being relevant for the.
Ch 4: Information Retrieval and Text Mining
Modern Information Retrieval Chapter 2 Modeling. Can keywords be used to represent a document or a query? keywords as query and matching as query processing.
SLIDE 1IS 240 – Spring 2007 Prof. Ray Larson University of California, Berkeley School of Information Tuesday and Thursday 10:30 am - 12:00.
Evaluating the Performance of IR Sytems
Advance Information Retrieval Topics Hassan Bashiri.
Vector Space Model CS 652 Information Extraction and Integration.
Modern Information Retrieval Chapter 2 Modeling. Can keywords be used to represent a document or a query? keywords as query and matching as query processing.
Scalable Text Mining with Sparse Generative Models
HYPERGEO 1 st technical verification ARISTOTLE UNIVERSITY OF THESSALONIKI Baseline Document Retrieval Component N. Bassiou, C. Kotropoulos, I. Pitas 20/07/2000,
Chapter 5: Information Retrieval and Web Search
DIVINES – Speech Rec. and Intrinsic Variation W.S.May 20, 2006 Richard Rose DIVINES SRIV Workshop The Influence of Word Detection Variability on IR Performance.
Search is not only about the Web An Overview on Printed Documents Search and Patent Search Walid Magdy Centre for Next Generation Localisation School of.
LREC Combining Multiple Models for Speech Information Retrieval Muath Alzghool and Diana Inkpen University of Ottawa Canada.
Challenges in Information Retrieval and Language Modeling Michael Shepherd Dalhousie University Halifax, NS Canada.
CLEF Ǻrhus Robust – Word Sense Disambiguation exercise UBC: Eneko Agirre, Oier Lopez de Lacalle, Arantxa Otegi, German Rigau UVA & Irion: Piek Vossen.
Querying Across Languages: A Dictionary-Based Approach to Multilingual Information Retrieval Doctorate Course Web Information Retrieval Speaker Gaia Trecarichi.
5 June 2006Polettini Nicola1 Term Weighting in Information Retrieval Polettini Nicola Monday, June 5, 2006 Web Information Retrieval.
“ SINAI at CLEF 2005 : The evolution of the CLEF2003 system.” Fernando Martínez-Santiago Miguel Ángel García-Cumbreras University of Jaén.
A Study on Query Expansion Methods for Patent Retrieval Walid MagdyGareth Jones Centre for Next Generation Localisation School of Computing Dublin City.
Leveraging Reusability: Cost-effective Lexical Acquisition for Large-scale Ontology Translation G. Craig Murray et al. COLING 2006 Reporter Yong-Xiang.
Modern Information Retrieval: A Brief Overview By Amit Singhal Ranjan Dash.
Applying the KISS Principle with Prior-Art Patent Search Walid Magdy Gareth Jones Dublin City University CLEF-IP, 22 Sep 2010.
The CLEF 2003 cross language image retrieval task Paul Clough and Mark Sanderson University of Sheffield
MIRACLE Multilingual Information RetrievAl for the CLEF campaign DAEDALUS – Data, Decisions and Language, S.A. Universidad Carlos III de.
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
Generic text summarization using relevance measure and latent semantic analysis Gong Yihong and Xin Liu SIGIR, April 2015 Yubin Lim.
Chapter 6: Information Retrieval and Web Search
CLEF 2007 Workshop Budapest, September 19, 2007  ELDA 1 Overview of QAST Question Answering on Speech Transcriptions - J. Turmo, P. Comas (1),
A merging strategy proposal: The 2-step retrieval status value method Fernando Mart´inez-Santiago · L. Alfonso Ure ˜na-L´opez · Maite Mart´in-Valdivia.
CSE3201/CSE4500 Term Weighting.
UA in ImageCLEF 2005 Maximiliano Saiz Noeda. Index System  Indexing  Retrieval Image category classification  Building  Use Experiments and results.
IR Homework #2 By J. H. Wang Mar. 31, Programming Exercise #2: Query Processing and Searching Goal: to search relevant documents for a given query.
D L T Cross-Language French-English Question Answering using the DLT System at CLEF 2003 Aoife O’Gorman Igal Gabbay Richard F.E. Sutcliffe Documents and.
1 01/10/09 1 INFILE CEA LIST ELDA Univ. Lille 3 - Geriico Overview of the INFILE track at CLEF 2009 multilingual INformation FILtering Evaluation.
1 FollowMyLink Individual APT Presentation Third Talk February 2006.
Information Retrieval at NLC Jianfeng Gao NLC Group, Microsoft Research China.
Iterative Translation Disambiguation for Cross Language Information Retrieval Christof Monz and Bonnie J. Dorr Institute for Advanced Computer Studies.
CLEF-2005 CL-SR at Maryland: Document and Query Expansion Using Side Collections and Thesauri Jianqiang Wang and Douglas W. Oard College of Information.
Vector Space Models.
CLEF Kerkyra Robust – Word Sense Disambiguation exercise UBC: Eneko Agirre, Arantxa Otegi UNIPD: Giorgio Di Nunzio UH: Thomas Mandl.
Improving Named Entity Translation Combining Phonetic and Semantic Similarities Fei Huang, Stephan Vogel, Alex Waibel Language Technologies Institute School.
From Text to Image: Generating Visual Query for Image Retrieval Wen-Cheng Lin, Yih-Chen Chang and Hsin-Hsi Chen Department of Computer Science and Information.
1 Flexible and Efficient Toolbox for Information Retrieval MIRACLE group José Miguel Goñi-Menoyo (UPM) José Carlos González-Cristóbal (UPM-Daedalus) Julio.
Dependence Language Model for Information Retrieval Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu, Guihong Cao, Dependence Language Model for Information Retrieval,
Departamento de Lenguajes y Sistemas Informáticos Cross-language experiments with IR-n system CLEF-2003.
Ranked Retrieval INST 734 Module 3 Doug Oard. Agenda Ranked retrieval  Similarity-based ranking Probability-based ranking.
1 13/05/07 1/20 LIST – DTSI – Interfaces, Cognitics and Virtual Reality Unit The INFILE project: a crosslingual filtering systems evaluation campaign Romaric.
1 CS 430 / INFO 430 Information Retrieval Lecture 3 Searching Full Text 3.
September 16, 2004CLEF 2004 CLEF-2005 CL-SDR: Proposing an IR Test Collection for Spontaneous Conversational Speech Gareth Jones (Dublin City University,
Thomas Mandl: Robust CLEF Overview 1 Cross-Language Evaluation Forum (CLEF) Thomas Mandl Information Science Universität Hildesheim
Xiaoying Gao Computer Science Victoria University of Wellington COMP307 NLP 4 Information Retrieval.
Analysis of Experiments on Hybridization of different approaches in mono and cross-language information retrieval DAEDALUS – Data, Decisions and Language,
IR Homework #2 By J. H. Wang Apr. 13, Programming Exercise #2: Query Processing and Searching Goal: to search for relevant documents Input: a query.
BioCreAtIvE Critical Assessment for Information Extraction in Biology Granada, Spain, March28-March 31, 2004 Task 2: Functional annotation of gene products.
IR 6 Scoring, term weighting and the vector space model.
1 SINAI at CLEF 2004: Using Machine Translation resources with mixed 2-step RSV merging algorithm Fernando Martínez Santiago Miguel Ángel García Cumbreras.
Multilingual Search using Query Translation and Collection Selection Jacques Savoy, Pierre-Yves Berger University of Neuchatel, Switzerland
Korean version of GloVe Applying GloVe & word2vec model to Korean corpus speaker : 양희정 date :
F. López-Ostenero, V. Peinado, V. Sama & F. Verdejo
Experiments for the CL-SR task at CLEF 2006
Information Retrieval and Web Search
Large scale multilingual and multimodal integration
Attention for translation
Presentation transcript:

Diana Inkpen, University of Ottawa, CLEF 2005 Using various indexing schemes and multiple translations in the CL-SR task at CLEF 2005 Diana Inkpen, Muath Alzghool, and Aminul Islam University of Ottawa

Diana Inkpen, University of Ottawa, CLEF 2005 System overview Smart IR system Online MT tools Task –Collection – ASR transcribed text –Training queries (38), test queries (25) –Relevance judgments

Diana Inkpen, University of Ottawa, CLEF 2005 Results of the five submitted runs, for topics in English, Spanish, French, and German

Diana Inkpen, University of Ottawa, CLEF 2005 Translating queries with online MT tools Spanish, German, French: wordlingo_translator.html Czech: 1.

Diana Inkpen, University of Ottawa, CLEF 2005 Example query 1159 Child survivors in Sweden Describe survival mechanisms of children born in who spend the war in concentration camps or in hiding and who presently live in Sweden. The relevant material should describe the circumstances and inner resources of the surviving children. The relevant material also describes how the wartime experience affected their post-war adult life Les enfants survivants en Suède Descriptions des mécanismes de survie des enfants nés entre 1930 et 1933 qui ont passé la guerre en camps de concentration ou cachés et qui vivent actuellement en Suède.

Diana Inkpen, University of Ottawa, CLEF 2005 Example of translated query (from French) 1159 surviving children in Sweden The children survivors in Sweden surviving children in Sweden The surviving children in Sweden surviving children in Sweden Descriptions of the mechanisms of survival of the children born between 1930 and 1933 who passed the war in concentration camps or hidden and who currently live in Sweden. Descriptions of the survival mechanisms of the born children between 1930 and 1933 that passed the war in concentration camps or hidden and that live currently in Sweden. Descriptions of the mechanisms of survival of the children born between 1930 and 1933 who passed the war in concentration camps or hidden and who currently live in Sweden. … …

Diana Inkpen, University of Ottawa, CLEF 2005 Results on the output of each Machine Translation system: Spanish, French, German, and Czech

Diana Inkpen, University of Ottawa, CLEF 2005 Smart IR system (Buckley et. al., 2000), Stemming, pseudo-relevance. Many weighting schemes (60 x 60) –For document terms –For query terms

Diana Inkpen, University of Ottawa, CLEF 2005 Weighting schemes for documents and queries Term frequency component none (n) : max-norm (m) : augmented normalized (a): log (l): square (s): Merging of collection frequency component none (n): inverse document frequency weight (t): probabilistic (p): squared (s): … Merging of vector normalization none (n): sum (s): cosine (c):

Diana Inkpen, University of Ottawa, CLEF 2005 Results of the various indexing (weighting) schemes, for English topics

Diana Inkpen, University of Ottawa, CLEF 2005 Phonetic transcripts The documents and the queries were transcribed in phonetic form and split into 4-grams. Example: 1159 ch_ay_l_d s_ax_r_v ax_r_v_ay r_v_ay_v v_ay_v_ax ay_v_ax_r v_ax_r_z ih_n s_w_iy_d w_iy_d_ax iy_d_ax_n

Diana Inkpen, University of Ottawa, CLEF 2005 Results on phonetic n-grams, and combination text plus phonetic n-grams

Diana Inkpen, University of Ottawa, CLEF 2005 Results of indexing all fields: manual keywords and summaries, ASR transcripts