With or without users? Julio Gonzalo, UNED, http://nlp.uned.es

The classical IR model
Diagram: information need (fixed), query, document collection, relevant docs (precise)
Research topics: query expansion, formal models, indexing, clustering, query/document comparison, data structures, weighting heuristics, visualization, feedback, filtering
Goal: all relevant information and only relevant information
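
The goal stated for the classical model corresponds to the two standard measures: recall (all relevant information) and precision (only relevant information). A minimal Python sketch follows; the helper name `precision_recall` and the document IDs are invented for illustration and do not come from the slides.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: iterable of document IDs returned by the system
    relevant:  iterable of document IDs judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of the returned documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of the relevant documents that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only: four documents returned, three relevant overall.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d7"])
print(p, r)
```

Both measures assume a fixed information need and a precise set of relevant documents, which is exactly the assumption the following slides question for web search.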

Does it apply to web search?

Is relevance what the user needs?
Most frequent questions, Infoseek 1999 (SIGIR Forum):
1. Empty question
2. sex
8. Pamela Anderson (first multiword question in the ranking)
Google: No! It is quality, saliency, reliability... in one or two links

Is word frequency useful?

PageRank addresses user needs
Clasificados.wanadoo.es, Realizadores.tv, Chat.rincondelvago.com, mx.dir.yahoo.com, telecinco
Link text is the most valuable text to index!
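
For readers unfamiliar with it, PageRank scores a page from the link structure around it rather than from word frequency. The sketch below is a minimal power iteration over a tiny invented link graph, assuming the commonly cited damping factor of 0.85. The function name, graph, and parameters are illustrative assumptions, not the production algorithm of any search engine.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share (random jump).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical graph: pages a, b and d all link to c; c links back to a.
toy_graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_graph)
best = max(scores, key=scores.get)
print(best)
```

In this toy graph the page with the most incoming links ends up ranked highest, regardless of what words it contains.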

With or without users?
Google’s first commandment: Focus on the user and all the rest will come along.
“With or without users?” is not the right question
“With or without user focus?” YES

Is CLEF focusing on users?
Multilingual track: If I have equivalent sets of relevant news in many languages, I do not want a merged set. I want the subset in my native language!
Q&A track: How much does it take to find an answer with an IR engine? (Ask QA assessors!!)
Interactive track: natural user task, but artificial users!
Only ImageCLEF & GIRT partially pass the test
Why is the intersection between ECDL and CLEF almost null?
Multilingual web track: danger of making the same pre-Google mistake.

The web is truly multilingual by nature...
But the web is redundant, and average users are looking for a single perfect link!!
Almost no need for cross-language users (cf. Google)

Vertical search engines?
Diagram labels: web pages, extraction, structured data, information need, query

Conclusions
We need more focus on user needs...
... and all the rest will come along!
Google’s tenth commandment: great just isn’t good enough