Document Clustering for Natural Language Dialogue-based IR (Google for the Blind) Antoine Raux IR Seminar and Lab 11-743 Fall 2003 Initial Presentation.

Presentation transcript:

Document Clustering for Natural Language Dialogue-based IR (Google for the Blind) Antoine Raux IR Seminar and Lab Fall 2003 Initial Presentation

Background
Current search engines: user browses through an ordered list of retrieved documents
Problem: for some users and/or some devices, this is not realistic
–Speech-only interaction (e.g. for the blind, phone-based systems)
–Small devices (e.g. PDA, cell phones)

Goal
Build a light-weight, “natural language” dialogue-based interface to refine queries and select relevant documents.

Previous Work
Research on innovative interfaces for IR using clustering
–e.g. Scatter/Gather
Technology to enable the visually impaired to access the web
–Page readers, web navigators
But no research combining the two:
–Using clustering to perform speech-based IR

Problem Definition
Starts with a free query
–“What are you seeking information about?”
Turn-based interaction
–System turns: 20 words at most
–User turns: free, natural language
Goal: retrieve one document
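The 20-word cap on system turns can be made concrete with a small helper. This is only a sketch in Python (not the C++ named for the planned implementation); the function name `truncate_turn` and the choice to cut on whitespace-separated words are assumptions, since the slides do not say how words are counted.

```python
MAX_SYSTEM_WORDS = 20  # from the slide: system turns are 20 words at most


def truncate_turn(text: str, max_words: int = MAX_SYSTEM_WORDS) -> str:
    """Cut a system utterance down to at most `max_words` words.

    Assumption: a "word" is any whitespace-separated token.
    """
    words = text.split()
    return " ".join(words[:max_words])


if __name__ == "__main__":
    long_turn = "word " * 50
    print(len(truncate_turn(long_turn).split()))
```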

Baseline Solution
Feed the initial query to a traditional search engine
Read the first words of the top document and ask “Do you want this document?”
If the user says “No”, read the first words of the next document, and so on.
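The baseline amounts to a sequential walk down the ranked list. A minimal Python sketch, assuming the search engine has already returned the ranked documents as plain strings; `baseline_dialogue`, `first_words`, and the `user_says_yes` callback are hypothetical names introduced for illustration. A 15-word preview plus the 5-word question keeps each system turn within 20 words.

```python
from typing import Callable, List, Optional, Tuple


def first_words(text: str, n: int) -> str:
    """Return the first n whitespace-separated words of a document."""
    return " ".join(text.split()[:n])


def baseline_dialogue(
    ranked_docs: List[str],
    user_says_yes: Callable[[str], bool],  # hypothetical stand-in for the user's answer
    preview_words: int = 15,               # assumption: 15 preview words + 5-word question
) -> Tuple[Optional[str], int]:
    """Offer documents in rank order until the user accepts one.

    Returns the accepted document (or None) and the number of system turns used.
    """
    turns = 0
    for doc in ranked_docs:
        turns += 1
        prompt = first_words(doc, preview_words) + " Do you want this document?"
        if user_says_yes(prompt):
            return doc, turns
    return None, turns
```

In this sketch the number of system turns equals the rank of the wanted document, so a relevant document far down the list is costly to reach.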

Proposed (Initial) Solution
1. Feed the initial query to a traditional search engine
2. Cluster the set of selected documents into a small number of clusters (e.g. 2-5)
3. Label each cluster
4. Generate a question to select a cluster
5. If only a small number of documents is left, traverse the list sequentially; otherwise go back to step 2
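The loop in steps 2 to 5 can be sketched as follows. This is not the clustering algorithm of the planned system, which the slides do not specify beyond "top-down"; as a deliberately simple stand-in, each round splits the documents into two clusters on a single term (the one whose document frequency is closest to half) and uses that term as the label. The names `split_on_term` and `refine`, the yes/no question wording, and the `max_list` threshold are all assumptions.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple


def split_on_term(docs: List[str]) -> Optional[Tuple[str, List[str], List[str]]]:
    """Steps 2-3 (stand-in): split docs in two on one term, which also labels the split."""
    n = len(docs)
    # Document frequency of every term.
    df = Counter(term for d in docs for term in set(d.lower().split()))
    # Only terms that are absent from at least one document can separate anything.
    candidates = [t for t, c in df.items() if c < n]
    if not candidates:
        return None
    # Prefer the most balanced split; break ties alphabetically for determinism.
    term = min(candidates, key=lambda t: (abs(df[t] - n / 2), t))
    with_term = [d for d in docs if term in d.lower().split()]
    without = [d for d in docs if term not in d.lower().split()]
    return term, with_term, without


def refine(
    docs: List[str],
    answer: Callable[[str], bool],  # hypothetical stand-in for the user's yes/no reply
    max_list: int = 2,              # assumption: "small number of documents left"
) -> Tuple[List[str], int]:
    """Steps 2-5: ask cluster questions until few enough documents remain."""
    questions = 0
    while len(docs) > max_list:
        split = split_on_term(docs)
        if split is None:  # documents are indistinguishable by terms
            break
        term, with_term, without = split
        questions += 1
        # Step 4: one question selects a cluster.
        docs = with_term if answer(f"Is it about {term}?") else without
    return docs, questions
```

The documents returned by `refine` would then be traversed sequentially, as in the baseline. With balanced splits each question roughly halves the candidate set, which is the motivation for preferring cluster questions over reading the list from the top.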

Planned Implementation
Use the Lemur Toolkit for parsing, indexing and basic retrieval
Implement (in C++) a top-down clustering algorithm that labels each cluster
Implement a (very simple) CGI user interface
[Diagram labels: Retrieval, Clustering, Question, Document Summaries]

To be continued…