Navigation-Aided Retrieval


Navigation-Aided Retrieval
Shashank Pandit and Christopher Olston
Presentation by Yang Yu
CSE 450 Web Data Mining

Outline
- Introduction
- Related Work
- System Model
- Prototype System
- Evaluation
- Summary & Future Work

Introduction
Background reasons for this work
- Difficulty in formulating appropriate queries
- Open-ended search tasks
- Preference for orienteering
Navigation-Aided Retrieval

Introduction
Organic versus Synthetic Structure
- Synthetic: structure is synthesized automatically into the query results
- Organic: uses structure that naturally exists in the documents
Advantages of organic NAR
- Human oversight
- Familiar user interface
- A single view of the document collection
- Robust implementation by a third party
Contributions
- Formal model of navigation-aided retrieval
- An overview of techniques for a NAR-based retrieval system
- Empirical evaluation via a user study

Related Work
Selecting Starting Points
- Best Trails system
  - Uses an ad-hoc scoring function for starting points
  - Restricts starting points to documents that themselves match the query
  - Does not take navigability factors into account
  - User interface departs substantially from the traditional interface
- Topic distillation, which mainly uses HITS
  - Only effective for broad topic areas for which there are many hubs and authorities
Guiding Navigation
- WebWatcher highlights hyperlinks along paths taken by previous users who had posed similar queries.

System Model
Generic Model
- Query submodel:
- Navigation submodel:
- Generic scoring function
- Assumption: every member of the relevance set St is a singleton set, so St can be "flattened" into {d1, d2, …, dn}.

System Model
Instantiations of Generic Model
- Conventional Probabilistic IR Model
- Navigation-Conscious Model
  The two terms embody the two key factors:
  - the number of documents reachable from d that are relevant to the search task
  - the ease and accuracy with which the user is able to navigate to those documents

Prototype System
- Preprocessing
- Content Engine
- Connectivity Engine: <d1, d2, dW, W(N(d2), d1, d2)>
- Intermediary

Prototype System

Prototype System
Selecting Starting Points
1. Retrieve from the content engine all documents d’ relevant to q.
2. For each relevant document d’ retrieved in Step 1, retrieve from the connectivity engine all documents d from which the user can navigate to d’.
3. For each unique document d found in Step 2, compute the starting point score.
4. Sort the documents in decreasing order of this score and truncate after the top k documents.
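The four steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not the prototype's implementation: `content_engine` and `connectivity_engine` are hypothetical in-memory dictionaries standing in for the two engines, and the score used in Step 3 (relevance multiplied by a navigability weight, summed over reachable relevant documents) is an assumed instantiation, since the slide does not give the formula.

```python
from collections import defaultdict


def select_starting_points(query, content_engine, connectivity_engine, k=10):
    """Return the top-k (document, score) pairs as starting points.

    content_engine:      hypothetical dict, query -> {d_prime: relevance}
    connectivity_engine: hypothetical dict, d_prime -> {d: navigability weight},
                         listing every d from which d_prime can be reached
                         (a document may list itself with weight 1.0).
    """
    # Step 1: all documents d' relevant to the query.
    relevant = content_engine.get(query, {})

    # Steps 2 and 3: for each d', look up the documents d that can navigate
    # to it, and accumulate an assumed score of relevance * navigability.
    scores = defaultdict(float)
    for d_prime, relevance in relevant.items():
        for d, weight in connectivity_engine.get(d_prime, {}).items():
            scores[d] += relevance * weight

    # Step 4: sort by decreasing score (ties broken by name), keep the top k.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

Under this assumed score, a page that is not relevant itself can still rank as a starting point if it leads easily to several relevant pages.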

Prototype System
Adding Navigation Guidance
1. Retrieve from the content engine all documents d’ for which R(d’, q) >= T.
2. For each document d’ retrieved in Step 1, retrieve from the connectivity engine the tuple corresponding to <d, d’>, if it exists.
3. For each <d1, d2, dW, W(N(d2), d1, d2)> tuple retrieved in Step 2, highlight links on d that point to dW.
Efficiency and Scalability
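The navigation guidance steps can be sketched the same way. This is a minimal sketch, not the prototype's code: `relevance` and `connectivity` are hypothetical dictionaries, and dW is read here as the document that a link on the current page d should point to in order to move toward d’, which is an assumption based on Step 3.

```python
def links_to_highlight(current_doc, query, relevance, connectivity, threshold):
    """Return the sorted link targets on current_doc that should be highlighted.

    relevance:    hypothetical dict, (doc, query) -> R(doc, query)
    connectivity: hypothetical dict, (d, d_prime) -> dW, the linked document
                  assumed to lead from d toward d_prime
    threshold:    the relevance cutoff T
    """
    # Step 1: documents d' whose relevance to the query is at least T.
    targets = [
        doc
        for (doc, q), score in relevance.items()
        if q == query and score >= threshold
    ]

    # Steps 2 and 3: look up the tuple for <d, d'> if it exists and collect
    # dW, the link target on the current page to highlight.
    highlighted = set()
    for d_prime in targets:
        next_hop = connectivity.get((current_doc, d_prime))
        if next_hop is not None:
            highlighted.add(next_hop)
    return sorted(highlighted)
```

Collecting the targets in a set means a link is highlighted once even when it leads toward several relevant documents.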

Evaluation
Experimental Hypotheses
- In query-only scenarios, Volant does not perform significantly worse.
- In combined query/navigation scenarios, Volant performs better.
- The best organic starting point is of higher quality than one that can be synthesized using existing techniques.
Search Task Test Sets
- Unambiguous:
- Ambiguous:
Performance on Unambiguous Queries

Evaluation
Performance on Ambiguous Queries
Four criteria: breadth, accessibility, appeal, and usefulness.

Summary and Future Work
Summary
- Effectiveness
- Relationship to conventional IR
- Relationship to synthetic approaches
Future Work
- Add redundancy to corpora
- Tune the scoring function so that it also applies to synthetic starting points
- Develop a unified method that can both support exploration and directly return documents

Thank you! Questions or Comments?