A machine learning approach to improve precision for navigational queries in a Web information retrieval system Reiner Kraft

Presentation transcript:

Motivation
Ranking of search results:
– Requires high precision rather than high recall
– Navigational queries (homepage finding task) should have the desired result on top
– Users are impatient and don't examine low-ranked results
– Want to incorporate users' relevance judgments to improve the overall ranking

Project Goal
Use an on-line learning algorithm that, given a query q, finds its homepage h_q:
– Rank r(q, h_q) is within the top k ranked search results, where k < 20
– More ambitious: let r(q, h_q) = 1
– Improve precision of the top k search results
Algorithm design has to be space and time efficient to be of practical use

Overall setup
On-line learning algorithm based on the weighted majority algorithm
Predict with weighted median for query q
User is the teacher and provides reinforcements:
– Negative vote: document ranked too high (-)
– Positive vote: document ranked too low (+)
Algorithm incorporates the feedback and updates the ranking for q
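The weighted-median prediction step can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the function name `weighted_median` and the choice of the lower median (the first prediction at which cumulative weight reaches half of the total) are assumptions.

```python
def weighted_median(predictions, weights):
    """Return the lower weighted median of the experts' predictions.

    Assumption: ties are resolved toward the smaller prediction, i.e. we
    return the first value at which cumulative weight reaches half the total.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive total")
    cumulative = 0.0
    # Walk the experts in order of their predicted value.
    for prediction, weight in sorted(zip(predictions, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return prediction
    return max(predictions)  # only reached through floating-point round-off
```

With three experts predicting ranks 1, 2 and 3, the prediction follows whichever side of the list holds at least half of the weight, so a single heavily weighted expert dominates the outcome.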

LearnRank 1
Uses the good-quality ranking of the search engine for query q to initialize the experts' weights
Uses a matrix of experts per query q
Each expert predicts a fixed rank (linear distribution)
Rows of experts are managed by k master algorithms (MAs), which combine the experts' predictions
MAs predict with weighted median
Master rank algorithm (MRA) then combines the predictions of the MAs by sorting
Need to resolve ties using heuristics based on votes
MAs use a fixed multiplicative update to punish poorly performing experts
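The pieces above can be put together in a small sketch. Several details are not given in the slides and are assumptions here: the geometric initialization (`DECAY`), the penalty factor `BETA`, the rule for which experts are punished by a vote, and the tie-break by net vote count. All function names are hypothetical.

```python
DECAY = 0.5  # assumed: initial weight halves per rank of distance from the engine's rank
BETA = 0.5   # assumed fixed multiplicative penalty, 0 < BETA < 1


def init_weights(engine_rank, k):
    """One row of the matrix: expert j predicts rank j, weight peaks at engine_rank."""
    return [DECAY ** abs(j - engine_rank) for j in range(1, k + 1)]


def ma_predict(weights):
    """Master algorithm: weighted median over the fixed ranks 1..k."""
    total = sum(weights)
    cumulative = 0.0
    for rank, weight in enumerate(weights, start=1):
        cumulative += weight
        if cumulative >= total / 2:
            return rank
    return len(weights)


def mra_rank(matrix, net_votes):
    """Master rank algorithm: sort documents by MA prediction.

    Assumed tie-break heuristic: more net positive votes first, then document id.
    """
    return sorted(matrix, key=lambda d: (ma_predict(matrix[d]), -net_votes.get(d, 0), d))


def vote(matrix, net_votes, doc, positive):
    """Apply one user vote to the row of experts for doc."""
    weights = matrix[doc]
    current = ma_predict(weights)
    for j in range(1, len(weights) + 1):
        # Assumed rule: a positive vote (ranked too low) punishes experts at or
        # below the current position; a negative vote punishes those at or above it.
        ranked_too_low = positive and j >= current
        ranked_too_high = (not positive) and j <= current
        if ranked_too_low or ranked_too_high:
            weights[j - 1] *= BETA
    net_votes[doc] = net_votes.get(doc, 0) + (1 if positive else -1)


engine_ranking = ["d2", "d3", "d1"]
matrix = {d: init_weights(r, len(engine_ranking))
          for r, d in enumerate(engine_ranking, start=1)}
net_votes = {}
initial = mra_rank(matrix, net_votes)
vote(matrix, net_votes, "d1", positive=True)
updated = mra_rank(matrix, net_votes)
```

In this toy run the initial ranking reproduces the engine's order, and one positive vote for `d1` moves its MA prediction from rank 3 to rank 2, where the vote-based tie-break places it ahead of `d3`.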

The expert weight matrix M_q
Example (columns: DocMapping, E_1, E_2, E_3; rows: d_2, d_3, d_1):
– MA 1 predicts: 1
– MA 2 predicts: 2
– MA 3 predicts: 3
– MRA then predicts: (d_2, 1), (d_3, 2), (d_1, 3)

LearnRank 2
Uses absolute loss based on the distance to the voted rank
Uses a shared update:
– Takes some of the weight of misleading experts and distributes it among the other experts
– Better adaptability
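One way such a shared update could look is sketched below. The slides do not give the formula, so everything specific is an assumption: the loss is the absolute distance to the voted rank, normalized to [0, 1]; each expert's weight shrinks by `beta ** loss`; and the weight removed from an expert is split evenly among the other experts. The name `shared_update` is hypothetical.

```python
def shared_update(weights, voted_rank, beta=0.5):
    """Return new weights for one row of experts (expert j predicts rank j)."""
    k = len(weights)
    if k < 2:
        raise ValueError("need at least two experts to share weight")
    # Absolute loss: distance between the expert's fixed rank and the voted rank.
    losses = [abs(j - voted_rank) / (k - 1) for j in range(1, k + 1)]
    # Loss update: experts far from the voted rank lose more weight.
    shrunk = [w * beta ** loss for w, loss in zip(weights, losses)]
    removed = [w - s for w, s in zip(weights, shrunk)]
    pool = sum(removed)
    # Share step: each expert receives an equal part of what the others lost.
    return [s + (pool - r) / (k - 1) for s, r in zip(shrunk, removed)]


new_weights = shared_update([1.0, 1.0, 1.0], voted_rank=1)
```

Because the removed weight is redistributed rather than discarded, the total weight in the row stays constant, and no expert's weight decays toward zero as quickly as under a purely multiplicative update. That is one plausible reading of the "better adaptability" claim.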

Average precision of one query over time

Average Votes Distribution

Average Precision compared to initial search engine ranking
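For reference, average precision for a single query can be computed as in the sketch below, which uses the standard definition (the mean of precision values at the positions of the relevant documents). The function name and the document identifiers are illustrative. For a navigational query with a single relevant homepage, the measure reduces to 1 divided by the homepage's rank.

```python
def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant documents."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / position  # precision at this cut-off
    return precision_sum / len(relevant)
```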

Conclusion
LearnRank 1 and LearnRank 2 outperform the initial search engine ranking in terms of average precision over time
LearnRank 2 performs better because of its shared update (more adaptive)
The algorithms are time and space efficient and can easily be implemented in search engines