Retroactive Answering of Search Queries. Beverly Yang, Glen Jeh.

Similar presentations
Beliefs & Biases in Web Search

Retroactive Answering of Search Queries Beverly Yang Glen Jeh Google.
Crawling, Ranking and Indexing. Organizing the Web The Web is big. Really big. –Over 3 billion pages, just in the indexable Web The Web is dynamic Problems:
Query Chains: Learning to Rank from Implicit Feedback Paper Authors: Filip Radlinski Thorsten Joachims Presented By: Steven Carr.
WSCD INTRODUCTION  Query suggestion has often been described as the process of making a user query resemble more closely the documents it is expected.
1 Learning User Interaction Models for Predicting Web Search Result Preferences Eugene Agichtein Eric Brill Susan Dumais Robert Ragno Microsoft Research.
Explorations in Tag Suggestion and Query Expansion Jian Wang and Brian D. Davison Lehigh University, USA SSM 2008 (Workshop on Search in Social Media)
Evaluating Search Engine
Web Mining Research: A Survey. Raymond Kosala and Hendrik Blockeel, ACM SIGKDD, July 2000. Presented by Shan Huang, 4/24/2007.
1 CS 430 / INFO 430 Information Retrieval Lecture 12 Probabilistic Information Retrieval.
Learning to Advertise. Introduction Advertising on the Internet = $$$ –Especially search advertising and web page advertising Problem: –Selecting ads.
Web Mining Research: A Survey. Raymond Kosala and Hendrik Blockeel, ACM SIGKDD, July 2000. Presented by Shan Huang, 4/24/2007. Revised.
J. Chen, O. R. Zaiane and R. Goebel An Unsupervised Approach to Cluster Web Search Results based on Word Sense Communities.
CONTENT-BASED BOOK RECOMMENDING USING LEARNING FOR TEXT CATEGORIZATION TRIVIKRAM BHAT UNIVERSITY OF TEXAS AT ARLINGTON DATA MINING CSE6362 BASED ON PAPER.
The AdWords Toolbox All the tools you need to make your ad run more efficiently!
Information Re-Retrieval Repeat Queries in Yahoo’s Logs Jaime Teevan (MSR), Eytan Adar (UW), Rosie Jones and Mike Potts (Yahoo) Presented by Hugo Zaragoza.
FALL 2012 DSCI5240 Graduate Presentation By Xxxxxxx.
TwitterSearch : A Comparison of Microblog Search and Web Search
Getting started on informaworld™ How do I register my institution with informaworld™? How is my institution’s online access activated? What do I do if.
Research paper: Web Mining Research: A survey SIGKDD Explorations, June Volume 2, Issue 1 Author: R. Kosala and H. Blockeel.
Page 1 These instructions will help guide you through the pages of the Self-Nomination Process web site. Please follow these steps to navigate through.
1 Context-Aware Search Personalization with Concept Preference CIKM’11 Advisor : Jia Ling, Koh Speaker : SHENG HONG, CHUNG.
MINING RELATED QUERIES FROM SEARCH ENGINE QUERY LOGS Xiaodong Shi and Christopher C. Yang Definitions: Query Record: A query record represents the submission.
Springerlink.com Introduction to SpringerLink springerlink.com.
Understanding and Predicting Graded Search Satisfaction Tang Yuk Yu 1.
1 Applying Collaborative Filtering Techniques to Movie Search for Better Ranking and Browsing Seung-Taek Park and David M. Pennock (ACM SIGKDD 2007)
Improving Web Search Ranking by Incorporating User Behavior Information Eugene Agichtein Eric Brill Susan Dumais Microsoft Research.
1 Can People Collaborate to Improve the relevance of Search Results? Florian Eiteljörge June 11, 2013Florian Eiteljörge.
Improving Web Spam Classification using Rank-time Features September 25, 2008 TaeSeob,Yun KAIST DATABASE & MULTIMEDIA LAB.
Web Searching Basics Dr. Dania Bilal IS 530 Fall 2009.
UOS 1 Ontology Based Personalized Search Zhang Tao The University of Seoul.
When Experts Agree: Using Non-Affiliated Experts To Rank Popular Topics Meital Aizen.
CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.
Web Mining. By Pawan Singh, Piyush Arora, Pooja Mansharamani, Pramod Singh, Praveen Kumar.
Interactive Science Publishing: A Joint OSA-NLM Project Michael J. Ackerman National Library of Medicine.
XP New Perspectives on The Internet, Sixth Edition— Comprehensive Tutorial 3 1 Searching the Web Using Search Engines and Directories Effectively Tutorial.
The Internet 8th Edition Tutorial 4 Searching the Web.
Mining Topic-Specific Concepts and Definitions on the Web Bing Liu, etc KDD03 CS591CXZ CS591CXZ Web mining: Lexical relationship mining.
Search Engine Architecture
Information Retrieval Effectiveness of Folksonomies on the World Wide Web P. Jason Morrison.
Query Suggestion Naama Kraus Slides are based on the papers: Baeza-Yates, Hurtado, Mendoza, Improving search engines by query clustering Boldi, Bonchi,
4 1 SEARCHING THE WEB Using Search Engines and Directories Effectively New Perspectives on THE INTERNET.
Understanding User Goals in Web Search University of Seoul Computer Science Database Lab. Min Mi-young.
A process of taking your best guesses. Companies have web sites where you can access your information.
CONTENTS  Definition And History  Basic services of INTERNET  The World Wide Web (W.W.W.)  WWW browsers  INTERNET search engines  Uses of INTERNET.
+ User-induced Links in Collaborative Tagging Systems Ching-man Au Yeung, Nicholas Gibbins, Nigel Shadbolt CIKM’09 Speaker: Nonhlanhla Shongwe 18 January.
A Classification-based Approach to Question Answering in Discussion Boards Liangjie Hong, Brian D. Davison Lehigh University (SIGIR ’ 09) Speaker: Cho,
Company LOGO In the Name of Allah,The Most Gracious, The Most Merciful King Khalid University College of Computer and Information System Websites Programming.
Post-Ranking query suggestion by diversifying search Chao Wang.
Bloom Cookies: Web Search Personalization without User Tracking Authors: Nitesh Mor, Oriana Riva, Suman Nath, and John Kubiatowicz Presented by Ben Summers.
“In the beginning -- before Google -- a darkness was upon the land.” Joel Achenbach Washington Post.
Divided Pretreatment to Targets and Intentions for Query Recommendation Reporter: Yangyang Kang /23.
Chapter. 3: Retrieval Evaluation 1/2/2016Dr. Almetwally Mostafa 1.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Adaptive Faceted Browsing in Job Offers Danielle H. Lee
Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.
How "Next Generation" Are We? A Snapshot of the Current State of OPACs in U.S. and Canadian Academic Libraries Melissa A. Hofmann and Sharon Yang, Moore.
ASSOCIATIVE BROWSING Evaluating 1 Jin Y. Kim / W. Bruce Croft / David Smith by Simulation.
Finding similar items by leveraging social tag clouds Speaker: Po-Hsien Shih Advisor: Jia-Ling Koh Source: SAC 2012’ Date: October 4, 2012.
Internet Search Techniques Finding What You’re Looking For.
Searching the Web for academic information Ruth Stubbings.
Opinion Spam and Analysis. Software Engineering Lab, Hyorin Choi.
Retroactive Answering of Search Queries Beverly Yang Glen Jeh Google, Inc. Presented.
Search Engine Architecture
Mining Query Subtopics from Search Log Data
Data Mining Chapter 6 Search Engines
Search Engine Architecture
Web Mining Research: A Survey
Presentation transcript:

Retroactive Answering of Search Queries Beverly Yang Glen Jeh

2 Introduction
–Major web search engines have recently begun offering search history services
–Query-specific web recommendations (QSRs), e.g. for the query “britney spears concert san francisco”
–Standing queries: Google's Web Alerts

3

4 Subproblems
–Automatically detecting when queries represent standing interests
–Detecting when new, interesting results have come up for these queries

5 System Architecture

6 System Description
–Read a user's actions from the history database
–Identify the top M queries that most likely represent standing interests
–Submit each of these M queries to the search engine
–Compare the first 10 current results with the previous results
–Identify any new results as potential recommendations
–Score each recommendation
–Display the top N recommendations according to this score
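The loop on this slide can be sketched as follows. All function and parameter names are illustrative stand-ins, not the paper's actual implementation.

```python
def recommend(sessions, search_top10, interest_score, quality_score, m=10, n=3):
    """One pass of the QSR pipeline sketched above.

    sessions      -- list of (query, seen_results) pairs from the history DB
    search_top10  -- maps a query string to its current top-10 result list
    interest_score, quality_score -- scoring functions (see later slides)
    """
    # keep the M queries most likely to represent standing interests
    top = sorted(sessions, key=interest_score, reverse=True)[:m]
    scored = []
    for query, seen in top:
        # any current top-10 result the user has not seen is a candidate
        for result in search_top10(query):
            if result not in seen:
                scored.append((quality_score(result), result))
    # surface the N best-scoring candidates
    scored.sort(reverse=True)
    return [result for _, result in scored[:n]]
```

The two scoring functions are deliberately passed in as parameters; the slides define them separately as iscore and qscore.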

7 Problem Definition
–Prior Fulfillment: has the user already found a satisfactory result (or set of results) for her query?
–Query Interest Level: what is the user's interest level in the query topic?
–Need/Interest Duration: how timely is the information need?

8 Sample Query Session
html encode java (8 s)
–RESULTCLICK (91.00 s)
–RESULTCLICK ( s)
–RESULTCLICK (12.00 s) html
–NEXTPAGE (5.00 s) -- start = 10
–RESULTCLICK ( s)
–REFINEMENT (21.00 s) -- html encode java utility
–RESULTCLICK (32.00 s)
–NEXTPAGE (8.00 s) -- start = 10
–NEXTPAGE (30.00 s) -- start = 20
(Total time: s)

9 Signals
–Number of terms
–Number of clicks and number of refinements
–History match
–Navigational
–Repeated non-navigational

10 Interest Score
iscore = a · log(# clicks + # refinements) + b · log(# repetitions) + c · (history match score)
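A direct transcription of the formula; the coefficients a, b, c default to 1.0 here as placeholders, since the slide does not give the tuned values.

```python
import math

def iscore(clicks, refinements, repetitions, history_match,
           a=1.0, b=1.0, c=1.0):
    """Interest score for a query session, guarding the log(0) cases."""
    score = c * history_match
    if clicks + refinements > 0:
        score += a * math.log(clicks + refinements)
    if repetitions > 0:
        score += b * math.log(repetitions)
    return score
```

The log terms damp the effect of very active sessions: the difference between 1 and 5 clicks matters far more than between 20 and 24.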

11 Web Alerts Example
On October 16, 2005, an alert for the query “beverly yang” (the name of one of the authors) returned a URL that had moved into the top 10 results for the query between October 15 and 16. For the query “network game adapter,” the result moved into the top 10 on October 12, 2005, dropped out, and moved back in just 12 days later.

12 Determining Interesting Results “rss reader”

13 Analysis Example
A good recommendation should be:
–New to the user
–A good page
–Recently “promoted”
Signals:
–History presence
–Rank
–Popularity and relevance (PR) score
–Above dropoff

14 Quality Score
qscore = a · (PR score) + b · (rank)
qscore* = a · (PR score) + b · (1 / rank)
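The starred form is the corrected one: rank 1 is the best position, so the rank term must be inverted for a good rank to raise the score. A transcription, with placeholder coefficients:

```python
def qscore(pr_score, rank, a=1.0, b=1.0):
    """Revised quality score: a high PR score and a high (small-numbered)
    rank both raise the score."""
    return a * pr_score + b * (1.0 / rank)
```

With the uncorrected formula, a result at rank 10 would outscore an otherwise identical result at rank 1.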

15 User Study Setup
–First-person study: users are asked to evaluate their interest level on a number of their own past queries
–Third-person study: evaluators reviewed anonymous query sessions and assessed the quality of recommendations made on these sessions

16 Questionnaire (1/3)
1) During this query session, did you find a satisfactory answer to your needs?
Yes: 52.4%  Somewhat: 21.5%  No: 14.9%  Can’t Remember: 11.2%
2) Assume that some time after this query session, our system discovered a new, high-quality result for the query/queries in the session. If we were to show you this result, how interested would you be in viewing it?
Very: 17.8%  Somewhat: 22.5%  Vaguely: 22.0%  Not: 37.7%

17 Questionnaire (2/3)
3) How long past the time of this session would you be interested in seeing the new result?
Ongoing: 43.9%  Month: 13.9%  Week: 30.8%  Minute/Now: 11.4%
4) Assume you were interested in seeing more results for the query above. How would you rate the quality of this result?
(First-person study) Excellent: 25.0%  Good: 18.8%  Fair/Poor: 56.3%
(Third-person study) Excellent: 18.9%  Good: 32.1%  Fair: 33.3%  Poor: 15.7%

18 Questionnaire (3/3)
5) How many queries do you currently have registered as web alerts? (not including any you’ve registered for Google work purposes)
0: 73.3%  1: 20.0%  2: 6.7%  >2: 0%
6) For the queries you marked as very or somewhat interesting, roughly how many have you registered for web alerts?
0: 100%  1 or more: 0%

19 Selecting Query Sessions
We eliminated all sessions for special-purpose queries, such as map queries, calculator queries, etc. We also eliminated any query session with:
–no events
–no clicks and only 1 or 2 refinements
–non-repeated navigational queries
These heuristics eliminated over 75% of the query sessions.
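These elimination rules can be sketched as a single predicate. The field names are assumptions, and the special-purpose check is simplified to a precomputed flag:

```python
def is_candidate_session(session):
    """True if a query session survives the elimination rules above."""
    if session["special_purpose"]:       # map, calculator, ... queries
        return False
    if session["num_events"] == 0:       # no activity at all
        return False
    if session["num_clicks"] == 0 and session["num_refinements"] in (1, 2):
        return False                     # no clicks, only 1-2 refinements
    if session["is_navigational"] and not session["is_repeated"]:
        return False                     # non-repeated navigational query
    return True
```

Since over 75% of sessions fail the predicate, it acts as a cheap pre-filter before the costlier iscore ranking.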

20 Selecting Recommendations
–We only attempt to generate recommendations for queries for which we have the history presence signal
–We only consider results in the current top 10 results for the query
–For any new result that the user has not yet seen, we compute boolean signals: whether the result appeared in the top 3, and whether its PR score was above a certain threshold
–Out of this pool we select the top recommendations according to qscore to be displayed to the user
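A sketch of this candidate filter. The PR threshold value, the field names, and the choice to accept a result when either boolean signal fires are all assumptions not fixed by the slide; the ordering inlines the qscore* formula with unit coefficients.

```python
def candidate_recommendations(top10, seen_urls, pr_threshold=0.5, n=3):
    """Pick up to n new results from the current top 10, keeping those
    that are in the top 3 or whose PR score clears the threshold, and
    ordering them by pr + 1/rank (the qscore* formula)."""
    picks = []
    for rank, result in enumerate(top10[:10], start=1):
        if result["url"] in seen_urls:
            continue                              # must be new to the user
        in_top3 = rank <= 3                       # boolean signal 1
        high_pr = result["pr"] >= pr_threshold    # boolean signal 2
        if in_top3 or high_pr:
            picks.append((result["pr"] + 1.0 / rank, result["url"]))
    picks.sort(reverse=True)
    return [url for _, url in picks[:n]]
```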

21 Results In our first-person study, 18 subjects evaluated 382 query sessions total. These subjects also evaluated a total of 16 recommended web pages. In our third-person study, 4 evaluators reviewed and scored a total of 159 recommended web pages over 159 anonymous query sessions (one recommendation per session).

22 Summary Standing interests are strongly indicated by a high number of clicks (e.g., > 8 clicks), a high number of refinements (e.g., > 3 refinements), and a high history match score. Recommendation quality is strongly indicated by a high PR score, and surprisingly, a low rank.

23 Number of Clicks

24 Number of Refinements

25 Query Score

26 Identifying Standing Interests
–Number of clicks and refinements
–History match: among queries with a history match, 39.1% were rated very interesting and just 4.3% not interesting
–Number of terms: users were very interested in 25% of the queries with >= 6 terms, but only 6.7% of one-term queries
–Repeated non-navigational: only 18 queries fall into this category

27 Precision and Recall
–Precision: the percentage of query sessions returned by this heuristic that were standing interests
–Recall: the percentage of all standing interests in the survey that the heuristic returned
–Different signal thresholds yield 90% precision at 11% recall, 69% precision at 28% recall, or 57% precision at 55% recall
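Under these definitions, both metrics are simple set ratios over session ids:

```python
def precision_recall(selected, actual):
    """selected -- session ids the heuristic flags as standing interests
    actual   -- session ids that really are standing interests"""
    selected, actual = set(selected), set(actual)
    hits = len(selected & actual)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall
```

Tightening the heuristic shrinks the selected set, which tends to raise precision and lower recall; the three operating points above trace that trade-off.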

28 Rank of Result

29 PR Score

30 QScore

31 Precision and Recall
–Precision: the percentage of desired web pages out of all pages selected by the heuristic
–Recall: the percentage of desired pages selected by the heuristic out of all desired pages considered in our survey dataset

32 For Quality Scores

33 Above Dropoff

34 Related Work
–Amazon recommends items for users to purchase based on their past purchases and the behavior of other users with similar histories
–Many similar techniques developed in data mining, such as association rules, clustering, co-citation analysis, etc., are also directly applicable to recommendations
–A number of papers have explored personalization of web search based on user history

35 Conclusion We present QSR, a new system that retroactively answers search queries representing standing interests. Results show that we can achieve high accuracy in automatically identifying queries that represent standing interests, as well as in identifying relevant recommendations for these interests.