Information Re-Retrieval: Repeat Queries in Yahoo's Logs. Jaime Teevan (MSR), Eytan Adar (UW), Rosie Jones and Michael A. S. Potts (Yahoo). SIGIR 2007. Presented by Hugo Zaragoza.

What’s the URL for this year’s SIGIR?

Doesn't really matter…
– Call for papers (Dec. '06)
– Submission instructions (Jan. '07)
– Response date? (Apr. '07)
– Formatting guidelines (May '07)
– Proceedings (Jun. '07)
– Travel plans/registration (Jul. '07)

Overview
Log analysis to:
– Quantify the amount of re-finding behavior
– Understand the types of re-finding
Findings:
– Re-finding is very common
– Stability of results affects re-finding
– It is possible to identify re-finding behavior

What Is Known About Re-Finding
– Re-finding has recently become a topic of interest
– Web re-visitation is common [Tauscher & Greenberg]
– People follow known paths when re-finding → search engines are likely to be used for re-finding
– Query log analyses of re-finding: query sessions [Jones & Fain], temporal aspects [Sanderson & Dumais]

Study Methodology
Looked for re-finding in Yahoo's query logs:
– 114 anonymous users, identified via cookie
– Tracked for a year (average activity: 97 days)
– 13,060 queries and their clicks
Log studies are rich but lack intention:
– Infer intention from behavior
– Supplement with a large user study (119 users)

Inferring Re-Finding Intent
A really hard problem: there is no one to ask, "What were you doing?" But we can make some inferences from behavior:
– Was the same query issued before, or is it a new query?
– Were previously clicked results clicked again, or were different results clicked?
From the answers, hypothesize re-finding intent.

[Decision-tree figure over the 13,060 logged queries, branching on: same query issued before vs. new query; click on previously clicked results vs. different results; clicks on both same and different results; one click vs. more than one click. Leaf counts: 3100 (24%), 36 (<1%), 635 (5%), 485 (4%), 637 (5%), 4 (<1%), 660 (5%), 7503 (57%). Overall, 39% of queries involve some form of re-finding, including navigational queries and re-finding with a different query.]
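The branching questions in this tree can be sketched as a simple rule-based classifier over a user's earlier log events. A minimal illustration with invented data structures and labels, not the paper's actual (finer-grained) taxonomy:

```python
def classify_event(history, query, clicks):
    """Classify one search event against a user's earlier events.

    history: list of (query, clicked_urls) tuples from earlier searches.
    clicks:  set of URLs clicked for the current query.
    Labels are illustrative only.
    """
    seen_queries = {q for q, _ in history}
    seen_clicks = {u for _, urls in history for u in urls}
    same_query = query in seen_queries          # "Same query issued before?"
    repeat_click = bool(clicks & seen_clicks)   # "Click on previously clicked results?"
    if same_query and repeat_click:
        return "repeat query, repeat click"
    if same_query:
        return "repeat query, new click"
    if repeat_click:
        return "new query, repeat click"
    return "new query, new click"
```

In the real study each leaf of the tree would then be interpreted (e.g. a repeated query with the same single click suggests navigational re-finding).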

How Queries Change
There are many ways queries can change:
– Capitalization ("new york" and "New York")
– Word swap ("britney spears" and "spears britney")
– Word merge ("walmart" and "wal mart")
– Word removal ("orange county venues" and "orange county music venues")
17 types of change were identified and 2049 combinations explored, using the log data and the supplemental study. Most normalizations require only one type of change.
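Several of these change types can be detected with simple string rules. A toy sketch covering only a handful of the 17 types (the function name and return labels are invented; a real normalizer would also handle combinations of changes):

```python
def change_type(q1, q2):
    """Guess which single transformation relates two query strings."""
    if q1 == q2:
        return "identical"
    if q1.lower() == q2.lower():
        return "capitalization"
    w1, w2 = q1.lower().split(), q2.lower().split()
    if sorted(w1) == sorted(w2):
        return "word swap"
    if "".join(w1) == "".join(w2):
        return "word merge/split"
    if set(w1) < set(w2) or set(w2) < set(w1):
        return "word removal/addition"
    return "other"
```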

Rank Change Reduces Re-Finding
Results change rank over time, and a rank change reduces the probability of a repeat click:
– No rank change: 88% chance of a repeat click
– Rank change: 53% chance of a repeat click
Why? Is the result gone? Not seen? Or are the new results better?
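Rates like 88% vs. 53% come from conditioning the repeat-click rate on whether the result moved. A sketch of that aggregation, assuming the log has already been reduced to (rank_changed, reclicked) pairs (an assumed shape, not the raw log format):

```python
from collections import Counter

def repeat_click_rate(events):
    """Fraction of repeat-query events whose earlier result is clicked
    again, split by whether that result's rank changed.
    events: iterable of (rank_changed: bool, reclicked: bool) pairs."""
    totals, reclicks = Counter(), Counter()
    for changed, reclicked in events:
        totals[changed] += 1
        reclicks[changed] += reclicked  # bool counts as 0/1
    return {changed: reclicks[changed] / totals[changed] for changed in totals}
```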

Gone? Not Seen? Better?

Change Slows Re-Finding
Using time to click as a proxy for ease, a rank change leads to a slower repeat click (compared with the time from the initial search to the click):
– No rank change: the re-click is faster
– Rank change: the re-click is slower
Change interferes with re-finding, and stability helps.
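The time-to-click comparison can be sketched the same way, again assuming pre-extracted (rank_changed, seconds_to_click) pairs rather than the study's actual log format:

```python
from statistics import median

def median_time_to_click(events):
    """Median seconds from query issue to click, split by whether the
    clicked result's rank changed since the previous search."""
    by_condition = {}
    for changed, seconds in events:
        by_condition.setdefault(changed, []).append(seconds)
    return {changed: median(times) for changed, times in by_condition.items()}
```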

Helping People Re-Find
A potential way to take advantage of stability:
– Automatically determine if the task is re-finding
– Keep results consistent with expectation
– A simple form of personalization
Can we automatically predict whether a query is intended for re-finding?

Predicting the Query Target For simple navigational queries, predict what URL will be clicked For complex repeat queries, two binary classification tasks: – Will a new (never visited) result be clicked? – Will an old (previously visited) result be clicked?

Predicting Navigational Queries
Predict navigational query clicks for queries that were issued at least twice before with the same single result clicked each time. Prediction is very effective:
– 96% accuracy: the predicted URL is one of the results clicked
– 95% accuracy: it is the first result clicked
– 94% accuracy: it is the only result clicked
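The heuristic on this slide — predict the URL when the same query has been issued at least twice before with a single consistent click — can be sketched as follows (data shapes and function name are assumptions for illustration):

```python
def predict_navigational(history, query):
    """If `query` was issued at least twice before and the same single
    URL was clicked every time, predict that URL; otherwise None.
    history: list of (query, clicked_urls) tuples."""
    click_sets = [clicks for q, clicks in history if q == query]
    if len(click_sets) < 2:
        return None
    urls = set().union(*click_sets)
    if len(urls) == 1 and all(len(c) == 1 for c in click_sets):
        return next(iter(urls))
    return None
```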

Predicting More Complex Queries
Trained an SVM to identify:
– Whether a new result will be clicked
– Whether an old result will be clicked
Effective features:
– Number of previous searches for the same thing
– Whether any of the results were clicked more than once
– Number of clicks each time the query was issued
Accuracy is around 80% for both prediction tasks.
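The listed features can be extracted per (user, query) pair roughly as follows — a sketch of the feature extraction only, with assumed data shapes; the slide's SVM would consume vectorized versions of these values:

```python
from collections import Counter

def refinding_features(history, query):
    """Features mirroring the slide: number of prior issues of the same
    query, whether any result was clicked more than once, and the
    number of clicks each time the query was issued.
    history: list of (query, clicked_urls) tuples."""
    prior = [clicks for q, clicks in history if q == query]
    click_counts = Counter(u for clicks in prior for u in clicks)
    return {
        "n_prior_issues": len(prior),
        "any_url_clicked_gt1": any(n > 1 for n in click_counts.values()),
        "clicks_per_issue": [len(c) for c in prior],
    }
```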

Future Work
– Experiment with different history mechanisms: given knowledge about re-finding intent, how do we best modify result pages? How do we integrate new, better results?
– Contextual re-finding: re-finding varies by user and by time of day

Summary
Log analysis, supplemented by a user study, shows that re-finding is very common:
– Navigational queries are particularly common
– Categorized potential re-finding behavior
– Explored the ways query strings are modified
The stability of result rank impacts re-finding tasks. A first step toward a solution: automatically classifying repeat queries to identify re-finding.

Thank you! Questions? Jaime Teevan (MSR), Eytan Adar (UW), Rosie Jones and Mike Potts (Yahoo) Presented by Hugo Zaragoza