Measuring the Value of Search Trails in Web Logs. Presentation by Maksym Taran & Scott Breyfogle. Research by Ryen White & Jeff Huang.

What's all this?
When we search, we usually get back individual documents
But people often click onward from those results to other pages… search trails!
The paper looked at whether those trails are actually useful
The authors gathered log data and analyzed several 'usefulness' metrics
Ostensibly, this is to make search results pages better

Search trails
Here's one: (the original slide showed an example trail at this point)
Components: origin, destination, sub-trail
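As a rough illustration, here is a minimal sketch of how a trail and its components might be represented. The field names, and the reading of "sub-trail" as everything after the origin, are assumptions for illustration, not the paper's definitions:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Trail:
    query: str
    pages: List[str]  # URLs in the order the user visited them

    @property
    def origin(self) -> str:
        """First page clicked from the results page."""
        return self.pages[0]

    @property
    def destination(self) -> str:
        """Last page visited before the trail ends."""
        return self.pages[-1]

    @property
    def sub_trail(self) -> List[str]:
        """Pages after the origin (one plausible reading of 'sub-trail')."""
        return self.pages[1:]
```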

The data
Data from 3 months of user logs: 2 months for user history, 1 month for analysis
Pruned unusually heavy and light users
Only included queries with human relevance ratings
Took at most 10 trails per person (a sketch of this filtering follows below)
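A minimal sketch of that filtering, assuming trails are grouped by user and `Trail` objects as above; the light/heavy-user cutoffs are illustrative guesses, since the slides don't give the actual thresholds:

```python
import random

def prune_logs(trails_by_user, rated_queries,
               min_trails=5, max_trails=500, sample_size=10):
    """Keep moderately active users, rated queries, and <= 10 trails each."""
    kept = {}
    for user, trails in trails_by_user.items():
        # Prune unusually light or heavy users (bounds are made up here).
        if not (min_trails <= len(trails) <= max_trails):
            continue
        # Keep only trails whose query has a human relevance rating.
        rated = [t for t in trails if t.query in rated_queries]
        if rated:
            kept[user] = random.sample(rated, min(sample_size, len(rated)))
    return kept
```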

The experiment
Computed the metrics for the remaining data
Compared metrics:
– between trail components
– between popular and unpopular queries
– across levels of user query history

Metrics: Relevance
Human ratings on a six-point scale
Ratings from different judges are averaged
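As a trivial sketch, the relevance score for one page/query pair is then just the mean of its judgments:

```python
def relevance(ratings):
    """Mean of the six-point human judgments (e.g., 0-5) for one page/query pair."""
    return sum(ratings) / len(ratings)
```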

Metrics: Coverage
Background info: DMOZ / Open Directory Project
Query interest model:
– consists of the DMOZ labels for the query's results
– weighted so that label frequency counts
Coverage of a site w.r.t. a query:
– the weighted fraction of labels relevant to the query that are also relevant to the site (a sketch follows below)
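A minimal sketch of both definitions, assuming each result page comes with a set of DMOZ labels (the data representation here is an assumption):

```python
from collections import Counter

def query_interest_model(result_label_sets):
    """Weighted label model for a query: the DMOZ labels of its results,
    weighted by how often each label occurs across them."""
    counts = Counter(label for labels in result_label_sets for label in labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def coverage(site_labels, interest_model):
    """Weighted fraction of the query's labels that the site also carries."""
    return sum(weight for label, weight in interest_model.items()
               if label in site_labels)
```

So a site whose labels account for, say, 80% of the query model's weight gets a coverage of 0.8.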

Metrics: Diversity
Diversity of a site w.r.t. a query:
– unweighted coverage
– the unweighted fraction of labels relevant to the query that are also relevant to the site
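Under the same assumed representation, diversity simply drops the weights:

```python
def diversity(site_labels, interest_model):
    """Unweighted coverage: fraction of the query's distinct labels
    that the site also carries."""
    query_labels = set(interest_model)
    return len(query_labels & set(site_labels)) / len(query_labels)
```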

Metrics: Novelty
Novelty of a site w.r.t. a query and a user:
– diversity, but taking the user's history into account
– the unweighted fraction of relevant labels that the user hasn't already seen for this query
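One plausible reading of that definition, with the labels the user has already seen for this query passed in from their history (the slide doesn't say whether seen labels are also excluded from the denominator, so this is an assumption):

```python
def novelty(site_labels, interest_model, seen_labels):
    """Fraction of the query's labels that the site carries and the
    user has not already seen for this query (assumed reading)."""
    query_labels = set(interest_model)
    new_hits = (query_labels & set(site_labels)) - set(seen_labels)
    return len(new_hits) / len(query_labels)
```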

Metrics: Utility
Utility is hard to measure well; an easy proxy is time spent on a site
Utility of a site w.r.t. a query:
– did the user spend 30 s or more on the site?
– a cited source said this threshold was reasonable…
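The dwell-time proxy is then a one-liner; the 30-second cutoff is the one quoted on the slide:

```python
DWELL_THRESHOLD_SECONDS = 30  # threshold quoted on the slide, from prior work

def utility(dwell_seconds):
    """Binary utility proxy: did the user stay at least 30 seconds?"""
    return dwell_seconds >= DWELL_THRESHOLD_SECONDS
```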

Results!

Results! Query popularity
Queries were split into 3 popularity tiers (a tier-splitting sketch follows below)
All metrics increased with popularity
This agrees with previous findings that search engines perform better for more popular queries
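The slides don't say how the tiers were cut, so here is a sketch that assumes equal-sized terciles by query frequency:

```python
def popularity_tiers(query_counts):
    """Split queries into three equal-sized popularity tiers by frequency
    (terciles are an assumption; the slides don't give the cut points)."""
    ranked = sorted(query_counts, key=query_counts.get)
    n = len(ranked)
    return {
        "low": ranked[: n // 3],
        "medium": ranked[n // 3 : 2 * n // 3],
        "high": ranked[2 * n // 3 :],
    }
```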

Results! Query history
Some users had histories of re-running queries
Queries were grouped by 'historical frequency':
– none
– some
– lots
Relevance and utility rise with more history
Coverage, diversity, and novelty fall

Criticism
Most results are trivial:
– more topics are covered when more than one page is visited
– the first and last pages are more relevant, since the user knows what they are looking for
The definition of a session is shaky:
– many search trails may effectively end after the first few clicks
– users recycle tabs, which blurs trail boundaries

Criticism & Further work
Questionable applicability to results pages:
– it's difficult to show trails on results pages
– do users actually value the extra coverage/diversity/novelty in the trail?
Further work needed:
– figure out whether it would be good to show people trails
– find more appropriate, user-derived metrics