Query Logs – Used everywhere and for everything
Sai Vallurupalli

What are query logs useful for?
- In Social Sciences, Medical & Health, Advertising & Marketing, Law Enforcement, etc.
- Understanding Search Behavior: Trends and Hot Trends
  - average length of search terms
  - most frequently searched terms
  - percentage of repeat queries
  - query term frequency distributions
  - number of users using the advanced features
  - number of queries a user entered before being satisfied with the results or giving up
  - average number of result pages and links examined
- Understanding and Categorizing Queries & Users
  - Informational, Navigational, Transactional, Connectivity
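A few of the search-behavior statistics listed above can be sketched in a few lines. This is only an illustration: the `log` data, the `(user_id, query)` layout, and the `query_stats` helper are assumptions made for the example, and a "repeat query" is taken here to mean the same user issuing the same query string again.

```python
from collections import Counter

# Hypothetical log: (user_id, query) pairs; real logs carry more fields.
log = [
    ("u1", "cheap flights"),
    ("u1", "cheap flights"),
    ("u2", "python tutorial"),
    ("u2", "weather"),
    ("u3", "cheap flights to paris"),
]

def query_stats(entries):
    """Return average terms per query, top terms, and repeat-query share."""
    term_counts = Counter()
    seen = set()      # (user, query) pairs already observed
    repeats = 0
    total_terms = 0
    for user, query in entries:
        terms = query.lower().split()
        total_terms += len(terms)
        term_counts.update(terms)
        if (user, query) in seen:
            repeats += 1
        seen.add((user, query))
    n = len(entries)
    return {
        "avg_terms": total_terms / n,
        "top_terms": term_counts.most_common(3),
        "repeat_pct": 100.0 * repeats / n,
    }

stats = query_stats(log)
print(stats)
```

The same loop extends naturally to the other listed measures, such as result pages examined, once those fields are present in the log.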

What are query logs useful for (contd.)
- For improving applications that produce these logs
- Improving Document Scoring
  - Scoring is based on usage statistics, i.e., number of users, type of users, nature of the visit, etc.
  - Scoring is based on usage patterns. The score increases if:
    - more users select the document
    - more time is spent on the document, or the rate of time spent increases
    - the number of search terms that lead to the document increases, or the rate of that increase grows
    - the document moves up in rank position, or the rate of position movement increases
- Improving Performance through Query Caching
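As a rough illustration of the two ideas on this slide, the sketch below boosts a base relevance score with usage signals and caches results for repeated queries. None of it comes from the presentation or the cited patents: the `usage_score` formula, its weights, and the `QueryCache` class are all assumptions chosen to keep the example small.

```python
import math
from collections import OrderedDict

def usage_score(base_score, distinct_users, total_dwell_seconds,
                user_weight=1.0, dwell_weight=0.5):
    """Boost a base score using usage signals (formula and weights are made up)."""
    return (base_score
            + user_weight * math.log1p(distinct_users)
            + dwell_weight * math.log1p(total_dwell_seconds))

class QueryCache:
    """Least-recently-used cache mapping a query string to its result list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, query):
        if query not in self._items:
            return None
        self._items.move_to_end(query)        # mark as most recently used
        return self._items[query]

    def put(self, query, results):
        self._items[query] = results
        self._items.move_to_end(query)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used

cache = QueryCache(capacity=2)
cache.put("weather", ["w1", "w2"])
cache.put("news", ["n1"])
cache.get("weather")          # touch "weather" so "news" becomes the oldest
cache.put("maps", ["m1"])     # evicts "news"
```

The logarithms are one way to make each additional user or second of dwell time count for less than the previous one; a real system would tune or learn such a function from data.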

What is logged?
- User identifier or session identifier
- IP address identifying the device
- Query terms
- Query timestamp, additional timestamps for clicked results
- List of URLs or results, ranks, whether they were clicked on
  - click-through data
  - relevance feedback links
  - page dwell time
  - search exit
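One way to picture the fields listed above is as a single record per query. The sketch below is only a possible layout: the class names `QueryLogEntry` and `ResultShown`, the field names, and the sample values are all hypothetical, and actual search engines use their own schemas.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class ResultShown:
    """One result on the results page, plus any click data recorded for it."""
    url: str
    rank: int
    clicked: bool = False
    click_time: Optional[datetime] = None
    dwell_seconds: Optional[float] = None

@dataclass
class QueryLogEntry:
    """One logged query with the results that were returned for it."""
    session_id: str
    ip_address: str
    query: str
    timestamp: datetime
    results: List[ResultShown] = field(default_factory=list)

    def clicked_urls(self):
        """URLs the user clicked, in rank order as stored."""
        return [r.url for r in self.results if r.clicked]

# Hypothetical sample entry (documentation-range IP, example domains).
entry = QueryLogEntry(
    session_id="s-001",
    ip_address="192.0.2.10",
    query="query log analysis",
    timestamp=datetime(2024, 1, 1, 9, 0, 0),
    results=[
        ResultShown("https://example.org/a", rank=1),
        ResultShown("https://example.org/b", rank=2, clicked=True,
                    click_time=datetime(2024, 1, 1, 9, 0, 12),
                    dwell_seconds=45.0),
    ],
)
```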

Improving Information Retrieval
- Determining Query Intent
  - Select a set of adjacent queries for a single need by a single user
  - Understand user query modification & reformulation
  - Determine equivalent descriptions for an information need
  - Identify and account for misspelled terms
- Query Recommendation
  - Query expansion
  - Relevance feedback
- Query Suggestion
- Query Caching
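Selecting the adjacent queries that belong to one information need is often approximated by splitting a user's queries wherever a long pause occurs. The sketch below shows that heuristic only; the 30-minute gap, the `split_sessions` name, and the sample queries are assumptions, and a time gap alone cannot tell whether two nearby queries really share a need.

```python
from datetime import datetime, timedelta

def split_sessions(entries, max_gap=timedelta(minutes=30)):
    """Group one user's (timestamp, query) pairs into sessions by time gap."""
    sessions = []
    for ts, query in sorted(entries):
        # Continue the current session if the pause since its last query is short.
        if sessions and ts - sessions[-1][-1][0] <= max_gap:
            sessions[-1].append((ts, query))
        else:
            sessions.append([(ts, query)])
    return sessions

start = datetime(2024, 1, 1, 9, 0)
one_user = [
    (start, "jaguar"),
    (start + timedelta(minutes=5), "jaguar car price"),
    (start + timedelta(hours=2), "pizza near me"),
]
sessions = split_sessions(one_user)
```

Within a session, consecutive queries such as "jaguar" followed by "jaguar car price" are the raw material for studying reformulation and for proposing query suggestions.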

Privacy Concerns & Current Research
- Query logs contain sensitive information
- Can be mined for user information
- Anonymizing
- Privacy/utility tradeoff
- Can be used to determine user intent
- Topical obfuscation with dummy query injection
- By not logging unique queries
- Substituting the user query with a group of queries that produce the same results
- Length of query log retention
  - do not keep logs for more than a certain period
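Two of the simpler measures above, replacing user identifiers and not keeping queries that only one user issued, can be sketched as follows. The function names, the salt, and the `min_users` threshold are assumptions for illustration. Hashing identifiers is pseudonymization rather than true anonymization, since the query text itself can still identify a person.

```python
import hashlib
from collections import defaultdict

def pseudonymize(user_id, salt):
    """Replace a user id with a salted SHA-256 digest (truncated for brevity)."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()[:12]

def drop_rare_queries(entries, min_users=2):
    """Keep only queries issued by at least `min_users` distinct users."""
    users_per_query = defaultdict(set)
    for user, query in entries:
        users_per_query[query].add(user)
    return [(user, query) for user, query in entries
            if len(users_per_query[query]) >= min_users]

# Hypothetical log of (user_id, query) pairs.
raw_log = [
    ("u1", "weather"),
    ("u2", "weather"),
    ("u1", "john smith 12 elm st"),   # unique, potentially identifying
]
released = [(pseudonymize(user, salt="demo-salt"), query)
            for user, query in drop_rare_queries(raw_log)]
```

Raising `min_users` removes more identifying queries but also discards more of the rare queries that make a log useful, which is the privacy/utility tradeoff named on the slide.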

Future Work
- Studying search patterns/behaviors in mobile environments
- question answering, longer queries
- data-driven search: chained queries and intent revision
- more privacy protection techniques