Temporal Query Log Profiling to Improve Web Search Ranking
Alexander Kotov (UIUC), Pranam Kolari, Yi Chang (Yahoo!), Lei Duan (Microsoft)


Motivation
Improvements in ranking can be achieved in two ways:
– Better features/methods for promoting high-quality result pages
– Methods for filtering/demotion of adversarial and abusive content
Main idea: temporal information can be leveraged to characterize the quality of content.

Learning-to-Rank
A well-known application of regression modeling
Learns useful features and their interactions for ranking documents in response to a user query
Features: document-specific, query-specific, or document-query specific

Web Spam Detection
Ranking of search results is often artificially manipulated to promote certain types of content (web spam)
Anti-spam measures are highly reactive and ad hoc
No previous work has explored the fundamental properties of spam hosts and queries

Main idea
(diagram: search logs are sliced into time periods P1 … Pn; per-period measures of query and host profiles are aggregated into temporal features)

Main idea
Temporal changes are quantified along two orthogonal dimensions: hosts and queries
Host churn: a measure of inorganic host behavior in search results
Query volatility: a measure of the likelihood of a query being compromised by spammers

Host churn
Goal: quantify the temporal behavior of hosts in search results for different queries
Profile includes 4 attributes: query coverage, number of impressions, click-through rate, and average position in search results
Idea: spamming and low-quality hosts exhibit inorganic changes in their appearance in the search results of different queries
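The churn metric itself appears only as a figure in the slides, so its exact form is not recoverable. A minimal sketch of one plausible churn measure over the query-coverage attribute, assuming churn is the Jaccard distance between a host's query sets in consecutive time periods:

```python
def query_coverage_churn(queries_t1, queries_t2):
    """Churn of a host's query coverage between two snapshots, taken here
    as the Jaccard distance of the query sets. This form is an assumption;
    the paper's actual churn formula is shown only as a slide figure."""
    a, b = set(queries_t1), set(queries_t2)
    if not a and not b:
        return 0.0  # host absent in both periods: no churn to measure
    return 1.0 - len(a & b) / len(a | b)

# A stable host keeps appearing for the same queries; a spam host churns.
stable = query_coverage_churn({"q1", "q2", "q3"}, {"q1", "q2", "q3"})  # 0.0
spammy = query_coverage_churn({"q1", "q2"}, {"q7", "q8"})              # 1.0
```

The same idea extends to the other profile attributes (impressions, click-through rate, average position) by comparing their per-period values instead of sets.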

Host churn
(figure: definition of the churn metric)

Host churn
(figure: temporal appearance patterns of a normal host vs. a spam host)

Query volatility
Goal: identify queries with temporally changing behavior
Profile: number of impressions, sets of results, and click-throughs for a query at different time points
Idea: spammed or potentially spammable queries exhibit highly inconsistent behavior over time

Query volatility
Query results volatility: spam-prone queries are likely to produce semantically incoherent results over time
Query impressions volatility: buzzy queries are less likely to be spam-prone
Query clicks volatility: click-through densities at different search result positions are more consistent for less spam-prone queries
Query sessions volatility: for spam-prone queries, users are less likely to be satisfied with search results and click on them

Query results volatility
(figure: result-set stability over time for a non-spam query vs. a spam query)

Query results volatility
(figure: definition of the volatility metric)
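Since the volatility metric survives only as a figure, here is a minimal sketch of one plausible form, assuming volatility is one minus the average Jaccard overlap between a query's result sets in consecutive time periods:

```python
def result_set_volatility(snapshots):
    """Volatility of a query's top-k result sets over time: one minus the
    mean Jaccard overlap of consecutive snapshots. The exact metric in the
    paper is shown only as a slide figure, so this form is an assumption."""
    if len(snapshots) < 2:
        return 0.0  # need at least two snapshots to observe change
    sims = []
    for prev, cur in zip(snapshots, snapshots[1:]):
        a, b = set(prev), set(cur)
        sims.append(len(a & b) / len(a | b) if a | b else 1.0)
    return 1.0 - sum(sims) / len(sims)

# A stable query scores 0; a query whose results fully change scores 1.
stable = result_set_volatility([["u1", "u2"], ["u1", "u2"], ["u1", "u2"]])
churny = result_set_volatility([["u1"], ["u2"], ["u3"]])
```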

Query impressions volatility
Buzzy queries are less likely to be spam-prone, since buzz is non-trivial to predict
Given the time series of query counts, the "buzziness" of a query is estimated with Kurtosis and Pearson coefficients
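A sketch of the kurtosis part, computed by hand on a daily impression-count series (excess kurtosis, so a flat series scores near zero and a single burst scores higher; the slides do not specify which kurtosis variant is used):

```python
def excess_kurtosis(series):
    """Excess kurtosis of a query's impression-count time series.
    A single burst ("buzz") produces a peaked, heavy-tailed distribution
    of counts, which scores higher than a steadily oscillating series."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((x - mean) ** 2 for x in series) / n  # 2nd central moment
    m4 = sum((x - mean) ** 4 for x in series) / n  # 4th central moment
    if m2 == 0:
        return 0.0  # constant series: no tail shape to measure
    return m4 / m2 ** 2 - 3.0

bursty = excess_kurtosis([0, 0, 0, 0, 10])   # one buzz day
steady = excess_kurtosis([1, 2, 1, 2, 1, 2])  # routine oscillation
```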

Query clicks volatility
Less spam-prone, navigational queries have a consistently higher density of clicks on the first few search results
Click discrepancies are captured through the mean, standard deviation, and Pearson correlation coefficient for clicks and skips at each position
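The Pearson piece can be sketched as follows, comparing per-position click counts from two time periods; a consistent (less spam-prone) query keeps roughly the same click profile, giving a correlation near 1 (the aggregation details are an assumption):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series,
    e.g. per-position click counts for the same query in two periods."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

# Clicks at positions 1..3 on two days: a stable query correlates highly.
consistent = pearson([100, 50, 25], [98, 52, 24])  # close to 1
inverted   = pearson([1, 2, 3], [3, 2, 1])         # -1: profile flipped
```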

Query sessions volatility
– Fraction of sessions with one click on organic search results (over all sessions for the query)
– Fraction of sessions with no clicks on organic or sponsored search results
– Fraction of sessions with no click on any of the presented organic results
– Fraction of sessions with user clicks on a query reformulation
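The four fractions above can be sketched directly; the session field names here are illustrative, not taken from the paper:

```python
def session_volatility_features(sessions):
    """The four session-level fractions from the slide, computed over all
    sessions for a query. Each session is a dict with illustrative fields:
    organic_clicks, sponsored_clicks, clicked_reformulation."""
    n = len(sessions)
    one_organic   = sum(s["organic_clicks"] == 1 for s in sessions) / n
    no_any_click  = sum(s["organic_clicks"] == 0 and s["sponsored_clicks"] == 0
                        for s in sessions) / n
    no_organic    = sum(s["organic_clicks"] == 0 for s in sessions) / n
    reformulation = sum(s["clicked_reformulation"] for s in sessions) / n
    return one_organic, no_any_click, no_organic, reformulation

sessions = [
    {"organic_clicks": 1, "sponsored_clicks": 0, "clicked_reformulation": False},
    {"organic_clicks": 0, "sponsored_clicks": 0, "clicked_reformulation": True},
    {"organic_clicks": 0, "sponsored_clicks": 2, "clicked_reformulation": False},
    {"organic_clicks": 3, "sponsored_clicks": 0, "clicked_reformulation": False},
]
feats = session_volatility_features(sessions)
```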

Spam-prone query classification
Spam-prone queries (284 queries) – filtered from historical Query Triage spam complaints
Non-spam-prone queries (276 queries)
Gradient Boosted Decision Tree model
10-fold cross-validation
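The classifier itself is a Gradient Boosted Decision Tree, which would come from a library; the 10-fold protocol on the 284 + 276 labeled queries can be sketched in plain Python (round-robin fold assignment is an assumption about the split):

```python
def k_fold_splits(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold
    cross-validation, assigning samples to folds round-robin."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]  # fold i gets every k-th sample
    for i, test_fold in enumerate(folds):
        train = [j for f_idx, fold in enumerate(folds)
                 if f_idx != i for j in fold]
        yield train, test_fold

# 284 spam-prone + 276 non-spam-prone labeled queries = 560 samples.
splits = list(k_fold_splits(560, k=10))
```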

Results
SPAMMEAN (baseline): mean host-spam score for a query, developed over the years
VARIABILITY: features derived from temporal profiles, language-independent
The combined model is the most effective; VARIABILITY by itself is very effective

Results
Position, click, and result-set volatility are the key features
SPAMMEAN continues to be ranked as the top feature in the combined model

Results
The distributions of query spamicity scores for queries containing spam and non-spam terms are clearly different
Key terms in queries on both sides of the spamicity score range indicate the accuracy of the classifier
(figure: example "adult" queries vs. "general" queries)

Ranking
MLR ranking baseline (MLR 14)
– 1.8M query-url pairs used for training
– Test on a held-out data set (7000 samples)
– Query spamicity score is added to all production features
Evaluation using the Discounted Cumulative Gain (DCG) metric
Spam query classification as a new feature
– Covered queries are 50% of all queries
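For reference, DCG rewards placing highly relevant documents early in the ranking. A minimal sketch using the common exponential-gain variant (the slides do not specify which DCG variant the evaluation used):

```python
import math

def dcg(relevances, k=None):
    """Discounted Cumulative Gain over a ranked list of relevance grades.
    Uses the common (2^rel - 1) gain with a log2 position discount; the
    exact variant used in the paper's evaluation is not specified."""
    rels = relevances[:k] if k is not None else relevances
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(rels))

# Swapping a relevant result into a lower position lowers DCG.
good_order = dcg([3, 1])  # highly relevant document ranked first
bad_order  = dcg([1, 3])  # highly relevant document ranked second
```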

Results
The coverage of the spamicity score is 50%, hence the overall improvement across all queries is not statistically significant
Queries covered by the spamicity score show significant improvement
The spamicity score feature ranks among the top 30 ranking features

Conclusions
Proposed a simple and effective method to characterize the temporal behavior of queries and hosts
Features based on temporal profiles outperform state-of-the-art baselines in two different tasks
Many verticals behave similarly to spam, e.g. trending queries

Future work
More in-depth analysis of temporally correlated verticals: a separate ranking function
Qualitative analysis of spam-prone queries along semantic dimensions
Shorter time intervals for aggregation