Ariel Fuxman, Panayiotis Tsaparas, Kannan Achan, Rakesh Agrawal (2008) - Akanksha Saxena

Motivation
- The main source of income for search engines is web search advertising, which places relevant advertisements together with the search engine results.
- Given a specific keyword, advertisers bid for it, and the winner of the auction has her ads displayed as sponsored links next to the search results.

Keyword Generation Problem
- The problem of identifying an appropriate set of keywords for a specific advertiser.
- Also known as keyword research.
- Examples: Google's AdWords Keyword Tool, Overture/Yahoo! Keyword Selector Tool, and Microsoft adCenter Labs' Keyword Group Detection.

Solution – Query-Click Logs
- Query-click logs record the queries that users pose to the search engine and the documents that they click in the results.
- Clicks define a strong association between queries and URLs.
- This association is used to find the queries that are related to the interests of the advertisers.

Example
- Suppose the owner of the online store shoes.com launches an ad campaign.
- Most of the queries that lead to shoes.com come from users interested in buying shoes.
- The query-click log maps queries to the clicked documents (URLs).

Problem Definition
Input:
- A search engine click log L consisting of triples <q, u, f_qu>, where q ∈ Q is a query (Q is the set of all queries), u ∈ U is the URL of a document (U is the set of all URLs), and f_qu is the number of times that users issued query q and clicked on URL u.
- The click log L is viewed as a weighted bipartite graph G, known as the click graph, where Q and U are the two partitions; for every record in the log there is an edge (q,u) ∈ E with weight f_qu.
- A set of concepts C = {c_1, ..., c_k}. The concepts represent abstract themes that the advertiser is interested in, which can be general (e.g. shoes) or specific (e.g. running shoes).
- A seed set S ⊆ U×C of URLs in the click log that are manually assigned to concepts in C. S consists of pairs <u, c>, where u ∈ U is a URL and c ∈ C is the concept label assigned to u.

Problem Definition cont..
Output:
- Given the click graph G, the concepts C and the seed set S, the goal of the keyword generation problem is to populate the concepts in C with queries from Q.
- These queries are then used as keyword suggestions for advertisers that are interested in the specific concept.
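
The inputs described above can be represented quite directly in code. The following is a minimal sketch, assuming the click log is available as an iterable of (query, URL, click count) triples; the function name `build_click_graph`, the toy log entries and the `.example` URL are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def build_click_graph(click_log):
    """Aggregate (query, url, clicks) triples into a weighted bipartite graph.

    Returns two adjacency maps: query -> {url: f_qu} and url -> {query: f_qu}.
    """
    q_to_u = defaultdict(dict)
    u_to_q = defaultdict(dict)
    for query, url, clicks in click_log:
        # Sum the counts if the same (query, url) pair appears more than once.
        q_to_u[query][url] = q_to_u[query].get(url, 0) + clicks
        u_to_q[url][query] = u_to_q[url].get(query, 0) + clicks
    return dict(q_to_u), dict(u_to_q)


# Hypothetical toy click log L.
click_log = [
    ("running shoes", "shoes.com", 10),
    ("cheap sneakers", "shoes.com", 5),
    ("running shoes", "runnersworld.example", 3),
]

# Seed set S: URLs manually assigned to concepts.
seeds = {"shoes.com": "shoes"}

q_to_u, u_to_q = build_click_graph(click_log)
```

Keeping both adjacency maps makes it cheap to step from queries to URLs and back, which is what a walk over the bipartite graph needs.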

A Random Walk Algorithm
- For some query q ∈ Q, compute the affinity of q to some seed node s ∈ S as the probability that a random walk that starts from q ends up at node s.
- Similarly, the affinity of q to the concept class c ∈ C is the probability that the random walk that starts from q ends up in any seed node in class c.

ARW cont..
- l_q (or l_u) denotes the random variable for the concept label of query q (or URL u).
- P(l_q = c) is the probability that a random walk that starts from q will be absorbed at some node of class c.
- α is the probability of making a transition to the null-class absorbing node from any node in the graph.
- γ is a threshold below which class probabilities are discarded, to increase efficiency.
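
The quantities above suggest a simple iterative computation. The sketch below is one possible reading, not necessarily the authors' exact procedure: seed URLs are treated as absorbing with probability 1 for their class, every step loses a fraction α to the null class, and class probabilities below γ are dropped. The function name, the default parameter values and the toy graph are all assumptions made for illustration.

```python
def absorbing_random_walk(q_to_u, u_to_q, seeds, alpha=0.1, gamma=1e-4,
                          iterations=20):
    """Estimate P(l_q = c) for every query q by iterating over the click graph."""

    def normalize(adj):
        # Turn click counts into transition probabilities out of each node.
        out = {}
        for node, nbrs in adj.items():
            total = sum(nbrs.values())
            out[node] = {n: w / total for n, w in nbrs.items()}
        return out

    def propagate(weights, source):
        # One step: each node averages its neighbors' class probabilities,
        # scaled by (1 - alpha) for the chance of jumping to the null class.
        result = {}
        for node, nbrs in weights.items():
            scores = {}
            for nbr, w in nbrs.items():
                for c, p in source.get(nbr, {}).items():
                    scores[c] = scores.get(c, 0.0) + (1 - alpha) * w * p
            # Discard probabilities below the threshold gamma.
            result[node] = {c: p for c, p in scores.items() if p >= gamma}
        return result

    w_q = normalize(q_to_u)
    w_u = normalize(u_to_q)
    # Seed URLs are absorbing: probability 1 for their assigned class.
    p_url = {u: ({seeds[u]: 1.0} if u in seeds else {}) for u in w_u}
    p_query = {q: {} for q in w_q}

    for _ in range(iterations):
        p_query = propagate(w_q, p_url)
        p_url = propagate(w_u, p_query)
        for u, c in seeds.items():
            if u in p_url:
                p_url[u] = {c: 1.0}  # keep seeds fixed
    return p_query


# Hypothetical toy click graph (click counts as edge weights).
q_to_u = {
    "running shoes": {"shoes.com": 10, "runnersworld.example": 3},
    "cheap sneakers": {"shoes.com": 5},
}
u_to_q = {
    "shoes.com": {"running shoes": 10, "cheap sneakers": 5},
    "runnersworld.example": {"running shoes": 3},
}
seeds = {"shoes.com": "shoes"}

affinity = absorbing_random_walk(q_to_u, u_to_q, seeds, alpha=0.1)
```

In this toy graph, "cheap sneakers" only ever leads to the seed URL, so its affinity to "shoes" is 1 - α; "running shoes" scores lower because some of its clicks go to an unlabeled URL.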

Markov Random Fields (MRF)
- An MRF is an undirected graph in which each node is associated with a random variable and the edges model the pairwise relationships between the random variables.
- The Markov assumption is the defining property of an MRF: the value of a random variable is conditionally independent of the rest of the graph, given the values of all its neighbors.

Gaussian Markov Random Fields
- Uses a continuous relaxation of the discrete labels: the class labels are real numbers in the interval [0,1].
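
The slides do not spell out how the relaxed labels are computed. One common way to realize such a relaxation, shown here purely as an assumed illustration, is to fix the seed nodes at 0 or 1 and repeatedly set every other node to the weighted average of its neighbors, so that all values stay in [0,1]. The function name `relaxed_labels`, the starting value 0.5 and the toy edges are hypothetical.

```python
def relaxed_labels(edges, fixed, iterations=100):
    """Iteratively average neighbor values on an undirected weighted graph.

    edges: {(a, b): weight}; fixed: {node: value in [0, 1]} for seed nodes.
    """
    nbrs = {}
    for (a, b), w in edges.items():
        nbrs.setdefault(a, {})[b] = w
        nbrs.setdefault(b, {})[a] = w

    # Unlabeled nodes start at 0.5; seeds keep their fixed value throughout.
    values = {n: fixed.get(n, 0.5) for n in nbrs}
    for _ in range(iterations):
        for n in nbrs:
            if n in fixed:
                continue
            total = sum(nbrs[n].values())
            values[n] = sum(w * values[m] for m, w in nbrs[n].items()) / total
    return values


# Hypothetical toy graph: one query connected to a positive and a negative seed.
edges = {
    ("running shoes", "shoes.com"): 10,
    ("running shoes", "news.example"): 2,
}
fixed = {"shoes.com": 1.0, "news.example": 0.0}

values = relaxed_labels(edges, fixed)
```

Here the query ends up at 10/12, reflecting that most of its click weight goes to the URL labeled 1.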

Variational Inference and Mean Field Algorithm
- Labels are discrete, unlike in the Gaussian Markov Random Field.
- The goal of variational inference is to approximate the true, intractable posterior distribution P with a tractable distribution P̂ that has a simpler form.
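
For reference, the textbook mean field formulation can be written as below. The slides do not state the exact objective, so this is the standard form rather than a transcription; the index v ranging over all query and URL nodes is an assumption of this sketch.

```latex
% Mean field: approximate P by a fully factorized distribution over node labels
\hat{P}(l) = \prod_{v \in Q \cup U} \hat{P}_v(l_v)

% The factors are chosen to minimize the KL divergence from \hat{P} to P
\mathrm{KL}\left(\hat{P} \,\|\, P\right)
  = \sum_{l} \hat{P}(l) \, \log \frac{\hat{P}(l)}{P(l)}
```

Because P̂ factorizes over nodes, each factor can be updated in turn while the others are held fixed, which is what makes the approximation tractable.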

Experimental Results
- 20 categories in total.
- ARW has the highest micro-averaged relevance.
- Snippets has a higher micro-average for 5 out of 20 categories.
- Mean field has a higher average for 2 categories.

Conclusion
- An approach to keyword generation that leverages the information available in search engine click logs.
- This approach requires minimal effort on the part of the advertisers.
- Promising experimental results demonstrate that these algorithms can scale to large query logs and produce high-quality results.

Thank you !!