Published by Francis Smith. Modified over 8 years ago.
1
Ariel Fuxman, Panayiotis Tsaparas, Kannan Achan, Rakesh Agrawal (2008) - Akanksha Saxena
2
Motivation The main source of income for search engines is web search advertising, which places relevant advertisements together with the search engine results. Given a specific keyword, advertisers bid for it, and the winner of the auction has her ads displayed as sponsored links next to the search results.
3
Keyword Generation Problem The problem of identifying an appropriate set of keywords for a specific advertiser, also known as keyword research. Example tools: Google’s AdWords Keyword Tool, Overture/Yahoo! Keyword Selector Tool, and Microsoft adCenter Labs’ Keyword Group Detection.
4
Solution – Query-Click Logs Maintain the queries that users pose to the search engine and the documents that are clicked in return. Clicks define a strong association between the queries and the URLs. This association is used to find the queries that are related to the interests of the advertisers.
5
Example Suppose the owner of the shoes.com online store launches an ad campaign. Most of the queries that lead to shoes.com come from users interested in buying shoes. The query-click log maps queries to the clicked documents/URLs.
6
Problem Definition Input:
A search engine click log L that consists of triples <q, u, f_qu>, where q is a query, u is the URL of a document, and f_qu is the number of times that users issued query q ∈ Q (the set of all queries) and clicked on URL u ∈ U (the set of all URLs). The click log L is treated as a weighted bipartite graph G, known as the click graph, where Q and U constitute the partitions of the graph, and for every record in the log there is an edge (q,u) ∈ E with weight f_qu.
A set of concepts C = {c_1, …, c_k}. The concepts represent abstract themes that the advertiser is interested in, which can be either general (e.g. shoes) or specific (e.g. running shoes).
A seed set S ⊆ U×C of URLs in the click log that are manually assigned to the concepts in C. The seed set S consists of pairs <u, c>, where u ∈ U and c ∈ C is the concept label assigned to u.
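The input described above can be sketched in Python as follows. This is only an illustration: the function names `build_click_graph` and `transition_probs`, the sample log, and the click counts are all hypothetical and do not come from the slides.

```python
from collections import defaultdict

def build_click_graph(log):
    """Build both sides of the weighted bipartite click graph.

    `log` is an iterable of (query, url, count) triples; repeated
    (query, url) pairs have their counts summed.
    """
    q_to_u = defaultdict(dict)  # query -> {url: f_qu}
    u_to_q = defaultdict(dict)  # url -> {query: f_qu}
    for q, u, f in log:
        q_to_u[q][u] = q_to_u[q].get(u, 0) + f
        u_to_q[u][q] = u_to_q[u].get(q, 0) + f
    return q_to_u, u_to_q

def transition_probs(neighbors):
    """Normalize the edge weights of one node into probabilities."""
    total = sum(neighbors.values())
    return {node: f / total for node, f in neighbors.items()}

# Hypothetical click log and seed set (made-up counts).
log = [
    ("running shoes", "shoes.com", 10),
    ("cheap shoes", "shoes.com", 5),
    ("running shoes", "runnersworld.com", 5),
    ("marathon training", "runnersworld.com", 8),
]
seeds = {"shoes.com": "shoes"}  # URL -> concept label

q_to_u, u_to_q = build_click_graph(log)
probs = transition_probs(q_to_u["running shoes"])
```

Keeping both adjacency maps makes it cheap to step from a query to its URLs and from a URL back to its queries, which is what a walk on the bipartite graph needs.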
7
Problem Definition (cont.) Output: Given the click graph G, the concepts C, and the seed set S, the goal of the keyword generation problem is to populate the concepts in C with queries from Q. These queries are then used as keyword suggestions for the advertisers that are interested in the specific concept.
8
A Random Walk Algorithm For some query q ∈ Q, compute the affinity of q to some seed node s ∈ S as the probability that a random walk that starts from q ends up at node s. Similarly, the affinity of q to the concept class c ∈ C is the probability that a random walk that starts from q ends up in any seed node in class c.
9
ARW (cont.) l_q (or l_u) denotes the random variable for the concept label of query q (or URL u). P(l_q = c) is the probability that a random walk that starts from q will be absorbed at some node of class c. α is the probability of making a transition to the null-class absorbing node from any node in the graph. γ is a threshold: class probabilities below γ are discarded to increase efficiency.
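A minimal sketch of how these quantities could be computed iteratively for one concept is given below. It is an assumption about one reasonable implementation, not the authors' exact procedure: the update rule (a weighted average of neighbor probabilities scaled by 1 - α), the fixed iteration count, the default parameter values, and the sample data are all illustrative.

```python
from collections import defaultdict

def absorbing_random_walk(log, seeds, concept,
                          alpha=0.01, gamma=1e-4, iterations=50):
    """Estimate P(l_q = concept) for every query in the click log.

    log:   iterable of (query, url, count) triples
    seeds: dict mapping seed URLs to their concept label
    alpha: probability of jumping to the null-class absorbing node
    gamma: probabilities below this threshold are set to zero
    """
    q_adj = defaultdict(dict)
    u_adj = defaultdict(dict)
    for q, u, f in log:
        q_adj[q][u] = q_adj[q].get(u, 0) + f
        u_adj[u][q] = u_adj[u].get(q, 0) + f

    # Seed URLs are absorbing: probability 1 for their own concept, else 0.
    p_url = {u: 1.0 if seeds.get(u) == concept else 0.0 for u in u_adj}
    p_query = {q: 0.0 for q in q_adj}

    for _ in range(iterations):
        for q, nbrs in q_adj.items():
            total = sum(nbrs.values())
            p = (1 - alpha) * sum(f / total * p_url[u] for u, f in nbrs.items())
            p_query[q] = p if p >= gamma else 0.0
        for u, nbrs in u_adj.items():
            if u in seeds:
                continue  # absorbing nodes keep their fixed value
            total = sum(nbrs.values())
            p = (1 - alpha) * sum(f / total * p_query[q] for q, f in nbrs.items())
            p_url[u] = p if p >= gamma else 0.0
    return p_query

# Hypothetical data: shoes.com is the only seed, labeled "shoes".
log = [
    ("running shoes", "shoes.com", 10),
    ("cheap shoes", "shoes.com", 5),
    ("running shoes", "runnersworld.com", 5),
    ("marathon training", "runnersworld.com", 8),
]
scores = absorbing_random_walk(log, {"shoes.com": "shoes"}, "shoes")
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy log, queries that click only on the seed URL score highest, and a query that reaches the seed only through a shared non-seed URL scores lower but still above zero.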
10
Markov Random Fields (MRF) An MRF is an undirected graph in which each node is associated with a random variable and the edges model the pairwise relationships between the random variables. The Markov assumption is the property of an MRF that the value of a random variable is conditionally independent of the rest of the graph, given the values of all its neighbors.
11
Gaussian Markov Random Fields Use a continuous relaxation instead of discrete labels; that is, the class labels are real numbers in the [0,1] interval.
12
Variational Inference and Mean Field Algorithm Labels are discrete, unlike in the Gaussian Markov Random Field. The goal of variational inference is to approximate the true, intractable posterior distribution P with a tractable distribution P̂ that has a simpler form.
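In the standard textbook formulation, which is assumed here rather than taken from the slide, the approximation is chosen by minimizing the KL divergence to P over a tractable family, and the mean field choice restricts that family to products of per-node distributions. The family symbol \mathcal{F} and the node index i are notation introduced only for this illustration.

```latex
\hat{P}^{*} = \arg\min_{\hat{P} \in \mathcal{F}} \mathrm{KL}\bigl(\hat{P} \,\|\, P\bigr),
\qquad
\hat{P}(l) = \prod_{i} \hat{P}_i(l_i)
```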
13
Experiment Result A total of 20 categories. ARW has the highest micro-averaged relevance. Snippets has a higher micro-average for 5 out of the 20 categories. Mean field has a higher average for 2 categories.
14
Conclusion An approach to keyword generation that leverages the information available in search engine click logs. This approach requires minimal effort on the part of the advertisers. Promising experimental results demonstrate that these algorithms can scale to large query logs and produce high-quality results.
15
Thank you !!