Post-Ranking query suggestion by diversifying search Chao Wang.


1 Post-Ranking query suggestion by diversifying search Chao Wang

2 Mission Diversify the content of the search results returned by suggested queries while keeping the suggestions relevant. Random walk: a mathematical formalization of a path that consists of a succession of random steps. Examples: a stock price, a molecule travelling in a liquid.
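
As a small illustration of the random walk definition on this slide, here is a minimal sketch of a one-dimensional walk in which each step is +1 or -1 with equal probability. The function name `random_walk` and the equal step probabilities are assumptions made for this example; the slide itself does not specify a particular walk.

```python
import random


def random_walk(steps, seed=None):
    """Return the positions visited by a simple 1-D random walk.

    Hypothetical helper for illustration: starts at 0 and moves
    +1 or -1 with equal probability at every step.
    """
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(steps):
        position += rng.choice((-1, 1))  # one random step
        path.append(position)
    return path


if __name__ == "__main__":
    print(random_walk(10, seed=0))
```

The returned list always has `steps + 1` entries, and consecutive entries differ by exactly one, which is the "succession of random steps" the slide describes.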

3 Suggested queries After a user submits a query, a set of relevant queries is suggested to the user. If not satisfied with the results on the page, the user may choose to click on one of the suggested queries. Research indicates that query suggestion greatly improves the user satisfaction rate.

4 Existing work and improvement Existing work focuses on discovering relevant queries from search engine logs (co-clicked URLs and session information). It does not address the diversification of the query suggestions. When a user clicks on a suggested query, he or she expects to gain additional information. Improvement: define the SERP diversification between two queries as the difference between their top-returned search results. Example: Delta airline.

5 Related work Random walk model: queries and URLs are represented as nodes in a bipartite graph where each edge connects one query with one URL and indicates a click. Entropy model: different user clicks have different importance; a click on a more specific URL is weighted higher than a click on a general URL. Rare queries: combine information from clicked URLs and skipped URLs by constructing two bipartite graphs. Rare queries: apply a random walk model on the query-URL bipartite graph and compute the query hitting time, which can encourage diversity.
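
The query-URL bipartite graph described on this slide can be sketched as follows. This is a minimal illustration, not the method of any of the cited works: it computes the probability of reaching each query from a start query in one query-to-URL-to-query step, with transitions proportional to click counts. The function name `query_transition_probs` and the `(query, url, count)` input format are assumptions made for this example.

```python
from collections import defaultdict


def query_transition_probs(clicks, start_query):
    """One query -> URL -> query step of a click-graph random walk.

    clicks: iterable of (query, url, click_count) tuples (assumed format).
    Returns a dict mapping each reachable query to its probability.
    """
    query_to_url = defaultdict(dict)
    url_to_query = defaultdict(dict)
    for query, url, count in clicks:
        query_to_url[query][url] = query_to_url[query].get(url, 0) + count
        url_to_query[url][query] = url_to_query[url].get(query, 0) + count

    probs = defaultdict(float)
    out_clicks = query_to_url.get(start_query, {})
    total_from_query = sum(out_clicks.values())
    for url, count_qu in out_clicks.items():
        total_from_url = sum(url_to_query[url].values())
        for other, count_uq in url_to_query[url].items():
            # P(query -> url) * P(url -> other query)
            probs[other] += (count_qu / total_from_query) * (count_uq / total_from_url)
    return dict(probs)


if __name__ == "__main__":
    log = [
        ("delta", "delta.com", 3),
        ("delta airlines", "delta.com", 1),
        ("delta", "wiki/Delta", 1),
    ]
    print(query_transition_probs(log, "delta"))
```

Queries that share clicked URLs with the start query receive non-zero probability, which is the intuition behind using co-clicked URLs to find related queries.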

6 Mission Rather than focusing on improving the relevance of documents by re-ranking them, we aim at re-ranking suggested queries, which help users refine their intent. Previous limitation: existing work on diversifying search results focused only on ambiguous queries, that is, queries with more than one user intent. Previous limitation: existing query suggestion work focuses only on relevance and does not consider diversification.

7 Generate suggestion candidates Random walk model: applied to the query-click logs. User sessions: examine user activity within a certain period of time to extract relevant queries.
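
The session-based candidate step on this slide can be sketched as follows, under stated assumptions: one user's log is a time-sorted list of `(timestamp_seconds, query)` pairs, and a new session starts after a gap of more than 30 minutes. Both the input format and the 30-minute threshold are assumptions for this example; the slide only says "within a certain period of time".

```python
def split_sessions(events, gap_seconds=1800):
    """Group one user's queries into sessions.

    events: list of (timestamp_seconds, query), sorted by time (assumed format).
    gap_seconds: inactivity gap that closes a session (assumed 30 minutes).
    """
    sessions = []
    current = []
    last_time = None
    for timestamp, query in events:
        if last_time is not None and timestamp - last_time > gap_seconds:
            sessions.append(current)
            current = []
        current.append(query)
        last_time = timestamp
    if current:
        sessions.append(current)
    return sessions


def co_session_candidates(sessions, query):
    """Return the other queries issued in the same sessions as `query`."""
    candidates = set()
    for session in sessions:
        if query in session:
            candidates.update(q for q in session if q != query)
    return candidates


if __name__ == "__main__":
    events = [(0, "delta"), (60, "delta airlines"), (5000, "weather")]
    sessions = split_sessions(events)
    print(co_session_candidates(sessions, "delta"))
```

Queries that co-occur with the input query in a session become suggestion candidates, to be ranked in a later step.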

8 Ranking Function

9 Feature 1 Open Directory Project (ODP): https://www.dmoz.org/ Built using a binary tree. Paper example: (next page)

10

11 Features 2, 3, 4 Features 2 and 3 check the similarity between URL strings and between domain names: value = 1 if the two strings are the same and 0 otherwise. Feature 4 computes the correlation between two ordered SERP lists: a pair is concordant if both URLs are identical and ranked at the same position. Similarity calculation: not the main focus of this paper.
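
Features 2 to 4 can be sketched as follows. The exact formulas are not given on the slide, so this is only an illustration: the URL and domain features follow the stated 1/0 rule, while `positional_agreement` is a simplified stand-in that counts the fraction of positions holding identical URLs. It is not a full rank correlation and should not be read as the paper's formula. All function names are hypothetical, and URLs are assumed to include a scheme such as `https://`.

```python
from urllib.parse import urlparse


def same_url(url_a, url_b):
    """Feature 2 sketch: 1 if the URL strings are identical, else 0."""
    return 1 if url_a == url_b else 0


def same_domain(url_a, url_b):
    """Feature 3 sketch: 1 if the host names are identical, else 0.

    Assumes both URLs carry a scheme, otherwise urlparse leaves netloc empty.
    """
    return 1 if urlparse(url_a).netloc == urlparse(url_b).netloc else 0


def positional_agreement(serp_a, serp_b):
    """Simplified Feature 4 stand-in: share of positions with identical URLs."""
    n = min(len(serp_a), len(serp_b))
    if n == 0:
        return 0.0
    concordant = sum(1 for a, b in zip(serp_a, serp_b) if a == b)
    return concordant / n


if __name__ == "__main__":
    a = ["https://a.com/x", "https://b.com/y", "https://c.com/z"]
    b = ["https://a.com/x", "https://c.com/z", "https://b.com/y"]
    print(same_url(a[0], b[0]), same_domain(a[1], b[1]), positional_agreement(a, b))
```

A diversification score would treat lower similarity between the two result lists as higher diversity.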

12 Training labels and learning algorithms Labels: human judges evaluate the relevance between a query and its suggestions (score between 0 and 3). Classification: support vector machines classify instances into one of the four classes with a detailed ranked score. Example. The research is based on the LambdaSMART algorithm because of its superior performance.

13 When the data is very informative, the shrinkage is zero; it moves toward 1 when the data is less informative.

14 Data acquisition Randomly sampled 13,421 queries between Sep 2010 and Nov 2010. These are queries that trigger at least one related search on the search result page.

15 Performance for different query types Average query length: 2.51. Average suggestion length. Query length buckets: long > 4, medium 2 <= length <= 4, short < 2. Query types: navigational queries and informational queries. Normalized discounted cumulative gain (NDCG): a measure of ranking quality used to measure the effectiveness of web search engine algorithms; its value lies between 0 and 1.
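
As an illustration of the NDCG measure named on this slide, here is a minimal sketch using one common formulation, with gain `2^rel - 1` and a `log2(rank + 1)` discount for 1-based ranks. The slide does not say which NDCG variant the paper uses, so the choice of formulation is an assumption, as are the function names.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for relevance labels in ranked order."""
    # enumerate() is 0-based, so rank + 2 gives log2(position + 1) for 1-based positions.
    return sum((2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    if ideal == 0:
        return 0.0  # no relevant items at all
    return dcg(relevances) / ideal


if __name__ == "__main__":
    # Labels on the 0-3 scale used for the human judgments.
    print(ndcg([3, 2, 1, 0]))
    print(ndcg([0, 1, 3]))
```

A ranking that already lists suggestions in descending label order scores 1, and any other ordering of the same labels scores lower, which is why the value stays between 0 and 1.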

16 performance for different query types

17 Conclusion First gather a set of suggestion candidates, then rank the suggestions based on their diversification scores. The diversification score is based on these features: ODP category, URL string difference, domain difference. Important discovery: the similarity between queries and suggested queries indeed drops. There is still a lot of room for improvement, and more features will be explored.

