
1 Ruirui Li, Ben Kao, Bin Bi, Reynold Cheng, Eric Lo. Speaker: Ruirui Li. The University of Hong Kong.

2 Outline: Motivation; Problem Statement; DQR Model; Experiments & Evaluation.

3 Motivation: Massive information has arisen on the Internet. Number of URLs indexed by Google: 8 billion in 2004, 1 trillion in 2008, and an unknown number in 2012. (Numbers from Google's 2004 annual report and its official blog in 2008.)

4 Motivation: User activities in the search process. A user with a search intent (e.g., a living place for CIKM) issues a query (e.g., ``CIKM 2012 Hotel''), examines the search results, and clicks on results (e.g., a Maui Hotel lodging page).

5 Motivation: These user activities (queries, search results, and clicks) are recorded in the search log, which we mine.

6 Motivation: The effectiveness of IR depends on the input query. Users suffer: translating human thoughts (search intent) into a concise set of keywords (a query) is never straightforward.

7 Motivation: Input queries are short, typically composed of only one or two terms. (Chart: distribution of the number of terms in a query.)

8 Motivation: Short queries lead to two issues. Issue 1, ambiguity: e.g., the query ``jaguar'' (the cat, the NFL team, or the automobile brand). Issue 2, lack of specificity: e.g., the query ``Disney'' (the cartoon, the store, or the park).

9 Motivation: Most traditional approaches focus on relevance. 1. The queries most relevant to the input query tend to be similar to each other. 2. This generates redundant and monotonous recommendations. 3. Such recommendations provide limited coverage of the recommendation space.

10 Motivation: A recommender should provide queries that are not only relevant but also diversified. With diversified recommendations: 1. We can cover multiple potential search intents of the user. 2. The risk that users won't be satisfied is minimized. 3. Users find their desired targets in fewer recommendation cycles.

11 Problem statement. Input: a query q and an integer m (the number of recommended queries). Output: a list of m recommended queries Y. Goal: at least one query in Y is relevant to the user's search intent.

13 Problem statement: a query recommender should satisfy five properties: 1. Relevance. 2. Redundancy-freeness. 3. Diversity. 4. Ranking. 5. Real-time response.

14 DQR: framework. Offline (redundancy-free issue): mine query concepts from the search log. Online (diversity issue): a probabilistic diversification model.

15 DQR: offline. Mining query concepts: the same search intent can be expressed by different queries, e.g., ``Microsoft Research Asia'', ``MSRA'', ``MS Research Beijing''. A query concept is a set of queries that express the same or similar search intents.

16 DQR: online.

17 DQR: online. Greedy strategy: starting from the input query and a pool of candidate concepts, select one query concept at a time until m concepts have been chosen.
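The greedy strategy above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `score` stands in for the diversification objective defined on the later slides, and the toy scoring function at the bottom is an assumption for demonstration only.

```python
def greedy_select(candidates, score, m):
    """Greedily pick m concepts; each round rescores the remaining
    candidates against the set already selected (diversification)."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < m:
        best = max(pool, key=lambda c: score(c, selected))
        selected.append(best)
        pool.remove(best)
    return selected

# Toy run: the score favors large values but penalizes repeats.
picked = greedy_select([1, 1, 2, 3],
                       lambda c, sel: c - 2 * sel.count(c), 3)
```

Because the score is re-evaluated after every pick, an already-selected concept can push down similar candidates in later rounds, which is exactly what the diversification objective needs.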

18 DQR: diversification. Objective function: favor query concepts that are relevant to the input query, and penalize query concepts that are relevant to the concepts already selected. (The slide's notation distinguishes the concept the input query belongs to, the set of concepts already selected, and the candidate concept to be selected.)

19 DQR: diversification. Objective function and its estimation (formulas on slide).

20 DQR: diversification. Click set s: the set of URLs clicked for a query.

21 DQR: diversification. Objective function, decomposed into a relevance term and a diversity term (formulas on slide).
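A relevance-plus-diversity objective of this shape can be sketched in the spirit of maximal marginal relevance (MMR). This is not the paper's exact probabilistic model: the `rel` and `sim` functions and the trade-off weight `lam` are illustrative assumptions.

```python
def marginal_score(candidate, selected, rel, sim, lam=0.5):
    """Relevance of the candidate to the input query, minus a penalty
    for its similarity to already-selected concepts."""
    relevance = rel(candidate)
    redundancy = max((sim(candidate, s) for s in selected), default=0.0)
    return lam * relevance - (1 - lam) * redundancy

# Toy example with a dictionary-based relevance and exact-match similarity.
rel = {"hotel": 0.8, "flight": 0.6}.get
sim = lambda a, b: 1.0 if a == b else 0.0
s = marginal_score("hotel", ["flight"], rel, sim)  # no redundancy penalty here
```

A candidate identical to something already selected would be penalized by the second term, so the greedy loop naturally spreads the picks across different concepts.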

22 Experiments. Datasets: search logs collected from real search engines. AOL: 01 March 2006 to 31 May 2006. SOGOU: 01 June 2008 to 30 June 2008.

23 Baselines: there is no gold standard for query recommendation.

24 Evaluation: user study with 12 users and 60 test queries.

25 Evaluation: for a test query q, the recommendations of each approach are judged at three relevance levels: irrelevant (0 points), partially relevant (1 point), and relevant (2 points).

26 Evaluation: three performance metrics: relevance, diversity, and ranking.

27 Relevance: results on AOL, at the query level and at the concept level.

28 Diversity metric: Intent-Coverage. It measures the number of unique search intents covered by the top m recommended queries. Since each intent represents a distinct user search intent, a higher Intent-Coverage indicates a higher probability of satisfying different users.
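Assuming each recommended query has been labeled with a search intent by the human judges, Intent-Coverage can be computed as the number of unique intent labels among the top m recommendations. This is my reading of the metric as described on the slide; the paper's exact definition may normalize it differently.

```python
def intent_coverage(intents, m):
    """Count the unique search intents covered by the top-m recommendations.
    `intents` lists the judged intent label of each recommendation, in rank order."""
    return len(set(intents[:m]))

cov = intent_coverage(["hotel", "hotel", "flight", "venue", "flight"], 4)
# the top 4 labels cover {"hotel", "flight", "venue"} -> 3 intents
```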


30 Diversity metric: Intent-Coverage. Results on AOL.

31 Ranking metric: Normalized Discounted Cumulative Gain (NDCG). Results on AOL.
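NDCG can be sketched from the 0/1/2 relevance judgments of the user study, using the common log2 position discount (one of several NDCG variants; whether the paper uses this exact discount is an assumption here).

```python
import math

def dcg(gains):
    """Discounted cumulative gain: gain at rank i is divided by log2(i + 1),
    with ranks starting at 1."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains):
    """DCG normalized by the ideal (descending-sorted) ordering, in [0, 1]."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

# A list already in ideal order scores exactly 1.0.
assert ndcg([2, 2, 1, 0]) == 1.0
```

Placing a relevant result low in the ranking lowers NDCG, so the metric rewards approaches that put the relevant recommendations first.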

32 Thanks! Questions and suggestions are welcome.

33 Diversity ranking metric: results on AOL and on SOGOU.

34 Motivation: Diversification is especially needed on mobile devices. One in seven queries comes from a mobile device, and screen space is limited: a 3.5-inch phone screen is much smaller than typical 13.3-, 15.4-, or 17.0-inch laptop screens. (Numbers from Global mobile statistics 2012, mobiThinking.)

35 DQR: clustering. A Hawaii restaurant analogy: unlimited tables, each of which can hold unlimited customers; customers arrive in a stream. Problem: whenever a customer arrives, assign him to a table. Properties: familiar people sit together; unfamiliar people sit apart.

36 DQR: clustering. (Figure: assigning a stream of customers to tables, with a compactness control parameter.)
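The restaurant analogy describes a single-pass streaming clustering: each arriving query (customer) joins the most similar existing concept (table) if it is similar enough, and otherwise opens a new concept. The sketch below assumes a pluggable similarity function and a threshold acting as the compactness control; the paper's actual assignment criterion may differ.

```python
def stream_cluster(items, sim, threshold):
    """Single-pass clustering: assign each item to the existing cluster
    with the highest average member similarity, or open a new cluster
    when no cluster reaches the threshold (compactness control)."""
    clusters = []
    for x in items:
        best, best_sim = None, threshold
        for c in clusters:
            s = sum(sim(x, y) for y in c) / len(c)
            if s >= best_sim:
                best, best_sim = c, s
        if best is None:
            clusters.append([x])
        else:
            best.append(x)
    return clusters

# Toy similarity for demonstration: 1.0 when two strings share a first letter.
sim = lambda a, b: 1.0 if a[0] == b[0] else 0.0
groups = stream_cluster(["msra", "microsoft", "cikm"], sim, 0.5)
```

Raising the threshold makes tables "tighter" (more, smaller concepts); lowering it merges more queries into each concept, which is the compactness trade-off the slide alludes to.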

37 DQR: extracting a representative query from each query concept. Voting strategy: compute a score for each query in the concept and select the highest-scoring one (scoring formula on slide).
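One way to realize such a voting strategy is to score each query by its total similarity to the other queries in its concept, so that every member effectively "votes" for the queries it resembles. This is a sketch under that assumption; the slide's exact scoring formula is not reproduced here, and the shared-word similarity is a toy stand-in.

```python
def representative(concept, sim):
    """Pick the query with the highest total similarity to the other
    queries in its concept (each member votes via its similarity)."""
    def score(q):
        return sum(sim(q, other) for other in concept if other != q)
    return max(concept, key=score)

# Toy similarity: number of shared words between two queries.
sim = lambda a, b: len(set(a.split()) & set(b.split()))
rep = representative(
    ["microsoft research asia", "ms research beijing", "msra"], sim)
```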

38 Relevance: results on SOGOU.

39 CIKM: Proc. of the 2012 Int. Conf. on Information and Knowledge Management (CIKM'12), Maui, Hawaii, Oct. 2012, to appear.

