Personalizing Search
Jaime Teevan (MIT), Susan T. Dumais (MSR), and Eric Horvitz (MSR)
Query: “pia workshop” (relevant result shown)
Outline
- Approaches to personalization
- The PS algorithm
- Evaluation
- Results
- Future work
Approaches to Personalization
- Content of user profile
  - Long-term interests: Liu et al. [14], Compass Filter [13]
  - Short-term interests: query refinement [2,12,15], Watson [4]
- How the user profile is developed
  - Explicit: relevance feedback [19], query refinement [2,12,15]
  - Implicit: query history [20,22], browsing history [16,23]
- This work: a very rich user profile
PS Search Engine
[Diagram: a query is issued to the PS search engine]
PS Search Engine
[Diagram: the user profile as groups of related terms, e.g. “web search retrieval ir hunt”, “csail mit artificial research robot”, “baby infant child boy girl”]
PS Search Engine
[Diagram: each result is scored against the user profile (e.g. 6.0, 2.7, 1.6, 1.3, 0.2) and the search results page is re-ranked]
Calculating a Document’s Score
Based on standard tf.idf: Score = Σ_i tf_i · w_i
World: the corpus contains N documents, n_i of which contain term i; the term weight is
w_i = log(N / n_i)
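As a sketch, the tf.idf-style scoring above might look like this in Python; the corpus size, document frequencies, and term frequencies below are hypothetical values for illustration, not numbers from the talk.

```python
import math

def term_weight(N, n_i):
    """World weight for term i: w_i = log(N / n_i)."""
    return math.log(N / n_i)

def score(doc_tf, weights):
    """Score = sum over terms of tf_i * w_i."""
    return sum(tf * weights.get(term, 0.0) for term, tf in doc_tf.items())

# Hypothetical corpus statistics
N = 1_000_000                                   # documents in the corpus
n = {"web": 200_000, "retrieval": 5_000, "ir": 8_000}
w = {t: term_weight(N, n_i) for t, n_i in n.items()}

# Term frequencies of one candidate result document
doc = {"web": 3, "retrieval": 1, "ir": 2}
print(round(score(doc, w), 3))
```

Rarer terms (small n_i) get larger weights, so a document matching the profile's distinctive terms scores higher than one matching only common terms.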
Client: with relevance information from the user's R documents, r_i of which contain term i, the weight becomes
w_i = log [ (r_i + 0.5)(N - n_i - R + r_i + 0.5) / ((n_i - r_i + 0.5)(R - r_i + 0.5)) ] †
On the client, the corpus statistics are augmented with the user's documents: N′ = N + R and n_i′ = n_i + r_i.
† From Sparck Jones, Walker and Robertson, 1998 [21].
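A sketch of the relevance-weighted variant, assuming hypothetical corpus and client counts; `client_weight` applies the slide's substitution N′ = N + R, n_i′ = n_i + r_i.

```python
import math

def rsj_weight(N, n_i, R, r_i):
    """Relevance weight of Sparck Jones, Walker and Robertson (1998):
    w_i = log[(r_i+0.5)(N-n_i-R+r_i+0.5) / ((n_i-r_i+0.5)(R-r_i+0.5))]."""
    num = (r_i + 0.5) * (N - n_i - R + r_i + 0.5)
    den = (n_i - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(num / den)

def client_weight(N, n_i, R, r_i):
    """Client-side variant: the corpus is augmented with the user's
    R documents, so N' = N + R and n_i' = n_i + r_i."""
    return rsj_weight(N + R, n_i + r_i, R, r_i)

# Hypothetical counts: the term appears in 5,000 of 1,000,000 corpus
# documents, and in 40 of the user's 200 documents.
print(round(client_weight(1_000_000, 5_000, 200, 40), 3))
```

A term that is rare on the Web but common in the user's documents gets a strongly positive weight, which is what lets the profile pull personally relevant results upward.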
Finding the Parameter Values
- Corpus representation (N, n_i): how common is the term in general? Web vs. result set
- User representation (R, r_i): how well does it represent the user's interests? All vs. recent vs. Web vs. queries vs. none
- Document representation: which terms to sum over? Full document vs. snippet
Building a Test Bed
- 15 evaluators × ~10 queries each (131 queries total)
- Personally meaningful queries: selected from a list, or queries issued earlier (kept in a diary)
- 50 results evaluated per query: highly relevant / relevant / irrelevant
- Index of personal information
Evaluating Personalized Search
Measure algorithm quality with discounted cumulative gain:
DCG(i) = Gain(i)                        if i = 1
DCG(i) = DCG(i-1) + Gain(i)/log(i)      otherwise
Look at one parameter at a time (67 different parameter combinations!):
- Hold the other parameters constant and vary one
- Look at the best parameter combination
- Compare with various baselines
Analysis of Parameters
[Charts: effect of the corpus, user, and document representation parameters]
PS Improves Text Retrieval
[Chart: no model 0.37, relevance feedback 0.41, Personalized Search 0.46]
Text Features Not Enough
[Chart: the text-based scores (0.37, 0.41, 0.46) all fall short of 0.56]
Take Advantage of Web Ranking
[Chart: no model 0.37, relevance feedback 0.41, Personalized Search 0.46, Web ranking 0.56, PS+Web 0.58]
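The slides do not say how the PS score is combined with the Web ranking; as one hedged sketch, the two rankings could be merged by a weighted sum of rank positions. The function name, the linear rank combination, and the alpha parameter are all assumptions for illustration (the scores echo the example values from the earlier diagram).

```python
def merge_rankings(web_order, ps_scores, alpha=0.5):
    """Re-rank by a weighted sum of web rank and personalized-score rank.
    web_order: doc ids in the engine's original order.
    ps_scores: doc id -> personalized score (higher is better).
    alpha: weight on the web ranking (1.0 = web only, 0.0 = PS only)."""
    ps_order = sorted(web_order, key=lambda d: -ps_scores.get(d, 0.0))
    ps_rank = {doc: r for r, doc in enumerate(ps_order)}
    web_rank = {doc: r for r, doc in enumerate(web_order)}

    def combined(d):
        return alpha * web_rank[d] + (1 - alpha) * ps_rank[d]

    return sorted(web_order, key=combined)

web = ["d1", "d2", "d3", "d4"]                       # original web order
scores = {"d1": 0.2, "d2": 6.0, "d3": 1.3, "d4": 2.7}  # personalized scores
print(merge_rankings(web, scores, alpha=0.5))
```

With alpha = 1.0 the original Web order is preserved; with alpha = 0.0 results are ordered purely by personalized score; intermediate values blend the two, matching the chart's finding that PS+Web beats either signal alone.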
Summary
- Personalization of Web search via result re-ranking
- The user's documents serve as relevance feedback
- Rich representations are important; a rich user profile particularly so
- Efficiency hacks are possible
- Need to incorporate features beyond text
Further Exploration
- Improved non-text components: usage data, personalized PageRank
- Learned parameters: based on the individual, the query, the results
- UIs for user control
User Interface Issues
- Make personalization transparent
- Give the user control over personalization
  - A slider between Web and personalized results
  - Allows for background computation
- Personalization exacerbates the re-finding problem
  - Results change as the user model changes
  - Thesis research: the Re:Search Engine
Thank you! teevan@csail.mit.edu sdumais@microsoft.com horvitz@microsoft.com
Much Room for Improvement (Potential for Personalization)
- Group ranking: the best improves on the Web ranking by 23%; with more people, the improvement shrinks
- Personal ranking: the best improves on the Web ranking by 38%, and remains constant
Evaluating Personalized Search
Query selection:
- Chose from 10 pre-selected queries per evaluator (e.g., Joe and Mary saw different lists: cancer, Microsoft, traffic, …; bison frise, Red Sox, airlines, …; Las Vegas, rice, McDonalds, …)
- Or a previously issued query
- 53 pre-selected queries in all (2-9 per query); 137 queries total
Making PS Practical
- We learn the most about personalization by deploying a system
- The best algorithm is reasonably efficient
- Merging server and client
- Query expansion: get more relevant results into the set to be re-ranked
- Design snippets for personalization