Seesaw: Personalized Web Search
Jaime Teevan, MIT
with Susan T. Dumais and Eric Horvitz, MSR
Query Expansion v. Result Re-Ranking
[Diagram: a standard IR pipeline (user, query, documents, server, client), with personalization algorithms placed either on the server (query expansion) or on the client (result re-ranking)]
Result Re-Ranking
- Ensures privacy
- Good evaluation framework
- Can use a rich, client-side user profile
- Alternative: lightweight user models, collected on the server side and sent as query expansion
Seesaw Search Engine
[Animated diagram: a query is issued; Seesaw matches the terms of each result snippet (e.g., web, search, retrieval, ir, hunt) against a client-side user profile of term counts (e.g., dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1), assigns each result a score (e.g., 1.3), and re-ranks the search results page]
Calculating a Document's Score
- Based on standard tf.idf
- Term weights from probabilistic relevance feedback:

    w_i = log [ ((r_i + 0.5)(N - n_i - R + r_i + 0.5)) / ((n_i - r_i + 0.5)(R - r_i + 0.5)) ]

  where N is the corpus size, n_i the number of corpus documents containing term i, R the number of relevant documents, and r_i the number of relevant documents containing term i
- The user stands in for relevance feedback: R and r_i come from the Stuff I've Seen index
- More is better
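A minimal sketch of this scoring scheme, assuming the standard reading of the relevance-feedback weight (the function and variable names are illustrative, not from the slides):

```python
import math

def term_weight(r_i, n_i, R, N):
    """Relevance-feedback weight for one term.

    N   -- documents in the corpus
    n_i -- corpus documents containing the term
    R   -- documents treated as relevant (here: the user's index)
    r_i -- relevant documents containing the term
    """
    return math.log(((r_i + 0.5) * (N - n_i - R + r_i + 0.5)) /
                    ((n_i - r_i + 0.5) * (R - r_i + 0.5)))

def score(doc_terms, profile, n, R, N):
    """Score a document by summing the weights of its terms.
    `profile` maps terms to r_i counts; `n` maps terms to n_i counts."""
    return sum(term_weight(profile.get(t, 0), n[t], R, N)
               for t in doc_terms if t in n)
```

A term that is frequent in the user's index but rare in the corpus gets a large positive weight; a term the user has never seen but that is common on the Web gets a negative one, which is what pushes personally relevant results up the ranking.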
Finding the Score Efficiently
- Corpus representation (N, n_i)
  - Web statistics
  - Result set
- Document representation
  - Download the full document
  - Use the result-set snippet
- Efficiency hacks are generally OK!
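One such efficiency hack is to approximate the corpus statistics from the result set itself rather than from Web-scale counts. A sketch of that idea (the tokenization and function name are assumptions for illustration):

```python
from collections import Counter

def corpus_stats_from_results(snippets):
    """Approximate (N, n_i) from the result set instead of the full Web:
    N is the number of returned results; n_i counts the snippets that
    contain each term. Requires no extra server round-trips."""
    N = len(snippets)
    n = Counter()
    for snippet in snippets:
        # A set so each snippet counts a term at most once (document frequency).
        n.update(set(snippet.lower().split()))
    return N, n
```

Using the result set this way biases the statistics toward the query's topic, but the slides' point is that such approximations cost little in ranking quality.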
Evaluating Personalized Search
- 15 evaluators
- Each evaluated 50 results for a query
  - Highly relevant / Relevant / Irrelevant
- Measure algorithm quality with DCG:

    DCG(i) = Gain(i)                       if i = 1
    DCG(i) = DCG(i-1) + Gain(i)/log(i)     otherwise
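The DCG recurrence above unrolls to a simple loop. A sketch (base-2 logarithm is an assumption; the slide just says "log"):

```python
import math

def dcg(gains):
    """Discounted cumulative gain over a ranked list of gain values:
    DCG(1) = Gain(1); DCG(i) = DCG(i-1) + Gain(i)/log2(i) for i > 1."""
    total = 0.0
    for i, gain in enumerate(gains, start=1):
        total += gain if i == 1 else gain / math.log2(i)
    return total
```

With graded judgments (e.g., Highly relevant = 2, Relevant = 1, Irrelevant = 0), the log discount rewards rankings that place the best results near the top, which is exactly what a re-ranking algorithm should be measured on.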
Query Selection
- Each evaluator (e.g., Joe, Mary) chose from 10 queries
  - Previously issued queries (Joe: cancer, Microsoft, traffic, ...; Mary: bison frise, Red Sox, airlines, ...)
  - Pre-selected queries (Las Vegas, rice, McDonalds, ...)
- 53 pre-selected query evaluations (2-9 per query)
- Total: 137 evaluated queries
Seesaw Improves Text Retrieval
[Chart: DCG compared across three rankings: Random, Relevance Feedback, and Seesaw, with Seesaw highest]
Text Features Not Enough
Take Advantage of Web Ranking
Further Exploration
- Explore a larger parameter space
- Learn parameters
  - Based on the individual
  - Based on the query
  - Based on the results
- Give the user control?
Making Seesaw Practical
- Learn most about personalization by deploying a system
- Best algorithm is reasonably efficient
- Merging server and client
  - Query expansion: get more relevant results into the set to be re-ranked
  - Design snippets for personalization
User Interface Issues
- Make personalization transparent
- Give the user control over personalization
  - Slider between Web and personalized results
  - Allows for background computation
- Creates a problem with re-finding
  - Results change as the user model changes
  - Thesis research: the Re:Search Engine
Thank you!