Discovering and Using Groups to Improve Personalized Search Jaime Teevan, Merrie Morris, Steve Bush Microsoft Research.

1 Discovering and Using Groups to Improve Personalized Search Jaime Teevan, Merrie Morris, Steve Bush Microsoft Research

2 Diego Velasquez, Las Lanzas

3 People Express Things Differently Differences can be a challenge for Web search – Picture of a man handing over a key. – Oil painting of the surrender of Breda.

4 People Express Things Differently Differences can be a challenge for Web search – Picture of a man handing over a key. – Oil painting of the surrender of Breda. Personalization – Closes the gap using more about the person Groupization – Closes the gap using more about the group

5 How to Take Advantage of Groups? Who do we share interests with? Do we talk about things similarly? What algorithms should we use?

6 Related Work Personalization – Implicit information valuable [Dou et al. 2007; Shen et al. 2005] – More data = better performance [Teevan et al. 2005] Collaborative filtering & recommender systems – Identify related groups Browsed pages [Almeida & Almeida 2004; Sugiyama et al. 2005] Queries [Freyne & Smyth 2006; Lee 2005] Location [Mei & Church 2008], company [Smyth 2007], etc. – Use group data to fill in missing personal data Typically data based on user behavior

7 How We Answered the Questions Who do we share interests with? – Similarity in query selection – Similarity in what is considered relevant Do we talk about things similarly? – Similarity in user profile What algorithms should we use? – Groupize results using groups of user profiles – Evaluate using groups’ relevance judgments

8 Interested in Many Group Types Group longevity – Task-based – Trait-based Group identification – Explicit – Implicit [Figure: group types arranged by longevity (task-based to trait-based) and identification (implicit to explicit). Labels: task, age, gender, job team, job role, location, interest group, relevance judgments, query selection, desktop content]

9 People Studied Trait-based dataset 110 people – Work – Interests – Demographics Microsoft employees Task-based dataset 10 groups x 3 (= 30) Know each other Have common task – “Find economic pros and cons of telecommuting” – “Search for information about companies offering learning services to corporate customers”

10 Queries Studied Trait-based dataset Challenge – Overlapping queries – Natural motivation Queries picked from 12 – Work: c# delegates, live meeting – Interests: bread recipes, toilet train dog Task-based dataset Common task – Telecommuting v. office pros and cons of working in an office social comparison telecommuting versus office telecommuting working at home cost benefit

11 Data Collected Queries evaluated Explicit relevance judgments – 20–40 results – Personal relevance: highly relevant, relevant, not relevant User profile: Desktop index

12 Answering the Questions Who do we share interests with? Do we talk about things similarly? What algorithms should we use?

13 Who do we share interests with? Variation in query selection – Work groups selected similar work queries – Social groups selected similar social queries Variation in relevance judgments – Judgments varied greatly (κ=0.08) – Task-based groups most similar – Similar for one query ≠ similar for another
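
The κ on slide 13 is an agreement statistic. The slide does not say which variant was used, so the following is only a minimal sketch of one common choice, Cohen's kappa for two raters, applied to the three-level judgments from slide 11. The function name and the label strings are illustrative assumptions.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who judged the same items.

    Assumption: the slide's kappa variant is unspecified; this is the
    two-rater form, shown only to illustrate what the statistic measures.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)
```

A value near 0, such as the 0.08 reported on the slide, means the raters agree only slightly more often than chance would predict.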

14 Do we talk about things similarly? Group profile similarity – Members more similar to each other than others – Most similar for aspects related to the group Table (columns: In task group, Not in group, Difference): All queries 0.42, 0.31, 34%; Group queries 0.77, 0.35, 120% Clustering profiles recreates groups Index similarity ≠ judgment similarity – Correlation coefficient of 0.09
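
The in-group versus out-of-group comparison on slide 14 can be sketched as follows. The slide does not state how profile similarity was measured; cosine similarity over term counts is an assumption made here for illustration, and the function names and input layout are hypothetical.

```python
import math
from itertools import combinations


def cosine(p, q):
    """Cosine similarity between two term-count dicts (assumed measure)."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def group_vs_rest(profiles, group_of):
    """Mean pairwise similarity inside groups, outside groups, and the
    relative difference between the two.

    profiles: {user: {term: count}}, group_of: {user: group label}.
    Assumes at least one in-group pair, at least one cross-group pair,
    and a non-zero cross-group mean.
    """
    in_group, out_group = [], []
    for a, b in combinations(profiles, 2):
        sim = cosine(profiles[a], profiles[b])
        (in_group if group_of[a] == group_of[b] else out_group).append(sim)
    mean_in = sum(in_group) / len(in_group)
    mean_out = sum(out_group) / len(out_group)
    return mean_in, mean_out, (mean_in - mean_out) / mean_out
```

The third return value is the kind of relative difference the slide reports: for the group-query row, (0.77 - 0.35) / 0.35 = 1.2, i.e. 120%.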

15 What algorithms should we use? Calculate personalized score for each member – Content: User profile as relevance feedback – Behavior: Previously visited URLs and domains – [Teevan et al. 2005] Sum personalized scores across group Produces same ranking for all members Score: $\sum_{\text{terms } i} tf_i \log \frac{(r_i + 0.5)(N - n_i - R + r_i + 0.5)}{(n_i - r_i + 0.5)(R - r_i + 0.5)}$
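
A minimal sketch of the content-based part of slide 15: score each result against every member's profile with the term weight from the slide's formula, then sum the scores across the group. The slide does not define its symbols, so the reading used here is an assumption: N is the number of corpus documents, n_i the corpus documents containing term i, R the documents in a member's profile, and r_i the profile documents containing term i. Profile documents are assumed to be counted in the corpus, so n_i ≥ r_i and the logarithm stays defined. The dict layout and function names are hypothetical, and the behavior-based signal (visited URLs and domains) is left out.

```python
import math
from collections import Counter


def term_weight(N, n_i, R, r_i):
    """Relevance-feedback weight for one term, following the slide's formula."""
    return math.log(
        ((r_i + 0.5) * (N - n_i - R + r_i + 0.5))
        / ((n_i - r_i + 0.5) * (R - r_i + 0.5))
    )


def personalized_score(result_terms, profile, corpus):
    """Sum tf_i * weight_i over the terms of one search result.

    profile and corpus are hypothetical dicts of the form
    {"docs": document count, "df": {term: documents containing the term}}.
    """
    score = 0.0
    for term, tf in Counter(result_terms).items():
        r_i = profile["df"].get(term, 0)
        n_i = corpus["df"].get(term, 0)
        score += tf * term_weight(corpus["docs"], n_i, profile["docs"], r_i)
    return score


def groupized_ranking(results, group_profiles, corpus):
    """Rank results by the sum of all members' personalized scores.

    results: {url: list of terms in the result}. Every member gets the
    same ranking, as the slide notes.
    """
    totals = {
        url: sum(personalized_score(terms, p, corpus) for p in group_profiles)
        for url, terms in results.items()
    }
    return sorted(totals, key=totals.get, reverse=True)
```

Summing is the simplest way to combine member scores and matches the slide. A result that is strongly relevant to one member's profile can therefore outrank one that is mildly relevant to everyone.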

16 Performance: Task-Based Groups Personalization improves on Web Groupization gains +5% [Chart: Web, Personalized, Groupized]

17 Performance: Task-Based Groups Personalization improves on Web Groupization gains +5% Split by query type – On-task v. off-task – Groupization the same as personalization for off-task queries – 11% improvement for on-task queries [Chart: Web, Personalized, Groupized for off-task queries and on-task queries]

18 Performance: Trait-Based Groups [Chart labels: Groupization, Personalization; Interests, Work]

19 Performance: Trait-Based Groups [Chart labels: Groupization, Personalization; Work queries, Interest queries; Interests, Work]

20 Performance: Trait-Based Groups [Chart labels: Groupization, Personalization; Work queries, Interest queries; Interests, Work]

21 What We Learned Who do we share interests with? – Depends on the task Do we talk about things similarly? – Variation in profiles even with similar judgments What algorithms should we use? – Groupization can take advantage of variation for group-related tasks

22 Thank you. Jaime Teevan, Merrie Morris, Steve Bush Microsoft Research

23 Groupization Performance

24 Related Work: Collaborative Search People collaborate on search – Students [Twidale et al. 1997], professionals [Morris 2008] – Tasks: Travel, shopping, research, school work Systems to support collaborative search – SearchTogether [Morris & Horvitz 2007] – Cerchiamo [Pickens et al. 2008] – CoSearch [Amershi & Morris 2008] – People form explicit task-based groups

25 Related Work: Algorithms Personalization – Implicit information valuable [Dou et al. 2007; Shen et al. 2005] – More data = better performance [Teevan et al. 2005] Collaborative filtering & recommender systems – Identify related groups Browsed pages [Almeida & Almeida 2004; Sugiyama et al. 2005] Queries [Freyne & Smyth 2006; Lee 2005] Location [Mei & Church 2008], company [Smyth 2007], etc. – Use group data to fill in missing personal data Typically data based on user behavior

26 Identifying Groups Explicitly – Tasks: Tools for collaboration [Morris & Horvitz 2007] – Traits: Profiles Implicitly – Interests: Sites visited, queries – Tasks: Query – Location: IP address [Mei & Church 2008] – Gender: Queries [Jones et al. 2007] – Interesting area to explore: Social networks

