
1 Predicting User Interests from Contextual Information
R. W. White, P. Bailey, L. Chen (Microsoft), SIGIR 2009
Presenter: Jae-won Lee

2 Introduction
- Search and recommendation systems use contextual information to model users' interests more effectively.
- This paper examines the effectiveness of five sources of contextual information for modeling user interests: social, historic, task, collection, and user interaction.
- The paper evaluates the utility of these sources and of the overlaps between them; combinations of overlapping contexts outperform any isolated source.

3 Introduction
- Contextual information sources:
  - Interaction: recent interaction behavior preceding the current page
  - Collection: pages with hyperlinks to the current page
  - Task: pages related to the current page through shared search queries
  - Historic: the long-term interests of the current user
  - Social: the combined interests of other users who also visit the current page

4 Log Data
- Browse trails
  - Extracted from user logs (August 2008 to November 2008)
  - Consist of a temporally ordered sequence of URLs visited by a user per Web browser instance or browser tab
  - Trail termination:
    - A period of user inactivity of 30 or more minutes
    - Termination of the browser instance or tab
- Context trails
  - Extracted from the set of browse trails
  - Comprise a terminal URL u_t and the list of the five Web pages preceding u_t in the browse trail (u_{t-5}, ..., u_{t-1})
  - These five pages form the immediate session-based interaction context
  - T_h: the set of terminal URLs
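As a rough illustration of this trail extraction, here is a minimal Python sketch. It assumes each log record is a (timestamp, url) pair already grouped by user and browser instance; the function and variable names are illustrative, not from the paper.

```python
from datetime import timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # inactivity threshold from the slide

def split_browse_trails(page_visits):
    """Split one browser instance's visits into browse trails.

    `page_visits` is assumed to be a time-sorted list of (timestamp, url)
    pairs; a trail terminates after 30+ minutes of inactivity (browser or
    tab termination would be handled upstream in the log extraction).
    """
    trails, current = [], []
    for ts, url in page_visits:
        if current and ts - current[-1][0] >= SESSION_TIMEOUT:
            trails.append(current)
            current = []
        current.append((ts, url))
    if current:
        trails.append(current)
    return trails

def context_trail(browse_trail, k=5):
    """Return the terminal URL u_t and the (up to) k pages preceding it."""
    urls = [url for _, url in browse_trail]
    return urls[-1], urls[-(k + 1):-1]
```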

5 User Interest Models
- All pages drawn from each context (interaction, collection, historic, task, and social) are classified into Web categories (i.e., ODP categories)
- User interests are represented as ranked lists of ODP category labels
- ODP labels in a list are ranked by each label's frequency within the context

6 User Interest Models
- No context (only u_t)
  - One ODP label is assigned to the terminal URL
- Interaction context (u_{t-5}, ..., u_{t-1})
  - One ODP label is assigned to each of the five preceding pages
  - The label frequencies are used to create a ranked list of labels
  - This ranked list is the interest model for the interaction context of u_t
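The frequency-ranking step on this and the previous slide can be sketched as follows, assuming a hypothetical `odp_label(url)` lookup that returns the ODP category assigned to a page.

```python
from collections import Counter

def interest_model(context_urls, odp_label):
    """Rank ODP category labels by their frequency over the context pages.

    `odp_label` is an assumed callable mapping a URL to its ODP label,
    e.g. "/Sports/golf"; the returned ranked list is the interest model.
    """
    counts = Counter(odp_label(u) for u in context_urls)
    return [label for label, _ in counts.most_common()]

# Interaction context of u_t: interest_model over the five preceding pages.
# No-context model: simply the single ODP label assigned to u_t itself.
```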

7 User Interest Models
- Task context
  - Created using the ODP labels assigned to Web pages visited by other users who issued the same query (i.e., performed a similar task)
  - A page u_r is related to u_t when the two pages share a common query
  - The ranked list of these ODP labels is regarded as the task context
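One plausible way to assemble the task context is sketched below. It assumes a `query_log` mapping each query string to the set of pages users visited for it; this structure is illustrative, and the paper's actual log join may differ.

```python
from collections import Counter

def task_context_model(u_t, query_log, odp_label):
    """Rank ODP labels of pages reached by other users via queries shared with u_t."""
    shared_queries = {q for q, pages in query_log.items() if u_t in pages}
    related_pages = {p for q in shared_queries for p in query_log[q] if p != u_t}
    counts = Counter(odp_label(p) for p in related_pages)
    return [label for label, _ in counts.most_common()]
```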

8 User Interest Models
- Collection context
  - Created using Web pages containing hyperlinks that refer to u_t (the in-links of u_t)
  - ODP labels are assigned to each in-link
- Historic context
  - Created for each user based on their long-term interaction history
  - All Web pages the user has visited are classified and assigned ODP labels
- Social context
  - Users who have also visited u_t are found, and their interest models (historic contexts) are combined into a ranked list of ODP labels
  - This list forms the interest model for the social context of u_t
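For the social context, a simple sketch of pooling the histories of other visitors of u_t is given below, assuming `user_histories` maps a user id to the list of URLs that user has visited (names and data layout are assumptions).

```python
from collections import Counter

def social_context_model(u_t, user_histories, odp_label):
    """Pool ODP labels from the browsing histories of users who also visited u_t."""
    pooled = Counter()
    for urls in user_histories.values():
        if u_t in urls:                      # this user also visited u_t
            pooled.update(odp_label(u) for u in urls)
    return [label for label, _ in pooled.most_common()]
```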

9 Data Preparation
- Interest model effectiveness may vary with the temporal distance from u_t to a future time point
  - Short: within one hour of u_t
  - Medium: within one day of u_t
  - Long: within one week of u_t
- The future windows overlap, e.g., the medium window contains the short window
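The three overlapping future windows can be expressed directly. The thresholds below follow the slide, while the data layout (a time-sorted list of (timestamp, url) visits after u_t) is an assumption.

```python
from datetime import timedelta

FUTURE_WINDOWS = {
    "short": timedelta(hours=1),
    "medium": timedelta(days=1),
    "long": timedelta(weeks=1),
}

def future_pages(visits_after_ut, t_terminal, horizon):
    """Pages visited within the given horizon after the terminal URL's timestamp."""
    limit = t_terminal + FUTURE_WINDOWS[horizon]
    return [url for ts, url in visits_after_ut if ts <= limit]
```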

10 Evaluation Methodology
- Find the short-, medium-, and long-term futures and build ground-truth interest models for each of them (the correct interest models)
- Build user interest models from the different context sources
- Determine the accuracy of the context-based models in predicting the ground truth

11 Measures
- p@1
  - The top predicted category label pl_1 for a context trail is matched against the top actual label l_1
- p@3
  - The top predicted label pl_1 is matched against the top three actual labels l_1, l_2, l_3
- Mean reciprocal rank (MRR)
  - If l_1 matches pl_i, the score assigned is the reciprocal of the prediction rank position, 1/i
  - The scores are averaged to compute the final MRR
- Normalized discounted cumulative gain (NDCG)
  - Emphasizes highly relevant ODP labels appearing early in the result list
- F1
  - The harmonic mean of precision and recall
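A minimal sketch of three of these measures over ranked label lists follows; the p@3 definition mirrors the slide's wording (top predicted label against the top three actual labels).

```python
def precision_at_1(predicted, actual):
    """p@1: the top predicted label matches the top actual label."""
    return float(bool(predicted) and bool(actual) and predicted[0] == actual[0])

def precision_at_3(predicted, actual):
    """p@3: the top predicted label appears among the top three actual labels."""
    return float(bool(predicted) and predicted[0] in actual[:3])

def reciprocal_rank(predicted, actual):
    """1/i where the top actual label l_1 first appears at predicted rank i (else 0)."""
    if not actual:
        return 0.0
    for i, label in enumerate(predicted, start=1):
        if label == actual[0]:
            return 1.0 / i
    return 0.0

# MRR is the mean of reciprocal_rank over all context trails.
```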

12 Results
- Context source comparison
  - Different sources of contextual information may be suited to different tasks
    - To predict user interests immediately, u_t, the interaction and task contexts can be used
    - To predict long-term interests, the historic and social contexts can be used

13 Results
- Handling near misses
  - Near miss: two ODP labels differ, but can be considered the same with only a slight loss in precision
    - e.g., /Sports/golf/instruction/golf school and /Sports/golf/instruction
  - One-level back-off converts all ODP labels to their top-level category (e.g., /Sports/)
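The one-level back-off can be illustrated with a small helper that truncates an ODP path to its top-level category.

```python
def back_off(odp_label, levels=1):
    """Truncate an ODP path to its first `levels` components.

    With levels=1, "/Sports/golf/instruction/golf school" and
    "/Sports/golf/instruction" both map to "/Sports/", so a near miss
    between them counts as a match.
    """
    parts = [p for p in odp_label.strip("/").split("/") if p]
    return "/" + "/".join(parts[:levels]) + "/"
```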

14 Results
- Combining contexts
  - 57 context combinations were tested; the top 10 combinations are displayed
  - Combinations that differ significantly from the best-performing model in the context source comparison are marked
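The slide does not specify how the sources are fused, so the sketch below is only one simple way a combination could be formed: merging the ranked label lists by summed reciprocal-rank scores.

```python
from collections import Counter

def combine_contexts(ranked_lists):
    """Merge ranked ODP label lists from several context sources.

    Illustrative only: each label scores the sum of its reciprocal ranks
    across the source models, and labels are re-ranked by that score.
    """
    scores = Counter()
    for labels in ranked_lists:
        for rank, label in enumerate(labels, start=1):
            scores[label] += 1.0 / rank
    return [label for label, _ in scores.most_common()]
```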

15 Conclusion
- We built a variety of user interest models based on the current page, contextual variants, and overlaps between contexts
- The interest models were used to predict short-, medium-, and long-term interests
- The predictive value of each contextual source varies according to the time horizon of the prediction

