Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models. Wei Chu, Seung-Taek Park. WWW 2009. Audience Science Yahoo! Labs.


1 Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models. Wei Chu, Seung-Taek Park. WWW 2009. Audience Science Yahoo! Labs.

2 Outline
Dynamic content
– Yahoo! Front Page Today Module
– Difficulties with new users and new items
Personalized recommendation
– Global level: one-size-fits-all / most popular
– Segmented level: segmentation
– Individual level: personalization
Methodology
– Predictive bilinear models
Findings in the case study
Conclusions

3 Dynamic Content: Yahoo! Front Page

4 Dynamic Content: Today Module
By default, the article at F1 is highlighted at the Story position.
Articles are selected from an hourly-refreshed article pool.
Out-of-date articles are replaced every few hours.
GOAL: select the most attractive article for the Story position to draw users' attention and thereby increase user retention.

5 Dynamic Content: Today Module
a) Click-through rate (CTR) decays over time, e.g. for breaking news.
b) About 40% of clickers are first-time clickers.
c) The lifetime of an article is usually short, only a few hours.
d) The universe of the content pool is dynamic.
Data: 9 days.

6 Difficulties on Dynamic Content
Collaborative filtering provides a good solution in a closed world
– Overlaps in feedback across users are relatively high
– The universe of content items is almost static
CTR decays over time
– Difficult to compare users' feedback on the same article received at different time slots
The lifetime of an article is usually short, only a few hours
– Reduces overlaps in feedback across users
The universe of the content pool is dynamic
– Have to wait for clicks on new items (content-based filtering helps)
– Storage and retrieval of historical ratings of retired items are demanding
About 40% of clickers are first-time clickers
– Hard to serve new users without historical ratings (most popular is the baseline)
Cold-Start Recommendation
                 existing items             new items
existing users   Collaborative filtering    Content-based filtering
new users        most popular               WAIT

7 Solution: Feature-based modeling
Users with open profiles
– Demographic information: age, gender, location
– Property usage over Yahoo! networks
– Search logs, purchase history, etc.
Content profile management
– Static descriptors: categories, title, bags of words of textual content, etc.
– Temporal characteristics: popularity, CTR, freshness, etc.
Feature-based regression models for personalization at the individual level
– New items: initialize popularity based on static meta features
– New users: estimate preferences on items based on user features

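8 Methodology
Data representation
– User features $x_i$ ($D$ features per user)
– Content features $z_j$ ($C$ features per article)
– Historical feedback (story click or not)
Predictive bilinear models
– Bilinear score for a user/article pair:
$$s_{ij} = \sum_{a=1}^{C} \sum_{b=1}^{D} x_{i,b}\, z_{j,a}\, w_{ab}$$
where $x_{i,b}$ is the $b$-th feature of user $i$, $z_{j,a}$ is the $a$-th feature of item $j$, and $w_{ab}$ is the affinity between $x_{i,b}$ and $z_{j,a}$.
Example affinities:
affinity   sports   finance
age 50     0.5      0.8
age 20     0.9      0.2
male       0.6      0.5
Scores shown on the slide: 1.5, 0.7, 1.1, 1.3. Figure labels: C 100, D 1000.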
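The bilinear score on slide 8 can be checked against the toy affinity table with a short sketch. The sparse dictionary layout, the function name `bilinear_score`, and the two example users (a 20-year-old male and a 50-year-old male, each with binary features) are assumptions made for illustration; the slide does not say which users its four scores belong to, although these two users do reproduce the values 1.5, 0.7, 1.1 and 1.3.

```python
# Toy affinity weights copied from the slide: AFFINITY[user_feature][item_feature].
AFFINITY = {
    "age 50": {"sports": 0.5, "finance": 0.8},
    "age 20": {"sports": 0.9, "finance": 0.2},
    "male":   {"sports": 0.6, "finance": 0.5},
}

def bilinear_score(user_features, item_features, affinity=AFFINITY):
    """Sum of x_b * z_a * w_ab over all user features b and item features a.

    Features are sparse dicts mapping a feature name to its value
    (a hypothetical layout chosen for this sketch).
    """
    score = 0.0
    for b, x_b in user_features.items():
        for a, z_a in item_features.items():
            score += x_b * z_a * affinity.get(b, {}).get(a, 0.0)
    return score

# Hypothetical users and articles with binary indicator features.
young_male = {"age 20": 1.0, "male": 1.0}
older_male = {"age 50": 1.0, "male": 1.0}
sports = {"sports": 1.0}
finance = {"finance": 1.0}

scores = {
    ("young_male", "sports"): bilinear_score(young_male, sports),
    ("young_male", "finance"): bilinear_score(young_male, finance),
    ("older_male", "sports"): bilinear_score(older_male, sports),
    ("older_male", "finance"): bilinear_score(older_male, finance),
}
for pair, value in scores.items():
    print(pair, round(value, 2))
```

Because the score depends only on features, a user or article that has never been seen before can still be scored as long as its features are known, which is the cold-start argument made on slide 7.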

9 Offline Optimization
Model fitting on historical feedback
– Regression on continuous targets
– Logistic regression on binary targets
– Probabilistic framework
Optimal affinities at the maximum a posteriori (MAP) estimate
Before fitting, the affinities are unknown:
affinity   sports   finance
age 50     ?        ?
age 20     ?        ?
male       ?        ?
After fitting:
affinity   sports   finance
age 50     0.5      0.8
age 20     0.9      0.2
male       0.6      0.5
Prediction: scores shown on the slide: 1.5, 0.7. Figure labels: C 100, D 1000.
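As a rough illustration of fitting affinities at a MAP estimate, the sketch below runs batch gradient descent on a logistic likelihood with a zero-mean Gaussian prior on the weights. It is not the authors' optimizer: the 0/1 target coding, the prior variance, the learning rate, the function names and the four toy events are all assumptions chosen to keep the example small.

```python
import math

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def score(W, x, z):
    """Bilinear score: sum over a, b of z[a] * x[b] * W[a][b]."""
    return sum(z[a] * x[b] * W[a][b]
               for a in range(len(z)) for b in range(len(x)))

def fit_map(events, C, D, prior_var=10.0, lr=0.5, epochs=200):
    """Minimize the negative log posterior by batch gradient descent.

    events: list of (x, z, y) with user features x (length D),
    item features z (length C) and a click label y in {0, 1}.
    Returns W as a C x D nested list.
    """
    W = [[0.0] * D for _ in range(C)]
    n = len(events)
    for _ in range(epochs):
        # Gradient of the Gaussian prior term ||W||^2 / (2 * prior_var).
        grad = [[W[a][b] / prior_var for b in range(D)] for a in range(C)]
        # Gradient of the logistic loss: (predicted probability - label) * z_a * x_b.
        for x, z, y in events:
            err = sigmoid(score(W, x, z)) - y
            for a in range(C):
                for b in range(D):
                    grad[a][b] += err * z[a] * x[b]
        for a in range(C):
            for b in range(D):
                W[a][b] -= lr * grad[a][b] / n
    return W

# Hypothetical toy data: one-hot user groups and one-hot article categories.
young, older = [1.0, 0.0], [0.0, 1.0]
sports, finance = [1.0, 0.0], [0.0, 1.0]
events = [
    (young, sports, 1), (young, finance, 0),
    (older, sports, 0), (older, finance, 1),
]
W = fit_map(events, C=2, D=2)
print(score(W, young, sports) > score(W, young, finance))
```

The prior keeps the weights finite even though this toy data is perfectly separable; without it, plain maximum likelihood would push the affinities toward infinity.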

10 Case Study. Application: Front Page Today Module
Data collection
– Random serving policy
– Temporal partition
– About 40 million events for training
– About 5 million distinct users
– Test events (about 0.6 million story clicks)
Offline performance metric
– Click Portion: the fraction of test clicks at rank position r

11 Case Study. Application: Front Page Today Module (same content as slide 10, with an example)
Example: Click Rank 2 at the moment of the click event in test. Figure labels: Like, Dislike.

12 Case Study. Application: Front Page Today Module (same content as slide 10, with an example)
Example: Click Rank 1 at the moment of the click event in test. Figure labels: Like, Dislike.
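The Click Portion metric from slides 10 to 12 can be sketched as follows: for each test click, rank the candidate articles by model score at the moment of the click, record the rank of the clicked article, then report the fraction of test clicks landing at each rank. The event layout, the function names and the toy scores are assumptions for illustration, and ties are broken by insertion order here, which the slides do not specify.

```python
from collections import Counter

def click_rank(scores, clicked):
    """1-based rank of the clicked article when candidates are sorted by descending score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked.index(clicked) + 1

def click_portion(test_events, max_rank=4):
    """Fraction of test clicks whose clicked article was ranked at each position r.

    test_events: list of (scores_by_article, clicked_article) pairs, where the
    scores are the model's scores for the articles available at click time.
    """
    counts = Counter(click_rank(scores, clicked) for scores, clicked in test_events)
    total = len(test_events)
    return {r: counts[r] / total for r in range(1, max_rank + 1)}

# Hypothetical test clicks over a pool of three articles.
test_events = [
    ({"A": 0.9, "B": 0.5, "C": 0.1}, "A"),  # clicked article ranked 1st
    ({"A": 0.9, "B": 0.5, "C": 0.1}, "B"),  # clicked article ranked 2nd
    ({"A": 0.2, "B": 0.7, "C": 0.4}, "B"),  # clicked article ranked 1st
    ({"A": 0.3, "B": 0.1, "C": 0.6}, "C"),  # clicked article ranked 1st
]
portion = click_portion(test_events)
print(portion)
```

A better model concentrates more of the clicks at rank 1, so a higher Click Portion at small r indicates a better ranking.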

13 Case Study
Baseline: select the article with the highest CTR (EMP)
– One-size-fits-all approach by online CTR tracking (Agarwal et al. NIPS 2009; Agarwal et al. WWW 2009)
Segmentation
– Age/gender-based segmentation with 6 clusters (GM)
– Conjoint analysis with 5 clusters (SEG5) (Chu et al. KDD 2009)
Collaborative filtering
– Item-based collaborative filtering (IBCF)
– Content-based filtering (CB)
– Hybrid CB with CTR (CB+EMP)
Feature-based personalized models
– Bilinear regression (RG)
– Logistic bilinear regression (LRG)
– LRG without article CTR feature (LRG-CTR)
Related talk: "Matchbox: Large Scale Bayesian Recommendations", Stern, Herbrich and Graepel (WWW 2009), Microsoft Research. Thursday, XL-2, Statistical Methods.

14 Case Study
[Figure: lift over the baseline EMP. Legend entries: EMP, one-size-fits-all; SEG5, tensor conjoint analysis with 5 clusters; CB+EMP; logistic bilinear models.]

15 Case Study
A utility function summarizing overall performance at the top 4 positions, defined in terms of the Click Portion at rank r. [Formula not recovered in the transcript.]

16 Bucket Test
Lift on the offline metric (click portion) for three segmentation models
Gender
– male, female, unknown
AgeGender
– 11 segments
Tensor-5 (SEG5)
– 5 clusters
Method: Tensor-5 > AgeGender > Gender
Lift at rank 1: 0.08 > 0.065 > 0.055

17 Bucket Test
Method: Tensor-5 > AgeGender > Gender
Lift on offline metric: 8% > 6.5% > 5.5%
Lift in online bucket test: 3.24% > 2.45% > 1.49%
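The slides report lift without defining it. One common reading, assumed here, is the relative improvement of a model's metric over the baseline's. The click-portion values below are hypothetical and were picked only so that the result lands near the 8% figure quoted for Tensor-5.

```python
def relative_lift(model_value, baseline_value):
    """Relative improvement over the baseline: (model - baseline) / baseline.

    This definition is an assumption; the slides do not state the formula.
    """
    if baseline_value <= 0:
        raise ValueError("baseline_value must be positive")
    return (model_value - baseline_value) / baseline_value

# Hypothetical click portions at rank 1 for a baseline and a segmented model.
baseline_cp1 = 0.25
model_cp1 = 0.27
lift = relative_lift(model_cp1, baseline_cp1)
print(f"{lift:.1%}")
```

Under this reading, the ordering of the three segmentation models is the same offline and online even though the online lifts are smaller, which is the point slide 17 makes.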

18 Summary
Feature-based bilinear regression models for personalized recommendation in cold-start situations on dynamic content.
The affinities between user attributes and content features are optimized by learning from historical user feedback.
The models alleviate cold-start difficulties by leveraging the information available on both the user side and the item side.
They significantly outperform six competitive approaches at the global, segmented, or individual level on an offline metric.
We validated our offline metric with a bucket test on segmentation models.

19 Acknowledgment
We thank our colleagues: Raghu Ramakrishnan, Scott Roy, Deepak Agarwal, Bee-Chung Chen, Pradheep Elango, Ajoy Sojan, Todd Beaupre, Nitin Motgi, Amit Phadke, Seinjuti Chakraborty, Joe Zachariah.

20 Questions?

