Personalizing Web Page Recommendation via Collaborative Filtering and Topic-Aware Markov Model
Qingyan Yang, Ju Fan, Jianyong Wang, Lizhu Zhou
Database Research Group, DCS&T, Tsinghua University
Agenda
▪ Motivation
▪ Recommender framework
▪ Experimental evaluation
▪ Conclusions
5/24/2015, DB Group, DCS&T, Tsinghua University
Motivation
The Web is growing explosively
▪ By the end of 2009 (source: the 25th Internet Report, 2010)
◦ 33,600,000,000 Web pages in China
◦ Twice as many as in 2003
Finding desired information is becoming more difficult
▪ Users often wander aimlessly on the Web without reaching pages that interest them
▪ Or spend a long time finding the information they expect
Web page recommendation
Objective
▪ To understand users' navigation behavior
▪ To recommend pages that match a user's interests at a given time
Existing popular solutions
▪ Markov model and its variants
▪ Temporal relation is important
Example: if the browsing sequence is "A B C … A B C … A B C", then C is recommended when A and B are visited one after another.
Limitations
No personalized recommendations
▪ All users receive the same results
Topic information of pages is neglected
▪ Two pages visited in sequence may differ greatly in topic
PIGEON: our solution
Personalized Web page recommendation
Two novel features
▪ Personalization
◦ Meet the preferences of different users
[Figure: an example page labeled "I am a blog about finance"]
PIGEON: our solution
Two novel features
▪ Personalization
▪ Topical coherence
◦ Stay relevant to the user's current task
Recommender framework
Data representation
Jump relation

Time        User ID   IP address   Target   Source
(09:44:44)  (0e0c…)   ( )          A        ()
(09:44:58)  (0e0c…)   ( )          B        A
(10:14:29)  (0e0c…)   ( )          G        A

Navigation graph
▪ Node: Web page
▪ Edge: jump relation
▪ Weight: relation frequency
[Figure: navigation graph over pages A to M]
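The step from jump records to a weighted navigation graph can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_navigation_graph` and the record layout (source, target) are assumptions, and the fourth record is a hypothetical repeat added only to show a weight greater than 1.

```python
from collections import Counter

def build_navigation_graph(records):
    """Count how often each (source, target) jump occurs.

    `records` is an iterable of (source, target) pairs. An empty source
    means the page was opened directly, so it contributes no edge.
    """
    weights = Counter()
    for source, target in records:
        if source:
            weights[(source, target)] += 1
    return weights

# The three rows from the slide, plus one hypothetical repeated jump A -> B.
records = [("", "A"), ("A", "B"), ("A", "G"), ("A", "B")]
graph = build_navigation_graph(records)
```

Here `graph[("A", "B")]` is the edge weight, that is, the frequency of the jump from A to B.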
Topic discovery
Basic idea
▪ We assume that pages with similar URLs, or pages involved in jump relations, are topically relevant.
URL features
▪ Keywords, e.g., dblp.uni-trier.de/db/index.html
▪ Expanded by manifold-based keyword propagation
Web page clustering
▪ Each cluster represents one topic
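As a rough sketch of the keyword feature only, a URL can be split into tokens on non-alphanumeric characters. The tokenization rule and the name `url_keywords` are assumptions made for illustration; the slides do not say how keywords are extracted, and the manifold-based propagation and clustering steps are not shown here.

```python
import re

def url_keywords(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [token for token in re.split(r"[^A-Za-z0-9]+", url.lower()) if token]

keywords = url_keywords("dblp.uni-trier.de/db/index.html")
```

A real system would likely drop uninformative tokens such as `index` or `html` before using the remaining keywords as features.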
Example
[Figure: graph over pages A to M]
Topic-Aware Markov Model
▪ Take n-grams as states, e.g., n = 2
▪ Web page preference score
◦ Maximum likelihood estimation
◦ e.g., P(D|BC) = f(BCD)/f(BC) = 1/2
[Figure: session A B C D B C A. Temporal states: AB, BC, CD, DB, CA. Topic subsequences: A C C A; B D B. Topical states: AC, CC, CA, BD, DB.]
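The maximum likelihood estimate over temporal states can be reproduced with plain n-gram counting. This is a minimal sketch under the assumption that f(·) is the number of contiguous occurrences in the session; the function names are illustrative, and topical states are not modeled here.

```python
from collections import Counter

def ngram_counts(sequence, n):
    """Count every contiguous n-gram in a page sequence."""
    return Counter(tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1))

def transition_probability(sequence, state, next_page):
    """Estimate P(next_page | state) as f(state + next_page) / f(state)."""
    state = tuple(state)
    state_count = ngram_counts(sequence, len(state))[state]
    if state_count == 0:
        return 0.0
    joint_count = ngram_counts(sequence, len(state) + 1)[state + (next_page,)]
    return joint_count / state_count

session = list("ABCDBCA")
p = transition_probability(session, "BC", "D")
```

In the session A B C D B C A, the state BC occurs twice and is followed by D once, so `p` matches the 1/2 on the slide.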
Personalized Recommender
Collaborative filtering
▪ Basic idea
[Figure labels: user similarities, Web page preference]
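One common way to combine user similarities with page preference is a similarity-weighted average over other users. The sketch below uses that generic user-based collaborative filtering rule as an assumption; the slides do not give PIGEON's exact formula, and all names and numbers are hypothetical.

```python
def personalized_score(user, page, similarity, preference):
    """Similarity-weighted average of other users' preference for a page."""
    weighted_sum = 0.0
    total_similarity = 0.0
    for other, pages in preference.items():
        if other == user:
            continue
        sim = similarity.get((user, other), 0.0)
        weighted_sum += sim * pages.get(page, 0.0)
        total_similarity += sim
    return weighted_sum / total_similarity if total_similarity else 0.0

# Hypothetical data: u1 is more similar to u2 than to u3.
similarity = {("u1", "u2"): 0.75, ("u1", "u3"): 0.25}
preference = {"u2": {"D": 1.0}, "u3": {"D": 0.0, "A": 1.0}}
score_d = personalized_score("u1", "D", similarity, preference)
score_a = personalized_score("u1", "A", similarity, preference)
```

Because u2 is the closer neighbor, page D (preferred by u2) scores higher for u1 than page A (preferred by u3).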
User Similarity
User profile
▪ A set of topics
Similarity measurement
▪ Topic similarity
▪ Maximum weight matching
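A small sketch of similarity via maximum weight matching between two topic sets follows. Several choices here are assumptions rather than details from the slides: topics are represented as keyword sets, topic similarity is Jaccard, and the matching weight is normalized by the larger profile size. The matching is found by exhaustive search, which is only practical for small profiles; a polynomial-time assignment solver would replace it at scale.

```python
from itertools import permutations

def jaccard(a, b):
    """Jaccard similarity of two keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def user_similarity(topics_u, topics_v):
    """Best one-to-one topic matching, normalized by the larger profile size."""
    if not topics_u or not topics_v:
        return 0.0
    small, large = sorted((topics_u, topics_v), key=len)
    # Try every assignment of the smaller profile's topics to distinct
    # topics of the larger profile and keep the heaviest one.
    best = max(
        sum(jaccard(s, l) for s, l in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )
    return best / len(large)

# Hypothetical profiles: each topic is a set of keywords.
u = [{"finance", "stock"}, {"db", "index"}]
v = [{"db", "index", "paper"}, {"finance", "stock"}, {"news"}]
sim = user_similarity(u, v)
```

The best matching pairs the two finance topics (weight 1) and the two database topics (weight 2/3), leaving the news topic unmatched.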
Experiment settings
Data set
▪ 1,402,371 records of 375 users in 34 days
▪ First 30 days for training, last 4 days for testing
Metrics are precision and recall
Comparative methods

Method     Temporal   Topical   Personalized
Baseline   Y
TAMM       Y          Y
PIGEON     Y          Y         Y
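For reference, precision and recall of a recommendation list can be computed as below. This uses the standard set-based definitions as an assumption; the slides do not state how recommendations are matched against the test-period visits, and the page lists are hypothetical.

```python
def precision_recall(recommended, visited):
    """Precision and recall of a recommendation list against visited pages."""
    recommended, visited = set(recommended), set(visited)
    hits = len(recommended & visited)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(visited) if visited else 0.0
    return precision, recall

# Hypothetical example: 4 recommended pages, 3 pages actually visited.
precision, recall = precision_recall(["A", "B", "C", "D"], ["B", "D", "E"])
```

Two of the four recommended pages were visited, and two of the three visited pages were recommended.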
Experimental evaluation
[Figure: results for the 1st-order model and the 2nd-order model]
Conclusions
▪ By taking user similarities into account, we can recommend Web pages that meet different users' preferences.
▪ We discover the topics users are interested in with an effective graph-based clustering algorithm.
▪ We devise a topic-aware Markov model to learn navigation patterns that contribute to topically coherent recommendations.
THANKS