Fusing Rating-based and Hitting-based Algorithms in Recommender Systems Xin Xin 2009-10-23
Outline: Motivation; Our Approach; Experiments
Recommender System: X: observations; avg. ratings of the same age and gender; avg. ratings of the same genres; trust relations
Rating-based vs. Hitting-based Recommendation. Rating-based: whether the user would give the recommended items high ratings. Hitting-based: how many of the recommended items the user will hit in the future.
Limitations
Item Missing Problem
  Hitting-based methods miss D: accumulating a large number of small effects can yield great potential.
  Rating-based methods miss A: these items directly reflect companies' sales.
Item Improper Problem
  Hitting-based: #A = #B (ideal: #B >> #A); users suffer from low-quality items.
  Rating-based: #D > #B (ideal: #B >> #D); users will not be interested in visiting the recommended results.
Motivation: Fusing Rating-based and Hitting-based Recommendation
State-of-the-art rating-based methods
  EigenRank (random walk theory): models both preference order and ratings; outperforms other methods
State-of-the-art hitting-based methods
  Co-occurrence-based methods (relational feature)
  Hitting-frequency-based methods (local feature)
Challenge: how to combine them?
  Linear integration: an unnatural combination of incompatible feature function values, e.g. 0.003 (distribution in EigenRank) * a + 8000 (hitting freq.) * b?
  Rank combination: loses quantity information
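The challenge above can be made concrete with a small sketch. All numbers, weights, and function names below are made up for illustration; only the 0.003 and 8000 magnitudes echo the slide. With equal weights, the linear sum is dominated by the hit counts, while converting to ranks discards how far apart the values are.

```python
def ranks(values):
    """Return rank positions (1 = largest value); ties broken by index."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

# Hypothetical per-item scores on two incompatible scales.
probs = [0.003, 0.002, 0.001]   # stationary-probability-like values
freqs = [10, 8000, 7990]        # raw hit counts

# Linear integration with illustrative weights a = b = 0.5.
a, b = 0.5, 0.5
linear = [a * p + b * f for p, f in zip(probs, freqs)]
linear_order = sorted(range(len(linear)), key=lambda i: -linear[i])
freq_order = sorted(range(len(freqs)), key=lambda i: -freqs[i])

# Rank combination input: the gap 8000 vs 7990 and the gap 7990 vs 10
# both collapse to a difference of one rank position.
freq_ranks = ranks(freqs)

print(linear_order, freq_order, freq_ranks)
```

Here the linear ordering coincides with the ordering by hit count alone, so the probability feature has no say unless the weights are tuned across several orders of magnitude.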
Our Contribution
  We propose a new combination model that extends the random walk in EigenRank from a discrete-time Markov process (DMP) to a continuous-time Markov process (CMP).
  By employing queueing theory, the combination has an intuitive interpretation, and the feature functions are combined naturally without losing quantity information.
  Experimental results show better accuracy than other combination methods.
Outline: Motivation; Our Approach (EigenRank, Co-occurrence Combination, Hitting Frequency Combination, Algorithms); Experiments
EigenRank Stationary distribution:
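The equation for this slide did not survive extraction. As a generic illustration only, and not EigenRank's specific transition matrix, the stationary distribution of a row-stochastic matrix P is the vector pi satisfying pi = pi P, and it can be approximated by power iteration. The function name, tolerance, and the 3-item matrix below are assumptions made for the example.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi P for a row-stochastic matrix P.

    Assumes the chain is irreducible and aperiodic, so the iteration
    converges to a unique distribution.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical 3-item transition matrix (each row sums to 1).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P)
print(pi)
```

For this toy matrix the balance equations give pi = (0.25, 0.5, 0.25), which the iteration reproduces.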
Co-occurrence Combination: items that frequently co-occurred in the past are also likely to appear together in the future.
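A minimal sketch of the counting step behind this intuition, assuming each user's history is a plain list of item IDs. How the deck feeds co-occurrence into the random walk is not shown in the extracted text, so the row normalization below is only one plausible way to turn counts into transition-like weights; the function names and toy histories are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_counts(histories):
    """Count, for each ordered item pair, how many histories contain both."""
    counts = defaultdict(int)
    for items in histories:
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def row_normalize(counts, items):
    """Turn counts into per-item weights that sum to 1 over the other items."""
    probs = {}
    for a in items:
        total = sum(counts.get((a, b), 0) for b in items if b != a)
        if total == 0:
            continue  # item never co-occurs with anything; leave its row empty
        for b in items:
            if b != a:
                probs[(a, b)] = counts.get((a, b), 0) / total
    return probs

# Hypothetical hit histories of three users.
histories = [["A", "B"], ["A", "B", "C"], ["B", "C"]]
counts = cooccurrence_counts(histories)
probs = row_normalize(counts, ["A", "B", "C"])
print(probs)
```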
Hitting Frequency Combination: popular items are likely to interest users.
Hitting Frequency Formulation: 1) customers' arrivals follow a time-homogeneous Poisson process; 2) service times follow an exponential distribution with the same service rate u; 3) waiting time of a customer, conditioned on there being a queue:
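The expression for item 3) is missing from the extracted slide and is not reconstructed here. As a hedged sketch under the standard M/M/1 reading of assumptions 1) and 2): with arrival rate lam and service rate mu (u on the slide), the waiting time given that a customer has to queue is exponential with rate mu - lam, so its mean is 1 / (mu - lam). The second function shows a textbook fact about continuous-time Markov processes: the stationary distribution is proportional to the embedded discrete-time stationary probability times the mean holding time of each state. Using the conditional wait as the holding time is an assumption made for this illustration, and all names and numbers are hypothetical.

```python
def mean_conditional_wait(arrival_rate, service_rate):
    """Mean M/M/1 waiting time given that the customer must wait."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate for a stable queue")
    return 1.0 / (service_rate - arrival_rate)

def continuous_time_stationary(pi, mean_holding_times):
    """Reweight a discrete-time stationary distribution by holding times."""
    weighted = [p * t for p, t in zip(pi, mean_holding_times)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical inputs: a discrete-time stationary distribution over three
# items, per-item hit (arrival) rates, and a shared service rate.
pi = [0.25, 0.5, 0.25]
arrival_rates = [1.0, 2.0, 3.0]
mu = 4.0

holding = [mean_conditional_wait(lam, mu) for lam in arrival_rates]
fused = continuous_time_stationary(pi, holding)
print(fused)
```

In this toy case the third item, which has the highest hit rate, rises from 0.25 to 3/7 while the magnitudes of the hit rates are still used rather than only their order.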
Algorithm Complexity: 1) building the probability transition matrix: O(number of users); 2) calculating the stationary distribution: O(number of items).
Experimental Setup: datasets, metric, protocol. Datasets: MovieLens, Netflix. Protocol: MovieLens: Given 5, Given 10, Given 15; Netflix: Given 5, Given 10, Given 20.
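"Given N" is commonly read as: for each test user, N ratings are revealed to the algorithm and the remaining ratings are held out for evaluation. The slide does not spell out how users are selected or how ties are handled, so the sketch below is an assumption-laden illustration; the function name, the fixed seed, and the rule of skipping users with N or fewer ratings are all hypothetical choices.

```python
import random

def given_n_split(user_ratings, n, seed=0):
    """Split each user's ratings into n observed ratings and a held-out rest."""
    rng = random.Random(seed)
    observed, held_out = {}, {}
    for user, ratings in user_ratings.items():
        if len(ratings) <= n:
            continue  # assumption: users with too few ratings are skipped
        items = sorted(ratings)   # fixed order so the shuffle is reproducible
        rng.shuffle(items)
        observed[user] = {item: ratings[item] for item in items[:n]}
        held_out[user] = {item: ratings[item] for item in items[n:]}
    return observed, held_out

# Hypothetical toy data: user -> {item: rating}.
data = {
    "u1": {f"m{i}": (i % 5) + 1 for i in range(8)},
    "u2": {"m0": 4, "m1": 5, "m2": 3},
}
observed, held_out = given_n_split(data, n=5)
print(len(observed["u1"]), len(held_out["u1"]))
```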
Empirical Study of Traditional Methods on Multiple Metrics
Impact of Parameters
Overall Performance
Distribution of Recommended Results
More Detailed Results
Thank you very much xxin@cse.cuhk.edu.hk