Learning user preferences for 2CP-regression for a recommender system
Alan Eckhardt, Peter Vojtáš
Department of Software Engineering, Charles University in Prague, Czech Republic

SOFSEM 2010, Špindlerův Mlýn, Czech Republic

Outline
- Motivation
- User model
- Peak and 2CP
- Experiments
- Conclusion and future work

User preference learning
- Helping the user to find what she is looking for, e.g. notebooks
- Only a small amount of information is required from the user
  - Ratings of notebooks, ...
- Construction of a general user preference model
  - Each user has his or her own preference model
- Recommendation of the top k notebooks to the user
  - Those that the preference model ranks as the most preferred for the user

User preference learning: recommendation process
- Initial set
  - Centers of clusters of objects
- Construction of the user model
- Recommendation
- More iterations are possible
  - In each iteration the user model is refined

Two-step user model
User model learning is divided into two steps:
1. Local preferences: normalization of the attribute values of notebooks to their preference degrees
   - Transforms the space into [0,1]^N
2. Global preferences: aggregation of the preference degrees of attribute values into the predicted rating

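The two steps can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the attribute names, the value ranges used for normalization, the fixed weights, and the choice of a weighted average as the aggregation. The slides do not specify any of these.

```python
# Step 1: local preferences map each raw attribute value to a degree in [0, 1].
# The ranges below (500..3000 for price, 0..16 GB for RAM) are assumed, not from the talk.
def price_pref(price):
    """Cheaper is better: 1.0 at 500 or less, 0.0 at 3000 or more."""
    return min(1.0, max(0.0, (3000.0 - price) / 2500.0))


def ram_pref(ram_gb):
    """More RAM is better: 0.0 at 0 GB, 1.0 at 16 GB or more."""
    return min(1.0, max(0.0, ram_gb / 16.0))


LOCAL = {"price": price_pref, "ram": ram_pref}

# Step 2: global preferences aggregate the degrees into one predicted rating.
# A weighted average with hand-picked weights stands in for a learned aggregation.
WEIGHTS = {"price": 2.0, "ram": 1.0}


def predict_rating(notebook):
    degrees = {attr: f(notebook[attr]) for attr, f in LOCAL.items()}
    total = sum(WEIGHTS.values())
    return sum(WEIGHTS[attr] * degrees[attr] for attr in degrees) / total
```

For example, `predict_rating({"price": 1750, "ram": 8})` combines two local degrees of 0.5 into a rating of 0.5.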
User model
- Fuzzy sets
  - Normalize the space to the monotone space [0,1]^N
- Define the Pareto front
  - Set of incomparable objects
  - Candidates for the best object
- (1,…,1) is the best object

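Once every attribute is normalized so that larger degrees are better, the Pareto front can be computed by a simple dominance check. The sketch below assumes objects are tuples of preference degrees in [0,1]; the function names are hypothetical.

```python
def dominates(a, b):
    """a dominates b if a is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(objects):
    """Return the objects that no other object dominates (quadratic-time sketch)."""
    return [o for o in objects if not any(dominates(p, o) for p in objects)]


objects = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
front = pareto_front(objects)
```

Here `front` contains the three mutually incomparable objects; `(0.4, 0.4)` and `(0.1, 0.1)` are dominated by `(0.5, 0.5)`.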
User model
- Aggregation
  - Resolves the best object from the Pareto front
  - The second best object may not be on the Pareto front
- Two methods: Statistical and Instances

Normalization of numerical attributes
- Linear regression
  - Preference of the smallest or the largest value
- Quadratic regression
  - Can detect ideal values, but often fails in experiments

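A minimal sketch of the linear variant: fit a least-squares line from attribute values to ratings and clip the prediction to [0,1]. The sign of the slope shows whether small or large values are preferred. The sample prices and ratings are made up, and ratings are assumed to be already scaled to [0,1].

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct attribute values")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def linear_preference(value, slope, intercept):
    """Local preference degree of an attribute value, clipped to [0, 1]."""
    return min(1.0, max(0.0, slope * value + intercept))


# Made-up training data: this user rates cheaper notebooks higher.
prices = [800.0, 1200.0, 1600.0, 2000.0]
ratings = [0.9, 0.7, 0.5, 0.3]
slope, intercept = fit_line(prices, ratings)
```

A negative `slope` means the smallest value is the most preferred. A line cannot express an ideal value in the middle of the range, which is the gap the quadratic fit tries to fill.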
CP regression
- Preference dependence between attributes
  - This is not a dependence in the dataset (e.g. the resolution of the display influences the price)
  - It is the influence of the value of attribute A1 on the preference of attribute A2
  - E.g. the value of the producer of a notebook (IBM) influences the preference of the price of the notebook (for IBM, the ideal price is $2200)

Peak
- Motivation
  - Users often prefer one particular value of an attribute
- Finding the peak value
  - Traversing the training set, which is small
  - Testing the error of linear regressions on both sides of the peak
- We know exactly which value is the most preferred
  - Useful for visual representation

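One plausible reading of the peak search, as a sketch: try each attribute value in the training set as a candidate peak, fit one line to the points on its left and another to the points on its right, and keep the candidate with the smallest total squared error. The details (squared error, including the peak point on both sides, tie-breaking by first candidate) are assumptions, not taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares line; falls back to a constant when all x values coincide."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def squared_error(points):
    """Sum of squared residuals of the best line through the given (x, y) points."""
    xs, ys = zip(*points)
    slope, intercept = fit_line(xs, ys)
    return sum((slope * x + intercept - y) ** 2 for x, y in points)


def find_peak(values, ratings):
    """Return the training value that minimizes the error of the two-sided fit."""
    points = sorted(zip(values, ratings))
    best_peak, best_err = None, float("inf")
    for peak in sorted(set(values)):
        left = [p for p in points if p[0] <= peak]
        right = [p for p in points if p[0] >= peak]
        err = squared_error(left) + squared_error(right)
        if err < best_err:
            best_peak, best_err = peak, err
    return best_peak


# Made-up ratings that rise up to the value 3 and fall afterwards.
peak = find_peak([1, 2, 3, 4, 5], [0.2, 0.6, 1.0, 0.5, 0.0])
```

Because only the values present in the training set are tried, the search stays cheap for the small training sets mentioned above, and the result is an actual attribute value that can be shown to the user.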
CP regression + Peak
- Dependence of the preference of price on the value of manufacturer
  - ACER => high price
  - ASUS => lower price

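The dependence can be sketched by learning a separate ideal price for each manufacturer. This is a deliberately simplified stand-in: the ideal price is taken as the price of the top-rated training notebook of that manufacturer rather than a fitted peak, the preference is a tent function with an assumed `spread`, and the training triples are made up.

```python
def learn_ideal_prices(training):
    """training: list of (producer, price, rating). Returns producer -> ideal price."""
    best = {}
    for producer, price, rating in training:
        if producer not in best or rating > best[producer][1]:
            best[producer] = (price, rating)
    return {producer: price for producer, (price, _) in best.items()}


def price_preference(producer, price, ideal, spread=1000.0):
    """Tent-shaped preference in [0, 1], centered at the producer's ideal price.

    `spread` (distance at which the preference reaches 0) is an assumed parameter.
    """
    if producer not in ideal:
        return 0.5  # assumption: neutral degree for producers never seen in training
    return max(0.0, 1.0 - abs(price - ideal[producer]) / spread)


# Made-up training data matching the slide: ACER => high price, ASUS => lower price.
training = [
    ("ACER", 2500.0, 0.9),
    ("ACER", 1200.0, 0.3),
    ("ASUS", 900.0, 0.8),
    ("ASUS", 2000.0, 0.2),
]
ideal = learn_ideal_prices(training)
```

The same price of 2500 then gets the degree 1.0 for an ACER notebook and 0.0 for an ASUS notebook, which a single price model shared by all manufacturers could not express.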
Experiment settings
- Dataset of 200 notebooks
- Artificial user preferences
  - The preference of price was dependent on the value of producer
- Training sets of sizes 2-60
  - The rest of the dataset was used as the testing set
- Error measures
  - RMSE
  - Kendall coefficient

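Both error measures are short to compute. The sketch below assumes the Kendall coefficient is the tau-a variant (concordant minus discordant pairs over all pairs, with tied pairs counted as neither); the slides do not say which variant was used.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def kendall_tau(a, b):
    """Kendall tau-a rank correlation of two equally long rating lists."""
    n = len(a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (a[i] - a[j]) * (b[i] - b[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

RMSE penalizes errors in the predicted rating values, while the Kendall coefficient only looks at whether pairs of notebooks are ordered the same way, which is what matters for a top-k recommendation.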
Experiment settings
- Tested methods
  - Support Vector Machines from Weka
  - Mean: returns the average rating from the training set
  - Instances: classification, uses objects from the training set as boundaries on rating
  - Statistical: weighted average with learned weights
  - 2CP
- Both Instances and Statistical can use local preference normalization: Linear, Quadratic, Peak
- 2CP serves to find the relation between the preference of an attribute value and the value of another attribute

Experiment results

Conclusion
- Proposal of the Peak method
- Combination with 2CP
- Experimental evaluation with very good results
  - Using a rank correlation measure

Future work
- nCP-regression
- Clustering of similar values for better robustness
- Degree of relation between two attributes