The Effect of Dimensionality Reduction in Recommendation Systems
Juntae Kim
Department of Computer Engineering, Dongguk University
Contents
- Introduction
- Collaborative Recommendation
- Data Sparseness Problem
- Dimensionality Reduction Using SVD
- An Example
- Experiments
- Conclusion
Introduction
- e-CRM and recommendation systems
  - Provide personalized service
  - Enhance sales through product recommendation, targeted advertisement, etc.
- [Diagram] The recommendation system takes demographic features, item features, customer purchase history, and sales history as input and recommends items to the customer
Introduction
- Item-to-item similarity, content-based: a customer likes item A; item B has similar content to A, so B is recommended
- Item-to-item similarity, association: a customer likes item A; item B is highly correlated with A, so B is recommended
Introduction
- People-to-people similarity, demographic: users A and B have similar demographic features; A likes item C, so C is recommended to B
- People-to-people similarity, collaborative: users A and B have highly correlated preferences; A likes item C, so C is recommended to B
Collaborative Method
- Advantages
  - No content analysis is needed: items whose content is hard to analyze (e.g., movies, music) can still be recommended
  - No demographic user information is needed
  - High precision
- Method
  - Find users similar to the target user
  - Predict the target user's preferences from those similar users' preferences
Collaborative Method
- Computing similarity: Pearson correlation coefficient, in [-1, 1]

    w_{a,u} = Σ_i (r_{a,i} − r̄_a)(r_{u,i} − r̄_u) / sqrt( Σ_i (r_{a,i} − r̄_a)² · Σ_i (r_{u,i} − r̄_u)² )

  where r_{a,i} is the rating of user a for item i and r̄_a is user a's average rating
- Example (ratings and their deviations from each user's average)
  - User a: (1, 8, 9) → (−5, +2, +3)
  - User b: (2, 9, 7) → (−4, +3, +1)
  - User c: (9, 3, 3) → (+4, −2, −2)
  - User a is similar to user b (their deviations agree); user c's deviations point the opposite way
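A minimal sketch of the Pearson similarity above (assuming NumPy and that the two users rated the same items), checked against the slide's example:

    import numpy as np

    def pearson_similarity(ra, ru):
        # Pearson correlation between two users' rating vectors (co-rated items only)
        da = ra - ra.mean()
        du = ru - ru.mean()
        denom = np.sqrt((da ** 2).sum() * (du ** 2).sum())
        return float((da * du).sum() / denom) if denom else 0.0

    a = np.array([1.0, 8.0, 9.0])
    b = np.array([2.0, 9.0, 7.0])
    c = np.array([9.0, 3.0, 3.0])

    print(pearson_similarity(a, b))   # ≈ +0.92: a and b are similar
    print(pearson_similarity(a, c))   # ≈ −0.99: a and c have opposite tastes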
Collaborative Method
- Prediction of preferences: weighted sum of similar users' deviations from their own averages

    p_{a,i} = r̄_a + Σ_u w_{a,u} (r_{u,i} − r̄_u)

  where w_{a,u} is the similarity between users a and u
- Example
  - Average rating of user a: 5
  - User b: (2, 8, 8), w_{a,b} = 0.5; User c: (4, 4, 7), w_{a,c} = 0.1
  - Predicted preferences of user a = (5, 5, 5) + (−4, 2, 2)·0.5 + (−1, −1, 2)·0.1 = (2.9, 5.9, 6.2)
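Continuing the sketch, the weighted-sum prediction above, reproducing the slide's numbers (normalization by the sum of weights is omitted, as in the example):

    import numpy as np

    def predict(avg_a, neighbors):
        # User a's average plus similarity-weighted deviations of each
        # neighbor from that neighbor's own average (unnormalized)
        pred = np.full(len(neighbors[0][0]), float(avg_a))
        for ratings, weight in neighbors:
            r = np.asarray(ratings, dtype=float)
            pred += weight * (r - r.mean())
        return pred

    # Slide example: user a's average is 5, neighbors b and c
    neighbors = [((2, 8, 8), 0.5), ((4, 4, 7), 0.1)]
    print(predict(5, neighbors))   # -> [2.9 5.9 6.2]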
Data Sparseness Problem
- [Table] Example user-item data
Data Sparseness Problem
- Explicit ratings are usually not available
  - Available data are implicit: purchases, clicks, etc., recorded as 0 or 1
  - Pearson correlation is not appropriate on such data (it carries no negative preference information) → use cosine similarity
Data Sparseness Problem
- Available data are usually very sparse
  - A user buys only 2-3 items out of thousands of items
  - Two users often share no items, so cosine similarity cannot be computed meaningfully (see the sketch below) → reduce the dimensionality
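A minimal sketch of cosine similarity on binary purchase vectors (NumPy; the catalogue size and purchase vectors are made up for illustration), showing how the signal disappears when two sparse users share no items:

    import numpy as np

    def cosine_similarity(x, y):
        # Cosine similarity of two binary purchase vectors; 0 if either is empty
        denom = np.linalg.norm(x) * np.linalg.norm(y)
        return float(x @ y / denom) if denom else 0.0

    n_items = 1000                                   # hypothetical catalogue size
    u = np.zeros(n_items); u[[3, 17, 256]] = 1       # user u bought 3 items
    v = np.zeros(n_items); v[[17, 256, 801]] = 1     # overlaps with u on 2 items
    w = np.zeros(n_items); w[[5, 42]] = 1            # no overlap with u

    print(cosine_similarity(u, v))   # ≈ 0.67: usable similarity
    print(cosine_similarity(u, w))   # 0.0: no shared items, no usable signal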
Dimensionality Reduction
- Heuristic: use category information
  - Represent the user preference vector over item categories instead of individual items
  - e.g., Monsters, Inc., The Lion King, Pocahontas → animation; Halloween, Scream → horror
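A minimal sketch of the category heuristic (the item-to-category mapping and purchase data are made up for illustration): item-level preferences are summed into a much shorter category-level vector.

    import numpy as np

    categories = ["animation", "horror"]
    item_category = {"Monsters, Inc.": "animation", "The Lion King": "animation",
                     "Pocahontas": "animation", "Halloween": "horror", "Scream": "horror"}
    purchases = {"The Lion King": 1, "Scream": 1}    # one user's binary purchase data

    # Collapse the item-level vector into a category-level preference vector
    profile = np.zeros(len(categories))
    for item, value in purchases.items():
        profile[categories.index(item_category[item])] += value
    print(dict(zip(categories, profile)))            # {'animation': 1.0, 'horror': 1.0}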
Dimensionality Reduction
- Singular Value Decomposition (SVD): decompose the m×n user-item matrix A

    A_{m×n} = U_{m×m} S_{m×n} (V_{n×n})^T

  - S: diagonal matrix containing the singular values of A in descending order
  - U, V: orthogonal matrices
  - SVD rotates the axes of the n-dimensional space; the 1st axis runs along the direction of largest variation
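A minimal NumPy sketch of the decomposition above (the 4×5 rating matrix is made up just to exercise it):

    import numpy as np

    A = np.array([[5, 3, 0, 1, 0],
                  [4, 0, 0, 1, 1],
                  [1, 1, 0, 5, 4],
                  [0, 0, 5, 4, 4]], dtype=float)

    U, s, Vt = np.linalg.svd(A, full_matrices=True)   # A = U @ S @ Vt
    S = np.zeros(A.shape)
    np.fill_diagonal(S, s)                            # singular values on the diagonal, descending

    print(np.allclose(A, U @ S @ Vt))                 # True: exact reconstruction
    print(s)                                          # singular values in descending order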
Dimensionality Reduction
- [Figure] SVD example
Dimensionality Reduction
- Approximation of A: keep the k largest singular values

    A'_{m×n} = U_{m×k} S_{k×k} (V_{n×k})^T

- Computing user similarity

    A A^T = U S V^T (U S V^T)^T = U S V^T V S^T U^T = (U S)(U S)^T

  so user-user similarities can be computed from the rows of US
- Projection of A into the k-dimensional space

    A'_{m×n} V_{n×k} = U_{m×k} S_{k×k}
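Continuing the NumPy sketch, the rank-k truncation and the projection identity from this slide (k = 2 chosen arbitrarily):

    import numpy as np

    A = np.array([[5, 3, 0, 1, 0],          # same made-up rating matrix as before
                  [4, 0, 0, 1, 1],
                  [1, 1, 0, 5, 4],
                  [0, 0, 5, 4, 4]], dtype=float)
    k = 2

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, Sk, Vk = U[:, :k], np.diag(s[:k]), Vt[:k, :].T   # keep the k largest singular values

    A_k = Uk @ Sk @ Vk.T                   # rank-k approximation A'
    users_k = A_k @ Vk                     # projection of A' into k dimensions
    print(np.allclose(users_k, Uk @ Sk))   # True: A' V_k = U_k S_k

    print(users_k @ users_k.T)             # user-user similarities via (US)(US)^T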
An Example
- [Table] User-item matrix
- [Table] Reduction with k = 2
- [Table] User-user similarity
- [Figure] User vectors in the 2-D space (users u1-u6)
Experiments
- Dataset: MovieLens
  - 943 users, 1628 movies, ratings 1-5, 6.4% of the matrix rated
  - Ratings changed to 0/1, leaving 3.6% rated
- Setup
  - Compare plain collaborative filtering (CF) with reduced-dimension recommendation (SVD)
  - CF: 60 neighbors; SVD: rank 20
  - Sparseness further reduced to 2.0%, 1.0%, 0.5%
Experiments
- Metric: hit ratio
  - Remove 1 rating from each user as test data
  - Recommend 10 items to each user
  - If the held-out item is among the recommended items, count a hit

    Hit ratio = (total # of hits) / (total # of test data)

- Result
  - Sparseness 3.6%: SVD improves the hit ratio by x %
  - Sparseness 0.5%: SVD improves the hit ratio by x %
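A minimal sketch of the hit-ratio metric (the user and item IDs are made up; the short lists stand in for the top-10 recommendations):

    def hit_ratio(recommendations, held_out):
        # Leave-one-out hit ratio: fraction of users whose single held-out
        # item appears in their recommendation list
        hits = sum(1 for user, item in held_out.items()
                   if item in recommendations.get(user, []))
        return hits / len(held_out)

    held_out = {"u1": "i7", "u2": "i3", "u3": "i9"}
    recommendations = {"u1": ["i2", "i7", "i5"], "u2": ["i1", "i4"], "u3": ["i9"]}
    print(hit_ratio(recommendations, held_out))   # 2 hits / 3 users ≈ 0.67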
Experiments
- [Figure] Results
Conclusion
- Approaches to the data sparseness problem
  - Reduce the dimension with heuristics (category information)
  - Reduce the dimension with SVD
- Experimental results
  - SVD shows a larger performance improvement on sparser data
- Future research
  - Statistical analysis
  - Combined methods
References
- Basu, C., Hirsh, H., and Cohen, W., "Recommendation as Classification: Using Social and Content-Based Information," Proceedings of the Workshop on Recommender Systems, AAAI Press, Menlo Park, California, 1998.
- Billsus, D. and Pazzani, M. J., "Learning Collaborative Information Filters," Proceedings of the Workshop on Recommender Systems, 1998.
- Berry, M. W., Dumais, S. T., and O'Brien, G. W., "Using Linear Algebra for Intelligent Information Retrieval," SIAM Review, 37(4), pp. 573-595, 1995.
- Breese, J. S., Heckerman, D., and Kadie, C., "Empirical Analysis of Predictive Algorithms for Collaborative Filtering," Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI), July 1998.
- Goldberg, K., Roeder, T., Gupta, D., and Perkins, C., "Eigentaste: A Constant Time Collaborative Filtering Algorithm," Technical Report M00/41, Electronics Research Laboratory, University of California, Berkeley, 2000.
- Herlocker, J., Konstan, J., Borchers, A., and Riedl, J., "An Algorithmic Framework for Performing Collaborative Filtering," Proceedings of the 1999 Conference on Research and Development in Information Retrieval, August 1999.
- Sarwar, B. M., "Sparsity, Scalability, and Distribution in Recommender Systems," Ph.D. Thesis, Computer Science Dept., University of Minnesota, 2001.
- Sarwar, B. M., Karypis, G., Konstan, J. A., and Riedl, J., "Application of Dimensionality Reduction in Recommender System - A Case Study," WebKDD '00: Web Mining for E-Commerce Workshop, 2000.
- Schafer, J. B., Konstan, J., and Riedl, J., "Recommender Systems in E-Commerce," Proceedings of the ACM Conference on Electronic Commerce, November 1999.
- Shardanand, U., "Social Information Filtering for Music Recommendation," Technical Report MA95, MIT Media Laboratory, 1995.