Supervisor: Associate Prof. Jiuyong Li (John) Student: Kang Sun Date: 28th May 2010.

1 Supervisor: Associate Prof. Jiuyong Li (John) Student: Kang Sun Date: 28th May 2010

2 Outline: Introduction; Motivations; Related work; Experiments; Conclusion

3 Introduction: Friends and neighbours were traditionally the main source of recommendations. Examples of recommendations from friends: a) the best café in the local area; b) the best book on a particular topic.

4 Motivation: Find a more reliable and accurate solution. A large database is supposed to help users get more accurate results; however, when recommendation moves online, similar users become hard to find.

5 Research question: How can a framework be built to improve prediction accuracy on recommendation data sets?

6 Dilemma: The data stored in large online recommendation databases normally contains many unrated items. Unrated items can affect the recommendation results.

7 Related work: Sparse Matrix Prediction Filling in Collaborative Filtering [Liu et al. 2009b]. Develops an approach to overcome the sparsity problem in user-based and item-based collaborative filtering. Similarity computation is based on the Boolean matrix.

8 Related work: Effective Missing Data Prediction for Collaborative Filtering [Ma, King & Lyu 2007]. Combines user information and item information to give better performance.

9 Related work: A Hybrid User and Item-based Collaborative Filtering with Smoothing on Sparse Data [Rong & Yansheng 2006]. Proposes a framework to alleviate the sparsity problem. In their experiments, smoothing increased the quality of recommendations.

10 Research data sets: The whole Jester data set has 617,000 ratings of 100 jokes by 24,900 users. Ratings range from −10 to +10. The whole rating matrix is about 25% filled. This research uses part 1 of the three parts: data from 24,983 users who have rated 36 or more jokes, giving a matrix with dimensions 24983 × 101. The data set is in .CSV format.

11 Data pre-processing: The Jester data sets use 99 to represent an unrated value. The first step is to change all unrated values to 0. The second step, which is the most important part of this research, is to predict the necessary unrated values for future prediction generation. All data processing was done in R.
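The first pre-processing step can be sketched as follows. The original processing was done in R and that code is not shown in the slides, so this is only an illustrative Python version; the function name `replace_unrated` and the sample rows are assumptions made for the example.

```python
UNRATED = 99  # Jester's marker for a joke the user has not rated

def replace_unrated(rows, marker=UNRATED, fill=0):
    """Return a copy of the rating rows with every unrated marker replaced by `fill`."""
    return [[fill if value == marker else value for value in row] for row in rows]

# Hypothetical sample rows in the Jester layout (not copied from the data set).
sample = [
    [-7.82, 8.79, 99, -8.16],
    [99, 99, 9.03, 9.27],
]
cleaned = replace_unrated(sample)
print(cleaned)
```

The function builds new lists rather than editing in place, so the raw rows stay available for the later prediction step.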

12 Data set preview: Sample rows of the Jester rating matrix, ten ratings per row, with 99 marking an unrated joke. The first two rows:

-7.82   8.79  -9.66  -8.16  -7.52  -8.5   -9.85   4.17  -8.98  -4.76
 4.08  -0.29   6.36   4.37  -2.38  -9.66  -0.73  -5.34   8.88   9.22

13 Data processing approach

        Joke1  Joke2  Joke3  Joke4  Joke5  Joke6  Joke7  Joke8
User 2      0      0      1      3      4      3      0      0
User 3      0      0      0      4      5      4      0      0
User 4      0      0      2      3      3      2      0      0

The Manhattan distance measure is applied.

14 Data processing: The distance between user 2 and user 3 is four. The distance between user 2 and user 4 is three. User 4 therefore seems closer to user 2 than user 3 is.
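The two distances can be checked with a short Python sketch. The ratings are the ones in the slide 13 table (0 meaning unrated); the helper name `manhattan` is an assumption for illustration and not the author's R code.

```python
def manhattan(a, b):
    """Sum of absolute rating differences between two users."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Ratings for Joke1..Joke8 from slide 13 (0 = unrated).
ratings = {
    "User 2": [0, 0, 1, 3, 4, 3, 0, 0],
    "User 3": [0, 0, 0, 4, 5, 4, 0, 0],
    "User 4": [0, 0, 2, 3, 3, 2, 0, 0],
}

d23 = manhattan(ratings["User 2"], ratings["User 3"])
d24 = manhattan(ratings["User 2"], ratings["User 4"])
print(d23, d24)  # 4 3
```

User 3's unrated Joke3 (stored as 0) contributes a full point to the distance from user 2, which is why the unrated value matters for the comparison.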

15 Data processing approach (cont'd.)

        Joke1  Joke2  Joke3  Joke4  Joke5  Joke6  Joke7  Joke8
User 2      0      0      1      3      4      3      0      0
User 3      0      0      1      4      5      4      0      0
User 4      0      0      2      3      3      2      0      0

16 Data processing: The distance between user 2 and user 3 is three. The distance between user 2 and user 4 is three. Users 3 and 4 are both at the same distance from user 2.
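Once user 3's Joke3 rating has been filled in, both candidates tie as nearest neighbours of user 2. A minimal Python sketch of a neighbour search that keeps ties is given below, using the values from slide 15. The function `nearest_neighbours` is hypothetical, and the slides do not say how the filled-in value of 1 was predicted.

```python
def manhattan(a, b):
    """Sum of absolute rating differences between two users."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_neighbours(target, others):
    """Return (minimum distance, sorted names of all users tied at that distance)."""
    distances = {name: manhattan(target, row) for name, row in others.items()}
    best = min(distances.values())
    return best, sorted(name for name, d in distances.items() if d == best)

user2 = [0, 0, 1, 3, 4, 3, 0, 0]
filled = {
    "User 3": [0, 0, 1, 4, 5, 4, 0, 0],  # Joke3 now holds the filled-in value 1
    "User 4": [0, 0, 2, 3, 3, 2, 0, 0],
}
best, tied = nearest_neighbours(user2, filled)
print(best, tied)  # 3 ['User 3', 'User 4']
```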

17 Measurement of accuracy: Relative squared error is used to compute the accuracy. Traditional CF accuracy for joke 3: Accuracy = 1 − (1 − 2)² / 1 = 0. Current approach accuracy: Accuracy = 1 − (1 − 1.5)² / 1 = 75%.
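The two accuracy figures can be reproduced with the single-rating form shown on the slide, accuracy = 1 − (actual − predicted)² / actual. In this sketch the function name `accuracy` is an assumption, the actual rating is 1, and the predictions 2 and 1.5 are read directly from the slide's arithmetic.

```python
def accuracy(actual, predicted):
    """1 - (actual - predicted)^2 / actual, as on the slide; undefined when actual is 0."""
    return 1 - (actual - predicted) ** 2 / actual

traditional = accuracy(1, 2)   # traditional CF prediction for joke 3
current = accuracy(1, 1.5)     # prediction from the current approach
print(traditional, current)    # 0.0 0.75
```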

18 User similarity comparison

19 Conclusion: Heavy computational load. Methods for both unrated values and missing values.

20 References

Liu, Z, Wang, H, Qu, W, Liu, W & Fan, R 2009b, Sparse Matrix Prediction Filling in Collaborative Filtering, IEEE Computer Society, pp. 304-307.

Ma, H, King, I & Lyu, MR 2007, Effective Missing Data Prediction for Collaborative Filtering, ACM, Amsterdam, The Netherlands, pp. 39-46.

Rong, H & Yansheng, L 2006, 'A Hybrid User and Item-Based Collaborative Filtering with Smoothing on Sparse Data', paper presented at the 16th International Conference on Artificial Reality and Telexistence, Workshops (ICAT '06), Nov. 2006.

