1 Preserving Privacy in Collaborative Filtering through Distributed Aggregation of Offline Profiles
Reza Shokri, Pedram Pedarsani, George Theodorakopoulos, Jean-Pierre Hubaux
The 3rd ACM Conference on Recommender Systems, New York City, NY, USA, October 22-25, 2009
http://lca.epfl.ch/privacy
2 Privacy in Recommender Systems
Untrusted Server
–Tracking users’ activities
Publishing Users’ Profiles
–Re-identification attacks on anonymous datasets
A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In IEEE Symposium on Security and Privacy, 2008.
3 Problem Statement
Improving users’ privacy with minimal loss of recommendation accuracy
–Centralized recommender system
–Contact between users
–Distributed privacy-preserving mechanism
Proposed Solution
Distributed aggregation of users’ profiles
–Users hide the items they have actually rated by adding items rated by other users to their profiles
4 Outline
Profile Aggregation
Aggregation Methods
Evaluation
5 Profile Aggregation
[Figure: items and their ratings in the profiles of Alice and Bob]
Each user gives a subset of his items to his contact peer.
Thus, the users’ profiles are aggregated after the contact.
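The exchange on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `aggregate_on_contact`, the representation of a profile as a dict from item id to rating, the fixed `share_count`, and the rule that a user keeps his own rating when both users have rated the same item are all hypothetical choices made for the example.

```python
import random


def aggregate_on_contact(profile_a, profile_b, share_count, rng=None):
    """Return copies of both profiles after one contact.

    Each profile is a dict mapping item id -> rating (assumed representation).
    Each user gives up to `share_count` randomly chosen items to the peer.
    """
    rng = rng or random.Random()
    # Sort the item ids so that a seeded generator gives reproducible samples.
    given_by_a = rng.sample(sorted(profile_a), min(share_count, len(profile_a)))
    given_by_b = rng.sample(sorted(profile_b), min(share_count, len(profile_b)))

    new_a = dict(profile_a)
    new_b = dict(profile_b)
    for item in given_by_b:
        # Assumption: on overlap the receiver keeps his own rating.
        new_a.setdefault(item, profile_b[item])
    for item in given_by_a:
        new_b.setdefault(item, profile_a[item])
    return new_a, new_b


alice = {"i1": 2, "i2": 5, "i3": 3}
bob = {"i3": 4, "i4": 1, "i5": 5}
alice_after, bob_after = aggregate_on_contact(
    alice, bob, share_count=2, rng=random.Random(0)
)
```

After the call, each profile still contains every item its owner rated, plus at most `share_count` items received from the peer.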
6 System Model
[Figure: online profile, offline profile, synchronization, contact]
Actual Profile: Set of items rated by a user
Offline Profile: Actual profile + aggregated items
Online Profile: The latest synchronized offline profile on the server
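The three profile types defined here can be modeled as a small data structure. The sketch below is only an illustration of the definitions: the class name `UserProfiles` and the methods `receive` and `synchronize` are hypothetical, and the slides do not say when or how often synchronization happens.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfiles:
    # Actual profile: items the user really rated (item id -> rating).
    actual: dict
    # Offline profile: actual profile plus items aggregated from peers.
    offline: dict = field(default_factory=dict)
    # Online profile: the latest offline profile synchronized to the server.
    online: dict = field(default_factory=dict)

    def __post_init__(self):
        # Before any contact, the offline profile equals the actual profile.
        if not self.offline:
            self.offline = dict(self.actual)

    def receive(self, items):
        """Add items obtained from a contact peer to the offline profile."""
        for item, rating in items.items():
            # Assumption: an existing rating is never overwritten.
            self.offline.setdefault(item, rating)

    def synchronize(self):
        """Copy the current offline profile to the server-side online profile."""
        self.online = dict(self.offline)


user = UserProfiles(actual={"i1": 4})
user.receive({"i2": 3})
online_before_sync = dict(user.online)  # the server has seen nothing yet
user.synchronize()
```

The server only ever sees `online`, so it cannot directly tell which of the synchronized items belong to `actual`.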
7 Online Profiles vs. Actual Profiles
[Figure: online profiles of users compared with actual profiles of users]
8 Aggregation Methods
How many items to aggregate?
Which items to aggregate?
Similarity-based Aggregation (similarity: Pearson’s correlation coefficient)
–Random Selection (SRS)
–Minimum Rating Frequency (SMRF) (rating frequency: percentage of users that have rated an item)
[Figure: two example items, one with 167,237 IMDB votes and one with 1,625 IMDB votes]
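The two selection rules can be sketched as follows. The slide does not give the exact rule for how many items are shared, so the helper `num_items_to_give` assumes that a user shares a fraction of his profile equal to the similarity with the peer, clamped to [0, 1]. That rule, all function names, and the reading of SMRF as "pick the least frequently rated items first" are assumptions made for this illustration.

```python
import random


def num_items_to_give(profile, similarity):
    """Assumed rule: share a fraction of the profile equal to the similarity."""
    fraction = max(0.0, min(1.0, similarity))
    return round(fraction * len(profile))


def select_srs(profile, similarity, rng):
    """SRS sketch: pick the items to give uniformly at random."""
    k = num_items_to_give(profile, similarity)
    return rng.sample(sorted(profile), k)


def select_smrf(profile, similarity, frequency):
    """SMRF sketch: pick the items with the lowest rating frequency.

    `frequency` maps item id -> fraction of users that have rated the item.
    Ties are broken by item id so that the result is deterministic.
    """
    k = num_items_to_give(profile, similarity)
    return sorted(profile, key=lambda item: (frequency[item], item))[:k]


profile = {"a": 5, "b": 3, "c": 4, "d": 1}
frequency = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.2}
srs_items = select_srs(profile, 0.5, random.Random(1))
smrf_items = select_smrf(profile, 0.5, frequency)
```

With a similarity of 0.5 and four items, both rules select two items; SMRF returns the two items that the fewest users have rated.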
9 Evaluation Metrics
Privacy Gain
Accuracy Loss
10 Privacy Gain
Privacy: How difficult it is for the server to guess the users’ actual profiles, given access to their online profiles
Intuition: The structural difference between two graphs (online and actual), viewed as the difference between corresponding edges
R. Myers, R. C. Wilson, and E. R. Hancock. Bayesian graph edit distance. IEEE Trans. Pattern Anal. Mach. Intell., 22(6), 2000.
[Equation not recoverable from the transcript; its annotations: number of users; actual profile of user ‘u’; online profile of user ‘u’; rating frequency of item ‘i’; weight of items added by aggregation; weight of items in online profile]
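The formula itself did not survive in this transcript, so the following is only one possible reading of its annotations, not the authors' definition. It assumes the metric averages, over all users, the ratio of the weight of the items added by aggregation to the weight of all items in the online profile. The item weight is some function of the rating frequency that the slide does not specify, so `weight` is left as a parameter; the function name `privacy_gain` is hypothetical.

```python
def privacy_gain(actual_profiles, online_profiles, frequency, weight):
    """Assumed form: mean over users of weight(added items) / weight(online items).

    actual_profiles, online_profiles: dict user -> set of item ids.
    frequency: dict item id -> rating frequency of the item.
    weight: function of the rating frequency (not given in the transcript).
    """
    total = 0.0
    for user, online in online_profiles.items():
        added = set(online) - set(actual_profiles[user])
        online_weight = sum(weight(frequency[i]) for i in online)
        if online_weight > 0:
            total += sum(weight(frequency[i]) for i in added) / online_weight
    return total / len(online_profiles)


actual = {"u": {"a", "b"}}
online = {"u": {"a", "b", "c", "d"}}
freq = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
# Uniform weights, chosen purely for illustration.
gain = privacy_gain(actual, online, freq, weight=lambda f: 1.0)
```

With uniform weights the value reduces to the fraction of online items that were added by aggregation, here 2 of 4.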
11 Accuracy Loss
[Equation not recoverable from the transcript; its annotations: the bipartite graph that contains actual ratings; the bipartite graph available to the server]
12 Experiment
Simulation on randomly chosen profiles
–From the Netflix Prize dataset
–300 users
–Average: 30,000 ratings and 2,500 items in each experiment
Memory-based CF: user-based
Testing set: 10% of the actual ratings of each user
Users select their contact peers at random
Aggregation methods
–Union
–SRS
–SMRF
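For readers unfamiliar with memory-based, user-based collaborative filtering, here is a minimal sketch of a predictor of that kind. The slides name only the family of method, so the specific prediction rule (a mean-centred average of neighbours' ratings weighted by Pearson correlation) is an assumption, as are the function names `pearson` and `predict`. Profiles are assumed to be non-empty dicts from item id to rating.

```python
from math import sqrt


def pearson(ratings_u, ratings_v):
    """Pearson correlation over the items rated by both users."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((ratings_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / sqrt(var_u * var_v)


def predict(user, item, profiles):
    """Predict the rating of `user` for `item` from the other users' ratings."""
    target = profiles[user]
    mean_target = sum(target.values()) / len(target)
    numerator = 0.0
    denominator = 0.0
    for other, ratings in profiles.items():
        if other == user or item not in ratings:
            continue
        sim = pearson(target, ratings)
        mean_other = sum(ratings.values()) / len(ratings)
        numerator += sim * (ratings[item] - mean_other)
        denominator += abs(sim)
    # Fall back to the user's own mean when no neighbour is informative.
    if denominator == 0:
        return mean_target
    return mean_target + numerator / denominator


profiles = {
    "u": {"a": 1, "b": 2, "c": 3},
    "v": {"a": 2, "b": 4, "c": 6, "d": 5},
}
similarity_uv = pearson(profiles["u"], profiles["v"])
predicted = predict("u", "d", profiles)
```

In an experiment like the one on this slide, such a predictor would be run once on the actual profiles and once on the online profiles, and the two sets of predictions compared on the held-out 10% of ratings.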
13 Privacy Gain
[Figures: Similarity-based Random Selection (SRS); Similarity-based Minimum Rating Frequency (SMRF)]
14 Accuracy Loss
[Figures: Similarity-based Random Selection (SRS); Similarity-based Minimum Rating Frequency (SMRF)]
15 Tradeoff between Privacy and Accuracy
16 Conclusion
A novel method for privacy preservation in collaborative filtering recommendation systems
Protection of users’ privacy against an untrusted server
Considerable improvement in users’ privacy, with minimal effect on recommendation accuracy, by aggregating users’ profiles based on their similarities
The proposed method can also be used to protect the privacy of users in published datasets
17 Future Work
The evaluation of the mechanism can be improved by considering more realistic contact patterns between users, e.g., friendship in a social network or physical vicinity
We would like to evaluate the practical implications of the method for the maintenance of the profiles
http://lca.epfl.ch/privacy