Privacy-Preserving Collaborative Filtering Using Randomized Perturbation Techniques
Huseyin Polat and Wenliang (Kevin) Du
Department of EECS, Syracuse University

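Collaborative Filtering (CF) Process
[Figure: user-item matrix over users $u_1, u_2, \ldots, u_a, \ldots, u_n$ and items $i_1, i_2, \ldots, i_q, \ldots, i_m$. $u_a$ is the active user and $i_q$ is the item for which a prediction is sought.]
$P_{aq}$ = prediction on item $q$ for the active user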

Basic Mechanisms of CF
Central database: collect preferences of people.
Input: the active user's ratings and a query.
Processing: similarity metric and preference function.
Output: recommendation for the active user.

A Collaborative Filtering Scheme
Terms annotated on the prediction formula:
Prediction for the active user on item $q$
Weighted average of preferences
Similarity weight between the active user and user $i$
z-scores for item $q$
Rating for user $i$ on item $q$
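One common formulation consistent with these labels is sketched below. It is an assumption based on widely used z-score prediction schemes, not necessarily the exact equation shown on the slide; the symbols $v_{iq}$ (rating for user $i$ on item $q$), $\bar{v}_i$ and $\sigma_i$ (mean and standard deviation of user $i$'s ratings), and $w_{ai}$ (similarity weight between the active user $a$ and user $i$) are introduced here for illustration.

```latex
p_{aq} = \bar{v}_a + \sigma_a \cdot
         \frac{\sum_{i=1}^{n} w_{ai}\, z_{iq}}{\sum_{i=1}^{n} w_{ai}},
\qquad
z_{iq} = \frac{v_{iq} - \bar{v}_i}{\sigma_i}
```

Under this reading, the fraction is the weighted average of preferences, and scaling by $\sigma_a$ and shifting by $\bar{v}_a$ maps it back to the active user's own rating scale.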

Privacy-Preserving CF (PPCF)
CF systems are a threat to individual privacy.
Customer data is an asset and can be sold (and has been sold).
Collecting high-quality data is difficult because of privacy concerns.
The challenge: how can users contribute their private data for CF without compromising their privacy?

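Randomized Perturbation (RP)
[Figure: User 1, User 2, ..., User n-1, User n each add a random value ($+R_1$, $+R_2$, ..., $+R_{n-1}$, $+R_n$) to their data, which goes to the central database used for collaborative filtering.]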
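The user-side step in the figure can be sketched as follows. This is a minimal illustration, assuming that each user perturbs the z-scores of their own ratings and that the noise is uniform on $[-\alpha, \alpha]$; the function name `disguise_ratings` and the parameters `alpha` and `rng` are hypothetical and do not come from the slides.

```python
import random
from statistics import mean, pstdev


def disguise_ratings(ratings, alpha, rng):
    """Return the z-scores of `ratings` with uniform noise from [-alpha, alpha] added.

    Assumption for this sketch: perturbation is applied to z-scores and the
    noise distribution is uniform. Both choices are illustrative.
    """
    mu = mean(ratings)
    sigma = pstdev(ratings)
    if sigma == 0:
        raise ValueError("ratings must not all be identical")
    z_scores = [(v - mu) / sigma for v in ratings]
    # The noise is drawn locally, so only disguised values leave the user.
    return [z + rng.uniform(-alpha, alpha) for z in z_scores]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
disguised = disguise_ratings([5, 3, 4, 1, 2], alpha=1.0, rng=rng)
print(disguised)
```

A larger `alpha` hides individual values better but adds more noise to anything the server later computes from them.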

Two Building Blocks
$A' = A + R$, $B' = B + V$
How to compute:
$A' = A + R = (a_1 + r_1, \ldots, a_n + r_n)$
$B' = B + V = (b_1 + v_1, \ldots, b_n + v_n)$

Dot Product
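As a hedged illustration of why a dot product can be estimated from disguised vectors: expanding $A' \cdot B'$ gives $\sum a_i b_i + \sum a_i v_i + \sum r_i b_i + \sum r_i v_i$. If the noise values are zero-mean and independent of the data and of each other (an assumption of this sketch), the last three sums have zero expectation and grow roughly like $\sqrt{n}$, so they become small relative to a true dot product that grows with $n$. The helper names, the rating-like value range $[1, 5]$, and the noise range below are hypothetical.

```python
import random


def perturb(vector, alpha, rng):
    """Add independent uniform noise from [-alpha, alpha] to each entry."""
    return [x + rng.uniform(-alpha, alpha) for x in vector]


def dot(xs, ys):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(xs, ys))


rng = random.Random(42)  # fixed seed for reproducibility
n = 20000
a = [rng.uniform(1, 5) for _ in range(n)]  # private vector A (toy data)
b = [rng.uniform(1, 5) for _ in range(n)]  # private vector B (toy data)

a_disguised = perturb(a, 1.0, rng)  # A' = A + R
b_disguised = perturb(b, 1.0, rng)  # B' = B + V

true_dot = dot(a, b)
estimated_dot = dot(a_disguised, b_disguised)
relative_error = abs(estimated_dot - true_dot) / abs(true_dot)
print(f"relative error: {relative_error:.4f}")
```

With short vectors the cross terms do not average out, so the estimate is only useful when $n$ is reasonably large.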

Sum
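A short worked equation for the sum, under the assumption (made here for illustration) that the noise values $r_i$ are independent with zero mean and standard deviation $\sigma_r$:

```latex
\sum_{i=1}^{n} a'_i
  = \sum_{i=1}^{n} (a_i + r_i)
  = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} r_i
  \approx \sum_{i=1}^{n} a_i
```

The noise term $\sum r_i$ has expectation zero and standard deviation $\sigma_r \sqrt{n}$, so the error per element shrinks as $n$ grows.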

Conducting Collaborative Filtering
The active user has $z_{ak}$.
The server sends two aggregates, labeled "Dot product" and "Sum" on the slide (their symbols are missing here), to the active user for each $k$.
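The exchange can be sketched as below. This is an illustration under stated assumptions, not a definitive implementation: it assumes the similarity weight is $w_{ai} = \sum_k z_{ak} z_{ik}$, so that $\sum_i w_{ai} z_{iq} = \sum_k z_{ak} \left(\sum_i z_{ik} z_{iq}\right)$ and $\sum_i w_{ai} = \sum_k z_{ak} \left(\sum_i z_{ik}\right)$. The server then only needs a dot product and a sum per item $k$, and the active user never reveals $z_{ak}$. All function and variable names, and the toy values, are hypothetical.

```python
def server_aggregates(stored, q, rated_items):
    """Server side: for each item k, a dot product with item q and a sum over users.

    `stored` is an n-by-t list of (possibly disguised) z-scores, one row per user.
    """
    dots = {k: sum(row[k] * row[q] for row in stored) for k in rated_items}
    sums = {k: sum(row[k] for row in stored) for k in rated_items}
    return dots, sums


def active_user_prediction(z_active, mean_a, sigma_a, dots, sums):
    """Active user side: combine private z-scores z_ak with the aggregates."""
    numerator = sum(z * dots[k] for k, z in z_active.items())
    denominator = sum(z * sums[k] for k, z in z_active.items())
    if denominator == 0:
        raise ZeroDivisionError("similarity weights sum to zero")
    return mean_a + sigma_a * numerator / denominator


def direct_prediction(z_active, mean_a, sigma_a, z_users, q):
    """Reference computation with per-user weights w_ai = sum_k z_ak * z_ik."""
    weights = [sum(z * row[k] for k, z in z_active.items()) for row in z_users]
    numerator = sum(w * row[q] for w, row in zip(weights, z_users))
    return mean_a + sigma_a * numerator / sum(weights)


# Toy values: three users, three items, prediction sought for item q = 2.
z_users = [[1.0, -0.5, 0.8], [0.5, 0.2, -0.3], [-0.2, 1.0, 0.4]]
z_active = {0: 1.0, 1: -0.5}  # active user's z-scores for rated items 0 and 1
q = 2

dots, sums = server_aggregates(z_users, q, rated_items=z_active.keys())
factored = active_user_prediction(z_active, 3.0, 1.0, dots, sums)
direct = direct_prediction(z_active, 3.0, 1.0, z_users, q)
print(factored, direct)
```

Here the server holds undisguised toy values, so the factored and direct results agree up to rounding. With disguised rows the aggregates become estimates, and the prediction is an approximation whose quality depends on the number of users and the noise level.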

Mean Absolute Errors
$n$ = number of users, $t$ = number of items
Data: MovieLens
Rating range: [1, 5]
First group: n = 154, t = 154
Second group: n = 143, t = 265
Third group: n = 1000, t = 261
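For reference, mean absolute error is the average absolute difference between predicted and actual ratings. A minimal sketch follows; the function name and the sample values are illustrative and are not results from the experiments.

```python
def mean_absolute_error(predictions, actual):
    """Average of |prediction - actual rating| over all test ratings."""
    if len(predictions) != len(actual) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - v) for p, v in zip(predictions, actual)) / len(predictions)


# Made-up predictions and ratings on a [1, 5] scale.
mae = mean_absolute_error([4.2, 3.1, 2.0], [4, 3, 1])
print(f"MAE: {mae:.3f}")
```

Because the error is measured in rating units, values from data sets with different rating ranges are not directly comparable.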

Mean Absolute Errors
$n$ = number of users, $t$ = number of items
Data: Jester
Rating range: [-10, 10]
First group: n = 250, t = 32
Second group: n = 263, t = 69
Third group: n = 1000, t = 69

Conclusion
We have presented a solution to PPCF using randomization techniques.
Our solution makes it possible to preserve users' privacy while still producing accurate recommendations.
We analyzed how different parameters affect accuracy.

