Personalized Social Recommendations – Accurate or Private? A. Machanavajjhala (Yahoo!), with A. Korolova (Stanford), A. Das Sarma (Google)
Social Advertising
Recommend ads based on private shopping histories of “friends” in the social network.
[Figure: Alice and Betty shown alongside brand labels (Armani, Gucci, Prada; Nikon, HP, Nike).]
Social Advertising … in the real world
A product that is followed by your friends …
Items (products/people) liked by Alice’s friends are better recommendations for Alice.
Social Advertising … privacy problem
Only the items (products/people) liked by Alice’s friends are recommended to Alice.
The fact that “Betty” liked “VistaPrint” is leaked to “Alice”.
Social Advertising … privacy problem
Recommending irrelevant items sometimes improves privacy, but reduces accuracy.
Social Advertising … privacy problem
Alice is recommended ‘X’.
Can we provide accurate recommendations to Alice based on the social network, while ensuring that Alice cannot deduce that Betty likes ‘X’?
Outline of this talk
Formal social recommendations problem
– Privacy for social recommendations
– Accuracy of social recommendations
– Example private algorithm and its accuracy
Privacy-Accuracy trade-off
– Properties satisfied by a general algorithm
– Theoretical bound
Social Recommendations
A set of agents
– Yahoo/Facebook users, medical patients
A set of recommended items
– Other users (friends), advertisements, products (drugs)
A network of edges connecting the agents and items
– Social network, patient-doctor and patient-drug history
Problem:
– In general: recommend a new item i to agent a based on the network
– In this talk: recommend a new friend i to target user a based on the social network
Social Recommendations
[Figure: target node a with candidate recommendations i_1, i_2, i_3, labeled u(a, i_1), u(a, i_2), u(a, i_3).]
Utility function: u(a, i) is the utility of recommending candidate i to target a.
Examples [Liben-Nowell et al. 2003]:
– Number of common neighbors
– Number of weighted paths
– Personalized PageRank
Non-Private Recommendation Algorithm
[Figure: target node a with candidates labeled u(a, i_1), u(a, i_2), u(a, i_3).]
Utility function: u(a, i) is the utility of recommending candidate i to target a.
Algorithm:
  For each target node a
    For each candidate i
      Compute p(a, i) that maximizes Σ_i u(a, i) p(a, i)
    endfor
    Randomly pick one of the candidates with probability p(a, i)
  endfor
Example: Common Neighbors Utility
[Figure: target node a with candidates labeled u(a, i_1), u(a, i_2), u(a, i_3).]
Utility function: u(a, i) is the utility of recommending candidate i to target a.
Common Neighbors Utility: “Alice and Bob are likely to be friends if they have many common neighbors”
u(a, i_1) = f(2), u(a, i_2) = f(3), u(a, i_3) = f(1)
Non-Private Algorithm:
– Return the candidate with max u(a, i)
– Randomly pick a candidate with probability proportional to u(a, i)
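The common neighbors example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is a hypothetical toy network built so that the three candidates have 2, 3, and 1 common neighbors with the target, f is taken to be the identity, and all function names are invented here rather than taken from the talk.

```python
import random

# Hypothetical toy graph (undirected, adjacency sets), chosen so that
# i1, i2, i3 share 2, 3, and 1 neighbors with target "a".
GRAPH = {
    "a": {"n1", "n2", "n3"},
    "n1": {"a", "i1", "i2"},
    "n2": {"a", "i1", "i2"},
    "n3": {"a", "i2", "i3"},
    "i1": {"n1", "n2"},
    "i2": {"n1", "n2", "n3"},
    "i3": {"n3"},
}

def common_neighbors(graph, a, i):
    """Utility u(a, i): number of neighbors shared by a and i (f = identity, an assumption)."""
    return len(graph[a] & graph[i])

def candidates(graph, a):
    """Nodes that are neither a itself nor already adjacent to a."""
    return [v for v in graph if v != a and v not in graph[a]]

def recommend_max(graph, a):
    """Non-private option 1: return the candidate with maximum utility."""
    return max(candidates(graph, a), key=lambda i: common_neighbors(graph, a, i))

def recommend_proportional(graph, a, rng=random):
    """Non-private option 2: sample a candidate with probability proportional to utility."""
    cands = candidates(graph, a)
    weights = [common_neighbors(graph, a, i) for i in cands]
    if sum(weights) == 0:
        # No candidate has positive utility; fall back to a uniform choice.
        return rng.choice(cands)
    return rng.choices(cands, weights=weights, k=1)[0]
```

Note that both options give zero probability to any candidate with zero utility, which is the behavior a private algorithm has to give up.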
Differential Privacy [Dwork 2006]
For every pair of inputs D_1, D_2 that differ in one value, and for every output O, the adversary should not be able to distinguish between D_1 and D_2 based on O:
$\log \frac{\Pr[D_1 \to O]}{\Pr[D_2 \to O]} < \epsilon$
Privacy for Social Recommendations
Sensitive information: a recommendation should not disclose the existence of an edge between two nodes.
[Figure: graphs G_1 and G_2, each containing nodes a and i.]
$\log \frac{\Pr[\text{recommending } (i, a) \mid G_1]}{\Pr[\text{recommending } (i, a) \mid G_2]} < \epsilon$
Measuring loss in utility due to privacy
Suppose algorithm A recommends node i of utility u_i with probability p_i.
Accuracy of A is defined as …
– a comparison with the utility of the non-private algorithm
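The accuracy formula itself did not survive on this slide. One reading that fits the description, offered here as an assumption rather than as the slide's exact definition, is the expected utility of A divided by the best achievable utility (what a non-private algorithm returning the max-utility candidate would get). A worked instance with the common neighbors utilities 2, 3, 1 (taking f as the identity) and probabilities proportional to utility:

```latex
% Assumed definition: expected utility relative to the maximum utility.
\[
\mathrm{accuracy}(A) \;=\; \frac{\sum_i u_i \, p_i}{\max_i u_i}
\]
% Worked example: u = (2, 3, 1), p = (2/6, 3/6, 1/6).
\[
\sum_i u_i \, p_i = \frac{2\cdot 2 + 3\cdot 3 + 1\cdot 1}{6} = \frac{14}{6} = \frac{7}{3},
\qquad
\mathrm{accuracy}(A) = \frac{7/3}{3} = \frac{7}{9} \approx 0.78
\]
```

Under this reading, accuracy is 1 exactly when all probability mass sits on maximum-utility candidates.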
Algorithms for Differential Privacy
Theorem: No (non-trivial) deterministic algorithm guarantees differential privacy.
Exponential Mechanism
– Sample the output space based on a distance metric.
Laplace Mechanism
– Add noise from a Laplace distribution to query answers.
Privacy Preserving Recommendations
[Figure: target node a with candidates labeled u(a, i_1), u(a, i_2), u(a, i_3).]
Must pick a node with non-zero probability even if u = 0.
Exponential Mechanism [McSherry et al. 2007]:
– Randomly pick a candidate with probability proportional to exp(ε∙u(a, i) / Δ)
– Δ is the maximum change in utilities caused by changing one edge
Satisfies ε-differential privacy.
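A minimal sketch of the sampling rule above, using the formula exactly as written on the slide. The function names are hypothetical, the sensitivity Δ = 1 used in the usage example is an assumption for illustration, and the `accuracy` helper assumes accuracy means expected utility divided by maximum utility, which the slides do not spell out.

```python
import math
import random

def exponential_mechanism_probs(utilities, epsilon, sensitivity):
    """Probabilities proportional to exp(epsilon * u / sensitivity), as on the slide."""
    scores = [epsilon * u / sensitivity for u in utilities]
    top = max(scores)
    # Subtract the max score before exponentiating for numerical stability;
    # this does not change the normalized probabilities.
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def sample_candidate(cands, probs, rng=random):
    """Draw one candidate according to the given probabilities."""
    return rng.choices(cands, weights=probs, k=1)[0]

def accuracy(utilities, probs):
    """Assumed accuracy measure: expected utility over maximum utility."""
    expected = sum(u * p for u, p in zip(utilities, probs))
    return expected / max(utilities)

# Usage: three candidates with utilities 2, 3, 1 plus one zero-utility candidate.
utilities = [2, 3, 1, 0]
probs = exponential_mechanism_probs(utilities, epsilon=0.5, sensitivity=1.0)
print(probs, accuracy(utilities, probs))
```

Every candidate, including the zero-utility one, receives non-zero probability. With many low-utility candidates that leaked mass adds up, which is consistent with the low accuracy reported for real networks.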
Accuracy of Exponential Mechanism + Common Neighbors Utility: WikiVote Network (ε = 0.5): 60% of users have accuracy < 10%
Accuracy of Exponential Mechanism + Common Neighbors Utility: Twitter sample (ε = 1): 98% of users have accuracy < 5%
Can we do better?
Maybe the common neighbors utility is an especially non-private utility …
– Consider general utility functions that follow intuitive axioms
Maybe the Exponential Mechanism does not guarantee sufficient accuracy …
– Consider any algorithm that satisfies differential privacy
Axioms on Utility Functions
[Figure: target node a with candidates labeled u(a, i_1), u(a, i_2), u(a, i_3), u(a, i_4).]
Candidates i_3 and i_4 are identical with respect to ‘a’. Hence, u(a, i_3) = u(a, i_4).
Axioms on Utility Functions
“Most of the utility of recommendation to a target is concentrated on a small number of candidates.”
Accuracy-Privacy Tradeoff
Common Neighbors & Weighted Paths Utility*: To achieve constant accuracy for target node a, ε > Ω(log n / degree(a))
* under some mild assumptions on the weighted paths utility …
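To get a feel for how the bound scales, the quantity log n / degree(a) can be tabulated for a few degrees. This is a rough sketch only: the Ω notation hides a constant, so the parameter `c` below is a hypothetical stand-in, the network size is an invented example, and the numbers show scaling rather than actual thresholds.

```python
import math

def epsilon_bound_scale(n, degree, c=1.0):
    """c * ln(n) / degree; c is a hypothetical stand-in for the constant hidden by Omega."""
    return c * math.log(n) / degree

# Usage: a hypothetical network with one million nodes.
n = 10**6
for degree in (5, 50, 500):
    print(degree, round(epsilon_bound_scale(n, degree), 3))
```

With c = 1, a node of degree 5 sits near ε ≈ 2.76 while a node of degree 500 sits near ε ≈ 0.03. Low-degree nodes, which make up most of a typical social network, are the ones that need a large ε to get accurate recommendations.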
Implications of Accuracy-Privacy Tradeoff: WikiVote Network (ε = 0.5): 60% of users have accuracy < 55%
Implications of Accuracy-Privacy Tradeoff: Twitter sample (ε = 1): 95% of users have accuracy < 5%
Takeaway …
“For the majority of the nodes in the network, recommendations must either be inaccurate or violate differential privacy!”
– Maybe this is a “bad idea”
– Or maybe differential privacy is too strong a privacy definition to shoot for.
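Intuition behind main result
[Figure: graphs G_1 and G_2 over target a and candidates i, j. In G_k, candidate x has utility u_k(a, x) and is recommended with probability p_k(a, x).]
G_1 and G_2 differ in one edge, so differential privacy gives
$\frac{p_1(a,i)}{p_2(a,i)} < e^{\epsilon}$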
Intuition behind main result
[Figure: graphs G_1, G_2, and G_3 over target a and candidates i, j.]
$\frac{p_1(a,i)}{p_2(a,i)} < e^{\epsilon}$ and $\frac{p_3(a,j)}{p_1(a,j)} < e^{\epsilon}$
Using Exchangeability
[Figure: graphs G_2 and G_3 over target a and candidates i, j.]
$\frac{p_1(a,i)}{p_2(a,i)} < e^{\epsilon}$ and $\frac{p_3(a,j)}{p_1(a,j)} < e^{\epsilon}$
G_3 is isomorphic to G_2. u_2(a,i) = u_3(a,j) implies p_2(a,i) = p_3(a,j).
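Using Exchangeability
G_3 is isomorphic to G_2. u_2(a,i) = u_3(a,j) implies p_2(a,i) = p_3(a,j).
Chaining the two bounds, $p_1(a,i) < e^{\epsilon} p_2(a,i) = e^{\epsilon} p_3(a,j) < e^{2\epsilon} p_1(a,j)$, which gives
$\frac{p_1(a,i)}{p_1(a,j)} < e^{2\epsilon}$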
Using Exchangeability
In general, if any node i can be “transformed” into node j in t edge changes, then
$\frac{p_1(a,i)}{p_1(a,j)} < e^{t\epsilon}$
That is, the probability of recommending the highest-utility node is at most $e^{t\epsilon}$ times the probability of recommending the worst-utility node.
Final Act: Using Concentration
Few nodes have high utility for target a
– 10s of nodes share a common neighbor with a
Many nodes have low utility for target a
– Millions of nodes don’t share a common neighbor with a
Thus, there exist i and j such that
$\Omega(n) = \frac{p_1(a,i)}{p_1(a,j)} < e^{t\epsilon}$
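The last inequality can be turned into a lower bound on ε by taking logarithms. The slide does not spell this step out, so the following is one way to read it. An accurate algorithm has to put constant probability on the few high-utility nodes, while the order-n low-utility nodes share what is left, so some low-utility node j has probability O(1/n) and the ratio is Ω(n).

```latex
\begin{align*}
\Omega(n) = \frac{p_1(a,i)}{p_1(a,j)} &< e^{t\epsilon} \\
\ln \Omega(n) &< t\,\epsilon \\
\epsilon &> \frac{\ln \Omega(n)}{t} = \Omega\!\left(\frac{\log n}{t}\right)
\end{align*}
```

If the number of edge changes t needed to turn a low-utility node into a high-utility one is on the order of degree(a) (an assumption here; the slides do not state t explicitly), this matches the stated tradeoff ε > Ω(log n / degree(a)).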
Summary of Social Recommendations
Question: “Can social recommendations be made while guaranteeing strong privacy conditions?”
– General utility functions satisfying natural axioms
– Any algorithm satisfying differential privacy
Answer: “For the majority of nodes in the network, recommendations must either be inaccurate or violate differential privacy!”
– Maybe this is a “bad idea”
– Or maybe differential privacy is too strong a privacy definition to shoot for.
Open Question: “What is the minimum amount of personal information that a user must be willing to disclose in order to get personalized recommendations?”
Thank you