Personalized Social Recommendations – Accurate or Private? A. Machanavajjhala (Yahoo!), with A. Korolova (Stanford), A. Das Sarma (Google)

Similar presentations
1+eps-Approximate Sparse Recovery Eric Price MIT David Woodruff IBM Almaden.
I have a DREAM! (DiffeRentially privatE smArt Metering) Gergely Acs and Claude Castelluccia {gergely.acs, INRIA 2011.
DIFFERENTIAL PRIVACY REU Project Mentors: Darakhshan Mir James Abello Marco A. Perez.
Incremental Linear Programming Linear programming involves finding a solution to the constraints, one that maximizes the given linear function of variables.
Wavelet and Matrix Mechanism CompSci Instructor: Ashwin Machanavajjhala Lecture 11: Fall 12.
Publishing Set-Valued Data via Differential Privacy Rui Chen, Concordia University Noman Mohammed, Concordia University Benjamin C. M. Fung, Concordia.
Differentially Private Recommendation Systems Jeremiah Blocki Fall A: Foundations of Security and Privacy.
Simulatability “The enemy knows the system”, Claude Shannon CompSci Instructor: Ashwin Machanavajjhala Lecture 6: Fall 12.
Minimizing Seed Set for Viral Marketing Cheng Long & Raymond Chi-Wing Wong Presented by: Cheng Long 20-August-2011.
Private Analysis of Graph Structure With Vishesh Karwa, Sofya Raskhodnikova and Adam Smith Pennsylvania State University Grigory Yaroslavtsev
Foundations of Privacy Lecture 4 Lecturer: Moni Naor.
Trust and Profit Sensitive Ranking for Web Databases and On-line Advertisements Raju Balakrishnan (Arizona State University)
Seminar in Foundations of Privacy 1.Adding Consistency to Differential Privacy 2.Attacks on Anonymized Social Networks Inbal Talgam March 2008.
An brief tour of Differential Privacy Avrim Blum Computer Science Dept Your guide:
Data Structures, Spring 2006 © L. Joskowicz 1 Data Structures – LECTURE 4 Comparison-based sorting Why sorting? Formal analysis of Quick-Sort Comparison.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
EXPANDER GRAPHS Properties & Applications. Things to cover ! Definitions Properties Combinatorial, Spectral properties Constructions “Explicit” constructions.
Purnamrita Sarkar (UC Berkeley) Deepayan Chakrabarti (Yahoo! Research) Andrew W. Moore (Google, Inc.) 1.
Computing Sketches of Matrices Efficiently & (Privacy Preserving) Data Mining Petros Drineas Rensselaer Polytechnic Institute (joint.
2. Attacks on Anonymized Social Networks. Setting A social network Edges may be private –E.g., “communication graph” The study of social structure by.
Preference Analysis Joachim Giesen and Eva Schuberth May 24, 2006.
Path Planning in Expansive C-Spaces D. HsuJ.-C. LatombeR. Motwani CS Dept., Stanford University, 1997.
The community-search problem and how to plan a successful cocktail party Mauro SozioAris Gionis Max Planck Institute, Germany Yahoo! Research, Barcelona.
Foundations of Privacy Lecture 11 Lecturer: Moni Naor.
Differential Privacy (2). Outline  Using differential privacy Database queries Data mining  Non interactive case  New developments.
Approximation Algorithms Motivation and Definitions TSP Vertex Cover Scheduling.
The Union-Split Algorithm and Cluster-Based Anonymization of Social Networks Brian Thompson Danfeng Yao Rutgers University Dept. of Computer Science Piscataway,
Differentially Private Data Release for Data Mining Benjamin C.M. Fung Concordia University Montreal, QC, Canada Noman Mohammed Concordia University Montreal,
Multiplicative Weights Algorithms CompSci Instructor: Ashwin Machanavajjhala Lecture 13: Fall 12.
Foundations of Privacy Lecture 6 Lecturer: Moni Naor.
Distributed Algorithms on a Congested Clique Christoph Lenzen.
Purnamrita Sarkar (Carnegie Mellon) Deepayan Chakrabarti (Yahoo! Research) Andrew W. Moore (Google, Inc.)
1 Privacy-Preserving Distributed Information Sharing Nan Zhang and Wei Zhao Texas A&M University, USA.
Fair Allocation with Succinct Representation Azarakhsh Malekian (NWU) Joint Work with Saeed Alaei, Ravi Kumar, Erik Vee UMDYahoo! Research.
Preserving Link Privacy in Social Network Based Systems Prateek Mittal University of California, Berkeley Charalampos Papamanthou.
Influence Maximization in Dynamic Social Networks Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun.
Tuning Privacy-Utility Tradeoffs in Statistical Databases using Policies Ashwin Machanavajjhala cs.duke.edu Collaborators: Daniel Kifer (PSU),
Dynamic Covering for Recommendation Systems Ioannis Antonellis Anish Das Sarma Shaddin Dughmi.
Private Approximation of Search Problems Amos Beimel Paz Carmi Kobbi Nissim Enav Weinreb (Technion)
Privacy of Correlated Data & Relaxations of Differential Privacy CompSci Instructor: Ashwin Machanavajjhala Lecture 16: Fall 12.
QoS Routing in Networks with Inaccurate Information: Theory and Algorithms Roch A. Guerin and Ariel Orda Presented by: Tiewei Wang Jun Chen July 10, 2000.
Xiaowei Ying, Xintao Wu Univ. of North Carolina at Charlotte PAKDD-09 April 28, Bangkok, Thailand On Link Privacy in Randomizing Social Networks.
Foundations of Privacy Lecture 5 Lecturer: Moni Naor.
Maria-Florina Balcan Active Learning Maria Florina Balcan Lecture 26th.
Differential Privacy Some contents are borrowed from Adam Smith’s slides.
Comparison of Tarry’s Algorithm and Awerbuch’s Algorithm CS 6/73201 Advanced Operating System Presentation by: Sanjitkumar Patel.
Estimating PageRank on Graph Streams Atish Das Sarma (Georgia Tech) Sreenivas Gollapudi, Rina Panigrahy (Microsoft Research)
Differential Privacy (1). Outline  Background  Definition.
Private Release of Graph Statistics using Ladder Functions J.ZHANG, G.CORMODE, M.PROCOPIUC, D.SRIVASTAVA, X.XIAO.
1 Differential Privacy Cynthia Dwork Mamadou H. Diallo.
Privacy Preserving in Social Network Based System PRENTER: YI LIANG.
Yang, et al. Differentially Private Data Publication and Analysis. Tutorial at SIGMOD’12 Part 4: Data Dependent Query Processing Methods Yin “David” Yang.
No Free Lunch in Data Privacy CompSci Instructor: Ashwin Machanavajjhala Lecture 15: Fall 12.
Network Partition –Finding modules of the network. Graph Clustering –Partition graphs according to the connectivity. –Nodes within a cluster is highly.
Private Data Management with Verification
Procrastination … with Variable Present Bias
Approximating the MST Weight in Sublinear Time
Understanding Generalization in Adaptive Data Analysis
Privacy-preserving Release of Statistics: Differential Privacy
On Communication Protocols that Compute Almost Privately
Graph Analysis with Node Differential Privacy
Privacy-Preserving Classification
Differential Privacy in Practice
Vitaly (the West Coast) Feldman
Randomized Algorithms CS648
Foundations of Privacy Lecture 7
Lecture 6: Counting triangles Dynamic graphs & sampling
Published in: IEEE Transactions on Industrial Informatics
Some contents are borrowed from Adam Smith’s slides
Differential Privacy (1)
Presentation transcript:

Personalized Social Recommendations – Accurate or Private? A. Machanavajjhala (Yahoo!), with A. Korolova (Stanford), A. Das Sarma (Google)

Social Advertising: Recommend ads based on private shopping histories of “friends” in the social network. [Figure: two friends, Alice and Betty, with shopping histories Armani, Gucci, Prada and Nikon, HP, Nike.]

Social Advertising … in the real world: “A product that is followed by your friends …” Items (products/people) liked by Alice’s friends are better recommendations for Alice.

Social Advertising … privacy problem: The fact that “Betty” liked “VistaPrint” is leaked to “Alice”. Only the items (products/people) liked by Alice’s friends are recommendations for Alice.

Social Advertising … privacy problem: Recommending irrelevant items sometimes improves privacy, but reduces accuracy.

Social Advertising … privacy problem: Alice is recommended ‘X’. Can we provide accurate recommendations to Alice based on the social network, while ensuring that Alice cannot deduce that Betty likes ‘X’?

Outline of this talk: Formal social recommendations problem – Privacy for social recommendations – Accuracy of social recommendations – Example private algorithm and its accuracy. Privacy-Accuracy trade-off – Properties satisfied by a general algorithm – Theoretical bound.

Social Recommendations: A set of agents – Yahoo/Facebook users, medical patients. A set of recommended items – other users (friends), advertisements, products (drugs). A network of edges connecting the agents and items – social network, patient-doctor and patient-drug history. Problem: recommend a new item i to agent a based on the network.

Social Recommendations (this talk): A set of agents – Yahoo/Facebook users, medical patients. A set of recommended items – other users (friends), advertisements, products (drugs). A network of edges connecting the agents and items – social network, patient-doctor and patient-drug history. Problem: recommend a new friend i to target user a based on the social network.

Social Recommendations: [Figure: target node a and candidate recommendations i1, i2, i3, with utilities u(a, i1), u(a, i2), u(a, i3).] Utility function u(a, i) – the utility of recommending candidate i to target a. Examples [Liben-Nowell et al. 2003]: # of common neighbors, # of weighted paths, personalized PageRank.

Non-Private Recommendation Algorithm: Utility function u(a, i) – the utility of recommending candidate i to target a. Algorithm: for each target node a and each candidate i, compute probabilities p(a, i) that maximize Σ u(a,i) p(a,i); then randomly pick one of the candidates with probability p(a,i).
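
A minimal Python sketch of this selection step, assuming the utilities u(a, i) for one target a are already computed into a dict (the helper names are mine; the unconstrained maximizer of Σ u(a,i) p(a,i) is the deterministic arg-max, while sampling proportionally to utility is the randomized variant used on the next slide):

import random

def pick_max_utility(utilities):
    # Deterministic rule: return the candidate with maximum u(a, i).
    return max(utilities, key=utilities.get)

def pick_proportional_to_utility(utilities):
    # Randomized rule: pick candidate i with probability u(a, i) / sum_j u(a, j).
    candidates = list(utilities)
    weights = [utilities[c] for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]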

Example: Common Neighbors Utility. “Alice and Bob are likely to be friends if they have many common neighbors.” In the example graph, u(a, i1) = f(2), u(a, i2) = f(3), u(a, i3) = f(1). Non-private algorithm: return the candidate with max u(a, i), or randomly pick a candidate with probability proportional to u(a, i).
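
A small sketch of computing the common-neighbors utility from an adjacency-set representation (the toy graph and the choice f(x) = x are illustrative, not taken from the talk):

def common_neighbor_utilities(adj, target):
    # u(target, i) = number of common neighbors of target and i (i.e., f(x) = x).
    # adj maps each node to the set of its neighbors; candidates are the
    # non-neighbors of target, excluding target itself.
    candidates = set(adj) - adj[target] - {target}
    return {i: len(adj[target] & adj[i]) for i in candidates}

# Toy graph mirroring the slide's u = f(2), f(3), f(1):
adj = {
    "a":  {"n1", "n2", "n3"},
    "i1": {"n1", "n2"},
    "i2": {"n1", "n2", "n3"},
    "i3": {"n3"},
    "n1": {"a", "i1", "i2"},
    "n2": {"a", "i1", "i2"},
    "n3": {"a", "i2", "i3"},
}
print(common_neighbor_utilities(adj, "a"))  # i1 -> 2, i2 -> 3, i3 -> 1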

Outline of this talk: Formal social recommendations problem – Privacy for social recommendations – Accuracy of social recommendations – Example private algorithm and its accuracy. Privacy-Accuracy trade-off – Properties satisfied by a general algorithm – Theoretical bound.

Differential Privacy [Dwork 2006]: For every pair of inputs D1 and D2 that differ in one value, and for every output O, an adversary should not be able to distinguish between D1 and D2 based on O: log ( Pr[D1 → O] / Pr[D2 → O] ) < ε.
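
Written out as a standalone definition (standard formulation of the definition cited on the slide; A denotes the randomized algorithm):

% A randomized algorithm A is \epsilon-differentially private if, for all
% inputs D_1, D_2 differing in one value and all outputs O,
\left| \log \frac{\Pr[A(D_1) = O]}{\Pr[A(D_2) = O]} \right| \;\le\; \epsilon .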

Privacy for Social Recommendations. Sensitive information: a recommendation should not disclose the existence of an edge between two nodes. For any two graphs G1 and G2 that differ in one edge: log ( Pr[recommending (i, a) | G1] / Pr[recommending (i, a) | G2] ) < ε.

Outline of this talk: Formal social recommendations problem – Privacy for social recommendations – Accuracy of social recommendations – Example private algorithm and its accuracy. Privacy-Accuracy trade-off – Properties satisfied by a general algorithm – Theoretical bound.

Measuring loss in utility due to privacy: Suppose algorithm A recommends node i of utility u_i with probability p_i. The accuracy of A is defined by comparing its utility with the utility of the non-private algorithm.
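
The slide shows the accuracy formula only as an image; one natural reconstruction (my assumption, normalizing the expected utility of A by the best achievable utility) is:

\mathrm{accuracy}(A) \;=\; \frac{\sum_i p_i\, u_i}{\max_i u_i},

so a non-private algorithm that always returns the top-utility candidate has accuracy 1, and spreading probability over low-utility candidates drives accuracy toward 0.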

Outline of this talk: Formal social recommendations problem – Privacy for social recommendations – Accuracy of social recommendations – Example private algorithm and its accuracy. Privacy-Accuracy trade-off – Properties satisfied by a general algorithm – Theoretical bound.

Algorithms for Differential Privacy. Theorem: no (non-constant) deterministic algorithm guarantees differential privacy. Exponential Mechanism – sample the output space based on a distance metric. Laplace Mechanism – add noise from a Laplace distribution to query answers.
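
A minimal sketch of the Laplace mechanism for a single numeric query (parameter values are placeholders):

import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    # Release true_answer + Laplace(scale = sensitivity / epsilon) noise.
    # sensitivity = maximum change in the answer when one record
    # (here: one edge) is added or removed.
    rng = rng or np.random.default_rng()
    return true_answer + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# e.g., a count query with sensitivity 1 released at epsilon = 0.5:
# noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)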

Privacy-Preserving Recommendations: Must pick a node with non-zero probability even if u = 0. Exponential Mechanism [McSherry et al. 2007]: randomly pick a candidate with probability proportional to exp( ε∙u(a,i) / Δ ), where Δ is the maximum change in utilities caused by changing one edge. Satisfies ε-differential privacy.
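
A sketch of that sampling step (the function and argument names are mine; the scoring follows the slide's exp(ε∙u/Δ), though some statements of the exponential mechanism use exp(ε∙u/(2Δ))):

import numpy as np

def exponential_mechanism(utilities, epsilon, delta_u, rng=None):
    # utilities: dict mapping candidate -> u(a, i) for a fixed target a
    # delta_u:   max change in any utility when one edge is added/removed
    rng = rng or np.random.default_rng()
    candidates = list(utilities)
    scores = epsilon * np.array([utilities[c] for c in candidates]) / delta_u
    scores -= scores.max()          # stabilize before exponentiating
    probs = np.exp(scores)
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]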

Accuracy of Exponential Mechanism + Common Neighbors Utility: on the WikiVote network (ε = 0.5), 60% of users have accuracy < 10%.

Accuracy of Exponential Mechanism + Common Neighbors Utility: on a Twitter sample (ε = 1), 98% of users have accuracy < 5%.

Can we do better? Maybe common neighbors utility is an especially non-private utility … – consider general utility functions that follow intuitive axioms. Maybe the Exponential Mechanism does not guarantee sufficient accuracy … – consider any algorithm that satisfies differential privacy.

Outline of this talk: Formal social recommendations problem – Privacy for social recommendations – Accuracy of social recommendations – Example private algorithm and its accuracy. Privacy-Accuracy trade-off – Properties satisfied by a general algorithm – Theoretical bound.

Axioms on Utility Functions: [Figure: candidates i3 and i4 are identical with respect to ‘a’; hence u(a, i3) = u(a, i4).]

Axioms on Utility Functions: “Most of the utility of recommendation to a target is concentrated on a small number of candidates.”
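
One way to state the two axioms formally (my paraphrase; the precise quantitative form used in the underlying paper may differ):

% Exchangeability: utilities depend only on graph structure, so any
% isomorphism \sigma of the graph G that fixes the target a preserves utility:
u_G(a, i) \;=\; u_{\sigma(G)}(a, \sigma(i)).

% Concentration: for each target a there is a small set S of candidates
% carrying most of the total utility:
\sum_{i \in S} u(a, i) \;\ge\; (1 - \delta) \sum_{i} u(a, i), \qquad |S| \ll n.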

Outline of this talk: Formal social recommendations problem – Privacy for social recommendations – Accuracy of social recommendations – Example private algorithm and its accuracy. Privacy-Accuracy trade-off – Properties satisfied by a general algorithm – Theoretical bound.

Accuracy-Privacy Tradeoff. Common Neighbors & Weighted Paths utility*: to achieve constant accuracy for target node a, ε > Ω(log n / degree(a)). (* under some mild assumptions on the weighted paths utility …)
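
To get a feel for the bound, a rough back-of-the-envelope calculation (the network size and degrees are illustrative, and the hidden constant in the Ω(·) is taken to be 1):

import math

def min_epsilon_for_constant_accuracy(n_nodes, degree, constant=1.0):
    # Lower bound from the tradeoff: eps > constant * log(n) / degree(a).
    return constant * math.log(n_nodes) / degree

n = 40_000_000                      # hypothetical network size
for deg in (10, 100, 1000):
    eps = min_epsilon_for_constant_accuracy(n, deg)
    print(f"degree {deg:>4}: need eps > {eps:.2f}")
# degree   10: need eps > 1.75
# degree  100: need eps > 0.18
# degree 1000: need eps > 0.02
# For low-degree (i.e., typical) users the required epsilon is large,
# so only weak privacy is compatible with constant accuracy for them.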

Implications of Accuracy-Privacy Tradeoff: on the WikiVote network (ε = 0.5), 60% of users have accuracy < 55%.

Implications of Accuracy-Privacy Tradeoff: on a Twitter sample (ε = 1), 95% of users have accuracy < 5%.

Takeaway … “For the majority of the nodes in the network, recommendations must either be inaccurate or violate differential privacy!” – Maybe this is a “bad idea”. – Or maybe differential privacy is too strong a privacy definition to shoot for.

Intuition behind main result

Intuition behind main result: [Figure: two graphs G1 and G2 that differ in one edge, each with target a and candidates i and j; under Gk the utilities and recommendation probabilities are u_k(a,i), p_k(a,i) and u_k(a,j), p_k(a,j).] Differential privacy requires p1(a,i) / p2(a,i) < e^ε.

Intuition behind main result: Comparing G1 with G2 gives p1(a,i) / p2(a,i) < e^ε; comparing G1 with a third graph G3 gives p3(a,j) / p1(a,j) < e^ε.

Using Exchangeability: p1(a,i) / p2(a,i) < e^ε and p3(a,j) / p1(a,j) < e^ε. G3 is an isomorphism of G2, so u2(a,i) = u3(a,j), which implies p2(a,i) = p3(a,j).

Using Exchangeability: Chaining these relations gives p1(a,i) / p1(a,j) < e^2ε. (G3 is an isomorphism of G2; u2(a,i) = u3(a,j) implies p2(a,i) = p3(a,j).)
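
The chain behind the e^2ε factor, written out with the same quantities as on the slides:

p_1(a,i) \;<\; e^{\varepsilon}\, p_2(a,i) \;=\; e^{\varepsilon}\, p_3(a,j) \;<\; e^{2\varepsilon}\, p_1(a,j).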

Using Exchangeability: In general, if any node i can be “transformed” into node j in t edge changes, then p1(a,i) / p1(a,j) < e^tε: the probability of recommending the highest-utility node is at most e^tε times the probability of recommending the worst-utility node.

Final Act: Using Concentration. Few nodes have high utility for target a – tens of nodes share a common neighbor with a. Many nodes have low utility for target a – millions of nodes don’t share a common neighbor with a. Thus there exist i and j such that Ω(n) = p1(a,i) / p1(a,j) < e^tε.
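
Spelling out the last step (the link between t and degree(a) is my gloss; the slides state the bound only in terms of t edge changes):

\Omega(n) \;=\; \frac{p_1(a,i)}{p_1(a,j)} \;<\; e^{t\varepsilon}
\quad\Longrightarrow\quad
\varepsilon \;>\; \frac{\Omega(\log n)}{t},

and since, for common-neighbors-style utilities, one candidate can be transformed into another with on the order of degree(a) edge changes, this yields the stated bound ε > Ω(log n / degree(a)).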

Summary of Social Recommendations. Question: “Can social recommendations be made while guaranteeing strong privacy conditions?” – for general utility functions satisfying natural axioms – for any algorithm satisfying differential privacy. Answer: “For the majority of nodes in the network, recommendations must either be inaccurate or violate differential privacy!” – Maybe this is a “bad idea”. – Or maybe differential privacy is too strong a privacy definition to shoot for.

Summary of Social Recommendations. Answer: “For the majority of nodes in the network, recommendations must either be inaccurate or violate differential privacy!” – Maybe this is a “bad idea”. – Or maybe differential privacy is too strong a privacy definition to shoot for. Open Question: “What is the minimum amount of personal information that a user must be willing to disclose in order to get personalized recommendations?”

Thank you