Published by Joshua Cunningham. Modified over 9 years ago.
1
Differentially Private Analysis of Graphs and Social Networks
Sofya Raskhodnikova, Pennsylvania State University
2
Graphs and networks
Many types of data can be represented as graphs, where nodes represent individuals and edges capture relationships.
Image source: Nykamp DQ, “An introduction to networks.” From Math Insight. http://mathinsight.org/network_introduction
3
Potentially sensitive information in graphs
–Social, romantic and sexual relationships
–“Friendships” in an online social network
–Financial transactions
–Phone calls and email communication
–Doctor-patient relationships
Source: Christakis, Fowler. The Spread of Obesity in a Large Social Network over 32 Years. N Engl J Med 2007; 357:370-379
Source: B. Aven. The effects of corruption on organizational networks and individual behavior. MIT workshop: Information and Decision in Social Networks, 2011.
4
Two conflicting goals
Privacy: protecting information of individuals.
Utility: drawing accurate conclusions about aggregate information.
5
“Anonymized” graphs still pose privacy risks
False dichotomy: personally identifying vs. non-personally identifying information.
Links and any other information about an individual can be used for de-anonymization.
In a typical real-life network, many nodes have unique neighborhoods.
Source: Bearman, Moody, Stovel. Chains of affection: The structure of adolescent romantic and sexual networks, American J. Sociology, 2004
6
Some published de-anonymization attacks
–Movie ratings [Narayanan, Shmatikov 08]: re-identified Netflix users based on information from the public movie database IMDb.
–Social networks [Backstrom, Dwork, Kleinberg 07; Narayanan, Shmatikov 09; Narayanan, Shi, Rubinstein 12]: re-identified users in an anonymized online social network (Twitter) based on information from a public online social network (Flickr).
–Computer networks [Coull, Wright, Monrose, Collins, Reiter 07; Ribeiro, Chen, Miklau, Townsley 08,…]: individuals can be re-identified based on external sources.
7
Who’d want to de-anonymize a social network graph?
–Government agency for surveillance.
–A phisher/spammer to write a personalized message.
–Health insurance provider to check preexisting conditions.
–Marketers to focus advertising on influential nodes.
–Stalkers, nosy neighbors, colleagues, or employers.
Image sources: Andrew Joyner, http://dukeromkey.com/
8
What information can be released without violating privacy?
9
Differential privacy (for graph data)
[Diagram: Graph G → Algorithm (data processing) → output (data release)]
Image source: http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
10
Two variants of differential privacy for graphs
Edge differential privacy: two graphs are neighbors if they differ in one edge.
Node differential privacy: two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edges.
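The two neighbor relations can be made concrete with a small check. This is only an illustrative sketch: the graph representation (a pair of a node set and a set of `frozenset` edges) and the function names `edge_neighbors`, `node_neighbors`, and `remove_node` are assumptions made here, not part of the talk.

```python
def remove_node(graph, v):
    """Return the graph with node v and all edges touching v deleted."""
    nodes, edges = graph
    return (nodes - {v}, {e for e in edges if v not in e})


def edge_neighbors(g1, g2):
    """Edge privacy: same node set, edge sets differ in exactly one edge."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    return nodes1 == nodes2 and len(edges1 ^ edges2) == 1


def node_neighbors(g1, g2):
    """Node privacy: one graph is the other minus one node and its edges."""
    for big, small in ((g1, g2), (g2, g1)):
        extra = big[0] - small[0]
        if len(extra) == 1:
            (v,) = extra
            if remove_node(big, v) == small:
                return True
    return False


# A small star-like graph: node 2 is connected to 1, 3 and 4.
G = ({1, 2, 3, 4}, {frozenset({1, 2}), frozenset({2, 3}), frozenset({2, 4})})
G_minus_edge = ({1, 2, 3, 4}, {frozenset({1, 2}), frozenset({2, 3})})
G_minus_node = remove_node(G, 2)
```

Deleting node 2 removes three edges at once, so `G` and `G_minus_node` are node neighbors but not edge neighbors. That is why node privacy is the stronger requirement.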
11
Differential privacy (for graph data)
[Diagram: Graph G → Algorithm (data processing) → output (data release)]
Image source: http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
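The formula on this slide did not survive in the transcript. For reference, the standard definition of ε-differential privacy, written for graphs, is sketched below. The symbols (the algorithm A, the neighboring graphs G and G', the output set S, and the parameter ε) are notation chosen here and may differ from the slide's own.

```latex
% A randomized algorithm A is \varepsilon-differentially private if,
% for all pairs of neighboring graphs G, G' and all sets S of outputs,
\Pr[\,A(G) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,A(G') \in S\,].
```

Reading "neighboring" as differing in one edge gives edge differential privacy. Reading it as differing in one node and its adjacent edges gives node differential privacy.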
12
Some useful properties of differential privacy
13
Is differential privacy too strong?
–No weaker notion has been proposed that satisfies all three useful properties.
–We can actually attain it for many useful statistics!
14
What graph statistics can be computed accurately with differential privacy?
15
Graph statistics
…
[Plot: fraction of nodes of degree d (vertical axis) against degree d (horizontal axis)]
…
The degree of a node is the number of connections it has.
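The statistic in the plot, the fraction of nodes of each degree d, can be computed non-privately as in the sketch below. The function name `degree_distribution` and the edge-list representation are assumptions made for illustration, and no privacy protection is applied here.

```python
from collections import Counter


def degree_distribution(nodes, edges):
    """Map each degree d to the fraction of nodes that have degree d."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        # Each undirected edge adds one connection to both endpoints.
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    n = len(nodes)
    return {d: c / n for d, c in sorted(counts.items())}


nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (2, 4)]
dist = degree_distribution(nodes, edges)
```

For this four-node example, three nodes have degree 1 and one node has degree 3.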
16
Tools used in differentially private graph algorithms
Smooth sensitivity
–A more nuanced notion of sensitivity than the one mentioned in the previous talk
Sample and aggregate
Maximum flow
Linear and convex programming
Random projections
Iterative updates
Postprocessing
17
Differentially private graph analysis: a taste of techniques
18
Basic question: how to compute a statistic f
[Diagram: Graph G → Algorithm (data processing) → data release]
Image source: http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
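One standard way to release a statistic f privately is the Laplace mechanism: add noise whose scale is the sensitivity of f divided by ε. The sketch below applies it to the edge count under edge privacy, where changing one edge changes the count by exactly 1. The slide's own construction is not recoverable from the transcript, so this is an assumed illustration. The names `laplace_noise` and `private_edge_count` are hypothetical, and a production implementation would also need to handle floating-point subtleties that are ignored here.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_edge_count(edges, epsilon, rng=None):
    """Release the number of edges with epsilon-edge-differential privacy."""
    if rng is None:
        rng = random.Random()
    # Adding or removing one edge changes the count by exactly 1.
    sensitivity = 1.0
    return len(edges) + laplace_noise(sensitivity / epsilon, rng)


edges = [(1, 2), (2, 3), (2, 4)]
rng = random.Random(0)  # fixed seed only to make the example repeatable
noisy = private_edge_count(edges, epsilon=1.0, rng=rng)
```

Smaller values of ε give stronger privacy and larger noise. The noise is independent of the graph size, so the relative error shrinks as the graph grows.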
19
Challenge for node privacy: high sensitivity
20
Challenge for node privacy: high sensitivity
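The slide body is missing from the transcript, but the difficulty can be illustrated with a star graph: deleting a single node can remove up to n − 1 edges, so even the edge count has sensitivity that grows with the graph under node privacy. The helper names below (`star_graph`, `delete_node`, `edge_count`) are hypothetical and chosen only for this sketch.

```python
def star_graph(n):
    """Node 0 is the center, connected to each of the other n - 1 nodes."""
    nodes = set(range(n))
    edges = {frozenset({0, v}) for v in range(1, n)}
    return nodes, edges


def delete_node(graph, v):
    """Return the node-neighbor obtained by deleting v and its edges."""
    nodes, edges = graph
    return nodes - {v}, {e for e in edges if v not in e}


def edge_count(graph):
    return len(graph[1])


G = star_graph(1000)
change_center = edge_count(G) - edge_count(delete_node(G, 0))
change_leaf = edge_count(G) - edge_count(delete_node(G, 1))
```

Deleting the center changes the count by 999, while deleting a leaf changes it by 1. Noise calibrated to the worst case would swamp the statistic on sparse graphs.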
21
Idea: project onto graphs with low sensitivity. [Kasiviswanathan Nissim Raskhodnikova Smith 13] See also [Blocki Blum Datta Sheffet 13, Chen Zhou 13]
22
“Projections” on graphs of small degree
[Diagram: the set of all graphs]
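The simplest map from an arbitrary graph to a graph of small degree is naive truncation: delete every node whose degree exceeds a threshold D. The sketch below shows only that naive map. It is not claimed to be the projection used in the cited papers, and the name `truncate` and the parameter `max_degree` are assumptions made here.

```python
def truncate(nodes, edges, max_degree):
    """Delete every node whose degree exceeds max_degree, with its edges."""
    degree = {v: 0 for v in nodes}
    for e in edges:
        for v in e:
            degree[v] += 1
    kept = {v for v in nodes if degree[v] <= max_degree}
    # Keep an edge only if both of its endpoints survived.
    kept_edges = {e for e in edges if e <= kept}
    return kept, kept_edges


# Center 0 is joined to five leaves; leaves 1 and 2 are also joined.
nodes = {0, 1, 2, 3, 4, 5}
edges = {frozenset({0, v}) for v in range(1, 6)} | {frozenset({1, 2})}
small_nodes, small_edges = truncate(nodes, edges, max_degree=2)
```

Every node in the output has degree at most D, because deleting nodes can only lower the degrees of the remaining ones. The truncation map itself can still change substantially when one node is added or removed, so it cannot simply be followed by noise calibrated to D, and extra care is needed in that step.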
23
Lipschitz extensions
[Diagram: the set of all graphs]
24
Summary
Accurate subgraph counts for realistic graphs can be computed by node-private algorithms
–Use Lipschitz extensions and linear programming
–Subgraph counts are one example of many graph statistics that node-private algorithms do well on.
25
What can’t be computed differentially privately?
Differential privacy explicitly excludes the possibility of computing anything that depends on one person’s data:
–Is there a node in the graph that has atypical connections?
–“Suspicious communication patterns”?
26
What we are working on
Node differentially private algorithms for releasing
–a large number of graph statistics at once
–synthetic graphs
Exciting area of research:
–Edge-private algorithms [Nissim, Raskhodnikova, Smith 07; Hay, Rastogi, Miklau, Suciu 09; Hay, Li, Miklau, Jensen 09; Hardt, Rothblum 10; Karwa, Raskhodnikova, Smith, Yaroslavtsev 11; Karwa, Slavkovic 12; Blocki, Blum, Datta, Sheffet 12; Gupta, Roth, Ullman 12; Mir, Wright 12; Kifer, Lin 13, …]
–Node-private algorithms [Gehrke Lui Pass 12; Blocki Blum Datta Sheffet 13, Kasiviswanathan Nissim Raskhodnikova Smith 13, Chen Zhou 13, Raskhodnikova Smith,..]
27
Conclusions
We are close to having edge-private and node-private algorithms that work well in practice for many basic graph statistics.
Accurate node-private algorithms were thought to be impossible only a few years ago.
Differential privacy is influencing other scientific disciplines
–Next talk: reducing false discovery rate.
28
Experiments for the flow and LP method [Lu]
Graph                  # nodes     # edges     Max degree   Time, secs and # edges (digits run together in the source)
CA-GrQc                5,242       28,992      81           0.027
CA-HepTh               9,877       51,996      65           0.680.5
CA-AstroPh             18,772      396,220     504          0.3410,222
com-dblp-ungraph       317,080     2,099,732   343          22128
com-youtube-ungraph    1,134,890   5,975,248   28,754       994