Published by Shanon Phillips. Modified over 9 years ago.
1
Randomizing Social Network: A Spectrum Preserving Approach
Xiaowei Ying, Xintao Wu
Dept. of Software and Information Systems, Univ. of N.C. – Charlotte
2008 SIAM Conference on Data Mining, April 25th, Atlanta, Georgia
2
Framework
- Background & Motivation
- Graph Spectrum & Structure
- Spectrum & Perturbation
- Spectrum Preserving Randomization
- Privacy Protection
- Conclusion & Future Work
3
Background & Motivation
Social Network:
- Friendship in Karate club [Zachary, 77]
- Biological association network of dolphins [Lusseau et al., 03]
- Collaboration network of scientists [Newman, 06]
- Network of US political books (105 nodes, 441 edges): books about US politics sold by Amazon.com. Edges represent frequent co-purchasing of books by the same buyers. Nodes are colored blue, white, or red to indicate whether they are "liberal", "neutral", or "conservative".
4
Background & Motivation
Privacy Issues in Social Network:
- A social network contains much private relation information.
- Anonymization alone is not enough to protect privacy, e.g. subgraph attacks [Backstrom et al., WWW07; Hay et al., 07].
5
Background & Motivation
Graph Randomization/Perturbation:
1. Random Add/Del edges (no. of edges unchanged)
2. Random Switch edges (nodes' degrees unchanged)
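The two perturbation strategies can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`random_add_del`, `random_switch`, `degrees`), the edge-set representation, and the `max_tries` guard are all assumptions made for the example. Edges are stored as pairs `(i, j)` with `i < j`.

```python
import random
from itertools import combinations

def normalize(u, v):
    """Store an undirected edge with the smaller endpoint first."""
    return (u, v) if u < v else (v, u)

def degrees(edges, n):
    """Degree of every node 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def random_add_del(edges, n, k, rng):
    """Delete k existing edges and add k edges between unconnected pairs.
    The number of edges is unchanged."""
    edges = set(edges)
    non_edges = [p for p in combinations(range(n), 2) if p not in edges]
    removed = rng.sample(sorted(edges), k)
    added = rng.sample(non_edges, k)
    return (edges - set(removed)) | set(added)

def random_switch(edges, k, rng, max_tries=10_000):
    """Try to perform k switches (t,w),(u,v) -> (t,v),(u,w).
    Every node keeps its degree."""
    edges = set(edges)
    done = tries = 0
    while done < k and tries < max_tries:
        tries += 1
        (t, w), (u, v) = rng.sample(sorted(edges), 2)
        if rng.random() < 0.5:
            u, v = v, u                      # pick an orientation at random
        if len({t, w, u, v}) < 4:
            continue                         # the two edges share a node
        new1, new2 = normalize(t, v), normalize(u, w)
        if new1 in edges or new2 in edges:
            continue                         # would create a duplicate edge
        edges -= {normalize(t, w), normalize(u, v)}
        edges |= {new1, new2}
        done += 1
    return edges

# Toy example: a 6-node cycle.
cycle = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)}
rng = random.Random(0)
perturbed = random_add_del(cycle, 6, 2, rng)
switched = random_switch(cycle, 2, rng)
```

In this sketch the added edges are drawn from pairs that were unconnected in the original graph, so an added edge never restores a deleted one.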
6
Background & Motivation
Graph perturbation is resilient to subgraph attacks (refer to our paper for more details).
7
Motivation
Graph Randomization/Perturbation:
- Data utility: How will the graph structure change due to perturbation? How can graph structural features be better preserved?
- Data privacy: Protection of link privacy.
8
Graph Spectrum & Structure
9
Graph Spectrum and Structure
Numerous properties and measures of networks (graph G contains n nodes and m edges):
- Harmonic mean of shortest distance
- Transitivity (cluster coefficient)
- Subgraph centrality
- Modularity (community structure)
- And many others
10
Graph Spectrum and Structure
Adjacency Matrix (graph G contains n nodes and m edges): A = (a_ij), where a_ij = 1 if nodes i and j are connected and a_ij = 0 otherwise.
Adjacency Spectrum: A is symmetric, so it has n real eigenvalues.
11
Graph Spectrum and Structure
Laplacian Matrix: L = D - A, where D is the diagonal matrix of node degrees.
Laplacian Spectrum: L is symmetric and positive semidefinite, so it has n real, nonnegative eigenvalues, the smallest of which is 0.
12
Graph Spectrum and Structure
Many real graph structural features are related to the adjacency/Laplacian spectrum, e.g.:
- No. of triangles
- Subgraph centrality
- Graph diameter
- k disconnected parts in the graph ⇔ k 0's in the Laplacian spectrum
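A small numerical check can make the link between spectrum and structure concrete. The sketch below uses NumPy on a toy graph of two disjoint triangles. It relies on standard identities rather than on formulas taken from the slides: the number of triangles equals the sum of cubed adjacency eigenvalues divided by 6, the average subgraph centrality equals the mean of exp(eigenvalue), and the Laplacian is assumed to be L = D - A. The toy graph and variable names are illustrative.

```python
import numpy as np

# Toy graph: two disjoint triangles on nodes {0,1,2} and {3,4,5}.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Assumed Laplacian definition: degree matrix minus adjacency matrix.
L = np.diag(A.sum(axis=1)) - A

lam = np.linalg.eigvalsh(A)   # adjacency spectrum (ascending)
mu = np.linalg.eigvalsh(L)    # Laplacian spectrum (ascending)

# Closed walks of length 3 count each triangle 6 times.
triangles = int(round(float(np.sum(lam ** 3)) / 6.0))

# Average subgraph centrality: trace(exp(A)) / n.
subgraph_centrality = float(np.mean(np.exp(lam)))

# Each connected component contributes one zero Laplacian eigenvalue.
components = int(np.sum(np.abs(mu) < 1e-9))

print(triangles, components, subgraph_centrality)
```

The `1e-9` tolerance for "zero" eigenvalues is an arbitrary choice that suits a graph this small; larger graphs may need a looser threshold.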
13
Graph Spectrum and Structure
Two important eigenvalues: the largest eigenvalue of the adjacency matrix (λ1) and the second-smallest eigenvalue of the Laplacian matrix (μ2).
1. The maximum degree, chromatic number, clique number etc. are related to λ1;
2. The epidemic threshold for virus propagation in the network is related to λ1 [Wang et al., KDD03];
3. μ2 indicates the community structure of the graph: clear community structure ⇔ μ2 ≈ 0.
14
Spectrum & Perturbation
15
Spectrum & Perturbation
Graph Perturbation:
1. Random Add/Del edges (no. of edges doesn't change)
2. Random Switch edges (nodes' degrees don't change)
16
Spectrum & Perturbation
Both topological and spectral graph features change along the perturbation, and they show similar trends. (Network of US political books, 105 nodes and 441 edges)
17
Spectrum & Perturbation
General bound on the spectrum under perturbation: apply the randomization k times (refer to our paper for more details).
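The bound shown on this slide is not reproduced here. As a stand-in, the sketch below checks one standard result of this kind, Weyl's inequality: for symmetric A and E, each sorted eigenvalue of A + E differs from the matching eigenvalue of A by at most the spectral norm of E. Using it here is an assumption about the flavor of bound meant, not a restatement of the paper's result. The helper names and the 6-cycle example are illustrative.

```python
import numpy as np

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A

def switch_matrix(n, t, w, u, v):
    """Perturbation E such that A + E replaces edges (t,w),(u,v)
    with edges (t,v),(u,w)."""
    E = np.zeros((n, n))
    for i, j, sign in [(t, w, -1.0), (u, v, -1.0), (t, v, 1.0), (u, w, 1.0)]:
        E[i, j] = E[j, i] = sign
    return E

# Toy graph: a 6-node cycle; switch edges (0,1),(3,4) into (0,4),(3,1).
n = 6
A = adjacency(n, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)])
E = switch_matrix(n, t=0, w=1, u=3, v=4)

before = np.linalg.eigvalsh(A)        # ascending order
after = np.linalg.eigvalsh(A + E)     # ascending order

max_shift = float(np.max(np.abs(after - before)))
bound = float(np.linalg.norm(E, 2))   # spectral norm of the perturbation

print(max_shift, bound)
```

For a single switch the perturbation matrix always has spectral norm 2, so under this inequality one switch moves any adjacency eigenvalue by at most 2.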
18
Spectrum Preserving Randomization
19
Spectrum Preserving Randomization
Intuition: since the spectrum is related to many graph topological features, can we preserve more structural features by controlling the movement of eigenvalues?
20
Spectrum Preserving Randomization
Spectral Switch (applied to the adjacency matrix): one switch condition to increase the eigenvalue and one to decrease it.
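The switch conditions themselves are not preserved in this transcript. One way such a condition can be derived is first-order eigenvalue perturbation, sketched below as an assumption rather than as the authors' exact rule. If x is the unit eigenvector of the largest adjacency eigenvalue and a switch replaces edges (t,w),(u,v) with (t,v),(u,w), the estimated change is x^T E x = -2 (x_t - x_u)(x_w - x_v). By the Rayleigh quotient, the true change is never smaller than this estimate, so a positive estimate guarantees an increase. Function names and the toy graph are hypothetical.

```python
import numpy as np

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A

def switch_matrix(n, t, w, u, v):
    """E such that A + E replaces edges (t,w),(u,v) with (t,v),(u,w)."""
    E = np.zeros((n, n))
    for i, j, sign in [(t, w, -1.0), (u, v, -1.0), (t, v, 1.0), (u, w, 1.0)]:
        E[i, j] = E[j, i] = sign
    return E

def first_order_shift(x, t, w, u, v):
    """First-order estimate of the change in the largest eigenvalue."""
    return -2.0 * (x[t] - x[u]) * (x[w] - x[v])

# Toy graph: K4 on {0,1,2,3} without edge (0,1), plus pendant nodes
# 4 (attached to 0) and 5 (attached to 1).
n = 6
A = adjacency(n, [(0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4), (1, 5)])

vals, vecs = np.linalg.eigh(A)        # ascending eigenvalues
lam1, x = vals[-1], vecs[:, -1]       # largest eigenvalue and its eigenvector

# Candidate switch: (0,4),(5,1) -> (0,1),(5,4). It joins two high-weight nodes.
t, w, u, v = 0, 4, 5, 1
E = switch_matrix(n, t, w, u, v)

estimate = first_order_shift(x, t, w, u, v)
actual = float(np.linalg.eigvalsh(A + E)[-1] - lam1)

print(estimate, actual)
```

A spectral switch strategy along these lines would accept a candidate switch only when the sign of the estimate matches the direction in which the eigenvalue should move.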
21
Spectrum Preserving Randomization
Spectral Switch (applied to the Laplacian matrix): one switch condition to decrease the eigenvalue and one to increase it.
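The same first-order reasoning can be sketched for the Laplacian, again as an assumption and not as the rule printed on the slide. With L = D - A (assumed definition), a switch leaves all degrees unchanged, so the Laplacian changes by -E. If y is the unit eigenvector of the second-smallest Laplacian eigenvalue, the estimated change is 2 (y_t - y_u)(y_w - y_v). Because y stays orthogonal to the all-ones vector, the true change is never larger than this estimate, so a negative estimate guarantees a decrease. All names and the toy graph are illustrative.

```python
import numpy as np

def laplacian(n, edges):
    """Assumed definition: L = D - A for an undirected graph."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def first_order_shift(y, t, w, u, v):
    """First-order estimate of the change in the second-smallest
    Laplacian eigenvalue for the switch (t,w),(u,v) -> (t,v),(u,w)."""
    return 2.0 * (y[t] - y[u]) * (y[w] - y[v])

# Toy graph: two groups {0,1,2,3} and {4,5,6,7}, each a K4 missing one
# edge ((0,1) and (4,5)), joined by the two cross edges (0,4) and (1,5).
n = 8
before_edges = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
                (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
                (0, 4), (1, 5)]

# Switch (0,4),(5,1) -> (0,1),(5,4): both cross edges become internal edges.
t, w, u, v = 0, 4, 5, 1
after_edges = [e for e in before_edges if e not in [(0, 4), (1, 5)]]
after_edges += [(0, 1), (4, 5)]

L_before = laplacian(n, before_edges)
L_after = laplacian(n, after_edges)

vals, vecs = np.linalg.eigh(L_before)   # ascending eigenvalues
mu2, y = vals[1], vecs[:, 1]            # second-smallest eigenvalue, eigenvector

estimate = first_order_shift(y, t, w, u, v)
actual = float(np.linalg.eigvalsh(L_after)[1] - mu2)

print(estimate, actual)
```

After this switch the two groups are disconnected, so the second-smallest Laplacian eigenvalue drops to 0, which matches the reading that a value near 0 signals clear community structure.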
22
Spectrum Preserving Randomization
Evaluation: (Network of US political books, 105 nodes and 441 edges)
23
Spectrum Preserving Randomization
Similarly, we also develop a Spectral Add/Del strategy (refer to our paper for more details).
In summary, by controlling the movement of the eigenvalues, spectrum preserving randomization strategies can better preserve the graph structure.
24
Privacy Protection
25
Privacy Protection
Privacy protection measure:
- A priori probability (without the released data)
- Posterior probability (with the released data & perturbation parameters)
- The absolute measure
- The relative measure
26
Privacy Protection
How many times shall we do add/del or switches?
Objective: the minimum level of protection should be above some threshold (for random add/del and random switch).
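The formulas for this slide are not preserved, so the following is only a sketch of how the question could be approached for random add/del, under stated assumptions. It assumes that k true edges are deleted and k false edges are added between originally unconnected pairs, that the attacker knows n, m and k, and that "protection above a threshold" means the posterior probability of a released edge being true is at most a hypothetical value `tau`. The function names are illustrative.

```python
import math
from fractions import Fraction

def prior_edge_probability(n, m):
    """Probability that a random node pair is linked, without released data."""
    return Fraction(m, n * (n - 1) // 2)

def posterior_after_add_del(n, m, k):
    """Posterior probability of a true link given the released graph.

    Returns (p_edge, p_non_edge):
      p_edge     : released edge is a true edge       -> (m - k) / m
      p_non_edge : released non-edge hides a true edge -> k / (N - m)
    where N = n(n-1)/2 is the number of node pairs.
    """
    total_pairs = n * (n - 1) // 2
    p_edge = Fraction(m - k, m)
    p_non_edge = Fraction(k, total_pairs - m)
    return p_edge, p_non_edge

def min_k_for_threshold(m, tau):
    """Smallest k with (m - k) / m <= tau (tau is a hypothetical threshold)."""
    return math.ceil(m * (1 - Fraction(tau)))

# Sizes of the US political books network used in the slides.
n, m = 105, 441
prior = prior_edge_probability(n, m)
p_edge, p_non_edge = posterior_after_add_del(n, m, k=100)
k_needed = min_k_for_threshold(m, Fraction(1, 2))

print(prior, p_edge, p_non_edge, k_needed)
```

Both posteriors depend only on how many released edges are false, which is why counting false edges is enough in this simplified setting.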
27
Privacy Protection
Spectral strategy and random strategy do not differ much in protecting privacy:
- In the graph, there exist both up-edge pairs and down-edge pairs.
- Their proportions affect the privacy protection of the spectral strategy.
- Further study in future work.
28
Conclusion & Future Work
29
Conclusion
1. Graph structure and spectrum are closely related, and perturbation can significantly change both;
2. Spectrum preserving randomization strategies can better preserve the graph structure;
3. Privacy protection issues for random perturbation.
30
Future Work
1. Further study on privacy issues of spectral strategy;
2. A more flexible algorithm for other eigenvalues;
3. Algorithms controlling the magnitude of eigenvalues' change.
31
Thank You! Questions?
32
Reference
1. L. Backstrom et al., Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography, 2007.
2. M. Hay et al., Anonymizing social networks, 2007.
3. Y. Wang et al., Epidemic spreading in real networks: An eigenvalue viewpoint, 2003.
33
Privacy Protection
Privacy protection measure: we can prove a result for a given (i, j). Therefore, calculating the measure is based on calculating the number of false edges (refer to our paper for more details).