Structural Data De-anonymization: Quantification, Practice, and Implications Shouling Ji, Weiqing Li, and Raheem Beyah Georgia Institute of Technology.

Presentation transcript:

Structural Data De-anonymization: Quantification, Practice, and Implications Shouling Ji, Weiqing Li, and Raheem Beyah Georgia Institute of Technology Mudhakar Srivatsa IBM T. J. Watson Research Center

S. Ji, W. Li, M. Srivatsa and R. Beyah, Structural Data De-anonymization
Narayanan-Shmatikov Attack (IEEE S&P 2009)
A. Narayanan and V. Shmatikov, De-anonymizing Social Networks, IEEE S&P 2009.
Anonymized data: Twitter (crawled in late 2007)
– A microblogging service
– 224K users, 8.5M edges
Auxiliary data: Flickr (crawled in late 2007/early 2008)
– A photo-sharing service
– 3.3M users, 53M edges
Result: 30.8% of the users are successfully de-anonymized
[Figure: user mapping between the Twitter and Flickr graphs]
Heuristics
– Eccentricity
– Edge directionality
– Node degree
– Revisiting nodes
– Reverse match
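
The eccentricity heuristic named on this slide can be illustrated with a short sketch. In the Narayanan-Shmatikov paper, eccentricity measures how clearly the best candidate stands out: the gap between the two highest matching scores, divided by the standard deviation of all scores. The function names, the threshold `theta=0.5`, and the choice of population standard deviation are assumptions made for illustration; none of them come from these slides.

```python
from statistics import pstdev


def eccentricity(scores):
    """Gap between the two best scores, in units of the scores' std deviation."""
    if len(scores) < 2:
        return 0.0
    ordered = sorted(scores, reverse=True)
    sigma = pstdev(scores)  # assumption: population standard deviation
    if sigma == 0:
        return 0.0
    return (ordered[0] - ordered[1]) / sigma


def best_match(candidate_scores, theta=0.5):
    """Return the top-scoring candidate, or None if it does not stand out.

    candidate_scores: dict mapping auxiliary-graph node -> matching score.
    theta: hypothetical eccentricity threshold below which the match is rejected.
    """
    if not candidate_scores:
        return None
    if eccentricity(list(candidate_scores.values())) < theta:
        return None
    return max(candidate_scores, key=candidate_scores.get)


if __name__ == "__main__":
    print(best_match({"a": 10, "b": 1, "c": 1}))  # one clear winner
    print(best_match({"a": 5, "b": 5, "c": 1}))   # tie at the top, rejected
```

Rejecting ambiguous matches trades coverage for precision: a node is mapped only when one candidate is clearly better than the runner-up.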

Srivatsa-Hicks Attacks (ACM CCS 2012)
M. Srivatsa and M. Hicks, De-anonymizing Mobility Traces: Using Social Networks as a Side-Channel, ACM CCS 2012.
Anonymized data
– Mobility traces: St Andrews, Smallblue, and Infocom 2006
Auxiliary data
– Social networks: Facebook and DBLP
Result: over 80% of users can be successfully de-anonymized

Motivation
Question 1: Why can structural data be de-anonymized?
Question 2: What are the conditions for successful data de-anonymization?
Question 3: What portion of users can be de-anonymized in a structural dataset?
[1] P. Pedarsani and M. Grossglauser, On the Privacy of Anonymized Networks, KDD.
[2] L. Yartseva and M. Grossglauser, On the Performance of Percolation Graph Matching, COSN.
[3] N. Korula and S. Lattanzi, An Efficient Reconciliation Algorithm for Social Networks, VLDB 2014.

Motivation
Question 1: Why can structural data be de-anonymized?
Question 2: What are the conditions for successful data de-anonymization?
Question 3: What portion of users can be de-anonymized in a structural dataset?
Our Contribution
– Address the above three open questions under a practical data model.

Outline
– Introduction and Motivation
– System Model
– De-anonymization Quantification
– Evaluation
– Implication 1: Optimization based De-anonymization (ODA) Practice
– Implication 2: Secure Data Publishing
– Conclusion

System Model
– Anonymized Data
– Auxiliary Data
– De-anonymization
– Measurement

System Model
– Anonymized Data
– Auxiliary Data
– De-anonymization
– Measurement
– Quantification
Conceptual underlying graph
– Configuration Model
– G can have an arbitrary degree sequence that follows any distribution
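
To make the configuration model concrete, here is a minimal sketch of the standard stub-matching construction, which builds a random graph from any degree sequence whose sum is even. The function name `configuration_model` and the `seed` parameter are hypothetical, and the result is a multigraph that may contain self-loops and parallel edges; whether the authors' model filters these out is not stated on the slide.

```python
import random


def configuration_model(degrees, seed=None):
    """Return a random edge list in which node i has degree degrees[i].

    Each node contributes degrees[i] "stubs" (half-edges); the stubs are
    shuffled and paired off. Self-loops and parallel edges are possible.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sequence must have an even sum")
    rng = random.Random(seed)
    stubs = [node for node, degree in enumerate(degrees) for _ in range(degree)]
    rng.shuffle(stubs)
    # Pair consecutive stubs: (stubs[0], stubs[1]), (stubs[2], stubs[3]), ...
    return list(zip(stubs[0::2], stubs[1::2]))


if __name__ == "__main__":
    print(configuration_model([3, 2, 2, 1], seed=1))
```

Because the degree sequence is an input rather than a consequence of the model, the same construction covers power-law, Poisson, or any other degree distribution.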

De-anonymization Quantification
Perfect De-anonymization Quantification
– Structural Similarity Condition
– Graph/Data Size Condition

De-anonymization Quantification
(1-ε)-Perfect De-anonymization Quantification
– Structural Similarity Condition
– Graph/Data Size Condition

Evaluation
Datasets

Evaluation
Perfect De-anonymization Condition
– Structural Similarity Condition
– Graph/Data Size Condition

Evaluation
(1-ε)-Perfect De-anonymization Condition
– Structural Similarity Condition
– Graph/Data Size Condition
– Projection/Sampling Condition

Evaluation
(1-ε)-Perfect De-anonymizability
– Structural Similarity Condition

Evaluation
(1-ε)-Perfect De-anonymizability
– Structural Similarity Condition
– How many users can be successfully de-anonymized?

Optimization based De-anonymization (ODA)
Our quantification implies
– An optimum de-anonymization solution exists
– However, it is difficult to find it.
ODA steps
– Select candidate users from the unmapped users with top degrees
– Map the candidate users by minimizing the Edge Error function
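
The two steps on this slide (pick top-degree candidates, then map them by minimizing an edge error) can be sketched for one round as follows. This is an illustration under stated assumptions, not the authors' ODA implementation: the slide does not define the Edge Error function, so `edge_error` here simply counts edges among the mapped nodes that appear in one graph but not the other, and the exhaustive search over permutations is only practical for very small candidate sets. All function names are hypothetical.

```python
from itertools import permutations


def top_degree(edges, k):
    """Return the k highest-degree nodes (ties broken by node label)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree, key=lambda n: (-degree[n], n))[:k]


def edge_error(mapping, anon_edges, aux_edges):
    """Assumed error: edges among mapped nodes present in only one graph."""
    image = set(mapping.values())
    projected = {
        frozenset((mapping[u], mapping[v]))
        for u, v in anon_edges
        if u in mapping and v in mapping
    }
    auxiliary = {
        frozenset((u, v)) for u, v in aux_edges if u in image and v in image
    }
    return len(projected ^ auxiliary)


def map_candidates(anon_edges, aux_edges, k):
    """Brute-force the candidate mapping with minimum edge error (O(k!))."""
    anon_candidates = top_degree(anon_edges, k)
    aux_candidates = top_degree(aux_edges, k)
    best_mapping, best_error = None, None
    for perm in permutations(aux_candidates, len(anon_candidates)):
        mapping = dict(zip(anon_candidates, perm))
        error = edge_error(mapping, anon_edges, aux_edges)
        if best_error is None or error < best_error:
            best_mapping, best_error = mapping, error
    return best_mapping, best_error


if __name__ == "__main__":
    aux = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "e"), ("c", "f")]
    anon = [(10, 20), (10, 30), (10, 40), (20, 50), (30, 60)]
    print(map_candidates(anon, aux, k=3))
```

No seed mappings are supplied anywhere in the sketch, which mirrors the seed-free idea: the first mapped users are chosen from structure alone.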

Optimization based De-anonymization (ODA)
Our quantification implies
– An optimum de-anonymization solution exists
– However, it is difficult to find it.
Space complexity
Time complexity
ODA Features
1. Cold start (seed-free)
2. Can be used by other attacks for landmark (seed) identification
3. Optimization based

ODA Evaluation
Dataset
– Google+ (4.7M users, 90.8M edges): anonymized graphs and auxiliary graphs obtained by random sampling
– Gowalla:
  Anonymized graphs: constructed from 6.4M check-ins generated by 0.2M users
  Auxiliary graph: the Gowalla social graph of the 0.2M users (1M edges)
– Results: landmark identification

ODA Evaluation
Dataset
– Google+ (4.7M users, 90.8M edges): anonymized graphs and auxiliary graphs obtained by random sampling
– Gowalla:
  Anonymized graphs: constructed from 6.4M check-ins generated by 0.2M users
  Auxiliary graph: the Gowalla social graph of the 0.2M users (1M edges)
– Results: de-anonymization
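
One common way to derive an anonymized graph and an auxiliary graph from a single source graph by random sampling is to keep each edge independently with some probability and then replace the node labels of one copy with pseudonyms. The sketch below assumes that procedure; the slides do not give the sampling details, and `project`, `anonymize`, and the keep probability are hypothetical names and parameters.

```python
import random


def project(edges, keep_prob, rng):
    """Keep each undirected edge independently with probability keep_prob."""
    return {edge for edge in sorted(edges) if rng.random() < keep_prob}


def anonymize(edges, nodes, rng):
    """Relabel nodes with a random permutation; return edges and ground truth."""
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    pseudonym = dict(zip(nodes, shuffled))
    anon_edges = {
        tuple(sorted((pseudonym[u], pseudonym[v]))) for u, v in edges
    }
    return anon_edges, pseudonym


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = list(range(6))
    # A 6-node ring stands in for the real source graph.
    source = {tuple(sorted((i, (i + 1) % 6))) for i in nodes}
    auxiliary = project(source, 0.8, rng)
    anonymized, truth = anonymize(project(source, 0.8, rng), nodes, rng)
    print(sorted(auxiliary))
    print(sorted(anonymized))
```

Keeping the `pseudonym` dictionary as ground truth is what makes it possible to score a de-anonymization attack afterwards.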

Secure Structural Data Publishing
Structural information is important
Based on our quantification
– Secure structural data publishing is difficult, at least theoretically
Open problem …

Conclusion
– We proposed the first quantification framework for structural data de-anonymization under a practical data model
– We conducted a large-scale de-anonymizability evaluation of 26 real-world structural datasets
– We designed a cold-start optimization-based de-anonymization algorithm
Acknowledgement
We thank the anonymous reviewers very much for their valuable comments!

Thank you!
Shouling Ji