Structure based Data De-anonymization of Social Networks and Mobility Traces Shouling Ji, Weiqing Li, and Raheem Beyah Georgia Institute of Technology.

Structure based Data De-anonymization of Social Networks and Mobility Traces
Shouling Ji, Weiqing Li, and Raheem Beyah, Georgia Institute of Technology
Mudhakar Srivatsa, IBM T. J. Watson Research Center
Jing S. He, KSU
Presenter: Qin Liu, Chinese University of Hong Kong

Ji et al. Structure based Data De-anonymization
Introduction
Social networking services are a fast-growing business nowadays
– Facebook, Twitter, Google+, LiveJournal, YouTube, …
When users participate in online social network activities, their privacy faces potentially serious threats
– Create personal portfolio, post current location, …
Countermeasures
– Naïve anonymization: removing “Personally Identifiable Information (PII)”
– Edge modification
– k-anonymity and its variants
Still vulnerable to powerful structure-based de-anonymization attacks
– Narayanan-Shmatikov attack (IEEE S&P 2009)
– Srivatsa-Hicks attack (ACM CCS 2012)
– Others

Narayanan-Shmatikov attack (IEEE S&P 2009)
Anonymized data: Twitter (crawled in late 2007)
– A microblogging service
– 224K users, 8.5M edges
Auxiliary data: Flickr (crawled in late 2007/early 2008)
– A photo-sharing service
– 3.3M users, 53M edges
Result: 30.8% of the users are successfully de-anonymized
[Figure: user mapping between Twitter and Flickr]
Heuristics
– Eccentricity
– Edge directionality
– Node degree
– Revisiting nodes
– Reverse match

Srivatsa-Hicks attack (ACM CCS 2012)
Anonymized data
– Mobility traces: St Andrews, Smallblue, and Infocom 2006
Auxiliary data
– Social networks: Facebook and DBLP
De-anonymize mobility traces using the corresponding social networks
Over 80% of users can be successfully de-anonymized

Other structural de-anonymization attacks
Backstrom et al. attack (WWW 2007)
– Both active attacks and passive attacks
Narayanan et al. attack (IJCNN 2011)
– A simplified version of the Narayanan-Shmatikov attack (IEEE S&P 2009)
– For breaching link privacy
Pedarsani et al. attack (Allerton 2013)
– A Bayesian method based attack

Limitations of existing attacks
Not scalable
– E.g., Backstrom et al. attack (WWW 2007) needs to create Sybil users before the anonymized data is released, which is not controllable or scalable
– E.g., Srivatsa-Hicks attack (CCS 2012) has a complexity of O(k! n^3), where k is the number of seeds, which is not scalable
High computational cost
– E.g., Narayanan-Shmatikov attack (S&P 2009) has a complexity of O(n^k + n^4)
Not general
– E.g., Narayanan-Shmatikov attack (S&P 2009) is designed for directed graphs
– E.g., Pedarsani et al. attack (Allerton 2013) is good for sparse graphs but bad for dense graphs

Our contributions
Defined and measured three de-anonymization metrics
– Structural similarity, relative distance similarity, and inheritance similarity
Proposed a Unified Similarity (US) based De-Anonymization (DA) framework
– Iteratively de-anonymize data with accuracy guarantee
Generalized DA to an Adaptive De-Anonymization (ADA) framework
– To de-anonymize large-scale data without knowledge of the overlap size between the anonymized data and the auxiliary data
Applied the proposed de-anonymization attacks to real-world datasets
– Successfully de-anonymized three mobility traces: St Andrews, Infocom06, and Smallblue
– Successfully de-anonymized three social network datasets: ArnetMiner, Google+, and Facebook

Outline
Background
Preliminaries and Model
De-anonymization
Generalized Scalable De-anonymization
Experiments
Conclusion and Future Work

Preliminaries and Model
Anonymized data graph
Auxiliary data graph
Attack Model
– A de-anonymization attack is a mapping of users from the anonymized graph to the auxiliary graph, i.e.,

Datasets – mobility traces
Mobility traces (anonymized data) and social networks (auxiliary data), the same as in the Srivatsa-Hicks attack (ACM CCS 2012)
Preprocess the mobility traces to construct anonymized contact graphs (see Srivatsa and Hicks's paper for details)
Use the social networks as auxiliary data to de-anonymize the mobility traces

Datasets – social networks
ArnetMiner
– A coauthor network
– A weighted graph with weight indicating the number of coauthored papers
– 1,127 authors and 6,690 “coauthor” relationships
Google+
– Two Google+ datasets crawled on July 19 and August 6 in 2011, denoted by JUL and AUG, respectively
– JUL: 5,200 users, 7,062 connections
– AUG: 5,200 users, 7,813 connections
Facebook
– 63,731 users
– 1,269,502 friend relationships

De-anonymization
High-Level Description
– Seed selection
– Mapping propagation
Seed selection
– Identify a small number of seed mappings from the anonymized graph to the auxiliary graph
– Bootstrap the de-anonymization
Mapping propagation
– De-anonymize the anonymized graph using multiple similarity measurements

Mapping Propagation
Metrics
– Structural Similarity
– Relative Distance Similarity
– Inheritance Similarity
– Unified Similarity
We also defined the weighted version of these metrics by considering the weights on edges
Propagation framework

Structural Similarity
Degree centrality
– The number of ties that a node has in a graph

Structural Similarity
Closeness centrality
– How close a node is to other nodes in a graph
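The formulas on the two centrality slides above did not survive in this transcript. As a rough illustration only, the sketch below computes degree and closeness centrality for an unweighted, undirected graph stored as an adjacency dictionary. The normalizations (dividing the degree by n - 1, and using reachable nodes over summed hop distances for closeness) are common textbook choices assumed here, not necessarily the ones used by the authors. All function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(adj, node):
    # Assumed normalization: number of ties divided by the maximum possible, n - 1.
    return len(adj[node]) / (len(adj) - 1)


def closeness_centrality(adj, node):
    # Assumed form: (other reachable nodes) / (sum of hop distances to them).
    dist = bfs_distances(adj, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Toy path graph: a - b - c - d
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(degree_centrality(adj, "b"), closeness_centrality(adj, "b"))
```

In the toy path graph, the inner node b has more ties and is closer to the others than the end node a, so both scores are larger for b.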

Structural Similarity
Betweenness centrality
– A node’s global structural importance within a graph
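The slide does not say how betweenness is computed. One standard option, assumed here purely for illustration, is Brandes' algorithm, which counts for every node how many shortest paths between other node pairs pass through it. The sketch handles unweighted, undirected graphs and returns unnormalized scores; the function name is hypothetical.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors on them.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate pair dependencies, visiting nodes farthest from s first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is counted once from each endpoint.
    return {v: score / 2.0 for v, score in bc.items()}


# Toy path graph: a - b - c - d
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(betweenness_centrality(adj))
```

On the path graph, b lies on the shortest paths a-c and a-d, and c lies on a-d and b-d, so both inner nodes score 2 while the end nodes score 0.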

Structural Similarity
Defined as the cosine similarity between two nodes’ degree, closeness, and betweenness centralities
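One way to read the definition above is to stack each node's three centrality values into a vector and take the cosine of the angle between the two vectors. The sketch below follows that reading; the exact form used by the authors (for example, how the weighted variants enter) is not recoverable from the slide, and the function names and sample values are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # Convention assumed here for an all-zero vector.
    return dot / (norm_x * norm_y)


def structural_similarity(centralities_a, centralities_u):
    # Each argument is (degree, closeness, betweenness) for one node:
    # one from the anonymized graph, one from the auxiliary graph.
    return cosine_similarity(centralities_a, centralities_u)


# Made-up centrality triples for an anonymized node and two auxiliary candidates.
anon_node = (0.40, 0.55, 0.10)
candidate_1 = (0.42, 0.50, 0.12)
candidate_2 = (0.05, 0.20, 0.00)
print(structural_similarity(anon_node, candidate_1))
print(structural_similarity(anon_node, candidate_2))
```

Cosine similarity compares the direction of the vectors rather than their magnitude, so it tolerates the two graphs having different sizes to some degree.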

Relative Distance Similarity
Defined as the cosine similarity between two nodes’ distance vectors to seeds
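A minimal sketch of this metric, assuming hop counts as the distance and a list of already known seed pairs (anonymized node, auxiliary node): each node gets a vector of distances to the seeds on its own side, ordered consistently by the seed pairing, and the two vectors are compared by cosine similarity. Treating an unreachable seed as distance n is an assumption made only to keep the example total. All names are hypothetical.

```python
import math
from collections import deque


def hop_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_vector(adj, node, seeds):
    dist = hop_distances(adj, node)
    # Assumption: an unreachable seed is given distance n (the number of nodes).
    return [dist.get(s, len(adj)) for s in seeds]


def relative_distance_similarity(adj_a, node_a, adj_u, node_u, seed_pairs):
    vec_a = distance_vector(adj_a, node_a, [a for a, _ in seed_pairs])
    vec_u = distance_vector(adj_u, node_u, [u for _, u in seed_pairs])
    dot = sum(x * y for x, y in zip(vec_a, vec_u))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_u = math.sqrt(sum(y * y for y in vec_u))
    return dot / (norm_a * norm_u) if norm_a and norm_u else 0.0


# Anonymized path 1 - 2 - 3 - 4 and auxiliary path a - b - c - d,
# with the two endpoints known as seeds.
adj_a = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
adj_u = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
seeds = [(1, "a"), (4, "d")]
print(relative_distance_similarity(adj_a, 2, adj_u, "b", seeds))
print(relative_distance_similarity(adj_a, 2, adj_u, "c", seeds))
```

Node 2 and node b sit at the same distances from the seeds, so they score higher than the pair (2, c), whose distance vectors are mirrored.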

Inheritance Similarity
Characterizes the knowledge provided by the current mapping results
– Two nodes with more common mapped neighbors have a higher inheritance similarity score
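The slide gives only the intuition, so the following is a deliberately simplified stand-in rather than the authors' formula: count the neighbors of the anonymized node that are already mapped to neighbors of the auxiliary node, and normalize by the larger of the two degrees. The normalization and the function name are assumptions.

```python
def inheritance_similarity(adj_a, node_a, adj_u, node_u, mapping):
    """Fraction of neighbors already mapped onto each other (simplified stand-in).

    `mapping` holds the current de-anonymization results as
    {anonymized node: auxiliary node}.
    """
    neighbors_a = adj_a[node_a]
    neighbors_u = adj_u[node_u]
    # A neighbor counts as "common" if it is mapped and its image is adjacent to node_u.
    common = sum(1 for n in neighbors_a if n in mapping and mapping[n] in neighbors_u)
    denom = max(len(neighbors_a), len(neighbors_u))  # assumed normalization
    return common / denom if denom else 0.0


# Two star graphs; leaves 2 and 3 are already mapped to b and c.
adj_a = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
adj_u = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
mapping = {2: "b", 3: "c"}
print(inheritance_similarity(adj_a, 1, adj_u, "a", mapping))
print(inheritance_similarity(adj_a, 1, adj_u, "b", mapping))
```

The score grows as more neighbors are mapped, which is why this metric becomes more informative in later propagation rounds.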

Unified Similarity (US)
Considering the structural similarity, relative distance similarity, and inheritance similarity
[Equation labels: US is formed by applying weights to the structural similarity, relative distance similarity, and inheritance similarity]
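The equation itself is missing from this transcript; its surviving labels suggest a weighted combination of the three scores. The sketch below assumes a plain weighted sum. The weight values are placeholders chosen for the example, not values reported by the authors, and the function name is hypothetical.

```python
def unified_similarity(s_struct, s_dist, s_inherit, weights):
    """Weighted sum of the three similarity scores (assumed form).

    `weights` is (w_struct, w_dist, w_inherit); the values are up to the caller.
    """
    w_struct, w_dist, w_inherit = weights
    return w_struct * s_struct + w_dist * s_dist + w_inherit * s_inherit


# Placeholder weights and scores for one (anonymized node, auxiliary node) pair.
weights = (0.2, 0.3, 0.5)
us = unified_similarity(1.0, 0.5, 0.0, weights)
print(us)
```

Keeping the weights as an explicit argument makes it easy to shift emphasis, for example toward inheritance similarity once many nodes have been mapped.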

US based De-Anonymization (DA) Framework
Step 1: Identify seeds using existing techniques
Step 2: Calculate two candidate node sets C_a and C_u from the anonymized graph and the auxiliary graph, respectively
Step 3: Calculate the US of each user in C_a to every user in C_u, and construct a weighted bipartite graph from C_a and C_u based on the calculated US scores
Step 4: Seek a maximum weighted bipartite matching
Step 5: Decide whether to accept each node de-anonymization result in the bipartite matching
Go to step 2 if the end condition is not reached
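Steps 3 to 5 can be illustrated on a toy score table. The sketch below finds a maximum weighted bipartite matching by exhaustive search, which is only practical for a handful of candidates (a real implementation would use a polynomial-time assignment algorithm), and then applies a simple score threshold as the acceptance rule. The threshold rule, the score values, and the function names are assumptions; the slide does not state the acceptance criterion.

```python
from itertools import permutations


def max_weight_matching(cand_a, cand_u, score):
    """Exhaustive maximum weighted bipartite matching (toy sizes only).

    Assumes len(cand_a) <= len(cand_u); score[a][u] is the US of the pair.
    """
    best, best_total = {}, float("-inf")
    for perm in permutations(cand_u, len(cand_a)):
        total = sum(score[a][u] for a, u in zip(cand_a, perm))
        if total > best_total:
            best, best_total = dict(zip(cand_a, perm)), total
    return best


def accept(matching, score, threshold):
    # Assumed acceptance rule: keep a matched pair only if its US reaches the threshold.
    return {a: u for a, u in matching.items() if score[a][u] >= threshold}


# Hypothetical US scores between candidates C_a = [1, 2] and C_u = [a, b, c].
score = {
    1: {"a": 0.90, "b": 0.80, "c": 0.10},
    2: {"a": 0.85, "b": 0.20, "c": 0.30},
}
matching = max_weight_matching([1, 2], ["a", "b", "c"], score)
accepted = accept(matching, score, threshold=0.82)
print(matching, accepted)
```

The example shows why a matching is sought instead of picking each node's best candidate independently: node 1 scores highest with a, but giving a to node 2 yields a larger total.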

Generalized Scalable De-anonymization
Core Matching Subgraph (CMS)

Adaptive De-Anonymization (ADA)
Identify initial CMS
Run DA on initial CMS
Update CMS or End

Experiments – de-anonymize mobility traces

Experiments – de-anonymize ArnetMiner

Experiments – de-anonymize Google+

Experiments – de-anonymize Facebook

Conclusion and Future Work
Conclusion
– Proposed and examined several structural similarity metrics
– Designed a new scalable structural de-anonymization framework for mobility traces and social networks
– Validated the proposed de-anonymization framework on multiple mobility traces and social networks
Future work
– More experiments on large-scale datasets
– De-anonymizability quantification (partially done in our ACM CCS 2014 paper)
– Secure data publishing system

Thank you and the presenter Qin Liu!
Shouling Ji