ADAPTIVE DATA ANONYMIZATION AGAINST INFORMATION FUSION BASED PRIVACY ATTACKS ON ENTERPRISE DATA Srivatsava Ranjit Ganta, Shruthi Prabhakara, Raj Acharya.


Department of Computer Science and Engineering, Penn State University

ABSTRACT: Data privacy is one of the key challenges faced by enterprises today. Enterprises manage individual-specific sensitive information, such as customer data and employee records, on a daily basis. Anonymization techniques (e.g., k-anonymity) allow enterprises to safely release such sensitive data: individual privacy is preserved while organizations can still maintain and share this valuable information. However, current anonymization techniques are prone to attacks wherein an intruder fuses auxiliary information with the anonymized data to infer sensitive information. In this poster, we demonstrate an Information Fusion Based Privacy Attack on anonymized enterprise data and propose a prototype solution to this problem.

INFORMATION FUSION BASED PRIVACY ATTACK: Consider an adversary (possibly an insider) who is given, or otherwise acquires, access to the anonymized release and attempts to estimate the sensitive data. He uses the identifier attributes present in the release to search for additional information about the customers from other sources such as the web, where abundant individual-specific information is available through homepages, blogs, personals, etc. The adversary then uses his understanding of the data to fuse the anonymized release with this web-based auxiliary information and estimate the sensitive data. The goal of this research is to demonstrate such a web-based information fusion attack on enterprise data.

GOAL: Given a sensitive private dataset P, compute a fusion-resilient anonymization P′ of P such that: (1) P′ is resilient to information fusion based privacy attacks, and (2) the utility U offered by P′ meets the release requirements.
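The abstract cites k-anonymity as the representative anonymization technique. A minimal sketch of the idea, with hypothetical example records and a simple age-range generalization (both our illustrative assumptions, not the poster's dataset):

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """True if every combination of quasi-identifier values is shared
    by at least k records (the basic k-anonymity condition)."""
    groups = Counter(tuple(r[a] for a in quasi_ids) for r in records)
    return all(size >= k for size in groups.values())

def generalize_age(age, width=10):
    """Coarsen an exact age into a range, e.g. 34 -> '30-39'."""
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

# Hypothetical customer table with quasi-identifiers (zip, age).
records = [
    {"zip": "16801", "age": 31, "disease": "flu"},
    {"zip": "16801", "age": 34, "disease": "cold"},
    {"zip": "16802", "age": 45, "disease": "flu"},
    {"zip": "16802", "age": 47, "disease": "asthma"},
]

# Exact ages make every record unique on (zip, age) ...
raw_ok = is_k_anonymous(records, ("zip", "age"), 2)
# ... but generalized ages place each record in a group of size 2.
released = [dict(r, age=generalize_age(r["age"])) for r in records]
anon_ok = is_k_anonymous(released, ("zip", "age"), 2)
```

The attack described above works precisely because such a release still carries identifier attributes that can be matched against web sources.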
PROBLEM FORMULATION: Given a sensitive private dataset P, web-based auxiliary information Q, and an information fusion system F, compute an anonymized dataset P′ that maximizes the weighted sum of adversarial estimation error and utility:

H = W1 · (P ∘ P̂) + W2 · U,

where P̂ is the estimate of P made by the adversary using P′, Q, and F; U is the utility of the released dataset; (D1 ∘ D2) denotes the dissimilarity between two datasets D1 and D2; and W1 and W2 are the weights assigned to privacy protection against information fusion attacks and to data utility, respectively.

SOLUTION:

EXPERIMENTAL RESULTS: Information gain (Figure 3): before information fusion, the dissimilarity between the original and released data is (P ∘ P′). After information fusion, the adversary's estimate P̂ is closer to P than P′ is. The difference between (P ∘ P′) and (P ∘ P̂) is the amount of information gained by the adversary through fusion. Optimal anonymization (Figure 5): we use the discernability metric defined in [1] to measure the utility of a k-anonymized dataset. For k = 12, the resulting anonymization offers the maximum weighted sum of privacy protection and utility.

CONCLUSION: This research sheds light on the shortcomings of existing anonymization schemes in the context of enterprise data. We defined an information fusion based privacy attack wherein an adversary uses publicly available web-based information together with the anonymized data to inflict a privacy breach. We also formulated the problem of finding a fusion-resilient data anonymization and proposed one possible solution to this problem.

REFERENCES:
[1] R. Bayardo and R. Agrawal. Data privacy through optimal k-anonymization. In Proceedings of ICDE 2005.
[2] J. Domingo-Ferrer. Practical data-oriented microaggregation for statistical disclosure control. IEEE TKDE, 2002.
[3] B. Kosko. Neural Networks and Fuzzy Systems. Prentice Hall.
[4] K. LeFevre, D. DeWitt, and R. Ramakrishnan. Mondrian multidimensional k-anonymity. In Proceedings of ICDE 2006.
[5] P. Samarati and L. Sweeney. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report, CMU, 1998.
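The formulation leaves the dissimilarity operator (∘) and the utility measure abstract. The sketch below fixes illustrative choices — a cell-wise mismatch rate for dissimilarity and the discernability metric of [1] for utility — which are our assumptions, not definitions from the poster:

```python
def dissimilarity(d1, d2):
    """Fraction of cells that differ between two equal-shaped datasets --
    one illustrative instantiation of the (D1 ∘ D2) operator."""
    total = sum(len(row) for row in d1)
    diff = sum(a != b for r1, r2 in zip(d1, d2) for a, b in zip(r1, r2))
    return diff / total

def discernability_penalty(group_sizes):
    """Discernability metric [1]: each record is charged the size of the
    equivalence class it was generalized into (lower = more utility)."""
    return sum(s * s for s in group_sizes)

def objective(P, P_hat, utility, w1=0.5, w2=0.5):
    """H = W1 * (P ∘ P_hat) + W2 * U: adversarial estimation error
    weighted against the utility of the release."""
    return w1 * dissimilarity(P, P_hat) + w2 * utility

# Toy example: the adversary's fused estimate recovers all but one cell.
P = [[1, 2], [3, 4]]
P_hat = [[1, 0], [3, 4]]
H = objective(P, P_hat, utility=0.8)
```

Searching over candidate anonymizations (e.g., over k in a k-anonymization) for the P′ whose induced P̂ and U maximize H is the optimization the poster reports as peaking at k = 12.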