Secure Distributed Framework for Achieving ε-Differential Privacy Dima Alhadidi, Noman Mohammed, Benjamin C. M. Fung, and Mourad Debbabi Concordia Institute.

Published in: IEEE Transactions on Industrial Informatics
Presentation transcript:

Secure Distributed Framework for Achieving ε-Differential Privacy
Dima Alhadidi, Noman Mohammed, Benjamin C. M. Fung, and Mourad Debbabi
Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Quebec, Canada

2 6/24/2012 Outline
– Motivation
– Problem Statement
– Related Work
– Background
– Two-Party Differentially Private Data Release
– Performance Analysis
– Conclusion

4 Motivation
[Figure labels: Individuals, Data Publisher, Anonymization Algorithm, Data Recipients; Centralized, Distributed]

5 Motivation
Distributed: Vertically-Partitioned

ID  Job
1   Writer
2   Dancer
3   Writer
4   Dancer
5   Engineer
6   Engineer
7   Engineer
8   Dancer
9   Lawyer
10  Lawyer

ID  Sex  Salary
1   M    30K
2   M    25K
3   M    35K
4   F    37K
5   F    65K
6   F    35K
7   M    30K
8   F    44K
9   M    44K
10  F    44K

6 Motivation
Distributed: Vertically-Partitioned (integrated table)

ID  Job       Sex  Salary
1   Writer    M    30K
2   Dancer    M    25K
3   Writer    M    35K
4   Dancer    F    37K
5   Engineer  F    65K
6   Engineer  F    35K
7   Engineer  M    30K
8   Dancer    F    44K
9   Lawyer    M    44K
10  Lawyer    F    44K

7 Motivation
Distributed: Horizontally-Partitioned

ID  Job      Sex  Age  Surgery
1   Janitor  M    34   Transgender
2   Lawyer   F    58   Plastic
3   Mover    M    58   Urology
4   Lawyer   M    24   Vascular
5   Mover    M    34   Transgender
6   Janitor  M    44   Plastic
7   Doctor   F    44   Vascular

ID  Job      Sex  Age  Surgery
8   Doctor   M    58   Plastic
9   Doctor   M    24   Urology
10  Janitor  F    63   Vascular
11  Mover    F    63   Plastic

8 Motivation
Distributed: Horizontally-Partitioned (integrated table)

ID  Job      Sex  Age  Surgery
1   Janitor  M    34   Transgender
2   Lawyer   F    58   Plastic
3   Mover    M    58   Urology
4   Lawyer   M    24   Vascular
5   Mover    M    34   Transgender
6   Janitor  M    44   Plastic
7   Doctor   F    44   Vascular
8   Doctor   M    58   Plastic
9   Doctor   M    24   Urology
10  Janitor  F    63   Vascular
11  Mover    F    63   Plastic

13 Problem Statement
Desideratum: develop a two-party data publishing algorithm for horizontally-partitioned data that:
– achieves differential privacy, and
– satisfies the security definition of secure multiparty computation (SMC).

15 Related Work

                                      Data Owner                              Privacy Model
Algorithms                            Centralized  Distributed  Distributed   Differential  Partition-based
                                                   Horizontally Vertically    Privacy       Privacy
LeFevre et al., Fung et al., etc.     ✓                                                     ✓
Xiao et al., Mohammed et al., etc.    ✓                                       ✓
Jurczyk and Xiong, Mohammed et al.                 ✓                                        ✓
Jiang and Clifton, Mohammed et al.                              ✓                           ✓
Our proposal                                       ✓                          ✓

17 k-Anonymity

18 k-Anonymity
Quasi-identifier (QID)

19 k-Anonymity
3-anonymous patient table

Job           Sex     Age      Disease
Professional  Male    [36-40]  Fever
Professional  Male    [36-40]  Fever
Professional  Male    [36-40]  Hepatitis
Artist        Female  [30-35]  Flu
Artist        Female  [30-35]  Hepatitis
Artist        Female  [30-35]  Hepatitis
Artist        Female  [30-35]  Hepatitis

20 Differential Privacy
[Figure labels: D, D]
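The formula on this slide did not survive in the transcript. For reference, the standard definition of ε-differential privacy is reproduced here. The symbols A (the randomized algorithm), D and D′ (two data sets differing in at most one record), and S (a set of outputs) are the usual textbook notation and may differ from the notation on the original slide.

```latex
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\epsilon} \cdot \Pr[\mathcal{A}(D') \in S]
\qquad \text{for all neighboring } D, D' \text{ and all } S \subseteq \mathrm{Range}(\mathcal{A})
```

A smaller ε forces the two output distributions closer together, so any single record has less influence on what is released.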

21 Laplace Mechanism
[Figure label: D]

22 Exponential Mechanism
McSherry and Talwar proposed the exponential mechanism, which chooses an output that is close to the optimum with respect to a utility function while preserving differential privacy.
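A minimal single-party sketch of the exponential mechanism may help make the slide concrete. It assumes the standard form, in which a candidate with score u is chosen with probability proportional to exp(ε·u / (2·Δu)). The function name, the `sensitivity` parameter (Δu), and the example scores are illustrative and not taken from the slides. This is the centralized mechanism only, not the secure two-party protocol described later.

```python
import math
import random

def exponential_mechanism(scores, epsilon, sensitivity=1.0, rng=random):
    """Pick a candidate with probability proportional to exp(eps * u / (2 * sensitivity)).

    `scores` maps each candidate to its utility score u (hypothetical input format).
    """
    # Subtracting the top score avoids overflow and leaves the ratios unchanged.
    top = max(scores.values())
    weights = {c: math.exp(epsilon * (u - top) / (2.0 * sensitivity))
               for c, u in scores.items()}
    total = sum(weights.values())
    r = rng.random() * total          # uniform point in [0, total)
    running = 0.0
    for candidate, w in weights.items():
        running += w
        if r < running:
            return candidate
    return candidate                  # guard against floating-point round-off

if __name__ == "__main__":
    print(exponential_mechanism({"Job": 8.0, "Sex": 7.0, "Age": 6.0}, epsilon=1.0))
```

Higher-scoring candidates are more likely to win, but every candidate keeps a non-zero chance, which is what gives the privacy guarantee.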

24 Two-Party Differentially Private Data Release
– Generalizing the raw data
– Adding noisy count

25 Generalizing the raw data
Distributed Exponential Mechanism (DEM)

26 Generalization
Distributed Exponential Mechanism (DEM)

27 Adding Noisy Count
– Each party adds Laplace noise to its count.
– Each party sends the result to the other party.
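The two steps above can be sketched as follows. This is an illustration under stated assumptions: the count query has sensitivity 1, so the noise scale is 1/ε, and the released value is taken to be the sum of the two noisy counts (the slide does not say how the exchanged values are combined). All function names are hypothetical.

```python
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng=random):
    """A count query has sensitivity 1, so the Laplace scale is 1 / epsilon."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

def exchange_counts(count_p1, count_p2, epsilon, rng=random):
    """Each party perturbs its own count and sends the result to the other party.

    Assumption: the published count is the sum of the two noisy counts.
    """
    sent_by_p1 = noisy_count(count_p1, epsilon, rng)
    sent_by_p2 = noisy_count(count_p2, epsilon, rng)
    return sent_by_p1 + sent_by_p2

if __name__ == "__main__":
    # Example: 4 records in one party's group and 2 in the other's.
    print(exchange_counts(4, 2, epsilon=1.0, rng=random.Random(0)))
```

Because the data is horizontally partitioned, each record belongs to exactly one party, so each record affects only one of the two noisy counts.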

28 Two-Party Protocol for Exponential Mechanism
Input:
1. Two raw data sets, one held by each party
2. Set of candidates
3. Privacy budget
Output: Winner candidate

29 Max Utility Function
D1:
ID  Class  Job      Sex  Age  Surgery
1   N      Janitor  M    34   Transgender
2   Y      Lawyer   F    58   Plastic
3   Y      Mover    M    58   Urology
4   N      Lawyer   M    24   Vascular
5   Y      Mover    M    34   Transgender
6   Y      Janitor  M    44   Plastic
7   Y      Doctor   F    44   Vascular

Data Set              Job           Class=Y  Class=N  Max
D1                    Blue-collar   3        1        5
                      White-collar  2        1
D2                    Blue-collar   2        0        3
                      White-collar  1        1
Integrated D1 and D2  Blue-collar   5        1        8
                      White-collar  3        2

30 Max Utility Function
D2:
ID  Class  Job      Sex  Age  Surgery
8   N      Doctor   M    58   Plastic
9   Y      Doctor   M    24   Urology
10  Y      Janitor  F    63   Vascular
11  Y      Mover    F    63   Plastic

31 Max Utility Function
D1 & D2: records 1 to 11 from the two tables above, combined. The class counts and Max values are the same as in the table on slide 29.

32-39 Computing Max Utility Function
Step-by-step computation over the integrated class counts:
– Blue-collar: max=1
– Blue-collar: max=5, sum=5
– White-collar: max=2, sum=5
– White-collar: max=3, sum=8
– Result: Shares 1 and 2 (one per party)
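The Max scores in the example can be checked with a short plaintext computation: for each job group, take the larger of the two class counts, then sum over the groups. The sketch below works on the raw records directly, so it shows only what is being computed, not the secure protocol, which computes the same value on shares. The Blue-collar/White-collar grouping is inferred from the counts in the example table, and the function names are hypothetical.

```python
from collections import Counter

# Grouping inferred from the example counts (assumption, not stated on the slides).
GROUP = {"Janitor": "Blue-collar", "Mover": "Blue-collar",
         "Lawyer": "White-collar", "Doctor": "White-collar"}

# (Class, Job) pairs for records 1-7 (D1) and 8-11 (D2) from the example tables.
D1 = [("N", "Janitor"), ("Y", "Lawyer"), ("Y", "Mover"), ("N", "Lawyer"),
      ("Y", "Mover"), ("Y", "Janitor"), ("Y", "Doctor")]
D2 = [("N", "Doctor"), ("Y", "Doctor"), ("Y", "Janitor"), ("Y", "Mover")]

def class_counts(records):
    """Count records per (job group, class label)."""
    counts = Counter()
    for cls, job in records:
        counts[(GROUP[job], cls)] += 1
    return counts

def max_score(counts):
    """Sum, over job groups, of the larger class count in that group."""
    groups = {group for group, _ in counts}
    return sum(max(counts[(g, "Y")], counts[(g, "N")]) for g in groups)

if __name__ == "__main__":
    c1, c2 = class_counts(D1), class_counts(D2)
    print(max_score(c1), max_score(c2), max_score(c1 + c2))
```

Note that the score of the integrated data (8) equals 5 + 3 here only by coincidence of this example. In general the maximum must be taken after the two parties' counts are added, which is why the parties cannot simply add their local scores.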

40 Computing the Exponential Equation
Given the scores of all the candidates, the exponential mechanism selects the candidate having score u with the following probability:
[Equation: selection probability]
Shares 1 and 2
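The equation on this slide is missing from the transcript. In the standard formulation of the exponential mechanism, the selection probability has the form below. The symbols v_i (the i-th candidate), u_i (its score), Δu (the sensitivity of the utility function), and n (the number of candidates) are assumed notation and may not match the original slide.

```latex
\Pr[\text{select } v_i] \;=\;
\frac{\exp\!\left(\dfrac{\epsilon\, u_i}{2\,\Delta u}\right)}
     {\sum_{j=1}^{n} \exp\!\left(\dfrac{\epsilon\, u_j}{2\,\Delta u}\right)}
```

In the two-party setting each score u_i is available only as two shares, so the exponentials cannot be evaluated directly by either party.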

41 Computing the Exponential Equation
[Equation: Taylor series expansion]

42 Computing the Exponential Equation
– Lowest common multiple of {2!,…,w!}: no fractions
– Approximating up to a predetermined number s of digits after the decimal point

43 Computing the Exponential Equation
[Equation] No fraction
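The equations on these slides are lost, so the following is only a sketch of the idea they describe, under assumptions. The exponential is approximated by a Taylor polynomial of degree w, every coefficient is multiplied by the lowest common multiple of {2!,…,w!} (which equals w!) to clear the denominators, and the argument is fixed to s decimal digits and scaled by 10^s. The result is a polynomial with integer coefficients evaluated at an integer point. The function name and the returned (value, scale) pair are hypothetical, and the polynomial actually used by the authors may differ.

```python
from math import exp, factorial

def integer_taylor_exp(x, w=12, s=3):
    """Approximate exp(x) using integer arithmetic only.

    Returns (value, scale), both integers, such that value / scale ~ exp(x).
    """
    lcm = factorial(w)            # lcm of {2!, ..., w!} is w!, since each i! divides w!
    x_int = round(x * 10 ** s)    # x rounded to s decimal digits, stored as an integer
    # Term i of the Taylor series, x^i / i!, becomes
    # (lcm / i!) * 10^(s*(w-i)) * x_int^i after scaling by lcm * 10^(s*w).
    value = sum((lcm // factorial(i)) * 10 ** (s * (w - i)) * x_int ** i
                for i in range(w + 1))
    scale = lcm * 10 ** (s * w)
    return value, scale

if __name__ == "__main__":
    value, scale = integer_taylor_exp(1.0)
    print(value / scale, exp(1.0))
```

The common scale factor is the same for every candidate, so it cancels when the candidates' weights are compared, and only integers need to be handled inside the secure computation.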

44 Computing the Exponential Equation
Oblivious Polynomial Evaluation
[Figure labels: First Party, Second Party, Result]

45 Computing the Exponential Equation
[Figure labels: Second Party, First Party]

46 Computing the Exponential Equation
Picking a random number in [0,1]

47 Computing the Exponential Equation
Picking a random number in [0, ]

48 Picking a Random Number
Random Value Protocol [Bunn and Ostrovsky 2007]
[Figure labels: First Party, Second Party]

49 Picking a Winner
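The selection step on the preceding slides can be illustrated in plaintext. Instead of normalizing the weights to probabilities and drawing from [0,1], one can draw a random number from the unnormalized range and return the candidate whose cumulative interval contains it. The sketch below assumes integer weights (as produced by a no-fraction approximation) and an upper bound equal to the sum of all weights. The transcript does not preserve the actual bound, and the secure protocol operates on shares rather than plain values. `pick_winner` is a hypothetical name.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pick_winner(candidates, weights, rng=random):
    """Return a candidate with probability weight / sum(weights).

    `weights` are non-negative integers with a positive total.
    """
    cumulative = list(accumulate(weights))   # right ends of each candidate's interval
    r = rng.randrange(cumulative[-1])        # uniform integer in [0, total)
    # Candidate i owns the integers in [cumulative[i-1], cumulative[i]).
    return candidates[bisect_right(cumulative, r)]

if __name__ == "__main__":
    print(pick_winner(["Job", "Sex", "Age"], [2, 3, 5], rng=random.Random(0)))
```

Working with the unnormalized range avoids a division, which is convenient when all values are integers held as shares.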

51 Performance Analysis
– Adult: a census data set
  6 numerical attributes
  8 categorical attributes
  45,222 census records
– Cost estimates
  37.5 minutes of computation
  37.3 minutes of communication using T1 line with Mbits/second bandwidth

52 Scaling Impact

54 Conclusion
Data release algorithm
– Two-party
– Differentially-private
– Secure
– Horizontally-partitioned
– Non-interactive setting

55 Future Work
Consider different scenarios
– Two parties vs. multiple parties
– Semi-honest vs. malicious adversary model
– Horizontally vs. vertically partitioned data
For all these scenarios, we need efficient algorithms.