Differential Privacy (2)

Outline
- Using differential privacy
  - Database queries
  - Data mining
- Non-interactive case
- New developments

Definition
- Mechanism: K(x) = f(x) + D, where D is some noise.
- It is an output perturbation method.

Sensitivity function
- Captures how great a difference must be hidden by the additive noise: df = max |f(x) - f(x’)| over all pairs of databases x, x’ that differ in one record.
- How to design the noise D? Its scale is linked back to the function f(x) through df.

Adding Laplace noise
- K(x) = f(x) + D, where D is drawn from a Laplace distribution with scale df/e.
- Why does this work?
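The mechanism K(x) = f(x) + D with Laplace noise of scale df/e can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names `laplace_noise` and `laplace_mechanism`, the parameter names `sensitivity` (for df) and `epsilon` (for e), and the example values are all assumptions made here.

```python
import random


def laplace_noise(scale, rng=random):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Return f(x) + D, where D ~ Lap(sensitivity / epsilon)."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_answer + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical example: a count query (sensitivity 1) whose true answer is 100.
rng = random.Random(0)
answers = [laplace_mechanism(100.0, 1.0, 0.5, rng) for _ in range(20000)]
mean_answer = sum(answers) / len(answers)
mean_abs_error = sum(abs(a - 100.0) for a in answers) / len(answers)
```

Averaged over many draws the noise cancels out, while a single answer is off by about df/e on average (here 1/0.5 = 2).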

Proof sketch
- Let K(x) = f(x) + D = r. Thus r - f(x) has a Laplace distribution with scale df/e.
- Similarly, K(x’) = f(x’) + D = r, and r - f(x’) has the same distribution.
- P(K(x) = r) is proportional to exp(-|f(x) - r| (e/df))
- P(K(x’) = r) is proportional to exp(-|f(x’) - r| (e/df))
- P(K(x) = r) / P(K(x’) = r) = exp((|f(x’) - r| - |f(x) - r|) (e/df))
  <= exp(|f(x’) - f(x)| (e/df))    (triangle inequality)
  <= exp(e)                        (since |f(x’) - f(x)| <= df)
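The bound in the proof sketch can be checked numerically. The sketch below evaluates the Laplace densities of K(x) and K(x’) on a grid of outputs r and compares the largest ratio with exp(e). The helper names, the grid, and the values f(x) = 10, f(x’) = 11, df = 1 are assumptions chosen for illustration.

```python
import math


def laplace_density(r, center, scale):
    """Density of center + Lap(scale) evaluated at r."""
    return math.exp(-abs(r - center) / scale) / (2.0 * scale)


def max_density_ratio(fx, fx_prime, sensitivity, epsilon, outputs):
    """Largest ratio P(K(x) = r) / P(K(x') = r) over the given outputs r."""
    scale = sensitivity / epsilon
    return max(
        laplace_density(r, fx, scale) / laplace_density(r, fx_prime, scale)
        for r in outputs
    )


epsilon = 0.5
outputs = [i / 10.0 for i in range(-500, 501)]  # grid of candidate outputs r
ratio = max_density_ratio(10.0, 11.0, 1.0, epsilon, outputs)
bound = math.exp(epsilon)
```

Because |f(x’) - f(x)| equals df in this example, the bound is attained for every r <= f(x), so `ratio` matches `bound` up to floating-point error.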

Composition
- Sequential composition: for analyses run on the same data, the privacy guarantees (the e values) add up.
- Parallel composition: for disjoint sets, the ultimate privacy guarantee depends only on the worst of the guarantees of each analysis, not the sum.
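The two composition rules reduce to simple arithmetic on the e values of the individual analyses. A minimal sketch, with function names chosen here for illustration:

```python
def sequential_epsilon(epsilons):
    """Analyses over the same data: the guarantees add up."""
    epsilons = list(epsilons)
    if not epsilons or any(e <= 0 for e in epsilons):
        raise ValueError("need at least one positive epsilon")
    return sum(epsilons)


def parallel_epsilon(epsilons):
    """Analyses over disjoint subsets: only the worst guarantee counts."""
    epsilons = list(epsilons)
    if not epsilons or any(e <= 0 for e in epsilons):
        raise ValueError("need at least one positive epsilon")
    return max(epsilons)


same_data = sequential_epsilon([0.1, 0.2, 0.3])    # three queries on one table
disjoint_sets = parallel_epsilon([0.1, 0.2, 0.3])  # one query per partition
```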

Database queries (PINQ)
- Basic aggregate operations
  - Noisy count
  - Noisy sum
  - Noisy average
- Composition rule
- Stable transformation: if |T(A) - T(B)| <= c|A - B| and M provides e-differential privacy, then the composite computation M(T(x)) provides ce-differential privacy.
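The aggregate operations and the stable-transformation rule can be sketched in Python. This is not the PINQ API (PINQ is a LINQ-based library). The function names, the clamping bound `bound`, and the even split of the budget inside `noisy_average` are assumptions made for this sketch, and neighbouring databases are assumed to differ by adding or removing one record.

```python
import random


def laplace_noise(scale, rng):
    """One sample from Laplace(0, scale), as a difference of exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, epsilon, rng):
    # Adding or removing one record changes the count by at most 1.
    return len(records) + laplace_noise(1.0 / epsilon, rng)


def noisy_sum(values, epsilon, rng, bound=1.0):
    # Clamp each value to [-bound, bound] so one record moves the sum by <= bound.
    clamped = [max(-bound, min(bound, v)) for v in values]
    return sum(clamped) + laplace_noise(bound / epsilon, rng)


def noisy_average(values, epsilon, rng, bound=1.0):
    # One simple approach: spend half the budget on the sum and half on the
    # count (sequential composition), then divide. Division is post-processing.
    total = noisy_sum(values, epsilon / 2.0, rng, bound)
    count = noisy_count(values, epsilon / 2.0, rng)
    return total / max(count, 1.0)


def cost_after_stable_transformation(c, epsilon):
    # An e-differentially private M applied after a c-stable T costs c * e.
    return c * epsilon


rng = random.Random(1)
values = [0.2] * 1000  # hypothetical column with values in [-1, 1]
count = noisy_count(values, 1.0, rng)
average = noisy_average(values, 1.0, rng)
cost = cost_after_stable_transformation(2, 0.5)
```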

Data mining with differential privacy (paper)
- Decision tree
  - Basic operation: scan through the domain to find the split that maximizes some classification measure
- Basic idea of the diff-privacy version
  - Users interact with the data server to find out required information
  - These operations can be transformed to counting operations: apply NoisyCount
  - Sensitivity of the function is determined by the classification measure
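One way to picture the counting-based idea: for each candidate attribute, ask for noisy class counts in every branch, then compute the split score from those noisy counts. The sketch below is an illustration under assumptions, not the algorithm of the paper the slide refers to. It uses Gini impurity as the classification measure, takes attribute domains as public input, splits the budget evenly across attributes, and assumes neighbouring databases differ by one added or removed record. All names and the toy data are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(n, epsilon, rng):
    # Count query has sensitivity 1; clamping at 0 is post-processing.
    return max(0.0, n + laplace_noise(1.0 / epsilon, rng))


def gini(counts):
    total = sum(counts)
    if total <= 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def choose_split(rows, domains, labels, epsilon, rng):
    """rows: list of (record dict, label); domains: attribute -> public value list."""
    # Cells (value, label) of one attribute are disjoint, so they share one
    # budget (parallel composition); attributes are charged sequentially.
    eps_per_attr = epsilon / len(domains)
    best_attr, best_score = None, float("inf")
    for attr, attr_values in domains.items():
        branches = []
        for v in attr_values:
            branches.append([
                noisy_count(
                    sum(1 for rec, y in rows if rec[attr] == v and y == label),
                    eps_per_attr, rng)
                for label in labels
            ])
        total = sum(sum(b) for b in branches)
        if total <= 0:
            continue
        # Weighted Gini impurity of the split: lower is better.
        score = sum(sum(b) / total * gini(b) for b in branches)
        if score < best_score:
            best_attr, best_score = attr, score
    return best_attr


# Toy data: "outlook" determines the label, "windy" is uninformative.
rows = []
for i in range(400):
    outlook = "sunny" if i % 2 == 0 else "rain"
    windy = "yes" if (i // 2) % 2 == 0 else "no"
    rows.append(({"outlook": outlook, "windy": windy}, "play" if outlook == "sunny" else "stay"))
domains = {"outlook": ["sunny", "rain"], "windy": ["yes", "no"]}
chosen = choose_split(rows, domains, ["play", "stay"], 1.0, random.Random(2))
```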

Privacy budget e
- User-specified total budget e
- Composite operations need a specific e’ for each operation
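A budget of this kind can be tracked with a small accountant object that charges each operation its e’ and refuses to go past the total e. The class name `PrivacyBudget`, its methods, and the floating-point tolerance are assumptions made for this sketch. It charges sequentially, which is the conservative choice.

```python
class PrivacyBudget:
    """Tracks how much of a user-specified total epsilon has been spent."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total budget must be positive")
        self.total = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self):
        return self.total - self.spent

    def spend(self, epsilon):
        """Charge one operation; fail if it would exceed the total budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
        return epsilon


budget = PrivacyBudget(1.0)
budget.spend(0.4)  # first operation
budget.spend(0.4)  # second operation
try:
    budget.spend(0.4)  # would bring the total to 1.2 > 1.0
    third_allowed = True
except RuntimeError:
    third_allowed = False
```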

Tradeoff between utility and privacy
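For the Laplace mechanism the tradeoff can be made concrete: the mean absolute value of Lap(b) noise is b, so the expected absolute error of one answer is df/e. A short sketch, with the function name and the chosen e values as illustrative assumptions:

```python
def expected_abs_error(sensitivity, epsilon):
    """Expected |noise| for Lap(sensitivity / epsilon), which equals the scale."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


# Smaller epsilon means stronger privacy and a larger expected error.
errors = {eps: expected_abs_error(1.0, eps) for eps in (0.01, 0.1, 1.0)}
```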

Non-interactive differential privacy
- Noisy histogram release
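A noisy histogram release can be sketched as follows. The sketch assumes a fixed, public list of bins and neighbouring databases that differ by adding or removing one record. One record then changes exactly one bin count by 1, and since the bins are disjoint, Lap(1/e) noise per bin gives e-differential privacy for the whole histogram by parallel composition. The function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(values, bins, epsilon, rng):
    """Release one noisy count per public bin; values outside `bins` are ignored."""
    counts = Counter(values)
    scale = 1.0 / epsilon
    # Noise is added to every bin, including empty ones, so that the set of
    # released bins does not depend on the data.
    return {b: counts.get(b, 0) + laplace_noise(scale, rng) for b in bins}


bins = ["0-17", "18-39", "40-64", "65+"]
values = ["18-39"] * 300 + ["40-64"] * 500 + ["65+"] * 200
released = noisy_histogram(values, bins, 0.5, random.Random(3))
```

Once released, the histogram can be queried any number of times without further privacy cost, since those queries are post-processing.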

Sampling and filtering

Partitioning

New settings
- Against an adversary who has access to the algorithm’s internal state
- Differential privacy under continual observation