Privacy Research Overview


Privacy Research Overview
18739A: Foundations of Security and Privacy
Anupam Datta, Fall 2007-08

Privacy Research Space
- What is Privacy? [Philosophy, Law, Public Policy] (TODAY)
- Formal Model, Policy Language, Compliance-check Algorithms [Programming Languages, Logic] (next 3 lectures)
- Implementation-level Compliance [Software Engg, Formal Methods]
- Data Privacy [Databases, Cryptography] (TODAY)

Philosophical Studies on Privacy
Reading:
- Overview article in the Stanford Encyclopedia of Philosophy: http://plato.stanford.edu/entries/privacy/
- Alan Westin, Privacy and Freedom, 1967
- Ruth Gavison, Privacy and the Limits of Law, 1980
- Helen Nissenbaum, Privacy as Contextual Integrity, 2004 (more on Nov 8)

Westin 1967
Privacy and control over information:
"Privacy is the claim of individuals, groups or institutions to determine for themselves when, how, and to what extent information about them is communicated to others"
- Relevant when you give personal information to a web site and agree to the privacy policy posted there
- May not apply to your personal health information

Gavison 1980
Privacy as limited access to self:
"A loss of privacy occurs as others obtain information about an individual, pay attention to him, or gain access to him. These three elements of secrecy, anonymity, and solitude are distinct and independent, but interrelated, and the complex concept of privacy is richer than any definition centered around only one of them."
- Basis for the database privacy definition discussed later

Gavison 1980
On utility:
"We start from the obvious fact that both perfect privacy and total loss of privacy are undesirable. Individuals must be in some intermediate state – a balance between privacy and interaction …Privacy thus cannot be said to be a value in the sense that the more people have of it, the better."
- This balance between privacy and utility will show up in data privacy as well as in privacy policy languages, e.g., health data could be shared with medical researchers

Privacy Laws in the US
- HIPAA (Health Insurance Portability and Accountability Act, 1996): protects personal health information
- GLBA (Gramm-Leach-Bliley Act, 1999): protects personal information held by financial service institutions
- COPPA (Children's Online Privacy Protection Act, 1998): protects information posted online by children under 13
More details in the lecture on Nov 8.

Data Privacy
- Releasing sanitized databases: k-anonymity, (c,t)-isolation, differential privacy
- Privacy-preserving data mining

Sanitization of Databases
Real Database (RDB) → Sanitized Database (SDB): add noise, delete names, etc.
Examples: health records, census data
Goals: protect privacy while providing useful information (utility)

Re-identification by Linking
Linking two sets of data on shared attributes may uniquely identify some individuals.
Example [Sweeney]: de-identified medical data was released; linking it with the purchased Voter Registration List of MA re-identified the Governor's record.
87% of the US population is uniquely identifiable by 5-digit ZIP, sex, and date of birth.
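The linking attack above is essentially a join on the shared attributes. A minimal Python sketch; every record, name, and value below is fabricated purely for illustration:

```python
# Toy re-identification by linking: join "de-identified" medical records with a
# public voter list on shared quasi-identifier attributes (ZIP, date of birth, sex).
# All data here is made up for the example.
medical = [
    {"zip": "02138", "dob": "1945-07-31", "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "dob": "1972-01-15", "sex": "M", "diagnosis": "asthma"},
]
voters = [
    {"name": "Alice Smith", "zip": "02138", "dob": "1945-07-31", "sex": "F"},
    {"name": "Bob Jones", "zip": "02141", "dob": "1980-03-02", "sex": "M"},
]

QI = ("zip", "dob", "sex")  # the shared quasi-identifier

# Any medical record agreeing with exactly one voter on all QI attributes
# is re-identified: the "anonymous" diagnosis gets a name attached.
reidentified = [
    (v["name"], m["diagnosis"])
    for m in medical
    for v in voters
    if all(m[a] == v[a] for a in QI)
]
print(reidentified)  # → [('Alice Smith', 'hypertension')]
```

The join needs no special access: both inputs are a "safe" release and a public record, which is exactly Sweeney's point.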

K-anonymity (1)
Quasi-identifier: a set of attributes (e.g., ZIP, sex, date of birth) that can be linked with external data to uniquely identify individuals in the population.
Requirement: make every record in the table indistinguishable from at least k-1 other records with respect to the quasi-identifiers, so that linking on quasi-identifiers yields at least k records for each possible value of the quasi-identifier.
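Whether a released table satisfies k-anonymity can be checked mechanically: group records by their quasi-identifier values and verify that every group has at least k members. A small sketch; the generalized values (starred ZIP codes and ages) are illustrative, not taken from the slides:

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """True iff every combination of quasi-identifier values occurs >= k times."""
    counts = Counter(tuple(r[a] for a in quasi_ids) for r in records)
    return all(c >= k for c in counts.values())

# A toy generalized table: ZIP and age have been coarsened with '*'.
table = [
    {"zip": "021**", "age": "2*", "disease": "flu"},
    {"zip": "021**", "age": "2*", "disease": "cancer"},
    {"zip": "148**", "age": "3*", "disease": "flu"},
    {"zip": "148**", "age": "3*", "disease": "flu"},
]
print(is_k_anonymous(table, ["zip", "age"], 2))  # → True: each QI group has 2 rows
print(is_k_anonymous(table, ["zip", "age"], 3))  # → False: no group reaches 3
```

Note the second group is all "flu": it is 2-anonymous yet leaks the sensitive value, which is exactly the diversity limitation discussed on the next slide.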

K-anonymity and Beyond
- Provides some protection: linking on ZIP, age, and nationality yields 4 records
- Limitations: lack of diversity in sensitive attributes, background knowledge, subsequent releases of the same data set
- Utility: less suppression implies better utility

(c,t)-isolation (2)
A mathematical definition motivated by Gavison's idea that privacy is protected to the extent that an individual blends into a crowd.
(Slide image courtesy of WaldoWiki: http://images.wikia.com/waldo/images/a/ae/LandofWaldos.jpg)

Definition of (c,t)-isolation
A database is represented by n points in high-dimensional space (one dimension per column).
Let y be any RDB point, and let δ_y = ||q - y||_2. We say that q (c,t)-isolates y iff B(q, c·δ_y) contains fewer than t points of the RDB, that is, |B(q, c·δ_y) ∩ RDB| < t.
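The definition translates almost directly into code: a guess q isolates y when too few other database points fall inside the ball B(q, c·δ_y). A minimal sketch with made-up example points:

```python
import math

def isolates(q, y, rdb, c, t):
    """True iff q (c, t)-isolates y: the ball B(q, c * delta_y) around q,
    where delta_y = ||q - y||_2, contains fewer than t RDB points."""
    delta_y = math.dist(q, y)  # Euclidean distance (Python 3.8+)
    inside = sum(1 for x in rdb if math.dist(q, x) <= c * delta_y)
    return inside < t

# Two clusters; y has one near neighbor, so it partially "blends into a crowd".
rdb = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
y = (0.0, 0.0)
q = (0.2, 0.0)  # adversary's guess for y

print(isolates(q, y, rdb, c=2.0, t=3))  # → True: only 2 points in the ball
print(isolates(q, y, rdb, c=2.0, t=2))  # → False: 2 points is not fewer than t=2
```

Larger c or a denser crowd around y makes isolation harder, matching the intuition that the adversary must single y out from its neighbors.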

Differential Privacy: Motivation (3)
- Guaranteeing that a sanitized database implies no private information is too hard.
- Auxiliary info: Terry is an inch taller than average. Sanitized database: the average height is 6 feet. The sanitized database provided only non-private data, yet private information was learned.
- All surveyors really need is for people to be comfortable supplying their private data; people will be comfortable if providing their data does not change the sanitized database enough to be noticed.

Differential Privacy: Formalization
Want a sanitization function K that maps any two databases D1 and D2 differing by one person to about the same sanitized databases K(D1) and K(D2), making any disclosure S about as likely with K(D1) as with K(D2).
A randomized function K gives ε-differential privacy if, for all data sets D1 and D2 differing in at most one element and all subsets S of Range(K):
Pr[K(D1) ∈ S] ≤ exp(ε) × Pr[K(D2) ∈ S]
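A standard way to achieve this guarantee for a numeric query is the Laplace mechanism: release the true answer plus noise drawn from Laplace(Δf/ε), where Δf is the query's sensitivity (how much one person can change the answer). A minimal sketch; the query value and parameters below are made up:

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Return true_value + Laplace(sensitivity / epsilon) noise.
    For a counting query the sensitivity is 1: one person changes the count by at most 1."""
    scale = sensitivity / epsilon
    # Sample Laplace noise via the inverse CDF of a uniform draw in (-0.5, 0.5).
    u = rng.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    noise = -scale * sign * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise

# Release a noisy count: smaller epsilon means stronger privacy and more noise.
rng = random.Random(0)  # seeded only so the demo is repeatable
print(laplace_mechanism(412, sensitivity=1.0, epsilon=0.5, rng=rng))
```

The noise scale Δf/ε is exactly what makes the exp(ε) likelihood-ratio bound in the definition go through: changing one person shifts the true answer by at most Δf, which shifts the Laplace density by a factor of at most exp(ε).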

Privacy Preserving Data Mining
Reference: Y. Lindell and B. Pinkas. Privacy Preserving Data Mining. Journal of Cryptology, 15(3):177-206, 2002.
Problem: compute some function of two confidential databases without revealing unnecessary information.
Example: intersect a government database of suspected terrorists with an airline passenger database.
Approach: cryptographic techniques for secure multiparty computation.
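For the set-intersection example specifically, one classic approach (a Diffie-Hellman-style commutative-blinding construction, not the Lindell-Pinkas protocol itself) has each party raise hashed items to a secret exponent; since exponentiation commutes, matching items collide after both blindings without either raw list being revealed. A toy sketch only: the modulus, exponents, and names are made up, and this is not secure as written (a real protocol needs a proper group, hashing into that group, and exponents coprime to the group order):

```python
import hashlib

# Toy commutative-blinding private set intersection. Illustrative only.
P = 2**127 - 1  # a Mersenne prime used as a toy modulus (assumption: demo-sized)

def h(item):
    """Hash an item to an integer mod P (a real protocol hashes into a proper group)."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

def blind(values, secret):
    """Raise every value to a secret exponent mod P."""
    return {pow(v, secret, P) for v in values}

# Each party holds a confidential set and a random secret exponent (made-up values).
watchlist, a_secret = {"ann", "bob", "eve"}, 19088743
passengers, b_secret = {"bob", "carol", "dan", "eve"}, 124076833

# A sends H(x)^a to B, who raises it to b; symmetrically B sends H(y)^b to A.
double_a = blind(blind({h(x) for x in watchlist}, a_secret), b_secret)   # H(x)^(ab)
double_b = blind(blind({h(y) for y in passengers}, b_secret), a_secret)  # H(y)^(ab)

# Items in both sets produce the same double-blinded value; only the
# intersection (here, its size) is learned, not the other party's full list.
print(len(double_a & double_b))  # → 2  ("bob" and "eve")
```

The commutativity H(x)^(ab) = H(x)^(ba) is what lets the two parties compare blinded values that neither could have produced alone; the general-purpose Lindell-Pinkas machinery (secure two-party computation of arbitrary functions) subsumes special cases like this.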

The Security Definition (Slide: Lindell)
REAL world: the parties run the actual protocol in the presence of a real adversary A.
IDEAL world: the parties send their inputs to a trusted party, which computes the function for them; an ideal adversary S attacks only this idealized interaction.
Security requirement: for every real adversary A there exists an ideal adversary S such that the two executions are computationally indistinguishable: every probabilistic polynomial-time observer that receives the input/output distribution of the honest parties and the adversary outputs 1 with negligibly close probability whether that distribution was generated in IDEAL or in REAL.