Multiplicative data perturbation (2)



Multiplicative Perturbation: RASP (Random Space Perturbation)

RASP framework for confidential query services in the cloud
[Figure: the data owner transforms the data D into D' = F(D) and hands D' to the honest-but-curious cloud. Authorized users issue a query q, which the trusted client transforms into q' = Q(q). The cloud evaluates H(q', D') and returns the result R'; the trusted client recovers the result R = G(R').]

Order preserving encryption (Agrawal 2004, Boldyreva 2009)
The data set is securely transformed so that the order is preserved, but the distribution and domain are changed.
Benefit: indexing and searching work directly on OPE-encrypted data.
Weakness: once the original distribution is known, OPE is broken.
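The benefit and the order property above can be illustrated with a toy sketch. It is not the Agrawal or Boldyreva construction: `make_ope_table` and `encrypt` are hypothetical helpers that map a small integer domain onto a sorted random sample of a larger range, which is enough to show that comparisons and range lookups still work on ciphertexts.

```python
import bisect
import random

def make_ope_table(domain_size, cipher_range, seed):
    """Toy key: a sorted random sample, so plaintext i maps to the i-th smallest value."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(cipher_range), domain_size))

def encrypt(table, x):
    # Strictly increasing in x because the table is sorted and has no duplicates.
    return table[x]

table = make_ope_table(domain_size=100, cipher_range=10_000, seed=42)
plain = [17, 3, 88, 42, 42, 60]
cipher = [encrypt(table, x) for x in plain]

# Order is preserved: sorting ciphertexts sorts the plaintexts too.
order_preserved = all(
    (a < b) == (encrypt(table, a) < encrypt(table, b))
    for a in plain for b in plain
)

# Range query 10 <= x <= 50 answered on ciphertexts only, via a sorted index.
index = sorted(cipher)
lo, hi = encrypt(table, 10), encrypt(table, 50)
hits = index[bisect.bisect_left(index, lo):bisect.bisect_right(index, hi)]
print(order_preserved, len(hits))
```

The server only ever sees `index`, `lo`, and `hi`, yet it returns exactly the ciphertexts of 17, 42, and 42.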

Not attribute-wise order preserving
Order preserving encryption (OPE, Agrawal et al. 2004) is not resilient to distribution-based attacks.
[Figure: when the original X_i distribution is known, it can be matched against the transformed X_i' distribution produced by OPE through bucket-based estimation.]
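A distribution-based attack of this kind can be sketched as follows. The slide does not spell out the estimator, so this is one plausible bucket-based variant under stated assumptions: the attacker knows that X_i is normal with mean 50 and standard deviation 10, `secret_ope` is a hypothetical stand-in for the secret order-preserving transform, and each ciphertext is assigned the midpoint quantile of its equal-count bucket.

```python
import random
from statistics import NormalDist

rng = random.Random(7)
known = NormalDist(mu=50.0, sigma=10.0)   # attacker's prior knowledge of X_i

def secret_ope(x):
    # Hypothetical secret transform: strictly increasing, so order is preserved.
    return 3.0 * x ** 3 + 1000.0

n, buckets = 2000, 20
plain = [rng.gauss(50.0, 10.0) for _ in range(n)]
cipher = [secret_ope(x) for x in plain]    # all the attacker observes

# Attack: rank ciphertexts, split ranks into equal-count buckets, and map
# each bucket to the matching quantile of the known distribution.
order = sorted(range(n), key=lambda i: cipher[i])
estimate = [0.0] * n
for rank, i in enumerate(order):
    b = rank * buckets // n
    estimate[i] = known.inv_cdf((b + 0.5) / buckets)

mean_abs_err = sum(abs(e - p) for e, p in zip(estimate, plain)) / n
blind_err = sum(abs(known.mean - p) for p in plain) / n   # guessing the mean
print(round(mean_abs_err, 2), round(blind_err, 2))
```

The attacker never inverts `secret_ope`; preserved order alone is enough to place each value in a narrow bucket, which is far more accurate than guessing the mean.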

RASP perturbation
Input: k-dimensional numeric data with n records, represented as a k x n matrix
x: a record (one column of the matrix)
RG: random number generator
A: (k+2) x (k+2) random invertible matrix
K_ope: key for order preserving encryption
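The slide's own formula did not survive extraction. The sketch below assumes the perturbation has the form y = A (E_ope(x)^T, 1, v)^T with v > 0 drawn from RG, which matches the (k+2) x (k+2) size of A and the later statement that X_{k+2} is always positive. `toy_ope` is a hypothetical strictly increasing stand-in for E_ope and is not secure; `invert`, `matvec`, and `random_invertible` are illustrative helpers.

```python
import random
from fractions import Fraction

def invert(M):
    """Gauss-Jordan inverse over exact Fractions; raises ValueError if singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def random_invertible(n, rng):
    """Draw integer matrices until one is invertible; return it with its inverse."""
    while True:
        M = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
        try:
            return M, invert(M)
        except ValueError:
            continue

def toy_ope(value):
    # Assumption: placeholder for E_ope(x, K_ope); strictly increasing, NOT secure.
    return value ** 3 + 5 * value

def rasp_perturb(x, A, rng):
    v = rng.randint(1, 1000)                 # positive random component from RG
    z = [toy_ope(xi) for xi in x] + [1, v]   # extend the record to k+2 dimensions
    return matvec(A, z), z

rng = random.Random(1)
k = 3
A, A_inv = random_invertible(k + 2, rng)
x = [4, -2, 7]
y, z = rasp_perturb(x, A, rng)
recovered = matvec(A_inv, y)                 # the key holder undoes A exactly
print(y, recovered == z)
```

Because v is redrawn for every record, perturbing the same x twice gives different y values, yet anyone holding A^-1 recovers the extended record exactly.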

Properties
Not an OPE.
Preserves convexity of the dataset: a convex dataset in R^k -> another convex dataset in R^(k+2).
Good for range queries: each range query in R^k -> hyperplane-based query -> range query in R^(k+2).

RASP properties: convexity preserving
The queried range (a hypercube) is convex.
RASP transforms the range into another convex set (a polyhedron).
A half space w^T x <= a, bounded by the hyperplane w^T x = a, is convex.
The intersection of convex sets is also convex.
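The half-space step can be checked with one line of algebra. As a sketch, assume the perturbation acts on the extended record z as y = A z with A invertible (the symbols y and z are introduced here for illustration). Then a half space in z maps to a half space in y:

```latex
\[
w^{T} z \le a
\;\Longleftrightarrow\;
w^{T} A^{-1} y \le a
\;\Longleftrightarrow\;
\bigl((A^{-1})^{T} w\bigr)^{T} y \le a ,
\qquad y = A z .
\]
```

A hypercube is the intersection of 2k such half spaces, so its image is an intersection of half spaces as well, that is, a convex polyhedron.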

Illustration of convexity preserving
[Figure: the same range shown in the original space, the OPE space, and the perturbed space. In the OPE space, X_i < a <=> E(X_i) < E(a).]

Secure query transformation: a naïve solution
Based on the convexity preserving property.
Problems: (1) A^-1 can be probed; (2) [formula missing]. If a is known, the whole dimension i is breached.

Secure query transformation: enhanced solution
X_{k+2} is always positive, so (X_i - a) <= 0 <=> (X_i - a) X_{k+2} <= 0.
Correspondingly, in the encrypted space: y^T Θ y <= 0.
Problems addressed:
(1) A^-1 cannot be derived from Θ.
(2) (X_i - a) X_{k+2} <= 0 contains the random component X_{k+2}, which protects the condition (X_i - a) <= 0.
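One way to build such a query matrix is sketched below. It assumes the extended record is z = (E_ope(x)^T, 1, v)^T and y = A z, and takes Θ = (A^-1)^T u w^T A^-1, where u picks out E(X_i) - E(a) and w picks out v, so that y^T Θ y = (E(X_i) - E(a)) v. The vectors u and w, the `toy_ope` stand-in, and all helper names are assumptions for illustration.

```python
import random
from fractions import Fraction

def invert(M):
    """Gauss-Jordan inverse over exact Fractions; raises ValueError if singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def random_invertible(n, rng):
    while True:
        M = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
        try:
            return M, invert(M)
        except ValueError:
            continue

def toy_ope(value):
    # Assumption: strictly increasing placeholder for E_ope; NOT secure.
    return value ** 3 + 5 * value

def query_matrix(A_inv, i, a, k):
    """Theta for the condition X_i <= a, built as the outer product p q^T."""
    u = [0] * (k + 2)
    u[i], u[k] = 1, -toy_ope(a)          # u^T z = E(X_i) - E(a)
    w = [0] * (k + 2)
    w[k + 1] = 1                         # w^T z = v, always positive
    A_inv_T = [list(col) for col in zip(*A_inv)]
    p, q = matvec(A_inv_T, u), matvec(A_inv_T, w)
    return [[pi * qj for qj in q] for pi in p]

def satisfies(theta, y):
    size = len(y)
    return sum(y[r] * theta[r][c] * y[c] for r in range(size) for c in range(size)) <= 0

rng = random.Random(11)
k = 2
A, A_inv = random_invertible(k + 2, rng)
theta = query_matrix(A_inv, i=0, a=10, k=k)   # the server sees only theta
records = [[rng.randint(0, 20) for _ in range(k)] for _ in range(50)]
checks = []
for x in records:
    z = [toy_ope(xi) for xi in x] + [1, rng.randint(1, 100)]
    y = matvec(A, z)
    checks.append(satisfies(theta, y) == (x[0] <= 10))
print(all(checks))
```

The server evaluates the quadratic form on perturbed records only, and the sign test agrees with the plaintext condition X_0 <= 10 for every record because v is positive.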

Efficient two-stage query processing illustrated
[Figure: the query range in the original space and its image in the transformed space. Stage 1: query the bounding box of the transformed range. Stage 2: filter out the junk records.]
A multidimensional tree index is built on the encrypted data (in the transformed space) on the server.

Stage 1: The client calculates the large bounding box; the server uses the index to find the initial results.
Stage 2: Filter the initial results with the conditions y^T Θ_i y <= 0 for i = 1…2m.
Note: the two-stage strategy works if the output of stage 1 is significantly smaller than the original database and fits into memory. Otherwise, use a linear scan with stage 2 filtering.
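The two stages can be sketched end to end under several assumptions: records are perturbed as y = A (E_ope(x)^T, 1, v)^T with v in 1..V_MAX, each bound contributes one matrix Θ_i built as in an outer-product form, the bounding box is the per-coordinate min and max of A z over the query box in z, and a plain scan stands in for the multidimensional tree index. All names (`toy_ope`, `condition`, `V_MAX`, `DOMAIN`) are hypothetical.

```python
import random
from fractions import Fraction

def invert(M):
    """Gauss-Jordan inverse over exact Fractions; raises ValueError if singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def toy_ope(value):
    return value ** 3 + 5 * value        # assumption: placeholder OPE, NOT secure

rng = random.Random(3)
k, V_MAX, DOMAIN = 2, 50, (0, 20)
while True:                              # data owner: draw an invertible key matrix
    A = [[rng.randint(-9, 9) for _ in range(k + 2)] for _ in range(k + 2)]
    try:
        A_inv = invert(A)
        break
    except ValueError:
        continue
A_inv_T = [list(col) for col in zip(*A_inv)]

records = [[rng.randint(*DOMAIN) for _ in range(k)] for _ in range(200)]
cloud = [matvec(A, [toy_ope(xi) for xi in x] + [1, rng.randint(1, V_MAX)])
         for x in records]

def condition(i, bound, sign):
    """Theta for X_i <= bound (sign=+1) or X_i >= bound (sign=-1)."""
    u = [0] * (k + 2)
    u[i], u[k] = sign, -sign * toy_ope(bound)
    w = [0] * (k + 2)
    w[k + 1] = 1
    p, q = matvec(A_inv_T, u), matvec(A_inv_T, w)
    return [[pi * qj for qj in q] for pi in p]

query = {0: (5, 9)}                      # 5 <= X_0 <= 9, so m = 1 and 2m = 2 conditions
thetas = []
for i, (lo, hi) in query.items():
    thetas += [condition(i, hi, +1), condition(i, lo, -1)]

# Client: bounding box of the transformed range (unqueried dimensions use DOMAIN).
z_lo = [toy_ope(query.get(i, DOMAIN)[0]) for i in range(k)] + [1, 1]
z_hi = [toy_ope(query.get(i, DOMAIN)[1]) for i in range(k)] + [1, V_MAX]
box = [(sum(a * (l if a >= 0 else h) for a, l, h in zip(row, z_lo, z_hi)),
        sum(a * (h if a >= 0 else l) for a, l, h in zip(row, z_lo, z_hi)))
       for row in A]

# Stage 1: box lookup (a scan here; an index in practice).
stage1 = [n for n, y in enumerate(cloud)
          if all(lo <= yj <= hi for yj, (lo, hi) in zip(y, box))]

# Stage 2: keep only records passing every quadratic condition.
def quad(theta, y):
    return sum(y[r] * theta[r][c] * y[c] for r in range(k + 2) for c in range(k + 2))

stage2 = [n for n in stage1 if all(quad(t, cloud[n]) <= 0 for t in thetas)]
expected = [n for n, x in enumerate(records) if 5 <= x[0] <= 9]
print(len(cloud), len(stage1), len(stage2))
```

Stage 1 may return junk records because the bounding box is larger than the transformed polyhedron; stage 2 removes them, so the final answer matches the plaintext range query.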