Privacy preserving cloud computing


Privacy preserving cloud computing: Issues and solutions

Overview

Cloud computing is becoming more and more popular. By its nature, however, it requires users to outsource their data, and this puts their privacy at risk. Today we are going to talk about one particular cloud computing scenario, association rule mining over a transaction database, and discuss its privacy issues and how people have resolved them.

A little background

The k-anonymity concept: a privacy-preserving requirement that any combination of possible identifiers must match at least k entries in the database. In other words, no single entry can be uniquely identified in a database that satisfies k-anonymity.

The ElGamal cryptosystem: a public-key cryptosystem built on the Diffie-Hellman key exchange.

The Plaintext Equality Test: in short, a method to test whether two plaintexts are equal, given only their ElGamal encryptions and without disclosing either plaintext.

Association rule mining: in short, finding out which items are frequently bought together by running a data mining algorithm over a transaction database.
A transaction record in the database looks like: {ID:###, amount:**, item0: milk, item1: beer, …}
A rule looks like: {milk, egg} -> {bread}

Support: calculated as sp(n) = (number of transactions in S that contain n) ÷ (number of transactions in S), where S is the set of transactions and n is an item or itemset.
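The support formula can be illustrated with a few lines of Python. This is a minimal sketch: the function name `support` and the four sample transactions are made up for illustration, and each transaction is assumed to be a plain set of item names.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


# Hypothetical toy database: one set of items per transaction.
transactions = [
    {"milk", "egg", "bread"},
    {"milk", "beer"},
    {"milk", "egg", "bread", "beer"},
    {"egg", "bread"},
]

print(support({"milk", "egg"}, transactions))  # contained in 2 of 4 transactions
print(support({"bread"}, transactions))        # contained in 3 of 4 transactions
```

A rule such as {milk, egg} -> {bread} is only considered if the itemset {milk, egg, bread} has enough support in the first place.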

A little more background

Apriori algorithm: Apriori is an algorithm for frequent itemset mining and association rule learning over transactional databases. It works as follows:

Input: transaction set T, support threshold e
    Initialize a candidate set S
    Initialize a result set R
    k <- 1
    While True:
        Go over T and put every unique itemset of k items into S
        If S is empty: break
        For each itemset p in S: calculate the support SP(p) = number of transactions in T that contain p / number of transactions in T
        Remove the itemsets with SP(p) < e from S
        Add all remaining itemsets in S to R
        k <- k + 1
    Return R
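The loop above can be sketched in Python as follows. This is an illustrative version, not the code used in the talk: the function name `apriori` and the sample data are assumptions, candidates are enumerated with `itertools.combinations`, and the loop stops as soon as no itemset of size k survives the threshold (slightly earlier than the pseudocode, which stops when no candidate exists at all). It also applies the usual Apriori pruning step, skipping a candidate if any of its (k-1)-subsets was not frequent.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    frequent = {}
    k = 1
    while True:
        counts = {}
        for t in transactions:
            for combo in combinations(sorted(t), k):
                # Pruning: every (k-1)-subset must already be frequent.
                if k > 1 and any(
                    frozenset(sub) not in frequent
                    for sub in combinations(combo, k - 1)
                ):
                    continue
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
        survivors = {s: c / n for s, c in counts.items() if c / n >= min_support}
        if not survivors:
            break
        frequent.update(survivors)
        k += 1
    return frequent


# Hypothetical toy database and threshold.
transactions = [
    {"milk", "egg", "bread"},
    {"milk", "beer"},
    {"milk", "egg", "bread", "beer"},
    {"egg", "bread"},
]
result = apriori(transactions, min_support=0.5)
for itemset, sp in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sp)
```

Rules such as {milk, egg} -> {bread} are then derived from the frequent itemsets that were found.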

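A little more background

ElGamal cryptosystem: we are going to describe it in the classic Alice and Bob scenario, with Alice sending a message $x$ to Bob.

Bob (key generation):
    Chooses a large prime $p$ and a generator $\alpha$
    Generates his private key $K_{pr} = d$
    Generates his public key $K_{pub} = \beta = \alpha^{d} \bmod p$
    Sends $(\beta, p, \alpha)$ to Alice

Alice (encryption):
    Chooses her random $i$
    Generates her ephemeral key $K_{tmp} = \alpha^{i} \bmod p$
    Generates her masking key $K_{mask} = \beta^{i} \bmod p$
    Generates her encrypted text $y = x \cdot K_{mask} \bmod p$
    Sends $(y, K_{tmp})$ to Bob

Bob (decryption):
    Recovers the masking key $K_{mask} = K_{tmp}^{d} \bmod p$
    Recovers the message $x = y \cdot K_{mask}^{-1} \bmod p$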
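The exchange can be tried out with a toy Python sketch. The parameters below (p = 467, α = 2) are deliberately tiny textbook-sized values chosen for illustration and are far too small to be secure; the function names are also assumptions. The modular inverse uses `pow(k, -1, p)`, which needs Python 3.8 or later.

```python
import secrets

# Toy public parameters (insecure, for illustration only).
P = 467      # prime modulus p
ALPHA = 2    # generator alpha


def keygen():
    """Bob: pick private key d and compute public key beta = alpha^d mod p."""
    d = secrets.randbelow(P - 3) + 2          # d in [2, p-2]
    beta = pow(ALPHA, d, P)
    return d, beta


def encrypt(x, beta):
    """Alice: encrypt x (an integer in [1, p-1]) under Bob's public key."""
    i = secrets.randbelow(P - 3) + 2          # fresh random i per message
    k_tmp = pow(ALPHA, i, P)                  # ephemeral key
    k_mask = pow(beta, i, P)                  # masking key
    y = (x * k_mask) % P
    return y, k_tmp


def decrypt(y, k_tmp, d):
    """Bob: rebuild the masking key from k_tmp and remove it from y."""
    k_mask = pow(k_tmp, d, P)
    return (y * pow(k_mask, -1, P)) % P


d, beta = keygen()
y, k_tmp = encrypt(42, beta)
print(decrypt(y, k_tmp, d))  # recovers the original message
```

Because `i` is chosen fresh for every message, encrypting the same plaintext twice generally yields two different ciphertexts.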

Privacy issues

Consider the scenario: association rule mining over a transaction database. What privacy issues might occur?

- The user's transaction data is outsourced to a curious cloud database server.
- If the mining service is performed by a third-party provider, that provider might be interested in the transaction data.
- The internet service provider might be interested in the mined rules when they are transferred back to the data owner.

Solutions

A privacy-preserving scheme for association rule mining on transaction servers, proposed in [1], provides a solution.

Step 1: The client encrypts the transaction data with the ElGamal cryptosystem and uploads it to the DB.
Step 2: Because ElGamal encryption is randomized, the same item (text) can be encrypted into different ciphertexts. The DB therefore performs the Plaintext Equality Test to eliminate the differing encryptions of the same item. After this, every item has a unique encryption in the DB.
Step 3: The DB performs Apriori on the stored data and returns the rules to the client in encrypted form.

Why are the servers S# not used? Because this is not the final solution. Recall the threshold e in Apriori and the supports of the rules: they will be known to the DB. In other words, the DB is not completely blind to the client data.

Fig. 1: Proposed system architecture (refer to [1])

[1] Yi, X., Rao, F.Y., Bertino, E. and Bouguettaya, A., 2015, April. Privacy-preserving association rule mining in cloud computing. In Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security (pp. 439-450). ACM.
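One well-known way to build a plaintext equality test on ElGamal ciphertexts is to divide the two ciphertexts component-wise, raise the quotient to a random power, and check whether the result decrypts to 1. The Python sketch below shows that idea under strong simplifying assumptions: a single party holds the private key d (in a real deployment decryption would be shared among servers), the parameters are toy-sized, and the function names are hypothetical. It is not claimed to be the exact protocol used in [1].

```python
import math
import secrets

P, ALPHA = 467, 2   # toy parameters, insecure


def encrypt(x, beta):
    i = secrets.randbelow(P - 3) + 2
    return (x * pow(beta, i, P)) % P, pow(ALPHA, i, P)   # (y, k_tmp)


def decrypt(y, k_tmp, d):
    return (y * pow(pow(k_tmp, d, P), -1, P)) % P


def blinded_quotient(c1, c2):
    """Return an encryption of (x1 / x2)^r for a random blinding exponent r."""
    (y1, t1), (y2, t2) = c1, c2
    while True:
        r = secrets.randbelow(P - 2) + 1
        # r coprime to p-1 guarantees z^r == 1 only when z == 1.
        if math.gcd(r, P - 1) == 1:
            break
    y = pow(y1 * pow(y2, -1, P) % P, r, P)
    t = pow(t1 * pow(t2, -1, P) % P, r, P)
    return y, t


def plaintexts_equal(c1, c2, d):
    """True iff c1 and c2 encrypt the same plaintext; reveals nothing else but (x1/x2)^r."""
    return decrypt(*blinded_quotient(c1, c2), d) == 1


d = 127                         # example private key
beta = pow(ALPHA, d, P)
a, b, c = encrypt(42, beta), encrypt(42, beta), encrypt(43, beta)
print(plaintexts_equal(a, b, d), plaintexts_equal(a, c, d))
```

The random exponent r blinds the ratio of the two plaintexts, so a "not equal" answer does not reveal how the plaintexts differ.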

Solutions

How to hide the supports of rules? Add noise to the data stored in the DB. To add noise, we need to construct a table of the items that could be included in a transaction, which the DM servers S# can refer to when creating noise transactions.

Algorithm: Item Dictionary Anonymization [1]
Purpose: create an item table and achieve k-anonymity with respect to all servers
Input: {a1, a2, a3, …, ay} (all items included in all transactions)
Output: {c1, c2, …, cy} (an encrypted item table)

    for i in 1…y {
        DB server looks up ai from the input;
        DB computes the encryption ci of ai using the public key given by the client
    }
    for j in 1…n {
        server Sj mixes c1, c2, …, cy by re-encryption and random shuffling;
        server Sj forwards the result to server Sj+1
    }
    return (c1, c2, …, cy);
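The "re-encryption and random shuffling" step can be sketched with ElGamal's re-randomization property: multiplying a ciphertext (y, K_tmp) by a fresh encryption of 1 changes how it looks without changing the plaintext. The Python below is a minimal sketch under assumptions: items are encoded as small integers, parameters are toy-sized, all servers are simulated in one process, and the function names are hypothetical. The exact procedure in [1] may differ.

```python
import secrets

P, ALPHA = 467, 2                 # toy parameters, insecure
_rng = secrets.SystemRandom()     # cryptographically strong shuffle source


def encrypt(x, beta):
    i = secrets.randbelow(P - 3) + 2
    return (x * pow(beta, i, P)) % P, pow(ALPHA, i, P)


def decrypt(y, k_tmp, d):
    return (y * pow(pow(k_tmp, d, P), -1, P)) % P


def reencrypt(c, beta):
    """Re-randomize a ciphertext; the plaintext stays the same."""
    y, t = c
    s = secrets.randbelow(P - 3) + 2
    return (y * pow(beta, s, P)) % P, (t * pow(ALPHA, s, P)) % P


def mix(table, beta):
    """One server S_j: re-encrypt every entry, then shuffle the order."""
    out = [reencrypt(c, beta) for c in table]
    _rng.shuffle(out)
    return out


def anonymize_dictionary(items, beta, n_servers):
    """DB encrypts each item a_i; servers S_1..S_n mix the table in turn."""
    table = [encrypt(a, beta) for a in items]
    for _ in range(n_servers):
        table = mix(table, beta)
    return table


d = 127
beta = pow(ALPHA, d, P)
items = [11, 22, 33, 44]          # hypothetical integer item codes
table = anonymize_dictionary(items, beta, n_servers=3)
print(sorted(decrypt(y, t, d) for y, t in table))
```

After the chain of mixes, no single server can link a position in the output table to a position in the input unless all servers collude.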

Solutions

How to hide the supports of rules? Add noise to the data stored in the DB. Once the item table has been created, we can use the DM servers S# to create noise transactions. For maximum anonymity we also need to anonymize the whole transaction data set.

Algorithm: Same Item Identification and Replacement [1]
Purpose: replace the old encryptions with the encryptions in the item table
This algorithm iterates through all the items in all the transactions and replaces each old encryption with the corresponding new one from the newly created item table.

Algorithm: Transaction Anonymization [1]
Purpose: achieve k-anonymity for all transactions with respect to all servers
This algorithm uses two loops to re-encrypt and shuffle all the transactions: for each transaction, an additional encryption is applied to each item, and then the transactions are shuffled.

Solutions

Finally, we are at the data mining step! What is different now?

- The transactions in the DB now contain a lot of noise, so the DB cannot get the real support (sp) of any item or itemset.
- All the transaction records are thoroughly mixed and are anonymous to all servers.
- All items have unique encryptions.

The next step is for the DMs to perform the normal Apriori algorithm on the DB and return the encrypted rules to the client. Because all the items are encrypted twice, the client has to decrypt all the rules twice.

Comments

- The proposed method may have a big security issue: the paper states that the DMs (servers Sj) need to look at the original item set in order to create the item encryption table used for adding noise.
- The proposed method has to go over the whole transaction data set twice to produce the ready-for-mining data set, and it encrypts each item twice. It is not very computationally efficient.
- Finally, the proposed method needs auxiliary servers for both the noise-adding phase and the mining phase. This can increase the risk of privacy leaks, and it increases the expense, which matters because the client is presumably using this service on a limited budget.

Thanks for Listening