Collusion-Resistant Anonymous Data Collection Method
Mafruz Zaman Ashrafi, See-Kiong Ng
Institute for Infocomm Research, Singapore



Introduction
- Quality data is a prerequisite for good data mining results.
- Collecting good-quality data requires effort and money.
- The Internet is a convenient, low-cost platform for large-scale data collection.

Some Motivating Examples

Corporate Survey
A large organization wishes to poll its employees for sensitive information.
- E.g., how satisfied they are with their bosses' management skills.
- Individuals need to rate their bosses.
- However, they are afraid of the price they may pay for honesty.

Health Information
A drug company wishes to find out the adverse effects of a drug.
- E.g., how the effects of a drug interact with those of other drugs.
- Patients need to disclose all the drugs they are taking.
- However, disclosing drug information may reveal their health condition.

Traffic Monitoring
Individual drivers wish to avoid roads with problematic conditions.
- E.g., finding congested road intersections and other bottlenecks.
- Individuals need to disclose their GPS information.
- However, disclosing GPS information may reveal their current position.

Introduction Cont'd
- However, collecting data online has its challenges.
- Privacy is the number-one concern for online respondents.
- Respondents are reluctant to provide truthful information if their privacy is not protected.

Technical Challenges

Objective: Online Data Collection
Two actors: data collector and respondents.
- The data collector wants to obtain responses from a set of respondents.
- The respondents submit honest responses only if the data collector is unable to link a response to its respondent.

Challenges
1. How does the data collector guarantee that it cannot associate a particular response with the corresponding respondent?
2. How can a collusion attack be mitigated?
3. How can an honest respondent withdraw his response without revealing it to the data collector if he detects a threat to his anonymity?
4. How can we reduce the computational and communication overhead?

Related Works
1. Randomized Response
- Respondents' responses are associated with the result of a coin toss.
- Only the respondent knows whether an answer reflects the coin toss or his true experience.
Pros:
- A well-known technique.
- Easy to use.
Cons:
- Adds noise to the response set, which can distort the accuracy of the data mining results.
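The noise trade-off above can be sketched in a few lines. This is a minimal illustration only, assuming the common "forced response" variant (answer truthfully with probability p_truth, otherwise report a fair coin toss); the slides do not say which variant is meant, and the function names are hypothetical.

```python
import random

def randomized_answer(truth: bool, rng: random.Random, p_truth: float = 0.5) -> bool:
    """Report the true answer with probability p_truth, else a fair coin toss."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5

def estimate_true_rate(observed_yes_rate: float, p_truth: float = 0.5) -> float:
    """Invert observed = p_truth * true_rate + (1 - p_truth) * 0.5."""
    return (observed_yes_rate - (1.0 - p_truth) * 0.5) / p_truth

def simulate(true_rate: float, n: int, seed: int = 0) -> float:
    """Collect n noisy answers and return the collector's estimate of true_rate."""
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        truth = rng.random() < true_rate        # respondent's private truth
        yes += randomized_answer(truth, rng)    # only the noisy answer is submitted
    return estimate_true_rate(yes / n)
```

The collector never sees an individual's true answer, but the estimate is only approximate, which is the accuracy cost listed under "Cons".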

Related Works Cont'd
2. Cryptographic Techniques
- Respondents employ two sets of keys to encrypt their responses before sending them to the data collector.
- Each respondent in turn strips off a layer of encryption and shuffles the decrypted results.
- All respondents verify the intermediate results before the data collector obtains the actual response set.
Pros:
- A deterministic technique.
- The data mining results are accurate.
Cons:
- Vulnerable to collusion attacks.
- Higher communication overhead.

Building Blocks of Our Approach
1. ElGamal Crypto
- Is an asymmetric (public-key) encryption scheme.
- Is a probabilistic encryption scheme.
- Achieves semantic security.
- Is malleable.
2. Substitution Cipher
- Replaces a character with another character.
- Example: [figure]
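The two building blocks can be illustrated with a toy sketch. Everything here is an assumption for illustration: the group parameters P and G are tiny and insecure, the substitution table is derived from a seeded shuffle of the ten digits, and all function names are hypothetical rather than taken from the presentation.

```python
import random

P = 10007  # small prime, toy-sized only (assumption; real use needs a large group)
G = 5      # base element (assumption)

def keygen(rng):
    x = rng.randrange(1, P - 1)           # private key
    return x, pow(G, x, P)                # (private, public)

def encrypt(m, h, rng):
    r = rng.randrange(1, P - 1)           # fresh randomness: probabilistic encryption
    return pow(G, r, P), (m * pow(h, r, P)) % P

def decrypt(c, x):
    c1, c2 = c
    s_inv = pow(c1, P - 1 - x, P)         # c1^(-x) mod P via Fermat's little theorem
    return (c2 * s_inv) % P

def make_substitution(key):
    """Digit-for-digit substitution table (and its inverse) derived from a key."""
    digits = list("0123456789")
    shuffled = digits[:]
    random.Random(key).shuffle(shuffled)
    enc = dict(zip(digits, shuffled))
    dec = {v: k for k, v in enc.items()}
    return enc, dec

def substitute(text, table):
    return "".join(table[ch] for ch in text)
```

Malleability shows up directly: multiplying the second ciphertext component by 2 yields a valid encryption of 2m mod P, without knowledge of the key.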

The Hybrid Model
- Employs both ElGamal and the substitution cipher.
- Builds an onion for a response.
- Removing the encryption layers (de-onion) yields the original response.
[Figure: an onion layer. Original response → ElGamal encryption → substitution cipher → ElGamal encryption → an onion]

The Hybrid Model Cont'd
[Figure: an example. Onion: original response → onion; De-Onion: onion → original response]
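One onion layer (ElGamal, then substitution, then ElGamal) might be sketched as follows. This is a simplified reading, not the authors' construction: the substitution step is modelled as a keyed permutation over the ciphertext values 1..P-1 so that its output can be fed into the second ElGamal encryption, and the parameters and function names are hypothetical.

```python
import random

P, G = 10007, 5  # toy group parameters (assumption, insecure sizes)

def keygen(rng):
    x = rng.randrange(1, P - 1)
    return x, pow(G, x, P)                       # (private, public)

def encrypt(m, h, rng):
    r = rng.randrange(1, P - 1)
    return pow(G, r, P), (m * pow(h, r, P)) % P

def decrypt(c1, c2, x):
    return (c2 * pow(c1, P - 1 - x, P)) % P      # c2 * c1^(-x) mod P

def make_permutation(key):
    """Keyed substitution over 1..P-1 (stand-in for the substitution cipher)."""
    values = list(range(1, P))
    shuffled = values[:]
    random.Random(key).shuffle(shuffled)
    forward = dict(zip(values, shuffled))
    backward = {v: k for k, v in forward.items()}
    return forward, backward

def onion(m, pub_inner, sub_forward, pub_outer, rng):
    a1, a2 = encrypt(m, pub_inner, rng)          # first ElGamal encryption
    a2 = sub_forward[a2]                         # substitution on the ciphertext value
    b1, b2 = encrypt(a2, pub_outer, rng)         # second ElGamal encryption
    return a1, b1, b2

def de_onion(layer, priv_outer, sub_backward, priv_inner):
    a1, b1, b2 = layer
    a2 = decrypt(b1, b2, priv_outer)             # undo second ElGamal
    a2 = sub_backward[a2]                        # undo substitution
    return decrypt(a1, a2, priv_inner)           # undo first ElGamal
```

De-onion applies the inverse steps in reverse order, which is why removing the layer returns the original response.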

The Protocol

The Protocol has five phases
1. Data Preparation
2. Data Submission
3. Anonymization
4. Verification
5. Decryption

Phase I: Data Preparation
Suppose there are 3 respondents (Alice, Bob and Carol).
[Figure: Bob's data preparation process. Labels: Bob's original response, DM's primary key, Bob's secondary key, Alice's secondary key, Carol's secondary key, Bob's encrypted response → d_Bob. Values shown: 8902, 2453, 8091.]

Phase I: Data Preparation (cont'd)
Bob also computes a partial intermediate verification code W_Bob:
W_Bob = 6652 ⊕ 4240 ⊕ 7056 ⊕ b_b

Phase II: Data Submission
- Each participant submits an encrypted response d and a partial verification code W to the data miner.
The Data Miner
- Computes the verification code Ω_C = W_Bob ⊕ W_Alice ⊕ W_Carol.
- Encrypts Ω_C with its secondary key and sends the resulting encrypted value to each participant.
- Shuffles the response set {d_1, d_2, d_3}.
- Sends {d_1, d_2, d_3} to Carol.
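The data miner's bookkeeping in this phase can be sketched as below. This is a minimal sketch: the encryption of Ω_C under the data miner's secondary key is omitted, the submissions are treated as opaque values, and the function names are hypothetical.

```python
import random
from functools import reduce
from operator import xor

def xor_all(values):
    """XOR-fold a sequence of integers (0 for an empty sequence)."""
    return reduce(xor, values, 0)

def data_miner_collect(submissions, rng):
    """submissions: list of (encrypted_response, partial_code) pairs.

    Returns the combined verification code and the shuffled responses
    that would be forwarded to the first participant.
    """
    omega_c = xor_all(w for _, w in submissions)   # Omega_C = W_Bob ^ W_Alice ^ W_Carol
    responses = [d for d, _ in submissions]
    rng.shuffle(responses)                         # break the submission order
    return omega_c, responses
```

Because XOR is commutative and associative, Ω_C does not depend on the order in which submissions arrive, so the shuffle does not affect the later check.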

Phase III: Anonymization
- Carol "de-onions" one layer from each of the responses {d_1, d_2, d_3}.
[Figure: e.g., ElGamal decryption → substitution de-cipher → ElGamal decryption yields d'_x, together with an intermediate verification value]

Phase III: Anonymization (cont'd)
- ... and computes the intermediate verification code V_Carol:
V_Carol = 7809 ⊕ 2291 ⊕ 6790 ⊕ V_C
- Shuffles the resulting set {d'_y, d'_z, d'_x}.
- Sends {d'_y, d'_z, d'_x} to the Data Miner.

Phase III: Anonymization (cont'd)
- The Data Miner sends the randomized set {d'_y, d'_z, d'_x} to the next participant (e.g., Alice).
- Like Carol, Alice "de-onions" one layer from each element of {d'_y, d'_z, d'_x}.
- Computes her intermediate verification code.
- Shuffles the resulting set {d'_p, d'_q, d'_r}.
- Sends {d'_p, d'_q, d'_r} to the Data Miner.

Phase III: Anonymization (cont'd)
- The data miner sends {d'_p, d'_q, d'_r} to the last participant (i.e., Bob), who "de-onions" another layer from each element of this set.
- Bob computes his intermediate verification code, shuffles the resulting set S = {d'_m, d'_n, d'_o} and sends S to the data miner.
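The round structure of the anonymization phase (peel one layer from every element, shuffle, hand back to the data miner) can be sketched with a deliberately simple stand-in. Note the substitution: each layer here is an additive mask modulo M, not the ElGamal plus substitution onion of the slides, and unlike real layered encryption these masks commute. Names and values are hypothetical.

```python
import random

M = 10_000  # toy value space for 4-digit responses (assumption)

def add_layer(value, key):
    return (value + key) % M

def strip_layer(value, key):
    return (value - key) % M

def wrap(inner, keys_in_peel_order):
    """Apply one layer per participant; the first to peel is applied last."""
    value = inner
    for key in reversed(keys_in_peel_order):
        value = add_layer(value, key)
    return value

def anonymize_round(batch, key, rng):
    """One participant's turn: peel its layer from every element, then shuffle."""
    peeled = [strip_layer(v, key) for v in batch]
    rng.shuffle(peeled)
    return peeled

def run_anonymization(batch, keys_in_peel_order, rng):
    for key in keys_in_peel_order:       # the data miner relays the set between rounds
        batch = anonymize_round(batch, key, rng)
    return batch
```

After the last round the set holds the same inner values as before wrapping, but their order has been permuted once by every participant, so no single party knows the full permutation.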

Phase IV: Verification
- The data miner computes the final secondary encryption value R from S.
- It sends R along with its secondary secret key to all participants.
- Bob, Alice and Carol decrypt the intermediate verification code they received in Phase II.
- They also compute Ω_V and check that Ω_V = Ω_C.
- If the check passes, each of them sends their secondary secret key to the data miner.
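One plausible reading of the Ω_V = Ω_C check is that both codes XOR the same set of intermediate onion values, grouped differently: by respondent during preparation and by the participant who peels during anonymization. The sketch below illustrates only that idea; the table layout, values and function names are hypothetical, and the extra terms b_b and V_C from the slides are left out.

```python
from functools import reduce
from operator import xor

def xor_all(values):
    return reduce(xor, values, 0)

def preparation_code(table, respondent):
    """W for one respondent: XOR of the intermediate values of its own onion."""
    return xor_all(table[respondent].values())

def anonymization_code(table, peeler):
    """V for one participant: XOR of the values it sees while peeling its layer."""
    return xor_all(row[peeler] for row in table.values())

def omega_c(table):
    return xor_all(preparation_code(table, r) for r in table)

def omega_v(table, peelers):
    return xor_all(anonymization_code(table, p) for p in peelers)

# Hypothetical intermediate values: table[respondent][peeling participant]
PREPARED = {
    "Bob":   {"Carol": 1111, "Alice": 2222, "Bob": 3333},
    "Alice": {"Carol": 4444, "Alice": 5555, "Bob": 6666},
    "Carol": {"Carol": 7777, "Alice": 8888, "Bob": 9999},
}
PEELERS = ["Carol", "Alice", "Bob"]
```

Under this reading each party exchanges and compares a single value, and a replaced or altered intermediate value makes the two codes disagree.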

Phase V: Decryption
- The data miner uses the respondents' secondary keys to strip off the remaining encryption layers from S.
- It uses its own primary key to strip off the final layer and reveal the original responses {…, 1234, …}.

Results and Analysis

Performance Analysis - Communication Overhead
[Figure: communication overhead; reference shown: Brickell et al., KDD 2006]

Complexity
- Computation
  - Respondent: O(N)
  - Data miner: O(N^2)
- Communication
  - Participant: O(N)

Conclusion
- The privacy of individuals is an important issue in online data collection.
- Ignoring respondents' privacy will result in inaccurate data.
- Privacy-preserving online data collection must be (i) deterministic and (ii) efficient.

Conclusion
- Deterministic: we employ cryptographic techniques.
- Collusion resistance: we incorporate the onion/de-onion technique (using ElGamal + substitution) to create a protective layer against collusion.
- Efficiency: verification is done on single values instead of entire datasets.

Thank you Q&A

The Protocol cont'd
Suppose there are 3 respondents (Alice, Bob and Carol).
1. Data Preparation (Bob's)
[Figure labels: Bob's original response; DM's primary key; Bob's, Alice's and Carol's secondary keys; Bob's, Alice's and Carol's primary keys, each with a substitution cipher; Bob's encrypted response → d_Bob. Values shown: 8091, 7609, 5607, 7056, 3905, 8893.]
- Bob generates a random number θ and computes b_a = g^θ and b_b = gθ.
- Bob also generates W_Bob = 6652 ⊕ 4240 ⊕ 7056 ⊕ b_b.


Related Works Cont'd
3. Mix Networks
- Respondents send their response to an intermediate hop.
- Each hop strips off a layer of encryption, which reveals the next hop's address, and forwards the result to it.
- The process continues until the response reaches the data collector.
Pros:
- Requires less communication overhead.
Cons:
- A probabilistic approach that only works well if all participants are honest.
- Intermediate hops can collaborate to breach an honest respondent's anonymity.

The Hybrid Model Cont'd
[Figure: an example.
Onion: original response → ElGamal encryption → substitution cipher → ElGamal encryption.
De-Onion: ElGamal decryption → substitution de-cipher → ElGamal decryption → original response.]