Secure Computation of the k’th Ranked Element Gagan Aggarwal Stanford University Joint work with Nina Mishra and Benny Pinkas, HP Labs.



A story … I bet the dumbest student in Gryffindor has a higher IQ than the median IQ of all students in the school. But you don’t even know what the median IQ is … But what about the privacy of the students? We can do “Secure function evaluation” … This is all “theory”. It can’t be efficient. Let us compute it...

Rising Need for Privacy
Many opportunities for interaction between institutions and agencies holding sensitive data.
Privacy cannot be sacrificed, i.e., different agencies might hold data which they are not allowed to share.
A need for protocols to evaluate functions while preserving privacy of data.

Privacy-preserving Computation: the ideal case
Input: x, y
Output: F(x,y) and nothing else

Trusted third parties are rare
Run a protocol to evaluate F(x,y) without a trusted party.
Two kinds of adversaries:
– Semi-honest: follows the protocol, but is curious to learn more than F(x,y).
– Malicious: might do anything.

Is there anything better?
Does the trusted party scenario make sense?
– Are the parties motivated to submit their true inputs?
– Can they tolerate the disclosure of F(x,y)?
Our goal: Implement the scenario without a trusted party.

Definition of security: semi-honest model
Protocol is secure if Bob can generate the sequence of messages exchanged from his own input y and the value of F(x,y).

Definition of security: malicious model
Protocol is secure if ∀ adversary Bob, ∃ an input y s.t. Bob’s actions correspond to him presenting y to a trusted third party.

Secure Function Evaluation [Yao, GMW, BGW, CCD]
F(x,y): a public function, represented as a Boolean circuit C(x,y).
Input: x, y
Output: C(x,y) and nothing else
Implementation:
– O(|X|) “oblivious transfers”.
– O(|C|) communication.
– Pretty efficient for small circuits! E.g., is x > y? (Millionaire’s problem)
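As a rough illustration of the "small circuit" for the Millionaire's problem, the sketch below evaluates x > y using only AND/OR/XOR operations on single bits, one constant-size stage per bit, so the gate count grows linearly with the bit length. It runs in the clear and does not implement garbling or oblivious transfer; the names `to_bits` and `greater_than_circuit` are illustrative and do not come from the talk.

```python
def to_bits(value, width):
    """Return the bits of a non-negative integer, most significant first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def greater_than_circuit(x_bits, y_bits):
    """Evaluate x > y gate by gate on two equal-length bit lists."""
    gt = 0  # 1 once a higher bit has decided that x > y
    eq = 1  # 1 while all bits seen so far are equal
    for xb, yb in zip(x_bits, y_bits):
        gt = gt | (eq & xb & (1 ^ yb))  # first differing bit has x=1, y=0
        eq = eq & (1 ^ (xb ^ yb))       # still equal only if this bit matches
    return gt


print(greater_than_circuit(to_bits(5, 4), to_bits(3, 4)))
```

Each loop iteration corresponds to a fixed handful of gates, which is why a comparison of log D bit numbers stays cheap.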

Some useful primitives
Useful to have efficient solutions for simple primitives. Let X and Y be sets of elements:
– X ∩ Y (first talk)
– Statistics over X ∪ Y: Max, Min, Average, Median, k-th ranked element.

k-th ranked element
Inputs:
– Alice: S_A; Bob: S_B
– Large sets of unique items (∈ S).
– The rank k. Could depend on the size of the input datasets. Median: k = (|S_A| + |S_B|) / 2
Output:
– x ∈ S_A ∪ S_B s.t. x has k-1 elements smaller than it.
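For reference, the function being computed can be written down as what a trusted party would do with both inputs in hand. This is only a specification sketch with no privacy at all; `kth_ranked` is a hypothetical name and the 1-based rank follows the definition above.

```python
def kth_ranked(s_a, s_b, k):
    """Return the element of S_A ∪ S_B with exactly k-1 smaller elements."""
    union = sorted(set(s_a) | set(s_b))
    if not 1 <= k <= len(union):
        raise ValueError("k must be between 1 and the size of the union")
    return union[k - 1]


s_a, s_b = [1, 5, 9], [2, 7, 11]
k = (len(s_a) + len(s_b)) // 2  # the median as defined on the slide
print(kth_ranked(s_a, s_b, k))
```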

Motivation Basic statistical analysis of distributed data. E.g. histogram of salaries in all CS departments (Taulbee survey).

Faculty salary for top 12 CS departments ( )
Faculty rank          Number  Minimum  Mean    Median  Maximum
Non-tenure teaching   75      $37 K    $72 K           $110 K
Assistant professor   118     $50 K    $81 K           $96 K
Associate professor   86      $63 K    $91 K           $120 K
Full professor        218     $52 K    $123 K  $117 K  $199 K

Results
Finding the k-th ranked item (D = |domain|)
– Two-party: reduction to log k secure comparisons of log D bit numbers. log k rounds * O(log D)
– Multi-party: reduction to log D simple computations with log D bit numbers. log D rounds * O(log D)
– Also, security against malicious parties.
– Can hide the size of the datasets.

Related work
Lower bound: Ω(log D)
– From communication complexity.
Generic constructions
– Using circuits [Yao …]: Overhead at least linear in k.
– Naor-Nissim: Overhead of Ω(D).

An (insecure) two-party median protocol
[Figure: S_A is split at its median m_A into L_A and R_A; S_B is split at its median m_B into L_B and R_B. Case shown: m_A < m_B.]
L_A lies below the median, R_B lies above the median.
New median is same as original median.
Recursion ⇒ need log n rounds (assume each set contains n = 2^i items).

Secure two-party median protocol
A finds its median m_A. B finds its median m_B.
Secure comparison (e.g. a small circuit): m_A < m_B?
– YES: A deletes elements ≤ m_A. B deletes elements > m_B.
– NO: A deletes elements > m_A. B deletes elements ≤ m_B.
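The control flow of the protocol can be sketched as follows. This is not a secure implementation: `secure_less_than` is a stand-in that compares in the clear where the real protocol would run a secure comparison. The sketch also assumes that each party's "median" is the lower median of its current set and that the last step outputs the smaller of the two remaining items; neither detail is spelled out on the slide.

```python
def secure_less_than(x, y):
    """Placeholder for the secure comparison; here it compares in the clear."""
    return x < y


def two_party_median(s_a, s_b):
    """Return (median, number of comparisons) for two equal-size sets.

    Assumes both sets have the same power-of-two size n and that all 2n
    items are distinct. The median is the n-th smallest of the 2n items,
    i.e. k = (|S_A| + |S_B|) / 2.
    """
    a, b = sorted(s_a), sorted(s_b)
    n = len(a)
    if n == 0 or n != len(b) or n & (n - 1):
        raise ValueError("both sets must have the same power-of-two size")
    if len(set(a) | set(b)) != 2 * n:
        raise ValueError("items must be distinct")
    comparisons = 0
    while len(a) > 1:
        half = len(a) // 2
        m_a, m_b = a[half - 1], b[half - 1]  # each party's local (lower) median
        comparisons += 1
        if secure_less_than(m_a, m_b):
            a = a[half:]  # A deletes elements <= m_A
            b = b[:half]  # B deletes elements > m_B
        else:
            a = a[:half]  # A deletes elements > m_A
            b = b[half:]  # B deletes elements <= m_B
    comparisons += 1  # assumed final step: output the smaller remaining item
    median = a[0] if secure_less_than(a[0], b[0]) else b[0]
    return median, comparisons


print(two_party_median([1, 3, 5, 7], [2, 4, 6, 8]))  # (4, 3)
```

Each round halves both sets, so sets of size n need about log n comparisons.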

An example
[Figure: A and B narrow down their sets over successive comparisons: m_A > m_B, m_A < m_B, m_A < m_B, m_A > m_B, m_A < m_B.]
Median found!!

Proof of security
[Figure: the sequence of comparison results between A and B (m_A > m_B, m_A < m_B, m_A < m_B, m_A > m_B, m_A < m_B) shown twice, each ending at the median.]
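One way to see why the comparison results leak nothing beyond the output is to check that Bob can predict every result from his own set and the median alone. The sketch below assumes the halving protocol with lower medians, distinct items, power-of-two set sizes, and a final "output the smaller item" step; under those assumptions a round's result is "m_A < m_B" exactly when m_B is at least the median. The function names are illustrative, and the simulation of the secure comparisons themselves is not modelled.

```python
def run_protocol(s_a, s_b):
    """Real execution in the clear: return (median, list of 'm_A < m_B' results)."""
    a, b = sorted(s_a), sorted(s_b)
    results = []
    while len(a) > 1:
        half = len(a) // 2
        less = a[half - 1] < b[half - 1]
        results.append(less)
        a, b = (a[half:], b[:half]) if less else (a[:half], b[half:])
    less = a[0] < b[0]  # assumed final step: the smaller item is the median
    results.append(less)
    return (a[0] if less else b[0]), results


def simulate_for_bob(s_b, median):
    """Rebuild the comparison results from Bob's input and the output only."""
    b = sorted(s_b)
    results = []
    while len(b) > 1:
        half = len(b) // 2
        less = b[half - 1] >= median  # m_A < m_B iff m_B is at or above the median
        results.append(less)
        b = b[:half] if less else b[half:]
    results.append(b[0] != median)  # last comparison: Bob's item is not the output
    return results


median, real = run_protocol([1, 3, 5, 7], [2, 4, 6, 8])
print(real == simulate_for_bob([2, 4, 6, 8], median))
```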

Still to come…
– Security against malicious parties.
– Adapt the median protocol for arbitrary k and arbitrary input set size.
– Hide the size of the datasets.
– k-th element for multi-party scenario.

Security against malicious parties
Comparisons secure against malicious parties.
Verify that parties’ inputs to comparisons are consistent. I.e., prevent
– Round 1: m_A = Is told to delete all x > 1000.
– Round 2: m_A = 1100…
Solution: Each round sends secure “state” to next round (i.e., boundaries for parties’ inputs).
Implement “reactive computation” [C, CLOS]. Can implement in a single circuit.
Efficient security against malicious parties.
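The idea of carrying input boundaries from round to round can be sketched in the clear as below. In the actual protocol this state would live inside the secure computation; here `Bounds` and `checked_round` are hypothetical names, and the concrete values in the usage example (m_A = 1000, m_B = 900) are assumed for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    lo: float = -math.inf  # surviving items must be > lo
    hi: float = math.inf   # surviving items must be <= hi


def checked_round(m_a, m_b, state_a, state_b):
    """Compare the medians after checking them against the carried state.

    Returns (m_a < m_b, new state for A, new state for B) and raises
    ValueError if an input contradicts what the party was told to delete.
    """
    for party, m, state in (("A", m_a, state_a), ("B", m_b, state_b)):
        if not (state.lo < m <= state.hi):
            raise ValueError(f"party {party} input {m} contradicts earlier rounds")
    if m_a < m_b:
        # A deleted items <= m_A, B deleted items > m_B
        return True, Bounds(m_a, state_a.hi), Bounds(state_b.lo, m_b)
    # A deleted items > m_A, B deleted items <= m_B
    return False, Bounds(state_a.lo, m_a), Bounds(m_b, state_b.hi)


# Round 1: A is told to delete all x > 1000.
_, state_a, state_b = checked_round(1000, 900, Bounds(), Bounds())
# Round 2: claiming m_A = 1100 is now rejected.
try:
    checked_round(1100, 950, state_a, state_b)
except ValueError as err:
    print(err)
```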

Security against malicious parties
[Figure: binary tree of comparisons with YES/NO branches. Root: a4 < b4. Second level: a2 < b6, a6 < b2. Third level: a1 < b7, a3 < b5, a5 < b3, a7 < b1. Leaves: a1 < b8, a2 < b7, a3 < b6, a4 < b5, a5 < b4, a6 < b3, a7 < b2, a8 < b1.]


Security against malicious parties
An adversary is fully defined by the input a_i’s it gives for each of the nodes of this tree. These (consistent) a_i’s form an input x which can be used with F(x,y) to generate a transcript.

Arbitrary input size, arbitrary k
[Figure: S_A and S_B, each brought to size k and then extended to size 2^i with padding elements marked "+" and "−".]
Now, compute the median of two sets of size k. Size should be a power of 2.
Median of new inputs = k-th element of original inputs.

Hiding size of inputs
Can search for k-th element without revealing size of input sets.
However, k = n/2 (median) reveals input size.
Solution: Let U = 2^i be a bound on input size.
[Figure: S_A and S_B, each extended from its own size up to U with padding elements marked "−" and "+".]
Median of new datasets is same as median of original datasets.

The multi-party case
Input: Party P_i has set S_i, i = 1..n (all values ∈ [a,b], where a and b are known).
Output: k-th element of S_1 ∪ … ∪ S_n
Basic Idea: Binary search on [a,b].

An example
[Figure: binary search over the interval [a,b], with steps labelled Left, Right and Done.]
Median found!!

The multi-party case
Protocol: Set m = (a+b)/2. Repeat:
– P_i inputs to a secure computation L_i = # elements in S_i smaller than m, and B_i = # times m appears in S_i.
– The following is computed securely:
  If ΣL_i ≥ k: continue in the lower half.
  Else, if Σ(L_i + B_i) ≥ k: found the median.
  Otherwise: continue in the upper half.
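The binary search can be sketched as below, again in the clear: the sums that the real protocol would compute securely are simply added up here. The sketch assumes an integer domain [a, b] and therefore shrinks the interval to m-1 or m+1 rather than to m, which keeps the integer search from stalling; `multi_party_kth` is a hypothetical name.

```python
def multi_party_kth(sets, k, a, b):
    """Return (k-th smallest value over all parties' items, rounds used).

    `sets` holds one list of integers per party, all within [a, b].
    Duplicates are allowed and counted with multiplicity.
    """
    rounds = 0
    while a <= b:
        m = (a + b) // 2
        rounds += 1
        # Each party contributes only two counts per round.
        less = sum(sum(1 for x in s if x < m) for s in sets)    # sum of L_i
        equal = sum(sum(1 for x in s if x == m) for s in sets)  # sum of B_i
        if less >= k:
            b = m - 1          # k-th element is below m: lower half
        elif less + equal >= k:
            return m, rounds   # k-th element is m
        else:
            a = m + 1          # k-th element is above m: upper half
    raise ValueError("k is larger than the total number of items")


parties = [[3, 8, 15], [1, 8, 20], [5, 9]]
print(multi_party_kth(parties, 4, 0, 31)[0])  # 8
```

The number of rounds depends only on the domain size, roughly log D, not on how many items the parties hold.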

The multi-party case
Can be made secure for malicious case.
– Using consistency checks.
Works for two-party case.
– Can be used for non-distinct elements.

Summary
Efficient secure computation of the median.
– Two-party: log k rounds * O(log D)
– Multi-party: log D rounds * O(log D)
– Communication overhead is very close to the communication complexity lower bound of log D bits.
Malicious case is efficient too.
– Do not use generic tools.
– Instead, we implement simple consistency checks to get security against malicious parties.

Thanks for your attention!

Open Problems
Approximation protocols for NP-hard problems.
– Clustering does not admit exact poly-time solutions. At best, hope for a protocol that computes an approximation. Then, comparison to a trusted party which computes the exact solution doesn’t seem fair.
– Need an appropriate notion of privacy.
Efficient solutions for more primitives.

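Definition of security: malicious model
[Figure: the real model, where the parties holding x and y exchange messages, next to the ideal (trusted party) model, where they obtain F(x,y) from a trusted party. The adversary in the real model learns no more than in the ideal model.]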

The multi-party case
Input: Party P_i has set S_i, i = 1..n (all values ∈ [a,b], where a and b are known).
Output: k-th element of S_1 ∪ … ∪ S_n
Protocol: Set m = (a+b)/2. Repeat:
– P_i inputs to a secure computation L_i = # elements in S_i smaller than m, and B_i = # times m appears in S_i.
– The following is computed securely:
  If ΣL_i ≥ k, set b = m, m = (a+m)/2. (Left)
  Else, if Σ(L_i + B_i) ≥ k, stop. k-th element is m. (Done)
  Otherwise, set a = m, m = (m+b)/2. (Right)

Definition of security: semi-honest model
Protocol is secure if Bob can generate a transcript T’ from his own input y and the value of F(x,y), s.t. T’ is computationally indistinguishable from the actual transcript of the protocol.

Definition of security: malicious model … x Protocol is secure if for every adversary Bob, there exists an input y s.t. Bob can generate a computationally indistinguishable transcript from this input y and the value of F(x,y).

Security against malicious parties
Consistency checks ensure that
– Along any execution path, a_i < a_j and b_i < b_j for all i < j.
– Any a_i or b_i appears at most twice on each execution path, and is checked to be consistent at those occurrences.
Any adversary is fully defined by the input b_i’s it gives for each of the nodes of this tree. These (consistent) b_i’s form an input y which can be used with F(x,y) to generate a transcript.

Previous work
Generic constructions using circuits [Yao …]:
– Overhead at least linear in k.
Naor-Nissim:
– Any function which can be computed with communication complexity of c bits can be privately computed with overhead 2^c.
– Communication complexity of median is Θ(log D) bits.
– Implies overhead of D using this approach.