CS573 Data Privacy and Security

Name: CS573 Data Privacy and Security
Uploaded: 2017-08-17T20:25:06+00:00
Duration: PTM27S42
Channel: Rosalind Lyons
Description: CS573 Data Privacy and Security

CS573 Data Privacy and Security
Secure Multiparty Computation – Basic Cryptographic Methods Li Xiong CS573 Data Privacy and Security 1

The Love Game (AKA the AND game)
He loves me, he loves me not… She loves me, she loves me not… Want to know if both parties are interested in each other. But… Do not want to reveal unrequited love. Input = 1 : I love you Input = 0: I love you Must compute F(X,Y)=X AND Y, giving F(X,Y) to both players. … as a friend Can we reveal the answer without revealing the inputs?

The Spoiled Children Problem (AKA The Millionaires Problem [Yao 1982])
Who has more toys? Who Cares? Pearl wants to know whether she has more toys than Gersh, Doesn’t want to tell Gersh anything. Doesn’t want Pearl to know how many toys he has. Pearl wants to know whether she has more toys than Gersh,. Gersh is willing for Pearl to find out who has more toys, Can we give Pearl the information she wants, and nothing else, without giving Gersh any information at all?

Secure Multiparty Computation
A set of parties with private inputs Parties wish to jointly compute a function of their inputs so that certain security properties (like privacy and correctness) are preserved Properties must be ensured even if some of the parties maliciously attack the protocol Examples Secure elections Auctions Privacy preserving data mining …

Application to Private Data Mining
The setting: Data is distributed at different sites These sites may be third parties (e.g., hospitals, government bodies) or may be the individual him or herself The aim: Compute the data mining algorithm on the data so that nothing but the output is learned Privacy  Security (why?) xn x1 x3 x2 f(x1,x2,…, xn) Secure computation can be used to solve distributed data-mining problem That is, carry out a secure computation

Privacy and Secure Computation
Privacy  Security Secure computation only deals with the process of computing the function It does not ask whether or not the function should be computed A two-stage process: Decide that the function/algorithm should be computed – an issue of privacy Apply secure computation techniques to compute it securely – security Cryptography aims for the following (regarding privacy): A secure protocol must reveal no more information than the output of the function itself That is, the process of protocol computation reveals nothing. Cryptography does not deal with the question of whether or not the function reveals much information E.g., mean of two parties’ salaries Deciding which functions to compute is a different challenge that must also be addressed in the context of privacy preserving data mining. Secure computation: Assume that there is a function that all parties wish to compute Secure computation shows how to compute that function in the safest way possible In particular, it guarantees minimal information leakage (the output only) Privacy: Does the function output itself reveal “sensitive information”, or Should the parties agree to compute this function?

Outline Secure multiparty computation Problem and security definitions
Feasibility results for secure computation Basic cryptographic tools and general constructions

Heuristic Approach to Security
Build a protocol Try to break the protocol Fix the break Return to (2) Real adversaries won’t tell you that they have broken the protocol You can never be really sure that the protocol is secure Compare to algorithms: Inputs are not adversarial Hackers will do anything to exploit a weakness – if one exists, it may well be found Security cannot be checked empirically

Another Heuristic Tactic
Design a protocol Provide a list of attacks that (provably) cannot be carried out on the protocol Reason that the list is complete Problem: often, the list is not complete…

A Rigorous Approach Provide an exact problem definition
Adversarial power Network model Meaning of security Prove that the protocol is secure The history of computer security shows that the heuristic approach is likely to fail Security is very tricky and often anti-intuitive Often by reduction to an assumed hard problem, like factoring large composites

Secure Multiparty Computation
A set of parties with private inputs wish to compute some joint function of their inputs. Parties wish to preserve some security properties. e.g., privacy and correctness. Example: secure election protocol Security must be preserved in the face of adversarial behavior by some of the participants, or by an external party.

Defining Security Components of ANY security definition
Adversarial power Network model Type of network Existence of trusted help Stand-alone versus composition Security guarantees It is crucial that all the above are explicitly and clearly defined.

Security Requirements
Consider a secure auction (with secret bids): An adversary may wish to learn the bids of all parties – to prevent this, require privacy An adversary may wish to win with a lower bid than the highest – to prevent this, require correctness But, the adversary may also wish to ensure that it always gives the highest bid – to prevent this, require independence of inputs

Defining Security Option 1: analyze security concerns for each specific problem Auctions: privacy and correctness Contract signing: fairness Problems: How do we know that all concerns are covered? Definitions are application dependent and need to be redefined from scratch for each task But then, maybe can condition vote for a party based on whether it will get the minimum number of votes

Defining Security – Option 2
The real/ideal model paradigm for defining security [GMW,GL,Be,MR,Ca]: Ideal model: parties send inputs to a trusted party, who computes the function for them Real model: parties run a real protocol with no trusted help A protocol is secure if any attack on a real protocol can be carried out in the ideal model Since no attacks can be carried out in the ideal model, security is implied

The Real Model x y Protocol output Protocol output

The Ideal Model x y x y f1(x,y) f2(x,y) f2(x,y) f1(x,y)

The Security Definition:
 For every real adversary A there exists an adversary S Protocol interaction Trusted party REAL IDEAL

Properties of the Definition
Privacy: The ideal-model adversary cannot learn more about the honest party’s input than what is revealed by the function output Thus, the same is true of the real-model adversary Correctness: In the ideal model, the function is always computed correctly Thus, the same is true in the real-model Others: For example, fairness, independence of inputs

Why This Approach? General – it captures all applications
The specifics of an application are defined by its functionality, security is defined as above The security guarantees achieved are easily understood (because the ideal model is easily understood) We can be confident that we did not “miss” any security requirements

Adversary Model Computational power: Adversarial behaviour:
Probabilistic polynomial-time versus all-powerful Adversarial behaviour: Semi-honest: follows protocol instructions Malicious: arbitrary actions Corruption behaviour Static: set of corrupted parties fixed at onset Adaptive: can choose to corrupt parties at any time during computation Number of corruptions Honest majority versus unlimited corruptions Corruption – when and how parties are corrupted Actions a corrupted party are allowed to take

Outline Secure multiparty computation Defining security
Feasibility results for secure computation Basic cryptographic tools and general constructions 22

Feasibility – A Fundamental Theorem
Any multiparty functionality can be securely computed For any number of corrupted parties: security with abort is achieved, assuming enhanced trapdoor permutations [Yao,GMW] With an honest majority: full security is achieved, assume private channels only [BGW,CCD] There is a rich literature that presents feasibility results (and efficient protocols) for many different models. We will not present a survey of these, however we will talk a little about composition… These results hold in the stand-alone model When composition is considered, things are much more difficult

Outline Secure multiparty computation Defining security
Feasibility results for secure computation Basic cryptographic tools and general constructions

Public-key encryption
Let (G,E,D) be a public-key encryption scheme G is a key-generation algorithm (pk,sk)  G Pk: public key Sk: secret key Terms Plaintext: the original text, notated as m Ciphertext: the encrypted text, notated as c Encryption: c = Epk(m) Decryption: m = Dsk(c) Concept of one-way function: knowing c, pk, and the function Epk, it is still computationally intractable to find m. *Different implementations available, e.g. RSA

Construction paradigms
Passively-secure computation for two-parties Use oblivious transfer to securely select a value Passively-secure computation with shares Use secret sharing scheme such that data can be reconstructed from some shares From passively-secure protocols to actively- secure protocols Use zero-knowledge proofs to force parties to behave in a way consistent with the passively- secure protocol

1-out-of-2 Oblivious Transfer (OT)
Inputs Sender has two messages m0 and m1 Receiver has a single bit {0,1} Outputs Sender receives nothing Receiver obtain m and learns nothing of m1-

Semi-Honest OT Let (G,E,D) be a public-key encryption scheme
G is a key-generation algorithm (pk,sk)  G Encryption: c = Epk(m) Decryption: m = Dsk(c) Assume that a public-key can be sampled without knowledge of its secret key: Oblivious key generation: pk  OG El-Gamal encryption has this property

Semi-Honest OT Protocol for Oblivious Transfer
Receiver (with input ): Receiver chooses one key-pair (pk,sk) and one public-key pk’ (oblivious of secret-key). Receiver sets pk = pk, pk1- = pk’ Note: receiver can decrypt for pk but not for pk1- Receiver sends pk0,pk1 to sender Sender (with input m0,m1): Sends receiver c0=Epk0(m0), c1=Epk1(m1) Receiver: Decrypts c using sk and obtains m.

Security Proof Intuition:
Sender’s view consists only of two public keys pk0 and pk1. Therefore, it doesn’t learn anything about that value of . The receiver only knows one secret-key and so can only learn one message Note: this assumes semi-honest behavior. A malicious receiver can choose two keys together with their secret keys. Formally: Sender’s view is independent of receiver’s input and so can easily be simulated (just give it 2 keys) Receiver’s view can be simulated by obtaining the output m and sending it Epk0(m),Epk1(m).

Generalization Can define 1-out-of-k oblivious transfer
Protocol remains the same: Choose k-1 public keys for which the secret key is unknown Choose 1 public-key and secret-key pair

General GMW Construction
For simplicity – consider two-party case Let f be the function that the parties wish to compute Represent f as an arithmetic circuit with addition and multiplication gates Aim – compute gate-by-gate, revealing only random shares each time

Random Shares Paradigm
Let a be some value: Party 1 holds a random value a1 Party 2 holds a+a1 Note that without knowing a1, a+a1 is just a random value revealing nothing of a. We say that the parties hold random shares of a. The computation will be such that all intermediate values are random shares (and so they reveal nothing).

Circuit Computation Stage 1: each party randomly shares its input with the other party Stage 2: compute gates of circuit as follows Given random shares to the input wires, compute random shares of the output wires Stage 3: combine shares of the output wires in order to obtain actual output AND AND Alice’s inputs NOT Bob’s inputs AND OR OR

Addition Gates Input wires to gate have values a and b:
Party 1 has shares a1 and b1 Party 2 has shares a2 and b2 Note: a1+a2=a and b1+b2=b To compute random shares of output c=a+b Party 1 locally computes c1=a1+b1 Party 2 locally computes c2=a2+b2 Note: c1+c2=a1+a2+b1+b2=a+b=c 35

Multiplication Gates Input wires to gate have values a and b:
Party 1 has shares a1 and b1 Party 2 has shares a2 and b2 Wish to compute c = ab = (a1+a2)(b1+b2) Party 1 knows its concrete share values. Party 2’s values are unknown to Party 1, but there are only 4 possibilities (depending on correspondence to 00,01,10,11)

Multiplication (cont)
Party 1 prepares a table as follows: Row 1 corresponds to Party 2’s input 00 Row 2 corresponds to Party 2’s input 01 Row 3 corresponds to Party 2’s input 10 Row 4 corresponds to Party 2’s input 11 Let r be a random bit chosen by Party 1: Row 1 contains the value ab+r when a2=0,b2=0 Row 2 contains the value ab+r when a2=0,b2=1 Row 3 contains the value ab+r when a2=1,b2=0 Row 4 contains the value ab+r when a2=1,b2=1

Concrete Example Assume: a1=0, b1=1 Assume: r=1 Row Party 2’s shares
Output value 1 a2=0,b2=0 (0+0).(1+0)+1=1 2 a2=0,b2=1 (0+0).(1+1)+1=1 3 a2=1,b2=0 (0+1).(1+0)+1=0 4 a2=1,b2=1 (0+1).(1+1)+1=1

The Gate Protocol The parties run a 1-out-of-4 oblivious transfer protocol Party 1 plays the sender: message i is row i of the table. Party 2 plays the receiver: it inputs 1 if a2=0 and b2=0, 2 if a2=0 and b2=1, and so on… Output: Party 2 receives c2=c+r – this is its output Party 1 outputs c1=r Note: c1 and c2 are random shares of c, as required

Summary By computing each gate these way, at the end the parties hold shares of the output wires. Function output generated by simply sending shares to each other.

Security Reduction to the oblivious transfer protocol
Assuming security of the OT protocol, parties only see random values until the end. Therefore, simulation is straightforward. Note: correctness relies heavily on semi- honest behavior (otherwise can modify shares).

Outline Secure multiparty computation Coming up Defining security
Feasibility results for secure computation Basic cryptographic tools and general constructions Coming up Applications in privacy preserving distributed data mining Random response protocols 42

A real-world problem and some simple solutions
Bob comes to Ron (a manager), with a complaint about a sensitive matter, asking Ron to keep his identity confidential A few months later, Moshe (another manager) tells Ron that someone has complained to him, also with a confidentiality request, about the same matter Ron and Moshe would like to determine whether the same person has complained to each of them without giving information to each other about their identities Comparing information without leaking it. Fagin et al, 1996

References Secure Multiparty Computation for Privacy- Preserving Data Mining, Pinkas, 2008 Chapter 7: General Cryptographic Protocols ( 7.1 Overview), The Foundations of Cryptography, Volume 2, Oded Goldreich Comparing information without leaking it. Fagin et al, 1996

Slides credits Tutorial on secure multi-party computation, Lindell
Introduction to secure multi-party computation, Vitaly Shmatikov, UT Austin

Remark The semi-honest model is often used as a tool for obtaining security against malicious parties. In many (most?) settings, security against semi-honest adversaries does not suffice. In some settings, it may suffice. One example: hospitals that wish to share data.

Malicious Adversaries
The above protocol is not secure against malicious adversaries: A malicious adversary may learn more than it should. A malicious adversary can cause the honest party to receive incorrect output. We need to be able to extract a malicious adversary’s input and send it to the trusted party.

Tool: Zero Knowledge Problem setting: a prover wishes to prove a statement to the verifier so that: Zero knowledge: the verifier will learn nothing beyond the fact that the statement is correct Soundness: the prover will not be able to convince the verifier of a wrong statement Zero-knowledge proven using simulation.

Illustrative Example Prover has two colored cards that he claims are of different color The verifier is color blind and wants a proof that the colors are different. Idea 1: use a machine to measure the light waves and color. But, then the verifier will learn what the colors are.

Example (continued) Protocol:
Verifier writes color1 and color2 on the back of the cards and shows the prover Verifier holds out one card so that the prover only sees the front The prover then says whether or not it is color1 or color2 Soundness: if they are both the same color, the prover will fail with probability ½. By repeating many times, will obtain good soundness bound. Zero knowledge: verifier can simulate by itself by holding out a card and just saying the color that it knows

Zero Knowledge Fundamental Theorem [GMR]: zero- knowledge proofs exist for all languages in NP Observation: given commitment to input and random tape, and given incoming message series, correctness of next message in a protocol is an NP-statement. Therefore, it can be proved in zero-knowledge.

Protocol Compilation Given any protocol, construct a new protocol as follows: Both parties commit to inputs Both parties generate uniform random tape Parties send messages to each other, each message is proved “correct” with respect to the original protocol, with zero-knowledge proofs.

Resulting Protocol Theorem: if the initial protocol was secure against semi-honest adversaries, then the compiled protocol is secure against malicious adversaries. Proof: Show that even malicious adversaries are limited to semi-honest behavior. Show that the additional messages from the compilation all reveal nothing.

Summary GMW paradigm: First, construct a protocol for semi-honest adv. Then, compile it so that it is secure also against malicious adversaries There are many other ways to construct secure protocols – some of them significantly more efficient. Efficient protocols against semi-honest adversaries are far easier to obtain than for malicious adversaries.

Useful References Oded Goldreich. Foundations of Cryptography Volume 1 – Basic Tools. Cambridge University Press. Computational hardness, pseudorandomness, zero knowledge Oded Goldreich. Foundations of Cryptography Volume 2 – Basic Applications. Cambridge University Press. Chapter on secure computation Papers: an endless list (I would rather not go on record here, but am very happy to personally refer people).

CS573 Data Privacy and Security

Similar presentations

Presentation on theme: "CS573 Data Privacy and Security"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CS573 Data Privacy and Security

Similar presentations

Presentation on theme: "CS573 Data Privacy and Security"— Presentation transcript:

Similar presentations

About project

Feedback