Helen: Maliciously Secure Coopetitive Learning for Linear Models CS294 AI-Sys Presented by Dylan Dreyer April 10, 2019
Outline Motivation + Problem Statement Background + Threat Model Overview of Helen + Key Features Results Discussion
Motivation #1 Model money laundering -> criminals move money back and forth across multiple banks, but banks cannot share their customers’ data in plaintext
Motivation #2 Model
Problem “collaboratively train machine learning models on combined datasets for a common benefit” but “organizations cannot share their sensitive data in plaintext due to privacy policies and regulations or due to business competition” Collaboration is beneficial: higher-quality models, enables new applications
Outline Motivation + Problem Background + Threat Model Overview of Helen + Key Features Results Discussion
Background on Coopetitive Learning Coopetitive -> cooperative and competitive General secure multi-party computation (MPC) is inefficient -> the paper implements SGD in SPDZ (the maliciously secure MPC library they also use for Helen) and finds that with 4 parties, 100K data points per party, and 90 features, training would take ~3 months Previous works (homomorphic encryption, privacy-preserving distributed computation) are limited unrealistic threat models -> rely on outsourcing to non-colluding servers, assume a passive attacker that does not deviate from the protocol, and base the confidentiality of data on the correct behavior of other parties limited to two parties (e.g. SecureML), or utilize trusted hardware, which opens up more avenues of vulnerability
Threat Model malicious setting -> only trust yourself! all other parties can misbehave or deviate arbitrarily during the protocol all parties agree on a functionality to compute confidentiality of the final model is not protected
Background on Crypto Building Blocks threshold partially homomorphic encryption partially homomorphic -> e.g. Paillier: Enc(X) * Enc(Y) = Enc(X+Y) threshold -> need enough shares of the secret key to decrypt; Helen uses a threshold of all parties zero-knowledge proofs prove that a statement is true without revealing the prover’s secret secure multi-party computation jointly compute a function over inputs while keeping the inputs private SPDZ chosen over garbled circuits because matrix operations are more efficient
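The additive homomorphism Enc(X) * Enc(Y) = Enc(X+Y) can be seen in a toy Paillier implementation. This is a minimal sketch with tiny hard-coded primes for illustration only (not production crypto, and not the threshold variant Helen actually uses):

```python
import random
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def keygen(p=2357, q=2551):
    # toy primes; real Paillier uses ~2048-bit moduli
    n = p * q
    g = n + 1                     # standard choice of generator
    lam = lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)          # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pk, m):
    n, g = pk
    n2 = n * n
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n
    return (L * mu) % n

pk, sk = keygen()
cx, cy = encrypt(pk, 20), encrypt(pk, 22)
# homomorphic addition: multiplying ciphertexts adds the plaintexts
assert decrypt(pk, sk, (cx * cy) % (pk[0] ** 2)) == 42
```

In Helen, this property lets parties update the encrypted global weights locally without ever decrypting them.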
Outline Motivation + Problem Background + Threat Model Overview of Helen + Key Features Results Discussion
Overview of Helen platform for maliciously secure coopetitive learning supports regularized linear models paper notes that these models are widely used in the target setting: a few organizations, each with lots of data, and a relatively small number of features
Key Features of Helen Overarching goal: make expensive cryptographic computation independent of the number of training samples Make all parties commit to their input datasets and prove it Use ADMM (Alternating Direction Method of Multipliers) to solve LASSO Use partially homomorphic encryption to encrypt the global weights so that each party can compute in a decentralized manner 5 phases Agreement Phase Initialization Phase Input Preparation Phase Model Compute Phase Model Release Phase
Input Preparation Phase Goal: broadcast encrypted summaries of the data and commit to them Why? Malicious parties could use inconsistent data during the protocol How? Encrypt the data and attach various proofs of knowledge Naive method: commit to the input dataset X_i directly crypto computation scales linearly with the number of samples requires complex matrix inversions in MPC zero-knowledge proof proves that a party knows the value of its data
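For a flavor of what “prove that a party knows the value of its data” means, a Schnorr-style proof of knowledge (made non-interactive with a Fiat-Shamir challenge) lets a prover show it knows the secret x behind a public value y = g^x without revealing x. This is a toy sketch over a tiny hard-coded group, not Helen’s actual proof system:

```python
import hashlib
import random

# toy group: p = 2q + 1 with q prime; g = 4 generates the order-q subgroup
p, q, g = 2039, 1019, 4

x = random.randrange(1, q)          # prover's secret
y = pow(g, x, p)                    # public value the prover commits to

# Prover: commit to a random nonce
r = random.randrange(1, q)
t = pow(g, r, p)
# Fiat-Shamir: derive the challenge by hashing the transcript
c = int(hashlib.sha256(f"{g},{y},{t}".encode()).hexdigest(), 16) % q
s = (r + c * x) % q                 # response; reveals nothing about x alone

# Verifier: accepts iff g^s == t * y^c, never learning x
assert pow(g, s, p) == (t * pow(y, c, p)) % p
```

Helen attaches proofs of this general shape to its encrypted summaries so a malicious party cannot commit to data it does not actually hold.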
Input Preparation Phase Goal: broadcast encrypted summaries of the data and commit to them Why? Malicious parties could use inconsistent data during the protocol How? Encrypt the data and attach various proofs of knowledge Better method: decompose A and b via an SVD of the input data all of the resulting matrices are of dimension d, no longer n Each party broadcasts encrypted A, b, y*, V, Σ, ϴ along with proofs of knowledge
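A plaintext sketch (on assumed synthetic data) of why these summaries are small: the SVD-derived matrices are d x d regardless of how many samples n a party holds, since A = X^T X can be rebuilt from the SVD factors alone:

```python
import numpy as np

n, d = 1000, 5                        # many samples, few features
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))       # one party's local data
y = rng.standard_normal(n)

# SVD of the data: X = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Summary A = X^T X reconstructed from d x d SVD factors, independent of n
A = Vt.T @ np.diag(S ** 2) @ Vt
b = X.T @ y                           # d-dimensional summary

assert A.shape == (d, d) and b.shape == (d,)
assert np.allclose(A, X.T @ X)
```

Only these d-sized objects need to be encrypted and proven about, which is what makes the crypto cost independent of n.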
Input Preparation Phase End of input preparation phase after this point, no party can use different inputs for the rest of the algorithm each party stores the encrypted summaries from every other party -> paper notes this is viable because the summaries are much smaller than the data
Model Compute Phase Goal: run the ADMM algorithm iteratively and update the encrypted global weights Why ADMM? efficient for linear models converges in a small number of iterations (~10) supports decentralized computation, which reduces the number of expensive MPC synchronizations thus efficient for cryptographic training, whereas secure SGD scales linearly with the number of samples
Model Compute Phase Goal: run ADMM iteratively to update the encrypted global weights 1. Local optimization Each party calculates Enc(wik+1) and also generates a proof of correct computation 2. Coordination using MPC Parties use the input summaries to verify Enc(wik+1) Convert the weights to MPC Compute the soft-thresholding z-update via MPC Convert z back into encrypted form
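The two steps above follow the standard consensus-ADMM iteration for LASSO, sketched here in plaintext on assumed synthetic two-party data (in Helen the local w-updates run on encrypted state and the soft-thresholding z-update runs inside MPC):

```python
import numpy as np

def soft_threshold(v, kappa):
    # elementwise soft-thresholding: the LASSO z-update
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def consensus_lasso_admm(As, bs, lam=0.1, rho=1.0, iters=100):
    """Consensus ADMM over N parties' summaries A_i = X_i^T X_i, b_i = X_i^T y_i."""
    N, d = len(As), bs[0].shape[0]
    z = np.zeros(d)
    ws = [np.zeros(d) for _ in range(N)]
    us = [np.zeros(d) for _ in range(N)]
    for _ in range(iters):
        # local w-updates: each party solves a small d x d system on its own data
        ws = [np.linalg.solve(A + rho * np.eye(d), b + rho * (z - u))
              for A, b, u in zip(As, bs, us)]
        # global z-update: soft-threshold the averaged local estimates
        z = soft_threshold(np.mean([w + u for w, u in zip(ws, us)], axis=0),
                           lam / (rho * N))
        # dual updates keep the local weights pulled toward consensus
        us = [u + w - z for u, w in zip(us, ws)]
    return z

# synthetic two-party example with a known sparse model
rng = np.random.default_rng(1)
w_true = np.array([1.5, 0.0, -2.0, 0.0, 0.5])
As, bs = [], []
for _ in range(2):
    X = rng.standard_normal((50, 5))
    y = X @ w_true + 0.01 * rng.standard_normal(50)
    As.append(X.T @ X)
    bs.append(X.T @ y)

z = consensus_lasso_admm(As, bs)
assert np.max(np.abs(z - w_true)) < 0.3   # recovers the sparse weights
```

Note that each iteration touches only d-dimensional quantities, which is why the expensive MPC work per sync is independent of the number of samples.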
Model Release Phase Goal: jointly decrypt and release the model parameters (z) convert the ciphertext to MPC verify this conversion jointly decrypt the model parameters (z)
Outline Motivation + Problem Background + Threat Model Overview of Helen + Key Features Results Discussion
Results Evaluation of the runtime of Helen’s different phases using a synthetic dataset SVD, although it operates on all n samples, is fast because it runs on plaintext data
Results baseline used sklearn on plaintext data instead of secure SGD because the SPDZ SGD implementation does not compile on large instances maybe would have liked to see a comparison using the same algorithm as Helen / using SGD
Results Helen vs. the secure SGD baseline Helen achieves orders-of-magnitude faster training time Helen’s input preparation scales very slowly with the dataset size -> the only part that grows with n is the plaintext SVD calculation would have been nice to test against other baselines as well would have been nice to see model error compared to the SGD baseline
Outline Motivation + Problem Background + Threat Model Overview of Helen + Key Features Results Discussion
Discussion Is there a need to extend to other types of models? Consequences of this? Trusted hardware (enclaves) is another popular approach to computing on sensitive data. Is it more viable? What happens when more parties get involved? Comparison vs. federated learning? Questions?