Privacy-preserving Machine Learning

Presentation transcript:

Privacy-preserving Machine Learning Jian Liu, Mika Juuti, N. Asokan Secure Systems Group, Aalto University

How do you evaluate ML-based systems? Effectiveness: inference accuracy/score measured on a held-out test set? Performance: inference speed and memory consumption? Hardware/software requirements: e.g. memory/processor limitations, or a specific software library? There are a lot of posters and talks today about machine learning, so I am sure you know how to evaluate a machine learning system.

Security & Privacy? Meeting requirements in the presence of an adversary. But did you pay attention to security and privacy, i.e., can the system still meet its requirements in the presence of an adversary?

Machine learning pipeline: data owners supply data to a pre-processor, a trainer builds the model, and an inference service provider serves the model to clients. (Legend: entities and components.) Here is the machine learning pipeline. An adversary can be present at any phase of this pipeline.

1. Malicious data owners. Attack target: the model (integrity). Risk: data poisoning [1, 2]. Remedies: access control; robust estimators; active learning (human-in-the-loop learning); outlier removal / normality models. If the data owners are malicious, they can poison the data to compromise the model's integrity. [1] https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist [2] https://wikipedia.org/wiki/Naive_Bayes_spam_filtering#Disadvantages
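The "robust estimators" and "outlier removal / normality models" remedies listed above can be sketched with a median-absolute-deviation filter; the function name and threshold below are illustrative choices, not from the talk:

```python
import statistics

def mad_filter(values, threshold=3.5):
    """Median-absolute-deviation filter: a simple robust-estimator defense
    against data poisoning. Points whose modified z-score exceeds the
    threshold are dropped before training; unlike mean/stddev filtering,
    the median is not itself skewed by the poisoned points."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:                      # all points identical: nothing to drop
        return list(values)
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]

clean = mad_filter([1.0, 1.2, 0.9, 1.1, 50.0])   # the planted 50.0 is dropped
```

Real poisoning defenses operate on high-dimensional features and labels, but the principle is the same: score each training point against a normality model and discard the tail.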

2. Malicious pre-processor. Attack targets: the model (integrity): data poisoning; remedies: access control, robust estimators, active learning (human-in-the-loop learning), outlier removal / normality models. The training data (confidentiality): unauthorized data use (e.g. profiling); remedies: adding noise (e.g. differential privacy) [1], oblivious aggregation (e.g. homomorphic encryption). If the pre-processor is malicious, it can poison the data as well, and it can also see the entire training data. [1] Heikkila et al. "Differentially Private Bayesian Learning on Distributed Data", NIPS'17
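The "adding noise (e.g. differential privacy)" remedy is, at its core, the Laplace mechanism; a minimal sketch, with function names and parameters chosen for illustration:

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) by inverse-transform sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_sum(values, sensitivity, epsilon, seed=None):
    """Release a sum under the Laplace mechanism: noise with scale
    sensitivity/epsilon gives epsilon-differential privacy for this query.
    Smaller epsilon means more noise and stronger privacy."""
    rng = random.Random(seed)
    return sum(values) + laplace_noise(sensitivity / epsilon, rng)

noisy = private_sum([3.0, 1.0, 4.0], sensitivity=1.0, epsilon=0.5)
```

The cited NIPS'17 work applies this idea in a distributed Bayesian-learning setting; the sketch only shows the basic noise-calibration step for a single aggregate query.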

3. Malicious model trainer. Attack target: the training data (confidentiality). Risk: unauthorized data use (e.g. profiling). Remedy: oblivious training (learning with encrypted data) [1]. The training data is also visible to a malicious trainer. [1] Graepel et al. "ML Confidential", ICISC'12
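Learning with encrypted data builds on additively homomorphic encryption. A toy Paillier sketch (deliberately insecure small primes, purely illustrative) shows the key property: an untrusted party can add values under encryption without ever decrypting them:

```python
import math
import random

def keygen(p, q):
    """Toy Paillier keypair from two small primes (real keys use
    primes of ~1024 bits or more)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)
    return (n,), (n, lam, mu)

def encrypt(pk, m, rng):
    (n,) = pk
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(sk, c):
    n, lam, mu = sk
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

pk, sk = keygen(17, 19)
rng = random.Random(0)
c1, c2 = encrypt(pk, 5, rng), encrypt(pk, 7, rng)
# Additive homomorphism: multiplying ciphertexts adds plaintexts,
# so an untrusted aggregator can sum data it cannot read.
aggregated = decrypt(sk, (c1 * c2) % (323 * 323))   # = 5 + 7
```

Full oblivious training as in ML Confidential needs more machinery (polynomial classifiers evaluated homomorphically), but this additive property is the building block.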

4. Malicious inference service provider. Attack target: inference queries/results (confidentiality). Risk: unauthorized data use (e.g. profiling). Remedy: oblivious inference [1, 2, 3]. What if the service provider is malicious? I will return to this case shortly. [1] Gilad-Bachrach et al. "CryptoNets", ICML'16 [2] Mohassel et al. "SecureML", IEEE S&P'17 [3] Liu et al. "MiniONN", ACM CCS'17

5. Malicious client. Attack targets: the training data (confidentiality): membership inference, model inversion; remedies: minimizing information leakage in responses, differential privacy. The model (confidentiality): model theft [1]; remedies: a normality model for client queries, adaptive responses to client requests. The model (integrity): model evasion [2]. If the clients are malicious, they can steal the model, evade it, or even invert it by sending queries. [1] Tramer et al. "Stealing ML models via prediction APIs", UsenixSEC'16 [2] Dang et al. "Evading Classifiers by Morphing in the Dark", CCS'17
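The model-theft risk is easiest to see for a linear model exposed through a prediction API: d + 1 well-chosen queries recover the parameters exactly. A toy sketch (the victim "API" and all names here are hypothetical):

```python
def extract_linear_model(query, dim):
    """Toy model-theft attack on a linear model f(x) = w.x + b exposed
    through a prediction API: query the origin to get b, then each
    basis vector to get one weight at a time."""
    b = query([0.0] * dim)
    w = []
    for i in range(dim):
        x = [0.0] * dim
        x[i] = 1.0
        w.append(query(x) - b)
    return w, b

# A hypothetical victim API with secret parameters:
secret_w, secret_b = [2.0, -1.0], 0.5
api = lambda x: sum(wi * xi for wi, xi in zip(secret_w, x)) + secret_b
stolen_w, stolen_b = extract_linear_model(api, 2)   # recovers [2.0, -1.0], 0.5
```

Attacks on real APIs (as in the cited Tramer et al. paper) extend this idea to richer model classes and noisy outputs, which is why the remedies above monitor and adapt to anomalous query patterns.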

Machine learning as a service (MLaaS): the client sends an input, the service returns predictions. Next, I will focus on the case of a malicious service provider, who can see all queries sent by users and thereby violate clients' privacy. This can be very serious in scenarios such as online diagnosis.

Oblivious Neural Networks (ONN): given a neural network, is it possible to make it oblivious? The server learns nothing about the clients' input; the clients learn nothing about the model.

Example: CryptoNets [GDLLNW16]. FHE-encrypted input, FHE-encrypted predictions. High throughput for batch queries from the same client, but high overhead for single queries: 297.5 s and 372 MB. Only supports low-degree polynomials. [GDLLNW16] CryptoNets, ICML 2016
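The "only supports low-degree polynomials" limitation is concrete: FHE evaluates additions and multiplications on ciphertexts, so comparison-based activations like ReLU must be replaced. A minimal sketch of the substitution (plain floats stand in for ciphertexts):

```python
def relu(x):
    """Standard activation: needs a comparison, which FHE cannot
    evaluate directly on ciphertexts."""
    return max(0.0, x)

def square(x):
    """Degree-2 polynomial activation used in CryptoNets-style networks:
    only a multiplication, so it maps directly onto FHE operations."""
    return x * x

# The two activations behave very differently on negative inputs,
# so the network must be trained with the square activation from the start.
pairs = [(relu(x), square(x)) for x in (-2.0, 0.5, 3.0)]
```

This substitution is also one source of accuracy trade-offs: every nonlinearity in the network has to be approximated by a low-degree polynomial.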

MiniONN: Overview. Blinded input, oblivious protocols, blinded predictions. Lightweight primitives: additively homomorphic encryption (with SIMD), secure two-party computation.
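The "blinded input / blinded predictions" idea boils down to additive secret sharing over a fixed ring: each party holds one share, and either share alone is uniformly random. A minimal sketch (the ring size and function names are illustrative):

```python
import random

MOD = 2 ** 32   # illustrative ring size; the protocol works modulo a fixed ring

def blind(x, rng):
    """Split x into two additive shares: the client keeps r, the server
    sees (x - r) mod MOD, which alone is uniformly random and reveals
    nothing about x."""
    r = rng.randrange(MOD)
    return r, (x - r) % MOD

def unblind(r, masked):
    """Recombine the two shares to recover the plaintext value."""
    return (r + masked) % MOD

rng = random.Random(42)
r, masked = blind(1234, rng)    # neither share alone reveals 1234
```

In the actual protocol, the two parties then run oblivious sub-protocols (homomorphic encryption, two-party computation) that compute each network layer directly on such shares, so the plaintext is never reconstructed at the server.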

Performance (for single queries); pre-computation costs in parentheses:
Model         | Latency (s)   | Msg sizes (MB) | Accuracy
MNIST/Square  | 0.4 (+ 0.88)  | 44 (+ 3.6)     | 98.95%
CIFAR-10/ReLU | 472 (+ 72)    | 6226 (+ 3046)  | 81.61%
PTB/Sigmoid   | 4.39 (+ 13.9) | 474 (+ 86.7)   | 120/114.5 (perplexity)
MiniONN achieves a ~700x performance improvement over CryptoNets.