Securing Cloud-assisted Services


1 Securing Cloud-assisted Services
11:50 – 12:15 (25 mins) N. Asokan @nasokan 14 June

2 Helsinki, Finland
Home to leading universities: University of Helsinki (top 100 overall), Aalto University (top 100 in Computer Science)
Innovation hub:
- Local giants: Nokia, Ericsson
- Security innovators: F-Secure, SSH, ...
- Recent arrivals: Intel, Samsung, Huawei, ...
- New tigers: Rovio, Supercell, ..., lots of startups

3 Software Made in Finland

4 Services are moving to “the cloud”

5 Services are moving to “the cloud”
Example: cloud storage
Example: cloud-based malware scanning service

6 Securing cloud storage
Client-side encryption of user data is desirable.
But naïve client-side encryption conflicts with:
- Storage provider business requirement: deduplication ([LPA15] ACM CCS ’15, [LDLA18] CT-RSA ’18)
- End-user usability requirement: multi-device access ([P+AS18] IEEE IC ’18, CeBIT ’16)

7 Malware checking
On-device checking (the user's device holds the malware DB):
- High communication and computation costs
- Database changes frequently
- Database is revealed to everyone
Cloud-based checking:
- Minimal communication and computation costs
- Database can change frequently
- Database is not revealed to everyone
- But: user privacy at risk!

8 Cloud-based malware scanning service
Needs to learn about apps installed on client devices, and can therefore infer personal characteristics of users

9 Oblivious Neural Network Predictions via MiniONN Transformations
N. Asokan @nasokan (joint work with Jian Liu, Mika Juuti, Yao Lu)

10 Machine learning as a service (MLaaS)
Input → predictions. Machine learning as a service (MLaaS) is a popular paradigm for providing online predictions: the server trains a model, receives inputs from clients, runs the model on each input, and returns the predictions. Google Translate is an example: the server's model translates sentences submitted by clients. Clients therefore have to reveal their inputs and predictions to the server. For sensitive inputs, as in online medical diagnosis, a malicious server can misuse the data, e.g., to profile users or to sell it. Violation of clients' privacy.

11 Oblivious Neural Networks (ONN)
Given any existing neural network, is it possible to make it oblivious? That is, during prediction:
- the server learns nothing about the clients' input
- the clients learn nothing about the model

12 Example: CryptoNets
FHE-encrypted input, FHE-encrypted predictions: clients encrypt their input under fully homomorphic encryption (FHE); the server runs its model on the encrypted data and returns encrypted predictions.
+ High throughput for batch queries from the same client
- High overhead for single queries: 297.5 s and 372 MB (MNIST dataset), versus under 1 ms without privacy
- Cannot support comparison operations or high-degree polynomials, which are common in neural networks; their experiments use a square activation function, which is not typical
[GDLLNW16] CryptoNets, ICML 2016. FHE: fully homomorphic encryption

13 MiniONN: Overview
Blinded input → oblivious protocols → blinded predictions. MiniONN (“minimizing the overhead of oblivious neural networks”) uses only lightweight primitives: additively homomorphic encryption and secure two-party computation.
- Low overhead: ~1 s per single prediction
- Supports all common neural network operations

14 All operations are in a finite field
Example: a three-layer network in which linear-transformation layers (multiplication by weight matrices W, addition of bias vectors b) alternate with non-linear activation or pooling layers. The input x and intermediate values z are sensitive to the client, whereas W and b are sensitive to the server. To transform any neural network we must handle linear transformations, activation functions, and pooling operations, which the next slides do in turn. For now, assume all operations are in a finite field; floating-point numbers are dealt with later.
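As a hedged illustration of how floating-point values can live in a finite field, here is a minimal fixed-point encoding sketch; the modulus and scale below are arbitrary choices for exposition, not MiniONN's actual parameters:

```python
# Hypothetical parameters, not MiniONN's actual ones.
P = 2**61 - 1        # illustrative prime field modulus
SCALE = 2**16        # fixed-point scaling factor

def encode(x):
    """Map a float to a field element via fixed-point scaling."""
    return int(round(x * SCALE)) % P

def decode(v):
    """Map a field element back to a float; values above P/2 are negative."""
    if v > P // 2:
        v -= P
    return v / SCALE

a, b = 1.5, -2.25
# A product of two encoded values carries an extra factor of SCALE,
# so it needs one more division by SCALE after decoding.
prod = (encode(a) * encode(b)) % P
assert decode(prod) / SCALE == a * b
```

Additions of encoded values work directly; each multiplication adds a factor of SCALE that must be rescaled away, which is why the field must be large enough to hold intermediate products.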

15 Core idea: use secret sharing for oblivious computation
[LJLA17] MiniONN, ACM CCS 2017
At the beginning of each layer, the client and server hold additive shares x^c and x^s such that x^c + x^s = x. Each share is indistinguishable from random to its holder; added together, the shares recover the original value. The same is done for all layers, and the shares are combined at the end. This guarantees that if each layer can be evaluated securely, the composition of all layers is secure as well. Efficient cryptographic primitives are used: secure two-party computation (2PC) and additively homomorphic encryption.
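The additive secret sharing described above can be sketched in a few lines (the field modulus is an illustrative choice; the real protocol works over whatever field the model is encoded in):

```python
import secrets

P = 2**61 - 1  # illustrative prime field modulus

def share(x):
    """Split x into two additive shares, each uniformly random on its own."""
    xc = secrets.randbelow(P)   # client's share: independent of x, pre-choosable
    xs = (x - xc) % P           # server's share
    return xc, xs

def reconstruct(xc, xs):
    """Adding the shares recovers the original value."""
    return (xc + xs) % P

x = 123456789
xc, xs = share(x)
assert reconstruct(xc, xs) == x
```

Because x^c is chosen independently of x, shares can be generated ahead of time, which is what enables the precomputation phase used in the later slides.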

16 Secret sharing the initial input
To share the initial input x, the client randomly generates its own share x^c and computes the server's share as x^s = x − x^c. Note that x^c is independent of x, so it can be pre-chosen.

17 Oblivious linear transformation
Start from the basic matrix multiplication y = W·x + b with x = x^s + x^c, and rearrange: W·x^s + b can be computed locally by the server. The tricky part is W·x^c: W is known only to the server and x^c only to the client, so the two parties need to compute this dot-product obliviously.

18 Oblivious linear transformation: dot-product
Homomorphic encryption with SIMD batching: the server decrypts the returned ciphertexts and adds together the plaintexts corresponding to each row, giving a vector u; the client adds up its blinding values r for each row, giving a vector v. Adding u and v cancels the randomness, so u + v = W·x^c. Note that u, v, and W·x^c are all independent of x, so the triple <u, v, x^c> can be generated and stored in a precomputation phase. SIMD (single instruction, multiple data) batching packs multiple plaintext elements into one ciphertext, since ciphertexts are much larger than plaintexts.
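A minimal sketch of this triple generation, with the homomorphic evaluation simulated in the clear (the field modulus and toy weights are illustrative, not the paper's parameters), showing that the client's blinding values cancel:

```python
import secrets

P = 2**61 - 1  # illustrative prime field modulus

# Server's secret weights W and client's secret share x^c (toy values).
W = [[3, 1, 4], [1, 5, 9]]
xc = [2, 7, 1]

# In the real protocol the evaluation of w_ij * x^c_j + r_ij happens on
# ciphertexts under additively homomorphic encryption; here it is
# simulated in the clear to show the arithmetic.
r = [[secrets.randbelow(P) for _ in row] for row in W]

# Server decrypts and sums per row -> u; client sums its -r per row -> v.
u = [sum(w * x + rij for w, x, rij in zip(row, xc, rrow)) % P
     for row, rrow in zip(W, r)]
v = [(-sum(rrow)) % P for rrow in r]

# The blinding cancels: u + v = W·x^c, and neither u nor v reveals
# anything about the other party's secrets on its own.
expected = [sum(w * x for w, x in zip(row, xc)) % P for row in W]
assert [(ui + vi) % P for ui, vi in zip(u, v)] == expected
```

Since nothing here depends on the actual input x, the whole exchange can run in the precomputation phase, as the slide notes.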

19 Oblivious linear transformation
The tricky part W·x^c can now simply be computed as u + v. The server passes its part, W·x^s + b + u, to the next layer, and the client passes its part, v.

20 Oblivious linear transformation
Rewriting these two outputs as y^s and y^c, the parties now hold additive shares of the layer output: the linear transformation has been converted into oblivious form.

21 Recall: use secret sharing for oblivious computation
Client and server hold shares x^c and x^s with x^c + x^s = x at the beginning of each layer; evaluating each layer securely makes the composition of all layers secure. Next: activation functions and pooling operations.

22 Oblivious activation/pooling functions
Activation and pooling functions are typically non-linear; we classify them into two categories. The first category is piecewise linear functions, e.g., ReLU: y = max(0, x). Oblivious ReLU is easily computed by a garbled circuit.

23 Oblivious activation/pooling functions
The second category is smooth functions, e.g., sigmoid: y = 1/(1 + e^−x). Oblivious sigmoid: approximate by a piecewise linear function, then compute obliviously by a garbled circuit; empirically, ~14 segments are sufficient. Sigmoid can also be approximated by high-degree polynomials via Taylor series, but that incurs a large performance overhead and accuracy loss. With the piecewise approach, the circuit evaluates every segment, multiplies the result of the segment containing the input by one and the others by zero, and adds the results together.
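The approximation itself (without the cryptography) can be sketched as follows; the equal-width segmentation on [-6, 6] is a hypothetical choice for illustration, since the transcript only reports that ~14 segments suffice, not the exact knot placement:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical segmentation: 14 equal-width segments on [-6, 6].
knots = np.linspace(-6.0, 6.0, 15)   # 15 knots -> 14 segments
values = sigmoid(knots)

def pwl_sigmoid(x):
    """Piecewise-linear sigmoid; clamps to the end values outside [-6, 6]."""
    return np.interp(x, knots, values)

# Even this naive segmentation keeps the absolute error small.
xs = np.linspace(-8.0, 8.0, 1001)
max_err = float(np.max(np.abs(pwl_sigmoid(xs) - sigmoid(xs))))
assert max_err < 0.02
```

Each segment is a linear function, so inside a garbled circuit it reduces to the comparisons (segment selection) and linear arithmetic that 2PC handles efficiently.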

24 Combining the final result
The client and server can jointly calculate max(y1, y2) on their shares, so that only the final prediction is revealed (minimizing information leakage).

25 Recall: use secret sharing for oblivious computation
We have now seen how to compute each layer obliviously.

26 Performance (for single queries)

Model         | Latency (s)   | Msg sizes (MB) | Loss of accuracy
MNIST/Square  | 0.4 (+ 0.88)  | 44 (+ 3.6)     | none
CIFAR-10/ReLU | 472 (+ 72)    | 6226 (+ 3046)  | none
PTB/Sigmoid   | 4.39 (+ 13.9) | 474 (+ 86.7)   | less than 0.5% (cross-entropy loss)

Precomputation-phase timings in parentheses. PTB = Penn Treebank.
Head-to-head with CryptoNets on the same MNIST model with square activation: about 1.3 s per prediction, roughly 300× faster than CryptoNets (700× excluding precomputation). The CIFAR-10 model uses ReLU and has 20 layers; although the cost is high, this is, to our knowledge, the first transformation of such a complex network. The PTB model, trained on a standard language-modeling dataset to predict the next word in a sentence, uses sigmoid; replacing sigmoid with the approximation during prediction increases cross-entropy loss by only about 0.5%.

27 MiniONN: pros and cons
+ ~300× faster than CryptoNets
+ Can transform any given neural network to its oblivious variant
- Still ~1000× slower than without privacy
- Server can no longer filter requests or do sophisticated metering
- Reveals the structure (but not the parameters) of the NN

28 Hardware-security mechanisms are pervasive
Hardware support for:
- Isolated execution: isolated execution environment
- Protected storage: sealing
- Ability to convince remote verifiers: remote attestation
Trusted Execution Environments (TEEs) operate in parallel with “rich execution environments” (REEs). Examples: cryptocards, Trusted Platform Modules, ARM TrustZone, Intel Software Guard Extensions.
[A+14] “Mobile Trusted Computing”, Proceedings of the IEEE, 102(8), 2014
[EKA14] “Untapped potential of trusted execution environments”, IEEE S&P Magazine, 12(4), 2014

29 Using a server-side TEE to run the model
1. Client attests the server's (model-independent) neural network engine running in the TEE
2. Client sends its input
3. Model configuration and filtering policy are provisioned to the engine
4. Server returns the prediction
Compared with MiniONN: + policy filtering, + advanced metering, + performance, + better model secrecy; − harder to reason about client input privacy.

30 ML is very fragile in adversarial settings
MiniONN: efficiently transform any given neural network into oblivious form with no or negligible accuracy loss. Trusted computing can help realize improved security and privacy for ML. But ML is very fragile in adversarial settings.
[LJLA17] MiniONN, ACM CCS 2017; [KNLAS19] Private Decision Trees, PETS 2019
@nasokan

31 Securing cloud-assisted services
Cloud-assisted services raise new security/privacy concerns, but naïve solutions may conflict with privacy, usability, deployability, ... Solutions using trusted hardware + cryptography can balance security, usability, and deployability/cost.
[TLPEPA17] Circle Game, ACM ASIACCS 2017; [LJLA17] MiniONN, ACM CCS 2017; [KNLAS19] Private Decision Trees, PETS 2019
@nasokan

32 Supplementary material

33 Background: Kinibi on ARM TrustZone
Kinibi: trusted OS from Trustonic. TrustZone hardware extensions split the device into an untrusted Rich Execution Environment (REE), where client apps run on the rich OS, and a Trusted Execution Environment (TEE), where trusted apps run on Kinibi. Remote attestation establishes a trusted channel. Private on-chip memory provides confidentiality, integrity, and obliviousness against an adversary who can observe shared memory.

34 Background: Intel SGX
CPU-enforced TEE (an enclave) inside a user process, with remote attestation. Secure memory, the Enclave Page Cache, is encrypted and integrity-protected, providing confidentiality and integrity; obliviousness holds only at 4 KB page granularity. The adversary can observe the untrusted OS, system memory, and the REE part of the process (app code and data) in the physical address space, but not enclave code and data.

35 Android app landscape
On average a user installs 95 apps (Yahoo Aviate study). Unique new Android malware samples, 2015 vs. 2018 (source: G Data). Current dictionary size < 2^24 entries. Even a comparatively “high” false-positive rate (e.g., ~2^-10) may have negligible impact on privacy.

36 Cloud-scale PMT (private membership test)
Verify Apps: a cloud-based service to check for harmful Android apps prior to installation. “... over 1 billion devices protected by Google’s security services, and over 400 million device security scans were conducted per day” (Android Security 2015 Year in Review). “2 billion+ Android devices checked per day” (cf. < 17 million malware samples).

37 Performance: batch queries
Kinibi on ARM TrustZone Intel SGX

