Stealing DNN models: Attacks and Defenses
Mika Juuti, Sebastian Szyller, Alexey Dmitrenko, Samuel Marchal, N. Asokan
Background
Machine learning is increasingly popular: it gives companies a business advantage. Prediction APIs give clients black-box access to the model and automate tedious decision-making.
An attacker wants to compromise:
- Model confidentiality ~ model extraction: does the stolen model agree with the target model (classification results)?
- Model integrity (prediction quality) ~ transferable adversarial examples: are adversarial examples crafted on the stolen model also adversarial examples on the target model?
[Figure: a prediction service provider exposes the target ML model through an API; a client queries it (e.g. "speed limit 80 km/h"); the attacker uses the same API to build a stolen model.]
[1] Tramer et al. Stealing ML models via prediction APIs. USENIX Security '16.
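The two attacker goals above reduce to two measurable quantities. Below is a minimal sketch of how one might compute them, assuming `target_predict` and `stolen_predict` are callables (names chosen for this illustration) that map a batch of inputs to predicted class labels; the target model is only reachable through the prediction API.

```python
import numpy as np

def agreement(target_predict, stolen_predict, x_test):
    """Fraction of test inputs on which the stolen model reproduces the
    target model's classification (model-confidentiality metric)."""
    return np.mean(target_predict(x_test) == stolen_predict(x_test))

def targeted_transferability(target_predict, x_adv, y_target):
    """Fraction of adversarial examples, crafted on the stolen model to be
    classified as y_target, that the *target* model also classifies as
    y_target (model-integrity metric)."""
    return np.mean(target_predict(x_adv) == y_target)
```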
Generic model extraction with synthetic samples
1. Select model hyperparameters -- hyperparameters, layers, ... [1, 2]
2. Initial data collection -- unlabeled seed samples for the attack
Duplication rounds [3]:
3. Query API for predictions -- get labels / classification probabilities
4. Train attacker model ("stolen model") -- update model
5. Generate new queries -- probe decision boundaries with synthetic samples
6. Terminate -- query budget exceeded
[1] Oh et al. Towards Reverse Engineering Black-box Neural Networks. ICLR '18.
[2] Wang and Gong. Stealing Hyperparameters in Machine Learning. IEEE S&P '18.
[3] Papernot et al. Practical black-box attacks against machine learning. AsiaCCS '17.
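A minimal sketch of the duplication-round loop above. All helper names (`query_api`, `fit_substitute`, `synthesize`) are placeholders introduced for this illustration, not an API from the cited papers.

```python
def extract_model(seed_samples, query_api, fit_substitute, synthesize,
                  query_budget):
    """Generic extraction loop: query, retrain, synthesize, repeat."""
    pool_x, pool_y = [], []
    queries = list(seed_samples)          # initial unlabeled seed data
    used = 0
    substitute = None
    while used + len(queries) <= query_budget:
        labels = query_api(queries)       # get labels / probabilities
        pool_x.extend(queries)
        pool_y.extend(labels)
        used += len(queries)
        substitute = fit_substitute(pool_x, pool_y)  # update stolen model
        queries = synthesize(substitute, pool_x)     # new boundary probes
    return substitute                     # budget exceeded: terminate
```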
Datasets
MNIST: B&W digits
- Server trained with 55,000 images
- 4 layers (2 conv + 2 dense)
- Seed samples: 10 ~ 500 images
- 10 classes
GTSRB: traffic sign recognition
- Server trained with 39,000 images
- 5 layers (2 conv + 3 dense)
- Seed samples: 43 ~ 2150 images
- 8 macro-classes (listed in the extra slides)
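For concreteness, here is a Keras sketch of the MNIST target model's shape (2 conv + 2 dense layers, 10 classes). Only the layer layout and class count come from the slide; filter counts, kernel sizes, pooling, and the optimizer are assumptions.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

mnist_target = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),   # conv layer 1 (assumed width)
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),   # conv layer 2 (assumed width)
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),      # dense layer 1 (assumed width)
    layers.Dense(10, activation="softmax"),    # dense layer 2: 10 digit classes
])
mnist_target.compile(optimizer="adam",
                     loss="sparse_categorical_crossentropy",
                     metrics=["accuracy"])
```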
Comparative evaluation with state-of-the-art
- Better targeted transferability: Jb-topk targets a specific class.
- Better agreement: fully training substitute models is crucial.
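The Jb-topk idea named above, targeting a specific class when generating synthetic queries, can be sketched as a Jacobian-based perturbation that steps toward a chosen target class. The signed-gradient update and step size below follow Papernot-style Jacobian augmentation and are an illustration under those assumptions, not the exact Jb-topk rule.

```python
import tensorflow as tf

def targeted_synthetic(substitute, x, target_class, lmbda=0.1):
    """Perturb samples x so the substitute's score for target_class rises,
    yielding synthetic queries that probe the boundary toward that class."""
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        score = substitute(x)[:, target_class]   # probability of target class
    grad = tape.gradient(score, x)
    return x + lmbda * tf.sign(grad)              # step toward target class
```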
Take-aways
- Agreement and transferability do not go hand-in-hand.
- Probabilities improve transferability.
- Dropout may help on complex data.
- Synthetic samples with probabilities boost transferability.
- Thinner, deeper networks are more resilient to transferable adversarial examples.
Attacks: common characteristics
- Seed samples ~ novelty in queries: establish the initial decision boundaries.
- Synthetic samples ~ similar to existing samples: refine the boundaries.
- Idea: study the distribution of queries to detect model extraction attacks.
Defence: PRADA (Protection Against DNN Model Extraction Attacks)
Stateful defense: detects lack of novelty in client queries.
- Lazy clustering: keeps track of submitted queries and adds "novel" samples to a growing set.
- Parameters:
  - W: window size – speed of detection
  - Δ: change in growth rate – recall
- Idea of detection: a kink in the growth curve of the novel-sample set (sketched below).
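A hedged sketch of the detection idea as described on this slide: keep a growing set of "novel" queries, measure how fast it grows per window of W queries, and flag a client when the growth rate drops sharply (the kink). The L2-distance novelty criterion and the exact kink test below are assumptions filling in details the slide does not give; the full PRADA criterion in the paper may differ.

```python
import numpy as np

class NoveltyDetector:
    """Per-client detector: growth-rate kink in the set of novel queries."""

    def __init__(self, window=100, delta=0.5, novelty_threshold=1.0):
        self.window = window        # W: window size (speed of detection)
        self.delta = delta          # Δ: tolerated drop in growth rate (recall)
        self.threshold = novelty_threshold  # assumed L2 novelty threshold
        self.novel = []             # growing set of novel queries
        self.growth = []            # novel samples added per window
        self._added_this_window = 0
        self._seen = 0

    def observe(self, query):
        """Process one client query; return True if an extraction alarm fires."""
        q = np.asarray(query, dtype=float).ravel()
        if not self.novel or min(np.linalg.norm(q - n) for n in self.novel) > self.threshold:
            self.novel.append(q)
            self._added_this_window += 1
        self._seen += 1
        if self._seen % self.window == 0:       # end of a window of W queries
            self.growth.append(self._added_this_window)
            self._added_this_window = 0
            if len(self.growth) >= 2 and self.growth[-2] > 0:
                if self.growth[-1] / self.growth[-2] < self.delta:
                    return True                 # growth-rate kink: alarm
        return False
```

A benign client drawing natural samples keeps adding novel queries at a steady rate, so no kink appears; an extraction attack that switches from seed samples to look-alike synthetic samples stops contributing novel queries and triggers the alarm.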
Detection efficiency
- All known model extraction attacks are detected.
- Detection is slowest on the Tramer attack [1] ~ an inefficient attack that requires >500k queries to succeed (conservative estimate).
[1] Tramer et al. Stealing ML models via prediction APIs. USENIX Security '16.
[2] Papernot et al. Practical black-box attacks against machine learning. AsiaCCS '17.
False positives
- Tested with randomly sampled data from different distributions:
  - Traffic signs: German / Belgian signs
  - Digits: MNIST / USPS
- No false positives.
- PRADA relies on the relative data distribution, i.e. on client behavior.
Summary
- Model extraction is a serious threat to model owners.
- We propose a new attack that outperforms the state-of-the-art.
- Take-aways for model extraction attacks & transferability.
- PRADA detects all known model extraction attacks.
Paper: https://arxiv.org/abs/1805.02628
Extra
Structured tests
- Exp 1: What is the impact of natural seed samples? The attacker uses only seed samples.
- Exp 2: What advantage do synthetic samples give? Continue the attack with synthetic samples; compare our new attack with existing ones.
- Exp 3: What effect does model complexity have on transferability? Compare (1) the number of layers and (2) the number of parameters.
1. Impact of using only natural seed samples: MNIST
[Figure note: the number of adversarial examples tested is always 90 for x ≥ 100 and always 70 for x ≥ 430.]
- Agreement and transferability do not go hand-in-hand.
- Probabilities improve transferability.
- Dropout may help on complex data.
2. Benefits of using synthetic samples
- (4) Synthetic samples with probabilities boost transferability.
(untargeted transferability results in the extra slides)
3. Effect of model complexity
- (5) Thinner, deeper networks are more resilient to transferable adversarial examples.
[Figure: MNIST transferability vs. model complexity]
GTSRB macro-classes
The 43 GTSRB sign classes are grouped into 8 macro-classes: RW (red-white) circle, gray circle, warning, priority, yield, stop, no entry, and blue circle.
[Table: mapping of individual sign classes (no passing, right-of-way, priority road, yield, stop, no vehicles, no entry, dangerous curves, road work, pedestrians, children, bicycles, ice, wildlife, end of limits, turn/keep directions, roundabout, end of no passing, ...) to these macro-classes.]
1. Impact of using only natural seed samples: MNIST
1. Impact of using only natural seed samples: GTSRB
2. Synthetic samples: untargeted transferability
Detection efficiency – sequential data from client
- Sequential data is co-dependent, so good parameters are more conservative.
- All known model extraction attacks are detected.
- Detection at the first window overlap: seed samples + window size (50).
- No difference on the Tramer attack [1]: it requires >500k queries to succeed (conservative estimate).
[1] Tramer et al. Stealing ML models via prediction APIs. USENIX Security '16.
[2] Papernot et al. Practical black-box attacks against machine learning. AsiaCCS '17.