RHMD: Evasion-Resilient Hardware Malware Detectors

Presentation transcript:

RHMD: Evasion-Resilient Hardware Malware Detectors
Khaled N. Khasawneh*, Nael Abu-Ghazaleh*, Dmitry Ponomarev**, Lei Yu**
* University of California, Riverside; ** Binghamton University
MICRO 2017 – Boston, USA, October 2017

Malware is Everywhere!
- Over 250,000 malware samples registered every day!

Traditional Software Malware Detection
Static malware detection:
- Searches for signatures in the executable
- Can detect all known malware with no false alarms
- Can be evaded by new malware and polymorphic malware
Dynamic malware detection:
- Monitors the behavior of the program
- Can detect unknown malware
- Very high overhead, limiting use in practice

Hardware Malware Detectors (HMDs)
- Use machine learning: detect malware as a computational anomaly
- Use low-level features collected from the hardware
- Can be always-on without adding performance overhead
- Many research papers, including ISCA’13, HPCA’15 and MICRO’16

Paper Contributions
Can malware evade HMDs?
- Reverse-engineer HMDs
- Develop evasive malware
- Evade detection after re-training
If yes, can we make HMDs robust to evasion? Yes, using RHMDs:
1- Provably harder to reverse-engineer
2- Robust to evasion

Reverse Engineering

How to Reverse Engineer HMDs?
Challenges:
- We don’t know the detection period
- We don’t know the features used
- We don’t know the detection algorithm
Approach:
- Train different classifiers
- Derive specific parameters as an optimization problem

Reverse Engineering HMDs
- The attacker feeds its own training data to the victim HMD
- The victim HMD's black-box output (10100 in the figure) supplies the labels
- A model is trained on this data and these labels
- The trained model is the reverse-engineered HMD
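
As a rough illustration of the black-box flow above, the Python sketch below queries a stand-in victim detector, uses its 0/1 outputs as labels, and fits a logistic-regression surrogate. Everything in it is an assumption made for illustration: `victim_hmd`, its hidden weights, the three synthetic features, and the training settings are hypothetical and are not taken from the presentation.

```python
import math
import random

def victim_hmd(x):
    """Hypothetical black-box detector: the attacker sees only the 0/1 output."""
    hidden_w = [1.5, -2.0, 0.5]  # assumed weights, unknown to the attacker
    return 1 if sum(w * v for w, v in zip(hidden_w, x)) > 0.0 else 0

def train_surrogate(data, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression surrogate with stochastic gradient descent."""
    theta = [0.0] * len(data[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            z = bias + sum(t * v for t, v in zip(theta, x))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            theta = [t - lr * err * v for t, v in zip(theta, x)]
            bias -= lr * err
    return theta, bias

def predict(theta, bias, x):
    """Surrogate decision: 1 (malware) if the linear score is positive."""
    return 1 if bias + sum(t * v for t, v in zip(theta, x)) > 0.0 else 0

rng = random.Random(0)

def sample(n):
    """Synthetic attacker data: n feature vectors with 3 features each."""
    return [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]

train_x, test_x = sample(400), sample(200)
train_y = [victim_hmd(x) for x in train_x]  # labels come from the black box
theta, bias = train_surrogate(train_x, train_y)
agreement = sum(predict(theta, bias, x) == victim_hmd(x) for x in test_x) / len(test_x)
print(f"surrogate agrees with victim on {agreement:.0%} of held-out samples")
```

The agreement rate on held-out samples is one plausible way to judge how closely the surrogate mimics the victim; the presentation's own evaluation metric may differ.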

We Can Guess Detector Parameters!
Victim HMD parameters:
- 10K detection period
- Instruction feature vector
[Figures: guessing the detection period; guessing the feature vector]
Legend: LR: Logistic Regression; DT: Decision Tree; SVM: Support Vector Machines

Reverse Engineering Effectiveness
[Figure panels: Logistic Regression; Neural Networks]
- Current generation of HMDs can be reverse engineered

Evading HMDs

How to Create Evasive Malware?
Challenges:
- We don’t have malware source code
- We can’t decompile malware because it is obfuscated
Our approach:
- PIN
- Dynamic Control Flow Graph

What Should We Add to Evade?
Logistic Regression (LR):
- LR is defined by a weight vector θ
- Add instructions whose weights are negative
Neural Network (NN):
- Collapse the description of the NN into a single vector
- Add instructions whose weights are negative
Current generation of HMDs is vulnerable to evasion attacks!
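
The negative-weight idea for a linear detector can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the weight vector, bias, zero decision threshold, and the use of raw per-window instruction counts as features are all hypothetical, and `lr_score` and `evade` are illustrative names. A real detector's features may be normalized, in which case injected instructions would shift every feature and not just one.

```python
def lr_score(theta, bias, x):
    """Linear score of a logistic-regression detector; positive means 'malware'."""
    return bias + sum(t * v for t, v in zip(theta, x))

def evade(theta, bias, counts, max_injected=1000):
    """Inject the most negatively weighted instruction until the score is non-positive.

    Returns the modified count vector and the number of injected instructions.
    """
    x = list(counts)
    idx = min(range(len(theta)), key=lambda i: theta[i])
    if theta[idx] >= 0:
        return x, 0  # no negatively weighted instruction to exploit
    injected = 0
    while lr_score(theta, bias, x) > 0 and injected < max_injected:
        x[idx] += 1  # one more occurrence of the chosen instruction type
        injected += 1
    return x, injected

# Assumed reverse-engineered model and one malware window (hypothetical values).
theta = [0.8, -0.5, 0.3]
bias = -1.0
malware_counts = [5, 1, 2]

evasive_counts, injected = evade(theta, bias, malware_counts)
print(evasive_counts, injected, lr_score(theta, bias, evasive_counts))
```

Only the feature with the most negative weight is increased here, which keeps the number of injected instructions small for this toy model; the original malware instructions are left untouched.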

Does re-training Help?

Can We Retrain with Samples of Evasive Malware?
[Figure panels: Linear Model (Logistic Regression); Non-Linear Model (Neural Network)]

Explaining Retraining Performance Linear Model (LR)

Explaining Retraining Performance Non-Linear Model (NN)

What if We Keep Retraining?
- Re-training is not a general solution

Can we Build Detectors that Resist Evasion?

Overview of RHMDs
- An RHMD holds a pool of diverse HMDs: HMD 1, HMD 2, ..., HMD n
- A selector sits between the input and the pool and determines which HMD produces the output
- Input: a stream of feature vectors, one per detection period (a number of committed instructions)
- Diversify by different:
  1- Features
  2- Detection periods
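
One way to picture the pool-plus-selector structure is the Python sketch below. It assumes the selector picks a base detector uniformly at random for each window and that each base detector brings its own detection period and watched feature set; the opcode names, thresholds, and the helpers `make_detector` and `rhmd_run` are hypothetical and are not the presentation's implementation.

```python
import random
from collections import Counter

def make_detector(period, watched, limit):
    """Hypothetical base HMD: flags a window when the fraction of watched
    opcodes in it exceeds `limit`. Returns (detection period, classifier)."""
    def detect(window):
        counts = Counter(window)
        frac = sum(counts[op] for op in watched) / len(window)
        return 1 if frac > limit else 0
    return period, detect

def rhmd_run(trace, pool, seed=None):
    """Walk the committed-instruction trace; for each window, randomly select
    one detector from the pool and classify that detector's period of it."""
    rng = random.Random(seed)
    pos, decisions = 0, []
    while pos < len(trace):
        period, detect = rng.choice(pool)     # the selector
        window = trace[pos:pos + period]      # last window may be shorter
        decisions.append(detect(window))
        pos += period
    return decisions

# Toy trace and a pool diversified by features and by detection period.
trace = ["xor", "mov", "add", "jmp"] * 50
pool = [
    make_detector(10, {"xor"}, 0.2),
    make_detector(20, {"jmp", "xor"}, 0.4),
]
decisions = rhmd_run(trace, pool, seed=1)
print(decisions)
```

Because the attacker cannot tell which detector or period produced a given decision, the labels it collects mix several models, which is the intuition behind the harder reverse engineering claimed for RHMDs.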

Reverse Engineering RHMDs
- Randomizing the features [Figure panels: (a) two feature vectors; (b) three feature vectors]
- Randomizing the features and detection period [Figure panels: (a) two feature vectors and two periods; (b) three feature vectors and two periods]

RHMD is Resilient to Evasion

Hardware Overhead
FPGA prototype on open core (AO486), RHMD with three detectors:
- Area increase: 1.72%
- Power increase: 0.78%

Conclusion
- Current generation of HMDs is vulnerable to evasion
- Developed a methodology to reverse-engineer and evade detectors
- Explored re-training HMDs: benefit is limited
- Developed a new class of Evasion-Resilient HMDs: robust to evasion, low overhead

Thank you! Questions?
RAID 2015 – Kyoto, Japan, November 2015

Can’t Just Randomly Add Instructions

Evasion Overhead