Deep Neural Network Based Malware Detection Using Two Dimensional Binary Program Features
Joshua Saxe, Dr. Konstantin Berlin
Invincea Labs
October 20-23rd, 2015

Motivation and research question
- On average, anti-virus sensors have a 40% chance of missing zero-day malware (according to a 2014 study performed by Lastline/UCSB)
- The data seem to suggest that it takes almost a year before anti-virus sensors start detecting the hardest-to-detect zero-day malware
- Deep learning holds the promise of providing an orthogonal detection methodology that can significantly increase our overall detection rate
- Deep learning has produced big breakthroughs in computer science problem areas recently: could this extend to malware detection?

Our “deep learning” neural network approach
- Goal: Exploit recent breakthroughs in deep neural networks to achieve breakthrough results in malware detection

Previous Work
- Our overall approach is not new: machine learning based malware detection is two decades old
- What is new is our attempt to exploit recent developments in machine learning that have produced breakthroughs on other problems (e.g., object recognition)
- Specifically: new neural network activation functions, new optimization approaches, GPU-based training, new dimensionality reduction

Our approach

What are neural networks?
- A set of inputs
- A set of nonlinear transforms applied to those inputs
- A set of outputs
- This simple setup can approximate any function, given the right parameters
Figure: Learned decision boundary
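The inputs / nonlinear transforms / outputs description can be made concrete with a minimal sketch. The slides do not name the activation functions or layer sizes used, so the ReLU hidden layer, sigmoid output, and hand-set weights below are assumptions chosen only for illustration; the example network computes XOR, a decision boundary that no single linear transform can produce.

```python
import math


def relu(values):
    """Nonlinear transform: clamp negative values to zero (assumed activation)."""
    return [max(0.0, v) for v in values]


def sigmoid(x):
    """Squash a real number into (0, 1) so it can be read as a score."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One affine layer: each output is a weighted sum of the inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Run inputs through the network.

    `layers` is a list of (weights, biases) pairs. Every layer except the
    last is followed by ReLU; the last is followed by a sigmoid.
    """
    activations = list(inputs)
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return [sigmoid(z) for z in dense(activations, weights, biases)]


# Hand-set (not learned) parameters, purely illustrative.
xor_layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer: 2 units
    ([[10.0, -20.0]], [-5.0]),                # output layer: 1 unit
]

for point in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(point, round(forward(point, xor_layers)[0], 3))
```

In practice the parameters are learned from labeled examples rather than set by hand; the sketch only shows the forward computation.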

Our neural network architecture

Contextual byte histogram features: a key component of our feature space
- Feature extraction algorithm:
  - Slide a 1024-byte window over the target binary, taking 256-byte steps
  - Compute the entropy of each 1024-byte window
  - For each byte in the window, store a tuple (byte, entropy)
  - Create a 2D histogram with byte values on one axis and entropy on the other axis
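The feature extraction steps can be sketched roughly as follows. The window size (1024) and step (256) come from the slide; the number of histogram bins on each axis (16 by 16), the use of Shannon entropy in bits, and the function names are assumptions, since the slide does not specify them.

```python
import math
from collections import Counter


def window_entropy(window):
    """Shannon entropy of a byte window, in bits (between 0.0 and 8.0)."""
    if not window:
        return 0.0
    total = len(window)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(window).values())


def byte_entropy_histogram(data, window=1024, step=256,
                           entropy_bins=16, byte_bins=16):
    """Build a 2D histogram indexed as hist[entropy_bin][byte_bin].

    Bin counts are assumed values, not taken from the slides.
    """
    hist = [[0] * byte_bins for _ in range(entropy_bins)]
    last_start = max(len(data) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = data[start:start + window]
        if not chunk:
            break
        entropy = window_entropy(chunk)
        # Map entropy in [0, 8] onto a bin; 8.0 exactly goes in the top bin.
        e_bin = min(int(entropy / 8.0 * entropy_bins), entropy_bins - 1)
        # Every byte in the window contributes one (byte, entropy) pair.
        for byte in chunk:
            hist[e_bin][byte * byte_bins // 256] += 1
    return hist


# A window of identical bytes has zero entropy; a window in which every
# byte value is equally frequent has the maximum of 8 bits.
low = byte_entropy_histogram(bytes(1024))
high = byte_entropy_histogram(bytes(range(256)) * 4)
print(low[0][0], high[15][0])
```

Because the histogram records only which byte values occur at which entropy levels, not where they occur, moving a block of bytes within the file changes the features very little. Flattening the rows gives a fixed-length vector regardless of file size.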

Byte/entropy histograms: a key component of our feature space

Byte/entropy representation of a binary file (benign in this example)

Byte/entropy representation of a binary file with a simulated component added

Findings

In-lab accuracy evaluation on 400k files
Figure: ROC curve zoomed to low FPR range
- We can detect about 75% of malware samples our neural network has never seen before at a 0.01% false positive rate
- We can detect 95% of malware samples that our neural network has never seen before at a 0.1% false positive rate
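Figures such as "95% detection at a 0.1% false positive rate" correspond to a single operating point on the ROC curve. A minimal sketch of how such a point could be computed from classifier scores is shown below; the function name, the "flag when score is strictly above the threshold" convention, and the toy scores are assumptions, not the authors' evaluation code.

```python
def detection_rate_at_fpr(benign_scores, malware_scores, target_fpr):
    """Return (threshold, detection_rate) at a false positive budget.

    A file is flagged when its score is strictly greater than the
    threshold. The threshold is set so that at most
    floor(target_fpr * number_of_benign_files) benign files are flagged.
    """
    if not benign_scores or not malware_scores:
        raise ValueError("both score lists must be non-empty")
    benign_sorted = sorted(benign_scores, reverse=True)
    allowed_fp = int(target_fpr * len(benign_sorted))
    if allowed_fp < len(benign_sorted):
        # At most `allowed_fp` benign scores lie strictly above this value.
        threshold = benign_sorted[allowed_fp]
    else:
        threshold = float("-inf")
    detected = sum(1 for score in malware_scores if score > threshold)
    return threshold, detected / len(malware_scores)


# Toy scores, invented for illustration only.
benign = [0.1, 0.2, 0.3, 0.9]
malware = [0.95, 0.5, 0.25, 0.05]
print(detection_rate_at_fpr(benign, malware, 0.25))
print(detection_rate_at_fpr(benign, malware, 0.0))
```

Measuring very low false positive rates requires many benign samples: at 0.01%, a test set of 10,000 benign files allows only a single false positive.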

Simulating concept drift
- On this test we train on malware with compile timestamps between 2000 and July 31st, 2014
- Then we evaluate our ability to detect malware received in our lab over the last year
- The results, as you’d expect, are noticeably worse, but still pretty good!
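A time-split evaluation of this kind can be sketched as below. The July 31, 2014 cutoff is taken from the slide; the January 1, 2000 start date, the single timestamp per sample (the slide distinguishes compile time from receipt time), the tuple layout, and the file names are simplifying assumptions made for illustration.

```python
from datetime import date


def time_split(samples, train_start, train_end):
    """Split (sample_id, timestamp, label) tuples by time.

    Samples dated within [train_start, train_end] go to the training set,
    samples dated after train_end go to the test set, and samples dated
    before train_start are dropped.
    """
    train, test = [], []
    for sample in samples:
        _, timestamp, _ = sample
        if train_start <= timestamp <= train_end:
            train.append(sample)
        elif timestamp > train_end:
            test.append(sample)
    return train, test


# Hypothetical samples: (file name, timestamp, label) with 1 = malware.
samples = [
    ("a.exe", date(2009, 3, 14), 1),
    ("b.exe", date(2014, 7, 31), 0),
    ("c.exe", date(2014, 8, 1), 1),
    ("d.exe", date(2015, 2, 20), 1),
    ("e.exe", date(1998, 1, 1), 0),
]
train, test = time_split(samples, date(2000, 1, 1), date(2014, 7, 31))
print([s[0] for s in train], [s[0] for s in test])
```

Unlike a random split, this keeps every test sample strictly newer than the training data, so the measured accuracy reflects how the model handles malware that appeared after it was trained.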

Measuring the positive impact of more training data

Product integration and results in live settings
- Our detection model has been integrated into the upcoming release of our product and is currently under testing on customer networks
- 60% detection rate on new malware as false positives converge on 0 (in contrast to anti-virus engines’ 40%)
- 95% detection rate on new malware as false positives approach five per day
- Test performed on a feed of new binaries obtained from multiple customer networks comprising on the order of thousands of individual machines

Impact, summary and conclusions
- Deep learning methods yield state-of-the-art results on the static malware detection problem
- Our novel insertion/reordering invariant feature representation for static binaries yields improved static detection results
- Our time-splitting evaluation reveals malware “concept drift” and is an important evaluation that should be built upon by other malware detection researchers

Remaining Questions
- What happens when we train on more samples?
- What happens when we mix in behavioral analysis?
- Could sequence-oriented deep learning models (recurrent neural networks) improve our results over feed-forward networks?
- How would our results compare to traditional AV systems in head-to-head comparisons?

Questions / comments?
Joshua Saxe
Senior Principal Research Engineer
Invincea Labs