
Malware-Aware Processors: A Framework for Efficient Online Malware Detection
Meltem Ozsoy*, Caleb Donovick*, Iakov Gorelik*, Nael Abu-Ghazaleh** and Dmitry Ponomarev*
* Binghamton University, ** University of California, Riverside
HPCA 2015 - San Francisco, CA

Malware Growth
- Anti-virus software
- OS-level defenses
- Execution monitoring
Source: AV-Test Malware Statistics, 2014 (http://www.av-test.org/en/statistics/malware/)

What This Work is All About
- Comprehensive execution monitors are too heavy-weight to be always on (performance loss)
- Low-level indicators were shown to be effective for classifying malware
  - Demme et al. (ISCA 2013) proposed offline detection using performance counters
- Our contribution: online detection in hardware
- Hardware classifiers are not perfect, thus:
  - Two Level Detection Framework: use the hardware-based detector to prioritize the work of the heavy-weight software detector
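The prioritization idea can be sketched in a few lines. This is only an illustration, not the authors' implementation: the suspicion scores, the `software_detector` callable and the `budget` parameter are all hypothetical names introduced here.

```python
import heapq

def prioritize(hw_scores):
    """Return process ids ordered from most to least suspicious.

    hw_scores maps a process id to a hypothetical hardware suspicion
    score in [0, 1]; higher means the hardware detector flagged it more.
    """
    # heapq is a min-heap, so negate scores to pop the largest first.
    heap = [(-score, pid) for pid, score in hw_scores.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, pid = heapq.heappop(heap)
        order.append(pid)
    return order

def two_level_scan(hw_scores, software_detector, budget):
    """Run the expensive software detector on the `budget` most suspicious processes."""
    flagged = []
    for pid in prioritize(hw_scores)[:budget]:
        if software_detector(pid):
            flagged.append(pid)
    return flagged
```

The point of the design is that the heavy-weight detector only ever sees the head of the queue, so its cost is bounded by `budget` rather than by the number of running programs.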

Two Level Detection Framework

Malware Detection: Static Analysis
- Study program without execution
- Signature generation with byte/instruction sequences
- Using source code, CFG generation
Limitations of Static Analysis
- Requires source code, disassembly
- Metamorphic malware (self-modifying code)
- Polymorphic (encrypted) malware
- Non-deterministic inputs can change program flow

Malware Detection: Dynamic Analysis
- System calls, function parameters, API calls, created processes/threads, etc. are monitored
- Expensive; uses a VM or emulator
Limitations of Dynamic Analysis
- Only effective against analyzed malware
- Advanced Persistent Threats (APTs) can bypass with zero-day exploits

Execution Monitoring
- System call forwarding: Proxos (OSDI’06)
- VM introspection, isolated monitoring: Livewire (NDSS’03), Virtuoso (IEEE Security & Privacy’11)
- Reference monitoring: PinOS (ACM VEE’07), Kernel DBT (ASPLOS’12)
Diagram labels: Application, Modified Application, EM, VM, Kernel

Malware Detection at Low Level: Sub-semantic Monitoring
- Low-level indicators of a program, such as performance counters (Demme et al., ISCA’13), are monitored
Limitations
- Detection is after the fact
- Not real-time
- Features are limited to available performance counters

Our Proposal: MAP
Malware Aware Processor (MAP)
- Use hardware for sub-semantic detection
- Train a simple machine learning algorithm
- Periodic checks during execution
- Perform online detection using time series analysis in hardware
- High-overhead software analysis activated only for suspicious programs (Two Level Detection)

MAP Design Overview
- Collect sub-semantic features
- Have a simple machine learning engine
- Check executing program in real-time
Diagram blocks: Instruction Fetch, Instruction Cache, Branch Prediction, Rename/Decode, Issue, Physical Register File, Functional Units, ROB & Architectural Register File, Exception Unit, MMU, Data Cache, MAP

Sub-Semantic Feature Space
Architectural
- ARCH: Frequency of memory reads/writes, taken & immediate branches, and unaligned memory accesses
Memory Address
- MEM1: Frequency of memory address distance histogram
- MEM2: Memory address distance histogram mix
Instruction
- INS1: Frequency of instruction categories
- INS2: Difference between two most frequent opcodes
- INS3: Existence of categories
- INS4: Existence of opcodes
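To make two of these features concrete, here is a minimal sketch of how INS1-style and MEM1-style vectors could be computed from a trace. The category names, the bin edges, and the choice to measure distance between consecutive addresses are assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter

# Hypothetical instruction categories; the slide does not list them.
CATEGORIES = ("arith", "branch", "load", "store", "other")

def ins1(categories_seen):
    """INS1-style vector: relative frequency of each instruction category."""
    counts = Counter(categories_seen)
    total = len(categories_seen)
    if total == 0:
        return [0.0] * len(CATEGORIES)
    return [counts[c] / total for c in CATEGORIES]

def mem1(addresses, bin_edges=(16, 256, 4096)):
    """MEM1-style vector: normalized histogram of distances between
    consecutive memory addresses. The bin edges are assumed values."""
    hist = [0] * (len(bin_edges) + 1)
    for prev, cur in zip(addresses, addresses[1:]):
        dist = abs(cur - prev)
        # The bin index equals the number of edges the distance reaches or exceeds.
        idx = sum(dist >= edge for edge in bin_edges)
        hist[idx] += 1
    n = sum(hist)
    return [h / n for h in hist] if n else [0.0] * len(hist)
```

Both functions return fixed-length vectors, which is what a classifier with a fixed set of weights needs as input.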

Machine Learning Algorithms
Logistic Regression
- Hypothesis function (a*x1 + b*x2 + ... + c) is trained to determine the weights (a, b, c)
- Sigmoid function maps the hypothesis function to a value between 0 and 1
Neural Network (multi-layer perceptron)
- One hypothesis function trained for each layer
- Translation function is tanh
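The two inference steps can be written out as a short sketch. It shows only the forward pass (prediction), not training, and the function names and the `layers` data layout are illustrative choices rather than anything taken from the talk.

```python
import math

def sigmoid(z):
    """Map any real z to (0, 1); branch on sign to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def lr_predict(weights, bias, x):
    """Logistic regression: sigmoid of the linear hypothesis a*x1 + b*x2 + ... + c."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def mlp_predict(layers, x):
    """Multi-layer perceptron forward pass with tanh activations.

    layers is a list of (weight_matrix, bias_vector) pairs; each row of
    weight_matrix holds the weights of one neuron.
    """
    activations = x
    for weight_matrix, bias_vector in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weight_matrix, bias_vector)
        ]
    return activations
```

A logistic regression output above 0.5 would typically be read as the positive class (here, malware); the exact threshold is a tuning choice.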

Data Set & Data Collection
Columns: Family, Train, Test, Val, Extended, Total (rows with fewer than five values are incomplete in this transcript)
Vundo: 14, 2, 5, 21, 42
Emerleox: 10, 4, 33, 52
Virut: 8, 3, 7, 46, 64
Sality: 12
Ejik: 6, 101, 118
Looper: 145, 164
AdRotator: 1, 119, 136
PornDialer: 11, 196, 217
Boaxxe: 13, 211, 230
Total (all families): 99, 34, 36, 918, 1087
- 32-bit Windows 7 on VirtualBox
- Windows Security Services disabled
- Features collected through PIN during execution of malware
- University of Mannheim dataset
- Offensive Computing
- VirusTotal

Selecting Features for Classification
- Offline detection performance
- Low hardware implementation complexity
- Used for hardware implementation

Key Aspects of MAP Operation
- Machine learning model trained at design time
- Weights for the model are loaded into MAP hardware
- While the program executes, MAP hardware collects features at the instruction commit stage
- For every 10K committed instructions, a binary decision (malware/regular) is made
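In software terms, the per-window decision loop might look like the sketch below. The opcode-count feature and the `classify` callable are stand-ins chosen for illustration; in MAP the equivalent logic is hardware attached to the commit stage.

```python
WINDOW = 10_000  # committed instructions per decision, from the slide

def window_decisions(opcode_stream, classify, window=WINDOW):
    """Yield one binary decision (True = malware) per full window.

    opcode_stream: iterable of committed opcodes (any hashable values).
    classify: hypothetical callable mapping a {opcode: count} dict to bool,
              standing in for the trained model loaded into MAP hardware.
    """
    counts = {}
    seen = 0
    for opcode in opcode_stream:
        counts[opcode] = counts.get(opcode, 0) + 1
        seen += 1
        if seen == window:
            yield classify(counts)
            counts = {}  # reset the feature counters for the next window
            seen = 0
```

A trailing partial window produces no decision in this sketch, which mirrors the idea that a decision is made only once a full period of instructions has committed.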

MAP Online Detection
- Periodic binary signals created for every 10K instructions during execution
- Exponentially Weighted Moving Average (EWMA) is used for filtering out occasional false positives/negatives
- Additional optimizations for efficient hardware implementation
  - Fixed-point representation
  - Sliding window of signals
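A minimal sketch of EWMA filtering over the binary decisions, using integer fixed-point arithmetic, is given below. The precision, the smoothing factor (a power of two so it reduces to a shift) and the alarm threshold are all assumed values, not parameters reported in the talk.

```python
FRAC_BITS = 8            # assumed fixed-point precision (8 fractional bits)
ONE = 1 << FRAC_BITS     # fixed-point representation of 1.0
ALPHA_SHIFT = 2          # assumed smoothing factor alpha = 1 / 2**2 = 0.25
THRESHOLD = ONE // 2     # assumed alarm threshold of 0.5

def ewma_fixed(decisions, alpha_shift=ALPHA_SHIFT):
    """Return the fixed-point EWMA after each binary decision (0 or 1)."""
    s = 0
    history = []
    for d in decisions:
        x = ONE if d else 0
        # s = s + alpha * (x - s), with alpha applied as a right shift.
        s += (x - s) >> alpha_shift
        history.append(s)
    return history

def is_malware(decisions):
    """Flag the program once the filtered signal reaches the threshold."""
    return any(s >= THRESHOLD for s in ewma_fixed(decisions))
```

With these assumed parameters a single isolated positive decision never reaches the threshold, while three consecutive positives do, which is the filtering behaviour the slide describes.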

Hardware Implementation
Figure panels: Logistic Regression, Neural Network

MAP FPGA Implementation

Example of EWMA
Figure panels: Logistic Regression, Neural Network

Results

Key Results of MAP
- Best performing feature is based on instruction opcodes
- MAP achieves 89% real-time detection with only 6% false positives with a simple LR prediction
- Physical design overhead
  - Cycle time: 1.9% (LR), 5.5% (NN)
  - Area: 0.3% (LR), 5.7% (NN)
  - Power: 0.1% (LR), 1.7% (NN)

Future Directions
- MAP can be extended as a configurable malware detection engine
  - Updating weights for new malware
  - Configuring features
- Integrated FPGAs in new CPU designs (Intel Xeon) can be used for MAP

Thank You! Questions?