1
Malware-Aware Processors: A Framework for Efficient Online Malware Detection
Meltem Ozsoy*, Caleb Donovick*, Iakov Gorelik*, Nael Abu-Ghazaleh** and Dmitry Ponomarev*
* Binghamton University, ** University of California, Riverside
HPCA San Francisco, CA
2
Malware Growth
- Anti-virus software
- OS Level Defenses
- Execution Monitoring
AV Test Malware Statistics, 2014
3
What This Work is All About
- Comprehensive execution monitors are too heavy-weight to be always on
  - Performance loss
- Low-level indicators were shown to be effective for classifying malware
  - Demme et al. (ISCA 2013) proposed offline detection using performance counters
- Our contribution: online detection in hardware
- Hardware classifiers are not perfect, thus:
  - Two Level Detection Framework: use the hardware-based detector to prioritize the work of the heavy-weight software detector
4
Two Level Detection Framework
5
Malware Detection
Static Analysis
- Study the program without executing it
- Signature generation with byte/instruction sequences
- Using source code, CFG generation
Limitations of Static Analysis
- Requires source code or disassembly
- Metamorphic malware (self-modifying code)
- Polymorphic (encrypted) malware
- Non-deterministic inputs can change program flow
6
Malware Detection
Dynamic Analysis
- System calls, function parameters, API calls, created processes/threads, etc. are monitored
- Expensive; uses a VM or emulator
Limitations of Dynamic Analysis
- Only effective against analyzed malware
- Advanced Persistent Threats (APTs) can bypass it with zero-day exploits
7
Execution Monitoring
[Figure: diagrams showing the placement of Application, Modified Application, VM, EM, and Kernel]
- Systemcall Forwarding: Proxos (OSDI’06)
- VM Introspection, Isolated Monitoring: Livewire (NDSS’03), Virtuoso (IEEE Security & Privacy’11)
- Reference Monitoring: PinOS (ACM VEE’07), Kernel DBT (ASPLOS’12)
8
Malware Detection at Low-level
Sub-semantic Monitoring
- Low-level indicators of a program, such as performance counters (Demme et al. ISCA’13), are monitored
Limitations
- Detection is after the fact
- Not real-time
- Features are limited to the available performance counters
9
Our Proposal: MAP
Malware Aware Processor (MAP)
- Use hardware for sub-semantic detection
  - Train a simple machine learning algorithm
  - Periodic checks during execution
- Perform online detection using time series analysis in hardware
- High-overhead software analysis is activated only for suspicious programs (Two Level Detection)
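The two-level idea can be illustrated with a small sketch in which scores from the hardware detector decide which programs the heavy-weight software detector examines, and in what order. The function name, the score range, the threshold, and the budget are all hypothetical; the slides do not specify this interface.

```python
def select_for_software_analysis(hw_scores, threshold=0.5, budget=2):
    """Pick programs for the heavy-weight software detector.

    hw_scores maps a program name to a suspicion score in [0, 1] reported
    by the hardware detector (assumed interface, not from the slides).
    Only programs at or above `threshold` are considered, most suspicious
    first, and at most `budget` of them are returned.
    """
    suspicious = [(name, score) for name, score in hw_scores.items()
                  if score >= threshold]
    ranked = sorted(suspicious, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:budget]]
```

Programs the hardware considers benign never reach the expensive monitor, which is how the always-on cost stays low in this sketch.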
10
MAP Design Overview
[Figure: processor pipeline diagram showing MAP alongside Instruction Cache, Instruction Fetch, Branch Prediction, Rename/Decode, Issue, Physical Register File, Functional Units, ROB & Architectural Register File, Exception Unit, MMU, and Data Cache]
MAP:
- Collects sub-semantic features
- Has a simple machine learning engine
- Checks the executing program in real time
11
Sub-Semantic Feature Space
Architectural
- ARCH: Frequency of memory reads/writes, taken & immediate branches, and unaligned memory accesses
Memory Address
- MEM1: Frequency of memory address distance histogram
- MEM2: Memory address distance histogram mix
Instruction
- INS1: Frequency of instruction categories
- INS2: Difference between two most frequent opcodes
- INS3: Existence of categories
- INS4: Existence of opcodes
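As a rough illustration of two of these features, the sketch below computes an INS1-style category frequency vector and a MEM1-style address distance histogram for one window of execution. The category names, the power-of-two binning, and the number of bins are assumptions made for the example; the slides do not give the exact definitions.

```python
from collections import Counter

def ins1_category_frequencies(categories, all_categories):
    """Fraction of instructions in the window that fall in each category."""
    counts = Counter(categories)
    total = len(categories) or 1  # avoid dividing by zero on an empty window
    return [counts[c] / total for c in all_categories]

def mem1_distance_histogram(addresses, num_bins=8):
    """Normalized histogram of distances between consecutive memory accesses.

    Binning is by bit length of the distance (0, 1, 2-3, 4-7, ...), with the
    last bin catching everything larger. This binning is an assumption.
    """
    hist = [0] * num_bins
    for prev, cur in zip(addresses, addresses[1:]):
        distance = abs(cur - prev)
        bin_index = min(distance.bit_length(), num_bins - 1)
        hist[bin_index] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]
```

Both functions return fixed-length vectors, which is what a simple classifier needs as input regardless of how many instructions the window contains.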
12
Machine Learning Algorithms
Logistic Regression
- Hypothesis function (a*x1 + b*x2 + c) is trained to determine the weights (a, b, c)
- Sigmoid function maps the hypothesis function to a value between 0 and 1
Neural Network (multi-layer perceptron)
- One hypothesis function trained for each layer
- Translation function is tanh
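A minimal sketch of the logistic regression prediction step: a weighted sum of the features plus a bias, passed through a sigmoid. The weights, bias, and the 0.5 decision threshold are placeholder assumptions; in the talk the weights come from training.

```python
import math

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def lr_predict(features, weights, bias, threshold=0.5):
    """Return (probability, is_malware) for one feature vector.

    `weights` and `bias` play the role of the trained coefficients;
    `threshold` is an assumed decision cut-off.
    """
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    return probability, probability >= threshold
```

A multi-layer perceptron repeats the same weighted-sum step per layer, with tanh in place of the sigmoid as the slide notes.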
13
Data Set & Data Collection
Family Train Test Val Extended Total
Vundo 14 2 5 21 42
Emerleox 10 4 33 52
Virut 8 3 7 46 64
Sality 12
Ejik 6 101 118
Looper 145 164
AdRotator 1 119 136
PornDialer 11 196 217
Boaxxe 13 211 230
Total 99 34 36 918 1087
Data collection
- 32-bit Windows 7 on VirtualBox
- Windows Security Services disabled
- Features collected through PIN during execution of malware
Malware sources
- University of Mannheim dataset
- Offensive Computing
- VirusTotal
14
Selecting Features for Classification
- Offline detection performance
- Low hardware implementation complexity
- Used for hardware implementation
15
Key Aspects of MAP Operation
- Machine learning model is trained at design time
- Weights for the model are loaded into the MAP hardware
- While the program executes, the MAP hardware collects features at the instruction commit stage
- For each 10K committed instructions, a binary decision (malware/regular) is made
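The per-window operation can be sketched in software as follows: the committed instruction stream is cut into fixed-size windows and a classifier produces one binary decision per window. The `classify` callback and the list-based stream are illustrative assumptions; the real design does this in hardware at the commit stage.

```python
WINDOW = 10_000  # committed instructions per decision, as stated on the slide

def window_decisions(committed, classify, window=WINDOW):
    """Return one binary decision per complete window of committed instructions.

    `committed` is a sequence of committed instructions (e.g. opcode names)
    and `classify` is any function mapping a window to True (malware) or
    False (regular). A trailing partial window is ignored.
    """
    decisions = []
    for start in range(0, len(committed) - window + 1, window):
        decisions.append(bool(classify(committed[start:start + window])))
    return decisions
```

For example, with a toy window of 4 and a made-up rule that flags windows dominated by `xor`, a stream of ten instructions yields two decisions and the last two instructions wait for the next full window.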
16
MAP Online Detection
- Periodic binary signals are created for every 10K instructions during execution
- Exponentially Weighted Moving Average (EWMA) is used to filter out occasional false positives/negatives
- Additional optimizations for efficient hardware implementation
  - Fixed-point representation
  - Sliding window of signals
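A minimal sketch of EWMA filtering over the per-window binary signals, written with integer arithmetic to mimic a fixed-point implementation. The 8 fractional bits, the smoothing factor of 1/4 (a right shift by 2), and the 0.5 threshold are assumptions chosen for illustration, not values from the talk.

```python
FRAC_BITS = 8          # assumed fixed-point format: 1.0 is stored as 256
ONE = 1 << FRAC_BITS

def ewma_fixed_point(signals, shift=2, threshold=ONE // 2):
    """Smooth a stream of 0/1 decisions and flag when the average is high.

    Implements state += alpha * (x - state) with alpha = 1 / 2**shift,
    so the multiply becomes a right shift. Returns one flag per signal.
    """
    state = 0
    flags = []
    for bit in signals:
        target = ONE if bit else 0
        state += (target - state) >> shift
        flags.append(state >= threshold)
    return flags
```

With these settings a single isolated positive in a run of negatives never raises the flag, while a sustained run of positives does after a few windows, which is the filtering behavior the slide describes.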
17
Hardware Implementation
[Figure panels: Logistic Regression, Neural Network]
18
MAP FPGA Implementation
19
Example of EWMA
[Figure panels: Logistic Regression, Neural Network]
20
Results
21
Key Results of MAP
- Best performing feature is based on instruction opcodes
- MAP achieves 89% real-time detection with only 6% false positives with a simple LR prediction
- Physical design overhead
  - Cycle time: 1.9% (LR), 5.5% (NN)
  - Area: % (LR), 5.7% (NN)
  - Power: % (LR), 1.7% (NN)
22
Future Directions
- MAP can be extended as a configurable malware detection engine
  - Updating weights for new malware
  - Configuring features
- Integrated FPGAs in new CPU designs (Intel Xeon) can be used for MAP
23
Thank You! Questions?