Published by Esther Cobb. Modified over 9 years ago.
1
AccessMiner: Using System-Centric Models for Malware Protection
Andrea Lanzi, Davide Balzarotti, Christopher Kruegel, Mihai Christodorescu and Engin Kirda
ACM CCS 2010, Oct.
2
OUTLINE
Malware Detection
System Call Data Collection
Program-Centric Models and Detection
System-Centric Models and Detection
Discussion and Conclusion
4
Malware Detection
Signature
◦ Static content
◦ Byte strings, instruction sequences => evaded by code obfuscation
Behavior
◦ Dynamic actions
◦ Sequences of system calls, API functions
◦ A program-centric approach
◦ …good results?
5
Malware Detection Problem
Test case
◦ Small scale: about 10 benign applications
◦ Limited execution: a few minutes, in a sandbox
◦ Synthetic inputs
◦ Single machine
6
Malware Detection Problem (cont.)
Program-centric model
◦ Narrow view of a program
◦ Diversity of system call information
◦ How do benign programs interact with their environment?
◦ Models may be specific to only a small set of benign applications
8
System Call Data Collection
A Microsoft Windows kernel module
◦ Collects, anonymizes, and uploads system call logs
◦ Hooks the System Services Descriptor Table
◦ Mindful of system resources
9
Kernel collector
79 different system calls
◦ Related to files, the registry, processes and threads, networking, and memory
◦ Same subset as in Anubis
10
System Call Data
Sensitive data are replaced
◦ Non-system paths, user-root registry keys, IP addresses
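The replacement step can be sketched as follows. This is only an illustration, not the kernel module's actual code: the placeholder strings, the list of system path prefixes, and the function name `anonymize` are all assumptions made for the example, and registry keys are not handled.

```python
import re

# Assumed list of system path prefixes that are left untouched (hypothetical).
SYSTEM_PREFIXES = ("c:\\windows", "c:\\program files")

# Simple IPv4 pattern; real logs may also need IPv6 handling.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def anonymize(argument: str) -> str:
    """Replace IP addresses and non-system paths with placeholders."""
    argument = IPV4.sub("<IP>", argument)
    lowered = argument.lower()
    # A drive path that is not under a system prefix is treated as sensitive.
    if re.match(r"^[a-z]:\\", lowered) and not lowered.startswith(SYSTEM_PREFIXES):
        return "<USER_PATH>"
    return argument
```

System paths pass through unchanged, so the structure that the later models rely on is preserved while user-specific names are dropped.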
11
System Call Data Collection
Large and diverse set of system call traces
◦ Ten different machines, different users
◦ Several weeks
◦ 114.5 GB of data
◦ 1.556 billion system calls
◦ 362,600 processes
◦ 242 applications
12
Data set
◦ 2~4 days with 2~12 hours
◦ Production systems, development systems
13
Data Normalization
Raw data (system call logs) => accessed resources and access types
Tracking the access operations
◦ The set of resources open at any given time (OS handles)
◦ Until the resource is released (NtClose)
Execution path and file name:
◦ NtOpenFile, NtCreateSection, NtCreateThread
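A minimal sketch of the handle tracking described above, assuming each log event is a `(syscall, handle, path, access)` tuple. That tuple layout and the function name `normalize` are hypothetical; the slides do not show the collector's real log format.

```python
def normalize(trace):
    """Turn raw (syscall, handle, path, access) events into (path, access) records."""
    open_handles = {}   # handle -> path of the resource currently open
    accesses = []       # normalized (path, access type) records
    for syscall, handle, path, access in trace:
        if syscall == "NtClose":
            # The resource is released; stop tracking its handle.
            open_handles.pop(handle, None)
        elif path is not None:
            # An open/create call binds a handle to a resource path.
            open_handles[handle] = path
            accesses.append((path, access))
        elif handle in open_handles:
            # Later calls name only the handle; resolve it back to the path.
            accesses.append((open_handles[handle], access))
    return accesses
```

Calls that reference a handle after `NtClose` are dropped here because the resource can no longer be resolved; a real normalizer might log them instead.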
15
Analysis of System Call Data
How diverse is the collected system call data?
Focus on system call types
◦ Long tradition in the security community
◦ Most models rely upon characteristic patterns
Ignore argument values
16
Creating n-gram Models
Follow a "standard" approach
1. Extract n-gram models for a set of malware programs and a set of benign programs
2. Find all n-grams that appear in malware programs but not in benign programs
3. Hope those n-grams are characteristic of malware programs
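The three steps above can be sketched in a few lines. This assumes each trace is a list of system call names; the function names are illustrative and not taken from the paper.

```python
def ngrams(calls, n):
    """Return the set of length-n windows over a system call sequence."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}

def malware_only_ngrams(malware_traces, benign_traces, n):
    """Steps 1 and 2: n-grams seen in malware traces but in no benign trace."""
    malware = set().union(*(ngrams(t, n) for t in malware_traces))
    benign = set().union(*(ngrams(t, n) for t in benign_traces))
    return malware - benign

def is_flagged(trace, signature, n):
    """Step 3: flag a trace if it contains any malware-only n-gram."""
    return bool(ngrams(trace, n) & signature)
```

The weakness the slides go on to discuss is visible here: every new benign trace can only shrink the `malware - benign` set, so a diverse benign corpus leaves few distinguishing n-grams.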
17
Unique n-gram analysis
18
n-gram Models
10,838 malware samples from Anubis
Ten experiments (ten machines)
◦ Train an n-gram model on system call traces from 9 machines and 2/3 of the malware set
◦ Perform detection on the remaining system call traces and the remaining 1/3 of the malware set
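One way to set up the ten experiments is a leave-one-machine-out split, sketched below. The random 2/3 versus 1/3 partition of the malware set, the seed, and the dictionary layout are assumptions; the slide does not say how the malware samples were divided.

```python
import random

def make_splits(machine_traces, malware_samples, seed=0):
    """Yield one train/test split per machine.

    machine_traces: dict mapping machine id -> list of benign traces.
    """
    rng = random.Random(seed)
    for held_out in machine_traces:
        shuffled = list(malware_samples)
        rng.shuffle(shuffled)
        cut = (2 * len(shuffled)) // 3  # 2/3 of the malware set for training
        train_benign = [t for machine, traces in machine_traces.items()
                        if machine != held_out for t in traces]
        yield {
            "held_out": held_out,
            "train_benign": train_benign,
            "train_malware": shuffled[:cut],
            "test_benign": machine_traces[held_out],
            "test_malware": shuffled[cut:],
        }
```

Holding out a whole machine, rather than random traces, tests whether a model trained on some users' behavior generalizes to an unseen user.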
19
Detection Results
20
Program-Centric Models and Detection
System call sequences invoked by benign applications are diverse
◦ Models have difficulty distinguishing normal from malicious behavior
A large amount of data is needed
22
System-Centric Models and Detection
Generalize how benign programs interact with the operating system
Record accesses to files and registry entries
◦ Read, write, execute
Access behavior shows "convergence"
23
Access Activity Model
A set of labels for operating system resources
A label "L" is a set of access tokens
◦ {t_0, t_1, …, t_n}
A token "t" is a pair
◦ <a, op>, a => application, op => type of access
24
Initial Access Activity Model (1)
Use system call traces of all benign processes
A virtual file system tree
Application "a": C:\foo\a.txt (write)
Application "b": C:\foo\bar\b.rar (exec)
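The initial tree for the two example accesses above might be built like this. The nested-dictionary node layout and the names `new_node` and `add_access` are assumptions chosen for the sketch; tokens are stored as `(application, op)` pairs at the node of the accessed resource.

```python
def new_node():
    """A tree node: a label (set of access tokens) plus child nodes."""
    return {"label": set(), "children": {}}

def add_access(root, app, path, op):
    """Walk or create the path in the virtual tree, then record the token."""
    node = root
    for part in path.split("\\"):
        node = node["children"].setdefault(part.lower(), new_node())
    node["label"].add((app, op))

root = new_node()
add_access(root, "a", "C:\\foo\\a.txt", "write")
add_access(root, "b", "C:\\foo\\bar\\b.rar", "exec")
```

At this stage only leaf resources carry tokens; folders such as `C:\foo` have empty labels until the generalization step assigns them.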
25
Model Pre-processing (2)
Remove some elements in the tree
◦ Microsoft Windows services
◦ Desktop indexing programs
◦ Anti-virus software
Identify applications that start processes with different names
◦ C:\Windows\system32 => win_core
26
Model Generalization (3)
Propagated Container
◦ All children are private (without *)
◦ C:\Program Files
Merged =>
27
System-Centric Model Detection
For any op, find the longest prefix P shared between the path to the resource and the folders in the virtual tree stored by our model
Ten experiments
◦ File system access activity model: about 100 labels
◦ Registry access activity model: about 3000 labels
◦ Full access activity model
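The longest-prefix check can be sketched as below, assuming the generalized model is flattened into a dictionary from lower-cased folder path to label. Treating `*` as "any application" and ignoring resources that match no folder are both assumptions of this sketch, not statements about the paper's implementation.

```python
def longest_prefix(model, path):
    """Return the deepest folder in the model that is a prefix of path."""
    parts = path.lower().split("\\")
    for end in range(len(parts), 0, -1):
        prefix = "\\".join(parts[:end])
        if prefix in model:
            return prefix
    return None

def is_violation(model, app, path, op):
    """Check one access against the label of its longest matching prefix."""
    prefix = longest_prefix(model, path)
    if prefix is None:
        return False  # assumption: resources outside the model are not judged
    label = model[prefix]
    return (app, op) not in label and ("*", op) not in label

# Hypothetical model with two labeled folders.
model = {
    "c:\\program files\\app_a": {("app_a", "write")},
    "c:\\windows": {("win_core", "write"), ("*", "read")},
}
```

With this model, a write under `C:\Windows` by an unknown application is a violation, while a read from the same folder is allowed by the wildcard token.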
28
Detection Results (Files)
// Looks sobering
Many samples (malware) don't work (!)
◦ 10,838 -> 7,847
Use only write operations
◦ Our own logging component
◦ Software updates
29
Detection Results (Regs)
HKEY_USER\Software\Microsoft
◦ Need a larger training set
31
Discussion and Conclusion
Full access activity model
◦ 91% detection / 0% false positives
System-centric approach
Policy violations occurred only for a few specific classes of programs
Network limitation
MAC policy
◦ SELinux