Published by Alexander Allison. Modified over 8 years ago.
1
Corrado Leita, Symantec Research Labs
Ulrich Bayer, Technical University Vienna
Engin Kirda, Institute Eurecom @ iSecLab
2
Outline
- Introduction
- Related Work
- SGNET and EPM Clustering
- Results
- Conclusion
2010/7/20, ADLab Meeting
4
Introduction
8
Related Work
- Gheorghescu, 2005: disassembles the samples and compares their basic blocks
- Kolter and Maloof, 2006: compares hex dumps of the samples' code segments
- Wicherski, 2009 (peHash): polymorphic binaries receive the same hash value, computed from the portions of the PE header that are not mutated
9
Related Work
- Lee and Mody, 2006: based on system call traces; a first attempt to cluster malware according to its behavior
- Bailey et al., 2007: the first to build a clustering system that describes a sample's behavior in more abstract terms; O(n^2) complexity
10
Related Work
- Anubis (http://anubis.iseclab.org/): dynamic analysis system for capturing a sample's behavior
- Data tainting
- Tracking of sensitive compare operations
13
SGNET and EPM Clustering
SGNET
- ScriptGen: learning 0-day behavior
- Argos: program flow hijack detection
- Nepenthes: shellcode emulation, malware download
14
SGNET and EPM Clustering
- Sensor: ScriptGen FSM
- Sample Factory: Argos
- Shellcode handlers: Nepenthes
17
EPM Clustering
Phase 1: feature definition
18
EPM Clustering
Pi (payload)
- PUSH-based interaction
- PULL-based interaction
- Central repository
Mu (malware)
- PE header characteristics seem to be more difficult to mutate
- A change in their value is likely to be associated with a modification or recompilation of an existing codebase
19
EPM Clustering
- Clearly, all of the features taken into account for the classification could be easily randomized by the malware writer
- More complex (costly) polymorphic approaches might appear in the future
20
EPM Clustering
Phase 2: invariant discovery
- An invariant value is a value that is not specific to a certain:
  - attack instance
  - attacker
  - destination
- Threshold-based: a value counts as an invariant when it is seen in
  - at least 10 different attack instances
  - at least 3 different attackers
  - at least 3 honeypot IPs
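The thresholds above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout (keys `attacker`, `honeypot`, `features`) and the function name `find_invariants` are hypothetical, and each record is taken to be one attack instance.

```python
from collections import defaultdict

# Thresholds taken from the slide; everything else is illustrative.
MIN_INSTANCES = 10
MIN_ATTACKERS = 3
MIN_HONEYPOTS = 3

def find_invariants(events):
    """Return {feature_name: set of invariant values}.

    `events` is a list of dicts, one per attack instance, with the
    hypothetical keys 'attacker', 'honeypot' and 'features' (a dict
    mapping a feature name to the value observed in that instance).
    """
    seen = defaultdict(lambda: {"count": 0, "attackers": set(), "honeypots": set()})
    for ev in events:
        for name, value in ev["features"].items():
            stats = seen[(name, value)]
            stats["count"] += 1
            stats["attackers"].add(ev["attacker"])
            stats["honeypots"].add(ev["honeypot"])

    invariants = defaultdict(set)
    for (name, value), stats in seen.items():
        # A value is kept only if it passes all three thresholds.
        if (stats["count"] >= MIN_INSTANCES
                and len(stats["attackers"]) >= MIN_ATTACKERS
                and len(stats["honeypots"]) >= MIN_HONEYPOTS):
            invariants[name].add(value)
    return dict(invariants)
```

A value that changes at every attack instance (for example a per-instance hash) never reaches the instance threshold and so is never reported as an invariant.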
21
EPM Clustering
Phase 3: pattern discovery
T = v1, v2, v3, …, vn
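The slide gives only the feature tuple T. One plausible reading, consistent with the "most specific pattern" wording of Phase 4, is that each tuple is generalized by keeping its invariant values and replacing every other value with a "do not care" marker. The sketch below assumes exactly that; the `WILDCARD` marker, the function name and the shape of `invariants` are hypothetical.

```python
WILDCARD = "*"  # hypothetical "do not care" marker

def discover_patterns(tuples, invariants):
    """Generalize feature tuples into a set of patterns.

    `tuples` is a list of equal-length tuples T = (v1, ..., vn).
    `invariants[i]` is the set of values treated as invariant for
    feature position i. Non-invariant values become WILDCARD.
    """
    patterns = set()
    for t in tuples:
        patterns.add(tuple(
            v if v in invariants[i] else WILDCARD
            for i, v in enumerate(t)
        ))
    return patterns
```

Tuples that differ only in non-invariant positions collapse into the same pattern, which is what lets a later step group them together.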
22
EPM Clustering
Phase 4: pattern-based classification
Clustering
- Multiple patterns could match the same instance
- Each instance is always associated with the most specific pattern matching its feature values
- All the instances associated with the same pattern are said to belong to the same EPM cluster
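The classification rule can be sketched as below. Two things are assumptions rather than statements from the slides: patterns are tuples in which a `WILDCARD` entry matches any value, and "most specific" is measured as the number of non-wildcard fields. All names are hypothetical.

```python
WILDCARD = "*"  # hypothetical "do not care" marker

def matches(pattern, instance):
    """True if every non-wildcard field of the pattern equals the instance's value."""
    return all(p == WILDCARD or p == v for p, v in zip(pattern, instance))

def specificity(pattern):
    """Number of fields the pattern actually constrains (assumed measure)."""
    return sum(p != WILDCARD for p in pattern)

def classify(instance, patterns):
    """Return the most specific matching pattern, or None if nothing matches.

    On ties, max() keeps the first maximal pattern in iteration order,
    so pass a list if a deterministic result is needed.
    """
    candidates = [p for p in patterns if matches(p, instance)]
    if not candidates:
        return None
    return max(candidates, key=specificity)

def cluster(instances, patterns):
    """Group instances by the pattern they are classified under."""
    clusters = {}
    for inst in instances:
        clusters.setdefault(classify(inst, patterns), []).append(inst)
    return clusters
```

Each key of the dictionary returned by `cluster` then plays the role of one EPM cluster.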
23
EPM Clustering
- E-clusters: Exploit
- P-clusters: Payload
- M-clusters: Malware
26
Results
- Data: January 2008 to May 2009, collected by the SGNET deployment
- 6353 malware samples
- Only 5165 could be correctly executed in Anubis
- Some malware samples could not be downloaded correctly by Nepenthes
27
Results
- 39 E-clusters
- 27 P-clusters
- 260 M-clusters
- 972 B-clusters
29
Results
- The number of exploit/payload combinations is low
  - Most malware variants seem to share a few distinct exploitation routines for propagation
- The number of B-clusters is lower than the number of M-clusters
  - Some M-clusters are likely to correspond to variations of the same codebase
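Given per-sample cluster labels, one way to look for M-clusters that may be variations of the same codebase is to group M-clusters by the B-cluster their samples fall into. This is only a sketch: the input layout (pairs of M-cluster id and B-cluster id) and the function name are assumptions, and any ids passed in are the caller's own.

```python
from collections import defaultdict

def m_clusters_sharing_behavior(samples):
    """Map each B-cluster id to the M-cluster ids observed in it.

    `samples` is an iterable of (m_cluster_id, b_cluster_id) pairs,
    one pair per malware sample (hypothetical layout). Only B-clusters
    that contain more than one M-cluster are returned.
    """
    by_b = defaultdict(set)
    for m_id, b_id in samples:
        by_b[b_id].add(m_id)
    return {b: ms for b, ms in by_b.items() if len(ms) > 1}
```

Swapping the two elements of each pair gives the reverse view: M-clusters whose samples spread over several B-clusters.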
32
Results
P-pattern 45:
- PUSH-based download
- TCP port 9988
33
Results
M-cluster 13:
34
Results
- M-cluster 13 is a polymorphic malware associated with several different B-clusters
- MD5 is not an invariant
- Allaple mutates its content at each attack instance
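Why MD5 cannot act as an invariant for a polymorphic sample can be shown with a small Python sketch. The byte strings below are made-up stand-ins, not real Allaple binaries; the point is only that changing a single byte changes the digest.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()

# Made-up stand-ins for two instances of the same polymorphic sample.
base = b"MZ" + bytes(64)
mutated = base[:-1] + b"\x01"   # same content with the last byte changed

print(md5_hex(base) == md5_hex(mutated))  # False
```

A feature such as a port number or a PE header field can stay constant across such mutations, while the hash of the whole file does not.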
35
Results
- Each behavioral profile corresponds to an execution time of 4 minutes
- Bot?
- Honeypots may help!
37
Results
Allaple
- Worm exploiting MS04-007
- DoS attacks
38
Results
- IRC servers
40
Conclusion
- Combine different clustering techniques
- Improve effectiveness in building intelligence on the threats economy