Published by Milo Leonard. Modified over 8 years ago.
1
Finding Diversity in Remote Code Injection Exploits
Justin Ma, John Dunagan, Helen J. Wang, Stefan Savage, Geoffrey M. Voelker
University of California, San Diego
Microsoft Research
2
Encountering new malware
Have I seen this before?
How closely related is it to what I have seen before?
3
Practical considerations
New defense?
4
Theoretical considerations
Evolutionary relationship?
5
Grouping similar malware together…
Ultimately, construct malware families
Anti-virus industry is active in this area
6
Motivation
710 new families
40,000 new variants
Family and variant are defined in an ad hoc fashion…
Is there a systematic way to determine the nature of this diversity?
7
Exploit diversity
[Figure labels: Attacker, MS RPC Request, Exploit]
8
Polymorphism
[Figure labels: Attacker, Encrypted]
9
Behind the encryption…
[Figure label: Attacker]
10
Differing constants
[Figure labels: Attacker, Different IP address]
11
Functional differences
[Figure labels: Attacker, Waiting for a connection]
12
Different code base
[Figure labels: Attacker, Calling “tftp.exe”]
13
ISystemActivator vulnerability
1,561 exploit attempts
90 unique payloads
How different are they?
14
Our goal
Automatically construct a phylogeny, or family tree, of exploits
15
Outline for this talk
On classifying shellcodes
Steps for systematically studying shellcodes
  – Trace collection
  – Shellcode extraction
  – Shellcode decryption
  – Comparing samples
  – Cluster analysis
Post-hoc manual inspection to validate
  – Look at the code!
16
Why shellcodes?
Our study focuses on exploits
Shellcodes are packaged with the exploit
  – First foreign code that executes on a newly infected machine
  – Part of the exploit with the most leeway for variation
Primary challenge: collecting and analyzing shellcodes
17
Remote code injection attacks
[Figure: the victim's stack memory, from high to low addresses. Labels: Victim, MS RPC Request, Exploit, Vulnerable buffer, Shellcode, Decrypted shellcode, Flow of execution]
18
Trace collection
Studying 5 vulnerabilities
Residential trace
  – 2-day trace
  – Windows XP SP2
  – 29 unused DSL IP addresses
  – 4,400 exploit samples
Enterprise trace
  – 1 hour
  – Active responders
  – 5 /24 subnets
  – 1,500 exploit samples
19
Shellcode extraction
Shield (SIGCOMM ’04)
  – Framework for specifying network-based protocols and vulnerabilities
  – Extracts shellcodes from raw network packets
20
Shellcode decryption
Shellcode is encrypted
  – Use the shellcode’s own decryption loop!
Limited emulation
  – Similar to the generic decryption technique used for viruses
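The slides do not show what a decryption loop looks like. As a rough illustration only, the sketch below models such a loop as a repeating XOR with a 4-byte key. The XOR scheme, the key value, the sample bytes, and the name `xor_decode` are all assumptions made for this example. The talk's actual approach is limited emulation of the shellcode's own loop, not a reimplementation like this one.

```python
def xor_decode(payload: bytes, key: bytes) -> bytes:
    """XOR every payload byte with a repeating key (assumed decoder model)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(payload))


# Made-up 4-byte key and stand-in "plaintext" bytes, for illustration only.
key = bytes.fromhex("deadbeef")
plain = b"\x90\x90\xeb\x10cmd"

encoded = xor_decode(plain, key)    # what would travel on the wire
decoded = xor_decode(encoded, key)  # XOR is its own inverse
assert decoded == plain
```

Because the encoded bytes change with every key, two payloads carrying identical code can look unrelated until they are decoded, which is why decryption comes before any comparison.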
21
Comparing samples: Candidate metrics
Edit distance
  – Too specific: non-code portions of payload made related exploits unnecessarily distant
Structural distance
  – Control flow graph over basic blocks
  – Basic blocks summarized with a color/hash
  – Too general: did not capture subtle instruction variations between exploit families
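To make the "too specific" objection concrete, here is a minimal sketch of plain edit distance (Levenshtein distance) over raw payload bytes. The function name and the sample payloads are hypothetical, and the slides do not say which edit-distance variant was used.

```python
def edit_distance(a: bytes, b: bytes) -> int:
    """Levenshtein distance: minimum number of byte insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


# Two made-up payloads: identical 4-byte "code", different filler data.
code = b"\x31\xc0\x50\x68"
payload_a = code + b"A" * 8
payload_b = code + b"B" * 8

print(edit_distance(payload_a, payload_b))  # 8, all of it from the filler
```

The two payloads share every code byte, yet the filler alone pushes them apart, which is the weakness the slide describes.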
22
Comparing samples: Final metric
Exedit distance metric
  – Edit distance over executed parts of shellcode
Distinguishes code from data
Maintains instruction-level details
Canonical string for shellcode
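One way to read the exedit idea is sketched below, under several assumptions: the set of executed byte offsets is taken as given (in the talk it would come from emulation), the canonical string is built in offset order, and the distance is normalized by the longer string so it can be read as a fraction. None of these details are stated on the slides, and all names are hypothetical.

```python
def edit_distance(a: bytes, b: bytes) -> int:
    """Levenshtein distance over bytes."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def canonical_string(payload: bytes, executed_offsets) -> bytes:
    """Keep only the bytes at executed offsets, in offset order (assumed)."""
    return bytes(payload[i] for i in sorted(set(executed_offsets)))


def exedit_distance(p1: bytes, ex1, p2: bytes, ex2) -> float:
    """Edit distance over executed bytes, normalized to [0, 1] (assumed)."""
    s1 = canonical_string(p1, ex1)
    s2 = canonical_string(p2, ex2)
    longest = max(len(s1), len(s2))
    return edit_distance(s1, s2) / longest if longest else 0.0


# Same made-up code bytes, different filler; only offsets 0..3 "executed".
code = b"\x31\xc0\x50\x68"
payload_a = code + b"A" * 8
payload_b = code + b"B" * 8
print(exedit_distance(payload_a, range(4), payload_b, range(4)))  # 0.0
```

With the filler excluded, the two payloads compare as identical, while a one-byte change inside the executed region would still register.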
23
Cluster analysis
Need to group samples using the exedit distance metric
Agglomerative clustering
  – At each iteration, merge the closest pair of clusters
  – Cluster distance = distance between the two furthest samples, one from each cluster
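The merge rule above (closest pair of clusters, with cluster distance taken between the furthest samples) is commonly called complete-linkage agglomerative clustering. Below is a small, unoptimized sketch of it. Stopping once the closest pair is further apart than a threshold is an assumption here, suggested by the clustering threshold mentioned later in the talk, and integers with absolute difference stand in for shellcodes with exedit distance.

```python
def complete_linkage(c1, c2, dist):
    """Cluster distance = distance between the furthest pair of samples."""
    return max(dist(a, b) for a in c1 for b in c2)


def agglomerate(samples, dist, threshold):
    """Repeatedly merge the closest pair of clusters until the closest
    pair is further apart than threshold (stopping rule is assumed)."""
    clusters = [[s] for s in samples]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = complete_linkage(clusters[i], clusters[j], dist)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Toy data: integers stand in for samples, |a - b| for the distance metric.
families = agglomerate([10, 11, 50, 52, 90], lambda a, b: abs(a - b), threshold=5)
print(families)  # [[10, 11], [50, 52], [90]]
```

Recording the order and distance of each merge, rather than only the final groups, would give the tree structure that the talk calls a phylogeny.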
24
Results
Caught exploits for 5 vulnerabilities over the traces
Summary for residential trace:

Vulnerability       Exploits   Unique exploits   Families
SQL Resolution           767                 2          1
LSASS                  1,769                56          5
ISystemActivator       1,561                90          6
RemoteActivation          33                 8          2
25
ISystemActivator
10% clustering threshold
6 families
Need to manually verify this…
26
ISystemActivator
[Figure labels: 4-byte decoding key, Kernel-address loading function, Function-finding block]
27
ISystemActivator
[Figure labels, first sample: 4-byte decoding key, Kernel-address loading function, Function-finding block]
[Figure labels, second sample: 4-byte encoding key, Kernel base loader, Function finder]
28
ISystemActivator
Longest payload
Many function blocks in middle of payload
29
ISystemActivator
Command-line call to “tftp.exe”
30
ISystemActivator
Different instructions in parts, otherwise very similar
31
ISystemActivator
“Bind” version
“Connect-back” version
32
Conclusions
Systematic method for classifying exploits
  – Exploit collection
  – Shellcode extraction and decryption
  – Shellcode comparison using exedit distance
  – Group exploits with clustering
Similarity between samples in computed phylogenies corresponded well with observed differences
Useful step toward automating malware classification