Joshua Mason, Sam Small (Johns Hopkins University); Fabian Monrose (University of North Carolina); Greg MacManus (iSIGHT Partners). 16th ACM CCS
Introduction; On the arms race; Related work; Our approach; Automatic generation; Implementation; Evaluation. Advanced Defense Lab
Code-injection attacks. The injected code may be source code for a scripting language, byte-code, or machine code. The common component: the injected code, or … shellcode.
Shellcode is delivered in tandem with the exploitation: store the shellcode in memory, then exploit. Shellcode takes the form of directly executable machine code. Polymorphism.
Even polymorphic shellcode is constrained by an essential component: the decoder. Common assumption: shellcode is fundamentally different in structure from non-executable payload data. This paper challenges that assumption. Figure: polymorphic shellcode layout (decoder, encoded data).
Automatically producing English shellcode, although it is not indistinguishable from authentic English prose. Do you want to analyze it?
Shellcode developers often face constraints that limit the range of accepted byte values, e.g. printable, alphanumeric, MIME. Encoding. Self-modification.
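A minimal sketch of what such a byte-value constraint looks like when written as an input filter. The set names `PRINTABLE` and `ALPHANUMERIC` and the helper functions are hypothetical, chosen for illustration; the slides do not define them.

```python
import string

# Hypothetical constraint sets (names are assumptions, not from the slides).
PRINTABLE = frozenset(range(0x20, 0x7F))  # printable ASCII, space through '~'
ALPHANUMERIC = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def violations(payload: bytes, allowed: frozenset) -> list:
    """Return (offset, byte) pairs for every byte outside the allowed set."""
    return [(i, b) for i, b in enumerate(payload) if b not in allowed]


def satisfies(payload: bytes, allowed: frozenset) -> bool:
    """True when every byte of the payload is in the allowed set."""
    return not violations(payload, allowed)
```

For instance, `satisfies(b"Shake Shake Shake!", PRINTABLE)` holds, while the same string fails the alphanumeric constraint because of the spaces and the exclamation mark.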
Much of the literature describing code-injection attacks assumes a standard attack template: a NOP sled, shellcode, and one or more pointers. Emulation and static analysis have been successful in identifying some failings of advanced shellcode, but … overhead.
It has been suggested that malicious polymorphic behavior cannot be modeled effectively (Y. Song et al., "On the Infeasibility of Modeling Polymorphic Shellcode").
Limiting the spoils of exploitation and preventing developers from writing vulnerable code. Preventing the execution of injected code. Content-based input validation. Polymorphic ▪ To identify self-decrypting shellcode ▪ But … non-self-contained polymorphic shellcode
Shellcode is simply an ordered list of machine instructions. For example, “Shake Shake Shake!” reads as: push %ebx; push “ake ”; push %ebx; push “ake ”; push %ebx; push “ake!”. But what about add, mov, call? Goal: develop an automated approach that maps arbitrary shellcode to an English representation.
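The reading of “Shake Shake Shake!” above can be reproduced with a toy lookup table. This is a sketch, not a disassembler: the `OPCODES` table and `read_as_instructions` are hypothetical names, and the table covers only a handful of single-byte IA-32 opcodes that appear on these slides ('S', 'h', 'P', 'Q', 'a').

```python
# Toy opcode table: ASCII byte -> (mnemonic, number of immediate bytes).
# Only the few IA-32 opcodes mentioned on the slides are included.
OPCODES = {
    0x50: ("push %eax", 0),  # 'P'
    0x51: ("push %ecx", 0),  # 'Q'
    0x53: ("push %ebx", 0),  # 'S'
    0x61: ("popa", 0),       # 'a'
    0x68: ("push", 4),       # 'h' followed by a 4-byte immediate
}


def read_as_instructions(text: bytes) -> list:
    """Interpret ASCII text as a sequence of instructions from the toy table."""
    out = []
    i = 0
    while i < len(text):
        opcode = text[i]
        if opcode not in OPCODES:
            raise ValueError(f"byte {opcode:#04x} at offset {i} is not in the toy table")
        mnemonic, imm_len = OPCODES[opcode]
        imm = text[i + 1 : i + 1 + imm_len]
        if len(imm) != imm_len:
            raise ValueError(f"truncated immediate at offset {i}")
        # Show immediates as the text they spell, as on the slide.
        out.append(f'{mnemonic} "{imm.decode("ascii")}"' if imm_len else mnemonic)
        i += 1 + imm_len
    return out
```

Calling `read_as_instructions(b"Shake Shake Shake!")` yields the six pushes listed on the slide; any byte outside the table raises an error, which is the point of the slide's follow-up question.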
English shellcode is completely self-contained.
The decoder must be English-compatible, so it cannot use many instructions ▪ E.g. loop instructions. Our decoder has the form: initialization, decoder, encoded payload.
Use only English-compatible instructions. Use English-compatible instructions that can produce useful instructions. Favor instructions that have less-constrained ASCII equivalents, e.g. push %eax (“P”) > push %ecx (“Q”).
Overwriting registers and patching some instructions. Using the inc instruction and manipulating the alignment of the stack.
“and r/m8, r8” (0x20, the ASCII space character). add ▪ lods (load string from esi)
Two pointers: %esi, %edi. Figure labels: “,” and “ ”; “u” and “decode”; “G”.
Using the popa instruction (ASCII character “a”).
Taken as-is, the custom decoder consists of common English characters but does not have the appearance of English text. Add some instructions between the decoder instructions. Augment a statistical language-generation algorithm.
The n-gram model has length 5. The i-th instruction in the decoder has level i. A sentence has score i when it completes level i.
Using the beam search algorithm: keep the best m (= 20,000) candidates during the process. For the encoded payload, observe how many target bytes are encoded.
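A minimal sketch of beam search over a statistical language model, to make the "keep the best m candidates" step concrete. Several simplifications are assumptions of this sketch: it uses a word bigram model rather than the 5-gram model from the slides, it ranks candidates by log-probability only (the level-based scoring of decoder instructions is not modeled), and `train_bigrams` and `beam_search` are hypothetical names.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict:
    """Count word-to-next-word transitions in a whitespace-tokenized corpus."""
    words = corpus.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def beam_search(counts: dict, start: str, length: int, beam_width: int) -> list:
    """Return up to beam_width (log_prob, words) candidates, best first."""
    beam = [(0.0, [start])]
    for _ in range(length - 1):
        expanded = []
        for log_prob, seq in beam:
            followers = counts.get(seq[-1])
            if not followers:
                continue  # dead end: this word was never followed by anything
            total = sum(followers.values())
            for word, n in followers.items():
                expanded.append((log_prob + math.log(n / total), seq + [word]))
        if not expanded:
            break  # no candidate can be extended further
        # Prune: keep only the best beam_width candidates (m on the slide).
        expanded.sort(key=lambda cand: cand[0], reverse=True)
        beam = expanded[:beam_width]
    return beam
```

With a beam width of 1 this degenerates into greedy search; a wide beam such as the m = 20,000 on the slide trades memory and time for a better chance of finding text that satisfies all constraints.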
Training data: over 15,000 Wikipedia articles and 27,000 books from Project Gutenberg. The language engine was constructed in Java using the LingPipe API. Scoring engine using the ptrace API: executor, watcher. Taking 12 hours.
Emulation: expands 1 instruction into tens of instructions. Monitored direct execution: maintain 2 machine states; use 3 separate stacks; pause on 2 conditions ▪ Encountering a jump ▪ Changing memory. Roughly less than 1 hour.
Exit(0): 2054 bytes
Windows Bind DLL Inject