1 Malware Analysis and Instrumentation Andrew Bernat and Kevin Roundy Paradyn Project Center for Computing Science June 14, 2011.


2 Forensic analysts need help
  - 90% of malware resists analysis [1]
  - Malware attacks cost billions of dollars annually [2]
  - 65% of users feel the effects of cybercrime [3]
  - 69% of cybercrimes are resolved [3]
  - 28 days on average to resolve a cybercrime [3]
  [Figure: hex dump labeled "Malware Binary"]
  [1] McAfee, 2008. [2] Computer Economics, 2007. [3] Norton, 2010.

3 Forensic analysts need help
  The needed toolbox:
  - Binary code identification
  - Control- and data-flow analysis
  - Instrumentation
  - Effectiveness on malware
  [Figure: hex dump labeled "Malware Binary"]

4 Dyninst is a toolbox for analysts
  Input: program binary
  Components: control flow analyzer, data flow analyzer, instrumenter
  Capabilities:
  - CFG (loop, block, function, instruction)
  - instrumentation
  - function replacement
  - call stack walking
  - forward & backward slices
  - loop analysis
  - process control
  - library injection
  - symbol table reading, writing
  - binary rewriting
  - machine language parsing

5 Dyninst is a toolbox for analysts
  Analysis tool (mutator):
  - Specifies instrumentation
  - Gets callbacks for runtime events
  - Builds high-level analysis
  [Figure: the analysis tool sits on top of Dyninst (control flow analyzer, data flow analyzer, instrumenter), which works on the program binary and its CFG]

6 Dyninst is a toolbox for analysts
  Mutator:
  - Specifies instrumentation
  - Gets callbacks for runtime events
  - Builds high-level analysis
  Code snippets: printf(…), counter++, if (pred) callback(…), getTarget(insn)
  Example analyses:
  - Analysis of network communications
  - Code visualizations
  - Time bomb detection and analysis
  - Identification of stolen data
  - Reports on anti-analysis techniques

7 Dyninst on malware
  Malware defeats static analysis and is sensitive to instrumentation.
  [Figure: the same analysis tool, code snippets, and Dyninst components, now applied to a malware binary]

8 Dyninst on malware
  Malware defeats static analysis and is sensitive to instrumentation.
  SR-Dyninst adds:
  - static-dynamic analysis
  - a Sensitivity Resistant Instrumenter

9 Outline
  - Anti-analysis tricks (Anti)
  - Hybrid static-dynamic analysis (H.A.)
  - Sensitivity resistance (S.R.)
  - Results (Res.)

10 Anti-analysis tricks
  Anti-analysis:
  - Obfuscated control flow: indirect control flow, stack tampering, overlapping code, signal-based control flow
  - Unpacked code: all-at-once, block-, loop-, function-at-a-time, to empty or allocated space
  - Overwritten code: single operand or opcode, whole instruction, function, code section, buffer
  Anti-instrumentation:
  - PC-sensitive code: call-pop pairs, return-address manipulation, call-stack tampering & probing
  - Anti-patching: checksum whole regions, probe for patches, use code as data, move stack pointer
  - Address-space probing: scans & probes of locations that should be unallocated

11 Obfuscated control flow
  Example: Storm worm
  [Figure: bytes at offsets 03 to 0d near the entry point (e8 03 00, then e9 eb 04 5d 45 55 c3), with addresses 40d002, 40d00a, and 40d00e marked; the bytes decode as CALL and JMP, and as JMP, POP, INC, PUSH, RET operating on ebp]

12 Unpacked code
  Example: Storm worm
  [Figure: hex dump of the binary with the entry point marked, followed by a block of newly unpacked bytes]

13 Overwritten code
  Example: Upack packer
  [Figure: hex dump of the binary with the entry point marked; a run of bytes inside existing code is overwritten]

14 PC-sensitive code
  Local data access (e.g., ASProtect):
    call                 ; use call to get the current PC
    pop esi              ; pop the PC into a register
    add esi, eax         ; construct a pointer
    mov ebx, ptr[esi]    ; dereference it to reach the data

15 Anti-patching
  Checksumming detects instrumentation [Aucsmith 96], e.g., PECompact.
  Checksum routine over the protected code:
      xor eax, eax
    .loop:
      add eax, ptr[ebx]    ; calculate checksum of protected region
      add ebx, 4
      cmp ebx, 0x41000
      jne .loop
      cmp eax, .chksum     ; compare to expected value
      jne .fail            ; otherwise continue on the pass path
  A jmp patched into the protected code changes the sum, so instrumentation is detected.
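The effect of the checksum loop on a patched region can be modeled in a few lines. This is a minimal sketch, assuming the routine sums 32-bit little-endian words modulo 2^32 (as the add eax, ptr[ebx] / add ebx, 4 loop suggests); the region contents and the names `checksum`, `protected`, and `patched` are hypothetical and not taken from PECompact.

```python
import struct

def checksum(region: bytes) -> int:
    """Sum the region as 32-bit little-endian words, wrapping like eax."""
    assert len(region) % 4 == 0
    total = 0
    for (word,) in struct.iter_unpack("<I", region):
        total = (total + word) & 0xFFFFFFFF
    return total

# Hypothetical 16-byte protected code region.
protected = bytes.fromhex("7a770e20e93de009e868c045be795e80")
expected = checksum(protected)   # plays the role of the stored .chksum value

# An instrumenter overwrites the first 5 bytes with a jmp rel32 (opcode 0xe9).
patched = b"\xe9\x00\x10\x00\x00" + protected[5:]

print("unmodified passes:", checksum(protected) == expected)
print("patched passes:   ", checksum(patched) == expected)
```

Any tool that patches code in place has to hide the patch from reads like this one, which is what the sensitivity-resistance part of the talk addresses.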

16 Address-space probing
  Memory scan (pseudocode):
    int *ptr = 0;
    sigaction(SIGSEGV, segv_handler);
    while (1) {
    RESTART:
      *ptr;
      ptr += PAGESIZE;
    }
    segv_handler() {
      ptr += PAGESIZE;
      goto RESTART;
    }
  [Figure: the scan walks the address space and finds data, code, and instrumentation regions]
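The memory scan can be illustrated without real signals. The sketch below is a simplified model, assuming the address space is just a set of allocated page addresses and that a failed lookup stands in for the SIGSEGV path; the layout values and the names `probe`, `expected_layout`, and `instrumented_layout` are hypothetical.

```python
PAGE_SIZE = 0x1000

def probe(allocated_pages, start, end):
    """Return the page addresses in [start, end) whose read succeeds."""
    readable = set()
    for addr in range(start, end, PAGE_SIZE):
        if addr in allocated_pages:
            readable.add(addr)   # the dereference worked
        # otherwise the SIGSEGV handler would skip ahead one page
    return readable

# Hypothetical layout the malware expects: two code pages and one data page.
expected_layout = {0x400000, 0x401000, 0x410000}

# An instrumenter maps one extra page to hold its patches.
instrumented_layout = expected_layout | {0x418000}

clean_scan = probe(expected_layout, 0x400000, 0x420000)
dirty_scan = probe(instrumented_layout, 0x400000, 0x420000)
unexpected = dirty_scan - expected_layout
print("unexpected pages:", sorted(hex(a) for a in unexpected))
```

A page that answers the probe but should be unallocated is enough for the malware to conclude that it is being instrumented.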

17 Code discovery algorithm
  Hybrid algorithm:
  1. Parse from known entry points
  2. Instrument control flow that may lead to new code
  3. Resume execution
  Triggers for further parsing: instrument, exception, overwrite
  Example instructions: CALL ptr[eax]; DIV eax, 0
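The parse, instrument, resume cycle can be sketched on a toy program. This is a minimal model under stated assumptions, not the Dyninst implementation: code is a dictionary from address to a list of statically known successors, `None` marks an unresolved indirect transfer, and `runtime_targets` stands in for what instrumentation would observe once execution resumes. All names and addresses are hypothetical.

```python
def parse(code, entry_points, known=None):
    """Static traversal. Returns (known addresses, unresolved transfer sites)."""
    known = set() if known is None else known
    unresolved = set()
    work = list(entry_points)
    while work:
        addr = work.pop()
        if addr in known or addr not in code:
            continue
        known.add(addr)
        successors = code[addr]
        if successors is None:
            unresolved.add(addr)      # e.g. CALL ptr[eax]: target unknown
        else:
            work.extend(successors)
    return known, unresolved

def discover(code, entry, runtime_targets):
    """Alternate static parsing with instrumentation-based target discovery."""
    known, pending = parse(code, [entry])
    instrumented = set()
    while pending:
        site = pending.pop()
        instrumented.add(site)                 # instrument the transfer
        target = runtime_targets.get(site)     # seen when execution resumes
        if target is not None and target not in known:
            known, new_sites = parse(code, [target], known)
            pending |= new_sites - instrumented
    return known, instrumented

# Toy program: 0x14 and 0x24 are indirect transfers.
code = {0x10: [0x14], 0x14: None, 0x20: [0x24], 0x24: None, 0x30: []}
runtime_targets = {0x14: 0x20, 0x24: 0x30}

static_only, _ = parse(code, [0x10])
hybrid, sites = discover(code, 0x10, runtime_targets)
print(len(static_only), "addresses found statically,", len(hybrid), "with the hybrid loop")
```

The static pass alone stops at the first indirect transfer; each observed target seeds another round of parsing.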

22 Accurate parsing
  - Standard control-flow traversal
    - start from known entry points
    - follow control flow to find code
  - New conservative assumption
    - unresolved calls may not return, so we don't parse garbage code
  - New stack tamper detection
    - backward slice at the ret instruction, so we detect modified return addresses
  Example (figure): call ptr[eax]; pop ebp; inc ebp; push ebp; ret; garbage bytes
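Stack tamper detection can be illustrated on the pop/inc/push/ret sequence from the slide. The sketch below swaps the backward slice for a simpler forward symbolic evaluation over a handful of instructions, so it shows the idea only; values are tracked as (base, offset) pairs, and the function name `return_target` and the tuple encoding of instructions are assumptions.

```python
def return_target(instructions):
    """Symbolically run a callee body and report where its ret goes.

    Values are (base, offset) pairs; the caller's return address starts
    on top of the stack as ("retaddr", 0).
    """
    stack = [("retaddr", 0)]
    regs = {}
    for op, reg in instructions:
        if op == "pop":
            regs[reg] = stack.pop()
        elif op == "push":
            stack.append(regs[reg])
        elif op == "inc":
            base, offset = regs[reg]
            regs[reg] = (base, offset + 1)
        elif op == "ret":
            return stack.pop()
    return None

tampering = [("pop", "ebp"), ("inc", "ebp"), ("push", "ebp"), ("ret", None)]
plain = [("ret", None)]

print("tampering callee returns to", return_target(tampering))
print("plain callee returns to   ", return_target(plain))
```

A result other than ("retaddr", 0) means the callee does not return to the instruction after the call, so the bytes there should not be parsed as code.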

23 Instrumentation-based discovery
  Control transfers that need instrumentation:
  - Invalid control transfers
  - Indirect control transfers
  - Exception-based control transfers
  Examples shown:
  - push eax; ret
  - call 401000 (target in an invalid region)
  - call ptr[eax] (target unknown)
  - jmp eax (target unknown)
  - xor eax, eax; mov ebx, ptr[eax] (control passes to an exception handler)

24 Instrumentation-based discovery
  [Figure: the monitored process reaches call ptr[eax]; the target is still unknown to Dyninst]

25 Instrumentation-based discovery
  The original call ptr[eax] is replaced by jmp 823456, which leads to the patch:
    save state
    call findTarget(ptr[eax])
    restore state
    call ptr[eax]
  Instrumentation routine inside the process:
    findTarget(targ) {
      if ( !cacheLookup(targ) )
        RPC_updateAnalysis(targ);
    }
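The findTarget routine is essentially a cache in front of an expensive call back to the analysis tool. A minimal sketch, assuming the cache is seeded with already-parsed targets and that a new target is added to the cache once the analysis has been updated (the slide does not show that step); the class name `TargetMonitor` and the addresses are hypothetical.

```python
class TargetMonitor:
    """Models findTarget: only unseen targets trigger an analysis update."""

    def __init__(self, known_targets):
        self.cache = set(known_targets)
        self.updates = []            # stands in for RPC_updateAnalysis calls

    def find_target(self, target):
        if target not in self.cache:         # !cacheLookup(targ)
            self.update_analysis(target)

    def update_analysis(self, target):
        self.updates.append(target)          # tool would parse at `target`
        self.cache.add(target)               # assumption: cache the result

monitor = TargetMonitor(known_targets={0x401000})
for observed in (0x401000, 0x402000, 0x402000, 0x401000):
    monitor.find_target(observed)

print("analysis updates:", [hex(t) for t in monitor.updates])
```

Keeping the common case inside the process matters because the instrumented transfer may execute many times with the same target.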

26 Overwritten code discovery
  [Figure: a write to a code page mapped RWX, observed by Dyninst]

27 Overwritten code discovery
  When to update? Challenges:
  - large incremental overwrites
  - writes to data
  - writes to own page
  [Figure: a write to a code page with R and E permissions invokes Dyninst's write handler and CFG update routine]

28 Overwritten code discovery
  When to update? Challenges:
  - large incremental overwrites
  - writes to data
  - writes to own page
  Approach:
  - Delay the update until the write routine terminates

29 Overwritten code discovery
  Update after overwrite:
  1. Handle overwrite signal
     a) instrument write loop exits
     b) copy overwritten page
     c) restore write permissions
     d) resume execution
  2. Update CFG when writes end
     a) remove overwritten and unreachable blocks
     b) parse at entry points to overwritten regions
     c) remove write permissions
     d) resume execution
  [Figure: the code page switches between R-X and RWX as the write handler and the CFG update routine run, each reached through a callback (cb) into Dyninst]
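The two-phase update can be sketched as a small state machine over one code page. This is a simplified model that assumes a page is a byte array with a permission string, that the first write to a non-writable page plays the role of the overwrite signal, and that the caller announces the end of the write loop; the class `CodePage` and its method names are hypothetical.

```python
class CodePage:
    """One code page whose overwrites are detected, then batched."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.perms = "R-X"
        self.snapshot = None

    def write(self, offset, value):
        if "W" not in self.perms:
            self.on_write_fault()          # step 1: handle overwrite signal
        self.data[offset] = value

    def on_write_fault(self):
        self.snapshot = bytes(self.data)   # 1b: copy overwritten page
        self.perms = "RWX"                 # 1c: restore write permissions

    def on_write_loop_exit(self):
        """Step 2: report changed offsets so the CFG can be re-parsed."""
        if self.snapshot is None:
            return []
        changed = [i for i, (old, new)
                   in enumerate(zip(self.snapshot, self.data)) if old != new]
        self.snapshot = None
        self.perms = "R-X"                 # 2c: remove write permissions
        return changed

page = CodePage(b"\x90" * 8)
page.write(2, 0xCC)
page.write(3, 0xCC)                        # no second fault: page is writable
perms_during_writes = page.perms
changed = page.on_write_loop_exit()
print("changed offsets:", changed, "perms now:", page.perms)
```

Only the first write pays for a fault; the rest of the loop runs at full speed and the analysis is updated once, which is the point of delaying the update.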

31 Behavior Changes
  - Program modification affects local behavior
  - These changes propagate
  - Malware detects the changes (or crashes)

32 Sensitivity Resistant Approach
  - Identify instructions sensitive to modification
    - Moved instructions that access the program counter
    - Memory operations that may access patched code
    - Memory operations that may scan the address space
  - Project effects on program behavior
    - Are output (or control flow) affected?
    - Use a forward slice and symbolic evaluation
  - Determine how to compensate for modification
    - E.g., by emulating the original instruction

33 PC-sensitivity analysis
  Original code:
    main:
      call foo
      ...
      call next
    next:
      pop %esi
      add %esi, %eax
      mov (%esi), %ebx
      jmp %ebx
    foo:
      ...
      ret
  Analysis of the moved calls:
  - Sensitive: call foo
    Slice: call foo; ret
    Symbolic expansion: pc = $retAddr + $delta
  - Sensitive: call next
    Slice: call next; pop %esi; add %esi, %eax; mov (%esi), %ebx; jmp %ebx
    Symbolic expansion: pc = [$next + %eax + $delta]
  Relocated code:
    reloc_main:
      call foo
      ...
      push $next
      pop %esi
      add %esi, %eax
      mov (%esi), %ebx
      jmp %ebx
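The effect of replacing call next with push $next can be checked with a tiny model of the call/pop sequence. This sketch assumes a concrete original address for next, a fixed relocation distance, and a fixed eax; `ORIG_NEXT`, `DELTA`, and `pointer_built` are hypothetical names, and only the first three instructions of the slice are modeled.

```python
ORIG_NEXT = 0x401005   # hypothetical original address of label `next`
DELTA = 0x500000       # hypothetical distance the code was moved
EAX = 0x20             # hypothetical offset held in eax

def pointer_built(first_insn, next_addr):
    """Run `<first_insn>; pop esi; add esi, eax` and return the pointer."""
    stack = []
    if first_insn == "call next":
        stack.append(next_addr)    # call pushes the current return address
    elif first_insn == "push $next":
        stack.append(ORIG_NEXT)    # compensation pushes the original address
    esi = stack.pop()
    return esi + EAX

original = pointer_built("call next", ORIG_NEXT)
naive = pointer_built("call next", ORIG_NEXT + DELTA)
compensated = pointer_built("push $next", ORIG_NEXT + DELTA)

print(hex(original), hex(naive), hex(compensated))
```

The naively relocated code builds a pointer that is off by the relocation distance, matching the $delta term in the symbolic expansion, while the compensated code reproduces the original pointer.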

34 Sensitivity Classes
  - PC (program counter) sensitive
    - Moved instruction that accesses the PC
  - CF (control flow) sensitive
    - Instruction whose control flow successor was moved
  - CAD (code as data) sensitive
    - Instruction that reads from overwritten memory
  - AVU (allocated vs. unallocated) sensitive
    - Instruction that accesses newly allocated memory

35 Visible Compatibility
  - What behavior do we need to preserve?
    - Allow localized changes that aren't visible from outside the program
  - Preserve:
    - Output
    - Approximation: control flow

36 Handling CAD Sensitivity
  Checksum routine:
      xor eax, eax
    .loop:
      add eax, ptr[ebx]
      add ebx, 4
      cmp ebx, 0x41000
      jne .loop
      cmp eax, .chksum
      jne .fail
  Instrumentation replaces add eax, ptr[ebx] with jmp 863828; the patch runs:
    save state
    emulate (add eax, ptr[ebx])
    restore state
    add ebx, 4
    cmp ebx, 0x41000
    jne .loop
  [Figure: the address space holds data, code, and instrumentation; the emulated read is served from shadow memory]

37 Emulating Memory (Simplified)
  - Save state
      push %eax
      push %ecx
      push %edx
      lahf
      push %eax
  - Determine effective address
      lea …, %ebx
  - Translate effective address
      call translate
  - Restore state
      pop %eax
      sahf
      pop %edx
      pop %ecx
      pop %eax
  - Emulate original memory instruction
      mov (%ebx), %ebx
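The translate step can be sketched as an address redirection into a shadow copy of the original code. This is a minimal model under stated assumptions: memory is a dictionary of byte values, one patched code region has an untouched copy at a fixed shadow base, and all other addresses pass through unchanged. The constants and the names `translate` and `emulated_load` are hypothetical.

```python
CODE_START, CODE_END = 0x401000, 0x401010   # hypothetical patched region
SHADOW_BASE = 0x900000                      # hypothetical shadow copy

def translate(addr):
    """Redirect reads of patched code to the shadow copy."""
    if CODE_START <= addr < CODE_END:
        return SHADOW_BASE + (addr - CODE_START)
    return addr

memory = {0x500000: 0x42}                   # one byte of ordinary data
original_code = bytes(range(0x10, 0x20))
for i, byte in enumerate(original_code):
    memory[CODE_START + i] = byte
    memory[SHADOW_BASE + i] = byte          # shadow keeps the original bytes

# Instrumentation overwrites the start of the region with a jmp.
for i, byte in enumerate(b"\xe9\x00\x10\x00\x00"):
    memory[CODE_START + i] = byte

def emulated_load(addr):
    """What the emulated memory instruction reads."""
    return memory[translate(addr)]

print(hex(memory[CODE_START]), "raw read vs emulated read", hex(emulated_load(CODE_START)))
```

A checksum computed through `emulated_load` sees the original bytes even though the region has been patched, while data accesses are unaffected.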

38 The Devil in the Details
  - IA-32 is a rich instruction set
    - Most instructions can access memory
    - And malware uses a wide variety of them
  - Instruction classes:
    - Most common: MOD/RM byte
    - Less common: "string" operations
    - Least common: absolute address

39 String Operations
  - "String" instructions implicitly use ESI/EDI
    - scas/lods/stos/movs/cmps/ins/outs
    - Some update ESI/EDI, making emulation tricky
  - Malware loves these for copying blocks of memory
  Original instruction:
    movs
  Emulated sequence:
    mov %edi, %edx
    mov %esi, %ecx
    call TranslateShift
    add %edx, %edi
    add %ecx, %esi
    movs
    sub %edx, %edi
    sub %ecx, %esi
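The shift-and-restore pattern around movs can be modeled for a single byte copy. This sketch assumes that TranslateShift returns, for each pointer register, the difference between the translated and the original address, and it reuses a hypothetical shadow-copy translation; the constants and the function `emulated_movsb` are illustrative names only.

```python
CODE_START, CODE_END = 0x401000, 0x401010   # hypothetical patched region
SHADOW_BASE = 0x900000                      # hypothetical shadow copy

def translate(addr):
    if CODE_START <= addr < CODE_END:
        return SHADOW_BASE + (addr - CODE_START)
    return addr

def emulated_movsb(memory, esi, edi):
    """Emulate one byte-sized movs with translated source and destination."""
    shift_src = translate(esi) - esi        # TranslateShift for esi
    shift_dst = translate(edi) - edi        # TranslateShift for edi
    esi += shift_src                        # add the shifts
    edi += shift_dst
    memory[edi] = memory[esi]               # the real movs copies ...
    esi += 1                                # ... and advances both registers
    edi += 1
    esi -= shift_src                        # remove the shifts again
    edi -= shift_dst
    return esi, edi

memory = {CODE_START: 0xE9, SHADOW_BASE: 0x7A}   # patched byte vs original
esi, edi = emulated_movsb(memory, CODE_START, 0x500000)
print(hex(memory[0x500000]), hex(esi), hex(edi))
```

Because the shifts are subtracted afterwards, the program sees ESI and EDI advanced exactly as the original movs would have left them.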

40 Address-space scanning
  Scan routine:
      xor eax, eax
    .loop:
      call chk_mem
      mov ptr[eax], ebx
      add eax, 4
      cmp eax, 0
      jne .loop
  Instrumentation replaces mov ptr[eax], ebx with jmp 863828; the patch runs:
    save state
    emulate (mov ptr[eax], ebx)
    restore state
    and then the rest of the loop
  Handlers: the program's segv_handler and Dyninst's dyn_segv_handler
  [Figure: the address space holds data, code, and instrumentation]

41 Exception Handler Interposition
  Emulated memory access:
    push %eax
    push %ecx
    push %edx
    lahf
    push %eax
    lea …, %eax
    call translate
    pop %eax
    sahf
    pop %edx
    pop %ecx
    pop %eax
    mov (%eax), %eax
  [Figure: on a fault at address 0, the Windows libraries build an exception record (faulting insn, faulting addr, registers); dyn_segv_handler receives it first, and the program's segv_handler then receives its own exception record with the same fields]

42 The packers we're studying
  Packer              Malware market share [1]
  MEW                 0.13%
  WinUPack            0.17%
  Yoda's Protector    0.33%
  Armadillo           0.37%
  Asprotect           0.43%
  FSG                 1.26%
  Aspack              1.29%
  nPack               1.74%
  Upack               2.08%
  PECompact           2.59%
  Themida             2.95%
  EXECryptor          4.06%
  PolyEnE             6.21%
  UPX                 9.45%
  Nspack              0.89%
  [Table columns whose per-packer marks did not survive extraction: Self-modifying, Anti-instrumentation, Obfuscated, Dyninst, SR-Dyninst; one annotation reads "anti-debugging techniques"]
  [1] Packer (r)evolution. Panda Research, 2008. Two-month average, Feb-March 2008.

43 Sample malware analysis factory
  Input: 200 malware binaries
  SD-Dyninst:
  - comprehensive instrumentation
  - network call instrumentation
  Outputs:
  - Stack trace at 1st network communication
  - Control flow graph showing executed blocks
  - Defensive tactics report: unpacked code, overwritten code, control flow obfuscations
  - Trace of Win API calls

44 Factory results for Conficker A
  [Figure: control flow graph with two labeled regions, the initial bootstrap code and the packed payload]

45 Factory results for Conficker A
  [Figure: control flow graph; legend: API func, non-executed block, static block, unpacked block]

46 Factory results for Conficker A
  Instrument network calls and perform a stack-walk.
  Stack-walk of Conficker's communications thread:
    Frame pc=0x100016f7 func: DYNstopThread at 0x100001670 [Dyninst]
    Frame pc=0x71ab2dc0 func: select at 0x71ab2dc0 [Win DLL]
    Frame pc=0x401f34 func: nosym1f058 at 0x41f058 [Conficker]
  (We can also print stackwalks of Conficker's other threads.)

47 Improved Dyninst overhead
  - Reduced relocation overhead despite emulation
  - Better handling of program features
    - Exceptions
    - Indirect control flow

48 Conclusion
  SR-Dyninst gives you:
  - All the benefits of Dyninst on malware
  - Safer instrumentation on normal binaries
  Ongoing work:
  - Anti-debugger techniques
  - More descriptive CFGs
  - Automated defensive-mode activation
  - SR-Dyninst in the next Dyninst release

