A Model for Self-Modifying Code. Bertrand Anckaert, Matias Madou and Koen De Bosschere. 8th Information Hiding Conference, July 11th, 2006.


1 A Model for Self-Modifying Code
Bertrand Anckaert, Matias Madou and Koen De Bosschere
8th Information Hiding Conference, July 11th, 2006

2 Self-Modifying Code
o Problem for Reverse-Engineering
o Used for Hiding Program Internals
  - Software Protection: Copyright Protection Mechanisms, Secret Algorithms, …
  - Malicious intent of viruses
o Program Optimization

3 Scope
[Figure: two program binaries shown as bit strings; one label reads "known"]
Focus: malicious host paradigm
Not: malicious code paradigm

4 Goal
o Internal Representation
o Construction and Deconstruction
o Accurate and Conservative
o Analyses and Transformations

5 Overview
o Introduction
o Running Example
o Internal Representation
o Construction and Deconstruction
o Analyses and Transformations
o Applications
Accurate and Conservative

6 Example: ISA
Assembly        Binary          Semantics
movb value to   0xc6 value to   set the byte at address "to" to "value"
inc reg         0x40 reg        increment register reg
dec reg         0x48 reg        decrement register reg
push reg        0xff reg        push register reg
jmp to          0x0c to         jump to address "to" (absolute)

7 Example: Introduction
Address  Binary    Assembly
0x0      c6 0c 08  movb 0xc 0x8
0x3      40 01     inc %ebx
0x5      c6 0c 05  movb 0xc 0x5
0x8      40 03     inc %edx
0xa      ff 02     push %ecx
0xc      48 01     dec %ebx
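The listing above can be reproduced with a small linear-sweep disassembler for the toy ISA. This is a minimal sketch, not the authors' tool: the names `OPCODES`, `REGISTERS` and `disassemble` are made up for illustration, and the register numbers (01 = %ebx, 02 = %ecx, 03 = %edx) are an assumption read off the example bytes.

```python
# Toy ISA from the "Example: ISA" slide: opcode byte -> (mnemonic, operand kinds).
OPCODES = {
    0xC6: ("movb", ("imm", "addr")),
    0x40: ("inc", ("reg",)),
    0x48: ("dec", ("reg",)),
    0xFF: ("push", ("reg",)),
    0x0C: ("jmp", ("addr",)),
}
# Assumption: register numbers inferred from the example listing.
REGISTERS = {0x01: "%ebx", 0x02: "%ecx", 0x03: "%edx"}


def disassemble(code):
    """Linear sweep: decode instructions back to back, starting at address 0."""
    listing, pc = [], 0
    while pc < len(code):
        mnemonic, kinds = OPCODES[code[pc]]
        raw = code[pc + 1 : pc + 1 + len(kinds)]
        operands = [
            REGISTERS[byte] if kind == "reg" else f"{byte:#x}"
            for kind, byte in zip(kinds, raw)
        ]
        listing.append((pc, " ".join([mnemonic, *operands])))
        pc += 1 + len(kinds)
    return listing


PROGRAM = bytes.fromhex("c60c08 4001 c60c05 4003 ff02 4801")

for address, text in disassemble(PROGRAM):
    print(f"{address:#x}: {text}")
```

A static sweep like this only sees the initial program state; it cannot show the instructions that appear once the code rewrites itself.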

8 Example: Trace
Program states:
Initial:       movb 0xc 0x8 | inc %ebx | movb 0xc 0x5 | inc %edx | push %ecx | dec %ebx
After step 1:  movb 0xc 0x8 | inc %ebx | movb 0xc 0x5 | jmp 0x3  | push %ecx | dec %ebx
After step 3:  movb 0xc 0x8 | inc %ebx | jmp 0xc      | jmp 0x3  | push %ecx | dec %ebx
Trace:
1) movb 0xc 0x8
2) inc %ebx
3) movb 0xc 0x5
4) jmp 0x3
5) inc %ebx
6) jmp 0xc
7) dec %ebx
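The trace can be checked by executing the example on a tiny emulator that decodes every instruction from the current memory contents. This is a sketch under assumptions: `run` and `REG_NAMES` are hypothetical names, registers start at 0, and the register encoding is inferred from the example bytes.

```python
# Assumption: register numbers inferred from the example listing.
REG_NAMES = {0x01: "%ebx", 0x02: "%ecx", 0x03: "%edx"}
SIMPLE = {0x40: "inc", 0x48: "dec", 0xFF: "push"}


def run(program, max_steps=50):
    """Execute the toy ISA on a writable copy of the code.

    Returns (trace, registers). Registers start at 0 (assumption).
    """
    mem = bytearray(program)
    regs = {number: 0 for number in REG_NAMES}
    stack, trace, pc = [], [], 0
    while pc < len(mem) and len(trace) < max_steps:
        opcode = mem[pc]  # decoded from the current memory contents
        if opcode == 0xC6:  # movb value to
            value, to = mem[pc + 1], mem[pc + 2]
            trace.append(f"movb {value:#x} {to:#x}")
            mem[to] = value  # this write may overwrite code
            pc += 3
        elif opcode == 0x0C:  # jmp to (absolute)
            target = mem[pc + 1]
            trace.append(f"jmp {target:#x}")
            pc = target
        elif opcode in SIMPLE:  # inc / dec / push reg
            reg = mem[pc + 1]
            trace.append(f"{SIMPLE[opcode]} {REG_NAMES[reg]}")
            if opcode == 0x40:
                regs[reg] += 1
            elif opcode == 0x48:
                regs[reg] -= 1
            else:
                stack.append(regs[reg])
            pc += 2
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at {pc:#x}")
    return trace, regs


PROGRAM = bytes.fromhex("c60c08 4001 c60c05 4003 ff02 4801")
trace, regs = run(PROGRAM)
for step, text in enumerate(trace, start=1):
    print(f"{step}) {text}")
```

Steps 4 and 6 execute `jmp` instructions that do not exist in the initial binary; they are created by the two `movb` writes.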

9 Overview
o Scope
o Running Example
o Internal Representation
  - Superposition of CFGs
  - Codebytes
  - Codebyte Conditional Edges
  - Consumption of Codebyte Values
o Construction and Deconstruction
o Analyses and Transformations
o Applications

10 CFG for Traditional Code
o One of the most important internal representations for traditional code
o Well understood:
  - how to construct and deconstruct it
  - accurate and conservative: a representation of a superset of all possible executions
  - analyses and transformations

11 Traditional CFG Construction for SMC
[Figure: traditional CFGs built from the three program states of the running example, with nodes annotated by the trace steps (1 to 7) that execute them]
Labels in the figure: not conservative; not a superset; not accurate; Unreachable Code Elimination

12 Example: Superposition of CFGs
[Figure: one graph containing every instruction that occurs in any program state: movb 0xc 0x8, inc %ebx, jmp 0x3, movb 0xc 0x5, dec %ebx, jmp 0xc, inc %edx, push %ecx]
Trace steps per node: 1) movb 0xc 0x8; 2, 5) inc %ebx; 3) movb 0xc 0x5; 4) jmp 0x3; 6) jmp 0xc; 7) dec %ebx

13 Contains CFG 1
[Figure: the CFG of the initial program state (movb 0xc 0x8, inc %ebx, movb 0xc 0x5, inc %edx, push %ecx, dec %ebx) shown next to the superposition]

14 Contains CFG 2
[Figure: the CFG of the second program state (inc %ebx, movb 0xc 0x5, jmp 0x3, push %ecx, dec %ebx) shown next to the superposition]

15 Contains CFG 3
[Figure: the CFG of the third program state (inc %ebx, jmp 0xc, jmp 0x3, push %ecx, dec %ebx) shown next to the superposition]

16 Superposition of CFGs
o Represents a superset of all possible executions
o But:
  - how do we linearize a graph with multiple outgoing/incoming fall-through paths?
  - how do we analyze what states the program can be in at a given program point?
  - …
Extensions

17 Overview
o Scope
o Running Example
o Internal Representation
  - Superposition of CFGs
  - CodeBytes
  - CodeByte Conditional Edges
  - Consumption of CodeByte Values
o Construction and Deconstruction
o Analyses and Transformations
o Applications

18 CodeByte
[Figure: codebyte 0x5 with the values c6 and 0c; labels: identifier (address), states, initial state]

19 Extension 1: CodeBytes
[Figure: the superposed CFG (movb 0xc 0x8, inc %ebx, jmp 0x3, movb 0xc 0x5, inc %edx, dec %ebx, jmp 0xc, push %ecx) with each instruction linked to its codebytes]
Codebyte  States
0x0       c6
0x1       0c
0x2       08
0x3       40
0x4       01
0x5       c6 0c
0x6       0c
0x7       05
0x8       40 0c
0x9       03
0xa       ff
0xb       02
0xc       48
0xd       01
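A codebyte as described on these slides (identifier, set of states, initial state) could be modelled as follows. This is a minimal sketch: the class and function names are hypothetical, and the list of writes is taken by hand from the two `movb` instructions of the running example rather than derived automatically.

```python
from dataclasses import dataclass, field


@dataclass
class CodeByte:
    address: int                                 # identifier
    initial: int                                 # value in the original binary
    states: list = field(default_factory=list)   # all values it can take

    def add_state(self, value):
        if value not in self.states:
            self.states.append(value)


def build_codebytes(image, writes):
    """One CodeByte per byte of the image, extended with every written value."""
    codebytes = {
        address: CodeByte(address, value, [value])
        for address, value in enumerate(image)
    }
    for address, value in writes:
        codebytes[address].add_state(value)
    return codebytes


PROGRAM = bytes.fromhex("c60c08 4001 c60c05 4003 ff02 4801")
# (address, value) pairs written by "movb 0xc 0x8" and "movb 0xc 0x5".
WRITES = [(0x8, 0x0C), (0x5, 0x0C)]

codebytes = build_codebytes(PROGRAM, WRITES)
for cb in codebytes.values():
    if len(cb.states) > 1:
        print(f"{cb.address:#x}: {[f'{s:02x}' for s in cb.states]}")
```

Only codebytes 0x5 and 0x8 end up with more than one state, which matches the codebyte list on the slide.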

20 Extension 2: CodeByte Conditional Edges
[Figure: the superposed CFG with codebytes as on slide 19, now with conditions on edges]
Edge conditions: *(0x5)==c6, *(0x5)==0c, *(0x8)==40, *(0x8)==0c
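One way to picture conditional edges is as edges carrying an optional guard `*(address)==value` that is checked against the current code memory. The edge list below is my reading of the figure and should be treated as an assumption, as should the names `EDGES` and `successors`.

```python
# (source, target, guard); guard is None or (address, value) for *(address)==value.
# Assumption: edge list reconstructed from the running example, not from the paper.
EDGES = [
    ("movb 0xc 0x8", "inc %ebx", None),
    ("inc %ebx", "movb 0xc 0x5", (0x5, 0xC6)),
    ("inc %ebx", "jmp 0xc", (0x5, 0x0C)),
    ("movb 0xc 0x5", "inc %edx", (0x8, 0x40)),
    ("movb 0xc 0x5", "jmp 0x3", (0x8, 0x0C)),
    ("inc %edx", "push %ecx", None),
    ("push %ecx", "dec %ebx", None),
    ("jmp 0x3", "inc %ebx", None),
    ("jmp 0xc", "dec %ebx", None),
]


def successors(node, memory):
    """Targets of `node` whose guard holds for the given code memory."""
    return [
        target
        for source, target, guard in EDGES
        if source == node and (guard is None or memory[guard[0]] == guard[1])
    ]


# Values of the two multi-state codebytes in the three program states.
initial = {0x5: 0xC6, 0x8: 0x40}
after_first_write = {0x5: 0xC6, 0x8: 0x0C}
after_second_write = {0x5: 0x0C, 0x8: 0x0C}

print(successors("movb 0xc 0x5", initial))
print(successors("movb 0xc 0x5", after_first_write))
print(successors("inc %ebx", after_second_write))
```

For a fixed memory state at most one guarded successor is enabled per fall-through, so each program state selects one of the superposed CFGs.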

21 Extension 3: Consumption of CodeBytes
o A codebyte is read when it is interpreted as (part of) an instruction by the CPU
o Important for data analyses, such as liveness analysis
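To make consumption concrete, the sketch below executes the running example and records, per address, which byte values were actually fetched as part of an instruction. Note the assumption: this is a dynamic observation of one run, not the static analysis the slides refer to, and `consumed_states` and `SIZES` are hypothetical names.

```python
# Instruction length per opcode in the toy ISA.
SIZES = {0xC6: 3, 0x40: 2, 0x48: 2, 0xFF: 2, 0x0C: 2}


def consumed_states(program, max_steps=50):
    """Map address -> set of byte values fetched as (part of) an instruction."""
    mem = bytearray(program)
    consumed = {}
    pc = steps = 0
    while pc < len(mem) and steps < max_steps:
        opcode = mem[pc]
        size = SIZES[opcode]
        for address in range(pc, pc + size):
            consumed.setdefault(address, set()).add(mem[address])
        if opcode == 0xC6:      # movb value to: writes into code memory
            mem[mem[pc + 2]] = mem[pc + 1]
            pc += size
        elif opcode == 0x0C:    # jmp to (absolute)
            pc = mem[pc + 1]
        else:                   # inc / dec / push do not touch code memory
            pc += size
        steps += 1
    return consumed


PROGRAM = bytes.fromhex("c60c08 4001 c60c05 4003 ff02 4801")
consumed = consumed_states(PROGRAM)

# Initial byte values that this run never interprets as code.
never_consumed = [
    address
    for address, value in enumerate(PROGRAM)
    if value not in consumed.get(address, set())
]
print([f"{address:#x}" for address in never_consumed])
```

In this run the initial value 40 at 0x8 is overwritten before it is ever fetched, which is the kind of fact a liveness analysis on codebytes is after.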

22 Traditional Code vs. Self-Modifying Code
o Traditional Code
  - No Overlap
  - Not Self-Inspecting
  - Not Self-Modifying
o Special case of self-modifying code. Extensions can be omitted because:
  - It can be easily linearized, as instructions do not overlap
  - Target locations of control transfers can be in only one state
  - The result of data analyses on code is trivial, as the code is constant

23 Overview
o Scope
o Running Example
o Internal Representation
o Construction and Deconstruction
o Analyses and Transformations
o Applications

24 Construction
o Requires that we know:
  - Targets of control flow
  - Which instructions write what where
o Not a problem in the malicious host paradigm
o In the malicious code paradigm (Future Work):
  - Observing dynamic execution
  - Static extension

25 Linearization
[Figure: the superposed CFG with its codebytes, laid out as one chain of codebytes 0x0 to 0xd]
Resulting binary: c6 0c 08 40 01 c6 0c 05 40 03 ff 02 48 01

26 Example: Introduction
Address  Binary    Assembly
0x0      c6 0c 08  movb 0xc 0x8
0x3      40 01     inc %ebx
0x5      c6 0c 05  movb 0xc 0x5
0x8      40 03     inc %edx
0xa      ff 02     push %ecx
0xc      48 01     dec %ebx

27 Overview
o Scope
o Running Example
o Internal Representation
o Construction and Deconstruction
o Analyses and Transformations
  - Constant Propagation
  - Unreachable Code(Byte) Elimination
  - Liveness Analysis
  - Loop Unrolling
o Applications

28 Constant Propagation
[Figure: the superposed CFG with codebytes and conditional edges as on slide 20; the edge condition *(0x8)==40 is singled out]
Edge conditions: *(0x5)==c6, *(0x5)==0c, *(0x8)==40, *(0x8)==0c

29 Unreachable Code(Byte) Elimination
[Figure: the superposed CFG with codebytes as on slide 20, without the edge condition *(0x8)==40]
Remaining edge conditions: *(0x5)==c6, *(0x5)==0c, *(0x8)==0c
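Once an analysis has shown that a guard can never hold, unreachable instructions fall out of a plain reachability pass over the remaining edges. The sketch below assumes, as the absence of *(0x8)==40 on this slide suggests, that constant propagation ruled out that guard; the edge list is my reading of the running example and the names are hypothetical.

```python
from collections import deque

# (source, target, guard); guard is None or (address, value) for *(address)==value.
# Assumption: edge list reconstructed from the running example.
EDGES = [
    ("movb 0xc 0x8", "inc %ebx", None),
    ("inc %ebx", "movb 0xc 0x5", (0x5, 0xC6)),
    ("inc %ebx", "jmp 0xc", (0x5, 0x0C)),
    ("movb 0xc 0x5", "inc %edx", (0x8, 0x40)),
    ("movb 0xc 0x5", "jmp 0x3", (0x8, 0x0C)),
    ("inc %edx", "push %ecx", None),
    ("push %ecx", "dec %ebx", None),
    ("jmp 0x3", "inc %ebx", None),
    ("jmp 0xc", "dec %ebx", None),
]


def reachable(entry, edges, impossible_guards):
    """Breadth-first search that skips edges whose guard can never hold."""
    succ = {}
    for source, target, guard in edges:
        if guard not in impossible_guards:
            succ.setdefault(source, []).append(target)
    seen, work = {entry}, deque([entry])
    while work:
        node = work.popleft()
        for target in succ.get(node, []):
            if target not in seen:
                seen.add(target)
                work.append(target)
    return seen


all_nodes = {node for source, target, _ in EDGES for node in (source, target)}
live = reachable("movb 0xc 0x8", EDGES, impossible_guards={(0x8, 0x40)})
print(sorted(all_nodes - live))
```

The two instructions reported as unreachable, `inc %edx` and `push %ecx`, are the ones that no longer appear on the following slides.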

30 Liveness Analysis
[Figure: the graph after unreachable code elimination: movb 0xc 0x8, inc %ebx, jmp 0x3, movb 0xc 0x5, dec %ebx, jmp 0xc, with codebytes 0x0 to 0x9, 0xc and 0xd; codebyte 0x8 (states 40 and 0c) is highlighted]
Edge conditions: *(0x5)==c6, *(0x5)==0c, *(0x8)==0c

31 Idempotent Instruction Removal
[Figure: the same graph as on slide 30 (movb 0xc 0x8, inc %ebx, jmp 0x3, movb 0xc 0x5, dec %ebx, jmp 0xc); codebyte 0x8 is highlighted]
Edge conditions: *(0x5)==c6, *(0x5)==0c, *(0x8)==0c

32 Loop Unrolling and …
[Figure: the loop of the running example unrolled; copied instructions get fresh codebytes with symbolic identifiers (_a to _k) next to the original codebytes 0x3 to 0x7, 0xc and 0xd]
Edge conditions: *(_c)==0c, *(_c)==c6, *(0x5)==0c, *(0x5)==c6
Trace: 1) movb 0xc 0x8 2) inc %ebx 3) movb 0xc 0x5 4) jmp 0x3 5) inc %ebx 6) jmp 0xc 7) dec %ebx

33 Overview
o Scope
o Running Example
o Internal Representation
o Construction and Deconstruction
o Analyses and Transformations
o Applications

34 Applications
o Outlining of almost identical code snippets through one-bit modifiers
o Overlapping similar functions through diff scripts
o Significant slowdown (factor 1.15 up to 3)

35 Almost Identical Code Snippets
Snippet 1: push 0xa804245c; pop %ebx; ret
  codebytes 0x0 to 0x6: 68 5c 24 04 a8 5b c3
Snippet 2: mov 4(%esp),%ebx; test 0x5b,%al; ret
  codebytes 0x0 to 0x6: 8b 5c 24 04 a8 5b c3

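36 Merged Code Snippets
[Figure: both snippets share codebytes 0x1 to 0x6 (5c 24 04 a8 5b c3) and a single ret; codebyte 0x0 has the two states 8b and 68]
Entry for snippet 1 (push 0xa804245c; pop %ebx): movb 0x68 0x0; jmp 0x0
Entry for snippet 2 (mov 4(%esp),%ebx; test 0x5b,%al): movb 0x8b 0x0; jmp 0x0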
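The modifiers in front of the merged snippet are just the byte-wise difference between the two variants. A minimal sketch, with hypothetical helper names and assuming both variants have the same length; modifiers are printed in the `movb value to` syntax of the toy ISA:

```python
def edit_script(current, wanted):
    """(address, new_value) for every byte that must change to get `wanted`."""
    if len(current) != len(wanted):
        raise ValueError("this sketch only handles equally long snippets")
    return [
        (address, new)
        for address, (old, new) in enumerate(zip(current, wanted))
        if old != new
    ]


def as_modifiers(script):
    """Render an edit script as 'movb value to' instructions."""
    return [f"movb {value:#x} {address:#x}" for address, value in script]


PUSH_VARIANT = bytes.fromhex("68 5c 24 04 a8 5b c3")  # push 0xa804245c; pop %ebx; ret
MOV_VARIANT = bytes.fromhex("8b 5c 24 04 a8 5b c3")   # mov 4(%esp),%ebx; test 0x5b,%al; ret

print(as_modifiers(edit_script(MOV_VARIANT, PUSH_VARIANT)))
print(as_modifiers(edit_script(PUSH_VARIANT, MOV_VARIANT)))
```

The two snippets differ in a single byte, so one write followed by a jump to the shared code is enough to select either behaviour.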

37 Conclusion
Internal Representation
o Superposition of different CFGs
o Three extensions
  - CodeByte data structure
  - CodeByte conditional edges
  - Consumption of CodeBytes
Allows for:
  - Construction (limited) and Deconstruction
  - Conservative and Accurate
  - Analyses and Transformations (iterative)

38 Questions?
Presentation: http://www.elis.ugent.be/~banckaer
Tool: http://www.elis.ugent.be/diablo

39 Linearization
o Instead of chains of instructions: chains of codebytes
o Codebytes c and d must be concatenated if:
  - c and d are successive codebytes in an instruction
  - c is the last codebyte of instruction I, d is the first codebyte of instruction J, and I and J are successive instructions in a basic block
  - c is the last codebyte of basic block A, d is the first codebyte of basic block B, and A and B are connected by a fall-through path
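The concatenation rules can be sketched as building a "must be followed by" relation over codebytes and then reading off the chains. Assumptions in this sketch: rules two and three are merged into one list of (I, J) instruction pairs, since both say that the last codebyte of I precedes the first codebyte of J; the instruction-to-codebyte mapping and the pairs are my reading of the running example; all names are hypothetical.

```python
def linearize(instructions, successive, codebyte_ids):
    """Return the chains of codebyte ids that must be laid out contiguously."""
    follows = {}

    def concat(c, d):
        # A codebyte can have only one direct successor in the layout.
        if follows.setdefault(c, d) != d:
            raise ValueError(f"conflicting successors for codebyte {c:#x}")

    for codebytes in instructions.values():          # rule 1: inside an instruction
        for c, d in zip(codebytes, codebytes[1:]):
            concat(c, d)
    for first, second in successive:                 # rules 2 and 3: I followed by J
        concat(instructions[first][-1], instructions[second][0])

    has_predecessor = set(follows.values())
    chains = []
    for head in codebyte_ids:
        if head in has_predecessor:
            continue
        chain = [head]
        while chain[-1] in follows:
            chain.append(follows[chain[-1]])
        chains.append(chain)
    return chains


PROGRAM = bytes.fromhex("c60c08 4001 c60c05 4003 ff02 4801")
INSTRUCTIONS = {  # instruction -> ids (addresses) of its codebytes
    "movb 0xc 0x8": [0x0, 0x1, 0x2],
    "inc %ebx": [0x3, 0x4],
    "movb 0xc 0x5": [0x5, 0x6, 0x7],
    "jmp 0xc": [0x5, 0x6],
    "inc %edx": [0x8, 0x9],
    "jmp 0x3": [0x8, 0x9],
    "push %ecx": [0xA, 0xB],
    "dec %ebx": [0xC, 0xD],
}
SUCCESSIVE = [  # fall-through or same-block successors; jmp has none
    ("movb 0xc 0x8", "inc %ebx"),
    ("inc %ebx", "movb 0xc 0x5"),
    ("inc %ebx", "jmp 0xc"),
    ("movb 0xc 0x5", "inc %edx"),
    ("movb 0xc 0x5", "jmp 0x3"),
    ("inc %edx", "push %ecx"),
    ("push %ecx", "dec %ebx"),
]

chains = linearize(INSTRUCTIONS, SUCCESSIVE, range(len(PROGRAM)))
layout = bytes(PROGRAM[c] for chain in chains for c in chain)
print(layout.hex(" "))
```

Overlapping instructions such as `movb 0xc 0x5` and `jmp 0xc` impose the same constraints on their shared codebytes, so the whole example collapses into one chain and the original binary is recovered from the initial states.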

40-42 Example: Superposition of CFGs
[Figure, repeated on slides 40, 41 and 42: the superposed CFG with the instructions movb 0xc 0x8, inc %ebx, jmp 0x3, movb 0xc 0x5, inc %edx, dec %ebx, jmp 0xc, push %ecx]

