Assembly Code Verification Using Model Checking Hao XIAO Singapore University of Technology and Design
Outline Motivation Approach overview ILA PAT On-going & future work
Motivation
Benefits Higher reliability: more software components can be verified. Circumvents problems caused by the compiler: the verification target is closer to what actually runs on the CPU than the source code is. Easy to verify: binaries have a more elegant syntax and better-defined semantics than source code.
Challenges Instruction complexity. Lack of high-level semantic information. Dynamic jumps and calls; no clear boundaries for a “function”. How to specify properties for assembly code. Scalability: assembly code is much longer than source code.
Design Goals Accuracy: faithfully handle the complex instructions of an ISA. Extensibility: easily extended to handle the instruction sets of different architectures. Ease of use: useful even to those who are not familiar with temporal logic or assembly language. High efficiency: scalable to large programs.
Approach Overview (1) [Architecture diagram. Components: ELF, Vine IL, Vine, Emulator, Static Analyzer, Model Checker, User & Built-in Properties, Properties Parser, Counterexample]
Approach Overview (2) Accuracy and extensibility: Vine IL. Ease of use: built-in properties; if the source is available, counterexamples are linked back to it. High efficiency: property-guided abstraction techniques for state space reduction; function abstraction.
PAT Vine IL Emulator Static Analyses Built in properties Example-buffer overflow checking
Vine IL [Translation diagram. Stages: Binary file, Assembly, VEX IR, Vine IL. Tools: Libbfd, Vine, LibVex]
Vine IL Example
Emulator (State Builder) The emulator generates the successor states of the current state. A state consists of the CPU registers, the PC, and memory. Global state is kept separate from local state. Byte-precision memory model.
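A minimal sketch of a state builder in this spirit, not the emulator used in the tool. The three-instruction toy language, the names `State`, `successors`, `run` and `PROGRAM`, and the choice to read unwritten memory as 0 are all assumptions made for illustration; the separation of global and local state is not modeled.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class State:
    """Hashable snapshot: program counter, registers, byte-level memory."""
    pc: int
    regs: tuple  # sorted (register name, 32-bit value) pairs
    mem: tuple   # sorted (address, byte) pairs, one entry per byte


def store32(mem, addr, value):
    # Byte-precision model: a 32-bit write becomes four little-endian bytes.
    for i in range(4):
        mem[(addr + i) & MASK32] = (value >> (8 * i)) & 0xFF


def load32(mem, addr):
    # Assumption: bytes that were never written read as 0.
    return sum(mem.get((addr + i) & MASK32, 0) << (8 * i) for i in range(4))


def successors(state, program):
    """Return the list of successor states of `state` (empty when the program ends)."""
    instr = program.get(state.pc)
    if instr is None:
        return []
    regs, mem = dict(state.regs), dict(state.mem)
    op, a, b = instr
    if op == "mov_imm":      # a := constant b
        regs[a] = b & MASK32
    elif op == "store":      # M[regs[a]] := regs[b]
        store32(mem, regs[a], regs[b])
    elif op == "load":       # a := M[regs[b]]
        regs[a] = load32(mem, regs[b])
    else:
        raise ValueError(f"unknown instruction: {op}")
    return [State(state.pc + 1, tuple(sorted(regs.items())), tuple(sorted(mem.items())))]


def run(program):
    """Follow successors from the initial state until none remain."""
    state = State(0, (), ())
    while True:
        nxt = successors(state, program)
        if not nxt:
            return state
        state = nxt[0]


# Hypothetical toy program: write a word to memory, then read it back.
PROGRAM = {
    0: ("mov_imm", "eax", 0x11223344),
    1: ("mov_imm", "esp", 0x1000),
    2: ("store", "esp", "eax"),
    3: ("load", "ebx", "esp"),
}
```

Because each state is immutable and hashable, a model checker can keep visited states in a set and call `successors` on the ones it has not seen yet.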
Static Analyses for State Space Reduction Stack analysis. Dead variable analysis. Value set analysis. Interrupt flag analysis. Path reduction.
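To illustrate how one of these analyses can shrink the state space, here is a rough sketch of dead variable analysis as a standard backward liveness fixed point over a control flow graph. The slides do not say how the analysis is implemented; the function name `liveness`, the block encoding, and the toy graph are assumptions.

```python
def liveness(blocks, succ):
    """Compute the set of live variables at the entry of each block.

    blocks: block name -> list of (defs, uses) pairs, one per instruction
    succ:   block name -> list of successor block names
    """
    live_in = {name: set() for name in blocks}
    changed = True
    while changed:  # iterate until no live-in set changes any more
        changed = False
        for name, instrs in blocks.items():
            # Live at block exit: union of what the successors need.
            live = set()
            for s in succ.get(name, []):
                live |= live_in[s]
            # Walk the block backwards: kill definitions, add uses.
            for defs, uses in reversed(instrs):
                live = (live - set(defs)) | set(uses)
            if live != live_in[name]:
                live_in[name] = live
                changed = True
    return live_in


# Hypothetical toy graph: entry -> loop -> (loop | exit)
BLOCKS = {
    "entry": [(["i"], []), (["tmp"], [])],        # i = 0; tmp = 5
    "loop":  [([], ["i", "n"]), (["i"], ["i"])],  # compare i with n; i = i + 1
    "exit":  [(["r"], ["i"])],                    # r = i
}
SUCC = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
```

In this toy graph `tmp` is never live, so its value would not need to be stored in any state, and two states that differ only in `tmp` could be merged.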
Built-in Properties Stack overflow checking. Integer overflow checking. Null pointer dereference checking. Division by zero checking. Uninitialized variable checking. Data race checking.
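Two of these properties reduce to simple predicates over machine values. The sketch below assumes 32-bit two's-complement registers; the helper names are hypothetical and are not taken from the tool.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def add_overflows(a, b):
    """True if the signed 32-bit addition a + b does not fit in 32 bits."""
    exact = to_signed32(a) + to_signed32(b)
    return not (-(1 << 31) <= exact < (1 << 31))


def div_by_zero(divisor):
    """True if a 32-bit division by this register value would trap."""
    return (divisor & MASK32) == 0
```

A checker could evaluate such a predicate on the operands just before the corresponding instruction and report an error state when it holds.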
Example-Buffer Overflow Checking Buffer overflow at the assembly level: a write to a memory location beyond the boundaries of the current stack frame. Identify instrumentation points: find write operations that have a variable d as their destination address. Assertion instrumentation: add the assertion d < %ebp && d > %esp before the write instruction (the x86 stack grows downward, so %esp holds the lowest address of the frame). Model check the assertions.
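The instrumentation step can be sketched on a toy instruction list as follows. The tuple encoding, the `assert_in_frame` pseudo-instruction, and the function names are assumptions. The frame check is written for the usual x86 layout, in which the stack grows downward and %esp is therefore below %ebp.

```python
def instrument(instrs):
    """Insert a frame-bounds assertion before every memory write.

    Each instruction is a tuple; ("store", d, src) writes src to address d.
    """
    out = []
    for ins in instrs:
        if ins[0] == "store":
            # Instrumentation point: the destination address d is ins[1].
            out.append(("assert_in_frame", ins[1]))
        out.append(ins)
    return out


def in_frame(d, esp, ebp):
    """Assumed check: d lies strictly between the frame's two boundaries."""
    return esp < d < ebp


# Hypothetical fragment: compute an address, then write through it.
CODE = [
    ("add", "eax7", "eax6"),
    ("store", "eax7", "edx0"),
]
```

Calling `instrument(CODE)` yields the same fragment with `("assert_in_frame", "eax7")` placed directly before the store, which is the point the model checker then has to prove or refute.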
Example- C++ source code
Example-Assembly Code s1 s2 s3 s4 s5 s6
Control Flow Graph [CFG diagram. Nodes: S1, S2, S3, S4, S5, S6, J1, J2, J3]
CFG for Instrumented Code [CFG diagram. Nodes: S1, S3, S4.1, S4.2, S5, S6, J1, J3, A1, Error]
Node contents:
esp1 = esp0 - 0x4
M[esp1] = ebp0
ebp1 = esp1
esp2 = esp1 - max{0, 15}
esp3 = esp2 - 0x20
M[ebp1 + 0x8] > 1
eax0 = M[ebp1 + 0xc]
eax1 = M[eax0 + 0x4]
M[esp3 + 0x18] = eax1
M[esp3 + 0x1c] = 0
ebx0 = φ(S3, S4.2, M[esp3 + 0x1c])
eax2 = M[esp3 + 0x18]
eax3 = strlen(eax2)
eax3 < ebx0
eax4 = M[esp3 + 0x1c]
eax5 = eax4 + M[esp3 + 0x18]
edx0 = M[eax5]
eax6 = esp3 + 0x10
eax7 = M[esp3 + 0x1c] + eax6
eax7 < ebp1 && eax7 > esp3
M[eax7] = edx0
M[esp3 + 0x1c] = M[esp3 + 0x1c] + 1
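As a rough illustration of what model checking this assertion involves, the sketch below explores a much-reduced model of the loop by breadth-first search. It is not how PAT or the tool works internally. Several assumptions are made: the stack grows downward (so the in-frame check is esp3 < eax7 < ebp1), the alignment step subtracts between 0 and 15 bytes, the input length is chosen nondeterministically up to `max_len`, and only the values that affect the assertion are kept in the state. The function name and the start address are hypothetical.

```python
from collections import deque


def check_copy_loop(max_len, esp0=0x8000):
    """Return a counterexample dict for the frame assertion, or None."""
    queue = deque()
    # Initial states: one per alignment padding and input string length.
    for pad in range(16):
        for n in range(max_len + 1):
            esp1 = esp0 - 0x4          # push of the saved frame pointer
            ebp1 = esp1
            esp3 = esp1 - pad - 0x20   # alignment, then space for locals
            queue.append((ebp1, esp3, n, 0))
    seen = set()
    while queue:
        state = queue.popleft()
        if state in seen:
            continue
        seen.add(state)
        ebp1, esp3, n, i = state
        if i >= n:
            continue                   # loop exit: no successor
        eax7 = esp3 + 0x10 + i         # destination address of the write
        if not (esp3 < eax7 < ebp1):   # assertion node fails: error state
            return {"length": n, "index": i, "address": eax7,
                    "ebp": ebp1, "esp": esp3}
        queue.append((ebp1, esp3, n, i + 1))
    return None
```

Under these assumptions, inputs of up to 16 characters never violate the assertion, while a longer input produces a counterexample in which the write address reaches `ebp1`. The check only detects writes that leave the frame; it does not notice a write that overwrites another local inside the frame.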
On-going & Future Work Implementation. More abstraction techniques (e.g., irrelevant code elimination). Symbolic model checking.
The End Thanks!