Download presentation
Presentation is loading. Please wait.
1
TAintscope A Checksum-Aware Directed fuzzing Tool for Automatic Software Vulnerability Detection Tielei Wang1, Tao Wei1, Guofei Gu2, Wei Zou1 1Peking University, China 2Texas A&M University, US
2
2 terms Checksum – a way to check the integrity of data. Used in network protocols and files. Fuzzing – generating malformed inputs and feeding them to the application. Dynamic Taint Analysis – runs a program and observes which computations are affected by predefined taint sources (e.g. input) data Checksum field Checksum function
3
The problem The input mutation space is enormous .
3 The problem The input mutation space is enormous . Most malformed inputs dropped at an early stage, if the program employs a checksum mechanism.
4
The problem 1 void decode_image(FILE* fd){ 2 ...
4 The problem 1 void decode_image(FILE* fd){ 2 ... 3 int length = get_length(fd); 4 int recomputed_chksum = checksum(fd, length); 5 int chksum_in_file = get_checksum(fd); //line 6 is used to check the integrity of inputs 6 if(chksum_in_file != recomputed_chksum) 7 error(); 8 int Width = get_width(input_file); 9 int Height = get_height(input_file); 10 int size = Width*Height*sizeof(int); 11 int* p = malloc(size); 12 ... 13 for(i=0; i<Height; i++){// read ith row to p 14 read_row(p+Width*i, i, fd);
5
The IDEA 5 To infer whether/where a program checks the integrity of input. Identify which input bytes can flow into sensitive points: Taint analysis at byte level – monitors how application uses the input data. Create malformed input focusing the “hot bytes”. Repair checksum fields in input, to expose vulnerability. Fully automatic Found 27 new vulnerability – acrobat reader, google picasa and more.
6
How does it work? Dynamic taint tracing Detecting checksum
6 How does it work? Dynamic taint tracing Detecting checksum Directed fuzzing Repairing crashed samples
7
How does it work? Checksum Locator Directed Fuzzer Checksum Repairer
7 How does it work? Modified Program Crashed Samples Checksum Locator Directed Fuzzer Checksum Repairer Instruction Profile Hot Bytes Info Reports Execution Monitor
8
How does it work? Dynamic taint tracing
8 How does it work? Dynamic taint tracing Runs the program with well-formed input. Execution monitor records: Which input bytes related to arguments of API functions (e.g. malloc, strcpy) – “hot bytes” report. Which bytes each conditional jump instruction depends on (e.g. JZ, JE, JB) – checksum report. Considering only data flow (no control flow).
9
How does it work? Dynamic taint tracing eax {0x6, 0x7}, ebx {0x8, 0x9}
Instruments instructions – movement (e.g. MOV, PUSH), arithmetic (e.g. SUB, ADD), logic (e.g. AND, XOR) Taints all values written by an instruction with union of all taint labels associated with values used by that instruction. Considering also eflags register. eax {0x6, 0x7}, ebx {0x8, 0x9} add eax, ebx eax {0x6, 0x7, 0x8, 0x9}, eflags {0x6, 0x7, 0x8, 0x9}
10
How does it work? Dynamic taint tracing - EXAMPLE
10 How does it work? Dynamic taint tracing - EXAMPLE Input size is 1024 bytes “hot bytes” report: 8 int Width = get_width(input_file); 9 int Height = get_height(input_file); 10 int size = Width*Height*sizeof(int); 11 int* p = malloc(size); … 0x8048d5b: invoking malloc: [0x8,0xf]
11
How does it work? Dynamic taint tracing - EXAMPLE
11 How does it work? Dynamic taint tracing - EXAMPLE Input size is 1024 bytes checksum report: 6 if(chksum_in_file != recomputed_chksum) error(); … 0x8048d4f: JZ: 1024: [0x0,0x3ff]
12
How does it work? 2. Detecting checksum Checksum detector:
12 How does it work? 2. Detecting checksum Checksum detector: identify potential checksum check points the recomputed checksum value depends on many input bytes Instruments conditional jump. Before execution, checks whether the number of marks associated with eflags register exceeds a threshold. Problem with decompressed bytes.
13
How does it work? 2. Detecting checksum Refinement:
13 How does it work? 2. Detecting checksum Refinement: Well-formed inputs can pass the checksum test, but most malformed inputs cannot
14
How does it work? 2. Detecting checksum Refinement:
14 How does it work? 2. Detecting checksum Refinement: Well-formed inputs can pass the checksum test, but most malformed inputs cannot Run well-formed inputs, identify the always-taken and always-not-taken instructions.
15
How does it work? 2. Detecting checksum Refinement:
15 How does it work? 2. Detecting checksum Refinement: Well-formed inputs can pass the checksum test, but most malformed inputs cannot Run well-formed inputs, identify the always-taken and always-not-taken instructions. Run malformed inputs, also identify the always-taken and always-not-taken instructions.
16
How does it work? 2. Detecting checksum Refinement:
16 How does it work? 2. Detecting checksum Refinement: Well-formed inputs can pass the checksum test, but most malformed inputs cannot Run well-formed inputs, identify the always-taken and always-not-taken instructions. Run malformed inputs, also identify the always-taken and always-not-taken instructions. Identify the conditional jump instructions that behaves completely different when processing well-formed and malformed inputs.
17
How does it work? 2. Detecting checksum Checksum detector:
17 How does it work? 2. Detecting checksum Checksum detector: Creates bypass rules – always-taken, always-not-taken 6 if(chksum_in_file != recomputed_chksum) error(); … 0x8048d4f: JZ: 1024: [0x0,0x3ff] 0x8048d4f: JZ: always-taken
18
How does it work? 2. Detecting checksum Checksum detector:
18 How does it work? 2. Detecting checksum Checksum detector: Checksum field identification Input bytes that affects chksum_in_file are the checksum field. 6 if(chksum_in_file != recomputed_chksum) error();
19
How does it work? 3. Directed fuzzing
19 How does it work? 3. Directed fuzzing Generates malformed test cases – feeds them to the original or instrumented program. According to the bypass rules, alters the execution traces at check points – sets the eflags register.
20
How does it work? 3. Directed fuzzing
20 How does it work? 3. Directed fuzzing All malformed test cases are constructed based on the “hot bytes” information Using attack heuristics: bytes that influence memory allocation are set to small, large or negative. bytes that flow into string functions are replaced by characters such as %n, %p. Output – test cases that could cause to crash or consume 100% CPU.
21
How does it work? 3. Directed fuzzing
21 How does it work? 3. Directed fuzzing 6 if(chksum_in_file != recomputed_chksum) error(); 8 int Width = get_width(input_file); 9 int Height = get_height(input_file); 10 int size = Width*Height*sizeof(int); 11 int* p = malloc(size); … 0x8048d4f: JZ: 1024: [0x0,0x3ff] Checksum report … 0x8048d5b: invoking malloc: [0x8,0xf] “hot bytes” report Bypass info 0x8048d4f: JZ: always-taken
22
How does it work? 3. Directed fuzzing
22 How does it work? 3. Directed fuzzing 6 if(chksum_in_file != recomputed_chksum) error(); 8 int Width = get_width(input_file); 9 int Height = get_height(input_file); 10 int size = Width*Height*sizeof(int); 11 int* p = malloc(size); Before executing 0x8048d4f, the fuzzer sets the flag ZF in eflags to an opposite value … 0x8048d4f: JZ: 1024: [0x0,0x3ff] Checksum report … 0x8048d5b: invoking malloc: [0x8,0xf] “hot bytes” report Bypass info 0x8048d4f: JZ: always-taken
23
How does it work? 4. Repairing crashed samples
23 How does it work? 4. Repairing crashed samples Fixing is expensive - fixes checksum fields only in test cases that caused crashing. How? Cr – row data in the checksum field D – input data protected by checksum filed Checksum() – the complete checksum algorithm T – transformation We want to pass the constraint: Checksum(D) == T(Cr)
24
How does it work? 4. Repairing crashed samples
24 How does it work? 4. Repairing crashed samples Using symbolic execution to solve: Checksum(D) is a runtime determinable constant: Only Cr is a symbolic value. Common transformations (e.g. converting from hex/oct to decimal), can be solved by existing solvers (STP). Checksum(D) == T(Cr) c== T(Cr)
25
How does it work? 4. Repairing crashed samples
25 How does it work? 4. Repairing crashed samples If the new test case cause the original program to crash, a potential vulnerability is detected!
26
26 evaluation An incomplete list of applications:
27
27 evaluation “hot bytes” identification results – memory allocation
28
28 evaluation Checksum identification results: Threshold = 16
29
29 evaluation Correct checksum fields:
30
evaluation 27 previous unknown Vulnerabilities: MS Paint ImageMagick
30 27 previous unknown Vulnerabilities: MS Paint ImageMagick Adobe Acrobat Google Picasa irfanview gstreamer Winamp XEmacs Amaya dillo wxWidgets PDFlib
31
31 evaluation Vulnerabilities detected by TaintScope:
32
32 Discussion TaintScope cannot deal with secure integrity check schemes (e.g. cryptographic hash algorithms, digital signature) – impossible to generate valid test cases. Limited effectiveness when all input data are encrypted (tracking decrypted data). Checksum check points identification can be affected by the quality of inputs. Not tracks control flow propagation. Not all instructions of x86 are instrumented by the execution monitor.
33
Conclusion TaintScope can perform: Directed fuzzing
33 Conclusion TaintScope can perform: Directed fuzzing Identify which bytes flow into system/library calls. dramatically reduce the mutation space. Checksum-aware fuzzing Disable checksum checks by control flow alternation. Generate correct checksum fields in invalid inputs.
34
34 questions
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.