VUzzer: Application-aware Evolutionary Fuzzing
Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos

Victor van der Veen
- 32 years old, from Amsterdam
- 3rd-year PhD student with Herbert Bos
- ETA: summer 2018

Your takeaway message of today
- VUzzer: smart fuzzing without symbolic execution
- Extract application features for meaningful mutation
- VUzzer, 30K inputs: 403 crashes
- AFL, 3.29M inputs: 238 crashes
VUzzer @ NDSS – March 1, 2017

PROBLEM STATEMENT
What are we trying to solve anyway?

AFL
    ...
    read(fd, buf, size);
    if (buf[5] == 0xD8 && buf[4] == 0xFF) {
        // interesting code here
    } else {
        pr_exit("Invalid file");
    }
AFL will run for hours on this
- Has to figure out that offsets 4 and 5 are of interest (where)
- Needs to guess 0xFFD8 (what)

More Problems
Handling 'complex' code structures
- Magic bytes: certain values must be placed at pre-determined offsets
- Deeper execution: many inputs end up in less interesting error-handling code
- (Multibyte) markers: not at fixed offsets, e.g.
      if (strstr(buf, "MAZE")) ...

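The "where" and "what" problem can be made concrete with a small sketch. It assumes a naive blind mutator that overwrites one random byte per try, which is a deliberate simplification and not AFL's actual mutation strategy. `parse` is a hypothetical Python stand-in for the C check on the slide, and `blind_mutation_hits` is an illustrative helper.

```python
import random

def parse(buf: bytes) -> str:
    """Python stand-in for the two-byte check shown on the slide."""
    if len(buf) > 5 and buf[5] == 0xD8 and buf[4] == 0xFF:
        return "interesting"
    return "Invalid file"

def blind_mutation_hits(seed: bytes, tries: int, rng: random.Random) -> int:
    """Overwrite one random byte per try; count inputs that pass the check."""
    hits = 0
    for _ in range(tries):
        data = bytearray(seed)
        data[rng.randrange(len(data))] = rng.randrange(256)
        if parse(bytes(data)) == "interesting":
            hits += 1
    return hits

seed = bytes(64)  # 64 zero bytes: fails both comparisons

# A single-byte change can satisfy only one of the two comparisons,
# so every mutated input still lands in the error branch.
single_byte_hits = blind_mutation_hits(seed, 10_000, random.Random(0))

# Chance that one blind write of two adjacent bytes is right:
# the correct start offset (1 of 63) times the correct value (1 of 65536).
p_two_byte = (1 / (len(seed) - 1)) * (1 / 2**16)
```

Under these assumptions a blind two-byte write succeeds with a probability of a few in ten million per try, which is why knowing the offsets and the expected value up front matters.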
VUZZER
Our mutation-based, coverage-based, greybox fuzzer

VUzzer
Where to mutate, what to insert
- Evolutionary fuzzing: mutate/select most promising paths
- Magic byte detection: find possible magic byte values to reach deeper into the binary
- (Limited) input type detection: aid mutation by detecting input bytes of certain types (integers)
Avoid non-scalable techniques
- No symbolic execution
- Limited use of dynamic taint analysis

Feature Extraction
- Data-flow features
- Control-flow features

Feature Extraction: data-flow features
- Information about relationship between input data and program computations
- Extracted using static analysis / dynamic taint analysis
- Example: cmp instructions on x86
  - Offsets: which input bytes are compared against? (taint analysis)
  - Magic values: immediate operands for cmp (static analysis)
- Example: lea instructions
  - Integer types: is index operand tainted?

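A minimal sketch of how the two cmp-derived features could drive a mutation: the taint analysis supplies the input offsets, the cmp immediate supplies the value, and the mutator writes one onto the other. `CmpFeature` and `apply_magic_bytes` are hypothetical names for illustration and are not taken from the VUzzer code base.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

@dataclass(frozen=True)
class CmpFeature:
    offsets: Tuple[int, ...]  # input offsets tainting the cmp operand (taint analysis)
    value: bytes              # immediate the operand is compared against (static analysis)

def apply_magic_bytes(data: bytes, features: Iterable[CmpFeature]) -> bytes:
    """Place each expected value at the offsets that feed its comparison."""
    out = bytearray(data)
    for feature in features:
        for offset, byte in zip(feature.offsets, feature.value):
            if offset < len(out):  # ignore offsets beyond this input
                out[offset] = byte
    return bytes(out)

# Features matching the check from the problem statement:
# buf[4] is compared with 0xFF and buf[5] with 0xD8.
features = [CmpFeature((4,), b"\xff"), CmpFeature((5,), b"\xd8")]
mutated = apply_magic_bytes(bytes(8), features)
```

The rest of the input is left untouched, so other mutation operators remain free to change it.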
Feature Extraction: control-flow features
- Information about importance of certain execution paths
- Identify error-handling blocks (heuristics based)
- Rank basic blocks to prioritize hard-to-reach code
  - Each basic block gets a weight depending on how deep it is nested
  - Error-handling blocks get a negative weight

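One plausible way to turn "how deep it is nested" into a number is sketched here. It assumes an acyclic control-flow graph, assumes every outgoing edge of a block is equally likely, and uses the inverse of the probability of reaching a block as its weight. The fixed error weight of -1.0, the function name `block_weights`, and the example graph are illustrative assumptions. The sketch needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

def block_weights(succs, entry, error_blocks, error_weight=-1.0):
    """Weight each block by 1 / (probability of reaching it from entry)."""
    nodes = set(succs) | {s for outs in succs.values() for s in outs}
    preds = {n: set() for n in nodes}
    for block, outs in succs.items():
        for s in outs:
            preds[s].add(block)

    prob = {n: 0.0 for n in nodes}
    prob[entry] = 1.0
    # Visit blocks after all of their predecessors (valid for acyclic graphs).
    for block in TopologicalSorter(preds).static_order():
        outs = succs.get(block, [])
        for s in outs:
            prob[s] += prob[block] / len(outs)

    weights = {}
    for n in nodes:
        if n in error_blocks:
            weights[n] = error_weight  # penalize error-handling code
        else:
            weights[n] = 1.0 / prob[n] if prob[n] > 0 else 0.0
    return weights

# Toy graph: two nested checks, both of which can fall into an error block.
cfg = {
    "entry": ["check1", "err"],
    "check1": ["deep", "err"],
    "deep": ["exit"],
    "err": ["exit"],
    "exit": [],
}
weights = block_weights(cfg, "entry", error_blocks={"err"})
```

In this toy graph the block behind both checks ends up with the highest weight, and the error block is the only one with a negative weight.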
Workflow diagram, step by step:
- Extract CMP immediates; rank basic blocks
- Start from seed inputs (known valid)
- Run dynamic taint analysis on the seed inputs
- Dynamic taint analysis adds: error-handling code, magic bytes, LEA offsets
- Fitness(executed code): high scores for inputs that execute highly ranked basic blocks
- Mutate and loop; dynamic taint analysis only when new code is covered

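The loop can be sketched end to end on a toy target. Everything here is an illustrative assumption rather than VUzzer's implementation: `run_target` stands in for an instrumented binary, `taint_hints` stands in for dynamic taint analysis, `WEIGHTS` is a hand-written block ranking, and fitness is a plain sum of the weights of the executed blocks.

```python
import random

WEIGHTS = {"entry": 1.0, "check1": 2.0, "deep": 4.0, "err": -1.0}

def run_target(data):
    """Toy instrumented target: returns the list of executed blocks."""
    trace = ["entry"]
    if data[4] == 0xFF:
        trace.append("check1")
        if data[5] == 0xD8:
            trace.append("deep")
            return trace
    trace.append("err")
    return trace

def taint_hints(trace):
    """Stand-in for taint analysis: (offset, value) for the next unsolved cmp."""
    if "check1" not in trace:
        return [(4, 0xFF)]
    if "deep" not in trace:
        return [(5, 0xD8)]
    return []

def fitness(trace):
    """Sum of the weights of the distinct blocks an input executed."""
    return sum(WEIGHTS[b] for b in set(trace))

def mutate(data, hints, rng):
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)  # blind mutation
    for offset, value in hints:  # applied last so hints are not overwritten
        out[offset] = value
    return bytes(out)

def evolve(seed, generations, rng):
    population, seen, hints = [seed], set(), []
    for _ in range(generations):
        ranked = sorted(population, key=lambda d: fitness(run_target(d)),
                        reverse=True)
        parents = ranked[:2]  # keep the fittest inputs
        for parent in parents:
            trace = run_target(parent)
            new_blocks = set(trace) - seen
            if new_blocks:  # taint analysis only when new code is covered
                seen |= new_blocks
                hints += taint_hints(trace)
        children = [mutate(p, hints, rng) for p in parents for _ in range(4)]
        population = parents + children
    return max(population, key=lambda d: fitness(run_target(d)))

best = evolve(bytes(8), generations=3, rng=random.Random(0))
```

Parents are carried over into the next generation, so an input that already reaches the deeper block is never lost to an unlucky blind mutation.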
Evaluation: DARPA Cyber Grand Challenge
- VUzzer / AFL
- 13 binaries, 6 hours per binary
- Far fewer VUzzer inputs
  - AFL: 22k inputs
  - VUzzer: 400 inputs

Evaluation: LAVA-M Dataset
- LAVA: inject hard-to-reach faults to (e.g.) evaluate fuzzers
- VUzzer hits significantly more bugs than
  - FUZZER: coverage-based
  - SES: symbolic execution / SAT-based

Program  Total Bugs  FUZZER  SES  VUzzer
uniq     28  7  27
base64   44  9  17
md5sum   57  2  1
who      2136  18  50

Evaluation: Various Applications
- Comparison against AFL on vanilla Ubuntu 14.04
- More bugs with fewer inputs

Program   Crashes (AFL)  Crashes (VUzzer)  Inputs (AFL)  Inputs (VUzzer)
mpg321    19             337               883K          24K
gif2png   7              127               1.84M         43K
pdf2svg   -              13                923K          5K
tcpdump   -              3                 2.89M         78K
tcptrace  238            403               3.29M         30K
djpeg     -              1                 35.9M         90K

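The "fewer inputs" claim can be checked with a little arithmetic on the input counts in the table. The dictionary restates the table's numbers; the variable names are illustrative.

```python
# (AFL inputs, VUzzer inputs) per program, taken from the table above.
inputs = {
    "mpg321": (883_000, 24_000),
    "gif2png": (1_840_000, 43_000),
    "pdf2svg": (923_000, 5_000),
    "tcpdump": (2_890_000, 78_000),
    "tcptrace": (3_290_000, 30_000),
    "djpeg": (35_900_000, 90_000),
}

# How many times more inputs AFL executed than VUzzer.
ratios = {program: afl / vuzzer for program, (afl, vuzzer) in inputs.items()}

for program, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{program:9s} {ratio:6.1f}x")
```

The ratios range from roughly 37x (mpg321) to roughly 400x (djpeg), which is one to more than two orders of magnitude.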
Evaluation: Various Applications
- Comparison against AFL on vanilla Ubuntu 14.04
- Crash faster
- Consistent progress

Conclusion
VUzzer
- Novel fuzzing technique based on evolutionary approach
- Application-aware fuzzer by exploiting data-flow and control-flow features
  - Prioritize hard-to-reach code paths
  - Deprioritize error-handling code
- Significantly more bugs with orders of magnitude fewer inputs in less time

Final Remarks
- Open source: https://github.com/vusec/vuzzer
- VUSec project page: https://vusec.net/projects/fuzzing
- s.rawat@vu.nl