TaintCheck and LockSet
LBA Reading Group Presentation by Shimin Chen
Papers
- TaintCheck: "Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software." J. Newsome and D. Song. NDSS 2005.
- LockSet: "Eraser: A dynamic race detector for multi-threaded programs." S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson. ACM TOCS, 15(4), 1997.
TaintCheck Goal: Detecting Overwrite Attacks
- The most commonly occurring security exploits are overwrite attacks
- They exploit software bugs:
  - Buffer overrun: e.g. strcpy()
  - Integer overflow
  - Bad printf statement: e.g. printf(buf)
- They overwrite sensitive data values:
  - Return address on the stack
  - Global Offset Table (GOT): dynamic library function pointers
  - Format strings
TaintCheck Overview
- Label data originating from, or arithmetically derived from, untrusted sources (e.g. the network) as tainted
- Keep track of the propagation of tainted data as the program executes
- Detect when tainted data is used in dangerous ways (e.g. as a jump target address)
- Implemented with dynamic binary instrumentation: e.g. Valgrind
TaintCheck State
- 1 bit for every register (e.g. eax, ebx, ecx, edx, esi, edi, esp, ebp)
- 1 bit for every byte of application memory
- 0 = not tainted; 1 = tainted
- (The paper also proposes more sophisticated state schemes that maintain linked lists to track taint propagation)
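The one-bit-per-register, one-bit-per-byte layout above can be sketched as a small shadow-state class. This is an illustrative sketch only: the class name `ShadowState`, its methods, and the flat bitmap are assumptions made for the example, not TaintCheck's actual data structures.

```python
class ShadowState:
    """Hypothetical shadow state: one taint bit per register and per memory byte."""

    REGISTERS = ("eax", "ebx", "ecx", "edx", "esi", "edi", "esp", "ebp")

    def __init__(self, mem_size):
        # One taint bit per register, all initially untainted.
        self.reg_taint = {r: 0 for r in self.REGISTERS}
        # Bitmap: bit i covers application byte i, so it needs mem_size / 8 bytes.
        self.mem_bits = bytearray((mem_size + 7) // 8)

    def set_mem(self, addr, tainted):
        """Set or clear the taint bit of the application byte at addr."""
        byte, bit = divmod(addr, 8)
        if tainted:
            self.mem_bits[byte] |= 1 << bit
        else:
            self.mem_bits[byte] &= ~(1 << bit) & 0xFF

    def get_mem(self, addr):
        """Return the taint bit (0 or 1) of the application byte at addr."""
        byte, bit = divmod(addr, 8)
        return (self.mem_bits[byte] >> bit) & 1


state = ShadowState(mem_size=64)
state.set_mem(10, True)       # mark byte 10 as tainted
state.reg_taint["eax"] = 1    # mark eax as tainted
```

With a bitmap like this, the memory shadow costs one eighth of the tracked address range, which is the main appeal of the single-bit scheme over the linked-list variants.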
TaintSeed: What Data Should Be Marked as Tainted?
- Input data returned by system calls: read, recv, etc.
- Mark the taint bit for every byte of the input data buffer
- Configurable based on the input source:
  - Socket?
  - Standard input?
  - File user ID?
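A minimal sketch of the seeding step, assuming a hook that runs after a read/recv-style call returns. The policy set `TAINTED_SOURCES`, the source labels, and the function `seed_after_read` are hypothetical names chosen for the example; the real tool hooks the system calls themselves.

```python
# Hypothetical policy: data from sockets and stdin is untrusted, files are trusted.
TAINTED_SOURCES = {"socket", "stdin"}

taint = {}  # shadow map: byte address -> taint bit


def seed_after_read(source_kind, buf_addr, nbytes):
    """Mark every byte the call wrote into the buffer according to the policy."""
    bit = 1 if source_kind in TAINTED_SOURCES else 0
    for addr in range(buf_addr, buf_addr + nbytes):
        taint[addr] = bit


seed_after_read("socket", 0x1000, 4)  # 4 bytes received from the network
seed_after_read("file", 0x2000, 4)    # 4 bytes read from a trusted file
```

Note that trusted input clears the bits rather than leaving them alone, since the buffer may have held tainted data before the call.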
TaintTracker: How Should the Taint Attribute Propagate?
- Instrument every instruction
- Given an operation: d = s1 op s2
- Compute: taint_d = taint_s1 OR taint_s2
- Handle corner cases: e.g. xor eax, eax (the result is always 0, so eax becomes untainted)
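The propagation rule can be sketched on a toy three-operand instruction form. The function `propagate` and the register dictionary are illustrative assumptions; only the OR rule and the xor corner case come from the slide.

```python
reg_taint = {"eax": 0, "ebx": 0, "ecx": 0}


def propagate(op, dst, src1, src2):
    """Update the taint bit of dst for the operation dst = src1 op src2."""
    if op == "xor" and src1 == src2:
        # xor r, r always yields 0, a constant that does not depend on input.
        reg_taint[dst] = 0
    else:
        # General rule: taint_d = taint_s1 OR taint_s2.
        reg_taint[dst] = reg_taint[src1] | reg_taint[src2]


reg_taint["ebx"] = 1                     # ebx holds untrusted input
propagate("add", "eax", "eax", "ebx")    # eax = eax + ebx, eax becomes tainted
after_add = reg_taint["eax"]
propagate("xor", "eax", "eax", "eax")    # eax = 0, eax becomes untainted
after_xor = reg_taint["eax"]
```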
TaintAssert: What Usage of Tainted Data Should Raise an Alarm as an Attack?
- Check every indirect jump instruction:
  - If the jump target address is in a register, check the taint bit of the register
  - If the jump target address is in memory, check the taint bit of the memory location
- Check format strings of printf-like calls
- Other checks: e.g. syscall args
- Report an error if tainted
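The indirect-jump check can be sketched as follows. The operand encoding, the exception `TaintAlarm`, and the choice to treat a memory operand as a 4-byte (32-bit) target whose bytes are each checked are assumptions made for the example.

```python
class TaintAlarm(Exception):
    """Raised when tainted data is about to be used as a jump target."""


def check_indirect_jump(operand, reg_taint, mem_taint):
    """operand is ("reg", name) or ("mem", address of the 4-byte target)."""
    kind, where = operand
    if kind == "reg":
        tainted = bool(reg_taint.get(where, 0))
    else:
        # Assumption: a 32-bit target, tainted if any of its 4 bytes is tainted.
        tainted = any(mem_taint.get(where + i, 0) for i in range(4))
    if tainted:
        raise TaintAlarm(f"tainted jump target via {kind} {where!r}")


reg_taint = {"eax": 0, "ebx": 1}
mem_taint = {0x2002: 1}  # one byte of a stored return address was overwritten


def is_attack(operand):
    """Convenience wrapper: True if the check raises an alarm."""
    try:
        check_indirect_jump(operand, reg_taint, mem_taint)
        return False
    except TaintAlarm:
        return True
```

A `ret` instruction fits the memory case: its target is the return address stored on the stack, so a buffer overrun that reaches it leaves tainted bytes behind.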
Usages without Alarm
- EFLAGS
  - Program control flow is typically determined by input data
- Addresses used in data movement instructions
  - It is common to use input data as an array index
- Too many false positives if these raised alarms
TaintCheck Evaluation
- Functionality:
- Performance: up to 37X slowdown
TaintCheck for Automatic Signature Generation
LockSet Overview
- Goal: detect violations of the locking convention in multi-threaded programs
- Every shared object must be protected consistently by one lock, or one set of locks, throughout the program
- C(v): the lock set for object v
- locks_held(t): the set of locks held by thread t
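The basic lock set refinement from the Eraser paper narrows C(v) on every access: C(v) := C(v) ∩ locks_held(t), with a warning when C(v) becomes empty. A minimal sketch, where the function names and the string thread and lock identifiers are assumptions made for the example:

```python
locks_held = {}  # thread id -> set of locks currently held, locks_held(t)
C = {}           # object v -> candidate lock set C(v)
warnings = []    # (thread, object) pairs where C(v) became empty


def lock(t, m):
    locks_held.setdefault(t, set()).add(m)


def unlock(t, m):
    locks_held[t].discard(m)


def access(t, v):
    """Refine C(v) with the locks held by t and warn if no lock remains."""
    held = locks_held.get(t, set())
    # C(v) conceptually starts as "all locks", so the first access yields held.
    C[v] = set(held) if v not in C else C[v] & held
    if not C[v]:
        warnings.append((t, v))


# T1 protects v with mu1, T2 protects v with mu2: no common lock remains.
lock("T1", "mu1"); access("T1", "v"); unlock("T1", "mu1")
lock("T2", "mu2"); access("T2", "v"); unlock("T2", "mu2")
```

After the second access C(v) is empty, so the sketch records a warning for T2 even though no two accesses actually overlapped in time. This is what makes the approach independent of the observed thread interleaving.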
Improvements
- Initialization
- Read-shared data
- Read-write locks
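The first two improvements are handled in Eraser by a per-variable state machine with four states (virgin, exclusive, shared, shared-modified). The sketch below is a simplified rendering under stated assumptions: it enters the exclusive state on the first access of any kind, it omits read-write locks entirely, and the names `Word` and `access` are hypothetical.

```python
VIRGIN, EXCLUSIVE, SHARED, SHARED_MODIFIED = range(4)


class Word:
    def __init__(self):
        self.state = VIRGIN
        self.owner = None    # first accessing thread, meaningful in EXCLUSIVE
        self.lockset = None  # C(v), meaningful in SHARED / SHARED_MODIFIED


def access(word, thread, held, is_write):
    """Apply one access; return True if a race should be reported."""
    if word.state == VIRGIN:
        # Initialization: the first thread may touch the data without locks.
        word.state, word.owner = EXCLUSIVE, thread
        return False
    if word.state == EXCLUSIVE:
        if thread == word.owner:
            return False
        # A second thread appears: start tracking the candidate lock set.
        word.lockset = set(held)
        word.state = SHARED_MODIFIED if is_write else SHARED
    else:
        word.lockset &= held
        if is_write:
            word.state = SHARED_MODIFIED
    # Read-shared data: an empty lock set is only reported once it is written.
    return word.state == SHARED_MODIFIED and not word.lockset


w = Word()
r1 = access(w, "T1", set(), is_write=True)    # initialization by T1
r2 = access(w, "T2", set(), is_write=False)   # unlocked read by T2
r3 = access(w, "T3", set(), is_write=False)   # unlocked read by T3
r4 = access(w, "T2", set(), is_write=True)    # unlocked write by T2
```

Only the final unlocked write is reported: initialization by a single thread and lock-free reads of data that is never written afterwards stay silent.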
LockSet State
- For every 4-byte application word, keep 4 bytes (32 bits) of information
- 2 bits encode one of the four states
- 30 bits encode the first thread ID (exclusive) or the lock set address (shared / shared-modified)

2 bits | 30 bits
state  | Address of lock set / owner ID
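Packing the state and the payload into one 32-bit shadow word can be sketched as below. Placing the state in the top two bits follows the left-to-right order of the layout on the slide, but that bit position is an assumption, as is treating the payload as a plain 30-bit integer.

```python
STATE_BITS, PAYLOAD_BITS = 2, 30
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1


def pack(state, payload):
    """Combine a 2-bit state and a 30-bit owner ID / lock set reference."""
    if not 0 <= state < (1 << STATE_BITS):
        raise ValueError("state must fit in 2 bits")
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload must fit in 30 bits")
    return (state << PAYLOAD_BITS) | payload


def unpack(word):
    """Split a 32-bit shadow word back into (state, payload)."""
    return word >> PAYLOAD_BITS, word & PAYLOAD_MASK


shadow_word = pack(1, 42)   # e.g. state 1 with owner thread ID 42
```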
Valgrind Implementation
- Monitor pthread calls: pthread_mutex_lock / pthread_mutex_unlock
  - Modify locks_held(t)
- Monitor malloc/free calls:
  - Initialize states
- Instrument each memory reference:
  - Compute lock set intersection
- Improvements: monitor pthread_create / pthread_join to handle changes in exclusive data ownership
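The three kinds of monitored events can be sketched as a trace replayer. This does not use Valgrind's API: the event tuples, the handler `on_event`, and the 4-byte word granularity of the shadow map are assumptions for illustration, and only the basic intersection is applied (no state machine, no pthread_create / pthread_join handling).

```python
from collections import defaultdict

locks_held = defaultdict(set)  # thread -> locks currently held
shadow = {}                    # word address -> candidate lock set (None = untouched)
reports = []                   # (thread, word address) pairs with an empty lock set


def on_event(kind, tid, arg):
    if kind == "lock":            # pthread_mutex_lock
        locks_held[tid].add(arg)
    elif kind == "unlock":        # pthread_mutex_unlock
        locks_held[tid].discard(arg)
    elif kind == "malloc":        # initialize state for each allocated word
        addr, size = arg
        for a in range(addr, addr + size, 4):
            shadow[a] = None
    elif kind == "free":          # drop state for each freed word
        addr, size = arg
        for a in range(addr, addr + size, 4):
            shadow.pop(a, None)
    elif kind == "access":        # instrumented memory reference
        word = arg & ~3
        held = locks_held[tid]
        prev = shadow.get(word)
        shadow[word] = set(held) if prev is None else prev & held
        if not shadow[word]:
            reports.append((tid, word))


trace = [
    ("malloc", "T1", (0x100, 8)),
    ("lock", "T1", "mu"), ("access", "T1", 0x100), ("unlock", "T1", "mu"),
    ("access", "T2", 0x102),      # same word, no lock held
    ("free", "T1", (0x100, 8)),
]
for event in trace:
    on_event(*event)
```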
Limitations
- Cannot deal with barrier synchronization
- In scientific computing, a program often consists of multiple stages separated by barriers
- Shared data usage patterns can be very different across stages
- An object v can be accessed correctly by T1 in stage A and by T2 in stage B without locking