RUGRAT: Runtime Test Case Generation using Dynamic Compilers
Ben Breech (NASA Goddard Space Flight Center)
Lori Pollock, John Cavazos (University of Delaware)
Motivating Example

    if ((sptr = malloc (size + 1)) == NULL) {
        findmem ();
        if ((sptr = malloc (size + 1)) == NULL)
            xlfail ("insufficient string space");
    }

How do I test this callsite?
- Make the machine run out of memory?
- Flip the conditional, recompile, flip back?
- Pretend it doesn't exist during testing?
Generalizing the Problem
Code that handles uncommon situations:
- Difficult to test
- May need an external environment event to trigger
Examples:
- Error handling code
- Program security mechanisms
Observation
Hard-to-reach code executes when the program thinks something uncommon has occurred.

    if ((sptr = malloc (size + 1)) == NULL) {
        findmem ();
        if ((sptr = malloc (size + 1)) == NULL)
            xlfail ("insufficient string space");
    }

Could test findmem() by simulating the error, e.g., by adding instructions to the program so that it believes malloc failed.
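The observation can be illustrated at the source level. The following is a minimal sketch, not RUGRAT's mechanism (RUGRAT adds instructions at run time rather than editing source). The names malloc_under_test, simulate_malloc_failure, and alloc_string are hypothetical, findmem is a stand-in that only counts its calls, and a -1 return stands in for xlfail.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical switch: nonzero means "make every allocation appear to fail". */
static int simulate_malloc_failure = 0;

/* Stand-in for the recovery routine; it only records that it was called. */
static int findmem_calls = 0;
void findmem(void) { findmem_calls++; }

/* Hypothetical wrapper: the real malloc still runs, then its result is
   overridden so that the caller believes the allocation failed. */
void *malloc_under_test(size_t size)
{
    void *p = malloc(size);
    if (simulate_malloc_failure) {
        free(p);          /* do not leak the real allocation */
        p = NULL;         /* simulated failure value */
        errno = ENOMEM;   /* simulated side effect */
    }
    return p;
}

/* Same shape as the callsite on the slide.
   Returns 0 on success, -1 where the original would call xlfail. */
int alloc_string(size_t size, char **out)
{
    char *sptr;
    if ((sptr = malloc_under_test(size + 1)) == NULL) {
        findmem();
        if ((sptr = malloc_under_test(size + 1)) == NULL)
            return -1;
    }
    *out = sptr;
    return 0;
}
```

With simulate_malloc_failure set to 1, a call such as alloc_string(8, &s) reaches findmem and the final failure branch without the machine actually running out of memory.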
RUGRAT Approach
- Use dynamic compilers to generate test cases for hard-to-reach code.
- Automatically add instructions to the program during execution to simulate an uncommon situation.
Dynamic Compilers
Dynamic compilers perform compilation tasks during program execution:
- Analysis
- Transformation
- Optimization
[Figure: dynamic compiler pipeline. Labels: code, create basic block, basic block, translate, modified basic block, execute on CPU.]
RUGRAT Architecture
[Figure: RUGRAT architecture. The dynamic compiler pipeline (code, create basic block, basic block, translate, modified basic block, execute on CPU) with the RUGRAT components added: Dynatest Generator, Test Spec, Test Oracle, Test Report.]
Test Spec
Details where and how to insert tests. The current prototype is limited (environment variables). Can express:
- Function locations
  - "test all calls to function x"
  - "test only second call to x in function y"
- Failure value (e.g., 0, -1, etc.)
- Some side effects
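A test spec of this kind could be read from environment variables roughly as follows. This is a sketch under stated assumptions: the slides do not give the prototype's variable names or format, so RUGRAT_TARGET, RUGRAT_CALLER, RUGRAT_CALL_INDEX, RUGRAT_FAILURE_VALUE, struct test_spec, and all function names here are hypothetical. Side effects are left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of one test spec entry. */
struct test_spec {
    char target[64];     /* function whose calls are tested, e.g. "malloc" */
    char caller[64];     /* only test calls made inside this function; "" = any */
    int  call_index;     /* 1-based call to test; 0 = test all calls */
    long failure_value;  /* value the target should appear to return */
};

/* Build a spec from strings. NULL caller, index, or value mean
   "any caller", "all calls", and "failure value 0" respectively.
   Returns 0 on success, -1 on malformed input. */
int parse_test_spec(const char *target, const char *caller,
                    const char *index, const char *value,
                    struct test_spec *spec)
{
    char *end;

    if (target == NULL || *target == '\0' ||
        strlen(target) >= sizeof spec->target)
        return -1;
    if (caller != NULL && strlen(caller) >= sizeof spec->caller)
        return -1;

    strcpy(spec->target, target);
    strcpy(spec->caller, caller != NULL ? caller : "");
    spec->call_index = 0;
    spec->failure_value = 0;

    if (index != NULL) {
        long n = strtol(index, &end, 10);
        if (end == index || *end != '\0' || n < 1 || n > 1000000)
            return -1;
        spec->call_index = (int)n;
    }
    if (value != NULL) {
        spec->failure_value = strtol(value, &end, 0);
        if (end == value || *end != '\0')
            return -1;
    }
    return 0;
}

/* Read the spec from the (hypothetical) environment variables. */
int load_test_spec_from_env(struct test_spec *spec)
{
    return parse_test_spec(getenv("RUGRAT_TARGET"),
                           getenv("RUGRAT_CALLER"),
                           getenv("RUGRAT_CALL_INDEX"),
                           getenv("RUGRAT_FAILURE_VALUE"),
                           spec);
}

/* Is the nth (1-based) call to `callee` inside `caller` selected for testing? */
int spec_matches(const struct test_spec *spec, const char *callee,
                 const char *caller, int nth_call)
{
    if (strcmp(spec->target, callee) != 0)
        return 0;
    if (spec->caller[0] != '\0' && strcmp(spec->caller, caller) != 0)
        return 0;
    return spec->call_index == 0 || spec->call_index == nth_call;
}
```

Under these assumptions, "test all calls to function x" corresponds to setting only the target, and "test only second call to x in function y" corresponds to target x, caller y, call index 2.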
Dynatest Generator
- Scans instructions for a location to insert a test (e.g., a call to function X)
- Allows function X to execute
- Adds instructions to simulate the error
  - Instructions are added after function X
  - Program thinks the error happened and reacts
Example

    if ((sptr = malloc (size + 1)) == NULL) {
        findmem ();
        if ((sptr = malloc (size + 1)) == NULL)
            xlfail ("insufficient string space");
    }

Before the Dynatest Generator:

    call malloc          (code for malloc)
    movl eax, sptr
    cmpl sptr, 0
    jnz L1
    call findmem
    ....
    L1: ...

After the Dynatest Generator:

    call malloc          (code for malloc)
    movl 0, eax
    movl ENOMEM, errno
    movl eax, sptr
    cmpl sptr, 0
    jnz L1
    call findmem
    ....
    L1: ...
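The scan-and-insert step in this example can be sketched over a toy instruction list. This is only an illustration: it does not use DynamoRIO's API, the struct insn representation and the function name are hypothetical, and the inserted instruction text assumes an x86 return register (eax) and a malloc-style failure (return 0, errno set to ENOMEM).

```c
#include <stddef.h>
#include <string.h>

enum opcode { OP_CALL, OP_MOV, OP_CMP, OP_JNZ, OP_OTHER };

/* Hypothetical toy instruction; a real dynamic compiler has its own IR. */
struct insn {
    enum opcode op;
    const char *text;    /* printable form, for illustration only */
    const char *callee;  /* name of the called function; NULL unless OP_CALL */
};

/* Copy basic block `in` to `out`. After every call to `target`, append
   instructions that overwrite the return value and set errno, so the
   call still executes but appears to have failed.
   Returns the number of instructions written, or 0 if `out` is too
   small (an empty input block also yields 0). */
size_t insert_failure_after_calls(const struct insn *in, size_t n_in,
                                  const char *target,
                                  struct insn *out, size_t out_cap)
{
    /* Assumed failure simulation: return value 0, errno = ENOMEM. */
    static const struct insn set_ret   = { OP_MOV, "movl 0, eax", NULL };
    static const struct insn set_errno = { OP_MOV, "movl ENOMEM, errno", NULL };
    size_t n_out = 0;

    for (size_t i = 0; i < n_in; i++) {
        if (n_out >= out_cap)
            return 0;
        out[n_out++] = in[i];            /* original instruction is kept */

        if (in[i].op == OP_CALL && in[i].callee != NULL &&
            strcmp(in[i].callee, target) == 0) {
            if (n_out + 2 > out_cap)
                return 0;
            out[n_out++] = set_ret;      /* added after the call returns */
            out[n_out++] = set_errno;
        }
    }
    return n_out;
}
```

Inserting after the call, rather than replacing it, matches the slide's point that the function is allowed to execute before the error is simulated.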
The Good, the Bad and the Ugly
The Good:
- Adequate simulation
- Can target system or application calls
- Saves quite a lot of tester effort
The Bad:
- Not a perfect simulation
The Ugly:
- Still a prototype
Security Mechanism Testing: Encrypting Function Pointers
- Protects programs against function pointer attacks
- Difficult to test (needs a vulnerable program and an attack)
- RUGRAT can simulate the attack by adding instructions
  - Very different from the error handling code case
- RUGRAT can be used for a variety of testing tasks
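As a rough illustration of what a simulated attack exercises, the sketch below assumes a simple XOR-with-key encryption of a stored function pointer. The slides do not specify the encryption scheme, so the scheme, the fixed key, and every name here are assumptions. The "attack" is an overwrite of the stored slot with a raw address, which is the kind of effect added instructions could produce without a genuinely vulnerable program.

```c
#include <stdint.h>

typedef int (*handler_fn)(void);

int legitimate_handler(void) { return 1; }
int attacker_payload(void)   { return 666; }

/* Hypothetical fixed key; a real scheme would choose it randomly at startup. */
static const uintptr_t pointer_key = (uintptr_t)0x5A5A5A5Au;

/* Function pointers are stored encrypted and decrypted just before use. */
uintptr_t encrypt_pointer(handler_fn f)
{
    return (uintptr_t)f ^ pointer_key;
}

handler_fn decrypt_pointer(uintptr_t stored)
{
    return (handler_fn)(stored ^ pointer_key);
}

/* Simulated attack: overwrite the stored slot with a raw, unencrypted
   address, as an attacker who does not know the key would. */
void simulate_overwrite(uintptr_t *slot, handler_fn target)
{
    *slot = (uintptr_t)target;
}

/* Returns 1 if the slot still decrypts to the expected handler. */
int slot_is_intact(uintptr_t slot, handler_fn expected)
{
    return decrypt_pointer(slot) == expected;
}
```

After simulate_overwrite, decrypting the slot no longer yields the attacker's address, which is the behavior a test of the protection mechanism would check. The corrupted pointer should be compared, never called.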
Current Implementation Notes
- Used the DynamoRIO dynamic compiler [1]
  - Some limitations (but a new version is available)
- Test spec comes from environment variables
- Nothing fancy for the oracle
[1] Bruening et al., CGO 2003
Experiments
- Ran a variety of programs with RUGRAT
  - space, SPEC, MiBench
- Tested handling of errors in
  - malloc / fopen / write
  - Application calls
Experiments Summary
Can RUGRAT generate tests to cover error handling code? YES!
- RUGRAT tested error handling code at 120+ callsites (missed one because DynamoRIO incurred a segfault)
Experiments Summary
Can RUGRAT increase statement coverage for error handling code? YES!
- RUGRAT increased coverage of error handling code by ~50% (on average)
  - Not all statements executed because of different options
  - RUGRAT detected cases of omission errors
Fault Detection
Could RUGRAT help detect failures in error handling code?
- Grad students seeded faults into the error handling code of the space program
  - Changed assignments, loops, conditionals, etc.
  - Seeded a total of 34 faults
Fault Detection Summary
RUGRAT detected 15 / 34 faults. Of the 19 undetected faults:
- 6 changed return values, but callers only checked certain values (e.g., if (func () != 0))
- 2 allocated too little memory (malloc may allocate more memory than requested anyway)
- 2 unknown
- 1 caused space to quit
- 8 were instances where the caller performed the same code as the callee (e.g., any fault in the callee was undone by the caller)
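The first category of undetected faults can be reproduced in a few lines. This is a made-up illustration, not code from the space program: the functions cleanup_original, cleanup_seeded, and caller_took_error_path are hypothetical.

```c
/* Original callee: reports failure with -1. */
int cleanup_original(void) { return -1; }

/* Seeded fault: the failure value is changed to -2. */
int cleanup_seeded(void) { return -2; }

/* The caller only distinguishes zero from nonzero, so both versions
   of the callee drive it down the same path.
   Returns 1 if the error path was taken, 0 otherwise. */
int caller_took_error_path(int (*callee)(void))
{
    if (callee() != 0)
        return 1;
    return 0;
}
```

Because caller_took_error_path returns the same result for both callees, no test that observes only the caller's behavior can tell the seeded version from the original.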
Some related work
- Holodeck [1], FIG [2]
  - Require the tester to provide alternative "stub" functions to do testing
  - Miss application calls
- Dynamic branch switching [3]
  - Not originally intended for testing error code
  - Need to know which branch to change
  - Far less accurate simulation
[1] Thompson et al., SAC
[2] Broadwell et al., SHAMAN
[3] Zhang et al., ICSE 2006
Conclusions and Summary
- Presented the RUGRAT architecture
  - Can test hard-to-reach (and seldom tested) code by using dynamic compilers
  - Saves tester effort
- RUGRAT is a general tool
Experiments Summary
- Tested a variety of programs with RUGRAT
- 120+ error handling callsites covered
  - Both application and system calls
- Increased error code coverage by ~50% over regular test cases
  - Not all error code statements could be covered
    - Different options, etc.
- Reasonable time overhead