Presentation of Failure-Oblivious Computing vs. Rx
OS Seminar, winter 2005
by Lauge Wullf and Jacob Munk-Stander
January 4th, 2006
Agenda
  Introduction
  Failure-Oblivious Computing
  Rx: Treating Bugs As Allergies
Introduction
  Problem
    – Reliability (deterministic and non-deterministic)
  Cause
    – Software defects account for up to 40% of system failures
    – Memory- and concurrency-related bugs cause more than 60% of system vulnerabilities
  Effect
    – Expensive
Introduction
  Solutions
    – Safe languages, e.g. ML, Java or C#
    – Rebooting/restarting
        Whole-program restart, micro-rebooting, etc.
    – Checkpointing and recovery
        Checkpoint, roll back on failure, re-execute
    – Application-specific
        Multi-process model, exception handling, etc.
    – Non-conventional approaches
        E.g. failure-oblivious computing
Failure-Oblivious Computing
  An instance of acceptability-oriented computing:
    – A flawed system must ensure that it respects basic acceptability properties, e.g.:
        The system must never accelerate the vehicle beyond a specific velocity
        The system should continue to execute even if it has a memory error
  Makes the program oblivious to invalid memory accesses
    – Invalid reads return manufactured values
    – Invalid writes are discarded
  Thus, processes are not terminated and no exceptions are raised
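The read and write rules above can be sketched in plain C. This is only an illustration of the idea, not the compiler's actual checking code: the `fo_buf` type and the `fo_read`/`fo_write` helpers are hypothetical names, and the manufactured value is simply 0 here.

```c
#include <stddef.h>

/* Hypothetical buffer that carries its own bounds (illustration only). */
typedef struct {
    unsigned char *data;
    size_t len;
} fo_buf;

/* Invalid reads return a manufactured value instead of crashing. */
unsigned char fo_read(const fo_buf *b, size_t i)
{
    if (i < b->len)
        return b->data[i];
    return 0; /* manufactured value (assumption: always 0 in this sketch) */
}

/* Invalid writes are discarded. Returns 1 if the write happened, 0 if not. */
int fo_write(fo_buf *b, size_t i, unsigned char v)
{
    if (i < b->len) {
        b->data[i] = v;
        return 1;
    }
    return 0; /* out of bounds: silently dropped */
}
```

In both invalid cases the caller simply keeps running, which is the behavior the slide calls failure-oblivious.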
Failure-Oblivious Computing, cont.
  Behavior
    – Standard compilation: memory corruption, potential crash
    – Safe compilation: process terminates without potentially contaminating global data
    – Failure-oblivious compilation: process continues execution on a speculative, unsafe execution path
Failure-Oblivious Computing, cont.
  Example, Pine 4.44
    – The index uses the From field of messages
    – Certain characters are quoted
    – Bug when quoting certain values
        The maximum length is miscalculated, so the buffer allocated for the quoted value is too small
    – Standard and Safe: Pine crashes on start
    – FOC: Pine operates “normally”
Failure-Oblivious Computing, cont.
  Example, bug-server (fictional)
    – FOC uses malloc/free to monitor memory access
    – Memory deallocation takes up much time, so bug-server 2.0 uses memory pools:
        pool *new_pool() creates a new pool for memory allocation
        void *pool_alloc(pool *p, size_t size) allocates size bytes from the pool p
        void free_pool(pool *p) frees all memory allocated to pool p
    – Pools internally use malloc to create new pools or extend existing ones, and free to free pools
    – A security exploit is released that affects only 2.0. Why?
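The slide leaves the "why?" open. One plausible reading, which is an assumption and not stated in the source, is that a checker hooked into malloc/free sees each pool as a single data unit, so an overflow from one pool allocation into its neighbor stays "in bounds". The sketch below implements the three pool functions named on the slide in the simplest possible way; the fixed `POOL_BYTES` size, the lack of pool extension and alignment handling, and the `in_pool_chunk` helper are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_BYTES 4096 /* assumption: one fixed-size chunk, never extended */

typedef struct pool {
    unsigned char *chunk; /* the single block obtained from malloc */
    size_t used;          /* bytes handed out so far */
} pool;

pool *new_pool(void)
{
    pool *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->chunk = malloc(POOL_BYTES);
    if (p->chunk == NULL) {
        free(p);
        return NULL;
    }
    p->used = 0;
    return p;
}

/* Bump allocation: objects are carved out back to back, no alignment. */
void *pool_alloc(pool *p, size_t size)
{
    if (size > POOL_BYTES - p->used)
        return NULL;
    void *out = p->chunk + p->used;
    p->used += size;
    return out;
}

void free_pool(pool *p)
{
    free(p->chunk);
    free(p);
}

/* What a malloc-level checker can know: is addr inside the malloc'd chunk? */
int in_pool_chunk(const pool *p, const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    uintptr_t start = (uintptr_t)p->chunk;
    return a >= start && a - start < POOL_BYTES;
}
```

With two 16-byte allocations `a` and `b` from the same pool, the address `a + 20` lies inside `b`, yet `in_pool_chunk` still reports it as valid because both objects share one malloc'd block.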
Failure-Oblivious Computing, cont.
  Extension to gcc
  Implemented using checking code and continuation code
  Checking code evaluates whether a memory access is valid or not
  Continuation code executes when an invalid memory access occurs
    – Discards erroneous writes
    – Manufactures a sequence of results for erroneous reads: [0, 1, 2, 0, 1, 3, 0, 1, 4, …]
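A generator that reproduces the prefix shown above might look like the following. It is a sketch: the `fo_gen` type and `fo_next_value` function are hypothetical names, and how the real sequence continues or wraps beyond the values on the slide is not stated there, so this version simply keeps counting upward.

```c
/* State of the manufactured-value generator (hypothetical). */
typedef struct {
    unsigned pos; /* position within the current triple: 0, 1 or 2 */
    int next;     /* third value of the current triple, starts at 2 */
} fo_gen;

/* Yields 0, 1, 2, 0, 1, 3, 0, 1, 4, ... when started from {0, 2}. */
int fo_next_value(fo_gen *g)
{
    int v;
    if (g->pos == 0)
        v = 0;
    else if (g->pos == 1)
        v = 1;
    else
        v = g->next++; /* the only slot that changes between triples */
    g->pos = (g->pos + 1) % 3;
    return v;
}
```

Initialize the state as `fo_gen g = {0, 2};` before the first call. Returning 0 and 1 often while still cycling through other small integers gives loops that wait for a particular value a chance to terminate.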
Failure-Oblivious Computing, cont.
  Checking code
    – Based on Jones and Kelly’s scheme
    – Enhanced by Ruwase and Lam
  Jones and Kelly’s scheme
    – A table maps locations to data units
    – A data unit is e.g. a struct, array or variable
    – The table tracks intended data units and is used to distinguish in-bounds from out-of-bounds pointers
Failure-Oblivious Computing, cont.
  Base case: always in-bounds
    – The base pointer is the address of an array, struct or variable
    – The intended data unit is the data unit corresponding to the base pointer
  Pointer arithmetic
    – Starting pointer + offset
    – In-bounds if and only if the starting pointer and the derived pointer point to the same data unit
    – The intended data unit is the same for both
    – Does not work with “reverse” pointer arithmetic?
  Pointer variables
    – In-bounds if and only if it was assigned an in-bounds pointer
    – The intended data unit is the same as that of the pointer assigned to it
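The table lookup and the pointer-arithmetic rule can be sketched as below. This is a simplified model, not the scheme's real data structure: the fixed-size array with a linear scan and the names `data_unit`, `register_unit`, `find_unit` and `derived_in_bounds` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One registered data unit: an array, struct or variable. */
typedef struct {
    uintptr_t start;
    size_t size;
} data_unit;

#define MAX_UNITS 16 /* assumption: tiny fixed table, linear search */
static data_unit table[MAX_UNITS];
static size_t n_units;

/* Record a data unit in the table. Returns 1 on success, 0 if full. */
int register_unit(const void *start, size_t size)
{
    if (n_units == MAX_UNITS)
        return 0;
    table[n_units].start = (uintptr_t)start;
    table[n_units].size = size;
    n_units++;
    return 1;
}

/* Map an address to the data unit containing it, or NULL if none does. */
const data_unit *find_unit(uintptr_t addr)
{
    for (size_t i = 0; i < n_units; i++)
        if (addr >= table[i].start && addr - table[i].start < table[i].size)
            return &table[i];
    return NULL;
}

/* start_ptr + offset (in bytes) is in-bounds if and only if the starting
   and the derived address fall in the same data unit. */
int derived_in_bounds(const void *start_ptr, ptrdiff_t offset)
{
    uintptr_t base = (uintptr_t)start_ptr;
    const data_unit *u = find_unit(base);
    if (u == NULL)
        return 0;
    return find_unit(base + (uintptr_t)offset) == u;
}
```

Because the check compares data units and not just addresses, a derived pointer that lands in a neighboring object is still reported as out-of-bounds for its intended unit.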
Failure-Oblivious Computing, cont.
  Valid out-of-bounds pointer
    – Points to the next byte after the intended data unit
    – Obtained by padding each data item with an extra byte
  Illegal out-of-bounds pointers have the value ILLEGAL (-2)
  The extra byte supports valid out-of-bounds pointers in loops that use pointer arithmetic in their termination test
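The kind of loop this is meant to support is the common C idiom below, where the end pointer is one past the last element and is only compared against, never dereferenced. The function name `sum_ints` is made up for the example.

```c
#include <stddef.h>

/* Sums n ints by walking a pointer up to, but not including, a + n. */
int sum_ints(const int *a, size_t n)
{
    const int *end = a + n; /* one past the end: valid to form and compare */
    int s = 0;
    for (const int *p = a; p != end; p++)
        s += *p; /* p is always strictly inside the array here */
    return s;
}
```

A checker that flagged `a + n` as illegal would reject this loop, which is why the one-past-the-end address has to be treated as a valid out-of-bounds pointer.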
Failure-Oblivious Computing, cont.
  Dereferencing a pointer checks the table:
    – An in-bounds pointer returns the referent value
    – An out-of-bounds pointer causes the program to halt with an error
  Does not support pointer arithmetic that obtains a pointer to a location past the end of the intended data unit and then uses it to calculate an in-bounds pointer
Failure-Oblivious Computing, cont.
  Ruwase and Lam’s enhancement
    – Out-of-bounds pointers are set to point to an out-of-bounds (OOB) object
    – The OOB object holds:
        Start address of the intended data unit
        Offset from this address
    – Out-of-bounds pointers can thus be tracked back to their intended data unit
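The base-plus-offset idea can be modeled as follows. This sketch folds the bookkeeping into a single struct for brevity, which is an assumption: the slide describes a separate OOB object that an out-of-bounds pointer refers to. The names `tracked_ptr`, `tp_add`, `tp_in_bounds` and `tp_deref` are hypothetical.

```c
#include <stddef.h>

/* A pointer together with its intended data unit (illustration only). */
typedef struct {
    char *base;    /* start address of the intended data unit */
    size_t size;   /* size of that data unit in bytes */
    ptrdiff_t off; /* offset from base; may lie outside [0, size) */
} tracked_ptr;

/* Pointer arithmetic only moves the offset, so the unit is never lost. */
tracked_ptr tp_add(tracked_ptr p, ptrdiff_t delta)
{
    p.off += delta;
    return p;
}

int tp_in_bounds(tracked_ptr p)
{
    return p.off >= 0 && (size_t)p.off < p.size;
}

/* Returns the real address if in-bounds, NULL otherwise. */
char *tp_deref(tracked_ptr p)
{
    return tp_in_bounds(p) ? p.base + p.off : NULL;
}
```

Because the offset survives an excursion past the end, a later subtraction can bring the pointer back in-bounds, which addresses the pointer-arithmetic limitation of the original table-only scheme.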
Failure-Oblivious Computing, cont.
  Pros
    – Global state is not corrupted
    – Local data accessed in loops
        Individual iteration failures can be handled
    – Servers without state
        No propagation of errors beyond a single request
    – Interactive programs
        Programs do not crash
        Can show meaningful results
        Tolerable slow-down
Failure-Oblivious Computing, cont.
  Cons
    – “safe compiler for C”
        What if this introduces bugs?
        Only C?
        Programs must be recompiled
    – Always in use, not only when needed
    – Manufactured reads can lead to a wrong execution path, i.e. not for correctness-critical applications
        Only tested in the case of Midnight Commander
Failure-Oblivious Computing, cont.
  Cons, cont.
    – “The key question is how (or even if) the incorrect or unexpected result may propagate through the remaining computation to affect the overall results of the program”
        How to determine this is not answered
        It is only vaguely mentioned that FOC is less appropriate for such cases
        FOC is a global change, so it might only be suited to isolated, i.e. local, functionality
Failure-Oblivious Computing, cont.
  Cons, cont.
    – Patch management
        Better to have a fixed system than one which seems to run fine but might not
    – “Lucky” cases:
        Pine: a different method is used elsewhere
        Sendmail: a length check catches the error
        Midnight Commander: a dangling link minimizes the error
        Mutt: the server returns “does not exist”
Failure-Oblivious Computing, cont.
  Performance
    – Programs that would otherwise have crashed continue execution
    – Slowdown ranges from a factor of 1.03 to a factor of 8.1 relative to the original program