Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions
  Dawson Engler, Benjamin Chelf, Andy Chou and Seth Hallem
  Presented by: Erez Louidor

Traditional Methods for Correctness Checking
  Formal verification
    – Models are difficult and costly to construct.
    – Models need maintenance.
    – Models sometimes suffer from over-simplification.
    Can be used to rule out deviant behaviors.

Traditional Methods for Correctness Checking (cont.)
  Testing (dynamic)
    – The number of execution paths grows exponentially with code size:
        Thorough testing requires writing many tests.
        Thorough testing takes a lot of time.
    – Requires running the tested code.
    Can be used to check properties that are very difficult to check by other (static) means, e.g. real-time constraints.

Traditional Methods for Correctness Checking (cont.)
  Manual inspection (code review)
    – Error prone.
    – Impractical to perform thoroughly in large systems; usually done only on “critical” sub-systems.
    Specifying the property to check is usually easy.

Static Analysis with Meta Compilation
  Compilers can be used to check general restrictions statically:
    – “A function can be called only with arguments of the parameter types it was declared with.”
    – “A function cannot change a ‘const’ object.”
  Programming languages are usually too general to express system-specific restrictions:
    – “A function that returns with an error must free the resources it acquired.”
    – “Check every user-supplied pointer for validity before dereferencing it.”
    – “A blocking function must not be called when interrupts are disabled.”

Static Analysis with Meta Compilation (cont.)
  Solution: extend the compiler to check system-specific rules.
    – Use a meta-language to write the compiler extension, and a meta-compiler to compile it.

Static Analysis with Meta Compilation (cont.)
  Compilers work with the code itself: no need to construct and maintain a model.
  Static:
    Can examine all execution paths.
    Does not require running the code.
    – Some properties are very hard or impossible to check.
  Scales well to large systems.
  – Can be used to find bugs, not to guarantee their absence.
  – Can produce many false warnings.

A Meta Compilation System
  Extensions are written in Metal
    – A high-level language based on the state-machine abstraction.
  Metal extensions are compiled with the Metal compiler.
  The compiled extensions are dynamically linked into xg++, a C++ compiler built on top of g++.
  The system is “still under development” and is not publicly available.

Example 1: A Metal Interrupt Checker
  sm check_interrupts {
      // Variables used in patterns.
      decl { unsigned } flags;

      // Patterns to specify enable/disable functions.
      pat enable  = { sti(); } | { restore_flags(flags); };
      pat disable = { cli(); };

      // States. The first state is the initial state.
      is_enabled:
            disable ==> is_disabled
          | enable  ==> { err("double enable"); }
      ;
      is_disabled:
            enable  ==> is_enabled
          | disable ==> { err("double disable"); }
          // Special pattern: matches when a path ends in this state.
          | $end_of_path$ ==> { err("exiting with interrupts disabled!"); }
      ;
  }

Metal Overview
  A Metal program describes an extended state machine by transition rules:
    – Each transition rule specifies:
        A source state.
        A pattern.
        An optional action, specified as C code.
        An optional destination state.
    – If a piece of code matches the pattern while the current state is the source state, Metal executes the action and moves to the destination state.
  By default, this state machine is applied down every execution path.

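The rule structure described above can be sketched in a few lines of Python. This is only an illustration of the source state / pattern / action / destination idea, not Metal's implementation: the `Rule` class, the `run` function, and the lock/unlock rules are all hypothetical, and patterns are reduced to predicates on statement strings.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    source: str                                      # state in which the rule applies
    pattern: Callable[[str], bool]                   # stands in for a Metal pattern
    action: Optional[Callable[[str], None]] = None   # optional action
    dest: Optional[str] = None                       # optional destination state


def run(rules, start, path):
    """Apply the rules to one execution path and return the final state."""
    state = start
    for stmt in path:
        for rule in rules:
            if rule.source == state and rule.pattern(stmt):
                if rule.action is not None:
                    rule.action(stmt)
                if rule.dest is not None:
                    state = rule.dest
                break  # at most one transition per statement
    return state


# Hypothetical rules: a lock must not be taken twice.
errors = []
rules = [
    Rule("unlocked", lambda s: s == "lock()", dest="locked"),
    Rule("locked", lambda s: s == "unlock()", dest="unlocked"),
    Rule("locked", lambda s: s == "lock()",
         action=lambda s: errors.append("double lock")),
]
final = run(rules, "unlocked", ["lock()", "x = 1", "lock()", "unlock()"])
```

A rule with an action but no destination, like the last one, reports the error and leaves the state unchanged.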
Example 1: A Metal Interrupt Checker (cont.)
  /* From Linux kernel drivers/block/raid5.c */
  static struct buffer_head* get_free_buffer(struct stripe_head* sh, int b_size)
  {
      struct buffer_head* bh;
      unsigned long flags;

      save_flags(flags);
      cli();
      if ((bh = sh->buffer_pool) == NULL)
          return NULL;   /* BUG: returns with interrupts still disabled */
      sh->buffer_pool = bh->b_next;
      bh->b_size = b_size;
      restore_flags(flags);
      return bh;
  }
  An extended version of this checker found 82 bugs in the Linux kernel code.

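To see how the checker catches this bug, the sketch below hand-simulates the interrupt state machine on the two paths through get_free_buffer. It is a Python approximation, not the Metal extension: the paths are written out by hand as lists of call names, and `check_path` is a hypothetical helper.

```python
ENABLE = {"sti", "restore_flags"}
DISABLE = {"cli"}


def check_path(calls):
    """Run the interrupt state machine over one path; return the errors found."""
    state, errors = "is_enabled", []
    for call in calls:
        if call in DISABLE:
            if state == "is_disabled":
                errors.append("double disable")
            state = "is_disabled"
        elif call in ENABLE:
            if state == "is_enabled":
                errors.append("double enable")
            state = "is_enabled"
        # Any other call (e.g. save_flags) matches no pattern and is ignored.
    if state == "is_disabled":
        errors.append("exiting with interrupts disabled")
    return errors


# The two paths through get_free_buffer, reduced to the calls they make.
paths = {
    "buffer_pool == NULL": ["save_flags", "cli"],
    "buffer_pool != NULL": ["save_flags", "cli", "restore_flags"],
}
report = {name: check_path(calls) for name, calls in paths.items()}
```

Only the early-return path ends in the disabled state, so only that path is reported.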
Example 2: A static memory allocation checker
  sm null_checker {
      decl { scalar } sz;
      decl { const int } retv;
      decl { any_ptr } v1;
      state decl { any_ptr } v;

      // Start tracking a pointer when it is assigned the result of malloc.
      start, v.all:
            { ((v = (any)malloc(sz)) == 0) } ==> true=v.null, false=v.not_null
          | { ((v = (any)malloc(sz)) != 0) } ==> true=v.not_null, false=v.null
          | { v = (any)malloc(sz) } ==> v.unknown
      ;
      // A comparison against 0 refines the state on each branch.
      v.unknown, v.null, v.not_null:
            { (v == 0) } ==> true=v.null, false=v.not_null
          | { (v != 0) } ==> true=v.not_null, false=v.null
      ;
      // Returning a negative constant while still holding memory is a leak.
      v.unknown, v.not_null:
            { return retv; } ==> { if (mgk_int_cst(retv) < 0) err("Error path leak!"); }
      ;
Example 2: A static memory allocation checker (cont.)
      // Dereferencing a pointer that is, or may be, null.
      v.null, v.unknown:
            { *(any *)v } ==> { err("Using pointer illegally!"); }
      ;
      v.unknown, v.null, v.not_null:
            { free(v); } ==> v.freed
      ;
      v.freed:
            { free(v) } ==> { err("Dup free!"); }
          | { v } ==> { err("Use-after-free!"); }
      ;
      // Overwriting v stops tracking it.
      v.all:
            { v = v1 } ==> v.stop
      ;
  }

Example 2: A static memory allocation checker (cont.)
  An extended version of this checker found 132 errors in the Linux kernel code, 61 of which were false warnings:
    The checker does not handle variable copies.
    It does not detect when a clean-up routine would free the memory.
  /* Drivers/isdn/pcbit:pcbit_init_dev */
  …
  free(dev);
  iounmap((unsigned char*)dev->sh_mem);
  release_mem_region(dev->ph_mem, 4096);
  …

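The per-pointer states of this checker can be approximated with a small Python sketch that keeps one state per tracked variable. It is a simplification under stated assumptions: `check_pointers` and its event names are hypothetical, events are listed by hand rather than matched from source code, and the leak rule is left out. The first trace is also hypothetical; only its last two events mirror the free-then-dereference pattern in the excerpt above.

```python
def check_pointers(events):
    """events: (op, var) pairs seen along one path, in order."""
    state = {}    # var -> "unknown" | "null" | "not_null" | "freed"
    errors = []
    for op, var in events:
        if op == "malloc":             # start (or restart) tracking var
            state[var] = "unknown"
            continue
        cur = state.get(var)
        if cur is None:                # var is not a tracked pointer
            continue
        if op in ("null", "not_null"): # outcome of a comparison against 0
            state[var] = op
        elif op == "free":
            if cur == "freed":
                errors.append(f"dup free: {var}")
            state[var] = "freed"
        elif op == "deref":
            if cur == "freed":
                errors.append(f"use-after-free: {var}")
            elif cur in ("null", "unknown"):
                errors.append(f"using pointer illegally: {var}")
        elif op == "assign":           # var overwritten: the 'stop' state
            del state[var]
    return errors


use_after_free = check_pointers(
    [("malloc", "dev"), ("not_null", "dev"), ("free", "dev"), ("deref", "dev")]
)
unchecked = check_pointers([("malloc", "p"), ("deref", "p")])
overwritten = check_pointers([("malloc", "p"), ("assign", "p"), ("deref", "p")])
```

Because state is keyed by variable name, a copy such as `q = p` is invisible to this sketch, which is the same limitation the slide lists as a source of false warnings.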
The Analysis Algorithm
  The algorithm traverses all paths in the flow graph of the source code.
  The algorithm’s state is a list of state-values:
    – A global state-value (‘start’ in the previous example).
    – A set of variable-instance state-values:
        Each variable instance is bound to a different program object; e.g. there may be several v’s, each bound to a different pointer.
        Binding a state-value to a variable (e.g. setting v’s state-value to ‘not_null’) adds an instance state to the set.
        Binding the special state-value ‘stop’ to a variable instance removes that instance from the set and stops tracking that program object.

The Analysis Algorithm (cont.)
  At each node in the graph, and for each transition rule, the algorithm iterates over all the state-values in the current state, looking for an applicable transition.
  For each node, the algorithm keeps a list of the states in which it has visited that node. Upon reaching a node in a state in which it has already visited that node, the algorithm backtracks.
    – Deals with loops (assuming the SM has a finite number of states).
    – Speeds up the analysis by pruning redundant paths.
    – Assumes the SM is deterministic.

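The caching and backtracking described above can be sketched as a depth-first traversal that remembers every (node, state) pair it has explored. This is a minimal Python illustration, not the xg++ algorithm: the state is a single global state-value rather than a list, and `analyze`, `step`, and the four-node flow graph are hypothetical.

```python
def analyze(cfg, entry, step, start_state):
    """Explore all paths of cfg depth-first, pruning repeated (node, state) pairs.

    cfg:  dict mapping each node to its list of successors
    step: function (state, node) -> (new_state, error message or None)
    """
    seen = set()      # (node, state) pairs already explored
    errors = []
    stack = [(entry, start_state)]
    while stack:
        node, state = stack.pop()
        if (node, state) in seen:
            continue  # same node reached in the same state: backtrack
        seen.add((node, state))
        state, err = step(state, node)
        if err is not None:
            errors.append((node, err))
        for succ in cfg.get(node, []):
            stack.append((succ, state))
    return errors, len(seen)


def step(state, node):
    """Interrupt state machine, keyed on hypothetical node names."""
    if node == "cli":
        err = "double disable" if state == "disabled" else None
        return "disabled", err
    if node == "exit" and state == "disabled":
        return state, "exiting with interrupts disabled"
    return state, None


# A loop whose body disables interrupts and never re-enables them.
cfg = {"entry": ["loop"], "loop": ["cli", "exit"], "cli": ["loop"], "exit": []}
errors, explored = analyze(cfg, "entry", step, "enabled")
```

The loop is entered once per distinct state, so the traversal terminates after exploring seven (node, state) pairs even though the graph has infinitely many paths.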
Example 3: Assertions
  Metal extensions can also operate in a linear, flow-insensitive mode.
  sm Assert flow_insensitive {
      decl { any } expr, x, y, z;
      decl { any_call } any_fcall;
      decl { any_args } args;

      // Find all assert calls, then apply the SM to "expr" in state "in_assert".
      start:
            { assert(expr); } ==> { mgk_expr_recurse(expr, in_assert); }
      ;
      // Find all side-effects. A function call is considered a side-effect.
      in_assert:
            { any_fcall(args) } ==> { err("function call"); }
          | { x = y } ==> { err("assignment"); }      // '=' also matches '+=', '-=', etc.
          | { z++ }   ==> { err("post-increment"); }
          | { --z }   ==> { err("pre-decrement"); }
      ;
      // We should also check for ++z, z--.
  }

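For comparison, here is an analogous flow-insensitive check written against Python source using the standard `ast` module. It is a different technique applied to a different language, not the Metal extension: it walks the syntax tree with no regard to control flow and flags function calls and `:=` assignments inside assert conditions. The function name and the sample code are hypothetical.

```python
import ast


def assert_side_effects(source):
    """Return (line, kind) for each possible side-effect inside an assert condition."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assert):
            # Recurse into the asserted expression only, like mgk_expr_recurse.
            for sub in ast.walk(node.test):
                if isinstance(sub, ast.Call):
                    findings.append((sub.lineno, "function call"))
                elif isinstance(sub, ast.NamedExpr):
                    findings.append((sub.lineno, "assignment"))
    return findings


sample = """
assert x > 0
assert next_id() > 0
assert (y := x + 1) > 1
"""
found = assert_side_effects(sample)
```

As in the Metal version, every function call is treated as a side-effect, so pure helper functions will be flagged too.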
Example 3: Assertions (cont.)
  The authors also wrote a flow-sensitive Metal static assertion checker that tries to find false assertions at compile time.
    – Hindered by the primitive dataflow analysis of xg++.
    – Found 5 errors in the FLASH cache-coherence code.

Possible Restrictions
  Using similar checkers it is possible to check restrictions of the following forms:
    – Never/Always do X
        Never use floating-point in the kernel.
        Never allocate more than xxx bytes on the stack.
        Always allocate as much storage as an object needs.
    – Never/Always do X before/after Y
        Check user pointers before using them in the kernel.
    – In situation X do (not do) Y
        If interrupts are disabled, do not block.

Global Analysis
  So far, every checker used local analysis:
    – The analysis was confined to the function scope.
  Many system rules are context dependent and apply globally, across function boundaries:
    – “A blocking function must not be called when interrupts are disabled.”

Global Analysis (cont.)
  The xg++ system described in the paper does not support global analysis well.
  To check the above rule in the Linux kernel, the authors used xg++ to compute a list of all possibly-blocking functions:
    – A function is possibly-blocking if it either blocks directly or starts a call chain containing a function that blocks directly.
  They then applied a Metal local-analysis extension to check that no function calls a possibly-blocking function while interrupts are disabled.
  This method found 123 bugs, and produced 8 false warnings.

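The two-step scheme above can be sketched as a transitive closure over the call graph followed by a simple lookup. This Python sketch is an assumption about how such a list could be computed, not the authors' xg++ pass; the call graph and all function names in it are hypothetical.

```python
def possibly_blocking(calls, direct_blockers):
    """Return every function that blocks directly or can reach one that does."""
    # Invert the call graph: callee -> set of callers.
    callers = {}
    for f, callees in calls.items():
        for g in callees:
            callers.setdefault(g, set()).add(f)

    result = set(direct_blockers)
    worklist = list(direct_blockers)
    while worklist:
        g = worklist.pop()
        for f in callers.get(g, ()):
            if f not in result:       # f calls a possibly-blocking function
                result.add(f)
                worklist.append(f)
    return result


# Hypothetical call graph: function -> functions it calls.
calls = {
    "driver_ioctl": ["copy_buf", "log_event"],
    "copy_buf": ["alloc_wait"],
    "log_event": [],
    "alloc_wait": ["schedule"],
}
blocking = possibly_blocking(calls, {"schedule"})

# Local check: calls a function makes while interrupts are disabled.
calls_with_interrupts_off = ["log_event", "copy_buf"]
flagged = [c for c in calls_with_interrupts_off if c in blocking]
```

Precomputing the set keeps the second step local: the per-function checker only needs to test membership, at the cost of treating every call chain as feasible.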
Optimizing with Meta-Compilation
  Domain-specific static analysis can be used for finding optimization opportunities:
    – The optimizers built into compilers are inherently general, and therefore may be too conservative for a specific system.
    – Some optimizations are system-specific.
  The authors used Metal extensions to find hundreds of optimization opportunities in the FLASH machine cache-coherence code.