Published by Prudence Shaw. Modified over 9 years ago.
1
Constraint-Based Analysis
CS 6340
2
Code Example
    void f(state *x, state *y) {
        result = spin_trylock(&x->lock);
        spin_lock(&y->lock);
        …
        if (!result)
            spin_unlock(&x->lock);
        spin_unlock(&y->lock);
    }
Callouts on the code:
– Path sensitivity: result, (!result)
– Pointers & heap: (&x->lock), (&y->lock)
– Interprocedural, flow sensitivity: spin_trylock, spin_lock, spin_unlock
Lock state machine (figure): states Locked, Unlocked, Error; transitions labeled lock and unlock.
3
Saturn
What?
– SAT-based approach to static bug detection
How?
– SAT-based approach
– Program constructs → Boolean constraints
– Inference → SAT solving
Why SAT?
– Lots of reasons, but for now:
– Program states naturally expressed as bits
– The theory for bits is SAT
– Efficient solvers widely available
4
Intuition
Analyzing in one direction is problematic
– Forwards or backwards
– Consider null dereference analysis
   No null ptr assignments: forwards is best
   No dereferences: backwards is best
Constraints
– Give a global picture of the program
– Allow more efficient order of solution
5
Straight-line Code
    void f(int x, int y) {
        int z = x & y;
        assert(z == x);
    }
Figure: x = [x_31 … x_0] and y = [y_31 … y_0] are modeled as bit vectors. Bitwise-AND gives z = [x_31 ∧ y_31, …, x_0 ∧ y_0]. R is the formula for the comparison z == x.
6
Straight-line Code
    void f(int x, int y) {
        int z = x & y;
        assert(z == x);
    }
R is the formula for the assertion z == x.
Query: Is-Satisfiable(¬R)
Answer: Yes
   x = [00…1]
   y = [00…0]
Negated assertion is satisfiable. Therefore, the assertion may fail.
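The query on this slide can be imitated without a solver by enumerating a reduced bit width. The sketch below is only an illustration: the 4-bit width and the helper name find_counterexample are assumptions, and a SAT solver would answer the same question symbolically over all 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed width for the brute-force search; the slide uses 32-bit ints. */
#define WIDTH 4

/* Hypothetical helper: looks for values satisfying the negated assertion
 * (x & y) != x and reports the first pair found. */
bool find_counterexample(uint32_t *x_out, uint32_t *y_out) {
    assert(x_out && y_out);
    const uint32_t limit = 1u << WIDTH;
    for (uint32_t x = 0; x < limit; x++) {
        for (uint32_t y = 0; y < limit; y++) {
            uint32_t z = x & y;      /* int z = x & y; */
            if (z != x) {            /* negated assertion holds */
                *x_out = x;
                *y_out = y;
                return true;
            }
        }
    }
    return false;                    /* assertion holds for every input */
}
```

The first pair the search reaches is x = 1, y = 0, which has the same shape as the slide's witness x = [00…1], y = [00…0].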
7
Control Flow – Preparation
Approach
– Assumes loop-free program
– Unroll loops, drop back edges
May miss errors that are deeply buried
– Bug finding, not verification
– Many errors surface in a few iterations
Advantages
– Simplicity, reduces false positives
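A small sketch of what unrolling and dropping the back edge means for a concrete loop. The unroll depth of 2 and both function names are assumptions chosen for illustration; the slides do not state the depth Saturn uses.

```c
#include <assert.h>

/* Original loop: sums the first n elements of a. */
int sum_loop(const int *a, int n) {
    assert(n >= 0);
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* The same loop unrolled twice with the back edge dropped.
 * Iterations beyond the second are not modeled at all. */
int sum_unrolled2(const int *a, int n) {
    assert(n >= 0);
    int s = 0;
    int i = 0;
    if (i < n) {            /* first iteration */
        s += a[i];
        i++;
        if (i < n) {        /* second iteration */
            s += a[i];
            i++;
            /* back edge dropped here: no third iteration */
        }
    }
    return s;
}
```

The two versions agree whenever n is at most 2 and diverge afterwards, which is the sense in which behavior that needs many iterations to surface can be missed.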
8
Control Flow – Example
    if (c)
        x = a;
    else
        x = b;
    res = x;
Guards and values:
– Before the branch: G = true
– Then branch: G = c, x: [a_31 … a_0]
– Else branch: G = ¬c, x: [b_31 … b_0]
– After the merge: G = c ∨ ¬c, x: [v_31 … v_0]
   where v_i = (c ∧ a_i) ∨ (¬c ∧ b_i)
Merges
– preserve path sensitivity
– select bits based on the values of incoming guards
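The merge rule v_i = (c ∧ a_i) ∨ (¬c ∧ b_i) is a per-bit multiplexer. A minimal sketch, assuming 32-bit values and the hypothetical names merge_bits and branch_version, applies it to all bits at once and checks it against the branching code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* v_i = (c AND a_i) OR (NOT c AND b_i), for all 32 bits in parallel. */
uint32_t merge_bits(bool c, uint32_t a, uint32_t b) {
    uint32_t mask = c ? UINT32_MAX : 0u;   /* copy the guard c into every bit */
    return (mask & a) | (~mask & b);
}

/* The branching code from the slide, checked against the merged form. */
uint32_t branch_version(bool c, uint32_t a, uint32_t b) {
    uint32_t x;
    if (c)
        x = a;
    else
        x = b;
    uint32_t res = x;
    assert(res == merge_bits(c, a, b));
    return res;
}
```

In the constraint encoding c stays symbolic, so a single formula covers both paths; here c is a concrete value only so that the code can run.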
9
Pointers – Overview
May point to different locations…
– Thus, use points-to sets
   p: { l_1, …, l_n }
… but path sensitive
– Use guards on points-to relationships
   p: { (g_1, l_1), …, (g_n, l_n) }
10
Pointers – Example
    p = &x;          G = true, p: { (true, x) }
    if (c)
        p = &y;      G = c, p: { (true, y) }
    res = *p;        G = true, p: { (c, y); (¬c, x) }
The dereference res = *p becomes:
    if (c) res = y;
    else if (¬c) res = x;
11
Pointers – Recap
Guarded Location Sets
– { (g_1, l_1), …, (g_n, l_n) }
Guards
– Condition under which points-to relationship holds
– Collected from statement guards
Pointer Dereference
– Conditional Assignments
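A sketch of how a dereference through a guarded location set turns into conditional assignments. The struct and function names are hypothetical, and the guards are shown as already-evaluated booleans; in the SAT encoding they would remain Boolean formulas.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One entry (g, l): the pointer refers to location l when guard g holds. */
struct guarded_loc {
    bool guard;
    const int *loc;
};

/* Dereference as a chain of conditional assignments:
 * if (g_1) res = *l_1; else if (g_2) res = *l_2; ... */
bool guarded_deref(const struct guarded_loc *set, size_t n, int *res) {
    assert(set && res);
    for (size_t i = 0; i < n; i++) {
        if (set[i].guard) {
            *res = *set[i].loc;
            return true;
        }
    }
    return false;   /* no guard holds: no target on this path */
}

/* The slide's example: p = &x; if (c) p = &y; res = *p; */
int deref_example(bool c, int x, int y) {
    struct guarded_loc p[] = { { c, &y }, { !c, &x } };
    int res = 0;
    guarded_deref(p, 2, &res);
    return res;
}
```

Calling deref_example(true, 1, 2) selects y and deref_example(false, 1, 2) selects x, matching the pair { (c, y); (¬c, x) }.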
12
Not Covered
Other Constructs
– Structs, …
Modeling of the environment
Optimizations
– several to reduce size of formulas
– some form of program slicing important
13
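What can we do with Saturn?
    int f(lock_t *l) {
        lock(l);
        …
        unlock(l);
    }
lock(l) is modeled as:
    if (l->state == Unlocked)
        l->state = Locked;
    else
        l->state = Error;
unlock(l) is modeled as:
    if (l->state == Locked)
        l->state = Unlocked;
    else
        l->state = Error;
State machine (figure): lock takes Unlocked to Locked, unlock takes Locked to Unlocked; lock on Locked or unlock on Unlocked leads to Error.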
14
General FSM Checking
Encode FSM in the program
– State → Integer
– Transition → Conditional Assignments
Check code behavior
– SAT queries
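The encoding above can be written out directly for the lock state machine. This is a concrete, runnable sketch with hypothetical names (struct model_lock, model_lock_acquire, model_lock_release); Saturn would instead pose SAT queries about the final state over all paths and initial states.

```c
#include <assert.h>

/* State -> integer. */
enum lock_state { Unlocked, Locked, Error };

/* Hypothetical lock object carrying the FSM state as an integer field. */
struct model_lock {
    int state;
};

/* Transition -> conditional assignment. */
void model_lock_acquire(struct model_lock *l) {
    assert(l);
    if (l->state == Unlocked) l->state = Locked;
    else l->state = Error;
}

void model_lock_release(struct model_lock *l) {
    assert(l);
    if (l->state == Locked) l->state = Unlocked;
    else l->state = Error;
}

/* Final state of a well-behaved function: lock, then unlock. */
int state_after_lock_unlock(int initial) {
    struct model_lock l = { initial };
    model_lock_acquire(&l);
    model_lock_release(&l);
    return l.state;
}

/* Final state of a function that locks twice. */
int state_after_double_lock(int initial) {
    struct model_lock l = { initial };
    model_lock_acquire(&l);
    model_lock_acquire(&l);
    return l.state;
}
```

Checking the code then amounts to asking whether the Error state is reachable, for example whether state_after_double_lock(Unlocked) can equal Error.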
15
How are we doing so far?
Precision:
Scalability:
– SAT limit is 1M clauses
– About 10 functions
Solution:
– Divide and conquer
– Function summaries
16
Function Summaries (1st try)
Function behavior can be summarized with a set of state transitions
    int f(lock_t *l) {
        lock(l);
        …
        unlock(l);
        return 0;
    }
Summary:
– *l: Unlocked → Unlocked
      Locked → Error
17
A Difficulty
    int f(lock_t *l) {
        lock(l);
        …
        if (err) return -1;
        …
        unlock(l);
        return 0;
    }
Problem
– two possible output states
– distinguished by return value (retval == 0)…
Summary
1. (retval == 0)
   *l: Unlocked → Unlocked
       Locked → Error
2. ¬(retval == 0)
   *l: Unlocked → Locked
       Locked → Error
18
FSM Function Summaries
Summary representation (simplified): { P_in, P_out, R }
User gives:
– P_in: predicates on initial state
– P_out: predicates on final state
– Express interprocedural path sensitivity
Saturn computes:
– R: guarded state transitions
– Used to simulate function behavior at call site
19
Lock Summary (2nd try)
    int f(lock_t *l) {
        lock(l);
        …
        if (err) return -1;
        …
        unlock(l);
        return 0;
    }
Output predicate:
– P_out = { (retval == 0) }
Summary (R):
1. (retval == 0)
   *l: Unlocked → Unlocked
       Locked → Error
2. ¬(retval == 0)
   *l: Unlocked → Locked
       Locked → Error
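One way to picture the summary R is as a table of guarded transitions that a caller consults instead of re-analyzing f. The sketch below is an assumption-laden illustration: struct transition, summary_f and apply_summary are hypothetical names, the state and return value are concrete rather than symbolic, and treating Error as absorbing is an assumption not stated on the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lock_state { Unlocked, Locked, Error };

/* One guarded transition: when the output predicate (retval == 0) has the
 * value retval_is_zero, *l moves from `from` to `to`. */
struct transition {
    bool retval_is_zero;
    enum lock_state from;
    enum lock_state to;
};

/* Summary R for f, transcribed from the slide. */
static const struct transition summary_f[] = {
    { true,  Unlocked, Unlocked },
    { true,  Locked,   Error    },
    { false, Unlocked, Locked   },
    { false, Locked,   Error    },
};

/* Simulate a call to f at a call site using only its summary. */
enum lock_state apply_summary(enum lock_state before, int retval) {
    assert(before == Unlocked || before == Locked || before == Error);
    bool zero = (retval == 0);
    size_t n = sizeof summary_f / sizeof summary_f[0];
    for (size_t i = 0; i < n; i++) {
        if (summary_f[i].retval_is_zero == zero && summary_f[i].from == before)
            return summary_f[i].to;
    }
    return Error;   /* assumption: Error has no outgoing transition */
}
```

A caller that checks the return value can then tell the two outcomes apart: apply_summary(Unlocked, 0) leaves the lock Unlocked, while apply_summary(Unlocked, -1) leaves it Locked.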
20
Lock checker for Linux
Parameters:
– States: { Locked, Unlocked, Error }
– P_in = {}
– P_out = { (retval == 0) }
Experiment:
– Linux Kernel 2.6.5: 4.8 MLOC
– ~40 lock/unlock/trylock primitives
– 20 hours to analyze
   3.0 GHz Pentium IV, 1 GB memory
21
Double Locking/Unlocking
    static void sscape_coproc_close(…) {
        spin_lock_irqsave(&devc->lock, flags);
        if (…)
            sscape_write(devc, DMAA_REG, 0x20);
        …
    }
    static void sscape_write(struct … *devc, …) {
        spin_lock_irqsave(&devc->lock, flags);
        …
    }
22
Ambiguous Return State
    int i2o_claim_device(…) {
        down(&i2o_configuration_lock);
        if (d->owner) {
            up(&i2o_configuration_lock);
            return -EBUSY;
        }
        if (…) {
            return -EBUSY;
        }
        …
    }
23
Bugs
Type              Bugs   False Pos.   % Bugs
Double Locking    134    99           57%
Ambiguous State   45     22           67%
Total             179    121          60%
Previous Work: MC (31), CQual (18), <20% Bugs
24
Function Summary Database
63,000 functions in Linux
– More than 23,000 are lock related
– 17,000 with locking constraints on entry
– Around 9,000 affect more than one lock
– 193 lock wrappers
– 375 unlock wrappers
– 36 with return value/lock state correlation
Available on the web...
25
Another Checker
Memory leaks
– Common, esp. in error handling code
– Hard to find
– Problematic in long-running applications
Current techniques
– Escape analysis
– Ownership types
– Region-based analysis…
26
Simple Leak
    char *f() {
        char *p;
        p = (char *)malloc(…);
        …
        if (err) return NULL;
        …
        return p;
    }
27
Scenario 1 – Malloc Wrappers
    char *f() {
        char *p;
        p = (char *)strdup(…);
        …
        if (err) return NULL;
        …
        return p;
    }
28
Scenario 2 – External References
    char *f(struct state *s) {
        char *p;
        p = (char *)malloc(…);
        s->name = p;
        if (err) return NULL;
        …
        return p;
    }
29
Scenario 3 – Function Calls
    char *f(struct state *s) {
        char *p;
        p = (char *)malloc(…);
        g(s, p);
        if (err) return NULL;
        …
        return p;
    }
    void g(struct state *s, char *p) {
        s->name = p;
    }
30
Scenario 4 – Data dependency
    void f(int len) {
        char fastbuf[10], *p;
        if (len < 10)
            p = fastbuf;
        else
            p = (char *)malloc(len);
        …
        if (p != fastbuf)
            free(p);
    }
31
Requirements
Track points-to relationships precisely
Infer escaping functions
– ones that create external references to objects passed in via parameters
Infer allocation functions
32
Analysis Part I – Points-to Rule
PointsTo(p, l)
– condition under which p points to l
Guarded location set of p: { (g_0, l_0), …, (g_{n-1}, l_{n-1}) }
PointsTo(p, l) = g_i     (if l_i = l)
                 false   (otherwise)
33
Analysis Part II – EscapeVia
EscapeVia(l, p, X)
– the condition under which location l escapes via pointer p, excluding references in set X
Access Roots
– Every object in the function body is accessed through one of the following “roots”
   Parameters (p_1 … p_n)
   The Return Value (ret_val)
   Global Variables
   Local Variables
   Heap Allocated Objects
34
Analysis Part II – EscapeVia
Never escape through local variables
    RootOf(p) ∈ Locals ∪ X
    ⟹ EscapeVia(l, p, X) = false
Always escape through global variables
    RootOf(p) ∈ Globals
    ⟹ EscapeVia(l, p, X) = PointsTo(p, l)
35
Analysis Part II – EscapeVia
Escaping through parameters/return
    RootOf(p) ∈ (Params ∪ { ret_val }) – X
    ⟹ EscapeVia(l, p, X) = PointsTo(p, l)
Escaping via another allocated location
    RootOf(p) ∈ NewLocs – X
    ⟹ EscapeVia(l, p, X) = PointsTo(p, l) ∧ Escaped(p, X ∪ {RootOf(l)})
36
Analysis Part III – Escape/Leak
Escape Condition
    Escaped(l, X) = ⋁_p EscapeVia(l, p, X)
Leak Condition
    Leaked(l, X) = ¬Escaped(l, X)
Leak Checker
– For all new locations l, there is a leak if Satisfiable(Leaked(l, {}))
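To make the leak condition concrete, the sketch below works it out by hand for the "Simple Leak" function (p = malloc(…); if (err) return NULL; return p;). Everything here is an illustrative assumption: the function names are hypothetical, err is the only free variable, and satisfiability is decided by trying both of its values rather than by a SAT solver.

```c
#include <assert.h>
#include <stdbool.h>

/* PointsTo(ret_val, l): the function returns p, and hence the allocated
 * location l, only when err is false. */
static bool points_to_retval(bool err) {
    return !err;
}

/* Escaped(l, {}) = OR over pointers p of EscapeVia(l, p, {}). */
static bool escaped(bool err) {
    bool via_local_p = false;                   /* never escape through locals */
    bool via_retval = points_to_retval(err);    /* escape through ret_val */
    return via_local_p || via_retval;
}

/* Leaked(l, {}) = NOT Escaped(l, {}). */
static bool leaked(bool err) {
    return !escaped(err);
}

/* Satisfiable(Leaked(l, {})): try both values of err, report a witness. */
bool leak_is_satisfiable(bool *witness_err) {
    assert(witness_err);
    for (int v = 0; v <= 1; v++) {
        bool err = (v != 0);
        if (leaked(err)) {
            *witness_err = err;
            return true;
        }
    }
    return false;
}
```

The query is satisfiable with err set to true, which corresponds to the early return NULL path on which the allocation is never returned or stored.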
37
Results
           LOC (K)   # Alloc Func.   # Bugs     FP (%)
Samba      404       80              83         8.79%
OpenSSL    296       101             117        0.85%
BinUtils   909       91              136 (66)   3.55%
OpenSSH    36        19              29 (10)    0%
Total      1,646     291             365        3.69%
38
Why SAT? (Revisited …)
Moore’s Law
Uniform modeling of constructs as bits
Constraints
– Local specification
– Global solution
Incremental SAT solving
– makes multiple queries efficient
39
Why SAT? (Cont.)
Path sensitivity is important
– To find bugs
– To reduce false positives
– Much easier to model precisely with SAT
Compositionality is important
– Function summaries critical for scalability
– Easy to construct with SAT queries