SPEED: Statically Estimating Symbolic Computational Complexity of Programs
Sumit Gulwani (MSR Redmond), Trishul Chilimbi (MSR Redmond), Krishna Mehra (MSR Bangalore)

Problem Definition
Compute symbolic complexity bounds of procedures in terms of inputs (assuming unit cost for statements).
Can use different cost metrics.
  – Only count memory instructions.
  – Only count memory allocation instructions and weight them with memory allocated (space bounds).
  – Only count network instructions weighted appropriately (network traffic bounds).
Can also compute bounds for interesting code fragments.
  – Code executed between lock acquire/release.

Comparison with Profiling

Profiling | Static Symbolic Complexity
(+) Generates real running cost. | (-) Ignores low-level architectural features like caches, pipelines.
(-) Only as good as test-inputs. | (+) Generates worst-case bounds. (+) Can also generate predicated bounds.
(-) Requires building of executables => Late feedback. | (+) Requires only compilation of relevant procedures => Immediate feedback.

Applications
Provide immediate feedback during code development
  – Code Editing.
  – Use of unfamiliar APIs.
Performance Analysis
  – Identify corner cases.
Embedded Systems
  – Establish space bounds.
Soft Real-time Systems
  – Establish time bounds.
  – Feedback into a runtime power-management scheme.

Outline
-> Challenges in Bounds Analysis
   Idea #1: Proof Structure (control flow)
   Idea #2: Quantitative Functions (data-structures)

Challenges in Computing Bounds
Presence of control-flow
  – Bounds for even simple programs are non-linear, disjunctive.
  – Sometimes even proving termination is hard.
Presence of data-structures
  – Expressing bounds requires numerical functions over data-structures.
  – Computing these bounds requires sophisticated shape analysis.

Counter Instrumentation Based Solution
The main challenge is in computing loop bounds.
A simple counter instrumentation scheme:
    while (cond) do S    becomes    c := 0; while (cond) do { S; c := c+1; }
Loop bounds can be obtained by computing bounds on c using invariant generation tools [CAV '08].
However, the required invariants are usually disjunctive, non-linear, and refer to the heap, and hence are hard to compute.
Our solution: a refinement of the above scheme that allows bounds generation using simple linear invariant generation tools.
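The counter instrumentation scheme can be illustrated with a minimal Python sketch. The loop `while x < n: x += 1` and the function name `count_iterations` are hypothetical examples, not taken from the slides; the point is only that a bound on the counter c at loop exit is a bound on the number of iterations.

```python
def count_iterations(x0, n):
    """Run a simple loop with an instrumented counter c (hypothetical example loop)."""
    x = x0
    c = 0                      # c := 0 before the loop
    while x < n:               # while (cond)
        x += 1                 #   S
        c += 1                 #   c := c + 1
        # A linear invariant an invariant generator could find here: c = x - x0
        assert c == x - x0
    return c


if __name__ == "__main__":
    # The invariant c = x - x0 together with the exit condition gives
    # the symbolic bound Max(0, n - x0).
    print(count_iterations(3, 10))
```

Here the invariant is linear, so an off-the-shelf linear invariant generator would suffice; the later slides deal with loops where it is not.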

Example: Non-linear bounds

int size; // Assume(0 ≤ e1.len, e2.len ≤ size);
Equals(StringBuffer s1, StringBuffer s2) {
  c1 := c2 := c3 := 0;
  e1 := s1.GetHead(); e2 := s2.GetHead();
  i1 := e1.len-1; i2 := e2.len-1;
  while (true) {
    while (i1 ≥ 0 ∧ i2 ≥ 0) {
      if (e1.arr[i1] ≠ e2.arr[i2]) return 0;
      i1--; i2--; c3++;
    }
    while (i1 < 0 ∧ e1 ≠ null) {
      e1 := s1.GetNext(e1); i1 := i1+e1.len; c1++; c3 := 0;
    }
    while (i2 < 0 ∧ e2 ≠ null) {
      e2 := s2.GetNext(e2); i2 := i2+e2.len; c2++; c3 := 0;
    }
    if (i1 < 0) return (i2 < 0);
    if (i2 < 0) return 0;
    c3++;
  };
  return 1;
}

Total iterations of 2nd and 3rd inner loops: Len(s1) and Len(s2).
For each iteration of 2nd and 3rd inner loops, combined iterations of 1st inner loop and outer loop: size.
Therefore total complexity is (1+size)*(1+Len(s1)+Len(s2)).

Example: Disjunctive Bounds
Example2(int n, x0, z0) {
  c1 := 0; c2 := 0;
  x := x0; z := z0;
  while (x < n)
    if (z > x) { x++; c1++; }
    else { z++; c2++; }
}
Termination proof based on disjunctively well-founded relation.
We can even compute bounds using the following proof structure:
  – Number of times if-branch is executed (if at all): n-x0
  – Number of times else-branch is executed (if at all): n-z0
  – Therefore, total iterations: Max(0,n-x0) + Max(0,n-z0)
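As a sanity check of the disjunctive bound, here is a small Python sketch that runs Example2 with its two counters and compares them against the per-branch bounds. The function name `example2` and the tested input ranges are illustrative assumptions.

```python
def example2(n, x0, z0):
    """Simulate Example2 and return the two instrumented counters (c1, c2)."""
    c1 = c2 = 0
    x, z = x0, z0
    while x < n:
        if z > x:
            x += 1
            c1 += 1            # counter for the if-branch
        else:
            z += 1
            c2 += 1            # counter for the else-branch
    return c1, c2


def claimed_bound(n, x0, z0):
    """Total bound stated on the slide: Max(0, n-x0) + Max(0, n-z0)."""
    return max(0, n - x0) + max(0, n - z0)


if __name__ == "__main__":
    c1, c2 = example2(10, 0, 0)
    print(c1, c2, claimed_bound(10, 0, 0))
```

Each counter is bounded separately by a linear expression, which is why two counters succeed where a single counter would need a disjunctive invariant.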

Outline
   Challenges in Bounds Analysis
-> Idea #1: Proof Structure (control flow)
   Idea #2: Quantitative Functions (data-structures)

Proof Structure
A proof structure specifies where to increment and initialize multiple counter variables. It is a tuple (M,G) such that:
M maps each back-edge q to some counter variable c.
  – "c++" at q.
G is some DAG over counter variables.
  – "c := 0" at entry and where any predecessor in G is incremented.
An invariant generation tool can bound counters instrumented as above.
Proof Structure for StringBuffer Example:
  M = {q ↦ c3, q1 ↦ c3, q2 ↦ c1, q3 ↦ c2}
  q: back-edge of outer loop, qi: back-edge of i-th inner loop
  G = [figure: DAG over c1, c2, c3]

Computing bound from a proof structure
Given a proof structure (M,G), bound U is computed as:
  U = Sum { TotalBound(c) | c }
  TotalBound(c) = Max { 0, B(q) | M(q) = c } × (1 + Sum { TotalBound(c') | (c',c) ∈ G })
where B(q) is the bound computed on M(q) at q.
Bound for StringBuffer Example:
  U = Len(s1) + Len(s2) + (1+size) × (Len(s1)+Len(s2))
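The TotalBound recurrence can be evaluated mechanically once the per-back-edge bounds B(q) are known. The Python sketch below does this for concrete numbers. The function name `overall_bound`, the edge set G = {(c1,c3), (c2,c3)} (read off from where c3 is reset in the instrumented Equals code), and the numeric values of B are all assumptions chosen for illustration; they are placeholders, not bounds derived by the tool.

```python
from functools import lru_cache


def overall_bound(M, G, B):
    """Evaluate U = Sum TotalBound(c) for a proof structure (M, G).

    M: dict mapping back-edge name -> counter name
    G: set of (predecessor, successor) counter pairs forming a DAG
    B: dict mapping back-edge name -> numeric bound on M(q) at q
    """
    @lru_cache(maxsize=None)
    def total_bound(c):
        # Max{0, B(q) | M(q) = c}
        local = max([0] + [B[q] for q, ctr in M.items() if ctr == c])
        # Sum{TotalBound(c') | (c', c) in G}
        from_preds = sum(total_bound(p) for (p, s) in G if s == c)
        return local * (1 + from_preds)

    return sum(total_bound(c) for c in set(M.values()))


if __name__ == "__main__":
    M = {"q": "c3", "q1": "c3", "q2": "c1", "q3": "c2"}
    G = {("c1", "c3"), ("c2", "c3")}          # assumed DAG edges
    B = {"q": 5, "q1": 5, "q2": 3, "q3": 4}   # placeholder numeric bounds
    print(overall_bound(M, G, B))
```

In the real analysis B(q) is a symbolic expression over inputs, so the same recurrence yields a symbolic U rather than a number.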

Automatically Computing Proof Structure
The total number of potential proof structures (M,G) is exponential in the number of back-edges.
  – Hence a naïve search is expensive.
Key Idea: Increasing counters and dependencies increases the ability of an invariant generation tool to discover bounds.
  – But cannot simply make all counters depend on each other.
  – Need to find the right set of dependencies that create a DAG.
There is a quadratic (in number of back-edges) algorithm to compute a (counter-optimal) proof structure. [POPL '09]
  – A counter-optimal proof structure uses minimal counters and minimal dependencies between counters.
  – Generally, this leads to more precise bounds.

Outline
   Challenges in Bounds Analysis
   Idea #1: Proof Structure (control flow)
-> Idea #2: Quantitative Functions (data-structures)

Quantitative Functions
Defined over tuple of abstract data-structures
  – Similar to ghost fields.
Len(L): Length of list L.
Pos(e,L): Position of list-element e in list L.
Semantics is defined by describing the effect of data-structure methods on quantitative functions.
  – Sequence of (conditional) assignments and assumes.
  – Can also refer to unscoped variables (universally quantified).

Data Structure Operation | Updates to Quantitative Functions
L.Append(e); | Len(L)++; Pos(e,L) := Len(L);
L.Delete(e); | Len(L)--; if (Pos(e,L) < Pos(e',L)) Pos(e',L)--;
e1 := L.GetNext(e2); | Pos(e1,L) := Pos(e2,L)+1; Assume(Pos(e1,L) ≤ Len(L));
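One way to read the table of updates is as ghost state maintained alongside a concrete list. The Python sketch below keeps Len and Pos as ghost fields and updates them exactly as the table prescribes; the class name `GhostList`, the assumption that elements are distinct and hashable, and the 1-based positions (implied by Pos(e,L) := Len(L) after the increment) are illustrative choices.

```python
class GhostList:
    """A list with ghost fields Len and Pos updated per the table of effects."""

    def __init__(self):
        self.items = []   # concrete contents (distinct, hashable elements assumed)
        self.len = 0      # ghost: Len(L)
        self.pos = {}     # ghost: Pos(e, L), 1-based

    def append(self, e):
        self.items.append(e)
        self.len += 1                 # Len(L)++
        self.pos[e] = self.len        # Pos(e,L) := Len(L)

    def delete(self, e):
        self.items.remove(e)
        self.len -= 1                 # Len(L)--
        p = self.pos.pop(e)
        for other in self.pos:        # the unscoped variable e' ranges over all elements
            if p < self.pos[other]:
                self.pos[other] -= 1  # Pos(e',L)--

    def get_next(self, e2):
        i = self.items.index(e2) + 1
        if i >= len(self.items):
            return None
        e1 = self.items[i]
        # Facts the annotation lets the analysis assume:
        assert self.pos[e1] == self.pos[e2] + 1
        assert self.pos[e1] <= self.len
        return e1


if __name__ == "__main__":
    L = GhostList()
    for name in ["a", "b", "c", "d"]:
        L.append(name)
    L.delete("b")
    print(L.len, L.pos, L.get_next("a"))
```

The analysis itself never executes such code; it only sees the assignments and assumes on Len and Pos, treating both as uninterpreted functions.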

Principles behind defining Quantitative Functions
Precision
  – Defining more quantitative functions increases the ability of a linear invariant generation tool to find bounds.
  – In practice, a few quantitative functions are usually sufficient.
Soundness
  – Method annotations are always sound from the tool's perspective.
  – It is the user's responsibility to ensure that the intended semantics matches the method annotations.
  – Verification is possible if the intended semantics can be described in an appropriate logic. Gulwani, Sagiv, Lev-Ami: "A Combination Framework for Tracking Partition Sizes", POPL 2009.

Computing Invariants over Quantitative Functions
Instrument a data-structure method call with its effect, allowing quantitative functions to be treated as uninterpreted.
  – Instantiate unscoped variables with all appropriate terms.
Use a linear invariant generation tool with support for uninterpreted functions.
  – Abstract Interpretation based Technique. Combine Polyhedron abstract domain [Cousot, POPL '79] with uninterpreted functions domain [Gulwani, Necula, SAS '04] using domain-combinators [Gulwani, Tiwari, PLDI '06].
  – Constraint-based Invariant Generation Technique. [Beyer et al., VMCAI '07]

Example: Breadth First Traversal
BFT(List L):
  ToDo.Init(); L.MoveTo(L.Head(),ToDo); c := 0;
  while (! ToDo.IsEmpty())
    e := ToDo.Head(); ToDo.Delete(e);
    foreach successor s in e.Successors()
      if (L.contains(s)) L.MoveTo(s,ToDo);
    c++;
Inductive invariant at back-edge of while-loop:
  c ≤ Old(Len(L)) - Len(L) - Len(ToDo) ∧ Len(L) ≥ 0 ∧ Len(ToDo) ≥ 0
This implies a bound of Old(Len(L)) for the while loop.
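The inductive invariant can be checked dynamically on a small graph. The Python sketch below mirrors the BFT pseudocode, with a set standing in for L and a deque for ToDo, and asserts the invariant at the back-edge. The function name `bft_iterations`, the graph encoding as a successor dictionary, and the requirement that the node list is non-empty are assumptions made for the sketch.

```python
from collections import deque


def bft_iterations(nodes, successors):
    """Run the traversal and return the while-loop counter c.

    nodes: non-empty list of node ids (its first entry plays the role of L.Head())
    successors: dict mapping node id -> iterable of successor ids
    """
    remaining = set(nodes)            # L
    old_len = len(remaining)          # Old(Len(L))
    todo = deque()                    # ToDo
    head = nodes[0]
    remaining.remove(head)            # L.MoveTo(L.Head(), ToDo)
    todo.append(head)
    c = 0
    while todo:
        e = todo.popleft()            # e := ToDo.Head(); ToDo.Delete(e)
        for s in successors.get(e, ()):
            if s in remaining:        # L.contains(s)
                remaining.remove(s)   # L.MoveTo(s, ToDo)
                todo.append(s)
        c += 1
        # Invariant at the back-edge of the while-loop
        assert c <= old_len - len(remaining) - len(todo)
    return c


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [2], 2: [0]}
    print(bft_iterations([0, 1, 2, 3], graph))
```

Since Len(L) and Len(ToDo) are non-negative, the asserted inequality immediately gives c ≤ Old(Len(L)).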

Quantitative Functions for Bit-vectors
Ones(b): Number of 1 bits in b
One(b): Position of least significant 1 bit in b
Bits(b): Number of bits in b

Data Structure Operation | Updates to Quantitative Functions
a := b << index; | Ones(a) := ?; Assume(Ones(a) ≤ Ones(b)); One(a) := index + One(b);

Example
Iterate(BitVector a):
  b := a; c := 0;
  while (BitScanForward(&id1,b))
    b := b | ((1 << id1)-1);          // set all bits before id1
    if (!BitScanForward(&id2, ~b)) break;   // no 0 bit left: done
    b := b & (~((1 << id2)-1));       // reset bits before id2
    c++;
Each loop iteration masks a chunk of consecutive 1s to 0.
Our tool computes invariant:
  c ≤ Ones(a)-Ones(b) ∧ 2c ≤ One(b)-One(a) ∧ One(b) ≤ Bits(a)
This implies a bound of Min { Ones(a), Bits(a)/2 }.
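The bound Min { Ones(a), Bits(a)/2 } can be checked exhaustively for small widths. The Python sketch below models a fixed-width bit-vector with an explicit mask and assumes the loop exits when no 0 bit remains above the filled prefix; `bit_scan_forward` is a hypothetical stand-in for the intrinsic that returns an index or None instead of writing through a pointer.

```python
def bit_scan_forward(x):
    """Index of the least significant 1 bit of x, or None if x == 0."""
    if x == 0:
        return None
    return (x & -x).bit_length() - 1


def iterate(a, bits):
    """Simulate Iterate on a bit-vector of the given width; return the counter c."""
    mask = (1 << bits) - 1
    b = a & mask
    c = 0
    while True:
        id1 = bit_scan_forward(b)
        if id1 is None:
            break
        b |= (1 << id1) - 1                  # set all bits before id1
        id2 = bit_scan_forward(~b & mask)    # lowest 0 bit of b within the width
        if id2 is None:                      # assumed exit: no 0 bit remains
            break
        b &= ~((1 << id2) - 1) & mask        # reset bits before id2
        c += 1
    return c


if __name__ == "__main__":
    bits = 8
    worst = max(iterate(a, bits) for a in range(1 << bits))
    print(worst)
```

Each counted iteration consumes at least one 1 bit and one following 0 bit, which is the intuition behind both halves of the Min.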

Quantitative Functions for List of Lists
TotalNodes(L) = Sum { Len(e') | L.BelongsTo(e') }
MaxNodes(L) = Max { Len(e') | L.BelongsTo(e') }

Program | Bound
for (e := L.Head(); e ≠ null; e := L.GetNext(e)); | Len(L)
for (e := L.Head(); e ≠ null; e := L.GetNext(e)) for (f := e.Head(); f ≠ null; f := e.GetNext(f)); | Len(L) + TotalNodes(L)
for (e := L.Head(); e ≠ null; e := L.GetNext(e)) if (*) break; for (f := e.Head(); f ≠ null; f := e.GetNext(f)); | Len(L) + MaxNodes(L)
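The second and third rows of the table can be replayed on a concrete list of lists. In this Python sketch, nested Python lists stand in for the list-of-lists data structure, and the nondeterministic `if (*) break` is modelled by a hypothetical `stop_at` index; all function names are illustrative.

```python
def total_nodes(L):
    """TotalNodes(L): sum of the lengths of the member lists."""
    return sum(len(e) for e in L)


def max_nodes(L):
    """MaxNodes(L): length of the longest member list (0 for an empty L)."""
    return max((len(e) for e in L), default=0)


def nested_iterations(L):
    """Count iterations of the doubly nested loop (second row of the table)."""
    count = 0
    for e in L:
        count += 1
        for _ in e:
            count += 1
    return count


def break_then_scan_iterations(L, stop_at):
    """Count iterations of the third program; stop_at models 'if (*) break'."""
    count = 0
    chosen = None
    for i, e in enumerate(L):
        if i == stop_at:
            chosen = e
            break
        count += 1
    if chosen is not None:        # inner scan runs only over the chosen element
        for _ in chosen:
            count += 1
    return count


if __name__ == "__main__":
    L = [[1, 2, 3], [4], [5, 6]]
    print(nested_iterations(L), len(L) + total_nodes(L))
```

The third program only ever walks one member list, which is why its bound uses MaxNodes rather than TotalNodes.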

Quantitative Functions for Trees
Nodes(T): Total number of nodes in tree T
Height(T): Height of tree T

Conclusion
Applications of Symbolic Bounds Analysis
  – Interactive code development, Embedded/Real-time systems.
Challenges in Bounds Analysis
  – Control flow leads to non-linear and disjunctive bounds.
  – Data-structures require numerical shape analysis.
Idea #1: Proof Structure (control flow)
  – Addresses issue of non-linear and disjunctive bounds.
  – Reduces Bounds Analysis to linear numerical shape analysis.
Idea #2: Quantitative Functions (data-structures)
  – Further reduces Bounds Analysis to linear invariant generation over uninterpreted functions.
