SPEED: Precise & Efficient Static Estimation of Symbolic Computational Complexity. Sumit Gulwani, MSR Redmond.

Presentation transcript:

SPEED: Precise & Efficient Static Estimation of Symbolic Computational Complexity
Sumit Gulwani (MSR Redmond), Trishul Chilimbi (MSR Redmond), Krishna Mehra (MSR Bangalore)

Problem Definition
Compute a symbolic complexity bound for procedures in terms of their inputs (assuming unit cost for statements).
Different cost metrics can be used:
– Count only memory instructions.
– Count only memory allocation instructions, weighted by the memory allocated (space bound).
– Count only network instructions, weighted appropriately (network traffic bound).
Bounds can also be computed for interesting code fragments:
– Code executed between lock acquire and release.

Applications
Provide immediate feedback during code development
– Use of unfamiliar APIs.
– Code editing.
Performance analysis
– Identify corner cases (unlike profiling).
Embedded systems
– Establish space bounds.

Two Key Challenges
The hard part is bounding loops. To bound the iterations of
    while (cond) do S
instrument the loop with a counter c:
    c := 0; while (cond) do { S; c := c+1; }
Bounding the loop is then equivalent to computing a bound on c using invariant generators. However, the required invariants are hard to compute.
– Challenge 1: Control flow in S leads to disjunctive, non-linear bounds, which in turn require disjunctive, non-linear invariants. Key Idea 1: Multiple Counter Instrumentation.
– Challenge 2: Iteration over data structures in S requires reference to numerical properties of these data structures. Key Idea 2: User-defined Quantitative Functions.
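The counter instrumentation described on this slide can be illustrated with a small Python sketch. The concrete loop and the helper name `instrumented_loop` are hypothetical stand-ins, not code from the talk; the only point is that any bound on the final value of `c` is also a bound on the number of loop iterations.

```python
def instrumented_loop(x, n):
    """Run `while x < n: x += 1` with an instrumented counter c.

    The loop body is a made-up stand-in for S; c plays the role of the
    counter added by the instrumentation.
    """
    x_initial = x
    c = 0                      # c := 0 before the loop
    while x < n:               # cond
        x += 1                 # S
        c += 1                 # c := c + 1 at the end of each iteration
        # Invariant an invariant generator could discover for this loop:
        assert c <= max(0, n - x_initial)
    return c


if __name__ == "__main__":
    for x0, n in [(0, 5), (3, 5), (7, 5)]:
        print(x0, n, instrumented_loop(x0, n), max(0, n - x0))
```

For this simple loop a single linear invariant suffices; the two challenges on the slide arise when S contains branching or data-structure traversal.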

Outline
→ Key Idea 1: Multiple Counter Instrumentation
– Addresses the issue of disjunctive and non-linear bounds.
Key Idea 2: Quantitative Functions
– Addresses the issue of bounds for loops that iterate over data structures.

Example 1: Disjunctive Bound
    Example(int n, x0, z0) {
      c1 := 0; c2 := 0;
      x := x0; z := z0;
      while (x < n)
        if (z > x) { x++; c1++; }
        else { z++; c2++; }
    }
The termination proof is based on a disjunctively well-founded relation. We can even compute bounds using the following proof strategy:
– Number of times the if-branch is executed (if at all): n-x0
– Number of times the else-branch is executed (if at all): n-z0
– Therefore, total iterations: Max(0,n-x0) + Max(0,n-z0)
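As a sanity check on the per-branch bounds, the example can be transcribed into Python and run over a small grid of inputs. This is a sketch for illustration only: the function names and the grid size are assumptions, and brute-force testing is not how SPEED derives the bound (it uses invariant generation over the counters).

```python
def example(n, x0, z0):
    """Direct transcription of Example 1; returns the counters (c1, c2)."""
    c1 = c2 = 0
    x, z = x0, z0
    while x < n:
        if z > x:
            x += 1
            c1 += 1    # counts executions of the if-branch
        else:
            z += 1
            c2 += 1    # counts executions of the else-branch
    return c1, c2


def bounds_hold(limit=6):
    """Check c1 <= Max(0, n-x0) and c2 <= Max(0, n-z0) on a small grid."""
    values = range(-limit, limit + 1)
    for n in values:
        for x0 in values:
            for z0 in values:
                c1, c2 = example(n, x0, z0)
                if c1 > max(0, n - x0) or c2 > max(0, n - z0):
                    return False
    return True


if __name__ == "__main__":
    print(example(3, 0, 0), bounds_hold())
```

Each counter is bounded separately by a simple linear expression, and the sum of the two bounds gives the disjunctive total, which a single counter with a single linear invariant could not express.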

Example 2: Non-linear Bound
    int msize; // Assume(0 ≤ e1.len, e2.len < msize);
    Equals(StringBuffer s1, StringBuffer s2) {
      c1 := c2 := c3 := 0;
      e1 := s1.GetHead(); e2 := s2.GetHead();
      i1 := e1.len-1; i2 := e2.len-1;
      while (true) {
        if (e1.arr[i1] ≠ e2.arr[i2]) return 0;
        i1--; i2--;
        while (i1 < 0 ∧ e1 ≠ null) { e1 := s1.GetNext(e1); i1 := i1+e1.len; c1++; c3 := 0; }
        while (i2 < 0 ∧ e2 ≠ null) { e2 := s2.GetNext(e2); i2 := i2+e2.len; c2++; c3 := 0; }
        if (i1 < 0) return (i2 < 0);
        if (i2 < 0) return 0;
        c3++;
      };
      return 1;
    }
Total iterations of the 1st and 2nd inner loops: Len(s1) and Len(s2).
Between any two iterations of the inner loops, iterations of the outer loop: msize.
Therefore the total complexity is (1+msize)*(1+Len(s1)+Len(s2)).
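One way to read the final product, offered as a sketch of the arithmetic rather than as the paper's formal derivation: the counters c1 and c2 bound the inner loops, and c3 is reset to 0 whenever either inner loop iterates, so the outer loop runs at most msize times in each of at most 1 + c1 + c2 phases. Here "total" denotes the combined iteration count of all three loops, a name introduced only for this illustration.

```latex
\begin{align*}
c_1 &\le \mathrm{Len}(s_1), \qquad c_2 \le \mathrm{Len}(s_2), \qquad c_3 \le \mathit{msize},\\
\text{outer iterations} &\le \mathit{msize}\cdot(1 + c_1 + c_2),\\
\text{total} &\le (c_1 + c_2) + \mathit{msize}\cdot(1 + c_1 + c_2)\\
&\le (1 + \mathit{msize})\cdot(1 + c_1 + c_2)\\
&\le (1 + \mathit{msize})\cdot\bigl(1 + \mathrm{Len}(s_1) + \mathrm{Len}(s_2)\bigr).
\end{align*}
```

The bound is non-linear because the linear bound on c3 is multiplied by the number of times c3 can be reset, which is itself bounded by the other two counters.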

Automatically Computing Counter Placement
The total number of potential counter placements is exponential in the number of back-edges.
– Hence a naïve search is expensive.
Key Idea: Increasing the number of counters and dependencies increases the ability of an invariant generation tool to discover bounds.
– But we cannot simply make all counters depend on each other. We need to find the right set of dependencies, which must form a DAG.
There is a quadratic (in the number of back-edges) algorithm to compute an optimal counter placement.
– An optimal counter placement scheme uses minimal counters and minimal dependencies between counters. Generally, this leads to more precise bounds.

Outline
Key Idea 1: Multiple Counter Instrumentation
– Addresses the issue of disjunctive and non-linear bounds.
→ Key Idea 2: Quantitative Functions
– Addresses the issue of bounds for loops that iterate over data structures.

Quantitative Functions
Defined over a tuple of abstract data structures.
– Len(L): Length of list L.
– Pos(e,L): Position of list-element e in list L.
Semantics is defined by describing the effect of data-structure methods on quantitative functions.
– Sequence of (conditional) assignments and assumes.
– Can also refer to unscoped variables (universally quantified).

Data Structure Operation    Updates to Quantitative Functions
L.Append(e);                Len(L)++; Pos(e,L) := Len(L);
L.Delete(e);                Len(L)--; if (Pos(e,L) < Pos(e',L)) Pos(e',L)--;

The paper gives examples of quantitative functions for trees, bit-vectors, and composite data structures, e.g., list of lists.
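The update rules for Append and Delete can be mimicked in Python by keeping Len and Pos as ghost state next to a concrete list and checking that the two stay in agreement. This is only an illustrative sketch: the class `TrackedList`, its field names, and the assumption that elements are distinct and hashable are not from the talk, and positions are taken to be 1-based because Append sets Pos(e,L) to the already-incremented Len(L).

```python
class TrackedList:
    """Concrete list plus ghost state for Len(L) and Pos(e, L)."""

    def __init__(self):
        self.items = []   # the concrete list L
        self.len_q = 0    # quantitative function Len(L)
        self.pos_q = {}   # quantitative function Pos(e, L), 1-based

    def append(self, e):
        self.items.append(e)
        self.len_q += 1               # Len(L)++
        self.pos_q[e] = self.len_q    # Pos(e,L) := Len(L)

    def delete(self, e):
        self.items.remove(e)
        self.len_q -= 1               # Len(L)--
        pos_e = self.pos_q.pop(e)
        # e' is an unscoped (universally quantified) variable, so the
        # conditional update is applied to every remaining element.
        for other in self.pos_q:
            if pos_e < self.pos_q[other]:
                self.pos_q[other] -= 1

    def consistent(self):
        """True if the ghost state matches the concrete list."""
        return (self.len_q == len(self.items) and
                all(self.pos_q[e] == i + 1
                    for i, e in enumerate(self.items)))


if __name__ == "__main__":
    lst = TrackedList()
    for e in ["a", "b", "c"]:
        lst.append(e)
    lst.delete("b")
    print(lst.len_q, lst.pos_q, lst.consistent())
```

An analysis never runs the concrete list; it only sees the assignments to Len and Pos, which is what allows it to treat them as uninterpreted functions.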

Example 3: Data-structure Iteration
    BreadthFirstTraversal(List L):
      ToDo.Init(); L.MoveTo(L.Head(), ToDo); c := 0;
      while (!ToDo.IsEmpty())
        e := ToDo.Head(); ToDo.Delete(e);
        foreach successor s in e.Successors()
          if (L.contains(s)) L.MoveTo(s, ToDo);
        c++;
Inductive invariant at the back-edge of the while-loop:
    c ≤ Old(Len(L)) - Len(L) - Len(ToDo) ∧ Len(L) ≥ 0 ∧ Len(ToDo) ≥ 0
This implies a bound of Old(Len(L)) for the while loop.
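The inductive invariant can be checked dynamically on a concrete run. The Python sketch below assumes the graph is given as a non-empty node list plus a successor dictionary (both representations, and all names, are hypothetical) and that the counter is incremented once per iteration of the while loop; it asserts the invariant at the end of every iteration.

```python
from collections import deque


def breadth_first_traversal(nodes, successors):
    """Traverse from the head of `nodes`; returns (c, Old(Len(L))).

    nodes: non-empty list of distinct nodes (the list L).
    successors: dict mapping a node to a list of successor nodes.
    """
    remaining = list(nodes)          # L
    old_len = len(remaining)         # Old(Len(L))
    todo = deque()                   # ToDo.Init()
    todo.append(remaining.pop(0))    # L.MoveTo(L.Head(), ToDo)
    c = 0
    while todo:
        e = todo.popleft()           # e := ToDo.Head(); ToDo.Delete(e)
        for s in successors.get(e, []):
            if s in remaining:       # L.contains(s)
                remaining.remove(s)  # L.MoveTo(s, ToDo)
                todo.append(s)
        c += 1
        # Invariant at the back-edge of the while loop.
        assert c <= old_len - len(remaining) - len(todo)
        assert len(remaining) >= 0 and len(todo) >= 0
    return c, old_len


if __name__ == "__main__":
    graph = {1: [2, 3], 2: [3, 4], 3: [1], 4: []}
    print(breadth_first_traversal([1, 2, 3, 4, 5], graph))
```

Because Len(L) and Len(ToDo) are non-negative, the invariant gives c ≤ Old(Len(L)), which is the bound stated on the slide.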

Computing Invariants over Quantitative Functions
Instrument each data-structure method call with its effect, allowing quantitative functions to be treated as uninterpreted.
– Instantiate unscoped variables with all appropriate terms.
Use a linear invariant generation tool with support for uninterpreted functions.
– Abstract interpretation based technique: use domain combinators [Gulwani, Tiwari, PLDI '06] to combine the Polyhedron abstract domain [Cousot, POPL '79] with the uninterpreted functions domain [Gulwani, Necula, SAS '04].
– Constraint-based invariant generation technique [Beyer et al., VMCAI '07].

Related Work
Type system approaches for resource bound certification
– Only verify bound annotations, as opposed to inferring them.
WCET (Worst Case Execution Time) analysis
– Focused on modeling low-level architectural details.
– For loop bounds, either requires user annotation, or uses simple pattern matching or some simple numerical analysis.
Termination analysis
– Complexity analysis provides more information, and can also bound other resources, e.g., memory space usage.

Conclusion & Limitations
Applications of symbolic bound analysis
– Interactive code development, embedded systems.
Two key ideas
– Multiple Counter Instrumentation: helps compute non-linear and disjunctive bounds.
– Quantitative Functions: helps compute bounds that refer to numerical heap properties.
Limitations: concurrent procedures.