Ditto: Speeding Up Runtime Data Structure Invariant Checks
AJ Shankar and Ras Bodik, UC Berkeley

Motivation: A Debugging Scenario
- Buggy program: a large-scale web application in Java
- Primary data structure: a HashMap of shopping carts
- Carts are modified throughout the code
- Bug: the HashMap misbehaves (carts disappearing, etc.)
- Hypothesis: a cart modification violates hashCode() invariance

How to Check the Hypothesis?
- Debugger facilities are inadequate
- Idea: write a runtime check
  - Iterates over buckets, checks hashCode() of each cart in bucket
  - Run the check frequently to pinpoint the error
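
A check of this kind might look like the sketch below. It is only an illustration: the slides do not show the application's code, so `Cart`, `Entry`, `bucketOf` and the chained-bucket layout are hypothetical names. The check is written recursively and without side effects, which is the form the later slides say Ditto handles.

```java
final class CartTableCheck {
    /** Hypothetical cart whose hash code depends on a mutable field. */
    static final class Cart {
        String id;
        Cart(String id) { this.id = id; }
        @Override public int hashCode() { return id.hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof Cart && ((Cart) o).id.equals(id);
        }
    }

    /** Singly linked bucket chain, as in a chained hash table (assumed layout). */
    static final class Entry {
        final Cart cart;
        final Entry next;
        Entry(Cart cart, Entry next) { this.cart = cart; this.next = next; }
    }

    /** Index of the bucket that the cart's current hash code selects. */
    static int bucketOf(Cart c, int bucketCount) {
        return Math.floorMod(c.hashCode(), bucketCount);
    }

    /** True if every cart in buckets[i..] sits in the bucket its hash code selects. */
    static boolean checkTable(Entry[] buckets, int i) {
        if (i == buckets.length) return true;
        return checkBucket(buckets[i], i, buckets.length) && checkTable(buckets, i + 1);
    }

    /** True if every cart in the chain starting at e belongs in bucket `index`. */
    static boolean checkBucket(Entry e, int index, int bucketCount) {
        if (e == null) return true;
        if (bucketOf(e.cart, bucketCount) != index) return false;
        return checkBucket(e.next, index, bucketCount);
    }
}
```

Calling `CartTableCheck.checkTable(buckets, 0)` after every operation would pinpoint the first mutation that moves a cart's hash code away from its bucket, at the cost of a full table scan each time.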

Problem
- The check is slow! (100x slowdown)
  - Rerunning the program is now a problem
- Furthermore, what if the bug isn’t reproducible?
  - Run the program with the check on the entire test suite?
  - Infeasible.

Our Tool: Ditto
- Ditto speeds up data structure invariant checks
  - Usually asymptotically in the size of the data structure
  - Hash table: 10x speedup at 1600 elements
- What invariant checks can Ditto handle?
  - Side-effect-free: cannot return fresh mutable objects
  - Recursive: not an inherent limitation of the algorithm

Basic Observation: Incrementalize
- The invariant checks the entire data structure …
- … but once checked, a local change can be (re)checked locally!
- So, first establish the invariant, then incrementally check changes
- Invariant: “Hash code of each cart in table corresponds to containing bucket.”

A New Domain
- Existing incrementalizers: general purpose but not automatic [Acar PLDI 2006]
  - User must annotate the program
  - For functional programs
  - Other caveats (conversion to CPS, etc.)
- Ditto is automatic in this domain
  - Functional invariant checks in an imperative Java setting
  - No user annotations
  - Allows arbitrary heap updates outside the invariant
  - A simple bytecode-to-bytecode implementation

Ditto Algorithm Overview
- First run of check: construct graph of the computation
  - Stores function calls, concrete inputs
- Track changes to computation inputs
- Subsequent runs of check: rerun only subcomputations with changed inputs
- Incrementally update computation graph = incrementally compute invariant check

Example Invariant Check
Ensures a tree is locally ordered
    boolean isOrdered(Tree t) {
        if (t == null) return true;
        if (t.left != null && t.left.value >= t.value) return false;
        if (t.right != null && t.right.value <= t.value) return false;
        return isOrdered(t.left) && isOrdered(t.right);
    }

1. Constructing a Computation Graph
- Purpose of computation graph:
  - For unchanged parts of data structure, reuse existing results
  - For changed parts, identify parts of check that need to be rerun
- Graph stores the initial check run:
  - Node = function invocation, along with its
    - Concrete formal arguments (inputs)
    - Concrete heap accesses (inputs)
    - Return value
- Same inputs = can reuse return value
- Changed inputs = must rerun

1. Constructing a Computation Graph
[Figure: heap tree with nodes P, A, B, C, next to the computation graph with nodes isOrdered(P), isOrdered(A), isOrdered(B), isOrdered(C)]
During the first check run, by instrumentation, for isOrdered(A):
- Node is created with concrete formal arg A
- Heap reads from a.value, a.left, a.right, a.left.value, a.right.value are remembered
- Calls children
- Returns true
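
The recording step can be sketched by hand as below. Ditto does this through bytecode instrumentation; the explicit `Node` and `Read` classes and the accessor helpers here are hypothetical stand-ins that only illustrate what a graph node remembers: the concrete argument, each heap read, the child calls, and the return value.

```java
import java.util.ArrayList;
import java.util.List;

final class GraphRecorder {
    static final class Tree {
        int value; Tree left, right;
        Tree(int value, Tree left, Tree right) {
            this.value = value; this.left = left; this.right = right;
        }
    }

    /** One remembered heap access: owner.field held `seen` when it was read. */
    static final class Read {
        final Tree owner; final String field; final Object seen;
        Read(Tree owner, String field, Object seen) {
            this.owner = owner; this.field = field; this.seen = seen;
        }
    }

    /** Graph node = one invocation of the check. */
    static final class Node {
        final Tree arg;                                   // concrete formal argument
        final List<Read> reads = new ArrayList<>();       // concrete heap accesses
        final List<Node> children = new ArrayList<>();    // callee invocations
        boolean result;                                   // return value
        Node(Tree arg) { this.arg = arg; }
    }

    /** Runs the check on t and returns the graph node describing that run. */
    static Node isOrdered(Tree t) {
        Node n = new Node(t);
        n.result = body(t, n);
        return n;
    }

    // Same logic as the slide's isOrdered, with every heap read logged on n.
    private static boolean body(Tree t, Node n) {
        if (t == null) return true;
        Tree l = left(t, n), r = right(t, n);
        if (l != null && value(l, n) >= value(t, n)) return false;
        if (r != null && value(r, n) <= value(t, n)) return false;
        return call(l, n) && call(r, n);
    }

    private static boolean call(Tree t, Node parent) {
        Node child = isOrdered(t);
        parent.children.add(child);
        return child.result;
    }

    private static Tree left(Tree t, Node n)  { n.reads.add(new Read(t, "left", t.left));   return t.left; }
    private static Tree right(Tree t, Node n) { n.reads.add(new Read(t, "right", t.right)); return t.right; }
    private static int value(Tree t, Node n)  { n.reads.add(new Read(t, "value", t.value)); return t.value; }
}
```

In this sketch a later run could compare a node's argument and remembered reads against the current heap to decide whether its stored result is reusable.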

2. Detecting Changed Inputs
- Inputs to check that could change between runs:
  - Arguments: easy to detect (passed to the check)
  - Heap values: harder (could be modified anywhere in code)
- Selective write barriers
  - Statically determine which fields are read in the check
  - Barriers collect changed heap inputs used by check
- In example: add write barriers for all writes into fields:
  - Tree.left
  - Tree.right
  - Tree.value
    if (t == null) return true;
    if (t.left != null && t.left.value >= t.value) return false;
    if (t.right != null && t.right.value <= t.value) return false;
    return isOrdered(t.left) && isOrdered(t.right);
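
A selective write barrier can be imitated by hand with setters, as in this sketch. Ditto inserts its barriers automatically; the `changed` set, the setter names and the extra `label` field are assumptions made for illustration. Only the three fields the check reads are guarded, so writes to any other field cost nothing.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class WriteBarrierSketch {
    /** Objects whose checked fields were written since the last check run. */
    static final Set<Object> changed =
        Collections.newSetFromMap(new IdentityHashMap<>());

    static final class Tree {
        private int value;
        private Tree left, right;
        String label; // hypothetical field the check never reads: no barrier

        Tree(int value) { this.value = value; }

        // Barriered writes: log the modified object, then perform the store.
        void setValue(int v)  { changed.add(this); value = v; }
        void setLeft(Tree t)  { changed.add(this); left = t; }
        void setRight(Tree t) { changed.add(this); right = t; }

        int value()  { return value; }
        Tree left()  { return left; }
        Tree right() { return right; }
    }
}
```

The set is keyed by object identity, so the next check run can ask which recorded invocations read one of the logged objects.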

3. Rerunning the Invariant
Data structure modification: add node N, remove node F
[Figure: the tree checked by isOrdered(), with nodes A, B, C, D, E, F, G, shown before the modification and with N added afterwards]

3. Rerunning the Invariant
- Goal: incrementally update computation graph
  - Graph must look as if check was run afresh
[Figure: tree with new modifications, next to the computation graph from the last run, rooted at isOrdered(A) with result true]
Write barriers say…

3. Rerunning the Invariant
- isOrdered(A) is the first node that needs to be rerun
  - Parent inputs haven’t changed (functions are side-effect-free)
- Rerunning exposes new node N
- What happens at isOrdered(B)?
[Figure: modified tree, next to the computation graph with new node N and root result true]

3. Rerunning the Invariant
- isOrdered(B) has the same formal args and heap inputs
- We’d like to reuse its previous result
  - And end this subcomputation
- Problem: isOrdered(B) also depends on return values of its callees
  - Which might change, since isOrdered(D) will be rerun
  - So we can’t be sure isOrdered(B)’s result will be the same!
[Figure: modified tree, next to the computation graph with new node N and root result true]

Optimistic Memoization
- Don’t want to rerun all nodes between B and D
- Solution: we optimistically assume that isOrdered(B) will return the same result
  - Invariant checks generally do! (e.g. “success”)
- Check assumption when we rerun isOrdered(D)
- For now, reuse previous result, finish up A
  - A returns previous result (true), so finished here
[Figure: computation graph with nodes A, B, C, D, E, F, G and new node N]

3. Rerunning the Invariant
- Now we rerun isOrdered(D)
- Reuse previous results of isOrdered(E) and isOrdered(G)
  - No further changes, so no need for optimism
- isOrdered(F) pruned from graph
- isOrdered(D) returns previous result (true)
  - So optimistic assumption was correct
  - Computations around isOrdered(A) all correct
[Figure: computation graph with nodes A, B, C, D, E, F, G and new node N]

What If isOrdered(D) Returned false?
- Result propagated up graph
  - Continues as long as return val differs
- In this case, root node of graph is reached
  - Result for entire computation is changed
- Automatically corrects optimistic assumptions
[Figure: computation graph with the result false propagating up to the root]

Result of Algorithm
- We’ve incrementally updated computation graph to reflect updated data structure
  - Even with circular dependencies throughout graph, only reran 3 nodes
- Result of computation is result of root node (true)
- Graph is ready for next incremental update
[Figure: modified tree, next to the updated computation graph with N added, F pruned, and root result true]
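
The rerun loop can be approximated in a few dozen lines, as sketched below. This is a deliberately simplified illustration and not Ditto's implementation: it assumes a proper tree with a fixed root, uses hand-written barriers, evaluates both children even when the local comparison fails (the slide's check short-circuits), and reruns dirty calls in arbitrary order. All class and method names are hypothetical. What it does show is the core of optimistic memoization: a rerun call reuses its children's cached results, and a changed result is pushed up to ancestors only when it actually differs.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

final class IncrementalOrdered {
    static final class Tree {
        int value; Tree left, right;
        Tree(int value) { this.value = value; }
    }

    /** One recorded invocation isOrdered(arg) and its cached outcome. */
    private static final class Call {
        final Tree arg; Call parent, left, right;
        boolean local;   // outcome of the two comparisons in the body
        boolean result;  // local && results of the child calls
        Call(Tree arg, Call parent) { this.arg = arg; this.parent = parent; }
    }

    private final Map<Tree, Call> calls = new IdentityHashMap<>();
    private final Set<Tree> written = Collections.newSetFromMap(new IdentityHashMap<>());
    private final Call root;
    int reruns; // invocations executed so far (for illustration only)

    IncrementalOrdered(Tree t) { // first run: build the whole graph
        root = new Call(t, null);
        calls.put(t, root);
        run(root);
    }

    // Hand-written write barriers standing in for instrumentation.
    void setLeft(Tree t, Tree l)  { t.left = l;  written.add(t); }
    void setRight(Tree t, Tree r) { t.right = r; written.add(t); }
    void setValue(Tree t, int v)  { t.value = v; written.add(t); }

    /** Incremental check: reruns only calls whose heap inputs were written. */
    boolean check() {
        Set<Call> dirty = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Tree t : written) {
            Call c = calls.get(t);
            if (c == null) continue;                    // not part of the last run
            dirty.add(c);                               // reads t.left, t.right, t.value
            if (c.parent != null) dirty.add(c.parent);  // reads t.value through its child
        }
        written.clear();
        for (Call c : dirty) {
            if (calls.get(c.arg) != c) continue;        // pruned by an earlier rerun
            boolean old = c.result;
            run(c);
            if (c.result != old) propagate(c.parent);   // optimism failed: fix ancestors
        }
        return root.result;
    }

    private void run(Call c) {
        reruns++;
        Tree t = c.arg;
        c.local = (t.left == null || t.left.value < t.value)
               && (t.right == null || t.right.value > t.value);
        c.left = child(c, c.left, t.left);
        c.right = child(c, c.right, t.right);
        c.result = combine(c);
    }

    private Call child(Call parent, Call old, Tree now) {
        if (old != null && old.arg == now) return old;  // same input: reuse optimistically
        Call k = (now == null) ? null : calls.get(now);
        if (k != null) k.parent = parent;               // moved subtree keeps its results
        if (old != null) prune(old, parent);
        if (now != null && k == null) {                 // new node: run it afresh
            k = new Call(now, parent);
            calls.put(now, k);
            run(k);
        }
        return k;
    }

    private void prune(Call k, Call owner) {
        if (k == null || k.parent != owner) return;     // already re-parented elsewhere
        calls.remove(k.arg);
        prune(k.left, k);
        prune(k.right, k);
    }

    private static boolean combine(Call c) {
        return c.local && (c.left == null || c.left.result)
                       && (c.right == null || c.right.result);
    }

    private void propagate(Call p) {
        for (; p != null; p = p.parent) {
            boolean r = combine(p);
            if (r == p.result) return;                  // result unchanged: stop early
            p.result = r;
        }
    }
}
```

On a five-node tree, changing one leaf value through `setValue` and calling `check()` reruns two calls in this sketch (the leaf's call and its parent's) rather than all five.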

Evaluation
- Ran on a number of common data structure invariants, two real-world examples
- Most complex invariant: red-black trees
  - Tree is globally ordered
  - Same # of black nodes to leaf
  - Other RB properties (Black follows Red, etc.)
  - We were unable to incrementalize this check by hand!
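
Two of the listed red-black properties can be written as recursive, side-effect-free checks, as in this sketch. The slides do not show the evaluated invariant's code, so the `Node` class and both method names are assumptions; the global ordering property is omitted.

```java
final class RedBlackChecks {
    /** Hypothetical immutable red-black tree node. */
    static final class Node {
        final boolean red; final Node left, right;
        Node(boolean red, Node left, Node right) {
            this.red = red; this.left = left; this.right = right;
        }
    }

    /** Black height of the subtree, or -1 if two root-to-leaf paths disagree. */
    static int blackHeight(Node n) {
        if (n == null) return 1;               // null leaves count as black
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l < 0 || l != r) return -1;        // also covers r < 0
        return l + (n.red ? 0 : 1);
    }

    /** True if no red node has a red child. */
    static boolean noRedRed(Node n) {
        if (n == null) return true;
        if (n.red && (isRed(n.left) || isRed(n.right))) return false;
        return noRedRed(n.left) && noRedRed(n.right);
    }

    private static boolean isRed(Node n) { return n != null && n.red; }
}
```

Note that `blackHeight` returns an integer rather than a boolean, so a change deep in the tree can alter the results of many ancestor calls, which hints at why this invariant is harder to incrementalize by hand than `isOrdered`.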

Kernel Results

Real-world Examples
- Tetris-like game Netcols
  - Invariant: no “floating” jewels in grid
  - With check, main event loop ran at 80ms, noticeably laggy
  - Result: event loop to 15ms with Ditto
- JavaScript obfuscator
  - Invariant: no excluded keywords (based on a set of criteria) in renaming map

Summary
- Results:
  - Automatic incrementalization made practical
  - For checks in Java programs
  - Data structure checks viable for development environment
- Made possible by
  - Selection of an interesting domain
  - Optimistic memoization
- Web: http://www.cs.berkeley.edu/~aj/cs/ditto/
