Calculating Stack Distances Efficiently George Almasi,Calin Cascaval,David Padua


1 Calculating Stack Distances Efficiently George Almasi,Calin Cascaval,David Padua {galmasi,cascaval,padua}@cs.uiuc.edu

2 What this talk is, and is not, about
This talk is about:
– Algorithms to calculate stack distance histograms
– Speed/memory optimization of the trace analysis that creates the stack distance histogram
This talk is not about:
– Why stack distance histograms are or are not useful
– Relative merits of inter-reference distance vs. stack distance
– Speed/memory optimization of applications

3 Two measures of locality
Inter-reference distance:
– The number of other references between two references to the same address in the trace
Stack distance:
– The number of distinct addresses referenced between two references to the same address
Example trace: a b c d b c d e a. For the two references to a, inter-reference distance = 7 and stack distance = 4.

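4 Stack Distances As Cache Misses
Compute the number of cache hits and misses for a cache of size $C$ as follows, where $s(\delta)$ is the number of references with stack distance $\delta$:
$$\mathrm{hits}(C) = \sum_{\delta=1}^{C} s(\delta), \qquad \mathrm{misses}(C) = \sum_{\delta=C+1}^{\infty} s(\delta)$$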
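
The two sums above can be sketched in a few lines. This is only an illustration: the histogram layout (a dict from stack distance to reference count, with `math.inf` standing for first-time references) and the function name `hits_and_misses` are assumptions, not something the slides specify.

```python
import math

def hits_and_misses(hist, cache_size):
    """Split a stack distance histogram into hits and misses for one cache size.

    hist maps a stack distance (1-based, as in the sums above) to the number
    of references with that distance. math.inf is an assumed convention for
    first-time references, which miss in every finite cache.
    """
    hits = sum(n for d, n in hist.items() if 1 <= d <= cache_size)
    misses = sum(n for d, n in hist.items() if d > cache_size)
    return hits, misses

# Hypothetical histogram, made up for the example.
hist = {1: 5, 2: 3, 4: 2, math.inf: 4}
print(hits_and_misses(hist, 2))  # (8, 6)
```

One histogram answers the question for every cache size, which is why a single pass over the trace is enough.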

5 Inter-reference distance
Given that at time t, ref(t) = x, find t_0, the time of the last previous reference to x. The inter-reference distance is the number of references between t_0 and t.
Efficient implementation: a (hash) table H(x) = t_0, the trace index of the last reference to x.
– Memory usage: ~2x the original program
– Cost: O(1) per reference
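
A minimal sketch of the hash-table approach follows. The formula `t - t0 - 1` is an assumption derived from the slide 3 definition (the number of other references between the two references); the function name and the use of `None` for first references are illustrative.

```python
def inter_reference_distances(trace):
    """Return the inter-reference distance of every reference in the trace.

    last_seen plays the role of the table H(x) = t0. First references have
    no previous occurrence and are reported as None (an assumed convention).
    """
    last_seen = {}
    distances = []
    for t, x in enumerate(trace):
        t0 = last_seen.get(x)
        # Assumed formula: count the references strictly between t0 and t.
        distances.append(None if t0 is None else t - t0 - 1)
        last_seen[x] = t
    return distances

print(inter_reference_distances("abcdbcdea"))
# [None, None, None, None, 2, 2, 2, None, 7]
```

The last value, 7, matches the example trace on slide 3.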

6 Stack distance
[Figure: the LRU stack before and after a reference to x. The stack distance is Depth(x), the position of x in the stack; after the reference, x moves to the top of the stack.]

7 Stack distance
Simulates an infinite cache with LRU replacement policy
Nice properties (inclusion!)
Naïve implementation: stack as linked list/array
– m = 250,000 average maximum stack depth
– List traversal/array updates: O(m) per trace element
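
The naïve implementation can be sketched with a Python list as the LRU stack. The distance convention here follows slide 3 (the number of distinct addresses above x in the stack); the function name and the `None` marker for first references are assumptions.

```python
def stack_distances_naive(trace):
    """Simulate an unbounded LRU stack; O(m) work per reference.

    The most recently used address sits at the end of the list. The distance
    reported for a reference to x is the number of addresses above x in the
    stack, i.e. the distinct addresses touched since x was last used.
    """
    stack = []
    distances = []
    for x in trace:
        if x in stack:
            pos = stack.index(x)                # linear search: the O(m) cost
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)                      # remove x from its old depth
        else:
            distances.append(None)              # first reference to x
        stack.append(x)                         # x becomes the stack top
    return distances

print(stack_distances_naive("abcdbcdea"))
# [None, None, None, None, 2, 2, 2, None, 4]
```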

8 Stack distance
Given t_0, the stack distance is stackdist(t) = |Z|, where Z is the set of distinct addresses referenced between t_0 and t:
$$Z = \{\, \mathrm{ref}(\tau) : t_0 < \tau < t \,\}$$

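9 Insight: stack is contained in trace
[Figure: the trace a b b g e d f z f c e b c d a along a time axis, and below it the stack at time t, g z f e b c d a, with the stack top marked. The stack entries are the most recent occurrences of each address in the trace.]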

10 Holes
Index t_x in the trace is a hole if ref(t_x) has already been referenced again at a later time t_y < t.
Using holes, we can say:
– stackdist(t) = refdist(t) - #holes(t_0 to t)
How many holes are there between t_0 and t?
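
The identity above can be checked with a small sketch that keeps the hole indices in a sorted Python list. This is a stand-in for illustration, not the tree structures discussed on the following slides: insertion into a list is linear. The names and the assumed formula refdist(t) = t - t_0 - 1 are not from the slides.

```python
import bisect

def stack_distances_with_holes(trace):
    """stackdist(t) = refdist(t) - #holes strictly between t0 and t."""
    last_seen = {}   # address -> index of its most recent reference
    holes = []       # sorted trace indices that have been referenced again
    distances = []
    for t, x in enumerate(trace):
        t0 = last_seen.get(x)
        if t0 is None:
            distances.append(None)          # first reference to x
        else:
            refdist = t - t0 - 1            # assumed inter-reference distance
            # Every hole lies before t, so holes after t0 are holes in (t0, t).
            n_holes = len(holes) - bisect.bisect_right(holes, t0)
            distances.append(refdist - n_holes)
            bisect.insort(holes, t0)        # the old occurrence becomes a hole
        last_seen[x] = t
    return distances

print(stack_distances_with_holes("abcdbcdea"))
# [None, None, None, None, 2, 2, 2, None, 4]
```

For the final reference to a, refdist is 7 and indices 1, 2 and 3 are holes, giving 7 - 3 = 4.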

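11 An interval tree of holes
[Figure: a trace "o o o a o o o ... a" with t_0 marking the previous reference to a and t marking the current reference to a; below it, an interval tree with nodes k:k, k+2:k+3 and k+4:k+5.]
Single tree operation: count_and_add(t_0)
– Determines the number of holes between t_0 and t
– Adds a new hole at t_0
Adding a hole can create a new interval, or fuse two existing ones.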

12 Operations on the interval tree
– Add to interval edge: count_and_add(p) with p = n+1 turns k:n into k:n+1
– Create new interval: count_and_add(p) with p > n+1 leaves k:n and adds p:p
– Join two intervals: count_and_add(n+1) fuses k:n and n+2:p into k:p
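
The three cases can be sketched on a flat sorted list of disjoint intervals. This deliberately swaps the balanced RB/AVL tree for two parallel Python lists, so counting is linear in the number of intervals rather than logarithmic; only the extend/create/join logic is meant to carry over. The class name `HoleIntervals` is hypothetical.

```python
import bisect

class HoleIntervals:
    """Disjoint, non-adjacent hole intervals kept in sorted order."""

    def __init__(self):
        self.starts = []  # sorted interval start indices
        self.ends = []    # ends[i] is the inclusive end of interval i

    def count_and_add(self, p):
        """Count holes with index > p, then record p as a new hole.

        Assumes p is not already a hole, which holds when p is the index of
        the last reference to an address.
        """
        i = bisect.bisect_right(self.starts, p)   # intervals i.. lie after p
        count = sum(e - s + 1 for s, e in zip(self.starts[i:], self.ends[i:]))
        touches_left = i > 0 and self.ends[i - 1] == p - 1
        touches_right = i < len(self.starts) and self.starts[i] == p + 1
        if touches_left and touches_right:        # join two intervals
            self.ends[i - 1] = self.ends[i]
            del self.starts[i], self.ends[i]
        elif touches_left:                        # add to right edge
            self.ends[i - 1] = p
        elif touches_right:                       # add to left edge
            self.starts[i] = p
        else:                                     # create new interval p:p
            self.starts.insert(i, p)
            self.ends.insert(i, p)
        return count

h = HoleIntervals()
print([h.count_and_add(p) for p in (1, 3, 2, 0)])  # [0, 0, 1, 3]
print(list(zip(h.starts, h.ends)))                 # [(0, 3)]
```

In the usage above, adding 2 joins 1:1 and 3:3 into 1:3, and adding 0 extends that interval to 0:3.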

13 Pre-allocated hole trees
Basics:
– Tree is pre-allocated
– Binary, balanced
– Each node contains a number: the number of holes in its right subtree
– Memory used by a node depends on the node's depth
A modified version of the B&K algorithm:
– Holes instead of references
– Binary instead of n-ary
– Better memory usage

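14 Pre-allocated hole trees
[Figure: the trace a b b g e d f z f c e b c d a with a pre-allocated binary tree of per-node hole counts above it. Traversal annotations: count += n; n = n+1.]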
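
A rough sketch of such a tree, under several assumptions: the tree is stored implicitly in one heap-indexed array over trace positions, every counter is a plain Python integer (so the depth-dependent node size from slide 13 is not modelled), and the class name `HoleTree` is hypothetical. The walk from root to leaf applies the two annotations from the figure: going left adds the node's counter to the result, going right increments it.

```python
class HoleTree:
    """Pre-allocated complete binary tree over trace indices 0..capacity-1.

    right_count[node] is the number of holes in the right subtree of node.
    Internal nodes use heap indexing: root is 1, children are 2n and 2n+1.
    """

    def __init__(self, capacity):
        size = 1
        while size < capacity:
            size *= 2                       # round up to a power of two
        self.size = size
        self.right_count = [0] * size       # slots 1..size-1 are used

    def count_and_add(self, p):
        """Return the number of holes with index > p, then add a hole at p."""
        node, lo, hi = 1, 0, self.size
        count = 0
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if p < mid:
                # Everything in the right subtree lies after p.
                count += self.right_count[node]      # "count += n"
                node, hi = 2 * node, mid
            else:
                # The new hole lands in the right subtree.
                self.right_count[node] += 1          # "n = n+1"
                node, lo = 2 * node + 1, mid
        return count

tree = HoleTree(9)  # the 9-reference trace a b c d b c d e a
print([tree.count_and_add(p) for p in (1, 2, 3, 0)])  # [0, 0, 0, 3]
```

There are no pointers and no rebalancing, at the cost of memory proportional to the trace length.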

15 Many Questions
Q: Why holes and not stack elements?
A: Holes need 1/2 the maintenance of stack elements.
Q: Will the interval tree grow to ∞?
A: No. Intervals fuse together spontaneously.
Q: How big will the tree be?
A: Number of intervals = O(stack depth). A tree of stack elements would be the same size.
Q: Will the tree be unbalanced?
A: Yes, because it tends to grow on one side.

16 More questions
Q: What kind of interval tree?
A: RB and AVL.
Q: Which is better?
A: AVL is better.
Q: Why?
A:
– Shorter average tree height: h+1 vs. 2h
– Not all operations change the tree structure

17 Comparisons
Interval trees:
– Exec time O(log(m)), memory usage O(m)
– AVL better than RB
– Pointer chasing, bad locality
Pre-allocated trees:
– Exec time O(log(n)), memory usage O(n): hits practical limit
– Holes are better: reduced maintenance
– No pointer chasing, good locality

18 Results: hole interval trees

19 Results: preallocated trees

20 Conclusions
Stack distances with holes:
– Using RB/AVL interval trees
– Using pre-allocated trees
Using holes reduces linear overhead by 20-40% for both kinds of algorithms.

