Published by Lionel Harvey. Modified over 9 years ago.
1
Calculating Stack Distances Efficiently
George Almasi, Calin Cascaval, David Padua
{galmasi,cascaval,padua}@cs.uiuc.edu
2
What this talk is, and is not, about
This talk is about:
– Algorithms to calculate stack distance histograms
– Speed/memory optimization of trace analysis to create stack distance histograms
This talk is not about:
– Why stack distance histograms are or are not useful
– Relative merits of inter-reference distance vs. stack distance
– Speed/memory optimization of applications
3
Two measures of locality
Inter-reference distance:
– the number of other references between two references to the same address in the trace
Stack distance:
– the number of distinct addresses referenced between two references to the same address
Example trace: a b c d b c d e a
For the two references to a: inter-reference distance = 7, stack distance = 4
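A minimal sketch that computes both measures straight from their definitions, to make the example trace concrete. The function name `distances_by_definition` is hypothetical, and the slice-based approach is deliberately naive (it re-scans the trace for every reference).

```python
def distances_by_definition(trace):
    """Return (index, inter_ref_distance, stack_distance) for each re-reference.

    Both values are computed directly from the definitions: the slice
    between the previous and the current reference to the same address
    is counted as-is (inter-reference) and de-duplicated (stack).
    """
    last_seen = {}   # address -> index of its most recent reference
    out = []
    for t, x in enumerate(trace):
        if x in last_seen:
            between = trace[last_seen[x] + 1:t]
            out.append((t, len(between), len(set(between))))
        last_seen[x] = t
    return out


trace = list("abcdbcdea")
# The last tuple is the second reference to 'a':
# 7 references in between, 4 of them distinct (b, c, d, e).
print(distances_by_definition(trace)[-1])  # (8, 7, 4)
```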
4
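Stack Distances As Cache Misses
With $s(\delta)$ the number of references at stack distance $\delta$, compute the number of cache hits and misses for a cache of size $C$ as follows:
$$\mathrm{hits}(C) = \sum_{\delta=1}^{C} s(\delta)$$
$$\mathrm{misses}(C) = \sum_{\delta=C+1}^{\infty} s(\delta)$$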
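The two sums can be sketched as follows. This assumes the histogram is stored as a dictionary from distance to reference count, with distances starting at 1 as in the slide's summation bounds, and with first-time references filed under an infinite distance so that they count as misses for every cache size. Both the representation and the name `hits_and_misses` are illustrative assumptions, not part of the talk.

```python
import math


def hits_and_misses(s, C):
    """Split a stack distance histogram into hits and misses for size C.

    s maps a distance (1, 2, ..., or math.inf for first-time references)
    to the number of references observed at that distance.
    """
    hits = sum(n for d, n in s.items() if 1 <= d <= C)
    misses = sum(n for d, n in s.items() if d > C)
    return hits, misses


# Hypothetical histogram, made up purely for illustration.
s = {1: 5, 2: 3, 4: 2, math.inf: 4}
print(hits_and_misses(s, 2))  # (8, 6)
print(hits_and_misses(s, 4))  # (10, 4)
```

Because one histogram answers the question for every `C`, a single pass over the trace is enough to evaluate all cache sizes.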
5
Inter-reference distance
Given that at time t, ref(t) = x, find t_0, the time of the last previous reference to x.
Inter-reference distance: the number of references between t_0 and t.
Efficient implementation: a (hash) table H(x) = t_0, the trace index of the last reference to x.
Memory usage: ~2x the original program
Cost: O(1) per reference
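A sketch of the hash-table approach, assuming the distance counts only the references strictly between t_0 and t (so it equals t - t_0 - 1), which matches the definition given earlier in the talk. The function name is hypothetical; `H` follows the slide.

```python
def inter_reference_distances(trace):
    """One dictionary lookup and one update per reference: O(1) each."""
    H = {}          # H[x] = trace index of the last reference to x
    result = []
    for t, x in enumerate(trace):
        t0 = H.get(x)
        # None marks a first-time reference, which has no distance.
        result.append(None if t0 is None else t - t0 - 1)
        H[x] = t
    return result


print(inter_reference_distances(list("abcdbcdea")))
# [None, None, None, None, 2, 2, 2, None, 7]
```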
6
Stack distance
[Figure: snapshots of the LRU stack as references are processed. A newly referenced element is pushed on top of the stack; a re-referenced element x is found at Depth(x) and moved to the top.]
7
Stack distance
Simulates an infinite cache with LRU replacement policy
Nice properties (inclusion!)
Naïve implementation: stack as linked list/array
– m = 250,000 average maximum stack depth
– list traversal/array updates; O(m) per trace element
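The naïve implementation can be sketched with a plain Python list as the LRU stack. This assumes the stack distance is the number of elements above the re-referenced address, which equals the number of distinct addresses referenced since its last use. The function name is hypothetical.

```python
def stack_distances_naive(trace):
    """LRU stack kept as a list; stack[-1] is the top.

    Each reference searches the list and removes one element,
    so the cost is O(m) per trace element for stack depth m.
    """
    stack = []
    result = []
    for x in trace:
        try:
            pos = stack.index(x)                  # linear search
        except ValueError:
            result.append(None)                   # first-time reference
        else:
            result.append(len(stack) - 1 - pos)   # elements above x
            stack.pop(pos)                        # linear removal
        stack.append(x)                           # x becomes the new top
    return result


print(stack_distances_naive(list("abcdbcdea")))
# [None, None, None, None, 2, 2, 2, None, 4]
```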
8
Stack distance
Given t_0, the stack distance is stackdist(t) = |Z|, where Z is the set of distinct addresses referenced between t_0 and t:
$$Z = \{\, \mathrm{ref}(\tau) : t_0 < \tau < t \,\}$$
9
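Insight: stack is contained in trace
[Figure: the trace a b b g e d f z f c e b c d a laid out along a time axis. Keeping only the most recent occurrence of each address gives the stack at time t, g z f e b c d a, with the stack top at the right.]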
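A small sketch of this observation: scanning the trace backwards and keeping the first occurrence seen of each address (that is, its most recent reference) reproduces the LRU stack. The function name is hypothetical.

```python
def stack_from_trace(trace):
    """Return the LRU stack (bottom first, top last) implied by a trace."""
    seen = set()
    top_first = []
    for x in reversed(trace):
        if x not in seen:        # most recent reference to x
            seen.add(x)
            top_first.append(x)
    return top_first[::-1]


print("".join(stack_from_trace("abbgedfzfcebcda")))  # gzfebcda
```

Every dropped trace position is one whose address was referenced again later, which is what the following slides call a hole.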
10
Holes
Index t_x in the trace is a hole if ref(t_x) has already been referenced again at a later time t_y, with t_x < t_y < t.
Using holes, we can say
– stackdist(t) = refdist(t) - #holes(t_0 to t)
How many holes are there between t_0 and t?
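A sketch that checks the identity with the simplest possible hole bookkeeping, a boolean flag per trace index. It assumes refdist(t) = t - t_0 - 1 and counts holes strictly between t_0 and t. The linear-time count is only for illustration; the names are hypothetical.

```python
def stack_distances_with_holes(trace):
    """stackdist(t) = refdist(t) - #holes(t0 to t), with a flag array."""
    last = {}                         # address -> index of last reference
    is_hole = [False] * len(trace)    # is_hole[i]: ref(i) was referenced again
    result = []
    for t, x in enumerate(trace):
        t0 = last.get(x)
        if t0 is None:
            result.append(None)       # first-time reference
        else:
            refdist = t - t0 - 1
            holes = sum(is_hole[t0 + 1:t])   # naive O(t - t0) count
            result.append(refdist - holes)
            is_hole[t0] = True        # the old reference to x is now a hole
        last[x] = t
    return result


print(stack_distances_with_holes(list("abcdbcdea")))
# [None, None, None, None, 2, 2, 2, None, 4]
```

For the final reference to a, refdist is 7 and indices 1, 2 and 3 (the first b, c and d) are holes, giving 7 - 3 = 4.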
11
An interval tree of holes
[Figure: a trace with the previous reference to a at t_0 and the current reference to a at t; the holes are stored in a tree whose nodes are intervals such as k:k, k+2:k+3 and k+4:k+5.]
Single tree operation: count_and_add(t_0)
– determines the number of holes between t_0 and t
– adds a new hole at t_0
Adding a hole can create a new interval, or fuse two existing ones.
12
Operations on the interval tree
Add to interval edge: count_and_add(p) with p = n+1 turns k:n into k:n+1
Create new interval: count_and_add(p) with p > n+1 keeps k:n and adds p:p
Join two intervals: count_and_add(n+1) fuses k:n and n+2:p into k:p
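The three cases can be sketched with a sorted list of disjoint intervals standing in for the tree. This is a deliberate simplification, not the RB/AVL structure from the talk: the count below is linear in the number of intervals, whereas a balanced tree with per-subtree totals would make it logarithmic. The class name `HoleIntervals` and its fields are hypothetical; `count_and_add` follows the slides.

```python
import bisect


class HoleIntervals:
    """Disjoint closed intervals [starts[i], ends[i]] of holes, kept sorted."""

    def __init__(self):
        self.starts = []
        self.ends = []

    def count_and_add(self, p):
        """Count the holes at positions > p, then record p as a hole.

        Assumes p is not already a hole (each trace index becomes a
        hole at most once).
        """
        i = bisect.bisect_right(self.starts, p)   # first interval after p
        count = sum(e - s + 1
                    for s, e in zip(self.starts[i:], self.ends[i:]))
        touches_left = i > 0 and self.ends[i - 1] == p - 1
        touches_right = i < len(self.starts) and self.starts[i] == p + 1
        if touches_left and touches_right:        # join two intervals
            self.ends[i - 1] = self.ends[i]
            del self.starts[i], self.ends[i]
        elif touches_left:                        # extend right edge
            self.ends[i - 1] = p
        elif touches_right:                       # extend left edge
            self.starts[i] = p
        else:                                     # create new interval p:p
            self.starts.insert(i, p)
            self.ends.insert(i, p)
        return count


def stack_distances(trace):
    last, holes, out = {}, HoleIntervals(), []
    for t, x in enumerate(trace):
        t0 = last.get(x)
        # Every hole lies before t, so "holes after t0" means "between t0 and t".
        out.append(None if t0 is None
                   else (t - t0 - 1) - holes.count_and_add(t0))
        last[x] = t
    return out


print(stack_distances(list("abcdbcdea")))
# [None, None, None, None, 2, 2, 2, None, 4]
```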
13
Pre-allocated hole trees
Basics:
– tree is pre-allocated
– binary, balanced
– each node contains a number: the number of holes in its right subtree
– memory used by a node depends on the node’s depth
A modified version of the B&K algorithm:
– holes instead of references
– binary instead of n-ary
– better memory usage
14
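Pre-allocated hole trees
[Figure: a pre-allocated binary tree over the trace a b b g e d f z f c e b c d a. Each node holds a counter n; the traversal is annotated with "count += n" and "n = n + 1".]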
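One possible reading of the figure's annotations, sketched as an implicit complete binary tree stored in a flat array: walking from the root to leaf p, a step to the left adds the node's counter to the result ("count += n") and a step to the right increments it ("n = n + 1"). Which annotation belongs to which direction is an assumption made here, as are the class name and the heap-style node numbering. The depth-dependent counter widths mentioned in the talk are not modeled; plain Python integers are used.

```python
class PreallocatedHoleTree:
    """Implicit balanced binary tree over trace indices 0 .. size-1.

    right[node] is the number of holes in the right subtree of node.
    Internal nodes are numbered heap-style: root 1, children 2k and 2k+1.
    """

    def __init__(self, trace_length):
        size = 1
        while size < trace_length:
            size *= 2                    # round up to a power of two
        self.size = size
        self.right = [0] * size          # allocated once, up front

    def count_and_add(self, p):
        """Count holes at positions > p, then record p as a hole."""
        count = 0
        node, lo, hi = 1, 0, self.size   # node covers indices [lo, hi)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if p < mid:                  # go left: right subtree is > p
                count += self.right[node]
                node, hi = 2 * node, mid
            else:                        # go right: p is a new hole there
                self.right[node] += 1
                node, lo = 2 * node + 1, mid
        return count


def stack_distances(trace):
    last, tree, out = {}, PreallocatedHoleTree(len(trace)), []
    for t, x in enumerate(trace):
        t0 = last.get(x)
        out.append(None if t0 is None
                   else (t - t0 - 1) - tree.count_and_add(t0))
        last[x] = t
    return out


print(stack_distances(list("abbgedfzfcebcda")))
# [None, None, 0, None, None, None, None, None, 1, None, 4, 6, 2, 5, 7]
```

Each call touches one node per tree level and follows no pointers, at the price of a tree sized by the trace length rather than the stack depth.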
15
Many Questions
Q: Why holes and not stack elements?
A: Holes need 1/2 the maintenance of stack elements.
Q: Will the interval tree grow to ?
A: No. Intervals fuse together spontaneously.
Q: How big will the tree be?
A: # of intervals = O(stack depth). The depth of a tree of stack elements would be the same size.
Q: Will the tree be unbalanced?
A: Yes, because it tends to grow on one side.
16
More questions
Q: What kind of interval tree?
A: RB (red-black) and AVL
Q: Which is better?
A: AVL is better.
Q: Why?
A:
– shorter average tree height: h+1 vs. 2h
– not all operations change the tree structure
17
Comparisons
Interval trees:
– exec time O(log(m))
– memory usage O(m)
– AVL better than RB
– pointer chasing, bad locality
Pre-allocated trees:
– exec time O(log(n))
– memory usage O(n): hits practical limit
– holes are better: reduced maintenance
– no pointer chasing, good locality
18
Results: hole interval trees
19
Results: preallocated trees
20
Conclusions
Stack distances with holes:
– using RB/AVL interval trees
– using pre-allocated trees
Using holes reduces linear overhead by 20-40% for both kinds of algorithms.