Fundamental Data Structures and Algorithms Jeffrey Stylos February 8th, 2005 Splay Trees Adapted from slides by Peter Lee October 4, 2001
Today Review HW2 Review of binary trees Splay trees
Announcements HW2 is out! Due next Thursday at midnight! Get started now! Quiz 2 next week, during recitation
Objects in calendar are closer than they appear. — Jim Duncan
And now, a few words about HW2 [Homework 2 write-up]
Let’s talk about BlackBoard What you always wanted to know about the discussion board (but were afraid to ask)
Last Time…
Binary trees A binary tree is either empty (we'll write nil for clarity), or looks like (x,L,R) where x is an object, and L, R are binary subtrees
Binary search trees (BSTs) A binary tree T is a binary search tree iff flat(T) is an ordered sequence. Equivalently, in (x,L,R) all the nodes in L are less than x, and all the nodes in R are larger than x.
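As a minimal sketch of the definition above, assuming flat(T) means the left-to-right (in-order) listing of the keys, nil is written as None, and a node is the tuple (x, L, R). The helper names `flat`, `is_bst`, and `leaf` are illustrative only.

```python
# Assumption: nil is None and a nonempty tree is the tuple (x, L, R).

def flat(t):
    """Return the keys of t in left-to-right (in-order) sequence."""
    if t is None:
        return []
    x, left, right = t
    return flat(left) + [x] + flat(right)


def is_bst(t):
    """A tree is a BST iff its flattened sequence is strictly increasing."""
    keys = flat(t)
    return all(a < b for a, b in zip(keys, keys[1:]))


def leaf(x):
    """Hypothetical helper: a node with two empty subtrees."""
    return (x, None, None)


good = (5, (3, leaf(2), leaf(4)), (7, leaf(6), leaf(9)))
bad = (5, (3, leaf(2), leaf(6)), (7, None, leaf(9)))  # 6 sits left of 5

print(flat(good), is_bst(good))
print(flat(bad), is_bst(bad))
```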
Example: flat(T) = 2,3,4,5,6,7,9
Binary search How does one search in a BST? Inductively:
search(x,nil) = false
search(x,(x,L,R)) = true
search(x,(a,L,R)) = search(x,L) if x<a
search(x,(a,L,R)) = search(x,R) if x>a
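The inductive definition translates almost line for line into code. This is a sketch that assumes the tuple representation (x, L, R) with None standing in for nil; it uses a loop instead of recursion, which visits the same nodes.

```python
def search(x, t):
    """Return True iff key x occurs in the BST t (None is the empty tree)."""
    while t is not None:
        a, left, right = t
        if x == a:
            return True          # search(x,(x,L,R)) = true
        # x < a: continue in L; x > a: continue in R
        t = left if x < a else right
    return False                 # search(x,nil) = false


# Example tree whose flattened sequence is 2,3,4,5,6,7,9.
tree = (5,
        (3, (2, None, None), (4, None, None)),
        (7, (6, None, None), (9, None, None)))

print(search(6, tree))
print(search(8, tree))
```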
Log-time vs linear time log^k(N) = O(N) for any constant k. I.e., logarithms grow very slowly.
A bad tree: flat(T) = 2,3,4,5,6,7,9
Balanced trees Intuitively, one way to ensure good behavior is to make “balanced” search trees. If always balanced, then operations such as search and insert will require only O(log(N)) comparisons.
Example: flat(T) = 2,3,4,5,6,7,9
Forcing good behavior It is clear (?) that for any n inputs, there always is a BST containing these elements of logarithmic depth. But if we just insert the standard way, we may build a very unbalanced, deep tree. Can we somehow force the tree to remain shallow? At low cost?
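To make the contrast concrete, here is a small sketch (tuple representation, None for nil) that inserts the same seven keys "the standard way" in sorted order and compares the resulting depth with a tree built by always choosing the middle key as the root. The functions `insert`, `depth`, and `balanced` are illustrative names, not part of the lecture.

```python
def insert(x, t):
    """Standard BST insert: walk down and attach x as a new leaf."""
    if t is None:
        return (x, None, None)
    a, left, right = t
    if x < a:
        return (a, insert(x, left), right)
    if x > a:
        return (a, left, insert(x, right))
    return t  # x already present


def depth(t):
    """Number of nodes on the longest root-to-leaf path."""
    if t is None:
        return 0
    return 1 + max(depth(t[1]), depth(t[2]))


def balanced(keys):
    """Build a shallow BST from sorted keys by picking the middle as root."""
    if not keys:
        return None
    mid = len(keys) // 2
    return (keys[mid], balanced(keys[:mid]), balanced(keys[mid + 1:]))


keys = [2, 3, 4, 5, 6, 7, 9]

chain = None
for k in keys:               # sorted insertion order: every key goes right
    chain = insert(k, chain)

print(depth(chain))          # linear depth
print(depth(balanced(keys))) # logarithmic depth
```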
Today Review HW2 Review of binary trees Splay trees
Splay Trees
Binary search trees Simple binary search trees can have bad behavior for some insertion sequences. Average case O(log N), worst case O(N). AVL trees maintain a balance invariant to prevent this bad behavior. Accomplished via rotations during insert. Splay trees achieve amortized running time of O(log N). Accomplished via rotations during find.
Never do today… If you are a reasonably lazy person… Consider the problem of serving a six-course dinner. You could spend 10 minutes washing the dishes right after each course. Or you could let the dishes sit and spend an hour tomorrow (or the next day) washing them. Either way, you will spend one hour doing dishes.
A better example If you are a reasonably smart person… Consider the problem of getting groceries. You could spend 30 minutes going to the Giant Eagle every time you run out of something. Or you could wait until you have run out of 10 items, and then make a single 2-hour trip to the Giant Eagle. In this case, you would save 3 hours!
Amortized running time The analysis that allows us to conclude that we spend the same (or less) amount of time over a sequence of operations is called amortized analysis. If we say that the amortized running time of a sequence of operations is O(f(N)): Some operations might take more than O(f(N)), others less. But the average over the entire sequence is O(f(N)).
Splay trees Splay trees provide a guarantee that any sequence of M operations (starting from an empty tree) will require O(M log N) time. Hence, each operation has amortized cost of O(log N). It is possible that a single operation requires O(N) time. But there are no bad sequences of operations on a splay tree.
A basic observation Let’s suppose that a node requires O(N) time to find. If and when this happens, we must move it somewhere closer to the root so that if we access it again, it will definitely require less than O(N) time. If we don’t do this, then O(log N) amortized behavior is not possible.
Danny Sleator The inventor of splay trees. Winner of the Kanellakis Award. See his splay tree demo at /~sleator (And definitely a procrastinator.)
The basic idea Every time a node is accessed, move it to the root (“splay” it) The move can be accomplished by performing “AVL” rotations. Practical benefits: In practice, nodes are often accessed multiple times. AVL rotations make trees more balanced.
Using rotations Suppose we perform a find operation, which accesses node n. Splay n, moving it up to the root. Doing this requires O(d) time, where d is the depth of n; hence O(N) in the worst case. But a subsequent access of n will require only O(1) time, and the tree will be better balanced, too.
Splaying, case 1 There are four cases to consider. Case 1: The node is already the root. Nothing to do.
Splaying, case 2 Case 2: Accessed node’s parent is the root. Perform a single rotation. [Figure: tree before and after the single rotation, with labels X, Y, Z]
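A single rotation can be sketched on the tuple representation (x, L, R), with None for nil. The names `p` (the root), `x` (the accessed child), and the subtree names `a`, `b`, `c` are chosen here for illustration and do not correspond to the labels in the slide's figure.

```python
def rotate_right(t):
    """Lift the left child to the root: (p, (x, A, B), C) -> (x, A, (p, B, C))."""
    p, (x, a, b), c = t
    return (x, a, (p, b, c))


def rotate_left(t):
    """Mirror image: (p, A, (x, B, C)) -> (x, (p, A, B), C)."""
    p, a, (x, b, c) = t
    return (x, (p, a, b), c)


def flat(t):
    """In-order key sequence, used to check that rotation keeps the order."""
    return [] if t is None else flat(t[1]) + [t[0]] + flat(t[2])


t = (5, (3, (2, None, None), (4, None, None)), (7, None, None))
r = rotate_right(t)          # accessed node 3 becomes the root

print(r)
print(flat(t) == flat(r))    # a rotation never changes the flattened order
```

The two rotations undo each other, so `rotate_left(rotate_right(t))` gives back `t`.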
Splaying, cases 3 and 4 The next two cases cover the situation in which the accessed node has a grandparent.
Splaying, case 3 Case 3: Zig-zag (left). Perform an AVL double rotation. [Figure: tree before and after the double rotation, with labels a, b, X, Y1, Y2, Z]
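The zig-zag (left) step can be sketched as one pattern match on tuples (x, L, R). Here `g`, `p`, and `x` stand for the grandparent, the parent, and the accessed node; these names and the subtree names `a` to `d` are illustrative assumptions, not the figure's labels. The sketch also checks that the step equals the usual double rotation: rotate left at the parent, then rotate right at the grandparent.

```python
def rotate_right(t):
    p, (x, a, b), c = t
    return (x, a, (p, b, c))


def rotate_left(t):
    p, a, (x, b, c) = t
    return (x, (p, a, b), c)


def zig_zag_left(t):
    """x is the right child of p, and p is the left child of g."""
    g, (p, a, (x, b, c)), d = t
    return (x, (p, a, b), (g, c, d))


def leaf(k):
    return (k, None, None)


# Accessed node is 4: right child of 2, which is the left child of 6.
t = (6, (2, leaf(1), (4, leaf(3), leaf(5))), leaf(7))

direct = zig_zag_left(t)
g, left, d = t
via_rotations = rotate_right((g, rotate_left(left), d))

print(direct)
print(direct == via_rotations)
```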
Splaying, case 4 Case 4: Zig-zig (left). Special rotation. [Figure: tree before and after the zig-zig rotation, with labels a, b, W, X, Y, Z]
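What makes the zig-zig step "special" can be seen in a short sketch, assuming the standard splay-tree formulation in which the parent is rotated over the grandparent first and the accessed node is rotated second. The names `g`, `p`, `x` (grandparent, parent, accessed node) are illustrative. Rotating the accessed node upward twice in the naive order produces a different tree.

```python
def rotate_right(t):
    p, (x, a, b), c = t
    return (x, a, (p, b, c))


def zig_zig_left(t):
    """x is the left child of p, and p is the left child of g."""
    g, (p, (x, a, b), c), d = t
    return (x, a, (p, b, (g, c, d)))


def leaf(k):
    return (k, None, None)


# Accessed node is 2: left child of 4, which is the left child of 6.
t = (6, (4, (2, leaf(1), leaf(3)), leaf(5)), leaf(7))

zig_zig = zig_zig_left(t)

# Same result: rotate at the grandparent first, then at the new root.
top_first = rotate_right(rotate_right(t))

# Naive alternative: rotate x over its parent, then over the grandparent.
g, left, d = t
naive = rotate_right((g, rotate_right(left), d))

print(zig_zig)
print(zig_zig == top_first, zig_zig == naive)
```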
Symmetry And there are symmetric cases for zig-zag and zig-zig to the right.
Splay tree example Insert {0, 1, 2, 3, 4, 5, 6}, then find [Figure: zig-zig (right) step]
Splay tree example, cont’d [Figure: zig-zag (right) step]
Splaying example, cont’d [Figure: zig-zag (right) step]
Result of splaying Access of 6 required N nodes visited and modified. But now accessing 5 requires only N/2 nodes visited and modified. That access will in turn bring all nodes to within about N/4 of the root. (Try it!) And all nodes are shallower. Every access will tend to improve the tree for future operations.
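The effect can be tried out with a small bottom-up splay. This is a sketch under stated assumptions, not the lecture's implementation: nodes are mutable objects, the search path is kept on a list instead of using parent pointers, and `Node`, `rotate_up`, `replace_child`, `splay`, and `height` are hypothetical names. The demo builds a worst-case chain of seven keys (its own example, not the tree in the slides) and splays the deepest key.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_up(p, x):
    """Single rotation: lift x above its parent p."""
    if p.left is x:
        p.left, x.right = x.right, p
    else:
        p.right, x.left = x.left, p


def replace_child(parent, old, new):
    if parent.left is old:
        parent.left = new
    else:
        parent.right = new


def splay(root, key):
    """Move key (or the last node on its search path) to the root."""
    if root is None:
        return None
    path, node = [], root
    while node is not None:              # record the access path
        path.append(node)
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    x = path.pop()
    while path:
        p = path.pop()
        if not path:                     # case 2: parent is the root
            rotate_up(p, x)
            break
        g = path.pop()
        if (g.left is p) == (p.left is x):
            rotate_up(g, p)              # zig-zig: parent first, then x
            rotate_up(p, x)
        else:
            rotate_up(p, x)              # zig-zag: x moves up twice
            replace_child(g, p, x)
            rotate_up(g, x)
        if path:                         # hook x under the great-grandparent
            replace_child(path[-1], g, x)
    return x


def height(t):
    return 0 if t is None else 1 + max(height(t.left), height(t.right))


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


# A chain 6 -> 5 -> ... -> 0 of left children: the deepest key is 0.
chain = None
for k in range(7):
    chain = Node(k, left=chain)

before = height(chain)
root = splay(chain, 0)
print(before, height(root), root.key, inorder(root))
```

In this run the accessed key ends up at the root and the tree gets shallower, while the in-order sequence stays the same.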
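Using splay operations …to implement “find(x)”: Splay x. Look at the root and compare it to x; x is in the tree iff the root equals x. …to implement “insert(x)”: Splay x. Look at the root. If it equals x, do nothing. Else, make x the new root: if the old root is smaller than x, the old root becomes x's left child and the old root's right subtree becomes x's right subtree (symmetrically if the old root is larger).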
Using splay operations (cont.) …to implement “delete(x)”: Splay x. Look at the root (compare to x). If x doesn’t exist, do nothing. Otherwise delete the root. If either child of the root is null, make the other child the new root. Otherwise splay the largest key of the left-branch tree (we could also splay the smallest of the right). We can do this by splaying the now non-existing element x in the left-branch tree, because of the way splay works: the search for x ends at the largest key. Everything in the right-branch tree is larger than everything in the left-branch tree. The new left-branch tree will have an empty right branch (since we splayed the largest value), so set the left-branch tree's right branch to be the right-branch tree.
Using splay operations (cont.) …to implement “delete(x)”: Splay x. Look at root (compare to x). If it doesn’t exist, do nothing. Delete root. Splay largest. [Figure: delete example tree, with node labels a, b, c, d, e, f, x]
Nice properties of splay trees If you have a small working set, the amortized cost of an operation is logarithmic in the size of the working set rather than in N. Why? If you splay the elements in order, the total cost is O(N). Why?
Analysis of splay trees The analysis of the running time of splay trees is quite difficult. Any single find or insert might take O(N) time. But any sequence of M operations, starting from an empty tree, will take only O(M log N) time. In practice, splay trees work extremely well.
Splaying summary Splaying has the effect of moving the accessed node to the root. It also reduces the depth of almost all of the nodes along the access path.
All done for today!