CSE 326: Data Structures Lecture #18 Fistful of Data Structures


1 CSE 326: Data Structures Lecture #18 Fistful of Data Structures
Steve Wolfman Winter Quarter 2000 I’m going to blow through quite a few “advanced” data structures today. What should you take away from these? You should know what ADT these implement. You should have a general idea of why you might use one. Given a description of how one works, you should be able to reason intelligently about them.

2 Today's Outline
- What Steve Didn't Get To On Monday
- Warm-up: augmenting leftist heaps
- Binomial Queues
- Treaps
- Randomized Skip Lists
- What Steve Won't Get To (Ever?)
We'll warm up by playing with a data structure we're already familiar with; then, we'll jump into a bunch of other data structures. At the end, I'll mention some data structures that you might never see; then again, they might be useful and important (or at least interesting) to you someday.

3 Thinking about DecreaseKey in Leftist Heaps
Why not just percolate up? decreaseKey(15, 3)
[slide figure: a leftist heap before and after decreasing the node 15 to 3 by percolating up]
Percolate up works, so why don't we do it? Because the node could need to go all the way to the root, and it might be O(n) away from the root! FYI: from here on in, I'll just write 15, but remember that I actually need a pointer to the node.

4 DecreaseKey in Leftist Heaps
[slide figure: the subtree rooted at the decreased node (now 3) cut away from the rest of the heap]
So, instead we'll just cut off the tree where we change it. Now just merge the two? Wrong! One of these is _not_ a leftist tree anymore!

5 Fixing DecreaseKey in Leftist Heaps
[slide figure: the remaining tree may not be leftist after the cut; swapping subtrees back up the tree makes it leftist again]
This may not be leftist, so fix it! We'll swap subtrees back up the tree until we have a leftist heap again. This works, right? Now, merge!

6 DecreaseKey runtime
How many nodes could possibly have the wrong Null Path Length up the line? How long does this take? Well, the merge takes O(log n), and the cutting takes O(1). How about fixing the tree? The question, like in AVL trees, is really: how many NPLs can we change? Could we change O(n) of them? Well, NPL is the shortest distance to a NULL. What's the largest NPL in the tree? O(log n) (otherwise there would be a right branch that was too long). So, we can only reduce NPLs, and k levels up, we can only reduce the NPL to k. So, we only need to fix log n parent nodes (actually log n + 1). Total runtime: O(log n).

7 Delete in Leftist Heaps
decreaseKey(15, -∞), then deleteMin()
Now, delete is easy! Decrease the key to negative infinity. Then, delete it with deleteMin. How long does that take? Two O(log n) operations, so O(log n).

8 Binomial Trees
A binomial tree of rank 0 is a one-node tree. A binomial tree of rank k is a binomial tree of rank k-1 with another binomial tree of rank k-1 hanging from its root.
[slide figure: the rank 0 and rank 1 binomial trees]
Momentary digression.

9 First Five Binomial Trees
[slide figure: binomial trees of ranks 0 through 4]
How many nodes does a binomial tree of rank k have? Let's talk about some properties here. First, and most important, how big is the binomial tree of rank k? The sizes go 1 -> 2 -> 4 -> 8 -> 16: a binomial tree of rank k has 2^k nodes. Next, what are the children of the root of a binomial tree of rank k? A binomial tree of each previous rank! Finally, why "binomial" tree? In a binomial tree of rank k, at level i there are k choose i nodes. That's why they're called binomial, but that's not really important.
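Both claims are easy to check mechanically from the recursive definition. A small sketch (my own representation, not from the lecture: a tree is just the list of its children's subtrees):

```python
def binomial_tree(k):
    """Build a binomial tree of rank k: a rank k-1 tree with another
    rank k-1 tree hanging from its root. Rank 0 is a lone node."""
    if k == 0:
        return []
    t = binomial_tree(k - 1)
    t2 = binomial_tree(k - 1)
    return t + [t2]   # hang the second rank k-1 tree from the first's root

def size(t):
    """Count nodes: the root plus everything below it."""
    return 1 + sum(size(c) for c in t)

def level_counts(t, depth=0, counts=None):
    """How many nodes sit at each level below the root?"""
    counts = counts if counts is not None else {}
    counts[depth] = counts.get(depth, 0) + 1
    for c in t:
        level_counts(c, depth + 1, counts)
    return counts
```

Running `size(binomial_tree(k))` gives 2^k, and `level_counts` recovers the binomial coefficients k choose i at level i, which is where the name comes from.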

10 Binomial Queue Heap Data Structure
Composed of a forest of binomial trees, no two of the same rank. Heap order is enforced within each tree. The ranks present can be computed from the binary representation of the queue's size: size = 10 = 1010 in binary, so trees of rank 1 and rank 3 are present.
[slide figure: a binomial queue of 10 nodes, made of one rank 1 tree and one rank 3 tree]
Here's our first new data structure. This one is a priority queue data structure. Notice that one binomial queue may be made of many heap-ordered trees. Also, notice that those trees might have a large branching factor.

11 Insertion in Binomial Queues
[slide figure: inserting the single node 15 into a queue with trees of ranks 1 and 2]
OK, how do we do operations on these? Let's insert. One node is a binomial tree. So, we'll just put in a new binomial tree. If there's no rank 0 tree already, the new node just goes in as the new rank 0 tree!

12 Insertion in Binomial Queues
It's like addition of binary numbers! 1+1 = 0 plus a carry tree; 1+1+1 = 1 plus a carry tree.
[slide figure: two rank 0 trees merging into a rank 1 carry tree, which merges on up the ranks]
But, what if there is a rank 0 tree? Well, we can merge the two rank 0 trees to make a rank 1 tree. If there were no rank 1 tree, we'd just stop. If there is one, then we merge with that to make a rank 2 tree, and so forth. It's just like adding two binary numbers: we have carry trees! How long does this take? Constant time per merge, and at most O(log n) trees, so at most O(log n) merges. BUT, just as Zasha proved that adding 1 to a binary number takes amortized constant time, this, too, is amortized constant!

13 Merge in Binomial Queues
[slide figure: merging two binomial queues, rank by rank, with carries]
Speaking of merging, how about merging two full binomial queues? It's just like insert. If there's an empty spot, we just put the tree in (like rank 0). Otherwise, we merge two trees to get a carry tree. Then, if necessary, we merge that with the next level. Runtime? O(log n), for the same reasons.

14 DeleteMin in Binomial Queues
[slide figure: the minimum root's children form one binomial queue, the remaining trees form another; just merge the two]
DeleteMin, surprisingly, is just a merge. We find the smallest root (O(log n), since there are O(log n) roots). Then, we pull that tree out, snip off its root, and make a new binomial queue from its children. Finally, we merge the two back together. Runtime? O(log n).
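The three operations above fit in a small sketch (all names and the rank-to-tree dictionary representation are my own choices, assuming min-heap order within each tree):

```python
class BTree:
    def __init__(self, key):
        self.key = key
        self.children = []   # one child of each rank 0..rank-1
        self.rank = 0

def link(a, b):
    """Combine two trees of equal rank into one of rank+1: the 'carry'."""
    if b.key < a.key:
        a, b = b, a          # smaller key becomes the root (heap order)
    a.children.append(b)
    a.rank += 1
    return a

class BinomialQueue:
    def __init__(self):
        self.trees = {}      # rank -> tree, at most one tree per rank

    def _add(self, t):
        # exactly like binary addition: keep carrying while the slot is taken
        while t.rank in self.trees:
            t = link(t, self.trees.pop(t.rank))
        self.trees[t.rank] = t

    def insert(self, key):
        self._add(BTree(key))           # a single node is a rank 0 tree

    def merge(self, other):
        for t in list(other.trees.values()):
            self._add(t)
        other.trees = {}

    def delete_min(self):
        # the global min is one of the O(log n) roots
        r = min(self.trees, key=lambda rk: self.trees[rk].key)
        t = self.trees.pop(r)
        for c in t.children:            # the children are a binomial queue too
            self._add(c)
        return t.key
```

A nice check: after inserting 10 keys, the ranks present are exactly {1, 3}, matching 10 = 1010 in binary.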

15 Binomial Queue Summary
Implements the priority queue ADT:
- Insert in amortized O(1) time
- FindMin (with some tricks) in O(1) time
- DeleteMin in O(log n) time
- Merge in O(log n) time
Memory use: O(1) per node. Complexity? About the cost of skew heaps. So, this is good if you need to do a bunch of inserts quickly.

16 Treap Dictionary Data Structure
Treaps have the binary search tree properties: the binary tree property and the search tree property. Treaps also have the heap-order property, on randomly assigned priorities!
[slide figure: a treap; heap priorities in yellow, search keys in blue. Legend: priority above key]
Now, let's look at a really funky tree. It combines a tree and a heap and gets good expected runtime.

17 Tree + Heap… Why Bother?
Insert data in sorted order into a treap; what shape tree comes out?
[slide figure: insert(7), insert(8), insert(9), insert(12) in order still yield a bushy treap. Legend: priority above key]
Notice that it doesn't matter what order the input comes in. The shape of the tree is fully specified by what the keys are and what their random priorities are! So, there are no bad inputs, only bad random numbers! That's the difference between average time and expected time. Which one is better?

18 Treap Insert
Choose a random priority. Insert as in a normal BST. Rotate up until heap order is restored.
[slide figure: insert(15), then rotations back up the tree to restore heap order]
This is pretty simple as long as rotate is already written.

19 Treap Delete
Find the key. Increase its priority to ∞. Rotate it to the fringe. Snip it off.
[slide figure: the node rotates down, step by step, until it is on the fringe and can be snipped off]
Sorry about how packed this slide is. Basically, rotate the node down to the fringe and then cut it off. This is pretty simple as long as you have rotate, as well. However, you do need to find the smaller-priority child as you go!

20 Treap Summary
Implements the Dictionary ADT:
- insert in expected O(log n) time
- delete in expected O(log n) time
- find in expected O(log n) time
Memory use: O(1) per node, about the cost of AVL trees. Complexity? The big advantage of this is that it's simple compared to AVL or even splay trees: there's no zig-zig vs. zig-zag vs. zig. Unfortunately, it doesn't give worst-case or even amortized O(log n) performance; it gives expected O(log n) performance.

21 Perfect Binary Skip List
A sorted linked list, where the # of links of a node is its height, and the height i link of each node (that has one) links to the next node of height i or greater.
[slide figure: a perfect binary skip list over the keys 2, 8, 10, 11, 13, 19, 20, 22, 23, 29]
Now, onto something completely different.

22 Find in a Perfect Binary Skip List
Start i at the maximum height. Until the node is found, or i is one and the next node is too large: if the next node along the i link is less than the target, traverse to the next node; otherwise, decrease i by one.
How many times can we go down? log n. How many times can we go right? Also log n. So, the runtime is O(log n).

23 Randomized Skip List Intuition
It's far too hard to insert into a perfect skip list, but is perfection necessary? What matters in a skip list? What really matters is that there are way fewer tall nodes than short ones, because you need to get a good way through the bottom-level list with each traverse at a high level.

24 Randomized Skip List
A sorted linked list, where the # of links of a node is its height, and the height i link of each node (that has one) links to the next node of height i or greater. There should be about 1/2 as many nodes of height i+1 as there are of height i.
[slide figure: a randomized skip list over the same keys, with roughly half as many nodes at each successive height]
So, let's create a list where all we insist on is that there are about twice as many nodes at the base as at the next level up (and so forth).

25 Find in a RSL
Start i at the maximum height. Until the node is found, or i is one and the next node is too large: if the next node along the i link is less than the target, traverse to the next node; otherwise, decrease i by one. Same as for a perfect skip list!
Expected runtime: O(log n). I won't prove it, but the intuition is that we'll probably cover enough distance at each level.

26 Insertion in a RSL
- Flip a coin until it comes up heads; say that takes i flips. Make the new node's height i.
- Do a find, remembering the nodes where we go down.
- Put the node at the spot where the find ends.
- Point all the nodes where we went down (up to the new node's height) at the new node.
- Point the new node's links where those redirected pointers were pointing.
To insert, we just need to decide what level to make the node, and the rest is pretty straightforward. How do we decide the height? Each time we flip a coin, we have a 1/2 chance of heads. So count the # of flips before a heads and put the node at that height. Let's look at an example.
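The steps above can be sketched as follows. The cap of 16 levels, the sentinel head node, and all names are my own choices, not from the lecture:

```python
import random

class SNode:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height   # one forward link per level

class SkipList:
    MAX_HEIGHT = 16

    def __init__(self):
        # sentinel head, taller than any real node, so find can start at the top
        self.head = SNode(None, self.MAX_HEIGHT)

    def _height(self):
        """Flip a coin until heads: each extra level with probability 1/2."""
        h = 1
        while h < self.MAX_HEIGHT and random.random() < 0.5:
            h += 1
        return h

    def find(self, key):
        node = self.head
        for i in range(self.MAX_HEIGHT - 1, -1, -1):   # top level down
            while node.next[i] is not None and node.next[i].key < key:
                node = node.next[i]                    # go right
            # next node too large (or missing): drop down a level
        node = node.next[0]
        return node is not None and node.key == key

    def insert(self, key):
        update = [self.head] * self.MAX_HEIGHT
        node = self.head
        for i in range(self.MAX_HEIGHT - 1, -1, -1):
            while node.next[i] is not None and node.next[i].key < key:
                node = node.next[i]
            update[i] = node          # remember where we went down
        new = SNode(key, self._height())
        for i in range(len(new.next)):  # splice in at each of the new levels
            new.next[i] = update[i].next[i]
            update[i].next[i] = new
```

Walking the bottom level of the finished list visits the keys in sorted order, which is what makes iteration and range queries so easy.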

27 Insertion Example in RSL
insert(22) with 3 flips
[slide figure: the search path to 22 in bold; 22 is spliced in as a node of height 3]
The bold lines and boxes are the ones we traverse. How long does this take? The same as find (asymptotically), so expected O(log n).

28 Range Queries and Iteration
Range query: search for everything that falls between two values. Iteration: successively return (in order) each element in the structure. How do we do them? How fast are they?
Now, one interesting thing here is how easy it is to do these two things. To range query: find the start point, then walk along the linked list at the bottom, outputting each node on the way, until the end point. That takes O(log n + m), where m is the final number of nodes in the query. Iteration is similar, except we know the bounds: the start and the end. It's the same as a linked list (O(n))! In other words, skip lists support range queries and iteration easily. Search trees support them, but not always easily.
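The find-then-walk pattern can be illustrated even without a full skip list. In this sketch (my own, for illustration), a sorted Python list plus binary search stands in for the skip list's O(log n) find, and the loop is the O(m) walk along the bottom level:

```python
import bisect

def range_query(sorted_keys, lo, hi):
    """Everything in [lo, hi]: O(log n) to find the start, O(m) to walk."""
    i = bisect.bisect_left(sorted_keys, lo)   # find the start point
    out = []
    while i < len(sorted_keys) and sorted_keys[i] <= hi:
        out.append(sorted_keys[i])            # walk the bottom-level list
        i += 1
    return out
```

On the keys from the skip list figures, `range_query([2, 8, 10, 11, 13, 19, 20, 22, 23, 29], 10, 20)` returns `[10, 11, 13, 19, 20]`.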

29 Randomized Skip List Summary
Implements the Dictionary ADT:
- insert in expected O(log n)
- find in expected O(log n)
- delete?
Memory use: expected constant memory per node, about double a linked list. Complexity?

30 What We Won't Discuss
- Pairing heaps: practically, the fastest and best implementation of heaps for decreaseKey and merge; they use the leftist cut-and-merge technique.
- Red-Black trees: a balanced tree that uses just a one-bit color flag and some invariants to maintain balance; see www/homes/sds/rb.html.
- AA-trees: a cross between Red-Black trees and B-trees that is relatively simple to code and gives worst-case O(log n) running time.
- Deterministic skip lists: a version of skip lists that gives worst-case O(log n) running time.

31 To Do
- Finish Project III
- Browse chapters 10 & 12 in the book
- Form Project IV teams! Groups of 4-6; a 2 1/2 week project; demos at the end.

32 Coming Up
- Quad Trees
- k-D Trees
- Quiz (February 17th)
- Project III due (February 17th by 5PM!)
- Project IV distributed (February 18th)
Next time, we'll talk through deciding what kind of table to use for an example. We'll look at extendible hashing for huge data sets. And, we'll have the CIDR student interview… so, come with your thinking caps on.

