CSE 326: Data Structures Lecture #18 Fistful of Data Structures


CSE 326: Data Structures Lecture #18 Fistful of Data Structures Steve Wolfman Winter Quarter 2000 I'm going to blow through quite a few "advanced" data structures today. What should you take away from these? You should know which ADT each one implements. You should have a general idea of why you might use one. And, given a description of how one works, you should be able to reason intelligently about it.

Today’s Outline What Steve Didn’t Get To On Monday Warm-up: augmenting leftist heaps Binomial Queues Treaps Randomized Skip Lists What Steve Won’t Get To (Ever?) We’ll warm up by playing with a data structure we’re already familiar with; then, we’ll jump into a bunch of other data structures. At the end, I’ll mention some data structures that you might never see; then again, they might be useful and important (or at least interesting) to you someday.

Thinking about DecreaseKey in Leftist Heaps Why not just percolate up? decreaseKey(node, 3) [slide diagram: a leftist heap before and after percolating the changed key up] Percolate up works, so why don't we do it? Because the node might need to go all the way to the root, and it could be O(n) away from the root! FYI: from here on in, I'll just write 15, but remember that I actually need a pointer to the node.

DecreaseKey in Leftist Heaps Now just merge the two? [slide diagram: the heap split into two trees at the changed node] So, instead we'll just cut the tree off where we change it. Now, we can just merge these two new trees, right? Wrong! One of them is _not_ a leftist tree anymore!

Fixing DecreaseKey in Leftist Heaps This may not be leftist, so fix it! (The cut-off subtree is still leftist.) [slide diagram: swapping subtrees along the path back up to the root] So, let's just fix it. We'll swap subtrees back up the tree until we have a leftist heap again. Now, we can just merge!

DecreaseKey runtime runtime: O(log n). How many nodes up the path could possibly end up with the wrong null path length? How long does this take? Well, the merge takes O(log n) and the cut takes O(1). How about fixing the tree? The question, as with AVL trees, is really: how many NPLs can change? Could it be O(n)? Well, the NPL is the shortest distance to a NULL, and the largest NPL in the tree is O(log n) (otherwise there would be a right path that was too long). Cutting can only reduce NPLs, and k levels up, the NPL can only drop to k. So, we only need to fix O(log n) parent nodes (actually at most log n + 1).

Delete in Leftist Heaps decreaseKey(15, -∞) deleteMin() runtime: O(log n). Now, delete is easy! Decrease the key to negative infinity, then deleteMin. How long does that take? O(log n).
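To make the cut, fix, and merge steps concrete, here is a minimal Java sketch (the names and structure are mine, not the lecture's); it assumes each node carries a parent pointer, which decreaseKey needs in order to walk back toward the root:

```java
/** Minimal sketch of leftist-heap decreaseKey/delete, assuming parent pointers. */
class LeftistHeap {
    static class Node {
        int key, npl;                        // npl = null path length
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    Node root;

    static int npl(Node n) { return n == null ? -1 : n.npl; }

    /** Standard leftist merge: walk the right spines, keep npl(left) >= npl(right). */
    static Node merge(Node a, Node b) {
        if (a == null) return b;
        if (b == null) return a;
        if (b.key < a.key) { Node t = a; a = b; b = t; }   // smaller root on top
        a.right = merge(a.right, b);
        a.right.parent = a;
        if (npl(a.left) < npl(a.right)) { Node t = a.left; a.left = a.right; a.right = t; }
        a.npl = npl(a.right) + 1;
        return a;
    }

    /** Cut the node's subtree off, fix NPLs up the path to the root, then merge back. */
    void decreaseKey(Node n, int newKey) {
        n.key = newKey;
        if (n == root) return;               // still a valid leftist heap
        Node p = n.parent;                   // the cut-off subtree is itself still leftist
        if (p.left == n) p.left = null; else p.right = null;
        n.parent = null;
        for (Node cur = p; cur != null; cur = cur.parent) {
            if (npl(cur.left) < npl(cur.right)) { Node t = cur.left; cur.left = cur.right; cur.right = t; }
            int fixed = npl(cur.right) + 1;
            if (fixed == cur.npl) break;     // NPL unchanged: nothing above can be wrong
            cur.npl = fixed;                 // at most O(log n) of these fixes
        }
        root = merge(root, n);
        root.parent = null;
    }

    /** Delete = decreaseKey to "negative infinity", then deleteMin. */
    void delete(Node n) {
        decreaseKey(n, Integer.MIN_VALUE);   // n floats to the root...
        root = merge(root.left, root.right); // ...so deleteMin just merges its children
        if (root != null) root.parent = null;
    }
}
```

The early break in the fix-up loop is exactly the runtime argument from the slide above: once a node's NPL comes out unchanged, no ancestor's NPL can be wrong.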

Binomial Trees A binomial tree of rank 0 is a one-node tree. A binomial tree of rank k is a binomial tree of rank k-1 with another binomial tree of rank k-1 hanging from its root. [slide diagram: binomial trees of rank 0 and rank 1] Momentary digression.

First Five Binomial Trees [slide diagram: binomial trees of ranks 0 through 4] How many nodes does a binomial tree of rank k have? Let's talk about some properties here. First, and most important, how big is the binomial tree of rank k? Rank 0 -> 1, rank 1 -> 2, rank 2 -> 4, rank 3 -> 8, rank 4 -> 16: 2^k nodes. Next, what are the children of the root of a binomial tree of rank k? A binomial tree of each previous rank! Finally, why "binomial" tree? In a binomial tree of rank k, level i contains k choose i nodes. That's why they're called binomial, but that's not really important.

Binomial Queue Heap Data Structure Composed of a forest of binomial trees, no two of the same rank. Heap order is enforced within each tree. The ranks present can be read off the binary representation of the queue's size: size = 10 = 1010 in binary, so trees of ranks 1 and 3 are present. [slide diagram: a ten-node binomial queue made of one rank-1 tree and one rank-3 tree] Here's our first new data structure. This one is a priority queue data structure. Notice that one binomial queue may be made of many heap-ordered trees. Also, notice that those trees might have a large branching factor.

Insertion in Binomial Queues [slide diagram: inserting a node into a queue that has no rank-0 tree] OK, how do we do operations on these? Let's insert. A single node is a binomial tree of rank 0, so we'll just put in a new binomial tree. If there's no rank 0 tree already, the new node simply becomes the rank 0 tree!

Insertion in Binomial Queues [slide diagram: inserting into a queue that already has a rank-0 tree, producing carry trees] It's like addition of binary numbers! 1+1 = 0 plus a carry tree; 1+1+1 = 1 plus a carry tree. runtime: amortized O(1). But what if there is a rank 0 tree? Well, we can merge the two rank 0 trees to make a rank 1 tree. If there were no rank 1 tree, we'd just stop; if there is one, we merge with that to make a rank 2 tree, and so forth. It's just like adding two binary numbers: we have carry trees! How long does this take? Constant time per merge, and at most O(log n) trees, so at most O(log n) merges. BUT, just as Zasha proved that adding 1 to a binary number is amortized constant, this, too, is amortized constant!

Merge in Binomial Queues [slide diagram: merging a six-node queue with a three-node queue; 0110 + 0011 = 1001] Speaking of merging, how about merging two whole binomial queues? It's just like insert. If there's an empty spot at some rank, we just put the tree in. Otherwise, we merge two trees of the same rank to get a carry tree and, if necessary, merge that with the next level. runtime: O(log n), for the same reasons.

DeleteMin in Binomial Queues [slide diagram: the minimum root is snipped off; its children form one binomial queue, the remaining trees form another. Just merge the two.] DeleteMin, surprisingly, is just a merge. We find the smallest root (O(log n), since there are O(log n) roots). Then, we pull that tree out, snip off its root, and make a new binomial queue from its children. Finally, we merge the two back together. runtime: O(log n).
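Here is a hedged Java sketch pulling the last three slides together (all names are mine, not the lecture's): link is the "merge two trees of the same rank" step, merge is binary addition with carry trees, and insert and deleteMin are built on top of it:

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of a binomial queue (min-heap). trees[k] holds the rank-k tree, or null. */
class BinomialQueue {
    static class Node {
        int key;
        List<Node> children = new ArrayList<>();  // child i has rank i
        Node(int key) { this.key = key; }
    }

    static final int MAX_RANK = 32;               // a rank-k tree has 2^k nodes
    Node[] trees = new Node[MAX_RANK];

    /** Link two rank-k trees into one rank-(k+1) tree: the smaller root wins. */
    static Node link(Node a, Node b) {
        if (b.key < a.key) { Node t = a; a = b; b = t; }
        a.children.add(b);
        return a;
    }

    /** Merge a forest in: exactly like binary addition, with carry trees. */
    void merge(Node[] other) {
        Node carry = null;
        for (int k = 0; k < MAX_RANK; k++) {
            List<Node> atRank = new ArrayList<>();
            if (trees[k] != null) atRank.add(trees[k]);
            if (other[k] != null) atRank.add(other[k]);
            if (carry != null)    atRank.add(carry);
            // 1 or 3 trees present: one stays at this rank; any remaining pair links into a carry
            trees[k] = (atRank.size() % 2 == 1) ? atRank.remove(0) : null;
            carry = (atRank.size() == 2) ? link(atRank.get(0), atRank.get(1)) : null;
        }
    }

    void insert(int key) {                        // a single node is a rank-0 binomial tree
        Node[] single = new Node[MAX_RANK];
        single[0] = new Node(key);
        merge(single);
    }

    int deleteMin() {                             // assumes the queue is non-empty
        int best = -1;                            // find the smallest root: O(log n) roots
        for (int k = 0; k < MAX_RANK; k++)
            if (trees[k] != null && (best < 0 || trees[k].key < trees[best].key)) best = k;
        Node min = trees[best];
        trees[best] = null;
        Node[] kids = new Node[MAX_RANK];         // the snipped root's children form a queue...
        for (int r = 0; r < min.children.size(); r++) kids[r] = min.children.get(r);
        merge(kids);                              // ...so just merge the two back together
        return min.key;
    }
}
```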

Binomial Queue Summary Implements priority queue ADT Insert in amortized O(1) time FindMin (with some tricks) in O(1) time DeleteMin in O(log n) time Merge in O(log n) time Memory use: O(1) per node, about the cost of skew heaps Complexity? So, this is good if you need to do a bunch of inserts quickly.

Treap Dictionary Data Structure [slide diagram: a treap, heap priorities in yellow and search keys in blue; legend: priority / key] Treaps have the binary search tree properties: the binary tree property and the search-tree ordering property on keys. Treaps also have the heap-order property on their randomly assigned priorities! Now, let's look at a really funky tree. It combines a tree and a heap and gets good expected runtime.

Tree + Heap… Why Bother? Insert data in sorted order into a treap; what shape of tree comes out? [slide diagram: insert(7), insert(8), insert(9), insert(12) in order; the treap stays balanced] Notice that it doesn't matter what order the input comes in: the shape of the tree is fully determined by what the keys are and what their random priorities are! So, there are no bad inputs, only bad random numbers! That's the difference between average time and expected time. Which one is better?

Treap Insert Choose a random priority. Insert as in a normal BST. Rotate the new node up until heap order is restored. [slide diagram: insert(15), then rotations restore heap order] This is pretty simple as long as rotate is already written.
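A minimal Java sketch of treap insert (class and method names are hypothetical, not the lecture's code), assuming integer keys and randomly drawn integer priorities:

```java
import java.util.Random;

/** Minimal sketch of a treap: BST on keys, min-heap on random priorities. */
class Treap {
    static final Random RNG = new Random();

    static class Node {
        int key;
        int priority = RNG.nextInt();    // the random priority fully decides the shape
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    static Node rotateRight(Node y) { Node x = y.left;  y.left  = x.right; x.right = y; return x; }
    static Node rotateLeft (Node x) { Node y = x.right; x.right = y.left;  y.left  = x; return y; }

    Node insert(Node t, int key) {
        if (t == null) return new Node(key);                      // BST insert at a leaf...
        if (key < t.key) {
            t.left = insert(t.left, key);
            if (t.left.priority < t.priority) t = rotateRight(t); // ...then rotate up while
        } else {                                                  // heap order is violated
            t.right = insert(t.right, key);
            if (t.right.priority < t.priority) t = rotateLeft(t);
        }
        return t;
    }

    void insert(int key) { root = insert(root, key); }
}
```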

Treap Delete Find the key. Increase its priority to ∞. Rotate it down to the fringe. Snip it off. [slide diagram: the node rotates down to the fringe and is cut off] Sorry about how packed this slide is. Basically, rotate the node down to the fringe and then cut it off. This is pretty simple as long as you have rotate as well. However, you do need to find the smaller-priority child as you go!
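Continuing the sketch above, delete can be written with the same two rotations. Rather than literally setting the priority to ∞, this version rotates the smaller-priority child up until the doomed node reaches the fringe, which has the same effect:

```java
    /** Rotate the node down to the fringe (the smaller-priority child rotates up), then snip it off. */
    Node delete(Node t, int key) {
        if (t == null) return null;                    // key not present
        if (key < t.key)      t.left  = delete(t.left, key);
        else if (key > t.key) t.right = delete(t.right, key);
        else if (t.left == null)  return t.right;      // at the fringe: snip
        else if (t.right == null) return t.left;
        else if (t.left.priority < t.right.priority) { // as if the priority were now infinite:
            t = rotateRight(t);                        // the smaller-priority child moves up
            t.right = delete(t.right, key);
        } else {
            t = rotateLeft(t);
            t.left = delete(t.left, key);
        }
        return t;
    }

    void delete(int key) { root = delete(root, key); }
```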

Treap Summary Implements Dictionary ADT insert in expected O(log n) time delete in expected O(log n) time find in expected O(log n) time Memory use: O(1) per node, about the cost of AVL trees Complexity? The big advantage of this is that it's simple compared to AVL or even splay trees: there's no zig-zig vs. zig-zag vs. zig. Unfortunately, it doesn't give worst case or even amortized O(log n) performance; it gives expected O(log n) performance.

Perfect Binary Skip List Sorted linked list. The number of links out of a node is its height. The height-i link of each node (that has one) links to the next node of height i or greater. [slide diagram: a perfect skip list over the keys 2, 8, 10, 11, 13, 19, 20, 22, 23, 29] Now, onto something completely different.

Find in a Perfect Binary Skip List Start i at the maximum height. Until the node is found, or i is one and the next node is too large: if the next node along the height-i link is less than the target, traverse to that node; otherwise, decrease i by one. How many times can we go down? log n. How many times can we go right? Also log n. So, the runtime is O(log n). runtime: O(log n).
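Here is a minimal Java sketch of that find loop; the node layout (a next array indexed by level) and the sentinel head are my assumptions, not the lecture's code:

```java
/** Minimal sketch of a skip list; next[i] is a node's level-i link (0-indexed). */
class SkipList {
    static class Node {
        int key;
        Node[] next;                                  // a node's height = next.length;
        Node(int key, int height) {                   // next[i] points to the next node
            this.key = key;                           // whose height exceeds i
            this.next = new Node[height];
        }
    }

    static final int MAX_HEIGHT = 33;
    Node head = new Node(Integer.MIN_VALUE, MAX_HEIGHT);  // sentinel taller than any real node

    boolean find(int target) {
        Node cur = head;
        for (int i = MAX_HEIGHT - 1; i >= 0; i--)      // start at the maximum height
            while (cur.next[i] != null && cur.next[i].key < target)
                cur = cur.next[i];                     // next node still too small: go right
        Node cand = cur.next[0];                       // otherwise the loop drops down a level
        return cand != null && cand.key == target;
    }
}
```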

Randomized Skip List Intuition It's far too hard to insert into a perfect skip list, but is perfection necessary? What matters in a skip list? What really matters is that there are far fewer tall nodes than short ones, because each traversal at a high level needs to carry you a good way along the bottom-level list.

Randomized Skip List Sorted linked list. The number of links out of a node is its height. The height-i link of each node (that has one) links to the next node of height i or greater. There should be about half as many nodes of height i+1 as there are of height i. [slide diagram: a randomized skip list over the same keys, with irregular heights] So, let's create a list where all we insist on is that there are about twice as many nodes at the base as at the next level up (and so forth).

Find in a RSL Same as for a perfect skip list! Start i at the maximum height. Until the node is found, or i is one and the next node is too large: if the next node along the height-i link is less than the target, traverse to that node; otherwise, decrease i by one. Expected time here is O(log n). I won't prove it, but the intuition is that we'll probably cover enough distance at each level. runtime: expected O(log n).

Insertion in a RSL Flip a coin until it comes up heads; that takes i flips. Make the new node's height i. Do a find, remembering the nodes where we go down. Put the node at the spot where the find ends. Point all the nodes where we went down (up to the new node's height) at the new node. Point the new node's links where those redirected pointers were pointing. To insert, we just need to decide what level to make the node, and the rest is pretty straightforward. How do we decide the height? Each time we flip a coin, we have a 1/2 chance of heads. So count the number of flips up to the first heads and put the node at that height. Let's look at an example.
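Continuing the skip-list sketch from the find slide, here is a hedged version of insert: a coin-flipped height, then the same descent as find, splicing the new node in at every level where we go down:

```java
    /** Flip a coin for the height, then redo the find, splicing in wherever we went down. */
    static final java.util.Random RNG = new java.util.Random();

    void insert(int key) {
        int height = 1;                                // i flips to the first heads => height i
        while (height < MAX_HEIGHT && RNG.nextBoolean()) height++;
        Node node = new Node(key, height);
        Node cur = head;
        for (int i = MAX_HEIGHT - 1; i >= 0; i--) {
            while (cur.next[i] != null && cur.next[i].key < key)
                cur = cur.next[i];
            if (i < height) {                          // this is a spot where we go down:
                node.next[i] = cur.next[i];            // point the new node where cur pointed,
                cur.next[i] = node;                    // and point cur at the new node
            }
        }
    }
```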

Insertion Example in RSL insert(22) with 3 flips [slide diagram: the search path is bold; 22 is spliced in at height 3] The bold lines and boxes are the ones we traverse. How long does this take? The same as find (asymptotically), so expected O(log n). runtime: expected O(log n).

Range Queries and Iteration Range query: search for everything that falls between two values. Iteration: successively return (in order) each element in the structure. How do we do them? How fast are they? Now, one interesting thing here is how easy it is to do these two things. To range query: find the start point, then walk along the linked list at the bottom, outputting each node on the way until the end point. That takes O(log n + m), where m is the number of nodes in the query's result. Iteration is similar, except we know the bounds (the start and the end); it's the same as walking a linked list (O(n)). In other words, skip lists support range queries and iteration easily. Search trees support them too, but not always easily.
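As a sketch, a range query on the same SkipList class from above (the method name is mine) is just a find followed by a bottom-level walk:

```java
    /** Range query: find the start point, then walk the bottom-level list to the end point. */
    java.util.List<Integer> range(int lo, int hi) {
        java.util.List<Integer> result = new java.util.ArrayList<>();
        Node cur = head;
        for (int i = MAX_HEIGHT - 1; i >= 0; i--)      // expected O(log n) to find the start...
            while (cur.next[i] != null && cur.next[i].key < lo)
                cur = cur.next[i];
        for (Node n = cur.next[0]; n != null && n.key <= hi; n = n.next[0])
            result.add(n.key);                         // ...plus O(m) to output the m matches
        return result;
    }
```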

Randomized Skip List Summary Implements Dictionary ADT insert in expected O(log n) time find in expected O(log n) time delete? Memory use: expected constant memory per node, about double a linked list Complexity?

What We Won’t Discuss Pairing heaps - practically, the fastest and best implementation of heaps for decreaseKey and merge; they use the leftist cut and merge technique Red-Black Trees - a balanced tree that uses just a one-bit color flag and some invariants to maintain balance: see www/homes/sds/rb.html AA-Trees - a cross between Red-Black trees and B-Trees that is relatively simple to code and gives worst case O(log n) running time Deterministic skip lists - a version of skip lists that gives worst case O(log n) running time

To Do Finish Project III. Browse chapters 10 & 12 in the book. Form Project IV teams! Groups of 4-6; a 2 1/2 week project; demos at the end.

Coming Up Quad Trees k-D Trees Quiz (February 17th) Project III due (February 17th by 5PM!) Project IV distributed (February 18th) Next time, we'll talk through deciding what kind of table to use for an example. We'll look at extendible hashing for huge data sets. And, we'll have the CIDR student interview… so, come with your thinking caps on.