More on Randomized Data Structures

Motivation for Randomized Data Structures
We've seen many data structures with good average-case performance on random inputs but bad behavior on particular inputs, e.g. binary search trees.
Instead of randomizing the input (since we cannot!), consider randomizing the data structure:
  No bad inputs, just unlucky random numbers
  Expected-case good behavior on any input

Average vs. Expected Time
Average: $\frac{1}{N} \sum_i x_i$
Expectation: $\sum_i \Pr(x_i)\, x_i$
Deterministic with good average time: if your application happens to always (or often) use the "bad" case, you are in big trouble!
Randomized with good expected time: once in a while you will have an expensive operation, but no input can make this happen all the time.
Like an insurance policy for your algorithm!

Randomized Data Structures
Define a property (or subroutine) in an algorithm
Sample or randomly modify the property
Use the altered property as if it were the true property
This can transform average-case runtimes into expected runtimes (removing the input dependency). It sometimes allows a substantial speedup in exchange for probabilistic unsoundness.

Randomization in Action
Quicksort
Randomized data structures:
  Treaps
  Randomized skip lists
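As an illustration of the recipe (pick a property, randomize it, then use it as if it were the true one), here is a minimal sketch of quicksort with a random pivot. The function name `randomized_quicksort` and the `rng` parameter are assumptions made for this example, not something the slides define.

```python
import random

def randomized_quicksort(items, rng=random):
    """Return a sorted copy of items, choosing every pivot uniformly at random."""
    if len(items) <= 1:
        return list(items)
    # The randomized "property": the pivot is sampled, not taken from a fixed position.
    pivot = items[rng.randrange(len(items))]
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return randomized_quicksort(less, rng) + equal + randomized_quicksort(greater, rng)

already_sorted = list(range(200))
result = randomized_quicksort(already_sorted)
```

Already-sorted input is the classic bad case for a quicksort that always picks the first element as pivot. With a random pivot, no particular input is bad; only an unlucky run of random choices is.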

Treap Dictionary Data Structure
A treap is a BST:
  binary tree property
  search tree property
A treap is also a heap:
  heap-order property
  random priorities
[Figure: example treap, each node labeled with (priority, key): (2, 9), (6, 7), (4, 18), (7, 8), (9, 15), (10, 30), (15, 12)]
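The two properties can be checked mechanically. The sketch below assumes a min-heap on priorities (the smallest priority at the root), which matches the example nodes; the `Node` class and `is_treap` function are names invented for this illustration, and the tree layout is reconstructed from the (priority, key) pairs rather than copied from the figure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    key: int
    priority: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def is_treap(node, lo=float("-inf"), hi=float("inf")):
    """Check the search-tree property on keys and min-heap order on priorities."""
    if node is None:
        return True
    if not (lo < node.key < hi):
        return False  # search-tree property violated
    for child in (node.left, node.right):
        if child is not None and child.priority < node.priority:
            return False  # heap-order property violated
    return is_treap(node.left, lo, node.key) and is_treap(node.right, node.key, hi)

# Assumed layout for the example nodes; arguments are Node(key, priority).
root = Node(9, 2,
            left=Node(7, 6, right=Node(8, 7)),
            right=Node(18, 4,
                       left=Node(15, 9, left=Node(12, 15)),
                       right=Node(30, 10)))
valid = is_treap(root)
```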

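Treap Insert
Choose a random priority
Insert as in a normal BST
Rotate up until heap order is restored
[Figure: insert(15) into a treap with (priority, key) nodes (2, 9), (6, 7), (14, 12), (7, 8). The new node (9, 15) is first attached below (14, 12), then rotated up above it.]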
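The three insert steps can be sketched recursively as follows. This is one possible implementation, not necessarily the one the slides have in mind: it assumes a min-heap on priorities, ignores duplicate keys, and accepts an optional explicit `priority` argument (an addition for this example) so that the insert(15) figure can be reproduced deterministically.

```python
import random

class Node:
    def __init__(self, key, priority):
        self.key, self.priority = key, priority
        self.left = self.right = None

def rotate_right(node):
    """Lift node.left above node and return the new subtree root."""
    child = node.left
    node.left, child.right = child.right, node
    return child

def rotate_left(node):
    """Lift node.right above node and return the new subtree root."""
    child = node.right
    node.right, child.left = child.left, node
    return child

def insert(node, key, priority=None):
    if priority is None:
        priority = random.random()  # step 1: choose a random priority
    if node is None:
        return Node(key, priority)  # step 2: attach as a leaf, as in a normal BST
    if key < node.key:
        node.left = insert(node.left, key, priority)
        if node.left.priority < node.priority:
            node = rotate_right(node)  # step 3: rotate up to restore heap order
    elif key > node.key:
        node.right = insert(node.right, key, priority)
        if node.right.priority < node.priority:
            node = rotate_left(node)
    return node

# Rebuild the figure's starting treap, then insert key 15 with priority 9.
root = None
for key, priority in [(9, 2), (7, 6), (12, 14), (8, 7)]:
    root = insert(root, key, priority)
root = insert(root, 15, 9)
```

After the last call, the node with key 15 sits directly below the root and the node with key 12 has become its left child, as in the final tree of the figure.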

Tree + Heap… Why Bother?
Insert data in sorted order into a treap … what shape tree comes out?
[Figure: insert(7), insert(8), insert(9), insert(12), showing the treap after each step]
Notice that it doesn't matter what order the input comes in. The shape of the tree is fully specified by what the keys are and what their random priorities are! So there are no bad inputs, only bad random numbers! That's the difference between average time and expected time. Which one is better?
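The claim that the shape depends only on the keys and their priorities can be checked directly: insert the same (key, priority) pairs in every possible order and compare the resulting trees. The sketch below assumes distinct keys, distinct priorities, and a min-heap on priorities; the priority values and the helper names `shape` and `build` are illustrative choices for this example.

```python
import itertools

class Node:
    def __init__(self, key, priority):
        self.key, self.priority = key, priority
        self.left = self.right = None

def rotate_right(node):
    child = node.left
    node.left, child.right = child.right, node
    return child

def rotate_left(node):
    child = node.right
    node.right, child.left = child.left, node
    return child

def insert(node, key, priority):
    """BST insert followed by rotations that restore min-heap order."""
    if node is None:
        return Node(key, priority)
    if key < node.key:
        node.left = insert(node.left, key, priority)
        if node.left.priority < node.priority:
            node = rotate_right(node)
    else:
        node.right = insert(node.right, key, priority)
        if node.right.priority < node.priority:
            node = rotate_left(node)
    return node

def shape(node):
    """Encode the tree as nested tuples so two trees can be compared."""
    if node is None:
        return None
    return (node.key, shape(node.left), shape(node.right))

def build(pairs):
    root = None
    for key, priority in pairs:
        root = insert(root, key, priority)
    return root

# (key, priority) pairs; the priorities are illustrative values.
pairs = [(7, 6), (8, 7), (9, 2), (12, 15)]
shapes = {shape(build(order)) for order in itertools.permutations(pairs)}
```

All 24 insertion orders, including the sorted one, produce the same tree, so `shapes` contains a single element.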

Treap Delete
Find the key
Increase its priority to ∞
Rotate it to the fringe
Snip it off
[Figure: delete(9) on a treap with (priority, key) nodes (2, 9), (6, 7), (9, 15), (7, 8), (15, 12); the node's priority becomes ∞ and it is rotated downward]
Basically, rotate the node down to the fringe and then cut it off. This is pretty simple as long as you have rotate as well. However, you do need to find the child with the smaller priority as you go!

Treap Delete (2)
[Figure: delete(9) continued; the node with priority ∞ is rotated further down the tree]

Treap Delete (3)
[Figure: delete(9) continued; the node with priority ∞ reaches the fringe and is snipped off]
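The delete procedure walked through above can be sketched as follows, again assuming a min-heap on priorities. At each step the child with the smaller priority is rotated up, which pushes the doomed node one level down until it is a leaf. The `Node` constructor with optional children and the `inorder` helper are conveniences added for this example; the starting tree is a reconstruction of the delete(9) figure.

```python
class Node:
    def __init__(self, key, priority, left=None, right=None):
        self.key, self.priority = key, priority
        self.left, self.right = left, right

def rotate_right(node):
    child = node.left
    node.left, child.right = child.right, node
    return child

def rotate_left(node):
    child = node.right
    node.right, child.left = child.left, node
    return child

def delete(node, key):
    if node is None:
        return None  # key not present
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        node.priority = float("inf")  # found it: raise its priority to infinity
        if node.left is None and node.right is None:
            return None  # at the fringe: snip it off
        # Rotate up the child with the smaller priority, then keep going down.
        if node.right is None or (
            node.left is not None and node.left.priority < node.right.priority
        ):
            node = rotate_right(node)
            node.right = delete(node.right, key)
        else:
            node = rotate_left(node)
            node.left = delete(node.left, key)
    return node

def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

# Assumed layout of the figure's treap; arguments are Node(key, priority).
root = Node(9, 2,
            left=Node(7, 6, right=Node(8, 7)),
            right=Node(15, 9, left=Node(12, 15)))
root = delete(root, 9)
remaining = inorder(root)
```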

Treap Summary
Implements Dictionary ADT
  insert in expected O(log n) time
  delete in expected O(log n) time
  find in expected O(log n) time
  but worst case O(n)
Memory use
  O(1) per node
  about the cost of AVL trees
Very simple to implement
  little overhead, less than AVL trees
The big advantage of this is that it's simple compared to AVL or even splay trees. There's no zig-zig vs. zig-zag vs. zig. Unfortunately, it doesn't give worst-case or even amortized O(log n) performance. It gives expected O(log n) performance.
