CSE 326 Randomized Data Structures David Kaplan Dept of Computer Science & Engineering Autumn 2001.

CSE 326 Randomized Data Structures David Kaplan Dept of Computer Science & Engineering Autumn 2001

Randomized Data StructuresCSE 326 Autumn 20012 Motivation for Randomized Data Structures We’ve seen many data structures with good average case performance on random inputs, but bad behavior on particular inputs  e.g. Binary Search Trees  Instead of randomizing the input (since we cannot!), consider randomizing the data structure  No bad inputs, just unlucky random numbers  Expected case good behavior on any input

Randomized Data StructuresCSE 326 Autumn 20013 Average vs. Expected Time Average(1/N)   x i Expectation  Pr(x i )  x i Deterministic with good average time  If your application happens to always (or often) use the “bad” case, you are in big trouble! Randomized with good expected time  Once in a while you will have an expensive operation, but no inputs can make this happen all the time Like an insurance policy for your algorithm!

Randomized Data StructuresCSE 326 Autumn 20014 Randomized Data Structures  Define a property (or subroutine) in an algorithm  Sample or randomly modify the property  Use altered property as if it were the true property Can transform average case runtimes into expected runtimes (remove input dependency). Sometimes allows substantial speedup in exchange for probabilistic unsoundness.

Randomized Data StructuresCSE 326 Autumn 20015 Randomization in Action  Quicksort  Randomized data structures  Treaps  Randomized skip lists  Random number generation  Primality testing

Randomized Data StructuresCSE 326 Autumn 20016 Treap Dictionary Data Structure Treap is a BST  binary tree property  search tree property Treap is also a heap  heap-order property  random priorities 15 12 10 30 9 15 7878 4 18 6767 2929 priority key

Randomized Data StructuresCSE 326 Autumn 20017 Treap Insert  Choose a random priority  Insert as in normal BST  Rotate up until heap order is restored 6767 insert(15) 7878 2929 14 12 6767 7878 2929 14 12 9 15 6767 7878 2929 9 15 14 12

Randomized Data StructuresCSE 326 Autumn 20018 Tree + Heap… Why Bother?  Insert data in sorted order into a treap … What shape tree comes out? 6767 insert(7) 6767 insert(8) 7878 6767 insert(9) 7878 2929 6767 insert(12) 7878 2929 15 12

Treap Delete  Find the key  Increase its value to   Rotate it to the fringe  Snip it off delete(9) 6767 7878 2  9 15 12 7878 6767 99 9 15 12 7878 6767 9 15 99 12 7878 6767 9 15 12 99 7878 6767 9 15 12

Randomized Data StructuresCSE 326 Autumn 200110 Treap Summary  Implements Dictionary ADT  insert in expected O(log n) time  delete in expected O(log n) time  find in expected O(log n) time  but worst case O(n)  Memory use  O(1) per node  about the cost of AVL trees  Very simple to implement  little overhead – less than AVL trees

Randomized Data StructuresCSE 326 Autumn 200111 Perfect Binary Skip List  Sorted linked list  # of links of a node is its height  The height i link of each node (that has one) links to the next node of height i or greater 8 2 11 10 19 1320 22 29 23

Randomized Data StructuresCSE 326 Autumn 200112 Find in a Perfect Binary Skip List  Start i at the maximum height  Until the node is found or i is one and the next node is too large:  If the next node along the i link is less than the target, traverse to the next node  Otherwise, decrease i by one Runtime?

Randomized Data StructuresCSE 326 Autumn 200113 Randomized Skip List Intuition It’s far too hard to insert into a perfect skip list But is perfection necessary? What matters in a skip list?  We want way fewer tall nodes than short ones  Make good progress through the list with each high traverse

Randomized Data StructuresCSE 326 Autumn 200114 Randomized Skip List  Sorted linked list  # of links of a node is its height  The height i link of each node (that has one) links to the next node of height i or greater  There should be about 1/2 as many height i+1 nodes as height i nodes 2 1923 8 13 292010 22 11

Randomized Data StructuresCSE 326 Autumn 200115 Find in a RSL  Start i at the maximum height  Until the node is found or i is one and the next node is too large:  If the next node along the i link is less than the target, traverse to the next node  Otherwise, decrease i by one Same as for a perfect skip list! Runtime?

Randomized Data StructuresCSE 326 Autumn 200116 Insertion in a RSL  Flip a coin until it comes up heads; that takes i flips. Make the new node’s height i.  Do a find, remembering nodes where we go down  Put the node at the spot where the find ends  Point all the nodes where we went down (up to the new node’s height) at the new node  Point the new node’s links where those redirected pointers were pointing

RSL Insert Example 2 1923 8 13 292010 11 insert(22) with 3 flips 2 1923 8 13 292010 22 11 Runtime?

Randomized Data StructuresCSE 326 Autumn 200118 Range Queries and Iteration Range query: search for everything that falls between two values  find start point  walk list at the bottom  output each node until the end point Iteration: successively return (in order) each element in the structure  start at beginning, walk list to end  Just like a linked list! Runtimes?

Randomized Data StructuresCSE 326 Autumn 200119 Randomized Skip List Summary  Implements Dictionary ADT  insert in expected O(log n)  find in expected O(log n)  Memory use  expected constant memory per node  about double a linked list  Less overhead than search trees for iteration over range queries

Randomized Data StructuresCSE 326 Autumn 200120 Random Number Generation Linear congruential generator  x i+1 = Ax i mod M Choose A, M to minimize repetitions  M is prime  (Ax i mod M) cycles through {1, …, M-1} before repeating  Good choices: M = 2 31 – 1 = 2,147,483,647 A = 48,271 Must handle overflow!

Randomized Data StructuresCSE 326 Autumn 200121 Some Properties of Primes If:  P is prime  0 < X < P Then:  X P-1 = 1 (mod P)  the only solutions to X 2 = 1 (mod P) are: X = 1 and X = P - 1

Randomized Data StructuresCSE 326 Autumn 200122 Checking Non-Primality Exponentiation (pow function, Weiss 2.4.4) can be modified to detect non-primality (composite-ness) // Computes x^n mod(M) HugeInt pow(HugeInt x, HugeInt n, HugeInt M) { if (n == 0) return 1; if (n == 1) return x; HugeInt squared = x * x % M; // 1 M isn’t prime! if (isEven(n)) return pow(squared, n/2, M); else return (pow(squared, n/2, M) * x); }

Randomized Data StructuresCSE 326 Autumn 200123 Checking Primality Systematic algorithm For prime P, for all A such that 0 < A < P Calculate A P-1 mod P using pow Check at each step of pow and at end for primality conditions Randomized algorithm Use one random A  If the randomized algorithm reports failure, then P isn’t prime.  If the randomized algorithm reports success, then P might be prime.  At most (P-9)/4 values of A fool this prime check  if algorithm reports success, then P is prime with probability > ¾ Each new A has independent probability of false positive  We can make probability of false positive arbitrarily small

Randomized Data StructuresCSE 326 Autumn 200124 Evaluating Randomized Primality Testing Your probability of being struck by lightning this year: 0.00004% Your probability that a number that tests prime 11 times in a row is actually not prime: 0.00003% Your probability of winning a lottery of 1 million people five times in a row: 1 in 2 100 Your probability that a number that tests prime 50 times in a row is actually not prime: 1 in 2 100

Randomized Data StructuresCSE 326 Autumn 200125 Cycles Prime numbers Random numbers Randomized algorithms

CSE 326 Randomized Data Structures David Kaplan Dept of Computer Science & Engineering Autumn 2001.

Similar presentations

Presentation on theme: "CSE 326 Randomized Data Structures David Kaplan Dept of Computer Science & Engineering Autumn 2001."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CSE 326 Randomized Data Structures David Kaplan Dept of Computer Science & Engineering Autumn 2001.

Similar presentations

Presentation on theme: "CSE 326 Randomized Data Structures David Kaplan Dept of Computer Science & Engineering Autumn 2001."— Presentation transcript:

Similar presentations

About project

Feedback