Amortized Analysis of Rehashing

1 Amortized Analysis of Rehashing

2 What is rehashing
Hash table too full → spend a lot of time looking in buckets. Solution: rehash. Make the hash table twice the size; for each item in the original hash table, hash it to its location in the bigger table.
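A minimal Python sketch of the rehash step described above. It assumes a chained table stored as a list of buckets and Python's built-in hash(); the function name rehash and this layout are illustrative choices, not something the slides specify.

```python
def rehash(buckets):
    """Return a bucket list twice the size, with every item hashed into it."""
    bigger = [[] for _ in range(2 * len(buckets))]
    for bucket in buckets:
        for item in bucket:
            # Each item's slot is recomputed against the new, larger size.
            bigger[hash(item) % len(bigger)].append(item)
    return bigger


# Usage: fill a 2-bucket table, then rehash it into 4 buckets.
old_table = [[] for _ in range(2)]
for item in (10, 21, 32):
    old_table[hash(item) % len(old_table)].append(item)
new_table = rehash(old_table)
```

Every item is touched exactly once, which is why the later slides charge a rehash of N items a cost of N.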

3 Assumptions for Thursday, Feb. 10, 2000
Rehash whenever the table is 50% full or more. Just a sequence of inserts (can be generalized for other operations). We never get a collision (once dealing with collisions, we’re in average-case analysis territory). Hash table starts as size 2.

4 Observations
How expensive is an insert? How expensive is a rehash? When will we need to rehash?

5 Observations
How expensive is an insert? O(1), assuming no collisions. Say it’s 1. How expensive is a rehash? O(N), where N is the number of items currently in the table. Say it’s N. When will we need to rehash? Whenever the number of items in the table reaches a power of 2 (1, 2, 4, 8, 16, …).
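To check when rehashes fire, here is a small Python sketch that replays the stated assumptions (table starts at size 2, rehash as soon as it is 50% full or more, inserts only). The function name rehash_points and its parameters are made up for this illustration.

```python
def rehash_points(num_inserts, start_size=2):
    """Return the insert numbers (1-based) that trigger a rehash."""
    size, filled, points = start_size, 0, []
    for i in range(1, num_inserts + 1):
        filled += 1
        if 2 * filled >= size:  # table is 50% full or more
            points.append(i)
            size *= 2           # rehash into a table twice the size
    return points


points = rehash_points(20)
```

Under these assumptions the rehashes land on the inserts that bring the item count to 1, 2, 4, 8, 16, and so on.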

6 Amortized analysis Strategy 1: add up operations
Note: I’ve made assumptions about the constants, but the analysis could be done for any constants. Note 2: This is sometimes called the “Aggregate Method”
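A worked version of the "add up operations" sum, under the costs chosen on the previous slide (each insert costs 1; a rehash fires when the item count reaches 1, 2, 4, 8, … and costs that many units). T(n), the total cost of n inserts with n ≥ 1, is notation introduced here for illustration.

```latex
T(n) \;=\; \underbrace{n}_{\text{inserts}}
      \;+\; \underbrace{\sum_{k=0}^{\lfloor \log_2 n \rfloor} 2^{k}}_{\text{rehashes}}
      \;=\; n + 2^{\lfloor \log_2 n \rfloor + 1} - 1
      \;\le\; n + 2n - 1
      \;<\; 3n
```

So the total for n inserts is below 3n, which gives an amortized cost of less than 3 per insert.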

7 Amortized analysis Strategy 2: Accounting Method
Charge the cost of some operations to other operations. Each operation gives us some “tokens”, which we can spend on future operations.

8 Accounting Method analysis
Each insert gives us 3 tokens: 1 token for that insert; 1 token for rehashing this item the first time; 1 token for rehashing another item that has already been rehashed once or more. (Since each rehash doubles the size, # of items never rehashed = # of items already rehashed once or more.)

9
What happens    Tokens added    Tokens used
Insert 1        1A, 1B, 1C      1A
Rehash to 4                     1B – rehash 1
Insert 2        2A, 2B, 2C      2A
Rehash to 8                     2B – rehash 2; 2C – rehash 1
Insert 3        3A, 3B, 3C      3A
Insert 4        4A, 4B, 4C      4A
Rehash to 16                    3B – rehash 3; 4B – rehash 4; 3C – rehash 1; 4C – rehash 2

10
What happens    Tokens added    Tokens used
Insert 5        5A, 5B, 5C      5A
Insert 6        6A, 6B, 6C      6A
Insert 7        7A, 7B, 7C      7A
Insert 8        8A, 8B, 8C      8A
Rehash to 32                    5B – rehash 5; 6B – rehash 6; 7B – rehash 7; 8B – rehash 8; 5C – rehash 1; 6C – rehash 2; 7C – rehash 3; 8C – rehash 4
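The token bookkeeping in these tables can be replayed in a few lines of Python. This sketch tracks only the number of unspent tokens, not which token pays for which item; the function name token_balance_history and its parameters are assumptions made for the example.

```python
def token_balance_history(num_inserts, tokens_per_insert=3, start_size=2):
    """Return the count of unspent tokens after each insert (and any rehash)."""
    size, filled, balance, history = start_size, 0, 0, []
    for _ in range(num_inserts):
        balance += tokens_per_insert  # tokens granted by this insert
        balance -= 1                  # pay for the insert itself
        filled += 1
        if 2 * filled >= size:        # 50% full or more: rehash
            balance -= filled         # one token per item moved
            size *= 2
        history.append(balance)
    return history


history = token_balance_history(8)
```

With 3 tokens per insert the balance never drops below zero in this replay, while calling it with tokens_per_insert=2 goes negative at the second insert, which suggests 3 is the smallest whole-number charge that works under these cost assumptions.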

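11 Potential function
Potential function: more sophisticated tokens. The potential function is a function of the current state of the data structure. When an operation costs less than its amortized cost, the potential goes up. When an operation costs more, take the difference from the potential. The potential must never go negative; otherwise we have spent more time than the amortized costs account for.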

12 Tokens as a potential function
Potential function after operation #i: P(i) = 1 + 2F - S/2, where F = # of filled (non-empty) hash table slots and S = total # of hash table slots. Actual cost of operation #i: C(i). Amortized cost of operation #(i+1): CA(i+1) = C(i+1) + ( P(i+1) – P(i) ).

13 The Math
Beginning: F=0, S=2, so P(0)=0. For insert with no rehashing: C(i+1)=1 and P(i+1)-P(i)=2 (added one non-empty slot). For insert with rehashing of N items: C(i+1)=1+N and P(i+1)-P(i)=2-N (the insert brings the table to F=N, S=2N, and the rehash then makes S=4N, so 2F rises by 2 and S/2 rises by N).

14 Potential method analysis
Is P(i) ever negative? No. We rehash when F ≥ S/2, after which S = 4F and so P = 1 + 2F - 2F = 1; between rehashes each insert only raises P, and P starts at 0. Amortized cost is constant: CA(i+1) = C(i+1) + P(i+1) - P(i) = 3 in both cases (previous slide).
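As a numeric check of the potential method, this Python sketch replays a run of inserts with P = 1 + 2F - S/2 and the costs assumed earlier (insert costs 1, rehashing N items costs N). The function names potential and amortized_costs are hypothetical, chosen for this illustration.

```python
def potential(filled, size):
    """P = 1 + 2F - S/2; size is always even here, so // is exact."""
    return 1 + 2 * filled - size // 2


def amortized_costs(num_inserts, start_size=2):
    """Return (actual_cost, amortized_cost) for each insert in order."""
    size, filled, results = start_size, 0, []
    for _ in range(num_inserts):
        before = potential(filled, size)
        filled += 1
        actual = 1                  # cost of the insert itself
        if 2 * filled >= size:      # 50% full or more: rehash
            actual += filled        # move every item once
            size *= 2
        after = potential(filled, size)
        results.append((actual, actual + (after - before)))
    return results


results = amortized_costs(8)
```

The actual costs spike on the inserts that trigger a rehash, but in this replay the amortized cost comes out as 3 for every insert, matching the two cases worked out on the previous slide.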
