ANALYSIS OF SOFT HEAP Varun Mishra April 16, 2009
Outline: What is a Soft Heap? Data Structure. Heap Operations: insert, meld, deletemin, sift. Complexity Bounds. Applications.
What is a Soft Heap? A sequence of heap-ordered “binarized” binomial trees (soft queues), possibly with some subtrees missing. Achieves amortized constant time for meld, delete and findmin, and O(log 1/ε) amortized time for insert. At most εn items may be corrupted, where n is the number of inserts. Uses the concept of “car pooling” to beat the logarithmic time bound. Used for median finding, computing the MST of a graph, and approximate sorting.
Data Structure. Each item in the head list has a suffix-min pointer. An entire item-list can be stored at each node. Heap ordering is on the common key. (Figure: nodes holding item-lists 3,2,4 and 13,8,15,24.)
Additional points. Binomial trees are arranged in the head-list in increasing order of rank. No two trees have the same rank. The rank of a node is defined as the rank of the corresponding target node in a binomial tree. The items in the item-list whose key is less than the node’s key are corrupted. All item-list members move together to implement “car pooling”. (Figure labels: 4; 2,3,4.)
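The corruption rule above can be sketched in a few lines of Python. The class name `Node` and the fields `ckey` and `items` are hypothetical names chosen for illustration; the example values assume the figure labels describe a node with common key 4 and item-list 2,3,4.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One soft-heap node: a common key shared by its whole item-list."""
    ckey: int                                       # common key used for heap ordering
    items: List[int] = field(default_factory=list)  # original keys of the items

    def corrupted(self) -> List[int]:
        # An item is corrupted when its original key is below the common key.
        return [k for k in self.items if k < self.ckey]


# Assumed reading of the figure: common key 4, item-list 2,3,4.
node = Node(ckey=4, items=[2, 3, 4])
print(node.corrupted())
```

Under this reading, items 2 and 3 travel under the larger key 4, which is what "car pooling" trades for speed.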
Heap Operations: Insert. Create a single node and meld it with the remaining soft heap.
Meld. Break up the heap with the lower rank and meld each of its soft queues into the other heap. Insert each queue h into the head list so that the order of ranks is maintained. Perform carry propagation if required. Call UpdateSuffix_min.
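Carry propagation during meld behaves like binary addition on ranks. The sketch below is a deliberate simplification: it models each heap only by the list of ranks of its queues (assumed to be non-negative integers) and ignores keys, item-lists and suffix_min pointers. The function name `meld_ranks` is hypothetical.

```python
from collections import Counter
from typing import List


def meld_ranks(a: List[int], b: List[int]) -> List[int]:
    """Return the ranks present after melding two heaps.

    Two queues of rank k combine into one queue of rank k + 1 (the carry),
    so the result never holds two queues of the same rank.
    """
    counts = Counter(a)
    counts.update(b)
    result = []
    rank = 0
    while sum(counts.values()) > 0:
        c = counts.pop(rank, 0)
        if c % 2 == 1:
            result.append(rank)         # one queue of this rank remains
        if c >= 2:
            counts[rank + 1] += c // 2  # each pair combines into the next rank
        rank += 1
    return result


print(meld_ranks([0, 1, 3], [0, 1]))
```

Reading the rank lists as binary numbers, the example adds 1011 and 0011 to get 1110, so the melded heap has queues of ranks 1, 2 and 3.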
Update Suffix_min. Let h be the head of the last queue that was modified. Update the suffix_min pointers in a backward fashion:
UpdateSuffix_min(h) {
  if (h == NULL) return;
  if (h->next == NULL || key[h] <= key[suffix_min[h->next]]) suffix_min[h] = h;
  else suffix_min[h] = suffix_min[h->next];
  UpdateSuffix_min(h->prev);
}
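A minimal runnable sketch of the backward suffix_min update, in Python. The `Head` class and the `link` helper are hypothetical scaffolding, and `key` stands in for the common key of each queue's root. The loop is iterative rather than recursive, ties are resolved in favour of the earlier head, and the last head points to itself; all three are assumptions of this sketch.

```python
from typing import List, Optional


class Head:
    """Head-list entry; `key` stands for the common key of the queue's root."""

    def __init__(self, key: int) -> None:
        self.key = key
        self.prev: Optional["Head"] = None
        self.next: Optional["Head"] = None
        self.suffix_min: "Head" = self


def link(keys: List[int]) -> List[Head]:
    """Build a doubly linked head list from the given root keys."""
    heads = [Head(k) for k in keys]
    for left, right in zip(heads, heads[1:]):
        left.next, right.prev = right, left
    return heads


def update_suffix_min(h: Optional[Head]) -> None:
    """Walk backwards from h, recomputing suffix_min for h and all earlier heads."""
    while h is not None:
        if h.next is None or h.key <= h.next.suffix_min.key:
            h.suffix_min = h
        else:
            h.suffix_min = h.next.suffix_min
        h = h.prev


heads = link([7, 3, 9, 5])
update_suffix_min(heads[-1])
print([h.suffix_min.key for h in heads])  # prints [3, 3, 5, 5]
```

Only the heads at or before the modified queue need updating, which is why the walk starts at h and moves towards the front of the list.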
Delete and Findmin. Delete: simply mark the element to be deleted. Findmin: find the smallest unmarked item. A variant, DeleteMin, will be implemented.
DeleteMin. Follow the suffix_min pointer from the beginning of the head-list and delete an item from the item-list of the root. If the item-list is empty, we need to refill it. Before doing so, we check whether the following rank invariant holds: #(children) of root >= Rank(root)/2. If not, we dismantle the root and meld its children back into the heap.
Root dismantling.
if (childcount_h < rank(h)/2) {
  h->prev->next = h->next;
  h->next->prev = h->prev;
  UpdateSuffix_min(h->prev);
  tmp = h;
  while (tmp->next != NULL) {
    meld(tmp->child);
    tmp = tmp->next;
  }
}
Sift: refilling the item-list. Append the item-list of v->next to the item-list of node v. Copy key(v->next) into key(v). (Swap the child and next pointers if necessary.) If (rank(v) > r and rank(v) % 2 == 1), call sift again; this extra call is what results in corruption. Here r is defined as r = 2 + 2⌈log(1/ε)⌉. This ensures that corruption occurs only at lower depths of the heap.
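To make the threshold r concrete, here is a small Python sketch. It reads the brackets in the definition of r as a ceiling and takes the logarithm base 2; both readings are assumptions, and the function names are hypothetical.

```python
import math


def rank_threshold(eps: float) -> int:
    """r = 2 + 2 * ceil(log2(1 / eps)); base-2 log and ceiling are assumed."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    return 2 + 2 * math.ceil(math.log2(1 / eps))


def sifts_twice(rank: int, r: int) -> bool:
    """The slide's condition for the extra (corrupting) sift call."""
    return rank > r and rank % 2 == 1


r = rank_threshold(0.25)
print(r, [rank for rank in range(12) if sifts_twice(rank, r)])
```

With ε = 1/4 the threshold is 6, so only odd ranks above 6 trigger the second sift call; a smaller ε pushes the threshold up and confines corruption to fewer nodes.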
Sift. Suppose DeleteMin has been called in the current setting, so the item-list at the root is empty.
Sift. The value at the leaf node is set to Φ.
Sift. Φ and 8 are swapped.
Sift. 8 is copied upwards into the item-list of the parent.
Sift. Φ is copied upwards into the item-list of the parent. Now 8 can be deleted.
Sift - II. Consider the other scenario, where the node v has rank > r. Sift is called again.
Sift - II. (Figure: tree with node labels Φ, Φ, 8, 8; node v marked.)
Sift - II. Nodes with key Φ are pruned.
Sift - II. Swap Φ and 9.
Sift - II. Append 9 to the item-list of v. Note that 8 is now a corrupted key.
Complexity Bounds. |item-list(v)| <= max {1, 2^(⌈rank(v)/2⌉ - r/2)} (can be shown by induction). #(corrupted items) <= εn. (Nodes with corrupted keys <= 1/2 r. Together with the definition of r and the bound on the item-list size, we can prove it.)
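The item-list bound can be tabulated with a short Python sketch. It reads the exponent as ⌈rank(v)/2⌉ - r/2 and assumes r is even, which holds if r has the form 2 + 2k; the function name is hypothetical.

```python
def item_list_bound(rank: int, r: int) -> int:
    """max(1, 2 ** (ceil(rank / 2) - r / 2)), assuming r is even."""
    exponent = (rank + 1) // 2 - r // 2   # (rank + 1) // 2 is ceil(rank / 2)
    return 2 ** exponent if exponent > 0 else 1


for rank in (3, 6, 7, 10):
    print(rank, item_list_bound(rank, r=6))
```

Nodes of rank at most r keep a single item, so they cannot hold corrupted items; above r the allowed item-list size doubles with every two additional ranks.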
Meld. We will show that the total time taken for all melds is O(n). The entire sequence of soft heap melds can be modeled as a binary tree, so MeldCost(x) = 1 + min { cost(Size[y]), cost(Size[z]) }. Total cost <= ∑_{k=1}^{H} k * log(n/2^k) = O(n), where H = log(n). We can charge the dismantle-induced melds against the absent leaves.
Sift. The amortized cost of all item-list refilling operations is O(rn). Every call to sift takes O(r) time and increases the size of the item-list at a node by at least 1. So there can be at most n calls to sift, and the overall time complexity is O(rn). It follows that all operations can be done in amortized constant time except insert, which takes O(log 1/ε) and pays for the sift and eventual deletion of that element.
Applications: Finding the Median or kth Largest Element. Insert the elements into a soft heap with error rate 1/3. Our aim is to find a good pivot element. Call delete-min n/3 times. The largest element deleted has rank between n/3 and 2n/3, so after each iteration we can remove at least n/3 items from consideration. The overall running time is n + (2/3)n + (2/3)^2 n + ... = O(n).
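The running-time claim is a geometric series. As a sketch, assume each round costs time proportional to the number of items still under consideration (constants suppressed) and that at most 2/3 of the items survive each round; T(n) is a hypothetical name for the total cost.

```latex
\[
T(n) \;\le\; n + \tfrac{2}{3}\,n + \left(\tfrac{2}{3}\right)^{2} n + \cdots
      \;=\; n \sum_{i \ge 0} \left(\tfrac{2}{3}\right)^{i}
      \;=\; \frac{n}{1 - \tfrac{2}{3}}
      \;=\; 3n \;=\; O(n).
\]
```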
Applications: Approximate Sorting. Insert the n items into a soft heap and delete the minimum item repeatedly. The total number of inversions is bounded by εn^2, so for a fixed ε we can do approximate sorting in O(n) time with at most εn^2 inversions. Minimum Spanning Tree in O(m * c(m,n)) time, where c is the classical inverse of Ackermann’s function. This is one of the fastest known deterministic algorithms for computing the MST.
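To check an output sequence against the εn^2 inversion budget, one can count inversions directly. The Python sketch below uses a standard merge-sort counter; it does not implement a soft heap, and the names `count_inversions` and `within_budget` are hypothetical.

```python
from typing import List, Tuple


def count_inversions(seq: List[int]) -> int:
    """Count pairs i < j with seq[i] > seq[j], via merge sort in O(n log n)."""

    def sort(a: List[int]) -> Tuple[List[int], int]:
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_left = sort(a[:mid])
        right, inv_right = sort(a[mid:])
        merged, inv = [], inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                inv += len(left) - i  # right[j] precedes all remaining left items
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort(list(seq))[1]


def within_budget(seq: List[int], eps: float) -> bool:
    """Check the bound from the slide: at most eps * n^2 inversions."""
    n = len(seq)
    return count_inversions(seq) <= eps * n * n


print(count_inversions([2, 1, 3, 5, 4]), within_budget([2, 1, 3, 5, 4], 0.1))
```

The example sequence has two inversions, (2,1) and (5,4), which fits a budget of 0.1 * 25 = 2.5.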
Thank You