1
Selection in heaps and row-sorted matrices using soft heaps
Haim Kaplan, László Kozma, Or Zamir, Uri Zwick
Institute for Interdisciplinary Information Sciences, Tsinghua University
May 15, 2018
2
Generalized selection
Given $n$ items from a totally ordered domain, with a partial order on them already known, find the $k$-th smallest item, or the set of $k$ smallest items.
All algorithms in this talk are comparison-based.
[Figure: the known partial order, drawn from smaller items to larger items.]
3
Generalized selection
Given $n$ items from a totally ordered domain, with a partial order on them already known, find the $k$-th smallest item.
The generalized sorting problem is well understood: the information-theoretic lower bound is essentially tight.
The information-theoretic lower bound for generalized selection may be extremely loose (e.g., finding the minimum).
A general answer for generalized selection is not known.
Some interesting special cases have been studied.
4
Some interesting selection problems
Collection of sorted lists
Doubly sorted matrix
Binary min-heap
"Set of pairwise sums": $X+Y = \{\, x+y \mid x \in X,\ y \in Y \,\}$ ($X$, $Y$ are not sorted)
Studied by [Frederickson-Johnson (1982)] and [Frederickson (1993)].
We give simpler and somewhat improved algorithms.
5
Selecting the $k$-th smallest item in a heap
Trivial algorithm: $O(k \log k)$
Insert the root into an auxiliary priority queue $Q$.
Repeat $k$ times:
    Extract the minimum item from $Q$.
    Insert its two children into $Q$.
Returns the $k$ smallest items in sorted order.
[Figure: a binary heap whose items are marked as already extracted, currently in the heap $Q$, or not seen yet.]
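The trivial algorithm can be sketched in a few lines. This is a minimal illustration, assuming the input heap is stored in the usual array layout (children of index `i` at `2*i + 1` and `2*i + 2`); the function name `k_smallest_in_heap` is made up for this example, and Python's `heapq` plays the role of the auxiliary priority queue $Q$.

```python
import heapq

def k_smallest_in_heap(heap, k):
    """Return the k smallest items of an array-based binary min-heap, sorted.

    Assumes heap[i] <= heap[2*i + 1] and heap[i] <= heap[2*i + 2].
    """
    if k <= 0 or not heap:
        return []
    queue = [(heap[0], 0)]            # auxiliary priority queue Q of (key, index)
    result = []
    while queue and len(result) < k:
        key, i = heapq.heappop(queue)            # extract the minimum item from Q
        result.append(key)
        for child in (2 * i + 1, 2 * i + 2):     # insert its two children into Q
            if child < len(heap):
                heapq.heappush(queue, (heap[child], child))
    return result

heap = [1, 2, 4, 3, 5, 7, 6, 8, 9]
print(k_smallest_in_heap(heap, 4))    # prints [1, 2, 3, 4]
```

$Q$ never holds more than $k+1$ items, so each of the $O(k)$ queue operations costs $O(\log k)$, which gives the $O(k \log k)$ bound.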
6
Selecting the $k$-th smallest item in a heap
Non-trivial algorithm: $O(k)$ [Frederickson (1993)]
$k \log k \;\to\; k \log\log k \;\to\; k \log\log\log k \;\to\; k\,3^{\log^* k} \;\to\; k\,2^{\log^* k} \;\to\; k$
Matching the information-theoretic lower bound.
We get an $O(k)$-time algorithm by essentially running the trivial algorithm using a soft heap.
[Figure: a 15-item binary min-heap.]
7
Heaps vs. Soft Heaps [Chazelle '00] [Kaplan-Tarjan-Zwick '13]

Operation     Fibonacci heaps (Hollow heaps)   Soft heaps
Make-heap     $O(1)$                           $O(1)$
Insert        $O(1)$                           $O(1)$
Extract-min   $O(\log n)$                      $O(\log 1/\varepsilon)$
Meld          $O(1)$                           $O(1)$

All bounds are amortized.
Soft heaps may increase the keys of items. Items with increased keys are corrupt.
At most $\varepsilon I$ items in the heap are corrupt, where $I$ is the total number of insertions.
8
Previous applications of Soft Heaps
A deterministic $O(m\,\alpha(m,n))$-time algorithm for finding minimum spanning trees [Chazelle '00]
An optimal deterministic algorithm for finding minimum spanning trees (with a yet unknown running time) [Pettie-Ramachandran '02]
New selection and approximate sorting algorithms [Chazelle '00]
9
Deletions from a binary heap
[Animation, slides 9-18: in a 15-item binary min-heap, the root 7 is deleted and the hole is filled by repeatedly moving up the smaller child (9, then 10, then 12).]
19
Binary heaps with lists [Kaplan-Tarjan-Zwick '13]
Each node has a list of items.
All items in a node's list share the node's corrupt key.
The tree is heap ordered with respect to corrupt keys.
(Most) items in lists of length > 1 are corrupt.
[Figure: a binary tree in which each node shows its corrupt key and the original keys of the items in its list.]
20
Binary heaps with lists [Kaplan-Tarjan-Zwick '13]
The tree is heap ordered with respect to corrupt keys.
Deleting an item of smallest corrupt key is easy, until the list at the root becomes empty...
[Figure: the same tree as on the previous slide.]
21
Double even refill [Kaplan-Tarjan-Zwick '13]
When a node of even rank $k \ge t$ is empty, fill it twice, concatenating the two lists of items.
Move the list of the smaller child to the root.
Recursively fill the child.
If $k \ge t$ is even, do it again!
$t = \log(3/\varepsilon)$
Moving lists takes constant time.
[Animation, slides 21-23: an empty root is refilled from its smaller child, the child is refilled recursively, and a second list is then concatenated at the root.]
24
Controlling corruption
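The size of a node of rank $k$ is at most
$s_k = 2^{\lceil (k-t)/2 \rceil}$ if $k > t$, and $s_k = 1$ otherwise.
Claim: The number of nodes of rank $k$ is at most $n/2^k$.
Number of corrupt items: $n \sum_{k \ge t} \frac{s_k}{2^k} = O\!\left(\frac{n}{2^t}\right) = O(\varepsilon n)$, with $t = \log(3/\varepsilon)$.
[Figure: nodes of rank $\ge t$ hold the corrupt items; nodes of rank $< t$ hold uncorrupt items.]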
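As a sanity check, the sum can be evaluated explicitly. The sketch below rests on assumptions not legible on the slide: the size bound is read as $s_k = 2^{\lceil (k-t)/2 \rceil}$ for $k > t$ (the rounding is a guess), logarithms are base 2, and only nodes of rank $k > t$ are counted, since a node of size 1 holds no corrupt item.

```latex
\begin{align*}
\#\text{corrupt items}
  &\le \sum_{k > t} \frac{n}{2^k}\, s_k
   = \frac{n}{2^t} \sum_{j \ge 1} 2^{\lceil j/2 \rceil - j} \\
  &= \frac{n}{2^t} \Bigl( 1 + \tfrac12 + \tfrac12 + \tfrac14 + \tfrac14 + \cdots \Bigr)
   = \frac{3n}{2^t}
   = \varepsilon n
   \qquad \text{for } t = \log_2 \frac{3}{\varepsilon}.
\end{align*}
```

Under this reading the constant 3 in $t = \log(3/\varepsilon)$ is exactly what makes the bound come out as $\varepsilon n$.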
25
Soft Heaps - assumptions
Insert and Meld do not corrupt items.
All corruptions are caused by Extract-min, after the extraction.
Extract-min returns a list of newly corrupted items.
$(e, C) \leftarrow$ Extract-min($Q$)
    $e$: the item with the smallest (corrupt) key, extracted from the heap.
    $C$: the list of newly corrupt items, corrupted after extracting $e$. (They remain in the heap.)
$e.\mathit{corrupt}$: is $e$ corrupt?
26
Selection from a heap using a soft heap
Run the naïve algorithm using a soft heap.
When an item is extracted, insert its children, and the children of all items corrupted by the extraction, into the soft heap.
    $Q \leftarrow$ Soft-Heap($\varepsilon$)
    Insert($Q$, $r$)
    for $i \leftarrow 1$ to $k-1$:
        $(e, C) \leftarrow$ Extract-min($Q$)
        if not $e.\mathit{corrupt}$: $C \leftarrow C \cup \{e\}$
        for $e \in C$:
            Insert($Q$, $e.\mathit{left}$)
            Insert($Q$, $e.\mathit{right}$)
Claim 1: After $k-1$ iterations, the $k$ smallest items were inserted into $Q$.
Claim 2: The total number of items inserted into $Q$ is $O(k)$.
Finish by finding the $k$-th smallest among the inserted items, using a standard selection algorithm.
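The control flow of this algorithm can be tried out against a stand-in for the soft heap. The sketch below is illustrative only: `ToySoftHeap`, `Item` and `select_kth` are hypothetical names, the toy structure is not a real soft heap (its operations take linear time), and it merely obeys the assumptions stated in the talk: inserts never corrupt, corruption happens after an extraction, newly corrupted items are reported, and at most $\varepsilon I$ items get corrupted. The input heap is assumed to be in array layout, and `sorted` stands in for a linear-time selection algorithm.

```python
import heapq

class Item:
    def __init__(self, index, key):
        self.index = index      # position in the input heap array
        self.key = key          # current key; only ever raised
        self.corrupt = False

class ToySoftHeap:
    """Stand-in with the assumed interface; NOT a real soft heap."""
    def __init__(self, eps):
        self.eps, self.items = eps, []
        self.inserts = self.corruptions = 0

    def insert(self, item):
        self.items.append(item)
        self.inserts += 1

    def extract_min(self):
        e = min(self.items, key=lambda it: it.key)   # smallest current key
        self.items.remove(e)
        newly = []
        clean = [it for it in self.items if not it.corrupt]
        # Corrupt one more item only while the eps * I budget allows it.
        if clean and self.corruptions + 1 <= self.eps * self.inserts:
            victim = min(clean, key=lambda it: it.key)
            victim.key = max(it.key for it in self.items)   # raise its key
            victim.corrupt = True
            self.corruptions += 1
            newly.append(victim)
        return e, newly

def select_kth(heap, k, eps=0.25):
    """k-th smallest (1-based) of an array-based min-heap, 1 <= k <= len(heap)."""
    q = ToySoftHeap(eps)
    q.insert(Item(0, heap[0]))
    inserted = [heap[0]]
    for _ in range(k - 1):
        e, c = q.extract_min()
        if not e.corrupt:
            c.append(e)                 # C <- C u {e}
        for x in c:                     # insert children of e and of corrupted items
            for child in (2 * x.index + 1, 2 * x.index + 2):
                if child < len(heap):
                    q.insert(Item(child, heap[child]))
                    inserted.append(heap[child])
    return sorted(inserted)[k - 1]      # stand-in for linear-time selection

data = [(7 * i) % 31 for i in range(31)]    # a permutation of 0..30
heapq.heapify(data)
print(select_kth(data, 5))                  # 5th smallest of 0..30, i.e. 4
```

An item that is extracted while corrupt already had its children inserted when it became corrupt, which is why its children are not inserted a second time.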
27
Selection from a heap using a soft heap
Proof of Claim 1
Claim 1: After $k-1$ iterations, the $k$ smallest items were inserted into $Q$.
After each iteration, $Q$ contains a barrier of uncorrupt items, and possibly some corrupt items above them. All other items above the barrier were already extracted from $Q$.
The item extracted at each iteration is smaller than or equal to the smallest item on the barrier. (We rely on the assumption that corruption occurs only after extractions.)
After $i$ iterations, the rank of the smallest item on the barrier is at least $i+1$.
After $k-1$ iterations, the rank of the smallest item on the barrier is at least $k$.
After $k-1$ iterations, the $k$ smallest items must be on or above the barrier, i.e., they were inserted into $Q$, as claimed.
[Figure: the input heap with the barrier of uncorrupt items and the corrupt items above it.]
28
Selection from a binary heap
Proof of Claim 2
Claim 2: The total number of items inserted into $Q$ is $O(k)$.
$I$ - number of insertions
$C$ - number of corrupt items
$I < 2k + 2C$
$C < k + \varepsilon I$
Hence $I < \frac{4}{1-2\varepsilon}\,k$ and $C < \frac{1+2\varepsilon}{1-2\varepsilon}\,k$.
It is thus enough to take $\varepsilon < \frac{1}{2}$, e.g., $\varepsilon = \frac{1}{4}$.
Each soft heap operation takes $O(1)$ time.
Total running time (and number of comparisons) is $O(k)$. Simple!
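The algebra behind the two closed-form bounds is short. The derivation below is a sketch that reads $C$ as the number of items that ever become corrupt and assumes $\varepsilon < 1/2$, so that $1 - 2\varepsilon > 0$.

```latex
\begin{align*}
I &< 2k + 2C < 2k + 2(k + \varepsilon I) = 4k + 2\varepsilon I
  &&\Longrightarrow\quad I < \frac{4}{1-2\varepsilon}\,k, \\
C &< k + \varepsilon I < k + \frac{4\varepsilon}{1-2\varepsilon}\,k
  &&\Longrightarrow\quad C < \frac{1+2\varepsilon}{1-2\varepsilon}\,k .
\end{align*}
```

For $\varepsilon = 1/4$ this gives $I < 8k$ and $C < 3k$.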
29
Row-sorted matrices
Select the $k$-th smallest item from a collection of $m$ sorted lists.
$O(m+k)$, $O(m \log(k/m))$ [Frederickson-Johnson (1982)]
We immediately get $O(m+k)$ using soft heaps.
We also get a simple $O(m \log(k/m))$ algorithm.
New "output-sensitive" result: $O\!\left(m + \sum_{i=1}^{m} \log(k_i + 1)\right)$, where $k_i$ is the number of items in the $i$-th row that are among the $k$ smallest.
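For comparison, here is a baseline that uses an ordinary binary heap instead of a soft heap. It is not one of the algorithms of the talk: it runs in $O(m + k \log m)$ time and only makes the problem statement concrete. The function name `kth_smallest_in_sorted_rows` is made up for this example.

```python
import heapq

def kth_smallest_in_sorted_rows(rows, k):
    """Return the k-th smallest item (1-based) among the sorted lists `rows`.

    Baseline with an ordinary heap; assumes 1 <= k <= total number of items.
    """
    # One frontier entry per non-empty row: (value, row index, column index).
    frontier = [(row[0], r, 0) for r, row in enumerate(rows) if row]
    heapq.heapify(frontier)                      # O(m)
    for _ in range(k - 1):                       # discard the k-1 smallest items
        _, r, c = heapq.heappop(frontier)
        if c + 1 < len(rows[r]):                 # advance within the same row
            heapq.heappush(frontier, (rows[r][c + 1], r, c + 1))
    return frontier[0][0]

rows = [[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]]
print(kth_smallest_in_sorted_rows(rows, 5))      # prints 5
```

Each row behaves like a path in a heap-ordered forest: an item's only child is its successor in the row.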
30
Row-sorted matrices - $O(m \log(k/m))$
Split each row into blocks of size $k/(2m)$.
Select the smallest $2m$ block leaders. (Using the $O(m+k)$ algorithm.)
The $k$ smallest items must reside in at least $2m$ blocks. Thus, the selected $2m$ leaders must be among the $k$ smallest.
At least $m$ blocks, i.e., the non-last blocks in each row, are fully contained in the set of $k$ smallest, and can be eliminated.
In $O(m)$ time, $k$ was reduced to about $k/2$.
After $\log(k/m)$ iterations, $k$ is down to $m$, and we use the previous algorithm.
31
Row-sorted matrices - $O\!\left(m + \sum_{i=1}^{m} \log n_i\right)$
Let $n_i$ be the length of the $i$-th list.
Long rows: $n_i \ge k/(2m)$. Let $m'$ be the number of long rows.
The short rows contain at most $k/2$ items and are put "on hold".
The long rows contain at least $k/2$ items of the solution.
Split the long rows into blocks of size $k/(4m')$.
The $k/2$ smallest items in the long rows reside in at least $2m'$ blocks.
Select the $2m'$ smallest leaders in the long rows.
At least $m'$ non-last blocks, containing at least $k/4$ items, are eliminated.
32
Row-sorted matrices - $O\!\left(m + \sum_{i=1}^{m} \log n_i\right)$
Let $n_i$ be the length of the $i$-th list.
The cost of an iteration is $O(m')$, proportional to the number of participating rows.
Each iteration reduces $k$ to at most $3k/4$.
The threshold ($= k/(2m)$) decreases exponentially to a constant.
Row $i$ participates in at most the last $O(\log n_i)$ iterations.
Total running time is $O\!\left(m + \sum_{i=1}^{m} \log n_i\right)$.
33
Row-sorted matrices - $O\!\left(m + \sum_{i=1}^{m} \log(k_i + 1)\right)$
Select the $k$ smallest items from a collection of $m$ sorted lists.
Let $k_i$ be the number of items selected from the $i$-th row.
Let $L = \sum_{i=1}^{m} \log(k_i + 1)$.
Split each row into blocks of sizes $1, 2, 4, \ldots$
Note that $L$ is exactly the number of blocks that cover the $k$ smallest items.
If we know $L$, or a tight upper bound $\ell$ on it, we could select the $\ell$ smallest leaders and then apply the selection algorithm from earlier in the talk.
If $L$ is not known, try $\ell = m, 2m, 4m, \ldots$
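The block count can be checked numerically. The sketch below assumes base-2 logarithms and reads the per-row term as $\lceil \log_2(k_i + 1) \rceil$, since the first $j$ blocks of sizes $1, 2, 4, \ldots$ cover $2^j - 1$ items; the function names are hypothetical.

```python
def blocks_to_cover(k_i):
    """Number of leading blocks of sizes 1, 2, 4, ... needed to cover k_i items."""
    blocks, covered, size = 0, 0, 1
    while covered < k_i:
        covered += size     # take the next block
        size *= 2           # block sizes double
        blocks += 1
    return blocks

def total_blocks(ks):
    """The quantity L for per-row counts ks = [k_1, ..., k_m]."""
    return sum(blocks_to_cover(k_i) for k_i in ks)

ks = [0, 1, 5, 8]
print(total_blocks(ks))                      # prints 8
# ceil(log2(k + 1)) equals k.bit_length() for every integer k >= 0.
print(sum(k.bit_length() for k in ks))       # prints 8
```

A row contributing $k_i$ items therefore costs only about $\log k_i$ blocks, which is where the output-sensitive bound comes from.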
34
Concluding remarks
Results for general partial orders?
More applications of Soft Heaps?
Thank you for your attention!