Selection in heaps and row-sorted matrices


1 Selection in heaps and row-sorted matrices
using soft heaps
Haim Kaplan, László Kozma, Or Zamir, Uri Zwick (武熠)
Institute for Interdisciplinary Information Sciences, Tsinghua University
May 15, 2018

2 Generalized selection
Given $n$ items from a totally ordered domain, with a partial order on them already known, find the $k$-th smallest item, or the set of $k$ smallest items.
All algorithms in this talk are comparison-based.
[Figure: the known partial order, drawn from smaller items to larger items.]

3 Generalized selection
Given $n$ items from a totally ordered domain, with a partial order on them already known, find the $k$-th smallest item.
The generalized sorting problem is well understood: its information-theoretic lower bound is essentially tight.
For generalized selection, the information-theoretic lower bound may be extremely loose (e.g., for finding the minimum).
No general answer for generalized selection is known, but some interesting special cases have been studied.

4 Some interesting selection problems
Collection of sorted lists
Doubly sorted matrix
Binary min-heap
“Set of pairwise sums” $X+Y = \{\, x+y \mid x \in X,\ y \in Y \,\}$ ($X$, $Y$ are not sorted)
Studied by [Frederickson-Johnson (1982)] and [Frederickson (1993)].
We give simpler and somewhat improved algorithms.

5 Selecting the $k$-th smallest item in a heap
[Figure: a binary min-heap with keys 1 to 9; nodes are marked as already extracted, currently in the heap, or not seen yet.]
Trivial algorithm, $O(k \log k)$ time:
Insert the root into an auxiliary priority queue $Q$.
Repeat $k$ times: extract the minimum item from $Q$ and insert its two children into $Q$.
Returns the $k$ smallest items in sorted order.
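The trivial algorithm can be sketched in a few lines of Python. This is only an illustration: it assumes the input heap is stored as an array with the children of position `i` at `2*i + 1` and `2*i + 2`, and the function name `k_smallest_in_heap` is made up for this example.

```python
import heapq

def k_smallest_in_heap(heap, k):
    """Return the k smallest keys of an array-based binary min-heap, sorted."""
    if not heap or k <= 0:
        return []
    out = []
    q = [(heap[0], 0)]                 # auxiliary priority queue Q: (key, position)
    while q and len(out) < k:
        key, i = heapq.heappop(q)      # extract the minimum item from Q
        out.append(key)
        for c in (2 * i + 1, 2 * i + 2):
            if c < len(heap):          # insert its (up to two) children into Q
                heapq.heappush(q, (heap[c], c))
    return out
```

The queue never holds more than $k+1$ items, so each of the $O(k)$ queue operations costs $O(\log k)$.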

6 Selecting the $k$-th smallest item in a heap
[Figure: a binary heap with 15 nodes.]
Non-trivial algorithm: $O(k)$ [Frederickson (1993)]
$k \log k \to k \log\log k \to k \log\log\log k \to k\,3^{\log^* k} \to k\,2^{\log^* k} \to k$
Matching the information-theoretic lower bound.
We get an $O(k)$-time algorithm by essentially running the trivial algorithm using a soft heap.

7 Heaps vs. Soft Heaps
Soft heaps: [Chazelle ’00], [Kaplan-Tarjan-Zwick ’13]
Operation: Fibonacci heaps (hollow heaps) | Soft heaps
Make-heap: $O(1)$ | $O(1)$
Insert: $O(1)$ | $O(1)$
Extract-min: $O(\log n)$ | $O(\log(1/\varepsilon))$
Meld: $O(1)$ | $O(1)$
All bounds are amortized.
Soft heaps may increase the keys of items. Items with increased keys are corrupt.
At most $\varepsilon I$ items in the heaps are corrupt, where $I$ is the total number of insertions.

8 Previous applications of Soft Heaps
A deterministic $O(m\,\alpha(m,n))$-time algorithm for finding minimum spanning trees [Chazelle ’00]
An optimal deterministic algorithm for finding minimum spanning trees (with a yet unknown running time) [Pettie-Ramachandran ’02]
New selection and approximate sorting algorithms [Chazelle ’00]

9-18 Deletions from a binary heap
[Figure: ten animation frames (slides 9 to 18) showing successive deletions from a 15-node binary heap; only the node keys survive in the transcript.]

19 Binary heaps with lists [Kaplan-Tarjan-Zwick ’13]
Each node has a list of items. (Most) items in lists of length > 1 are corrupt.
The tree is heap-ordered with respect to corrupt keys.
[Figure: a binary tree whose nodes hold lists of items; labels mark the original keys and the corrupt key shared by all items in a list.]

20 Binary heaps with lists [Kaplan-Tarjan-Zwick ’13]
The tree is heap-ordered with respect to corrupt keys.
Deleting an item of smallest corrupt key is easy, until the list at the root becomes empty…
[Figure: the same tree, with the corrupt key of all items in each list marked.]

21 Double even refill [Kaplan-Tarjan-Zwick ’13]
When a node of even rank $k \ge t$ is empty, fill it twice, concatenating the two lists of items.
Move the list of the smaller child to the root.
[Figure: a tree with an empty root of rank $k$ and item lists at its children.]

22 Double even refill [Kaplan-Tarjan-Zwick ’13]
When a node of even rank $k \ge t$ is empty, fill it twice, concatenating the two lists of items.
Recursively fill the child.
[Figure: the same tree after the child's list has moved to the root.]

23 Double even refill [Kaplan-Tarjan-Zwick ’13]
When a node of even rank $k \ge t$ is empty, fill it twice, concatenating the two lists of items.
If $k \ge t$ is even, do it again!
$t = \lceil \log (3/\varepsilon) \rceil$
Moving lists takes constant time.
[Figure: the same tree after the second fill, with the two lists concatenated at the root.]

24 Controlling corruption
The size of a node of rank $k$ is at most $s_k = 2^{\lceil (k-t)/2 \rceil}$ if $k > t$, and $s_k = 1$ otherwise.
Nodes of rank $< t$ hold only uncorrupt items; corrupt items appear only in nodes of rank $\ge t$.
Claim: the number of nodes of rank $k$ is at most $n/2^k$.
Number of corrupt items: $n \sum_{k \ge t} \frac{s_k}{2^k} = O\!\left(\frac{n}{2^t}\right) = O(\varepsilon n)$, with $t = \lceil \log (3/\varepsilon) \rceil$.
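The sum bounding the number of corrupt items can be worked out explicitly. The derivation below is a sketch under two assumptions that are readings of the slide rather than confirmed statements: the size bound is $s_k = 2^{\lceil (k-t)/2 \rceil}$ for $k > t$ (and $1$ otherwise), and the logarithm in $t = \lceil \log_2 (3/\varepsilon) \rceil$ is base 2. It substitutes $j = k - t$.

```latex
\begin{align*}
n \sum_{k \ge t} \frac{s_k}{2^k}
  &= \frac{n}{2^t} \sum_{j \ge 0} 2^{\lceil j/2 \rceil - j}
   = \frac{n}{2^t} \sum_{j \ge 0} 2^{-\lfloor j/2 \rfloor} \\
  &= \frac{n}{2^t} \cdot 2 \sum_{i \ge 0} 2^{-i}
   = \frac{4n}{2^t} \\
  &\le \frac{4}{3}\,\varepsilon n = O(\varepsilon n)
   \qquad \text{since } 2^t \ge 3/\varepsilon .
\end{align*}
```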

25 Soft Heaps – assumptions
Insert and Meld do not corrupt items.
All corruptions are caused by Extract-min, after the extraction.
Extract-min returns a list of newly corrupted items: $(e, C) \leftarrow \text{Extract-min}(Q)$
$e$: the item with the smallest (corrupt) key, extracted from the heap.
$C$: the list of newly corrupt items, corrupted after extracting $e$. (They remain in the heap.)
$e.corrupt$: is $e$ corrupt?

26 Selection from a heap using a soft heap
Run the naïve algorithm using a soft heap. When an item is extracted, insert its children, and the children of all items corrupted by the extraction, into the soft heap. Here $r$ is the root of the input heap.
    Q ← Soft-Heap(ε)
    Insert(Q, r)
    for i ← 1 to k−1:
        (e, C) ← Extract-min(Q)
        if not e.corrupt:
            C ← C ∪ {e}
        for c ∈ C:
            Insert(Q, c.left)
            Insert(Q, c.right)
Claim 1: After $k-1$ iterations, the $k$ smallest items were inserted into $Q$.
Claim 2: The total number of items inserted into $Q$ is $O(k)$.
Finish by finding the $k$-th smallest among the inserted items, using a standard selection algorithm.
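The driver loop can be sketched in Python as follows. This is a hedged illustration, not a soft heap implementation: `ExactHeapStandIn` is a hypothetical stand-in that exposes the interface assumed on the slides (Extract-min reports the extracted item, whether it was corrupt, and the newly corrupted items) but never corrupts anything, i.e. it behaves like a soft heap with $\varepsilon = 0$. The input heap is assumed to be array-based, with the children of position `i` at `2*i + 1` and `2*i + 2`.

```python
import heapq

class ExactHeapStandIn:
    """Hypothetical stand-in for a soft heap; it never corrupts an item."""
    def __init__(self):
        self._q = []

    def insert(self, key, node):
        heapq.heappush(self._q, (key, node))

    def extract_min(self):
        _, node = heapq.heappop(self._q)
        # (extracted node, was it corrupt?, nodes newly corrupted by this call)
        return node, False, []

def select_kth(heap, k, make_queue=ExactHeapStandIn):
    """k-th smallest key (1-based) of an array-based binary min-heap."""
    n = len(heap)
    if not 1 <= k <= n:
        raise ValueError("k out of range")
    q = make_queue()
    q.insert(heap[0], 0)
    inserted = [0]                      # positions ever inserted into Q
    for _ in range(k - 1):
        e, corrupt, newly_corrupt = q.extract_min()
        expand = list(newly_corrupt)
        if not corrupt:                 # children of a corrupt e were inserted earlier
            expand.append(e)
        for v in expand:
            for c in (2 * v + 1, 2 * v + 2):
                if c < n:
                    q.insert(heap[c], c)
                    inserted.append(c)
    # Sorting is used here for brevity; a linear-time selection
    # algorithm would be needed to keep the overall bound at O(k).
    return sorted(heap[i] for i in inserted)[k - 1]
```

Passing a real soft heap with the same interface as `make_queue` would leave the loop unchanged; only the number of inserted items grows, by the constant factor bounded in Claim 2.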

27 Selection from a heap using a soft heap
Proof of Claim 1
Claim 1: After $k-1$ iterations, the $k$ smallest items were inserted into $Q$.
After each iteration, $Q$ contains a barrier of uncorrupt items, and possibly some corrupt items above them. All other items above the barrier were already extracted from $Q$.
[Figure: the input heap with the barrier and the corrupt items above it marked.]
The item extracted at each iteration is smaller than or equal to the smallest item on the barrier. (We rely on the assumption that corruption occurs only after extractions.)
After $i$ iterations, the rank of the smallest item on the barrier is at least $i+1$.
In particular, after $k-1$ iterations it is at least $k$.
Hence, after $k-1$ iterations, the $k$ smallest items must be on or above the barrier, i.e., they were inserted into $Q$, as claimed.

28 Selection from a binary heap Proof of Claim 2
Claim 2: The total number of items inserted into $Q$ is $O(k)$.
$I$: number of insertions. $C$: number of corrupt items.
$I < 2k + 2C$ and $C < k + \varepsilon I$, hence
$I < \frac{4}{1-2\varepsilon}\,k$ and $C < \frac{1+2\varepsilon}{1-2\varepsilon}\,k$.
It is thus enough to take $\varepsilon < \frac{1}{2}$, e.g., $\varepsilon = \frac{1}{4}$. Each soft heap operation then takes $O(1)$ time.
The total running time (and number of comparisons) is $O(k)$. Simple!
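The closed-form bounds on $I$ and $C$ follow by substituting the two inequalities $I < 2k + 2C$ and $C < k + \varepsilon I$ into each other. A short worked derivation, assuming $\varepsilon < \frac{1}{2}$ so that $1 - 2\varepsilon > 0$:

```latex
\begin{align*}
I &< 2k + 2C, \qquad C < k + \varepsilon I \\
I &< 2k + 2(k + \varepsilon I) = 4k + 2\varepsilon I
  \;\Longrightarrow\; I < \frac{4}{1-2\varepsilon}\,k \\
C &< k + \varepsilon \cdot \frac{4}{1-2\varepsilon}\,k
   = \frac{1+2\varepsilon}{1-2\varepsilon}\,k \\
\varepsilon = \tfrac{1}{4}:&\quad I < 8k, \qquad C < 3k
\end{align*}
```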

29 Row-sorted matrices
Select the $k$-th smallest item from a collection of $m$ sorted lists.
$O(m+k)$, $O\!\left(m \log \frac{k}{m}\right)$ [Frederickson-Johnson (1982)]
We immediately get $O(m+k)$ using soft heaps.
We also get a simple $O\!\left(m \log \frac{k}{m}\right)$ algorithm.
New “output-sensitive” result: $O\!\left(m + \sum_{i=1}^{m} \log (k_i + 1)\right)$, where $k_i$ is the number of items in the $i$-th row that are among the $k$ smallest.
[Figure: a matrix with $m$ sorted rows.]
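For reference, the problem statement can be pinned down with a simple heap-based baseline. The sketch below is not the soft-heap algorithm from the talk: it merges the rows with an ordinary binary heap (`heapq.merge`), which costs roughly $O(m + k \log m)$ rather than $O(m + k)$. The function name `kth_smallest_baseline` is made up for this example.

```python
import heapq
from itertools import islice

def kth_smallest_baseline(rows, k):
    """k-th smallest item (1-based) among m sorted lists, via a lazy k-way merge."""
    total = sum(len(r) for r in rows)
    if not 1 <= k <= total:
        raise ValueError("k out of range")
    merged = heapq.merge(*rows)          # lazy merge of the sorted rows
    return next(islice(merged, k - 1, None))
```

A baseline like this is handy as a correctness oracle when testing faster selection routines.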

30 Row-sorted matrices - $O\!\left(m \log \frac{k}{m}\right)$
Split each row into blocks of size $\frac{k}{2m}$. The leader of a block is its first (smallest) item.
Select the $2m$ smallest block leaders (using the $O(m+k)$ algorithm).
The $k$ smallest items must reside in at least $2m$ blocks. Thus, the selected $2m$ leaders must be among the $k$ smallest.
At least $m$ blocks, i.e., the non-last selected blocks in each row, are fully contained in the set of $k$ smallest items, and can be eliminated.
In $O(m)$ time, $k$ is reduced to about $k/2$.
After $\log \frac{k}{m}$ iterations, $k$ is down to $m$, and we use the previous algorithm.
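The block-elimination rounds can be sketched in Python as below. Several things here are assumptions made for the illustration: a block's leader is taken to be its first item, the block size is rounded down to `k // (2*m)`, and the $2m$ smallest leaders are selected with an ordinary binary heap, so a round costs $O(m \log m)$ instead of the $O(m)$ obtained with the soft-heap selection. The function name `kth_smallest_blocks` is hypothetical.

```python
import heapq

def kth_smallest_blocks(rows, k):
    """k-th smallest item (1-based) among sorted rows, by block elimination."""
    m = len(rows)
    if not 1 <= k <= sum(len(r) for r in rows):
        raise ValueError("k out of range")
    start = [0] * m                       # items already eliminated per row
    while k >= 2 * m:
        b = k // (2 * m)                  # block size (at least 1)
        # Block j of row i starts at start[i] + j*b; its leader is that item.
        q = [(rows[i][start[i]], i, 0)
             for i in range(m) if start[i] < len(rows[i])]
        heapq.heapify(q)
        picked = [0] * m                  # selected leaders per row
        for _ in range(2 * m):            # select the 2m smallest leaders
            _, i, j = heapq.heappop(q)
            picked[i] += 1
            nxt = start[i] + (j + 1) * b
            if nxt < len(rows[i]):
                heapq.heappush(q, (rows[i][nxt], i, j + 1))
        for i in range(m):
            if picked[i] > 1:             # all but the last selected block of the row
                gone = (picked[i] - 1) * b
                start[i] += gone          # these items are among the k smallest
                k -= gone
    # k < 2m now: finish with a plain heap-based merge of what is left.
    tails = (rows[i][start[i]:] for i in range(m))
    for pos, x in enumerate(heapq.merge(*tails), 1):
        if pos == k:
            return x
```

Each round removes at least $m$ full blocks, all of whose items are among the current $k$ smallest, so the answer is unchanged when $k$ is reduced by the number of removed items.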

31 Row-sorted matrices - $O\!\left(m + \sum_{i=1}^{m} \log n_i\right)$
Let $n_i$ be the length of the $i$-th list.
Long rows: $n_i \ge k/(2m)$. Let $m'$ be the number of long rows.
The short rows contain at most $k/2$ items and are put “on hold”.
The long rows contain at least $k/2$ items of the solution.
Split the long rows into blocks of size $k/(4m')$.
The $k/2$ smallest items in the long rows reside in at least $2m'$ blocks.
Select the $2m'$ smallest leaders in the long rows.
At least $m'$ non-last blocks, containing at least $k/4$ items, are eliminated.

32 Row-sorted matrices - $O\!\left(m + \sum_{i=1}^{m} \log n_i\right)$
Let $n_i$ be the length of the $i$-th list.
The cost of an iteration is $O(m')$, proportional to the number of participating rows.
Each iteration reduces $k$ to at most $3k/4$.
The threshold ($= k/(2m)$) decreases exponentially to a constant.
Row $i$ participates in at most the last $O(\log n_i)$ iterations.
The total running time is $O\!\left(m + \sum_{i=1}^{m} \log n_i\right)$.

33 Row-sorted matrices - $O\!\left(m + \sum_{i=1}^{m} \log (k_i + 1)\right)$
Select the $k$ smallest items from a collection of $m$ sorted lists.
Let $k_i$ be the number of items selected from the $i$-th row.
Let $L = \sum_{i=1}^{m} \lceil \log (k_i + 1) \rceil$.
Split each row into blocks of sizes $1, 2, 4, \ldots$
Note that $L$ is exactly the number of blocks that cover the $k$ smallest items.
If we knew $L$, or a tight upper bound $\ell$ on it, we could select the $\ell$ smallest leaders and then use the $O\!\left(m + \sum_{i} \log n_i\right)$ algorithm.
If $L$ is not known, try $\ell = m, 2m, 4m, \ldots$
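The claim that blocks of sizes $1, 2, 4, \ldots$ cover the first $k_i$ items of a row with $\lceil \log_2 (k_i + 1) \rceil$ blocks can be checked with a small Python sketch. The helper names are made up for this example, and the logarithm is assumed to be base 2; for a non-negative integer, `int.bit_length()` equals $\lceil \log_2 (k_i + 1) \rceil$.

```python
def blocks_needed(k_i):
    """Number of leading blocks of sizes 1, 2, 4, ... needed to cover k_i items."""
    blocks, covered, size = 0, 0, 1
    while covered < k_i:
        covered += size      # the first j blocks cover 2**j - 1 items
        size *= 2
        blocks += 1
    return blocks

def total_blocks(ks):
    """L: the number of blocks covering the selected prefix of every row."""
    return sum(blocks_needed(k_i) for k_i in ks)
```

For example, rows contributing `[0, 1, 5, 8]` items need `0 + 1 + 3 + 4` blocks in total.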

34 Concluding remarks
Results for general partial orders?
More applications of Soft Heaps?
Thank you for your attention!

