Download presentation
Presentation is loading. Please wait.
1
Rank-Balanced Trees Siddhartha Sen, Princeton University WADS 2009 Joint work with Bernhard Haeupler and Robert E. Tarjan 1
2
Observation Computer science is (still) a young field. We often settle for the first (good) solution. It may not be the best: the design space is rich. 2
3
3 Research Agenda For fundamental problems, systematically explore the design space to find the best solutions, seeking elegance: “ a quality of neatness and ingenious simplicity in the solution of a problem (especially in science or mathematics).” wordnet.princeton.edu/perl/webwn Keep the design simple, allow complexity in the analysis.
4
4 Searching: Dictionary Problem Maintain a set of items, so that Access: find a given item Insert: add a new item Delete: remove an item are efficient. Assumption: items are totally ordered, so that binary comparison is possible
5
Binary Search Tree Symmetric order < > e ci agn k Why not hashing? 5
6
6 Binary Search Tree Access k e ci agn k > > <
7
7 Binary Search Tree Insert h e ci agn k > h < >
8
8 Binary Search Tree Delete i e ci agn kh m Find successor Swap i k Delete k
9
Problem: imbalance How to bound the height? Maintain local balance condition, rebalance after insert or delete balanced tree Restructure after each access self-adjusting tree Store balance information in nodes, guarantee O(log n) height After (during) insert/delete, restore balance bottom-up (top-down): Update balance information Restructure along access path a b c d e f 9
10
10 Restructuring primitive: (Single) Rotation Preserves symmetric order Changes heights Takes O(1) time y x AB C x y BC A right left
11
11 Known Balanced Trees AVL trees (“passé” according to one author) weight balanced trees 2,3 trees B trees red-black trees etc. Goal: small height, little rebalancing, simple algorithms not binary
12
12 Ranks Each node has an integer rank, a proxy for height Convention: leaves have rank 0, missing nodes have rank -1 rank of tree = rank of root rank difference of a child = rank of parent - rank of child i-child: node of rank difference i i,j-node: children have rank differences i and j
13
13 Example of a rank-balanced tree e ci agn kh 3 0 1 1 2 2 1 1 1 1 1 0 1 0 1 If all rank differences are positive, rank height
14
14 Rank Rules AVL trees: every node is a 1,1- or 1,2-node Rank-balanced trees: every node is a 1,1-, 1,2-, or 2,2-node (rank differences are 1 or 2) Red-black trees: all rank differences are 0 or 1, no 0-child is the parent of another All need one balance bit per node.
15
15 Height bounds n k = minimum n for rank k AVL trees: n 0 = 1, n 1 = 2, n k = n k-1 + n k-2 + 1, n k = F k+3 - 1 n k = F k+3 - 1, F k+2 < k k log n 1.44lg n Rank-balanced trees: n 0 = 1, n 1 = 2, n k = 2n k-2, n k = 2 k/2 k 2lg n Same bound for red-black trees
16
16 b Insert bInsert c 0 c 1 a Insert a > > 1 0 Rotate left at b 0 1 Insertion example Demote a 0 0 1 Promote a Promote b
17
17 > e d 2 Insert e 0 0 1 1 0 1 c b > Insert d a 1 > Rotate left at d 12 Insertion example Insert c Demote c 0 0 0 0 1 1 Promote c Promote b Promote d
18
18 0 1 e 2 1 1 d b a c 2 Insertion example Insert eInsert f > > > f 0 2 1 0 Rotate left at d Demote b 1 0 0 0 0 1 2 Promote e Promote d
19
19 1 Insert f f 1 1 e d b 2 Insertion example a c 1 1 1 0 0 0 1
20
20 Rebalancing: insertion Non-terminal 0, 1, or 2 rotations O(log n) rank changes No 2,2 nodes = AVL trees log n height!
21
21 2 1 0 def e Delete aDelete fDelete d 1 Swap with successor Delete 1 f 1 d b 2 Deletion example a c 1 1 1 0 0 0 Double rotate at c Double promote c Demote b Double demote e
22
22 e c b Delete f Deletion example 2 0 2 0 2
23
23 Rebalancing: deletion Non-terminal 0, 1, or 2 rotations O(log n) rank changes
24
24 Amortized (time-averaged) analysis If t i is the actual time of operation i and i is the potential of the data structure after operation i, the amortized time of operation i is
25
25 Non-terminal cases Must decrease potential!
26
Insertions: 1,1-node 1,2-node Deletions: 2,2-node 1,1- or 1,2-node = #1,1-nodes + 2 #2,2-nodes non-terminating steps are free, last insertion step: Δ 2, last deletion step: Δ 3 If there are m inserts and d deletes (n = m - d), the number of rebalancing steps is O(m + d) 26
27
27 Rank-Balanced Trees height 2lg n 2 rotations per rebalancing O(1) amortized rebalancing time Red-Black Trees height 2lg n 3 rotations per rebalancing O(1) amortized rebalancing time Yes.No. Are rank-balanced trees better?
28
Better height bound? Sequential Insertions: rank-balancedred-black height = lg n (best)height = 2lg n (worst) Theorem. The height of a rank-balanced tree is at most log m. Degrades gracefully from AVL trees as d/m 1 28
29
29 Proof Give a node a count of 1 when it is inserted Total amount of count in tree is m Potential of a node = total count in its subtree When a node is deleted, its count is added to its parent if it has one Let k be the minimum potential of a node of rank k Claim: k satisfies 0 = 1, 1 = 2, k = 1 + k-1 + k-2 for k > 1 m F k+3 - 1 k
30
30 Proof of Claim k = 1 + k-1 + k-2 for k > 1 Easy for 1,1- and 1,2-nodes Harder for 2,2-nodes (created by deletions) But counts are inherited
31
Rebalancing frequency How high does rebalancing propagate? O(m + d) rebalancing steps total, which implies O((m + d) / k) insertions/deletions at rank k Actually, we can show: Theorem. There are O((m + d) / 2 k/3 ) rebalancing steps at rank k. 31
32
32 Proof Use an exponential potential: 1,1- and 2,2-nodes of rank i get potential b i 1,2-nodes of rank i get potential b i-2 where b = 2 1/3 The potential change in the non-terminal steps telescopes. Combine this effect with initialization and terminal step.
33
33 0 01 1 0 0 1 1 1 1 2 2 2 2 Telescoping potential … 0 1 1 1 … bibi b i-1 - b i+1 b i - b i+2 b i+1 - b i+3 b i+2 - 0 = -b i+3
34
34 Fix k. Cut off growth in potential at rank k: 1,1- and 2,2-nodes of rank i: b min{i,k-3} 1,2-nodes of rank i: b min{i-2,k-3} Then a rebalancing that propagates to rank k or above decreases the potential by b k-3. The same idea works for red-black trees (we think).
35
Preliminary evaluation 35 TestNXRed-black treesRank-balanced trees # rots 10 6 # bals 10 6 avg. pLen max. pLen # rots 10 6 # bals 10 6 avg. pLen max. pLen id81926710886426.443116.07010.47215.62729.553133.73710.39015.092 queue_id81926710886450.317285.12911.37522.50150.325184.52711.19513.999 work_set_id81926710886441.714185.34810.51016.18143.686159.68910.44515.345 zipf_id81926710886425.238112.85810.41315.45828.272130.92710.33815.045 dyn_zipf_id.81926710886423.176103.47210.47715.66126.038125.98510.40415.158
36
Preliminary evaluation 36 TestNXRed-black treesRank-balanced trees # rots 10 6 # bals 10 6 avg. pLen max. pLen # rots 10 6 # bals 10 6 avg. pLen max. pLen id81926710886426.443116.07010.47215.62729.553133.73710.39015.092 queue_id81926710886450.317285.12911.37522.50150.325184.52711.19513.999 work_set_id81926710886441.714185.34810.51016.18143.686159.68910.44515.345 zipf_id81926710886425.238112.85810.41315.45828.272130.92710.33815.045 dyn_zipf_id.81926710886423.176103.47210.47715.66126.038125.98510.40415.158
37
Preliminary evaluation 37 TestNXRed-black treesRank-balanced trees # rots 10 6 # bals 10 6 avg. pLen max. pLen # rots 10 6 # bals 10 6 avg. pLen max. pLen id81926710886426.443116.07010.47215.62729.553133.73710.39015.092 queue_id81926710886450.317285.12911.37522.50150.325184.52711.19513.999 work_set_id81926710886441.714185.34810.51016.18143.686159.68910.44515.345 zipf_id81926710886425.238112.85810.41315.45828.272130.92710.33815.045 dyn_zipf_id.81926710886423.176103.47210.47715.66126.038125.98510.40415.158
38
Preliminary evaluation 38 TestNXRed-black treesRank-balanced trees # rots 10 6 # bals 10 6 avg. pLen max. pLen # rots 10 6 # bals 10 6 avg. pLen max. pLen id81926710886426.443116.07010.47215.62729.553133.73710.39015.092 queue_id81926710886450.317285.12911.37522.50150.325184.52711.19513.999 work_set_id81926710886441.714185.34810.51016.18143.686159.68910.44515.345 zipf_id81926710886425.238112.85810.41315.45828.272130.92710.33815.045 dyn_zipf_id.81926710886423.176103.47210.47715.66126.038125.98510.40415.158
39
Conclusion Rank-balanced trees are a relaxation of AVL trees with behavior at least as good as red-black trees and better in important ways. Especially the height bound of min{2lg n, log m} Exponential potential functions yield new insights into the efficiency of rebalancing. We anticipate more applications. 39
40
For insertions, yes. But what about deletions? Deletion rebalancing is complicated, ignored by textbooks, and many database systems do not do it So, can we avoid deletion rebalancing? Yes. Relaxation of AVL trees, ravl trees, achieves log m access time using lglg m + 1 balance bits … and no rebalancing during deletion! Is rebalancing necessary? 40
41
41 Thank you
42
42 Extra slides
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.