Copyright 2004-2006 Curt Hill Balance in Binary Trees Impact on Performance.

1 Balance in Binary Trees: Impact on Performance

2 Tree Shape and Performance
A balanced tree has excellent performance, O(log₂ N), for:
– Searches
– Insertions
– Deletions
Only a hash table can beat this performance
– But it has its own issues

3 What is balance?
The notion is that the two sub-trees of a node are about the same size
Thus each comparison in a search eliminates about half of the remaining tree
Perfect balance:
– For each node in the tree, the sizes of its two sub-trees differ by at most one

4 Probabilities
What is the likelihood that a randomly built tree will have good performance characteristics?
This is a difficult question
The shape of a tree depends on the order in which the nodes are inserted
Example:
– Consider the integers 1-7 as the items to put in a tree
– There are 7! = 5040 ways to order their input: 7 ways to choose the first, 6 ways to choose the second, etc.

5 What do we want?
       4
     /   \
    2     6
   / \   / \
  1   3 5   7
A search must look at no more than 3 nodes

6 Example Continued
There are two really bad ways to order the input:
– Ascending order or descending order
– There are only two of these, but several other orders are just as bad
– Consider 1 7 6 5 4 3 2 or 1 2 3 7 6 5 4
Bad in this case means that every node has at most one child

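7 What do we not want?
Arrival in ascending order (1 2 3 4 5 6 7): a chain in which every node has only a right child
Equally bad (arrival order 1 2 3 7 6 5 4): a chain that runs right through 1, 2, 3 and 7, then left through 6, 5 and 4
A search may have to look at as many as 7 nodes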

8 Negative Combinatorics
Count the input orders that build a list:
– There are two ways to choose the first item: the smallest or the largest
– Each subsequent item also provides two ways: the smallest remaining item or the largest remaining item
– The last item leaves only one choice
– Therefore 2 * 2 * 2 * 2 * 2 * 2 * 1 = 64 ways to choose such an order
– This is a 1.27% chance (64 of 5040) of a list
A search of such a tree may have to look at as many as 7 nodes

9 Positive Combinatorics
There is only one way to choose the root: it must be the 4
There are two ways to choose the second: 2 or 6
There are three ways to choose the third
– If 2 was picked: the 6 or either child of 2 (1 or 3)
– If 6 was picked: the 2 or either child of 6 (5 or 7)
It gets exciting after that

10 Positive Combinatorics
Sub-cases of the three last choices need to be examined
These do not work well in this kind of presentation
I believe that 80 of the 5040 permutations (about 1.6%) yield a perfectly balanced tree
However, most possibilities fall somewhere in between, with maximum path lengths between 3 and 7
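The two counts can be checked by brute force. The following is a minimal sketch, not part of the original slides: it inserts every ordering of 1-7 into an unbalanced binary search tree and tallies the resulting heights. The names `bst_height` and `count_by_height` are made up for this illustration.

```python
from itertools import permutations


def bst_height(keys):
    """Insert keys in the given order into an unbalanced binary search tree
    and return the number of nodes on the longest root-to-leaf path."""
    left, right = {}, {}          # child links, keyed by parent key
    root = keys[0]
    depth = {root: 1}             # the root counts as one node examined
    for key in keys[1:]:
        node = root
        while True:
            links = left if key < node else right
            if node in links:
                node = links[node]            # keep descending
            else:
                links[node] = key             # attach as a new leaf
                depth[key] = depth[node] + 1
                break
    return max(depth.values())


def count_by_height(n=7):
    """Map each possible height to the number of input orders producing it."""
    counts = {}
    for order in permutations(range(1, n + 1)):
        h = bst_height(order)
        counts[h] = counts.get(h, 0) + 1
    return counts


counts = count_by_height(7)
print("orders in total:       ", sum(counts.values()))
print("lists (height 7):      ", counts[7])
print("perfect trees (height 3):", counts[3])
```

With seven keys, a height of 3 can only be the perfectly balanced tree and a height of 7 can only be a list, so the two tallies correspond to the positive and negative cases above.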

11 Summary
The worst case is a linked list, which is bad
– The worst case is not very likely
The best case is perfectly balanced
– The best case is more likely, but still unlikely
Empirical studies indicate that the average path length of an unbalanced tree is only 39% longer than that of a perfectly balanced tree
Balancing is hard and slows insertions and deletions
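A small experiment can give a feel for the average-path-length claim. This is only a sketch under stated assumptions: it builds unbalanced trees from randomly shuffled keys, measures the average number of nodes examined per successful search, and divides by the same figure for a perfectly balanced tree of equal size. The function names, the tree size, the number of trials and the seed are all arbitrary choices for illustration; the ratio depends on them and on how path length is counted, so it should land near the slide's figure but not necessarily on it.

```python
import random


def average_probes(keys):
    """Average number of nodes examined to find each key in the unbalanced
    tree built by inserting keys in the given order."""
    left, right = {}, {}
    root = keys[0]
    depth = {root: 1}
    for key in keys[1:]:
        node = root
        while True:
            links = left if key < node else right
            if node in links:
                node = links[node]
            else:
                links[node] = key
                depth[key] = depth[node] + 1
                break
    return sum(depth.values()) / len(keys)


def perfect_average_probes(levels):
    """Same average for a perfectly balanced tree with 2**levels - 1 nodes:
    level d holds 2**(d - 1) nodes, each found after d probes."""
    n = 2 ** levels - 1
    return sum(d * 2 ** (d - 1) for d in range(1, levels + 1)) / n


def random_vs_perfect(levels=10, trials=20, seed=1):
    """Ratio of the random-order average to the perfectly balanced average."""
    rng = random.Random(seed)
    keys = list(range(2 ** levels - 1))
    total = 0.0
    for _ in range(trials):
        rng.shuffle(keys)
        total += average_probes(keys)
    return (total / trials) / perfect_average_probes(levels)


ratio = random_vs_perfect()
print(f"random order costs about {100 * (ratio - 1):.0f}% more probes")
```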

12 When to Balance
In most cases an unbalanced tree will perform quite adequately
If the application meets both of the following criteria, then balancing could be considered
– The data is large and search performance affects the program
– The number of searches is large compared to the number of insertions and deletions

13 Perfectly balanced trees
Definition:
– For each node, the numbers of nodes in the left and right sub-trees differ by at most 1
Balancing a tree is a recursive process that involves the nodes from the leaves to the root
Usually, control information that measures the balance is placed in each node
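The definition translates directly into a recursive check that works from the leaves up to the root. The sketch below is one way to write it, not the author's code; the `Node` class and the function names are assumptions made for this example.

```python
class Node:
    """A bare binary tree node (hypothetical, for illustration only)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def size_if_perfect(node):
    """Return the number of nodes in the sub-tree, or -1 as soon as some
    node has sub-trees whose sizes differ by more than one."""
    if node is None:
        return 0
    left_size = size_if_perfect(node.left)
    if left_size < 0:
        return -1
    right_size = size_if_perfect(node.right)
    if right_size < 0 or abs(left_size - right_size) > 1:
        return -1
    return left_size + right_size + 1


def is_perfectly_balanced(root):
    return size_if_perfect(root) >= 0


# The seven-node tree with 4 at the root passes the check.
seven = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
# Here the root's sub-trees hold 3 nodes and 1 node, so the check fails.
lopsided = Node(5, Node(2, Node(1), Node(4)), Node(7))

print(is_perfectly_balanced(seven))
print(is_perfectly_balanced(lopsided))
```

Computing sizes on every check is wasteful in a real tree, which is why the balance information is usually stored in the nodes instead.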

14 Balance Again
Balancing occurs in insertion and deletion, but not in searches
It is somewhat intricate, so perfect balance is seldom used
The ratio of searches to inserts and deletes must be very high
Is there another definition of balance that gives good performance with less rebalancing?

15 Height Balanced
Also known as AVL balance
– Named for Adelson-Velsky and Landis
– They developed it and proved its desirability
Definition:
– The tree is balanced if, for each node, the heights of the two sub-trees differ by at most one
It is the height of the tree that determines the worst-case search
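Height balance can be checked the same way as any per-node property, by computing heights bottom-up. This is a minimal sketch with a hypothetical `Node` class and function names chosen for the example; it counts height in nodes, so an empty sub-tree has height 0.

```python
class Node:
    """A bare binary tree node (hypothetical, for illustration only)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height_if_avl(node):
    """Return the height of the sub-tree in nodes, or -1 as soon as some
    node has sub-trees whose heights differ by more than one."""
    if node is None:
        return 0
    left_height = height_if_avl(node.left)
    if left_height < 0:
        return -1
    right_height = height_if_avl(node.right)
    if right_height < 0 or abs(left_height - right_height) > 1:
        return -1
    return 1 + max(left_height, right_height)


def is_height_balanced(root):
    return height_if_avl(root) >= 0


# Sub-tree sizes at the root are 3 and 1, but heights are 2 and 1.
lopsided = Node(5, Node(2, Node(1), Node(4)), Node(7))
# A three-node chain: heights at the root are 0 and 2.
chain = Node(4, None, Node(5, None, Node(7)))

print(is_height_balanced(lopsided))
print(is_height_balanced(chain))
```

The first tree is not perfectly balanced by the size-based definition, yet it satisfies the height-based one, which is the sense in which AVL balance is the weaker requirement.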

16 Digression on Search
Consider searching an array
On average a successful search requires about ½N comparisons
The worst case is N comparisons, to find the last item or to show that the item is not found
The average and worst case are quite different
This is not the case for trees

17 Searching Trees
       4
     /   \
    2     6
   / \   / \
  1   3 5   7
More than half the nodes are leaves at maximum depth
Worst case is three probes, but average case is only slightly less than three probes
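The contrast between the two structures can be worked out for seven items. The sketch below assumes every item is equally likely to be the search target and counts only successful searches; the function names are invented for the example.

```python
def array_search_costs(n):
    """(average, worst) comparisons for a sequential search of n items.
    The item in position i takes i comparisons, so the average is (n + 1) / 2."""
    return (n + 1) / 2, n


def perfect_tree_search_costs(levels):
    """(average, worst) probes for a perfectly balanced tree with
    2**levels - 1 nodes: level d holds 2**(d - 1) nodes, each needing d probes."""
    n = 2 ** levels - 1
    total = sum(d * 2 ** (d - 1) for d in range(1, levels + 1))
    return total / n, levels


array_avg, array_worst = array_search_costs(7)
tree_avg, tree_worst = perfect_tree_search_costs(3)
print(f"array: average {array_avg:.2f}, worst {array_worst}")
print(f"tree:  average {tree_avg:.2f}, worst {tree_worst}")
```

For the tree the sum is 1·1 + 2·2 + 3·4 = 17 probes over 7 nodes, so the average sits much closer to the worst case than it does for the array.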

18 AVL Trees Again
Adelson-Velsky and Landis proved:
– Worst case of an AVL tree is only 45% worse than perfectly balanced
– Average case: insignificantly different from perfectly balanced
Every perfectly balanced tree is also AVL balanced
Far fewer rebalances, thus cheaper to construct
– For the most part rebalancing occurs only when really needed

19 Construction
Consider the construction of the following tree
Add: 4 5 7 2 1 3 6
Four types of rebalancing operation
– RR single
– LL single
– LR double
– RL double

20 After 2 inserts
  4
   \
    5
Still perfectly balanced

21 Insert 7
  4
   \
    5
     \
      7
Neither perfect nor AVL, rebalance is needed

22 Rotate Right
  4
   \
    5
     \
      7
Rebalance is needed – RR Single

23 After Rotate
    5
   / \
  4   7
After rebalance

24 Insert 2
      5
     / \
    4   7
   /
  2
No problem

25 Insert 1
        5
       / \
      4   7
     /
    2
   /
  1
Unbalanced in the other way – do an LL single

26 Rebalance
      5
     / \
    2   7
   / \
  1   4
Rebalance complete – not perfect but AVL

27 Insert 3
      5
     / \
    2   7
   / \
  1   4
     /
    3
A rebalance is again needed, but a different one

28 After Rotation
      4
     / \
    2   5
   / \   \
  1   3   7
This required an LR double

29 Insert 6
      4
     / \
    2   5
   / \   \
  1   3   7
         /
        6
This requires an RL double

30 Rotate 6-7
      4
     / \
    2   5
   / \   \
  1   3   6
           \
            7
First step of the RL double

31 Rotate 5-6-7
       4
     /   \
    2     6
   / \   / \
  1   3 5   7
Now complete
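The whole construction can be replayed in code. What follows is a compact sketch of AVL insertion with the four rebalancing cases, written for this walkthrough rather than taken from the slides. The names `Node`, `insert`, `rotate_left`, `rotate_right` and `level_order` are assumptions, keys are assumed distinct, and the `log` list exists only to record which case fired. The rotation helpers are named for the direction the sub-tree root moves, so the RR case is repaired by `rotate_left` and the LL case by `rotate_right`.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1          # control information: height in nodes


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_left(x):
    """Lift x's right child above x and return the new sub-tree root."""
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def rotate_right(y):
    """Lift y's left child above y and return the new sub-tree root."""
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x


def insert(node, key, log):
    """Insert key below node, rebalancing on the way back up to the root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key, log)
    else:
        node.right = insert(node.right, key, log)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:                          # left sub-tree too tall
        if key < node.left.key:
            log.append((key, "LL single"))
        else:
            log.append((key, "LR double"))
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:                         # right sub-tree too tall
        if key > node.right.key:
            log.append((key, "RR single"))
        else:
            log.append((key, "RL double"))
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def level_order(root):
    """Keys listed level by level, left to right."""
    out, queue = [], [root]
    while queue:
        node = queue.pop(0)
        if node:
            out.append(node.key)
            queue += [node.left, node.right]
    return out


root, log = None, []
for key in [4, 5, 7, 2, 1, 3, 6]:
    root = insert(root, key, log)

print(level_order(root))   # [4, 2, 6, 1, 3, 5, 7]
print(log)
```

The log should show one rebalance each for the insertions of 7, 1, 3 and 6, in that order, matching the RR, LL, LR and RL steps in the slides.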

32 The problem of balancing
Implementing it requires extra information in the nodes
– It measures the heights of the sub-trees
Even with an AVL tree there is substantial work to be done at insertion and deletion time
Thus the ratio of searches to insertions and deletions needs to be high
– Just not as high as for perfect balance

33 Synonyms
Another name associated with AVL trees is Fibonacci tree
– Strictly, a Fibonacci tree is the AVL tree with the fewest nodes for its height
The fact that heights may disagree by one leads to a strangely asymmetric tree
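One way to see how asymmetric a height-balanced tree may get is to count the fewest nodes it can hold at each height. The sparsest such tree pairs a sub-tree one level shorter with a sub-tree two levels shorter, which gives a Fibonacci-like recurrence. The sketch below is an illustration added here, with `min_avl_nodes` as a made-up name and height counted in nodes.

```python
def min_avl_nodes(levels):
    """Fewest nodes a height-balanced tree with the given height can have.
    The sparsest tree is a root plus sparsest sub-trees of heights
    levels - 1 and levels - 2."""
    fewest = [0, 1]                      # heights 0 and 1
    for h in range(2, levels + 1):
        fewest.append(fewest[h - 1] + fewest[h - 2] + 1)
    return fewest[levels]


for h in range(1, 7):
    full = 2 ** h - 1                    # nodes in a completely full tree
    print(f"height {h}: at least {min_avl_nodes(h)} nodes, at most {full}")
```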

34 Is this balanced?
[Figure: a binary tree on the keys 1 to 12]

