Published by Brianna Caldwell. Modified over 9 years ago.
1
A Study of Balanced Search Trees: Brainstorming a New Balanced Search Tree
Anthony Kim, 2005 Computer Systems Research
2
Abstract
This project investigates three different balanced search trees for their advantages and disadvantages, and ultimately their efficiency. Run time and memory space management are the two main aspects under study. Statistical analysis is provided to distinguish subtle differences, if there are any. A new balanced search tree is suggested and compared with the three balanced search trees under study. The balanced search trees are implemented in C++, making extensive use of pointers and structs.
3
Introduction
● A simple binary search tree has a major disadvantage: its structure, and hence its performance, depends heavily on the order of the incoming data. (Ex. sorted input produces a linear tree.)
● An optimal search tree is one that tries to minimize its height for the given data. (Ex. a red-black tree has height at most 2lg(n+1).)
● Examples of balanced search trees are the red-black tree, AVL tree, weight-balanced tree, and B-tree.
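To make the "linear tree" example concrete, here is a minimal sketch of plain, unbalanced insertion. The `Node` layout and the `insert`, `height`, and `build` names are assumptions made for illustration; the slides do not show the project's actual structs. Nodes are never freed, which is acceptable only for a short demonstration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node layout (assumption; not taken from the project).
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Plain binary search tree insertion with no rebalancing.
// Keys equal to an existing key go to the right subtree.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node{key, nullptr, nullptr};
    if (key < root->key) root->left = insert(root->left, key);
    else root->right = insert(root->right, key);
    return root;
}

// Height counted in nodes on the longest root-to-leaf path (empty tree = 0).
int height(const Node* root) {
    if (root == nullptr) return 0;
    return 1 + std::max(height(root->left), height(root->right));
}

// Insert keys[0..n-1] in the given order and return the root.
Node* build(const int* keys, int n) {
    assert(n >= 0);
    Node* root = nullptr;
    for (int i = 0; i < n; ++i) root = insert(root, keys[i]);
    return root;
}
```

Building from the sorted keys 1..7 gives a tree of height 7 (a linked list), while inserting the same keys in the order 4, 2, 6, 1, 3, 5, 7 gives height 3.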
4
Background
● Basic vocabulary: nodes, edges, children, parent, root
● Printing a tree using recursion: pre-, in-, and post-order traversal
● Basic binary search tree functions: insertion, lookup, deletion (only the first two apply in this project)
● Rotation functions: the key player in balancing (left rotation and right rotation)
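The two rotations and an in-order traversal can be sketched as follows. This is the textbook form, not the project's code; the `Node` struct and function names are assumptions. Each rotation returns the new subtree root, and the caller is expected to re-attach it to the parent.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node layout (assumption; not taken from the project).
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Left rotation: x's right child y moves up and x becomes y's left child.
// Returns the new subtree root (x itself if there is no right child).
Node* rotateLeft(Node* x) {
    assert(x != nullptr);
    Node* y = x->right;
    if (y == nullptr) return x;
    x->right = y->left;  // y's left subtree holds keys between x and y
    y->left = x;
    return y;
}

// Right rotation: mirror image of rotateLeft.
Node* rotateRight(Node* y) {
    assert(y != nullptr);
    Node* x = y->left;
    if (x == nullptr) return y;
    y->left = x->right;
    x->right = y;
    return x;
}

// In-order traversal: appends the keys to out in ascending order.
void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->key);
    inorder(n->right, out);
}
```

A rotation changes the shape of the tree but not the in-order sequence of keys, which is why rotations are safe to use for rebalancing.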
5
Some Balanced Search Trees (Red-black tree)
● Four properties
  – The root of the tree is colored black.
  – All paths from the root to the leaves agree on the number of black nodes.
  – No path from the root to a leaf may contain two consecutive nodes colored red.
  – Every path from a node to any of its descendant leaves has the same number of black nodes.
● Has height at most 2lg(n+1)
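The properties above can be checked mechanically. The sketch below is a validator only, not an insertion routine, and it assumes the usual convention that null links count as the leaves; `RBNode`, `blackHeight`, and `isRedBlack` are hypothetical names.

```cpp
#include <cassert>

enum Color { RED, BLACK };

// Hypothetical red-black node layout (assumption; not taken from the project).
struct RBNode {
    int key;
    Color color;
    RBNode* left;
    RBNode* right;
};

// Returns the number of black nodes on every path from n down to a null link,
// or -1 if the subtree has a red node with a red child or if two paths
// disagree on the number of black nodes.
int blackHeight(const RBNode* n) {
    if (n == nullptr) return 0;
    if (n->color == RED) {
        bool redLeft = n->left != nullptr && n->left->color == RED;
        bool redRight = n->right != nullptr && n->right->color == RED;
        if (redLeft || redRight) return -1;
    }
    int lh = blackHeight(n->left);
    int rh = blackHeight(n->right);
    if (lh < 0 || rh < 0 || lh != rh) return -1;
    return lh + (n->color == BLACK ? 1 : 0);
}

// True when the root is black and the color properties hold everywhere.
// Key ordering is not checked here.
bool isRedBlack(const RBNode* root) {
    if (root == nullptr) return true;
    return root->color == BLACK && blackHeight(root) >= 0;
}
```

Running such a check after every insertion during testing is one way to locate the first operation that corrupts the tree.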
6
Some Balanced Search Trees (Weight- and height-balanced trees)
● Weight-balanced trees and height-balanced trees are very similar.
● A weight-balanced tree (height-balanced tree) has one property:
  – At each node, the difference between the weight (height) of the left subtree and the weight (height) of the right subtree is less than a threshold value.
● Supposedly yields a height of at most lg(n)
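The balance property can be written as a direct check. In this sketch "weight" is taken to mean the number of nodes in a subtree, which is an assumption since the slides do not define it; the names are hypothetical. The subtree measures are recomputed at every node for clarity, so the check is quadratic in the worst case; a real implementation would normally cache them in the nodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical node layout (assumption; not taken from the project).
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Height in nodes (empty tree = 0).
int height(const Node* n) {
    return n == nullptr ? 0 : 1 + std::max(height(n->left), height(n->right));
}

// Weight assumed to be the number of nodes in the subtree.
int weight(const Node* n) {
    return n == nullptr ? 0 : 1 + weight(n->left) + weight(n->right);
}

// True if, at every node, |height(left) - height(right)| < threshold.
bool heightBalanced(const Node* n, int threshold) {
    assert(threshold > 0);
    if (n == nullptr) return true;
    int diff = std::abs(height(n->left) - height(n->right));
    return diff < threshold &&
           heightBalanced(n->left, threshold) &&
           heightBalanced(n->right, threshold);
}

// Same check with subtree weights instead of heights.
bool weightBalanced(const Node* n, int threshold) {
    assert(threshold > 0);
    if (n == nullptr) return true;
    int diff = std::abs(weight(n->left) - weight(n->right));
    return diff < threshold &&
           weightBalanced(n->left, threshold) &&
           weightBalanced(n->right, threshold);
}
```

With a threshold of 2, the height version reduces to the familiar AVL condition that sibling subtree heights differ by at most 1.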
7
A New Balanced Search Tree: Median-weight-mix tree
● Assumptions on the statistical data
  – Given a lower bound and an upper bound on the input data, random behavior is assumed, meaning data points are evenly distributed throughout the interval.
  – Multiple “crests” are assumed to be present in the interval.
● Each node has a key (data number), an interval (the lower and upper bounds of its assigned interval), and the weights of its left and right subtrees.
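The per-node data described above could be laid out roughly as follows. The field types, the names `Interval`, `MwmNode`, `makeNode`, and `inRange`, and the choice of `double` for bounds and weights are all assumptions; the slides list only which pieces of information each node holds.

```cpp
#include <cassert>

// Closed interval [lower, upper] assigned to a node (hypothetical layout).
struct Interval {
    double lower;
    double upper;
};

// Hypothetical median-weight-mix node: a key, its assigned interval,
// and the weights of its left and right subtrees.
struct MwmNode {
    int key;
    Interval range;
    double leftWeight;
    double rightWeight;
    MwmNode* left;
    MwmNode* right;
};

// Creates a leaf whose subtree weights start at zero.
MwmNode* makeNode(int key, double lower, double upper) {
    assert(lower <= upper);
    return new MwmNode{key, {lower, upper}, 0.0, 0.0, nullptr, nullptr};
}

// True if key falls inside the node's assigned interval.
bool inRange(const MwmNode* n, int key) {
    return n->range.lower <= key && key <= n->range.upper;
}
```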
8
A New Balanced Search Tree: Median-weight-mix tree
● Algorithm
  – The weight of each subtree is calculated based on the constants R and S.
  – R = the importance of focusing on frequency-heavy data points
  – S = the importance of focusing on frequency-weak data points
  – Left/right rotations are used to balance the R-to-S ratio.
9
Testing Methodology
● 14 randomly generated test cases (test case sizes range from 20 to 10,000)
● 4 real data sets (test scores from math competitions, etc.)
● Things I am looking for
  – Total run time
  – Average retrieval time
  – Height of tree
  – Average retrieval depth
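The two structural metrics, height and average retrieval depth, can be gathered in a single traversal. This sketch assumes the root is at depth 0 and that the average is taken over every node stored in the tree; the slides do not state which conventions the project used, and all names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node layout (assumption; not taken from the project).
struct Node {
    int key;
    Node* left;
    Node* right;
};

struct TreeStats {
    int height;           // depth of the deepest node; -1 for an empty tree
    std::size_t nodes;    // number of nodes visited
    double averageDepth;  // mean node depth; 0 for an empty tree
};

// Visits every node once, tracking the maximum depth and the sum of depths.
static void walk(const Node* n, int depth, TreeStats& stats, long long& depthSum) {
    if (n == nullptr) return;
    if (depth > stats.height) stats.height = depth;
    ++stats.nodes;
    depthSum += depth;
    walk(n->left, depth + 1, stats, depthSum);
    walk(n->right, depth + 1, stats, depthSum);
}

// Computes height and average retrieval depth for the tree rooted at root.
TreeStats measure(const Node* root) {
    TreeStats stats{-1, 0, 0.0};
    long long depthSum = 0;
    walk(root, 0, stats, depthSum);
    if (stats.nodes > 0) {
        stats.averageDepth =
            static_cast<double>(depthSum) / static_cast<double>(stats.nodes);
    }
    return stats;
}
```

Unlike run time, both numbers are deterministic for a given insertion order, so they can be compared across machines.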
10
Test Runs (Height-balanced Tree)
11
Results
Total Run Time
Test Case  Redblack  Height  Weight  MWM
1          0         0       0       0
2          0         0       0       0
3          N/A       0       0       0
4                    0.01    0       0
5          N/A       0.01    0.04    0
6          N/A       0.01    0
7          N/A       0.01    0       0.02
8          N/A       0.01    0.02
9          N/A       0.02
10         N/A       0.03    0.02    0.05
11         N/A       0.03    0.04    0.05
12         N/A       0.03    0.04    0.05
13         N/A       0.04            0.05
14         N/A       0.03    0.04    0.05
101        0         0       0       0
102        0         0       0       0
103        N/A       0.08    0.41    N/A
104        0.02              0.05    0.02
Average Retrieval Time
Test Case  Redblack  Height      Weight      MWM
1          0         0           0           0
2          0         0           0           0
3          N/A       0           0           0
4                    0           0           0
5                    0           0.000001    0
6          N/A       0           0           0
7                    0.000002    0           0
8          N/A       0.000002    0           0.000004
9          N/A       0           0           0
10         N/A       0.000001    0.000002    0.000001
11         N/A       0.000002    0.000001
12         N/A       0.000002                0.000001
13         N/A       0.000002                0.000001
14         N/A       0.000001    0.000002    0
101        0         0           0           0
102        0         0           0           0
103        N/A       0.0000007   0.00000086  N/A
104        0.000181  0.00000089  0           0.00000093
12
Results
Depth
Test Case  Redblack  Height  Weight  MWM
1          7         4       4       4
2          12        6       9       7
3          N/A       8       10
4          N/A       11      16      13
5          N/A       12      24      14
6          N/A       13      29      17
7          N/A       14      31      20
8          N/A       14      31      19
9          N/A       15      38      18
10         N/A       16      38      23
11         N/A       16      37      22
12         N/A       16      37      22
13         N/A       16      36      23
14         N/A       16      33      24
101                  7       18      9
102        38        6       8       7
103        N/A       16      42      N/A
104        5596      13      24      12
Average Retrieval Depth
Test Case  Redblack  Height       Weight      MWM
1          3.5       2.35
2          5.68      3.9          5.08        4.42
3          N/A       4.85         5.44        5.31
4          N/A       7.01         8.684       7.422
5          N/A       8.217        11.823      8.821
6          N/A       9.2445       13.721      9.9635
7          N/A       10.3778      14.674      12.0244
8          N/A       10.4804      14.52       11.6468
9          N/A       10.4662      15.1646     11.6474
10         N/A       11.5749      18.5899     13.2171
11         N/A       11.5864      15.6778     13.0158
12         N/A       11.6456      16.4298     13.1199
13         N/A       11.6748      16.5699     13.3742
14         N/A       11.5393      17.7741     13.1296
101        49.0291   4.87379      8.50971     5.18932
102        18.5625   4.4          4.55        4.7375
103        N/A       0.000765942  0.00065652  N/A
104        2368.98   1.68751      2.70135     2.62517
13
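Result
● Total run time and average retrieval time data were not meaningful: most measurements were zero or close to it.
● It is hard to time such short processes on fast computers.
● The red-black tree implementation hit a segmentation fault on large test cases (more than 500 items), so it provided no experimental data for those cases.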
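One common way to work around coarse timers is to repeat the operation many times and divide the elapsed time by the repetition count. The sketch below uses `std::chrono::steady_clock` from the C++11 standard library; it is a suggestion for follow-up measurements, not the timing method used in the project, and `averageNanoseconds` is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Runs op() `repetitions` times and returns the mean wall-clock time per call
// in nanoseconds. Repeating the call amortizes the timer's limited resolution.
template <typename Operation>
double averageNanoseconds(Operation op, std::size_t repetitions) {
    assert(repetitions > 0);
    typedef std::chrono::steady_clock Clock;
    Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < repetitions; ++i) {
        op();
    }
    Clock::time_point stop = Clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(repetitions);
}
```

In practice the operation passed in would be a tree lookup, and its result should be used (for example, accumulated into a variable that is printed afterwards) so that the compiler cannot optimize the call away.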
14
Result (Height)
15
Result (Average retrieval depth)
16
Analysis
● All balanced search trees show logarithmic growth in height and average retrieval depth, as expected (except the red-black tree).
● The height-balanced tree seems to perform the best among the three working balanced search trees.
● The median-weight-mix tree’s logarithmic curve lies between the height-balanced tree’s curve and the weight-balanced tree’s curve.
17
Conclusion
● The project showed experimentally that balanced binary search trees exhibit logarithmic growth in height and average retrieval depth.
● The median-weight-mix tree’s performance is intermediate between the height-balanced tree’s and the weight-balanced tree’s.
● More studies should be done on other balanced search trees or on variants of the search trees studied in this project.