EECS 4101/5101 Prof. Andy Mirzaian. Lists Move-to-Front Search Trees Binary Search Trees Multi-Way Search Trees B-trees Splay Trees 2-3-4 Trees Red-Black.

Slides:



Advertisements
Similar presentations
Chapter 13. Red-Black Trees
Advertisements

CSE 4101/5101 Prof. Andy Mirzaian Augmenting Data Structures.
CSE 4101/5101 Prof. Andy Mirzaian. Lists Move-to-Front Search Trees Binary Search Trees Multi-Way Search Trees B-trees Splay Trees Trees Red-Black.
EECS 4101/5101 Prof. Andy Mirzaian. Lists Move-to-Front Search Trees Binary Search Trees Multi-Way Search Trees B-trees Splay Trees Trees Red-Black.
Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
AVL Trees1 Part-F2 AVL Trees v z. AVL Trees2 AVL Tree Definition (§ 9.2) AVL trees are balanced. An AVL Tree is a binary search tree such that.
1 AVL Trees (10.2) CSE 2011 Winter April 2015.
Chapter 4: Trees Part II - AVL Tree
EECS 4101/5101 Prof. Andy Mirzaian. Lists Move-to-Front Search Trees Binary Search Trees Multi-Way Search Trees B-trees Splay Trees Trees Red-Black.
Augmenting Data Structures Advanced Algorithms & Data Structures Lecture Theme 07 – Part I Prof. Dr. Th. Ottmann Summer Semester 2006.
D. D. Sleator and R. E. Tarjan | AT&T Bell Laboratories Journal of the ACM | Volume 32 | Issue 3 | Pages | 1985 Presented By: James A. Fowler,
Binary Search Tree AVL Trees and Splay Trees
© The McGraw-Hill Companies, Inc., Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
1 Self-Adjusting Data Structures. 2 Lists [D.D. Sleator, R.E. Tarjan, Amortized Efficiency of List Update Rules, Proc. 16 th Annual ACM Symposium on Theory.
Balanced Binary Search Trees
EECS 311: Chapter 4 Notes Chris Riesbeck EECS Northwestern.
Unified Access Bound 1 [M. B ă doiu, R. Cole, E.D. Demaine, J. Iacono, A unified access bound on comparison- based dynamic dictionaries, Theoretical Computer.
AVL-Trees (Part 1) COMP171. AVL Trees / Slide 2 * Data, a set of elements * Data structure, a structured set of elements, linear, tree, graph, … * Linear:
Tirgul 5 AVL trees.
CSE 326: Data Structures Splay Trees Ben Lerner Summer 2007.
CSC401 – Analysis of Algorithms Lecture Notes 7 Multi-way Search Trees and Skip Lists Objectives: Introduce multi-way search trees especially (2,4) trees,
2 -1 Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
6/14/2015 6:48 AM(2,4) Trees /14/2015 6:48 AM(2,4) Trees2 Outline and Reading Multi-way search tree (§3.3.1) Definition Search (2,4)
Princeton University COS 423 Theory of Algorithms Spring 2001 Kevin Wayne Amortized Analysis Some of these lecture slides are adapted from CLRS.
Rank-Balanced Trees Siddhartha Sen, Princeton University WADS 2009 Joint work with Bernhard Haeupler and Robert E. Tarjan 1.
CS 202, Spring 2003 Fundamental Structures of Computer Science II Bilkent University1 Splay trees CS 202 – Fundamental Structures of Computer Science II.
AVL-Trees (Part 1: Single Rotations) Lecture COMP171 Fall 2006.
Department of Computer Eng. & IT Amirkabir University of Technology (Tehran Polytechnic) Data Structures Lecturer: Abbas Sarraf Search.
B + -Trees (Part 1). Motivation AVL tree with N nodes is an excellent data structure for searching, indexing, etc. –The Big-Oh analysis shows most operations.
© 2004 Goodrich, Tamassia (2,4) Trees
CSC 212 Lecture 19: Splay Trees, (2,4) Trees, and Red-Black Trees.
CSE 326: Data Structures Lecture #13 Extendible Hashing and Splay Trees Alon Halevy Spring Quarter 2001.
Binary search trees Definition Binary search trees and dynamic set operations Balanced binary search trees –Tree rotations –Red-black trees Move to front.
Splay Trees Splay trees are binary search trees (BSTs) that:
0 Course Outline n Introduction and Algorithm Analysis (Ch. 2) n Hash Tables: dictionary data structure (Ch. 5) n Heaps: priority queue data structures.
Splay Trees and B-Trees
ECE 250 Algorithms and Data Structures Douglas Wilhelm Harder, M.Math. LEL Department of Electrical and Computer Engineering University of Waterloo Waterloo,
Advanced Data Structures and Algorithms COSC-600 Lecture presentation-6.
CPSC 335 BTrees Dr. Marina Gavrilova Computer Science University of Calgary Canada.
New Balanced Search Trees Siddhartha Sen Princeton University Joint work with Bernhard Haeupler and Robert E. Tarjan.
Nirmalya Roy School of Electrical Engineering and Computer Science Washington State University Cpt S 223 – Advanced Data Structures Course Review Midterm.
CMSC 341 Splay Trees. 8/3/2007 UMBC CMSC 341 SplayTrees 2 Problems with BSTs Because the shape of a BST is determined by the order that data is inserted,
CS Data Structures Chapter 10 Search Structures.
COMP20010: Algorithms and Imperative Programming Lecture 4 Ordered Dictionaries and Binary Search Trees AVL Trees.
Binary Trees, Binary Search Trees RIZWAN REHMAN CENTRE FOR COMPUTER STUDIES DIBRUGARH UNIVERSITY.
Search Trees Last Update: Nov 5, 2014 EECS2011: Search Trees1 “Grey Tree”, Piet Mondrian, 1912.
1 Trees 4: AVL Trees Section 4.4. Motivation When building a binary search tree, what type of trees would we like? Example: 3, 5, 8, 20, 18, 13, 22 2.
1 Splay trees (Sleator, Tarjan 1983). 2 Goal Support the same operations as previous search trees.
Trees  Linear access time of linked lists is prohibitive Does there exist any simple data structure for which the running time of most operations (search,
Search Trees Chapter   . Outline  Binary Search Trees  AVL Trees  Splay Trees.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 9.
Lecture 11COMPSCI.220.FS.T Balancing an AVLTree Two mirror-symmetric pairs of cases to rebalance the tree if after the insertion of a new key to.
Oct 26, 2001CSE 373, Autumn A Forest of Trees Binary search trees: simple. –good on average: O(log n) –bad in the worst case: O(n) AVL trees: more.
© 2004 Goodrich, Tamassia Trees
3.1. Binary Search Trees   . Ordered Dictionaries Keys are assumed to come from a total order. Old operations: insert, delete, find, …
Binary Search Trees (BSTs) 18 February Binary Search Tree (BST) An important special kind of binary tree is the BST Each node stores some information.
October 19, 2005Copyright © by Erik D. Demaine and Charles E. LeisersonL7.1 Introduction to Algorithms LECTURE 8 Balanced Search Trees ‧ Binary.
Jim Anderson Comp 750, Fall 2009 Splay Trees - 1 Splay Trees In balanced tree schemes, explicit rules are followed to ensure balance. In splay trees, there.
Balanced Binary Search Trees
CSE373: Data Structures & Algorithms Lecture 7: AVL Trees Linda Shapiro Winter 2015.
Lecture 10COMPSCI.220.FS.T Binary Search Tree BST converts a static binary search into a dynamic binary search allowing to efficiently insert and.
CS 5243: Algorithms Balanced Trees AVL : Adelson-Velskii and Landis(1962)
1 Binary Search Trees   . 2 Ordered Dictionaries Keys are assumed to come from a total order. New operations: closestKeyBefore(k) closestElemBefore(k)
8/3/2007CMSC 341 BTrees1 CMSC 341 B- Trees D. Frey with apologies to Tom Anastasio.
Self-Adjusting Data Structures
Binary search trees Definition
Self-Adjusting Search trees
Splay Trees In balanced tree schemes, explicit rules are followed to ensure balance. In splay trees, there are no such rules. Search, insert, and delete.
Splay trees (Sleator, Tarjan 1983)
Presentation transcript:

EECS 4101/5101 Prof. Andy Mirzaian

Lists Move-to-Front Search Trees Binary Search Trees Multi-Way Search Trees B-trees Splay Trees Trees Red-Black Trees SELF ADJUSTING WORST-CASE EFFICIENT competitive competitive? Linear Lists Multi-Lists Hash Tables DICTIONARIES 2

References: Lecture Note 3 AAW animation Lecture Note 3 AAW animation 3

Self-Adjusting Binary Search trees  Red-Black trees guarantee that each dictionary operation (Search, Insert, Delete) takes O(log n) time in the worst-case. They achieve this by enforcing an explicit “balance” constraint on the tree.  Can we endow self-adjusting feature to the standard BST by the use of appropriate rotations to restructure the tree and improve the efficiency of future operations in order to keep the amortized cost per operation low? Is there a simple self-adjusting data structure that guarantees O(log n) amortized time per dictionary operation? YES, Splay trees.  As with Red-Black trees, Splay trees give the following guarantee: Any sequence of m dictionary operations on an initially empty dictionary takes in total O(m log n) time, where n is the max size of the dictionary during this sequence. However, a single dictionary operation on Splay trees in the worst-case may cost up to  (n) time.  Splay Trees maintain a BST in an arbitrary state, with no “balance” information/constraint. However, they perform the simple, uniformly defined, splay operation after each dictionary operation. A splay operation on BSTs is analogous to Move-to-Front on linear lists. 4

Generic Splay  GenericSplay(x): Consider the links on the search path of node x in BST T. This operation performs one rotation per link on this path in some appropriate order to move node x up to the root of T.  The number of rotations (at a cost of O(1) per rotation) done by GenericSplay(x) is depth T (x), i.e., depth of x in T before the splay. Access cost of node x is O(1+ depth T (x)).  Generic Splay tree: This is an arbitrary BST such that after each dictionary operation we perform GenericSplay(x), where x is the deepest node accessed by the dictionary operation that is still in the tree. The cost of the dictionary operation, including the splay, is O(1+depth(x)). Which node is x?  Successful search: x is the node just accessed.  Unsuccessful search: x is parent of the external node just accessed.  Insert: x is the node now holding the insertion key.  Unsuccessful Delete: same as unsuccessful search.  Successful Delete: x is parent of the spliced-out node.  Any sequence of m dictionary operations takes O(m+R) time, where R is the total number of rotations done by the generic splay operations in the sequence. So, we need to find an upper bound on R. 5

Naïve Splay NaïveSplay(x): A series of single rotations up the tree: while p[x]  nil do Rotate(p[x],x) Exercise: Show that there is an adversarial sequence of n dictionary operations on an initially empty BST on which NaïveSplay takes  (n 2 ) time total, i.e., amortized  (n) time per dictionary operation. x x x x x x x x x x x x NaïveSplay(x) 6

Splay(x) Perform a series of the following splay steps, whichever applies, until x becomes the root of the tree. (Left-Right symmetric cases not shown.) D z C y AB x A x B y CD z * ZIG-ZIG 1 2 Rotate 1 then 2 D z A y BC x x AB y CD z ZIG-ZAG 2 1 Rotate 1 then 2 C y AB x root A x BC y ZIG terminating case 7

Splay Examples 8

Amortized Analysis  The aggregate time on any sequence of m dictionary operations on an initially empty BST is O(m+R) R = total # rotations done by Splay operations in the sequence.  We shall upper bound R by amortized analysis.  We analyze amortized cost of Splay with unit cost per rotation.  What is a “good” candidate for the potential function? 9

Potential Function (part 1/2)  1+depth T (x) = # ancestors of x (including x) in T = access cost of x in T.  w T (x) = weight of x in T = # descendents of x (including x) in T = # nodes in the subtree rooted at x.  Candidate 1: Internal path length of T: L(T) =  x  T (1+depth T (x)).  Candidate 2: Weight of T: W(T) =  x  T w T (x) L(T) = W(T) = 19  FACT: (1)  T L(T) = W(T). Why? (2) As potentials, L(T) and W(T) are regular (i.e., always  0, L(  ) = W(  ) = 0). (3)  (n log n)  L(T) = W(T)   (n 2 ), where n=|T|.  L(T) & W(T) can become too large. They won’t work! Dampen W(T) by a sublinear function, e.g., log. 10

Potential Function (part 2/2)  Rank of x in T: r T (x) = log w T (x).  Potential:  (T) =  x  T r T (x).  FACT:  (T) is also regular, and  (n)   (T)   (n log n).  (n) = 2  (n/2) + log n =  (n)  (n) =  (n-1) + log n =  (n log n) 11

The LOG LEMMA LEMMA 0: For any numbers a, b, c  1 a + b  c  log a + log b  2 log c – 2. LEMMA 0: For any numbers a, b, c  1 a + b  c  log a + log b  2 log c – 2. Proof 1: c 2  (a+b) 2 = 4ab + (a-b) 2  4ab. So, c 2  4ab. Now take log. Proof 2: Logarithm is a concave function: log a log b log (a+b)/2 ab(a+b)/2 (log a + log b)/2 This makes the potential function work! 12

ZIG LEMMA C y AB x root A x BC y ZIG T T’ ĉ  1 + 3[r T’ (x) – r T (x)] LEMMA 1: Proof: 13

ZIG-ZIG LEMMA Proof: ZIG-ZIG ĉ  3[r T’ (x) – r T (x)] LEMMA 2: D z C y AB x T A x B y CD z T’ 14

ZIG-ZAG LEMMA Proof: ZIG-ZAG ĉ  3[r T’ (x) – r T (x)] LEMMA 3: D z A y BC x T T’ x AB y CD z 15

Lemmas on each Splay Step ZIG-ZAG ĉ  3[r T’ (x) – r T (x)] LEMMA 3: D z A y BC x T T’ x AB y CD z ZIG-ZIG ĉ  3[r T’ (x) – r T (x)] LEMMA 2: D z C y AB x T A x B y CD z T’ C y AB x root A xBC y ZIG T T’ ĉ  1 + 3[r T’ (x) – r T (x)] LEMMA 1: 16

Amortized cost of Splay THEOREM 1: Amortized number of rotations by a Splay operation on an n-node BST is at most 3 log n + 1. only last splay step can be a ZIG Proof: Splay(x) operation is a sequence of splay steps: T before = T 0 T i-1 TiTi T k = T after  before =  0  i-1 ii  k =  after cici ĉiĉi  17

Amortized cost of dictionary op’s LEMMA 4: Let T be any n-node BST. (a) Increase in potential due to insertion of a new leaf into T, excluding the cost of splaying, is at most log n. (b) Splicing-out any node from T cannot increase the potential. LEMMA 4: Let T be any n-node BST. (a) Increase in potential due to insertion of a new leaf into T, excluding the cost of splaying, is at most log n. (b) Splicing-out any node from T cannot increase the potential. Proof: Exercise 1. THEOREM 2: Aggregate number of rotations by any sequence of m dictionary operations on an initially empty Splay Tree of max size  n is R  m(4 log n + 1). Hence, aggregate execution time for such a sequence is at most O(m+R) = O(m log n). THEOREM 2: Aggregate number of rotations by any sequence of m dictionary operations on an initially empty Splay Tree of max size  n is R  m(4 log n + 1). Hence, aggregate execution time for such a sequence is at most O(m+R) = O(m log n). Proof: Amortized number of rotations by a dictionary operation: Search: at most log n (by Theorem 1). Insert: at most log n (by Theorem 1 & Lemma 4a). Delete: at most log n (by Theorem 1 & Lemma 4b). 18

Competitiveness? 19

Dynamic Optimality Conjecture  Move-to-Front is competitive on linear lists with exchanges.  Is SPLAY competitive on binary search trees with rotations?  Sleator, Tarjan, “Self-adjusting binary search trees,” J. ACM 32(3), pp: , 1985 originally proposed Splay Trees and made the following bold conjecture: Dynamic Optimality Conjecture: s = any sequence of dictionary operations starting with any n-node BST. OPT = optimum off-line algorithm on s. OPT carries out each operation by starting from the root and accessing nodes on the corresponding search path, and possibly some additional nodes, at a cost of one per accessed node, and that between each operation performs an arbitrary number of rotations on any accessed node in the tree, at a cost of one per rotation. Then, C SPLAY (s) = O(n+C OPT (s)). 20

Special Cases  Dynamic Optimality Conjecture is a very strong claim and is still open.  Some special cases have been confirmed.  The following paper gives a survey, offers alternatives to splaying, and develops a new method to aid finding competitive online BST algorithms: Demaine, Harmon, Iacono, Kane, Pătraşcu, “The geometry of binary search trees,” SODA, pp: , [To see this paper, click here.]here  Some known special cases:  Sequential Access [Tar85]: Access each node of a given n-node BST in inorder (smallest, then 2 nd smallest, …, then largest) always starting an access from the root. This takes  (n) time on Splay trees! Red-Black trees are not competitive and take  (n log n) time.  Static Optimality [ST85]: Access only operations on an n-node BST. Static OPT (no rotations allowed) [Knuth’s dynamic programming algorithm]. (See AAW animation on Optimal Static BST, my CSE 3101 LS7 or LN9.) Red-Black trees are not competitive.  Dynamic Finger Search Optimality [CMSS00].  Working-set bound [ST85].  Entropy bound [ST85]. 21

Bibliography: [ST85] D.D. Sleator, R.E. Tarjan, “Self-adjusting binary search trees,” J. ACM 32(3): , [Original paper on Splay Trees.] [Tar85] R.E. Tarjan, “Sequential access on splay trees takes linear time,” Combinatorica 5(4), pp: , [DHIKP09] E.D. Demaine, D. Harmon, J. Iacono, D. Kane, M. Pătraşcu, “The geometry of binary search trees,” 20 th ACM-SIAM Symposium on Discrete Algorithms (SODA), pp: , [WDS06] C.C. Wang, J. Derryberry, D.D. Sleator “O(lg lg n)-competitive dynamic binary search trees,” 17 th SODA, pp: , [Goo00] M.T. Goodrich, “Competitive tree-structured dictionaries,” 11 th SODA, pp: , [CMSS00] R. Cole, B. Mishra, J. Schmidt, A. Siegel, “On the dynamic finger conjecture for splay trees. Parts I and II,” SIAM J. on Computing 30(1), pp: 1-85, [Wei06] M.A. Weiss, “Data Structures and Algorithms in C++,” 3 rd edition, Addison Wesley, [GT02] M.T. Goodrich, R. Tamassia, “Algorithm Design: Foundations, Analysis, and Internet Examples,” John Wiley & Sons, Inc.,

Exercises 23

1.Prove Lemma 4. 2.Consider the sequence of O(n) dictionary operations performed by procedure Test below. procedure Test(n) T   (* T is initialized to an empty dictionary *) for i  1..n do Insert (i,T) for i  1..n do Search (i,T) end Show the following time complexities hold for procedure Test: (a)  (n log n) time (i.e., at least  (n log n) and at most O(n log n) time) if T is a Red-Black tree. (b) At least  (n 2 ) time if T is a NaïveSplay tree. (c) At most O(n log n) time if T is a Splay tree. (d) [Hard: Extra Credit and Optional] Only  (n) time if T is a Splay tree. 3.For any positive integers m and n, n  m, let  (m,n) be the set of all possible sequences s of m dictionary operations applied to an initially empty dictionary, where n is the maximum number of keys in the dictionary at any time during s. We have already seen that executing any sequence s  (m,n) takes at most O(m log n) time on both Red-Black trees and Splay trees. On the other hand, Splay trees tend to adjust to the pattern of usage. Prove for all m and n as above, there is a sequence s  (m,n) whose execution takes at least  (m log n) time on Red-Black trees, but at most O(m) time on Splay trees. 4.Consider inserting the n keys  1,2,...,n , in that order, into an initially empty search tree T. (a) What is the tight asymptotic running time of this insertion sequence if T is a Red-Black tree? Describe the structure of the final Red-Black tree as accurately as you can. (b) Answer the same questions as in (a) if T is a Splay tree. (c) Consider a worst permutation of the n keys  1,2,...,n , so that if these n keys are inserted into an initially empty Splay tree, in the order given by that permutation, the running time is asymptotically maximized. Characterize one such worst permutation for Splay trees. What would be the tight asymptotic running time for the insertion sequence according to that permutation? 24

5.Split and Join on Splay trees: These are cut and paste operations on dictionaries. The Split operation takes as input a dictionary (a set of keys) A and a key value K (not necessarily in A), and splits A into two disjoint dictionaries B = { x  A | key[x]  K } and C = { x  A | key[x] > K }. (Dictionary A is destroyed as a result of this operation.) The Join operation is essentially the reverse; it takes two input dictionaries A and B such that every key in A < every key in B, and replaces them with their union dictionary C = A  B. (A and B are destroyed as a result of this operation.) Design efficient Split and Join on Splay trees and do their amortized analysis. [Note: Split and Join, as defined here, is the same we gave on BST’s, trees, and Red-Black trees.] 6.Top-Down Splay: Describe how to do top-down splay and analyze its amortized cost. 7.[Goodrich-Tamassia, C-3.23, C-3.24, p. 215] Let T be a Splay Tree consisting of the n sorted keys  x 1, x 2,..., x n . Consider an arbitrary sequence s of m successful searches on items in T. Let f i be the access frequency of x i in s. Assume each item is searched at least once (i.e., f i  1 for each i=1..n). (a) Show that the total execution time for sequence s on the Splay tree starting with T is [Hint: Modify the potential function by redefining the weight of a node as the sum of access frequencies of its descendents including itself.] (b) Design and analyze an algorithm that constructs an off-line static (i.e., fixed) Binary Search Tree T' consisting of the same items  x 1, x 2,..., x n , such that depth of item x i in T' is O( log (m/f i ) ). Your algorithm should take at most O(n log n) time, and for an extra credit, it should take O(n) time. [Hint: Let x i be the root, where i is the smallest index such that f 1 + f f i  m/2. Recurse on that idea over the subtrees.] (c) How do parts (a) and (b) above compare the execution times over the access sequence s on the on- line self-adjusting Splay tree T versus the off-line static binary search tree T'? [Hint: You may also consult [CLRS §15.5], AAW, and my CSE3101 Lecture Note 9 and Slide 7 on Knuth’s dynamic programming algorithm that constructs the optimal static binary search tree.] 25

8.[Goodrich-Tamassia, p. 676] Energy-Balanced Binary Search Tree (EB-BST): Red-Black trees use rotations to explicitly maintain a balanced BST to ensure efficiency in the dictionary operations. Splay trees are a self-adjusting alternative. On the other hand, an Energy-Balanced BST explicitly stores a potential energy parameter at each node in the tree. As dictionary operations are performed, the potential energies of tree nodes are increased or decreased. Whenever the potential energy of a node reaches a threshold level, we rebuild the subtree rooted at that node. More specifically, an EB-BST is a binary search tree T in which each node x maintains w[x] (weight of x, i.e., the number of nodes in the subtree rooted at x, including x. Assume w[nil]=0.) More importantly, each node x of T also maintains a potential energy parameter p[x]. Search, insert and delete operations are done as in standard (unbalanced) binary search trees, with one small modification. Every time we perform an insert or delete which traverses a search path from the root of T to a node v in T, we increment p[x] by 1 for each node x on that search path. If there is no node on this path such that p[x]  w[x]/2, then we are done. Otherwise, let x be the highest node in T (i.e., closest to the root) such that p[x]  w[x]/2. We rebuild the subtree rooted at x as a completely balanced binary search tree, and we zero out the potential fields of each node in this subtree (including x). (Note: a binary tree is called completely balanced, if for each node, the sizes of the left and right subtrees of that node differ by at most one.) (a) Show the rebuilding can be done in time linear in the size of the subtree that is being rebuilt. That is, design and analyze an algorithm that converts an arbitrary given BST into an equivalent completely balanced BST in time linear in the size of that tree. (b) Prove that if x and y are any two sibling nodes in an EB-BST, then w[y]  3w[x] +2. [Hint: observe the potential energy of their parent node.] (c) Using part (b), prove that the maximum height of any n-node EB-BST is O(log n). (d) Prove that the amortized time of any dictionary operation on any n-node EB-BST is O(log n). 26

END 27