Chapter 10 Search Structures


1 Chapter 10 Search Structures
KAIST- CS206

2 10.1 Optimal Binary Search Trees
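Figure: two binary search trees for the same identifiers (for, do, while, return, if), and an extended binary tree in which every empty subtree is replaced by an external (failure) node.

For the extended binary tree shown: internal path length I = 7, external path length E = 17. In general E = I + 2n for a binary tree with n internal nodes (here n = 5).

An optimal binary search tree for a1, a2, …, an (a1 < a2 < … < an) is one that minimizes the expected search cost, a weighted sum of internal and external path lengths, given
(1) pi = the probability of searching for ai, 1 <= i <= n, and
(2) qi = the probability of searching for an identifier that lies between ai and ai+1, 0 <= i <= n (with a0 = -infinity and an+1 = +infinity).

With the root at level 1 and Ei the external node reached by an unsuccessful search of class i, the cost is $\sum_{i=1}^{n} p_i \cdot \mathrm{level}(a_i) + \sum_{i=0}^{n} q_i \cdot (\mathrm{level}(E_i) - 1)$.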

3 10.1 Optimal Binary Search Tree: Example
Figure: the five possible binary search trees (a) to (e) for the identifiers (do, if, while). Their costs are given under two probability assignments:
(1) pi = qj = 1/7 for all i and j;
(2) p1 = 0.5, p2 = 0.1, p3 = 0.05, q0 = 0.15, q1 = 0.1, q2 = 0.05, q3 = 0.05.

Tree   Cost under (1)   Cost under (2)
(a)    15/7             2.65
(b)    13/7             1.9
(c)    15/7             1.5
(d)    15/7             2.05
(e)    15/7             1.6

Tree (b) is optimal under (1); tree (c) is optimal under (2).
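The costs listed for the example can be rechecked mechanically. The following is a minimal sketch, assuming the usual cost definition (an internal node a_i at level L contributes p_i * L, an external node E_i at level L contributes q_i * (L - 1), root at level 1). The `Node` type and the functions `walk` and `bstCost` are hypothetical names introduced for this illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node type for this sketch: key is the 1-based index i of a_i.
struct Node {
    int key;
    const Node* left;
    const Node* right;
    Node(int k, const Node* l = nullptr, const Node* r = nullptr)
        : key(k), left(l), right(r) {}
};

// In-order walk. The external (failure) nodes are met in the order
// E_0, E_1, ..., E_n, so a running counter identifies each of them.
double walk(const Node* t, int level, const std::vector<double>& p,
            const std::vector<double>& q, int& nextExternal) {
    if (t == nullptr) return q[nextExternal++] * (level - 1);
    double cost = walk(t->left, level + 1, p, q, nextExternal);
    cost += p[t->key] * level;
    cost += walk(t->right, level + 1, p, q, nextExternal);
    return cost;
}

// p[1..n] are the success probabilities (p[0] is unused padding),
// q[0..n] are the failure probabilities.
double bstCost(const Node* root, const std::vector<double>& p,
               const std::vector<double>& q) {
    assert(p.size() == q.size());
    int nextExternal = 0;
    return walk(root, 1, p, q, nextExternal);
}
```

With the second probability assignment, a tree with if at the root and do and while as its children evaluates to 1.9, the value listed for tree (b), and the right-leaning chain do, if, while evaluates to 1.5, the value listed for tree (c). The tree shapes are inferred from these values, since the figure itself is not recoverable from the transcript.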

4 10.1 Optimal Binary Search Tree: Algorithm
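$T_{i,j}$: an optimal binary search tree for $a_{i+1}, \ldots, a_j$, $i < j$ ($T_{i,i}$ is the empty tree).
$c_{i,j}$: cost of $T_{i,j}$, with $c_{i,i} = 0$.
$r_{i,j}$: root of $T_{i,j}$, with $r_{i,i} = 0$.
$w_{i,j}$: weight of $T_{i,j}$, $w_{i,j} = q_i + \sum_{l=i+1}^{j} (q_l + p_l)$, so $w_{i,i} = q_i$.

If $a_k$ is the root, with left subtree $L = T_{i,k-1}$ and right subtree $R = T_{k,j}$, then $p_k + \mathrm{weight}(L) + \mathrm{weight}(R) = w_{i,j}$ and

$c_{i,j} = \min_{i < k \le j} \{ p_k + \mathrm{cost}(L) + \mathrm{cost}(R) + \mathrm{weight}(L) + \mathrm{weight}(R) \}$
$\quad = \min_{i < k \le j} \{ w_{i,j} + c_{i,k-1} + c_{k,j} \}$
$\quad = w_{i,j} + \min_{i < k \le j} \{ c_{i,k-1} + c_{k,j} \}$

Example 10.2: (a1, a2, a3, a4) = (do, if, return, while), (p1, p2, p3, p4) = (3, 3, 1, 1), (q0, q1, q2, q3, q4) = (2, 3, 1, 1, 1).

j-i = 0:  w00 = 2, c00 = 0, r00 = 0;  w11 = 3, c11 = 0, r11 = 0;  w22 = 1, c22 = 0, r22 = 0;  w33 = 1, c33 = 0, r33 = 0;  w44 = 1, c44 = 0, r44 = 0
j-i = 1:  w01 = 8, c01 = 8, r01 = 1;  w12 = 7, c12 = 7, r12 = 2;  w23 = 3, c23 = 3, r23 = 3;  w34 = 3, c34 = 3, r34 = 4
j-i = 2:  w02 = 12, c02 = 19, r02 = 1;  w13 = 9, c13 = 12, r13 = 2;  w24 = 5, c24 = 8, r24 = 3
j-i = 3:  w03 = 14, c03 = 25, r03 = 2;  w14 = 11, c14 = 19, r14 = 2
j-i = 4:  w04 = 16, c04 = 32, r04 = 2

Figure: the resulting optimal tree T04. Its root is a2 = if (r04 = 2); the left subtree T01 has root do (r01 = 1); the right subtree T24 has root return (r24 = 3), whose right child is while (r34 = 4).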
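As a check on the last entry of the Example 10.2 table, the recurrence can be evaluated by hand for i = 0, j = 4, using only values that already appear in the table:

```latex
\begin{aligned}
c_{0,4} &= w_{0,4} + \min\{\, c_{0,0}+c_{1,4},\; c_{0,1}+c_{2,4},\; c_{0,2}+c_{3,4},\; c_{0,3}+c_{4,4} \,\} \\
        &= 16 + \min\{\, 0+19,\; 8+8,\; 19+3,\; 25+0 \,\} \\
        &= 16 + 16 = 32, \qquad r_{0,4} = 2 .
\end{aligned}
```

The minimum is reached at k = 2, which is why a2 = if becomes the root of the whole tree.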

5 10.1 Optimal Binary Search Tree: Algorithm(2)
template <class KeyType>
void BST<KeyType>::obst(int* p, int* q, Element<KeyType>* a, int n)
// Given n distinct identifiers a1 < a2 < … < an, and probabilities pj, 1 <= j <= n, and
// qi, 0 <= i <= n, this algorithm computes the cost c[i][j] of optimal binary search trees Tij
// for identifiers ai+1, …, aj. It also computes r[i][j], the root of Tij.
// w[i][j] is the weight of Tij.
{
    for (int i = 0; i < n; i++) {
        w[i][i] = q[i]; r[i][i] = c[i][i] = 0;     // initialize
        w[i][i+1] = q[i] + q[i+1] + p[i+1];        // optimal trees with one node
        r[i][i+1] = i + 1;
        c[i][i+1] = w[i][i+1];
    }
    w[n][n] = q[n]; r[n][n] = c[n][n] = 0;
    for (int m = 2; m <= n; m++)                   // find optimal trees with m nodes
        for (int i = 0; i <= n - m; i++) {
            int j = i + m;
            w[i][j] = w[i][j-1] + p[j] + q[j];
            int k = KnuthMin(i, j);
            // KnuthMin returns a value k in the range [r[i][j-1], r[i+1][j]]
            // minimizing c[i][k-1] + c[k][j]
            c[i][j] = w[i][j] + c[i][k-1] + c[k][j];
            r[i][j] = k;
        }
} // end of obst

Complexity: O(n^3) if the minimum is searched over all k = i+1, …, j; O(n^2) when KnuthMin restricts the search to the range [r[i][j-1], r[i+1][j]].
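The obst routine relies on class members (w, c, r) and on KnuthMin, none of which are shown on the slide. The following is a self-contained sketch of the same computation with the KnuthMin step written out inline. The `ObstTables` struct and the free function `obst` are assumptions made for this illustration, not the deck's class interface, and ties are broken in favor of the smallest k.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type for this sketch: three (n+1) x (n+1) tables.
struct ObstTables {
    std::vector<std::vector<int>> w, c, r;
};

// p[1..n] and q[0..n] are integer weights; p[0] is unused padding.
ObstTables obst(const std::vector<int>& p, const std::vector<int>& q) {
    const int n = static_cast<int>(q.size()) - 1;
    assert(static_cast<int>(p.size()) == n + 1);
    std::vector<std::vector<int>> zero(n + 1, std::vector<int>(n + 1, 0));
    ObstTables t{zero, zero, zero};
    for (int i = 0; i <= n; ++i) t.w[i][i] = q[i];   // empty trees: c = r = 0
    for (int m = 1; m <= n; ++m) {                   // trees with m nodes
        for (int i = 0; i + m <= n; ++i) {
            const int j = i + m;
            t.w[i][j] = t.w[i][j - 1] + p[j] + q[j];
            // Knuth's range; for m == 1 the only candidate root is j.
            const int lo = (m == 1) ? j : t.r[i][j - 1];
            const int hi = (m == 1) ? j : t.r[i + 1][j];
            int best = lo;
            for (int k = lo + 1; k <= hi; ++k) {
                if (t.c[i][k - 1] + t.c[k][j] < t.c[i][best - 1] + t.c[best][j]) {
                    best = k;                        // strictly better root found
                }
            }
            t.c[i][j] = t.w[i][j] + t.c[i][best - 1] + t.c[best][j];
            t.r[i][j] = best;
        }
    }
    return t;
}
```

Called with the data of Example 10.2, `obst({0, 3, 3, 1, 1}, {2, 3, 1, 1, 1})`, this reproduces w04 = 16, c04 = 32 and r04 = 2.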

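6 10.2 AVL Trees : Motivation
Objective: maintain the binary search tree structure so that access time stays O(log n) under dynamic changes of the identifiers (i.e., elements) in the tree.

Figure: two binary search trees holding the twelve month names. Inserting the months in alphabetical order (APR, AUG, DEC, FEB, JAN, JULY, JUNE, MAR, MAY, NOV, OCT, SEPT) is the worst case for dynamic insertions, because the tree degenerates into a chain. The other tree, with JULY at the root and FEB and MAY as its children, shows the best case for dynamic insertions.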

7 10.2 AVL Trees : Definition
Height-balanced: An empty tree is height-balanced. If T is a non-empty binary tree with TL and TR as its left and right subtrees, respectively, then T is height-balanced iff (1) TL and TR are height-balanced and (2) |hL - hR| < 2, where hL and hR are the heights of TL and TR, respectively.
Balance factor: The balance factor BF(k) of a node k in a binary tree is hL - hR, where hL and hR are the heights of the left and right subtrees of k.
An AVL tree is a binary search tree in which BF(k) = -1, 0, or 1 for every node k.
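The definition translates almost word for word into code. The sketch below uses a hypothetical key-less `Node` and recomputes heights on every call, which is fine for illustrating the definition but too slow for a real AVL tree, where each node would normally store its height or balance factor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical minimal node for this sketch (keys are irrelevant here).
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
};

// Height counted in nodes: an empty tree has height 0.
int height(const Node* t) {
    return t ? 1 + std::max(height(t->left), height(t->right)) : 0;
}

// BF(k) = hL - hR
int balanceFactor(const Node* t) {
    assert(t != nullptr);
    return height(t->left) - height(t->right);
}

// Direct transcription of the recursive definition of height-balanced.
bool isHeightBalanced(const Node* t) {
    if (!t) return true;                       // the empty tree is balanced
    return isHeightBalanced(t->left) && isHeightBalanced(t->right) &&
           std::abs(balanceFactor(t)) < 2;
}
```

A node with a single right child has balance factor -1 and is still balanced; adding a right grandchild makes the balance factor -2, which is the situation the rotations repair.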

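8 10.2 AVL Trees : Insertion
Figure: building an AVL tree by inserting month names (balance factors shown next to the nodes).
(a) Insert MARCH: MAR is the root.
(b) Insert MAY: MAY becomes the right child of MAR; BF(MAR) = -1.
(c) Insert NOVEMBER: NOV becomes the right child of MAY; BF(MAR) = -2, so an RR rotation makes MAY the root with children MAR and NOV.
(d) Insert AUGUST: AUG becomes the left child of MAR; BF(MAR) = +1, BF(MAY) = +1.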

9 10.2 AVL Trees : Insertion (2)
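Figure (continued):
(e) Insert APRIL: APR becomes the left child of AUG; BF(AUG) = +1 and BF(MAR) = +2 (BF(MAY) is also +2), so an LL rotation at MAR makes AUG the root of that subtree, with children APR and MAR.
(f) Insert JANUARY: JAN becomes the left child of MAR; BF(AUG) = -1 and BF(MAY) = +2, so an LR rotation makes MAR the root, with left subtree AUG (APR, JAN) and right subtree MAY (right child NOV).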

10 10.2 AVL Trees : Insertion (3)
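Figure (continued):
(g) Insert DECEMBER: DEC becomes the left child of JAN; no rotation is needed.
(h) Insert JULY: JULY becomes the right child of JAN; no rotation is needed.
(i) Insert FEBRUARY: FEB becomes the right child of DEC; BF(AUG) = -2 and BF(JAN) = +1, so an RL rotation makes DEC the root of that subtree, with left subtree AUG (left child APR) and right subtree JAN (FEB, JULY).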

11 10.2 AVL Trees : Insertion (4)
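Figure (continued):
(j) Insert JUNE: JUNE becomes the right child of JULY; BF(MAR) = +2 and BF(DEC) = -1, so an LR rotation makes JAN the root, with left subtree DEC (AUG, FEB) and right subtree MAR (JULY, MAY).
(k) Insert OCTOBER: OCT becomes the right child of NOV; BF(MAY) = -2, so an RR rotation makes NOV the root of that subtree, with children MAY and OCT.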

13 10.2 AVL Trees : Rebalancing rotations
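Figure: LL and RR rotations. Each row shows the balanced subtree, the unbalanced subtree following an insertion, and the rebalanced subtree.

LL: B is the left child of A. Before the insertion BF(A) = +1 and BF(B) = 0; BL, BR and AR all have height h, so the subtree has height h+2. The insertion increases the height of BL to h+1, giving BF(B) = +1 and BF(A) = +2. The rotation makes B the subtree root with left subtree BL and right child A; A takes BR as its left subtree and keeps AR. Both balance factors become 0 and the subtree height is again h+2.

RR: the mirror image. B is the right child of A, with BF(A) = -1 before the insertion. The insertion increases the height of BR to h+1, giving BF(B) = -1 and BF(A) = -2. The rotation makes B the subtree root with left child A (subtrees AL and BL) and right subtree BR. The subtree height is again h+2.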

14 10.2 AVL Trees : Rebalancing rotations(2)
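Figure: LR rotations. Each row shows the balanced subtree, the unbalanced subtree following an insertion, and the rebalanced subtree. B is the left child of A and C is the right child of B; in every case the rotation makes C the subtree root with left child B and right child A.

LR(a): C is the newly inserted node and all subtrees are empty. Before the rotation BF(B) = -1 and BF(A) = +2; afterwards all three balance factors are 0.

LR(b): BL and AR have height h, CL and CR have height h-1, and the insertion goes into CL. Before the rotation BF(C) = +1, BF(B) = -1 and BF(A) = +2. Afterwards B has subtrees BL and CL with BF(B) = 0, A has subtrees CR and AR with BF(A) = -1, and the subtree height is again h+2.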

15 10.2 AVL Trees : Rebalancing rotations(3)
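Figure (continued): LR(c). The insertion goes into CR. Before the rotation BF(C) = -1, BF(B) = -1 and BF(A) = +2. The rotation makes C the subtree root with left child B (subtrees BL and CL) and right child A (subtrees CR and AR); afterwards BF(B) = +1, BF(A) = 0 and the subtree height is again h+2.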
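The four rotation types can be expressed with two primitive rotations, since LR is an RR rotation on B followed by an LL rotation on A, and RL is the mirror image. The following is a minimal sketch of recursive AVL insertion along those lines. The `AvlNode` layout (storing heights rather than balance factors) and all function names are assumptions made for this illustration, and nodes are never freed, to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>

struct AvlNode {                 // hypothetical node layout for this sketch
    std::string key;
    int height = 1;              // height in nodes of the subtree rooted here
    AvlNode* left = nullptr;
    AvlNode* right = nullptr;
    explicit AvlNode(std::string k) : key(std::move(k)) {}
};

int height(const AvlNode* t) { return t ? t->height : 0; }
int bf(const AvlNode* t) { return height(t->left) - height(t->right); }
void update(AvlNode* t) {
    t->height = 1 + std::max(height(t->left), height(t->right));
}

// LL case: the left child B becomes the subtree root, BR moves under A.
AvlNode* rotateRight(AvlNode* a) {
    AvlNode* b = a->left;
    a->left = b->right;
    b->right = a;
    update(a);
    update(b);
    return b;
}

// RR case: the right child B becomes the subtree root, BL moves under A.
AvlNode* rotateLeft(AvlNode* a) {
    AvlNode* b = a->right;
    a->right = b->left;
    b->left = a;
    update(a);
    update(b);
    return b;
}

// Inserts key and returns the (possibly new) root of the subtree.
AvlNode* insert(AvlNode* t, const std::string& key) {
    if (!t) return new AvlNode(key);
    if (key < t->key) t->left = insert(t->left, key);
    else if (key > t->key) t->right = insert(t->right, key);
    else return t;                                   // duplicate: ignored
    update(t);
    if (bf(t) == 2) {                                // left side too high
        if (bf(t->left) < 0) t->left = rotateLeft(t->left);     // LR
        return rotateRight(t);                                  // LL
    }
    if (bf(t) == -2) {                               // right side too high
        if (bf(t->right) > 0) t->right = rotateRight(t->right); // RL
        return rotateLeft(t);                                   // RR
    }
    return t;
}
```

Inserting MAR, MAY, NOV, AUG, APR, JAN, DEC, JULY, FEB, JUNE, OCT, SEPT in that order with this sketch goes through the same rotations as the insertion slides and ends with JAN at the root.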

16 10.2 AVL Trees : Performance comparisons
Operation               Sequential list   Linked list   AVL tree
Search for x            O(log n)          O(n)          O(log n)
Search for k-th item    O(1)              O(k)          O(log n)
Delete x                O(n)              O(1) [1]      O(log n)
Delete k-th item        O(n-k)            O(k)          O(log n)
Insert x                O(n)              O(1) [2]      O(log n)
Output in order         O(n)              O(n)          O(n)

[1] Doubly linked list and position of x known.
[2] Position for insertion known.

17 10.3 2-3 Trees
A search tree whose node degree may be more than 2; a special case of B-trees.
Definition (2-3 tree):
(1) Each internal node is a 2-node or a 3-node. A 2-node has one element and a 3-node has two elements.
(2) If e is a 2-node:
    key of every element in LeftChild(e) < key of e
    key of every element in MiddleChild(e) > key of e
(3) If e is a 3-node with keys keyL < keyR:
    key of every element in LeftChild(e) < keyL of e
    keyL of e < key of every element in MiddleChild(e) < keyR of e
    key of every element in RightChild(e) > keyR of e
(4) All external nodes are at the same level.
Figure: a 2-3 tree with root A (40) and children B (10, 20) and C (80).
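The ordering conditions in the definition are exactly what a search needs. The following is a small sketch of searching a 2-3 tree; the `Node23` layout and the `search` function are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

// Hypothetical node layout for this sketch. A 2-node uses keyL only;
// a 3-node uses keyL < keyR and additionally the right child.
struct Node23 {
    int keyL;
    int keyR;                   // meaningful only when isThree is true
    bool isThree;
    Node23* left = nullptr;
    Node23* middle = nullptr;
    Node23* right = nullptr;    // used only by 3-nodes
    explicit Node23(int l) : keyL(l), keyR(0), isThree(false) {}
    Node23(int l, int r) : keyL(l), keyR(r), isThree(true) {}
};

// Returns the node that contains x, or nullptr if x is not in the tree.
const Node23* search(const Node23* t, int x) {
    while (t) {
        if (x == t->keyL || (t->isThree && x == t->keyR)) return t;
        if (x < t->keyL) t = t->left;                         // x < keyL
        else if (!t->isThree || x < t->keyR) t = t->middle;   // keyL < x (< keyR)
        else t = t->right;                                    // x > keyR
    }
    return nullptr;
}
```

Each iteration descends one level, and condition (4) keeps all external nodes at the same level, which is where the O(log n) search time comes from.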

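18 10.3.3 Inserting into a 2-3 Tree
Searching: O(log n). Inserting: O(log n).
Figure: successive insertions into the 2-3 tree with root A (40) and children B (10, 20) and C (80).
(a) 70 inserted: 70 joins leaf C, which becomes (70, 80).
(b) 30 inserted: leaf B would hold 10, 20 and 30, so it splits into B (10) and a new node D (30), and 20 moves up into A, which becomes (20, 40).
(c) 60 inserted: leaf C would hold 60, 70 and 80, so it splits into C (60) and a new node E (80), and 70 moves up. A would then hold 20, 40 and 70, so it splits into A (20) and a new node F (70), and 40 moves up into a new root G.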

19 10.3.4 Deletion from a 2-3 Tree
Deleting: O(log n).
Figure: successive deletions from the 2-3 tree with root A (50, 80) and leaves B (10, 20), C (60, 70) and D (90, 95).
70 deleted: C becomes (60).
90 deleted: D becomes (95).

20 10.3.4 Deletion from a 2-3 Tree (2)
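Figure (continued), starting from A (50, 80), B (10, 20), C (60), D (95):
60 deleted: C becomes empty. Its sibling B (10, 20) is a 3-node, so a rotation moves 20 up into A and 50 down into C. Result: A (20, 80), B (10), C (50), D (95).
95 deleted: D becomes empty. Its sibling C (50) is a 2-node, so the two are combined with 80 from A. Result: A (20), B (10), C (50, 80).
50 deleted: C becomes (80).
10 deleted: B becomes empty. Its sibling C (80) is a 2-node, so the two are combined with 20 from A into B (20, 80). A is left with no elements, so it is deleted and B becomes the new root.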

21 10.3.4 Deletion from a 2-3 Tree : Rotations
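Figure: the rotation cases, where p is the node that has become empty, r is its parent and q is the sibling 3-node that gives up an element.
(a) p is the left child of r
(b) p is the middle child of r
(c) p is the right child of r
In each case one element of q moves up into r and one element of r moves down into p, and one subtree of q (among a, b, c, d, and e in case (c)) moves over to p. A "?" marks a position in r that may or may not hold an element.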

22 10.3.4 Deletion from a 2-3 Tree : Procedure
Step 1: Modify node p as necessary to reflect its status after the desired element has been deleted.
Step 2:
    for (; p has zero elements && p != root; p = r) {
        let r be the parent of p, and let q be the left or right sibling of p (as appropriate);
        if (q is a 3-node) perform a rotation
        else perform a combine;
    }
Step 3: If p has zero elements, then p must be the root. The left child of p becomes the new root, and node p is deleted.

23 10.6 B-Trees
Definition: An m-way search tree is either empty or satisfies the following properties:
(1) The root has at most m subtrees and has the structure n, A0, (K1, A1), (K2, A2), …, (Kn, An), where the Ai are pointers to subtrees, the Ki are key values, and 1 <= n < m.
(2) Ki < Ki+1, i = 1, …, n-1.
(3) Ki < all key values in subtree Ai < Ki+1, i = 1, …, n-1.
(4) Kn < all key values in subtree An, and all key values in subtree A0 < K1.
(5) The subtrees Ai, i = 0, …, n, are also m-way search trees.

node   schematic format (0 denotes an empty subtree)
a      2, b, (20, c), (40, d)
b      2, 0, (10, 0), (15, 0)
c      2, 0, (25, e), (30, 0)
d      2, 0, (45, 0), (50, 0)
e      1, 0, (28, 0)

Figure 10.35: Example of a 3-way search tree that is not a 2-3 tree. The root a (20, 40) has children b (10, 15), c (25, 30) and d (45, 50); e (28) is a child of c.

Definition: A B-tree of order m is an m-way search tree that is either empty or satisfies the following properties:
(1) The root node has at least two children.
(2) All nodes other than the root node and failure nodes have at least ⌈m/2⌉ children.
(3) All failure nodes are at the same level.
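Searching an m-way search tree follows directly from the node format n, A0, (K1, A1), …, (Kn, An): find the first key that is not smaller than x, then either stop or descend into the subtree to its left. The following is a sketch with a hypothetical in-memory `MNode`; a disk-based B-tree would read each node from disk instead of following a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory node for this sketch.
// keys[i-1] holds K_i and child[i] holds A_i, so child has n + 1 entries.
struct MNode {
    std::vector<int> keys;
    std::vector<const MNode*> child;
    explicit MNode(std::vector<int> k)
        : keys(std::move(k)), child(keys.size() + 1, nullptr) {}
};

// Returns the node containing x, or nullptr if the search ends at a
// failure node (an empty subtree).
const MNode* search(const MNode* t, int x) {
    while (t) {
        std::size_t i = 0;
        while (i < t->keys.size() && x > t->keys[i]) ++i;  // first key >= x
        if (i < t->keys.size() && t->keys[i] == x) return t;
        t = t->child[i];   // all keys in A_i lie between K_i and K_{i+1}
    }
    return nullptr;
}
```

On the tree of Figure 10.35, a search for 28 visits a, then c, then e, while a search for 27 follows the same path and ends at a failure node below e.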

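24 10.6.3 B-Trees: Properties
Let N be the number of keys in a B-tree of order m whose failure nodes are all at level l+1. Then

$N + 1 = \text{number of failure nodes} = \text{number of nodes at level } l+1 \ge 2\lceil m/2 \rceil^{\,l-1}$,

so the minimum number of keys in such a tree is $2\lceil m/2 \rceil^{\,l-1} - 1$. Conversely, if there are N key values, the number of levels l of the B-tree satisfies

$l \le \log_{\lceil m/2 \rceil}\{(N+1)/2\} + 1$.

Choice of m: depends on the access time, which is the time for reading nodes from disk plus the time to search the nodes for x.

Figure: total maximum search time plotted against m (values recoverable from the transcript: m = 50, 125, 400 on the horizontal axis; 5.7 and 6.8 on the vertical axis).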
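The level bound can be evaluated without floating-point logarithms by growing the minimum number of failure nodes one level at a time. The following is a sketch; the function name `maxLevels` is hypothetical, and it assumes m >= 3, N >= 1 and values small enough that the 64-bit products do not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Largest l with 2 * ceil(m/2)^(l-1) <= N + 1, i.e. the maximum number of
// levels a B-tree of order m holding N keys can have.
int maxLevels(std::uint64_t N, unsigned m) {
    assert(m >= 3 && N >= 1);
    const std::uint64_t d = (m + 1) / 2;     // ceil(m/2)
    int l = 1;
    std::uint64_t minFailureNodes = 2;       // l = 1: root with two children
    while (minFailureNodes * d <= N + 1) {   // does one more level still fit?
        minFailureNodes *= d;
        ++l;
    }
    return l;
}
```

For example, with m = 200 and N = 1999998 keys, the minimum number of failure nodes for four levels would be 2 * 100^3 = 2000000, which exceeds N + 1, so the tree has at most three levels.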

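25 10.9 Tries
Figure: a trie for the keys bluebird, bunting, cardinal, chickadee, godwit, goshawk, gull, oriole, thrasher, thrush and wren. Each branch node has 27 link fields (blank, a, …, z). The root branches on the first character (b, c, g, o, t, w); oriole and wren are reached directly from the root, while the other keys need further branch nodes that sample the second and later characters.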

26 10.9 Tries : Searching and Sampling Strategies
1. Searching: O(l), where l is the number of levels.
2. How to reduce l: choose a sampling strategy that decides which part of key value x is examined at the i-th level.
Example: Sample(x, i) = $x_{r(x,i)}$ for r(x, i) a randomization function.

Figure: a trie for the keys bluebird, bunting, cardinal, chickadee, godwit, goshawk, gull, oriole, thrasher, thrush and wren, sampling one character at a time from right to left. Branch nodes have the fields blank, a, b, …, z.

Figure: an optimal trie for the same eleven keys, with sampling on the first level done by using the fourth character. The fourth characters of the keys are all distinct, so a single branch node separates them.
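Searching and inserting with a sampling function can be sketched as follows. This illustration assumes the simplest strategy, Sample(x, i) = the i-th character from the left (blank once the key is exhausted), and lowercase keys a to z. The `TrieNode` layout and the function names are hypothetical, and nodes are never freed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical layout: a node is either a branch node (27 links: blank,
// a..z) or an element node holding one whole key.
struct TrieNode {
    bool isBranch;
    std::string key;                     // element nodes only
    std::array<TrieNode*, 27> link{};    // branch nodes only, all null at first
    explicit TrieNode(bool branch, std::string k = "")
        : isBranch(branch), key(std::move(k)) {}
};

// Sample(x, i): slot of the i-th character (1-based); slot 0 is the blank.
int sample(const std::string& x, std::size_t i) {
    return i <= x.size() ? x[i - 1] - 'a' + 1 : 0;
}

// Follows one link per level until an element node or a null link is hit.
const TrieNode* search(const TrieNode* t, const std::string& x) {
    for (std::size_t i = 1; t && t->isBranch; ++i) t = t->link[sample(x, i)];
    return (t && t->key == x) ? t : nullptr;
}

// Inserts x below t (which sits at the given level); returns the new subtrie.
TrieNode* insert(TrieNode* t, const std::string& x, std::size_t level = 1) {
    if (!t) return new TrieNode(false, x);      // empty slot: new element node
    if (!t->isBranch) {
        if (t->key == x) return t;              // already present
        TrieNode* b = new TrieNode(true);       // grow: add a branch node
        b->link[sample(t->key, level)] = t;     // old element moves one level down
        t = b;
    }
    const int s = sample(x, level);
    t->link[s] = insert(t->link[s], x, level + 1);
    return t;
}
```

Replacing `sample` with a different strategy, such as right-to-left sampling or using the fourth character first, changes the shape and depth of the trie but not the search and insert logic.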

27 10.9 Tries : Insertion and Deletion
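Figure: section of the trie showing the changes resulting from inserting bobwhite and bluejay (node labels in the figure: σ, δ1, δ2, δ3, ρ). bobwhite is placed directly under the branch node reached through b, using its o link. bluejay shares the prefix "blue" with bluebird, so three new branch nodes (labeled δ1, δ2, δ3 in the figure) are needed before the two keys can be told apart by their fifth characters, b and j.

The trie grows when inserting and shrinks when deleting. To support shrinking, a count data member is needed in each branch node.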

