Download presentation
Presentation is loading. Please wait.
1
Static Dictionaries Collection of items. Each item is a pair. (key, element) Pairs have different keys. Operations are: initialize/create get (search) Each item/key/element has an estimated access frequency (or probability). Consider binary search tree only.
2
Example KeyProbability a0.8 b0.1 c a b c a < b < c Cost = 0.8 * 2 + 0.1 * 1 + 0.1 * 2 = 1.9 a b c Cost = 0.8 * 1 + 0.1 * 2 + 0.1 * 3 = 1.2
3
Search Types Successful. Search for a key that is in the dictionary. Terminates at an internal node. Unsuccessful. Search for a key that is not in the dictionary. Terminates at an external/failure node. f0f0 a b c f1f1 f2f2 f3f3
4
Internal And External Nodes A binary tree with n internal nodes has n + 1 external nodes. Let s 1, s 2, …, s n be the internal nodes, in inorder. key(s 1 ) < key(s 2 ) < … < key(s n ). Let key(s 0 ) = –infinity and key(s n+1 ) = infinity. Let f 0, f 1, …, f n be the external nodes, in inorder. f i is reached iff key(s i ) < search key < key(s i+1 ). f0f0 a b c f1f1 f2f2 f3f3
5
Cost Of Binary Search Tree Let p i = probability for key(s i ). Let q i = probability for key(s i ) < search key < key(s i+1 ). Sum of ps and qs = 1. Cost of tree = 0 <= i <= n q i (level(f i ) – 1) + <= i <= n p i * level(s i ) Cost = weighted path length. f0f0 a b c f1f1 f2f2 f3f3
6
Brute Force Algorithm Generate all binary search trees with n internal nodes. Compute the weighted path length of each. Determine tree with minimum weighted path length. Number of trees to examine is O(4 n /n 1.5 ). Brute force approach is impractical for large n.
7
Dynamic Programming Keys are a 1 < a 2 < …< a n. Let T i j = least cost tree for a i+1, a i+2, …, a j. T 0n = least cost tree for a 1, a 2, …, a n. a3a3 a5a5 a6a6 a7a7 a4a4 f2f2 f3f3 f5f5 f4f4 f7f7 f6f6 T i j includes p i+1, p i+2, …, p j and q i, q i+1, …, q j. T 2,7
8
T i j = least cost tree for a i+1, a i+2, …, a j. c i j = cost of T i j = i <= u <= j q u (level(f u ) – 1) + i < u <= j p u * level(s u ). r i j = root of T i j. w i j = weight of T i j = sum of ps and qs in T i j = p i+1 + p i+2 + …+ p j + q i + q i+1 + … + q j Terminology
9
i = j T i j includes p i+1, p i+2, …, p j and q i, q i+1, …, q j. T i i includes q i only. fifi T ii c i i = cost of T i i = 0. r i i = root of T i i = 0. w i i = weight of T i i = sum of ps and qs in T i i = q i
10
i < j T i j = least cost tree for a i+1, a i+2, …, a j. T i j includes p i+1, p i+2, …, p j and q i, q i+1, …, q j. Let a k, i < k <= j, be in the root of T i j. akak LR T i j L includes p i+1, p i+2, …, p k-1 and q i, q i+1, …, q k-1. R includes p k+1, p i+2, …, p j and q k, q k+1, …, q j.
11
cost( L ) L includes p i+1, p i+2, …, p k-1 and q i, q i+1, …, q k-1. cost( L ) = weighted path length of L when viewed as a stand alone binary search tree. L
12
Contribution To c ij c i j = i <= u <= j q u (level(f u ) – 1) + i < u <= j p u * level(s u ). When L is viewed as a subtree of T i j, the level of each node is 1 more than when L is viewed as a stand alone tree. So, contribution of L to c ij is cost( L ) + w i k-1. L akak LR T i j
13
c ij Contribution of L to c ij is cost( L ) + w i k-1. Contribution of R to c ij is cost( R ) + w kj. c ij = cost( L ) + w i k-1 + cost( R ) + w kj + p k = cost( L ) + cost( R ) + w ij akak LR T i j
14
c ij c ij = cost( L ) + cost( R ) + w ij cost( L ) = c ik-1 cost( R ) = c kj c ij = c ik-1 + c kj + w ij Don’t know k. c ij = min i < k <= j {c ik-1 + c kj } + w ij akak LR T i j
15
c ij c ij = min i < k <= j {c ik-1 + c kj } + w ij r ij = k that minimizes right side. akak LR T i j
16
Computation Of c 0n And r 0n Start with c i i = 0, r i i = 0, w i i = q i, 0 <= i <= n (zero-key trees). Use c ij = min i < k <= j {c ik-1 + c kj } + w ij to compute c ii+1, r i i+1, 0 <= i <= n – 1 (one-key trees). Now use the equation to compute c ii+2, r i i+2, 0 <= i <= n – 2 (two-key trees). Now use the equation to compute c ii+3, r i i+3, 0 <= i <= n – 3 (three-key trees). Continue until c 0n and r 0n (n-key tree) have been computed.
17
Computation Of c 0n And r 0n c ij, r ij, i <= j 1234 1 2 3 4 0 0
18
Complexity c ij = min i < k <= j {c ik-1 + c kj } + w ij O(n) time to compute one c ij. O(n 2 ) c ij s to compute. Total time is O(n 3 ). May be reduced to O(n 2 ) by using c ij = min ri,j-1 < k <= ri+1,j {c ik-1 + c kj } + w ij
19
Construct T 0n Root is r 0n. Suppose that r 0n = 10. a 10 T 0n T 09 T 10,n Construct T 09 and T 10,n recursively. Time is O(n).
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.