Data Structure and Algorithms

Data Structure and Algorithms
Dr. Maheswari Karthikeyan Lecture 9 20/04/2013

Hashing Dynamic Programming

Hash Table Hash table: a data structure, implemented as an array of objects, where the search keys correspond to the array indexes A “much smarter” version of array Searching in average takes O(1) time Drawbacks: Traverse the data In random order No range search

Direct Addressing If the key field is “age”, you have at most 120 numbers The domain of age numbers is small • So, an array with 120 slots can be built • An array can support time O(1) search by “direct-addressing” What if there are multiple persons having the same age? • What if the domain of the key field is very large? E.g., Salary: [0 – 100,000,000]

Direct Addressing Student Records 1 2 3 8 9 13 14

Hashing Hashing is a function that maps each key to a location in memory. A key’s location does not depend on other elements, and does not change after insertion. A good hash function should be easy to compute. With such a hash function, the dictionary operations can be implemented in O(1) time.

Hashing Map key values to hash table addresses keys -> hash table address This applies to find, insert, and remove Usually: integers -> {0, 1, 2, …, Hsize-1} Typical example: f(n) = n mod Hsize Non-numeric keys converted to numbers For example, strings converted to numbers as Sum of ASCII values First three characters

Hashing Student Records 9 10 20 39 4 14 8 (mod 9)

Hashing Choose a hash function h; it also determines the hash table size. Given an item x with key k, put x at location h(k). To find if x is in the set, check location h(k). What to do if more than one keys hash to the same value. This is called collision. Two methods to handle collision: Separate chaining Open addressing

Separate chaining Maintain a list of all elements that hash to the same value Search -- using the hash function to determine which list to traverse 14 42 29 20 1 36 56 23 16 24 31 17 7 2 3 4 5 6 8 9 10

Separate chaining 53 = 4 x 11 + 9 53 mod 11 = 9 14 42 29 20 1 36 56 23
16 24 31 17 7 2 3 4 5 6 8 9 10 53 14 42 29 20 1 36 56 23 16 24 31 17 7 2 3 4 5 6 8 9 10

Analysis of Hashing with Chaining
Worst case All keys hash into the same bucket a single linked list. insert, delete, find take O(n) time. Average case Keys are uniformly distributed into buckets O(1+N/B): N is the number of elements in a hash table, B is the number of buckets. If N = O(B), then O(1) time per operation. N/B is called the load factor of the hash table.

Open addressing Linear Probing Quadratic Probing Double hashing
If collision happens, alternative cells are tried until an empty cell is found. Three methods of Open addressing: Linear Probing Quadratic Probing Double hashing 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28

Linear Probing (insert 12)
12 = 1 x 12 mod 11 = 1 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28 12

Search with linear probing (Search 15)
15 = 1 x 15 mod 11 = 4 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28 12 NOT FOUND !

Deletion in Hashing with Linear Probing
Since empty buckets are used to terminate search, standard deletion does not work. One simple idea is to not delete, but mark. Insert: put item in first empty or marked bucket. Search: Continue past marked buckets. Delete: just mark the bucket as deleted. Advantage: Easy and correct. Disadvantage: table can become full with dead items.

Quadratic Probing Check H(x) If collision occurs check H(x) + 1
Solves the clustering problem in Linear Probing Check H(x) If collision occurs check H(x) + 1 If collision occurs check H(x) + 4 If collision occurs check H(x) + 9 If collision occurs check H(x) + 16 ... H(x) + i2

Quadratic Probing (insert 12)
12 = 1 x 12 mod 11 = 1 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28 12

Double Hashing When collision occurs use a second hash function
Hash2 (x) = R – (x mod R) R: greatest prime number smaller than table-size Inserting 12 H2(x) = 7 – (x mod 7) = 7 – (12 mod 7) = 2 Check H(x) If collision occurs check H(x) + 2 If collision occurs check H(x) + 4 If collision occurs check H(x) + 6 If collision occurs check H(x) + 8 H(x) + i * H2(x)

Double Hashing (insert 12)
12 = 1 x 12 mod 11 = 1 7 –12 mod 7 = 2 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28 1 2 3 4 5 6 7 8 9 10 42 14 16 24 31 28 12

Collision Functions Hi(x)= (H(x)+i) mod B Linear probing
Quadratic probing Hi(x)= (H(x)+ i * H2(x)) mod B - Double hashing

Example Insert the following values into a hash table of size 10 using the hash equation (x2 +1) % 10 using the linear probing, quadratic probing and separate chaining technique Insert these values in sequential order: 1,2,5,6,8,4,9,3,10,7

Dynamic Programming An algorithm design technique for optimization problems (similar to divide and conquer) Applicable when subproblems are not independent Subproblems share subsubproblems A divide and conquer approach would repeatedly solve the common subproblems Dynamic programming solves every subproblem just once and stores the answer in a table

Dynamic Programming Used for optimization problems
A set of choices must be made to get an optimal solution Find a solution with the optimal value (minimum or maximum) There may be many solutions that return the optimal value: an optimal solution

Dynamic Programming Algorithm
Characterize the structure of an optimal solution Recursively define the value of an optimal solution Compute the value of an optimal solution in a bottom-up fashion Construct an optimal solution from computed information

Elements of Dynamic Programming
Optimal Substructure An optimal solution to a problem contains within it an optimal solution to subproblems Optimal solution to the entire problem is build in a bottom-up manner from optimal solutions to subproblems Overlapping Subproblems If a recursive algorithm revisits the same subproblems over and over  the problem has overlapping subproblems

Matrix-Chain Multiplication
Problem: given a sequence A1, A2, …, An, compute the product: A1  A2  An Matrix compatibility: C = A  B colA = rowB rowC = rowA colC = colB A1  A2  Ai  Ai+1  An coli = rowi+1

In what order should we multiply the matrices? A1  A2  An Parenthesize the product to get the order in which matrices are multiplied E.g.: A1  A2  A3 = ((A1  A2)  A3) = (A1  (A2  A3)) Which one of these orderings should we choose? The order in which we multiply the matrices has a significant impact on the cost of evaluating the product

rows[A]  cols[A]  cols[B]
MATRIX-MULTIPLY(A, B) if columns[A]  rows[B] then error “incompatible dimensions” else for i  1 to rows[A] do for j  1 to columns[B] do C[i, j] = 0 for k  1 to columns[A] do C[i, j]  C[i, j] + A[i, k] B[k, j] rows[A]  cols[A]  cols[B] multiplications k j cols[B] j cols[B] k i i * = rows[A] A B C rows[A]

Example A1  A2  A3 A1: 10 x 100 A2: 100 x 5 A3: 5 x ((A1  A2)  A3): A1  A2 = 10 x 100 x 5 = 5,000 (10 x 5) ((A1  A2)  A3) = 10 x 5 x 50 = 2,500 Total: 7,500 scalar multiplications 2. (A1  (A2  A3)): A2  A3 = 100 x 5 x 50 = 25,000 (100 x 50) (A1  (A2  A3)) = 10 x 100 x 50 = 50,000 Total: 75,000 scalar multiplications one order of magnitude difference!!

Given a chain of matrices A1, A2, …, An, where for i = 1, 2, …, n matrix Ai has dimensions pi-1x pi, fully parenthesize the product A1  A2  An in a way that minimizes the number of scalar multiplications. A1  A2  Ai  Ai+1  An p0 x p1 p1 x p2 pi-1 x pi pi x pi+1 pn-1 x pn

1. The Structure of an Optimal Parenthesization
Notation: Ai…j = Ai Ai+1  Aj, i  j For i < j: Ai…j = Ai Ai+1  Aj = Ai Ai+1  Ak Ak+1  Aj = Ai…k Ak+1…j Suppose that an optimal parenthesization of Ai…j splits the product between Ak and Ak+1, where i  k < j

Optimal Substructure Ai…j = Ai…k Ak+1…j The parenthesization of the “prefix” Ai…k must be an optimal parenthesization ! An optimal solution to an instance of the matrix-chain multiplication contains within it optimal solutions to subproblems

2. A Recursive Solution Subproblem:
determine the minimum cost of parenthesizing Ai…j = Ai Ai+1  Aj for 1  i  j  n Let m[i, j] = the minimum number of multiplications needed to compute Ai…j Full problem (A1..n): m[1, n] i = j: Ai…i = Ai  m[i, i] = 0 for i = 1, 2, …, n

2. A Recursive Solution Consider the subproblem of parenthesizing Ai…j = Ai Ai+1  Aj for 1  i  j  n = Ai…k Ak+1…j for i  k < j Assume that the optimal parenthesization splits the product Ai Ai+1  Aj at k (i  k < j) m[i, j] = pi-1pkpj m[i, k] m[k+1,j] m[i, k] m[k+1, j] pi-1pkpj min # of multiplications to compute Ai…k min # of multiplications to compute Ak+1…j # of multiplications to compute Ai…kAk…j

2. A Recursive Solution (cont.)
m[i, j] = m[i, k] m[k+1, j] pi-1pkpj We do not know the value of k There are j – i possible values for k: k = i, i+1, …, j-1 Minimizing the cost of parenthesizing the product Ai Ai+1  Aj becomes: if i = j m[i, j] = min {m[i, k] + m[k+1, j] + pi-1pkpj} if i < j ik<j

3. Computing the Optimal Costs
if i = j m[i, j] = min {m[i, k] + m[k+1, j] + pi-1pkpj} if i < j ik<j How many subproblems do we have? Parenthesize Ai…j for 1  i  j  n One problem for each choice of i and j 1 2 3 n  (n2) n j 3 2 1

3. Computing the Optimal Costs (cont.)
if i = j m[i, j] = min {m[i, k] + m[k+1, j] + pi-1pkpj} if i < j ik<j How do we fill in the tables m[1..n, 1..n] and s[1..n, 1..n] ? Determine which entries of the table are used in computing m[i, j] Ai…j = Ai…k Ak+1…j Fill in m such that it corresponds to solving problems of increasing length

MATRIX-CHAIN-ORDER(P) n  length[p] –1 for i  1 to n do m[i,i]  0 for l  2 to n do for i  1 to n - l +1 do j  i + l –1 m[i,j]   for k  i to j-1 do q  m[i,k] + m[k+1,j] + p i-1 p k p j if q < m[i,j] then m[i,j]  q s[i,j]  k return m and s

if i = j m[i, j] = min {m[i, k] + m[k+1, j] + pi-1pkpj} if i < j ik<j Length = 1: i = j, i = 1, 2, …, n Length = 2: j = i + 1, i = 1, 2, …, n-1 second first 1 2 3 n m[1, n] gives the optimal solution to the problem n j 3 Compute rows from bottom to top and from left to right In a similar matrix s we keep the optimal values of k 2 1 i

Example: min {m[i, k] + m[k+1, j] + pi-1pkpj}
m[2, 2] + m[3, 5] + p1p2p5 m[2, 3] + m[4, 5] + p1p3p5 m[2, 4] + m[5, 5] + p1p4p5 m[2, 5] = min k = 3 k = 4 1 2 3 4 5 6 6 5 Values m[i, j] depend only on values that have been previously computed 4 j 3 2 1 i

Example min {m[i, k] + m[k+1, j] + pi-1pkpj}
2 3 Compute A1  A2  A3 A1: 10 x 100 (p0 x p1) A2: 100 x 5 (p1 x p2) A3: 5 x (p2 x p3) m[i, i] = 0 for i = 1, 2, 3 m[1, 2] = m[1, 1] + m[2, 2] + p0p1p2 (A1A2) = *100* 5 = 5,000 m[2, 3] = m[2, 2] + m[3, 3] + p1p2p3 (A2A3) = * 5 * 50 = 25,000 m[1, 3] = min m[1, 1] + m[2, 3] + p0p1p3 = 75,000 (A1(A2A3)) m[1, 2] + m[3, 3] + p0p2p3 = 7, ((A1A2)A3) 7500 2 25000 2 3 5000 1 2 1

Construct the Optimal Solution
Store the optimal choice made at each subproblem s[i, j] = a value of k such that an optimal parenthesization of Ai..j splits the product between Ak and Ak+1 1 2 3 n k n j 3 2 1

Construct the Optimal Solution
s[1, n] is associated with the entire product A1..n The final matrix multiplication will be split at k = s[1, n] A1..n = A1..s[1, n]  As[1, n]+1..n For each subproduct recursively find the corresponding value of k that results in an optimal parenthesization 1 2 3 n n j 3 2 1

4. Construct the Optimal Solution
s[i, j] = value of k such that the optimal parenthesization of Ai Ai+1  Aj splits the product between Ak and Ak+1 1 2 3 4 5 6 3 5 - 4 1 2 6 s[1, n] = 3  A1..6 = A1..3 A4..6 s[1, 3] = 1  A1..3 = A1..1 A2..3 s[4, 6] = 5  A4..6 = A4..5 A6..6 5 4 3 j 2 1 i

4. Construct the Optimal Solution
Mult(A,s,i,j) { if (i<j) X= Mult(A,s,i,s(i,j)); Y= Mult(A,s,s(i,j)+1,j); return X*Y; } else return A(i);

4. Construct the Optimal Solution (cont.)
Mult(A,s,i,j) { if (i<j) X= Mult(A,s,i,s(i,j)); Y= Mult(A,s,s(i,j)+1,j); return X*Y; } else return A(i); 1 2 3 4 5 6 3 5 - 4 1 2 6 5 4 j 3 2 1 i

Example: A1  A6 ( ( ( A4 A5 ) A6 ) ) A1 A2 A3 ) Mult(A,s,1,6) 1 2 3
- 4 1 2 6 (A4.A5).A6 A1.(A2.A3) 5 A2.A3 (1,3) (4,6) 4 A1 A6 A4.A5 j 3 (1,1) (2,3) (4,5) (6,6) A3 2 A2 (2,2) (3,3) (4,4) (5,5) 1 A4 A5 i

Example Paranthesize the following Matrices: A1 : 5 x 4 A2 : 4 x 6 A3 : 6 x 2 A4 : 2 x 7 Write down the m matrix

Example m[i, i] = 0 for i = 1, 2, 3,4 m[1, 2] = m[1, 1] + m[2, 2] + p0p1p2 = *4*6 = 120 m[2, 3] = m[2, 2] + m[3, 3] + p1p2p3 = * 6 * 2 = 48 m[3, 4] = m[3, 3] + m[4, 4] + p2p3p4 = *2 * 7 = 84 1 2 3 4 84 4 48 3 j 120 2 1 i

Example m[1, 3] = min m[1, 1] + m[2, 3] + p0p1p3 = *4*2 = = 88 m[1, 2] + m[3, 3] + p0p2p3 = * 6 * 2 = 180 m [2,4] = min m[2,2] + m[3,4] + p1p2p4 = *6*7 = = 252 m[2, 3] + m[4, 4] + p1p3p4 = *2*7 = = 104

Example 1 2 3 4 104 84 88 48 120 158 4 3 j 2 1 i

Example 1 2 3 4 3 4 1 2 4 3 j 2 1 i A1.(A2.A3).A4)

Example What is the best order in which to multiply matrices A,B,C,D,E if A is 3*10, B is 10*200, C is 200*46, D is 46*150 and E is 150*5? Write down the dynamic programming array.

Data Structure and Algorithms

Similar presentations

Presentation on theme: "Data Structure and Algorithms"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Data Structure and Algorithms

Similar presentations

Presentation on theme: "Data Structure and Algorithms"— Presentation transcript:

Similar presentations

About project

Feedback