Data Structure and Algorithms

Data Structure and Algorithms, Dr. Maheswari Karthikeyan, Lecture 9, 20/04/2013

Topics: Hashing and Dynamic Programming

Hash Table
A hash table is a data structure, implemented as an array of objects, in which the search keys map to array indexes. It is a "much smarter" version of an array: searching takes O(1) time on average. Drawbacks: traversing the data visits it in effectively random order, and there is no range search.

Direct Addressing
If the key field is "age", there are at most about 120 possible values, so the domain of the key is small. An array with 120 slots can therefore be built, and it supports O(1) search by direct addressing: the key itself is the array index. But what if multiple persons have the same age? And what if the domain of the key field is very large, e.g. salary in [0 - 100,000,000]?
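A minimal sketch of direct addressing in Python, using the 120-slot age table from the slide (the record names are illustrative):

```python
MAX_AGE = 120                       # key domain: ages 0..119

table = [None] * MAX_AGE            # one slot per possible key

def insert(age, record):
    table[age] = record             # the key itself is the array index: O(1)

def find(age):
    return table[age]               # O(1) direct-addressed lookup

insert(20, "Alice")
insert(35, "Bob")
```

A salary key in [0, 100,000,000] would need over a hundred million slots, which is why direct addressing does not scale to large key domains.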

Direct Addressing
[Figure: student records stored by direct addressing; slots 1, 2, 3, 8, 9, 13, and 14 are occupied]

Hashing
Hashing uses a hash function that maps each key to a location in memory. A key's location does not depend on other elements and does not change after insertion. A good hash function should be easy to compute. With such a hash function, the dictionary operations can be implemented in O(1) expected time.

Hashing
A hash function maps key values to hash-table addresses (keys -> hash table address); this applies to find, insert, and remove. Usually it maps integers to {0, 1, 2, ..., Hsize-1}; a typical example is f(n) = n mod Hsize. Non-numeric keys are first converted to numbers: for example, a string can be converted via the sum of its ASCII values, or via its first three characters.
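The two conversions above can be sketched as plain Python functions (HSIZE = 11 is an assumed table size, matching the probing examples later in the lecture):

```python
HSIZE = 11  # assumed table size

def hash_int(n):
    # Typical integer hash: f(n) = n mod Hsize
    return n % HSIZE

def hash_str(s):
    # Non-numeric key converted to a number as the sum of ASCII values,
    # then reduced modulo the table size.
    return sum(ord(c) for c in s) % HSIZE
```

For example, hash_int(53) is 9, since 53 mod 11 = 9.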

Hashing
[Figure: student records with keys 9, 10, 20, 39, 4, and 14 hashed into a table of size 9 via key mod 9]

Hashing
Choose a hash function h; this choice also determines the hash-table size. Given an item x with key k, put x at location h(k); to find whether x is in the set, check location h(k). What if more than one key hashes to the same value? This is called a collision. Two methods handle collisions: separate chaining and open addressing.

Separate Chaining
Maintain a list of all elements that hash to the same value. To search, use the hash function to determine which list to traverse. [Figure: a table of size 11 whose slots hold chains containing 14, 42, 29, 20, 36, 56, 23, 16, 24, 31, and 17]

Separate Chaining
Inserting 53: since 53 = 4 x 11 + 9, 53 mod 11 = 9, so 53 is added to the chain at slot 9. [Figure: the table before and after inserting 53]

Analysis of Hashing with Chaining
Worst case: all keys hash into the same bucket, giving a single linked list, so insert, delete, and find take O(n) time. Average case: keys are uniformly distributed into buckets, giving O(1 + N/B) time, where N is the number of elements in the hash table and B is the number of buckets. If N = O(B), each operation takes O(1) time. N/B is called the load factor of the hash table.
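A separate-chaining table along the lines described above might look like this in Python (a sketch, not the lecture's reference code; chains are plain lists, and new keys are prepended as in the figure):

```python
class ChainedHashTable:
    """Separate chaining: bucket b holds the chain of all keys k with k % B == b."""

    def __init__(self, num_buckets=11):
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, key):
        # The hash function picks the chain to traverse.
        return self.buckets[key % len(self.buckets)]

    def insert(self, key):
        chain = self._chain(key)
        if key not in chain:
            chain.insert(0, key)     # prepend the new key to its chain

    def find(self, key):
        # Hash once, then traverse only that one chain.
        return key in self._chain(key)

    def remove(self, key):
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)
```

Inserting 53 into the size-11 table lands in chain 9, since 53 mod 11 = 9.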

Open Addressing
If a collision happens, alternative cells are tried until an empty cell is found. Three methods of open addressing: linear probing, quadratic probing, and double hashing. [Figure: a table of size 11 containing 42, 14, 16, 24, 31, and 28]

Linear Probing (insert 12)
Since 12 = 1 x 11 + 1, 12 mod 11 = 1. On a collision, the next slots are probed in turn until an empty slot is found. [Figure: the table before and after inserting 12]

Search with Linear Probing (search 15)
Since 15 = 1 x 11 + 4, 15 mod 11 = 4. Probing starts at slot 4 and continues until an empty slot is reached: NOT FOUND! [Figure: the table probed from slot 4]

Deletion in Hashing with Linear Probing
Since empty buckets are used to terminate a search, standard deletion does not work. One simple idea is to not delete, but mark (lazy deletion). Insert: put the item in the first empty or marked bucket. Search: continue past marked buckets. Delete: just mark the bucket as deleted. Advantage: easy and correct. Disadvantage: the table can fill up with dead items.
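The mark-don't-delete scheme can be sketched as follows (a Python sketch using sentinel objects as the "empty" and "deleted" marks):

```python
EMPTY, DELETED = object(), object()    # sentinel markers for slots

class LinearProbingTable:
    """Open addressing with linear probing and lazy deletion (marking)."""

    def __init__(self, size=11):
        self.slots = [EMPTY] * size

    def _probe(self, key):
        # Yield slot indices h(key), h(key)+1, ... modulo the table size.
        h = key % len(self.slots)
        for i in range(len(self.slots)):
            yield (h + i) % len(self.slots)

    def insert(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY or self.slots[idx] is DELETED:
                self.slots[idx] = key          # first empty OR marked bucket
                return
        raise RuntimeError("table full")

    def find(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY:       # empty bucket terminates search
                return False
            if self.slots[idx] == key:         # marked buckets are skipped over
                return True
        return False

    def delete(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY:
                return
            if self.slots[idx] == key:
                self.slots[idx] = DELETED      # just mark; do not empty the slot
                return
```

Deleting a key leaves a mark, so a later search for a key that probed past it still succeeds.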

Quadratic Probing
Solves the clustering problem of linear probing. Check H(x); if a collision occurs check H(x) + 1, then H(x) + 4, then H(x) + 9, then H(x) + 16, ... that is, H(x) + i² on the i-th probe.

Quadratic Probing (insert 12)
Since 12 = 1 x 11 + 1, 12 mod 11 = 1. [Figure: the table before and after inserting 12 with quadratic probing]

Double Hashing
When a collision occurs, use a second hash function: Hash2(x) = R - (x mod R), where R is the greatest prime number smaller than the table size. Inserting 12: H2(12) = 7 - (12 mod 7) = 7 - 5 = 2. Check H(x); if a collision occurs check H(x) + 2, then H(x) + 4, then H(x) + 6, then H(x) + 8, ... that is, H(x) + i * H2(x).

Double Hashing (insert 12)
Since 12 = 1 x 11 + 1, 12 mod 11 = 1, and H2(12) = 7 - (12 mod 7) = 2. [Figure: the table before and after inserting 12 with double hashing]

Collision Functions
Linear probing: Hi(x) = (H(x) + i) mod B
Quadratic probing: Hi(x) = (H(x) + i²) mod B
Double hashing: Hi(x) = (H(x) + i * H2(x)) mod B
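The three collision functions, written out for the slide's size-11 table (R = 7 is the greatest prime below 11, as above):

```python
B = 11                      # table size
R = 7                       # greatest prime smaller than the table size

def h(x):  return x % B             # primary hash
def h2(x): return R - (x % R)       # secondary hash, for double hashing

def linear(x, i):    return (h(x) + i) % B
def quadratic(x, i): return (h(x) + i * i) % B
def double(x, i):    return (h(x) + i * h2(x)) % B

# Probe sequences for key 12 (h(12) = 1, h2(12) = 2):
#   linear:    1, 2, 3, 4, ...
#   quadratic: 1, 2, 5, 10, ...
#   double:    1, 3, 5, 7, ...
```

Note how quadratic probing and double hashing spread later probes out, avoiding the runs of adjacent occupied slots that linear probing produces.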

Example
Insert the following values into a hash table of size 10, using the hash function h(x) = (x² + 1) mod 10, with linear probing, quadratic probing, and separate chaining. Insert the values in this order: 1, 2, 5, 6, 8, 4, 9, 3, 10, 7.

Dynamic Programming
An algorithm design technique for optimization problems (similar to divide and conquer). It is applicable when subproblems are not independent, i.e. when subproblems share subsubproblems. A divide-and-conquer approach would repeatedly solve the common subproblems; dynamic programming solves every subproblem just once and stores the answer in a table.

Dynamic Programming
Used for optimization problems: a set of choices must be made to get an optimal solution, and we want a solution with the optimal value (minimum or maximum). There may be many solutions that achieve the optimal value; any one of them is called an optimal solution.

Dynamic Programming Algorithm
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up fashion.
4. Construct an optimal solution from computed information.

Elements of Dynamic Programming
Optimal substructure: an optimal solution to a problem contains within it optimal solutions to subproblems; the optimal solution to the entire problem is built in a bottom-up manner from optimal solutions to subproblems. Overlapping subproblems: if a recursive algorithm revisits the same subproblems over and over, the problem has overlapping subproblems.

Matrix-Chain Multiplication
Problem: given a sequence A1, A2, ..., An, compute the product A1 · A2 · ... · An. Matrix compatibility: C = A · B requires cols(A) = rows(B), and then rows(C) = rows(A) and cols(C) = cols(B). In the chain A1 · A2 · ... · Ai · Ai+1 · ... · An, we need cols(Ai) = rows(Ai+1).

Matrix-Chain Multiplication
In what order should we multiply the matrices A1 · A2 · ... · An? Parenthesize the product to fix the order in which matrices are multiplied, e.g. A1 · A2 · A3 = ((A1 · A2) · A3) = (A1 · (A2 · A3)). Which of these orderings should we choose? The order in which we multiply the matrices has a significant impact on the cost of evaluating the product.

MATRIX-MULTIPLY(A, B)
  if columns[A] ≠ rows[B]
    then error "incompatible dimensions"
  else for i ← 1 to rows[A]
         do for j ← 1 to columns[B]
              do C[i, j] ← 0
                 for k ← 1 to columns[A]
                   do C[i, j] ← C[i, j] + A[i, k] · B[k, j]
Cost: rows[A] · cols[A] · cols[B] scalar multiplications. [Figure: the row of A and column of B combined to produce one entry of C]
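MATRIX-MULTIPLY translated to Python, instrumented to count scalar multiplications (a sketch using nested lists rather than a matrix library):

```python
def matrix_multiply(A, B):
    """Standard triple-loop multiply; also counts scalar multiplications."""
    rows_a, cols_a = len(A), len(A[0])
    rows_b, cols_b = len(B), len(B[0])
    if cols_a != rows_b:
        raise ValueError("incompatible dimensions")
    C = [[0] * cols_b for _ in range(rows_a)]
    mults = 0
    for i in range(rows_a):
        for j in range(cols_b):
            for k in range(cols_a):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    # mults == rows_a * cols_a * cols_b, as the cost analysis states
    return C, mults
```

Multiplying a 2 x 3 matrix by a 3 x 4 matrix performs exactly 2 · 3 · 4 = 24 scalar multiplications.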

Example: A1 · A2 · A3, with A1: 10 x 100, A2: 100 x 5, A3: 5 x 50.
1. ((A1 · A2) · A3): A1 · A2 costs 10 x 100 x 5 = 5,000 (result 10 x 5); ((A1 · A2) · A3) costs 10 x 5 x 50 = 2,500. Total: 7,500 scalar multiplications.
2. (A1 · (A2 · A3)): A2 · A3 costs 100 x 5 x 50 = 25,000 (result 100 x 50); (A1 · (A2 · A3)) costs 10 x 100 x 50 = 50,000. Total: 75,000 scalar multiplications.
One order of magnitude difference!

Matrix-Chain Multiplication
Given a chain of matrices A1, A2, ..., An, where matrix Ai has dimensions pi-1 x pi for i = 1, 2, ..., n, fully parenthesize the product A1 · A2 · ... · An so as to minimize the number of scalar multiplications. In the chain A1 · A2 · ... · Ai · Ai+1 · ... · An the dimensions are p0 x p1, p1 x p2, ..., pi-1 x pi, pi x pi+1, ..., pn-1 x pn.

1. The Structure of an Optimal Parenthesization
Notation: Ai…j = Ai · Ai+1 · ... · Aj for i ≤ j. For i < j: Ai…j = Ai · Ai+1 · ... · Ak · Ak+1 · ... · Aj = Ai…k · Ak+1…j. Suppose that an optimal parenthesization of Ai…j splits the product between Ak and Ak+1, where i ≤ k < j.

Optimal Substructure
Ai…j = Ai…k · Ak+1…j. The parenthesization of the "prefix" Ai…k must itself be an optimal parenthesization! An optimal solution to an instance of matrix-chain multiplication contains within it optimal solutions to subproblems.

2. A Recursive Solution
Subproblem: determine the minimum cost of parenthesizing Ai…j = Ai · Ai+1 · ... · Aj for 1 ≤ i ≤ j ≤ n. Let m[i, j] be the minimum number of multiplications needed to compute Ai…j; the full problem (A1…n) is m[1, n]. For i = j: Ai…i = Ai, so m[i, i] = 0 for i = 1, 2, ..., n.

2. A Recursive Solution
Consider parenthesizing Ai…j = Ai · Ai+1 · ... · Aj (1 ≤ i ≤ j ≤ n) as Ai…k · Ak+1…j for some i ≤ k < j. If the optimal parenthesization splits the product at k, then m[i, j] = m[i, k] + m[k+1, j] + pi-1 · pk · pj: the minimum number of multiplications to compute Ai…k, plus the minimum number to compute Ak+1…j, plus the pi-1 · pk · pj multiplications to compute Ai…k · Ak+1…j.

2. A Recursive Solution (cont.)
m[i, j] = m[i, k] + m[k+1, j] + pi-1 · pk · pj, but we do not know the value of k: there are j - i possible values, k = i, i+1, ..., j-1. Minimizing over k, the cost of parenthesizing Ai · Ai+1 · ... · Aj becomes:
m[i, j] = 0, if i = j
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + pi-1 · pk · pj }, if i < j
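The recurrence can be implemented directly as a memoized recursion (a sketch; p is the dimension list, so matrix Ai is p[i-1] x p[i], 1-indexed as in the slides):

```python
from functools import lru_cache

def matrix_chain_cost(p):
    """Minimum scalar multiplications for the chain A1..An, where Ai is
    p[i-1] x p[i]."""
    n = len(p) - 1

    @lru_cache(maxsize=None)        # memoize: each subproblem solved once
    def m(i, j):
        if i == j:
            return 0                # a single matrix needs no multiplications
        # Try every split point k between Ak and Ak+1.
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)
```

For the earlier example p = [10, 100, 5, 50], this returns 7,500. Without the memoization the same subproblems would be recomputed exponentially often.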

3. Computing the Optimal Costs
How many subproblems do we have? We parenthesize Ai…j for 1 ≤ i ≤ j ≤ n: one subproblem for each choice of i and j, i.e. Θ(n²) subproblems. [Figure: the upper-triangular m table indexed by i and j]

3. Computing the Optimal Costs (cont.)
How do we fill in the tables m[1..n, 1..n] and s[1..n, 1..n]? Determine which entries of the table are used in computing m[i, j]: since Ai…j = Ai…k · Ak+1…j, entry m[i, j] depends only on entries for shorter chains. So fill in m in order of increasing chain length.

3. Computing the Optimal Costs (cont.)
MATRIX-CHAIN-ORDER(p)
  n ← length[p] - 1
  for i ← 1 to n
    do m[i, i] ← 0
  for l ← 2 to n
    do for i ← 1 to n - l + 1
         do j ← i + l - 1
            m[i, j] ← ∞
            for k ← i to j - 1
              do q ← m[i, k] + m[k+1, j] + pi-1 · pk · pj
                 if q < m[i, j]
                   then m[i, j] ← q
                        s[i, j] ← k
  return m and s
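A direct Python transcription of MATRIX-CHAIN-ORDER (tables use 1-based indices as in the pseudocode, so row and column 0 are unused):

```python
def matrix_chain_order(p):
    """Bottom-up DP. m[i][j] = minimum cost of computing Ai..Aj;
    s[i][j] = split point k achieving that minimum."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):                 # chain length l
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = float("inf")
            for k in range(i, j):             # try every split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s
```

For p = [10, 100, 5, 50] this gives m[1][3] = 7,500 with split s[1][3] = 2, matching the earlier example: (A1 · A2) · A3.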

3. Computing the Optimal Costs (cont.)
Length 1 first: i = j, for i = 1, 2, ..., n. Length 2 second: j = i + 1, for i = 1, 2, ..., n-1, and so on. The rows of m are computed from bottom to top and from left to right; m[1, n] gives the optimal value for the whole problem. In a companion table s we keep the optimal values of k. [Figure: the order in which the m table is filled]

Example: m[2, 5] = min { m[2, 2] + m[3, 5] + p1 · p2 · p5 (k = 2), m[2, 3] + m[4, 5] + p1 · p3 · p5 (k = 3), m[2, 4] + m[5, 5] + p1 · p4 · p5 (k = 4) }. Each value m[i, j] depends only on values that have been previously computed. [Figure: the entries of a 6 x 6 table used to compute m[2, 5]]

Example: compute A1 · A2 · A3 with A1: 10 x 100 (p0 x p1), A2: 100 x 5 (p1 x p2), A3: 5 x 50 (p2 x p3).
m[i, i] = 0 for i = 1, 2, 3
m[1, 2] = m[1, 1] + m[2, 2] + p0 · p1 · p2 = 0 + 0 + 10 · 100 · 5 = 5,000 (A1A2)
m[2, 3] = m[2, 2] + m[3, 3] + p1 · p2 · p3 = 0 + 0 + 100 · 5 · 50 = 25,000 (A2A3)
m[1, 3] = min { m[1, 1] + m[2, 3] + p0 · p1 · p3 = 75,000 (A1(A2A3)), m[1, 2] + m[3, 3] + p0 · p2 · p3 = 7,500 ((A1A2)A3) } = 7,500

Construct the Optimal Solution
Store the optimal choice made at each subproblem: s[i, j] is a value of k such that an optimal parenthesization of Ai…j splits the product between Ak and Ak+1.

Construct the Optimal Solution
s[1, n] is associated with the entire product A1…n: the final matrix multiplication splits at k = s[1, n], i.e. A1…n = A1…s[1, n] · As[1, n]+1…n. For each subproduct, recursively find the corresponding value of k that yields an optimal parenthesization.

4. Construct the Optimal Solution
s[i, j] = the value of k such that the optimal parenthesization of Ai · Ai+1 · ... · Aj splits the product between Ak and Ak+1. For example, with n = 6:
s[1, 6] = 3, so A1…6 = A1…3 · A4…6
s[1, 3] = 1, so A1…3 = A1…1 · A2…3
s[4, 6] = 5, so A4…6 = A4…5 · A6…6

4. Construct the Optimal Solution
Mult(A, s, i, j)
  if i < j
    then X ← Mult(A, s, i, s[i, j])
         Y ← Mult(A, s, s[i, j] + 1, j)
         return X · Y
    else return Ai

4. Construct the Optimal Solution (cont.)
[Figure: the s table that drives the Mult recursion for the six-matrix example]
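The Mult recursion can also be sketched in Python to build the parenthesization as a string instead of performing the multiplications (the hand-filled s table below is for the earlier three-matrix example, where s[1, 3] = 2 and s[1, 2] = 1):

```python
def print_parens(s, i, j):
    """Return the optimal parenthesization of Ai..Aj using split table s."""
    if i == j:
        return f"A{i}"                      # a single matrix: no parentheses
    k = s[i][j]                             # optimal split between Ak and Ak+1
    return "(" + print_parens(s, i, k) + print_parens(s, k + 1, j) + ")"

# Split table for A1 (10x100), A2 (100x5), A3 (5x50): optimal order (A1 A2) A3.
s = [[0] * 4 for _ in range(4)]
s[1][2] = 1
s[1][3] = 2
```

Here print_parens(s, 1, 3) returns "((A1A2)A3)".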

Example: A1 · ... · A6 is parenthesized as ((A1(A2A3))((A4A5)A6)). Mult(A, s, 1, 6) splits into (1, 3) and (4, 6); (1, 3) splits into A1 and A2 · A3; (4, 6) splits into A4 · A5 and A6. [Figure: the recursion tree of Mult(A, s, 1, 6)]

Example
Parenthesize the following matrices: A1: 5 x 4, A2: 4 x 6, A3: 6 x 2, A4: 2 x 7. Write down the m matrix.

Example
m[i, i] = 0 for i = 1, 2, 3, 4
m[1, 2] = m[1, 1] + m[2, 2] + p0 · p1 · p2 = 0 + 0 + 5 · 4 · 6 = 120
m[2, 3] = m[2, 2] + m[3, 3] + p1 · p2 · p3 = 0 + 0 + 4 · 6 · 2 = 48
m[3, 4] = m[3, 3] + m[4, 4] + p2 · p3 · p4 = 0 + 0 + 6 · 2 · 7 = 84

Example
m[1, 3] = min { m[1, 1] + m[2, 3] + p0 · p1 · p3 = 0 + 48 + 5 · 4 · 2 = 48 + 40 = 88, m[1, 2] + m[3, 3] + p0 · p2 · p3 = 120 + 0 + 5 · 6 · 2 = 180 } = 88
m[2, 4] = min { m[2, 2] + m[3, 4] + p1 · p2 · p4 = 0 + 84 + 4 · 6 · 7 = 84 + 168 = 252, m[2, 3] + m[4, 4] + p1 · p3 · p4 = 48 + 0 + 4 · 2 · 7 = 48 + 56 = 104 } = 104

Example
The completed m table (diagonal entries are 0):
m[1, 2] = 120, m[1, 3] = 88, m[1, 4] = 158
m[2, 3] = 48, m[2, 4] = 104
m[3, 4] = 84
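The table above can be checked with a short bottom-up computation (a sketch; i and j are 1-based, and only the costs are kept):

```python
p = [5, 4, 6, 2, 7]          # A1: 5x4, A2: 4x6, A3: 6x2, A4: 2x7
n = len(p) - 1
m = [[0] * (n + 1) for _ in range(n + 1)]
for l in range(2, n + 1):                    # chain length
    for i in range(1, n - l + 2):
        j = i + l - 1
        m[i][j] = min(m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                      for k in range(i, j))

# m[1][2], m[2][3], m[3][4] -> 120, 48, 84
# m[1][3], m[2][4]          -> 88, 104
# m[1][4]                   -> 158
```

The minimum total cost for the full chain is m[1][4] = 158 scalar multiplications.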

Example
Reading off the s table (s[1, 4] = 3 and s[1, 3] = 1), the optimal parenthesization is ((A1(A2A3))A4).

Example
What is the best order in which to multiply matrices A, B, C, D, E if A is 3 x 10, B is 10 x 200, C is 200 x 46, D is 46 x 150, and E is 150 x 5? Write down the dynamic programming array.