Review of the second half: Single-Source Shortest Paths
Find a shortest path from station A to station B. Getting a correct algorithm requires serious thinking.
Adjacency-list representation: Let G = (V, E) be a graph, where V is the set of nodes (vertices) and E is the set of edges. For each u ∈ V, the adjacency list Adj[u] contains all nodes in V that are adjacent to u. (Figure: a five-node example graph (a) and its adjacency lists (b).)
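A minimal Java sketch of this representation (not from the slides; it assumes vertices are labeled 0..n−1):

import java.util.ArrayList;
import java.util.List;

class Graph {
    final int n;
    final List<List<Integer>> adj;   // adj.get(u) holds all nodes adjacent to u

    Graph(int n) {
        this.n = n;
        adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
    }

    // For an undirected graph, record the edge in both adjacency lists.
    void addEdge(int u, int v) {
        adj.get(u).add(v);
        adj.get(v).add(u);
    }
}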
Dijkstra’s Algorithm: Dijkstra’s algorithm assumes that w(e) ≥ 0 for each edge e in the graph. Maintain a set S of vertices such that for every vertex v ∈ S, d[v] = δ(s, v), i.e., the shortest path from s to v has been found. (Initial values: S = ∅, d[s] = 0 and d[v] = ∞ for every other v.) (a) Select the vertex u ∈ V−S such that d[u] = min {d[x] | x ∈ V−S}. Set S = S ∪ {u}. (d[u] = δ(s, u) at this moment! Why?) (b) For each node v adjacent to u, do RELAX(u, v, w). Repeat steps (a) and (b) until S = V.
Continue:
DIJKSTRA(G, w, s)
  INITIALIZE-SINGLE-SOURCE(G, s)
  S ← ∅
  Q ← V[G]
  while Q ≠ ∅ do
    u ← EXTRACT-MIN(Q)
    S ← S ∪ {u}
    for each vertex v ∈ Adj[u] do
      RELAX(u, v, w)
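A hedged Java sketch of this pseudocode follows. java.util.PriorityQueue has no decrease-key operation, so instead of an adaptable priority queue it uses the common lazy-deletion substitute: RELAX re-inserts the vertex with its new key, and stale entries are skipped on extraction. The graph encoding (graph[u] is a list of {v, w(u,v)} pairs) is an assumption for illustration.

import java.util.*;

class DijkstraSketch {
    // graph[u] holds edges {v, w(u,v)} with w >= 0; returns d[] of shortest-path costs.
    static int[] shortestPaths(int[][][] graph, int s) {
        int n = graph.length;
        int[] d = new int[n];
        Arrays.fill(d, Integer.MAX_VALUE);   // INITIALIZE-SINGLE-SOURCE: d[v] = infinity
        d[s] = 0;                            // d[s] = 0
        PriorityQueue<int[]> q = new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[1]));
        q.add(new int[]{s, 0});
        while (!q.isEmpty()) {
            int[] top = q.poll();            // u <- EXTRACT-MIN(Q)
            int u = top[0];
            if (top[1] > d[u]) continue;     // stale entry: u is already in S
            for (int[] edge : graph[u]) {    // for each vertex v in Adj[u]
                int v = edge[0], w = edge[1];
                if (d[u] + w < d[v]) {       // RELAX(u, v, w)
                    d[v] = d[u] + w;
                    q.add(new int[]{v, d[v]});
                }
            }
        }
        return d;
    }
}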
Implementation: an adaptable priority queue Q stores the vertices in V−S, keyed by their d[] values. The graph G is represented by adjacency lists, so it takes O(1) time to find an edge (u, v) in step (b).
(Figure (a): the example graph for the Single Source Shortest Path Problem, with source s, vertices u, v, x, y, and edge weights.)
(Figure (b): the first iteration.) (s, x) is the shortest path using one edge. It is also the shortest path from s to x.
Assume EXTRACT-MIN(Q) = x. (s, x) is the shortest path using one edge. Why? Since (s, x) is the shortest among all edges starting from s, it is also the shortest path from s to x. Proof: (1) Suppose that a path P: s->u->…->x is a shortest path. Then w(s, u) ≥ w(s, x), since (s, x) is the cheapest edge leaving s. (2) Since edges have non-negative weights, the total weight of path P is at least w(s, u) ≥ w(s, x). (3) So the edge (s, x) is a shortest path from s to x.
(Figure (c): the second iteration, with updated d[] values.)
Statement: Suppose S = {s, x} and d[y] = min {d[v] | v ∈ V−S}. ……(1) Then d[y] is the cost of a shortest path, i.e., either s->x->y or s->y is a shortest path from s to y. Why? If (s, y) is the shortest path, d[y] is correct. Consider the case where s->x->y is the candidate. Proof by contradiction: assume that neither s->y nor s->x->y is a shortest path, and that P1: s->yy->…->y is a shortest path with yy ∉ S. (At this moment, the algorithm has already tried the cases yy = s and yy = x.) Thus w(P1) < w(s->x->y) (from the assumption that s->x->y is not a shortest path). Since w(e) ≥ 0 for every e, w(s->yy) ≤ w(P1) < w(s->x->y). Therefore d[yy] < d[y], and (1) is not true — a contradiction.
(Figures (d), (e), (f): the remaining iterations of Dijkstra’s algorithm on the example graph; each iteration moves the vertex with minimum d[] into S and relaxes its outgoing edges.)
Theorem: Let S be the set in the algorithm and d[y] = min {d[v] | v ∈ V−S}. ……(1) Then d[y] is the cost of a shortest path. (Hard part.) Proof: Assume that (1) for any v in S, d[v] is the cost of a shortest path from s to v. We want to show that d[y] is also the cost of a shortest path after the execution of step (a). If the shortest path from s to v contains vertices in S ONLY, then d[v] is the length of the shortest path. Assume that (2) d[y] is NOT the cost of the shortest path from s to y, and that P1: s->…->yy->…->y is a shortest path, where yy ∉ S is the first node in P1 not in S. Thus w(P1) < d[y]. Since edge weights are ≥ 0, w(s->…->yy) ≤ w(P1). Thus w(s->…->yy) ≤ w(P1) < d[y]. From (1) and (2), after the execution of step (a), d[yy] ≤ w(s->…->yy) (the predecessor of yy on P1 is in S, so the edge into yy has already been relaxed). Therefore d[yy] < d[y], and (1) is not true — a contradiction.
Time complexity of Dijkstra’s Algorithm: the time complexity depends on the implementation of the adaptable priority queue.
Method 1: Use an array to store the queue. EXTRACT-MIN(Q) takes O(|V|) time, and there are |V| EXTRACT-MIN(Q) calls in total, so the time for them is O(|V|²). RELAX(u, v, w) takes O(1) time, and |E| RELAX(u, v, w) calls are required, so the time for them is O(|E|). The total time required is O(|V|² + |E|) = O(|V|²). Backtracking with the predecessor array π[] gives the shortest path in reverse order.
Method 2: The adaptable priority queue is implemented as a heap. It takes O(log |V|) time to do EXTRACT-MIN(Q) and O(log |V|) time for each RELAX(u, v, w). The total running time is O((|V| + |E|) log |V|) = O(|E| log |V|), assuming the graph is connected. When |E| is O(|V|), the second implementation is better; if |E| = O(|V|²), then the first implementation is better.
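For Method 1, EXTRACT-MIN is just an O(|V|) scan over the vertices not yet in S. A minimal sketch (the names d and inS are illustrative, not from the slides):

class ArrayExtractMin {
    // Returns the vertex u not in S with minimum d[u], or -1 if S already contains every vertex.
    static int extractMin(int[] d, boolean[] inS) {
        int u = -1;
        for (int v = 0; v < d.length; v++)
            if (!inS[v] && (u == -1 || d[v] < d[u]))
                u = v;
        return u;
    }
}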
Method 3: The priority queue is implemented as a Fibonacci heap. It takes O(log |V|) amortized time to do EXTRACT-MIN(Q) and O(1) amortized time to decrease the key value of an entry. The total running time is O(|V| log |V| + |E|). (Not required.)
Binary search tree and AVL tree
ADT for Map: A map stores elements (entries) so that they can be located quickly using keys. Each element (entry) is a key-value pair (k, v), where k is the key and v can be any object storing additional information. Each key is unique (different entries have different keys). A map M supports the following methods:
size(): Return the number of entries in M.
isEmpty(): Test whether M is empty.
get(k): If M contains an entry e with key = k, then return e, else return null.
put(k, v): If M does not contain an entry with key = k, then add (k, v) to the map and return null; else replace the entry with (k, v) and return the old value.
Methods of Map (continued):
remove(k): Remove from M the entry with key = k and return its value; if M has no entry with key = k, then return null.
keys(): Return an iterable collection containing all keys stored in M.
values(): Return an iterable collection containing all values in M.
entries(): Return an iterable collection containing all key-value entries in M.
Remark: a hash table is an implementation of Map.
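This ADT matches java.util.Map closely (java.util.Map names the collection views keySet(), values() and entrySet()); a small sketch of the return-value conventions, assuming nothing beyond the standard library:

import java.util.HashMap;
import java.util.Map;

public class MapDemo {
    public static void main(String[] args) {
        Map<Integer, String> m = new HashMap<>();
        System.out.println(m.put(7, "seven"));   // null: no previous entry with key 7
        System.out.println(m.put(7, "SEVEN"));   // "seven": put replaces and returns the old value
        System.out.println(m.get(7));            // "SEVEN"
        System.out.println(m.get(8));            // null: no entry with key 8
        System.out.println(m.remove(7));         // "SEVEN": remove returns the removed value
        System.out.println(m.isEmpty());         // true
    }
}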
ADT for Dictionary: A dictionary stores elements (entries). Each element (entry) is a key-value pair (k, v), where k is the key and v can be any object storing additional information. Keys are NOT necessarily unique. A dictionary D supports the following methods:
size(): Return the number of entries in D.
isEmpty(): Test whether D is empty.
find(k): If D contains an entry e with key = k, then return e, else return null.
findAll(k): Return an iterable collection containing all entries with key = k.
insert(k, v): Insert an entry into D, returning the entry created.
remove(e): Remove from D an entry e, returning the removed entry or null if e was not in D.
entries(): Return an iterable collection of the key-value entries in D.
Part-F1: Binary Search Trees (Figure: a binary search tree with keys 6, 2, 9, 1, 4, 8, annotated with the comparisons <, =, > made during a search.)
Search Trees: a tree data structure that can be used to implement a dictionary.
find(k): If D contains an entry e with key = k, then return e, else return null.
findAll(k): Return an iterable collection containing all entries with key = k.
insert(k, v): Insert an entry into D, returning the entry created.
remove(e): Remove from D an entry e, returning the removed entry or null if e was not in D.
Binary Search Trees: A binary search tree is a binary tree storing keys (or key-value entries) at its internal nodes and satisfying the following property: let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v; then key(u) ≤ key(v) ≤ key(w). Different nodes can have the same key. External nodes do not store items. An inorder traversal of a binary search tree visits the keys in increasing order. (Figure: the example tree with keys 6, 2, 9, 1, 4, 8.)
Search: To search for a key k, we trace a downward path starting at the root. The next node visited depends on the outcome of the comparison of k with the key of the current node. If we reach a leaf, the key is not found and we return null. Example: find(4) calls TreeSearch(4, root).
Algorithm TreeSearch(k, v)
  if T.isExternal(v)
    return v
  if k < key(v)
    return TreeSearch(k, T.left(v))
  else if k = key(v)
    return v
  else { k > key(v) }
    return TreeSearch(k, T.right(v))
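A Java sketch of TreeSearch, with one simplification: instead of the slides’ external (sentinel) nodes it uses null children, so “reaching an external node” becomes “reaching null”.

class Node {
    int key;
    Node left, right;
    Node(int key) { this.key = key; }
}

class BST {
    // Mirrors TreeSearch(k, v): returns the node storing k, or null if k is absent.
    static Node treeSearch(int k, Node v) {
        if (v == null) return null;              // external node reached: not found
        if (k < v.key) return treeSearch(k, v.left);
        else if (k == v.key) return v;           // found the entry
        else return treeSearch(k, v.right);      // k > key(v)
    }
}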
Insertion: To perform operation insert(k, o), we search for key k (using TreeSearch).
Algorithm TreeInsert(k, x, v)
  Input: a search key k, an associated value x, and a node v of T to start with
  Output: a new node w in the subtree T(v) that stores the entry (k, x)
  w ← TreeSearch(k, v)
  if k = key(w) then
    return TreeInsert(k, x, T.left(w))   { duplicates go into the left subtree }
  T.insertAtExternal(w, (k, x))
  return w
Example: insert 5. Example: insert another 5? (Figures: the tree before and after inserting 5 at the external node w.)
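A matching insertion sketch on the same null-leaf Node class; as in the pseudocode, a duplicate key is sent into the left subtree.

    // Returns the (possibly new) root of the subtree after inserting k.
    static Node treeInsert(int k, Node v) {
        if (v == null) return new Node(k);                     // insertAtExternal
        if (k < v.key) v.left = treeInsert(k, v.left);
        else if (k == v.key) v.left = treeInsert(k, v.left);   // duplicates go left
        else v.right = treeInsert(k, v.right);
        return v;
    }

Usage: root = treeInsert(5, root); inserting another 5 descends into the left subtree of the existing 5.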
Deletion: To perform operation remove(k), we search for key k. Assume key k is in the tree, and let v be the node storing k. If node v has a leaf child w, we remove v and w from the tree with operation removeExternal(w), which removes w and its parent and replaces v with the remaining child. Example: remove 4. (Figures: the tree before and after removing 4.)
Deletion (cont.): We consider the case where the key k to be removed is stored at a node v whose children are both internal. We find the internal node w that follows v in an inorder traversal, copy key(w) into node v, and remove node w and its left child z (which must be a leaf) by means of operation removeExternal(z). Example: remove 3. (Figures: the tree before and after removing 3.)
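A removal sketch on the same null-leaf Node class, covering both cases above: a node with at most one internal child is spliced out, and a node with two internal children is overwritten with its inorder successor w, which is then removed from the right subtree.

    // Returns the root of the subtree after removing one node with key k.
    static Node treeRemove(int k, Node v) {
        if (v == null) return null;                 // k is not present
        if (k < v.key) v.left = treeRemove(k, v.left);
        else if (k > v.key) v.right = treeRemove(k, v.right);
        else if (v.left == null) return v.right;    // at most one child: splice v out
        else if (v.right == null) return v.left;
        else {
            Node w = v.right;
            while (w.left != null) w = w.left;      // w: inorder successor of v
            v.key = w.key;                          // copy key(w) into v
            v.right = treeRemove(w.key, v.right);   // remove w (it has no left child)
        }
        return v;
    }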
Deletion (Another Example): (Figures: another removal at a node v with two internal children; the key of the inorder successor w = 4 is copied into v, and w is removed.)
Performance: Consider a dictionary with n items implemented by means of a binary search tree of height h. The space used is O(n), and the methods find, insert and remove take O(h) time. The height h is O(n) in the worst case and O(log n) in the best case. Later, we will try to keep h = O(log n).
Part-F2: AVL Trees
AVL Tree Definition (§ 9.2): AVL trees are balanced. An AVL tree is a binary search tree such that for every internal node v of T, the heights of the children of v differ by at most 1. (Figure: an example AVL tree with the heights shown next to the nodes.)
Balanced nodes: An internal node is balanced if the heights of its two children differ by at most 1. Otherwise, the internal node is unbalanced.
Height of an AVL Tree. Fact: The height of an AVL tree storing n keys is O(log n). Proof: Let us bound n(h), the minimum number of internal nodes of an AVL tree of height h. We easily see that n(1) = 1 and n(2) = 2. For h > 2, a minimal AVL tree of height h contains the root node, one AVL subtree of height h−1 and another of height h−2. That is, n(h) = 1 + n(h−1) + n(h−2). Knowing n(h−1) > n(h−2), we get n(h) > 2n(h−2). So n(h) > 2n(h−2), n(h) > 4n(h−4), n(h) > 8n(h−6), …, and by induction n(h) > 2^i · n(h−2i). Taking i = ⌈h/2⌉ − 1 gives n(h) > 2^{⌈h/2⌉−1} · n(1) ≥ 2^{h/2−1}. Taking logarithms: h < 2 log n(h) + 2. Thus the height of an AVL tree is O(log n).
Insertion in an AVL Tree: Insertion is as in a binary search tree; it is always done by expanding an external node. Example: insert 54 into the AVL tree with keys 44, 17, 78, 32, 50, 88, 48, 62. (Figures: the tree before and after the insertion; after inserting the new node w, the tree is no longer balanced, and the nodes c = z, a = y, b = x lie on the path from w to the root.)
Names of important nodes:
w: the newly inserted node (the insertion itself follows the binary search tree method). The heights of some nodes in T might increase after inserting a node; those nodes must be on the path from w to the root, and no other node is affected.
z: the first node we encounter in going up from w toward the root such that z is unbalanced.
y: the child of z with the higher height. y must be an ancestor of w. (Why? Because z is unbalanced after inserting w.)
x: the child of y with the higher height. x must be an ancestor of w, and the height of the sibling of x is smaller than that of x (otherwise, the height of y could not have increased). See the figure on the previous slide.
Algorithm restructure(x):
Input: a node x of a binary search tree T that has both a parent y and a grandparent z.
Output: tree T after a trinode restructuring.
1. Let (a, b, c) be the inorder (increasing) listing of the nodes x, y, and z, and let T0, T1, T2, T3 be a left-to-right (inorder) listing of the four subtrees of x, y, and z not rooted at x, y, or z.
2. Replace the subtree rooted at z with a new subtree rooted at b.
3. Let a be the left child of b, and let T0 and T1 be the left and right subtrees of a, respectively.
4. Let c be the right child of b, and let T2 and T3 be the left and right subtrees of c, respectively.
Restructuring (as Single Rotations): (Figure: the single-rotation cases, where x, y, z are aligned, with a = x, b = y, c = z and subtrees T0–T3.)
Restructuring (as Double Rotations): (Figure: the double-rotation cases, where x lies between y and z, with a = y, b = x, c = z and subtrees T0–T3.)
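The restructure operation can be coded as these rotations. A hedged Java sketch on an illustrative AvlNode class with a height field (height(null) = −1); a double rotation is literally two single rotations, so the four figure cases reduce to these methods.

class AvlNode {
    int key, height;
    AvlNode left, right;
}

class Rotations {
    static int h(AvlNode v) { return v == null ? -1 : v.height; }
    static void update(AvlNode v) { v.height = 1 + Math.max(h(v.left), h(v.right)); }

    // Single rotation: z is too heavy on the left, so y = z.left becomes the subtree root b.
    static AvlNode rotateRight(AvlNode z) {
        AvlNode y = z.left;
        z.left = y.right;        // subtree T2 moves across
        y.right = z;
        update(z); update(y);    // recompute heights bottom-up
        return y;
    }

    // Mirror image: z is too heavy on the right.
    static AvlNode rotateLeft(AvlNode z) {
        AvlNode y = z.right;
        z.right = y.left;
        y.left = z;
        update(z); update(y);
        return y;
    }

    // Double rotation (left-right case): first rotate y = z.left leftward, then z rightward.
    static AvlNode rotateLeftRight(AvlNode z) {
        z.left = rotateLeft(z.left);
        return rotateRight(z);
    }
}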
Insertion Example, continued: (Figure: inserting 54 makes the tree unbalanced at z = 78, with y = 50 and x = 62; after restructure(x), the tree is balanced again.)
Theorem: After an insertion, one restructure operation is enough to ensure that the whole tree is balanced. Proof: look at the four cases on the two restructuring slides (single and double rotations).
Removal in an AVL Tree: Removal begins as in a binary search tree, by calling the binary search tree removal(k); this may cause an imbalance. Example: remove 32 from the AVL tree with keys 44, 17, 78, 32, 50, 88, 48, 62, 54. (Figures: the tree before and after the deletion of 32; w marks the parent of the removed node.)
Rebalancing after a Removal: Let z be the first unbalanced node encountered while travelling up the tree from w, where w is the parent of the removed node (in terms of structure, not the name). Let y be the child of z with the larger height, and let x be the child of y defined as follows: if one of the children of y is taller than the other, choose x as the taller child of y; if both children of y have the same height, select x as the child of y on the same side as y (i.e., if y is the left child of z, then x is the left child of y, and if y is the right child of z, then x is the right child of y). Note that the way to obtain x, y and z differs from insertion.
Rebalancing after a Removal: We perform restructure(x) to restore balance at z. As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached. (Figure: after removing 32, the tree is unbalanced at a = z = 44, with b = y = 62 and c = x = 78; restructure(x) makes 62 the new root.)
Unbalanced after restructuring: (Figure: a case where, after restructure(x) lowers the height of the restructured subtree from 4 to 3, an ancestor higher in the tree becomes unbalanced, so rebalancing must continue toward the root.)
Example a: Which node is w? Let us remove node 17. (Figures: the tree before and after the deletion of 17; w is the node that structurally replaces the removed node’s position.)
Rebalancing: We perform restructure(x) to restore balance at z. As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached. (Figure: after removing 17, the tree is unbalanced at a = z = 44, with b = y = 62 and c = x = 78; after restructure(x), 62 becomes the new root.)
Running Times for AVL Trees:
A single restructure is O(1), using a linked-structure binary tree.
find is O(log n): the height of the tree is O(log n), and no restructuring is needed.
insert is O(log n): the initial find is O(log n), and restructuring up the tree, maintaining heights, is O(log n).
remove is O(log n): the initial find is O(log n), and restructuring up the tree is O(log n).
Part E: Hash Tables (Figure: a small hash table mapping phone-number keys such as 025-612-0001, 981-101-0002, 451-229-0004 to cells 1–4.)
Motivations of Hash Tables: We have n items, each containing a key and a value (k, value); the key uniquely determines the item. Each key could be anything, e.g., a number in [0, 2^32], a string of length 32, etc. How do we store the n items such that, given the key k, we can find the position of the item with key = k in O(1) time? Another constraint: the space required should be O(n). Linked list? Space O(n) but time O(n). Array indexed by key? Time O(1) but the space is too big, e.g., if the key is an integer in [0, 2^32], then the space required is 2^32; if the key is a string of length 30, the space required is 26^30. Hash table: space O(n) and time O(1).
Basic ideas of Hash Tables: A hash function h maps keys of a given type, drawn from a wide range, to integers in a fixed interval [0, N−1], where N is the size of the hash table, such that if k ≠ k’ then h(k) ≠ h(k’). ….. (1) Problem: it is hard to design a function h such that (1) holds. What we can do: we can design a function h so that, with high probability, (1) holds, i.e., (1) may not always hold, but it holds for most of the n keys.
Hash Functions: A hash function h maps keys of a given type to integers in a fixed interval [0, N−1]. Example: h(x) = x mod N is a hash function for integer keys. The integer h(x) is called the hash value of key x. A hash table for a given key type consists of a hash function h and an array (called the table) of size N; the goal is to store item (k, o) at index i = h(k).
Example: We design a hash table storing entries as (HKID, Name), where HKID is a nine-digit positive integer. Our hash table uses an array of size N = 10,000 and the hash function h(x) = the last four digits of x. We need a method to handle collisions. (Figure: keys such as 025-612-0001, 981-101-0002, 451-229-0004 and 200-751-9998 stored at cells 1, 2, 4 and 9998.)
Collision Handling: Collisions occur when different elements are mapped to the same cell. Separate chaining: let each cell in the table point to a linked list of the entries that map there. Separate chaining is simple, but requires additional memory outside the table. (Figure: keys 451-229-0004 and 981-101-0004 chained in the same cell.)
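A minimal separate-chaining sketch for integer keys (class and method names are illustrative): cell h(k) points to a linked list of all {key, value} entries that hash there.

import java.util.LinkedList;

class ChainedHashTable {
    private final LinkedList<int[]>[] table;   // each int[] entry is {key, value}
    private final int N;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int N) {
        this.N = N;
        table = new LinkedList[N];
        for (int i = 0; i < N; i++) table[i] = new LinkedList<>();
    }

    private int h(int k) { return Math.floorMod(k, N); }   // h(x) = x mod N

    void put(int k, int v) {
        for (int[] e : table[h(k)])
            if (e[0] == k) { e[1] = v; return; }           // replace existing entry
        table[h(k)].add(new int[]{k, v});                  // chain a new entry
    }

    Integer get(int k) {
        for (int[] e : table[h(k)])
            if (e[0] == k) return e[1];
        return null;                                       // no entry with key k
    }
}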
Open Addressing: the colliding item is placed in a different cell of the table. Load factor: α = n/N, where n is the number of items to store and N is the size of the hash table. With open addressing, n/N ≤ 1; to get reasonable performance, keep n/N < 0.5.
Linear Probing: Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell. Each table cell inspected is referred to as a “probe”. Colliding items lump together, causing future collisions to produce longer sequences of probes. Example: h(x) = x mod 13; insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order. (Figure: the resulting table with cells 0–12.)
Search with Linear Probing: Consider a hash table A that uses linear probing. get(k): we start at cell h(k) and probe consecutive locations until one of the following occurs: an item with key k is found, an empty cell is found, or N cells have been unsuccessfully probed. To ensure efficiency, if k is not in the table we want to find an empty cell as soon as possible, so the load factor can NOT be close to 1.
Algorithm get(k)
  i ← h(k)
  p ← 0
  repeat
    c ← A[i]
    if c = ∅
      return null
    else if c.key() = k
      return c.element()
    else
      i ← (i + 1) mod N
      p ← p + 1
  until p = N
  return null   { N cells unsuccessfully probed }
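The same get(k) as a Java sketch: the table is an array of {key, value} pairs, a null cell means empty, and h(x) = x mod N as in the examples.

class ProbingGet {
    static Integer get(int[][] A, int k) {
        int N = A.length;
        int i = Math.floorMod(k, N);        // i <- h(k)
        for (int p = 0; p < N; p++) {       // at most N probes
            int[] c = A[i];
            if (c == null) return null;     // empty cell: k cannot be further along
            if (c[0] == k) return c[1];     // found the entry
            i = (i + 1) % N;                // next cell, circularly
        }
        return null;                        // N cells unsuccessfully probed
    }
}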
Linear Probing example: h(x) = x mod 13; insert keys 18, 41, 22, 44, 59, 32, 31, 73, 12, 20 in this order. Search for key = 20: h(20) = 20 mod 13 = 7; go through ranks 7, 8, 9, …, 12 and then rank 0, where 20 is found. Search for key = 15: h(15) = 15 mod 13 = 2; go through ranks 2 and 3 and return null. (Figure: the resulting table with cells 0–12.)
Updates with Linear Probing: To handle insertions and deletions, we introduce a special object, called AVAILABLE, which replaces deleted elements. remove(k): we search for an entry with key k; if such an entry (k, o) is found, we replace it with the special item AVAILABLE and return element o; else we return null. The other methods have to be modified to skip AVAILABLE cells. put(k, o): we throw an exception if the table is full; we start at cell h(k) and probe consecutive cells until a cell i is found that is either empty or stores AVAILABLE, or N cells have been unsuccessfully probed; we then store entry (k, o) in cell i.
Updates with Linear Probing (continued):
Algorithm put(k, o)
  i ← h(k)
  p ← 0
  repeat
    c ← A[i]
    if c = ∅ or c = AVAILABLE
      A[i] ← (k, o)
      return
    else if c.key() = k
      A[i] ← (k, o)   { replace the old entry }
      return
    else
      i ← (i + 1) mod N
      p ← p + 1
  until p = N
  { if p = N, the array is full: signal an error }
Example: h(x) = x mod 13; insert keys 18, 41, 22, 44, 59, 32, 31, 73, 20, 12 in this order. To insert 12, we look at rank 12 and then rank 0. (Figure: the resulting table with cells 0–12.)
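A Java sketch of put(k, o) with the AVAILABLE sentinel. One refinement over the pseudocode above: it remembers the first empty-or-AVAILABLE cell but keeps probing in case k already exists further along, so an existing entry is replaced rather than duplicated.

class ProbingPut {
    static final int[] AVAILABLE = new int[0];     // sentinel marking a deleted cell

    static void put(int[][] A, int k, int o) {
        int N = A.length;
        int i = Math.floorMod(k, N);               // i <- h(k)
        int firstFree = -1;                        // first reusable cell seen so far
        for (int p = 0; p < N; p++) {
            int[] c = A[i];
            if (c == null) {                       // truly empty: k is absent
                A[firstFree == -1 ? i : firstFree] = new int[]{k, o};
                return;
            } else if (c == AVAILABLE) {
                if (firstFree == -1) firstFree = i;   // remember, but keep searching
            } else if (c[0] == k) {
                c[1] = o;                          // key exists: replace the value
                return;
            }
            i = (i + 1) % N;
        }
        if (firstFree != -1) { A[firstFree] = new int[]{k, o}; return; }
        throw new IllegalStateException("the table is full");
    }
}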
A complete example: h(x) = x mod 13. Insert keys 18, 41, 22, 44, 59, 32, 31, 73, 20, 12 in this order. Then remove 20 and 12; their cells become AVAILABLE (A). get(11): check the cells after the AVAILABLE cells. Then insert keys 10 and 11: 10 lands at rank 12 and 11 at rank 0, reusing the AVAILABLE cells. The AVAILABLE cells are hard to deal with; the separate chaining approach is simpler. (Figure: the table before and after, with A marking AVAILABLE cells.)
Performance of Hashing: In the worst case, searches, insertions and removals on a hash table take O(n) time. The worst case occurs when all the keys inserted into the map collide. The load factor α = n/N affects the performance of a hash table: assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1/(1 − α). The expected running time of all the hash-table operations is O(1). In practice, hashing is very fast provided the load factor is not close to 100%. Applications of hash tables: small databases, compilers, browser caches.
Adaptable Heap: a heap that allows changing the key value of an entry in the heap. (See tutorial 11.)
Summary: ADTs: map, dictionary, priority queue, adaptable priority queue. Data structures: hash table, binary tree, AVL tree, heap, adaptable heap. The shortest path problem: algorithm plus data structures (a complete example of solving a non-trivial problem).
Final Exam: The format is similar to that of the midterm. Topics: ADTs; data structures (AVL tree, binary tree, adaptable heap, heap, linked list, stack, hash table, …); describing the algorithm/Java code for a method; given a concrete instance (e.g., a graph, a tree, etc.), performing an insertion or deletion; the shortest path problem. Six questions in the final; answer ALL.