Searching Tables Table: sequence of (key,information) pairs


Searching Tables
- Table: a sequence of (key, information) pairs
  - Each (key, information) pair is a record
  - The key uniquely identifies the information, so there are no duplicate records
  - Sometimes the key is the whole record
- Searching a table
  - Given a key k and a table T = (k1,i1), …, (kn,in), find the pair (kj,ij) in T such that k = kj (if it exists)
- Possible approaches:
  - Sequential search: simple, but only effective for small tables
  - Binary search: fast, but the table must be sorted
  - Hashing
COSC 2P03 Week 11
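The first two approaches can be sketched in Java as follows. This is only an illustration: the class name TableSearch and the choice to store just the integer keys in an array are assumptions, not part of the slides.

```java
class TableSearch {
    // Sequential search: compare k with every key in turn.
    // Works on an unsorted table, but takes O(n) comparisons.
    static int sequentialSearch(int[] keys, int k) {
        for (int j = 0; j < keys.length; j++) {
            if (keys[j] == k) {
                return j;
            }
        }
        return -1; // not found
    }

    // Binary search: halve the search range on every step.
    // Takes O(log n) comparisons, but keys must be sorted in ascending order.
    static int binarySearch(int[] keys, int k) {
        int lo = 0;
        int hi = keys.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid int overflow
            if (keys[mid] == k) {
                return mid;
            } else if (keys[mid] < k) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] keys = {3, 8, 15, 21, 42}; // sorted, so both searches apply
        System.out.println(sequentialSearch(keys, 21)); // 3
        System.out.println(binarySearch(keys, 4));      // -1
    }
}
```

Both methods return the index of the matching key, or -1 when the key is absent.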

Hash Tables
- Idea: use a function h such that for every possible key k, h(k) = index of the record with key k. This gives O(1) search time.
- Hash function: maps keys to addresses
- Build the table using the hash function: if h(k) = 1 then put record (k,i) in cell 1 of the table (etc.)
- Tables are generally sparse
- Hash functions should ideally:
  - Be easy to compute, and
  - Ensure different keys are always mapped to different cells
- Perfect hash functions are not always possible
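As a rough illustration of building a table from a hash function, the sketch below assumes integer keys and the common choice h(k) = k mod tableSize. The slides do not fix a particular hash function, so that choice and the names DirectHashDemo and build are assumptions.

```java
class DirectHashDemo {
    // Assumed hash function: key modulo table size.
    // Math.floorMod keeps the result in 0..tableSize-1 even for negative keys.
    static int hash(int k, int tableSize) {
        return Math.floorMod(k, tableSize);
    }

    // Put every key k in cell h(k). Returns false as soon as two keys
    // land in the same cell, i.e. h is not perfect for this key set.
    static boolean build(Integer[] table, int[] keys) {
        for (int k : keys) {
            int cell = hash(k, table.length);
            if (table[cell] != null) {
                return false;
            }
            table[cell] = k;
        }
        return true;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[7];
        System.out.println(build(table, new int[] {10, 4, 15})); // true: cells 3, 4, 1
        Integer[] other = new Integer[7];
        System.out.println(build(other, new int[] {10, 17}));    // false: both map to cell 3
    }
}
```

With a table of 7 cells, the keys 10, 4 and 15 each get their own cell, while 10 and 17 do not, which is why a perfect hash function cannot be assumed in general.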

Hashing and Collisions
- Collision: more than one key is mapped to the same cell
  - That is, for some kx ≠ ky we have h(kx) = h(ky)
- Ideally collisions would never happen (this is not realistic)
- Approaches to dealing with collisions:
  - Allow more than one record to be stored at each table index
    - Buckets: each index is a fixed-size bucket of records
    - Separate chaining: each index has a linked list of records
  - Open addressing: allow only one record at each index
    - When a collision occurs, use a collision resolution policy to find a new index for the item, e.g. linear probing

Hash tables with buckets
- Generally used for storing files on disk
- Each index has a bucket of fixed size (block)
- Within the bucket, records are in unsorted order
- To insert record (k,i):
  - Apply hash function h(k) to determine in which bucket (k,i) belongs, and add it to the next empty space in the bucket
- To search for record (k,i):
  - Compute h(k) and read the corresponding block from disk
  - Perform a linear search of the block to find the record with key k
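The insert and search steps above can be sketched in memory as follows. This is a simplified illustration under several assumptions: buckets are plain arrays rather than disk blocks, the bucket size of 4 and the hash function k mod numBuckets are arbitrary choices, and a full bucket simply rejects the insert because the slides do not describe overflow handling.

```java
class BucketHashTable {
    static final int BUCKET_SIZE = 4; // assumed number of records per block

    private final int[][] keys;    // keys[b] holds the keys stored in bucket b
    private final String[][] info; // info[b][s] belongs to keys[b][s]
    private final int[] used;      // used[b] = number of filled slots in bucket b

    BucketHashTable(int numBuckets) {
        keys = new int[numBuckets][BUCKET_SIZE];
        info = new String[numBuckets][BUCKET_SIZE];
        used = new int[numBuckets];
    }

    // Assumed hash function: key modulo number of buckets.
    private int hash(int k) {
        return Math.floorMod(k, keys.length);
    }

    // Add (k, i) to the next empty slot of bucket h(k).
    // Returns false if the bucket is full. Duplicate keys are not checked here.
    boolean insert(int k, String i) {
        int b = hash(k);
        if (used[b] == BUCKET_SIZE) {
            return false;
        }
        keys[b][used[b]] = k;
        info[b][used[b]] = i;
        used[b]++;
        return true;
    }

    // Linear search of bucket h(k); returns the information or null if absent.
    String search(int k) {
        int b = hash(k); // on disk, this is where the block would be read
        for (int s = 0; s < used[b]; s++) {
            if (keys[b][s] == k) {
                return info[b][s];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        BucketHashTable t = new BucketHashTable(3);
        t.insert(1, "a");
        t.insert(4, "b"); // 1 and 4 collide, but share bucket 1
        System.out.println(t.search(4));  // b
        System.out.println(t.search(16)); // null
    }
}
```

Only one bucket is ever examined per search, which is the point of the scheme when each bucket corresponds to one disk block.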

Hash tables – separate chaining
- Each cell of the table is a separate linked list
- To insert record (k,i):
  - Apply hash function h(k) to determine in which linked list the record belongs, and add it to the front of the list
- To search for record (k,i):
  - Compute h(k) and access the corresponding linked list
  - Perform a linear search of the linked list to find the record with key k
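A minimal sketch of separate chaining, using java.util.LinkedList for the per-cell lists. The class and field names, the String information field, and the hash function k mod tableSize are assumptions made for the example. The insert also checks for an existing key first, since the table is meant to hold no duplicate records.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedHashTable {
    // One (key, information) pair.
    private static class Record {
        final int key;
        final String info;

        Record(int key, String info) {
            this.key = key;
            this.info = info;
        }
    }

    // cells.get(c) is the linked list of records whose keys hash to c.
    private final List<LinkedList<Record>> cells = new ArrayList<>();

    ChainedHashTable(int tableSize) {
        for (int c = 0; c < tableSize; c++) {
            cells.add(new LinkedList<>());
        }
    }

    // Assumed hash function: key modulo table size.
    private int hash(int k) {
        return Math.floorMod(k, cells.size());
    }

    // Add (k, i) to the front of list h(k), unless k is already present.
    void insert(int k, String i) {
        if (search(k) == null) {
            cells.get(hash(k)).addFirst(new Record(k, i));
        }
    }

    // Linear search of list h(k); returns the information or null if absent.
    String search(int k) {
        for (Record r : cells.get(hash(k))) {
            if (r.key == k) {
                return r.info;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedHashTable t = new ChainedHashTable(5);
        t.insert(3, "a");
        t.insert(8, "b"); // collides with 3 and is chained in the same cell
        System.out.println(t.search(3));  // a
        System.out.println(t.search(13)); // null
    }
}
```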

Hash table with Open Addressing: findPos – Linear Probing
- If a record is hashed to index j, which is already occupied, then look in index j+1, j+2, … and put the record in the next available index (each attempt is called a probe)

    // search for the index that should store the record with key k
    // (the loop ends only if the table has at least one empty cell)
    int findPos(int k)
    {
        int current = hash(k, tableSize);
        while (array[current] != null && array[current].record.key != k)
            current = (current + 1) % tableSize; // wrap around at the end of the table
        return current;
    }

Hash Table search

    // search for record with key k
    int find(int k)
    {
        int current = findPos(k);
        if (isActive(current))
            return current;
        else
            return -1; // not found
    }

Hash Table – insertion

    // insert R if not already in table
    void insert(record R)
    {
        int current = findPos(R.key);
        if (isActive(current)) // already in hash table
            return;
        if (array[current] == null)
            array[current] = new Entry(); // Entry is an assumed name for the cell type
        array[current].record = R;
        array[current].isActive = true;
    }

Hash table – deletion

    // delete record with key k
    void remove(int k)
    {
        int current = findPos(k);
        if (isActive(current))
            array[current].isActive = false; // lazy deletion
    }
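The routines findPos, find, insert and remove can be assembled into one small class to show why deletion is lazy. The following is a sketch under assumptions: the cell type Entry, the String information field, the boolean contains method (in place of an index-returning find) and the hash function k mod tableSize are illustrative choices, and the table is assumed never to fill up completely.

```java
class ProbingTable {
    // One table cell: a record plus the flag used for lazy deletion.
    private static class Entry {
        final int key;
        final String info;
        boolean isActive = true;

        Entry(int key, String info) {
            this.key = key;
            this.info = info;
        }
    }

    private final Entry[] array;

    ProbingTable(int tableSize) {
        array = new Entry[tableSize];
    }

    // Linear probing: start at h(k) and step forward until an empty cell
    // or a cell holding key k is found. Deleted cells are not empty.
    private int findPos(int k) {
        int current = Math.floorMod(k, array.length);
        while (array[current] != null && array[current].key != k) {
            current = (current + 1) % array.length;
        }
        return current;
    }

    private boolean isActive(int pos) {
        return array[pos] != null && array[pos].isActive;
    }

    boolean contains(int k) {
        return isActive(findPos(k));
    }

    void insert(int k, String info) {
        int current = findPos(k);
        if (!isActive(current)) {
            array[current] = new Entry(k, info);
        }
    }

    // Lazy deletion: mark the cell inactive instead of setting it to null,
    // so probe sequences that pass through this cell are not cut short.
    void remove(int k) {
        int current = findPos(k);
        if (isActive(current)) {
            array[current].isActive = false;
        }
    }

    public static void main(String[] args) {
        ProbingTable t = new ProbingTable(7);
        t.insert(7, "a");  // hashes to cell 0
        t.insert(14, "b"); // also hashes to cell 0, probes on to cell 1
        t.remove(7);
        System.out.println(t.contains(14)); // true: the probe still passes cell 0
        System.out.println(t.contains(7));  // false
    }
}
```

If remove set the cell to null instead, the search for 14 would stop at the now empty cell 0 and wrongly report the key as missing.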

Open addressing – collision resolution policies
- Linear probing: if a record is hashed to index j, which is already occupied, then look in index j+1, j+2, … and put the record in the next available index
  - Clusters can form when items are hashed to the same address
  - Anything hashed to any index within the cluster makes the cluster bigger → the bigger it gets, the faster it grows
  - Sparse tables reduce the problem but it still exists
  - This is called primary clustering
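Primary clustering can be observed by counting probes. The sketch below assumes integer keys, the hash function k mod tableSize, distinct keys and a table that is never full. The class and method names are made up for the example.

```java
class LinearProbingDemo {
    // Insert k using linear probing and return the number of probes needed.
    // Assumes k is not already present and the table has a free cell.
    static int insert(Integer[] table, int k) {
        int current = Math.floorMod(k, table.length);
        int probes = 1;
        while (table[current] != null) {
            current = (current + 1) % table.length;
            probes++;
        }
        table[current] = k;
        return probes;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[11];
        System.out.println(insert(table, 11)); // 1: cell 0 is free
        System.out.println(insert(table, 22)); // 2: cell 0 taken, lands in cell 1
        System.out.println(insert(table, 33)); // 3: lands in cell 2
        System.out.println(insert(table, 1));  // 3: hashes to cell 1, lands in cell 3
    }
}
```

Key 1 hashes to cell 1 rather than cell 0, yet it still has to walk to the end of the cluster built by 11, 22 and 33, and it makes that cluster one cell longer.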

Open addressing – collision resolution policies
- Quadratic probing: attempts to avoid the clustering problem by checking indices that are further apart
  - If a record is hashed to index j, which is already occupied, then look in index j+1, j+4, j+9, …, and put the record in the next available index
  - Items hashed to the same index will all check the same sequence of indices (secondary clustering)
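The probe sequence j, j+1, j+4, j+9, … can be written as (j + i*i) mod tableSize for probe number i = 0, 1, 2, …. The sketch below only generates that sequence. The hash function k mod tableSize and the names used are assumptions.

```java
import java.util.Arrays;

class QuadraticProbing {
    // Return the first `count` indices examined for key k.
    static int[] probeSequence(int k, int tableSize, int count) {
        int j = Math.floorMod(k, tableSize); // home index h(k)
        int[] seq = new int[count];
        for (int i = 0; i < count; i++) {
            seq[i] = (j + i * i) % tableSize; // offsets 0, 1, 4, 9, ...
        }
        return seq;
    }

    public static void main(String[] args) {
        // Keys 3 and 14 share home index 3 in a table of size 11.
        System.out.println(Arrays.toString(probeSequence(3, 11, 4)));  // [3, 4, 7, 1]
        System.out.println(Arrays.toString(probeSequence(14, 11, 4))); // [3, 4, 7, 1]
    }
}
```

The two keys follow exactly the same path, which is the secondary clustering described above.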

Open addressing – collision resolution policies
- Double hashing: attempts to avoid both primary and secondary clustering by using a second hash function to determine which other indices should be tried after a collision
  - Uses 2 different hash functions, h1(k) and h2(k)
  - If h1(k) hashes a record to index j, which is already occupied, then use h2(k) as the step size for subsequent probes: j + h2(k), j + 2·h2(k), …
  - h2(k) must never be 0, otherwise the probe would never leave index j
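As an illustration, the sketch below uses h1(k) = k mod tableSize and h2(k) = R - (k mod R), where R is a prime smaller than the table size. This pair is a common textbook choice and is an assumption here, since the slides do not say which functions to use. The class and method names are likewise made up.

```java
import java.util.Arrays;

class DoubleHashing {
    // First hash function: the home index.
    static int h1(int k, int tableSize) {
        return Math.floorMod(k, tableSize);
    }

    // Second hash function: the step size, always in 1..r and so never 0.
    static int h2(int k, int r) {
        return r - Math.floorMod(k, r);
    }

    // Return the first `count` indices examined for key k.
    static int[] probeSequence(int k, int tableSize, int r, int count) {
        int j = h1(k, tableSize);
        int step = h2(k, r);
        int[] seq = new int[count];
        for (int i = 0; i < count; i++) {
            seq[i] = (j + i * step) % tableSize;
        }
        return seq;
    }

    public static void main(String[] args) {
        // Keys 3 and 14 share home index 3 in a table of size 11 (with R = 7).
        System.out.println(Arrays.toString(probeSequence(3, 11, 7, 4)));  // [3, 7, 0, 4]
        System.out.println(Arrays.toString(probeSequence(14, 11, 7, 4))); // [3, 10, 6, 2]
    }
}
```

Although both keys start at index 3, their step sizes differ (4 and 7), so they diverge after the first probe. Choosing a prime table size ensures every step size eventually visits every cell.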