TOPIC 5 ASSIGNMENT SORTING, HASH TABLES & LINKED LISTS Yerusha Nuh & Ivan Yu.



Arrays
• Efficient access of data.
• Access by index.
• A mapping between search keys and indices allows each data item to be stored in the array element with the corresponding index.

Example
There are 500 students in a school, and each student has a unique nine-digit TDSB student number. To store each student's name in an array, we could use the student number as the index. However, if the greatest student number is “ ”, there would be 351,000,005 elements in the array, far more than are needed to store the names of 500 students.

Solution:
• Map the student numbers to the numbers from 0 to 499. By using arithmetic operations on the keys, we can map them onto table addresses.
Advantage:
• Direct referencing.

Mapping
Methods for mapping:
• Direct address table
• Hash table
Hash table – a data structure that uses a hash function to efficiently map identifiers or keys (e.g. persons' names) to associated values (e.g. their telephone numbers).
A hash table is made up of two parts:
• An array (the actual table where the data to be searched is stored)
• A mapping function, a.k.a. hash function.

Hash Function
Hash function – a function that transforms the search key into a table address. Different hash functions use different arithmetic operations to do this. We will focus on modulo arithmetic.


Modulo Arithmetic
Numbers as keys
• Address = search key % size of array

Pseudocode - Number
get number (the key)
address = key % size of array
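The pseudocode above can be sketched in Java as follows. This is a minimal illustration, not the presenters' code: the class name ModuloHash, the method name address, the table size 503 and the sample student number are all assumptions made for the example.

```java
// Sketch only: ModuloHash and address are illustrative names, not from the slides.
final class ModuloHash {
    /** Maps a numeric key to a table address in the range [0, tableSize). */
    static int address(long key, int tableSize) {
        // Math.floorMod keeps the result non-negative even if the key is negative,
        // which the plain % operator would not guarantee.
        return (int) Math.floorMod(key, (long) tableSize);
    }

    public static void main(String[] args) {
        int tableSize = 503;              // assumed table size for roughly 500 students
        long studentNumber = 123456789L;  // made-up nine-digit student number
        System.out.println(address(studentNumber, tableSize));
    }
}
```

Two different student numbers can still produce the same address, which is why a collision resolution method is needed later on.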

Strings as keys
• Treat the binary representation of the key as a number, then apply the first case (numbers as keys).

In general, the arithmetic operations in such expressions use 32-bit modular arithmetic and ignore overflow. For example:
Integer.MAX_VALUE + 1 = Integer.MIN_VALUE
where Integer.MAX_VALUE = 2147483647 and Integer.MIN_VALUE = -2147483648
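The wrap-around behaviour of 32-bit int arithmetic can be checked directly in Java. The sketch below is illustrative only; the class OverflowDemo and its method names are assumptions. It uses Math.addExact, a standard library method that throws ArithmeticException on overflow instead of wrapping silently.

```java
// Sketch only: OverflowDemo and its methods are illustrative names.
final class OverflowDemo {
    /** Plain int addition: silently wraps around on overflow. */
    static int addWrapping(int a, int b) {
        return a + b;
    }

    /** Reports whether a + b would overflow the int range. */
    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b); // throws ArithmeticException on overflow
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The largest int plus one wraps to the smallest int.
        System.out.println(addWrapping(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // prints true
        System.out.println(overflows(Integer.MAX_VALUE, 1));                        // prints true
    }
}
```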

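Example
Char:     h    e    l    l    o
Unicode:  104  101  108  108  111

104*31^4 + 101*31^3 + 108*31^2 + 108*31^1 + 111*31^0 = 99162322

To prevent overflow, we can apply Horner's method:
a_n·x^n + a_(n-1)·x^(n-1) + a_(n-2)·x^(n-2) + … + a_1·x^1 + a_0·x^0 = x(x(…x(x(a_n·x + a_(n-1)) + a_(n-2)) + …) + a_1) + a_0

For "hello" this gives:
= ((((104*31 + 101)*31 + 108)*31 + 108)*31 + 111)

We compute the hash function by applying the mod (%) operation at each step, so the intermediate values stay small and never overflow. The steps below show the pattern for a key hashed with base 32 and table size N:
• Compute h_0 = (22*32 + 5) % N
• Compute h_1 = (32*h_0 + …) % N
• Compute h_2 = (32*h_1 + 25) % N
• Etc.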

Pseudocode - String
get string
address = 0
loop (once for each character in the string, taking the characters in order) {
    address = (31*address + Unicode of character) % size of array
}
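A Java sketch of the string pseudocode above, using Horner's method with the mod applied at every step. The class name StringHash and the table size 101 are assumptions for illustration. Note that charAt returns a UTF-16 code unit, which equals the Unicode value for characters such as those in "hello".

```java
// Sketch only: StringHash and the table sizes used here are illustrative.
final class StringHash {
    /** Hashes a string to a table address in [0, tableSize) using base 31. */
    static int hash(String key, int tableSize) {
        long address = 0; // long, so 31 * address cannot overflow for any int table size
        for (int i = 0; i < key.length(); i++) {
            // Horner step: multiply by the base, add the next character, reduce.
            address = (31 * address + key.charAt(i)) % tableSize;
        }
        return (int) address;
    }

    public static void main(String[] args) {
        System.out.println(hash("hello", 101)); // prints 17
    }
}
```

Because the result is reduced modulo the table size after each character, no intermediate value ever exceeds 31 times the table size plus one character code.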

Hash Table
How do we choose the size of the array (hash table)?
Let N be the number of records to be stored. Let M be the size of the hash table.
Ideally, N records are stored in a hash table of size N. However...
• We may not have prior knowledge of the exact number of records.
• It is possible to have two keys mapped to the same index (although this can be prevented).
Hence, we assume that the size of the table (M) can be different from the number of records (N).

Load factor – the ratio between N and M.
• Load factor L = N/M
• The default L value for Java is 0.75 (the default load factor of java.util.HashMap).
Note: M should be a prime number to obtain a more even distribution of keys over the table.
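One possible way to turn the load factor and the prime-size note into a table size is sketched below. This is an assumption-laden illustration, not a rule from the slides: the class TableSizing, its method names, and the policy of picking the smallest prime that keeps L at or below a chosen maximum are all hypothetical.

```java
// Sketch only: TableSizing and its sizing policy are illustrative assumptions.
final class TableSizing {
    /** Trial-division primality test, adequate for table sizes. */
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    /** Smallest prime greater than or equal to n. */
    static int nextPrime(int n) {
        int candidate = Math.max(n, 2);
        while (!isPrime(candidate)) candidate++;
        return candidate;
    }

    /** Smallest prime table size M such that N / M does not exceed maxLoad. */
    static int tableSizeFor(int records, double maxLoad) {
        int minimumSize = (int) Math.ceil(records / maxLoad);
        return nextPrime(minimumSize);
    }

    /** Load factor L = N / M. */
    static double loadFactor(int records, int tableSize) {
        return (double) records / tableSize;
    }

    public static void main(String[] args) {
        int m = tableSizeFor(500, 0.75);
        System.out.println(m);                  // prints 673
        System.out.println(loadFactor(500, m)); // a value just under 0.75
    }
}
```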

Collision Resolution
Collision – when two or more keys hash to the same index.
Methods to resolve collisions:
• Separate chaining
• Open addressing
  ○ Linear probing
  ○ Quadratic probing
  ○ Double hashing

Linear Probing
Collision when inserting:
• Probe the next slot in the table.
  ○ If unoccupied, store the key.
  ○ If occupied, continue probing the next slot.

Linear Probing - Collision

Searching:
• If the key hashes to an occupied slot but does not match the key occupying the slot, probe the next slot.
  ○ If the slot is empty, the search is unsuccessful.
  ○ If the slot is occupied and its key matches, the search is successful.
  ○ If the slot is occupied and its key does not match, continue probing the next slot.
• When reaching the end of the table, resume from the beginning.
• If every slot has been probed without a match, the search is unsuccessful.
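The insertion and search rules for linear probing can be sketched as a small Java class. Everything named here (LinearProbingTable, insert, slotOf, contains) is an assumption for illustration. The sketch stores string keys only, uses the base-31 string hash described earlier, and deliberately leaves out deletion, which needs extra care in open addressing.

```java
// Sketch only: a minimal open-addressing table with linear probing, no deletion.
final class LinearProbingTable {
    private final String[] slots; // null marks an unoccupied slot

    LinearProbingTable(int tableSize) {
        slots = new String[tableSize];
    }

    /** Home address of a key: base-31 string hash reduced at each step. */
    private int home(String key) {
        long address = 0;
        for (int i = 0; i < key.length(); i++) {
            address = (31 * address + key.charAt(i)) % slots.length;
        }
        return (int) address;
    }

    /** Stores the key; returns false only if the table is full. */
    boolean insert(String key) {
        int index = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[index] == null) {        // unoccupied: store the key here
                slots[index] = key;
                return true;
            }
            if (slots[index].equals(key)) {    // key is already in the table
                return true;
            }
            index = (index + 1) % slots.length; // occupied: probe next slot, wrapping around
        }
        return false;
    }

    /** Returns the slot holding the key, or -1 if the search is unsuccessful. */
    int slotOf(String key) {
        int index = home(key);
        for (int probes = 0; probes < slots.length; probes++) {
            if (slots[index] == null) return -1;         // empty slot ends the search
            if (slots[index].equals(key)) return index;  // match found
            index = (index + 1) % slots.length;          // mismatch: keep probing
        }
        return -1; // every slot probed without a match
    }

    boolean contains(String key) {
        return slotOf(key) >= 0;
    }
}
```

With a table of size 7, the keys "a" and "h" both hash to slot 6, so inserting "a" and then "h" places "h" in slot 0 after wrapping around the end of the table.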

Disadvantages:
• Primary clustering – occupied slots build up into large clusters, which lengthens later probe sequences.
• Runs slowly for tables that are almost full.

Hash Table - Advantages
• Speed
  ○ Especially with a large number of entries (thousands or more).
• Efficient when the maximum number of entries can be predicted in advance.
• If the set of key-value pairs is fixed and known ahead of time (no insertions and deletions), the average lookup cost can be reduced by a careful choice of the hash function, bucket table size, and internal data structures.

Hash Tables - Disadvantages
• Can be more difficult to implement than self-balancing binary trees.
• Difficult to create a perfect hash function.
• An insertion or deletion may occasionally take time proportional to the number of entries (for example, when the table has to be resized).
  ○ May not be suitable for real-time or interactive applications.
• Even though operations take constant time on average, the cost of each operation (mainly computing a good hash function) can be significantly higher than for a sequential list or search tree.
  ○ Not suitable for a small number of entries.