
Presentation transcript:

Arrays: sequential access good, direct access bad

Consider this array of Student records:

[0] Bing  [1] David  [2] Ina  [3] Abhinav  [4] Erik  [5] Hyun  [6] Jim  [7] Fiona  [8] Gheeta  [9] Chelsea

Remember! The array elements just hold references to the objects, not the objects themselves!

I can easily loop through all the student records by using a for loop. But if I want to access Jim’s record only, and I don’t already know its index, I have to start at 0 and loop through the array until I find it. With a big array this could be rather inefficient. Is there a better way?
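To make the cost of that search concrete, here is a minimal Java sketch of the loop described above. The class name `LinearSearchDemo` and the method `indexOf` are made up for illustration, and plain name strings stand in for the Student records.

```java
class LinearSearchDemo {
    // The ten names from the slide, stored at indexes 0..9.
    static final String[] NAMES = {
        "Bing", "David", "Ina", "Abhinav", "Erik",
        "Hyun", "Jim", "Fiona", "Gheeta", "Chelsea"
    };

    // Sequential search: start at 0 and check each element in turn.
    // Returns the index of the first match, or -1 if the name is absent.
    static int indexOf(String[] names, String target) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(target)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Looping through every record is easy.
        for (String name : NAMES) {
            System.out.println(name);
        }
        // Finding one record means scanning from the front.
        System.out.println(indexOf(NAMES, "Jim")); // prints 6
    }
}
```

In the worst case the loop looks at every element, so the work grows in step with the size of the array.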

Hash tables: sequential access bad, direct access good

[0] Bing  [1] David  [2] Ina  [3] Abhinav  [4] Erik  [5] Hyun  [6] Jim  [7] Fiona  [8] Gheeta  [9] Chelsea

Jim’s student ID no. -> Hashing Function -> “6”

The student records are stored in an array. The place in the array where a particular student is held is determined by the hashing function. The hashing function takes some value, e.g. a name, or, as here, a student ID number, and translates it into an array index. So if we want to find Jim’s record we just give his ID number to the hashing function and it tells us where his record is located. We don’t need to search through the records. This is direct access.
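A rough Java sketch of direct access follows. It is an illustration under assumptions, not the lecturer's code: the class `DirectAccessDemo`, the method `indexFor`, and the ID value 2346 are all invented, IDs are assumed to be non-negative, and the hashing function is taken to be a simple remainder on the table size.

```java
class DirectAccessDemo {
    static final int TABLE_SIZE = 10;

    // Hypothetical hashing function: the remainder after dividing by the
    // table size is always in the range 0..TABLE_SIZE-1 for non-negative IDs.
    static int indexFor(int studentId) {
        return studentId % TABLE_SIZE;
    }

    public static void main(String[] args) {
        String[] table = new String[TABLE_SIZE];

        int jimId = 2346; // made-up ID that happens to map to index 6

        // Storing: the hashing function decides where the record goes.
        table[indexFor(jimId)] = "Jim";

        // Retrieving: ask the hashing function again and go straight there.
        System.out.println(indexFor(jimId));        // prints 6
        System.out.println(table[indexFor(jimId)]); // prints Jim
    }
}
```

No loop is needed for the lookup, which is what makes this direct access.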

Collisions

What happens if the hashing function gives the same array index for two different students? This does happen, and it is called a collision. There are a number of ways of dealing with collisions, the details of which you don’t need to know. But what you do need to know is that the performance of hash tables degrades over time, because collisions become more frequent as more records are added.

[0] Bing  [1] David  [2] Ina  [3] Abhinav  [4] Erik  [5] Hyun  [6] Jim  [7] Fiona  [8] Gheeta  [9] Chelsea

Hiro’s student ID no. -> Hashing Function -> “6”. Collision! Index [6] already holds Jim.

Collisions

Start with an empty array, [0] to [9]:

Erik’s student ID no. -> Hashing Function -> “4”. Erik goes into index [4].
David’s student ID no. -> Hashing Function -> “1”. David goes into index [1].
Hyun’s student ID no. -> Hashing Function -> “4”. Collision! Hyun goes into the next available index, [5].

If there had already been a lot of records in the array when the collision happened, Hyun may have been pushed a long way down the array. Later, when we try to access Hyun’s record, the hashing function still gives us 4 as the place to find him. But he’s not there! So we have to do a sequential search from index number 4, through the array, to find him. This is the reason that hash table performance degrades over time.
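The "next available index" strategy on this slide is commonly called linear probing. Below is a minimal Java sketch of it, assuming non-negative IDs, a remainder-based hashing function, no duplicate IDs, and no deletions. The class `LinearProbingDemo`, its methods, and the ID values 104, 201 and 314 are hypothetical, chosen so that the results match the indexes on the slide.

```java
class LinearProbingDemo {
    private final int[] ids;
    private final String[] names; // null marks an empty slot

    LinearProbingDemo(int size) {
        ids = new int[size];
        names = new String[size];
    }

    // Hypothetical hashing function: the slot this ID "wants".
    int home(int id) {
        return id % names.length;
    }

    // Stores the record in the first free slot at or after its home index,
    // wrapping around at the end. Returns the slot used, or -1 if full.
    int put(int id, String name) {
        int start = home(id);
        for (int step = 0; step < names.length; step++) {
            int slot = (start + step) % names.length;
            if (names[slot] == null) {
                ids[slot] = id;
                names[slot] = name;
                return slot;
            }
        }
        return -1;
    }

    // Looks at the home index first, then searches sequentially from there.
    // Returns the name, or null if an empty slot is reached or the table is exhausted.
    String get(int id) {
        int start = home(id);
        for (int step = 0; step < names.length; step++) {
            int slot = (start + step) % names.length;
            if (names[slot] == null) {
                return null;
            }
            if (ids[slot] == id) {
                return names[slot];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LinearProbingDemo table = new LinearProbingDemo(10);
        System.out.println(table.put(104, "Erik"));  // prints 4
        System.out.println(table.put(201, "David")); // prints 1
        System.out.println(table.put(314, "Hyun"));  // prints 5 (index 4 is taken)
        System.out.println(table.get(314));          // prints Hyun, found on the second probe
    }
}
```

The fuller the array, the more slots `put` and `get` have to step through after a collision, which is the slowdown described above.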

The Hashing Algorithm

The simplest way to translate the Student ID into an array index is to use the modulo operator (% in Java). The modulo operator returns the remainder of a division operation, for example 11 % 4 = 3.

Question: If we have an array of 10 elements, what do we need to mod our Student IDs by to be sure of getting some value from 0 to 9?
Answer: 10

Question: Let’s say we have an array of size N. Now what do we need to mod our Student IDs by?
Answer: N. The remainder is then always between 0 and N-1, which are exactly the valid array indexes.

What happens if we don’t have a numerical Student ID to use? Say we only have their name? Well, we just convert the string into some numerical value using one of several methods. MD5 is a common method; you give it text, it gives you a 128-bit number. That number can then be reduced to an array index with the modulo operator, as before. The important thing is that we get an even distribution of entries into the array to minimize collisions. MD5 is also used to verify copies of documents, because even if one character has changed during the copying, the number that MD5 returns will be totally different.
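Both ideas can be sketched in Java. An array of size N has valid indexes 0 to N-1, so the sketch mods by N. For names, it asks the standard `java.security.MessageDigest` class for an MD5 digest, reads the 16 bytes as one non-negative 128-bit number, and mods that by the table size. The class `NameHashDemo` and its method names are invented for illustration, and this is only one of several ways to turn a string into an index.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class NameHashDemo {
    // Numeric ID -> index in 0..tableSize-1.
    // Math.floorMod keeps the result non-negative even for a negative ID.
    static int indexForId(int studentId, int tableSize) {
        return Math.floorMod(studentId, tableSize);
    }

    // Name -> MD5 digest (128 bits) -> index in 0..tableSize-1.
    static int indexForName(String name, int tableSize) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(name.getBytes(StandardCharsets.UTF_8));
            // Signum 1 makes BigInteger treat the 16 bytes as a non-negative number.
            BigInteger value = new BigInteger(1, digest);
            return value.mod(BigInteger.valueOf(tableSize)).intValue();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(indexForId(11, 4));         // prints 3
        System.out.println(indexForName("Jim", 10));   // some index from 0 to 9
        System.out.println(indexForName("Jim", 10)
                == indexForName("Jim", 10));           // prints true: same name, same index
    }
}
```

The same name always produces the same index, which is what lets us find the record again later without searching.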