Hashing Algorithm 9042635 羅正鴻 9142610 林彥廷 9142621 戴嘉宏.

Presentation transcript:

Hashing Algorithm 羅正鴻 林彥廷 戴嘉宏

Introduction: Hashing is a ubiquitous information retrieval strategy that provides efficient access to information based on a key. Information can usually be accessed in constant time. Hashing also has drawbacks.

Concept of hashing: The problem at hand is to define and implement a mapping from a domain of keys to a domain of locations. From the performance standpoint, the goal is to avoid collisions (a collision occurs when two or more keys map to the same location). From the compactness standpoint, no application ever stores all keys in a domain simultaneously unless the size of the domain is small.

Concept of hashing (cont'd): The information to be retrieved is stored in a hash table, which is best thought of as an array of m locations, called buckets. The mapping between a key and a bucket is called the hash function. The time to store and retrieve data is proportional to the time to compute the hash function.

Hashing function: The ideal function, termed a perfect hash function, would distribute all elements across the buckets such that no collisions ever occur. A common form is h(v) = f(v) mod m, where m is the number of buckets. Knuth (1973) suggests choosing a prime number as the value of m.

Hashing function (cont'd): It is usually better to treat v as a sequence of bytes and do one of the following for f(v): (1) Sum or multiply all the bytes; overflow can be ignored. (2) Use the last (or middle) byte instead of the first. (3) Use the square of a few of the middle bytes.
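The three choices for f(v) can be sketched in Python as follows. This is a minimal illustration, not the presenters' code: the function names, the 32-bit overflow mask, and the two-byte width used for the middle-square variant are all assumptions made for the example.

```python
def f_sum(v: bytes) -> int:
    """Option (1): sum all the bytes; overflow past 32 bits is simply dropped."""
    total = 0
    for b in v:
        total = (total + b) & 0xFFFFFFFF  # assumed 32-bit wraparound
    return total


def f_last_byte(v: bytes) -> int:
    """Option (2): use the last byte instead of the first."""
    return v[-1] if v else 0


def f_mid_square(v: bytes, width: int = 2) -> int:
    """Option (3): square the integer formed by a few middle bytes."""
    start = max(0, (len(v) - width) // 2)
    middle = int.from_bytes(v[start:start + width], "big")
    return middle * middle


def h(v: bytes, m: int, f=f_sum) -> int:
    """h(v) = f(v) mod m, with m preferably prime."""
    return f(v) % m


# "abc" and "cba" contain the same bytes, so the byte sum sends them
# to the same bucket: a collision.
print(h(b"abc", 13), h(b"cba", 13))
```

The byte-sum variant ignores the order of the bytes, which is why anagrams collide; an order-sensitive f(v) avoids that particular weakness.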

Implementing hashing: The following operations are usually provided by an implementation of hashing: (1) initialization, (2) insertion, (3) retrieval, (4) deletion.

Chained hashing

Chained hashing (cont'd): In the worst case (where all n keys map to a single location), the average time to locate an element will be proportional to n/2. In the best case (where all chains are of equal length), the time will be proportional to n/m.
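A chained hash table with the four usual operations (initialization, insertion, retrieval, deletion) might look like the sketch below. The class name, the byte-sum hash, and the default table size of 11 are assumptions chosen for illustration; the slides do not prescribe an implementation.

```python
class ChainedHashTable:
    """Hash table in which each bucket holds a chain (list) of (key, value) pairs."""

    def __init__(self, m: int = 11):
        # Initialization: m empty chains (m is prime by default).
        self.m = m
        self.buckets = [[] for _ in range(m)]
        self.n = 0  # number of stored keys

    def _index(self, key: str) -> int:
        # Assumed hash: byte sum mod m.
        return sum(key.encode("utf-8")) % self.m

    def insert(self, key: str, value) -> None:
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:              # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def retrieve(self, key: str):
        # Cost is proportional to the length of one chain.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def delete(self, key: str) -> bool:
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False
```

With this hash, "abc" and "cba" land in the same chain, so retrieving either one walks a chain of length two; if every key collided this way, retrieval would degrade to the n/2 average mentioned above.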

Open addressing
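The slide gives no details beyond the title, so the following is only one possible reading: open addressing with linear probing, where a colliding key is placed in the next free slot of the table itself. The class name, the byte-sum hash, and the tombstone marker used for deletion are assumptions of this sketch.

```python
_EMPTY = object()    # slot never used
_DELETED = object()  # tombstone: slot was used, then deleted


class LinearProbingTable:
    """Open addressing: all entries live in the table; collisions probe forward."""

    def __init__(self, m: int = 11):
        self.m = m
        self.slots = [_EMPTY] * m

    def _home(self, key: str) -> int:
        return sum(key.encode("utf-8")) % self.m  # assumed hash

    def _probe(self, key: str):
        home = self._home(key)
        for step in range(self.m):
            yield (home + step) % self.m

    def insert(self, key: str, value) -> bool:
        first_free = None
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                if first_free is None:
                    first_free = i
                break                      # key cannot appear past an empty slot
            if slot is _DELETED:
                if first_free is None:
                    first_free = i         # remember the tombstone for reuse
            elif slot[0] == key:
                self.slots[i] = (key, value)
                return True
        if first_free is None:
            return False                   # table is full
        self.slots[first_free] = (key, value)
        return True

    def retrieve(self, key: str):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                return None
            if slot is not _DELETED and slot[0] == key:
                return slot[1]
        return None

    def delete(self, key: str) -> bool:
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                return False
            if slot is not _DELETED and slot[0] == key:
                self.slots[i] = _DELETED   # keep the probe sequence intact
                return True
        return False
```

Deletion leaves a tombstone rather than an empty slot so that keys stored further along the same probe sequence remain reachable.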

Minimal perfect hash functions: A minimal perfect hash function (MPHF) is a perfect hash function that hashes m keys to m buckets with no collisions. Cichelli (1980) and Cercone et al. (1983) proposed two important concepts: (1) using tables of values as the parameters; (2) using a mapping, ordering, and searching (MOS) approach.
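The definition can be checked mechanically for a given key set. The helper names below are hypothetical, and the sketch assumes the keys passed in are distinct.

```python
from typing import Callable, Iterable


def is_perfect(keys: Iterable[str], h: Callable[[str], int], m: int) -> bool:
    """True if h maps every key to its own bucket in 0..m-1 (no collisions)."""
    values = [h(k) for k in keys]
    in_range = all(0 <= v < m for v in values)
    return in_range and len(set(values)) == len(values)


def is_minimal_perfect(keys: Iterable[str], h: Callable[[str], int], m: int) -> bool:
    """True if h is perfect and the number of keys equals the number of buckets."""
    keys = list(keys)
    return len(keys) == m and is_perfect(keys, h, m)


# Toy hash for single lowercase letters: 'a' -> 0, 'b' -> 1, ...
def letter_index(k: str) -> int:
    return ord(k) - ord("a")


print(is_minimal_perfect(["a", "b", "c"], letter_index, 3))  # perfect and minimal
print(is_minimal_perfect(["a", "b", "c"], letter_index, 5))  # perfect, but two buckets unused
```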

Minimal perfect hash functions (cont'd): Mapping: transform the key set from an original to a new universe. Ordering: place the keys in a sequence that determines the order in which hash values are assigned to keys. Searching: assign hash values to the keys of each level. Mapping → Ordering → Searching.

Sager's method and improvement: Sager (1984, 1985) formalizes and extends Cichelli's approach. In the mapping step, three auxiliary (hash) functions are defined on the original universe of keys U: $h_0 : U \to \{0, \dots, m-1\}$, $h_1 : U \to \{0, \dots, r-1\}$, $h_2 : U \to \{r, \dots, 2r-1\}$.

Sager's method and improvement: The class of functions searched is $h(k) = (h_0(k) + g(h_1(k)) + g(h_2(k))) \bmod m$. Sager uses a graph that represents the constraints among keys. The mapping step goes from keys to triples to a special bipartite graph, the dependency graph, whose vertices are the $h_1(k)$ and $h_2(k)$ values and whose edges represent the words (keys).
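To make the function class concrete, here is a small sketch of the ordering and searching steps over a dependency graph. It is a simplified illustration, not Sager's algorithm: the triples (h0(k), h1(k), h2(k)) are assumed to be given directly rather than computed by real auxiliary functions, vertices are ordered by decreasing degree as an assumed heuristic, and g is found by plain exhaustive backtracking. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[int, int, int]  # (h0(k), h1(k), h2(k)) produced by the mapping step


def order_vertices(triples: Dict[str, Triple]) -> List[int]:
    """Ordering step (assumed heuristic): visit high-degree vertices first."""
    degree: Dict[int, int] = {}
    for _, v1, v2 in triples.values():
        degree[v1] = degree.get(v1, 0) + 1
        degree[v2] = degree.get(v2, 0) + 1
    return sorted(degree, key=lambda v: (-degree[v], v))


def search_g(triples: Dict[str, Triple], m: int) -> Optional[Dict[int, int]]:
    """Searching step: choose g(v) per vertex so that all keys hash to distinct values."""
    order = order_vertices(triples)
    position = {v: i for i, v in enumerate(order)}
    # Level i holds the keys (edges) whose later endpoint in the ordering is order[i].
    levels: List[List[str]] = [[] for _ in order]
    for key, (_, v1, v2) in triples.items():
        levels[max(position[v1], position[v2])].append(key)

    g: Dict[int, int] = {}
    used = set()  # hash values already assigned to keys

    def place(i: int) -> bool:
        if i == len(order):
            return True
        v = order[i]
        for value in range(m):
            g[v] = value
            hashes = [(triples[k][0] + g[triples[k][1]] + g[triples[k][2]]) % m
                      for k in levels[i]]
            if len(set(hashes)) == len(hashes) and used.isdisjoint(hashes):
                used.update(hashes)
                if place(i + 1):
                    return True
                used.difference_update(hashes)  # undo and try the next value
        del g[v]
        return False

    return g if place(0) else None


def mphf(triple: Triple, g: Dict[int, int], m: int) -> int:
    """h(k) = (h0(k) + g(h1(k)) + g(h2(k))) mod m."""
    h0, v1, v2 = triple
    return (h0 + g[v1] + g[v2]) % m


# Hand-picked triples for m = 3 keys and r = 2 (vertices 0..1 and 2..3).
example = {"a": (0, 0, 2), "b": (0, 0, 3), "c": (0, 1, 3)}
g = search_g(example, 3)
print(sorted(mphf(t, g, 3) for t in example.values()))
```

If two keys share the same triple, no assignment of g can separate them and `search_g` returns `None`; in that situation the mapping step would have to be redone with different auxiliary functions.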

Sager's method and improvement

The algorithm: The mapping step

The algorithm (cont'd): The ordering step

The algorithm (cont'd): The searching step

Discussion: Hashing locates a key in constant time, and there are always advantages to being able to predict the time needed to locate a key. The MPHF uses a large amount of space.