1 Data Structures. 2 Motivating Quotation “Every program depends on algorithms and data structures, but few programs depend on the invention of brand.

Slides:



Advertisements
Similar presentations
CS 11 C track: lecture 7 Last week: structs, typedef, linked lists This week: hash tables more on the C preprocessor extern const.
Advertisements

Hash Tables and Sets Lecture 3. Sets A set is simply a collection of elements Unlike lists, elements are not ordered Very abstract, general concept with.
CSE 1302 Lecture 23 Hashing and Hash Tables Richard Gesick.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Hashing as a Dictionary Implementation
Nov 12, 2009IAT 8001 Hash Table Bucket Sort. Nov 12, 2009IAT 8002  An array in which items are not stored consecutively - their place of storage is calculated.
Maps, Dictionaries, Hashtables
1 Hashing (Walls & Mirrors - end of Chapter 12). 2 I hate quotations. Tell me what you know. – Ralph Waldo Emerson.
CSC 2300 Data Structures & Algorithms February 27, 2007 Chapter 5. Hashing.
Sets and Maps Chapter 9. Chapter 9: Sets and Maps2 Chapter Objectives To understand the Java Map and Set interfaces and how to use them To learn about.
Hash Tables1 Part E Hash Tables  
1 Data Structures and Algorithms The material for this lecture is drawn, in part, from The Practice of Programming (Kernighan & Pike) Chapter 2 Professor.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
1 Hash Tables Professor Jennifer Rexford COS 217.
(c) University of Washingtonhashing-1 CSC 143 Java Hashing Set Implementation via Hashing.
ICS220 – Data Structures and Algorithms Lecture 10 Dr. Ken Cosh.
D ESIGN & A NALYSIS OF A LGORITHM 01 – H ASHING Informatics Department Parahyangan Catholic University.
Data Structures and Algorithm Analysis Hashing Lecturer: Jing Liu Homepage:
CS 202, Spring 2003 Fundamental Structures of Computer Science II Bilkent University1 Hashing CS 202 – Fundamental Structures of Computer Science II Bilkent.
DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI.
CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.
Dictionaries and Hash Tables. Dictionary A dictionary, in computer science, implies a container that stores key-element pairs called items, and allows.
1 Hash table. 2 Objective To learn: Hash function Linear probing Quadratic probing Chained hash table.
TECH Computer Science Dynamic Sets and Searching Analysis Technique  Amortized Analysis // average cost of each operation in the worst case Dynamic Sets.
David Luebke 1 10/25/2015 CS 332: Algorithms Skip Lists Hash Tables.
Pointers OVERVIEW.
LECTURE 34: MAPS & HASH CSC 212 – Data Structures.
Lists II. List ADT When using an array-based implementation of the List ADT we encounter two problems; 1. Overflow 2. Wasted Space These limitations are.
Storage and Retrieval Structures by Ron Peterson.
Can’t provide fast insertion/removal and fast lookup at the same time Vectors, Linked Lists, Stack, Queues, Deques 4 Data Structures - CSCI 102 Copyright.
1 Chapter 16 Linked Structures Dale/Weems. 2 Chapter 16 Topics l Meaning of a Linked List l Meaning of a Dynamic Linked List l Traversal, Insertion and.
Hashing as a Dictionary Implementation Chapter 19.
Chapter 12 Hash Table. ● So far, the best worst-case time for searching is O(log n). ● Hash tables  average search time of O(1).  worst case search.
WEEK 1 Hashing CE222 Dr. Senem Kumova Metin
LECTURE 35: COLLISIONS CSC 212 – Data Structures.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hashing Suppose we want to search for a data item in a huge data record tables How long will it take? – It depends on the data structure – (unsorted) linked.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
ENEE150 – 0102 ANDREW GOFFIN Project 4 & Function Pointers.
CPSC 252 Hashing Page 1 Hashing We have already seen that we can search for a key item in an array using either linear or binary search. It would be better.
Hash Tables © Rick Mercer.  Outline  Discuss what a hash method does  translates a string key into an integer  Discuss a few strategies for implementing.
Hashtables David Kauchak cs302 Spring Administrative Midterm must take it by Friday at 6pm No assignment over the break.
1 the hash table. hash table A hash table consists of two major components …
Chapter 5 Linked List by Before you learn Linked List 3 rd level of Data Structures Intermediate Level of Understanding for C++ Please.
CS6045: Advanced Algorithms Data Structures. Hashing Tables Motivation: symbol tables –A compiler uses a symbol table to relate symbols to associated.
1 i206: Lecture 12: Hash Tables (Dictionaries); Intro to Recursion Marti Hearst Spring 2012.
Dynamic Programming & Memoization. When to use? Problem has a recursive formulation Solutions are “ordered” –Earlier vs. later recursions.
1 Data Structures CSCI 132, Spring 2014 Lecture 33 Hash Tables.
Sets and Maps Chapter 9. Chapter Objectives  To understand the Java Map and Set interfaces and how to use them  To learn about hash coding and its use.
CSC 413/513: Intro to Algorithms Hash Tables. ● Hash table: ■ Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
CSC 143T 1 CSC 143 Highlights of Tables and Hashing [Chapter 11 p (Tables)] [Chapter 12 p (Hashing)]
CSC 212 – Data Structures Lecture 28: More Hash and Dictionaries.
Building Java Programs Generics, hashing reading: 18.1.
Design & Analysis of Algorithm Hashing
Data Structures and Algorithms
CSE373: Data Structures & Algorithms Lecture 6: Hash Tables
CS 332: Algorithms Hash Tables David Luebke /19/2018.
Efficiency add remove find unsorted array O(1) O(n) sorted array
Searching.
Hash Tables.
CSE 373: Data Structures and Algorithms
CSE 373 Data Structures and Algorithms
CSE 373: Data Structures and Algorithms
Chapter 16 Linked Structures
Data Structures & Algorithms
slides created by Marty Stepp
Lecture-Hashing.
Presentation transcript:

1 Data Structures

2 Motivating Quotation “Every program depends on algorithms and data structures, but few programs depend on the invention of brand new ones.” -- Kernighan & Pike

“Programming in the Large” Steps Design & Implement Program & programming style (done) Common data structures and algorithms <-- we are here Modularity Building techniques & tools (done) Debug Debugging techniques & tools (done) Test Testing techniques (done) Maintain Performance improvement techniques & tools 3

Goals of this Lecture Help you learn (or refresh your memory) about: Common data structures: linked lists and hash tables Why? Deep motivation: Common data structures serve as “high level building blocks” A power programmer: Rarely creates programs from scratch Often creates programs using high level building blocks Why? Shallow motivation: Provide background pertinent to Assignment 3 … esp. for those who have not taken COS 226 4

Common Task Maintain a collection of key/value pairs Each key is a string; each value is an int Unknown number of key-value pairs Examples (student name, grade) (“john smith”, 84), (“jane doe”, 93), (“bill clinton”, 81) (baseball player, number) (“Ruth”, 3), (“Gehrig”, 4), (“Mantle”, 7) (variable name, value) (“maxLength”, 2000), (“i”, 7), (“j”, -10) 5

Agenda Linked lists Hash tables Hash table issues 6

7 Linked List Data Structure struct Node { const char *key; int value; struct Node *next; }; struct List { struct Node *first; }; 4 "Gehrig" 3 "Ruth" NULL struct List struct Node struct Node Your Assignment 3 data structures will be more elaborate Really this is the address at which “Ruth” resides

Linked List Algorithms Create Allocate List structure; set first to NULL Performance: O(1) => fast Add (no check for duplicate key required) Insert new node containing key/value pair at front of list Performance: O(1) => fast Add (check for duplicate key required) Traverse list to check for node with duplicate key Insert new node containing key/value pair into list Performance: O(n) => slow 8

Linked List Algorithms Search Traverse the list, looking for given key Stop when key found, or reach end Performance: O(n) => slow Free Free Node structures while traversing Free List structure Performance: O(n) => slow 9 Would it be better to keep the nodes sorted by key?

Agenda Linked lists Hash tables Hash table issues 10

11 Hash Table Data Structure enum {BUCKET_COUNT = 1024}; struct Binding { const char *key; int value; struct Binding *next; }; struct Table { struct Binding *buckets[BUCKET_COUNT]; }; NULL 4 "Gehrig" NULL 3 "Ruth" NULL … … … NULL 1023 … struct Table struct Binding struct Binding Your Assignment 3 data structures will be more elaborate Array of linked listsReally this is the address at which “Ruth” resides

12 Hash Table Data Structure Hash function maps given key to an integer Mod integer by BUCKET_COUNT to determine proper bucket 0 BUCKET_COUNT-1 Binding Bucket

Hash Table Example Example: BUCKET_COUNT = 7 Add (if not already present) bindings with these keys: the, cat, in, the, hat 13

Hash Table Example (cont.) First key: “the” hash(“the”) = ; % 7 = 1 Search buckets[1] for binding with key “the”; not found

Hash Table Example (cont.) Add binding with key “the” and its value to buckets[1] the

Hash Table Example (cont.) Second key: “cat” hash(“cat”) = ; % 7 = 2 Search buckets[2] for binding with key “cat”; not found the

Hash Table Example (cont.) Add binding with key “cat” and its value to buckets[2] the cat

Hash Table Example (cont.) Third key: “in” hash(“in”) = ; % 7 = 5 Search buckets[5] for binding with key “in”; not found the cat

Hash Table Example (cont.) Add binding with key “in” and its value to buckets[5] the cat in

Hash Table Example (cont.) Fourth word: “the” hash(“the”) = ; % 7 = 1 Search buckets[1] for binding with key “the”; found it! Don’t change hash table the cat in

Hash Table Example (cont.) Fifth key: “hat” hash(“hat”) = ; % 7 = 2 Search buckets[2] for binding with key “hat”; not found the cat in

Hash Table Example (cont.) Add binding with key “hat” and its value to buckets[2] At front or back? Doesn’t matter Inserting at the front is easier, so add at the front the hat in cat

Hash Table Algorithms Create Allocate Table structure; set each bucket to NULL Performance: O(1) => fast Add Hash the given key Mod by BUCKET_COUNT to determine proper bucket Traverse proper bucket to make sure no duplicate key Insert new binding containing key/value pair into proper bucket Performance: O(1) => fast 23 Is the add performance always fast?

Hash Table Algorithms Search Hash the given key Mod by BUCKET_COUNT to determine proper bucket Traverse proper bucket, looking for binding with given key Stop when key found, or reach end Performance: O(1) => fast Free Traverse each bucket, freeing bindings Free Table structure Performance: O(n) => slow 24 Is the search performance always fast?

Agenda Linked lists Hash tables Hash table issues 25

How Many Buckets? Many! Too few => large buckets => slow add, slow search But not too many! Too many => memory is wasted This is OK: 26 0 BUCKET_COUNT-1

27 What Hash Function? Should distribute bindings across the buckets well Distribute bindings over the range 0, 1, …, BUCKET_COUNT-1 Distribute bindings evenly to avoid very long buckets This is not so good: 0 BUCKET_COUNT-1 What would be the worst possible hash function?

28 How to Hash Strings? Simple hash schemes don’t distribute the keys evenly enough Number of characters, mod BUCKET_COUNT Sum the numeric codes of all characters, mod BUCKET_COUNT … A reasonably good hash function: Weighted sum of characters s i in the string s (Σ a i s i ) mod BUCKET_COUNT Best if a and BUCKET_COUNT are relatively prime E.g., a = 65599, BUCKET_COUNT = 1024

29 How to Hash Strings? Potentially expensive to compute Σ a i s i So let’s do some algebra (by example, for string s of length 5, a =65599): h = Σ i *s i h = *s *s *s *s *s 4 Direction of traversal of s doesn’t matter, so… h = *s *s *s *s *s 0 h = *s *s *s *s *s 4 h = (((((s 0 ) * s 1 ) * s 2 ) * s 3 ) * 65599) + s 4

30 How to Hash Strings? Yielding this function unsigned int hash(const char *s, int bucketCount) { int i; unsigned int h = 0U; for (i=0; s[i]!='\0'; i++) h = h * 65599U + (unsigned int)s[i]; return h % bucketCount; }

31 How to Protect Keys? Suppose Table_add() function contains this code: void Table_add(struct Table *t, const char *key, int value) { … struct Binding *p = (struct Binding*)malloc(sizeof(struct Binding)); p->key = key; … }

32 How to Protect Keys? Problem: Consider this calling code: struct Table *t; char k[100] = "Ruth"; … Table_add(t, k, 3); 3 NULL N … … 1023 … t Ruth\0 k

33 How to Protect Keys? Problem: Consider this calling code: struct Table *t; char k[100] = "Ruth"; … Table_add(t, k, 3); strcpy(k, "Gehrig"); What happens if the client searches t for “Ruth”? For Gehrig? 3 NULL N … … 1023 … t Gehrig\0 k

34 How to Protect Keys? Solution: Table_add() saves a defensive copy of the given key void Table_add(struct Table *t, const char *key, int value) { … struct Binding *p = (struct Binding*)malloc(sizeof(struct Binding)); p->key = (const char*)malloc(strlen(key) + 1); strcpy((char*)p->key, key); … } Why add 1?

35 How to Protect Keys? Now consider same calling code: struct Table *t; char k[100] = "Ruth"; … Table_add(t, k, 3); 3 NULL N … … 1023 … t Ruth\0 k

36 How to Protect Keys? Now consider same calling code: struct Table *t; char k[100] = "Ruth"; … Table_add(t, k, 3); strcpy(k, "Gehrig"); 3 NULL N … … 1023 … t Gehrig\0 k Ruth\0 Hash table is not corrupted

37 Who Owns the Keys? Then the hash table owns its keys That is, the hash table owns the memory in which its keys reside Hash_free() function must free the memory in which the key resides

Summary Common data structures and associated algorithms Linked list (Maybe) fast add Slow search Hash table (Potentially) fast add (Potentially) fast search Very common Hash table issues Hashing algorithms Defensive copies Key ownership 38