CS 206 Introduction to Computer Science II 11 / 12 / 2008 Instructor: Michael Eckmann.

Michael Eckmann - Skidmore College - CS Fall 2008
Today's Topics
– Questions?
– Dijkstra's algorithm
– Hash tables and functions

Graphs
The shortest path could be in terms of minimum weight for weighted graphs (note: weights are non-negative), e.g. finding the lowest cost flights.
Dijkstra's algorithm solves this problem.
– It attempts to minimize the weight at each step. Dijkstra's algorithm is a greedy algorithm. That is, its strategy is to locally minimize the weight, hoping that's the best way to get the minimum weight of the whole graph.
– Sometimes the local minimum weight is not the correct choice for the overall problem. In that case, the algorithm will still work: the initial guess was wrong, and it is replaced when a lower weight path is found.
– Dijkstra's algorithm works in a similar way to BFS, but instead of a queue it uses a “minimum” priority queue. That is, a priority queue that returns an item whose priority is least among the items in the priority queue.
– Let's see an example on the board and come up with pseudocode for this algorithm.

Graphs
Example on the board and then pseudocode for this algorithm.
0 -> 1(4), 2(2), 4(4)
1 -> 3(3), 4(3)
2 -> 1(1), 4(1)
3 -> 4(1), 5(2)
4 -> 6(2)
5 -> 6(3)
6 -> null
Dijkstra's algorithm, given a starting vertex, will find the minimum weight paths from that starting vertex to all other vertices.

– We need code to handle a weighted, directed graph.
– We need a “minimum” Priority Queue, that is, one that returns the item with the lowest priority at any given remove().
– We need a way to set all the minimum path lengths to Integer.MAX_VALUE (this is the initial value we want to use for the path lengths, because if we ever calculate a lesser weight path, then we store this lesser weight path).

Dijkstra's algorithm pseudocode (given a startV)
set all vertices to unvisited and all to have pathLen MAX
set pathLen from startV to startV to be 0
add (item=startV, priority=0) to PQ
while (PQ !empty) {
    v = remove the lowest priority vertex from PQ (do this until we get an unvisited vertex out)
    set v to visited
    for all unvisited adjacent vertices (adjV) to v {
        if ( current pathLen from startV to adjV ) > ( weight of the edge from v to adjV + pathLen from startV to v ) then {
            set adjV's pathLen from startV to adjV to be weight of the edge from v to adjV + pathLen from startV to v
            add (item=adjV, priority=pathLen just calculated) to PQ
            prevof adjV is set to v
        } // end if
    } // end for
} // end while
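The pseudocode above can be sketched in Java roughly as follows. This is a minimal sketch, not the course's own code: the names DijkstraSketch, Edge, shortestPaths and exampleGraph are made up for illustration, java.util.PriorityQueue stands in for the “minimum” priority queue, and the example graph assumes each line "v -> w(k)" on the earlier slide means a directed edge from v to w with weight k. The prevof bookkeeping is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class DijkstraSketch {
    /** A directed edge to vertex 'to' with a non-negative weight (hypothetical helper type). */
    static final class Edge {
        final int to;
        final int weight;
        Edge(int to, int weight) { this.to = to; this.weight = weight; }
    }

    /** Returns pathLen[v], the minimum weight from startV to v (Integer.MAX_VALUE if unreachable). */
    static int[] shortestPaths(List<List<Edge>> graph, int startV) {
        int n = graph.size();
        int[] pathLen = new int[n];
        boolean[] visited = new boolean[n];
        Arrays.fill(pathLen, Integer.MAX_VALUE);   // "set all to have pathLen MAX"
        pathLen[startV] = 0;

        // Each entry is {vertex, priority}; the entry with the smallest priority is removed first.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pq.add(new int[] {startV, 0});

        while (!pq.isEmpty()) {
            int v = pq.remove()[0];
            if (visited[v]) continue;              // skip entries for already visited vertices
            visited[v] = true;
            for (Edge e : graph.get(v)) {
                if (visited[e.to]) continue;
                int candidate = pathLen[v] + e.weight;
                if (pathLen[e.to] > candidate) {
                    pathLen[e.to] = candidate;
                    pq.add(new int[] {e.to, candidate});   // the adjacent vertex goes into the PQ
                }
            }
        }
        return pathLen;
    }

    /** Builds the 7-vertex example graph from the slides as adjacency lists. */
    static List<List<Edge>> exampleGraph() {
        int[][] edges = {
            {0, 1, 4}, {0, 2, 2}, {0, 4, 4}, {1, 3, 3}, {1, 4, 3}, {2, 1, 1},
            {2, 4, 1}, {3, 4, 1}, {3, 5, 2}, {4, 6, 2}, {5, 6, 3}
        };
        List<List<Edge>> graph = new ArrayList<>();
        for (int i = 0; i < 7; i++) graph.add(new ArrayList<>());
        for (int[] e : edges) graph.get(e[0]).add(new Edge(e[1], e[2]));
        return graph;
    }
}
```

Calling DijkstraSketch.shortestPaths(DijkstraSketch.exampleGraph(), 0) gives path lengths [0, 3, 2, 6, 3, 8, 5] for vertices 0 through 6. Note that vertex 1 is first reached with weight 4 (edge 0 -> 1) and later improved to 3 (via vertex 2), which is the "initial guess was wrong" case.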

Hashes
Hashing is used to allow very efficient insertion, removal, and retrieval of items.
Consider retrieval (searching) with several structures
– To find data in an unordered linear list structure: O(n)
– To find data in an ordered linear list structure (e.g. binary search on a sorted array): O(log n)
– To find data in a balanced BST: O(log n) (a heap only gives fast access to its minimum or maximum item; finding an arbitrary item in a heap is O(n))
What orders are better than log n?

Hashes
Hashing is used to allow
– inserting an item
– removing an item
– searching for an item
all in constant time (in the average case).
Hashing does not provide efficient sorting, nor efficient finding of the minimum or maximum item, etc.

Hashes
We want to insert our items (of any type: String, int, double, etc.) into a structure that allows fast retrieval.
Terms:
– Hash Table (an array of references to objects (items)); table_size is the number of places to store items
– Hash Function (calculates a hash value (an integer) based on some key data about the item we are adding to the hash table)
– Hash Value (the value returned by the hash function); the hash value must be an integer value within [0, table_size – 1]. This gives us the index in the hash table where we wish to store the item.

Hashes
Just to give an idea of how to insert items into and retrieve items from a hash table (this does not use a good hash function)
– Consider our items are simply ints
– Consider our hash function to be f(x) = x % n, where n is the size of our hash table array (this is not a typical hash function)
– That is, the item's value is modded by the size of the hash table array to compute the index where we wish to store our item.
– Example on the board (assume n=8, add items 24, 3, 17, 31)
To see if a particular item is in our hash table, we repeat the same computation: compute the item's index and look at what is stored there.
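The board example can be sketched as follows. This is only an illustration under assumptions: the class and method names are hypothetical, an Integer[] with null for empty slots stands in for the hash table, and Math.floorMod is used in place of % so that negative items would still map into [0, n – 1]. There is deliberately no collision handling here; an item simply overwrites whatever is in its slot.

```java
class ModHashSketch {
    /** f(x) = x mod n, always in the range [0, n - 1]. */
    static int hash(int item, int n) {
        return Math.floorMod(item, n);
    }

    /** Stores each item at index hash(item); a null slot means "empty". */
    static Integer[] buildTable(int n, int... items) {
        Integer[] table = new Integer[n];
        for (int item : items) {
            table[hash(item, n)] = item;   // no collision handling in this sketch
        }
        return table;
    }

    /** Retrieval repeats the same computation and checks the slot it points to. */
    static boolean contains(Integer[] table, int item) {
        Integer stored = table[hash(item, table.length)];
        return stored != null && stored == item;
    }
}
```

With n=8 and items 24, 3, 17, 31, the items land at indices 0, 3, 1 and 7 respectively, so contains(table, 17) is true and contains(table, 5) is false.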

Hashes
In our example (assume n=8, add items 24, 3, 17, 31), what if we needed to insert item 11 into our hash? There'd be a collision: 11 % 8 = 3, the same index as item 3.
There are several strategies to handle collisions
– The chosen strategy affects how retrieval is handled too
– Open Addressing (aka probing hash table): place the item in the next open slot
– or –
– Separate chaining: each array element is a list
Examples of these two techniques on the board.
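Both strategies can be sketched for the int example above. This is a hedged sketch rather than the board example itself: all names are hypothetical, "next open slot" is interpreted as linear probing (step forward one slot at a time, wrapping around), and removal is not supported (removing from a probing table needs extra care so that later lookups do not stop early).

```java
import java.util.ArrayList;
import java.util.List;

class CollisionSketch {
    /** Open addressing with linear probing: returns the index used, or -1 if the table is full. */
    static int insertProbing(Integer[] table, int item) {
        int n = table.length;
        int home = Math.floorMod(item, n);
        for (int step = 0; step < n; step++) {
            int i = (home + step) % n;            // next slot, wrapping around
            if (table[i] == null) {
                table[i] = item;
                return i;
            }
        }
        return -1;
    }

    /** Retrieval follows the same probe path until it finds the item or an empty slot. */
    static boolean containsProbing(Integer[] table, int item) {
        int n = table.length;
        int home = Math.floorMod(item, n);
        for (int step = 0; step < n; step++) {
            int i = (home + step) % n;
            if (table[i] == null) return false;   // empty slot: the item was never stored
            if (table[i] == item) return true;
        }
        return false;
    }

    /** Separate chaining: every array element is a list of the items that hash to that index. */
    static List<List<Integer>> newChainedTable(int n) {
        List<List<Integer>> table = new ArrayList<>();
        for (int i = 0; i < n; i++) table.add(new ArrayList<>());
        return table;
    }

    static void insertChained(List<List<Integer>> table, int item) {
        table.get(Math.floorMod(item, table.size())).add(item);
    }

    static boolean containsChained(List<List<Integer>> table, int item) {
        return table.get(Math.floorMod(item, table.size())).contains(item);
    }
}
```

After inserting 24, 3, 17, 31 and then 11 into a table of size 8, linear probing places 11 at index 4 (index 3 is taken by item 3), while separate chaining keeps both 3 and 11 in the list at index 3.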

Hashes
Let's come up with a hash table to store Strings
– We'll need to come up with the size of our table
– We'll need to decide whether we will use separate chaining or open addressing hashing
We'll need to create a hash function. (We'll talk about strategies for creating good hash functions next time.)
We'll also allow insertion and retrieval (determine if an item exists in the hash).
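One possible shape for such a table is sketched below. It is an assumption-laden sketch, not the table built in class: it picks separate chaining, uses Java's built-in String.hashCode() as a stand-in hash function (the slides defer hash function design to next time), and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class StringHashTableSketch {
    // One chain (list) per index of the table.
    private final List<List<String>> buckets = new ArrayList<>();

    StringHashTableSketch(int tableSize) {
        for (int i = 0; i < tableSize; i++) buckets.add(new LinkedList<>());
    }

    /** Maps the item's hash value into [0, tableSize - 1]; hashCode() may be negative, hence floorMod. */
    int indexFor(String item) {
        return Math.floorMod(item.hashCode(), buckets.size());
    }

    /** Inserts the item into the chain for its index, ignoring duplicates. */
    void insert(String item) {
        List<String> chain = buckets.get(indexFor(item));
        if (!chain.contains(item)) chain.add(item);
    }

    /** Retrieval: only the one chain the item hashes to has to be searched. */
    boolean contains(String item) {
        return buckets.get(indexFor(item)).contains(item);
    }
}
```

For example, after new StringHashTableSketch(11) and insert("apple"), contains("apple") is true and contains("cherry") is false, regardless of which indices the strings happen to hash to.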

Hashes
Strategies for best performance (we'll go through more of this next time)
– We want items to be distributed evenly throughout the hash table and we want few collisions; that depends on our choice of hash function and the size of the hash table.
– We also need to decide whether to use a probing hash table or a hash table where collisions are handled by adding the item to a list for the index (hash value). Quadratic probing and double hashing are other ways of handling collisions.
– If the choices are made well, the retrieval time is constant on average, but the worst case is O(n).
– We also need to consider the computations needed to insert (computing the hash value).
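The slides only name quadratic probing and double hashing, so the following is a sketch of the usual textbook definitions rather than anything from the lecture. Each method returns the index tried on the i-th attempt (i = 0 is the item's home index). The names are hypothetical, and the second hash function 1 + (key mod (n – 1)) is just one common choice; it requires n >= 2.

```java
class ProbeSequenceSketch {
    /** Linear probing: home, home+1, home+2, ... (mod n). */
    static int linear(int key, int i, int n) {
        return (Math.floorMod(key, n) + i) % n;
    }

    /** Quadratic probing: home, home+1, home+4, home+9, ... (mod n). */
    static int quadratic(int key, int i, int n) {
        return (Math.floorMod(key, n) + i * i) % n;
    }

    /** Double hashing: the step size comes from a second hash of the key and is never 0. */
    static int doubleHash(int key, int i, int n) {
        int step = 1 + Math.floorMod(key, n - 1);
        return (Math.floorMod(key, n) + i * step) % n;
    }
}
```

For key 11 in a table of size 8, the first four indices tried are 3, 4, 5, 6 with linear probing, 3, 4, 7, 4 with quadratic probing, and 3, 0, 5, 2 with this double hashing scheme. The repeated 4 in the quadratic sequence shows why the table size matters for probing: a poorly chosen size can keep a probe sequence from reaching every slot.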