Computer Science 2 Hashing


1 Computer Science 2 Hashing
Note: You will be allowed one page, one side, of handwritten notes on the final.

2 Learning Objectives
Understand hashing and its benefits.
Be able to generate a ‘hashed’ value based on a string.
Be able to write a function to generate a hashed value.
Understand how to add values to a hash table.

3 Hashing:
To put records into an array or file (hash table) so that they may be randomly accessed.
To find and retrieve records from an array or file (hash table).
Why? Taking a look at the speed:
Linear Search: O(n)
Binary Search: O(log n)
Hashing function (Search, Insert, Delete): O(1)
Hashing worst case: O(n), when there are multiple collisions.

4 Pros and cons of hashing
Pros: FAST!!!!! O(1), independent of the amount of information.
Cons: Must have random access to the hash table. Need to know the amount of information to set up the hashed array. Potential memory hog. (Need empty spaces in the array for best efficiency.)

5 Simplified Steps to Hashing
Get the data to be added to the array.
Calculate a ‘unique’ location for the data (hashing).
Place it into the array. (More on this later.)

6 Hashing: Qualities of a good hashing function:
Distributes info evenly throughout the array. Uses some unchanging characteristic of the data, along with some simple arithmetic, to determine where the value should go.

7 Sample Hashing Code
Use the person’s name. Examples:
Sum of ordinals (in ASCII, ord(‘A’) = 65, ord(‘a’) = 97):
A -> ord(‘A’) - 64 = 1
AB -> ord(‘A’) - 64 + ord(‘B’) - 64 = 1 + 2 = 3
BA -> ord(‘B’) - 64 + ord(‘A’) - 64 = 2 + 1 = 3
ABA -> ord(‘A’) - 64 + ord(‘B’) - 64 + ord(‘A’) - 64 = 1 + 2 + 1 = 4
OK, but many names hash to the same value (AB and BA both give 3), so there will likely be collisions.
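The sum-of-ordinals idea can be sketched in a few lines. This is an illustration in Python rather than the course's Pascal; the function name `ordinal_sum` is made up for this example, and the name is assumed to contain only letters.

```python
def ordinal_sum(name: str) -> int:
    """Add up the letter positions (A=1 ... Z=26) of a name.

    Assumes the name contains only letters; it is upper-cased first
    so that ord(ch) - 64 gives 1 for 'A', 2 for 'B', and so on.
    """
    return sum(ord(ch) - 64 for ch in name.upper())


for name in ("A", "AB", "BA", "ABA"):
    print(name, ordinal_sum(name))
# A 1
# AB 3
# BA 3
# ABA 4
```

AB and BA land on the same value because addition ignores the order of the letters.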

8 Eliminating Collisions: Base 26 Hashed Value
In math we use a base 10 numbering system to get unique values. Since there are 26 letters in the alphabet, we can apply the same idea to names with base 26. This way no two names will hash to the same value.
Base 26:
AB = 1*26^1 + 2*26^0 = 26 + 2 = 28
BA = 2*26^1 + 1*26^0 = 52 + 1 = 53
But the numbers get big quickly:
SMITH = 19*26^4 + 13*26^3 + 9*26^2 + 20*26^1 + 8*26^0 = 8,917,644
But… this means I would need an array of at least 8,917,644 shelves in order to place SMITH.
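A minimal Python sketch of the base 26 calculation, for checking the arithmetic above. The function name `base26_value` is hypothetical, and letters-only input is assumed. It multiplies the running total by 26 before adding each letter, which gives the same result as summing the powers of 26.

```python
def base26_value(name: str) -> int:
    """Treat the name as a base 26 number with A=1 ... Z=26."""
    value = 0
    for ch in name.upper():
        # Shift the digits so far one place left, then add this letter.
        value = value * 26 + (ord(ch) - 64)
    return value


print(base26_value("AB"))     # 28
print(base26_value("BA"))     # 53
print(base26_value("SMITH"))  # 8917644
```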

9 MOD to the Rescue
After generating the hashed value, it must be MODified to fit into the array or file.
Example: array size of 10, using base 2 weights
SMITH = (19*2^4 + 13*2^3 + 9*2^2 + 20*2^1 + 8*2^0) MOD 10 = 492 MOD 10 = 2
So initially try to put SMITH in shelf 2 of the array.
With an array size of 100: 492 MOD 100 = 92, so try shelf 92.
In general, calculate the base 2 hashed value for the string, then MOD it by the size of the array (or file) to find the first position.
Base 2 is used here because, even with a name as short as SMITH, the base 26 value (8,917,644) is too big to fit in a small integer type (for example a 16-bit integer).
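The two steps (base 2 value, then MOD by the table size) might look like this in Python. The names `base2_value` and `hashed_position` are invented for this sketch, and the input is assumed to be letters only.

```python
def base2_value(name: str) -> int:
    """Weight each letter (A=1 ... Z=26) by a power of 2, leftmost highest."""
    value = 0
    for ch in name.upper():
        value = value * 2 + (ord(ch) - 64)
    return value


def hashed_position(name: str, table_size: int) -> int:
    """First shelf to try: a value from 0 to table_size - 1."""
    return base2_value(name) % table_size


print(base2_value("SMITH"))           # 492
print(hashed_position("SMITH", 10))   # 2
print(hashed_position("SMITH", 100))  # 92
```

Because of the MOD, the result is always a valid index into an array of `table_size` shelves numbered from 0.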

10 Base 2 Modification
Does not give totally unique values like base 26, but it gives significantly fewer collisions than just adding the ordinal values of the letters.
Base 2:
AB = 1*2^1 + 2*2^0 = 2 + 2 = 4
BA = 2*2^1 + 1*2^0 = 4 + 1 = 5
Then MOD by the size of the array.
Hands on with the algorithm: for each of the following names, find the hashed value. Use an array size of 5.
Names    Value
Ed       ________
Bo       ________
Al       ________
Abe      ________
Review break: Write a summary of how to calculate a hashed value given a name.

11 Putting the Values into the Array (or File)
Hash the key field (calculate the relatively unique position in the array or file).
If the spot is ‘empty’, then put in the name.
What if someone else happens to be in that spot?

12 Resolving Collisions: Different approaches.
Add 1: Stick the info into the next spot. Easy to code, but it tends to cause a cluster of information, which leads to more collisions and a loss in speed.
Chaining: Create a linked list from the point of collision. The longer the chain, the less efficient.
Rehashing: Use some other hash function to determine a ‘unique’ offset. Avoids clustering, easier than chaining.
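To make the chaining option concrete, here is a rough Python sketch. It uses a Python list for each chain instead of a hand-built linked list, and the function names and phone numbers are made up for the illustration.

```python
def base2_hash(name: str, table_size: int) -> int:
    """Base 2 hashed value (A=1 ... Z=26), MODded to fit the table."""
    value = 0
    for ch in name.upper():
        value = value * 2 + (ord(ch) - 64)
    return value % table_size


def chained_add(table, name, phone):
    """Append (name, phone) to the chain at the hashed shelf."""
    chain = table[base2_hash(name, len(table))]
    if any(entry_name == name for entry_name, _ in chain):
        return False  # already in the list
    chain.append((name, phone))
    return True


def chained_find(table, name):
    """Walk the chain at the hashed shelf; return the phone or None."""
    for entry_name, phone in table[base2_hash(name, len(table))]:
        if entry_name == name:
            return phone
    return None


table = [[] for _ in range(5)]         # five empty chains
chained_add(table, "AB", "555-0100")   # AB hashes to shelf 4
chained_add(table, "DA", "555-0101")   # DA also hashes to shelf 4
print(len(table[4]))                   # 2
print(chained_find(table, "DA"))       # 555-0101
```

Every extra name on a chain is one more comparison during a search, which is why longer chains are less efficient.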

13 Rehashing
Add a value to the position. What problems could there be with this?
Add a value that is relatively prime to the size of the array (file).
Base the rehashing function on the data: add the length of the name. What could be wrong with this method? You may use this.
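One way to see why the offset should be relatively prime to the table size is to list the shelves a rehash visits. The sketch below is in Python, and `probe_sequence` is a name invented for this example.

```python
from math import gcd


def probe_sequence(start: int, step: int, table_size: int) -> list:
    """Shelves visited by repeatedly adding step and wrapping with MOD."""
    visited = []
    position = start
    while position not in visited:
        visited.append(position)
        position = (position + step) % table_size
    return visited


# Offset 3 shares no factor with a table of 10, so every shelf is reached.
print(gcd(3, 10), probe_sequence(2, 3, 10))
# 1 [2, 5, 8, 1, 4, 7, 0, 3, 6, 9]

# Offset 5 (the length of SMITH) shares a factor with 10: only two shelves.
print(gcd(5, 10), probe_sequence(2, 5, 10))
# 5 [2, 7]
```

With an offset of 5 in a table of 10, the add could report a full table while eight shelves are still empty.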

14 Mechanics: Choosing the array/file size
Cooper and Clancy in Oh! Pascal!: “In practice, we generally find that optimum results are achieved with a table (array) that’s one-and-one-half to two times as large as the values that are to be stored.”
Ex.: If you are to store 100 things, you’d make the table 150 to 200 spaces.
AP test person: “Keep at least 25% of the array empty.”
Ex.: If you are to store 100 things, you’d make the table about 133 spaces.
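Both sizing rules are simple arithmetic. The Python sketch below uses hypothetical function names and integer math only. Note that rounding up so that at least 25% is guaranteed empty gives 134 for 100 items, since 133 shelves would leave just under 25% empty.

```python
def one_and_a_half_to_two_times(item_count: int) -> tuple:
    """Table size range for the 1.5x to 2x rule (lower bound rounded up)."""
    return (3 * item_count + 1) // 2, 2 * item_count


def size_with_quarter_empty(item_count: int) -> int:
    """Smallest table in which item_count items fill at most 75%."""
    return (4 * item_count + 2) // 3  # ceiling of item_count / 0.75


print(one_and_a_half_to_two_times(100))  # (150, 200)
print(size_with_quarter_empty(100))      # 134
```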

15 Pros and cons of hashing
Pros: FAST!!!!! O(1), independent of the amount of information.
Cons: Must have random access to the hash table. Need to know the amount of information to set up the hashed array. Potential memory hog. (Need empty spaces in the array for best efficiency.)

16 Initializing the array (Hash table)
Set the key field (name) on every shelf of the array to “Empty”

17 Adding info: (A sample pseudo-code)
Get info
Hash the key field {The name}
found = false {‘Cause you haven’t found where to put it.}
count = 0 {Counting to make sure time isn’t wasted when the array is full.}
repeat
    if array[hashedvalue].name = ‘Empty’ or array[hashedvalue].name = ‘Deleted’
        put in the info at this spot
        found = true {‘Cuz you’ve found where to put it}
    else if array[hashedvalue].name = name you are entering
        Tell the user “It already exists”
        found = true {You found the person}
    else
        rehash
        add one to count
until (found) or (count is too big)
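As a rough translation of the add pseudocode into runnable form, here is a Python sketch. Several things are assumptions: the record layout (a dict with `name` and `phone`), the function names, the made-up phone numbers, a rehash that simply adds a fixed `step`, and "count is too big" meaning one try per shelf.

```python
EMPTY = "Empty"
DELETED = "Deleted"


def base2_hash(name: str, table_size: int) -> int:
    """Base 2 hashed value (A=1 ... Z=26), MODded to fit the table."""
    value = 0
    for ch in name.upper():
        value = value * 2 + (ord(ch) - 64)
    return value % table_size


def make_table(size: int) -> list:
    """Initialize every shelf with the key field set to 'Empty'."""
    return [{"name": EMPTY, "phone": ""} for _ in range(size)]


def add(table: list, name: str, phone: str, step: int = 1) -> str:
    """Place the record at the first free shelf on its rehash path."""
    size = len(table)
    position = base2_hash(name, size)
    count = 0
    while count < size:  # stop once every shelf could have been tried
        shelf = table[position]
        if shelf["name"] in (EMPTY, DELETED):
            shelf["name"], shelf["phone"] = name, phone
            return "added"
        if shelf["name"] == name:
            return "already exists"
        position = (position + step) % size  # rehash
        count += 1
    return "table full"


table = make_table(5)
print(add(table, "AB", "555-0100"))  # added (shelf 4)
print(add(table, "DA", "555-0101"))  # added (shelf 4 is taken, wraps to 0)
print(add(table, "AB", "555-0100"))  # already exists
```

Names are assumed to be entered in all capitals, so the stored name and the hashed name always agree.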

18 PSEUDOCODE: SHOW ONE
Get info
Hash the key field {The name}
found = false {‘Cause you haven’t found the person yet.}
count = 0 {Counting to make sure time isn’t wasted when the array is full.}
repeat
    if array[hashedvalue].name = ‘Empty’
        Show “The person is not in the list”
        found = true {Stop looking: the person is not there}
    else if name = array[hashedvalue].name
        Show the information about the person
        found = true {You found the person}
    else
        rehash
        add one to count
until (found) or (count is too big)

19 PSEUDOCODE: DELETE ONE
Get info
Hash the key field {The name}
found = false {‘Cause you haven’t found the person yet.}
count = 0 {Counting to make sure time isn’t wasted when the array is full.}
repeat
    if array[hashedvalue].name = ‘Empty’
        Show “The person is not in the list”
        found = true {Stop looking: the person is not there}
    else if name = array[hashedvalue].name
        {What if someone had to rehash from this spot?}
        Delete the person (Copy ‘Deleted’ to the name field)
        found = true {You found the person}
    else
        rehash
        add one to count
until (found) or (count is too big)
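The question in the delete pseudocode (what if someone had to rehash from this spot?) is easier to answer with a small run. In this Python sketch the table is simplified to a list of names, the rehash adds a fixed `step` of 1, and all function names are hypothetical.

```python
EMPTY = "Empty"
DELETED = "Deleted"


def base2_hash(name: str, table_size: int) -> int:
    """Base 2 hashed value (A=1 ... Z=26), MODded to fit the table."""
    value = 0
    for ch in name.upper():
        value = value * 2 + (ord(ch) - 64)
    return value % table_size


def find(names: list, name: str, step: int = 1):
    """Return the shelf holding name, or None if it is not in the list."""
    size = len(names)
    position = base2_hash(name, size)
    for _ in range(size):
        if names[position] == EMPTY:
            return None  # a never-used shelf ends the search
        if names[position] == name:
            return position
        position = (position + step) % size  # rehash past other names
    return None


def delete(names: list, name: str, step: int = 1) -> bool:
    """Mark the shelf 'Deleted' so later searches keep going past it."""
    position = find(names, name, step)
    if position is None:
        return False
    names[position] = DELETED
    return True


# AB and DA both hash to shelf 4 of 5; DA was rehashed around to shelf 0.
names = ["DA", EMPTY, EMPTY, EMPTY, "AB"]
delete(names, "AB")
print(names)              # ['DA', 'Empty', 'Empty', 'Empty', 'Deleted']
print(find(names, "DA"))  # 0
```

If the delete had written ‘Empty’ into shelf 4 instead, the search for DA would have stopped there and wrongly reported that DA is not in the list.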

20 PSEUDOCODE: Modifying ADD
Get info
Hash the key field {The name}
found = false {‘Cause you haven’t found where to put it.}
count = 0 {Counting to make sure time isn’t wasted when the array is full.}
repeat
    if array[hashedvalue].name = ‘Empty’ or array[hashedvalue].name = ‘Deleted’
        put in the info at this spot
        found = true {‘Cuz you’ve found where to put it}
    else if name = array[hashedvalue].name
        Tell the user “It already exists”
        found = true {You found the person}
    else
        rehash
        add one to count
until (found) or (count is too big)

21 Dry Running the Pseudo Code
Assume all capital letters for this activity.

22 Assignment
Write a hashing function that is sent a string representing the value to hash and an integer representing the size of the array. The function will return a hashed value from 0 to (arraysize - 1).
Looking ahead: Start thinking about how you will handle collisions.

23 Handy Pascal Functions
Ans := Ord(character); // Returns the ordinal value of a character. ‘A’ = 65, ‘a’ = 97
Cha := Upcase(character); // Returns the uppercase value of the character. Example: Ord(Upcase(name[count]))
intVar := Length(stringVar); // Returns the number of characters in a string.
realPower := 3**4; // Same as 3^4
Example: 10^3 is written realVar := 10**3;

24 Hashing Project
Using the hashing function created previously, write a program that uses hashing and is designed to store the following: 20 names and phone numbers.
Create a menu for this program:
Add
Show all (including ‘Empty’ and ‘Deleted’; this is for error checking)
Find: Given a name, find the rest of the information or say they are not in the list.
Delete: Given a name, remove their information from the list.
Change (Push): Be able to change the information about a person, including their name.

