1
Bitwise Hashing
2
Hash Tables
One of the most convenient ways to implement a hash table is as an array of linked lists.
3
Data is stored in records containing:
1. A key value
2. The actual data
3. A pointer to another record
[Diagram: a record drawn as three fields: KEY, DATA, POINTER]
4
Keys and Data
Often the key value and the actual value are the same thing, as in the case of a dictionary, where the word being stored can also be used as the key for the hash function.
[Diagram: a record drawn as two fields: WORD, POINTER]
5
Hashing
Hashing is the process of turning the key value into an integer index, used to locate a storage location in a larger array. Hash functions should be designed to give different codes for different keys, although this cannot be guaranteed.
6
Collisions
When two keys hash to the same code, a collision occurs that must be dealt with. Linked lists offer an efficient solution for collision processing: records with the same hash code are stored in the same linked list.
7
Limitations
Usually, you will not have long linked lists: your hash function should be designed to make sure there are few collisions. The problem with long linked lists is that they are sequential search structures, so a lookup costs O(n), as opposed to O(1) for a simple, non-colliding hash.
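To make the cost concrete, here is a minimal sketch of searching one chain while counting comparisons. The `struct node` type and the `chain_find` function are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: the word is both key and data. */
struct node {
    const char *word;
    struct node *next;
};

/* Sequential search of one chain. The number of comparisons made is stored
   in *steps, which makes the O(n) cost of a long chain visible. */
const struct node *chain_find(const struct node *head, const char *word,
                              size_t *steps)
{
    assert(word != NULL && steps != NULL);
    *steps = 0;
    for (; head != NULL; head = head->next) {
        ++*steps;
        if (strcmp(head->word, word) == 0)
            return head;
    }
    return NULL;
}
```

With one record per chain every lookup takes a single comparison; a failed lookup in a chain of n records takes n.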
8
ht = hash table
[Diagram: slots of ht pointing to linked lists of records]
9
Advantages
An advantage of using linked lists to implement hash tables is that adding and deleting records is not difficult.
10
Adding a record
Hash into the list. If the list pointer is NULL, then assign it to be the pointer to the record.
[Diagram: ht with NULL slots and existing chains of records, next to the record to be added]
11
Record is now added.
[Diagram: ht with the new record linked into its list]
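The two slides above can be sketched in a few lines. This is only an illustration under stated assumptions: `struct record`, `NBUCKETS`, `toy_hash` and `add_record` are hypothetical names, and the character-sum hash is a stand-in rather than the hash function developed later in the slides.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8   /* hypothetical table size for this sketch */

struct record {
    char word[32];
    struct record *next;
};

static struct record *ht[NBUCKETS];   /* static storage: all heads start NULL */

/* Toy hash for illustration only: sum of the characters modulo NBUCKETS. */
unsigned toy_hash(const char *word)
{
    unsigned sum = 0;
    while (*word != '\0')
        sum += (unsigned char)*word++;
    return sum % NBUCKETS;
}

/* Hash into a list and link the new record in at the head. If the list
   pointer was NULL, the new record simply becomes the whole list.
   Returns the new record, or NULL if allocation fails. */
struct record *add_record(const char *word)
{
    assert(word != NULL);
    struct record *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    strncpy(r->word, word, sizeof r->word - 1);
    r->word[sizeof r->word - 1] = '\0';
    unsigned h = toy_hash(word);
    r->next = ht[h];
    ht[h] = r;
    return r;
}
```

Inserting at the head needs no special case for an empty list, because the old head (NULL or not) just becomes the new record's `next`.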
12
Deleting a record
Hash into the list. If the list pointer is not NULL, then follow the list along until the word is found.
[Diagram: ht with a chain of records, one of which holds the word to delete]
13
Assign the pointer to the record to the next record, then free the old record.
Record is now deleted.
[Diagram: ht with the chain relinked around the removed record]
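One possible way to code the unlink-then-free step is shown below. It is an iterative sketch using a pointer to the link being followed; `struct record` and `delete_record` are hypothetical names for this example, and it is not the routine the slides present later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct record {
    char word[32];
    struct record *next;
};

/* Remove the first record whose word matches. `link` always points at the
   pointer that leads to the current record (the list head or a `next` field),
   so unlinking is a single assignment. Returns 1 if a record was deleted,
   0 if the word was not found. */
int delete_record(struct record **head, const char *word)
{
    assert(head != NULL && word != NULL);
    for (struct record **link = head; *link != NULL; link = &(*link)->next) {
        if (strcmp((*link)->word, word) == 0) {
            struct record *old = *link;
            *link = old->next;   /* assign pointer to record to next record */
            free(old);           /* then free the old record */
            return 1;
        }
    }
    return 0;
}
```

Because `link` can point at the table slot itself, deleting the first record of a list needs no separate case.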
14
Hash table example
15
Name structure
#define SSIZE 20

struct name {
    char last[SSIZE];
    char first[SSIZE];
    char mi;
    char title[SSIZE];
};
16
Address structure
struct addr {
    char street[4*SSIZE];
    char city[SSIZE];
    char state[SSIZE];
    char zip[SSIZE];
};
17
Address_entry structure
struct addr_entry {
    struct name name;
    struct addr addr;
};

typedef struct addr_entry Addr_entry;
typedef struct addr Addr;
typedef struct name Name;
18
[Diagram: addr_entry contains name (last, first, mi, title) and addr (street, city, state, zip)]
19
A single node of the linked list
struct addr_list_item {
    Addr_entry *addr;
    struct addr_list_item *next;
};
20
[Diagram: a list node holds a next pointer and a pointer to an addr_entry, which contains name (last, first, mi, title) and addr (street, city, state, zip)]
21
typedef struct addr_list_item Addr_list_item;
typedef Addr_list_item *Addr_list;
#define EMPTY_LIST NULL
22
The hash table definition
#define TABLE_SIZE 256
static Addr_list addr_ht[TABLE_SIZE];
Because the array is defined as 'static', it has static storage duration, so all of its elements are initialized to NULL. This is what we want to start out with. As we hash into this table later on we will build linked lists from these pointers.
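A small check of this starting state might look like the sketch below. `count_used_slots` is a hypothetical helper, and the node type is only forward-declared because its layout is not needed here. The zero initialization comes from static storage duration in general, so a file-scope array without the `static` keyword would start out the same way.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 256

struct addr_list_item;                     /* node type, defined elsewhere */
typedef struct addr_list_item *Addr_list;

static Addr_list addr_ht[TABLE_SIZE];      /* no initializer: every slot is NULL */

/* Count the slots that currently hold a list. */
size_t count_used_slots(void)
{
    size_t used = 0;
    for (size_t i = 0; i < TABLE_SIZE; ++i)
        if (addr_ht[i] != NULL)
            ++used;
    assert(used <= TABLE_SIZE);
    return used;
}
```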
23
Hashing has two parts to it:
1. a function to convert C strings to unsigned integers
2. a function to convert unsigned integers to hash codes (table indices)
The basic idea is that a person’s last name, first name and middle initial will be combined into one string, converted to an integer and then hashed into a table location between 0 and 255.
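The first step, combining the three name fields into one string, could be sketched as follows. `make_key` and `KEY_LEN` are hypothetical names for this example; the struct mirrors the name structure defined earlier.

```c
#include <assert.h>
#include <stdio.h>

#define SSIZE 20
#define KEY_LEN (2 * SSIZE + 2)   /* hypothetical: two name fields, initial, NUL */

struct name {
    char last[SSIZE];
    char first[SSIZE];
    char mi;
    char title[SSIZE];
};

/* Concatenate last name, first name and middle initial into key.
   Returns the key length, or -1 if the buffer was too small. */
int make_key(const struct name *n, char *key, size_t key_size)
{
    assert(n != NULL && key != NULL);
    int len = snprintf(key, key_size, "%s%s%c", n->last, n->first, n->mi);
    return (len < 0 || (size_t)len >= key_size) ? -1 : len;
}
```

For a hypothetical entry with last name "Smith", first name "John" and initial 'Q', the key is "SmithJohnQ".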
24
Character conversion
Conversion of strings to unsigned integers
There are conversion functions available in C to convert strings to numbers:
to integer: atoi
to double: atof
to long integer: atol
25
Assumptions
The presupposition is that the number in the string does not need more bits than can be represented by the data type involved. When you are trying to convert longer strings there are other functions that can be used.
26
Available conversions
double: strtod(CSTR str, char **rest)
long: strtol(CSTR str, char **rest, int base)
unsigned long: strtoul(CSTR str, char **rest, int base)
where CSTR is a constant string (const char *) and rest receives a pointer to the rest of the string (that portion of the string that was not converted because it is not part of the number). A value that is too large for the designated data type is reported by setting errno to ERANGE.
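As an illustration of the `rest` argument, here is a minimal sketch built on `strtol`. The wrapper name `parse_long` is hypothetical; `strtol`, `errno` and `ERANGE` are standard C.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a leading base-10 integer from str. On success returns 1, stores the
   number in *out and sets *rest to the first character that was not part of
   the number. Returns 0 if no digits were found or the value is out of range
   for long. */
int parse_long(const char *str, long *out, const char **rest)
{
    assert(str != NULL && out != NULL && rest != NULL);
    char *end;
    errno = 0;
    long v = strtol(str, &end, 10);
    if (end == str || errno == ERANGE)
        return 0;
    *out = v;
    *rest = end;
    return 1;
}
```

Given "123abc", the sketch stores 123 and leaves `rest` pointing at "abc".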
27
Whole strings to integers?
If, however, we want to take an entire string (like one consisting of a last name, first name and middle initial) and process the whole thing into an integer we must write this function ourselves.
28
#include <string.h>

unsigned int str_to_int(const char *str)
{
    unsigned value = 0u, tmp = 0u;
    int size = sizeof(int)/sizeof(char);
    int len = strlen(str);

    while ( len >= size ) {
        value ^= *(unsigned *)str;      /* xor & typecast; assumes str is suitably aligned */
        str += size;
        len -= size;
    }
    if ( len > 0 ) {
        strcpy( (char *)&tmp, str );    /* copy the leftover characters and the '\0' */
        value ^= tmp;
    }
    return value;
}
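Reading the string through an `unsigned *` cast relies on the string being suitably aligned for `unsigned`. A hedged alternative, sketched below, folds the string in the same way but copies each block with `memcpy`. The name `str_to_int_safe` is hypothetical and this variant is not part of the slides.

```c
#include <assert.h>
#include <string.h>

/* Same folding scheme as str_to_int, but each block is copied into an
   unsigned with memcpy instead of being read through a cast pointer. */
unsigned str_to_int_safe(const char *str)
{
    assert(str != NULL);
    unsigned value = 0u, tmp;
    size_t size = sizeof(unsigned);
    size_t len = strlen(str);

    while (len >= size) {           /* full blocks */
        memcpy(&tmp, str, size);
        value ^= tmp;
        str += size;
        len -= size;
    }
    if (len > 0) {                  /* leftover characters, zero padded */
        tmp = 0u;
        memcpy(&tmp, str, len);
        value ^= tmp;
    }
    return value;
}
```

The numeric result still depends on the machine's byte order and on the size of `unsigned`, exactly as with the cast version.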
29
Explanation of str_to_int
This function converts a potentially long string to an unsigned integer. The size of an unsigned integer is machine-dependent. Let us assume that it is 32 bits. Then, value and tmp are initialized as follows:
30
Other key variables
int size = sizeof(int)/sizeof(char)
         = 32 bits / 8 bits
         = 4 chars (bytes) per unsigned integer
int len = strlen(str)   /* the number of characters in the string */
31
[Diagram: value and tmp, each drawn as 32 bits (four 1-byte cells), all initialized to zero]
32
The while loop then reads
while ( len >= size )
While the number of characters remaining to be processed into our integer is >= the number of characters that can fit into one unsigned integer, do the following.
33
In other words, the loop takes blocks of 4 characters (32 bits) from the string and processes them into value. The loop ends when there are not enough unprocessed characters left in the string to take up 32 bits.
34
The if condition after the loop will process the remaining characters into the unsigned integer value.
35
Handout
Given str = "ABCDE"; how does the function come up with 'value'?
36
Bitwise operators
Expression   Comment
n & 017      Bitwise and; value is n with all but lower 4 bits masked away
i | j        Bitwise i or j
i ^ j        Bitwise i exclusive or j
i << 4       Value is i shifted left by 4 bits
j >> 5       Value is j shifted right by 5 bits
~n           Value is 1's complement of n
37
Truth tables (& | ^)
a  b   a & b   a | b   a ^ b
T  T     T       T       F
T  F     F       T       T
F  T     F       T       T
F  F     F       F       F
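The operators and truth tables can be checked directly in C. In the sketch below the two 4-bit operands are chosen so that each bit position is one row of the truth tables; the function names are hypothetical.

```c
#include <assert.h>

/* Keep only the lower 4 bits of n (017 is octal for binary 1111). */
unsigned low4(unsigned n)
{
    return n & 017u;
}

/* 1's complement of n, restricted to the lower 4 bits for readability. */
unsigned flip_low4(unsigned n)
{
    return ~n & 017u;
}

/* Self-check of the three truth tables. Reading the bits from the left,
   the operand pairs are (1,1), (1,0), (0,1), (0,0). */
void check_truth_tables(void)
{
    unsigned a = 0xCu;          /* 1100 */
    unsigned b = 0xAu;          /* 1010 */
    assert((a & b) == 0x8u);    /* 1000 */
    assert((a | b) == 0xEu);    /* 1110 */
    assert((a ^ b) == 0x6u);    /* 0110 */
}
```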
38
[Diagram: value (32 bits, four 1-byte cells) beside str = A B C D E; the first four characters line up with the four bytes of value]
value = value ^ (first four bytes of str)
39
Dealing with leftovers
if ( len > 0 ) {
    strcpy( (char *)&tmp, str );
[Diagram: value after the loop, with str now pointing at the leftover character]
40
[Diagram: tmp after strcpy, holding the leftover character followed by zero bytes]
41
The final value
value = value ^ tmp
[Diagram: tmp and value before the XOR, and the resulting value]
42
What next? Now that we have the function that converts strings
to unsigned integers, we can write the function that hashes unsigned integers into our hash table.
43
The hash function
unsigned hash(const Name *name)
{
    char h_str[HSTR_LEN];
    unsigned int val;

    sprintf(h_str, "%s%s%c", name->last, name->first, name->mi);
    val = str_to_int(h_str);
    return( val >> SHIFT_AMT );
}
44
SHIFT_AMT
Where SHIFT_AMT is determined by
#define SHIFT_AMT (8*sizeof(unsigned int) - TABLE_BITS)
45
TABLE_BITS
and TABLE_BITS is defined as
#define TABLE_BITS 8
TABLE_BITS is the size of your hash table expressed as a power of 2. Therefore, since our hash table has 256 entries, TABLE_BITS is set to 8 (as in 2^8 = 256).
46
The necessary conversion
We want to convert name to an unsigned int and then map the result into a hash table of size 256. To do this last step we only need to select 8 bits of the 32-bit hash value.
47
Bitwise operations
One easy method of selecting 8 bits is to use bitwise operations to right-shift the value so that only 8 bits (TABLE_BITS) remain. The result is a number between 0 and 255 and serves as the index into the hash table.
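The shift can be tried on its own with a fixed-width type. In this sketch the value is held in a `uint32_t` so the shift amount is exactly 24; `table_index` is a hypothetical name, and the sample value in the note below is arbitrary rather than taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_BITS 8
#define SHIFT_AMT  (32 - TABLE_BITS)   /* 24 for a 32-bit value */

/* Keep the top TABLE_BITS bits of a 32-bit value as the table index. */
unsigned table_index(uint32_t val)
{
    unsigned idx = (unsigned)(val >> SHIFT_AMT);
    assert(idx < (1u << TABLE_BITS));
    return idx;
}
```

For example, 0x51ABCDEF shifted right by 24 bits leaves 0x51, which is 81 in decimal. Only the top byte of the value influences the index.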
48
The hashing process
49
Value before and after right shifting
1. Result of str_to_int(h_str)
2. multiplied by HASH_MULT
3. right shift 32-8 = 24 bits
4. value is 81
[Diagram: the 32-bit pattern shown after each step]
50
Results
The right shift of 24 bits forces the result into an integer that fits in 8 bits. The largest such integer is 255 and the smallest is 0. We now have our hash value: in this case, 81.
51
Fundamental operations
There are a host of hash table operations that must be performed. Some of them need to be able to hash values into the table as well.
1.) data retrieval
    Your data is stored by name. You want to enter a person’s name and have the program look them up in the table so you can print their address, etc.
52
2.) data removal
    You no longer need a person’s record.
3.) data insertion
    You wish to add a new record.
4.) record creation
    Needed by the data insertion function.
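Record creation is listed here but no code for it appears in the slides. One possible shape, offered only as a sketch, is given below; `create_record` and `destroy_record` are hypothetical names and the type definitions repeat the ones introduced earlier.

```c
#include <assert.h>
#include <stdlib.h>

#define SSIZE 20

struct name { char last[SSIZE]; char first[SSIZE]; char mi; char title[SSIZE]; };
struct addr { char street[4*SSIZE]; char city[SSIZE]; char state[SSIZE]; char zip[SSIZE]; };
struct addr_entry { struct name name; struct addr addr; };
struct addr_list_item { struct addr_entry *addr; struct addr_list_item *next; };
typedef struct addr_list_item *Addr_list;

/* Hypothetical helper: allocate a list node plus its entry, copy the name and
   address into it, and leave next as NULL. Returns NULL on allocation failure. */
Addr_list create_record(const struct name *name, const struct addr *addr)
{
    assert(name != NULL && addr != NULL);
    Addr_list item = malloc(sizeof *item);
    if (item == NULL)
        return NULL;
    item->addr = malloc(sizeof *item->addr);
    if (item->addr == NULL) {
        free(item);
        return NULL;
    }
    item->addr->name = *name;   /* structure assignment copies the arrays */
    item->addr->addr = *addr;
    item->next = NULL;
    return item;
}

/* Matching cleanup: free the entry, then the node. */
void destroy_record(Addr_list item)
{
    if (item != NULL) {
        free(item->addr);
        free(item);
    }
}
```

A node built this way owns its entry, so whoever removes the node from the table is also responsible for freeing the entry.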
53
Data retrieval
Addr_list retrieve(const Name *name)
{
    Addr_list list_a = addr_ht[hash(name)];

    for (; list_a != EMPTY_LIST; list_a = list_a->next)
        if (cmp_name(name, &list_a->addr->name) == 0)
            return(list_a);     /* entry found */
    return( EMPTY_LIST );       /* entry not found */
}
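The retrieval code calls cmp_name, which the slides do not show. A plausible definition, given here purely as an assumption, compares the same three fields that feed the hash and returns 0 on a match, in the style of strcmp.

```c
#include <assert.h>
#include <string.h>

#define SSIZE 20

struct name {
    char last[SSIZE];
    char first[SSIZE];
    char mi;
    char title[SSIZE];
};
typedef struct name Name;

/* Hypothetical comparison: order by last name, then first name, then middle
   initial. Returns 0 when all three match. The title is ignored because it
   is not part of the hashed key either. */
int cmp_name(const Name *a, const Name *b)
{
    assert(a != NULL && b != NULL);
    int c = strcmp(a->last, b->last);
    if (c == 0)
        c = strcmp(a->first, b->first);
    if (c == 0)
        c = (unsigned char)a->mi - (unsigned char)b->mi;
    return c;
}
```

Comparing exactly the hashed fields matters: two names that compare equal must also hash to the same list, otherwise retrieve would look in the wrong chain.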
54
Data Removal
static int found = 0;

int erase(const Name *name)
{
    unsigned hashcode = hash(name);
    Addr_list list_a = addr_ht[hashcode];
    Addr_list delete_list(Addr_list, const Name *);   /* prototype */

    found = 1;
    if (list_a != EMPTY_LIST ) {
        addr_ht[hashcode] = delete_list(list_a, name);
        return(found - 1);   /* 0 if deleted, -1 if the name was not in the list */
    }
    return(-1);              /* no entry to delete */
}
55
Obviously, this needs a deletion routine.
56
Deletion
Addr_list delete_list(Addr_list list_a, const Name *name)
{
    Addr_list ans;

    if ( list_a == EMPTY_LIST ) {   /* reached the end: name not found */
        found = 0;
        return(NULL);
    }
57
    if ( cmp_name(&list_a->addr->name, name) == 0 ) {
        ans = list_a->next;
        free(list_a);
        return(ans);
    }
    list_a->next = delete_list(list_a->next, name);
    return( list_a );
}
58
Adding a new entry
void entr_add(Addr_list a)
{
    Addr_list b = retrieve(&a->addr->name);
    unsigned hashcode;

    if (b != EMPTY_LIST) {   /* replace existing entry */
        a->next = b->next;
        *b = *a;             /* structure assignment */
        free(a);
    }
59
    else {                   /* install new entry */
        hashcode = hash(&a->addr->name);
        b = addr_ht[hashcode];
        a->next = b;         /* b is NULL or points to first item in a linked list */
        addr_ht[hashcode] = a;
    }
}
New entries are inserted in the first position in the linked list.