Published by Kathleen McCormick. Modified over 8 years ago.
1 the BSTree class
BSTreeNode has the same structure as binary tree nodes
elements stored in a BSTree are key-value pairs
an element must be a class (or a struct) which has
  a data member for the value
  a data member for the key
  a method with the signature: KF key( ) const;
    where KF is the type of the key
2 an example
  struct treeItem {
      int id;        // key
      string data;   // value
      int key( ) const { return id; }
  };
  BSTree myBSTree;
3 basic BST search algorithm
  void search (bstree, searchKey)
  {
      if (bstree is empty)
          // base case: item not found
          // take needed action
      else if (key in bstree's root == searchKey)
          // base case: item found
          // take needed action
      else if (searchKey < key in bstree's root)
          search (leftSubtree, searchKey);
      else
          search (rightSubtree, searchKey);
  }
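The pseudocode above could be made concrete roughly as follows. This is a minimal sketch, not the BSTree class from the slides: the names TreeItem, BSTreeNode and the pointer-returning search are assumptions made for illustration, with TreeItem modeled on the treeItem struct from slide 2.

```cpp
#include <string>

// Hypothetical element type, modeled on the treeItem struct from slide 2.
struct TreeItem {
    int id;            // key
    std::string data;  // value
    int key() const { return id; }
};

// Hypothetical node layout: one element plus left and right child pointers.
struct BSTreeNode {
    TreeItem item;
    BSTreeNode* left = nullptr;
    BSTreeNode* right = nullptr;
};

// Returns a pointer to the element with the given key,
// or nullptr if no such element is in the tree.
const TreeItem* search(const BSTreeNode* root, int searchKey) {
    if (root == nullptr) {
        return nullptr;                      // base case: item not found
    }
    if (root->item.key() == searchKey) {
        return &root->item;                  // base case: item found
    }
    if (searchKey < root->item.key()) {
        return search(root->left, searchKey);
    }
    return search(root->right, searchKey);
}
```

Returning a pointer is one way to express "take needed action": the caller decides what to do when the result is nullptr.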
4 deletion cases
item to be deleted is in a leaf node
  pointer to its node (in parent) must be changed to NULL
item to be deleted is in a node with one empty subtree
  pointer to its node (in parent) must be changed to the non-empty subtree
item to be deleted is in a node with two non-empty subtrees
5 the easy cases
[figure: binary search tree; keys level by level: 36 / 20 42 / 12 24 39 45 / 21 40]
6 the “hard” case
[figure: binary search tree; keys level by level: 36 / 20 42 / 12 24 39 45 / 21 40]
7 the “hard” case
[figure: binary search tree; keys level by level: 36 / 20 42 / 12 24 39 45 / 21 40]
replace with smallest in right subtree (inorder successor)
or replace with largest in left subtree (inorder predecessor)
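The three deletion cases can be sketched in code as below. This is an illustration under assumptions, not the course's BSTree implementation: Node, insertKey, removeKey and inorder are hypothetical names, keys are plain ints, and the hard case uses the inorder successor (the predecessor would work symmetrically).

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Standard BST insertion; duplicate keys are ignored.
Node* insertKey(Node* root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = insertKey(root->left, key);
    else if (key > root->key) root->right = insertKey(root->right, key);
    return root;
}

// Removes key from the subtree and returns the subtree's new root,
// so the parent's pointer is updated by assignment at the call site.
Node* removeKey(Node* root, int key) {
    if (root == nullptr) return nullptr;  // key not present
    if (key < root->key) {
        root->left = removeKey(root->left, key);
        return root;
    }
    if (key > root->key) {
        root->right = removeKey(root->right, key);
        return root;
    }
    // Easy cases: leaf (child is nullptr) or one empty subtree.
    if (root->left == nullptr || root->right == nullptr) {
        Node* child = (root->left != nullptr) ? root->left : root->right;
        delete root;
        return child;
    }
    // Hard case: copy the inorder successor (smallest key in the right
    // subtree) into this node, then remove the successor's old node.
    Node* succ = root->right;
    while (succ->left != nullptr) succ = succ->left;
    root->key = succ->key;
    root->right = removeKey(root->right, succ->key);
    return root;
}

// Appends the keys of the subtree to out in ascending order.
void inorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;
    inorder(root->left, out);
    out.push_back(root->key);
    inorder(root->right, out);
}
```

With the tree from the figure (keys inserted in the order 36, 20, 42, 12, 24, 39, 45, 21, 40), removing 36 moves 39 into the root, and 40 takes the place 39 used to occupy.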
8 traversing a binary search tree
can use any of the binary tree traversal orders: preorder, inorder, postorder
base case is reaching an empty tree
inorder traversal visits the elements in order of their key values
how would you visit the elements in descending order of key values?
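One possible answer to the question above is to mirror the inorder traversal: visit the right subtree first, then the root, then the left subtree. The sketch below assumes a bare-bones Node struct with int keys (a hypothetical stand-in for BSTreeNode) and collects the visited keys in a vector.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Inorder: left subtree, root, right subtree. Visits keys in ascending order.
void inorder(const Node* t, std::vector<int>& out) {
    if (t == nullptr) return;  // base case: empty tree
    inorder(t->left, out);
    out.push_back(t->key);
    inorder(t->right, out);
}

// Reverse inorder: right subtree, root, left subtree.
// Visits keys in descending order.
void reverseInorder(const Node* t, std::vector<int>& out) {
    if (t == nullptr) return;  // base case: empty tree
    reverseInorder(t->right, out);
    out.push_back(t->key);
    reverseInorder(t->left, out);
}
```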
9 big Oh of BST operations
measured by length of the search path
  depends on the height of the BST
  height determined by order of insertion
height of a BST containing n items is
  minimum: floor(log2 n)
  maximum: n - 1
  average: ?
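To see how insertion order drives the height, the sketch below builds a tree from a list of keys and measures it. The names Node, insertKey, height and heightAfterInserting are assumptions for illustration; height is counted in edges, so a single node has height 0 and an empty tree has height -1, which matches the minimum and maximum given above.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

struct Node {
    int key;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
    explicit Node(int k) : key(k) {}
};

// Standard BST insertion; duplicate keys are ignored.
void insertKey(std::unique_ptr<Node>& root, int key) {
    if (!root) {
        root.reset(new Node(key));
        return;
    }
    if (key < root->key) insertKey(root->left, key);
    else if (key > root->key) insertKey(root->right, key);
}

// Height in edges: -1 for an empty tree, 0 for a single node.
int height(const std::unique_ptr<Node>& root) {
    if (!root) return -1;
    return 1 + std::max(height(root->left), height(root->right));
}

// Builds a BST by inserting keys in the given order and returns its height.
int heightAfterInserting(const std::vector<int>& keys) {
    std::unique_ptr<Node> root;
    for (int k : keys) insertKey(root, k);
    return height(root);
}
```

For n = 7, inserting 1, 2, 3, 4, 5, 6, 7 in sorted order gives height 6 (n - 1), while inserting 4, 2, 6, 1, 3, 5, 7 gives height 2 (floor(log2 7)).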
10 faster searching
"balanced" search trees guarantee an O(log2 n) search path by controlling the height of the search tree
  AVL tree
  2-3-4 tree
  red-black tree (used by STL associative container classes)
hash table allows for O(1) search performance on average
  search time does not increase as n increases
11 Hash Table
a hash table is an array of size Tsize
  has index positions 0..Tsize-1
two types of hash tables (Nyhoff, Ch. 9.3)
  open hash table
    array element type is a key-value pair
    all items stored in the array
  chained hash table
    element type is a pointer to a linked list of nodes containing key-value pairs
    items are stored in the linked list nodes
keys are used to generate an array index
  home address (0..Tsize-1)
12 Considerations
How big an array?
  load factor of a hash table is n/Tsize
Hash function to use?
  int hash(KeyType key) -> 0..Tsize-1
Collision resolution strategy?
  hash function is many-to-one
13 Hash Function
a hash function is used to map a key to an array index (home address)
  search starts from here
insert, retrieve, update, delete all start by applying the hash function to the key
14 Some hash functions
if KeyType is int: key % Tsize
if KeyType is a string: convert to an integer and then % Tsize
goals for a hash function
  fast to compute
  even distribution
  cannot guarantee no collisions unless all key values are known in advance
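Both hash functions above might look like the following sketch. The function names, the multiplier 31 and the handling of negative int keys are assumptions for illustration; the slides only specify "key % Tsize" and "convert to an integer and then % Tsize".

```cpp
#include <cstddef>
#include <string>

// int keys: key % tsize, adjusted so that negative keys still map
// into 0..tsize-1 (in C++ the result of % can be negative).
std::size_t hashInt(int key, std::size_t tsize) {
    long long m = static_cast<long long>(tsize);
    long long r = key % m;
    return static_cast<std::size_t>(r < 0 ? r + m : r);
}

// string keys: fold the characters into an integer with a polynomial
// scheme (multiplier 31 is an arbitrary but common choice), reducing
// modulo tsize at each step to keep the intermediate value small.
std::size_t hashString(const std::string& key, std::size_t tsize) {
    std::size_t h = 0;
    for (unsigned char c : key) {
        h = (h * 31 + c) % tsize;
    }
    return h;
}
```

Both are fast to compute. How evenly they distribute keys depends on the actual data, so neither guarantees freedom from collisions.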
15 An Open Hash Table
Hash (key) produces an index in the range 0 to 6. That index is the “home address”.
Some insertions: K1 --> 3, K2 --> 5, K3 --> 2
  index  key  value
  0
  1
  2      K3   K3info
  3      K1   K1info
  4
  5      K2   K2info
  6
16 Handling Collisions
Some more insertions: K4 --> 3, K5 --> 2, K6 --> 4
Linear probing collision resolution strategy
  index  key  value
  0      K6   K6info
  1
  2      K3   K3info
  3      K1   K1info
  4      K4   K4info
  5      K2   K2info
  6      K5   K5info
17 Search Performance
Average number of probes needed to retrieve the value with key K?
  K    hash(K)  #probes
  K1   3        1
  K2   5        1
  K3   2        1
  K4   3        2
  K5   2        5
  K6   4        4
14/6 = 2.33 (successful)
unsuccessful search?
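The probe counts above can be reproduced with a small linear-probing table. This is a sketch under assumptions: OpenHashTable and totalProbes are hypothetical names, keys are non-negative ints hashed with key % Tsize, and the stand-in keys 3, 5, 2, 10, 9, 11 are chosen only because they have the same home addresses as K1 to K6 when Tsize is 7.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class OpenHashTable {
    struct Slot {
        bool used = false;
        int key = 0;
        std::string value;
    };
    std::vector<Slot> slots_;

    // Assumes non-negative keys.
    std::size_t hash(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }

public:
    explicit OpenHashTable(std::size_t tsize) : slots_(tsize) {}

    // Linear probing: start at the home address and step forward
    // (wrapping around) until a free slot or the same key is found.
    // Returns false if the table is full.
    bool insert(int key, const std::string& value) {
        std::size_t home = hash(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            Slot& s = slots_[(home + i) % slots_.size()];
            if (!s.used || s.key == key) {
                s.used = true;
                s.key = key;
                s.value = value;
                return true;
            }
        }
        return false;
    }

    // Number of slots examined to find key, or 0 if the key is absent.
    std::size_t probesToFind(int key) const {
        std::size_t home = hash(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            const Slot& s = slots_[(home + i) % slots_.size()];
            if (!s.used) return 0;          // empty slot ends the search
            if (s.key == key) return i + 1;
        }
        return 0;
    }
};

// Sum of probes over a set of keys that are all in the table.
std::size_t totalProbes(const OpenHashTable& t, const std::vector<int>& keys) {
    std::size_t total = 0;
    for (int k : keys) total += t.probesToFind(k);
    return total;
}
```

Inserting the six stand-in keys in order into a table of size 7 and summing the probes gives 14, so the average for a successful search is 14/6, as in the table above.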
18 A Chained Hash Table
insert keys: K1 --> 3, K2 --> 5, K3 --> 2, K4 --> 3, K5 --> 2, K6 --> 4
linked lists of synonyms
  index  chain
  0
  1
  2      K3 K3info -> K5 K5info
  3      K1 K1info -> K4 K4info
  4      K6 K6info
  5      K2 K2info
  6
19 Search Performance
Average number of probes needed to retrieve the value with key K?
  K    hash(K)  #probes
  K1   3        1
  K2   5        1
  K3   2        1
  K4   3        2
  K5   2        2
  K6   4        1
8/6 = 1.33 (successful)
unsuccessful search?
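The same experiment can be run against a chained table. In this sketch, ChainedHashTable is a hypothetical name, std::list stands in for the hand-built linked list of nodes, new items are appended at the end of their chain, and the int keys 3, 5, 2, 10, 9, 11 are stand-ins with the same home addresses as K1 to K6 for Tsize 7.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

class ChainedHashTable {
    // One chain (linked list of key-value pairs) per home address.
    std::vector<std::list<std::pair<int, std::string>>> buckets_;

    // Assumes non-negative keys.
    std::size_t hash(int key) const {
        return static_cast<std::size_t>(key) % buckets_.size();
    }

public:
    explicit ChainedHashTable(std::size_t tsize) : buckets_(tsize) {}

    // Updates the value if the key is already present,
    // otherwise appends a new node to the end of the chain.
    void insert(int key, const std::string& value) {
        auto& chain = buckets_[hash(key)];
        for (auto& kv : chain) {
            if (kv.first == key) {
                kv.second = value;
                return;
            }
        }
        chain.emplace_back(key, value);
    }

    // Number of chain nodes examined to find key, or 0 if absent.
    std::size_t probesToFind(int key) const {
        std::size_t probes = 0;
        for (const auto& kv : buckets_[hash(key)]) {
            ++probes;
            if (kv.first == key) return probes;
        }
        return 0;
    }
};
```

With the six stand-in keys inserted in order, the probe counts are 1, 1, 1, 2, 2, 1, for a total of 8 and an average of 8/6. Only synonyms (keys with the same home address) lengthen a search, which is why chaining needs fewer probes here than linear probing did.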
20 successful search performance
  load factor   open addressing     open addressing     chaining
                (linear probing)    (double hashing)
  0.5           1.50                1.39                1.25
  0.7           2.17                1.72                1.35
  0.9           5.50                2.56                1.45
  1.0           ----                ----                1.50
  2.0           ----                ----                2.00
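The figures in this table agree, to two decimal places, with the standard textbook approximations for the expected number of probes in a successful search at load factor a. The slides do not say where the numbers come from, so treating them as these formulas is an assumption, and the function names are hypothetical.

```cpp
#include <cmath>

// Linear probing, successful search: (1/2) * (1 + 1/(1 - a)), for 0 <= a < 1.
double linearProbingHit(double a) {
    return 0.5 * (1.0 + 1.0 / (1.0 - a));
}

// Double hashing, successful search: -ln(1 - a) / a, for 0 < a < 1.
double doubleHashingHit(double a) {
    return -std::log(1.0 - a) / a;
}

// Chaining, successful search: 1 + a/2, for any a >= 0.
double chainingHit(double a) {
    return 1.0 + a / 2.0;
}
```

The two open-addressing formulas blow up as a approaches 1, which corresponds to the dashes in the table, while the chaining formula stays defined for load factors above 1.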
21 Factors affecting Search Performance
quality of hash function
  how uniform?
  depends on actual data
collision resolution strategy used
load factor of the hash table: n/Tsize
  the lower the load factor, the better the search performance
22 Traversal
Visit each item in the hash table
Open hash table
  O(Tsize) to visit all n items
  Tsize is larger than n
Chained hash table
  O(Tsize + n) to visit all n items
Items are not visited in order of key value
23 Deletions?
search for item to be deleted
chained hash table
  find node and delete it
open hash table
  must mark vacated spot as “deleted”
  “deleted” is different from “never used”
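The reason “deleted” must be kept distinct from “never used” shows up in the search loop: a never-used slot ends the probe sequence, while a deleted slot must be stepped over because later items may have been placed past it. The sketch below stores keys only and assumes linear probing with key % Tsize on non-negative ints; OpenHashSet and SlotState are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

enum class SlotState { NeverUsed, Occupied, Deleted };

class OpenHashSet {
    std::vector<int> keys_;
    std::vector<SlotState> state_;

    // Assumes non-negative keys.
    std::size_t hash(int key) const {
        return static_cast<std::size_t>(key) % keys_.size();
    }

    // Sets idx to the slot holding key and returns true, or returns false.
    bool find(int key, std::size_t& idx) const {
        std::size_t home = hash(key);
        for (std::size_t i = 0; i < keys_.size(); ++i) {
            std::size_t j = (home + i) % keys_.size();
            if (state_[j] == SlotState::NeverUsed) return false;  // search ends
            if (state_[j] == SlotState::Occupied && keys_[j] == key) {
                idx = j;
                return true;
            }
            // Deleted slot: keep probing.
        }
        return false;
    }

public:
    explicit OpenHashSet(std::size_t tsize)
        : keys_(tsize, 0), state_(tsize, SlotState::NeverUsed) {}

    bool contains(int key) const {
        std::size_t idx;
        return find(key, idx);
    }

    // Places key in the first slot on its probe path that is not occupied;
    // deleted slots are reused. Returns false if the table is full.
    bool insert(int key) {
        if (contains(key)) return true;
        std::size_t home = hash(key);
        for (std::size_t i = 0; i < keys_.size(); ++i) {
            std::size_t j = (home + i) % keys_.size();
            if (state_[j] != SlotState::Occupied) {
                keys_[j] = key;
                state_[j] = SlotState::Occupied;
                return true;
            }
        }
        return false;
    }

    // Marks the slot as Deleted instead of resetting it to NeverUsed.
    bool erase(int key) {
        std::size_t idx;
        if (!find(key, idx)) return false;
        state_[idx] = SlotState::Deleted;
        return true;
    }
};
```

For example, with Tsize 7, inserting 3 and then 10 puts 10 in slot 4 after a collision at slot 3. If erasing 3 reset slot 3 to never-used, a later search for 10 would stop at slot 3 and wrongly report it missing.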
24 Hash Table Summary
search speed depends on load factor and quality of hash function
  load factor should be less than 0.75 for open addressing
  load factor can be more than 1 for chaining
items not kept sorted by key
very good for fast access to unordered data with a known upper bound (needed to pick a good Tsize)