Higher Order Tries
Key = Social Security Number, e.g. 441-12-1135: 9 decimal digits.
10-way trie (order 10 trie): each branch node has one pointer field per digit, 0 through 9.
Height <= 10.
Social Security Trie
10-way trie: height <= 10. Search => <= 9 branches on digits plus 1 compare.
100-way trie (key 441-12-1135 taken two digits at a time): height <= 6. Search => <= 5 branches on digits plus 1 compare.
10-way trie => at most 10 cache misses for a search: up to 9 branch levels and 1 element level. 100-way trie => at most 6 misses.
The number of memory accesses is likewise either 10 or 6. It doesn’t depend on the trie degree, because you can compute which part of a branch node is to be retrieved and not retrieve the entire node.
Social Security Trie
10^9-way trie: height <= 2. Search => <= 1 branch (on the whole 9-digit key) plus 1 compare.
For comparison: 10-way trie => at most 10 cache misses for a search (up to 9 branch levels and 1 element level); 100-way trie => at most 6 misses.
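The height bounds quoted for the 10-way, 100-way, and 10^9-way tries all follow from one piece of arithmetic: the number of branch levels is the 9 key digits divided by the digits consumed per level, rounded up, plus one element level. A minimal sketch of that calculation follows; the function name height_bound is made up for illustration.

```python
def height_bound(key_digits: int, digits_per_level: int) -> int:
    """Upper bound on trie height: branch levels plus one element level."""
    branch_levels = -(-key_digits // digits_per_level)  # ceiling division
    return branch_levels + 1


# Order 10, 100, and 10^9 tries consume 1, 2, and 9 digits per branch.
for digits_per_level in (1, 2, 9):
    order = 10 ** digits_per_level
    print(f"order {order}: height <= {height_bound(9, digits_per_level)}")
```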
Memory Accesses
During a search, we can compute the needed field of a branch node and access only that field.
#memory accesses = 1 per encountered branch node + those needed for an element node.
Example: root branch node at address T, key = 46…, 4 bytes per pointer. M[T+16] gets you to the start of the child node (digit 4); M[M[T+16]+24] gets you to the needed grandchild (digit 6), and so on.
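The address arithmetic M[T+16], M[M[T+16]+24] can be mimicked with a toy memory. This is only a sketch: memory is modeled as a Python dict from byte address to stored pointer, the addresses 1000, 2000, and 3000 are made up, and follow_digits is a hypothetical helper name. The 4-byte pointer size is the slide's assumption.

```python
POINTER_BYTES = 4  # the slide's assumption


def follow_digits(memory, root_addr, digits):
    """Follow one pointer per digit, reading a single field per branch node."""
    addr = root_addr
    accesses = 0
    for digit in digits:
        addr = memory[addr + POINTER_BYTES * digit]  # M[addr + 4*digit]
        accesses += 1
    return addr, accesses


# Toy memory: only the two fields touched for a key starting 46... are filled in.
T, CHILD, GRANDCHILD = 1000, 2000, 3000  # made-up addresses
memory = {T + 16: CHILD, CHILD + 24: GRANDCHILD}

node, accesses = follow_digits(memory, T, [4, 6])
print(node, accesses)
```

Each loop iteration reads exactly one pointer field, which is why the access count depends on the number of branch levels and not on the node size.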
Binary Search Trees
Red-black tree: height <= 2 log_2 10^9 ~ 60. Search => up to 60 compares of 9-digit numbers and up to 60 memory accesses.
AVL tree: height <= 1.44 log_2 10^9 ~ 40. Search => up to 40 compares of 9-digit numbers and up to 40 memory accesses.
Best binary tree: height = log_2 10^9 ~ 30.
A 4-byte unsigned int can go up to 2^32 - 1 > 10^9 and so can hold an SS#. But if we add a digit to an SS#, we would need 2 unsigned ints.
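The binary tree bounds can be checked numerically. The sketch below just evaluates the three formulas for n = 10^9. Evaluated exactly, the AVL bound comes out closer to 43, which the slide rounds to about 40. The last two lines check the remark about 4-byte unsigned ints; the variable names are made up for illustration.

```python
import math

n = 10 ** 9
lg = math.log2(n)

bounds = {
    "red-black": 2 * lg,
    "AVL": 1.44 * lg,
    "best binary tree": lg,
}
for name, bound in bounds.items():
    print(f"{name}: height bound about {bound:.1f}")

# A 9-digit key fits in 32 bits; a 10-digit key does not.
fits_in_uint32 = 999_999_999 < 2 ** 32
ten_digits_fit = 9_999_999_999 < 2 ** 32
```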
Higher Order Search Trees
[Figure: a search tree node holding the keys 10, 30, 50.]
Height can be < log_2 n.
#nodes accessed is reduced, but cache misses per node increase as node size increases.
Worst-case #compares remains >= log_2 n.
Compressed Social Security Trie
Branch node structure: a #ptr field, one pointer field per digit (0 through 9), and a char# field.
char# = character/digit used for branching. Equivalent to the bit# field of a compressed binary trie.
#ptr = # of nonnull pointers in the node. #ptr > 1 for every branch node.
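Insert
Insert 012345678: the trie is the single element node 012345678.
Insert 015234567: the keys first differ at character 3, so the root becomes a branch node with char# = 3; pointer 2 -> 012345678, pointer 5 -> 015234567.
Null pointer fields not shown.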
Insert
[Figure: root branch node (char# 3) with pointer 2 -> 012345678 and pointer 5 -> 015234567.]
Insert 015231671.
Insert
[Figure: the root's pointer 5 now leads to a new branch node with char# 6; its pointer 1 -> 015231671 and pointer 4 -> 015234567.]
Insert 079864231.
The search for the insert key falls off the root. Find any key in the subtree you fall off of, and find the first character where this key and the insert key differ. In the example, this is character/digit 2.
To find a key you need to follow pointers to an element node. To find a pointer, you need to search within each node.
Insert
[Figure: new root branch node with char# 2; pointer 1 -> the old root (char# 3), pointer 7 -> 079864231.]
Insert 012345618.
Insert
[Figure: pointer 2 of the char# 3 node now leads to a new branch node with char# 8; its pointer 1 -> 012345618 and pointer 7 -> 012345678.]
Insert 011917352.
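Insert
[Figure: the char# 3 node gains pointer 1 -> 011917352. Resulting trie: root (char# 2) with pointer 1 -> char# 3 node and pointer 7 -> 079864231; char# 3 node with pointer 1 -> 011917352, pointer 2 -> char# 8 node (012345618, 012345678), and pointer 5 -> char# 6 node (015231671, 015234567).]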
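The insert walk-through can be sketched in code. This is one possible reading of the slides, not a definitive implementation: element nodes are stored as plain strings, the #ptr field is left implicit, and the names Branch, search, any_key, and insert are made up for illustration. Keys are assumed to be distinct digit strings of equal length, and char# is 1-based as in the example.

```python
class Branch:
    """Branch node: char# plus one pointer per digit (None = null pointer)."""

    def __init__(self, char_no):
        self.char_no = char_no        # 1-based position of the branching digit
        self.child = [None] * 10      # element nodes are stored as plain strings

    def digit_of(self, key):
        return int(key[self.char_no - 1])


def search(root, key):
    node = root
    while isinstance(node, Branch):
        node = node.child[node.digit_of(key)]
    return node == key


def any_key(node):
    """Follow any non-null pointers down to an element node."""
    while isinstance(node, Branch):
        node = next(c for c in node.child if c is not None)
    return node


def insert(root, key):
    """Insert a key and return the (possibly new) root."""
    if root is None:
        return key
    # Pass 1: search until an element node is reached or we fall off a branch node.
    node = root
    while isinstance(node, Branch):
        nxt = node.child[node.digit_of(key)]
        if nxt is None:
            break
        node = nxt
    other = any_key(node)
    if other == key:
        return root                   # duplicate key, nothing to do
    # First character (1-based) where the two keys differ.
    j = 1 + next(i for i in range(len(key)) if key[i] != other[i])
    # Pass 2: find where a branch node on character j belongs.
    parent, node = None, root
    while isinstance(node, Branch) and node.char_no < j:
        parent, node = node, node.child[node.digit_of(key)]
    if isinstance(node, Branch) and node.char_no == j:
        node.child[int(key[j - 1])] = key   # existing node gains a pointer
        return root
    new = Branch(j)
    new.child[int(key[j - 1])] = key
    new.child[int(other[j - 1])] = node
    if parent is None:
        return new
    parent.child[parent.digit_of(key)] = new
    return root


keys = ["012345678", "015234567", "015231671",
        "079864231", "012345618", "011917352"]
root = None
for k in keys:
    root = insert(root, k)
```

Running the six inserts in the slide order produces a root with char# 2 whose pointer 1 leads to a char# 3 node with three non-null pointers, matching the final figure.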
Delete
[Figure: the trie holding 012345678, 012345618, 015234567, 015231671, 079864231, and 011917352.]
Delete 011917352.
Delete
[Figure: 011917352 removed; the char# 3 node keeps pointers 2 and 5.]
Delete 012345678.
Can back up at most one level.
Delete
[Figure: 012345678 removed; the char# 8 node had a single pointer left, so it is replaced by the element node 012345618.]
Delete 015231671.
Can back up at most one level.
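Delete
[Figure: 015231671 removed; the char# 6 node is replaced by the element node 015234567. Remaining trie: root (char# 2) with pointer 1 -> char# 3 node (pointer 2 -> 012345618, pointer 5 -> 015234567) and pointer 7 -> 079864231.]
Can back up at most one level.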
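The delete steps, including the "back up at most one level" rule, can be sketched as follows. This is an illustrative reading of the slides under stated assumptions: element nodes are plain strings, the starting trie is built by hand to match the six-key example, and the names Branch and delete are made up. When a branch node is left with a single pointer, it is spliced out by letting its parent point directly to the remaining child.

```python
class Branch:
    """Branch node: char# plus one pointer per digit (None = null pointer)."""

    def __init__(self, char_no, children):
        self.char_no = char_no
        self.child = [None] * 10
        for digit, subtree in children.items():
            self.child[digit] = subtree

    def digit_of(self, key):
        return int(key[self.char_no - 1])


def delete(root, key):
    """Delete a key and return the (possibly new) root."""
    grand, parent, node = None, None, root
    while isinstance(node, Branch):
        grand, parent = parent, node
        node = node.child[node.digit_of(key)]
    if node != key:
        return root                   # key not present
    if parent is None:
        return None                   # the trie held only this key
    parent.child[parent.digit_of(key)] = None
    remaining = [c for c in parent.child if c is not None]
    if len(remaining) > 1:
        return root                   # parent still has #ptr > 1
    # One pointer left: back up one level and splice the parent out.
    if grand is None:
        return remaining[0]
    grand.child[grand.digit_of(key)] = remaining[0]
    return root


# The six-key trie from the example, built by hand.
root = Branch(2, {
    1: Branch(3, {
        1: "011917352",
        2: Branch(8, {1: "012345618", 7: "012345678"}),
        5: Branch(6, {1: "015231671", 4: "015234567"}),
    }),
    7: "079864231",
})

for k in ["011917352", "012345678", "015231671"]:
    root = delete(root, k)
```

No node above the grandparent ever changes, because every surviving branch node on the path still has at least two non-null pointers.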
Variable Length Keys
[Figure: trie holding 012345678, 015234567, and 015231671.]
Insert 0123.
The problem arises only when one key is a (proper) prefix of another.
Alternative is one trie for each length. Works in some applications.
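Variable Length Keys
Insert 0123.
Append an end of key character (#) to every key, so that no key is a proper prefix of another.
[Figure: trie holding 012345678, 015234567, 015231671, and 0123. End of key character (#) not shown.]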
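The effect of the end of key character can be seen in a small sketch. For brevity it uses an uncompressed trie built from nested Python dicts, not the compressed branch nodes of the earlier slides; the names END, insert, and contains are made up for illustration.

```python
END = "#"  # end of key character, assumed not to occur inside any key


def insert(trie, key):
    node = trie
    for ch in key + END:
        node = node.setdefault(ch, {})


def contains(trie, key):
    node = trie
    for ch in key + END:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = {}
for k in ["012345678", "015234567", "015231671", "0123"]:
    insert(trie, k)
```

Because 0123 is stored as 0123#, it is distinguishable from the prefix of 012345678, while a string such as 01234 that was never inserted is not reported as present.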
Variable Length Keys
One trie per length.
Array of tries: T[i] is the trie for keys of length i (T[1] through T[10] in the figure).
Hash table of tries: uses a hash table instead of an array to keep track of the tries of different length keys. Useful when the lengths for which you have keys are widely spread out (e.g., 90, 250, 1000, 9898, …).
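A hash table of tries might look like the sketch below. It relies on a Python dict (which is a hash table) to map each key length to its own trie, and it again uses simple nested-dict tries as stand-ins for the real node structure. The class name TriesByLength and its methods are hypothetical.

```python
class TriesByLength:
    """One trie per key length, located through a hash table."""

    def __init__(self):
        self.tries = {}  # key length -> trie for keys of exactly that length

    def insert(self, key):
        node = self.tries.setdefault(len(key), {})
        for ch in key:
            node = node.setdefault(ch, {})

    def contains(self, key):
        node = self.tries.get(len(key))
        if node is None:
            return False              # no key of this length was ever inserted
        for ch in key:
            node = node.get(ch)
            if node is None:
                return False
        return True


store = TriesByLength()
for k in ["012345678", "015234567", "015231671", "0123"]:
    store.insert(k)
```

Within one trie all keys have the same length, so no key can be a proper prefix of another and no end of key character is needed.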
Tries With Edge Information
Add a new field (element) to each branch node. The new field points to any one of the element nodes in the subtree.
Use this pointer on the way down to figure out skipped-over characters.
Example
[Figure: the trie holding 012345678, 015234567, 015231671, and 0123#, with each branch node's element field shown in blue.]
The element field is also useful for insert but complicates delete.
Etc.
Expected height of an order m trie is ~ log_m n.
Limit height to h (say 6). Level h branch nodes point to buckets that employ some other search structure for all keys in the subtrie.
Etc.
Switch from the trie scheme to a simple array when the number of pairs in a subtrie becomes <= s (say s = 6).
Expected # of branch nodes for an order m trie when n is large and m and s are small is n/(s ln m).
Sample digits from right to left (instead of from left to right), or in an order chosen by a pseudorandom number generator, so as to reduce trie height.
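To get a feel for the two estimates on these slides, expected height ~ log_m n and expected # of branch nodes n/(s ln m), the sketch below simply evaluates the formulas. The function names and the sample values of n, m, and s are chosen only for illustration.

```python
import math


def expected_height(n, m):
    """Approximate expected height of an order m trie with n keys: log_m n."""
    return math.log(n) / math.log(m)


def expected_branch_nodes(n, m, s):
    """Approximate expected number of branch nodes: n / (s ln m)."""
    return n / (s * math.log(m))


print(expected_height(10 ** 9, 10))
print(expected_branch_nodes(10 ** 6, 10, 6))
```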
Web Resource
See Web writeup for additional applications of tries: prefix search; automatic command (or phone number or URL) completion; LZW compression.
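As a rough illustration of prefix search and automatic completion, the sketch below walks down a nested-dict trie to the node for the prefix and then collects every key stored beneath it. It is not taken from the Web writeup; the names END, build, and complete are made up, and the keys are the ones used in the earlier examples.

```python
END = "#"  # end of key marker, assumed not to occur inside any key


def build(keys):
    trie = {}
    for key in keys:
        node = trie
        for ch in key + END:
            node = node.setdefault(ch, {})
    return trie


def complete(trie, prefix):
    """Return all stored keys that start with the given prefix, sorted."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]
    found = []
    stack = [(node, prefix)]
    while stack:
        current, text = stack.pop()
        for ch, subtree in current.items():
            if ch == END:
                found.append(text)
            else:
                stack.append((subtree, text + ch))
    return sorted(found)


trie = build(["012345678", "015234567", "015231671", "0123"])
matches = complete(trie, "015")
```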
Web Resource
See Web writeup for alternative node structures for tries: array, chain, binary search tree, hash table.