Data Structures & Algorithms Radix Search Richard Newman based on slides by S. Sahni and book by R. Sedgewick.

1 Data Structures & Algorithms Radix Search Richard Newman based on slides by S. Sahni and book by R. Sedgewick

2 Radix-based Keys: A key has multiple parts. Each part is an element of some set (characters, numerals). Key parts can be accessed individually (e.g., string s[i]). The size of the set is the radix.
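As a rough illustration of accessing key parts, the sketch below extracts single bits (radix 2) and single characters (radix 256). The helper names `code`, `bit`, and `char_digit` are hypothetical, and the 5-bit letter encoding (A = 1, B = 2, ...) is an assumption chosen to agree with the key table used on the later slides.

```python
def code(letter: str) -> int:
    """Hypothetical encoding: position in the alphabet (A=1 ... Z=26), fits in 5 bits."""
    return ord(letter) - ord("A") + 1


def bit(key: int, i: int, width: int = 5) -> int:
    """Return bit i of key, counting from the most significant of `width` bits."""
    return (key >> (width - 1 - i)) & 1


def char_digit(s: str, i: int) -> int:
    """Return the i-th character of s as a digit in radix 256 (assumes 8-bit characters)."""
    return ord(s[i])


# The letter S is 19, which is 10011 in five bits.
s_bits = [bit(code("S"), i) for i in range(5)]
```

With radix 2 the "parts" are bits; with radix 256 they are bytes. The search structures that follow only need some such digit accessor.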

3 Advantages of Radix-based Search: good worst-case performance; simpler than balanced trees, etc.; fast access to data; an easy way to handle variable-length keys; saves space (part of the key is encoded in the structure).

4 Disadvantages of Radix-based Search: may be space-inefficient; performance depends on efficient access to the bytes of the keys; keys must be distinct, or duplicate keys must be handled some other way.

5 Digital Search Trees: Similar to binary search trees. The difference is that we use the bits of the key to determine which subtree to search. Path in tree = prefix of key.

6 Digital Search Trees: Insert A-S-E-R-C-H-I-N-G. Key representations: A 00001, S 10011, E 00101, R 10010, C 00011, H 01000, I 01001, N 01110, G 00111. [Figure: DST after the insertions, with A at the root and links labeled 0 and 1.] Note that the binary tree is not sorted in the BST sense.

7 Digital Search Trees Prop 15.1: A search or insertion in a DST takes about lg N comparisons on average, and about 2 lg N comparisons in the worst case, in a tree built from N random keys. The number of comparisons is never more than the number of bits in the search key.
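The DST idea can be sketched as follows: every node holds a key, and bit d of the search key picks the subtree at depth d. This is a minimal sketch, not the slides' own code; `Node`, `dst_insert`, and `dst_search` are hypothetical names, and keys are assumed to be distinct 5-bit integers using the A = 1 ... Z = 26 encoding from the key table.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.child = [None, None]  # child[0] for bit 0, child[1] for bit 1


def bit(key, i, width=5):
    """Bit i of key, counted from the most significant of `width` bits."""
    return (key >> (width - 1 - i)) & 1


def dst_insert(root, key, width=5):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return Node(key)
    node, d = root, 0
    while node.key != key:
        b = bit(key, d, width)
        if node.child[b] is None:
            node.child[b] = Node(key)  # first empty link on the bit path
            break
        node, d = node.child[b], d + 1
    return root


def dst_search(root, key, width=5):
    """Return True if key is in the tree: one key comparison per level."""
    node, d = root, 0
    while node is not None:
        if node.key == key:
            return True
        node, d = node.child[bit(key, d, width)], d + 1
    return False


root = None
for letter in "ASERCHING":
    root = dst_insert(root, ord(letter) - ord("A") + 1)
```

Because a node at depth d shares its first d bits with the search key, the walk never goes deeper than the key width, which is the bound stated in Prop 15.1.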

8 Tries: Use the bits of the key to guide the search, like a DST, but keep the keys in order, like a BST. This allows recursive sort, etc. Pronounced “try-ee” or “try”. Keys are kept at the leaves of a binary tree.

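9 Tries Defn. 15.1: A trie is a binary tree that has keys associated with each of its leaves, defined recursively as follows: a trie for an empty set of keys is a null link; a trie for a single key is a leaf containing that key; a trie for more than one key is an internal node whose left link refers to the trie for the keys whose leading bit is 0 and whose right link refers to the trie for the keys whose leading bit is 1, with the leading bit removed when building the subtries.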
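The recursive definition translates almost directly into code. In this sketch (an illustration under stated assumptions, not the slides' implementation) a null link is `None`, a leaf is the integer key itself, and an internal node is a `(left, right)` tuple. Instead of physically removing the leading bit, the hypothetical `build_trie` passes along the current bit position `d`. Keys are assumed to be distinct and `width` bits wide.

```python
def bit(key, i, width=5):
    """Bit i of key, counted from the most significant of `width` bits."""
    return (key >> (width - 1 - i)) & 1


def build_trie(keys, d=0, width=5):
    """Build a binary trie for a collection of distinct width-bit keys."""
    keys = list(keys)
    if not keys:
        return None            # empty set: null link
    if len(keys) == 1:
        return keys[0]         # single key: leaf
    zeros = [k for k in keys if bit(k, d, width) == 0]
    ones = [k for k in keys if bit(k, d, width) == 1]
    # more than one key: internal node, split on the current leading bit
    return (build_trie(zeros, d + 1, width), build_trie(ones, d + 1, width))


# A = 00001 and S = 10011 differ in the first bit: one internal node suffices.
trie_as = build_trie([1, 19])
# A = 00001 and E = 00101 share the prefix 00, which forces one-way branching.
trie_ae = build_trie([1, 5])
```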

10 Tries: Insert A-S-E-R-C-H-I-N-G (key representations as on slide 6). [Figure: trie with leaf A, then trie with leaves A and S.] Construct the tree down to the point where the prefixes differ.

11 Tries: Insert A-S-E-R-C-H-I-N-G (key representations as on slide 6). [Figure: trie with leaves A, E, R, S.] Construct the tree down to the point where the prefixes differ.

12 Tries: Insert A-S-E-R-C-H-I-N-G (key representations as on slide 6). [Figure: trie with leaves A, C, E, H, R, S.]

13 Tries: Insert A-S-E-R-C-H-I-N-G (key representations as on slide 6). [Figure: trie with leaves A, C, E, H, I, R, S.]

14 Tries Prop. 15.2: The structure of a trie is independent of the key insertion order; there is one unique trie for any given set of distinct keys. Prop. 15.3: Insertion or search for a random key in a trie built from N random keys takes about lg N bit comparisons on average. The worst-case number of bit comparisons is bounded only by the number of bits in the key.
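Prop. 15.2 can be checked on the running example with a small insertion routine. This is a sketch under assumptions: leaves are integer keys, internal nodes are two-element lists, `None` is a null link, and the names `insert`, `split`, and `search` are hypothetical. Keys are assumed distinct and 5 bits wide.

```python
def bit(key, i, width=5):
    """Bit i of key, counted from the most significant of `width` bits."""
    return (key >> (width - 1 - i)) & 1


def split(a, b, d, width):
    """Build internal nodes from depth d until keys a and b part ways."""
    node = [None, None]
    ba, bb = bit(a, d, width), bit(b, d, width)
    if ba == bb:
        node[ba] = split(a, b, d + 1, width)  # shared prefix: one-way branch
    else:
        node[ba], node[bb] = a, b
    return node


def insert(node, key, d=0, width=5):
    """Insert key below node (at depth d) and return the new subtrie."""
    if node is None:
        return key
    if not isinstance(node, list):            # reached a leaf
        return node if node == key else split(node, key, d, width)
    b = bit(key, d, width)
    node[b] = insert(node[b], key, d + 1, width)
    return node


def search(node, key, width=5):
    """Follow bits to a leaf (or null link), then do one full key comparison."""
    d = 0
    while isinstance(node, list):
        node, d = node[bit(key, d, width)], d + 1
    return node == key


def build(letters):
    root = None
    for ch in letters:
        root = insert(root, ord(ch) - ord("A") + 1)
    return root


# Three different insertion orders of the same key set.
tries = [build("ASERCHING"), build("GNIHCRESA"), build(sorted("ASERCHING"))]
```

Python compares nested lists structurally, so `tries[0] == tries[1] == tries[2]` tests whether the three insertion orders produced the same shape.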

15 Tries: Annoying feature of tries: one-way branching when keys have a common prefix. Prop. 15.4: A trie built from N random w-bit keys has about N/ln 2 nodes on average (about 1.44 N).
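A quick experiment can make the 1.44 N figure plausible. The sketch below counts internal nodes only (an assumption about what Prop. 15.4 counts) for a trie over random distinct keys, without materializing the trie. The function names, the 30-bit width, and the seed are arbitrary choices for illustration.

```python
import math
import random


def bit(key, i, width):
    """Bit i of key, counted from the most significant of `width` bits."""
    return (key >> (width - 1 - i)) & 1


def count_internal(keys, d, width):
    """Number of internal nodes in the binary trie for distinct width-bit keys."""
    if len(keys) <= 1:
        return 0                      # null link or leaf: no internal node
    zeros = [k for k in keys if bit(k, d, width) == 0]
    ones = [k for k in keys if bit(k, d, width) == 1]
    return 1 + count_internal(zeros, d + 1, width) + count_internal(ones, d + 1, width)


def node_ratio(n, width=30, seed=1):
    """Internal nodes per key for n random distinct keys."""
    rng = random.Random(seed)
    keys = rng.sample(range(2 ** width), n)
    return count_internal(keys, 0, width) / n


ratio = node_ratio(2000)
predicted = 1 / math.log(2)   # about 1.4427
```

Comparing `ratio` with `predicted` for a few values of `n` and `seed` gives a feel for how much extra structure one-way branching costs on random keys; keys with long common prefixes can do much worse.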

16 Patricia Tries: Two annoying features of tries: one-way branching when keys have a common prefix, and two different types of nodes. Patricia tries fix both of these. Patricia = Practical Algorithm To Retrieve Information Coded In Alphanumeric.

17 Patricia Tries: Avoid one-way branching: keep at each node the index of the next bit to test, and skip over the common prefix! Avoid two types of nodes: store data in internal nodes, and replace external links with back links.
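The two ideas above (a bit index per node, and back links instead of external nodes) can be sketched as follows. This is an iterative adaptation of the usual textbook scheme, not code taken from the slides. Assumptions: keys are distinct nonzero 5-bit integers, a dummy head node with key 0 and index -1 anchors the structure, and the names `Patricia`, `_descend`, `contains`, and `insert` are hypothetical.

```python
WIDTH = 5


def bit(key, i):
    """Bit i of key, counted from the most significant of WIDTH bits."""
    return (key >> (WIDTH - 1 - i)) & 1


class Node:
    def __init__(self, key, index):
        self.key = key
        self.index = index               # which bit this node tests
        self.left = self.right = self    # links start as back links to self


class Patricia:
    def __init__(self):
        self.head = Node(0, -1)          # dummy head; keys must be nonzero

    def _descend(self, key):
        """Follow links until one points upward (bit index stops increasing)."""
        parent, node = self.head, self.head.left
        while node.index > parent.index:
            parent = node
            node = node.right if bit(key, node.index) else node.left
        return node

    def contains(self, key):
        node = self._descend(key)
        return node is not self.head and node.key == key

    def insert(self, key):
        found = self._descend(key)
        if found.key == key:
            return
        d = 0                            # first bit where key and found.key differ
        while bit(key, d) == bit(found.key, d):
            d += 1
        # Walk down again, stopping where a node testing bit d belongs.
        parent, node = self.head, self.head.left
        while parent.index < node.index < d:
            parent = node
            node = node.right if bit(key, node.index) else node.left
        new = Node(key, d)
        if bit(key, d):
            new.left, new.right = node, new   # right link is a back link to self
        else:
            new.left, new.right = new, node
        if parent is self.head or bit(key, parent.index) == 0:
            parent.left = new
        else:
            parent.right = new


trie = Patricia()
for letter in "ASERCHING":
    trie.insert(ord(letter) - ord("A") + 1)
```

A search inspects only the bits named by the nodes on its path and finishes with a single full key comparison, which is why the skipped common prefixes cost nothing.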

18 Patricia Tries: [Figure: Patricia trie with keys S, R, H, E, C, A stored in the nodes; each node is labeled with the index (0 to 4) of the bit it tests.] Key representations as on slide 6.

19 Patricia Tries: [Figure: the same Patricia trie as on slide 18.]

20 Patricia Tries Prop 15.5: Insertion or search in a patricia trie built from N random bitstrings takes about lg N bit comparisons on average, and about 2 lg N in the worst case, but never more than the length of the key.

21 Map: Radix search; Digital Search Trees; Tries; Patricia Tries; Multiway tries and TSTs; Text string algorithms.

22 Multiway Tries: Like radix sort, we can benefit from comparing more than one bit at a time. Compare r bits at a time and speed up the search by a factor of r. What could possibly be bad? The number of links per node is now R = 2^r, which can waste a lot of space!
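The trade-off is easy to tabulate. The following sketch (the function name `multiway_cost` and the 32-bit key width are assumptions for illustration) computes the links per node and the maximum number of levels for a few values of r.

```python
def multiway_cost(width: int, r: int):
    """Return (links per node, maximum levels) when testing r bits at a time."""
    radix = 2 ** r                 # R = 2^r links in every internal node
    levels = -(-width // r)        # ceiling of width / r digits per key
    return radix, levels


# For 32-bit keys: fewer levels, but exponentially more links per node.
table = {r: multiway_cost(32, r) for r in (1, 2, 4, 8)}
```

Going from r = 1 to r = 8 cuts the maximum depth from 32 to 4 levels but raises the link count per node from 2 to 256, most of which may be null.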

23 Multiway Tries: The structure is (almost) the same as for binary tries, except that there are R branches. Search: start at the root with the leftmost digit; follow the ith link if the next R-ary digit is i. If the link is null, the search is a miss. If we reach a leaf, it contains the only key with a prefix matching the path to it, so compare.

24 Existence Tries: Only keys, no records. Operations: insert and search. Defn. 15.2: The existence trie for a set of keys is: for the empty set, a null link; for a non-empty set, an internal node with links for each possible digit to tries built with the leading digit omitted.

25 Existence Tries: It is convenient to return null on a miss and a dummy record on a hit. It is also convenient to have no duplicate keys and no key that is a prefix of another key: either keys have fixed length, or a termination character with value NULLdigit is used only as a sentinel.

26 Existence Tries: No need to store any data; all keys are captured in the trie structure. If we reach NULLdigit at the same time as we run out of key digits, the search is a hit; otherwise it is a miss. Insert: search until we find a null link, then add nodes for each of the remaining digits in the key.
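A minimal existence-trie sketch along these lines is shown below. The assumptions are mine: keys are lowercase words, R = 27 (26 letters plus a terminator digit 0 playing the role of NULLdigit), each node is a plain list of R links, and the root node is always allocated rather than starting as a null link. All function names are hypothetical.

```python
R = 27    # 26 lowercase letters plus the terminator digit (assumption)
END = 0   # plays the role of NULLdigit


def digit(word, i):
    """i-th digit of word; the position just past the end yields the terminator."""
    return END if i == len(word) else ord(word[i]) - ord("a") + 1


def new_node():
    return [None] * R


def insert(root, word):
    """Follow existing links, adding a node for each remaining digit."""
    node = root
    for i in range(len(word) + 1):      # +1 to include the terminator
        d = digit(word, i)
        if node[d] is None:
            node[d] = new_node()
        node = node[d]


def search(root, word):
    """Hit only if every digit, including the terminator, has a non-null link."""
    node = root
    for i in range(len(word) + 1):
        node = node[digit(word, i)]
        if node is None:
            return False
    return True


root = new_node()
for w in "the time is now for".split():
    insert(root, w)
```

No key is stored anywhere: `search(root, "the")` succeeds purely because the path t, h, e, terminator exists, while `"th"` fails for lack of a terminator link after h.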

27 Existence Tries: [Figure: existence trie for the words "the time is now for", one link per letter.]

28 Multi-way Tries: R-ary branching. Keys are stored at the leaves. The path to a leaf defines a prefix of the key stored at that leaf. Only build the tree downward until the prefixes become distinct.

29 Multi-way Tries Defn. 15.3: The multiway trie for a set of keys associated with leaves is: for the empty set, a null link; for a singleton set, a leaf containing the key; for a larger set, an internal node with links for each possible digit to tries built with the leading digit omitted.
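Defn. 15.3 can be sketched with the same approach as a binary trie, only with R links per node. Assumptions for this illustration: keys are distinct lowercase words, R = 27 with digit 0 as a terminator so that no key is a prefix of another, a leaf is the string itself, an internal node is a list of R links, and the function names are hypothetical.

```python
R = 27  # 26 lowercase letters plus a terminator digit 0 (assumption)


def digit(word, i):
    """i-th digit of word; positions past the end yield the terminator 0."""
    return 0 if i >= len(word) else ord(word[i]) - ord("a") + 1


def split(a, b, d):
    """Build internal nodes from depth d until words a and b diverge."""
    node = [None] * R
    da, db = digit(a, d), digit(b, d)
    if da == db:
        node[da] = split(a, b, d + 1)
    else:
        node[da], node[db] = a, b
    return node


def insert(node, key, d=0):
    """Insert key below node (at depth d) and return the new subtrie."""
    if node is None:
        return key                           # null link becomes a leaf
    if isinstance(node, str):                # leaf: split if keys differ
        return node if node == key else split(node, key, d)
    i = digit(key, d)
    node[i] = insert(node[i], key, d + 1)
    return node


def search(node, key):
    """Follow digits to a leaf or null link, then compare the whole key."""
    d = 0
    while isinstance(node, list):
        node, d = node[digit(key, d)], d + 1
    return node == key


root = None
for w in "the time is now for".split():
    root = insert(root, w)
```

Unlike the existence trie, the tree is built only as deep as needed to tell keys apart: "is", "now", and "for" sit directly below the root, while "the" and "time" share one extra internal node for the common prefix t.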

30 Multi-way Tries Def

31 Summary: Radix search; Digital Search Trees; Tries; Patricia Tries; Multiway tries and TSTs; Text string algorithms.

