Contents What is a trie? When to use tries

Name: Contents What is a trie? When to use tries
Uploaded: 2017-08-21T17:00:37+00:00
Duration: PTM11S47
Channel: Blanche Wilkins
Description: Contents What is a trie? When to use tries

Honors Track: Competitive Programming & Problem Solving Tries
Frank Maurix

Contents
  What is a trie?
  When to use tries
  Implementation and some operations
  Alternatives for implementation
  Compression
  Suffix tree

What is a trie?
  Data structure
  Digital tree, radix tree, prefix tree
  Stores set of strings (dictionary)
  Characters as nodes
  Position reflects prefix represented

When to use tries
  Pros
    O(L) and O(L*A) operations
    Form of radix sort
    x for all words with same prefix
    Suffix tree
  Cons
    O(N*A) space complexity
    Horrible for floating point numbers
    Not in the standard library
  N = Number of nodes
  L = Length of the word
  A = Size of the alphabet

Implementation
  Keep track of root
  Array of children
  Store the number of children
  Alphabet = A, B, …, Z
    Uppercase only
    Change character into value 0, …, 25
    int c = someChar - 'A'; // - 'a' for lowercase
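The character-to-index conversion can be sketched as a small helper. This is only an illustration under the slide's assumption of an uppercase A to Z alphabet; the names AlphabetIndex, toIndex and toChar are hypothetical and do not appear in the original slides.

```java
final class AlphabetIndex {
    static final int ALPHABET_SIZE = 26; // assumption: uppercase A..Z only

    // Maps 'A'..'Z' to 0..25 and rejects anything outside the assumed alphabet.
    static int toIndex(char c) {
        int index = c - 'A';
        if (index < 0 || index >= ALPHABET_SIZE) {
            throw new IllegalArgumentException("Not an uppercase letter: " + c);
        }
        return index;
    }

    // Inverse mapping: 0..25 back to 'A'..'Z'.
    static char toChar(int index) {
        return (char) ('A' + index);
    }
}
```

The range check is an addition for safety; the slide's one-liner would silently produce an out-of-range array index for any character outside the alphabet.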

Implementation
    import java.util.*;

    public class ScaryProblem {
        TrieNode root; // Root of the trie

        void solve() {
            root = new TrieNode(null, false, null);
            // here is the place where you should do some magic with tries
        }

        public static void main(String[] args) {
            new ScaryProblem().solve();
        }
    }

    class TrieNode {
        TrieNode[] children = new TrieNode[26];
        Character ch;         // last char of prefix, null for root
        TrieNode parent;      // pointer to parent
        boolean inDictionary; // Prefix of this node in the dictionary?
        int nOC = 0;          // Number of children

        TrieNode(Character ch, boolean used, TrieNode newParent) {...}
    }

Operations Searching Insertion Word deletion Prefix deletion Retrieving in sorted order

Searching
    char[] word = {'N', 'A', 'S', 'A'};
    root.search(word, 0);
    // Alternative: word as String and use word.charAt(index)

    class TrieNode {
        TrieNode search(char[] word, int index) { // index should be 0 on initial call
            if (index == word.length - 1) {
                // Node found or final node doesn't exist
                return children[word[index] - 'A'];
            } else if (children[word[index] - 'A'] == null) {
                // Node doesn't exist
                return null;
            } else {
                // Keep searching
                return children[word[index] - 'A'].search(word, index + 1);
            }
        }
    }
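The search above returns the node at the end of the path, whether or not that prefix is itself a dictionary word. A minimal sketch of how a caller might tell the two cases apart follows. It swaps the slide's recursion and char[] for a loop over a String, and the names SearchSketch, findNode, containsWord and containsPrefix are hypothetical.

```java
final class SearchSketch {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: uppercase A..Z only
        boolean inDictionary;
    }

    final Node root = new Node();

    // Minimal insert so the example is self-contained.
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            int c = word.charAt(i) - 'A';
            if (node.children[c] == null) {
                node.children[c] = new Node();
            }
            node = node.children[c];
        }
        node.inDictionary = true;
    }

    // Follows the prefix from the root; returns null if the path breaks off.
    Node findNode(String prefix) {
        Node node = root;
        for (int i = 0; i < prefix.length() && node != null; i++) {
            node = node.children[prefix.charAt(i) - 'A'];
        }
        return node;
    }

    // True only if the exact word was inserted.
    boolean containsWord(String word) {
        Node node = findNode(word);
        return node != null && node.inDictionary;
    }

    // True if some inserted word starts with the prefix.
    boolean containsPrefix(String prefix) {
        return findNode(prefix) != null;
    }
}
```

With only "NASA" inserted, containsWord("NAS") is false while containsPrefix("NAS") is true, which is the distinction the inDictionary flag exists for.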

Insertion
  Inserting a word may require creating new nodes, but it does not have to
  Example 1: inserting “SPACES”
  Example 2: inserting “NSA”

Insertion
    char[] word = {'N', 'S', 'A'};
    root.insert(word, 0);

    class TrieNode {
        void insert(char[] word, int index) { // index 0 on initial call
            if (children[word[index] - 'A'] == null) {
                // Next node doesn't exist
                nOC++;
                if (index == word.length - 1) {
                    children[word[index] - 'A'] = new TrieNode(word[index], true, this);
                } else {
                    children[word[index] - 'A'] = new TrieNode(word[index], false, this);
                    children[word[index] - 'A'].insert(word, index + 1);
                }
            } else if (index == word.length - 1) {
                // Next node exists and the word ends there
                children[word[index] - 'A'].inDictionary = true;
            } else {
                // Next node exists, keep inserting
                children[word[index] - 'A'].insert(word, index + 1);
            }
        }
    }
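To make the point that insertion only sometimes creates nodes concrete, here is a loop-based sketch that reports how many nodes each insertion had to create. It is an illustration, not the slide's code: InsertSketch, numberOfChildren and the returned counter are hypothetical, and the example words are chosen here because the original figures did not survive.

```java
final class InsertSketch {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: uppercase A..Z only
        boolean inDictionary;
        int numberOfChildren;
    }

    final Node root = new Node();

    // Inserts the word and returns how many new nodes had to be created.
    int insert(String word) {
        Node node = root;
        int created = 0;
        for (int i = 0; i < word.length(); i++) {
            int c = word.charAt(i) - 'A';
            if (node.children[c] == null) {
                node.children[c] = new Node();
                node.numberOfChildren++;
                created++;
            }
            node = node.children[c];
        }
        // The last node on the path now represents a complete word.
        node.inDictionary = true;
        return created;
    }
}
```

Starting from an empty trie, inserting "SPACE" creates five nodes, a following "SPACES" creates one, and a following "SPA" creates none because its whole path already exists and only the flag changes.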

Deletion
  Search for corresponding node
  Set inDictionary for corresponding node to false
  If the node isn't a leaf, you're done
  Else, one or more nodes can be removed
  Removing the nodes
    Don't delete the root
    If the current node is a leaf and not in the dictionary
      Remove the node
      Recursive call to the parent and repeat

Deletion Example 1: deleting “SPACES”

Deletion Example 2: deleting “NSA”

Deletion
    void removeWord(char[] word) {
        TrieNode result = root.search(word, 0);
        if (result == null) {
            // word not in trie
        } else if (result.nOC == 0) {
            // node is a leaf
            result.trieCleanup();
        } else {
            // node isn't a leaf
            result.inDictionary = false;
        }
    }

    class TrieNode {
        void trieCleanup() { // Delete current node & check if parent should be deleted
            if (parent != null) { // Never delete the root
                parent.children[ch - 'A'] = null;
                parent.nOC--;
                if (parent.nOC == 0 && !parent.inDictionary) {
                    parent.trieCleanup();
                }
            }
        }
    }
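The deletion steps can also be written as a single loop that walks up through the parent pointers. The sketch below is a self-contained variant under the same uppercase-only assumption; DeleteSketch, the index field and the returned count of removed nodes are hypothetical additions for illustration.

```java
final class DeleteSketch {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: uppercase A..Z only
        final Node parent;                    // null for the root
        final int index;                      // position in parent.children, -1 for the root
        boolean inDictionary;
        int numberOfChildren;

        Node(Node parent, int index) {
            this.parent = parent;
            this.index = index;
        }
    }

    final Node root = new Node(null, -1);

    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            int c = word.charAt(i) - 'A';
            if (node.children[c] == null) {
                node.children[c] = new Node(node, c);
                node.numberOfChildren++;
            }
            node = node.children[c];
        }
        node.inDictionary = true;
    }

    // Follows the prefix from the root; returns null if the path breaks off.
    Node find(String prefix) {
        Node node = root;
        for (int i = 0; i < prefix.length() && node != null; i++) {
            node = node.children[prefix.charAt(i) - 'A'];
        }
        return node;
    }

    // Removes the word and returns how many nodes were deleted (0 if it was not a word).
    int removeWord(String word) {
        Node node = find(word);
        if (node == null || !node.inDictionary) {
            return 0;
        }
        node.inDictionary = false;
        int removed = 0;
        // Walk up while the current node is a useless leaf; never delete the root.
        while (node != root && node.numberOfChildren == 0 && !node.inDictionary) {
            Node parent = node.parent;
            parent.children[node.index] = null;
            parent.numberOfChildren--;
            node = parent;
            removed++;
        }
        return removed;
    }
}
```

With "SPACE", "SPACES" and "NSA" stored, removing "SPACES" deletes one node because "SPACE" still needs the rest of the path, while removing "NSA" deletes all three of its nodes.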

Prefix deletion Example: deleting all words with prefix ‘SPA’

Prefix deletion
    void removePrefix(char[] word) {
        TrieNode result = root.search(word, 0);
        if (result != null) {
            result.trieCleanup();
        }
    }

Retrieving in alphabetical order
  Pre-order tree traversal
  Only report prefixes in dictionary

    traverse(root);

    void traverse(TrieNode node) {
        if (node.inDictionary) {
            report(node); // Or other fancy stuff
        }
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                traverse(node.children[i]);
            }
        }
    }

  To get prefix without too much extra time:
    When inserting, store prefix only for the node you insert (to stay in O(L)/O(A*L))
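As an alternative to storing the prefix in each inserted node, the traversal can carry the current prefix along in a StringBuilder. This is a sketch of that swapped-in approach, not the slide's method; SortedWordsSketch, wordsInOrder and collect are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class SortedWordsSketch {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: uppercase A..Z only
        boolean inDictionary;
    }

    final Node root = new Node();

    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            int c = word.charAt(i) - 'A';
            if (node.children[c] == null) {
                node.children[c] = new Node();
            }
            node = node.children[c];
        }
        node.inDictionary = true;
    }

    // Returns all stored words in alphabetical order.
    List<String> wordsInOrder() {
        List<String> result = new ArrayList<>();
        collect(root, new StringBuilder(), result);
        return result;
    }

    // Pre-order traversal: report the node first, then visit children from A to Z.
    private void collect(Node node, StringBuilder prefix, List<String> result) {
        if (node.inDictionary) {
            result.add(prefix.toString());
        }
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                prefix.append((char) ('A' + i));
                collect(node.children[i], prefix, result);
                prefix.setLength(prefix.length() - 1); // undo the append before the next child
            }
        }
    }
}
```

Because a node is reported before its children, a word always appears before the longer words that extend it, for example "SPA" before "SPACE".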

Alphabetic successor of a node
  If node isn't a leaf
    Find minimum on the subtree rooted at node
    Stop as soon as you find a word in the dictionary
  Else
    If parent has a node with index higher than index of the current, then find minimum like before on that node
    Else repeat for parent
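The successor rule can be sketched as follows. The code assumes that every leaf is a dictionary word (which holds when deletion cleans up unused nodes) and that each node knows its parent and its index in the parent's array; SuccessorSketch, firstChildFrom, firstWordAtOrBelow and wordOf are hypothetical names.

```java
final class SuccessorSketch {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: uppercase A..Z only
        final Node parent;                    // null for the root
        final int index;                      // position in parent.children, -1 for the root
        boolean inDictionary;

        Node(Node parent, int index) {
            this.parent = parent;
            this.index = index;
        }

        // Lowest-indexed child at position >= from, or null if there is none.
        Node firstChildFrom(int from) {
            for (int i = from; i < 26; i++) {
                if (children[i] != null) {
                    return children[i];
                }
            }
            return null;
        }
    }

    final Node root = new Node(null, -1);

    Node insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            int c = word.charAt(i) - 'A';
            if (node.children[c] == null) {
                node.children[c] = new Node(node, c);
            }
            node = node.children[c];
        }
        node.inDictionary = true;
        return node;
    }

    // Smallest dictionary word at or below node; relies on every leaf being a word.
    static Node firstWordAtOrBelow(Node node) {
        while (node != null && !node.inDictionary) {
            node = node.firstChildFrom(0);
        }
        return node;
    }

    // Next dictionary node in alphabetical order, or null if there is none.
    static Node successor(Node node) {
        Node child = node.firstChildFrom(0);
        if (child != null) {
            return firstWordAtOrBelow(child); // not a leaf: minimum below the node
        }
        for (Node current = node; current.parent != null; current = current.parent) {
            Node sibling = current.parent.firstChildFrom(current.index + 1);
            if (sibling != null) {
                return firstWordAtOrBelow(sibling); // minimum in the next branch to the right
            }
        }
        return null;
    }

    // Rebuilds the word of a node by walking up to the root.
    static String wordOf(Node node) {
        StringBuilder sb = new StringBuilder();
        for (Node n = node; n.parent != null; n = n.parent) {
            sb.append((char) ('A' + n.index));
        }
        return sb.reverse().toString();
    }
}
```

Calling successor on the root yields the alphabetically first word, so repeated calls give another way to list the dictionary in sorted order.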

Alternatives for storing children
  A is the size of the alphabet
  L is the length of the word
  N is the number of nodes

  Operation: Array; HashMap; LinkedList
  Insertion: O(A · L); O(L)*; O(A · L) sorted, O(L) unsorted
  Deletion: O(L); O(L) (search already done)
  Search:
  Trie traversal: O(N)**; O(N)** sorted, O(A · log(A) · N)** unsorted

  * Assuming simple uniform hashing: expected O(L) time, worst case O(A · L)
  ** Assuming report function takes O(1) time

Array vs HashMap vs LinkedList
  Array
    + Simple
    + Small constants
    + Good for simple alphabet
    - A pain with complex alphabet
    - Always a lot of space
  HashMap
    + Works with complex alphabet
    + Fast expected time
    - Worst case still slow
    - Worst case more space than array
    - Constants worse than array
  LinkedList
    + Low space usage
    + Works with complex alphabet
    + Small constants
    - Very slow
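A minimal sketch of the HashMap alternative is shown below, to illustrate why it copes with a complex alphabet: no character-to-index mapping is needed. HashMapTrieSketch and its method names are hypothetical; java.util.HashMap and Map.computeIfAbsent are standard library.

```java
import java.util.HashMap;
import java.util.Map;

final class HashMapTrieSketch {
    static final class Node {
        // Children keyed by character, so any char works without an index mapping.
        final Map<Character, Node> children = new HashMap<>();
        boolean inDictionary;
    }

    final Node root = new Node();

    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            // Creates the child only if it is missing.
            node = node.children.computeIfAbsent(word.charAt(i), k -> new Node());
        }
        node.inDictionary = true;
    }

    boolean contains(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.get(word.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return node.inDictionary;
    }
}
```

A plain HashMap does not iterate its keys in sorted order, so retrieving words alphabetically would need the keys sorted at each node or a sorted map such as java.util.TreeMap instead.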

Compression
  Merge nodes
  Adapt operations appropriately
  Why compress: less data usage!
  How to adapt search?

Compression
  After deletion, more compression may be possible
  Insertion after compression: possibly split a node, insert and compress
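One possible way to adapt search to a compressed trie is to let each node carry the whole string of its merged path and compare that label in one step. The sketch below covers search only and builds the compressed trie by hand; it is an assumption about how the adaptation could look, and CompressedSearchSketch, label and addChild are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class CompressedSearchSketch {
    static final class Node {
        final String label; // characters merged into this node ("" for the root)
        final Map<Character, Node> children = new HashMap<>(); // keyed by first char of the child's label
        final boolean inDictionary;

        Node(String label, boolean inDictionary) {
            this.label = label;
            this.inDictionary = inDictionary;
        }

        // Helper for building a compressed trie by hand; the label must be non-empty.
        Node addChild(String childLabel, boolean childInDictionary) {
            Node child = new Node(childLabel, childInDictionary);
            children.put(childLabel.charAt(0), child);
            return child;
        }
    }

    static boolean contains(Node root, String word) {
        Node node = root;
        int pos = 0;
        while (pos < word.length()) {
            Node child = node.children.get(word.charAt(pos));
            // The whole label must match; a word ending inside a label is not stored.
            if (child == null || !word.startsWith(child.label, pos)) {
                return false;
            }
            pos += child.label.length();
            node = child;
        }
        return node.inDictionary;
    }
}
```

Insertion into such a structure is where the node splitting mentioned above comes in: if a new word diverges in the middle of a label, that node has to be split at the point of divergence before the new branch is added.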

Suffix tree
  Take all suffixes of a word
  Insert all into a trie
  Offers many fast string operations
  Worst case O(L^2) nodes
  All suffixes of BANANA
    BANANA
    ANANA
    NANA
    ANA
    NA
    A
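Building the uncompressed structure described here takes only a few lines, and counting its nodes shows the quadratic growth. This is a sketch under the uppercase-only assumption; SuffixTrieSketch, build and countNodes are hypothetical names.

```java
final class SuffixTrieSketch {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: uppercase A..Z only
        boolean inDictionary;                 // true where a suffix ends
    }

    // Inserts every suffix of the word into a fresh trie and returns its root.
    static Node build(String word) {
        Node root = new Node();
        for (int start = 0; start < word.length(); start++) {
            Node node = root;
            for (int i = start; i < word.length(); i++) {
                int c = word.charAt(i) - 'A';
                if (node.children[c] == null) {
                    node.children[c] = new Node();
                }
                node = node.children[c];
            }
            node.inDictionary = true;
        }
        return root;
    }

    // Counts all nodes in the subtree, including the given node.
    static int countNodes(Node node) {
        int count = 1;
        for (Node child : node.children) {
            if (child != null) {
                count += countNodes(child);
            }
        }
        return count;
    }
}
```

Every distinct non-empty substring gets its own node, so BANANA gives 15 nodes plus the root, while a six-letter word with no repeated letters such as ABCDEF gives 21 plus the root, the worst case of L(L+1)/2.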

Applications of suffix trees
  Number of occurrences of a pattern in a text
    Search for the pattern, only consider that subtree
    Result = number of nodes in that subtree with inDictionary = true
  Longest Common Substring of two strings
    Insert both strings into a suffix tree
    For each node, store which strings it represents
    Find deepest node represented by both strings
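Both applications can be sketched on an uncompressed suffix trie. The code below is an illustration only: it assumes uppercase A to Z input, records string membership as a bit mask in a hypothetical owners field, and uses the invented names SuffixApplicationsSketch, insertSuffixes, countOccurrences and longestCommonSubstring.

```java
final class SuffixApplicationsSketch {
    static final class Node {
        final Node[] children = new Node[26]; // assumption: uppercase A..Z only
        boolean inDictionary;                 // true where a suffix ends
        int owners;                           // bit 0: first string, bit 1: second string
    }

    // Inserts every suffix of text and tags each visited node with ownerBit.
    static void insertSuffixes(Node root, String text, int ownerBit) {
        for (int start = 0; start < text.length(); start++) {
            Node node = root;
            for (int i = start; i < text.length(); i++) {
                int c = text.charAt(i) - 'A';
                if (node.children[c] == null) {
                    node.children[c] = new Node();
                }
                node = node.children[c];
                node.owners |= ownerBit;
            }
            node.inDictionary = true;
        }
    }

    // Occurrences of pattern = suffix ends in the subtree reached by the pattern.
    static int countOccurrences(Node root, String pattern) {
        Node node = root;
        for (int i = 0; i < pattern.length() && node != null; i++) {
            node = node.children[pattern.charAt(i) - 'A'];
        }
        return node == null ? 0 : countMarked(node);
    }

    private static int countMarked(Node node) {
        int count = node.inDictionary ? 1 : 0;
        for (Node child : node.children) {
            if (child != null) {
                count += countMarked(child);
            }
        }
        return count;
    }

    // Deepest node reachable through nodes that belong to both strings.
    static String longestCommonSubstring(String a, String b) {
        Node root = new Node();
        insertSuffixes(root, a, 1);
        insertSuffixes(root, b, 2);
        StringBuilder best = new StringBuilder();
        deepestShared(root, new StringBuilder(), best);
        return best.toString();
    }

    private static void deepestShared(Node node, StringBuilder path, StringBuilder best) {
        if (path.length() > best.length()) {
            best.setLength(0);
            best.append(path);
        }
        for (int i = 0; i < 26; i++) {
            Node child = node.children[i];
            if (child != null && child.owners == 3) {
                path.append((char) ('A' + i));
                deepestShared(child, path, best);
                path.setLength(path.length() - 1);
            }
        }
    }
}
```

For occurrence counting, build the trie from a single text with insertSuffixes(root, text, 1) and then query it; for example, "ANA" is the prefix of the two suffixes ANANA and ANA of BANANA, so it occurs twice.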

Questions
