Data Structures and Analysis (COMP 410) David Stotts Computer Science Department UNC Chapel Hill
Design Problem
Real Problem
Type-ahead, as in Google search or phone typing: you type a few chars and the program fills in a list of possible choices based on the prefix you have typed.
As you keep typing more chars, the choices narrow and change.
Design a data structure that will let you do this.
Describe the time complexity of using it: searching it as typing is done, generating alternatives, etc.
Take some time
Discuss an approach with your neighbor.
In 5-10 mins we will discuss ideas as a class.
Let’s not use a node to store a whole word.
Use a child link to represent a char typed.
The path from the root is then the word.
Basic idea
[Figure: tree with a <root> and links labeled with single letters; the paths shown spell tar, to, tea, a, an, as, new.]
Basic idea…
This tree encodes (stores) these words: tar, tan, tea, to, ton, toe, a, an, ant, as, net, nest, new, no
[Figure: the same tree extended with more letter-labeled links, one path from <root> per stored word.]
Representation
How many children at each node? As many as there are chars you can type.
Let’s say 26 for this example.
    class Node {
        String val = null;
        Node[] child = new Node[26];   // all 26 slots start out null
        boolean isWord = false;
    }
Representation
[Figure: a newly created node. val is null, isWord is false, and child is an array with slots 0 through 25, all null.]
Representation
[Figure: four nodes, each with val, isWord, and child slots 0 through 25.
The root has an empty val and isWord: false. Its link for a leads to a node with val: “a”, isWord: true. Its link for b leads to a node with val: “b”, isWord: false.
The “b” node's link for e leads to a node with val: “be”, isWord: true.
Drawn as a tree: <root> has links a and b, and link e below b reaches “be”. The stored words are a and be; b is only a prefix.]
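The picture above can be rebuilt in code. What follows is a minimal Java sketch, not code from the slides: the class name TrieInsertSketch, the insert method, and the mapping of a letter to its slot by subtracting 'a' are all assumptions made for illustration, and the input is assumed to be lowercase a-z only.

```java
// Sketch only: names and the letter-to-slot mapping are illustrative assumptions.
class TrieInsertSketch {
    static class Node {
        String val = null;            // prefix spelled by the path to this node
        Node[] child = new Node[26];  // one slot per letter 'a'..'z', all null at first
        boolean isWord = false;       // true if val is a stored word
    }

    final Node root = new Node();

    // Assumes word contains only lowercase letters 'a'..'z'.
    void insert(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int slot = word.charAt(i) - 'a';   // assumed mapping: 'a' -> 0, ..., 'z' -> 25
            if (cur.child[slot] == null) {
                Node n = new Node();
                n.val = word.substring(0, i + 1);
                cur.child[slot] = n;
            }
            cur = cur.child[slot];
        }
        cur.isWord = true;   // mark the end of a complete word
    }

    public static void main(String[] args) {
        TrieInsertSketch t = new TrieInsertSketch();
        t.insert("a");
        t.insert("be");
        Node b = t.root.child['b' - 'a'];
        Node be = b.child['e' - 'a'];
        System.out.println(b.val + " isWord=" + b.isWord);    // prints: b isWord=false
        System.out.println(be.val + " isWord=" + be.isWord);  // prints: be isWord=true
    }
}
```

Inserting “be” creates the intermediate “b” node but leaves its isWord flag false, which is what distinguishes a stored word from a mere prefix.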
Analysis
Big Oh time complexity is always expressed in terms of some problem size.
Here the problem size is not the number of words encoded in the tree, as it is for a BST.
Rather we choose M, the length of a word being inserted or searched for.
Analysis
The worst case time needed to find a word of length M is… O(M), since each char of the word costs one child-link step.
This is true if the tree contains 10 words or 10 million words.
The length of the longest path in the tree is the length of the longest word stored in the tree.
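To make the O(M) claim concrete, here is a hedged Java sketch of a search. The class TrieSearchSketch and its insert and contains methods are hypothetical names, the val field is left out for brevity, and input is assumed to be lowercase a-z. The loop in contains runs at most once per char of the word, no matter how many words are stored.

```java
// Sketch only: a simplified node (no val field) and illustrative method names.
class TrieSearchSketch {
    static class Node {
        Node[] child = new Node[26];
        boolean isWord = false;
    }

    final Node root = new Node();

    // Assumes lowercase 'a'..'z' only.
    void insert(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int slot = word.charAt(i) - 'a';
            if (cur.child[slot] == null) cur.child[slot] = new Node();
            cur = cur.child[slot];
        }
        cur.isWord = true;
    }

    // At most M array lookups for a word of length M.
    boolean contains(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            cur = cur.child[word.charAt(i) - 'a'];
            if (cur == null) return false;   // path ends early: word not stored
        }
        return cur.isWord;                   // a path alone is only a prefix
    }

    public static void main(String[] args) {
        TrieSearchSketch t = new TrieSearchSketch();
        for (String w : new String[] {"tar", "tan", "tea", "to", "ton", "toe"}) t.insert(w);
        System.out.println(t.contains("ton"));  // prints: true
        System.out.println(t.contains("te"));   // prints: false (prefix, not a word)
    }
}
```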
This has a name
Trie
Pronounced “try” or “tree”, both ways.
Or “trie tree”: tree-tree, try-tree.
Comes from “reTRIEval”.
Used for prefix-based retrieval of strings formed over an alphabet.
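Prefix-based retrieval is what the type-ahead problem needs: walk down the links for the chars typed so far, then list every word below that node. The Java sketch below is one possible way to do that, not the course's reference solution. The names TrieCompleteSketch, complete, and collect are hypothetical, the val field is omitted, and input is assumed to be lowercase a-z. It uses the example word list from the earlier "Basic idea" slide.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: illustrative names; simplified node without a val field.
class TrieCompleteSketch {
    static class Node {
        Node[] child = new Node[26];
        boolean isWord = false;
    }

    final Node root = new Node();

    // Assumes lowercase 'a'..'z' only.
    void insert(String word) {
        Node cur = root;
        for (int i = 0; i < word.length(); i++) {
            int slot = word.charAt(i) - 'a';
            if (cur.child[slot] == null) cur.child[slot] = new Node();
            cur = cur.child[slot];
        }
        cur.isWord = true;
    }

    // Returns all stored words that start with prefix, in alphabetical order.
    List<String> complete(String prefix) {
        List<String> out = new ArrayList<>();
        Node cur = root;
        for (int i = 0; i < prefix.length(); i++) {
            cur = cur.child[prefix.charAt(i) - 'a'];
            if (cur == null) return out;     // nothing stored under this prefix
        }
        collect(cur, new StringBuilder(prefix), out);
        return out;
    }

    // Depth-first walk of the subtree below n; path holds the chars spelled so far.
    private void collect(Node n, StringBuilder path, List<String> out) {
        if (n.isWord) out.add(path.toString());
        for (int c = 0; c < 26; c++) {
            if (n.child[c] != null) {
                path.append((char) ('a' + c));
                collect(n.child[c], path, out);
                path.setLength(path.length() - 1);   // undo the append before the next sibling
            }
        }
    }

    public static void main(String[] args) {
        TrieCompleteSketch t = new TrieCompleteSketch();
        String[] words = {"tar", "tan", "tea", "to", "ton", "toe", "a", "an",
                          "ant", "as", "net", "nest", "new", "no"};
        for (String w : words) t.insert(w);
        System.out.println(t.complete("to"));  // prints: [to, toe, ton]
        System.out.println(t.complete("ne"));  // prints: [nest, net, new]
    }
}
```

Reaching the prefix node costs one step per char typed; the listing step then costs time proportional to the size of the subtree below it, so longer prefixes generally mean less work and fewer choices.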