Tries Data Structure
Tries
A trie is a special structure for representing sets of character strings. It can also represent sets of strings of objects of any type, e.g. strings of integers. The word “trie” is derived from the middle letters of the word “retrieval”.
Tries: Example
One way to implement a spelling checker is:
Read a text file.
Break it into words (character strings separated by blanks and newlines).
Find the words that are not in a standard dictionary of words.
Print the words that are in the text but not in the dictionary as possible misspellings.
Tries: Example
The spelling checker can be implemented with a set that supports the operations INSERT, DELETE, MAKENULL, and PRINT. A trie supports these set operations when the elements of the set are words.
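The spelling-checker steps can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in set in place of a trie, converts words to upper case, and the function name `possible_misspellings` and the sample text are hypothetical, not part of the slides.

```python
def possible_misspellings(text, dictionary):
    """Return the words of `text` that are not in `dictionary`,
    in order of first appearance and without repeats."""
    reported = set()                 # MAKENULL: start from the empty set
    result = []
    for word in text.split():        # split on blanks and newlines
        word = word.upper()
        if word not in dictionary and word not in reported:
            reported.add(word)       # INSERT
            result.append(word)
    return result


# Hypothetical dictionary and input text, chosen for illustration.
dictionary = {"THE", "THEN", "THIN", "TIN", "SIN", "SING"}
text = "the tin\nthen sang the thn"

for word in possible_misspellings(text, dictionary):   # PRINT
    print(word)
```

With this input the sketch reports SANG and THN. Replacing the built-in set with a trie would leave the surrounding logic unchanged, since only the set operations are used.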
Tries: Example
[Figure: trie diagram whose edges are labeled with letters and the end-of-word marker $.]
Tries are appropriate when many words begin with the same sequence of letters, i.e. when the number of distinct prefixes among all words in the set is much less than the total length of all the words. Each path from the root to a leaf corresponds to one word in the represented set. The nodes of the trie correspond to the prefixes of words in the set.
Tries: Example
The symbol $ is added at the end of each word so that no prefix of a word can be a word itself. The trie corresponds to the set {THE, THEN, THIN, TIN, SIN, SING}. Each node has at most 27 children, one for each letter and one for $. Most nodes will have many fewer than 27 children. A leaf reached by an edge labeled $ cannot have any children.
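The correspondence between root-to-leaf paths and words can be checked with a small sketch. It assumes a trie stored as nested Python dictionaries (one key per outgoing edge), which is a convenience for illustration and not the representation used in the slides; the names `build_trie`, `words_in`, and `max_children` are hypothetical.

```python
END = "$"  # end-of-word marker


def build_trie(words):
    """Build a trie as nested dicts; each key is an edge label."""
    root = {}
    for word in words:
        node = root
        for ch in word + END:
            node = node.setdefault(ch, {})   # follow or create the edge
    return root


def words_in(node, prefix=""):
    """Yield one word for every root-to-leaf path."""
    for ch, child in sorted(node.items()):
        if ch == END:
            yield prefix                     # a $ edge ends a word
        else:
            yield from words_in(child, prefix + ch)


def max_children(node):
    """Largest number of children of any node in the trie."""
    return max([len(node)] + [max_children(c) for c in node.values()])


trie = build_trie(["THE", "THEN", "THIN", "TIN", "SIN", "SING"])
print(list(words_in(trie)))
print(max_children(trie))
```

For this word set no node has more than 2 children, far below the bound of 27, and the dictionary reached through a $ edge is always empty.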
Trie Nodes as an ADT
A node in a trie can be viewed as a mapping whose domain is {A, B, …, Z, $} and whose value set is the type “pointer to trie node”. A trie can be identified with its root, so the ADTs TRIE and TRIENODE have the same data type. However, their operations are different.
Operations on Trie Nodes
ASSIGN(node, c, p): Assigns the value p (a pointer to a node) to character c in node.
VALUEOF(node, c): Produces the value associated with character c in node.
GETNEW(node, c): Makes the value of node for character c a pointer to a new node.
MAKENULL(node): Makes node the null mapping.
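The four node operations can be sketched as follows. The sketch assumes each node stores its mapping as a list of 27 slots indexed by position in the alphabet, with `None` standing for a null pointer. Having `getnew` return the new node, and the trie-level helpers `insert` and `member`, are assumptions added for illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ$"   # 26 letters plus the end marker


class TrieNode:
    def __init__(self):
        self.slots = [None] * len(ALPHABET)   # starts as the null mapping


def makenull(node):
    """MAKENULL: make node the null mapping."""
    node.slots = [None] * len(ALPHABET)


def assign(node, c, p):
    """ASSIGN: associate pointer p with character c in node."""
    node.slots[ALPHABET.index(c)] = p


def valueof(node, c):
    """VALUEOF: pointer associated with character c, or None."""
    return node.slots[ALPHABET.index(c)]


def getnew(node, c):
    """GETNEW: point character c at a fresh node (returned for convenience)."""
    child = TrieNode()
    assign(node, c, child)
    return child


def insert(root, word):
    """Illustrative trie-level INSERT built from the node operations."""
    node = root
    for c in word + "$":
        nxt = valueof(node, c)
        node = nxt if nxt is not None else getnew(node, c)


def member(root, word):
    """Illustrative membership test: follow the word, then the $ edge."""
    node = root
    for c in word + "$":
        node = valueof(node, c)
        if node is None:
            return False
    return True


root = TrieNode()
for w in ["THE", "THEN", "THIN", "TIN", "SIN", "SING"]:
    insert(root, w)
print(member(root, "THIN"), member(root, "TH"))
```

The prefix TH is present as a node but is not reported as a word, because its node has no $ edge.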
Sets
A set is a collection of members (or elements). Each member of a set is either itself a set or a primitive element called an atom. All elements of a set are different.
Sets
The members of a set can be integers, characters, or strings. All elements can be of the same type. The atoms in a set can be linearly ordered. A linear order (denoted by <, read “less than” or “precedes”) on a set S satisfies two properties:
For any a and b in S, exactly one of a < b, a = b, or b < a is true.
For all a, b, and c in S, if a < b and b < c, then a < c (transitivity).
Set Notation
A set of atoms is generally written by putting curly brackets around its members. Example: {1, 4} denotes the set whose members are 1 and 4. A set is not a list, since the order of elements in a set is not important: {4, 1} is the same set as {1, 4}.
Operations on Sets
UNION: If A and B are sets, then A ∪ B is the set of elements that are members of A or B or both.
INTERSECTION: A ∩ B is the set of elements that are members of both A and B.
DIFFERENCE: A – B is the set of elements that are members of A but not members of B.
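As a quick illustration, Python's built-in set type provides these three operations directly through the operators `|`, `&`, and `-`. The sets A and B below are arbitrary examples; results are sorted only to make the printed order predictable.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

print(sorted(A | B))   # UNION: members of A or B or both
print(sorted(A & B))   # INTERSECTION: members of both A and B
print(sorted(A - B))   # DIFFERENCE: members of A that are not in B

# Order of elements does not matter when comparing sets.
print({4, 1} == {1, 4})
```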
Abstract Data Types Based on Sets
The Set ADT can incorporate some other operations as well.
MERGE(A, B, C): Assigns to the set variable C the value A ∪ B. The operator is not defined if A ∩ B ≠ Ø.
MEMBER(x, A): Returns true if x ∈ A and false if x ∉ A.
MAKENULL(A): Makes the null set the value of set variable A.
Abstract Data Types Based on Sets
INSERT(x, A): x is an element of the type of A’s members. Makes x a member of A: A = A ∪ {x}.
DELETE(x, A): Removes x from A: A = A – {x}.
ASSIGN(A, B): Sets the value of set variable A to be equal to the value of set variable B.
Abstract Data Types Based on Sets
MIN(A): Returns the least element in set A. This operator is applicable only when the members of A are linearly ordered.
MAX(A): Returns the largest element in set A. This operator is applicable only when the members of A are linearly ordered.
EQUAL(A, B): Returns true if and only if sets A and B consist of the same elements.
FIND(x): Works on a collection of disjoint sets. Returns the name of the unique set of which x is a member.
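The Set ADT operations map onto Python's built-in set fairly directly. The sketch below is one possible reading, with these assumptions: MERGE raises an error when the sets are not disjoint (the slides only say it is undefined), FIND searches a dictionary from set names to sets, and the names `merge` and `find` are hypothetical helpers.

```python
def merge(a, b):
    """MERGE: union of two disjoint sets; raises if they overlap."""
    if a & b:
        raise ValueError("MERGE is defined only for disjoint sets")
    return a | b


def find(x, named_sets):
    """FIND: name of the unique set containing x, or None if absent."""
    for name, members in named_sets.items():
        if x in members:
            return name
    return None


A = set()              # MAKENULL(A)
A.add(3)               # INSERT(3, A)
A.add(7)               # INSERT(7, A)
A.add(5)               # INSERT(5, A)
A.discard(7)           # DELETE(7, A)
B = set(A)             # ASSIGN(B, A): B gets a copy of A's value

C = merge(A, {8, 9})   # MERGE(A, {8, 9}, C)

print(5 in A)          # MEMBER(5, A)
print(min(C), max(C))  # MIN(C), MAX(C): integers are linearly ordered
print(A == B)          # EQUAL(A, B)
print(find(8, {"odds": {1, 3, 5}, "evens": {2, 4, 8}}))   # FIND(8)
```

Copying in ASSIGN keeps the two set variables independent, so a later INSERT into one does not change the other.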
Reference “Data Structures and Algorithms” by A. V. Aho, J. E. Hopcroft, J. D. Ullman.