Trie/Suffix Trie/Suffix Tree

- A trie (from "retrieval") is a multi-way tree structure useful for storing strings over an alphabet. It has been used to store large dictionaries of words (English words, say) in spell-checking programs and in natural-language "understanding" programs. Consider, for example, the data:
- an, ant, all, allot, alloy, aloe, are, ate, be

- The idea is that all strings sharing a common stem or prefix hang off a common node. When the strings are words over {a..z}, a node has at most 27 children - one for each letter plus a terminator.
- The elements in a string can be recovered in a scan from the root to the leaf that ends a string. All strings in the trie can be recovered by a depth-first scan of the tree.
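- The two points above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes each node is a plain dict mapping a symbol to a child node, and that "$" serves as the terminator. The names END, trie_insert and trie_words are made up for this sketch.

```python
END = "$"  # assumed terminator symbol marking the end of a stored string

def trie_insert(root, word):
    """Walk down from the root, creating nodes as needed, then mark the end."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = {}

def trie_words(node, prefix=""):
    """Depth-first scan that recovers every stored string."""
    words = []
    for ch in sorted(node):
        if ch == END:
            words.append(prefix)          # a root-to-terminator path spells one string
        else:
            words.extend(trie_words(node[ch], prefix + ch))
    return words

root = {}
for w in ["an", "ant", "all", "allot", "alloy", "aloe", "are", "ate", "be"]:
    trie_insert(root, w)

print(trie_words(root))
```

- Because the children are visited in sorted order, the depth-first scan returns the strings in lexicographic order. Words that share a prefix, such as "all", "allot" and "alloy", share the nodes on the path a, l, l.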

- The idea behind a suffix trie is to assign to each symbol in a text an index corresponding to its position in the text (i.e., the first symbol has index 1 and the last symbol has index n, where n is the number of symbols in the text).

- A suffix trie is an ordinary trie in which the input strings are all the suffixes of a single text.
- A suffix of a text [t1 ... tn] is a substring [ti ... tn] where i is an integer between 1 and n.
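- As a small illustration of the definition, the following sketch lists every suffix of a text together with its 1-based starting position i. The function name suffixes is hypothetical, and the word "aloe" is used only as a short example text.

```python
def suffixes(text):
    """Return (i, suffix) pairs, where i is the 1-based start position."""
    return [(i, text[i - 1:]) for i in range(1, len(text) + 1)]

for i, suffix in suffixes("aloe"):
    print(i, suffix)
```

- A text of n symbols has exactly n suffixes, one for each starting position.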

- To demonstrate the structure of the resulting tree we will build the suffix trie corresponding to the following text:
TEXT:     G O O G O L $
POSITION: 1 2 3 4 5 6 7
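- A possible way to build this suffix trie is sketched below. It is a naive construction that inserts each suffix in turn, taking time quadratic in the length of the text. The Node class and the functions build_suffix_trie and occurrences are names invented for this sketch. It assumes the text ends with a terminator "$" that occurs nowhere else, so every suffix ends at a leaf, and each leaf stores the 1-based starting position of its suffix.

```python
class Node:
    def __init__(self):
        self.children = {}   # symbol -> Node
        self.start = None    # on a leaf: 1-based start position of the suffix

def build_suffix_trie(text):
    """Insert every suffix of text into an ordinary trie."""
    root = Node()
    for start in range(1, len(text) + 1):
        node = root
        for symbol in text[start - 1:]:
            node = node.children.setdefault(symbol, Node())
        node.start = start   # the suffix beginning at `start` ends here
    return root

def occurrences(root, pattern):
    """Return the sorted 1-based positions where pattern occurs in the text."""
    node = root
    for symbol in pattern:
        if symbol not in node.children:
            return []        # pattern is not a substring of the text
        node = node.children[symbol]
    found, stack = [], [node]
    while stack:             # collect every leaf below the reached node
        current = stack.pop()
        if current.start is not None:
            found.append(current.start)
        stack.extend(current.children.values())
    return sorted(found)

trie = build_suffix_trie("GOOGOL$")
print(occurrences(trie, "GO"))  # [1, 4]
print(occurrences(trie, "O"))   # [2, 3, 5]
```

- The lookup works because every substring of the text is a prefix of some suffix: following the pattern from the root and then reading the leaves below gives all positions at which the pattern starts.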

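- The suffix tree is created by compacting every unary node in the suffix trie: each chain of nodes that have exactly one child is merged into a single edge labeled with the concatenation of the symbols along the chain.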
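- The compaction step can be sketched as follows, under the assumption that the suffix trie is stored as nested dicts (symbol -> child) and that the text ends with a unique terminator "$". The function names are hypothetical, and this naive version builds the full trie first; it is not one of the linear-time suffix tree constructions.

```python
def build_suffix_trie(text):
    """Naive suffix trie as nested dicts: symbol -> child node."""
    root = {}
    for i in range(len(text)):
        node = root
        for symbol in text[i:]:
            node = node.setdefault(symbol, {})
    return root

def compact(node):
    """Merge chains of unary nodes into single edges with string labels."""
    tree = {}
    for symbol, child in node.items():
        label = symbol
        while len(child) == 1:                 # child is unary: absorb it
            (next_symbol, grandchild), = child.items()
            label += next_symbol
            child = grandchild
        tree[label] = compact(child)
    return tree

suffix_tree = compact(build_suffix_trie("GOOGOL$"))
print(suffix_tree)
```

- For the text GOOGOL$ the root of the resulting tree has four outgoing edges, labeled GO, O, L$ and $. The node below GO branches into OGOL$ and L$, and the node below O branches into OGOL$, GOL$ and L$, which gives seven leaves, one per suffix.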