Ukkonens Algorithm

Both KMP and Boyer Moore algorithm run at O(m+n)O(m+n) time complexity, where mm is the length of the text, and nn is the length of the string. The weaknesses of these algorithms become more apparent as the text space increases.

Ukkonens construction of suffix trees ensures linear time search for the string S, that is independent from the size of the text T. Once the tree is constructed, main other operations can be complete and can go beyond the scope of just the needle and in haystack problem. Suffix trees, for example, provide a linear time solution to the longest common substring problem.

Improvements

  • O(1) space efficiency per node, through the use of two indices that describe the start and end positions of a suffix substring of S. The number of nodes will stay linear to the given word S, so we achieve O(n)O(n) space complexity.

  • Node creations are constant time, since only a constant number of node attributes are initialized at a given time independent of S and T.

  • There is no need to traverse down the tree. Instead memory is maintained on to where to make inserts, either in the form of (1) a pointer to a node that was left from a previous insertion, or (2) a network of suffix links.

Last updated