On the sorting-complexity of suffix tree construction

Martin Farach-Colton, Paolo Ferragina, S. Muthukrishnan

Research output: Contribution to journalArticlepeer-review

164 Scopus citations

Abstract

The suffix tree of a string is the fundamental data structure of combinatorial pattern matching. We present a recursive technique for building suffix trees that yields optimal algorithms in different computational models. Sorting is an inherent bottleneck in building suffix trees and our algorithms match the sorting lower bound. Specifically, we present the following results. (1) Weiner [1973], who introduced the data structure, gave an optimal O(n)-time algorithm for building the suffix tree of an n-character string drawn from a constant-size alphabet. In the comparison model, there is a trivial Ω(n log n)-time lower bound based on sorting, and Weiner's algorithm matches this bound. For integer alphabets, the fastest known algorithm is the O(n log n) time comparison-based algorithm, but no super-linear lower bound is known. Closing this gap.

Original languageEnglish (US)
Pages (from-to)987-1011
Number of pages25
JournalJournal of the ACM
Volume47
Issue number6
DOIs
StatePublished - 2000

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Information Systems
  • Hardware and Architecture
  • Artificial Intelligence

Keywords

  • Dam model
  • External-memory data structures
  • Ram model
  • Sorting complexity
  • Suffix array
  • Suffix tree

Fingerprint

Dive into the research topics of 'On the sorting-complexity of suffix tree construction'. Together they form a unique fingerprint.

Cite this