Perfect hashing for strings: Formalization and algorithms

Martin Farach, S. Muthukrishnan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

49 Scopus citations


Numbers and strings are two objects manipulated by most programs. Hashing has been well-studied for numbers and it has been effective in practice. In contrast, basic hashing issues for strings remain largely unexplored. In this paper, we identify and formulate the core hashing problem for strings that we call substring hashing. Our main technical results are highly efficient sequential/parallel (CRCW PRAM) Las Vegas type algorithms that determine a perfect hash function for substring hashing. For example, given a binary string of length n, one of our algorithms finds a perfect hash function in 0(log n) time, O(n) work, and O(n) space; the hash value for any substring can then be computed in O(loglogn) time using a single processor. Our approach relies on a novel use of the suffix tree of a string. In implementing our approach, we design optimal parallel algorithms for the problem of determining weighted ancestors on a edge-weighted tree that may be of independent interest.

Original languageEnglish (US)
Title of host publicationCombinatorial Pattern Matching - 7th Annual Symposium, CPM 1996, Proceedings
EditorsGene Myers, Dan Hirschberg
PublisherSpringer Verlag
Number of pages11
ISBN (Print)3540612580, 9783540612582
StatePublished - 1996
Event7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996 - Laguna Beach, United States
Duration: Jun 10 1996Jun 12 1996

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349


Other7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996
Country/TerritoryUnited States
CityLaguna Beach

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science


Dive into the research topics of 'Perfect hashing for strings: Formalization and algorithms'. Together they form a unique fingerprint.

Cite this