Guess check codes for deletions, insertions, and synchronization

Serge Kas Hanna, Salim El Rouayheb

Research output: Contribution to journalArticlepeer-review

30 Scopus citations

Abstract

We consider the problem of constructing codes that can correct δ deletions occurring in an arbitrary binary string of length n bits. Varshamov-Tenengolts (VT) codes, dating back to 1965, are zero-error single deletion (δ =1) correcting codes and have an asymptotically optimal redundancy. Finding similar codes for δ ≥2 deletions remains an open problem. In this paper, we relax the standard zero-error (i.e., worst-case) decoding requirement by assuming that the positions of the δ deletions (or insertions) are independent of the code word. Our contribution is a new family of explicit codes, that we call Guess Check (GC) codes, that can correct with high probability up to a constant number of δ deletions (or insertions). GC codes are systematic; and have deterministic polynomial time encoding and decoding algorithms. We also describe the application of GC codes to file synchronization.

Original languageEnglish (US)
Article number8368277
Pages (from-to)3-15
Number of pages13
JournalIEEE Transactions on Information Theory
Volume65
Issue number1
DOIs
StatePublished - Jan 2019

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences

Keywords

  • Coding theory
  • deletions
  • file synchronization
  • insertions
  • systematic codes

Fingerprint

Dive into the research topics of 'Guess check codes for deletions, insertions, and synchronization'. Together they form a unique fingerprint.

Cite this