Synchronization Strings: List Decoding for Insertions and Deletions

Authors Bernhard Haeupler, Amirbehshad Shahrasbi, Madhu Sudan



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2018.76.pdf
  • Filesize: 468 kB
  • 14 pages

Document Identifiers

Author Details

Bernhard Haeupler
  • Carnegie Mellon University, Pittsburgh, PA, USA
Amirbehshad Shahrasbi
  • Carnegie Mellon University, Pittsburgh, PA, USA
Madhu Sudan
  • Harvard University, Cambridge, MA, USA

Cite AsGet BibTex

Bernhard Haeupler, Amirbehshad Shahrasbi, and Madhu Sudan. Synchronization Strings: List Decoding for Insertions and Deletions. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 76:1-76:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ICALP.2018.76

Abstract

We study codes that are list-decodable under insertions and deletions ("insdel codes"). Specifically, we consider the setting where, given a codeword x of length n over some finite alphabet Sigma of size q, delta * n codeword symbols may be adversarially deleted and gamma * n symbols may be adversarially inserted to yield a corrupted word w. A code is said to be list-decodable if there is an (efficient) algorithm that, given w, reports a small list of codewords that include the original codeword x. Given delta and gamma we study what is the rate R for which there exists a constant q and list size L such that there exist codes of rate R correcting delta-fraction insertions and gamma-fraction deletions while reporting lists of size at most L. Using the concept of synchronization strings, introduced by the first two authors [Proc. STOC 2017], we show some surprising results. We show that for every 0 <= delta < 1, every 0 <= gamma < infty and every epsilon > 0 there exist codes of rate 1 - delta - epsilon and constant alphabet (so q = O_{delta,gamma,epsilon}(1)) and sub-logarithmic list sizes. Furthermore, our codes are accompanied by efficient (polynomial time) decoding algorithms. We stress that the fraction of insertions can be arbitrarily large (more than 100%), and the rate is independent of this parameter. We also prove several tight bounds on the parameters of list-decodable insdel codes. In particular, we show that the alphabet size of insdel codes needs to be exponentially large in epsilon^{-1}, where epsilon is the gap to capacity above. Our result even applies to settings where the unique-decoding capacity equals the list-decoding capacity and when it does so, it shows that the alphabet size needs to be exponentially large in the gap to capacity. This is sharp contrast to the Hamming error model where alphabet size polynomial in epsilon^{-1} suffices for unique decoding. This lower bound also shows that the exponential dependence on the alphabet size in previous works that constructed insdel codes is actually necessary! Our result sheds light on the remarkable asymmetry between the impact of insertions and deletions from the point of view of error-correction: Whereas deletions cost in the rate of the code, insertion costs are borne by the adversary and not the code! Our results also highlight the dominance of the model of insertions and deletions over the Hamming model: A Hamming error is equal to one insertion and one deletion (at the same location). Thus the effect of delta-fraction Hamming errors can be simulated by delta-fraction of deletions and delta-fraction of insertions - but insdel codes can deal with much more insertions without loss in rate (though at the price of higher alphabet size).

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Coding theory
Keywords
  • List Decoding
  • Insertions and Deletions
  • Synchronization Strings

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Venkatesan Guruswami and Piotr Indyk. Expander-based constructions of efficiently decodable codes. In Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), pages 658-667. IEEE, 2001. Google Scholar
  2. Venkatesan Guruswami and Piotr Indyk. Near-optimal linear-time codes for unique decoding and new list-decodable codes over smaller alphabets. In Proceedings of the thiry-fourth annual ACM symposium on Theory of computing, pages 812-821. ACM, 2002. Google Scholar
  3. Venkatesan Guruswami and Piotr Indyk. Linear time encodable and list decodable codes. In Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, pages 126-135. ACM, 2003. Google Scholar
  4. Venkatesan Guruswami and Piotr Indyk. Linear-time list decoding in error-free settings. In International Colloquium on Automata, Languages, and Programming, pages 695-707. Springer, 2004. Google Scholar
  5. Venkatesan Guruswami and Ray Li. Efficiently decodable insertion/deletion codes for high-noise and high-rate regimes. In Information Theory (ISIT), 2016 IEEE International Symposium on, pages 620-624. IEEE, 2016. Google Scholar
  6. Venkatesan Guruswami and Atri Rudra. Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy. IEEE Transactions on Information Theory, 54(1):135-150, 2008. Google Scholar
  7. Venkatesan Guruswami and Madhu Sudan. Improved decoding of reed-solomon and algebraic-geometric codes. In Foundations of Computer Science, 1998. Proceedings. 39th Annual Symposium on, pages 28-37. IEEE, 1998. Google Scholar
  8. Venkatesan Guruswami and Carol Wang. Optimal rate list decoding via derivative codes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 593-604. Springer, 2011. Google Scholar
  9. Venkatesan Guruswami and Carol Wang. Deletion codes in the high-noise and high-rate regimes. IEEE Transactions on Information Theory, 63(4):1961-1970, 2017. Google Scholar
  10. Venkatesan Guruswami and Chaoping Xing. List decoding reed-solomon, algebraic-geometric, and gabidulin subcodes up to the singleton bound. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 843-852. ACM, 2013. Google Scholar
  11. Venkatesan Guruswami and Chaoping Xing. Optimal rate list decoding over bounded alphabets using algebraic-geometric codes. arXiv preprint arXiv:1708.01070, 2017. Google Scholar
  12. Bernhard Haeupler and Amirbehshad Shahrasbi. Synchronization strings: Codes for insertions and deletions approaching the singleton bound. In Proceedings of the Annual Symposium on Theory of Computing (STOC), 2017. Google Scholar
  13. Bernhard Haeupler and Amirbehshad Shahrasbi. Synchronization strings: Explicit constructions, local decoding, and applications. In Proceedings of the Annual Symposium on Theory of Computing (STOC), 2018. Google Scholar
  14. Bernhard Haeupler, Amirbehshad Shahrasbi, and Ellen Vitercik. Synchronization strings: Channel simulations and interactive coding for insertions and deletions. Proceedings of the International Conference on Automata, Languages, and Programming (ICALP), 2017. Google Scholar
  15. Brett Hemenway, Noga Ron-Zewi, and Mary Wootters. Local list recovery of high-rate tensor codes and applications. Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), 2017. Google Scholar
  16. Brett Hemenway and Mary Wootters. Linear-time list recovery of high-rate expander codes. In International Colloquium on Automata, Languages, and Programming, pages 701-712. Springer, 2015. Google Scholar
  17. Swastik Kopparty. List-decoding multiplicity codes. Theory of Computing, 11(5):149-182, 2015. Google Scholar
  18. Vladimir Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Doklady Akademii Nauk SSSR 163, 4:845-848, 1965. Google Scholar
  19. Hugues Mercier, Vijay K Bhargava, and Vahid Tarokh. A survey of error-correcting codes for channels with symbol synchronization errors. IEEE Communications Surveys &Tutorials, 12(1), 2010. Google Scholar
  20. Michael Mitzenmacher. A survey of results for deletion channels and related synchronization channels. Probability Surveys, 6:1-33, 2009. Google Scholar
  21. Leonard J. Schulman and David Zuckerman. Asymptotically good codes correcting insertions, deletions, and transpositions. IEEE transactions on information theory, 45(7):2552-2557, 1999. Google Scholar
  22. Neil JA Sloane. On single-deletion-correcting codes. Codes and designs, 10:273-291, 2002. Google Scholar
  23. Antonia Wachter-Zeh. List decoding of insertions and deletions. IEEE Transactions on Information Theory, 2017. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail