Beyond Single-Deletion Correcting Codes: Substitutions and Transpositions

Authors Ryan Gabrys , Venkatesan Guruswami , João Ribeiro , Ke Wu



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2022.8.pdf
  • Filesize: 0.75 MB
  • 17 pages

Document Identifiers

Author Details

Ryan Gabrys
  • ECE Department, University of California, San Diego, CA, USA
Venkatesan Guruswami
  • EECS Department, University of California, Berkeley, CA, USA
João Ribeiro
  • Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA
Ke Wu
  • Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA

Cite AsGet BibTex

Ryan Gabrys, Venkatesan Guruswami, João Ribeiro, and Ke Wu. Beyond Single-Deletion Correcting Codes: Substitutions and Transpositions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 245, pp. 8:1-8:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2022.8

Abstract

We consider the problem of designing low-redundancy codes in settings where one must correct deletions in conjunction with substitutions or adjacent transpositions; a combination of errors that is usually observed in DNA-based data storage. One of the most basic versions of this problem was settled more than 50 years ago by Levenshtein, who proved that binary Varshamov-Tenengolts codes correct one arbitrary edit error, i.e., one deletion or one substitution, with nearly optimal redundancy. However, this approach fails to extend to many simple and natural variations of the binary single-edit error setting. In this work, we make progress on the code design problem above in three such variations: - We construct linear-time encodable and decodable length-n non-binary codes correcting a single edit error with nearly optimal redundancy log n+O(log log n), providing an alternative simpler proof of a result by Cai, Chee, Gabrys, Kiah, and Nguyen (IEEE Trans. Inf. Theory 2021). This is achieved by employing what we call weighted VT sketches, a new notion that may be of independent interest. - We show the existence of a binary code correcting one deletion or one adjacent transposition with nearly optimal redundancy log n+O(log log n). - We construct linear-time encodable and list-decodable binary codes with list-size 2 for one deletion and one substitution with redundancy 4log n+O(log log n). This matches the existential bound up to an O(log log n) additive term.

Subject Classification

ACM Subject Classification
  • Theory of computation → Error-correcting codes
Keywords
  • Synchronization errors
  • Optimal redundancy
  • Explicit codes

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Joshua Brakensiek, Venkatesan Guruswami, and Samuel Zbarsky. Efficient low-redundancy codes for correcting multiple deletions. IEEE Transactions on Information Theory, 64(5):3403-3410, 2018. URL: https://doi.org/10.1109/TIT.2017.2746566.
  2. Kui Cai, Yeow Meng Chee, Ryan Gabrys, Han Mao Kiah, and Tuan Thanh Nguyen. Correcting a single indel/edit for DNA-based data storage: Linear-time encoders and order-optimality. IEEE Transactions on Information Theory, 67(6):3438-3451, 2021. URL: https://doi.org/10.1109/TIT.2021.3049627.
  3. Kuan Cheng, Zhengzhong Jin, Xin Li, and Ke Wu. Deterministic document exchange protocols, and almost optimal binary codes for edit errors. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 200-211, 2018. URL: https://doi.org/10.1109/FOCS.2018.00028.
  4. Kuan Cheng, Zhengzhong Jin, Xin Li, and Ke Wu. Block edit errors with transpositions: Deterministic document exchange protocols and almost optimal binary codes. In Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi, editors, 46th International Colloquium on Automata, Languages, and Programming (ICALP), pages 37:1-37:15, 2019. URL: https://doi.org/10.4230/LIPIcs.ICALP.2019.37.
  5. Ryan Gabrys, Venkatesan Guruswami, João Ribeiro, and Ke Wu. Beyond single-deletion correcting codes: Substitutions and transpositions. arXiv e-prints, December 2021. URL: https://doi.org/10.48550/arXiv.2112.09971.
  6. Ryan Gabrys and Frederic Sala. Codes correcting two deletions. IEEE Transactions on Information Theory, 65(2):965-974, 2019. URL: https://doi.org/10.1109/TIT.2018.2876281.
  7. Ryan Gabrys, Eitan Yaakobi, and Olgica Milenkovic. Codes in the Damerau distance for deletion and adjacent transposition correction. IEEE Transactions on Information Theory, 64(4):2550-2570, 2018. URL: https://doi.org/10.1109/TIT.2017.2778143.
  8. Venkatesan Guruswami, Bernhard Haeupler, and Amirbehshad Shahrasbi. Optimally resilient codes for list-decoding from insertions and deletions. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 524-537, 2020. URL: https://doi.org/10.1145/3357713.3384262.
  9. Venkatesan Guruswami and Johan Håstad. Explicit two-deletion codes with redundancy matching the existential bound. IEEE Transactions on Information Theory, 67(10):6384-6394, 2021. URL: https://doi.org/10.1109/TIT.2021.3069446.
  10. Bernhard Haeupler. Optimal document exchange and new codes for insertions and deletions. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 334-347, 2019. URL: https://doi.org/10.1109/FOCS.2019.00029.
  11. Bernhard Haeupler and Amirbehshad Shahrasbi. Synchronization strings: Explicit constructions, local decoding, and applications. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 841-854, 2018. URL: https://doi.org/10.1145/3188745.3188940.
  12. Andreas Lenz and Nikita Polyanskii. Optimal codes correcting a burst of deletions of variable length. In 2020 IEEE International Symposium on Information Theory (ISIT), pages 757-762, 2020. URL: https://doi.org/10.1109/ISIT44484.2020.9174288.
  13. Vladimir Iosifovich Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Doklady Akademii Nauk, 163(4):845-848, 1965. Google Scholar
  14. Lee Organick, Siena Dumas Ang, Yuan-Jyue Chen, Randolph Lopez, Sergey Yekhanin, Konstantin Makarychev, Miklos Z Racz, Govinda Kamath, Parikshit Gopalan, Bichlien Nguyen, et al. Random access in large-scale DNA data storage. Nature biotechnology, 36(3):242, 2018. URL: https://doi.org/10.1038/nbt.4079.
  15. Clayton Schoeny, Antonia Wachter-Zeh, Ryan Gabrys, and Eitan Yaakobi. Codes correcting a burst of deletions or insertions. IEEE Transactions on Information Theory, 63(4):1971-1985, 2017. URL: https://doi.org/10.1109/TIT.2017.2661747.
  16. Leonard J. Schulman and David Zuckerman. Asymptotically good codes correcting insertions, deletions, and transpositions. IEEE Transactions on Information Theory, 45(7):2552-2557, 1999. URL: https://doi.org/10.1109/18.796406.
  17. Jin Sima and Jehoshua Bruck. On optimal k-deletion correcting codes. IEEE Transactions on Information Theory, 67(6):3360-3375, 2021. URL: https://doi.org/10.1109/TIT.2020.3028702.
  18. Neil J. A. Sloane. On single-deletion-correcting codes. arXiv, 2002. URL: https://doi.org/10.48550/arXiv.math/0207197.
  19. Ilia Smagloy, Lorenz Welter, Antonia Wachter-Zeh, and Eitan Yaakobi. Single-deletion single-substitution correcting codes. In 2020 IEEE International Symposium on Information Theory (ISIT), pages 775-780, 2020. URL: https://doi.org/10.1109/ISIT44484.2020.9174213.
  20. Wentu Song, Kui Cai, and Tuan Thanh Nguyen. List-decodable codes for single-deletion single-substitution with list-size two, January 2022. URL: https://doi.org/10.48550/arXiv.2201.02013.
  21. Wentu Song, Nikita Polyanskii, Kui Cai, and Xuan He. On multiple-deletion multiple-substitution correcting codes. In 2021 IEEE International Symposium on Information Theory (ISIT), pages 2655-2660, 2021. URL: https://doi.org/10.1109/ISIT45174.2021.9517878.
  22. Daniel Tan. Implementation of single-edit correcting code, 2020. URL: https://github.com/dtch1997/single-edit-correcting-code.
  23. Yuanyuan Tang and Farzad Farnoud. Error-correcting codes for short tandem duplication and edit errors. IEEE Transactions on Information Theory, 68(2):871-880, 2022. URL: https://doi.org/10.1109/TIT.2021.3125724.
  24. Rom R. Varshamov and Grigory M. Tenengolts. Codes which correct single asymmetric errors. Autom. Remote Control, 26(2):286-290, 1965. Google Scholar
  25. Antonia Wachter-Zeh. List decoding of insertions and deletions. IEEE Transactions on Information Theory, 64(9):6297-6304, 2018. URL: https://doi.org/10.1109/TIT.2017.2777471.
  26. Shuche Wang, Jin Sima, and Farzad Farnoud. Non-binary codes for correcting a burst of at most 2 deletions. In 2021 IEEE International Symposium on Information Theory (ISIT), pages 2804-2809, 2021. URL: https://doi.org/10.1109/ISIT45174.2021.9517917.
  27. S. M. Hossein Tabatabaei Yazdi, Ryan Gabrys, and Olgica Milenkovic. Portable and error-free DNA-based data storage. Scientific reports, 7(1):5011, 2017. URL: https://doi.org/10.1038/s41598-017-05188-1.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail