RLE Edit Distance in Near Optimal Time

Authors: Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, Przemysław Uznański



File

LIPIcs.MFCS.2019.66.pdf
  • Filesize: 0.54 MB
  • 13 pages

Author Details

Raphaël Clifford
  • Department of Computer Science, University of Bristol, UK
Paweł Gawrychowski
  • Institute of Computer Science, University of Wrocław, Poland
Tomasz Kociumaka
  • Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
  • Institute of Informatics, University of Warsaw, Poland
Daniel P. Martin
  • The Alan Turing Institute, British Library, London, UK
Przemysław Uznański
  • Institute of Computer Science, University of Wrocław, Poland

Cite As

Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, and Przemysław Uznański. RLE Edit Distance in Near Optimal Time. In 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 138, pp. 66:1-66:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.MFCS.2019.66

Abstract

We show that the edit distance between two run-length encoded strings of compressed lengths m and n, respectively, can be computed in O(mn log(mn)) time. This improves the previous record by a factor of O(n/log(mn)). The running time of our algorithm is within subpolynomial factors of optimal, subject to the standard SETH-hardness assumption. This effectively closes a line of algorithmic research that started in 1993.
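
To make the problem statement concrete, the following is a minimal sketch of the setting only, not the algorithm of the paper. It run-length encodes a string and computes the edit distance by first decoding both inputs and running the textbook dynamic program, which takes time proportional to the product of the uncompressed lengths rather than the compressed lengths m and n. The function names (rle_encode, rle_decode, edit_distance, rle_edit_distance_baseline) are hypothetical and chosen for illustration.

```python
from itertools import groupby


def rle_encode(s):
    """Return the run-length encoding of s as a list of (char, run_length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def rle_decode(runs):
    """Expand a list of (char, run_length) pairs back into a plain string."""
    return "".join(ch * length for ch, length in runs)


def edit_distance(a, b):
    """Textbook unit-cost edit distance (insert, delete, substitute).

    Keeps only two rows of the dynamic programming table.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def rle_edit_distance_baseline(runs_a, runs_b):
    """Naive baseline (an assumption for illustration): decode, then run the plain DP.

    This ignores the run structure entirely, so its cost depends on the
    uncompressed lengths, which is what algorithms for this problem avoid.
    """
    return edit_distance(rle_decode(runs_a), rle_decode(runs_b))
```

For example, rle_encode("aaabbc") yields three runs, so a string of uncompressed length 6 has compressed length 3. A baseline like this is mainly useful as a reference for checking a compressed-domain implementation on small inputs.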

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
  • Theory of computation → Data structures design and analysis
Keywords
  • String algorithms
  • Compression
  • Pattern matching
  • Run-length encoding
