Polynomial-Time Equivalences and Refined Algorithms for Longest Common Subsequence Variants

Authors Yuichi Asahiro, Jesper Jansson, Guohui Lin, Eiji Miyano, Hirotaka Ono, Tadatoshi Utashima




File

LIPIcs.CPM.2022.15.pdf
  • Filesize: 0.78 MB
  • 17 pages


Author Details

Yuichi Asahiro
  • Kyushu Sangyo University, Fukuoka, Japan
Jesper Jansson
  • Kyoto University, Japan
Guohui Lin
  • University of Alberta, Edmonton, Canada
Eiji Miyano
  • Kyushu Institute of Technology, Iizuka, Japan
Hirotaka Ono
  • Nagoya University, Japan
Tadatoshi Utashima
  • Kyushu Institute of Technology, Iizuka, Japan

Cite As

Yuichi Asahiro, Jesper Jansson, Guohui Lin, Eiji Miyano, Hirotaka Ono, and Tadatoshi Utashima. Polynomial-Time Equivalences and Refined Algorithms for Longest Common Subsequence Variants. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 15:1-15:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.CPM.2022.15

Abstract

The problem of computing the longest common subsequence of two sequences (LCS for short) is a classical and fundamental problem in computer science. In this paper, we study four variants of LCS: the Repetition-Bounded Longest Common Subsequence problem (RBLCS) [Yuichi Asahiro et al., 2020], the Multiset-Restricted Common Subsequence problem (MRCS) [Radu Stefan Mincu and Alexandru Popa, 2018], the Two-Side-Filled Longest Common Subsequence problem (2FLCS), and the One-Side-Filled Longest Common Subsequence problem (1FLCS) [Mauro Castelli et al., 2017; Mauro Castelli et al., 2019]. Although the original LCS can be solved in polynomial time, all four variants are known to be NP-hard. Recently, an exact, O(1.44225ⁿ)-time, dynamic programming (DP)-based algorithm for RBLCS was proposed [Yuichi Asahiro et al., 2020], where the two input sequences have lengths n and poly(n). We first establish that each of MRCS, 1FLCS, and 2FLCS is polynomially equivalent to RBLCS. Then, we design a refined DP-based algorithm for RBLCS that runs in O(1.41422ⁿ) time, which implies that MRCS, 1FLCS, and 2FLCS can also be solved in O(1.41422ⁿ) time. Finally, we give a polynomial-time 2-approximation algorithm for 2FLCS.
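To make the contrast between the polynomial-time base problem and its hard variants concrete, here is a minimal Python sketch. It is not the algorithm from the paper. `lcs_length` is the textbook quadratic DP for classical LCS. `rblcs_bruteforce` is an exponential-time reference that assumes the usual reading of RBLCS, namely that each symbol may occur in the answer at most a given number of times. The function names, the `bound` dictionary, and the convention that a symbol missing from `bound` is not allowed at all are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def lcs_length(x: str, y: str) -> int:
    """Textbook O(|x|*|y|) dynamic program for the classical LCS length."""
    prev = [0] * (len(y) + 1)          # DP row for the previous prefix of x
    for a in x:
        curr = [0]                     # column 0: empty prefix of y
        for j, b in enumerate(y, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def is_subsequence(s: str, t: str) -> bool:
    """True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)   # each lookup consumes the iterator


def rblcs_bruteforce(x: str, y: str, bound: dict) -> int:
    """Exponential-time reference for the repetition-bounded variant.

    Assumption: a common subsequence is feasible if every symbol c
    occurs in it at most bound[c] times (0 if c is missing from bound).
    """
    for k in range(len(x), -1, -1):    # try the longest candidates first
        for idx in combinations(range(len(x)), k):
            cand = "".join(x[i] for i in idx)
            counts = Counter(cand)
            within_bounds = all(counts[c] <= bound.get(c, 0) for c in counts)
            if within_bounds and is_subsequence(cand, y):
                return k
    return 0
```

For example, with `x = "ABCBDAB"` and `y = "BDCABA"`, `lcs_length(x, y)` is 4, while bounding every symbol to a single occurrence with `rblcs_bruteforce(x, y, {"A": 1, "B": 1, "C": 1, "D": 1})` gives 3. The brute force enumerates all subsequences of `x`, so it is only practical for very short inputs and serves as a correctness reference rather than a practical method.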

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
Keywords
  • Repetition-bounded longest common subsequence problem
  • multiset restricted longest common subsequence problem
  • one-side-filled longest common subsequence problem
  • two-side-filled longest common subsequence problem
  • exact algorithms
  • approximation algorithms


References

  1. Said Sadique Adi, Marília D. V. Braga, Cristina G. Fernandes, Carlos Eduardo Ferreira, Fábio Viduani Martinez, Marie-France Sagot, Marco Aurelio Stefanes, Christian Tjandraatmadja, and Yoshiko Wakabayashi. Repetition-free longest common subsequence. Electron. Notes Discret. Math., 30:243-248, 2008. URL: https://doi.org/10.1016/j.endm.2008.01.042.
  2. Yuichi Asahiro, Jesper Jansson, Guohui Lin, Eiji Miyano, Hirotaka Ono, and Tadatoshi Utashima. Exact algorithms for the repetition-bounded longest common subsequence problem. Theor. Comput. Sci., 838:238-249, 2020. URL: https://doi.org/10.1016/j.tcs.2020.07.042.
  3. Lasse Bergroth, Harri Hakonen, and Timo Raita. A survey of longest common subsequence algorithms. In Pablo de la Fuente, editor, Seventh International Symposium on String Processing and Information Retrieval, SPIRE 2000, A Coruña, Spain, September 27-29, 2000, pages 39-48. IEEE Computer Society, 2000. URL: https://doi.org/10.1109/SPIRE.2000.878178.
  4. Laurent Bulteau, Falk Hüffner, Christian Komusiewicz, and Rolf Niedermeier. Multivariate algorithmics for NP-hard string problems: The algorithmics column by Gerhard J. Woeginger. Bull. EATCS, 114, 2014. URL: http://eatcs.org/beatcs/index.php/beatcs/article/view/310.
  5. Mauro Castelli, Riccardo Dondi, Giancarlo Mauri, and Italo Zoppis. The longest filled common subsequence problem. In Juha Kärkkäinen, Jakub Radoszewski, and Wojciech Rytter, editors, 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, July 4-6, 2017, Warsaw, Poland, volume 78 of LIPIcs, pages 14:1-14:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. URL: https://doi.org/10.4230/LIPIcs.CPM.2017.14.
  6. Mauro Castelli, Riccardo Dondi, Giancarlo Mauri, and Italo Zoppis. Comparing incomplete sequences via longest common subsequence. Theor. Comput. Sci., 796:272-285, 2019. URL: https://doi.org/10.1016/j.tcs.2019.09.022.
  7. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, 4th Edition. MIT Press, 2022. URL: http://mitpress.mit.edu/books/introduction-algorithms-fourth-edition.
  8. Daniel S. Hirschberg. A linear space algorithm for computing maximal common subsequences. Commun. ACM, 18(6):341-343, 1975. URL: https://doi.org/10.1145/360825.360861.
  9. Daniel S. Hirschberg. Algorithms for the longest common subsequence problem. J. ACM, 24(4):664-675, 1977. URL: https://doi.org/10.1145/322033.322044.
  10. Haitao Jiang, Chunfang Zheng, David Sankoff, and Binhai Zhu. Scaffold filling under the breakpoint and related distances. IEEE ACM Trans. Comput. Biol. Bioinform., 9(4):1220-1229, 2012. URL: https://doi.org/10.1109/TCBB.2012.57.
  11. Radu Stefan Mincu and Alexandru Popa. Better heuristic algorithms for the repetition free LCS and other variants. In Travis Gagie, Alistair Moffat, Gonzalo Navarro, and Ernesto Cuadros-Vargas, editors, String Processing and Information Retrieval - 25th International Symposium, SPIRE 2018, Lima, Peru, October 9-11, 2018, Proceedings, volume 11147 of Lecture Notes in Computer Science, pages 297-310. Springer, 2018. URL: https://doi.org/10.1007/978-3-030-00479-8_24.
  12. Radu Stefan Mincu and Alexandru Popa. Heuristic algorithms for the longest filled common subsequence problem. In 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2018, Timisoara, Romania, September 20-23, 2018, pages 449-453. IEEE, 2018. URL: https://doi.org/10.1109/SYNASC.2018.00075.
  13. Adriana Muñoz, Chunfang Zheng, Qian Zhu, Victor A. Albert, Steve Rounsley, and David Sankoff. Scaffold filling, contig fusion and comparative gene order inference. BMC Bioinform., 11:304, 2010. URL: https://doi.org/10.1186/1471-2105-11-304.
  14. Saul B. Needleman and Christian D. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3):443-453, 1970.
  15. David Sankoff. Matching sequences under deletion/insertion constraints. In Proc. National Academy of Science USA, volume 69, pages 4-6, 1972. URL: https://doi.org/10.1073/pnas.69.1.4.
  16. Robert A. Wagner and Michael J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168-173, 1974. URL: https://doi.org/10.1145/321796.321811.