Minimum Common String Partition: Exact Algorithms

Authors Marek Cygan , Alexander S. Kulikov , Ivan Mihajlin, Maksim Nikolaev , Grigory Reznikov



PDF
Thumbnail PDF

File

LIPIcs.ESA.2021.35.pdf
  • Filesize: 0.71 MB
  • 16 pages

Document Identifiers

Author Details

Marek Cygan
  • University of Warsaw, Poland
Alexander S. Kulikov
  • Steklov Mathematical Institute at St. Petersburg, Russian Academy of Sciences, Russia
  • St. Petersburg State University, Russia
Ivan Mihajlin
  • Steklov Mathematical Institute at St. Petersburg, Russian Academy of Sciences, Russia
Maksim Nikolaev
  • Steklov Mathematical Institute at St. Petersburg, Russian Academy of Sciences, Russia
Grigory Reznikov
  • National Research University Higher School of Economics, St. Petersburg, Russia

Cite As Get BibTex

Marek Cygan, Alexander S. Kulikov, Ivan Mihajlin, Maksim Nikolaev, and Grigory Reznikov. Minimum Common String Partition: Exact Algorithms. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021) https://doi.org/10.4230/LIPIcs.ESA.2021.35

Abstract

In the minimum common string partition problem (MCSP), one gets two strings and is asked to find the minimum number of cuts in the first string such that the second string can be obtained by rearranging the resulting pieces. It is a difficult algorithmic problem having applications in computational biology, text processing, and data compression. MCSP has been studied extensively from various algorithmic angles: there are many papers studying approximation, heuristic, and parameterized algorithms. At the same time, almost nothing is known about its exact complexity. In this paper, we present new results in this direction. We improve the known 2ⁿ upper bound (where n is the length of input strings) to ϕⁿ where ϕ ≈ 1.618... is the golden ratio. The algorithm uses Fibonacci numbers to encode subsets as monomials of a certain implicit polynomial and extracts one of its coefficients using the fast Fourier transform. Then, we show that the case of constant size alphabet can be solved in subexponential time 2^{O(nlog log n/log n)} by a hybrid strategy: enumerate all long pieces and use dynamic programming over histograms of short pieces. Finally, we prove almost matching lower bounds assuming the Exponential Time Hypothesis.

Subject Classification

ACM Subject Classification
  • Theory of computation → Parameterized complexity and exact algorithms
  • Theory of computation → Algorithm design techniques
Keywords
  • similarity measure
  • string distance
  • exact algorithms
  • upper bounds
  • lower bounds

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Andreas Björklund, Thore Husfeldt, and Mikko Koivisto. Set partitioning via inclusion-exclusion. SIAM J. Comput., 39(2):546-563, 2009. URL: https://doi.org/10.1137/070683933.
  2. Laurent Bulteau, Guillaume Fertin, Christian Komusiewicz, and Irena Rusu. A fixed-parameter algorithm for minimum common string partition with few duplications. In Aaron E. Darling and Jens Stoye, editors, Algorithms in Bioinformatics - 13th International Workshop, WABI 2013, Sophia Antipolis, France, September 2-4, 2013. Proceedings, volume 8126 of Lecture Notes in Computer Science, pages 244-258. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-40453-5_19.
  3. Laurent Bulteau and Christian Komusiewicz. Minimum common string partition parameterized by partition size is fixed-parameter tractable. In Chandra Chekuri, editor, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014, pages 102-121. SIAM, 2014. URL: https://doi.org/10.1137/1.9781611973402.8.
  4. Marek Chrobak, Petr Kolman, and Jirí Sgall. The greedy algorithm for the minimum common string partition problem. ACM Trans. Algorithms, 1(2):350-366, 2005. URL: https://doi.org/10.1145/1103963.1103971.
  5. Graham Cormode and S. Muthukrishnan. The string edit distance matching problem with moves. ACM Trans. Algorithms, 3(1):2:1-2:19, 2007. URL: https://doi.org/10.1145/1219944.1219947.
  6. Peter Damaschke. Minimum common string partition parameterized. In Keith A. Crandall and Jens Lagergren, editors, Algorithms in Bioinformatics, 8th International Workshop, WABI 2008, Karlsruhe, Germany, September 15-19, 2008. Proceedings, volume 5251 of Lecture Notes in Computer Science, pages 87-98. Springer, 2008. URL: https://doi.org/10.1007/978-3-540-87361-7_8.
  7. Bin Fu, Haitao Jiang, Boting Yang, and Binhai Zhu. Exponential and polynomial time algorithms for the minimum common string partition problem. In Weifan Wang, Xuding Zhu, and Ding-Zhu Du, editors, Combinatorial Optimization and Applications - 5th International Conference, COCOA 2011, Zhangjiajie, China, August 4-6, 2011. Proceedings, volume 6831 of Lecture Notes in Computer Science, pages 299-310. Springer, 2011. URL: https://doi.org/10.1007/978-3-642-22616-8_24.
  8. Avraham Goldstein, Petr Kolman, and Jie Zheng. Minimum common string partition problem: Hardness and approximations. Electron. J. Comb., 12, 2005. URL: http://www.combinatorics.org/Volume_12/Abstracts/v12i1r50.html.
  9. Isaac Goldstein and Moshe Lewenstein. Quick greedy computation for minimum common string partition. Theor. Comput. Sci., 542:98-107, 2014. URL: https://doi.org/10.1016/j.tcs.2014.05.006.
  10. Alexander Golovnev, Alexander S. Kulikov, and Ivan Mihajlin. Families with infants: Speeding up algorithms for NP-hard problems using FFT. ACM Trans. Algorithms, 12(3):35:1-35:17, 2016. URL: https://doi.org/10.1145/2847419.
  11. Haitao Jiang, Binhai Zhu, Daming Zhu, and Hong Zhu. Minimum common string partition revisited. J. Comb. Optim., 23(4):519-527, 2012. URL: https://doi.org/10.1007/s10878-010-9370-2.
  12. David S. Johnson and Mario Szegedy. What are the least tractable instances of max independent set? In Robert Endre Tarjan and Tandy J. Warnow, editors, Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms, 17-19 January 1999, Baltimore, Maryland, USA, pages 927-928. ACM/SIAM, 1999. URL: http://dl.acm.org/citation.cfm?id=314500.315093.
  13. Haim Kaplan and Nira Shafrir. The greedy algorithm for edit distance with moves. Inf. Process. Lett., 97(1):23-27, 2006. URL: https://doi.org/10.1016/j.ipl.2005.08.010.
  14. Mikko Koivisto. Optimal 2-constraint satisfaction via sum-product algorithms. Inf. Process. Lett., 98(1):24-28, 2006. URL: https://doi.org/10.1016/j.ipl.2005.11.013.
  15. Petr Kolman and Tomasz Walen. Reversal distance for strings with duplicates: Linear time approximation using hitting set. Electron. J. Comb., 14(1), 2007. URL: http://www.combinatorics.org/Volume_14/Abstracts/v14i1r50.html.
  16. Leopold Kronecker. Grundzüge einer arithmetischen theorie der algebraischen grössen. J. Reine Angew. Math., 92:1-122, 1882. Google Scholar
  17. Dana Shapira and James A. Storer. Edit distance with move operations. J. Discrete Algorithms, 5(2):380-392, 2007. URL: https://doi.org/10.1016/j.jda.2005.01.010.
  18. Ryan Williams. A new algorithm for optimal 2-constraint satisfaction and its implications. Theor. Comput. Sci., 348(2-3):357-365, 2005. URL: https://doi.org/10.1016/j.tcs.2005.09.023.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail