Hardness of Approximation of (Multi-)LCS over Small Alphabet

Authors Amey Bhangale, Diptarka Chakraborty, Rajendra Kumar




File

LIPIcs.APPROX-RANDOM.2020.38.pdf
  • Filesize: 0.51 MB
  • 16 pages


Author Details

Amey Bhangale
  • University of California Riverside, CA, USA
Diptarka Chakraborty
  • National University of Singapore, Singapore
Rajendra Kumar
  • IIT Kanpur, India
  • National University of Singapore, Singapore

Acknowledgements

The authors would like to thank the anonymous reviewers for providing helpful comments on an earlier version of this paper, and especially for pointing out a small technical mistake in the proof of Lemma 14. The authors would also like to thank Pasin Manurangsi for pointing out that for certain regimes no hardness result is known for the densest k-subgraph problem.

Cite As

Amey Bhangale, Diptarka Chakraborty, and Rajendra Kumar. Hardness of Approximation of (Multi-)LCS over Small Alphabet. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 38:1-38:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2020.38

Abstract

The problem of finding a longest common subsequence (LCS) is one of the fundamental problems in computer science, with applications in fields such as computational biology, text processing, information retrieval, and data compression. It is well known that (the decision version of) the problem of finding the length of an LCS of an arbitrary number of input sequences (which we refer to as the Multi-LCS problem) is NP-complete. Jiang and Li [SICOMP'95] showed that if Max-Clique is hard to approximate within a factor of s, then Multi-LCS is also hard to approximate within a factor of Θ(s). By the NP-hardness of approximating Max-Clique due to Zuckerman [ToC'07], for any constant δ > 0, the length of an LCS of an arbitrary number of input sequences of length n each cannot be approximated within an n^{1-δ}-factor in polynomial time unless P = NP. However, the reduction of Jiang and Li assumes the alphabet size to be Ω(n). So far, no hardness result is known for the problem of approximating Multi-LCS over a sub-linear sized alphabet. On the other hand, it is easy to get a 1/|Σ|-factor approximation for strings over an alphabet Σ. In this paper, we make significant progress towards proving hardness of approximation over a small alphabet by showing a polynomial-time reduction from the well-studied densest k-subgraph problem with perfect completeness to approximating Multi-LCS over an alphabet of size poly(n/k). As a consequence, from the known hardness result for the densest k-subgraph problem (e.g. [Manurangsi, STOC'17]), we get that no polynomial-time algorithm can give an n^{-o(1)}-factor approximation of Multi-LCS over an alphabet of size n^{o(1)}, unless the Exponential Time Hypothesis is false.
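The abstract states the 1/|Σ|-factor approximation without spelling it out. One standard way to obtain such a guarantee, sketched here as an illustration and not taken from the paper, is to return the best common subsequence that uses a single symbol. Any LCS of length L uses some symbol at least L/|Σ| times, and that symbol then occurs at least that often in every input string. The function name `single_symbol_common_subsequence` and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def single_symbol_common_subsequence(strings):
    """Return a longest common subsequence that uses only one symbol.

    Illustrative sketch (not from the paper): its length is at least
    1/|alphabet| times the length of a true LCS of all input strings.
    """
    if not strings:
        return ""
    counts = [Counter(s) for s in strings]
    # Only symbols present in every string can appear in a common subsequence.
    common = set(counts[0])
    for c in counts[1:]:
        common &= set(c)
    if not common:
        return ""
    # For each shared symbol, the usable number of copies is its minimum
    # count over all strings; ties are broken by alphabetical order.
    best = max(sorted(common), key=lambda a: min(c[a] for c in counts))
    return best * min(c[best] for c in counts)


if __name__ == "__main__":
    print(single_symbol_common_subsequence(["abcab", "bacba", "aabbc"]))
```

The sketch runs in time linear in the total input length, which is the sense in which this approximation is "easy"; the hardness results in the abstract concern doing substantially better than such trivial bounds.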

Subject Classification

ACM Subject Classification
  • Theory of computation → Problems, reductions and completeness
Keywords
  • Longest common subsequence
  • Hardness of approximation
  • ETH-hardness
  • Densest k-subgraph problem


References

  1. Amir Abboud and Arturs Backurs. Towards hardness of approximation for polynomial time problems. In 8th Innovations in Theoretical Computer Science Conference, ITCS 2017, January 9-11, 2017, Berkeley, CA, USA, pages 11:1-11:26, 2017.
  2. Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 59-78, 2015.
  3. Amir Abboud, Thomas Dueholm Hansen, Virginia Vassilevska Williams, and Ryan Williams. Simulating branching programs with edit distance and friends: or: a polylog shaved is a lower bound made. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, pages 375-388, 2016.
  4. Amir Abboud and Aviad Rubinstein. Fast and deterministic constant factor approximation algorithms for LCS imply new circuit lower bounds. In 9th Innovations in Theoretical Computer Science Conference, ITCS 2018, January 11-14, 2018, Cambridge, MA, USA, pages 35:1-35:14, 2018.
  5. Lasse Bergroth, Harri Hakonen, and Timo Raita. A survey of longest common subsequence algorithms. In Pablo de la Fuente, editor, Seventh International Symposium on String Processing and Information Retrieval, SPIRE 2000, A Coruña, Spain, September 27-29, 2000, pages 39-48. IEEE Computer Society, 2000.
  6. Aditya Bhaskara, Moses Charikar, Eden Chlamtac, Uriel Feige, and Aravindan Vijayaraghavan. Detecting high log-densities: an O(n^{1/4}) approximation for densest k-subgraph. In Leonard J. Schulman, editor, Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5-8 June 2010, pages 201-210. ACM, 2010.
  7. Guillaume Blin, Laurent Bulteau, Minghui Jiang, Pedro J. Tejada, and Stéphane Vialette. Hardness of longest common subsequence for sequences with bounded run-lengths. In Combinatorial Pattern Matching - 23rd Annual Symposium, CPM 2012, Helsinki, Finland, July 3-5, 2012. Proceedings, pages 138-148, 2012.
  8. Mark Braverman, Young Kun Ko, Aviad Rubinstein, and Omri Weinstein. ETH hardness for densest-k-subgraph with perfect completeness. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1326-1341. SIAM, 2017.
  9. Kuan Cheng, Bernhard Haeupler, Xin Li, Amirbehshad Shahrasbi, and Ke Wu. Synchronization strings: Highly efficient deterministic constructions over small alphabets. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 2185-2204, 2019.
  10. Uriel Feige, David Peleg, and Guy Kortsarz. The dense k-subgraph problem. Algorithmica, 29(3):410-421, 2001.
  11. Uriel Feige and Michael Seltser. On the densest k-subgraph problem. Algorithmica, 29, 1997.
  12. Szymon Grabowski. New tabulation and sparse dynamic programming based techniques for sequence similarity problems. Discrete Applied Mathematics, 212:96-103, 2016.
  13. Dan Gusfield. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology. Cambridge University Press, 1997.
  14. Bernhard Haeupler and Amirbehshad Shahrasbi. Synchronization strings: explicit constructions, local decoding, and applications. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 841-854, 2018.
  15. MohammadTaghi Hajiaghayi, Masoud Seddighin, Saeed Seddighin, and Xiaorui Sun. Approximating LCS in linear time: Beating the √n barrier. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 1181-1200, 2019.
  16. D.S. Hirschberg. Recent results on the complexity of common subsequence problems. In Time Warps, String Edits, and Macromolecules, D. Sankoff and J.B. Kruskal, ed., Addison-Wesley, pages 323-328, 1983.
  17. Russell Impagliazzo and Ramamohan Paturi. On the complexity of k-SAT. Journal of Computer and System Sciences, 62(2):367-375, 2001.
  18. Tao Jiang and Ming Li. On the approximation of shortest common supersequences and longest common subsequences. SIAM J. on Computing, 24(5):1122-1139, 1995.
  19. Subhash Khot. Ruling out PTAS for graph min-bisection, dense k-subgraph, and bipartite clique. SIAM Journal on Computing, 36(4):1025-1071, 2006.
  20. Marcos Kiwi, Martin Loebl, and Jiří Matoušek. Expected length of the longest common subsequence for large alphabets. Advances in Mathematics, 197(2):480-498, 2005.
  21. G. Kortsarz and D. Peleg. On choosing a dense subgraph. In Proceedings of the 1993 IEEE 34th Annual Foundations of Computer Science, pages 692-701. IEEE Computer Society, 1993.
  22. S. Lu and K. S. Fu. A sentence-to-sentence clustering procedure for pattern analysis. IEEE Transactions on Systems, Man, and Cybernetics, 8(5):381-389, May 1978.
  23. David Maier. The complexity of some problems on subsequences and supersequences. J. ACM, 25(2):322-336, April 1978.
  24. Pasin Manurangsi. Almost-polynomial ratio ETH-hardness of approximating densest k-subgraph. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 954-961. ACM, 2017.
  25. William J. Masek and Michael S. Paterson. A faster algorithm computing string edit distances. Journal of Computer and System Sciences, 20(1):18-31, 1980.
  26. Christos H. Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci., 43(3):425-440, 1991.
  27. Pavel A. Pevzner. Multiple alignment with guaranteed error bounds and communication cost. In Combinatorial Pattern Matching, Third Annual Symposium, CPM 92, Tucson, Arizona, USA, April 29 - May 1, 1992, Proceedings, pages 205-213, 1992.
  28. Prasad Raghavendra and David Steurer. Graph expansion and the unique games conjecture. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 755-764. ACM, 2010.
  29. Aviad Rubinstein and Zhao Song. Reducing approximate longest common subsequence to approximate edit distance. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 1591-1600. SIAM, 2020.
  30. Robert A. Wagner and Michael J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168-173, January 1974.
  31. David Zuckerman. Linear degree extractors and the inapproximability of max clique and chromatic number. Theory of Computing, 3(6):103-128, 2007.