# Can You Solve Closest String Faster Than Exhaustive Search?

## Cite As

Amir Abboud, Nick Fischer, Elazar Goldenberg, Karthik C. S., and Ron Safier. Can You Solve Closest String Faster Than Exhaustive Search? In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 3:1-3:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ESA.2023.3

## Abstract

We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set X ⊆ Σ^d of n strings, find the string x^* minimizing the radius of the smallest Hamming ball around x^* that encloses all the strings in X. In this paper, we investigate whether the Closest String problem admits algorithms that are faster than the trivial exhaustive search algorithm. We obtain the following results for the two natural versions of the problem:

• In the continuous Closest String problem, the goal is to find the solution string x^* anywhere in Σ^d. For binary strings, the exhaustive search algorithm runs in time O(2^d poly(nd)) and we prove that it cannot be improved to time O(2^{(1-ε) d} poly(nd)), for any ε > 0, unless the Strong Exponential Time Hypothesis fails.
• In the discrete Closest String problem, x^* is required to be in the input set X. While this problem is clearly in polynomial time, its fine-grained complexity has been pinpointed to be quadratic time n^{2 ± o(1)} whenever the dimension is ω(log n) < d < n^{o(1)}. We complement this known hardness result with new algorithms, proving essentially that whenever d falls out of this hard range, the discrete Closest String problem can be solved faster than exhaustive search. In the small-d regime, our algorithm is based on a novel application of the inclusion-exclusion principle.

Interestingly, all of our results apply (and some are even stronger) to the natural dual of the Closest String problem, called the Remotest String problem, where the task is to find a string maximizing the Hamming distance to all the strings in X.
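The two exhaustive-search baselines that the abstract compares against can be sketched in a few lines of Python. This is only an illustration of the trivial algorithms, not the paper's new algorithms or its reductions. The function names (`hamming`, `radius`, `continuous_closest_string`, `discrete_closest_string`) are hypothetical, and the sketch assumes all input strings have the same length d.

```python
from itertools import product


def hamming(a, b):
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def radius(center, strings):
    """Radius of the smallest Hamming ball around `center` enclosing all strings."""
    return max(hamming(center, s) for s in strings)


def continuous_closest_string(strings, alphabet="01"):
    """Trivial baseline for the continuous version: try every center in Sigma^d.

    For a binary alphabet this enumerates 2^d candidates, each checked against
    all n strings in O(nd) time, matching the O(2^d poly(nd)) bound.
    """
    d = len(strings[0])
    candidates = ("".join(c) for c in product(alphabet, repeat=d))
    best = min(candidates, key=lambda c: radius(c, strings))
    return best, radius(best, strings)


def discrete_closest_string(strings):
    """Trivial baseline for the discrete version: the center must be an input string.

    Comparing every candidate against every string takes O(n^2 d) time.
    """
    best = min(strings, key=lambda c: radius(c, strings))
    return best, radius(best, strings)
```

On the input `["0011", "1100"]` the two versions differ: the continuous search can pick a center at distance 2 from both strings, while the discrete search is forced to pick one of the inputs and so has radius 4.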

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Problems, reductions and completeness
##### Keywords
• Closest string
• fine-grained complexity
• SETH
• inclusion-exclusion
