DROPS

Document

DOI: 10.4230/LIPIcs.ESA.2023.3

Can You Solve Closest String Faster Than Exhaustive Search?

Authors: Amir Abboud, Nick Fischer, Elazar Goldenberg, Karthik C. S., and Ron Safier

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

Abstract

We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set X ⊆ Σ^d of n strings, find the string x^* minimizing the radius of the smallest Hamming ball around x^* that encloses all the strings in X. In this paper, we investigate whether the Closest String problem admits algorithms that are faster than the trivial exhaustive search algorithm. We obtain the following results for the two natural versions of the problem: - In the continuous Closest String problem, the goal is to find the solution string x^* anywhere in Σ^d. For binary strings, the exhaustive search algorithm runs in time O(2^d poly(nd)) and we prove that it cannot be improved to time O(2^{(1-ε) d} poly(nd)), for any ε > 0, unless the Strong Exponential Time Hypothesis fails. - In the discrete Closest String problem, x^* is required to be in the input set X. While this problem is clearly in polynomial time, its fine-grained complexity has been pinpointed to be quadratic time n^{2 ± o(1)} whenever the dimension is ω(log n) < d < n^o(1). We complement this known hardness result with new algorithms, proving essentially that whenever d falls out of this hard range, the discrete Closest String problem can be solved faster than exhaustive search. In the small-d regime, our algorithm is based on a novel application of the inclusion-exclusion principle. Interestingly, all of our results apply (and some are even stronger) to the natural dual of the Closest String problem, called the Remotest String problem, where the task is to find a string maximizing the Hamming distance to all the strings in X.

Cite as

Amir Abboud, Nick Fischer, Elazar Goldenberg, Karthik C. S., and Ron Safier. Can You Solve Closest String Faster Than Exhaustive Search?. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 3:1-3:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{abboud_et_al:LIPIcs.ESA.2023.3,
  author =	{Abboud, Amir and Fischer, Nick and Goldenberg, Elazar and Karthik C. S. and Safier, Ron},
  title =	{{Can You Solve Closest String Faster Than Exhaustive Search?}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{3:1--3:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.3},
  URN =		{urn:nbn:de:0030-drops-186566},
  doi =		{10.4230/LIPIcs.ESA.2023.3},
  annote =	{Keywords: Closest string, fine-grained complexity, SETH, inclusion-exclusion}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2023.58

An Algorithmic Bridge Between Hamming and Levenshtein Distances

Authors: Elazar Goldenberg, Tomasz Kociumaka, Robert Krauthgamer, and Barna Saha

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

Abstract

The edit distance between strings classically assigns unit cost to every character insertion, deletion, and substitution, whereas the Hamming distance only allows substitutions. In many real-life scenarios, insertions and deletions (abbreviated indels) appear frequently but significantly less so than substitutions. To model this, we consider substitutions being cheaper than indels, with cost 1/a for a parameter a ≥ 1. This basic variant, denoted ED_a, bridges classical edit distance (a = 1) with Hamming distance (a → ∞), leading to interesting algorithmic challenges: Does the time complexity of computing ED_a interpolate between that of Hamming distance (linear time) and edit distance (quadratic time)? What about approximating ED_a? We first present a simple deterministic exact algorithm for ED_a and further prove that it is near-optimal assuming the Orthogonal Vectors Conjecture. Our main result is a randomized algorithm computing a (1+ε)-approximation of ED_a(X,Y), given strings X,Y of total length n and a bound k ≥ ED_a(X,Y). For simplicity, let us focus on k ≥ 1 and a constant ε > 0; then, our algorithm takes Õ(n/a + ak³) time. Unless a = Õ(1), in which case ED_a resembles the standard edit distance, and for the most interesting regime of small enough k, this running time is sublinear in n. We also consider a very natural version that asks to find a (k_I, k_S)-alignment, i.e., an alignment with at most k_I indels and k_S substitutions. In this setting, we give an exact algorithm and, more importantly, an Õ((nk_I)/k_S + k_S k_I³)-time (1,1+ε)-bicriteria approximation algorithm. The latter solution is based on the techniques we develop for ED_a for a = Θ(k_S/k_I), and its running time is again sublinear in n whenever k_I ≪ k_S and the overall distance is small enough. These bounds are in stark contrast to unit-cost edit distance, where state-of-the-art algorithms are far from achieving (1+ε)-approximation in sublinear time, even for a favorable choice of k.

Cite as

Elazar Goldenberg, Tomasz Kociumaka, Robert Krauthgamer, and Barna Saha. An Algorithmic Bridge Between Hamming and Levenshtein Distances. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 58:1-58:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{goldenberg_et_al:LIPIcs.ITCS.2023.58,
  author =	{Goldenberg, Elazar and Kociumaka, Tomasz and Krauthgamer, Robert and Saha, Barna},
  title =	{{An Algorithmic Bridge Between Hamming and Levenshtein Distances}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{58:1--58:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.58},
  URN =		{urn:nbn:de:0030-drops-175615},
  doi =		{10.4230/LIPIcs.ITCS.2023.58},
  annote =	{Keywords: edit distance, Hamming distance, Longest Common Extension queries}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2020.1

Hardness Amplification of Optimization Problems

Authors: Elazar Goldenberg and Karthik C. S.

Published in: LIPIcs, Volume 151, 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)

Abstract

In this paper, we prove a general hardness amplification scheme for optimization problems based on the technique of direct products. We say that an optimization problem Π is direct product feasible if it is possible to efficiently aggregate any k instances of Π and form one large instance of Π such that given an optimal feasible solution to the larger instance, we can efficiently find optimal feasible solutions to all the k smaller instances. Given a direct product feasible optimization problem Π, our hardness amplification theorem may be informally stated as follows: If there is a distribution D over instances of Π of size n such that every randomized algorithm running in time t(n) fails to solve Π on 1/α(n) fraction of inputs sampled from D, then, assuming some relationships on α(n) and t(n), there is a distribution D' over instances of Π of size O(n⋅α(n)) such that every randomized algorithm running in time t(n)/poly(α(n)) fails to solve Π on 99/100 fraction of inputs sampled from D'. As a consequence of the above theorem, we show hardness amplification of problems in various classes such as NP-hard problems like Max-Clique, Knapsack, and Max-SAT, problems in P such as Longest Common Subsequence, Edit Distance, Matrix Multiplication, and even problems in TFNP such as Factoring and computing Nash equilibrium.

Cite as

Elazar Goldenberg and Karthik C. S.. Hardness Amplification of Optimization Problems. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 1:1-1:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{goldenberg_et_al:LIPIcs.ITCS.2020.1,
  author =	{Goldenberg, Elazar and Karthik C. S.},
  title =	{{Hardness Amplification of Optimization Problems}},
  booktitle =	{11th Innovations in Theoretical Computer Science Conference (ITCS 2020)},
  pages =	{1:1--1:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-134-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{151},
  editor =	{Vidick, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2020.1},
  URN =		{urn:nbn:de:0030-drops-116863},
  doi =		{10.4230/LIPIcs.ITCS.2020.1},
  annote =	{Keywords: hardness amplification, average case complexity, direct product, optimization problems, fine-grained complexity, TFNP}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2018.11

Towards a General Direct Product Testing Theorem

Authors: Elazar Goldenberg and Karthik C. S.

Published in: LIPIcs, Volume 122, 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2018)

Abstract

The Direct Product encoding of a string a in {0,1}^n on an underlying domain V subseteq ([n] choose k), is a function DP_V(a) which gets as input a set S in V and outputs a restricted to S. In the Direct Product Testing Problem, we are given a function F:V -> {0,1}^k, and our goal is to test whether F is close to a direct product encoding, i.e., whether there exists some a in {0,1}^n such that on most sets S, we have F(S)=DP_V(a)(S). A natural test is as follows: select a pair (S,S')in V according to some underlying distribution over V x V, query F on this pair, and check for consistency on their intersection. Note that the above distribution may be viewed as a weighted graph over the vertex set V and is referred to as a test graph. The testability of direct products was studied over various domains and test graphs: Dinur and Steurer (CCC '14) analyzed it when V equals the k-th slice of the Boolean hypercube and the test graph is a member of the Johnson graph family. Dinur and Kaufman (FOCS '17) analyzed it for the case where V is the set of faces of a Ramanujan complex, where in this case V=O_k(n). In this paper, we study the testability of direct products in a general setting, addressing the question: what properties of the domain and the test graph allow one to prove a direct product testing theorem? Towards this goal we introduce the notion of coordinate expansion of a test graph. Roughly speaking a test graph is a coordinate expander if it has global and local expansion, and has certain nice intersection properties on sampling. We show that whenever the test graph has coordinate expansion then it admits a direct product testing theorem. Additionally, for every k and n we provide a direct product domain V subseteq (n choose k) of size n, called the Sliding Window domain for which we prove direct product testability.

Cite as

Elazar Goldenberg and Karthik C. S.. Towards a General Direct Product Testing Theorem. In 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 122, pp. 11:1-11:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{goldenberg_et_al:LIPIcs.FSTTCS.2018.11,
  author =	{Goldenberg, Elazar and C. S., Karthik},
  title =	{{Towards a General Direct Product Testing Theorem}},
  booktitle =	{38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2018)},
  pages =	{11:1--11:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-093-4},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{122},
  editor =	{Ganguly, Sumit and Pandya, Paritosh},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2018.11},
  URN =		{urn:nbn:de:0030-drops-99105},
  doi =		{10.4230/LIPIcs.FSTTCS.2018.11},
  annote =	{Keywords: Property Testing, Direct Product, PCP, Johnson graph, Ramanujan Complex, Derandomization}
}

Search Results

Documents authored by Goldenberg, Elazar

Can You Solve Closest String Faster Than Exhaustive Search?

Abstract

Cite as

An Algorithmic Bridge Between Hamming and Levenshtein Distances

Abstract

Cite as

Hardness Amplification of Optimization Problems

Abstract

Cite as

Towards a General Direct Product Testing Theorem

Abstract

Cite as