Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

We design new parallel algorithms for clustering in high-dimensional Euclidean spaces. These algorithms run in the Massively Parallel Computation (MPC) model, and are fully scalable, meaning that the local memory in each machine may be n^σ for arbitrarily small fixed σ > 0. Importantly, the local memory may be substantially smaller than the number of clusters k, yet all our algorithms are fast, i.e., run in O(1) rounds.
We first devise a fast MPC algorithm for O(1)-approximation of uniform Facility Location. This is the first fully-scalable MPC algorithm that achieves O(1)-approximation for any clustering problem in general geometric setting; previous algorithms only provide poly(log n)-approximation or apply to restricted inputs, like low dimension or small number of clusters k; e.g. [Bhaskara and Wijewardena, ICML'18; Cohen-Addad et al., NeurIPS'21; Cohen-Addad et al., ICML'22]. We then build on this Facility Location result and devise a fast MPC algorithm that achieves O(1)-bicriteria approximation for k-Median and for k-Means, namely, it computes (1+ε)k clusters of cost within O(1/ε²)-factor of the optimum for k clusters.
A primary technical tool that we introduce, and may be of independent interest, is a new MPC primitive for geometric aggregation, namely, computing for every data point a statistic of its approximate neighborhood, for statistics like range counting and nearest-neighbor search. Our implementation of this primitive works in high dimension, and is based on consistent hashing (aka sparse partition), a technique that was recently used for streaming algorithms [Czumaj et al., FOCS'22].

Artur Czumaj, Guichen Gao, Shaofeng H.-C. Jiang, Robert Krauthgamer, and Pavel Veselý. Fully-Scalable MPC Algorithms for Clustering in High Dimension. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 50:1-50:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{czumaj_et_al:LIPIcs.ICALP.2024.50, author = {Czumaj, Artur and Gao, Guichen and Jiang, Shaofeng H.-C. and Krauthgamer, Robert and Vesel\'{y}, Pavel}, title = {{Fully-Scalable MPC Algorithms for Clustering in High Dimension}}, booktitle = {51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)}, pages = {50:1--50:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-322-5}, ISSN = {1868-8969}, year = {2024}, volume = {297}, editor = {Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.50}, URN = {urn:nbn:de:0030-drops-201938}, doi = {10.4230/LIPIcs.ICALP.2024.50}, annote = {Keywords: Massively parallel computing, high dimension, facility location, k-median, k-means} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

In cut sparsification, all cuts of a hypergraph H = (V,E,w) are approximated within 1±ε factor by a small hypergraph H'. This widely applied method was generalized recently to a setting where the cost of cutting each hyperedge e is provided by a splitting function g_e: 2^e → ℝ_+. This generalization is called a submodular hypergraph when the functions {g_e}_{e ∈ E} are submodular, and it arises in machine learning, combinatorial optimization, and algorithmic game theory.
Previous work studied the setting where H' is a reweighted sub-hypergraph of H, and measured the size of H' by the number of hyperedges in it. In this setting, we present two results: (i) all submodular hypergraphs admit sparsifiers of size polynomial in n = |V| and ε^{-1}; (ii) we propose a new parameter, called spread, and use it to obtain smaller sparsifiers in some cases.
We also show that for a natural family of splitting functions, relaxing the requirement that H' be a reweighted sub-hypergraph of H yields a substantially smaller encoding of the cuts of H (almost a factor n in the number of bits). This is in contrast to graphs, where the most succinct representation is attained by reweighted subgraphs. A new tool in our construction of succinct representation is the notion of deformation, where a splitting function g_e is decomposed into a sum of functions of small description, and we provide upper and lower bounds for deformation of common splitting functions.

Yotam Kenneth and Robert Krauthgamer. Cut Sparsification and Succinct Representation of Submodular Hypergraphs. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 97:1-97:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{kenneth_et_al:LIPIcs.ICALP.2024.97, author = {Kenneth, Yotam and Krauthgamer, Robert}, title = {{Cut Sparsification and Succinct Representation of Submodular Hypergraphs}}, booktitle = {51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)}, pages = {97:1--97:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-322-5}, ISSN = {1868-8969}, year = {2024}, volume = {297}, editor = {Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.97}, URN = {urn:nbn:de:0030-drops-202406}, doi = {10.4230/LIPIcs.ICALP.2024.97}, annote = {Keywords: Cut Sparsification, Submodular Hypergraphs, Succinct Representation} }

Document

**Published in:** LIPIcs, Volume 293, 40th International Symposium on Computational Geometry (SoCG 2024)

The Johnson-Lindenstrauss (JL) Lemma introduced the concept of dimension reduction via a random linear map, which has become a fundamental technique in many computational settings. For a set of n points in ℝ^d and any fixed ε > 0, it reduces the dimension d to O(log n) while preserving, with high probability, all the pairwise Euclidean distances within factor 1+ε. Perhaps surprisingly, the target dimension can be lower if one only wishes to preserve the optimal value of a certain problem on the pointset, e.g., Euclidean max-cut or k-means. However, for some notorious problems, like diameter (aka furthest pair), dimension reduction via the JL map to below O(log n) does not preserve the optimal value within factor 1+ε.
We propose to focus on another regime, of moderate dimension reduction, where a problem’s value is preserved within factor α > 1 using target dimension (log n)/poly(α). We establish the viability of this approach and show that the famous k-center problem is α-approximated when reducing to dimension O({log n}/α² + log k). Along the way, we address the diameter problem via the special case k = 1. Our result extends to several important variants of k-center (with outliers, capacities, or fairness constraints), and the bound improves further with the input’s doubling dimension.
While our poly(α)-factor improvement in the dimension may seem small, it actually has significant implications for streaming algorithms, and easily yields an algorithm for k-center in dynamic geometric streams, that achieves O(α)-approximation using space poly(kdn^{1/α²}). This is the first algorithm to beat O(n) space in high dimension d, as all previous algorithms require space at least exp(d). Furthermore, it extends to the k-center variants mentioned above.

Shaofeng H.-C. Jiang, Robert Krauthgamer, and Shay Sapir. Moderate Dimension Reduction for k-Center Clustering. In 40th International Symposium on Computational Geometry (SoCG 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 293, pp. 64:1-64:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{jiang_et_al:LIPIcs.SoCG.2024.64, author = {Jiang, Shaofeng H.-C. and Krauthgamer, Robert and Sapir, Shay}, title = {{Moderate Dimension Reduction for k-Center Clustering}}, booktitle = {40th International Symposium on Computational Geometry (SoCG 2024)}, pages = {64:1--64:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-316-4}, ISSN = {1868-8969}, year = {2024}, volume = {293}, editor = {Mulzer, Wolfgang and Phillips, Jeff M.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2024.64}, URN = {urn:nbn:de:0030-drops-200095}, doi = {10.4230/LIPIcs.SoCG.2024.64}, annote = {Keywords: Johnson-Lindenstrauss transform, dimension reduction, clustering, streaming algorithms} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)

Many streaming algorithms provide only a high-probability relative approximation. These two relaxations, of allowing approximation and randomization, seem necessary - for many streaming problems, both relaxations must be employed simultaneously, to avoid an exponentially larger (and often trivial) space complexity. A common drawback of these randomized approximate algorithms is that independent executions on the same input have different outputs, that depend on their random coins. Pseudo-deterministic algorithms combat this issue, and for every input, they output with high probability the same "canonical" solution.
We consider perhaps the most basic problem in data streams, of counting the number of items in a stream of length at most n. Morris’s counter [CACM, 1978] is a randomized approximation algorithm for this problem that uses O(log log n) bits of space, for every fixed approximation factor (greater than 1). Goldwasser, Grossman, Mohanty and Woodruff [ITCS 2020] asked whether pseudo-deterministic approximation algorithms can match this space complexity. Our main result answers their question negatively, and shows that such algorithms must use Ω(√{log n / log log n}) bits of space.
Our approach is based on a problem that we call Shift Finding, and may be of independent interest. In this problem, one has query access to a shifted version of a known string F ∈ {0,1}^{3n}, which is guaranteed to start with n zeros and end with n ones, and the goal is to find the unknown shift using a small number of queries. We provide for this problem an algorithm that uses O(√n) queries. It remains open whether poly(log n) queries suffice; if true, then our techniques immediately imply a nearly-tight Ω(log n/log log n) space bound for pseudo-deterministic approximate counting.

Vladimir Braverman, Robert Krauthgamer, Aditya Krishnan, and Shay Sapir. Lower Bounds for Pseudo-Deterministic Counting in a Stream. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 30:1-30:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{braverman_et_al:LIPIcs.ICALP.2023.30, author = {Braverman, Vladimir and Krauthgamer, Robert and Krishnan, Aditya and Sapir, Shay}, title = {{Lower Bounds for Pseudo-Deterministic Counting in a Stream}}, booktitle = {50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)}, pages = {30:1--30:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-278-5}, ISSN = {1868-8969}, year = {2023}, volume = {261}, editor = {Etessami, Kousha and Feige, Uriel and Puppis, Gabriele}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.30}, URN = {urn:nbn:de:0030-drops-180827}, doi = {10.4230/LIPIcs.ICALP.2023.30}, annote = {Keywords: streaming algorithms, pseudo-deterministic, approximate counting} }

Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

We study the classical metric k-median clustering problem over a set of input rankings (i.e., permutations), which has myriad applications, from social-choice theory to web search and databases. A folklore algorithm provides a 2-approximate solution in polynomial time for all k = O(1), and works irrespective of the underlying distance measure, so long it is a metric; however, going below the 2-factor is a notorious challenge. We consider the Ulam distance, a variant of the well-known edit-distance metric, where strings are restricted to be permutations. For this metric, Chakraborty, Das, and Krauthgamer [SODA, 2021] provided a (2-δ)-approximation algorithm for k = 1, where δ≈ 2^{-40}.
Our primary contribution is a new algorithmic framework for clustering a set of permutations. Our first result is a 1.999-approximation algorithm for the metric k-median problem under the Ulam metric, that runs in time (k log (nd))^{O(k)} nd³ for an input consisting of n permutations over [d]. In fact, our framework is powerful enough to extend this result to the streaming model (where the n input permutations arrive one by one) using only polylogarithmic (in n) space. Additionally, we show that similar results can be obtained even in the presence of outliers, which is presumably a more difficult problem.

Diptarka Chakraborty, Debarati Das, and Robert Krauthgamer. Clustering Permutations: New Techniques with Streaming Applications. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 31:1-31:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.ITCS.2023.31, author = {Chakraborty, Diptarka and Das, Debarati and Krauthgamer, Robert}, title = {{Clustering Permutations: New Techniques with Streaming Applications}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {31:1--31:24}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.31}, URN = {urn:nbn:de:0030-drops-175340}, doi = {10.4230/LIPIcs.ITCS.2023.31}, annote = {Keywords: Clustering, Approximation Algorithms, Ulam Distance, Rank Aggregation, Streaming} }

Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

The edit distance between strings classically assigns unit cost to every character insertion, deletion, and substitution, whereas the Hamming distance only allows substitutions. In many real-life scenarios, insertions and deletions (abbreviated indels) appear frequently but significantly less so than substitutions. To model this, we consider substitutions being cheaper than indels, with cost 1/a for a parameter a ≥ 1. This basic variant, denoted ED_a, bridges classical edit distance (a = 1) with Hamming distance (a → ∞), leading to interesting algorithmic challenges: Does the time complexity of computing ED_a interpolate between that of Hamming distance (linear time) and edit distance (quadratic time)? What about approximating ED_a?
We first present a simple deterministic exact algorithm for ED_a and further prove that it is near-optimal assuming the Orthogonal Vectors Conjecture. Our main result is a randomized algorithm computing a (1+ε)-approximation of ED_a(X,Y), given strings X,Y of total length n and a bound k ≥ ED_a(X,Y). For simplicity, let us focus on k ≥ 1 and a constant ε > 0; then, our algorithm takes Õ(n/a + ak³) time. Unless a = Õ(1), in which case ED_a resembles the standard edit distance, and for the most interesting regime of small enough k, this running time is sublinear in n.
We also consider a very natural version that asks to find a (k_I, k_S)-alignment, i.e., an alignment with at most k_I indels and k_S substitutions. In this setting, we give an exact algorithm and, more importantly, an Õ((nk_I)/k_S + k_S k_I³)-time (1,1+ε)-bicriteria approximation algorithm. The latter solution is based on the techniques we develop for ED_a for a = Θ(k_S/k_I), and its running time is again sublinear in n whenever k_I ≪ k_S and the overall distance is small enough.
These bounds are in stark contrast to unit-cost edit distance, where state-of-the-art algorithms are far from achieving (1+ε)-approximation in sublinear time, even for a favorable choice of k.

Elazar Goldenberg, Tomasz Kociumaka, Robert Krauthgamer, and Barna Saha. An Algorithmic Bridge Between Hamming and Levenshtein Distances. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 58:1-58:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{goldenberg_et_al:LIPIcs.ITCS.2023.58, author = {Goldenberg, Elazar and Kociumaka, Tomasz and Krauthgamer, Robert and Saha, Barna}, title = {{An Algorithmic Bridge Between Hamming and Levenshtein Distances}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {58:1--58:23}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.58}, URN = {urn:nbn:de:0030-drops-175615}, doi = {10.4230/LIPIcs.ITCS.2023.58}, annote = {Keywords: edit distance, Hamming distance, Longest Common Extension queries} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

We consider an important generalization of the Steiner tree problem, the Steiner forest problem, in the Euclidean plane: the input is a multiset X ⊆ ℝ², partitioned into k color classes C₁, C₂, …, Cₖ ⊆ X. The goal is to find a minimum-cost Euclidean graph G such that every color class Cᵢ is connected in G. We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to X. Each input point x ∈ X arrives with its color color(x) ∈ [k], and as usual for dynamic geometric streams, the input is restricted to the discrete grid {0, …, Δ}².
We design a single-pass streaming algorithm that uses poly(k ⋅ log Δ) space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio α₂ (currently 1.1547 ≤ α₂ ≤ 1.214). This approximation guarantee matches the state of the art bound for streaming Steiner tree, i.e., when k = 1. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and has so far not been applied in the streaming setting.
We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite approximation requires Ω(k) bits of space.

Artur Czumaj, Shaofeng H.-C. Jiang, Robert Krauthgamer, and Pavel Veselý. Streaming Algorithms for Geometric Steiner Forest. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 47:1-47:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{czumaj_et_al:LIPIcs.ICALP.2022.47, author = {Czumaj, Artur and Jiang, Shaofeng H.-C. and Krauthgamer, Robert and Vesel\'{y}, Pavel}, title = {{Streaming Algorithms for Geometric Steiner Forest}}, booktitle = {49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)}, pages = {47:1--47:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-235-8}, ISSN = {1868-8969}, year = {2022}, volume = {229}, editor = {Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.47}, URN = {urn:nbn:de:0030-drops-163880}, doi = {10.4230/LIPIcs.ICALP.2022.47}, annote = {Keywords: Steiner forest, streaming, sublinear algorithms, dynamic programming} }

Document

**Published in:** LIPIcs, Volume 213, 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)

We consider an approximate version of the trace reconstruction problem, where the goal is to recover an unknown string s ∈ {0,1}ⁿ from m traces (each trace is generated independently by passing s through a probabilistic insertion-deletion channel with rate p). We present a deterministic near-linear time algorithm for the average-case model, where s is random, that uses only three traces. It runs in near-linear time Õ(n) and with high probability reports a string within edit distance Õ(p² n) from s, which significantly improves over the straightforward bound of O(pn).
Technically, our algorithm computes a (1+ε)-approximate median of the three input traces. To prove its correctness, our probabilistic analysis shows that an approximate median is indeed close to the unknown s. To achieve a near-linear time bound, we have to bypass the well-known dynamic programming algorithm that computes an optimal median in time O(n³).

Diptarka Chakraborty, Debarati Das, and Robert Krauthgamer. Approximate Trace Reconstruction via Median String (In Average-Case). In 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 213, pp. 11:1-11:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{chakraborty_et_al:LIPIcs.FSTTCS.2021.11, author = {Chakraborty, Diptarka and Das, Debarati and Krauthgamer, Robert}, title = {{Approximate Trace Reconstruction via Median String (In Average-Case)}}, booktitle = {41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)}, pages = {11:1--11:23}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-215-0}, ISSN = {1868-8969}, year = {2021}, volume = {213}, editor = {Boja\'{n}czyk, Miko{\l}aj and Chekuri, Chandra}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2021.11}, URN = {urn:nbn:de:0030-drops-155228}, doi = {10.4230/LIPIcs.FSTTCS.2021.11}, annote = {Keywords: Trace Reconstruction, Approximation Algorithms, Edit Distance, String Median} }

Document

Invited Talk

**Published in:** LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

Graph-sketching algorithms summarize an input graph G in a manner that suffices to later answer (perhaps approximately) one or more optimization problems on G, like distances, cuts, and matchings. Two famous examples are the Gomory-Hu tree, which represents all the minimum st-cuts in a graph G using a tree on the same vertex set V(G); and the cut-sparsifier of Benczúr and Karger, which is a sparse graph (often a reweighted subgraph) that approximates every cut in G within factor 1±ε. Another genre of these problems limits the queries to designated terminal vertices, denoted T ⊆ V(G), and the sketch size depends on |T| instead of |V(G)|.
The talk will survey this topic, particularly cut and flow problems such as the three examples above. Currently, most known sketches are based on a graph representation, often called edge and vertex sparsification, which leaves room for potential improvements like smaller storage by using another representation, and faster running time to answer a query. These algorithms employ a host of techniques, ranging from combinatorial methods, like graph partitioning and edge or vertex sampling, to standard tools in data-stream algorithms and in sparse recovery. There are also several lower bounds known, either combinatorial (for the graph representation) or based on communication complexity and information theory.
Many of the recent efforts focus on characterizing the tradeoff between accuracy and sketch size, yet many intriguing and very accessible problems are still open, and I will describe them in the talk.

Robert Krauthgamer. Sketching Graphs and Combinatorial Optimization (Invited Talk). In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{krauthgamer:LIPIcs.ICALP.2020.2, author = {Krauthgamer, Robert}, title = {{Sketching Graphs and Combinatorial Optimization}}, booktitle = {47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)}, pages = {2:1--2:1}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-138-2}, ISSN = {1868-8969}, year = {2020}, volume = {168}, editor = {Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.2}, URN = {urn:nbn:de:0030-drops-124090}, doi = {10.4230/LIPIcs.ICALP.2020.2}, annote = {Keywords: Sketching, edge sparsification, vertex sparsification, Gomory-Hu tree, mimicking networks, graph sampling, succinct data structures} }

Document

Invited Talk

**Published in:** LIPIcs, Volume 150, 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019)

Graph-sketching algorithms summarize an input graph G in a manner that suffices to later answer (perhaps approximately) one or more optimization problems on G, like distances, cuts, and matchings. Two famous examples are the Gomory-Hu tree, which represents all the minimum st-cuts in a graph G using a tree on the same vertex set V(G); and the cut-sparsifier of Benczúr and Karger, which is a sparse graph (often a reweighted subgraph) that approximates every cut in G within factor 1 +/- epsilon. Another genre of these problems limits the queries to designated terminal vertices, denoted T subseteq V(G), and the sketch size depends on |T| instead of |V(G)|.
The talk will survey this topic, particularly cut and flow problems such as the three examples above. Currently, most known sketches are based on a graphical representation, often called edge and vertex sparsification, which leaves room for potential improvements like smaller storage by using another representation, and faster running time to answer a query. These algorithms employ a host of techniques, ranging from combinatorial methods, like graph partitioning and edge or vertex sampling, to standard tools in data-stream algorithms and in sparse recovery. There are also several lower bounds known, either combinatorial (for the graphical representation) or based on communication complexity and information theory.
Many of the recent efforts focus on characterizing the tradeoff between accuracy and sketch size, yet many intriguing and very accessible problems are still open, and I will describe them in the talk.

Robert Krauthgamer. Sketching Graphs and Combinatorial Optimization (Invited Talk). In 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 150, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{krauthgamer:LIPIcs.FSTTCS.2019.2, author = {Krauthgamer, Robert}, title = {{Sketching Graphs and Combinatorial Optimization}}, booktitle = {39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2019)}, pages = {2:1--2:1}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-131-3}, ISSN = {1868-8969}, year = {2019}, volume = {150}, editor = {Chattopadhyay, Arkadev and Gastin, Paul}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2019.2}, URN = {urn:nbn:de:0030-drops-115646}, doi = {10.4230/LIPIcs.FSTTCS.2019.2}, annote = {Keywords: Sketching, edge sparsification, vertex sparsification, Gomory-Hu tree, mimicking networks, graph sampling, succinct data structures} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

The All-Pairs Min-Cut problem (aka All-Pairs Max-Flow) asks to compute a minimum s-t cut (or just its value) for all pairs of vertices s,t. We study this problem in directed graphs with unit edge/vertex capacities (corresponding to edge/vertex connectivity). Our focus is on the k-bounded case, where the algorithm has to find all pairs with min-cut value less than k, and report only those. The most basic case k=1 is the Transitive Closure (TC) problem, which can be solved in graphs with n vertices and m edges in time O(mn) combinatorially, and in time O(n^{omega}) where omega<2.38 is the matrix-multiplication exponent. These time bounds are conjectured to be optimal.
We present new algorithms and conditional lower bounds that advance the frontier for larger k, as follows:
- A randomized algorithm for vertex capacities that runs in time {O}((nk)^{omega}). This is only a factor k^omega away from the TC bound, and nearly matches it for all k=n^{o(1)}.
- Two deterministic algorithms for edge capacities (which is more general) that work in DAGs and further reports a minimum cut for each pair. The first algorithm is combinatorial (does not involve matrix multiplication) and runs in time {O}(2^{{O}(k^2)}* mn). The second algorithm can be faster on dense DAGs and runs in time {O}((k log n)^{4^{k+o(k)}}* n^{omega}). Previously, Georgiadis et al. [ICALP 2017], could match the TC bound (up to n^{o(1)} factors) only when k=2, and now our two algorithms match it for all k=o(sqrt{log n}) and k=o(log log n).
- The first super-cubic lower bound of n^{omega-1-o(1)} k^2 time under the 4-Clique conjecture, which holds even in the simplest case of DAGs with unit vertex capacities. It improves on the previous (SETH-based) lower bounds even in the unbounded setting k=n. For combinatorial algorithms, our reduction implies an n^{2-o(1)} k^2 conditional lower bound. Thus, we identify new settings where the complexity of the problem is (conditionally) higher than that of TC.
Our three sets of results are obtained via different techniques. The first one adapts the network coding method of Cheung, Lau, and Leung [SICOMP 2013] to vertex-capacitated digraphs. The second set exploits new insights on the structure of latest cuts together with suitable algebraic tools. The lower bounds arise from a novel reduction of a different structure than the SETH-based constructions.

Amir Abboud, Loukas Georgiadis, Giuseppe F. Italiano, Robert Krauthgamer, Nikos Parotsidis, Ohad Trabelsi, Przemysław Uznański, and Daniel Wolleb-Graf. Faster Algorithms for All-Pairs Bounded Min-Cuts. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 7:1-7:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{abboud_et_al:LIPIcs.ICALP.2019.7, author = {Abboud, Amir and Georgiadis, Loukas and Italiano, Giuseppe F. and Krauthgamer, Robert and Parotsidis, Nikos and Trabelsi, Ohad and Uzna\'{n}ski, Przemys{\l}aw and Wolleb-Graf, Daniel}, title = {{Faster Algorithms for All-Pairs Bounded Min-Cuts}}, booktitle = {46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)}, pages = {7:1--7:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-109-2}, ISSN = {1868-8969}, year = {2019}, volume = {132}, editor = {Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.7}, URN = {urn:nbn:de:0030-drops-105833}, doi = {10.4230/LIPIcs.ICALP.2019.7}, annote = {Keywords: All-pairs min-cut, k-reachability, network coding, Directed graphs, fine-grained complexity} }

Document

**Published in:** LIPIcs, Volume 126, 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019)

In the Set Cover problem, the input is a ground set of n elements and a collection of m sets, and the goal is to find the smallest sub-collection of sets whose union is the entire ground set. The fastest algorithm known runs in time O(mn2^n) [Fomin et al., WG 2004], and the Set Cover Conjecture (SeCoCo) [Cygan et al., TALG 2016] asserts that for every fixed epsilon>0, no algorithm can solve Set Cover in time 2^{(1-epsilon)n} poly(m), even if set sizes are bounded by Delta=Delta(epsilon). We show strong connections between this problem and kTree, a special case of Subgraph Isomorphism where the input is an n-node graph G and a k-node tree T, and the goal is to determine whether G has a subgraph isomorphic to T.
First, we propose a weaker conjecture Log-SeCoCo, that allows input sets of size Delta=O(1/epsilon * log n), and show that an algorithm breaking Log-SeCoCo would imply a faster algorithm than the currently known 2^n poly(n)-time algorithm [Koutis and Williams, TALG 2016] for Directed nTree, which is kTree with k=n and arbitrary directions to the edges of G and T. This would also improve the running time for Directed Hamiltonicity, for which no algorithm significantly faster than 2^n poly(n) is known despite extensive research.
Second, we prove that if p-Partial Cover, a parameterized version of Set Cover that requires covering at least p elements, cannot be solved significantly faster than 2^n poly(m) (an assumption even weaker than Log-SeCoCo) then kTree cannot be computed significantly faster than 2^k poly(n), the running time of the Koutis and Williams' algorithm.

Robert Krauthgamer and Ohad Trabelsi. The Set Cover Conjecture and Subgraph Isomorphism with a Tree Pattern. In 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 126, pp. 45:1-45:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{krauthgamer_et_al:LIPIcs.STACS.2019.45, author = {Krauthgamer, Robert and Trabelsi, Ohad}, title = {{The Set Cover Conjecture and Subgraph Isomorphism with a Tree Pattern}}, booktitle = {36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019)}, pages = {45:1--45:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-100-9}, ISSN = {1868-8969}, year = {2019}, volume = {126}, editor = {Niedermeier, Rolf and Paul, Christophe}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2019.45}, URN = {urn:nbn:de:0030-drops-102840}, doi = {10.4230/LIPIcs.STACS.2019.45}, annote = {Keywords: Conditional lower bounds, Hardness in P, Set Cover Conjecture, Subgraph Isomorphism} }

Document

**Published in:** OASIcs, Volume 69, 2nd Symposium on Simplicity in Algorithms (SOSA 2019)

We reprove three known algorithmic bounds for terminal-clustering problems, using a single framework that leads to simpler proofs. In this genre of problems, the input is a metric space (X,d) (possibly arising from a graph) and a subset of terminals K subset X, and the goal is to partition the points X such that each part, called a cluster, contains exactly one terminal (possibly with connectivity requirements) so as to minimize some objective. The three bounds we reprove are for Steiner Point Removal on trees [Gupta, SODA 2001], for Metric 0-Extension in bounded doubling dimension [Lee and Naor, unpublished 2003], and for Connected Metric 0-Extension [Englert et al., SICOMP 2014].
A natural approach is to cluster each point with its closest terminal, which would partition X into so-called Voronoi cells, but this approach can fail miserably due to its stringent cluster boundaries. A now-standard fix, which we call the Relaxed-Voronoi framework, is to use enlarged Voronoi cells, but to obtain disjoint clusters, the cells are computed greedily according to some order. This method, first proposed by Calinescu, Karloff and Rabani [SICOMP 2004], was employed successfully to provide state-of-the-art results for terminal-clustering problems on general metrics. However, for restricted families of metrics, e.g., trees and doubling metrics, only more complicated, ad-hoc algorithms are known. Our main contribution is to demonstrate that the Relaxed-Voronoi algorithm is applicable to restricted metrics, and actually leads to relatively simple algorithms and analyses.

Arnold Filtser, Robert Krauthgamer, and Ohad Trabelsi. Relaxed Voronoi: A Simple Framework for Terminal-Clustering Problems. In 2nd Symposium on Simplicity in Algorithms (SOSA 2019). Open Access Series in Informatics (OASIcs), Volume 69, pp. 10:1-10:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{filtser_et_al:OASIcs.SOSA.2019.10, author = {Filtser, Arnold and Krauthgamer, Robert and Trabelsi, Ohad}, title = {{Relaxed Voronoi: A Simple Framework for Terminal-Clustering Problems}}, booktitle = {2nd Symposium on Simplicity in Algorithms (SOSA 2019)}, pages = {10:1--10:14}, series = {Open Access Series in Informatics (OASIcs)}, ISBN = {978-3-95977-099-6}, ISSN = {2190-6807}, year = {2019}, volume = {69}, editor = {Fineman, Jeremy T. and Mitzenmacher, Michael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SOSA.2019.10}, URN = {urn:nbn:de:0030-drops-100369}, doi = {10.4230/OASIcs.SOSA.2019.10}, annote = {Keywords: Clustering, Steiner point removal, Zero extension, Doubling dimension, Relaxed voronoi} }

Document

**Published in:** LIPIcs, Volume 124, 10th Innovations in Theoretical Computer Science Conference (ITCS 2019)

We study sublinear algorithms that solve linear systems locally. In the classical version of this problem the input is a matrix S in R^{n x n} and a vector b in R^n in the range of S, and the goal is to output x in R^n satisfying Sx=b. For the case when the matrix S is symmetric diagonally dominant (SDD), the breakthrough algorithm of Spielman and Teng [STOC 2004] approximately solves this problem in near-linear time (in the input size which is the number of non-zeros in S), and subsequent papers have further simplified, improved, and generalized the algorithms for this setting.
Here we focus on computing one (or a few) coordinates of x, which potentially allows for sublinear algorithms. Formally, given an index u in [n] together with S and b as above, the goal is to output an approximation x^_u for x^*_u, where x^* is a fixed solution to Sx=b.
Our results show that there is a qualitative gap between SDD matrices and the more general class of positive semidefinite (PSD) matrices. For SDD matrices, we develop an algorithm that approximates a single coordinate x_{u} in time that is polylogarithmic in n, provided that S is sparse and has a small condition number (e.g., Laplacian of an expander graph). The approximation guarantee is additive | x^_u-x^*_u | <=epsilon | x^* |_infty for accuracy parameter epsilon>0. We further prove that the condition-number assumption is necessary and tight.
In contrast to the SDD matrices, we prove that for certain PSD matrices S, the running time must be at least polynomial in n (for the same additive approximation), even if S has bounded sparsity and condition number.

Alexandr Andoni, Robert Krauthgamer, and Yosef Pogrow. On Solving Linear Systems in Sublinear Time. In 10th Innovations in Theoretical Computer Science Conference (ITCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 124, pp. 3:1-3:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.ITCS.2019.3, author = {Andoni, Alexandr and Krauthgamer, Robert and Pogrow, Yosef}, title = {{On Solving Linear Systems in Sublinear Time}}, booktitle = {10th Innovations in Theoretical Computer Science Conference (ITCS 2019)}, pages = {3:1--3:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-095-8}, ISSN = {1868-8969}, year = {2019}, volume = {124}, editor = {Blum, Avrim}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2019.3}, URN = {urn:nbn:de:0030-drops-100966}, doi = {10.4230/LIPIcs.ITCS.2019.3}, annote = {Keywords: Linear systems, Laplacian solver, Sublinear time, Randomized linear algebra} }

Document

**Published in:** LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)

We provide evidence that computing the maximum flow value between every pair of nodes in a directed graph on n nodes, m edges, and capacities in the range [1..n], which we call the All-Pairs Max-Flow problem, cannot be solved in time that is faster significantly (i.e., by a polynomial factor) than O(n^2 m). Since a single maximum st-flow in such graphs can be solved in time \tilde{O}(m\sqrt{n}) [Lee and Sidford, FOCS 2014], we conclude that the all-pairs version might require time equivalent to \tilde\Omega(n^{3/2}) computations of maximum st-flow, which strongly separates the directed case from the undirected one. Moreover, if maximum $st$-flow can be solved in time \tilde{O}(m), then the runtime of \tilde\Omega(n^2) computations is needed. This is in contrast to a conjecture of Lacki, Nussbaum, Sankowski, and Wulf-Nilsen [FOCS 2012] that All-Pairs Max-Flow in general graphs can be solved faster than the time of O(n^2) computations of maximum st-flow.
Specifically, we show that in sparse graphs G=(V,E,w), if one can compute the maximum st-flow from every s in an input set of sources S\subseteq V to every t in an input set of sinks T\subseteq V in time O((|S||T|m)^{1-epsilon}), for some |S|, |T|, and a constant epsilon>0, then MAX-CNF-SAT (maximum satisfiability of conjunctive normal form formulas) with n' variables and m' clauses can be solved in time {m'}^{O(1)}2^{(1-delta)n'} for a constant delta(epsilon)>0, a problem for which not even 2^{n'}/\poly(n') algorithms are known. Such runtime for MAX-CNF-SAT would in particular refute the Strong Exponential Time Hypothesis (SETH). Hence, we improve the lower bound of Abboud, Vassilevska-Williams, and Yu [STOC 2015], who showed that for every fixed epsilon>0 and |S|=|T|=O(\sqrt{n}), if the above problem can be solved in time O(n^{3/2-epsilon}), then some incomparable (and intuitively weaker) conjecture is false. Furthermore, a larger lower bound than ours implies strictly super-linear time for maximum st-flow problem, which would be an amazing breakthrough.
In addition, we show that All-Pairs Max-Flow in uncapacitated networks with every edge-density m=m(n), cannot be computed in time significantly faster than O(mn), even for acyclic networks. The gap to the fastest known algorithm by Cheung, Lau, and Leung [FOCS 2011] is a factor of O(m^{omega-1}/n), and for acyclic networks it is O(n^{omega-1}), where omega is the matrix multiplication exponent.

Robert Krauthgamer and Ohad Trabelsi. Conditional Lower Bounds for All-Pairs Max-Flow. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 20:1-20:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{krauthgamer_et_al:LIPIcs.ICALP.2017.20, author = {Krauthgamer, Robert and Trabelsi, Ohad}, title = {{Conditional Lower Bounds for All-Pairs Max-Flow}}, booktitle = {44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)}, pages = {20:1--20:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-041-5}, ISSN = {1868-8969}, year = {2017}, volume = {80}, editor = {Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.20}, URN = {urn:nbn:de:0030-drops-74264}, doi = {10.4230/LIPIcs.ICALP.2017.20}, annote = {Keywords: Conditional lower bounds, Hardness in P, All-Pairs Maximum Flow, Strong Exponential Time Hypothesis} }

Document

**Published in:** LIPIcs, Volume 54, 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

In the snippets problem we are interested in preprocessing a text T so that given two pattern queries P_1 and P_2, one can quickly locate the occurrences of the patterns in T that are the closest to each other. A closely related problem is that of constructing a color-distance oracle, where the goal is to preprocess a set of points from some metric space, in which every point is associated with a set of colors, so that given two colors one can quickly locate two points associated with those colors, that are as close as possible to each other.
We introduce efficient data structures for both color-distance oracles and the snippets problem. Moreover, we prove conditional lower bounds for these problems from both the 3SUM conjecture and the Combinatorial Boolean Matrix Multiplication conjecture.

Tsvi Kopelowitz and Robert Krauthgamer. Color-Distance Oracles and Snippets. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 24:1-24:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{kopelowitz_et_al:LIPIcs.CPM.2016.24, author = {Kopelowitz, Tsvi and Krauthgamer, Robert}, title = {{Color-Distance Oracles and Snippets}}, booktitle = {27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)}, pages = {24:1--24:10}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-012-5}, ISSN = {1868-8969}, year = {2016}, volume = {54}, editor = {Grossi, Roberto and Lewenstein, Moshe}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2016.24}, URN = {urn:nbn:de:0030-drops-60684}, doi = {10.4230/LIPIcs.CPM.2016.24}, annote = {Keywords: Snippets, Text Indexing, Distance Oracles, Near Neighbor Search} }

Document

**Published in:** LIPIcs, Volume 40, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015)

We investigate the problem of approximate Nearest-Neighbor Search (NNS) in graphical metrics: The task is to preprocess an edge-weighted graph G=(V,E) on m vertices and a small "dataset" D \subset V of size n << m, so that given a query point q \in V, one can quickly approximate dist(q,D) (the distance from q to its closest vertex in D) and find a vertex a \in D within this approximated distance. We assume the query algorithm has access to a distance oracle, that quickly evaluates the exact distance between any pair of vertices.
For planar graphs G with maximum degree Delta, we show how to efficiently construct a compact data structure -- of size ~O(n(Delta+1/epsilon)) -- that answers (1+epsilon)-NNS queries in time ~O(Delta+1/epsilon). Thus, as far as NNS applications are concerned, metrics derived from bounded-degree planar graphs behave as low-dimensional metrics, even though planar metrics do not necessarily have a low doubling dimension, nor can they be embedded with low distortion into l_2. We complement our algorithmic result by lower bounds showing that the access to an exact distance oracle (rather than an approximate one) and the dependency on Delta (in query time) are both essential.

Ittai Abraham, Shiri Chechik, Robert Krauthgamer, and Udi Wieder. Approximate Nearest Neighbor Search in Metrics of Planar Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 40, pp. 20-42, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{abraham_et_al:LIPIcs.APPROX-RANDOM.2015.20, author = {Abraham, Ittai and Chechik, Shiri and Krauthgamer, Robert and Wieder, Udi}, title = {{Approximate Nearest Neighbor Search in Metrics of Planar Graphs}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015)}, pages = {20--42}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-89-7}, ISSN = {1868-8969}, year = {2015}, volume = {40}, editor = {Garg, Naveen and Jansen, Klaus and Rao, Anup and Rolim, Jos\'{e} D. P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2015.20}, URN = {urn:nbn:de:0030-drops-52923}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2015.20}, annote = {Keywords: Data Structures, Nearest Neighbor Search, Planar Graphs, Planar Metrics, Planar Separator} }

Document

**Published in:** LIPIcs, Volume 40, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015)

We study resistance sparsification of graphs, in which the goal is to find a sparse subgraph (with reweighted edges) that approximately preserves the effective resistances between every pair of nodes. We show that every dense regular expander admits a (1+epsilon)-resistance sparsifier of size ~O(n/epsilon), and conjecture this bound holds for all graphs on n nodes. In comparison, spectral sparsification is a strictly stronger notion and requires Omega(n/epsilon^2) edges even on the complete graph.
Our approach leads to the following structural question on graphs: Does every dense regular expander contain a sparse regular expander as a subgraph? Our main technical contribution, which may of independent interest, is a positive answer to this question in a certain setting of parameters. Combining this with a recent result of von Luxburg, Radl, and Hein (JMLR, 2014) leads to the aforementioned resistance sparsifiers.

Michael Dinitz, Robert Krauthgamer, and Tal Wagner. Towards Resistance Sparsifiers. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 40, pp. 738-755, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{dinitz_et_al:LIPIcs.APPROX-RANDOM.2015.738, author = {Dinitz, Michael and Krauthgamer, Robert and Wagner, Tal}, title = {{Towards Resistance Sparsifiers}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2015)}, pages = {738--755}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-89-7}, ISSN = {1868-8969}, year = {2015}, volume = {40}, editor = {Garg, Naveen and Jansen, Klaus and Rao, Anup and Rolim, Jos\'{e} D. P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2015.738}, URN = {urn:nbn:de:0030-drops-53334}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2015.738}, annote = {Keywords: edge sparsification, spectral sparsifier, graph expansion, effective resistance, commute time} }