Document

**Published in:** LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

We introduce and study the communication complexity of computing the inner product of two vectors, where the input is restricted w.r.t. a norm N on the space ℝⁿ. Here, Alice and Bob hold two vectors v,u such that ‖v‖_N ≤ 1 and ‖u‖_{N^*} ≤ 1, where N^* is the dual norm. The goal is to compute their inner product ⟨v,u⟩ up to an ε additive term. The problem is denoted by IP_N, and generalizes important previously studied problems, such as: (1) Computing the expectation 𝔼_{x∼𝒟}[f(x)] when Alice holds 𝒟 and Bob holds f is equivalent to IP_{𝓁₁}. (2) Computing v^TAv where Alice has a symmetric matrix with bounded operator norm (denoted S_∞) and Bob has a vector v where ‖v‖₂ = 1. This problem is complete for quantum communication complexity and is equivalent to IP_{S_∞}.
We systematically study IP_N, showing the following results, near tight in most cases:
1) For any symmetric norm N, given ‖v‖_N ≤ 1 and ‖u‖_{N^*} ≤ 1 there is a randomized protocol using 𝒪̃(ε^{-6} log n) bits of communication that returns a value in ⟨u,v⟩±ε with probability 2/3 - we will denote this by ℛ_{ε,1/3}(IP_N) ≤ 𝒪̃(ε^{-6} log n). In a special case where N = 𝓁_p and N^* = 𝓁_q for p^{-1} + q^{-1} = 1, we obtain an improved bound ℛ_{ε,1/3}(IP_{𝓁_p}) ≤ 𝒪(ε^{-2} log n), nearly matching the lower bound ℛ_{ε, 1/3}(IP_{𝓁_p}) ≥ Ω(min(n, ε^{-2})).
2) One way communication complexity ℛ^{→}_{ε,δ}(IP_{𝓁_p}) ≤ 𝒪(ε^{-max(2,p)}⋅ log n/ε), and a nearly matching lower bound ℛ^{→}_{ε, 1/3}(IP_{𝓁_p}) ≥ Ω(ε^{-max(2,p)}) for ε^{-max(2,p)} ≪ n.
3) One way communication complexity ℛ^{→}_{ε,δ}(N) for a symmetric norm N is governed by the distortion of the embedding 𝓁_∞^k into N. Specifically, while a small distortion embedding easily implies a lower bound Ω(k), we show that, conversely, non-existence of such an embedding implies protocol with communication k^𝒪(log log k) log² n.
4) For arbitrary origin symmetric convex polytope P, we show ℛ_{ε,1/3}(IP_{N}) ≤ 𝒪(ε^{-2} log xc(P)), where N is the unique norm for which P is a unit ball, and xc(P) is the extension complexity of P (i.e. the smallest number of inequalities describing some polytope P' s.t. P is projection of P').

Alexandr Andoni, Jarosław Błasiok, and Arnold Filtser. Communication Complexity of Inner Product in Symmetric Normed Spaces. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 4:1-4:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.ITCS.2023.4, author = {Andoni, Alexandr and B{\l}asiok, Jaros{\l}aw and Filtser, Arnold}, title = {{Communication Complexity of Inner Product in Symmetric Normed Spaces}}, booktitle = {14th Innovations in Theoretical Computer Science Conference (ITCS 2023)}, pages = {4:1--4:22}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-263-1}, ISSN = {1868-8969}, year = {2023}, volume = {251}, editor = {Tauman Kalai, Yael}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.4}, URN = {urn:nbn:de:0030-drops-175077}, doi = {10.4230/LIPIcs.ITCS.2023.4}, annote = {Keywords: communication complexity, symmetric norms} }

Document

APPROX

**Published in:** LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)

We study the space complexity of solving the bias-regularized SVM problem in the streaming model. In particular, given a data set (x_i,y_i) ∈ ℝ^d× {-1,+1}, the objective function is F_λ(θ,b) = λ/2‖(θ,b)‖₂² + 1/n∑_{i=1}ⁿ max{0,1-y_i(θ^Tx_i+b)} and the goal is to find the parameters that (approximately) minimize this objective. This is a classic supervised learning problem that has drawn lots of attention, including for developing fast algorithms for solving the problem approximately: i.e., for finding (θ,b) such that F_λ(θ,b) ≤ min_{(θ',b')} F_λ(θ',b')+ε.
One of the most widely used algorithms for approximately optimizing the SVM objective is Stochastic Gradient Descent (SGD), which requires only O(1/λε) random samples, and which immediately yields a streaming algorithm that uses O(d/λε) space. For related problems, better streaming algorithms are only known for smooth functions, unlike the SVM objective that we focus on in this work.
We initiate an investigation of the space complexity for both finding an approximate optimum of this objective, and for the related "point estimation" problem of sketching the data set to evaluate the function value F_λ on any query (θ, b). We show that, for both problems, for dimensions d = 1,2, one can obtain streaming algorithms with space polynomially smaller than 1/λε, which is the complexity of SGD for strongly convex functions like the bias-regularized SVM [Shalev-Shwartz et al., 2007], and which is known to be tight in general, even for d = 1 [Agarwal et al., 2009]. We also prove polynomial lower bounds for both point estimation and optimization. In particular, for point estimation we obtain a tight bound of Θ(1/√{ε}) for d = 1 and a nearly tight lower bound of Ω̃(d/{ε}²) for d = Ω(log(1/ε)). Finally, for optimization, we prove a Ω(1/√{ε}) lower bound for d = Ω(log(1/ε)), and show similar bounds when d is constant.

Alexandr Andoni, Collin Burns, Yi Li, Sepideh Mahabadi, and David P. Woodruff. Streaming Complexity of SVMs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 50:1-50:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.APPROX/RANDOM.2020.50, author = {Andoni, Alexandr and Burns, Collin and Li, Yi and Mahabadi, Sepideh and Woodruff, David P.}, title = {{Streaming Complexity of SVMs}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)}, pages = {50:1--50:22}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-164-1}, ISSN = {1868-8969}, year = {2020}, volume = {176}, editor = {Byrka, Jaros{\l}aw and Meka, Raghu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.50}, URN = {urn:nbn:de:0030-drops-126532}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2020.50}, annote = {Keywords: support vector machine, streaming algorithm, space lower bound, sketching algorithm, point estimation} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures well coarse-grained computation on large data - data is distributed to processors, each of which has a sublinear (in the input data) amount of memory and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model.
In this paper, we study two fundamental problems, 2-edge connectivity and 2-vertex connectivity (biconnectivity). PRAM algorithms which run in O(log n) time have been known for many years. We give algorithms using roughly log diameter rounds in the MPC model. Our main results are, for an n-vertex, m-edge graph of diameter D and bi-diameter D', 1) a O(log D log log_{m/n} n) parallel time 2-edge connectivity algorithm, 2) a O(log D log^2 log_{m/n}n+log D'log log_{m/n}n) parallel time biconnectivity algorithm, where the bi-diameter D' is the largest cycle length over all the vertex pairs in the same biconnected component. Our results are fully scalable, meaning that the memory per processor can be O(n^{delta}) for arbitrary constant delta>0, and the total memory used is linear in the problem size. Our 2-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [Andoni et al., 2018]. We also show an Omega(log D') conditional lower bound for the biconnectivity problem.

Alexandr Andoni, Clifford Stein, and Peilin Zhong. Log Diameter Rounds Algorithms for 2-Vertex and 2-Edge Connectivity. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 14:1-14:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.ICALP.2019.14, author = {Andoni, Alexandr and Stein, Clifford and Zhong, Peilin}, title = {{Log Diameter Rounds Algorithms for 2-Vertex and 2-Edge Connectivity}}, booktitle = {46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)}, pages = {14:1--14:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-109-2}, ISSN = {1868-8969}, year = {2019}, volume = {132}, editor = {Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.14}, URN = {urn:nbn:de:0030-drops-105906}, doi = {10.4230/LIPIcs.ICALP.2019.14}, annote = {Keywords: parallel algorithms, biconnectivity, 2-edge connectivity, the MPC model} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

We study the problem of discrete distribution testing in the two-party setting. For example, in the standard closeness testing problem, Alice and Bob each have t samples from, respectively, distributions a and b over [n], and they need to test whether a=b or a,b are epsilon-far (in the l_1 distance). This is in contrast to the well-studied one-party case, where the tester has unrestricted access to samples of both distributions. Despite being a natural constraint in applications, the two-party setting has previously evaded attention.
We address two fundamental aspects of the two-party setting: 1) what is the communication complexity, and 2) can it be accomplished securely, without Alice and Bob learning extra information about each other’s input. Besides closeness testing, we also study the independence testing problem, where Alice and Bob have t samples from distributions a and b respectively, which may be correlated; the question is whether a,b are independent or epsilon-far from being independent. Our contribution is three-fold: 1) We show how to gain communication efficiency given more samples, beyond the information-theoretic bound on t. The gain is polynomially better than what one would obtain via adapting one-party algorithms. 2) We prove tightness of our trade-off for the closeness testing, as well as that the independence testing requires tight Omega(sqrt{m}) communication for unbounded number of samples. These lower bounds are of independent interest as, to the best of our knowledge, these are the first 2-party communication lower bounds for testing problems, where the inputs are a set of i.i.d. samples. 3) We define the concept of secure distribution testing, and provide secure versions of the above protocols with an overhead that is only polynomial in the security parameter.

Alexandr Andoni, Tal Malkin, and Negev Shekel Nosatzki. Two Party Distribution Testing: Communication and Security. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 15:1-15:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.ICALP.2019.15, author = {Andoni, Alexandr and Malkin, Tal and Nosatzki, Negev Shekel}, title = {{Two Party Distribution Testing: Communication and Security}}, booktitle = {46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)}, pages = {15:1--15:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-109-2}, ISSN = {1868-8969}, year = {2019}, volume = {132}, editor = {Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.15}, URN = {urn:nbn:de:0030-drops-105916}, doi = {10.4230/LIPIcs.ICALP.2019.15}, annote = {Keywords: distribution testing, communication complexity, security} }

Document

**Published in:** LIPIcs, Volume 124, 10th Innovations in Theoretical Computer Science Conference (ITCS 2019)

We study sublinear algorithms that solve linear systems locally. In the classical version of this problem the input is a matrix S in R^{n x n} and a vector b in R^n in the range of S, and the goal is to output x in R^n satisfying Sx=b. For the case when the matrix S is symmetric diagonally dominant (SDD), the breakthrough algorithm of Spielman and Teng [STOC 2004] approximately solves this problem in near-linear time (in the input size which is the number of non-zeros in S), and subsequent papers have further simplified, improved, and generalized the algorithms for this setting.
Here we focus on computing one (or a few) coordinates of x, which potentially allows for sublinear algorithms. Formally, given an index u in [n] together with S and b as above, the goal is to output an approximation x^_u for x^*_u, where x^* is a fixed solution to Sx=b.
Our results show that there is a qualitative gap between SDD matrices and the more general class of positive semidefinite (PSD) matrices. For SDD matrices, we develop an algorithm that approximates a single coordinate x_{u} in time that is polylogarithmic in n, provided that S is sparse and has a small condition number (e.g., Laplacian of an expander graph). The approximation guarantee is additive | x^_u-x^*_u | <=epsilon | x^* |_infty for accuracy parameter epsilon>0. We further prove that the condition-number assumption is necessary and tight.
In contrast to the SDD matrices, we prove that for certain PSD matrices S, the running time must be at least polynomial in n (for the same additive approximation), even if S has bounded sparsity and condition number.

Alexandr Andoni, Robert Krauthgamer, and Yosef Pogrow. On Solving Linear Systems in Sublinear Time. In 10th Innovations in Theoretical Computer Science Conference (ITCS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 124, pp. 3:1-3:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.ITCS.2019.3, author = {Andoni, Alexandr and Krauthgamer, Robert and Pogrow, Yosef}, title = {{On Solving Linear Systems in Sublinear Time}}, booktitle = {10th Innovations in Theoretical Computer Science Conference (ITCS 2019)}, pages = {3:1--3:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-095-8}, ISSN = {1868-8969}, year = {2019}, volume = {124}, editor = {Blum, Avrim}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2019.3}, URN = {urn:nbn:de:0030-drops-100966}, doi = {10.4230/LIPIcs.ITCS.2019.3}, annote = {Keywords: Linear systems, Laplacian solver, Sublinear time, Randomized linear algebra} }

Document

**Published in:** LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

Transportation cost metrics, also known as the Wasserstein distances W_p, are a natural choice for defining distances between two pointsets, or distributions, and have been applied in numerous fields. From the computational perspective, there has been an intensive research effort for understanding the W_p metrics over R^k, with work on the W_1 metric (a.k.a earth mover distance) being most successful in terms of theoretical guarantees. However, the W_2 metric, also known as the root-mean square (RMS) bipartite matching distance, is often a more suitable choice in many application areas, e.g. in graphics. Yet, the geometry of this metric space is currently poorly understood, and efficient algorithms have been elusive. For example, there are no known non-trivial algorithms for nearest-neighbor search or sketching for this metric.
In this paper we take the first step towards explaining the lack of efficient algorithms for the W_2 metric, even over the three-dimensional Euclidean space R^3. We prove that there are no meaningful embeddings of W_2 over R^3 into a wide class of normed spaces, as well as that there are no efficient sketching algorithms for W_2 over R^3 achieving constant approximation. For example, our results imply that: 1) any embedding into L1 must incur a distortion of Omega(sqrt(log(n))) for pointsets of size n equipped with the W_2 metric; and 2) any sketching algorithm of size s must incur Omega(sqrt(log(n))/sqrt(s)) approximation. Our results follow from a more general statement, asserting that W_2 over R^3 contains the 1/2-snowflake of all finite metric spaces with a uniformly bounded distortion. These are the first non-embeddability/non-sketchability results for W_2.

Alexandr Andoni, Assaf Naor, and Ofer Neiman. Impossibility of Sketching of the 3D Transportation Metric with Quadratic Cost. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 83:1-83:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.ICALP.2016.83, author = {Andoni, Alexandr and Naor, Assaf and Neiman, Ofer}, title = {{Impossibility of Sketching of the 3D Transportation Metric with Quadratic Cost}}, booktitle = {43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)}, pages = {83:1--83:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-013-2}, ISSN = {1868-8969}, year = {2016}, volume = {55}, editor = {Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.83}, URN = {urn:nbn:de:0030-drops-62028}, doi = {10.4230/LIPIcs.ICALP.2016.83}, annote = {Keywords: Transportation metric, embedding, snowflake, sketching} }

Document

**Published in:** LIPIcs, Volume 51, 32nd International Symposium on Computational Geometry (SoCG 2016)

We prove a tight lower bound for the exponent rho for data-dependent Locality-Sensitive Hashing schemes, recently used to
design efficient solutions for the c-approximate nearest neighbor search. In particular, our lower bound matches the bound of rho<= 1/(2c-1)+o(1) for the l_1 space, obtained via the recent algorithm from [Andoni-Razenshteyn, STOC'15].
In recent years it emerged that data-dependent hashing is strictly superior to the classical Locality-Sensitive Hashing, when
the hash function is data-independent. In the latter setting, the best exponent has been already known: for the l_1 space, the tight bound is rho=1/c, with the upper bound from [Indyk-Motwani,STOC'98] and the matching lower bound from [O'Donnell-Wu-Zhou,ITCS'11].
We prove that, even if the hashing is data-dependent, it must hold that rho>=1/(2c-1)-o(1). To prove the result, we need to
formalize the exact notion of data-dependent hashing that also captures the complexity of the hash functions (in addition to their collision properties). Without restricting such complexity, we would allow for obviously infeasible solutions such as the Voronoi diagram of a dataset. To preclude such solutions, we require our hash functions to be succinct. This condition is satisfied by all the known algorithmic results.

Alexandr Andoni and Ilya Razensteyn. Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing. In 32nd International Symposium on Computational Geometry (SoCG 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 51, pp. 9:1-9:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{andoni_et_al:LIPIcs.SoCG.2016.9, author = {Andoni, Alexandr and Razensteyn, Ilya}, title = {{Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing}}, booktitle = {32nd International Symposium on Computational Geometry (SoCG 2016)}, pages = {9:1--9:11}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-009-5}, ISSN = {1868-8969}, year = {2016}, volume = {51}, editor = {Fekete, S\'{a}ndor and Lubiw, Anna}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2016.9}, URN = {urn:nbn:de:0030-drops-59014}, doi = {10.4230/LIPIcs.SoCG.2016.9}, annote = {Keywords: similarity search, high-dimensional geometry, LSH, data structures, lower bounds} }