# 4 Search Results for "Bakshi, Ainesh"

Document
Track A: Algorithms, Complexity and Games
##### Random Separating Hyperplane Theorem and Learning Polytopes

Authors: Chiranjib Bhattacharyya, Ravindran Kannan, and Amit Kumar

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

##### Abstract
The Separating Hyperplane theorem is a fundamental result in Convex Geometry with myriad applications. The theorem asserts that for a point a not in a closed convex set K, there is a hyperplane with K on one side and a strictly on the other side. Our first result, Random Separating Hyperplane Theorem (RSH), is a strengthening of this for polytopes. RSH asserts that if the distance between a and a polytope K with k vertices and unit diameter in ℜ^d is at least δ, where δ is a fixed constant in (0,1), then a randomly chosen hyperplane separates a and K with probability at least 1/poly(k) and margin at least Ω (δ/√d). RSH has algorithmic applications in learning polytopes. We consider a fundamental problem, denoted the "Hausdorff problem", of learning a unit diameter polytope K within Hausdorff distance δ, given an optimization oracle for K. Using RSH, we show that with polynomially many random queries to the optimization oracle, K can be approximated within error O(δ). To our knowledge, this is the first provable algorithm for the Hausdorff Problem in this setting. Building on this result, we show that if the vertices of K are well-separated, then an optimization oracle can be used to generate a list of points, each within distance O(δ) of K, with the property that the list contains a point close to each vertex of K. Further, we show how to prune this list to generate a (unique) approximation to each vertex of the polytope. We prove that in many latent variable settings, e.g., topic modeling, LDA, optimization oracles do exist provided we project to a suitable SVD subspace. Thus, our work yields the first efficient algorithm for finding approximations to the vertices of the latent polytope under the well-separatedness assumption. This assumption states that each vertex of K is far from the convex hull of the remaining vertices of K, and is much weaker than other assumptions behind algorithms in the literature which find vertices of the latent polytope.

##### Cite as

Chiranjib Bhattacharyya, Ravindran Kannan, and Amit Kumar. Random Separating Hyperplane Theorem and Learning Polytopes. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 25:1-25:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

```@InProceedings{bhattacharyya_et_al:LIPIcs.ICALP.2024.25,
author =	{Bhattacharyya, Chiranjib and Kannan, Ravindran and Kumar, Amit},
title =	{{Random Separating Hyperplane Theorem and Learning Polytopes}},
booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
pages =	{25:1--25:20},
series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN =	{978-3-95977-322-5},
ISSN =	{1868-8969},
year =	{2024},
volume =	{297},
editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.25},
URN =		{urn:nbn:de:0030-drops-201687},
doi =		{10.4230/LIPIcs.ICALP.2024.25},
annote =	{Keywords: Separating Hyperplane Theorem, Learning Polytopes, Optimization Oracles}
}```
Document
Track A: Algorithms, Complexity and Games
##### Sublinear Time Eigenvalue Approximation via Random Sampling

Authors: Rajarshi Bhattacharjee, Gregory Dexter, Petros Drineas, Cameron Musco, and Archan Ray

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)

##### Abstract
We study the problem of approximating the eigenspectrum of a symmetric matrix A ∈ ℝ^{n×n} with bounded entries (i.e., ‖A‖_∞ ≤ 1). We present a simple sublinear time algorithm that approximates all eigenvalues of A up to additive error ±εn using those of a randomly sampled Õ((log³ n)/ε³)×Õ((log³ n)/ε³) principal submatrix. Our result can be viewed as a concentration bound on the complete eigenspectrum of a random submatrix, significantly extending known bounds on just the singular values (the magnitudes of the eigenvalues). We give improved error bounds of ± ε √{nnz(A)} and ±ε‖A‖_F when the rows of A can be sampled with probabilities proportional to their sparsities or their squared 𝓁₂ norms respectively. Here nnz(A) is the number of non-zero entries in A and ‖A‖_F is its Frobenius norm. Even for the strictly easier problems of approximating the singular values or testing the existence of large negative eigenvalues (Bakshi, Chepurko, and Jayaram, FOCS '20), our results are the first that take advantage of non-uniform sampling to give improved error bounds. From a technical perspective, our results require several new eigenvalue concentration and perturbation bounds for matrices with bounded entries. Our non-uniform sampling bounds require a new algorithmic approach, which judiciously zeroes out entries of a randomly sampled submatrix to reduce variance, before computing the eigenvalues of that submatrix as estimates for those of A. We complement our theoretical results with numerical simulations, which demonstrate the effectiveness of our algorithms in practice.

##### Cite as

Rajarshi Bhattacharjee, Gregory Dexter, Petros Drineas, Cameron Musco, and Archan Ray. Sublinear Time Eigenvalue Approximation via Random Sampling. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 21:1-21:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

```@InProceedings{bhattacharjee_et_al:LIPIcs.ICALP.2023.21,
author =	{Bhattacharjee, Rajarshi and Dexter, Gregory and Drineas, Petros and Musco, Cameron and Ray, Archan},
title =	{{Sublinear Time Eigenvalue Approximation via Random Sampling}},
booktitle =	{50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
pages =	{21:1--21:18},
series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN =	{978-3-95977-278-5},
ISSN =	{1868-8969},
year =	{2023},
volume =	{261},
editor =	{Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.21},
URN =		{urn:nbn:de:0030-drops-180733},
doi =		{10.4230/LIPIcs.ICALP.2023.21},
annote =	{Keywords: sublinear algorithms, eigenvalue approximation, randomized linear algebra}
}```
Document
APPROX
##### Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams

Authors: Ainesh Bakshi, Nadiia Chepurko, and David P. Woodruff

Published in: LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)

##### Abstract
We study the Maximum Independent Set problem for geometric objects given in the data stream model. A set of geometric objects is said to be independent if the objects are pairwise disjoint. We consider geometric objects in one and two dimensions, i.e., intervals and disks. Let α be the cardinality of the largest independent set. Our goal is to estimate α in a small amount of space, given that the input is received as a one-pass stream. We also consider a generalization of this problem by assigning weights to each object and estimating β, the largest value of a weighted independent set. We initialize the study of this problem in the turnstile streaming model (insertions and deletions) and provide the first algorithms for estimating α and β. For unit-length intervals, we obtain a (2+ε)-approximation to α and β in poly(log(n)/ε) space. We also show a matching lower bound. Combined with the 3/2-approximation for insertion-only streams by Cabello and Perez-Lanterno [Cabello and Pérez-Lantero, 2017], our result implies a separation between the insertion-only and turnstile model. For unit-radius disks, we obtain a (8√3/π)-approximation to α and β in poly(log(n)/ε) space, which is closely related to the hexagonal circle packing constant. Finally, we provide algorithms for estimating α for arbitrary-length intervals under a bounded intersection assumption and study the parameterized space complexity of estimating α and β, where the parameter is the ratio of maximum to minimum interval length.

##### Cite as

Ainesh Bakshi, Nadiia Chepurko, and David P. Woodruff. Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 64:1-64:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

```@InProceedings{bakshi_et_al:LIPIcs.APPROX/RANDOM.2020.64,
author =	{Bakshi, Ainesh and Chepurko, Nadiia and Woodruff, David P.},
title =	{{Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams}},
booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)},
pages =	{64:1--64:22},
series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN =	{978-3-95977-164-1},
ISSN =	{1868-8969},
year =	{2020},
volume =	{176},
editor =	{Byrka, Jaros{\l}aw and Meka, Raghu},
publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.64},
URN =		{urn:nbn:de:0030-drops-126679},
doi =		{10.4230/LIPIcs.APPROX/RANDOM.2020.64},
annote =	{Keywords: Weighted Maximum Independent Set, Geometric Graphs, Turnstile Streams}
}```
Document
Track A: Algorithms, Complexity and Games
##### Robust Communication-Optimal Distributed Clustering Algorithms

Authors: Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, and David P. Woodruff

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

##### Abstract
In this work, we study the k-median and k-means clustering problems when the data is distributed across many servers and can contain outliers. While there has been a lot of work on these problems for worst-case instances, we focus on gaining a finer understanding through the lens of beyond worst-case analysis. Our main motivation is the following: for many applications such as clustering proteins by function or clustering communities in a social network, there is some unknown target clustering, and the hope is that running a k-median or k-means algorithm will produce clusterings which are close to matching the target clustering. Worst-case results can guarantee constant factor approximations to the optimal k-median or k-means objective value, but not closeness to the target clustering. Our first result is a distributed algorithm which returns a near-optimal clustering assuming a natural notion of stability, namely, approximation stability [Awasthi and Balcan, 2014], even when a constant fraction of the data are outliers. The communication complexity is O~(sk+z) where s is the number of machines, k is the number of clusters, and z is the number of outliers. Next, we show this amount of communication cannot be improved even in the setting when the input satisfies various non-worst-case assumptions. We give a matching Omega(sk+z) lower bound on the communication required both for approximating the optimal k-means or k-median cost up to any constant, and for returning a clustering that is close to the target clustering in Hamming distance. These lower bounds hold even when the data satisfies approximation stability or other common notions of stability, and the cluster sizes are balanced. Therefore, Omega(sk+z) is a communication bottleneck, even for real-world instances.

##### Cite as

Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, and David P. Woodruff. Robust Communication-Optimal Distributed Clustering Algorithms. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 18:1-18:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

```@InProceedings{awasthi_et_al:LIPIcs.ICALP.2019.18,
author =	{Awasthi, Pranjal and Bakshi, Ainesh and Balcan, Maria-Florina and White, Colin and Woodruff, David P.},
title =	{{Robust Communication-Optimal Distributed Clustering Algorithms}},
booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
pages =	{18:1--18:16},
series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN =	{978-3-95977-109-2},
ISSN =	{1868-8969},
year =	{2019},
volume =	{132},
editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.18},
URN =		{urn:nbn:de:0030-drops-105942},
doi =		{10.4230/LIPIcs.ICALP.2019.18},
annote =	{Keywords: robust distributed clustering, communication complexity}
}```
• Refine by Author
• 2 Bakshi, Ainesh
• 2 Woodruff, David P.
• 1 Awasthi, Pranjal
• 1 Balcan, Maria-Florina
• 1 Bhattacharjee, Rajarshi

• Refine by Classification
• 2 Theory of computation → Unsupervised learning and clustering
• 1 Mathematics of computing → Computations on matrices
• 1 Theory of computation → Sketching and sampling
• 1 Theory of computation → Streaming models
• 1 Theory of computation → Streaming, sublinear and near linear time algorithms

• Refine by Keyword
• 1 Geometric Graphs
• 1 Learning Polytopes
• 1 Optimization Oracles
• 1 Separating Hyperplane Theorem
• 1 Turnstile Streams

• Refine by Type
• 4 document

• Refine by Publication Year
• 1 2019
• 1 2020
• 1 2023
• 1 2024

X

Feedback for Dagstuhl Publishing