3 Search Results for "Bakshi, Ainesh"


Document
Track A: Algorithms, Complexity and Games
Sublinear Time Eigenvalue Approximation via Random Sampling

Authors: Rajarshi Bhattacharjee, Gregory Dexter, Petros Drineas, Cameron Musco, and Archan Ray

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)


Abstract
We study the problem of approximating the eigenspectrum of a symmetric matrix A ∈ ℝ^{n×n} with bounded entries (i.e., ‖A‖_∞ ≤ 1). We present a simple sublinear time algorithm that approximates all eigenvalues of A up to additive error ±εn using those of a randomly sampled Õ((log³ n)/ε³)×Õ((log³ n)/ε³) principal submatrix. Our result can be viewed as a concentration bound on the complete eigenspectrum of a random submatrix, significantly extending known bounds on just the singular values (the magnitudes of the eigenvalues). We give improved error bounds of ± ε √{nnz(A)} and ±ε‖A‖_F when the rows of A can be sampled with probabilities proportional to their sparsities or their squared 𝓁₂ norms respectively. Here nnz(A) is the number of non-zero entries in A and ‖A‖_F is its Frobenius norm. Even for the strictly easier problems of approximating the singular values or testing the existence of large negative eigenvalues (Bakshi, Chepurko, and Jayaram, FOCS '20), our results are the first that take advantage of non-uniform sampling to give improved error bounds. From a technical perspective, our results require several new eigenvalue concentration and perturbation bounds for matrices with bounded entries. Our non-uniform sampling bounds require a new algorithmic approach, which judiciously zeroes out entries of a randomly sampled submatrix to reduce variance, before computing the eigenvalues of that submatrix as estimates for those of A. We complement our theoretical results with numerical simulations, which demonstrate the effectiveness of our algorithms in practice.

Cite as

Rajarshi Bhattacharjee, Gregory Dexter, Petros Drineas, Cameron Musco, and Archan Ray. Sublinear Time Eigenvalue Approximation via Random Sampling. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 21:1-21:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bhattacharjee_et_al:LIPIcs.ICALP.2023.21,
  author =	{Bhattacharjee, Rajarshi and Dexter, Gregory and Drineas, Petros and Musco, Cameron and Ray, Archan},
  title =	{{Sublinear Time Eigenvalue Approximation via Random Sampling}},
  booktitle =	{50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
  pages =	{21:1--21:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-278-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{261},
  editor =	{Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.21},
  URN =		{urn:nbn:de:0030-drops-180733},
  doi =		{10.4230/LIPIcs.ICALP.2023.21},
  annote =	{Keywords: sublinear algorithms, eigenvalue approximation, randomized linear algebra}
}
Document
APPROX
Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams

Authors: Ainesh Bakshi, Nadiia Chepurko, and David P. Woodruff

Published in: LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)


Abstract
We study the Maximum Independent Set problem for geometric objects given in the data stream model. A set of geometric objects is said to be independent if the objects are pairwise disjoint. We consider geometric objects in one and two dimensions, i.e., intervals and disks. Let α be the cardinality of the largest independent set. Our goal is to estimate α in a small amount of space, given that the input is received as a one-pass stream. We also consider a generalization of this problem by assigning weights to each object and estimating β, the largest value of a weighted independent set. We initialize the study of this problem in the turnstile streaming model (insertions and deletions) and provide the first algorithms for estimating α and β. For unit-length intervals, we obtain a (2+ε)-approximation to α and β in poly(log(n)/ε) space. We also show a matching lower bound. Combined with the 3/2-approximation for insertion-only streams by Cabello and Perez-Lanterno [Cabello and Pérez-Lantero, 2017], our result implies a separation between the insertion-only and turnstile model. For unit-radius disks, we obtain a (8√3/π)-approximation to α and β in poly(log(n)/ε) space, which is closely related to the hexagonal circle packing constant. Finally, we provide algorithms for estimating α for arbitrary-length intervals under a bounded intersection assumption and study the parameterized space complexity of estimating α and β, where the parameter is the ratio of maximum to minimum interval length.

Cite as

Ainesh Bakshi, Nadiia Chepurko, and David P. Woodruff. Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 64:1-64:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{bakshi_et_al:LIPIcs.APPROX/RANDOM.2020.64,
  author =	{Bakshi, Ainesh and Chepurko, Nadiia and Woodruff, David P.},
  title =	{{Weighted Maximum Independent Set of Geometric Objects in Turnstile Streams}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)},
  pages =	{64:1--64:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-164-1},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{176},
  editor =	{Byrka, Jaros{\l}aw and Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.64},
  URN =		{urn:nbn:de:0030-drops-126679},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2020.64},
  annote =	{Keywords: Weighted Maximum Independent Set, Geometric Graphs, Turnstile Streams}
}
Document
Track A: Algorithms, Complexity and Games
Robust Communication-Optimal Distributed Clustering Algorithms

Authors: Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, and David P. Woodruff

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
In this work, we study the k-median and k-means clustering problems when the data is distributed across many servers and can contain outliers. While there has been a lot of work on these problems for worst-case instances, we focus on gaining a finer understanding through the lens of beyond worst-case analysis. Our main motivation is the following: for many applications such as clustering proteins by function or clustering communities in a social network, there is some unknown target clustering, and the hope is that running a k-median or k-means algorithm will produce clusterings which are close to matching the target clustering. Worst-case results can guarantee constant factor approximations to the optimal k-median or k-means objective value, but not closeness to the target clustering. Our first result is a distributed algorithm which returns a near-optimal clustering assuming a natural notion of stability, namely, approximation stability [Awasthi and Balcan, 2014], even when a constant fraction of the data are outliers. The communication complexity is O~(sk+z) where s is the number of machines, k is the number of clusters, and z is the number of outliers. Next, we show this amount of communication cannot be improved even in the setting when the input satisfies various non-worst-case assumptions. We give a matching Omega(sk+z) lower bound on the communication required both for approximating the optimal k-means or k-median cost up to any constant, and for returning a clustering that is close to the target clustering in Hamming distance. These lower bounds hold even when the data satisfies approximation stability or other common notions of stability, and the cluster sizes are balanced. Therefore, Omega(sk+z) is a communication bottleneck, even for real-world instances.

Cite as

Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, and David P. Woodruff. Robust Communication-Optimal Distributed Clustering Algorithms. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 18:1-18:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{awasthi_et_al:LIPIcs.ICALP.2019.18,
  author =	{Awasthi, Pranjal and Bakshi, Ainesh and Balcan, Maria-Florina and White, Colin and Woodruff, David P.},
  title =	{{Robust Communication-Optimal Distributed Clustering Algorithms}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{18:1--18:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.18},
  URN =		{urn:nbn:de:0030-drops-105942},
  doi =		{10.4230/LIPIcs.ICALP.2019.18},
  annote =	{Keywords: robust distributed clustering, communication complexity}
}
  • Refine by Author
  • 2 Bakshi, Ainesh
  • 2 Woodruff, David P.
  • 1 Awasthi, Pranjal
  • 1 Balcan, Maria-Florina
  • 1 Bhattacharjee, Rajarshi
  • Show More...

  • Refine by Classification
  • 1 Mathematics of computing → Computations on matrices
  • 1 Theory of computation → Sketching and sampling
  • 1 Theory of computation → Streaming models
  • 1 Theory of computation → Streaming, sublinear and near linear time algorithms
  • 1 Theory of computation → Unsupervised learning and clustering

  • Refine by Keyword
  • 1 Geometric Graphs
  • 1 Turnstile Streams
  • 1 Weighted Maximum Independent Set
  • 1 communication complexity
  • 1 eigenvalue approximation
  • Show More...

  • Refine by Type
  • 3 document

  • Refine by Publication Year
  • 1 2019
  • 1 2020
  • 1 2023

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail