DROPS

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2024.18

It’s Hard to HAC Average Linkage!

Authors: MohammadHossein Bateni, Laxman Dhulipala, Kishen N. Gowda, D. Ellis Hershkowitz, Rajesh Jayaram, and Jakub Łącki

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Abstract

Average linkage Hierarchical Agglomerative Clustering (HAC) is an extensively studied and applied method for hierarchical clustering. Recent applications to massive datasets have driven significant interest in near-linear-time and efficient parallel algorithms for average linkage HAC. We provide hardness results that rule out such algorithms. On the sequential side, we establish a runtime lower bound of n^{3/2-ε} on n node graphs for sequential combinatorial algorithms under standard fine-grained complexity assumptions. This essentially matches the best-known running time for average linkage HAC. On the parallel side, we prove that average linkage HAC likely cannot be parallelized even on simple graphs by showing that it is CC-hard on trees of diameter 4. On the possibility side, we demonstrate that average linkage HAC can be efficiently parallelized (i.e., it is in NC) on paths and can be solved in near-linear time when the height of the output cluster hierarchy is small.

Cite as

MohammadHossein Bateni, Laxman Dhulipala, Kishen N. Gowda, D. Ellis Hershkowitz, Rajesh Jayaram, and Jakub Łącki. It’s Hard to HAC Average Linkage!. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{bateni_et_al:LIPIcs.ICALP.2024.18,
  author =	{Bateni, MohammadHossein and Dhulipala, Laxman and Gowda, Kishen N. and Hershkowitz, D. Ellis and Jayaram, Rajesh and {\L}\k{a}cki, Jakub},
  title =	{{It’s Hard to HAC Average Linkage!}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.18},
  URN =		{urn:nbn:de:0030-drops-201613},
  doi =		{10.4230/LIPIcs.ICALP.2024.18},
  annote =	{Keywords: Clustering, Hierarchical Graph Clustering, HAC, Fine-Grained Complexity, Parallel Algorithms, CC}
}

@InProceedings{bateni_et_al:LIPIcs.ICALP.2024.18,
  author =	{Bateni, MohammadHossein and Dhulipala, Laxman and Gowda, Kishen N. and Hershkowitz, D. Ellis and Jayaram, Rajesh and {\L}\k{a}cki, Jakub},
  title =	{{It’s Hard to HAC Average Linkage!}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.18},
  URN =		{urn:nbn:de:0030-drops-201613},
  doi =		{10.4230/LIPIcs.ICALP.2024.18},
  annote =	{Keywords: Clustering, Hierarchical Graph Clustering, HAC, Fine-Grained Complexity, Parallel Algorithms, CC}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2024.90

Dynamic PageRank: Algorithms and Lower Bounds

Authors: Rajesh Jayaram, Jakub Łącki, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Abstract

We consider the PageRank problem in the dynamic setting, where the goal is to explicitly maintain an approximate PageRank vector π ∈ ℝⁿ for a graph under a sequence of edge insertions and deletions. Our main result is a complete characterization of the complexity of dynamic PageRank maintenance for both multiplicative and additive (L₁) approximations. First, we establish matching lower and upper bounds for maintaining additive approximate PageRank in both incremental and decremental settings. In particular, we demonstrate that in the worst-case (1/α)^{Θ(log log n)} update time is necessary and sufficient for this problem, where α is the desired additive approximation. On the other hand, we demonstrate that the commonly employed ForwardPush approach performs substantially worse than this optimal runtime. Specifically, we show that ForwardPush requires Ω(n^{1-δ}) time per update on average, for any δ > 0, even in the incremental setting. For multiplicative approximations, however, we demonstrate that the situation is significantly more challenging. Specifically, we prove that any algorithm that explicitly maintains a constant factor multiplicative approximation of the PageRank vector of a directed graph must have amortized update time Ω(n^{1-δ}), for any δ > 0, even in the incremental setting, thereby resolving a 13-year old open question of Bahmani et al. (VLDB 2010). This sharply contrasts with the undirected setting, where we show that poly log n update time is feasible, even in the fully dynamic setting under oblivious adversary.

Cite as

Rajesh Jayaram, Jakub Łącki, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski. Dynamic PageRank: Algorithms and Lower Bounds. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 90:1-90:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{jayaram_et_al:LIPIcs.ICALP.2024.90,
  author =	{Jayaram, Rajesh and {\L}\k{a}cki, Jakub and Mitrovi\'{c}, Slobodan and Onak, Krzysztof and Sankowski, Piotr},
  title =	{{Dynamic PageRank: Algorithms and Lower Bounds}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{90:1--90:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.90},
  URN =		{urn:nbn:de:0030-drops-202336},
  doi =		{10.4230/LIPIcs.ICALP.2024.90},
  annote =	{Keywords: PageRank, dynamic algorithms, graph algorithms}
}

Document

APPROX

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2021.11

An Optimal Algorithm for Triangle Counting in the Stream

Authors: Rajesh Jayaram and John Kallaugher

Published in: LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

Abstract

We present a new algorithm for approximating the number of triangles in a graph G whose edges arrive as an arbitrary order stream. If m is the number of edges in G, T the number of triangles, Δ_E the maximum number of triangles which share a single edge, and Δ_V the maximum number of triangles which share a single vertex, then our algorithm requires space: Õ(m/T⋅(Δ_E + √{Δ_V})) Taken with the Ω((m Δ_E)/T) lower bound of Braverman, Ostrovsky, and Vilenchik (ICALP 2013), and the Ω((m √{Δ_V})/T) lower bound of Kallaugher and Price (SODA 2017), our algorithm is optimal up to log factors, resolving the complexity of a classic problem in graph streaming.

Cite as

Rajesh Jayaram and John Kallaugher. An Optimal Algorithm for Triangle Counting in the Stream. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 11:1-11:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{jayaram_et_al:LIPIcs.APPROX/RANDOM.2021.11,
  author =	{Jayaram, Rajesh and Kallaugher, John},
  title =	{{An Optimal Algorithm for Triangle Counting in the Stream}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)},
  pages =	{11:1--11:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-207-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{207},
  editor =	{Wootters, Mary and Sanit\`{a}, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.11},
  URN =		{urn:nbn:de:0030-drops-147046},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2021.11},
  annote =	{Keywords: Triangle Counting, Streaming, Graph Algorithms, Sampling, Sketching}
}

Document

APPROX

DOI: 10.4230/LIPIcs.APPROX-RANDOM.2019.29

Towards Optimal Moment Estimation in Streaming and Distributed Models

Authors: Rajesh Jayaram and David P. Woodruff

Published in: LIPIcs, Volume 145, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)

Abstract

One of the oldest problems in the data stream model is to approximate the p-th moment ||X||_p^p = sum_{i=1}^n X_i^p of an underlying non-negative vector X in R^n, which is presented as a sequence of poly(n) updates to its coordinates. Of particular interest is when p in (0,2]. Although a tight space bound of Theta(epsilon^-2 log n) bits is known for this problem when both positive and negative updates are allowed, surprisingly there is still a gap in the space complexity of this problem when all updates are positive. Specifically, the upper bound is O(epsilon^-2 log n) bits, while the lower bound is only Omega(epsilon^-2 + log n) bits. Recently, an upper bound of O~(epsilon^-2 + log n) bits was obtained under the assumption that the updates arrive in a random order. We show that for p in (0, 1], the random order assumption is not needed. Namely, we give an upper bound for worst-case streams of O~(epsilon^-2 + log n) bits for estimating |X |_p^p. Our techniques also give new upper bounds for estimating the empirical entropy in a stream. On the other hand, we show that for p in (1,2], in the natural coordinator and blackboard distributed communication topologies, there is an O~(epsilon^-2) bit max-communication upper bound based on a randomized rounding scheme. Our protocols also give rise to protocols for heavy hitters and approximate matrix product. We generalize our results to arbitrary communication topologies G, obtaining an O~(epsilon^2 log d) max-communication upper bound, where d is the diameter of G. Interestingly, our upper bound rules out natural communication complexity-based approaches for proving an Omega(epsilon^-2 log n) bit lower bound for p in (1,2] for streaming algorithms. In particular, any such lower bound must come from a topology with large diameter.

Cite as

Rajesh Jayaram and David P. Woodruff. Towards Optimal Moment Estimation in Streaming and Distributed Models. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 29:1-29:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{jayaram_et_al:LIPIcs.APPROX-RANDOM.2019.29,
  author =	{Jayaram, Rajesh and Woodruff, David P.},
  title =	{{Towards Optimal Moment Estimation in Streaming and Distributed Models}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)},
  pages =	{29:1--29:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-125-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{145},
  editor =	{Achlioptas, Dimitris and V\'{e}gh, L\'{a}szl\'{o} A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2019.29},
  URN =		{urn:nbn:de:0030-drops-112443},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2019.29},
  annote =	{Keywords: Streaming, Sketching, Message Passing, Moment Estimation}
}

Document

DOI: 10.4230/LIPIcs.ICALP.2017.19

Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!

Authors: Rajesh Jayaram and Barna Saha

Published in: LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)

Abstract

In 1975, a breakthrough result of L. Valiant showed that parsing context free grammars can be reduced to Boolean matrix multiplication, resulting in a running time of O(n^omega) for parsing where omega <= 2.373 is the exponent of fast matrix multiplication, and n is the string length. Recently, Abboud, Backurs and V. Williams (FOCS 2015) demonstrated that this is likely optimal; moreover, a combinatorial o(n^3) algorithm is unlikely to exist for the general parsing problem. The language edit distance problem is a significant generalization of the parsing problem, which computes the minimum edit distance of a given string (using insertions, deletions, and substitutions) to any valid string in the language, and has received significant attention both in theory and practice since the seminal work of Aho and Peterson in 1972. Clearly, the lower bound for parsing rules out any algorithm running in o(n^omega) time that can return a nontrivial multiplicative approximation of the language edit distance problem. Furthermore, combinatorial algorithms with cubic running time or algorithms that use fast matrix multiplication are often not desirable in practice. To break this n^omega hardness barrier, in this paper we study additive approximation algorithms for language edit distance. We provide two explicit combinatorial algorithms to obtain a string with minimum edit distance with performance dependencies on either the number of non-linear productions, k^*, or the number of nested non-linear production, k, used in the optimal derivation. Explicitly, we give an additive O(k^*gamma) approximation in time O(|G|(n^2 + (n/gamma)^3)) and an additive O(k gamma) approximation in time O(|G|(n^2 + (n^3/gamma^2))), where |G| is the grammar size and n is the string length. In particular, we obtain tight approximations for an important subclass of context free grammars known as ultralinear grammars, for which k and k^* are naturally bounded. Interestingly, we show that the same conditional lower bound for parsing context free grammars holds for the class of ultralinear grammars as well, clearly marking the boundary where parsing becomes hard!

Cite as

Rajesh Jayaram and Barna Saha. Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 19:1-19:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{jayaram_et_al:LIPIcs.ICALP.2017.19,
  author =	{Jayaram, Rajesh and Saha, Barna},
  title =	{{Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!}},
  booktitle =	{44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)},
  pages =	{19:1--19:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-041-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{80},
  editor =	{Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.19},
  URN =		{urn:nbn:de:0030-drops-74548},
  doi =		{10.4230/LIPIcs.ICALP.2017.19},
  annote =	{Keywords: Approximation, Edit Distance, Dynamic Programming, Context Free Grammar, Hardness}
}

Search Results

Documents authored by Jayaram, Rajesh

It’s Hard to HAC Average Linkage!

Abstract

Cite as

Dynamic PageRank: Algorithms and Lower Bounds

Abstract

Cite as

An Optimal Algorithm for Triangle Counting in the Stream

Abstract

Cite as

Towards Optimal Moment Estimation in Streaming and Distributed Models

Abstract

Cite as

Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!

Abstract

Cite as

Thanks for your feedback!

Could not send message