DROPS

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2024.33

On the Amortized Complexity of Approximate Counting

Authors: Ishaq Aden-Ali, Yanjun Han, Jelani Nelson, and Huacheng Yu

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)

Abstract

Naively storing a counter up to value n would require Ω(log n) bits of memory. Nelson and Yu [Jelani Nelson and Huacheng Yu, 2022], following work of Morris [Robert H. Morris, 1978], showed that if the query answers need only be (1+ε)-approximate with probability at least 1 - δ, then O(log log n + log log(1/δ) + log(1/ε)) bits suffice, and in fact this bound is tight. Morris' original motivation for studying this problem though, as well as modern applications, require not only maintaining one counter, but rather k counters for k large. This motivates the following question: for k large, can k counters be simultaneously maintained using asymptotically less memory than k times the cost of an individual counter? That is to say, does this problem benefit from an improved amortized space complexity bound? We answer this question in the negative. Specifically, we prove a lower bound for nearly the full range of parameters showing that, in terms of memory usage, there is no asymptotic benefit possible via amortization when storing multiple counters. Our main proof utilizes a certain notion of "information cost" recently introduced by Braverman, Garg and Woodruff [Mark Braverman et al., 2020] to prove lower bounds for streaming algorithms.

Cite as

Ishaq Aden-Ali, Yanjun Han, Jelani Nelson, and Huacheng Yu. On the Amortized Complexity of Approximate Counting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 33:1-33:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{adenali_et_al:LIPIcs.APPROX/RANDOM.2024.33,
  author =	{Aden-Ali, Ishaq and Han, Yanjun and Nelson, Jelani and Yu, Huacheng},
  title =	{{On the Amortized Complexity of Approximate Counting}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)},
  pages =	{33:1--33:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-348-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{317},
  editor =	{Kumar, Amit and Ron-Zewi, Noga},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2024.33},
  URN =		{urn:nbn:de:0030-drops-210264},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2024.33},
  annote =	{Keywords: streaming, approximate counting, information complexity, lower bounds}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2024.99

Randomized vs. Deterministic Separation in Time-Space Tradeoffs of Multi-Output Functions

Authors: Huacheng Yu and Wei Zhan

Published in: LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)

Abstract

We prove the first polynomial separation between randomized and deterministic time-space tradeoffs of multi-output functions. In particular, we present a total function that on the input of n elements in [n], outputs O(n) elements, such that: - There exists a randomized oblivious algorithm with space O(log n), time O(nlog n) and one-way access to randomness, that computes the function with probability 1-O(1/n); - Any deterministic oblivious branching program with space S and time T that computes the function must satisfy T²S ≥ Ω(n^{2.5}/log n). This implies that logspace randomized algorithms for multi-output functions cannot be black-box derandomized without an Ω̃(n^{1/4}) overhead in time. Since previously all the polynomial time-space tradeoffs of multi-output functions are proved via the Borodin-Cook method, which is a probabilistic method that inherently gives the same lower bound for randomized and deterministic branching programs, our lower bound proof is intrinsically different from previous works. We also examine other natural candidates for proving such separations, and show that any polynomial separation for these problems would resolve the long-standing open problem of proving n^{1+Ω(1)} time lower bound for decision problems with polylog(n) space.

Cite as

Huacheng Yu and Wei Zhan. Randomized vs. Deterministic Separation in Time-Space Tradeoffs of Multi-Output Functions. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 99:1-99:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{yu_et_al:LIPIcs.ITCS.2024.99,
  author =	{Yu, Huacheng and Zhan, Wei},
  title =	{{Randomized vs. Deterministic Separation in Time-Space Tradeoffs of Multi-Output Functions}},
  booktitle =	{15th Innovations in Theoretical Computer Science Conference (ITCS 2024)},
  pages =	{99:1--99:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-309-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{287},
  editor =	{Guruswami, Venkatesan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.99},
  URN =		{urn:nbn:de:0030-drops-196270},
  doi =		{10.4230/LIPIcs.ITCS.2024.99},
  annote =	{Keywords: Time-space tradeoffs, Randomness, Borodin-Cook method}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2024.100

Sampling, Flowers and Communication

Authors: Huacheng Yu and Wei Zhan

Published in: LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)

Abstract

Given a distribution over [n]ⁿ such that any k coordinates need k/log^{O(1)}n bits of communication to sample, we prove that any map that samples this distribution from uniform cells requires locality Ω(log(n/k)/log log(n/k)). In particular, we show that for any constant δ > 0, there exists ε = 2^{-Ω(n^{1-δ})} such that Ω(log n/log log n) non-adaptive cell probes on uniform cells are required to: - Sample a uniformly random permutation on n elements with error 1-ε. This provides an exponential improvement on the Ω(log log n) cell probe lower bound by Viola. - Sample an n-vector with each element independently drawn from a random n^{1-δ}-vector, with error 1-ε. This provides the first adaptive vs non-adaptive cell probe separation for sampling. The major technical component in our proof is a new combinatorial theorem about flower with small kernel, i.e. a collection of sets where few elements appear more than once. We show that in a family of n sets, each with size O(log n/log log n), there must be k = poly(n) sets where at most k/log^{O(1)}n elements appear more than once. To show the lower bound on sampling permutation, we also prove a new Ω(k) communication lower bound on sampling uniformly distributed disjoint subsets of [n] of size k, with error 1-2^{-Ω(k²/n)}. This result unifies and subsumes the lower bound for k = Θ(√n) by Ambainis et al., and the lower bound for k = Θ(n) by Göös and Watson.

Cite as

Huacheng Yu and Wei Zhan. Sampling, Flowers and Communication. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 100:1-100:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{yu_et_al:LIPIcs.ITCS.2024.100,
  author =	{Yu, Huacheng and Zhan, Wei},
  title =	{{Sampling, Flowers and Communication}},
  booktitle =	{15th Innovations in Theoretical Computer Science Conference (ITCS 2024)},
  pages =	{100:1--100:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-309-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{287},
  editor =	{Guruswami, Venkatesan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.100},
  URN =		{urn:nbn:de:0030-drops-196288},
  doi =		{10.4230/LIPIcs.ITCS.2024.100},
  annote =	{Keywords: Flower, Sampling, Cell probe, Communcation complexity}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2023.57

On Constructing Spanners from Random Gaussian Projections

Authors: Sepehr Assadi, Michael Kapralov, and Huacheng Yu

Published in: LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

Abstract

Graph sketching is a powerful paradigm for analyzing graph structure via linear measurements introduced by Ahn, Guha, and McGregor (SODA'12) that has since found numerous applications in streaming, distributed computing, and massively parallel algorithms, among others. Graph sketching has proven to be quite successful for various problems such as connectivity, minimum spanning trees, edge or vertex connectivity, and cut or spectral sparsifiers. Yet, the problem of approximating shortest path metric of a graph, and specifically computing a spanner, is notably missing from the list of successes. This has turned the status of this fundamental problem into one of the most longstanding open questions in this area. We present a partial explanation of this lack of success by proving a strong lower bound for a large family of graph sketching algorithms that encompasses prior work on spanners and many (but importantly not also all) related cut-based problems mentioned above. Our lower bound matches the algorithmic bounds of the recent result of Filtser, Kapralov, and Nouri (SODA'21), up to lower order terms, for constructing spanners via the same graph sketching family. This establishes near-optimality of these bounds, at least restricted to this family of graph sketching techniques, and makes progress on a conjecture posed in this latter work.

Cite as

Sepehr Assadi, Michael Kapralov, and Huacheng Yu. On Constructing Spanners from Random Gaussian Projections. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 57:1-57:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{assadi_et_al:LIPIcs.APPROX/RANDOM.2023.57,
  author =	{Assadi, Sepehr and Kapralov, Michael and Yu, Huacheng},
  title =	{{On Constructing Spanners from Random Gaussian Projections}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)},
  pages =	{57:1--57:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-296-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{275},
  editor =	{Megow, Nicole and Smith, Adam},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.57},
  URN =		{urn:nbn:de:0030-drops-188821},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2023.57},
  annote =	{Keywords: sketching algorithm, lower bound, graph spanner}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2023.80

Characterizing the Multi-Pass Streaming Complexity for Solving Boolean CSPs Exactly

Authors: Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, and Huacheng Yu

Published in: LIPIcs, Volume 251, 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)

Abstract

We study boolean constraint satisfaction problems (CSPs) Max-CSP^f_n for all predicates f: {0,1}^k → {0,1}. In these problems, given an integer v and a list of constraints over n boolean variables, each obtained by applying f to a sequence of literals, we wish to decide if there is an assignment to the variables that satisfies at least v constraints. We consider these problems in the streaming model, where the algorithm makes a small number of passes over the list of constraints. Our first and main result is the following complete characterization: For every predicate f, the streaming space complexity of the Max-CSP^f_n problem is Θ̃(n^deg(f)), where deg(f) is the degree of f when viewed as a multilinear polynomial. While the upper bound is obtained by a (very simple) one-pass streaming algorithm, our lower bound shows that a better space complexity is impossible even with constant-pass streaming algorithms. Building on our techniques, we are also able to get an optimal Ω(n²) lower bound on the space complexity of constant-pass streaming algorithms for the well studied Max-CUT problem, even though it is not technically a Max-CSP^f_n problem as, e.g., negations of variables and repeated constraints are not allowed.

Cite as

Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, and Huacheng Yu. Characterizing the Multi-Pass Streaming Complexity for Solving Boolean CSPs Exactly. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 80:1-80:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{kol_et_al:LIPIcs.ITCS.2023.80,
  author =	{Kol, Gillat and Paramonov, Dmitry and Saxena, Raghuvansh R. and Yu, Huacheng},
  title =	{{Characterizing the Multi-Pass Streaming Complexity for Solving Boolean CSPs Exactly}},
  booktitle =	{14th Innovations in Theoretical Computer Science Conference (ITCS 2023)},
  pages =	{80:1--80:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-263-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{251},
  editor =	{Tauman Kalai, Yael},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2023.80},
  URN =		{urn:nbn:de:0030-drops-175837},
  doi =		{10.4230/LIPIcs.ITCS.2023.80},
  annote =	{Keywords: Streaming algorithms, Constraint Satisfaction Problems}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2021.52

Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs

Authors: Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)

Abstract

For a directed graph G with n vertices and a start vertex u_start, we wish to (approximately) sample an L-step random walk over G starting from u_start with minimum space using an algorithm that only makes few passes over the edges of the graph. This problem found many applications, for instance, in approximating the PageRank of a webpage. If only a single pass is allowed, the space complexity of this problem was shown to be Θ̃(n ⋅ L). Prior to our work, a better space complexity was only known with Õ(√L) passes. We essentially settle the space complexity of this random walk simulation problem for two-pass streaming algorithms, showing that it is Θ̃(n ⋅ √L), by giving almost matching upper and lower bounds. Our lower bound argument extends to every constant number of passes p, and shows that any p-pass algorithm for this problem uses Ω̃(n ⋅ L^{1/p}) space. In addition, we show a similar Θ̃(n ⋅ √L) bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an L-step random walk from every vertex in the graph.

Cite as

Lijie Chen, Gillat Kol, Dmitry Paramonov, Raghuvansh R. Saxena, Zhao Song, and Huacheng Yu. Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 52:1-52:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ICALP.2021.52,
  author =	{Chen, Lijie and Kol, Gillat and Paramonov, Dmitry and Saxena, Raghuvansh R. and Song, Zhao and Yu, Huacheng},
  title =	{{Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{52:1--52:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.52},
  URN =		{urn:nbn:de:0030-drops-141218},
  doi =		{10.4230/LIPIcs.ICALP.2021.52},
  annote =	{Keywords: streaming algorithms, random walk sampling}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2020.79

Succinct Filters for Sets of Unknown Sizes

Authors: Mingmou Liu, Yitong Yin, and Huacheng Yu

Published in: LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)

Abstract

The membership problem asks to maintain a set S ⊆ [u], supporting insertions and membership queries, i.e., testing if a given element is in the set. A data structure that computes exact answers is called a dictionary. When a (small) false positive rate ε is allowed, the data structure is called a filter. The space usages of the standard dictionaries or filters usually depend on the upper bound on the size of S, while the actual set can be much smaller. Pagh, Segev and Wieder [Pagh et al., 2013] were the first to study filters with varying space usage based on the current |S|. They showed in order to match the space with the current set size n = |S|, any filter data structure must use (1-o(1))n(log(1/ε)+(1-O(ε))log log n) bits, in contrast to the well-known lower bound of N log(1/ε) bits, where N is an upper bound on |S|. They also presented a data structure with almost optimal space of (1+o(1))n(log(1/ε)+O(log log n)) bits provided that n > u^0.001, with expected amortized constant insertion time and worst-case constant lookup time. In this work, we present a filter data structure with improvements in two aspects: - it has constant worst-case time for all insertions and lookups with high probability; - it uses space (1+o(1))n(log (1/ε)+log log n) bits when n > u^0.001, achieving optimal leading constant for all ε = o(1). We also present a dictionary that uses (1+o(1))nlog(u/n) bits of space, matching the optimal space in terms of the current size, and performs all operations in constant time with high probability.

Cite as

Mingmou Liu, Yitong Yin, and Huacheng Yu. Succinct Filters for Sets of Unknown Sizes. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 79:1-79:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{liu_et_al:LIPIcs.ICALP.2020.79,
  author =	{Liu, Mingmou and Yin, Yitong and Yu, Huacheng},
  title =	{{Succinct Filters for Sets of Unknown Sizes}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{79:1--79:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.79},
  URN =		{urn:nbn:de:0030-drops-124867},
  doi =		{10.4230/LIPIcs.ICALP.2020.79},
  annote =	{Keywords: Bloom filters, Data structures, Approximate set membership, Dictionaries}
}

Search Results

Documents authored by Yu, Huacheng

On the Amortized Complexity of Approximate Counting

Abstract

Cite as

Randomized vs. Deterministic Separation in Time-Space Tradeoffs of Multi-Output Functions

Abstract

Cite as

Sampling, Flowers and Communication

Abstract

Cite as

On Constructing Spanners from Random Gaussian Projections

Abstract

Cite as

Characterizing the Multi-Pass Streaming Complexity for Solving Boolean CSPs Exactly

Abstract

Cite as

Near-Optimal Two-Pass Streaming Algorithm for Sampling Random Walks over Directed Graphs

Abstract

Cite as

Succinct Filters for Sets of Unknown Sizes

Abstract

Cite as

Thanks for your feedback!

Could not send message