3 Search Results for "Tětek, Jakub"

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2023.62

Bias Reduction for Sum Estimation

Authors: Talya Eden, Jakob Bæk Tejs Houen, Shyam Narayanan, Will Rosenbaum, and Jakub Tětek

Published in: LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

Abstract

In classical statistics and distribution testing, it is often assumed that elements can be sampled exactly from some distribution 𝒫, and that when an element x is sampled, the probability 𝒫(x) of sampling x is also known. In this setting, recent work in distribution testing has shown that many algorithms are robust in the sense that they still produce correct output if the elements are drawn from any distribution 𝒬 that is sufficiently close to 𝒫. This phenomenon raises interesting questions: under what conditions is a "noisy" distribution 𝒬 sufficient, and what is the algorithmic cost of coping with this noise? In this paper, we investigate these questions for the problem of estimating the sum of a multiset of N real values x_1, …, x_N. This problem is well-studied in the statistical literature in the case 𝒫 = 𝒬, where the Hansen-Hurwitz estimator [Annals of Mathematical Statistics, 1943] is frequently used. We assume that for some (known) distribution 𝒫, values are sampled from a distribution 𝒬 that is pointwise close to 𝒫. That is, there is a parameter γ < 1 such that for all x_i, (1 - γ) 𝒫(i) ≤ 𝒬(i) ≤ (1 + γ) 𝒫(i). For every positive integer k we define an estimator ζ_k for μ = ∑_i x_i whose bias is proportional to γ^k (where our ζ₁ reduces to the classical Hansen-Hurwitz estimator). As a special case, we show that if 𝒬 is pointwise γ-close to uniform and all x_i ∈ {0, 1}, for any ε > 0, we can estimate μ to within additive error ε N using m = Θ(N^{1-1/k}/ε^{2/k}) samples, where k = ⌈lg ε/lg γ⌉. We then show that this sample complexity is essentially optimal. Interestingly, our upper and lower bounds show that the sample complexity need not vary uniformly with the desired error parameter ε: for some values of ε, perturbations in its value have no asymptotic effect on the sample complexity, while for other values, any decrease in its value results in an asymptotically larger sample complexity.

Cite as

Talya Eden, Jakob Bæk Tejs Houen, Shyam Narayanan, Will Rosenbaum, and Jakub Tětek. Bias Reduction for Sum Estimation. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 62:1-62:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.APPROX/RANDOM.2023.62,
  author =	{Eden, Talya and Houen, Jakob B{\ae}k Tejs and Narayanan, Shyam and Rosenbaum, Will and T\v{e}tek, Jakub},
  title =	{{Bias Reduction for Sum Estimation}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)},
  pages =	{62:1--62:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-296-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{275},
  editor =	{Megow, Nicole and Smith, Adam},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.62},
  URN =		{urn:nbn:de:0030-drops-188872},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2023.62},
  annote =	{Keywords: bias reduction, sum estimation, sublinear time algorithms, sample complexity}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2022.26

Privately Estimating Graph Parameters in Sublinear Time

Authors: Jeremiah Blocki, Elena Grigorescu, and Tamalika Mukherjee

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Abstract

We initiate a systematic study of algorithms that are both differentially-private and run in sublinear time for several problems in which the goal is to estimate natural graph parameters. Our main result is a differentially-private (1+ρ)-approximation algorithm for the problem of computing the average degree of a graph, for every ρ > 0. The running time of the algorithm is roughly the same (for sparse graphs) as its non-private version proposed by Goldreich and Ron (Sublinear Algorithms, 2005). We also obtain the first differentially-private sublinear-time approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph. An overarching technique we employ is the notion of coupled global sensitivity of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in ad-hoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms.

Cite as

Jeremiah Blocki, Elena Grigorescu, and Tamalika Mukherjee. Privately Estimating Graph Parameters in Sublinear Time. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 26:1-26:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{blocki_et_al:LIPIcs.ICALP.2022.26,
  author =	{Blocki, Jeremiah and Grigorescu, Elena and Mukherjee, Tamalika},
  title =	{{Privately Estimating Graph Parameters in Sublinear Time}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{26:1--26:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.26},
  URN =		{urn:nbn:de:0030-drops-163674},
  doi =		{10.4230/LIPIcs.ICALP.2022.26},
  annote =	{Keywords: differential privacy, sublinear time, graph algorithms}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2022.107

Approximate Triangle Counting via Sampling and Fast Matrix Multiplication

Authors: Jakub Tětek

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Abstract

There is a simple O(n³/{ε²T}) time algorithm for 1±ε-approximate triangle counting where T is the number of triangles in the graph and n the number of vertices. At the same time, one may count triangles exactly using fast matrix multiplication in time Õ(n^ω). Is it possible to get a negative dependency on the number of triangles T while retaining the state-of-the-art n^ω dependency on n? We answer this question positively by providing an algorithm which runs in time O({n^ω}/T^{ω-2})⋅poly(n^o(1)/ε). This is optimal in the sense that as long as the exponent of T is independent of n, T, it cannot be improved while retaining the dependency on n. Our algorithm improves upon the state of the art when T ≫ 1 and T ≪ n. We also consider the problem of approximate triangle counting in sparse graphs, parameterized by the number of edges m. The best known algorithm runs in time Õ_ε(m^{3/2}/T) [Eden et al., SIAM Journal on Computing, 2017]. An algorithm by Alon et al. [JACM, 1995] for exact triangle counting that runs in time Õ(m^{2ω/(ω + 1)}). We again get an algorithm whose complexity has a state-of-the-art dependency on m while having negative dependency on T. Specifically, our algorithm runs in time O({m^{2ω/(ω+1)}}/{T^{2(ω-1)/(ω+1)}}) ⋅ poly(n^o(1)/ε). This is again optimal in the sense that no better constant exponent of T is possible without worsening the dependency on m. This algorithm improves upon the state of the art when T ≫ 1 and T ≪ √m. In both cases, algorithms with time complexity matching query complexity lower bounds were known on some range of parameters. While those algorithms have optimal query complexity for the whole range of T, the time complexity departs from the query complexity and is no longer optimal (as we show) for T ≪ n and T ≪ √m, respectively. We focus on the time complexity in this range of T. To the best of our knowledge, this is the first paper considering the discrepancy between query and time complexity in graph parameter estimation.

Cite as

Jakub Tětek. Approximate Triangle Counting via Sampling and Fast Matrix Multiplication. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 107:1-107:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{tetek:LIPIcs.ICALP.2022.107,
  author =	{T\v{e}tek, Jakub},
  title =	{{Approximate Triangle Counting via Sampling and Fast Matrix Multiplication}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{107:1--107:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.107},
  URN =		{urn:nbn:de:0030-drops-164485},
  doi =		{10.4230/LIPIcs.ICALP.2022.107},
  annote =	{Keywords: Approximate triangle counting, Fast matrix multiplication, Sampling}
}

Refine by Author
2 Tětek, Jakub
1 Blocki, Jeremiah
1 Eden, Talya
1 Grigorescu, Elena
1 Houen, Jakob Bæk Tejs
Show More...

Refine by Classification
2 Theory of computation → Streaming, sublinear and near linear time algorithms
1 Mathematics of computing → Probabilistic algorithms
1 Security and privacy → Privacy-preserving protocols
1 Theory of computation → Lower bounds and information complexity
1 Theory of computation → Sample complexity and generalization bounds
Show More...

Refine by Keyword
1 Approximate triangle counting
1 Fast matrix multiplication
1 Sampling
1 bias reduction
1 differential privacy
Show More...

Refine by Type
3 document

Refine by Publication Year
2 2022
1 2023

3 Search Results for "Tětek, Jakub"

Bias Reduction for Sum Estimation

Abstract

Cite as

Privately Estimating Graph Parameters in Sublinear Time

Abstract

Cite as

Approximate Triangle Counting via Sampling and Fast Matrix Multiplication

Abstract

Cite as

Thanks for your feedback!

Could not send message