Document

RANDOM

**Published in:** LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

In classical statistics and distribution testing, it is often assumed that elements can be sampled exactly from some distribution 𝒫, and that when an element x is sampled, the probability 𝒫(x) of sampling x is also known. In this setting, recent work in distribution testing has shown that many algorithms are robust in the sense that they still produce correct output if the elements are drawn from any distribution 𝒬 that is sufficiently close to 𝒫. This phenomenon raises interesting questions: under what conditions is a "noisy" distribution 𝒬 sufficient, and what is the algorithmic cost of coping with this noise?
In this paper, we investigate these questions for the problem of estimating the sum of a multiset of N real values x_1, …, x_N. This problem is well-studied in the statistical literature in the case 𝒫 = 𝒬, where the Hansen-Hurwitz estimator [Annals of Mathematical Statistics, 1943] is frequently used. We assume that for some (known) distribution 𝒫, values are sampled from a distribution 𝒬 that is pointwise close to 𝒫. That is, there is a parameter γ < 1 such that for all x_i, (1 - γ) 𝒫(i) ≤ 𝒬(i) ≤ (1 + γ) 𝒫(i). For every positive integer k we define an estimator ζ_k for μ = ∑_i x_i whose bias is proportional to γ^k (where our ζ₁ reduces to the classical Hansen-Hurwitz estimator). As a special case, we show that if 𝒬 is pointwise γ-close to uniform and all x_i ∈ {0, 1}, for any ε > 0, we can estimate μ to within additive error ε N using m = Θ(N^{1-1/k}/ε^{2/k}) samples, where k = ⌈lg ε/lg γ⌉. We then show that this sample complexity is essentially optimal. Interestingly, our upper and lower bounds show that the sample complexity need not vary uniformly with the desired error parameter ε: for some values of ε, perturbations in its value have no asymptotic effect on the sample complexity, while for other values, any decrease in its value results in an asymptotically larger sample complexity.

Talya Eden, Jakob Bæk Tejs Houen, Shyam Narayanan, Will Rosenbaum, and Jakub Tětek. Bias Reduction for Sum Estimation. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 62:1-62:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.APPROX/RANDOM.2023.62, author = {Eden, Talya and Houen, Jakob B{\ae}k Tejs and Narayanan, Shyam and Rosenbaum, Will and T\v{e}tek, Jakub}, title = {{Bias Reduction for Sum Estimation}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)}, pages = {62:1--62:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-296-9}, ISSN = {1868-8969}, year = {2023}, volume = {275}, editor = {Megow, Nicole and Smith, Adam}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.62}, URN = {urn:nbn:de:0030-drops-188872}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2023.62}, annote = {Keywords: bias reduction, sum estimation, sublinear time algorithms, sample complexity} }

Document

**Published in:** LIPIcs, Volume 246, 36th International Symposium on Distributed Computing (DISC 2022)

We consider packet forwarding in the adversarial queueing theory (AQT) model introduced by Borodin et al. We introduce a refinement of the AQT (ρ, σ)-bounded adversary, which we call a locally bursty adversary (LBA) that parameterizes injection patterns jointly by edge utilization and packet origin. For constant (O(1)) parameters, the LBA model is strictly more permissive than the (ρ, σ) model. For example, there are injection patterns in the LBA model with constant parameters that can only be realized as (ρ, σ)-bounded injection patterns with ρ + σ = Ω(n) (where n is the network size). We show that the LBA model (unlike the (ρ, σ) model) is closed under packet bundling and discretization operations. Thus, the LBA model allows one to reduce the study of general (uniform) capacity networks and inhomogenous packet sizes to unit capacity networks with homogeneous packets.
On the algorithmic side, we focus on information gathering networks - i.e., networks in which all packets share a common destination, and the union of packet routes forms a tree. We show that the Odd-Even Downhill (OED) forwarding protocol described independently by Dobrev et al. and Patt-Shamir and Rosenbaum achieves buffer space usage of O(log n) against all LBAs with constant parameters. OED is a local protocol, but we show that the upper bound is tight even when compared to centralized protocols. Our lower bound for the LBA model is in contrast to the (ρ, σ)-model, where centralized protocols can achieve worst-case buffer space usage O(1) for ρ, σ = O(1), while the O(log n) upper bound for OED is optimal only for local protocols.

Will Rosenbaum. Packet Forwarding with a Locally Bursty Adversary. In 36th International Symposium on Distributed Computing (DISC 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 246, pp. 34:1-34:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{rosenbaum:LIPIcs.DISC.2022.34, author = {Rosenbaum, Will}, title = {{Packet Forwarding with a Locally Bursty Adversary}}, booktitle = {36th International Symposium on Distributed Computing (DISC 2022)}, pages = {34:1--34:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-255-6}, ISSN = {1868-8969}, year = {2022}, volume = {246}, editor = {Scheideler, Christian}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2022.34}, URN = {urn:nbn:de:0030-drops-172254}, doi = {10.4230/LIPIcs.DISC.2022.34}, annote = {Keywords: packet forwarding, packet scheduling, adversarial queueing theory, network calculus, odd-even downhill forwarding, locally bursty adversary, local algorithms} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Counting and sampling small subgraphs are fundamental algorithmic tasks. Motivated by the need to handle massive datasets efficiently, recent theoretical work has examined the problems in the sublinear time regime. In this work, we consider the problem of sampling a k-clique in a graph from an almost uniform distribution. Specifically the algorithm should output each k-clique with probability (1±ε)/n_k, where n_k denotes the number of k-cliques in the graph and ε is a given approximation parameter. To this end, the algorithm may perform degree, neighbor, and pair queries. We focus on the class of graphs with arboricity at most α, and prove that the query complexity of the problem is Θ^*(min{nα , max {(((nα)^(k/2))/n_k)^{1/(k-1)}, (nα^(k-1))/n_k}}), where n is the number of vertices in the graph, and Θ^*(⋅) suppresses dependencies on (log n/ε)^O(k).
Our upper bound is based on defining a special auxiliary graph H_k, such that sampling edges almost uniformly in H_k translates to sampling k-cliques almost uniformly in the original graph G. We then build on a known edge-sampling algorithm (Eden, Ron and Rosenbaum, ICALP19) to sample edges in H_k. The challenge is simulating queries to H_k while being given query access only to G. Our lower bound follows from a construction of a family of graphs with arboricity α such that in each graph there are n_k k-cliques, where one of these cliques is "hidden" and hence hard to sample.

Talya Eden, Dana Ron, and Will Rosenbaum. Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques in Bounded Arboricity Graphs. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 56:1-56:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ICALP.2022.56, author = {Eden, Talya and Ron, Dana and Rosenbaum, Will}, title = {{Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques in Bounded Arboricity Graphs}}, booktitle = {49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)}, pages = {56:1--56:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-235-8}, ISSN = {1868-8969}, year = {2022}, volume = {229}, editor = {Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.56}, URN = {urn:nbn:de:0030-drops-163974}, doi = {10.4230/LIPIcs.ICALP.2022.56}, annote = {Keywords: sublinear time algorithms, graph algorithms, cliques, arboricity, uniform sampling} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

In this paper, we revisit the problem of sampling edges in an unknown graph G = (V, E) from a distribution that is (pointwise) almost uniform over E. We consider the case where there is some a priori upper bound on the arboriciy of G. Given query access to a graph G over n vertices and of average degree {d} and arboricity at most alpha, we design an algorithm that performs O(alpha/d * {log^3 n}/epsilon) queries in expectation and returns an edge in the graph such that every edge e in E is sampled with probability (1 +/- epsilon)/m. The algorithm performs two types of queries: degree queries and neighbor queries. We show that the upper bound is tight (up to poly-logarithmic factors and the dependence in epsilon), as Omega(alpha/d) queries are necessary for the easier task of sampling edges from any distribution over E that is close to uniform in total variational distance. We also prove that even if G is a tree (i.e., alpha = 1 so that alpha/d = Theta(1)), Omega({log n}/{loglog n}) queries are necessary to sample an edge from any distribution that is pointwise close to uniform, thus establishing that a poly(log n) factor is necessary for constant alpha. Finally we show how our algorithm can be applied to obtain a new result on approximately counting subgraphs, based on the recent work of Assadi, Kapralov, and Khanna (ITCS, 2019).

Talya Eden, Dana Ron, and Will Rosenbaum. The Arboricity Captures the Complexity of Sampling Edges. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 52:1-52:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ICALP.2019.52, author = {Eden, Talya and Ron, Dana and Rosenbaum, Will}, title = {{The Arboricity Captures the Complexity of Sampling Edges}}, booktitle = {46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)}, pages = {52:1--52:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-109-2}, ISSN = {1868-8969}, year = {2019}, volume = {132}, editor = {Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.52}, URN = {urn:nbn:de:0030-drops-106287}, doi = {10.4230/LIPIcs.ICALP.2019.52}, annote = {Keywords: sampling, graph algorithms, arboricity, sublinear-time algorithms} }

Document

**Published in:** LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)

In a celebrated work, Blais, Brody, and Matulef [Blais et al., 2012] developed a technique for proving property testing lower bounds via reductions from communication complexity. Their work focused on testing properties of functions, and yielded new lower bounds as well as simplified analyses of known lower bounds. Here, we take a further step in generalizing the methodology of [Blais et al., 2012] to analyze the query complexity of graph parameter estimation problems. In particular, our technique decouples the lower bound arguments from the representation of the graph, allowing it to work with any query type.
We illustrate our technique by providing new simpler proofs of previously known tight lower bounds for the query complexity of several graph problems: estimating the number of edges in a graph, sampling edges from an almost-uniform distribution, estimating the number of triangles (and more generally, r-cliques) in a graph, and estimating the moments of the degree distribution of a graph. We also prove new lower bounds for estimating the edge connectivity of a graph and estimating the number of instances of any fixed subgraph in a graph. We show that the lower bounds for estimating the number of triangles and edge connectivity also hold in a strictly stronger computational model that allows access to uniformly random edge samples.

Talya Eden and Will Rosenbaum. Lower Bounds for Approximating Graph Parameters via Communication Complexity. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 11:1-11:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.APPROX-RANDOM.2018.11, author = {Eden, Talya and Rosenbaum, Will}, title = {{Lower Bounds for Approximating Graph Parameters via Communication Complexity}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)}, pages = {11:1--11:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-085-9}, ISSN = {1868-8969}, year = {2018}, volume = {116}, editor = {Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.11}, URN = {urn:nbn:de:0030-drops-94156}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2018.11}, annote = {Keywords: sublinear graph parameter estimation, lower bounds, communication complexity} }

Document

**Published in:** OASIcs, Volume 61, 1st Symposium on Simplicity in Algorithms (SOSA 2018)

We consider the problem of sampling an edge almost uniformly from an unknown graph, G = (V, E). Access to the graph is provided via queries of the following types: (1) uniform vertex queries, (2) degree queries, and (3) neighbor queries. We describe a new simple algorithm that returns a random edge e in E using \tilde{O}(n/sqrt{eps m}) queries in expectation, such that each edge e is sampled with probability (1 +/- eps)/m. Here, n = |V| is the number of vertices, and m = |E| is the number of edges. Our algorithm is optimal in the sense that any algorithm that samples an edge from an almost-uniform distribution must perform Omega(n/sqrt{m}) queries.

Talya Eden and Will Rosenbaum. On Sampling Edges Almost Uniformly. In 1st Symposium on Simplicity in Algorithms (SOSA 2018). Open Access Series in Informatics (OASIcs), Volume 61, pp. 7:1-7:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:OASIcs.SOSA.2018.7, author = {Eden, Talya and Rosenbaum, Will}, title = {{On Sampling Edges Almost Uniformly}}, booktitle = {1st Symposium on Simplicity in Algorithms (SOSA 2018)}, pages = {7:1--7:9}, series = {Open Access Series in Informatics (OASIcs)}, ISBN = {978-3-95977-064-4}, ISSN = {2190-6807}, year = {2018}, volume = {61}, editor = {Seidel, Raimund}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SOSA.2018.7}, URN = {urn:nbn:de:0030-drops-83001}, doi = {10.4230/OASIcs.SOSA.2018.7}, annote = {Keywords: Sublinear Algorithms, Graph Algorithms, Sampling Edges, Query Complexity} }