Document

RANDOM

**Published in:** LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

In classical statistics and distribution testing, it is often assumed that elements can be sampled exactly from some distribution 𝒫, and that when an element x is sampled, the probability 𝒫(x) of sampling x is also known. In this setting, recent work in distribution testing has shown that many algorithms are robust in the sense that they still produce correct output if the elements are drawn from any distribution 𝒬 that is sufficiently close to 𝒫. This phenomenon raises interesting questions: under what conditions is a "noisy" distribution 𝒬 sufficient, and what is the algorithmic cost of coping with this noise?
In this paper, we investigate these questions for the problem of estimating the sum of a multiset of N real values x_1, …, x_N. This problem is well-studied in the statistical literature in the case 𝒫 = 𝒬, where the Hansen-Hurwitz estimator [Annals of Mathematical Statistics, 1943] is frequently used. We assume that for some (known) distribution 𝒫, values are sampled from a distribution 𝒬 that is pointwise close to 𝒫. That is, there is a parameter γ < 1 such that for all x_i, (1 - γ) 𝒫(i) ≤ 𝒬(i) ≤ (1 + γ) 𝒫(i). For every positive integer k we define an estimator ζ_k for μ = ∑_i x_i whose bias is proportional to γ^k (where our ζ₁ reduces to the classical Hansen-Hurwitz estimator). As a special case, we show that if 𝒬 is pointwise γ-close to uniform and all x_i ∈ {0, 1}, for any ε > 0, we can estimate μ to within additive error ε N using m = Θ(N^{1-1/k}/ε^{2/k}) samples, where k = ⌈lg ε/lg γ⌉. We then show that this sample complexity is essentially optimal. Interestingly, our upper and lower bounds show that the sample complexity need not vary uniformly with the desired error parameter ε: for some values of ε, perturbations in its value have no asymptotic effect on the sample complexity, while for other values, any decrease in its value results in an asymptotically larger sample complexity.

Talya Eden, Jakob Bæk Tejs Houen, Shyam Narayanan, Will Rosenbaum, and Jakub Tětek. Bias Reduction for Sum Estimation. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 62:1-62:21, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.APPROX/RANDOM.2023.62, author = {Eden, Talya and Houen, Jakob B{\ae}k Tejs and Narayanan, Shyam and Rosenbaum, Will and T\v{e}tek, Jakub}, title = {{Bias Reduction for Sum Estimation}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)}, pages = {62:1--62:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-296-9}, ISSN = {1868-8969}, year = {2023}, volume = {275}, editor = {Megow, Nicole and Smith, Adam}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.62}, URN = {urn:nbn:de:0030-drops-188872}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2023.62}, annote = {Keywords: bias reduction, sum estimation, sublinear time algorithms, sample complexity} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)

Many deployments of differential privacy in industry are in the local model, where each party releases its private information via a differentially private randomizer. We study triangle counting in the noninteractive and interactive local model with edge differential privacy (that, intuitively, requires that the outputs of the algorithm on graphs that differ in one edge be indistinguishable). In this model, each party’s local view consists of the adjacency list of one vertex.
In the noninteractive model, we prove that additive Ω(n²) error is necessary, where n is the number of nodes. This lower bound is our main technical contribution. It uses a reconstruction attack with a new class of linear queries and a novel mix-and-match strategy of running the local randomizers with different completions of their adjacency lists. It matches the additive error of the algorithm based on Randomized Response, proposed by Imola, Murakami and Chaudhuri (USENIX2021) and analyzed by Imola, Murakami and Chaudhuri (CCS2022) for constant ε. We use a different postprocessing of Randomized Response and provide tight bounds on the variance of the resulting algorithm.
In the interactive setting, we prove a lower bound of Ω(n^{3/2}) on the additive error. Previously, no hardness results were known for interactive, edge-private algorithms in the local model, except for those that follow trivially from the results for the central model. Our work significantly improves on the state of the art in differentially private graph analysis in the local model.

Talya Eden, Quanquan C. Liu, Sofya Raskhodnikova, and Adam Smith. Triangle Counting with Local Edge Differential Privacy. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 52:1-52:21, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ICALP.2023.52, author = {Eden, Talya and Liu, Quanquan C. and Raskhodnikova, Sofya and Smith, Adam}, title = {{Triangle Counting with Local Edge Differential Privacy}}, booktitle = {50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)}, pages = {52:1--52:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-278-5}, ISSN = {1868-8969}, year = {2023}, volume = {261}, editor = {Etessami, Kousha and Feige, Uriel and Puppis, Gabriele}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.52}, URN = {urn:nbn:de:0030-drops-181048}, doi = {10.4230/LIPIcs.ICALP.2023.52}, annote = {Keywords: local differential privacy, reconstruction attacks, lower bounds, triangle counting} }

Document

APPROX

**Published in:** LIPIcs, Volume 245, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)

Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de-facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate.
Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)-approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting k-cliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space.
In addition to our theoretical results, we simulate our triangle counting algorithms in real-world graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems.

Amartya Shankha Biswas, Talya Eden, Quanquan C. Liu, Ronitt Rubinfeld, and Slobodan Mitrović. Massively Parallel Algorithms for Small Subgraph Counting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 245, pp. 39:1-39:28, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.APPROX/RANDOM.2022.39, author = {Biswas, Amartya Shankha and Eden, Talya and Liu, Quanquan C. and Rubinfeld, Ronitt and Mitrovi\'{c}, Slobodan}, title = {{Massively Parallel Algorithms for Small Subgraph Counting}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)}, pages = {39:1--39:28}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-249-5}, ISSN = {1868-8969}, year = {2022}, volume = {245}, editor = {Chakrabarti, Amit and Swamy, Chaitanya}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2022.39}, URN = {urn:nbn:de:0030-drops-171619}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2022.39}, annote = {Keywords: triangle counting, massively parallel computation, clique counting, approximation algorithms, subgraph counting} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

Counting and sampling small subgraphs are fundamental algorithmic tasks. Motivated by the need to handle massive datasets efficiently, recent theoretical work has examined the problems in the sublinear time regime. In this work, we consider the problem of sampling a k-clique in a graph from an almost uniform distribution. Specifically the algorithm should output each k-clique with probability (1±ε)/n_k, where n_k denotes the number of k-cliques in the graph and ε is a given approximation parameter. To this end, the algorithm may perform degree, neighbor, and pair queries. We focus on the class of graphs with arboricity at most α, and prove that the query complexity of the problem is Θ^*(min{nα , max {(((nα)^(k/2))/n_k)^{1/(k-1)}, (nα^(k-1))/n_k}}), where n is the number of vertices in the graph, and Θ^*(⋅) suppresses dependencies on (log n/ε)^O(k).
Our upper bound is based on defining a special auxiliary graph H_k, such that sampling edges almost uniformly in H_k translates to sampling k-cliques almost uniformly in the original graph G. We then build on a known edge-sampling algorithm (Eden, Ron and Rosenbaum, ICALP19) to sample edges in H_k. The challenge is simulating queries to H_k while being given query access only to G. Our lower bound follows from a construction of a family of graphs with arboricity α such that in each graph there are n_k k-cliques, where one of these cliques is "hidden" and hence hard to sample.

Talya Eden, Dana Ron, and Will Rosenbaum. Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques in Bounded Arboricity Graphs. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 56:1-56:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ICALP.2022.56, author = {Eden, Talya and Ron, Dana and Rosenbaum, Will}, title = {{Almost Optimal Bounds for Sublinear-Time Sampling of k-Cliques in Bounded Arboricity Graphs}}, booktitle = {49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)}, pages = {56:1--56:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-235-8}, ISSN = {1868-8969}, year = {2022}, volume = {229}, editor = {Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.56}, URN = {urn:nbn:de:0030-drops-163974}, doi = {10.4230/LIPIcs.ICALP.2022.56}, annote = {Keywords: sublinear time algorithms, graph algorithms, cliques, arboricity, uniform sampling} }

Document

**Published in:** LIPIcs, Volume 215, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)

A* is a classic and popular method for graphs search and path finding. It assumes the existence of a heuristic function h(u,t) that estimates the shortest distance from any input node u to the destination t. Traditionally, heuristics have been handcrafted by domain experts. However, over the last few years, there has been a growing interest in learning heuristic functions. Such learned heuristics estimate the distance between given nodes based on "features" of those nodes.
In this paper we formalize and initiate the study of such feature-based heuristics. In particular, we consider heuristics induced by norm embeddings and distance labeling schemes, and provide lower bounds for the tradeoffs between the number of dimensions or bits used to represent each graph node, and the running time of the A* algorithm. We also show that, under natural assumptions, our lower bounds are almost optimal.

Talya Eden, Piotr Indyk, and Haike Xu. Embeddings and Labeling Schemes for A*. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 62:1-62:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ITCS.2022.62, author = {Eden, Talya and Indyk, Piotr and Xu, Haike}, title = {{Embeddings and Labeling Schemes for A*}}, booktitle = {13th Innovations in Theoretical Computer Science Conference (ITCS 2022)}, pages = {62:1--62:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-217-4}, ISSN = {1868-8969}, year = {2022}, volume = {215}, editor = {Braverman, Mark}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2022.62}, URN = {urn:nbn:de:0030-drops-156585}, doi = {10.4230/LIPIcs.ITCS.2022.62}, annote = {Keywords: A* algorithm, path finding, graph search} }

Document

RANDOM

**Published in:** LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

We present a sublinear time algorithm that allows one to sample multiple edges from a distribution that is pointwise ε-close to the uniform distribution, in an amortized-efficient fashion. We consider the adjacency list query model, where access to a graph G is given via degree and neighbor queries.
The problem of sampling a single edge in this model has been raised by Eden and Rosenbaum (SOSA 18). Let n and m denote the number of vertices and edges of G, respectively. Eden and Rosenbaum provided upper and lower bounds of Θ^*(n/√ m) for sampling a single edge in general graphs (where O^*(⋅) suppresses poly(1/ε) and poly(log n) dependencies). We ask whether the query complexity lower bound for sampling a single edge can be circumvented when multiple samples are required. That is, can we get an improved amortized per-sample cost if we allow a preprocessing phase? We answer in the affirmative.
We present an algorithm that, if one knows the number of required samples q in advance, has an overall cost that is sublinear in q, namely, O^*(√ q ⋅(n/√ m)), which is strictly preferable to O^*(q⋅ (n/√ m)) cost resulting from q invocations of the algorithm by Eden and Rosenbaum.
Subsequent to a preliminary version of this work, Tětek and Thorup (arXiv, preprint) proved that this bound is essentially optimal.

Talya Eden, Saleet Mossel, and Ronitt Rubinfeld. Sampling Multiple Edges Efficiently. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 51:1-51:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.APPROX/RANDOM.2021.51, author = {Eden, Talya and Mossel, Saleet and Rubinfeld, Ronitt}, title = {{Sampling Multiple Edges Efficiently}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)}, pages = {51:1--51:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-207-5}, ISSN = {1868-8969}, year = {2021}, volume = {207}, editor = {Wootters, Mary and Sanit\`{a}, Laura}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.51}, URN = {urn:nbn:de:0030-drops-147441}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2021.51}, annote = {Keywords: Sampling edges, graph algorithm, sublinear algorithms} }

Document

RANDOM

**Published in:** LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

We consider the problem of sampling and approximately counting an arbitrary given motif H in a graph G, where access to G is given via queries: degree, neighbor, and pair, as well as uniform edge sample queries. Previous algorithms for these tasks were based on a decomposition of H into a collection of odd cycles and stars, denoted D^*(H) = {O_{k₁},...,O_{k_q}, S_{p₁},...,S_{p_𝓁}}. These algorithms were shown to be optimal for the case where H is a clique or an odd-length cycle, but no other lower bounds were known.
We present a new algorithm for sampling arbitrary motifs which, up to poly(log n) factors, is always at least as good, and for most graphs G is strictly better. The main ingredient leading to this improvement is an improved uniform algorithm for sampling stars, which might be of independent interest, as it allows to sample vertices according to the p-th moment of the degree distribution.
Finally, we prove that this algorithm is decomposition-optimal for decompositions that contain at least one odd cycle. These are the first lower bounds for motifs H with a nontrivial decomposition, i.e., motifs that have more than a single component in their decomposition.

Amartya Shankha Biswas, Talya Eden, and Ronitt Rubinfeld. Towards a Decomposition-Optimal Algorithm for Counting and Sampling Arbitrary Motifs in Sublinear Time. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 55:1-55:19, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.APPROX/RANDOM.2021.55, author = {Biswas, Amartya Shankha and Eden, Talya and Rubinfeld, Ronitt}, title = {{Towards a Decomposition-Optimal Algorithm for Counting and Sampling Arbitrary Motifs in Sublinear Time}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)}, pages = {55:1--55:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-207-5}, ISSN = {1868-8969}, year = {2021}, volume = {207}, editor = {Wootters, Mary and Sanit\`{a}, Laura}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.55}, URN = {urn:nbn:de:0030-drops-147480}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2021.55}, annote = {Keywords: Sublinear time algorithms, Graph algorithms, Sampling subgraphs, Approximate counting} }

Document

**Published in:** LIPIcs, Volume 146, 33rd International Symposium on Distributed Computing (DISC 2019)

In this paper we give sublinear-time distributed algorithms in the CONGEST model for subgraph detection for two classes of graphs: cliques and even-length cycles. We show for the first time that all copies of 4-cliques and 5-cliques in the network graph can be listed in sublinear time, O(n^{5/6+o(1)}) rounds and O(n^{21/22+o(1)}) rounds, respectively. Prior to our work, it was not known whether it was possible to even check if the network contains a 4-clique or a 5-clique in sublinear time.
For even-length cycles, C_{2k}, we give an improved sublinear-time algorithm, which exploits a new connection to extremal combinatorics. For example, for 6-cycles we improve the running time from O~(n^{5/6}) to O~(n^{3/4}) rounds. We also show two obstacles on proving lower bounds for C_{2k}-freeness: First, we use the new connection to extremal combinatorics to show that the current lower bound of Omega~(sqrt{n}) rounds for 6-cycle freeness cannot be improved using partition-based reductions from 2-party communication complexity, the technique by which all known lower bounds on subgraph detection have been proven to date. Second, we show that there is some fixed constant delta in (0,1/2) such that for any k, a Omega(n^{1/2+delta}) lower bound on C_{2k}-freeness implies new lower bounds in circuit complexity.
For general subgraphs, it was shown in [Orr Fischer et al., 2018] that for any fixed k, there exists a subgraph H of size k such that H-freeness requires Omega~(n^{2-Theta(1/k)}) rounds. It was left as an open problem whether this is tight, or whether some constant-sized subgraph requires truly quadratic time to detect. We show that in fact, for any subgraph H of constant size k, the H-freeness problem can be solved in O(n^{2 - Theta(1/k)}) rounds, nearly matching the lower bound of [Orr Fischer et al., 2018].

Talya Eden, Nimrod Fiat, Orr Fischer, Fabian Kuhn, and Rotem Oshman. Sublinear-Time Distributed Algorithms for Detecting Small Cliques and Even Cycles. In 33rd International Symposium on Distributed Computing (DISC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 146, pp. 15:1-15:16, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.DISC.2019.15, author = {Eden, Talya and Fiat, Nimrod and Fischer, Orr and Kuhn, Fabian and Oshman, Rotem}, title = {{Sublinear-Time Distributed Algorithms for Detecting Small Cliques and Even Cycles}}, booktitle = {33rd International Symposium on Distributed Computing (DISC 2019)}, pages = {15:1--15:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-126-9}, ISSN = {1868-8969}, year = {2019}, volume = {146}, editor = {Suomela, Jukka}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.DISC.2019.15}, URN = {urn:nbn:de:0030-drops-113224}, doi = {10.4230/LIPIcs.DISC.2019.15}, annote = {Keywords: Distributed Computing, Subgraph Freeness, CONGEST} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)

In this paper, we revisit the problem of sampling edges in an unknown graph G = (V, E) from a distribution that is (pointwise) almost uniform over E. We consider the case where there is some a priori upper bound on the arboriciy of G. Given query access to a graph G over n vertices and of average degree {d} and arboricity at most alpha, we design an algorithm that performs O(alpha/d * {log^3 n}/epsilon) queries in expectation and returns an edge in the graph such that every edge e in E is sampled with probability (1 +/- epsilon)/m. The algorithm performs two types of queries: degree queries and neighbor queries. We show that the upper bound is tight (up to poly-logarithmic factors and the dependence in epsilon), as Omega(alpha/d) queries are necessary for the easier task of sampling edges from any distribution over E that is close to uniform in total variational distance. We also prove that even if G is a tree (i.e., alpha = 1 so that alpha/d = Theta(1)), Omega({log n}/{loglog n}) queries are necessary to sample an edge from any distribution that is pointwise close to uniform, thus establishing that a poly(log n) factor is necessary for constant alpha. Finally we show how our algorithm can be applied to obtain a new result on approximately counting subgraphs, based on the recent work of Assadi, Kapralov, and Khanna (ITCS, 2019).

Talya Eden, Dana Ron, and Will Rosenbaum. The Arboricity Captures the Complexity of Sampling Edges. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 52:1-52:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ICALP.2019.52, author = {Eden, Talya and Ron, Dana and Rosenbaum, Will}, title = {{The Arboricity Captures the Complexity of Sampling Edges}}, booktitle = {46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)}, pages = {52:1--52:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-109-2}, ISSN = {1868-8969}, year = {2019}, volume = {132}, editor = {Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.52}, URN = {urn:nbn:de:0030-drops-106287}, doi = {10.4230/LIPIcs.ICALP.2019.52}, annote = {Keywords: sampling, graph algorithms, arboricity, sublinear-time algorithms} }

Document

**Published in:** LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)

In a celebrated work, Blais, Brody, and Matulef [Blais et al., 2012] developed a technique for proving property testing lower bounds via reductions from communication complexity. Their work focused on testing properties of functions, and yielded new lower bounds as well as simplified analyses of known lower bounds. Here, we take a further step in generalizing the methodology of [Blais et al., 2012] to analyze the query complexity of graph parameter estimation problems. In particular, our technique decouples the lower bound arguments from the representation of the graph, allowing it to work with any query type.
We illustrate our technique by providing new simpler proofs of previously known tight lower bounds for the query complexity of several graph problems: estimating the number of edges in a graph, sampling edges from an almost-uniform distribution, estimating the number of triangles (and more generally, r-cliques) in a graph, and estimating the moments of the degree distribution of a graph. We also prove new lower bounds for estimating the edge connectivity of a graph and estimating the number of instances of any fixed subgraph in a graph. We show that the lower bounds for estimating the number of triangles and edge connectivity also hold in a strictly stronger computational model that allows access to uniformly random edge samples.

Talya Eden and Will Rosenbaum. Lower Bounds for Approximating Graph Parameters via Communication Complexity. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 11:1-11:18, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.APPROX-RANDOM.2018.11, author = {Eden, Talya and Rosenbaum, Will}, title = {{Lower Bounds for Approximating Graph Parameters via Communication Complexity}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)}, pages = {11:1--11:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-085-9}, ISSN = {1868-8969}, year = {2018}, volume = {116}, editor = {Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.11}, URN = {urn:nbn:de:0030-drops-94156}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2018.11}, annote = {Keywords: sublinear graph parameter estimation, lower bounds, communication complexity} }

Document

**Published in:** OASIcs, Volume 61, 1st Symposium on Simplicity in Algorithms (SOSA 2018)

We consider the problem of sampling an edge almost uniformly from an unknown graph, G = (V, E). Access to the graph is provided via queries of the following types: (1) uniform vertex queries, (2) degree queries, and (3) neighbor queries. We describe a new simple algorithm that returns a random edge e in E using \tilde{O}(n/sqrt{eps m}) queries in expectation, such that each edge e is sampled with probability (1 +/- eps)/m. Here, n = |V| is the number of vertices, and m = |E| is the number of edges. Our algorithm is optimal in the sense that any algorithm that samples an edge from an almost-uniform distribution must perform Omega(n/sqrt{m}) queries.

Talya Eden and Will Rosenbaum. On Sampling Edges Almost Uniformly. In 1st Symposium on Simplicity in Algorithms (SOSA 2018). Open Access Series in Informatics (OASIcs), Volume 61, pp. 7:1-7:9, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:OASIcs.SOSA.2018.7, author = {Eden, Talya and Rosenbaum, Will}, title = {{On Sampling Edges Almost Uniformly}}, booktitle = {1st Symposium on Simplicity in Algorithms (SOSA 2018)}, pages = {7:1--7:9}, series = {Open Access Series in Informatics (OASIcs)}, ISBN = {978-3-95977-064-4}, ISSN = {2190-6807}, year = {2018}, volume = {61}, editor = {Seidel, Raimund}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SOSA.2018.7}, URN = {urn:nbn:de:0030-drops-83001}, doi = {10.4230/OASIcs.SOSA.2018.7}, annote = {Keywords: Sublinear Algorithms, Graph Algorithms, Sampling Edges, Query Complexity} }

Document

**Published in:** LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)

We revisit the classic problem of estimating the degree distribution moments of an undirected graph. Consider an undirected graph G=(V,E) with n (non-isolated) vertices, and define (for s > 0) mu_s = 1\n * sum_{v in V} d^s_v. Our aim is to estimate mu_s within a multiplicative error of (1+epsilon) (for a given approximation parameter epsilon>0) in sublinear time. We consider the sparse graph model that allows access to: uniform random vertices, queries for the degree of any vertex, and queries for a neighbor of any vertex. For the case of s=1 (the average degree), \widetilde{O}(\sqrt{n}) queries suffice for any constant epsilon (Feige, SICOMP 06 and Goldreich-Ron, RSA 08). Gonen-Ron-Shavitt (SIDMA 11) extended this result to all integral s > 0, by designing an algorithms that performs \widetilde{O}(n^{1-1/(s+1)}) queries. (Strictly speaking, their algorithm approximates the number of star-subgraphs of a given size, but a slight modification gives an algorithm for moments.)
We design a new, significantly simpler algorithm for this problem. In the worst-case, it exactly matches the bounds of Gonen-Ron-Shavitt, and has a much simpler proof. More importantly, the running time of this algorithm is connected to the degeneracy of G. This is (essentially) the maximum density of an induced subgraph. For the family of graphs with degeneracy at most alpha, it has a query complexity of widetilde{O}\left(\frac{n^{1-1/s}}{\mu^{1/s}_s} \Big(\alpha^{1/s} + \min\{\alpha,\mu^{1/s}_s\}\Big)\right) = \widetilde{O}(n^{1-1/s}\alpha/\mu^{1/s}_s). Thus, for the class of bounded degeneracy graphs (which includes all minor closed families and preferential attachment graphs), we can estimate the average degree in \widetilde{O}(1) queries, and can estimate the variance of the degree distribution in \widetilde{O}(\sqrt{n}) queries. This is a major improvement over the previous worst-case bounds. Our key insight is in designing an estimator for mu_s that has low variance when G does not have large dense subgraphs.

Talya Eden, Dana Ron, and C. Seshadhri. Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 7:1-7:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.ICALP.2017.7, author = {Eden, Talya and Ron, Dana and Seshadhri, C.}, title = {{Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection}}, booktitle = {44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)}, pages = {7:1--7:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-041-5}, ISSN = {1868-8969}, year = {2017}, volume = {80}, editor = {Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.7}, URN = {urn:nbn:de:0030-drops-73747}, doi = {10.4230/LIPIcs.ICALP.2017.7}, annote = {Keywords: Sublinear algorithms, Degree distribution, Graph moments} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail