Document

APPROX

**Published in:** LIPIcs, Volume 245, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)

Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de-facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate.
Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)-approximation to T, with asymptotically optimal round and total space complexity provided any S ≥ max{(√ m, n²/m)} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of k size for constant k ≥ 1. Our second result is an O_δ(log log n)-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting k-cliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space.
In addition to our theoretical results, we simulate our triangle counting algorithms in real-world graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems.

Amartya Shankha Biswas, Talya Eden, Quanquan C. Liu, Ronitt Rubinfeld, and Slobodan Mitrović. Massively Parallel Algorithms for Small Subgraph Counting. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 245, pp. 39:1-39:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.APPROX/RANDOM.2022.39, author = {Biswas, Amartya Shankha and Eden, Talya and Liu, Quanquan C. and Rubinfeld, Ronitt and Mitrovi\'{c}, Slobodan}, title = {{Massively Parallel Algorithms for Small Subgraph Counting}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)}, pages = {39:1--39:28}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-249-5}, ISSN = {1868-8969}, year = {2022}, volume = {245}, editor = {Chakrabarti, Amit and Swamy, Chaitanya}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2022.39}, URN = {urn:nbn:de:0030-drops-171619}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2022.39}, annote = {Keywords: triangle counting, massively parallel computation, clique counting, approximation algorithms, subgraph counting} }

Document

**Published in:** LIPIcs, Volume 215, 13th Innovations in Theoretical Computer Science Conference (ITCS 2022)

For a graph G on n vertices, naively sampling the position of a random walk of at time t requires work Ω(t). We desire local access algorithms supporting position_G(t) queries, which return the position of a random walk from some fixed start vertex s at time t, where the joint distribution of returned positions is 1/poly(n) close to those of a uniformly random walk in 𝓁₁ distance.
We first give an algorithm for local access to random walks on a given undirected d-regular graph with Õ(1/(1-λ)√n) runtime per query, where λ is the second-largest eigenvalue of the random walk matrix of the graph in absolute value. Since random d-regular graphs G(n,d) are expanders with high probability, this gives an Õ(√n) algorithm for a graph drawn from G(n,d) whp, which improves on the naive method for small numbers of queries.
We then prove that no algorithm with subconstant error given probe access to an input d-regular graph can have runtime better than Ω(√n/log(n)) per query in expectation when the input graph is drawn from G(n,d), obtaining a nearly matching lower bound. We further show an Ω(n^{1/4}) runtime per query lower bound even with an oblivious adversary (i.e. when the query sequence is fixed in advance).
We then show that for families of graphs with additional group theoretic structure, dramatically better results can be achieved. We give local access to walks on small-degree abelian Cayley graphs, including cycles and hypercubes, with runtime polylog(n) per query. This also allows for efficient local access to walks on polylog degree expanders. We show that our techniques apply to graphs with high degree by extending or results to graphs constructed using the tensor product (giving fast local access to walks on degree n^ε graphs for any ε ∈ (0,1]) and Cartesian product.

Amartya Shankha Biswas, Edward Pyne, and Ronitt Rubinfeld. Local Access to Random Walks. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 24:1-24:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.ITCS.2022.24, author = {Biswas, Amartya Shankha and Pyne, Edward and Rubinfeld, Ronitt}, title = {{Local Access to Random Walks}}, booktitle = {13th Innovations in Theoretical Computer Science Conference (ITCS 2022)}, pages = {24:1--24:22}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-217-4}, ISSN = {1868-8969}, year = {2022}, volume = {215}, editor = {Braverman, Mark}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2022.24}, URN = {urn:nbn:de:0030-drops-156209}, doi = {10.4230/LIPIcs.ITCS.2022.24}, annote = {Keywords: sublinear time algorithms, random generation, local computation} }

Document

RANDOM

**Published in:** LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

We consider the problem of sampling and approximately counting an arbitrary given motif H in a graph G, where access to G is given via queries: degree, neighbor, and pair, as well as uniform edge sample queries. Previous algorithms for these tasks were based on a decomposition of H into a collection of odd cycles and stars, denoted D^*(H) = {O_{k₁},...,O_{k_q}, S_{p₁},...,S_{p_𝓁}}. These algorithms were shown to be optimal for the case where H is a clique or an odd-length cycle, but no other lower bounds were known.
We present a new algorithm for sampling arbitrary motifs which, up to poly(log n) factors, is always at least as good, and for most graphs G is strictly better. The main ingredient leading to this improvement is an improved uniform algorithm for sampling stars, which might be of independent interest, as it allows to sample vertices according to the p-th moment of the degree distribution.
Finally, we prove that this algorithm is decomposition-optimal for decompositions that contain at least one odd cycle. These are the first lower bounds for motifs H with a nontrivial decomposition, i.e., motifs that have more than a single component in their decomposition.

Amartya Shankha Biswas, Talya Eden, and Ronitt Rubinfeld. Towards a Decomposition-Optimal Algorithm for Counting and Sampling Arbitrary Motifs in Sublinear Time. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 55:1-55:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.APPROX/RANDOM.2021.55, author = {Biswas, Amartya Shankha and Eden, Talya and Rubinfeld, Ronitt}, title = {{Towards a Decomposition-Optimal Algorithm for Counting and Sampling Arbitrary Motifs in Sublinear Time}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)}, pages = {55:1--55:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-207-5}, ISSN = {1868-8969}, year = {2021}, volume = {207}, editor = {Wootters, Mary and Sanit\`{a}, Laura}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.55}, URN = {urn:nbn:de:0030-drops-147480}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2021.55}, annote = {Keywords: Sublinear time algorithms, Graph algorithms, Sampling subgraphs, Approximate counting} }

Document

**Published in:** LIPIcs, Volume 151, 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)

Consider an algorithm performing a computation on a huge random object (for example a random graph or a "long" random walk). Is it necessary to generate the entire object prior to the computation, or is it possible to provide query access to the object and sample it incrementally "on-the-fly" (as requested by the algorithm)? Such an implementation should emulate the random object by answering queries in a manner consistent with an instance of the random object sampled from the true distribution (or close to it). This paradigm is useful when the algorithm is sub-linear and thus, sampling the entire object up front would ruin its efficiency.
Our first set of results focus on undirected graphs with independent edge probabilities, i.e. each edge is chosen as an independent Bernoulli random variable. We provide a general implementation for this model under certain assumptions. Then, we use this to obtain the first efficient local implementations for the Erdös-Rényi G(n,p) model for all values of p, and the Stochastic Block model. As in previous local-access implementations for random graphs, we support Vertex-Pair and Next-Neighbor queries. In addition, we introduce a new Random-Neighbor query. Next, we give the first local-access implementation for All-Neighbors queries in the (sparse and directed) Kleinberg’s Small-World model. Our implementations require no pre-processing time, and answer each query using O(poly(log n)) time, random bits, and additional space.
Next, we show how to implement random Catalan objects, specifically focusing on Dyck paths (balanced random walks on the integer line that are always non-negative). Here, we support Height queries to find the location of the walk, and First-Return queries to find the time when the walk returns to a specified location. This in turn can be used to implement Next-Neighbor queries on random rooted ordered trees, and Matching-Bracket queries on random well bracketed expressions (the Dyck language).
Finally, we introduce two features to define a new model that: (1) allows multiple independent (and even simultaneous) instantiations of the same implementation, to be consistent with each other without the need for communication, (2) allows us to generate a richer class of random objects that do not have a succinct description. Specifically, we study uniformly random valid q-colorings of an input graph G with maximum degree Δ. This is in contrast to prior work in the area, where the relevant random objects are defined as a distribution with O(1) parameters (for example, n and p in the G(n,p) model). The distribution over valid colorings is instead specified via a "huge" input (the underlying graph G), that is far too large to be read by a sub-linear time algorithm. Instead, our implementation accesses G through local neighborhood probes, and is able to answer queries to the color of any given vertex in sub-linear time for q ≥ 9Δ, in a manner that is consistent with a specific random valid coloring of G. Furthermore, the implementation is memory-less, and can maintain consistency with non-communicating copies of itself.

Amartya Shankha Biswas, Ronitt Rubinfeld, and Anak Yodpinyanee. Local Access to Huge Random Objects Through Partial Sampling. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 27:1-27:65, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{biswas_et_al:LIPIcs.ITCS.2020.27, author = {Biswas, Amartya Shankha and Rubinfeld, Ronitt and Yodpinyanee, Anak}, title = {{Local Access to Huge Random Objects Through Partial Sampling}}, booktitle = {11th Innovations in Theoretical Computer Science Conference (ITCS 2020)}, pages = {27:1--27:65}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-134-4}, ISSN = {1868-8969}, year = {2020}, volume = {151}, editor = {Vidick, Thomas}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2020.27}, URN = {urn:nbn:de:0030-drops-117126}, doi = {10.4230/LIPIcs.ITCS.2020.27}, annote = {Keywords: sublinear time algorithms, random generation, local computation} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail