Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection

Eden, Talya; Ron, Dana; Seshadhri, C.

doi:10.4230/LIPIcs.ICALP.2017.7

Cite As

Talya Eden, Dana Ron, and C. Seshadhri. Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 7:1-7:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.ICALP.2017.7

Abstract

We revisit the classic problem of estimating the degree distribution moments of an undirected graph. Consider an undirected graph $G=(V,E)$ with $n$ (non-isolated) vertices, and define (for $s > 0$) $\mu_s = \frac{1}{n} \cdot \sum_{v \in V} d_v^s$, where $d_v$ is the degree of vertex $v$. Our aim is to estimate $\mu_s$ within a multiplicative error of $(1+\epsilon)$ (for a given approximation parameter $\epsilon > 0$) in sublinear time. We consider the sparse graph model that allows access to: uniform random vertices, queries for the degree of any vertex, and queries for a neighbor of any vertex. For the case of $s=1$ (the average degree), $\widetilde{O}(\sqrt{n})$ queries suffice for any constant $\epsilon$ (Feige, SICOMP 06 and Goldreich-Ron, RSA 08). Gonen-Ron-Shavitt (SIDMA 11) extended this result to all integral $s > 0$, by designing an algorithm that performs $\widetilde{O}(n^{1-1/(s+1)})$ queries. (Strictly speaking, their algorithm approximates the number of star-subgraphs of a given size, but a slight modification gives an algorithm for moments.) We design a new, significantly simpler algorithm for this problem. In the worst case, it exactly matches the bounds of Gonen-Ron-Shavitt, and has a much simpler proof. More importantly, the running time of this algorithm is connected to the degeneracy of $G$. This is (essentially) the maximum density of an induced subgraph. For the family of graphs with degeneracy at most $\alpha$, it has a query complexity of $\widetilde{O}\left(\frac{n^{1-1/s}}{\mu_s^{1/s}} \Big(\alpha^{1/s} + \min\{\alpha, \mu_s^{1/s}\}\Big)\right) = \widetilde{O}\left(n^{1-1/s}\alpha/\mu_s^{1/s}\right)$. Thus, for the class of bounded degeneracy graphs (which includes all minor-closed families and preferential attachment graphs), we can estimate the average degree in $\widetilde{O}(1)$ queries, and can estimate the variance of the degree distribution in $\widetilde{O}(\sqrt{n})$ queries. This is a major improvement over the previous worst-case bounds.
Our key insight is in designing an estimator for $\mu_s$ that has low variance when $G$ does not have large dense subgraphs.
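
To make the quantities in the abstract concrete, the following Python sketch computes the moment exactly, computes the degeneracy by minimum-degree peeling, and runs one simple unbiased sampling estimator that uses only the three query types of the sparse graph model. The estimator is an illustrative assumption and is not claimed to be the paper's algorithm: it samples a uniform vertex and a uniform neighbor, and charges the edge only when it points from the lower-degree endpoint to the higher-degree one. All function names, the adjacency-list representation, and the tie-breaking by vertex id are hypothetical, and the graph is assumed to be simple with no isolated vertices.

```python
import random
from typing import Dict, List

Graph = Dict[int, List[int]]  # adjacency lists of a simple undirected graph


def exact_moment(adj: Graph, s: float) -> float:
    """mu_s = (1/n) * sum_v d_v^s, computed by reading every vertex."""
    return sum(len(nbrs) ** s for nbrs in adj.values()) / len(adj)


def degeneracy(adj: Graph) -> int:
    """Peel a minimum-degree vertex repeatedly; return the largest degree seen."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    removed = set()
    best = 0
    while len(removed) < len(adj):
        v = min((u for u in adj if u not in removed), key=deg.__getitem__)
        best = max(best, deg[v])
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
    return best


def sampled_moment(adj: Graph, s: float, samples: int, rng: random.Random) -> float:
    """Illustrative unbiased estimator of mu_s (an assumption, not the paper's code)."""
    vertices = list(adj)
    total = 0.0
    for _ in range(samples):
        v = rng.choice(vertices)   # uniform random vertex query
        d_v = len(adj[v])          # degree query
        u = rng.choice(adj[v])     # uniform random neighbor query
        d_u = len(adj[u])          # degree query
        # Charge each edge once, from its lower-degree endpoint (ties by id).
        if (d_v, v) < (d_u, u):
            total += d_v * (d_v ** (s - 1) + d_u ** (s - 1))
    return total / samples


if __name__ == "__main__":
    star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(exact_moment(star, 2), degeneracy(star))
    print(sampled_moment(star, 2, 20000, random.Random(0)))
```

Each undirected edge is charged under exactly one orientation, and the directed pair $(v,u)$ is sampled with probability $1/(n d_v)$, so the factor $d_v$ cancels and the expectation equals $\frac{1}{n}\sum_v d_v^s$. Sample sizes that guarantee a $(1+\epsilon)$ approximation are not derived here.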

References

M. Aliakbarpour, A. S. Biswas, T. Gouleakis, J. Peebles, R. Rubinfeld, and A. Yodpinyanee. Sublinear-time algorithms for counting star subgraphs via edge sampling. Algorithmica, pages 1-30, 2017. URL: http://dx.doi.org/10.1007/s00453-017-0287-3.
N. Alon and S. Gutner. Linear time algorithms for finding a dominating set of fixed size in degenerated graphs. In Proceedings of the Annual International Conference on Computing and Combinatorics (COCOON), pages 394-405, 2008.
N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 58(1):137-147, 1999.
Arboricity. Wikipedia. URL: https://en.wikipedia.org/wiki/Arboricity.
A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286:509-512, October 1999.
J. W. Berry, L. A. Fostvedt, D. J. Nordman, C. A. Phillips, C. Seshadhri, and A. G. Wilson. Why do simple algorithms for triangle enumeration work in the real world? Internet Mathematics, 11(6):555-571, 2015.
Z. Bi, C. Faloutsos, and F. Korn. The DGX distribution for mining massive, skewed data. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 17-26. ACM, 2001.
P. Bickel, A. Chen, and E. Levina. The method of moments and degree distributions for network models. Annals of Statistics, 39(5):2280-2301, 2011.
P. Brach, M. Cygan, J. Łącki, and P. Sankowski. Algorithmic complexity of power law networks. In Proceedings of the Annual Symposium on Discrete Algorithms (SODA), pages 1306-1325, 2016.
A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener. Graph structure in the web. Computer Networks, 33:309-320, 2000.
B. Chazelle, R. Rubinfeld, and L. Trevisan. Approximating the minimum spanning tree weight in sublinear time. SIAM Journal on Computing, 34(6):1370-1379, 2005.
N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM Journal on Computing, 14:210-223, 1985.
F. Chierichetti, A. Dasgupta, R. Kumar, S. Lattanzi, and T. Sarlos. On sampling nodes in a network. In Proceedings of the International Conference on World Wide Web (WWW), pages 471-481, 2016.
A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data. SIAM Review, 51(4):661-703, 2009.
A. Czumaj, F. Ergun, L. Fortnow, A. Magen, I. Newman, R. Rubinfeld, and C. Sohler. Approximating the weight of the Euclidean minimum spanning tree in sublinear time. SIAM Journal on Computing, 35(1):91-109, 2005.
A. Czumaj and C. Sohler. Estimating the weight of metric minimum spanning trees in sublinear time. SIAM Journal on Computing, 39(3):904-922, 2009.
A. Dasgupta, R. Kumar, and T. Sarlos. On estimating the average degree. In Proceedings of the International Conference on World Wide Web (WWW), pages 795-806, 2014.
R. Diestel. Graph Theory. Springer, fourth edition, 2010.
T. Eden, A. Levi, D. Ron, and C. Seshadhri. Approximately counting triangles in sublinear time. In Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), pages 614-633, 2015.
D. Eppstein, M. Löffler, and D. Strash. Listing all maximal cliques in sparse graphs in near-optimal time. In International Symposium on Algorithms and Computation (ISAAC), pages 403-413, 2010.
P. Erdős and T. Gallai. Graphs with prescribed degree of vertices (Hungarian). Mat. Lapok, 11:264-274, 1960.
M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. In Proceedings of Computer Communication Review (SIGCOMM), pages 251-262. ACM, 1999.
U. Feige. On sums of independent random variables with unbounded variance and estimating the average degree in a graph. SIAM Journal on Computing, 35(4):964-984, 2006.
O. Goldreich and D. Ron. Approximating average parameters of graphs. Random Structures and Algorithms, 32(4):473-493, 2008.
M. Gonen, D. Ron, and Y. Shavitt. Counting stars and other small subgraphs in sublinear-time. SIAM Journal on Discrete Mathematics, 25(3):1365-1411, 2011.
Graph degeneracy. Wikipedia. URL: https://en.wikipedia.org/wiki/Degeneracy_(graph_theory).
S. L. Hakimi. On the realizability of a set of integers as degrees of the vertices of a graph. SIAM Journal on Applied Mathematics, 10:496-506, 1962.
A. Hassidim, J. A. Kelner, H. N. Nguyen, and K. Onak. Local graph partitions for approximation and testing. In Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), pages 22-31. IEEE, 2009.
V. Havel. A remark on the existence of finite graphs (Czech). Casopis Pest. Mat., 80:477-480, 1955.
S. Marko and D. Ron. Approximating the distance to properties in bounded-degree and general sparse graphs. ACM Transactions on Algorithms, 5(2), 2009.
D. Matula and L. Beck. Smallest-last ordering and clustering and graph coloring algorithms. Journal of the ACM (JACM), 30(3):417-427, 1983.
J. Nešetřil and P. Ossona de Mendez. Sparsity. Springer, 2010.
H. N. Nguyen and K. Onak. Constant-time approximation algorithms via local improvements. In Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), pages 327-336. IEEE, 2008.
K. Onak, D. Ron, M. Rosen, and R. Rubinfeld. A near-optimal sublinear-time algorithm for approximating the minimum vertex cover size. In Proceedings of the Annual Symposium on Discrete Algorithms (SODA), pages 1123-1131. SIAM, 2012.
M. Parnas and D. Ron. Approximating the minimum vertex cover in sublinear time and a connection to distributed algorithms. Theoretical Computer Science, 381(1-3):183-196, 2007.
D. Pennock, G. Flake, S. Lawrence, E. Glover, and C. L. Giles. Winners don't take all: Characterizing the competition for links on the web. Proceedings of the National Academy of Sciences (PNAS), 99(8):5207-5211, 2002.
A. Sala, L. Cao, C. Wilson, R. Zablit, H. Zheng, and B. Y. Zhao. Measurement-calibrated graph models for social network experiments. In Proceedings of the International Conference on World Wide Web (WWW), pages 861-870. ACM, 2010.
O. Simpson, C. Seshadhri, and A. McGregor. Catching the head, tail, and everything in between: A streaming algorithm for the degree distribution. In Proceedings of the International Conference on Data Mining (ICDM), pages 979-984, 2015.
Y. Yoshida, M. Yamamoto, and H. Ito. An improved constant-time approximation algorithm for maximum-matchings. In Proceedings of the Annual Symposium on the Theory of Computing (STOC), pages 225-234. ACM, 2009.
