Document

# Approximate Triangle Counting via Sampling and Fast Matrix Multiplication

## File

LIPIcs.ICALP.2022.107.pdf
• Filesize: 0.82 MB
• 20 pages

## Acknowledgements

I would like to thank several people: Václav Rozhoň, who gave early feedback and who suggested using a recursive approach; Rasmus Pagh for helping improve a draft of this paper; my supervisor Mikkel Thorup for helpful discussions and for his support; and Talya Eden for advice regarding related work.

## Cite As

Jakub Tětek. Approximate Triangle Counting via Sampling and Fast Matrix Multiplication. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 107:1-107:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ICALP.2022.107

## Abstract

There is a simple O(n³/{ε²T}) time algorithm for 1±ε-approximate triangle counting where T is the number of triangles in the graph and n the number of vertices. At the same time, one may count triangles exactly using fast matrix multiplication in time Õ(n^ω). Is it possible to get a negative dependency on the number of triangles T while retaining the state-of-the-art n^ω dependency on n? We answer this question positively by providing an algorithm which runs in time O({n^ω}/T^{ω-2})⋅poly(n^o(1)/ε). This is optimal in the sense that as long as the exponent of T is independent of n, T, it cannot be improved while retaining the dependency on n. Our algorithm improves upon the state of the art when T ≫ 1 and T ≪ n. We also consider the problem of approximate triangle counting in sparse graphs, parameterized by the number of edges m. The best known algorithm runs in time Õ_ε(m^{3/2}/T) [Eden et al., SIAM Journal on Computing, 2017]. An algorithm by Alon et al. [JACM, 1995] for exact triangle counting that runs in time Õ(m^{2ω/(ω + 1)}). We again get an algorithm whose complexity has a state-of-the-art dependency on m while having negative dependency on T. Specifically, our algorithm runs in time O({m^{2ω/(ω+1)}}/{T^{2(ω-1)/(ω+1)}}) ⋅ poly(n^o(1)/ε). This is again optimal in the sense that no better constant exponent of T is possible without worsening the dependency on m. This algorithm improves upon the state of the art when T ≫ 1 and T ≪ √m. In both cases, algorithms with time complexity matching query complexity lower bounds were known on some range of parameters. While those algorithms have optimal query complexity for the whole range of T, the time complexity departs from the query complexity and is no longer optimal (as we show) for T ≪ n and T ≪ √m, respectively. We focus on the time complexity in this range of T. To the best of our knowledge, this is the first paper considering the discrepancy between query and time complexity in graph parameter estimation.

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Sketching and sampling
##### Keywords
• Approximate triangle counting
• Fast matrix multiplication
• Sampling

## Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0

## References

1. Maryam Aliakbarpour, Amartya Shankha Biswas, Themis Gouleakis, John Peebles, Ronitt Rubinfeld, and Anak Yodpinyanee. Sublinear-Time Algorithms for Counting Star Subgraphs via Edge Sampling. Algorithmica, 80(2):668-697, February 2018. URL: https://doi.org/10.1007/s00453-017-0287-3.
2. Josh Alman and Virginia Vassilevska Williams. A Refined Laser Method and Faster Matrix Multiplication. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), Proceedings, pages 522-539. Society for Industrial and Applied Mathematics, January 2021. URL: https://doi.org/10.1137/1.9781611976465.32.
3. N. Alon, R. Yuster, and U. Zwick. Finding and Counting Given Length Cycles. Algorithmica (New York), 17(3):209-223, 1997. URL: https://doi.org/10.1007/BF02523189.
4. Sepehr Assadi, Michael Kapralov, and Sanjeev Khanna. A simple sublinear-time algorithm for counting arbitrary subgraphs via edge sampling. Leibniz International Proceedings in Informatics, LIPIcs, 124, November 2019. URL: https://doi.org/10.4230/LIPIcs.ITCS.2019.6.
5. Albert-László Barabási and Márton Pósfai. Network science. Cambridge University Press, Cambridge, 2016. URL: http://barabasi.com/networksciencebook/.
6. Amartya Shankha Biswas, T. Eden, Q. Liu, Slobodan Mitrović, and R. Rubinfeld. Parallel algorithms for small subgraph counting. ArXiv, abs/2002.08299, 2020. URL: http://arxiv.org/abs/2002.08299.
7. N. Chiba and Takao Nishizeki. Arboricity and subgraph listing algorithms. SIAM J. Comput., 14:210-223, 1985.
8. Fan Chung and Linyuan Lu. Complex Graphs and Networks (Cbms Regional Conference Series in Mathematics). American Mathematical Society, USA, 2006.
9. Talya Eden, Amit Levi, Dana Ron, and C. Seshadhri. Approximately counting triangles in sublinear time. SIAM Journal on Computing, 46(5):1603-1646, 2017. URL: https://doi.org/10.1137/15M1054389.
10. Talya Eden, Dana Ron, and C. Seshadhri. Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection. In Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl, editors, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017), volume 80 of Leibniz International Proceedings in Informatics (LIPIcs), pages 7:1-7:13, Dagstuhl, Germany, 2017. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. URL: https://doi.org/10.4230/LIPIcs.ICALP.2017.7.
11. Talya Eden, Dana Ron, and C. Seshadhri. On approximating the number of k-cliques in sublinear time. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, pages 722-734, New York, NY, USA, 2018. Association for Computing Machinery. URL: https://doi.org/10.1145/3188745.3188810.
12. Talya Eden and Will Rosenbaum. Lower Bounds for Approximating Graph Parameters via Communication Complexity. In Eric Blais, Klaus Jansen, José D. P. Rolim, and David Steurer, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018), volume 116 of Leibniz International Proceedings in Informatics (LIPIcs), pages 11:1-11:18, Dagstuhl, Germany, 2018. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. URL: https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2018.11.
13. Oded Goldreich. Introduction to property testing. Cambridge University Press, 2017.
14. Oded Goldreich and Dana Ron. Approximating average parameters of graphs. In Josep Díaz, Klaus Jansen, José D. P. Rolim, and Uri Zwick, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 363-374, Berlin, Heidelberg, 2006. Springer Berlin Heidelberg.
15. Paul W Holland and Samuel Leinhardt. A method for detecting structure in sociometric data. In Social networks, pages 411-432. Elsevier, 1977.
16. Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. In Proceedings of the Ninth Annual ACM Symposium on Theory of Computing, STOC '77, pages 1-10, New York, NY, USA, 1977. Association for Computing Machinery. URL: https://doi.org/10.1145/800105.803390.
17. John Kallaugher and Eric Price. A hybrid sampling scheme for triangle counting. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '17, pages 1778-1797, USA, 2017. Society for Industrial and Applied Mathematics.
18. Mihail N. Kolountzakis, Gary L. Miller, Richard Peng, and Charalampos E. Tsourakakis. Efficient triangle counting in large graphs via degree-based vertex partitioning. Internet Mathematics, 8(1-2):161-185, 2012. URL: https://doi.org/10.1080/15427951.2012.625260.
19. Tong Ihn Lee, Nicola J Rinaldi, François Robert, Duncan T Odom, Ziv Bar-Joseph, Georg K Gerber, Nancy M Hannett, Christopher T Harbison, Craig M Thompson, Itamar Simon, et al. Transcriptional regulatory networks in saccharomyces cerevisiae. science, 298(5594):799-804, 2002.
20. Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider, and S. Seshadri. Efficient sampling strategies for relational database operations. Theoretical Computer Science, 116(1):195-226, 1993. URL: https://doi.org/10.1016/0304-3975(93)90224-H.
21. Duncan T Odom, Nora Zizlsperger, D Benjamin Gordon, George W Bell, Nicola J Rinaldi, Heather L Murray, Tom L Volkert, Jorg Schreiber, P Alexander Rolfe, David K Gifford, et al. Control of pancreas and liver gene expression by hnf transcription factors. Science, 303(5662):1378-1381, 2004.
22. Rasmus Pagh and Charalampos E. Tsourakakis. Colorful triangle counting and a mapreduce implementation. Inf. Process. Lett., 112(7):277-281, March 2012. URL: https://doi.org/10.1016/j.ipl.2011.12.007.
23. Charalampos E. Tsourakakis, U. Kang, Gary L. Miller, and Christos Faloutsos. Doulion: Counting triangles in massive graphs with a coin. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '09, pages 837-846, New York, NY, USA, 2009. Association for Computing Machinery. URL: https://doi.org/10.1145/1557019.1557111.
24. Osamu Watanabe. Sequential sampling techniques for algorithmic learning theory. Theoretical Computer Science, 348(1):3-14, 2005. Algorithmic Learning Theory (ALT 2000). URL: https://doi.org/10.1016/j.tcs.2005.09.003.
25. Andreas Wimmer and Kevin Lewis. Beyond and below racial homophily: Erg models of a friendship network documented on facebook. American Journal of Sociology, 116(2):583-642, 2010. URL: http://www.jstor.org/stable/10.1086/653658.