Linear Time Subgraph Counting, Graph Degeneracy, and the Chasm at Size Six

Authors Suman K. Bera, Noujan Pashanasangi, C. Seshadhri



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2020.38.pdf
  • Filesize: 0.6 MB
  • 20 pages

Document Identifiers

Author Details

Suman K. Bera
  • University of California, Santa Cruz, CA 95064, USA
Noujan Pashanasangi
  • University of California, Santa Cruz, CA 95064, USA
C. Seshadhri
  • University of California, Santa Cruz, CA 95064, USA

Acknowledgements

We would like to thank David Helmbold for insightful discussions. In particular, David pointed out that the lower bound for counting $8$-cycles does not follow from our construction.

Cite AsGet BibTex

Suman K. Bera, Noujan Pashanasangi, and C. Seshadhri. Linear Time Subgraph Counting, Graph Degeneracy, and the Chasm at Size Six. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 151, pp. 38:1-38:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ITCS.2020.38

Abstract

We consider the problem of counting all k-vertex subgraphs in an input graph, for any constant k. This problem (denoted SUB-CNT_k) has been studied extensively in both theory and practice. In a classic result, Chiba and Nishizeki (SICOMP 85) gave linear time algorithms for clique and 4-cycle counting for bounded degeneracy graphs. This is a rich class of sparse graphs that contains, for example, all minor-free families and preferential attachment graphs. The techniques from this result have inspired a number of recent practical algorithms for SUB-CNT_k. Towards a better understanding of the limits of these techniques, we ask: for what values of k can SUB_CNT_k be solved in linear time? We discover a chasm at k=6. Specifically, we prove that for k < 6, SUB_CNT_k can be solved in linear time. Assuming a standard conjecture in fine-grained complexity, we prove that for all k ⩾ 6, SUB-CNT_k cannot be solved even in near-linear time.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Graph algorithms
  • Theory of computation → Graph algorithms analysis
Keywords
  • Subgraph counting
  • bounded degeneracy graphs
  • fine-grained complexity

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amir Abboud and Virginia Vassilevska Williams. Popular conjectures imply strong lower bounds for dynamic problems. In Proc. 55th Annual IEEE Symposium on Foundations of Computer Science, 2014. Google Scholar
  2. Nesreen K. Ahmed, Jennifer Neville, Ryan A. Rossi, and Nick Duffield. Efficient Graphlet Counting for Large Networks. In International Conference on Data Mining, 2015. Google Scholar
  3. Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. Graph sketches: sparsification, spanners, and subgraphs. In Proc. 31st ACM Symposium on Principles of Database Systems, pages 5-14. ACM, 2012. Google Scholar
  4. Noga Alon, Raphael Yuster, and Uri Zwick. Color-coding. J. ACM, 42(4):844-856, 1995. Google Scholar
  5. Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles. Algorithmica, 17(3):209-223, 1997. Google Scholar
  6. Sepehr Assadi, Michael Kapralov, and Sanjeev Khanna. A Simple Sublinear-Time Algorithm for Counting Arbitrary Subgraphs via Edge Sampling. In Proc. 10th Conference on Innovations in Theoretical Computer Science, 2018. Google Scholar
  7. Ziv Bar-Yossef, Ravi Kumar, and D. Sivakumar. Reductions in Streaming Algorithms, with an Application to Counting Triangles in Graphs. In Proc. 13th Annual ACM-SIAM Symposium on Discrete Algorithms, 2002. Google Scholar
  8. A. Benson, D. F. Gleich, and J. Leskovec. Higher-order organization of complex networks. Science, 353(6295):163-166, 2016. Google Scholar
  9. Suman K Bera and Amit Chakrabarti. Towards tighter space bounds for counting triangles and other substructures in graph streams. In Proc. 34th International Symposium on Theoretical Aspects of Computer Science, 2017. Google Scholar
  10. Nadja Betzler, Rene Van Bevern, Michael R Fellows, Christian Komusiewicz, and Rolf Niedermeier. Parameterized algorithmics for finding connected motifs in biological networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 8(5):1296-1308, 2011. Google Scholar
  11. M. Bhuiyan, M. Rahman, M. Rahman, and M. Al Hasan. GUISE: Uniform Sampling of Graphlets for Large Graph Analysis. In International Conference on Data Mining, pages 91-100, 2012. Google Scholar
  12. Etienne Birmele et al. Detecting local network motifs. Electronic Journal of Statistics, 6:908-933, 2012. Google Scholar
  13. Andreas Björklund, Thore Husfeldt, Petteri Kaski, and Mikko Koivisto. Counting paths and packings in halves. In Proc. 17th Annual European Symposium on Algorithms, pages 578-586, 2009. Google Scholar
  14. Andreas Björklund, Petteri Kaski, and Łukasz Kowalik. Counting thin subgraphs via packings faster than meet-in-the-middle time. ACM Transactions on Algorithms (TALG), 13(4):48, 2017. Google Scholar
  15. R. Burt. Structural Holes and Good Ideas. American Journal of Sociology, 110(2):349-399, 2004. Google Scholar
  16. Norishige Chiba and Takao Nishizeki. Arboricity and subgraph listing algorithms. SIAM Journal on computing, 14(1):210-223, 1985. Google Scholar
  17. Jonathan Cohen. Graph twiddling in a mapreduce world. Computing in Science & Engineering, 11(4):29, 2009. Google Scholar
  18. J. Coleman. Social Capital in the Creation of Human Capital. American Journal of Sociology, 94:S95-S120, 1988. URL: http://www.jstor.org/stable/2780243.
  19. Radu Curticapean, Holger Dell, and Dániel Marx. Homomorphisms are a good basis for counting small subgraphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 210-223, 2017. Google Scholar
  20. Radu Curticapean and Dániel Marx. Complexity of counting subgraphs: Only the boundedness of the vertex-cover number counts. In Proc. 55th Annual IEEE Symposium on Foundations of Computer Science, pages 130-139, 2014. Google Scholar
  21. Reinhard Diestel. Graph Theory, Fourth Edition. Springer, 2010. Google Scholar
  22. Talya Eden, Amit Levi, Dana Ron, and C Seshadhri. Approximately counting triangles in sublinear time. SIAM Journal on Computing, 46(5):1603-1646, 2017. Google Scholar
  23. Talya Eden, Dana Ron, and C Seshadhri. On approximating the number of k-cliques in sublinear time. In Proc. 50th Annual ACM Symposium on the Theory of Computing, pages 722-734, 2018. Google Scholar
  24. Friedrich Eisenbrand and Fabrizio Grandoni. On the complexity of fixed parameter clique and dominating set. Theoretical Computer Science, 326(1-3):57-67, 2004. Google Scholar
  25. Ethan R Elenberg, Karthikeyan Shanmugam, Michael Borokhovich, and Alexandros G Dimakis. Beyond triangles: A distributed framework for estimating 3-profiles of large graphs. In Proc. 12th Annual SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 229-238. ACM, 2015. Google Scholar
  26. Ethan R Elenberg, Karthikeyan Shanmugam, Michael Borokhovich, and Alexandros G Dimakis. Distributed estimation of graph 4-profiles. In Proc. 25th Proceedings, International World Wide Web Conference (WWW), pages 483-493. International World Wide Web Conferences Steering Committee, 2016. Google Scholar
  27. David Eppstein. Arboricity and bipartite subgraph listing algorithms. Information processing letters, 51(4):207-211, 1994. Google Scholar
  28. Mira Gonen and Yuval Shavitt. Approximating the number of network motifs. Internet Mathematics, 6(3):349-372, 2009. Google Scholar
  29. Tomaž Hočevar and Janez Demšar. A combinatorial approach to graphlet counting. Bioinformatics, 30(4):559-565, 2014. Google Scholar
  30. Tomaž Hočevar and Janez Demšar. Combinatorial algorithm for counting small induced graphs and orbits. PloS ONE, 12(2):e0171428, 2017. Google Scholar
  31. Tomaž Hočevar, Janez Demšar, et al. Computation of graphlet orbits for nodes and edges in sparse graphs. Journ. Stat. Soft, 71, 2016. Google Scholar
  32. P. Holland and S. Leinhardt. A method for detecting structure in sociometric data. American Journal of Sociology, 76:492-513, 1970. Google Scholar
  33. F. Hormozdiari, P. Berenbrink, N. Prulj, and S. Cenk Sahinalp. Not All Scale-Free Networks Are Born Equal: The Role of the Seed Graph in PPI Network Evolution. PLoS Computational Biology, 118, 2007. Google Scholar
  34. Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. SIAM Journal on Computing, 7(4):413-423, 1978. Google Scholar
  35. Shweta Jain and C Seshadhri. A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem. In Proc. 26th Proceedings, International World Wide Web Conference (WWW), pages 441-449. International World Wide Web Conferences Steering Committee, 2017. Google Scholar
  36. Madhav Jha, C Seshadhri, and Ali Pinar. A space efficient streaming algorithm for triangle counting using the birthday paradox. In Proc. 19th Annual SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 589-597, 2013. Google Scholar
  37. Madhav Jha, C Seshadhri, and Ali Pinar. Path sampling: A fast and provable method for estimating 4-vertex subgraph counts. In Proc. 24th Proceedings, International World Wide Web Conference (WWW), pages 495-505. International World Wide Web Conferences Steering Committee, 2015. Google Scholar
  38. Hossein Jowhari and Mohammad Ghodsi. New streaming algorithms for counting triangles in graphs. In Computing and Combinatorics, pages 710-716, 2005. Google Scholar
  39. Daniel M Kane, Kurt Mehlhorn, Thomas Sauerwald, and He Sun. Counting arbitrary subgraphs in data streams. In Proc. 39th International Colloquium on Automata, Languages and Programming, pages 598-609, 2012. Google Scholar
  40. Arijit Khan, Nan Li, Xifeng Yan, Ziyu Guan, Supriyo Chakraborty, and Shu Tao. Neighborhood based fast graph search in large networks. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, 2011. Google Scholar
  41. Ton Kloks, Dieter Kratsch, and Haiko Müller. Finding and counting small induced subgraphs efficiently. Information Processing Letters, 74(3-4):115-121, 2000. Google Scholar
  42. Tamara G Kolda, Ali Pinar, Todd Plantenga, C Seshadhri, and Christine Task. Counting triangles in massive graphs with MapReduce. SIAM Journal on Scientific Computing, 36(5):S48-S77, 2014. Google Scholar
  43. Tsvi Kopelowitz, Seth Pettie, and Ely Porat. Higher lower bounds from the 3SUM conjecture. In Proc. 27th Annual ACM-SIAM Symposium on Discrete Algorithms, 2016. Google Scholar
  44. Ioannis Koutis and Ryan Williams. Limits and applications of group algebras for parameterized problems. In International Colloquium on Automata, Languages and Programming, pages 653-664, 2009. Google Scholar
  45. Mirosław Kowaluk, Andrzej Lingas, and Eva-Marta Lundell. Counting and detecting small subgraphs via equations. SIAM Journal on Discrete Mathematics, 27(2):892-909, 2013. Google Scholar
  46. Madhusudan Manjunath, Kurt Mehlhorn, Konstantinos Panagiotou, and He Sun. Approximate Counting of Cycles in Streams. In Proc. 19th Annual European Symposium on Algorithms, pages 677-688, 2011. Google Scholar
  47. Dror Marcus and Yuval Shavitt. Efficient counting of network motifs. In IEEE 30th International Conference on Distributed Computing Systems Workshops, pages 92-98. IEEE, 2010. Google Scholar
  48. David W Matula and Leland L Beck. Smallest-last ordering and clustering and graph coloring algorithms. Journal of the ACM (JACM), 30(3):417-427, 1983. Google Scholar
  49. Andrew McGregor, Sofya Vorotnikova, and Hoa T. Vu. Better Algorithms for Counting Triangles in Data Streams. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 401-411, 2016. Google Scholar
  50. Tijana Milenković and Nataša Pržulj. Uncovering biological network function via graphlet degree signatures. Cancer informatics, 6:CIN-S680, 2008. Google Scholar
  51. Burkhard Monien. How to find long paths efficiently. In North-Holland Mathematics Studies, volume 109, pages 239-254. Elsevier, 1985. Google Scholar
  52. C. St. J. A. Nash-Williams. Decomposition of finite graphs into forests. Journal of the London Mathematical Society, 39(1):12, 1964. Google Scholar
  53. Jaroslav Nešetřil and Svatopluk Poljak. On the complexity of the subgraph problem. Commentationes Mathematicae Universitatis Carolinae, 26(2):415-419, 1985. Google Scholar
  54. Mark Ortmann and Ulrik Brandes. Efficient orbit-aware triad and quad census in directed and undirected graphs. Applied network science, 2(1), 2017. Google Scholar
  55. Mihai Patrascu. Towards polynomial lower bounds for dynamic problems. In Proc. 42nd Annual ACM Symposium on the Theory of Computing, 2010. Google Scholar
  56. Aduri Pavan, Kanat Tangwongsan, Srikanta Tirthapura, and Kun-Lung Wu. Counting and sampling triangles from a graph stream. Proceedings of the VLDB Endowment, 6(14):1870-1881, 2013. Google Scholar
  57. Ali Pinar, C Seshadhri, and Vaidyanathan Vishal. Escape: Efficiently counting all 5-vertex subgraphs. In Proceedings, International World Wide Web Conference (WWW), pages 1431-1440. International World Wide Web Conferences Steering Committee, 2017. Google Scholar
  58. Alejandro Portes. Social Capital: Its Origins and Applications in Modern Sociology. Annual Review of Sociology, 24(1):1-24, 1998. URL: https://doi.org/10.1146/annurev.soc.24.1.1.
  59. Natasa Przulj. Biological network comparison using graphlet degree distribution. Bioinformatics, 23(2):177-183, 2007. Google Scholar
  60. Natasa Przulj, Derek G. Corneil, and Igor Jurisica. Modeling interactome: scale-free or geometric? Bioinformatics, 20(18):3508-3515, 2004. Google Scholar
  61. M. Rahman, M. A. Bhuiyan, and M. Al Hasan. GRAFT: An Efficient Graphlet Counting Method for Large Graph Analysis. IEEE Transactions on Knowledge and Data Engineering, PP(99), 2014. Google Scholar
  62. Ahmet Erdem Sariyuce, C. Seshadhri, Ali Pinar, and Umit V. Catalyurek. Finding the Hierarchy of Dense Subgraphs Using Nucleus Decompositions. In Proceedings, International World Wide Web Conference (WWW), pages 927-937, 2015. Google Scholar
  63. T. Schank and D. Wagner. Finding, Counting and Listing All Triangles in Large Graphs, an Experimental Study. In Experimental and Efficient Algorithms, pages 606-609. Springer, 2005. Google Scholar
  64. C. Seshadhri and Srikanta Tirthapura. Scalable Subgraph Counting: The Methods Behind The Madness: WWW 2019 Tutorial. In Proceedings, International World Wide Web Conference (WWW), 2019. Google Scholar
  65. Alina Stoica and Christophe Prieur. Structure of neighborhoods in a large social network. In International Conference on Computational Science and Engineering, volume 4, pages 26-33. IEEE, 2009. Google Scholar
  66. Siddharth Suri and Sergei Vassilvitskii. Counting triangles and the curse of the last reducer. In Proceedings of the 20th international conference on World wide web, pages 607-614, 2011. Google Scholar
  67. Charalampos E. Tsourakakis. The K-clique Densest Subgraph Problem. In Proceedings, International World Wide Web Conference (WWW), pages 1122-1132, 2015. Google Scholar
  68. Johan Ugander, Lars Backstrom, and Jon M. Kleinberg. Subgraph frequencies: mapping the empirical and extremal geography of large graph collections. In Proceedings, International World Wide Web Conference (WWW), pages 1307-1318, 2013. Google Scholar
  69. Virginia Vassilevska. Efficient algorithms for clique problems. Information Processing Letters, 109(4):254-257, 2009. Google Scholar
  70. Virginia Vassilevska and Ryan Williams. Finding, minimizing, and counting weighted subgraphs. In Proc. 41st Annual ACM Symposium on the Theory of Computing, pages 455-464, 2009. Google Scholar
  71. Pinghui Wang, Junzhou Zhao, Xiangliang Zhang, Zhenguo Li, Jiefeng Cheng, John CS Lui, Don Towsley, Jing Tao, and Xiaohong Guan. MOSS-5: A fast method of approximating counts of 5-node graphlets in large graphs. IEEE Transactions on Knowledge and Data Engineering, 30(1):73-86, 2017. Google Scholar
  72. S. Wernicke and F. Rasche. FANMOD: a tool for fast network motif detection. Bioinformatics, 22(9):1152-1153, 2006. Google Scholar
  73. Sebastian Wernicke. Efficient detection of network motifs. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 3(4):347-359, 2006. Google Scholar
  74. Virginia Vassilevska Williams, Joshua R Wang, Ryan Williams, and Huacheng Yu. Finding four-node subgraphs in triangle time. In Proc. 26th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1671-1680, 2014. Google Scholar
  75. Zhao Zhao, Guanying Wang, Ali R Butt, Maleq Khan, VS Anil Kumar, and Madhav V Marathe. Sahad: Subgraph analysis in massive networks using hadoop. In IEEE 26th International Parallel and Distributed Processing Symposium, pages 390-401. IEEE, 2012. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail