Graph Clustering in All Parameter Regimes

Authors Junhao Gan , David F. Gleich , Nate Veldt , Anthony Wirth , Xin Zhang



PDF
Thumbnail PDF

File

LIPIcs.MFCS.2020.39.pdf
  • Filesize: 0.65 MB
  • 15 pages

Document Identifiers

Author Details

Junhao Gan
  • School of Computing and Information Systems, The University of Melbourne, Victoria, Australia
David F. Gleich
  • Department of Computer Science, Purdue University, West Lafayette, IN, USA
Nate Veldt
  • Center for Applied Mathematics, Cornell University, Ithaca, NY, USA
Anthony Wirth
  • School of Computing and Information Systems, The University of Melbourne, Victoria, Australia
Xin Zhang
  • School of Computing and Information Systems, The University of Melbourne, Victoria, Australia

Cite AsGet BibTex

Junhao Gan, David F. Gleich, Nate Veldt, Anthony Wirth, and Xin Zhang. Graph Clustering in All Parameter Regimes. In 45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 170, pp. 39:1-39:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.MFCS.2020.39

Abstract

Resolution parameters in graph clustering control the size and structure of clusters formed by solving a parametric objective function. Typically there is more than one meaningful way to cluster a graph, and solving the same objective function for different resolution parameters produces clusterings at different levels of granularity, each of which can be meaningful depending on the application. In this paper, we address the task of efficiently solving a parameterized graph clustering objective for all values of a resolution parameter. Specifically, we consider a new analysis-friendly objective we call LambdaPrime, involving a parameter λ ∈ (0,1). LambdaPrime is an adaptation of LambdaCC, a significant family of instances of the Correlation Clustering (minimization) problem. Indeed, LambdaPrime and LambdaCC are closely related to other parameterized clustering problems, such as parametric generalizations of modularity. They capture a number of specific clustering problems as special cases, including sparsest cut and cluster deletion. While previous work provides approximation results for a single value of the resolution parameter, we seek a set of approximately optimal clusterings for all values of λ in polynomial time. More specifically, we show that when a graph has m edges and n nodes, there exists a set of at most m clusterings such that, for every λ ∈ (0,1), the family contains an optimal solution to the LambdaPrime objective. This bound is tight on star graphs. We obtain a family of O(log n) clusterings by solving the parametric linear programming (LP) relaxation of LambdaPrime at O(log n) λ values, and rounding each LP solution using existing approximation algorithms. We prove that this is asymptotically tight: for a certain class of ring graphs, for all values of λ, Ω(log n) feasible solutions are required to provide a constant-factor approximation for the LambdaPrime LP relaxation. To minimize the size of the clustering family, we further propose an algorithm that yields a family of solutions of a size no more than twice of the minimum LP-approximating family.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Graph algorithms
  • Mathematics of computing → Approximation algorithms
Keywords
  • Graph Clustering
  • Algorithms
  • Parametric Linear Programming

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. A Arenas, A Fernández, and S Gómez. Analysis of the structure of complex networks at different resolution levels. New Journal of Physics, 10(5):053039, May 2008. URL: https://doi.org/10.1088/1367-2630/10/5/053039.
  2. Nikhil Bansal, Avrim Blum, and Shuchi Shawla. Correlation clustering. Machine Learning, 56:89-113, 2004. Google Scholar
  3. Patricia J. Carstensen. Complexity of some parametric integer and network programming problems. Math. Program., 26(1):64-75, May 1983. URL: https://doi.org/10.1007/BF02591893.
  4. Moses Charikar, Venkatesan Guruswami, and Anthony Wirth. Clustering with qualitative information. Journal of Computer and System Sciences, 71(3):360-383, 2005. URL: https://doi.org/10.1016/j.jcss.2004.10.012.
  5. J.-C. Delvenne, S. N. Yaliraki, and M. Barahona. Stability of graph communities across time scales. Proceedings of the National Academy of Sciences, 107(29):12755-12760, 2010. URL: https://doi.org/10.1073/pnas.0903215107.
  6. Erik D. Demaine, Dotan Emanuel, Amos Fiat, and Nicole Immorlica. Correlation clustering in general weighted graphs. Theoretical Computer Science, 361(2):172-187, 2006. Approximation and Online Algorithms. URL: https://doi.org/10.1016/j.tcs.2006.05.008.
  7. Thang N. Dinh, Xiang Li, and My T. Thai. Network clustering via maximizing modularity: Approximation algorithms and theoretical limits. In ICDM, pages 101-110, 2015. URL: https://doi.org/10.1109/ICDM.2015.139.
  8. Junhao Gan, David F. Gleich, Nate Veldt, Anthony Wirth, and Xin Zhang. Graph clustering in all parameter regimes. arXiv, cs.CC, 2019. Google Scholar
  9. David F. Gleich, Nate Veldt, and Anthony Wirth. Correlation Clustering Generalized. In ISAAC, pages 44:1-44:13, 2018. URL: https://doi.org/10.4230/LIPIcs.ISAAC.2018.44.
  10. Lucas G. S. Jeub, Marya Bazzi, Inderjit S. Jutla, and Peter J. Mucha. A generalized Louvain method for community detection implemented in MATLAB, 2011-2019. https://github.com/GenLouvain/GenLouvain. Google Scholar
  11. Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM (JACM), 46(6):787-832, November 1999. Google Scholar
  12. Thomas L. Magnanti and Dan Stratila. Separable concave optimization approximately equals piecewise linear optimization. In IPCO, pages 234-243, 2004. Google Scholar
  13. Thomas L Magnanti and Dan Stratila. Separable concave optimization approximately equals piecewise-linear optimization. arXiv preprint arXiv:1201.3148, 2012. Google Scholar
  14. Katta G. Murty. Computational complexity of parametric linear programming. Mathematical Programming, 19(1):213-219, December 1980. URL: https://doi.org/10.1007/BF01581642.
  15. Mark EJ Newman and Michelle Girvan. Finding and evaluating community structure in networks. Physical Review E, 69(026113), 2004. Google Scholar
  16. Sebastian Nowozin and Stefanie Jegelka. Solution stability in linear programming relaxations: Graph partitioning and unsupervised learning. In ICML, pages 769-776. ACM, 2009. Google Scholar
  17. Jörg Reichardt and Stefan Bornholdt. Detecting fuzzy community structures in complex networks with a potts model. Physical Review Letters, 93:218701, November 2004. URL: https://doi.org/10.1103/PhysRevLett.93.218701.
  18. Jörg Reichardt and Stefan Bornholdt. Statistical mechanics of community detection. Physical Review E, 74(016110), 2006. Google Scholar
  19. Ron Shamir, Roded Sharan, and Dekel Tsur. Cluster graph modification problems. Discrete Applied Mathematics, 144:173-182, 2004. Google Scholar
  20. V. A. Traag, P. Van Dooren, and Y. Nesterov. Narrow scope for resolution-limit-free community detection. Physical Review E, 84:016114, July 2011. URL: https://doi.org/10.1103/PhysRevE.84.016114.
  21. V. A. Traag, L. Waltman, and N. J. van Eck. From louvain to leiden: guaranteeing well-connected communities. Scientific Reports, 9(1):5233, 2019. URL: https://doi.org/10.1038/s41598-019-41695-z.
  22. Nate Veldt, David Gleich, Anthony Wirth, and James Saunderson. Metric-constrained optimization for graph clustering algorithms. SIAM Journal on Mathematics of Data Science, 1(2):333-355, 2019. Google Scholar
  23. Nate Veldt, David F. Gleich, and Anthony Wirth. A correlation clustering framework for community detection. In WWW, pages 439-448, 2018. URL: https://doi.org/10.1145/3178876.3186110.
  24. Anthony Wirth. Approximation algorithms for clustering. PhD thesis, Princeton University, 2004. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail