Cluster Editing with Overlapping Communities

Authors: Emmanuel Arrighi, Matthias Bentert, Pål Grønås Drange, Blair D. Sullivan, Petra Wolf

Author Details

Emmanuel Arrighi
  • University of Trier, Germany
Matthias Bentert
  • University of Bergen, Norway
Pål Grønås Drange
  • University of Bergen, Norway
Blair D. Sullivan
  • University of Utah, Salt Lake City, UT, USA
Petra Wolf
  • University of Bergen, Norway

Cite As

Emmanuel Arrighi, Matthias Bentert, Pål Grønås Drange, Blair D. Sullivan, and Petra Wolf. Cluster Editing with Overlapping Communities. In 18th International Symposium on Parameterized and Exact Computation (IPEC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 285, pp. 2:1-2:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.IPEC.2023.2

Abstract

Cluster Editing, also known as correlation clustering, is a well-studied graph modification problem. In this problem, one is given a graph and allowed to perform up to k edge additions and deletions to transform it into a cluster graph, i.e., a graph consisting of a disjoint union of cliques. However, in real-world networks, clusters are often overlapping. For example, in social networks, a person might belong to several communities, e.g., those corresponding to work, school, or neighborhood. Another strong motivation comes from language networks, where trying to cluster words with similar usage can be confounded by homonyms, that is, words with multiple meanings like "bat". The recently introduced operation of vertex splitting is one natural approach to incorporating such overlap into Cluster Editing. First used in the context of graph drawing, this operation allows a vertex v to be replaced by two vertices whose combined neighborhood is the neighborhood of v (and thus v can belong to more than one cluster). The problem of transforming a graph into a cluster graph using at most k edge additions, edge deletions, or vertex splits is called Cluster Editing with Vertex Splitting and is known to admit a polynomial kernel with respect to k and an O(9^{k^2} + n + m)-time (parameterized) algorithm. However, it was not known whether the problem is NP-hard, a question which was originally asked by Abu-Khzam et al. [Combinatorial Optimization, 2018]. We answer this in the affirmative. We further give an improved algorithm running in O(2^{7k log k} + n + m) time.
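
To make the two definitions in the abstract concrete, the following is a minimal Python sketch of a cluster-graph test and of a single vertex split. It is an illustration only, not the algorithm from the paper. The function names `is_cluster_graph` and `split_vertex`, the adjacency-dictionary representation, and the example graph `bowtie` are all assumptions introduced here, and the graph is assumed to be simple (no self-loops or parallel edges).

```python
def is_cluster_graph(adj):
    """Return True iff every connected component of the graph is a clique.

    `adj` maps each vertex to the set of its neighbours (simple, undirected).
    """
    seen = set()
    for start in adj:
        if start in seen:
            continue
        # Collect the connected component containing `start` by DFS.
        component = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        # In a clique, every vertex is adjacent to all other component members.
        if any(len(adj[u]) != len(component) - 1 for u in component):
            return False
    return True


def split_vertex(adj, v, first, second, v1, v2):
    """Replace v by two new vertices v1 and v2 (hypothetical names).

    `first` and `second` are the neighbourhoods of v1 and v2; their union
    must equal the neighbourhood of v, as in the abstract's definition.
    Returns a new adjacency dictionary and leaves `adj` unchanged.
    """
    first, second = set(first), set(second)
    if first | second != adj[v]:
        raise ValueError("combined neighbourhood must equal N(v)")
    # Copy the graph without v, then wire in the two replacement vertices.
    new_adj = {u: set(nbrs) - {v} for u, nbrs in adj.items() if u != v}
    new_adj[v1] = set(first)
    new_adj[v2] = set(second)
    for u in first:
        new_adj[u].add(v1)
    for u in second:
        new_adj[u].add(v2)
    return new_adj


# Two triangles {a, b, c} and {c, d, e} that share the vertex c.
bowtie = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
split = split_vertex(bowtie, "c", {"a", "b"}, {"d", "e"}, "c1", "c2")
```

In this example the shared vertex c plays the role of a member of two communities: the original graph is not a cluster graph, but a single split of c yields two disjoint triangles.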

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Graph algorithms
Keywords
  • graph modification
  • correlation clustering
  • vertex splitting
  • NP-hardness
  • parameterized algorithm

References

  1. Faisal N. Abu-Khzam, Joseph R. Barr, Amin Fakhereldine, and Peter Shaw. A greedy heuristic for cluster editing with vertex splitting. In Proceedings of the 4th International Conference on Artificial Intelligence for Industries (AI4I '21), pages 38-41. IEEE, 2021.
  2. Faisal N. Abu-Khzam, Judith Egan, Serge Gaspers, Alexis Shaw, and Peter Shaw. Cluster editing with vertex splitting. In Combinatorial optimization, pages 1-13. Springer, 2018.
  3. Reyan Ahmed, Stephen Kobourov, and Myroslav Kryven. An FPT algorithm for bipartite vertex splitting. In Proceedings of the 30th International Symposium on Graph Drawing and Network Visualization (GD '22), pages 261-268. Springer International Publishing, 2022.
  4. Sanjeev Arora, Rong Ge, Sushant Sachdeva, and Grant Schoenebeck. Finding overlapping communities in social networks: toward a rigorous approach. In Proceedings of the 13th ACM Conference on Electronic Commerce (EC '12), pages 37-54. Association for Computing Machinery, 2012.
  5. Sanghamitra Bandyopadhyay, Garisha Chowdhary, and Debarka Sengupta. FOCS: Fast overlapped community search. IEEE Transactions on Knowledge and Data Engineering, 27(11):2974-2985, 2015.
  6. Jakob Baumann, Matthias Pfretzschner, and Ignaz Rutter. Parameterized complexity of vertex splitting to pathwidth at most 1. CoRR, abs/2302.14725, 2023.
  7. Jeffrey Baumes, Mark Goldberg, and Malik Magdon-Ismail. Efficient identification of overlapping communities. In Proceedings of the 2005 IEEE International Conference on Intelligence and Security Informatics (ISI '05), pages 27-36. Springer, 2005.
  8. Francesco Bonchi, Aristides Gionis, and Antti Ukkonen. Overlapping correlation clustering. Knowledge and Information Systems, 35(1):1-32, 2013.
  9. Leizhen Cai. Fixed-parameter tractability of graph modification problems for hereditary properties. Information Processing Letters, 58(4):171-176, 1996. URL: https://doi.org/10.1016/0020-0190(96)00050-6.
  10. Marek Cygan, Fedor V. Fomin, Łukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Parameterized algorithms. Springer, 2015.
  11. George B. Davis and Kathleen M. Carley. Clearing the FOG: Fuzzy, overlapping groups for social networks. Social Networks, 30(3):201-212, 2008. URL: https://doi.org/10.1016/j.socnet.2008.03.001.
  12. Reinhard Diestel. Graph Theory. Springer, 2005.
  13. Pål Grønås Drange, Felix Reidl, Fernando S. Villaamil, and Somnath Sikdar. Fast biclustering by dual parameterization. In Proceedings of the 10th International Symposium on Parameterized and Exact Computation (IPEC '15), pages 402-413. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2015.
  14. Nan Du, Bai Wang, Bin Wu, and Yi Wang. Overlapping community detection in bipartite networks. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT '08), pages 176-179, 2008.
  15. Peter Eades and Candido Ferreira Xavier de Mendonça Neto. Vertex splitting and tension-free layout. In Proceedings of the 3rd International Symposium on Graph Drawing and Network Visualization (GD '95), pages 202-211. Springer, 1995.
  16. Michael R. Fellows, Jiong Guo, Christian Komusiewicz, Rolf Niedermeier, and Johannes Uhlmann. Graph-based data clustering with overlaps. In Proceedings of the 15th Annual International Conference on Computing and Combinatorics (COCOON '09), pages 516-526. Springer, 2009.
  17. Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Springer, 2006.
  18. Reynaldo Gil-García and Aurora Pons-Porrata. Dynamic hierarchical algorithms for document clustering. Pattern Recognition Letters, 31(6):469-477, 2010. URL: https://doi.org/10.1016/j.patrec.2009.11.011.
  19. Mark Goldberg, Stephen Kelley, Malik Magdon-Ismail, Konstantin Mertsalov, and Al Wallace. Finding overlapping communities in social networks. In Proceedings of the 2nd IEEE International Conference on Social Computing (SC '10), pages 104-113, 2010. URL: https://doi.org/10.1109/SocialCom.2010.24.
  20. Steve Gregory. An algorithm to find overlapping community structure in networks. In Proceedings of 2007 Knowledge Discovery in Databases (PKDD '07), pages 91-102. Springer, 2007.
  21. Jiong Guo. A more effective linear kernelization for cluster editing. Theoretical Computer Science, 410(8-10):718-726, 2009. URL: https://doi.org/10.1016/j.tcs.2008.10.021.
  22. Jiong Guo, Falk Hüffner, Christian Komusiewicz, and Yong Zhang. Improved algorithms for bicluster editing. In Proceedings of the 5th International Conference on Theory and Applications of Models of Computation (TAMC '08), pages 445-456. Springer, 2008.
  23. Falk Hüffner, Christian Komusiewicz, Hannes Moser, and Rolf Niedermeier. Fixed-parameter algorithms for cluster vertex deletion. Theory of Computing Systems, 47(1):196-217, 2010. URL: https://doi.org/10.1007/s00224-008-9150-x.
  24. Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? Journal of Computer and System Sciences, 63(4):512-530, 2001. URL: https://doi.org/10.1006/jcss.2001.1774.
  25. Christian Komusiewicz and Johannes Uhlmann. Alternative parameterizations for cluster editing. In Proceedings of the 2011 Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM '11), pages 344-355. Springer, 2011.
  26. Christian Komusiewicz and Johannes Uhlmann. Cluster editing with locally bounded modifications. Discrete Applied Mathematics, 160(15):2259-2270, 2012. URL: https://doi.org/10.1016/j.dam.2012.05.019.
  27. Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. Statistical properties of community structure in large social and information networks. In Proceedings of the 17th International Conference on World Wide Web (WWW '08), pages 695-704. Association for Computing Machinery, 2008.
  28. Guo-Hui Lin, Tao Jiang, and Paul E. Kearney. Phylogenetic k-root and Steiner k-root. In Proceedings of the 11th International Symposium on Algorithms and Computation (ISAAC '00), pages 539-551. Springer, 2000.
  29. Neeldhara Misra, Fahad Panolan, and Saket Saurabh. Subexponential algorithm for d-cluster edge deletion: Exception or rule? Journal of Computer and System Sciences, 113:150-162, 2020.
  30. Martin Nöllenburg, Manuel Sorge, Soeren Terziadis, Anaïs Villedieu, Hsiang-Yun Wu, and Jules Wulms. Planarizing graphs and their drawings by vertex splitting. In Proceedings of the 30th International Symposium on Graph Drawing and Network Visualization (GD '22), pages 232-246. Springer, 2022.
  31. Lorenzo Orecchia, Konstantinos Ameranis, Charalampos Tsourakakis, and Kunal Talwar. Practical almost-linear-time approximation algorithms for hybrid and overlapping graph clustering. In Proceedings of the 39th International Conference on Machine Learning (ICML '22), pages 17071-17093. PMLR, 2022.
  32. Airel Pérez-Suárez, José Fco. Martínez-Trinidad, Jesús A. Carrasco-Ochoa, and José E. Medina-Pagola. An algorithm based on density and compactness for dynamic overlapping clustering. Pattern Recognition, 46(11):3040-3055, 2013. URL: https://doi.org/10.1016/j.patcog.2013.03.022.
  33. Hafiz Hassaan Saeed, Khurram Shahzad, and Faisal Kamiran. Overlapping toxic sentiment classification using deep neural architectures. In Proceedings of the 2018 IEEE International Conference on Data Mining Workshops (ICDMW '18), pages 1361-1366. IEEE Computer Society, 2018. URL: https://doi.org/10.1109/ICDMW.2018.00193.
  34. Satu Elisa Schaeffer. Graph clustering. Computer Science Review, 1(1):27-64, 2007. URL: https://doi.org/10.1016/j.cosrev.2007.05.001.
  35. Cees G. M. Snoek, Marcel Worring, Jan C. van Gemert, Jan-Mark Geusebroek, and Arnold W. M. Smeulders. The challenge problem for automated detection of 101 semantic concepts in multimedia. In Proceedings of the 14th ACM International Conference on Multimedia (MM '06), pages 421-430. Association for Computing Machinery, 2006.
  36. Lei Tang and Huan Liu. Scalable learning of collective behavior based on sparse social dimensions. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM '09), pages 1107-1116. Association for Computing Machinery, 2009.
  37. Craig A. Tovey. A simplified NP-complete satisfiability problem. Discrete Applied Mathematics, 8(1):85-89, 1984.
  38. Dekel Tsur. Faster parameterized algorithm for bicluster editing. Information Processing Letters, 168, 2021. URL: https://doi.org/10.1016/j.ipl.2021.106095.
  39. Qinna Wang and Eric Fleury. Uncovering overlapping community structure. In Proceedings of the 2nd International Workshop on Complex Networks (COMPLEX '10), pages 176-186. Springer, 2010.
  40. Xufei Wang, Lei Tang, Huiji Gao, and Huan Liu. Discovering overlapping groups in social media. In Proceedings of the 2010 IEEE International Conference on Data Mining (ICDM '10), pages 569-578, 2010.
  41. Alicja Wieczorkowska, Piotr Synak, and Zbigniew W. Raś. Multi-label classification of emotions in music. In Proceedings of the 2006 Intelligent Information Processing and Web Mining (IIPWM '04), pages 307-315. Springer, 2006.
  42. Mingyu Xiao and Shaowei Kou. A simple and improved parameterized algorithm for bicluster editing. Information Processing Letters, 2022.
  43. Shihua Zhang, Rui-Sheng Wang, and Xiang-Sun Zhang. Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A: Statistical Mechanics and its Applications, 374(1):483-490, 2007.