Fair Correlation Clustering in General Graphs

Authors Roy Schwartz, Roded Zats

Thumbnail PDF


  • Filesize: 0.83 MB
  • 19 pages

Document Identifiers

Author Details

Roy Schwartz
  • The Henry and Marilyn Taub Faculty of Computer Science, Technion, Haifa, Israel
Roded Zats
  • The Henry and Marilyn Taub Faculty of Computer Science, Technion, Haifa, Israel

Cite AsGet BibTex

Roy Schwartz and Roded Zats. Fair Correlation Clustering in General Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 245, pp. 37:1-37:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


We consider the family of Correlation Clustering optimization problems under fairness constraints. In Correlation Clustering we are given a graph whose every edge is labeled either with a + or a -, and the goal is to find a clustering that agrees the most with the labels: + edges within clusters and - edges across clusters. The notion of fairness implies that there is no over, or under, representation of vertices in the clustering: every vertex has a color and the distribution of colors within each cluster is required to be the same as the distribution of colors in the input graph. Previously, approximation algorithms were known only for fair disagreement minimization in complete unweighted graphs. We prove the following: (1) there is no finite approximation for fair disagreement minimization in general graphs unless P = NP (this hardness holds also for bicriteria algorithms); and (2) fair agreement maximization in general graphs admits a bicriteria approximation of ≈ 0.591 (an improved ≈ 0.609 true approximation is given for the special case of two uniformly distributed colors). Our algorithm is based on proving that the sticky Brownian motion rounding of [Abbasi Zadeh-Bansal-Guruganesh-Nikolov-Schwartz-Singh SODA'20] copes well with uncut edges.

Subject Classification

ACM Subject Classification
  • Theory of computation → Facility location and clustering
  • Correlation Clustering
  • Approximation Algorithms
  • Semi-Definite Programming


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Saba Ahmadi, Sainyam Galhotra, Barna Saha, and Roy Schwartz. Fair correlation clustering, 2020. URL: https://doi.org/10.48550/ARXIV.2002.03508.
  2. Sara Ahmadian, Alessandro Epasto, Marina Knittel, Ravi Kumar, Mohammad Mahdian, Benjamin Moseley, Philip Pham, Sergei Vassilvitskii, and Yuyan Wang. Fair hierarchical clustering. Advances in Neural Information Processing Systems, 33:21050-21060, 2020. Google Scholar
  3. Sara Ahmadian, Alessandro Epasto, Ravi Kumar, and Mohammad Mahdian. Clustering without over-representation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '19, pages 267-275, New York, NY, USA, 2019. Association for Computing Machinery. URL: https://doi.org/10.1145/3292500.3330987.
  4. Sara Ahmadian, Alessandro Epasto, Ravi Kumar, and Mohammad Mahdian. Fair correlation clustering. ArXiv, abs/2002.02274, 2020. Google Scholar
  5. Nir Ailon, Noa Avigdor-Elgrabli, Edo Liberty, and Anke van Zuylen. Improved approximation algorithms for bipartite correlation clustering. SIAM Journal on Computing, 41(5):1110-1121, 2012. URL: https://doi.org/10.1137/110848712.
  6. Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: Ranking and clustering. J. ACM, 55(5), November 2008. URL: https://doi.org/10.1145/1411509.1411513.
  7. Noga Amit. The bicluster graph editing problem. PhD thesis, Tel Aviv University, 2004. Google Scholar
  8. Konstantin Andreev and Harald Räcke. Balanced graph partitioning. Theory of Computing Systems, 39:929-939, November 2006. URL: https://doi.org/10.1007/s00224-006-1350-7.
  9. Megasthenis Asteris, Anastasios Kyrillidis, Dimitris Papailiopoulos, and Alexandros Dimakis. Bipartite correlation clustering: Maximizing agreements. In Artificial Intelligence and Statistics, pages 121-129, 2016. Google Scholar
  10. Per Austrin, Siavosh Benabbas, and Konstantinos Georgiou. Better balance by being biased: A 0.8776-approximation for max bisection. ACM Transactions on Algorithms, 13:1-27, October 2016. URL: https://doi.org/10.1145/2907052.
  11. Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber, Ali Vakilian, and Tal Wagner. Scalable fair clustering. In International Conference on Machine Learning, pages 405-413. PMLR, 2019. Google Scholar
  12. Nikhil Bansal, Avrim Blum, and Shuchi Chawla. Correlation clustering. In Machine Learning, pages 238-247, 2002. Google Scholar
  13. Boaz Barak, Prasad Raghavendra, and David Steurer. Rounding semidefinite programming hierarchies via global correlation. Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS, April 2011. URL: https://doi.org/10.1109/FOCS.2011.95.
  14. Suman Bera, Deeparnab Chakrabarty, Nicolas Flores, and Maryam Negahbani. Fair algorithms for clustering. Advances in Neural Information Processing Systems, 32, 2019. Google Scholar
  15. Ioana O. Bercea, Samir Khuller, Clemens Rösner, Melanie Schmidt, Martin Groß, Aounon Kumar, and Daniel R. Schmidt. On the cost of essentially fair clusterings. In Dimitris Achlioptas and Laszlo A. Vegh, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2019, Leibniz International Proceedings in Informatics, LIPIcs. Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, September 2019. URL: https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2019.18.
  16. Simon Caton and Christian Haas. Fairness in machine learning: A survey. arXiv preprint arXiv:2010.04053, 2020. Google Scholar
  17. M. Charikar and A. Wirth. Maximizing quadratic programs: extending grothendieck’s inequality. In 45th Annual IEEE Symposium on Foundations of Computer Science, pages 54-60, 2004. URL: https://doi.org/10.1109/FOCS.2004.39.
  18. Moses Charikar, Venkatesan Guruswami, and Anthony Wirth. Clustering with qualitative information. Journal of Computer and System Sciences, 71(3):360-383, 2005. Learning Theory 2003. URL: https://doi.org/10.1016/j.jcss.2004.10.012.
  19. Shuchi Chawla, Konstantin Makarychev, Tselil Schramm, and Grigory Yaroslavtsev. Near optimal lp rounding algorithm for correlationclustering on complete and complete k-partite graphs. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC '15, pages 219-228, New York, NY, USA, 2015. Association for Computing Machinery. URL: https://doi.org/10.1145/2746539.2746604.
  20. Anshuman Chhabra, Karina Masalkovaitė, and Prasant Mohapatra. An overview of fairness in clustering. IEEE Access, 9:130698-130720, 2021. URL: https://doi.org/10.1109/ACCESS.2021.3114099.
  21. Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. Fair clustering through fairlets. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pages 5036-5044, Red Hook, NY, USA, 2017. Curran Associates Inc. Google Scholar
  22. Charles J. Colbourn. The complexity of completing partial latin squares. Discrete Applied Mathematics, 8(1):25-30, 1984. URL: https://doi.org/10.1016/0166-218X(84)90075-1.
  23. Tom Coleman, James Saunderson, and Anthony Wirth. A local-search 2-approximation for 2-correlation-clustering. In European Symposium on Algorithms, ESA `08, pages 308-319, 2008. Google Scholar
  24. Erik D. Demaine, Dotan Emanuel, Amos Fiat, and Nicole Immorlica. Correlation clustering in general weighted graphs. Theoretical Computer Science, 361(2):172-187, 2006. Approximation and Online Algorithms. URL: https://doi.org/10.1016/j.tcs.2006.05.008.
  25. Dotan Emanuel and Amos Fiat. Correlation clustering - minimizing disagreements on arbitrary weighted graphs. In Giuseppe Di Battista and Uri Zwick, editors, Algorithms - ESA 2003, pages 208-220, Berlin, Heidelberg, 2003. Springer Berlin Heidelberg. Google Scholar
  26. Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman and Co., USA, 1990. Google Scholar
  27. Mehrdad Ghadiri, Samira Samadi, and Santosh Vempala. Socially fair k-means clustering. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT '21, pages 438-448, New York, NY, USA, 2021. Association for Computing Machinery. URL: https://doi.org/10.1145/3442188.3445906.
  28. Ioannis Giotis and Venkatesan Guruswami. Correlation clustering with a fixed number of clusters. Theory of Computing, 2, May 2005. URL: https://doi.org/10.1145/1109557.1109686.
  29. Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115-1145, November 1995. URL: https://doi.org/10.1145/227683.227684.
  30. Venkatesan Guruswami and Ali Kemal Sinop. Lasserre hierarchy, higher eigenvalues, and approximation schemes for quadratic integer programming with PSD objectives. CoRR, abs/1104.4746, 2011. URL: http://arxiv.org/abs/1104.4746.
  31. Eran Halperin and Uri Zwick. A unified framework for obtaining improved approximation algorithms for maximum graph bisection problems. In Proceedings of the 8th International IPCO Conference on Integer Programming and Combinatorial Optimization, pages 210-225, Berlin, Heidelberg, 2001. Springer-Verlag. Google Scholar
  32. Qiaoming Han, Yinyu Ye, and Jiawei Zhang. An improved rounding method and semidefinite programming relaxation for graph partition. Mathematical Programming, 92:509-535, 2002. Google Scholar
  33. Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi, and Jamie Morgenstern. Guarantees for spectral clustering with fairness constraints. In International Conference on Machine Learning, pages 3458-3467. PMLR, 2019. Google Scholar
  34. Jean B. Lasserre. An explicit exact sdp relaxation for nonlinear 0-1 programs. In Karen Aardal and Bert Gerards, editors, Integer Programming and Combinatorial Optimization, pages 293-303, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg. Google Scholar
  35. Andrew McCallum and Ben Wellner. Toward conditional models of identity uncertainty with application to proper noun coreference. In Proceedings of the IJCAI-2003 Workshop on Information Integration on the Web, pages 79-86, August 2003. Google Scholar
  36. Bernt Øksendal. Stochastic Differential Equations. Springer Berlin Heidelberg, 2003. URL: https://doi.org/10.1007/978-3-642-14394-6.
  37. Prasad Raghavendra and Ning Tan. Approximating csps with global cardinality constraints using sdp hierarchies. In SODA, 2012. Google Scholar
  38. Melanie Schmidt, Chris Schwiegelshohn, and Christian Sohler. Fair coresets and streaming algorithms for fair k-means. In Approximation and Online Algorithms: 17th International Workshop, WAOA 2019, Munich, Germany, September 12–13, 2019, Revised Selected Papers, pages 232-251, Berlin, Heidelberg, 2019. Springer-Verlag. URL: https://doi.org/10.1007/978-3-030-39479-0_16.
  39. Chaitanya Swamy. Correlation clustering: Maximizing agreements via semidefinite programming. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '04, pages 526-527, USA, 2004. Society for Industrial and Applied Mathematics. Google Scholar
  40. Jurgen Van Gael and Xiaojin Zhu. Correlation clustering for crosslingual link detection. In IJCAI, pages 1744-1749, 2007. Google Scholar
  41. Anthony Wirth. Correlation clustering. In Claude Sammut and Geoffrey I. Webb, editors, Encyclopedia of Machine Learning, pages 227-231. Springer US, Boston, MA, 2010. URL: https://doi.org/10.1007/978-0-387-30164-8_176.
  42. Chenchen Wu, Donglei Du, and Dachuan Xu. An improved semidefinite programming hierarchies rounding approximation algorithm for maximum graph bisection problems. In Ding-Zhu Du and Guochuan Zhang, editors, Computing and Combinatorics, pages 304-315, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg. Google Scholar
  43. Sepehr Abbasi Zadeh, Nikhil Bansal, Guru Guruganesh, Aleksandar Nikolov, Roy Schwartz, and Mohit Singh. Sticky brownian rounding and its applications to constraint satisfaction problems. ArXiv, abs/1812.07769, 2020. Google Scholar