Min-Max Correlation Clustering via Neighborhood Similarity
Abstract
We present an efficient algorithm for the min-max correlation clustering problem. The input is a complete graph where edges are labeled as either positive or negative, and the objective is to find a clustering that minimizes the ℓ∞-norm of the disagreement vector over all vertices.
We address this problem with an efficient (3+ε)-approximation algorithm that runs in nearly linear time, Õ(|E^+|), where |E^+| denotes the number of positive edges. This improves upon the previous best-known approximation guarantee of 4 by Heidrich, Irmai, and Andres [37], whose algorithm runs in time depending on the number of nodes n and the maximum degree Δ of the graph G.
Furthermore, we extend our algorithm to the massively parallel computation (MPC) model and the semi-streaming model. In the MPC model, our algorithm runs on machines with memory sublinear in the number of nodes and takes a constant number of rounds. In the streaming model, our algorithm requires only Õ(n) space, where n is the number of vertices in the graph.
Our algorithms are purely combinatorial. They are based on a novel structural observation about the optimal min-max instance, which enables the construction of a 3-approximation algorithm using neighborhood similarity queries. By leveraging random projection, we further show these queries can be computed in nearly linear time.
Keywords and phrases: Min-Max Correlation Clustering, Approximation algorithms
2012 ACM Subject Classification: Theory of computation → Massively parallel algorithms; Theory of computation → Approximation algorithms analysis
Funding: Supported by NSF CCF-2008422.
Editors: Anne Benoit, Haim Kaplan, Sebastian Wild, and Grzegorz Herman
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
In the correlation clustering problem, we are given a complete graph where each edge is labeled as either “+” or “-”. A “+” edge indicates that the two vertices are similar, while a “-” edge indicates they are dissimilar. For any partition of the graph, an edge is considered to be in disagreement if it is a negative edge and its endpoints belong to the same cluster, or if it is a positive edge and its endpoints belong to different clusters. It is typical to assume that the input graph is G = (V, E^+), where E^+ is the set of positive edges, and to obtain time bounds in terms of properties of G, such as |E^+| and Δ, the maximum degree of G. This is because, in many practical applications, the number of positive edges is often much smaller than the number of negative edges [23].
Given a clustering (partition), the disagreement of a node v is defined as the number of edges incident to v that are in disagreement with respect to the clustering. The goal of the correlation clustering problem is to find a clustering that minimizes an objective function capturing the disagreement of edges.
Puleo and Milenkovic [41] introduced the objective of minimizing the ℓ_p-norm of the disagreements over the vertices, which is defined as (∑_{v∈V} dis(v, 𝒞)^p)^{1/p}, where dis(v, 𝒞) denotes the number of disagreements incident to v. This objective generalizes the correlation clustering problem proposed by Bansal, Blum, and Chawla [8], which corresponds to the case p = 1, where the goal is to minimize the total number of disagreements. Significant progress has been made on the ℓ_1-norm objective [21, 3, 22, 27, 26]; Cao, Cohen-Addad, Lee, Li, Newman, and Vogl [14] present a 1.437-approximation algorithm for the ℓ_1-norm objective. The case p = ∞ corresponds to the min-max correlation clustering problem, where the goal is to minimize the maximum number of disagreements over all vertices. While the case p = 1 captures scenarios in which minimizing overall disagreements is desired, the case p = ∞ caters to situations where the quality of every individual needs to be ensured – for example, in community detection problems where no “antagonists” are allowed, with an antagonist referring to an individual whose properties are inconsistent with those of too many other members.
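In particular, the two extreme cases of this family of objectives can be written as follows (our rewriting of the definition above):
\[
\mathrm{obj}_1(\mathcal{C}) \;=\; \sum_{v \in V} \mathrm{dis}(v, \mathcal{C}),
\qquad\qquad
\mathrm{obj}_\infty(\mathcal{C}) \;=\; \max_{v \in V} \mathrm{dis}(v, \mathcal{C}).
\]
Minimizing obj_1 is equivalent, up to the factor of two coming from counting each disagreeing edge at both endpoints, to minimizing the total number of edges in disagreement, while obj_∞ is the min-max objective studied in this paper.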
Compared to the case p = 1, where significant progress has been made, the case p = ∞ remains relatively unexplored. Puleo and Milenkovic [41] proposed an algorithm that achieves a 48-approximation ratio. Their approach uses the standard metric linear programming formulation followed by a rounding algorithm. Later, Charikar, Gupta, and Schwartz [20] improved this result to a 7-approximation using the same framework, and Kalhan, Makarychev, and Zhou [39] further reduced it to a 5-approximation. Davies, Moseley, and Newman [29] designed a combinatorial algorithm that achieves a 40-approximation ratio with a running time polynomial in the number of vertices n. The best known approximation ratio to date is 4, achieved by Heidrich, Irmai, and Andres [37], who also used a combinatorial approach, with a running time that depends on n and the maximum degree Δ of the graph G.
Concerning efficiency, the ultimate goal is an algorithm whose running time is nearly linear in |E^+|. However, none of the aforementioned algorithms achieves such a bound. Recently, Cao, Li, and Ye [16] proposed a nearly linear-time algorithm; while their algorithm is fast, its approximation ratio is a large constant that is far from optimal. This naturally leads to the question:
Can we design a nearly linear-time algorithm for min-max correlation clustering with a small approximation ratio?
We give a nearly linear-time algorithm for the problem that achieves a (3+ε)-approximation. Moreover, the approximation factor can be made exactly 3 with additional running time.
Theorem 1.
Let G = (V, E^+) be a min-max correlation clustering instance, ε > 0 be a small constant, and OPT be the value of the optimal solution. There exist:
1. A randomized algorithm, running in time nearly linear in |E^+|, that outputs a clustering 𝒞 with obj(𝒞) ≤ (3+ε)·OPT w.h.p.¹
2. A deterministic sequential algorithm that outputs a clustering 𝒞 with obj(𝒞) ≤ 3·OPT, whose running time additionally depends on the maximum degree Δ of G.
¹ With high probability, which refers to with probability at least 1 − 1/n^c for a sufficiently large constant c.
1.1 Semi-Streaming and MPC Settings
For large graphs, it can be infeasible to store the graph entirely on a single machine and then compute the solution. A substantial body of research has focused on massively parallel computation (MPC) algorithms for the case p = 1 [10, 40, 31, 11, 25, 7, 12, 15, 28]. Very recently, Cao, Cohen-Addad, Lee, Li, Lolck, Newman, Thorup, Vogl, Yan, and Zhang [13] gave a constant-round MPC algorithm whose approximation ratio essentially matches the best known sequential one.
Moreover, several constant-pass semi-streaming algorithms with Õ(n) space complexity have been developed [23, 2, 25, 9]. The first single-pass semi-streaming algorithm was given by Ahn, Cormode, Guha, McGregor, and Wirth [2], and it was later improved by multiple works [7, 9, 18, 28]. Very recently, a breakthrough result of Assadi, Khanna, and Putterman [5] showed that it is possible to obtain an (α+ε)-approximation in a single pass using Õ(n) space, where α denotes the best approximation ratio achievable by polynomial-time sequential algorithms. Currently, it is known that α ≤ 1.437 [14].
In contrast, less work in these directions has been done for the p = ∞ setting. For the MPC setting, the only known work is by Cao, Li, and Ye [16], who proposed an algorithm that achieves a 63.3-approximation ratio and another algorithm that achieves a 360-approximation ratio (with a different round complexity). For the semi-streaming setting, we are not aware of any existing work. Most of the aforementioned streaming algorithms for p = 1 were specifically tailored to that case, and the sparsification technique of [5] does not seem to generalize readily to other values of p.
In addition to our sequential algorithm, we give a constant-round MPC algorithm and a single-pass semi-streaming algorithm that achieve a (3+ε)-approximation for the problem:
Theorem 2.
Let G = (V, E^+) be a min-max correlation clustering instance, ε > 0 be a small constant, and OPT be the value of the optimal solution. In the following models, there exist randomized algorithms that output a clustering 𝒞 with obj(𝒞) ≤ (3+ε)·OPT w.h.p.:
1. (MPC model) A constant-round algorithm using O(n^δ) memory per machine, for any constant δ ∈ (0, 1), and nearly linear total memory.
2. (Semi-streaming model) A single-pass streaming algorithm that uses Õ(n) space.
1.2 Technical Overview
Better Approximation.
Our first main technical contribution is a newly achieved approximation factor of 3. Given a guess τ for the optimal objective value OPT, if τ ≥ OPT, [37] observed that if the neighborhoods of u and v share more than 2τ elements, then u and v must belong to the same cluster in the optimal solution. Similarly, if the neighborhoods of u and v differ by more than 2τ elements, then they must belong to different clusters in the optimal solution.
Furthermore, they observed that these properties can be used to determine the clusters of vertices of degree at least 4τ. Specifically, if deg(u) ≥ 4τ, then for every other vertex v, either |N(u) ∩ N(v)| > 2τ or |N(u) △ N(v)| > 2τ. Here, N(v) represents the closed neighborhood of vertex v. The remaining vertices can then be placed in singleton clusters, as the disagreements per vertex will be upper bounded by their degrees, which are less than 4τ. Therefore, a clustering with disagreements upper bounded by 4τ can be constructed, resulting in a 4-approximation algorithm.
To achieve a 3-approximation, we first observe that if two vertices u and v both have degrees greater than 3τ, then either |N(u) △ N(v)| ≤ 2τ, in which case u and v must belong to the same cluster, or |N(u) △ N(v)| > 2τ, in which case they must belong to different clusters. In other words, whether u and v belong to the same cluster is uniquely determined in the optimal solution. Therefore, the clustering induced on the high-degree vertices (vertices with degree greater than 3τ) is uniquely determined.
Now the question lies in the placement of the low-degree vertices, that is, vertices with degree at most 3τ. It is unclear whether they should be placed in singleton clusters, as it is possible that they need to be included in the same cluster as certain high-degree vertices; otherwise, the disagreements associated with the high-degree vertices could become too large.
We show that the low-degree vertices can be placed into the high-degree clusters to achieve a maximum disagreement of 3τ, provided τ ≥ OPT. A key structural result we show is that if a low-degree vertex w belongs to some cluster K in an optimal solution, then no vertex outside K can have a neighborhood similar to that of w, i.e., the symmetric difference of their neighborhoods exceeds 2τ. Using this structural result, we show that the following algorithm constructs a clustering with maximum disagreement upper bounded by 3τ (presented slightly differently here than in the main body for the sake of intuition):
1. Form clusters of high-degree vertices based on whether their neighborhoods are similar, for all pairs of high-degree vertices (abort if any inconsistency occurs).
2. Choose an arbitrary vertex in each cluster and have it send proposal messages to the low-degree neighboring vertices whose neighborhoods are similar to its own.
3. For each low-degree vertex that receives at least one proposal, pick one arbitrary proposal and join the cluster containing the vertex that sent it.
4. Place all low-degree vertices that do not receive a proposal into singleton clusters.
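The four steps above can be summarized by the following minimal sketch. It assumes an exact similarity oracle similar(u, v) (testing whether the closed neighborhoods of u and v differ in at most 2τ elements) and a precomputed split into high- and low-degree vertices; all names are ours, and efficiently realizing the oracle and Step 1 is exactly what the rest of the paper develops.

```python
# Sketch of the four-step meta-algorithm (names and representative choices are ours).
def cluster_sketch(closed_nbr, high, low, similar):
    """closed_nbr[v]: closed neighborhood of v; high/low: high-/low-degree vertex sets."""
    # Step 1: group the high-degree vertices; similar high-degree vertices share a cluster.
    clusters = []
    for u in high:
        for c in clusters:
            if similar(u, c[0]):          # one representative per cluster suffices here
                c.append(u)
                break
        else:
            clusters.append([u])
    # (A full implementation also aborts when the pairwise similarities are inconsistent.)

    # Steps 2-3: one arbitrary vertex per cluster proposes to its similar low-degree
    # neighbors; each low-degree vertex accepts at most one proposal.
    taken = set()
    for c in clusters:
        pivot = c[0]
        for w in closed_nbr[pivot]:
            if w in low and w not in taken and similar(pivot, w):
                c.append(w)
                taken.add(w)

    # Step 4: the remaining low-degree vertices become singleton clusters.
    clusters.extend([w] for w in low if w not in taken)
    return clusters
```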
Efficient Implementations.
The remaining question is how such an algorithm can be implemented efficiently, particularly in time (and total memory) nearly linear in |E^+|. A main technical challenge lies in Step 1. To implement it within the aforementioned time bound, we can only afford to conduct similarity tests (i.e., tests of whether two closed neighborhoods differ in at most 2τ elements) Õ(|E^+|) times. However, it is possible for two vertices that are endpoints of a negative edge to have similar neighborhoods and thus need to be placed in the same cluster to achieve a good clustering.
Using the structural result, we further show that two high-degree vertices u and v are in the same cluster in the optimal solution if and only if there are sufficiently many (as a function of τ) disjoint paths of length 2 connecting u and v in the subgraph consisting of the positive edges whose endpoints have similar neighborhoods. This property enables us to develop efficient algorithms for the sublinear MPC model and the sequential model.
The remaining question lies in how to find this subgraph efficiently. To this end, for each vertex v, we treat its neighborhood set as a point in an n-dimensional space. Then, we apply the (discrete) random projection technique [1] developed for the Johnson-Lindenstrauss transform [38] to reduce the dimension to O(ε^{-2} log n) while preserving the distances (up to a 1±ε factor) between the points. For 0/1 vectors, the square of the Euclidean distance is exactly the size of the symmetric difference. Since the dimension is O(ε^{-2} log n), it takes O(ε^{-2} log n) time to compute the difference for each pair we test. To our knowledge, this is the first time that random projection techniques have been applied to computing efficient solutions for correlation clustering and similar problems. This may be of independent interest, as neighborhood similarity is known to be used in various tools such as almost-clique decompositions [36, 19, 33, 34, 32, 4, 30, 6, 24, 7].
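A sketch of this dimension-reduction step, in the spirit of Achlioptas-style ±1 projections [1]; the dimension, the acceptance threshold, and the constants below are illustrative rather than the paper's exact choices.

```python
import numpy as np

# Neighborhood-similarity testing via random projection (constants are illustrative).
# Vertex v is represented by the 0/1 characteristic vector x_v of its closed neighborhood,
# so |N(u) symmetric-difference N(v)| = ||x_u - x_v||^2.  A k x n matrix with i.i.d.
# +/-(1/sqrt(k)) entries preserves these squared distances up to a (1 +/- eps) factor w.h.p. [1, 38].

def build_sketches(closed_nbr, n, eps, rng=np.random.default_rng(0)):
    """closed_nbr: dict mapping each vertex index in {0, ..., n-1} to its closed neighborhood."""
    k = int(np.ceil(24 * np.log(max(n, 2)) / eps**2))   # k = O(log n / eps^2)
    phi = rng.choice([-1.0, 1.0], size=(k, n)) / np.sqrt(k)
    # The sketch of v is phi @ x_v, i.e. the sum of the columns of phi indexed by N(v);
    # over all vertices this touches every positive edge O(k) times.
    return {v: phi[:, list(nbrs)].sum(axis=1) for v, nbrs in closed_nbr.items()}

def similar(sketches, u, v, tau, eps):
    # Accept when the estimated symmetric difference is at most roughly 2*tau; the eps slack
    # absorbs the projection error, giving an approximate (eps, tau)-similarity query.
    return float(np.sum((sketches[u] - sketches[v]) ** 2)) <= 2 * (1 + eps) * tau
```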
Single-Pass Semi-Streaming.
While the above techniques are sufficient for obtaining our MPC and sequential algorithms, the single-pass semi-streaming algorithm introduces additional technical difficulties. The main difficulty for a single-pass semi-streaming algorithm here lies in Step 2, where an arbitrarily chosen high-degree vertex in each cluster proposes to its neighbors that have similar neighborhoods. For convenience, we call the chosen high-degree vertices pivots here.
To be able to do this, we need to memorize the neighbors of all the chosen vertices using Õ(n) space. Suppose that τ ≥ OPT; it can be shown that each cluster in the optimal clustering containing at least one low-degree vertex has size O(τ). As a result, any vertex from these clusters has degree O(τ). Since we pick one pivot per cluster, O(n) space is enough to store the neighbors of the pivots.
However, we do not know beforehand how the clusters of high-degree vertices look like, so the pivots cannot be chosen at the beginning of the stream. Without knowing what the pivots are beforehand, it is difficult to store their neighbors in the same pass.
As a result, we sample each vertex (both high-degree and low-degree vertices) independently with probability Θ(log n / τ) so that, w.h.p., every sufficiently large cluster in the optimal clustering contains sampled vertices. Then we store the neighbors of all the sampled vertices. This poses another problem: low-degree vertices may be chosen as pivots. However, a low-degree vertex is exempted from our structural result – using it as a pivot may steal vertices from other clusters in an optimal clustering.
To resolve this, we do the following. For each sampled vertex, we first try to recover the high-degree portion of the cluster containing it in the optimal solution. We construct a candidate set that contains vertices that would not be added to other clusters. When using the sampled vertex as a pivot, we restrict it to consider only its neighbors inside the candidate set, to ensure that it does not steal vertices from other clusters.
Roughly speaking, the candidate set contains all the low-degree vertices that have a similar neighborhood to every vertex in the recovered high-degree portion but a dissimilar neighborhood to every vertex in any other cluster. We show that such a modification does not affect the approximation ratio. Furthermore, since the candidate sets are defined based on similarity of neighborhoods, they can be constructed by the aforementioned dimension-reduction technique, which takes Õ(n) space.
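A schematic one-pass skeleton of the sampling idea; the sampling probability p, the post-processing, and the exact space accounting follow the analysis sketched above and in the full version, and all names here are ours.

```python
import random
from collections import defaultdict

def stream_pass(edge_stream, vertices, p):
    """One pass over the positive edges; p ~ Theta(log n / tau) in the intended use."""
    sampled = {v for v in vertices if random.random() < p}   # potential pivots, fixed up front
    deg = defaultdict(int)                                   # degrees of all vertices: O(n) words
    nbrs = defaultdict(set)                                  # neighborhoods of sampled vertices only
    for u, v in edge_stream:
        deg[u] += 1
        deg[v] += 1
        if u in sampled:
            nbrs[u].add(v)
        if v in sampled:
            nbrs[v].add(u)
    # After the pass: the degrees identify high-/low-degree vertices, and the stored
    # neighborhoods (together with the projection sketches of Section 4.2) are what the
    # candidate-set construction and the proposal step operate on.
    return sampled, deg, nbrs
```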
2 Preliminaries
Definition 3.
Given a graph G = (V, E) and a vertex v ∈ V, let N_G(v) denote v together with its neighbors in G, i.e., the closed neighborhood of v, and let deg_G(v) denote the number of neighbors of v in G. We use N(v) and deg(v) to denote the corresponding quantities in the input graph G = (V, E^+) when the graph is clear from the context.
Definition 4.
Let A and B be sets. The symmetric difference between A and B, denoted A △ B, is defined as A △ B = (A ∖ B) ∪ (B ∖ A).
In general, we say that A and B are similar if |A △ B| is small.
Lemma 5 (Triangle Inequality [35]).
Let A, B, and C be sets. Then |A △ C| ≤ |A △ B| + |B △ C|.
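The inequality follows from a containment of symmetric differences; a one-line justification:
\[
A \triangle C \;\subseteq\; (A \triangle B) \cup (B \triangle C)
\quad\Longrightarrow\quad
|A \triangle C| \;\le\; |A \triangle B| + |B \triangle C|.
\]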
Definition 6.
Given any clustering 𝒞 and any vertex v, 𝒞(v) is defined to be the cluster of 𝒞 containing v.
Definition 7.
Given a clustering 𝒞, define dis(v, 𝒞) as the number of disagreements incident to vertex v. More precisely, dis(v, 𝒞) = |{u : uv ∈ E^+ and 𝒞(u) ≠ 𝒞(v)}| + |{u ≠ v : uv ∉ E^+ and 𝒞(u) = 𝒞(v)}|.
Definition 8.
Given a clustering 𝒞, the objective value of 𝒞, denoted obj(𝒞), is defined as the maximum number of incident disagreements over all vertices, i.e., obj(𝒞) = max_{v ∈ V} dis(v, 𝒞).
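To make Definitions 7 and 8 concrete, a small self-contained example (the graph and the clustering are ours, chosen only for illustration):

```python
from itertools import combinations

# Toy illustration of Definitions 7 and 8: per-vertex disagreements and the min-max objective.
positive = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]}
vertices = {1, 2, 3, 4, 5}
clustering = [{1, 2, 3}, {4, 5}]
cluster_of = {v: i for i, c in enumerate(clustering) for v in c}

dis = {v: 0 for v in vertices}
for u, v in combinations(sorted(vertices), 2):
    same_cluster = cluster_of[u] == cluster_of[v]
    positive_edge = frozenset((u, v)) in positive
    # A positive edge across clusters, or a negative edge inside a cluster, is a disagreement
    # charged to both of its endpoints.
    if positive_edge != same_cluster:
        dis[u] += 1
        dis[v] += 1

print(dis)                    # {1: 0, 2: 0, 3: 1, 4: 1, 5: 0}: only the edge (3, 4) disagrees
print(max(dis.values()))      # 1, the l-infinity (min-max) objective of this clustering
```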
The MPC Model.
In the MPC model, computation proceeds in synchronous parallel rounds across multiple machines. Each machine has memory S. At the beginning of the computation, the data is arbitrarily partitioned across the machines. During each round, machines process data locally, exchange messages with other machines, and send or receive messages of total size at most S. The efficiency of an algorithm in this model is measured by the number of rounds required for the algorithm to terminate and the size of the memory available to each machine.
In this paper, we focus on the most practical and challenging regime, also known as the strictly sublinear regime, where each machine has O(n^δ) local memory. Here, n represents the number of vertices, and δ ∈ (0, 1) is an arbitrary constant. Under this assumption, the input assigned to each machine and the messages exchanged during any round are of size O(n^δ).
The Semi-Streaming Model.
In the semi-streaming model, the input is a stream of the edges in E^+. We are allowed to use Õ(n) space, where the space complexity is measured in the number of words used and a word consists of O(log n) bits. A solution is expected to be output at the end of the stream.
3 Algorithm
Definition 9.
For any u, v ∈ V, an (ε, τ)-similarity query on the pair (u, v) returns 1 if |N(u) △ N(v)| ≤ 2τ, returns 0 if |N(u) △ N(v)| > 2(1+ε)τ, and may return either answer otherwise.
Definition 10.
(Shorthand for (ε, τ)-similarity query) For readability, we use u ∼ v to denote that an (ε, τ)-similarity query on (u, v) has been conducted and returned 1. We use u ≁ v to denote that the query returned 0.
The parameter ε was introduced to accommodate the error induced by an approximate test of whether |N(u) △ N(v)| ≤ 2τ. For the sake of simplicity, we urge the reader to assume ε = 0 when reading for the first time. When ε = 0, u ∼ v if and only if |N(u) △ N(v)| ≤ 2τ.
Definition 11.
Define V_low = {v ∈ V : deg(v) ≤ 3τ} and V_high = V ∖ V_low, where we call the vertices in V_low and V_high low-degree and high-degree vertices, respectively.
Algorithm Description.
Algorithm 1 takes two parameters τ and ε, where τ is a guess on an upper bound of OPT and ε is an error-control parameter. The goal of the algorithm is to output a solution of value at most (3 + O(ε))τ provided OPT ≤ τ. Note that ε can be set to zero (at the cost of computing the more expensive exact neighborhood similarity).
The algorithm works as follows: first, we form a clustering of high-degree vertices based on the similarity between the neighborhoods of vertices. If two high-degree vertices u and v have similar neighborhoods (i.e., u ∼ v), then they will be placed in the same cluster.
Once the high-degree clusters are formed, we go through each cluster. For each cluster, we pick an arbitrary pivot in it. For each neighbor of the pivot that remains unclustered, we include it in the cluster if its neighborhood is similar to the pivot's (i.e., the similarity query returns 1). Then we remove the newly added vertices from the set of unclustered vertices.
Once we have gone through every cluster, there might still be some unclustered low-degree vertices. Each such vertex is put into a singleton cluster. Then, we check whether the clustering we have obtained has objective value at most (3 + O(ε))τ. If it does, then we are done. If not, we conclude that OPT > τ, so we would need to set our guess τ larger.
Input: A graph G = (V, E^+) and parameters τ and ε.
Output: A clustering 𝒞 with obj(𝒞) ≤ (3 + O(ε))τ or “OPT > τ”.
Theorem 12.
Suppose that OPT ≤ τ. Algorithm 1 outputs a clustering 𝒞 with obj(𝒞) ≤ (3 + O(ε))τ.
Proof.
Let 𝒞* be an optimal solution, so obj(𝒞*) = OPT ≤ τ. In the next subsections, we show the following:
1. (High-Degree Nodes Clustering). The clustering that Algorithm 1 forms on the high-degree vertices coincides with the clustering that 𝒞* induces on them.
2. (No Stealing on Low-Degree Nodes). For each low-degree vertex, let K be the cluster of 𝒞* containing it. The vertex is never added to a cluster whose high-degree part comes from a different cluster of 𝒞*; that is, the low-degree vertices of K are not taken by other clusters.
3. (Low-Degree Nodes Inclusion). Every low-degree vertex of K has a neighborhood similar to that of the chosen pivot; consequently, if it is a neighbor of the pivot, it is added to the cluster built from K.
4. (Closeness). Every non-singleton cluster that the algorithm outputs is close, in terms of symmetric difference, both to the corresponding cluster of 𝒞* and to the closed neighborhoods of its high-degree vertices.
Once we have shown the above, we can see that as follows. Consider a component of . If is a singleton with , then obviously, . Otherwise, contains some vertex with . By Item 1, there must exist such that .
Let be any vertex in . We will show that . Suppose that , we have:
(by Lemma 5)
Otherwise, if then it must be the case that . In such a case, must be a vertex in added to in Line 12 to Line 13. This implies and thus . Therefore:
(by Lemma 5)
Remark 13.
To get a (3+ε)-approximation algorithm, O(log n) calls to Algorithm 1 are sufficient, as we can perform a binary search on τ in the range [0, n] with a fixed ε and take the solution output by the algorithm for the smallest successful guess τ. A 3-approximation algorithm can also be achieved within O(log n) calls of Algorithm 1 by setting ε = 0.
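A sketch of this binary-search driver; cluster_with_guess stands for one call to Algorithm 1, and the exact guarantee of the returned clustering follows the discussion above.

```python
def min_max_cluster(cluster_with_guess, n, eps):
    """cluster_with_guess(tau, eps): a clustering of objective O(tau), or None if it concludes OPT > tau."""
    lo, hi = 0, n                      # OPT always lies in [0, n]
    best = None
    while lo <= hi:                    # O(log n) calls to Algorithm 1 in total
        tau = (lo + hi) // 2
        result = cluster_with_guess(tau, eps)
        if result is not None:         # success: a solution for this guess is in hand
            best = result
            hi = tau - 1               # try to certify a smaller guess
        else:                          # failure certifies OPT > tau (Theorem 12)
            lo = tau + 1
    return best
```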
We prove each of the four items in the following subsections. Note that the no-stealing property of low-degree vertices (Theorem 17 in Section 3.2) is the main structural result: it not only leads to the guarantee on the approximation ratio, but it is also crucial in the design of the efficient algorithms in the subsequent sections.
3.1 High-Degree Nodes Clustering
Recall that τ is our guess for an upper bound on the value of the optimal solution. In this subsection, we show that if τ is indeed such an upper bound, then there is a unique way to form the clustering on the high-degree nodes in the optimal solution. Our algorithm clusters the high-degree nodes in this unique way. The following two lemmas were observed by [37].
Lemma 14.
Suppose that |N(u) ∩ N(v)| > 2τ. If there is a clustering 𝒞 such that u and v are in different clusters, then obj(𝒞) > τ.
Proof.
Consider any w ∈ N(u) ∩ N(v). Since u and v are in different clusters, w lies outside the cluster of u or outside the cluster of v; say w ∉ 𝒞(v) (the other case is symmetric). Then w ≠ v and, since w ∈ N(v), the positive edge wv crosses two clusters and is in disagreement, contributing to the disagreements of v. Charging each such w in this way, we get dis(u, 𝒞) + dis(v, 𝒞) ≥ |N(u) ∩ N(v)| > 2τ. Therefore, obj(𝒞) > τ.
Lemma 15.
Suppose that |N(u) △ N(v)| > 2τ. If there is a clustering 𝒞 such that u and v are in the same cluster, then obj(𝒞) > τ.
Proof. Consider any w ∈ N(u) △ N(v); by symmetry, say w ∈ N(u) ∖ N(v). If w lies in the common cluster of u and v, then w ≠ v and the pair wv is a negative edge inside a cluster, a disagreement contributing to v. Otherwise w ∉ 𝒞(u), so w ≠ u and the positive edge wu crosses two clusters, a disagreement contributing to u. Charging each such w in this way, we get dis(u, 𝒞) + dis(v, 𝒞) ≥ |N(u) △ N(v)| > 2τ. Therefore, obj(𝒞) > τ.
Lemma 16.
Let be a clustering with . For any , .
Proof.
Suppose on the contrary that . Then there exists some node such that .
Suppose that there exists a node such that but . Then there must exist such that . This implies and so . Therefore,
By Lemma 14, , a contradiction.
Otherwise, it must be the case that but . In this case, there exists such that . This implies that , which in turns implies . By Lemma 15, , a contradiction.
3.2 No Stealing on Low-Degree Nodes
In the algorithm, low-degree nodes (i.e., nodes with degree at most 3τ) are added to the clusters formed by high-degree nodes iteratively. In this subsection, we show that if a low-degree node belongs to a cluster K of the optimal solution, then it will not be included in a cluster built from a different optimal cluster. This implies that the node remains among the candidate vertices to be added to the cluster built from K. Then, in the next subsection, we show that it will indeed be added to that cluster.
Theorem 17.
Let be a clustering with Let and be vertices of degree greater than where , and be a vertex with . If then and so .
A representative case that illustrates the intuition of why the theorem holds is when , i.e. the three sets have small intersections. Since they have small intersection, together with the fact that and have degrees greater than , it cannot be the case that both the symmetric differences and are small, as illustrated in Figure 1. We will refer the case that as the easy case.
When the three sets have a large intersection, with a more sophisticated argument on the relations among , and , it can also be shown that and will not intersect a lot. We will refer to the case that as the hard case.
3.2.1 The easy case
We will first show the theorem for the easy case as follows.
Lemma 18.
Let be a clustering with Let and be vertices of degree greater than where , and be a vertex with . If and , then and so .
Proof.
We will show that is greater than . Figure 1 gives a high level illustration of why this should be true. To show this formally, first note that:
Re-arranging, we have:
(1)
By the same reasoning, we have
(2)
Now consider the following:
(by (1) and (2))
Therefore, assume to the contrary that and so . Then, it must be the case that . By the fact that and Lemma 15, , a contradiction.
3.2.2 The hard case
Now we consider the case where . To illustrate the idea of proof, suppose that for some . If we follow the same argument as Lemma 18, we would only be able to show that as opposed to .
However, if this is the case, we can show that , as stated in Lemma 19. This would make .
The high-level idea of the proof of Lemma 19 is illustrated in Figure 2. First, we observe that at most vertices in can be contained in ; otherwise the vertex would have disagreements greater than . This implies the contribution of disagreement from to (and to ) is already at least . If the symmetric difference is too large, i.e. larger than , then it would force the disagreements of either or to be too large.
Lemma 19.
Let be a clustering with Let and be vertices of degree greater than where , and be a vertex with . Suppose that for some . Then, .
Proof.
Let . First we claim that at most vertices in can be in . This in turns implies that vertices in contributes at least disagreements to vertex and vertex . To see why this holds, suppose to the contrary that . Since and , it must be the case that , a contradiction. Therefore, we conclude that:
(3)
Now note that the disagreements of and are at least:
Using the fact and summing up the above two inequalities, we have:
Therefore, .
Lemma 20.
Let be a clustering with Let and be vertices of degree greater than where , and be a vertex with . If and , then and so .
Proof.
By (1) in the proof of Lemma 18, we have:
Therefore,
By Lemma 19, we have , therefore:
Hence, . Lemma 18 and Lemma 20 complete the proof of Theorem 17. Theorem 17 immediately implies the following:
Corollary 21.
Let be the cluster in such that . For any , .
3.3 Low-Degree Nodes Inclusion
Here, we show that all the vertices in have similar neighborhoods with . As a result, if they are neighbors of , they will be added to .
Lemma 22.
Let be the cluster in such that . For any , .
Proof.
If , then because and by Lemma 16. Now it remains to consider the case . It suffices for us to show that , where , as will be added to .
By Corollary 21, we have , which means is not going to be stolen by other clusters. To show that , now it suffices to show that always holds. This is indeed true, because:
(by Lemma 5)
3.4 Closeness
In the following, we will show that the cluster we constructed will be similar to and . Intuitively, this holds because the low-degree part of will be sandwiched between the low-degree part of and , i.e. . Also, we know that and are close (i.e. ), so somehow cannot be too far away from and . Note that the high-degree parts of and coincide.
Lemma 23.
Let be the cluster in such that . For any , and .
Proof.
We first show that .
(4) (by Lemma 22)
Moreover, since , it must be the case that
(5)
Now we show that .
4 Efficient Algorithms for the Sequential and MPC Models
To make the algorithm efficient in the sequential and MPC settings, we need to address several challenges. First, we need to compute the graph for high-degree nodes. Second, we need to be able to conduct approximate neighborhood similarity queries efficiently. We will address these issues one by one.
4.1 Computing the Clustering for High-Degree Nodes
In this section, we show that although the number of pairs of high-degree vertices may be significantly larger than |E^+|, it is sufficient to make only O(|E^+|) neighborhood similarity queries to identify the high-degree clusters. Algorithm 2 describes how to form a clustering of the high-degree vertices by conducting O(|E^+|) similarity queries.
Input: A graph G = (V, E^+), a parameter τ, and a parameter ε.
Output: A clustering of the high-degree vertices V_high (see Lemma 24 for the guarantee).
Algorithm Description.
In Algorithm 2, we first perform neighborhood similarity queries on the positive edges to construct the subgraph consisting of those edges whose endpoints have similar neighborhoods. Then we assign every vertex a unique identifier. By sending the ID of every high-degree vertex to its neighbors in this subgraph, every vertex can learn the ID of the high-degree vertex with the smallest ID in its closed neighborhood in the subgraph (Line 4) and then record this ID as its label. Now, every vertex (including low-degree vertices) sends a token with the value of its label to all of its neighbors in the subgraph. Then, if a vertex received sufficiently many tokens of the same value, it sets its cluster ID to be the minimum value that occurs sufficiently often. Finally, we assign all high-degree vertices with the same cluster ID to the same cluster.
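A schematic version of the procedure just described; the token threshold is left as a parameter (the paper fixes it as a function of τ), similar(u, v) stands for an (ε, τ)-similarity query, and all names are ours.

```python
def high_degree_clusters(pos_edges, high, similar, threshold, vid):
    """vid[v]: unique integer identifier of v; threshold: token count needed to adopt a label."""
    # Step 1: keep only the positive edges whose endpoints have similar neighborhoods.
    F = {}
    for u, v in pos_edges:
        if similar(u, v):
            F.setdefault(u, set()).add(v)
            F.setdefault(v, set()).add(u)

    # Step 2: each vertex learns the smallest ID of a high-degree vertex in its closed
    # neighborhood of the pruned graph (None if there is no such vertex).
    label = {}
    for v in F:
        ids = [vid[u] for u in F[v] | {v} if u in high]
        label[v] = min(ids) if ids else None

    # Step 3: every vertex forwards its label to its neighbors in the pruned graph; a vertex
    # adopts the smallest label it hears at least `threshold` times as its cluster ID.
    cluster_id = {}
    for v in F:
        counts = {}
        for u in F[v]:
            if label[u] is not None:
                counts[label[u]] = counts.get(label[u], 0) + 1
        frequent = [lab for lab, c in counts.items() if c >= threshold]
        if frequent:
            cluster_id[v] = min(frequent)

    # Step 4: group the high-degree vertices that received the same cluster ID.
    groups = {}
    for v in high:
        if v in cluster_id:
            groups.setdefault(cluster_id[v], []).append(v)
    return list(groups.values())
```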
In Lemma 24, we show that if OPT ≤ τ, then the output coincides with the clustering that the optimal solution induces on the high-degree vertices. The proof is based on showing that if two high-degree vertices are in the same cluster of the optimal solution, then there will be sufficiently many disjoint paths of length two between them in the pruned graph.
Lemma 24.
Suppose that OPT ≤ τ. Algorithm 2 outputs a clustering of V_high that coincides with the clustering induced on V_high by the optimal solution.
Proof.
Let be a clustering with . Let be a vertex in . Let be the vertex with the minimum ID in . We will show that
-
1.
Every vertex will receive at least tokens of values in Line 6.
-
2.
In addition, if received at least tokens of value , then .
Once these two points are established, every vertex will be assigned . Consequently, this ensures that . Now we start with the first point, since , ; for otherwise by Lemma 15. This implies:
Also, since and by definition of , we have:
(6)
Therefore,
(by (6))
Hence:
Note that all vertices in must have their values equal to . This is because if and , then . If and , then it must be the case that . Moreover, cannot contain any cluster nodes outside by Theorem 17. Thus, . This implies will receive at least tokens with value .
Next we show that if there is a value such that receives from at least tokens, then . Suppose to the contrary that . First note that if has , then it cannot be the case that . Otherwise, , , and a vertex whose ID is , would all be in , which contradict with the fact is the smallest ID in . Therefore, it must be the case that vertices with are all in .
Now note that if then by Theorem 17, can only be connected to vertices in in . In this case, so by the assumption of . So the only possibility for to have is when . However, as , there are at most neighbors of not in . This implies will receive less than tokens whose value equals to .
4.2 Approximate Neighborhood Similarity Testing by Random Projection
We show that, w.h.p., the (ε, τ)-similarity queries for all the pairs the algorithm needs can be answered in polylogarithmic time per query after nearly linear preprocessing, and thus the pruned graph can be constructed in nearly linear time.
Definition 25.
Given a vertex v, let x_v ∈ {0, 1}^{|V|} denote the characteristic vector of N(v).
Lemma 26.
Given ε > 0, let k = c · ε^{-2} log n for some large enough constant c. Let Φ be a k × |V| matrix where each entry is drawn from {−1/√k, +1/√k} uniformly at random. W.h.p., for every two vertices u and v, we have (1 − ε) · |N(u) △ N(v)| ≤ ‖Φx_u − Φx_v‖² ≤ (1 + ε) · |N(u) △ N(v)|.
Proof.
The lemma follows by observing that ‖x_u − x_v‖² = |N(u) △ N(v)| and applying the Johnson–Lindenstrauss-type guarantee for such random matrices [1, 38].
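In more detail, assuming the entries of Φ are independent and uniformly ±1/√k as above, for the 0/1 characteristic vectors x_u and x_v we have
\[
\|x_u - x_v\|_2^2 \;=\; |N(u) \triangle N(v)|,
\qquad
\mathbb{E}\bigl[\|\Phi x_u - \Phi x_v\|_2^2\bigr] \;=\; \|x_u - x_v\|_2^2 ,
\]
and the Johnson–Lindenstrauss analysis [1, 38] shows that the squared projected distance concentrates within a (1 ± ε) factor of this expectation w.h.p. once k = O(ε^{-2} log n).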
Lemma 27.
Let be such that , , and set:
W.h.p. the above implementation returns a correct answer for the (ε, τ)-similarity query.
Proof.
It suffices to show that if then w.h.p. we will set to be and if , then w.h.p. we will set to be 1.
If , then w.h.p.
Thus, returns 1.
On the other hand, if then w.h.p.
Thus, returns 0.
We have presented the key ingredients that lead to efficient sequential and MPC algorithms. The details of their implementations are presented in the full version [17]. For the semi-streaming algorithm, different challenges arise. The necessary modifications and details are also included in the full version.
References
- [1] Dimitris Achlioptas. Database-friendly random projections: Johnson-Lindenstrauss with binary coins. J. Comput. Syst. Sci., 66(4):671–687, 2003. doi:10.1016/S0022-0000(03)00025-4.
- [2] Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, and Anthony Wirth. Correlation clustering in data streams. Algorithmica, 83(7):1980–2017, 2021. doi:10.1007/S00453-021-00816-9.
- [3] Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: Ranking and clustering. J. ACM, 55(5):23:1–23:27, 2008. doi:10.1145/1411509.1411513.
- [4] Sepehr Assadi, Yu Chen, and Sanjeev Khanna. Sublinear algorithms for vertex coloring. In Proc. 30th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 767–786, 2019. doi:10.1137/1.9781611975482.48.
- [5] Sepehr Assadi, Sanjeev Khanna, and Aaron Putterman. Correlation clustering and (de)sparsification: Graph sketches can match classical algorithms. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing (STOC), pages 417–428, 2025. doi:10.1145/3717823.3718194.
- [6] Sepehr Assadi, Pankaj Kumar, and Parth Mittal. Brooks’ theorem in graph streams: A single-pass semi-streaming algorithm for Δ-coloring. TheoretiCS, 2, 2023. doi:10.46298/THEORETICS.23.9.
- [7] Sepehr Assadi and Chen Wang. Sublinear time and space algorithms for correlation clustering via sparse-dense decompositions. In 13th Innovations in Theoretical Computer Science Conference (ITCS), volume 215, pages 10:1–10:20, 2022. doi:10.4230/LIPICS.ITCS.2022.10.
- [8] Nikhil Bansal, Avrim Blum, and Shuchi Chawla. Correlation clustering. Machine Learning, 56(1-3):89–113, 2004. doi:10.1023/B:MACH.0000033116.57574.95.
- [9] Soheil Behnezhad, Moses Charikar, Weiyun Ma, and Li-Yang Tan. Almost 3-approximate correlation clustering in constant rounds. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 720–731, 2022.
- [10] Guy E Blelloch, Jeremy T Fineman, and Julian Shun. Greedy sequential maximal independent set and matching are parallel on average. In Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures, pages 308–317, 2012. doi:10.1145/2312005.2312058.
- [11] Mélanie Cambus, Davin Choo, Havu Miikonen, and Jara Uitto. Massively parallel correlation clustering in bounded arboricity graphs. In Seth Gilbert, editor, 35th International Symposium on Distributed Computing (DISC), volume 209 of LIPIcs, pages 15:1–15:18, 2021. doi:10.4230/LIPICS.DISC.2021.15.
- [12] Mélanie Cambus, Fabian Kuhn, Etna Lindy, Shreyas Pai, and Jara Uitto. A single-pass semi-streaming algorithm for -approximate correlation clustering. In Proc. 35th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2024. to appear.
- [13] Nairen Cao, Vincent Cohen-Addad, Euiwoong Lee, Shi Li, David Rasmussen Lolck, Alantha Newman, Mikkel Thorup, Lukas Vogl, Shuyi Yan, and Hanwen Zhang. Solving the correlation cluster lp in sublinear time. In Proc. 57th Annual ACM Symposium on Theory of Computing (STOC), pages 1154–1165, 2025. doi:10.1145/3717823.3718181.
- [14] Nairen Cao, Vincent Cohen-Addad, Euiwoong Lee, Shi Li, Alantha Newman, and Lukas Vogl. Understanding the cluster linear program for correlation clustering. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC), pages 1605–1616, 2024. doi:10.1145/3618260.3649749.
- [15] Nairen Cao, Shang-En Huang, and Hsin-Hao Su. Breaking 3-factor approximation for correlation clustering in polylogarithmic rounds. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 4124–4154. SIAM, 2024. doi:10.1137/1.9781611977912.143.
- [16] Nairen Cao, Shi Li, and Jia Ye. Simultaneously approximating all norms for massively parallel correlation clustering. In Proc. ICALP, 2025. doi:10.4230/LIPIcs.ICALP.2025.40.
- [17] Nairen Cao, Steven Roche, and Hsin-Hao Su. Min-max correlation clustering via neighborhood similarity. arXiv preprint arXiv:2502.12519, 2025. doi:10.48550/arXiv.2502.12519.
- [18] Sayak Chakrabarty and Konstantin Makarychev. Single-pass pivot algorithm for correlation clustering. keep it simple! arXiv preprint arXiv:2305.13560, 2023. doi:10.48550/arXiv.2305.13560.
- [19] Yi-Jun Chang, Wenzheng Li, and Seth Pettie. An optimal distributed (Δ+1)-coloring algorithm? In Proc. 50th ACM Symposium on Theory of Computing (STOC), pages 445–456, 2018. doi:10.1145/3188745.3188964.
- [20] Moses Charikar, Neha Gupta, and Roy Schwartz. Local guarantees in graph cuts and clustering. In Integer Programming and Combinatorial Optimization (IPCO), pages 136–147, 2017. doi:10.1007/978-3-319-59250-3_12.
- [21] Moses Charikar, Venkatesan Guruswami, and Anthony Wirth. Clustering with qualitative information. Journal of Computer and System Sciences, 71(3):360–383, 2005. doi:10.1016/J.JCSS.2004.10.012.
- [22] Shuchi Chawla, Konstantin Makarychev, Tselil Schramm, and Grigory Yaroslavtsev. Near optimal LP rounding algorithm for correlation clustering on complete and complete k-partite graphs. In Proceedings of the 47th Annual ACM Symposium on Theory of Computing (STOC), pages 219–228, 2015. doi:10.1145/2746539.2746604.
- [23] Flavio Chierichetti, Nilesh Dalvi, and Ravi Kumar. Correlation clustering in mapreduce. In Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 641–650, 2014. doi:10.1145/2623330.2623743.
- [24] Vincent Cohen-Addad, Silvio Lattanzi, Slobodan Mitrovic, Ashkan Norouzi-Fard, Nikos Parotsidis, and Jakub Tarnawski. Correlation clustering in constant many parallel rounds. In Proceedings of the 38th International Conference on Machine Learning (ICML), volume 139 of Proceedings of Machine Learning Research, pages 2069–2078, 2021. URL: http://proceedings.mlr.press/v139/cohen-addad21b.html.
- [25] Vincent Cohen-Addad, Silvio Lattanzi, Slobodan Mitrovic, Ashkan Norouzi-Fard, Nikos Parotsidis, and Jakub Tarnawski. Correlation clustering in constant many parallel rounds. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event, volume 139 of Proceedings of Machine Learning Research, pages 2069–2078. PMLR, 2021. URL: http://proceedings.mlr.press/v139/cohen-addad21b.html.
- [26] Vincent Cohen-Addad, Euiwoong Lee, Shi Li, and Alantha Newman. Handling correlated rounding error via preclustering: A 1.73-approximation for correlation clustering. In 64th IEEE Annual Symposium on Foundations of Computer Science (FOCS). IEEE, 2023. doi:10.1109/FOCS57990.2023.00065.
- [27] Vincent Cohen-Addad, Euiwoong Lee, and Alantha Newman. Correlation clustering with sherali-adams. In 63rd IEEE Annual Symposium on Foundations of Computer Science (FOCS), pages 651–661. IEEE, 2022. doi:10.1109/FOCS54457.2022.00068.
- [28] Vincent Cohen-Addad, David Rasmussen Lolck, Marcin Pilipczuk, Mikkel Thorup, Shuyi Yan, and Hanwen Zhang. Combinatorial correlation clustering. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 1617–1628, 2024. doi:10.1145/3618260.3649712.
- [29] Sami Davies, Benjamin Moseley, and Heather Newman. Fast combinatorial algorithms for min max correlation clustering. In International Conference on Machine Learning, pages 7205–7230. PMLR, 2023. URL: https://proceedings.mlr.press/v202/davies23a.html.
- [30] Manuela Fischer, Magnús M. Halldórsson, and Yannic Maus. Fast distributed brooks’ theorem. In Proc. ACM-SIAM Symposium on Discrete Algorithms SODA, pages 2567–2588, 2023. doi:10.1137/1.9781611977554.CH98.
- [31] Manuela Fischer and Andreas Noever. Tight analysis of parallel randomized greedy MIS. ACM Trans. Algorithms, 16(1):6:1–6:13, 2020. doi:10.1145/3326165.
- [32] Maxime Flin, Mohsen Ghaffari, Magnús M. Halldórsson, Fabian Kuhn, and Alexandre Nolin. A distributed palette sparsification theorem. In Proc. ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 4083–4123. SIAM, 2024. doi:10.1137/1.9781611977912.142.
- [33] Magnús M. Halldórsson, Fabian Kuhn, Yannic Maus, and Tigran Tonoyan. Efficient randomized distributed coloring in CONGEST. In Proc. 53rd ACM Symposium on Theory of Computing (STOC), pages 1180–1193, 2021. doi:10.1145/3406325.3451089.
- [34] Magnús M. Halldórsson, Fabian Kuhn, Alexandre Nolin, and Tigran Tonoyan. Near-optimal distributed degree+1 coloring. In Proc. 54th ACM Symposium on Theory of Computing (STOC), pages 450–463, 2022. doi:10.1145/3519935.3520023.
- [35] Paul R. Halmos. Naive Set Theory. Van Nostrand Reinhold, Princeton, NJ, 1960.
- [36] David G. Harris, Johannes Schneider, and Hsin-Hao Su. Distributed (Δ+1)-coloring in sublogarithmic rounds. J. ACM, 65(4):19:1–19:21, 2018. Preliminary version appeared in Proc. 48th Annual ACM Symposium on Theory of Computing (STOC), pages 465–478, 2016. doi:10.1145/3178120.
- [37] Holger SG Heidrich, Jannik Irmai, and Bjoern Andres. A 4-approximation algorithm for min max correlation clustering. In AISTATS, pages 1945–1953, 2024.
- [38] William B. Johnson and Joram Lindenstrauss. Extensions of lipschitz mappings into hilbert space. Contemporary mathematics, 26:189–206, 1984. URL: https://api.semanticscholar.org/CorpusID:117819162.
- [39] Sanchit Kalhan, Konstantin Makarychev, and Timothy Zhou. Correlation clustering with local objectives. In Advances in Neural Information Processing Systems (NeurIPS), volume 32, 2019.
- [40] Xinghao Pan, Dimitris Papailiopoulos, Samet Oymak, Benjamin Recht, Kannan Ramchandran, and Michael I. Jordan. Parallel correlation clustering on big graphs. In Proc. 28th International Conference on Neural Information Processing Systems (NIPS), pages 82–90, 2015. URL: https://proceedings.neurips.cc/paper/2015/hash/b53b3a3d6ab90ce0268229151c9bde11-Abstract.html.
- [41] Gregory J. Puleo and Olgica Milenkovic. Correlation clustering and biclustering with locally bounded errors. IEEE Trans. Inf. Theory, 64(6):4105–4119, 2018. doi:10.1109/TIT.2018.2819696.