Invited Talk
Recent Progress on Correlation Clustering: From Local Algorithms to Better Approximation Algorithms and Back (Invited Talk)

Authors: Vincent Cohen-Addad

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)

Correlation clustering is a classic model for clustering problems arising in machine learning and data mining. Given a set of data elements represented as vertices of a graph and pairwise similarity represented as edges, the goal is to find a partition of the vertex set so as to minimize the total number of edges across the parts plus the total number of non-edges within the parts. Introduced in the early 2000s [Bansal et al., 2004], correlation clustering has received a large amount of attention through the years. A natural linear programming relaxation was shown in 2005 to have an integrality gap of at least 2 and at most 2.5 [Ailon et al., 2008], and in 2015 the upper bound was improved to 2.06 [Chawla et al., 2015]. In 2021, motivated by large-scale applications, new structural insights led to a simple, practical algorithm that achieves an O(1)-approximation in a variety of models (Massively Parallel, Sublinear, Streaming, or Differentially Private) [Cohen-Addad et al., 2021; Cohen-Addad et al., 2022]. These new insights turned out to be a key building block in designing better algorithms: they provide a pre-clustering of the input graph that enables algorithms with approximation guarantees significantly better than 2 [Cohen-Addad et al., 2023; Cohen-Addad et al., 2022]. This pre-clustering is a key component in the new algorithm that achieves a 1.44-approximation [Cao et al., 2024] and in the new local-search-based 1.84-approximation for the Massively Parallel, Sublinear, and Streaming models [Cohen-Addad et al., 2024]. This talk will review these recent developments and the main open research directions. It is based on a collection of joint works with Nairen Cao, Silvio Lattanzi, Euiwoong Lee, Shi Li, David Rasmussen Lolck, Slobodan Mitrovic, Alantha Newman, Ashkan Norouzi-Fard, Nikos Parotsidis, Marcin Pilipczuk, Jakub Tarnawski, Mikkel Thorup, Lukas Vogl, Shuyi Yan, Hanwen Zhang.
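The objective stated in the abstract above can be made concrete with a small sketch. It only evaluates the cost of a given partition and is not any of the algorithms cited in the talk; the function name `disagreements` and the toy graph are illustrative assumptions.

```python
from itertools import combinations

def disagreements(n, edges, cluster_of):
    """Count edges across clusters plus non-edges inside clusters.

    n          -- number of vertices, labelled 0..n-1
    edges      -- iterable of similar pairs (u, v)
    cluster_of -- cluster_of[v] is the cluster label of vertex v
    """
    edge_set = {frozenset(e) for e in edges}
    cost = 0
    for u, v in combinations(range(n), 2):
        similar = frozenset((u, v)) in edge_set
        together = cluster_of[u] == cluster_of[v]
        # A pair disagrees when it is similar but separated,
        # or dissimilar but placed in the same cluster.
        if similar != together:
            cost += 1
    return cost

# Hypothetical toy instance: a triangle {0, 1, 2} with vertex 3 attached to 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
one_cluster = [0, 0, 0, 0]
triangle_plus_singleton = [0, 0, 0, 1]

print(disagreements(4, edges, one_cluster))              # 2
print(disagreements(4, edges, triangle_plus_singleton))  # 1
```

On this toy instance, keeping the triangle together and isolating vertex 3 pays only for the single cut edge (2, 3), whereas a single cluster pays for the two non-edges (0, 3) and (1, 3).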

A Logarithmic Approximation of Linearly-Ordered Colourings

Authors: Johan Håstad, Björn Martinsson, Tamio-Vesa Nakajima, and Stanislav Živný

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)

A linearly ordered (LO) k-colouring of a hypergraph assigns to each vertex a colour from the set {0,1,…,k-1} in such a way that each hyperedge has a unique maximum element. Barto, Battistelli, and Berg conjectured that it is NP-hard to find an LO k-colouring of an LO 2-colourable 3-uniform hypergraph for any constant k ≥ 2 [STACS'21], but even the case k = 3 is still open. Nakajima and Živný gave polynomial-time algorithms for finding, given an LO 2-colourable 3-uniform hypergraph, an LO colouring with O^*(√n) colours [ICALP'22] and an LO colouring with O^*(n^(1/3)) colours [ACM ToCT'23]. Very recently, Louis, Newman, and Ray gave an SDP-based algorithm with O^*(n^(1/5)) colours. We present two simple polynomial-time algorithms that find an LO colouring with O(log₂(n)) colours, which is an exponential improvement.
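The definition of an LO colouring above can be illustrated with a short checker. This is a minimal sketch of the definition only, not one of the paper's algorithms; the function name `is_lo_colouring` and the example hypergraph are assumptions made for illustration.

```python
def is_lo_colouring(hyperedges, colour):
    """Return True if every hyperedge attains its maximum colour exactly once."""
    for edge in hyperedges:
        colours = [colour[v] for v in edge]
        if colours.count(max(colours)) != 1:
            return False
    return True

# Hypothetical 3-uniform hypergraph on vertices 0..3.
hyperedges = [(0, 1, 2), (1, 2, 3)]

good = {0: 0, 1: 1, 2: 0, 3: 0}  # each edge has a single vertex of colour 1
bad = {0: 1, 1: 1, 2: 0, 3: 0}   # edge (0, 1, 2) has its maximum colour twice

print(is_lo_colouring(hyperedges, good))  # True
print(is_lo_colouring(hyperedges, bad))   # False
```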

Invited Paper
From TCS to Learning Theory (Invited Paper)

Authors: Kasper Green Larsen

Published in: LIPIcs, Volume 306, 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)

While machine learning theory and theoretical computer science are both based on a solid mathematical foundation, the two research communities have a smaller overlap than the proximity of the fields warrants. In this invited abstract, I will argue that traditional theoretical computer scientists have much to offer the learning theory community and vice versa. I will make this argument by telling a personal story of how I broadened my research focus to encompass learning theory, and how my TCS background has been extremely useful in doing so. It is my hope that this personal account may inspire more TCS researchers to tackle the many elegant and important theoretical questions that learning theory has to offer.

Track A: Algorithms, Complexity and Games
Simultaneously Approximating All 𝓁_p-Norms in Correlation Clustering

Authors: Sami Davies, Benjamin Moseley, and Heather Newman

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

This paper considers correlation clustering on unweighted complete graphs. We give a combinatorial algorithm that returns a single clustering solution that is simultaneously O(1)-approximate for all 𝓁_p-norms of the disagreement vector; in other words, a combinatorial O(1)-approximation of the all-norms objective for correlation clustering. This is the first proof that minimal sacrifice is needed in order to optimize different norms of the disagreement vector. In addition, our algorithm is the first combinatorial approximation algorithm for the 𝓁₂-norm objective, and more generally the first combinatorial algorithm for the 𝓁_p-norm objective when 1 < p < ∞. It is also faster than all previous algorithms that minimize the 𝓁_p-norm of the disagreement vector, with run-time O(n^ω), where O(n^ω) is the time for matrix multiplication on n × n matrices. When the maximum positive degree in the graph is at most Δ, this can be improved to a run-time of O(nΔ² log n).
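To make the all-norms objective concrete, the sketch below computes the disagreement vector of a clustering and a few of its 𝓁_p-norms. It assumes the usual definition, which the abstract does not spell out: entry v of the vector counts the disagreeing pairs that involve vertex v. The function names and the toy instance are illustrative, and the code only evaluates a given clustering; it is not the paper's algorithm.

```python
from itertools import combinations

def disagreement_vector(n, positive_edges, cluster_of):
    """Entry v counts the disagreeing pairs that involve vertex v (assumed definition)."""
    pos = {frozenset(e) for e in positive_edges}
    vec = [0] * n
    for u, v in combinations(range(n), 2):
        if (frozenset((u, v)) in pos) != (cluster_of[u] == cluster_of[v]):
            vec[u] += 1
            vec[v] += 1
    return vec

def lp_norm(vec, p):
    """l_p-norm of a non-negative vector; p = float('inf') gives the maximum entry."""
    if p == float("inf"):
        return max(vec)
    return sum(x ** p for x in vec) ** (1.0 / p)

# Hypothetical toy instance: positive triangle {0, 1, 2} plus positive edge (2, 3).
positive_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
vec = disagreement_vector(4, positive_edges, [0, 0, 0, 1])

print(vec)                         # [0, 0, 1, 1]
print(lp_norm(vec, 1))             # 2.0
print(lp_norm(vec, float("inf")))  # 1
```

Under this definition the 𝓁₁-norm is twice the number of disagreeing pairs, while the 𝓁_∞-norm measures the worst-off vertex, which is why a single clustering that is good for every p at once is a non-trivial guarantee.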

Coloring Tournaments with Few Colors: Algorithms and Complexity

Authors: Felix Klingelhoefer and Alantha Newman

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

A k-coloring of a tournament is a partition of its vertices into k acyclic sets. Deciding if a tournament is 2-colorable is NP-hard. A natural problem, akin to that of coloring a 3-colorable graph with few colors, is to color a 2-colorable tournament with few colors. This problem does not seem to have been addressed before, although it is a special case of coloring a 2-colorable 3-uniform hypergraph with few colors, which is a well-studied problem with super-constant lower bounds. We present an efficient decomposition lemma for tournaments and show that it can be used to design polynomial-time algorithms to color various classes of tournaments with few colors, including an algorithm to color a 2-colorable tournament with ten colors. For the classes of tournaments considered, we complement our upper bounds with strengthened lower bounds, painting a comprehensive picture of the algorithmic and complexity aspects of coloring tournaments.
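The definition of a tournament coloring can be checked mechanically. The sketch below relies on the standard fact that a tournament is acyclic exactly when it contains no directed triangle, so it suffices to test triples inside each color class. The name `is_tournament_colouring` and the adjacency-matrix encoding `beats` are assumptions for illustration; this is a verifier, not the paper's coloring algorithm.

```python
from itertools import combinations

def is_tournament_colouring(beats, colour):
    """Check that every colour class induces an acyclic subtournament.

    beats  -- beats[u][v] is True iff the arc between u and v points from u to v
    colour -- colour[v] is the colour of vertex v
    """
    classes = {}
    for v, c in enumerate(colour):
        classes.setdefault(c, []).append(v)
    for members in classes.values():
        for a, b, c in combinations(members, 3):
            forward = beats[a][b] and beats[b][c] and beats[c][a]
            backward = beats[a][c] and beats[c][b] and beats[b][a]
            if forward or backward:
                return False  # a directed triangle lies inside one colour class
    return True

# Hypothetical example: the directed 3-cycle 0 -> 1 -> 2 -> 0.
beats = [
    [False, True, False],
    [False, False, True],
    [True, False, False],
]

print(is_tournament_colouring(beats, [0, 0, 0]))  # False
print(is_tournament_colouring(beats, [0, 0, 1]))  # True
```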

Towards Improving Christofides Algorithm for Half-Integer TSP

Authors: Arash Haddadan and Alantha Newman

Published in: LIPIcs, Volume 144, 27th Annual European Symposium on Algorithms (ESA 2019)

We study the traveling salesman problem (TSP) in the case when the objective function of the subtour linear programming relaxation is minimized by a half-cycle point: x_e ∈ {0, 1/2, 1}, where the half-edges form a 2-factor and the 1-edges form a perfect matching. Such points are sufficient to resolve half-integer TSP in general and they have been conjectured to demonstrate the largest integrality gap for the subtour relaxation. For half-cycle points, the best-known approximation guarantee is 3/2 due to Christofides' famous algorithm. Proving an integrality gap of α for the subtour relaxation is equivalent to showing that αx can be written as a convex combination of tours, where x is any feasible solution for this relaxation. To beat Christofides' bound, our goal is to show that (3/2 - ε)x can be written as a convex combination of tours for some positive constant ε. Let y_e = 3/2 - ε when x_e = 1 and y_e = 3/4 when x_e = 1/2. As a first step towards this goal, our main result is to show that y can be written as a convex combination of tours. In other words, we show that we can save on 1-edges, which has several applications. Among them, it gives an alternative algorithm for the recently studied uniform cover problem. Our main new technique is a procedure to glue tours over proper 3-edge cuts that are tight with respect to x, thus reducing the problem to a base case in which such cuts do not occur.
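The relationship between the vector y and the two scalings of x mentioned above can be spelled out edge by edge. The comparison below is a worked restatement of the definitions in the abstract, under the assumption that only edges with x_e = 1 or x_e = 1/2 matter; it is not taken from the paper itself.

```latex
\[
\begin{array}{c|c|c|c}
x_e & \tfrac{3}{2}\,x_e & y_e & \left(\tfrac{3}{2}-\varepsilon\right) x_e \\ \hline
1 & \tfrac{3}{2} & \tfrac{3}{2}-\varepsilon & \tfrac{3}{2}-\varepsilon \\[2pt]
\tfrac{1}{2} & \tfrac{3}{4} & \tfrac{3}{4} & \tfrac{3}{4}-\tfrac{\varepsilon}{2}
\end{array}
\qquad\Longrightarrow\qquad
\left(\tfrac{3}{2}-\varepsilon\right) x_e \;\le\; y_e \;\le\; \tfrac{3}{2}\,x_e .
\]
```

So y matches the Christofides bound on half-edges and improves on it by ε on 1-edges, while the full goal of (3/2 - ε)x would additionally require saving ε/2 on every half-edge.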

Complex Semidefinite Programming and Max-k-Cut

Authors: Alantha Newman

Published in: OASIcs, Volume 61, 1st Symposium on Simplicity in Algorithms (SOSA 2018)

In a second seminal paper on the application of semidefinite programming to graph partitioning problems, Goemans and Williamson showed in 2004 how to formulate and round a complex semidefinite program to give what is to date still the best-known approximation guarantee of .836008 for Max-3-Cut. (This approximation ratio was also achieved independently around the same time by De Klerk et al.) Goemans and Williamson left open the problem of how to apply their techniques to Max-k-Cut for general k. They point out that it does not seem straightforward or even possible to formulate a good quality complex semidefinite program for the general Max-k-Cut problem, which presents a barrier for the further application of their techniques. We present a simple rounding algorithm for the standard semidefinite programming relaxation of Max-k-Cut and show that it is equivalent to the rounding of Goemans and Williamson in the case of Max-3-Cut. This allows us to transfer the elegant analysis of Goemans and Williamson for Max-3-Cut to Max-k-Cut. For k > 3, the resulting approximation ratios are about .01 worse than the best-known guarantees. Finally, we present a generalization of our rounding algorithm and conjecture (based on computational observations) that it matches the best-known guarantees of De Klerk et al.

The Alternating Stock Size Problem and the Gasoline Puzzle

Authors: Alantha Newman, Heiko Röglin, and Johanna Seif

Published in: LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)

Given a set S of integers whose sum is zero, consider the problem of finding a permutation of these integers such that: (i) all prefix sums of the ordering are non-negative, and (ii) the maximum value of a prefix sum is minimized. Kellerer et al. referred to this problem as the stock size problem and showed that it can be approximated to within 3/2. They also showed that an approximation ratio of 2 can be achieved via several simple algorithms. We consider a related problem, which we call the alternating stock size problem, where the numbers of positive and negative integers in the input set S are equal. The problem is the same as above, but we are additionally required to alternate the positive and negative numbers in the output ordering. This problem also has several simple 2-approximations. We show that it can be approximated to within 1.79. Then we show that this problem is closely related to an optimization version of the gasoline puzzle due to Lovász, in which we want to minimize the size of the gas tank necessary to go around the track. We present a 2-approximation for this problem, using a natural linear programming relaxation whose feasible solutions are doubly stochastic matrices. Our novel rounding algorithm is based on a transformation that yields another doubly stochastic matrix with special properties, from which we can extract a suitable permutation.
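The two conditions of the stock size problem, and the extra alternation requirement, can be evaluated for any candidate ordering. The sketch below only scores a given ordering; it is not one of the approximation algorithms mentioned above. The helper names `stock_size` and `is_alternating` and the sample input are illustrative assumptions, and the alternation check assumes the input contains no zeros.

```python
from itertools import accumulate

def stock_size(order):
    """Return the maximum prefix sum, or None if some prefix sum is negative."""
    prefix = list(accumulate(order))
    if any(s < 0 for s in prefix):
        return None  # condition (i) is violated
    return max(prefix)

def is_alternating(order):
    """True if the ordering goes positive, negative, positive, negative, ..."""
    return all((x > 0) == (i % 2 == 0) for i, x in enumerate(order))

# Hypothetical input S = {3, -2, 2, -3}, which sums to zero.
print(stock_size([3, -2, 2, -3]))      # 3
print(stock_size([3, 2, -2, -3]))      # 5
print(stock_size([-2, 3, 2, -3]))      # None
print(is_alternating([3, -2, 2, -3]))  # True
print(is_alternating([3, 2, -2, -3]))  # False
```

An alternating ordering has to start with a positive number, since otherwise the first prefix sum would already be negative.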

