Abstract 1 Introduction 2 Preliminaries 3 The Reduction 4 Definability 5 Consequences 6 Conclusion References

Undefinability of Approximation of 2-To-2 Games

Anuj Dawar ORCID Department of Computer Science and Technology, University of Cambridge, UK Bálint Molnár Department of Computer Science and Technology, University of Cambridge, UK
Abstract

Recent work by Atserias and Dawar [6] and Tucker-Foltz [26] has established undefinability results in fixed-point logic with counting (FPC) corresponding to many classical complexity results from the hardness of approximation. In this line of work, NP-hardness results are turned into unconditional FPC undefinability results. We extend this work by showing the FPC undefinability of any constant factor approximation of weighted 2-to-2 games, based on the NP-hardness results of Khot, Minzer and Safra. Our result shows that the completely satisfiable 2-to-2 games are not FPC-separable from those that are not ϵ-satisfiable, for arbitrarily small ϵ. The perfect completeness of our inseparability is an improvement on the complexity result, as the NP-hardness of such a separation is still only conjectured. This perfect completeness enables us to show the FPC undefinability of other problems whose NP-hardness is conjectured. In particular, we are able to show that no FPC formula can separate the 3-colourable graphs from those that are not t-colourable, for any constant t.

Keywords and phrases:
Hardness of Approximation, Unique Games, Descriptive Complexity, Fixed-Point Logic with Counting
Funding:
Anuj Dawar: Funded by UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee: grant number EP/X028259/1.
Copyright and License:
[Uncaptioned image] © Anuj Dawar and Bálint Molnár; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Finite Model Theory
; Theory of computation Complexity theory and logic ; Theory of computation Problems, reductions and completeness
Editors:
Jörg Endrullis and Sylvain Schmitz

1 Introduction

The study of the hardness of approximation of NP-optimization problems began in earnest with the PCP theorem in the 1990s. This theorem showed that for many problems (such as MAX 3SAT), where there are polynomial-time algorithms that can approximate the optimum solution within a constant factor, there is nonetheless a constant c such that no efficient algorithm can approximate the optimum value within a factor c unless P=NP. Indeed, Håstad [17] established tight bounds for MAX 3SAT: there is a trivial algorithm that achieves an 87 approximation, but none that achieves an 87ϵ approximation for any ϵ, unless P=NP. Such tight bounds are known for many NP-optimization problems, while for others there is a gap in the approximation ratio between the best known algorithm and the strongest known lower bound. An important problem in the latter category is the minimum vertex cover problem, where the best known polynomial-time algorithms yield an approximation ratio of 2, while the strongest proved lower bound is 2.

Perhaps the most important open question in the field of the hardness of approximation is the unique games conjecture of Khot. This states that for any ϵ,δ>0, there is a set of labels Σ such that it is NP-hard to separate the (1ϵ)-satisfiable instances of Σ-unique games (the precise definitions follow below) from those that are not even δ-satisfiable. The strongest result obtained so far in this direction shows that there is a Σ for which it is NP-hard to separate the (12ϵ)-satisfiable instances from the δ-unsatisfiable ones. This result is a consequence of the 2-to-2 theorem due to Khot, Minzer and Safra [20, 11, 21].

The hardness of approximation has also been studied in recent years in the context of logical definability. In particular, Atserias and Dawar [6] showed that many of the NP-hardness results can be recast as unconditional undefinability results in fixed-point logic with counting (FPC). For example, there is an FPC formula which yields an 87 approximation of the value of a  MAX 3SAT instance and there is provably no formula that yields an 87ϵ approximation for any ϵ>0. Recall that FPC is a logic whose expressive power is contained within the complexity class P and which has been characterized as a natural symmetric fragment of that class [1]. Tucker-Foltz [26] established the first definability gap in FPC of unique games, by showing that no formula can distinguish the 12-satisfiable instances from those that are not 13+δ-satisfiable and also showed that no constant factor approximation is FPC definable.

In the present paper, we consider the FPC definability of 2-to-2 games. The hardness of approximating the optimum value of such games was established through a series of results by Khot, Minzer and Safra [20, 25, 11]. At the core of their proof is a reduction from the problem MAX 3XOR of maximizing the number of satisfied clauses in a 3XOR instance. We show that the reductions used can be formulated, with some modification, as first-order definable reductions. As a consequence, we obtain the result that the completely satisfiable instances of 2-to-2 games cannot be separated by an FPC formula from those that are no more than δ-satisfiable. This (1,δ) separation is stronger (in terms of approximation ratios) than the known (1ϵ,δ) NP-hardness result due to the fact that the FPC undefinability of approximating MAX 3XOR was proved with perfect completeness in [6]. A corollary of our result is the FPC undefinability of a (12,δ) separation for (a weighted version of) unique games. This improves, again in terms of the approximation ratios, the gap obtained by Tucker-Foltz, though it should be noted that the latter gap is for unweighted games.

A more striking consequence of our result is that no FPC sentence can separate the class of 3-colourable graphs from those that are not even t-colourable for any constant t3. The NP-hardness of such a separation has only been proved for t at most 5, though it is conjectured for larger values. Indeed, this is a central open problem in the rapidly growing study of promise constraint satisfaction problems (PCSP, see [7]).

The result on graph colouring should be compared with a recent result of Atserias and Dalmau [5] which shows that the promise graph colouring problem cannot be solved by a local consistency algorithm. In particular, this implies that for any constant t the 3-colourable graphs cannot be separated from those that are not t-colourable by a class (whose complement is) definable in Datalog. Since Datalog programs can be translated into sentences of FPC, our Theorem 5.3 can be seen as strengthening their result. It is worth examining this relationship more closely. It is known, from results of [4] and [8], that every class of bounded counting width (and therefore, in particular, any FPC definable class) that is the complement of a fixed-template constraint satisfaction problem (CSP) is already definable in Datalog. Hence, we can conclude from the result of Atserias and Dalmau that no FPC definable CSP separates the 3-colourable graphs from the non-t-colourable ones. However, since it is conceivable that a separating class for these two CSPs is FPC definable but not itself a CSP, our result is still a strengthening. But we can say still more. It can be deduced from the proof in [5] that the 3-colourable graphs and the non-t-colourable ones are not separable by any class definable in an existential positive infinitary logic (+,ω). Moreover, it is a consequence of a very recent proof due to Rossman (published in the present volume [24]) that every class of bounded counting width that is preserved under homomorphisms is definable in +,ω. Thus, we can conclude from these results that no homomorphism-closed class of bounded counting width separates the 3-colourable graphs from the non-t-colourbale ones. Since Theorem 5.3 easily applies to all classes of bounded counting width and not just the FPC-definable ones; and it is conceivable that a separating class is not necessarily closed under homomorphisms, our result subsumes even this strengthened version of that of Atserias and Dalmau.

In Section 2 we introduce the problems, notation and provide background definitions. An outline of the steps involved in the reduction of Khot, Minzer and Safra is given in Section 3. The proof that the reductions involved are definable as first-order interpretations is given in Section 4 and certain consequences derived in Section 5.

2 Preliminaries

2.1 Hardness of Approximation in Optimization

We are interested in NP-hard optimization problems. A standard example is the problem MAX 3SAT, where the aim is to find, given a formula in 3CNF, an assignment of values to its variables that maximizes the number of clauses satisfied. Formally, consider a function problem M, which associates with every possible input instance I a value M(I). In our example, MAX 3SAT maps a formula ϕ to the maximum number m of clauses of ϕ that can be simultaneously satisfied. While, in practice, we might be interested in finding an assignment that achieves this maximum, for the purpose of proving hardness, it suffices to show that it is hard to compute the number m. When finding M(I) is hard, we may wish to approximate it, and we say that an algorithm computes a C-approximation (for a real number C>1) of M if it produces a number M(I) with the guarantee that M(I)M(I)CM(I).

For the sake of uniformity, we consider function problems that take values in [0,1]. Thus, MAX 3SAT assigns to a 3CNF formula ϕ the maximum fraction of the clauses of ϕ that can be simultaneously satisfied. For MAX 3SAT, it is known that, unless P=NP, there is no polynomial-time algorithm that gives a C-approximation for any C<8/7. Such hardness of approximation results are usually proved by means of a hardness of separation, which allows us to frame this in terms of the hardness of decision problems.

Formally, let A and B be two sets (i.e. decision problems) with AB=. We say that A and B are NP-hard to separate, if every set C with ACB¯ is NP-hard, where B¯ denotes the complement of B. For a function problem M, and a constant c[0,1], denote by c-M the set {IM(I)c}. Then, for constants c and s with 0s<c1, we say that the gap problem GapM(c,s) is NP-hard if it is NP-hard to separate the sets c-M and s-M¯. This implies, in particular, that unless P=NP, there is no polynomial-time algorithm giving a cs-approximation of M. The value c in GapM(c,s) is called the completeness parameter and s the soundness parameter.

The first hardness of approximation results come from the PCP theorem [2, 15, 3]: one of its direct consequences is the NP-hardness of Gap3SAT(1,η) for some constant η strictly less than 1. Håstad [17] obtained an optimum inapproximability result for MAX 3SAT. Namely, he showed that Gap3SAT(1,78+ϵ) is NP-hard for arbitrarily small ϵ. This is optimal since there is an easy 87-approximation algorithm. Similarly, he also showed that Gap3XOR(1ϵ,12+ϵ) is NP-hard for arbitrarily small ϵ. Again, this is optimal. Here, 3XOR is the problem where we are given a Boolean formula as a conjunction of clauses, each of which is the XOR of three literals and we aim to maximize the number of satisfied clauses. Note that the completeness parameter must be strictly less than 1, since the problem of determining whether such a formula is satisfiable or not is polynomial-time decidable. Thus 1-3XOR can be separated in polynomial time from (1ϵ)-3XOR¯ for any ϵ.

Reductions

A common way of deriving further hardness of approximation results is via gap-reductions: given function problems A and B, a polynomial-time computable function f taking instances of A to instances of B is a reduction from GapA(c,s) to GapB(c,s) if for all instances I of A

  • Completeness: if A(I)c, then B(f(I))c.

  • Soundness: if A(I)s, then B(f(I))s.

It is easily seen that, if such a reduction exists and GapA(c,s) is NP-hard, then so is GapB(c,s).

2.2 Label Cover Games

Versions of label cover problems are ubiquitous in the study of hardness of approximation (see [13]). A particularly important case are the unique games of Khot [18], defined below. To arrive at the definition, we first introduce some terminology. For positive integers d and e, a relation RU×V is said to be d-to-e if it relates each element of U to exactly d elements of V and each element of V to exactly e elements of U.

Definition 2.1 (d-to-d games).

A d-to-d game is a tuple (G,Σ,Φ), where G=(V,E) is a multi-graph111That is to say, there may be multiple edges between the same pair of vertices. In the sequel we refer simply to graphs to mean multi-graphs., Σ is a finite alphabet and Φ:E𝒫(Σ2) assigns to each edge eE a d-to-d binary relation.

A colouring χ:VΣ satisfies an edge (u,v) if (χ(u),χ(v))Φ(u,v).

The value of the game (G,Σ,Φ) is the maximum over all colourings of the proportion of edges in E that are satisfied.

In this paper, we are particularly interested in 2-to-2 games and 1-to-1 games, the latter also being known as Unique Games. We write UGq for the function problem of determining the value of an instance of unique games with an alphabet of size q. We can then state Khot’s unique games conjecture.

Conjecture 2.2 (Unique Games Conjecture (UGC) [18]).

For any δ,ϵ>0, there exists a positive integer q so that GapUGq(1ϵ,δ) is NP-Hard.

The significance of the conjecture is that it has been shown that many optimal hardness of approximation results follow from it, including MaxCut and VertexCover [19, 23, 18].

The best known hardness result for unique games, towards proving Conjecture 2.2 is that GapUGq(12ϵ,δ) is NP-Hard for arbitrarily small δ and ϵ. This is obtained as a consequence of the hardness of 2-to-2 games established by Khot, Minzer and Safra, which we return to in Section 3.

Theorem 2.3 (Khot-Minzer-Safra).

For any δ,ϵ>0, there exists a positive integer q so that Gap2to2q(1ϵ,δ) is NP-Hard.

It is conjectured that Theorem 2.3 can be strengthened to make the completeness parameter 1, but this remains unproved.

In this paper, we are particularly concerned with weighted 2-to-2 and 1-to-1 games, attaching a weight to each constraint.

Definition 2.4 (Weighted d-to-d games).

A weighted d-to-d game is a tuple (G,Σ,Φ,w), where (G,Σ,Φ) is a d-to-d game and w:E(G)+ is a function assigning a positive real weight to each constraint.

Let tot=eE(G)w(e) be the total weight. The value of the game (G,Σ,Φ,w) is the maximum over all colourings χ:VΣ of the fraction eSχw(e)/tot, where Sχ denotes the set of edges e=(u,v) for which (χ(u),χ(v))Φ(e).

We write 𝒲𝒢2:2;q to denote the class of weighted 2-to-2 games with q labels and Weight2to2q to denote the function taking such a game to its value. Similarly, we write UGq and WeightUGq for the functions giving the values of unique games and weighted unique games with q labels respectively.

2.3 Undefinability of Approximation

We assume the reader is familiar with first-order logic and the basics of finite model theory. A good introduction is to be found in [14]. Our structures are finite structures in a finite relational vocabulary. Our main inexpressibility results are stated for fixed-point logic with counting (FPC). We do not need a formal definition here but note that every property definable in FPC is decidable in polynomial-time and indeed FPC can be understood as a complexity class defined by symmetric polynomial-time computation. For full definitions, refer to [10] and references therein.

The two properties of FPC that we do need are that (1) every class of structures definable in FPC has bounded counting width; and (2) that the class of properties definable in FPC is closed under first-order interpretations. We elaborate on these below.

For a function problem M, and real numbers c and s with 0s<c1, we say that GapM(c,s) is undefinable in FPC if there is no FPC definable class of structures that separates the sets c-M and s-M¯. Atserias and Dawar [6] initiated a study of the FPC undefinability of approximations, showing that many of the NP-hardness results for gap problems can be reproduced as unconditional undefinability results in FPC. In particular Gap3SAT(1,78+ϵ) is not FPC definable. More significantly, they established the following

Theorem 2.5 (Atserias-Dawar [6]).

Gap3XOR(1,12+ϵ) is not FPC definable.

Note the completeness parameter of 1 in the statement, which contrasts with 1ϵ in the case of Theorem 2.3. Perfect completeness cannot be established in the case of NP-hardness because satisfiability of XOR formulas is decidable in polynomial-time. However, it is not definable in FPC and this allows the stronger result in the context of undefinability. This is crucial to the application we make of Theorem 2.5 in Section 5.3

Following up on this work, Tucker-Foltz [26] studied the undefinability of gaps in unique games. In particular, he established the inapproximability of unique games in FPC by any constant factor and the FPC-undefinability of GapUGq(12,13+δ) for a suitable value of q.

Counting Width

For relational structures 𝔸 and 𝔹 in the same vocabulary, and a positive integer k, 𝔸k𝔹 denotes that the two structures cannot be distinguished by any sentence of first-order logic with counting using no more than k distinct variables. For a class 𝒞 of structures, the counting width of 𝒞 is the function ν: such that for any n, ν(n) is the least k such that 𝒞, restricted to structures with at most n elements is a union of k-equivalence classes. Any class that is definable by a sentence of FPC has counting width bounded by a constant. Almost all results showing that a class is not definable in FPC proceed by showing that it, in fact, does not have bounded counting width.

Interpretations

A first-order interpretation of a relational vocabulary τ in a vocabulary σ is a sequence of σ-formulas in first-order logic, which can be seen as mapping σ-structures to τ-structures. There are many variations of the precise definition in the literature. We use the version defined in [6] and refer the reader to that for the formal definition. Given a function problem A whose instances are σ-structures and a function problem B whose instances are τ-structures, an interpretation Θ of τ in σ is a GapA(c,s) to GapB(c,s) reduction if A(𝔸)c implies B(Θ(𝔸))c and A(𝔸)s, then B(Θ(𝔸))s. Definability in FPC and the property of having bounded counting width are both closed under first-order reductions. That is to say, if GapB(c,s) is FPC-definable and there is a first-order reduction of GapA(c,s) to GapB(c,s), then GapA(c,s) is FPC-definable as well.

3 The Reduction

The proof of Theorem 2.3 was completed in 2018 and remains to this day the most significant advance towards establishing the Unique Games Conjecture since the latter was formulated by Khot in [18]. The proof proceeds by a reduction from Gap3XOR(1ϵ,12+δ) and was presented in a series of papers [20, 25, 11]. The main difficulty lies in proving the combinatorial conditions that the soundness analysis relies on. The full reduction and proof of correctness can be found in [22, Chapter 3].

Our aim in the present paper is to show that the reduction constructed has two crucial properties. First, it preserves perfect completeness and thus can be seen as a reduction from Gap3XOR(1,12+δ). Secondly, with small modifications which do not affect the soundness or completeness analysis, it can be described as a first-order interpretation. Together these establish the main theorem.

Theorem 3.1.

For every δ>0, there exists q+ for which GapWeight2to2q(1,δ) is not FPC definable.

In proving this, we do not need to reprise the difficult soundness analysis carried out by Khot et al. Rather we study the actual construction involved in the reduction. For this purpose, we describe the reduction in some detail in this section, and take up the two issues of perfect completeness and first-order definability in the next.

3.1 Regular 3XOR

An instance of 3XOR can be seen as a system of linear equations over the field 𝔽2 with exactly three variables appearing in each equation. We say that such an instance is d-regular if every variable appears in exactly d equations and no two equations share more than one variable. It is known that the NP-hardness of Gap3XOR(1ϵ,12+δ) holds even when restricted to d-regular instances for some fixed value of d (indeed, taking d=5 suffices, see [22, Theorem 3.3.1]). In Section 4.3 we show that this is also true of the undefinability in FPC of Gap3XOR(1,12+δ) From now on, we restrict attention to d-regular instances for a suitable fixed value of d, and we call the resulting function problem GapRegular3XOR.

3.2 Reducing to Transitive Games

In the first step of the reduction, we reduce regular 3XOR instances to label cover games with a mixture of 2-to-2 and 1-to-1 constraints, with an additional transitivity requirement. We formally define these below.

Definition 3.2 (Transitive 2-to-2 games).

A transitive 2-to-2 game is a tuple (G,Σ,Φ) where G=(V,E) is a graph, Σ is a finite alphabet and Φ:E𝒫(Σ2) assigns to each edge e either a 2-to-2 or a 1-to-1 relation and whenever Φ(u,v) is 1-to-1, then for any edge (v,w), Φ(u,w) is the composition of Φ(u,v) and Φ(v,w).

Note that the condition on composition only applies when Φ(u,v) is 1-to-1, but Φ(v,w) may be 1-to-1 or 2-to-2, and this determines whether Φ(u,w) is 1-to-1 or 2-to-2.

Now, fix an instance I of GapRegular3XOR, with X being the set of variables that appear in I and E the set of equations. Thus, each equation eE is of the form x+y+z=b for some b𝔽2. We refer to x,y and z as the variables occurring in e and b as the right-hand side of e.

Fix a positive integer k and let 𝒰Ek be the set of k-tuples U of equations, satisfying the following properties:

  • no variable occurs in more than one equation of U; and

  • if variables x and y appear in distinct equations of U, there is no equation in E (even outside U) in which both x and y occur.

For U=(e1,,ek)𝒰, let XU denote the set of variables occuring in equations in U and for i{1,,k} let vi𝔽2X denote the vector which has 1s in the three coordinates corresponding to the variables occurring in ei and 0s everywhere else. We define the space of side-conditions corresponding to U to be HU=Span(v1,,vk). We say that a linear function f:𝔽2X𝔽2 satisfies the equations in U if f(vi)=bi for all i, where bi is the right-hand side of ei.

Now, fix a parameter l with l|X|, and we define U to be the collection of l-dimensional subspaces of 𝔽2X which are linearly independent of HU. That is

U={L𝔽2Xdim(L)=l,LHU={𝟎}}.

The trivial intersection ensures that for any subspace LU, any linear function f:L𝔽2 can be uniquely extended to one on L+HU222Here the sum is to be understood as vector space sum, i.e. L+HU is the space spanned by the union of L and HU. so that f(vi)=bi for all i. Therefore, the number of linear functions on L+HU satisfying the equations in U is exactly 2l.

We can now define the reduction Θ that takes the instance I to a 2-to-2 transitive game Θ(I). The reduction depends on the choice of parameters k and l. We omit the details on how to select the right parameters.

Vertices.

The vertices of Θ(I) are pairs (U,L), where U𝒰 and LU.

Alphabet.

The alphabet is a set of labels of size 2l. As noted above, for each vertex (U,L), there are exactly 2l linear functions on L+HU satisfying the equations in U. We fix, for each (U,L), a bijection between the alphabet and this set of linear functions. Henceforth, we simply treat the functions themselves as labels.

Constraints.

Given a pair of vertices u=(U,L) and v=(U,L), the constraint Φ(u,v) is a 1-to-1 relation if

dim(L+HU+HU)=dim(L+HU+HU)=dim(L+L+HU+HU)

and a 2-to-2 relation if

dim(L+HU+HU)=dim(L+HU+HU)=dim(L+L+HU+HU)1.

To define the relation, note that any function f:L+HU𝔽2 has a unique extension to L+HU+HU (by the conditions in the definition of 𝒰). Then, we relate f to f:L+HU𝔽2 if, and only if, f and f agree on the shared space (L+HU+HU)(L+HU+HU).

It is the case for any pair, that dim(L+HU+HU)=dim(L+HU+HU) [20, Lemma 4.3]. Let us call this dimension D. By [20, Lemma 4.4], any linear function f:L+HU𝔽2 satisfying the equations of U has a unique extension to (L+HU+HU) that also satisfies the equations of U. Then, it is easily seen that if dim(L+L+HU+HU)=D, then f has exactly one label of (U,L) that it is consistent with, and if dim(L+L+HU+HU)=D+1, there are exactly two such functions, thanks to the “free dimension”. Hence, the constraints are 1-to-1 or 2-to-2 as required. The transitivity property of these constraints is established in [20, Appendix A].

3.3 The final (weighted) 2-to-2 game

The final step of the reduction is to transform the transitive game constructed in Section 3.2 into a weighted 2-to-2 game, getting rid of the 1-to-1 constraints. This weighted game is defined as follows.

Recall the transitive 2-to-2 game Θ(I) constructed in Section 3.2. The transitivity condition guarantees that the vertices of Θ(I) can be partitioned into cliques C1,,Cm so that edges in each clique are associated with 1-to-1 constraints. Moreover, these constraints are consistent in the sense that any colouring of a vertex V in a clique C can be extended in a unique way to a colouring of all vertices in C so that all edge constraints in C are satisfied. Also, by the transitivity condition, for distinct cliques Ci and Cj, either all pairs (u,v)Ci×Cj are connected by 2-to-2 constraints or none are. Furthermore, these 2-to-2 constraints are consistent in the sense that given a clique-consistent colouring for Ci and Cj, either all or none of these 2-to-2 constraints are satisfied.

The final (weighted) 2-to-2 instance I2:2w we construct from Θ(I) has as vertices the vertices of Θ(I) and as edges all edges (u,v) of Θ(I) where u and v are in distinct cliques. For each such edge, with uCi and vCj, we associate the constraint Φ(u,v) which is as in Θ(I). The weight w(u,v) is the probability assigned to (u,v) by the following sampling process:

  • Choose U𝒰, uniformly at random.

  • Choose a random pair L,L so that (U,L) and (U,L) are connected by a 2-to-2 edge. Let Ci be the clique containing (U,L) and Cj be the clique containing (U,L)

  • Choose uniformly at random a pair of vertices (u,v)Ci×Cj.

3.4 Irregular soundness case

For the result in Section 5.3, we need the FPC-undefinability of a different gap problem based on 2-to-2 games. Specifically, we define the value of a game to be, not the fraction of constraints that can be satisfied, but the fraction of the vertices formed by the largest set X so that all constraints between nodes in X are satisfied. Moreover, we relax the notion of colouring to allow vertices to be coloured by multiple colours.

Definition 3.3.

For a 2-to-2 game ((V,E),Σ,Φ), a colouring c:V(Σj) satisfies a set XV if (u,v)EX2.ac(u),bc(v).(a,b)Φ(u,v).

That is to say, a j-colouring, i.e., one that assigns a set of j colours to each vertex satisfies a set X if each constraint between vertices in X is satisfied by some choice among the colours assigned to the vertices.

Definition 3.4 (Irregular Values).

For constants j and q define the function Irreg2to2j,q to take a 2-to-2 game ((V,E),Σ,Φ) to the fraction |X|/|V| where X is the largest subset of V that is satisfied by some j-colouring c:V(Σj) .

We can now state the theorem below, which is a consequence of Theorem 3.1.

Theorem 3.5 (Definable 2-to-2 Games Theorem with irregular soundness).

For every δ with 0<δ<1 and j+, there exists q+ so that GapIrreg2to2j,q(1,δ) is not FPC definable.

It is not hard to see that this is a consequence of Theorem 3.1, and the corresponding claim for NP-hardness appears in e.g. [20]. For completeness, we give a short proof.

Lemma 3.6.

For a weighted 2-to-2 game I=((V,E),Σ,Φ,w) with
q=|Σ|, if Irreg2to2j,q((V,E),Σ,Φ)=δ, then Weight2to2(I)=Ω(δ2j2).

Proof.

Let c be a j-colouring of V that satisfies a set X with |X|/|V|δ. By [22, Remark 3.4.9], there is a Ω(δ2) (weighted) fraction of the edges E which are satisfied by c, in the sense that for each such edge (u,v) there are colours a and b in c(u) and c(v) respectively such that (a,b)Φ(u,v). We now construct a standard colouring by a random process. That is, for each vertex vV, independently choose a colour χ(v) from c(v) uniformly at random. For an edge (u,v), let Ξ(u,v) be the indicator variable indicating whether (χ(u),χ(v))Φ(u,v) and let Ξ be the overall value of the colouring χ. If (u,v)X2, the probability that χ satisfies the constraint Φ(u,v) is at least 1j2, as by definition, among the j2 pairs in c(u)×c(v), at least one satisfies the constraint. Then

𝔼[Ξ] =𝔼[(u,v)Ew(u,v)Ξ(u,v)]=(u,v)Ew(u,v)𝔼[Ξ(u,v)]
(u,v)EX2w(u,v)𝔼[Ξ(u,v)](u,v)EX2w(u,v)1j2Ω(δ2)1j2

Thus, there is a colouring that satisfies at least Ω(δ2j2) (weighted) fraction of the constraints.

From Lemma 3.6, we can conclude Theorem 3.5. For any fixed δ and j, the proof of Theorem 3.1 gives us a q and an FO reduction that takes satisfiable 3XOR instances to satisfiable 2-to-2 games and instances that are at most η-satisfiable to 2-to-2 games with value at most Ω(δ2j2). Then, by Lemma 3.6, this same reduction also maps at most η-satisfiable 3XOR instances to 2-to-2 games I for which Irreg2to2j,q(I)<δ.

3.5 𝟐𝟐 games

The definition of 2-to-2 games, Definition 2.4 only requires each constraint Φ(u,.v) to be a 2-to-2 relation, meaning that each element on the left is related to exactly two elements on the right and vice versa. However, the reductions yield games of a more restricted kind and this will be useful in Section 5.3. Say that a binary relation RA×B is 22 if it is the disjoint union of bipartite graphs K2,2. That is to say A and B can be each partitioned into sets A=iAi and B=iBi so that each Ai and Bi has exactly two elements and R=iAi×Bi.

We claim that the reductions in the proof of Thereom 3.1 yield games in which all constraint relations are 22. Specifically, given linear functions ff:L+HU𝔽2 so that their unique extension to the domain L+HU+HU only differ in their “free dimension”, i.e. they agree in values on (L+HU+HU)(L+HU+HU), f and f are related to the same two linear functions on L+HU (uniquely extensible to L+HU+HU) in Φ((U,L),(U,L)). Thus, the constraint relations constructed are 22.

4 Definability

The aim in this section is to show that the reduction outlined in Section 3 can, with minor modifications, be implemented as a first-order interpretation, preserving perfect completeness. Thus, it gives a first-order definable reduction from Gap3XOR(1,12+δ) to GapWeight2to2q(1,δ) for a suitable choice of parameters. This establishes Theorem 3.1.

4.1 Perfect completeness

To show that the reduction from Section 3 preserves perfect completeness, it suffices to verify that instances of 3XOR that are satisfiable (i.e. have value 1) are mapped by the reduction to instances of 𝒲𝒢2:2 which also have value 1.

Assume I is an 3XOR instance on a set of variables X that is satisfiable, and let s:X𝔽2 be an assignment of values to the variables that satisfies it. Let I2:2w denote the weighted 2-to-2 game that I maps to under the reduction. Then, for each vertex (U,L) of I2:2w the restriction of s to L+HU is a valid label since all equations are satisfied, and it is easily seen that this labelling satisfies all constraints.

4.2 Vocabularies

An instance of 3XOR is defined as a structure over the vocabulary τ3XOR=Eq0,Eq1 with two ternary relations. We think of the universe of a τ3XOR-structure 𝔸 as a set of variables. For b{0,1}, a triple (x,y,z)Eqb is understood as representing the equation x+y+z=b, where addition is modulo 2.

For each positive integer q, we define a vocabulary τ(T) 2-to-2q such that structures in this vocabulary represent instances of transitive 2-to-2 games over a label alphabet of size q. Let Sq denote the collection of permutations of [q]={1,,q}. Note that there is a natural bijective correspondence between Sq and the 1-to-1 relations on [q]. Now, let Sq#2 denote the set of pairs of permutations (π1,π2)Sq×Sq such that for all i[q], π1(i)π2(i). Then, it is easily seen that each 2-to-2 relation on [q] can be seen as the union of such a pair of permutations. Our vocabulary τ(T) 2-to-2q contains a binary relation for each element of Sq and one for each element of Sq#2:

τ(T) 2-to-2q=(Cπ)πSq,(Cπ1,π2)(π1,π2)Sq#2.

We write 𝒞1 for the collection of relation symbols (Cπ)πSq and 𝒞2 for the collection of relation symbols (Cπ1,π2)(π1,π2)Sq#2. Note that the vocabulary itself does not enforce the transitivity property, only a subset of the structures with this vocabulary are transitive 2-to-2 games.

For weighted 2-to-2 games, we construct a vocabulary that allows us to code instances with positive integer weights. This is more limiting than allowing rational weights, but as we show below in Section 4.6, it suffices for our purpose. Specifically,

τ(w) 2-to-2q=C,(Φπ1,π2)(π1,π2)Sq#2,

where C is unary, and the relations Φπ1,π2 are all ternary. A τ(w) 2-to-2q-structure 𝔸 is to be understood as an instance I2:2w=(G,Σ,Φ) of 𝒢2:2w with integer weights. The universe of 𝔸 is the disjoint union of the set V of vertices of I2:2w, and the set C of constraints, with the unary relation C picking out this set. For each (π1,π2)Sq#2, the relation Φπ1,π2V2×C contains those triples (u,v,c) where Φ(u,v) is a pair (R,w) with R being the 2-to-2 relation associated with the pair (π1,π2). The integer weight w is given by the number of elements c for which (u,v,c) is in the relation. We assume our structures satisfy the (first-order) axiom that ensures that there is at most one relation Φπ1,π2 in which triples (u,v,c) appear, for each choice of u and v.

4.3 Undefinability of Regular 3XOR

The reduction in Section 3 starts from regular games. In contrast, the undefinability result in Theorem 2.5 is stated for general 3XOR. Thus, we begin by arguing that the pfoof of Theorem 2.5 can actually be used to show the undefinability of GapRegular3XOR(1,η) for some η strictly smaller than 1.

We first note that the Gap3XOR(1,12+δ) is FPC undefinable even for “half-regular” 3XOR instances. That is, 3XOR instances where each variable appears in the same number of equations. To see this, note that Lemma 5 in [6] uses a bipartite unique-neighbour expander graph with r|X| nodes on the left and |X| nodes on the right. Thus the graph is 3-left-regular and is an (α|X|,β) expander. Such graph exists for every X by [27, Chapter 4]. By a variation shown for Theorem 4.4 in [27], we laim the existence of such a graph with the extra condition that the graph is right-regular. Using this extra assumption on the graph in Lemma 5 in [6] the proof establishes that Gap3XOR(1,12+δ) is FPC undefinable even for “half-regular” 3XOR instances.

A half-regular instance can be converted into a regular one by ensuring that any two equations share at most one variable.

First, by the unique-neighbour expander property of the graph in Lemma 5 in [6], we can assume that the half-regular 3XOR instance has no repeated equations or repeated variables within an equation. This half-regular instance (X,Eq) can be converted into a regular one (call it (X,Eq)) by replacing every equation e:x+y+z=b with three equations (as done in [22]): x+ye+ze=b,xe+y+ze=b,xe+ye+z=b, where xe,ye, and ze are new variables only used for these equations.

As shown in [22], if X is fully satisfiable then so is X and if X is no more than 12+δ-satisfiable, then X is at most η-satisfiable for some η<1 (for example, taking η=0.9 suffices).

The reduction can be easily defined by a first-order interpretation.

4.4 Shuffling variables

One issue that arises with the games constructed in the reduction from Section 3 is that we have a fixed alphabet of size q=2l and we associate with each vertex (U,L) an arbitrary bijection between this and the 2l distinct linear functions on the space L+HU that satisfy the equations in U. The consistency across different vertices is then enforced by the constraint relations. In order to turn this into a first-order reduction, we want to choose these bijections in a symmetry-preserving fashion.

Let I be our starting instance of 3XOR and I2:2T=Θ(I) the transitive 2-to-2 game obtained from the first step of the reduction of Section 3, and let X be the set of variables of I. Let ρSymX be a permutation of X. This permutation has a natural action on other objects constructed from X. In particular, for an equation e of the form x+y+z=b, we write ρ(e) for the equation ρ(x)+ρ(y)+ρ(z)=b. When U is a tuple of such equations, we write ρ(U) for the tuple obtained by applying ρ componentwise to each element of the tuple. Similarly, for other objects obtained by set and tuple constructions from X, we apply the permutation ρ to denote the natural induced action without defining it formally.

Furthermore, we also use ρ to denote the invertible linear map on 𝔽2X obtained by applying ρ to the basis (ex)xX, and extending linearly to all of 𝔽2X. Thus, in particular, for a subspace L𝔽2X, ρ(L) denotes the image of this space under this map.

The following is now straightforward.

Lemma 4.1 (Shuffling Variables 1).

For any permutation ρSymX, if U and ρ(U) are both in 𝒰, and (U,L)V(I2:2T), then ρ(U,L)V(I2:2T).

Proof.

Since ρ maps the basis of HU formed by the left-hand sides of the equations in U to the corresponding basis of Hρ(U), we have ρ(HU)=Hρ(U). By invertibility of ρ, a space L is then linearly independent of HU if, and only if, ρ(L) is linearly independent of Hρ(U).

Now, we want to choose the bijections between our set of 2l labels and the linear functions associated with a vertex (U,L) in such a way that whenever (U,L) and ρ(U,L) are both vertices in I2:2T, then they commute with ρ. For this, fix a canonical space 𝔽23k of dimension 3k. For each U𝒰, we write XUX for the set of variables that appear in U. Since U is a sequence of k equations with pairwise disjoint sets of variables, we can fix a bijection between XU and [3k] which induces an isomorphism μU:𝔽2U𝔽23k. These isomorphisms are easily seen to be ρ-invariant (for all ρ), that is,

S𝔽2XU.μρ(XU)(ρ(S))=μU(S).

Under this map, there is a fixed subspace H𝔽23k of dimension k such that μU(HU)=H for all U. Similarly, there is a fixed collection of l-dimensional spaces such that μU(U)=. Thus, we can identify the vertices of I2:2T uniquely with pairs (U,L) where U𝒰 and L. This is to be understood as the representation of the vertex (U,μU1(L)).

Similarly, for linear functions f over L𝒰, we can define

(ρ(f))(x)=f(ρ1(x)):ρ(L)𝔽2and
(μU(f))(x)=f(μU1(x)):μU(L)𝔽2.

Then, a linear function f on L+HU satisfies the equations in U if, and only if, μU(f) satisfies the equations in μU(U). Hence, we can interpret in a canonical way the label of a node (U,μU1(L)) as a linear function with domain H+L satisfying the equations in μU(U).

We now show that this can be consistently applied to the constraints of the game.

Lemma 4.2 (Shuffling Variables 2).

Suppose (U,L),(U,L)E(I2:2T) and ρ(U),ρ(U) are both in 𝒰. Then

  • (ρ(U,L),ρ(U,L))E(I2:2T)

  • Φ((U,L),(U,L))=Φ(ρ(U,L),ρ(U,L))

Proof.

By Lemma 4.1, ρ(U,L),ρ(U,L)V(I2:2T). Also

dim(ρ(L)+Hρ(U)+Hρ(U))= dim(ρ(L+HU+HU))=dim(L+HU+HU)

The equalities hold because the mapping ρ is an automorphism of 𝔽2X. The analogous dimensionality property holds with the mapping of subspaces (L+HU+HU) and (L+L+HU+HU). Therefore, the dimensionality constraint for drawing edges is invariant under the action of ρ. This proves the first bullet point.

Then if (f,f)Φ((U,L),(U,L)), it means μU1(f) and μU1(f) are consistent on the intersection of their domains. Then μρ(U)1(f)=ρ(μU1(f)) and μρ(U)1(f)=ρ(μU1(f)) are consistent too, meaning (f,f)Φ(ρ(U,L),ρ(U,L)). Hence Φ((U,L),(U,L))Φ(ρ(U,L),ρ(U,L)). Applying the same argument to ρ1 yields the other direction.

4.5 The reduction to the transitive game

We now describe how the reduction Θ from Section 3.2 can be given as a first-order interpretation. Fix positive integers k and l, which are the parameters to the reduction. Given a (regular) 3XOR instance 𝔸=(X,Eq0𝔸,Eq1𝔸), our interpretation maps it to the following (transitive) 2-to-2 game (with alphabet size 2l) 𝔹.

Universe.

The universe of 𝔹 consists of tuples of elements of X of length 4k+23k. These tuples can be seen as broken up into three parts.

  • The first 3k elements (u1,1,,uk,3) are the 3k variables in some U𝒰. To define this, we need to say that they are, in order, the collection of variables of a k-tuple of equations, that no variable appears more than once, and that when two variables appear in distinct equations, they do not occur together in some other equation in 𝔸.

  • The next k elements r1,,rk define the right-hand sides of the k equations in U. To encode these as binary values, we use ri=u1,1 to encode the value 0 and ri=u1,2 to encode the value 1. Since u1,1 and u1,2 are distinct, this works and can be specified by a first-order formula.

  • The next 23k elements also encode bits, using the values of u1,1 and u1,2 as 0 and 1. Think of these as specifying a subset of 𝔽23k. We can write a first-order formula that says that this subset is a subspace L of dimension l (since l and k are fixed, the formula is simply a big disjunction over all subspaces). Finally, we can also write a first-order formula that checks that L is in .

For completeness, here is the first-order sentence checking all these conditions.

πU =i=1k[Eq0(ui,1,ui,2,ui,3)ri=0][Eq1(ui,1,ui,2,ui,3)ri=1]
(a,i)(b,j)ua,iub,j
ab,i,j¬(x(α,β,γ)Perm(ua,i,ub,j,x)Eq0(α,β,γ)Eq1(α,β,γ))
L(i=023k1bi=Li)

Where Perm(x,y,z) describes the set of permutations of x,y,z.

We can thus, as required, identify the elements of 𝔹 with pairs (U,L) which are the vertices of Θ(𝔸).

Relations.

Given two vertices (U,L) and (U,L) of 𝔹, the type of constraint between them (1-to-1, 2-to-2 or no constraint at all) only depends on μU(L),μU(L),r,r and I(U,U), where r,r are the vectors of the right-hand sides of the equations and

I(U,U){((a,i),(b,j))({1,,k}×{1,2,3})2ua,i=ub,j}

If two pairs of vertices agree on all five of these values, there is a permutation ρ of the variables that will take one to the other and then by Lemma 4.2, they must have the same constraint between them.

Note that each of these five parameters can take only a constant number of different values, so for each constraint C𝒞1𝒞2, there is a (constant) finite set SC containing such 5-tuples so that (U,L) and (U,L) are connected by a constraint C if, and only if, (μU(L),μU(L),r,r,I(U,U))SC. The formula πC defining the relation C in 𝔹 simply states that the 5-tuple corresponding to a pair of vertices is in SC. This translates to a disjunction of a finite number of cases and is clearly FO-definable. This concludes the reduction to the transitive game.

4.6 Weight approximation

We now show how to get a weighted 2-to-2 game, that is an approximation of the instance I2:2w constructed in Section 3.3. The vertices of the game are exactly those in the structure 𝔹 above. The main task is to define the weights, by defining a suitable set C of constraints. Recall that the vertices of I2:2w are partitioned into cliques C1,,Cm based on the 1-to-1 constraints. Suppose (U1,L1)Ci and (U2,L2)Cj are two vertices connected by a 2-to-2 constraint. Then, the weight of the constraint is

U,L,LL,LUdim(LL)=l11(U,L)Ci(U,L)Cj1|𝒰|1|{L,LUdim(LL)=l1}|1|Ci||Cj|.

Each of the three factors (apart from the indicator variable) describes the probability of a certain choice in the steps of the random process which define the weights.

Of course, 1|𝒰| is constant for all pairs (U1,L1),(U2,L2). Similarly, 1|{L,LUdim(LL)=l1}| is constant by the symmetry argument presented in Section 4.4. Thus, removing them from the expression does not change the relative weights of the constraints. Also, the clique size only depends on (U1,L1),(U2,L2), so the weight expression (without the normalising factors) simplifies to

|{(U,L,L)(U,L)Ci,(U,L)Cj}||Ci||Cj|. (1)

These weights are rational, so we cannot express them directly in structures over τ(w) 2-to-2q, which is our vocabulary for describing integer-weighted games. One potential way to handle rational weights would be to multiply all weights with a common denominator. This is not a viable option since the number of different-sized cliques grows with the size of the input, making the common denominator too large. However, we have a workaround: instead of these weights, we give an approximation that does not change the soundness parameter significantly but makes the common denominator of the weights small enough (polynomial as a function of the input size) to be definable.

Lemma 4.3.

Given a weighted 2-to-2 game G=(V,Σ,Φ,w), whose value is at most δ, any game G=(V,Σ,Φ,w) where ϕΦ.1γ<w(ϕ)w(ϕ)<γ has value at most δγ2.

Proof (sketch).

The sum of weights drops at most by a factor γ, and the sum of the weights of the satisfied constraints increases by at most a factor of γ.

So, the idea is to approximate clique sizes so that the number of possible denominators is constant and their product grows only polynomially with the input size, while bounding the change with a suitable multiplicative factor γ.

Fix a vertex (U,L) in a clique Ci. Recall that (U,L)Ci if, and only if, there is a one-to-one constraint between (U,L) and (U,L) in 𝔹. First, let us split the equations in U into two groups: “useful” and “useless” ones. An equation in U is useful (for U) if it shares at least one variable with U and useless otherwise. Note that the number of useful equations of (U,L) only depends on U, not on L.

Next, we define an equivalence relation U on the vertices of the game as follows: (U1,L1)U(U2,L2) iff

  • μU1(L1)=μU2(L2).

  • U1 and U2 have the same useful equations (for U), and these equations are in the same positions within the k-tuple.

  • The right-hand sides of the equations in U1 and U2 are the same.

It is easily seen that this is, indeed, an equivalence relation.

Note that the clique Ci is invariant under the equivalence relation U: each equivalence class is either contained in Ci or disjoint with it, by Lemma 4.2 (choosing ρ to be a permutation that fixes the variables of U and any useful equations).

Now, for any f with 0fk, we can establish an upper bound on the number of equivalence classes with f useful equations. Recall that any node (U,L) can be uniquely represented by U and the subspace μU(L)=L:

  • The number of possible subspaces L𝔽23k is at most 223k, as that is an upper bound for || (in fact, it is much smaller, but for our purposes, this upper bound suffices).

  • The number of ways to choose the positions of the useful equations is (kf)2k.

  • The number of choices for the right-hand sides of the equations is 2k.

  • Since the 3XOR instance is regular (each variable appears in at most d equations), the number of equations sharing a variable with U is at most 3kd, so the number of ways of choosing the useful equations is bounded by (3kd)k.

These bounds are all constants, so the number of equivalence classes within the clique, with f useful equations (call it νU,Lf) is bounded by a constant Ψ for all f,U,L.

The number of elements in an equivalence class with f useful equations is simply the number of ways to set the remaining kf equations. This can be approximated by |Eq|kf. Given f useful equations, the probability of a random set of kf equations having common variables with U, the set of useful equations or each other, or making the k-tuple invalid by having two variables from different equations which have a common equation in the 3XOR instance, converges to zero (O(k2|X|)) as the instance size grows, due to the regularity condition. By adding all the approximate sizes of the equivalence classes within Ci, we can conclude that the approximation

χ(𝝂𝑼,𝑳)χ(νU,L0,νU,L1,,νU,Lk)f=0kνU,Lf|Eq|kf|Ci|

is accurate within an arbitrarily small factor as the input size grows. Using this approximation in the weight expression (1), we see that 𝒗{0,,Ψ}k+1χ(𝒗)2 is a common denominator of all weights. Multiplying all weights by this number, we get the expression

w((U1,L1),(U2,L2))=|{(U,L,L)(U,L)Ci,(U,L)Cj}|𝒗{0,,Ψ}k+1{χ(𝒗)if 𝒗𝝂(𝑼𝟏,𝑳𝟏)1if 𝒗=𝝂(𝑼𝟏,𝑳𝟏)𝒗{0,,Ψ}k+1{χ(𝒗)if 𝒗𝝂(𝑼𝟐,𝑳𝟐)1if 𝒗=𝝂(𝑼𝟐,𝑳𝟐) (2)

As we see next, we can define a reduction in FO to weighted 2-to-2 games using these approximate weights.

4.7 Defining the weighted game

Finally, we are ready to show that the construction of a weighted 2-to-2 game with approximate weights as above can be given by an FO interpretation.

Universe.

We need to define the set of vertices, and the set of constraints. The elements of the universe are tuples of elements of X (the set of variables of the 3XOR instance I) of length 8k+1+23k+1+Q, where Q is a parameter we define below.

A vertex (U,L) is coded by the first 4k+23k elements of this tuple, as before, followed by a sequence of 0s. Recall that we code bits 0 and 1 by the first and second elements of the tuple. The first of these 0s is to be interpreted as an indicator that the tuple is a vertex (it will be 1 for a constraint), and the rest are padding to make the length of the tuples match.

A constraint c is coded by a tuple where the first 4k+23k elements represent a vertex (U,L), this is followed by a 1 (i.e. a repeat of the second element of the tuple) and then the next 4k+23k represent a second vertex (U,L). The rest of the tuple codes a unique identifier of the constraint, ID. We construct the interpretation so that for all fixed (U,L),(U,L), there are w((U,L),(U,L)) different identifiers where w is the approximate weight described above. We show that for this weight function, there is a formula W which defines a set of exactly w((U,L),(U,L)) tuples extending the description of (U,L) and (U,L).

Lemma 4.4.

There exists Q+ and a first-order formula W which defines a set T of tuples coding pairs (U,L),(U,L) together with a Q-element unique identifier and such that for each fixed (U,L),(U,L), T contains exactly w((U,L),(U,L)) many tuples extending (U,L),(U,L).

The proof of this lemma, constructing the formula W is in Section 4.8 below.

Thus, we can define the formulas defining the set of vertices and constraints. For simplicity, we use U,L,U,L,ID to describe the sub-tuple of variables in their corresponding parts of the N-tuple, where N=8k+1+23k+1+Q.

Node(U,L,IsConstraint,U,L,ID) IsConstraint=0πU(U,L)x(U,L,ID)x=0

To check if it is a valid constraint, we need

Constraint(U,L,IsConstraint,U,L ,ID)IsConstraint=1πU(U,L)πU(U,L)
C𝒞2πC((U,L),(U,L))W((U,L),(U,L),ID)
Constraints.

For each Cπ1,π2𝒞2, we can construct the formula that defines the set of triples (x,y,c) where x=(U,L,0,,0) y=(U,L,0,0) and c=(U,L,1,U,L,ID), such that there is a constraint of type C between x and y and ID is a valid id of a constraint between them.

Φπ1,π2(x,y,c)πCπ1,π2(x,y)(U,L)=(U1,L1)(U,L)=(U2,L2).

This completes the proof of Theorem 3.1.

4.8 Defining W

To prove Lemma 4.4 we define a first-order formula W(x,y,z) in the vocabulary τ3XOR, where x, y and z are tuples of free variables. The formula is such that if x and y are interepreted by the elements coding the nodes (U,L) and (U,L) respectively, then there are exactly w((U,L)(U,L)) assignments of values to the tuple z that make W true. Here w(U,L)(U,L) is the expression given in Equation 2.

To define W, we construct formulas defining various elements of Equation 2. More precisely, for various numerical expressions e(x,y), which depend on the values assigned to x and y, we construct formulas we denote wq,e(x,y,z), where q is the length of the tuple of variables z. These formulas have the property that when x and y are interepreted by the elements coding the nodes (U,L) and (U,L) the number of q-tuples that can be assigned to z to make ωq,e true is exactly e(x,y). As before, we use 0 and 1 to denote the first and second elements of the tuple. Also, for a first-order formula ϕ(x,y), let 1ϕ denote the indicator variable that ϕ is true (under an assignment of values to x and y).

e=1:

ω1,e(x,y,z)(z=0)

e=1ϕ:

ω1,e(x,y,z)(z=0)ϕ(x,y)

e=e1×e2:

Given ωq1,e1 and ωq2,e2, we can define

ωq1+q2,e(x,y,z1,,zq1,zq1+1,,zq2)ωq1,e1(z1,,zq1)ωq2,e2(zq1+1,,zq2)
e=e1+e2:

Given ωq1,e1 and ωq2,e2, (assuming without loss of generality that q2q1, we can define

ω1+q2,e(x,y,z1,z2,,zq2+1) [z1=0ωq1,e1(z2,,zq1+1)i=q1+2q2+1zi=0]
[z1=1ωq2,e2(z2,,zq2+1)]
e=|𝐄𝐪|:

It suffices to take a formula defining the disjoint union of the relations Eq0 and Eq1.

ω4,e(x,y,z1,z2,z3,z4)(z1=0Eq0(z2,z3,z4))(z1=1Eq1(z2,z3,z4))
e=|{(U1,L1,L2)(U1,L1)Ci,(U1,L2)Cj}|:

The numerator in Equation 1 (and a term in Equation 2) is e=|{(U1,L1,L2)(U1,L1)Ci,(U1,L2)Cj}|. We can get a formula for this by defining exactly this set of tuples. Here z is a tuple of variables composed of three tuples z1, z2 and z3 where z1 has length 4k and each of z2 and z3 is of length 23k.

ω4k+223k,e(x,y,z) =πU(z1,z2)πU(z1,z3)C𝒞2C((z1,z2),(z1,z3)
C𝒞1C((z1,z2),x)C𝒞1C((z1,z3),y)
Defining the size of the equivalence classes.

Another element of Equation 2 are conditions of the form νU,Lf=r for various values of r. We now construct a formula νf,r(x) with 4k+23k free variables that expresses the condition νU,Lfr when x is interpreted by the tuple coding (U,L). In the following, lower case letters u,l, possibly with subscript indices always denote tuples of variables of length 4k and 23k respectively. Recall that two elements in the clique are in the equivalence relation (U,L) if, and only if, their L values are the same and share the same useful equations with the same positions.

We begin with defining a couple of auxiliary formulas. For any j{1,,k}, the formula usefulj(x,u) says of a tuple u that the jth equation it represents is useful and the formula diffj(u1,u2) asserts that the two tuples u1 and u2 differ in the jth equation:

usefulj(x,u)i{1,,3k}(u3(j1)+1=xiu3(j1)+2=xiu3(j1)+3=xi); and
diffj(u1,u2) (u1)3(j1)+1(u2)3(j1)+1(u1)3(j1)+2(u2)3(j1)+2
(u1)3(j1)+3(u2)3(j1)+3(u1)3k+j(u2)3k+j.

With these, we can define νf,r(x) as a formula whiuch asserts the existence of r nodes

u1,l1,,ur,lriπU(ui,li);

which are are in the same clique as the node coded by x

iC𝒞1C(x,ui,li);

all have f useful equations

i{1,,r}{S{1,,k},|S|=f[j{1,,k}usefulj(x,ui)jS]};

and such that no two nodes are (U,L) equivalent when x is interpreted as (U,L)

ij{1,,r}liljo{1,,k}(usefulo(ui)diffo(ui,uj)).

Then, as usual, νf,r(x)νf,r(x)¬νf,(r+1)(x). To give an expression for ωq(νU,Lf) for some q, we can rewrite it as r=1Ψ1νU,Lf,rr and construct the expression using the composition rules (constants can be constructed via repeated addition of ones, addition, multiplication and indicator variables are defined above)

Putting it all together.

For each term in Equation 2, we have described how to define a corresponding formula. Case splits can be handled via indicator variables and constants by repeatedly adding 1s. By a repeated application of the addition and multiplication rules, W can be constructed.

5 Consequences

5.1 Unique Games

An immediate corollary of the definable 2-to-2 games theorem is the inapproximability of unique games by any constant factors:

Given a (weighted) 2-to-2 game I, we can map it to a Unique Game I by splitting every constraint into two: given a constraint of type Cπ1,π2, we can replace them with two 1-to-1 constraints of type Cπ1 and Cπ2. A colouring of the nodes then satisfies the constraint Cπ1,π2 in I if, and only if, exactly one of thee two constraints is satisfied in I. Note that a colouring can only satisfy at most one of the two constraints. This gives a reduction from GapWeight2-to-2q(1,δ) to GapWeightUGq(12,δ2) for any δ>0.

This reduction is clearly FO-definable: the universe remains the same; then, for a 1-to-1 constraint Cπ, we can determine if (x,y,c) represents a constraint of this type with the sentence Φπ(x,y,c)π2Φπ,π2(x,y,c).

Theorem 5.1.

For every δ>0, there exists q+ so that GapWeightUGq12,δ) is FPC undefinable.

This undefinability gap is stronger, at least in terms of the completeness and soundness parameters, than the gaps proved by Tucker-Foltz [26], the only previously known undefinability gaps for Unique Games. However, our construction uses weighted instances, so we can only conclude the undefinability gap over the domain of weighted unique games. Since the gaps in [26] are proved for unweighted games, they are incomparable to Theorem 5.1.

5.2 Vertex Cover

Another consequence of Theorem 2.3 is the NP-Hardness of approximating the Vertex Cover problem by a factor better than 2. The Unique Games Conjecture implies that nothing better than a factor 2 approximation is possible. This is tight, since polynomial-time algorithms achieving a 2-approximation are known. Before the results of Khot et al. establishing Theorem 2.3 the best known inapproximability result, conditional only on PNP, was 1.36. Atserias and Dawar [6] showed a corresponding unconditional FPC undefinabiity result. We improve on this with the following.

Theorem 5.2 (FPC-IS).

For every ϵ,δ>0, GapIS(112δ,ϵ) is not definable in FPC.

Here IS is the function problem giving the size of a maximal independent set in a graph as a proportion of the total number of vertices. This is equivalent to the FPC undefinability of GapVertexCover(11ϵ,2+δ), implying the FPC-inapproximability of vertex cover by a factor smaller than 2. The theorem follows from the reduction presented in  [22, Chapter 5] which can be defined in First-Order Logic using standard methods.

5.3 Graph Colouring

Perhaps the most striking consequence of our result is the following.

Theorem 5.3.

For every t3, the class of 3-colourable graphs are not FPC separable from those that are not t-colourable.

Theorem 5.3 should be contrasted with what is known about the NP-hardness of promise graph colouring. It is known that it is NP-hard to separate the 3-colourable graphs from those that are not 5-colourable [7]. It is conjectured that it is NP-hard to separate the 3-colourable graphs from those that are not t-colourable for all t3, but this is open even for t=6. Thus, Theorem 5.3 provides the first significant example of an FPC hardness of approximation result that is open in the classical setting of NP-hardness.

Guruswami and Sandeep [16] show a reduction from GapIrreg2to2j,q(1,δ) to the problem of separating 3-colourable graphs from non-t-colourable ones [12]. The reduction is easily definable in first-order logic, proving Theorem 5.3.

6 Conclusion

We have shown that the reductions involved in the proof of the celebrated proof by Khot, Minzer and Safra of the 2-to-2 games theorem can all be implemented as interpretations in first-order logic. This means that the NP-hardness they establish of separating nearly satisfiable instances from highly unsatisfiable ones can be turned into an unconditional inseparability result in FPC. Moreover, the result is achieved with perfect completeness: it is impossible to separate with an FPC sentence the fully satisfiable 2-to-2 games from those that are highly unsatisfiable.

From this result we are able to derive a number of consequences, the most striking of which is that it is impossible to separate with an FPC sentence the graphs that are 3-colourable from those that are not t-colourable for any constant t. The NP-hardness of such a separation is only conjectured for values t larger than 5. We also obtain strong FPC undefinability results for approximation of unique games. In terms of approximation ratios these are an improvement over those of Tucker-Foltz [26]. However, the latter results were obtained for unwieghted games while ours are for weighted games.

This work suggests a number of further directions to pursue. One is an investigation of the FPC definability of promise constraint satisfaction problems (PCSP). The t-colouring of 3-colourble graphs is one such example, but PCSP are a very active current area of investigation. Our results could also be tightened by showing them for unweighted instances rather than with weights. Indeed, we believe that Theorem 5.1 could be improved to apply to unweighted games as well, making it a direct improvement of the results of [26]. For this improvement, it would be sufficient to prove the FPC analogue for the result of Crescenzi et al. [9] showing a gap reduction from weighted CSP instances to unweighted ones. The proof of Khot, Minzer and Safra applies this reduction to establish Theorem 2.3 on unweighted games. This merits further study.

References

  • [1] Matthew Anderson and Anuj Dawar. On symmetric circuits and fixed-point logics. Theory of Computing Systems, 60(3):521–551, July 2017. doi:10.1007/s00224-016-9692-2.
  • [2] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, May 1998. doi:10.1145/278298.278306.
  • [3] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: a new characterization of np. J. ACM, 45(1):70–122, January 1998. doi:10.1145/273865.273901.
  • [4] Albert Atserias, Andrei Bulatov, and Anuj Dawar. Affine systems of equations and counting infinitary logic. Theoretical Computer Science, 410(18):1666–1683, 2009. Automata, Languages and Programming (ICALP 2007). doi:10.1016/j.tcs.2008.12.049.
  • [5] Albert Atserias and Víctor Dalmau. Promise constraint satisfaction and width. In Proceedings of the 2022 ACM-SIAM Symposium on Discrete Algorithms, SODA 2022, pages 1129–1153. SIAM, 2022. doi:10.1137/1.9781611977073.48.
  • [6] Albert Atserias and Anuj Dawar. Definable inapproximability: New challenges for duplicator, 2019. arXiv:1806.11307.
  • [7] Libor Barto, Jakub Bulín, Andrei Krokhin, and Jakub Opršal. Algebraic approach to promise constraint satisfaction. Journal of the ACM, 68(4):1–66, July 2021. doi:10.1145/3457606.
  • [8] Libor Barto and Marcin Kozik. Constraint satisfaction problems solvable by local consistency methods. J. ACM, 61(1), January 2014. doi:10.1145/2556646.
  • [9] Pierluigi Crescenzi, Riccardo Silvestri, and Luca Trevisan. On weighted vs unweighted versions of combinatorial optimization problems. Inf. Comput., 167(1):10–26, May 2001. doi:10.1006/inco.2000.3011.
  • [10] Anuj Dawar. The nature and power of fixed-point logic with counting. ACM SIGLOG News, 2(1):8–21, January 2015. doi:10.1145/2728816.2728820.
  • [11] Irit Dinur, Subhash Khot, Guy Kindler, Dor Minzer, and Muli Safra. Towards a proof of the 2-to-1 games conjecture? In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, pages 376–389, New York, NY, USA, 2018. Association for Computing Machinery. doi:10.1145/3188745.3188804.
  • [12] Irit Dinur, Elchanan Mossel, and Oded Regev. Conditional hardness for approximate coloring. SIAM Journal on Computing, 39(3):843–873, January 2009. doi:10.1137/07068062x.
  • [13] Irit Dinur and Shmuel Safra. On the hardness of approximating label-cover. Information Processing Letters, 89(5):247–254, March 2004. doi:10.1016/j.ipl.2003.11.007.
  • [14] Heinz-Dieter Ebbinghaus and Jörg Flum. Finite Model Theory. Springer, 2nd edition, 1999.
  • [15] Uriel Feige, Shafi Goldwasser, Laszlo Lovász, Shmuel Safra, and Mario Szegedy. Interactive proofs and the hardness of approximating cliques. J. ACM, 43(2):268–292, March 1996. doi:10.1145/226643.226652.
  • [16] Venkatesan Guruswami and Sai Sandeep. d-to-1 hardness of coloring 3-colorable graphs with o (1) colors. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020.
  • [17] Johan Håstad. Some optimal inapproximability results. J. ACM, 48(4):798–859, July 2001. doi:10.1145/502090.502098.
  • [18] Subhash Khot. On the power of unique 2-prover 1-round games. In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, STOC ’02, pages 767–775, New York, NY, USA, 2002. Association for Computing Machinery. doi:10.1145/509907.510017.
  • [19] Subhash Khot. On the unique games conjecture (invited survey). In 2010 IEEE 25th Annual Conference on Computational Complexity, pages 99–121, 2010. doi:10.1109/CCC.2010.19.
  • [20] Subhash Khot, Dor Minzer, and Muli Safra. On independent sets, 2-to-2 games, and grassmann graphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, pages 576–589, New York, NY, USA, 2017. Association for Computing Machinery. doi:10.1145/3055399.3055432.
  • [21] Subhash Khot, Dor Minzer, and Muli Safra. Pseudorandom sets in Grassmann graph have near-perfect expansion. Annals of Mathematics, 198(1):1–92, 2023. doi:10.4007/annals.2023.198.1.1.
  • [22] Dor Minzer. On Monotonicity Testing and the 2-to-2 Games Conjecture, volume 49. Association for Computing Machinery, New York, NY, USA, 1 edition, 2022.
  • [23] Prasad Raghavendra. Optimal algorithms and inapproximability results for every csp? In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC ’08, pages 245–254, New York, NY, USA, 2008. Association for Computing Machinery. doi:10.1145/1374376.1374414.
  • [24] Benjamin Rossman. Equi-rank homomorphism preservation theorem on finite structures. In 33rd EACSL Annual Conference on Computer Science Logic, CSL, 2025.
  • [25] Khot Subhash, Dor Minzer, and Muli Safra. Pseudorandom sets in grassmann graph have near-perfect expansion. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 592–601, 2018. doi:10.1109/FOCS.2018.00062.
  • [26] Jamie Tucker-Foltz. Inapproximability of Unique Games in Fixed-Point Logic with Counting. Logical Methods in Computer Science, Volume 20, Issue 2, April 2024. doi:10.46298/lmcs-20(2:3)2024.
  • [27] Salil P. Vadhan. Pseudorandomness. Foundations and Trends® in Theoretical Computer FhScience, 7(1–3):1–336, 2012. doi:10.1561/0400000010.