
Min-CSPs on Complete Instances II:
Polylogarithmic Approximation for Min-NAE-3-SAT

Aditya Anand, University of Michigan, Ann Arbor, MI, USA; Euiwoong Lee, University of Michigan, Ann Arbor, MI, USA; Davide Mazzali, EPFL, Lausanne, Switzerland; Amatya Sharma, University of Michigan, Ann Arbor, MI, USA
Abstract

This paper studies complete k-Constraint Satisfaction Problems (CSPs), where an n-variable instance has exactly one nontrivial constraint for each subset of k variables, i.e., it has (n choose k) constraints. A recent work started a systematic study of complete k-CSPs [Anand, Lee, Sharma, SODA'25] and showed a quasi-polynomial time algorithm that decides whether there is an assignment satisfying all the constraints of any complete Boolean-alphabet k-CSP, algorithmically separating complete instances from dense instances.

The tractability of this decision problem is necessary for any nontrivial (multiplicative) approximation for the minimization version, whose goal is to minimize the number of violated constraints. The same paper raised the question of whether it is possible to obtain nontrivial approximation algorithms for complete Min-k-CSPs with k ≥ 3.

In this work, we make progress in this direction and show a quasi-polynomial time polylog(n)-approximation to Min-NAE-3-SAT on complete instances, which asks to minimize the number of 3-clauses in which all three literals evaluate to the same bit. To the best of our knowledge, this is the first known example of a CSP whose decision version is NP-hard on general (and dense) instances while admitting a polylog(n)-approximation on complete instances. Our algorithm presents a new iterative framework for rounding a solution from the Sherali-Adams hierarchy, where each iteration interleaves two well-known rounding tools: the conditioning procedure, in order to "almost fix" many variables, and the thresholding procedure, in order to "completely fix" them.

Finally, we improve the running time of the decision algorithms of Anand, Lee, and Sharma and show a simple algorithm that decides any complete Boolean-alphabet k-CSP in polynomial time.

Keywords and phrases:
Constraint Satisfiability Problems, Approximation Algorithms, Sherali Adams
Category:
APPROX
Funding:
Euiwoong Lee: Supported in part by NSF grant CCF-2236669 and Google.
Copyright and License:
© Aditya Anand, Euiwoong Lee, Davide Mazzali, and Amatya Sharma; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Approximation algorithms analysis; Mathematics of computing → Approximation algorithms; Theory of computation → Problems, reductions and completeness
Related Version:
Full Version: https://arxiv.org/abs/2504.19051
Editors:
Alina Ene and Eshan Chattopadhyay

1 Introduction

Constraint Satisfaction Problems (CSPs) provide a unified framework for expressing a wide range of combinatorial problems, including SAT, Graph Coloring, and Integer Programming. A CSP consists of a set of n variables that must be assigned values from a given alphabet, subject to a collection of m constraints. The goal is typically to determine whether there exists an assignment that satisfies all constraints (decision CSPs) or to optimize some measure of constraint satisfaction (optimization CSPs).

A major example of optimization CSPs is Max-CSPs, where the objective is to find an assignment that maximizes the number of satisfied constraints. This class of problems has been extensively studied and is known to admit various approximation algorithms, including the (conditionally) optimal approximability of fundamental problems such as Max-3-LIN, Max-3-SAT, Max-Cut, and Unique Games [20, 25, 26]. However, significantly less is known for their minimization counterparts, namely Min-CSPs, which aim to minimize the number of violated constraints.

In many cases, the minimization version of a CSP is inherently harder to approximate than the maximization version. For example, while Max-2-SAT admits a tight α_LLZ ≈ 0.94016567 approximation algorithm modulo the Unique Games Conjecture (UGC) [28, 12, 25], the best-known approximation guarantee for Min-2-SAT (also known as Min-2-CNF-Deletion) is an O(log n)-approximation [1], along with an ω(1) hardness under the UGC [25]. Another example is Not-All-Equal-3-SAT (NAE-3-SAT), where a clause (consisting of three literals) is satisfied if and only if not all three literals evaluate to the same Boolean value. The maximization version, Max-NAE-3-SAT, admits a tight approximation factor of ≈ 0.9089 modulo the UGC [11]. In contrast, the minimization version Min-NAE-3-SAT admits no finite approximation in polynomial time, simply from the NP-hardness of the decision version NAE-3-SAT [34].

Understanding the approximability of Min-CSPs remains a major challenge in computational complexity and approximation algorithms. On the brighter side, [24] proved that, based on a structural classification, the optimal approximation ratio of any Min-CSP over the Boolean alphabet takes one of the values in {1, O(1), polylog(n), poly(n), ∞} (throughout this paper, CSPs are over the Boolean alphabet unless specified otherwise).

Apart from general instances, CSPs often exhibit significantly different behavior when structural requirements are imposed on their instances. The structure of a CSP refers to assumptions on how constraints are distributed across the instance. A k-CSP is defined on n variables, where each constraint involves exactly k variables, and the constraint structure can naturally be modeled as a k-uniform hypergraph, where variables correspond to vertices and each constraint corresponds to a hyperedge. Structural assumptions on CSPs often translate into density conditions on the corresponding hypergraph. Two particularly important structured settings are (1) dense instances, where the number of constraints (i.e., hyperedges) is Ω(n^k), and (2) complete instances, where the number of constraints is exactly (n choose k).

Max-CSPs on dense instances have been extensively studied, with powerful algorithmic techniques yielding strong approximation guarantees. In fact, for every Max-CSP on dense instances, a Polynomial-Time Approximation Scheme (PTAS) is known, achievable through any of the three major algorithmic frameworks: random sampling [5, 10, 3, 15, 30, 23, 8, 35, 29, 17], convex hierarchies [16, 6, 9, 19, 36, 2, 21, 7], and regularity lemmas [18, 14, 32, 22]. However, in contrast, Min-CSPs on dense instances remain far less explored. Known results exist only for specific problems, such as O(1)-approximation algorithms for Unique Games and Min-Uncut [10, 23, 19, 31], and a PTAS for fragile Min-CSPs [23], where a CSP is fragile when changing the value of a variable always flips a clause containing it from satisfied to unsatisfied. Despite these advances, a general framework for tackling Min-CSPs in dense settings is still lacking, making it a compelling direction for further study.

Building on the study of structured CSPs, [4] introduced complete instances as an extreme case of structure, where every possible k-set of variables forms a constraint. The primary motivation for studying complete instances comes from their connections to machine learning and data science, their role in unifying algorithmic techniques for dense Max-CSPs, and the structural insights they provide into CSPs (see [4] for details). That work gives a constant-factor polynomial time approximation for Min-2-SAT on complete instances, in contrast with the ω(1)-hardness on dense instances, and quasi-polynomial (specifically, n^{O(log n)}) time algorithms for the decision versions of k-SAT and k-CSP for all constants k. It also gives a polynomial time algorithm for deciding NAE-3-SAT on complete instances. On the hardness side, it proves that there is no polynomial time algorithm for exact optimization (which does not distinguish the maximization and minimization versions) of NAE-3-SAT, k-SAT, k-Lin, k-And, and Unique Games even on complete instances, unless NP ⊆ BPP.

Our Contribution.

We prove two main results in our work: (1) a quasi-polynomial time O(log⁶ n)-approximation for Min-NAE-3-SAT on complete instances, and (2) a polynomial time algorithm for the decision version of k-CSPs for all constants k.

We first present a quasi-polynomial (i.e., n^{polylog(n)}) time polylog(n)-approximation algorithm for Min-NAE-3-SAT on complete instances. To the best of our knowledge, it is the first known example of a CSP whose decision version is NP-hard on general instances (and dense instances too, see Claim 7.1) while admitting a polylog(n)-approximation on complete instances.

Theorem 1.1.

There is an algorithm running in time n^{log^κ n} for some constant κ > 0 that gives an O(log⁶ n)-approximation for Min-NAE-3-SAT on complete instances. Furthermore, the integrality gap of the degree-O(log^κ n) Sherali-Adams relaxation is at most O(log⁶ n).

Beyond addressing this specific question, our result also strengthens one of the main motivations for studying Min-CSPs on complete instances: understanding whether a combination of algorithmic techniques (random sampling, convex hierarchies, and regularity lemmas) can improve approximability results. As stated earlier, while each of these techniques independently yields PTASes for Max-CSPs on dense/complete instances, their effectiveness for Min-CSPs is much less understood. Our algorithm presents a new iterative framework for rounding a solution from the polylog(n)-round Sherali-Adams hierarchy, where each iteration interleaves two well-known rounding tools: (1) the conditioning procedure, which identifies and conditions on a small set of variables in order to almost fix a constant fraction of variables, and (2) the thresholding procedure, which completely fixes them. We fix a constant fraction of variables in each iteration (for a total of O(log n) iterations) while ensuring that, by the end of all the iterations, the value of the LP remains within a polylog(n) factor of the optimal value.

Second, we give a polynomial time algorithm for the decision version of k-CSPs (for every constant k), improving over the quasi-polynomial time algorithm of [4].

Theorem 1.2.

For every k ≥ 2, there is a polynomial time algorithm that decides whether a complete instance of a k-CSP is satisfiable.

Our algorithm is remarkably simple. It relies on Lemma 3.1 of [4], which shows via a VC-dimension argument that the number of satisfying assignments of a complete k-CSP instance is at most O(n^{k−1}). The algorithm first orders the variables v₁, …, v_n arbitrarily, and then iteratively maintains all satisfying assignments of the partial (complete) instance induced on the first i variables. Since the number of satisfying assignments of a complete instance on i variables is at most O(i^{k−1}), we can keep track of and update these solutions in polynomial time, leading to our second main result.
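The iterative maintenance described above can be sketched as follows. The encoding of constraints as predicates on index tuples is our own illustrative choice, and the polynomial running time rests on the O(i^{k−1}) bound cited above (the sketch simply enumerates extensions and filters them).

```python
from itertools import combinations

def decide_complete_csp(n, k, predicate):
    """Decide satisfiability of a complete k-CSP on variables 0..n-1.

    `predicate(S, alpha)` returns True iff the constraint on the sorted
    k-tuple S is satisfied by the assignment alpha (a dict var -> bit).
    """
    # Satisfying assignments of the sub-instance induced on the first i
    # variables, stored as bit tuples.  By [4, Lemma 3.1] this set has
    # size O(i^(k-1)), which is what makes the loop polynomial time.
    partial = [()]  # the empty assignment satisfies the empty instance
    for i in range(n):
        extended = []
        for alpha in partial:
            for b in (0, 1):
                beta = alpha + (b,)
                # Check every constraint that becomes fully assigned now,
                # i.e. every k-set whose largest variable is i.
                ok = all(
                    predicate(S + (i,), {v: beta[v] for v in S + (i,)})
                    for S in combinations(range(i), k - 1)
                )
                if ok:
                    extended.append(beta)
        partial = extended
    return len(partial) > 0
```

For instance, the complete all-positive NAE-3-SAT instance on 4 variables is satisfiable, while demanding exactly one 1 in every triple is not.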

Open Questions.

Our work establishes the first nontrivial approximation result for a Min-CSP, namely Min-NAE-3-SAT on complete instances, whose decision version is NP-hard on general and dense instances. NAE-3-SAT remains hard even when every variable appears in Ω(n²) clauses (Claim 7.1). However, our techniques do not currently extend to other problems, including Min-3-SAT and Min-NAE-4-SAT.

This raises open questions regarding both approximation algorithms and hardness results for Min-k-CSPs. While exact optimization is known to be hard, proving inapproximability beyond this, such as APX-hardness, remains an important direction. A natural first candidate is Min-3-SAT, for which both the existence of efficient quasi-polynomial time approximation algorithms and hardness results on complete instances remain unresolved.

Organization.

Section 2 provides an overview of the quasi-polynomial time O(log⁶ n)-approximation for Min-NAE-3-SAT, i.e., a proof outline for Theorem 1.1. Section 3 establishes the notation and preliminaries required throughout the paper. Section 4 presents and analyzes the rounding algorithm, Algorithm 2, which proves Theorem 1.1 by rounding the solution of the Sherali-Adams LP. Section 5 presents the polynomial time algorithm for the decision version of k-CSPs (Theorem 1.2).

2 Technical overview

Convex programming hierarchies have proven instrumental in the design of approximation algorithms for Max-CSPs [16, 6, 9, 19, 36, 2, 21, 7] and, in a few special cases, Min-CSPs [13]. It is therefore natural to attempt to approximate Min-NAE-3-SAT on complete instances using these approaches. We prove Theorem 1.1 by designing an algorithm that first solves the degree-d Sherali-Adams relaxation of Min-NAE-3-SAT, and then rounds the fractional solution to a Boolean assignment.

Informally, a fractional solution μ to such a relaxation associates each set S of at most d variables with a distribution μ_S over assignments α ∈ {0,1}^S to the variables in S. The crucial property is that these distributions are locally consistent: for any two variable sets S, T of size at most d, the local distributions μ_S and μ_T agree on the probability of each assignment to S ∩ T. For this reason, such a solution μ is called a pseudodistribution.

The objective of the relaxation is to minimize the average over constraints of the probability that they are violated by an assignment sampled from their respective local distribution. In particular, the objective value of an optimal pseudodistribution μ is at most the fraction of violated constraints in an optimal Boolean assignment.

As solving a degree-d Sherali-Adams relaxation amounts to solving a linear program with nO(d) variables, we would like to keep d as small as possible. In our context, we will set d=polylog(n).
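To make the local-consistency condition concrete, here is a toy check on a dictionary representation of a pseudodistribution; the representation, variable names, and function names are ours, not the paper's.

```python
def marginal(dist, vars_sorted, I):
    """Marginal of `dist` (assignment tuples indexed by vars_sorted) on I."""
    idx = [vars_sorted.index(v) for v in sorted(I)]
    out = {}
    for alpha, p in dist.items():
        key = tuple(alpha[j] for j in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def is_locally_consistent(mu, tol=1e-9):
    """Sherali-Adams consistency: for all sets S, T in `mu` (a dict mapping
    frozensets of variables to distributions over bit tuples), the marginals
    of mu[S] and mu[T] on the intersection S & T must agree."""
    for S, dS in mu.items():
        for T, dT in mu.items():
            I = S & T
            mS = marginal(dS, sorted(S), I)
            mT = marginal(dT, sorted(T), I)
            if any(abs(mS.get(x, 0.0) - mT.get(x, 0.0)) > tol
                   for x in set(mS) | set(mT)):
                return False
    return True
```

For example, a uniform singleton marginal is consistent with a uniform pair distribution, but a point mass on a single bit is not.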

Having put this notation in place, we now give an overview of how the algorithm from Theorem 1.1 rounds these fractional solutions. Specifically, in Section 2.1 we highlight some of the structural properties that our problem instance imposes on the pseudodistributions, and in Section 2.2 we discuss how to exploit these properties to design a rounding algorithm.

2.1 Pseudodistributions and complete NAE-3-SAT

When rounding pseudodistributions obtained from the Sherali-Adams or Sum-of-Squares hierarchies, conditioning is perhaps the main hammer at our disposal. This technique, introduced by [9, 19] and adopted ubiquitously to approximate optimization CSPs, is usually implemented as follows: first, we condition on a subset of r = O(1/ϵ²) variables to reduce the average correlation among groups of t variables to below O(2^t ϵ); then, we perform independent rounding. This scheme is particularly effective in the context of dense Max-k-CSPs, where we capitalize on the fact that the optimal assignment satisfies at least an Ω(1) fraction of the constraints. Hence, we can afford to additively lose a total variation distance of O(2^k ϵ) = O(ϵ) between the pseudodistribution and independent rounding in each constraint, and still obtain a (1 − O(ϵ))-approximation.

Obstacle: we cannot afford additive error.

In the context of Min-k-CSPs, the objective function is the fraction of violated constraints. Hence, for a Min-k-CSP instance ℐ, we define OPT_ℐ to be the minimum over all variable assignments α of the fraction of constraints of ℐ unsatisfied by α. With such an objective, the trick discussed above ceases to work: a highly satisfiable instance has a very small optimal value, possibly even OPT_ℐ = O(1/n^k). Due to this eventuality, we cannot afford to additively lose an ϵ term for each constraint: it would result in an objective value of, say, ϵ + OPT_ℐ ≥ n·OPT_ℐ unless ϵ ≤ 1/n^{k−1}. Unfortunately, the degree d of the pseudodistribution needs to be larger than the number r = O(1/ϵ²) of variables we condition on, so setting ϵ ≤ 1/n^{k−1} is prohibitive.

Advantage: polylog(n) loss in time and approximation.

On the flip side of the above discussion, one can deduce that if OPT_ℐ ≥ 1/polylog(n), we can in fact afford to lose an additive O(ϵ) term for ϵ = 1/polylog(n) and still run in n^{O(r)} = n^{polylog(n)} time. Even more so, if OPT_ℐ ≥ 1/polylog(n), then we always get a polylog(n)-approximation, no matter what solution we output (since the value we obtain is always at most 1). We can therefore restrict ourselves to instances with small optimal value.

Insight 1.

Without loss of generality, we can consider only instances with OPT_ℐ ≤ 1/polylog(n).

Advantage: the instance is complete.

Let μ be an optimal pseudodistribution for our instance ℐ, and let us denote by δ = val(μ) the fractional objective value achieved by μ on ℐ. Thanks to Insight 1, we can assume δ ≤ OPT_ℐ ≤ 1/polylog(n). Recalling that we are in the realm of complete instances, a simple averaging yields the following observation.

Insight 2.

For at least (n choose 3)/2 of the triples {u,v,w}, the constraint P_{u,v,w} is unsatisfied with probability at most 2δ ≤ 1/polylog(n) over α ∼ μ_{u,v,w}.

We remark that a version of Insight 2 remains true for dense, but not necessarily complete, instances up to an Ω(1) loss in the lower bound. However, being dense is not a “hereditary property”, while being complete is: any variable-induced sub-instance of a complete instance is itself complete (and in particular dense), whereas a variable-induced sub-instance of a dense instance need not be dense. Thus, assuming that ℐ is complete allows us to apply Insight 2 locally to any variable-induced sub-instance. As we will see, this is instrumental to our algorithm.

Advantage: the structure of NAE-3-SAT constraints.

Consider any constraint P_{u,v,w} that is unsatisfied with probability at most δ′ over μ_{u,v,w}, for some δ′ ∈ [0,1]. By Insight 2, we can assume that δ′ ≤ 2δ ≤ 1/polylog(n). Then one can see that, among the six satisfying assignments of P_{u,v,w}, there is one that retains probability mass at least (1 − 1/polylog(n))/6 ≥ 1/7 in μ_{u,v,w}. Call this satisfying assignment (α_u, α_v, α_w). Assuming for simplicity that the literals of P_{u,v,w} are all positive, we conclude that exactly two of α_u, α_v, α_w must be equal. By symmetry, we can assume that α_v = α_w = 0 and α_u = 1. We now consider the pseudodistribution μ̃ = μ|_{(v,w)←(β_v,β_w)} obtained by conditioning on the variables v, w taking the random value (β_v, β_w) ∼ μ_{v,w}. We can observe that (β_v, β_w) = (α_v, α_w) = (0,0) with probability at least 1/7. Therefore, if such an event occurs, we have μ̃_u(0) ≤ 14δ, as one can deduce from Table 1. In this case, we say that u becomes O(δ)-fixed in μ̃.

Table 1: Distribution of the probability mass in μ_{u,v,w} over the assignments of a constraint P_{u,v,w} that is unsatisfied with probability at most 2δ, assuming that P_{u,v,w} has only positive literals.

We now continue to look at a fixed constraint P_{u,v,w} as above, and consider the following random experiment: draw a random pair of variables x, y together with a random assignment (β_x, β_y) ∼ μ_{x,y} sampled from their local distribution, and let μ̃ = μ|_{(x,y)←(β_x,β_y)} be the pseudodistribution obtained by conditioning on the pair (x,y) taking value (β_x, β_y). With probability 1/(n choose 2) we hit the pair (v,w), i.e., (x,y) = (v,w), and with probability at least 1/7 we then have (β_x, β_y) = (β_v, β_w) = (0,0). This means that the variable u becomes O(δ)-fixed in μ̃ with probability at least 0.1/(n choose 2). By Insight 2, we know that for many (i.e., Ω(n)) choices of u, there are many (i.e., Ω(n²)) triples P_{u,v,w} that can O(δ)-fix u, like the one considered in the discussion above. Using the linearity of expectation and Markov's inequality, we can then conclude the following.
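As a numeric sanity check of this conditioning argument, consider the following hypothetical local distribution μ_{u,v,w} for an all-positive NAE clause (the specific numbers are ours, chosen only to satisfy the hypotheses above):

```python
# Hypothetical local distribution mu_{u,v,w} for an all-positive NAE clause:
# it violates the clause with probability delta' = 0.01, and the satisfying
# assignment (1,0,0) carries mass >= 1/7.
mu = {(1, 0, 0): 0.60, (0, 1, 1): 0.30, (0, 1, 0): 0.03, (1, 0, 1): 0.03,
      (0, 0, 1): 0.02, (1, 1, 0): 0.01, (0, 0, 0): 0.006, (1, 1, 1): 0.004}
delta_prime = mu[(0, 0, 0)] + mu[(1, 1, 1)]   # violation probability

# Condition on (v, w) = (0, 0): only (0,0,0) and (1,0,0) survive.
mass_vw_00 = mu[(0, 0, 0)] + mu[(1, 0, 0)]
u_is_0 = mu[(0, 0, 0)] / mass_vw_00           # this is tilde-mu_u(0)

# u becomes O(delta')-fixed: its residual mass is at most 7 * delta',
# since the denominator is at least the 1/7 mass of (1,0,0).
assert u_is_0 <= 7 * delta_prime
```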

Insight 3.

For a random pair of variables x, y and a random assignment (β_x, β_y) ∼ μ_{x,y}, with probability at least 1/1000 there are at least n/1000 variables u that become O(δ)-fixed in μ̃ = μ|_{(x,y)←(β_x,β_y)}.

A formal version of this fact is stated in Lemma 4.1.

Advantage: “completely” fixing O(δ)-fixed variables incurs little cost.

We recall that conditioning on a random assignment (β_x, β_y) ∼ μ_{x,y} preserves the objective value in expectation, i.e., 𝔼[val(μ̃)] = val(μ) = δ. Hence, Insight 3 seems to suggest that conditioning on just two variables allows making nontrivial progress towards the goal of rounding the pseudodistribution: the entropy of Ω(n) variables should drastically decrease while preserving the expected objective value. However, conditioning alone is not enough, as its effect diminishes once the entropy is already small but still nontrivial (e.g., when most variables are already O(δ)-fixed but not yet integral, the above argument cannot guarantee any more progress). While independent rounding suffices for Max-CSPs in this case, its additive loss can still be prohibitive for the minimization objective.

In order to bypass this barrier, we additionally use thresholding to round many O(δ)-fixed variables to exactly 0 or 1. Formally, μ̂ is the new pseudodistribution where each variable u that is O(δ)-fixed in μ̃ is fixed to the bit retaining 1 − O(δ) of the probability mass of μ̃_u.

We now zoom in on a constraint P_{u,v,w} with only positive literals where, say, u has been fixed to 1 in μ̂. For convenience, call γ the probability that P_{u,v,w} is unsatisfied by an assignment sampled from μ̃_{u,v,w}. One can observe that the probability of P_{u,v,w} being violated by an assignment sampled from μ̂_{u,v,w} equals μ̂_{u,v,w}(1,1,1): the assignment (0,0,0) has probability mass 0 in μ̂_{u,v,w}, since we fixed u to 1. Then, with the help of Table 2, we can convince ourselves that the probability of P_{u,v,w} being unsatisfied by an assignment sampled from μ̂_{u,v,w} is no larger than γ up to an additive O(δ) error.

Table 2: Distribution of the probability mass in μ̃_{u,v,w} over the assignments of a constraint P_{u,v,w} where u is O(δ)-fixed in μ̃ and where γ is the probability that P_{u,v,w} is unsatisfied by an assignment sampled from μ̃_{u,v,w}, assuming that P_{u,v,w} has only positive literals.

From the above discussion, we conclude that the fractional objective value val(μ̂) of μ̂ is at most val(μ̃) + O(δ). Recalling that 𝔼[val(μ̃)] = val(μ) = δ, Insight 3 can then be refined as follows.
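A small numeric illustration of the thresholding step (again with a hypothetical local distribution of our choosing): fixing an O(δ)-fixed variable transports probability mass rather than discarding it, and the violation probability grows by at most the fixed variable's residual mass.

```python
# Hypothetical local distribution tilde-mu_{u,v,w} for an all-positive NAE
# clause in which u leans towards 1: tilde-mu_u(0) = delta = 0.09.
tmu = {(1, 0, 0): 0.40, (1, 0, 1): 0.25, (1, 1, 0): 0.20, (1, 1, 1): 0.06,
       (0, 0, 0): 0.02, (0, 0, 1): 0.03, (0, 1, 0): 0.02, (0, 1, 1): 0.02}
gamma = tmu[(0, 0, 0)] + tmu[(1, 1, 1)]               # violation prob in tilde-mu
delta = sum(p for a, p in tmu.items() if a[0] == 0)   # residual mass of u

# Threshold u to 1: transport the mass of every (0,b,c) onto (1,b,c).
hat = {}
for a, p in tmu.items():
    target = (1,) + a[1:]
    hat[target] = hat.get(target, 0.0) + p

viol_hat = hat[(1, 1, 1)]        # (0,0,0) now has mass 0; only (1,1,1) violates
assert viol_hat <= gamma + delta  # the increase is at most O(delta)
```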

Insight 4.

By sampling a random pair of variables, conditioning on a random assignment to them, and performing the thresholding, we can construct a new pseudodistribution μ̂ such that 𝔼[val(μ̂)] = O(δ) and, with probability at least 1/1000, at least n/1000 variables u are integral in μ̂.

A formal version of this fact can be obtained by combining Lemma 4.1 and Lemma 4.2.

2.2 Towards an algorithm

It would be tempting to employ Insight 4 to get a rounding scheme like the one outlined in Algorithm 1: sample a pair of variables and a random assignment, construct μ̂ as above, and recurse on the sub-instance induced by the variables that are still not integral (or halt if there is no such variable). The intuition behind Algorithm 1 is backed up by a trifecta.

  1.

    First, Insight 4 applies also to the sub-instance that we recurse on: it essentially only relies on Insight 2, which also holds on the sub-instance since ℐ is complete (using the fact that being complete is a “hereditary property”, as anticipated).

  2.

    Second, we expect that O(log n) levels of recursion will make every variable integral, by Insight 4.

  3.

    Third, the expected value of the (integral) pseudodistribution obtained at the end is roughly O(δ) ≤ O(OPT_ℐ), also by Insight 4.

However, more work is needed, since Insight 4 only bounds the cost in expectation at each individual recursion level.

Algorithm 1 Conditions on a random pair of variables and thresholds the result.

Obstacle: the need for a high probability guarantee.

In order to materialize the strategy of Algorithm 1, we need to control the deviations of val(μ̂) from δ at every level of the recursion. More precisely, let μ^{(i)} be the pseudodistribution at the beginning of the i-th recursion level, and let δ_i be the fractional objective value val(μ^{(i)}) of the pseudodistribution μ^{(i)}.

For the sake of exposition, let us here ignore the hidden constant in O(δ) and assume 𝔼[val(μ^{(i)})] = val(μ^{(i−1)}) (the ignored error comes from the clauses at least one of whose variables is completely fixed by the thresholding step, so it can be controlled via the concept of the aggregate value defined in equation (1) of Section 4). With this assumption, at every level i of the recursion we have

𝔼_{(i−1)-th level}[val(μ^{(i)})] = δ_{i−1} = val(μ^{(i−1)}).

Applying this identity at every level, we get

𝔼_{algorithm}[val(μ^{(#levels)})]
= 𝔼_{1-st level}[ ⋯ 𝔼_{(#levels−2)-th level}[ 𝔼_{(#levels−1)-th level}[val(μ^{(#levels)})] ] ⋯ ]
= δ_1
≤ OPT_ℐ,

where the last inequality follows assuming that the pseudodistribution on which we first call the algorithm is optimal. Now, one can intuitively see that the deviation of the last pseudodistribution μ^{(#levels)} from its expected objective value translates into the approximation ratio achieved by the algorithm. Therefore, we are interested in bounding this deviation. To do so, we have to peel off one expectation operator at a time in the above equation by means of Markov's inequality. At every level i, the guarantee would then be val(μ^{(i)}) ≤ λ·val(μ^{(i−1)}) with probability at least 1 − 1/λ. Since this gives

val(μ^{(#levels)}) ≤ λ^{#levels}·OPT_ℐ,

we need a λ such that λ^{#levels} = Õ(1). Recalling that #levels = Ω(log n), we must hence require λ ≤ 1 + 1/log n. Unfortunately, Markov's inequality ensures a (1 + 1/log n) multiplicative increase with only p_val = O(1/log n) success probability. If we call p_fix = 1/1000 the probability that μ̂ fixes n/1000 variables, we realize that

(1 − p_val) + (1 − p_fix) ≥ 1.

We are therefore unable to conclude that there exists a choice of x, y, β_x, β_y for which we both have that μ^{(i)} fixes n/1000 variables and val(μ^{(i)}) ≤ (1 + 1/log n)·val(μ^{(i−1)}). For the left-hand side above to be less than one, we need p_fix ≥ 1 − O(1/log n).
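The arithmetic behind this tension can be checked directly: with λ = 1 + 1/log n, the compounded loss over Θ(log n) levels stays O(1), but Markov's inequality grants this slack with probability only Θ(1/log n) per level (the concrete value of n below is an arbitrary illustration).

```python
import math

n = 10 ** 6
levels = math.ceil(math.log(n))   # number of recursion levels, Theta(log n)
lam = 1 + 1 / math.log(n)         # per-level multiplicative slack

# Compounded over all levels, the loss stays constant: (1 + 1/log n)^{log n} <= e.
assert lam ** levels <= math.e * lam   # small cushion for the ceiling

# But Markov's inequality yields this slack with probability only 1 - 1/lam,
# which is Theta(1/log n) -- far below the 1 - O(1/log n) needed alongside p_fix.
p_val = 1 - 1 / lam
assert p_val < 0.1
```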

Advantage: polylog(n)-degree pseudodistributions.

Recalling that we allow our algorithm to run in n^{polylog(n)} time, we can assume to have a polylog(n)-degree pseudodistribution at our disposal. Supported by this, the natural thing to do is modifying each level of the recursion as follows: condition on O(log log n) random pairs (as opposed to one) and on their respective assignments. While this should boost p_fix to 1 − O(1/log n), we now have more conditioning steps that could deviate from the expectation. However, suppose we could condition on a random set of O(log log n) pairs of variables whose joint local distribution is close (in total variation distance) to the product of their marginal distributions: then we would both obtain the high probability guarantee and simultaneously have only one deviation from the objective value to control. A standard way to ensure that the joint distribution of t (pairs of) variables is ϵ-close to independent is to additionally condition on poly(2^t/ϵ) variables [33, 36]. Since we want t = O(log log n) and ϵ = 1/polylog(n), we can afford to do so with our polylog(n)-degree pseudodistribution.

A formal version of this algorithm is presented and analyzed in Section 4.

3 Notation

CSPs.

Let k ≥ 2 be an integer and let V be a set of n variables. A k-CSP is a pair ℐ = (V, 𝒫), where V is a set of variables over the Boolean alphabet {0,1} and 𝒫 represents the constraints. Each constraint P_S ∈ 𝒫 is defined by a k-subset of variables S ∈ (V choose k) and a predicate P_S: {0,1}^S → {0,1}, where 1 means that the constraint is satisfied and 0 that it is not. We will consider complete k-CSPs, so that there is a predicate P_S ∈ 𝒫 for every S ∈ (V choose k). Furthermore, each constraint P_S is unsatisfied by at least one assignment to the variables in S. Then, for a global assignment α ∈ {0,1}^V to the variables, we let

val(ℐ, α) = Pr_{S ∼ (V choose k)}[P_S(α_S) = 0],

where S ∼ (V choose k) indicates that S is sampled uniformly from the elements of (V choose k), and α_S denotes the restriction of α to the entries in S.

In the decision version of a k-CSP instance ℐ, the goal is to decide if there is an assignment α such that val(ℐ, α) = 0. In the minimization version, which we denote by Min-k-CSP, the goal is to find an assignment α which minimizes val(ℐ, α), so we define

OPT(ℐ) = min_{α ∈ {0,1}^V} val(ℐ, α).

Sometimes we will write val(α) as a shorthand for val(ℐ, α) when ℐ is clear from the context. For a subset of variables W ⊆ V, we denote by ℐ[W] the sub-instance of ℐ induced by the variables in W.
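As an illustration of these definitions, val and OPT for a tiny complete instance can be computed as follows; the dictionary-based encoding and function names are ours, and the brute-force search is for sanity checks only.

```python
from itertools import combinations, product

def val(V, predicates, alpha):
    """Fraction of violated constraints: Pr over S of P_S(alpha_S) = 0.
    `predicates` maps each k-tuple S to a 0/1 predicate on assignment dicts."""
    scopes = list(predicates)
    bad = sum(1 for S in scopes
              if predicates[S]({v: alpha[v] for v in S}) == 0)
    return bad / len(scopes)

def brute_force_opt(V, predicates):
    """OPT by exhaustive search; exponential in |V|, for tiny examples only."""
    return min(val(V, predicates, dict(zip(V, bits)))
               for bits in product((0, 1), repeat=len(V)))

# Complete NAE-3-SAT instance on 4 variables, all literals positive.
nae = lambda a: 1 if len(set(a.values())) >= 2 else 0
V = (0, 1, 2, 3)
preds = {S: nae for S in combinations(V, 3)}
```

For this instance the all-zero assignment violates every clause (val = 1), while OPT = 0.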

NAE-3-SAT.

A complete NAE-3-SAT instance is a complete 3-CSP ℐ = (V, 𝒫) where each constraint (also called a clause) P_S ∈ 𝒫 is a “not all equal” predicate on three literals, each literal being either a variable or its negation. For our convenience, we define this formally as follows: for each S ∈ (V choose 3) and each u ∈ S, we have a polarity bijection P_S^u: {0,1} → {0,1} determining the literal pattern of the variable u in S, and the constraint P_S is defined as

P_S(α) = 1 if |{P_S^u(α_u)}_{u∈S}| ≥ 2, and P_S(α) = 0 otherwise,

for every α ∈ {0,1}^S.
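A literal transcription of this predicate, with an explicitly chosen encoding that is our assumption rather than the paper's: we represent the polarity bijection by a bit, P_S^u(b) = b XOR polarity[u], so polarity[u] = 1 means the literal of u is negated.

```python
def make_nae_constraint(polarity):
    """NAE-3-SAT clause on the 3-set S = polarity.keys().

    `polarity` maps each variable u in S to a bit; the literal of u is
    P_S^u(b) = b XOR polarity[u], i.e. negated iff polarity[u] == 1.
    """
    def P(alpha):  # alpha: dict variable -> bit, an assignment to S
        literals = {alpha[u] ^ polarity[u] for u in polarity}
        # Satisfied iff the literals take at least two distinct values.
        return 1 if len(literals) >= 2 else 0
    return P
```

For instance, the clause NAE(x, y, ¬z) is violated exactly when all three literals agree.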

Sherali-Adams notation.

For d ≥ k, we consider the degree-d Sherali-Adams relaxation of OPT(ℐ), which can be written as

minimize Pr_{S ∼ (V choose k), α ∼ μ_S}[P_S(α) = 0]   subject to μ ∈ 𝒟(V, d),

where 𝒟(V, d) is the set of pseudodistributions of degree d over the Boolean variables V, i.e., every element μ of 𝒟(V, d) is indexed by the subsets S ⊆ V with |S| ≤ d, where each μ_S is a distribution over assignments α ∈ {0,1}^S to the variables in S, with the property that

for all S, T ⊆ V and all β ∈ {0,1}^{S∩T} with |S ∪ T| ≤ d:   Pr_{α ∼ μ_S}[α_{S∩T} = β] = Pr_{α ∼ μ_T}[α_{S∩T} = β].

Moreover, for any α ∈ {0,1}^S, we let μ_S(α) ∈ [0,1] denote the probability of sampling α from μ_S. For convenience, we also let

val(ℐ, μ) = Pr_{S ∼ (V choose k), α ∼ μ_S}[P_S(α) = 0]

be the fractional objective value of a pseudodistribution μ. Additionally, for W ⊆ V, we denote by μ[W] ∈ 𝒟(W, d) the restriction of μ to the variables in W. Again, we will use the shorthand val(μ) when ℐ is clear from the context.

Conditioning and fixing notation.

For μ ∈ 𝒟(V, d), S ⊆ V with |S| ≤ d, and β ∈ supp(μ_S), we denote by μ|_{S←β} ∈ 𝒟(V, d − |S|) the pseudodistribution obtained from μ by conditioning on the event that the variables in S are assigned β. Formally, μ|_{S←β} is the pseudodistribution defined for each T ⊆ V with |T| ≤ d − |S| and each α ∈ {0,1}^T as

(μ|_{S←β})_T(α) = μ_{S∪T}(α ∘ β) / μ_S(β) if α_{S∩T} = β_{S∩T}, and 0 otherwise,

where α ∘ β ∈ {0,1}^{T∪S} is the vector assigning α to T and β_{S∖T} to S∖T. Furthermore, we denote by μ_{S→β} ∈ 𝒟(V, d) the pseudodistribution obtained by fixing the variables in S to take the assignment β, which results from moving all the probability mass of μ_{{v_i}} onto the bit b_i, for each v_i ∈ S and the corresponding entry b_i of β. Formally, μ_{S→β} is the pseudodistribution defined for each T ⊆ V with |T| ≤ d and each α ∈ {0,1}^T as

(μ_{S→β})_T(α) = Σ_{α′ ∈ {0,1}^T : α′_{T∖S} = α_{T∖S}} μ_T(α′) if α_{S∩T} = β_{S∩T}, and 0 otherwise.
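For intuition, the two operations can be written out in the special case where μ is an actual joint distribution over all of V (i.e., degree d = n, which is a pseudodistribution in particular); the tuple-based representation is ours.

```python
def condition(joint, S, beta):
    """mu|_{S <- beta}: keep assignments agreeing with beta on S, renormalise.
    `joint` maps full assignments (bit tuples indexed by variable) to
    probabilities; S is a tuple of variable indices, beta the matching bits."""
    agrees = lambda a: all(a[v] == b for v, b in zip(S, beta))
    mass = sum(p for a, p in joint.items() if agrees(a))
    return {a: p / mass for a, p in joint.items() if agrees(a)}

def fix(joint, S, beta):
    """mu_{S -> beta}: overwrite the S-coordinates with beta, transporting
    (rather than discarding) the probability mass of each assignment."""
    out = {}
    for a, p in joint.items():
        b = list(a)
        for v, bit in zip(S, beta):
            b[v] = bit
        out[tuple(b)] = out.get(tuple(b), 0.0) + p
    return out
```

Note the difference: conditioning renormalises the surviving mass, while fixing merely relocates mass and so keeps marginals of untouched variables intact.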

Variable vectors and variable sets.

We will use the convention that for a vector 𝐜 ∈ V^ℓ written in boldface, the regular capital letter C = {𝐜_i}_{i ∈ [ℓ]} denotes the set formed by the values taken by its coordinates.

4 Polylogarithmic approximation for complete Min-NAE-3-SAT

In this section, we present and analyze the rounding scheme that we apply to the Sherali-Adams solution. Hereafter, we fix an instance ℐ of NAE-3-SAT with variable set V, where |V| = n. In order to describe the rounding algorithm, we first introduce the necessary notions.

Fixed variables, ruling value, ruling assignment.

Given a subset W ⊆ V, a pseudodistribution ρ ∈ 𝒟(W, d), and a threshold ξ ∈ [0, 1/2), a variable v is called (ρ, ξ)-fixed if for some bit b ∈ {0,1} the probability ρ_{{v}}(b) is at most ξ. Moreover, the corresponding bit 1 − b, which has probability at least 1 − ξ, is called the ruling value of v. Also, we define F_W(ρ, ξ) to be the set of all the (ρ, ξ)-fixed variables in W. Moreover, we define ω_W(ρ, ξ) ∈ {0,1}^{F_W(ρ, ξ)} to be the corresponding ruling assignment, where the entry for v ∈ F_W(ρ, ξ) equals the ruling value of v in ρ.
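In code, given the singleton marginals ρ_{{v}}(1) (the representation is ours), F_W(ρ, ξ) and ω_W(ρ, ξ) are simply:

```python
def fixed_variables(marginals, xi):
    """Return F_W(rho, xi) and the ruling assignment omega_W(rho, xi).
    `marginals` maps each variable v in W to rho_{v}(1), the probability
    that v is assigned 1; the threshold xi must lie in [0, 1/2)."""
    # v is (rho, xi)-fixed iff one of its two marginal values is <= xi.
    F = {v for v, p1 in marginals.items() if min(p1, 1 - p1) <= xi}
    # The ruling value is the bit carrying probability >= 1 - xi.
    omega = {v: (1 if marginals[v] >= 1 - xi else 0) for v in F}
    return F, omega
```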

LP value of constraint classes.

Given a set V_U ⊆ V of variables (which can be thought of as the set of unfixed variables), we classify the constraints of ℐ = (V, 𝒫) into four groups: for i ∈ {0,1,2,3}, we let ℐ_i{V_U} = (V, 𝒫_i{V_U}) be the sub-instance with constraints 𝒫_i{V_U} = {P_S ∈ 𝒫 : |S ∩ V_U| = i}, i.e., ℐ_i{V_U} can be thought of as the sub-instance consisting of the constraints with exactly i unfixed variables. Then, given a pseudodistribution ρ ∈ 𝒟(V, d), we define for each i ∈ {0,1,2,3} the LP value of the constraints in ℐ_i{V_U} as

$$\mathrm{LP}_i^{V_U}(\rho) \;=\; \sum_{P_S \in \mathcal{P}_i\{V_U\}} \Pr_{\alpha\sim\rho_S}\big[P_S(\alpha)=0\big],$$

so LPiVU(ρ) can be thought of as the contribution of the constraints with exactly i unfixed variables to the Sherali-Adams objective.

Aggregate LP value.

For analyzing our algorithm, we treat the quantities LP0VU(ρ), LP1VU(ρ), LP2VU(ρ) and LP3VU(ρ) separately. However, for the sake of certain probability bounds, it turns out to be useful to combine these into a single weighted sum: given a real parameter τ>0, a set VUV, and a pseudodistribution ρ𝒟(V,d), we define the aggregate value of ρ as the weighted sum

$$A_\tau^{V_U}(\rho) \;=\; \tau\log^3 n\cdot \mathrm{LP}_3^{V_U}(\rho) \;+\; \log^2 n\cdot \mathrm{LP}_2^{V_U}(\rho) \;+\; \log n\cdot \mathrm{LP}_1^{V_U}(\rho) \;+\; \mathrm{LP}_0^{V_U}(\rho). \tag{1}$$

We might drop the subscript τ when it is clear from context (this should not create any confusion, as we will use one fixed value of τ throughout the algorithm and analysis).
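For concreteness, Equation (1) can be evaluated directly from the four LP values; the list interface below is a hypothetical representation chosen for the example.

```python
import math

def aggregate_value(lp, tau, n):
    """Aggregate value of Equation (1): lp[i] plays the role of
    LP_i^{V_U}(rho) for i = 0, 1, 2, 3 (hypothetical representation)."""
    L = math.log(n)
    return tau * L**3 * lp[3] + L**2 * lp[2] + L * lp[1] + lp[0]
```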

Thresholds with bounded increase.

As per the definition above, the notion of fixed variables depends on a threshold ξ. In our algorithm and analysis, we will want to use a threshold that ensures a certain bound on the additive increase in the aggregate value. More precisely, given τ > 0, V_U ⊆ V, ρ ∈ 𝒟(V,d), and a parameter δ ∈ [0,1], we define the set of thresholds with bounded increase to be the set of all θ ∈ [0,1/2) such that

$$A^{V_U\setminus F_{V_U}(\rho,\theta)}\big(\rho_{F_{V_U}(\rho,\theta)\to\omega_{V_U}(\rho,\theta)}\big) \;-\; A^{V_U}(\rho) \;\le\; \frac{6}{\log n}\,A^{V_U}(\rho) \;+\; 12\,\tau\delta\log^2 n\binom{|V_U|}{3},$$

and we denote this set by Θ_τ^{V_U}(ρ,δ). As for A_τ^{V_U}(ρ), we drop the subscript τ when it is clear from context.

𝟐-SAT instances.

Given V_U ⊆ V and a partial assignment α ∈ {0,1/2,1}^V such that α_v ∈ {0,1} for all v ∈ V∖V_U, the Min-2-SAT instance induced by V_U and α is defined as follows: its variable set is V_U, and its constraint set 𝒫_2^{V_U,α} is a multiset of 2-SAT constraints that associates to each P_S ∈ 𝒫_1{V_U} ∪ 𝒫_2{V_U} exactly one constraint P'_S : {0,1}^{S∩V_U} → {0,1}, defined as P'_S(α') = P_S(α' ∘ α_{S∖V_U}) for all α' ∈ {0,1}^{S∩V_U}, where α' ∘ α_{S∖V_U} ∈ {0,1}^S is the vector assigning α' to S∩V_U and α to S∖V_U. We remark that since every P_S ∈ 𝒫 is a NAE-3-SAT constraint, every P'_S ∈ 𝒫_2^{V_U,α} is a 2-SAT or 1-SAT clause.
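The restriction of a NAE-3-SAT constraint with one or two fixed variables can be carried out literal by literal, as in the following sketch (the clause and assignment representations are hypothetical): a NAE clause is violated exactly when all three literals take the same value, so once the fixed literals agree on a value b, the residual requirement is that some unfixed literal differ from b.

```python
def restrict_nae_clause(lits, alpha):
    """Restrict a NAE-3 clause to its unfixed variables.

    `lits` is a list of three literals (variable, negated); `alpha` maps
    each variable to 0, 1, or 0.5 (unfixed).  Assumes one or two of the
    variables are fixed.  Returns None if the restricted constraint is
    already satisfied, else the list of literals whose disjunction must
    hold (a 1-clause or a 2-clause).
    """
    fixed_vals, unfixed = [], []
    for v, neg in lits:
        if alpha[v] == 0.5:
            unfixed.append((v, neg))
        else:
            fixed_vals.append(1 - alpha[v] if neg else alpha[v])
    assert 1 <= len(fixed_vals) <= 2
    if len(set(fixed_vals)) > 1:
        return None  # two fixed literals already differ: NAE holds
    b = fixed_vals[0]
    # Some unfixed literal must take value 1 - b: keep the literal as-is
    # when b = 0, and negate it when b = 1.
    return [(v, not neg) if b == 1 else (v, neg) for v, neg in unfixed]
```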

4.1 Algorithm outline

Algorithm 2 gives our rounding scheme. Throughout its execution, the input instance (V,𝒫) remains unchanged, as do the parameters r, t, ϵ, τ, which depend only on n = |V| and on the fixed universal constant K = 10^{20}.

Algorithm 2 round-pd(,d,μ,α,).

The algorithm maintains a Sherali-Adams solution μ and a partial assignment α. At any intermediate point, some variables V_F are completely fixed: these are the variables v for which α_v has already been determined and α_v ∈ {0,1}. The remaining variables V_U are the unfixed variables: those with α_v = 1/2, i.e., whose assignment is not yet decided. The algorithm is called as round-pd(, d, μ, α^{(0)}, 0), where the Sherali-Adams solution μ is obtained by solving a degree-d Sherali-Adams LP for Min-NAE-3-SAT, with d = polylog(n) to be determined in the analysis below.

On a high level, our algorithm proceeds in stages. At each stage, if the number of unfixed variables is small (specifically |V_U| < K²(log n)^{100}), we simply solve the instance by brute force (line 3). Otherwise, if the value of the sub-instance induced on V_U is large, namely val([V_U], μ[V_U]) > 1/(10τ), we show that the instance is essentially equivalent to 2-SAT, and solve it using a standard LP rounding for 2-SAT (line 9, see Lemma 4.5). The interesting case is when neither of these two cases occurs: we then proceed via a conditioning step and a thresholding step, followed by a recursive call. In the conditioning step (lines 12-18), the algorithm identifies a small set of variables C and an assignment γ to these variables, and conditions on the event that C is assigned γ, obtaining a new pseudodistribution μ̃. Ideally, μ̃ is such that many of the variables in V_U are (μ̃, τδ)-fixed. The thresholding step (lines 19-22) then rounds these variables to their ruling assignment, resulting in a new pseudodistribution μ̂. After updating α_v, for each newly rounded variable v, to its (now integral) value in μ̂, the algorithm continues with a recursive call.

4.2 Analysis

In each stage, after the thresholding step is completed, we will show that the number of unfixed variables (the variables with α_v = 1/2) decreases by a constant factor; hence the number of stages is at most O(log n). Thus, if we could show that the conditioning and thresholding steps do not increase the value of the Sherali-Adams solution μ by more than a 1+O(1/log n) factor, we would obtain an integral assignment with a constant approximation ratio at the end of O(log|V|) stages. While it is unclear whether such a scheme is possible, we instead track the growth of the aggregate value A_τ^{V_U}(μ) = Θ(polylog(n)·val(μ)), and show that A_τ^{V_U}(μ) does not increase by more than a 1+O(1/log n) factor in each stage. Hence, A_τ^{V_U}(μ) always remains within a constant factor of A_τ^V of the initial Sherali-Adams solution. By the definition of the aggregate value, a constant-factor bound on A_τ^{V_U}(μ) implies a polylog(n)-approximation, giving us the desired result. The rest of this section is devoted to formalizing and proving these statements.

We begin by analyzing the conditioning step. Informally, we show that if we condition on a small random set of variables and a suitable assignment sampled from their local distribution, then many variables will be fixed with good probability in the obtained conditional pseudodistribution. More precisely, we have the following lemma (see full version for proof):

Lemma 4.1 (Conditioning-to-Fixing).

Let d, r, t be positive integers, let W ⊆ V, let ρ ∈ 𝒟(W,d), and let τ ≥ 1. If val([W],ρ) ≤ 1/(3τ), |W| ≥ 3, d ≥ r + 2t + 3, r ≥ 10³·3^{2t}, and t ≥ 10^6, then there exists r' ∈ {0, …, r} such that

$$\Pr_{\mathbf{c}\sim W^{r'+2t},\;\gamma\sim\rho_C}\left[\Big|F_W\Big(\rho|_{C\to\gamma},\;\tau\cdot\mathrm{val}([W],\rho)\Big)\Big|\ge\frac{|W|}{100}\right] \;\ge\; 1-\left(\frac{10^7}{\tau}+\frac{5\cdot 3^{2t/3}}{r^{1/3}}+\frac{5t^2}{|W|}+3e^{-t/10^6}\right).$$

Next, we analyze the change in the aggregate value from the thresholding step. Informally, we bound the additive increase of the aggregate value after line 22 compared to the aggregate value before line 19. More precisely, we have the following lemma (see full version for proof):

Lemma 4.2 (Thresholding).

Let d be a positive integer, let ρ ∈ 𝒟(V,d), let V_U ⊆ V be such that for all v ∈ V∖V_U there is b ∈ {0,1} satisfying ρ_{{v}}(b) = 1, and let τ ≥ 1 and δ ∈ [0, 1/(10τ)]. Then, there exists θ ∈ [τδ, 2τδ] such that θ ∈ Θ_τ^{V_U}(ρ,δ). Moreover, such a value can be found in time poly(|V_U|).

The idea is to combine the two lemmas above to ensure that when we reach the base of the recursion we have a pseudodistribution μ whose aggregate value is within a constant factor of the original one.

Lemma 4.3.

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Then, running Algorithm 2 as round-pd(, K² log^c n, μ, α^{(0)}, 0) must reach line 3 or line 9 with an integer d ≥ 3, at recursion depth at most 100 log n, with a pseudodistribution μ' ∈ 𝒟(V,d) and a set V_U ⊆ V such that

$$A^{V_U}(\mu') \;\le\; e^{5000}\cdot A^{V}(\mu) \;=\; O(1)\cdot A^{V}(\mu).$$

Proof.

Notice that before we reach line 3 or 9, every recursive call of Algorithm 2 involves one conditioning step and one thresholding step. Consider in particular a call of Algorithm 2 that starts from a pseudodistribution μ ∈ 𝒟(V,d) and a set of unfixed variables V_U ⊆ V such that neither line 3 nor line 9 is reached. Then, the conditioning step produces a pseudodistribution μ̃. This step changes neither the assignment α nor the set of unfixed variables V_U. Next, the thresholding step rounds some variables in μ̃, producing a pseudodistribution μ̂, and updates the assignment α. Let us call the updated assignment α', and let V_U' = {v ∈ V : α'_v = 1/2} be the set of unfixed variables with respect to α'. For convenience, let us define A(μ) := A^{V_U}(μ), A(μ̃) := A^{V_U}(μ̃), and A(μ̂) := A^{V_U'}(μ̂). In words, A(μ) and A(μ̃) are defined with respect to the set V_U, whereas A(μ̂) is defined with respect to the set V_U' obtained after thresholding. We will show two things: first, that |V_U'| ≤ (99/100)·|V_U|, and second, that the increase A(μ̂) − A(μ) in the aggregate value is at most 50·A(μ)/log n.

Recall our setting of parameters: K = 10^{20}, r = (log n)^{100K}, t = K·log log n, ϵ = 1/(10 log n), τ = K·log² n, and c = 200K². Since lines 3 and 9 are not reached, we may assume that |V_U| ≥ K² log^{100} n and that val([V_U], μ[V_U]) ≤ 1/(10τ). By the guarantee of Lemma 4.1, applied with W = V_U and ρ = μ[V_U], there exists r' ∈ {0, …, r} such that

$$\Pr_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\left[\Big|F_{V_U}\Big(\mu|_{C\to\gamma},\;\tau\cdot\mathrm{val}([V_U],\mu[V_U])\Big)\Big|\ge\frac{|V_U|}{100}\right] \;\ge\; 1-\left(\frac{10^7}{\tau}+\frac{5\cdot 3^{2t/3}}{r^{1/3}}+\frac{5t^2}{|V_U|}+3e^{-t/10^6}\right) \;\ge\; 1-\frac{1}{100\log n}.$$

We also observe that for all i{0,1,2,3} we have

$$\mathbb{E}_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\big[\mathrm{LP}_i^{V_U}(\mu|_{C\to\gamma})\big] \;=\; \mathrm{LP}_i^{V_U}(\mu),$$

which follows from the tower property of conditional expectation. (Note that while the conditioning "fixes" the variables we condition on, we do not yet move these variables from V_U to V_F; this happens only after the thresholding step. Hence the sets V_U and V_F remain unchanged, and a constraint contributing to LP_i^{V_U}(μ) continues to contribute to LP_i^{V_U}(μ|_{C→γ}) for each i ∈ {0,1,2,3}.) This in turn means that

$$\mathbb{E}_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\big[A^{V_U}(\mu|_{C\to\gamma})\big] \;=\; A^{V_U}(\mu) \;=\; A(\mu).$$

By Markov’s inequality, it follows that

$$\Pr_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\big[A^{V_U}(\mu|_{C\to\gamma}) \ge (1+\epsilon)\,A(\mu)\big] \;\le\; 1-\epsilon/2 \;=\; 1-\frac{1}{20\log n}.$$

It follows that there exists a choice of 𝐜 ∈ V_U^{r'+2t} and γ ∈ supp(μ_C) such that we simultaneously have

$$\Big|F_{V_U}\Big(\mu|_{C\to\gamma},\;\tau\cdot\mathrm{val}([V_U],\mu)\Big)\Big| \ge \frac{|V_U|}{100} \quad\text{and}\quad A^{V_U}(\mu|_{C\to\gamma}) \le (1+\epsilon)\,A(\mu).$$

In particular, we must have

$$\Big|F_{V_U}\Big(\tilde{\mu},\;\tau\cdot\mathrm{val}([V_U],\mu)\Big)\Big| \ge \frac{|V_U|}{100} \quad\text{and}\quad A(\tilde{\mu}) \le (1+\epsilon)\,A(\mu).$$

Next, we analyze the thresholding step. Note that the pseudodistribution μ̂ is obtained by rounding μ̃. Applying Lemma 4.2 with ρ = μ̃, it follows that

$$A(\hat{\mu}) - A(\tilde{\mu}) \;\le\; \frac{6}{\log n}\,A(\tilde{\mu}) \;+\; 12\,\tau\delta\binom{|V_U|}{3}\log^2 n.$$

Note that $\tau\delta\binom{|V_U|}{3}\log^2 n = \tau\,\mathrm{LP}_3^{V_U}(\mu)\log^2 n \le A(\mu)/\log n$, since $\delta\binom{|V_U|}{3} = \mathrm{LP}_3^{V_U}(\mu)$. This gives

$$A(\hat{\mu}) - A(\tilde{\mu}) \;\le\; \frac{20\,\big(A(\tilde{\mu}) + A(\mu)\big)}{\log n}.$$

Using the fact that A(μ̃) ≤ (1+ϵ)·A(μ), it follows that

$$A(\hat{\mu}) - A(\tilde{\mu}) \;\le\; \frac{45\,A(\mu)}{\log n}.$$

Finally, combining the conditioning and rounding steps, we obtain

$$A(\hat{\mu}) - A(\mu) \;=\; \big(A(\hat{\mu}) - A(\tilde{\mu})\big) + \big(A(\tilde{\mu}) - A(\mu)\big) \;\le\; \frac{45\,A(\mu)}{\log n} + \epsilon\,A(\mu) \;\le\; \frac{50\,A(\mu)}{\log n},$$

where the last inequality follows since ϵ = 1/(10 log n).

Note that after each stage of conditioning and thresholding described above, we fix at least a 1/100 fraction of the unfixed variables V_U. It follows that after at most 100 log n such stages, the pseudodistribution μ and the set of unfixed variables V_U satisfy A^{V_U}(μ) ≤ (1 + 50/log n)^{100 log n}·A^V(μ) ≤ e^{5000}·A^V(μ).
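The last inequality is the standard estimate $1+x \le e^x$ applied with $x = 50/\log n$:

$$\Big(1+\frac{50}{\log n}\Big)^{100\log n} \;\le\; \exp\Big(\frac{50}{\log n}\cdot 100\log n\Big) \;=\; e^{5000}.$$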

The following two lemmas, whose proofs are deferred to Sections 6.1 and 6.2 respectively, prove the guarantees for the two base cases of Algorithm 2. When the number of unfixed variables becomes sufficiently small (line 3), we can compute the optimal solution by brute force (Lemma 4.4). When δ = LP_3(μ)/\binom{|V_U|}{3} is large (line 9), we can simply ignore the clauses all three of whose variables are unfixed, which reduces the problem to Min-2-SAT (Lemma 4.5), for which a polylog(n)-approximation is well known.

Lemma 4.4 (Brute force base case).

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 3. Then, the output α of line 3 satisfies val(α) ≤ val(μ).

Lemma 4.5 (Min-2-SAT base case).

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 9. Then, the output α of line 9 satisfies val(α) ≤ O(log n)·A^{V_U}(μ)/\binom{|V_U|}{3}.

We finish the proof of our main result.

Proof of Theorem 1.1.

Let μ ∈ 𝒟(V, K² log^c n) be obtained by solving the Sherali-Adams relaxation, where c = 200K². Run Algorithm 2 as round-pd(, K² log^c n, μ, α^{(0)}, 0), where α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Lemma 4.3 guarantees that line 3 or line 9 is reached with a pseudodistribution μ and a set of unfixed variables V_U such that A^{V_U}(μ) ≤ O(1)·A^V(μ). There are now two cases, depending on which line is reached when the algorithm terminates.

  • Line (3) is reached. In this case, by Lemma 4.4, we obtain an assignment α with val(α) ≤ val(μ). Note that

    $$\mathrm{val}(\mu) \;=\; \frac{1}{\binom{|V|}{3}}\sum_{i=0}^{3}\mathrm{LP}_i^{V_U}(\mu) \;\le\; \frac{1}{\binom{|V|}{3}}\,A^{V_U}(\mu).$$

    This quantity is at most

    $$\frac{1}{\binom{|V|}{3}}\cdot O(1)\cdot A^{V}(\mu) \;\le\; \frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^3 n)\cdot\sum_{i=0}^{3}\mathrm{LP}_i^{V}(\mu),$$

    which is equal to

    $$\frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^3 n)\cdot \mathrm{LP}_3^{V}(\mu) \;=\; O(\tau\log^3 n)\cdot \mathrm{val}(\mu),$$

    as desired.

  • Line (9) is reached. In this case, by Lemma 4.5, we obtain an assignment α with val(α) ≤ O(log n)·A^{V_U}(μ)/\binom{|V_U|}{3}. Again, we have

    $$\mathrm{val}(\alpha) \;\le\; \frac{1}{\binom{|V|}{3}}\cdot O(\log n)\cdot A^{V_U}(\mu) \;\le\; \frac{1}{\binom{|V|}{3}}\cdot O(\log n)\cdot A^{V}(\mu),$$

    which is at most

    $$\frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^4 n)\cdot\sum_{i=0}^{3}\mathrm{LP}_i^{V}(\mu) \;=\; \frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^4 n)\cdot \mathrm{LP}_3^{V}(\mu).$$

    By definition, the above is O(τ log⁴ n)·val(μ). Finally, recalling that τ = K log² n = O(log² n), we get val(α) ≤ O(log⁶ n)·val(μ). This concludes the proof of correctness.

For the running time, note that in a given stage of the algorithm, involving one conditioning and one thresholding step, the bottleneck is guessing the set of r + 2t variables and their assignments to condition on. Since r + 2t = O(log^{100K} n), this step takes quasi-polynomial time. There are at most 100 log n stages by Lemma 4.3, so the total time remains quasi-polynomial. Finally, we can obtain a degree-d Sherali-Adams solution in n^{O(d)} time; recall that we solve a degree-K² log^c n Sherali-Adams relaxation to obtain μ, where c = 200K², so this step takes quasi-polynomial time as well. It follows that the overall running time is quasi-polynomial.

5 Polynomial time algorithm for complete 𝒌-CSPs

In this section, we show a simple algorithm that, given any complete n-variable k-CSP instance (V,𝒫) over the Boolean alphabet {0,1}, decides in n^{O(k)} time whether there is a satisfying assignment. This improves on the algorithm of [4], which runs in quasi-polynomial time for any fixed k.

The quasi-polynomial time algorithm of [4] uses a bound on the number of satisfying assignments of any complete Boolean k-CSP, proved by the same authors. We use the same tool, which can be stated as follows.

Lemma 5.1 (Lemma 3.1 of [4]).

Let (V,𝒫) be a complete k-CSP. Then, the number of satisfying assignments to it is at most O(|V|^{k−1}).

Our algorithm is very simple. It fixes an ordering v₁, v₂, …, v_n of V and keeps track of all satisfying assignments for the sub-instance induced on the first i variables, for i = k, k+1, …, n. Note that the sub-instance induced on the first i variables is a complete k-CSP on i variables, hence the number of such assignments is at most O(i^{k−1}) by Lemma 5.1. At the i-th step, we simply try to extend each satisfying assignment with x_i = 0 and with x_i = 1. At the end, it suffices to test whether any satisfying assignment remains. See Algorithm 3 for the pseudocode.

Algorithm 3 decide-csp().

The correctness of the algorithm is immediate. At every stage, the number of satisfying assignments, and hence the size of 𝒮, is O(n^{k−1}). Hence, the algorithm runs in n^{O(k)} time.
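The procedure above admits a short implementation; the sketch below assumes (hypothetically) that variables are indexed 0, …, n−1 and that `constraints` maps each sorted k-tuple of indices to a {0,1}-valued predicate on k bits.

```python
from itertools import combinations, product

def decide_csp(n, constraints, k):
    """Decide whether a complete k-CSP over {0,1} is satisfiable.

    S holds every assignment (tuple of bits) to the first i variables
    satisfying all constraints among them; on complete instances its size
    stays O(i^{k-1}) by Lemma 5.1, so the total time is n^{O(k)}.
    """
    # Base case: all assignments to the first k variables satisfying
    # the unique constraint on them.
    S = [bits for bits in product((0, 1), repeat=k)
         if constraints[tuple(range(k))](bits)]
    for i in range(k, n):
        # Extend each surviving assignment with x_i = 0 and x_i = 1,
        # checking every constraint whose highest variable is v_i.
        S = [a + (b,)
             for a in S for b in (0, 1)
             if all(constraints[sub + (i,)](tuple((a + (b,))[j] for j in sub + (i,)))
                    for sub in combinations(range(i), k - 1))]
        if not S:
            return False
    return bool(S)
```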

6 Deferred proofs from Section 4

6.1 Proof of Lemma 4.4

In this section we prove the correctness of the base case of the recursion reached when the execution of Algorithm 2 arrives at line 3. The claim is restated below for the reader's convenience.

Lemma 4.4 (Brute force base case). [Restated, see original statement.]

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 3. Then, the output α of line 3 satisfies val(α) ≤ val(μ).

Proof.

Notice that line 3 is reached after at most 100 log n stages of conditioning and thresholding, by Lemma 4.3. Each conditioning step costs at most r + 2t ≤ 2r = 2(log n)^{100K} degrees of the Sherali-Adams relaxation. Hence, at this point we have a pseudodistribution μ ∈ 𝒟(V,d) where d ≥ K² log^c n − 100 log n · 2(log n)^{100K} ≥ K² log^{100} n ≥ |V_U|.

This in turn means that the Sherali-Adams solution μ is an actual distribution over integral assignments to the unfixed variables V_U. In particular, there exists an integral assignment β_U to the variables in V_U such that the combined assignment β, which assigns β_U to the variables of V_U and the already fixed assignment α_{V_F} to the variables of V_F, satisfies val(β) ≤ val(μ); such an assignment is found by the brute-force search.

6.2 Proof of Lemma 4.5

In this section we show that when the sub-instance induced by the unfixed variables has large objective value, these constraints can be safely ignored, and we can then focus on solving the Min-2-SAT instance given by the constraints with one or two fixed variables. The claim is restated below for the reader's convenience.

Lemma 4.5 (Min-2-SAT base case). [Restated, see original statement.]

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 9. Then, the output α of line 9 satisfies val(α) ≤ O(log n)·A^{V_U}(μ)/\binom{|V_U|}{3}.

Proof.

Notice that line 9 is reached after at most 100 log n stages of conditioning and thresholding, by Lemma 4.3. Each conditioning step costs at most r + 2t ≤ 2r = 2(log n)^{100K} degrees of the Sherali-Adams relaxation. Hence, at this point we have a pseudodistribution μ ∈ 𝒟(V,d) where d ≥ K² log^c n − 100 log n · 2(log n)^{100K} ≥ K² log^{100} n ≥ 3.

We now observe that, since val([V_U], μ[V_U]) > 1/(10τ), for every β ∈ {0,1}^{V_U} the value of the induced sub-instance can be bounded as

$$\mathrm{val}([V_U],\beta) \;\le\; 1 \;\le\; 10\tau\cdot \mathrm{val}([V_U],\mu[V_U]).$$

Now, Algorithm 2 constructs the Min-2-SAT instance (V_U, 𝒫_2^{V_U,α}) by ignoring the constraints that have all three variables in V_U and those that have all three variables in V_F = V∖V_U. Hence, every constraint in 𝒫_2^{V_U,α} involves one or two variables from V_U. We recall that since every P_S ∈ 𝒫 is a NAE-3-SAT constraint, every P'_S ∈ 𝒫_2^{V_U,α} is a 2-SAT or 1-SAT clause.

Let ¬V_U = {¬v : v ∈ V_U} be the set of negative literals. For convenience, we represent this instance as (V_U, 𝒞), where 𝒞 is the multiset of clauses corresponding to the constraints 𝒫_2^{V_U,α}. Each 2-SAT clause in 𝒞 is the disjunction ℓ₁∨ℓ₂ of two literals ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U. We also allow ourselves to write 2-SAT clauses as follows: given ℓ₁∨ℓ₂ with ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U corresponding to variables v, w ∈ V_U respectively, we identify ℓ₁∨ℓ₂ with a tuple (v, w, p₁, p₂), where p₁, p₂ : {0,1} → {0,1} are bijections describing the literal patterns of ℓ₁ and ℓ₂ respectively. More precisely, each of p₁ and p₂ is one of the mappings Id : b ↦ b or Id̄ : b ↦ 1−b. For example, the clause ¬v∨w may also be written as (v, w, Id̄, Id). Note that this handles 1-SAT clauses too, since v is allowed to equal w. Let us further define

$$\mathrm{val}(\mathcal{C},\mu) \;=\; \mathbb{E}_{(v,w,p_1,p_2)\sim\mathcal{C}}\Big[\Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_1(\sigma_v)=0,\;p_2(\sigma_w)=0\big]\Big]$$

to be the average probability with which a clause is unsatisfied. We note that

$$|\mathcal{C}|\cdot \mathrm{val}(\mathcal{C},\mu) \;\le\; \mathrm{LP}_2^{V_U}(\mu) + \mathrm{LP}_1^{V_U}(\mu),$$

since each clause involves either one or two variables from V_U, with its remaining variables in V∖V_U (recall that LP_i^{V_U}(μ) is the sum of the values of the constraints with exactly i variables from V_U).

Our goal is to show that the Sherali-Adams solution μ at this point yields a feasible solution for a standard LP relaxation of the Min-2-SAT instance. By standard results in the literature, the integrality gap of this LP relaxation is at most O(log² n); hence, running the rounding algorithm certifying this gap gives an assignment α' ∈ {0,1}^{V_U} that violates at most O(log² n)·(LP_2^{V_U}(μ) + LP_1^{V_U}(μ)) clauses. Combining α' with α, and accounting for the constraints with all three variables in V_U and those with all three variables in V∖V_U, we get an assignment in {0,1}^V whose total number of violated constraints is at most

$$10\tau\cdot \mathrm{LP}_3^{V_U}(\mu) \;+\; O(\log^2 n)\cdot\big(\mathrm{LP}_2^{V_U}(\mu) + \mathrm{LP}_1^{V_U}(\mu)\big) \;+\; \mathrm{LP}_0^{V_U}(\mu) \;\le\; O(\log n)\cdot A^{V_U}(\mu),$$

which will finish the proof.

We now restrict our attention to showing that the Sherali-Adams solution μ is feasible for the standard LP relaxation of our Min-2-SAT instance (see for example [27]), stated below for convenience. The relaxation solves for a metric d : (V_U ∪ ¬V_U) × (V_U ∪ ¬V_U) → ℝ_{≥0} over the set of literals.

$$\begin{array}{lll}
\text{minimize} & \displaystyle\sum_{(\ell_1\vee\ell_2)\in\mathcal{C}} \tfrac{1}{2}\big(d(\neg\ell_1,\ell_2)+d(\neg\ell_2,\ell_1)\big) & \\[4pt]
\text{subject to} & d(v,\neg v)+d(\neg v,v)\ge 1, & \forall\, v\in V_U \qquad \text{(LP-2-SAT)}\\
& d(\ell_1,\ell_2)\ge 0, & \forall\, \ell_1,\ell_2\in V_U\cup\neg V_U\\
& d(\ell_1,\ell_3)\le d(\ell_1,\ell_2)+d(\ell_2,\ell_3), & \forall\, \ell_1,\ell_2,\ell_3\in V_U\cup\neg V_U
\end{array}$$

Notice that this is indeed a relaxation: given any assignment, for each violated clause (ℓ₁∨ℓ₂) ∈ 𝒞 we set d(¬ℓ₁,ℓ₂) = d(¬ℓ₂,ℓ₁) = 1, for every satisfied clause (ℓ₁∨ℓ₂) ∈ 𝒞 we set d(¬ℓ₁,ℓ₂) = d(¬ℓ₂,ℓ₁) = 0, and for every other pair ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U we obtain d(ℓ₁,ℓ₂) by metric completion. Formally, d(ℓ₁,ℓ₂) is the shortest length d(ℓ₁,z₁) + d(z₁,z₂) + ⋯ + d(z_t,ℓ₂) over all choices of literals z₁, …, z_t ∈ V_U ∪ ¬V_U such that (¬ℓ₁∨z₁), (¬z₁∨z₂), …, (¬z_t∨ℓ₂) ∈ 𝒞. In other words, we turn each clause ℓ₁∨ℓ₂ into the two implications ¬ℓ₁ ⇒ ℓ₂ and ¬ℓ₂ ⇒ ℓ₁, set the lengths d(¬ℓ₁,ℓ₂) and d(¬ℓ₂,ℓ₁) to 1 if and only if the corresponding implication is unsatisfied, and let the remaining distances be shortest-path distances in the graph whose edges are formed by these implications. We have the following standard result from the classical approximation algorithms literature.

Theorem 6.1 ([27]).

There is a rounding algorithm that, given any feasible solution to (LP-2-SAT) of objective value ϕ ≥ 0, outputs in polynomial time an assignment with at most O(log² n)·ϕ unsatisfied clauses.

Now we show that given a degree-d Sherali-Adams solution with d ≥ 3, we can obtain a feasible solution to (LP-2-SAT) whose value is at most the value of the Sherali-Adams relaxation. We remark that this is a standard fact, but we prove it here for completeness.

Claim 6.2.

Let d ≥ 3 be an integer and let μ ∈ 𝒟(V,d). Then, one can construct a feasible solution to (LP-2-SAT) with objective value at most |𝒞|·val(𝒞,μ). (We note that |𝒞| is merely a scaling factor, since val is defined as an average whereas the objective of (LP-2-SAT) is a sum.)

Proof.

The construction is straightforward. For each ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U, letting (v, w, p₁, p₂) be the tuple representation of ℓ₁∨ℓ₂, we set d(ℓ₁,ℓ₂) = Pr_{σ∼μ_{{v,w}}}[p₁(σ_v) = 1, p₂(σ_w) = 0]. In words, the length of the edge (ℓ₁,ℓ₂) is the probability that the implication ℓ₁ ⇒ ℓ₂ is falsified under the pseudodistribution μ. It remains to verify that this solution is feasible. Clearly all the d(ℓ₁,ℓ₂) are non-negative. Moreover, for each v ∈ V_U we have d(v,¬v) = Pr_{σ∼μ_{{v}}}[σ = 1] and d(¬v,v) = Pr_{σ∼μ_{{v}}}[σ = 0], so that d(v,¬v) + d(¬v,v) = 1. Next, we check the triangle inequality. Let ℓ₁, ℓ₂, ℓ₃ ∈ V_U ∪ ¬V_U, and let (u, v, p₁, p₂), (v, w, p₂, p₃), (u, w, p₁, p₃) be the tuple representations of ℓ₁∨ℓ₂, ℓ₂∨ℓ₃, ℓ₁∨ℓ₃, respectively. Then
$$\begin{aligned}
d(\ell_1,\ell_3) \;&=\; \Pr_{\sigma\sim\mu_{\{u,w\}}}\big[p_1(\sigma_u)=1,\,p_3(\sigma_w)=0\big]\\
&=\; \Pr_{\sigma\sim\mu_{\{u,v,w\}}}\big[p_1(\sigma_u)=1,\,p_2(\sigma_v)=0,\,p_3(\sigma_w)=0\big] + \Pr_{\sigma\sim\mu_{\{u,v,w\}}}\big[p_1(\sigma_u)=1,\,p_2(\sigma_v)=1,\,p_3(\sigma_w)=0\big]\\
&\le\; \Pr_{\sigma\sim\mu_{\{u,v\}}}\big[p_1(\sigma_u)=1,\,p_2(\sigma_v)=0\big] + \Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_2(\sigma_v)=1,\,p_3(\sigma_w)=0\big]\\
&=\; d(\ell_1,\ell_2)+d(\ell_2,\ell_3)
\end{aligned}$$

as desired. Finally, the LP objective value of this solution, $\sum_{(\ell_1\vee\ell_2)\in\mathcal{C}}\tfrac{1}{2}\big(d(\neg\ell_1,\ell_2)+d(\neg\ell_2,\ell_1)\big)$, is bounded as

$$\sum_{(v,w,p_1,p_2)\in\mathcal{C}} \tfrac{1}{2}\Big(\Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_1(\sigma_v)=0,\,p_2(\sigma_w)=0\big]+\Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_2(\sigma_w)=0,\,p_1(\sigma_v)=0\big]\Big) \;=\; \sum_{(v,w,p_1,p_2)\in\mathcal{C}} \Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_1(\sigma_v)=0,\,p_2(\sigma_w)=0\big] \;=\; |\mathcal{C}|\cdot \mathrm{val}(\mathcal{C},\mu),$$

where the first expression uses that ¬ℓ₁ has pattern Id̄∘p₁ (and similarly for ¬ℓ₂), and the last equality is exactly the definition of val(𝒞,μ).
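The construction in the proof of Claim 6.2 can be sanity-checked on an actual distribution over assignments (for which the pseudodistribution properties certainly hold). The sketch below uses a hypothetical toy representation: a distribution is a list of (assignment, probability) pairs, and a literal is a pair (variable, negated).

```python
def lp2sat_distance(dist, l1, l2):
    """d(l1, l2) = Pr[l1 holds and l2 fails]: the probability that the
    implication l1 => l2 is falsified, as in the proof of Claim 6.2."""
    def holds(lit, assignment):
        v, neg = lit
        return bool(assignment[v]) != neg
    return sum(p for a, p in dist if holds(l1, a) and not holds(l2, a))
```

The triangle inequality holds because the event {ℓ₁ ∧ ¬ℓ₃} is contained in the union of {ℓ₁ ∧ ¬ℓ₂} and {ℓ₂ ∧ ¬ℓ₃}, mirroring the chain of (in)equalities above.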

7 Hardness of Min-NAE-𝟑-SAT on dense instances

We show that solving Min-NAE-3-SAT exactly on dense instances is almost as hard as on general instances (which is known to be NP-hard [34]).

Claim 7.1.

For any small enough constant 0 < ε < 1/1000, there is an approximation-preserving polynomial-time reduction from a general instance of Min-NAE-3-SAT to a dense instance of Min-NAE-3-SAT whose constraint hypergraph has at least (1−ε)·\binom{n}{3} constraints, and in which every variable appears in at least (1−ε)·\binom{n}{2} constraints.

Proof.

Given a general instance of Min-NAE-3-SAT with variable set V₀, where n₀ = |V₀|, and constraint set 𝒞, add a set V_d of |V_d| = O(n₀/ε) dummy variables, so that the total number of variables becomes n = n₀ + |V_d|. First, we add \binom{|V_d|}{3} constraints on V_d, chosen so that the all-true assignment satisfies them: for every three variables v₁, v₂, v₃ ∈ V_d, we add the NAE-3-SAT clause (v₁, v₂, ¬v₃). Next, we add the constraints between the dummy variables V_d and V₀: for every pair of variables v₁, v₂ ∈ V_d and every variable v ∈ V₀, we add the NAE-3-SAT constraint (v, v₁, ¬v₂). The output is the original instance combined with the dummy variables and constraints. The number of constraints is at least \binom{|V_d|}{3} + |V₀|·\binom{|V_d|}{2}, which is at least (1−ε)·\binom{n}{3} for a small enough ε, since only the triples containing at least two variables of V₀ can be missing. Moreover, every variable appears in at least \binom{|V_d|}{2} ≥ (1−ε)·\binom{n}{2} clauses for a small enough ε.

For any assignment of the original variables, one can satisfy all the dummy constraints by setting all the dummy variables to true. In the other direction, given any assignment of the new instance, changing all dummy variables to true only satisfies more constraints, so afterwards only the original constraints can possibly be violated.

Therefore, the optimal values of the two instances are the same.
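The construction above can be summarized in the following sketch (the clause representation and function names are hypothetical; a clause is NAE-satisfied unless all three literal values coincide).

```python
from itertools import combinations

def dummy_clauses(n0, num_dummies):
    """Dummy constraints of Claim 7.1.  Original variables are 0..n0-1,
    dummies are n0..n0+num_dummies-1 (choose num_dummies = Theta(n0/eps));
    a literal is (variable, negated) with negated in {0, 1}."""
    dummies = range(n0, n0 + num_dummies)
    clauses = []
    # NAE(v1, v2, not v3) for every triple of dummies: the all-true
    # assignment gives literal values 1, 1, 0, so the clause holds.
    for v1, v2, v3 in combinations(dummies, 3):
        clauses.append(((v1, 0), (v2, 0), (v3, 1)))
    # NAE(v, v1, not v2) for every original v and pair of dummies.
    for v in range(n0):
        for v1, v2 in combinations(dummies, 2):
            clauses.append(((v, 0), (v1, 0), (v2, 1)))
    return clauses

def nae_satisfied(clause, assign):
    # A NAE clause fails iff all three literal values are equal.
    return len({assign[v] ^ neg for v, neg in clause}) > 1
```

Because every added clause contains both a positive and a negated dummy literal, the all-true setting of the dummies satisfies all of them regardless of the original variables, which is the key property used in the proof.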

References

  • [1] Amit Agarwal, Moses Charikar, Konstantin Makarychev, and Yury Makarychev. O(logn) approximation algorithms for min uncut, min 2cnf deletion, and directed cut problems. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 573–581, 2005. doi:10.1145/1060590.1060675.
  • [2] Vedat Levi Alev, Fernando Granha Jeronimo, and Madhur Tulsiani. Approximating constraint satisfaction problems on high-dimensional expanders. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 180–201. IEEE, 2019. doi:10.1109/FOCS.2019.00021.
  • [3] Noga Alon, W Fernandez De La Vega, Ravi Kannan, and Marek Karpinski. Random sampling and approximation of MAX-CSPs. Journal of computer and system sciences, 67(2):212–243, 2003. doi:10.1016/S0022-0000(03)00008-4.
  • [4] Aditya Anand, Euiwoong Lee, and Amatya Sharma. Min-csps on complete instances. In Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1178–1201. SIAM, 2025. doi:10.1137/1.9781611978322.35.
  • [5] Sanjeev Arora, David Karger, and Marek Karpinski. Polynomial time approximation schemes for dense instances of NP-hard problems. In Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, pages 284–293, 1995. doi:10.1145/225058.225140.
  • [6] Sanjeev Arora, Subhash A Khot, Alexandra Kolla, David Steurer, Madhur Tulsiani, and Nisheeth K Vishnoi. Unique games on expanding constraint graphs are easy. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 21–28, 2008.
  • [7] Mitali Bafna, Boaz Barak, Pravesh K Kothari, Tselil Schramm, and David Steurer. Playing unique games on certified small-set expanders. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 1629–1642, 2021. doi:10.1145/3406325.3451099.
  • [8] Boaz Barak, Moritz Hardt, Thomas Holenstein, and David Steurer. Subsampling mathematical relaxations and average-case complexity. In Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms, pages 512–531. SIAM, 2011. doi:10.1137/1.9781611973082.41.
  • [9] Boaz Barak, Prasad Raghavendra, and David Steurer. Rounding semidefinite programming hierarchies via global correlation. In 2011 ieee 52nd annual symposium on foundations of computer science, pages 472–481. IEEE, 2011. doi:10.1109/FOCS.2011.95.
  • [10] Cristina Bazgan, W Fernandez de La Vega, and Marek Karpinski. Polynomial time approximation schemes for dense instances of minimum constraint satisfaction. Random Structures & Algorithms, 23(1):73–91, 2003. doi:10.1002/rsa.10072.
  • [11] Joshua Brakensiek, Neng Huang, Aaron Potechin, and Uri Zwick. On the mysteries of max nae-sat. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 484–503. SIAM, 2021. doi:10.1137/1.9781611976465.30.
  • [12] Joshua Brakensiek, Neng Huang, and Uri Zwick. Tight approximability of max 2-sat and relatives, under ugc. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1328–1344. SIAM, 2024. doi:10.1137/1.9781611977912.53.
  • [13] Nairen Cao, Vincent Cohen-Addad, Euiwoong Lee, Shi Li, Alantha Newman, and Lukas Vogl. Understanding the cluster LP for correlation clustering. In Proceedings of the 56th Annual ACM SIGACT Symposium on Theory of Computing, 2024.
  • [14] Amin Coja-Oghlan, Colin Cooper, and Alan Frieze. An efficient sparse regularity concept. SIAM Journal on Discrete Mathematics, 23(4):2000–2034, 2010. doi:10.1137/080730160.
  • [15] W Fernandez de la Vega, Marek Karpinski, Ravi Kannan, and Santosh Vempala. Tensor decomposition and approximation schemes for constraint satisfaction problems. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 747–754, 2005.
  • [16] Wenceslas Fernandez de la Vega and Claire Kenyon-Mathieu. Linear programming relaxations of maxcut. In Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pages 53–61. Citeseer, 2007. URL: http://dl.acm.org/citation.cfm?id=1283383.1283390.
  • [17] Dimitris Fotakis, Michail Lampis, and Vangelis Paschos. Sub-exponential approximation schemes for CSPs: From dense to almost sparse. In 33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016), pages 37–1, 2016.
  • [18] Alan Frieze and Ravi Kannan. The regularity lemma and approximation schemes for dense problems. In Proceedings of 37th conference on foundations of computer science, pages 12–20. IEEE, 1996. doi:10.1109/SFCS.1996.548459.
  • [19] Venkatesan Guruswami and Ali Kemal Sinop. Lasserre hierarchy, higher eigenvalues, and approximation schemes for graph partitioning and quadratic integer programming with psd objectives. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 482–491. IEEE, 2011. doi:10.1109/FOCS.2011.36.
  • [20] Johan Håstad. Some optimal inapproximability results. Journal of the ACM (JACM), 48(4):798–859, 2001. doi:10.1145/502090.502098.
  • [21] Fernando Granha Jeronimo, Dylan Quintana, Shashank Srivastava, and Madhur Tulsiani. Unique decoding of explicit ε-balanced codes near the gilbert-varshamov bound. In Sandy Irani, editor, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 434–445. IEEE, IEEE, 2020. doi:10.1109/FOCS46700.2020.00048.
  • [22] Fernando Granha Jeronimo, Shashank Srivastava, and Madhur Tulsiani. Near-linear time decoding of ta-shma’s codes via splittable regularity. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 1527–1536, 2021. doi:10.1145/3406325.3451126.
  • [23] Marek Karpinski and Warren Schudy. Linear time approximation schemes for the Gale-Berlekamp game and related minimization problems. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 313–322, 2009. doi:10.1145/1536414.1536458.
  • [24] Sanjeev Khanna, Madhu Sudan, Luca Trevisan, and David P Williamson. The approximability of constraint satisfaction problems. SIAM Journal on Computing, 30(6):1863–1920, 2001. doi:10.1137/S0097539799349948.
  • [25] Subhash Khot. On the power of unique 2-prover 1-round games. In Proceedings of the thiry-fourth annual ACM symposium on Theory of computing, pages 767–775, 2002. doi:10.1145/509907.510017.
  • [26] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM Journal on Computing, 37(1):319–357, 2007. doi:10.1137/S0097539705447372.
  • [27] Philip N Klein, Serge A Plotkin, Satish Rao, and Eva Tardos. Approximation algorithms for Steiner and directed multicuts. Journal of Algorithms, 22(2):241–269, 1997. doi:10.1006/jagm.1996.0833.
  • [28] Michael Lewin, Dror Livnat, and Uri Zwick. Improved rounding techniques for the max 2-sat and max di-cut problems. In International Conference on Integer Programming and Combinatorial Optimization, pages 67–82. Springer, 2002. doi:10.1007/3-540-47867-1_6.
  • [29] Pasin Manurangsi and Dana Moshkovitz. Approximating dense max 2-CSPs. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, page 396, 2015.
  • [30] Claire Mathieu and Warren Schudy. Yet another algorithm for dense max cut: go greedy. In Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms, pages 176–182, 2008. URL: http://dl.acm.org/citation.cfm?id=1347082.1347102.
  • [31] Antoine Méot, Arnaud de Mesmay, Moritz Mühlenthaler, and Alantha Newman. Voting algorithms for unique games on complete graphs. In Symposium on Simplicity in Algorithms (SOSA), pages 124–136. SIAM, 2023. doi:10.1137/1.9781611977585.ch12.
  • [32] Shayan Oveis Gharan and Luca Trevisan. A new regularity lemma and faster approximation algorithms for low threshold rank graphs. In International Workshop on Approximation Algorithms for Combinatorial Optimization, pages 303–316. Springer, 2013. doi:10.1007/978-3-642-40328-6_22.
  • [33] Prasad Raghavendra and Ning Tan. Approximating csps with global cardinality constraints using SDP hierarchies. In Yuval Rabani, editor, Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012, pages 373–387. SIAM, 2012. doi:10.1137/1.9781611973099.33.
  • [34] Thomas J. Schaefer. The complexity of satisfiability problems. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC ’78, pages 216–226, New York, NY, USA, 1978. Association for Computing Machinery. doi:10.1145/800133.804350.
  • [35] Grigory Yaroslavtsev. Going for speed: Sublinear algorithms for dense r-CSPs. arXiv preprint arXiv:1407.7887, 2014. arXiv:1407.7887.
  • [36] Yuichi Yoshida and Yuan Zhou. Approximation schemes via sherali-adams hierarchy for dense constraint satisfaction problems and assignment problems. In Moni Naor, editor, Innovations in Theoretical Computer Science, ITCS’14, Princeton, NJ, USA, January 12-14, 2014, pages 423–438. ACM, 2014. doi:10.1145/2554797.2554836.