
Min-CSPs on Complete Instances II:
Polylogarithmic Approximation for Min-NAE-3-SAT

Aditya Anand, University of Michigan, Ann Arbor, MI, USA; Euiwoong Lee, University of Michigan, Ann Arbor, MI, USA; Davide Mazzali, EPFL, Lausanne, Switzerland; Amatya Sharma, University of Michigan, Ann Arbor, MI, USA
Abstract

This paper studies complete k-Constraint Satisfaction Problems (CSPs), where an n-variable instance has exactly one nontrivial constraint for each subset of k variables, i.e., it has (n choose k) constraints. A recent work started a systematic study of complete k-CSPs [Anand, Lee, Sharma, SODA'25] and showed a quasi-polynomial time algorithm that decides whether there is an assignment satisfying all the constraints of any complete Boolean-alphabet k-CSP, algorithmically separating complete instances from dense instances.

The tractability of this decision problem is necessary for any nontrivial (multiplicative) approximation for the minimization version, whose goal is to minimize the number of violated constraints. The same paper raised the question of whether it is possible to obtain nontrivial approximation algorithms for complete Min-k-CSPs with k ≥ 3.

In this work, we make progress in this direction and show a quasi-polynomial time polylog(n)-approximation to Min-NAE-3-SAT on complete instances, which asks to minimize the number of 3-clauses in which all three literals evaluate to the same bit. To the best of our knowledge, this is the first known example of a CSP whose decision version is NP-hard on general (and dense) instances while admitting a polylog(n)-approximation on complete instances. Our algorithm presents a new iterative framework for rounding a solution from the Sherali-Adams hierarchy, where each iteration interleaves two well-known rounding tools: the conditioning procedure, in order to "almost fix" many variables, and the thresholding procedure, in order to "completely fix" them.

Finally, we improve the running time of the decision algorithms of Anand, Lee, and Sharma and show a simple algorithm that decides any complete Boolean-alphabet k-CSP in polynomial time.

Keywords and phrases:
Constraint Satisfiability Problems, Approximation Algorithms, Sherali Adams
Category:
APPROX
Funding:
Euiwoong Lee: Supported in part by NSF grant CCF-2236669 and Google.
Copyright and License:
© Aditya Anand, Euiwoong Lee, Davide Mazzali, and Amatya Sharma; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Approximation algorithms analysis; Mathematics of computing → Approximation algorithms; Theory of computation → Problems, reductions and completeness
Related Version:
Full Version: https://arxiv.org/abs/2504.19051
Editors:
Alina Ene and Eshan Chattopadhyay

1 Introduction

Constraint Satisfaction Problems (CSPs) provide a unified framework for expressing a wide range of combinatorial problems, including SAT, Graph Coloring, and Integer Programming. A CSP consists of a set of n variables that must be assigned values from a given alphabet, subject to a collection of m constraints. The goal is typically to determine whether there exists an assignment that satisfies all constraints (decision CSPs) or to optimize some measure of constraint satisfaction (optimization CSPs).

A major example of optimization CSPs is Max-CSPs, where the objective is to find an assignment that maximizes the number of satisfied constraints. This class of problems has been extensively studied and is known to admit various approximation algorithms, including the (conditionally) optimal approximability of fundamental problems such as Max-3-LIN, Max-3-SAT, Max-Cut, and Unique Games [20, 25, 26]. However, significantly less is known for their minimization counterparts, namely Min-CSPs, which aim to minimize the number of violated constraints.

In many cases, the minimization version of a CSP is inherently harder to approximate than the maximization version. For example, while Max-2-SAT admits a tight α_LLZ ≈ 0.94016567 approximation algorithm modulo the Unique Games Conjecture (UGC) [28, 12, 25], the best-known approximation guarantee for Min-2-SAT (also known as Min-2-CNF-Deletion) is an O(log n)-approximation [1], along with an ω(1) hardness under the UGC [25]. Another example is Not-All-Equal-3-SAT (NAE-3-SAT), where a clause (consisting of three literals) is satisfied if and only if not all three literals evaluate to the same Boolean value. The maximization version, Max-NAE-3-SAT, admits a tight approximation factor of ≈ 0.9089 modulo the UGC [11]. In contrast, the minimization version Min-NAE-3-SAT admits no finite approximation in polynomial time, simply from the NP-hardness of the decision version NAE-3-SAT [34].

Understanding the approximability of Min-CSPs remains a major challenge in computational complexity and approximation algorithms. On the brighter side, [24] proved that, based on a structural classification, the optimal approximation ratio of any Min-CSP over the Boolean alphabet takes one of the values in {1, O(1), polylog(n), poly(n), ∞} (throughout this paper, CSPs are over the Boolean alphabet unless specified otherwise).

Apart from general instances, CSPs often exhibit significantly different behavior when structural requirements are imposed on their instances. The structure of a CSP refers to assumptions on how constraints are distributed across the instance. A k-CSP is defined on n variables, where each constraint involves exactly k variables, and the constraint structure can naturally be modeled as a k-uniform hypergraph, where variables correspond to vertices and each constraint corresponds to a hyperedge. Structural assumptions on CSPs often translate into density conditions on the corresponding hypergraph. Two particularly important structured settings are (1) dense instances, where the number of constraints (i.e., hyperedges) is Ω(n^k), and (2) complete instances, where the number of constraints is exactly (n choose k).

Max-CSPs on dense instances have been extensively studied, with powerful algorithmic techniques yielding strong approximation guarantees. In fact, for every Max-CSP on dense instances, a Polynomial-Time Approximation Scheme (PTAS) is known, achievable through any of the three major algorithmic frameworks: random sampling [5, 10, 3, 15, 30, 23, 8, 35, 29, 17], convex hierarchies [16, 6, 9, 19, 36, 2, 21, 7], and regularity lemmas [18, 14, 32, 22]. However, in contrast, Min-CSPs on dense instances remain far less explored. Known results exist only for specific problems, such as O(1)-approximation algorithms for Unique Games and Min-Uncut [10, 23, 19, 31], and a PTAS for fragile Min-CSPs [23], where a CSP is fragile when changing the value of a variable always flips a clause containing it from satisfied to unsatisfied. Despite these advances, a general framework for tackling Min-CSPs in dense settings is still lacking, making it a compelling direction for further study.

Building on the study of structured CSPs, [4] introduced complete instances as an extreme case of structure, where every possible k-set of variables forms a constraint. The primary motivation for studying complete instances comes from their connections to machine learning and data science, their role in unifying algorithmic techniques for dense Max-CSPs, and the structural insights they provide into CSPs (see [4] for details). That work gives a constant-factor polynomial time approximation for Min-2-SAT on complete instances, in contrast with the ω(1)-hardness on dense instances, and quasi-polynomial (specifically, n^{O(log n)}) time algorithms for the decision versions of k-SAT and k-CSP for all constants k. It also gives a polynomial time algorithm for deciding NAE-3-SAT on complete instances. On the hardness side, it proves that there is no polynomial time algorithm for exact optimization (which does not distinguish the maximization and minimization versions) of NAE-3-SAT, k-SAT, k-Lin, k-And, and Unique Games even on complete instances, unless NP ⊆ BPP.

Our Contribution.

We prove two main results in our work: (1) a quasi-polynomial time O(log⁶ n)-approximation for Min-NAE-3-SAT on complete instances, and (2) a polynomial time algorithm for the decision version of k-CSPs for all constants k.

We first present a quasi-polynomial (i.e., n^{polylog(n)}) time polylog(n)-approximation algorithm for Min-NAE-3-SAT on complete instances. To the best of our knowledge, it is the first known example of a CSP whose decision version is NP-hard on general instances (and dense instances too, see Claim 7.1) while admitting a polylog(n)-approximation on complete instances.

Theorem 1.1.

There is an algorithm running in time n^{log^κ n} for some constant κ > 0 that gives an O(log⁶ n)-approximation for Min-NAE-3-SAT on complete instances. Furthermore, the integrality gap of the degree-O(log^κ n) Sherali-Adams relaxation is at most O(log⁶ n).

Beyond addressing this specific question, our result also strengthens one of the main motivations for studying Min-CSPs on complete instances: understanding whether a combination of algorithmic techniques (random sampling, convex hierarchies, and regularity lemmas) can improve approximability results. As stated earlier, while each of these techniques independently yields PTASes for Max-CSPs on dense/complete instances, their effectiveness for Min-CSPs is much less understood. Our algorithm presents a new iterative framework for rounding a solution from the polylog(n)-round Sherali-Adams hierarchy, where each iteration interleaves two well-known rounding tools: (1) the conditioning procedure, which identifies and conditions on a small set of variables in order to almost fix a constant fraction of variables, and (2) the thresholding procedure, which completely fixes them. We fix a constant fraction of variables in each iteration (for a total of O(log n) iterations) while ensuring that, by the end of all the iterations, the value of the LP remains within a polylog(n) factor of the optimal value.

Second, we give a polynomial time algorithm for the decision version of k-CSPs (for every constant k), improving over the quasi-polynomial time algorithm of [4].

Theorem 1.2.

For every k ≥ 2, there is a polynomial time algorithm that decides whether a complete instance of a k-CSP is satisfiable.

Our algorithm is remarkably simple. It relies on Lemma 3.1 of [4], which shows via a VC-dimension argument that the number of satisfying assignments of a complete k-CSP instance is at most O(n^{k−1}). The algorithm first orders the variables v₁, …, v_n arbitrarily, and then iteratively maintains all satisfying assignments of the partial (complete) instance induced on the first i variables. Since the number of satisfying assignments of a complete instance on i variables is at most O(i^{k−1}), we can keep track of and update these solutions in polynomial time, leading to our second main result.
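The iterative maintenance described above can be sketched as follows. The encoding of constraints as predicates on index tuples is our own illustrative choice, and the polynomial running time rests on the O(i^{k−1}) bound cited above (the sketch simply enumerates extensions and filters them).

```python
from itertools import combinations

def decide_complete_csp(n, k, predicate):
    """Decide satisfiability of a complete k-CSP on variables 0..n-1.

    `predicate(S, alpha)` returns True iff the constraint on the sorted
    k-tuple S is satisfied by the assignment alpha (a dict var -> bit).
    """
    # Satisfying assignments of the sub-instance induced on the first i
    # variables, stored as bit tuples.  By [4, Lemma 3.1] this set has
    # size O(i^(k-1)), which is what makes the loop polynomial time.
    partial = [()]  # the empty assignment satisfies the empty instance
    for i in range(n):
        extended = []
        for alpha in partial:
            for b in (0, 1):
                beta = alpha + (b,)
                # Check every constraint that becomes fully assigned now,
                # i.e. every k-set whose largest variable is i.
                ok = all(
                    predicate(S + (i,), {v: beta[v] for v in S + (i,)})
                    for S in combinations(range(i), k - 1)
                )
                if ok:
                    extended.append(beta)
        partial = extended
    return len(partial) > 0
```

For instance, the complete all-positive NAE-3-SAT instance on 4 variables is satisfiable, while demanding exactly one 1 in every triple is not.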

Open Questions.

Our work establishes the first nontrivial approximation result for a Min-CSP, namely Min-NAE-3-SAT on complete instances, whose decision version is NP-hard on general and dense instances. NAE-3-SAT remains hard even when every variable appears in Ω(n²) clauses (Claim 7.1). However, our techniques do not currently extend to other problems, including Min-3-SAT and Min-NAE-4-SAT.

This raises open questions regarding both approximation algorithms and hardness results for Min-k-CSPs. While exact optimization is known to be hard, proving inapproximability beyond this, such as APX-hardness, remains an important direction. A natural first candidate is Min-3-SAT, for which both the existence of efficient quasi-polynomial time approximation algorithms and hardness results on complete instances remain unresolved.

Organization.

Section 2 provides an overview of the quasi-polynomial time O(log⁶ n)-approximation for Min-NAE-3-SAT, i.e., a proof outline for Theorem 1.1. Section 3 establishes the notation and preliminaries required throughout the paper. Section 4 presents and analyzes the rounding algorithm, Algorithm 2, which proves Theorem 1.1 by rounding the solution of the Sherali-Adams LP. Section 5 presents the polynomial time algorithm for the decision version of k-CSPs (Theorem 1.2).

2 Technical overview

Convex programming hierarchies have proven instrumental in the design of approximation algorithms for Max-CSPs [16, 6, 9, 19, 36, 2, 21, 7] and, in a few special cases, Min-CSPs [13]. It is therefore natural to attempt to approximate Min-NAE-3-SAT on complete instances using these approaches. We prove Theorem 1.1 by designing an algorithm that first solves the degree-d Sherali-Adams relaxation of Min-NAE-3-SAT, and then rounds the fractional solution to a Boolean assignment.

Informally, a fractional solution μ to such a relaxation associates each set S of at most d variables with a distribution μ_S over assignments α ∈ {0,1}^S to the variables in S. The crucial property is that these distributions are locally consistent: for any two variable sets S, T of size at most d, the local distributions μ_S and μ_T agree on the probability of each assignment to S ∩ T. For this reason, such a solution μ is called a pseudodistribution.

The objective of the relaxation is to minimize the average over constraints of the probability that they are violated by an assignment sampled from their respective local distribution. In particular, the objective value of an optimal pseudodistribution μ is at most the fraction of violated constraints in an optimal Boolean assignment.

As solving a degree-d Sherali-Adams relaxation amounts to solving a linear program with nO(d) variables, we would like to keep d as small as possible. In our context, we will set d=polylog(n).
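To make the local-consistency condition concrete, here is a toy check on a dictionary representation of a pseudodistribution; the representation, variable names, and function names are ours, not the paper's.

```python
def marginal(dist, vars_sorted, I):
    """Marginal of `dist` (assignment tuples indexed by vars_sorted) on I."""
    idx = [vars_sorted.index(v) for v in sorted(I)]
    out = {}
    for alpha, p in dist.items():
        key = tuple(alpha[j] for j in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def is_locally_consistent(mu, tol=1e-9):
    """Sherali-Adams consistency: for all sets S, T in `mu` (a dict mapping
    frozensets of variables to distributions over bit tuples), the marginals
    of mu[S] and mu[T] on the intersection S & T must agree."""
    for S, dS in mu.items():
        for T, dT in mu.items():
            I = S & T
            mS = marginal(dS, sorted(S), I)
            mT = marginal(dT, sorted(T), I)
            if any(abs(mS.get(x, 0.0) - mT.get(x, 0.0)) > tol
                   for x in set(mS) | set(mT)):
                return False
    return True
```

For example, a uniform singleton marginal is consistent with a uniform pair distribution, but a point mass on a single bit is not.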

Having put this notation in place, we now give an overview of how the algorithm from Theorem 1.1 rounds these fractional solutions. Specifically, in Section 2.1 we highlight some of the structural properties that our problem instance imposes on the pseudodistributions, and in Section 2.2 we discuss how to exploit these properties to design a rounding algorithm.

2.1 Pseudodistributions and complete NAE-3-SAT

When rounding pseudodistributions obtained from the Sherali-Adams or Sum-of-Squares hierarchies, conditioning is perhaps the main hammer at our disposal. This technique, introduced by [9, 19] and adopted ubiquitously to approximate optimization CSPs, is usually implemented as follows: first, we condition on a subset of r = O(1/ϵ²) variables to reduce the average correlation among groups of t variables to below O(2^t ϵ); then, we perform independent rounding. This scheme is particularly effective in the context of dense Max-k-CSPs, where we capitalize on the fact that the optimal assignment satisfies at least an Ω(1) fraction of the constraints. Hence, we can afford to additively lose a total variation distance of O(2^k ϵ) = O(ϵ) between the pseudodistribution and independent rounding in each constraint, and still obtain a (1 − O(ϵ))-approximation.

Obstacle: we cannot afford additive error.

In the context of Min-k-CSPs, the objective function is the fraction of violated constraints. Hence, for a Min-k-CSP instance ℐ, we define OPT_ℐ to be the minimum over all variable assignments α of the fraction of constraints of ℐ unsatisfied by α. With such an objective, the trick discussed above ceases to work: a highly satisfiable instance has a very small optimal value, possibly even OPT_ℐ = O(1/n^k). Due to this eventuality, we cannot afford to additively lose an ϵ term for each constraint: it would result in an objective value of, say, ϵ + OPT_ℐ ≥ n·OPT_ℐ unless ϵ ≤ 1/n^{k−1}. Unfortunately, the degree d of the pseudodistribution needs to be larger than the number r = O(1/ϵ²) of variables we condition on, so setting ϵ ≤ 1/n^{k−1} is prohibitive.

Advantage: polylog(n) loss in time and approximation.

On the flip side of the above discussion, one can deduce that if OPT_ℐ ≥ 1/polylog(n), we can in fact afford to lose an additive O(ϵ) term for ϵ = 1/polylog(n) and still run in n^{O(r)} = n^{polylog(n)} time. Even more so, if OPT_ℐ ≥ 1/polylog(n), then we always get a polylog(n)-approximation, no matter what solution we output (since the value we obtain is always at most 1). We can therefore restrict ourselves to instances with small optimal value.

Insight 1.

Without loss of generality, we can consider only instances with OPT_ℐ ≤ 1/polylog(n).

Advantage: the instance is complete.

Let μ be an optimal pseudodistribution for our instance ℐ, and let us denote by δ = val(μ) the fractional objective value achieved by μ on ℐ. Thanks to Insight 1, we can assume δ ≤ OPT_ℐ ≤ 1/polylog(n). Recalling that we are in the realm of complete instances, a simple averaging yields the following observation.

Insight 2.

For at least (n choose 3)/2 of the triples {u,v,w}, the constraint P_{u,v,w} is unsatisfied with probability at most 2δ ≤ 1/polylog(n) over α ∼ μ_{u,v,w}.

We remark that a version of Insight 2 remains true for dense, but not necessarily complete, instances up to an Ω(1) loss in the lower bound. However, being dense is not a “hereditary property”, while being complete is: any variable-induced sub-instance of a complete instance is itself complete (and in particular dense), whereas a variable-induced sub-instance of a dense instance need not be dense. Thus, assuming that ℐ is complete allows us to apply Insight 2 locally to any variable-induced sub-instance. As we will see, this is instrumental to our algorithm.

Advantage: the structure of NAE-3-SAT constraints.

Consider any constraint P_{u,v,w} that is unsatisfied with probability at most δ′ over μ_{u,v,w}, for some δ′ ∈ [0,1]. By Insight 2, we can assume that δ′ ≤ 2δ ≤ 1/polylog(n). Then one can see that, among the six satisfying assignments of P_{u,v,w}, there is one that retains probability mass at least (1 − 1/polylog(n))/6 ≥ 1/7 in μ_{u,v,w}. Call this satisfying assignment (α_u, α_v, α_w). Assuming for simplicity that the literals of P_{u,v,w} are all positive, we conclude that exactly two of α_u, α_v, α_w must be equal. By symmetry, we can assume that α_v = α_w = 0 and α_u = 1. We now consider the pseudodistribution μ̃ = μ|_{(v,w)←(β_v,β_w)} obtained by conditioning on the variables v, w taking the random value (β_v, β_w) ∼ μ_{v,w}. We can observe that (β_v, β_w) = (α_v, α_w) = (0,0) with probability at least 1/7. Therefore, if such an event occurs, we have μ̃_u(0) ≤ 14δ, as one can deduce from Table 1. In this case, we say that u becomes O(δ)-fixed in μ̃.

Table 1: Distribution of the probability mass in μ_{u,v,w} over the assignments of a constraint P_{u,v,w} that is unsatisfied with probability at most 2δ, assuming that P_{u,v,w} has only positive literals.

We now continue to look at a fixed constraint P_{u,v,w} as above, and consider the following random experiment: draw a random pair of variables x, y together with a random assignment (β_x, β_y) ∼ μ_{x,y} sampled from their local distribution, and let μ̃ = μ|_{(x,y)←(β_x,β_y)} be the pseudodistribution obtained by conditioning on the pair (x,y) taking value (β_x, β_y). With probability 1/(n choose 2) we hit the pair (v,w), i.e., (x,y) = (v,w), and with probability at least 1/7 we then have (β_x, β_y) = (β_v, β_w) = (0,0). This means that the variable u becomes O(δ)-fixed in μ̃ with probability at least 0.1/(n choose 2). By Insight 2, we know that for many (i.e., Ω(n)) choices of u, there are many (i.e., Ω(n²)) triples P_{u,v,w} that can O(δ)-fix u, like the one considered in the discussion above. Using the linearity of expectation and Markov's inequality, we can then conclude the following.
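As a numeric sanity check of this conditioning argument, consider the following hypothetical local distribution μ_{u,v,w} for an all-positive NAE clause (the specific numbers are ours, chosen only to satisfy the hypotheses above):

```python
# Hypothetical local distribution mu_{u,v,w} for an all-positive NAE clause:
# it violates the clause with probability delta' = 0.01, and the satisfying
# assignment (1,0,0) carries mass >= 1/7.
mu = {(1, 0, 0): 0.60, (0, 1, 1): 0.30, (0, 1, 0): 0.03, (1, 0, 1): 0.03,
      (0, 0, 1): 0.02, (1, 1, 0): 0.01, (0, 0, 0): 0.006, (1, 1, 1): 0.004}
delta_prime = mu[(0, 0, 0)] + mu[(1, 1, 1)]   # violation probability

# Condition on (v, w) = (0, 0): only (0,0,0) and (1,0,0) survive.
mass_vw_00 = mu[(0, 0, 0)] + mu[(1, 0, 0)]
u_is_0 = mu[(0, 0, 0)] / mass_vw_00           # this is tilde-mu_u(0)

# u becomes O(delta')-fixed: its residual mass is at most 7 * delta',
# since the denominator is at least the 1/7 mass of (1,0,0).
assert u_is_0 <= 7 * delta_prime
```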

Insight 3.

For a random pair of variables x, y and a random assignment (β_x, β_y) ∼ μ_{x,y}, with probability at least 1/1000 there are at least n/1000 variables u that become O(δ)-fixed in μ̃ = μ|_{(x,y)←(β_x,β_y)}.

A formal version of this fact is stated in Lemma 4.1.

Advantage: “completely” fixing O(δ)-fixed variables incurs little cost.

We recall that conditioning on a random assignment (β_x, β_y) ∼ μ_{x,y} preserves the objective value in expectation, i.e., 𝔼[val(μ̃)] = val(μ) = δ. Hence, Insight 3 seems to suggest that conditioning on just two variables allows making nontrivial progress towards the goal of rounding the pseudodistribution: the entropy of Ω(n) variables should drastically decrease while preserving the expected objective value. However, conditioning alone is not enough, as its effect diminishes once the entropy is already small but still nontrivial (e.g., when most variables are already O(δ)-fixed but not yet integral, the above argument cannot guarantee any more progress). While independent rounding suffices for Max-CSPs in this case, its additive loss can still be prohibitive for the minimization objective.

In order to bypass this barrier, we additionally use thresholding to round many O(δ)-fixed variables to exactly 0 or 1. Formally, μ̂ is the new pseudodistribution where each variable u that is O(δ)-fixed in μ̃ is fixed to the bit retaining 1 − O(δ) of the probability mass of μ̃_u.

We now zoom in on a constraint P_{u,v,w} with only positive literals where, say, u has been fixed to 1 in μ̂. For convenience, call γ the probability that P_{u,v,w} is unsatisfied by an assignment sampled from μ̃_{u,v,w}. One can observe that the probability of P_{u,v,w} being violated by an assignment sampled from μ̂_{u,v,w} equals μ̂_{u,v,w}(1,1,1): the assignment (0,0,0) has probability mass 0 in μ̂_{u,v,w}, since we fixed u to 1. Then, with the help of Table 2, we can convince ourselves that the probability of P_{u,v,w} being unsatisfied by an assignment sampled from μ̂_{u,v,w} is no larger than γ up to an additive O(δ) error.

Table 2: Distribution of the probability mass in μ̃_{u,v,w} over the assignments of a constraint P_{u,v,w} where u is O(δ)-fixed in μ̃ and where γ is the probability that P_{u,v,w} is unsatisfied by an assignment sampled from μ̃_{u,v,w}, assuming that P_{u,v,w} has only positive literals.

From the above discussion, we conclude that the fractional objective value val(μ̂) of μ̂ is at most val(μ̃) + O(δ). Recalling that 𝔼[val(μ̃)] = val(μ) = δ, Insight 3 can then be refined as follows.
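A small numeric illustration of the thresholding step (again with a hypothetical local distribution of our choosing): fixing an O(δ)-fixed variable transports probability mass rather than discarding it, and the violation probability grows by at most the fixed variable's residual mass.

```python
# Hypothetical local distribution tilde-mu_{u,v,w} for an all-positive NAE
# clause in which u leans towards 1: tilde-mu_u(0) = delta = 0.09.
tmu = {(1, 0, 0): 0.40, (1, 0, 1): 0.25, (1, 1, 0): 0.20, (1, 1, 1): 0.06,
       (0, 0, 0): 0.02, (0, 0, 1): 0.03, (0, 1, 0): 0.02, (0, 1, 1): 0.02}
gamma = tmu[(0, 0, 0)] + tmu[(1, 1, 1)]               # violation prob in tilde-mu
delta = sum(p for a, p in tmu.items() if a[0] == 0)   # residual mass of u

# Threshold u to 1: transport the mass of every (0,b,c) onto (1,b,c).
hat = {}
for a, p in tmu.items():
    target = (1,) + a[1:]
    hat[target] = hat.get(target, 0.0) + p

viol_hat = hat[(1, 1, 1)]        # (0,0,0) now has mass 0; only (1,1,1) violates
assert viol_hat <= gamma + delta  # the increase is at most O(delta)
```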

Insight 4.

By sampling a random pair of variables, conditioning on a random assignment to them, and performing the thresholding, we can construct a new pseudodistribution μ̂ such that 𝔼[val(μ̂)] = O(δ) and, with probability at least 1/1000, at least n/1000 variables u are integral in μ̂.

A formal version of this fact can be obtained by combining Lemma 4.1 and Lemma 4.2.

2.2 Towards an algorithm

It would be tempting to employ Insight 4 to get a rounding scheme like the one outlined in Algorithm 1: sample a pair of variables and a random assignment, construct μ̂ as above, and recurse on the sub-instance induced by the variables that are still not integral (or halt if there is no such variable). The intuition behind Algorithm 1 is backed up by a trifecta.

  1.

    First, Insight 4 applies also to the sub-instance that we recurse on: it essentially only relies on Insight 2, which also holds on the sub-instance since ℐ is complete (using the fact that being complete is a “hereditary property”, as anticipated).

  2.

    Second, we expect that O(log n) levels of recursion will make every variable integral, by Insight 4.

  3.

    Third, the expected value of the (integral) pseudodistribution obtained at the end is roughly O(δ) ≤ O(OPT_ℐ), also by Insight 4.

However, more work is needed, since Insight 4 only bounds the cost in expectation at each individual recursion level.

Algorithm 1 Conditions on a random pair of variables and thresholds the result.

Obstacle: the need for a high probability guarantee.

In order to materialize the strategy of Algorithm 1, we need to control the deviations of val(μ̂) from δ at every level of the recursion. More precisely, let μ^{(i)} be the pseudodistribution at the beginning of the i-th recursion level, and let δ_i be the fractional objective value val(μ^{(i)}) of the pseudodistribution μ^{(i)}.

For the sake of exposition, let us here ignore the hidden constant in O(δ) and assume 𝔼[val(μ^{(i)})] = val(μ^{(i−1)}) (the ignored error comes from the clauses at least one of whose variables is completely fixed by the thresholding step, so it can be controlled via the concept of the aggregate value defined in equation (1) of Section 4). With this assumption, at every level i of the recursion we have

𝔼_{(i−1)-th level}[val(μ^{(i)})] = δ_{i−1} = val(μ^{(i−1)}).

Applying this identity at every level, we get

𝔼_{algorithm}[val(μ^{(#levels)})]
= 𝔼_{1-st level}[ ⋯ 𝔼_{(#levels−2)-th level}[ 𝔼_{(#levels−1)-th level}[val(μ^{(#levels)})] ] ⋯ ]
= δ_1
≤ OPT_ℐ,

where the last inequality follows assuming that the pseudodistribution on which we first call the algorithm is optimal. Now, one can intuitively see that the deviation of the last pseudodistribution μ^{(#levels)} from its expected objective value translates into the approximation ratio achieved by the algorithm. Therefore, we are interested in bounding this deviation. To do so, we have to peel off one expectation operator at a time in the above equation by means of Markov's inequality. At every level i, the guarantee would then be val(μ^{(i)}) ≤ λ·val(μ^{(i−1)}) with probability at least 1 − 1/λ. Since this gives

val(μ^{(#levels)}) ≤ λ^{#levels}·OPT_ℐ,

we need a λ such that λ^{#levels} = Õ(1). Recalling that #levels = Ω(log n), we must hence require λ ≤ 1 + 1/log n. Unfortunately, Markov's inequality ensures a (1 + 1/log n) multiplicative increase with only p_val = O(1/log n) success probability. If we call p_fix = 1/1000 the probability that μ̂ fixes n/1000 variables, we realize that

(1 − p_val) + (1 − p_fix) ≥ 1.

We are therefore unable to conclude that there exists a choice of x, y, β_x, β_y for which we both have that μ^{(i)} fixes n/1000 variables and val(μ^{(i)}) ≤ (1 + 1/log n)·val(μ^{(i−1)}). For the left-hand side above to be less than one, we need p_fix ≥ 1 − O(1/log n).
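The arithmetic behind this tension can be checked directly: with λ = 1 + 1/log n, the compounded loss over Θ(log n) levels stays O(1), but Markov's inequality grants this slack with probability only Θ(1/log n) per level (the concrete value of n below is an arbitrary illustration).

```python
import math

n = 10 ** 6
levels = math.ceil(math.log(n))   # number of recursion levels, Theta(log n)
lam = 1 + 1 / math.log(n)         # per-level multiplicative slack

# Compounded over all levels, the loss stays constant: (1 + 1/log n)^{log n} <= e.
assert lam ** levels <= math.e * lam   # small cushion for the ceiling

# But Markov's inequality yields this slack with probability only 1 - 1/lam,
# which is Theta(1/log n) -- far below the 1 - O(1/log n) needed alongside p_fix.
p_val = 1 - 1 / lam
assert p_val < 0.1
```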

Advantage: polylog(n)-degree pseudodistributions.

Recalling that we allow our algorithm to run in n^{polylog(n)} time, we can assume to have a polylog(n)-degree pseudodistribution at our disposal. Supported by this, the natural thing to do is modifying each level of the recursion as follows: condition on O(log log n) random pairs (as opposed to one) and on their respective assignments. While this should boost p_fix to 1 − O(1/log n), we now have more conditioning steps that could deviate from the expectation. However, suppose we could condition on a random set of O(log log n) pairs of variables whose joint local distribution is close (in total variation distance) to the product of their marginal distributions: then we would both obtain the high probability guarantee and simultaneously have only one deviation from the objective value to control. A standard way to ensure that the joint distribution of t (pairs of) variables is ϵ-close to independent is to additionally condition on poly(2^t/ϵ) variables [33, 36]. Since we want t = O(log log n) and ϵ = 1/polylog(n), we can afford to do so with our polylog(n)-degree pseudodistribution.

A formal version of this algorithm is presented and analyzed in Section 4.

3 Notation

CSPs.

Let k ≥ 2 be an integer and let V be a set of n variables. A k-CSP is a pair ℐ = (V, 𝒫), where V is a set of variables over the Boolean alphabet {0,1} and 𝒫 represents the constraints. Each constraint P_S ∈ 𝒫 is defined by a k-subset of variables S ∈ (V choose k) and a predicate P_S: {0,1}^S → {0,1}, where 1 means that the constraint is satisfied and 0 that it is not. We will consider complete k-CSPs, so that there is a predicate P_S ∈ 𝒫 for every S ∈ (V choose k). Furthermore, each constraint P_S is unsatisfied by at least one assignment to the variables in S. Then, for a global assignment α ∈ {0,1}^V to the variables, we let

val(ℐ, α) = Pr_{S ∼ (V choose k)}[P_S(α_S) = 0],

where S ∼ (V choose k) indicates that S is sampled uniformly from the elements of (V choose k), and α_S denotes the restriction of α to the entries in S.

In the decision version of a k-CSP instance ℐ, the goal is to decide if there is an assignment α such that val(ℐ, α) = 0. In the minimization version, which we denote by Min-k-CSP, the goal is to find an assignment α which minimizes val(ℐ, α), so we define

OPT(ℐ) = min_{α ∈ {0,1}^V} val(ℐ, α).

Sometimes we will write val(α) as a shorthand for val(ℐ, α) when ℐ is clear from the context. For a subset of variables W ⊆ V, we denote by ℐ[W] the sub-instance of ℐ induced by the variables in W.
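As an illustration of these definitions, val and OPT for a tiny complete instance can be computed as follows; the dictionary-based encoding and function names are ours, and the brute-force search is for sanity checks only.

```python
from itertools import combinations, product

def val(V, predicates, alpha):
    """Fraction of violated constraints: Pr over S of P_S(alpha_S) = 0.
    `predicates` maps each k-tuple S to a 0/1 predicate on assignment dicts."""
    scopes = list(predicates)
    bad = sum(1 for S in scopes
              if predicates[S]({v: alpha[v] for v in S}) == 0)
    return bad / len(scopes)

def brute_force_opt(V, predicates):
    """OPT by exhaustive search; exponential in |V|, for tiny examples only."""
    return min(val(V, predicates, dict(zip(V, bits)))
               for bits in product((0, 1), repeat=len(V)))

# Complete NAE-3-SAT instance on 4 variables, all literals positive.
nae = lambda a: 1 if len(set(a.values())) >= 2 else 0
V = (0, 1, 2, 3)
preds = {S: nae for S in combinations(V, 3)}
```

For this instance the all-zero assignment violates every clause (val = 1), while OPT = 0.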

NAE-3-SAT.

A complete NAE-3-SAT instance is a complete 3-CSP ℐ = (V, 𝒫) where each constraint (also called a clause) P_S ∈ 𝒫 is a “not all equal” predicate on three literals, each literal being either a variable or its negation. For our convenience, we define this formally as follows: for each S ∈ (V choose 3) and each u ∈ S, we have a polarity bijection P_S^u: {0,1} → {0,1} determining the literal pattern of the variable u in S, and the constraint P_S is defined as

P_S(α) = 1 if |{P_S^u(α_u)}_{u∈S}| ≥ 2, and P_S(α) = 0 otherwise,

for every α ∈ {0,1}^S.
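A literal transcription of this predicate, with an explicitly chosen encoding that is our assumption rather than the paper's: we represent the polarity bijection by a bit, P_S^u(b) = b XOR polarity[u], so polarity[u] = 1 means the literal of u is negated.

```python
def make_nae_constraint(polarity):
    """NAE-3-SAT clause on the 3-set S = polarity.keys().

    `polarity` maps each variable u in S to a bit; the literal of u is
    P_S^u(b) = b XOR polarity[u], i.e. negated iff polarity[u] == 1.
    """
    def P(alpha):  # alpha: dict variable -> bit, an assignment to S
        literals = {alpha[u] ^ polarity[u] for u in polarity}
        # Satisfied iff the literals take at least two distinct values.
        return 1 if len(literals) >= 2 else 0
    return P
```

For instance, the clause NAE(x, y, ¬z) is violated exactly when all three literals agree.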

Sherali-Adams notation.

For d ≥ k, we consider the degree-d Sherali-Adams relaxation of OPT(ℐ), which can be written as

minimize Pr_{S ∼ (V choose k), α ∼ μ_S}[P_S(α) = 0]   subject to μ ∈ 𝒟(V, d),

where 𝒟(V, d) is the set of pseudodistributions of degree d over the Boolean variables V, i.e., every element μ of 𝒟(V, d) is indexed by the subsets S ⊆ V with |S| ≤ d, where each μ_S is a distribution over assignments α ∈ {0,1}^S to the variables in S, with the property that

for all S, T ⊆ V and all β ∈ {0,1}^{S∩T} with |S ∪ T| ≤ d:   Pr_{α ∼ μ_S}[α_{S∩T} = β] = Pr_{α ∼ μ_T}[α_{S∩T} = β].

Moreover, for any α ∈ {0,1}^S, we let μ_S(α) ∈ [0,1] denote the probability of sampling α from μ_S. For convenience, we also let

val(ℐ, μ) = Pr_{S ∼ (V choose k), α ∼ μ_S}[P_S(α) = 0]

be the fractional objective value of a pseudodistribution μ. Additionally, for W ⊆ V, we denote by μ[W] ∈ 𝒟(W, d) the restriction of μ to the variables in W. Again, we will use the shorthand val(μ) when ℐ is clear from the context.

Conditioning and fixing notation.

For μ ∈ 𝒟(V, d), S ⊆ V with |S| ≤ d, and β ∈ supp(μ_S), we denote by μ|_{S←β} ∈ 𝒟(V, d − |S|) the pseudodistribution obtained from μ by conditioning on the event that the variables in S are assigned β. Formally, μ|_{S←β} is the pseudodistribution defined for each T ⊆ V with |T| ≤ d − |S| and each α ∈ {0,1}^T as

(μ|_{S←β})_T(α) = μ_{S∪T}(α ∘ β) / μ_S(β) if α_{S∩T} = β_{S∩T}, and 0 otherwise,

where α ∘ β ∈ {0,1}^{T∪S} is the vector assigning α to T and β_{S∖T} to S∖T. Furthermore, we denote by μ_{S→β} ∈ 𝒟(V, d) the pseudodistribution obtained by fixing the variables in S to take the assignment β, which results from moving all the probability mass of μ_{{v_i}} onto the bit b_i, for each v_i ∈ S and the corresponding entry b_i of β. Formally, μ_{S→β} is the pseudodistribution defined for each T ⊆ V with |T| ≤ d and each α ∈ {0,1}^T as

(μ_{S→β})_T(α) = Σ_{α′ ∈ {0,1}^T : α′_{T∖S} = α_{T∖S}} μ_T(α′) if α_{S∩T} = β_{S∩T}, and 0 otherwise.
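For intuition, the two operations can be written out in the special case where μ is an actual joint distribution over all of V (i.e., degree d = n, which is a pseudodistribution in particular); the tuple-based representation is ours.

```python
def condition(joint, S, beta):
    """mu|_{S <- beta}: keep assignments agreeing with beta on S, renormalise.
    `joint` maps full assignments (bit tuples indexed by variable) to
    probabilities; S is a tuple of variable indices, beta the matching bits."""
    agrees = lambda a: all(a[v] == b for v, b in zip(S, beta))
    mass = sum(p for a, p in joint.items() if agrees(a))
    return {a: p / mass for a, p in joint.items() if agrees(a)}

def fix(joint, S, beta):
    """mu_{S -> beta}: overwrite the S-coordinates with beta, transporting
    (rather than discarding) the probability mass of each assignment."""
    out = {}
    for a, p in joint.items():
        b = list(a)
        for v, bit in zip(S, beta):
            b[v] = bit
        out[tuple(b)] = out.get(tuple(b), 0.0) + p
    return out
```

Note the difference: conditioning renormalises the surviving mass, while fixing merely relocates mass and so keeps marginals of untouched variables intact.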

Variable vectors and variable sets.

We will use the convention that for a vector 𝐜 ∈ V^ℓ written in boldface, the regular capital letter C = {𝐜_i}_{i ∈ [ℓ]} denotes the set formed by the values taken by its coordinates.

4 Polylogarithmic approximation for complete Min-NAE-3-SAT

In this section, we present and analyze the rounding scheme that we apply to the Sherali-Adams solution. Hereafter, we fix an instance ℐ of NAE-3-SAT with variable set V, where |V| = n. In order to describe the rounding algorithm, we first introduce the necessary notions.

Fixed variables, ruling value, ruling assignment.

Given a subset W ⊆ V, a pseudodistribution ρ ∈ 𝒟(W, d), and a threshold ξ ∈ [0, 1/2), a variable v is called (ρ, ξ)-fixed if for some bit b ∈ {0,1} the probability ρ_{{v}}(b) is at most ξ. Moreover, the corresponding bit 1 − b, which has probability at least 1 − ξ, is called the ruling value of v. Also, we define F_W(ρ, ξ) to be the set of all the (ρ, ξ)-fixed variables in W. Moreover, we define ω_W(ρ, ξ) ∈ {0,1}^{F_W(ρ, ξ)} to be the corresponding ruling assignment, where the entry for v ∈ F_W(ρ, ξ) equals the ruling value of v in ρ.
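In code, given the singleton marginals ρ_{{v}}(1) (the representation is ours), F_W(ρ, ξ) and ω_W(ρ, ξ) are simply:

```python
def fixed_variables(marginals, xi):
    """Return F_W(rho, xi) and the ruling assignment omega_W(rho, xi).
    `marginals` maps each variable v in W to rho_{v}(1), the probability
    that v is assigned 1; the threshold xi must lie in [0, 1/2)."""
    # v is (rho, xi)-fixed iff one of its two marginal values is <= xi.
    F = {v for v, p1 in marginals.items() if min(p1, 1 - p1) <= xi}
    # The ruling value is the bit carrying probability >= 1 - xi.
    omega = {v: (1 if marginals[v] >= 1 - xi else 0) for v in F}
    return F, omega
```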

LP value of constraint classes.

Given a set V_U ⊆ V of variables (which can be thought of as the set of unfixed variables), we classify the constraints of ℐ = (V, 𝒫) into four groups: for i ∈ {0,1,2,3}, we let ℐ_i{V_U} = (V, 𝒫_i{V_U}) be the sub-instance with constraints 𝒫_i{V_U} = {P_S ∈ 𝒫 : |S ∩ V_U| = i}, i.e., ℐ_i{V_U} can be thought of as the sub-instance consisting of the constraints with exactly i unfixed variables. Then, given a pseudodistribution ρ ∈ 𝒟(V, d), we define for each i ∈ {0,1,2,3} the LP value of the constraints in ℐ_i{V_U} as

$$\mathrm{LP}_i^{V_U}(\rho) \;=\; \sum_{P_S \in \mathcal{P}_i\{V_U\}} \Pr_{\alpha\sim\rho_S}\big[P_S(\alpha)=0\big],$$

so LPiVU(ρ) can be thought of as the contribution of the constraints with exactly i unfixed variables to the Sherali-Adams objective.

Aggregate LP value.

For analyzing our algorithm, we treat the quantities LP0VU(ρ), LP1VU(ρ), LP2VU(ρ) and LP3VU(ρ) separately. However, for the sake of certain probability bounds, it turns out to be useful to combine these into a single weighted sum: given a real parameter τ>0, a set VUV, and a pseudodistribution ρ𝒟(V,d), we define the aggregate value of ρ as the weighted sum

$$A_\tau^{V_U}(\rho) \;=\; \tau\log^3 n\cdot \mathrm{LP}_3^{V_U}(\rho) \;+\; \log^2 n\cdot \mathrm{LP}_2^{V_U}(\rho) \;+\; \log n\cdot \mathrm{LP}_1^{V_U}(\rho) \;+\; \mathrm{LP}_0^{V_U}(\rho). \tag{1}$$

We might drop the subscript τ when it is clear from context (this should not create any confusion, as we will use one fixed value of τ throughout the algorithm and analysis).
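For concreteness, Equation (1) can be evaluated directly from the four LP values; the list interface below is a hypothetical representation chosen for the example.

```python
import math

def aggregate_value(lp, tau, n):
    """Aggregate value of Equation (1): lp[i] plays the role of
    LP_i^{V_U}(rho) for i = 0, 1, 2, 3 (hypothetical representation)."""
    L = math.log(n)
    return tau * L**3 * lp[3] + L**2 * lp[2] + L * lp[1] + lp[0]
```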

Thresholds with bounded increase.

As per the definition above, the notion of fixed variables depends on a threshold ξ. In our algorithm and analysis, we will want to use a threshold that ensures a certain bound on the additive increase in the aggregate value. More precisely, given τ > 0, V_U ⊆ V, ρ ∈ 𝒟(V,d), and a parameter δ ∈ [0,1], we define the set of thresholds with bounded increase to be the set of all θ ∈ [0,1/2) such that

$$A^{V_U\setminus F_{V_U}(\rho,\theta)}\big(\rho_{F_{V_U}(\rho,\theta)\to\omega_{V_U}(\rho,\theta)}\big) \;-\; A^{V_U}(\rho) \;\le\; \frac{6}{\log n}\,A^{V_U}(\rho) \;+\; 12\,\tau\delta\log^2 n\binom{|V_U|}{3},$$

and we denote this set by Θ_τ^{V_U}(ρ,δ). As for A_τ^{V_U}(ρ), we drop the subscript τ when it is clear from context.

𝟐-SAT instances.

Given V_U ⊆ V and a partial assignment α ∈ {0,1/2,1}^V such that α_v ∈ {0,1} for all v ∈ V∖V_U, the Min-2-SAT instance induced by V_U and α is defined as follows: its variable set is V_U, and its constraint set 𝒫_2^{V_U,α} is a multiset of 2-SAT constraints that associates to each P_S ∈ 𝒫_1{V_U} ∪ 𝒫_2{V_U} exactly one constraint P'_S : {0,1}^{S∩V_U} → {0,1}, defined as P'_S(α') = P_S(α' ∘ α_{S∖V_U}) for all α' ∈ {0,1}^{S∩V_U}, where α' ∘ α_{S∖V_U} ∈ {0,1}^S is the vector assigning α' to S∩V_U and α to S∖V_U. We remark that since every P_S ∈ 𝒫 is a NAE-3-SAT constraint, every P'_S ∈ 𝒫_2^{V_U,α} is a 2-SAT or 1-SAT clause.
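The restriction of a NAE-3-SAT constraint with one or two fixed variables can be carried out literal by literal, as in the following sketch (the clause and assignment representations are hypothetical): a NAE clause is violated exactly when all three literals take the same value, so once the fixed literals agree on a value b, the residual requirement is that some unfixed literal differ from b.

```python
def restrict_nae_clause(lits, alpha):
    """Restrict a NAE-3 clause to its unfixed variables.

    `lits` is a list of three literals (variable, negated); `alpha` maps
    each variable to 0, 1, or 0.5 (unfixed).  Assumes one or two of the
    variables are fixed.  Returns None if the restricted constraint is
    already satisfied, else the list of literals whose disjunction must
    hold (a 1-clause or a 2-clause).
    """
    fixed_vals, unfixed = [], []
    for v, neg in lits:
        if alpha[v] == 0.5:
            unfixed.append((v, neg))
        else:
            fixed_vals.append(1 - alpha[v] if neg else alpha[v])
    assert 1 <= len(fixed_vals) <= 2
    if len(set(fixed_vals)) > 1:
        return None  # two fixed literals already differ: NAE holds
    b = fixed_vals[0]
    # Some unfixed literal must take value 1 - b: keep the literal as-is
    # when b = 0, and negate it when b = 1.
    return [(v, not neg) if b == 1 else (v, neg) for v, neg in unfixed]
```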

4.1 Algorithm outline

Algorithm 2 gives our rounding scheme. Throughout its execution, the input instance (V,𝒫) remains unchanged, as do the parameters r, t, ϵ, τ, which depend only on n = |V| and on the fixed universal constant K = 10^{20}.

Algorithm 2 round-pd(,d,μ,α,).

The algorithm maintains a Sherali-Adams solution μ and a partial assignment α. At any intermediate point, some variables V_F are completely fixed: these are the variables v for which α_v has already been determined and α_v ∈ {0,1}. The remaining variables V_U are the unfixed variables: those with α_v = 1/2, i.e., whose assignment is not yet decided. The algorithm is called as round-pd(, d, μ, α^{(0)}, 0), where the Sherali-Adams solution μ is obtained by solving a degree-d Sherali-Adams LP for Min-NAE-3-SAT, with d = polylog(n) to be determined in the analysis below.

On a high level, our algorithm proceeds in stages. At each stage, if the number of unfixed variables is small (specifically |V_U| < K²(log n)^{100}), we simply solve the instance by brute force (line 3). Otherwise, if the value of the sub-instance induced on V_U is large, namely val([V_U], μ[V_U]) > 1/(10τ), we show that the instance is essentially equivalent to 2-SAT, and solve it using a standard LP rounding for 2-SAT (line 9, see Lemma 4.5). The interesting case is when neither of these two cases occurs: we then proceed via a conditioning step and a thresholding step, followed by a recursive call. In the conditioning step (lines 12-18), the algorithm identifies a small set of variables C and an assignment γ to these variables, and conditions on the event that C is assigned γ, obtaining a new pseudodistribution μ̃. Ideally, μ̃ is such that many of the variables in V_U are (μ̃, τδ)-fixed. The thresholding step (lines 19-22) then rounds these variables to their ruling assignment, resulting in a new pseudodistribution μ̂. After updating α_v, for each newly rounded variable v, to its (now integral) value in μ̂, the algorithm continues with a recursive call.

4.2 Analysis

In each stage, after the thresholding step is completed, we will show that the number of unfixed variables (the variables with α_v = 1/2) decreases by a constant factor; hence the number of stages is at most O(log n). Thus, if we could show that the conditioning and thresholding steps do not increase the value of the Sherali-Adams solution μ by more than a 1+O(1/log n) factor, we would obtain an integral assignment with a constant approximation ratio at the end of O(log|V|) stages. While it is unclear whether such a scheme is possible, we instead track the growth of the aggregate value A_τ^{V_U}(μ) = Θ(polylog(n)·val(μ)), and show that A_τ^{V_U}(μ) does not increase by more than a 1+O(1/log n) factor in each stage. Hence, A_τ^{V_U}(μ) always remains within a constant factor of A_τ^V of the initial Sherali-Adams solution. By the definition of the aggregate value, a constant-factor bound on A_τ^{V_U}(μ) implies a polylog(n)-approximation, giving us the desired result. The rest of this section is devoted to formalizing and proving these statements.

We begin by analyzing the conditioning step. Informally, we show that if we condition on a small random set of variables and a suitable assignment sampled from their local distribution, then many variables will be fixed with good probability in the obtained conditional pseudodistribution. More precisely, we have the following lemma (see full version for proof):

Lemma 4.1 (Conditioning-to-Fixing).

Let d, r, t be positive integers, let W ⊆ V, let ρ ∈ 𝒟(W,d), and let τ ≥ 1. If val([W],ρ) ≤ 1/(3τ), |W| ≥ 3, d ≥ r + 2t + 3, r ≥ 10³·3^{2t}, and t ≥ 10^6, then there exists r' ∈ {0, …, r} such that

$$\Pr_{\mathbf{c}\sim W^{r'+2t},\;\gamma\sim\rho_C}\left[\Big|F_W\Big(\rho|_{C\to\gamma},\;\tau\cdot\mathrm{val}([W],\rho)\Big)\Big|\ge\frac{|W|}{100}\right] \;\ge\; 1-\left(\frac{10^7}{\tau}+\frac{5\cdot 3^{2t/3}}{r^{1/3}}+\frac{5t^2}{|W|}+3e^{-t/10^6}\right).$$

Next, we analyze the change in the aggregate value from the thresholding step. Informally, we bound the additive increase of the aggregate value after line 22 compared to the aggregate value before line 19. More precisely, we have the following lemma (see full version for proof):

Lemma 4.2 (Thresholding).

Let d be a positive integer, let ρ ∈ 𝒟(V,d), let V_U ⊆ V be such that for all v ∈ V∖V_U there is b ∈ {0,1} satisfying ρ_{{v}}(b) = 1, and let τ ≥ 1 and δ ∈ [0, 1/(10τ)]. Then, there exists θ ∈ [τδ, 2τδ] such that θ ∈ Θ_τ^{V_U}(ρ,δ). Moreover, such a value can be found in time poly(|V_U|).

The idea is to combine the two lemmas above to ensure that when we reach the base of the recursion we have a pseudodistribution μ whose aggregate value is within a constant factor of the original one.

Lemma 4.3.

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Then, running Algorithm 2 as round-pd(, K² log^c n, μ, α^{(0)}, 0) must reach line 3 or line 9 with an integer d ≥ 3, at recursion depth at most 100 log n, with a pseudodistribution μ' ∈ 𝒟(V,d) and a set V_U ⊆ V such that

$$A^{V_U}(\mu') \;\le\; e^{5000}\cdot A^{V}(\mu) \;=\; O(1)\cdot A^{V}(\mu).$$

Proof.

Notice that before we reach line 3 or 9, every recursive call of Algorithm 2 involves one conditioning step and one thresholding step. Consider in particular a call of Algorithm 2 that starts from a pseudodistribution μ ∈ 𝒟(V,d) and a set of unfixed variables V_U ⊆ V such that neither line 3 nor line 9 is reached. Then, the conditioning step produces a pseudodistribution μ̃. This step changes neither the assignment α nor the set of unfixed variables V_U. Next, the thresholding step rounds some variables in μ̃, producing a pseudodistribution μ̂, and updates the assignment α. Let us call the updated assignment α', and let V_U' = {v ∈ V : α'_v = 1/2} be the set of unfixed variables with respect to α'. For convenience, let us define A(μ) := A^{V_U}(μ), A(μ̃) := A^{V_U}(μ̃), and A(μ̂) := A^{V_U'}(μ̂). In words, A(μ) and A(μ̃) are defined with respect to the set V_U, whereas A(μ̂) is defined with respect to the set V_U' obtained after thresholding. We will show two things: first, that |V_U'| ≤ (99/100)·|V_U|, and second, that the increase A(μ̂) − A(μ) in the aggregate value is at most 50·A(μ)/log n.

Recall our setting of parameters: K = 10^{20}, r = (log n)^{100K}, t = K·log log n, ϵ = 1/(10 log n), τ = K·log² n, and c = 200K². Since lines 3 and 9 are not reached, we may assume that |V_U| ≥ K² log^{100} n and that val([V_U], μ[V_U]) ≤ 1/(10τ). By the guarantee of Lemma 4.1, applied with W = V_U and ρ = μ[V_U], there exists r' ∈ {0, …, r} such that

$$\Pr_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\left[\Big|F_{V_U}\Big(\mu|_{C\to\gamma},\;\tau\cdot\mathrm{val}([V_U],\mu[V_U])\Big)\Big|\ge\frac{|V_U|}{100}\right] \;\ge\; 1-\left(\frac{10^7}{\tau}+\frac{5\cdot 3^{2t/3}}{r^{1/3}}+\frac{5t^2}{|V_U|}+3e^{-t/10^6}\right) \;\ge\; 1-\frac{1}{100\log n}.$$

We also observe that for all i{0,1,2,3} we have

$$\mathbb{E}_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\big[\mathrm{LP}_i^{V_U}(\mu|_{C\to\gamma})\big] \;=\; \mathrm{LP}_i^{V_U}(\mu),$$

which follows from the tower property of conditional expectation. (Note that while the conditioning "fixes" the variables we condition on, we do not yet move these variables from V_U to V_F; this happens only after the thresholding step. Hence the sets V_U and V_F remain unchanged, and a constraint contributing to LP_i^{V_U}(μ) continues to contribute to LP_i^{V_U}(μ|_{C→γ}) for each i ∈ {0,1,2,3}.) This in turn means that

$$\mathbb{E}_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\big[A^{V_U}(\mu|_{C\to\gamma})\big] \;=\; A^{V_U}(\mu) \;=\; A(\mu).$$

By Markov’s inequality, it follows that

$$\Pr_{\mathbf{c}\sim V_U^{r'+2t},\;\gamma\sim\mu_C}\big[A^{V_U}(\mu|_{C\to\gamma}) \ge (1+\epsilon)\,A(\mu)\big] \;\le\; 1-\epsilon/2 \;=\; 1-\frac{1}{20\log n}.$$

It follows that there exists a choice of 𝐜 ∈ V_U^{r'+2t} and γ ∈ supp(μ_C) such that we simultaneously have

$$\Big|F_{V_U}\Big(\mu|_{C\to\gamma},\;\tau\cdot\mathrm{val}([V_U],\mu)\Big)\Big| \ge \frac{|V_U|}{100} \quad\text{and}\quad A^{V_U}(\mu|_{C\to\gamma}) \le (1+\epsilon)\,A(\mu).$$

In particular, we must have

$$\Big|F_{V_U}\Big(\tilde{\mu},\;\tau\cdot\mathrm{val}([V_U],\mu)\Big)\Big| \ge \frac{|V_U|}{100} \quad\text{and}\quad A(\tilde{\mu}) \le (1+\epsilon)\,A(\mu).$$

Next, we analyze the thresholding step. Note that the pseudodistribution μ̂ is obtained by rounding μ̃. Applying Lemma 4.2 with ρ = μ̃, it follows that

$$A(\hat{\mu}) - A(\tilde{\mu}) \;\le\; \frac{6}{\log n}\,A(\tilde{\mu}) \;+\; 12\,\tau\delta\binom{|V_U|}{3}\log^2 n.$$

Note that $\tau\delta\binom{|V_U|}{3}\log^2 n = \tau\,\mathrm{LP}_3^{V_U}(\mu)\log^2 n \le A(\mu)/\log n$, since $\delta\binom{|V_U|}{3} = \mathrm{LP}_3^{V_U}(\mu)$. This gives

$$A(\hat{\mu}) - A(\tilde{\mu}) \;\le\; \frac{20\,\big(A(\tilde{\mu}) + A(\mu)\big)}{\log n}.$$

Using the fact that A(μ̃) ≤ (1+ϵ)·A(μ), it follows that

$$A(\hat{\mu}) - A(\tilde{\mu}) \;\le\; \frac{45\,A(\mu)}{\log n}.$$

Finally, combining the conditioning and rounding steps, we obtain

$$A(\hat{\mu}) - A(\mu) \;=\; \big(A(\hat{\mu}) - A(\tilde{\mu})\big) + \big(A(\tilde{\mu}) - A(\mu)\big) \;\le\; \frac{45\,A(\mu)}{\log n} + \epsilon\,A(\mu) \;\le\; \frac{50\,A(\mu)}{\log n},$$

where the last inequality follows since ϵ = 1/(10 log n).

Note that after each stage of conditioning and thresholding described above, we fix at least a 1/100 fraction of the unfixed variables V_U. It follows that after at most 100 log n such stages, the pseudodistribution μ and the set of unfixed variables V_U satisfy A^{V_U}(μ) ≤ (1 + 50/log n)^{100 log n}·A^V(μ) ≤ e^{5000}·A^V(μ).
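The last inequality is the standard estimate $1+x \le e^x$ applied with $x = 50/\log n$:

$$\Big(1+\frac{50}{\log n}\Big)^{100\log n} \;\le\; \exp\Big(\frac{50}{\log n}\cdot 100\log n\Big) \;=\; e^{5000}.$$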

The following two lemmas, whose proofs are deferred to Sections 6.1 and 6.2 respectively, prove the guarantees for the two base cases of Algorithm 2. When the number of unfixed variables becomes sufficiently small (line 3), we can compute the optimal solution by brute force (Lemma 4.4). When δ = LP_3(μ)/\binom{|V_U|}{3} is large (line 9), we can simply ignore the clauses all three of whose variables are unfixed, which reduces the problem to Min-2-SAT (Lemma 4.5), for which a polylog(n)-approximation is well known.

Lemma 4.4 (Brute force base case).

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 3. Then, the output α of line 3 satisfies val(α) ≤ val(μ).

Lemma 4.5 (Min-2-SAT base case).

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 9. Then, the output α of line 9 satisfies val(α) ≤ O(log n)·A^{V_U}(μ)/\binom{|V_U|}{3}.

We finish the proof of our main result.

Proof of Theorem 1.1.

Let μ ∈ 𝒟(V, K² log^c n) be obtained by solving the Sherali-Adams relaxation, where c = 200K². Run Algorithm 2 as round-pd(, K² log^c n, μ, α^{(0)}, 0), where α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Lemma 4.3 guarantees that line 3 or line 9 is reached with a pseudodistribution μ and a set of unfixed variables V_U such that A^{V_U}(μ) ≤ O(1)·A^V(μ). There are now two cases, depending on which line is reached when the algorithm terminates.

  • Line (3) is reached. In this case, by Lemma 4.4, we obtain an assignment α with val(α) ≤ val(μ). Note that

    $$\mathrm{val}(\mu) \;=\; \frac{1}{\binom{|V|}{3}}\sum_{i=0}^{3}\mathrm{LP}_i^{V_U}(\mu) \;\le\; \frac{1}{\binom{|V|}{3}}\,A^{V_U}(\mu).$$

    This quantity is at most

    $$\frac{1}{\binom{|V|}{3}}\cdot O(1)\cdot A^{V}(\mu) \;\le\; \frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^3 n)\cdot\sum_{i=0}^{3}\mathrm{LP}_i^{V}(\mu),$$

    which is equal to

    $$\frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^3 n)\cdot \mathrm{LP}_3^{V}(\mu) \;=\; O(\tau\log^3 n)\cdot \mathrm{val}(\mu),$$

    as desired.

  • Line (9) is reached. In this case, by Lemma 4.5, we obtain an assignment α with val(α) ≤ O(log n)·A^{V_U}(μ)/\binom{|V_U|}{3}. Again, we have

    $$\mathrm{val}(\alpha) \;\le\; \frac{1}{\binom{|V|}{3}}\cdot O(\log n)\cdot A^{V_U}(\mu) \;\le\; \frac{1}{\binom{|V|}{3}}\cdot O(\log n)\cdot A^{V}(\mu),$$

    which is at most

    $$\frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^4 n)\cdot\sum_{i=0}^{3}\mathrm{LP}_i^{V}(\mu) \;=\; \frac{1}{\binom{|V|}{3}}\cdot O(\tau\log^4 n)\cdot \mathrm{LP}_3^{V}(\mu).$$

    By definition, the above is O(τ log⁴ n)·val(μ). Finally, recalling that τ = K log² n = O(log² n), we get val(α) ≤ O(log⁶ n)·val(μ). This concludes the proof of correctness.

For the running time, note that in a given stage of the algorithm, involving one conditioning and one thresholding step, the bottleneck is guessing the set of r + 2t variables and their assignments to condition on. Since r + 2t = O(log^{100K} n), this step takes quasi-polynomial time. There are at most 100 log n stages by Lemma 4.3, so the total time remains quasi-polynomial. Finally, we can obtain a degree-d Sherali-Adams solution in n^{O(d)} time; recall that we solve a degree-K² log^c n Sherali-Adams relaxation to obtain μ, where c = 200K², so this step takes quasi-polynomial time as well. It follows that the overall running time is quasi-polynomial.

5 Polynomial time algorithm for complete 𝒌-CSPs

In this section, we show a simple algorithm that, given any complete n-variable k-CSP instance (V,𝒫) over the Boolean alphabet {0,1}, decides in n^{O(k)} time whether there is a satisfying assignment. This improves on the algorithm of [4], which runs in quasi-polynomial time for any fixed k.

The quasi-polynomial time algorithm of [4] uses a bound on the number of satisfying assignments of any complete Boolean k-CSP, proved by the same authors. We use the same tool, which can be stated as follows.

Lemma 5.1 (Lemma 3.1 of [4]).

Let (V,𝒫) be a complete k-CSP. Then, the number of satisfying assignments to it is at most O(|V|^{k−1}).

Our algorithm is very simple. It fixes an ordering v₁, v₂, …, v_n of V and keeps track of all satisfying assignments for the sub-instance induced on the first i variables, for i = k, k+1, …, n. Note that the sub-instance induced on the first i variables is a complete k-CSP on i variables, hence the number of such assignments is at most O(i^{k−1}) by Lemma 5.1. At the i-th step, we simply try to extend each satisfying assignment with x_i = 0 and with x_i = 1. At the end, it suffices to test whether any satisfying assignment remains. See Algorithm 3 for the pseudocode.

Algorithm 3 decide-csp().

The correctness of the algorithm is immediate. At every stage, the number of satisfying assignments, and hence the size of 𝒮, is O(n^{k−1}). Hence, the algorithm runs in n^{O(k)} time.
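The procedure above admits a short implementation; the sketch below assumes (hypothetically) that variables are indexed 0, …, n−1 and that `constraints` maps each sorted k-tuple of indices to a {0,1}-valued predicate on k bits.

```python
from itertools import combinations, product

def decide_csp(n, constraints, k):
    """Decide whether a complete k-CSP over {0,1} is satisfiable.

    S holds every assignment (tuple of bits) to the first i variables
    satisfying all constraints among them; on complete instances its size
    stays O(i^{k-1}) by Lemma 5.1, so the total time is n^{O(k)}.
    """
    # Base case: all assignments to the first k variables satisfying
    # the unique constraint on them.
    S = [bits for bits in product((0, 1), repeat=k)
         if constraints[tuple(range(k))](bits)]
    for i in range(k, n):
        # Extend each surviving assignment with x_i = 0 and x_i = 1,
        # checking every constraint whose highest variable is v_i.
        S = [a + (b,)
             for a in S for b in (0, 1)
             if all(constraints[sub + (i,)](tuple((a + (b,))[j] for j in sub + (i,)))
                    for sub in combinations(range(i), k - 1))]
        if not S:
            return False
    return bool(S)
```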

6 Deferred proofs from Section 4

6.1 Proof of Lemma 4.4

In this section we prove the correctness of the base case of the recursion reached when the execution of Algorithm 2 arrives at line 3. The claim is restated below for the reader's convenience.

Lemma 4.4 (Brute force base case). [Restated, see original statement.]

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 3. Then, the output α of line 3 satisfies val(α) ≤ val(μ).

Proof.

Notice that line 3 is reached after at most 100 log n stages of conditioning and thresholding, by Lemma 4.3. Each conditioning step costs at most r + 2t ≤ 2r = 2(log n)^{100K} degrees of the Sherali-Adams relaxation. Hence, at this point we have a pseudodistribution μ ∈ 𝒟(V,d) where d ≥ K² log^c n − 100 log n · 2(log n)^{100K} ≥ K² log^{100} n ≥ |V_U|.

This in turn means that the Sherali-Adams solution μ is an actual distribution over integral assignments to the unfixed variables V_U. In particular, there exists an integral assignment β_U to the variables in V_U such that the combined assignment β, which assigns β_U to the variables of V_U and the already fixed assignment α_{V_F} to the variables of V_F, satisfies val(β) ≤ val(μ); such an assignment is found by the brute-force search.

6.2 Proof of Lemma 4.5

In this section we show that when the sub-instance induced by the unfixed variables has large objective value, these constraints can be safely ignored, and we can then focus on solving the Min-2-SAT instance given by the constraints with one or two fixed variables. The claim is restated below for the reader's convenience.

Lemma 4.5 (Min-2-SAT base case). [Restated, see original statement.]

Let c = 200K², let μ ∈ 𝒟(V, K² log^c n), and let α^{(0)} ∈ {0,1/2,1}^V with α^{(0)}_v = 1/2 for all v ∈ V. Also let d, μ ∈ 𝒟(V,d), α ∈ {0,1/2,1}^V, and V_U = {v : α_v = 1/2} be the degree, the pseudodistribution, the partial assignment, and the corresponding set of unfixed variables when Algorithm 2, run as round-pd(, K² log^c n, μ, α^{(0)}, 0), reaches line 9. Then, the output α of line 9 satisfies val(α) ≤ O(log n)·A^{V_U}(μ)/\binom{|V_U|}{3}.

Proof.

Notice that line 9 is reached after at most 100 log n stages of conditioning and thresholding, by Lemma 4.3. Each conditioning step costs at most r + 2t ≤ 2r = 2(log n)^{100K} degrees of the Sherali-Adams relaxation. Hence, at this point we have a pseudodistribution μ ∈ 𝒟(V,d) where d ≥ K² log^c n − 100 log n · 2(log n)^{100K} ≥ K² log^{100} n ≥ 3.

We now observe that, since val([V_U], μ[V_U]) > 1/(10τ), for every β ∈ {0,1}^{V_U} the value of the induced sub-instance can be bounded as

$$\mathrm{val}([V_U],\beta) \;\le\; 1 \;\le\; 10\tau\cdot \mathrm{val}([V_U],\mu[V_U]).$$

Now, Algorithm 2 constructs the Min-2-SAT instance (V_U, 𝒫_2^{V_U,α}) by ignoring the constraints that have all three variables in V_U and those that have all three variables in V_F = V∖V_U. Hence, every constraint in 𝒫_2^{V_U,α} involves one or two variables from V_U. We recall that since every P_S ∈ 𝒫 is a NAE-3-SAT constraint, every P'_S ∈ 𝒫_2^{V_U,α} is a 2-SAT or 1-SAT clause.

Let ¬V_U = {¬v : v ∈ V_U} be the set of negative literals. For convenience, we represent this instance as (V_U, 𝒞), where 𝒞 is the multiset of clauses corresponding to the constraints 𝒫_2^{V_U,α}. Each 2-SAT clause in 𝒞 is the disjunction ℓ₁∨ℓ₂ of two literals ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U. We also allow ourselves to write 2-SAT clauses as follows: given ℓ₁∨ℓ₂ with ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U corresponding to variables v, w ∈ V_U respectively, we identify ℓ₁∨ℓ₂ with a tuple (v, w, p₁, p₂), where p₁, p₂ : {0,1} → {0,1} are bijections describing the literal patterns of ℓ₁ and ℓ₂ respectively. More precisely, each of p₁ and p₂ is one of the mappings Id : b ↦ b or Id̄ : b ↦ 1−b. For example, the clause ¬v∨w may also be written as (v, w, Id̄, Id). Note that this handles 1-SAT clauses too, since v is allowed to equal w. Let us further define

$$\mathrm{val}(\mathcal{C},\mu) \;=\; \mathbb{E}_{(v,w,p_1,p_2)\sim\mathcal{C}}\Big[\Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_1(\sigma_v)=0,\;p_2(\sigma_w)=0\big]\Big]$$

to be the average probability with which a clause is unsatisfied. We note that

$$|\mathcal{C}|\cdot \mathrm{val}(\mathcal{C},\mu) \;\le\; \mathrm{LP}_2^{V_U}(\mu) + \mathrm{LP}_1^{V_U}(\mu),$$

since each clause involves either one or two variables from V_U, with its remaining variables in V∖V_U (recall that LP_i^{V_U}(μ) is the sum of the values of the constraints with exactly i variables from V_U).

Our goal is to show that the Sherali-Adams solution μ at this point yields a feasible solution for a standard LP relaxation of the Min-2-SAT instance. By standard results in the literature, the integrality gap of this LP relaxation is at most O(log² n); hence, running the rounding algorithm certifying this gap gives an assignment α' ∈ {0,1}^{V_U} that violates at most O(log² n)·(LP_2^{V_U}(μ) + LP_1^{V_U}(μ)) clauses. Combining α' with α, and accounting for the constraints with all three variables in V_U and those with all three variables in V∖V_U, we get an assignment in {0,1}^V whose total number of violated constraints is at most

$$10\tau\cdot \mathrm{LP}_3^{V_U}(\mu) \;+\; O(\log^2 n)\cdot\big(\mathrm{LP}_2^{V_U}(\mu) + \mathrm{LP}_1^{V_U}(\mu)\big) \;+\; \mathrm{LP}_0^{V_U}(\mu) \;\le\; O(\log n)\cdot A^{V_U}(\mu),$$

which will finish the proof.

We now restrict our attention to showing that the Sherali-Adams solution μ is feasible for the standard LP relaxation of our Min-2-SAT instance (see for example [27]), stated below for convenience. The relaxation solves for a metric d : (V_U ∪ ¬V_U) × (V_U ∪ ¬V_U) → ℝ_{≥0} over the set of literals.

$$\begin{array}{lll}
\text{minimize} & \displaystyle\sum_{(\ell_1\vee\ell_2)\in\mathcal{C}} \tfrac{1}{2}\big(d(\neg\ell_1,\ell_2)+d(\neg\ell_2,\ell_1)\big) & \\[4pt]
\text{subject to} & d(v,\neg v)+d(\neg v,v)\ge 1, & \forall\, v\in V_U \qquad \text{(LP-2-SAT)}\\
& d(\ell_1,\ell_2)\ge 0, & \forall\, \ell_1,\ell_2\in V_U\cup\neg V_U\\
& d(\ell_1,\ell_3)\le d(\ell_1,\ell_2)+d(\ell_2,\ell_3), & \forall\, \ell_1,\ell_2,\ell_3\in V_U\cup\neg V_U
\end{array}$$

Notice that this is indeed a relaxation: given any assignment, for each violated clause (ℓ₁∨ℓ₂) ∈ 𝒞 we set d(¬ℓ₁,ℓ₂) = d(¬ℓ₂,ℓ₁) = 1, for every satisfied clause (ℓ₁∨ℓ₂) ∈ 𝒞 we set d(¬ℓ₁,ℓ₂) = d(¬ℓ₂,ℓ₁) = 0, and for every other pair ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U we obtain d(ℓ₁,ℓ₂) by metric completion. Formally, d(ℓ₁,ℓ₂) is the shortest length d(ℓ₁,z₁) + d(z₁,z₂) + ⋯ + d(z_t,ℓ₂) over all choices of literals z₁, …, z_t ∈ V_U ∪ ¬V_U such that (¬ℓ₁∨z₁), (¬z₁∨z₂), …, (¬z_t∨ℓ₂) ∈ 𝒞. In other words, we turn each clause ℓ₁∨ℓ₂ into the two implications ¬ℓ₁ ⇒ ℓ₂ and ¬ℓ₂ ⇒ ℓ₁, set the lengths d(¬ℓ₁,ℓ₂) and d(¬ℓ₂,ℓ₁) to 1 if and only if the corresponding implication is unsatisfied, and let the remaining distances be shortest-path distances in the graph whose edges are formed by these implications. We have the following standard result from the classical approximation algorithms literature.

Theorem 6.1 ([27]).

There is a rounding algorithm that, given any feasible solution to (LP-2-SAT) of objective value ϕ ≥ 0, outputs in polynomial time an assignment with at most O(log² n)·ϕ unsatisfied clauses.

Now we show that given a degree-d Sherali-Adams solution with d ≥ 3, we can obtain a feasible solution to (LP-2-SAT) whose value is at most the value of the Sherali-Adams relaxation. We remark that this is a standard fact, but we prove it here for completeness.

Claim 6.2.

Let d ≥ 3 be an integer and let μ ∈ 𝒟(V,d). Then, one can construct a feasible solution to (LP-2-SAT) with objective value at most |𝒞|·val(𝒞,μ). (We note that |𝒞| is merely a scaling factor, since val is defined as an average whereas the objective of (LP-2-SAT) is a sum.)

Proof.

The construction is straightforward. For each ℓ₁, ℓ₂ ∈ V_U ∪ ¬V_U, letting (v, w, p₁, p₂) be the tuple representation of ℓ₁∨ℓ₂, we set d(ℓ₁,ℓ₂) = Pr_{σ∼μ_{{v,w}}}[p₁(σ_v) = 1, p₂(σ_w) = 0]. In words, the length of the edge (ℓ₁,ℓ₂) is the probability that the implication ℓ₁ ⇒ ℓ₂ is falsified under the pseudodistribution μ. It remains to verify that this solution is feasible. Clearly all the d(ℓ₁,ℓ₂) are non-negative. Moreover, for each v ∈ V_U we have d(v,¬v) = Pr_{σ∼μ_{{v}}}[σ = 1] and d(¬v,v) = Pr_{σ∼μ_{{v}}}[σ = 0], so that d(v,¬v) + d(¬v,v) = 1. Next, we check the triangle inequality. Let ℓ₁, ℓ₂, ℓ₃ ∈ V_U ∪ ¬V_U, and let (u, v, p₁, p₂), (v, w, p₂, p₃), (u, w, p₁, p₃) be the tuple representations of ℓ₁∨ℓ₂, ℓ₂∨ℓ₃, ℓ₁∨ℓ₃, respectively. Then
$$\begin{aligned}
d(\ell_1,\ell_3) \;&=\; \Pr_{\sigma\sim\mu_{\{u,w\}}}\big[p_1(\sigma_u)=1,\,p_3(\sigma_w)=0\big]\\
&=\; \Pr_{\sigma\sim\mu_{\{u,v,w\}}}\big[p_1(\sigma_u)=1,\,p_2(\sigma_v)=0,\,p_3(\sigma_w)=0\big] + \Pr_{\sigma\sim\mu_{\{u,v,w\}}}\big[p_1(\sigma_u)=1,\,p_2(\sigma_v)=1,\,p_3(\sigma_w)=0\big]\\
&\le\; \Pr_{\sigma\sim\mu_{\{u,v\}}}\big[p_1(\sigma_u)=1,\,p_2(\sigma_v)=0\big] + \Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_2(\sigma_v)=1,\,p_3(\sigma_w)=0\big]\\
&=\; d(\ell_1,\ell_2)+d(\ell_2,\ell_3)
\end{aligned}$$

as desired. Finally, the LP objective value of this solution, $\sum_{(\ell_1\vee\ell_2)\in\mathcal{C}}\tfrac{1}{2}\big(d(\neg\ell_1,\ell_2)+d(\neg\ell_2,\ell_1)\big)$, is bounded as

$$\sum_{(v,w,p_1,p_2)\in\mathcal{C}} \tfrac{1}{2}\Big(\Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_1(\sigma_v)=0,\,p_2(\sigma_w)=0\big]+\Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_2(\sigma_w)=0,\,p_1(\sigma_v)=0\big]\Big) \;=\; \sum_{(v,w,p_1,p_2)\in\mathcal{C}} \Pr_{\sigma\sim\mu_{\{v,w\}}}\big[p_1(\sigma_v)=0,\,p_2(\sigma_w)=0\big] \;=\; |\mathcal{C}|\cdot \mathrm{val}(\mathcal{C},\mu),$$

where the first expression uses that ¬ℓ₁ has pattern Id̄∘p₁ (and similarly for ¬ℓ₂), and the last equality is exactly the definition of val(𝒞,μ).
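The construction in the proof of Claim 6.2 can be sanity-checked on an actual distribution over assignments (for which the pseudodistribution properties certainly hold). The sketch below uses a hypothetical toy representation: a distribution is a list of (assignment, probability) pairs, and a literal is a pair (variable, negated).

```python
def lp2sat_distance(dist, l1, l2):
    """d(l1, l2) = Pr[l1 holds and l2 fails]: the probability that the
    implication l1 => l2 is falsified, as in the proof of Claim 6.2."""
    def holds(lit, assignment):
        v, neg = lit
        return bool(assignment[v]) != neg
    return sum(p for a, p in dist if holds(l1, a) and not holds(l2, a))
```

The triangle inequality holds because the event {ℓ₁ ∧ ¬ℓ₃} is contained in the union of {ℓ₁ ∧ ¬ℓ₂} and {ℓ₂ ∧ ¬ℓ₃}, mirroring the chain of (in)equalities above.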

7 Hardness of Min-NAE-𝟑-SAT on dense instances

We show that solving Min-NAE-3-SAT exactly on dense instances is almost as hard as on general instances (which is known to be NP-hard [34]).

Claim 7.1.

For any small enough constant 0 < ε < 1/1000, there is an approximation-preserving polynomial-time reduction from a general instance of Min-NAE-3-SAT to a dense instance of Min-NAE-3-SAT whose constraint hypergraph has at least (1−ε)·\binom{n}{3} constraints, and in which every variable appears in at least (1−ε)·\binom{n}{2} constraints.

Proof.

Given a general instance of Min-NAE-3-SAT with variable set V₀, where n₀ = |V₀|, and constraint set 𝒞, add a set V_d of |V_d| = O(n₀/ε) dummy variables, so that the total number of variables becomes n = n₀ + |V_d|. First, we add \binom{|V_d|}{3} constraints on V_d, chosen so that the all-true assignment satisfies them: for every three variables v₁, v₂, v₃ ∈ V_d, we add the NAE-3-SAT clause (v₁, v₂, ¬v₃). Next, we add the constraints between the dummy variables V_d and V₀: for every pair of variables v₁, v₂ ∈ V_d and every variable v ∈ V₀, we add the NAE-3-SAT constraint (v, v₁, ¬v₂). The output is the original instance combined with the dummy variables and constraints. The number of constraints is at least \binom{|V_d|}{3} + |V₀|·\binom{|V_d|}{2}, which is at least (1−ε)·\binom{n}{3} for a small enough ε, since only the triples containing at least two variables of V₀ can be missing. Moreover, every variable appears in at least \binom{|V_d|}{2} ≥ (1−ε)·\binom{n}{2} clauses for a small enough ε.

For any assignment of the original variables, one can satisfy all the dummy constraints by setting all the dummy variables to true. In the other direction, given any assignment of the new instance, changing all dummy variables to true only satisfies more constraints, so afterwards only the original constraints can possibly be violated.

Therefore, the optimal values of the two instances are the same.
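The construction above can be summarized in the following sketch (the clause representation and function names are hypothetical; a clause is NAE-satisfied unless all three literal values coincide).

```python
from itertools import combinations

def dummy_clauses(n0, num_dummies):
    """Dummy constraints of Claim 7.1.  Original variables are 0..n0-1,
    dummies are n0..n0+num_dummies-1 (choose num_dummies = Theta(n0/eps));
    a literal is (variable, negated) with negated in {0, 1}."""
    dummies = range(n0, n0 + num_dummies)
    clauses = []
    # NAE(v1, v2, not v3) for every triple of dummies: the all-true
    # assignment gives literal values 1, 1, 0, so the clause holds.
    for v1, v2, v3 in combinations(dummies, 3):
        clauses.append(((v1, 0), (v2, 0), (v3, 1)))
    # NAE(v, v1, not v2) for every original v and pair of dummies.
    for v in range(n0):
        for v1, v2 in combinations(dummies, 2):
            clauses.append(((v, 0), (v1, 0), (v2, 1)))
    return clauses

def nae_satisfied(clause, assign):
    # A NAE clause fails iff all three literal values are equal.
    return len({assign[v] ^ neg for v, neg in clause}) > 1
```

Because every added clause contains both a positive and a negated dummy literal, the all-true setting of the dummies satisfies all of them regardless of the original variables, which is the key property used in the proof.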

References

  • [1] Amit Agarwal, Moses Charikar, Konstantin Makarychev, and Yury Makarychev. O(logn) approximation algorithms for min uncut, min 2cnf deletion, and directed cut problems. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 573–581, 2005. doi:10.1145/1060590.1060675.
  • [2] Vedat Levi Alev, Fernando Granha Jeronimo, and Madhur Tulsiani. Approximating constraint satisfaction problems on high-dimensional expanders. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 180–201. IEEE, 2019. doi:10.1109/FOCS.2019.00021.
  • [3] Noga Alon, W Fernandez De La Vega, Ravi Kannan, and Marek Karpinski. Random sampling and approximation of MAX-CSPs. Journal of computer and system sciences, 67(2):212–243, 2003. doi:10.1016/S0022-0000(03)00008-4.
  • [4] Aditya Anand, Euiwoong Lee, and Amatya Sharma. Min-csps on complete instances. In Proceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1178–1201. SIAM, 2025. doi:10.1137/1.9781611978322.35.
  • [5] Sanjeev Arora, David Karger, and Marek Karpinski. Polynomial time approximation schemes for dense instances of NP-hard problems. In Proceedings of the twenty-seventh annual ACM symposium on Theory of computing, pages 284–293, 1995. doi:10.1145/225058.225140.
  • [6] Sanjeev Arora, Subhash A Khot, Alexandra Kolla, David Steurer, Madhur Tulsiani, and Nisheeth K Vishnoi. Unique games on expanding constraint graphs are easy. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 21–28, 2008.
  • [7] Mitali Bafna, Boaz Barak, Pravesh K Kothari, Tselil Schramm, and David Steurer. Playing unique games on certified small-set expanders. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 1629–1642, 2021. doi:10.1145/3406325.3451099.
  • [8] Boaz Barak, Moritz Hardt, Thomas Holenstein, and David Steurer. Subsampling mathematical relaxations and average-case complexity. In Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms, pages 512–531. SIAM, 2011. doi:10.1137/1.9781611973082.41.
  • [9] Boaz Barak, Prasad Raghavendra, and David Steurer. Rounding semidefinite programming hierarchies via global correlation. In 2011 ieee 52nd annual symposium on foundations of computer science, pages 472–481. IEEE, 2011. doi:10.1109/FOCS.2011.95.
  • [10] Cristina Bazgan, W Fernandez de La Vega, and Marek Karpinski. Polynomial time approximation schemes for dense instances of minimum constraint satisfaction. Random Structures & Algorithms, 23(1):73–91, 2003. doi:10.1002/rsa.10072.
  • [11] Joshua Brakensiek, Neng Huang, Aaron Potechin, and Uri Zwick. On the mysteries of max nae-sat. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 484–503. SIAM, 2021. doi:10.1137/1.9781611976465.30.
  • [12] Joshua Brakensiek, Neng Huang, and Uri Zwick. Tight approximability of max 2-sat and relatives, under ugc. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1328–1344. SIAM, 2024. doi:10.1137/1.9781611977912.53.
  • [13] Nairen Cao, Vincent Cohen-Addad, Euiwoong Lee, Shi Li, Alantha Newman, and Lukas Vogl. Understanding the cluster LP for correlation clustering. In Proceedings of the 56th Annual ACM SIGACT Symposium on Theory of Computing, 2024.
  • [14] Amin Coja-Oghlan, Colin Cooper, and Alan Frieze. An efficient sparse regularity concept. SIAM Journal on Discrete Mathematics, 23(4):2000–2034, 2010. doi:10.1137/080730160.
  • [15] W Fernandez de la Vega, Marek Karpinski, Ravi Kannan, and Santosh Vempala. Tensor decomposition and approximation schemes for constraint satisfaction problems. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 747–754, 2005.
  • [16] Wenceslas Fernandez de la Vega and Claire Kenyon-Mathieu. Linear programming relaxations of maxcut. In Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pages 53–61. Citeseer, 2007. URL: http://dl.acm.org/citation.cfm?id=1283383.1283390.
  • [17] Dimitris Fotakis, Michail Lampis, and Vangelis Paschos. Sub-exponential approximation schemes for CSPs: From dense to almost sparse. In 33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016), pages 37–1, 2016.
  • [18] Alan Frieze and Ravi Kannan. The regularity lemma and approximation schemes for dense problems. In Proceedings of 37th conference on foundations of computer science, pages 12–20. IEEE, 1996. doi:10.1109/SFCS.1996.548459.
  • [19] Venkatesan Guruswami and Ali Kemal Sinop. Lasserre hierarchy, higher eigenvalues, and approximation schemes for graph partitioning and quadratic integer programming with psd objectives. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 482–491. IEEE, 2011. doi:10.1109/FOCS.2011.36.
  • [20] Johan Håstad. Some optimal inapproximability results. Journal of the ACM (JACM), 48(4):798–859, 2001. doi:10.1145/502090.502098.
  • [21] Fernando Granha Jeronimo, Dylan Quintana, Shashank Srivastava, and Madhur Tulsiani. Unique decoding of explicit ε-balanced codes near the gilbert-varshamov bound. In Sandy Irani, editor, 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 434–445. IEEE, IEEE, 2020. doi:10.1109/FOCS46700.2020.00048.
  • [22] Fernando Granha Jeronimo, Shashank Srivastava, and Madhur Tulsiani. Near-linear time decoding of ta-shma’s codes via splittable regularity. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 1527–1536, 2021. doi:10.1145/3406325.3451126.
  • [23] Marek Karpinski and Warren Schudy. Linear time approximation schemes for the Gale-Berlekamp game and related minimization problems. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 313–322, 2009. doi:10.1145/1536414.1536458.
  • [24] Sanjeev Khanna, Madhu Sudan, Luca Trevisan, and David P Williamson. The approximability of constraint satisfaction problems. SIAM Journal on Computing, 30(6):1863–1920, 2001. doi:10.1137/S0097539799349948.
  • [25] Subhash Khot. On the power of unique 2-prover 1-round games. In Proceedings of the thiry-fourth annual ACM symposium on Theory of computing, pages 767–775, 2002. doi:10.1145/509907.510017.
  • [26] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM Journal on Computing, 37(1):319–357, 2007. doi:10.1137/S0097539705447372.
  • [27] Philip N Klein, Serge A Plotkin, Satish Rao, and Eva Tardos. Approximation algorithms for Steiner and directed multicuts. Journal of Algorithms, 22(2):241–269, 1997. doi:10.1006/jagm.1996.0833.
  • [28] Michael Lewin, Dror Livnat, and Uri Zwick. Improved rounding techniques for the max 2-sat and max di-cut problems. In International Conference on Integer Programming and Combinatorial Optimization, pages 67–82. Springer, 2002. doi:10.1007/3-540-47867-1_6.
  • [29] Pasin Manurangsi and Dana Moshkovitz. Approximating dense max 2-CSPs. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, page 396, 2015.
  • [30] Claire Mathieu and Warren Schudy. Yet another algorithm for dense max cut: go greedy. In Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms, pages 176–182, 2008. URL: http://dl.acm.org/citation.cfm?id=1347082.1347102.
  • [31] Antoine Méot, Arnaud de Mesmay, Moritz Mühlenthaler, and Alantha Newman. Voting algorithms for unique games on complete graphs. In Symposium on Simplicity in Algorithms (SOSA), pages 124–136. SIAM, 2023. doi:10.1137/1.9781611977585.ch12.
  • [32] Shayan Oveis Gharan and Luca Trevisan. A new regularity lemma and faster approximation algorithms for low threshold rank graphs. In International Workshop on Approximation Algorithms for Combinatorial Optimization, pages 303–316. Springer, 2013. doi:10.1007/978-3-642-40328-6_22.
  • [33] Prasad Raghavendra and Ning Tan. Approximating csps with global cardinality constraints using SDP hierarchies. In Yuval Rabani, editor, Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012, pages 373–387. SIAM, 2012. doi:10.1137/1.9781611973099.33.
  • [34] Thomas J. Schaefer. The complexity of satisfiability problems. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC ’78, pages 216–226, New York, NY, USA, 1978. Association for Computing Machinery. doi:10.1145/800133.804350.
  • [35] Grigory Yaroslavtsev. Going for speed: Sublinear algorithms for dense r-CSPs. arXiv preprint arXiv:1407.7887, 2014. arXiv:1407.7887.
  • [36] Yuichi Yoshida and Yuan Zhou. Approximation schemes via sherali-adams hierarchy for dense constraint satisfaction problems and assignment problems. In Moni Naor, editor, Innovations in Theoretical Computer Science, ITCS’14, Princeton, NJ, USA, January 12-14, 2014, pages 423–438. ACM, 2014. doi:10.1145/2554797.2554836.