
Max-Cut with Multiple Cardinality Constraints

Yury Makarychev, Toyota Technological Institute at Chicago, IL, USA; Madhusudhan Reddy Pittu, Carnegie Mellon University, Pittsburgh, PA, USA; Ali Vakilian, Toyota Technological Institute at Chicago, IL, USA
Abstract

We study the classic Max-Cut problem under multiple cardinality constraints, which we refer to as the Constrained Max-Cut problem. Given a graph G=(V,E), a partition of the vertices into c disjoint parts V1,…,Vc, and cardinality parameters k1,…,kc, the goal is to select a set S⊆V such that |S∩Vi|=ki for each i∈[c], maximizing the total weight of edges crossing S (i.e., edges with exactly one endpoint in S).

By designing an approximate kernel for Constrained Max-Cut and building on the correlation rounding technique of Raghavendra and Tan (2012), we present a (0.858−ε)-approximation algorithm for the problem when c=O(1). The algorithm runs in time O(min{k/ε, n}^{poly(c/ε)} + poly(n)), where k = k1+⋯+kc and n=|V|. This improves upon the (1/2+ε0)-approximation of Feige and Langberg (2001) for MaxCutk (the special case where c=1 and k1=k), and generalizes the (0.858−ε)-approximation of Raghavendra and Tan (2012), which only applies when min{k, n−k}=Ω(n) and does not handle multiple constraints.

We also establish that, for general values of c, it is NP-hard to determine whether a feasible solution exists that cuts all edges. Finally, we present a 1/2-approximation algorithm for Max-Cut under an arbitrary matroid constraint.

Keywords and phrases:
Maxcut, Semi-definite Programming, Sum of Squares Hierarchy
Category:
APPROX
Copyright and License:
© Yury Makarychev, Madhusudhan Reddy Pittu, and Ali Vakilian; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Design and analysis of algorithms
Acknowledgements:
This work was conducted in part while Madhusudhan Reddy Pittu was a visiting student at TTIC.
Related Version:
Full Version: https://arxiv.org/abs/2507.12607
Editors:
Alina Ene and Eshan Chattopadhyay

1 Introduction

Given an undirected graph G=(V,E) on n vertices and a weight function w : E → ℝ+, the Max-Cut problem seeks a subset S⊆V maximizing δw(S) = ∑_{u∈S, v∈V∖S} w({u,v}), the total weight of edges crossing the cut (S, V∖S). Without loss of generality, we assume the weights are scaled so that the total edge weight satisfies ∑_{e∈E} w(e) = 1.

Max-Cut is a fundamental problem in combinatorial optimization and approximation algorithms, with several landmark results, most notably the seminal SDP rounding algorithm of Goemans and Williamson [15], which achieves an α𝖦𝖶 ≈ 0.878 approximation. This approximation ratio is known to be optimal under the Unique Games Conjecture (UGC) [18].

In this work, we study a variant called Constrained Max-Cut, where additional partition constraints are imposed on the solution.

Definition 1 (Constrained Max-Cut).

Given a graph G=(V,E), a weight function w : E → ℝ+, and a set of c partition constraints {(Vi,ki)}_{i∈[c]}, where V1,…,Vc partition V and ki ≤ |Vi|/2 for all i, the Constrained Max-Cut problem asks to find a subset S⊆V such that |S∩Vi|=ki for all i∈[c], maximizing δw(S).

Several well-studied problems are special cases of ConstrainedMaxCut. The Max-Bisection problem corresponds to c=1 and k=n/2, and admits approximation factors close to α𝖦𝖶 – specifically, 0.8776 [2] (see also [14, 25, 9, 17, 11, 16, 22]). More generally, when there is a single cardinality constraint |S|=k (i.e., c=1), the problem is known as MaxCutk [10]. It is also referred to as (k, n−k)-Max-Cut in parameterized complexity; see, e.g., [6, 24].

Definition 2 (Max-Cutk).

Given an undirected graph G=(V,E), a weight function w : E → ℝ+, and an integer k (we assume k ≤ n/2 without loss of generality), the MaxCutk problem seeks a subset S⊆V of cardinality exactly k that maximizes δw(S).

For k=Ω(n), the global correlation rounding technique of Raghavendra and Tan [22] (building on [4]) achieves an αcc ≈ 0.858 approximation. Austrin and Stankovic [3] later showed that this approximation is essentially tight for k < 0.365n. However, when k=o(n), existing results are weaker. Feige and Langberg [10] gave a (0.5+ε0)-approximation for all k, where ε0 is a small universal constant (ε0 < 0.09). Moreover, the pipage rounding technique of Ageev and Sviridenko [1] guarantees a 0.5-approximation for all k.

The special case of ConstrainedMaxCut with c=1 and k=o(n) has applications in pricing in social networks [8], also referred to as influence-and-exploit [13]. In this context, a consumer's valuation depends directly on the usage of their neighbors in the network. Consequently, the seller's optimal pricing strategy may involve offering discounts to certain influencers who hold central positions in the underlying network. Candogan et al. [8] considered a setting with two price types: full and discounted. Specifically, the objective is to maximize the total network influence, subject to the constraint of offering k discounted prices to a small target set of buyers, where k ≪ n. Candogan et al. showed that the resulting problem can be reduced to MaxCutk with k=o(n).

Moreover, in certain settings where diversity among influencers is desirable, it is natural to require that the selected influencers fairly represent different groups. This requirement can be modeled as a ConstrainedMaxCut problem with multiple capacity constraints. In most relevant cases, the number of groups is a small constant, c=O(1). For a comprehensive survey on fair representation in learning on graphs, see [19].

In this paper, we also introduce and study a more general version of the problem: finding a maximum cut subject to a matroid constraint.

Definition 3 (Matroid Max-Cut).

Given an undirected graph G=(V,E), a weight function w : E → ℝ+, and a matroid ℳ=(V,ℐ), the matroid Max-Cut problem is to maximize δw(S) subject to S∈ℬ, where ℬ is the collection of bases of ℳ.

Note that ConstrainedMaxCut and MaxCutk are special cases of the matroid Max-Cut problem: we obtain them when ℳ is a partition matroid or a uniform matroid, respectively. This problem has not been explicitly studied previously. However, an algorithm of Lee, Mirrokni, Nagarajan, and Sviridenko [21] gives a (1/3−ε)-approximation to the more general problem of maximizing a symmetric submodular function subject to a matroid base constraint.

1.1 Our Results and Techniques

Our algorithm for MaxCutk builds on the global correlation rounding technique introduced by Raghavendra and Tan [22], which achieves an αcc ≈ 0.858 approximation in the regime where k=Ω(n). We extend this approach by developing an approximate kernel and applying it in conjunction with correlation rounding, allowing us to handle the challenging case where k=o(n).

This yields an approximation guarantee of (0.858−ε) for MaxCutk across all values of k, improving upon the (0.5+ε0)-approximation of [10] in the sparse regime. Formally:

Theorem 4.

For every ε>0, there is an algorithm that runs in O(min{k/ε, n}^{poly(1/ε)} + poly(n)) time and obtains an (αcc−ε)-approximation to the optimal MaxCutk solution, where αcc ≈ 0.858 is the approximation factor of the Raghavendra–Tan algorithm.

The regime where k=o(n) is particularly challenging, as the correlation rounding approach of Raghavendra and Tan [22] does not extend to this setting. Our algorithm closes this gap by improving upon the (0.5+ε0)-approximation of [10] in the sparse regime. For completeness, we provide a brief overview of the approach of Raghavendra and Tan in Section 1.2, and highlight the key reasons why it breaks down when k=o(n).

We next address the more general setting of multiple constraints, focusing on the case c=O(1).

Theorem 5.

For every ε>0, there is an algorithm that runs in O(min{k/ε, n}^{poly(c/ε)} + poly(n)) time, where k = k1+⋯+kc, and obtains an (αcc−ε)-approximation to the optimal solution of Constrained Max-Cut. In particular, when c=O(1) and ε is fixed, the running time is polynomial in n.

More broadly, we study Max-Cut under an arbitrary matroid constraint ℳ=(V,ℐ), generalizing ConstrainedMaxCut with an arbitrary number of partition constraints, in particular when c=ω(1).

Theorem 6.

There exists a 0.5-approximation algorithm for matroid Max-Cut.

The only prior result applicable to this setting is by Lee et al. [21], who provided a (1/3−ε)-approximation for the more general problem of maximizing a symmetric submodular function subject to a matroid base constraint.

Finally, we show that for general ConstrainedMaxCut with an arbitrary number of constraints, it is NP-hard to decide whether there exists a feasible solution cutting all edges. Formally:

Theorem 7.

Given a graph G=(V,E), a partition of the vertices into V1,…,Vc, and budget parameters k1,…,kc, it is NP-hard to decide whether there exists a feasible solution S such that δ(S)=|E|.

We note that for the standard Max-Cut problem, this decision variant can be solved in polynomial time by testing whether the graph is bipartite.

Our Techniques.

A key technical contribution of our work is the construction of an approximate kernel for ConstrainedMaxCut. Specifically, for a cardinality constraint k, we sort the vertices by their (weighted) degrees as v1,…,vn and define V~ as the set of the top O(k/ε) vertices. While the graph G remains unchanged, we restrict our attention to solutions contained entirely within V~. An optimal solution S~ to MaxCutk over V~ then achieves a cut value that is at least a (1−ε) fraction of the true optimum. In other words,

max_{S~⊆V~, |S~|=k} δw(S~) ≥ (1−ε)·max_{S⊆V, |S|=k} δw(S). (1)

See Theorem 12 in Section 2 for the formal statement.

This reduction is particularly useful because it allows us to focus on problem instances where k=Ω(n). Conceptually, we contract the vertices in V∖V~ into a single super vertex s and then restrict the solution to exclude s. This transforms the sparse regime into one where the desired solution size is a constant fraction of the (reduced) vertex set, enabling the use of correlation rounding techniques that require k=Ω(n).

In contrast, prior work by [24] uses kernelization to design fixed-parameter algorithms for MaxCutk, but their parameter is the value of the optimal solution itself, and they aim for an exact kernel. As a result, their kernel size is polynomial in k, which is insufficient for our purposes. Moreover, their kernelization is sequential and adaptive, while ours is non-adaptive. Our approximate kernel also extends to ConstrainedMaxCut with multiple constraints (c>1), as formalized in Theorem 14.

Once we reduce to an instance with k=Ω(n), we apply the Raghavendra–Tan algorithm [22] to obtain a subset of vertices of size k′ with k′/k ∈ [1−ε, 1+ε], achieving a cut value that is at least an αcc-fraction of the optimum. We then perform a random correction step: we adjust the solution by randomly adding or removing at most εk vertices so that its size is exactly k, incurring only a negligible loss in cut value.

When c>1, however, the rounding procedure of [22] does not directly apply. To handle the setting with multiple constraints, we introduce the notion of α-block independence for SDP solutions, which generalizes the standard notion of α-independence. Informally, an SDP solution is α-block independent if, within each partition Vi, the average correlation between pairs of vertices is at most α.

We first show how to efficiently construct a block-independent solution. Then, by applying the rounding algorithm of [22], we obtain a subset S that approximately satisfies each group constraint: for each i∈[c], the size ki′ = |S∩Vi| lies in the range [(1−ε)ki, (1+ε)ki]. Finally, we apply a random correction step within each group to enforce exact feasibility, while ensuring that the cut value degrades by only a negligible amount.

For matroid Max-Cut, we combine techniques from [1] and [7] to design a linear programming relaxation with integrality gap at most 0.5, which can be solved efficiently. Applying pipage rounding to this relaxation yields a deterministic 0.5-approximation algorithm.

1.2 Preliminaries

Our results heavily rely on the global correlation rounding technique developed in [22]. For completeness, we include the relevant definitions and theorems in this section. A quick summary of the Lasserre hierarchy is provided in Appendix A.

Naive approaches based on variants of hyperplane rounding applied to a two-round Lasserre SDP relaxation for the MaxCutk problem can produce subsets S of expected size k that achieve good approximation guarantees. However, these approaches offer no control over the concentration of |S| around k, due to potentially high correlations between the values assigned to vertices by the SDP solution.

Notation.

We use μ = {μS}_{|S|≤ℓ} to denote a level-ℓ Lasserre pseudo-distribution, where μS : {−1,1}^S → [0,1] is a distribution over partial assignments to the subset S⊆V. Let XS denote the random variables jointly distributed according to μS, and Xi the marginal variable for i∈V under μ{i}. We write μS[XS∈A] for the pseudo-probability that the assignment to S lies in the event A⊆{−1,1}^S. In particular, conditional pseudo-probabilities are expressed as μ_{S∪{i}}[Xi=1 ∣ XS=α], the pseudo-probability that Xi=1 given that XS=α, for α∈{−1,1}^S.

SDP Relaxation.

To leverage the correlations between vertices, [22] employ an ℓ-round Lasserre SDP for MaxCutk with a sufficiently large constant ℓ, formally described in Equation 2.

max ∑_{{i,j}∈E} wij·μ{i,j}[X{i,j} ∈ {(1,−1),(−1,1)}] (2)
s.t. ∑_{i∈V} μ_{S∪{i}}(Xi=1 ∣ XS=α) = k  for all S⊆V, |S| ≤ ℓ−1, α∈{−1,1}^S
μ is a level-ℓ pseudo-distribution.
Measuring Correlations.

One way to measure the correlation between two random variables Xi and Xj is via mutual information, defined as I_{μ{i,j}}(Xi;Xj) = H(Xi) − H(Xi ∣ Xj), where Xi and Xj are sampled according to the local distribution μ{i,j}. An SDP solution is α-independent if the average mutual information between uniformly random vertex pairs is at most α, i.e., 𝔼_{i,j∼V}[I(Xi;Xj)] ≤ α.

Definition 8 (α-independence [22]).

An SDP solution to an ℓ-round Lasserre SDP is α-independent if 𝔼_{i,j∼V}[I_{μ{i,j}}(Xi;Xj)] ≤ α, where μ{i,j} is the local distribution over {i,j}. More generally, if W is a distribution over V, the solution is α-independent w.r.t. W if 𝔼_{i,j∼W}[I_{μ{i,j}}(Xi;Xj)] ≤ α. When unspecified, W is assumed to be the uniform distribution over V.
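As a concrete illustration, the mutual information of a pair of ±1-valued variables is easy to compute from their joint local distribution. The short Python sketch below is our own illustrative code (the dict-based representation and function names are not from [22]); it evaluates I(Xi;Xj) for a perfectly correlated pair and for an independent pair:

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a distribution given as {outcome: prob}."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint law {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), q in joint.items():
        px[x] = px.get(x, 0.0) + q
        py[y] = py.get(y, 0.0) + q
    return entropy(px) + entropy(py) - entropy(joint)

# Perfectly correlated +-1 pair: I = H(X) = ln 2; independent pair: I = 0.
corr = {(1, 1): 0.5, (-1, -1): 0.5}
indep = {(x, y): 0.25 for x in (-1, 1) for y in (-1, 1)}
print(mutual_information(corr), mutual_information(indep))
```

An α-independent solution is then one for which the average of such pairwise quantities, over random vertex pairs, is at most α.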

For many standard rounding schemes, such as halfspace rounding, the variance in the balance of the resulting cut is directly linked to the average correlation among random vertex pairs. Specifically, if the rounding scheme is applied to an α-independent solution, the variance in the cut’s balance is bounded by a polynomial function of α.

Obtaining Uncorrelated SDP Solutions.

If all vertices in a t-round Lasserre SDP solution are highly correlated, conditioning on the value of one vertex reduces the entropy of the rest. Formally, if the solution is not α-independent (i.e., 𝔼_{i,j∼V}[I(Xi;Xj)] > α), then conditioning on a randomly chosen vertex i and its value b decreases the average entropy of the remaining variables by at least α. Repeating this process 1/α times suffices to obtain an α-independent solution. Thus, starting from a t-round Lasserre SDP solution, this process yields a (t−ℓ)-round α-independent solution for some ℓ=O(1/α).

Rounding Uncorrelated SDP Solutions.

Given an α-independent SDP solution, many natural rounding schemes ensure that the balance of the output cut is concentrated around its expectation. Hence, it suffices to construct rounding schemes that preserve the expected balance. Raghavendra and Tan [22] present a simple rounding procedure that preserves the individual bias of each vertex, thereby ensuring the global balance property.

An elegant probabilistic argument from [22] shows how to convert an (ℓ+4/α²+1)-round Lasserre SDP solution into an α-independent ℓ-round solution, while losing only an additive α in the objective value (assuming the optimum is normalized to at most 1).

Lemma 9 ([22]).

There exists t ≤ k such that 𝔼_{i1,…,it∼W} 𝔼_{i,j∼W}[I(Xi;Xj ∣ Xi1,…,Xit)] ≤ 1/(k−1).

Lemma 9 implies the existence of a t ≤ 1/α + 1 such that conditioning on the joint assignment to t randomly sampled vertices reduces the average mutual information between the remaining pairs to at most α.

Theorem 10 ([22]).

For every α>0 and integer ℓ, there exists an algorithm running in time O(n^{poly(1/α)+ℓ}) that finds an α-independent solution to the ℓ-round Lasserre SDP, with objective value at least OPT−α, where OPT is the optimum SDP value.

Theorem 10 implies that there exists t=O(1/α²) such that conditioning on t vertices yields an α-independent solution with probability at least α/2. Since the sampling procedure preserves the marginal biases of the vertices, the SDP objective remains close to optimal in expectation. By Markov's inequality, the value of the conditioned solution drops below OPT−α with probability at most 1/(1+α). Thus, there exists a small subset of vertices such that conditioning on them yields an α-independent solution with near-optimal value.

Algorithm 5.3 of [22] is a rounding scheme that preserves the bias (according to the SDP solution) of every vertex while approximately preserving the pairwise correlations up to polynomial factors. Using numerical techniques, they show that the probability of an edge being cut is at least αcc ≈ 0.858 times its contribution to the SDP objective, implying that the total cut value is at least αcc times the SDP value.

Controlling Cut Balance.
Theorem 11 ([22]).

Given an α-independent solution to two rounds of the Lasserre SDP, let {yi}_{i∈V} denote the rounded output of Algorithm 5.3, and let S = 𝔼_{i∼W}[yi] denote the (random) balance of the resulting cut. Then, Var(S) ≤ O(α^{1/12}).

By applying Chebyshev's inequality to Theorem 11, the number of vertices in the cut lies in the range k ± n·O(α^{1/24}) with high probability. When k/n=Ω(1), we can choose a constant α small enough that the deviation is within εk. A post-processing step can then adjust the set size to exactly k (e.g., by swapping a small number of vertices), which incurs only an O(ε) fractional loss in cut value. However, when k=o(n), the additive error term n·O(α^{1/24}) may significantly exceed k, making it difficult to ensure cardinality feasibility without substantially affecting the objective.

Notation.

For any subset S⊆V and vertex i∈V, we write S+i := S∪{i} and S−i := S∖{i}. Let G=(V,E) be a graph with non-negative edge weights w : E → ℝ≥0. For a parameter r, let Hr⊆V denote the set of the r highest weighted-degree vertices in G. If the vertex set V is partitioned into c disjoint groups V1,…,Vc, then Hr(i)⊆Vi denotes the r highest-degree vertices in part Vi. When the weight function is clear from context, we abbreviate the weighted degree of a vertex v as δ(v) instead of δw(v).

2 Approximate Kernels for Max-Cut with Cardinality Constraints

In Section 2.1, we show that for any instance (G,k) of MaxCutk, one can reduce the graph to a (conditioned) instance (G~,k) with |V~|=O(k/ε) vertices. In Section 2.2, we generalize this construction to the setting with multiple partitions. Specifically, for any instance (G,(k1,…,kc)) of ConstrainedMaxCut with V partitioned into V1,…,Vc, we construct a conditioned instance (G~,(k1,…,kc)) with V~ partitioned into V~1,…,V~c such that |V~i|=O(ki/ε) for every i∈[c].

We use OPT to denote the optimal cut value of a given instance. For the single-group problem MaxCutk on a graph G=(V,E), we define OPT := max_{S⊆V, |S|=k} δ(S). For the multi-group problem ConstrainedMaxCut with partition V=V1∪⋯∪Vc and size constraints k1,…,kc, the optimal value is OPT := max{δ(S) : S=S1∪⋯∪Sc, Si⊆Vi, |Si|=ki for all i∈[c]}.

2.1 Approximate Kernel for 𝐌𝐚𝐱𝐂𝐮𝐭𝒌

Kernel Procedure for 𝐌𝐚𝐱𝐂𝐮𝐭𝒌.

We now describe the approximate kernel construction for the single-group MaxCutk problem.

Input: Graph G=(V,E), cardinality parameter k, and approximation parameter 0 < ε ≤ 1/2.
Output: Reduced graph G~=(V~,E~).

  1. 1.

If ⌈k/ε⌉ ≥ n, return G.

  2. 2.

Otherwise, sort the vertices of G in decreasing order of weighted degree and retain only the top ⌈k/ε⌉ vertices. Merge the remaining vertices into a super vertex s, and return the resulting graph G~.

Note that the super vertex s appears in the output graph G~ only if ⌈k/ε⌉ + 1 ≤ n.
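The procedure above can be sketched in a few lines. The following Python is a minimal illustrative implementation under our own data representation (edges stored as a frozenset-keyed weight dictionary; the function name `maxcut_kernel` is hypothetical), not the paper's code:

```python
import math
from collections import defaultdict

def maxcut_kernel(edges, k, eps):
    """Approximate kernel for MaxCutk (illustrative): keep the ceil(k/eps)
    highest weighted-degree vertices and contract the rest into a super
    vertex 's'.  `edges` maps frozenset({u, v}) -> weight."""
    deg = defaultdict(float)
    for e, w in edges.items():
        for v in e:
            deg[v] += w
    n, h = len(deg), math.ceil(k / eps)
    if h >= n:                        # kernel would not shrink the graph
        return dict(edges)
    keep = set(sorted(deg, key=deg.get, reverse=True)[:h])
    reduced = defaultdict(float)
    for e, w in edges.items():
        u, v = tuple(e)
        u = u if u in keep else 's'   # contract low-degree endpoints into 's'
        v = v if v in keep else 's'
        if u != v:                    # edges inside the contracted set vanish
            reduced[frozenset({u, v})] += w
    return dict(reduced)
```

Since feasible solutions avoid s, edges with both endpoints outside the retained set can be dropped: they are never cut.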

Theorem 12.

For any MaxCutk instance (G,k), let (G~,k) be the reduced instance returned by the MaxCutk kernel procedure above. Then the optimal cut value of MaxCutk on G~, conditioned on not selecting the super vertex s, satisfies

OPT~ := max_{S⊆V~∖{s}, |S|=k} δ(S) ≥ (1−4ε)·OPT.
Proof.

Let h := ⌈k/ε⌉, and recall from the notation in the preliminaries that Hh is the set of the h highest-degree vertices in G. Let S0 be an optimal solution of size k in G with cut value δ(S0)=OPT. We will construct a set ST⊆Hh of size k with value close to OPT.

We iteratively transform S0 into a set contained in Hh by applying Lemma 13 up to k times. At each step t, we replace a vertex jt∈St−1∖Hh with a vertex it∈Hh∖St−1 such that:

δ(St) ≥ (1 − 2/(h−k))·δ(St−1).

Since each step increases |St∩Hh| by one, the process terminates after T ≤ k steps. Therefore,

δ(ST) ≥ OPT·(1 − 2/(h−k))^T ≥ OPT·(1 − 2T/(h−k)) ≥ OPT·(1 − 2k/(h−k)).

Substituting h = ⌈k/ε⌉, we get

δ(ST) ≥ OPT·(1 − 2ε/(1−ε)) ≥ OPT·(1 − 4ε),

where the final bound uses ε ≤ 1/2.

Lemma 13.

Let S⊆V be a subset of size |S| = k ≤ h such that S ⊄ Hh. Then there exist vertices i∈Hh∖S and j∈S∖Hh such that:

δ((S−j)+i) ≥ (1 − 2/(h−k))·δ(S).
Proof.

Since S ⊄ Hh and |S| ≤ h, we have Hh∖S ≠ ∅. Let i∈Hh∖S be a vertex minimizing δ(S,{i}), the total weight of edges between i and S, and let j be any vertex in S∖Hh.

We use the submodularity of the cut function:

δ((S−j)+i) + δ(S) ≥ δ(S+i) + δ(S−j).

Rearranging:

δ((S−j)+i) − δ(S) ≥ [δ(S+i) − δ(S)] + [δ(S−j) − δ(S)]
= (δ({i}) − 2δ(S,{i})) − (δ({j}) − 2δ(S−j,{j}))
= (δ({i}) − δ({j})) + 2δ(S−j,{j}) − 2δ(S,{i})
≥ −2δ(S,{i}),

where the last step uses δ({i}) ≥ δ({j}) (as i∈Hh while j∉Hh) and δ(S−j,{j}) ≥ 0.

Now we bound δ(S,{i}). Since i minimizes δ(S,·) over Hh∖S, we have:

δ(S) = ∑_{v∈V∖S} δ(S,{v}) ≥ ∑_{v∈Hh∖S} δ(S,{v}) ≥ |Hh∖S|·δ(S,{i}),

which, together with |Hh∖S| ≥ h−k, implies:

δ(S,{i}) ≤ δ(S)/(h−k).

Putting everything together:

δ((S−j)+i) ≥ δ(S) − 2δ(S,{i}) ≥ δ(S) − (2/(h−k))·δ(S) = (1 − 2/(h−k))·δ(S),

completing the proof.
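The exchange step used in this proof is easy to simulate. The sketch below (our own illustrative code, not from the paper) performs one swap, choosing i∈Hh∖S minimizing δ(S,{i}) and an arbitrary j∈S∖Hh; on any instance satisfying the lemma's hypotheses, the returned set obeys the guarantee δ((S−j)+i) ≥ (1 − 2/|Hh∖S|)·δ(S):

```python
import itertools
import random

def cut_value(edges, S):
    """Total weight of edges with exactly one endpoint in S."""
    return sum(w for (u, v), w in edges.items() if (u in S) != (v in S))

def weight_to_set(edges, S, i):
    """delta(S, {i}): total weight of edges between vertex i and the set S."""
    return sum(w for (u, v), w in edges.items()
               if (u == i and v in S) or (v == i and u in S))

def exchange_step(edges, S, H):
    """One swap from the proof of Lemma 13 (assumes S is not inside H):
    bring in the i in H \\ S cheapest against S, drop an arbitrary j in S \\ H."""
    i = min(H - S, key=lambda v: weight_to_set(edges, S, v))
    j = next(iter(S - H))
    return (S - {j}) | {i}
```

On a random weighted graph, taking H to be the top-degree vertices and S a set of low-degree vertices, one can check the claimed inequality numerically after each swap.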

2.2 Approximate Kernel for Constrained Max-Cut

Kernel Procedure for Constrained Max-Cut.

We now describe the kernelization procedure for the ConstrainedMaxCut problem with multiple vertex groups.

Input: Graph G=(V,E) with the vertex set partitioned as V=V1∪⋯∪Vc, cardinality constraints k1,…,kc, and approximation parameter 0 < ε ≤ 1/2.
Output: Reduced graph G~=(V~,E~) with V~ partitioned as V~=V~1∪⋯∪V~c.

  1. 1.

For each i∈[c], if ⌈ki/ε⌉ + 1 ≤ ni := |Vi|, retain the top ⌈ki/ε⌉ vertices of Vi by weighted degree and merge the remaining vertices of Vi into a super vertex si.

  2. 2.

    Return the resulting graph G~.

Note that a super vertex si appears in the output graph G~ only if ⌈ki/ε⌉ + 1 ≤ ni. Let Ssuper := {si : si exists in G~} denote the set of all super vertices.

Theorem 14.

For any ConstrainedMaxCut instance (G,k1,,kc), let (G~,k1,,kc) be the reduced instance returned by the ConstrainedMaxCut kernel procedure. Then the optimal value of the reduced instance, conditioned on not selecting any super vertex, satisfies

OPT~ := max{δ(S) : S⊆V~∖Ssuper, |S∩V~i|=ki for all i∈[c]} ≥ (1−4cε)·OPT.
Proof.

Let S0 be an optimal solution to the original instance with δ(S0)=OPT. For each part i∈[c], define Hi := H⌈ki/ε⌉(i), the set of the top ⌈ki/ε⌉ vertices of Vi by weighted degree (as defined in the preliminaries).

We transform S0 into a solution ST with ST∩Vi ⊆ Hi for every i∈[c], while losing only a small fraction of the cut value. At each step t, identify the smallest index p∈[c] for which St∩Vp ⊄ Hp, and apply the local exchange from Corollary 16 to swap a vertex j∈(St∩Vp)∖Hp with a vertex i∈Hp∖(St∩Vp), yielding a new set St+1 with

δ(St+1) ≥ (1 − 2/(⌈kp/ε⌉ − kp))·δ(St).

For each i∈[c], we perform at most ki such exchanges within Vi. Hence, the total cut value at the end satisfies:

δ(ST) ≥ δ(S0)·∏_{i=1}^{c} (1 − 2/(⌈ki/ε⌉ − ki))^{ki}.

Using the inequality (1−x)^m ≥ 1−mx, we get

δ(ST) ≥ OPT·(1 − ∑_{i=1}^{c} 2ki/(⌈ki/ε⌉ − ki)) ≥ OPT·(1 − ∑_{i=1}^{c} 2ε/(1−ε)) ≥ OPT·(1 − 4cε),

where the final bound uses ε ≤ 1/2.

Lemma 15.

Let S⊆V be a subset of size k and let H⊆V be a subset of size greater than k such that S ⊄ H and every vertex in H∖S has weighted degree at least as large as every vertex in S∖H. Then there exist i∈H∖S and j∈S∖H such that:

δ((S−j)+i) ≥ (1 − 2/|H∖S|)·δ(S).
Proof.

The proof is identical to that of Lemma 13: select i∈H∖S minimizing δ(S,{i}) and apply cut submodularity to bound the loss when replacing any j∈S∖H with i.

Corollary 16.

For any subset S⊆V and index p∈[c] such that |S∩Vp| = kp ≤ h and (S∩Vp) ⊄ Hh(p), there exist vertices i∈Hh(p)∖(S∩Vp) and j∈(S∩Vp)∖Hh(p) such that

δ((S−j)+i) ≥ (1 − 2/(h−kp))·δ(S).
Proof.

Apply Lemma 15 with H = (S∖Vp) ∪ Hh(p); since H∖S = Hh(p)∖(S∩Vp), which has size at least h−kp, this finishes the proof.

3 Single Constraint

In this section, we describe our (αcc−ε)-approximation algorithm for MaxCutk, for all values of k. Without loss of generality, we assume k ≤ n/2 due to the symmetry of the cut function.

3.1 Algorithm

Input: Weighted graph G=(V,E) and parameters k ≤ n/2 and 0 < ε ≤ 1/2.
Output: A set S⊆V of size |S|=k.

  1. 1.

    (Preprocessing Step) Let G~=(V~,E~) be the approximate kernel output by the MaxCutk kernel with input (G,k,ε). Note that |V~|=O(k/ε).

  2. 2.

    (SDP and Conditioning)

    1. (a)

Solve a (3+4/ε^120)-round Lasserre SDP relaxation for the MaxCutk problem on the graph G~ (see Section 3.2).

    2. (b)

Apply Theorem 10 with α=ε^60 and ℓ=2 to obtain a 2-level SDP solution that is ε^60-independent and has objective value at least OPT~ − ε^60, where OPT~ is the optimum value of MaxCutk on G~ (conditioned on not selecting the super vertex s). By Lemma 18 (2), we know that OPT~ − ε^60 ≥ (1−ε)·OPT~.

  3. 3.

(Rounding) Apply the rounding algorithm of Raghavendra and Tan (Algorithm 5.3 in [22]) to obtain a (random) set S^. Let ℰ denote the event that |S^| ∈ [k − ε²|V~|, k + ε²|V~|].

  4. 4.

(Correction) If the event ℰ does not occur, return an arbitrary subset S⊆V~∖{s} of size k. Otherwise, adjust S^ by randomly adding or removing vertices to produce a set S of size exactly k, and return S.
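The correction step admits a direct sketch. The following Python helper is our own illustrative code (the name `correct_size` is hypothetical); it randomly pads or trims S^ to size exactly k while never touching the super vertex:

```python
import random

def correct_size(S_hat, V_tilde, k, super_vertex='s', rng=random):
    """Randomly add or delete vertices so that |S| = k exactly,
    drawing additions from V~ minus (S^ and the super vertex)."""
    S = set(S_hat)
    if len(S) > k:
        S -= set(rng.sample(sorted(S), len(S) - k))        # trim at random
    elif len(S) < k:
        pool = sorted(set(V_tilde) - S - {super_vertex})
        S |= set(rng.sample(pool, k - len(S)))             # pad at random
    return S
```

When the event ℰ holds, at most ε²|V~| vertices are added or removed, which is the source of the small loss bounded in Lemma 17.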

3.2 Lasserre SDP

We now describe the SDP used in Step 2 of the algorithm above. Since a full overview of the Lasserre hierarchy is already provided in Appendix A, we only describe the relevant formulation.

After the preprocessing step, we solve the following level-ℓ Lasserre SDP relaxation for MaxCutk on the reduced graph G~, with an additional constraint ensuring the super vertex s is not selected:

max ∑_{{i,j}∈E~} wij·μ{i,j}[X{i,j} ∈ {(1,−1),(−1,1)}] (3)
s.t. ∑_{i∈V~} μ_{S∪{i}}(Xi=1 ∣ XS=α) = k  for all S⊆V~, |S| ≤ ℓ−1, α∈{−1,1}^S
μ_{S∪{s}}(Xs=1 ∣ XS=α) = 0  for all S⊆V~, |S| ≤ ℓ−1, α∈{−1,1}^S (4)
μ is a level-ℓ pseudo-distribution.

3.3 Analysis

Proof of Theorem 4.

By Lemma 18 (1), the optimal value on the kernelized instance G~ is at least (1−4ε)·OPT. After solving the SDP and conditioning, we obtain a solution of value at least (1−ε)·OPT~ ≥ (1−5ε)·OPT.

The expected size of S^ is exactly k, since Algorithm 5.3 of [22] preserves the bias of each vertex. By Theorem 11, the variance of the balance |S^|/|V~| is at most O(ε^{60/12}) = O(ε^5); for simplicity, assume the constant hidden in the O-notation is 1 (in general it can be absorbed into ε). By Chebyshev's inequality, the event ℰ occurs with probability at least 1−ε.

The expected cut value conditioned on ℰ satisfies:

𝔼[δ(S^) ∣ ℰ] ≥ (𝔼[δ(S^)] − ε·OPT)/(1−ε) ≥ ((1−5ε)·αcc − ε)/(1−ε)·OPT ≥ (αcc − 9ε)·OPT. (5)

Let S be the final corrected set. Then:

𝔼[δ(S)] ≥ (1−ε)·𝔼[δ(S) ∣ ℰ] (6)
≥ (1−ε)²·𝔼[δ(S^) ∣ ℰ] (by Lemma 17) (7)
≥ (1−ε)²·(αcc − 9ε)·OPT. (by (5)) (8)

Steps 1 and 4 take O(|E| log |E|) and O(|V|) time, respectively. Steps 2 and 3, which involve solving the Lasserre SDP and rounding, dominate the runtime and require O(|V~|^{poly(1/ε)}) time.

Lemma 17.

Conditioned on the event ℰ and a fixed S^, let S be the set obtained from S^ by randomly adding vertices from V~∖(S^∪{s}) or randomly deleting vertices from S^ (whichever is needed, |k−|S^|| of them) so that |S|=k. Then:

𝔼[δ(S)] ≥ (1−ε)·δ(S^).
Lemma 18.

Let OPT~ denote the optimum value of MaxCutk on G~, conditioned on not selecting the super vertex s.

  1. 1.

OPT~ ≥ (1−4ε)·OPT, where OPT is the optimum value of MaxCutk on G.

  2. 2.

    OPT~ is at least an ε-fraction of the total edge weight in E~.

Proof.

Part (1) follows from Theorem 12. For part (2), we show that a uniformly random subset of V~∖{s} of size k cuts each edge with probability at least ε, so in expectation it cuts at least an ε-fraction of the total edge weight of E~.

Let n′ := |V~∖{s}|. Then 2k ≤ n′ ≤ ⌈k/ε⌉. If an edge e is adjacent to s, it is cut exactly when its other endpoint is selected, which happens with probability k/n′ ≥ ε. Otherwise, e is cut with probability 2k(n′−k)/(n′(n′−1)) ≥ k/n′ ≥ ε.
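Both probability bounds in this proof are elementary and can be checked exactly with rational arithmetic. The following sketch (ours, purely for illustration) verifies them over a grid of parameters k, ε=1/d, and n′:

```python
from fractions import Fraction
from math import ceil

# Exact check of the two cut probabilities used in the proof: an edge at the
# super vertex s is cut with probability k/n', any other edge with probability
# 2k(n'-k)/(n'(n'-1)), and both are >= eps whenever 2k <= n' <= ceil(k/eps).
for k in range(1, 30):
    for d in range(2, 8):                      # eps = 1/d <= 1/2
        eps = Fraction(1, d)
        for n_prime in range(2 * k, ceil(k / eps) + 1):
            p_at_s = Fraction(k, n_prime)
            p_internal = Fraction(2 * k * (n_prime - k), n_prime * (n_prime - 1))
            assert p_internal >= p_at_s >= eps
print("both probability bounds hold on the grid")
```

The inequality p_internal ≥ p_at_s amounts to n′ ≥ 2k−1, which the range of n′ guarantees.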

4 Constant Number of Constraints

In this section, we present our (αcc−ε)-approximation algorithm for ConstrainedMaxCut, the Max-Cut problem with c cardinality constraints. Our primary focus is on instances where the number of vertices to be selected from each part Vi is relatively small, and for this reason we assume that ki ≤ ni/2. (Unlike in MaxCutk, this is not without loss of generality.)

The key observation enabling this extension of [22] to multiple constraints is that the notion of α-independence can be defined locally within each block. Specifically, it suffices to ensure that the average mutual information between vertex pairs within each part is small:

𝔼_{i,i′∼Vj}[I(Xi;Xi′)] ≤ α for all j∈[c].

If this condition holds, then after rounding via Algorithm 5.3 of [22], the size of each intersection |S^∩Vj| concentrates around its expectation. In particular, applying Theorem 11 with W the uniform distribution over Vj, the variance of |S^∩Vj|/|Vj| is bounded by O(α^{1/12}) for every j∈[c]. Therefore, for an appropriate choice of α, we obtain

|S^∩Vj| ∈ [kj(1−ε), kj(1+ε)] simultaneously for all j∈[c],

with probability at least 1ε.

Definition 19 (α-block independence).

An SDP solution to an ℓ-round Lasserre relaxation is α-block independent if 𝔼_{i,i′∼Vj}[I_{μ{i,i′}}(Xi;Xi′)] ≤ α holds for all j∈[c].

To find such solutions, we extend the conditioning technique of [22]. The following procedure begins with an (L+ℓ)-round SDP solution and returns an ℓ-round α-block independent solution, for L=O(c²/α²).

Conditioning Procedure

  1. 1.

For each t∈[L]:

    1. (a)

      Sample a block index jt[c] uniformly at random.

    2. (b)

      Sample a vertex itVjt uniformly at random.

    3. (c)

      Sample Xit from its marginal distribution under the current SDP solution (conditioned on previous outcomes), and condition on this value.

    4. (d)

      If the resulting SDP solution is α-block independent, terminate and return it.
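On a true distribution (as opposed to a pseudo-distribution, which is not an explicit probability table), the effect of one iteration is easy to simulate. The toy Python sketch below is our own illustrative stand-in: it builds a highly correlated distribution over four ±1 variables split into two blocks, conditions on one sampled variable as in steps (a)–(c), and reports the largest average within-block mutual information before and after:

```python
import itertools
import math
import random

def entropy(p):
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def marginal(joint, idxs):
    m = {}
    for a, q in joint.items():
        key = tuple(a[i] for i in idxs)
        m[key] = m.get(key, 0.0) + q
    return m

def mutual_info(joint, i, j):
    return (entropy(marginal(joint, [i])) + entropy(marginal(joint, [j]))
            - entropy(marginal(joint, [i, j])))

def max_avg_block_mi(joint, blocks):
    """Largest, over blocks, of the average pairwise mutual information."""
    return max(sum(mutual_info(joint, i, j)
                   for i, j in itertools.combinations(b, 2))
               / max(1, len(list(itertools.combinations(b, 2))))
               for b in blocks)

def condition(joint, i, b):
    """Condition on X_i = b and renormalize (a true-distribution stand-in
    for conditioning a Lasserre pseudo-distribution)."""
    z = sum(q for a, q in joint.items() if a[i] == b)
    return {a: q / z for a, q in joint.items() if a[i] == b}

random.seed(1)
n, blocks = 4, [[0, 1], [2, 3]]
joint = {a: 0.1 / 2 ** n for a in itertools.product([-1, 1], repeat=n)}
joint[(1, 1, 1, 1)] += 0.45      # nearly perfectly correlated variables,
joint[(-1, -1, -1, -1)] += 0.45  # plus a little uniform noise
before = max_avg_block_mi(joint, blocks)
i = random.randrange(n)
pi = marginal(joint, [i])
b = random.choices([-1, 1], weights=[pi[(-1,)], pi[(1,)]])[0]
after = max_avg_block_mi(condition(joint, i, b), blocks)
print(before, after)  # conditioning on one sampled vertex collapses the correlation
```

In this toy run, a single conditioning already makes the distribution α-block independent for modest α; Lemma 20 quantifies the analogous decrease for pseudo-distributions.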

Lemma 20.

For any L ≥ 2, there exists t ≤ L such that

𝔼_{j1,…,jt∼[c]} 𝔼_{i1∼Vj1,…,it∼Vjt}[ ∑_{j,j′∈[c]} 𝔼_{i∼Vj, i′∼Vj′} I(Xi; Xi′ ∣ Xi1,…,Xit) ] ≤ c²/L.
Proof.

Define the potential function:

ϕt := 𝔼_{j1,…,jt∼[c]} 𝔼_{i1∼Vj1,…,it∼Vjt}[ 𝔼_{j∼[c]} 𝔼_{i∼Vj} H(Xi ∣ Xi1,…,Xit) ].

Conditioned on fixed values of j1,…,jt and i1,…,it, the difference in potentials is:

ϕt − ϕt+1 = 𝔼_{j∼[c]} 𝔼_{i∼Vj}( H(Xi ∣ Xi1,…,Xit) − 𝔼_{jt+1∼[c]} 𝔼_{it+1∼Vjt+1} H(Xi ∣ Xi1,…,Xit+1) )
= 𝔼_{j,jt+1∼[c]} 𝔼_{i∼Vj, it+1∼Vjt+1} I(Xi; Xit+1 ∣ Xi1,…,Xit)
= (1/c²) ∑_{j,j′∈[c]} 𝔼_{i∼Vj, i′∼Vj′} I(Xi; Xi′ ∣ Xi1,…,Xit).

Taking expectation over all random choices of j1,,jt and i1,,it gives:

ϕt − ϕt+1 = (1/c²)·𝔼_{j1,…,jt} 𝔼_{i1,…,it}[ ∑_{j,j′∈[c]} 𝔼_{i∼Vj, i′∼Vj′} I(Xi; Xi′ ∣ Xi1,…,Xit) ]. (9)

Summing (9) over t=0 to L−1, and noting that entropy is non-negative, we get:

∑_{t=0}^{L−1} 𝔼_{j1,…,jt} 𝔼_{i1,…,it}[ ∑_{j,j′∈[c]} 𝔼_{i∼Vj, i′∼Vj′} I(Xi; Xi′ ∣ Xi1,…,Xit) ] = c²(ϕ0 − ϕL) ≤ c².

Therefore, by averaging, there exists t ≤ L for which the expected blockwise mutual information is at most c²/L.

Corollary 21.

If

∑_{j,j′∈[c]} 𝔼_{i∼Vj, i′∼Vj′} I(Xi; Xi′ ∣ Xi1,…,Xit) ≤ α,

then

𝔼_{i,i′∼Vj} I(Xi; Xi′ ∣ Xi1,…,Xit) ≤ α for all j∈[c].
Proof.

Each blockwise average 𝔼_{i,i′∼Vj} I(Xi;Xi′ ∣ ·) is the (j,j) term of the global sum, and mutual information is non-negative.

Theorem 22.

For every α>0 and integer ℓ>0, there exists an algorithm running in time O(n^{ℓ+poly(c/α)}) that finds an α-block independent solution to the ℓ-round Lasserre SDP with value at least OPT−α, where OPT is the optimum value of the (L+ℓ)-round SDP.

Proof.

Set L = 4c²/α². First, solve the (L+ℓ)-round Lasserre SDP relaxation (as described in Section 4.2) to obtain an initial solution.

Next, apply the conditioning procedure described above: for each t∈[L], sample a block index jt∈[c] uniformly at random, then sample a vertex it∈Vjt uniformly, sample Xit from its marginal distribution (after the first t−1 fixings), and condition the SDP solution on that assignment. Continue this process until the resulting pseudo-distribution becomes α-block independent.

We analyze this procedure by appealing to Lemma 20, which shows that:

𝔼_{j1,…,jt∼[c]} 𝔼_{i1∼Vj1,…,it∼Vjt}[ ∑_{j,j′∈[c]} 𝔼_{i∼Vj, i′∼Vj′} I(Xi; Xi′ ∣ Xi1,…,Xit) ] ≤ c²/L = α²/4

for some t ≤ L. By Markov's inequality, the probability that the total conditional mutual information (summed over all block pairs) exceeds α is at most:

Pr[ ∑_{j,j′∈[c]} 𝔼_{i∈V_j, i′∈V_{j′}} I(X_i; X_{i′} ∣ X_{i_1},…,X_{i_t}) > α ] ≤ (α²/4)/α = α/4.

Thus, with probability at least 1 − α/4, the conditioned solution is α-block independent. By Corollary 21, this also implies that:

𝔼_{i,i′∈V_j} I(X_i; X_{i′} ∣ X_{i_1},…,X_{i_t}) ≤ α  for all j ∈ [c],

so the solution satisfies the desired independence property within each block.

Now consider the effect of the conditioning procedure on the SDP objective value. Let SDP denote the value of the SDP after conditioning. Since conditioning preserves expectations, we have 𝔼[SDP] = OPT. To bound the probability that the value drops by more than α, we apply Markov's inequality to the non-negative random variable 1 − SDP:

Pr[SDP < OPT − α] = Pr[1 − SDP > 1 − OPT + α] ≤ (1 − OPT)/(1 − OPT + α) ≤ 1/(1 + α),

where the last inequality uses 0 ≤ OPT ≤ 1.

Separately, as shown earlier, the probability that the conditioned solution fails to be α-block independent is at most α/4. By a union bound, the total failure probability is

α/4 + 1/(1 + α) < 1,

for all α ≤ 1. Hence, there exists a choice of conditioning – i.e., some t ≤ L and an assignment to X_{i_1},…,X_{i_t} – such that the resulting SDP solution is α-block independent and has objective value at least OPT − α.

This outcome can be found by brute-force search over all subsets of up to L = O(c²/α²) variables and their possible assignments. Together with solving the (L+ℓ)-round SDP, the overall runtime is thus

O(n^{ℓ+L}) = O(n^{ℓ+poly(c/α)}),

as claimed.
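As a quick numeric sanity check (not part of the proof), the union bound above needs α/4 + 1/(1+α) < 1 for every 0 < α ≤ 1, so that a good conditioning exists. A minimal Python sketch verifying this on a grid:

```python
# Sanity check of the union bound: the two failure events (losing block
# independence, losing too much SDP value) together have probability
# alpha/4 + 1/(1+alpha), which must stay strictly below 1 on (0, 1].
def failure_bound(alpha: float) -> float:
    """Total failure probability bound from the union bound above."""
    return alpha / 4 + 1 / (1 + alpha)

# Check on a fine grid of alpha values in (0, 1].
worst = max(failure_bound(a / 1000) for a in range(1, 1001))
assert worst < 1
```

The bound approaches 1 as α → 0, which is why the conditioning argument only yields a solution "with positive probability" rather than with high probability; the brute-force search then makes the guarantee deterministic.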


4.1 Algorithm

Input: Weighted graph G=(V,E) with vertex set partitioned as V = ∪_{i=1}^{c} V_i, parameters k_i ≤ |V_i|/2 for i ∈ [c], and 0 < ε ≤ 1/2.
Output: A set S ⊆ V such that |S∩V_i| = k_i for all i ∈ [c].

  1. (Preprocessing Step) Let G~=(V~,E~) be the approximate kernel obtained via the ConstrainedMaxCut kernel procedure with input (G,(k_1,…,k_c),ε).

  2. (SDP and Conditioning)

     (a) Solve a (3+4c²/ε¹²⁰)-round Lasserre SDP relaxation for ConstrainedMaxCut on G~ (see Section 4.2).

     (b) Apply Theorem 22 with α = ε⁶⁰ and ℓ = 2 to obtain a 2-level SDP solution that is ε⁶⁰-block independent and has value at least OPT~ − ε⁶⁰. From Lemma 24, we know OPT~ − ε⁶⁰ ≥ (1−ε)OPT~.

  3. (Rounding) Apply Algorithm 5.3 from [22] to obtain a (random) set S^. Let ℰ_i denote the event that |S^∩V_i| ∈ [k_i − ε²|V_i|, k_i + ε²|V_i|] for each i ∈ [c], and define ℰ := ∩_{i=1}^c ℰ_i.

  4. (Correction) If ℰ does not occur, return an arbitrary feasible set. Otherwise, for each part V_i, randomly add or remove vertices to ensure |S^∩V_i| = k_i, and return the resulting set S.

Lemma 23.

The expected value of the cut returned by Algorithm 4.1 is at least (α_cc − O(ε))·OPT. The running time of the algorithm is O(min{k/ε,n}^{poly(c/ε)} + poly(n)), where k = ∑_{i=1}^c k_i.

4.2 SDP Relaxation

We solve the following level-ℓ Lasserre SDP relaxation for ConstrainedMaxCut on the reduced graph G~:

max ∑_{{i,j}∈E~} w_{i,j} μ_{{i,j}}[X_{{i,j}} ∈ {(1,−1),(−1,1)}] (10)
s.t. ∑_{i∈V~_j} μ_{S∪{i}}(X_i = 1 ∣ X_S = α) = k_j  ∀ j ∈ [c], |S| ≤ ℓ−1, α ∈ {0,1}^S
μ_{S∪{s_j}}(X_{s_j} = 1 ∣ X_S = α) = 0  ∀ j ∈ [c], |S| ≤ ℓ−1, α ∈ {0,1}^S
μ is a level-ℓ pseudo-distribution.

4.3 Analysis

Proof of Lemma 23.

By Lemma 24, we have OPT~ ≥ (1−4cε)OPT, and after solving the SDP and conditioning, the objective remains at least (1−ε)OPT~ ≥ (1−(4c+1)ε)OPT.

Since the SDP solution is ε⁶⁰-block independent, using Theorem 11 with W as the uniform distribution over each V_i, the variance of |S^∩V_i|/|V_i| is O(ε⁵). By Chebyshev's inequality, the event ℰ_i occurs with probability at least 1−ε, so the joint event ℰ = ∩_{i=1}^c ℰ_i occurs with probability at least 1−cε.

The expected value of the cut after rounding is at least α_cc(1−(4c+1)ε)OPT. Conditioning on ℰ, we have:

𝔼[δ(S^) ∣ ℰ] ≥ (𝔼[δ(S^)] − cε·OPT)/(1 − cε)
≥ ((α_cc(1−(4c+1)ε) − cε)/(1 − cε))·OPT ≥ (α_cc − O(cε))·OPT. (11)

Let S be the corrected set after adjusting S^ to satisfy the cardinality constraints exactly. Using the same argument as in Lemma 17, applied sequentially across the c parts, we get:

𝔼[δ(S)] ≥ (1−cε)(1−ε)^c 𝔼[δ(S^) ∣ ℰ] ≥ (α_cc − O(cε))·OPT. (12)

The total running time is dominated by solving the SDP and the brute-force conditioning, which take O(min{k/ε,n}^{poly(c/ε)}) time on the kernel G~. The preprocessing and postprocessing steps take poly(n) time.

Lemma 24.

Let OPT~ be the optimum value of ConstrainedMaxCut on G~ (conditioned on not picking any si).

  1. OPT~ ≥ (1−4cε)·OPT, where OPT is the optimum value of the ConstrainedMaxCut instance on G.

  2. OPT~ is at least an ε fraction of the total edge weight in E~.

Proof.

Part (1) follows from Theorem 14. For (2), consider sampling S = ∪_{i=1}^c S_i, where each S_i is a uniformly random subset of size k_i of V~_i ∖ {s_i}. Let n_i = |V~_i ∖ {s_i}|. Since 2k_i ≤ n_i ≤ k_i/ε, we have ε ≤ k_i/n_i ≤ 1/2.

  1. If an edge is adjacent to s_i, then since s_i is never in S, the edge is cut exactly when its other endpoint lies in S, which happens with probability at least ε.

  2. If both endpoints are in the same V_i, Lemma 18 (2) gives cut probability at least ε.

  3. If the endpoints lie in V_i and V_j (i ≠ j), the probability that exactly one endpoint lies in S is

     (k_i/n_i)(1 − k_j/n_j) + (k_j/n_j)(1 − k_i/n_i) ≥ ε/2 + ε/2 = ε.

So the expected cut value of S is at least an ε fraction of total edge weight in E~.
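The cross-part case of the argument above reduces to a fact about p = k_i/n_i and q = k_j/n_j lying in [ε, 1/2]: the cut probability p(1−q) + q(1−p) is minimized at p = q = ε. A small sketch (with an arbitrary illustrative value of ε) checking this numerically:

```python
# Grid check: for p, q in [eps, 1/2] (p = k_i/n_i, q = k_j/n_j, using
# 2k_i <= n_i <= k_i/eps), the probability that exactly one endpoint of a
# cross-part edge lies in S is p*(1-q) + q*(1-p) >= eps.
eps = 0.05  # illustrative value only
grid = [eps + t * (0.5 - eps) / 200 for t in range(201)]
worst = min(p * (1 - q) + q * (1 - p) for p in grid for q in grid)
assert worst >= eps
```

Since p + q − 2pq is increasing in each variable on [0, 1/2], the minimum on the grid is attained at p = q = ε, with value 2ε − 2ε² ≥ ε.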

Proof of Theorem 5.

Lemma 23 proves this theorem for Algorithm 4.1.

5 Arbitrary Number of Constraints

We consider the general case of ConstrainedMaxCut with an arbitrary number of constraints, potentially c=ω(1). First, we present a 0.5-approximation for the more general problem of Max-Cut under an arbitrary matroid constraint. Next, we establish an NP-hardness result for determining whether the optimal solution in a given instance of ConstrainedMaxCut with an arbitrary number of constraints equals the total number of edges in the graph.

5.1 Approximation Algorithm

Proof of Theorem 6.

Consider the following linear program:

max ∑_{e∈E} w_e y_e (13)
s.t. y_{u,v} ≤ x_u + x_v (14)
y_{u,v} ≤ 2 − (x_u + x_v) (15)
x ∈ 𝒫 (16)

where 𝒫 is the base polytope of the matroid ℳ. We can see that for any given x, the optimal choice for y_{u,v} is min{x_u + x_v, 2 − x_u − x_v}. When x is integral, this function coincides with the indicator of whether edge {u,v} has been cut. Since the matroid polytope admits an efficient separation oracle, we can solve the LP (13) efficiently.
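The claim that min{x_u + x_v, 2 − x_u − x_v} equals the cut indicator for integral x can be checked exhaustively; a minimal Python sketch (illustrative only):

```python
# For integral x, min{x_u + x_v, 2 - (x_u + x_v)} is 1 exactly when the
# edge {u,v} is cut (x_u != x_v) and 0 otherwise.
for xu in (0, 1):
    for xv in (0, 1):
        cut = 1 if xu != xv else 0
        assert min(xu + xv, 2 - (xu + xv)) == cut
```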

Now consider the following non-concave quadratic program:

max ∑_{{u,v}∈E} w_e (x_u + x_v − 2 x_u x_v) (17)
x ∈ 𝒫 (18)

Observe that the function x_u + x_v − 2 x_u x_v also coincides with the cut indicator of edge {u,v} when x is integral. Even though we cannot solve (17) efficiently, we can show that it has no integrality gap and, in fact, that we can round any fractional solution x^ to an integral solution x with value at least that of x^. The two crucial properties we need, both easy to verify, are:

  1. The function ∑_{{u,v}∈E} w_e (x_u + x_v − 2 x_u x_v) is convex in any direction e_u − e_v for distinct u, v ∈ V. Here e_u ∈ {0,1}^V is the indicator vector of vertex u.

  2. The polytope 𝒫 is the base polytope of a matroid, and hence is integral and admits efficient linear optimization.

Given these properties, any fractional solution can be pipage rounded (see [1] and especially Section 3.2 of [7]) to an integral solution with value at least that of the fractional solution. The final observation is that for any x ∈ [0,1]^V, we have

(x_u + x_v − 2 x_u x_v) ≤ min{x_u + x_v, 2 − x_u − x_v} ≤ 2 (x_u + x_v − 2 x_u x_v), (19)

which follows from Lemma 27. This implies that the integral optimum is at least 0.5 times the value of the LP (13), and in fact provides a way to find a rounding with value at least 0.5 times the LP value: solve the LP and pipage round the solution using the quadratic objective. Note that the proof idea for Theorem 6 is essentially the same as the one used in [1] for the Hypergraph Max k-cut with given sizes of parts problem, combined with the pipage rounding for matroids from [7].
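The pipage step can be sketched for the special case of a partition matroid (the general case needs a matroid oracle; all helper names below are our own, not from [1] or [7]). Each step moves along e_u − e_v for two fractional coordinates u, v in the same part and, by convexity of the quadratic objective along that direction, keeps the better endpoint:

```python
import itertools

def cut_value(edges, x):
    """Quadratic objective sum_e w_e*(x_u + x_v - 2*x_u*x_v); equals the
    cut weight when x is integral (0/1)."""
    return sum(w * (x[u] + x[v] - 2 * x[u] * x[v]) for u, v, w in edges)

def pipage_round(edges, parts, x, tol=1e-9):
    """Round a fractional point of a partition-matroid base polytope (the
    coordinates of each part sum to an integer) to an integral point
    without decreasing the quadratic objective."""
    x = dict(x)
    while True:
        frac = {v for v in x if tol < x[v] < 1 - tol}
        if not frac:
            return {v: round(x[v]) for v in x}
        # A part with a fractional coordinate has at least two of them,
        # since its coordinates sum to an integer.
        u, v = next((u, v) for p in parts
                    for u, v in itertools.combinations(p, 2)
                    if u in frac and v in frac)
        t_hi = min(1 - x[u], x[v])    # largest feasible move in +(e_u - e_v)
        t_lo = -min(x[u], 1 - x[v])   # largest feasible move in -(e_u - e_v)
        # Convexity along e_u - e_v: one of the two endpoints is no worse.
        t = max((t_lo, t_hi), key=lambda t: cut_value(
            edges, {**x, u: x[u] + t, v: x[v] - t}))
        x[u] += t
        x[v] -= t

# Toy example: a 4-cycle, one part {a, b, c, d} with cardinality k = 2.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 1.0), ("d", "a", 1.0)]
parts = [["a", "b", "c", "d"]]
x0 = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5}   # fractional objective 2.0
xr = pipage_round(edges, parts, x0)
assert sum(xr.values()) == 2                     # still a base of the matroid
assert cut_value(edges, xr) >= cut_value(edges, x0) - 1e-9
```

On this toy instance the rounding reaches the bipartition of the cycle, so the integral value strictly exceeds the fractional starting value.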

5.2 Hardness Result

Proof of Theorem 7.

We show a reduction from the 3D matching problem. An instance of the 3D matching problem is a tripartite 3-uniform hypergraph with parts X, Y, Z. The edges are triples (x,y,z) ∈ X×Y×Z. The problem is to decide whether there is a subset of the edges such that every vertex is included in exactly one edge.

The reduction is as follows: For every edge e=(x,y,z), consider the star graph on four vertices with the center labeled e and the leaves labeled (e,x), (e,y), (e,z), respectively. The overall graph G is simply the union of all these stars. The partition matroid consists of a part 𝒫_{x′} for every vertex x′ ∈ X, containing the vertices (e,x) with x = x′; we have parts similarly for the elements of Y and Z. The capacity of every part is exactly 1.

(Completeness) If there is a collection of edges {e_i}_{i∈M} such that every element of X, Y, Z is in exactly one edge, then consider the solution S = ∪_{i∈M} {(e_i, e_i.x), (e_i, e_i.y), (e_i, e_i.z)} ∪ ∪_{i∉M} {e_i}, where e_i.x := e_i∩X (and similarly for y and z). It is easy to see that δ(S) = 1 (every edge of G is cut) and |S∩𝒫_u| = 1 for every u ∈ X∪Y∪Z.

(Soundness) Since every star graph is bipartite, any solution with δ(S) = 1 must have the center on one side and the leaves on the other side of every star. This implies that every solution with δ(S) = 1 is of the form S = ∪_{i∈M} {(e_i, e_i.x), (e_i, e_i.y), (e_i, e_i.z)} ∪ ∪_{i∉M} {e_i} for some M ⊆ E, where E is the collection of triples from the 3D matching instance. The partition matroid constraint |S∩𝒫_u| = 1 for all u ∈ X∪Y∪Z translates exactly to M being a perfect 3D matching. Since it is NP-hard to decide if there is a perfect 3D matching, it is NP-hard to decide if there is a cut S ⊆ V with δ(S) = 1 and |S∩𝒫_i| = k_i for all i ∈ [c] when c = ω(1).
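The completeness direction of the reduction can be checked mechanically on a toy 3D matching instance (the instance and all identifiers below are illustrative, not from the paper):

```python
# Toy instance: X={x1,x2}, Y={y1,y2}, Z={z1,z2};
# triples e1=(x1,y1,z1), e2=(x2,y2,z2), e3=(x1,y2,z2).
# {e1, e2} is a perfect 3D matching.
triples = {"e1": ("x1", "y1", "z1"),
           "e2": ("x2", "y2", "z2"),
           "e3": ("x1", "y2", "z2")}

# Star gadget per triple: center e, leaves (e, u) for each element u of e.
edges = [(e, (e, u)) for e, tri in triples.items() for u in tri]

# Partition: one part per ground element u, holding all leaves (e, u).
parts = {u: {(e, u) for e, tri in triples.items() if u in tri}
         for tri in triples.values() for u in tri}

matching = {"e1", "e2"}
# Solution: leaves of matching triples plus centers of non-matching triples.
S = {e for e in triples if e not in matching}
S |= {(e, u) for e in matching for u in triples[e]}

# Every star edge is cut, and every part contributes exactly one vertex.
assert all((a in S) != (b in S) for a, b in edges)
assert all(len(p & S) == 1 for p in parts.values())
```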

 Remark 25.

In contrast to the above theorem, deciding whether the Max-Cut value equals the total number of edges in the graph is solvable in polynomial time when the number of constraints is constant. Moreover, the decision problem is solvable in quasi-polynomial time when the number of constraints is poly(log n).

References

  • [1] Alexander A. Ageev and Maxim Sviridenko. Pipage rounding: A new method of constructing algorithms with proven performance guarantee. Journal of Combinatorial Optimization, 8:307–328, 2004. doi:10.1023/B:JOCO.0000038913.96607.C2.
  • [2] Per Austrin, Siavosh Benabbas, and Konstantinos Georgiou. Better balance by being biased: A 0.8776-approximation for max bisection. ACM Transactions on Algorithms (TALG), 13(1):1–27, 2016. doi:10.1145/2907052.
  • [3] Per Austrin and Aleksa Stanković. Global cardinality constraints make approximating some max-2-csps harder. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, 145:24:1–24:17, 2019. doi:10.4230/LIPICS.APPROX-RANDOM.2019.24.
  • [4] Boaz Barak, Prasad Raghavendra, and David Steurer. Rounding semidefinite programming hierarchies via global correlation. In Annual Symposium on Foundations of Computer Science, pages 472–481, 2011. doi:10.1109/FOCS.2011.95.
  • [5] Niv Buchbinder, Moran Feldman, Joseph Naor, and Roy Schwartz. Submodular maximization with cardinality constraints. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 1433–1452. SIAM, 2014. doi:10.1137/1.9781611973402.106.
  • [6] Leizhen Cai. Parameterized complexity of cardinality constrained optimization problems. The Computer Journal, 51(1):102–121, 2008. doi:10.1093/COMJNL/BXM086.
  • [7] Gruia Calinescu, Chandra Chekuri, Martin Pál, and Jan Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740–1766, 2011. doi:10.1137/080733991.
  • [8] Ozan Candogan, Kostas Bimpikis, and Asuman Ozdaglar. Optimal pricing in networks with externalities. Operations Research, 60(4):883–905, 2012. doi:10.1287/OPRE.1120.1066.
  • [9] Uriel Feige, Marek Karpinski, and Michael Langberg. A note on approximating max-bisection on regular graphs. Information Processing Letters, 79(4):181–188, 2001. doi:10.1016/S0020-0190(00)00189-7.
  • [10] Uriel Feige and Michael Langberg. Approximation algorithms for maximization problems arising in graph partitioning. Journal of Algorithms, 41(2):174–211, 2001. doi:10.1006/JAGM.2001.1183.
  • [11] Uriel Feige and Michael Langberg. The RPR2 rounding technique for semidefinite programs. Journal of Algorithms, 60(1):1–23, 2006. doi:10.1016/J.JALGOR.2004.11.003.
  • [12] Noah Fleming, Pravesh Kothari, and Toniann Pitassi. Semialgebraic proofs and efficient algorithm design. Foundations and Trends® in Theoretical Computer Science, 14(1-2):1–221, 2019. doi:10.1561/0400000086.
  • [13] Dimitris Fotakis and Paris Siminelakis. On the efficiency of influence-and-exploit strategies for revenue maximization under positive externalities. Theoretical Computer Science, 539:68–86, 2014. doi:10.1016/J.TCS.2014.04.026.
  • [14] Alan Frieze and Mark Jerrum. Improved approximation algorithms for MAX k-CUT and MAX BISECTION. Algorithmica, 18(1):67–81, 1997. doi:10.1007/BF02523688.
  • [15] Michel X Goemans and David P Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM (JACM), 42(6):1115–1145, 1995. doi:10.1145/227683.227684.
  • [16] Venkatesan Guruswami, Yury Makarychev, Prasad Raghavendra, David Steurer, and Yuan Zhou. Finding almost-perfect graph bisections. In In Proceedings of Innovations in Computer Science, pages 321–337, 2011. URL: http://conference.iiis.tsinghua.edu.cn/ICS2011/content/papers/11.html.
  • [17] Eran Halperin and Uri Zwick. A unified framework for obtaining improved approximation algorithms for maximum graph bisection problems. Random Structures and Algorithms, 20, May 2002. doi:10.1002/rsa.10035.
  • [18] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM Journal on Computing, 37(1):319–357, 2007. doi:10.1137/S0097539705447372.
  • [19] Charlotte Laclau, Christine Largeron, and Manvi Choudhary. A survey on fairness for machine learning on graphs. arXiv preprint arXiv:2205.05396, 2022. doi:10.48550/arXiv.2205.05396.
  • [20] Monique Laurent. A comparison of the sherali-adams, lovász-schrijver, and lasserre relaxations for 0-1 programming. Mathematics of Operations Research, 28(3):470–496, 2003. doi:10.1287/MOOR.28.3.470.16391.
  • [21] Jon Lee, Vahab S. Mirrokni, Viswanath Nagarajan, and Maxim Sviridenko. Maximizing nonmonotone submodular functions under matroid or knapsack constraints. SIAM Journal on Discrete Mathematics, 23(4):2053–2078, 2010. doi:10.1137/090750020.
  • [22] Prasad Raghavendra and Ning Tan. Approximating CSPs with global cardinality constraints using SDP hierarchies. In Proceedings of the Symposium on Discrete Algorithms, pages 373–387. SIAM, 2012. doi:10.1137/1.9781611973099.33.
  • [23] Thomas Rothvoß. The lasserre hierarchy in approximation algorithms. Lecture Notes for the MAPSP, pages 1–25, 2013.
  • [24] Saket Saurabh and Meirav Zehavi. (k,n−k)-max-cut: An 𝒪*(2^p)-time algorithm and a polynomial kernel. Algorithmica, 80:3844–3860, 2018.
  • [25] Yinyu Ye. A .699-approximation algorithm for max-bisection. Mathematical Programming, 90:101–111, 2001. doi:10.1007/PL00011415.

Appendix A Basics of the Lasserre SDP Hierarchy

For detailed expositions on the sum-of-squares hierarchy, we refer the reader to the excellent surveys by Laurent [20], Rothvoß [23], and Fleming et al. [12]. This section briefly summarizes key ideas drawn from these sources.

Given a binary optimization problem with a linear relaxation defined by a matrix 𝑨 ∈ ℝ^{m×n} and right-hand side 𝒃 ∈ ℝ^m, consider the feasible region K = {x ∈ ℝ^V : 𝑨x ≥ 𝒃}.

We ask: how can we systematically strengthen this relaxation to better approximate the convex hull of integral solutions, conv(K ∩ {−1,1}^V)? (We work over {−1,1}^V rather than {0,1}^V since this is the natural domain for problems like Max-Cut, where signs encode partition membership.)

Points in this convex hull can be interpreted as distributions over the hypercube {−1,1}^V. The level-ℓ Lasserre SDP yields a pseudo-distribution μ = {μ_S}_{|S| ≤ ℓ}, where each μ_S : {−1,1}^S → [0,1] is a distribution over partial assignments to the subset S ⊆ V. However, there need not exist a global distribution whose marginals agree with these μ_S. Despite this, the pseudo-distribution satisfies the following key properties:

  1. Marginal Consistency: The pseudo-distributions are consistent across overlapping subsets. That is, for any subsets S, T ⊆ V with |S|, |T| ≤ ℓ and any assignment a ∈ {−1,1}^{S∩T}, we have:

     μ_{S∩T}(a) = ∑_{b∈{−1,1}^{S∖T}} μ_S(a∘b) = ∑_{c∈{−1,1}^{T∖S}} μ_T(a∘c), (20)

     where a∘b denotes the extension of a to S using b, and similarly a∘c for T.

  2. Conditioning: The SDP solution supports conditioning on the value of any variable i ∈ V. Given a level-ℓ pseudo-distribution and a variable i, there exist level-(ℓ−1) pseudo-distributions μ⁽⁺⁾, μ⁽⁻⁾ and a weight λ ∈ [0,1] such that for all S ⊆ V with |S| ≤ ℓ−1 and α ∈ {−1,1}^S,

     μ_S(α) = λ μ_S⁽⁺⁾(α) + (1 − λ) μ_S⁽⁻⁾(α), (21)

     where μ_S⁽⁺⁾(α) is nonzero only if α(i) = +1 and μ_S⁽⁻⁾(α) is nonzero only if α(i) = −1.

While these properties are also satisfied by weaker hierarchies like Sherali–Adams, the Lasserre hierarchy is uniquely characterized by an additional sum-of-squares condition: for every polynomial q(x) of degree at most ℓ, the pseudo-expectation of its square is non-negative:

𝔼_μ[q(x)²] ≥ 0. (22)

Assuming polynomials are multilinear (since we evaluate over the hypercube), any such polynomial p(x) can be written as p(x) = ∑_S c_S y_S(x), where y_S(x) := ∏_{i∈S} x_i and |S| ≤ ℓ. The pseudo-expectation then becomes:

𝔼_μ[p(x)] = ∑_S c_S ∑_{α∈{−1,1}^S} μ_S(α) y_S(α).

Moreover, to incorporate the linear constraints 𝑨x ≥ 𝒃, the Lasserre relaxation requires that:

𝔼_μ[q(x)² (∑_{j∈V} A_{i,j} x_j − b_i)] ≥ 0  ∀ i ∈ [m], for all q(x) of degree at most ℓ−1. (23)

A level-ℓ pseudo-distribution satisfying (22) and (23) can be found by solving a semidefinite program, as described below.

A.1 SDP Formulation

Definition 26 (Lasserre Hierarchy [23]).

Let K = {x ∈ ℝ^V : 𝐀x ≥ 𝐛}. The level-t Lasserre relaxation, denoted LAS_t(K), consists of vectors y ∈ ℝ^{2^V} satisfying:

M_t(y) = (y_{I∪J})_{|I|,|J| ≤ t} ≽ 0,
M_t^ℓ(y) = (∑_{i=1}^n A_{ℓ,i} y_{I∪J∪{i}} − b_ℓ y_{I∪J})_{|I|,|J| ≤ t−1} ≽ 0  ∀ ℓ ∈ [m],
y_∅ = 1.

Here, M_t(y) is the moment matrix, and M_t^ℓ(y) is the moment matrix of the slacks. The projection onto the original variables is denoted by LAS_t^{proj}(K) := {(y_{{1}},…,y_{{n}}) : y ∈ LAS_t(K)}.

This is a valid relaxation: for any integral solution x ∈ K ∩ {0,1}^n, the assignment y_S = ∏_{i∈S} x_i for all S ⊆ [n] yields a feasible point in LAS_t(K).

Each variable yS represents the pseudo-moment corresponding to all variables in S being assigned +1. From these, one can recover the pseudo-distribution via Möbius inversion:

μ_S(𝟙_{S′,S}) = ∑_{S∖S′ ⊆ T ⊆ S} (−1)^{|T∩S′|} y_T, (24)

where 𝟙_{S′,S} ∈ {−1,1}^S denotes the partial assignment that sets the variables in S′ to −1 and the remaining ones in S∖S′ to +1.
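The Möbius-inversion recovery of a distribution from its moments can be illustrated on a genuine (rather than pseudo-) distribution. Writing y_T = Pr[all variables in T equal 1] over 0/1 variables, inclusion–exclusion over the variables forced to 0 recovers the probability of each partial assignment; a small exhaustive sketch (helper names are our own):

```python
import random
from itertools import combinations, product

def subsets(s):
    s = sorted(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

random.seed(0)
n = 4
# A random true distribution over {0,1}^n.
weights = [random.random() for _ in range(2 ** n)]
total = sum(weights)
prob = {pt: w / total for pt, w in zip(product((0, 1), repeat=n), weights)}

def y(T):
    """Moment y_T = Pr[x_i = 1 for all i in T]."""
    return sum(p for pt, p in prob.items() if all(pt[i] == 1 for i in T))

S = frozenset({0, 1, 2})
for Sp in subsets(S):  # Sp: coordinates of S assigned 0, rest of S assigned 1
    lhs = sum(p for pt, p in prob.items()
              if all(pt[i] == 0 for i in Sp) and all(pt[i] == 1 for i in S - Sp))
    # Inclusion-exclusion over the coordinates forced to 0:
    rhs = sum((-1) ** len(T) * y(T | (S - Sp)) for T in subsets(Sp))
    assert abs(lhs - rhs) < 1e-12
```

For a pseudo-distribution the same algebra applies, except that the recovered values μ_S are only guaranteed to be consistent and non-negative up to the level of the hierarchy.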

Appendix B Omitted proofs

Lemma 27.

For any x, y ∈ [0,1], (x + y − 2xy) ≤ min{x + y, 2 − x − y} ≤ 2(x + y − 2xy).

Proof.

We consider the following cases.

  1. When x + y ≤ 1, the required inequality is

     x + y − 2xy ≤ x + y ≤ 2(x + y − 2xy).

     The leftmost inequality is equivalent to 0 ≤ xy, which is trivially true. The rightmost inequality is equivalent to 4xy ≤ x + y, which is true because 4xy ≤ (x+y)² ≤ x + y.

  2. When x + y ≥ 1, substitute x′ = 1 − x and y′ = 1 − y; note that x + y − 2xy = x′ + y′ − 2x′y′ and min{x + y, 2 − x − y} = x′ + y′. The required inequality becomes

     x′ + y′ − 2x′y′ ≤ x′ + y′ ≤ 2(x′ + y′ − 2x′y′).

     The conditions on x′, y′ are that they lie in [0,1] and that x′ + y′ ≤ 1. This is exactly the case above.
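A grid check of Lemma 27 (illustrative only; the two-case argument above is the actual proof):

```python
# Check (x + y - 2xy) <= min{x + y, 2 - x - y} <= 2*(x + y - 2xy)
# on a 101 x 101 grid over [0,1]^2.
pts = [i / 100 for i in range(101)]
for x in pts:
    for y in pts:
        q = x + y - 2 * x * y
        m = min(x + y, 2 - x - y)
        assert q <= m + 1e-12 and m <= 2 * q + 1e-12
```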

Proof of Lemma 17.

Suppose |S^| < k; then S is obtained by adding a uniformly random set of k − |S^| vertices from V~ ∖ (S^ ∪ {s}) to S^. The value |V~| is equal to k/ε + 1 if k/ε + 1 ≤ n, and to n otherwise. In the first case, the probability of an element of V~ ∖ S^ being added to S^ is at most (k − |S^|)/(k/ε − |S^|) ≤ ε. In the second case, it is (k − |S^|)/(n − |S^|). For this to be at most ε, it suffices to have |S^| ≥ (k − εn)/(1 − ε), which holds because, in fact, we can show that

(k − εn)/(1 − ε) ≤ k − ε²n ≤ |S^|.

The leftmost inequality is equivalent to k ≤ n(1 − ε + ε²), which is true because k ≤ n/2 ≤ n(1 − ε + ε²). Using Lemma 28, we conclude that 𝔼[δ(S)] ≥ (1 − ε)·δ(S^).

If |S^| > k, then S is obtained by removing a uniformly random subset of size |S^| − k from S^. The probability of an element of S^ being removed is (|S^| − k)/|S^|. For this to be at most ε, it suffices to have |S^| ≤ k/(1 − ε), which holds because, in fact, |S^| ≤ k + ε²|V~| ≤ k/(1 − ε). The rightmost inequality is equivalent to |V~| ≤ k/(ε(1 − ε)), which is true because |V~| ≤ k/ε + 1 ≤ k/(ε(1 − ε)). Using Lemma 28 on V~ ∖ S^, we get 𝔼[δ(S)] ≥ (1 − ε)·δ(S^).

Lemma 28 (Lemma 2.2 in [5]).

For any set S ⊆ V, if R ⊆ V ∖ S is a random set such that each element of V ∖ S is included in R (not necessarily independently) with probability at most p, then

𝔼[δ(S ∪ R)] ≥ (1 − p)·δ(S). (25)

This fact is generally true for any non-negative submodular function.
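Lemma 28 can be sanity-checked by exact enumeration on a tiny graph, in the special case where each element is included in R independently with probability exactly p (the lemma allows dependence and probabilities at most p; the graph and names below are illustrative):

```python
from itertools import product

def cut(edges, S):
    """Weight of edges with exactly one endpoint in S."""
    return sum(w for u, v, w in edges if (u in S) != (v in S))

edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (3, 0, 3.0), (0, 2, 1.0)]
S = {0, 2}
outside = [1, 3]
p = 0.3

# Exact expectation of delta(S u R) over all 2^|outside| choices of R.
expected = 0.0
for bits in product((0, 1), repeat=len(outside)):
    R = {v for v, b in zip(outside, bits) if b}
    pr = 1.0
    for b in bits:
        pr *= p if b else (1 - p)
    expected += pr * cut(edges, S | R)

assert expected >= (1 - p) * cut(edges, S) - 1e-12
```

On this instance the bound is tight: every cut edge has exactly one endpoint outside S and survives with probability 1 − p, and no edge joins two outside vertices, so the expectation equals (1 − p)·δ(S) exactly.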