
Biased Linearity Testing in the 1% Regime

Subhash Khot, Courant Institute of Mathematical Sciences, New York University, NY, USA; Kunal Mittal, Department of Computer Science, Princeton University, NJ, USA
Abstract

We study linearity testing over the p-biased hypercube ({0,1}^n, μ_p^n) in the 1% regime. For a distribution ν supported over {x ∈ {0,1}^k : ∑_{i=1}^k x_i = 0 (mod 2)}, with marginal distribution μ_p in each coordinate, the corresponding k-query linearity test Lin(ν) proceeds as follows: Given query access to a function f: {0,1}^n → {−1,1}, sample (x_1,…,x_k) ∼ ν^n, query f on x_1,…,x_k, and accept if and only if ∏_{i∈[k]} f(x_i) = 1.

Building on the work of Bhangale, Khot, and Minzer (STOC ’23), we show, for 0 < p ≤ 1/2, that if k ≥ 1 + 1/p, then there exists a distribution ν such that the test Lin(ν) works in the 1% regime; that is, any function f: {0,1}^n → {−1,1} passing the test Lin(ν) with probability ≥ 1/2 + ϵ, for some constant ϵ > 0, satisfies Pr_{x∼μ_p^n}[f(x) = g(x)] ≥ 1/2 + δ, for some linear function g, and a constant δ = δ(ϵ) > 0.

Conversely, we show that if k < 1 + 1/p, then no such test Lin(ν) works in the 1% regime. Our key observation is that the linearity test Lin(ν) works if and only if the distribution ν satisfies a certain pairwise independence property.

Keywords and phrases:
Linearity test, 1% regime, p-biased
Funding:
Subhash Khot: Research supported by NSF Award CCF-1422159, NSF Award CCF-2130816, and the Simons Investigator Award.
Kunal Mittal: Research supported by NSF Award CCF-2007462, and the Simons Investigator Award.
Copyright and License:
© Subhash Khot and Kunal Mittal; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Computational complexity and cryptography
Related Version:
Full Version: https://arxiv.org/abs/2502.01900
Acknowledgements:
We thank Amey Bhangale, Yang P. Liu, and Dor Minzer for discussions that helped this project. Amey and Dor politely declined to be co-authors.
Editors:
Srikanth Srinivasan

1 Introduction

A function f: {0,1}^n → {−1,1} is said to be linear over 𝔽_2 (identifying the range 𝔽_2 with {−1,1} under the map b ↦ (−1)^b) if there exists a set S ⊆ [n] such that f(x) = ∏_{i∈S} (−1)^{x_i}; this function is denoted by χ_S. The classical linearity testing problem asks, given query access to a function f: {0,1}^n → {−1,1} (that is, the algorithm may query the value of f(x) at any x ∈ {0,1}^n), to distinguish between the following two cases (the algorithm may answer arbitrarily for functions f that violate both conditions):

  1. 1.

    f is a linear function.

  2. 2.

    f is far from being linear; that is, for every linear function χS, the functions f and χS disagree on many points.

Linearity testing was first studied by Blum, Luby, and Rubinfeld, who gave a very simple 3-query test for this problem [15]. This test, known as the BLR test, proceeds in the following manner: Sample x, y ∈ {0,1}^n uniformly and independently; query f at x, y, and x⊕y, and accept if and only if f(x⊕y) = f(x)f(y). Observe that this test accepts all linear functions with probability 1. Blum, Luby and Rubinfeld proved that any function f passing this test with high probability (1−δ, for some small δ > 0) must agree with some linear function χ_S on most (at least a 1−O(δ) fraction of) points in {0,1}^n. This result, with the acceptance/agreement probability close to 1, is known as the 99%-regime of the test.
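To make the test concrete, here is a small simulation sketch (ours, not from the paper; the functions and parameter choices are only illustrative) that estimates the acceptance probability of the BLR test, once for a linear function and once for a uniformly random function.

```python
import random

def blr_accept_prob(f, n, trials=100_000):
    """Estimate the probability, over uniform x and y, that f(x XOR y) == f(x) * f(y)."""
    accept = 0
    for _ in range(trials):
        x, y = random.getrandbits(n), random.getrandbits(n)
        accept += (f(x ^ y) == f(x) * f(y))
    return accept / trials

def chi(S):
    """The linear function chi_S(x) = (-1)^{sum of x_i over i in S}, with S given as a bitmask."""
    return lambda x: -1 if bin(x & S).count("1") % 2 else 1

n = 16
print(blr_accept_prob(chi(0b1011), n))                    # ~1.0: linear functions always pass
table = {x: random.choice([-1, 1]) for x in range(2 ** n)}
print(blr_accept_prob(lambda x: table[x], n))             # ~0.5: a random function passes about half the time
```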

It was shown later [4, 23] that the above result extends to the 1% regime as well; more precisely, for every δ ∈ [0,1], and f: {0,1}^n → {−1,1} such that

𝔼_{x,y∼{0,1}^n}[f(x)f(y)f(x⊕y)] ≥ δ,

there exists S ⊆ [n] such that 𝔼_{x∼{0,1}^n}[f(x)χ_S(x)] ≥ δ.

The above test is of fundamental importance in theoretical computer science, and has several applications; for example, it is one of the ingredients in the proof of the celebrated PCP theorem [19, 3, 2]. Furthermore, the analysis of the BLR test by Bellare et al. [4] is one of the early uses of Fourier analysis over the boolean hypercube, an area which now plays a crucial role in many diverse subfields of mathematics and computer science, such as complexity theory, hardness of approximation, learning theory, coding theory, and social choice theory [28].

In this work, we are interested in the problem of linearity testing over the p-biased hypercube. For p ∈ (0,1), we denote by μ_p the p-biased distribution on {0,1}, which assigns probability p to 1, and 1−p to 0. The p-biased hypercube refers to the set {0,1}^n, with the n-fold product measure μ_p^n. Linearity testing, in this p-biased setting, asks to distinguish between linear functions, and functions which are far (with respect to the p-biased measure) from being linear.

The 99% regime of this problem is well-understood [24, 17], and a simple 4-query test works in this case (see Example 4 below). The question for the 1% regime turns out to be significantly more challenging for any p ≠ 1/2, and was wide open until a recent work of Bhangale, Khot and Minzer [11] made significant progress. In particular, for every p ∈ (1/3, 2/3), they give a 4-query test that works in the 1% regime.

Building upon the work of Bhangale, Khot and Minzer, we consider a very general class of tests, where, very roughly, some k queries x_1,…,x_k ∈ {0,1}^n satisfying ∑_{i∈[k]} x_i = 0 (mod 2) are chosen, and the test accepts f: {0,1}^n → {−1,1} if ∏_{i∈[k]} f(x_i) = 1. We shall require the following definitions:

Definition 1 (Class of Distributions).

For k ∈ ℕ, p ∈ (0,1), we define 𝒟(p,k) to be the class of all distributions ν on {0,1}^k having μ_p as the marginal distribution on each coordinate i ∈ [k], and such that supp(ν) ⊆ {x ∈ {0,1}^k : ∑_{i=1}^k x_i = 0 (mod 2)}. We say that such a distribution ν has full even-weight support if the above inclusion is an equality.

For a distribution ν ∈ 𝒟(p,k), we say that i ∈ [k] is a pairwise independent coordinate if for each j ∈ [k], j ≠ i, it holds that 𝔼_{X∼ν}[X_i X_j] = p². We say that ν is pairwise independent if all its coordinates are pairwise independent.

Definition 2 (Class of Linearity Tests).

For a distribution ν ∈ 𝒟(p,k), we define a corresponding linearity test, denoted by Lin(ν), as follows. Given query access to a function f: {0,1}^n → {−1,1}: Sample x = (x_1,…,x_k) ∼ ν^n (here, by x = (x_1,…,x_k) ∼ ν^n, we mean that for each j ∈ [n], we sample (x_1^{(j)},…,x_k^{(j)}) ∼ ν independently; see also Section 2 for notation), and accept if and only if f(x_1)f(x_2)⋯f(x_k) = 1.

Note that every linear function passes such a test with probability 1 (when k is even, affine functions of the form ±χ_S also pass the test with probability 1; in this work, we shall ignore the distinction between these functions and linear functions). More strongly, since each query in ν^n has marginal distribution μ_p^n, functions that are close to linear (with respect to the p-biased measure) are also accepted with high probability; in the property testing literature, such tests are called tolerant. Furthermore, this is a very general class of linearity tests, containing many of the previously mentioned tests, as demonstrated by the following examples:

Example 3.

The BLR test uses ν equal to the uniform distribution over {x ∈ {0,1}³ : x_1 + x_2 + x_3 = 0 (mod 2)}.
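As an illustration of Definition 2 (a sketch of ours, with the distribution ν represented as a dictionary of probabilities; all names are ours), the following code samples (x_1,…,x_k) ∼ ν^n coordinate-wise and runs one round of Lin(ν), here instantiated with the BLR distribution from Example 3.

```python
import random

def sample_nu_power_n(nu, n):
    """Sample x = (x_1, ..., x_k) ~ nu^n, where nu maps k-bit tuples (of even weight)
    to probabilities. Returns a list of k query points, each an n-bit tuple."""
    support, probs = zip(*nu.items())
    k = len(support[0])
    columns = random.choices(support, weights=probs, k=n)      # one independent draw of nu per j in [n]
    return [tuple(col[i] for col in columns) for i in range(k)]

def lin_test(f, nu, n):
    """One run of the test Lin(nu): accept iff the product of f over the k queries equals 1."""
    prod = 1
    for q in sample_nu_power_n(nu, n):
        prod *= f(q)
    return prod == 1

# Example 3: nu uniform on the even-weight vectors of {0,1}^3 (the BLR test).
nu_blr = {(0, 0, 0): 0.25, (1, 1, 0): 0.25, (1, 0, 1): 0.25, (0, 1, 1): 0.25}
chi = lambda x: (-1) ** (x[0] + x[2])                      # the linear function chi_{1,3}
print(sum(lin_test(chi, nu_blr, 10) for _ in range(2000)) / 2000)   # ~1.0
```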

Example 4.

The 4-query p-biased test of [17] (for the 99% regime) uses a distribution ν over {0,1}⁴ of the following form: With probability p_0, set all coordinates to 0; with probability p_1, set all coordinates to 1; and with probability 1 − p_0 − p_1, sample uniformly from the set {x ∈ {0,1}⁴ : x_1 + x_2 + x_3 + x_4 = 0 (mod 2)}. Note that each coordinate has bias p_1 + ½·(1 − p_0 − p_1), and p_0, p_1 are chosen so that this equals p.
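For concreteness, the sketch below (ours; the values p_0 = 0.3, p_1 = 0.1 are an illustrative choice giving bias p = 0.4) builds the distribution of Example 4 and checks its marginal bias; it also shows that its coordinates are not pairwise independent, a point we return to in Remark 9.

```python
from itertools import product

def example4_nu(p0, p1):
    """Example 4: all-zeros w.p. p0, all-ones w.p. p1, otherwise uniform over the
    8 even-weight vectors of {0,1}^4 (which include 0000 and 1111)."""
    even = [x for x in product((0, 1), repeat=4) if sum(x) % 2 == 0]
    nu = {x: (1 - p0 - p1) / len(even) for x in even}
    nu[(0, 0, 0, 0)] += p0
    nu[(1, 1, 1, 1)] += p1
    return nu

p0, p1 = 0.3, 0.1                                   # chosen so that p1 + (1 - p0 - p1)/2 = 0.4
nu = example4_nu(p0, p1)
bias = sum(prob * x[0] for x, prob in nu.items())           # marginal of coordinate 1
corr = sum(prob * x[0] * x[1] for x, prob in nu.items())    # E[X1 * X2]
print(bias, corr, bias ** 2)   # bias = 0.4 = p, but corr = 0.25 != 0.16 = p^2: not pairwise independent
```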

In this work, we analyze the precise conditions under which the tests in Definition 2 work for linearity testing in the 1% regime. Our main result (proven in Section 6) is the following:

Theorem 5.

Let p ∈ (0,1).

  1.

    For every integer k > 1 + 1/min{p, 1−p}, there exists a distribution ν ∈ 𝒟(p,k), such that the test Lin(ν) is a k-query linearity test over the p-biased hypercube, for the 1% regime.

    That is, for every ϵ > 0, there exists a δ > 0, such that for every large n, and every function f: {0,1}^n → [−1,1] satisfying

    |𝔼_{(X_1,…,X_k)∼ν^n}[∏_{i∈[k]} f(X_i)]| ≥ ϵ,

    there exists a set S ⊆ [n], such that |𝔼_{X∼μ_p^n}[f(X)χ_S(X)]| ≥ δ.

  2.

    The above point also holds for all integers k ≥ 3 with p = 1/(k−1), and for all even integers k ≥ 4 with p = 1 − 1/(k−1).

  3.

    Conversely, for every positive integer k < 1 + 1/min{p, 1−p}, and every distribution ν ∈ 𝒟(p,k), the test Lin(ν) fails in the 1% regime.

    That is, there exists a constant α > 0, such that for every large n, there exists a function f: {0,1}^n → {−1,1} satisfying

    |𝔼_{(X_1,…,X_k)∼ν^n}[∏_{i∈[k]} f(X_i)]| ≥ α,

    and such that for every S ⊆ [n], it holds that |𝔼_{X∼μ_p^n}[f(X)χ_S(X)]| ≤ o_n(1).

 Remark 6.

Note that the above theorem does not discuss the case when k ≥ 5 is an odd integer and p = 1 − 1/(k−1). This case is very interesting and is discussed in more detail in Section 6.1. Informally speaking, the test corresponding to the “natural” distribution ν ∈ 𝒟(p,k) in this case ensures correlation with a character of ℤ/(k−1)ℤ, and not a linear function χ_S (that is, a character of ℤ/2ℤ). In Section 6.1, we also present an alternative test to get around this.

Next, we shall describe the main technical results we prove along the way to prove Theorem 5. We start by stating (a generalized version of) the main linearity testing result of Bhangale, Khot and Minzer [11]:

Theorem 7 (General version proved later as Theorem 33).

Let k ≥ 3 be a positive integer, let p ∈ (0,1), ϵ ∈ (0,1] be constants, and let ν ∈ 𝒟(p,k) be a distribution with full even-weight support (see Definition 1). Then, there exist constants δ > 0, d ∈ ℕ (possibly depending on k, p, ϵ, ν), such that for every large enough n, the following is true:

Let f: {0,1}^n → [−1,1] be a function such that

|𝔼_{(X_1,…,X_k)∼ν^n}[∏_{i=1}^k f(X_i)]| ≥ ϵ.

Then, there exists a set S ⊆ [n], and a polynomial g: {0,1}^n → ℝ of degree at most d and with 2-norm 𝔼_{X∼μ_p^n}[g(X)²] ≤ 1, such that

|𝔼_{X∼μ_p^n}[f(X)χ_S(X)g(X)]| ≥ δ.

Moreover, if the distribution ν has some pairwise independent coordinate, then we may assume g ≡ 1; that is, f correlates with a linear function χ_S.

We remark that Bhangale, Khot and Minzer only consider the case k = 4, and only show g ≡ 1 in the case that all coordinates of ν are pairwise independent. However, their proofs extend to the more general setting of Theorem 7; we give an outline of this proof in Section 7. Furthermore, we note that we are able to analyze the linearity test for a class of distributions which is much larger than the class of full even-weight support distributions; these distributions, in some sense, contain the BLR test, and are formally defined in Section 7.

In the above work, the authors ask whether the conclusion g ≡ 1 can be obtained without the assumption that ν has a pairwise independent coordinate. We show this is not possible; in fact, the assumption that ν has a pairwise independent coordinate is necessary.

Theorem 8 (Restated and proved later as Theorem 24).

Let k ∈ ℕ, p ∈ (0,1), and let ν ∈ 𝒟(p,k) be a distribution having no pairwise independent coordinate (see Definition 1).

Then, there exists a constant α > 0, such that for every large enough n, there exists a function f: {0,1}^n → [−1,1] such that

  1.

    |𝔼_{X∼ν^n}[∏_{i=1}^k f(X_i)]| ≥ α.

  2.

    For every S ⊆ [n], it holds that |𝔼_{X∼μ_p^n}[f(X)χ_S(X)]| ≤ o_n(1).

Moreover, if the distribution ν is such that η := max_{i,j∈[k], i≠j} Pr_{X∼ν}[X_i = X_j] < 1 (that is, no two coordinates are almost surely equal), the above holds for a function f with range {−1,1}.

 Remark 9.
  1. 1.

    The assumption η < 1 in the second part of Theorem 8 is necessary. For example, if the i-th and j-th coordinates of ν are equal, then, for functions f with range {−1,1}, the terms f(X_i) and f(X_j) cancel out in the product 𝔼_{X∼ν^n}[∏_{i=1}^k f(X_i)]. In particular, the test is equivalent to the (k−2)-query test with coordinates i, j removed from ν, and this new distribution may possibly satisfy the conditions of Theorem 7.

  2. 2.

    The function f we construct in Theorem 8 does not correlate well with any linear function, although, as possibly required by Theorem 7, it does correlate well with some constant degree function.

  3. 3.

    The above theorem answers in the negative a question of [11], who ask whether

    |𝔼_{(X,Y,Z,W)∼ν^n}[g_1(X)g_2(Y)g_3(Z)g_4(W)]| = o_n(1)

    for distributions ν ∈ 𝒟(p,4) with full even-weight support, and bounded, noise stable, and resilient functions g_1,…,g_4: {0,1}^n → ℝ.

  4. 4.

    It is an easy check that the distribution ν from Example 4 cannot have a pairwise independent coordinate, unless p = 1/2. This shows that for p ≠ 1/2, simple tests that work in the 99% regime fail to work in the 1% regime.

  5. 5.

    Recall that every ν ∈ 𝒟(p,k) satisfies ∑_i X_i = 0 (mod 2) almost surely, for X ∼ ν. We never use this in the proof of the above theorem, and the conclusion holds without it.

Very roughly speaking, in the proof of the above theorem, we first construct a counter-example function in Gaussian space which “passes the test” with decent probability, while having zero expectation; this function is then converted to a boolean function using the Central Limit Theorem and a rounding procedure. Along the way, we prove a simple characterization of when a random vector has an independent coordinate, which we believe to be of independent interest; it is stated as follows:

Proposition 10 (Restated formally and proved later as Proposition 19).

Let X = (X_1,…,X_k) be a k-dimensional multivariate Gaussian random vector, such that for each i ∈ [k], the marginal is X_i ∼ 𝒩(0,1). Then, the following are equivalent:

  1.

    For every “nice” function f: ℝ → ℝ satisfying 𝔼_{Z∼𝒩(0,1)}[f(Z)] = 0, it holds that 𝔼[f(X_1)f(X_2)⋯f(X_k)] = 0.

  2.

    There exists i ∈ [k] such that X_i is independent of (X_1,…,X_{i−1},X_{i+1},…,X_k).

Finally, to use the above theorems (Theorem 7 and Theorem 8), we analyze the tradeoff between the number of queries k and the bias p, such that a distribution ν𝒟(p,k) with some pairwise independent coordinate exists. In particular, we prove the following (restated and proved later as Proposition 25 and Proposition 27):

Proposition 11.

Let k ∈ ℕ, p ∈ (0,1). Then, there exists a distribution ν ∈ 𝒟(p,k) with some pairwise independent coordinate if and only if k ≥ 1 + 1/min{p, 1−p}.

We note that the above generalizes the parameter setting for both the BLR test, corresponding to p = 1/2, k = 3, and the case of p ∈ (1/3, 2/3), k = 4 considered in [11].
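As a quick sanity check of the threshold in Proposition 11, the following snippet (illustrative, not from the paper) computes the smallest number of queries k for which a pairwise independent distribution in 𝒟(p,k) can exist, for a few values of p.

```python
from fractions import Fraction
import math

def min_queries(p):
    """Smallest integer k with k >= 1 + 1/min(p, 1-p), as in Proposition 11."""
    p = Fraction(p)
    return math.ceil(1 + 1 / min(p, 1 - p))

for p in (Fraction(1, 2), Fraction(2, 5), Fraction(1, 3), Fraction(1, 4), Fraction(1, 10)):
    print(p, min_queries(p))
# p = 1/2 gives k >= 3 (BLR); p in (1/3, 2/3) gives k >= 4, matching [11]; p = 1/4 gives k >= 5.
```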

1.1 Related work

The problem of linearity testing has been extensively studied, starting with the work of Blum, Luby and Rubinfeld [15], who gave a test for the uniform distribution, in the 99% regime. The analysis of their test was later extended to the 1% regime [4, 23]. Tests for linearity have also been studied in the low-randomness regime, and in the setting of non-abelian groups [6, 5, 29].

For the p-biased case, in the 99% regime, Halevy and Kushilevitz [20] gave a 3-query linearity test that only uses random samples from the p-biased distribution! However, the test is not tolerant: it makes queries that are not distributed according to μ_p^n, and hence may reject functions that are very close to linear (with respect to the p-biased measure). Tolerant testers were analyzed later [24, 17]. More strongly, the work of Dinur, Filmus and Harsha [17] gives a 2^d-query tolerant tester for p-biased testing of degree-d functions over 𝔽_2, a problem which has been well studied over the uniform distribution [1, 14].

As a part of their work on approximability of satisfiable constraint satisfaction problems [9, 10, 11, 12, 13], Bhangale, Khot and Minzer study the p-biased version of linearity testing, in the 1% regime. As mentioned before, they give a 4-query test for p(13,23).

David, Dinur, Goldenberg, Kindler and Shinkar [16] study linearity testing on the k-slice (vectors of Hamming weight k), denoted by L_{k,n}, of the n-dimensional boolean hypercube, for even integers k. They show that if f: {0,1}^n → {−1,1} is such that f(x⊕y) = f(x)f(y) with probability 1−ϵ over x, y, x⊕y (conditioned on all lying in L_{k,n}), then f agrees with a linear function on a 1−δ fraction of L_{k,n}, where δ = δ(ϵ) → 0 as ϵ → 0. In a recent work, Kalai, Lifshitz, Minzer and Ziegler [22] prove a similar result for the n/2-slice, in the 1% regime.

1.2 Organization of the paper

We start by presenting some preliminaries in Section 2. In Section 3, we prove a variant of Theorem 8 over the Gaussian distribution, which is then used in Section 4 to prove Theorem 8. In Section 5, we analyze the tradeoff between the bias p and the number of queries k for the existence of a valid linearity test. Combining all these results, we prove Theorem 5 in Section 6. In Section 7, we outline the proof of Theorem 7.

2 Preliminaries

We use exp to denote the exponential function, given by exp(x) = e^x for x ∈ ℝ.

Let ℕ = {1, 2, …} be the set of natural numbers. For each n ∈ ℕ, we use [n] to denote the set {1, 2, …, n}. For non-negative functions f, g: ℕ → ℝ, we say that f(n) = o_n(g(n)) if lim_{n→∞} f(n)/g(n) = 0.

For a probability distribution ν on 𝒳, we use supp(ν) to denote its support. For n ∈ ℕ, we use ν^n to denote the n-fold product distribution on 𝒳^n. In particular, we shall be interested in the case when 𝒳 ⊆ ℝ^k for some k ∈ ℕ. In this case, for vectors x ∈ 𝒳^n, we shall use subscripts for indices in [k] and superscripts for indices in [n]; that is, for each i ∈ [k], j ∈ [n], we use x_i^{(j)} to denote the (i,j)-th coordinate of x. Further, for each i ∈ [k], we use x_i to denote the vector (x_i^{(1)},…,x_i^{(n)}) ∈ ℝ^n, and similarly, for each j ∈ [n], we use x^{(j)} to denote the vector (x_1^{(j)},…,x_k^{(j)}) ∈ ℝ^k.

For k ∈ ℕ, let S_k denote the group of all permutations of [k]. For each π ∈ S_k and x ∈ ℝ^k, we use x_π to denote (x_{π(1)},…,x_{π(k)}) ∈ ℝ^k. With this notation, we define the symmetrization of functions over ℝ^k:

Definition 12.

For any function f: ℝ^k → ℝ, we define its symmetrization as the function Sym(f): ℝ^k → ℝ, given by Sym(f)(x) = ∑_{π∈S_k} f(x_π).

We shall use the following facts from probability theory:

Fact 13 (Chebyshev’s Inequality; see [18] for reference).

Let X be a random variable such that 𝔼[X²] < ∞. Then, for any a > 0,

Pr[|X − 𝔼[X]| ≥ a] ≤ Var[X]/a².
Fact 14 (Hoeffding’s Inequality [21]).

Let X_1,…,X_n be independent random variables such that a_i ≤ X_i ≤ b_i almost surely, and let S = ∑_{i=1}^n X_i. Then, for all t > 0,

Pr[|S − 𝔼[S]| ≥ t] ≤ 2·exp(−2t² / ∑_{i=1}^n (b_i − a_i)²).
Theorem 15 (Multivariate Central Limit Theorem; see [18] for reference).

Let X^{(1)}, X^{(2)},… be ℝ^k-valued i.i.d. random vectors, with mean zero and a finite covariance matrix Σ ∈ ℝ^{k×k} given by Σ_{i,j} = 𝔼[X_i^{(1)} X_j^{(1)}]. If S_n = (1/√n)·∑_{i=1}^n X^{(i)}, then S_n converges in distribution to Z as n → ∞, for Z ∼ 𝒩(0,Σ). That is, for every bounded continuous function H: ℝ^k → ℝ,

lim_{n→∞} 𝔼[H(S_n)] = 𝔼_{Z∼𝒩(0,Σ)}[H(Z)].

We shall also use the following fact about zeros of polynomials:

Lemma 16.

Let p_1,…,p_r: ℝ^k → ℝ be non-zero polynomials. Then, there exists y ∈ ℝ^k such that for each permutation π ∈ S_k, and each i ∈ [r], it holds that p_i(y_π) ≠ 0.

Proof Sketch.

The zero-set of any non-zero polynomial has measure zero, with respect to the Lebesgue measure on ℝ^k. Hence, by sub-additivity, the set of points y ∈ ℝ^k violating the statement of the lemma has measure zero as well.

Next, we give some basic results about the probabilist’s Hermite polynomials. The reader is referred to Chapter 11 in [28] for more details.

Definition 17.

The Hermite polynomials (H_j)_{j≥0} are univariate polynomials, with H_j a monic polynomial of degree j, satisfying the power series expression

exp(tx − t²/2) = ∑_{j=0}^∞ (1/j!)·H_j(x)·t^j,   for t, x ∈ ℝ.

Note that the series above is absolutely convergent, with ∑_{j=0}^∞ (1/j!)·|H_j(x)|·|t|^j ≤ exp(|t||x| + t²/2) for each t, x ∈ ℝ.
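As a quick numerical illustration (not part of the paper), the probabilist's Hermite polynomials can be generated by the standard recurrence H_{j+1}(x) = x·H_j(x) − j·H_{j−1}(x), and the generating function of Definition 17 can be checked by truncating the series.

```python
import math

def hermite(j, x):
    """Probabilist's Hermite polynomial H_j(x) via the recurrence
    H_{j+1}(x) = x*H_j(x) - j*H_{j-1}(x), with H_0 = 1 and H_1 = x."""
    h_prev, h = 1.0, x
    if j == 0:
        return h_prev
    for m in range(1, j):
        h_prev, h = h, x * h - m * h_prev
    return h

# Check the power series of Definition 17 at a sample point (series truncated at j = 30).
t, x = 0.7, 1.3
series = sum(hermite(j, x) * t ** j / math.factorial(j) for j in range(30))
print(series, math.exp(t * x - t ** 2 / 2))   # the two values agree to high precision
```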

Lemma 18.

Let k ∈ ℕ and s_1, s_2,…,s_k ≥ 0 be integers, and let Σ ∈ ℝ^{k×k} be a positive semi-definite matrix such that Σ_{i,i} = 1 for each i. For V = Σ − I, it holds that

𝔼_{X∼𝒩(0,Σ)}[H_{s_1}(X_1) ⋯ H_{s_k}(X_k)] = (s_1! ⋯ s_k!)/(d!·2^d) · [(tᵀVt)^d : t_1^{s_1} ⋯ t_k^{s_k}],

where s_1 + ⋯ + s_k = 2d, and [(tᵀVt)^d : t_1^{s_1} ⋯ t_k^{s_k}] denotes the coefficient of t_1^{s_1} ⋯ t_k^{s_k} in the polynomial (tᵀVt)^d. Also, the above expectation is zero when s_1 + ⋯ + s_k is odd.

Proof.

Recall that the moment generating function of a multivariate Gaussian distribution is given by

𝔼_{X∼𝒩(0,Σ)}[exp(t_1X_1 + ⋯ + t_kX_k)] = exp(½·tᵀΣt),

for each t ∈ ℝ^k. Multiplying the above by exp(−½·tᵀt), and plugging in the power series in Definition 17, we get for each t ∈ ℝ^k that

∑_{d=0}^∞ (1/(d!·2^d))·(tᵀVt)^d = exp(½·tᵀVt)
  = 𝔼_{X∼𝒩(0,Σ)}[exp((t_1X_1 − t_1²/2) + ⋯ + (t_kX_k − t_k²/2))]
  = 𝔼_{X∼𝒩(0,Σ)}[(∑_{s_1≥0} (1/s_1!)·H_{s_1}(X_1)·t_1^{s_1}) ⋯ (∑_{s_k≥0} (1/s_k!)·H_{s_k}(X_k)·t_k^{s_k})]
  = ∑_{s_1,…,s_k≥0} (t_1^{s_1} ⋯ t_k^{s_k})/(s_1! ⋯ s_k!) · 𝔼_{X∼𝒩(0,Σ)}[H_{s_1}(X_1) ⋯ H_{s_k}(X_k)].

Note that since the power series in Definition 17 is absolutely convergent, all the steps above interchanging limits and expectations are valid by the dominated convergence theorem. Finally, comparing coefficients, we have the desired result.
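The identity of Lemma 18 is easy to sanity-check numerically; the sketch below (illustrative only) compares a Monte Carlo estimate of 𝔼[H_2(X_1)H_2(X_2)] against the coefficient formula for a 2-dimensional Gaussian with correlation ρ, where the formula predicts the value 2ρ².

```python
import numpy as np

rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=2_000_000)

H2 = lambda x: x ** 2 - 1                 # probabilist's Hermite polynomial H_2
monte_carlo = np.mean(H2(X[:, 0]) * H2(X[:, 1]))

# Lemma 18 with k = 2, s = (2, 2), d = 2: (t^T V t)^2 = 4 rho^2 t1^2 t2^2,
# so the expectation should be (2! * 2!)/(2! * 2^2) * 4 rho^2 = 2 rho^2.
print(monte_carlo, 2 * rho ** 2)
```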

3 A Gaussian Variant

The first step towards proving Theorem 8 is to prove a Gaussian variant, stated below:

Proposition 19.

Let k ∈ ℕ, and let Σ ∈ ℝ^{k×k} be a symmetric positive semi-definite matrix such that:

  1.

    For each i ∈ [k], it holds that Σ_{i,i} = 1.

  2.

    The matrix V = Σ − I has no row/column that is all zeros.

Then, there exists a Lipschitz continuous function f: ℝ → [−1,1] such that:

𝔼_{X∼𝒩(0,1)}[f(X)] = 0,   and   |𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} f(X_i)]| > 0.

3.1 Symmetric Powers of Polynomials

Before we prove the above proposition, we first prove a lemma about (symmetrizations of) powers of multivariate polynomials. We show that if a polynomial q(x_1,…,x_k) depends on all the variables x_1,…,x_k, then the symmetrized power Sym(q^d), for some integer d ∈ ℕ (see Definition 12), contains a monomial divisible by x_1x_2⋯x_k.

Lemma 20.

Let k ∈ ℕ, and let q: ℝ^k → ℝ be a polynomial such that for each i ∈ [k], the polynomial ℓ_i = ∂_i q is not identically zero. Then, there exist some d ∈ ℕ, and positive integers s_1,…,s_k, such that the coefficient of x_1^{s_1} x_2^{s_2} ⋯ x_k^{s_k} in the polynomial Sym(q^d) is non-zero.

We start by proving the following lemma about derivatives of powers of q.

Lemma 21.

Let k ∈ ℕ, and let q: ℝ^k → ℝ be a polynomial. For each i ∈ [k], let ℓ_i = ∂_i q.

Then, for every s = (s_1,…,s_k) ∈ ℤ_{≥0}^k with |s| = ∑_{i∈[k]} s_i, there exist polynomials p_0,…,p_{|s|}, with p_{|s|} = ∏_{i∈[k]} ℓ_i^{s_i}, such that for each d ≥ |s|, it holds that

∂_1^{s_1} ∂_2^{s_2} ⋯ ∂_k^{s_k} (q^d) = q^{d−|s|} · (∑_{i=0}^{|s|} d^i·p_i).

Proof.

The proof is by induction on |s|. For the base case, if |s| = 0, we have s = (0,0,…,0), and p_0 = 1 satisfies the statement of the lemma.

For the inductive step, consider any s = (s_1,…,s_k) ∈ ℤ_{≥0}^k with |s| = ∑_{i∈[k]} s_i > 0. Without loss of generality, by symmetry, we can assume that s_1 > 0. By the inductive hypothesis applied to (s_1−1, s_2,…,s_k), we have the existence of polynomials p_0,…,p_{|s|−1}, with p_{|s|−1} = ℓ_1^{s_1−1}·∏_{i=2}^k ℓ_i^{s_i}, such that for each d ≥ |s|−1, we have

∂_1^{s_1−1} ∂_2^{s_2} ⋯ ∂_k^{s_k} (q^d) = q^{d−|s|+1} · (∑_{i=0}^{|s|−1} d^i·p_i).

Now, if d ≥ |s|, differentiating the above with respect to x_1, we get

∂_1^{s_1} ∂_2^{s_2} ⋯ ∂_k^{s_k} (q^d) = q^{d−|s|} · ((d−|s|+1)·ℓ_1·∑_{i=0}^{|s|−1} d^i·p_i) + q^{d−|s|+1} · (∑_{i=0}^{|s|−1} d^i·∂_1(p_i))
  = q^{d−|s|} · (∑_{i=1}^{|s|} d^i·ℓ_1·p_{i−1} + ∑_{i=0}^{|s|−1} d^i·((1−|s|)·ℓ_1·p_i + q·∂_1 p_i))
  = q^{d−|s|} · (∑_{i=0}^{|s|} d^i·p̃_i),

where the polynomials p̃_i do not depend on d, and are such that p̃_{|s|} = ℓ_1·p_{|s|−1} = ∏_{i∈[k]} ℓ_i^{s_i}, as desired.

With the above lemma in hand, next we shall consider the symmetrization operation applied to derivatives of powers of q.

Lemma 22.

Let k ∈ ℕ, and let q: ℝ^k → ℝ be a polynomial such that for each i ∈ [k], the polynomial ℓ_i = ∂_i q is not identically zero.

Then, for each large enough even integer d, the polynomial Sym(∂_1²∂_2²⋯∂_k²(q^d)) is not identically zero.

Proof.

By applying Lemma 21 with s = (2,2,…,2), we have the existence of polynomials p_0,…,p_{2k}, with p_{2k} = ∏_{i∈[k]} ℓ_i², such that for each d ≥ 2k, it holds that ∂_1²∂_2²⋯∂_k²(q^d) = q^{d−2k}·(∑_{i=0}^{2k} d^i·p_i).

By Lemma 16, let y ∈ ℝ^k be such that y (and its permutations) do not lie in the zero set of any of the polynomials ℓ_1,…,ℓ_k, q. We define

A = min_{π∈S_k} [∏_{i∈[k]} ℓ_i(y_π)²] > 0,   B = max_{0≤i≤2k−1, π∈S_k} |p_i(y_π)| ≥ 0.

Then, for any even integer d ≥ max{2k, 4kB/A}, it holds that

Sym(∂_1²∂_2²⋯∂_k²(q^d))(y) = ∑_{π∈S_k} q(y_π)^{d−2k}·(d^{2k}·∏_{i∈[k]} ℓ_i(y_π)² + ∑_{i=0}^{2k−1} d^i·p_i(y_π))
  ≥ ∑_{π∈S_k} q(y_π)^{d−2k}·(d^{2k}·A − ∑_{i=0}^{2k−1} d^i·B)
  ≥ (∑_{π∈S_k} q(y_π)^{d−2k})·(d^{2k}·A − 2k·d^{2k−1}·B)
  ≥ (∑_{π∈S_k} q(y_π)^{d−2k})·d^{2k}·A/2 > 0.

Hence, for even integers d ≥ max{2k, 4kB/A}, the polynomial Sym(∂_1²∂_2²⋯∂_k²(q^d)) is not identically zero.

Finally, we prove the main lemma of this section.

Proof of Lemma 20.

Let k ∈ ℕ, and let q: ℝ^k → ℝ be a polynomial such that for each i ∈ [k], the polynomial ℓ_i = ∂_i q is not identically zero. It suffices to prove that for some d ∈ ℕ, the polynomial ∂_1²∂_2²⋯∂_k²(Sym(q^d)) is not identically zero, since then the coefficient of some monomial divisible by x_1²x_2²⋯x_k² is non-zero.

For each polynomial p: ℝ^k → ℝ, and each π ∈ S_k, we shall use p_π to denote the polynomial given by p_π(x) = p(x_π). Then, for all s_1,…,s_k ≥ 0, we have that ∂_1^{s_1}∂_2^{s_2}⋯∂_k^{s_k}(p_π) = (∂_{π^{-1}(1)}^{s_1}∂_{π^{-1}(2)}^{s_2}⋯∂_{π^{-1}(k)}^{s_k}(p))_π.

By the above, we have that for each d ∈ ℕ,

∂_1²∂_2²⋯∂_k²(Sym(q^d)) = ∂_1²∂_2²⋯∂_k²(∑_{π∈S_k} (q_π)^d)
  = ∑_{π∈S_k} ∂_1²∂_2²⋯∂_k²(q_π^d)
  = ∑_{π∈S_k} (∂_{π^{-1}(1)}²∂_{π^{-1}(2)}²⋯∂_{π^{-1}(k)}²(q^d))_π
  = ∑_{π∈S_k} (∂_1²∂_2²⋯∂_k²(q^d))_π
  = Sym(∂_1²∂_2²⋯∂_k²(q^d)).

Now, the result follows from Lemma 22.

3.2 Proving the Gaussian Variant

We start by proving a slight variant of Proposition 19, where we allow f to be an arbitrary (possibly unbounded) polynomial.

Lemma 23.

Let k ∈ ℕ, and let Σ ∈ ℝ^{k×k} be a symmetric positive semi-definite matrix such that:

  1.

    For each i ∈ [k], it holds that Σ_{i,i} = 1.

  2.

    The matrix V = Σ − I has no row/column that is all zeros.

Then, there exists a polynomial f: ℝ → ℝ such that 𝔼_{X∼𝒩(0,1)}[f(X)] = 0, and

|𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} f(X_i)]| > 0.

Proof.

For s = (s_1,…,s_k) ∈ ℕ^k and α = (α_1,…,α_k) ∈ ℝ^k, let f_{s,α}: ℝ → ℝ be the polynomial defined by f_{s,α}(x) = α_1·H_{s_1}(x) + ⋯ + α_k·H_{s_k}(x), where the polynomials H_{s_i} are Hermite polynomials (see Definition 17). Observe that since s_1,…,s_k ≥ 1, we have 𝔼_{X∼𝒩(0,1)}[f_{s,α}(X)] = 0; this follows from the orthogonality of the Hermite polynomials.

Suppose, for the sake of contradiction, that for every s ∈ ℕ^k, α ∈ ℝ^k, it holds that

𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} f_{s,α}(X_i)] = 𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} ∑_{j∈[k]} α_j·H_{s_j}(X_i)] = 0.

Observe that for every s ∈ ℕ^k, the above expression can be written as a multivariate polynomial in α_1,…,α_k. If this polynomial vanishes for all α ∈ ℝ^k, the coefficient of α_1α_2⋯α_k must be zero; that is,

∑_{π∈S_k} 𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} H_{s_{π(i)}}(X_i)] = 0.

Now, applying Lemma 18, we get that for each d ∈ ℕ, and each s_1,…,s_k ≥ 1 with s_1 + ⋯ + s_k = 2d,

∑_{π∈S_k} [(tᵀVt)^d : t_1^{s_{π(1)}} ⋯ t_k^{s_{π(k)}}] = ∑_{π∈S_k} [(t_πᵀVt_π)^d : t_1^{s_1} ⋯ t_k^{s_k}]
  = [Sym((tᵀVt)^d) : t_1^{s_1} ⋯ t_k^{s_k}] = 0.

Note that the assumption that V has no zero row/column implies that for every i ∈ [k], the polynomial ∂_i(tᵀVt) is not identically zero. By Lemma 20, this is a contradiction.

With the above, we now prove Proposition 19 via a standard truncation argument.

Proof of Proposition 19.

By Lemma 23, we know that there exists a polynomial f: ℝ → ℝ such that 𝔼_{X∼𝒩(0,1)}[f(X)] = 0, and |𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} f(X_i)]| > 0.

For each integer M ∈ ℕ, we define the truncated function f_M: ℝ → [−M, M] by

f_M(x) = f(x)·𝟙_{|f(x)|≤M} + M·𝟙_{f(x)>M} − M·𝟙_{f(x)<−M}.

Also, let g_M: ℝ → [−2M, 2M] be given by g_M(x) = f_M(x) − 𝔼_{X∼𝒩(0,1)}[f_M(X)]. Observe that

  1.

    For every M, it holds that 𝔼_{X∼𝒩(0,1)}[g_M(X)] = 0.

  2.

    For every M, the function g_M is bounded and Lipschitz continuous.

  3.

    For every x ∈ ℝ, f_M(x) → f(x) as M → ∞. Further, since |f_M(x)| ≤ |f(x)| for each x, M, by the dominated convergence theorem, we have 𝔼_{X∼𝒩(0,1)}[f_M(X)] → 𝔼_{X∼𝒩(0,1)}[f(X)] = 0 as M → ∞. This implies that for each x ∈ ℝ, g_M(x) → f(x) as M → ∞.

    Also, for each x, M, we have |g_M(x)| ≤ |f(x)| + 𝔼_{X∼𝒩(0,1)}[|f(X)|]. Hence, by the dominated convergence theorem, we have that 𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} g_M(X_i)] → 𝔼_{X∼𝒩(0,Σ)}[∏_{i∈[k]} f(X_i)] ≠ 0 as M → ∞.

By the above, for some large enough M, the function (1/2M)·g_M: ℝ → [−1,1] satisfies the desired properties.
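As a concrete illustration of Proposition 19 (a numerical sketch of ours; the choice k = 3, ρ = 0.5 and the test function H_2 are ours), take Σ with unit diagonal and all off-diagonal entries equal to ρ, so that no coordinate is independent of the rest; then f(x) = x² − 1 has zero Gaussian mean, while the product expectation equals 8ρ³ ≠ 0 (by Lemma 18, or Isserlis' theorem).

```python
import numpy as np

k, rho = 3, 0.5
Sigma = (1 - rho) * np.eye(k) + rho * np.ones((k, k))   # unit variances, all pairwise correlations rho

rng = np.random.default_rng(1)
X = rng.multivariate_normal(mean=np.zeros(k), cov=Sigma, size=2_000_000)

f = lambda x: x ** 2 - 1    # Hermite H_2: mean zero under N(0, 1)
print(np.mean(f(X)))                      # ~0: f has zero expectation coordinate-wise
print(np.mean(np.prod(f(X), axis=1)))     # ~8 * rho^3 = 1.0: the product expectation is far from 0
```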

4 Linearity Testing Requires Pairwise Independence

In this section, we prove Theorem 8, which is restated below.

Theorem 24.

Let k ∈ ℕ, p ∈ (0,1), and let ν ∈ 𝒟(p,k) be a distribution having no pairwise independent coordinate (see Definition 1). Then, there exists a constant α > 0, such that for every large enough n, there exists a function f: {0,1}^n → [−1,1] such that

  1.

    |𝔼_{X∼ν^n}[∏_{i=1}^k f(X_i)]| ≥ α.

  2.

    For every S ⊆ [n], it holds that |𝔼_{X∼μ_p^n}[f(X)χ_S(X)]| ≤ o_n(1).

Moreover, if the distribution ν is such that η := max_{i,j∈[k], i≠j} Pr_{X∼ν}[X_i = X_j] < 1 (that is, no two coordinates are almost surely equal), the above holds for a function f with range {−1,1}.

The remainder of this section is devoted to the proof of Theorem 24. In Section 4.1, we prove the first part of the theorem, dealing with functions with range [−1,1]. Then, in Section 4.2, we show how to round to functions with range {−1,1}.

4.1 Function with Range [−1,1]

Let k ∈ ℕ, p ∈ (0,1), and let ν ∈ 𝒟(p,k) be a distribution having no pairwise independent coordinate. Let Σ ∈ ℝ^{k×k} be the (normalized) covariance matrix corresponding to the distribution ν, given by Σ_{i,j} = 𝔼_{X∼ν}[(X_i − p)(X_j − p)/(p − p²)]. Observe that the matrix Σ satisfies the conditions of Proposition 19, and hence there exists a function h: ℝ → [−1,1] such that

  1.

    𝔼_{Z∼𝒩(0,1)}[h(Z)] = 0.

  2.

    The function H: ℝ^k → [−1,1] given by H(x) = ∏_{i∈[k]} h(x_i) is such that

    α := ½·|𝔼_{Z∼𝒩(0,Σ)}[H(Z)]| > 0.

  3.

    The function h is K-Lipschitz for some K > 0; in particular, both h and H are bounded continuous functions.

Consider any large n. We define f: {0,1}^n → [−1,1] by

f(x) = h( (1/√n)·∑_{j=1}^n (x^{(j)} − p)/√(p − p²) ).

The function f satisfies the two properties in the theorem statement, as follows:

  • Let X ∼ ν^n, and let Y = (Y_1,…,Y_k) be an ℝ^k-valued random vector, defined as Y_i = (1/√n)·∑_{j=1}^n (X_i^{(j)} − p)/√(p − p²).

    Let F: {0,1}^{kn} → [−1,1] be given by F(x) = ∏_{i∈[k]} f(x_i). Since H is continuous and bounded, we have by the Multivariate CLT (Theorem 15) that

    |𝔼[F(X)] − 𝔼_{Z∼𝒩(0,Σ)}[H(Z)]| = |𝔼[H(Y)] − 𝔼_{Z∼𝒩(0,Σ)}[H(Z)]| ≤ o_n(1).

    Hence, for large n, we get |𝔼_{X∼ν^n}[∏_{i=1}^k f(X_i)]| ≥ 2α − o_n(1) ≥ α, as desired.

  • Consider any subset S ⊆ [n], and let T ⊆ S be any subset of size |T| = min{n^{1/4}, |S|}. Let f̃: {0,1}^n → [−1,1] be defined by f̃(x) = h( (1/√(n−|T|))·∑_{j∈[n]∖T} (x^{(j)} − p)/√(p − p²) ); note that this function only depends on the coordinates of x outside the set T. Further, for each x ∈ {0,1}^n, by the Lipschitz bound on h, we get

    |f(x) − f̃(x)| ≤ K·| (1/√n)·∑_{j=1}^n (x^{(j)} − p)/√(p − p²) − (1/√(n−|T|))·∑_{j∈[n]∖T} (x^{(j)} − p)/√(p − p²) |
      ≤ (K/√(p − p²))·( |T|/√n + (n − |T|)·|1/√(n−|T|) − 1/√n| )
      ≤ (K/√(p − p²))·( |T|/√n + (n − |T|)·|T|/(n·√n) )
      ≤ (K/√(p − p²))·2|T|/√n = o_n(1),

    where we used that (1−t)^{−1/2} ≤ 1 + t for each t ∈ [0, 1/2].

    Now, for X ∼ μ_p^n, we have

    |𝔼_X[f(X)χ_S(X)]| ≤ |𝔼_X[f̃(X)χ_S(X)]| + o_n(1)
      = |𝔼_X[f̃(X)χ_{S∖T}(X)]·𝔼_X[χ_T(X)]| + o_n(1)
      = |𝔼_X[f̃(X)χ_{S∖T}(X)]|·|1 − 2p|^{|T|} + o_n(1).

    If |S| ≥ n^{1/4}, then |1 − 2p|^{|T|} = o_n(1). Otherwise, we have that S = T, and by the Central Limit Theorem (see Theorem 15), the first term in the above product equals

    |𝔼_X[f̃(X)]| = |𝔼_X[f̃(X)] − 𝔼_{Z∼𝒩(0,1)}[h(Z)]| = o_n(1).

4.2 Rounding to a Function with Range {−1,1}

Now, we shall prove the second part of Theorem 24.

Let k ∈ ℕ, p ∈ (0,1), and let ν ∈ 𝒟(p,k) be a distribution having no pairwise independent coordinate. Further, suppose that the distribution ν is such that

η := max_{i,j∈[k], i≠j} Pr_{X∼ν}[X_i = X_j] < 1.

Let α > 0 be as obtained in Section 4.1. Consider any large n, and let f: {0,1}^n → [−1,1] be the function obtained in Section 4.1.

Let g: {0,1}^n → {−1,1} be a random function, defined by setting g(x) = 1 with probability (1+f(x))/2, and g(x) = −1 with probability (1−f(x))/2, independently for each x ∈ {0,1}^n. Observe that this satisfies 𝔼_g[g(x)] = f(x) for each x ∈ {0,1}^n. We will show that the function g satisfies the two desired properties with probability 1 − o_n(1), and hence, by the probabilistic method, this guarantees the existence of a non-random g as desired. This is done as follows:

  1.

    Let F, G: {0,1}^{kn} → [−1,1] be defined as F(x) = ∏_{i∈[k]} f(x_i) and G(x) = ∏_{i∈[k]} g(x_i). Let X, Y ∼ ν^n be independent (of each other and of g), and let E be the event that X_1,…,X_k, Y_1,…,Y_k are all distinct. Then, by a union bound, we have that Pr[Ē] ≤ k(k−1)·η^n + k²·(p² + (1−p)²)^n = o_n(1), and hence

    |𝔼_g 𝔼_{X∼ν^n}[G(X)] − 𝔼_{X∼ν^n}[F(X)]| ≤ Pr[Ē] + |𝔼_g 𝔼_{X,Y}[G(X)·𝟙_E] − 𝔼_X[F(X)]|
      ≤ Pr[Ē] + |𝔼_{X,Y}[F(X)·𝟙_E] − 𝔼_X[F(X)]|
      ≤ 2·Pr[Ē] = o_n(1).

    Similarly, we have

    |𝔼_g[(𝔼_X[G(X)])²] − (𝔼_X[F(X)])²| = |𝔼_g 𝔼_{X,Y}[G(X)G(Y)] − 𝔼_{X,Y}[F(X)F(Y)]|
      ≤ 2·Pr[Ē] = o_n(1).

    Letting β = |𝔼_X[F(X)]| ≥ α, we get Var_g[𝔼_X[G(X)]] ≤ β² + o_n(1) − (β − o_n(1))² = o_n(1). Hence, by Chebyshev’s inequality (Fact 13), we have |𝔼_X[G(X)]| ≥ α/2 with probability 1 − o_n(1).

  2.

    Fix S ⊆ [n]. Let X ∼ μ_p^n, and let W = 𝔼_X[χ_S(X)g(X)] = ∑_{x∈{0,1}^n} Pr[X = x]·χ_S(x)·g(x). Observe that W is a sum of 2^n independent and bounded random variables (over the randomness of g), with 𝔼_g[W] = 𝔼_X[χ_S(X)f(X)]. For q = max{p, 1−p} < 1, it holds that ∑_x (2·Pr[X = x])² ≤ 4·q^n·∑_x Pr[X = x] = 4·q^n, and by Hoeffding’s inequality (Fact 14), we have for each t > 0 that

    Pr[|W − 𝔼[W]| ≥ t] ≤ 2·exp(−2t²/(4·q^n)).

    Let t = q^{n/4}. Then, with probability at least 1 − o_n(2^{−n}), it holds that |W| = |𝔼_X[χ_S(X)g(X)]| ≤ |𝔼_X[χ_S(X)f(X)]| + q^{n/4} = o_n(1).

    Now, a union bound over S ⊆ [n] shows that, with probability 1 − o_n(1), the above holds for every S ⊆ [n]. ∎

5 Queries vs. Bias Tradeoff

In this section, we analyze the relation between p (the bias) and k (the number of queries) for the existence of a distribution ν ∈ 𝒟(p,k) with some pairwise independent coordinate, and with full even-weight support (see Definition 1).

5.1 Query Lower Bound

We prove a lower bound on k in terms of p, as follows:

Proposition 25.

Let k ∈ ℕ, p ∈ (0,1), and let ν ∈ 𝒟(p,k) be a distribution that has some pairwise independent coordinate. Then, it holds that k ≥ 3 and 1/(k−1) ≤ p ≤ 1 − 1/(k−1).

Proof.

Let X ∼ ν, and let i ∈ [k] be a pairwise independent coordinate under ν.

For Z = ∑_{j≠i} X_j, we have, by linearity of expectation, that 𝔼[X_i·Z] = (k−1)p². On the other hand, observe that if X_i = 1, then Z = 1 (mod 2), and so Z ≥ 1. Hence,

p = 𝔼[X_i·1] ≤ 𝔼[X_i·Z] = (k−1)p²,

and we have (k−1)p ≥ 1; in particular, this shows k ≥ 3.

For the upper bound on p, we consider the following cases:

  • k is odd: In this case, if X_i = 1, then Z = 1 (mod 2), and so Z ≤ k−2. Hence,

    (k−1)p² = 𝔼[X_i·Z] ≤ 𝔼[X_i·(k−2)] = p(k−2),

    and we have (k−1)p ≤ (k−2), as desired.

  • k is even: In this case, observe that the distribution of the random variable (1−X_1,…,1−X_k) also satisfies the hypothesis of the proposition, with p replaced by 1−p. Hence, the above proof gives us (k−1)(1−p) ≥ 1, as desired.

 Remark 26.

The proof of Proposition 25 also shows that for k > 3 and p ∈ {1/(k−1), 1 − 1/(k−1)}, any distribution satisfying the assumptions of Proposition 25 cannot have full even-weight support. This is because if p ∈ {1/(k−1), 1 − 1/(k−1)}, then in all cases in the above proof, the random variable Z must be constant under some value of X_i (either X_i = 0 or X_i = 1); this cannot be the case for a distribution with full even-weight support when k > 3.

5.2 Query Upper Bound

In this subsection, we shall prove the following proposition.

Proposition 27.

Let k ≥ 3 be a positive integer, and let p ∈ [1/(k−1), 1 − 1/(k−1)] (note that this interval is non-empty for k ≥ 3).

Then, there exists a permutation-invariant (we say that a distribution ν over {0,1}^k is permutation-invariant if, for X = (X_1,…,X_k) ∼ ν and any permutation π: [k] → [k], the distribution of (X_{π(1)},…,X_{π(k)}) is the same as ν) and pairwise independent distribution ν(k,p) ∈ 𝒟(p,k) (see Definition 1). Furthermore, if k = 3 or if p ∉ {1/(k−1), 1 − 1/(k−1)}, then there exists such a distribution with full even-weight support.

The proof involves various cases, considered below in Lemma 28 and Lemma 29.

Lemma 28.

Let k ≥ 4 be a positive integer, and let p ∈ [1/(k−1), 2/(k−1)) ∪ (1 − 2/(k−1), 1 − 1/(k−1)] (note that this set is contained in [1/(k−1), 1 − 1/(k−1)] for k ≥ 4). Then, there exists a pairwise independent distribution ν(k,p) ∈ 𝒟(p,k).

Moreover, if p ∉ {1/(k−1), 1 − 1/(k−1)}, then there exists such a distribution with full even-weight support.

Proof.

Let k ≥ 4 be a positive integer, and let p ∈ [1/(k−1), 2/(k−1)) ∪ (1 − 2/(k−1), 1 − 1/(k−1)]. Let s = ⌊k/2⌋; we shall exhibit a vector q = (q_0, q_1,…,q_s) ∈ [0,1]^{s+1} satisfying:

∑_{i=0}^s C(k, 2i)·q_i = 1,   ∑_{i=1}^s C(k−1, 2i−1)·q_i = p,   ∑_{i=1}^s C(k−2, 2i−2)·q_i = p²,

where C(a,b) denotes the binomial coefficient. The distribution ν(k,p) is then defined by assigning probability q_{|x|/2} to each point x ∈ {0,1}^k with |x| = 0 (mod 2), and probability 0 to each x with |x| = 1 (mod 2), where |x| = ∑_{i=1}^k x_i. Note that the above properties correspond to ν(k,p) being a valid probability distribution supported on even-Hamming-weight vectors, having marginals μ_p, and having pairwise independent coordinates. Furthermore, the distribution ν(k,p) has full even-weight support if and only if each q_i ∈ (0,1).

The vector q is defined as follows in the different cases (for brevity, we omit the verification of the above properties):

  1.

    k ≥ 5 is odd, p ∈ [1/(k−1), 2/(k−1)): Let q_0 = 1 + kp²/2 − k²p/(2(k−1)), q_1 = ((k−2)p − (k−1)p²)/((k−1)(k−3)), q_{(k−1)/2} = ((k−1)p² − p)/((k−1)(k−3)), and q_i = 0 otherwise.

  2.

    k ≥ 5 is odd, 1−p ∈ [1/(k−1), 2/(k−1)): Let q_0 = 1 + kp²/(k−3) − k(2k−5)p/((k−1)(k−3)), q_{(k−3)/2} = (3(k−2)p − 3(k−1)p²)/((k−1)(k−2)(k−3)), q_{(k−1)/2} = ((k−1)p² − (k−4)p)/(2(k−1)), and q_i = 0 otherwise.

  3.

    k ≥ 4 is even, p ∈ [1/(k−1), 2/(k−1)): Let q_0 = ((k−1)p² − (k+1)p + 2)/2, q_1 = (p − p²)/(k−2), q_{k/2} = ((k−1)p² − p)/(k−2), and q_i = 0 otherwise.

  4.

    k ≥ 4 is even, 1−p ∈ [1/(k−1), 2/(k−1)): In this case, we define ν(k,p) to be the distribution obtained by flipping each coordinate of ν(k, 1−p).

Next, we show that if p ∉ {1/(k−1), 1 − 1/(k−1)}, then such a distribution ν(k,p) with full even-weight support exists. We only need to do this for the first three cases, as the procedure described in the fourth case preserves the property of full even-weight support.

The same argument applies in all cases, and we present it for the first case, that is, when k ≥ 5 is odd and p ∈ (1/(k−1), 2/(k−1)). We observe that if p ≠ 1/(k−1), each of the probabilities q_0, q_1, q_{(k−1)/2} above lies in the interval (0,1). Now, consider the equations

∑_{i=0}^s C(k, 2i)·q̃_i = 0,   ∑_{i=1}^s C(k−1, 2i−1)·q̃_i = 0,   ∑_{i=1}^s C(k−2, 2i−2)·q̃_i = 0.

In these equations, the variables q̃_0, q̃_1, q̃_{(k−1)/2} are linearly independent, and hence there exists a vector q̃ ∈ ℝ^{s+1} satisfying these equations which has all coordinates equal to 1, other than possibly q̃_0, q̃_1, q̃_{(k−1)/2}. Then, for some small δ > 0, the vector q + δq̃ has all coordinates in (0,1), and satisfies the required properties.
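The formulas above are straightforward to verify mechanically; the snippet below (our own sanity check, for the illustrative choice k = 4, p = 0.4, using case 3 as reconstructed above) confirms that the resulting q defines a distribution in 𝒟(p,k) with pairwise independent coordinates.

```python
from itertools import product
from math import isclose

def nu_case3(k, p):
    """Lemma 28, case 3 (k even, p in [1/(k-1), 2/(k-1))): support on weights 0, 2, and k."""
    q0 = ((k - 1) * p * p - (k + 1) * p + 2) / 2
    q1 = (p - p * p) / (k - 2)
    qhalf = ((k - 1) * p * p - p) / (k - 2)
    weight_prob = {0: q0, 2: q1, k: qhalf}
    return {x: weight_prob.get(sum(x), 0.0) for x in product((0, 1), repeat=k)}

k, p = 4, 0.4
nu = nu_case3(k, p)
assert isclose(sum(nu.values()), 1.0) and all(q >= 0 for q in nu.values())
for i in range(k):
    assert isclose(sum(q * x[i] for x, q in nu.items()), p)                   # marginal is mu_p
    for j in range(i + 1, k):
        assert isclose(sum(q * x[i] * x[j] for x, q in nu.items()), p * p)    # pairwise independence
print("nu(4, 0.4) is a valid pairwise independent distribution in D(p, k)")
```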

Lemma 29.

Let k ≥ 6 be a positive integer, and let p ∈ [2/(k−1), 1 − 2/(k−1)] ∖ {1/2} (note that this interval is non-empty for k ≥ 6). Then, there exists a pairwise independent distribution ν(k,p) ∈ 𝒟(p,k) with full even-weight support.

Proof.

Let k ≥ 6 be a positive integer, and let p ∈ [2/(k−1), 1 − 2/(k−1)], p ≠ 1/2. That is, for q = min{p, 1−p} < 1/2, we have k ≥ 1 + 2/q. Let ℓ be the smallest odd integer satisfying ℓ > 1 + 1/q > 3. Note that this ℓ satisfies 4 ≤ ℓ ≤ 3 + 1/q < 1 + 2/q ≤ k, and we have q ∈ (1/(ℓ−1), 2/(ℓ−1)).

By Lemma 28, there exist pairwise independent distributions ν(ℓ, p) and ν(ℓ, 1−p), with full even-weight support. Let ν̃_0 = ν(ℓ, p), and let ν̃_1 be the distribution obtained by flipping each coordinate of ν(ℓ, 1−p). Since ℓ is odd, for each b ∈ {0,1}, it holds that ν̃_b has pairwise independent coordinates, each with marginal μ_p, and such that supp(ν̃_b) = {x ∈ {0,1}^ℓ : ∑_{i=1}^ℓ x_i = b (mod 2)}. Finally, we define X ∼ ν(k,p) via the following random process: Sample (X_{ℓ+1},…,X_k) ∼ μ_p^{k−ℓ}, and with Z = ∑_{i=ℓ+1}^k X_i (mod 2), sample (X_1,…,X_ℓ) ∼ ν̃_Z. It is an easy check that this distribution satisfies the required properties.

Finally, we prove Proposition 27.

Proof of Proposition 27.

Note that it suffices to find such a distribution that is not necessarily permutation-invariant, since averaging the distribution over all permutations preserves pairwise independence and full even-weight support.

If p = 1/2, for any k ≥ 3, we let ν(k,p) be the uniform distribution on the set {x ∈ {0,1}^k : ∑_{i=1}^k x_i = 0 (mod 2)}.

Now, for k = 3, it must hold that p = 1/2, in which case ν(k,p) is as above. For k = 4 or k = 5, and p ≠ 1/2, it must hold that p ∈ [1/(k−1), 2/(k−1)) ∪ (1 − 2/(k−1), 1 − 1/(k−1)], and the result follows from Lemma 28. For k ≥ 6 and p ≠ 1/2, the result follows from Lemma 28 and Lemma 29.

6 Putting Everything Together

We are now ready to prove our main result.

Proof of Theorem 5.

Let p ∈ (0,1).

  1. 1.

    Consider any positive integer k > 1 + 1/min{p, 1−p} ≥ 3 (or k = 3 with p = 1/2). By Proposition 27, there exists a pairwise independent distribution ν ∈ 𝒟(p,k) with full even-weight support. The result now follows by Theorem 33.

  2. 2.

    Suppose that k ≥ 3 with p = 1/(k−1), or k ≥ 4 is even with p = 1 − 1/(k−1). In these cases, we observe that the distribution ν ∈ 𝒟(p,k) constructed in Lemma 28 is pairwise independent, and contains BLR (see Definition 32):

    1. (a)

      If k ≥ 3, p = 1/(k−1), the distribution ν contains all vectors in {0,1}^k of Hamming weights 0 and 2 in its support. In this case, Definition 32 is satisfied with b̃ = 0 and z̃ the all-zeros vector.

    2. (b)

      If k ≥ 4 is even, and p = 1 − 1/(k−1), the distribution ν contains all vectors in {0,1}^k of Hamming weights k−2 and k in its support. In this case, Definition 32 is satisfied with b̃ = 1 and z̃ the all-ones vector.

    The result now follows by Theorem 33.

  3. 3.

    Suppose that k < 1 + 1/min{p, 1−p} is a positive integer, and let ν ∈ 𝒟(p,k). We perform the following operation on the distribution ν: if i, j ∈ [k], i ≠ j are such that Pr_{X∼ν}[X_i = X_j] = 1, we remove coordinates i, j from ν, and repeat until no such pair remains.

    Finally, we are left with a distribution ν̃ on k̃ ≤ k coordinates. We consider the following two cases:

    1. (a)

      Suppose that k̃ = 0. In this case, for every n, and every f: {0,1}^n → {−1,1}, it holds that 𝔼_{X∼ν^n}[∏_{i=1}^k f(X_i)] = 1, since the k terms in the product cancel out in pairs. Hence, it suffices to show the existence of a function f: {0,1}^n → {−1,1} satisfying |𝔼_{X∼μ_p^n}[f(X)χ_S(X)]| ≤ o_n(1) for every S ⊆ [n]. Note that a (uniformly) random function f: {0,1}^n → {−1,1} satisfies this with high probability, by an argument similar to the one at the end of Section 4.2 (a random function can be thought of as rounding the constant zero function as in Section 4.2).

    2. (b)

      Now, suppose that k̃ ≠ 0. Then, it holds that ν̃ ∈ 𝒟(p, k̃), and by Proposition 25, we have that ν̃ has no pairwise independent coordinate. Now, by Theorem 24, there exists a constant α > 0, such that for every large n, there exists a function f: {0,1}^n → {−1,1} such that

      |𝔼_{(X_1,…,X_k)∼ν^n}[∏_{i∈[k]} f(X_i)]| = |𝔼_{(X_1,…,X_{k̃})∼ν̃^n}[∏_{i∈[k̃]} f(X_i)]| ≥ α,

      and such that |𝔼_{X∼μ_p^n}[f(X)χ_S(X)]| ≤ o_n(1) for every S ⊆ [n].

6.1 A Corner Case

In the above proof, we left out the case of odd k ≥ 5 and p = 1 − 1/(k−1). This turns out to be very interesting, and we discuss it next. For the remainder of this section, we fix such a k and p.

In this case, the pairwise independent distribution ν ∈ 𝒟(p,k) constructed in Lemma 28 is supported on vectors of Hamming weights 0 and k−1 (and does not contain BLR as in Definition 32). In particular, for every x ∈ supp(ν), it holds that ∑_{i=1}^k x_i = 0 (mod k−1). For this reason, as we show next, the best we can expect from the test Lin(ν) is a guarantee of correlation with a character over ℤ/(k−1)ℤ, and this is indeed true.

Definition 30 (Characters over /(k1)).

Let ω ∈ ℂ be a primitive (k−1)-th root of unity. For every 0 ≤ r ≤ k−2, we define the function ϕ_r: {0,1} → ℂ as ϕ_r(x) = ω^{rx}.

For every n ∈ ℕ, and all integers 0 ≤ r^{(1)},…,r^{(n)} ≤ k−2, we define the product character ϕ_{r^{(1)},…,r^{(n)}}: {0,1}^n → ℂ by ϕ_{r^{(1)},…,r^{(n)}}(x) = ∏_{j=1}^n ϕ_{r^{(j)}}(x^{(j)}) = ω^{∑_{j=1}^n r^{(j)} x^{(j)}}.

Now, consider the test Lin(ν). Observe that any character f = ϕ_{r^{(1)},…,r^{(n)}} passes this test with probability 1:

𝔼_{X∼ν^n}[∏_{i∈[k]} f(X_i)] = ∏_{j=1}^n 𝔼_{Y∼ν}[∏_{i∈[k]} ϕ_{r^{(j)}}(Y_i)] = ∏_{j=1}^n 𝔼_{Y∼ν}[ω^{r^{(j)}·(∑_{i∈[k]} Y_i)}] = 1.

Next, we claim that characters explain the success of Lin(ν) for any function f:

Theorem 31.

For every constant ϵ>0, there exists a constant δ>0 such that for every large enough n, the following is true:

Let f: {0,1}^n → [−1,1] be a function such that |𝔼_{X∼ν^n}[∏_{i=1}^k f(X_i)]| ≥ ϵ. Then, there exist integers 0 ≤ r^{(1)},…,r^{(n)} ≤ k−2, such that

|𝔼_{X∼μ_p^n}[f(X)·ϕ_{r^{(1)},…,r^{(n)}}(X)]| ≥ δ.

Proof.

The result follows from the work of Bhangale, Khot, Liu and Minzer [7, 8], and we omit the details. Very roughly speaking, the proof follows a strategy similar to that of Section 7: first, show that f has good correlation with a character under random restrictions; then, use this to show that f has good correlation with a character times a low-degree function; finally, use the fact that ν is pairwise independent to get rid of the low-degree function.

Finally, we present an alternative solution to deal with this corner case of odd k5 and p=11k1. Instead of the test Lin(ν), we can perform the following test:

Let f: {0,1}^n → [−1,1], and let ν ∈ 𝒟(1−p, k) = 𝒟(1/(k−1), k) be the pairwise independent distribution from Lemma 28.

  1.

    Sample X = (X_1,…,X_k) ∼ ν^n.

  2.

    Let X̄ be the vector obtained by negating each of the kn coordinates of X.

  3.

    Query f on X̄_1,…,X̄_k, and accept if and only if ∏_{i∈[k]} f(X̄_i) = 1.

Each query X̄_i of the above test is distributed according to μ_p^n, and the analysis of the test follows directly from the analysis of Lin(ν) in Theorem 5. The drawback here, though, is that the test does not accept all linear functions with probability 1, but only functions of the form (−1)^{|S|}·χ_S, for S ⊆ [n].
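Below is a toy sketch (ours; the choice k = 5, so p = 3/4, and the probabilities q_0 = 3/8, q_1 = 1/16 for ν(5, 1/4) are computed from Lemma 28, case 1) of the alternative test above: it shows that a character χ_S with |S| odd is not always accepted, while (−1)^{|S|}·χ_S passes with probability 1.

```python
from itertools import product
import random

k = 5                                 # the alternative test targets bias p = 1 - 1/(k-1) = 3/4
q0, q1 = 3 / 8, 1 / 16                # nu(5, 1/4) from Lemma 28, case 1: weights 0 and 2
nu = {x: (q0 if sum(x) == 0 else q1 if sum(x) == 2 else 0.0)
      for x in product((0, 1), repeat=k)}

def negated_test(f, n, trials=2000):
    support, probs = zip(*nu.items())
    accept = 0
    for _ in range(trials):
        cols = random.choices(support, weights=probs, k=n)
        queries = [tuple(1 - col[i] for col in cols) for i in range(k)]   # negate every bit
        prod = 1
        for x in queries:
            prod *= f(x)
        accept += (prod == 1)
    return accept / trials

S = (0, 1, 2)                                        # a set of odd size
chi_S = lambda x: (-1) ** sum(x[i] for i in S)
print(negated_test(chi_S, 10))                       # 0.0: plain chi_S (|S| odd) is never accepted
print(negated_test(lambda x: -chi_S(x), 10))         # 1.0: (-1)^{|S|} chi_S passes with probability 1
```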

7 Analysis of the Linearity Test

In this section, we shall state and prove a generalized version of Theorem 7. The proof follows the work of Bhangale, Khot and Minzer [11], and hence we only give a rough outline (skipping many of the technical points), pointing out the places where the proof differs from the above work. We start with the following definition:

Definition 32.

Let k ≥ 3, p ∈ (0,1), and let ν ∈ 𝒟(p,k) be a distribution. We say that ν contains BLR if there exist some b̃ ∈ {0,1}, z̃ ∈ {0,1}^{k−3}, such that

{(x_1, x_2, x_1⊕x_2⊕b̃, z̃) : x_1, x_2 ∈ {0,1}} ⊆ supp(ν) ⊆ {0,1}^k.

Furthermore, for technical reasons, we shall also require that

span_{𝔽_2}(supp(ν)) = {x ∈ {0,1}^k : ∑_{i=1}^k x_i = 0 (mod 2)}.

Observe that any ν with full even-weight support contains BLR (with b~=0, and z~ the all-zeros vector). With this, we state the following generalization of Theorem 7:

Theorem 33.

Let k ≥ 3 be a positive integer, let p ∈ (0,1), ϵ ∈ (0,1] be constants, and let ν ∈ 𝒟(p,k) be a distribution containing BLR (see Definition 32). Then, there exist constants δ > 0, d ∈ ℕ (possibly depending on k, p, ϵ, ν), such that for every large enough n, the following is true:

Let f: {0,1}^n → [−1,1] be a function such that

|𝔼_{(X_1,…,X_k)∼ν^n}[∏_{i=1}^k f(X_i)]| ≥ ϵ.

Then, there exists a set S ⊆ [n], and a polynomial g: {0,1}^n → ℝ of degree at most d and with 2-norm 𝔼_{X∼μ_p^n}[g(X)²] ≤ 1, such that

|𝔼_{X∼μ_p^n}[f(X)χ_S(X)g(X)]| ≥ δ.

Moreover, if the distribution ν has some pairwise independent coordinate, then we may assume g ≡ 1; that is, f correlates with a linear function χ_S.

The remainder of this section is devoted to the proof of the above theorem. Let k ≥ 3 be an integer, let p ∈ (0,1), ϵ ∈ (0,1] be constants, and let ν ∈ 𝒟(p,k) be a distribution containing BLR (see Definition 32). Also, let f: {0,1}^n → [−1,1] be a function such that

|𝔼_{X=(X_1,…,X_k)∼ν^n}[∏_{i=1}^k f(X_i)]| ≥ ϵ.   (1)

Step 1: Large Fourier Coefficient under Random Restriction

We note that the proof of this step is where we differ from [11].

Since the distribution ν ∈ 𝒟(p,k) contains BLR, we can write ν = (1−β)·ν′ + β·μ, for some small constant 0 < β < ½·min{p, 1−p}, some distribution ν′ over {0,1}^k, and with μ the uniform distribution over {(x_1, x_2, x_1⊕x_2⊕b̃, z̃) : x_1, x_2 ∈ {0,1}}, where b̃, z̃ are as in Definition 32. Using this, we can describe choosing X ∼ ν^n as the following two-step process. First, choose a set I ⊆ [n], denoted I ∼_{1−β} [n], by including i in I with probability 1−β, independently for each i ∈ [n]. Then, choose Z ∼ ν′^I and Y ∼ μ^{Ī}, and set X = (Y, Z).

With the above, we can prove that the function f satisfies the property of having a large fourier coefficient under random restrictions; the reader is referred to [28] for an introduction to Fourier analysis over the hypercube.

Lemma 34.

With δ = ϵ/2, it holds that

Pr_{I∼_{1−β}[n], Z∼ν′^I}[∃ S ⊆ [n]∖I : |f̂_{I→Z_1}(S)| ≥ δ] ≥ δ.

Here, f_{I→Z_1} refers to the restriction of the function f, with the variables in I set to Z_1.

Proof.

By Equation 1, we have

ϵ ≤ |𝔼_{X=(X_1,…,X_k)∼ν^n}[∏_{i=1}^k f(X_i)]|
  = |𝔼_{I∼_{1−β}[n], Z∼ν′^I} 𝔼_{Y∼μ^{Ī}}[∏_{i=1}^k f_{I→Z_i}(Y_i)]|
  ≤ 𝔼_{I∼_{1−β}[n], Z∼ν′^I} |𝔼_{Y∼μ^{Ī}}[∏_{i=1}^k f_{I→Z_i}(Y_i)]|.

Observe that in the above expression, the random variables Y_4,…,Y_k are constants (determined by z̃). Now, using a (classical) Fourier-analytic argument to analyze the BLR linearity test over the uniform distribution (see Chapter 1 of [28]), we get

ϵ ≤ 𝔼_{I∼_{1−β}[n], Z∼ν′^I} |𝔼_{Y∼μ^{Ī}}[∏_{i=1}^3 f_{I→Z_i}(Y_i)]|
  = 𝔼_{I∼_{1−β}[n], Z∼ν′^I} |∑_{S⊆Ī} f̂_{I→Z_1}(S)·f̂_{I→Z_2}(S)·f̂_{I→Z_3}(S)·(−1)^{b̃·|S|}|
  ≤ 𝔼_{I∼_{1−β}[n], Z∼ν′^I} [max_{S⊆Ī} |f̂_{I→Z_1}(S)|]
  ≤ Pr_{I∼_{1−β}[n], Z∼ν′^I}[∃ S ⊆ Ī : |f̂_{I→Z_1}(S)| ≥ ϵ/2] + ϵ/2.

Step 2: Direct Product Test

Using Theorem 1.1 in [11], together with Lemma 34, we get the existence of constants d ∈ ℕ, δ > 0, a set S ⊆ [n], and a polynomial g: {0,1}^n → ℝ of degree at most d, and with 2-norm 𝔼_{X∼μ_p^n}[g(X)²] ≤ 1, such that

|𝔼_{X∼μ_p^n}[f(X)χ_S(X)g(X)]| ≥ δ.

This proves the first part of Theorem 7. It remains to show that if ν has some pairwise independent coordinate, it is possible to remove the function g in the above expression.

Step 3: List Decoding

This step follows Section 4.2 and Section 4.3 in [11].

Using an iterative list-decoding process (we remark that this is a subtle argument, using different degree parameters for the different polynomials), we can find a constant r ∈ ℕ, functions χ_{S_1},…,χ_{S_r}, and constant-degree polynomials g_1,…,g_r, such that it is possible to “replace” f by ∑_{i∈[r]} χ_{S_i}·g_i in Equation 1 (and lose at most some constant factor in ϵ). Now, this implies that for some constant ϵ′ > 0, and some indices j_1,…,j_k ∈ [r], we have

|𝔼_{(X_1,…,X_k)∼ν^n}[∏_{i=1}^k χ_{S_{j_i}}(X_i)·g_{j_i}(X_i)]| ≥ ϵ′.   (2)

We remark that for the next step, some extra structure on the S_{j_i}'s is needed, and ensuring that it holds requires the condition on span_{𝔽_2}(supp(ν)) in Definition 32.

Step 4: Invariance Principle Argument

This step follows Section 4.4, Section 4.5, and Section 4.6 in [11].

Assume, for the sake of contradiction, that f is not correlated well with any χ_S; that is, 𝔼_{X∼μ_p^n}[f(X)χ_S(X)] ≤ o_n(1) for each S ⊆ [n]. Using this, it can be shown, roughly, that for each i ∈ [k], the expectation 𝔼_{X∼μ_p^n}[χ_{S_{j_i}}(X)·g_{j_i}(X)] ≤ o_n(1); note that for this conclusion to hold, we might have to modify the S_{j_i}'s and g_{j_i}'s, however it is possible to do so while maintaining Equation 2 – roughly, this follows by showing that the S_{j_i}'s are close to each other.

Now, by an invariance principle argument [27, 25, 26], very roughly (technically speaking, this requires two extra steps: first, noise is applied to ensure function boundedness and distribution connectivity; second, a regularity lemma is used to make low-degree influences small), it is possible to replace the expectation in Equation 2 over (X_1,…,X_k) ∼ ν^n by an expectation over (Z_1,…,Z_k) ∼ 𝒩(0,Σ)^n, where Σ ∈ ℝ^{k×k} is the (normalized) covariance matrix of ν. Finally, we use that some coordinate X_{i′} is pairwise independent of each X_i, for i ≠ i′. Since a Gaussian distribution is determined by its covariance matrix, this implies that Z_{i′} is mutually independent of (Z_i)_{i≠i′}. We have

ϵ′ ≤ |𝔼_{X=(X_1,…,X_k)∼ν^n}[∏_{i=1}^k χ_{S_{j_i}}(X_i)·g_{j_i}(X_i)]|
  ≈ |𝔼_{Z=(Z_1,…,Z_k)∼𝒩(0,Σ)^n}[∏_{i=1}^k χ_{S_{j_i}}(Z_i)·g_{j_i}(Z_i)]|
  = |𝔼_{Z_{i′}∼𝒩(0,1)^n}[χ_{S_{j_{i′}}}(Z_{i′})·g_{j_{i′}}(Z_{i′})]| · |𝔼_Z[∏_{i∈[k], i≠i′} χ_{S_{j_i}}(Z_i)·g_{j_i}(Z_i)]|
  ≈ |𝔼_{X_{i′}∼μ_p^n}[χ_{S_{j_{i′}}}(X_{i′})·g_{j_{i′}}(X_{i′})]| · |𝔼_Z[∏_{i∈[k], i≠i′} χ_{S_{j_i}}(Z_i)·g_{j_i}(Z_i)]|
  ≤ o_n(1),

which is a contradiction. ∎

References

  • [1] Noga Alon, Tali Kaufman, Michael Krivelevich, Simon Litsyn, and Dana Ron. Testing Reed-Muller codes. IEEE Trans. Inform. Theory, 51(11):4032–4039, 2005. doi:10.1109/TIT.2005.856958.
  • [2] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, 1998. doi:10.1145/278298.278306.
  • [3] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: a new characterization of NP. J. ACM, 45(1):70–122, 1998. doi:10.1145/273865.273901.
  • [4] Mihir Bellare, Don Coppersmith, Johan Håstad, Marcos Kiwi, and Madhu Sudan. Linearity testing in characteristic two. IEEE Trans. Inform. Theory, 42(6, part 1):1781–1795, 1996. (also in SFCS 1995). doi:10.1109/18.556674.
  • [5] Michael Ben-or, Don Coppersmith, Mike Luby, and Ronitt Rubinfeld. Non-abelian homomorphism testing, and distributions close to their self-convolutions. Random Structures Algorithms, 32(1):49–70, 2008. (also in APPROX-RANDOM 2004). doi:10.1002/RSA.20182.
  • [6] Eli Ben-Sasson, Madhu Sudan, Salil Vadhan, and Avi Wigderson. Randomness-efficient low degree tests and short PCPs via epsilon-biased sets. In STOC, pages 612–621, 2003. doi:10.1145/780542.780631.
  • [7] Amey Bhangale, Subhash Khot, Yang P. Liu, and Dor Minzer. On approximability of satisfiable k-CSPs: VI, 2024. Available at https://arxiv.org/pdf/2411.15133.
  • [8] Amey Bhangale, Subhash Khot, Yang P. Liu, and Dor Minzer. On approximability of satisfiable k-CSPs: VII, 2024. Available at https://arxiv.org/pdf/2411.15136.
  • [9] Amey Bhangale, Subhash Khot, and Dor Minzer. On approximability of satisfiable k-CSPs: I. In STOC, pages 976–988, 2022. doi:10.1145/3519935.3520028.
  • [10] Amey Bhangale, Subhash Khot, and Dor Minzer. On approximability of satisfiable k-CSPs: II. In STOC, pages 632–642, 2023. doi:10.1145/3564246.3585120.
  • [11] Amey Bhangale, Subhash Khot, and Dor Minzer. On approximability of satisfiable k-CSPs: III. In STOC, pages 643–655, 2023. doi:10.1145/3564246.3585121.
  • [12] Amey Bhangale, Subhash Khot, and Dor Minzer. On approximability of satisfiable k-CSPs: IV. In STOC, pages 1423–1434, 2024. doi:10.1145/3618260.3649610.
  • [13] Amey Bhangale, Subhash Khot, and Dor Minzer. On approximability of satisfiable k-CSPs: V. Electron. Colloquium Comput. Complex., TR24-129, 2024. URL: https://eccc.weizmann.ac.il/report/2024/129.
  • [14] Arnab Bhattacharyya, Swastik Kopparty, Grant Schoenebeck, Madhu Sudan, and David Zuckerman. Optimal testing of Reed-Muller codes. In FOCS, pages 488–497, 2010. doi:10.1109/FOCS.2010.54.
  • [15] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. J. Comput. System Sci., 47(3):549–595, 1993. (also in STOC 1990). doi:10.1016/0022-0000(93)90044-W.
  • [16] Roee David, Irit Dinur, Elazar Goldenberg, Guy Kindler, and Igor Shinkar. Direct sum testing. SIAM J. Comput., 46(4):1336–1369, 2017. (also in ITCS 2015). doi:10.1137/16M1061655.
  • [17] Irit Dinur, Yuval Filmus, and Prahladh Harsha. Analyzing Boolean functions on the biased hypercube via higher-dimensional agreement tests. In SODA, pages 2124–2133, 2019.
  • [18] Rick Durrett. Probability—theory and examples. Cambridge University Press, Cambridge, 2019. Fifth edition.
  • [19] Uriel Feige, Shafi Goldwasser, Laszlo Lovász, Shmuel Safra, and Mario Szegedy. Interactive proofs and the hardness of approximating cliques. J. ACM, 43(2):268–292, 1996. doi:10.1145/226643.226652.
  • [20] Shirley Halevy and Eyal Kushilevitz. Distribution-free property-testing. SIAM J. Comput., 37(4):1107–1138, 2007. (also in APPROX-RANDOM 2003, 2005). doi:10.1137/050645804.
  • [21] Wassily Hoeffding. Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc., 58:13–30, 1963.
  • [22] Gil Kalai, Noam Lifshitz, Tamar Ziegler, and Dor Minzer. A dense model theorem for the boolean slice. In FOCS, 2024. (to appear).
  • [23] Tali Kaufman, Simon Litsyn, and Ning Xie. Breaking the ϵ-soundness bound of the linearity test over GF(2). SIAM J. Comput., 39(5):1988–2003, 2010. (also in APPROX-RANDOM 2008).
  • [24] Swastik Kopparty and Shubhangi Saraf. Tolerant linearity testing and locally testable codes. In APPROX-RANDOM, pages 601–614, 2009. doi:10.1007/978-3-642-03685-9_45.
  • [25] Elchanan Mossel. Gaussian bounds for noise correlation of functions. Geom. Funct. Anal., 19(6):1713–1756, 2010.
  • [26] Elchanan Mossel. Gaussian bounds for noise correlation of resilient functions. Israel J. Math., 235(1):111–137, 2020.
  • [27] Elchanan Mossel, Ryan O’Donnell, and Krzysztof Oleszkiewicz. Noise stability of functions with low influences: invariance and optimality. Ann. of Math. (2), 171(1):295–341, 2010.
  • [28] Ryan O’Donnell. Analysis of Boolean functions. Cambridge University Press, New York, 2014.
  • [29] Amir Shpilka and Avi Wigderson. Derandomizing homomorphism testing in general groups. SIAM J. Comput., 36(4):1215–1230, 2006. (also in STOC 2004). doi:10.1137/S009753970444658X.