A Pseudorandom Generator for Functions of Low-Degree Polynomial Threshold Functions

Yao, Penghui; Zhao, Mingnan

doi:10.4230/LIPIcs.ICALP.2025.134

A Pseudorandom Generator for Functions of Low-Degree Polynomial Threshold Functions

Penghui Yao

State Key Laboratory for Novel Software Technology, New Cornerstone Science Laboratory, Nanjing University, Nanjing 210023, China
Hefei National Laboratory, Hefei 230088, China Mingnan Zhao

State Key Laboratory for Novel Software Technology, New Cornerstone Science Laboratory, Nanjing University, Nanjing 210023, China

Abstract

Developing explicit pseudorandom generators (PRGs) for prominent categories of Boolean functions is a key focus in computational complexity theory. In this paper, we investigate the PRGs against the functions of degree- $d$ polynomial threshold functions (PTFs) over Gaussian space. Our main result is an explicit construction of PRG with seed length $\mathrm{poly}\!\left(k,d,1/\epsilon\right)\cdot\log n$ that can fool any function of $k$ degree- $d$ PTFs with probability at least $1-\varepsilon$ . More specifically, we show that the summation of $L$ independent $R$ -moment-matching Gaussian vectors $\epsilon$ -fools functions of $k$ degree- $d$ PTFs, where $L=\mathrm{poly}\!\left(k,d,\frac{1}{\epsilon}\right)$ and $R=O({\log\frac{kd}{\epsilon}})$ . The PRG is then obtained by applying an appropriate discretization to Gaussian vectors with bounded independence.

Keywords and phrases:

Pseudorandom generators, polynomial threshold functions

Category:

Track A: Algorithms, Complexity and Games

Copyright and License:

2012 ACM Subject Classification:

Theory of computation

\rightarrow

Pseudorandomness and derandomization

Related Version:

Previous Version: https://arxiv.org/abs/2504.10904

Funding:

PY and MZ were supported by National Natural Science Foundation of China (Grant Nos. 62332009 and 12347104), Innovation Program for Quantum Science and Technology (Grant No. 2021ZD0302901), NSFC/RGC Joint Research Scheme (Grant No. 12461160276), Natural Science Foundation of Jiangsu Province (Grant No. BK20243060), and New Cornerstone Science Foundation.

DOI:

10.4230/LIPIcs.ICALP.2025.134

Event:

52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Editors:

Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

In computational complexity theory, derandomization is a powerful technique that aims to reduce randomness in algorithms without sacrificing efficiency or accuracy. A versatile approach for derandomization is to design explicit pseudorandom generators (PRGs) for notable families of Boolean functions. A PRG for a family of Boolean functions is able to consume few random bits and produce a distribution over high-dimensional vectors, which is indistinguishable from a target distribution, such as the uniform distribution over Boolean cube, by any function in the family. In this paper, we concern ourselves with the Gaussian distribution over $\mathbb{R}^{n}$ . Formally,

Definition 1.

Let $\mathcal{F}\subseteq\left\{f:\mathbb{R}^{n}\to\{0,1\}\right\}$ be a family of Boolean functions. A function $G:\{0,1\}^{r}\to\mathbb{R}^{n}$ is a pseudorandom generator for $\mathcal{F}$ with error $\epsilon$ over Gaussian distribution $\mathcal{N}\left(0,1\right)^{n}$ if for any $f\in\mathcal{F}$ ,

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}s\sim_{u}\left\{0,1\right% \}^{r}\end{subarray}}\!\left[f(G(s))\right]-\operatorname*{\mathbb{E}}_{\begin% {subarray}{c}x\sim\mathcal{N}\left(0,1\right)^{n}\end{subarray}}\!\left[f(x)% \right]\right|\leq\epsilon\enspace.

We call $r$ the seed length of $G$ . We also say $G$ $\epsilon$ -fools $\mathcal{F}$ over the Gaussian distribution.

There has been a considerable amount of research developing PRGs for various Boolean function families, including halfspaces, polynomial threshold functions and intersections of halfspaces. Let $\mathrm{sign}:\mathbb{R}\to\left\{0,1\right\}$ be the function such that $\mathrm{sign}\!\left(x\right)=1$ iff $x\geq 0$ . A halfspace is a Boolean function of the form $f(x)=\mathrm{sign}\!\left(a_{1}x_{1}+\cdots+a_{n}x_{n}-b\right)$ for some $a_{1},\cdots,a_{n},b\in\mathbb{R}$ . Halfspaces are a fundamental class of Boolean functions which have found significant applications in machine learning, complexity theory, theory of approximation and more. A very successful series of work produced PRGs that $\epsilon$ -fools halfspaces with seed length poly-logarithmic in $n$ and $\epsilon^{-1}$ over both Boolean space [28, 5, 21, 7] and Gaussian space [19]. Polynomial threshold functions (PTFs) are functions of the form $f(x)=\mathrm{sign}\!\left(p(x)\right)$ where $p$ is a polynomial. We call $f$ is a degree- $d$ PTF if $p$ is a degree- $d$ polynomial. PTFs are natural generalization for halfspaces since a halfspace is a degree- $1$ PTF. An explicit PRG that $\epsilon$ -fools PTFs over Boolean space has been achieved with seed length $(d/\epsilon)^{O(d)}\cdot\log n$ [21]. As for Gaussian space, a sequence of work [6, 12, 13, 14, 21, 15, 16, 24, 17] succeeds in giving a PRG with seed length polynomial in $d$ , $\epsilon^{-1}$ and $\log n$ [24, 17]. Another extension for halfspaces is intersections of $k$ halfspaces which are polytopes with $k$ facets. A line of work [8, 9, 27, 4, 25] results in PRGs with seed length polynomial in $\log k$ , $\log n$ and $1/\epsilon$ over Boolean space [25] and over Gaussian space [4].

Considering the prosperity of PRGs for these functions families, we commence designing PRGs for functions of degree- $d$ polynomial threshold functions.

Definition 2.

We say a function $F:\mathbb{R}^{n}\to\{0,1\}$ is a function of $k$ degree- $d$ PTFs if there exist $k$ polynomials $p_{1},\dots,p_{k}:\mathbb{R}^{n}\to\mathbb{R}$ of degree $d$ and a Boolean function $f:\{0,1\}^{k}\to\left\{0,1\right\}$ such that $F(x)=f\!\left(\mathrm{sign}\!\left(p_{1}\!(x)\right),\dots,\mathrm{sign}\!% \left(p_{k}\!(x)\right)\right).$

This family consumes all three function families we discussed above. For example, it includes intersections of halfspaces by setting $d=1$ and $f(x)=x_{1}\cdots x_{k}$ . The research on PRGs for functions of PTFs is driven by several motivations beyond its fundamental role in derandomization tasks. For instance, the collection of satisfying assignments of an intersection of $k$ degree- $2$ PTFs corresponds to the feasible solutions set of an $\{0,1\}$ -integer quadratic programing [22] with $k$ constraints. The investigation into the structure of these sets has been a central focus of extensive research in areas including learning theory, counting, optimization, and combinatorics.

In this work, we consider building explicit PRGs for functions of degree- $d$ PTFs over Gaussian space. Before presenting our main result, we briefly revisit relevant prior work on fooling functions of halfspaces.

Table 1: Related Work on PRGs for Intersections of PTFs.

Reference

Function Family

Seed Length

[8]

Monotone functions of

k

halfspaces

O((k\log(k/\epsilon)+\log n)\cdot\log(k/\epsilon))

[9]

Intersections of

k

\delta

-regular halfspaces

O(\log n\log k/\epsilon)

for

\delta\leq\epsilon^{5}/(\log^{8.1}\!k\log(1/\epsilon))

[27]

Intersections of

k

weight-

t

halfspaces

\mathrm{poly}\!\left(\log n,\log k,t,1/\epsilon\right)

[25]

Intersections of

k

halfspaces

\mathrm{polylog}\ m\cdot{\epsilon}^{-(2+\delta)}\cdot\log n

for any absolute constant

\delta\in(0,1)

[4]

Intersections of

k

halfspaces

Arbitrary functions of

k

halfspaces

O(\log n+\mathrm{poly}\!\left(\log k,1/\epsilon\right))

O(\log n+\mathrm{poly}\!\left(k,1/\epsilon\right))

[6]

Intersections of

k

degree-

2

PTFs

O(\log n\cdot\mathrm{poly}\!\left(k,1/\epsilon\right))

1.1 Prior Work

The related work is summarized in Table 1. Gopalan, O’Donnell, Wu and Zuckerman [8] constructed PRGs for monotone functions of halfspaces. They modified the PRG for halfspaces in [21] and showed the modified PRG $\epsilon$ -fools any monotone function of $k$ halfspaces over a broad class of product distributions with seed length $O((k\log(k/\epsilon)+\log n)\cdot\log(k/\epsilon))$ . When $k/\epsilon\leq\log^{c}n$ any $c>0$ , the seed length can be further improved to $O(k\log(k/\epsilon)+\log n)$ .

Harsha, Klivans and Meka [9] considered designing PRGs for intersections of regular halfspaces (i.e., halspaces with low influence). A halfspace $f(x)=\mathrm{sign}\!\left(a_{1}x_{1}+\cdots+a_{n}x_{n}-b\right)$ is $\delta$ -regular if $\sum_{i}a_{i}^{4}\leq\delta^{2}\sum_{i}a_{i}^{2}$ . They gave an explicit PRG construction for intersections of $k$ $\delta$ -regular halfspaces over proper and hypercontractive distributions with seed length $O(\log n\log k/\epsilon)$ when $\delta$ is no more than a threshold. Their proof is based on developing an invariance principle for intersections of regular halfspaces via a generalization of the well-known Lindeberg method [20] and an anti-concentration result of polytopes in Gaussian space from [18].

By extending the approach of [9] and combing the results on bounded independence fooling CNF formulas [1, 26], Servedio and Tan [27] designed an explicit PRG that $\epsilon$ -fools intersections of $k$ weight- $t$ halfspaces over Boolean space with $\mathrm{poly}\!\left(\log n,\log k,t,1/\epsilon\right)$ seed length. A halfspace $f(x)=\mathrm{sign}\!\left(a_{1}x_{1}+\cdots+a_{n}x_{n}-b\right)$ is said to be weight- $t$ if each $a_{i}$ is an integer in $[-t,t]$ .

As for intersections of $k$ general halfspaces, O’Donnell, Servedio and Tan [25] gave a PRG construction over Boolean space with a polylogarithmic seed length dependence on $k$ and $n$ . Their proof involves a novel invariance principle for intersections of arbitrary halfspaces and a Littlewood–Offord style anticoncentration inequality for polytopes over Boolean space.

Concurrently, Chattopadhyay, De and Servedio [4] proposed a simple PRG that $\epsilon$ -fools intersections of $k$ general halfspaces over Gaussian space, building upon the concept of Johnson-Lindenstrauss transform [10, 11]. The seed length is $O(\log n+\mathrm{poly}\!\left(\log k,1/\epsilon\right))$ . Additionally, they show that the same PRG with seed length $O(\log n+\mathrm{poly}\!\left(k,1/\epsilon\right))$ is able to fool arbitrary functions of $k$ halfspaces.

Speaking of fooling functions of PTFs, the study by Diakonikolas, Kane and Nelson [6] stands out as the sole work that constructs a PRG for intersections of $k$ degree- $2$ PTFs. Their PRG is specific to degree $d\leq 2$ with a $O(\log n\cdot\mathrm{poly}\!\left(k,1/\epsilon\right))$ seed length.

1.2 Main Result

In this work, we investigate the PRGs fooling any function of low-degree PTFs. The main result is the following.

Theorem 3 (Informal version of Theorem 20).

There exists an explicit PRG $\epsilon$ -fools any function of $k$ degree- $d$ PTFs over Gaussian space with seed length $\mathrm{poly}\!\left(k,d,1/\epsilon\right)\cdot\log n$ .

The proof is inspired by the PRG proposed in [13] and the work [17]. This theorem follows from two components.

(1) Bounded independence fools functions of $𝒌$ degree- $𝒅$ PTFs

Consider the continuous random vector $Y=\frac{1}{\sqrt{L}}\sum_{i=1}^{L}Y_{i}$ where $Y_{i}$ is a $R$ -wise independent standard Gaussian vector of length $n$ . Every $Y_{i,j}$ is a standard Gaussian variable and for any degree- $R$ polynomial $f$ , $\operatorname*{\mathbb{E}}\!\left[f(Y_{i})\right]=\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}y\sim\mathcal{N}(0,1)^{n}\end{subarray}}\!\left[f(y)\right]$ . We will prove that

Theorem 4.

(Informal version of Theorem 16) With $R=O({\log\frac{kd}{\epsilon}})$ and $L=\mathrm{poly}\!\left(k,d,\frac{1}{\epsilon}\right)$ , the distribution of $Y$ $\epsilon$ -fools any function of $k$ degree- $d$ PTFs over Gaussian space.

The prior work [17] shows that bounded independence fools a single low-degree polynomial threshold function. This generalizes their work to the case of functions of $k$ low-degree PTFs.

(2) Discretization of bounded independence Gaussians

An explicit PRG construction requires a discrete approximation to Gaussian vectors with bounded independence. The idea is to use a finite entropy random variable $X$ to approximate $Y$ . Previous work [13] uses the idea that a single Gaussian variable can be produced by two uniform random variables in $[0,1]$ through the Box–Muller transform [2]. Therefore bounded independence Gaussian variables $Y_{i}$ can be generated by using bounded independence uniform random variables. Then by truncating these uniform $[0,1]$ random variables to a sufficient precision, we obtain vectors $X_{i}$ that serve as a discrete approximation of $Y_{i}$ . We prove that $X$ also fools functions of $k$ degree- $d$ PTFs as long as $X$ is a good approximation to $Y$ .

Lemma 5 (Informal version of Lemma 19).

If $X_{i,j}$ and $Y_{i,j}$ are sufficiently close with high probability, then $X$ also fools functions of $k$ degree- $d$ PTFs.

2 Preliminary

Basic Notation

For $n\in\mathbb{N}$ , $[n]$ denotes the set $\left\{1,2,\cdots,n\right\}$ . For $\alpha\in\mathbb{R}^{n}$ and $i\in[n]$ , $\alpha_{i}$ denotes the $i$ -th coordinate of $\alpha$ , $\left|\alpha\right|=\sum_{i=1}^{n}\left|\alpha_{i}\right|$ and $\left\lVert\alpha\right\rVert_{\infty}=\max_{1\leq i\leq n}\left|\alpha_{i}\right|$ . For $\alpha,\beta\in\mathbb{R}^{n}$ , $\alpha-\beta$ denotes the vector $v$ such that $v_{i}=\alpha_{i}-\beta_{i}$ for all $i\in[n]$ , and $\alpha^{\beta}=\prod_{i=1}^{n}\alpha_{i}^{\beta_{i}}$ . For $\alpha\in\mathbb{N}^{n}$ , $\alpha!=\prod_{i=1}^{n}\alpha_{i}!$ . When it is clear from the context, we will use both subscript and superscript as indices.

Derivatives and Multidimensional Taylor Expansion

For a function $f:\mathbb{R}^{n}\to\mathbb{R}$ and $\alpha\in\mathbb{N}^{n}$ , we use $\partial^{\alpha}\!f$ to denote the partial derivative taken $\alpha_{i}$ times in the $i$ -th coordinate and define $\left\lVert\nabla^{t}{f(x)}\right\rVert=\sqrt{\sum_{\alpha\in\mathbb{N}^{n},% \left|\alpha\right|=t}\left(\partial^{\alpha}f(x)\right)^{2}}$ . For $f(a,b):\mathbb{R}^{n}\times\mathbb{R}^{n}\to\mathbb{R}$ and $\alpha,\beta\in\mathbb{N}^{n}$ , we use $\partial^{\alpha}_{a}\partial^{\beta}_{b}\!f$ to denote the partial derivative taken $\alpha_{i}$ times in $a_{i}$ and $\beta_{i}$ times in $b_{i}$ . Using these notations, one has:

Theorem 6 (Multidimensional Taylor’s Theorem).

Let $d\in\mathbb{N}$ and $f:\mathbb{R}^{n}\to\mathbb{R}^{n}$ be a $\mathcal{C}^{d+1}$ function. Then for all $x,y\in\mathbb{R}^{n}$ ,

f(y)=\sum_{\alpha\in\mathbb{N}^{n},\left|\alpha\right|\leq d}\frac{\partial^{% \alpha}f(x)}{\alpha!}(y-x)^{\alpha}+\sum_{\alpha\in\mathbb{N}^{n},\left|\alpha% \right|=d+1}\frac{\partial^{\alpha}f(z)}{\alpha!}(y-x)^{\alpha}

where $z=cx+(1-c)y$ for some $c\in(0,1)$ .

Bump Function

Consider the bump function $\Psi:\mathbb{R}\to\mathbb{R}$ defined by $\Psi(x)=\begin{cases}e^{\frac{1}{x^{2}-1}},&\text{ if }\left|x\right|<1,\\ 0,&\text{ if }\left|x\right|\geq 1.\end{cases}$ It is well known that this function is infinitely differentiable and the derivatives are bounded.

Fact 7.

For all $t\in\mathbb{N}$ , $\left|\Psi^{(t)}(x)\right|\leq t^{(3+o(1))t}$ .

Let $\rho$ be the smooth univariate function defined by $\rho(x)=\begin{cases}1,&\text{ for }x\geq 1,\\ e\cdot e^{\frac{1}{(t-1)^{2}-1}}&\text{ for }0<x<1,\\ 0,&\text{ for }x\leq 0.\end{cases}$ It is easy to see $\rho$ is obtained from $\Psi$ via translation, stretch, and concatenation. We have

Fact 8.

For all $t\in\mathbb{N}$ , $\left|\rho^{(t)}(x)\right|\leq t^{(3+o(1))t}$ .

Fact 9.

Let $r(u,v)\coloneq\rho(\log u-\log v+c)$ for some constant $c$ . Then we have that for all $n,m\in\mathbb{N}$ , $\left|\frac{\partial^{n}\partial^{m}r(u,v)}{\partial u^{n}\partial v^{m}}% \right|\leq\frac{(n+m)^{6(n+m)}}{\left|u\right|^{n}\left|v\right|^{m}}$ .

We include the proof for the above three facts in Appendix A for self-containment.

Gaussian Space and the Gaussian Noise Operator

We denote by $y\sim\mathcal{N}(0,1)^{n}$ that $y=(y_{1},\dots,y_{n})\in\mathbb{R}^{n}$ is a random vector whose components are independent standard Gaussian variables (i.e., with mean $0$ and variance $1$ ). We say a random vector $Y\in\mathbb{R}^{n}$ is a $k$ -wise independent standard Gaussian vector if every component of $Y$ is a standard Gaussian variable and $\operatorname*{\mathbb{E}}[p(Y)]=\operatorname*{\mathbb{E}}_{y\sim\mathcal{N}(% 0,1)^{n}}[p(y)]$ for all polynomials $p:\mathbb{R}^{n}\to\mathbb{R}$ with degree at most $k$ . For a function $f:\mathbb{R}^{n}\to\mathbb{R}$ on Gaussian space and $1\leq p\leq\infty$ , the $p$ -norm is denoted by $\left\lVert f\right\rVert_{p}=\left(\operatorname*{\mathbb{E}}_{\begin{% subarray}{c}y\sim\mathcal{N}(0,1)^{n}\end{subarray}}\!\left[\left|f(y)\right|^% {p}\right]\right)^{1/p}.$ For $\rho\in[0,1]$ , the Gaussian noise operator $U_{\rho}$ is the operator on the space of functions $f:\mathbb{R}^{n}\to\mathbb{R}$ defined by $U_{\rho}f(x)=\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}(0% ,1)^{n}\end{subarray}}\!\left[f(\rho x+\sqrt{1-\rho^{2}}y)\right].$

The probabilists’ Hermite polynomials [23, Section 11] $\left\{H_{j}\right\}_{j\in\mathbb{N}}$ are defined by $H_{j}(y)=\frac{(-1)^{j}}{\varphi(y)}\cdot\frac{\mathrm{d}^{j}\varphi(y)}{% \mathrm{d}\ y^{j}}$ where $\varphi(y)=\frac{1}{\sqrt{2\pi}}e^{-\frac{y^{2}}{2}}$ . The univariate Hermite polynomials $\left\{h_{j}\right\}_{j\in\mathbb{N}}$ are defined by normalization: $h_{j}=\frac{1}{\sqrt{j!}}H_{j}$ . For a multi-index $\alpha\in\mathbb{N}^{n}$ , the (multivariate) Hermite polynomial $h_{\alpha}:\mathbb{R}^{n}\to\mathbb{R}$ is $h_{\alpha}(y)=\prod_{j=1}^{n}h_{\alpha_{j}}(y_{j}).$ The degree of $h_{\alpha}$ is $\left|\alpha\right|$ . The Hermite polynomials $\left\{h_{\alpha}\right\}_{\alpha\in\mathbb{N}^{n}}$ form an orthonormal basis for the functions over Gaussian space: $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}(0,1)^{n}\end{% subarray}}\!\left[h_{\alpha}(y)h_{\beta}(y)\right]=1$ iff $\alpha=\beta$ , and every degree- $d$ polynomial $f:\mathbb{R}^{n}\to\mathbb{R}$ can be uniquely expanded as $f(y)=\sum_{\alpha\in\mathbb{N}^{n},\left|\alpha\right|\leq d}\widehat{f}(% \alpha)h_{\alpha}(y).$ We can also expand the function $f(x+\sqrt{\lambda}y)$ in the Hermite basis in a manner similar to Taylor expansion.

Lemma 10 (Lemma 16 in [17]).

Suppose $f(y)=\sum_{\alpha\in\mathbb{N}^{n}}\widehat{f}(\alpha)h_{\alpha}(y)$ , we have $f(x+\sqrt{\lambda}y)=\sum_{\alpha\in\mathbb{N}^{n}}\frac{\partial^{\alpha}\phi% (x)}{\sqrt{\alpha!}}\lambda^{\left|\alpha\right|/2}h_{\alpha}(y),$ where $\phi(x)=U_{\sqrt{1-\lambda}}f\left(\frac{x}{\sqrt{1-\lambda}}\right)$ .

The function $U_{\rho}f$ has the following expansion: $U_{\rho}f(y)=\sum_{\alpha\in\mathbb{N}^{n},\left|\alpha\right|\leq d}\rho^{% \left|\alpha\right|}\widehat{f}(\alpha)h_{\alpha}(y).$ The definition of $U_{\rho}$ can be extended to $\rho>1$ by its action on the Hermite polynomials: $U_{\rho}h_{\alpha}(y)=\rho^{\left|\alpha\right|}h_{\alpha}(y)$ . We will use the following hypercontractive inequality:

Theorem 11.

Let $f:\mathbb{R}^{n}\to\mathbb{R}$ and $2\leq p\leq\infty$ , $\left\lVert f\right\rVert_{p}\leq\left\lVert U_{\sqrt{p-1}}f\right\rVert_{2}$ .

For more details on analysis over Gaussian space, readers may refer to [23].

Low-Degree Polynomials

Low-degree polynomials are extensively studied in the literature. We list some results used in this paper. It is well-know that low-degree polynomials have the following anti-concentration property:

Lemma 12 (Theorem 8 in [3]).

Let $p:\mathbb{R}^{n}\to\mathbb{R}$ be a polynomial of degree $d$ with $\left\lVert p\right\rVert_{2}=1$ . Then we have $\Pr_{x\sim\mathcal{N}(0,1)^{n}}[\left|p(x)\right|\leq\epsilon]=O(d\epsilon^{1/% d}).$

Suppose $p$ is a low-degree polynomial, the following gives an estimation on the deviation of $p(x)$ caused by a small perturbation.

Lemma 13 (Lemma 22 in [13]).

Let $p:\mathbb{R}^{n}\to\mathbb{R}$ be a polynomial of degree $d$ with $\left\lVert p\right\rVert_{2}=1$ . Suppose $x\in\mathbb{R}^{n}$ be a vector with $\left\lVert x\right\rVert_{\infty}\leq B(B>1)$ . Let $x^{\prime}$ be another vector such that $\left\lVert x-x^{\prime}\right\rVert_{\infty}\leq\delta<1$ . Then we have $\left|p(x)-p(x^{\prime})\right|\leq\delta n^{d/2}O(B)^{d}.$

The magnitudes of the derivatives of a low-degree polynomial are likely to grow at a moderate rate with high probability. Formally,

Lemma 14 (Lemma 6 in [17]).

Let $p:\mathbb{R}^{n}\to\mathbb{R}$ be an arbitrary polynomial of degree $d$ and $y\sim\mathcal{N}\left(0,1\right)^{n}$ , the following holds with probability at least $1-\epsilon d^{3}$ :

\left\lVert\nabla^{t}{p(y)}\right\rVert\leq O\left(\epsilon^{-1}\right)\left% \lVert\nabla^{t-1}{p(y)}\right\rVert\text{ for all }1\leq t\leq d.

The following lemma gives quantitative bounds on how much the derivatives $\nabla^{t}{p(x+\sqrt{\lambda}y)}$ are concentrated around those of $\phi(x)=\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}(0,1)^{% n}\end{subarray}}\!\left[p(x+\sqrt{\lambda}y)\right]$ when $y\sim\mathcal{N}(0,1)^{n}$ .

Lemma 15 (Lemma 23 in [17]).

Let $0\leq\lambda<1$ and $p:\mathbb{R}^{n}\to\mathbb{R}$ be an arbitrary polynomial of degree $d$ and $\phi(x)=U_{\sqrt{1-\lambda}}p\!\left(\frac{x}{\sqrt{1-\lambda}}\right)=% \operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}(0,1)^{n}\end{% subarray}}\!\left[p(x+\sqrt{\lambda}y)\right]$ . For $0\leq t\leq d$ and $y\sim\mathcal{N}\left(0,1\right)^{n}$ ,

\left(\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}\left(0,1% \right)^{n}\end{subarray}}\!\left[\left\lVert\nabla^{t}{p(x+\sqrt{\lambda}y)}-% \nabla^{t}{\phi(x)}\right\rVert^{R}\right]\right)^{\frac{1}{R}}\leq\sqrt{\sum_% {j=t+1}^{d}(\lambda dR)^{j-t}\left\lVert\nabla^{j}{\phi(x)}\right\rVert^{2}}\enspace.

3 Fooling the Functions of PTFs via Bounded Independence

In this section, we show that a random Gaussian vector matching certain moments fools any function of low-degree polynomial threshold functions. Formally, we prove

Theorem 16.

Fix a small constant $0<\epsilon<1$ and let $R\in\mathbb{N}$ be an integer. Let $p_{1},\dots,p_{k}:\mathbb{R}^{n}\to\mathbb{R}$ be arbitrary polynomials of degree $d$ and $f:\{0,1\}^{k}\to\left\{0,1\right\}$ be an arbitrary Boolean function. Define function

F(x)\coloneqq f\!\left(\mathrm{sign}\!\left(p_{1}\!(x)\right),\dots,\mathrm{% sign}\!\left(p_{k}\!(x)\right)\right)\enspace.

Let $Y=\frac{1}{\sqrt{L}}\sum_{i=1}^{L}Y_{i}$ where $Y_{i}$ is a $2dR$ -wise independent standard Gaussian vector of length $n$ and $L=\Omega\left(\frac{k^{2}d^{3}R^{15}}{\epsilon^{2}}\right)$ . Then, we have

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (Y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}% \left(0,1\right)^{n}\end{subarray}}\!\left[F(y)\right]\right|=O(\epsilon kd^{3% })+kdL\cdot 2^{-\Omega(R)}\enspace.

The key idea in the proof of Theorem 16 is to analyze the derivatives of the disturbed function $\phi_{i}(x)=\operatorname*{\mathbb{E}}_{y\sim\mathcal{N}\left(0,1\right)^{n}}[% p_{i}(x+\sqrt{\lambda}y)]$ . We will see that once the derivatives of $\phi_{i}$ are well-controlled by its preceding order derivative at $x$ , $\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}$ is concentrated around $\nabla^{t}{\phi_{i}(x)}$ for a random $y$ , and $p_{i}(x+\sqrt{\lambda}y)$ and $\phi_{i}(x)$ share the same sign with high probability. Starting from this point, we use the mollifier introduced in [17]

\displaystyle G(x)\coloneqq\prod_{i=1}^{k}\prod_{t=0}^{d-1}\rho\!\left(\log\!% \left(\frac{\left\lVert\nabla^{t}{p_{i}(x)}\right\rVert^{2}}{16\epsilon^{2}% \left\lVert\nabla^{t+1}{p_{i}(x)}\right\rVert^{2}}\right)\right)

(1)

to judge whether the derivatives are all well-controlled for all $k$ polynomials. $G(x)=0$ as long as a certain order of derivative that is not controlled by its preceding order derivative. Our proof consists of following steps:

$\blacksquare$

Approximation using the mollifier $G$ : We first establish that

$\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (Y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[F(y)\right]\right|\approx\left|\operatorname*{\mathbb{E}}_{\begin{% subarray}{c}Y\end{subarray}}\!\left[F(Y)G(Y)\right]-\operatorname*{\mathbb{E}}% _{\begin{subarray}{c}y\end{subarray}}\!\left[F(y)G(y)\right]\right|\enspace.$

This approximation enables us to focus primarily on the analysis of $F(y)G(y)$ in the subsequent steps.

\blacksquare

Hybrid argument: Let $\lambda=L^{-1}$ , $y=\sqrt{\lambda}\sum_{i=1}^{L}y_{i}$ where $y_{i}\sim\mathcal{N}(0,1)^{n}$ and $Z^{i}=\sqrt{\lambda}(y_{1}+\cdots+y_{i-1}+Y_{i+1}+\cdots Y_{L})$ . We will show

\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[F(Z^{i}+\sqrt{\lambda}Y_{i})G(Z^{i}+\sqrt{\lambda}Y_{i})\right]\approx% \operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[F(Z^{i}+% \sqrt{\lambda}y_{i})G(Z^{i}+\sqrt{\lambda}y_{i})\right]\enspace.

(2)

Therefore by the triangle inequality, we have

	$\displaystyle\hskip-10.00002pt\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y% \end{subarray}}\!\left[F(Y)G(Y)\right]$	$\displaystyle=\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[F(Z^{1}+\sqrt{\lambda}Y_{1})G(Z^{1}+\sqrt{\lambda}Y_{1})\right]$
		$\displaystyle\approx\operatorname{\mathbb{E}}_{\begin{subarray}{c}\end{% subarray}}\!\left[F(Z^{1}+\sqrt{\lambda}y_{1})G(Z^{1}+\sqrt{\lambda}y_{1})% \right]=\operatorname{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[% F(Z^{2}+\sqrt{\lambda}Y_{2})G(Z^{2}+\sqrt{\lambda}Y_{2})\right]$
		$\displaystyle\approx\cdots\approx\operatorname{\mathbb{E}}_{\begin{subarray}{% c}\end{subarray}}\!\left[F(Z^{L}+\sqrt{\lambda}y_{L})G(Z^{L}+\sqrt{\lambda}y_{% L})\right]=\operatorname{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[F(y)G(y)\right]\enspace.$

To prove (2), we show for any fixed $x$ ,

\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[F(x+% \sqrt{\lambda}Y_{i})G(x+\sqrt{\lambda}Y_{i})\right]\approx\operatorname*{% \mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[F(x+\sqrt{\lambda}y_{i}% )G(x+\sqrt{\lambda}y_{i})\right]\enspace.

This is done by a case analysis:

–

The derivatives of all $k$ polynomials $\phi_{j}(x)$ are well-controlled at point $x$ . In this case, all $p_{j}(x+\sqrt{\lambda}Y_{i})$ and $p_{j}(x+\sqrt{\lambda}y_{i})$ share the same sign with high probability. Thus, it is highly likely that $F(x+\sqrt{\lambda}Y_{i})$ and $F(x+\sqrt{\lambda}y_{i})$ are nearly the same constant. It suffices to show $Y_{i}$ fools the mollifier function $G(x+\sqrt{\lambda}y_{i})$ .
–

At least one derivative is not controlled. In this case, we will show that $G(x+\sqrt{\lambda}Y_{i})$ and $G(x+\sqrt{\lambda}y_{i})$ are $0$ with high probability. This implies that $F(x+\sqrt{\lambda}Y_{i})G(x+\sqrt{\lambda}Y_{i})=F(x+\sqrt{\lambda}y_{i})G(x+% \sqrt{\lambda}y_{i})=0$ with overwhelming probability.

In the subsequent sections, Section 3.1 first demonstrates that $Y_{i}$ is able to fool the mollifier function $G$ when $x$ is a well-behaved point. Section 3.2 shows the closeness of a single step in the hybrid argument. Lastly, we prove Theorem 16 using approximation and the hybrid argument in Section 3.3.

3.1 Fooling the Mollifier $𝑮$

We begin with proving that a $2dR$ -wise independent standard Gaussian vector $Y$ fools the mollifier function $G(x+\sqrt{\lambda}y)$ . To achieve this, we utilize the Taylor expansion to expand the mollifier function $G(x+\sqrt{\lambda}y)$ up to a specified order. As a result, $G(x+\sqrt{\lambda}y)$ is decomposed into two parts: a degree- $d(R-1)$ polynomial $l(y)$ and a remainder term $\Delta(y)$ . We mainly show that $\operatorname*{\mathbb{E}}[\Delta]$ is negligible under both pseudorandom distribution and true Gaussian distribution. This leads us to the conclusion that $\operatorname*{\mathbb{E}}[G(x+\sqrt{\lambda}y)]\approx\operatorname*{\mathbb{% E}}[l(y)]$ and $\operatorname*{\mathbb{E}}[G(x+\sqrt{\lambda}Y)]\approx\operatorname*{\mathbb{% E}}[l(Y)]$ . Furthermore, since $l(y)$ has degree at most $d R$ , it follows that $\operatorname*{\mathbb{E}}[l(y)]=\operatorname*{\mathbb{E}}[l(Y)]$ . Thus, we conclude that $\operatorname*{\mathbb{E}}[G(x+\sqrt{\lambda}y)]\approx\operatorname*{\mathbb{% E}}[G(x+\sqrt{\lambda}Y)]$ .

Lemma 17.

Fix a small constant $0<\epsilon<1$ and let $R\in\mathbb{N}$ be an integer. Let $p_{1},\dots,p_{k}:\mathbb{R}^{n}\to\mathbb{R}$ be arbitrary polynomials of degree $d$ . Define $\phi_{i}(x)\coloneq U_{\sqrt{1-\lambda}}p_{i}\!\left(\frac{x}{\sqrt{1-\lambda}% }\right)=\operatorname*{\mathbb{E}}_{y\sim\mathcal{N}\left(0,1\right)^{n}}[p_{% i}(x+\sqrt{\lambda}y)]$ for all $p_{i}$ . Suppose that a fix point $x\in\mathbb{R}^{n}$ satisfies $\left\lVert\nabla^{t+1}{\phi_{i}(x)}\right\rVert\leq\frac{1}{\epsilon}\left% \lVert\nabla^{t}{\phi_{i}(x)}\right\rVert$ for any $1\leq i\leq k$ and $0\leq t\leq d-1$ . Let $Y$ be a $2dR$ -wise independent standard Gaussian vector of length $n$ . For $\lambda=O(k^{-2}d^{-3}R^{-15}\epsilon^{2})$ , we have

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[G% (x+\sqrt{\lambda}Y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y% \sim\mathcal{N}\left(0,1\right)^{n}\end{subarray}}\!\left[G(x+\sqrt{\lambda}y)% \right]\right|=kd\cdot 2^{-\Omega(R)}\enspace,

where $G$ is defined in (1).

Proof.

Let $\sigma(z)\coloneq\rho(z-\log{16\epsilon^{2}})$ and by the definition of function $G\left(\cdot\right)$ defined in (1),

G(z)=\prod_{i=1}^{k}\prod_{t=0}^{d-1}\sigma\!\left(\log\left\lVert\nabla^{t}{p% _{i}(z)}\right\rVert^{2}-\log\left\lVert\nabla^{t+1}{p_{i}(z)}\right\rVert^{2}% \right).

Define variables $\left\{s_{i}^{t}\right\}_{1\leq i\leq k,0\leq t\leq d-1}$ and $\left\{r_{i}^{t}\right\}_{1\leq i\leq k,0\leq t\leq d-1}$ by letting $s_{i}^{t}=\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}\right\rVert^{2}$ and $r_{i}^{t}=\left\lVert\nabla^{t+1}{p_{i}(x+\sqrt{\lambda}y)}\right\rVert^{2}$ as functions of $y$ . Apparently, we have

G(x+\sqrt{\lambda}y)=g(s,r)\coloneq\prod_{i=1}^{k}\prod_{t=0}^{d-1}\sigma\!% \left(\log s_{i}^{t}-\log r_{i}^{t}\right)\enspace.

Therefore, it is equivalent to prove

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}\left(0,1% \right)^{n}\end{subarray}}\!\left[g(s,r)\right]-\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}Y\end{subarray}}\!\left[g(s,r)\right]\right|=kd\cdot 2^{-% \Omega(R)}\enspace.

To this end, we expand $g(s,r)$ into $R$ -th order using the Taylor expansion at some point $(a,b)$ : $g(s,r)=l(s,r)+\Delta,$ where $l(s,r)$ is a polynomial of $y$ of degree at most $d R$ and $\Delta$ is the remainder. Since $Y$ is $2dR$ -wise independent, we know $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right% )^{n}\end{subarray}}\!\left[l(s,r)\right]=\operatorname*{\mathbb{E}}_{\begin{% subarray}{c}Y\end{subarray}}\!\left[l(s,r)\right]$ . Therefore, it suffices to show $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right% )^{n}\end{subarray}}\!\left[\left|\Delta\right|\right]$ and $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[\left|% \Delta\right|\right]$ are bounded by $kd\cdot 2^{-\Omega(R)}$ .

More specifically, we choose to expand $g(s,r)$ at points $a_{i}^{t}=\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert^{2}$ and $b_{i}^{t}=\left\lVert\nabla^{t+1}{\phi_{i}(x)}\right\rVert^{2}$ via Theorem 6: $g(s,r)=l(s,r)+\Delta,$ where

l(s,r)=\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)_{i\in[k],0\leq t% \leq d-1}\in\mathbb{N}^{kd}\\ \left(\beta^{t}_{i}\right)_{i\in[k],0\leq t\leq d-1}\in\mathbb{N}^{kd}\\ \left|\alpha\right|+\left|\beta\right|<R\end{subarray}}\frac{\partial^{\alpha}% _{s}\partial^{\beta}_{r}g(a,b)}{\alpha!\beta!}(s-a)^{\alpha}(r-b)^{\beta}

and

\Delta=\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)_{i\in[k],0\leq t% \leq d-1}\in\mathbb{N}^{kd}\\ \left(\beta^{t}_{i}\right)_{i\in[k],0\leq t\leq d-1}\in\mathbb{N}^{kd}\\ \left|\alpha\right|+\left|\beta\right|=R\end{subarray}}\frac{\partial^{\alpha}% _{s}\partial^{\beta}_{r}g(s^{*},r^{*})}{\alpha!\beta!}(s-a)^{\alpha}(r-b)^{% \beta}\enspace.

for some $(s^{*},r^{*})$ on the line segment joining $(s,r)$ and $(a,b)$ . It is not hard to see that $l(s,r)$ is a polynomial of $y$ of degree at most $d(R-1)$ . For the remainder term $\Delta$ , we will prove the following bound: $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right% )^{n}\end{subarray}}\!\left[\left|\Delta\right|\right]\leq kd\cdot 2^{-\Omega(% R)}.$

We now give bounds on $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right% )^{n}\end{subarray}}\!\left[\left|\Delta\right|\right]$ . The same argument applies to $Y$ as well. Fix a small constant $0<\delta<\frac{1}{kdR^{7}}$ . Let $\mathcal{E}_{i}^{t}$ be the event that

\left|\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}\right\rVert-\left\lVert% \nabla^{t}{\phi_{i}(x)}\right\rVert\right|\leq\delta\left\lVert\nabla^{t}{\phi% _{i}(x)}\right\rVert\enspace.

Now let $\mathcal{E}=\wedge_{1\leq i\leq k,0\leq t\leq d}\mathcal{E}_{i}^{t}$ . Note that

\Delta=\Delta\cdot\mathds{1}_{\mathcal{E}}+\Delta\cdot\mathds{1}_{\overline{% \mathcal{E}}}=\Delta\cdot\mathds{1}_{\mathcal{E}}+g(s,r)\cdot\mathds{1}_{% \overline{\mathcal{E}}}-l(s,r)\cdot\mathds{1}_{\overline{\mathcal{E}}}\enspace.

Here, $\mathds{1}_{\mathcal{A}}=1$ when event $\mathcal{A}$ occurs and $\mathds{1}_{\mathcal{A}}=0$ otherwise. Therefore, by the triangle inequality, we have

$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left\|\Delta\right\|\right]$	$\displaystyle\leq\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[\Delta\cdot\mathds{1}_{\mathcal{E}}\right]+\operatorname{\mathbb{E}% }_{\begin{subarray}{c}y\end{subarray}}\!\left[g(s,r)\cdot\mathds{1}_{\overline% {\mathcal{E}}}\right]+\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[\left\|l(s,r)\right\|\cdot\mathds{1}_{\overline{\mathcal{E}}}\right]$
	$\displaystyle\leq\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[\Delta\cdot\mathds{1}_{\mathcal{E}}\right]+\operatorname{\mathbb{E}% }_{\begin{subarray}{c}y\end{subarray}}\!\left[\mathds{1}_{\overline{\mathcal{E% }}}\right]+\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left\|l(s,r)\right\|\cdot\mathds{1}_{\overline{\mathcal{E}}}\right]$
	$\displaystyle\leq\underbrace{\operatorname{\mathbb{E}}_{\begin{subarray}{c}y% \end{subarray}}\!\left[\Delta\cdot\mathds{1}_{\mathcal{E}}\right]}_{(\text{% Term }1)}+\underbrace{\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[\mathds{1}_{\overline{\mathcal{E}}}\right]}_{(\text{Term }2)% }+\underbrace{\sqrt{\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[l^{2}(s,r)\right]}}_{(\text{Term }3)}\cdot\sqrt{% \operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[\mathds% {1}_{\overline{\mathcal{E}}}\right]}\enspace.$	(Cauchy–Schwarz)

We are next to bound Term $1\sim 3$ .

Bounding Term 1.

If event $\mathcal{E}$ occurs, we have

	$\displaystyle\left\|\Delta\right\|$	$\displaystyle\leq\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)\in% \mathbb{N}^{kd},\left(\beta^{t}_{i}\right)\in\mathbb{N}^{kd}\\ \left\|\alpha\right\|+\left\|\beta\right\|=R\end{subarray}}\frac{\left\|\partial^{% \alpha}_{s}\partial^{\beta}_{r}g(s^{},r^{})\right\|}{\alpha!\beta!}\prod_{i=1% }^{k}\prod_{t=0}^{d-1}\left\|s_{i}^{t}-a_{i}^{t}\right\|^{\alpha_{i}^{t}}\left\|r% _{i}^{t}-b_{i}^{t}\right\|^{\beta_{i}^{t}}$
		$\displaystyle\leq\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)\in% \mathbb{N}^{kd},\left(\beta^{t}_{i}\right)\in\mathbb{N}^{kd}\\ \left\|\alpha\right\|+\left\|\beta\right\|=R\end{subarray}}{R^{6R}}\cdot\prod_{i=1% }^{k}\prod_{t=0}^{d-1}\frac{\left\|s_{i}^{t}-a_{i}^{t}\right\|^{\alpha_{i}^{t}}}% {\left\|s^{t}_{i}\right\|^{\alpha_{i}^{t}}}\frac{\left\|r_{i}^{t}-b_{i}^{t}% \right\|^{\beta_{i}^{t}}}{\left\|r^{t}_{i}\right\|^{\beta_{i}^{t}}}$
		$\displaystyle\leq\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)\in% \mathbb{N}^{kd},\left(\beta^{t}_{i}\right)\in\mathbb{N}^{kd}\\ \left\|\alpha\right\|+\left\|\beta\right\|=R\end{subarray}}{R^{6R}}\cdot\left(% \frac{2\delta+\delta^{2}}{(1-\delta)^{2}}\right)^{R}=\binom{R+2kd-1}{R}\cdot{R% ^{6R}}\cdot\left(4\delta\right)^{R}\leq 2^{-R}\enspace,$

where the second inequality is from Fact 9, and the third one is true by the following facts:

$\blacksquare$

$(1-\delta)^{2}a^{t}_{i}\leq s_{i}^{t}\leq(1+\delta)^{2}a^{t}_{i}$ , $(1-\delta)^{2}b^{t}_{i}\leq r_{i}^{t}\leq(1+\delta)^{2}b^{t}_{i}$ ,
$\blacksquare$

${s^{*t}_{i}}\geq\min\left\{s^{t}_{i},a^{t}_{i}\right\}\geq(1-\delta)^{2}a^{t}_% {i}$ and ${r^{*t}_{i}}\geq\min\left\{r^{t}_{i},b^{t}_{i}\right\}\geq(1-\delta)^{2}b^{t}_% {i}$ , since $(s^{*t}_{i},r^{*t}_{i})$ lies between $(s^{t}_{i},r^{t}_{i})$ and $(a^{t}_{i},b^{t}_{i})$ .

This gives us $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[\Delta% \cdot\mathds{1}_{\mathcal{E}}\right]\leq 2^{-R}.$

Bounding Term 2.

Since

\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}\right\rVert-\left\lVert\nabla^% {t}{\phi_{i}(x)}\right\rVert\leq\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)% }-\nabla^{t}{\phi_{i}(x)}\right\rVert

and

\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert-\left\lVert\nabla^{t}{p_{i}(x+% \sqrt{\lambda}y)}\right\rVert\leq\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y% )}-\nabla^{t}{\phi_{i}(x)}\right\rVert\enspace,

we have

\left|\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}\right\rVert-\left\lVert% \nabla^{t}{\phi_{i}(x)}\right\rVert\right|\leq\left\lVert\nabla^{t}{p_{i}(x+% \sqrt{\lambda}y)}-\nabla^{t}{\phi_{i}(x)}\right\rVert

(3)

This gives us

	$\displaystyle\Pr_{y}\left[\mathcal{E}_{i}^{t}\right]$	$\displaystyle\geq\Pr_{y}\left[\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}-% \nabla^{t}{\phi_{i}(x)}\right\rVert\leq\delta\left\lVert\nabla^{t}{\phi_{i}(x)% }\right\rVert\right]$
		$\displaystyle\geq 1-\frac{\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end% {subarray}}\!\left[\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}-\nabla^{t}{% \phi_{i}(x)}\right\rVert^{R}\right]}{\delta^{R}\left\lVert\nabla^{t}{\phi_{i}(% x)}\right\rVert^{R}}$
		$\displaystyle\geq 1-\frac{\left(\sum_{j=t+1}^{d}(\lambda dR)^{j-t}\left\lVert% \nabla^{j}{\phi(x)}\right\rVert^{2}\right)^{\frac{R}{2}}}{\delta^{R}\left% \lVert\nabla^{t}{\phi_{i}(x)}\right\rVert^{R}}$
		$\displaystyle\geq 1-\left(\frac{1}{\delta}\right)^{R}\left(\sum_{j=1}^{d-t}% \left(\frac{\lambda dR}{\epsilon^{2}}\right)^{j}\right)^{\frac{R}{2}}\geq 1-% \left(\sqrt{\frac{2\lambda dR}{\delta^{2}\epsilon^{2}}}\right)^{R}$

where the second inequality is from Markov’s inequality, the third one is from Lemma 15 and the fourth one holds since $x\in\mathbb{R}^{n}$ satisfies $\left\lVert\nabla^{t+1}{\phi_{i}(x)}\right\rVert\leq\frac{1}{\epsilon}\left% \lVert\nabla^{t}{\phi_{i}(x)}\right\rVert$ for any $1\leq i\leq k$ and $0\leq t\leq d-1$ . For $\lambda=O(k^{-2}d^{-3}R^{-15}\epsilon^{2})$ , we have $\Pr_{y}\left[\mathcal{E}_{i}^{t}\right]\geq 1-2^{-R}.$ Consequently, $\Pr_{y}\left[\mathcal{E}\right]\geq 1-kd2^{-R}.$ Therefore, we have $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[\mathds% {1}_{\overline{\mathcal{E}}}\right]\leq kd2^{-R}.$

Bounding Term 3.

We next to upper bound $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[l^{2}(s% ,r)\right]$ . Note that

		$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[l^{2}(s,r)\right]$
	$\displaystyle\leq$	$\displaystyle\sum_{\begin{subarray}{c}\left\|\alpha\right\|+\left\|\beta\right\|<R% \\ \left\|\alpha^{\prime}\right\|+\left\|\beta^{\prime}\right\|<R\end{subarray}}\frac% {\left\|\partial^{\alpha}_{s}\partial^{\beta}_{r}g(a,b)\right\|}{\alpha!\beta!}% \frac{\left\|\partial^{\alpha^{\prime}}_{s}\partial^{\beta^{\prime}}_{r}g(a,b)% \right\|}{\alpha^{\prime}!\beta^{\prime}!}\operatorname*{\mathbb{E}}_{\begin{% subarray}{c}y\end{subarray}}\!\left[\left\|(s-a)^{\alpha}(r-b)^{\beta}(s-a)^{% \alpha^{\prime}}(r-b)^{\beta^{\prime}}\right\|\right]$
	$\displaystyle\ \leq$	$\displaystyle\sum_{\begin{subarray}{c}q_{1}<R\\ q_{2}<R\end{subarray}}\sum_{\begin{subarray}{c}\left\|\alpha\right\|+\left\|\beta% \right\|=q_{1}\\ \left\|\alpha^{\prime}\right\|+\left\|\beta^{\prime}\right\|=q_{2}\end{subarray}}R% ^{6(q_{1}+q_{2})}\cdot\underbrace{\operatorname*{\mathbb{E}}_{\begin{subarray}% {c}y\end{subarray}}\!\left[\prod_{i=1}^{k}\prod_{t=0}^{d-1}\frac{\left\|s_{i}^{% t}-a_{i}^{t}\right\|^{\alpha_{i}^{t}}}{\left\|a^{t}_{i}\right\|^{\alpha_{i}^{t}}}% \frac{\left\|r_{i}^{t}-b_{i}^{t}\right\|^{\beta_{i}^{t}}}{\left\|b^{t}_{i}\right\|% ^{\beta_{i}^{t}}}\frac{\left\|s_{i}^{t}-a_{i}^{t}\right\|^{{\alpha^{\prime}}_{i}% ^{t}}}{\left\|a^{t}_{i}\right\|^{{\alpha^{\prime}}_{i}^{t}}}\frac{\left\|r_{i}^{t% }-b_{i}^{t}\right\|^{{\beta^{\prime}}_{i}^{t}}}{\left\|b^{t}_{i}\right\|^{{\beta^% {\prime}}_{i}^{t}}}\right]}_{(\star)}$

By generalized Hölder’s inequality, for $q_{1}+q_{2}\neq 0$

	$\displaystyle(\star)\leavevmode\nobreak\ \leq\leavevmode\nobreak\$	$\displaystyle\prod_{i,t}\leavevmode\nobreak\ \frac{\left(\operatorname{% \mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[\left\|s_{i}^{t}-a_{i}^% {t}\right\|^{q_{1}+q_{2}}\right]\right)^{\frac{\alpha_{i}^{t}}{q_{1}+q_{2}}}}{% \left\|a^{t}_{i}\right\|^{\alpha_{i}^{t}}}\cdot\frac{\left(\operatorname{% \mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[\left\|r_{i}^{t}-b_{i}^% {t}\right\|^{q_{1}+q_{2}}\right]\right)^{\frac{\beta_{i}^{t}}{q_{1}+q_{2}}}}{% \left\|b^{t}_{i}\right\|^{\beta_{i}^{t}}}$
		$\displaystyle\enspace\enspace\enspace\enspace\enspace\enspace\enspace\enspace% \cdot\frac{\left(\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[\left\|s_{i}^{t}-a_{i}^{t}\right\|^{q_{1}+q_{2}}\right]\right)^{\frac{% {\alpha^{\prime}}_{i}^{t}}{q_{1}+q_{2}}}}{\left\|a^{t}_{i}\right\|^{{\alpha^{% \prime}}_{i}^{t}}}\cdot\frac{\left(\operatorname{\mathbb{E}}_{\begin{subarray% }{c}y\end{subarray}}\!\left[\left\|r_{i}^{t}-b_{i}^{t}\right\|^{q_{1}+q_{2}}% \right]\right)^{\frac{{\beta^{\prime}}_{i}^{t}}{q_{1}+q_{2}}}}{\left\|b^{t}_{i}% \right\|^{{\beta^{\prime}}_{i}^{t}}}\enspace.$

Note that for $0<q\leq 2R$ ,

		$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left(\frac{\left\|\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}\right% \rVert^{2}-\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert^{2}\right\|}{\left% \lVert\nabla^{t}{\phi_{i}(x)}\right\rVert^{2}}\right)^{q}\right]$
	$\displaystyle\leq$	$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left(\frac{2\left\|\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}\right% \rVert-\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert\right\|\cdot\left\lVert% \nabla^{t}{\phi_{i}(x)}\right\rVert+\left\|\left\lVert\nabla^{t}{p_{i}(x+\sqrt{% \lambda}y)}\right\rVert-\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert\right\|^% {2}}{\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert^{2}}\right)^{q}\right]$
	$\displaystyle\leq$	$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left(\frac{2\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}-\nabla^{t}{% \phi_{i}(x)}\right\rVert}{\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert}+% \frac{\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}-\nabla^{t}{\phi_{i}(x)}% \right\rVert^{2}}{\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert^{2}}\right)^{% q}\right]$
	$\displaystyle=$	$\displaystyle\sum_{j=0}^{q}2^{j}\operatorname*{\mathbb{E}}_{\begin{subarray}{c% }y\end{subarray}}\!\left[\frac{\left\lVert\nabla^{t}{p_{i}(x+\sqrt{\lambda}y)}% -\nabla^{t}{\phi_{i}(x)}\right\rVert^{2q-j}}{\left\lVert\nabla^{t}{\phi_{i}(x)% }\right\rVert^{2q-j}}\right]$
	$\displaystyle\leq$	$\displaystyle\sum_{j=0}^{q}2^{j}\cdot\left(\sqrt{\frac{4\lambda dR}{\epsilon^{% 2}}}\right)^{2q-j}=4^{q}\cdot\sum_{j=0}^{q}\left(\sqrt{\frac{\lambda dR}{% \epsilon^{2}}}\right)^{2q-j}$
	$\displaystyle\leq$	$\displaystyle 2\cdot 4^{q}\cdot\left(\sqrt{\frac{\lambda dR}{\epsilon^{2}}}% \right)^{q}\leq\left(\sqrt{\frac{17\lambda dR}{\epsilon^{2}}}\right)^{q}$

where the first inequality is from $\left|a^{2}-b^{2}\right|=\left|a-b\right|\left|a+b\right|\leq\left|a-b\right|(% \left|a\right|+\left|b\right|)\leq\left|a-b\right|(2\left|a\right|+\left|a-b% \right|)$ , the second one is from Eq.(3) and the third inequality is from Lemma 15. Therefore,

\displaystyle(\star)\leq\prod_{i,t}\left(\sqrt{\frac{17\lambda dR}{\epsilon^{2% }}}\right)^{\alpha_{i}^{t}+\beta_{i}^{t}+{\alpha^{\prime}}_{i}^{t}+{\beta^{% \prime}}_{i}^{t}}=\left(\sqrt{\frac{17\lambda dR}{\epsilon^{2}}}\right)^{q_{1}% +q_{2}}\enspace.

Consequently,

	$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[l^{2}(s,r)\right]$	$\displaystyle\leq\sum_{\begin{subarray}{c}q_{1}<R\\ q_{2}<R\end{subarray}}\sum_{\begin{subarray}{c}\left\|\alpha\right\|+\left\|\beta% \right\|=q_{1}\\ \left\|\alpha^{\prime}\right\|+\left\|\beta^{\prime}\right\|=q_{2}\end{subarray}}R% ^{6(q_{1}+q_{2})}\cdot\left(\sqrt{\frac{17\lambda dR}{\epsilon^{2}}}\right)^{q% _{1}+q_{2}}$
		$\displaystyle\leq\sum_{\begin{subarray}{c}q_{1}<R\\ q_{2}<R\end{subarray}}(R+2kd-1)^{q_{1}+q_{2}}\cdot R^{6(q_{1}+q_{2})}\cdot% \left(\sqrt{\frac{17\lambda dR}{\epsilon^{2}}}\right)^{q_{1}+q_{2}}$
		$\displaystyle=\sum_{q=0}^{2R-2}(q+1)\cdot\left((R+2kd-1)R^{6}\sqrt{\frac{17% \lambda dR}{\epsilon^{2}}}\right)^{q}\enspace.$

For $\lambda=O(k^{-2}d^{-3}R^{-15}\epsilon^{2})$ sufficiently small, we have $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[l^{2}(s% ,r)\right]\leq\sum_{q=0}^{2R-2}(q+1)\cdot 4^{-q}<2R.$

Thus, putting everything together, we have

	$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left\|\Delta\right\|\right]$	$\displaystyle\leq\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[\Delta\cdot\mathds{1}_{\mathcal{E}}\right]+\operatorname{\mathbb{E}% }_{\begin{subarray}{c}y\end{subarray}}\!\left[\mathds{1}_{\overline{\mathcal{E% }}}\right]+\sqrt{\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[l^{2}(s,r)\right]}\sqrt{\operatorname{\mathbb{E}}_{\begin{subarray}% {c}y\end{subarray}}\!\left[\mathds{1}_{\overline{\mathcal{E}}}\right]}$
		$\displaystyle\leq 2^{-R}+kd2^{-R}+\sqrt{2Rkd2^{-R}}$
		$\displaystyle\leq kd\cdot 2^{-\Omega(R)}\enspace.$

We now regard $l(s,r)$ as a function of $y$ , and from the above inequality we have

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[G% (x+\sqrt{\lambda}y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y% \end{subarray}}\!\left[l(s,r)\right]\right|\leq\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}y\end{subarray}}\!\left[\left|\Delta\right|\right]\leq kd% \cdot 2^{-\Omega(R)}\enspace.

Similarly, the same argument applying on $2dR$ -wise independent Gaussian vector $Y$ gives us

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[G% (x+\sqrt{\lambda}Y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y% \end{subarray}}\!\left[l(s,r)\right]\right|\leq kd\cdot 2^{-\Omega(R)}\enspace.

The lemma then follows from the fact that $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[l(s,r)% \right]=\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left% [l(s,r)\right]$ , since $l(s,r)$ is a polynomial of $y$ of degree at most $d(R-1)$ . $\hfill\blacktriangleleft$

3.2 A Single Step in the Hybrids

In this section, we analyze one single step in the entire hybrid argument. We will show that for any $x$ , we have that $\operatorname*{\mathbb{E}}_{Y}[F(x+\sqrt{\lambda}Y)G(x+\sqrt{\lambda}Y)]% \approx\operatorname*{\mathbb{E}}_{y}[F(x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)]$ for $2dR$ -wise independent Gaussian $Y$ and true Gaussian $y$ .

Let $\phi_{i}(x)=U_{\sqrt{1-\lambda}}p_{i}\!\left(\frac{x}{\sqrt{1-\lambda}}\right)% =\operatorname*{\mathbb{E}}_{y}[p_{i}(x+\sqrt{\lambda}y)]$ . The proof proceeds through a case analysis based on the behavior of $\phi_{i}$ at the fixed point $x$ . Specifically, we define $x$ as well-behaved if $\left\lVert\nabla^{t+1}{\phi_{i}(x)}\right\rVert\leq\frac{1}{\epsilon}\left% \lVert\nabla^{t}{\phi_{i}(x)}\right\rVert$ for all $t\in[d]$ and $i\in[k]$ . In other words, for each function $\phi_{i}$ , its $t$ -th order derivatives are controlled by its $(t-1)$ -th order derivatives.

$\blacksquare$

In the scenario where $x$ is not well-behaved, we can identify an $i_{0}$ and a $t_{0}$ such that with at least probability $1-2^{-R+1}$ ,

$\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert>\frac{1}% {4\epsilon}\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert\enspace.$

Thus, it is highly probable that the mollifier function $G(x+\sqrt{\lambda}y)=0$ . So, the expectation of $F(x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)$ is no more that $2^{-R+1}$ . The same argument works for $Y$ as well.
$\blacksquare$

For the case that $x$ is well-behaved, we will show that for all $p_{i}$ , ${p_{i}(x+\sqrt{\lambda}y)}$ and ${p_{i}(x+\sqrt{\lambda}Y)}$ are nearly the same constant. This implies $F(x+\sqrt{\lambda}y)$ and $F(x+\sqrt{\lambda}Y)$ are equal in most situations. Then it suffices to show $Y$ fools the mollifier, as discussed in the previous section.

Lemma 18.

Fix a small constant $0<\epsilon<1$ and let $R\in\mathbb{N}$ be an integer. Let $p_{1},\dots,p_{k}:\mathbb{R}^{n}\to\mathbb{R}$ be arbitrary polynomials of degree $d$ and $f:\{0,1\}^{k}\to\left\{0,1\right\}$ be an arbitrary Boolean function. Define function

F(x)\coloneqq f\!\left(\mathrm{sign}\!\left(p_{1}\!(x)\right),\dots,\mathrm{% sign}\!\left(p_{k}\!(x)\right)\right)\enspace.

Let $Y$ be a $2dR$ -wise independent standard Gaussian vector of length $n$ . For any $x\in\mathbb{R}^{n}$ and $\lambda=O(k^{-2}d^{-3}R^{-15}\epsilon^{2})$

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (x+\sqrt{\lambda}Y)G(x+\sqrt{\lambda}Y)\right]-\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right)^{n}\end{subarray}}\!\left[% F(x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)\right]\right|=kd2^{-\Omega(R)}\enspace,

where $G$ is defined in (1).

Proof.

Let $\phi_{i}(x)=U_{\sqrt{1-\lambda}}p_{i}\!\left(\frac{x}{\sqrt{1-\lambda}}\right)$ . Define $x$ is good if for any $1\leq i\leq k$ and $0\leq t\leq d-1$ , $\left\lVert\nabla^{t+1}{\phi_{i}(x)}\right\rVert\leq\frac{1}{\epsilon}\left% \lVert\nabla^{t}{\phi_{i}(x)}\right\rVert$ . We prove this lemma by considering $x$ is good or not.

We first consider that $x$ is not good. In this case, we will show $G(x+\sqrt{\lambda}y)=0$ holds with high probability. Consequently, $F(x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)$ is zero with high probability. To this end, it suffices to find an $i_{0}$ and a $t_{0}$ such that

\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert>\frac{1}% {4\epsilon}\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert\enspace.

We choose an arbitrary $i_{0}$ satisfying that there exists $0\leq t\leq d-1$ such that $\left\lVert\nabla^{t+1}{\phi_{i_{0}}(x)}\right\rVert$ $>\frac{1}{\epsilon}\left\lVert\nabla^{t}{\phi_{i_{0}}(x)}\right\rVert$ . Since $x$ is not good, we know such $i_{0}$ exists. And let $t_{0}$ be the largest $t$ such that $\left\lVert\nabla^{t+1}{\phi_{i_{0}}(x)}\right\rVert>\frac{1}{\epsilon}\left% \lVert\nabla^{t}{\phi_{i_{0}}(x)}\right\rVert$ holds. It is not hard to check

$\blacksquare$

$\left\lVert\nabla^{t_{0}}{\phi_{i_{0}}(x)}\right\rVert<{\epsilon}\left\lVert% \nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert$ ,
$\blacksquare$

$\left\lVert\nabla^{t+1}{\phi_{i_{0}}(x)}\right\rVert\leq\frac{1}{\epsilon}% \left\lVert\nabla^{t}{\phi_{i_{0}}(x)}\right\rVert$ for $t\geq t_{0}+1$ .

We are next to prove the following inequalities hold with high probability,

(a)

$\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert<2\epsilon% \left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert$
(b)

$\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert>\frac{1}% {2}\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert$

It is easy to see $(a)$ and $(b)$ give us $\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert>\frac{1}% {4\epsilon}\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert$ .

Showing (a).

By Markov’s inequality, we have

		$\displaystyle\Pr_{y}\left[\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}% y)}-\nabla^{t_{0}}{\phi_{i_{0}}(x)}\right\rVert\geq\epsilon\left\lVert\nabla^{% t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert\right]$
	$\displaystyle\leq$	$\displaystyle\frac{\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}-% \nabla^{t_{0}}{\phi_{i_{0}}(x)}\right\rVert^{R}\right]}{\epsilon^{R}\left% \lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert^{R}}\leq\frac{\left(\sum_{% t=t_{0}+1}^{d}(\lambda dR)^{t-t_{0}}\left\lVert\nabla^{t}{\phi(x)}\right\rVert% ^{2}\right)^{\frac{R}{2}}}{\epsilon^{R}\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}% }(x)}\right\rVert^{R}}$
	$\displaystyle\leq$	$\displaystyle\frac{\left(\sum_{t=t_{0}+1}^{d}(\lambda dR)^{t-t_{0}}\left(\frac% {1}{\epsilon^{2}}\right)^{t-t_{0}-1}\left\lVert\nabla^{t_{0}+1}{\phi(x)}\right% \rVert^{2}\right)^{\frac{R}{2}}}{\epsilon^{R}\left\lVert\nabla^{t_{0}+1}{\phi_% {i_{0}}(x)}\right\rVert^{R}}\leq\left(\sum_{t=1}^{d-t_{0}}\left(\frac{\lambda dR% }{\epsilon^{2}}\right)^{t}\right)^{\frac{R}{2}}\enspace.$

Here the second inequality is from Lemma 15. The third inequality uses the condition that $\left\lVert\nabla^{t+1}{\phi_{i_{0}}(x)}\right\rVert\leq\frac{1}{\epsilon}% \left\lVert\nabla^{t}{\phi_{i_{0}}(x)}\right\rVert$ for $t\geq t_{0}+1$ . Since $\lambda\leq\frac{\epsilon^{2}}{100dR}$ , this probability is bounded by $2^{-R}$ . Therefore, with probability at least $1-2^{-R}$ ,

\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}-\nabla^{t_{0}}{\phi_{i% _{0}}(x)}\right\rVert<\epsilon\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}% \right\rVert.

Moreover, we know $\left\lVert\nabla^{t_{0}}{\phi_{i_{0}}(x)}\right\rVert<{\epsilon}\left\lVert% \nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert$ . So, we have with probability at least $1-2^{-R}$ ,

\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert\leq\left% \lVert\nabla^{t_{0}}{\phi_{i_{0}}(x)}\right\rVert+\left\lVert\nabla^{t_{0}}{p_% {i_{0}}(x+\sqrt{\lambda}y)}-\nabla^{t_{0}}{\phi_{i_{0}}(x)}\right\rVert<2% \epsilon\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert\enspace.

Showing (b).

Similarly, we have

		$\displaystyle\Pr_{y}\left[\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{% \lambda}y)}-\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert\geq\frac{1}{2}\left% \lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert\right]$
	$\displaystyle\leq$	$\displaystyle\frac{\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{\lambda}y)}-% \nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert^{R}\right]}{2^{-R}\left\lVert% \nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert^{R}}\leq\frac{\left(\sum_{t=t_{0% }+2}^{d}(\lambda dR)^{t-t_{0}-1}\left\lVert\nabla^{t}{\phi(x)}\right\rVert^{2}% \right)^{\frac{R}{2}}}{2^{-R}\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}% \right\rVert^{R}}$
	$\displaystyle\leq$	$\displaystyle\frac{\left(\sum_{t=t_{0}+2}^{d}\left(\frac{\lambda dR}{\epsilon^% {2}}\right)^{t-t_{0}-1}\left\lVert\nabla^{t_{0}+1}{\phi(x)}\right\rVert^{2}% \right)^{\frac{R}{2}}}{2^{-R}\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}% \right\rVert^{R}}\leq\left(4\cdot\sum_{t=1}^{d-t_{0}-1}\left(\frac{\lambda dR}% {\epsilon^{2}}\right)^{t}\right)^{\frac{R}{2}}\enspace.$

Here the second inequality is from Lemma 15. The third inequality uses the condition that $\left\lVert\nabla^{t+1}{\phi_{i_{0}}(x)}\right\rVert\leq\frac{1}{\epsilon}% \left\lVert\nabla^{t}{\phi_{i_{0}}(x)}\right\rVert$ for $t\geq t_{0}+1$ . Since $\lambda\leq\frac{\epsilon^{2}}{100dR}$ , this probability is bounded by $2^{-R}$ . Therefore, with probability at least $1-2^{-R}$ ,

\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert\geq\left% \lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert-\left\lVert\nabla^{t_{0}+1% }{p_{i_{0}}(x+\sqrt{\lambda}y)}-\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert>% \frac{1}{2}\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert\enspace.

Thus, combining (a) and (b), we have that with probability at least $1-2\cdot 2^{-R}$ ,

\left\lVert\nabla^{t_{0}+1}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert>\frac{1}% {2}\left\lVert\nabla^{t_{0}+1}{\phi_{i_{0}}(x)}\right\rVert>\frac{1}{4\epsilon% }\left\lVert\nabla^{t_{0}}{p_{i_{0}}(x+\sqrt{\lambda}y)}\right\rVert\enspace,

and consequently $G(x+\sqrt{\lambda}y)=0$ . This gives us the bound on the following expectation

\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right% )^{n}\end{subarray}}\!\left[F(x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)\right]% \leq 2\cdot 2^{-R}\enspace.

Since $Y$ is $d R$ -wise independent, the above argument still holds for $Y$ . So,

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (x+\sqrt{\lambda}Y)G(x+\sqrt{\lambda}Y)\right]-\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right)^{n}\end{subarray}}\!\left[% F(x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)\right]\right|\leq 4\cdot 2^{-R}\enspace.

Now suppose that $x$ is good. That is, for any $1\leq i\leq k$ and $0\leq t\leq d-1$ , $\left\lVert\nabla^{t+1}{\phi_{i}(x)}\right\rVert\leq\frac{1}{\epsilon}\left% \lVert\nabla^{t}{\phi_{i}(x)}\right\rVert$ . In this case, we will prove that the sign of ${p_{i}(x+\sqrt{\lambda}y)}$ is the same as the sign of $\phi_{i}(x)$ with high probability over the random variable $y$ . Therefore, $F(x+\sqrt{\lambda}y)$ is almost like a constant, since the value of $F(x+\sqrt{\lambda}y)$ only depends on the signs of all ${p_{i}(x+\sqrt{\lambda}y)}$ . To show the signs of ${p_{i}(x+\sqrt{\lambda}y)}$ and $\phi_{i}(x)$ are the same, it suffices to show $\frac{p_{i}(x+\sqrt{\lambda}y)}{\phi_{i}(x)}>0$ . Let

q_{i}(y)\coloneq\frac{p_{i}(x+\sqrt{\lambda}y)}{\phi_{i}(x)}-1=\frac{1}{\phi_{% i}(x)}\sum_{0<\left|\alpha\right|\leq d}\frac{\partial^{\alpha}\phi_{i}(x)}{% \sqrt{\alpha!}}\lambda^{\left|\alpha\right|/2}h_{\alpha}(y)\enspace.

Here, we expand $p_{i}(x+\sqrt{\lambda}y)=\phi_{i}(x)+\sum_{0<\left|\alpha\right|\leq d}\frac{% \partial^{\alpha}\phi_{i}(x)}{\sqrt{\alpha!}}\lambda^{\left|\alpha\right|/2}h_% {\alpha}(y)$ according to Lemma 10. We have by applying hypercontractive inequality in Theorem 11,

	$\displaystyle\left\lVert q_{i}(y)\right\rVert_{R}\leq\left\lVert U_{\sqrt{R}}q% _{i}(y)\right\rVert_{2}$	$\displaystyle=\left\lVert\frac{1}{\phi_{i}(x)}\sum_{0<\left\|\alpha\right\|\leq d% }\frac{\partial^{\alpha}\phi_{i}(x)}{\sqrt{\alpha!}}R^{\left\|\alpha\right\|/2}% \lambda^{\left\|\alpha\right\|/2}h_{\alpha}(y)\right\rVert_{2}$
		$\displaystyle\leq\sqrt{\sum_{0<t\leq d}\frac{\left\lVert\nabla^{t}{\phi_{i}(x)% }\right\rVert^{2}}{\phi_{i}(x)^{2}}(\lambda R)^{t}}\leq\sqrt{\sum_{0<t\leq d}% \left(\frac{\lambda R}{\epsilon^{2}}\right)^{t}}\enspace.$

In the last inequality, we use our assumption that $\left\lVert\nabla^{t}{\phi_{i}(x)}\right\rVert\leq\frac{1}{\epsilon}\left% \lVert\nabla^{t-1}{\phi_{i}(x)}\right\rVert\leq\cdots\leq\frac{\left|\phi_{i}(% x)\right|}{\epsilon^{t}}$ . Since $\frac{\lambda R}{\epsilon^{2}}$ is sufficiently small, we have $\left\lVert q_{i}(y)\right\rVert_{R}\leq\frac{1}{4}$ . Therefore, by Markov’s inequality, we have $\Pr_{y}\left[\left|q_{i}(y)\right|\geq\frac{1}{2}\right]\leq 2^{R}\cdot\left% \lVert q_{i}(y)\right\rVert_{R}^{R}\leq 2^{-R}.$ This means that with probability at least $1-2^{-R}$ , we have $\left|\frac{p_{i}(x+\sqrt{\lambda}y)}{\phi_{i}(x)}-1\right|\leq\frac{1}{2},$ and therefore $\frac{p_{i}(x+\sqrt{\lambda}y)}{\phi_{i}(x)}\geq\frac{1}{2}.$ Thus, with probability at least $1-2^{-R}$ , the sign of ${p_{i}(x+\sqrt{\lambda}y)}$ is the same as the sign of $\phi_{i}(x)$ . Then by a union bound,

\Pr_{y}\left[\forall 1\leq i\leq k,\ \mathrm{sign}\!\left(p_{i}(x+\sqrt{% \lambda}y)\right)=\mathrm{sign}\!\left(\phi_{i}(x)\right)\right]\geq 1-k\cdot 2% ^{-R}\enspace.

Let $c=f\!\left(\mathrm{sign}\!\left(\phi_{1}\!(x)\right),\dots,\mathrm{sign}\!% \left(\phi_{k}\!(x)\right)\right)$ be a constant. We have

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[F% (x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)\right]-\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}y\end{subarray}}\!\left[c\cdot G(x+\sqrt{\lambda}y)\right]% \right|\leq k\cdot 2^{-R}\enspace.

The same argument applying on $2dR$ -wise independent Gaussian vector $Y$ gives us

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (x+\sqrt{\lambda}Y)G(x+\sqrt{\lambda}Y)\right]-\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}Y\end{subarray}}\!\left[c\cdot G(x+\sqrt{\lambda}Y)\right]% \right|\leq k\cdot 2^{-R}\enspace.

By Lemma 17, we know $\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[G% (x+\sqrt{\lambda}Y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y% \sim\mathcal{N}\left(0,1\right)^{n}\end{subarray}}\!\left[G(x+\sqrt{\lambda}y)% \right]\right|=kd\cdot 2^{-\Omega(R)}$ . Therefore, we have

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (x+\sqrt{\lambda}Y)G(x+\sqrt{\lambda}Y)\right]-\operatorname*{\mathbb{E}}_{% \begin{subarray}{c}y\sim\mathcal{N}\left(0,1\right)^{n}\end{subarray}}\!\left[% F(x+\sqrt{\lambda}y)G(x+\sqrt{\lambda}y)\right]\right|=kd2^{-\Omega(R)}% \enspace.\

$\hfill\blacktriangleleft$

3.3 Proof of Theorem 16

Proof of Theorem 16.

By Lemma 14, the following holds with probability at least $1-\epsilon kd^{3}$ :

\left\lVert\nabla^{t}{p_{i}(y)}\right\rVert\leq O\left(\frac{1}{\epsilon}% \right)\left\lVert\nabla^{t-1}{p_{i}(y)}\right\rVert\text{ for all }1\leq t% \leq d\text{ and }1\leq i\leq k.

Recall the function $G(x)=\prod_{i=1}^{k}\prod_{t=0}^{d-1}\rho\!\left(\log\!\left(\frac{\left\lVert% \nabla^{t}{p_{i}(x)}\right\rVert^{2}}{16\epsilon^{2}\left\lVert\nabla^{t+1}{p_% {i}(x)}\right\rVert^{2}}\right)\right)$ as defined in (1). Note that $G(x)=0$ if there exists some $i$ and $t$ such that $\left\lVert\nabla^{t}{p_{i}(y)}\right\rVert>O\left(\frac{1}{\epsilon}\right)% \left\lVert\nabla^{t-1}{p_{i}(y)}\right\rVert$ . Thus, $\Pr_{y}[G(y)=1]\geq 1-O(\epsilon kd^{3})$ . We have

	$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[F(y)\right]$	$\displaystyle=\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}% \!\left[F(y)(1-G(y))\right]+\operatorname{\mathbb{E}}_{\begin{subarray}{c}y% \end{subarray}}\!\left[F(y)G(y)\right]$
		$\displaystyle\leq\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[1-G(y)\right]+\operatorname{\mathbb{E}}_{\begin{subarray}{c}Y\end{% subarray}}\!\left[F(Y)G(Y)\right]+\left\|\operatorname{\mathbb{E}}_{\begin{% subarray}{c}y\end{subarray}}\!\left[F(y)G(y)\right]-\operatorname{\mathbb{E}}% _{\begin{subarray}{c}Y\end{subarray}}\!\left[F(Y)G(Y)\right]\right\|$
		$\displaystyle\leq O(\epsilon kd^{3})+\operatorname{\mathbb{E}}_{\begin{% subarray}{c}Y\end{subarray}}\!\left[F(Y)\right]+\left\|\operatorname{\mathbb{E% }}_{\begin{subarray}{c}y\end{subarray}}\!\left[F(y)G(y)\right]-\operatorname*{% \mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F(Y)G(Y)\right]\right\|\enspace.$

Let $y=\frac{1}{\sqrt{L}}\sum_{i=1}^{L}y_{i}$ where $y_{i}\sim\mathcal{N}(0,1)^{n}$ and denote $Z^{i}=\frac{1}{\sqrt{L}}(y_{1}+\cdots+y_{i-1}+Y_{i+1}+\cdots Y_{L})$ . We have

		$\displaystyle\left\|\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[F(y)G(y)\right]-\operatorname{\mathbb{E}}_{\begin{subarray}% {c}Y\end{subarray}}\!\left[F(Y)G(Y)\right]\right\|$
	$\displaystyle\leq$	$\displaystyle\sum_{i=1}^{L}\left\|\operatorname{\mathbb{E}}_{\begin{subarray}{% c}Z^{i},Y_{i}\end{subarray}}\!\left[F\!\left(Z^{i}+\frac{1}{\sqrt{L}}Y_{i}% \right)G\!\left(Z^{i}+\frac{1}{\sqrt{L}}Y_{i}\right)\right]-\operatorname{% \mathbb{E}}_{\begin{subarray}{c}Z^{i},y_{i}\end{subarray}}\!\left[F\!\left(Z^{% i}+\frac{1}{\sqrt{L}}y_{i}\right)G\!\left(Z^{i}+\frac{1}{\sqrt{L}}y_{i}\right)% \right]\right\|$
	$\displaystyle\leq$	$\displaystyle kdL\cdot 2^{-\Omega(R)}$

where the last inequality is from Lemma 18. Thus,

\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[F(y)\right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{% subarray}}\!\left[F(Y)\right]+O(\epsilon kd^{3})+kdL\cdot 2^{-\Omega(R)}\enspace.

And the other side follows from considering $1-F(x)$ . $\hfill\blacktriangleleft$

4 Discretization

To give an explicit construction of a PRG, we need a discretization of $R$ -wise independent Gaussian distributions. In this section, we show an algorithm which outputs $L$ vectors $\left\{X_{i}\right\}_{1\leq i\leq L}$ approximating $Y_{i}$ , that is, $\left|X_{i,j}-Y_{i,j}\right|$ is sufficiently small. Before that, we first prove that if $X$ and $Y$ are close enough, then $X$ also fools any function of low-degree polynomial threshold functions.

Lemma 19.

Let $0<\epsilon,\delta<1$ , and $R\in\mathbb{N}$ be an integer. Let $Y=\frac{1}{\sqrt{L}}\sum_{i=1}^{L}Y_{i}$ where $Y_{i}$ is an $R$ -wise independent Gaussian vector of length $n$ for $1\leq i\leq L$ . Let $p_{1},\dots,p_{k}:\mathbb{R}^{n}\to\mathbb{R}$ be arbitrary polynomials of degree $d$ and $f:\{0,1\}^{k}\to\left\{0,1\right\}$ be an arbitrary Boolean function. Define functions

F(x)\coloneqq f\!\left(\mathrm{sign}\!\left(p_{1}\!(x)\right),\dots,\mathrm{% sign}\!\left(p_{k}\!(x)\right)\right)

Suppose that for any such function $F$ ,

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (Y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}% \left(0,1\right)^{n}\end{subarray}}\!\left[F(y)\right]\right|\leq\epsilon\enspace.

Suppose that $\left\{X_{i}\right\}_{1\leq i\leq L}$ are random vectors of length $n$ and there is a joint distribution over $X$ and $Y$ such that for each $1\leq i\leq L,1\leq j\leq n$ , $\Pr\left[\left|X_{i,j}-Y_{i,j}\right|\leq\delta\right]\geq 1-\delta$ .

Let $X=\frac{1}{\sqrt{L}}\sum_{i=1}^{L}X_{i}$ and we have that for any such function $F$

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}X\end{subarray}}\!\left[F% (X)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}% \left(0,1\right)^{n}\end{subarray}}\!\left[F(y)\right]\right|\leq\epsilon+k2^{% 2k}d\delta^{1/d}\sqrt{nL}\log\frac{1}{\delta}+O(2^{2k}nL\delta)\enspace.

Proof.

Let $q_{i}(x)=p_{i}(x)+\delta(nL)^{d/2}\left(\log\frac{1}{\delta}\right)^{d}$ and $\widetilde{F}(x)=f\!\left(\mathrm{sign}\!\left(q_{1}\!(x)\right),\dots,\mathrm% {sign}\!\left(q_{k}\!(x)\right)\right)$ . We will prove that

(a)

$\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[F(X)% \right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[\widetilde{F}(y)\right]+\epsilon+O(2^{2k}nL\delta)$ ,
(b)

$\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[% \widetilde{F}(y)\right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end% {subarray}}\!\left[F(y)\right]+k2^{2k}d\delta^{1/d}\sqrt{nL}\log\frac{1}{\delta}$ .

Combing (a) and (b), we have

\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[F(X)% \right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[F(y)\right]+k2^{2k}d\delta^{1/d}\sqrt{nL}\log\frac{1}{\delta}+\epsilon+O% (2^{2k}nL\delta)\enspace.

The other side can be obtained in a similar way by considering $1-F(x)$ .

Proving (a).

Since $\widetilde{F}$ is a function of degree- $d$ PTFs and $Y$ fools such a function, we have $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[% \widetilde{F}(Y)\right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end% {subarray}}\!\left[\widetilde{F}(y)\right]+\epsilon.$ Therefore, it suffices to prove that $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[F(X)% \right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[\widetilde{F}(Y)\right]+O(2^{2k}nL\delta).$

Fix a set $S\subseteq[k]$ . Let $P_{S}(x)=\prod_{i\in S}\mathrm{sign}\!\left(p_{i}(x)\right)$ and $Q_{S}(x)=\prod_{i\in S}\mathrm{sign}\!\left(q_{i}(x)\right)$ . We first show that

\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[P_{S}(X)\right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{% subarray}}\!\left[Q_{S}(Y)\right]+O(nL\delta)\enspace.

Let event $\mathcal{E}$ denote that for all $i\in[L],j\in[n]$ , $\left|Y_{i,j}\right|\leq\log\frac{1}{\delta}$ and $\left|X_{i,j}-Y_{i,j}\right|\leq\delta$ . By the tail bound of the standard Gaussian distribution, we have $\Pr[\mathcal{E}]\geq 1-O(nL\delta)$ . We have

	$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[P_{S}(X)\right]=\Pr\left[\bigwedge_{i\in S}p_{i}(X)\geq 0\right]$	$\displaystyle\leq\Pr\left[\bigwedge_{i\in S}p_{i}(X)\geq 0\wedge\mathcal{E}% \right]+\Pr\left[\overline{\mathcal{E}}\right]$
		$\displaystyle\leq\Pr\left[\bigwedge_{i\in S}p_{i}(Y)\geq-\delta(nL)^{d/2}\left% (\log\frac{1}{\delta}\right)^{d}\right]+O(nL\delta)$
		$\displaystyle=\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[Q_{S}(Y)\right]+O(nL\delta)\enspace,$

where the second inequality is by Lemma 13 and viewing $p_{i}\!\left(\frac{1}{\sqrt{L}}\sum_{i=1}^{L}X_{i}\right)$ as a degree $d$ function of $n L$ variables.

Note that $f(x)=\sum_{S\subseteq[k]}f(\mathds{1}_{S})\prod_{i\in S}x_{i}\prod_{i\notin S}% (1-x_{i})$ where $\mathds{1}_{S}$ denotes the length- $k$ string with $1$ ’s only at coordinates in $S$ . This can be further simplified in the form

f(x)=\sum_{S\subseteq[k]}c_{S}\prod_{i\in S}x_{i}\enspace,

with $\sum_{S}\left|c_{S}\right|\leq 2^{2k}$ . So, we have

	$\displaystyle\hskip-15.00002pt\operatorname{\mathbb{E}}_{\begin{subarray}{c}% \end{subarray}}\!\left[F(X)\right]=\sum_{S\subseteq[k]}c_{S}\operatorname{% \mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[P_{S}(X)\right]$	$\displaystyle=\sum_{S\subseteq[k]}c_{S}\operatorname{\mathbb{E}}_{\begin{% subarray}{c}\end{subarray}}\!\left[Q_{S}(Y)\right]+\sum_{S\subseteq[k]}c_{S}% \left(\operatorname{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[P_% {S}(X)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[Q_{S}(Y)\right]\right)$
		$\displaystyle\leq\sum_{S\subseteq[k]}c_{S}\operatorname{\mathbb{E}}_{\begin{% subarray}{c}\end{subarray}}\!\left[Q_{S}(Y)\right]+O(2^{2k}nL\delta)=% \operatorname{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[% \widetilde{F}(Y)\right]+O(2^{2k}nL\delta)\enspace.$

Proving (b).

Next we show $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[% \widetilde{F}(y)\right]$ and $\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[F(y)\right]$ are close. Note that

\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[Q_{S}(y)\right]=\Pr\left[\bigwedge_{i\in S}p_{i}(y)\geq-\delta(nL)^{d/2}% \left(\log\frac{1}{\delta}\right)^{d}\right]\leq\Pr\left[\bigwedge_{i\in S}p_{% i}(y)\geq 0\right]+kd\delta^{1/d}\sqrt{nL}\log\frac{1}{\delta}\enspace,

where the inequality is from Lemma 12. Thus, we have

\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[Q_{S}(y)% \right]\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[P_{S}(y)\right]+kd\delta^{1/d}\sqrt{nL}\log\frac{1}{\delta}\enspace.

Furthermore, we know that

	$\displaystyle\operatorname{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!% \left[\widetilde{F}(y)\right]=\sum_{S\subseteq[k]}c_{S}\operatorname{\mathbb{% E}}_{\begin{subarray}{c}\end{subarray}}\!\left[Q_{S}(y)\right]$	$\displaystyle=\sum_{S\subseteq[k]}c_{S}\operatorname{\mathbb{E}}_{\begin{% subarray}{c}\end{subarray}}\!\left[P_{S}(y)\right]+\sum_{S\subseteq[k]}c_{S}% \left(\operatorname{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}}\!\left[Q_% {S}(y)-P_{S}(y)\right]\right)$
		$\displaystyle\leq\operatorname*{\mathbb{E}}_{\begin{subarray}{c}\end{subarray}% }\!\left[F(y)\right]+k2^{2k}d\delta^{1/d}\sqrt{nL}\log\frac{1}{\delta}\enspace.\$

$\hfill\blacktriangleleft$

We now prove the main theorem for constructing an explicit pseudorandom generator. The idea is that a standard Gaussian variable can be generated using two uniform $[0,1]$ random variables through the Box–Muller transform [2]. Let $Y_{i,j}=\sqrt{-2\log u_{i,j}}\cos(2\pi v_{i,j})$ where $u_{i,j}$ and $v_{i,j}$ are uniform in $[0,1]$ . Then $Y_{i,j}$ is a Gaussian variable. Thus, if we truncate $u_{i,j}$ and $v_{i,j}$ to a certain precision and produce $X_{i,j}$ in a similar manner, $X$ approximates $Y$ with high probability.

Theorem 20.

There exists an explicit PRG which $\epsilon$ -fools any functions of any $k$ degree- $d$ polynomial threshold functions over $\mathcal{N}(0,1)^{n}$ with seed length $O\left(\frac{k^{5}d^{11}}{\epsilon^{2}}\mathrm{log}\frac{kdn}{\epsilon}\right).$

Proof.

In Theorem 16, set parameter $\epsilon$ as $\frac{\epsilon}{kd^{3}}$ and set $R$ as $C\log\frac{kd}{\epsilon}$ for some large constant $C$ . Then for $L=C^{\prime}\cdot\frac{k^{4}d^{9}}{\epsilon^{2}}\cdot\mathrm{polylog}\frac{kd}% {\epsilon}$ where $C^{\prime}$ is a large constant, we have

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F% (Y)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}% \left(0,1\right)^{n}\end{subarray}}\!\left[F(y)\right]\right|\leq O(\epsilon)\enspace.

Similar to the proof of Corollary 2 in [13], we can let $Y_{i,j}$ generated by

Y_{i,j}=\sqrt{-2\log u_{i,j}}\cos(2\pi v_{i,j})

where $u_{i}=(u_{i,1},\dots,u_{i,n})$ and $v_{i}=(v_{i,1},\dots,v_{i,n})$ are $2dR$ -wise independent uniform $[0,1]$ random vectors. Then let $u^{\prime}_{i,j}$ and $v^{\prime}_{i,j}$ be $M=C^{\prime\prime}kd\log\frac{kdn}{\epsilon}$ -bit approximations to $u_{i,j}$ and $v_{i,j}$ (i.e., round $u_{i,j}$ and $v_{i,j}$ to multipels of $2^{-M}$ ), where $C^{\prime\prime}$ is a large constant, and let

X_{i,j}=\sqrt{-2\log u^{\prime}_{i,j}}\cos(2\pi v^{\prime}_{i,j})\enspace.

Letting $\delta=\Omega(2^{-M/2})$ , for the same reason as in the proof of Corollary 2 in [13], we have $\left|X_{i,j}-Y_{i,j}\right|<\delta$ with probability at least $1-\delta$ . Then, by Lemma 19,

\left|\operatorname*{\mathbb{E}}_{\begin{subarray}{c}X\end{subarray}}\!\left[F% (X)\right]-\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\sim\mathcal{N}% \left(0,1\right)^{n}\end{subarray}}\!\left[F(y)\right]\right|\leq O(\epsilon)+% k2^{2k}d\delta^{1/d}\sqrt{nL}\log\frac{1}{\delta}+O(2^{2k}nL\delta)=O(\epsilon% )\enspace.

Note that $X_{i}$ can be generated by $2dR$ -wise independent random variables $u^{\prime}_{i,j}$ and $v^{\prime}_{i,j}$ taken uniformly from $\left\{2^{-M},2\cdot 2^{-M},3\cdot 2^{-M},\dots,1\right\}$ using $O(dRM)$ randomness. Thus generating $X$ uses $O(LdRM)=O\left(\frac{k^{5}d^{11}}{\epsilon^{2}}\mathrm{log}\frac{kdn}{\epsilon% }\right)$ randomness. $\hfill\blacktriangleleft$

References

[1] Louay M. J. Bazzi. Polylogarithmic independence can fool DNF formulas. SIAM Journal on Computing, 38(6):2220–2272, 2009. doi:10.1137/070691954.
[2] G. E. P. Box and Mervin E. Muller. A note on the generation of random normal deviates. The Annals of Mathematical Statistics, 29(2):610–611, 1958. doi:10.1214/aoms/1177706645.
[3] Anthony Carbery and James Wright. Distributional and $L^{q}$ norm inequalities for polynomials over convex bodies in $\mathbb{R}^{n}$ . Mathematical Research Letters, 8:233–248, 2001. doi:10.4310/MRL.2001.v8.n3.a1.
[4] Eshan Chattopadhyay, Anindya De, and Rocco A. Servedio. Simple and efficient pseudorandom generators from Gaussian processes. In Proceedings of the 34th Computational Complexity Conference, Dagstuhl, DEU, 2020. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2019.4.
[5] Ilias Diakonikolas, Parikshit Gopalan, Ragesh Jaiswal, Rocco A. Servedio, and Emanuele Viola. Bounded independence fools halfspaces. SIAM Journal on Computing, 39(8):3441–3462, 2010. doi:10.1137/100783030.
[6] Ilias Diakonikolas, Daniel M. Kane, and Jelani Nelson. Bounded independence fools degree- $2$ threshold functions. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 11–20, 2010. doi:10.1109/FOCS.2010.8.
[7] Parikshit Gopalan, Daniel M. Kane, and Raghu Meka. Pseudorandomness via the discrete fourier transform. SIAM Journal on Computing, 47(6):2451–2487, 2018. doi:10.1137/16M1062132.
[8] Parikshit Gopalan, Ryan O’Donnell, Yi Wu, and David Zuckerman. Fooling functions of halfspaces under product distributions. In 2010 IEEE 25th Annual Conference on Computational Complexity, pages 223–234, 2010. doi:10.1109/CCC.2010.29.
[9] Prahladh Harsha, Adam Klivans, and Raghu Meka. An invariance principle for polytopes. J. ACM, 59(6), 2013. doi:10.1145/2395116.2395118.
[10] William B Johnson, Joram Lindenstrauss, and Gideon Schechtman. Extensions of Lipschitz maps into Banach spaces. Israel Journal of Mathematics, 54(2):129–138, 1986. doi:10.1007/BF02764938.
[11] Daniel Kane, Raghu Meka, and Jelani Nelson. Almost optimal explicit johnson-lindenstrauss families. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 628–639, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg. doi:10.1007/978-3-642-22935-0_53.
[12] Daniel M. Kane. $k$ -independent Gaussians fool polynomial threshold functions. In 2011 IEEE 26th Annual Conference on Computational Complexity, pages 252–261, 2011. doi:10.1109/CCC.2011.13.
[13] Daniel M. Kane. A small PRG for polynomial threshold functions of Gaussians. In Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 257–266, USA, 2011. IEEE Computer Society. doi:10.1109/FOCS.2011.16.
[14] Daniel M. Kane. A structure theorem for poorly anticoncentrated Gaussian chaoses and applications to the study of polynomial threshold functions. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 91–100, 2012. doi:10.1109/FOCS.2012.52.
[15] Daniel M. Kane. A pseudorandom generator for polynomial threshold functions of Gaussian with subpolynomial seed length. In 2014 IEEE 29th Conference on Computational Complexity, pages 217–228, 2014. doi:10.1109/CCC.2014.30.
[16] Daniel M. Kane. A Polylogarithmic PRG for Degree 2 Threshold Functions in the Gaussian Setting. In David Zuckerman, editor, 30th Conference on Computational Complexity (CCC 2015), volume 33 of Leibniz International Proceedings in Informatics (LIPIcs), pages 567–581, Dagstuhl, Germany, 2015. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2015.567.
[17] Zander Kelley and Raghu Meka. Random restrictions and PRGs for PTFs in Gaussian space. In Shachar Lovett, editor, 37th Computational Complexity Conference (CCC 2022), volume 234 of Leibniz International Proceedings in Informatics (LIPIcs), pages 21:1–21:24, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2022.21.
[18] Adam R. Klivans, Ryan O’Donnell, and Rocco A. Servedio. Learning geometric concepts via Gaussian surface area. In 2008 49th Annual IEEE Symposium on Foundations of Computer Science, pages 541–550, 2008. doi:10.1109/FOCS.2008.64.
[19] Pravesh K. Kothari and Raghu Meka. Almost optimal pseudorandom generators for spherical caps: extended abstract. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pages 247–256, New York, NY, USA, 2015. Association for Computing Machinery. doi:10.1145/2746539.2746611.
[20] J. W. Lindeberg. Eine neue herleitung des exponentialgesetzes in der wahrscheinlichkeitsrechnung. Mathematische Zeitschrift, 15:211–225, 1922. doi:10.1007/BF01494395.
[21] Raghu Meka and David Zuckerman. Pseudorandom generators for polynomial threshold functions. SIAM Journal on Computing, 42(3):1275–1301, 2013. doi:10.1137/100811623.
[22] Jorge Nocedal and Stephen J. Wright, editors. Quadratic Programming, pages 438–486. Springer New York, New York, NY, 1999. doi:10.1007/0-387-22742-3_16.
[23] Ryan O’Donnell. Analysis of boolean functions. Cambridge University Press, 2014. doi:10.1017/CBO9781139814782.
[24] Ryan O’Donnell, Rocco A. Servedio, and Li-Yang Tan. Fooling Gaussian PTFs via local hyperconcentration. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 1170–1183, New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3357713.3384281.
[25] Ryan O’Donnell, Rocco A. Servedio, and Li-Yang Tan. Fooling polytopes. J. ACM, 69(2), January 2022. doi:10.1145/3460532.
[26] Alexander Razborov. A simple proof of Bazzi’s theorem. ACM Trans. Comput. Theory, 1(1), February 2009. doi:10.1145/1490270.1490273.
[27] R. A. Servedio and L. Tan. Fooling intersections of low-weight halfspaces. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 824–835, Los Alamitos, CA, USA, October 2017. IEEE Computer Society. doi:10.1109/FOCS.2017.81.
[28] Rocco A. Servedio. Every linear threshold function has a low-weight approximator. In 21st Annual IEEE Conference on Computational Complexity (CCC’06), pages 18–32, 2006. doi:10.1109/CCC.2006.18.

Appendix A Facts about Bump Function

$\blacktriangleright$ Fact 7

For all $t\in\mathbb{N}$ , $\left|\Psi^{(t)}(x)\right|\leq t^{(3+o(1))t}$ .

Proof.

It is easy to check there exists a series of polynomials $\left\{P_{t}\right\}_{t\in\mathbb{N}}$ such that for $x\in(-1,1)$

\Psi^{(t)}(x)=\frac{P_{t}(x)}{(1-x^{2})^{2t}}\cdot\Psi(x)\enspace.

Besides, $\left\{P_{t}\right\}_{t\in\mathbb{N}}$ has the following recursion:

P_{0}(x)=1,\enspace P_{1}(x)=-2x,\enspace P_{t}=(1-x^{2})^{2}P^{\prime}_{t-1}(% x)+4(t-1)x(1-x^{2})P_{t-1}(x)-2xP_{t-1}(x)\enspace.

The degree of $P_{t}$ is at most $3t$ . Therefore we have

\left|\Psi^{(t)}(x)\right|\leq\left(\max_{x\in(-1,1)}\left|P_{t}(x)\right|% \right)\cdot\left(\max_{x\in(-1,1)}\frac{\Psi(x)}{(1-x^{2})^{2t}}\right)\enspace.

Let $f(x)=x\cdot e^{-\frac{x}{2t}}$ for $x\in(1,+\infty)$ . We have $f(x)\leq f(2t)=\frac{2t}{e}$ by a simple calculation. Thus, we have that $\frac{\Psi(x)}{(1-x^{2})^{2t}}\leq\left(\frac{2t}{e}\right)^{2t}$ . We are left to bound $\max_{x\in(-1,1)}\left|P_{t}(x)\right|$ .

Define $\left\lVert P\right\rVert_{1}$ be the sum of absolute values of all coefficients of the polynomial $P$ . Since $x\in(-1,1)$ , we know $\max_{x\in(-1,1)}\left|P_{t}(x)\right|\leq\left\lVert P_{t}\right\rVert_{1}$ . By the recursion, we have

\left\lVert P_{t}\right\rVert_{1}\leq 4\left\lVert P_{t-1}^{\prime}\right% \rVert_{1}+8t\left\lVert P_{t-1}\right\rVert_{1}\leq 20t\left\lVert P_{t-1}% \right\rVert_{1}

where the last inequality is from $\left\lVert P_{t-1}^{\prime}\right\rVert_{1}\leq 3t\left\lVert P_{t-1}\right% \rVert_{1}$ since the degree of $P_{t-1}$ is at most $3t-3$ . So we know $\max_{x\in(-1,1)}\leq 20^{t}\cdot t!.$ Therefore,

\left|\Psi^{(t)}(x)\right|\leq\left(\frac{2t}{e}\right)^{2t}\cdot 20^{t}\cdot t% !=t^{(3+o(1))t}\enspace.\

$\hfill\blacktriangleleft$

$\blacktriangleright$ Fact 8

For all $t\in\mathbb{N}$ , $\left|\rho^{(t)}(x)\right|\leq t^{(3+o(1))t}$ .

Proof.

Directly from Fact 7 by observing that $\rho(x)=e\cdot\Psi(1-x)$ for $0<x<1$ and is constant elsewhere. $\hfill\blacktriangleleft$

$\blacktriangleright$ Fact 9

Let $r(u,v)\coloneq\rho(\log u-\log v+c)$ for some constant $c$ . Then we have that for all $n,m\in\mathbb{N}$ , $\left|\frac{\partial^{n}\partial^{m}r(u,v)}{\partial u^{n}\partial v^{m}}% \right|\leq\frac{(n+m)^{6(n+m)}}{\left|u\right|^{n}\left|v\right|^{m}}$ .

Proof.

Let $g(u,v)=\log u-\log v+c$ . Then by the generalized chain rule for the derivative of the composition of two functions (also known as Faà di Bruno’s formula), we have

	$\displaystyle\frac{\partial^{n}\partial^{m}r(u,v)}{\partial u^{n}\partial v^{m% }}=$	$\displaystyle\sum_{\begin{subarray}{c}(a_{1},\dots,a_{n})\in\mathbb{N}^{n}\\ a_{1}+2\cdot a_{2}+\cdots+n\cdot a_{n}=n\end{subarray}}\sum_{\begin{subarray}{% c}(b_{1},\dots,b_{m})\in\mathbb{N}^{m}\\ b_{1}+2\cdot b_{2}+\cdots+m\cdot b_{m}=m\end{subarray}}\frac{n!m!}{\prod_{i=1}% ^{n}\left(i!\right)^{a_{i}}a_{i}!\prod_{i=1}^{m}\left(i!\right)^{b_{i}}b_{i}!}$
		$\displaystyle\enspace\enspace\enspace\cdot\rho^{(a_{1}+\cdots+a_{n}+b_{1}+% \cdots+b_{m})}(g(u,v))\cdot\prod_{i=1}^{n}\left(\frac{(-1)^{i}i!}{u^{i}}\right% )^{a_{i}}\prod_{i=1}^{m}\left(\frac{(-1)^{i+1}i!}{v^{i}}\right)^{b_{i}}\enspace.$

Therefore

\displaystyle\left|\frac{\partial^{n}\partial^{m}r(u,v)}{\partial u^{n}% \partial v^{m}}\right|\leq n^{n}\cdot m^{m}\cdot n!\cdot m!\cdot(m+n)^{4(m+n)}% \cdot\frac{1}{\left|u\right|^{n}\left|v\right|^{m}}\leq\frac{(n+m)^{6(n+m)}}{% \left|u\right|^{n}\left|v\right|^{m}}\enspace.\

$\hfill\blacktriangleleft$

[bib.bib1] [1] Louay M. J. Bazzi. Polylogarithmic independence can fool DNF formulas. SIAM Journal on Computing, 38(6):2220–2272, 2009. doi:10.1137/070691954.

[bib.bib2] [2] G. E. P. Box and Mervin E. Muller. A note on the generation of random normal deviates. The Annals of Mathematical Statistics, 29(2):610–611, 1958. doi:10.1214/aoms/1177706645.

[bib.bib3] [3] Anthony Carbery and James Wright. Distributional and $L^{q}$ norm inequalities for polynomials over convex bodies in $\mathbb{R}^{n}$ . Mathematical Research Letters, 8:233–248, 2001. doi:10.4310/MRL.2001.v8.n3.a1.

[bib.bib4] [4] Eshan Chattopadhyay, Anindya De, and Rocco A. Servedio. Simple and efficient pseudorandom generators from Gaussian processes. In Proceedings of the 34th Computational Complexity Conference, Dagstuhl, DEU, 2020. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2019.4.

[bib.bib5] [5] Ilias Diakonikolas, Parikshit Gopalan, Ragesh Jaiswal, Rocco A. Servedio, and Emanuele Viola. Bounded independence fools halfspaces. SIAM Journal on Computing, 39(8):3441–3462, 2010. doi:10.1137/100783030.

[bib.bib6] [6] Ilias Diakonikolas, Daniel M. Kane, and Jelani Nelson. Bounded independence fools degree- $2$ threshold functions. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 11–20, 2010. doi:10.1109/FOCS.2010.8.

[bib.bib7] [7] Parikshit Gopalan, Daniel M. Kane, and Raghu Meka. Pseudorandomness via the discrete fourier transform. SIAM Journal on Computing, 47(6):2451–2487, 2018. doi:10.1137/16M1062132.

[bib.bib8] [8] Parikshit Gopalan, Ryan O’Donnell, Yi Wu, and David Zuckerman. Fooling functions of halfspaces under product distributions. In 2010 IEEE 25th Annual Conference on Computational Complexity, pages 223–234, 2010. doi:10.1109/CCC.2010.29.

[bib.bib9] [9] Prahladh Harsha, Adam Klivans, and Raghu Meka. An invariance principle for polytopes. J. ACM, 59(6), 2013. doi:10.1145/2395116.2395118.

[bib.bib10] [10] William B Johnson, Joram Lindenstrauss, and Gideon Schechtman. Extensions of Lipschitz maps into Banach spaces. Israel Journal of Mathematics, 54(2):129–138, 1986. doi:10.1007/BF02764938.

[bib.bib11] [11] Daniel Kane, Raghu Meka, and Jelani Nelson. Almost optimal explicit johnson-lindenstrauss families. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 628–639, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg. doi:10.1007/978-3-642-22935-0_53.

[bib.bib12] [12] Daniel M. Kane. $k$ -independent Gaussians fool polynomial threshold functions. In 2011 IEEE 26th Annual Conference on Computational Complexity, pages 252–261, 2011. doi:10.1109/CCC.2011.13.

[bib.bib13] [13] Daniel M. Kane. A small PRG for polynomial threshold functions of Gaussians. In Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 257–266, USA, 2011. IEEE Computer Society. doi:10.1109/FOCS.2011.16.

[bib.bib14] [14] Daniel M. Kane. A structure theorem for poorly anticoncentrated Gaussian chaoses and applications to the study of polynomial threshold functions. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 91–100, 2012. doi:10.1109/FOCS.2012.52.

[bib.bib15] [15] Daniel M. Kane. A pseudorandom generator for polynomial threshold functions of Gaussian with subpolynomial seed length. In 2014 IEEE 29th Conference on Computational Complexity, pages 217–228, 2014. doi:10.1109/CCC.2014.30.

[bib.bib16] [16] Daniel M. Kane. A Polylogarithmic PRG for Degree 2 Threshold Functions in the Gaussian Setting. In David Zuckerman, editor, 30th Conference on Computational Complexity (CCC 2015), volume 33 of Leibniz International Proceedings in Informatics (LIPIcs), pages 567–581, Dagstuhl, Germany, 2015. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2015.567.

[bib.bib17] [17] Zander Kelley and Raghu Meka. Random restrictions and PRGs for PTFs in Gaussian space. In Shachar Lovett, editor, 37th Computational Complexity Conference (CCC 2022), volume 234 of Leibniz International Proceedings in Informatics (LIPIcs), pages 21:1–21:24, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2022.21.

[bib.bib18] [18] Adam R. Klivans, Ryan O’Donnell, and Rocco A. Servedio. Learning geometric concepts via Gaussian surface area. In 2008 49th Annual IEEE Symposium on Foundations of Computer Science, pages 541–550, 2008. doi:10.1109/FOCS.2008.64.

[bib.bib19] [19] Pravesh K. Kothari and Raghu Meka. Almost optimal pseudorandom generators for spherical caps: extended abstract. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pages 247–256, New York, NY, USA, 2015. Association for Computing Machinery. doi:10.1145/2746539.2746611.

[bib.bib20] [20] J. W. Lindeberg. Eine neue herleitung des exponentialgesetzes in der wahrscheinlichkeitsrechnung. Mathematische Zeitschrift, 15:211–225, 1922. doi:10.1007/BF01494395.

[bib.bib21] [21] Raghu Meka and David Zuckerman. Pseudorandom generators for polynomial threshold functions. SIAM Journal on Computing, 42(3):1275–1301, 2013. doi:10.1137/100811623.

[bib.bib22] [22] Jorge Nocedal and Stephen J. Wright, editors. Quadratic Programming, pages 438–486. Springer New York, New York, NY, 1999. doi:10.1007/0-387-22742-3_16.

[bib.bib23] [23] Ryan O’Donnell. Analysis of boolean functions. Cambridge University Press, 2014. doi:10.1017/CBO9781139814782.

[bib.bib24] [24] Ryan O’Donnell, Rocco A. Servedio, and Li-Yang Tan. Fooling Gaussian PTFs via local hyperconcentration. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 1170–1183, New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3357713.3384281.

[bib.bib25] [25] Ryan O’Donnell, Rocco A. Servedio, and Li-Yang Tan. Fooling polytopes. J. ACM, 69(2), January 2022. doi:10.1145/3460532.

[bib.bib26] [26] Alexander Razborov. A simple proof of Bazzi’s theorem. ACM Trans. Comput. Theory, 1(1), February 2009. doi:10.1145/1490270.1490273.

[bib.bib27] [27] R. A. Servedio and L. Tan. Fooling intersections of low-weight halfspaces. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 824–835, Los Alamitos, CA, USA, October 2017. IEEE Computer Society. doi:10.1109/FOCS.2017.81.

[bib.bib28] [28] Rocco A. Servedio. Every linear threshold function has a low-weight approximator. In 21st Annual IEEE Conference on Computational Complexity (CCC’06), pages 18–32, 2006. doi:10.1109/CCC.2006.18.

$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left\|\Delta\right\|\right]$	$\displaystyle\leq\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[\Delta\cdot\mathds{1}_{\mathcal{E}}\right]+\operatorname{\mathbb{E}% }_{\begin{subarray}{c}y\end{subarray}}\!\left[g(s,r)\cdot\mathds{1}_{\overline% {\mathcal{E}}}\right]+\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[\left\|l(s,r)\right\|\cdot\mathds{1}_{\overline{\mathcal{E}}}\right]$
	$\displaystyle\leq\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[\Delta\cdot\mathds{1}_{\mathcal{E}}\right]+\operatorname{\mathbb{E}% }_{\begin{subarray}{c}y\end{subarray}}\!\left[\mathds{1}_{\overline{\mathcal{E% }}}\right]+\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[\left\|l(s,r)\right\|\cdot\mathds{1}_{\overline{\mathcal{E}}}\right]$
	$\displaystyle\leq\underbrace{\operatorname{\mathbb{E}}_{\begin{subarray}{c}y% \end{subarray}}\!\left[\Delta\cdot\mathds{1}_{\mathcal{E}}\right]}_{(\text{% Term }1)}+\underbrace{\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[\mathds{1}_{\overline{\mathcal{E}}}\right]}_{(\text{Term }2)% }+\underbrace{\sqrt{\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{% subarray}}\!\left[l^{2}(s,r)\right]}}_{(\text{Term }3)}\cdot\sqrt{% \operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!\left[\mathds% {1}_{\overline{\mathcal{E}}}\right]}\enspace.$	(Cauchy–Schwarz)

	$\displaystyle\left\|\Delta\right\|$	$\displaystyle\leq\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)\in% \mathbb{N}^{kd},\left(\beta^{t}_{i}\right)\in\mathbb{N}^{kd}\\ \left\|\alpha\right\|+\left\|\beta\right\|=R\end{subarray}}\frac{\left\|\partial^{% \alpha}_{s}\partial^{\beta}_{r}g(s^{},r^{})\right\|}{\alpha!\beta!}\prod_{i=1% }^{k}\prod_{t=0}^{d-1}\left\|s_{i}^{t}-a_{i}^{t}\right\|^{\alpha_{i}^{t}}\left\|r% _{i}^{t}-b_{i}^{t}\right\|^{\beta_{i}^{t}}$
		$\displaystyle\leq\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)\in% \mathbb{N}^{kd},\left(\beta^{t}_{i}\right)\in\mathbb{N}^{kd}\\ \left\|\alpha\right\|+\left\|\beta\right\|=R\end{subarray}}{R^{6R}}\cdot\prod_{i=1% }^{k}\prod_{t=0}^{d-1}\frac{\left\|s_{i}^{t}-a_{i}^{t}\right\|^{\alpha_{i}^{t}}}% {\left\|s^{t}_{i}\right\|^{\alpha_{i}^{t}}}\frac{\left\|r_{i}^{t}-b_{i}^{t}% \right\|^{\beta_{i}^{t}}}{\left\|r^{t}_{i}\right\|^{\beta_{i}^{t}}}$
		$\displaystyle\leq\sum_{\begin{subarray}{c}\left(\alpha^{t}_{i}\right)\in% \mathbb{N}^{kd},\left(\beta^{t}_{i}\right)\in\mathbb{N}^{kd}\\ \left\|\alpha\right\|+\left\|\beta\right\|=R\end{subarray}}{R^{6R}}\cdot\left(% \frac{2\delta+\delta^{2}}{(1-\delta)^{2}}\right)^{R}=\binom{R+2kd-1}{R}\cdot{R% ^{6R}}\cdot\left(4\delta\right)^{R}\leq 2^{-R}\enspace,$

	$\displaystyle\operatorname*{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}\!% \left[F(y)\right]$	$\displaystyle=\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray}}% \!\left[F(y)(1-G(y))\right]+\operatorname{\mathbb{E}}_{\begin{subarray}{c}y% \end{subarray}}\!\left[F(y)G(y)\right]$
		$\displaystyle\leq\operatorname{\mathbb{E}}_{\begin{subarray}{c}y\end{subarray% }}\!\left[1-G(y)\right]+\operatorname{\mathbb{E}}_{\begin{subarray}{c}Y\end{% subarray}}\!\left[F(Y)G(Y)\right]+\left\|\operatorname{\mathbb{E}}_{\begin{% subarray}{c}y\end{subarray}}\!\left[F(y)G(y)\right]-\operatorname{\mathbb{E}}% _{\begin{subarray}{c}Y\end{subarray}}\!\left[F(Y)G(Y)\right]\right\|$
		$\displaystyle\leq O(\epsilon kd^{3})+\operatorname{\mathbb{E}}_{\begin{% subarray}{c}Y\end{subarray}}\!\left[F(Y)\right]+\left\|\operatorname{\mathbb{E% }}_{\begin{subarray}{c}y\end{subarray}}\!\left[F(y)G(y)\right]-\operatorname*{% \mathbb{E}}_{\begin{subarray}{c}Y\end{subarray}}\!\left[F(Y)G(Y)\right]\right\|\enspace.$

A Pseudorandom Generator for Functions of Low-Degree Polynomial Threshold Functions

Abstract

Keywords and phrases:

Category:

Copyright and License:

2012 ACM Subject Classification:

Related Version:

Funding:

DOI:

Event:

Editors:

Series and Publisher:

1 Introduction

Definition 1.

Definition 2.

1.1 Prior Work

1.2 Main Result

Theorem 3 (Informal version of Theorem 20).

(1) Bounded independence fools functions of 𝒌 degree-𝒅 PTFs

Theorem 4.

(2) Discretization of bounded independence Gaussians

Lemma 5 (Informal version of Lemma 19).

2 Preliminary

Basic Notation

Derivatives and Multidimensional Taylor Expansion

Theorem 6 (Multidimensional Taylor’s Theorem).

Bump Function

Fact 7.

Fact 8.

Fact 9.

Gaussian Space and the Gaussian Noise Operator

Lemma 10 (Lemma 16 in [17]).

Theorem 11.

Low-Degree Polynomials

Lemma 12 (Theorem 8 in [3]).

Lemma 13 (Lemma 22 in [13]).

Lemma 14 (Lemma 6 in [17]).

Lemma 15 (Lemma 23 in [17]).

3 Fooling the Functions of PTFs via Bounded Independence

Theorem 16.

3.1 Fooling the Mollifier 𝑮

Lemma 17.

Proof.

Bounding Term 1.

Bounding Term 2.

Bounding Term 3.

3.2 A Single Step in the Hybrids

Lemma 18.

Proof.

Showing (a).

Showing (b).

3.3 Proof of Theorem 16

Proof of Theorem 16.

4 Discretization

Lemma 19.

Proof.

Proving (a).

Proving (b).

Theorem 20.

Proof.

References

Appendix A Facts about Bump Function

Proof.

Proof.

Proof.

(1) Bounded independence fools functions of $𝒌$ degree- $𝒅$ PTFs

3.1 Fooling the Mollifier $𝑮$