
Pseudorandom Bits for Non-Commutative Programs

Chin Ho Lee ORCID North Carolina State University, Raleigh, NC, USA Emanuele Viola ORCID Northeastern University, Boston, MA, USA
Abstract

We obtain new explicit pseudorandom generators for several computational models involving groups. Our main results are as follows:

  1.

    We consider read-once group-products over a finite group $G$, i.e., tests of the form $\prod_{i=1}^n g_i^{x_i}$ where $g_i\in G$, a special case of read-once permutation branching programs. We give generators with optimal seed length $c_G\log(n/\varepsilon)$ over any $p$-group. The proof uses the small-bias plus noise paradigm, but derandomizes the noise to avoid the recursion in previous work. Our generator works when the bits are read in any order. Previously, for any non-commutative group the best seed length was $\log n\cdot\log(1/\varepsilon)$, even for a fixed order.

  2.

    We give a reduction that “lifts” suitable generators for group products over $G$ to a generator that fools width-$w$ block products, i.e., tests of the form $\prod_i g_i^{f_i}$ where the $f_i$ are arbitrary functions on disjoint blocks of $w$ bits. Block products generalize several previously studied classes. The reduction applies to groups that are mixing in a representation-theoretic sense that we identify.

  3.

    Combining (2) with (1) and other works we obtain new generators for block products over the quaternions or over any commutative group, with nearly optimal seed length. In particular, we obtain generators for read-once polynomials modulo any fixed m with nearly optimal seed length. Previously this was known only for m=2.

  4.

    We give a new generator for products over “mixing groups.” The construction departs from previous work and uses representation theory. For constant error, we obtain optimal seed length, improving on previous work (which applied to any group).

This paper identifies a challenge in the area that is reminiscent of a roadblock in circuit complexity – handling composite moduli – and points to several classes of groups to be attacked next.

Keywords and phrases:
Group programs, Space-bounded derandomization, Representation theory
Funding:
Chin Ho Lee: Work done in part at Harvard University, supported by Madhu Sudan’s and Salil Vadhan’s Simons Investigator Awards.
Emanuele Viola: Supported by NSF grants CCF-2114116 and CCF-2430026.
Copyright and License:
© Chin Ho Lee and Emanuele Viola; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Pseudorandomness and derandomization
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2025/071/
Acknowledgements:
We thank Yves de Cornulier for answering a question about Dedekind groups and providing a proof of Lemma 9.
Editors:
Srikanth Srinivasan

1 Introduction

The construction of explicit pseudorandom generators is a fundamental research goal that has applications in many areas of theoretical computer science. For background we refer to the recent survey [24]. We first define pseudorandom generators, incorporating the variants of any order (reflected in the permutation π) and non-Boolean tests (reflected in the range set R).

Definition 1 (Pseudorandom generators (PRGs)).

An explicit function $P:\{0,1\}^s\to\{0,1\}^n$ is a pseudorandom generator (PRG) with seed length $s$ and error $\varepsilon$ for a class of functions $F$ mapping $\{0,1\}^n$ to a set $R$ if for every $f\in F$ the statistical distance between $f(P(U_s))$ and $f(U_n)$ is at most $\varepsilon$, where $U_s$ denotes the uniform distribution over $\{0,1\}^s$. We say $P$ fools $F$ in any order if $\pi(P)$ fools $F$ for every permutation $\pi$ of the positions of the $n$ input bits. A PRG is explicit if it is computable in time $n^c$.

PRGs for branching programs, and group programs.

A main agenda is obtaining explicit pseudorandom generators for read-once branching programs (ROBPs), with an ultimate goal of proving BPL=L. However, even for constant-width permutation ROBPs, the best known seed length is $\log n\cdot\log(1/\varepsilon)$. This is $\log^2 n$ when $\varepsilon=1/n$, and thus falls short of the optimal seed length $c\log(n/\varepsilon)$. For permutation ROBPs of width $w$, seed length $c_w\log(n/\varepsilon)\log(\varepsilon^{-1}\log n)$ follows from instantiating the “Polarizing Random Walks” framework [9] with a bound from [43, 28]. These generators work in any order; thus they essentially match the seed length $c_w\log(n/\varepsilon)\log(1/\varepsilon)$ that was already available for fixed order in a sequence of exciting works culminating in [46].

The class of permutation ROBPs is equivalent to group programs (see e.g. [25]):

Definition 2.

A program (or product) $p$ of length $n$ over a group $G$ is a tuple $(g_1,g_2,\dots,g_n)\in G^n$. The program computes the function $f_p:\{0,1\}^n\to G$ given by $f_p(x):=\prod_{i\in[n]}g_i^{x_i}$.
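As a concrete illustration (ours, not from the paper), a group program over the symmetric group $\mathbb{S}_3$ can be evaluated by multiplying in the element $g_i$ exactly when the input bit $x_i$ is 1; permutations are represented as tuples over $\{0,1,2\}$.

```python
# A minimal sketch of Definition 2: evaluating a group program over S3.
# Permutations are tuples; compose(g, h) is the permutation g after h.
def compose(g, h):
    # (g*h)(k) = g(h(k))
    return tuple(g[h[k]] for k in range(len(h)))

IDENTITY = (0, 1, 2)

def run_program(gs, x):
    # Computes prod_i g_i^{x_i}: multiply g_i in exactly when x_i = 1.
    out = IDENTITY
    for g, xi in zip(gs, x):
        if xi:
            out = compose(out, g)
    return out
```

For example, `run_program([(1,0,2),(1,2,0)], [1,1])` multiplies a transposition by a 3-cycle and returns `(0,2,1)`.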

No generator with seed length less than $\log n\cdot\log(1/\varepsilon)$ was available for any non-commutative group. While optimal seed length $c\log(n/\varepsilon)$ was known for $\mathbb{Z}_2$ since [40], it took nearly 20 years and different techniques to obtain the same seed length over $\mathbb{Z}_3$ [35, 38], and remarkably that seed length is still not available even for $\mathbb{Z}_6$ (see [18] for the best known construction).

PRGs for read-once polynomials.

Another model that has received significant attention is read-once polynomials. Intuitively, this model can serve as a bridge between permutation and non-permutation ROBPs. The available generators for non-permutation ROBPs have significantly worse seed length than for permutation programs, see e.g. [37] and the discussion there.

A sequence of works [30, 37, 14] culminated in PRGs with seed length $c\log n+\log(1/\varepsilon)\log^c\log(1/\varepsilon)$ for read-once polynomials over $\mathbb{Z}_2$. But for other domains, such as $\mathbb{Z}_3$, such good seed lengths were not known.

PRGs for block-products.

A more general model that generalizes and unifies the previous ones is what we call block products of width $w$ over a group $G$. Here, the input bits are arbitrarily partitioned into blocks of $w$ bits, arbitrary Boolean functions are then applied to each block, and finally the outputs are used as exponents of group elements. For our results, we will need to allow one block to be larger; we call this block the spill and incorporate it in the following definition.

Definition 3 (Block-product with spill).

A function $f:\{0,1\}^n\to G$ is computable by a $w$-block product with $\ell$ terms and a spill of $q$ bits, written as an $(\ell,w,q)$-product, over a group $G$ if there exist $\ell+1$ disjoint subsets $I_0,I_1,\dots,I_\ell\subseteq[n]$, where $|I_0|\le q$ and $|I_i|\le w$ for each $i\in[\ell]$, such that

$f(x)=\prod_{i=0}^{\ell}g_i^{f_i(x_{I_i})}$

for some group elements $g_i\in G$ and functions $f_i:\{0,1\}^{I_i}\to\{0,1\}$. Here $x_{I_i}$ denotes the $|I_i|$ bits of $x$ indexed by $I_i$.
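The definition above can be sketched in code (our illustration; names are ours) for the cyclic group $\mathbb{Z}_m$, where exponentiation becomes multiplication in additive notation.

```python
# A hedged sketch of Definition 3 over the cyclic group Z_m: disjoint
# index sets I_i, arbitrary Boolean functions f_i, group elements g_i.
def block_product(x, blocks, m):
    # blocks: list of (g_i, I_i, f_i) with g_i in Z_m, the I_i disjoint
    # index sets, and f_i a Boolean function of the selected bits.
    total = 0
    for g, idx, f in blocks:
        bits = tuple(x[j] for j in idx)
        total = (total + g * f(bits)) % m  # g_i^{f_i} in additive notation
    return total
```

For instance, with one XOR block and one AND block over $\mathbb{Z}_3$, the product evaluates to the sum of the triggered exponents modulo 3.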

Note that block products are unordered by definition. They generalize several function classes that have been studied, including modular sums [34, 39, 18] (when $G$ is a cyclic group and $w=1$), product tests with outputs in $\{-1,1\}$ (a.k.a. combinatorial checkerboards) [52, 23, 29, 31, 27] (when $G=\mathbb{Z}_2$), themselves a generalization of combinatorial rectangles [5, 36, 19], and unordered combinatorial shapes [20, 18] (when $G=\mathbb{Z}_{m+1}$). Block products also generalize read-once polynomials, because one can show (under the uniform and typically also pseudorandom distributions) that monomials of degree larger than $\log(n/\varepsilon)$ do not affect the result significantly, and so one can simulate these polynomials with blocks of size $\log(n/\varepsilon)$.

In terms of generators, a series of works culminating in [27] gives nearly optimal seed length (i.e., $w+\log(\ell/\varepsilon)$ up to lower-order factors) over $\mathbb{Z}_2$. But such a result was not known over other groups, such as $\mathbb{Z}_3$ or any non-commutative group.

1.1 Our results

In this work we bring new techniques, notably from group theory, to bear on these problems, and use them to obtain new pseudorandom generators.

First, we obtain optimal seed length for products over p-groups.

Definition 4.

A finite $p$-group is a group of order $p^k$ for an integer $k$ and a prime $p$.

Equivalently, the order of every element is a power of $p$. (The latter definition makes sense for infinite groups, but we only consider finite groups.) The class of $p$-groups is rich and has been studied in various areas of the theory of computation. For example, $p$-groups remain a candidate for good group-theoretic algorithms for matrix multiplication [7]; isomorphism testing for a subclass of $p$-groups has been identified as a barrier to faster group-isomorphism algorithms [47]; $p$-groups (specifically, unitriangular groups) are used for cryptography in NC0 [4] (see [50] for an exposition emphasizing these groups); finally, $p$-groups (specifically, the quaternions) are used in computer graphics to express 3D rotations [26].

We now give a few examples of such groups, all of which are non-commutative.

  • The quaternion group $\mathbb{Q}_8$ of order 8 is a 2-group.

  • Unitriangular groups over $\mathbb{F}_p$ are $p$-groups. They consist of the upper-triangular matrices (of some fixed dimension) with 1 on the diagonal and entries in $\mathbb{F}_p$.

  • Wreath products give natural examples of $p$-groups. For example, the wreath product $\mathbb{Z}_p\wr\mathbb{Z}_p$ is a group of order $p^{p+1}$, hence a $p$-group. This group is the direct product $\mathbb{Z}_p^p$ with another element of $\mathbb{Z}_p$ acting on the tuple by shifting the coordinates. For concreteness, the case $p=2$ can be presented as triples $(a,b;z)$ where $a,b,z\in\mathbb{Z}_2$, $(a',b';0)(a,b;z)=(a'+a,b'+b;z)$, and $(a',b';1)(a,b;z)=(a'+b,b'+a;1+z)$. Wreath-product constructions (not necessarily $p$-groups) have been studied in a variety of contexts, ranging from group-theoretic algorithms for matrix multiplication [10], to constructions of expander graphs [3, 44], to mixing in non-quasirandom groups [22].

  • The dihedral group $\mathbb{D}_n$ is the group of order $2n$ of symmetries of a regular polygon with $n$ sides. When $n=2^t$, $\mathbb{D}_n$ is a 2-group.
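The multiplication rule stated above for the case $p=2$ of the wreath product can be checked mechanically. The following sketch (ours) implements it and verifies, by brute force, that it defines a non-commutative group of order $8=2^3$, i.e. a 2-group.

```python
# Sketch of Z2 wr Z2 via the rule in the text: elements (a, b; z), where
# z = 1 swaps the pair it acts on.
def mul(u, v):
    (a1, b1, z1), (a2, b2, z2) = u, v
    if z1 == 0:
        return ((a1 + a2) % 2, (b1 + b2) % 2, z2)
    return ((a1 + b2) % 2, (b1 + a2) % 2, (1 + z2) % 2)

ELEMS = [(a, b, z) for a in (0, 1) for b in (0, 1) for z in (0, 1)]
```

Enumerating all $8^3$ triples confirms associativity, and two elements that multiply differently in the two orders witness non-commutativity.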

We give pseudorandom generators for programs over $p$-groups, with optimal seed length. Throughout this paper, we use $c_x$ to denote a constant that depends on the variable $x$.

Theorem 5.

Let $G$ be a $p$-group. There is an explicit pseudorandom generator that fools programs of length $n$ over $G$ in any order, with seed length $c_G\log(n/\varepsilon)$.

In fact, the same result holds even for block-products over p-groups with constant block length w.

Polynomials and block-products.

We give a general reduction that “lifts” a PRG $P_1$ for group products over $G$ to a PRG for block products (and read-once polynomials) over $G$. The reduction applies to any group $G$ that is mixing:

Definition 6 (Mixing groups).

A (finite) group $G$ is mixing if for every irreducible (unitary) representation $\rho$ and every element $g\in G$, the matrix $\rho(g)$ either equals the identity or has no eigenvalue equal to 1.

 Remark 7.

Our results for mixing groups (Theorems 10 and 12) apply more generally to fooling words over a mixing subset $H$ of a (not necessarily mixing) group $G$. The property we need is that Definition 6 holds for every element in $H$. There are many examples of mixing subsets of non-mixing groups that generate the entire group $G$. For example, for $\mathbb{S}_3=\mathbb{D}_3$ it suffices to exclude the “flip” elements, i.e. the non-identity elements $r$ with $r^2=1$. Moreover, there are natural examples for infinite groups. However, for simplicity we focus on finite mixing groups.
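The failure of mixing for $\mathbb{S}_3$ can be seen numerically in its 2-dimensional irrep, realized (our illustration) by rotations and reflections of the plane: a reflection is not the identity yet has eigenvalue 1, while a rotation's eigenvalues stay away from 1.

```python
import numpy as np

# The 2-dimensional irrep of S3 = D3 as plane symmetries (our sketch).
theta = 2 * np.pi / 3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])   # a 3-cycle
flip = np.array([[1.0, 0.0], [0.0, -1.0]])          # a "flip" (reflection)

def has_eigenvalue_one(M, tol=1e-9):
    return any(abs(ev - 1) < tol for ev in np.linalg.eigvals(M))
```

Here `flip` witnesses non-mixing, and the rotations form a mixing subset, matching the remark above.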

We note that mixing groups are exactly the class of Dedekind groups.

Definition 8 (Finite Dedekind groups).

A finite Dedekind group is a finite group all of whose subgroups are normal; equivalently, it is either commutative or of the form $\mathbb{Q}_8\times\mathbb{Z}_2^t\times D$ for an integer $t$ and a commutative group $D$ of odd order. A non-commutative Dedekind group is also called a Hamiltonian group.

Lemma 9 (Mixing characterization of Dedekind groups).

A finite group is mixing if and only if it is Dedekind.

A proof of Lemma 9 is in Section 9. We can now state our reduction:

Theorem 10.

Let $G$ be a mixing group. Suppose there is a PRG $P_1$ with seed length $s_1$ that $\varepsilon$-fools $(\ell,1,3\log(1/\varepsilon))$-products over $G$. Then there is a PRG that $\big(c_G\log(w+\log(\ell/\varepsilon))\cdot\varepsilon\big)$-fools $(\ell,w,\log(1/\varepsilon))$-products over $G$, with seed length

$c_G\,\big(s_1+\log(\ell/\varepsilon)+w\big)\cdot\log^c\big(w+\log(n/\varepsilon)\big).$

Note that if $P_1$ has nearly optimal seed length (i.e., $\log(\ell/\varepsilon)$ times lower-order terms) then the final PRG also has nearly optimal seed length (i.e., $w+\log(\ell/\varepsilon)$ times lower-order terms).

Applying the reduction (Theorem 10), we obtain near-optimal PRGs for block products over commutative groups and over Dedekind 2-groups (in particular, the quaternions).

Corollary 11.

Let $G$ be either a commutative group or a Dedekind 2-group, that is, $G=\mathbb{Q}_8\times\mathbb{Z}_2^t$ for some $t$. There is an explicit PRG that $\varepsilon$-fools $(\ell,w,0)$-block products over $G$ with seed length $c_G(w+\log(\ell/\varepsilon))\log^c(w+\log(n/\varepsilon))$.

Proof.

We use the reduction (Theorem 10). For commutative groups we use the PRG in [17] for $P_1$; for Dedekind 2-groups we use our Theorem 5 for $P_1$. Actually, in both cases the generators were only stated for group products, while we need to handle the spill. The simple modification is in Section 8.

As remarked earlier, as a consequence of Corollary 11 we obtain PRGs for read-once polynomials in $n$ variables over any finite field $\mathbb{F}$ with near-optimal seed length $c_{\mathbb{F}}\log(n/\varepsilon)\log^c\log(n/\varepsilon)$. Again, this was not known even for $\mathbb{F}_3$.

This result is also a step towards handling group programs over more general groups, for example nilpotent groups, which are direct products of p-groups (for different p). Jumping ahead, our techniques imply that generators for such groups follow from generators for (non-read-once) polynomials over composites.

Finally, we give a new generator for products over mixing groups.

Theorem 12.

Let $G$ be a mixing group. There is an explicit PRG $P$ that $\varepsilon$-fools length-$n$ programs over $G$, in any order, with seed length $c_G\log(n/\varepsilon)\log(1/\varepsilon)$.

The parameter improvement over previous work appears tiny: as remarked earlier, [9] gives seed length $c_G\log(n/\varepsilon)\log(\varepsilon^{-1}\log n)$, and moreover for any $G$. Still, for constant error we obtain optimal seed length, which was previously known only in the fixed-order case (cf. [46]). Also note that mixing groups of the form $\mathbb{Q}_8\times\mathbb{Z}_2^t$ (i.e., with trivial odd part) are 2-groups, for which we already gave optimal seed length in Theorem 5. But the techniques there do not even apply to the commutative (mixing) group $\mathbb{Z}_2\times\mathbb{Z}_3$.

Our main interest in this result is that its proof differs from previous work: it showcases how representation-theoretic information can be used to improve parameters, pointing to several open problems.

1.2 Future directions and open problems

This work suggests that the difficulty of handling more general classes of groups lies in composite moduli. For example, we do not have new generators for $\mathbb{D}_3=\mathbb{S}_3$, a group of order 6, even though we have optimal seed length for $\mathbb{D}_n$ when $n$ is a power of two. Thus, a challenge emerging from this work is to improve the seed length over any non-commutative group of composite order. Again, $\mathbb{S}_3$ is an obvious candidate; fooling programs over it is equivalent to fooling width-3 permutation ROBPs. But other groups could be easier to handle, for example Dedekind groups, or the direct product of a $p$-group and a $q$-group for distinct primes $p\ne q$.

Also, the techniques in this paper point to several other questions. Can we extend our reduction to block products where, instead of $g^f$ for Boolean $f$, we more generally allow a function with range $G$? For what other groups can we exploit representation theory to obtain better PRGs?

2 Proof of Theorem 5

We use the fact that programs over $p$-groups can be written as polynomials. Elements of a group of order $p^k$ will be written as $k$-tuples over $\mathbb{F}_p$.

Lemma 13.

Let $G$ be a group of order $p^k$, for an integer $k$. There is a 1-1 correspondence between $G$ and $\mathbb{F}_p^k$ such that for every $\bar g:=(g_1,g_2,\dots,g_n)\in G^n$ there is a polynomial map $f=(f_1,\dots,f_k):\{0,1\}^n\to\mathbb{F}_p^k$ over $\mathbb{F}_p$, where each $f_i$ has degree $c_G$, satisfying, for every $x\in\{0,1\}^n$ and under the correspondence,

$\prod_{i=1}^n g_i^{x_i}=\big(f_1(x_1,\dots,x_n),\,f_2(x_1,\dots,x_n),\,\dots,\,f_k(x_1,\dots,x_n)\big).$

This lemma is essentially in the previous work [42]. However, the statement there is for nilpotent groups and cannot be used immediately. Also, the proof there relies on previous work and is somewhat indirect. So we give a direct proof of the result we need (i.e., Lemma 13).

Before the proof we illustrate it via an example.

Example 14.

Let $G:=\mathbb{Z}_2\wr\mathbb{Z}_2$ from the introduction. Consider a product $\prod_i(a_i,b_i;z_i)$. Via a polynomial map we can rewrite this product in a normal form where all the $z_i$ appear in one element only:

$\Big(\prod_i(a_i',b_i';0)\Big)\cdot\Big(0,0;\sum_i z_i\Big).$

Computing this product is then immediate, via a linear map. The key observation is that $a_i'=a_i$ if the sum of the $z_j$ with $j<i$ is even and $a_i'=b_i$ otherwise (and symmetrically for $b_i'$), and that this computation is a quadratic polynomial in the input bits $a_i,b_i,z_i$.
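The normal form can be verified exhaustively. The sketch below (ours) re-implements the multiplication rule for $\mathbb{Z}_2\wr\mathbb{Z}_2$ from the introduction and checks the rewriting against direct multiplication over all short sequences.

```python
from itertools import product

# Brute-force check of the normal form in Example 14 for G = Z2 wr Z2.
def mul(u, v):
    (a1, b1, z1), (a2, b2, z2) = u, v
    if z1 == 0:
        return ((a1 + a2) % 2, (b1 + b2) % 2, z2)
    return ((a1 + b2) % 2, (b1 + a2) % 2, (1 + z2) % 2)

def group_prod(elems):
    out = (0, 0, 0)
    for e in elems:
        out = mul(out, e)
    return out

def normal_form(elems):
    # a_i' = a_i if z_1 + ... + z_{i-1} is even, else b_i (symmetrically b_i').
    swapped, parity = [], 0
    for (a, b, z) in elems:
        swapped.append((a, b, 0) if parity == 0 else (b, a, 0))
        parity ^= z
    zsum = sum(z for (_, _, z) in elems) % 2
    return mul(group_prod(swapped), (0, 0, zsum))
```

Enumerating all length-3 sequences over the 8 group elements confirms the two evaluations agree.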

Proof of Lemma 13.

We proceed by induction on $k$. If $k=1$ then $G$ is cyclic. We can take a generator $a\in G$ and define the 1-1 mapping $G\ni a^z\mapsto z\in\mathbb{F}_p$. So $\prod_{i=1}^n g_i^{x_i}=\prod_{i=1}^n a^{z_ix_i}$ can be written as the degree-1 polynomial $\sum_{i=1}^n z_ix_i$.

Otherwise, $G$ has a normal subgroup $H$ of order $p^{k-1}$ [15, Chapter 6, Theorem 1.(3)]. The corresponding quotient group $Q=G/H$ has order $p$ and is therefore cyclic. So we can write $g_i\in G$ as

$g_i=a^{e_i}h_i$

where $h_i\in H$ and $a$ is a generator of $Q$. Applying the induction hypothesis to $H$, we can identify each element $g_i=a^{e_i}h_i$ with a $k$-tuple $(e_i,\bar e_i)\in\mathbb{F}_p^k$, where $\bar e_i\in\mathbb{F}_p^{k-1}$ corresponds to $h_i\in H$.

Now we apply the conjugation trick as in [6], and use induction. That is, let $b_i:=a^{\sum_{j\le i}e_jx_j}$ and write

$\prod_{i=1}^n g_i^{x_i}=\Big(\prod_{i=1}^n\big(b_ih_i^{x_i}b_i^{-1}\big)\Big)\cdot b_n.$

Note that $b_i$ and $b_i^{-1}$ can be computed by degree-1 polynomial maps over $\mathbb{F}_p$, and $h_i^{x_i}$ can be (trivially) computed by a degree-$c_H$ polynomial map over $\mathbb{F}_p$.

Therefore, each term $b_ih_i^{x_i}b_i^{-1}$ can be computed by some degree-$c_G$ polynomial map $f_{H,i}=(f_1,\dots,f_k)$ over $\mathbb{F}_p$. Moreover, these terms lie in $H$ because $H$ is a normal subgroup of $G$. Hence we have reduced to a product over $H$ which, by the induction hypothesis, can be computed by some degree-$c_H$ polynomial map, and the result follows.

Given Lemma 13, it suffices to construct a bit-generator that fools low-degree polynomials over 𝔽p.

The case $p=2$.

For this case, we can simply combine Lemma 13 with known generators for polynomials over $\mathbb{F}_2$ [8, 32, 51]. In fact, we obtain results for non-read-once programs as well, of any length. (Indeed, such programs are equivalent to low-degree polynomials over $\mathbb{F}_2$.)

The case $p>2$.

Here we need additional ideas, because bit-generators that fool polynomials over $\mathbb{F}_q$ with $q\ne2$ are not known. However, the works [8, 32, 51] do give generators that output field elements and fool such polynomials.

Lemma 15 ([51]).

There are distributions $Y$ over $\mathbb{F}_p^n$ that can be explicitly sampled from a uniform seed of $c_p(2^d\log(1/\varepsilon)+\log n)$ bits such that for any degree-$d$ polynomial $f$ in $n$ variables over $\mathbb{F}_p$, we have $\Delta(f(Y),f(U))\le\varepsilon$.

However, we need distributions over $\{0,1\}^n$. This distinction is critical and arises in a number of previous works. Currently, for domain $\{0,1\}^n$ only weaker results with seed length $\log^2 n$ are known [33].

Still, as pointed out in [35, 38], Lemma 15 implies results over the domain $\{0,1\}^n$ for biased bits:

Definition 16.

We denote by $N_p$ a vector of $n$ i.i.d. bits, each coming up 1 with probability $1/p$.

Corollary 17 ([35, 38]).

There are distributions $X$ over $\{0,1\}^n$ that can be explicitly sampled from a uniform seed of $c_p(2^d\log(1/\varepsilon)+\log n)$ bits such that for any degree-$d$ polynomial $f$ in $n$ variables over $\mathbb{F}_p$ we have $\Delta(f(X),f(N_p))\le\varepsilon$.

Proof.

Let $Y=(Y_1,Y_2,\dots,Y_n)$ be the distribution from Lemma 15, for degree $d(p-1)$. Define $X:=(1-Y_1^{p-1},1-Y_2^{p-1},\dots,1-Y_n^{p-1})$. Note $X$ is over $\{0,1\}^n$. Also, if $U$ is uniform in $\mathbb{F}_p$ then $1-U^{p-1}$ equals 1 with probability exactly $1/p$, i.e., it is distributed as a coordinate of $N_p$. The result follows.

We will show how to use biased bits. For this we use the fact that the program is read-once.
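The bias computation behind Corollary 17 can be sanity-checked by enumeration (our check): for uniform $u\in\mathbb{F}_p$, the bit $1-u^{p-1}\bmod p$ is 1 exactly when $u=0$, i.e. with probability $1/p$, matching Definition 16.

```python
# Count, over all u in F_p, how often the derived bit equals 1.
def biased_bit_counts(p):
    ones = sum(1 for u in range(p) if (1 - pow(u, p - 1, p)) % p == 1)
    return ones, p  # ones / p is the probability of a 1
```

By Fermat's little theorem $u^{p-1}=1$ for $u\ne0$, so exactly one value of $u$ yields a 1.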

Lemma 18.

Let $X$ fool degree-1 polynomials over $\mathbb{F}_2$ with error $\varepsilon^{c_{G,p}}$. Then $X+N_p$ fools programs of length $n$ over $G$ with error $\varepsilon$.

Proof.

This follows from Lemma 7.2 in [16] combined with the Fourier bound in [43, 28]. The proof in [16] is for the fixed noise parameter $p=4$, but the generalization to any $p$ is immediate (replace $1/2$ with $1-2/p$ in the last two lines of the proof).

We now have all the ingredients.

Proof of Theorem 5.

Use Lemma 18. Averaging over X, it suffices to derandomize Np. By Lemma 13 it suffices to do this for low-degree polynomials. This follows from Corollary 17.

3 Representation theory and matrix analysis

In this section, we present the fragment of representation theory and matrix analysis that we need. The books by Serre [45], Diaconis [13], and Terras [48] are good references for representation theory and non-commutative Fourier analysis. The Barbados notes [53], [21, Section 13], [22], or [12] provide briefer introductions.

Matrices.

Let $M$ be a square complex matrix. We denote by $\operatorname{tr}(M)$ the trace of $M$, by $\bar M$ the entrywise conjugate of $M$, by $M^T$ the transpose of $M$, and by $M^*$ the conjugate transpose $\bar M^T$ (a.k.a. adjoint, Hermitian conjugate, etc.). The matrix $M$ is unitary if its rows and columns are orthonormal; equivalently, $M^{-1}=M^*$.

The Frobenius norm (a.k.a. Schatten 2-norm or Hilbert–Schmidt norm) of a square matrix $M$, denoted $\|M\|_{\mathsf F}$, is $\sqrt{\sum_{i,j}|M_{i,j}|^2}=\sqrt{\operatorname{tr}(MM^*)}$.

The operator norm of a matrix $M$, denoted $\|M\|_{\mathsf{op}}$, is the square root of the largest eigenvalue of the matrix $M^*M$. In particular, if $M$ is a normal matrix, i.e. $MM^*=M^*M$, then $\|M\|_{\mathsf{op}}$ equals the largest magnitude of an eigenvalue of $M$.

Fact 19.

$\|AB\|_{\mathsf{op}}\le\|A\|_{\mathsf{op}}\|B\|_{\mathsf{op}}$.

Fact 20.

For a $d\times d$ matrix $M$ we have $\|M\|_{\mathsf F}^2\le d\,\|M\|_{\mathsf{op}}^2$; moreover, if $M$ is normal with eigenvalues $\lambda_1,\dots,\lambda_d$, then $\|M\|_{\mathsf F}^2=\sum_{i=1}^d|\lambda_i|^2$.

Representation theory.

Let $G$ be a group. A representation $\rho$ of $G$ with dimension $d$ maps elements of $G$ to $d\times d$ unitary complex matrices so that $\rho(xy)=\rho(x)\rho(y)$. Thus, $\rho$ is a homomorphism from $G$ to the group of unitary linear transformations of the vector space $\mathbb{C}^d$. We denote by $d_\rho$ the dimension of $\rho$.

If there is a non-trivial subspace $W$ of $\mathbb{C}^d$ that is invariant under $\rho$, that is, $\rho(x)W\subseteq W$ for every $x\in G$, then $\rho$ is reducible; otherwise it is irreducible. Irreducible representations are abbreviated irreps and play a critical role in Fourier analysis. We denote by $\hat G$ a complete set of inequivalent irreducible representations of $G$. We have

$\sum_{\rho\in\hat G}d_\rho^2=|G|. \qquad (1)$

For a random variable Z we also use Z to denote its probability mass function.

For an irrep $\rho\in\hat G$, the $\rho$-th Fourier coefficient of $Z$ is

$\hat Z(\rho):=\sum_{g\in G}Z(g)\overline{\rho(g)}=\mathbf{E}\big[\overline{\rho(Z)}\big].$

The Fourier expansion of $Z:G\to\mathbb{C}$ is

$Z(g)=\frac{1}{|G|}\sum_{\rho\in\hat G}d_\rho\operatorname{tr}\big(\hat Z(\rho)\rho(g)\big).$

Parseval’s identity gives

$\sum_{g\in G}Z(g)^2=\frac{1}{|G|}\sum_{\rho\in\hat G}d_\rho\big\|\hat Z(\rho)\big\|_{\mathsf F}^2.$
Claim 21.

Suppose $X$ and $Y$ are two random variables over $G$ such that for every irreducible representation $\rho$ of $G$, we have $\|\mathbf{E}[\rho(X)]-\mathbf{E}[\rho(Y)]\|_{\mathsf{op}}\le\varepsilon$. Then $X$ and $Y$ are $(\sqrt{|G|}\,\varepsilon)$-close in statistical distance.

Proof.
12gG|X(g)Y(g)| |G|2(gG(X(g)Y(g))2)1/2 (Cauchy–Schwarz)
=|G|2(1|G|ρG^dρX^(ρ)Y^(ρ)𝖥2)1/2 (Parseval)
=12(ρG^dρ(dρε2))1/2 (Fact 20)
=ε2(ρG^dρ2)1/2=|G|ε/2. (Equation 1)
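The Fourier formulas above can be checked numerically for a cyclic group, whose irreps are the characters $\chi_j(g)=e^{2\pi ijg/m}$. The following sketch (ours) verifies the inversion formula and Parseval's identity for a random distribution on $\mathbb{Z}_6$.

```python
import numpy as np

# Fourier analysis on Z_m with the paper's conventions:
#   Zhat(j) = E[conj(chi_j(Z))],  Z(g) = (1/m) * sum_j Zhat(j) chi_j(g).
m = 6
chars = np.exp(2j * np.pi * np.outer(np.arange(m), np.arange(m)) / m)  # chars[j, g]

rng = np.random.default_rng(0)
Z = rng.random(m)
Z /= Z.sum()                      # a probability mass function on Z_m

Zhat = chars.conj() @ Z           # Fourier coefficients
recon = (chars.T @ Zhat).real / m # inverse Fourier transform
```

Both `np.allclose(recon, Z)` and the Parseval check hold up to floating-point error.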

4 Proof of Theorem 12

Again, besides the parameter improvement, our main point here is to illustrate how we use representation theory to obtain pseudorandom generators. These ideas will then be generalized to the more general and complicated setting of block products in the next section.

Let $\rho$ be an irreducible representation of a mixing group (Definition 6). By the definition of mixing, if $\rho(g)$ is a non-identity matrix then it does not have 1 as an eigenvalue. A main observation is that if there are many non-identity matrices $\rho(g_i)$ in the program, then the bias $\|\mathbf{E}[\prod_{i=1}^n\rho(g_i)^{U_i}]\|_{\mathsf{op}}$ is small. This is proved in the next two claims.

Claim 22.

Let $M$ be a unitary matrix with eigenvalues $e^{2\pi i\theta_j}$ for some $\theta_j\in[-1/2,1/2]$ on the unit circle. Suppose $|\theta_j|\ge\theta$ for every $j$. Then $\|(I+M)/2\|_{\mathsf{op}}\le1-\theta^2/8$.

Proof.

As $M$ is unitary, we can write $M=QDQ^*$, where $D$ is a diagonal matrix with $M$'s eigenvalues on its diagonal and $Q$ is unitary. The eigenvalues of $(I+M)/2=Q\frac{I+D}{2}Q^*$ are $\frac{1+e^{2\pi i\theta_j}}{2}=e^{\pi i\theta_j}\cdot\frac{e^{-\pi i\theta_j}+e^{\pi i\theta_j}}{2}=e^{\pi i\theta_j}\cos(\pi\theta_j)$, which have magnitudes $|\cos(\pi\theta_j)|\le1-\theta^2/8$. Since $(I+M)/2$ is normal, the bound on the operator norm follows.
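The contraction in Claim 22 is easy to confirm numerically (our check, using the $e^{2\pi i\theta}$ parametrization of Definition 24); by unitary invariance it suffices to test diagonal unitaries.

```python
import numpy as np

# Operator norm of (I + M)/2 for a diagonal unitary M with phases thetas.
def contraction(thetas):
    M = np.diag(np.exp(2j * np.pi * np.array(thetas)))
    return np.linalg.norm((np.eye(len(thetas)) + M) / 2, ord=2)
```

For instance, with all phases at distance exactly $1/2$ from $0$ we get $M=-I$ and the norm collapses to $0$, while phases at distance at least $\theta$ keep the norm below $1-\theta^2/8$.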

Claim 23.

Let $G$ be a mixing group. Let $\rho$ be an irreducible representation of $G$ of dimension $d_\rho$. Let $f_\rho(x)=\prod_{i=1}^n\rho(g_i)^{x_i}$ be the representation of a group program. Suppose $\rho(g_i)\ne I_{d_\rho}$ for $t\ge c_G\log(1/\varepsilon)$ many $i$'s. Then $\|\mathbf{E}[f_\rho(U)]\|_{\mathsf{op}}\le\varepsilon$.

Proof.

Let $T$ be the $t$ coordinates $j$ where $\rho(g_j)\ne I_{d_\rho}$. For every fixing of the other coordinates, we can write $f_\rho(U)$ as

$B\prod_{j\in T}\big(\rho(g_j)^{U_j}B_j\big)$

for some unitary matrices $B$ and $B_j$. So

$\big\|\mathbf{E}[f_\rho(U)]\big\|_{\mathsf{op}}\le\|B\|_{\mathsf{op}}\prod_{j\in T}\big\|\mathbf{E}[\rho(g_j)^{U_j}]\big\|_{\mathsf{op}}\|B_j\|_{\mathsf{op}}\le(1-c_G)^t\le\varepsilon.$
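To see the decay concretely, the sketch below (ours) computes the bias $\mathbf{E}[\prod_i\rho(g_i)^{U_i}]=\prod_i(I+\rho(g_i))/2$ for the 2-dimensional irrep of the quaternion group $\mathbb{Q}_8$; each non-identity factor shrinks the operator norm by the constant factor $\sqrt2/2$.

```python
import numpy as np

# 2-dimensional irrep of Q8: rho(i) and rho(j) below satisfy the
# quaternion relations; neither has eigenvalue 1.
rho_i = np.array([[1j, 0], [0, -1j]])
rho_j = np.array([[0, 1], [-1, 0]], dtype=complex)

def bias(mats):
    # E[prod M^U] over independent uniform bits U = prod (I + M)/2.
    out = np.eye(2, dtype=complex)
    for M in mats:
        out = out @ ((np.eye(2) + M) / 2)
    return np.linalg.norm(out, ord=2)
```

Four copies of $\rho(i)$ give bias $(\sqrt2/2)^4=1/4$, and mixing $\rho(i)$ with $\rho(j)$ gives $1/2$, matching the $(1-c_G)^t$ bound in the proof.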

We now proceed with the proof of the main result. The proof extends to handle the spill, but for simplicity we do not discuss it here. We fool each irreducible representation of $G$ separately and then appeal to Claim 21. Fix a representation $\rho$ and consider the product

$f_\rho(x):=\prod_{j=1}^{n}\rho(g_j)^{x_j}.$

Let $t$ be the number of non-identity matrices among $\rho(g_j):j\in[n]$, and let $S$ be the set of their coordinates.

Let us sketch the construction. First, XORing with an almost $2c_G\log(1/\varepsilon)$-wise uniform distribution takes care of the case $t\le c_G\log(1/\varepsilon)$, so we may assume that $t$ is larger. In this case, by Claim 23, the bias $\|\mathbf{E}[f_\rho(U)]\|_{\mathsf{op}}$ is small under the uniform distribution. Our goal is to set $ct$ bits in $S$ to uniform and apply Claim 23 again.

Let $\ell:=c_G\log(1/\varepsilon)$. Let $M$ be a $(\log n)\times10\ell$ matrix filled with uniform bits.

We will make $\log n$ guesses of $t$. For each guess $v=2^i$, $i\in\{0,\dots,\log n-1\}$, of $t$, we select a subset of the input positions of expected size roughly $\ell$ using a hash function $h_i$, then hash these positions to row $i$ of $M$ using another hash function $h$, and assign input bits correspondingly. The final generator is obtained by trying all guesses, using the same seed for each guess $h_i$, and XORing together the bits.

In more detail, for each $i\in\{0,\dots,\log n-1\}$, let $h_i:[n]\to\{0,1\}$ be drawn from a $10\ell$-wise independent hash family with $\mathbf{Pr}_{h_i}[h_i(j)=1]=\ell\cdot2^{-i}$ for each $j\in[n]$. Let $h:[n]\to[10\ell]$ be drawn from another $5\ell$-wise uniform hash family. The output of our generator is

$D:=D^{(0)}\oplus\cdots\oplus D^{(\log n-1)},$

where the $j$-th bit of $D^{(i)}$ is

$h_i(j)\cdot M_{i,h(j)}.$

We use the same seed to sample $h_0,\dots,h_{\log n-1}$, which costs at most $O_G(\log n\cdot\log(1/\varepsilon))$ bits [49, Corollary 3.34]. Sampling $h$ uses another $O_G(\log(n/\varepsilon)\log(1/\varepsilon))$ bits. This gives a total of $O_G(\log(n/\varepsilon)\log(1/\varepsilon))$ bits.

We now show that $\|\mathbf{E}[f_\rho(D)]\|_{\mathsf{op}}\le O(\varepsilon)$. Suppose $t\in[2^i,2^{i+1}]$. Recall that $S$ is the set of coordinates corresponding to the non-identity matrices in the product. Let $J:=h_i^{-1}(1)\cap S$. As $\mathbf{Pr}[h_i(j)=1]=\ell\cdot2^{-i}$, we have $\mathbf{E}[|J|]\in[\ell,2\ell]$. Applying tail bounds for bounded independence (see Lemma 36), we have $|J|\in[\ell/2,3\ell]$ except with probability $\varepsilon$. Conditioned on this event, as $|J|\le3\ell$ and $h$ is $5\ell$-wise uniform, we can think of $h$ as a random function from $J$ to $[10\ell]$. Hence, for each $j\in[10\ell]$, we have

$\mathbf{Pr}\big[|J\cap h^{-1}(j)|=1\big]=|J|\cdot\frac{1}{10\ell}\Big(1-\frac{1}{10\ell}\Big)^{|J|-1}\ge\frac{\ell}{2}\cdot\frac{1}{10\ell}\cdot\frac12\ge\frac{1}{40}.$

By a Chernoff bound, except with probability at most $\varepsilon$, the number of $j$ such that $|J\cap h^{-1}(j)|=1$ is at least $\ell/10$.

Let $T$ be these coordinates. Fixing all the bits in $M$ except the ones in row $i$ that are fed into $T$, we can write the conditional expectation of $f_\rho(D)$ over the bits in $T$ as

$B\prod_{j\in T}\mathbf{E}_{x_j}\big[A_j^{x_j}\big]B_j,$

for some unitary matrices $B$, $A_j$, and $B_j$; in particular, each $A_j$ has its eigenvalues bounded away from 1 on the complex unit circle. Therefore, by Claim 22,

$\Big\|B\prod_{j\in T}\mathbf{E}_{x_j}\big[A_j^{x_j}\big]B_j\Big\|_{\mathsf{op}}\le\prod_{j\in T}\Big\|\frac{I+A_j}{2}\Big\|_{\mathsf{op}}\le\varepsilon.$
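The multi-scale structure of the construction can be sketched as follows. This is our simplification, not the paper's generator: we replace the bounded-independence families $h_i$ and $h$ by truly random functions, which ignores the seed-length accounting but shows how one matrix row per scale is hashed into the output and how the $\log n$ scales are XORed together.

```python
import random

# Simplified multi-scale generator sketch (ours): all hash functions are
# fully random here; a real instantiation would use the bounded-
# independence families described in the text.
def generator(n, ell, rng):
    num_scales = max(1, n.bit_length() - 1)  # one scale per guess 2^i of t
    out = [0] * n
    for i in range(num_scales):
        row = [rng.randint(0, 1) for _ in range(10 * ell)]       # row i of M
        for j in range(n):
            keep = rng.random() < min(1.0, ell * 2.0 ** (-i))    # h_i(j)
            bucket = rng.randrange(10 * ell)                     # h(j)
            out[j] ^= keep & row[bucket]                         # h_i(j) * M[i, h(j)]
    return out
```

The output is a string of $n$ bits obtained by XORing all scales, mirroring $D=D^{(0)}\oplus\cdots\oplus D^{(\log n-1)}$.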

5 Proof of Theorem 10

In this section we prove Theorem 10. This type of reduction goes back to the work of [19] on read-once CNFs (itself building on [1]), and has been refined in several subsequent works. The work [30] extended the techniques to read-once polynomials. It exploited the observation that when the number of monomials is significantly larger than the degree, the bias of the polynomial is small, and therefore the bias of the restricted function remains small. Building on this observation, [37] showed that one can aggressively restrict most of the coordinates while keeping the bias of the restricted function small. In addition, a typical restricted product is a low-degree polynomial (plus a spill), for which we have optimal generators [8, 32, 51].

However, [37] reduces to non-linear polynomials (degree 16). As discussed earlier, bit-generators with good seed lengths for such polynomials are only known over $\mathbb{Z}_2$. We give a refined reduction that reduces to polynomials of degree one, for which we have generators over $\mathbb{Z}_m$ for any $m$ [35, 38, 18].

At the same time, we show that the reduction can be carried out over any mixing group, by working with representations of the group.

Definition 24.

Let $\mathcal{U}_\theta(d)$ be the set of $d\times d$ unitary matrices with eigenvalues $e^{2\pi i\theta_j}$ where $|\theta_j|\ge\theta$.

Definition 25.

A group $G$ is $\theta$-mixing if it has a complete set of unitary irreducible representations in which each non-identity matrix lies in $\mathcal{U}_\theta(d)$ for some $d$.

The following theorem will serve as the basis of our iterative construction of the PRG.

Theorem 26.

Let $w\ge\log\log(1/\varepsilon)+\log m$. Suppose there is a PRG $P$ with seed length $s$ that $\varepsilon$-fools $(m^52^{30w},2w,2\log(1/\varepsilon))$-products over $G$. Let $P_1$ be a PRG with seed length $s_1$ that $\varepsilon$-fools $(\ell,1,3\log(1/\varepsilon))$-products over a group $G$ of order $m$ that is $(1/m)$-mixing. Then there is a PRG that $O(\varepsilon)$-fools $(m^52^{45w},3w,2\log(1/\varepsilon))$-products over $G$ with seed length

$s+s_1+O_m\big((\log(1/\varepsilon)+w)\log w+\log\log n\big).$

We first show how to apply Theorem 26 iteratively to obtain Theorem 10.

Proof of Theorem 10.

We iterate Theorem 26 for some $t$ times to reduce the problem to fooling an $O(\log(m/\varepsilon))$-junta, which can be fooled using an almost bounded uniform distribution.

Given an $(\ell,w,\log(1/\varepsilon))$-product $f$, let $w':=\max\{w,\log\ell,\log m\}$, so that we can view $f$ as an $(m^52^{45w'},3w',2\log(1/\varepsilon))$-product. We first apply Theorem 26 for $t_1=O(\log w')$ times until we have a

$\big((m\log(1/\varepsilon))^C,\ \log\log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product},$

for some constant $C$.

Let $b:=\frac{\log(1/\varepsilon)+\log m}{\log\log(1/\varepsilon)+\log m}$. We will apply the following step repeatedly for some $r=O_m(1)$ rounds. We divide the $f_i:i\ge1$ into groups of $b$ functions and view the product of the functions in each group as a single function; this way we can think of the above product as a

$\big((m\log(1/\varepsilon))^C/b,\ \log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product}.$

So we can continue applying Theorem 26 for $t_2=O(\log(\log(1/\varepsilon)+\log m))\le O_m(\log\log(1/\varepsilon))$ times, and the restricted function becomes a

$\big((m\log(1/\varepsilon))^C/b,\ \log\log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product}.$

Repeating this process for

$r=\log_b\big((m\log(1/\varepsilon))^C\big)\le\frac{2C\,(\log m+\log\log(1/\varepsilon))}{\log\log(1/\varepsilon)}=O_m(1)$

times, we are left with an

$\big(O(1),\ \log\log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product},$

which can be fooled by an $\varepsilon$-almost $O(\log(m/\varepsilon))$-wise uniform distribution, which can be sampled using $s''=O(\log(m/\varepsilon)+\log\log n)$ bits [41, 2]. Therefore, in total we apply Theorem 26 for

$t:=t_1+r\cdot t_2\le O(\log w')+O_m(\log\log(1/\varepsilon))=O_m\big(\log(w+\log(\ell/\varepsilon))\big)$

times, each with a seed of

$s'=s_1+O_m\big((\log(\ell/\varepsilon)+w)\log w'+\log\log n\big)$

bits. Hence in total it uses

$s''+t\cdot s'\le O_m\big(s_1+\log(\ell/\varepsilon)+w\big)\cdot\mathrm{polylog}\big(w,\log\ell,\log n,\log(1/\varepsilon)\big)$

bits.

5.1 Analysis of one iteration: Proof of Theorem 26

We now prove Theorem 26. Given an $(m^52^{45w},3w,2\log(1/\varepsilon))$-product $f=\prod_{i=0}^{\ell}g_i^{f_i}$ over $G$ of order $m$ that is $(1/m)$-mixing, let $\ell'$ be the number of non-constant $f_i$. We say $f$ is a long product if $\ell'\ge m^52^{30w}$; otherwise $f$ is short. At a high level, we apply Theorem 30 to $P_1$ to obtain a PRG that fools long products in one shot, and use Lemmas 27 and 29 below to reduce fooling a short product to fooling a product of smaller width $2w$.

Lemma 27.

Let $w\ge\log m$ and let $C$ be a sufficiently large constant. Define

$k:=C(w+\log(\ell/\varepsilon)),\qquad \delta:=(mw)^{-k},\qquad p:=2^{-C}.$

There exist two $\delta$-almost $k$-wise independent distributions $D$ and $T$ with $\mathbf{E}[D_i]=1/2$ and $\mathbf{E}[T_i]=p$ for every $i\in[n]$, such that for every $(\ell,w,0)$-product $f$ over $G$ of order $m$, we have $\big|\mathbf{E}_{D,T}[f(D\oplus(T\wedge U))]-\mathbf{E}[f(U)]\big|\le\varepsilon$.

Moreover, $D$ and $T$ can be efficiently sampled with a seed of length $O_m\big((\log(\ell/\varepsilon)+w)\log w+\log\log n\big)$.

Lemma 27 follows from the next lemma, which can be obtained by applying a variant of a result of Forbes and Kelley [16] to the Fourier bounds on functions computable by block products over groups, established in [28]. (Block products are called generalized group products in [28].)

Lemma 28 ([16, 28]).

Let $f:\{0,1\}^n\to\{0,1\}$ be computable by an $(\ell,w,0)$-block product over a group $G$. Let $D$ and $T$ be two independent $\delta$-almost $2(k+w)$-wise independent distributions on $\{0,1\}^n$ with $\mathbf{E}[D_i]=1/2$ and $\mathbf{E}[T_i]=p$, and let $U$ be the uniform distribution on $\{0,1\}^n$. Then

$\big|\mathbf{E}[f(D\oplus(T\wedge U))]-\mathbf{E}[f(U)]\big|\le\delta\,(w|G|)^{k+w}+(1-2p)^{k/2}.$

We remark that for every constant $p$, one can show that bias $n^{-\omega(1)}$ plus noise $N_p$ is necessary to fool programs over groups of order $\mathrm{poly}(n)$ with any subconstant error $\varepsilon$. This follows from [11], which shows that there exists such a distribution that puts $2\varepsilon$ more probability mass than the uniform distribution on strings whose Hamming weight is greater than $n/2+O_p(\sqrt{kn})$.
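The restriction template used throughout this section can be sketched as follows (our illustration, with i.i.d. bits in place of the almost $k$-wise independent $D$ and $T$): $D$ fixes a tentative value for every coordinate, $T$ marks the coordinates that stay alive, and fresh uniform bits $U$ are XORed into the live positions.

```python
import random

# Sketch of the distribution D xor (T and U) from the lemmas above,
# with truly random bits standing in for the derandomized D and T.
def noisy_restriction(n, p, rng):
    D = [rng.randint(0, 1) for _ in range(n)]            # tentative fixing
    T = [1 if rng.random() < p else 0 for _ in range(n)]  # live coordinates
    U = [rng.randint(0, 1) for _ in range(n)]            # fresh noise
    return [d ^ (t & u) for d, t, u in zip(D, T, U)]
```

Coordinates with $T_i=0$ are deterministically $D_i$; coordinates with $T_i=1$ are uniform regardless of $D_i$.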

Lemma 29 (Width reduction for short products).

Let $G$ be any group of order $m$, and let $w\ge\log m$. Let $D$ and $T$ be the two distributions defined in Lemma 27. Let $f$ be an $(\ell,3w,2\log(1/\varepsilon))$-product over $G$, where $\ell\le m^52^{30w}$. Then with probability at least $1-\varepsilon$ over $D$ and $T$, the restricted function $f_{D,T}$ is an $(\ell,2w,2\log(1/\varepsilon))$-product over $G$.

We prove Lemma 29 in Section 7.

Theorem 30.

Let w ≥ log log(1/ε) + 2log(1/θ). Suppose there is a PRG P₁ with seed length s₁ that ε-fools every (ℓ′, 1, 3log(1/ε))-product f := ∏_{i=0}^{ℓ′} f_i over a matrix group 𝒢 supported on 𝒰_θ(d), where ℓ′ ≥ 2^{3w}θ^{−2} and each f_i is non-constant. Then there is a PRG that fools (ℓ, 3w, 2log(1/ε))-products f = ∏_{i=0}^ℓ f_i over 𝒢, where ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}] and every f_i is non-constant, with seed length s = s₁ + O_θ(log(1/ε) + w + log log n).

Corollary 31.

Theorem 30 applies to products over any θ-mixing group G with ε replaced by ε/|G|.

Proof.

By the definition of θ-mixing, every irrep ρ of G is supported on 𝒰_θ(d_ρ). It then follows from Claim 21 that it suffices to fool each irrep with error ε/|G|.

We prove Theorem 30 in Section 6. We now show how Theorem 26 follows from Lemmas 27 and 29 and Theorem 30.

Proof of Theorem 26.

Let P₁ be the PRG that ε-fools (ℓ′, 1, 3log(1/ε))-products with seed length s₁. Applying Theorem 30 with P₁ and θ = 1/m, we obtain a PRG P_long that ε-fools every (ℓ, 3w, 2log(1/ε))-product f = ∏_{i=0}^ℓ f_i, where ℓ ∈ [m^5·2^{30w}, m^5·2^{45w}] and every f_i is non-constant, with seed length s_long = s₁ + O_m(log(1/ε) + w + log log n). We now sample the distributions D, T in Lemma 27, and output

(D + T∧P(U)) + P_long(U′).

Using Lemma 27 and ℓ ≤ 2^{O_m(w)}, sampling D and T uses s_short = O_m((log(1/ε) + w)·log w + log log n) bits. So altogether this takes s + s_long + s_short = s + s₁ + O_m((log(1/ε) + w)·log w + log log n) bits.

Let f be an (m^5·2^{45w}, 3w, 2log(1/ε))-product with ℓ many non-constant f_i's. If ℓ ≥ m^5·2^{30w}, then P_long ε-fools it. Otherwise, ℓ ≤ m^5·2^{30w} and so f is an (m^5·2^{30w}, 3w, 2log(1/ε))-product. So by Lemma 29, with probability at least 1−ε over the choices of D and T, the function f_{D,T} is an (m^5·2^{30w}, 2w, 2log(1/ε))-product, and therefore can be ε-fooled using the generator P given by the assumption. The total error is O(ε).

6 Width reduction for long products: Proof of Theorem 30

In this section, we prove Theorem 30. Let f = ∏_{i=0}^ℓ f_i be an (ℓ, 3w, 2log(1/ε))-product over a matrix group 𝒢 supported on 𝒰_θ(d), where ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}] and each f_i is non-constant. Note that when a product f has this many non-constant functions, the “bias” ‖𝐄[f(U)]‖_op of f is doubly exponentially small in w, i.e., at most exp(−2^{2w}) (see Claim 33), which is at most ε whenever w ≥ log log(1/ε). Following [37], we will pseudorandomly restrict most of the coordinates of f and show that the bias of a typical restricted product remains bounded by ε. More importantly, we will show that this restricted product has width 1 (with a small spill). Therefore, it suffices to construct a PRG for width-1 products (with a small spill).

We remark that previous works showed that a typical restricted product has degree at most 16, as opposed to 1. This difference is already crucial for fooling products over ℤ_m for composite m with good seed length, as we do not have (bit-)PRGs even for degree-2 polynomials over ℤ₆.

6.1 The reduction

We will use the following standard construction of δ-almost k-wise independent distributions with marginals p.

Claim 32.

There exists an explicit δ-almost k-wise independent distribution T on {0,1}^n with 𝐄[T_i] = 2^{−b} for every i ∈ [n], which can be sampled using O(b + k + log(1/δ) + log log n) bits.

Proof.

We sample an ε′-biased distribution D′ on {0,1}^{nb} and b uniform bits U_b. By standard constructions [41, 2], D′ can be sampled using O(b + log(k/ε′) + log log n) bits. Write D′ = (D′₁, …, D′_n) where each D′_i ∈ {0,1}^b. We output T ∈ {0,1}^n, where T_i = AND_b(D′_i ⊕ U_b) and AND_b is the AND function on b bits. We have 𝐄[T_i] = 2^{−b} because D′_i ⊕ U_b is uniform. By [37, Claim 3.7], T is (ε′·2^k)-almost k-wise independent. Setting ε′ = 2^{−k}δ proves the claim.
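As a sanity check of this construction, the following sketch computes the marginal 𝐄[T_i] exactly; for illustration we idealize the small-bias distribution D′ as truly uniform (the helper names are ours, not from the text), which isolates the point that XOR-ing with the shared seed U_b and taking AND_b yields marginal 2^{−b}.

```python
from itertools import product

def and_of_xor(d_block, u):
    # T_i = AND_b(D'_i xor U_b): 1 iff D'_i xor U_b is the all-ones string.
    return int(all(db ^ ub for db, ub in zip(d_block, u)))

def marginal(b):
    # Exact Pr[T_i = 1] over a uniform block D'_i in {0,1}^b (idealizing the
    # epsilon'-biased distribution) and an independent uniform seed U_b.
    hits = sum(and_of_xor(d, u)
               for d in product((0, 1), repeat=b)
               for u in product((0, 1), repeat=b))
    return hits / 4 ** b
```

In the real construction D′ is only ε′-biased, which is what costs the extra (ε′·2^k) term in the almost k-wise independence.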

Let C be a sufficiently large constant. Let

k = C(log(1/ε) + w)
δ = θ^k (2)
p = 2^{−23w}θ³.

Let D and T be two δ-almost k-wise independent distributions, with 𝐄[D_i] = 1/2 and 𝐄[T_i] = p for every i ∈ [n], and let P₁ be the PRG given by the theorem. The generator is

P := D + T∧P₁.

By Claim 32, this uses s₁ + O_θ(log(1/ε) + w + log log n) bits.
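Concretely, on each coordinate the generator outputs the bit of D unless T marks that coordinate as free, in which case the bit of P₁ is XOR-ed in; a minimal sketch (with hypothetical list-of-bits inputs standing for samples of D, T, and P₁):

```python
def combine(d, t, p1):
    # P := D + T /\ P1 coordinate-wise: positions with t=0 stay fixed to d,
    # positions with t=1 are rerandomized by XOR-ing in the bit of p1.
    return [di ^ (ti & pi) for di, ti, pi in zip(d, t, p1)]
```

For example, `combine([1,0,1,0], [0,1,1,0], [1,1,0,1])` fixes coordinates 0 and 3 to the bits of d and rerandomizes coordinates 1 and 2.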

6.2 Analysis

We first state a claim showing that if the number ℓ of non-constant f_i's in a block product f is much greater than its width w, then the bias ‖𝐄[f(U)]‖_op is small. We defer its proof to Section 6.2.1.

Claim 33.

For integers w and q, let f = ∏_{i=0}^ℓ f_i be an (ℓ, w, q)-product over some matrix group supported on 𝒰_θ(d), for some ℓ ≥ 2^{2w+2}θ^{−2}·log(1/ε), where each f_i is non-constant. Then ‖𝐄[f(U)]‖_op ≤ ε.

Recall that ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}]. Given D and T, let f_{D,T}: {0,1}^T → 𝒢 be the restricted product

f_{D,T}(x) := f(D + T∧x) = ∏_{i=0}^ℓ f_i(D + T∧x).

We use f_{D,T,i}(x) to denote f_i(D + T∧x).

The following lemma shows that with high probability over D and T, the function f_{D,T} is an (ℓ′, 1, 3log(1/ε))-product, that is, a group program with a small spill. Note that this lemma holds for products over any group.

Lemma 34.

Let D and T be the two distributions on {0,1}^n defined in (2). Let w ≥ log log(1/ε) + 2log(1/θ) and f = ∏_i f_i be an (ℓ, 3w, 2log(1/ε))-product, where ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}] and each f_i is non-constant. Then with probability 1−ε over D and T, the function f_{D,T} is an (ℓ′, 1, 3log(1/ε))-product, where ℓ′ ≥ 2^{3w}θ^{−2} and each f_{D,T,i} is non-constant.

Theorem 30 follows from Claims 32 and 33 and Lemma 34.

Proof of Theorem 30.

As ℓ ≥ 2^{30w}θ^{−5} ≥ 2^{2w+2}θ^{−2}log(1/ε), by Claim 33 we have ‖𝐄[f(U)]‖_op ≤ ε. By Lemma 34, with probability 1−ε over D and T, the restricted function f_{D,T} = ∏_i f_{D,T,i} is an (ℓ′, 1, 3log(1/ε))-product, where ℓ′ ≥ 2^{3w}θ^{−2} and each f_{D,T,i} is non-constant. As w ≥ log log(1/ε), we have ℓ′ ≥ 2^4·θ^{−2}log(1/ε), so again by Claim 33, ‖𝐄[f_{D,T}(U)]‖_op ≤ ε. By our assumption on P₁, we have ‖𝐄[f_{D,T}(P₁)]‖_op ≤ ‖𝐄[f_{D,T}(U)]‖_op + ε ≤ 2ε. So altogether we have ‖𝐄[f(U)] − 𝐄[f(P(U))]‖_op ≤ O(ε). The seed length follows from the construction.

Proof of Lemma 34.

To get some intuition, think of θ as a constant. Recall that the number ℓ of non-constant functions is roughly between 2^{30w} and 2^{45w}, and T keeps each bit free with probability p ≈ 2^{−23w}. Therefore, under a typical restriction, most functions in the product become constant, roughly ℓ·p of them keep exactly one free bit, and very few keep two or more free bits.

We first need to lower-bound the probability that a non-constant function remains non-constant under a random restriction.

Claim 35.

Let g be a non-constant function on w bits. For p ∈ [0,1], let U be uniform on {0,1}^w and let T be the distribution on {0,1}^w where the coordinates T_i are independent and 𝐄[T_i] = p for each i ∈ [w]. With probability at least p((1−p)/2)^{w−1} over U and T, the function g_{U,T}(x) := g(U + T∧x) is a non-constant function on 1 bit.

Proof.

Since g is non-constant, there is an x ∈ {0,1}^w and a coordinate j ∈ [w] such that g(x + e_j) ≠ g(x). The probability that only the coordinate T_j is 1 (and the rest are 0), and U agrees with x on the remaining w−1 coordinates, is

Pr[(T = e_j) ∧ ∀i≠j: U_i = x_i] = Pr[T = e_j]·Pr[∀i≠j: U_i = x_i]
= p(1−p)^{w−1}·2^{−(w−1)} = p·((1−p)/2)^{w−1}.

In this case g_{U,T} is a non-constant function of the single free bit.
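The bound in Claim 35 can be checked exactly on small examples by enumerating U and T; the sketch below (our own helper, using exact rational arithmetic) computes the probability that exactly one coordinate stays free and g is non-constant on it.

```python
from fractions import Fraction
from itertools import product

def prob_one_bit_nonconstant(g, w, p):
    # Exact probability over U ~ uniform({0,1}^w) and T with i.i.d. marginals p
    # that |T| = 1 and g differs on the single free coordinate.
    total = Fraction(0)
    for t in product((0, 1), repeat=w):
        if sum(t) != 1:
            continue
        pt = Fraction(1)
        for ti in t:
            pt *= p if ti else 1 - p
        j = t.index(1)
        for u in product((0, 1), repeat=w):
            flipped = list(u)
            flipped[j] ^= 1
            if g(u) != g(tuple(flipped)):
                total += pt * Fraction(1, 2 ** w)
    return total
```

For g = AND on w = 3 bits and p = 1/2, this gives 3/32, above the claimed lower bound p((1−p)/2)^{w−1} = 1/32.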

We will use the following standard tail bound for almost k-wise independent random variables.

Lemma 36 (Lemma 8.1 in [30]).

Let X₁, …, X_ℓ be γ-almost t-wise independent random variables supported on [0,1]. Let X := Σ_i X_i and μ := 𝐄[X]. We have

Pr[|X − μ| ≥ μ/2] ≤ O(t/μ)^t + O(μ)^t·γ.
Proof of Lemma 34.

We will show that for most choices of D and T, at least 2^{3w}θ^{−2} of the f_{D,T,i} depend on exactly 1 coordinate, and the ones that depend on at least 2 coordinates together form a log(1/ε)-junta.

We first consider the set of functions fD,T,i that are restricted to 1-bit non-constant functions. Let

J₁ := {i ∈ [ℓ] : |T∩I_i| = 1 and f_{D,T,i} is non-constant}.

If D and T were exactly independent instead of almost independent, then applying Claim 35 with our choice of p = 2^{−23w}θ³, we would have

𝐄_{D,T}[|J₁|] ≥ ℓ·p·((1−p)/2)^{3w−1} ≥ (2^{30w}θ^{−5})·(2^{−23w}θ³)·2^{−3w} ≥ 2^{4w}θ^{−2}.

As (D,T) is δ-almost k-wise independent and |I_i| ≤ 3w for every i ∈ [ℓ], the indicators 𝟙(i ∈ J₁), i ∈ [ℓ], are δ-almost k/(3w)-wise independent. So applying Lemma 36 with t = C(log(1/ε)+w)/(300w) ≤ k/(3w) and γ = δ = θ^k, and recalling that k = C(log(1/ε)+w), ℓ ≤ 2^{45w}θ^{−5}, and w ≥ log log(1/ε) + log(1/θ), we have

Pr_{D,T}[|J₁| ≤ 2^{3w}θ^{−2}] ≤ O(t·2^{−4w}θ²)^t + O(2^{41w}θ^{−3})^t·θ^k
≤ 2^{−Ω(C(log(1/ε)+w)/300)} + θ^{k/2}
≤ ε. (3)

We now consider the fD,T,i’s that depend on at least two coordinates. We will show that these functions altogether depend on at most log(1/ε) coordinates. As a result, we can think of these functions as a single log(1/ε)-junta.

Let J₂ := {i ∈ [ℓ] : |I_i∩T| ≥ 2} be the set of functions f_{D,T,i} that depend on at least 2 coordinates, and let Q := ∪_{i∈J₂} I_i∩T be the collection of coordinates these functions depend on. Suppose |Q| ≥ log(1/ε). Then, as |I_i∩T| ≥ 2 for every i ∈ J₂, it must be the case that some u ≤ log(1/ε)/2 of the subsets I_i∩T, i ∈ J₂, together contain at least 2u many free coordinates. The probability of the latter event is at most

(ℓ choose u)·(3wu choose 2u)·(p^{2u} + δ).

Setting u = ⌊log(1/ε)/(2w)⌋ + 1, and recalling that ℓ ≤ 2^{45w}θ^{−5}, p = 2^{−23w}θ³, and δ = θ^k ≤ p^{2u}, the above is at most

ℓ^u·(6w)^{2u}·2·p^{2u} ≤ (2^{45w}θ^{−5})^u·2^{3u·log w}·(2·2^{−46w}θ⁶)^u
≤ (2^{−2w}θ)^u ≤ ε. (4)

Let I₀′ := (T∩I₀) ∪ Q. By (3) and (4), with probability 1−2ε over (D,T), we have |J₁| ≥ 2^{3w}θ^{−2} and |I₀′| ≤ 3log(1/ε). In this case, the function f_{D,T} is a product of at least 2^{3w}θ^{−2} non-constant 1-bit functions and a (3log(1/ε))-junta. In other words, f_{D,T} = ∏_i f_{D,T,i} is an (ℓ′, 1, 3log(1/ε))-product, where ℓ′ ≥ 2^{3w}θ^{−2} and each f_{D,T,i}, i ∈ [ℓ′], is non-constant.

6.2.1 Long products have small bias: Proof of Claim 33

In this section, we prove Claim 33. We start by bounding the bias of a single arbitrary non-constant function on w bits.

Claim 37.

Let 𝒢 be a group of matrices supported on 𝒰_θ(d). We have ‖𝐄[g(U)]‖_op ≤ 1 − 2^{−(2w+2)}θ² for every non-constant function g: {0,1}^w → 𝒢.

Proof.

Let T be the uniform distribution on {0,1}^w. Note that 𝐄[g(U)] = 𝐄_{U,T}[𝐄_{U′}[g(U + T∧U′)]]. Applying Claim 35 with p = 1/2, with probability at least 2^{−(2w−1)}, the function g_{U,T}(x) := g(U + T∧x) is a non-constant 1-bit function. Suppose M₁ := g_{U,T}(1) ≠ g_{U,T}(0) =: M₀. By Claim 22, we have ‖(M₁+M₀)/2‖_op = ‖M₁(I + M₁^{−1}M₀)/2‖_op ≤ 1 − θ²/8. So

‖𝐄[g(U)]‖_op ≤ (1 − 2^{−(2w−1)})·1 + 2^{−(2w−1)}·‖(M₁+M₀)/2‖_op
≤ 1 − 2^{−(2w−1)}·(1 − (1 − θ²/8))
= 1 − 2^{−(2w+2)}θ².
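To illustrate Claim 37 numerically, the following sketch takes the 2-dimensional irreducible representation of the quaternion group Q₈ (for which ‖M − I‖_op ≥ √2 for every element M ≠ I, so θ = √2 is a valid choice under our reading of 𝒰_θ(d); the helpers are ours) and checks the bound for the non-constant function g(x₁,x₂) = ρ(i)^{x₁}ρ(j)^{x₂} on w = 2 bits.

```python
import math

I2 = ((1, 0), (0, 1))
RI = ((1j, 0), (0, -1j))   # rho(i) in the 2-dim irrep of Q8
RJ = ((0, 1), (-1, 0))     # rho(j)
RK = ((0, 1j), (1j, 0))    # rho(k) = rho(i) rho(j)

def op_norm(m):
    # Largest singular value of a 2x2 complex matrix (via eigenvalues of m* m).
    (a, b), (c, d) = m
    g11 = abs(a) ** 2 + abs(c) ** 2
    g22 = abs(b) ** 2 + abs(d) ** 2
    g12 = a.conjugate() * b + c.conjugate() * d
    tr, det = g11 + g22, g11 * g22 - abs(g12) ** 2
    return math.sqrt((tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2)

def average(mats):
    return tuple(tuple(sum(m[r][c] for m in mats) / len(mats) for c in range(2))
                 for r in range(2))

# ||E[g(U)]||_op over the four inputs; Claim 37 gives 1 - 2^{-(2w+2)} theta^2 = 31/32.
bias = op_norm(average([I2, RI, RJ, RK]))
```

The computed bias is 0.5, comfortably below the bound 31/32.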

Proof of Claim 33.

By Claim 37, we have ‖𝐄[f_i(U)]‖_op ≤ 1 − 2^{−(2w+2)}θ² for each i ∈ [ℓ]. Hence, for ℓ ≥ 2^{2w+2}θ^{−2}log(1/ε), we have

‖𝐄[f(U)]‖_op ≤ ∏_{i∈[ℓ]} ‖𝐄[f_i(U)]‖_op ≤ (1 − 2^{−(2w+2)}θ²)^ℓ ≤ exp(−ℓ·2^{−(2w+2)}θ²) ≤ ε.

7 Width reduction of short products: Proof of Lemma 29

In this section, we prove Lemma 29. Recall that D, T are δ-almost k-wise independent distributions with 𝐄[D_i] = 1/2 and 𝐄[T_i] = p, where

k = C(w + log(m/ε))
δ = (mw)^{−k}
p = 2^{−C}

for a sufficiently large constant C.

Given T, say a coordinate i ∈ [n] is fixed if T_i = 0 and free if T_i = 1. Let f(x) := ∏_{i=0}^ℓ f_i(x_{I_i}) be an (ℓ, 3w, 2log(1/ε))-product, where ℓ ≤ m^5·2^{30w}. We will show that with high probability over T, (1) at most log(1/ε) of the 2log(1/ε) coordinates in I₀ are free; (2) for most of the I_i, i ≥ 1, at most 2w coordinates in each of them are free; and (3) the remaining I_i's contain at most log(1/ε) many free coordinates in total.
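The case analysis above only depends on how many free coordinates survive in each block; a tiny hypothetical helper (ours, not from the text) makes this bookkeeping concrete:

```python
def free_counts(t, blocks):
    # Given the restriction pattern t (1 = free, 0 = fixed) and the disjoint
    # index sets I_i, return |T /\ I_i| for each block.
    return [sum(t[j] for j in block) for block in blocks]
```

For example, with three blocks of two coordinates each and pattern [1,0,1,1,0,0], the blocks keep 1, 2, and 0 free coordinates respectively.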

To proceed, let

J := {i ∈ [ℓ] : |T∩I_i| ≥ 2w}  and  Q := ∪_{j∈J} I_j∩T.

It suffices to show the following two claims.

Claim 38.

|Q| ≤ log(1/ε) with probability 1−ε over T.

Claim 39.

|T∩I₀| ≤ log(1/ε) with probability 1−ε over T.

Proof of Lemma 29.

Let I₀′ = (T∩I₀) ∪ Q. By Claims 38 and 39, with probability 1−2ε over T, we have |I₀′| ≤ 2log(1/ε), and for every i ∈ [ℓ]∖J, we have |T∩I_i| ≤ 2w. Therefore, the function f_{D,T} is an (ℓ, 2w, 2log(1/ε))-product.

Proof of Claim 38.

Suppose |Q| ≥ log(1/ε). Then, as |T∩I_j| ≥ 2w for every j ∈ J, it must be the case that some u ≤ log(1/ε)/(2w) of the subsets T∩I_j, j ∈ J, altogether contain at least 2wu many free coordinates. This happens with probability at most

(ℓ choose u)·(3wu choose 2wu)·(p^{2wu} + δ).

Setting u = ⌊log(1/ε)/(3w)⌋ + 1, and recalling that ℓ ≤ m^5·2^{30w} and δ = (mw)^{−k} ≤ p^{2wu}, the above is at most

ℓ^u·2^{3wu}·2·p^{2wu} ≤ (m^5·2^{30w})^u·2^{3wu}·2^{−2Cwu+1}
≤ 2^{u(33w + 5log m − 2Cw)+1}
≤ 2^{−3wu} ≤ ε,

where in the second-to-last inequality we used w ≥ log m.

Proof of Claim 39.

Recall that |I₀| ≤ 2log(1/ε) and δ = (mw)^{−k} ≤ p^{log(1/ε)}. So

Pr[|T∩I₀| ≥ log(1/ε)] ≤ (|I₀| choose log(1/ε))·(p^{log(1/ε)} + δ)
≤ 2^{2log(1/ε)}·2^{−C·log(1/ε)+1}
≤ ε/2.

8 Fooling (𝟏,𝒘,𝟑𝐥𝐨𝐠(𝟏/𝜺))-products over groups

In this section, we show how to extend the PRGs for (ℓ, 1, 0)-products over p-groups (Theorem 5) and over commutative groups [17] to fool (ℓ, 1, 3log(1/ε))-products.

8.1 𝒑-groups

We use the fact that our generator in Theorem 5 is simply the XOR of independent copies of small-bias distributions. The following claim shows that a small-bias distribution conditioned on a small number of its bits remains small-bias.

Claim 40.

Let D be an ε-biased distribution on {0,1}^n. For any set S ⊆ [n] and y ∈ {0,1}^S, the distribution of D conditioned on D_S = y is (2^{|S|+1}ε)-biased on {0,1}^{[n]∖S}.

Proof.

We may assume ε ≤ 2^{−(|S|+1)}, for otherwise the claim is vacuous, and we may assume y = 1^S, by XOR-ing D with a fixed string. For a subset T ⊆ [n], let χ_T(x) := (−1)^{Σ_{i∈T} x_i} be the corresponding parity test, and let T be any nonempty subset of [n]∖S. First observe that

𝟙(D_S = y) = ∏_{i∈S} (1 − χ_{{i}}(D))/2 = 2^{−|S|}·Σ_{S′⊆S} (−1)^{|S′|} χ_{S′}(D).

Taking expectations on both sides and applying the triangle inequality, we have Pr[D_S = y] ≥ 2^{−|S|} − ε ≥ 2^{−(|S|+1)}. Note that

𝐄[χ_T(D) | D_S = y]·Pr[D_S = y] = 𝐄[χ_T(D)·𝟙(D_S = y)]
= 𝐄[χ_T(D)·∏_{i∈S} (1 − χ_{{i}}(D))/2]
= 2^{−|S|}·Σ_{S′⊆S} (−1)^{|S′|} 𝐄[χ_{T∪S′}(D)].

As each T∪S′ is nonempty, the magnitude of the right-hand side is bounded by ε. Therefore, |𝐄[χ_T(D) | D_S = y]| ≤ Pr[D_S = y]^{−1}·ε ≤ 2^{|S|+1}ε.
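Claim 40 can be verified exactly on a toy example. Below, D is uniform on {0,1}⁴ minus the all-zeros string, which is (1/15)-biased; conditioning on the first bit being 0 yields a (1/7)-biased distribution on the remaining three bits, within the claimed factor 2^{|S|+1} = 4. (The helper `bias` is ours.)

```python
from itertools import product

def bias(points, coords):
    # max over nonempty parities T on `coords` of |E[chi_T(D)]|,
    # for D uniform on `points`.
    best = 0.0
    for mask in range(1, 2 ** len(coords)):
        t = [c for i, c in enumerate(coords) if mask >> i & 1]
        e = sum((-1) ** sum(x[c] for c in t) for x in points) / len(points)
        best = max(best, abs(e))
    return best

pts = [x for x in product((0, 1), repeat=4) if any(x)]  # {0,1}^4 minus 0000
eps = bias(pts, [0, 1, 2, 3])                           # = 1/15
cond = [x for x in pts if x[0] == 0]                    # condition on D_S = 0, S = {0}
```

The conditional bias 1/7 is well below the bound 4·(1/15) from the claim.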

Corollary 41.

There is a PRG that ε-fools (n, 1, 3log(1/ε))-products over any p-group of order m with seed length O_m(log(n/ε)).

Proof.

Recall that our generator for p-groups in Theorem 5 is simply the XOR of independent copies of (ε/n)^c-biased distributions. The spill f₀ depends on at most 3log(1/ε) bits, so by Claim 40, for any fixing of the input bits of f₀, each copy remains (2ε^{−3}·(ε/n)^c)-biased, which is ((ε/n)^{c′})-biased for a slightly smaller constant c′. The corollary follows.

8.2 Commutative groups

We now show that the Gopalan–Kane–Meka PRG fools (ℓ, 1, 3log(1/ε))-products. We will use the following PRG by Gopalan, Kane, and Meka [18], which fools (ℓ, 1, 0)-products over ℤ_m. A simple argument shows that the same PRG also fools (ℓ, 1, 3log(1/ε))-products.

Claim 42.

There is an explicit PRG P that ε-fools (ℓ, 1, 3log(1/ε))-products over commutative groups of order m with seed length O_m(log(ℓ/ε)·(log log(ℓ/ε))²).

Lemma 43 (Theorem 1.1 and Lemma 9.1 in [18]).

There is an explicit P_GKM: {0,1}^s → {0,1}^n, where s = O(log(W/ε))·(log log(W/ε))², such that the following holds. If w ∈ ℤ^n satisfies Σ_i |w_i| ≤ W, then

dist_TV(⟨w, U⟩, ⟨w, P_GKM(U)⟩) ≤ O(W)·ε.
Proof of Claim 42.

By Claim 21, it suffices to fool the product over each irreducible representation of G with error ε/m. Since G is commutative, all its irreps are 1-dimensional. Moreover, they are supported on subsets of μ_m := {z ∈ ℂ : z^m = 1}. Let f = ∏_{i=0}^ℓ f_i be an (ℓ, 1, 3log(1/ε))-product over μ_m.

Let ω := e^{2πi/m}. Note that any 1-bit function g: {0,1} → μ_m can be written as

g(y) = ω^{a₁y + a₀(1−y)} = ω^{a₀}·ω^{(a₁−a₀)y}

for some a₀, a₁ ∈ {0, …, m−1}. We can also write f₀(x_{I₀}) as ω^{h(x_{I₀})} for some h: {0,1}^{I₀} → {0, …, m−1}. Therefore, f has the form

f(x) = ω^b·ω^{Σ_{j∈J} a_j x_j + h(x_{I₀})},

for some coefficients b and a_j taking values in {0, …, m−1}, where J and I₀ are disjoint subsets of [n]. Write I₀ = {r₁ < ⋯ < r_{|I₀|}} for some r_j ∈ [n]. Consider the integer-valued function F: {0,1}^n → ℤ defined by

F(x) := Σ_{j∈J} a_j x_j + 2^{⌈log(mℓ)⌉}·Σ_{j=1}^{|I₀|} 2^{j−1} x_{r_j}.

So the low-order ⌈log(mℓ)⌉ bits of F(x) encode the first sum, and the remaining |I₀| bits are the binary encoding of the string x_{I₀}. Note that we can compute f(x) given F(x): the first sum mod m, and hence h(x_{I₀}), are both determined by F(x). Moreover, F(x) = ⟨w, x⟩ for some w ∈ ℤ^n with Σ_i |w_i| ≤ W = O(mℓ/ε³). Therefore, if we let P_GKM be the PRG in Lemma 43 with error ε/(m·O(W)), which uses a seed of O(log(mℓ/ε))·(log log(mℓ/ε))² bits, then it follows that dist_TV(F(U), F(P_GKM(U))) ≤ ε/m.
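Our reading of the encoding (with the shift 2^{⌈log(mℓ)⌉} placed above the largest possible value of the first sum; helper names hypothetical) can be sketched and inverted as follows:

```python
import math

def encode(a, x, I0, m):
    # F(x) = sum_{j not in I0} a_j x_j + SHIFT * (binary encoding of x_{I0}),
    # where SHIFT exceeds the maximum possible value of the first sum.
    low = sum(a[j] * x[j] for j in range(len(x)) if j not in I0)
    shift = 2 ** math.ceil(math.log2(m * len(x) + 1))
    high = sum(2 ** t * x[r] for t, r in enumerate(I0))
    return low + shift * high, shift

def decode(F, shift, m):
    # f(x) is computable from F(x): the low bits give the width-1 sum mod m,
    # the high bits give the junta's input x_{I0}.
    return (F % shift) % m, F // shift
```

For instance, with m = 7, weights [3,1,4,1,5,9], input [1,0,1,1,0,1] and I₀ = {4,5}, the first sum is 8 ≡ 1 (mod 7) and the junta's input encodes to 2, both recoverable from the single integer F(x).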

9 Mixing characterization of Dedekind groups

In this section we give a proof of Lemma 9, which says that a (finite) group G is mixing if and only if it is Dedekind. The proof was provided by Yves de Cornulier at https://mathoverflow.net/a/482286/8271.

Recall from Definition 6 that a (finite) group G is mixing if for every irreducible (unitary) representation ρ and every element g ∈ G with ρ(g) ≠ I, the matrix ρ(g) has no eigenvalue 1, and that G is Dedekind if it is either commutative or has the form Q₈ × ℤ₂ᵗ × D for some integer t and commutative group D of odd order.

We first note that G being mixing is equivalent to the following condition: for every g ∈ G and every irrep ρ, the subspace ker(ρ(g)−I) is a subrepresentation. (Indeed, since ρ is irreducible, ker(ρ(g)−I) is a subrepresentation if and only if it is {0} or the whole space, i.e., ρ(g) has no eigenvalue 1 or ρ(g) = I.) The proof of Lemma 9 follows from the following two claims, together with the equivalence that a group G is Dedekind if and only if every subgroup of G is normal.

Claim 44.

If G is Dedekind, then ker(ρ(g)I) is a subrepresentation for every g and ρ.

Proof.

Take an element g ∈ G. By the definition of Dedekind, every subgroup of G is normal; in particular, the cyclic subgroup ⟨g⟩ is normal.

Take v ∈ W_g := {v : ρ(g)v = v}. To show that W_g is a subrepresentation, we need to show that ρ(g)(ρ(h)v) = ρ(h)v for every h ∈ G. This is equivalent to showing

ρ(h)^{−1}ρ(g)ρ(h)v = ρ(h^{−1}gh)v = v.

Since ⟨g⟩ is normal, we have h^{−1}gh = g^i for some integer i. It is clear that ρ(g^i)v = ρ(g)^i v = ρ(g)^{i−1}(ρ(g)v) = ρ(g)^{i−1}v = ⋯ = v. So indeed W_g is a subrepresentation.

Claim 45.

If ker(ρ(g)−I) is a subrepresentation for every g ∈ G and every irrep ρ of G, then G is Dedekind.

Proof.

As in the previous claim, we consider W_g = ker(ρ(g)−I) = {v : ρ(g)v = v}. Note that for v ∈ W_g, we have ρ(g^i)v = ρ(g)^i v = v. Suppose W_g is a subrepresentation; since the hypothesis holds for every irrep, it extends to direct sums of irreps, hence to every representation ρ of G. Take h ∈ G and v ∈ W_g. Since ρ(h)W_g ⊆ W_g, we have ρ(g)ρ(h)v = ρ(h)v. That means ρ(h^{−1}gh)v = v for every v ∈ W_g. Applying this to the permutation representation of G on the cosets of ⟨g⟩, the vector corresponding to the coset ⟨g⟩ is fixed by g, and hence by h^{−1}gh. This means h^{−1}gh ∈ ⟨g⟩, so ⟨g⟩ is normal. As g was arbitrary, every subgroup of G is normal, and G is Dedekind.
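The forward direction can be checked mechanically for the Dedekind group Q₈: in its 2-dimensional irrep, ker(ρ(g)−I) is {0} for every g with ρ(g) ≠ I (so it is trivially a subrepresentation), which the sketch below verifies by checking that det(ρ(g)−I) ≠ 0. (The matrices and helpers are ours; the 1-dimensional irreps satisfy the condition trivially.)

```python
def scale(s, m):
    return tuple(tuple(s * e for e in row) for row in m)

I2 = ((1, 0), (0, 1))
RI = ((1j, 0), (0, -1j))   # rho(i)
RJ = ((0, 1), (-1, 0))     # rho(j)
RK = ((0, 1j), (1j, 0))    # rho(k)
Q8 = [scale(s, m) for s in (1, -1) for m in (I2, RI, RJ, RK)]

def det_minus_identity(g):
    (a, b), (c, d) = g
    return (a - 1) * (d - 1) - b * c

# For every g with rho(g) != I, rho(g) - I is invertible, i.e. rho(g) has no
# eigenvalue 1, so ker(rho(g) - I) = {0} is a subrepresentation.
dets = [det_minus_identity(g) for g in Q8 if g != I2]
```

By contrast, for a non-Dedekind group such as S₃, a transposition in the 2-dimensional irrep has eigenvalues {1, −1}, so the corresponding kernel is a non-trivial proper subspace and cannot be a subrepresentation.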

References

  • [1] Miklos Ajtai and Avi Wigderson. Deterministic simulation of probabilistic constant-depth circuits. Advances in Computing Research - Randomness and Computation, 5:199–223, 1989.
  • [2] Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost k-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992. doi:10.1002/RSA.3240030308.
  • [3] Noga Alon, Alexander Lubotzky, and Avi Wigderson. Semi-direct product in groups and zig-zag product in graphs: Connections and applications. In IEEE Symp. on Foundations of Computer Science (FOCS), pages 630–637, 2001. doi:10.1109/SFCS.2001.959939.
  • [4] Benny Applebaum, Yuval Ishai, and Eyal Kushilevitz. Cryptography in NC0. SIAM J. on Computing, 36(4):845–888, 2006.
  • [5] Roy Armoni, Michael E. Saks, Avi Wigderson, and Shiyu Zhou. Discrepancy sets and pseudorandom generators for combinatorial rectangles. In 37th IEEE Symp. on Foundations of Computer Science (FOCS), pages 412–421, 1996. doi:10.1109/SFCS.1996.548500.
  • [6] David A. Mix Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC1. J. of Computer and System Sciences, 38(1):150–164, 1989. doi:10.1016/0022-0000(89)90037-8.
  • [7] Jonah Blasiak, Thomas Church, Henry Cohn, Joshua A. Grochow, and Chris Umans. Which groups are amenable to proving exponent two for matrix multiplication? CoRR, abs/1712.02302, 2017. arXiv:1712.02302.
  • [8] Andrej Bogdanov and Emanuele Viola. Pseudorandom bits for polynomials. SIAM J. on Computing, 39(6):2464–2486, 2010. doi:10.1137/070712109.
  • [9] Eshan Chattopadhyay, Pooya Hatami, Kaave Hosseini, and Shachar Lovett. Pseudorandom generators from polarizing random walks. Theory Comput., 15:1–26, 2019. doi:10.4086/TOC.2019.V015A010.
  • [10] Henry Cohn, Robert D. Kleinberg, Balázs Szegedy, and Christopher Umans. Group-theoretic algorithms for matrix multiplication. In IEEE Symp. on Foundations of Computer Science (FOCS), pages 379–388, 2005. doi:10.1109/SFCS.2005.39.
  • [11] Harm Derksen, Peter Ivanov, Chin Ho Lee, and Emanuele Viola. Pseudorandomness, symmetry, smoothing: II, 2024. doi:10.48550/arXiv.2407.12110.
  • [12] Harm Derksen, Chin Ho Lee, and Emanuele Viola. Boosting uniformity in quasirandom groups: fast and simple. In IEEE Symp. on Foundations of Computer Science (FOCS), 2024.
  • [13] Persi Diaconis. Group representations in probability and statistics, volume 11 of Institute of Mathematical Statistics Lecture Notes—Monograph Series. Institute of Mathematical Statistics, Hayward, CA, 1988.
  • [14] Dean Doron, Pooya Hatami, and William M. Hoza. Log-seed pseudorandom generators via iterated restrictions. In Shubhangi Saraf, editor, 35th Computational Complexity Conference, CCC 2020, July 28-31, 2020, Saarbrücken, Germany (Virtual Conference), volume 169 of LIPIcs, pages 6:1–6:36. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.CCC.2020.6.
  • [15] David S. Dummit and Richard M. Foote. Abstract Algebra. Wiley, 3rd edition, 2004.
  • [16] Michael A. Forbes and Zander Kelley. Pseudorandom generators for read-once branching programs, in any order. In IEEE Symp. on Foundations of Computer Science (FOCS), 2018. doi:10.1109/FOCS.2018.00093.
  • [17] Parikshit Gopalan, Daniel Kane, and Raghu Meka. Pseudorandomness via the discrete fourier transform. In IEEE Symp. on Foundations of Computer Science (FOCS), pages 903–922, 2015. doi:10.1109/FOCS.2015.60.
  • [18] Parikshit Gopalan, Daniel M. Kane, and Raghu Meka. Pseudorandomness via the discrete Fourier transform. SIAM J. Comput., 47(6):2451–2487, 2018. doi:10.1137/16M1062132.
  • [19] Parikshit Gopalan, Raghu Meka, Omer Reingold, Luca Trevisan, and Salil Vadhan. Better pseudorandom generators from milder pseudorandom restrictions. In IEEE Symp. on Foundations of Computer Science (FOCS), 2012.
  • [20] Parikshit Gopalan, Raghu Meka, Omer Reingold, and David Zuckerman. Pseudorandom generators for combinatorial shapes. SIAM J. Comput., 42(3):1051–1076, 2013. doi:10.1137/110854990.
  • [21] W. T. Gowers. Generalizations of Fourier analysis, and how to apply them. Bull. Amer. Math. Soc. (N.S.), 54(1):1–44, 2017. doi:10.1090/bull/1550.
  • [22] W. T. Gowers and Emanuele Viola. Mixing in non-quasirandom groups. In ACM Innovations in Theoretical Computer Science conf. (ITCS), 2022.
  • [23] Elad Haramaty, Chin Ho Lee, and Emanuele Viola. Bounded independence plus noise fools products. SIAM J. Comput., 47(2):493–523, 2018. doi:10.1137/17M1129088.
  • [24] Pooya Hatami and William Hoza. Paradigms for unconditional pseudorandom generators. Foundations and Trends® in Theoretical Computer Science, 16(1-2):1–210, 2024. doi:10.1561/0400000109.
  • [25] Michal Koucký, Prajakta Nimbhorkar, and Pavel Pudlák. Pseudorandom generators for group products: extended abstract. In STOC, pages 263–272. ACM, 2011. doi:10.1145/1993636.1993672.
  • [26] Jack B. Kuipers. Quaternions in computer graphics and robotics. In SIGGRAPH 2002 Course Notes, San Antonio, TX, 2002. ACM SIGGRAPH.
  • [27] Chin Ho Lee. Fourier bounds and pseudorandom generators for product tests. In 34th Computational Complexity Conference, volume 137. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.CCC.2019.7.
  • [28] Chin Ho Lee, Edward Pyne, and Salil Vadhan. Fourier growth of regular branching programs. In Approximation, randomization, and combinatorial optimization. Algorithms and techniques, volume 245 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 2, 21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/lipics.approx/random.2022.2.
  • [29] Chin Ho Lee and Emanuele Viola. The coin problem for product tests. ACM Trans. Comput. Theory, 10(3):Art. 14, 10, 2018. doi:10.1145/3201787.
  • [30] Chin Ho Lee and Emanuele Viola. More on bounded independence plus noise: Pseudorandom generators for read-once polynomials. Theory of Computing, 16:1–50, 2020. Available at https://www.ccs.neu.edu/home/viola/papers/LV-rop.pdf. doi:10.4086/TOC.2020.V016A007.
  • [31] Chin Ho Lee and Emanuele Viola. More on bounded independence plus noise: pseudorandom generators for read-once polynomials. Theory Comput., 16:Paper No. 7, 50, 2020. doi:10.4086/toc.2020.v016a007.
  • [32] Shachar Lovett. Unconditional pseudorandom generators for low degree polynomials. Theory of Computing, 5(1):69–82, 2009. doi:10.4086/TOC.2009.V005A003.
  • [33] Shachar Lovett, Partha Mukhopadhyay, and Amir Shpilka. Pseudorandom generators for CC0[p] and the Fourier spectrum of low-degree polynomials over finite fields. In 51th IEEE Symp. on Foundations of Computer Science (FOCS). IEEE, 2010.
  • [34] Shachar Lovett, Omer Reingold, Luca Trevisan, and Salil Vadhan. Pseudorandom bit generators that fool modular sums. In Approximation, randomization, and combinatorial optimization, volume 5687 of Lecture Notes in Comput. Sci., pages 615–630. Springer, Berlin, 2009. doi:10.1007/978-3-642-03685-9_46.
  • [35] Shachar Lovett, Omer Reingold, Luca Trevisan, and Salil P. Vadhan. Pseudorandom bit generators that fool modular sums. In 13th Workshop on Randomization and Computation (RANDOM), volume 5687 of Lecture Notes in Computer Science, pages 615–630. Springer, 2009. doi:10.1007/978-3-642-03685-9_46.
  • [36] Chi-Jen Lu. Improved pseudorandom generators for combinatorial rectangles. Combinatorica, 22(3):417–433, 2002. doi:10.1007/S004930200021.
  • [37] Raghu Meka, Omer Reingold, and Avishay Tal. Pseudorandom generators for width-3 branching programs. In STOC’19 – Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 626–637. ACM, New York, 2019. doi:10.1145/3313276.3316319.
  • [38] Raghu Meka and David Zuckerman. Small-bias spaces for group products. In 13th Workshop on Randomization and Computation (RANDOM), volume 5687 of Lecture Notes in Computer Science, pages 658–672. Springer, 2009. doi:10.1007/978-3-642-03685-9_49.
  • [39] Raghu Meka and David Zuckerman. Small-bias spaces for group products. In Approximation, randomization, and combinatorial optimization, volume 5687 of Lecture Notes in Comput. Sci., pages 658–672. Springer, Berlin, 2009. doi:10.1007/978-3-642-03685-9_49.
  • [40] J. Naor and M. Naor. Small-bias probability spaces: efficient constructions and applications. In 22nd ACM Symp. on the Theory of Computing (STOC), pages 213–223. ACM, 1990. doi:10.1145/100216.100244.
  • [41] Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993. doi:10.1137/0222053.
  • [42] Pierre Péladeau and Denis Thérien. On the languages recognized by nilpotent groups (a translation of “Sur les langages reconnus par des groupes nilpotents”). Electron. Colloquium Comput. Complex., TR01-040, 2001. URL: https://eccc.weizmann.ac.il/eccc-reports/2001/TR01-040/index.html.
  • [43] Omer Reingold, Thomas Steinke, and Salil P. Vadhan. Pseudorandomness for regular branching programs via Fourier analysis. In Workshop on Randomization and Computation (RANDOM), pages 655–670, 2013. doi:10.1007/978-3-642-40328-6_45.
  • [44] Eyal Rozenman, Aner Shalev, and Avi Wigderson. Iterative construction of Cayley expander graphs. Theory Comput., 2(5):91–120, 2006. doi:10.4086/TOC.2006.V002A005.
  • [45] Jean Pierre Serre. Linear Representations of Finite Groups. Springer, 1977.
  • [46] Thomas Steinke. Pseudorandomness for permutation branching programs without the group theory. Electron. Colloquium Comput. Complex., TR12-083, 2012. URL: https://eccc.weizmann.ac.il/report/2012/083.
  • [47] Xiaorui Sun. Faster isomorphism for p-groups of class 2 and exponent p. In STOC, pages 433–440. ACM, 2023.
  • [48] Audrey Terras. Fourier analysis on finite groups and applications, volume 43 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1999. doi:10.1017/CBO9780511626265.
  • [49] Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1-3):1–336, 2012. doi:10.1561/0400000010.
  • [50] Emanuele Viola. On the power of small-depth computation. Foundations and Trends in Theoretical Computer Science, 5(1):1–72, 2009. doi:10.1561/0400000033.
  • [51] Emanuele Viola. The sum of d small-bias generators fools polynomials of degree d. Computational Complexity, 18(2):209–217, 2009. doi:10.1007/S00037-009-0273-5.
  • [52] Thomas Watson. Pseudorandom generators for combinatorial checkerboards. Computational Complexity, 22(4):727–769, 2013. doi:10.1007/s00037-012-0036-6.
  • [53] Avi Wigderson. Representation theory of finite groups, and applications. Available at https://www.math.ias.edu/avi/node/2289, 2010.