
Pseudorandom Bits for Non-Commutative Programs

Chin Ho Lee ORCID North Carolina State University, Raleigh, NC, USA Emanuele Viola ORCID Northeastern University, Boston, MA, USA
Abstract

We obtain new explicit pseudorandom generators for several computational models involving groups. Our main results are as follows:

  1.

    We consider read-once group-products over a finite group $G$, i.e., tests of the form $\prod_{i=1}^n g_i^{x_i}$ where $g_i\in G$, a special case of read-once permutation branching programs. We give generators with optimal seed length $c_G\log(n/\varepsilon)$ over any $p$-group. The proof uses the small-bias plus noise paradigm, but derandomizes the noise to avoid the recursion in previous work. Our generator works when the bits are read in any order. Previously, for any non-commutative group the best seed length was $\log n\cdot\log(1/\varepsilon)$, even for a fixed order.

  2.

    We give a reduction that “lifts” suitable generators for group products over $G$ to a generator that fools width-$w$ block products, i.e., tests of the form $\prod_i g_i^{f_i}$ where the $f_i$ are arbitrary functions on disjoint blocks of $w$ bits. Block products generalize several previously studied classes. The reduction applies to groups that are mixing in a representation-theoretic sense that we identify.

  3.

    Combining (2) with (1) and other works we obtain new generators for block products over the quaternions or over any commutative group, with nearly optimal seed length. In particular, we obtain generators for read-once polynomials modulo any fixed m with nearly optimal seed length. Previously this was known only for m=2.

  4.

    We give a new generator for products over “mixing groups.” The construction departs from previous work and uses representation theory. For constant error, we obtain optimal seed length, improving on previous work (which applied to any group).

This paper identifies a challenge in the area that is reminiscent of a roadblock in circuit complexity – handling composite moduli – and points to several classes of groups to be attacked next.

Keywords and phrases:
Group programs, Space-bounded derandomization, Representation theory
Funding:
Chin Ho Lee: Work done in part at Harvard University, supported by Madhu Sudan’s and Salil Vadhan’s Simons Investigator Awards.
Emanuele Viola: Supported by NSF grants CCF-2114116 and CCF-2430026.
Copyright and License:
© Chin Ho Lee and Emanuele Viola; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Pseudorandomness and derandomization
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2025/071/
Acknowledgements:
We thank Yves de Cornulier for answering a question about Dedekind groups and providing a proof of Lemma 9.
Editors:
Srikanth Srinivasan

1 Introduction

The construction of explicit pseudorandom generators is a fundamental research goal that has applications in many areas of theoretical computer science. For background we refer to the recent survey [24]. We first define pseudorandom generators, incorporating the variants of any order (reflected in the permutation π) and non-Boolean tests (reflected in the range set R).

Definition 1 (Pseudorandom generators (PRGs)).

An explicit function $P:\{0,1\}^s\to\{0,1\}^n$ is a pseudorandom generator (PRG) with seed length $s$ and error $\varepsilon$ for a class of functions $F$ mapping $\{0,1\}^n$ to a set $R$ if for every $f\in F$ the statistical distance between $f(P(U_s))$ and $f(U_n)$ is at most $\varepsilon$, where $U_s$ denotes the uniform distribution over $\{0,1\}^s$. We say $P$ fools $F$ in any order if $\pi(P)$ fools $F$ for every permutation $\pi$ of the positions of the $n$ input bits. A PRG is explicit if it is computable in time $n^c$.

PRGs for branching programs, and group programs.

A main agenda is obtaining explicit pseudorandom generators for read-once branching programs (ROBPs), with an ultimate goal of proving BPL=L. However, even for constant-width permutation ROBPs, the best known seed length is $\log n\cdot\log(1/\varepsilon)$. This is $\log^2 n$ when $\varepsilon=1/n$, and thus falls short of the optimal seed length $c\log(n/\varepsilon)$. For permutation ROBPs of width $w$, seed length $c_w\log(n/\varepsilon)\log(\varepsilon^{-1}\log n)$ follows from instantiating the “Polarizing Random Walks” framework [9] with a bound from [43, 28]. These generators work in any order; thus they essentially match the seed length $c_w\log(n/\varepsilon)\log(1/\varepsilon)$ that was already available for fixed order in a sequence of exciting works culminating in [46].

The class of permutation ROBPs is equivalent to group programs (see e.g. [25]):

Definition 2.

A program (or product) $p$ of length $n$ over a group $G$ is a tuple $(g_1,g_2,\dots,g_n)\in G^n$. The program computes the function $f_p:\{0,1\}^n\to G$ given by $f_p(x):=\prod_{i\in[n]}g_i^{x_i}$.
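As a concrete illustration (ours, not from the paper), a group program over the symmetric group $\mathbb{S}_3$ can be evaluated by multiplying in the element $g_i$ exactly when the input bit $x_i$ is 1; permutations are represented as tuples over $\{0,1,2\}$.

```python
# A minimal sketch of Definition 2: evaluating a group program over S3.
# Permutations are tuples; compose(g, h) is the permutation g after h.
def compose(g, h):
    # (g*h)(k) = g(h(k))
    return tuple(g[h[k]] for k in range(len(h)))

IDENTITY = (0, 1, 2)

def run_program(gs, x):
    # Computes prod_i g_i^{x_i}: multiply g_i in exactly when x_i = 1.
    out = IDENTITY
    for g, xi in zip(gs, x):
        if xi:
            out = compose(out, g)
    return out
```

For example, `run_program([(1,0,2),(1,2,0)], [1,1])` multiplies a transposition by a 3-cycle and returns `(0,2,1)`.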

No generator with seed length less than $\log n\cdot\log(1/\varepsilon)$ was available for any non-commutative group. While optimal seed length $c\log(n/\varepsilon)$ was known for $\mathbb{Z}_2$ since [40], it took nearly 20 years and different techniques to obtain the same seed length over $\mathbb{Z}_3$ [35, 38], and remarkably that seed length is still not available even for $\mathbb{Z}_6$ (see [18] for the best known construction).

PRGs for read-once polynomials.

Another model that has received significant attention is read-once polynomials. Intuitively, this model can serve as a bridge between permutation and non-permutation ROBPs. The available generators for non-permutation ROBPs have significantly worse seed length than for permutation programs, see e.g. [37] and the discussion there.

A sequence of works [30, 37, 14] culminated in PRGs with seed length $c\log n+\log(1/\varepsilon)\log^c\log(1/\varepsilon)$ for read-once polynomials over $\mathbb{Z}_2$. But for other domains, such as $\mathbb{Z}_3$, such good seed lengths were not known.

PRGs for block-products.

A more general model that generalizes and unifies the previous ones is what we call block products of width $w$ over a group $G$. Here, the input bits are arbitrarily partitioned into blocks of $w$ bits, arbitrary Boolean functions are then applied to each block, and finally the outputs are used as exponents of group elements. For our results, we will need to allow one block to be larger; we call this block the spill and incorporate it in the following definition.

Definition 3 (Block-product with spill).

A function $f:\{0,1\}^n\to G$ is computable by a $w$-block product with $\ell$ terms and a spill of $q$ bits, written as an $(\ell,w,q)$-product, over a group $G$ if there exist $\ell+1$ disjoint subsets $I_0,I_1,\dots,I_\ell\subseteq[n]$, where $|I_0|\le q$ and $|I_i|\le w$ for each $i\in[\ell]$, such that

$f(x)=\prod_{i=0}^{\ell}g_i^{f_i(x_{I_i})}$

for some group elements $g_i\in G$ and functions $f_i:\{0,1\}^{I_i}\to\{0,1\}$. Here $x_{I_i}$ denotes the $|I_i|$ bits of $x$ indexed by $I_i$.
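The definition above can be sketched in code (our illustration; names are ours) for the cyclic group $\mathbb{Z}_m$, where exponentiation becomes multiplication in additive notation.

```python
# A hedged sketch of Definition 3 over the cyclic group Z_m: disjoint
# index sets I_i, arbitrary Boolean functions f_i, group elements g_i.
def block_product(x, blocks, m):
    # blocks: list of (g_i, I_i, f_i) with g_i in Z_m, the I_i disjoint
    # index sets, and f_i a Boolean function of the selected bits.
    total = 0
    for g, idx, f in blocks:
        bits = tuple(x[j] for j in idx)
        total = (total + g * f(bits)) % m  # g_i^{f_i} in additive notation
    return total
```

For instance, with one XOR block and one AND block over $\mathbb{Z}_3$, the product evaluates to the sum of the triggered exponents modulo 3.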

Note that block products are unordered by definition. They generalize several function classes that have been studied, including modular sums [34, 39, 18] (when $G$ is a cyclic group and $w=1$), product tests with outputs in $\{-1,1\}$ (a.k.a. combinatorial checkerboards) [52, 23, 29, 31, 27] (when $G=\mathbb{Z}_2$), themselves a generalization of combinatorial rectangles [5, 36, 19], and unordered combinatorial shapes [20, 18] (when $G=\mathbb{Z}_{m+1}$). Block products also generalize read-once polynomials, because one can show (under the uniform and typically also pseudorandom distributions) that monomials of degree larger than $\log(n/\varepsilon)$ do not affect the result significantly, and so one can simulate these polynomials with blocks of size $\log(n/\varepsilon)$.

In terms of generators, a series of works culminating in [27] gives nearly optimal seed length (i.e., $w+\log(\ell/\varepsilon)$ up to lower-order factors) over $\mathbb{Z}_2$. But such a result was not known over other groups, such as $\mathbb{Z}_3$ or any non-commutative group.

1.1 Our results

In this work we bring new techniques, notably from group theory, to bear on these problems, and use them to obtain new pseudorandom generators.

First, we obtain optimal seed length for products over p-groups.

Definition 4.

A finite $p$-group is a group of order $p^k$ for an integer $k$ and a prime $p$.

Equivalently, the order of every element is a power of $p$. (The latter definition makes sense for infinite groups, but we only consider finite groups.) The class of $p$-groups is rich and has been studied in various areas of the theory of computation. For example, $p$-groups remain a candidate for good group-theoretic algorithms for matrix multiplication [7]; isomorphism testing for a subclass of $p$-groups has been identified as a barrier to faster group-isomorphism algorithms [47]; $p$-groups (specifically, unitriangular groups) are used for cryptography in NC0 [4] (see [50] for an exposition emphasizing these groups); finally, $p$-groups (specifically, the quaternions) are used in computer graphics to express 3D rotations [26].

We now give a few examples of such groups, all of which are non-commutative.

  • The quaternion group $\mathbb{Q}_8$ of order 8 is a 2-group.

  • Unitriangular groups over $\mathbb{F}_p$ are $p$-groups. They consist of the upper-triangular matrices (of some fixed dimension) with 1 on the diagonal and entries in $\mathbb{F}_p$.

  • Wreath products give natural examples of $p$-groups. For example, the wreath product $\mathbb{Z}_p\wr\mathbb{Z}_p$ is a group of order $p^{p+1}$, hence a $p$-group. This group is the direct product $\mathbb{Z}_p^p$ with another element of $\mathbb{Z}_p$ acting on the tuple by shifting the coordinates. For concreteness, the case $p=2$ can be presented as triples $(a,b;z)$ where $a,b,z\in\mathbb{Z}_2$, $(a',b';0)(a,b;z)=(a'+a,b'+b;z)$, and $(a',b';1)(a,b;z)=(a'+b,b'+a;1+z)$. Wreath-product constructions (not necessarily $p$-groups) have been studied in a variety of contexts, ranging from group-theoretic algorithms for matrix multiplication [10], to constructions of expander graphs [3, 44], to mixing in non-quasirandom groups [22].

  • The dihedral group $\mathbb{D}_n$ is the group of order $2n$ of symmetries of a regular polygon with $n$ sides. When $n=2^t$, $\mathbb{D}_n$ is a 2-group.
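The multiplication rule stated above for the case $p=2$ of the wreath product can be checked mechanically. The following sketch (ours) implements it and verifies, by brute force, that it defines a non-commutative group of order $8=2^3$, i.e. a 2-group.

```python
# Sketch of Z2 wr Z2 via the rule in the text: elements (a, b; z), where
# z = 1 swaps the pair it acts on.
def mul(u, v):
    (a1, b1, z1), (a2, b2, z2) = u, v
    if z1 == 0:
        return ((a1 + a2) % 2, (b1 + b2) % 2, z2)
    return ((a1 + b2) % 2, (b1 + a2) % 2, (1 + z2) % 2)

ELEMS = [(a, b, z) for a in (0, 1) for b in (0, 1) for z in (0, 1)]
```

Enumerating all $8^3$ triples confirms associativity, and two elements that multiply differently in the two orders witness non-commutativity.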

We give pseudorandom generators for programs over $p$-groups, with optimal seed length. Throughout this paper, we use $c_x$ to denote a constant that depends on the variable $x$.

Theorem 5.

Let $G$ be a $p$-group. There is an explicit pseudorandom generator that fools programs of length $n$ over $G$ in any order, with seed length $c_G\log(n/\varepsilon)$.

In fact, the same result holds even for block-products over p-groups with constant block length w.

Polynomials and block-products.

We give a general reduction that “lifts” a PRG $P_1$ for group products over $G$ to a PRG for block products (and read-once polynomials) over $G$. The reduction applies to any group $G$ that is mixing:

Definition 6 (Mixing groups).

A (finite) group $G$ is mixing if for every irreducible (unitary) representation $\rho$ and every element $g\in G$, the matrix $\rho(g)$ either equals the identity or has no eigenvalue equal to 1.

 Remark 7.

Our results for mixing groups (Theorems 10 and 12) apply more generally to fooling words over a mixing subset $H$ of a (not necessarily mixing) group $G$. The property we need is that Definition 6 holds for every element in $H$. There are many examples of mixing subsets of non-mixing groups that generate the entire group $G$. For example, for $\mathbb{S}_3=\mathbb{D}_3$ it suffices to exclude the “flip” elements, i.e. the non-identity elements $r$ with $r^2=1$. Moreover, there are natural examples for infinite groups. However, for simplicity we focus on finite mixing groups.
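The failure of mixing for $\mathbb{S}_3$ can be seen numerically in its 2-dimensional irrep, realized (our illustration) by rotations and reflections of the plane: a reflection is not the identity yet has eigenvalue 1, while a rotation's eigenvalues stay away from 1.

```python
import numpy as np

# The 2-dimensional irrep of S3 = D3 as plane symmetries (our sketch).
theta = 2 * np.pi / 3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])   # a 3-cycle
flip = np.array([[1.0, 0.0], [0.0, -1.0]])          # a "flip" (reflection)

def has_eigenvalue_one(M, tol=1e-9):
    return any(abs(ev - 1) < tol for ev in np.linalg.eigvals(M))
```

Here `flip` witnesses non-mixing, and the rotations form a mixing subset, matching the remark above.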

We note that mixing groups are exactly the class of Dedekind groups.

Definition 8 (Finite Dedekind groups).

A finite Dedekind group is a finite group all of whose subgroups are normal; equivalently, it is either commutative or of the form $\mathbb{Q}_8\times\mathbb{Z}_2^t\times D$ for an integer $t$ and a commutative group $D$ of odd order. A non-commutative Dedekind group is also called a Hamiltonian group.

Lemma 9 (Mixing characterization of Dedekind groups).

A finite group is mixing if and only if it is Dedekind.

A proof of Lemma 9 is in Section 9. We can now state our reduction:

Theorem 10.

Let $G$ be a mixing group. Suppose there is a PRG $P_1$ with seed length $s_1$ that $\varepsilon$-fools $(\ell,1,3\log(1/\varepsilon))$-products over $G$. Then there is a PRG that $\big(c_G\log(w+\log(\ell/\varepsilon))\cdot\varepsilon\big)$-fools $(\ell,w,\log(1/\varepsilon))$-products over $G$, with seed length

$c_G\,\big(s_1+\log(\ell/\varepsilon)+w\big)\cdot\log^c\big(w+\log(n/\varepsilon)\big).$

Note that if $P_1$ has nearly optimal seed length (i.e., $\log(\ell/\varepsilon)$ times lower-order terms) then the final PRG also has nearly optimal seed length (i.e., $w+\log(\ell/\varepsilon)$ times lower-order terms).

Applying the reduction (Theorem 10), we obtain near-optimal PRGs for block products over commutative groups and over Dedekind 2-groups (in particular, the quaternions).

Corollary 11.

Let $G$ be either a commutative group or a Dedekind 2-group, that is, $G=\mathbb{Q}_8\times\mathbb{Z}_2^t$ for some $t$. There is an explicit PRG that $\varepsilon$-fools $(\ell,w,0)$-block products over $G$ with seed length $c_G(w+\log(\ell/\varepsilon))\log^c(w+\log(n/\varepsilon))$.

Proof.

We use the reduction (Theorem 10). For commutative groups we use the PRG in [17] for $P_1$; for Dedekind 2-groups we use our Theorem 5 for $P_1$. Actually, in both cases the generators were only stated for group products, while we need to handle the spill. The simple modification is in Section 8.

As remarked earlier, as a consequence of Corollary 11 we obtain PRGs for read-once polynomials in $n$ variables over any finite field $\mathbb{F}$ with near-optimal seed length $c_{\mathbb{F}}\log(n/\varepsilon)\log^c\log(n/\varepsilon)$. Again, this was not known even for $\mathbb{F}_3$.

This result is also a step towards handling group programs over more general groups, for example nilpotent groups, which are direct products of p-groups (for different p). Jumping ahead, our techniques imply that generators for such groups follow from generators for (non-read-once) polynomials over composites.

Finally, we give a new generator for products over mixing groups.

Theorem 12.

Let $G$ be a mixing group. There is an explicit PRG $P$ that $\varepsilon$-fools length-$n$ programs over $G$, in any order, with seed length $c_G\log(n/\varepsilon)\log(1/\varepsilon)$.

The parameter improvement over previous work appears tiny: as remarked earlier, [9] gives seed length $c_G\log(n/\varepsilon)\log(\varepsilon^{-1}\log n)$, and moreover for any $G$. Still, for constant error we obtain optimal seed length, which was previously known only in the fixed-order case (cf. [46]). Also note that mixing groups of the form $\mathbb{Q}_8\times\mathbb{Z}_2^t$ (i.e., with trivial odd part) are 2-groups, for which we already gave optimal seed length in Theorem 5. But the techniques there do not even apply to the commutative (mixing) group $\mathbb{Z}_2\times\mathbb{Z}_3$.

Our main interest in this result is that its proof differs from previous work: it showcases how representation-theoretic information can be used to improve parameters, pointing to several open problems.

1.2 Future directions and open problems

This work suggests that the difficulty of handling more general classes of groups lies in composite moduli. For example, we do not have new generators for $\mathbb{D}_3=\mathbb{S}_3$, a group of order 6, even though we have optimal seed length for $\mathbb{D}_n$ when $n$ is a power of two. Thus, a challenge emerging from this work is to improve the seed length over any non-commutative group of composite order. Again, $\mathbb{S}_3$ is an obvious candidate; fooling programs over it is equivalent to fooling width-3 permutation ROBPs. But other groups could be easier to handle, for example Dedekind groups, or the direct product of a $p$-group and a $q$-group for distinct primes $p\ne q$.

Also, the techniques in this paper point to several other questions. Can we extend our reduction to block products where, instead of $g^f$ for Boolean $f$, we more generally allow a function with range $G$? For what other groups can we exploit representation theory to obtain better PRGs?

2 Proof of Theorem 5

We use the fact that programs over $p$-groups can be written as polynomials. Elements of a group of order $p^k$ will be written as $k$-tuples over $\mathbb{F}_p$.

Lemma 13.

Let $G$ be a group of order $p^k$, for an integer $k$. There is a 1-1 correspondence between $G$ and $\mathbb{F}_p^k$ such that for every $\bar g:=(g_1,g_2,\dots,g_n)\in G^n$ there is a polynomial map $f=(f_1,\dots,f_k):\{0,1\}^n\to\mathbb{F}_p^k$ over $\mathbb{F}_p$, where each $f_i$ has degree $c_G$, satisfying, for every $x\in\{0,1\}^n$ and under the correspondence,

$\prod_{i=1}^n g_i^{x_i}=\big(f_1(x_1,\dots,x_n),\,f_2(x_1,\dots,x_n),\,\dots,\,f_k(x_1,\dots,x_n)\big).$

This lemma is essentially in the previous work [42]. However, the statement there is for nilpotent groups and cannot be used immediately. Also, the proof there relies on previous work and is somewhat indirect. So we give a direct proof of the result we need (i.e., Lemma 13).

Before the proof we illustrate it via an example.

Example 14.

Let $G:=\mathbb{Z}_2\wr\mathbb{Z}_2$ from the introduction. Consider a product $\prod_i(a_i,b_i;z_i)$. Via a polynomial map we can rewrite this product in a normal form where all the $z_i$ appear in one element only:

$\Big(\prod_i(a_i',b_i';0)\Big)\cdot\Big(0,0;\sum_i z_i\Big).$

Computing this product is then immediate, via a linear map. The key observation is that $a_i'=a_i$ if the sum of the $z_j$ with $j<i$ is even and $a_i'=b_i$ otherwise (and symmetrically for $b_i'$), and that this computation is a quadratic polynomial in the input bits $a_i,b_i,z_i$.
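The normal form can be verified exhaustively. The sketch below (ours) re-implements the multiplication rule for $\mathbb{Z}_2\wr\mathbb{Z}_2$ from the introduction and checks the rewriting against direct multiplication over all short sequences.

```python
from itertools import product

# Brute-force check of the normal form in Example 14 for G = Z2 wr Z2.
def mul(u, v):
    (a1, b1, z1), (a2, b2, z2) = u, v
    if z1 == 0:
        return ((a1 + a2) % 2, (b1 + b2) % 2, z2)
    return ((a1 + b2) % 2, (b1 + a2) % 2, (1 + z2) % 2)

def group_prod(elems):
    out = (0, 0, 0)
    for e in elems:
        out = mul(out, e)
    return out

def normal_form(elems):
    # a_i' = a_i if z_1 + ... + z_{i-1} is even, else b_i (symmetrically b_i').
    swapped, parity = [], 0
    for (a, b, z) in elems:
        swapped.append((a, b, 0) if parity == 0 else (b, a, 0))
        parity ^= z
    zsum = sum(z for (_, _, z) in elems) % 2
    return mul(group_prod(swapped), (0, 0, zsum))
```

Enumerating all length-3 sequences over the 8 group elements confirms the two evaluations agree.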

Proof of Lemma 13.

We proceed by induction on $k$. If $k=1$ then $G$ is cyclic. We can take a generator $a\in G$ and define the 1-1 mapping $G\ni a^z\mapsto z\in\mathbb{F}_p$. So $\prod_{i=1}^n g_i^{x_i}=\prod_{i=1}^n a^{z_ix_i}$ can be written as the degree-1 polynomial $\sum_{i=1}^n z_ix_i$.

Otherwise, $G$ has a normal subgroup $H$ of order $p^{k-1}$ [15, Chapter 6, Theorem 1.(3)]. The corresponding quotient group $Q=G/H$ has order $p$ and is therefore cyclic. So we can write $g_i\in G$ as

$g_i=a^{e_i}h_i$

where $h_i\in H$ and $a$ is a generator of $Q$. Applying the induction hypothesis to $H$, we can identify each element $g_i=a^{e_i}h_i$ with a $k$-tuple $(e_i,\bar e_i)\in\mathbb{F}_p^k$, where $\bar e_i\in\mathbb{F}_p^{k-1}$ corresponds to $h_i\in H$.

Now we apply the conjugation trick as in [6], and use induction. That is, let $b_i:=a^{\sum_{j\le i}e_jx_j}$ and write

$\prod_{i=1}^n g_i^{x_i}=\Big(\prod_{i=1}^n\big(b_ih_i^{x_i}b_i^{-1}\big)\Big)\cdot b_n.$

Note that $b_i$ and $b_i^{-1}$ can be computed by degree-1 polynomial maps over $\mathbb{F}_p$, and $h_i^{x_i}$ can be (trivially) computed by a degree-$c_H$ polynomial map over $\mathbb{F}_p$.

Therefore, each term $b_ih_i^{x_i}b_i^{-1}$ can be computed by some degree-$c_G$ polynomial map $f_{H,i}=(f_1,\dots,f_k)$ over $\mathbb{F}_p$. Moreover, these terms lie in $H$ because $H$ is a normal subgroup of $G$. Hence we have reduced to a product over $H$ which, by the induction hypothesis, can be computed by some degree-$c_H$ polynomial map, and the result follows.

Given Lemma 13, it suffices to construct a bit-generator that fools low-degree polynomials over 𝔽p.

The case $p=2$.

For this case, we can simply combine Lemma 13 with known generators for polynomials over $\mathbb{F}_2$ [8, 32, 51]. In fact, we obtain results for non-read-once programs as well, of any length. (Indeed, such programs are equivalent to low-degree polynomials over $\mathbb{F}_2$.)

The case $p>2$.

Here we need additional ideas, because bit-generators that fool polynomials over $\mathbb{F}_q$ with $q\ne2$ are not known. However, the works [8, 32, 51] do give generators that output field elements and fool such polynomials.

Lemma 15 ([51]).

There are distributions $Y$ over $\mathbb{F}_p^n$ that can be explicitly sampled from a uniform seed of $c_p(2^d\log(1/\varepsilon)+\log n)$ bits such that for any degree-$d$ polynomial $f$ in $n$ variables over $\mathbb{F}_p$, we have $\Delta(f(Y),f(U))\le\varepsilon$.

However, we need distributions over $\{0,1\}^n$. This distinction is critical and arises in a number of previous works. Currently, for domain $\{0,1\}^n$ only weaker results with seed length $\log^2 n$ are known [33].

Still, as pointed out in [35, 38], Lemma 15 implies results over the domain $\{0,1\}^n$ for biased bits:

Definition 16.

We denote by $N_p$ a vector of $n$ i.i.d. bits, each coming up 1 with probability $1/p$.

Corollary 17 ([35, 38]).

There are distributions $X$ over $\{0,1\}^n$ that can be explicitly sampled from a uniform seed of $c_p(2^d\log(1/\varepsilon)+\log n)$ bits such that for any degree-$d$ polynomial $f$ in $n$ variables over $\mathbb{F}_p$ we have $\Delta(f(X),f(N_p))\le\varepsilon$.

Proof.

Let $Y=(Y_1,Y_2,\dots,Y_n)$ be the distribution from Lemma 15, for degree $d(p-1)$. Define $X:=(1-Y_1^{p-1},1-Y_2^{p-1},\dots,1-Y_n^{p-1})$. Note $X$ is over $\{0,1\}^n$. Also, if $U$ is uniform in $\mathbb{F}_p$ then $1-U^{p-1}$ equals 1 with probability exactly $1/p$, i.e., it is distributed as a coordinate of $N_p$. The result follows.

We will show how to use biased bits. For this we use the fact that the program is read-once.
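The bias computation behind Corollary 17 can be sanity-checked by enumeration (our check): for uniform $u\in\mathbb{F}_p$, the bit $1-u^{p-1}\bmod p$ is 1 exactly when $u=0$, i.e. with probability $1/p$, matching Definition 16.

```python
# Count, over all u in F_p, how often the derived bit equals 1.
def biased_bit_counts(p):
    ones = sum(1 for u in range(p) if (1 - pow(u, p - 1, p)) % p == 1)
    return ones, p  # ones / p is the probability of a 1
```

By Fermat's little theorem $u^{p-1}=1$ for $u\ne0$, so exactly one value of $u$ yields a 1.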

Lemma 18.

Let $X$ fool degree-1 polynomials over $\mathbb{F}_2$ with error $\varepsilon^{c_{G,p}}$. Then $X+N_p$ fools programs of length $n$ over $G$ with error $\varepsilon$.

Proof.

This follows from Lemma 7.2 in [16] combined with the Fourier bound in [43, 28]. The proof in [16] is for the fixed noise parameter $p=4$, but the generalization to any $p$ is immediate (replace $1/2$ with $1-2/p$ in the last two lines of the proof).

We now have all the ingredients.

Proof of Theorem 5.

Use Lemma 18. Averaging over X, it suffices to derandomize Np. By Lemma 13 it suffices to do this for low-degree polynomials. This follows from Corollary 17.

3 Representation theory and matrix analysis

In this section, we present the fragment of representation theory and matrix analysis that we need. The books by Serre [45], Diaconis [13], and Terras [48] are good references for representation theory and non-commutative Fourier analysis. The Barbados notes [53], [21, Section 13], [22], or [12] provide briefer introductions.

Matrices.

Let $M$ be a square complex matrix. We denote by $\operatorname{tr}(M)$ the trace of $M$, by $\bar M$ the entrywise conjugate of $M$, by $M^T$ the transpose of $M$, and by $M^*$ the conjugate transpose $\bar M^T$ (a.k.a. adjoint, Hermitian conjugate, etc.). The matrix $M$ is unitary if its rows and columns are orthonormal; equivalently, $M^{-1}=M^*$.

The Frobenius norm (a.k.a. Schatten 2-norm or Hilbert–Schmidt norm) of a square matrix $M$, denoted $\|M\|_{\mathsf F}$, is $\sqrt{\sum_{i,j}|M_{i,j}|^2}=\sqrt{\operatorname{tr}(MM^*)}$.

The operator norm of a matrix $M$, denoted $\|M\|_{\mathsf{op}}$, is the square root of the largest eigenvalue of the matrix $M^*M$. In particular, if $M$ is a normal matrix, i.e. $MM^*=M^*M$, then $\|M\|_{\mathsf{op}}$ equals the largest magnitude of an eigenvalue of $M$.

Fact 19.

$\|AB\|_{\mathsf{op}}\le\|A\|_{\mathsf{op}}\|B\|_{\mathsf{op}}$.

Fact 20.

For a $d\times d$ matrix $M$ we have $\|M\|_{\mathsf F}^2\le d\,\|M\|_{\mathsf{op}}^2$; moreover, if $M$ is normal with eigenvalues $\lambda_1,\dots,\lambda_d$, then $\|M\|_{\mathsf F}^2=\sum_{i=1}^d|\lambda_i|^2$.

Representation theory.

Let $G$ be a group. A representation $\rho$ of $G$ with dimension $d$ maps elements of $G$ to $d\times d$ unitary complex matrices so that $\rho(xy)=\rho(x)\rho(y)$. Thus, $\rho$ is a homomorphism from $G$ to the group of unitary linear transformations of the vector space $\mathbb{C}^d$. We denote by $d_\rho$ the dimension of $\rho$.

If there is a non-trivial subspace $W$ of $\mathbb{C}^d$ that is invariant under $\rho$, that is, $\rho(x)W\subseteq W$ for every $x\in G$, then $\rho$ is reducible; otherwise it is irreducible. Irreducible representations are abbreviated irreps and play a critical role in Fourier analysis. We denote by $\hat G$ a complete set of inequivalent irreducible representations of $G$. We have

$\sum_{\rho\in\hat G}d_\rho^2=|G|. \qquad (1)$

For a random variable Z we also use Z to denote its probability mass function.

For an irrep $\rho\in\hat G$, the $\rho$-th Fourier coefficient of $Z$ is

$\hat Z(\rho):=\sum_{g\in G}Z(g)\overline{\rho(g)}=\mathbf{E}\big[\overline{\rho(Z)}\big].$

The Fourier expansion of $Z:G\to\mathbb{C}$ is

$Z(g)=\frac{1}{|G|}\sum_{\rho\in\hat G}d_\rho\operatorname{tr}\big(\hat Z(\rho)\rho(g)\big).$

Parseval’s identity gives

$\sum_{g\in G}Z(g)^2=\frac{1}{|G|}\sum_{\rho\in\hat G}d_\rho\big\|\hat Z(\rho)\big\|_{\mathsf F}^2.$
Claim 21.

Suppose $X$ and $Y$ are two random variables over $G$ such that for every irreducible representation $\rho$ of $G$, we have $\|\mathbf{E}[\rho(X)]-\mathbf{E}[\rho(Y)]\|_{\mathsf{op}}\le\varepsilon$. Then $X$ and $Y$ are $(\sqrt{|G|}\,\varepsilon)$-close in statistical distance.

Proof.
12gG|X(g)Y(g)| |G|2(gG(X(g)Y(g))2)1/2 (Cauchy–Schwarz)
=|G|2(1|G|ρG^dρX^(ρ)Y^(ρ)𝖥2)1/2 (Parseval)
=12(ρG^dρ(dρε2))1/2 (Fact 20)
=ε2(ρG^dρ2)1/2=|G|ε/2. (Equation 1)
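The Fourier formulas above can be checked numerically for a cyclic group, whose irreps are the characters $\chi_j(g)=e^{2\pi ijg/m}$. The following sketch (ours) verifies the inversion formula and Parseval's identity for a random distribution on $\mathbb{Z}_6$.

```python
import numpy as np

# Fourier analysis on Z_m with the paper's conventions:
#   Zhat(j) = E[conj(chi_j(Z))],  Z(g) = (1/m) * sum_j Zhat(j) chi_j(g).
m = 6
chars = np.exp(2j * np.pi * np.outer(np.arange(m), np.arange(m)) / m)  # chars[j, g]

rng = np.random.default_rng(0)
Z = rng.random(m)
Z /= Z.sum()                      # a probability mass function on Z_m

Zhat = chars.conj() @ Z           # Fourier coefficients
recon = (chars.T @ Zhat).real / m # inverse Fourier transform
```

Both `np.allclose(recon, Z)` and the Parseval check hold up to floating-point error.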

4 Proof of Theorem 12

Again, besides the parameter improvement, our main point here is to illustrate how we use representation theory to obtain pseudorandom generators. These ideas will then be generalized to the more general and complicated setting of block products in the next section.

Let $\rho$ be an irreducible representation of a mixing group (Definition 6). By the definition of mixing, if $\rho(g)$ is a non-identity matrix then it does not have 1 as an eigenvalue. A main observation is that if there are many non-identity matrices $\rho(g_i)$ in the program, then the bias $\|\mathbf{E}[\prod_{i=1}^n\rho(g_i)^{U_i}]\|_{\mathsf{op}}$ is small. This is proved in the next two claims.

Claim 22.

Let $M$ be a unitary matrix with eigenvalues $e^{2\pi i\theta_j}$ for some $\theta_j\in[-1/2,1/2]$ on the unit circle. Suppose $|\theta_j|\ge\theta$ for every $j$. Then $\|(I+M)/2\|_{\mathsf{op}}\le1-\theta^2/8$.

Proof.

As $M$ is unitary, we can write $M=QDQ^*$, where $D$ is a diagonal matrix with $M$'s eigenvalues on its diagonal and $Q$ is unitary. The eigenvalues of $(I+M)/2=Q\frac{I+D}{2}Q^*$ are $\frac{1+e^{2\pi i\theta_j}}{2}=e^{\pi i\theta_j}\cdot\frac{e^{-\pi i\theta_j}+e^{\pi i\theta_j}}{2}=e^{\pi i\theta_j}\cos(\pi\theta_j)$, which have magnitudes $|\cos(\pi\theta_j)|\le1-\theta^2/8$. Since $(I+M)/2$ is normal, the bound on the operator norm follows.
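The contraction in Claim 22 is easy to confirm numerically (our check, using the $e^{2\pi i\theta}$ parametrization of Definition 24); by unitary invariance it suffices to test diagonal unitaries.

```python
import numpy as np

# Operator norm of (I + M)/2 for a diagonal unitary M with phases thetas.
def contraction(thetas):
    M = np.diag(np.exp(2j * np.pi * np.array(thetas)))
    return np.linalg.norm((np.eye(len(thetas)) + M) / 2, ord=2)
```

For instance, with all phases at distance exactly $1/2$ from $0$ we get $M=-I$ and the norm collapses to $0$, while phases at distance at least $\theta$ keep the norm below $1-\theta^2/8$.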

Claim 23.

Let $G$ be a mixing group. Let $\rho$ be an irreducible representation of $G$ of dimension $d_\rho$. Let $f_\rho(x)=\prod_{i=1}^n\rho(g_i)^{x_i}$ be the representation of a group program. Suppose $\rho(g_i)\ne I_{d_\rho}$ for $t\ge c_G\log(1/\varepsilon)$ many $i$'s. Then $\|\mathbf{E}[f_\rho(U)]\|_{\mathsf{op}}\le\varepsilon$.

Proof.

Let $T$ be the $t$ coordinates $j$ where $\rho(g_j)\ne I_{d_\rho}$. For every fixing of the other coordinates, we can write $f_\rho(U)$ as

$B\prod_{j\in T}\big(\rho(g_j)^{U_j}B_j\big)$

for some unitary matrices $B$ and $B_j$. So

$\big\|\mathbf{E}[f_\rho(U)]\big\|_{\mathsf{op}}\le\|B\|_{\mathsf{op}}\prod_{j\in T}\big\|\mathbf{E}[\rho(g_j)^{U_j}]\big\|_{\mathsf{op}}\|B_j\|_{\mathsf{op}}\le(1-c_G)^t\le\varepsilon.$
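To see the decay concretely, the sketch below (ours) computes the bias $\mathbf{E}[\prod_i\rho(g_i)^{U_i}]=\prod_i(I+\rho(g_i))/2$ for the 2-dimensional irrep of the quaternion group $\mathbb{Q}_8$; each non-identity factor shrinks the operator norm by the constant factor $\sqrt2/2$.

```python
import numpy as np

# 2-dimensional irrep of Q8: rho(i) and rho(j) below satisfy the
# quaternion relations; neither has eigenvalue 1.
rho_i = np.array([[1j, 0], [0, -1j]])
rho_j = np.array([[0, 1], [-1, 0]], dtype=complex)

def bias(mats):
    # E[prod M^U] over independent uniform bits U = prod (I + M)/2.
    out = np.eye(2, dtype=complex)
    for M in mats:
        out = out @ ((np.eye(2) + M) / 2)
    return np.linalg.norm(out, ord=2)
```

Four copies of $\rho(i)$ give bias $(\sqrt2/2)^4=1/4$, and mixing $\rho(i)$ with $\rho(j)$ gives $1/2$, matching the $(1-c_G)^t$ bound in the proof.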

We now proceed with the proof of the main result. The proof extends to handle the spill, but for simplicity we do not discuss it here. We fool each irreducible representation of $G$ separately and then appeal to Claim 21. Fix a representation $\rho$ and consider the product

$f_\rho(x):=\prod_{j=1}^{n}\rho(g_j)^{x_j}.$

Let $t$ be the number of non-identity matrices among $\rho(g_j):j\in[n]$, and let $S$ be the set of their coordinates.

Let us sketch the construction. First, XORing with an almost $2c_G\log(1/\varepsilon)$-wise uniform distribution takes care of the case $t\le c_G\log(1/\varepsilon)$, so we may assume that $t$ is larger. In this case, by Claim 23, the bias $\|\mathbf{E}[f_\rho(U)]\|_{\mathsf{op}}$ is small under the uniform distribution. Our goal is to set $ct$ bits in $S$ to uniform and apply Claim 23 again.

Let $\ell:=c_G\log(1/\varepsilon)$. Let $M$ be a $(\log n)\times10\ell$ matrix filled with uniform bits.

We will make $\log n$ guesses of $t$. For each guess $v=2^i$, $i\in\{0,\dots,\log n-1\}$, of $t$, we select a subset of the input positions of expected size roughly $\ell$ using a hash function $h_i$, then hash these positions to row $i$ of $M$ using another hash function $h$, and assign input bits correspondingly. The final generator is obtained by trying all guesses, using the same seed for each guess $h_i$, and XORing together the bits.

In more detail, for each $i\in\{0,\dots,\log n-1\}$, let $h_i:[n]\to\{0,1\}$ be drawn from a $10\ell$-wise independent hash family with $\mathbf{Pr}_{h_i}[h_i(j)=1]=\ell\cdot2^{-i}$ for each $j\in[n]$. Let $h:[n]\to[10\ell]$ be drawn from another $5\ell$-wise uniform hash family. The output of our generator is

$D:=D^{(0)}\oplus\cdots\oplus D^{(\log n-1)},$

where the $j$-th bit of $D^{(i)}$ is

$h_i(j)\cdot M_{i,h(j)}.$

We use the same seed to sample $h_0,\dots,h_{\log n-1}$, which costs at most $O_G(\log n\cdot\log(1/\varepsilon))$ bits [49, Corollary 3.34]. Sampling $h$ uses another $O_G(\log(n/\varepsilon)\log(1/\varepsilon))$ bits. This gives a total of $O_G(\log(n/\varepsilon)\log(1/\varepsilon))$ bits.

We now show that $\|\mathbf{E}[f_\rho(D)]\|_{\mathsf{op}}\le O(\varepsilon)$. Suppose $t\in[2^i,2^{i+1}]$. Recall that $S$ is the set of coordinates corresponding to the non-identity matrices in the product. Let $J:=h_i^{-1}(1)\cap S$. As $\mathbf{Pr}[h_i(j)=1]=\ell\cdot2^{-i}$, we have $\mathbf{E}[|J|]\in[\ell,2\ell]$. Applying tail bounds for bounded independence (see Lemma 36), we have $|J|\in[\ell/2,3\ell]$ except with probability $\varepsilon$. Conditioned on this event, as $|J|\le3\ell$ and $h$ is $5\ell$-wise uniform, we can think of $h$ as a random function from $J$ to $[10\ell]$. Hence, for each $j\in[10\ell]$, we have

$\mathbf{Pr}\big[|J\cap h^{-1}(j)|=1\big]=|J|\cdot\frac{1}{10\ell}\Big(1-\frac{1}{10\ell}\Big)^{|J|-1}\ge\frac{\ell}{2}\cdot\frac{1}{10\ell}\cdot\frac12\ge\frac{1}{40}.$

By a Chernoff bound, except with probability at most $\varepsilon$, the number of $j$ such that $|J\cap h^{-1}(j)|=1$ is at least $\ell/10$.

Let $T$ be these coordinates. Fixing all the bits in $M$ except the ones in row $i$ that are fed into $T$, we can write the conditional expectation of $f_\rho(D)$ over the bits in $T$ as

$B\prod_{j\in T}\mathbf{E}_{x_j}\big[A_j^{x_j}\big]B_j,$

for some unitary matrices $B$, $A_j$, and $B_j$; in particular, each $A_j$ has its eigenvalues bounded away from 1 on the complex unit circle. Therefore, by Claim 22,

$\Big\|B\prod_{j\in T}\mathbf{E}_{x_j}\big[A_j^{x_j}\big]B_j\Big\|_{\mathsf{op}}\le\prod_{j\in T}\Big\|\frac{I+A_j}{2}\Big\|_{\mathsf{op}}\le\varepsilon.$
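The multi-scale structure of the construction can be sketched as follows. This is our simplification, not the paper's generator: we replace the bounded-independence families $h_i$ and $h$ by truly random functions, which ignores the seed-length accounting but shows how one matrix row per scale is hashed into the output and how the $\log n$ scales are XORed together.

```python
import random

# Simplified multi-scale generator sketch (ours): all hash functions are
# fully random here; a real instantiation would use the bounded-
# independence families described in the text.
def generator(n, ell, rng):
    num_scales = max(1, n.bit_length() - 1)  # one scale per guess 2^i of t
    out = [0] * n
    for i in range(num_scales):
        row = [rng.randint(0, 1) for _ in range(10 * ell)]       # row i of M
        for j in range(n):
            keep = rng.random() < min(1.0, ell * 2.0 ** (-i))    # h_i(j)
            bucket = rng.randrange(10 * ell)                     # h(j)
            out[j] ^= keep & row[bucket]                         # h_i(j) * M[i, h(j)]
    return out
```

The output is a string of $n$ bits obtained by XORing all scales, mirroring $D=D^{(0)}\oplus\cdots\oplus D^{(\log n-1)}$.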

5 Proof of Theorem 10

In this section we prove Theorem 10. This type of reduction goes back to the work of [19] on read-once CNFs (itself building on [1]), and has been refined in several subsequent works. The work [30] extended the techniques to read-once polynomials. It exploited the observation that when the number of monomials is significantly larger than the degree, the bias of the polynomial is small, and therefore the bias of the restricted function remains small. Building on this observation, [37] showed that one can aggressively restrict most of the coordinates while keeping the bias of the restricted function small. In addition, a typical restricted product is a low-degree polynomial (plus a spill), for which we have optimal generators [8, 32, 51].

However, [37] reduces to non-linear polynomials (degree 16). As discussed earlier, bit-generators with good seed lengths for such polynomials are only known over $\mathbb{Z}_2$. We give a refined reduction that reduces to polynomials of degree one, for which we have generators over $\mathbb{Z}_m$ for any $m$ [35, 38, 18].

At the same time, we show that the reduction can be carried out over any mixing group, by working with representations of the group.

Definition 24.

Let $\mathcal{U}_\theta(d)$ be the set of $d\times d$ unitary matrices with eigenvalues $e^{2\pi i\theta_j}$ where $|\theta_j|\ge\theta$.

Definition 25.

A group $G$ is $\theta$-mixing if it has a complete set of unitary irreducible representations in which each non-identity matrix lies in $\mathcal{U}_\theta(d)$ for some $d$.

The following theorem will serve as the basis of our iterative construction of the PRG.

Theorem 26.

Let $w\ge\log\log(1/\varepsilon)+\log m$. Suppose there is a PRG $P$ with seed length $s$ that $\varepsilon$-fools $(m^52^{30w},2w,2\log(1/\varepsilon))$-products over $G$. Let $P_1$ be a PRG with seed length $s_1$ that $\varepsilon$-fools $(\ell,1,3\log(1/\varepsilon))$-products over a group $G$ of order $m$ that is $(1/m)$-mixing. Then there is a PRG that $O(\varepsilon)$-fools $(m^52^{45w},3w,2\log(1/\varepsilon))$-products over $G$ with seed length

$s+s_1+O_m\big((\log(1/\varepsilon)+w)\log w+\log\log n\big).$

We first show how to apply Theorem 26 iteratively to obtain Theorem 10.

Proof of Theorem 10.

We iterate Theorem 26 for some $t$ times to reduce the problem to fooling an $O(\log(m/\varepsilon))$-junta, which can be fooled using an almost bounded uniform distribution.

Given an $(\ell,w,\log(1/\varepsilon))$-product $f$, let $w':=\max\{w,\log\ell,\log m\}$, so that we can view $f$ as an $(m^52^{45w'},3w',2\log(1/\varepsilon))$-product. We first apply Theorem 26 for $t_1=O(\log w')$ times until we have a

$\big((m\log(1/\varepsilon))^C,\ \log\log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product},$

for some constant $C$.

Let $b:=\frac{\log(1/\varepsilon)+\log m}{\log\log(1/\varepsilon)+\log m}$. We will apply the following step repeatedly for some $r=O_m(1)$ rounds. We divide the $f_i:i\ge1$ into groups of $b$ functions and view the product of the functions in each group as a single function; this way we can think of the above product as a

$\big((m\log(1/\varepsilon))^C/b,\ \log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product}.$

So we can continue applying Theorem 26 for $t_2=O(\log(\log(1/\varepsilon)+\log m))\le O_m(\log\log(1/\varepsilon))$ times, and the restricted function becomes a

$\big((m\log(1/\varepsilon))^C/b,\ \log\log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product}.$

Repeating this process for

$r=\log_b\big((m\log(1/\varepsilon))^C\big)\le\frac{2C\,(\log m+\log\log(1/\varepsilon))}{\log\log(1/\varepsilon)}=O_m(1)$

times, we are left with an

$\big(O(1),\ \log\log(1/\varepsilon)+\log m,\ 2\log(1/\varepsilon)\big)\text{-product},$

which can be fooled by an $\varepsilon$-almost $O(\log(m/\varepsilon))$-wise uniform distribution, which can be sampled using $s''=O(\log(m/\varepsilon)+\log\log n)$ bits [41, 2]. Therefore, in total we apply Theorem 26 for

$t:=t_1+r\cdot t_2\le O(\log w')+O_m(\log\log(1/\varepsilon))=O_m\big(\log(w+\log(\ell/\varepsilon))\big)$

times, each with a seed of

$s'=s_1+O_m\big((\log(\ell/\varepsilon)+w)\log w'+\log\log n\big)$

bits. Hence in total it uses

$s''+t\cdot s'\le O_m\big(s_1+\log(\ell/\varepsilon)+w\big)\cdot\mathrm{polylog}\big(w,\log\ell,\log n,\log(1/\varepsilon)\big)$

bits.

5.1 Analysis of one iteration: Proof of Theorem 26

We now prove Theorem 26. Given an $(m^52^{45w},3w,2\log(1/\varepsilon))$-product $f=\prod_{i=0}^{\ell}g_i^{f_i}$ over $G$ of order $m$ that is $(1/m)$-mixing, let $\ell'$ be the number of non-constant $f_i$. We say $f$ is a long product if $\ell'\ge m^52^{30w}$; otherwise $f$ is short. At a high level, we apply Theorem 30 to $P_1$ to obtain a PRG that fools long products in one shot, and use Lemmas 27 and 29 below to reduce fooling a short product to fooling a product of smaller width $2w$.

Lemma 27.

Let $w\ge\log m$ and let $C$ be a sufficiently large constant. Define

$k:=C(w+\log(\ell/\varepsilon)),\qquad \delta:=(mw)^{-k},\qquad p:=2^{-C}.$

There exist two $\delta$-almost $k$-wise independent distributions $D$ and $T$ with $\mathbf{E}[D_i]=1/2$ and $\mathbf{E}[T_i]=p$ for every $i\in[n]$, such that for every $(\ell,w,0)$-product $f$ over $G$ of order $m$, we have $\big|\mathbf{E}_{D,T}[f(D\oplus(T\wedge U))]-\mathbf{E}[f(U)]\big|\le\varepsilon$.

Moreover, $D$ and $T$ can be efficiently sampled with a seed of length $O_m\big((\log(\ell/\varepsilon)+w)\log w+\log\log n\big)$.

Lemma 27 follows from the next lemma, which can be obtained by applying a variant of a result of Forbes and Kelley [16] to the Fourier bounds on functions computable by block products over groups, established in [28]. (Block products are called generalized group products in [28].)

Lemma 28 ([16, 28]).

Let $f:\{0,1\}^n\to\{0,1\}$ be computable by an $(\ell,w,0)$-block product over a group $G$. Let $D$ and $T$ be two independent $\delta$-almost $2(k+w)$-wise independent distributions on $\{0,1\}^n$ with $\mathbf{E}[D_i]=1/2$ and $\mathbf{E}[T_i]=p$, and let $U$ be the uniform distribution on $\{0,1\}^n$. Then

$\big|\mathbf{E}[f(D\oplus(T\wedge U))]-\mathbf{E}[f(U)]\big|\le\delta\,(w|G|)^{k+w}+(1-2p)^{k/2}.$

We remark that for every constant $p$, one can show that bias $n^{-\omega(1)}$ plus noise $N_p$ is necessary to fool programs over groups of order $\mathrm{poly}(n)$ with any subconstant error $\varepsilon$. This follows from [11], which shows that there exists such a distribution that puts $2\varepsilon$ more probability mass than the uniform distribution on strings whose Hamming weight is greater than $n/2+O_p(\sqrt{kn})$.
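The restriction template used throughout this section can be sketched as follows (our illustration, with i.i.d. bits in place of the almost $k$-wise independent $D$ and $T$): $D$ fixes a tentative value for every coordinate, $T$ marks the coordinates that stay alive, and fresh uniform bits $U$ are XORed into the live positions.

```python
import random

# Sketch of the distribution D xor (T and U) from the lemmas above,
# with truly random bits standing in for the derandomized D and T.
def noisy_restriction(n, p, rng):
    D = [rng.randint(0, 1) for _ in range(n)]            # tentative fixing
    T = [1 if rng.random() < p else 0 for _ in range(n)]  # live coordinates
    U = [rng.randint(0, 1) for _ in range(n)]            # fresh noise
    return [d ^ (t & u) for d, t, u in zip(D, T, U)]
```

Coordinates with $T_i=0$ are deterministically $D_i$; coordinates with $T_i=1$ are uniform regardless of $D_i$.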

Lemma 29 (Width reduction for short products).

Let $G$ be any group of order $m$, and let $w\ge\log m$. Let $D$ and $T$ be the two distributions defined in Lemma 27. Let $f$ be an $(\ell,3w,2\log(1/\varepsilon))$-product over $G$, where $\ell\le m^52^{30w}$. Then with probability at least $1-\varepsilon$ over $D$ and $T$, the restricted function $f_{D,T}$ is an $(\ell,2w,2\log(1/\varepsilon))$-product over $G$.

We prove Lemma 29 in Section 7.

Theorem 30.

Let w ≥ log log(1/ε) + 2log(1/θ). Suppose there is a PRG P₁ with seed length s₁ that ε-fools every (ℓ′, 1, 3log(1/ε))-product f := ∏_{i=0}^{ℓ′} f_i over a matrix group 𝒢 supported on 𝒰_θ(d), where ℓ′ ≥ 2^{3w}θ^{−2} and each f_i is non-constant. Then there is a PRG that fools (ℓ, 3w, 2log(1/ε))-products f = ∏_{i=0}^ℓ f_i over 𝒢, where ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}] and every f_i is non-constant, with seed length s = s₁ + O_θ(log(1/ε) + w + log log n).

Corollary 31.

Theorem 30 applies to products over any θ-mixing group G with ε replaced by ε/|G|.

Proof.

By the definition of θ-mixing, every irrep ρ of G is supported on 𝒰_θ(d_ρ). It then follows from Claim 21 that it suffices to fool each irrep with error ε/|G|.

We prove Theorem 30 in Section 6. We now show how Theorem 26 follows from Lemmas 27 and 29 and Theorem 30.

Proof of Theorem 26.

Let P₁ be the PRG that ε-fools (ℓ′, 1, 3log(1/ε))-products with seed length s₁. Applying Theorem 30 with P₁ and θ = 1/m, we obtain a PRG P_long that ε-fools every (ℓ, 3w, 2log(1/ε))-product f = ∏_{i=0}^ℓ f_i, where ℓ ∈ [m^5·2^{30w}, m^5·2^{45w}] and every f_i is non-constant, with seed length s_long = s₁ + O_m(log(1/ε) + w + log log n). We now sample the distributions D, T in Lemma 27, and output

(D + T∧P(U)) + P_long(U′).

Using Lemma 27 and ℓ ≤ 2^{O_m(w)}, sampling D and T uses s_short = O_m((log(1/ε) + w)·log w + log log n) bits. So altogether this takes s + s_long + s_short = s + s₁ + O_m((log(1/ε) + w)·log w + log log n) bits.

Let f be an (m^5·2^{45w}, 3w, 2log(1/ε))-product with ℓ many non-constant f_i's. If ℓ ≥ m^5·2^{30w}, then P_long ε-fools it. Otherwise, ℓ ≤ m^5·2^{30w} and so f is an (m^5·2^{30w}, 3w, 2log(1/ε))-product. So by Lemma 29, with probability at least 1−ε over the choices of D and T, the function f_{D,T} is an (m^5·2^{30w}, 2w, 2log(1/ε))-product, and therefore can be ε-fooled using the generator P given by the assumption. The total error is O(ε).

6 Width reduction for long products: Proof of Theorem 30

In this section, we prove Theorem 30. Let f = ∏_{i=0}^ℓ f_i be an (ℓ, 3w, 2log(1/ε))-product over a matrix group 𝒢 supported on 𝒰_θ(d), where ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}] and each f_i is non-constant. Note that when a product f has this many non-constant functions, the “bias” ‖𝐄[f(U)]‖_op of f is doubly exponentially small in w, i.e., at most exp(−2^{2w}) (see Claim 33), which is at most ε whenever w ≥ log log(1/ε). Following [37], we will pseudorandomly restrict most of the coordinates of f and show that the bias of a typical restricted product remains bounded by ε. More importantly, we will show that this restricted product has width 1 (with a small spill). Therefore, it suffices to construct a PRG for width-1 products (with a small spill).

We remark that previous works showed that a typical restricted product has degree at most 16, as opposed to 1. This difference is already crucial for fooling products over ℤ_m for composite m with good seed length, as we do not have (bit-)PRGs even for degree-2 polynomials over ℤ₆.

6.1 The reduction

We will use the following standard construction of δ-almost k-wise independent distributions with marginals p.

Claim 32.

There exists an explicit δ-almost k-wise independent distribution T on {0,1}^n with 𝐄[T_i] = 2^{−b} for every i ∈ [n], which can be sampled using O(b + k + log(1/δ) + log log n) bits.

Proof.

We sample an ε′-biased distribution D′ on {0,1}^{nb} and b uniform bits U_b. By standard constructions [41, 2], D′ can be sampled using O(b + log(k/ε′) + log log n) bits. Write D′ = (D′₁, …, D′_n) where each D′_i ∈ {0,1}^b. We output T ∈ {0,1}^n, where T_i = AND_b(D′_i ⊕ U_b) and AND_b is the AND function on b bits. We have 𝐄[T_i] = 2^{−b} because D′_i ⊕ U_b is uniform. By [37, Claim 3.7], T is (ε′·2^k)-almost k-wise independent. Setting ε′ = 2^{−k}δ proves the claim.
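As a sanity check of this construction, the following sketch computes the marginal 𝐄[T_i] exactly; for illustration we idealize the small-bias distribution D′ as truly uniform (the helper names are ours, not from the text), which isolates the point that XOR-ing with the shared seed U_b and taking AND_b yields marginal 2^{−b}.

```python
from itertools import product

def and_of_xor(d_block, u):
    # T_i = AND_b(D'_i xor U_b): 1 iff D'_i xor U_b is the all-ones string.
    return int(all(db ^ ub for db, ub in zip(d_block, u)))

def marginal(b):
    # Exact Pr[T_i = 1] over a uniform block D'_i in {0,1}^b (idealizing the
    # epsilon'-biased distribution) and an independent uniform seed U_b.
    hits = sum(and_of_xor(d, u)
               for d in product((0, 1), repeat=b)
               for u in product((0, 1), repeat=b))
    return hits / 4 ** b
```

In the real construction D′ is only ε′-biased, which is what costs the extra (ε′·2^k) term in the almost k-wise independence.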

Let C be a sufficiently large constant. Let

k = C(log(1/ε) + w)
δ = θ^k (2)
p = 2^{−23w}θ³.

Let D and T be two δ-almost k-wise independent distributions, with 𝐄[D_i] = 1/2 and 𝐄[T_i] = p for every i ∈ [n], and let P₁ be the PRG given by the theorem. The generator is

P := D + T∧P₁.

By Claim 32, this uses s₁ + O_θ(log(1/ε) + w + log log n) bits.
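Concretely, on each coordinate the generator outputs the bit of D unless T marks that coordinate as free, in which case the bit of P₁ is XOR-ed in; a minimal sketch (with hypothetical list-of-bits inputs standing for samples of D, T, and P₁):

```python
def combine(d, t, p1):
    # P := D + T /\ P1 coordinate-wise: positions with t=0 stay fixed to d,
    # positions with t=1 are rerandomized by XOR-ing in the bit of p1.
    return [di ^ (ti & pi) for di, ti, pi in zip(d, t, p1)]
```

For example, `combine([1,0,1,0], [0,1,1,0], [1,1,0,1])` fixes coordinates 0 and 3 to the bits of d and rerandomizes coordinates 1 and 2.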

6.2 Analysis

We first state a claim showing that if the number ℓ of non-constant f_i's in a block product f is much greater than its width w, then the bias ‖𝐄[f(U)]‖_op is small. We defer its proof to Section 6.2.1.

Claim 33.

For integers w and q, let f = ∏_{i=0}^ℓ f_i be an (ℓ, w, q)-product over some matrix group supported on 𝒰_θ(d), for some ℓ ≥ 2^{2w+2}θ^{−2}·log(1/ε), where each f_i is non-constant. Then ‖𝐄[f(U)]‖_op ≤ ε.

Recall that ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}]. Given D and T, let f_{D,T}: {0,1}^T → 𝒢 be the restricted product

f_{D,T}(x) := f(D + T∧x) = ∏_{i=0}^ℓ f_i(D + T∧x).

We use f_{D,T,i}(x) to denote f_i(D + T∧x).

The following lemma shows that with high probability over D and T, the function f_{D,T} is an (ℓ′, 1, 3log(1/ε))-product, that is, a group program with a small spill. Note that this lemma holds for products over any group.

Lemma 34.

Let D and T be the two distributions on {0,1}^n defined in (2). Let w ≥ log log(1/ε) + 2log(1/θ) and f = ∏_i f_i be an (ℓ, 3w, 2log(1/ε))-product, where ℓ ∈ [2^{30w}θ^{−5}, 2^{45w}θ^{−5}] and each f_i is non-constant. Then with probability 1−ε over D and T, the function f_{D,T} is an (ℓ′, 1, 3log(1/ε))-product, where ℓ′ ≥ 2^{3w}θ^{−2} and each f_{D,T,i} is non-constant.

Theorem 30 follows from Claims 32 and 33 and Lemma 34.

Proof of Theorem 30.

As ℓ ≥ 2^{30w}θ^{−5} ≥ 2^{2w+2}θ^{−2}log(1/ε), by Claim 33 we have ‖𝐄[f(U)]‖_op ≤ ε. By Lemma 34, with probability 1−ε over D and T, the restricted function f_{D,T} = ∏_i f_{D,T,i} is an (ℓ′, 1, 3log(1/ε))-product, where ℓ′ ≥ 2^{3w}θ^{−2} and each f_{D,T,i} is non-constant. As w ≥ log log(1/ε), we have ℓ′ ≥ 2^4·θ^{−2}log(1/ε), so again by Claim 33, ‖𝐄[f_{D,T}(U)]‖_op ≤ ε. By our assumption on P₁, we have ‖𝐄[f_{D,T}(P₁)]‖_op ≤ ‖𝐄[f_{D,T}(U)]‖_op + ε ≤ 2ε. So altogether we have ‖𝐄[f(U)] − 𝐄[f(P(U))]‖_op ≤ O(ε). The seed length follows from the construction.

Proof of Lemma 34.

To get some intuition, think of θ as a constant. Recall that the number ℓ of non-constant functions is roughly between 2^{30w} and 2^{45w}, and T keeps each bit free with probability p ≈ 2^{−23w}. Therefore, under a typical restriction, most functions in the product become constant, roughly ℓ·p of them keep exactly one free bit, and very few keep two or more free bits.

We first need to lower-bound the probability that a non-constant function remains non-constant under a random restriction.

Claim 35.

Let g be a non-constant function on w bits. For p ∈ [0,1], let U be uniform on {0,1}^w and let T be the distribution on {0,1}^w where the coordinates T_i are independent and 𝐄[T_i] = p for each i ∈ [w]. With probability at least p((1−p)/2)^{w−1} over U and T, the function g_{U,T}(x) := g(U + T∧x) is a non-constant function on 1 bit.

Proof.

Since g is non-constant, there is an x ∈ {0,1}^w and a coordinate j ∈ [w] such that g(x + e_j) ≠ g(x). The probability that only the coordinate T_j is 1 (and the rest are 0), and U agrees with x on the remaining w−1 coordinates, is

Pr[(T = e_j) ∧ ∀i≠j: U_i = x_i] = Pr[T = e_j]·Pr[∀i≠j: U_i = x_i]
= p(1−p)^{w−1}·2^{−(w−1)} = p·((1−p)/2)^{w−1}.

In this case g_{U,T} is a non-constant function of the single free bit.
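The bound in Claim 35 can be checked exactly on small examples by enumerating U and T; the sketch below (our own helper, using exact rational arithmetic) computes the probability that exactly one coordinate stays free and g is non-constant on it.

```python
from fractions import Fraction
from itertools import product

def prob_one_bit_nonconstant(g, w, p):
    # Exact probability over U ~ uniform({0,1}^w) and T with i.i.d. marginals p
    # that |T| = 1 and g differs on the single free coordinate.
    total = Fraction(0)
    for t in product((0, 1), repeat=w):
        if sum(t) != 1:
            continue
        pt = Fraction(1)
        for ti in t:
            pt *= p if ti else 1 - p
        j = t.index(1)
        for u in product((0, 1), repeat=w):
            flipped = list(u)
            flipped[j] ^= 1
            if g(u) != g(tuple(flipped)):
                total += pt * Fraction(1, 2 ** w)
    return total
```

For g = AND on w = 3 bits and p = 1/2, this gives 3/32, above the claimed lower bound p((1−p)/2)^{w−1} = 1/32.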

We will use the following standard tail bound for almost k-wise independent random variables.

Lemma 36 (Lemma 8.1 in [30]).

Let X₁, …, X_ℓ be γ-almost t-wise independent random variables supported on [0,1]. Let X := Σ_i X_i and μ := 𝐄[X]. We have

Pr[|X − μ| ≥ μ/2] ≤ O(t/μ)^t + O(μ)^t·γ.
Proof of Lemma 34.

We will show that for most choices of D and T, at least 2^{3w}θ^{−2} of the f_{D,T,i} depend on exactly 1 coordinate, and the ones that depend on at least 2 coordinates together form a log(1/ε)-junta.

We first consider the set of functions fD,T,i that are restricted to 1-bit non-constant functions. Let

J₁ := {i ∈ [ℓ] : |T∩I_i| = 1 and f_{D,T,i} is non-constant}.

If D and T were exactly independent instead of almost independent, then applying Claim 35 with our choice of p = 2^{−23w}θ³, we would have

𝐄_{D,T}[|J₁|] ≥ ℓ·p·((1−p)/2)^{3w−1} ≥ (2^{30w}θ^{−5})·(2^{−23w}θ³)·2^{−3w} ≥ 2^{4w}θ^{−2}.

As (D,T) is δ-almost k-wise independent and |I_i| ≤ 3w for every i ∈ [ℓ], the indicators 𝟙(i ∈ J₁), i ∈ [ℓ], are δ-almost k/(3w)-wise independent. So applying Lemma 36 with t = C(log(1/ε)+w)/(300w) ≤ k/(3w) and γ = δ = θ^k, and recalling that k = C(log(1/ε)+w), ℓ ≤ 2^{45w}θ^{−5}, and w ≥ log log(1/ε) + log(1/θ), we have

Pr_{D,T}[|J₁| ≤ 2^{3w}θ^{−2}] ≤ O(t·2^{−4w}θ²)^t + O(2^{41w}θ^{−3})^t·θ^k
≤ 2^{−Ω(C(log(1/ε)+w)/300)} + θ^{k/2}
≤ ε. (3)

We now consider the fD,T,i’s that depend on at least two coordinates. We will show that these functions altogether depend on at most log(1/ε) coordinates. As a result, we can think of these functions as a single log(1/ε)-junta.

Let J₂ := {i ∈ [ℓ] : |I_i∩T| ≥ 2} be the set of functions f_{D,T,i} that depend on at least 2 coordinates, and let Q := ∪_{i∈J₂} I_i∩T be the collection of coordinates these functions depend on. Suppose |Q| ≥ log(1/ε). Then, as |I_i∩T| ≥ 2 for every i ∈ J₂, it must be the case that some u ≤ log(1/ε)/2 of the subsets I_i∩T, i ∈ J₂, together contain at least 2u many free coordinates. The probability of the latter event is at most

(ℓ choose u)·(3wu choose 2u)·(p^{2u} + δ).

Setting u = ⌊log(1/ε)/(2w)⌋ + 1, and recalling that ℓ ≤ 2^{45w}θ^{−5}, p = 2^{−23w}θ³, and δ = θ^k ≤ p^{2u}, the above is at most

ℓ^u·(6w)^{2u}·2·p^{2u} ≤ (2^{45w}θ^{−5})^u·2^{3u·log w}·(2·2^{−46w}θ⁶)^u
≤ (2^{−2w}θ)^u ≤ ε. (4)

Let I₀′ := (T∩I₀) ∪ Q. By (3) and (4), with probability 1−2ε over (D,T), we have |J₁| ≥ 2^{3w}θ^{−2} and |I₀′| ≤ 3log(1/ε). In this case, the function f_{D,T} is a product of at least 2^{3w}θ^{−2} non-constant 1-bit functions and a (3log(1/ε))-junta. In other words, f_{D,T} = ∏_i f_{D,T,i} is an (ℓ′, 1, 3log(1/ε))-product, where ℓ′ ≥ 2^{3w}θ^{−2} and each f_{D,T,i}, i ∈ [ℓ′], is non-constant.

6.2.1 Long products have small bias: Proof of Claim 33

In this section, we prove Claim 33. We start by bounding the bias of a single arbitrary non-constant function on w bits.

Claim 37.

Let 𝒢 be a group of matrices supported on 𝒰_θ(d). We have ‖𝐄[g(U)]‖_op ≤ 1 − 2^{−(2w+2)}θ² for every non-constant function g: {0,1}^w → 𝒢.

Proof.

Let T be the uniform distribution on {0,1}^w. Note that 𝐄[g(U)] = 𝐄_{U,T}[𝐄_{U′}[g(U + T∧U′)]]. Applying Claim 35 with p = 1/2, with probability at least 2^{−(2w−1)}, the function g_{U,T}(x) := g(U + T∧x) is a non-constant 1-bit function. Suppose M₁ := g_{U,T}(1) ≠ g_{U,T}(0) =: M₀. By Claim 22, we have ‖(M₁+M₀)/2‖_op = ‖M₁(I + M₁^{−1}M₀)/2‖_op ≤ 1 − θ²/8. So

‖𝐄[g(U)]‖_op ≤ (1 − 2^{−(2w−1)})·1 + 2^{−(2w−1)}·‖(M₁+M₀)/2‖_op
≤ 1 − 2^{−(2w−1)}·(1 − (1 − θ²/8))
= 1 − 2^{−(2w+2)}θ².
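To illustrate Claim 37 numerically, the following sketch takes the 2-dimensional irreducible representation of the quaternion group Q₈ (for which ‖M − I‖_op ≥ √2 for every element M ≠ I, so θ = √2 is a valid choice under our reading of 𝒰_θ(d); the helpers are ours) and checks the bound for the non-constant function g(x₁,x₂) = ρ(i)^{x₁}ρ(j)^{x₂} on w = 2 bits.

```python
import math

I2 = ((1, 0), (0, 1))
RI = ((1j, 0), (0, -1j))   # rho(i) in the 2-dim irrep of Q8
RJ = ((0, 1), (-1, 0))     # rho(j)
RK = ((0, 1j), (1j, 0))    # rho(k) = rho(i) rho(j)

def op_norm(m):
    # Largest singular value of a 2x2 complex matrix (via eigenvalues of m* m).
    (a, b), (c, d) = m
    g11 = abs(a) ** 2 + abs(c) ** 2
    g22 = abs(b) ** 2 + abs(d) ** 2
    g12 = a.conjugate() * b + c.conjugate() * d
    tr, det = g11 + g22, g11 * g22 - abs(g12) ** 2
    return math.sqrt((tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2)

def average(mats):
    return tuple(tuple(sum(m[r][c] for m in mats) / len(mats) for c in range(2))
                 for r in range(2))

# ||E[g(U)]||_op over the four inputs; Claim 37 gives 1 - 2^{-(2w+2)} theta^2 = 31/32.
bias = op_norm(average([I2, RI, RJ, RK]))
```

The computed bias is 0.5, comfortably below the bound 31/32.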

Proof of Claim 33.

By Claim 37, we have ‖𝐄[f_i(U)]‖_op ≤ 1 − 2^{−(2w+2)}θ² for each i ∈ [ℓ]. Hence, for ℓ ≥ 2^{2w+2}θ^{−2}log(1/ε), we have

‖𝐄[f(U)]‖_op ≤ ∏_{i∈[ℓ]} ‖𝐄[f_i(U)]‖_op ≤ (1 − 2^{−(2w+2)}θ²)^ℓ ≤ exp(−ℓ·2^{−(2w+2)}θ²) ≤ ε.

7 Width reduction of short products: Proof of Lemma 29

In this section, we prove Lemma 29. Recall that D, T are δ-almost k-wise independent distributions with 𝐄[D_i] = 1/2 and 𝐄[T_i] = p, where

k = C(w + log(m/ε))
δ = (mw)^{−k}
p = 2^{−C}

for a sufficiently large constant C.

Given T, say a coordinate i ∈ [n] is fixed if T_i = 0 and free if T_i = 1. Let f(x) := ∏_{i=0}^ℓ f_i(x_{I_i}) be an (ℓ, 3w, 2log(1/ε))-product, where ℓ ≤ m^5·2^{30w}. We will show that with high probability over T, (1) at most log(1/ε) of the 2log(1/ε) coordinates in I₀ are free; (2) for most of the I_i, i ≥ 1, at most 2w coordinates in each of them are free; and (3) the remaining I_i's contain at most log(1/ε) many free coordinates in total.
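The case analysis above only depends on how many free coordinates survive in each block; a tiny hypothetical helper (ours, not from the text) makes this bookkeeping concrete:

```python
def free_counts(t, blocks):
    # Given the restriction pattern t (1 = free, 0 = fixed) and the disjoint
    # index sets I_i, return |T /\ I_i| for each block.
    return [sum(t[j] for j in block) for block in blocks]
```

For example, with three blocks of two coordinates each and pattern [1,0,1,1,0,0], the blocks keep 1, 2, and 0 free coordinates respectively.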

To proceed, let

J := {i ∈ [ℓ] : |T∩I_i| ≥ 2w}  and  Q := ∪_{j∈J} I_j∩T.

It suffices to show the following two claims.

Claim 38.

|Q| ≤ log(1/ε) with probability 1−ε over T.

Claim 39.

|T∩I₀| ≤ log(1/ε) with probability 1−ε over T.

Proof of Lemma 29.

Let I₀′ = (T∩I₀) ∪ Q. By Claims 38 and 39, with probability 1−2ε over T, we have |I₀′| ≤ 2log(1/ε), and for every i ∈ [ℓ]∖J, we have |T∩I_i| ≤ 2w. Therefore, the function f_{D,T} is an (ℓ, 2w, 2log(1/ε))-product.

Proof of Claim 38.

Suppose |Q| ≥ log(1/ε). Then, as |T∩I_j| ≥ 2w for every j ∈ J, it must be the case that some u ≤ log(1/ε)/(2w) of the subsets T∩I_j, j ∈ J, altogether contain at least 2wu many free coordinates. This happens with probability at most

(ℓ choose u)·(3wu choose 2wu)·(p^{2wu} + δ).

Setting u = ⌊log(1/ε)/(3w)⌋ + 1, and recalling that ℓ ≤ m^5·2^{30w} and δ = (mw)^{−k} ≤ p^{2wu}, the above is at most

ℓ^u·2^{3wu}·2·p^{2wu} ≤ (m^5·2^{30w})^u·2^{3wu}·2^{−2Cwu+1}
≤ 2^{u(33w + 5log m − 2Cw)+1}
≤ 2^{−3wu} ≤ ε,

where in the second-to-last inequality we used w ≥ log m.

Proof of Claim 39.

Recall that |I₀| ≤ 2log(1/ε) and δ = (mw)^{−k} ≤ p^{log(1/ε)}. So

Pr[|T∩I₀| ≥ log(1/ε)] ≤ (|I₀| choose log(1/ε))·(p^{log(1/ε)} + δ)
≤ 2^{2log(1/ε)}·2^{−C·log(1/ε)+1}
≤ ε/2.

8 Fooling (𝟏,𝒘,𝟑𝐥𝐨𝐠(𝟏/𝜺))-products over groups

In this section, we show how to extend the PRGs for (ℓ, 1, 0)-products over p-groups (Theorem 5) and over commutative groups [17] to fool (ℓ, 1, 3log(1/ε))-products.

8.1 𝒑-groups

We use the fact that our generator in Theorem 5 is simply the XOR of independent copies of small-bias distributions. The following claim shows that a small-bias distribution conditioned on a small number of its bits remains small-bias.

Claim 40.

Let D be an ε-biased distribution on {0,1}^n. For any set S ⊆ [n] and y ∈ {0,1}^S, the distribution of D conditioned on D_S = y is (2^{|S|+1}ε)-biased on {0,1}^{[n]∖S}.

Proof.

We may assume ε ≤ 2^{−(|S|+1)}, for otherwise the claim is vacuous, and we may assume y = 1^S, by XOR-ing D with a fixed string. For a subset T ⊆ [n], let χ_T(x) := (−1)^{Σ_{i∈T} x_i} be the corresponding parity test, and let T be any nonempty subset of [n]∖S. First observe that

𝟙(D_S = y) = ∏_{i∈S} (1 − χ_{{i}}(D))/2 = 2^{−|S|}·Σ_{S′⊆S} (−1)^{|S′|} χ_{S′}(D).

Taking expectations on both sides and applying the triangle inequality, we have Pr[D_S = y] ≥ 2^{−|S|} − ε ≥ 2^{−(|S|+1)}. Note that

𝐄[χ_T(D) | D_S = y]·Pr[D_S = y] = 𝐄[χ_T(D)·𝟙(D_S = y)]
= 𝐄[χ_T(D)·∏_{i∈S} (1 − χ_{{i}}(D))/2]
= 2^{−|S|}·Σ_{S′⊆S} (−1)^{|S′|} 𝐄[χ_{T∪S′}(D)].

As each T∪S′ is nonempty, the magnitude of the right-hand side is bounded by ε. Therefore, |𝐄[χ_T(D) | D_S = y]| ≤ Pr[D_S = y]^{−1}·ε ≤ 2^{|S|+1}ε.
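Claim 40 can be verified exactly on a toy example. Below, D is uniform on {0,1}⁴ minus the all-zeros string, which is (1/15)-biased; conditioning on the first bit being 0 yields a (1/7)-biased distribution on the remaining three bits, within the claimed factor 2^{|S|+1} = 4. (The helper `bias` is ours.)

```python
from itertools import product

def bias(points, coords):
    # max over nonempty parities T on `coords` of |E[chi_T(D)]|,
    # for D uniform on `points`.
    best = 0.0
    for mask in range(1, 2 ** len(coords)):
        t = [c for i, c in enumerate(coords) if mask >> i & 1]
        e = sum((-1) ** sum(x[c] for c in t) for x in points) / len(points)
        best = max(best, abs(e))
    return best

pts = [x for x in product((0, 1), repeat=4) if any(x)]  # {0,1}^4 minus 0000
eps = bias(pts, [0, 1, 2, 3])                           # = 1/15
cond = [x for x in pts if x[0] == 0]                    # condition on D_S = 0, S = {0}
```

The conditional bias 1/7 is well below the bound 4·(1/15) from the claim.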

Corollary 41.

There is a PRG that ε-fools (n, 1, 3log(1/ε))-products over any p-group of order m with seed length O_m(log(n/ε)).

Proof.

Recall that our generator for p-groups in Theorem 5 is simply the XOR of independent copies of (ε/n)^c-biased distributions. The spill f₀ depends on at most 3log(1/ε) bits, so by Claim 40, for any fixing of the input bits of f₀, each copy remains (2ε^{−3}·(ε/n)^c)-biased, which is ((ε/n)^{c′})-biased for a slightly smaller constant c′. The corollary follows.

8.2 Commutative groups

We now show that the Gopalan–Kane–Meka PRG fools (ℓ, 1, 3log(1/ε))-products. We will use the following PRG by Gopalan, Kane, and Meka [18], which fools (ℓ, 1, 0)-products over ℤ_m. A simple argument shows that the same PRG also fools (ℓ, 1, 3log(1/ε))-products.

Claim 42.

There is an explicit PRG P that ε-fools (ℓ, 1, 3log(1/ε))-products over commutative groups of order m with seed length O_m(log(ℓ/ε)·(log log(ℓ/ε))²).

Lemma 43 (Theorem 1.1 and Lemma 9.1 in [18]).

There is an explicit P_GKM: {0,1}^s → {0,1}^n, where s = O(log(W/ε))·(log log(W/ε))², such that the following holds. If w ∈ ℤ^n satisfies Σ_i |w_i| ≤ W, then

dist_TV(⟨w, U⟩, ⟨w, P_GKM(U)⟩) ≤ O(W)·ε.
Proof of Claim 42.

By Claim 21, it suffices to fool the product over each irreducible representation of G with error ε/m. Since G is commutative, all its irreps are 1-dimensional. Moreover, they are supported on subsets of μ_m := {z ∈ ℂ : z^m = 1}. Let f = ∏_{i=0}^ℓ f_i be an (ℓ, 1, 3log(1/ε))-product over μ_m.

Let ω := e^{2πi/m}. Note that any 1-bit function g: {0,1} → μ_m can be written as

g(y) = ω^{a₁y + a₀(1−y)} = ω^{a₀}·ω^{(a₁−a₀)y}

for some a₀, a₁ ∈ {0, …, m−1}. We can also write f₀(x_{I₀}) as ω^{h(x_{I₀})} for some h: {0,1}^{I₀} → {0, …, m−1}. Therefore, f has the form

f(x) = ω^b·ω^{Σ_{j∈J} a_j x_j + h(x_{I₀})},

for some coefficients b and a_j taking values in {0, …, m−1}, where J and I₀ are disjoint subsets of [n]. Write I₀ = {r₁ < ⋯ < r_{|I₀|}} for some r_j ∈ [n]. Consider the integer-valued function F: {0,1}^n → ℤ defined by

F(x) := Σ_{j∈J} a_j x_j + 2^{⌈log(mℓ)⌉}·Σ_{j=1}^{|I₀|} 2^{j−1} x_{r_j}.

So the low-order ⌈log(mℓ)⌉ bits of F(x) encode the first sum, and the remaining |I₀| bits are the binary encoding of the string x_{I₀}. Note that we can compute f(x) given F(x): the first sum mod m, and hence h(x_{I₀}), are both determined by F(x). Moreover, F(x) = ⟨w, x⟩ for some w ∈ ℤ^n with Σ_i |w_i| ≤ W = O(mℓ/ε³). Therefore, if we let P_GKM be the PRG in Lemma 43 with error ε/(m·O(W)), which uses a seed of O(log(mℓ/ε))·(log log(mℓ/ε))² bits, then it follows that dist_TV(F(U), F(P_GKM(U))) ≤ ε/m.
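Our reading of the encoding (with the shift 2^{⌈log(mℓ)⌉} placed above the largest possible value of the first sum; helper names hypothetical) can be sketched and inverted as follows:

```python
import math

def encode(a, x, I0, m):
    # F(x) = sum_{j not in I0} a_j x_j + SHIFT * (binary encoding of x_{I0}),
    # where SHIFT exceeds the maximum possible value of the first sum.
    low = sum(a[j] * x[j] for j in range(len(x)) if j not in I0)
    shift = 2 ** math.ceil(math.log2(m * len(x) + 1))
    high = sum(2 ** t * x[r] for t, r in enumerate(I0))
    return low + shift * high, shift

def decode(F, shift, m):
    # f(x) is computable from F(x): the low bits give the width-1 sum mod m,
    # the high bits give the junta's input x_{I0}.
    return (F % shift) % m, F // shift
```

For instance, with m = 7, weights [3,1,4,1,5,9], input [1,0,1,1,0,1] and I₀ = {4,5}, the first sum is 8 ≡ 1 (mod 7) and the junta's input encodes to 2, both recoverable from the single integer F(x).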

9 Mixing characterization of Dedekind groups

In this section we give a proof of Lemma 9, which says that a (finite) group G is mixing if and only if it is Dedekind. The proof was provided by Yves de Cornulier at https://mathoverflow.net/a/482286/8271.

Recall from Definition 6 that a (finite) group G is mixing if for every irreducible (unitary) representation ρ and every element g ∈ G with ρ(g) ≠ I, the matrix ρ(g) has no eigenvalue 1, and that G is Dedekind if it is either commutative or has the form Q₈ × ℤ₂ᵗ × D for some integer t and commutative group D of odd order.

We first note that G being mixing is equivalent to the following condition: for every g ∈ G and every irrep ρ, the subspace ker(ρ(g)−I) is a subrepresentation. (Indeed, since ρ is irreducible, ker(ρ(g)−I) is a subrepresentation if and only if it is {0} or the whole space, i.e., ρ(g) has no eigenvalue 1 or ρ(g) = I.) The proof of Lemma 9 follows from the following two claims, together with the equivalence that a group G is Dedekind if and only if every subgroup of G is normal.

Claim 44.

If G is Dedekind, then ker(ρ(g)I) is a subrepresentation for every g and ρ.

Proof.

Take an element g ∈ G. By the definition of Dedekind, every subgroup of G is normal; in particular, the cyclic subgroup ⟨g⟩ is normal.

Take v ∈ W_g := {v : ρ(g)v = v}. To show that W_g is a subrepresentation, we need to show that ρ(g)(ρ(h)v) = ρ(h)v for every h ∈ G. This is equivalent to showing

ρ(h)^{−1}ρ(g)ρ(h)v = ρ(h^{−1}gh)v = v.

Since ⟨g⟩ is normal, we have h^{−1}gh = g^i for some integer i. It is clear that ρ(g^i)v = ρ(g)^i v = ρ(g)^{i−1}(ρ(g)v) = ρ(g)^{i−1}v = ⋯ = v. So indeed W_g is a subrepresentation.

Claim 45.

If ker(ρ(g)−I) is a subrepresentation for every g ∈ G and every irrep ρ of G, then G is Dedekind.

Proof.

As in the previous claim, we consider W_g = ker(ρ(g)−I) = {v : ρ(g)v = v}. Note that for v ∈ W_g, we have ρ(g^i)v = ρ(g)^i v = v. Suppose W_g is a subrepresentation; since the hypothesis holds for every irrep, it extends to direct sums of irreps, hence to every representation ρ of G. Take h ∈ G and v ∈ W_g. Since ρ(h)W_g ⊆ W_g, we have ρ(g)ρ(h)v = ρ(h)v. That means ρ(h^{−1}gh)v = v for every v ∈ W_g. Applying this to the permutation representation of G on the cosets of ⟨g⟩, the vector corresponding to the coset ⟨g⟩ is fixed by g, and hence by h^{−1}gh. This means h^{−1}gh ∈ ⟨g⟩, so ⟨g⟩ is normal. As g was arbitrary, every subgroup of G is normal, and G is Dedekind.
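The forward direction can be checked mechanically for the Dedekind group Q₈: in its 2-dimensional irrep, ker(ρ(g)−I) is {0} for every g with ρ(g) ≠ I (so it is trivially a subrepresentation), which the sketch below verifies by checking that det(ρ(g)−I) ≠ 0. (The matrices and helpers are ours; the 1-dimensional irreps satisfy the condition trivially.)

```python
def scale(s, m):
    return tuple(tuple(s * e for e in row) for row in m)

I2 = ((1, 0), (0, 1))
RI = ((1j, 0), (0, -1j))   # rho(i)
RJ = ((0, 1), (-1, 0))     # rho(j)
RK = ((0, 1j), (1j, 0))    # rho(k)
Q8 = [scale(s, m) for s in (1, -1) for m in (I2, RI, RJ, RK)]

def det_minus_identity(g):
    (a, b), (c, d) = g
    return (a - 1) * (d - 1) - b * c

# For every g with rho(g) != I, rho(g) - I is invertible, i.e. rho(g) has no
# eigenvalue 1, so ker(rho(g) - I) = {0} is a subrepresentation.
dets = [det_minus_identity(g) for g in Q8 if g != I2]
```

By contrast, for a non-Dedekind group such as S₃, a transposition in the 2-dimensional irrep has eigenvalues {1, −1}, so the corresponding kernel is a non-trivial proper subspace and cannot be a subrepresentation.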

References

  • [1] Miklos Ajtai and Avi Wigderson. Deterministic simulation of probabilistic constant-depth circuits. Advances in Computing Research - Randomness and Computation, 5:199–223, 1989.
  • [2] Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost k-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992. doi:10.1002/RSA.3240030308.
  • [3] Noga Alon, Alexander Lubotzky, and Avi Wigderson. Semi-direct product in groups and zig-zag product in graphs: Connections and applications. In IEEE Symp. on Foundations of Computer Science (FOCS), pages 630–637, 2001. doi:10.1109/SFCS.2001.959939.
  • [4] Benny Applebaum, Yuval Ishai, and Eyal Kushilevitz. Cryptography in NC0. SIAM J. on Computing, 36(4):845–888, 2006.
  • [5] Roy Armoni, Michael E. Saks, Avi Wigderson, and Shiyu Zhou. Discrepancy sets and pseudorandom generators for combinatorial rectangles. In 37th IEEE Symp. on Foundations of Computer Science (FOCS), pages 412–421, 1996. doi:10.1109/SFCS.1996.548500.
  • [6] David A. Mix Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC1. J. of Computer and System Sciences, 38(1):150–164, 1989. doi:10.1016/0022-0000(89)90037-8.
  • [7] Jonah Blasiak, Thomas Church, Henry Cohn, Joshua A. Grochow, and Chris Umans. Which groups are amenable to proving exponent two for matrix multiplication? CoRR, abs/1712.02302, 2017. arXiv:1712.02302.
  • [8] Andrej Bogdanov and Emanuele Viola. Pseudorandom bits for polynomials. SIAM J. on Computing, 39(6):2464–2486, 2010. doi:10.1137/070712109.
  • [9] Eshan Chattopadhyay, Pooya Hatami, Kaave Hosseini, and Shachar Lovett. Pseudorandom generators from polarizing random walks. Theory Comput., 15:1–26, 2019. doi:10.4086/TOC.2019.V015A010.
  • [10] Henry Cohn, Robert D. Kleinberg, Balázs Szegedy, and Christopher Umans. Group-theoretic algorithms for matrix multiplication. In IEEE Symp. on Foundations of Computer Science (FOCS), pages 379–388, 2005. doi:10.1109/SFCS.2005.39.
  • [11] Harm Derksen, Peter Ivanov, Chin Ho Lee, and Emanuele Viola. Pseudorandomness, symmetry, smoothing: II, 2024. doi:10.48550/arXiv.2407.12110.
  • [12] Harm Derksen, Chin Ho Lee, and Emanuele Viola. Boosting uniformity in quasirandom groups: fast and simple. In IEEE Symp. on Foundations of Computer Science (FOCS), 2024.
  • [13] Persi Diaconis. Group representations in probability and statistics, volume 11 of Institute of Mathematical Statistics Lecture Notes—Monograph Series. Institute of Mathematical Statistics, Hayward, CA, 1988.
  • [14] Dean Doron, Pooya Hatami, and William M. Hoza. Log-seed pseudorandom generators via iterated restrictions. In Shubhangi Saraf, editor, 35th Computational Complexity Conference, CCC 2020, July 28-31, 2020, Saarbrücken, Germany (Virtual Conference), volume 169 of LIPIcs, pages 6:1–6:36. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.CCC.2020.6.
  • [15] David S. Dummit and Richard M. Foote. Abstract Algebra. Wiley, 3rd edition, 2004.
  • [16] Michael A. Forbes and Zander Kelley. Pseudorandom generators for read-once branching programs, in any order. In IEEE Symp. on Foundations of Computer Science (FOCS), 2018. doi:10.1109/FOCS.2018.00093.
  • [17] Parikshit Gopalan, Daniel Kane, and Raghu Meka. Pseudorandomness via the discrete fourier transform. In IEEE Symp. on Foundations of Computer Science (FOCS), pages 903–922, 2015. doi:10.1109/FOCS.2015.60.
  • [18] Parikshit Gopalan, Daniel M. Kane, and Raghu Meka. Pseudorandomness via the discrete Fourier transform. SIAM J. Comput., 47(6):2451–2487, 2018. doi:10.1137/16M1062132.
  • [19] Parikshit Gopalan, Raghu Meka, Omer Reingold, Luca Trevisan, and Salil Vadhan. Better pseudorandom generators from milder pseudorandom restrictions. In IEEE Symp. on Foundations of Computer Science (FOCS), 2012.
  • [20] Parikshit Gopalan, Raghu Meka, Omer Reingold, and David Zuckerman. Pseudorandom generators for combinatorial shapes. SIAM J. Comput., 42(3):1051–1076, 2013. doi:10.1137/110854990.
  • [21] W. T. Gowers. Generalizations of Fourier analysis, and how to apply them. Bull. Amer. Math. Soc. (N.S.), 54(1):1–44, 2017. doi:10.1090/bull/1550.
  • [22] W. T. Gowers and Emanuele Viola. Mixing in non-quasirandom groups. In ACM Innovations in Theoretical Computer Science conf. (ITCS), 2022.
  • [23] Elad Haramaty, Chin Ho Lee, and Emanuele Viola. Bounded independence plus noise fools products. SIAM J. Comput., 47(2):493–523, 2018. doi:10.1137/17M1129088.
  • [24] Pooya Hatami and William Hoza. Paradigms for unconditional pseudorandom generators. Foundations and Trends® in Theoretical Computer Science, 16(1-2):1–210, 2024. doi:10.1561/0400000109.
  • [25] Michal Koucký, Prajakta Nimbhorkar, and Pavel Pudlák. Pseudorandom generators for group products: extended abstract. In STOC, pages 263–272. ACM, 2011. doi:10.1145/1993636.1993672.
  • [26] Jack B. Kuipers. Quaternions in computer graphics and robotics. In SIGGRAPH 2002 Course Notes, San Antonio, TX, 2002. ACM SIGGRAPH.
  • [27] Chin Ho Lee. Fourier bounds and pseudorandom generators for product tests. In 34th Computational Complexity Conference, volume 137. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.CCC.2019.7.
  • [28] Chin Ho Lee, Edward Pyne, and Salil Vadhan. Fourier growth of regular branching programs. In Approximation, randomization, and combinatorial optimization. Algorithms and techniques, volume 245 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 2, 21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/lipics.approx/random.2022.2.
  • [29] Chin Ho Lee and Emanuele Viola. The coin problem for product tests. ACM Trans. Comput. Theory, 10(3):Art. 14, 10, 2018. doi:10.1145/3201787.
  • [30] Chin Ho Lee and Emanuele Viola. More on bounded independence plus noise: Pseudorandom generators for read-once polynomials. Theory of Computing, 16:1–50, 2020. Available at https://www.ccs.neu.edu/home/viola/papers/LV-rop.pdf. doi:10.4086/TOC.2020.V016A007.
  • [31] Chin Ho Lee and Emanuele Viola. More on bounded independence plus noise: pseudorandom generators for read-once polynomials. Theory Comput., 16:Paper No. 7, 50, 2020. doi:10.4086/toc.2020.v016a007.
  • [32] Shachar Lovett. Unconditional pseudorandom generators for low degree polynomials. Theory of Computing, 5(1):69–82, 2009. doi:10.4086/TOC.2009.V005A003.
  • [33] Shachar Lovett, Partha Mukhopadhyay, and Amir Shpilka. Pseudorandom generators for CC0[p] and the Fourier spectrum of low-degree polynomials over finite fields. In 51th IEEE Symp. on Foundations of Computer Science (FOCS). IEEE, 2010.
  • [34] Shachar Lovett, Omer Reingold, Luca Trevisan, and Salil Vadhan. Pseudorandom bit generators that fool modular sums. In Approximation, randomization, and combinatorial optimization, volume 5687 of Lecture Notes in Comput. Sci., pages 615–630. Springer, Berlin, 2009. doi:10.1007/978-3-642-03685-9_46.
  • [35] Shachar Lovett, Omer Reingold, Luca Trevisan, and Salil P. Vadhan. Pseudorandom bit generators that fool modular sums. In 13th Workshop on Randomization and Computation (RANDOM), volume 5687 of Lecture Notes in Computer Science, pages 615–630. Springer, 2009. doi:10.1007/978-3-642-03685-9_46.
  • [36] Chi-Jen Lu. Improved pseudorandom generators for combinatorial rectangles. Combinatorica, 22(3):417–433, 2002. doi:10.1007/S004930200021.
  • [37] Raghu Meka, Omer Reingold, and Avishay Tal. Pseudorandom generators for width-3 branching programs. In STOC’19 – Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 626–637. ACM, New York, 2019. doi:10.1145/3313276.3316319.
  • [38] Raghu Meka and David Zuckerman. Small-bias spaces for group products. In 13th Workshop on Randomization and Computation (RANDOM), volume 5687 of Lecture Notes in Computer Science, pages 658–672. Springer, 2009. doi:10.1007/978-3-642-03685-9_49.
  • [39] Raghu Meka and David Zuckerman. Small-bias spaces for group products. In Approximation, randomization, and combinatorial optimization, volume 5687 of Lecture Notes in Comput. Sci., pages 658–672. Springer, Berlin, 2009. doi:10.1007/978-3-642-03685-9_49.
  • [40] J. Naor and M. Naor. Small-bias probability spaces: efficient constructions and applications. In 22nd ACM Symp. on the Theory of Computing (STOC), pages 213–223. ACM, 1990. doi:10.1145/100216.100244.
  • [41] Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. on Computing, 22(4):838–856, 1993. doi:10.1137/0222053.
  • [42] Pierre Péladeau and Denis Thérien. On the languages recognized by nilpotent groups (a translation of “Sur les langages reconnus par des groupes nilpotents”). Electron. Colloquium Comput. Complex., TR01-040, 2001. URL: https://eccc.weizmann.ac.il/eccc-reports/2001/TR01-040/index.html.
  • [43] Omer Reingold, Thomas Steinke, and Salil P. Vadhan. Pseudorandomness for regular branching programs via Fourier analysis. In Workshop on Randomization and Computation (RANDOM), pages 655–670, 2013. doi:10.1007/978-3-642-40328-6_45.
  • [44] Eyal Rozenman, Aner Shalev, and Avi Wigderson. Iterative construction of Cayley expander graphs. Theory Comput., 2(5):91–120, 2006. doi:10.4086/TOC.2006.V002A005.
  • [45] Jean Pierre Serre. Linear Representations of Finite Groups. Springer, 1977.
  • [46] Thomas Steinke. Pseudorandomness for permutation branching programs without the group theory. Electron. Colloquium Comput. Complex., TR12-083, 2012. URL: https://eccc.weizmann.ac.il/report/2012/083.
  • [47] Xiaorui Sun. Faster isomorphism for p-groups of class 2 and exponent p. In STOC, pages 433–440. ACM, 2023.
  • [48] Audrey Terras. Fourier analysis on finite groups and applications, volume 43 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1999. doi:10.1017/CBO9780511626265.
  • [49] Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1-3):1–336, 2012. doi:10.1561/0400000010.
  • [50] Emanuele Viola. On the power of small-depth computation. Foundations and Trends in Theoretical Computer Science, 5(1):1–72, 2009. doi:10.1561/0400000033.
  • [51] Emanuele Viola. The sum of d small-bias generators fools polynomials of degree d. Computational Complexity, 18(2):209–217, 2009. doi:10.1007/S00037-009-0273-5.
  • [52] Thomas Watson. Pseudorandom generators for combinatorial checkerboards. Computational Complexity, 22(4):727–769, 2013. doi:10.1007/s00037-012-0036-6.
  • [53] Avi Wigderson. Representation theory of finite groups, and applications. Available at https://www.math.ias.edu/avi/node/2289, 2010.