
On Sums of INW Pseudorandom Generators

William M. Hoza, Department of Computer Science, The University of Chicago, IL, USA. Zelin Lv, Department of Computer Science, The University of Chicago, IL, USA.
Abstract

We study a new approach for constructing pseudorandom generators (PRGs) that fool constant-width standard-order read-once branching programs (ROBPs). Let $X$ be the $n$-bit output distribution of the INW PRG (Impagliazzo, Nisan, and Wigderson, STOC 1994), instantiated using expansion parameter $\lambda$. We prove that the bitwise XOR of $t$ independent copies of $X$ fools width-$w$ programs with error $n^{\log(w+1)} \cdot (\lambda \log n)^t$. Notably, this error bound is meaningful even for relatively large values of $\lambda$ such as $\lambda = 1/O(\log n)$.

Admittedly, our analysis does not yet imply any improvement in the bottom-line overall seed length required for fooling such programs – it just gives a new way of re-proving the well-known $O(\log^2 n)$ bound. Furthermore, we prove that this shortcoming is not an artifact of our analysis, but rather is an intrinsic limitation of our "XOR of INW" approach. That is, no matter how many copies of the INW generator we XOR together, and no matter how we set the expansion parameters, if the generator fools width-3 programs and the proof of correctness does not use any properties of the expander graphs except their spectral expansion, then we prove that the seed length of the generator is inevitably $\Omega(\log^2 n)$.

Still, we hope that our work might be a step toward constructing near-optimal PRGs fooling constant-width ROBPs. We suggest that one could try running the INW PRG on t correlated seeds, sampled via another PRG, and taking the bitwise XOR of the outputs.

Keywords and phrases:
INW generator, pseudorandomness, space-bounded computation, XOR Lemmas
Category:
RANDOM
Copyright and License:
© William M. Hoza and Zelin Lv; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Pseudorandomness and derandomization
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2025/050/ [20]
Acknowledgements:
We thank Huacheng Yu for collaboration during the early stages of this project. We thank Gil Cohen and Dean Doron for valuable discussions. We thank Aaron Potechin for valuable discussions and helpful comments on a draft of this paper.
Editors:
Alina Ene and Eshan Chattopadhyay

1 Introduction

1.1 Pseudorandom generators for space-bounded computation

Randomness can be considered a type of “fuel” for computation. We prefer to use as little randomness as possible, just like we prefer to minimize consumption of other types of “fuel.” A pseudorandom generator (PRG) is a way of decreasing the number of random bits used in computation.

Definition 1 (PRG).

Let $X$ be a distribution over $\{0,1\}^n$ and let $f \colon \{0,1\}^n \to \{0,1\}$. We say that $X$ fools $f$ with error $\varepsilon$ if

\[ \bigl|\mathbb{E}[f(X)] - \mathbb{E}[f]\bigr| \le \varepsilon. \]

Here $\mathbb{E}[f]$ is a shorthand for $\mathbb{E}[f(U_{\{0,1\}^n})]$, where, in general, $U_{\mathcal{S}}$ denotes the uniform distribution over the finite set $\mathcal{S}$. A pseudorandom generator (PRG) is a function $G \colon \mathcal{S} \to \{0,1\}^n$ for some finite set $\mathcal{S}$. We say that $G$ fools $f$ with error $\varepsilon$ if $G(U_{\mathcal{S}})$ fools $f$ with error $\varepsilon$. The seed length of the PRG is the quantity $\log|\mathcal{S}|$. (Usually we want the domain to have the form $\mathcal{S} = \{0,1\}^s$ for some $s \in \mathbb{N}$, but in this work we allow arbitrary domains.)

In this paper, we study PRGs in the context of space-bounded computation. If we wish to simulate a randomized space-bounded algorithm, then we ought to use a PRG that fools standard-order read-once branching programs (ROBPs), defined next.

Definition 2 (Standard-order ROBP).

A width-$w$ length-$n$ standard-order read-once branching program (ROBP) is a directed acyclic multigraph. The vertices are arranged in $n+1$ layers $V_0, V_1, \ldots, V_n$, each consisting of $w$ vertices. Each vertex in $V_i$ with $i < n$ has two outgoing edges leading to $V_{i+1}$, labeled $0$ and $1$. One vertex $v_0 \in V_0$ is designated as the start vertex, and each vertex $v \in V_n$ is labeled with an output value $q_v \in \{0,1\}$. Each input $x \in \{0,1\}^n$ selects a walk through the program, defined by starting at $v_0$ and traversing the outgoing edge with label $x_i$ in step $i$. This walk ends at some vertex $v \in V_n$. The program computes the function $f$ given by $f(x) = q_v$.
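To make the model concrete, here is a minimal evaluation routine (a Python sketch of Definition 2; the table-based representation of the program is our own illustrative choice, not notation from the paper).

```python
# Sketch: evaluating a width-w length-n standard-order ROBP (Definition 2).
# transitions[i][v][b] is the vertex of layer V_{i+1} reached from vertex v of
# layer V_i on input bit b; out_labels[v] is the output bit q_v at layer V_n.
def evaluate_robp(transitions, out_labels, start_vertex, x):
    v = start_vertex
    for i, bit in enumerate(x):          # read x_1, ..., x_n in order
        v = transitions[i][v][bit]
    return out_labels[v]

# Toy width-2 program computing the parity of a 3-bit input:
# vertex 0 = "parity so far is 0", vertex 1 = "parity so far is 1".
parity_layer = [[0, 1], [1, 0]]
transitions = [parity_layer] * 3
print(evaluate_robp(transitions, [0, 1], 0, [1, 0, 1]))   # -> 0
```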

Polynomial-width standard-order ROBPs describe what log-space randomized algorithms do on a fixed input as a function of their random bits. In this paper, we focus on the constant-width case. There isn't necessarily a clear connection between constant-width ROBPs and uniform models of computation such as randomized Turing machines, but constant-width ROBPs still constitute an extremely interesting nonuniform model of space-bounded computation. The width-2 case is well-understood [44, 4, 19, 29]. In particular, Saks and Zuckerman showed in the 1990s that there is an explicit PRG that fools width-2 branching programs with seed length $O(\log n)$ [44], which is optimal. Decades later, Meka, Reingold, and Tal showed that there is an explicit PRG that fools width-3 standard-order ROBPs with seed length $\widetilde{O}(\log n)$ [35]. However, the best seed length bound for width-4 programs is $O(\log^2 n)$, which is a special case of the following classic theorem.

Theorem 3 ([38]).

For every $w, n \in \mathbb{N}$ and every $\varepsilon \in (0,1)$, there exists an explicit PRG that fools width-$w$ length-$n$ standard-order ROBPs with error $\varepsilon$ and seed length $O(\log(wn/\varepsilon) \cdot \log n)$.

The original proof of Theorem 3 is due to Nisan [38]. There is an alternative proof due to Impagliazzo, Nisan, and Wigderson [26] that is more relevant to the present paper. Impagliazzo, Nisan, and Wigderson inductively defined a sequence of PRGs $G_0, G_1, \ldots$, where $G_j \colon \mathcal{S}_j \to \{0,1\}^{2^j}$, as follows.

  1. Base case: Let $\mathcal{S}_0 = \{0,1\}$ and $G_0(x) = x$.

  2. Inductive step: Assume we have already constructed $G_{j-1}$. Let $H_j$ be a $D_j$-regular undirected multigraph on the vertex set $\mathcal{S}_{j-1}$. Define $\mathcal{S}_j = \mathcal{S}_{j-1} \times [D_j]$ and

    \[ G_j(x,y) = \bigl(G_{j-1}(x),\, G_{j-1}(H_j[x,y])\bigr). \]

    Here $H_j[x,y]$ denotes the $y$-th neighbor of the vertex $x$ in $H_j$.

Impagliazzo, Nisan, and Wigderson's original analysis [26] shows that if $\lambda(H_j) \le \lambda$ for every $j$, then $G_{\log n}$ fools width-$w$ length-$n$ standard-order ROBPs with error $\lambda w n$. Here $\lambda(H_j)$ denotes the spectral expansion parameter of $H_j$, which can be defined as the second largest eigenvalue of its transition probability matrix in absolute value. There are numerous explicit constructions of expander graphs $H_j$ such that $\lambda(H_j) \le \lambda$ and $\deg(H_j) \le \operatorname{poly}(1/\lambda)$. (See, for example, Vadhan's pseudorandomness survey [47].) If we use such expander graphs, then the seed length of $G_{\log n}$ is $O(\log n \cdot \log(1/\lambda))$. Choosing $\lambda = \varepsilon/(wn)$ completes the proof of Theorem 3.
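To fix ideas, the following is a minimal Python sketch of this recursion (our own illustration, not from the paper). The neighbor functions stand in for the expanders $H_j$; the toy instantiation uses complete graphs with self-loops, which is not a useful expander choice but exercises the template.

```python
def inw(neighbor_fns, seed):
    """Evaluate the INW generator G_j on `seed`, where j = len(neighbor_fns).
    neighbor_fns[j-1](x, y) returns the y-th neighbor of vertex x in H_j.
    A seed of G_j is a pair (x, y) with x a seed of G_{j-1} and y an edge
    label of H_j; G_0 is the identity on a single bit."""
    if not neighbor_fns:
        return (seed,)
    x, y = seed
    rest, nbr = neighbor_fns[:-1], neighbor_fns[-1]
    return inw(rest, x) + inw(rest, nbr(x, y))

# Toy check: complete graphs with self-loops at every level (the y-th neighbor
# of any vertex is y), which makes G_j output truly uniform bits.
complete = lambda x, y: y
seed = (((0, 1), (1, 0)), ((1, 1), (0, 0)))   # a seed for G_3, so n = 8
print(inw([complete] * 3, seed))              # -> (0, 1, 1, 0, 1, 1, 0, 0)
```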

1.2 Instantiating the INW PRG with relatively mild expanders

It is a major open problem to design PRGs that fool constant-width standard-order ROBPs with seed length $o(\log^2 n)$. A natural thing to try is to simply increase the parameter $\lambda$ in the INW construction. Indeed, even when $\lambda$ is relatively large, it turns out that the INW generator still does a good job of fooling so-called "regular" and "permutation" ROBPs [8, 15, 28, 45, 22]. More generally, one can try using different values of $\lambda$ at the different levels of the recursion, say $\lambda_1, \ldots, \lambda_{\log n}$. For example, Rozenman and Vadhan showed how to use the INW generator, with $\lambda_j = \Omega(1)$ for most but not all $j$, to solve the undirected s-t connectivity problem in deterministic log-space [43], re-proving Reingold's famous theorem [42].

Unfortunately, it turns out that no matter how we set the expansion parameters, the INW generator is provably too weak to fool constant-width standard-order ROBPs with seed length $o(\log^2 n)$ [9, 23]. Making this statement precise is a little subtle, because "the" INW generator is actually a whole family of PRGs, even after we fix the expansion parameters $\lambda_1, \ldots, \lambda_{\log n}$. After all, the definition of the PRG does not specify which specific expander graphs to use. For a vector $\vec{\lambda} = (\lambda_1, \ldots, \lambda_{\log n})$, let us define $\mathrm{INW}(\vec{\lambda})$ to be the set of all PRGs $G_{\log n}$ that can be constructed via the INW template using graphs $H_1, \ldots, H_{\log n}$ satisfying $\lambda(H_j) \le \lambda_j$ for every $j$. As a shorthand, let us also write $\mathrm{INW}(\lambda)$ for the special case $\lambda_1 = \lambda_2 = \cdots = \lambda_{\log n} = \lambda$ when $n$ is clear from context. (See Definition 23 for a more detailed definition.) Then we have the following theorem due to Brody and Verbin [9], with details filled in by Hoza, Pyne, and Vadhan [23]:

Theorem 4 (Limitations of the INW generator [9, 23]).

Let $\vec{\lambda} = (\lambda_1, \ldots, \lambda_{\log n}) \in [0,1]^{\log n}$. If every PRG in the family $\mathrm{INW}(\vec{\lambda})$ fools width-3 standard-order ROBPs with error 0.99, then $\lambda_j \le 1/n^{\Omega(1)}$ for $\Omega(\log n)$ values of $j$, and moreover every PRG in the family $\mathrm{INW}(\vec{\lambda})$ has seed length $\Omega(\log^2 n)$.

Theorem 4 can be interpreted as saying that if one wishes to use the INW template to construct a PRG that fools width-3 standard-order ROBPs with seed length $o(\log^2 n)$, then the proof of correctness would have to exploit some property of the specific graphs $H_1, \ldots, H_{\log n}$ beyond their spectral expansion. It is not clear what property would be helpful.

1.3 A possible way forward: Unpredictability and Yao’s XOR Lemma

In this paper, we propose a new approach for constructing a near-optimal PRG fooling constant-width standard-order ROBPs. The approach is based on the classic notion of unpredictability. Specialized to the setting of standard-order ROBPs, unpredictability can be defined as follows.

Definition 5 (Unpredictability).

Let $X$ be a distribution over $\{0,1\}^n$. We say that $X$ is $\delta$-unpredictable for width-$w$ standard-order ROBPs if, for every $i \in [n]$ and every width-$w$ length-$(i-1)$ standard-order ROBP $f \colon \{0,1\}^{i-1} \to \{0,1\}$, we have

\[ \Pr[f(X_1, X_2, \ldots, X_{i-1}) = X_i] \le \tfrac{1}{2} + \delta. \]

If $X = G(U_{\mathcal{S}})$ where $G \colon \mathcal{S} \to \{0,1\}^n$, then we also describe $G$ itself as being $\delta$-unpredictable for width-$w$ standard-order ROBPs.

Interestingly, even though $\mathrm{INW}(1/\operatorname{polylog} n)$ does not fool width-3 programs (Theorem 4), it turns out that $\mathrm{INW}(1/\operatorname{polylog} n)$ is unpredictable for constant-width programs. More precisely, Fefferman, Shaltiel, Umans, and Viola showed that every PRG in the family $\mathrm{INW}(\lambda)$ is $O(w\lambda\log n)$-unpredictable for width-$w$ standard-order ROBPs [16]. (One way to prove this statement is to use the notion of weight introduced by Braverman, Rao, Raz, and Yehudayoff [8]. For any next-bit predictor $f \colon \{0,1\}^{i-1} \to \{0,1\}$, we can define a test $g \colon \{0,1\}^i \to \{0,1\}$ that checks whether $f$ succeeds: $g(x_1,\ldots,x_i) = f(x_1,\ldots,x_{i-1}) \oplus x_i$. If $f$ can be computed by a width-$w$ standard-order ROBP, then $g$ can be computed by a width-$w$ standard-order ROBP with weight two. From here, Braverman, Rao, Raz, and Yehudayoff's analysis shows that $\mathrm{INW}(\lambda)$ fools $g$ with error $O(w\lambda\log n)$ [8].)

How can we leverage the unpredictability of the INW generator to construct a PRG that actually fools constant-width ROBPs? Fefferman, Shaltiel, Umans, and Viola suggested combining the INW generator with a randomness extractor [16]. We propose a different approach, inspired by Yao's XOR Lemma in circuit complexity. Yao's XOR Lemma is about the average-case hardness of computing a Boolean function, or more generally the hardness of guessing a Boolean random variable $Y \in \{0,1\}$ given some correlated random variable $X \in \{0,1\}^n$. Let $(X^{(1)},Y^{(1)}),\ldots,(X^{(t)},Y^{(t)})$ be $t$ independent copies of $(X,Y)$. Roughly speaking, Yao's XOR Lemma says that if it is "somewhat hard" to guess $Y$ given $X$, then it is "very hard" to guess $Y^{(1)} \oplus \cdots \oplus Y^{(t)}$ given $(X^{(1)},\ldots,X^{(t)})$. Here "somewhat hard" means that all small circuits have, e.g., a constant failure probability, and "very hard" means that all small circuits have a failure probability very close to $1/2$.

Maurer and Tessaro observed that Yao's XOR Lemma implies that the bitwise XOR operation amplifies cryptographic unpredictability [34]. In more detail, let $X$ be a distribution that is $\delta$-unpredictable for small circuits, for a relatively large value of $\delta$ such as $\delta = 0.1$. Let $X' = X^{(1)} \oplus \cdots \oplus X^{(t)}$, where $X^{(1)},\ldots,X^{(t)}$ are independent copies of $X$. Yao's XOR Lemma implies that if a small circuit attempts to guess the $i$-th bit of $X'$ given the first $i-1$ bits of each copy $X^{(1)},\ldots,X^{(t)}$, then its success probability will be very close to $1/2$. If the circuit is only given the first $i-1$ bits of $X'$, then the prediction task becomes even more difficult. Thus, $X'$ is $\delta'$-unpredictable for small circuits, where $\delta' \ll \delta$.

Intuitively, we expect the same phenomenon to occur in the ROBP setting. If $X'$ is the bitwise XOR of $t$ independent copies of some distribution $X$ that is $\delta$-unpredictable for constant-width standard-order ROBPs, then we expect $X'$ to be $\delta'$-unpredictable for such programs, where $\delta' \approx \delta^t$. By Yao's so-called "distinguisher-to-predictor lemma," this would imply that $X'$ actually fools such programs with error $\delta' n$. Based on these considerations, we propose the following two-step approach for constructing near-optimal PRGs fooling constant-width standard-order ROBPs.

  • Step 1: Let $G \in \mathrm{INW}(1/\operatorname{polylog} n)$ be a PRG with seed length $s = \widetilde{O}(\log n)$. Prove that if we sample $t \approx \log n$ seeds $Z^{(1)},\ldots,Z^{(t)}$ independently and uniformly at random, then the bitwise XOR $G(Z^{(1)}) \oplus \cdots \oplus G(Z^{(t)})$ fools constant-width standard-order ROBPs.

  • Step 2: Derandomize the proof of Step 1. That is, prove that $G(Z^{(1)}) \oplus \cdots \oplus G(Z^{(t)})$ fools constant-width standard-order ROBPs even if the seeds $Z^{(1)},\ldots,Z^{(t)}$ are not independent, but rather are generated in some pseudorandom manner using only $\widetilde{O}(\log n)$ truly random bits.

To be clear, "Step 1" is not in any way trivial, even if we ignore "Step 2." Taking a bitwise XOR of several independent samples from a distribution does not always improve its pseudorandomness properties. For example, if $X$ is $k$-wise uniform and uniform over a subspace of $\mathbb{F}_2^n$ (see Section 2.8), then the bitwise XOR of many copies of $X$ has the same distribution as a single copy of $X$. See also Theorem 10.

1.4 Sums of INW generators fool constant-width ROBPs

One of the main results of this paper is to accomplish “Step 1” described above. To state our results more precisely, let us make the following definition.

Definition 6 (XOR of INW generators).

Let $n, t \in \mathbb{N}$ where $n$ is a power of two, and let $\Lambda \in [0,1]^{t\times\log n}$. Let $\mathrm{INW}_t(\Lambda)$ denote the set of all PRGs $G \colon \mathcal{S}_1\times\cdots\times\mathcal{S}_t \to \{0,1\}^n$ of the form

\[ G(z^{(1)},\ldots,z^{(t)}) = G^{(1)}(z^{(1)}) \oplus \cdots \oplus G^{(t)}(z^{(t)}), \]

where $G^{(i)} \in \mathrm{INW}(\Lambda_i)$ for every $i \in [t]$. Here $\Lambda_i$ denotes the $i$-th row of $\Lambda$. As a shorthand, we also write $\mathrm{INW}_t(\lambda)$ if every entry of $\Lambda$ is equal to $\lambda$ and $n$ is clear from context.
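Continuing the sketch from Section 1.1, the generators of Definition 6 simply XOR the outputs of $t$ independently seeded INW generators (again a hypothetical illustration, reusing the `inw` sketch above):

```python
def inw_xor(rows_of_neighbor_fns, seeds):
    """INW_t: bitwise XOR of t INW generators, one per row of the matrix of
    graphs, each run on its own part of the seed."""
    outputs = [inw(fns, s) for fns, s in zip(rows_of_neighbor_fns, seeds)]
    return tuple(sum(bits) % 2 for bits in zip(*outputs))
```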

We prove the following.

Theorem 7 (INWt fools constant-width ROBPs).

Let $n, w, t \in \mathbb{N}$ where $n$ is a power of two, and let $\Lambda \in [0,1]^{t\times\log n}$. Every PRG in the family $\mathrm{INW}_t(\Lambda)$ fools width-$w$ length-$n$ standard-order ROBPs with error

\[ n^{\log(w+1)} \cdot \prod_{i=1}^{t}\sum_{j=1}^{\log n}\Lambda_{i,j}. \]

For example, Theorem 7 implies that there is a value $t = \Theta\bigl(\frac{\log w\cdot\log n + \log(1/\varepsilon)}{\log\log n}\bigr)$ such that $\mathrm{INW}_t(1/\log^2 n)$ fools width-$w$ length-$n$ standard-order ROBPs with error $\varepsilon$. By plugging in explicit expanders of degree $\operatorname{poly}(1/\lambda)$, we get an explicit PRG that fools width-$w$ length-$n$ standard-order ROBPs with error $\varepsilon$ and seed length

\[ O\bigl(\log^2 n\cdot\log w + \log n\cdot\log(1/\varepsilon)\bigr), \]

re-proving Theorem 3 in the case $w = O(1)$. For context, prior to our work, there were already several different known ways to construct PRGs that fool constant-width standard-order ROBPs (and more general models) using a seed of length $O(\log^2 n)$ [38, 26, 2, 41, 18, 7] or $\widetilde{O}(\log^2 n)$ [17]. Our Theorem 7 adds to this list.
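For the record, here is the short calculation behind the parameters just stated (all logarithms are base 2, using the error bound of Theorem 7 with $\Lambda_{i,j} = 1/\log^2 n$):

```latex
\[
  n^{\log(w+1)}\Bigl(\frac{\log n}{\log^2 n}\Bigr)^{t}
  \;=\; n^{\log(w+1)}\,(\log n)^{-t}
  \;\le\;\varepsilon
  \quad\text{as soon as}\quad
  t \;\ge\; \frac{\log(w+1)\,\log n+\log(1/\varepsilon)}{\log\log n}.
\]
```

Each of the $t$ copies uses expanders of degree $\operatorname{poly}(\log n)$ and hence $O(\log n\cdot\log\log n)$ seed bits, so the total seed length is $t\cdot O(\log n\log\log n) = O(\log^2 n\cdot\log w + \log n\cdot\log(1/\varepsilon))$.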

We hope that there is value in having yet another proof of the O(log2n) bound, but of course the real goal is to construct a PRG with seed length o(log2n). As discussed previously, we believe that the most promising approach (“Step 2” in our proposal) is to “derandomize” Theorem 7, i.e., prove that a similar error bound holds even if the t seeds for the INW generators are chosen in some pseudorandom manner instead of sampling them independently. So far, we have not figured out how to make such an approach work. However, we would like to point out several “success stories” describing cases in which prior researchers managed to solve similar problems:

  • There are known derandomized versions of Yao’s XOR Lemma [25, 27, 12].

  • The first asymptotically optimal “small-bias” generator, due to Naor and Naor [37], works by plugging several correlated seeds into a weak initial generator and taking the bitwise XOR of the results [37].

  • There are several constructions of low-error “weighted pseudorandom generators” and “hitting set generators” for space-bounded computation that work by plugging several correlated seeds into a “moderate-error” initial PRG and then combining the results in some manner [24, 40, 14, 21, 5, 11, 10, 13].

We find these success stories encouraging.

1.5 Summing fewer copies of the INW generator does not work

Instead of analyzing correlated seeds, we can also consider a different and more straightforward approach for improving the seed length of our PRG. What happens if we take an XOR of fewer copies of the INW generator while still using relatively mild expander graphs? For example, what if we try INW2(1/polylogn)?

We are inspired to ask this question because of a line of work on the bitwise XOR of small-bias distributions. Recall that by definition, a distribution $X$ over $\{0,1\}^n$ is $\varepsilon$-biased if, for every nonempty set $S \subseteq [n]$, we have $\bigl|\Pr[\bigoplus_{i\in S}X_i = 0] - \Pr[\bigoplus_{i\in S}X_i = 1]\bigr| \le \varepsilon$. A sequence of works has shown that the bitwise XOR of $d$ independent small-bias distributions fools degree-$d$ polynomials over $\mathbb{F}_2$, whereas the XOR of only $d-1$ small-bias distributions does not (unless the bias parameter is extremely small) [6, 32, 48, 33]. This demonstrates that a bitwise XOR of a small number of "weak" PRGs can sometimes be quite strong. Indeed, the XOR of just two small-bias distributions has been proposed as a candidate PRG for ROBPs [36, 31].

Unfortunately, we prove that INW2(1/polylogn) is not capable of fooling width-3 standard-order ROBPs. More generally, no matter how many or how few copies of the INW generator we XOR together, and no matter how we set the expansion parameters, if the resulting PRG fools width-3 programs, then the seed length is inevitably Ω(log2n):

Theorem 8 (Limitations of INWt).

Let $n$ be a power of two, let $t \in \mathbb{N}$, and let $\Lambda \in [0,1]^{t\times\log n}$. If every PRG in the family $\mathrm{INW}_t(\Lambda)$ fools width-3 standard-order ROBPs with error 0.99, then every PRG in the family $\mathrm{INW}_t(\Lambda)$ has seed length $\Omega(\log^2 n)$.

1.6 Increasing 𝒕 vs. decreasing 𝝀: Two incomparable PRG paradigms

Motivated by Theorems 7 and 8, we wish to better understand how our new $\mathrm{INW}_t$ generator compares to the classic INW generator. To be more precise, suppose that for some relatively large value of $\lambda$, it turns out that $\mathrm{INW}(\lambda)$ is too weak to fool some class of functions. To strengthen the PRG, one could try using $\mathrm{INW}_t(\lambda)$ for some $t > 1$, or one could try using $\mathrm{INW}(\lambda')$ for some $\lambda' < \lambda$. Which approach is better?

We prove that these two approaches are in fact incomparable in general. There are cases in which one approach is better, and there are cases in which the other approach is better. We state these results precisely in the following two theorems. The first theorem describes a case in which summing multiple generators is the better approach.

Theorem 9 (A case where XORing is cheaper than using heavy-duty expanders).

  1. For every $\lambda \in (0,1)$, every PRG in $\mathrm{INW}_2(\lambda)$ fools quadratic polynomials over $\mathbb{F}_2$ with error $O(\lambda)$.

  2. Let $\vec{\lambda} \in [0,1]^{\log n}$ where $n$ is a power of two. If every PRG in $\mathrm{INW}(\vec{\lambda})$ fools quadratic polynomials over $\mathbb{F}_2$ with error 0.49, then $\lambda_j \le 2^{-\Omega(2^j)}$ for every $j \ge 4$, and moreover every PRG in $\mathrm{INW}(\vec{\lambda})$ has seed length $\Omega(n)$. (We remark that the distinguishers we construct are read-once quadratic polynomials, and hence they can be computed by width-4 ROBPs that read their input bits in some nonstandard order. This is an example showing that the INW generator does not fool arbitrary-order ROBPs. Tzur previously showed that Nisan's PRG has this same weakness [46], but to the best of our knowledge, ours is the first such example regarding the INW generator. We thank Adin Gitig for pointing out this gap in the literature.)

Item 1 in the theorem above is an immediate consequence of prior work [28, 15, 45, 6, 32, 48]. The main content of the theorem is Item 2. Theorem 9 demonstrates the strength of the “XOR of INW” paradigm. We hope that future researchers can capitalize on the strengths of the INWt generator to develop better PRGs for ROBPs. Next, let us discuss a case in which summing multiple INW generators is worse than simply using one INW generator with a slightly smaller λ value.

Theorem 10 (A case where using heavy-duty expanders is cheaper than XORing).

Let $n$ be a power of two and let $\lambda \in (1000/n,\, 1/2)$. There exist $\lambda' = \Omega(\lambda^2)$ and $f \colon \{0,1\}^n \to \{0,1\}$ such that the following hold.

  1. Every PRG in $\mathrm{INW}(\lambda')$ fools $f$ with error 0.01.

  2. Let $t \in \mathbb{N}$. If every PRG in $\mathrm{INW}_t(\lambda)$ fools $f$ with error 0.99, then $t \ge \Omega(\lambda n/\log n)$, and moreover every PRG in $\mathrm{INW}_t(\lambda)$ has seed length $\Omega(n\lambda\log(1/\lambda))$.

Theorem 10 is valuable because it helps us to interpret Theorem 7. As a reminder, Theorem 7 says that constant-width ROBPs are fooled by a sum of INW(1/polylogn) generators. In a sense, Theorem 10 shows that Theorem 7 “could have been false” and was not “inevitable.” Indeed, Theorem 10 shows that in general, the fact that a function is fooled by INW(1/poly(n)) does not automatically imply that it is fooled by a sum of INW(1/polylogn) generators. Consequently, Theorem 7 reveals a new weakness of the constant-width ROBP model, above and beyond the well-known weakness of being fooled by INW(1/poly(n)). We hope that future researchers can further exploit the weaknesses of such programs to fool them with a shorter seed.

It might be instructive to compare $\mathrm{INW}_t(\lambda)$ with a different family of PRGs. Let $\mathcal{G}_t(\lambda)$ denote the set of generators one can construct by the following recursive paradigm:

\begin{align*}
G_0(x) &= x,\\
G_{i+1}(x,y) &= \bigl(G_i(x),\, G_i(H_{i+1}^{t}[x,y])\bigr),
\end{align*}

where $H_{i+1}$ is a graph satisfying $\lambda(H_{i+1}) \le \lambda$. This is the same as the definition of the INW generator, except that we use the $t$-th power of an expander graph instead of a generic expander graph. Then $\mathcal{G}_t(\lambda) \subseteq \mathrm{INW}(\lambda^t)$, so a theorem saying "Every PRG in $\mathcal{G}_t(\lambda)$ fools width-$w$ length-$n$ standard-order ROBPs with error $\lambda^t w n$" would be true but not interesting. In contrast, Theorem 10 implies that $\mathrm{INW}_t(\lambda) \not\subseteq \mathrm{INW}(\lambda^t)$, and we believe that Theorem 7 is saying something new and interesting.

1.7 Proof techniques

1.7.1 Our proof that 𝐈𝐍𝐖𝒕(𝝀) fools ROBPs, even if 𝝀 is relatively large

In this section, we give a brief informal overview of our proof of Theorem 7, our positive result on using $\mathrm{INW}_t(\Lambda)$ to fool constant-width standard-order ROBPs. For simplicity's sake, let us focus on the case that $\Lambda_{i,j} = \lambda$ for every $i,j$, and let us assume that we take the XOR of $t$ copies of the same generator in $\mathrm{INW}(\lambda)$. The analysis is based on the following alternative, equivalent description of the resulting PRG. We inductively define a sequence of PRGs $G_0, G_1, G_2, \ldots$, where $G_i \colon \mathcal{S}_i^t \to \{0,1\}^{2^i}$, as follows.

  1. Base case: Let $\mathcal{S}_0 = \{0,1\}$ and let $G_0(x_1,\ldots,x_t) = x_1 \oplus \cdots \oplus x_t$.

  2. Inductive step: Assume we have already constructed $G_{j-1}$. Let $H_j$ be a $D_j$-regular expander graph on the vertex set $\mathcal{S}_{j-1}$ satisfying $\lambda(H_j) \le \lambda$. Define $\mathcal{S}_j = \mathcal{S}_{j-1}\times[D_j]$ and

    \[ G_j((x_1,y_1),\ldots,(x_t,y_t)) = \bigl(G_{j-1}(x_1,\ldots,x_t),\, G_{j-1}(H_j[x_1,y_1],\ldots,H_j[x_t,y_t])\bigr). \tag{1} \]

In effect, Equation (1) says that at each stage of the recursion, we use a tensor product of expander graphs ($H_j \otimes \cdots \otimes H_j$) to recycle the seed $(x_1,\ldots,x_t)$. Note that tensoring does not improve expansion: $\lambda(H_j \otimes \cdots \otimes H_j) = \lambda(H_j)$.

Our job is to show that $G_j$ fools width-$w$ standard-order ROBPs. For simplicity's sake, let us focus on showing that the last bit of the output of $G_j$ is $\delta$-unpredictable for such programs, where $\delta = w^j(j\lambda)^t$. Let $f$ be a width-$w$ program that attempts to predict the last bit of the output of $G_j$, given all the previous bits. The idea is to write the transition probability matrix of $H_j \otimes \cdots \otimes H_j$ in the form

\[ (J+E)\otimes(J+E)\otimes\cdots\otimes(J+E). \]

Here $J$ is the transition matrix of the complete graph with self-loops, and $E$ is an "error matrix" with operator norm at most $\lambda$. After expanding, we get a sum of terms, each of which has the form $J^{\otimes k}\otimes E^{\otimes(t-k)}$ (after reordering if necessary).

To analyze such a term, we can first consider any fixing of the last $t-k$ parts of the seed, allowing us to focus on the $J^{\otimes k}$ factor. This $J^{\otimes k}$ factor corresponds to running $\mathrm{INW}_k(\lambda)$ twice, on two independent seeds, and concatenating the results. Since $f$ is trying to predict the last bit, the first copy of $\mathrm{INW}_k(\lambda)$ is completely irrelevant; it does not give $f$ any advantage whatsoever. Meanwhile, by induction, the second copy is $\varepsilon$-unpredictable where $\varepsilon = w^{j-1}((j-1)\lambda)^k$. Furthermore, the matrix $E^{\otimes(t-k)}$ has operator norm at most $\lambda^{t-k}$. As it turns out, this has the effect of dampening the inductive advantage bound by a factor of $w\lambda^{t-k}$, so that overall the advantage from the $J^{\otimes k}\otimes E^{\otimes(t-k)}$ term is at most $w^j(j-1)^k\lambda^t$. Summing over all terms, we get an overall advantage bound of $w^j\lambda^t\sum_{k=0}^{t}\binom{t}{k}(j-1)^k = w^j(j\lambda)^t$.
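For reference, the final summation above is just the binomial theorem:

```latex
\[
  w^{j}\lambda^{t}\sum_{k=0}^{t}\binom{t}{k}(j-1)^{k}
  \;=\; w^{j}\lambda^{t}\bigl((j-1)+1\bigr)^{t}
  \;=\; w^{j}(j\lambda)^{t}.
\]
```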

In the full proof, in order to optimize parameters and to make the proof as simple as possible, we ultimately do not formally use the notion of unpredictability, but the idea remains the same.

1.7.2 Our proof that sums of INW PRGs with seed length 𝒐(𝐥𝐨𝐠𝟐𝒏) do not fool ROBPs

Next, let us discuss our proof that if the entries of $\Lambda$ are large enough that there exists a generator in $\mathrm{INW}_t(\Lambda)$ with seed length $o(\log^2 n)$, then the entries of $\Lambda$ are so large that there exists a generator in $\mathrm{INW}_t(\Lambda)$ that does not fool width-3 programs (Theorem 8). For simplicity's sake, let us focus on the case $t = 2$. The proof builds on the works of Brody and Verbin [9], Hoza, Pyne, and Vadhan [23], and Lee and Viola [31]. We construct our $\mathrm{INW}_2$ generator using Cayley graphs over the group $\mathbb{F}_2^n$, i.e., expander graphs of the form $H[x,y] = x \oplus G(y)$ where $G$ is a small-bias generator. By using both Cayley graphs and complete graphs where appropriate, we ensure that the output of our $\mathrm{INW}_2$ generator has many substrings of the form

\[ \bigl(x\oplus x',\; x\oplus x'\oplus G(y)\oplus G(y')\bigr), \]

where $x, x', G(y), G(y') \in \{0,1\}^m$ for some $m = \Theta(\log n)$.

We instantiate $G$ using a small-bias generator constructed by Lee and Viola [31]. Their generator outputs noisy codewords, i.e., the output is $a + b$ where $a$ is a random element of a subspace $\mathcal{C} \subseteq \mathbb{F}_2^m$ and $b$ is a random vector of low Hamming weight. Crucially, Lee and Viola observe that the sum of two noisy codewords is another noisy codeword. Consequently, letting $z = x \oplus x'$, the output of our $\mathrm{INW}_2$ generator has many substrings of the form

\[ \bigl(z,\; \text{something close to } \mathcal{C}\oplus z\bigr). \]

The specific parameters ensure that the description above is nontrivial, i.e., there is some $v$ such that such a substring is never equal to $(0^m, v)$. Our width-3 distinguisher checks all the relevant regions of its input and accepts if it ever sees the specific substring $(0^m, v)$. With some tweaks, it turns out that this approach is strong enough to prove Theorem 8.

1.7.3 Our proof that INW with small 𝝀 is incomparable with 𝐈𝐍𝐖𝒕 with large 𝝀

Next, let us briefly explain how we construct an INW generator $G$ that does not fool quadratic polynomials over $\mathbb{F}_2$ (Item 2 in Theorem 9). The construction is based on the inner product function $\mathsf{IP}$. We construct an expander graph $H = (V,E)$ such that $\mathsf{IP}(x,y) = 0$ for every $(x,y) \in E$. The construction is essentially $E = \mathsf{IP}^{-1}(0)$, except that we must redirect a few edges to ensure that the graph is regular. Using this expander graph $H$, we can construct an INW generator whose output is always rejected by $\mathsf{IP}$. In contrast, $\mathbb{E}[\mathsf{IP}] = 1/2 - o(1)$ under the uniform distribution.

Finally, let us briefly discuss the proof of Theorem 10 (a case where $\mathrm{INW}_t(\lambda)$ performs poorly, but $\mathrm{INW}(\lambda')$ performs well even though $\lambda'$ is only slightly smaller than $\lambda$). We use Cayley graphs to construct an $\mathrm{INW}_t(\lambda)$ generator whose output has many $(2m)$-bit substrings of the form

\[ \Bigl(\bigoplus_{i=1}^{t}x_i,\;\; \bigoplus_{i=1}^{t}\bigl(x_i\oplus G_i(y_i)\bigr),\;\; \bigoplus_{i=1}^{t}x_i',\;\; \bigoplus_{i=1}^{t}\bigl(x_i'\oplus G_i(y_i')\bigr)\Bigr), \]

where $G_1,\ldots,G_t$ are small-bias generators, and, crucially, we ensure that $y_1$ and $y_1'$ disagree somewhere in their last $\log(1/\lambda)$ bits.

Our distinguisher begins by canceling out the $x$'s to compute the sums $\bigoplus_{i=1}^{t}G_i(y_i)$ and $\bigoplus_{i=1}^{t}G_i(y_i')$. Next, the distinguisher inverts these sums of small-bias generators to compute the underlying seeds $y_1,\ldots,y_t,y_1',\ldots,y_t'$. (We use a probabilistic argument to show that there exist small-bias generators $G_1,\ldots,G_t$ for which this inversion procedure is possible.) Finally, the distinguisher rejects if the last $\log(1/\lambda)$ bits of $y_1$ agree with the last $\log(1/\lambda)$ bits of $y_1'$, and otherwise it accepts.

By construction, the distinguisher accepts every output of our $\mathrm{INW}_t(\lambda)$ generator. In contrast, we show that it has low acceptance probability under any $\mathrm{INW}(\lambda')$ generator, where $\lambda'$ is just a little smaller than $\lambda$. The proof is based on two ideas.

  • In later rounds of the INW recursion, we can use standard techniques based on "communication bottlenecks" to argue that the expander graph in the INW construction does not introduce much error. In particular, after processing $\bigoplus_i x_i$ and $\bigoplus_i(x_i\oplus G_i(y_i))$, the distinguisher only needs to remember the last $\log(1/\lambda)$ bits of the computed seed $y_1$. As a result, the acceptance probability is close to what it would be if we used independent seeds instead of correlated seeds in these later rounds of the INW construction.

  • Our remaining task is to bound our distinguisher's acceptance probability under a concatenation of many independent $\mathrm{INW}(\lambda')$ generators, each of which outputs $m$ bits. We are not able to say anything about the distribution of the distinguisher's computed $y_1$ seed. However, simply because they are independent and identically distributed, there is a noticeable chance that the distinguisher's computed $y_1$ and $y_1'$ seeds happen to agree in their last $\log(1/\lambda)$ bits. By doing $O(1/\lambda)$ independent trials, we ensure that the overall acceptance probability is low.

The proofs of Theorems 9 and 10 are omitted from this extended abstract, but they can be found in the full version of this paper [20].

1.8 Additional related work

Assadi and N. established a general XOR lemma for streaming algorithms [3] (see also work by Lee, Pyne, and Vadhan [30]). However, their XOR lemma does not imply anything about $\mathrm{INW}_t$, because they focus on the scenario in which the streaming algorithm sees $t$ instances of a problem in sequence, one after another, instead of seeing the bitwise XOR of the instances.

Several prior works have proved limitations on the power of sums of small-bias distributions to fool various types of tests [6, 33, 36, 4, 31]. Our negative result about using INWt to fool ROBPs (Theorem 8) is in a similar spirit, and indeed the proof builds on Lee and Viola’s work [31] as mentioned previously.

2 Preliminaries

2.1 Read-once evaluation programs (ROEPs)

Ultimately, the model of computation we are interested in is the standard-order ROBP model (Definition 2). At intermediate stages of our argument, we will use a slight generalization of the ROBP model called the "read-once evaluation program" (ROEP) model, introduced by Braverman, Rao, Raz, and Yehudayoff [8]. An ROEP is simply an ROBP with fractional output values:

Definition 11 (Standard-order ROEP [8]).

A width-$w$ length-$n$ standard-order read-once evaluation program (ROEP) is defined just like a width-$w$ length-$n$ standard-order ROBP (Definition 2), except that the output values $q_v$ are permitted to be any values in $[0,1]$. Thus, the program computes a function $f \colon \{0,1\}^n \to [0,1]$. We say that a distribution $X$ over $\{0,1\}^n$ fools $f$ with error $\varepsilon$ if $|\mathbb{E}[f(X)] - \mathbb{E}[f]| \le \varepsilon$.

2.2 Graphs

For any graph H, we use V(H) to denote the vertex set of H. As discussed in the introduction, we use the notation H[x,y] to denote the y-th neighbor of the vertex x in the graph H. This is well-defined provided that H is a labeled graph, defined as follows.

Definition 12 (Graph labeling).

Let $H = (V,E)$ be a $D$-regular directed multigraph. We say that $H$ is labeled if, for every vertex $x$, the outgoing edges are labeled $1,\ldots,D$. In this case, we write $H[x,y]$ to denote the vertex reached from $x$ by traversing the outgoing edge labeled $y$. If $H$ is a $D$-regular undirected multigraph, then we identify $H$ with the symmetric digraph obtained by replacing each undirected edge $\{x,x'\}$ with two directed edges, $(x,x')$ and $(x',x)$. If we say that $H$ is "labeled," we allow the two edges $(x,x')$ and $(x',x)$ to have distinct labels.

We write $K_R$ to denote the complete graph on $R$ vertices without self-loops, and we write $J_R$ to denote the complete graph on $R$ vertices with self-loops. We write $J$ if the number of vertices is clear from context. Several occurrences of $J$ in a single equation might represent complete graphs on different numbers of vertices.

If $H$ is a $D$-regular undirected multigraph on $R$ vertices, its transition probability matrix is the matrix $M \in [0,1]^{R\times R}$ defined by letting $M_{u,v} = e(u,v)/D$, where $e(u,v)$ denotes the number of edges from $u$ to $v$. We often abuse notation by identifying a graph $H$ with its transition probability matrix. For example, we use $J_R$ to denote the $R\times R$ matrix in which every entry is equal to $1/R$.

2.3 Spectral expansion

For a matrix $M \in \mathbb{R}^{R\times R}$, we use the notation $\|M\|_{\mathrm{op}}$ to denote the operator norm of $M$, i.e.,

\[ \|M\|_{\mathrm{op}} = \max_{x\in\mathbb{R}^R,\,\|x\|_2=1}\|xM\|_2. \]
Definition 13 (Expansion parameter).

Let $H$ be a regular undirected multigraph. The expansion parameter $\lambda(H)$ is defined as

\[ \lambda(H) = \|H - J\|_{\mathrm{op}}. \]

Equivalently, one can define $\lambda(H)$ to be the second-largest eigenvalue of $H$ in absolute value.

For example, λ(J)=0. The complete graph without self-loops also has quite a good expansion parameter:

Fact 14 (Expansion parameter of the complete graph without self-loops).

For every $R \in \mathbb{N}$, we have $\lambda(K_R) = 1/(R-1)$.

Proof sketch.

The following $R$ vectors are linearly independent eigenvectors of $K_R$:

  • The all-ones vector, which has eigenvalue $1$;

  • Vectors of the form $(1,0,0,\ldots,0,-1,0,0,\ldots,0)$, each of which has eigenvalue $-1/(R-1)$.
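As a quick numerical sanity check of Fact 14 (a Python sketch using numpy, not part of the paper):

```python
# The transition matrix of K_R has eigenvalue 1 once and eigenvalue -1/(R-1)
# with multiplicity R-1, so lambda(K_R) = 1/(R-1).
import numpy as np

R = 6
M = (np.ones((R, R)) - np.eye(R)) / (R - 1)   # transition matrix of K_R
eigs = np.sort(np.linalg.eigvalsh(M))
print(eigs)                                    # approx [-0.2 x 5, 1.0]
print(max(abs(e) for e in eigs if not np.isclose(e, 1.0)))   # 0.2 = 1/(R-1)
```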

Cayley graphs are a more interesting class of expanders. The general definition is as follows.

Definition 15 (Cayley graph).

Let $V$ be a group, let $\mathcal{Y}$ be a finite set, and let $G \colon \mathcal{Y} \to V$ be a function. The Cayley graph $\mathrm{Cay}(V,G)$ is a labeled $|\mathcal{Y}|$-regular directed multigraph on the vertex set $V$ defined by

\[ \mathrm{Cay}(V,G)[x,y] = x \circ G(y), \tag{2} \]

where $\circ$ is the group operation.

We will only use this definition in the special case $V = \mathbb{F}_2^n$, so Equation (2) becomes

\[ \mathrm{Cay}(\mathbb{F}_2^n, G)[x,y] = x \oplus G(y). \]

It is well-known that if G is a small-bias generator, then Cay(𝔽2n,G) is an expander. We include the proof for completeness’ sake.

Lemma 16 (Expanders from small-bias generators).

Let $n \in \mathbb{N}$ and let $G \colon \mathcal{Y} \to \mathbb{F}_2^n$ be a $\lambda$-biased generator. Then $\lambda(\mathrm{Cay}(\mathbb{F}_2^n, G)) \le \lambda$.

Proof.

Let $A$ be the $2^n\times2^n$ transition probability matrix of $\mathrm{Cay}(\mathbb{F}_2^n, G)$. Then

\[ A = \frac{1}{|\mathcal{Y}|}\sum_{y}A^{(y)}, \]

where $A^{(y)}_{x,x'} = 1$ if $x' = x\oplus G(y)$ and $0$ otherwise.

The $2^n$ eigenvectors of $A^{(y)}$ are the character functions $\chi_a$ (viewed as vectors indexed by $\mathbb{F}_2^n$). Indeed,

\[ (\chi_a A^{(y)})_x = \sum_{x'}\chi_a(x')A^{(y)}_{x',x} = \chi_a(x\oplus G(y)) = \chi_a(x)\chi_a(G(y)), \]

because the only $x'$ for which $A^{(y)}_{x',x}$ is nonzero is $x' = x\oplus G(y)$. Hence $\chi_a$ is an eigenvector of $A^{(y)}$ with eigenvalue $\chi_a(G(y))$, and by linearity $\chi_a$ is an eigenvector of $A$ with eigenvalue

\[ \lambda_a = \frac{1}{|\mathcal{Y}|}\sum_{y}\chi_a(G(y)) = \mathbb{E}_y[\chi_a(G(y))]. \]

This accounts for all $2^n$ eigenvalues of $A$. When $a = 0$, we find $\lambda_0 = 1$, and so

\[ \lambda(\mathrm{Cay}(\mathbb{F}_2^n, G)) = \max_{a\ne0}\bigl|\mathbb{E}_y[\chi_a(G(y))]\bigr|. \]

Since $G$ is a $\lambda$-biased generator, $|\mathbb{E}_y[\chi_a(G(y))]| \le \lambda$ for all nonzero $a \in \mathbb{F}_2^n$.
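The following toy computation (a Python sketch, not part of the paper) illustrates Lemma 16 numerically: for a small generator $G$, the spectral expansion of $\mathrm{Cay}(\mathbb{F}_2^n, G)$ equals the maximum bias of $G$. The specific generator `gens` is an arbitrary made-up example.

```python
import numpy as np

n = 3
gens = [0b001, 0b010, 0b100, 0b110]          # a toy G mapping 4 labels into F_2^3
N = 1 << n
A = np.zeros((N, N))
for x in range(N):
    for g in gens:
        A[x, x ^ g] += 1.0 / len(gens)       # step x -> x XOR G(y)

eigs = np.sort(np.abs(np.linalg.eigvalsh(A)))[::-1]
lam = eigs[1]                                 # lambda(Cay(F_2^n, G))

def bias(a):                                  # |E_y[(-1)^{<a, G(y)>}]|
    return abs(np.mean([(-1) ** (bin(a & g).count("1") % 2) for g in gens]))

max_bias = max(bias(a) for a in range(1, N))
print(lam, max_bias)                          # equal, as in the proof of Lemma 16
```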

2.4 Lower bound on the degree of expander graphs

To prove our seed length lower bounds, we rely on the following standard fact.

Proposition 17 (Expander graph degree lower bound).

Let $H$ be an undirected $D$-regular graph on $R$ vertices. Then

\[ \lambda(H) \ge \sqrt{\frac{1}{D}\cdot\frac{R-D}{R-1}}. \]

In particular, $D \ge \min\{1/(2\lambda(H)^2),\,(R+1)/2\}$, and if $D = 1$, then $\lambda(H) = 1$.

A proof of Proposition 17 can be found in the full version of this paper [20].

2.5 Tensor products

Definition 18 (Tensor product of graphs).

Given a pair of labeled graphs $H_1, H_2$ on $R_1, R_2$ vertices with degrees $D_1, D_2$ respectively, define the tensor product $H_1\otimes H_2$ to be the $(D_1D_2)$-regular graph on $R_1R_2$ vertices with neighbor relation $(H_1\otimes H_2)[(u_1,u_2),(e_1,e_2)] = (H_1[u_1,e_1],\, H_2[u_2,e_2])$.

Proposition 19 (Spectral expansion of a tensor product).

Let $H_1, H_2$ be undirected regular graphs. Then $\lambda(H_1\otimes H_2) = \max(\lambda(H_1),\lambda(H_2))$.

We also use the notation $M\otimes M'$ to denote the tensor product of matrices, a.k.a. the Kronecker product.

Fact 20 (Operator norm of tensor product).

For any two matrices $M, M'$, we have

\[ \|M\otimes M'\|_{\mathrm{op}} = \|M\|_{\mathrm{op}}\cdot\|M'\|_{\mathrm{op}}. \]
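A quick numerical check of Fact 20 (a Python sketch, not part of the paper); `np.linalg.norm(·, 2)` returns the largest singular value, i.e., the operator norm:

```python
import numpy as np

rng = np.random.default_rng(0)
M, Mp = rng.standard_normal((3, 3)), rng.standard_normal((4, 4))
lhs = np.linalg.norm(np.kron(M, Mp), 2)       # operator norm of the Kronecker product
rhs = np.linalg.norm(M, 2) * np.linalg.norm(Mp, 2)
print(np.isclose(lhs, rhs))                   # True
```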

2.6 Expander mixing lemma

We will use the following weak version of the famous “expander mixing lemma.”

Lemma 21 (Expander mixing lemma).

Let $H = (V,E)$ be a regular undirected multigraph. Let $f, g \colon V \to \{0,1\}$. Sample a uniform random vertex $X$, then sample a uniform random neighbor $Y$ of $X$. Then

\[ \bigl|\mathbb{E}[f(X)g(Y)] - \mathbb{E}[f]\cdot\mathbb{E}[g]\bigr| \le \lambda(H). \]

For a proof, see, e.g., Vadhan’s pseudorandomness survey [47].

2.7 INW generators

We now present a more precise definition of the INW generator, in case the informal definition in the introduction was not sufficiently clear.

Definition 22 (Permissible families of graphs).

Let $n$ be a power of two, let $D_1,\ldots,D_{\log n} \in \mathbb{N}$, let $H_i$ be a labeled $D_i$-regular undirected multigraph for every $i \in [\log n]$, and let $\vec{H} = (H_1,\ldots,H_{\log n})$. We say that $\vec{H}$ is permissible if $V(H_1) = [2]$ and $V(H_{i+1}) = V(H_i)\times[D_i]$ for every $i \in [\log n - 1]$, where $V(H)$ denotes the vertex set of $H$.

More generally, suppose $\mathcal{H}$ is a $t\times\log n$ matrix of labeled regular undirected multigraphs:

\[ \mathcal{H} = \begin{bmatrix} H_{1,1} & \cdots & H_{1,\log n}\\ \vdots & \ddots & \vdots\\ H_{t,1} & \cdots & H_{t,\log n} \end{bmatrix}. \]

We say that $\mathcal{H}$ is permissible if each row of $\mathcal{H}$ is permissible.

Definition 23 (INW generators).

Let $\vec{H} = (H_1,\ldots,H_{\log n})$ be a permissible family of labeled regular undirected multigraphs. We define $\mathrm{INW}_{\vec{H}} \colon V(H_{\log n})\times[D_{\log n}] \to \{0,1\}^n$ recursively by the formulas

\begin{align*}
\mathrm{INW}_{()}(x) &= x,\\
\mathrm{INW}_{(H_1,\ldots,H_j)}(x,y) &= \bigl(\mathrm{INW}_{(H_1,\ldots,H_{j-1})}(x),\, \mathrm{INW}_{(H_1,\ldots,H_{j-1})}(H_j[x,y])\bigr).
\end{align*}

More generally, let $\mathcal{H}$ be a $t\times\log n$ matrix of labeled regular undirected multigraphs, say

\[ \mathcal{H} = \begin{bmatrix} H_{1,1} & \cdots & H_{1,\log n}\\ \vdots & \ddots & \vdots\\ H_{t,1} & \cdots & H_{t,\log n} \end{bmatrix}, \]

and assume that $\mathcal{H}$ is permissible. We define $\mathrm{INW}^t_{\mathcal{H}} \colon \bigl(V(H_{1,\log n})\times[D_{1,\log n}]\bigr)\times\cdots\times\bigl(V(H_{t,\log n})\times[D_{t,\log n}]\bigr) \to \{0,1\}^n$ by the formula

\[ \mathrm{INW}^t_{\mathcal{H}}(x_1,\ldots,x_t) = \mathrm{INW}_{\mathcal{H}_1}(x_1)\oplus\cdots\oplus\mathrm{INW}_{\mathcal{H}_t}(x_t), \]

where $\mathcal{H}_1,\ldots,\mathcal{H}_t$ are the rows of $\mathcal{H}$.

For a vector $\vec{\lambda} \in [0,1]^{\log n}$, we define

\[ \mathrm{INW}(\vec{\lambda}) = \bigl\{\mathrm{INW}_{\vec{H}} : \vec{H}\text{ is a }1\times\log n\text{ permissible family and }\lambda(H_j)\le\lambda_j\text{ for all }j\bigr\}. \]

When $n$ is clear from context, we use $\mathrm{INW}(\lambda)$ as a shorthand for the case that $\lambda_j = \lambda$ for every $j$. Similarly, for a matrix $\Lambda \in [0,1]^{t\times\log n}$, we define

\[ \mathrm{INW}_t(\Lambda) = \bigl\{\mathrm{INW}^t_{\mathcal{H}} : \mathcal{H}\text{ is a }t\times\log n\text{ permissible family and }\lambda(H_{i,j})\le\Lambda_{i,j}\text{ for all }i,j\bigr\}. \]

When $n$ is clear from context, we use $\mathrm{INW}_t(\lambda)$ as a shorthand for the case that $\Lambda_{i,j} = \lambda$ for every $i,j$.

2.8 Binary linear code

In this section we briefly review basic concepts from coding theory.

Definition 24 (Binary Linear Code).

A binary linear code 𝒞 of block length m is a subspace of 𝔽2m, where 𝔽2 is the field with two elements. The minimum distance d of 𝒞 is the smallest Hamming weight among all nonzero codewords in 𝒞.

Definition 25 (Binary Entropy Function).

For $0 \le \delta \le 1$, the binary entropy function is defined as

\[ H(\delta) = -\delta\log_2(\delta) - (1-\delta)\log_2(1-\delta), \]

with the convention that $0\log_2 0 = 0$.

Theorem 26 (GV bound for binary linear codes).

For every $m, k \in \mathbb{N}$ such that $k \le m/2$, there exists a binary linear code of block length $m$ with minimum distance $k+1$ and dimension at least $(1 - H(k/m))\cdot m$.

Corollary 27.

For every $\delta \in (0,1/2)$, there is a subspace $\mathcal{C} \subseteq \mathbb{F}_2^m$ of dimension at most $H(\delta)\cdot m$ such that the uniform distribution over $\mathcal{C}$ is $(\delta m)$-wise uniform, meaning that every $\delta m$ bits of this distribution are jointly uniform over $\{0,1\}^{\delta m}$.

The proofs of Theorem 26 and Corollary 27 can be found in the full version of this paper [20].

3 Proof that sums of INW generators fool constant-width ROBPs

In this section, we present the proof of Theorem 7, which says that $\mathrm{INW}_t(\Lambda)$ fools width-$w$ programs with error $n^{\log(w+1)}\prod_{i=1}^{t}\sum_{j=1}^{\log n}\Lambda_{i,j}$. The proof is based on the notion of a robust PRG. Roughly speaking, a robust PRG is a multi-seed PRG that still works even if some seeds are fixed to arbitrary values, with an error bound that depends on the number of random seeds. The precise definition is as follows.

Definition 28 (Robust PRG).

Let $\mathcal{S}_1,\ldots,\mathcal{S}_t$ be finite sets, let $G \colon \mathcal{S}_1\times\cdots\times\mathcal{S}_t \to \{0,1\}^n$, let $W \ge 0$, let $\vec{\varepsilon} = (\varepsilon_1,\ldots,\varepsilon_t) \in [0,1]^t$, and let $f \colon \{0,1\}^n \to [0,1]$. We say that $G$ robustly $(W,\vec{\varepsilon})$-fools $f$ if the following holds. For every $A \subseteq [t]$ and every fixing $x_{[t]\setminus A} \in \prod_{i\in[t]\setminus A}\mathcal{S}_i$, if we sample $x_A \in \prod_{i\in A}\mathcal{S}_i$ uniformly at random, then

\[ \Bigl|\mathbb{E}_{x_A}\bigl[f(G(x_1,\ldots,x_t))\bigr] - \mathbb{E}[f]\Bigr| \le W\prod_{i\in A}\varepsilon_i. \]

For example, if $\mathcal{S}_1 = \cdots = \mathcal{S}_t = \{0,1\}$ and $G(x_1,\ldots,x_t) = x_1\oplus\cdots\oplus x_t$, then $G$ robustly $(1,\vec{0})$-fools every function $f \colon \{0,1\}\to[0,1]$. This example will serve as the base case of our analysis of $\mathrm{INW}_t$. The inductive step will be based on the following lemma, which shows how to double the output length of a robust PRG fooling ROEPs.

Lemma 29 (Inductive step in the analysis of INWt).

Let $n, w \in \mathbb{N}$ where $n$ is even. Let $\mathcal{S}_1,\ldots,\mathcal{S}_t$ be finite sets. Let $G \colon \mathcal{S}_1\times\cdots\times\mathcal{S}_t \to \{0,1\}^{n/2}$. Let $W \ge 0$ and $\vec{\varepsilon} \in [0,1]^t$, and assume that $G$ robustly $(W,\vec{\varepsilon})$-fools all width-$w$ length-$(n/2)$ standard-order ROEPs. For each $i \in [t]$, let $H^{(i)}$ be a $D_i$-regular multigraph on the vertex set $\mathcal{S}_i$. Let $\vec{\lambda} = (\lambda(H^{(1)}),\ldots,\lambda(H^{(t)}))$. Define $G' \colon (\mathcal{S}_1\times[D_1])\times\cdots\times(\mathcal{S}_t\times[D_t]) \to \{0,1\}^n$ by the formula

\[ G'((x_1,y_1),\ldots,(x_t,y_t)) = \bigl(G(x_1,\ldots,x_t),\, G(H^{(1)}[x_1,y_1],\ldots,H^{(t)}[x_t,y_t])\bigr). \]

Then $G'$ robustly $(W(w+1),\vec{\varepsilon}+\vec{\lambda})$-fools all width-$w$ length-$n$ standard-order ROEPs.

Proof.

Let $f \colon \{0,1\}^n \to [0,1]$ be a width-$w$ length-$n$ standard-order ROEP. Fix any set $A \subseteq [t]$. For every $i \in [t]\setminus A$, let $x_i, x_i' \in \mathcal{S}_i$ be arbitrary fixed values, and write $X_i = x_i$ and $X_i' = x_i'$. Meanwhile, for every $i \in A$, sample $X_i \in \mathcal{S}_i$ and $Y_i \in [D_i]$ independently and uniformly at random, and let $X_i' = H^{(i)}[X_i,Y_i]$. Let $X = (X_1,\ldots,X_t)$ and $X' = (X_1',\ldots,X_t')$. Our goal is to show that $(G(X),G(X'))$ fools $f$ with error $W(w+1)\prod_{i\in A}(\varepsilon_i+\lambda_i)$.

For each vertex $u$ in the middle layer of $f$, define $f_u \colon \{0,1\}^{n/2} \to \{0,1\}$ by letting $f_u(z)$ indicate whether $f$ reaches $u$ when it reads $z$. Furthermore, define $f_u' \colon \{0,1\}^{n/2} \to [0,1]$ by letting $f_u'(z')$ be the label of the vertex reached in the final layer if we start at $u$ and read $z'$. Let $\mu_u = \mathbb{E}[f_u']$ and $\bar{f}_u = f_u' - \mu_u$. Then for any $z, z' \in \{0,1\}^{n/2}$, we have

\[ f(z,z') = \sum_{u\in[w]}f_u(z)f_u'(z') = \Bigl(\sum_{u\in[w]}f_u(z)\bar{f}_u(z')\Bigr) + \Bigl(\sum_{u\in[w]}f_u(z)\mu_u\Bigr). \]

The second sum above depends only on $z$, so it is computable by a width-$w$ length-$(n/2)$ standard-order ROEP $f_{\mathrm{left}} \colon \{0,1\}^{n/2} \to [0,1]$. By assumption, $G(X)$ fools $f_{\mathrm{left}}$ with error $W\prod_{i\in A}\varepsilon_i$.

Now consider a single term in the first sum, $f_u(z)\bar{f}_u(z')$. Under the uniform distribution, we have

\[ \mathbb{E}[f_u\cdot\bar{f}_u] = \mathbb{E}[f_u]\cdot\mathbb{E}[\bar{f}_u] = \mathbb{E}[f_u]\cdot(\mu_u - \mu_u) = 0. \]

Now we analyze the expectation under the pseudorandom distribution $(G(X),G(X'))$. We begin by writing the expectation as a sum. For convenience, for any set $S \subseteq [t]$, let us use the notation $\mathcal{S}_S$ to denote the Cartesian product $\prod_{i\in S}\mathcal{S}_i$. Furthermore, let us identify the graph $H^{(i)}$ with its transition probability matrix. Then we have

\begin{align*}
\mathbb{E}\bigl[f_u(G(X))\,\bar{f}_u(G(X'))\bigr]
&= \sum_{x_A,x_A'\in\mathcal{S}_A}\Pr[X = x\text{ and }X' = x']\; f_u(G(x))\,\bar{f}_u(G(x'))\\
&= \sum_{x_A,x_A'\in\mathcal{S}_A}\Bigl(\prod_{i\in A}\frac{H^{(i)}_{x_i,x_i'}}{|\mathcal{S}_i|}\Bigr) f_u(G(x))\,\bar{f}_u(G(x')),
\end{align*}

where the notation $x$ denotes the vector $x = (x_1,\ldots,x_t)$, and similarly $x' = (x_1',\ldots,x_t')$. Next, we use the decomposition $H^{(i)} = J_{|\mathcal{S}_i|} + E^{(i)}$, where $J_{|\mathcal{S}_i|}$ has $1/|\mathcal{S}_i|$ in every entry and $E^{(i)}$ is a matrix with operator norm $\lambda_i$. Applying this decomposition entrywise, we get

\begin{align*}
\mathbb{E}\bigl[f_u(G(X))\,\bar{f}_u(G(X'))\bigr]
&= \sum_{x_A,x_A'\in\mathcal{S}_A}\Bigl(\prod_{i\in A}\Bigl(\frac{1}{|\mathcal{S}_i|^2} + \frac{E^{(i)}_{x_i,x_i'}}{|\mathcal{S}_i|}\Bigr)\Bigr) f_u(G(x))\,\bar{f}_u(G(x'))\\
&= \sum_{A=S\sqcup T}\;\sum_{x_T,x_T'\in\mathcal{S}_T}\Bigl(\prod_{i\in T}\frac{E^{(i)}_{x_i,x_i'}}{|\mathcal{S}_i|}\Bigr)\,\mathbb{E}_{x_S,x_S'\in\mathcal{S}_S}\bigl[f_u(G(x))\,\bar{f}_u(G(x'))\bigr],
\end{align*}

where the outer sum is over all partitions of $A$ into two disjoint sets, $S$ and $T$. The product $\prod_{i\in T}E^{(i)}_{x_i,x_i'}$ is exactly the $(x_T,x_T')$ entry of the tensor product matrix $\bigotimes_{i\in T}E^{(i)}$. Meanwhile, the expectation

\[ \mathbb{E}_{x_S,x_S'\in\mathcal{S}_S}\bigl[f_u(G(x))\,\bar{f}_u(G(x'))\bigr] \]

splits as a product of expectations, because $f_u(G(x))$ depends only on the random coordinates $x_S$ while $\bar{f}_u(G(x'))$ depends only on $x_S'$. Thus, we get

\[ \mathbb{E}\bigl[f_u(G(X))\,\bar{f}_u(G(X'))\bigr] = \sum_{A=S\sqcup T}\frac{1}{|\mathcal{S}_T|}\sum_{x_T,x_T'\in\mathcal{S}_T}\Bigl(\bigotimes_{i\in T}E^{(i)}\Bigr)_{x_T,x_T'}\,\mathbb{E}_{x_S\in\mathcal{S}_S}\bigl[f_u(G(x))\bigr]\cdot\mathbb{E}_{x_S'\in\mathcal{S}_S}\bigl[\bar{f}_u(G(x'))\bigr]. \]

We can think of the first expectation, $\mathbb{E}_{x_S\in\mathcal{S}_S}[f_u(G(x))]$, as a single entry $a_{x_T}$ of a long vector $a$. Similarly, we think of the second expectation, $\mathbb{E}_{x_S'\in\mathcal{S}_S}[\bar{f}_u(G(x'))]$, as a single entry $b_{x_T'}$ of a long vector $b$. In this way, the inner sum above becomes a vector-matrix-vector product:

\begin{align*}
\bigl|\mathbb{E}[f_u(G(X))\,\bar{f}_u(G(X'))]\bigr|
&= \Bigl|\sum_{A=S\sqcup T}\frac{1}{|\mathcal{S}_T|}\, a^{\mathsf{T}}\Bigl(\bigotimes_{i\in T}E^{(i)}\Bigr) b\Bigr|\\
&\le \sum_{A=S\sqcup T}\frac{1}{|\mathcal{S}_T|}\,\|a\|_2\,\Bigl\|\bigotimes_{i\in T}E^{(i)}\Bigr\|_{\mathrm{op}}\,\|b\|_2\\
&\le \sum_{A=S\sqcup T}\|a\|_\infty\,\|b\|_\infty\prod_{i\in T}\bigl\|E^{(i)}\bigr\|_{\mathrm{op}}\\
&= \sum_{A=S\sqcup T}\|a\|_\infty\,\|b\|_\infty\prod_{i\in T}\lambda_i.
\end{align*}

The function $f_u$ is $\{0,1\}$-valued, so $\|a\|_\infty \le 1$. Meanwhile, for any fixing of $x_T' \in \mathcal{S}_T$, the entry $b_{x_T'}$ is precisely the error when we sample $x_S' \in \mathcal{S}_S$ uniformly at random and try to use $G(x')$ to fool $f_u'$, a width-$w$ length-$(n/2)$ standard-order ROEP. By assumption, $\|b\|_\infty \le W\prod_{i\in S}\varepsilon_i$. Therefore,

\[ \bigl|\mathbb{E}[f_u(G(X))\,\bar{f}_u(G(X'))]\bigr| \le W\sum_{A=S\sqcup T}\Bigl(\prod_{i\in S}\varepsilon_i\Bigr)\Bigl(\prod_{i\in T}\lambda_i\Bigr) = W\prod_{i\in A}(\varepsilon_i+\lambda_i). \]

Consequently, summing up all the errors, we get

\[ \bigl|\mathbb{E}[f(G(X),G(X'))] - \mathbb{E}[f]\bigr| \le \Bigl(\sum_{u\in[w]}W\prod_{i\in A}(\varepsilon_i+\lambda_i)\Bigr) + W\prod_{i\in A}\varepsilon_i \le W(w+1)\prod_{i\in A}(\varepsilon_i+\lambda_i). \]

Proof of Theorem 7.

Let $G_{\log n}$ be any PRG in the family $\mathrm{INW}_t(\Lambda)$. We will prove by induction on $n$ that $G_{\log n}$ robustly $(W,\vec{\varepsilon})$-fools width-$w$ length-$n$ standard-order ROEPs, where $W = (w+1)^{\log n}$ and $\varepsilon_i = \sum_{j=1}^{\log n}\Lambda_{i,j}$. For the base case, if $n = 1$, then there is exactly one PRG in the family $\mathrm{INW}_t(\Lambda)$, namely $G_0 \colon \{0,1\}^t \to \{0,1\}$ given by $G_0(x) = x_1\oplus\cdots\oplus x_t$. This PRG indeed robustly $(1,\vec{0})$-fools all functions $f \colon \{0,1\}\to[0,1]$. Now, for the inductive step, suppose $n > 1$. By definition of $\mathrm{INW}_t(\Lambda)$, the PRG $G_{\log n}$ has the form $G_{\log n}(a_1,\ldots,a_t) = G^{(1)}_{\log n}(a_1)\oplus\cdots\oplus G^{(t)}_{\log n}(a_t)$, where $G^{(i)}_{\log n} \in \mathrm{INW}(\Lambda_i)$ for every $i$. By definition of $\mathrm{INW}(\Lambda_i)$, the seed $a_i$ is a pair $(x_i,y_i)$, and the generator $G^{(i)}_{\log n}$ has the form

\[ G^{(i)}_{\log n}(x_i,y_i) = \bigl(G^{(i)}_{\log n - 1}(x_i),\, G^{(i)}_{\log n - 1}(H^{(i)}[x_i,y_i])\bigr) \]

for some $G^{(i)}_{\log n - 1} \in \mathrm{INW}((\Lambda_{i,1},\ldots,\Lambda_{i,\log n - 1}))$ and some graph $H^{(i)}$ with $\lambda(H^{(i)}) \le \Lambda_{i,\log n}$. Define a PRG $G_{\log n - 1}$ by the rule

\[ G_{\log n - 1}(x_1,\ldots,x_t) = G^{(1)}_{\log n - 1}(x_1)\oplus\cdots\oplus G^{(t)}_{\log n - 1}(x_t). \]

Then $G_{\log n - 1} \in \mathrm{INW}_t(\Lambda')$, where $\Lambda'$ consists of all but the last column of $\Lambda$. By induction, $G_{\log n - 1}$ robustly $((w+1)^{\log n - 1},\vec{\alpha})$-fools width-$w$ length-$(n/2)$ standard-order ROEPs, where $\alpha_i = \sum_{j=1}^{\log n - 1}\Lambda_{i,j}$. Working through the definitions, we see that the PRG $G_{\log n}$ can be written as

\[ G_{\log n}((x_1,y_1),\ldots,(x_t,y_t)) = \bigl(G_{\log n - 1}(x_1,\ldots,x_t),\, G_{\log n - 1}(H^{(1)}[x_1,y_1],\ldots,H^{(t)}[x_t,y_t])\bigr). \]

Applying Lemma 29 completes the inductive step.

Finally, since $G_{\log n}$ robustly $(W,\vec{\varepsilon})$-fools width-$w$ length-$n$ standard-order ROEPs, it follows that $G_{\log n}$ fools width-$w$ length-$n$ standard-order ROBPs with error

\[ \varepsilon = W\cdot\prod_{i=1}^{t}\varepsilon_i = n^{\log(w+1)}\prod_{i=1}^{t}\sum_{j=1}^{\log n}\Lambda_{i,j}. \]

4 Seed length lower bound for fooling ROBPs

In this section, we prove Theorem 8, which states that no matter how many (or how few) copies of the INW generator we XOR together, and regardless of how we set the expansion parameters, if the resulting PRG fools width-3 programs, then the seed length is inevitably Ω(log2n).

4.1 Sums of small-bias distributions

We begin by analyzing a family of small-bias distributions introduced by Lee and Viola [31]. This construction guarantees that summing independent samples from these distributions does not substantially increase the overall number of distinct strings.

Definition 30 (Sum of sets).

For $S, T \subseteq \{0,1\}^m$, let $S + T = \{s\oplus t : s\in S,\, t\in T\}$.

Lemma 31 (Small-bias distributions with a small sum set).

For every $m, t \in \mathbb{N}$ and all $\varepsilon_1,\ldots,\varepsilon_t \in (0,1]$ such that $\ln(1/\varepsilon_1)+\cdots+\ln(1/\varepsilon_t) < \frac{m}{625}$, there exist distributions $D_1,\ldots,D_t$ over $\{0,1\}^m$ such that $D_i$ is $\varepsilon_i$-biased for every $i$, and

\[ \bigl|\mathrm{Supp}(D_1)+\cdots+\mathrm{Supp}(D_t)\bigr| < 2^{m/2}. \]

Furthermore, the probability mass function of each $D_i$ takes only rational values.

Proof.

For each $i \in [t]$, we construct $D_i$ by taking the bitwise XOR $X\oplus Y_i$, where $X$ is distributed uniformly over the $(m/25)$-wise uniform subspace $\mathcal{C}$ given by Corollary 27 (with $\delta = 1/25$), and $Y_i$ is an independent "noise vector" constructed as follows. Repeat the following process $r_i$ times independently, where $r_i = 25\ln(1/\varepsilon_i)$: choose a position uniformly at random from $[m]$, and set it to a uniform random bit. The remaining bits of $Y_i$ are zero. Note that $X$ is uniform over a subspace of $\mathbb{F}_2^m$, and the probability of obtaining any particular noise vector $y$ is a multiple of $(1/m)^{r_i}$, which is a rational number.
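For concreteness, here is a minimal sampler sketch for the distribution $D_i$ just described (our own illustration; `code_basis`, a basis of the subspace $\mathcal{C}$ from Corollary 27 given as integer bitmasks, is an assumed input, and we round $r_i$ up to an integer):

```python
import math
import random

def sample_noisy_codeword(code_basis, m, eps_i):
    """Sample X XOR Y_i: a uniform codeword of C plus a sparse noise vector."""
    r_i = math.ceil(25 * math.log(1 / eps_i))
    x = 0
    for b in code_basis:                 # uniform element of the subspace C
        if random.random() < 0.5:
            x ^= b
    y = 0
    for _ in range(r_i):                 # r_i rounds: set a random position to a random bit
        pos = random.randrange(m)
        bit = random.randrange(2)
        y = (y & ~(1 << pos)) | (bit << pos)
    return x ^ y
```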

First we prove that each $D_i$ is $\varepsilon_i$-biased. For any character function $\chi_S$ with $|S| < m/25$: since $X$ and $Y_i$ are independent and $X$ is $(m/25)$-wise uniform, $\chi_S$ is perfectly fooled by $D_i$. Next, consider any character function $\chi_S$ with $|S| \ge m/25$. In this case, the bias is nonzero only if none of the positions in $S$ is selected by the random noise $Y_i$, so the bias is at most

\[ (1 - |S|/m)^{r_i} \le \exp(-r_i|S|/m) \le \varepsilon_i. \]

Thus, we have shown that each $D_i$ is $\varepsilon_i$-biased.

By the closure property of linear subspaces,

\begin{align*}
\bigl|\mathrm{Supp}(D_1)+\cdots+\mathrm{Supp}(D_t)\bigr|
&= \bigl|\mathrm{Supp}(X)+\mathrm{Supp}(Y_1)+\cdots+\mathrm{Supp}(X)+\mathrm{Supp}(Y_t)\bigr|\\
&= \bigl|\mathcal{C}+\cdots+\mathcal{C}+\mathrm{Supp}(Y_1)+\cdots+\mathrm{Supp}(Y_t)\bigr|\\
&= \bigl|\mathcal{C}+\mathrm{Supp}(Y_1)+\cdots+\mathrm{Supp}(Y_t)\bigr|.
\end{align*}

The set $\mathcal{C}+\mathrm{Supp}(Y_1)+\cdots+\mathrm{Supp}(Y_t)$ is precisely the set of all binary strings within Hamming distance at most $r_1+\cdots+r_t \le 25\ln(1/\varepsilon_1)+\cdots+25\ln(1/\varepsilon_t) < m/25$ of $\mathcal{C}$. Therefore,

\[ \bigl|\mathcal{C}+\mathrm{Supp}(Y_1)+\cdots+\mathrm{Supp}(Y_t)\bigr| \le |\mathcal{C}|\cdot\sum_{j=0}^{m/25}\binom{m}{j} \le 2^{H(1/25)m}\cdot 2^{H(1/25)m} < 2^{m/2}. \]

 Remark 32 (The benefit of noisy codewords).

In the proof of Lemma 31, we use small-bias distributions based on noisy codewords [31]. A more straightforward approach for minimizing $|\mathrm{Supp}(D_1)+\cdots+\mathrm{Supp}(D_t)|$ would be to simply make each $|\mathrm{Supp}(D_i)|$ as small as possible and then use the trivial bound

\[ \bigl|\mathrm{Supp}(D_1)+\cdots+\mathrm{Supp}(D_t)\bigr| \le \prod_{i=1}^{t}|\mathrm{Supp}(D_i)|. \]

This approach is too weak to prove Lemma 31. Indeed, every small-bias distribution $D_i$ satisfies $|\mathrm{Supp}(D_i)| \ge \Omega(m)$ [1], so we inevitably have $\prod_{i=1}^{t}|\mathrm{Supp}(D_i)| \ge \Omega(m)^t$, which makes the bound trivial once $t \ge 1.01\,m/\log m$. In contrast, Lemma 31 is meaningful even when $t = \Theta(m)$.

4.2 Constructing an 𝐈𝐍𝐖𝒕 generator that does not fool ROBPs

Recall that the seed length of $\mathrm{INW}^t_{\mathcal{H}}$ is $\sum_{i\in[t],\,j\in[\log n]}\log(\deg(H_{i,j}))$. Our goal is to show that if every PRG in $\mathrm{INW}_t(\Lambda)$ fools width-3 programs, then every PRG in $\mathrm{INW}_t(\Lambda)$ has seed length $\Omega(\log^2 n)$. Indeed, we will show that for each PRG in $\mathrm{INW}_t(\Lambda)$, the seed length grows by $\Omega(\log n)$ in each of $\Omega(\log n)$ rounds of the INW recursion. The lemma below, formulated in the contrapositive, makes this precise.

Lemma 33 (One-round seed length lower bound for fooling width-3 programs).

Let $n$ be a sufficiently large power of two, let $t \in \mathbb{N}$, let $\Lambda \in [0,1]^{t\times\log n}$, and let $j \in [\log\log n,\,\tfrac{1}{4}\log n]$. Assume that there exists a $t\times\log n$ permissible family of labeled regular undirected multigraphs $\mathcal{H} = \{H_{i,j'}\}$ such that $\lambda(H_{i,j'}) \le \Lambda_{i,j'}$ for every $i,j'$ and $\log(\deg(H_{1,j}))+\cdots+\log(\deg(H_{t,j})) < \tfrac{1}{20000}\log n$. Then there exists another $t\times\log n$ permissible family of labeled regular undirected multigraphs $\mathcal{H}' = \{H'_{i,j'}\}$ such that $\lambda(H'_{i,j'}) \le \Lambda_{i,j'}$ for every $i,j'$, along with a length-$n$, width-3, standard-order ROBP $B$, such that $\Pr[B(U_{\{0,1\}^n}) = 1] \ge 1 - \exp(-n^{1/4})$ but $B(\mathrm{INW}^t_{\mathcal{H}'}(x)) = 0$ for every seed $x$.

Proof.

We first partition $[t]$ according to whether $\deg(H_{i,j}) \ge 1/(2\Lambda_{i,j}^2)$. Let $T_1 \subseteq [t]$ be the set of indices $i$ such that $\deg(H_{i,j}) \ge 1/(2\Lambda_{i,j}^2)$, and let $T_2 = [t]\setminus T_1$; by Proposition 17, $\deg(H_{i,j}) \ge (|V(H_{i,j})|+1)/2$ for every $i \in T_2$. Without loss of generality, we assume that $T_1 = [t_1]$ and $T_2 = [t]\setminus[t_1]$. Note that if $\Lambda_{i,j} = 0$ for some $i \in [t]$, then $i \in T_2$.

Let $M$ be a power of two satisfying $n^{1/8} \le M < n^{1/4}$, and let $m = \log M$. Now we construct a new family of graphs $\mathcal{H}'$. For the indices $i \in T_1$, we use Lemma 31 to construct a family of distributions $D_i$, $i \in T_1$, over $\{0,1\}^m$ such that each $D_i$ is $\Lambda_{i,j}$-biased. To verify the hypothesis of Lemma 31, recall from Proposition 17 that any graph $H_{i,j}$ with $\deg(H_{i,j}) = 1$ satisfies $\lambda(H_{i,j}) = \Lambda_{i,j} = 1$. Thus,

\begin{align*}
\sum_{i\in T_1}\log(1/\Lambda_{i,j})
&= \sum_{i\in T_1,\,\Lambda_{i,j}\ne1}\log(1/\Lambda_{i,j}) + \sum_{i\in T_1,\,\Lambda_{i,j}=1}\log(1/\Lambda_{i,j})\\
&\le \sum_{i\in T_1,\,\Lambda_{i,j}\ne1}\log(2\deg(H_{i,j})) + 0\\
&\le 2\sum_{i\in[t]}\log(\deg(H_{i,j})) \le \tfrac{1}{10000}\log n < \tfrac{1}{1000}m,
\end{align*}

where we used the fact that if $\Lambda_{i,j} \ne 1$, then $\deg(H_{i,j}) \ge 2$. Let $Z_i = \mathrm{Cay}(\mathbb{F}_2^m, \mathrm{SB}_i)$, where $\mathrm{SB}_i \colon \mathcal{K}_i \to \{0,1\}^m$ is some generator such that $\mathrm{SB}_i(U_{\mathcal{K}_i}) = D_i$. (Such a generator exists because all probabilities under $D_i$ are rational numbers.) This graph satisfies $\lambda(Z_i) \le \Lambda_{i,j}$ by Lemma 16, and each vertex in $Z_i$ has $|\mathrm{Supp}(D_i)|$ distinct neighbors. Let $q = 2^{j-1}$ and $Q = 2^q$, and define

\[ W_i = Z_i \otimes J_{Q/M}. \]

Recall that $\mathcal{H}'_1,\ldots,\mathcal{H}'_t$ denote the rows of $\mathcal{H}'$. For each $i \in T_1$, let

\[ \mathcal{H}'_i = \bigl[J_2,\;\ldots,\;J_{2^{2^{j-2}}},\;W_i,\;J,\;\ldots,\;J\bigr], \]

that is, for every index $j' \ne j$, the graph $H'_{i,j'}$ is the complete graph with self-loops of the appropriate size, and in the $j$-th position we use the graph $W_i$ constructed above.

For each $i \in T_2$, let

\[ \mathcal{H}'_i = \bigl[H_{i,1},\;H_{i,2},\;\ldots,\;H_{i,j-1},\;H_{i,j},\;J,\;\ldots,\;J\bigr], \]

i.e., the graphs up to position $j$ are exactly the same as the corresponding graphs in $\mathcal{H}$, and after position $j$ we use complete graphs with self-loops of the appropriate sizes. Combining the rows $\mathcal{H}'_i$ for $i \in T_1$ and $i \in T_2$, we obtain the new family $\mathcal{H}'$. Since $\mathcal{H}$ satisfies the constraint $\Lambda$, so does $\mathcal{H}'$.

By the definition of $\mathrm{INW}^t_{\mathcal{H}'}$,

\begin{align*}
\mathrm{INW}^t_{\mathcal{H}'}(x_1^0,\ldots,x_t^0)
&= \mathrm{INW}_{\mathcal{H}'_1}(x_1^0)\oplus\cdots\oplus\mathrm{INW}_{\mathcal{H}'_t}(x_t^0)
 = G^{(1)}_{\log n}(x_1^0)\oplus\cdots\oplus G^{(t)}_{\log n}(x_t^0)\\
&= \bigl(G^{(1)}_{\log n-1}(x_1),\,G^{(1)}_{\log n-1}(H'_{1,\log n}[x_1,y_1])\bigr)\oplus\cdots\oplus\bigl(G^{(t)}_{\log n-1}(x_t),\,G^{(t)}_{\log n-1}(H'_{t,\log n}[x_t,y_t])\bigr),
\end{align*}

where $x_i$ and $y_i$ are the substrings of $x_i^0$ of the appropriate lengths, and each $G^{(i)}_{\log n}$ is an INW generator in $\mathrm{INW}(\Lambda_i)$, so that $G^{(i)}_{j'}$, $j' \in [\log n]$, is defined recursively. Recursively expanding these generators down to the $j$-th round, we can express the output of $\mathrm{INW}^t_{\mathcal{H}'}$ as a concatenation of $n/2^j$ independent blocks (since $H'_{i,j'}$ is a complete graph with self-loops for every $i \in [t]$ and every $j < j' \le \log n$), each block having the form

\begin{align*}
&G^{(1)}_{j}(x_1,y_1)\oplus\cdots\oplus G^{(t)}_{j}(x_t,y_t)\\
&\quad= \bigl(G^{(1)}_{j-1}(x_1),\,G^{(1)}_{j-1}(H'_{1,j}[x_1,y_1])\bigr)\oplus\cdots\oplus\bigl(G^{(t)}_{j-1}(x_t),\,G^{(t)}_{j-1}(H'_{t,j}[x_t,y_t])\bigr)\\
&\quad= \Bigl(x_1\oplus\cdots\oplus x_{t_1}\oplus G^{(t_1+1)}_{j-1}(x_{t_1+1})\oplus\cdots\oplus G^{(t)}_{j-1}(x_t),\\
&\qquad\quad H'_{1,j}[x_1,y_1]\oplus\cdots\oplus H'_{t_1,j}[x_{t_1},y_{t_1}]\oplus G^{(t_1+1)}_{j-1}(H'_{t_1+1,j}[x_{t_1+1},y_{t_1+1}])\oplus\cdots\oplus G^{(t)}_{j-1}(H'_{t,j}[x_t,y_t])\Bigr),
\end{align*}

since $G^{(i)}_{j-1}$ is the identity function for $i \in T_1$. Here $x_i, y_i$ denote the corresponding parts of the seed.

Let $R_i$ be the set of all possible values of $G^{(i)}_{j-1}(x_i)\oplus G^{(i)}_{j-1}(H'_{i,j}[x_i,y_i])$, where $x_i$ ranges over all vertices and $y_i$ ranges over all edge labels, and let $R'_i = \{x_{1,\ldots,m} : x \in R_i\}$ be the set of first-$m$-bit prefixes of strings in $R_i$. In the following, we show that $|R'_1+\cdots+R'_t| < 2^m$; hence there is at least one $v \in \{0,1\}^m$ such that $v \notin R'_1+\cdots+R'_t$. Our approach is to bound

\[ |R'_1+\cdots+R'_t| \le \Bigl|\sum_{i\in T_1}R'_i\Bigr|\cdot\prod_{i\in T_2}|R'_i|. \]

We bound $|\sum_{i\in T_1}R'_i|$ and $\prod_{i\in T_2}|R'_i|$ separately.

For each $i \in T_1$, let $x_i^{(1)} \in \{0,1\}^m$ be the first $m$ bits of $x_i$, and let $x_i^{(2)} \in \{0,1\}^{q-m}$ be the remaining bits, so that $x_i = (x_i^{(1)}, x_i^{(2)})$. As for $y_i$: since $W_i = Z_i\otimes J_{Q/M}$, we can write $y_i = (y_i^{(1)}, y_i^{(2)})$, where $y_i^{(1)} \in \mathcal{K}_i$ and $y_i^{(2)} \in \{0,1\}^{q-m}$. Thus, for $i \in T_1$, we have

\begin{align*}
x_i \oplus H'_{i,j}[x_i,y_i]
&= (x_i^{(1)}, x_i^{(2)}) \oplus H'_{i,j}\bigl[(x_i^{(1)},x_i^{(2)}),(y_i^{(1)},y_i^{(2)})\bigr]\\
&= (x_i^{(1)}, x_i^{(2)}) \oplus W_i\bigl[(x_i^{(1)},x_i^{(2)}),(y_i^{(1)},y_i^{(2)})\bigr]\\
&= (x_i^{(1)}, x_i^{(2)}) \oplus \bigl(Z_i[x_i^{(1)},y_i^{(1)}],\, J_{Q/M}[x_i^{(2)},y_i^{(2)}]\bigr)\\
&= (x_i^{(1)}, x_i^{(2)}) \oplus \bigl(Z_i[x_i^{(1)},y_i^{(1)}],\, y_i^{(2)}\bigr).
\end{align*}

By our construction of $Z_i$, for any choice of $x_i^{(1)}, y_i^{(1)}$, we have $x_i^{(1)}\oplus Z_i[x_i^{(1)},y_i^{(1)}] \in \mathrm{Supp}(D_i)$, and therefore

\[ \bigoplus_{i\in T_1}\bigl(x_i^{(1)}\oplus Z_i[x_i^{(1)},y_i^{(1)}]\bigr) \in \sum_{i\in T_1}\mathrm{Supp}(D_i). \]

Therefore, by Lemma 31, we have

\[ \Bigl|\sum_{i\in T_1}R'_i\Bigr| \le \Bigl|\sum_{i\in T_1}\mathrm{Supp}(D_i)\Bigr| < 2^{m/2}. \]

For $i \in T_2$, we have

\[ G^{(i)}_{j-1}(x_i)\oplus G^{(i)}_{j-1}(H'_{i,j}[x_i,y_i]) = G^{(i)}_{j-1}(x_i)\oplus G^{(i)}_{j-1}(H_{i,j}[x_i,y_i]), \]

since $H'_{i,j} = H_{i,j}$ for $i \in T_2$. Because $\deg(H_{i,j}) \ge (|V(H_{i,j})|+1)/2$ and $x_i$ is a vertex label of $H_{i,j}$, we get

\[ |R'_i| \le |V(H_{i,j})|\cdot\deg(H_{i,j}) \le \bigl(2\deg(H_{i,j})-1\bigr)\deg(H_{i,j}) \le 2\deg(H_{i,j})^2. \]

We know that $\log(\deg(H_{1,j}))+\cdots+\log(\deg(H_{t,j})) < \tfrac{1}{20000}\log n < 0.001\log n$. By Proposition 17, any undirected regular graph of degree 1 has expansion parameter 1, so $\deg(H_{i,j}) \ge 2$ for every $i \in T_2$, which means $|T_2| < 0.001\log n$. Therefore,

\[ \prod_{i\in T_2}|R'_i| \le 2^{|T_2|}\prod_{i\in T_2}\deg(H_{i,j})^2 \le 2^{0.001\log n}\cdot n^{0.01} \le n^{0.02} \le 2^{0.2m}. \]

Combining the bounds above, we get

\[ |R'_1+\cdots+R'_t| \le \Bigl|\sum_{i\in T_1}R'_i\Bigr|\cdot\prod_{i\in T_2}|R'_i| \le 2^{m/2+0.2m} < 2^m. \]

Therefore, there is at least one $v \in \{0,1\}^m$ such that $v \notin R'_1+\cdots+R'_t$.

We encode this missing element into a width-3 branching program $B$ computing the function $f \colon \{0,1\}^n \to \{0,1\}$ given by

\[ f(x) = \bigvee_{i=0}^{n/2^j-1}\Bigl[\bigl(x_{2^j i+1},\ldots,x_{2^j i+m}\bigr) = 0^m \;\wedge\; \bigl(x_{2^j i+q+1},\ldots,x_{2^j i+q+m}\bigr) = v\Bigr]. \]

By the choice of $v$, within any block whose first $m$ bits are $0^m$, the bits in positions $q+1,\ldots,q+m$ lie in $R'_1+\cdots+R'_t$ and hence cannot equal $v$; therefore $B(\mathrm{INW}^t_{\mathcal{H}'}(x)) = 0$ for every seed $x$. On the other hand, a truly random input satisfies each clause with probability $2^{-2m} \ge 1/\sqrt{n}$, and since there are $n/2^j \ge n^{3/4}$ clauses, the acceptance probability is

\[ \Pr[B(U_{\{0,1\}^n}) = 1] = 1 - \Pr[B(U_{\{0,1\}^n}) = 0] \ge 1 - (1 - 1/\sqrt{n})^{n^{3/4}} \ge 1 - \exp(-n^{1/4}). \]
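For concreteness, here is a small Python sketch (not part of the paper) of the test computed by $B$; a width-3 program performs the same scan by remembering only whether it has already accepted and whether the current block is still consistent with the pattern.

```python
# Sketch: scan the input in blocks of length 2^j = 2q and accept iff some block
# starts with 0^m and carries the excluded pattern v in positions q+1, ..., q+m.
def distinguisher(bits, q, m, v):
    assert len(v) == m and len(bits) % (2 * q) == 0
    for start in range(0, len(bits), 2 * q):
        block = bits[start:start + 2 * q]
        if all(b == 0 for b in block[:m]) and list(block[q:q + m]) == list(v):
            return 1                     # accept: forbidden substring (0^m, v) seen
    return 0                             # reject
```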

We can then prove Theorem 8.

Theorem 34 (Restatement of Theorem 8).

Let $n$ be a power of two, let $t \in \mathbb{N}$, and let $\Lambda \in [0,1]^{t\times\log n}$. If every PRG in the family $\mathrm{INW}_t(\Lambda)$ fools width-3 standard-order ROBPs with error 0.99, then every PRG in the family $\mathrm{INW}_t(\Lambda)$ has seed length $\Omega(\log^2 n)$.

Proof.

Lemma 33 implies (in the contrapositive) that for every PRG $\mathrm{INW}^t_{\mathcal{H}}$ in the family $\mathrm{INW}_t(\Lambda)$ and every $j \in [\log\log n,\,\tfrac{1}{4}\log n]$,

\[ \log(\deg(H_{1,j}))+\cdots+\log(\deg(H_{t,j})) \ge \tfrac{1}{20000}\log n. \]

Consequently, the overall seed length is at least

\[ \sum_{j\in[\log\log n,\,\frac{1}{4}\log n]}\bigl(\log(\deg(H_{1,j}))+\cdots+\log(\deg(H_{t,j}))\bigr) \ge \sum_{j\in[\log\log n,\,\frac{1}{4}\log n]}\tfrac{1}{20000}\log n = \Omega(\log^2 n). \]

5 Directions for future research

It would be very interesting to prove a general “XOR lemma” saying that taking the bitwise XOR of many copies of a distribution amplifies its unpredictability for bounded-width ROBPs. Proving such a general lemma might be the key to analyzing derandomized sums of INW generators.

There is a well-known variant of the INW generator in which we use an extractor to recycle the seed at each stage instead of using an expander, i.e., the recursive step is $G_{j+1}(x,y) = (G_j(x),\, G_j(\mathsf{Ext}(x,y)))$. This construction and its analysis are similar to the Nisan–Zuckerman PRG [39]. Intriguingly, it is not clear how to carry out the proof of Theorem 7 using extractors instead of expanders. Conceivably, an extractor-based proof might be more amenable to derandomization.

We would also like to highlight the problem of determining the optimal dependence on $w$ in Theorem 7. Can the $n^{\log(w+1)}$ term be improved to $\operatorname{poly}(n,w)$?

References

  • [1] Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost k-wise independent random variables. Random Structures Algorithms, 3(3):289–304, 1992. doi:10.1002/rsa.3240030308.
  • [2] Roy Armoni. On the derandomization of space-bounded computations. In Proceedings of the 2nd Workshop on Randomization and Approximation Techniques in Computer Science (RANDOM), pages 47–59, 1998. doi:10.1007/3-540-49543-6_5.
  • [3] Sepehr Assadi and Vishvajeet N. Graph streaming lower bounds for parameter estimation and property testing via a streaming XOR lemma. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 612–625, 2021. doi:10.1145/3406325.3451110.
  • [4] Andrej Bogdanov, Zeev Dvir, Elad Verbin, and Amir Yehudayoff. Pseudorandomness for width-2 branching programs. Theory Comput., 9:283–292, 2013. doi:10.4086/toc.2013.v009a007.
  • [5] Andrej Bogdanov, William M. Hoza, Gautam Prakriya, and Edward Pyne. Hitting Sets for Regular Branching Programs. In Proceedings of the 37th Computational Complexity Conference (CCC), pages 3:1–3:22, 2022. doi:10.4230/LIPIcs.CCC.2022.3.
  • [6] Andrej Bogdanov and Emanuele Viola. Pseudorandom bits for polynomials. SIAM J. Comput., 39(6):2464–2486, April 2010. doi:10.1137/070712109.
  • [7] Mark Braverman, Gil Cohen, and Sumegha Garg. Pseudorandom pseudo-distributions with near-optimal error for read-once branching programs. SIAM J. Comput., 49(5):STOC18–242–STOC18–299, 2020. doi:10.1137/18M1197734.
  • [8] Mark Braverman, Anup Rao, Ran Raz, and Amir Yehudayoff. Pseudorandom generators for regular branching programs. SIAM J. Comput., 43(3):973–986, 2014. doi:10.1137/120875673.
  • [9] Joshua Brody and Elad Verbin. The coin problem and pseudorandomness for branching programs. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 30–39, 2010. doi:10.1109/FOCS.2010.10.
  • [10] Eshan Chattopadhyay and Jyun-Jie Liao. Recursive Error Reduction for Regular Branching Programs. In 15th Innovations in Theoretical Computer Science Conference (ITCS), pages 29:1–29:20, 2024. doi:10.4230/LIPIcs.ITCS.2024.29.
  • [11] Lijie Chen, William M. Hoza, Xin Lyu, Avishay Tal, and Hongxun Wu. Weighted pseudorandom generators via inverse analysis of random walks and shortcutting. In Proceedings of the 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 1224–1239, 2023. doi:10.1109/FOCS57990.2023.00072.
  • [12] Lijie Chen and Xin Lyu. Inverse-exponential correlation bounds and extremely rigid matrices from a new derandomized XOR lemma. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 761–771, 2021. doi:10.1145/3406325.3451132.
  • [13] Kuan Cheng and Ruiyang Wu. Weighted pseudorandom generators for read-once branching programs via weighted pseudorandom reductions, 2025. doi:10.48550/arXiv.2502.08272.
  • [14] Gil Cohen, Dean Doron, Oren Renard, Ori Sberlo, and Amnon Ta-Shma. Error Reduction for Weighted PRGs Against Read Once Branching Programs. In Proceedings of the 36th Computational Complexity Conference (CCC), pages 22:1–22:17, 2021. doi:10.4230/LIPIcs.CCC.2021.22.
  • [15] Anindya De. Pseudorandomness for permutation and regular branching programs. In Proceedings of the 26th Annual IEEE Conference on Computational Complexity (CCC), pages 221–231, 2011. doi:10.1109/CCC.2011.23.
  • [16] Bill Fefferman, Ronen Shaltiel, Christopher Umans, and Emanuele Viola. On beating the hybrid argument. Theory Comput., 9:809–843, 2013. doi:10.4086/toc.2013.v009a026.
  • [17] Michael A. Forbes and Zander Kelley. Pseudorandom generators for read-once branching programs, in any order. In Proceedings of the 59th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 946–955, 2018. doi:10.1109/FOCS.2018.00093.
  • [18] Anat Ganor and Ran Raz. Space pseudorandom generators by communication complexity lower bounds. In Proceedings of the 18th International Workshop on Randomization and Computation (RANDOM), pages 692–703, 2014. doi:10.4230/LIPIcs.APPROX-RANDOM.2014.692.
  • [19] Pooya Hatami and William Hoza. Paradigms for unconditional pseudorandom generators. Found. Trends Theor. Comput. Sci., 16(1-2):1–210, 2024. doi:10.1561/0400000109.
  • [20] William Hoza and Zelin Lv. On sums of INW pseudorandom generators. ECCC preprint TR25-050, 2025. URL: https://eccc.weizmann.ac.il/report/2025/050/.
  • [21] William M. Hoza. Better Pseudodistributions and Derandomization for Space-Bounded Computation. In Proceedings of the 25th International Conference on Randomization and Computation (RANDOM), pages 28:1–28:23, 2021. doi:10.4230/LIPIcs.APPROX/RANDOM.2021.28.
  • [22] William M. Hoza, Edward Pyne, and Salil Vadhan. Pseudorandom Generators for Unbounded-Width Permutation Branching Programs. In 12th Innovations in Theoretical Computer Science Conference (ITCS), pages 7:1–7:20, 2021. doi:10.4230/LIPIcs.ITCS.2021.7.
  • [23] William M. Hoza, Edward Pyne, and Salil Vadhan. Limitations of the Impagliazzo–Nisan–Wigderson pseudorandom generator against permutation branching programs. Algorithmica, 2024. URL: https://link.springer.com/article/10.1007/s00453-024-01251-2.
  • [24] William M. Hoza and David Zuckerman. Simple optimal hitting sets for small-success RL. SIAM J. Comput., 49(4):811–820, 2020. doi:10.1137/19M1268707.
  • [25] Russell Impagliazzo. Hard-core distributions for somewhat hard problems. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science (FOCS), pages 538–545, 1995. doi:10.1109/SFCS.1995.492584.
  • [26] Russell Impagliazzo, Noam Nisan, and Avi Wigderson. Pseudorandomness for network algorithms. In Proceedings of the 26th Annual Symposium on Theory of Computing (STOC), pages 356–364, 1994. doi:10.1145/195058.195190.
  • [27] Russell Impagliazzo and Avi Wigderson. P = BPP if E requires exponential circuits: derandomizing the XOR lemma. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing (STOC), pages 220–229, 1997. doi:10.1145/258533.258590.
  • [28] Michal Koucký, Prajakta Nimbhorkar, and Pavel Pudlák. Pseudorandom generators for group products: extended abstract. In Proceedings of the 43rd Annual ACM Symposium on Theory of Computing (STOC), pages 263–272, 2011. doi:10.1145/1993636.1993672.
  • [29] Vinayak M. Kumar. New Pseudorandom Generators and Correlation Bounds Using Extractors. In Proceedings of the 16th Innovations in Theoretical Computer Science Conference (ITCS), volume 325, pages 68:1–68:23, 2025. doi:10.4230/LIPIcs.ITCS.2025.68.
  • [30] Chin Ho Lee, Edward Pyne, and Salil Vadhan. On the Power of Regular and Permutation Branching Programs. In Proceedings of the 27th International Conference on Randomization and Computation (RANDOM), pages 44:1–44:22, 2023. doi:10.4230/LIPIcs.APPROX/RANDOM.2023.44.
  • [31] Chin Ho Lee and Emanuele Viola. Some limitations of the sum of small-bias distributions. Theory of Computing, 13(16):1–23, 2017. doi:10.4086/toc.2017.v013a016.
  • [32] Shachar Lovett. Unconditional pseudorandom generators for low-degree polynomials. Theory of Computing, 5(3):69–82, 2009. doi:10.4086/toc.2009.v005a003.
  • [33] Shachar Lovett and Yoav Tzur. Explicit lower bound for fooling polynomials by the sum of small-bias generators. ECCC preprint TR09-088, 2009. URL: https://eccc.weizmann.ac.il/report/2009/088/.
  • [34] Ueli Maurer and Stefano Tessaro. Computational indistinguishability amplification: tight product theorems for system composition. In Proceedings of the 29th International Cryptology Conference (CRYPTO), pages 355–373, 2009. doi:10.1007/978-3-642-03356-8_21.
  • [35] Raghu Meka, Omer Reingold, and Avishay Tal. Pseudorandom generators for width-3 branching programs. In Proceedings of the 51st Annual Symposium on Theory of Computing (STOC), pages 626–637, 2019. doi:10.1145/3313276.3316319.
  • [36] Raghu Meka and David Zuckerman. Small-bias spaces for group products. In Proceedings of the 13th International Workshop on Randomization and Computation (RANDOM), pages 658–672, 2009. doi:10.1007/978-3-642-03685-9_49.
  • [37] Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM J. Comput., 22(4):838–856, 1993. doi:10.1137/0222053.
  • [38] Noam Nisan. Pseudorandom generators for space-bounded computation. Combinatorica, 12(4):449–461, 1992. doi:10.1007/BF01305237.
  • [39] Noam Nisan and David Zuckerman. Randomness is linear in space. J. Comput. System Sci., 52(1):43–52, 1996. doi:10.1006/jcss.1996.0004.
  • [40] Edward Pyne and Salil Vadhan. Pseudodistributions That Beat All Pseudorandom Generators (Extended Abstract). In Proceedings of the 36th Computational Complexity Conference (CCC), pages 33:1–33:15, 2021. doi:10.4230/LIPIcs.CCC.2021.33.
  • [41] Ran Raz and Omer Reingold. On recycling the randomness of states in space bounded computation. In Proceedings of the 31st Annual ACM Symposium on Theory of Computing (STOC), pages 159–168, 1999. doi:10.1145/301250.301294.
  • [42] Omer Reingold. Undirected connectivity in log-space. J. ACM, 55(4):Art. 17, 24, 2008. doi:10.1145/1391289.1391291.
  • [43] Eyal Rozenman and Salil Vadhan. Derandomized squaring of graphs. In Proceedings of the 9th International Workshop on Randomization and Computation (RANDOM), pages 436–447, 2005. doi:10.1007/11538462_37.
  • [44] Michael Saks and David Zuckerman, 1995. Unpublished.
  • [45] Thomas Steinke. Pseudorandomness for permutation branching programs without the group theory. ECCC preprint TR12-083, 2012. URL: https://eccc.weizmann.ac.il/report/2012/083/.
  • [46] Yoav Tzur. Notions of weak pseudorandomness and GF(2^n)-polynomials. M.Sc. thesis, Weizmann Institute of Science, 2009. URL: https://eccc.weizmann.ac.il/static/books/Notions_of_Weak_Pseudorandomness/.
  • [47] Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1–3):1–336, 2012. doi:10.1561/0400000010.
  • [48] Emanuele Viola. The sum of d small-bias generators fools polynomials of degree d. Comput. Complexity, 18(2):209–217, 2009. doi:10.1007/s00037-009-0273-5.