Generalised Linial–Nisan Conjecture Is False for DNFs
Abstract
Aaronson (STOC 2010) conjectured that almost -wise independence fools constant-depth circuits; he called this the generalised Linial–Nisan conjecture. Aaronson himself later found a counterexample for depth-3 circuits. We give here an improved counterexample for depth-2 circuits (DNFs). This shows, for instance, that Bazzi’s celebrated result (-wise independence fools DNFs) cannot be generalised in a natural way. We also propose a way to circumvent our counterexample: We define a new notion of pseudorandomness called local couplings and show that it fools DNFs and even decision lists.
Keywords and phrases:
pseudorandomness, DNFs, bounded independenceFunding:
Yaroslav Alekseev: Supported by ISF grant 507/24.Copyright and License:
![[Uncaptioned image]](x1.png) © Yaroslav Alekseev, Mika Göös, Ziyi Guan, Gilbert Maystre,
 © Yaroslav Alekseev, Mika Göös, Ziyi Guan, Gilbert Maystre,
Artur Riazanov, Dmitry Sokolov, and Weiqiang Yuan; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Circuit complexity ; Theory of computation Pseudorandomness and derandomizationAcknowledgements:
We thank Shalev Ben-David and Avishay Tal for helpful email communication.Funding:
This project was supported by the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract number MB22.00026.Editors:
Srikanth SrinivasanSeries and Publisher:
 Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
 Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Linial and Nisan [16] conjectured that “-wise independent” distributions fool constant-depth circuits (class ). More specifically, a distribution over is called -independent if the marginal distribution on every -sized subset of bits is uniform. We say that -fools a circuit if the circuit cannot distinguish from the uniform distribution on :
The Linial–Nisan conjecture was first proved for depth-2 circuits (DNFs and CNFs) by Bazzi [4] (with a simplification by Razborov [21]) and then for every -circuit by Braverman [7]. Indeed, Braverman showed that every size- -circuit is -fooled by -independence.
Aaronson [1] asked whether the Linial–Nisan conjecture could be strengthened to hold also for “almost -wise independence,” a seemingly modest generalisation. We say that a distribution over is -independent if for every subset , , the marginal distribution on the bits in is multiplicatively close to uniform in the sense that for every ,
Generalised Linial–Nisan Conjecture (GLN).
Let be a -independent distribution over . Then -fools every -circuit of size .
Aaronson’s original motivation for this conjecture was to resolve a problem in quantum complexity theory. He showed that a positive resolution of GLN would imply the separation relative to an oracle. (This separation was subsequently proved by Raz and Tal [20] by a different approach.) Later, Aaronson himself found a counterexample to GLN for depth-3 circuits [2], but he still re-posed the conjecture (and thought it “plausible”) for depth-2 circuits. Our main result here is to refute the GLN conjecture in this remaining case.
Theorem 1 (Main result).
There exists a -independent distribution over and a -width DNF formula such that
Let us make two notes about the parameters here. First, our formula will have quasi-polynomial size, whereas Aaronson’s depth-3 counterexample has only polynomial size; hence his example achieves slightly better parameters (at the cost of larger depth). Second, our construction can be varied to produce the following tradeoff: by increasing the DNF width to any , we can make the distribution -independent (see Section 2.5).
1.1 Implications and related work
One consequence of the failure of the GLN conjecture is to the construction of pseudorandom generators (PRGs) for DNFs. It is known that for there exist -independent distributions with support size [18, 3], which is smaller than that is required for truly -wise independent distributions [8]. Thus Theorem 1 rules out a natural approach (“output an almost -wise independent distribution”) to improving the seed length of PRGs. For the current state-of-the-art PRGs for DNFs, see [9, 22, 17]; see also the survey [13].
Another lesson from Theorem 1 is to the further development of circuit lower bound methods. We find it important to seek alternative proofs of central theorems such as Bazzi’s [4, 21] and its extensions [7]. The existing proofs use the polynomial method to approximate a DNF with a low-degree polynomial. Is there a more “combinatorial” proof of Bazzi’s theorem? One such more combinatorial approach is the top-down lower bound method [12, 11], which often uses entropy-based arguments to analyse circuits. We interpret the failure of GLN as a challenge to such top-down methods. While the method is in a formal sense complete (it can prove any lower bound that is true), the typical entropy counting arguments have a hard time distinguishing almost -wise independent distributions from truly -wise independent ones, suggesting that any top-down proof of Bazzi’s theorem would require substantially new ideas.
1.2 Workaround: Local couplings
To complement our main result, we also propose a way to circumvent the failure of GLN by proposing a new notion of pseudorandomness called local couplings. This notion is useful for fooling depth-2 circuit models, but not depth-3 models; in particular, we show the following claims:
- 
1. 
Local couplings fool DNFs (query complexity analogue of NP). 
- 
2. 
Local couplings fool decision lists (query complexity analogue of ). 
- 
3. 
Local couplings do not fool depth-3 circuits (query complexity analogue of ). 
Definition 2 (Local couplings).
A pair of jointly distributed random variables is an -semi-coupling if for every and ,
We say that is an -coupling if both and are -semi-couplings.
The notion of a local coupling was somewhat implicit in Aaronson’s analysis [2] of his depth-3 counterexample. Local couplings are also a stronger variant of a notion proposed by Zhandry [23] that he called “substitution distance.”
Claims 1–2
A width- decision list is a sequence of pairs where are -terms (conjunctions of at most literals) and are output values. A decision list defines as follows: where . The decision list width of a function is polynomially equivalent to the number of queries necessary to compute the function [10, Appendix A]. In other words it is indeed a query complexity analogue of . The following theorem (Section 3.1) formalises 1–2 when is uniformly distributed.
Theorem 3.
Let be computed by a width- decision list. For any -coupling ,
Claim 3
Aaronson’s [2] original counterexample involved a distribution related to a certain surjectivity function, which can be computed by a small depth-3 circuit. We observe (Section 3.2) that Aaronson’s distribution can indeed be locally coupled with the uniform distribution, which implies that local couplings do not fool depth-3 circuits (Claim 3). We can furthermore conclude (using Theorem 3) that fools decision lists – this claim was already made earlier by Aaronson [2, Theorem 3], but his proof contained a mistake,111The mistake is acknowledged on the author’s homepage. An implication of this result would have been to show the separation relative to a random oracle. That result however follows (using a function different from surjectivity) from the more recent result that PH is infinite in the random oracle model [15]. which we can now fix with the notion of local couplings. Finally, we also show (Section 3.3) that an -semi-coupling is not enough by itself to fool DNFs – one truly needs the two-sided condition of an -coupling.
2 Counterexample
In this section, we prove Theorem 1 by constructing a DNF formula and an associated almost -independent distribution that can distinguish from uniform. We first (Section 2.1) construct a weak example that distinguishes from uniform with advantage . Then (Section 2.4) we amplify this advantage to by using a standard majority trick.
2.1 Construction
Our starting point is the address function defined as . Here we write to mean where is the integer corresponding naturally to the bitstring . Let us first observe that Addr together with the uniform distribution over “almost works” as the counterexample in Theorem 1. For the distribution of is already -independent. The reason the whole does not have the same property is that for example, fixing all bits of to some forces , so for some containing all bits describing and the bit the settings of with have probability zero.
To avoid this issue we hide the bits of the address using the usual tribes function:
The input here is an boolean matrix and the function returns iff the matrix contains an all- column. It is well-known [19, §4.2] that if we choose , the function becomes balanced, meaning that .
A natural attempt to define a counterexample would be to consider the distinguishing function . This does not work since this function requires polynomial width as the negation of Tribes reduces to it. We fix this by replacing the Addr function by its monotone version: is defined as
We are now ready to define our function by (see also Figure 1)
| (1) | 
Note that input size of is . The following constructs a narrow DNF for .
Claim 4.
There exists a of width that computes .
Proof.
A is commonly viewed as a collection of -certificates: is computable by a - iff for each point there exists a certificate comprised of a subset of input bits of size and such that implies . Hence it is enough to provide a certificate of width for each -input of . Consider a -input and let . If , a -certificate is simply a set of matrices of size together with a -column in each of those matrices. Such certificates fix variables. A similar idea can certify -inputs with , at the cost of adding the corresponding bit .
We now define as a distribution of the random variable defined below.
Definition 5.
Let over be sampled as follows:
- 
1. 
Sample uniformly and independently for each . 
- 
2. 
Sample uniformly and independently. 
- 
3. 
Let ; If , fix . 
We show that is -independent, yet distinguishes from the uniform distribution, which together with Claim 4 implies the following weaker version of Theorem 1:
Lemma 6.
The distribution as in Definition 5 is -independent, but there is an - such that
We reduce the proof of Lemma 6 to the following two lemmas:
Lemma 7.
where is as in Definition 5.
Lemma 8.
The distribution as in Definition 5 is -independent for , .
Proof of Lemma 6.
2.2 Proof of Lemma 7
Let and . Since the matrices are uniformly generated, it is possible to couple and by defining where . Note that the address part of each input coincides and in particular, they share the event .
Observe that by the definition of : if then Step (3) is not reached in the definition of and is uniform. On the other hand we have . Indeed, if holds, we have by the definition of and .
Thus, and so it remains to bound . For that, we need the following simple fact:
Lemma 9.
Let distributed over according to a product distribution such that . Then , where .
Proof.
Let us couple with as follows: suppose . We then set with probability and with probability . Then , so is indeed uniformly distributed. Then , so .
Note that each bit is close to being balanced:
As all are independent, we can use Lemma 9 to get sharp bounds on their sum being exactly : .
2.3 Proof of Lemma 8
We need to show that for every of size and for every we have . We now classify the bits of and . Let for be the set of bits of in . Let be the set of bit indices of that belong to (we identify the indices with their bit representations). Let and be the corresponding parts of .
Since are uniformly distributed it suffices to show that
Let Intuitively the only non-uniformity in is introduced when as this is the only case where is changed from uniform. We make this intuition precise in the following claim.
Claim 10.
For any event that is a function of we have
Proof.
Let for . By the total probability law we get
| (2) | ||||
| (3) | ||||
| (4) | 
In (2) and (3) we use that given the event is independent from . Since (4) is minimized when and maximized when , we have the claim.
Now let be the event “”. Let us compute for . Since we have , wlog let . Since the bits of denoted by are independent and is a conjunction of independent events we have
Let us fix and bound . By definition , so it equals iff no column of is all-, in particular all columns that do not contain bits of must not be all-. For each of these columns the probability that it is not all- is . Since there are at least such columns we get
Thus, , so we conclude the proof by Claim 10.
2.4 Amplification
In this section we reduce Theorem 1 to Lemma 6. The construction is a simple variation of the majority vote of several instances of . We prove that our construction indeed amplifies the distinguishing probability in the following lemma.
Lemma 11.
Suppose is distributed over and there exists a function such that
for some depending on . Let . Then for we have
where are independent samples of and .
Proof.
Let . Since we have by Hoeffding inequality,
Similarly, we can conclude that hence,
With , we conclude the proof. We now need to show that a narrow can check whether . In fact, this is true for any monotone function composed with a narrow :
Lemma 12.
Let be a function that can be computed by a - . Let be a monotone function. Then can be computed by a -.
Proof.
Since can be computed by a -, a -certificate of is a satisfying assignment for one term of , which has size at most . Since is monotone we can certify that by giving a -certificate that for every where that is the case. Such certificate has size at most , which implies that can be computed by a -.
Finally, we need to show that independent copies of an -independent distribution comprise an -independent distribution:
Lemma 13.
If is -independent, then the product distribution is -independent.
Proof.
Suppose . Let be independent copies of . Fix and . For every , we define and to be the positions of and respectively in . Then,
Since is -independent, for every , . Hence, for small enough :
Proof of Theorem 1.
Let be a natural number to be fixed later. Let be the -independent distribution in Lemma 6. Let be the - such that
From Lemma 13, for every , is -independent. By Lemma 11 for , when ,
Moreover, can be computed by a - from Lemma 12. Choosing and we get that there exists a -independent distribution over that can be -distinguished from the uniform by a -, which implies the claim.
2.5 Variation: Tradeoff between width and error
We finally sketch an extension of our construction that gives a tradeoff between DNF width and .
Theorem 14.
For any there exists a function computable by a - and an -independent distribution over such that
Proof sketch.
We define a “monotone xor” of the functions Addr as follows: where if , if the value of is iff . The distinguisher is then defined by hiding the bits of in Tribes instances:
We sample from the distribution in two steps:
- 
1. 
Sample uniformly at random. 
- 
2. 
If for it happens that and , we flip a random bit among . 
The -distinguishability of from the uniform distribution by is shown analogously to Lemma 7. Then according to Section 2.4 we increase the width of the by the factor to get a -distinguisher. The result then follows by choosing the appropriate constants in and big-O.
3 Local couplings
3.1 Couplings fool decision lists: Proof of Theorem 3
Let be the -terms in the decision list defining . It is sufficient to show that for we have . We show that and are both high and conclude the statement from that. Let us show using that is an -semi-coupling. Denoting the set of input bits mentioned in the term we write
In order to conclude that it suffices to show that . This follows from the total probability law:
Now the same argument shows that since is an -semi-coupling we have . We conclude Theorem 3 by the union bound.
3.2 Surjectivity fools decision lists
Aaronson [2] refuted the GLN conjecture by considering the following distribution:
Definition 15.
For every , let . Define (or simply when is clear from the context) as the distribution of generated as follows:
- 
1. 
Sample . 
- 
2. 
Sample . 
- 
3. 
For each , let if , otherwise is sampled uniformly from . 
Aaronson proved the following.
Theorem 16 ([2]).
For every , is -wise independent for all . Moreover, there is a depth-3 circuit of size such that
We prove that Aaronson’s counterexample, however, cannot refute GLN conjecture for more restricted models, even decision lists.
Lemma 17.
For every and decision list of width ,
Proof.
Let be as in Definition 15. Note that . By Theorem 3, it suffices to show is -coupled with .
By definition, we need to show and are -semi-couplings. The former directly follows from Definition 15: for every and ,
Regarding the latter, fix any , . For each we have
| (5) | ||||
Crucially (5) holds since given random variables are independent from each other. We conclude by the total probability law:
3.3 Semi-couplings do not fool DNFs
In this section we give an example of a semi-coupling where such that can be distinguished from by a polylogarithmic-width DNF. First, observe that we can interpret the definition of in Definition 5 as a coupling with the uniform distribution: we sample uniformly and then modify in the location . With being the state of before the change, that defines some coupling between and the uniformly distributed . This, however, is not a semi-coupling, since if we fix to some value such that and fix such that , then with probability .
We modify the distribution from Definition 5 by replacing each bit of with an instance of Tribes.
Lemma 18.
There exists a -semi-coupling with and an - that -distinguishes from .
Proof.
Consider the smallest such that . We define the coupling as follows:
- 
1. 
Sample uniformly. 
- 
2. 
Sample uniformly. 
- 
3. 
Take . 
- 
4. 
Define by for each . 
- 
5. 
If , choose and force for each . 
Local coupling.
We claim that is -semi-coupled with . Fix some and . Then for bits of that correspond to the coupling condition is trivially satisfied as these bits are shared with . The remaining bits are indexed by , , , we need to bound the probability:
If or , then this probability is since (5) is not invoked and . If and we have
Distinguishability.
We take the distinguishing function from Lemma 7 and define the new distinguisher as
Let be the event “”. As in Lemma 7 we observe that , so . By the construction of and we have . On the other hand
| (by Lemma 9) | |||
| (analogous to Lemma 7) | 
Formally, to show the last inequality, we will do the following:
Then as shown in Lemma 7 . All together this gives us that -distinguishes and .
It remains to observe that the -certificate complexity of is at most : to the certificate of in Claim 4 we add the certificate that where . Thus there exists a of width that computes .
In order to get the -distinguishability we follow the amplification in Section 2.4:
Theorem 19.
There exists a -semi-coupling where and a -width that -distinguishes from .
Proof.
The proof is identical to the one of Theorem 1. Take over that is -semi-coupled with , then the random variable comprised of independent copies of , is -semi-coupled with independent copies of , . On the other hand by Lemma 12 and Lemma 11 there exists an - that -distinguishes and . Since we get the claim.
References
- [1] Scott Aaronson. BQP and the Polynomial Hierarchy. In 42nd ACM Symposium on Theory of Computing, STOC, pages 141–150. ACM, 2010. doi:10.1145/1806689.1806711.
- [2] Scott Aaronson. A Counterexample to the Generalized Linial-Nisan Conjecture. CoRR, abs/1110.6126, 2011. arXiv:1110.6126.
- [3] Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple Constructions of Almost -wise Independent Random Variables. Random Structures and Algorithms, 3(3):289–304, 1992. doi:10.1002/rsa.3240030308.
- [4] Louay Bazzi. Polylogarithmic independence can fool DNF formulas. SIAM Journal on Computing, 38(6):2220–2272, 2009. doi:10.1137/070691954.
- [5] Andrej Bogdanov, Krishnamoorthy Dinesh, Yuval Filmus, Yuval Ishai, Avi Kaplan, and Akshayaram Srinivasan. Bounded Indistinguishability for Simple Sources. In 13th Innovations in Theoretical Computer Science Conference, ITCS, volume 215 of LIPIcs, pages 26:1–26:18. Schloss Dagstuhl, 2022. doi:10.4230/LIPIcs.ITCS.2022.26.
- [6] Andrej Bogdanov, Yuval Ishai, Emanuele Viola, and Christopher Williamson. Bounded Indistinguishability and the Complexity of Recovering Secrets. In 36th Advances in Cryptology, CRYPTO, pages 593–618. Springer, 2016. doi:10.1007/978-3-662-53015-3_21.
- [7] Mark Braverman. Poly-logarithmic Independence Fools Bounded-Depth Boolean Circuits. Communications of the ACM, 54(4):108–115, 2011. doi:10.1145/1924421.1924446.
- [8] Benny Chor, Oded Goldreich, Johan Hasted, Joel Freidmann, Steven Rudich, and Roman Smolensky. The Bit Extraction Problem or -resilient Functions. In 26th Annual Symposium on Foundations of Computer Science, FOCS. IEEE, 1985. doi:10.1109/sfcs.1985.55.
- [9] Anindya De, Omid Etesami, Luca Trevisan, and Madhur Tulsiani. Improved Pseudorandom Generators for Depth Circuits. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 504–517. Springer Berlin Heidelberg, 2010. doi:10.1007/978-3-642-15369-3_38.
- [10] Mika Göös, Pritish Kamath, Toniann Pitassi, and Thomas Watson. Query-to-Communication Lifting for . Computational Complexity, 28(1):113–144, 2019.
- [11] Mika Göös, Artur Riazanov, Anastasia Sofronova, and Dmitry Sokolov. Top-Down Lower Bounds for Depth-Four Circuits. In 64th Annual Symposium on Foundations of Computer Science, FOCS, pages 1048–1055. IEEE, 2023. doi:10.1109/FOCS57990.2023.00063.
- [12] Johan Håstad, Stasys Jukna, and Pavel Pudlák. Top-Down Lower Bounds for Depth Circuits. In 34th Annual Symposium on Foundations of Computer Science, FOCS, pages 124–129. IEEE Computer Society, 1993. doi:10.1109/SFCS.1993.366875.
- [13] Pooya Hatami and William Hoza. Paradigms for Unconditional Pseudorandom Generators. Foundations and Trends in Theoretical Computer Science, 16(1–2):1–210, 2024. doi:10.1561/0400000109.
- [14] William Hoza. Fooling Near-Maximal Decision Trees. Technical report, ECCC, 2025. URL: https://eccc.weizmann.ac.il/report/2025/003/.
- [15] Johan Håstad, Benjamin Rossman, Rocco Servedio, and Li-Yang Tan. An Average-Case Depth Hierarchy Theorem for Boolean Circuits. Journal of the ACM, 64(5), 2017. doi:10.1145/3095799.
- [16] Nathan Linial and Noam Nisan. Approximate inclusion-exclusion. Combinatorica, 10(4):349–365, 1990. doi:10.1007/BF02128670.
- [17] Xin Lyu. Improved Pseudorandom Generators for AC0 Circuits. In 37th Computational Complexity Conference, CCC, volume 234 of LIPIcs, pages 34:1–34:25. Schloss Dagstuhl, 2022. doi:10.4230/LIPIcs.CCC.2022.34.
- [18] Joseph Naor and Moni Naor. Small-Bias Probability Spaces: Efficient Constructions and Applications. SIAM Journal on Computing, 22(4):838–856, 1993. doi:10.1137/0222053.
- [19] Ryan O’Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014.
- [20] Ran Raz and Avishay Tal. Oracle separation of BQP and PH. In 51st ACM Symposium on Theory of Computing, STOC, pages 13–23. ACM, 2019. doi:10.1145/3313276.3316315.
- [21] Alexander Razborov. A Simple Proof of Bazzi’s Theorem. ACM Transactions Computational Theory, 1(1):3:1–3:5, 2009. doi:10.1145/1490270.1490273.
- [22] Avishay Tal. Tight Bounds on the Fourier Spectrum of AC0. In 32nd Computational Complexity Conference, CCC, pages 15:1–15:31. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPICS.CCC.2017.15.
- [23] Mark Zhandry. Toward Separating QMA from QCMA with a Classical Oracle. In 16th Innovations in Theoretical Computer Science Conference, ITCS, volume 325 of LIPIcs, pages 95:1–95:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025. doi:10.4230/LIPICS.ITCS.2025.95.
