Quantum Advantage from Sampling Shallow Circuits: Beyond Hardness of Marginals
Abstract
We construct a family of distributions with over and a family of depth- quantum circuits such that is produced exactly by with the all zeros state as input, yet any constant-depth classical circuit with bounded fan-in gates evaluated on any binary product distribution has total variation distance from . Moreover, the quantum circuits we construct are geometrically local and use a relatively standard gate set: Hadamard, controlled-phase, CNOT, and Toffoli gates. All previous separations of this type suffer from some undesirable constraint on the classical circuit model or the quantum circuits witnessing the separation.
Our family of distributions is inspired by the Parity Halving Problem of Watts, Kothari, Schaeffer, and Tal (STOC, 2019), which built on the work of Bravyi, Gosset, and Kรถnig (Science, 2018) to separate shallow quantum and classical circuits for relational problems.
Keywords and phrases:
Shallow circuits, sampling, quantum circuitsFunding:
Daniel M. Kane: Supported in part by NSF Medium Award CCF-2107079.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Circuit complexity ; Theory of computation Quantum complexity theory ; Theory of computation Complexity classesAcknowledgements:
AO thanks Farzan Byramji for suggestions improving the presentation. KW thanks David Gosset for helpful discussions motivating the problem.Editor:
Shubhangi SarafSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl โ Leibniz-Zentrum fรผr Informatik
1 Introduction
One of, if not the primary direction in the study of quantum computing is to exhibit computational tasks that can be performed far more efficiently on a quantum computer than on a classical one. There are a number of promising candidates [28, 1, 7], but the quantum superiority of many such algorithms relies on unproven assumptions about computational hardness.
To obtain unconditional quantum-classical separations, one must consider classical models of computation against which there are known unconditional lower bounds. Bravyi, Gosset, and Kรถnig gave the first result of this kind by constructing a search problem which could be solved by constant-depth quantum circuits, but not constant-depth classical circuits [8]. Formally, . Since then, there have been many improvements to this result that consider stronger classical circuit families, different error models, and/or different topologies [36, 18, 9, 19, 10, 15].
Nevertheless, these results still fundamentally use the search paradigm for separating the quantum and classical circuit models (or, in fact, sometimes generalizations of search [18, 16]). Intuitively, one can think of these search problems as follows: the input to the problem is both the number of qubits and a specification of a constant-depth quantum circuit , and the goal is to output any bit string in the support of the distribution after measuring in the computational basis. One might wonder if the specification of the quantum circuit is even necessary. That is, is there a single quantum circuit for every that gives rise to a hard-to-sample distribution? In fact, Bravyi, Gosset, and Kรถnig asked exactly this question in their original work [8, Section 5].
There are a few reasons why we might want such a separation. First, one goal for proofs of quantum advantage is to help distill the core aspects of quantum computers that make them more powerful than their classical counterparts. Clearly then, a separation from a single family of distributions is desirable in its simplicity. Moreover, such results give complexity-theoretic support for certain quantum advantage experiments in which changing the underlying circuit is extremely difficult [40, 13, 38].
Watts and Parham [37] were the first to answer the challenge of [8] by constructing a family of constant-depth quantum circuits with output distributions that cannot be sampled (even approximately) by constant-depth classical circuits with bounded fan-in gates. Unfortunately, their result has two significant caveats. First, it imposes a strict requirement on the number of input bits to the classical circuit. Second, the quantum circuits they construct contain more-or-less arbitrary single-qubit gates (at least outside the Clifford hierarchy).
These two properties combine to make the โquantumโ contribution to the quantum-classical separation less clear. To see this, first notice that the usual method of converting between gate sets does not apply in the constant-depth regime, since the Solovay-Kitaev theorem incurs a polylogarithmic depth overhead [23]. This implies that the choice of which single-qubit gates to allow in the quantum circuit model could ultimately affect which kinds of separations are possible. This consideration has been put into sharp relief by recent work that gives a product distribution which cannot be sampled (even approximately) by constant-depth classical circuits with uniformly random input bits [35, 21].
In other words, it is possible to obtain a quantum vs. classical separation with a quantum circuit model that has no entangling gates (as only single-qubit rotation gates are needed to sample from a product distribution), undermining the claim that the separation is related to the powers of quantum mechanics. Indeed, if instead we allowed our classical circuit to have random inputs of arbitrary bias, then they, too, could easily produce the desired distribution. While the result of [37] does allow for classical circuits with biased input bits, the restriction on the number of input bits leaves open the possibility that larger classical circuits may still be able to sample from the distribution.
The main contribution of this work is the construction of a family of distributions that achieves the best properties from all prior works:
Theorem 1 (Informal Version of Theorem 2).
There is a uniform family of constant-depth quantum circuits such that
-
Discrete gate set: is constructed from Hadamard, controlled-phase, , and Toffoli gates. Furthermore, has a depth-7, geometrically local implementation.
-
Quantum advantage: Let be a family of constant-depth classical circuits (i.e., ), and consider the following two distributions: applied to any product distribution; and measuring in the computational basis. The total variation distance between these two distributions is .
Our distribution in Theorem 1 is based on a relational problem given by Watts, Kothari, Schaeffer, and Tal [36] to strengthen the previously mentioned separation between and [8]. Despite this similar construction, our results are incomparable, as they lower bound the stronger class of , but our results are in the distributional, rather than relational, setting.
Theorem 1 may also be of modest philosophical interest. Recall that randomness extractors convert weak sources of randomness into a near uniform distribution. In influential work, Trevisan and Vadhan provided extractors for distributions produced by polynomial size circuits, claiming that these โsampl[e]able distributions are a reasonable model for distributions actually arising in natureโ [30]. A recent follow up by Ball, Goldin, Dachman-Soled, and Mutreja instead argues that a better choice for โnatural sourcesโ are those generated by quantum circuits, since the universe is governed by quantum phenomena [4]. Our main result shows that even in the extremely restricted circuit regime, these two beliefs dramatically differ.
Open Problems
The obvious next question in this line of inquiry, raised earlier in [37, Section 2], is whether a similar distributional separation exists between the classes of and . In fact, it appears even the weaker task of separating from (for sampling distributions) is open. There are several known distributions based on pseudorandom objects that cannot be accurately sampled in [24, 5, 33, 34]; it is unclear whether shallow quantum circuits can sample them. We remark that while our distribution is based on a problem from [36] which separates and for relational problems, our distribution can be sampled by an circuit (see [17, Remark 5.10]).
Another direction is to refine the quantum gate set. The quantum circuits in our main separation result only require Hadamard, controlled-phase, and Toffoli gates, as opposed to the rotation gates required to generate the -biased product distribution used in previous separations [35, 21]. Still, one may wish to further limit the gate set, especially in light of the fact that Hadamard and Toffoli gates are quantum universal [2]. Unfortunately, the standard techniques to simulate the controlled-phase gates in this manner do not naively work in our setting (see [17, Subsection 4.1]), and we leave the minimal gate set required to separate and for sampling distributions as an open question.
One final direction deserving of further investigation is hardness amplification for sampling. Our proof of Theorem 1 crucially uses a direct product theorem for sampling in (see [17, Subsection 5.4]), which allows us to amplify a weak separation between and . A similar direct product theorem (or more generally, a hardness amplification result) for sampling in would likely be useful in addressing open separations. Note that such a theorem was asked for by Chattopadhyay, Goodman, and Zuckerman [11] who gave an analogous result for read-once branching programs.
2 The Proof Outline
In this section, we provide an overview of the proof of Theorem 2 โ a more precise and quantitative version of Theorem 1 parameterized by locality. The details can be found in the full version of the paper [17]. A function is -local if no output bit depends on more than input bits. Note that any family of circuits computes functions of constant locality. We will often refer directly to a distribution as -local if it is the result of applying a -local function to random inputs drawn from a product distribution.
Theorem 2.
Let be an integer. There exists a constant depending only on such that the following holds. There is a uniform family of distributions with over such that
-
There exists a family of geometrically local depth- quantum circuits where is produced exactly by on input . In addition, the quantum circuits only uses Hadamard, controlled-phase, , and Toffoli gates, and measurements in the computational basis. Moreover, Hadamard gates are only applied in the first and last layers, i.e., is in the second level of the Fourier Hierarchy [27].
-
For all , has total variation distance from any -local distribution with any binary product distribution as input.
ย Remark 3.
Given Theorem 2, it is natural to wonder whether every distribution produced by circuits can be sampled by circuits. The following example shows that this is not the case. Consider the distribution over which takes value with probability and otherwise. A classical circuit can easily produce by having each output bit mirror the same input bit. circuits, however, cannot generate , as doing so is equivalent to preparing a nekomata (first defined in [25]), i.e., a state of the form
A lightcone argument shows that depth is necessary to prepare such a state, as is shown in [36].
In Section 2.1, we will review the Parity Halving Problem of [36], and explain how to derive a distributional version of the problem that can be exactly sampled by a shallow quantum circuit, but seemingly cannot be accurately sampled by a function of low locality. In Section 2.2, we will sketch a proof that this distribution has constant distance from every -local distribution. That is, there is a distribution which exhibits a constant distance separation between classical and quantum sampling with shallow circuits. To boost this separation to an optimal one, we highlight and apply a direct product theorem implicit in [21] in Section 2.3.
2.1 Quantum Sampling and a Classical Reduction
As mentioned, the authors of [36] define the Parity Halving Problem (PHP): a relation problem over bit strings which is solvable by a shallow quantum circuit, but any randomized (and therefore ) circuit can only succeed on a trivial fraction of inputs. It is defined as follows:
Definition 4 (Parity Halving Problem).
Given with , return which satisfies .
Their initial observation is that the PHP can be solved with certainty on all instances by a circuit with polynomial size quantum advice, i.e., PHP is in the class . This circuit is shown on the left in Figure 1.
To see that this circuit does indeed solve the PHP, note that after the gates are applied the resulting state is
Finally, applying to yields a uniform superposition over bit strings of parity .
In order to obtain a relational separation between and without the need for quantum advice111Actually, the Relaxed Parity Halving Problem even serves to separate and , but we make mention of it here as it serves as motivation for our sampling separation., the authors of [36] define a variant of the PHP as follows:
Definition 5 (Relaxed Parity Halving Problem).
Fix a graph . Given with , return and for which there exists such that
If has a cycle then it may not be the case that for each there exists a such that and together satisfy the first condition above. However, when the underlying graph is a tree then such a exists for each and preparing a โpoor manโs cat stateโ suffices to solve the Relaxed Parity Halving Problem over . A poor manโs cat state is a state proportional to where is the bitwise negation of . The key observation is that there is a circuit which prepares a poor manโs cat state, conditioned on the measurement outcome of another register. Indeed, the state (where and satisfy the first constraint of Definition 5) can be prepared by a circuit so long as the maximum degree of is constant. Finally, applying the PHP circuit, treating the register of the poor manโs cat state as if it were the cat state, yields a uniformly random string of parity .
This circuit is shown on the right in Figure 1. In order to obtain lower bounds against for the (R)PHP, standard locality arguments apply.
Recall our goal is to construct a distributional separation. A reasonable first attempt might be to consider the distribution which is uniform over tuples satisfying the relation. If generating this distribution is as hard as computing the RPHP relation, then classical hardness follows. Unfortunately, this is not the case. To gain some intuition, consider the classical PARITY function, which cannot be computed by shallow classical circuits [20, 29], and yet, a simple circuit can sample from the distribution where is a uniformly random bit string [3, 6]. Specifically, one can map the random bits
where denotes concatenation [3, 6]. In fact, a similar construction classically samples from (see [17, Subsection 5.2]).
We briefly digress to remark that this example is not an outlier. Indeed, the past decade or two has seen the study of sampling distributions blossom into a rich area, in many ways independent of computation, with exciting connections to fields such data structures [31, 24, 5, 35, 39, 21], extractors [30, 12, 32, 33, 4], pseudorandom generators [31, 24], and coding theory [26]. We refer the interested reader to the recent works [14, 35, 21, 39, 26, 22] and references within for more details.
To overcome the above barrier, consider the strings subject to the constraint . The simple-but-key observation is that on input with odd Hamming weight, the quantum circuit shown on the left in Figure 1 yields a uniformly random bit string . (Note that is always uniformly random.) Hence, if we replace with some other state, we can now run our quantum circuit without necessarily invoking the promise on the Hamming weight of , which gives us some added flexibility in our choice of distribution. In fact, we will simply replace each qubit with the single-qubit state . That is, if we were to measure the qubits of the -register, we would obtain the -biased distribution for each bit. It is exactly this distribution of inputs for which we can show classical hardness. In the following subsection, we will highlight exactly why this distribution does not suffer from the same shortcoming as the PARITY example.
Formally, the distribution witnessing the separation is defined as follows:
Definition 6 (The Distribution).
Let be a tree with undirected edges. A sample is drawn as follows: first sample and . Define by setting for each . If has odd Hamming weight, then sample uniformly at random; otherwise, sample as a uniform -bit string of parity .
In [17, Section 4], we show how to sample exactly using circuits with the help of ancilla. The circuit is obtained by slightly modifying the construction given in [36] for the RPHP:
Proposition 7.
Let be a tree and let be its maximum vertex degree. Then there exists a geometrically local quantum circuit such that the following holds.
-
has depth and only uses Hadamard, controlled-phase, CNOT, and Toffoli gates. Moreover, Hadamard gates are only applied in the first and last layers.
-
Let be the distribution obtained by measuring in the computational basis. Then the marginal distribution of the first coordinates of is exactly .
We refer to the target distribution as because it essentially โhostsโ the following distribution, , defined below:
Definition 8 (The Distribution).
A sample is drawn as follows: first sample according to the -biased product distribution. If has odd Hamming weight, then sample uniformly at random; otherwise has even Hamming weight, and sample as a uniform -bit string of parity .
is the distribution which we are able to prove classical hardness for in a more straightforward way. Observe that the relationship between and is analogous to that between the PHP and RPHP. The reduction from to is given in [17, Subsection 5.3].
Lemma 9.
Let be a tree and let be arbitrary. Define , where is the set of edges on the unique path between and . Then there exists a -local function such that
2.2 Classical Hardness
The classical lower bound of Theorem 2 is largely derived from the following hardness result. Let denote the tower of 2โs of height (e.g., ).
Theorem 10.
Let be an integer. Assume and . Then any -local distribution has total variation distance at least from .
Combining Theorem 10 with Lemma 9 easily gives the following corollary.
Corollary 11.
Let be a tree and let be arbitrary. Define , where is the set of edges on the unique path between and . Additionally, let be an integer, and assume and . Then any -local distribution has total variation distance at least from .
Proof.
Assume by contradiction there exists a -local function and a product distribution over such that the distribution of applied to samples drawn from , denoted , is -close to for some . Define a new function by
where is defined as in Lemma 9. Then is -local by Lemma 9 and -close to by the data processing inequality. This contradicts Theorem 10.
Before sketching the main ideas behind the proof of Theorem 10, a few remarks are in order. First, a tighter analysis can yield distance , assuming ; this is near optimal, as the -local222The distribution is -local if the input bits are unbiased coins. When we allow input bits with mixed bias of and , the distribution is -local. distribution achieves distance . Second, the quadratic upper bound on in Theorem 10 is necessary; we show is -local when in [17, Proposition 5.8]. Finally, it is necessary that and not as the latter can be exactly sampled (see [17, Proposition 5.9]), though any bias other than or will be hard.
Let us now provide an overview of Theorem 10โs proof. Fix an arbitrary -local function and an arbitrary product distribution over as input. Our goal is to show that the distribution is 0.24-far from . One immediate challenge in working with -local functions is that the locality constraint is โone-sided.โ Even though no output bit is influenced by many input bits, there may exist an input bit that affects every single output bit. The resulting output distribution, then, can have complicated correlations, which muddle the analysis.
The Structured Case: A First Attempt
To warm-up, we first consider the idealized setting where there are โnon-connectedโ output bits, by which we mean no two such output bits depend on a common input bit. In particular, the marginal distributions of projected onto the individual coordinates are independent. Here, one should view as large, say . We proceed via a concentration vs. anticoncentration dichotomy, present in various forms in the works [31, 14, 35, 21, 22]. Specifically, we classify each of the output bits according to how their corresponding marginal distribution compares to the marginal distribution of the target distribution.
At a high level, we would like to argue that either many of these output bits have marginal distributions which are far from those of the target distribution, in which case we can combine the marginal errors, or many of these output bits are close to the โcorrectโ marginal distribution, in which case a more complicated anticoncentration argument shows that a specific potential function highlights a noticeable discrepancy between the two distributions. To this end, we call an output bit Type-1 if the marginal distribution is -far in total variation distance from the marginal distribution , and call it Type-2 otherwise. Here, is some small threshold parameter.
Suppose at least of the non-connected output bits are Type-1. Note that since total variation distance is closed under projection, a single Type-1 neighborhood already gives distance . To strengthen the bound, one can take advantage of independence and apply standard concentration inequalities, as in the proof of [21, Lemma 4.2], to conclude has distance roughly from .333There is a small subtlety here, in that if the set of Type-1 output bits fully contains the last output bits, then those output bits are not a product distribution in . For simplicity, we will assume that this does not occur, although the full statement of [21, Lemma 4.2] is robust enough to still apply in that scenario. For , this is at least 0.24, as desired.
The more involved case is when at least output bits are Type-2. Here, rather than directly comparing to , we compare the expectation of a complex-valued potential function over samples drawn from the two distributions. Direct calculations show that for (see [17, Claim 5.3]) and that is bounded away from 1 for any integral random variable suitably far from constant modulo 4 (see [17, Claim 5.5]). It is tempting to argue that by the independence of the non-connected output bits, we can fix the value of the input bits not affecting any Type-2 output bits to view as a product of many independent random variables with magnitudes bounded away from 1. Then we could conclude that for each of these input conditionings, for , which would give the desired distance of (using [17, Lemma 3.4]).
The problem, however, is that the contributions of the remaining output bits can compensate for those of the non-connected output bits. For example, consider the string , where the โs are independent random bits. There are independent bits, yet the stringโs Hamming weight is fixed at . Thus, we cannot reason about solely from the non-connected output bits. Instead, we need to consider the neighborhood of each output bit , i.e., the set of output bits that are also influenced by the input bits determining .
The Structured Case: Refining the Output Structure
To fix our analysis, let us change our assumption from there being non-connected output bits to there being non-connected neighborhoods. Here, we refer to two neighborhoods as non-connected if for every pair of output bits and , the input bits that determine are disjoint from those that determine . We can similarly classify each neighborhood as Type-1 or Type-2 depending on the distance of its marginal distribution to that of the target distribution. The analysis in the case of many Type-1 neighborhoods is performed almost identically to the previous scenario, but now we are able to reason more carefully when there are many Type-2 neighborhoods.
Indeed, consider a Type-2 neighborhood , where is the output bits contained in the first indices (corresponding to ) and is the output bits contained in the latter indices (corresponding to ). If we can show that is not too close to being a constant, then the potential function argument sketched above can actually be carried out. To this end, let be the output bit that defines the neighborhood , and consider the effect of conditioning on vs. on .
First suppose is in the part. In this case, we can write and express as . Recall that is a Type-2 neighborhood, so it should resemble a product distribution. In particular, the distribution of conditioned on should be roughly the same as when conditioned on . Observe then, that and should have noticeably different distributions, as we are essentially comparing a binomial distribution with its shift. Since should be close to -biased, it takes both values with constant probability, so cannot be too close to any fixed value. A similar analysis shows that if is in the part, then we are comparing the density of and .
Unfortunately, there is a problem with this latter case. Suppose the neighborhood does not contain any bits in the part. Then we are comparing the density of and , or equivalently, and . The part is -biased, so can have the same distribution as ! Note that it is this fact which allows for the previously described sampling algorithm for . Hence, we must make one further refinement to our assumption.
The Structured Case: A Final Adjustment
Now instead of simply assuming there are non-connected neighborhoods, we insist that all neighborhoods are generated by output bits in the part. Moreover, we will only require the non-connectedness property on bits in the part of the neighborhoods. This second condition actually makes the analysis more challenging, but we will later see it is necessary for this model case to be obtainable. We once more redefine Type-1 and Type-2 neighborhoods; this time we classify neighborhoods based only on their marginals on the first output bits. The case of many Type-1 neighborhoods essentially works as before (see [17, Lemma 5.2]), so it remains to address the case where at least of the neighborhoods are Type-2.
To obtain some structure in the part, we exploit our assumption on the size of . Since we have , most pairs of neighborhoods do not intersect in the last output bits. Quantitatively, we can find non-connected Type-2 neighborhoods that do not intersect in the part. Without loss of generality, assume they are . By fixing the value of all the input bits that do not affect , the contributions to from these neighborhoods are now independent. In particular, the expectation of becomes a product of expectations over the output of the neighborhood. As noted above, we can conclude the expectation over the neighborhood is bounded away from 1 if is not too close to any fixed value.
This ends up being a bit difficult to show directly, since while the neighborhoods are disjoint in the part, they may be connected. Fortunately, the variance of over a random such fixing of the input bits follows from that of , where we do have non-connectedness in the part. By the previous argument of considering conditioned on the output bit being 0 vs. being 1, we are able to prove is typically not too close to constant (see [17, Claim 5.7]). This concludes the analysis of many Type-2 neighborhoods (see [17, Lemma 5.6]), as well as the proof of Theorem 2 under certain ideal assumptions.
Reduction to the Structured Case
Previously, we assumed a rather strong structure: many output bits generating neighborhoods that are non-connected in . This, of course, is not a structure readily present in an arbitrary -local function . To reduce to this case, a standard approach (appearing in, e.g., [31, 8, 34, 14, 35, 21, 22]) is to strategically condition on bits to express an arbitrary -local function as a convex combination of functions with the desired structure. In other words, if we can find some set of input bits whose removal induces many non-connected neighborhoods of the form we want, then we can express as
where is defined as with the input bits in fixed to their values in . Observe that each has the structured form we already know how to analyze, regardless of the actual values the bits in are set to. More formally, we have:
-
1.
If most of the non-connected neighborhoods are Type-1, then is -far from , and
-
2.
Otherwise, .
By a union bound argument (see [17, Lemma 3.4]), these results on the conditioned functions can be combined to obtain . Then as long as , we obtain the desired distance bound of 0.24.
At this point, the remaining task is combinatorial. We construct a bipartite graph whose left side is the first output bits, and whose right side is the input bits. Note that each left vertex has maximum degree . We want to remove right vertices to obtain non-connected neighborhoods of the prescribed form, where . Ideally, we would like to be as large as possible to maximize the total variation distance. Fortunately, this task has already been done for us. By [21, Corollary 4.11], we can take and , as desired.
For the sake of completeness, we briefly highlight the main idea behind the proof of [21, Corollary 4.11]. The key observation is that locality, while only explicitly constraining the left vertices, also constrains the right ones, since it upper bounds the number of edges by . Thus while we cannot forbid high-degree right vertices, there cannot be many of them. This implies that we can โaffordablyโ remove all right vertices above a particular degree threshold, and greedily find non-connected vertices on the left side. A more involved analysis (see [21, Corollary 4.8]) provides better parameters than one could obtain via this naive approach, and an even more involved analysis guarantees non-connected left neighborhoods, rather than just vertices. Still, the proofs morally operate in a similar way to the strategy described. This completes the sketch of the proof of Theorem 2; the full details can be found in [17, Section 5].
We conclude by remarking that the above analysis is fairly robust, and it allows one to rule out the sampleability of a number of simple distributions by shallow circuits. Thus, the specific distributions we have chosen to consider are primarily a function of what can be produced by shallow quantum circuits, rather than what can be forbidden for shallow classical ones.
2.3 Boosting the Separation
Combining our results thus far produces a separation with constant total variation distance. In order to prove the stronger separation in Theorem 2, we consider the distribution . Certainly, if our quantum circuit can generate , then it can also generate . Moreover, we can apply the following direct product theorem implicit in [21] (and formalized in [17, Subsection 5.4]) to show the overlap of the target distribution with that produced by classical circuits decays exponentially quickly.
Theorem 12 (Direct Product Theorem).
Let be integers, and let be a distribution over . Suppose that for any -local function and binary product distribution on , we have
Then for any integer , -local function , and binary product distribution on , we have
The proof of Theorem 12, much like the proof of Theorem 2, uses a graph elimination result derived in [21]. In this context, such a result allows one to find many independent groups of output bits corresponding to instances of . Since the marginal distributions of and disagree on each group, we can use a standard concentration inequality to derive a strong error bound.
Proof of Theorem 2.
Let , and let be the spanning tree of the square grid obtained by including all the edges in the first row, as well as all the edges in each column. Observe that the diameter of is , so (as defined in Corollary 11) is at most . In particular, our choice of guarantees . Thus, Corollary 11 implies any -local distribution at least -far from . Applying Theorem 12, we can boost this error to conclude that any -local distribution has distance from at least
Let be a sufficiently large constant depending only on . For any integer , express with , and define the distribution . Since total variation distance is closed under projection, we find that any -local distribution has distance from at least
for large enough .
We conclude by noting that Proposition 7 gives a depth- quantum circuit that exactly samples on input by considering the marginal distribution on specific coordinates. By padding with extra zeros, a similar circuit on inputs can also sample .
ย Remark 13.
Our setting of is motivated by common topological choices in implementations. One could alternatively set to be a balanced binary tree to minimize , but this would not affect the final bound.
References
- [1] Scott Aaronson and Alex Arkhipov. The computational complexity of linear optics. Theory Of Computing, 9(4):143โ252, 2013. doi:10.4086/TOC.2013.V009A004.
- [2] Dorit Aharonov. A simple proof that Toffoli and Hadamard are quantum universal. arXiv preprint quant-ph/0301040, 2003.
- [3] Lรกsziรณ Babai. Random oracles separate PSPACE from the polynomial-time hierarchy. Information Processing Letters, 26(1):51โ53, 1987. doi:10.1016/0020-0190(87)90036-6.
- [4] Marshall Ball, Eli Goldin, Dana Dachman-Soled, and Saachi Mutreja. Extracting randomness from samplable distributions, revisited. In 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 1505โ1514. IEEE, 2023. doi:10.1109/FOCS57990.2023.00092.
- [5] Chris Beck, Russell Impagliazzo, and Shachar Lovett. Large deviation bounds for decision trees and sampling lower bounds for AC0-circuits. In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 101โ110. IEEE, 2012. doi:10.1109/FOCS.2012.82.
- [6] Ravi B Boppana and Jeffrey C Lagarias. One-way functions and circuit complexity. Information and Computation, 74(3):226โ240, 1987. doi:10.1016/0890-5401(87)90022-8.
- [7] Adam Bouland, Bill Fefferman, Chinmay Nirkhe, and Umesh Vazirani. On the complexity and verification of quantum random circuit sampling. Nature Physics, 15(2):159โ163, 2019.
- [8] Sergey Bravyi, David Gosset, and Robert Kรถnig. Quantum advantage with shallow circuits. Science, 362(6412):308โ311, 2018.
- [9] Sergey Bravyi, David Gosset, Robert Kรถnig, and Marco Tomamichel. Quantum advantage with noisy shallow circuits. Nature Physics, 16(10):1040โ1045, 2020.
- [10] Libor Caha, Xavier Coiteux-Roy, and Robert Kรถnig. A colossal advantage: 3D-local noisy shallow quantum circuits defeat unbounded fan-in classical circuits. arXiv preprint arXiv:2312.09209, 2023.
- [11] Eshan Chattopadhyay, Jesse Goodman, and David Zuckerman. The space complexity of sampling. In 13th Innovations in Theoretical Computer Science Conference,(ITCS 2022), 2022.
- [12] Anindya De and Thomas Watson. Extractors and lower bounds for locally samplable sources. ACM Transactions on Computation Theory (TOCT), 4(1):1โ21, 2012. doi:10.1145/2141938.2141941.
- [13] Yu-Hao Deng, Yi-Chao Gu, Hua-Liang Liu, Si-Qiu Gong, Hao Su, Zhi-Jiong Zhang, Hao-Yang Tang, Meng-Hao Jia, Jia-Min Xu, Ming-Cheng Chen, et al. Gaussian boson sampling with pseudo-photon-number-resolving detectors and quantum computational advantage. Physical review letters, 131(15):150601, 2023.
- [14] Yuval Filmus, Itai Leigh, Artur Riazanov, and Dmitry Sokolov. Sampling and certifying symmetric functions. In Approximation, Randomization, and Combinatorial Optimization. (APPROX/RANDOM), 2023.
- [15] Sabee Grewal and Vinayak M Kumar. Improved circuit lower bounds and quantum-classical separations. arXiv preprint arXiv:2408.16406, 2024.
- [16] Daniel Grier, Nathan Ju, and Luke Schaeffer. Interactive quantum advantage with noisy, shallow Clifford circuits. arXiv preprint arXiv:2102.06833, 2021. arXiv:2102.06833.
- [17] Daniel Grier, Daniel M. Kane, Jackson Morris, Anthony Ostuni, and Kewen Wu. Quantum advantage from sampling shallow circuits: Beyond hardness of marginals. arXiv preprint arXiv:2510.07808, 2025. doi:10.48550/arXiv.2510.07808.
- [18] Daniel Grier and Luke Schaeffer. Interactive shallow Clifford circuits: quantum advantage against and beyond. In Proceedings of the 52nd Annual ACM Symposium on Theory of Computing, pages 875โ888, 2020. doi:10.1145/3357713.3384332.
- [19] Atsuya Hasegawa and Franรงois Le Gall. Quantum Advantage with Shallow Circuits Under Arbitrary Corruption. In 32nd International Symposium on Algorithms and Computation (ISAAC 2021), volume 212 of Leibniz International Proceedings in Informatics (LIPIcs), pages 74:1โ74:16. Schloss Dagstuhl โ Leibniz-Zentrum fรผr Informatik, 2021. doi:10.4230/LIPIcs.ISAAC.2021.74.
- [20] Johan Hรฅstad. Computational limitations for small depth circuits. PhD thesis, Massachusetts Institute of Technology, 1986.
- [21] Daniel M Kane, Anthony Ostuni, and Kewen Wu. Locality bounds for sampling Hamming slices. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 1279โ1286, 2024. doi:10.1145/3618260.3649670.
- [22] Daniel M Kane, Anthony Ostuni, and Kewen Wu. Locally sampleable uniform symmetric distributions. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, pages 1807โ1816, 2025. doi:10.1145/3717823.3718243.
- [23] Greg Kuperberg. Breaking the cubic barrier in the Solovay-Kitaev algorithm. arXiv preprint arXiv:2306.13158, 2023. doi:10.48550/arXiv.2306.13158.
- [24] Shachar Lovett and Emanuele Viola. Bounded-depth circuits cannot sample good codes. In 2011 IEEE 26th Annual Conference on Computational Complexity, pages 243โ251. IEEE, 2011. doi:10.1109/CCC.2011.11.
- [25] Gregory Rosenthal. Bounds on the QAC0 complexity of approximating parity. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021), Leibniz International Proceedings in Informatics (LIPIcs), pages 32:1โ32:20. Schloss Dagstuhl โ Leibniz-Zentrum fรผr Informatik, 2021. doi:10.4230/LIPIcs.ITCS.2021.32.
- [26] Ronen Shaltiel and Jad Silbak. Explicit codes for poly-size circuits and functions that are hard to sample on low entropy distributions. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 2028โ2038, 2024. doi:10.1145/3618260.3649735.
- [27] Yaoyun Shi. Quantum and classical tradeoffs. Theoretical computer science, 344(2-3):335โ345, 2005. doi:10.1016/J.TCS.2005.03.053.
- [28] Peter W Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM review, 41(2):303โ332, 1999. doi:10.1137/S0036144598347011.
- [29] Roman Smolensky. Algebraic methods in the theory of lower bounds for Boolean circuit complexity. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 77โ82, 1987. doi:10.1145/28395.28404.
- [30] Luca Trevisan and Salil Vadhan. Extracting randomness from samplable distributions. In Proceedings 41st Annual Symposium on Foundations of Computer Science, pages 32โ42. IEEE, 2000. doi:10.1109/SFCS.2000.892063.
- [31] Emanuele Viola. The complexity of distributions. SIAM Journal on Computing, 41(1):191โ218, 2012. doi:10.1137/100814998.
- [32] Emanuele Viola. Extractors for Turing-machine sources. In International Workshop on Approximation Algorithms for Combinatorial Optimization, pages 663โ671. Springer, 2012. doi:10.1007/978-3-642-32512-0_56.
- [33] Emanuele Viola. Extractors for circuit sources. SIAM Journal on Computing, 43(2):655โ672, 2014. doi:10.1137/11085983X.
- [34] Emanuele Viola. Sampling lower bounds: boolean average-case and permutations. SIAM Journal on Computing, 49(1):119โ137, 2020. doi:10.1137/18M1198405.
- [35] Emanuele Viola. New sampling lower bounds via the separator. In 38th Computational Complexity Conference (CCC 2023), Leibniz International Proceedings in Informatics (LIPIcs), pages 26:1โ26:23. Schloss Dagstuhl โ Leibniz-Zentrum fรผr Informatik, 2023. doi:10.4230/LIPIcs.CCC.2023.26.
- [36] Adam Bene Watts, Robin Kothari, Luke Schaeffer, and Avishay Tal. Exponential separation between shallow quantum circuits and unbounded fan-in shallow classical circuits. In Proceedings of the 51st Annual ACM Symposium on Theory of Computing, pages 515โ526, 2019. doi:10.1145/3313276.3316404.
- [37] Adam Bene Watts and Natalie Parham. Unconditional quantum advantage for sampling with shallow circuits. arXiv preprint arXiv:2301.00995, 2023. doi:10.48550/arXiv.2301.00995.
- [38] Aaron W Young, Shawn Geller, William J Eckner, Nathan Schine, Scott Glancy, Emanuel Knill, and Adam M Kaufman. An atomic boson sampler. Nature, 629(8011):311โ316, 2024.
- [39] Huacheng Yu and Wei Zhan. Sampling, flowers and communication. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024), Leibniz International Proceedings in Informatics (LIPIcs), pages 100:1โ100:11. Schloss Dagstuhl โ Leibniz-Zentrum fรผr Informatik, 2024. doi:10.4230/LIPIcs.ITCS.2024.100.
- [40] Han-Sen Zhong, Hui Wang, Yu-Hao Deng, Ming-Cheng Chen, Li-Chao Peng, Yi-Han Luo, Jian Qin, Dian Wu, Xing Ding, Yi Hu, et al. Quantum computational advantage using photons. Science, 370(6523):1460โ1463, 2020.
