Abstract 1 Introduction 2 Hard Distributions for the Extremal Forrelation Problem 3 Classical Lower Bound References

Forrelation Is Extremally Hard

Uma Girish ORCID Columbia University, New York, NY, USA Rocco Servedio ORCID Columbia University, New York, NY, USA
Abstract

The Forrelation problem is a central problem that demonstrates an exponential separation between quantum and classical capabilities. In this problem, given query access to n-bit Boolean functions f and g, the goal is to estimate the Forrelation function forr(f,g), which measures the correlation between g and the Fourier transform of f.

In this work we provide a new linear algebraic perspective on the Forrelation problem, as opposed to prior analytic approaches. We establish a connection between the Forrelation problem and bent Boolean functions and through this connection, analyze an extremal version of the Forrelation problem where the goal is to distinguish between extremal instances of Forrelation, namely (f,g) with forr(f,g)=1 and forr(f,g)=1.

We show that this problem can be solved with one quantum query and success probability one, yet requires Ω~(2n/4) classical randomized queries, even for algorithms with a one-third failure probability, highlighting the remarkable power of one exact quantum query. We also study a restricted variant of this problem where the inputs f,g are computable by small classical circuits and show classical hardness under cryptographic assumptions.

Keywords and phrases:
Forrelation, exact quantum, query complexity
Copyright and License:
[Uncaptioned image] © Uma Girish and Rocco Servedio; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Oracles and decision trees
; Theory of computation Complexity classes ; Theory of computation Quantum complexity theory
Related Version:
Full Version: https://arxiv.org/abs/2508.02514
Editor:
Shubhangi Saraf

1 Introduction

Understanding the relative power of quantum versus classical computation is one of the major goals in complexity theory. Following the seminal work of Shor [38], it is widely believed that quantum computation is exponentially more powerful than classical computation; however, since we are unable to prove classical lower bounds for strong models of computation, there are relatively few settings in which such a separation can be unconditionally established. Query complexity is an important example of such a setting where we can unconditionally prove exponential quantum speedups.

In query complexity, we typically consider a Boolean function f:{0,1}n{±1} and the objective is to compute some property of f by querying the values f(x) on as few inputs x{0,1}n as possible. In the classical setting, the algorithm can adaptively and probabilistically choose inputs to query, and the goal is to solve the problem with high success probability, say at least 2/3. In the quantum setting, the standard way to model a quantum query is by means of the unitary operator Of which maps |x to |xf(x) for all x{0,1}n and as before, the goal is to compute some property of f with high success probability, while minimizing the number of calls to the unitary Of. Numerous works [21, 40, 10, 1] have demonstrated properties of f that are exponentially easier to compute with quantum queries as opposed to classical queries. For instance, a version of periodicity testing [19] can be solved with poly(n) quantum queries, while requiring 2Ω(n) classical randomized queries; this algorithm is a key subroutine in Shor’s factoring algorithm [38], and the classical query lower bound helps explain some of the difficulty in finding efficient classical algorithms for factoring.

Understanding the strongest possible separation between quantum and classical computation has long been a topic of great interest. The overarching motivation here is to find a problem that is as easy as possible for quantum algorithms and as hard as possible for classical algorithms. In this context, a new problem called the Forrelation problem has emerged as a central concept.

In the Forrelation problem, given query access to Boolean functions f,g, the goal is to estimate the value of the Forrelation function forr(f,g)[1,1], which captures the correlation between g and f^, the Fourier transform of f. The Forrelation function has been used to establish numerous results about the power of quantum computation. The first such result is due to Aaronson [1], who showed that the Forrelation problem can be solved with high probability using just one quantum query, yet randomized algorithms require 2Ω(n) queries. This was quantitatively strengthened by [2] who defined variants of the Forrelation problem that are now known to demonstrate the largest possible separation between quantum and randomized query complexity for partial functions [2, 41, 37, 9, 14]. The Forrelation problem has since been used to prove many other results, including the celebrated oracle separation of 𝖡𝖰𝖯 and 𝖯𝖧 [35]. Variants of this problem have been used to prove quantum lower bounds, such as a construction of a classical oracle relative to which 𝖯=𝖭𝖯 but 𝖡𝖰𝖯𝖰𝖢𝖬𝖠 [4], as well as the existence of various quantum cryptography primitives [31, 32], separations between adaptive and non-adaptive quantum algorithms [27], and various separations between quantum and classical communication complexity [25, 26, 8].

However, all these works share one common limitation – they only establish classical hardness for estimating the Forrelation function up to a constant less than one. The best-known result in this context is due to Aaronson and Ambainis [2], who show that distinguishing between forr(f,g)2/π and forr(f,g)2/π requires 2Ω(n) classical randomized queries. The factor of 2/π arises from their analytic approach, which involves sampling Gaussian random variables and rounding them to {±1}. All existing techniques use this framework and run into the same 2/π barrier. This naturally leads us to ask: just how hard is it to approximate the Forrelation function in the extremal case? In particular, Aaronson and Ambainis [2] ask the following question (see open question #4 in the discussion section of their paper): if we want a 1 versus 2Ω(n) separation between quantum and classical query complexity, how small can the error of the quantum algorithm be? More precisely, we ask:

How hard is it to distinguish forr(f,g)=1 from forr(f,g)=1?

These two extreme cases capture the largest and smallest possible values of the Forrelation function, and hence this question captures the hardness of approximating Forrelation to any non-trivial factor.

To study this problem, we introduce a fundamentally new way of looking at the Forrelation problem. In contrast to previous analytic approaches, which rely on rounding high-dimensional Gaussian distributions, our approach uses only simple linear algebra over 𝔽2n and elementary probabilistic arguments. We establish a novel connection between the Forrelation problem and bent functions, a well-studied concept in the analysis of Boolean functions. Using this connection, we show that despite the strong promise on the inputs, the extremal Forrelation problem is classically hard. Our main theorem establishes a bounded-error randomized lower bound of 2Ω(n) for this problem. In contrast, there is a simple quantum algorithm [1] that solves this problem with one quantum query and success probability one.

In the following section, we formally define the Forrelation problem, describe the history of the problem, and state our main results.

1.1 Forrelation Problem

To describe the Forrelation problem, we first need to introduce the concept of the Fourier transform and the Forrelation function. The Boolean Fourier transform, also known as the Walsh-Hadamard transform, is a central concept in Boolean function analysis which has applications to learning theory, social choice theory, circuit complexity, property testing, and quantum versus classical separations. It is defined as follows.

Definition 1 (Fourier Transform).

For f:𝔽2n, define f^:𝔽2n by

f^(y):=12nx𝔽2nf(x)(1)x,y for all y𝔽2n.

We now define the Forrelation function, which has close connections with the Fourier transform of Boolean functions. The input to this function consists of the truth tables of two Boolean functions f and g and the output is the correlation between g and the Fourier transform of f.

Definition 2 (Forrelation Function).

The function forr is defined as follows. For Boolean functions f,g:𝔽2n{±1}, define

forr(f,g) :=12n/2y𝔽2nf^(y)g(y) (1)
=123n/2x,y𝔽2nf(x)g(y)(1)x,y.

It is not too difficult to see that for any pair of Boolean functions (f,g), the value of forr(f,g) is between 1 and 1. Indeed, applying Cauchy-Schwarz on Equation 1 implies that |forr(f,g)|2n/2yf^(y)2yg(y)2. Since f and g are ±1-valued functions, Parseval’s theorem implies that yf^(y)2=𝐄𝒙[f(𝒙)2]=𝐄𝒚[g(𝒚)2]=1 and it follows that |forr(f,g)|1.

|+   Of,gH|+   H|+   H|+   H|+   H

Figure 1: Quantum circuit for the Forrelation problem for n=5. Here, |+=12(|0+|1) and Of,g is the oracle mapping basis states |0|x to |0|xf(x) and |1|x to |1|xg(x).

The Forrelation function is also interesting from the perspective of quantum algorithms as it can be interpreted as the bias of a certain one-query quantum algorithm. More precisely, Aaronson [1] gave a simple quantum query algorithm that makes one call to Of,g and returns 1 with probability precisely 12+forr(f,g)2 (see Figure 1 for an illustration of the algorithm.) The intuition for this quantum algorithm comes from the ability of quantum circuits to implement the Fourier transform. We can view the Fourier transform as a unitary map that transforms the truth table of f into that of f^ (up to a normalization factor of 2n/2). Additionally, this unitary map turns out to be a Hadamard matrix, i.e., a tensor product of n Hadamard gates. In contrast, it seems difficult to estimate the Forrelation of f and g using only classical queries to the truth tables of f and g. This motivates the definition of the Forrelation problem, where the goal is to estimate the Forrelation function up to a small additive error.

Definition 3 (Forrelation Problem).

Fix a parameter 0ε1. Given query access to the truth tables of Boolean functions f,g:𝔽2n{±1} that are promised to satisfy either

  • yes case: forr(f,g)ε, or

  • no case: forr(f,g)ε,

distinguish between the two cases.111In prior works, the goal of the Forrelation problem is usually to distinguish forr(f,g)ε from forr(f,g)ε/2. Nevertheless, existing lower bounds also work for our variant of the problem; the proofs can be modified to produce two distributions supported on forr(f,g)ε and forr(f,g)ε, respectively, such that they are each indistinguishable from the uniform distribution over (f,g) for classical algorithms of small cost.

As mentioned before, there is a simple quantum algorithm that solves the Forrelation problem with one query, and the success probability of the algorithm is precisely 12+ε2 where ε is the underlying parameter. In particular, when ε=1, the algorithm makes no error.

Limitations of Prior Works on Forrelation

Numerous works have established classical hardness of the Forrelation problem for ε bounded away from 1 [1, 2, 35, 9]. We will describe these results in more detail in Table 1, but all these works share a common limitation, which is that they only establish hardness for estimating the Forrelation up to a global constant less than one. The best-known constant is due to Aaronson and Ambainis [2], who showed that the Forrelation problem with ε=2/πo(1) requires 2Ω(n) randomized queries.

1.2 Our Results

We consider the Forrelation problem with ε=1, and call this the Extremal Forrelation Problem. Here, we are given query access to Boolean functions f,g:{0,1}n{±1} that are promised to satisfy |forr(f,g)|=1, and we wish to tell whether forr(f,g)=1 or forr(f,g)=1. As mentioned before, the quantum algorithm for this problem follows immediately from the works of [1, 2]. When this algorithm is run on inputs to the Extremal Forrelation Problem it makes one query and solves the problem with success probability one. Our main theorem is that the classical randomized bounded-error query complexity of this problem is 2Ω(n).

Theorem 4.

The Extremal Forrelation Problem, which is solvable with one quantum query and success probability one, requires Ω~(2n/4) queries for any classical randomized query algorithm that succeeds with at least 2/3 probability.

We also analyze a variant of this problem where the oracles f,g are efficiently computable.

Corollary 5.

Suppose one-way functions exist against classical poly(n)-time algorithms. Then, there is no poly(n)-time randomized algorithm that solves the Extremal Forrelation Problem with probability at least 2/3, even if the oracles f,g are computable by poly(n)-sized classical circuits.

We remark that the above result is proved in the black-box setting where the algorithm can only query the truth tables of f,g. We do not know if these results apply in the white-box setting where the algorithm is given an explicit description of a small circuit computing f,g.

We now mention a few consequences of our results.

Lower Bounds for the Forrelation Problem
Table 1: Lower Bounds for the Forrelation Problem.
Forrelation Problem Classical Model Classical Lower Bound
[1] ε=0.05 Randomized decision tree Ω~(2n/4) queries (depth)
[2] ε=2/π Randomized decision tree Ω~(2n/2) queries
[35] ε=Θ(1/n) Randomized AC0 circuits exp(2Ω(n/d)) size
[9] ε=1/210 Randomized AC0 circuits exp(2Ω(n/d)) size
[This work] ε=1 Randomized decision tree Ω~(2n/4) queries

Numerous works have studied lower bounds for the Forrelation problem. See Table 1 for a summary. The first classical lower bound for the Forrelation problem was established by Aaronson [1], who showed a Ω~(2n/4) lower bound when ε=0.05. This was improved to a Ω~(2n/2) lower bound for ε=2/πo(1) by Aaronson and Ambainis [2]. In their breakthrough result, Raz and Tal [35] proved lower bounds when ε=Θ(1/n). Although this choice of ε is smaller than in the previously mentioned works, the true strength of [35] lies in their classical lower bound which holds against a much more powerful model than classical query algorithms. Bansal and Sinha [9] strengthened this result by proving it in the regime of ε=Θ(1). It is worth emphasizing that a crucial aspect of this work – and a key reason they were able to resolve the conjecture of Aaronson and Ambainis on maximal separations – was their focus on lower bounds in the regime of ε=Θ(1) as opposed to ε=Θ(1/n). All of this underscores the difficulty and importance of proving lower bounds for the Forrelation problem, particularly as ε increases. In particular, as mentioned earlier, [2] have asked the following question (open question #4): how large ε can be for the Forrelation problem while remaining classical hard with 2Ω(n) queries? Our main theorem shows that even when ε is as large as can be (i.e., one), the Forrelation problem requires 2Ω(n) classical queries.

Efficient Oracle Separation

In [3], Aaronson and Chen ask, what happens if we consider quantum algorithms that can access an oracle, but we impose a constraint that the oracle has to be “physically realistic”? The motivation is to design a quantum advantage experiment by studying query complexity separations where the input oracles are implementable by small classical circuits. Motivated by this, we ask

How hard is it to estimate forr(f,g) when f,g are computable by poly(n)-sized circuits?

As an easy corollary of our main result (Corollary 5), we show that if (classically secure) one-way functions exist, then there is no classical poly(n)-time algorithm for the Forrelation problem, even if f,g are computable by polynomial-sized circuits. As mentioned before, our result is a query complexity lower bound that holds in the black-box setting where the algorithm can only query the truth tables of f,g and does not have an explicit description of the circuits computing f,g.

Table 2: Speedups with Exact Quantum Computation.
Exact Quantum Classical Lower Bound
Queries Queries Success Probability
Deutsch-Jozsa [21] 1 2Ω(n) 1
O(1) 2/3
Simon’s Problem [40, 13] O(n) (adaptive) 2Ω(n) 2/3
Welded Tree [17, 33] O(n2.5) (adaptive) 2Ω(n) 2/3
Order Finding (QFT) [34, 19] O(n) (parallel), can be made 1 2Ω(n) 2/3
Hidden Linear Structure (QFT) [20] O(n) (parallel), can be made 1 2Ω(n) 2/3
Extremal Forrelation Problem [this work] 1 2Ω(n) 2/3
Power of Exact Quantum Computation

Exact algorithms are algorithms that make no error, that is, they succeed with probability one. The power of exact quantum algorithms has been studied extensively in the past; see Table 2 for a summary. Numerous works have tried to understand the best possible separations between exact quantum and classical algorithms for total functions [6, 7, 15]. For partial functions, one of the earliest results is due to Deutsch–Jozsa [21], who showed that distinguishing between the constant function and a balanced function can be solved by an exact quantum algorithm with one query, but any zero-error randomized algorithm requires 2Ω(n) queries. However, the randomized query complexity drops to O(1) if the algorithm is allowed to err with small probability. The first exponential speedup over bounded-error randomized algorithms is demonstrated by Simon’s problem [40], which can be solved exactly [13] with O(n) quantum queries, but requires 2Ω(n) classical queries in the bounded-error model. Since then, there have been other problems showing a similar separation using Discrete Fourier transforms, with the additional advantage that all the queries can be made in parallel [19, 34, 20]. While many of these works describe their quantum algorithms as making one query, in their models one can retrieve the truth table values at n different points with just one query; this corresponds to making O(n) parallel queries in the standard model. In particular, they consider query access of the form |x|f(x) where f has n-bit outputs, whereas the standard model only allows single-bit outputs. At first glance, it appears that such quantum algorithms need to make Ω(n) queries. However, using this, one can obtain a new Boolean function on 2n-bit inputs with an exact one-query quantum algorithm such that every randomized algorithm requires 2Ω(n) queries222We thank the anonymous reviewer for TQC 2025 pointing this out.. The idea here is to replace f(x) by the Hadamard encoding of f(x) and to use the Bernstein-Vazirani algorithm. Our main result (Theorem 4) achieves a similar separation, arguably for a simpler function and a simpler quantum protocol.

1.3 Outlook & Future Directions

We now highlight a number of open questions inspired by our work.

Optimal Separations

The bounded-error randomized query complexity of Extremal Forrelation Problem remains to be understood. The results of [2, 14] implies a O~(2n/2) upper bound, while we prove an Ω~(2n/4) lower bound in Theorem 4. We conjecture that our lower bound can be improved to Ω~(2n/2), matching the upper bound from [2, 14]. This would recover the separation from [2], with the additional advantage that the quantum algorithm succeeds with probability one.

Conjecture 6.

The bounded-error randomized complexity of the Extremal Forrelation Problem is Ω~(2n/2).

We remark that Andrej Bogdanov and Yanbo Chen [12] have pointed out that for the particular μyes and μno distributions that we use to prove Theorem 4, there is a classical algorithm which makes 2n/4+o(n) queries and with high probability distinguishes a random instance (𝒇,𝒈) drawn from μyes (for which forr(𝒇,𝒈)=1) from a random instance (𝒇,𝒈) drawn from μno (for which forr(𝒇,𝒈)=1). Hence, any proof of Conjecture 6 must use a different construction than the one used in our proof of Theorem 4.

More broadly, one can ask about optimal separations between quantum and classical query complexity. This question was studied by [2], who introduced a variant of Forrelation called k-Forrelation whose quantum query complexity is k/2 and conjectured that its randomized query complexity is Ω~(2n(11/k)). When k=2, this problem is identical to the Forrelation problem as in Definition 3. They further conjectured that this is the best possible separation between bounded-error quantum and randomized query complexity. There have been a number of works on this topic [14, 41], culminating in the works of [37] and [9] that proved this conjecture. One can ask if a similar separation can be achieved by exact quantum algorithms. We conjecture that this can indeed be achieved.

Conjecture 7.

For k>2, the bounded-error randomized complexity of the extremal version of the k-Forrelation problem is Ω~(2n(11/k)).

Resolving this conjecture would shed light on the best possible separations between exact quantum and bounded-error randomized query complexity for partial functions.

Forrelation of Low-Degree Polynomials

Recently, there has been a lot of interest in studying the complexity of forrelation for instances in which f and g are degree-d 𝔽2-polynomials [24, 39]. It is not too difficult to define pairs of low-degree polynomials that have large Forrelation, for instance, in Definition 12, we can sample h to be a random degree-d 𝔽2-polynomial as opposed to a uniformly random Boolean function. This variant of Forrelation has been studied with the broad motivation of finding an efficient instantiation of an oracle which makes Forrelation classically hard. Towards this, researchers have tried to understand the computational complexity of estimating Forrelation for low-degree polynomials [24, 39].

We believe that understanding the query complexity of this version of Forrelation is independently interesting. In this setting, given query access to low-degree polynomials f,g, we wish to compute forr(f,g) using as few queries to the truth tables of f and g as possible. One trivial algorithm for this is to learn the polynomials using (nd) queries each and then compute Forrelation offline. We conjecture that this algorithm is more or less optimal.

Conjecture 8.

The Extremal Forrelation Problem where the inputs are restricted to degree-d 𝔽2-polynomials has randomized query complexity nΩ(d).

Proving this conjecture would give some evidence towards the hardness of computing forrelation even when the inputs are low-degree polynomials – it would indicate that computing the forrelation of low-degree polynomials is almost as hard as learning the polynomials.

Hardness Against Stronger Classical Models

One of the original motivations in [1] for introducing the Forrelation problem is that it was conjectured to be classically hard to compute, even for the relatively powerful model of 𝖠𝖢0 circuits. This conjecture was proved in the breakthrough result of Raz and Tal [35], resulting in an oracle separation between 𝖡𝖰𝖯 and 𝖯𝖧. We conjecture that a variant of the extremal Forrelation problem can be used to demonstrate a similar separation.

Conjecture 9.

Depth-d 𝖠𝖢0 circuits require exp(2Ω(n/d)) size to solve the Extremal Forrelation Problem.

We suspect that proving this conjecture would require studying a richer class of bent functions. We think this question is also interesting from the perspective of classical complexity theory, as it may lead to new techniques to prove 𝖠𝖢0 lower bounds.

1.4 Our Ideas

Before we go into our proof ideas, we first describe the existing approaches to proving Forrelation lower bounds and explain why they fail to work as ε tends to 1.

1.4.1 Why prior approaches don’t work for 𝜺=𝟏.

To prove randomized lower bounds on the Forrelation problem, one needs to produce hard distributions, i.e., distributions on the yes and no instances of the problem that are classically hard to distinguish. For now, let us focus on generating yes instances. It is not too difficult to generate a pair of real-valued functions f,g with large Forrelation and 𝔼[f2]=𝔼[g2]=1; indeed, given any nonzero function f, we may define g to be proportional to f^, with an appropriate normalization factor so as to satisfy forr(f,g)=1 and 𝔼[f2]=𝔼[g2]=1. The difficulty is in producing a pair of ±1-valued functions with large Forrelation. Existing techniques to generate such pairs of functions follow essentially the same framework:

  1. 1.

    For each x𝔽2n, independently sample 𝒇(x) according to the Gaussian distribution with mean 0 and variance Θ(ε). For each y𝔽2n, define 𝒈(y):=2n/2𝒇^(y).

  2. 2.

    For each x,y𝔽2n, independently round each 𝒇(x),𝒈(y) to ±1.

It is not difficult to show that Item 1 generates a distribution on pairs of real-valued functions that with high probability have large Forrelation, namely at least Θ(ε). The goal of Item 2 is to modify these functions to produce ±1-valued functions that continue to have large Forrelation, at least Θ(ε). This is where the choice of the rounding function and the parameter ε become crucial. The best parameters to date are obtained in [2], where Item 1 is as described above with ε=1 and the rounding in Item 2 is done using the sign function, which maps non-negative reals to 1 and negative reals to 1. Using properties of the Gaussian distribution, [2] show that the resulting Boolean functions (namely sign(𝒇),sign(𝒈)) have Forrelation at least 2/π in expectation.333An intuition for the 2/π factor is as follows. A basic fact about the Gaussian distribution is that for nearly-uncorrelated Gaussians 𝒇(x) and 𝒈(y), we have 𝔼[sign(𝒇(x))sign(𝒈(y))]2π𝔼[𝒇(x)𝒈(y)]. Since the Forrelation function is a linear combination of terms of the form 𝒇(x)𝒈(y), we get that 𝔼[forr(sign(𝒇),sign(𝒈))]2/π.

One might wonder if there is a different rounding procedure that produces a distribution on inputs with Forrelation very close to 1, but we suspect that this is not the case. The reason is that this framework is actually fairly oblivious to the underlying unitary matrix. In particular, one can consider the problem of estimating the Rorrelation function,

rorrU(f,g):=2nx,y𝔽2nf(x)g(y)Ux,y,

where U is any 2n×2n unitary matrix; note that when U=Hn this is identical to the Forrelation function. The Rorrelation problem was introduced and studied by [41]. It turns out that the problem of estimating rorrU(f,g) is classically hard for any unitary matrix whose entries are small (at most 1/2Ω(n) in magnitude), furthermore, this can be proved by using the above framework. In particular, if we sample a Haar random orthogonal matrix 𝑼, all of its entries will be small in magnitude and the framework described above will establish classical hardness of estimating rorr𝑼(f,g) up to a small global constant. On the other hand, the extremal version of the Rorrelation problem is vacuously easy for a Haar random orthogonal matrix 𝑼. In more detail, in the full version of the paper, we prove the following:

Claim 10.

With extremely high probability over a Haar random orthogonal matrix 𝑼,

maxf,g:𝔽2n{±1}|rorr𝑼(f,g)|<0.99. (2)

In other words, for a typical Haar random orthogonal matrix 𝑼, there are no ±1-valued functions for which rorr𝑼(f,g)=±1 and thus, the extremal version of the Rorrelation problem becomes vacuously easy. There is something special about the Forrelation problem and the Hadamard matrix which makes it possible for there to exist even one pair of ±1-valued functions (f,g) such that |forr(f,g)|=1.

It turns out that these extremal instances of the Forrelation problem arise from an important class of Boolean functions known as bent functions; we now describe this connection.

1.4.2 Connections to Bent Functions

Let us try to generate Boolean functions f,g:𝔽2n{±1} with as large Forrelation value as possible. To do this, let us first revisit the argument for why forr(f,g) cannot be larger than 1. Recall that

forr(f,g)2n/2y𝔽2nf^(y)g(y)2n/2yf^(y)2yg(y)2=1,

where the inequality is by Cauchy-Schwarz. For this to be tight, we need to set each g(y) to be a multiple of f^(y). Due to the normalization factor, it turns out that we need to set g(y)=2n/2f^(y). In particular, since g(y) is ±1, it implies that each Fourier coefficient f^(y) is ±2n/2. Boolean functions with this property, namely that all of the Fourier coefficients have equal magnitude, are known as bent Boolean functions. These are precisely the functions that give rise to extremal instances of the Forrelation problem. In more detail, as shown above any pair of functions (f,g) whose Forrelation value is 1 must arise from a bent function f; conversely, it is easy to see that any bent function f gives rise to a Boolean function g (namely g(y):=2n/2f^(y)) such that (f,g) has Forrelation 1.

Bent functions are extensively studied in Boolean function analysis. One important class of bent functions is the Maiorana-McFarland class of bent functions (see e.g. Section 6.1 of [16]). This family consists of all n-bit Boolean functions (1)f where f:𝔽2n𝔽2 is defined at x𝔽2n by f(x):=x𝟣,x𝟤+h(x𝟤) where n is even, x𝟣 and x𝟤 denote the first and second halves of x, and h:𝔽2n/2𝔽2 is any Boolean function. Here, x𝟣,x𝟤 is the inner product function (mod 2). It is not too difficult to show that every function of this form is bent (see [22]; a proof of this is implicit in the proof of our Lemma 13); intuitively, the inner product function itself is a bent function and adding any Boolean function that only depends on the second half of x preserves the bent-ness. We will use the Maiorana-McFarland family to construct hard instances for our lower bound.

Related Works

There have been a few works [36, 18, 5] that observe the close connections between bent functions and the hidden shift problem (Simon’s problem). Independently of our work, [23] observed the connection between bent functions and extremal instances of the Forrelation problem 444We thank anonymous reviewers for TQC 2025 for pointing this out..

A related problem is property testing for the class of bent Boolean functions. Here, we are given query access to a Boolean function f and the objective is to tell whether f is bent or Ω(1)-far from every bent function. This problem was studied by [11, 29]. In particular, [29] established an Ω(2n/4) lower bound on the bounded-error randomized query complexity of this problem and they used the Maiorana-McFarland family to construct hard distributions. In our problem, we are given truth table access to f and g that are promised to be bent and that satisfy either g=f^2n/2 or g=f^2n/2 and we wish to tell these apart. As both f and g are promised to be bent and there is a high degree of correlation between f and g, this significantly complicates the problem and we need to introduce some new ideas in our proof.

1.4.3 Proof Sketch

The high-level structure of our argument is reminiscent of the lower bound argument of [29] (see Section 4 of [29]). For ease of notation, let χ:𝔽2{±1} be the function mapping x(1)x. For a Boolean-valued function f:𝔽2n𝔽2, let χf denote the function (1)f.

To explain the ideas underlying our proof, we instead consider the problem of distinguishing forr(f,g)=1 from forr(f,g)=1/2n. It is simpler to describe a pair of hard distributions for this variant of the problem, so we focus on it in the current sketch; however, the main section of our paper directly tackles the case of forr(f,g)=1 versus forr(f,g)=1. For the distributions we work with in the proof overview (Equations 4 and 3), it is easier to pinpoint exactly where the classical hardness arises from, whereas the actual distributions we work with (Definition 12) have more complicated terms.

We will now define a distribution over pairs of functions (χf,χg) where f,g:𝔽2n𝔽2 are Boolean functions. For the yes distribution, we will sample χ𝒇 to be a bent function. As described before, this determines a unique ±1-valued function χ𝒈 (namely, χ𝒈:=2n/2χ𝒇^) such that forr(χ𝒇,χ𝒈)=1.555Throughout the paper we use bold font to indicate random variables; note that in the current context both 𝒇 and 𝒈 are random variables, albeit (completely) correlated ones. The process to generate 𝒇 consists of two steps:

  1. 1.

    Maiorana-McFarland family: First, sample a random bent function from the Maiorana-McFarland family. We use this step to ensure that we only produce bent functions – this is essential to construct instances with forr(χf,χg)=1.

  2. 2.

    Linear Transformation: We then apply an invertible linear transformation on the input variables; doing this simply permutes the Fourier coefficients, and hence preserves bent-ness. We will show this step sufficiently masks the input and “hides” the inner-product structure, which effectively prevents classical algorithms from detecting structural properties by reading just a few coordinates.

In contrast, for the no distribution, we will sample both f,g to be highly non-bent functions that mimic the Maiorana-McFarland family of bent functions. We now give a more formal description of this process. Sample 𝑨𝔽2n×n to be a random invertible matrix, 𝒉:𝔽2n/2𝔽2 to be a uniformly random function, and let 𝑩=(𝑨T)1. For the yes distribution, define

𝒇(x):=𝑨𝟣x,𝑨𝟤x+𝒉(𝑨𝟤x)and 𝒈(y):=𝑩𝟣y,𝑩𝟤y+𝒉(𝑩𝟣y), (3)

where M𝟣,M𝟤 denote the upper and lower halves of a matrix M. For the no distribution,

𝒇(x):=𝒉(𝑨𝟤x)and 𝒈(y):=𝒉(𝑩𝟣y), (4)

We then let (χ𝒇,χ𝒈) be the inputs to the Forrelation problem. It is not too difficult to show that 𝒇,𝒈 in Equation 3 satisfy forr(χ𝒇,χ𝒈)=1 and similarly, in Equation 4 satisfy forr(χ𝒇,χ𝒈)=1/2n. The proof of this involves analyzing the Fourier transform of the Maiorana-McFarland family and applying a suitable change of variables; a more general version of this statement is proved in Lemma 13.

Classical Hardness

We will now give some intuition as to why these distributions are hard to distinguish by classical algorithms. Consider a randomized decision tree of depth that distinguishes these distributions with large advantage. By Yao’s minmax principle, fixing the randomness of this algorithm, there exists a deterministic decision tree with the same properties. For now, let us assume that the tree is non-adaptive, that is, it fixes a set of points and this same set is queried along every root-to-leaf path. (This assumption turns out to be not so important, and the analysis for general adaptive algorithms closely follows the non-adaptive case.) Let x1,,xk𝔽2n and y1,,yk𝔽2n be the points queried by the non-adaptive decision tree. Here, xi and yj indicate points in the truth table of 𝒇 and 𝒈 respectively and k{0,,}; in other words, the algorithm just queries 𝒇(x1),,𝒇(xk) and 𝒈(y1),,𝒈(yk). We can assume without loss of generality that the xi are distinct and similarly the yj are distinct (there may be collisions between xi and yj). We will assume for the proof sketch that none of xi,yj are zero, since such queries are useless in distinguishing the yes and no distributions (as the outcomes are the same for both distributions).

Let us look at the contribution of 𝒉() to each of 𝒇(x1),,𝒇(xk) and 𝒈(y1),,𝒈(yk). Recalling the definition of the yes and no distribution in Equations 3 and 4, it is easy to see that for both these distributions, the contribution of 𝒉() is given by

𝒉(𝑨𝟤x1),,𝒉(𝑨𝟤xk)and𝒉(𝑩𝟣y1),,𝒉(𝑩𝟣yk). (5)

In particular, the sequence of points on which 𝒉() is implicitly queried is

𝑨𝟤x1,,𝑨𝟤xkand𝑩𝟣y1,,𝑩𝟣yk. (6)

The main technical lemma of our work shows that if 2cn, for a suitable absolute constant c>0, then with high probability Equation 6 is a sequence of distinct points. (This is not necessarily the case if some of xi,yj are allowed to be zero, but as we argued before, such queries are useless and we can assume without loss of generality that xi,yj0.) A version of this is formalized in Lemma 17 and Lemma 18. Whenever this is the case, the sequence of outcomes in Equation 5 consists of uniform and independent random bits – indeed, 𝒉 is a uniformly random function and hence its evaluations on distinct points are independent and uniform. This happens for both the yes and the no distributions and as a result, any decision tree of depth 2cn cannot sufficiently distinguish these distributions. We now sketch the proof that Equation 6 is a distinct sequence with high probability.

Collision Probability Analysis

This is done in Section 3.1. We will analyze the probability that any two points in Equation 6 are equal and show that it is at most 2Ω(n). We then apply a union bound over all pairs of points (there are at most 2 pairs) to conclude that with high probability, at least 122Ω(n)2/3, the sequence of points in Equation 6 is distinct. This along with the above paragraph would complete the proof.

Collisions within the 𝑨𝟤xi are significantly easier to analyze; the matrix 𝑨 is distributed according to a uniformly random invertible matrix, and hence intuitively the lower half 𝑨𝟤 has enough entropy to make collisions of the form 𝑨𝟤xi=𝑨𝟤xi highly unlikely. Collisions within the 𝑩𝟣yj can be similarly controlled. We prove this in Lemma 18. Collisions between 𝑨𝟤xi and 𝑩𝟣yj are significantly harder to analyze, since 𝑨𝟤 and 𝑩𝟣 are correlated with each other due to the relationship 𝑩=(𝑨T)1, but this can nevertheless be shown.

The General Case: Forrelation 𝟏 versus 𝟏

We now describe the additional ideas required to prove the classical hardness of the Extremal Forrelation Problem. In order to generate instances with forr(f,g)=1 and forr(f,g)=1, we are forced to sample f to be a bent function and we are forced to set g so that either χg=+χf^ or χg=χf^. If we were to naively modify the aforementioned yes distribution to a no distribution by letting χg=χf^, then the resulting yes and no distributions would become easy to classically distinguish: querying the yes distribution on x=0 and y=0 would produce identical answers, namely (1)𝒉(0) and (1)𝒉(0) while querying the no distribution on these points would produce different answers, namely (1)𝒉(0) and (1)𝒉(0). In order to get around this issue, in Item 2 of our actual hard distributions we need to apply an affine transformation on the input variables, instead of a linear transformation. This has the effect of shifting the origin, which intuitively means that the classical algorithm “does not know” which point to query in order to see this correlated pair of answers. Remarkably, applying this random shift also simplifies the collision probability analysis in the case of collisions between 𝑨𝟤xi and 𝑩𝟣yj. For more details, see Definition 12.

1.4.4 Organization

We describe our hard distributions in Section 2. We prove that these are indeed valid distributions (Lemma 13) and prove some some key properties about them (Lemmas 14 and 16). In Section 3, we present the main technical lemma (Lemma 17) and prove the main theorem assuming this in Section 3.2. In Section 3.1, we prove the main lemma.

2 Hard Distributions for the Extremal Forrelation Problem

As described in the introduction, the hard instances of our problem will be based on the Maiorana-McFarland family (see e.g. Section 6.1 of [16]) of bent functions – this family consists of “inner-product-like” functions. To sample our hard instances, we will first sample an affine shift, followed by a random “inner-product-like” function under this affine shift. We describe this in more detail below.

2.1 Descriptions of Hard Distributions

Notation

For simplicity of notation, define the function χ:𝔽2{±1} mapping x to (1)x. For any Boolean-valued function f:𝔽2n𝔽2, let χf:𝔽2n{±1} denote the function χ(f)(x)=(1)f(x). For matrices A,B𝔽2n×n, let A=[A𝟣A𝟤] and B=[B𝟣B𝟤] where A𝟣,A𝟤 and B𝟣,B𝟤 are n/2×n matrices representing the upper and lower halves of A and B respectively. We use x,y to denote the inner product (mod 2) between vectors x,y𝔽2n.

We now proceed to the description of the hard distributions. We first define a joint distribution on 4-tuples (A,B,a,b) where A,B are matrices and a,b are affine shifts (vectors) as follows.

Definition 11.

Let 𝐀𝔽2n×n be a uniformly random matrix of full rank and let 𝐁:=(𝐀T)1. Let 𝐚𝔽2n be a uniformly vector and let 𝐛=𝐁T[𝐁𝟤𝐁𝟣]𝐚.

Let be the induced distribution on (𝑨,𝑩). We use 𝟤𝟣, 𝟤 and 𝟣 to denote the induced distribution on (𝑨𝟤,𝑩𝟣), 𝑨𝟤 and 𝑩𝟣 respectively. We now define the hard instances of the Extremal Forrelation Problem for a family of functions .

Definition 12 (Hard Instances of Extremal Forrelation Problem).

Let be any collection of boolean functions mapping 𝔽2n/2 to {0,1}. Sample (𝐀,𝐁,𝐚,𝐛) as in Definition 11. Sample 𝐡:𝔽2n/2{0,1} to be a uniformly random Boolean function in . Define Boolean functions 𝐟,𝐠:𝔽2n𝔽2 as follows.

𝒇(x):=𝑨𝟣x,𝑨𝟤x+x,𝒂+𝒉(𝑨𝟤x)𝒈(y):=𝑩𝟣y,𝑩𝟤y+y,𝒃+𝒉(𝑩𝟣y+𝑩𝟣𝒂)+𝑩𝟣𝒂,𝑩𝟤𝒂.

Let μyes and μno be the induced distributions on (χ𝐟,χ𝐠) and (χ𝐟,χ𝐠) respectively for h.

We will later instantiate in various ways. The following lemma shows that regardless of , these are indeed valid distributions, that is, μyes and μno are indeed supported on the yes and no instances of the Extremal Forrelation Problem.

Lemma 13.

For (χf,χg) in the support of μyes and μno, we have forr(f,g)=1 and forr(f,g)=1 respectively (regardless of ).

Proof of Lemma 13.

Fix any A,B,a,b as in Definition 12 and let f,g be as specified in Definition 12. For any y𝔽2n , consider

χf^(y) =12nz𝔽2nχf(z)χ(z,y)
=12nz𝔽2nχ(f(z)+z,y)
=12nz𝔽2nχ(A𝟣z,A𝟤z+z,a+h(A𝟤z)+z,y).

Since A is invertible we can do a change of variables xAz, which in particular gives us x𝟣A𝟣z and x𝟤A𝟤z. This lets us rewrite the above as

χf^(y)=12nx𝔽2nχ(x𝟣,x𝟤+A1x,a+h(x𝟤)+A1x,y).

We now observe that A1x,=x,B, since (A1)T=B. Substituting this above, we see that

χf^(y)=12nx𝔽2nχ(x𝟣,x𝟤+x,Ba+h(x𝟤)+x,By).

We now express x,B as x𝟣,B𝟣+x𝟤,B𝟤. Thus,

χf^(y)=12nx𝔽2nχ(x𝟣,x𝟤+x𝟣,B𝟣a+x𝟤,B𝟤a+h(x𝟤)+x𝟣,B𝟣y+x𝟤,B𝟤y).

We now group the terms based on x𝟣.

χf^(y) =12nx𝔽2nχ(x𝟣,x𝟤+B𝟣a+B𝟣y+x𝟤,B𝟤a+B𝟤y+h(x𝟤))
=12nx𝟤𝔽2n/2x𝟣𝔽2n/2χ(x𝟣,x𝟤+B𝟣a+B𝟣y)χ(x𝟤,B𝟤a+B𝟤y+h(x𝟤)).

Observe that the first term χ(x𝟣,x𝟤+B𝟣a+B𝟣y) when summed over all x𝟣𝔽2n/2 is non-zero only if x𝟤+B𝟣a+B𝟣y=0 (equivalently, x𝟤=B𝟣a+B𝟣y); and if it is non-zero, it equals 2n/2. Thus, we have

χf^(y) =12n/2χ(B𝟣a+B𝟣y,B𝟤a+B𝟤y+h(B𝟣a+B𝟣y)).

We now expand the terms in the R.H.S. to obtain

χf^(y) =12n/2χ(B𝟣y,B𝟤y+[B𝟣B𝟤]y,[B𝟤B𝟣]a+h(B𝟣a+B𝟣y)+B𝟣a,B𝟤a)
=12n/2χ(B𝟣y,B𝟤y+y,[B𝟣TB𝟤T][B𝟤B𝟣]a+h(B𝟣a+B𝟣y)+B𝟣a,B𝟤a).

Recall that we have b=BT[B𝟤B𝟣]a=[B𝟣TB𝟤T][B𝟤B𝟣]a. Substituting this in the above equation, we have

χf^(y) =12n/2χ(B𝟣y,B𝟤y+y,b+h(B𝟣a+B𝟣y))+B𝟣a,B𝟤a.

Recalling the definition of g(y) from Definition 12, we see that

χf^(y) 12n/2χg(y). (7)

We now use the defining equation for the Forrelation function (Equation 1) to get

forr(χf,χg)12n/2y𝔽2nχf^(y)χg(y).

Combining this and Equation 7, we see that forr(χf,χg)=12nyχg(y)2=1. It can be similarly shown that forr(χf,χg)=12ny(1)χg(y)2=1. This completes the proof of Lemma 13.

2.2 Characterizing the Marginals of the Distributions

Recall that we used to denote the induced distribution on (𝑨,𝑩), and 𝟤 and 𝟣 to denote the induced distribution on 𝑨𝟤 and 𝑩𝟣 respectively. In this section, we will characterize the marginal distributions 𝟣 and 𝟤, which turn out to be identical, as follows:

Lemma 14.

Each of the distributions 𝟣 and 𝟤 is precisely the uniform distribution over all matrices in 𝔽2n/2×n that have full row-rank.

We then show the following fact about the row-rank of a uniformly random rectangular matrix.

Fact 15.

A uniformly random matrix in 𝔽2n/2×n has full row-rank with probability at least 1(n/2)2n/2.

As an immediate corollary of this and Lemma 14, we obtain the following.

Corollary 16.

Each of the distributions 𝟣 and 𝟤 is (n/2)2n/2close to the uniform distribution over 𝔽2n/2×n in total variational distance.

We now give the proofs of Lemma 14 and Fact 15.

Proof of Lemma 14.

We will prove this for the distribution 𝟣 and the argument for 𝟤 is identical. Recall that 𝑩 is sampled according to a uniformly random full-rank matrix. Fix any full row-rank matrix B𝟣. We will count the number of matrices B𝟤 such (B𝟣,B𝟤) has full rank. We will show that this number is precisely (2n2n/2)×(2n2n/2+1)××(2n/22n1). This would complete the proof.

To count the number of B𝟤, we first count the number of possibilities for each row of B𝟤. Firstly, each row of B𝟤 must not lie in the span of the previous rows of B𝟤 and the rows of B𝟣. The first row of B𝟤 is any vector not in the span of the rows of B𝟣 and thus has 2n2n/2 possibilities. Having fixed this, the second row of B𝟤 can be any vector that is not in the span of the first row of B𝟤 and the rows of B𝟣, and thus has 2n2n/2+1 possibilities. We repeat this argument and at the i-th step, we choose a vector that is not in the n/2+i1-dimensional space spanned by the first i1 rows of B𝟤 and the n/2 rows of B𝟣; this can be done in 2n2n/2+i1 ways. Doing this for all n/2 rows shows that the total number of possibilities for B𝟤 is precisely i=1n/2(2n2n/2+i1) and this completes the proof of Lemma 14.

Proof of Fact 15.

We perform a calculation identical to that in Lemma 14. By a similar argument, we can show that the number of matrices in 𝔽2n/2×n with full row-rank is precisely

i=1n/2(2n2i1).

Thus, the probability that a uniformly random n/2×n matrix is of full row-rank is precisely

i=1n/2(2n2i1)2n2/2=i=1n/2(2n2i12n)(12n/2)n/21(n/2)2n/2.

This completes the proof of Fact 15.

3 Classical Lower Bound

The main technical ingredient in the classical lower bound will be Lemma 17. We will describe this lemma and its proof in Section 3.1. We will then prove Theorem 4 in Section 3.2 assuming this.

3.1 Statement of Lemma 17 and its Proof

The main technical ingredient is the following.

Lemma 17.

Let x,y{0,1}n be any two vectors. Then,

𝐏𝐫(𝑨𝟤,𝑩𝟣)𝟤𝟣𝒂𝔽2n[𝑨𝟤x=𝑩𝟣y+𝑩𝟣𝒂]=2n/2.

In the above lemma, (𝑨𝟤,𝑩𝟣)𝟤𝟣 and 𝒂𝔽2n are sampled independently, just as in Definition 11. We will also require the following lemma.

Lemma 18.

For xx𝔽2n and yy𝔽2n, we have

𝐏𝐫𝑨𝟤𝟤[𝑨𝟤x=𝑨𝟤x](n/2+1)2n/2
𝐏𝐫𝑩𝟣𝟣[𝑩𝟣y=𝑩𝟣y](n/2+1)2n/2.

We now prove these two lemmas.

Proof of Lemma 17.

Let be the event 𝑨𝟤x=𝑩𝟣y+𝑩𝟣𝒂. This is equivalent to 𝑩𝟣𝒂=𝑨𝟤x+𝑩𝟣y. Now, regardless of what x,y are, the vector 𝒂 is distributed as a uniformly random vector in 𝔽2n that is independent of 𝑨,𝑩, because 𝒂 is sampled uniformly and independently of 𝑨,𝑩. Furthermore, 𝑩𝟣 has full row-rank. Therefore, fixing 𝑨=A and 𝑩=B, we see that the probability over 𝒂 that B𝟣𝒂 is equal to A𝟤x+B𝟣y is precisely 2n/2. This completes the proof.

Proof of Lemma 18.

We prove this lemma for 𝑨𝟤𝟤 and the proof of 𝑩𝟣𝟣 is identical. The event we wish to bound the probability of is [𝑨𝟤(xx)=0]. We use Corollary 16 to conclude that the total variational distance between 𝟤 and the uniform distribution is at most (n/2)2n/2. Let us now work with a uniformly random matrix 𝑨𝟤. Since (xx)0, the vector 𝑨𝟤(xx) is a uniformly random vector in 𝔽2n/2. Therefore, the probability that it is zero is at most 2n/2. This completes the proof.

We now complete the proof of Theorem 4 using these lemmas.

3.2 Proof of Theorem 4

Let be any family of all n/2-variate boolean functions. Consider a classical randomized query protocol for the Extremal Forrelation Problem with D queries. Recall from Lemma 13 that μyes and μno are supported on the yes and no instances of the Extremal Forrelation Problem. Given a randomized query protocol with D queries for the Extremal Forrelation Problem that succeeds with at least 2/3 probability, by Yao’s principle, there exists a deterministic decision tree of depth D that distinguishes μyes and μno with advantage at least 1/3.

Given such a deterministic decision tree of depth D2n/4/6 and a root-to-leaf path 𝒫 of length in the tree, each node in the path 𝒫 corresponds to either a query of the form f(x) (where the truth table of f is probed), or a query of the form g(y) (where the truth table of g is probed). We assume without loss of generality that the query vectors for f are distinct and similarly the query vectors for g are distinct (there may be common query vectors which are given to both f and g). We will then use the following claims.

Claim 19.

Consider any deterministic decision tree of depth D2n/4/(6n), and fix any root-to-leaf path 𝒫 of length D in the tree. Let x(1),,x(k)𝔽2n and y(1),,y(k)𝔽2n be the sequence of vectors queried. Let 𝑨,𝑩,𝒂 be distributed as in Definition 11, and consider the sequence of points

𝑨𝟤x(1),,𝑨𝟤x(k)and𝑩𝟣y(1)+𝑩𝟣𝒂,,𝑩𝟣y(k)+𝑩𝟣𝒂. (8)

Let be the event that (8) is a sequence of distinct points. Then, Prμyes[],Prμno[]9/10.

Claim 20.

Let be the family of all n/2-variate boolean functions. Under the same hypothesis as Claim 19, whenever occurs, the probability of taking path 𝒫 for the distributions μyes| and μno| is exactly 2.

Remark

Claim 20 is the only place where the properties of come into play – every other part of the proof is independent of .

Once we have these claims, the proof follows quite easily. Let be the family of all n/2-variate boolean functions. Let us look at the induced distributions μyesleaf, and μnoleaf, on the leaves of the decision tree when the inputs are sampled according to μyes and μno respectively and let μ be the distribution on the leaves induced by a truly random walk down the tree. Claim 19 and Claim 20 imply that for any leaf, the probability that μyesleaf, and μnoleaf, assign to that leaf are each at least 9/10 times the probability assigned by μ. This implies that there exists distributions μ~yesleaf,,μ~noleaf, on the leaves such that

μyesleaf,=910μ+110μ~yesleaf,,
μnoleaf,=910μ+110μ~noleaf,.

This implies that the total variational distance between μyesleaf, and μnoleaf, is at most 1/10 and hence, the output distributions of the decision tree on inputs sampled according to μyes and μno differ in total variational distance by at most 1/10. This contradicts the assumption that the decision tree distinguishes these distributions with advantage at least 1/3. This completes the proof of Theorem 4 assuming Claim 19 and Claim 20. We will now prove these claims.

3.3 Proof of Claim 19 and Claim 20

We now proceed to the proof of Claim 19, which is where we will use Lemma 17 and Lemma 18. We will then prove Claim 20 which relies primarily on the properties of .

Proof of Claim 19.

We first analyze the probability of when the distribution on inputs is μyes. The calculation for μno is identical and is omitted. Note that (8) is a random sequence of points (where the randomness comes from 𝑨,𝑩 and 𝒂). As stated in Claim 20, let be the event that this is a sequence of distinct points. We will now argue that is a high-probability event.

First, let us bound the probability of collisions within the 𝑨𝟤x(i). Fix any ii[k]. We apply Lemma 18 to the vectors x(i)x(i). Lemma 18 implies that

𝐏𝐫[𝑨𝟣x(i)=𝑨𝟣x(i)](n/2+1)2n/2.

We now apply a union bound over i,i[k]. There are at most k2 possibilities to union bound over. This implies that with probability at least 1k2(n/2+1)2n/2, the sequence of points

𝑨𝟤x(1),,𝑨𝟤x(k)

is a sequence of k distinct points. We can similarly argue about collisions within the 𝑩𝟣y(j) to conclude that with probability at least 1(k)2(n/2+1)2n/2, the sequence of points

𝑩𝟣y(1)+𝑩𝟣𝒂,,𝑩𝟣y(k)+𝑩𝟣𝒂

is a sequence of k distinct points. Finally, we argue about collisions between pairs of the form 𝑨𝟤x(i) and 𝑩𝟣y(j)+𝑩𝟣𝒂. Fix any i[k] and j[k]. We apply Lemma 17 to the vectors x(i) and y(j). Lemma 17 implies that

𝐏𝐫[𝑨𝟤x(i)=𝑩𝟣y(j)+𝑩𝟣𝒂]2n/2.

We now apply a union bound over (i,j). There are at most k(k) possibilities to union bound over. This implies that with probability at least 1k(k)2n/2, there are no collisions among pairs of the form 𝑨𝟤x(i) and 𝑩𝟣y(j)+𝑩𝟣𝒂. So by a union bound, the total probability of the bad event ¬ is at most

k2(n/2+1)2n/2+(k)2(n/2+1)2n/2+k(k)42n/23D2n2n/2.

for large enough n. Recall that we set D2n/4/(6n), so we get that

3D2n2n/232n/2136nn2n/2<110.

This shows that 𝐏𝐫μyes[]9/10. The argument for μno is identical. This completes the proof of Claim 19.

Proof of Claim 20.

We first analyze the probability of receiving any particular sequence of outcomes when the distribution on inputs is μyes. The calculation for μno is identical and is omitted.

Recall the μyes distribution. This is obtained by sampling 𝑨,𝑩,𝒂,𝒃 as in Definition 12, sampling 𝒉 uniformly at random from , and outputting (χ𝒇,χ𝒈) where 𝒇,𝒈:𝔽2n𝔽2 are defined as:

𝒇(x):=𝑨𝟣x,𝑨𝟤x+x,𝒂+𝒉(𝑨𝟤x)𝒈(y):=𝑩𝟣y,𝑩𝟤y+y,𝒃+𝒉(𝑩𝟣y+𝑩𝟣𝒂)+𝑩𝟣𝒂,𝑩𝟤𝒂.

Along the path 𝒫, we have queried 𝒇(x(1)),,𝒇(x(k)) and 𝒈(y(1)),,𝒈(y(k)). Let us consider the contribution of 𝒉() to these query responses. This is given by evaluating 𝒉() on the following sequence of points

𝑨𝟤x(1),,𝑨𝟤x(k)and𝑩𝟣y(1)+𝑩𝟣𝒂,,𝑩𝟣y(k)+𝑩𝟣𝒂

which is precisely the sequence given in Equation 8. We will now argue that when happens, the probability of taking this path under μyes (and similarly μno) is precisely 2. Let us compute the probability of taking the path 𝒫 under μyes conditioned on happening. When happens, we have that the sequence of points

𝑨𝟤x(1),,𝑨𝟤x(k)and𝑩𝟣y(1)+𝑩𝟣𝒂,,𝑩𝟣y(k)+𝑩𝟣𝒂

is a sequence of distinct points. We now observe that the evaluations of the function 𝒉 on these points are independent and uniformly random bits in {0,1}. In other words,

𝒉(𝑨𝟤x(1)),,𝒉(𝑨𝟤x(k))and𝒉(𝑩𝟣y(1)+𝑩𝟣𝒂),,𝒉(𝑩𝟣y(k)+𝑩𝟣𝒂)

is a sequence of uniformly random bits in {0,1} when 𝒉. Thus, the probability of receiving any particular sequence of outcomes when querying the truth tables of 𝒇,𝒈 at the vectors x(1),,x(k), y(1),,y(k) is either exactly 2. This completes the proof.

3.4 Proof of Corollary 5

Proof of Corollary 5.

Let be the family of all boolean functions {h:{0,1}n/2{0,1}}. It is well known [30, 28] that if one-way functions exist, then we can construct a family of pseudorandom functions :={hλ:{0,1}n/2{0,1}}λ{0,1}k(n) with k(n)=poly(n) such that

  • Efficiency: there is a poly(n)-sized classical circuit that computes hλ(x) given inputs λ{0,1}k(n) and x{0,1}n/2.

  • Security: Any classical polynomial-time algorithm 𝒜 that queries the truth-table of an n/2-bit Boolean function cannot sufficiently distinguish a uniformly random function in from a truly uniformly random function, i.e.,

    |𝐄𝝀{0,1}k(n)[𝒜h𝝀(x)(1n)]𝐄𝒉[𝒜𝒉(x)(1n)]|negl(n).

Let μyes and μno be the hard distributions as in Definition 12 defined with respect to and similarly μyes and μno be defined with respect to .

Implementability of 𝒇,𝒈.

For any pair (f,g) drawn from either μyes or μno, the functions f,g are computable by polynomial-sized circuits. In more detail, for a fixed draw of A,B,a,b,λ, the circuit for f takes input x, first computes the inner products A𝟣x,A𝟤x and x,a, then computes the vector A𝟤x, and finally applies the circuit for hλ on this vector and thus computes f(x)=A𝟣x,A𝟤x+x,a+hλ(A𝟤x). All of these operations can be implemented by poly(n)-sized classical circuits and the circuit for g is analogous. We will now show that under the cryptographic assumption, there is no classical algorithm that distinguishes μyes and μno with poly(n) queries.

Classical Indistinguishability of 𝒇,𝒈.

Let 𝒜 be any classical algorithm that makes at most poly(n) queries to the truth tables of f and g. Theorem 4 shows that 𝒜 cannot distinguish μyes and μno with advantage more than negl(n)666While the statement of Theorem 4 only refers to 1/3 advantage, the proof (19) shows that the advantage of a classical algorithm in solving the Extremal Forrelation Problem scales as 2n/2 times D2 where D is the number of queries. In particular, for poly(n)-time algorithms (which make at most at most poly(n) queries), this advantage is negl(n).. We will now show that under the cryptographic assumption, 𝒜 cannot distinguish μyes and μyes with more than negl(n) advantage. By the same argument, an analogous statement holds for the distributions μno and μno. Consequently, by the triangle inequality, we get that 𝒜 cannot distinguish μyes and μno with more than negl(n) advantage and this completes the proof.

To see that 𝒜 cannot distinguish μyes and μyes with more than negl(n) advantage, we show that any such algorithm can be turned into a distinguisher for and . Indeed, consider the algorithm 𝒜 that given query access to an unknown n/2-bit boolean function h, first samples (𝑨,𝑩,𝒂,𝒃) as in Definition 11 and runs 𝒜 on the pair of functions (f,g) as defined in the yes distribution in Definition 12. Similarly to the argument before, each query to f at the point x can be simulated by making a query to h at the point A𝟤x and computing f(x)=A𝟣x,A𝟤x+x,a+h(A𝟤x), and similarly for g. Thus, a distinguisher for μyes and μyes can be turned into one for and with the same number of queries and same advantage. This completes the proof.

References

  • [1] Scott Aaronson. BQP and the polynomial hierarchy. In Proceedings of the 61st IEEE Annual Symposium on Foundations of Computer Science, pages 141–150, 2010. doi:10.1145/1806689.1806711.
  • [2] Scott Aaronson and Andris Ambainis. Forrelation: A problem that optimally separates quantum from classical computing. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, pages 307–316, 2015. doi:10.1145/2746539.2746547.
  • [3] Scott Aaronson and Lijie Chen. Complexity-Theoretic Foundations of Quantum Supremacy Experiments. In 32nd Computational Complexity Conference (CCC 2017), pages 22:1–22:67, 2017. doi:10.4230/LIPIcs.CCC.2017.22.
  • [4] Scott Aaronson, DeVon Ingram, and William Kretschmer. The acrobatics of BQP. In Proceedings of the 37th Computational Complexity Conference, pages 20:1–20:17, 2022. doi:10.4230/LIPIcs.CCC.2022.20.
  • [5] Serge Adonsou, Peter Bruin, Maris Ozols, and Joppe Stokvis. Hidden shift problem for complex functions. Available at https://arxiv.org/pdf/2507.19440, 2025.
  • [6] Andris Ambainis. Superlinear advantage for exact quantum algorithms. SIAM J. Comput., 45:617–631, 2012.
  • [7] Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha, and Juris Smotrovs. Separations in query complexity based on pointer functions. J. ACM, 64(5), September 2017. doi:10.1145/3106234.
  • [8] Srinivasan Arunachalam and Uma Girish. Trade-Offs Between Entanglement and Communication. In 38th Computational Complexity Conference (CCC 2023), pages 25:1–25:23, 2023. doi:10.4230/LIPIcs.CCC.2023.25.
  • [9] Nikhil Bansal and Makrand Sinha. k-forrelation optimally separates quantum and classical query complexity. In Samir Khuller and Virginia Vassilevska Williams, editors, Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 1303–1316, 2021. doi:10.1145/3406325.3451040.
  • [10] Ethan Bernstein and Umesh Vazirani. Quantum complexity theory. SIAM Journal on Computing, 26(5):1411–1473, 1997. doi:10.1137/S0097539796300921.
  • [11] Abhishek Bhrushundi, Sourav Chakraborty, and Raghav Kulkarni. Property testing bounds for linear and quadratic functions via parity decision trees. In Computer Science - Theory and Applications - 9th International Computer Science Symposium in Russia, pages 97–110, 2014. doi:10.1007/978-3-319-06686-8_8.
  • [12] Andrej Bogdanov and Yanbo Chen, 2025. Personal communication.
  • [13] Gilles Brassard and Peter Høyer. An exact quantum polynomial-time algorithm for simon’s problem. In Fifth Israel Symposium on Theory of Computing and Systems, ISTCS 1997, pages 12–23, 1997. doi:10.1109/ISTCS.1997.595153.
  • [14] Sergey Bravyi, David Gosset, Daniel Grier, and Luke Schaeffer. Classical algorithms for forrelation. Available as arXiv e-print at https://arxiv.org/abs/2102.06963, 2022. arXiv:2102.06963.
  • [15] Harry Buhrman, Richard Cleve, Ronald de Wolf, and Christof Zalka. Bounds for small-error and zero-error quantum algorithms. In Proceedings of 40th Annual Symposium on Foundations of Computer Science, pages 358–368, 1999. doi:10.1109/SFFCS.1999.814607.
  • [16] Claude Carlet and Sihem Mesnager. Four decades of research on bent functions. Des. Codes Cryptogr., 78:78:5–78:50, 2016. doi:10.1007/S10623-015-0145-8.
  • [17] Andrew M. Childs, Richard Cleve, Enrico Deotto, Edward Farhi, Sam Gutmann, and Daniel A. Spielman. Exponential algorithmic speedup by a quantum walk. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing (STOC 2003), pages 59–68, 2003. doi:10.1145/780542.780552.
  • [18] Andrew M. Childs, Robin Kothari, Maris Ozols, and Martin Roetteler. Easy and hard functions for the Boolean hidden shift problem. In 8th Conference on the Theory of Quantum Computation, Communication and Cryptography, TQC 2013, pages 50–79, 2013. doi:10.4230/LIPIcs.TQC.2013.50.
  • [19] Richard Cleve. The query complexity of order-finding. Information and Computation, 192(2):162–171, 2004. doi:10.1016/J.IC.2004.04.001.
  • [20] J. Niel de Beaudrap, Richard Cleve, and John Watrous. Sharp quantum vs. classical query complexity separations. Available as arXiv e-print at https://arxiv.org/abs/quant-ph/0011065, 2001. arXiv:quant-ph/0011065.
  • [21] David Deutsch and Richard Jozsa. Rapid solution of problems by quantum computation. Proceedings of the Royal Society of London. Series A: Mathematical and Physical Sciences, 439:553–558, 1992. URL: https://api.semanticscholar.org/CorpusID:121702767.
  • [22] J.F. Dillon. Elementary Hadamard Difference Sets. PhD thesis, University of Maryland, College Park, 1974.
  • [23] Suman Dutta, Subhamoy Maitra, and Chandra Sekhar Mukherjee. Following forrelation – quantum algorithms in exploring Boolean functions’ spectra. Advances in Mathematics of Communications, 18(1):1–25, 2024. doi:10.3934/amc.2021067.
  • [24] Alexandru Georghiu. Verifiable quantum advantage: old and new ideas, 2025. Link to recording of talk: https://youtu.be/7NqAcaSwKf8?si=2ZBkEkbiVaUMSzMJ.
  • [25] Uma Girish, Ran Raz, and Avishay Tal. Quantum versus randomized communication complexity, with efficient players. Comput. Complex., 31(2):17, 2022. doi:10.1007/S00037-022-00232-7.
  • [26] Uma Girish, Ran Raz, and Wei Zhan. Lower Bounds for XOR of Forrelations. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021), pages 52:1–52:14, 2021. doi:10.4230/LIPIcs.APPROX/RANDOM.2021.52.
  • [27] Uma Girish, Makrand Sinha, Avishay Tal, and Kewen Wu. The power of adaptivity in quantum query algorithms. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 1488–1497, 2024. doi:10.1145/3618260.3649621.
  • [28] Oded Goldreich, Shafi Goldwasser, and Silvio Micali. How to construct random functions. J. ACM, 33(4):792–807, August 1986. doi:10.1145/6490.6503.
  • [29] Elena Grigorescu, Karl Wimmer, and Ning Xie. Tight lower bounds for testing linear isomorphism. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques - 16th International Workshop, APPROX 2013, and 17th International Workshop, RANDOM 2013, pages 559–574, 2013. doi:10.1007/978-3-642-40328-6_39.
  • [30] Johan HÅstad, Russell Impagliazzo, Leonid A. Levin, and Michael Luby. A pseudorandom generator from any one-way function. SIAM Journal on Computing, 28(4):1364–1396, 1999. doi:10.1137/S0097539793244708.
  • [31] William Kretschmer, Luowen Qian, Makrand Sinha, and Avishay Tal. Quantum cryptography in Algorithmica. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, pages 1589–1602, 2023. doi:10.1145/3564246.3585225.
  • [32] William Kretschmer, Luowen Qian, and Avishay Tal. Quantum-computable one-way functions without one-way functions. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, pages 189–200, 2025. doi:10.1145/3717823.3718144.
  • [33] Guanzhong Li, Lvzhou Li, and Jingquan Luo. Recovering the original simplicity: succinct and deterministic quantum algorithm for the welded tree problem. In ACM-SIAM Symposium on Discrete Algorithms, pages 2454–2480, 2023.
  • [34] Michele Mosca and Christof Zalka. Exact quantum Fourier transforms and discrete logarithm algorithms. International Journal of Quantum Information, 02(01):91–100, 2004.
  • [35] Ran Raz and Avishay Tal. Oracle separation of BQP and PH. J. ACM, 69(4):30:1–30:21, 2022. doi:10.1145/3530258.
  • [36] Martin Rotteler. Quantum algorithms for highly non-linear Boolean functions. Proceedings of the 2010 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 448–457, 2020. doi:10.1137/1.9781611973075.37.
  • [37] Alexander A. Sherstov, Andrey A. Storozhenko, and Pei Wu. An optimal separation of randomized and quantum query complexity. SIAM J. Comput., 52(2):525–567, 2023. doi:10.1137/22M1468943.
  • [38] Peter W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing, 26(5):1484–1509, 1997. doi:10.1137/S0097539795293172.
  • [39] Noah Shutty. Simons institute summer cluster on quantum computing: Lightning talks, 2025. Link to recording of talk: https://www.youtube.com/live/7F5LBNGDRmk?si=NhPVNL25qGTtcngP&t=2651.
  • [40] Daniel R. Simon. On the power of quantum computation. SIAM J. Comput., 26(5):1474–1483, October 1997. doi:10.1137/S0097539796298637.
  • [41] Avishay Tal. Towards optimal separations between quantum and randomized query complexities. In Proceedings of the 61st IEEE Annual Symposium on Foundations of Computer Science, pages 228–239, 2020. doi:10.1109/FOCS46700.2020.00030.