Quantum Query Algorithms are Completely Bounded Forms

We prove a characterization of $t$-query quantum algorithms in terms of the unit ball of a space of degree-$2t$ polynomials. Based on this, we obtain a refined notion of approximate polynomial degree that equals the quantum query complexity, answering a question of Aaronson et al. (CCC'16). Our proof is based on a fundamental result of Christensen and Sinclair (J. Funct. Anal., 1987) that generalizes the well-known Stinespring representation for quantum channels to multilinear forms. Using our characterization, we show that many polynomials of degree four are far from those coming from two-query quantum algorithms. We also give a simple and short proof of one of the results of Aaronson et al. showing an equivalence between one-query quantum algorithms and bounded quadratic polynomials. Revision note: A mistake was found in the proof of the second result on degree-4 polynomials far from 2-query quantum algorithms. An explanation of the issue, a corrected proof and stronger examples are presented in work of Escudero Guti\'errez and the second author.


Introduction
In the black-box model of quantum computation one is given access to a unitary operation, usually referred to as an oracle, that allows one to probe the bits of an unknown binary string $x \in \{-1,1\}^n$ in superposition. Promised that $x$ lies in a subset $D \subseteq \{-1,1\}^n$, the goal in this model is to learn some property of $x$ given by a Boolean function $f : D \to \{-1,1\}$, when only given access to $x$ through the oracle. An application of the oracle is usually referred to as a query. The bounded-error quantum query complexity of $f$, denoted $Q_\varepsilon(f)$, is the minimal number of queries a quantum algorithm must make on the worst-case input $x \in D$ to compute $f(x)$ with probability at least $1-\varepsilon$, where $\varepsilon \in (0,1/2)$ is usually some fixed but arbitrary constant.
Many of the best-known quantum algorithms are naturally captured by this model. A few examples of partial functions whose quantum query complexity is exponentially smaller than their classical counterpart (the decision-tree complexity) are period finding [Sho97], Simon's problem [Sim97] and Forrelation [AA15]. Famous problems related to total functions that admit polynomial quantum speed-ups include unstructured search [Gro96], element distinctness [Amb07]
2. Constant-degree polynomials characterize constant-query quantum algorithms.
These avenues were recently explored by Aaronson et al. [AA15, AAI+16]. The first work strengthened the polynomial method by observing that quantum algorithms give rise to polynomials with a so-called block-multilinear structure. Based on this observation, they introduced a refined degree measure, $\text{bm-deg}_\varepsilon(f)$, which lies between $\deg_\varepsilon(f)$ and $2Q_\varepsilon(f)$, prompting the immediate question of how well it approximates $Q_\varepsilon(f)$. The subsequent work showed, among other things, that for infinitely many $n$, there is a function $f$ with $\text{bm-deg}_{1/3}(f) = O(\sqrt{n})$ and $Q_{1/3}(f) = \Omega(n)$, thereby ruling out the possibility that this degree measure validates possibility 1. The natural next question then asks if there is another refined notion of polynomial degree that approximates quantum query complexity [AAI+16, Open problem 3].
In the direction of the second avenue, [AAI+16] showed a surprising converse to the polynomial method for quadratic polynomials. Say that a polynomial $p \in \mathbb{R}[x_1,\dots,x_n]$ is bounded if it satisfies $p(x) \in [-1,1]$ for all $x \in \{-1,1\}^n$.
Theorem 1.1 (Aaronson et al.). There exists an absolute constant $C \in (0,1]$ such that the following holds. For every bounded quadratic polynomial $p$, there exists a one-query quantum algorithm that, on input $x \in \{-1,1\}^n$, returns a sign with expectation $Cp(x)$.
This implies that possibility 2 holds true for quadratic polynomials. It also leads to the problem of finding a similar converse for higher-degree polynomials, asking for instance whether two-query quantum algorithms are equivalent to quartic polynomials [AAI+16, Open problem 1].
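The boundedness condition used in Theorem 1.1 can be checked by brute force for small $n$. The following is a minimal sketch, not taken from the paper: the dictionary encoding of polynomials and the names `evaluate` and `is_bounded` are assumptions made for illustration.

```python
from itertools import product

def evaluate(poly, x):
    """Evaluate a polynomial given as {tuple of variable indices: coefficient}."""
    total = 0.0
    for monomial, coeff in poly.items():
        term = coeff
        for i in monomial:
            term *= x[i]
        total += term
    return total

def is_bounded(poly, n, tol=1e-12):
    """Check |p(x)| <= 1 on all 2^n sign vectors (brute force, small n only)."""
    return all(abs(evaluate(poly, x)) <= 1 + tol
               for x in product((-1, 1), repeat=n))

# p(x) = (x0*x1 + x1*x2 + x0*x2) / 3 is a bounded quadratic polynomial.
p = {(0, 1): 1 / 3, (1, 2): 1 / 3, (0, 2): 1 / 3}
# q(x) = x0*x1 + x1*x2 takes the value 2 at x = (1, 1, 1), so it is not bounded.
q = {(0, 1): 1.0, (1, 2): 1.0}
```

The enumeration costs $2^n$ evaluations, so this is only meant for sanity checks on toy examples.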

Our results
This paper addresses the two problems mentioned above. Our first result is a new notion of polynomial degree that gives a tight characterization of quantum query complexity (Definition 1.4 and Corollary 1.5 below), giving an answer to [AAI+16, Open problem 3]. Using this characterization, we show that there is no generalization of Theorem 1.1 to higher-degree polynomials, in the sense that there is no absolute constant $C \in (0,1]$ for which the analogous statement holds true. This gives a partial answer to [AAI+16, Open problem 1], ruling out a strong kind of equivalence. Finally, we give a simplified shorter proof of Theorem 1.1. Below we explain our results in more detail.
Quantum algorithms are completely bounded forms. For the rest of the discussion, all polynomials will be assumed to be bounded, real and $(2n)$-variate if not specified otherwise. We refer to a homogeneous polynomial as a form. For $\alpha \in \{0,1,2,\dots\}^{2n}$ and $x \in \mathbb{R}^{2n}$, we write $|\alpha| = \alpha_1 + \cdots + \alpha_{2n}$ and $x^\alpha = x_1^{\alpha_1} \cdots x_{2n}^{\alpha_{2n}}$. Then, any form $p$ of degree $t$ can be written as
$$p(x) = \sum_{\alpha \in \{0,1,\dots,t\}^{2n} :\, |\alpha| = t} c_\alpha x^\alpha, \tag{1}$$
where $c_\alpha$ are some real coefficients. Our new notion of polynomial degree is based on a characterization of quantum query algorithms in terms of forms satisfying a certain norm constraint. The norm we assign to a form as in (1) is given by a norm of the unique symmetric $t$-tensor $T_p \in \mathbb{R}^{2n \times \cdots \times 2n}$ such that $p$ can be written as
$$p(x) = \sum_{i_1,\dots,i_t=1}^{2n} (T_p)_{i_1,\dots,i_t}\, x_{i_1} \cdots x_{i_t}. \tag{2}$$
Explicitly, this tensor is given by
$$(T_p)_{i_1,\dots,i_t} = \frac{c_{e_{i_1} + \cdots + e_{i_t}}}{\tau(i_1,\dots,i_t)}, \tag{3}$$
where $e_i$ is the $i$th standard basis vector for $\mathbb{R}^{2n}$ and $\tau(i_1,\dots,i_t)$ is the number of distinct permutations of the sequence $(i_1,\dots,i_t)$. The relevant norm of $T_p$ is in turn given in terms of an infimum over decompositions of the form $T_p = \sum_{\sigma \in S_t} T^\sigma \circ \sigma$, where the sum is over permutations of $\{1,\dots,t\}$, each $T^\sigma$ is a $t$-tensor, and $T^\sigma \circ \sigma$ is the permuted version of $T^\sigma$ given by
$$(T^\sigma \circ \sigma)_{i_1,\dots,i_t} = T^\sigma_{i_{\sigma(1)},\dots,i_{\sigma(t)}}.$$
Note that the notation $T^\sigma$ does not refer to an action of $S_t$ on the set of tensors. Moreover, since $T^\sigma$ is arbitrary we could have just absorbed the permutation in the decomposition of $T_p$; the reason why we did not will become clear in a moment. Finally, the actual norm is based on the completely bounded norm of each of the $T^\sigma$. Given a $t$-tensor $T \in \mathbb{R}^{2n \times \cdots \times 2n}$, its completely bounded norm $\|T\|_{cb}$ is given by the supremum over positive integers $k$ and collections of $k \times k$ unitary matrices $U_1(i),\dots,U_t(i)$, for $i \in [2n]$, of the operator norm
$$\Big\| \sum_{i_1,\dots,i_t=1}^{2n} T_{i_1,\dots,i_t}\, U_1(i_1) \cdots U_t(i_t) \Big\|. \tag{4}$$
Definition 1.2 (Completely bounded norm of a form). Let $p$ be a form of degree $t$ and let $T_p$ be the symmetric $t$-tensor as in (3). Then, the completely bounded norm of $p$ is defined by
$$\|p\|_{cb} = \inf\Big\{ \sum_{\sigma \in S_t} \|T^\sigma\|_{cb} \,:\, T_p = \sum_{\sigma \in S_t} T^\sigma \circ \sigma \Big\}. \tag{5}$$
Standard compactness arguments show that both the completely bounded norm of tensors and of polynomials are attained. Let us point out that $\|T\|_{cb}$ does not always equal $\|T \circ \sigma\|_{cb}$ for a non-trivial permutation $\sigma$. For this reason, the completely bounded norm of a polynomial can be significantly smaller than that of its associated symmetric tensor: for $n$-variate cubic forms their ratio can be as large as $\Omega(\sqrt{n})$. Let us also mention that for ease of exposition, we are abusing the term "completely bounded norm". Such norms originate from operator space theory and make sense only in reference to underlying operator spaces, which we have tacitly fixed in the above discussion. The norm in (5) was originally introduced in the general context of tensor products of operator spaces in [OP99].
In that framework, the definition considered here corresponds to a particular operator space based on $\ell_1^n$, but we shall not use this fact here. Our characterization of quantum query algorithms is as follows.
Theorem 1.3. Let $\beta : \{-1,1\}^n \to [-1,1]$ and let $t$ be a positive integer. Then, the following are equivalent.
1. There exists a form $p$ of degree $2t$ such that $\|p\|_{cb} \le 1$ and $p((x,\mathbf{1})) = \beta(x)$ for every $x \in \{-1,1\}^n$, where $\mathbf{1} \in \mathbb{R}^n$ is the all-ones vector.
2. There exists a $t$-query quantum algorithm that, on input $x \in \{-1,1\}^n$, returns a sign with expected value $\beta(x)$.
It may be observed that the polynomial method is contained in the above statement, since any (2n)-variate form p defines an n-variate polynomial given by q(x) = p((x, 1)). The above theorem refines the polynomial method in the sense that quantum algorithms can only yield polynomials of the form q(x) = p((x, 1)) where p has completely bounded norm at most one.
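As a small illustration of the passage from a form $p$ to its symmetric tensor $T_p$ and to the restricted polynomial $q(x) = p((x, 1))$, the sketch below builds $T_p$ by dividing each coefficient by the number of distinct permutations of the index sequence. The encoding of forms as dictionaries of exponent vectors and all function names are assumptions made for this example only.

```python
from itertools import product
from math import factorial

def symmetric_tensor(coeffs, m, t):
    """Return T_p as {(i_1, ..., i_t): value} for a degree-t form in m variables.

    coeffs maps exponent vectors alpha (tuples of length m, entries summing
    to t) to the real coefficient c_alpha.
    """
    tensor = {}
    for idx in product(range(m), repeat=t):
        alpha = [0] * m
        for i in idx:
            alpha[i] += 1
        c = coeffs.get(tuple(alpha), 0.0)
        # tau = number of distinct permutations of idx = t! / prod(alpha_j!)
        tau = factorial(t)
        for a in alpha:
            tau //= factorial(a)
        tensor[idx] = c / tau
    return tensor

def evaluate_tensor(tensor, y):
    """Evaluate sum over indices of T[i_1..i_t] * y[i_1] * ... * y[i_t]."""
    total = 0.0
    for idx, value in tensor.items():
        term = value
        for i in idx:
            term *= y[i]
        total += term
    return total

def evaluate_form(coeffs, y):
    """Evaluate sum over alpha of c_alpha * y^alpha."""
    total = 0.0
    for alpha, c in coeffs.items():
        term = c
        for yi, a in zip(y, alpha):
            term *= yi ** a
        total += term
    return total

# Example with n = 2, so 2n = 4 variables: p(y) = (y0*y1 + y0*y3) / 2.
p = {(1, 1, 0, 0): 0.5, (1, 0, 0, 1): 0.5}
T = symmetric_tensor(p, m=4, t=2)
# Restricting to y = (x, 1) gives q(x) = (x0*x1 + x0) / 2.
```

Here the two entries `T[(0, 1)]` and `T[(1, 0)]` share the coefficient of `y0*y1` equally, which is exactly what symmetry requires.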
Our proof is based on a fundamental result of Christensen and Sinclair [CS87] concerning multilinear forms on C*-algebras that generalizes the well-known Stinespring representation theorem for quantum channels (see also [PS87] and [Pis03, Chapter 5]). As such, this result applies in a more general setting than what is strictly needed here. Section 2 contains some preliminary material that will allow us to state the result in its original form, in particular the general definition of completely bounded norms of multilinear forms on C*-algebras.
Completely bounded approximate degree. Theorem 1.3 motivates the following new notion of approximate degree for partial Boolean functions.
Definition 1.4 (Completely bounded approximate degree). For $D \subseteq \{-1,1\}^n$, let $f : D \to \{-1,1\}$ be a (possibly partial) Boolean function and let $\varepsilon \ge 0$. Then, the $\varepsilon$-completely bounded approximate degree of $f$, denoted $\text{cb-deg}_\varepsilon(f)$, is the smallest positive integer $t$ for which there exists a form $p$ of degree $2t$ such that $\|p\|_{cb} \le 1$ as in Eq. (5) and $|p((x,\mathbf{1})) - f(x)| \le 2\varepsilon$ for every $x \in D$.
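The tolerance $2\varepsilon$ in Definition 1.4 matches error probability $\varepsilon$ once expectations of signs are translated into success probabilities. The following sketch spells out that arithmetic under the assumption that the algorithm outputs a sign in $\{-1, 1\}$ with expectation $\beta(x)$; the function names are hypothetical.

```python
from fractions import Fraction

def success_probability(beta, f_value):
    """Pr[output == f_value] for a +/-1-valued output with expectation beta.

    From Pr[+1] - Pr[-1] = beta and Pr[+1] + Pr[-1] = 1 one gets
    Pr[output == f_value] = (1 + f_value * beta) / 2.
    """
    return (1 + f_value * beta) / 2

def meets_tolerance(beta, f_value, eps):
    """The condition |beta - f(x)| <= 2 * eps from Definition 1.4."""
    return abs(beta - f_value) <= 2 * eps

eps = Fraction(1, 3)
beta = Fraction(1, 3)  # expectation of the output on an input with f(x) = +1
```

With exact rationals, `success_probability(beta, 1)` is `2/3`, which equals `1 - eps`, and the tolerance condition holds with equality in this boundary case.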
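As a corollary of Theorem 1.3, we get the following characterization of quantum query complexity.
Corollary 1.5. For every $D \subseteq \{-1,1\}^n$, $f : D \to \{-1,1\}$ and $\varepsilon \ge 0$, we have $\text{cb-deg}_\varepsilon(f) = Q_\varepsilon(f)$.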
We remark that the characterization of Q ε ( f ) via the adversary method holds for all constant ε > 0, whereas our characterization holds for every ε ≥ 0. In addition, in our characterization we do not lose constant factors (unlike in the adversary method characterization) which could possibly be useful to understand the quantum query complexity of ordered search [HNS02,CL08].
Chebyshev polynomials. The Chebyshev polynomials have been used in a number of places to find approximating polynomials for Boolean functions, most notably [NS94]. These polynomials can be defined through the recursion $T_0(\alpha) = 1$, $T_1(\alpha) = \alpha$, $T_{k+1}(\alpha) = 2\alpha T_k(\alpha) - T_{k-1}(\alpha)$ for $k \in \mathbb{N}$. Particularly useful are the $n$-variate degree-$k$ polynomials $p_k(x) = T_k\big((x_1 + \cdots + x_n)/n\big)$. In a forthcoming work, we show using a straightforward argument based on the recursion formula that there exist degree-$k$ forms $F_k$ on $\mathbb{R}^n$ such that $F_k(x) = p_k(x)$ for every $x \in \{-1,1\}^n$ and $\|F_k\|_{cb} \le 1$ for every $k$. As a simple application, from Theorem 1.3 and a result of [NS94], one then easily obtains the fact that the $n$-bit OR function, restricted to the set of strings with Hamming weight at most 1, has quantum query complexity $O(\sqrt{n})$, as implied by Grover's algorithm [Gro96].
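The recursion for the Chebyshev polynomials and the boundedness of $p_k$ on the hypercube are easy to check numerically. The sketch below is illustrative only (the function names are not from the paper) and does not construct the forms $F_k$.

```python
from itertools import product
from math import cos

def chebyshev(k, a):
    """T_k(a) via T_0 = 1, T_1 = a, T_{k+1} = 2 a T_k - T_{k-1}."""
    prev, curr = 1.0, a
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, curr = curr, 2 * a * curr - prev
    return curr

def p_k(k, x):
    """The n-variate polynomial T_k((x_1 + ... + x_n) / n)."""
    return chebyshev(k, sum(x) / len(x))

# On sign vectors the argument (x_1 + ... + x_n) / n lies in [-1, 1],
# where |T_k| <= 1, so p_k is bounded in the sense used in the text.
max_abs = max(abs(p_k(k, x))
              for k in range(6)
              for x in product((-1, 1), repeat=4))
```

Boundedness of $p_k$ is the easy part; the claim in the text about $\|F_k\|_{cb} \le 1$ is stronger and is not verified by this check.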
Separations for higher-degree forms. Theorem 1.1 follows from our Theorem 1.3 and the fact that for every bounded quadratic form $p(x) = x^{\mathsf{T}} A x$, the matrix $A$ has completely bounded norm bounded from above by an absolute constant (independent of $n$); this is discussed in more detail below. If the same were true for the tensors $T_p$ corresponding to higher-degree forms $p$, then Theorem 1.3 would give higher-degree extensions of Theorem 1.1. Unfortunately, this will turn out to be false for polynomials of degrees greater than 3. Bounded forms whose associated tensors have unbounded completely bounded norm appeared before in the work of Smith [Smi88], who gave an explicit example with completely bounded norm $\log n$. Since $\|p\|_{cb}$ involves an infimum over decompositions of $T_p$, this does not yet imply a counterexample to higher-degree versions of Theorem 1.1. However, such counterexamples are implied by recent work on Bell inequalities, multiplayer XOR games in particular. It is not difficult to see that $\|p\|_{cb}$ is bounded from below by the so-called jointly completely bounded norm of the tensor $T_p$, a quantity that in quantum information theory is better known as the entangled bias of the XOR game whose (unnormalized) game tensor is given by $T_p$. One obtains this quantity by inserting tensor products between the unitaries appearing in (4). Pérez-García et al. [PGWP+08] and Vidick and the second author [BV13] gave examples of bounded cubic forms with unbounded jointly completely bounded norm. Both constructions are non-explicit, the first giving a completely bounded norm of order $\Omega((\log n)^{1/4})$ and the latter of order $\Omega(n^{1/4})$. Here, we explain how to get a larger separation by means of a much simpler (although still non-explicit) construction and show that a bounded cubic form $p$ given by a suitably normalized random sign tensor has completely bounded norm $\|p\|_{cb} = \Omega(\sqrt{n})$ with high probability (Theorem 4.1).
The result presented here is not new: it follows from the existence of commutative operator algebras which are not Q-algebras. Here, we present a self-contained proof along the same lines as [DJT95, Theorem 18.16] and, in addition, we prove the result with high probability (rather than just the existence of such trilinear forms). We also explain how to obtain quartic examples from this result by embedding into 3-dimensional "tensor slices", which in turn imply counterexamples to a quartic versus two-query version of Theorem 1.1.

Remark.
A mistake was found in the last step mentioned above, concerning the implication of counterexamples to a quartic version of Theorem 1.1. A corrected proof (and stronger examples) can be found in [BG22].
Short proof of Theorem 1.1. As shown in [AAI+16], Theorem 1.1 is yet another surprising consequence of the ubiquitous Grothendieck inequality [Gro53] (Theorem 5.2 below), well known for its relevance to Bell inequalities [Tsi87, CHTW04] and combinatorial optimization [AN06, KN12], not to mention its fundamental importance to Banach spaces [Pis12]. An equivalent formulation of Grothendieck's inequality again recovers Theorem 1.1 for quadratic forms $p(x) = x^{\mathsf{T}} A x$ given by a matrix $A \in \mathbb{R}^{n \times n}$ satisfying the norm constraint $\|A\|_{\infty \to 1} \le 1$, which in particular implies that $p$ is bounded (see Section 2 for more on this norm). Indeed, in that case Grothendieck's inequality implies that $\|A\|_{cb} \le K_G$ for some absolute constant $K_G \in (1,2)$ (independent of $n$ and $A$). Normalizing by $K_G^{-1}$, one obtains Theorem 1.1 with $C = K_G^{-1}$ for such quadratic forms from Theorem 1.3. The general version of Theorem 1.1 for quadratic polynomials follows from this via a so-called decoupling argument (see Section 5). This arguably does not simplify the original proof of Theorem 1.1, as Theorem 1.3 relies on deep results itself. However, in Section 5 we give a short simplified proof, showing that Theorem 1.1 follows almost directly from a "factorization version" of Grothendieck's inequality (Theorem 5.3) that follows from the more standard version (Theorem 5.2). The factorization version was used in the original proof as well, but only as a lemma in a more intricate argument. In computer science, this factorization version has already found applications in an algorithmic version of the Bourgain-Tzafriri Column Subset Theorem [Tro09] and algorithms for community detection in the stochastic block model [LLV15]. This appears to be its first occurrence in quantum computing.

Related work
Although there is no converse to the polynomial method for arbitrary polynomials, equivalences between quantum algorithms and polynomials have been studied before in certain models of computation. For example, we do know of such characterizations in the model of non-deterministic query complexity [Wol03], unbounded-error query complexity [BVW07,MNR11] and quantum query complexity in expectation [KLW15]. We remark here that in all these settings, the quantum algorithms constructed from polynomials were non-adaptive algorithms, i.e., the quantum algorithm begins with a quantum state, repeatedly applies the oracle some fixed number of times and then performs a projective measurement. Crucially, these algorithms do not contain interlacing unitaries that are present in the standard model of query complexity, hence are known to be a much weaker class of algorithms (see Montanaro [Mon10] for more details).
Our main result is yet another demonstration of the expressive power of C*-algebras and operator space theory in quantum information theory; for a survey on applications of these areas to two-prover one-round games, see [PV16]. The appearance of Q-algebras (mentioned in the above paragraph on separations) is also not a first in quantum information theory, see for instance [PGWP+08, BBLV12, BBLV13].
After the initial version of this work appeared, it was shown by Gribling and Laurent that the completely bounded norm of a degree-$d$ polynomial can be computed by a semidefinite program (SDP) of size $O(n^d)$ [GL19]. An SDP formulation for quantum query complexity was already known using the negative-weight adversary method [Rei11], but as we mentioned after Corollary 1.5, the adversary method only characterizes bounded-error quantum query complexity. With our characterization, the result of Gribling and Laurent gives a hierarchy of SDPs even for exact quantum query complexity. An SDP characterization of quantum query complexity was also given earlier by Barnum, Saks and Szegedy [BSS03]. This SDP uses matrix variables of size $|D|$, which is $2^n$ for total functions, and so can be much larger than that of Gribling and Laurent.

Organization
In Section 2, we give a brief introduction to normed vector spaces and C*-algebras, and define the model of quantum query complexity. In Section 3, we prove our main theorem characterizing quantum query algorithms. In Section 4, we explain the separation obtained for higher-degree forms. In Section 5, we give a short proof of the main theorem in Aaronson et al. [AAI+16].

Preliminaries
Here we fix some basic notation and recall some basic definitions. In addition, in order to be able to state and use our main tool (Theorem 3.1 of Christensen and Sinclair), we recall some basic facts about C*-algebras and completely bounded norms.
Normed vector spaces. For a parameter $p \in [1,\infty)$, the $p$-norm of a vector $x \in \mathbb{R}^n$ is defined by $\|x\|_p = (|x_1|^p + \cdots + |x_n|^p)^{1/p}$, and $\|x\|_\infty = \max_{i} |x_i|$. For a matrix $A \in \mathbb{R}^{n \times n}$, denote the standard operator norm by $\|A\|$ and define
$$\|A\|_{\infty \to 1} = \sup\big\{ y^{\mathsf{T}} A x \,:\, x, y \in \mathbb{R}^n,\ \|x\|_\infty \le 1,\ \|y\|_\infty \le 1 \big\}.$$
By linear programming duality, observe that the right-hand side of the equality above can be written as $\sup\{ \|Ax\|_1 : \|x\|_\infty \le 1 \}$. We denote the norm of a general normed vector space $X$ by $\|\cdot\|_X$ if there is a danger of ambiguity. Denote by $1_X$ the identity map on $X$ and by $1_d$ the identity map on $\mathbb{C}^d$. For normed vector spaces $X, Y$, let $L(X,Y)$ be the collection of all linear maps $T : X \to Y$. We will use the notation $L(X)$ as a shorthand for $L(X,X)$. The (operator) norm of a linear map $T \in L(X,Y)$ is given by $\|T\| = \sup\{ \|T(x)\|_Y : \|x\|_X \le 1 \}$. Throughout we endow $\mathbb{C}^d$ with the standard Euclidean norm. Note that the space $L(\mathbb{C}^d)$ is naturally identified with the set of $d \times d$ matrices, sometimes denoted $M_d(\mathbb{C})$, and we use the two notations interchangeably. For Hilbert spaces $H, K$, we endow $H \otimes K$ with the norm given by the inner product $\langle f \otimes a, g \otimes b \rangle = \langle f, g \rangle_H \langle a, b \rangle_K$, extended linearly to the entire domain; for $K = \mathbb{C}^d$ this makes the space isometric to $H \oplus \cdots \oplus H$ ($d$ times). Similarly, we endow $L(H) \otimes L(\mathbb{C}^d)$ with the operator norm of the space $L(H \otimes \mathbb{C}^d)$ of linear operators on the Hilbert space $H \otimes \mathbb{C}^d$; with some abuse of notation, we shall identify the two spaces of operators.
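For small matrices the norm $\|A\|_{\infty \to 1}$ can be computed by enumeration. The sketch below assumes the standard formulation $\sup\{\|Ax\|_1 : \|x\|_\infty \le 1\}$ and uses the fact that a convex function on the cube attains its maximum at a vertex, so only sign vectors need to be tried. The function name is hypothetical.

```python
from itertools import product

def norm_inf_to_1(A):
    """Brute-force sup of ||A x||_1 over sign vectors x (small n only)."""
    n = len(A[0])
    best = 0.0
    for x in product((-1, 1), repeat=n):
        value = sum(abs(sum(row[j] * x[j] for j in range(n))) for row in A)
        best = max(best, value)
    return best

hadamard = [[1, 1], [1, -1]]
identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

For the 2-by-2 matrix `hadamard` every sign vector gives the value 2, so dividing it by 2 yields a matrix satisfying the norm constraint $\|A\|_{\infty \to 1} \le 1$.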
C*-algebras. We collect a few basic facts about C*-algebras that we use later and refer to [Arv12] for an extensive introduction. A C*-algebra $\mathcal{X} = (X, \cdot, *)$ is a normed complex vector space $X$, complete with respect to its norm (i.e., a Banach space), that is endowed with two operations in addition to the standard vector-space addition and scalar multiplication operations:
1. an associative multiplication $\cdot : X \times X \to X$, denoted $x \cdot y$ for $x, y \in X$, that is distributive with respect to the vector space addition and continuous with respect to the norm of $X$, which by definition of continuity means $\|x \cdot y\|_X \le \|x\|_X \|y\|_X$ for all $x, y \in X$;
2. an involution $* : X \to X$, that is, a conjugate-linear map that sends $x \in X$ to (a unique) $x^* \in X$ satisfying $(x^*)^* = x$ and $(xy)^* = y^* x^*$ for any $x, y \in X$, and such that $\|x^* x\|_X = \|x\|_X^2$.
Any finite-dimensional normed vector space is a Banach space. A C*-algebra $\mathcal{X}$ is unital if it has a multiplicative identity, denoted $1_X$. The most important example of a unital C*-algebra is $M_n(\mathbb{C})$, where the involution is the conjugate-transpose and the norm is the operator norm. A linear map $\pi : \mathcal{X} \to \mathcal{Y}$ from one C*-algebra $\mathcal{X}$ to another $\mathcal{Y}$ is a *-homomorphism if it preserves the multiplication operation, $\pi(xy) = \pi(x)\pi(y)$, and satisfies $\pi(x)^* = \pi(x^*)$ for all $x, y \in \mathcal{X}$. For a complex Hilbert space $H$, a mapping $\pi : \mathcal{X} \to L(H)$ is a *-representation if it is a *-homomorphism. An important fact is the Gelfand-Naimark Theorem [Mur14, Theorem 3.4.1] asserting that any C*-algebra admits an isometric (that is, norm-preserving) *-representation for some complex Hilbert space. Suppose $\mathcal{X} = (X, \cdot_X, *)$ and $\mathcal{Y} = (Y, \cdot_Y, \dagger)$ are C*-algebras. Then the tensor product $\mathcal{X} \otimes \mathcal{Y}$ is also a C*-algebra, defined in terms of the standard tensor product of the vector spaces $X \otimes Y$ with the associative multiplication $\cdot_{XY}$ and involution defined as $(x \otimes y) \cdot_{XY} (x' \otimes y') = (x \cdot_X x') \otimes (y \cdot_Y y')$ and $(x \otimes y)^* = x^* \otimes y^\dagger$. This can then be extended linearly to the entire domain.

Completely bounded norms
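We also collect a few basic facts about completely bounded norms that we use later and refer to [Pau02] for an extensive introduction. For a C*-algebra $\mathcal{X}$ and positive integer $d$, we denote by $M_d(\mathcal{X})$ the set of $d$-by-$d$ matrices with entries in $\mathcal{X}$. Note that this set can naturally be identified with the algebraic tensor product $\mathcal{X} \otimes L(\mathbb{C}^d)$, that is, the linear span of all elements of the form $x \otimes M$, where $x \in \mathcal{X}$ and $M \in L(\mathbb{C}^d)$. Using the Gelfand-Naimark theorem, we endow $M_d(\mathcal{X})$ with a norm induced by an isometric *-representation $\pi$ of $\mathcal{X}$ into $L(H)$ for a Hilbert space $H$, by setting
$$\|A\| = \big\| (\pi \otimes 1_{L(\mathbb{C}^d)})(A) \big\| \quad \text{for } A \in M_d(\mathcal{X}).$$
The notation $\|A\|$ reflects the fact that this norm is in fact independent of the particular *-representation $\pi$. Based on this, we can define a norm on linear maps $\sigma : \mathcal{X} \to L(H)$ as follows:
$$\|\sigma\|_{cb} = \sup\big\{ \big\| (\sigma \otimes 1_{L(\mathbb{C}^d)})(A) \big\| \,:\, d \in \mathbb{N},\ A \in M_d(\mathcal{X}),\ \|A\| \le 1 \big\}.$$
We will also need the following fact about the completely bounded norm of *-representations of C*-algebras [Pis03, Theorem 1.6].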
Lemma 2.1. Let $\mathcal{X}$ be a finite-dimensional C*-algebra, $H, H'$ be Hilbert spaces, $\pi : \mathcal{X} \to L(H)$ be a *-representation and $U \in L(H, H')$ and $V \in L(H', H)$ be linear maps. Then, the map $\sigma : \mathcal{X} \to L(H')$, defined as $\sigma(x) = U \pi(x) V$, satisfies $\|\sigma\|_{cb} \le \|U\| \|V\|$.
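We will also use the famous Fundamental Factorization Theorem [Pau02, Theorem 8.4]. Below we state the theorem when restricted to finite-dimensional spaces (see also the remark after [JKP09, Theorem 16]).
Tensors and multilinear forms. For vector spaces $X, Y$ over the same field and positive integer $t$, recall that a mapping $T : X \times \cdots \times X \to Y$ (with $t$ copies of $X$) is $t$-linear if for every $x_1,\dots,x_t \in X$ and $i \in [t]$, the map $y \mapsto T(x_1,\dots,x_{i-1},y,x_{i+1},\dots,x_t)$ is linear. A $t$-tensor of dimension $n$ is a map $T : [n] \times \cdots \times [n] \to \mathbb{C}$, which can alternatively be identified with $T = (T_{i_1,\dots,i_t})_{i_1,\dots,i_t=1}^n \in \mathbb{C}^{n \times \cdots \times n}$. With abuse of notation we identify a $t$-tensor $T \in \mathbb{C}^{n \times \cdots \times n}$ with the $t$-linear form $T : \mathbb{C}^n \times \cdots \times \mathbb{C}^n \to \mathbb{C}$ given by
$$T(x_1,\dots,x_t) = \sum_{i_1,\dots,i_t=1}^{n} T_{i_1,\dots,i_t}\, x_1(i_1) \cdots x_t(i_t).$$
Next, we introduce the general definition of the completely bounded norm of a $t$-linear form $T : \mathcal{X} \times \cdots \times \mathcal{X} \to \mathbb{C}$ on a C*-algebra $\mathcal{X}$. First, we use the standard identification of such forms with the linear form on the tensor product $\mathcal{X} \otimes \cdots \otimes \mathcal{X}$ given by $T(x_1 \otimes \cdots \otimes x_t) = T(x_1,\dots,x_t)$. We consider a bilinear map $\odot : (\mathcal{X} \otimes L(\mathbb{C}^d)) \times (\mathcal{X} \otimes L(\mathbb{C}^d)) \to \mathcal{X} \otimes \mathcal{X} \otimes L(\mathbb{C}^d)$ for any positive integer $d$ defined as follows. For $x, y \in \mathcal{X}$ and $M_x, M_y \in L(\mathbb{C}^d)$, let $(x \otimes M_x) \odot (y \otimes M_y) = (x \otimes y) \otimes (M_x M_y)$. Observe that this operation changes the order of the tensor factors and multiplies $M_x$ with $M_y$. This operation is associative but not commutative. Extend the definition of the operation bilinearly to its entire domain. Define the $t$-linear map $T_d : M_d(\mathcal{X}) \times \cdots \times M_d(\mathcal{X}) \to L(\mathbb{C}^d)$ by $T_d(A_1,\dots,A_t) = (T \otimes 1_{L(\mathbb{C}^d)})(A_1 \odot \cdots \odot A_t)$. The completely bounded norm of $T$ is now defined by
$$\|T\|_{cb} = \sup\big\{ \|T_d(A_1,\dots,A_t)\| \,:\, d \in \mathbb{N},\ A_j \in M_d(\mathcal{X}),\ \|A_j\| \le 1 \big\}.$$
Note that the definition given in Eq. (4) corresponds to the particular case where the C*-algebra $\mathcal{X}$ is formed by the $n \times n$ diagonal matrices. Since any square matrix with operator norm at most 1 is a convex combination of unitary matrices (by the Russo-Dye Theorem), the completely bounded norm can also be defined by taking the supremum over unitaries $A_j \in M_d(\mathcal{X})$.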
The completely bounded norm can be defined even more generally for multilinear maps into L(H), for some Hilbert space H, to yield the definition of this norm for linear maps given above, but we will not use this here.
Quantum query complexity. The quantum query model was formally defined by Beals et al. in [BBC+01]. In this model, we are given black-box access to a unitary operator, often called an oracle $O_x$, whose description depends in a simple way on some binary input string $x \in \{0,1\}^n$. An application of the oracle on a quantum register is referred to as a quantum query to $x$. In the standard form of the model, a query acts on a pair of registers $(Q, A)$, where $Q$ is an $n$-dimensional query register and $A$ is a one-qubit auxiliary register. A query to the oracle effects the unitary transformation given by $|i, b\rangle \mapsto |i, b \oplus x_i\rangle$. (These oracles are also commonly called bit oracles.) A quantum query algorithm consists of a fixed sequence of unitary operations acting on $(Q, A)$ in addition to a workspace register $W$. A $t$-query quantum algorithm begins by initializing the joint register $(Q, A, W)$ in the all-zero state and continues by interleaving a sequence of unitaries $U_0,\dots,U_t$ on $(Q, A, W)$ with oracles $O_x$ on $(Q, A)$. Finally, the algorithm performs a 2-outcome measurement on $A$ and returns the measurement outcome.
Figure 1: A $t$-query quantum algorithm that starts with the all-zero state and concludes by measuring the register $A$.
For a Boolean function f : {0, 1} n → {0, 1}, the algorithm is said to compute f with error ε > 0 if for every x, the measurement outcome of register A equals f (x) with probability at least 1 − ε. The bounded-error query complexity of f , denoted Q ε ( f ), is the smallest t for which such an algorithm exists. Note that in this model, we are not concerned with the amount of time (i.e., the number of gates) it takes to implement the interlacing unitaries, which could be much bigger than the query complexity itself.
Here we will work with a slightly less standard oracle sometimes referred to as a phase oracle, in which the standard oracle is preceded and followed by a Hadamard on $A$. Since the Hadamards can be undone by the unitaries surrounding the queries in a quantum query algorithm, using the phase oracle does not reduce generality. A query to this oracle, sometimes denoted $O_{x,\pm}$, applies the (controlled) unitary $\mathrm{Diag}((\mathbf{1}, (-1)^x))$ to the joint register $(A, Q)$. To avoid having to write $(-1)^x$ later on, we shall work in the equivalent setting where Boolean functions send $\{-1,1\}^n$ to $\{-1,1\}$.
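To make the phase-oracle model concrete, the following sketch simulates a one-query algorithm for $n = 2$ that returns a sign with expected value $x_1 x_2$ (indices 0 and 1 in the code). It is an illustration under stated assumptions rather than a construction from the paper: the basis of $(A, Q)$ is ordered with the $A = 0$ block first, the workspace register is omitted, and the final step is modelled as a generic two-outcome projective measurement. The function names are hypothetical.

```python
from itertools import product
from math import sqrt

def phase_query(state, x):
    """Apply Diag((1, ..., 1, x_0, ..., x_{n-1})) to a state on (A, Q)."""
    n = len(x)
    phases = [1] * n + list(x)
    return [p * a for p, a in zip(phases, state)]

def expected_sign(x):
    """Expected output of a one-query algorithm for n = 2.

    The state (|1,0> + |1,1>) / sqrt(2) is queried once; outcome +1
    corresponds to projecting back onto that state, outcome -1 to the
    orthogonal complement.
    """
    start = [0.0, 0.0, 1 / sqrt(2), 1 / sqrt(2)]
    after = phase_query(start, x)
    overlap = sum(a * b for a, b in zip(start, after))
    prob_plus = overlap ** 2
    return 2 * prob_plus - 1
```

The overlap equals $(x_1 + x_2)/2$, so the expectation is $(x_1 + x_2)^2/2 - 1 = x_1 x_2$, a bounded quadratic polynomial in the input as the polynomial method predicts for one query.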

Characterization of quantum query algorithms
In this section we prove Theorem 1.3. The main ingredient of the proof is the following celebrated representation theorem by Christensen and Sinclair [CS87] showing that completely-boundedness of a multilinear form is equivalent to the existence of an exceedingly nice factorization.
Theorem 3.1 (Christensen-Sinclair). Let $t$ be a positive integer and let $\mathcal{X}$ be a C*-algebra. Then, for any $t$-linear form $T : \mathcal{X} \times \cdots \times \mathcal{X} \to \mathbb{C}$, we have $\|T\|_{cb} \le 1$ if and only if there exist Hilbert spaces $H_0,\dots,H_{t+1}$ where $H_0 = H_{t+1} = \mathbb{C}$, *-representations $\pi_i : \mathcal{X} \to L(H_i)$ for each $i \in [t]$ and contractions $V_i \in L(H_i, H_{i-1})$ for each $i \in [t+1]$ such that for any $x_1,\dots,x_t \in \mathcal{X}$, we have
$$T(x_1,\dots,x_t) = V_1 \pi_1(x_1) V_2 \pi_2(x_2) V_3 \cdots V_t \pi_t(x_t) V_{t+1}. \tag{7}$$
We first show how the above result simplifies when restricting to the special case in which the C*-algebra $\mathcal{X}$ is formed by the set of diagonal $n$-by-$n$ matrices.
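Corollary 3.2. Let $m, n, t$ be positive integers such that $t \ge 2$ and $m = n^t$. Let $T \in \mathbb{C}^{n \times \cdots \times n}$ be a $t$-tensor. Then, $\|T\|_{cb} \le 1$ if and only if there exist a positive integer $d$, unit vectors $u, v \in \mathbb{C}^m$ and contractions $U_i, V_i \in L(\mathbb{C}^m, \mathbb{C}^{dn})$ such that for any $x_1,\dots,x_t \in \mathbb{C}^n$, we have
$$T(x_1,\dots,x_t) = u^* U_1^* (\mathrm{Diag}(x_1) \otimes 1_d) V_1 \cdots U_t^* (\mathrm{Diag}(x_t) \otimes 1_d) V_t\, v. \tag{8}$$
Proof. The set $\mathcal{X} = \mathrm{Diag}(\mathbb{C}^n)$ of diagonal matrices is a (finite-dimensional) C*-algebra (endowed with the standard matrix product and conjugate-transpose involution). Now, define the $t$-linear form $R : \mathcal{X} \times \cdots \times \mathcal{X} \to \mathbb{C}$ by $R(X_1,\dots,X_t) = T(\mathrm{diag}(X_1),\dots,\mathrm{diag}(X_t))$. We claim that $\|R\|_{cb} = \|T\|_{cb}$. Observe that for every positive integer $d$, the set $\{B \in M_d(\mathcal{X}) : \|B\| \le 1\}$ can be identified with the set of block-diagonal matrices $B = \sum_{i=1}^n E_{i,i} \otimes B(i)$ of size $nd \times nd$ with blocks $B(1),\dots,B(n)$ of size $d \times d$ satisfying $\|B(i)\| \le 1$ for all $i \in [n]$. It follows that
$$\|R\|_{cb} = \sup\Big\{ \Big\| \sum_{i_1,\dots,i_t=1}^{n} T_{i_1,\dots,i_t}\, B_1(i_1) \cdots B_t(i_t) \Big\| \,:\, d \in \mathbb{N},\ B_j(i) \in L(\mathbb{C}^d),\ \|B_j(i)\| \le 1 \Big\},$$
which shows that $\|R\|_{cb} = \|T\|_{cb}$.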
Next, we show that (7) is equivalent to (8). The fact that (8) implies (7) follows immediately from the fact that the map Diag(x) → Diag(x) ⊗ 1 d is a * -representation. Now assume (7). Without loss of generality, we may assume that each of the Hilbert spaces H 1 , . . . , H t has dimension at least m. If not, we can expand the dimensions of the ranges and domains of the representations π i and contractions V i by dilating with appropriate isometries into larger Hilbert spaces ("padding with zeros"). For each i ∈ [t], let S i ⊆ H i be the subspace Since dim(X ) = n, we have that dim(S i ) ≤ m. For each i ∈ [t], let Q i ∈ L(C m , H i ) be an isometry such that S i ⊆ Im(Q i ). Note that V t+1 is a vector in the unit ball of H t . Let Q t+1 ∈ L(C m , H t ) be an isometry such that V t+1 ∈ Im(Q t+1 ). Note that for each i ∈ [t + 1], the map Q i Q * i acts as the identity on Im(Q i ). For each i ∈ {2, . . . , t} define the map σ i : X → L(C m ) by σ i (x) = Q * i V i π i (x)Q i+1 and σ 1 (x) = Q * 1 π 1 (x)Q 2 . Finally define u = Q * 1 V * 1 and v = Q * t+1 V t+1 . Then, the right-hand side of (7) can be written as u * σ 1 (x 1 ) · · · σ t (x t )v.
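It follows from Lemma 2.1 that $\|\sigma_i\|_{cb} \le 1$. Let $\tilde{\sigma}_i : L(\mathbb{C}^n) \to L(\mathbb{C}^m)$ be the linear map given by $\tilde{\sigma}_i(M) = \sigma_i(\mathrm{Diag}(M_{11},\dots,M_{nn}))$ for any $M \in L(\mathbb{C}^n)$. Then, for every diagonal matrix $x \in \mathcal{X}$, we have $\tilde{\sigma}_i(x) = \sigma_i(x)$ and also $\|\tilde{\sigma}_i\|_{cb} \le \|\sigma_i\|_{cb}$. It follows from Theorem 2.2 that there exist a positive integer $d_i$ and contractions $U_i, V_i \in L(\mathbb{C}^m, \mathbb{C}^{d_i n})$ such that $\tilde{\sigma}_i(M) = U_i^* (M \otimes 1_{d_i}) V_i$ for every $M \in L(\mathbb{C}^n)$. We can take all $d_i$ equal to $d = \max_i \{d_i\}$ by suitably dilating the contractions $U_i, V_i$. Setting $u' = u/\|u\|_2$ and $U_1' = \|u\|_2 U_1$, and similarly defining $v'$ and $V_t'$, shows that Eq. (7) implies Eq. (8).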
2. There exists a t-query quantum algorithm that, on input x ∈ {−1, 1} n , returns a sign with expected value β(x).
Remark. Note that Lemma 3.3 itself already gives a characterization of quantum query algorithms, but in terms of the completely bounded norm of a tensor, as opposed to a polynomial. The reader may then wonder about the interest of Theorem 1.3, which is an equivalent characterization in terms of a more complicated-looking norm. As mentioned in the introduction, the completely bounded norm of a polynomial can be significantly smaller than that of its associated symmetric tensor. Therefore, given a function $\beta$, a symmetric $(2t)$-tensor $T$ satisfying item 1 in Lemma 3.3 and the degree-$(2t)$ polynomial $p(x) = T(x,\dots,x)$, checking that $\|p\|_{cb} \le 1$ should be easier than proving that $\|T\|_{cb} \le 1$. In fact, it may well be the case that $\|T\|_{cb} > 1$ (so Lemma 3.3 does not allow us to conclude anything, and we should look for another $T$), while $\|p\|_{cb} \le 1$, which allows us to apply Theorem 1.3.
Proof of Lemma 3.3. We first prove that (2) implies (1). As discussed in Section 2, a $t$-query quantum algorithm with phase oracles initializes the joint register $(A, Q, W)$ in the all-zero state, on which it then performs some unitaries $U_1,\dots,U_t$ interlaced with queries $D(x) = \mathrm{Diag}((\mathbf{1}, x)) \otimes 1_W$. Let $\{P_0, P_1\}$ be the two-outcome measurement done at the end of the algorithm and assume that it returns $+1$ on measurement outcome zero and $-1$ otherwise. Let $Q = P_0 - P_1$ and note that $Q$ is a contraction since $P_0, P_1$ are positive semi-definite and satisfy $P_0 + P_1 = 1$. The final state of the quantum algorithm (before the measurement of register $A$) is
$$\psi_x = U_t D(x) \cdots U_2 D(x) U_1 u,$$
where $u$ denotes the all-zero initial state. Hence the expected value of the measurement outcome is given by
$$\langle \psi_x, Q \psi_x \rangle = u^* U_1^* D(x) U_2^* \cdots D(x)\, U_t^* Q U_t\, D(x) \cdots U_2 D(x) U_1 u.$$
By assumption, this expected value equals $\beta(x)$ for every $x \in \{-1,1\}^n$. For $z \in \mathbb{C}^{2n}$, denote $D'(z) = \mathrm{Diag}((z_{n+1},\dots,z_{2n},z_1,\dots,z_n)) \otimes 1_W$ and $\tilde{U}_t = U_t^* Q U_t$. Define the $(2t)$-linear form $T$ by
$$T(y_1,\dots,y_{2t}) = u^* U_1^* D'(y_1) U_2^* \cdots D'(y_t)\, \tilde{U}_t\, D'(y_{t+1}) \cdots U_2 D'(y_{2t}) U_1 u.$$
Clearly $T((x, 1), \dots, (x, 1)) = \beta(x)$ for every $x \in \{-1, 1\}^n$. Moreover, by definition $T$ admits a factorization as in (8). It thus follows from Corollary 3.2 that $\|T\|_{cb} \le 1$. We turn $T$ into a real tensor by taking its real part $T' = (T + \overline{T})/2$, where $\overline{T}$ is the coordinate-wise complex conjugate of $T$.$^5$ Since for any $x \in \{-1, 1\}^n$ and $y = (x, 1)$, the value $T(y, \dots, y)$ is real, we have $T'(y, \dots, y) = \beta(x)$. We need to show that $\|T'\|_{cb} \le 1$. To this end, consider an arbitrary positive integer $d$, unit vectors $v, w \in \mathbb{C}^d$ and sequences of unitary matrices where we assumed that the unit vectors $v, w \in \mathbb{C}^d$ maximize the operator norm. Note that $\|\overline{T}\|_{cb}$ is given by the supremum over $d$ and $V_j(i)$. Taking the complex conjugate of the above summands on the right-hand side allows us to express the above absolute value as where $\overline{v}, \overline{w}, \overline{V_j(i)}$ denote the coordinate-wise complex conjugates. Since each $\overline{V_j(i)}$ is still unitary, it follows that (10) is at most $\|T\|_{cb}$ and so $\|\overline{T}\|_{cb} \le \|T\|_{cb} \le 1$. Hence, by the triangle inequality, $\|T'\|_{cb} \le (\|T\|_{cb} + \|\overline{T}\|_{cb})/2 \le 1$ as desired.
$^5$ An anonymous referee pointed out that one could also use a result of Barnum et al. [BSS03] showing that the unitaries in quantum query algorithms can be assumed to be real. In that case one can assume $T$ is a real tensor to begin with.
Next, we show that (1) implies (2). Let T be a (2t)-tensor as in item 1. From Corollary 3.2 it follows that T admits a factorization as in (8).
Observe that each $W_i$ is a contraction and recall that unitaries are contractions. For the moment, assume for simplicity that each $W_i$ is in fact unitary. Define two vectors u = V 0 u and v = U 2t+1 v and observe that these are unit vectors in $\mathbb{C}^{2dn}$. The right-hand side of (8) then gives us where D( then $T(y, \dots, y) = |v_1^* v_2|$. Based on this, we obtain the quantum query algorithm that prepares $v_1$ and $v_2$ in parallel, each using at most $t$ queries. This is described in Figure 2.
Figure 2: The registers C, Q, W denote the control, query and workspace registers. Let $U, V$ be unitaries with $W_1 u$ and $W_{2t+1} v$ as their first columns, respectively, and for $x \in \{-1, 1\}^n$ and $y = (x, 1)$, let $\mathrm{Diag}(y)$ be the query operator. The algorithm begins by initializing the joint register $(C, Q, W)$ in the all-zero state and proceeds by performing the displayed operations. The algorithm returns $+1$ if the outcome of the measurement on C equals zero and $-1$ otherwise.
To see why this algorithm satisfies the requirements, first note that the algorithm makes $t$ queries to the input $x$. For the correctness of the algorithm, we begin by observing that before the application of the first query, the state of the joint register $(C, Q, W)$ is $\frac{1}{\sqrt{2}}(e_1 \otimes W_1 u + e_2 \otimes W_{t+1} v)$.
Before the final Hadamard gate, the state of the joint register is given by A standard calculation and (11) then show that after the final Hadamard gate, the expected output of the algorithm is precisely $T((x, 1), \dots, (x, 1)) = \beta(x)$. In the general case where the $W_i$ are not necessarily unitary, we can use the fact that, by the Russo-Dye Theorem and Carathéodory's Theorem, each $W_i$ is a convex combination of at most $(dn)^2 + 1$ unitaries. The algorithm can thus use randomness to implement each $W_i$ in expectation. Alternatively, by linear algebra there exists a unitary matrix $\widetilde{W}_i \in \mathbb{C}^{2dn \times 2dn}$ that has $W_i$ as its upper-left corner (see [AAI+16, Lemma 7]), through which the algorithm could implement $W_i$ by working on a larger quantum register.
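The dilation step at the end of the proof can be illustrated numerically. The sketch below uses the standard Halmos dilation, which is one way (not necessarily the construction of [AAI+16, Lemma 7]) to obtain a unitary of twice the dimension with a given contraction as its upper-left corner. The function names `psd_sqrt` and `unitary_dilation` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def psd_sqrt(H):
    """Square root of a Hermitian positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(H)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative eigenvalues caused by rounding
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def unitary_dilation(W):
    """Halmos dilation (an assumed choice): a 2m x 2m unitary with the
    m x m contraction W as its upper-left block."""
    m = W.shape[0]
    I = np.eye(m)
    Wh = W.conj().T
    return np.block([
        [W, psd_sqrt(I - W @ Wh)],
        [psd_sqrt(I - Wh @ W), -Wh],
    ])

# Example: a random complex matrix rescaled to operator norm 1/2.
rng = np.random.default_rng(0)
G = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
W = G / (2 * np.linalg.norm(G, 2))
U = unitary_dilation(W)
print(np.allclose(U @ U.conj().T, np.eye(8)))  # unitarity check
print(np.allclose(U[:4, :4], W))               # W sits in the upper-left corner
```

Applying `U` to a state supported on the first half of the enlarged register and then projecting back onto that half has the same effect as applying `W`, which is the sense in which the algorithm can work on a larger register.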
Using Lemma 3.3, we now prove our main Theorem 1.3.
Proof of Theorem 1.3. We first show that (2) implies (1). Using the equivalence in Lemma 3.3, there exists a $(2t)$-tensor $T \in \mathbb{R}^{2n \times \cdots \times 2n}$ such that $\|T\|_{cb} \le 1$ and for every $x \in \{-1, 1\}^n$ and $y = (x, 1)$, we have $T(y, \dots, y) = \beta(x)$. Define the symmetric $(2t)$-tensor $T' = \frac{1}{(2t)!} \sum_{\sigma \in S_{2t}} T \circ \sigma$. Let $p \in \mathbb{R}[x_1, \dots, x_{2n}]$ be the form of degree $2t$ associated with $T'$. Since there is a unique symmetric tensor associated with a polynomial, it follows that $T' = T_p$ (where $T_p$ is defined by Eq. (3)). Then, $p((x, 1)) = \beta(x)$ for every $x \in \{-1, 1\}^n$. Moreover, if we set $T^\sigma = T/(2t)!$ for each $\sigma \in S_{2t}$, it follows from the above decomposition of $T_p$ and Definition 1.2 that $\|p\|_{cb} \le \|T\|_{cb} \le 1$.
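The symmetrization step in this proof can be checked on a small example. The following sketch averages a tensor over all permutations of its indices and confirms that the diagonal values $T(y, \dots, y)$ are unchanged, which is why the associated form still agrees with $\beta$. The helper names `symmetrize` and `evaluate_diagonal` are assumptions made for this illustration, and the random tensor is only a stand-in for the tensor produced by Lemma 3.3.

```python
import itertools
import math
import numpy as np

def symmetrize(T):
    """Average T over all permutations of its k indices: (1/k!) * sum_sigma T∘sigma."""
    k = T.ndim
    total = np.zeros(T.shape, dtype=float)
    for perm in itertools.permutations(range(k)):
        total += np.transpose(T, perm)
    return total / math.factorial(k)

def evaluate_diagonal(T, y):
    """Evaluate the multilinear form T(y, ..., y) by contracting every index with y."""
    out = T
    for _ in range(T.ndim):
        out = np.tensordot(out, y, axes=([0], [0]))
    return float(out)

# Example with t = 2 (a 4-tensor) and n = 2, so that y = (x, 1) has length 2n = 4.
rng = np.random.default_rng(0)
T = rng.normal(size=(4, 4, 4, 4))
T_sym = symmetrize(T)
x = np.array([1.0, -1.0])
y = np.concatenate([x, np.ones(2)])
print(evaluate_diagonal(T, y), evaluate_diagonal(T_sym, y))  # the two values agree
```

Since every permuted copy of the tensor takes the same value on $(y, \dots, y)$, the average does as well; the triangle inequality then bounds the norm of the average by the norm of the original tensor.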
We now prove Corollary 1.5, which is an immediate consequence of our main theorem.
Proof of Corollary 1.5. We first show $\text{cb-deg}_\varepsilon(f) \ge Q_\varepsilon(f)$. Suppose $\text{cb-deg}_\varepsilon(f) = d$. Then there exists a degree-$(2d)$ form $p$ satisfying $|p(x) - f(x)| \le 2\varepsilon$ for every $x \in D$ and $\|p\|_{cb} \le 1$. Using our characterization in Theorem 1.3, it follows that there exists a $d$-query quantum algorithm $\mathcal{A}$ that, on input $x \in D$, returns a sign with expected value $p(x)$. So our $\varepsilon$-error quantum algorithm for $f$ simply runs $\mathcal{A}$ and outputs the sign. We now show $\text{cb-deg}_\varepsilon(f) \le Q_\varepsilon(f)$. Suppose $Q_\varepsilon(f) = t$. Then, there exists a $t$-query quantum algorithm that, on input $x \in D$, outputs a sign with expected value $\beta(x)$ satisfying $|\beta(x) - f(x)| \le 2\varepsilon$. Note that we could also run the quantum algorithm for $x \notin D$ and let $\beta(x)$ be the expected value of the quantum algorithm for such $x$. Using Theorem 1.3, we know that there exists a degree-$(2t)$ form $p$ satisfying $\beta(x) = p(x)$ for every $x \in \{-1, 1\}^n$ and $\|p\|_{cb} \le 1$. Clearly $p$ satisfies the conditions of Definition 1.4, hence $\text{cb-deg}_\varepsilon(f) \le t$.

Separations for quartic polynomials
In this section we show the existence of a quartic polynomial $p$ that is bounded but for which any two-query quantum algorithm $\mathcal{A}$ satisfying $\mathbb{E}[\mathcal{A}(x)] = Cp(x)$ for every $x \in \{-1, 1\}^n$ must necessarily have $C = O(n^{-1/2})$. We show this using a (random) cubic form that is bounded, but whose completely bounded norm is poly($n$), following a construction of [DJT95, Theorem 18.16].
Remark. Earlier versions of this work finished this section with a claim that Corollary 4.5 gives a counterexample to possible quartic extensions of Theorem 1.1. Although the claim holds as stated, the proof contained a bug. The faulty proof is omitted from the current version. An explanation of the issue, a corrected proof and stronger examples are presented in [BG22].
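Given a form $p : \mathbb{R}^n \to \mathbb{R}$, we define its norm as $\|p\| = \max_{x \in \{-1, 1\}^n} |p(x)|$. Note that the condition $\|p\| \le 1$ is equivalent to $p$ being bounded.
Theorem 4.1. Let $p = \sum_{\alpha} c_\alpha x^\alpha$ be a random cubic form such that the coefficients $c_\alpha$ are independent uniformly distributed $\{-1, 1\}$-valued random variables. Then, with probability at least $1 - Cne^{-cn}$, we have $\|p\|_{cb} \ge c\sqrt{n}\,\|p\|$.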
We shall use the following standard concentration-of-measure results. The first is the Hoeffding bound [Pol12, Corollary 3 (Appendix B)].
Lemma 4.2 (Hoeffding bound). Let $X_1, \dots, X_m$ be independent uniformly distributed $\{-1, 1\}$-valued random variables and let $a \in \mathbb{R}^m$. Then, for any $\tau > 0$, we have $\Pr\big[\big|\sum_{i=1}^m a_i X_i\big| > \tau\big] \le 2e^{-\tau^2/(2\|a\|_2^2)}$.
The second result is one from random matrix theory concerning upper tail estimates for Wigner ensembles (see [Tao12, Corollary 2.3.6]).
Lemma 4.3. There exist absolute constants $C, c \in (0, \infty)$ such that the following holds. Let $n$ be a positive integer and let $M$ be a random $n \times n$ symmetric matrix such that for $j \ge i$, the entries $M_{ij}$ are independent random variables with mean zero and absolute value at most 1. Then, for any $\tau \ge C$, we have $\Pr\big[\|M\| > \tau\sqrt{n}\big] \le Ce^{-c\tau n}$.
We also use the following proposition.
Proposition 4.4. Let $m, n, t$ be positive integers, let $p \in \mathbb{R}[x_1, \dots, x_n]$ be a $t$-linear form, let $T_p \in \mathbb{R}^{n \times \cdots \times n}$ be as in (3) and let $A_1, \dots, A_n \in L(\mathbb{R}^m)$ be pairwise commuting contractions. Then,
Proof. Consider an arbitrary decomposition $T_p = \sum_{\sigma \in S_t} T^\sigma \circ \sigma$. Then, the definition of the completely bounded norm and triangle inequality show that Since the $A_i$ commute, the above can be re-written as The claim now follows from the definition of $\|p\|_{cb}$ and using the fact that the decomposition of $T_p$ was arbitrary.
Proof of Theorem 4.1. We begin by showing that with high probability, $\|p\| \le O(n^2)$. To this end, let us fix an arbitrary $x \in \{-1, 1\}^n$. Then, $p(x)$ is a sum of at most $n^3$ independent uniformly distributed $\{-1, 1\}$-valued random variables. It therefore follows from Lemma 4.2 that $\Pr\big[|p(x)| > 2n^2\big] \le 2e^{-2n}$. By the union bound over $x \in \{-1, 1\}^n$, it follows that $\|p\| > 2n^2$ with probability at most $2e^{-n}$, which gives the claim. We now lower bound $\|p\|_{cb}$. Let $\tau > 0$ be a parameter to be set later. Let $T \in \mathbb{R}^{n \times n \times n}$ be the random symmetric 3-tensor associated with $p$ as in (3). For every $i \in [n]$, we define the linear map
$$A_i e_1 = e_i, \qquad A_i e_j = \frac{1}{\tau\sqrt{n}} \sum_{k=1}^n T_{i,j,k}\, e_{k+n}, \qquad A_i e_{j+n} = \delta_{i,j}\, e_{2n+1}, \qquad A_i e_{2n+1} = e_1.$$
Observe that for every $i, j, k \in [n]$, we have Since $T$ is symmetric, it follows easily that these maps commute, which is to say that $A_i A_j = A_j A_i$ for every $i, j \in [n]$. In addition, we claim that with high probability, these maps are contractions (i.e., the associated matrices have operator norm at most 1). To see this, for each $i \in [n]$, let $M_i$ be the random matrix given by $M_i = (T_{i,j,k})_{j,k=1}^n$. Observe that $M_i$ is symmetric and its entries have mean zero and absolute value at most 1. By Lemma 4.3 and a union bound, we get that for absolute constants $c, C$ and provided $\tau \ge C$. Now, for any Euclidean unit vector $u \in \mathbb{R}^{2n+2}$, we have It follows from (13) that $\max_i \|M_i\| \le \tau\sqrt{n}$ with probability at least $1 - Cne^{-c\tau n}$, which in turn implies the above is at most $\|u\|_2 \le 1$ and therefore that all $A_i$ have operator norm at most 1.
By Proposition 4.4, provided that the $A_i$ are contractions. By (12), and since $|T_{i,j,k}| \ge 1/6$ for every $i, j, k \in [n]$, the above is at least $n^{5/2}/(36\tau)$ with probability at least $1 - Cne^{-c\tau n}$. Letting $\tau$ be a sufficiently large constant then gives the result.
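The structure of the maps $A_i$ can be explored numerically. The indices in the displayed definition are ambiguous as printed, so the sketch below fixes one reading as an explicit assumption: the basis of $\mathbb{R}^{2n+2}$ is labelled $e_0, \dots, e_{2n+1}$, with $A_i e_0 = e_i$, $A_i e_j = \frac{1}{\tau\sqrt{n}}\sum_k T_{i,j,k} e_{n+k}$, $A_i e_{n+j} = \delta_{i,j} e_{2n+1}$ and $A_i e_{2n+1} = 0$. This may differ from the paper's convention. The tensor is a symmetrized array of random signs, used only as a stand-in for the tensor of the random cubic form, and the helper names are hypothetical.

```python
import itertools
import numpy as np

def random_symmetric_tensor(n, rng):
    """Symmetrize an n x n x n array of random signs (a stand-in, not the paper's exact model)."""
    S = rng.choice([-1.0, 1.0], size=(n, n, n))
    return sum(np.transpose(S, p) for p in itertools.permutations(range(3))) / 6.0

def build_maps(T, tau):
    """Build A_1, ..., A_n on R^{2n+2} under the assumed basis labelling e_0, ..., e_{2n+1}."""
    n = T.shape[0]
    dim = 2 * n + 2
    scale = 1.0 / (tau * np.sqrt(n))
    maps = []
    for i in range(n):
        A = np.zeros((dim, dim))                     # column c holds the image of e_c
        A[1 + i, 0] = 1.0                            # A_i e_0 = e_i
        A[n + 1:2 * n + 1, 1:n + 1] = scale * T[i].T # A_i e_j = scale * sum_k T[i,j,k] e_{n+k}
        A[2 * n + 1, n + 1 + i] = 1.0                # A_i e_{n+j} = delta_{ij} e_{2n+1}
        maps.append(A)                               # last column stays zero (assumption)
    return maps

rng = np.random.default_rng(1)
n = 4
T = random_symmetric_tensor(n, rng)
# Choose tau so that max_i ||M_i|| <= tau * sqrt(n), where M_i = T[i].
tau = max(np.linalg.norm(T[i], 2) for i in range(n)) / np.sqrt(n)
A = build_maps(T, tau)

commute = all(np.allclose(A[i] @ A[j], A[j] @ A[i]) for i in range(n) for j in range(n))
contractions = all(np.linalg.norm(M, 2) <= 1 + 1e-9 for M in A)
print(commute, contractions)
```

Under this reading, the entry of $A_i A_j A_k$ in row $2n+1$ and column $0$ equals $T_{i,j,k}/(\tau\sqrt{n})$, so a product of three maps reads off one tensor entry; symmetry of $T$ is exactly what makes the maps commute.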
As mentioned in the introduction, one can easily extend this result to the case of 4-linear forms. To demonstrate the failure of Theorem 1.1 for quartic polynomials, we "embed" the degree-3 polynomial p in Theorem 4.1 into a degree-4 polynomial q which has high completely bounded norm.
and pairwise commuting contractions $A_1, \dots, A_n \in L(\mathbb{R}^{2n+2})$ such that where $c \in (0, 1]$ is some absolute constant.
Proof. Let $p$ be a bounded multi-linear cubic form such that $\|p\|_{cb} \ge C\sqrt{n}$, the existence of which is guaranteed by Theorem 4.1. Let $T_p \in \mathbb{R}^{n \times n \times n}$ be the random symmetric 3-tensor associated to $p$. Consider the symmetric 4-tensor $S \in \mathbb{R}^{(n+1) \times (n+1) \times (n+1) \times (n+1)}$ defined by $S_{0,j,k,\ell} = T_{j,k,\ell}$, $S_{i,0,k,\ell} = T_{i,k,\ell}$, $S_{i,j,0,\ell} = T_{i,j,\ell}$, $S_{i,j,k,0} = T_{i,j,k}$ for every $i, j, k, \ell \in [n]$ and $S_{i,j,k,\ell} = 0$ otherwise. Since $S$ is symmetric, there exists a unique multi-linear quartic form $q$ associated to $S$. It follows easily that $\|q\| = 4\|p\|$. Moreover, by considering the contractions $A_i$ used in the proof of Theorem 4.1 and defining $A_0 = 1_{n+2}$, it follows that $\|q\|_{cb} \ge 4\|p\|_{cb}$. The form $q/4$ is thus as desired.

Short proof of Theorem 1.1
In this section, we give a short proof of Theorem 1.1.
Proof sketch of Theorem 1.1. We begin by giving a brief sketch of the original proof. The first step is to show that without loss of generality, we may assume that the polynomial $p$ is a quadratic form. This is the content of the decoupling argument mentioned in the introduction, proved for polynomials of arbitrary degree in [AAI+16], but stated here only for the quadratic case.
To prove the theorem, we may thus restrict to a quadratic form $p(x) = x^{\mathsf{T}} A x$ given by some matrix $A \in \mathbb{R}^{n \times n}$ such that $\|A\|_{\infty \to 1} \le 1$. The next step is to massage the matrix $A$ into a unitary matrix (that can be applied by a quantum algorithm). To obtain this unitary, the authors use an argument based on two versions of Grothendieck's inequality and a technique known as variable splitting, developed in earlier work of Aaronson and Ambainis [AA15]. The first version of Grothendieck's inequality is the one most commonly used in applications [Gro53].
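For small matrices, the norm condition above can be checked by brute force. The sketch below assumes the standard definition $\|A\|_{\infty \to 1} = \max_{x, y \in \{-1,1\}^n} x^{\mathsf{T}} A y$, which is not restated in this section, and uses the fact that the inner maximum over $x$ equals $\|Ay\|_1$. The function name `inf_to_one_norm` is a hypothetical helper; the enumeration is exponential in $n$ and is meant only for toy sizes.

```python
import itertools
import numpy as np

def inf_to_one_norm(A):
    """Brute-force max over sign vectors y of ||A y||_1, i.e. max_{x,y} x^T A y."""
    n = A.shape[1]
    best = 0.0
    for signs in itertools.product([-1.0, 1.0], repeat=n):
        y = np.array(signs)
        best = max(best, float(np.abs(A @ y).sum()))  # optimal x is sign(A y)
    return best

A = np.array([[1.0, 1.0],
              [1.0, -1.0]])
print(inf_to_one_norm(A))

# The quadratic form x^T A x never exceeds this norm in absolute value.
for signs in itertools.product([-1.0, 1.0], repeat=2):
    x = np.array(signs)
    assert abs(x @ A @ x) <= inf_to_one_norm(A)
```

Dividing a matrix by this quantity yields a quadratic form with $\|A\|_{\infty \to 1} \le 1$, the normalization assumed in the sketch of the proof.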
Theorem 5.2 (Grothendieck). There exists a universal constant $K_G \in (0, \infty)$ such that the following holds. For every positive integer $n$ and matrix $A \in \mathbb{R}^{n \times n}$, we have $\sup_{d \in \mathbb{N},\ u_i, v_j \in S^{d-1}} \sum_{i,j=1}^n A_{ij} \langle u_i, v_j \rangle \le K_G \|A\|_{\infty \to 1}$.
Elementary proofs of this theorem can be found in [AN06]. The Grothendieck constant $K_G$ is the smallest real number for which Theorem 5.2 holds true. The problem of determining its exact value, posed in [Gro53], remains open. The best lower and upper bounds $1.6769\ldots \le K_G < 1.7822\ldots$ were proved by Davie and Reeds [Dav84, Ree91], and Braverman et al. [BMMN13], respectively. The second version of Grothendieck's inequality is as follows.
Theorem 5.3 (Grothendieck). For every positive integer n and matrix A ∈ R n×n , there exist u, v ∈ (0, 1] n such that u 2 = v 2 = 1 and such that the matrix where Diag(w) denotes the square diagonal matrix whose diagonal is w.
Our contribution. The first (standard) version of Grothendieck's inequality (Theorem 5.2) easily implies that any matrix $A$ such that $\|A\|_{\infty \to 1} \le 1$ has completely bounded norm at most $K_G$. Combining this fact with our Theorem 1.3 and Lemma 5.1, one quickly retrieves Theorem 1.1. However, Theorem 1.3 is based on the rather deep Theorem 3.1. We observe that Theorem 1.1 also follows readily from the much simpler Theorem 5.3 alone (proved below for completeness), after one assumes that $p$ is a quadratic form as above.
Indeed, Theorem 5.3 gives unit vectors u, v such that the matrix B as in (15) has (operator) norm at most 1. Unitary matrices have norm exactly 1 and of course represent the type of operation a quantum algorithm can implement. Moreover, since u, v are unit vectors, they represent (log n)-qubit quantum states. Using the fact that for w, z ∈ R n , we have Diag(w)z = Diag(z)w, we get the following factorization formula (not unlike the one of Corollary 3.2, which is of course no coincidence): x If we assume for the moment that the matrix B actually is unitary, then the right-hand side of (16) suggests the simple one-query quantum algorithm described in Figure 3.
Figure 3: Let $U_u, U_v$ be unitaries that have $u, v$ as their first columns, respectively. The algorithm initializes a $(1 + \log n)$-qubit register in the all-zero state, transforms this state into the superposition $\frac{1}{\sqrt{2}}(e_1 \otimes u + e_2 \otimes v)$, queries the input $x$ via the unitary $\mathrm{Diag}(x)$ applied to the $(\log n)$-qubit register, applies a controlled-$B$, and finishes by measuring the first qubit in the Hadamard basis.
Using (16), we observe that the algorithm returns zero with probability Now, it is clear that the expected value of the measurement result is precisely $p(x)/K_G$, giving Theorem 1.1 with $C = 1/K_G$. In case $B$ is not unitary, one can use the same argument as in the final step of the proof of Theorem 1.3.
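The one-query algorithm of Figure 3 is small enough to simulate directly. The sketch below assumes a real orthogonal $B$, assumes that the controlled-$B$ acts on the branch carrying $v$, and uses arbitrary positive unit vectors $u, v$ rather than those supplied by Theorem 5.3. Under these assumptions it confirms that the expected sign equals $u^{\mathsf{T}} \mathrm{Diag}(x) B\, \mathrm{Diag}(x) v = x^{\mathsf{T}} \mathrm{Diag}(u) B\, \mathrm{Diag}(v) x$, using $\mathrm{Diag}(w) z = \mathrm{Diag}(z) w$. The function name `one_query_expectation` is hypothetical.

```python
import numpy as np

def one_query_expectation(u, v, B, x):
    """Simulate the circuit: superposition, query Diag(x), controlled-B, Hadamard-basis measurement."""
    n = len(x)
    branch0 = u / np.sqrt(2)            # component e_1 ⊗ u
    branch1 = v / np.sqrt(2)            # component e_2 ⊗ v
    branch0 = x * branch0               # query Diag(x) on the second register
    branch1 = x * branch1
    branch1 = B @ branch1               # controlled-B (assumed to act on the e_2 branch)
    plus = (branch0 + branch1) / np.sqrt(2)   # amplitude of outcome zero
    minus = (branch0 - branch1) / np.sqrt(2)  # amplitude of outcome one
    p0 = float(plus @ plus)
    p1 = float(minus @ minus)
    return p0 - p1                      # expected value of the returned sign

rng = np.random.default_rng(2)
n = 4
B, _ = np.linalg.qr(rng.normal(size=(n, n)))  # random orthogonal matrix
u = rng.uniform(0.1, 1.0, size=n)
u /= np.linalg.norm(u)
v = rng.uniform(0.1, 1.0, size=n)
v /= np.linalg.norm(v)
x = rng.choice([-1.0, 1.0], size=n)

expected = x @ (np.diag(u) @ B @ np.diag(v)) @ x
print(np.isclose(one_query_expectation(u, v, B, x), expected))
```

The cross term between the two branches is what survives in the difference of the outcome probabilities, so a single query applied to both branches in superposition evaluates the whole bilinear expression at once.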

Factorization version of Grothendieck's inequality
For completeness and because of its relevance to Theorem 1.1, we here give a proof of Theorem 5.3. The proof relies on the standard version of Grothendieck's inequality (Theorem 5.2). In addition, the proof makes use of the following version of the Hahn-Banach theorem [Rud91, Theorem 3.4].
Theorem 5.4 (Hahn-Banach separation theorem). Let $C, D \subseteq \mathbb{R}^n$ be convex sets and let $C$ be algebraically open. Then the following are equivalent:
• The sets $C$ and $D$ are disjoint.
• There exists a vector $\lambda \in \mathbb{R}^n$ and a constant $\alpha \in \mathbb{R}$ such that $\langle \lambda, c \rangle < \alpha$ for every $c \in C$ and $\langle \lambda, d \rangle \ge \alpha$ for every $d \in D$.
Moreover, if $C$ and $D$ are convex cones,$^7$ we may take $\alpha = 0$.
where the second inequality is by AM-GM inequality. Define the set K ⊆ R n×n by M k x k , y n i,j=1 : d ∈ N, x i , y j ∈ R d .
We claim that $K$ is a convex cone. Observe that for every $t \in \mathbb{R}_+$ and matrix $Q \in K$ given by vectors $x_i, y_j$, the vectors $x_i' = \sqrt{t}\,x_i$ and $y_j' = \sqrt{t}\,y_j$ similarly define $tQ$, and so $K$ is a cone. We now show that $K$ is a convex set. Let $Q, Q' \in K$ be specified by $x_i, y_j$ and $x_i', y_j'$ respectively. Then, for any $\lambda \in [0, 1]$, the convex combination $\lambda Q + (1 - \lambda) Q'$ also belongs to $K$, as it can be specified by the vectors $(\sqrt{\lambda}\,x_i, \sqrt{1 - \lambda}\,x_i')$ and $(\sqrt{\lambda}\,y_j, \sqrt{1 - \lambda}\,y_j')$. Additionally, it follows from Eq. (17) that $K$ is disjoint from the open convex cone $\mathbb{R}^{n \times n}_{<0}$ of matrices with strictly negative entries. By Theorem 5.4 (the Hahn-Banach separation theorem), we conclude that there exists a nonzero matrix $L \in \mathbb{R}^{n \times n}$ such that $\langle L, Q \rangle \ge 0$ for every $Q \in K$ and $\langle L, N \rangle < 0$ for every $N \in \mathbb{R}^{n \times n}_{<0}$. In particular, the second inequality implies that $L \in \mathbb{R}^{n \times n}_+$. Let $P = L / \sum_{i,j} L_{ij}$, so that $\{P_{ij}\}_{i,j=1}^n$ defines a probability distribution over $[n]^2$. Then, for any $Q \in K$, M k x k , y , where $\sigma_i = P_{i1} + \cdots + P_{in}$ and $\mu_j = P_{1j} + \cdots + P_{nj}$. Observe that $\sigma_i, \mu_j$ are strictly positive because $P_{ij} > 0$. Rearranging the inequality above and using bi-linearity, it follows that for every $\lambda > 0$, we have In particular, for the case where $x_k, y_\ell \in \mathbb{R}$, i.e., the scalar case, we have $x^{\mathsf{T}} M y \le \|\mathrm{diag}(\sigma)^{1/2} x\|_2\, \|\mathrm{diag}(\mu)^{1/2} y\|_2$.
The theorem follows by letting