Abstract 1 Introduction 2 Preliminaries 3 Partial transpose rank 4 Non-commutative set-multilinear formula lower bounds from PT-rank 5 Commutative set-multilinear formula lower bounds from PT-rank 6 Upper bounds and candidate hard matrices for PT-rank References

Multi-Quadratic Sum-Of-Squares Lower Bounds Imply VNC1 VNP

Benjamin Rossman ORCID Duke University, Durham, NC, USA Davidson Zhu ORCID Duke University, Durham, NC, USA
Abstract

The sum-of-squares (SoS) complexity of a d-multiquadratic polynomial f (quadratic in each of d blocks of n variables) is the minimum s such that f=i=1sgi2 with each gi d-multilinear. In the case d=2, Hrubeš, Wigderson and Yehudayoff [13] showed that an n1+Ω(1) lower bound on the SoS complexity of explicit biquadratic polynomials implies an exponential lower bound for non-commutative arithmetic circuits. In this paper, we establish an analogous connection between general multiquadratic sum-of-squares and commutative arithmetic formulas. Specifically, we show that an ndo(logd) lower bound on the SoS complexity of explicit d-multiquadratic polynomials, for any d=d(n) with ω(1)d(n)O(lognloglogn), would separate the algebraic complexity classes 𝖵𝖭𝖢𝟣 and 𝖵𝖭𝖯.

Keywords and phrases:
sum-of-squares, arithmetic formulas
Copyright and License:
[Uncaptioned image] © Benjamin Rossman and Davidson Zhu; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Algebraic complexity theory
Related Version:
Full Version: https://arxiv.org/abs/2512.01227
Editor:
Shubhangi Saraf

1 Introduction

Computational complexity theory aims to categorize computational problems by the amount of time and resources they need. The algebraic complexity classes proposed by Leslie Valiant [31], on the other hand, were created for understanding the tractability of polynomials, as opposed to Boolean functions. Arithmetic circuits, branching programs and formulas are defined as analogs of their Boolean counterparts.

Algebraic complexity classes 𝖵𝖥(=𝖵𝖭𝖢𝟣)𝖵𝖡𝖯𝖵𝖯(=𝖵𝖭𝖢𝟤)𝖵𝖭𝖯 consist of sequences of poly(n)-degree n-variate polynomials that are computable by poly(n)-size arithmetic formulas, branching programs, circuits and projections of circuits, respectively. As with the analogous Boolean classes (𝖭𝖢𝟣𝖭𝖫𝖯𝖭𝖯), these four algebraic classes are believed to be distinct. However, it remains a major open problem even to separate 𝖵𝖭𝖢𝟣 from 𝖵𝖭𝖯, despite significant progress on lower bounds for formulas, branching programs and circuits in various restricted settings: low-depth [2, 17, 18], multilinear [9, 20, 23], set-multilinear [3, 16, 25], and non-commutative [6, 19, 30] (to mention just a few representative works).

1.1 Biquadratic sum-of-squares and non-commutative circuits

Let f𝔽[X1,,Xn,Y1,,Yn] be a biquadratic polynomial over a field 𝔽 with monomials of the form Xi1Xi2Yj1Yj2. The (biquadratic) sum-of-squares complexity of f, denoted SoS(f), is the minimum number s of bilinear polynomials g1,,gs such that f=g12++gs2. When 𝔽 is algebraically closed, we have SoS(f)n2 for all f, while a dimension-counting argument shows that SoS(f)=Ω(n2) for almost all f.

Sum-of-squares complexity has been intensively studied for the separable biquadratic polynomial

qn:=(X12++Xn2)(Y12++Yn2).

Over and , trivial bound are given by

nSoS(qn)SoS(qn)n2.

A classical theorem of Hurwitz from 1898 shows that SoS(qn)=n iff n{1,2,4,8} [15]. An upper bound SoS(qn)=O(n2logn) is given by the classical Radon-Hurwitz construction (see [22, 28]). Asymptotically, this remains the strongest known upper bound over . It was also the strongest known bound over , until a recent breakthrough of Hrubeš [14]:

Theorem 1.1 ([14]).

There exists a sum-of-squares decomposition of qn over with just O(n1.62) terms, that is, SoS(qn)=O(n1.62).

As for lower bounds, Adams and Atiyah [1] showed SoS(qn)2log2n; in particular, SoS(qn)2n2 infinitely often. Slightly better bounds have been obtained for specific values of n (and for general fields of characteristic 2) using techniques from algebraic K-theory [7]. However, no lower bound better than SoS(qn)=Ω(n) is currently known.

A striking result of Hrubeš, Wigderson and Yehudayoff [13] explains the difficulty of improving this lower bound. They show that proving a sufficiently super-linear lower bound on SoS(qn) would have a dramatic consequence in non-commutative algebraic circuit complexity:

Theorem 1.2 ([13]).

Let f=(fn) be any explicit sequence111The sequence f=(fn) is explicit if the coefficient in fn of each monomial Xi1Xi2Yj1Yj2 is computable in polylog(n) time given n and indices i1,i2,j1,j2[n]. of biquadratic polynomials over . If SoS(f)=Ω(n1+ε) for any ε>0, then the permanent PERMn requires exponential size non-commutative circuits; in particular, this implies a separation of complexity classes 𝖵𝖯𝗇𝖼 and 𝖵𝖭𝖯𝗇𝖼 (the non-commutative algebraic analogues of 𝖯 and 𝖭𝖯). Moreover, a lower bound SoS(f)=Ω(nlog5n) suffices for the separation 𝖵𝖯𝗇𝖼𝖵𝖭𝖯𝗇𝖼.

Notice that Hrubeš’s upper bound SoS(qn)=O(n1.62) does not rule out the possibility of separating 𝖵𝖯𝗇𝖼 and 𝖵𝖭𝖯𝗇𝖼 via an Ω(n1+ε) lower bound. (We remark that Theorems 1.1 and 1.2 hold more generally over any algebraically closed field, but are not known to extend to ; see the discussion in §1.4.)

1.2 Multiquadratic sum-of-squares and commutative set-multilinear formulas

The main theorem of this paper shows that strong enough lower bounds for multiquadratic sum-of-squares imply super-polynomial lower bounds for commutative arithmetic formulas.

Let f be a d-multiquadratic polynomial over an algebraically closed field, that is, a linear combination of monomials Xi1(1)Xj1(1)Xid(d)Xjd(d) where i1,j1,,id,jd[n]. As in the biquadratic setting (d=2), the (multiquadratic) sum-of-squares complexity of f, denoted SoS(f), is defined as the minimum number s of d-multilinear polynomials g1,,gs such that f=g12++gs2. Similar to the biquadratic setting, we have SoS(f)O(nd) for all f, and SoS(f)=Ω((n/2)d) for almost all f (Prop. 3.17).

Our main result – stated informally below and formally in the next subsection – shows that a nearly optimal lower bound on the SoS complexity of explicit d-multiquadratic polynomials would separate the algebraic complexity classes 𝖵𝖭𝖢𝟣 and 𝖵𝖭𝖯.

Theorem 1.3 (informal).

Let f=(fn) be any explicit sequence of d-multiquadratic polynomials over an algebraically closed or finite field of characteristic 2. If SoS(f)=ndo(logd), then this yields an nΩ(logd) lower bound on the set-multilinear formula size of associated d+1-multilinear polynomials f~=(f~n). If in addition d=d(n) satisfies ω(1)d(n)O(lognloglogn), we get a separation 𝖵𝖭𝖢𝟣𝖵𝖭𝖯 (via a lemma of Raz [24]).

While Theorem 1.3 is analogous to the result of Hrubeš, Wigderson and Yehudayoff [13], it is closer in nature to a well-known theorem of Raz [24] linking tensor rank to set-multilinear formula lower bounds (see the discussion in §1.4).

1.3 Partial transpose rank of 𝒏𝒅×𝒏𝒅 matrices

Definition 1.4.

Let 𝒳={Xi1(1)}i1=1n{Xi2(2)}i2=1n{Xid(d)}id=1n be d disjoint blocks of n variables. A polynomial P(𝒳) is d-multiquadratic if each monomial of P(𝒳) has total degree exactly 2 in each block.

Every nd×nd matrix M gives rise to a d-multiquadratic polynomial defined by

QM:=i,j[n]dMi,jXi1(1)Xj1(1)Xid(d)Xjd(d).

Conversely, for fields of characteristic 2, we get a bijective correspondence between d-multiquadratic polynomials (with d blocks of n variables) and nd×nd matrices satisfying M=M1==Md, where k is the k-th partial transpose operation which swaps row-index ik and column-index jk. We refer to such matrices as fully symmetric.

The actual main theorem of this paper concerns a generalization of SoS complexity called partial transpose rank – or PT-rank for short – which is a complexity measure on general nd×nd matrices (not just fully symmetric ones). We say that an nd×nd matrix P is PT-basic if Pκ has rank 1 for any κ[d], where κ is the composition of (commuting) operations k for kκ. We then define the PT-rank of any nd×nd matrix M as the minimum number r of PT-basic matrices P1,,Pr such that M=P1++Pr. This complexity measure is closely related to SoS complexity: for fully symmetric matrices M (satisfying M=Mk for all k[d]), we show that PT-rank(M) and SoS(QM) are equivalent to an O(2d) factor over any algebraically closed or finite field of characteristic 2 (Prop. 3.15).

Our main theorem relates the PT-rank of an arbitrary nd×nd matrix M with the set-multilinear formula size of an associated multilinear polynomial of degree d+1 in dn+(d1)n2+dn variables:

Q~M:=i,j[n]dMi,jX~i1(0)X~j1,i2(1)X~jd1,id(d1)X~jd(d).
Theorem 1.5 (Main theorem).

For every nd×nd matrix M (over any field), the set-multilinear formula size of Q~M is at least PT-rank(M)/ndlogd+1.

Via a standard set-multilinearization lemma of Raz [24], we get the following corollary:

Corollary 1.6.

Suppose that PT-rank(M)=ndo(logd) where M is any explicit sequence of nd×nd matrices with ω(1)d(n)O(lognloglogn). Then Q~M has arithmetic formula size at least n(1o(1))logd and, hence, 𝖵𝖭𝖢𝟣𝖵𝖭𝖯.

(Theorem 1.3 on SoS complexity follows directly from Theorem 1.5 and Corollary 1.6. For a d-multiquadratic polynomial f=QM, the multilinear polynomial f~ of Theorem 1.3 is given by Q~M.)

Since almost all M have PT-rank (12o(1))nd (Prop. 3.16), finding an explicit M with PT-rank(M)=ndo(logd) is a classic problem of “finding hay in a haystack”. By finding a very explicit M (with Q~M in 𝖵𝖯 rather than 𝖵𝖭𝖯), Corollary 1.6 could conceivably be used to show 𝖵𝖭𝖢𝟣𝖵𝖯. On the flipside, an additional result of this paper (Theorem 6.6) rules out lower bounds, over fields of characteristic 2, whenever Q~M belongs to the class 𝖵𝖡𝖯𝗈𝗋𝖽 of set-multilinear polynomials computable by a polynomial-size ordered branching program. This negative result leverages the recent upper bound SoS(qn)=O(n1.62) of Hrubeš [14] (Theorem 1.1). We also show that PT-rank(M) is at most nO(d0.7) when M is a Kronecker product of d n×n matrices over any fields of characteristic 2 (Theorem 6.3); this observation rules out certain explicit nd×nd matrices as candidates for having PT-rank ndo(logd).

So far as we know, over GF(2), it is possible that the nd×nd identity matrix Ind has PT-rank ndo(logd) or even exactly nd. If so, by Theorem 1.5 this implies an n(1o(1))logd lower bound on the formula size of iterated matrix multiplication IMMn,d+1 (=Q~Ind), which would separate classes 𝖵𝖭𝖢𝟣 and 𝖵𝖡𝖯 over GF(2).

1.4 Related work

Following the influential paper of Hrubeš, Wigderson and Yehudayoff [13], subsequent work by various authors [3, 5, 8] has found additional reductions between various sum-of-squares complexity measures and algebraic circuit lower bounds. Most recently, Dutta, Saxena and Thierauf [8] showed that an d0.5+Ω(1) lower bound on the support-sum of any explicit degree-d univariate polynomial would separate 𝖵𝖯 and 𝖵𝖭𝖯. However, none of the prior work has explored the d-multiquadratic setting, nor implications specific to arithmetic formulas.

More closely related to our main theorem is the well-known theorem of Raz [24] that an ndo(d) lower bound on the tensor rank of any explicit d-multilinear polynomial g (i.e. order-d tensor [n]d𝔽) implies an nΩ(logd) lower bound on the set-multilinear formula size of the same polynomial g. The further implication 𝖵𝖭𝖢𝟣𝖵𝖭𝖯 when ω(1)d(n)O(lognloglogn) follows from an additional lemma in [24], which we also invoke above.

We remark that a lower bound SoS(f)=ndo(logn) for an explicit d-multiquadratic polynomial f (i.e. fully symmetric nd×nd matrix) is conceivable without having to break the longstanding O(nd/2) tensor rank barrier. Indeed, Theorem 1.3 appears to evade the known barriers for rank-based lower bound methods identified by Efremenko et al [10] and Garg et al [11].

1.5 Outline of the paper

The rest of the paper is organized as follows: §2 presents key definitions pertaining to tensors, matrices, and arithmetic formulas. §3 formally introduces partial transpose rank and explores its properties. §4 proves a weaker (but easier and illustrative) version of our main theorem – with respect to non-commutative set-multilinear formulas. §5 proves the main theorem – with respect to commutative set-multilinear formulas. §6 discusses upper bounds for PT-rank and candidate explicit hard matrices.

2 Preliminaries

Throughout this paper, n,d are arbitrary positive integers, where we regard n as a growing parameter and d=d(n) as a function of n.

We write [n] for the set {1,,n} and logn for the base-2 logarithm of n. We write ,:[n]2[n2] for the pairing bijection defined by i,j:=(i1)n+j.

2.1 The underlying field

Throughout this paper, 𝔽 is an arbitrary field. We will explicitly state whenever an assumption on 𝔽 is required. In particular, our main theorem (Theorem 1.5) and other results concerning PT-rank work for completely general fields (including characteristic 2).

The only results that depend on 𝔽 concern the relationship between PT-rank and SoS complexity (Lemma 3.15) and the consequence of our main theorem stated in the terms of SoS complexity (Theorem 1.3). Even here, our results apply to a large class of fields of characteristic 2, including all algebraically closed fields and finite fields.

2.2 Matrices and tensors

In the Introduction, our main theorem was stated first in terms of SoS complexity of d-multiquadratic polynomials, then more generally in terms of PT-rank of nd×nd matrices. These rank measures were linked to the algebraic complexity of a third kind of object: d+1-multilinear polynomials. It is convenient to regard these objects as different flattenings of a tensor [n]2d𝔽 to tensors of format [n2]d or [nd]2 or [n]×[n2]d1×[n], respectively.

Definition 2.1 (Tensors).

An order d tensor is a function A:[n1]××[nd]𝔽 where n1,,nd are nonnegative integers.222All definitions and claims in this paper can be rephrased in a basis-free manner, that is, regarding A as a multilinear function 𝔽n1××𝔽nd𝔽 without reference to any specific bases for vector spaces 𝔽ni. We shall mainly (but not exclusively) consider tensors [n]d𝔽 where n1==nd=n. More generally, we consider tensors A:[n]D𝔽 with coordinates indexed by a nonempty finite set D.333Specifically, in §5 we consider tensors where D is the directed edge set of an arbitrary subgraph of Pathd.

For tensors A:[n]d𝔽 and B:[n]e𝔽, their tensor product is denoted by AB:[n]d+e𝔽. For tensors A:[n]D𝔽 and B:[n]E𝔽 with D and E disjoint, their tensor product is denoted by AB:[n]DE𝔽.

The tensor rank of a tensor A:[n1]××[nd]𝔽, denoted tensor-rank(A), is the minimum number r such that A admits a decomposition A=j=1r(A1,jAd,j) for some collection of order 1 tensors Ai,j:[ni]𝔽.444We define tensor rank solely for the sake of comparing our main result (Theorem 1.5) with Raz’s theorem shows that an ndo(d) lower bound on tensor rank implies 𝖵𝖭𝖢𝟣𝖵𝖯 [24], as discussed in §1.4. Elsewhere in this paper, we only every consider the matrix rank of flattened tensors.

Definition 2.2 (Matrices).

A matrix is an order 2 tensor M:[m1]×[m2]𝔽, also written as M𝔽m1×m2. The rank of M is denoted by rank(M); note that this is a special case tensor rank. The transpose of M is denoted by M𝖳𝔽m2×m1. (We use a bolder font 𝖳 to distinguish the full transpose from partial transpose operations κ on nd×nd matrices introduced in Def. 3.1.)

For matrices M:[m1]×[m2]𝔽 and N:[n1]×[n2]𝔽, their Kronecker product is denoted by MN:[m1n1]×[m2n2]𝔽 (not to be confused with the tensor product MN:[m1]×[m2]×[n1]×[n2]𝔽.)

The n×n identity matrix is denoted by In. We write Ind for the order 2d tensor InIn (d times) (not to be confused with the nd×nd matrix Ind=InIn, which is a tensor of order 2).

When M is an nd×nd matrix, we index its rows and columns by d-tuples in [n]d. That is, we express the entries of M as M(i1,,id),(j1,,jd).

Definition 2.3 (Matrix flattening of a tensor).

For a tensor A:[n]D𝔽 and partition IJ=D, we denote MatI,J(A) for the flattened matrix

MatI,J(A):[n|I|]×[n|J|]𝔽.

2.3 Set-multilinear formula size

There is a one-to-one correspondence between order d tensors A:[n1]××[nd]𝔽 and set-multilinear polynomials PA𝔽𝗌𝗆[𝒳] in variables 𝒳={Xa(i):i[d],a[ni]}. Namely, A describes the coefficients of PA:

PA(𝒳)=(a1,,ad)[n1]××[nd]Aa1,,adXa1(1)Xad(d).

For any nonempty finite set D, we have a similar correspondence between tensors A:[n]D𝔽 and set-multilinear polynomial PA𝔽𝗌𝗆[𝒳] in variables 𝒳={Xa(i):iD,a[n]}.

Definition 2.4 (Set-multilinear formula size).

The set-multilinear formula size of a tensor A:[n]D𝔽, denoted 𝗌𝗆(A), is the minimum number of leaves in a syntactically set-multilinear arithmetic formula that computes the associated polynomial PA. This may also be defined directly by induction:

  • If |D|=1, then 𝗌𝗆(A) is 1 if A is non-zero anywhere and 0 otherwise.

  • If |D|2, then 𝗌𝗆(A) is the minimum value of i(𝗌𝗆(Bi)+𝗌𝗆(Ci)) over indexed families {(Ei,Fi,Bi,Ci)}i (where i runs over an arbitrary, unnamed index set) such that Ei,Fi are disjoint nonempty sets satisfying EiFi=D and Bi:[n]Ei𝔽 and Ci:[n]Fi𝔽 are tensors satisfying A=iBiCi.

For further background on algebraic complexity measures, see [4, 26, 29].

2.4 𝒏𝒅×𝒏𝒅 matrices and their shifted tensors

Recall that our main theorem, Theorem 1.5, relates PT-rank(M) to the set-multilinear formula size of a related polynomial. This polynomial is denoted Q~M in the statement of Theorem 1.5. However, it is convenient to speak of the associated tensor, which we shall refer to as the “shifted tensor” of M.

Definition 2.5 (The shifted tensor of an nd×nd matrix M).

For any nd×nd matrix M, we defined the reshaped tensor ShiftedTensor(M):[n2]d+1𝔽 as follows. Recall that , is a bijection [n]2[n2]. For all p,i1,j1,,id,jd,q[n], we have
ShiftedTensor(M)(p,i1,j1,i2,,jd1,id,jd,q) :=𝟙{p=q=1}M(i1,,id),(j1,,jd).

Note that the set-multilinear polynomial PShiftedTensor(M) is precisely Q~M of Theorem 1.5.

Definition 2.6 (Iterated Matrix Multiplication).

An important example of a “shifted tensor” is the Iterated Matrix Multiplication tensor IMMn,d:[n2]d𝔽 defined by

IMMn,d(a1,b1,,ad,bd):=𝟙[(b1,,bd1)=(a2,,ad)].

The familiar corresponding set-multilinear polynomial describes the sum of entries in the product of d symbolic n×n matrices:

PIMMn,d(𝒳)=a0,,ad[n]Xa0,a1(1)Xa1,a2(2)Xad1,ad(d).

Note that IMMn,d=ShiftedTensor(Ind1), that is, IMMn,d is the shifted tensor of the nd1×nd1 identity matrix.

3 Partial transpose rank

The partial transpose is a key concept in quantum information theory, particularly in studying entanglement and separability of quantum states [12, 21]. In this section, we introduce the partial transpose rank of an nd×nd matrix. To the best of our knowledge, this notion has not been explicitly studied before. As we show, PT rank generalizes SoS complexity (for fully symmetric matrices over algebraically closed or finite fields of characteristic 2).

We begin by defining the partial transpose operations on nd×nd matrices.

Definition 3.1 (Partial transpose).

Let M be an nd×nd matrix.

  • For k[d], the k-th partial transpose of M, denoted Mk, is the nd×nd matrix defined by

    M(i1,,id),(j1,,jd)k:=M(i1,,ik1,jk,ik+1,,id),(j1,,jk1,ik,jk+1,,jd).
  • For κ[d], the κ-partial transpose of M is the matrix Mκ:=Mk1kc where k1,,kc are the distinct elements of κ (in any order).

Note that M=M and M[d]=M𝖳 (the usual transpose of M). Also note that

Mκ1κ2=Mκ2κ1=Mκ1κ2

where κ1κ2=(κ1κ2)(κ2κ1) is the symmetric difference of κ1 and κ2. In particular, the transpose of Mκ is Mκ¯ where κ¯:=[d]κ is the complement of κ. If M is symmetric (i.e. M=M), it follows that Mκ=Mκ¯.

Definition 3.2 (Fully symmetric matrices).

We say that an nd×nd matrix M is fully symmetric if Mk=M for all k[d] (equivalently: if Mκ=M for all κ[d]).

Example 3.3.

Let d=2 and n=3. Suppose M is a partitioned 32×32 matrix with 3×3 blocks M11,,M33. Then

M=M =[M11M12M13M21M22M23M31M32M33], M{1}=M1 =[M11M12M13M21M22M23M31M32M33],
M{2}=M2 =[M11M21M31M12M22M32M13M23M33], M{1,2}=M =[M11M21M31M12M22M32M13M23M33].

When d=2 (as in this example), the operation 1 is also known as the block transpose, since it transposes the individual blocks within a partitioned n2×n2 matrix.

Definition 3.4 (Partial transpose rank).

Let M be an nd×nd matrix.

  • M is PT-basic if there exists κ[d] such that rank(Mκ)=1.

  • The PT-rank of M is the minimum number of PT-basic matrices that sum to M.

The next lemma gives an alternative characterization of PT-rank.

Lemma 3.5.

The PT-rank of an nd×nd matrix M equals the minimum value of κ[d1]rank(Nκκ) over all decompositions M=κ[d1]Nκ.

Proof.

Let r be the PT-rank of M and let B1,,Br be PT-basic matrices that sum to M. For each Bi, there exists κi[d] with rank(Biκi)=1. Without loss of generality, we may choose κi[d1], since rank(Biκi)=rank((Biκi)𝖳)=rank(Biκi¯). Now define Nκ:=i[r]:κi=κBi.

The next two examples illustrate PT-rank in the case d=2.

Example 3.6.

Let d=2 and n=2. The 22×22 identity matrix has PT-rank 2, since it is not PT-basic, but is a sum of PT-basic matrices:

[1000010000100001] =[1001000000001001]+[0000011001100000]1

3.1 Properties of PT-rank

Unlike the full transpose (κ=[d]), the partial transpose operation can increase matrix rank when κ[d]. (See full version of the paper for proofs of lemmas in this section.)

Lemma 3.7.

For every nd×nd matrix M and κ[d], we have

rank(Mκ)n2min{|κ|,d|κ|}rank(M).

Moreover, this inequality is best possible in general.

The partial transpose also behaves differently than the full transpose with respect to products of matrices. For two nd×nd matrices P and M, we have (PM)=MP; additionally, rank((PM))=rank(M) whenever P is invertible. With respect to the partial transpose, we get analogous equations only in the case that P (or M) is a Kronecker product of d n×n matrices.

Lemma 3.8.

Let P and M be nd×nd matrices such that P=B1Bd for n×n matrices B1,,Bd. Then for all κ[d],

(PM)κ=LMκRwhere L=k=1d{Inif kκBkotherwise and R=k=1d{Bk𝖳if kκInotherwise.

Moreover, if B1,,Bd are nonsingular then rank((PM)κ)=rank(Mκ).

Corollary 3.9.

PT-rank is invariant under the action of GL(n)d on nd×nd matrices.

Corollary 3.9 shows that PT-rank generalizes to a complexity measure on multilinear functions V1××VdV1××Vd (equivalently: on tensors in c=1d(VcVc)) for any n-dimensional 𝔽-vector spaces V1,,Vd, independent of a choice of bases.

Note that the PT-rank of a matrix depends on its assumed dimension. For example, the PT-rank of an n4×n4 matrix is different from its PT-rank treated as a n^2×n^2 matrix where n^=n2, as a different set of partial transposes are allowed. When there is ambiguity, we write PT-rank[n]4(M) or PT-rank[n2]2(M) to distinguish these two cases.

Lemma 3.10.

PT-rank[n]pq(M)PT-rank[np]q(M) whenever d=pq with p,q1.

Proof.

Write M for the npq×npq matrix and N for the equivalent n^q×n^q matrix with n^=np. For λ[q], note that

rank(Nλ)=rank(Mκ)where κ={k+(l1)p:k[p],l[λ]}.

The claimed inequality follows from this fact and the characterization of PT-rank given by Lemma 3.5.

3.2 PT-rank and SoS complexity

We now explain the close relationship between PT-rank of nd×nd matrices and SoS complexity of d-multiquadratic forms.

Definition 3.11 (Multiquadratic polynomial QM).

For an n1nd×n1nd matrix M, let QM denote the d-multiquadratic polynomial

QM:=i,j[n1]××[nd]Mi,jXi1(1)Xj1(1)Xid(d)Xjd(d).

For example, the identity matrix In3 has corresponding 3-multiquadratic polynomial

qIn3=(i[n]Xi2)(i[n]Yi2)(i[n]Zi2)

where we write Xi,Yi,Zi for Xi(1),Xi(2),Xi(3). (The biquadratic polynomial qn in the Introduction is given by qIn2 in this notation.)

Lemma 3.12.

Let 𝔽 be algebraically closed field of characteristic 2. Then over 𝔽, we have SoS(QM)PT-rank(M).

Proof.

Suppose SoS(QM)=m<. Then, there exist an m×nd matrix L such that

QM ==1m(i[n]dL,iXi1(1)Xid(d))2=i,j[n]d(L𝖳L)i,jXi1(1)Xj1(1)Xid(d)Xjd(d).

Over an algebraically closed field, we have the familiar fact

{symmetric matrices of rankm}={L𝖳L:L𝔽m×nd}.

Therefore,
SoS(QM)=min{rank(D):symmetric D s.t. QM=i,j[n]dDi,jXi1(1)Xj1(1)Xid(d)Xjd(d)}.

Since variables commute, the righthand equation is equivalent to M=12dκ[d]Dκ. That is,

SoS(QM)min{rank(D):M=12dκ[d1]Dκ}.

Given any decomposition M=κ[d]Pκκ with PT-rank(M)=κrank(Pκ), let D=12dκPκ. Then

12dκDκ=12dκλPλκ=12dκ(λPλλ)κ=12dκMκ=M.

Therefore, SoS(QM)rank(D)κrank(Pκ)=PT-rank(M).

 Remark 3.13.

Over finite fields 𝔽, the bound SoS(QM)1PT-rank(M) can be proved using same ideas as above. The only difference is that following the result from [27], every symmetric matrix of rank m can be expressed as L𝖳L for L𝔽(m+1)×nd, thus comes the additional 1.

As we show next, the bounds above are tight up to O(2d) factor for fully symmetric matrices.

Lemma 3.14.

If M is a fully symmetric matrix over a field of characteristic 2, then PT-rank(M)2d1SoS(QM).

Proof.

Suppose SoS(QM)=m<. Then there exists an m×nd matrix L such that

QM ==1m(i[n]dL,iXi1(1)Xid(d))2=i,j[n]d(L𝖳L)i,jXi1(1)Xj1(1)Xid(d)Xjd(d).

Since these are commutative polynomials (in particular, Xik(k)Xjk(k)=Xjk(k)Xik(1)), it follows that

κ[d]Mκ=κ[d](L𝖳L)κ.

Since M is fully symmetric and L𝖳L is symmetric, it follows that 2d1M=κ[d1](L𝖳L)κ. Therefore, we have M=κ[d1]Nκ for matrices Nκ:=12d1(L𝖳L)κ. Since rank((Nκ)κ)=rank(L𝖳L)=s for each κ[d1], we have PT-rank(M)2d1s=2d1SoS(QM).

Lemmas 3.14 and 3.12 together imply:

Proposition 3.15.

Let M be an nd×nd matrix over algebraically closed or finite field of characteristic 2. Then SoS(QM)1PT-rank(M)2d1SoS(QM).

3.3 PT-rank of almost all matrices

Although not required for any other results in this paper, we conclude this section by observing that (asymptotically) almost all nd×nd matrices have PT-rank Ω(nd), and (asymptotically) almost all fully symmetric nd×nd matrices have PT-rank Ω((n/2)d). These results are established standard dimension-counting arguments. (See full paper for proofs.)

Proposition 3.16.

Asymptotically almost all nd×nd matrices have PT-rank at least (12ε)nd for every constant ε>0.

Since the set of fully symmetric nd×nd matrices has dimension n2d/2d, a similar counting argument shows:

Proposition 3.17.

Asymptotically almost all fully symmetric nd×nd matrices have PT-rank at least (12d+1ε)nd for any constant ε>0.

4 Non-commutative set-multilinear formula lower bounds from PT-rank

Recall the main theorem of this paper:

Theorem 1.5 (restated).

For every nd×nd matrix M, we have

𝗌𝗆(ShiftedTensor(M)) PT-rank(M)ndlogd+1.

As a helpful warm-up to the proof in §5, in this section we shall prove a strictly weaker – but slightly simpler – version of Theorem 1.5 that lower bounds the non-commutative set-multilinear formula size of ShiftedTensor(M). We start with the definition:

Definition 4.1.

The non-commutative set-multilinear formula size of a tensor A:[n]d𝔽, denoted 𝗇𝖼,𝗌𝗆(A), is defined inductively as follows:

  • If d=1, then 𝗇𝖼,𝗌𝗆(A) equals 1 if A is non-zero at any point and 0 otherwise.

  • If d2, then 𝗇𝖼,𝗌𝗆(A) is the minimum value of i(𝗇𝖼,𝗌𝗆(Bi)+𝗇𝖼,𝗌𝗆(Ci)) over indexed families {(ei,fi,Bi,Ci)}i where ei,fi are positive integers satisfying ei+fi=d and Bi:[n]ei𝔽 and Ci:[n]fi𝔽 are tensors such that A=iBiCi.

In the remainder of this section, we prove the following:

Theorem 4.2 (Non-commutative version of Theorem 1.5).

For every nd×nd matrix M, we have

𝗇𝖼,𝗌𝗆(ShiftedTensor(M)) PT-rank(M)ndlogd+1.

4.1 The complexity measure 𝝆 on order 𝟐𝒅 tensors

We introduce a complexity measure on tensors A:[n]2d𝔽, denoted ρ(A), which is closely related to PT-rank (normalized by factor nd), but with an added “max” quantifier that affects the 1 and k partial transposes. (A precise relationship between ρ and PT-rank is noted at the end of this section.)

Definition 4.3.

For any a,b{0,1} and α{0,1}d1, we define a partition Ia,α,bJa,α,b=[2d] by

Ia,α,b :=[2d]{ a, 2+α1, 4+α2,, 2d2+αd, 2d+b },
Ja,α,b :=[2d]{1 a, 3α1, 5α2,, 2d1αd, 2d+1b }.

For a tensor A:[n]2d𝔽, we define its relative rank with respect to (a,α,b){0,1}d+1 as follows:

relrka,α,b(A) :=ndrank(MatIa,α,b,Ja,α,b(A)).
Lemma 4.4.

relrka,α,b(A)n𝟙[ab].

Proof.

Follows from the observation that

|Ia,α,b| =d1+𝟙[a=1]+𝟙[b=0],
|Ja,α,b| =d1+𝟙[a=0]+𝟙[b=1],

and therefore min{|Ia,α,b|,|Ja,α,b|}=d𝟙[ab].

The next lemmas show that relrk is sub-additive and multiplicative.

Lemma 4.5.

relrka,α,b(iAi)irelrka,α,b(Ai).

Proof.

Follows from sub-additivity of rank.

Lemma 4.6.

For all A:[n]2d𝔽 and B:[n]2e𝔽 and a,b,c{0,1} and α{0,1}d1 and β{0,1}e1, with respect to the tensor product AB:[n]2(d+e)𝔽, we have

Ia,α,b,β,c =Ia,α,bIb,β,c,
Ja,α,b,β,c =Ja,α,bJb,β,c,
MatIa,α,b,β,c,Ja,α,b,β,c(AB) =MatIa,α,b,Ja,α,b(A)MatIb,β,c,Jb,β,c(B),
relrka,α,b,β,c(AB) =relrka,α,b(A)relrkb,β,c(B).

(Note: We only require the inequality relrka,α,b,β,c(AB)relrka,α,b(A)relrkb,β,c(B), that is, the sub-multiplicativity of relrk.)

Proof.

Follows from definitions and multiplicativity of matrix rank under Kronecker product.

Definition 4.7.

For a tensor A:[n]2d𝔽, notation {Xα}A denotes that Xα are tensors [n]2d𝔽, indexed by α{0,1}d1, such that A=αXα.

We define a complexity measure ρ(A) by

ρ(A) :=maxa,b{0,1}min{Xα}Aα{0,1}d1relrka,α,b(Xα).
Lemma 4.8.

ρ is sub-additive, that is, ρ(iAi)iρ(Ai) where {Ai}i is any indexed family of tensors Ai:[n]2d𝔽.

Proof.

We have

ρ(iAi) =maxa,bmin{Xα}iAiαrelrka,α,b(Xα)
maxa,bmin{Xi,α}AiXα:=iXi,ααrelrka,α,b(Xα)
=maxa,bmin{Xi,α}Aiαrelrka,α,b(iXi,α)
maxa,bmin{Xi,α}Aiiαrelrka,α,b(Xi,α)
=maxa,bimin{Xi,α}Aiαrelrka,α,b(Xi,α)
imaxa,bmin{Xi,α}Aiαrelrka,α,b(Xi,α)
=iρ(Ai).

Unlike relative rank, ρ is not obviously multiplicative or sub-multiplicative under tensor product. However, the following key lemma shows that ρ always shrinks under tensor products. In contrast, the relative rank of a tensor product AB is only guaranteed to shrink when (the matrix flattening of) A or B has unbalanced dimensions.

Lemma 4.9.

For all A:[n]2d𝔽 and B:[n]2e𝔽, we have

ρ(AB)n1min{ρ(A),ρ(B)}.
Proof.

Let a,b,c range over {0,1}, and let α,β range over {0,1}d1,{0,1}e1 respectively. We have

ρ(AB) =maxa,cmin{Zα,b,β}ABα,b,βrelrka,α,b,β,c(Zα,b,β)
maxa,cmin{Xα}A,{Yβ}BZα,b,β:= 1[b=1c]XαYβα,b,βrelrka,α,b,β,c(Zα,b,β)
=maxa,cmin{Xα}A,{Yβ}Bα,βrelrka,α,1c,β,c(XαYβ)
=maxa,cmin{Xα}A,{Yβ}Bα,βrelrka,α,1c(Xα)relrk1c,β,c(Yβ)(Lemma 4.6)
maxa,cmin{Xα}A,{Yβ}Bα,βrelrka,α,1c(Xα)n1(Lemma 4.4)
=n1maxa,cmin{Xα}Aαrelrka,α,1c(Xα)
=n1maxa,bmin{Xα}Aαrelrka,α,b(Xα)
=n1ρ(A).

The proof of ρ(AB)n1ρ(B) follows similarly.

4.2 Non-commutative formula lower bounds from 𝝆

Definition 4.10 (The order d flattening A:[n2]d𝔽 of an order 2d tensor A:[n]2d𝔽).

For an order 2d tensor A:[n]2d𝔽, we define its order d flattening A:[n2]d𝔽 by the identity

A(a1,b1,,ad,bd)=A(a1,b1,,ad,bd)

for all a1,b1,,ad,bd[n].

Note that this flattening operation commutes with the tensor product: for any A:[n]2d𝔽 and B:[n]2e𝔽, we have (AB)=AB (as tensors [n]2(d+e)𝔽).

Theorem 4.11.

For every A:[n]2d𝔽, we have 𝗇𝖼,𝗌𝗆(A)nlogdρ(A). That is, any non-commutative set-multilinear formula computing the flattened tensor A:[n2]d𝔽 has size at least nlogdρ(A).

Proof.

We argue by induction on d. The base case d=1 is trivial. Assume d2 and fix {(ei,fi,Bi,Ci)}i with ei,fi1 and ei+fi=d and Bi:[n]ei𝔽 and Ci:[n]fi𝔽 such that A=iBiCi and 𝗇𝖼,𝗌𝗆(A)=i(𝗇𝖼,𝗌𝗆(Bi)+𝗇𝖼,𝗌𝗆(Ci)).

We have

𝗇𝖼,𝗌𝗆(A) i(nlogeiρ(Bi)+nlogfiρ(Ci))(induction hypothesis)
i:eid/2nlogeiρ(Bi)+i:fi>d/2nlogfiρ(Ci)
i:eid/2nlogei+1ρ(BiCi)+i:fi>d/2nlogfi+1ρ(BiCi)(Lemma 4.9)
nlogdiρ(BiCi)
nlogdρ(iBiCi)(Lemma 4.8)
=nlogdρ(A).

We are finally ready to prove Theorem 4.2.

Proof of Theorem 4.2.

Let M be an nd×nd matrix. Recall the order d+1 tensor ShiftedTensor(M):[n2]d+1𝔽, which equals PaddedTensor(M) for the order 2d+2 tensor PaddedTensor(M):[n]2d+2𝔽 defined by
PaddedTensor(M)(p,i1,j1,i2,,jd1,id,jd,q) =𝟙{p=q=1}M(i1,,id),(j1,,jd), ShiftedTensor(M)(p,i1,j1,i2,,jd1,id,jd,q) =𝟙{p=q=1}M(i1,,id),(j1,,jd).

It follows immediately from definitions of PT-rank and ρ that PT-rank(M)=nd+1ρ(PaddedTensor(M)). Theorem 4.11 now implies the desired bound

𝗇𝖼,𝗌𝗆(ShiftedTensor(M))nlogdρ(PaddedTensor(M)) =PT-rank(M)ndlogd+1.

5 Commutative set-multilinear formula lower bounds from PT-rank

This section generalizes the argument in the last section to prove our main theorem (Theorem 1.5).

5.1 Extending 𝝆 to tensors over subgraphs of 𝐏𝐚𝐭𝐡𝒅

Instead of considering tensors [n]2d𝔽, we consider tensors [n]D(G)𝔽 with coordinates indexed by the (symmetric) directed edge set of a subgraph G of an undirected path graph.

Let Path be the undirected path graph with vertex set {vi:i} and edge set {{vi1,vi}:i}.

Let G=(V(G),E(G)), E(G)(V(G)2), range over nonempty finite subgraphs of Path with no isolated vertices (i.e., V(G)=eE(G)e). We next introduce some useful notation:

  • We consider the partition V1(G)V2(G)=V(G) defined by

    V1(G) :={vV(G):v has degree 1 in G},
    V2(G) :={vV(G):v has degree 2 in G}.
  • Let D(G)V(G)×V(G) be the (symmetric) set of directed edges of G, that is,

    D(G) :={vw:{v,w}E(G)}.

    We consider the partition D1(G)D2(G)=D(G) defined by

    D1(G) :={vwD(G):vV1(G)},
    D2(G) :={vwD(G):vV2(G)}.

    Note that |D1(G)|=|V1(G)| and |D2(G)|=2|V2(G)|.

  • For any set SV(Path), let 𝒫(S) denote the set of pairs γ=(Iγ,Jγ) of disjoint sets Iγ,JγD(Path) such that

    S={v:vuIγ}={v:vwJγ}.

    (That is, for each viS with neighbors vi1,vvi+1 in Path, exactly one of the sets Iγ and Jγ contains vivi1 and the other contains vivvi+1.) Note that |Iγ|=|Jγ|=|S| and |𝒫(S)|=2|S|.

  • For disjoint sets S1,S2V(Path) and γ1𝒫(S1) and γ2𝒫(γ2), let

    γ1γ2:=(Iγ1Iγ2,Jγ1Jγ2)𝒫(S1S2).

We now consider tensors A:[n]D(G)𝔽 and define analogues of relrk and ρ.

 Remark 5.1.

When G is a path of length d with vertices v0,,vd, we identify the sequence of directed edges v0v1,v1v0,v1v2,v2v1,,vd1vd,vdvd1 with integers 1,,2d. We thus identify D(G) with the set [2d] and tensors [n]D(G)𝔽 with tensors [n]2d𝔽. Under this identification, the following definitions of relrk and ρ directly generalize those in §4.

Definition 5.2.

For a graph GPath and tensor A:[n]D(G)𝔽 and α𝒫(V2(G)) and γ𝒫(V1(G)), we define

relrkα,γ(A) :=n|E(G)|rank(MatIα(IγD1(G)),Jα(JγD1(G))(A)).
Definition 5.3.

For a graph GPath and tensor A:[n]D(G)𝔽, notation {Xα}A denotes that Xα are tensors [n]D(G)𝔽, indexed by α𝒫(V2(G)), such that A=αXα.

We define the complexity measure ρ(A) by

ρ(A) :=maxγ𝒫(V1(G))min{Xα}Aα𝒫(V2(G))relrkα,γ(Xα).
Lemma 5.4.

ρ(iAi)iρ(Ai).

Note that |Iα|=|Jα|=|V2(G)| and |D1(G)Iγ|+|D1(G)Jγ|=|V1(G)|. As a consequence, we get the following generalization of Lemma 4.4.

Lemma 5.5.

relrkα,γ(A)n||IγD1(G)||JγD1(G)||.

Part (a) of the next lemma shows that the bound of Lemma 5.5 may be as small as n|V1(G)|. Part (b) makes a related observation that will be useful later.

Lemma 5.6.
  1. (a)

    For every graph GPath, there exist unique elements γ+,γ𝒫(V1(G)) such that

    Iγ+=Jγ=D1(G),Jγ+D1(G)=IγD1(G)=.
  2. (b)

    For all edge-disjoint graphs G,HPath, there exist unique elements δ+,δ𝒫(V1(G)V1(H)) such that

    Iδ+=Jδ=D1(G)D1(H),Jδ+D1(H)=IδD1(H)=.
Proof.
  1. (a)

    Consider any vV1(G) with neighbors u,w in Path. Without loss of generality, {v,u}E(G) and {v,w}E(G). We include vw in Iγ+ (=Jγ) and vu in Iγ (=Jγ+).

  2. (b)

    Consider any vV1(G)V1(H) with neighbors u,w in Path. Without loss of generality, {v,u}E(G)E(H) and {v,w}E(H)E(G). We include vw in Iδ+ (=Jδ) and vu in Iδ (=Jδ+).

Lemma 5.7.

Suppose that G,H are edge-disjoint finite subgraphs of Path, let A:[n]D(G)𝔽 and B:[n]D(H)𝔽, and consider the tensor product AB:[n]D(GH)𝔽. For all

α𝒫(V2(G)),β𝒫(V2(H)),δ𝒫(V1(G)V1(H)),
ξ𝒫(V1(G)V1(H)),ζ𝒫(V1(H)V1(G)),

we have

relrkαβδ,ξζ(AB) =relrkα,ξδ(A)relrkβ,ζδ(B).
Lemma 5.8.

For all edge-disjoint G,HPath and tensors A:[n]D(G)𝔽 and B:[n]D(H)𝔽, we have

ρ(AB)n|V1(G)V1(H)|min{ρ(A),ρ(B)}.

(See full paper for the proof, which is similar to Lemma 5.7.)

5.2 Set-multilinear formula lower bounds from 𝝆

The next definition generalizes the flattening operation introduced in Def. 4.10.

Definition 5.9.

For any GPath and A:[n]D(G)𝔽, let A:[n2]E(G)𝔽 denote the flattened tensor defined in the natural way: for x[n]D(G), we have A(y)=A(x) where y[n2]E(G) is the element defined by y{vi1,vi}=xvi1,xvi for each i such that {vi1,vi}E(G).

The analog of Theorem 4.11 for (commutative) set-multilinear formulas has a similar proof based on the inductive definition of 𝗌𝗆 However, instead of logd in the exponent of n, the relevant parameter becomes log(G) where (G) is the length of the longest path in G. We record this notation below:

Definition 5.10.

For GPath, let (G) denote the length of the longest path in G (i.e., the maximal number of edges in a connected component of G).

Theorem 5.11.

For every GPath and A:[n]D(G)𝔽, we have

𝗌𝗆(A)nlog(G)ρ(A).

That is, any set-multilinear arithmetic formula computing the flattened tensor A:[n2]E(G)𝔽 has size at least n1+log(G)ρ(A).

Proof.

We argue by induction on |E(G)|. The base case |E(G)|=1 is trivial, since |E(G)||V(G)|+1+log|E(G)|=0 and ρ(A)1.

Now assume |E(G)|2 and fix {(Hi,Ki,Bi,Ci)}i with Hi,KiG such that HiKi=G and Bi:[n]D(Hi)𝔽 and Ci:[n]D(Ki)𝔽 such that A=iBiCi and 𝗌𝗆(A)=i(𝗌𝗆(Bi)+𝗌𝗆(Ci)).

Note that for all i,

(G)max{(Hi),(Ki)}(|V1(Hi)V1(Ki)|+1).

Therefore,

log(G)max{log(Hi),log(Ki)}+|V1(Hi)V1(Ki)|.

We now have

𝗌𝗆(A) i(nlog(Hi)ρ(Bi)+nlog(Ki)ρ(Ci))(induction hypothesis)
inmax{log(Hi),log(Ki)}+|V1(Hi)V1(Ki)|ρ(BiCi)(Lemma 5.8)
nlog(G)iρ(BiCi)
nlog(G)ρ(iBiCi)(Lemma 5.4)
=nlog(G)ρ(A).

In the special case where G is path of length d, Theorem 5.11 has the following corollary, which directly strengthens Theorem 4.11 by replacing 𝗇𝖼,𝗌𝗆 with 𝗌𝗆.

Corollary 5.12.

For every A:[n]2d𝔽, we have 𝗌𝗆(A)nlogdρ(A).

Finally, Theorem 1.5 follows directly from Theorem 4.11 via the same observation that

PT-rank(M)=nd+1ρ(PaddedTensor(M))

for every nd×nd matrix M and that ShiftedTensor(M)=PaddedTensor(M).

6 Upper bounds and candidate hard matrices for PT-rank

In this section, we first present some known PT-rank upper bounds for certain families of matrices. These bounds are based on Hrubes’ recent subquadratic upper bound for the sum-of-squares problem [14]. Then, we give some explicit candidate hard matrices that might realize our desired lower bound.

6.1 Kronecker products of 𝒏×𝒏 matrices have low PT-rank

Recall Hrubeš’ upper bound SoS(qn)=O(n1.62) (Theorem 1.1). By Proposition 3.15, this is equivalent to PT-rank(In2)=O(n1.62). This immediately gives the following upper bound on PT-rank(Ind) for all d.

Proposition 6.1.

PT-rank(Ind)=n0.81d+O(1).

Proof.

We prove the bound PT-rank(Ind)=O(n0.81d) in the case where d is even. (The bound n0.81d+O(1) for odd d follows by monotonicity.) This bound is given by the following calculation:

PT-rank[n]d(Ind) PT-rank[nd/2]2(I(nd/2)2)=O((nd/2)1.62)=O(n0.81d).

The first inequality holds since PT-rank[n]d can use κ for all κ[d1], whereas PT-rank[nd/2]2 can only use κ for κ{,[d/2]}.

Notice that the PT-rank bound of Proposition 6.1 involves only two partial transposes, namely and [d/2]. Using all 2d partial transposes, we can show a much stronger upper bound:

Proposition 6.2.

PT-rank(Ind)=nO(d0.7).

Proof.

The construction by Hrubeš shows that

i=1d(j=1nXi,j2)=i=1d/2(j=1O(n1.62)fi,j2),

where fi,j is a bilinear form in {Xi,(2j1)}i=1d and {Xi,(2j)}i=1d. We may repeat the step by considering fij as new variables. Inductively, this gives

i=1d(j=1nXi,j2)=j=1sgj2,

where gj is multilinear in the d sets of variables and

s=O(n(1.62)log2d)nO(d0.7).

The corollary follows from Proposition 3.15.

The upper bound above extends to Kronecker products of n×n matrices, via GL(n,𝔽)d-invariance of PT-rank (Lemma 3.8). Thus, we have proved:

Theorem 6.3.

If M is a Kronecker product of d n×n matrices, then PT-rank(M) is at most nO(d0.7).

6.2 Matrices in 𝗩𝗕𝗣𝗼𝗿𝗱 have low PT-rank

The Kronecker products of n×n matrices have low PT-rank. Using this fact, we may derive an upper bound for all matrices whose shifted tensors are in the complexity class 𝖵𝖡𝖯𝗈𝗋𝖽.

Definition 6.4 (Ordered algebraic branching programs).

Let f be a set-multilinear polynomial over 𝔽 of degree d in variable sets X(1),X(2),,X(d). We say that f has an ordered algebraic branching program of width w and size s if

f=v1𝖳M1M2Mdv2,

where v1𝔽w0 and v2𝔽wd are vectors whose entries are in 𝔽, Mi𝔽wi1×wi are matrices whose entries are linear combinations of variables in Xi for i[d], and the following holds:

wiw for 0i[d] and i=0dwis.
Definition 6.5 (𝖵𝖡𝖯𝗈𝗋𝖽).

A sequence of set-multilinear polynomials with n variables is in 𝖵𝖡𝖯𝗈𝗋𝖽 if and only if it can be computed by a sequence of ordered algebraic branching programs with size s=nO(1).

An ordered algebraic branching program for a tensor A is an ordered algebraic branching program for its associated set-multilinear polynomial PA (as defined in Section 2.3). The class 𝖵𝖡𝖯𝗈𝗋𝖽 is defined for tensors similarly.

Our main result (Theorem 1.5) shows that sufficiently strong lower bounds on PT-rank(M) for explicit matrices M would imply a separation of classes 𝖵𝖭𝖢𝟣 and 𝖵𝖭𝖯. For a more stringent notion of “explicit”, this reduction could even potentially separate 𝖵𝖭𝖢𝟣 from 𝖵𝖯 (i.e., arithmetic formulas from circuits). In this section, however, we describe a limitation of PT-rank by showing that it cannot separate 𝖵𝖭𝖢𝟣 from 𝖵𝖡𝖯𝗈𝗋𝖽 (i.e., arithmetic formulas from ordered branching programs).

Theorem 6.6.

For any sequence of nd×nd matrices M such that ShiftedTensor(M)𝖵𝖡𝖯𝗈𝗋𝖽, we have PT-rank(M)n0.81d+O(1).

(See full paper for the proof.)

6.3 Candidate hard matrices

We now propose a natural family of nd×nd matrices over 𝔽= which are plausible candidates for having nearly maximal PT-rank.

Definition 6.7.

Let d be an even integer. Let n be a prime power such that n>2d. Let T be an d×d matrix over the prime field 𝔽n. Let ω=e2πi/n be a primitive nth root of unity. Then, define the nd×nd matrix WT over by

(WT)(i1,,id),(j1,,jd)=ω(i1,,id)T(j1,,jd).

With the definition above, we first consider three choices T1,T2,T3 that don’t work, that is, where WT has small PT-rank.

Identity matrix

For T1=Id, we have

(WT1)(i1,,id),(j1,,jd)=ω(i1,,id)(j1,,jd).

We may define n×n matrices WT1,l for l[d] by (WT1,l)il,jl=ωiljl. Then, we have WT1=WT1,1WT1,d. By Theorem 6.3, this implies PT-rank(WT1)=nO(d0.7).

Cyclic matrix

Consider the cyclic permutation matrix T2 defined by

(T2)a,b=1[a+1bmodd].

For this choice of T2, we have

(WT2)(i1,,id),(j1,,jd)=ω(i1,,id1,id)(j2,,jd,j1).

Let κ={1,3,5,,d1}. Then, by definition of partial transpose, we have

(WT2κ)(j1,i2,,jd1,id),(i1,j2,,id1,jd)=(WT2)(i1,,id),(j1,,jd)=ω(i1,,id1,id)(j2,,jd,j1).

Define nd vectors u and v by

u(j1,i2,,jd1)=ω(i2,i4,,id)(j1,j3,,jd1),v(i1,j2,,id1)=ω(i1,i3,,id1)(j2,j4,,jd).

We then have WT2κ=uv, which means it has rank 1. Therefore, PT-rank(WT2)=1.

Upper triangular matrix

Consider the upper triangular matrix T3 defined by

(T3)a,b=1[ab].

For this choice of T3, we have (WT3)(i1,,id),(j1,,jd)=ωklikjl.

Let A be the unique [n]2d tensor such that A=ShiftedTensor(WT3). Then,

A(p,i1,j1,i2,,jd,q)=(WT3)(i1,,id),(j1,,jd).

Then, it can be observed that rank(Mat{1,,d+1},{d+2,,2d+2})n2. (In fact, A𝖵𝖡𝖯𝗈𝗋𝖽.) Therefore, PT-rank(WT3)n0.81d+O(1) by Theorem 6.6.

The three matrices T1,T2,T2 considered above give rise to WT with small PT-rank. We now give some evidence that a Cauchy matrix (in which every submatrix has full rank) might be a good choice of T. (See full paper for proofs of the propositions in this section.)

Definition 6.8.

For λ[d], let WT[λ] be the n2|λ|×n2(d|λ|) matrix defined by

(WT[λ])(iλ,jλ),(iλc,jλc)=(WT)(i1,,id),(j1,,jd).
Proposition 6.9.

Let T be a Cauchy matrix. For every λ[d], the matrix WT[λ]n2|λ|×n2(d|λ|) has full rank:

rank(WT[λ])=n 2min{|λ|,d|λ|}.

The two propositions above disable the only two upper bound techniques we know: taking a single partial transpose, and finding some flattening of KT with small rank in order to apply Hrubeš’ construction.

References

  • [1] John F Adams and Michael F Atiyah. K-theory and the hopf invariant. The Quarterly Journal of Mathematics, 17(1):31–38, 1966.
  • [2] Prashanth Amireddy, Ankit Garg, Neeraj Kayal, Chandan Saha, and Bhargav Thankey. Low-Depth Arithmetic Circuit Lower Bounds: Bypassing Set-Multilinearization. In Kousha Etessami, Uriel Feige, and Gabriele Puppis, editors, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023), volume 261 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1–12:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ICALP.2023.12.
  • [3] V Arvind and S Raja. Some lower bound results for set-multilinear arithmetic computations. Chicago Journal OF Theoretical Computer Science, 6:1–262, 2016.
  • [4] Peter Bürgisser, Michael Clausen, and Mohammad A Shokrollahi. Algebraic complexity theory, volume 315. Springer Science & Business Media, 2013.
  • [5] Marco L. Carmosino, Russell Impagliazzo, Shachar Lovett, and Ivan Mihajlin. Hardness Amplification for Non-Commutative Arithmetic Circuits. In Rocco A. Servedio, editor, 33rd Computational Complexity Conference (CCC 2018), volume 102 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1–12:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.CCC.2018.12.
  • [6] Prerona Chatterjee. Separating abps and some structured formulas in the non-commutative setting. In 36th Computational Complexity Conference, 2021.
  • [7] Daniel Dugger and Daniel C Isaksen. The hopf condition for bilinear forms over arbitrary fields. Annals of mathematics, pages 943–964, 2007.
  • [8] Pranjal Dutta, Nitin Saxena, and Thomas Thierauf. Weighted sum-of-squares lower bounds for univariate polynomials imply VPVNP. computational complexity, 33(1):1–54, 2024.
  • [9] Zeev Dvir, G Malod, S Perifel, and A Yehudayoff. Separating multilinear branching programs and formulas. In 44th Annual ACM Symposium on Theory of Computing, STOC’12, 2012.
  • [10] Klim Efremenko, Ankit Garg, Rafael Oliveira, and Avi Wigderson. Barriers for Rank Methods in Arithmetic Complexity. In Anna R. Karlin, editor, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), volume 94 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1:1–1:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ITCS.2018.1.
  • [11] Ankit Garg, Visu Makam, Rafael Oliveira, and Avi Wigderson. More barriers for rank methods, via a “numeric to symbolic” transfer. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 824–844. IEEE Computer Society, 2019.
  • [12] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. On the necessary and sufficient conditions for separability of mixed quantum states. Phys. Lett. A, 223(1), 1996.
  • [13] Pavel Hrubeš, Avi Wigderson, and Amir Yehudayoff. Non-commutative circuits and the sum-of-squares problem. Journal of the American Mathematical Society, 24(3):871–898, 2011.
  • [14] Pavel Hrubeš. A subquadratic upper bound on sum-of-squares composition formulas. In 39th Computational Complexity Conference (CCC 2024), volume 300 of LIPIcs, pages 12:1–12:11, 2024. doi:10.4230/LIPIcs.CCC.2024.12.
  • [15] Adolf Hurwitz. Über die komposition der quadratischen formen. Mathematische Annalen, 88(1):1–25, 1922.
  • [16] Deepanshu Kush and Shubhangi Saraf. Near-Optimal Set-Multilinear Formula Lower Bounds. In Amnon Ta-Shma, editor, 38th Computational Complexity Conference (CCC 2023), volume 264 of Leibniz International Proceedings in Informatics (LIPIcs), pages 15:1–15:33. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.CCC.2023.15.
  • [17] Nutan Limaye, Srikanth Srinivasan, and Sébastien Tavenas. On the Partial Derivative Method Applied to Lopsided Set-Multilinear Polynomials. In Shachar Lovett, editor, 37th Computational Complexity Conference (CCC 2022), volume 234 of Leibniz International Proceedings in Informatics (LIPIcs), pages 32:1–32:23. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.CCC.2022.32.
  • [18] Nutan Limaye, Srikanth Srinivasan, and Sebastien Tavenas. Superpolynomial lower bounds against low-depth algebraic circuits. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 804–814. IEEE Computer Society, 2022.
  • [19] Noam Nisan. Lower bounds for non-commutative computation. In Proceedings of the twenty-third annual ACM symposium on Theory of computing, pages 410–418, 1991.
  • [20] Noam Nisan and Avi Wigderson. Lower bounds on arithmetic circuits via partial derivatives. Computational complexity, 6:217–234, 1996. doi:10.1007/BF01294256.
  • [21] Asher Peres. Separability criterion for density matrices. Physical Review Letters, 77(8):1413, 1996.
  • [22] AR Rajwade. Squares, volume 171. Cambridge University Press, 1993.
  • [23] Ran Raz. Multilinear-NC1 multilinear-NC2. In 45th Annual IEEE Symposium on Foundations of Computer Science, pages 344–351. IEEE, 2004.
  • [24] Ran Raz. Tensor-rank and lower bounds for arithmetic formulas. Journal of the ACM (JACM), 60(6):1–15, 2013. doi:10.1145/2535928.
  • [25] Ran Raz and Amir Yehudayoff. Balancing syntactically multilinear arithmetic circuits. Computational Complexity, 17:515–535, 2008. doi:10.1007/S00037-008-0254-0.
  • [26] Ramprasad Saptharishi. A survey of lower bounds in arithmetic circuit complexity. Github survey, 95, 2015.
  • [27] Gadiel Seroussi and Abraham Lempel. Factorization of symmetric matrices and trace-orthogonal bases in finite fields. SIAM Journal on Computing, 9(4):758–767, 1980. doi:10.1137/0209059.
  • [28] Daniel B Shapiro. Compositions of quadratic forms, volume 33. Walter de Gruyter, 2011.
  • [29] Amir Shpilka and Amir Yehudayoff. Arithmetic circuits: A survey of recent results and open questions. Foundations and Trends in Theoretical Computer Science, 5(3–4):207–388, 2010. doi:10.1561/0400000039.
  • [30] Sébastien Tavenas, Nutan Limaye, and Srikanth Srinivasan. Set-multilinear and non-commutative formula lower bounds for iterated matrix multiplication. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 416–425, 2022. doi:10.1145/3519935.3520044.
  • [31] Leslie G Valiant. Completeness classes in algebra. In Proceedings of the 17th ACM Symposium on Theory of Computing, pages 249–261, 1979. doi:10.1145/800135.804419.