Depth-4 Lower Bounds, Determinantal Complexity : A Unified Approach

Tavenas has recently proved that any n^{O(1)}-variate and degree n polynomial in VP can be computed by a depth-4 circuit of size 2^{O(\sqrt{n}\log n)}. So to prove that VP is not equal to VNP, it is sufficient to show that an explicit polynomial in VNP of degree n requires depth-4 circuits of size 2^{\omega(\sqrt{n}\log n)}. Soon after Tavenas's result, depth-4 circuit size lower bounds of 2^{\Omega(\sqrt{n}\log n)} were proved for two different explicit polynomials by Kayal et al. and Fournier et al. In particular, using combinatorial designs, Kayal et al.\ construct an explicit polynomial in VNP that requires depth-4 circuits of size 2^{\Omega(\sqrt{n}\log n)}, and Fournier et al.\ show that the iterated matrix multiplication polynomial (which is in VP) also requires depth-4 circuits of size 2^{\Omega(\sqrt{n}\log n)}. In this paper, we identify a simple combinatorial property such that any polynomial f that satisfies the property achieves a similar circuit size lower bound for depth-4 circuits. In particular, it does not matter whether f is in VP or in VNP. As a result, we get a very simple unified lower bound analysis for the above mentioned polynomials. Another goal of this paper is to compare our current knowledge of depth-4 circuit size lower bounds and determinantal complexity lower bounds. We prove that the determinantal complexity of the iterated matrix multiplication polynomial is \Omega(dn), where d is the number of matrices and n is the dimension of the matrices. So for d=n, the iterated matrix multiplication polynomial achieves the current best known lower bounds on both fronts: depth-4 circuit size and determinantal complexity. To the best of our knowledge, a \Theta(n) bound for the determinantal complexity of the iterated matrix multiplication polynomial was known only for constant d>1, by Jansen.


Introduction
One of the main challenges in algebraic complexity theory is to separate VP from VNP. This problem is well known as Valiant's hypothesis [Val79]. It is an algebraic analog of the P vs NP problem. The permanent polynomial characterizes the class VNP over fields of all characteristics except 2, and the determinant polynomial characterizes the class VP with respect to quasi-polynomial projections.
Definition 1. The determinantal complexity of a polynomial $f$ over $n$ variables is the minimum $m$ such that there are affine linear functions $A_{k,\ell}$, $1 \le k, \ell \le m$, defined over the same set of variables with $f = \det((A_{k,\ell})_{1 \le k,\ell \le m})$. It is denoted by $\mathrm{dc}(f)$.
To resolve Valiant's hypothesis, proving $\mathrm{dc}(\mathrm{perm}_n) = n^{\omega(\log n)}$ is sufficient. Von zur Gathen [vzG86] proved $\mathrm{dc}(\mathrm{perm}_n) \ge \sqrt{8/7}\, n$. Later Cai [Cai90], Babai and Seress [vzG87], and Meshulam [Mes89] independently improved the lower bound to $\sqrt{2}\, n$. In 2004, Mignon and Ressayre [MR04] came up with a new idea of using second order derivatives and proved that $\mathrm{dc}(\mathrm{perm}_n) \ge \frac{n^2}{2}$ over fields of characteristic zero. Subsequently, Cai et al. [CCL08] extended the result of Mignon and Ressayre to all fields of characteristic $\ne 2$.
For any polynomial $f$, Valiant [Val79] proved that $\mathrm{dc}(f) \le 2(F(f) + 1)$, where $F(f)$ is the arithmetic formula complexity of $f$. Later, Nisan [Nis91] proved that $\mathrm{dc}(f) = O(B(f))$, where $B(f)$ is the arithmetic branching program complexity of $f$.
Another possible way to prove Valiant's hypothesis is to prove that the permanent polynomial cannot be computed by any polynomial size arithmetic circuit. In 2008, Agrawal and Vinay proved that any arithmetic circuit of sub-exponential size can be depth reduced to a depth-4 circuit maintaining a nontrivial upper bound on the size [AV08]. Subsequently, Koiran [Koi12] and Tavenas [Tav13] came up with improved depth reductions (in terms of parameters). In particular, Tavenas proved that any $n^{O(1)}$-variate polynomial of degree $n$ in VP can also be computed by a $\Sigma\Pi^{[O(\sqrt{n})]}\Sigma\Pi^{[\sqrt{n}]}$ circuit of size $2^{O(\sqrt{n}\log n)}$.
In a recent breakthrough, Gupta et al. [GKKS13] proved a $2^{\Omega(\sqrt{n})}$ lower bound for the size of the depth-4 circuits computing the determinant or the permanent polynomial using the method of shifted partial derivatives. Subsequently, Kayal et al. [KSS13] improved the situation by proving a $2^{\Omega(\sqrt{n}\log n)}$ lower bound on the size of $\Sigma\Pi^{[O(\sqrt{n})]}\Sigma\Pi^{[\sqrt{n}]}$ circuits computing an explicit polynomial in VNP. More precisely, in [KSS13] the following family of polynomials constructed from the combinatorial design of Nisan-Wigderson [NW94] was considered:
$$\mathrm{NW}(X) = \sum_{a(z) \in \mathbb{F}[z]} x_{1a(1)}\, x_{2a(2)} \cdots x_{na(n)},$$
where $a(z)$ runs over all univariate polynomials of degree $< k$ for a suitable parameter $k = O(\sqrt{n})$ and $\mathbb{F}$ is a finite field of size $n$. Here we consider the natural identification of $\mathbb{F}$ with the set $\{1, 2, \ldots, n\}$. Since the number of monomials in $\mathrm{NW}(X)$ is $n^{O(\sqrt{n})}$, the result from [KSS13] gives a tight bound of $2^{\Theta(\sqrt{n}\log n)}$ for the depth-4 circuit complexity of $\mathrm{NW}(X)$.
Although the combined implication of [KSS13] and [Tav13] looks very exciting from the perspective of lower bounds, a recent result by Fournier et al. [FLMS13] shows that such a lower bound is also obtained by the iterated matrix multiplication polynomial, which is in VP. The iterated matrix multiplication polynomial of $d$ generic $n \times n$ matrices $X^{(1)}, X^{(2)}, \ldots, X^{(d)}$ is the $(1,1)$th entry of the product of the matrices. More formally, let $X^{(1)}, X^{(2)}, \ldots, X^{(d)}$ be $d$ generic $n \times n$ matrices with disjoint sets of variables, and let $x^{(i)}_{jk}$ denote the $(j,k)$th entry of $X^{(i)}$. Then the iterated matrix multiplication polynomial (denoted by $\mathrm{IMM}_{n,d}$) is defined as follows.
$$\mathrm{IMM}_{n,d}(X) = \sum_{i_1, i_2, \ldots, i_{d-1} \in [n]} x^{(1)}_{1 i_1}\, x^{(2)}_{i_1 i_2} \cdots x^{(d-1)}_{i_{d-2} i_{d-1}}\, x^{(d)}_{i_{d-1} 1}$$
Notice that $\mathrm{IMM}_{n,d}(X)$ is an $(n^2(d-2) + 2n)$-variate polynomial of degree $d$. The result from [FLMS13] is also tight since $\mathrm{IMM}_{n,d} \in$ VP and, very importantly, their result proves the optimality of the depth reduction of Tavenas [Tav13].
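As a small illustration of this definition, the following Python sketch evaluates $\mathrm{IMM}_{n,d}$ at a numeric point in two ways: by the explicit sum over index sequences, and as the $(1,1)$ entry of the matrix product. The function names `imm_by_sum` and `imm_by_product` and the list-of-lists matrix representation are assumptions made for this example only. Only the first row of the first matrix and the first column of the last matrix are ever read, which matches the variable count $n^2(d-2) + 2n$.

```python
from itertools import product


def imm_by_sum(mats):
    """Evaluate IMM_{n,d} as the sum over all index paths 1 -> i_1 -> ... -> i_{d-1} -> 1."""
    d, n = len(mats), len(mats[0])
    total = 0
    for path in product(range(n), repeat=d - 1):
        idx = (0,) + path + (0,)  # index 1 of the paper is index 0 here
        term = 1
        for t in range(d):
            term *= mats[t][idx[t]][idx[t + 1]]
        total += term
    return total


def imm_by_product(mats):
    """Evaluate IMM_{n,d} as the (1,1) entry of the product of the matrices."""
    n = len(mats[0])
    row = [1 if j == 0 else 0 for j in range(n)]  # row vector e_1
    for M in mats:
        row = [sum(row[i] * M[i][j] for i in range(n)) for j in range(n)]
    return row[0]


if __name__ == "__main__":
    mats = [[[1, 2], [3, 4]], [[0, 1], [1, 1]], [[2, 0], [1, 3]]]
    print(imm_by_sum(mats), imm_by_product(mats))
```

The explicit sum has $n^{d-1}$ terms, while the product form uses $O(dn^2)$ operations; this is the usual reason $\mathrm{IMM}_{n,d}$ is regarded as an easy polynomial.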
One of the main motivations of our study comes from the tantalizing fact that two seemingly different polynomials, $\mathrm{NW}(X) \in$ VNP and $\mathrm{IMM}_{n,d}(X) \in$ VP, behave very similarly as far as the $2^{\Omega(\sqrt{n}\log n)}$-size lower bound for depth-4 circuits is concerned. In this paper, we seek a conceptual reason for this behaviour. We identify a simple combinatorial property such that any polynomial that satisfies it requires depth-4 arithmetic circuits of size $2^{\Omega(\sqrt{n}\log n)}$. We call it the Leading Monomial Distance Property. In particular, it does not matter whether the polynomial is easy (i.e. in VP) or hard (i.e. in VNP but not known to be in VP). As a result of this abstraction, we give a simple unified analysis of the depth-4 circuit size lower bounds for $\mathrm{NW}(X)$ and $\mathrm{IMM}_{n,d}(X)$.
To define the Leading Monomial Distance Property, we first define the notion of distance between two monomials.
Definition 2. Let $m_1, m_2$ be two monomials over a set of variables. Let $S_1$ and $S_2$ be the (multi)sets of variables corresponding to the monomials $m_1$ and $m_2$ respectively. The distance $\Delta(m_1, m_2)$ between the monomials $m_1$ and $m_2$ is $\min\{|S_1| - |S_1 \cap S_2|,\ |S_2| - |S_1 \cap S_2|\}$, where the cardinalities are the orders of the multisets.
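Definition 2 can be sketched in a few lines of Python. The representation of a monomial as a dictionary from variable names to exponents, and the function name `monomial_distance`, are assumptions of this sketch; `Counter` intersection takes the minimum multiplicity of each variable, which is exactly multiset intersection.

```python
from collections import Counter


def monomial_distance(m1, m2):
    """Distance between two monomials given as {variable: exponent} dictionaries."""
    s1, s2 = Counter(m1), Counter(m2)
    common = sum((s1 & s2).values())  # size of the multiset intersection
    return min(sum(s1.values()) - common, sum(s2.values()) - common)


if __name__ == "__main__":
    m1 = {"x1": 2, "x2": 1, "x3": 2, "x4": 1}           # x1^2 x2 x3^2 x4
    m2 = {"x1": 1, "x2": 2, "x3": 1, "x5": 1, "x6": 1}  # x1 x2^2 x3 x5 x6
    print(monomial_distance(m1, m2))  # 3
```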
For example, let $m_1 = x_1^2 x_2 x_3^2 x_4$ and $m_2 = x_1 x_2^2 x_3 x_5 x_6$. Then $S_1 = \{x_1, x_1, x_2, x_3, x_3, x_4\}$, $S_2 = \{x_1, x_2, x_2, x_3, x_5, x_6\}$, $|S_1| = |S_2| = 6$, $|S_1 \cap S_2| = 3$, and $\Delta(m_1, m_2) = 3$.
We say that an $n^{O(1)}$-variate polynomial of degree $n$ has the Leading Monomial Distance Property if the leading monomials of a large subset ($\approx n^{\sqrt{n}}$) of the span of its derivatives (of order $\approx \sqrt{n}$) have good pairwise distance. Leading monomials are defined by fixing a suitable order on the set of variables. We denote the leading monomial of a polynomial $f(X)$ by $\mathrm{LM}(f)$. More formally, we prove the following theorem in Section 4.
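Theorem 1. Let $f(X)$ be an $n^{O(1)}$-variate polynomial of degree $n$. Let there be $s \ge n^{\delta k}$ ($\delta$ is some constant $> 0$) different polynomials in $\langle \partial^{=k}(f) \rangle$ for $k = \epsilon\sqrt{n}$ (where $0 < \epsilon < 1$ is any constant) such that any two of their leading monomials have distance at least $d \ge \frac{n}{c}$ for a constant $c > 1$. Then any depth-4 $\Sigma\Pi^{[O(\sqrt{n})]}\Sigma\Pi^{[\sqrt{n}]}$ circuit that computes $f(X)$ must be of size $e^{\Omega(\sqrt{n}\ln n)}$.
Another motivation of this work is to find a connection between our current knowledge of the determinantal complexity lower bounds and the depth-4 circuit size lower bounds. The best known determinantal complexity lower bound for an $n^{O(1)}$-variate polynomial of degree $n$ (the permanent) is $\Omega(n^2)$. Here we ask the following question: can we give an example of an explicit $n^{O(1)}$-variate degree $n$ polynomial in VNP for which the determinantal complexity is $\Omega(n^2)$ and the depth-4 complexity is $2^{\Omega(\sqrt{n}\log n)}$? We settle this problem by showing an $\Omega(n^2)$ lower bound for $\mathrm{dc}(\mathrm{IMM}_{n,n}(X))$, which is an $O(n^3)$-variate polynomial of degree $n$. In particular, we prove the following theorem.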
Theorem 2. For any integers $n$ and $d > 1$, the determinantal complexity of the iterated matrix multiplication polynomial $\mathrm{IMM}_{n,d}$ is $\Omega(dn)$.
Since $\mathrm{IMM}_{n,d}(X)$ has an algebraic branching program of size $O(dn)$ [Nis91], from the above theorem it follows that $\mathrm{dc}(\mathrm{IMM}_{n,d}(X)) = \Theta(dn)$. This improves upon the earlier bound of $\Theta(n)$ for the determinantal complexity of the iterated matrix multiplication polynomial for a constant $d > 1$ [Jan11]. Similar to the approach of [CCL08] and [MR04], we also use the rank of the Hessian matrix as our main technical tool.

Organization
In Section 3, we state some results from [GKKS13]. In Section 4, we do a unified analysis of the depth-4 lower bound results of [KSS13] and [FLMS13]. We prove the determinantal complexity lower bound of $\mathrm{IMM}_{n,d}(X)$ in Section 5.

Preliminaries
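The following beautiful lemma (from [GKKS13]) is the key to the asymptotic estimates required for the lower bound analyses.
Lemma 1 ([GKKS13]). Let $a(n), f(n), g(n) : \mathbb{Z}_{>0} \to \mathbb{Z}$ be integer valued functions such that $(f + g) = o(a)$. Then
$$\ln \frac{(a+f)!}{(a-g)!} = (f+g)\ln a \pm O\!\left(\frac{(f+g)^2}{a}\right).$$
In this paper, whenever we apply this lemma, $(f+g)^2$ will be $o(a)$. So we will not worry about the error term (which will be asymptotically zero) generated by this estimate.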
Building on the results of [AV08] and [Koi12], Tavenas [Tav13] proved that any $n^{O(1)}$-variate polynomial of degree $n$ in VP can be computed by a $\Sigma\Pi^{[O(\sqrt{n})]}\Sigma\Pi^{[\sqrt{n}]}$ circuit of size $2^{O(\sqrt{n}\log n)}$.
In [GKKS13], the authors introduced the method of shifted partial derivatives, which is the key tool in the recent depth-4 lower bound results. They also gave an elegant way of upper bounding the dimension of the shifted partial derivative space of a polynomial computed by a $\Sigma\Pi^{[D]}\Sigma\Pi^{[t]}$ circuit. For a monomial $x^{\mathbf{i}} = x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$, let $\partial^{\mathbf{i}} f$ be the partial derivative of $f$ with respect to the monomial $x^{\mathbf{i}}$. We recall the following definition of shifted derivatives from [GKKS13].
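Definition 3 ([GKKS13]). Let $f(X) \in \mathbb{F}[X]$ be a multivariate polynomial. The span of the $\ell$-shifted $k$-th order derivatives of $f$, denoted by $\langle \partial^{=k} f \rangle_{\le \ell}$, is defined as
$$\langle \partial^{=k} f \rangle_{\le \ell} = \mathbb{F}\text{-span}\{ x^{\mathbf{i}} \cdot \partial^{\mathbf{j}} f : \mathbf{i}, \mathbf{j} \in \mathbb{Z}^n_{\ge 0} \text{ with } |\mathbf{i}| \le \ell \text{ and } |\mathbf{j}| = k \}.$$
We denote by $\dim(\langle \partial^{=k} f \rangle_{\le \ell})$ the dimension of the vector space $\langle \partial^{=k} f \rangle_{\le \ell}$.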
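To make the notion of a shifted partial derivative space concrete, the sketch below computes its dimension for a tiny polynomial over the rationals. Everything about the representation is an assumption of this example: a polynomial is a dictionary from exponent tuples to coefficients, and the helper names (`derivative`, `shift`, `rank`, `shifted_partials_dim`) are hypothetical. The rank is computed by Gaussian elimination that always eliminates the largest remaining monomial, so each pivot corresponds to a distinct leading monomial.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def monomials_up_to(n_vars, max_deg):
    """All exponent vectors in n_vars variables of total degree <= max_deg."""
    out = []
    for deg in range(max_deg + 1):
        for combo in combinations_with_replacement(range(n_vars), deg):
            e = [0] * n_vars
            for v in combo:
                e[v] += 1
            out.append(tuple(e))
    return out


def derivative(poly, j):
    """Partial derivative of poly with respect to the monomial x^j."""
    result = {}
    for e, c in poly.items():
        if all(a >= b for a, b in zip(e, j)):
            coeff = c
            for a, b in zip(e, j):
                for t in range(b):
                    coeff *= a - t  # falling factorial a(a-1)...(a-b+1)
            new_e = tuple(a - b for a, b in zip(e, j))
            result[new_e] = result.get(new_e, 0) + coeff
    return {e: c for e, c in result.items() if c != 0}


def shift(poly, i):
    """Multiply poly by the monomial x^i."""
    return {tuple(a + b for a, b in zip(e, i)): c for e, c in poly.items()}


def rank(polys):
    """Rank over the rationals of a list of polynomials viewed as sparse vectors."""
    pivots = {}  # leading monomial -> reduced polynomial with that leading monomial
    for poly in polys:
        row = {k: Fraction(v) for k, v in poly.items()}
        while row:
            lead = max(row)  # leading monomial under tuple (lexicographic) order
            if lead not in pivots:
                pivots[lead] = row
                break
            p = pivots[lead]
            factor = row[lead] / p[lead]
            for k, v in p.items():
                new = row.get(k, 0) - factor * v
                if new == 0:
                    row.pop(k, None)
                else:
                    row[k] = new
    return len(pivots)


def shifted_partials_dim(poly, n_vars, k, ell):
    """Dimension of the span of x^i * d^j(poly) with |i| <= ell and |j| = k."""
    gens = []
    for j in monomials_up_to(n_vars, k):
        if sum(j) != k:
            continue
        dj = derivative(poly, j)
        if not dj:
            continue
        for i in monomials_up_to(n_vars, ell):
            gens.append(shift(dj, i))
    return rank(gens)


if __name__ == "__main__":
    f = {(1, 1, 0, 0): 1, (0, 0, 1, 1): 1}  # x1*x2 + x3*x4
    print(shifted_partials_dim(f, 4, 1, 1))
```

For $f = x_1x_2 + x_3x_4$ with $k = \ell = 1$, the first order derivatives are the four variables, and shifting them by monomials of degree at most 1 spans all monomials of degree 1 and 2, so the dimension is $4 + 10 = 14$.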
The following proposition follows from the standard Gaussian elimination technique.
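Proposition 1 (Corollary 13, [GKKS13]). For any multivariate polynomial $f(X) \in \mathbb{F}[X]$,
$$\dim(\langle \partial^{=k} f \rangle_{\le \ell}) \ge \#\{ x^{\mathbf{i}} \cdot \mathrm{LM}(g) : \mathbf{i} \in \mathbb{Z}^n_{\ge 0} \text{ with } |\mathbf{i}| \le \ell \text{ and } g \in \mathbb{F}\text{-span}\{\partial^{\mathbf{j}} f : |\mathbf{j}| = k\} \}.$$
In [GKKS13], the following upper bound on the dimension of the shifted partial derivative space for polynomials computed by $\Sigma\Pi^{[D]}\Sigma\Pi^{[t]}$ circuits was shown.
Lemma 2 ([GKKS13]). If $f$ is an $N$-variate polynomial computed by a $\Sigma\Pi^{[D]}\Sigma\Pi^{[t]}$ circuit of top fan-in $s'$, then for any $k \le D$,
$$\dim(\langle \partial^{=k} f \rangle_{\le \ell}) \le s' \binom{D}{k} \binom{N + \ell + (t-1)k}{N}.$$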

Unified analysis of depth-4 lower bounds
In this section we first prove a simple combinatorial lemma which we believe is the crux of the best known depth-4 lower bound results. In fact, the lower bounds on the size of circuits computing the polynomials $\mathrm{NW}(X)$ and $\mathrm{IMM}_{n,d}(X)$ follow easily from this lemma by a suitable setting of parameters.
Lemma 3. Let $m_1, m_2, \ldots, m_s$ be monomials over $N$ variables such that $\Delta(m_i, m_j) \ge d$ for all $i \ne j$. Each monomial $m_i$ is extended by padding with monomials of length at most $\ell$ over the $N$ variables. Let $B = \binom{N+\ell}{N}$. Then the total number of monomials generated by the extension is at least $sB - s^2\binom{N+\ell-d}{N}$.
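The statement of Lemma 3 can be sanity-checked by brute force on a tiny instance. The sketch below is only an illustration under assumed conventions: monomials are exponent tuples, `extensions` enumerates all paddings of length at most `ell`, and `check_lemma3` (a hypothetical helper) returns the exact number of distinct extended monomials together with the lower bound $sB - s^2\binom{N+\ell-d}{N}$, where $d$ is taken to be the minimum pairwise distance and is assumed to be at most $\ell$.

```python
from itertools import combinations_with_replacement
from math import comb


def distance(e1, e2):
    """Monomial distance of Definition 2 for exponent tuples."""
    common = sum(min(a, b) for a, b in zip(e1, e2))
    return min(sum(e1) - common, sum(e2) - common)


def extensions(e, ell):
    """All monomials obtained from e by padding with a monomial of degree <= ell."""
    out = set()
    for deg in range(ell + 1):
        for combo in combinations_with_replacement(range(len(e)), deg):
            f = list(e)
            for v in combo:
                f[v] += 1
            out.add(tuple(f))
    return out


def check_lemma3(monos, ell):
    """Return (number of distinct extended monomials, lower bound of Lemma 3)."""
    N, s = len(monos[0]), len(monos)
    d = min(distance(a, b) for i, a in enumerate(monos) for b in monos[i + 1:])
    union = set()
    for e in monos:
        union |= extensions(e, ell)
    B = comb(N + ell, N)
    bound = s * B - s * s * comb(N + ell - d, N)
    return len(union), bound


if __name__ == "__main__":
    # x1*x2 and x3*x4 over N = 4 variables have distance d = 2
    print(check_lemma3([(1, 1, 0, 0), (0, 0, 1, 1)], 3))
```

In this instance each monomial has $\binom{7}{4} = 35$ extensions, the two families share exactly $\binom{5}{4} = 5$ monomials, and the count $65$ sits above the bound $70 - 4 \cdot 5 = 50$.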
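Proof. Let $B_i$ be the set of monomials constructed by extending the monomial $m_i$. It is easy to see that $|B_i| = \binom{N+\ell}{N}$. We would like to estimate $|\cup_i B_i|$. From the principle of inclusion and exclusion, we get
$$\left|\cup_i B_i\right| \ge \sum_i |B_i| - \sum_{i < j} |B_i \cap B_j|.$$
A monomial in $B_i \cap B_j$ is a common multiple of $m_i$ and $m_j$. Since $\Delta(m_i, m_j) \ge d$, it is obtained from the least common multiple of $m_i$ and $m_j$ by padding with a monomial of length at most $\ell - d$, so $|B_i \cap B_j| \le \binom{N+\ell-d}{N}$. Then the total number of monomials after extension is lower bounded as follows.
$$\left|\cup_i B_i\right| \ge sB - \binom{s}{2}\binom{N+\ell-d}{N} \ge sB - s^2\binom{N+\ell-d}{N}.$$
We will use Lemma 1 to tightly estimate the subsequent computations. In particular, we always choose the parameter $\ell$ such that $d^2 = o(N + \ell)$. This also shows that the error term given by Lemma 1 is always asymptotically zero, and we will not worry about it.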
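The estimate is applied to ratios of binomial coefficients such as $\binom{N+\ell}{N} / \binom{N+\ell-d}{N}$, which the analysis approximates by $(1 + N/\ell)^d$ and then bounds from below using $1 + x > e^{x/2}$. The following numerical sketch compares the three quantities on a logarithmic scale; the function name `compare_estimates` and the sample parameters are assumptions chosen only so that $d^2$ is small compared to $\ell$ and $N < \ell$.

```python
from math import comb, log


def compare_estimates(N, ell, d):
    """Return (ln of exact binomial ratio, d*ln(1 + N/ell), N*d/(2*ell))."""
    exact = log(comb(N + ell, N)) - log(comb(N + ell - d, N))
    estimate = d * log(1 + N / ell)
    lower = N * d / (2 * ell)
    return exact, estimate, lower


if __name__ == "__main__":
    exact, estimate, lower = compare_estimates(50, 2000, 10)
    print(f"exact={exact:.5f} estimate={estimate:.5f} lower={lower:.5f}")
```

The exact ratio equals $\prod_{i=0}^{d-1} \frac{N+\ell-i}{\ell-i}$, and every factor is at least $1 + N/\ell$, so the exact value never falls below the estimate; the gap between the two is the error term of Lemma 1.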
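Lemma 3 is useful when the subtracted term is at most a $\frac{1}{p(N)}$ fraction of $sB$ for some polynomial $p(N)$, that is, when $s\binom{N+\ell-d}{N} \le \frac{1}{p(N)}\binom{N+\ell}{N}$. We now apply Lemma 1 to derive the following:
$$s \le \frac{1}{p(N)} \cdot \frac{\binom{N+\ell}{N}}{\binom{N+\ell-d}{N}} \approx \frac{1}{p(N)}\left(1 + \frac{N}{\ell}\right)^{d}.$$
We use the inequality $1 + x > e^{x/2}$ for $0 < x < 1$ to lower bound $\left(1 + \frac{N}{\ell}\right)^{d}$ by $e^{\frac{Nd}{2\ell}}$. So it is enough to choose $\ell$ such that $s \le \frac{1}{p(N)} e^{\frac{Nd}{2\ell}}$, that is, $\ell \le \frac{Nd}{2(\ln s + \ln p(N))}$.
Let $f$ be a polynomial such that there are at least $s$ polynomials in $\langle \partial^{=k}(f) \rangle$ (for a suitable value of $k$) and any two of their leading monomials have distance at least $d$. Then for the above choice of $\ell$ and from Proposition 1, we know that the dimension of $\langle \partial^{=k} f \rangle_{\le \ell}$ is at least $\left(1 - \frac{1}{p(N)}\right) s \binom{N+\ell}{N}$. Combining this with Lemma 2, we get the following for the size $s'$ of a $\Sigma\Pi^{[D]}\Sigma\Pi^{[t]}$ circuit computing $f$:
$$s' \ge \frac{\left(1 - \frac{1}{p(N)}\right) s \binom{N+\ell}{N}}{\binom{D}{k}\binom{N+\ell+kt-k}{N}}.$$
Suppose we choose $\ell$ such that $(kt-k)^2 = o(\ell)$. Then by applying Lemma 1 we can easily show that $\binom{N+\ell+kt-k}{N} / \binom{N+\ell}{N} \approx \left(1 + \frac{N}{\ell}\right)^{(kt-k)}$. We get the following theorem from the above discussion.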
Theorem 3. Let $f(X)$ be an $n^{O(1)}$-variate polynomial of degree $n$. Let there be $s \ge n^{\delta k}$ ($\delta$ is some constant $> 0$) different polynomials in $\langle \partial^{=k}(f) \rangle$ for $k = \epsilon\sqrt{n}$ (where $0 < \epsilon < 1$ is any constant) such that any two of their leading monomials have distance at least $d \ge \frac{n}{c}$ for a constant $c > 1$. Then any depth-4 $\Sigma\Pi^{[O(\sqrt{n})]}\Sigma\Pi^{[\sqrt{n}]}$ circuit that computes $f(X)$ must be of size $s' \ge e^{\Omega(\sqrt{n}\ln n)}$.
Proof. All we need to do is to choose $\ell$ appropriately so that we get the desired lower bound. Let $N$ be the number of variables in $f$. Then from the upper bound estimate of $\ell$, we get that $\ell \le \frac{N(n/c)}{2(\delta k \ln n + 2\ln n)}$ by fixing the inverse polynomial to $\frac{1}{n^2}$. So it is enough to choose $\ell$ such that $\ell \le \frac{Nn}{4c\delta k \ln n} = \frac{N\sqrt{n}}{4c\delta\epsilon \ln n}$. From the lower bound arguments mentioned above, we know that
$$s' \ge \frac{\left(1 - \frac{1}{n^2}\right) n^{\delta k}}{\binom{D}{k}\left(1 + \frac{N}{\ell}\right)^{(kt-k)}}.$$
To get the required lower bound it is enough to choose $\ell$ such that $\frac{Nkt}{\ell} < \mu\delta k \ln n$ for some $0 < \mu < 1$. Since $t \le \sqrt{n}$, it is enough to take $\ell > \frac{N\sqrt{n}}{\mu\delta \ln n}$. By comparing the lower and upper bounds of $\ell$, we get that $\epsilon < \frac{\mu}{4c}$. Once we fix $\mu$, we need to choose $\epsilon < \min\{c', \frac{\mu}{4c}\}$.
In the next section, we show that the lower bounds on the size of $\Sigma\Pi^{[O(\sqrt{n})]}\Sigma\Pi^{[\sqrt{n}]}$ circuits computing $\mathrm{NW}(X)$ and $\mathrm{IMM}_{n,n}(X)$ can be obtained by simply applying Theorem 3. Moreover, it shows that the lower bound arguments for $\mathrm{IMM}_{n,n}(X)$ are essentially the same as the lower bound arguments for $\mathrm{NW}(X)$.

Lower bounds on the size of depth-4 circuits computing $\mathrm{NW}(X)$ and $\mathrm{IMM}_{n,d}(X)$
Now we derive the depth-4 circuit size lower bound for the $\mathrm{NW}(X)$ polynomial by a simple application of Theorem 3.
Proof. Recall that $\mathrm{NW}(X) = \sum_{a(z) \in \mathbb{F}[z]} x_{1a(1)}\, x_{2a(2)} \cdots x_{na(n)}$, where $\mathbb{F}$ is a finite field of size $n$ and $a(z)$ ranges over the polynomials of degree $\le k-1$. Notice that any two monomials can intersect in at most $k-1$ variables. Here we fix an ordering on the variables: $x_{11} \succ x_{12} \succ \cdots \succ x_{nn}$. We differentiate the polynomial $\mathrm{NW}(X)$ with respect to the first $k = \epsilon\sqrt{n}$ variables of each monomial. After differentiation, we get $n^k$ monomials of length $(n-k)$ each. Since they are constructed from the images of univariate polynomials of degree at most $(k-1)$, the distance $d$ between any two monomials is $\ge n - 2k > n/2$. So to get the required lower bound we apply Theorem 3 with $\delta = 1$ and $c = 2$. The parameter $\epsilon$ will get fixed from Theorem 3.
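The intersection property used in this proof is easy to check by enumeration for a small field. The sketch below assumes a prime field size $p$ and identifies $\mathbb{F}_p$ with $\{0, 1, \ldots, p-1\}$ rather than $\{1, \ldots, n\}$; each monomial is represented by its support, the set of index pairs $(i, a(i))$. The names `nw_monomials` and `max_overlap` are hypothetical.

```python
from itertools import combinations, product


def nw_monomials(p, k):
    """Supports {(i, a(i))} of the NW monomials for all a in F_p[z] of degree < k."""
    monos = []
    for coeffs in product(range(p), repeat=k):
        support = frozenset(
            (i, sum(c * pow(i, e, p) for e, c in enumerate(coeffs)) % p)
            for i in range(p)
        )
        monos.append(support)
    return monos


def max_overlap(monos):
    """Largest number of variables shared by two distinct monomials."""
    return max(len(a & b) for a, b in combinations(monos, 2))


if __name__ == "__main__":
    p, k = 5, 2
    monos = nw_monomials(p, k)
    print(len(monos), max_overlap(monos))
```

Two distinct polynomials of degree less than $k$ agree on at most $k-1$ points, which is why the overlap reported by `max_overlap` cannot exceed $k-1$.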
Next we derive the lower bound on the size of the depth-4 circuit computing $\mathrm{IMM}_{n,n}(X)$.
Proof. Recall that $\mathrm{IMM}_{n,n}(X) = \sum_{i_1, i_2, \ldots, i_{n-1} \in [n]} x^{(1)}_{1 i_1}\, x^{(2)}_{i_1 i_2} \cdots x^{(n)}_{i_{n-1} 1}$. It is a polynomial over $(n-2)n^2 + 2n$ variables. Next we fix the following lexicographic ordering on the variables of the set of matrices $\{X^{(1)}, X^{(2)}, \ldots, X^{(n)}\}$: $X^{(1)} \succ X^{(2)} \succ X^{(3)} \succ \cdots \succ X^{(n)}$, and in any $X^{(i)}$ the ordering is $x^{(i)}_{11} \succ x^{(i)}_{12} \succ \cdots \succ x^{(i)}_{nn}$. Choose a prime $p$ such that $\frac{n}{2} \le p \le n$. Consider the set of univariate polynomials $a(z) \in \mathbb{F}_p[z]$ of degree at most $(k-1)$ for $k = \epsilon\sqrt{n}$, where $\epsilon$ is a small constant to be fixed later in the analysis.
Consider a set of $2k$ of those matrices such that they are $\frac{n}{4k}$ distance apart. Clearly $2k + 1 + \frac{(2k-1)n}{4k} < n$. For each univariate polynomial $a$ of degree at most $(k-1)$, define a set $S_a$ of variables. The number of such sets is at least $\left(\frac{n}{2}\right)^k$ and $|S_a \cap S_b| < k$ for $a \ne b$. Now we consider a polynomial $f(X)$ which is a restriction of the polynomial $\mathrm{IMM}_{n,n}(X)$. By restriction we simply mean that a few variables of $\mathrm{IMM}_{n,n}(X)$ are fixed to some elements from the field and the rest of the variables are left untouched. We define the restriction as follows.
The rest of the variables are left untouched. Next we differentiate the polynomial $f(X)$ with respect to the sets of variables $S_a$ indexed by the polynomials $a(z) \in \mathbb{F}[z]$. Consider the leading monomials of the derivatives with respect to the sets $S_a$ for all $a(z) \in \mathbb{F}[z]$. Since $|S_a \cap S_b| < k$, it is straightforward to observe that the distance between any two leading monomials is at least $k \cdot \frac{n}{4k} = \frac{n}{4}$. The intuitive justification is that whenever there is a difference between $S_a$ and $S_b$, that difference can be stretched to a distance of $\frac{n}{4k}$ because of the restriction that eliminates non-diagonal entries.
Now we prove the lower bound for the polynomial $f(X)$ by applying Theorem 3. Clearly $s \ge (n/2)^k > n^{\frac{1}{4}(2k)}$. So we set the parameters $\delta = 1/4$ and $c = 4$. The parameter $\epsilon$ will get fixed from Theorem 3. Since $f(X)$ is a restriction of $\mathrm{IMM}_{n,n}(X)$, any lower bound for $f(X)$ is a lower bound for $\mathrm{IMM}_{n,n}(X)$ too: if $\mathrm{IMM}_{n,n}(X)$ has a $2^{o(\sqrt{n}\log n)}$ sized $\Sigma\Pi^{[O(\sqrt{n})]}\Sigma\Pi^{[\sqrt{n}]}$ circuit, then substituting the fixed values for the restricted variables gives a circuit of the same form and size for $f(X)$, which contradicts the lower bound for $f(X)$.

Determinantal complexity of $\mathrm{IMM}_{n,d}$
We start by recalling a few facts from [CCL08]. Let $A_{k,\ell}(X)$, $1 \le k, \ell \le m$, be affine linear functions over $\mathbb{F}[X]$ such that the following is true.
$$\mathrm{IMM}_{n,d}(X) = \det((A_{k,\ell}(X))_{1 \le k,\ell \le m})$$
Consider a point $X_0 \in \mathbb{F}^{n^2 d}$ such that $\mathrm{IMM}_{n,d}(X_0) = 0$. The affine linear functions $A_{k,\ell}(X)$ can be expressed as $L_{k,\ell}(X - X_0) + y_{k,\ell}$, where $L_{k,\ell}$ is a linear form and $y_{k,\ell}$ is a constant from the field. Thus, $(A_{k,\ell}(X))_{1 \le k,\ell \le m} = (L_{k,\ell}(X - X_0))_{1 \le k,\ell \le m} + Y_0$. If $\mathrm{IMM}_{n,d}(X_0) = 0$ then $\det(Y_0) = 0$. Let $C$ and $D$ be two non-singular matrices such that $C Y_0 D$ is a diagonal matrix.
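We use $H_{\mathrm{IMM}_{n,d}}(X)$ to denote the Hessian matrix of the iterated matrix multiplication polynomial, which is defined as follows.
$$H_{\mathrm{IMM}_{n,d}}(X) = \left( \frac{\partial^2\, \mathrm{IMM}_{n,d}(X)}{\partial x^{(a)}_{ij}\, \partial x^{(b)}_{k\ell}} \right),$$
where the rows and the columns are indexed by the $n^2 d$ variables $x^{(a)}_{ij}$.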
By taking second order derivatives and evaluating the Hessian matrices of $\mathrm{IMM}_{n,d}(X)$ and $\det((A_{k,\ell}(X))_{1 \le k,\ell \le m})$ at $X_0$, we obtain $H_{\mathrm{IMM}_{n,d}}(X_0) = L\, H_{\det}(Y_0)\, L^T$, where $L$ is an $n^2 d \times m^2$ matrix with entries from the field. It follows that $\mathrm{rank}(H_{\mathrm{IMM}_{n,d}}(X_0)) \le \mathrm{rank}(H_{\det}(Y_0))$. We give an explicit construction of a point $X_0 \in \mathbb{F}^{n^2 d}$ such that $\mathrm{IMM}_{n,d}(X_0) = 0$ and $\mathrm{rank}(H_{\mathrm{IMM}_{n,d}}(X_0)) \ge d(n-1)$.
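The factorization of the Hessian is an instance of the chain rule. The following derivation is a sketch under the assumption that $x_a, x_b$ range over the $n^2 d$ variables and that the entries of $L$ are the coefficients of the linear forms, i.e. the row of $L$ indexed by $x_a$ has entry $\partial A_{k,\ell} / \partial x_a$ in the column indexed by $(k,\ell)$.

```latex
% Chain rule for f(X) = det(A(X)) with affine entries A_{k,l}(X) = L_{k,l}(X - X_0) + y_{k,l}.
% Second derivatives of the affine functions A_{k,l} vanish, so only one term survives:
\[
\frac{\partial^2 f}{\partial x_a \, \partial x_b}(X_0)
  = \sum_{(k,\ell)} \sum_{(k',\ell')}
    \frac{\partial A_{k,\ell}}{\partial x_a}
    \cdot
    \frac{\partial^2 \det}{\partial y_{k,\ell} \, \partial y_{k',\ell'}}(Y_0)
    \cdot
    \frac{\partial A_{k',\ell'}}{\partial x_b}.
\]
% In matrix form this reads H_f(X_0) = L H_det(Y_0) L^T, and therefore
\[
\operatorname{rank}\bigl(H_f(X_0)\bigr)
  = \operatorname{rank}\bigl(L \, H_{\det}(Y_0) \, L^{T}\bigr)
  \le \operatorname{rank}\bigl(H_{\det}(Y_0)\bigr).
\]
```

The inequality is the standard fact that multiplying by a matrix on either side cannot increase the rank.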

Upper bound for the rank of $H_{\det}(Y_0)$
This analysis is the same as the one given in [CCL08] and [MR04], and we briefly recall it here for the sake of completeness. The second order derivative of $\det(Y)$ with respect to the variables $y_{ij}$ and $y_{k\ell}$ eliminates the rows $\{i, k\}$ and the columns $\{j, \ell\}$. Considering the form of $Y_0$, the non-zero entries in $H_{\det}(Y_0)$ are obtained only if $1 \in \{i, k\}$ and $1 \in \{j, \ell\}$, and thus $(ij, k\ell)$ are of the form $(11, tt)$ or $(t1, 1t)$ or $(1t, t1)$ for any $t > 1$. This gives $\mathrm{rank}(H_{\det}(Y_0)) = O(m)$.
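The $O(m)$ bound can be observed numerically for small $m$. The sketch below assumes, as in [CCL08], that $Y_0$ has been brought to the diagonal form $\mathrm{diag}(0, 1, \ldots, 1)$; this specific form is an assumption of the example, since the text above only says "a diagonal matrix". Because the determinant has degree at most one in each entry, every mixed second derivative is computed exactly as a second difference. The function names are hypothetical, and the Leibniz formula is used only because the matrices are tiny.

```python
from fractions import Fraction
from itertools import permutations


def det(M):
    """Determinant by the Leibniz formula (adequate for very small matrices)."""
    m = len(M)
    total = 0
    for perm in permutations(range(m)):
        sign = 1
        for a in range(m):
            for b in range(a + 1, m):
                if perm[a] > perm[b]:
                    sign = -sign
        term = sign
        for r in range(m):
            term *= M[r][perm[r]]
        total += term
    return total


def hessian_of_det(Y):
    """Hessian of det at Y; rows and columns are indexed by the m*m entries."""
    m = len(Y)
    cells = [(i, j) for i in range(m) for j in range(m)]

    def bumped(*positions):
        Z = [row[:] for row in Y]
        for i, j in positions:
            Z[i][j] += 1
        return det(Z)

    base = det(Y)
    return [
        [0 if p == q else bumped(p, q) - bumped(p) - bumped(q) + base for q in cells]
        for p in cells
    ]


def rank(M):
    """Rank over the rationals by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in M]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


if __name__ == "__main__":
    for m in (3, 4):
        Y0 = [[1 if i == j and i > 0 else 0 for j in range(m)] for i in range(m)]
        print(m, rank(hessian_of_det(Y0)))
```

For this $Y_0$ the non-zero entries sit exactly at the index pairs described above together with their symmetric counterparts, and the rank works out to $2m$.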

Lower bound for the rank of $H_{\mathrm{IMM}_{n,d}}(X_0)$
In this section, we prove Theorem 2. In particular, we give a polynomial time algorithm to construct a point $X_0$ explicitly such that $\mathrm{IMM}_{n,d}(X_0) = 0$ and $\mathrm{rank}(H_{\mathrm{IMM}_{n,d}}(X_0)) \ge d(n-1)$. From the discussion in Section 5.1 and the upper bound for $\mathrm{dc}(\mathrm{IMM}_{n,d}(X))$ from [Nis91], it is clear that such a construction is sufficient to prove Theorem 2.
Theorem 4. For any integers $n, d > 1$, there is a point $X_0 \in \mathbb{F}^{n^2 d}$ such that $\mathrm{IMM}_{n,d}(X_0) = 0$ and $\mathrm{rank}(H_{\mathrm{IMM}_{n,d}}(X_0)) \ge d(n-1)$. Moreover, the point $X_0$ can be constructed explicitly in polynomial time.
Proof. We prove the theorem by induction on $d$. For the purpose of induction, we maintain that the entries indexed by the indices $(1,2), (1,3), \ldots, (1,n)$ of the matrix obtained after multiplying the first $(d-1)$ matrices are not all zero at $X_0$.
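We first prove the base case for $d = 2$. The corresponding polynomial is $\mathrm{IMM}_{n,2}(X) = \sum_{i=1}^{n} x^{(1)}_{1i}\, x^{(2)}_{i1}$. It is easy to observe that the rank of the Hessian matrix is $2n > 2(n-1)$ at any point, since each non-zero entry of the Hessian matrix is 1 and the structure of the Hessian matrix is the following:
$$H_{\mathrm{IMM}_{n,2}}(X) = \begin{pmatrix} 0 & B_{12} \\ B_{21} & 0 \end{pmatrix},$$
where the only non-zero rows of $B_{12}$ are the ones indexed by $x^{(1)}_{11}, x^{(1)}_{12}, \ldots, x^{(1)}_{1n}$ (the row of $x^{(1)}_{1i}$ has its single 1 in the column of $x^{(2)}_{i1}$) and $B_{21} = B_{12}^T$.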
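The base case can be verified mechanically for small $n$. In the sketch below all names are assumptions of this example: the $2n^2$ variables are stored in a flat list (first the entries of $X^{(1)}$ row by row, then those of $X^{(2)}$), `imm2` evaluates $\mathrm{IMM}_{n,2}$, and since the polynomial is multilinear its Hessian is obtained exactly from second differences.

```python
from fractions import Fraction


def imm2(v, n):
    """IMM_{n,2} on a flat list: X^(1) row-major in v[:n*n], X^(2) in v[n*n:]."""
    return sum(v[i] * v[n * n + i * n] for i in range(n))


def hessian(f, point):
    """Exact Hessian of a multilinear polynomial f via second differences."""
    size = len(point)

    def at(*bump):
        w = list(point)
        for t in bump:
            w[t] += 1
        return f(w)

    base = f(point)
    return [
        [0 if a == b else at(a, b) - at(a) - at(b) + base for b in range(size)]
        for a in range(size)
    ]


def rank(M):
    """Rank over the rationals by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in M]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


if __name__ == "__main__":
    n = 3
    H = hessian(lambda w: imm2(w, n), [0] * (2 * n * n))
    print(rank(H))  # 2n
```

The Hessian does not depend on the evaluation point here because $\mathrm{IMM}_{n,2}$ is a quadratic form, which is why the rank is $2n$ at any point.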
Thus, we have the following expression.
polynomial $\mathrm{IMM}_{n,d}(X)$. This also makes the rows indexed by the variables x linearly independent. It is important to note that $P_{11}(X) = \mathrm{IMM}_{n,d}(X)$. Now, let us define a point such that it is a zero of the polynomial $\mathrm{IMM}_{n,(d+1)}(X)$. We set = 0. Inductively fix the variables appearing in $P_{11}(X)$ by the values assigned by $X_0$, which is a zero of the polynomial $\mathrm{IMM}_{n,d}(X)$. We will fix the other variables suitably later. We call the new point $X_0$ as well. Now, consider the first $d \times d$ blocks of the Hessian matrix $H_{\mathrm{IMM}_{n,(d+1)}}(X_0)$. They precisely represent the Hessian matrix of $P_{11}(X)$, which is also the Hessian matrix of the polynomial $\mathrm{IMM}_{n,d}(X)$ at the point $X_0$. By the induction hypothesis, the rank of this minor of $H_{\mathrm{IMM}_{n,(d+1)}}(X_0)$ is at least $d(n-1)$.
The only non-zero entries in the columns indexed by the variable set $X^{(d)}$ are indexed by the variables x in $B_{(d+1)d}$ are linearly independent of the rows of $B_{1d}, B_{2d}, \ldots, B_{dd}$. Hence the rank of $H_{\mathrm{IMM}_{n,(d+1)}}$ at the point described is $\ge (d+1)(n-1)$.
For the purpose of induction, we must verify that the entries indexed by the indices $(1,2), (1,3), \ldots, (1,n)$ of the matrix obtained after multiplying the first $d$ matrices are not all zero at $X_0$. These entries are the polynomials $P_{12}, P_{13}, \ldots, P_{1n}$. We shall express each of the polynomials in terms of $s_1, s_2, \ldots, s_n$, the entries of the first row of the product of the first $(d-1)$ matrices, as follows.
$$P_{1j} = \sum_{i=1}^{n} s_i\, x^{(d)}_{ij}, \qquad 2 \le j \le n.$$
By the induction hypothesis, we already know that $s_2, s_3, \ldots, s_n$ are not all zero at $X_0$. Notice that the variables in $X^{(d)} \setminus \{x_{n1}\}$ were never set in the previous steps of induction. Therefore, we can fix these variables suitably such that $P_{12}, P_{13}, \ldots, P_{1n}$ are not all zero when evaluated at the point $X_0$ (in fact, we can make all of them non-zero). It is clear that we construct the point $X_0$ in polynomial time. This completes the proof.