Unbalancing Sets and An Almost Quadratic Lower Bound for Syntactically Multilinear Arithmetic Circuits

We prove a lower bound of $\Omega(n^2/\log^2 n)$ on the size of any syntactically multilinear arithmetic circuit computing some explicit multilinear polynomial $f(x_1, \ldots, x_n)$. Our approach expands and improves upon a result of Raz, Shpilka and Yehudayoff ([34]), who proved a lower bound of $\Omega(n^{4/3}/\log^2 n)$ for the same polynomial. Our improvement follows from an asymptotically optimal lower bound for a generalized version of Galvin's problem in extremal set theory. A special case of our combinatorial result implies, for every $n$, a tight $\Omega(n)$ lower bound on the minimum size of a family $\mathcal{F}$ of subsets of cardinality $2n$ of a set $X$ of size $4n$, so that any subset of $X$ of size $2n$ has intersection of size exactly $n$ with some member of $\mathcal{F}$. This settles a problem of Galvin up to a constant factor, extending results of Frankl and Rödl [15] and Enomoto et al. [12], who proved in 1987 the above statement (with a tight constant) for odd values of $n$, leaving the even case open.


Introduction
An arithmetic circuit is one of the most natural and standard computational models for computing multivariate polynomials. Arithmetic circuits provide a succinct representation of multivariate polynomials, and in some sense, they can be thought of as algebraic analogs of boolean circuits. Formally, an arithmetic circuit over a field $\mathbb{F}$ and a set of variables $X = \{x_1, x_2, \ldots, x_n\}$ is a directed acyclic graph in which every vertex has in-degree either zero or two. The vertices of in-degree zero (called leaves) are labeled by variables in $X$ or elements of $\mathbb{F}$, and the vertices of in-degree two are labeled by either $+$ (called sum gates) or $\times$ (called product gates). A circuit can have one or more vertices of out-degree zero, known as the output gates. The polynomial computed by a vertex¹ in any given circuit is naturally defined in an inductive way: a leaf computes the polynomial which is equal to its label. A sum gate computes the polynomial which is the sum of the polynomials computed at its children, and a product gate computes the polynomial which is the product of the polynomials computed at its children. The polynomials computed by a circuit are the polynomials computed by its output gates. The size of an arithmetic circuit is the number of vertices in it.
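The inductive definition above can be illustrated with a minimal Python sketch. The list-of-tuples encoding of a circuit and the function name `evaluate` are assumptions made for this illustration only; the sketch evaluates each gate at a rational point rather than computing the polynomial symbolically.

```python
from fractions import Fraction

# Hypothetical encoding: a circuit is a topologically ordered list of gates.
# ("var", name) and ("const", value) are leaves; ("+", a, b) and ("*", a, b)
# are sum and product gates whose children are the earlier gates a and b.
def evaluate(circuit, assignment):
    """Return the value computed at every gate under the given assignment."""
    values = []
    for gate in circuit:
        kind = gate[0]
        if kind == "var":
            values.append(Fraction(assignment[gate[1]]))
        elif kind == "const":
            values.append(Fraction(gate[1]))
        elif kind == "+":
            values.append(values[gate[1]] + values[gate[2]])
        elif kind == "*":
            values.append(values[gate[1]] * values[gate[2]])
        else:
            raise ValueError(f"unknown gate kind: {kind}")
    return values

# The circuit (x1 + x2) * x3 has five vertices, so its size is 5.
example = [("var", "x1"), ("var", "x2"), ("var", "x3"), ("+", 0, 1), ("*", 3, 2)]
out = evaluate(example, {"x1": 2, "x2": 3, "x3": 4})[-1]
```

Here the last gate is the only vertex of out-degree zero, so it is the output gate.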
It is not hard to show (see, e.g., [CKW11]) that, with overwhelmingly high probability, a random polynomial of degree $d = \mathrm{poly}(n)$ in $n$ variables cannot be computed by an arithmetic circuit of size $\mathrm{poly}(n)$. A fundamental problem in this area of research is to prove a similar super-polynomial lower bound for an explicit polynomial family. Unfortunately, the problem remains wide open, and the current best lower bound known for general arithmetic circuits² is an $\Omega(n \log n)$ lower bound due to Strassen [Str73] and Baur and Strassen [BS83] from more than three decades ago. The absence of substantial progress on this general question has led to a focus on proving better lower bounds for restricted and more structured subclasses of arithmetic circuits. Arithmetic formulas [Kal85], non-commutative arithmetic circuits [Nis91], algebraic branching programs [Kum17], and low depth arithmetic circuits [NW97, GK98, GR00, Raz10, GKKS14, FLMS14, KLSS14, KS14, KS17] are some such subclasses which have been studied from this perspective. For an overview of the definitions of these models and the state of the art for lower bounds for them, we refer the reader to the surveys of Shpilka and Yehudayoff [SY10] and Saptharishi [Sap16].
Several of the most important polynomials in algebraic complexity and in mathematics in general are multilinear. Notable examples include the determinant, the permanent, and the elementary symmetric polynomials. Therefore, one subclass which has received a lot of attention in the last two decades and will be the focus of this paper is the class of multilinear arithmetic circuits.

Multilinear arithmetic circuits
For an arithmetic circuit $\Psi$ and a vertex $v$ in $\Psi$, we denote by $X_v$ the set of variables $x_i$ such that there is a directed path from a leaf labeled by $x_i$ to $v$; in this case, we also say that $v$ depends on $x_i$³. A polynomial $P$ is said to be multilinear if the individual degree of every variable in $P$ is at most one.
An arithmetic circuit $\Psi$ is said to be syntactically multilinear if for every multiplication gate $v$ in $\Psi$ with children $u$ and $w$, the sets of variables $X_u$ and $X_w$ are disjoint. We say that $\Psi$ is semantically multilinear if the polynomial computed at every vertex is a multilinear polynomial. Observe that if $\Psi$ is a syntactically multilinear circuit, then it is also semantically multilinear. However, it is not clear if every semantically multilinear circuit can be efficiently simulated by a syntactically multilinear circuit.

¹ Throughout this paper, we will use the terms gates and vertices interchangeably.
² In the rest of the paper, when we say a lower bound, we always mean it for an explicit polynomial family.
³ We remark that this is a syntactic notion of dependency, since it is possible that every monomial with $x_i$ might get canceled in the intermediate computation and might not eventually appear in the polynomial computed at $v$.
A multilinear circuit is a natural model for computing multilinear polynomials, but it is not necessarily the most efficient one. Indeed, it is remarkable that all the constructions of polynomial size arithmetic circuits for the determinant [Csa76,Ber84,MV97], which are fundamentally different from one another, nevertheless share the property of being non-multilinear, namely, they involve non-multilinear intermediate computations which eventually cancel out. There are no subexponential-size multilinear circuits known for the determinant, and one may very well conjecture these do not exist at all.
Multilinear circuits were first studied by Nisan and Wigderson [NW97]. Subsequently, Raz [Raz09] defined the notion of multilinear formulas and showed that any multilinear formula computing the determinant or the permanent of an $n \times n$ variable matrix must have superpolynomial size. In a follow-up work [Raz06], Raz further strengthened the results in [Raz09] and showed that there is a family of multilinear polynomials in $n$ variables which can be computed by $\mathrm{poly}(n)$ size syntactically multilinear arithmetic circuits but require multilinear formulas of size $n^{\Omega(\log n)}$.
Building on the ideas and techniques developed in [Raz09], Raz and Yehudayoff [RY09] showed an exponential lower bound for syntactically multilinear circuits of constant depth. Interestingly, they also showed a super-polynomial separation between depth ∆ and depth ∆ + 1 syntactically multilinear circuits for constant ∆.
In spite of the aforementioned progress on the question of lower bounds for multilinear formulas and bounded depth syntactically multilinear circuits, no $\Omega(n^{1+\varepsilon})$ lower bound was known for general syntactically multilinear circuits for any constant $\varepsilon > 0$. In fact, the results in [Raz06] show that the main technical idea underlying the results in [Raz09, Raz06, RY09] is unlikely to directly give a super-polynomial lower bound for general syntactically multilinear circuits. However, a weaker super-linear lower bound still seemed conceivable via similar techniques.
Raz, Shpilka and Yehudayoff [RSY08] showed that this is indeed the case. By a sophisticated and careful application of the techniques in [Raz09] along with many other ideas, they showed an $\Omega(n^{4/3}/\log^2 n)$ lower bound for an explicit $n$-variate polynomial. Since then, this has remained the best lower bound known for syntactically multilinear circuits. In this paper, we improve this result by showing an almost quadratic lower bound for syntactically multilinear circuits for an explicit $n$-variate polynomial. In fact, the family of hard polynomials in this paper is the same as the one used in [RSY08]. We now formally state our result.
Theorem 1.1. There is an explicit family of polynomials $\{f_n : n = 4p \text{ for a prime } p\}$, where $f_n$ is an $n$-variate multilinear polynomial, such that any syntactically multilinear arithmetic circuit computing $f_n$ must have size at least $\Omega(n^2/\log^2 n)$.
For our proof, we follow the strategy in [RSY08]. Our improvement comes from an improvement in a key lemma in [RSY08] which addresses the following combinatorial problem. Stated informally, Raz, Shpilka and Yehudayoff [RSY08] showed that $m(n) \ge \Omega(n^{1/3}/\log n)$. For our proof, we essentially show that $m(n) \ge \Omega(n/\log n)$.
In addition to its application to the proof of Theorem 1.1, Question 1.2 seems to be a natural problem in extremal combinatorics and might be of independent interest, and special cases thereof were studied in the combinatorics literature. In the next section, we briefly discuss the state of art of this question and state our main technical result in Theorem 1.4.

Unbalancing Sets
The following question, which is of very similar nature to Question 1.2, is known as Galvin's problem (see [FR87, EFIN87]): what is the minimum size $m(n)$ of a family $\mathcal{F}$ of subsets of $[4n]$, each of size $2n$, such that every subset of $[4n]$ of size $2n$ intersects some member of $\mathcal{F}$ in exactly $n$ elements? Frankl and Rödl [FR87] were able to show that $m(n) \ge \varepsilon n$ for some $\varepsilon > 0$ if $n$ is odd, and Enomoto, Frankl, Ito and Nomura [EFIN87] proved that $m(n) \ge 2n$ if $n$ is odd, which implies that even the constant in the construction given above is optimal. The question is still open for even values of $n$: in fact, Markert and West (unpublished, see [EFIN87]) showed that for $n \in \{2, 4\}$, $m(n) < 2n$.
For our purposes, we need to generalize Galvin's problem in two ways. The first is to lift the restriction on the set sizes. The second is to ask how small the size of the family $\mathcal{F} = \{S_1, \ldots, S_m\} \subseteq 2^{[n]}$ can be if we merely assume each balanced partition $T$ is "$\tau$-balanced" on some $S \in \mathcal{F}$, namely, if $\bigl||T \cap S| - |S|/2\bigr| \le \tau$ for some $S$ (the main case of interest for us is $\tau = O(\log n)$). Of course, since $T$ itself is balanced, very small or very large sets are always $\tau$-balanced, and thus we impose the non-triviality condition $10\tau \le |S| \le n - 10\tau$ for every $S \in \mathcal{F}$ (the constant 10 here is, of course, arbitrary).
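For very small universes, the $\tau$-balanced condition can be tested by brute force. The following Python sketch is only an illustration: the function names and the small interval family are assumptions chosen for this example, and the non-triviality condition on set sizes is not enforced.

```python
from itertools import combinations

def imbalance(T, S):
    """| |T ∩ S| - |S|/2 |, how far T is from splitting S evenly."""
    return abs(len(T & S) - len(S) / 2)

def balances_every_partition(family, n, tau):
    """True if every T ⊆ [n] with |T| = n/2 is tau-balanced on some S in family."""
    universe = range(1, n + 1)
    return all(
        any(imbalance(set(T), S) <= tau for S in family)
        for T in combinations(universe, n // 2)
    )

# Hypothetical example on the universe [8]: four intervals of length 4.
intervals = [set(range(i, i + 4)) for i in range(1, 5)]
single = [set(range(1, 5))]

all_balanced = balances_every_partition(intervals, 8, 0)
one_set_enough = balances_every_partition(single, 8, 0)
```

The enumeration visits all $\binom{n}{n/2}$ balanced sets, so it is usable only for tiny $n$; it is meant to make the definition concrete, not to probe the asymptotic question.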
We propose the conjecture that, perhaps up to a constant, this construction is optimal.
Conjecture 1.3. Let $n$ be a positive even integer, and
We remark that the relevance of conjectures of the form of Conjecture 1.3 to lower bounds in algebraic complexity was also observed by Jansen [Jan08], who considered the problem of obtaining lower bounds for homogeneous syntactically multilinear algebraic branching programs (which are a weaker model than syntactically multilinear circuits).
Alon, Bergmann, Coppersmith and Odlyzko [ABCO88] considered a very similar problem of balancing $\pm 1$-vectors: they studied families of vectors which satisfy the property that for every $w \in \{\pm 1\}^n$ (not necessarily balanced), there
They generalized a construction of Knuth [Knu86] and proved a matching lower bound which together showed that $m = \lceil n/(d+1) \rceil$ is both necessary and sufficient for such a set to exist. Galvin's problem seems like "the $\{0, 1\}$ version" of the same problem, but, to quote from [ABCO88], there does not seem to be any simple dependence between the problems.
While we do not prove Conjecture 1.3 in its full form, we are able to prove a special case which is enough to derive the lower bounds for syntactically multilinear circuits. We prove:
Apart from the constants, which we did not try to optimize, Theorem 1.4 is weaker than Conjecture 1.3 in two ways. The first is the requirement that $n$, the universe size, equals $4p$ for some prime $p$. The second is the requirement that $\tau \ge \log p$. In fact, our proof implies something a bit stronger: the unbalancedness parameter $\tau$ can be picked to be even smaller, as long as we assume $|S_i| \ge C \log p$ for a large enough constant $C$.
Since Conjecture 1.3 is a fairly natural conjecture in extremal combinatorics, it will be interesting to remove either of these restrictions. However, this does not seem to imply any immediate improvements in our lower bound, even up to logarithmic factors.

Proof overview
In this section, we discuss the main ideas and give a brief sketch of the proofs of Theorem 1.1 and Theorem 1.4. Since our proof heavily depends on the proof in [RSY08] and follows the same strategy, we start by revisiting the main steps in their proof and noting the key differences between the proof in [RSY08] and our proof. We also outline the reduction to the combinatorial problem of unbalancing set families in Question 1.2.

Proof sketch of [RSY08]
The proof in [RSY08] starts by proving a syntactically multilinear analog of a classical result of Baur and Strassen [BS83], where it was shown that if an $n$-variate polynomial $f$ is computable by an arithmetic circuit $\Psi$ of size $s(n)$, then there is an arithmetic circuit $\Psi'$ of size at most $5s(n)$ with $n$ outputs such that the $i$-th output gate of $\Psi'$ computes $f_i = \partial f/\partial x_i$. Raz, Shpilka and Yehudayoff show that if $\Psi$ is syntactically multilinear, then the circuit $\Psi'$ continues to be syntactically multilinear. Additionally, there is no directed path from a leaf labeled by $x_i$ to the output gate computing $f_i$. Once we have this structural result, it suffices to prove a lower bound on the size of $\Psi'$. For brevity, we denote the subcircuit of $\Psi'$ rooted at the output gate computing $f_i$ by $\Psi'_i$. As a key step of the proof in [RSY08], the authors identify certain sets of vertices $U_1, U_2, \ldots, U_n$ in $\Psi'$ with the following properties.
• For every $i \in [n]$, $U_i$ is a subset of vertices in $\Psi'_i$.
• For every $i \in [n]$ and $v \in U_i$, the number of $j \ne i$ such that $v \in U_j$ is not too large (at most $O(\log n)$).
Observe that at this point, showing a lower bound of $s'(n)$ on the size of each $U_i$ implies a lower bound of $n s'(n)/\log n$ on the size of $\Psi'$ and hence $\Psi$. In [RSY08], the authors show that there is an explicit $f$ such that each $U_i$ must have size at least $\Omega(n^{1/3}/\log n)$, thereby getting a lower bound of $\Omega(n^{4/3}/\log^2 n)$ on the size of $\Psi$. For our proof, we follow precisely this high level strategy. Our improvement in the lower bound comes from showing that each $U_i$ must be of size at least $\Omega(n/\log n)$ and not just $\Omega(n^{1/3}/\log n)$ as shown in [RSY08]. We now elaborate further on the main ideas in this step in [RSY08] and the differences with the proofs in this paper.
We start with some intuition into the definition of the sets $U_i$ in [RSY08]. Consider a vertex $v$ in $\Psi'$ which depends on at least $k$ variables. Without loss of generality, let these variables be $\{x_1, x_2, \ldots, x_k\}$. From item 4 in Theorem 4.2, we know that the variable $x_i$ does not appear in the subcircuit $\Psi'_i$. Therefore, the vertex $v$ cannot appear in the subcircuits $\Psi'_1, \Psi'_2, \ldots, \Psi'_k$. So, if we define the set $U_i$ as the set of vertices in $\Psi'_i$ which depend on at least $k$ variables, then $U_i$ must be disjoint from vertices in at least $k$ of the subcircuits $\Psi'_1, \Psi'_2, \ldots, \Psi'_n$. Picking $k \ge n - O(\log n)$ would give us the desired property. So, if we can prove a lower bound on the size of the set $U_i$, we would be done. However, the definition of the set $U_i$ so far turns out to be too general, and we do not know a way of directly proving a lower bound on its size. To circumvent this obstacle, [RSY08] define the set $U_i$ (called the upper leveled gates in $\Psi'_i$) as the set of all vertices in $\Psi'_i$ which depend on at least $n - 6\log n$ variables and have a child which depends on more than $6\log n$ variables and less than $n - 6\log n$ variables. This additional structure is helpful in proving a lower bound on the size of $U_i$. We now discuss this in some more detail.
For every $i \in [n]$, let $L_i$ be the set of vertices $u$ in $\Psi'_i$ such that $6\log n < |X_u| < n - 6\log n$ and $u$ has a parent in $U_i$. These gates are referred to as lower leveled gates. Observe that $|U_i| \ge |L_i|/2$, since the in-degree of every vertex in $\Psi'_i$ is at most 2. The key structural property of the set $L_i$ is the following (see Proposition 5.5 in [RSY08]).
• The degree of g is at most O(log n).
Observe that Equation 1.6 is basically a decomposition of a potentially hard polynomial $f_i$ in terms of the sum of products of multilinear polynomials in an intermediate number of variables. The goal is to show that for an appropriate explicit $f_i$, the number of summands on the right hand side of Equation 1.6 cannot be too small. A similar scenario also appears in the multilinear formula lower bounds and bounded depth multilinear formula lower bounds of [Raz09, Raz06, RY09] (albeit with some key differences). Hence, a natural approach at this point would be to use the tools in [Raz09, Raz06, RY09], namely the rank of the partial derivative matrix, to attempt to prove this lower bound. We refer the reader to Section 2.1 for the definitions and properties of the partial derivative matrix and proceed with the overview. For each $j \in [\ell]$, let the polynomial $h_j$ in Lemma 1.5 depend on the variables $S_j \subseteq X$. The key technical step in the rest of the proof is to show that there is a partition of the set of variables $X = \{x_1, x_2, \ldots, x_n\}$ into $Y$ and $Z$ such that $|Y| = |Z|$ and for every $j \in [\ell]$, $\bigl||S_j \cap Y| - |S_j \cap Z|\bigr| \ge \Omega(\log n)$. In [RSY08], the authors show that there is an absolute constant $\varepsilon > 0$ such that if $\ell \le \varepsilon n^{1/3}/\log n$, then there is an equipartition of $X$ which unbalances all the sets $\{S_j : j \in [\ell]\}$ by at least $\Omega(\log n)$. Our key technical contribution (Theorem 1.4) in this paper is to show that as long as $\ell \le \varepsilon n/\log n$, there is an equipartition which unbalances all the $S_j$'s by at least $\Omega(\log n)$. This implies an $\Omega(n/\log n)$ lower bound on the size of each set $U_i$, and thus an $\Omega(n^2/\log^2 n)$ lower bound on the circuit size.
Before we dive into a more detailed discussion of the main ideas in the proof of Theorem 1.4 in the next section, we would like to remark that the lower bound question in Equation 1.6 seems to be a trickier question than what is encountered while proving multilinear formula lower bounds [Raz09, Raz06] or bounded depth syntactically multilinear circuit lower bounds [RY09]. The main differences are that in the proofs in [Raz09, Raz06, RY09], the sets $S_j$ have a stronger guarantee on their size (at least $n^{\Omega(1)}$ and at most $n - n^{\Omega(1)}$), and each of the summands on the right has many variable disjoint factors and not just two factors as in Equation 1.6. For instance, in the formula lower bound proofs the number of variable disjoint factors in each summand on the right is $\Omega(\log n)$, and for constant depth circuit lower bounds it is $n^{\Omega(1)}$. Together, these properties make it possible to show much stronger lower bounds on $\ell$. In particular, it is known that a random equipartition works for these two applications, in the sense that it unbalances sufficiently many factors in each summand, thereby implying that the rank of the partial derivative matrix of the polynomial is small. Hence, for an appropriate $f_i$, the number of summands must be large. However, since a set of size $O(\log n)$ is balanced under a random equipartition with probability $\Omega(1/\sqrt{\log n})$ and the identity in Equation 1.6 involves just two variable disjoint factors, taking a random equipartition would not enable us to prove any meaningful bounds.
Proof sketch of Theorem 1.4
Recall that our task is, given a small collection of subsets of $[n]$, to find a balanced partition which is unbalanced on each of the sets. Equivalently, we would like to prove that if $\mathcal{F}$ is a family of subsets such that every balanced partition balances at least one set in $\mathcal{F}$, then $|\mathcal{F}|$ must be large (of course, $\mathcal{F}$ must satisfy the conditions in Theorem 1.4). For the sake of simplicity, suppose all subsets $S \in \mathcal{F}$ are of even size, and assume further that for every subset $T \subseteq [n]$ of size $n/2$ there exists $S \in \mathcal{F}$ such that $T$ completely balances $S$, namely, $|T \cap S| = |S|/2$. One possible approach to obtain lower bounds on $|\mathcal{F}|$ is via an application of the polynomial method. Define the following polynomial over, say, the rationals:
$$f(x_1, \ldots, x_n) = \prod_{S \in \mathcal{F}} \Bigl( \sum_{i \in S} x_i - |S|/2 \Bigr).$$
By the assumption on $\mathcal{F}$, the polynomial $f$ evaluates to 0 over all points in $\{0, 1\}^n$ with Hamming weight exactly $n/2$. We can also argue, using the assumption on the set sizes in $\mathcal{F}$, that $f$ is not identically zero, and clearly $\deg(f) \le |\mathcal{F}|$. Thus, a lower bound on $\deg(f)$ translates to a lower bound on $|\mathcal{F}|$.
This idea, however, seems like a complete nonstarter, since there exists a degree 1 non-zero polynomial which evaluates to 0 over the middle layer of $\{0, 1\}^n$, namely, $\sum_i x_i - n/2$.
A very clever solution to this potential obstacle was found by Hegedűs [Heg10]. Suppose $n = 4p$ for some prime $p$. The main insight in [Heg10] is to consider the polynomial $f$ above over $\mathbb{F}_p$, and to add the requirement that there exists some $z \in \{0, 1\}^{4p}$, of Hamming weight exactly $3p$, such that $f(z) \ne 0$. This requirement rules out the trivial example $\sum_i x_i - n/2$, and Hegedűs was able to show that the degree of any polynomial with these properties must be at least $p = n/4$ (see Lemma 2.1 for the complete statement).
We are thus left with the task of proving that our polynomial evaluates to a non-zero value over some point $z \in \{0, 1\}^{4p}$ of Hamming weight $3p$. This turns out to be not very hard to show, by choosing a random such vector $z$. Indeed, it is not surprising that it is much easier to directly show that a highly unbalanced partition of $[n]$ (into $3n/4$ vs $n/4$) unbalances all the sets in $\mathcal{F}$. The goal of Theorem 1.4 is to show that there is even a balanced partition of $[n]$ which is unbalanced on all these subsets, and this is potentially a more challenging task.
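To make the non-vanishing requirement concrete, here is a small Python sketch for $p = 3$. The two-set family, the function names, and the use of exhaustive search in place of a random choice of $z$ are all assumptions of this illustration; it evaluates the simplified product polynomial (one linear factor per set) modulo $p$.

```python
from itertools import combinations

def product_poly_mod_p(T, family, p):
    """Evaluate prod over S in family of (|T ∩ S| - |S|/2), reduced mod p."""
    value = 1
    for S in family:
        value = (value * (len(T & S) - len(S) // 2)) % p
    return value

def find_nonzero_point(family, p):
    """Search all T ⊆ [4p] with |T| = 3p for one where the product is non-zero mod p."""
    for T in combinations(range(1, 4 * p + 1), 3 * p):
        if product_poly_mod_p(set(T), family, p) != 0:
            return set(T)
    return None

p = 3
# Hypothetical family of even-sized subsets of [12], chosen only for illustration.
family = [set(range(1, 5)), set(range(5, 11))]
witness = find_nonzero_point(family, p)

# The trivial polynomial sum_i x_i - n/2 takes the value 3p - 2p = p at weight 3p.
trivial_at_weight_3p = (3 * p - 2 * p) % p
```

The last line shows why the extra requirement excludes the degree 1 example: over $\mathbb{F}_p$ it vanishes at every point of weight $3p$ as well, since $3p - 2p = p \equiv 0 \pmod p$.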
Even though Lemma 2.1 seems to be a fundamental statement about polynomials over finite fields and could conceivably have an elementary proof, the proof in [Heg10] uses more advanced techniques. It relies on the description of Gröbner bases for ideals of polynomials in $\mathbb{F}[x_1, x_2, \ldots, x_n]$ which vanish on all points in $\{0, 1\}^n$ of weight equal to $n/2$. A complete description of the reduced Gröbner basis for such ideals was given by Hegedűs and Rónyai [HR03], and their proof builds on a number of earlier partial results [ARS02, FG06] on this problem.
To the best of our knowledge, the proof in [Heg10] is the only known proof of Lemma 2.1, and giving a self contained elementary proof of it seems to be an interesting question.

Organization of the paper
In the rest of the paper, we set up some notation and discuss some preliminary notions in Section 2, prove Theorem 1.4 in Section 3 and complete the proof of Theorem 1.1 in Section 4.

Preliminaries
For $n \in \mathbb{N}$, we denote $[n] = \{1, 2, \ldots, n\}$. For a prime $p$, we denote by $\mathbb{F}_p$ the finite field with $p$ elements. The characteristic vector of a set $S \subseteq [n]$ is denoted by $1_S \in \{0, 1\}^n$.
As is standard, $\binom{[n]}{k}$ denotes the family of all subsets of $[n]$ of size $k$. We use the following lemma from [Heg10].

Partial derivative matrix
For a circuit $\Psi$, we denote by $|\Psi|$ the size of $\Psi$, namely, the number of gates in it. For a gate $v$, we denote by $X_v$ the set of variables that occur in the subcircuit rooted at $v$.
Let $X = \{x_1, \ldots, x_n\}$ be a set of variables, $Y \subseteq X$ (not necessarily of size $n/2$) and let $Z = X \setminus Y$. For a multilinear polynomial $f(X) \in \mathbb{F}[X]$, we define the partial derivative matrix of $f$ with respect to $Y, Z$, denoted $M_{Y,Z}(f)$, as follows: the rows of $M$ are indexed by multilinear monomials in $Y$, and the columns of $M$ are indexed by multilinear monomials in $Z$. The entry which corresponds to $(m_1, m_2)$ is the coefficient of the monomial $m_1 \cdot m_2$ in $f$. We define $\mathrm{rank}_{Y,Z}(f) = \mathrm{rank}(M_{Y,Z}(f))$.
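The definition can be made concrete with a short Python sketch over the rationals. The dictionary encoding of a multilinear polynomial (a monomial is the frozenset of its variable indices) and the helper names are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import combinations

def subsets(variables):
    """All multilinear monomials over the given variables, as frozensets."""
    items = sorted(variables)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def partial_derivative_matrix(poly, Y, Z):
    """Entry (m1, m2) is the coefficient of m1 * m2 in poly (0 if absent)."""
    rows, cols = subsets(Y), subsets(Z)
    return [[Fraction(poly.get(r | c, 0)) for c in cols] for r in rows]

def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [row[:] for row in matrix]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            if factor != 0:
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

# f = (x1 + x2)(x3 + x4) = x1x3 + x1x4 + x2x3 + x2x4
f = {frozenset({1, 3}): 1, frozenset({1, 4}): 1, frozenset({2, 3}): 1, frozenset({2, 4}): 1}

rank_split = rank(partial_derivative_matrix(f, {1, 3}, {2, 4}))   # each factor is split
rank_whole = rank(partial_derivative_matrix(f, {1, 2}, {3, 4}))   # each factor on one side
```

The same polynomial has rank $4 = 2^{n/2}$ under the partition that splits both factors and rank 1 under the partition that keeps each factor on one side, which is why the choice of partition drives the whole argument.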
The following properties of the partial derivative matrix are easy to prove and well-documented (see, e.g., [RSY08]).
Proposition 2.2. The following properties hold:
2. For every two multilinear polynomials $f_1(X), f_2(X) \in \mathbb{F}[X]$ and for every partition
5. Let $f(X) \in \mathbb{F}[X]$ be a multilinear polynomial of total degree $d$. Then for every partition

Unbalancing sets under a balanced partition
In this section, we prove the following theorem.
We start with the following lemma, which shows that a small collection of sets can be unbalanced (modulo $p$) by a partition which is very unbalanced.
Proof. The probability that $|T| = 3p$ is given by $\binom{4p}{3p} \cdot (3/4)^{3p} \cdot (1/4)^{p}$, which is $\Theta(1/\sqrt{p})$, by Stirling's approximation.
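The $\Theta(1/\sqrt{p})$ estimate is easy to check numerically. The sketch below assumes, as the formula suggests, that $T$ is drawn by including each of the $4p$ elements independently with probability $3/4$; the function name is hypothetical.

```python
from math import comb, sqrt

def prob_weight_3p(p):
    """Pr[|T| = 3p] when each of 4p elements joins T independently with probability 3/4."""
    # comb(4p, 3p) * (3/4)^(3p) * (1/4)^p, computed with exact integers first.
    return comb(4 * p, 3 * p) * 3 ** (3 * p) / 4 ** (4 * p)

# If the probability is Theta(1/sqrt(p)), then prob * sqrt(p) stays bounded
# away from 0 and infinity as p grows.
scaled = {p: prob_weight_3p(p) * sqrt(p) for p in (101, 1009)}
```

By Stirling's approximation the scaled value should approach $\sqrt{2/(3\pi)}$, so the two entries of `scaled` are expected to be close to each other.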
The proof of Lemma 3.2 is now fairly immediate.
Proof of Lemma 3.2. Pick $T \sim \mu_{3/4}$. By Claim 3.3, $|T| = 3p$ with probability $\Theta(1/\sqrt{p})$. Recall that $T$ is bad for $S_i$ if $|T \cap S_i| = \lfloor |S_i|/2 \rfloor + t \bmod p$. By Claim 3.3, for each $S_i$, $T$ is bad for $S_i$ with probability at most $1/p^5$. Hence, the probability that there exists $i \in [m]$ such that $T$ is bad for $S_i$ is at most $m/p^5 \le 1/p^4$. It follows that with probability at most $1 - \Theta(1/\sqrt{p}) + 1/p^4 < 1$, either $|T| \ne 3p$ or $T$ is bad for some $S_i$, and hence there exists a selection of $T$ such that $|T| = 3p$ and $T$ is good for all $S_i$'s.
We are now ready to prove Theorem 3.1.
Proof of Theorem 3.1. Let $S_1, \ldots, S_m$ be a collection of sets as stated in the theorem. Since $d_Y(S_j) = d_Y([n] \setminus S_j)$, we can assume without loss of generality, by possibly replacing a set with its complement, that $|S_j| \le 2p$ for all $j \in [m]$. We may further assume $m \le p$, as otherwise the statement directly follows. For $j \in [m]$, define the following polynomials over $\mathbb{F}_p$:
$$B_j(x_1, \ldots, x_{4p}) = \prod_{t=-\tau}^{\tau+1} \bigl( \langle x, 1_{S_j} \rangle - \lfloor |S_j|/2 \rfloor - t \bigr),$$
and let $f = \prod_{j \in [m]} B_j$, as a polynomial over $\mathbb{F}_p$. By assumption, for every $Y \in \binom{[4p]}{2p}$, $f(1_Y) = 0$. This follows because $\langle 1_Y, 1_{S_j} \rangle = |Y \cap S_j|$, and by assumption, for some $j$ it holds that $d_Y(S_j) \le \tau$, so it must be that $|Y \cap S_j| - \lfloor |S_j|/2 \rfloor \in \{-\tau, \ldots, 0, \ldots, \tau + 1\}$, so that $B_j(1_Y) = 0$. Furthermore, Lemma 3.2 guarantees the existence of a set $T \in \binom{[4p]}{3p}$ such that $f(1_T) \ne 0$, as the set $T$ from Lemma 3.2 satisfies the property that $\langle 1_T, 1_{S_j} \rangle - \lfloor |S_j|/2 \rfloor - t \ne 0 \bmod p$ for all $-\tau \le t \le \tau + 1$ and for all $j \in [m]$.
By Lemma 2.1, $\deg(f) \ge p$, and by construction, $\deg(f) \le 3\tau \cdot m$. Hence $m \ge p/(3\tau)$, which implies the desired lower bound on $m$.

Syntactically Multilinear Arithmetic Circuits
In this section, for the sake of completeness, we review the arguments of Raz, Shpilka and Yehudayoff [RSY08], and show how Theorem 3.1 implies a lower bound of $\Omega(n^2/\log^2 n)$. We mostly refer to [RSY08] for the proofs.
Specifically, we will show the following.
The first step in the proof of Theorem 4.1 is to show that if $f$ is computed by a syntactically multilinear circuit of size $s$, then there exists a syntactically multilinear circuit of size $O(s)$ that computes all the first-order partial derivatives of $f$, with the additional important property that for each $i$, the variable $x_i$ does not appear in the subcircuit rooted at the output gate which computes $\partial f/\partial x_i$.

For every
In particular, if $v$ is a gate in $\Psi'$, then it is connected by a directed path to at most $n - |X_v|$ output gates.
The proof of Theorem 4.2 appears in [RSY08], and mostly follows the classical proof of Baur and Strassen [BS83] of the analogous result for general circuits, with additional care in order to guarantee the last two properties.
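Next we define two types of gates in a syntactically multilinear arithmetic circuit.
Definition 4.3. Let $\Phi$ be a syntactically multilinear arithmetic circuit. Define $L(\Phi, k)$, the set of lower-leveled gates in $\Phi$, by
$$L(\Phi, k) = \{u \in \Phi : k < |X_u| < n - k \text{ and } u \text{ has a parent } v \text{ with } |X_v| \ge n - k\}.$$
Define $U(\Phi, k)$, the set of upper-leveled gates in $\Phi$, by
$$U(\Phi, k) = \{v \in \Phi : |X_v| \ge n - k \text{ and } v \text{ has a child } u \text{ with } k < |X_u| < n - k\}.$$
The following lemma shows that if the set of lower-leveled gates is small, then there exists a partition $X = Y \sqcup Z$ under which the polynomial computed by the circuit is not of full rank.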
We first sketch how Theorem 4.1 follows from Lemma 4.4. The proof is identical to the proof given in [RSY08] with slightly different parameters.
Proof of Theorem 4.1 assuming Lemma 4.4. Let $\Psi'$ be the arithmetic circuit computing all $n$ first-order partial derivatives of $f$, given by Theorem 4.2. Set $\tau = 3\log p$ and let $L = L(\Psi', 100\tau)$ and $U = U(\Psi', 100\tau)$ as in Definition 4.3.
Denote $f_i = \partial f/\partial x_i$ and let $v_i$ be the gate in $\Psi'$ computing $f_i$, and $\Psi'_i$ be the subcircuit of $\Psi'$ rooted at $v_i$. Let $L_i = L(\Psi'_i, 100\tau)$. It is not hard to show (see [RSY08]) that $L_i \subseteq L$, and by Lemma 4.4 and item 4 in Proposition 2.2, it follows that $|L_i| \ge p/(4\tau)$. For a gate $v$ in $\Psi'$, define $C_v = \{i \in [n] : v \text{ is a gate in } \Psi'_i\}$ to be the set of indices $i$ such that there exists a directed path from $v$ to the output gate computing $f_i$. For $i \in [n]$, let $U_i = \{u \in U : u \text{ is a gate in } \Psi'_i\}$, so that $\sum_{u \in U} |C_u| = \sum_{i \in [n]} |U_i|$. Since the fan-in of each gate is at most two, $|L_i| \le 2|U_i|$, and since every $u \in U$ satisfies $|X_u| \ge n - 100\tau$, it follows by Theorem 4.2 that $|C_u| \le 100\tau$. Thus, we get
$$|U| \ge \frac{1}{100\tau} \sum_{u \in U} |C_u| = \frac{1}{100\tau} \sum_{i \in [n]} |U_i| \ge \frac{1}{200\tau} \sum_{i \in [n]} |L_i| \ge \frac{np}{800\tau^2}.$$
By item 2 in Theorem 4.2, and since $p = n/4$ and $\tau = 3\log p$,
$$|\Psi| = \Omega(|\Psi'|) = \Omega(|U|) = \Omega(n^2/\log^2 n).$$
It remains to prove Lemma 4.4. As the proof mostly appears in [RSY08], we only sketch the main steps.
Proof sketch of Lemma 4.4. Suppose $|L| \le p/(4\tau)$. By applying Theorem 3.1 to the family of sets $\{X_v : v \in L\}$, it follows that there exists a balanced partition $Y \sqcup Z$ of $X$ such that $X_v$ is $\tau$-unbalanced for every gate $v \in L$.
The proof now proceeds in the exact same manner as the proof of Lemma 5.2 in [RSY08]. In Proposition 5.5 of [RSY08], it is shown that one can write $f = \sum_{i \in [\ell]} g_i h_i + g$, where $L = \{v_1, \ldots, v_\ell\}$, $h_i$ is the polynomial computed at $v_i$, and the set of variables appearing in $g_i$ is disjoint from $X_{v_i}$.
In Claim 5.7 of [RSY08], it is shown that for every $i \in [\ell]$, $\mathrm{rank}_{Y,Z}(g_i h_i) \le 2^{n/2 - \tau}$. This uses the fact that $X_{v_i}$ is $\tau$-unbalanced, the upper bound in item 1 in Proposition 2.2, and item 3 in the same proposition.
In Proposition 5.8 of [RSY08], it is shown (with the necessary change of parameters) that the degree of g is at most 200τ .

An explicit full-rank polynomial
In this section, for the sake of completeness, we give a construction of a polynomial which is full-rank under any partition of the variables.
Finally, define f =
Corollary 4.7. For $n = 4p$ where $p$ is prime, every syntactically multilinear circuit computing $f$ has size at least $\Omega(n^2/\log^2 n)$.
The polynomial $f$ in Construction 4.5 is in the class VNP of explicit polynomials, but it is not known whether there exists a polynomial size multilinear circuit for $f$. Raz and Yehudayoff [RY08] constructed a full-rank polynomial $g \in \mathbb{F}[X, W']$ that has a syntactically multilinear circuit of size $O(n^3)$. Their construction also uses a set of auxiliary variables $W'$ of size $O(n^3)$. Thus, if one measures the complexity as a function of $|X \cup W'|$, the quadratic lower bound of Theorem 4.1 is meaningless, because a lower bound of $\Omega(n^3)$ holds trivially. However, we believe that since the rank is taken over $\mathbb{F}(W')$, it is only fair to consider computations over $\mathbb{F}(W')$, where any rational expression in the variables of $W'$ is merely a field constant. Thus, in this setting, an input gate can be labeled by an arbitrarily complex rational function in the variables of $W'$, and the complexity is measured as a function of $|X|$ alone. In this model the lower bound of Theorem 4.1 is meaningful, and furthermore, this example shows that the partial derivative matrix technique cannot prove an $\omega(n^3)$ lower bound.