On Computing Multilinear Polynomials Using Multi-r-ic Depth Four Circuits

In this article, we are interested in understanding the complexity of computing multilinear polynomials using depth four circuits in which the polynomial computed at every node has individual degree at most $r \ge 1$ with respect to each of its variables (referred to as multi-r-ic circuits). The goal of this study is to make progress towards proving superpolynomial lower bounds for general depth four circuits computing multilinear polynomials, by proving better bounds as the value of r increases. Recently, Kayal, Saha and Tavenas (Theory of Computing, 2018) showed that any depth four arithmetic circuit of bounded individual degree r computing an explicit multilinear polynomial on $n^{O(1)}$ variables and degree d must have size at least $(n/r^{1.1})^{\Omega(\sqrt{d/r})}$. This bound, however, deteriorates as the value of r increases. It is natural to ask whether we can prove a bound that does not deteriorate as the value of r increases, or a bound that holds for a larger regime of r. In this article, we prove a lower bound that does not deteriorate with increasing values of r and that holds for a wider range of r, albeit for a specific instance of d = d(n). Formally, for all large enough integers n and a small constant η, we show that there exists an explicit polynomial on $n^{O(1)}$ variables and degree $\Theta(\log^2 n)$ such that any depth four circuit of bounded individual degree $r \le n^{\eta}$ must have size at least $\exp(\Omega(\log^2 n))$. This improvement is obtained by suitably adapting the complexity measure of Kayal et al. (Theory of Computing, 2018). This adaptation of the measure is inspired by the complexity measure used by Kayal et al. (SIAM J. Computing, 2017).

exponential lower bound against the same circuit model but for a polynomial in VNP and in the high-degree regime. In this setting of high degree and for the polynomial in VNP, [Hegde and Saha 2017] prove their lower bound using just the method of Shifted Partial Derivatives.
Motivation for This Work. Raz and Yehudayoff [2009] showed a lower bound of $\exp(\Omega(\sqrt{d \log d}))$ against multilinear depth four circuits that compute a multilinear polynomial over n variables and degree $d \ll n$ (cf. [Kayal et al. 2018, Footnote 9]). Kayal et al. [2018] have shown a lower bound of $(n/r^{1.1})^{\Omega(\sqrt{d/r})}$ for a multilinear polynomial over $n^{O(1)}$ variables and degree d that is computed by a multi-r-ic depth four circuit. This lower bound deteriorates as the value of r increases. Further, it is superpolynomial only when r is o(d). This raises the natural question of whether the dependence on r can be improved.
In this work, we show that for a certain regime of d, we can prove a lower bound that does not deteriorate as the value of r increases.
Theorem 1.1 (Main Theorem). Let n be a large enough integer. There exist a constant $\eta \in (0, 1)$ and an explicit $n^{O(1)}$-variate, degree $\Theta(\log^2 n)$ multilinear polynomial $Q_n$ such that for all $r \le n^{\eta}$, any syntactically multi-r-ic depth four circuit computing $Q_n$ must have size $\exp(\Omega(\log^2 n))$.

Lower Bounds for Iterated Matrix Multiplication and Determinant Polynomials.
The Iterated Matrix Multiplication polynomial $\mathrm{IMM}_{\tilde{n},d}(X)$ is the (1, 1) entry in the product of d many disjoint $\tilde{n} \times \tilde{n}$ generic matrices. The explicit polynomial $Q_n$ that we consider can be expressed as a p-projection of the Iterated Matrix Multiplication polynomial $\mathrm{IMM}_{\tilde{n},d}$ (where $\tilde{n} = n^{O(1)}$ and $d = \Theta(\log^2 n)$). Since this projection maintains multilinearity, the existence of a syntactically multi-r-ic depth four circuit of size s(n) that computes $\mathrm{IMM}_{\tilde{n},d}$ implies the existence of a syntactically multi-r-ic depth four circuit of size at most s(n) that computes $Q_n$. Thus, Theorem 1.1 implies a lower bound of $n^{\Omega(\log n)}$ for the Iterated Matrix Multiplication polynomial as well.
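As a small illustration of the definition above, the following sketch enumerates the monomials of the Iterated Matrix Multiplication polynomial for toy parameters. The function name `imm_monomials` and the encoding of a variable as a triple (layer, row, column) are assumptions made for this example only; they are not notation from the article.

```python
from itertools import product

def imm_monomials(n, d):
    """Enumerate the monomials of the (1, 1) entry of a product of d disjoint
    n x n generic matrices.

    A variable is encoded as (layer, row, col): the (row, col) entry of the
    layer-th matrix. Each monomial follows a walk 1 -> i_1 -> ... -> i_{d-1} -> 1.
    """
    monomials = []
    for inner in product(range(1, n + 1), repeat=d - 1):
        walk = (1,) + inner + (1,)
        monomials.append(tuple((layer + 1, walk[layer], walk[layer + 1])
                               for layer in range(d)))
    return monomials

mons = imm_monomials(3, 4)
```

Every monomial picks exactly one variable from each of the d matrices, so the polynomial is multilinear and homogeneous of degree d, with $n^{d-1}$ monomials.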
Corollary 1.2 (Informal). Let n and d be integers such that $d = \Theta(\log^2 n)$. There exists a constant $\eta \in (0, 1)$ such that for all $r \le n^{\eta}$, any syntactically multi-r-ic depth four circuit computing the Iterated Matrix Multiplication polynomial ($\mathrm{IMM}_{n,d}$) must have size at least $\exp(\Omega(\log^2 n))$.
Since the Iterated Matrix Multiplication polynomial can be expressed as a p-projection of the determinant polynomial [Saptharishi 2019, Theorem 3.6], we get a similar lower bound for the determinant polynomial as well. That is, $\mathrm{IMM}_{n,d}(X)$ can be expressed as the determinant of an $nd \times nd$ matrix M whose entries are either variables from X or constants. Further, all the variable appearances in M are distinct. If the determinant of a generic $nd \times nd$ matrix Y can be computed by a syntactically multi-r-ic depth four circuit of size s(n, d), then by substituting M for Y, we get a syntactically multi-r-ic depth four circuit of size at most s(n, d) that computes $\mathrm{Det}_{nd}(M)$. Putting it together with the fact that $\mathrm{IMM}_{n,d}(X) = \mathrm{Det}_{nd}(M)$, we get a syntactically multi-r-ic depth four circuit of size at most s(n, d) that computes $\mathrm{IMM}_{n,d}$. However, Corollary 1.2 tells us that for all $r \le n^{\eta}$, any syntactically multi-r-ic depth four circuit computing $\mathrm{IMM}_{n,d}$ for $d = \Theta(\log^2 n)$ must have size $n^{\Omega(\log n)}$. Thus, s(n, d) for $d = \Theta(\log^2 n)$ must be at least $n^{\Omega(\log n)}$. Given any large integer N, we can now infer a lower bound for computing the determinant of a generic $N \times N$ matrix against syntactically multi-r-ic depth four circuits, by picking n such that $N = \Theta(n \log^2 n)$ and invoking Corollary 1.2 with this value of n.
Corollary 1.3 (Informal). Let N be a large integer. There exists a constant $\gamma \in (0, 1)$ such that for all $r \le N^{\gamma}$, any syntactically multi-r-ic depth four circuit computing the determinant polynomial over an $N \times N$ matrix must have size at least $\exp(\Omega(\log^2 N))$.

Comparison to Hegde and Saha [2017] and Kayal et al. [2018]
Kayal et al. [2018] showed a lower bound of $(n/r^{1.1})^{\Omega(\sqrt{d/r})}$. Note that for $d = \Theta(\log^2 n)$ this bound is superpolynomial only when $r = o(\log^2 n)$. In comparison, for all $r \le n^{\eta}$ (for some constant $\eta \in (0, 1)$) we show a lower bound of $n^{\Omega(\log n)}$, which is quantitatively better in this regime of parameters. In particular, we show a lower bound in the regime of parameters where $r \gg d$. That is, for the fixed values of n and $d = \Theta(\log^2 n)$, we show lower bounds for larger circuit classes whose individual degree is much larger than the degree of the polynomial computed. Kayal et al. [2018] also show a lower bound of $2^{\Omega(\sqrt{N})}$ (against syntactically multi-r-ic depth four circuits), which does not deteriorate with increasing values of r, albeit for a multi-r-ic polynomial defined over N variables and of degree $\Theta(rN)$. Hegde and Saha [2017] proved an exponential lower bound of $2^{\Omega(\sqrt{N \log N / r})}$ against syntactically multi-r-ic depth four circuits for an N-variate, degree $\Theta(N)$ polynomial (in VNP) when $r \le o(N)$. Our result is mostly incomparable to this, as we show lower bounds in the low-degree regime. We summarize this discussion in the form of a table.
[Table; only the entry for this work is recoverable: lower bound $n^{\Omega(\log n)}$ for $d \approx 2.6 \log^2 n$ and $r \le n^{0.04}$.]
If we can show superpolynomial-size lower bounds against multi-r-ic depth four circuits for $r = n^{c}$ for any constant c, then we can indeed have superpolynomial circuit size lower bounds against depth four circuits. We believe that, by building on the work of Kayal et al. [2018] and Hegde and Saha [2017], Theorem 1.1 is a step in that direction.
Proof Overview. A depth four circuit computes polynomials that can be expressed as sums of products of polynomials (cf. Theorem 2.1). In particular, the bottom layer of a depth four circuit (as per Theorem 2.1) consists of product gates, each of which computes a monomial.
Let μ be a parameter that we shall fix later. Let $\{x_1, \ldots, x_N\}$ be the set of variables that a depth four circuit (of size $s < N^{c\mu}$ for a constant c and a large integer μ) depends on. Let ρ be a random restriction such that for all $i \in [N]$, ρ sets the variable $x_i$ to zero with probability $(1 - N^{-2c})$ and leaves it untouched otherwise. Thus, the probability that a fixed product gate at the bottom, of variable support at least μ, survives this restriction is at most $N^{-2c\mu}$. By taking a union bound, we get that the probability that some product gate at the bottom, of variable support at least μ, survives is at most $s \cdot N^{-2c\mu} \le N^{-c\mu}$. Thus, with a probability of at least $(1 - N^{-c\mu})$, the random restriction ρ reduces a depth four circuit of size at most $N^{c\mu}$ to a depth four circuit of size at most $N^{c\mu}$ in which all the bottom product gates depend on at most μ − 1 variables. On the other hand, we show that our explicit polynomial $Q_n$ (which is also defined over the variable set $\{x_1, \ldots, x_N\}$) reduces to a polynomial $f_{n,\alpha,k}$ with high probability under the same random restriction. A further union bound tells us that, simultaneously and with high probability, a syntactically multi-r-ic depth four circuit of size at most $N^{c\mu}$ reduces to a syntactically multi-r-ic depth four circuit of size at most $N^{c\mu}$ whose bottom product gates have small variable support of at most μ − 1, and $Q_n$ reduces to $f_{n,\alpha,k}$ (cf. Section 4).
Conditioned on this, we prove lower bounds for $f_{n,\alpha,k}$ against syntactically multi-r-ic depth four circuits of size at most $N^{c\mu}$ whose bottom product gates have low variable support (of at most μ − 1).
Let $T_1, T_2, \ldots, T_s$ be the terms corresponding to the product gates feeding into the output sum gate. The output polynomial is the sum of the terms $T_1, T_2, \ldots, T_s$. Note that each of these $T_i$s is a product of polynomials $Q_{i,j}$ such that every monomial in these $Q_{i,j}$s depends on a small set of variables (say μ − 1 many). One major observation at this point is that there can be at most $N \cdot r$ many factors in any of the $T_i$s. Kayal et al. [2018] observed that the measure of shifted partial derivatives [Fournier et al. 2015; Kayal et al. 2014b] does not yield any non-trivial lower bound if the number of factors is much larger than the number of variables itself. They worked around this obstacle by defining a hybrid complexity measure (referred to as Shifted Skew Partial Derivatives) where they first split all the variables into two disjoint sets Y and Z such that $|Y| \gg |Z|$. They then considered some low order partial derivatives with respect to monomials in $\mathbb{F}[Y]$ and subsequently set all the variables from Y to zero in the partial derivatives obtained. This effectively reduces the number of factors in any summand in a partial derivative of T to at most $|Z| \cdot r$. They then shift these polynomials by monomials in variables from Z and look at the dimension of the $\mathbb{F}$-linear span of the polynomials thus obtained.
This measure gave them a size lower bound of $(n/r^{1.1})^{\Omega(\sqrt{d/r})}$ against multi-r-ic depth four circuits computing an explicit polynomial on $n^{O(1)}$ variables and degree d = o(n) when r = o(d). To improve the dependence on r in the lower bound, we consider a variant of Shifted Skew Partial Derivatives that we call Projected Shifted Skew Partial Derivatives (cf. Section 2.1). Here, we project down the space of Shifted Skew Partials and only look at the multilinear terms. Since the polynomial of interest is multilinear, it makes sense to only look at the multilinear terms obtained after the shifts of the skew partial derivatives. This is analogous to the method employed by Kayal et al. [2017] to prove exponential size lower bounds for homogeneous depth four circuits, through the measure of Projected Shifted Partial Derivatives.
We first show that the dimension of Projected Shifted Skew Partial derivatives is not too large for small multi-r-ic depth four circuits of low bottom support (cf. Section 3.1). We then show that there exists an explicit polynomial $f_{n,\alpha,k}$ whose dimension of Projected Shifted Skew Partial derivatives is large and which thus cannot be computed by small multi-r-ic depth four circuits of low bottom support (cf. Section 3.2). In particular, by suitably fixing the parameters (cf. Section 4), we show that any syntactically multi-r-ic depth four circuit of low bottom support of at most μ − 1 that computes $f_{n,\alpha,k}$ must have size at least $N^{c_0\mu}$, where $c_0$ is a small constant and $c_0 < c$. Putting it all together, we get that there exists a random restriction ρ that simultaneously reduces a syntactically multi-r-ic depth four circuit of size $s \le N^{c\mu}$ to a syntactically multi-r-ic depth four circuit of size at most $s \le N^{c\mu}$ and bottom support at most μ − 1, and the explicit polynomial $Q_n$ to $f_{n,\alpha,k}$. From the aforementioned discussion, we get that s must be at least $N^{c_0\mu}$. Thus, any syntactically multi-r-ic depth four circuit computing $Q_n$ must be of size at least $N^{c_0\mu} \ge n^{\Omega(\log n)}$.

PRELIMINARIES
Notation:
• We use $\partial^{=k}_Y f$ to refer to the space of partial derivatives of order k of f with respect to monomials of degree k in Y.
• We use $\mathbf{z}^{=\ell}$ and $\mathbf{z}^{\le\ell}$ to refer to the set of all the monomials of degree equal to ℓ and at most ℓ, respectively, in variables from Z.
• We use $\mathbf{z}^{\le\ell}_{\mathrm{ML}}$ to refer to the set of all the multilinear monomials of degree at most ℓ in variables from Z.
• We use $\mathbf{z}^{\le\ell}_{\mathrm{NonML}}$ to refer to the set of all the non-multilinear monomials of degree at most ℓ in variables from Z.
• For sets A and B of polynomials, we define the product $A \cdot B$ to be the set $\{f \cdot g \mid f \in A,\ g \in B\}$.
• For a polynomial f, vars(f) is the set of variables that the polynomial f depends on.
• For a gate u in a circuit, we use $f_u$ to denote the polynomial computed at gate u.
• For a polynomial f in $\mathbb{F}[Y \sqcup Z]$, we define the Z-support of f to be the set $\mathrm{vars}(f) \cap Z$ and the Z-support size of f to be equal to $|\mathrm{vars}(f) \cap Z|$.
Definition 2.1 (Depth four Circuits). A depth four circuit (denoted by ΣΠΣΠ) over a field $\mathbb{F}$ and variables $\{x_1, x_2, \ldots, x_n\}$ computes polynomials that can be expressed in the form of sums of products of polynomials. That is, $\sum_{i=1}^{s} \prod_{j} Q_{i,j}(x_1, \ldots, x_n)$

is a depth four circuit and all the monomials in every polynomial
Definition 2.2 (multi-r-ic Circuits). Let $\mathbf{r} = (r_1, r_2, \ldots, r_n)$. An arithmetic circuit C is said to be a syntactically multi-$\mathbf{r}$-ic circuit if for every product gate $u = u_1 \times u_2 \times \cdots \times u_t$ in C, each variable $x_i$ ($i \in [n]$) appears in at most $r_i$ many of the $u_j$s ($j \in [t]$), and the total formal degree with respect to every variable $x_i$ ($i \in [n]$) over the polynomials computed at $u_1, u_2, \ldots, u_t$ is bounded by $r_i$. If $\mathbf{r} = (r, r, \ldots, r)$, then we simply refer to them as multi-r-ic circuits.

Complexity Measure
We shall now describe our complexity measure, which we shall henceforth refer to as the Dimension of Projected Shifted Skew Partial Derivatives. This is a natural extension of the Dimension of Shifted Skew Partial Derivatives as used by Kayal et al. [2018]. This formulation is analogous to the work of Kayal et al. [2014a], where they study a shifted-partials-inspired measure called Shifted Projected Partial derivatives, and then Kayal et al. [2017], where they study Projected Shifted Partial derivatives.
Since the polynomial of interest is multilinear, it does make sense for us to only look at those shifts of the partial derivatives that maintain multilinearity. At the same time, since the individual degree of the intermediate computations in the multi-r -ic depth four circuit could be large and non-multilinear terms cancel out to generate the multilinear polynomial, we can focus on the multilinear terms generated after the shifts by projecting our linear space of polynomials down to them. We describe this process formally below.
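Let the variable set X be partitioned into two fixed, disjoint sets Y and Z such that |Y| is much larger than |Z|. Let $\sigma_Y : \mathbb{F}[Y \sqcup Z] \to \mathbb{F}[Z]$ be a map such that for any polynomial f(Y, Z), $\sigma_Y(f) \in \mathbb{F}[Z]$ is obtained by setting every variable from Y to zero and leaving the variables from Z untouched. Let $\mathrm{mult} : \mathbb{F}[Z] \to \mathbb{F}[Z]$ be a map such that for any polynomial g(Z), mult(g) is obtained by setting the coefficients of all the non-multilinear monomials in g to 0 and leaving the rest untouched.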
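The two maps described above (setting the Y variables to zero, then projecting to the multilinear part) can be sketched on a toy sparse representation. The encoding of a polynomial as a dictionary from monomials to coefficients, with a monomial a tuple of (variable, exponent) pairs, and the function names `sigma_Y` and `mult` are assumptions made for this example.

```python
def sigma_Y(poly, Y):
    """Set every variable in Y to zero: drop each monomial containing a Y-variable."""
    return {mono: c for mono, c in poly.items()
            if not any(var in Y for var, _ in mono)}

def mult(poly):
    """Multilinear projection: drop each monomial in which some exponent exceeds 1."""
    return {mono: c for mono, c in poly.items()
            if all(exp == 1 for _, exp in mono)}

# f = y1*z1 + z1^2*z2 + 2*z1*z2 + 3
f = {
    (("y1", 1), ("z1", 1)): 1,
    (("z1", 2), ("z2", 1)): 1,
    (("z1", 1), ("z2", 1)): 2,
    (): 3,
}
g = sigma_Y(f, Y={"y1"})   # z1^2*z2 + 2*z1*z2 + 3
h = mult(g)                # 2*z1*z2 + 3
```

Both maps are linear, which is why they can be applied coefficient by coefficient in this representation.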
Recall that we use $\partial^{=k}_Y f$ to denote the set of all partial derivatives of f of order k with respect to degree k monomials over variables just from Y, and $\mathbf{z}^{\le\ell} \cdot \sigma_Y(\partial^{=k}_Y f)$ to refer to the set of polynomials obtained by multiplying each polynomial in $\sigma_Y(\partial^{=k}_Y f)$ with monomials of degree at most ℓ in Z variables. We will now define our complexity measure, the Dimension of Projected Shifted Skew Partial Derivatives, with respect to parameters k and ℓ (denoted by $\Gamma_{k,\ell}$) as follows:
$$\Gamma_{k,\ell}(f) = \dim\left(\mathbb{F}\text{-span}\left\{\mathrm{mult}\left(\mathbf{z}^{\le\ell} \cdot \sigma_Y(\partial^{=k}_Y f)\right)\right\}\right).$$
This is a natural generalization of the Shifted Skew Partial Derivatives measure defined by Kayal et al. [2018]. The following proposition is easy to verify.

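Proposition 2.3 (Sub-additivity). Let k and ℓ be integers. Let the polynomials f, $f_1$, and $f_2$ be such that $f = f_1 + f_2$. Then $\Gamma_{k,\ell}(f) \le \Gamma_{k,\ell}(f_1) + \Gamma_{k,\ell}(f_2)$.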
Monomial Distance: We recall the following definition of distance between monomials from Chillara and Mukhopadhyay [2019].
Definition 2.4 (Definition 2.7, Chillara and Mukhopadhyay [2019]). Let $M_1, M_2$ be two monomials over a set of variables. Let $S_1$ and $S_2$ be the multisets of variables corresponding to the monomials $M_1$ and $M_2$, respectively. The distance $\mathrm{dist}(M_1, M_2)$ between the monomials $M_1$ and $M_2$ is $\min\{|S_1| - |S_1 \cap S_2|,\ |S_2| - |S_1 \cap S_2|\}$, where the cardinalities are the order of the multisets.
It is important to note that two distinct monomials could have distance 0 between them if one of them is a multiple of the other and hence the triangle inequality does not hold.
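The definition and the remark on the triangle inequality can be checked on a small instance. In the sketch below, a monomial is encoded as a mapping from variable names to exponents; this encoding and the function name `dist` are choices made for the example.

```python
from collections import Counter

def dist(m1, m2):
    """Monomial distance: min(|S1| - |S1 ∩ S2|, |S2| - |S1 ∩ S2|) over multisets."""
    s1, s2 = Counter(m1), Counter(m2)
    common = sum((s1 & s2).values())  # size of the multiset intersection
    return min(sum(s1.values()) - common, sum(s2.values()) - common)

a = {"x1": 1, "x2": 1}                      # x1*x2
b = {"x1": 1, "x2": 1, "x3": 1, "x4": 1}    # x1*x2*x3*x4, a multiple of both a and c
c = {"x3": 1, "x4": 1}                      # x3*x4
```

Here dist(a, b) = 0 and dist(b, c) = 0 because b is a multiple of both a and c, while dist(a, c) = 2, so dist(a, c) exceeds dist(a, b) + dist(b, c).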
For two vectors a, b, we use HammingDist(a, b) to refer to the Hamming distance between these vectors a and b.
The following beautiful lemma (from Gupta et al. [2014]) is key to the asymptotic estimates required for the lower bound analyses.

We need the following strengthening of the Principle of Inclusion and Exclusion, due to Kumar and Saraf [2017].

MULTI-r -IC DEPTH FOUR CIRCUITS OF LOW BOTTOM SUPPORT
Let C be a multi-r-ic depth four circuit of size s and bottom support at most μ. For some parameters k and ℓ that we shall fix later, we shall show that $\Gamma_{k,\ell}(C)$ is not too large if the multi-r-ic depth four circuit C is of small size and of low bottom support.

Upper Bound on $\Gamma_{k,\ell}(C)$
Recall that C can be expressed as a sum of at most s many products of polynomials $T_1 + \cdots + T_s$, where each $T_i$ is a syntactically multi-r-ic product of polynomials of low monomial support.
We shall first prove a bound on $\Gamma_{k,\ell}(T_i)$ for an arbitrary $i \in [s]$ and derive a bound on $\Gamma_{k,\ell}(C)$ by using sub-additivity of the measure (cf. Proposition 2.3).
Let T be a syntactic multi-r-ic product of polynomials such that all the monomials in every polynomial factor in T depend on at most μ many variables. We shall first pre-process the product T using the following procedure.
Preprocessing: Repeat this process until all but at most one of the factors in T (except R) have a Z-support size of at least $\mu/2$.
(1) Pick two factors $P_{i_1}$ and $P_{i_2}$ from T such that $R \ne P_{i_1}, P_{i_2}$ and they have the smallest Z-support size amongst all the factors but R in T.
(2) If both of them have Z-support size strictly less than $\mu/2$, merge these factors to obtain a new factor P. Else, stop.
(3) Update the term T by replacing the factors $P_{i_1}$ and $P_{i_2}$ with P. Repeat.
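The merging steps (1)-(3) can be sketched by tracking each factor only through its Z-support. This is a simplification made for illustration: the function name `preprocess` is hypothetical, the special factor R is assumed to be excluded from the input, and the Z-support of a merged factor is taken to be the union of the two supports.

```python
def preprocess(supports, mu):
    """Merge factors, tracked through their Z-supports, following steps (1)-(3).

    supports: list of sets, the Z-support of each factor other than R.
    Returns the Z-supports of the factors after preprocessing.
    """
    factors = [frozenset(s) for s in supports]
    while len(factors) >= 2:
        factors.sort(key=len)
        p1, p2 = factors[0], factors[1]             # two smallest Z-supports
        if 2 * len(p1) < mu and 2 * len(p2) < mu:   # both strictly below mu/2
            factors = [p1 | p2] + factors[2:]       # merged factor: union of supports
        else:
            break
    return factors

out = preprocess([{1}, {2}, {3}, {4, 5, 6}, {7, 8}], mu=4)
```

On this toy input, the factors with supports {1} and {2} are merged once; afterwards only the factor with support {3} has Z-support size below μ/2, so the procedure stops.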
In the procedure described above, it is important to note that post-merging, the monomials in the product polynomial will depend on at most μ many variables from Z, as the factors being merged had Z-support size strictly less than $\mu/2$ each. Henceforth, without loss of generality, we shall assume that every product gate at the top of a multi-r-ic depth four circuit of low bottom support is in the processed form. Let T be the product obtained after the preprocessing, with factors $Q_i$. All but at most one of the $Q_i$s have a Z-support size of at least $\mu/2$. The total Z-support size is at most $|Z| \cdot r = mr$ since T is a syntactically multi-r-ic product. Thus, the number of factors in T with non-zero Z-support size is at most $\frac{2mr}{\mu} + 1$.
Lemma 3.1. Let n, k, r, ℓ, and μ be positive integers such that $\ell + k\mu < \frac{m}{2}$. Let T be a processed syntactic multi-r-ic product of polynomials $Q_1 \cdots Q_t$ such that all the monomials in each $Q_i$ ($i \in [t]$) depend on at most μ many variables from Z. Then, $\Gamma_{k,\ell}(T)$ is at most $\binom{t}{k} \cdot \binom{m}{\ell + k\mu} \cdot (\ell + k\mu)$.
Before presenting the proof of Lemma 3.1, we shall first use it to show an upper bound on the dimension of the space of Projected Shifted Skew Partial derivatives of a depth four multi-r-ic circuit of low bottom support.
Lemma 3.2. Let n, k, r, ℓ, and μ be positive integers such that $\ell + k\mu < \frac{m}{2}$. Let C be a processed syntactic multi-r-ic depth four circuit of bottom support μ and size s. Then, $\Gamma_{k,\ell}(C)$ is at most $s \cdot \binom{\frac{2mr}{\mu} + 1}{k} \cdot \binom{m}{\ell + k\mu} \cdot (\ell + k\mu)$.
Proof. From the above discussion, we get that C can be expressed as $\sum_{i=1}^{s} T_i$ such that each $T_i$ is a processed syntactically multi-r-ic product of polynomials, all of whose monomials depend on at most μ many variables from Z. From Proposition 2.3, we get that $\Gamma_{k,\ell}(C) \le \sum_{i=1}^{s} \Gamma_{k,\ell}(T_i)$. From the aforementioned discussion we know that the number of factors in each of the $T_i$s with non-zero Z-support size is at most $\frac{2mr}{\mu} + 1$. From Lemma 3.1, we get that for all $i \in [s]$, $\Gamma_{k,\ell}(T_i)$ is at most $\binom{\frac{2mr}{\mu} + 1}{k} \cdot \binom{m}{\ell + k\mu} \cdot (\ell + k\mu)$. By putting all of this together, we get that
$$\Gamma_{k,\ell}(C) \le \sum_{i=1}^{s} \Gamma_{k,\ell}(T_i) \le s \cdot \binom{\frac{2mr}{\mu} + 1}{k} \cdot \binom{m}{\ell + k\mu} \cdot (\ell + k\mu).$$
We now present the proof of Lemma 3.1 to complete the picture.
Proof of Lemma 3.1. We will first show by induction on k the following for the set of kth order partial derivatives of T with respect to degree k monomials over variables from Y: The base case of induction for k = 0 is trivial as T is already in the required form. Let us assume the induction hypothesis for all derivatives of order < k. That is, $\partial^{=k-1}_Y T$ can be expressed as a linear combination of terms of the form . That is, $h_1(Z)$ can be expressed as a linear combination of multilinear monomials of degree at most $(k-1)\mu$, and non-multilinear monomials of degree at most $(k-1)r\mu$ over $\mathbb{F}[Z]$.
For some $u \in [|Y|]$ and some fixed $i_0$ in S, where the first summand on the right-hand side of the above equation lies in the subspace  and $Q_{i_0}$ are polynomials such that every monomial in these depends on at most μ many variables from Z. These monomials can be split into two sets, those that are multilinear and those that are strictly non-multilinear, over the variables from Z. Thus, In the above expression, the contribution from the variables from Y to the monomials in Recall the fact that $h_1(Z)$ is a linear combination of multilinear monomials of degree at most $(k-1)\mu$, and non-multilinear monomials of degree at most $(k-1)r\mu$. Thus, we get that From the discussion above we know that any polynomial in $\partial^{=k}_Y(T)$ can be expressed as a linear combination of polynomials of the form $\frac{\partial h}{\partial y_u}$. Further, every polynomial of the form $\frac{\partial h}{\partial y_u}$ belongs to the set Thus, we get that $\partial^{=k}_Y T$ is a subset of W. This completes the inductive argument. From the aforementioned discussion, we can now derive the following expressions: It is easy to see that this inclusion holds under shift by monomials of degree at most ℓ over variables from Z: By taking a multilinear projection of the elements on both sides, we get that

Polynomial Family That Is Hard for Multi-r -ic Depth Four Circuits of Low Bottom Support
Let n, α, k be positive integers and $N_0$ be equal to $k(n^2 + 2\alpha n)$. Let Y and Z be two disjoint sets of variables defined as follows. For all $i \in [k]$, let Let the variable set $X = \{x_1, \ldots, x_{N_0}\}$ be equal to $Y \sqcup Z$ under some suitable renaming. We define the polynomial family $f_{n,\alpha,k}(X) = f_{n,\alpha,k}(Y, Z)$ as follows (exactly as it was defined in Kayal et al. [2018]): It is easy to see that |Y| is $n^2 k$ and |Z| is $2\alpha n k$. We shall henceforth use m to refer to |Z|. Thus, $N_0 = |X| = |Y| + |Z| = k(n^2 + 2\alpha n)$. The degree of the polynomial $f_{n,\alpha,k}$ (denoted by d) is equal to $(2\alpha k + k)$.
Proof. There are $n^{2k}$ elements in $[n]^{2k}$. Note that the volume of a Hamming ball of radius $\Delta_0 < k$ over vectors of length 2k is at most $\Delta_0 \binom{2k}{\Delta_0} n^{\Delta_0}$, as it consists of the vectors in $[n]^{2k}$ that are at most $\Delta_0$-far from its center. Thus, there exists a packing of these Hamming balls in $[n]^{2k}$ with at least $\frac{n^{2k - \Delta_0}}{\Delta_0 \binom{2k}{\Delta_0}}$ many balls. The centers of these balls are at least $2\Delta_0$ far away and thus at least $\Delta_0$ far away from each other. Set $P_{\Delta_0}$ to be the collection of centers of these Hamming balls.
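For intuition, a set of vectors with pairwise Hamming distance at least $\Delta_0$ can be built on toy parameters by a greedy scan. This is a swapped-in brute-force construction for illustration, not the packing argument of the proof; the function names are hypothetical and $[n]$ is represented as 0, ..., n − 1.

```python
from itertools import product
from math import comb

def hamming(u, v):
    """Hamming distance between two equal-length vectors."""
    return sum(a != b for a, b in zip(u, v))

def greedy_separated_set(n, length, delta):
    """Greedily pick vectors of [n]^length with pairwise Hamming distance >= delta."""
    centers = []
    for v in product(range(n), repeat=length):
        if all(hamming(v, c) >= delta for c in centers):
            centers.append(v)
    return centers

def ball_volume(n, length, radius):
    """Number of vectors within Hamming distance <= radius of a fixed center."""
    return sum(comb(length, i) * (n - 1) ** i for i in range(radius + 1))

n, k, delta = 3, 2, 2
P = greedy_separated_set(n, 2 * k, delta)
```

Because the greedy set is maximal, the balls of radius delta − 1 around its elements cover all of $[n]^{2k}$, so its size is at least $n^{2k}$ divided by the volume of such a ball.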
Remark: The bound in Lemma 3.3 can be improved to obtain a set P of size $\frac{2 n^{2k - 0.5\Delta_0}}{\Delta_0 \binom{2k}{0.5\Delta_0}}$ by considering balls of radius $0.5\Delta_0$.
It is important to note that for any choice of $(a, b) \in [n]^{2k}$, we get that $\partial^{k}_{(a,b)} f_{n,\alpha,k}$ is a multilinear monomial of degree $d - k = 2\alpha k$, over just the variables from Z.
Lemma 3.4. Let $(a, b), (a', b') \in [n]^{2k}$ be such that HammingDist((a, b), (a', b')
Proof. For a vector $(a, b) \in [n]^{2k}$,
For any $\Delta_0 < k$, let $P_{\Delta_0} \subset [n]^{2k}$ be the set of vectors obtained from Lemma 3.3. Let ∂ =k By combining this with Lemma 3.4, we get that the pairwise distance between any two monomials in the set $\partial^{=k}_{P_{\Delta_0}} f_{n,\alpha,k}$ is at least $\alpha\Delta_0$. This can formally be summarized as follows.
Lemma 3.5. Let $\Delta_0$, n, α, k be integers. Let $P_{\Delta_0}$ be a subset of $[n]^{2k}$ obtained from Lemma 3.3 such that for any $(a, b) \ne (a', b') \in P_{\Delta_0}$, $\mathrm{HammingDist}((a, b), (a', b')) \ge \Delta_0$. Then ∂ =k
We shall now show that the cardinality of the set $\mathrm{mult}(\mathbf{z}^{=\ell} \cdot \sigma_Y(\partial^{=k}_{P_{\Delta_0}} f_{n,\alpha,k}))$ is large enough for a suitable setting of parameters α, $\Delta_0$, and k.
Lemma 3.6. Let ε and δ be some constants in (0, 1). Let n be an asymptotically large integer. Let m, k, d, $\Delta_0$, α, ℓ, and μ be such that , we get that where $P_{\Delta_0}$ is a set obtained from Lemma 3.3.
Proof. Let $M_1, M_2, \ldots, M_t$ be the monomials in the set $\partial^{=k}_{P_{\Delta_0}} f_{n,\alpha,k}$. Let $\mathcal{M}$ be the set of all multilinear monomials of the form $M_i \cdot M$ over variables from Z, where $i \in [t]$ and M is a multilinear monomial of degree ℓ, disjoint from $M_i$. It is important to note that the set $\mathcal{M}$ now corresponds to the set $\mathrm{mult}(\mathbf{z}^{=\ell} \cdot \sigma_Y(\partial^{=k}_{P_{\Delta_0}} f_{n,\alpha,k}))$. We shall now show that $\lambda = \frac{T_2}{T_1} \ge 1$ for all $\alpha \le \frac{0.99(2-\delta)\log n}{\delta \log(\frac{2}{1-\varepsilon})}$. Once we prove that $\lambda \ge 1$, we can then invoke Lemma 2.6 and show that $|\cup_{i \in [t]} B_i| \ge \frac{T_1}{4\lambda}$.
By simplifying the expression for λ, we get the following: The math block above crucially uses the fact that $\Delta^2 = o(m) = o(\ell)$ and $(d-k)^2 = o(m)$ while invoking Lemma 2.5. The error term from invoking Lemma 2.5 has been absorbed by the constant 2 to give rise to the O(1) factor. For some suitably fixed constants δ and ε, let $\Delta_0$ be set to δk and ℓ be set to $\frac{m}{2}(1-\varepsilon)$. For the sake of contradiction, let us assume that $\frac{T_2}{T_1} < 1$. Then, where $c_0^{-1}$ is a constant hidden under the O(1) in the first line of the math block. Hence, This contradicts our assumption on α for all asymptotically large n. Thus, we get that $\lambda \ge 1$ for all $\alpha \le \frac{0.99(2-\delta)\log n}{\delta \log(\frac{2}{1-\varepsilon})}$, and we can invoke Lemma 2.6 to get the following:
Lemma 3.7. Let δ and ε be any constants in (0, 1). Let n be an asymptotically large integer. Let m, k, d, α, ℓ, and μ be such that and ε, δ ∈ (0, 1), we get ) is a set of multilinear monomials over just the variables from Z and thus, Putting this together with Lemma 3.6, we get that $\Gamma_{k,\ell}(f_{n,\alpha,k})$ ≥ Ω(1) · m−(d −k) · m α δ k .

Putting It All Together
We shall now prove a size lower bound against depth four multi-r-ic circuits of low bottom support that compute $f_{n,\alpha,k}$ by instantiating α to a suitable value that is smaller than $\frac{0.99(2-\delta)\log n}{\delta \log(\frac{2}{1-\varepsilon})}$ for some fixed constants δ and ε.
Theorem 3.9. Let δ , ε, and ν be some constants as obtained from Lemma 3.8. Let n be an asymptotically large integer. Let r , α, and μ be such that , and .
Let C be a depth four multi-r-ic circuit of bottom support at most μ and size s. If C computes the polynomial $f_{n,\alpha,k}$, then s must be at least $n^{0.09\nu k}$.
Proof. Let δ, ε, and ν be the constants obtained from Lemma 3.8. For a fixed value of $\alpha = \frac{0.98(2-\delta)\log n}{\delta \log(\frac{2}{1-\varepsilon})}$, the polynomial $f_{n,\alpha,k}$ is defined on the variable sets Y and Z such that $|Z| = m = 2\alpha n k$. Let ℓ, k, μ be such that $\ell = \frac{m}{2}(1-\varepsilon)$, $k^2\mu^2 = o(m)$, and $\ell + k\mu < \frac{m}{2}$. Let $\Delta_0 = \delta k$. Let us assume that the polynomial $f_{n,\alpha,k}$ is computed by a depth four multi-r-ic circuit C of bottom support at most μ and size s. Then it must be the case that $\Gamma_{k,\ell}(f_{n,\alpha,k}) = \Gamma_{k,\ell}(C)$.
Invoking Lemma 3.7 with $\alpha = \frac{0.98(2-\delta)\log n}{\delta \log(\frac{2}{1-\varepsilon})}$, and the values of ε, δ, and ν obtained from Lemma 3.8, we get that Invoking Lemma 3.2 with $\ell + k\mu < \frac{m}{2}$, we get that Putting these two together with the fact that $\Gamma_{k,\ell}(f_{n,\alpha,k}) = \Gamma_{k,\ell}(C)$, we get the following: In line 2 of the above math block, we use the inequality $\binom{n}{k} \le (\frac{en}{k})^k$. In line 4, we use Lemma 2.5 to simplify the terms along with the fact that $k^2\mu^2 = o(m - \ell)$, $(d-k)^2 = o(m)$ and $k^2\mu^2 = o(\ell)$. In line 6, we substitute $2\alpha n k$ for m and simplify the terms. Let us set α to $\frac{0.98(2-\delta)\log n}{\delta \log(\frac{2}{1-\varepsilon})}$. Since δ, ε, and ν are strictly positive constants in (0, 1) given by Lemma 3.8, they satisfy the following inequality: Thus, From the fixing of the parameters μ and α, we get that $\frac{\mu}{\alpha}$ is a constant. Since $\mu \log(\frac{1+\varepsilon}{1-\varepsilon}) + \log r \le 0.9\nu \log n$, we get that $s \ge n^{0.09\nu k}$ for all asymptotically large enough n.

MULTI-r -IC DEPTH FOUR CIRCUITS
We shall now define another polynomial family $P_{n,\alpha,k}$ based on the definition of $f_{n,\alpha,k}$ and then prove a lower bound for the polynomial family $P_{n,\alpha,k}$ against multi-r-ic depth four circuits by lifting the lower bound for $f_{n,\alpha,k}$ against multi-r-ic depth four circuits of low bottom support.
It is easy to see that if $h(x_1, \ldots, x_n)$ has a circuit of size s = s(n), then $g(y_1, \ldots, y_m)$ also has a circuit of size $s = s(n) = s(m^{O(1)})$.
Let us now recall the following lemmas from Saptharishi [2019]. Proofs of these lemmas are a step-by-step adaptation, rather than a replication of proofs of Lemma 20.5 and Lemma 20.4, respectively, in Saptharishi [2019].
We shall first show that the polynomial P n,α,k reduces to the polynomial f n,α,k upon taking random restrictions and p-projections, with a high probability.
Lemma 4.2 (Analogous to Lemma 20.5 (see footnote 5), [Saptharishi 2019]). Let c be a constant as fixed above. Let ρ be a random restriction on the variable set $\tilde{X}$ that sets each variable to zero independently, with a probability of $(1 - N_0^{-c})$. Then $f_{n,\alpha,k}(X)$ is a p-projection of $\rho(P_{n,\alpha,k}(\tilde{X}))$ with a probability of at least $(1 - e^{-N_0})$.
Proof. For each $i \in [N_0]$, the probability that all the variables $\tilde{x}_{i,j}$ ($j \in [t]$) are set to zero by ρ is as follows:
$$(1 - N_0^{-c})^{t} \le e^{-t N_0^{-c}} = e^{-(N_0 + \ln N_0)} = \frac{1}{N_0 e^{N_0}},$$
where we use $t = N_0^{1+c} + N_0^{c} \ln N_0$. By a union bound, the probability that there exists an $i \in [N_0]$ such that all the variables of the form $\tilde{x}_{i,j}$ for $j \in [t]$ are set to zero is at most $\frac{1}{e^{N_0}}$. Thus, with a probability of at least $(1 - e^{-N_0})$, for each i, there exists at least one j such that $\rho(\tilde{x}_{i,j}) \ne 0$. It is easy to see that the polynomial $f_{n,\alpha,k}$ can be written as a p-projection of $\rho(P_{n,\alpha,k})$ in such a case. For each $i \in [N_0]$, the substitution maps one of the non-zero $\rho(\tilde{x}_{i,j})$s to $x_i$ and sets the rest to 0.
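The union bound in this proof can be checked numerically for a few toy values. The sketch below works with logarithms to avoid underflow and assumes, for illustration only, that $t = N_0^{1+c} + N_0^{c}\ln N_0$ (the value used in the proof of Theorem 1.1) may be treated as a real number; the function names are hypothetical.

```python
import math

def log_prob_all_copies_zeroed(N0, c):
    """log of (1 - N0^-c)^t for t = N0^(1+c) + N0^c * ln(N0), t treated as real."""
    q = N0 ** (-c)                                  # probability a variable survives
    t = N0 ** (1 + c) + N0 ** c * math.log(N0)
    return t * math.log1p(-q)

def log_union_bound(N0, c):
    """log of N0 * (1 - N0^-c)^t, the union bound over all i in [N0]."""
    return math.log(N0) + log_prob_all_copies_zeroed(N0, c)

checks = [(N0, c, log_union_bound(N0, c))
          for N0 in (5, 50, 500) for c in (0.1, 0.5)]
```

In each case the logarithm of the union bound is at most $-N_0$, matching the claimed failure probability of at most $e^{-N_0}$.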
We shall now show that, under random restrictions, any syntactically multi-r-ic depth four circuit reduces to a syntactically multi-r-ic depth four circuit of low bottom support with a high probability and without any blow-up in size.
Lemma 4.3 (Analogous to Lemma 20.4, [Saptharishi 2019]). Let γ > 0 be a parameter. Let N and μ be integers. Let P be an N-variate polynomial that is computed by a syntactically multi-r-ic depth four circuit C of size $s \le N^{\gamma\mu}$. Let ρ be a random restriction that sets each variable to zero independently with probability $(1 - N^{-2\gamma})$. Then with a probability of at least $(1 - N^{-\gamma\mu})$, the polynomial ρ(P) is computed by a multi-r-ic depth four circuit C′ of bottom support at most μ and size s.
Proof. Let C be a multi-r-ic depth four circuit of size s computing P. Let $\{M_1, M_2, \ldots, M_t\}$ be the set of monomials computed at the lower product gates of C that have at least μ + 1 distinct variables in their support. Note that t is at most s. For all $i \in [t]$, since each of the at least μ + 1 distinct variables of $M_i$ is left untouched independently with probability $N^{-2\gamma}$,
$$\Pr_{\rho}[\rho(M_i) \ne 0] \le N^{-2\gamma(\mu+1)} < N^{-2\gamma\mu}.$$
By taking a union bound, the probability that there exists a monomial amongst $\{M_1, M_2, \ldots, M_t\}$ that is not set to 0 by ρ is strictly less than $t \cdot N^{-2\gamma\mu} \le s \cdot N^{-2\gamma\mu} \le N^{-\gamma\mu}$. Thus, with a probability of at least $(1 - N^{-\gamma\mu})$, all the monomials at the bottom product gates depend on at most μ distinct variables.
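The union bound in this proof can be traced with exact arithmetic on a toy instance. The values N = 4 and γ = 1 below are chosen only so that every quantity is a small rational number (in the article γ is a small constant), and the function name `bad_event_bound` is hypothetical.

```python
from fractions import Fraction

def bad_event_bound(keep_prob, mu, num_monomials):
    """Union bound on the probability that some monomial with at least mu + 1
    distinct variables is not set to zero, when each variable is kept
    independently with probability keep_prob."""
    return num_monomials * keep_prob ** (mu + 1)

# Toy instance: N = 4 and gamma = 1, so keep_prob = N^(-2*gamma) = 1/16.
N, gamma, mu = 4, 1, 3
s = N ** (gamma * mu)                        # largest allowed size, N^(gamma*mu) = 64
bound = bad_event_bound(Fraction(1, N ** (2 * gamma)), mu, s)
target = Fraction(1, N ** (gamma * mu))      # N^(-gamma*mu) = 1/64
```

Here the bound evaluates to 1/1024, which is below the target failure probability $N^{-\gamma\mu} = 1/64$.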
With this background, we are now ready to present the proof of Theorem 1.1.
Proof of Theorem 1.1. Let ε, δ, and ν be the constants obtained from Lemma 3.8 and c be a small constant in (0, 1) as fixed above. Let n be a large positive integer. Let the parameters N, $N_0$, r, μ, α, and k be set in terms of n or otherwise as follows:
• $r \le n^{0.5\nu}$,
• $\mu = \frac{0.4\nu \log n}{\log(\frac{1+\varepsilon}{1-\varepsilon})}$,
• $\alpha = \frac{0.98(2-\delta)\log n}{\delta \log(\frac{2}{1-\varepsilon})}$,
• $N_0 = k(n^2 + 2\alpha n)$,
• $N = N_0^{2+c} + N_0^{1+c} \ln N_0$,
• γ be a parameter given by the equation $N^{2\gamma} = N_0^{c}$, and
• $k = \frac{10\gamma\mu \log N}{\nu \log n}$.
The above setting of parameters also satisfies the conditions that $k^2\mu^2 = o(m)$ and $(d-k)^2 = O(\alpha^2 k^2) = o(m)$. Let $\tilde{X} = \{\tilde{x}_{1,1}, \tilde{x}_{1,2}, \ldots, \tilde{x}_{1,t}, \ldots, \tilde{x}_{N_0,1}, \tilde{x}_{N_0,2}, \ldots, \tilde{x}_{N_0,t}\}$ be a set of variables over which the polynomial $P_{n,\alpha,k}$ is defined, where $t = N_0^{1+c} + N_0^{c} \ln N_0$. Let ρ be a random restriction such that a variable is set to zero with a probability of $(1 - N_0^{-c}) = (1 - N^{-2\gamma})$, and is left untouched otherwise. Let C be a syntactically multi-r-ic depth four circuit of size $s \le N^{\gamma\mu}$ that computes $P_{n,\alpha,k}$. Lemma 4.3 tells us that C′ = ρ(C) is a multi-r-ic depth four circuit of size s and bottom support at most μ with a probability of at least $(1 - N^{-\gamma\mu})$. Conditioned on this event, $\rho(P_{n,\alpha,k})$ has a multi-r-ic $\Sigma\Pi\Sigma\Pi^{\{\mu\}}$ circuit of size at most s.
5 The form of this lemma as mentioned in Saptharishi [2019] is due to Kumar and Saptharishi.
By invoking Lemma 4.2, we get that $f_{n,\alpha,k}$ is a p-projection of $\rho(P_{n,\alpha,k})$ with a probability of at least $(1 - e^{-N_0})$. Since $\rho(P_{n,\alpha,k})$ has a multi-r-ic $\Sigma\Pi\Sigma\Pi^{\{\mu\}}$ circuit of size at most s with a probability of at least $1 - N^{-\gamma\mu}$, with a probability of at least $(1 - N^{-\gamma\mu} - e^{-N_0})$, $f_{n,\alpha,k}$ is computed by a multi-r-ic $\Sigma\Pi\Sigma\Pi^{\{\mu\}}$ circuit of size at most s. In other words, there exists a multi-r-ic depth four circuit of bottom support at most μ and size at most s that computes $f_{n,\alpha,k}$.
On the other hand, by invoking Theorem 3.9 with the set of parameters as defined above, we get that any multi-r-ic $\Sigma\Pi\Sigma\Pi^{\{\mu\}}$ circuit that computes $f_{n,\alpha,k}$ must be of size at least $\exp(0.09\nu k \log n)$. Upon putting both of these facts together, it must be the case that $n^{0.09\nu k} = N^{0.9\gamma\mu} \le s \le N^{\gamma\mu}$.
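The equality between the two exponential expressions in the last step follows from the choice of k alone. Spelled out as a short worked equation (using only the parameter setting from the start of this proof):

```latex
\begin{align*}
k = \frac{10\gamma\mu\log N}{\nu\log n}
\;\Longrightarrow\;
0.09\,\nu k\log n = 0.09 \cdot 10\,\gamma\mu\log N = 0.9\,\gamma\mu\log N
\;\Longrightarrow\;
n^{0.09\nu k} = N^{0.9\gamma\mu}.
\end{align*}
```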
Since ε, δ, and ν are constants, and $N = n^{O(1)}$, we get that s must be at least $\exp(\Omega(\log^2 n))$. The explicit polynomial $Q_n$ is $P_{n,\alpha,k}$, where α and k are set to the values described above.