Factors of low individual degree polynomials

Kaltofen (Randomness in computation, vol 5, pp 375–412, 1989) proved the remarkable fact that multivariate polynomial factorization can be done efficiently, in randomized polynomial time. Still, more than twenty years after Kaltofen's work, many questions remain unanswered regarding the complexity aspects of polynomial factorization, such as the question of whether factors of polynomials efficiently computed by arithmetic formulas also have small arithmetic formulas, asked in Kopparty et al. (2014), and the question of bounding the depth of the circuits computing the factors of a polynomial. We are able to answer these questions in the affirmative for the interesting class of polynomials of bounded individual degrees, which contains polynomials such as the determinant and the permanent. We show that if $P(x_1, \ldots, x_n)$ is a polynomial with individual degrees bounded by $r$ that can be computed by a formula of size $s$ and depth $d$, then any factor $f(x_1, \ldots, x_n)$ of $P(x_1, \ldots, x_n)$ can be computed by a formula of size $\mathsf{poly}((rn)^r, s)$ and depth $d + 5$. This partially answers the question above posed in Kopparty et al. (2014), who asked if this result holds without the dependence on $r$. Our work generalizes the main factorization theorem from Dvir et al. (SIAM J Comput 39(4):1279–1293, 2009), who proved it for the special case when the factors are of the form $f(x_1, \ldots, x_n) \equiv x_n - g(x_1, \ldots, x_{n-1})$. Along the way, we introduce several new technical ideas that could be of independent interest when studying arithmetic circuits (or formulas).


Introduction
Let $f(x_1, \ldots, x_n) \in \mathbb{F}[x_1, \ldots, x_n]$ be a multivariate polynomial over a field $\mathbb{F}$. The individual degree of $f$ with respect to the variable $x_i$, denoted by $\deg_{x_i}(f)$, is the largest power of $x_i$ appearing in a monomial of $f$. Many interesting polynomials have bounded individual degree, such as the Permanent and Determinant polynomials. Moreover, the class of polynomials of bounded individual degree is closed under factorization, since if a polynomial $f(x_1, \ldots, x_n)$ has individual degrees bounded by $r$, so will its factors. In this work, we study the problem of formula (circuit) factorization of polynomials of low individual degree.
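As a small illustration of the definition of individual degree (a sketch added for concreteness, not a construction from the paper), the snippet below computes individual degrees for a polynomial stored as a dictionary from exponent tuples to coefficients. The sparse representation and the name `individual_degrees` are assumptions made for this example only.

```python
def individual_degrees(poly):
    """Return [deg_{x_1}(f), ..., deg_{x_n}(f)] for a sparse polynomial.

    `poly` maps exponent tuples (e_1, ..., e_n) to coefficients; this
    representation is a hypothetical choice made for the sketch.
    """
    exponents = [e for e, c in poly.items() if c != 0]
    n = len(exponents[0])
    # The individual degree in x_i is the largest exponent of x_i
    # over all monomials with a nonzero coefficient.
    return [max(e[i] for e in exponents) for i in range(n)]


# 2x2 determinant x11*x22 - x12*x21, variables ordered (x11, x12, x21, x22).
det2 = {(1, 0, 0, 1): 1, (0, 1, 1, 0): -1}

# (x1 + x2)^2 = x1^2 + 2*x1*x2 + x2^2.
square = {(2, 0): 1, (1, 1): 2, (0, 2): 1}
```

The determinant example has individual degree 1 in every variable even though its total degree is 2, which is the kind of gap between individual and total degree that the bounded individual degree setting exploits.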
One of the basic operations on polynomials is factorization. This problem can be phrased as follows: given a polynomial $P(x_1, \ldots, x_n)$, decide whether $P(x_1, \ldots, x_n)$ is irreducible or, if not, output one of its factors, which we denote by $f(x_1, \ldots, x_n)$. From the computational perspective, we will usually be given a device computing the polynomial $P$, and we will be asked to output a similar device computing $f$. In the field of arithmetic complexity, the most natural device for computing polynomials is an arithmetic circuit or a formula (see Definition 1.1 below). Therefore, we will assume that we are given $P$ as an arithmetic circuit (formula) and output one of its factors in the same representation. We now give the definition of an arithmetic circuit/formula:

Definition 1.1. An arithmetic circuit $\Gamma$ is a directed acyclic labeled graph in which the vertices are called 'gates'. The gates of $\Gamma$ with in-degree 0 are called inputs and are labeled by either a variable from $\{x_1, \ldots, x_n\}$ or by a field element from $\mathbb{F}$. Every other gate of $\Gamma$ is labeled by either '$\times$' or '$+$' and has in-degree 2. (If we talk about bounded depth circuits/formulas, then we remove the restriction on the in-degree.) There is one gate with out-degree 0, which we call the output gate. Each gate in $\Gamma$ computes a polynomial in $\mathbb{F}[x_1, \ldots, x_n]$ in the natural way. An arithmetic circuit is called a formula if its underlying graph is a tree. The size of a circuit (formula) $\Gamma$, written $|\Gamma|$, is the number of edges in the circuit (formula), and the depth of $\Gamma$, written $\mathrm{depth}(\Gamma)$, is the length of the longest directed path in the graph of $\Gamma$.
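To make Definition 1.1 concrete, here is a minimal sketch of a formula as a tree, with size counted in edges and depth as the longest root-to-leaf path. The class names `Leaf` and `Gate` and the helper functions are hypothetical names chosen for this illustration; unbounded fan-in is allowed, as in the bounded depth setting.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    label: Union[str, int]  # a variable name such as "x1", or a field constant


@dataclass(frozen=True)
class Gate:
    op: str                          # "+" or "*"
    children: Tuple["Node", ...]     # sub-formulas feeding this gate


Node = Union[Leaf, Gate]


def size(node):
    """Number of edges in the formula tree."""
    if isinstance(node, Leaf):
        return 0
    return sum(1 + size(child) for child in node.children)


def depth(node):
    """Length of the longest path from this gate down to an input."""
    if isinstance(node, Leaf):
        return 0
    return 1 + max(depth(child) for child in node.children)


def evaluate(node, assignment):
    """Evaluate the polynomial computed at `node` on a variable assignment."""
    if isinstance(node, Leaf):
        return assignment[node.label] if isinstance(node.label, str) else node.label
    values = [evaluate(child, assignment) for child in node.children]
    if node.op == "+":
        return sum(values)
    result = 1
    for v in values:
        result *= v
    return result


# The formula (x1 + x2) * x3.
example = Gate("*", (Gate("+", (Leaf("x1"), Leaf("x2"))), Leaf("x3")))
```

For `example`, the tree has four edges and depth 2. A circuit would differ only in allowing a gate to be shared by several parents, which a tree-shaped data structure like this one cannot express.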
Polynomial factorization is one of the cornerstone problems in modern computer algebra, and as such has been the focus of intensive research. The past three decades have seen major advances in the development of efficient algorithms for polynomial factorization, pioneered by the works of Lenstra et al. and Kaltofen [11,7,8,9]. In addition to the general problem, polynomial factorization has also been studied in many other important (and more restricted) representations. For instance, in the sparse representation, where the input polynomial is given as a list of its coefficients and monomials, the works of Lenstra, Kaltofen and von zur Gathen [12,4] give efficient algorithms for sparse factorization in the univariate and in the multivariate cases. For a more complete account of polynomial factorization, we refer the reader to the survey [9] and to the book [3].
In the seminal work of Kaltofen [8], it is proved that if $P(x_1, \ldots, x_n)$ of total degree $D$ can be computed by an arithmetic circuit of size $s$, then any of its factors have arithmetic circuits of size $\mathrm{poly}(n, s, D)$. Moreover, Kaltofen gives a randomized algorithm that with high probability outputs such a factor in polynomial time. This result, besides settling an important complexity theoretic question, has since then had a great impact in many areas of computer science, such as coding theory [16,5], derandomization [6] and cryptography [1]. However, many interesting questions on the complexity of arithmetic circuits or formulas under factorization remain unanswered. In particular, we study the following two questions, where the first one was asked in the work of Kopparty et al. [10], while the second question was stated as an open problem in the survey [15, Open Problem 19]:
1. If $P(x_1, \ldots, x_n)$ of total degree $D$ is computed by an arithmetic formula of size $s$, is it true that any of its factors will also have formulas of size $\mathrm{poly}(n, s, D)$?
2. If $P(x_1, \ldots, x_n)$ can be computed by a circuit of size $s$ and depth $d$, can its factors be computed by a circuit of size $\mathrm{poly}(s)$ and depth $O(d)$?
In this work, we answer both of these questions in the affirmative, in the case where the input polynomial $P$ has bounded individual degrees. In particular, we show:

Theorem 1.2. Let $P(x_1, \ldots, x_n) \in \mathbb{F}[x_1, \ldots, x_n]$ be a polynomial with individual degrees bounded by $r$, and let $f(x_1, \ldots, x_n) \in \mathbb{F}[x_1, \ldots, x_n]$ be a factor of $P$, where $\mathbb{F}$ is a field of characteristic zero. If there exists a formula (circuit) of size $s$ and depth $d$ computing $P$, then there exists a formula (circuit) of depth $d + 5$ and size $\mathrm{poly}((nr)^r, s)$ that computes $f(x_1, \ldots, x_n)$. Moreover, if we require the in-degree of each gate to be 2, then the size remains the same and the depth becomes $d + O(r \log(nr))$.

Notice that our theorem has no restriction on the individual degrees of the polynomials computed by the intermediate gates of the circuit (that is, we have no syntactic restrictions). We only care about the individual degrees of the output polynomial, which we regard as bounded by a constant, denoted by $r$, in the theorem above.
Theorem 1.2 provides a direct answer to the second question posed above in the case where $P$ has bounded individual degrees (that is, $r$ is a constant). The connection between Theorem 1.2 and the first question comes from the fact that one can always balance formulas to have logarithmic depth. More precisely, suppose that we are given a formula $\Phi$ (with in-degree bounded by 2) of size $s = \mathrm{poly}(n)$ computing $P$. By Theorem 2.7 in [15], we can assume that $\Phi$ is of size $\mathrm{poly}(s)$ and $\mathrm{depth}(\Phi) = O(\log s)$. Hence, Theorem 1.2 implies that there exists a formula $\Psi$, with in-degree bounded by 2, of depth $\mathrm{depth}(\Psi) = \mathrm{depth}(\Phi) + O(r \log(sn)) = O(\log s)$ and size $\mathrm{poly}((nr)^r, s) = \mathrm{poly}(s)$ computing any factor $f(x_1, \ldots, x_n)$ of $P$. This provides an affirmative answer to the first question.
Before giving an overview of the proof of Theorem 1.2, we give some background on related work on factorization in general and in bounded depth circuits.
The problem of factoring in bounded depth was studied previously by Dvir et al. [2], who showed that if $P(x_1, \ldots, x_n)$ has a depth $d$ circuit of size $s$ and $\deg_{x_n}(P) \le r$, then its factors of the form $x_n - \varphi(x_1, \ldots, x_{n-1})$ have depth $d + 3$ circuits of size $\mathrm{poly}(n^r, s)$. This result was used to extend the hardness-randomness tradeoffs of [6] to the bounded depth model. Our main theorem generalizes their result to any factor of $P$, provided that $P$ has bounded individual degrees.
Shpilka and Volkovich [14] initiated the study of factorization of multilinear polynomials, which are the most basic case of polynomials of bounded individual degrees. They relate the problem of deterministically factoring multilinear polynomials to the problem of performing deterministic Polynomial Identity Testing (PIT). In their paper, they prove that these two problems are roughly equivalent in the multilinear setting for most restricted multilinear circuit classes that have been studied. Since the problem of performing deterministic PIT seems to be hard, even for the class of multilinear formulas, this sheds some light on the difficulty of obtaining deterministic factorization even for this model. This equivalence between deterministic PIT and deterministic polynomial factorization was later generalized by Kopparty et al. [10] to polynomials (of polynomial degree) computed by general circuits. Since we prove here that, for polynomials of bounded individual degrees computed by circuits of small depth, their factors can also be computed by circuits of small depth, one could hope for similar connections between PIT for restricted classes of circuits (say, of bounded depth and low individual degrees) and factorization of polynomials in such classes.

Proof Overview
In this section, we give an overview of the proof of the main theorem. For simplicity of exposition, we will only refer to arithmetic circuits in this overview, but our results hold true for formulas as well, as the statements in the later sections show. We begin with a definition:

Definition 2.1 (Approximate Root). Let $P(x_1, \ldots, x_n, y)$ be a polynomial in $\mathbb{F}[x_1, \ldots, x_n, y]$. We say that $q(x_1, \ldots, x_n)$ is a root of $P$ up to degree $t$ if all the homogeneous parts up to degree $t$ of the polynomial $P(x_1, \ldots, x_n, q(x_1, \ldots, x_n))$ are zero. That is, $P(x_1, \ldots, x_n, q(x_1, \ldots, x_n))$ only has monomials of degree larger than $t$.
Given a polynomial $P(x_1, \ldots, x_n, y) \in \mathbb{F}[x_1, \ldots, x_n, y]$ with individual degree in $y$ bounded by $r$, Dvir et al. [2] show that if $P(0, \ldots, 0, y)$ has no double roots, that is, $P(0, \ldots, 0, y)$ can be factored as
$$P(0, \ldots, 0, y) = c \cdot \prod_{i} (y - \mu_i), \qquad c \neq 0,$$
where $\mu_i \neq \mu_j$ for $i \neq j$, then for each $\mu_i$, there exists an approximate root $q_{i,t}(x_1, \ldots, x_n)$ of $P$ up to degree $t$ such that $q_{i,t}(0, \ldots, 0) = \mu_i$. Moreover, they show that if $P$ is computed by a circuit $\Gamma$ of size $s$ and depth $d$, then there exists a circuit of size $\mathrm{poly}(t^r, s)$ and depth $d + 2$ computing $q_{i,t}(x_1, \ldots, x_n)$.
With this idea in mind, suppose for simplicity that
$$P(x_1, \ldots, x_n, y) \equiv \prod_{i=1}^{r} \bigl(y - g_i(x_1, \ldots, x_n)\bigr),$$
where each polynomial $g_i(x_1, \ldots, x_n)$ has a nonzero constant term $\mu_i$ and $\mu_i \neq \mu_j$ for $i \neq j$.
In this case we are in the framework of [2], since
$$P(0, \ldots, 0, y) = \prod_{i=1}^{r} (y - \mu_i)$$
and the roots $\mu_i$ are distinct. As Section 4 shows, we can guarantee distinct roots in $P(0, \ldots, 0, y)$ by using a random shift of the variables $(x_1, \ldots, x_n)$, as long as $P$ is squarefree. Therefore, for each $\mu_i$ and $t \ge 1$, we can find polynomials $q_{i,t}(x_1, \ldots, x_n)$ such that $q_{i,t}(0, \ldots, 0) = \mu_i$ and the polynomial $P(x_1, \ldots, x_n, q_{i,t}(x_1, \ldots, x_n))$ only has terms of degree larger than $t$. Since $P(x_1, \ldots, x_n, q_{i,t}) = \prod_{j=1}^{r} (q_{i,t} - g_j)$, the minimum degree terms of $P(x_1, \ldots, x_n, q_{i,t}(x_1, \ldots, x_n))$ must come from the product of the minimum degree terms of each of the polynomials $q_{i,t}(x_1, \ldots, x_n) - g_j(x_1, \ldots, x_n)$. Notice that for each $j \neq i$, the constant term of $q_{i,t}(x_1, \ldots, x_n) - g_j(x_1, \ldots, x_n)$ is equal to $\mu_i - \mu_j$, which is nonzero. Therefore, the minimum degree terms of $P(x_1, \ldots, x_n, q_{i,t}(x_1, \ldots, x_n))$ must come from the minimum degree terms of the polynomial $q_{i,t}(x_1, \ldots, x_n) - g_i(x_1, \ldots, x_n)$.
Because $P(x_1, \ldots, x_n, q_{i,t}(x_1, \ldots, x_n))$ only has terms of degree larger than $t$, the same must hold for the polynomial $q_{i,t}(x_1, \ldots, x_n) - g_i(x_1, \ldots, x_n)$. This implies that $q_{i,t}(x_1, \ldots, x_n)$ approximates the actual root $g_i(x_1, \ldots, x_n)$ of $P$ up to degree $t$. Hence, if we pick $t$ larger than the total degree of $g_i$, the lower degree terms of $q_{i,t}$ correspond to the root $g_i$, and therefore we can recover this root $g_i$ (and use it to factor $P$).
There are two main issues with this approach that we need to overcome if we are to generalize it. The first issue is that $P$ may not factor into linear factors in $y$, that is, polynomials of the form $y - g_i(x_1, \ldots, x_n)$. The second one is that $P$ need not be monic in $y$, in which case we will still need to recover its leading coefficient, which is a polynomial in $\mathbb{F}[x_1, \ldots, x_n]$.

To deal with the first issue, let us study a toy example: assume that $P$ is monic in $y$ with $\deg_y(P) = r$, that is, $P(x_1, \ldots, x_n, y) \equiv y^r + \sum_{i=0}^{r-1} P_i(x_1, \ldots, x_n) y^i$, but $P$ does not factor into linear factors in $y$. Let $f(x_1, \ldots, x_n, y)$ be one of its factors, of degree $k$ in $y$. Since $P$ is monic in $y$, we know that $f$ must also be monic in $y$. Note that if we work over the algebraic closure of $\mathbb{F}(x_1, \ldots, x_n)$ (that is, the field $\overline{\mathbb{F}(x_1, \ldots, x_n)}$), we can factor $P$ (and $f$) into linear factors in $y$. In this work, we will not describe what the algebraic closure of $\mathbb{F}(x_1, \ldots, x_n)$ is, since it is a very complex field, and it is not needed in our proof. We only mention $\overline{\mathbb{F}(x_1, \ldots, x_n)}$ here to give some intuition on how to generalize the root finding approach described above. For simplicity, think of elements of the closure as "functions" over the variables $x_1, \ldots, x_n$. Since $f$ divides $P$, if
$$P(x_1, \ldots, x_n, y) = \prod_{i=1}^{r} \bigl(y - \varphi_i(x_1, \ldots, x_n)\bigr),$$
then there will be indices (say $i$ from 1 to $k$) such that
$$f(x_1, \ldots, x_n, y) = \prod_{i=1}^{k} \bigl(y - \varphi_i(x_1, \ldots, x_n)\bigr).$$
However, it is worth noting that these linear factors will not be polynomials! Nevertheless, the fact that $f$ and $P$ share some roots in the closure of $\mathbb{F}(x_1, \ldots, x_n)$ gives us a hint on what to do next. To overcome this problem, we will (in Lemma 6.1 and Corollary 6.2) approximate these functions $\varphi_i$ by polynomials $g_{i,t}$, in a way that the polynomial
$$g_t(x_1, \ldots, x_n, y) \equiv \prod_{i=1}^{k} \bigl(y - g_{i,t}(x_1, \ldots, x_n)\bigr)$$
agrees with $f$ on the terms of order smaller than $t$. Therefore, for large enough $t$, the lower order terms of $g_t(x_1, \ldots, x_n, y)$ will correspond to the polynomial $f$, which we can then obtain by interpolation (Lemma 3.3). We can think of each polynomial $g_{i,t}$ as the Taylor expansion of $\varphi_i$ up to degree $t$.
The way we obtain these approximations to the roots (the polynomials $g_{i,t}$) is by a procedure similar in nature to Hensel lifting. Suppose that $\varphi_i(0, \ldots, 0) = \mu_i$ for $1 \le i \le k$, and moreover, suppose that $\mu_i \neq \mu_j$ for $i \neq j$. From each value $\mu_i$, we will construct a family of polynomials $g_{i,t}$ of degree $t$, such that $g_{i,t}(x_1, \ldots, x_n)$ is a root of $f$ up to degree $t$. Now, the question is: how can we construct this family of polynomials if we do not have access to $f$? The answer lies in the fact that each root $\varphi_i$ of $f$ is also a root of $P$, and therefore we can access the values of the $\varphi_i$'s through the circuit computing $P$. Hence, we will use the fact that the $\varphi_i$'s are also roots of $P$ in order to find the polynomials $g_t$ that approximate $f$ (Lemma 7.1).
To overcome the second main issue, that the polynomial $P$ may not be monic, let us write
$$P(x_1, \ldots, x_n, y) \equiv \sum_{i=0}^{r} P_i(x_1, \ldots, x_n)\, y^i \quad \text{and} \quad f(x_1, \ldots, x_n, y) \equiv \sum_{i=0}^{k} f_i(x_1, \ldots, x_n)\, y^i,$$
where $f_k(x_1, \ldots, x_n) \not\equiv 0$ and $P_r(x_1, \ldots, x_n) \not\equiv 0$. If $f$ divides $P$, then it must be the case that the leading coefficient $f_k$ of $f$ divides the leading coefficient $P_r$ of $P$. Hence, a possible solution to this second issue would be to find, by some kind of induction, a small circuit for $f_k$ based on the circuit for $P_r$ that we obtain from $P$. Then, we could generalize the factoring result for monic polynomials to the case where the factors are rational functions of the form
$$\frac{f}{f_k} \equiv y^k + \sum_{i=0}^{k-1} \frac{f_i(x_1, \ldots, x_n)}{f_k(x_1, \ldots, x_n)}\, y^i.$$
With these two results, we could multiply the circuits computing $f_k$ and $f/f_k$ to obtain our factor $f$. More precisely, if we could find, by induction on the number of variables, a small circuit $\Phi_k$ for $f_k$ based on the circuit $\Gamma_r$ for $P_r$ that we obtain from $P$ via interpolation (Lemma 3.4), and if we could find a small circuit $\Upsilon$ for the rational function $f/f_k$ based on the circuit $\Gamma$ computing $P$ (Lemma 7.1), then the circuit given by $\Upsilon \times \Phi_k$ would compute the polynomial $f$, as we wanted.
One problem with this approach is that, even if we can generalize the monic factoring result to monic rational functions as above, as far as we know, the best bound on the size of the circuit $\Gamma_r$ computing $P_r$ is given by $3r \cdot s$ (see Lemma 3.4). Therefore, if we define $T(n, s)$ as the maximum size of a factor of a polynomial in $n$ variables computed by a circuit of size $s$, the induction given by the procedure above would give us the following bound on the size:
$$T(n + 1, s) \le T(n, 3rs) + \mathrm{poly}((nr)^r, s).$$
The reason for this bound is the following: $P(x_1, \ldots, x_n, y)$ has $n + 1$ variables and is computed by $\Gamma$, which has size $s$. Hence, the maximum size of a factor $f$ is by definition $T(n + 1, s)$. Since $f_k$ divides the leading coefficient $P_r$, which is computed by $\Gamma_r$ of size $3rs$ and has $n$ variables, the bound we have on the size of $\Phi_k$ is given by $T(n, 3rs)$, because now the input polynomial is $P_r$. Assuming that the size of $f/f_k$ can be bounded by $((nr)^r \cdot s)^{\alpha}$, for some constant $\alpha$ (which we can by Lemma 7.1), we obtain the additive term $\mathrm{poly}((nr)^r, s)$.
Since the circuit for $f$ is given by $\Upsilon \times \Phi_k$, we need to add the bounds on the sizes for $\Phi_k$ and $\Upsilon$. However, when we solve this recursion, the size parameter grows to $(3r)^n \cdot s$, so the bound we obtain is exponential in $n$, the number of variables! Therefore, this approach, as it is, cannot work.
The main problem with the recursion above is that the bound on the circuit size of the leading coefficient, if we only use Lemma 3.4, keeps getting worse as we reduce the number of variables: it will become $(3r)^k \cdot s$ if we get rid of $k$ variables. To get around this issue, we define the reversal of a polynomial with respect to a specific variable and we study its properties with regards to divisibility. If $P(x_1, \ldots, x_n, y) \equiv \sum_{i=0}^{r} P_i(x_1, \ldots, x_n)\, y^i$, we define its reversal with respect to $y$ as the polynomial
$$\tilde{P}(x_1, \ldots, x_n, y) \equiv \sum_{i=0}^{r} P_i(x_1, \ldots, x_n)\, y^{r-i}.$$
That is, $\tilde{P}$ is obtained from the polynomial $P$ by "reversing" the coefficients $P_i(x_1, \ldots, x_n)$. It is easy to see that $f$ divides $P$ iff $\tilde{f}$ divides $\tilde{P}$. By performing a reversal, notice that we have transformed the leading coefficient of our problem from $P_r(x_1, \ldots, x_n)$ to $P_0(x_1, \ldots, x_n)$. This has the advantage that now the leading coefficient of our input polynomial can be computed by the circuit $\Gamma|_{y=0}$ (that is, the circuit obtained from $\Gamma$ by setting $y = 0$), which has size $\le s$. This now allows us to recurse on $f_0$ and $P_0$ (the new leading coefficients after the reversal), where $f_0$ divides $P_0$, without paying the multiplicative cost on the size of the circuit. Hence, with this idea we avoid paying the exponential blowup on the circuit size! On the flip side, notice that the size of the circuit computing the polynomial $\tilde{P}$ is bounded by $8r^2 \cdot s$, according to Lemma 3.7. But this blowup does not hurt us, since the cost of the reversal is not cumulative.
More precisely, we now have the following recursion: we want to bound the size of a factor of $P$, computed by a circuit $\Gamma$ of size $s$ and on $n + 1$ variables. This bound is by definition $T(n + 1, s)$. Let $\tilde{\Gamma}$ be a circuit computing $\tilde{P}$. Suppose we can find a circuit computing $\tilde{f}/f_0$ of size bounded by $((nr)^r \cdot |\tilde{\Gamma}|)^{\alpha} \le ((nr)^r \cdot 8r^2 s)^{\alpha}$, for some constant $\alpha$ (which we can by Lemma 7.1). Then we are only left with the problem of finding a small circuit for $f_0$, which divides $P_0$, which in turn can be computed by a circuit of size bounded by $s$ in $n$ variables. The bound for a circuit for $f_0$ is given in this case by $T(n, s)$, by definition of the function $T$. Therefore, our recursion becomes
$$T(n + 1, s) \le T(n, s) + \mathrm{poly}((nr)^r, s),$$
as we wanted!
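Unrolling the two recursions discussed above makes the difference explicit. The following is a sketch under stated assumptions: $q(n, s)$ is notation introduced here for the additive cost of one step (the text takes $q(n, s) = ((nr)^r \cdot 8r^2 s)^{\alpha}$ in the reversal case), and we assume $q$ is nondecreasing in both arguments.

```latex
% q(n, s): additive cost of one recursion step (notation assumed for this sketch),
% taken to be nondecreasing in n and in s.
\begin{align*}
\text{(interpolating the leading coefficient)}\quad
  T(n, s) &\le T(n-1,\, 3rs) + q(n-1,\, s) \\
          &\le T\bigl(0,\, (3r)^{n} s\bigr)
             + \sum_{k=0}^{n-1} q\bigl(n-1-k,\, (3r)^{k} s\bigr), \\[4pt]
\text{(using the reversal)}\quad
  T(n, s) &\le T(n-1,\, s) + q(n-1,\, s) \\
          &\le T(0,\, s) + \sum_{k=0}^{n-1} q(k,\, s)
           \;\le\; T(0,\, s) + n \cdot q(n,\, s).
\end{align*}
```

In the first recursion the size argument is multiplied by $3r$ at every level, so it reaches $(3r)^n s$; in the second it stays at $s$, and the total is at most $n$ copies of a $\mathrm{poly}((nr)^r, s)$ term plus the base case.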
The idea of the reversal of a polynomial is similar to the definition of reversal of a univariate polynomial given in [3, §9.1]. This notion of reversal is used there to perform division with remainder for univariate polynomials by using Newton iteration.
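The univariate case of the reversal is easy to experiment with. The sketch below is an illustration with hypothetical helper names (`reversal`, `evaluate`, `remainder`, `divides`), using rational coefficients in place of polynomial coefficients in $x_1, \ldots, x_n$. It reverses a coefficient list, checks the identity $\tilde{P}(y) = y^r P(1/y)$ at a sample point, and checks on one example that divisibility is preserved.

```python
from fractions import Fraction


def reversal(coeffs):
    """Reverse a coefficient list [P_0, ..., P_r] (lowest degree first).

    As in the text, both the constant and leading coefficients must be nonzero.
    """
    if coeffs[0] == 0 or coeffs[-1] == 0:
        raise ValueError("reversal requires P_0 * P_r != 0")
    return coeffs[::-1]


def evaluate(coeffs, y):
    """Evaluate the polynomial with the given coefficients at y."""
    return sum(c * y**i for i, c in enumerate(coeffs))


def remainder(num, den):
    """Remainder of num divided by den (schoolbook long division)."""
    num = [Fraction(c) for c in num]
    while len(num) >= len(den):
        factor = num[-1] / den[-1]
        shift = len(num) - len(den)
        for i, c in enumerate(den):
            num[shift + i] -= factor * c
        num.pop()  # the leading coefficient is now zero
    return num


def divides(den, num):
    return all(c == 0 for c in remainder(num, den))


# f = (y - 2)(y - 3) = y^2 - 5y + 6 and P = f * (y + 1) = y^3 - 4y^2 + y + 6.
f = [6, -5, 1]
P = [6, 1, -4, 1]
g = [1, 1, 1]  # y^2 + y + 1, which does not divide P
```

Here `reversal(f)` is $6y^2 - 5y + 1$, whose roots $1/2$ and $1/3$ are the inverses of the roots of `f`, which is one way to see why divisibility carries over when no root is zero.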
To generalize the monic factoring result to the case when $f$ is monic in $y$ with rational coefficients, we introduce the idea of an approximation polynomial of a rational function (see Section 5), and we use this approximation polynomial in Lemma 7.1 (instead of the rational function) as the "factor" of the input polynomial. If $f$ is a rational function of the form where $g(x_1, \ldots, x_n)$ and $f_i(x_1, \ldots, x_n)$ are polynomials in $\mathbb{F}[x_1, \ldots, x_n]$ such that $g(0, \ldots, 0) \neq 0$, we define its approximation polynomial (to degree $m$) as the following polynomial where $g \equiv g(x_1, \ldots, x_n)$ and $f_i \equiv f_i(x_1, \ldots, x_n)$. This polynomial "approximates" the rational function $f(x_1, \ldots, x_n, y)$ in the sense that, for large enough $m$, the polynomial obtained by up to high order terms (see Observation 5.3), which we can get rid of by interpolation (Lemma 3.3). By adapting the approach in [2] to work with approximation polynomials, we can find all the "roots" of the approximation polynomials, and after that combine this approximation polynomial with the circuit obtained to compute the leading term.
After we take care of finding the leading coefficient $f_0(x_1, \ldots, x_n)$ (of the reversed polynomial $\tilde{f}(x_1, \ldots, x_n, y)$), and after recovering the approximation polynomial $\psi_{f,m}$ (see Lemma 7.1), we can multiply it by $f_0$ to obtain the factor $f$ (up to high order terms) which, after interpolation, becomes our desired factor (see Theorem 7.2).
We conclude this proof outline with a basic roadmap of the main ideas involved in this work:
1. Given a circuit $\Gamma$ for our polynomial $P(x_1, \ldots, x_n, y)$, we find a circuit $\tilde{\Gamma}$ computing the reversal polynomial $\tilde{P}(x_1, \ldots, x_n, y)$. (Lemma 3.7)
2. We use the circuit $\Gamma$ to find small circuits $\Phi_{i,t}$ for each approximate root of $P$ up to degree $t$. (Section 6)
3. Since $\tilde{f}$ divides $\tilde{P}$ (Lemma 3.8), any approximate root of $f$ will also be an approximate root of $P$. By combining the circuits $\Phi_{i,t}$ computing the approximate roots of $f(x_1, \ldots, x_n, y)$, find circuit $\Psi$ computing the approximation polynomial (see Section 5) of the monic rational function $f(x_1, \ldots, x_n, y)$

Organization
The rest of the paper is organized as follows: in Section 3, we set up notations, go over some useful background and discuss the concept of reversal of a polynomial. In Section 4, we introduce the concept of properly splitting variable restrictions. In Section 5, we formally introduce the concepts of standard forms and approximation polynomials. In Section 6, we adapt the approach of [2] to find small formulas for the roots of $P(x_1, \ldots, x_n, y)$. In Section 7, we prove our main technical lemma and theorem. In Section 8, we conclude and propose some open problems.
For the sake of brevity of exposition, we only give a proof of our main technical theorem. The proofs of all other facts stated in this paper can be found in the full version [13].

Preliminaries
In this section, we establish the notation that will be used throughout the paper and some technical background that will be needed in the proof of our main theorem.

Notations
From this point on, we will use boldface for vectors, and regular font for scalars. Thus, we will denote the vector $(x_1, \ldots, x_n)$ by $\mathbf{x}$. If we want to multiply the vector $\mathbf{x}$ by a scalar $z$, we will denote this product by $z\mathbf{x}$.
We will denote our base field by $\mathbb{F}$, and we assume that $\mathbb{F}$ has characteristic zero and that it is algebraically closed. The results in this paper also hold for non-closed fields of large enough characteristic, if we allow ourselves to use elements from field extensions. The assumptions just made are for clarity of exposition.
Let $\mathbb{N}_0$ be the set of natural numbers including zero, that is, $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. If $\mathbf{e} = (e_1, \ldots, e_n) \in \mathbb{N}_0^n$ is a vector of natural numbers and $\mathbf{x} = (x_1, \ldots, x_n)$ is a vector of formal variables, we define $\mathbf{x}^{\mathbf{e}} = \prod_{i=1}^{n} x_i^{e_i}$. That is, $\mathbf{x}^{\mathbf{e}}$ is the monomial given by the product of the variables, where each variable $x_i$ is raised to the power $e_i$.

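We will denote by $\mathbb{F}(\mathbf{x})[y]$ the set of polynomials in the variable $y$ whose coefficients are rational functions in the variables $\mathbf{x}$. That is, $f(\mathbf{x}, y) \in \mathbb{F}(\mathbf{x})[y]$ iff it can be expressed in the form $f(\mathbf{x}, y) = \sum_{i=0}^{k} \frac{p_i(\mathbf{x})}{q_i(\mathbf{x})}\, y^i$, where $p_i, q_i \in \mathbb{F}[\mathbf{x}]$ and $q_i \not\equiv 0$.

When working with a polynomial in $\mathbb{F}[\mathbf{x}, y]$, we might be interested in looking at the homogeneous parts of a polynomial with respect to certain variables only. This will be particularly useful when lifting the "roots" of a polynomial $f(\mathbf{x}, y)$ of the form $y - q(\mathbf{x})$ in order to obtain a circuit computing $f(\mathbf{x}, y)$. To this end, we introduce the following definition: for $P(\mathbf{x}, y) \in \mathbb{F}[\mathbf{x}, y]$, the partial homogeneous part $H^{\mathbf{x}}_m[P]$ is the homogeneous part of degree $m$ of $P$ when considered as a polynomial on the variables $\mathbf{x}$, regarding $y$ as a constant. More explicitly, $H^{\mathbf{x}}_m[P]$ is equal to the sum of all monomials of $P$ that have degree $m$ in $x_1, \ldots, x_n$, without any restrictions on the degree of $y$. We also define $H^{\mathbf{x}}_{\le m}[P] \equiv \sum_{j=0}^{m} H^{\mathbf{x}}_j[P]$. Notice that if $P(\mathbf{x}, y) \equiv \sum_{i=0}^{r} P_i(\mathbf{x})\, y^i$, then the partial homogeneous parts satisfy the following property:
$$H^{\mathbf{x}}_m[P(\mathbf{x}, y)] \equiv \sum_{i=0}^{r} H_m[P_i(\mathbf{x})]\, y^i.$$
Therefore, this definition of partial homogeneous parts agrees with the definition of homogeneous parts if $P(\mathbf{x}, y)$ does not depend on the variable $y$. When talking about partial homogeneous parts of a polynomial, it is useful to have a notion of minimum degree with respect to some variables.

Definition 3.2 (Minimum Degree). Let $f(\mathbf{x}, y) \in \mathbb{F}[\mathbf{x}, y]$ be a polynomial. We define $\mathrm{mindeg}_{\mathbf{x}}(f(\mathbf{x}, y))$ to be the minimum degree of the polynomial $f(\mathbf{x}, y)$ on the variables $\mathbf{x}$. In other words, $\mathrm{mindeg}_{\mathbf{x}}(f)$ is the smallest $m$ such that $H^{\mathbf{x}}_m[f] \not\equiv 0$.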
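The two notions above can be illustrated on a small example. This is a sketch with an assumed sparse representation (exponent tuples whose last entry is the exponent of $y$) and hypothetical function names; the example polynomial is made up for the illustration and is not taken from the paper.

```python
def x_degree(monomial):
    """Total degree in x_1, ..., x_n; the last exponent belongs to y."""
    return sum(monomial[:-1])


def partial_homogeneous_part(poly, m):
    """Monomials of `poly` with degree exactly m in the x variables."""
    return {mono: c for mono, c in poly.items() if c != 0 and x_degree(mono) == m}


def mindeg_x(poly):
    """Smallest m for which the partial homogeneous part of degree m is nonzero."""
    return min(x_degree(mono) for mono, c in poly.items() if c != 0)


# Hypothetical example: P(x1, x2, y) = x1^2*y + x1*x2*y^3 + x1^3 + 5*x1*y.
# Keys are (e1, e2, d) for the monomial x1^e1 * x2^e2 * y^d.
P = {(2, 0, 1): 1, (1, 1, 3): 1, (3, 0, 0): 1, (1, 0, 1): 5}
```

For this `P`, the part of degree 2 in the $x$ variables is $x_1^2 y + x_1 x_2 y^3$ (the power of $y$ is ignored when grouping), and the minimum degree in the $x$ variables is 1, coming from the monomial $5 x_1 y$.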

Basic Operations on Circuits and Formulas
We begin with the following standard lemma on obtaining the homogeneous components of a polynomial. The version below is from [2].

The next lemma shows us how to obtain the coefficients of a polynomial through interpolation.

Lemma 3.4 (Interpolation). Let $P(\mathbf{x}, y) \equiv \sum_{i=0}^{r} y^i P_i(\mathbf{x})$ be a polynomial computed by a formula (circuit) $\Gamma$. Then for each $i \in \{0, 1, \ldots, r\}$, there exists a formula (circuit) $\Phi_i$ such that $|\Phi_i| \le 3r \cdot |\Gamma|$ and $\Phi_i$ computes the polynomial $P_i(\mathbf{x})$.
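One standard way to see why the coefficients $P_i(\mathbf{x})$ are cheap to obtain is that each is a fixed linear combination of $r + 1$ evaluations $P(\mathbf{x}, a_0), \ldots, P(\mathbf{x}, a_r)$ at distinct points. The sketch below illustrates this with exact rational arithmetic. It treats $P$ as a black-box Python function, uses the points $0, 1, \ldots, r$, and all names are hypothetical; it is not claimed to be the construction behind the $3r \cdot |\Gamma|$ bound.

```python
from fractions import Fraction


def lagrange_weights(points):
    """weights[i][j] = coefficient of y^i in the j-th Lagrange basis polynomial."""
    r = len(points) - 1
    weights = [[Fraction(0)] * (r + 1) for _ in range(r + 1)]
    for j, aj in enumerate(points):
        basis = [Fraction(1)]  # coefficients, lowest degree first
        for m, am in enumerate(points):
            if m == j:
                continue
            denom = Fraction(aj - am)
            # Multiply the current basis polynomial by (y - am) / denom.
            new = [Fraction(0)] * (len(basis) + 1)
            for d, c in enumerate(basis):
                new[d] += -am * c / denom
                new[d + 1] += c / denom
            basis = new
        for i in range(r + 1):
            weights[i][j] = basis[i]
    return weights


def coefficient_in_y(P, i, r, x):
    """Value of P_i(x), computed from r + 1 evaluations of P(x, .)."""
    points = list(range(r + 1))
    w = lagrange_weights(points)
    return sum(w[i][j] * P(x, points[j]) for j in range(r + 1))


# Hypothetical example with r = 2: P(x, y) = (x + 1) y^2 + 3x y + x^2.
def example_P(x, y):
    return (x + 1) * y**2 + 3 * x * y + x**2
```

At $x = 5$ the three coefficients are $P_2 = 6$, $P_1 = 15$ and $P_0 = 25$, and each is recovered from the same three evaluations of `example_P`, which mirrors the fact that a circuit for $P_i$ can be built from a few copies of $\Gamma$ with $y$ fixed to constants.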
Given an irreducible polynomial $g(\mathbf{x}, y)$ and a polynomial $P(\mathbf{x}, y)$ that is divisible by $g$, it will be useful for us to find a polynomial $D(\mathbf{x}, y)$ that is divisible by $g$ and is also square-free with respect to $g$, that is, $g(\mathbf{x}, y) \nmid \frac{\partial D}{\partial y}(\mathbf{x}, y)$. The next lemma shows that we can find such a polynomial efficiently.

The following observation will be very useful to convert small depth formulas into formulas with fanin bounded by 2.

Observation 3.6. Any formula $\Phi$ of size $s$ and depth $d$, without restrictions on the fanin of any of its gates, can be computed by a formula $\Psi$ of size $2s$ and depth $d \cdot (1 + \log(s))$, where each gate has fanin 2.
To see that this observation is true, just replace each addition (multiplication) gate of fanin $t$ by a balanced formula of size $2t$ made only with addition (multiplication) gates. Since $t \le s$, and a balanced formula of size $2t$ has depth $1 + \log t$, each gate will be replaced by a formula of depth at most $1 + \log s$. The replacement by a balanced formula clearly does not change the computation, and the depth increases by a multiplicative factor of $1 + \log s$, as we wanted.
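The replacement step in this argument can be sketched in a few lines. The nested-tuple representation of gates and the function names below are assumptions made for the illustration: a single gate of fanin $t$ is replaced by a balanced binary tree over the same $t$ inputs.

```python
def balance(op, children):
    """Combine `children` using a balanced tree of fanin-2 `op` gates.

    A gate is represented as the tuple (op, left, right); anything that is
    not a tuple is treated as an input.
    """
    if len(children) == 1:
        return children[0]
    mid = len(children) // 2
    return (op, balance(op, children[:mid]), balance(op, children[mid:]))


def depth(node):
    if not isinstance(node, tuple):
        return 0
    return 1 + max(depth(node[1]), depth(node[2]))


def edges(node):
    if not isinstance(node, tuple):
        return 0
    return 2 + edges(node[1]) + edges(node[2])


# A single addition gate of fanin 5, rewritten with fanin-2 gates.
leaves = ["x1", "x2", "x3", "x4", "x5"]
tree = balance("+", leaves)
```

With $t = 5$ inputs the balanced tree has $t - 1 = 4$ gates, hence $8 \le 2t$ edges, and depth $3 \le 1 + \log_2 t$, in line with the counts used in the observation.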

Reversal of Polynomials
In this section, we define a very useful operation for polynomials, which serves as a crucial tool in the proof of our main theorem. This operation, which we call reversal, simply maps a polynomial $P(\mathbf{x}, y) \equiv \sum_{i=0}^{r} P_i(\mathbf{x})\, y^i$, with $P_r(\mathbf{x}) \cdot P_0(\mathbf{x}) \not\equiv 0$, to $\tilde{P}(\mathbf{x}, y) \equiv \sum_{i=0}^{r} P_i(\mathbf{x})\, y^{r-i}$. The restriction that $P_r(\mathbf{x}) \cdot P_0(\mathbf{x}) \not\equiv 0$ is needed in this paper because, under it, the reversal preserves irreducibility, as we will see in Lemma 3.8 and Corollary 3.9. We begin by showing that the reversal can be computed almost as efficiently as the original polynomial.

Lemma 3.7 (Reversal Lemma). Let $P(\mathbf{x}, y) \equiv \sum_{i=0}^{r} y^i P_i(\mathbf{x})$ be a polynomial computed by a formula (circuit) $\Gamma$, where $P_r(\mathbf{x}) \cdot P_0(\mathbf{x}) \not\equiv 0$. Let $\tilde{P}(\mathbf{x}, y) \equiv \sum_{i=0}^{r} y^{r-i} P_i(\mathbf{x})$ be its reversal.
We now connect the reversal operation to divisibility and irreducibility of polynomials.

Since divisibility is preserved by taking reversals, we have the following corollary:

Corollary 3.9 (Irreducibility of Reversals). Let $P(\mathbf{x}, y) \equiv \sum_{i=0}^{r} y^i P_i(\mathbf{x})$, with $P_r(\mathbf{x}) \cdot P_0(\mathbf{x}) \not\equiv 0$, be a polynomial in $\mathbb{F}[\mathbf{x}, y]$. In addition, let $\tilde{P}(\mathbf{x}, y) \equiv \sum_{i=0}^{r} y^{r-i} P_i(\mathbf{x})$ be its reversal.
Then, we have that $P$ is irreducible $\iff$ $\tilde{P}$ is irreducible.
Another useful property of reversals is that if two univariate polynomials do not share a common root, then their reversals will not share any root either. This gives us the following lemma: do not share any common roots, then their reversals $\tilde{f}(x), \tilde{g}(x)$ do not share any roots either.

Properly Splitting Variable Restrictions
In this section, we study properties of pairs of polynomials f (x, y), g(x, y) which share no common factor involving the variable y.We state a lemma on restrictions of the x variables of f and g that preserve the property that their restrictions share no common factors in y.
We denote such restrictions as properly splitting variable restrictions.

Standard Forms and Approximation Polynomials
In this section we define the notion of standard forms in $\mathbb{F}(\mathbf{x})[y]$, that is, the ring of polynomials in the variable $y$ whose coefficients are rational functions in the variables $\mathbf{x}$.
We also define the approximation polynomial of a standard form. These concepts will be useful when factoring a polynomial $P(\mathbf{x}, y) \in \mathbb{F}[\mathbf{x}, y]$, since our factorization procedure will use standard forms to obtain the factors of $P(\mathbf{x}, y)$ that depend on the variable $y$. We begin with the following definition:

Definition 5.1 (Standard Form and Approximation Polynomials). We say that $f(\mathbf{x}, y) \in$ where . For a given parameter $m \in \mathbb{N}$, we define the approximation polynomial of the standard form $f$ to degree $m$ as the polynomial $\psi_{f,m}(\mathbf{x}, y) \in \mathbb{F}[\mathbf{x}, y]$ given by

In order to state some useful properties of approximation polynomials, we will need to extend the definition of reversals to standard forms.

Definition 5.2. Let $f(\mathbf{x}, y)$ be a standard form as above, with the additional condition that $f_0(\mathbf{x}) \not\equiv 0$. We define the reversal of $f(\mathbf{x}, y)$ as the following standard form:

The following observations about standard forms reveal much of their usefulness when factoring a polynomial.

Observation 5.3. If $f(\mathbf{x}, y) \in \mathbb{F}(\mathbf{x})[y]$ is in standard form as above, notice that the following holds for all $m \in \mathbb{N}$: y)), we have: , where $\gamma \in \mathbb{F}$, we have that $h(\mathbf{x}, y)$ is also a standard form and

Approximating the Roots of a Polynomial
In this section, we proceed in a similar way as in [2] and find approximations of the roots of a polynomial $P(\mathbf{x}, y)$ up to degree $t$. That is, as we defined in the introduction, we find polynomials $q_t(\mathbf{x})$ such that $H^{\mathbf{x}}_{\le t}[P(\mathbf{x}, q_t(\mathbf{x}))] \equiv 0$. Moreover, we observe that under certain conditions on the polynomial $P(\mathbf{x}, y)$ these roots are well-defined and unique given their constant coefficient. This uniqueness condition will be useful because it will allow us to construct any factor of $P(\mathbf{x}, y)$ through the lifting procedure, since a factor $f(\mathbf{x}, y)$ of $P(\mathbf{x}, y)$ will share some of the roots of $P(\mathbf{x}, y)$. We begin with the approximation lemma:

Lemma 6.1 (Approximation Lemma). Let $P(\mathbf{x}, y) \in \mathbb{F}[\mathbf{x}, y]$, $P'(\mathbf{x}, y) \equiv \frac{\partial P}{\partial y}(\mathbf{x}, y)$ and $\mu \in \mathbb{F}$ be such that $P(\mathbf{0}, \mu) = 0$ but $P'(\mathbf{0}, \mu) = \xi \neq 0$. Then, for each $t \ge 0$, there exists a unique polynomial $q_t(\mathbf{x})$ such that $\deg(q_t) \le t$, $q_t(\mathbf{0}) = \mu$ and $H^{\mathbf{x}}_{\le t}[P(\mathbf{x}, q_t(\mathbf{x}))] \equiv 0$. Moreover, if $P$ can be computed by a formula (circuit) $\Gamma$ such that its output gate is an addition gate, there is a formula (circuit) $\Phi_t$ for the polynomial $q_t(\mathbf{x})$ such that the output gate of $\Phi_t$ is an addition gate, $\mathrm{depth}(\Phi_t) \le \mathrm{depth}(\Gamma) + 2$ and If we require the fanin of the formula (circuit) to be 2, then the size of $\Phi_t$ does not change, and $\mathrm{depth}(\Phi_t) \le \mathrm{depth}(\Gamma) + 5r \log(t)$.
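The existence and uniqueness part of the lemma can be illustrated by lifting a root one degree at a time. The sketch below is an illustration only, restricted to a single variable $x$ and using exact rational arithmetic: if $q_{k-1}$ is a root up to degree $k - 1$, then adding the correction $-c_k x^k / \xi$, where $c_k$ is the degree-$k$ coefficient of $P(x, q_{k-1}(x))$, gives a root up to degree $k$. The coefficient-list representation and all function names are assumptions, and the sketch says nothing about the circuit size or depth bounds in the lemma.

```python
from fractions import Fraction


def mul_trunc(a, b, t):
    """Product of two polynomials in x (coefficient lists), keeping degrees <= t."""
    out = [Fraction(0)] * (t + 1)
    for i, ai in enumerate(a):
        if i > t:
            break
        for j, bj in enumerate(b):
            if i + j > t:
                break
            out[i + j] += ai * bj
    return out


def add_trunc(a, b, t):
    out = [Fraction(0)] * (t + 1)
    for i in range(t + 1):
        if i < len(a):
            out[i] += a[i]
        if i < len(b):
            out[i] += b[i]
    return out


def substitute(P, q, t):
    """P(x, q(x)) truncated at degree t, where P[i] lists the coefficients of P_i(x)."""
    acc = [Fraction(0)] * (t + 1)
    for coeff in reversed(P):  # Horner's rule in y
        acc = add_trunc(mul_trunc(acc, q, t), coeff, t)
    return acc


def approximate_root(P, mu, t):
    """Coefficients of the root q_t with q_t(0) = mu, lifted degree by degree."""
    mu = Fraction(mu)
    xi = sum(i * P[i][0] * mu ** (i - 1) for i in range(1, len(P)))  # dP/dy at (0, mu)
    if substitute(P, [mu], 0)[0] != 0 or xi == 0:
        raise ValueError("need P(0, mu) = 0 and dP/dy(0, mu) != 0")
    q = [mu]
    for k in range(1, t + 1):
        residual = substitute(P, q, k)  # all coefficients below degree k vanish
        q = q + [-residual[k] / xi]
    return q


# Hypothetical example: P(x, y) = y^2 - (1 + x), with simple root mu = 1 at x = 0.
P_example = [[-1, -1], [0], [1]]
q3 = approximate_root(P_example, 1, 3)
```

For this example the lifted root is $1 + x/2 - x^2/8 + x^3/16$, the degree-3 truncation of the power series of $\sqrt{1 + x}$. The correction at each step is forced once $\mu$ is fixed, which is the uniqueness that the lemma relies on.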
Now that we know that any root of a polynomial $P(x, y)$ of small individual degree computed by a small formula can be approximated by a small formula, the next corollary uses the uniqueness of the approximation of the root to show that the same is true for any factor of $P(x, y)$.
Corollary 6.2. Let $P(x, y)$ and $\mu \in \mathbb{F}$ be defined as in Lemma 6.1 and, for each $t \in \mathbb{N}_0$, let $q_t(x)$ be the unique polynomial obtained from Lemma 6. […] then the polynomial $q_t(x)$ also satisfies […]

Proof of the Main Theorem
In this section, we give the proof of our main theorem. In addition, we state the consequences of the main theorem for both the formula size and the circuit depth of factors of polynomials with bounded individual degrees.
Lemma 7.1 (Main Lemma). Let $P(x, y) \in \mathbb{F}[x, y]$ be such that $\deg_y(P) = r$, and also […] If we require the in-degree of the formula (circuit) to be 2, then the size of $\Psi_m$ or $\widetilde{\Psi}_m$ does not change, and $\max(\mathrm{depth}(\Psi_m), \mathrm{depth}(\widetilde{\Psi}_m)) \le d + 10r \log m$.
With the Main Lemma stated above, we are now able to state and prove our main theorem.
which can be trivially computed by a formula Ψ of size ≤ 50k and depth 2. In this case, setting G(x) to be any constant polynomial, for instance G(x) ≡ 1, c = 0 and Φ m = Ψ, takes care of the base case.
Hence, assume that the claim is true for polynomials $P(x) \in \mathbb{F}[x] = \mathbb{F}[x_1, \ldots, x_n]$ with $P(0) \neq 0$, for some $n \ge 1$. Now we will prove that the same bounds hold for polynomials $P(x, y) \in \mathbb{F}[x, y]$ such that $P(0, 0) \neq 0$. Let $P(x, y) \in \mathbb{F}[x, y]$ be a polynomial computed by $\Gamma$ and $f(x, y) \in \mathbb{F}[x, y]$ be a factor of $P(x, y)$. We can assume that $f(x, y)$ and $P(x, y)$ depend on $y$; otherwise we can simply restrict the formula $\Gamma$ to $\Gamma|_{y=0}$, and the result follows by the induction hypothesis. Let […] where each $f_i(x, y) \in \mathbb{F}[x, y]$ is an irreducible polynomial. Since $P(0, 0) \neq 0$, we have that $C_0(x) \equiv P(x, 0) \not\equiv 0$, and moreover, that $C_0(0) \neq 0$. Let […] Notice that $f(x, y) \mid P(x, y) \Rightarrow u(x) \mid C_0(x)$. In addition, notice that $C_0(0) \neq 0$ and $C_0(x)$ can be computed by the formula $\Gamma|$[…]
Now that we have an approximation to the factor $u(x)$, which is the constant term of the polynomial $f(x, y)$ when seen as a polynomial in the variable $y$, we want to use Lemma 7.1 to find the factors of $f(x, y)$ that depend on $y$. For this, we will first need to find polynomials $D_i(x, y)$ with small formulas such that $f_i(x, y) \mid D_i(x, y)$ and each $D_i$ is square-free with respect to $f_i(x, y)$.
Fortunately, Lemma 3.5 tells us that for each (irreducible) polynomial $f_i(x, y)$, we can find formulas $\Delta_i$ of size $\le 9r^2 |\Gamma|$ computing polynomials […] such that for any $c \in \mathbb{F}^n$ where $G_i(c) \neq 0$ we have that $c$ properly splits $f_i(c, y)$ with respect […]
⁴ At first, it may seem strange that $G(x, y)$ does not depend on the variable $y$, since if we continued this argument by induction we would arrive at the conclusion that $G(x, y)$ is the constant polynomial. However, notice that even though $H(x)$ does not depend on the variable $x_n$, the polynomial $G(x, y)$ depends on $x_n$, since the polynomials $C_0(x)$ and $G_i(x)$ depend on $x_n$. The right way to see this dependence is the following: $G(x, y)$ depends on every variable except the variable used by the lifting procedure, which in this case is the variable $y$. Hence, $H(x)$ depends on all the variables except $x_n$ (if we choose to perform the lifting with respect to $x_n$).

Denote
Since $h_{i0}(x) \equiv f_{i0}(x + c, 0) \mid P(x + c, 0) \equiv C_0(x + c)$ and $C_0(c) \neq 0$ (because $G(c, \gamma) \neq 0$), we have that $h_{i0}(0) \neq 0$, for all $1 \le i \le t$. Hence, after normalization by a proper field element, we can write each $h_{i0}$ in the following form: […] which implies (by Corollary 3.9) that the polynomial […] Because $f_i(x, y) \mid D_i(x, y)$ and $f_i(x, y) \nmid \frac{\partial D_i}{\partial y}(x, y)$, by Lemma 3.8 we obtain that […] Since $h_i(0, y) \equiv f_i(c, y)$, we also have that $h_i(0, y)$ has no common roots with $\frac{\partial E_i}{\partial y}(0, y)$.
The following claim shows that […]$_i(x, y)$ satisfies the conditions of Lemma 7.1.

Proof of claim.
Notice that conditions (i) and (ii) from Lemma 7.1 follow from the fact that h i (x, y) | E i (x, y) and Lemmas 3.8 and 4.2. Condition (iii) follows from the fact that h i (0, y) h i0 (0) ≡ h i (0, y) shares no common roots with ∂E i ∂y (0, y) and from Lemma 3.10.
This finishes the proof of the claim.
Now that we have rational functions in monic standard form that are, in a certain sense, computing the reversal of each $f_i(x, y)$, we can use the main lemma to lift the factorization of the approximation polynomial of $f_i(x, y)/f_{i0}(x)$.⁵ Since each […]$_i(x, y)$ and $\tilde{E}_i(x, y)$ satisfy the conditions of Lemma 7.1, and $\tilde{E}_i(x, y)$ can be computed by a formula $\Upsilon_i$ of size $|\Upsilon_i| \le 180r^4 \cdot |\Gamma| = 180r^4 s$ and depth $\mathrm{depth}(\Upsilon_i) \le d + 1$ (since $\Upsilon_i$ is a shift of $\Delta_i$), there exists a formula $\Psi_{i,m}$ having as output gate a multiplication gate, $\mathrm{depth}(\Psi_{i,m}) \le \mathrm{depth}(\Upsilon_i) + 3 \le d + 4$ and size […] By Observation 5.3, we have that […] and also […] In addition, from the formulas $\Psi_{i,m}$ and from the fact that $\sum_{i=1}^{t} e_i \le r$, we have that the formula given by $\Psi$ […] and computes the following polynomial: […] and $\mathrm{depth}(\Phi') \le d + 5$ such that $\Phi'(x) \equiv f(x + c + a)$. By shifting the inputs of the formula $\Phi'$ by $-(a + c)$, we obtain a new formula, call it $\Phi$, that computes the polynomial $f(x)$, as we wanted. It is easy to see that $\Phi$ has the desired upper bound on its size. It is also clear from the proof that if we restrict the in-degree of the formulas (circuits) to be 2, we obtain the desired bounds on the depth. This finishes the proof.

Conclusion
Besides solving a question posed by Kopparty et al. [10] and Open Problem 19 in [15] for the class of bounded individual degree polynomials, notice that Lemma 7.1 and Theorem 7.2 also provide a framework to convert formulas (circuits) for the approximate roots of a polynomial into actual formulas (circuits) for factors of the same polynomial. Since Lemma 7.1, and therefore Theorem 7.2, uses the Approximation Lemma (Lemma 6.1) as a black box, any improvement on Lemma 6.1 would lead to better bounds on the size of the formulas for the factors of the input polynomial. Hence, if one can remove the exponential dependence on the parameter $r$ (the bound on the individual degrees) in the Approximation Lemma, one can fully solve the questions above. This is the main open question left by this work.
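To get a feeling for the exponential dependence on $r$ mentioned above, one can simply evaluate the size bound of Theorem 7.2 numerically. The sketch below assumes the bound reads $60000\, m^2 r^8 n \binom{m+r+1}{r+1} s$ and takes $m = rn$, which is an upper bound on the total degree of any factor of a polynomial with individual degrees at most $r$. The function name `theorem_size_bound` and the sample parameters are illustrative choices, not part of the paper.

```python
from math import comb


def theorem_size_bound(m, r, n, s):
    """Evaluate 60000 * m^2 * r^8 * n * C(m + r + 1, r + 1) * s.

    This is the formula-size bound of Theorem 7.2 as read from the statement
    (an assumption of this sketch), for approximation degree m, individual
    degree bound r, n variables, and input formula size s.
    """
    return 60000 * m**2 * r**8 * n * comb(m + r + 1, r + 1) * s


# Fix the number of variables and the input size; let r grow.
n, s = 10, 1000
for r in range(1, 6):
    m = r * n  # enough to capture a factor of total degree at most r * n
    print(r, theorem_size_bound(m, r, n, s))
```

The binomial coefficient $\binom{m+r+1}{r+1}$ is the term that grows like $(rn)^{r}$ when $m = rn$, so it dominates the polynomial factors as $r$ increases.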
Acknowledgements. The author would like to thank his advisor Zeev Dvir for all the helpful discussions and encouragement throughout the course of this work.

Definition 3.1 (Partial Homogeneous Parts). Let $P(x, y) \equiv \sum_{d} \alpha_d(y) \cdot x^{d}$ be a polynomial in $\mathbb{F}[x, y]$, where each $\alpha_d(y) \in \mathbb{F}[y]$. For each $m \in \mathbb{N}_0$, we define $H^{x}_{m}[P]$ as the polynomial formed by the homogeneous parts of degree $m$ of $P(x, y)$, when seen as a polynomial in $\mathbb{F}[y][x]$, […]
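As a small worked example of this definition, take two $x$-variables and read "degree $m$" as total degree in $x_1, x_2$ only, with $y$ treated as part of the coefficients. The polynomial below is an illustrative choice, and reading $H^{x}_{\le t}$ as the sum of the parts of degree at most $t$ is an assumption based on how the notation is used in Lemma 6.1.

```latex
\begin{align*}
P(x_1, x_2, y) &= y + x_1 y^2 + x_1 x_2 + x_2^2\, y, \\
H^{x}_{0}[P] &= y, \\
H^{x}_{1}[P] &= x_1 y^2, \\
H^{x}_{2}[P] &= x_1 x_2 + x_2^2\, y, \\
H^{x}_{\le 1}[P] &= H^{x}_{0}[P] + H^{x}_{1}[P] = y + x_1 y^2 .
\end{align*}
```

Note that the degree in $y$ plays no role: the term $x_1 y^2$ has total degree 3 but belongs to $H^{x}_{1}[P]$.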

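Lemma 3.8 (Divisibility with Reversals). Let $P(x, y) \equiv \sum_{i=0}^{r} y^{i} P_i(x)$, with $P_r(x) \cdot P_0(x) \not\equiv 0$, and $f(x, y) \equiv \sum_{i=0}^{k} y^{i} f_i(x)$, with $f_k(x) \cdot f_0(x) \not\equiv 0$, be two polynomials. In addition, let $\tilde{P}(x, y) \equiv \sum_{i=0}^{r} y^{r-i} P_i(x)$ and $\tilde{f}(x, y) \equiv \sum_{i=0}^{k} y^{k-i} f_i(x)$ be their reversals. Then we have that $f \mid P \iff \tilde{f} \mid \tilde{P}$.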
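A quick sanity check of the lemma on a hand-picked instance (one $x$-variable, with the tilde denoting the reversal in $y$; the polynomials are illustrative and not taken from the paper):

```latex
\begin{align*}
f(x, y) &= y + (1 + x), &
P(x, y) &= \bigl(y + (1 + x)\bigr)(y + 2) = y^2 + (3 + x)\, y + 2(1 + x), \\
\tilde{f}(x, y) &= 1 + (1 + x)\, y, &
\tilde{P}(x, y) &= 1 + (3 + x)\, y + 2(1 + x)\, y^2
                 = \bigl(1 + (1 + x)\, y\bigr)(1 + 2y).
\end{align*}
```

Here $f \mid P$ and, correspondingly, $\tilde{f} \mid \tilde{P}$; the cofactor $1 + 2y$ is itself the reversal of $y + 2$. The conditions $P_r P_0 \not\equiv 0$ and $f_k f_0 \not\equiv 0$ ensure that reversal does not lower the degree in $y$, which is what makes it compatible with products.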

Theorem 7.2 (Main Theorem). Let $P(x) \in \mathbb{F}[x] \setminus \{0\}$ be such that $\deg_{x_i}(P) \le r$ for $1 \le i \le n$, $P(0) \neq 0$, and let $\Gamma$ be a formula (circuit) of size $s$ and depth $d$ computing $P$. Let $f(x) \in \mathbb{F}[x]$ be a factor of $P(x)$, and let $m$ be a positive integer. There exists a polynomial $G(x) \in \mathbb{F}[x]$ of total degree $\deg(G) \le 4r^3n^3$ such that if $c \in \mathbb{F}^n$ satisfies $G(c) \neq 0$ then there exists a formula $\Phi_m$ whose output gate is a multiplication gate and for which $\mathrm{depth}(\Phi_m) \le d + 4$,³ $|\Phi_m| \le 60000\, m^2 r^8 n \cdot \binom{m+r+1}{r+1}\, s$ and $H^{x}_{\le m}[\Phi_m(x)] \equiv H^{x}_{\le m}[f(x + c)]$. If we require the in-degree of the formula (circuit) to be 2, then the size of $\Phi_m$ does not change, and $\mathrm{depth}(\Phi_m) \le d + 20r \log m$.
Proof. The proof of the theorem is by induction on the number of variables. The bound is trivial in the univariate case, since if $f(x), P(x) \in \mathbb{F}[x]$, where $\deg(f) = k \le r$ and $f \mid P$, then we can write