Reconstruction of Depth 3 Arithmetic Circuits with Top Fan-In 3
Abstract
In this paper, we give the first subexponential (and in fact quasi-polynomial time) reconstruction algorithm for depth 3 circuits of top fan-in 3 (ΣΠΣ(3) circuits) over the fields ℝ and ℂ. Concretely, we show that given blackbox access to an n-variate polynomial f computed by a ΣΠΣ(3) circuit of size s, there is a randomized algorithm that runs in time quasi-poly(n, s) and outputs a generalized ΣΠΣ(3) circuit computing f. The size s includes the bit complexity of the coefficients appearing in the circuit.
Depth 3 circuits of constant top fan-in (ΣΠΣ(k) circuits) and closely related models have been extensively studied in the context of polynomial identity testing (PIT). The study of PIT for these models led to an understanding of the structure of identically zero ΣΠΣ(k) circuits using some very elegant connections to discrete geometry, specifically the Sylvester-Gallai Theorem and colorful and high dimensional variants of it. Despite a lot of progress on PIT for ΣΠΣ(k) circuits, and more recently on PIT for depth 4 circuits of bounded top and bottom fan-in, reconstruction algorithms for ΣΠΣ(k) circuits have proven to be extremely challenging.
In this paper, we build upon the structural results for identically zero ΣΠΣ(3) circuits that bound their rank, and prove stronger structural properties of ΣΠΣ(3) circuits (again using connections to discrete geometry). One such result is a bound on the number of codimension 3 subspaces on which a polynomial computed by a ΣΠΣ(3) circuit can vanish. Armed with the new structural results, we provide the first reconstruction algorithms for ΣΠΣ(3) circuits over ℝ and ℂ.
Our work extends the work of [Sinha, CCC 2016], which gave a reconstruction algorithm for ΣΠΣ(2) circuits over ℝ and ℂ, as well as the work of [Shpilka, STOC 2007], which gave reconstruction algorithms for ΣΠΣ(2) circuits in the setting of small finite fields, and [Karnin-Shpilka, CCC 2009], which gave reconstruction algorithms for ΣΠΣ(k) circuits in the setting of small finite fields.
Keywords and phrases: arithmetic circuits, learning, reconstruction
Funding: Shubhangi Saraf: Research partially supported by an NSERC Discovery Grant and a McLean Award.
2012 ACM Subject Classification: Theory of computation → Algebraic complexity theory; Theory of computation → Circuit complexity
Acknowledgements: The authors want to thank Nitin Saxena for helpful discussions.
Editors: Srikanth Srinivasan
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Arithmetic circuits are directed acyclic graphs (DAGs) that compute multivariate polynomials in a compact form, constructing these polynomials from variables using addition (+) and multiplication (×) operations.
Reconstruction of arithmetic circuits is the following problem: given black-box or oracle access to a polynomial f computed by a circuit C of size s from a certain class of circuits 𝒞, design an efficient algorithm (either deterministic or randomized) to recover a circuit that computes the same polynomial as C. This question is the algebraic equivalent of exact learning in Boolean circuit complexity [2]. If it is additionally required that the output circuit belongs to the same class 𝒞 as the input circuit, this process is referred to as proper learning.
Reconstruction of arithmetic circuits is a fundamental and challenging problem. In recent years, there has been a flurry of works on developing reconstruction algorithms for various interesting subclasses of arithmetic circuits [4, 32, 31, 15].
Thanks to the depth reduction results from [1, 33, 49, 19], we now know that even depth-3 and depth-4 circuits are quite expressive. Thus, efficient reconstruction algorithms even for depth three circuits would have significant implications for more general circuit models. Perhaps not surprisingly, we are quite far from achieving efficient reconstruction algorithms for general depth-3 circuits. However, in recent years there have been several works developing reconstruction algorithms for restricted yet interesting subclasses of depth 3 (ΣΠΣ) and depth 4 (ΣΠΣΠ) circuits [45, 25, 47, 48, 8, 40, 9, 21, 7].
A closely related problem is that of blackbox polynomial identity testing (PIT), which asks for the following. Given blackbox access to a polynomial f computed by some circuit of size s from some class 𝒞, the goal is to decide if f is identically zero. In other words, the goal is to compute an explicit hitting set for the class 𝒞. (With randomness, this problem is easy using the Schwartz-Zippel Lemma [44, 50].)
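To make the parenthetical remark concrete: with randomness, blackbox PIT follows immediately from the Schwartz-Zippel Lemma. Below is a minimal sketch; the sample-space size, trial count, and example polynomials are illustrative choices, not taken from the paper.

```python
import random

def is_identically_zero(blackbox, n_vars, trials=20, sample_size=10**9 + 7):
    """Randomized blackbox PIT via Schwartz-Zippel: a nonzero polynomial of
    degree d vanishes at a uniformly random point of S^n with probability
    at most d/|S|, so a few random evaluations suffice."""
    for _ in range(trials):
        point = [random.randrange(sample_size) for _ in range(n_vars)]
        if blackbox(point) != 0:
            return False  # witnessed a nonzero evaluation
    return True  # identically zero with high probability

# Example blackboxes: (x+y)^2 - x^2 - 2xy - y^2 is identically zero; xy - 1 is not.
zero_poly = lambda p: (p[0] + p[1])**2 - p[0]**2 - 2*p[0]*p[1] - p[1]**2
nonzero_poly = lambda p: p[0]*p[1] - 1
```

The whole difficulty discussed in this section is doing this *deterministically*, i.e. replacing the random points by an explicit hitting set.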
It is easy to see that obtaining deterministic reconstruction algorithms for a class 𝒞 of circuits is at least as hard as derandomizing blackbox PIT for 𝒞. Even randomized reconstruction almost always requires some deep understanding of the structure of the underlying circuit class, and in almost every case we know, it seems harder than derandomizing PIT for that class. Indeed, for most circuit classes that have been studied, efficient PIT algorithms have been a precursor to understanding reconstruction algorithms for that class. Since reconstruction for depth 3 circuits of constant top fan-in (ΣΠΣ(k) circuits) is the main focus of this paper, we first give some context by describing what is known about PIT for this class.
PIT for ΣΠΣ(k) circuits
The recent breakthrough work of [34] gives the first subexponential deterministic blackbox PIT for depth 3 (and in fact any constant depth) circuits. If we want truly polynomial time blackbox derandomization, then we only know how to do this for restricted classes of depth 3 circuits. When the top fan-in of the output sum gate is a constant k, this class is referred to as the class of ΣΠΣ(k) circuits. ΣΠΣ(k) circuits have received a great deal of attention in the context of blackbox PIT, and there has been a large body of beautiful works eventually showing polynomial time blackbox PIT algorithms for ΣΠΣ(k) circuits [14, 24, 27, 43, 42]. A running theme through several of these works is to show that identically zero ΣΠΣ(k) circuits have some very interesting structure: they must be low rank. Along the way, some very elegant connections to discrete geometry, specifically the Sylvester-Gallai Theorem and colorful and high dimensional variants of it, were developed and used. In the last few years, there have been several exciting works trying to obtain similar results for interesting subclasses of depth-4 circuits, in particular for depth 4 circuits with bounded top and bottom fan-in. A sequence of results [46, 38, 37, 39, 17] developed a beautiful theory of Sylvester-Gallai type configurations for quadratic polynomials and was able to obtain a polynomial time deterministic PIT result for depth 4 circuits of bounded top fan-in with products of quadratics computed at the second layer of gates (think of ΣΠΣ(k) circuits but with a product of quadratics instead of a product of linear forms). Extending this to larger bottom degree and top fan-in is a very interesting direction, and partial results have been obtained in [35, 16, 36]. Using completely different techniques, a remarkable work by [12] gives quasipolynomial blackbox PIT for depth 4 circuits of top fan-in k and bottom fan-in r, for any constants k and r.
Reconstruction for ΣΠΣ(k) circuits
Despite all this progress for PIT, far less is known for reconstruction of ΣΠΣ(k) circuits, even when the algorithms are allowed to be randomized.
For now, let us assume the underlying field has characteristic 0; in particular, let us assume the coefficients lie in ℝ or ℂ. Without additional restrictions like multilinearity and set-multilinearity, we essentially only know how to do efficient reconstruction of ΣΠΣ(k) circuits over infinite fields like ℝ and ℂ when k = 2 [47]! The result in [47] gives a randomized polynomial time reconstruction algorithm for n-variate, degree d polynomials represented by a ΣΠΣ(2) circuit over ℝ or ℂ.
Note that when k = 2, derandomizing PIT is easy, and it only starts becoming challenging once k ≥ 3. However, reconstruction is much more challenging, and despite all the progress on the PIT front, it took a really long time to get efficient reconstruction for ΣΠΣ(2) circuits. The proof in [47] is quite sophisticated and uses some really beautiful connections to discrete geometry, in particular to the robust Sylvester-Gallai theorems (inspired by the theory developed for PIT of ΣΠΣ(k) circuits). Thus even for the seemingly simple case of k = 2, reconstruction can be fairly complex. Already when k = 3, the techniques of the above work break down, and nothing nontrivial was known for reconstructing ΣΠΣ(3) circuits over ℝ or ℂ.
In the setting of finite fields, there are some additional very interesting results for reconstruction of ΣΠΣ(k) circuits. Over small (only polynomially large) finite fields, the first (and very nontrivial) reconstruction algorithm for ΣΠΣ(2) circuits was given in [45]. This algorithm's run time has a quasipolynomial dependence on the field size |𝔽| (it crucially needs to iterate over all field constants) and is therefore only efficient for small fields. This work was extended to ΣΠΣ(k) circuits for any constant k in [25], but again it is only efficient for small finite fields due to the quasipolynomial dependence on |𝔽|. When the input is an n-variate, degree d polynomial computed by a size s circuit, both the above algorithms run in quasi-poly(n, d, s, |𝔽|) time. In the setting of ΣΠΣ(2) circuits, only very recently was it shown in [48] that there is a polynomial time reconstruction algorithm whose run time has only a polynomial dependence on log |𝔽|. In the further restricted setting where we have the additional constraints of multilinearity or set-multilinearity (the latter captures tensor reconstruction for constant rank tensors), there has been a large body of work on reconstruction algorithms for ΣΠΣ(k) circuits [4, 45, 25, 8, 40, 9].
Thus to summarize: over large fields, or infinite fields such as ℝ or ℂ, we knew no nontrivial reconstruction algorithms for general ΣΠΣ(k) circuits even for k = 3. The main result of this paper is to give the first efficient reconstruction algorithm for ΣΠΣ(3) circuits over infinite fields such as ℝ and ℂ.
Our result (informal).
Given blackbox access to an n-variate, degree d polynomial f over ℝ or ℂ, computed by a ΣΠΣ(3) circuit, there is a randomized quasi-poly(n, d, b) time reconstruction algorithm for f, where b is the maximum bit complexity of any constant appearing in the circuit.
Before we state our results more formally, we first introduce some definitions and notions related to ΣΠΣ(k) circuits.
Some definitions related to ΣΠΣ(k) circuits
The model of depth-3 arithmetic circuits with top fan-in k, which we refer to as ΣΠΣ(k) circuits, has three layers of alternating Σ and Π gates and computes a polynomial of the form
f = T_1 + T_2 + ... + T_k, where T_i = ℓ_{i,1} · ℓ_{i,2} ⋯ ℓ_{i,d_i},
where the ℓ_{i,j}'s are linear polynomials.
We will in fact assume that the circuits are homogeneous and that all the d_i's are actually the same (all equal to the degree d). This is because, for the purposes of reconstruction and PIT, one can easily reduce to the homogeneous setting.
We say that the circuit C is simple if gcd(T_1, ..., T_k) = 1, and minimal if for all proper nonempty subsets S ⊊ [k], the sum of the T_i over i ∈ S is not identically zero. We define gcd(C) = gcd(T_1, ..., T_k). The simplification of C, denoted sim(C), is defined as C / gcd(C). We define the rank of a circuit C, rank(C), as the dimension of the space spanned by all the linear forms appearing in C. We will often be concerned with rank(sim(C)).
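The rank of a circuit, i.e. the dimension of the span of all linear forms appearing in it, is easy to compute exactly. A minimal sketch with exact rational arithmetic follows; the example circuit is an illustrative choice, not from the paper.

```python
from fractions import Fraction

def rank_of(vectors):
    """Dimension of the span of a list of coefficient vectors,
    computed by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank, col = 0, 0
    width = len(rows[0]) if rows else 0
    while rank < len(rows) and col < width:
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        col += 1
    return rank

def circuit_rank(gates):
    """rank(C): dimension spanned by all linear forms of all product gates."""
    return rank_of([ell for gate in gates for ell in gate])

# Toy circuit over variables (x, y, z), each linear form a coefficient tuple:
# T1 = x*y, T2 = x*(x+y), T3 = y*(x+y).  All forms lie in span{x, y}, so rank 2.
gates = [[(1, 0, 0), (0, 1, 0)],
         [(1, 0, 0), (1, 1, 0)],
         [(0, 1, 0), (1, 1, 0)]]
```

The rank bounds discussed below say that for *identically zero* simple and minimal ΣΠΣ(3) circuits this quantity must be small.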
A generalized depth 3 circuit is of the form
f = Σ_{i=1}^{k} (ℓ_{i,1} ⋯ ℓ_{i,d_i}) · G_i(ℓ'_{i,1}, ..., ℓ'_{i,R}),
where the ℓ_{i,j}'s and ℓ'_{i,r}'s are linear forms in x_1, ..., x_n, and each G_i is an arbitrary polynomial in at most R linear forms. Notice that when R is small (say constant or polylogarithmic), the representation looks like a ΣΠΣ(k) circuit where every product gate is further multiplied by a polynomial in few (R) linear forms.
Our techniques: Rank bounds and connections to discrete geometry
As mentioned previously, several of the blackbox PIT results for ΣΠΣ(k) circuits and related models follow from some insight into the structure of identically zero circuits. One such remarkable structural result, which is also a central ingredient in our proof, is that identically zero simple and minimal ΣΠΣ(3) circuits over ℝ or ℂ must be of only constant rank [27, 42, 43]. In particular, if a simple and minimal ΣΠΣ(3) circuit computes the identically zero polynomial, then the set of all linear forms appearing at any of the gates in the circuit can only span a constant dimensional space.
In this paper, we develop a deeper understanding of some structural properties of ΣΠΣ(3) circuits. Since we need to learn the linear forms in a given underlying circuit (not just determine whether the polynomial is zero or nonzero), we also need to understand and develop structural properties of non-zero ΣΠΣ(3) circuits. One example of such a structural result we show is that if a polynomial is computed by a ΣΠΣ(3) circuit (which has some mild non-degeneracy property), then the number of codimension 3 subspaces on which it can vanish is polynomially bounded (see Lemma 11).
1.1 Our Results
In this paper, we give the first subexponential time (and in fact quasipolynomial time) algorithm for reconstructing ΣΠΣ(3) circuits over ℝ and ℂ. When the three multiplication gates in our circuit are sufficiently distant, i.e. when the pairwise distance between the gates exceeds a threshold determined by an absolute constant, then our algorithm does "proper learning", i.e. its output is the unique ΣΠΣ(3) circuit computing f. If this distance property does not hold, then our algorithm outputs a generalized depth 3 circuit of top fan-in at most 2 computing f. We state our main theorem below. The running time in the statement suppresses a polynomial dependence on the maximum bit complexity of any constant appearing in the circuit.
Theorem 1.
Let 𝔽 be ℝ or ℂ, and let f be a degree d polynomial computed by a ΣΠΣ(3) circuit of the form C = T_1 + T_2 + T_3. There exists an absolute constant c such that the following holds. There is a randomized algorithm that runs in quasi-poly(n, d) time, makes blackbox queries to f, and with high probability does the following:
1. If every pair of gates of C is sufficiently distant (with the threshold determined by c), then it outputs a ΣΠΣ(3) circuit computing f.
2. If some pair of gates of C is not sufficiently distant, then it outputs a generalized depth 3 circuit of top fan-in at most 2 computing f.
Remark 2 (Dependence on bit complexity).
If b is the maximum bit complexity of any coefficient appearing in f, then our algorithm's run time also depends polynomially on b. In the statement of the above theorem and later in the paper, we have suppressed this dependence in the running time for clarity of exposition.
Remark 3 (Proper vs improper learning).
Note that our algorithm is a proper learning algorithm only when every pair of multiplication gates has enough "distance". Otherwise, the output comes from the model of generalized depth 3 circuits. All prior works on reconstruction of ΣΠΣ(2) and ΣΠΣ(k) circuits ([45, 25, 47, 48]) also had a similar kind of output.
1.2 Other related works
As the case of general ΣΠΣ circuits is considered hard, several restrictions of the model have been considered for reconstruction. Some well-studied models are depth-3 powering circuits, multilinear ΣΠΣ(k) circuits, and set-multilinear ΣΠΣ(k) circuits. The restrictions of powering circuits and set-multilinear circuits have received special attention due to their connections with the symmetric tensor decomposition and tensor decomposition problems [4, 45, 25, 8, 40, 9].
For the model of depth 2 (ΣΠ) circuits, the problem of reconstruction is equivalent to sparse multivariate interpolation, for which we have a polynomial time algorithm [5]. The work of [7] studied multilinear depth-4 circuits with bounded top fan-in (multilinear ΣΠΣΠ(k) circuits) and gave deterministic reconstruction algorithms for them. The running time, however, still grows at least polynomially with |𝔽|, and hence the result does not work over large/infinite fields. Note that when the top fan-in is 2, i.e. for multilinear ΣΠΣΠ(2) circuits, we do know such efficient polynomial-time reconstruction algorithms by the work of [21]. Read-once oblivious branching programs (ROABPs) are another model that has been well studied in the context of reconstruction. In [31], the authors presented a randomized reconstruction (proper learning) algorithm for ROABPs that ran in quasi-polynomial time. This was derandomized in [15], giving a deterministic quasi-polynomial time reconstruction algorithm.
Recently, there have also been several works studying average case learning algorithms for arithmetic circuits. In [30], the authors give an efficient reconstruction algorithm for non-degenerate homogeneous depth three circuits. A learning algorithm for generalized depth three circuits in the non-degenerate case is presented in [6]. Reconstruction algorithms for other constant depth circuits in the non-degenerate case have also been obtained in [20, 22, 26, 29, 18].
Outline of the paper
In the rest of the paper, we present an overview of the proof in Section 2, and in Section 4 the detailed proof of a structure theorem about the vanishing codimension 2 and codimension 3 spaces of a polynomial computed by a ΣΠΣ(3) circuit, which we crucially use in our reconstruction algorithm. The rest of the details and proofs can be found in the full version [41] of the paper.
2 Proof Overview
Let f be a polynomial that has a ΣΠΣ(3) representation, and let
C = G · (T_1 + T_2 + T_3)
be a ΣΠΣ(3) circuit computing f, where G is the gcd of the gates. Thus each T_i is a product of linear forms, and as we describe in the preliminaries, with some simple preprocessing we can assume that the circuit and all gates within it are homogeneous. In general, the gates might have a nontrivial gcd, which has to be dealt with, but for the purpose of the proof overview, let us assume that G = 1. Note that we cannot easily reduce to the case of G = 1 by factoring f and dividing out the linear factors, since there might be linear factors which do not divide the gcd, and division by those factors might not preserve the property of the polynomial being representable by a ΣΠΣ(3) circuit.
In order to reconstruct the circuit C, we need to somehow try and learn the linear forms appearing in C. What we have is (randomized) access to a blackbox computing f.
Notice that if ℓ_1 is a linear form dividing T_1, ℓ_2 is a linear form dividing T_2, and ℓ_3 is a linear form dividing T_3, then if we go modulo ℓ_1, ℓ_2 and ℓ_3, the polynomial f vanishes identically. In other words, for any input where ℓ_1, ℓ_2 and ℓ_3 evaluate to 0, f evaluates to 0 as well.
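This vanishing observation is easy to check numerically. The sketch below uses hypothetical gates over four variables, where the forms x_1, x_2, x_3 divide the three gates respectively, so the sum vanishes on the codimension 3 subspace {x_1 = x_2 = x_3 = 0}.

```python
import math

def eval_linear(ell, point):
    """Evaluate a linear form, given as a coefficient tuple, at a point."""
    return sum(c * x for c, x in zip(ell, point))

def eval_sps3(gates, point):
    """Evaluate f = T1 + T2 + T3, where each gate is a list of linear forms
    and computes their product."""
    return sum(math.prod(eval_linear(ell, point) for ell in gate)
               for gate in gates)

# Hypothetical gates over variables (x1, x2, x3, x4):
# T1 = x1*(x1 + x4), T2 = x2*(x2 + 2*x4), T3 = x3*(x3 + 3*x4).
gates = [
    [(1, 0, 0, 0), (1, 0, 0, 1)],
    [(0, 1, 0, 0), (0, 1, 0, 2)],
    [(0, 0, 1, 0), (0, 0, 1, 3)],
]
# Every point with x1 = x2 = x3 = 0 kills one factor of each gate, so f = 0 there,
# while a generic point gives a nonzero value.
```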
Let S'_3 be the set of all codimension 3 subspaces of 𝔽^n over which f vanishes. If we could somehow "learn" all the spaces in this set, then one of them would correspond to the triple (ℓ_1, ℓ_2, ℓ_3), i.e. the codimension 3 space where ℓ_1, ℓ_2 and ℓ_3 vanish (or evaluate to 0). Thus, a starting challenge for us is to show that the set S'_3 can be learned. It turns out that the set S'_3 can be infinite. Suppose W is some codimension 2 space on which f vanishes. Then any codimension 3 space contained within W will also be a space on which f vanishes, and hence be contained in S'_3. This makes the set unwieldy to deal with, and hence we modify the definition.
Let S_3 be the set of all codimension 3 subspaces of 𝔽^n over which f vanishes, such that the subspace is not contained within any codimension 1 or codimension 2 space on which f vanishes. One of our significant structural results is to show that, other than in certain degenerate settings, S_3 is finite and in fact has at most poly(d) distinct elements. This is indeed the first major ingredient of our proof, and we prove it in Section 4. We also show that S_2, which is defined similarly with codimension 2 spaces, is finite and has at most poly(d) distinct elements. This is the starting point of our analysis.
Once we prove that S_2 and S_3 are finite and polynomially bounded (other than in degenerate settings), the next thing we show is how to actually compute S_2 and S_3. Once we have these sets, we then use them to learn the linear forms appearing in C.
Our reconstruction algorithm for ΣΠΣ(3) circuits over ℝ or ℂ (and the proof of correctness of the algorithm) follows the broad outline below.
1. Obtain an upper bound on the number of codimension 2 subspaces (S_2) and codimension 3 subspaces (S_3) on which f vanishes (other than in some degenerate settings). This is shown in Section 4.
2. Algorithmically compute S_2 and S_3. This is shown in Section 5 of the full version [41]. At a high level, we consider projections of the circuit to constantly many variables, compute S_2 and S_3 for the constant variate polynomials by solving a suitable system of polynomial equations for each projection, and then "glue" or "lift" the solutions to a global solution over the entire original space. The ideas are inspired by the algorithms in [48]. However, there are some additional nontrivial challenges that arise in our setting.
3. Use S_2 and S_3 to form a list of linear forms (which we call L) such that several of these linear forms actually divide one of the gates of the circuit. We will do this in the settings where S_2 and S_3 can be computed, as well as in the degenerate cases where they cannot, and for this we will use the structure of the degeneracy. This is shown in Section 6 of the full version [41] and is the most technically complex part of the proof.
4. Reconstruct the entirety of the circuit using the few linear forms learned in the previous part. This is achieved by going modulo the linear forms, learning the projected circuit of top fan-in at most 2, and then gluing the projections to recover the original circuit. This part uses several ideas from the reconstruction algorithms of Shpilka [45] and Karnin-Shpilka [25].
We will discuss each of the four parts in some more detail in the subsections below.
2.1 Overview: Upper bounding the size of S_2 and S_3
For S_1, which is defined to be the set of codimension 1 spaces over which f vanishes, bounding the size is easy: each member of S_1 corresponds to a linear factor of f, and there can be at most d of those.
We now give a flavor of what goes into bounding S_2. We will only be able to bound S_2 when the circuit is such that rank(sim(C)) is at least some absolute constant, which depends on the rank bound for identically zero ΣΠΣ(3) circuits.
By assumption, gcd(C) = 1. For any W which is an S_2 space, let C|_W denote the circuit restricted to W. The circuit C|_W computes a polynomial that is identically zero. Thus, by rank bounds for identically zero ΣΠΣ(3) circuits (see Theorem 6), rank(sim(C|_W)) is at most an absolute constant much smaller than rank(sim(C)). Suppose we are only trying to bound the number of those W over which none of the individual T_i vanish (the other case is not so hard to bound). Then the fact that rank(sim(C|_W)) is much smaller than rank(sim(C)) means that many triples of linear forms coming from distinct gates must "collapse" and become identical when we go mod W. Thus they will no longer contribute to the rank of sim(C|_W), as they will be in the gcd. (If there is no movement to the gcd, then the overall rank can reduce by at most two.) Now if ℓ_1 dividing T_1, ℓ_2 dividing T_2 and ℓ_3 dividing T_3 are three linearly independent linear forms that become identical mod W, then the 2-dimensional space of linear forms vanishing on W must be contained in span(ℓ_1, ℓ_2, ℓ_3). Suppose this happens for another triple (ℓ'_1, ℓ'_2, ℓ'_3) which is linearly independent and such that span(ℓ'_1, ℓ'_2, ℓ'_3) is distinct from span(ℓ_1, ℓ_2, ℓ_3). Then, since the space of forms vanishing on W must belong to the spans of both triples, it must in fact equal span(ℓ_1, ℓ_2, ℓ_3) ∩ span(ℓ'_1, ℓ'_2, ℓ'_3). Thus the two triples jointly determine W, and hence, given the circuit C, there are only poly(d) choices for W.
Note that we haven't covered all cases. It could be that the triples which collapse and move to the gcd are not linearly independent and only span 2-dimensional spaces. This case needs to be handled separately. Also, the case where one of the gates (say T_1) identically vanishes over W has to be dealt with. In each of these cases we are able to bound the number of such spaces W, and thus we get a polynomial bound on the size of S_2.
Bounding S_3 is significantly more challenging, and the analysis breaks down into a larger number of cases. Also, unlike for S_2, we are not able to bound the size of S_3 whenever rank(sim(C)) is large. Notice that even if rank(sim(C)) is large, there could exist a linear form ℓ modulo which the rank of the simple part of the circuit becomes quite small. In this case, perhaps f mod ℓ, which is computed by C mod ℓ, can vanish on a large (maybe infinite) set of codimension 2 spaces. Then, together with ℓ, this will give us a large (possibly unbounded) set of codimension 3 spaces that f vanishes on. Thus, we are only able to bound S_3 when we have the added condition that there is no linear form ℓ modulo which the rank of the simple part of the circuit becomes small. This is the non-degeneracy condition that we alluded to earlier.
There are circuits that start off with rank(sim(C)) being large but such that there exists a linear form ℓ modulo which the rank of the simple part becomes small; we call these "special form" circuits. We are unable to bound the number of S_3 spaces for polynomials computed by such circuits, but nevertheless, we show that such circuits have other additional nice properties that we will eventually exploit to learn them.
Comparison to [48]
In [48], the authors develop a similar structural result, where they bound the number of codimension 2 vanishing spaces for the non-linear part of a ΣΠΣ(2) circuit. The structure of ΣΠΣ(2) circuits is simpler. Note, for instance, that derandomizing PIT for ΣΠΣ(2) circuits is trivial, whereas derandomizing PIT for ΣΠΣ(3) circuits is more challenging and is closely related to the rank bound. Indeed (unlike in [48]), we need to crucially use the rank bound to bound the number of vanishing spaces, and the analysis is more intricate.
2.2 Overview: Algorithmically computing S_2 and S_3
We will only show how to compute S_2 and S_3 in the settings where we have proved that their size is polynomially bounded.
Our high level strategy is inspired by the algorithms in [48]. However there are some additional nontrivial challenges that arise in our setting.
For the purpose of the proof overview, we focus on the case of computing S_3, and assume that we know how to compute S_2.
Constant variate case
We first show that if the underlying polynomial is constant variate, then this can be done. We do this by setting up a suitable system of polynomial equations. Setting up equations to find all triples (ℓ_1, ℓ_2, ℓ_3) (the variables are the coefficients of ℓ_1, ℓ_2, ℓ_3) such that f vanishes over the codimension 3 space defined by ℓ_1, ℓ_2, ℓ_3 is fairly straightforward. However, this system might have infinitely many solutions unless we ensure that the codimension 3 space does not lie within any codimension 1 or 2 space on which f vanishes. For this, we first compute S_1 and S_2, and for each element in these sets we require that a certain matrix has large rank. By introducing additional variables (just constantly many), we show how to capture these "large rank" constraints as well using polynomial equations. Thus, overall we have polynomially many equations of polynomial degree, but in only constantly many variables. This system can be solved (over ℝ or ℂ) to recover all solutions.
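In a toy low-dimensional setting, the idea of recovering vanishing spaces can even be illustrated by exhaustive search over a small prime field (the actual algorithm instead solves a system of polynomial equations over ℝ or ℂ). The polynomial and the field below are hypothetical choices for illustration; the search finds the codimension 1 vanishing spaces, i.e. the hyperplanes cut out by the linear factors.

```python
from itertools import product

P = 5  # a small prime field F_P for the toy search

def eval_f(pt):
    # toy polynomial f = x*y*(x+y); its vanishing hyperplanes are x=0, y=0, x+y=0
    x, y, z = pt
    return (x * y * (x + y)) % P

def vanishing_hyperplanes(f, n=3):
    """Brute-force the codimension-1 vanishing spaces of f over F_P:
    try every normalized linear form and test that f vanishes on its
    entire hyperplane."""
    found = []
    for coeffs in product(range(P), repeat=n):
        if not any(coeffs):
            continue
        # normalize so the first nonzero coefficient is 1 (one form per hyperplane)
        if next(c for c in coeffs if c) != 1:
            continue
        hyperplane = [pt for pt in product(range(P), repeat=n)
                      if sum(c * x for c, x in zip(coeffs, pt)) % P == 0]
        if all(f(pt) == 0 for pt in hyperplane):
            found.append(coeffs)
    return found
```

The same "impose vanishing, then rule out degenerate solutions" pattern is what the polynomial system in the paragraph above encodes algebraically.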
Large variate case
In case f is over a large number of variables, we consider several distinct projections of f to the constant variate case. We learn the spaces for each of the projections and then glue them together to recover the spaces for f. Before performing the projections, we first apply a random invertible linear transformation to the variables of f to ensure that the projections have nice properties (such as being able to apply Hilbert irreducibility). For our proof to go through, we will require that the projections of f do not contain new linear factors (which would correspond to new S_1 spaces not arising from projections of original spaces), and this is easy to obtain using Hilbert irreducibility. However, crucially, we will also require that the projections don't generate new S_2 spaces. This is important since, if new S_2 spaces were generated, then potentially one could lose out on S_3 spaces when we take a projection (since we are not able to learn codimension 3 spaces contained within an S_2 space). Proving that new S_2 spaces (i.e. not just the ones that are projections of S_2 spaces of f) are not generated does not follow immediately from Hilbert irreducibility, and indeed we are not able to prove this fact for general polynomials (though perhaps it might be true in general as well). Our proof crucially uses the fact that f can be represented as a high rank ΣΠΣ(3) circuit. Indeed, a crucial ingredient in our proof (of the fact that no new S_2 spaces are generated in the projections) is the upper bound on the number of S_2 spaces of f.
Once we can prove that no new spaces are generated in the projections, it is not hard to show that every space in S_3 for f gets projected to a distinct space in S_3 for each projected polynomial. Each projected polynomial is constant variate, so its set of codimension 3 vanishing spaces can be computed. Given the structure of how the projections are chosen, it is then not hard to see that the spaces learned from the different projections can be stitched together to recover S_3 for f.
Comparison to [48]
A similar algorithm (Algorithm 7) appeared in [48] for computing the set of codimension 2 vanishing spaces for ΣΠΣ(2) circuits. Our algorithm for learning the set of codimension 3 vanishing spaces is inspired by this work, but there is one crucial difficulty/difference. In [48], when the set of codimension 2 vanishing spaces is being learnt, one needs to discard those vanishing spaces that are contained in a codimension 1 vanishing space. This can be easily achieved by just dividing out the linear factors. This process needs to be modified when learning S_3, since there is no way of just "factoring out" the spaces in S_2. Instead, we have to set up a modified system of polynomial equations that handles this. We also need to prove a structural result ensuring that, after projecting the large variate case to the constant variate case, the bounds on the number of spaces still hold (no linear form exists modulo which the rank of the simple part drops below the threshold), and no new spaces are generated. These issues did not arise in [48].
2.3 Overview: Using S_2 and S_3 to learn some linear forms appearing in C
Recall that f is a polynomial that has a ΣΠΣ(3) representation, and let
C = T_1 + T_2 + T_3
be the circuit computing f. Each T_i is a product of linear forms, and for now we are assuming that gcd(C) = 1. We will show that we are able to learn a few independent linear forms from one of the gates in C.
If ℓ_1 is a linear form dividing T_1, ℓ_2 is a linear form dividing T_2 and ℓ_3 is a linear form dividing T_3, such that ℓ_1, ℓ_2, ℓ_3 are linearly independent, then the codimension 3 space V(ℓ_1, ℓ_2, ℓ_3) on which they all vanish will belong to S_3, unless it is contained within some space in S_2 or some space in S_1. In such a case, we will say V(ℓ_1, ℓ_2, ℓ_3) got "blocked" by an S_1 or S_2 space and hence did not get learned.
Now suppose that V(ℓ_1, ℓ_2, ℓ_3) did not get blocked and hence lies in S_3. Thus we can use S_3 to learn span(ℓ_1, ℓ_2, ℓ_3). Suppose there exists ℓ'_3 dividing T_3 such that V(ℓ_1, ℓ_2, ℓ'_3) is some other distinct space in S_3. Thus we can also learn span(ℓ_1, ℓ_2, ℓ'_3). The intersection of these two spans allows us to learn span(ℓ_1, ℓ_2). Just like we managed to learn span(ℓ_1, ℓ_2), if we could also somehow learn span(ℓ_1, ℓ'_2) for some other linear form ℓ'_2 dividing T_2, then we could take the intersection of span(ℓ_1, ℓ_2) and span(ℓ_1, ℓ'_2) to learn span(ℓ_1), and hence learn one of the linear forms appearing in T_1!
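The intersection step described above is plain linear algebra over the span of coefficient vectors. A minimal sketch follows, using hypothetical kernels K1 = span(l1, l2, l3) and K2 = span(l1, l2, l3'); the concrete forms are illustrative.

```python
from fractions import Fraction

def rank_of(vectors):
    """Dimension of the span of coefficient vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rank, col = 0, 0
    width = len(rows[0]) if rows else 0
    while rank < len(rows) and col < width:
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        col += 1
    return rank

def intersection_dim(A, B):
    """dim(span A ∩ span B) = dim(A) + dim(B) - dim(A + B)."""
    return rank_of(A) + rank_of(B) - rank_of(A + B)

def in_span(v, basis):
    """Membership test: adding v must not increase the rank."""
    return rank_of(basis + [v]) == rank_of(basis)

# Hypothetical kernels sharing the 2-dimensional subspace span(l1, l2):
l1, l2, l3, l3p = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 1, 1)
K1, K2 = [l1, l2, l3], [l1, l2, l3p]
```

Two distinct S_3 kernels sharing two forms intersect in exactly the 2-dimensional space span(l1, l2), which is what the argument above exploits.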
Thus intersections of kernels of spaces in S_3 (the kernel of a space being the span of the linear forms vanishing on it) can be useful in learning linear forms in C. Can this strategy always be carried out to learn linear forms? Or could it be that we cannot learn any linear forms because S_3 is empty, since every codimension 3 space on which f vanishes got blocked by a space in S_1 or S_2?
It turns out that this part proved to be surprisingly challenging to show, and it is the technically most difficult and intricate part of the paper. Indeed, we are not able to show that intersections of kernels of spaces in S_3 will always suffice for learning linear forms. However, we do show that in most interesting cases, either intersections of kernels of S_3 spaces or intersections of kernels of S_2 spaces will suffice. When these do not suffice, we show that the underlying circuit has some other nice structure which can be exploited to learn the linear forms. We first prove some useful structural properties of S_1 and S_2.
In the rest of the proof overview, we will just try to give some high level ideas of the kinds of structural results we need to prove, and some of the algorithmic ideas that go into the reconstruction. This will be a considerable simplification of all the actual cases we need to consider and their analyses.
S_1 is essentially low dimensional
S_1 is in correspondence with the linear factors of f. A linear factor could either divide gcd(C) (which we assume is 1 for the proof overview) or divide sim(C). We show that the set of linear factors dividing sim(C) can span a space of dimension at most O(log d). This uses lower bounds for 2-query locally decodable codes (similar such bounds also appeared in [14, 45, 25]).
Understanding the structure of S_2
If we could show that the linear forms defining the kernels of spaces in S_2 are also low dimensional, that would be very convenient: if the gates in the circuit start off having many high rank linear forms, it would show that many of the spaces in S_3 remain unblocked, and then we could use them to learn linear forms.
However, this turns out to be not true, and this causes the proof to become quite a bit more involved. Though we are not able to bound the dimension spanned by the kernels of S_2 as a whole, we still manage to prove some structural results that suffice for our purpose.
Consider any W in S_2. When we consider f modulo the two linear forms defining W, it is identically zero. There are two ways this can happen. Either each of the T_i's vanishes modulo these forms (in other words, each T_i has a linear form dividing it that is in their span), or the T_i's don't all individually vanish, but their sum still vanishes.
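A toy instance of the second possibility, where no single gate vanishes on W but the sum does, can be checked numerically. The gates below are an illustrative construction (not from the paper) over variables (x1, x2, x3, x4), with W = {x1 = x2 = 0}.

```python
import math
import random

def eval_linear(ell, pt):
    return sum(c * x for c, x in zip(ell, pt))

def eval_gate(gate, pt):
    return math.prod(eval_linear(ell, pt) for ell in gate)

def vanishes_on_W(poly, trials=15):
    """Randomized test of identical vanishing on W = {x1 = x2 = 0},
    by evaluating at random integer points of W."""
    for _ in range(trials):
        pt = (0, 0, random.randint(-50, 50), random.randint(-50, 50))
        if poly(pt) != 0:
            return False
    return True

# No single gate has a factor in span(x1, x2), yet T1 + T2 + T3 vanishes on W:
# mod W the gates become x3(x3+x4), x4(x3+x4), -(x3+x4)^2, which sum to zero.
T1 = [(1, 0, 1, 0), (0, 1, 1, 1)]     # (x1 + x3) * (x2 + x3 + x4)
T2 = [(1, 0, 0, 1), (0, 0, 1, 1)]     # (x1 + x4) * (x3 + x4)
T3 = [(-1, 0, -1, -1), (0, 1, 1, 1)]  # (-x1 - x3 - x4) * (x2 + x3 + x4)

f = lambda pt: eval_gate(T1, pt) + eval_gate(T2, pt) + eval_gate(T3, pt)
```

Spaces of this second kind are exactly the ones that make the analysis of blocking delicate.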
We would like to partition the set S_2 based on the above two possibilities. We say W is in S_2^{easy} if each T_i has a linear factor lying in the kernel of W, and we say it is in S_2^{hard} otherwise. (We are cheating a bit here; our actual definitions of these two sets are a bit more subtle, but for intuition this is good enough. In reality, the set S_2^{hard} is a bit larger. It could be that each T_i has a linear form in the kernel of W, but when we remove those linear forms (taking into account multiplicities), the resulting circuit still vanishes when we go mod the forms defining W. In this case we add W to S_2^{hard}.)
The set is actually quite helpful: it can be used to learn linear forms that appear in the circuit. For dividing , dividing and dividing , suppose that did not belong to since it got blocked by a space in . We show that if the blocking space was a space in , that is usually not a big problem, since the space in can actually be used to learn the space unless one of or is in the kernel of the space. If a large fraction of the spaces are blocked by spaces whose kernel contains one of or , then we learn these linear forms from the intersection of kernels of spaces, so this case turns out not to be a problem either.
The bigger issue is when gets blocked by a space in . We show that this cannot happen too often. Though we cannot say that the union of kernels of spaces in is low dimensional, we can say something close. Consider the maximum number of spaces, , in such that their kernels are completely linearly independent. In other words, the kernels (that are each dimensional) in total span a dimensional space. We show that is at most . This structure ends up being sufficient to show that cannot just block all spaces of the form that we wanted to learn, assuming that each of the ’s started off with enough linearly independent linear forms present in all of them.
Though our target is to learn independent linear forms appearing in one gate, in most cases it is sufficient to learn two independent linear forms (not in kernels of spaces) from one gate. We can then reconstruct the circuit mod these linear forms as circuits using the reconstruction algorithm in [47] and use it to get projections of a high-rank gate, which we can then glue to obtain independent linear forms from one gate.
We now discuss a few additional algorithmic tools that go into the analysis of one specific case that cannot be learned using the above mentioned ideas. Suppose that there is one gate such that all linear forms appearing in it only span a constant dimensional space.
Learning linear forms when some of the gates have low rank
Assume the linear forms appearing in span a dimensional space for some constant . In this case, it can be that and are both empty, and spaces in and block any codimension 3 space from appearing in . We mention a few ingredients that go into the analysis.
We need a new algorithmic insight, since , and might be useless. This method also shows up in the learning of a few other instances (of the structural partitioning) of circuits.
When has only low rank linear forms, it turns out we can then (essentially) compute . Note that we do not have blackbox access to . We would still like to compute . The key observation is that if then, even though we don’t know that is zero, we can still conclude that has few essential variables (see Definition 7), i.e. it can be written as a polynomial depending on constantly many linear forms. In order to compute we will attempt to find all such that has few essential variables. We show that this can be done using our algorithms for finding spaces of a polynomial, combined with a suitable modification of an algorithm by Carlini [11] which can compute the number of essential variables in a polynomial. Once we can compute , we can use intersections of the kernels of these spaces to compute linear forms appearing in one of or . (This is again a bit of a simplification of our algorithm: it could be that the rank of is super constant, or it could be that most of is blocked by the spaces.)
Learning linear forms when is of special form
We say a circuit is of special form if there is a linear form such that when we go modulo , the simple part of the circuit has low rank. Note, this is the case when we don’t know how to bound and hence compute . Thus again new ideas are needed. In this case we inspect the structure of the circuit and show that it must be one of three types. In each of these types, though cannot be learned, we show how to bound and learn an interesting subset of (which we call ) that still contains enough useful codimension 3 spaces to allow us to learn some of the linear forms appearing in . We then again have to consider cases based on whether all gates are high rank or not, and construct learning algorithms similar to the non-special form cases, but with playing the role of .
2.4 Overview: From a few linear forms to reconstructing the entire circuit
Once we learn a few linearly independent linear factors (we will actually ensure we learn linear factors) appearing in one of the gates (say ), the high-level plan is to consider the circuit modulo each of the factors to obtain a projected circuit of top fan-in at most two. This can be learnt using a reconstruction algorithm for circuits from [47], as a unique circuit when the distance between the projected gates is high, or as a circuit when the distance is low. Moreover, by a result of Shpilka [45], enough linearly independent projections of suffice to recover . An extension of this result by Karnin and Shpilka [25] gives a way of recovering from enough projections of low distance .
Comparison to [25]
As described in our overview, once we have a few linear forms from a gate, the process of learning the entire circuit follows closely the outline from [25]. The main difference stems from how the few linear forms are obtained. The main technical contribution of this paper is to show how to efficiently compute these linear forms over large fields. In the works of [25, 45], the authors obtained the linear forms using a brute-force search approach, by searching over all possible linear forms with variables, which took quasi-poly time.
3 Preliminaries
A detailed discussion on the notations and preliminaries can be found in the full version of the paper [41].
3.1 Polynomial Identity Testing and Rank Bounds
A finite set of points with the property that every line through two points of passes through a third point in is called a Sylvester-Gallai configuration. The famous Sylvester-Gallai theorem states that the only Sylvester-Gallai configurations in are those formed by collinear points. This basic theorem about point-line incidences was extended to higher dimensional flats in [23, 10] over the real numbers and in [3, 13] over . We define the rank of a set of vectors to be the dimension of the linear space they span.
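To make the definition concrete, the Sylvester-Gallai condition on a finite planar point set can be checked directly with exact rational arithmetic. The following is a small illustrative sketch (the helper names are ours, not from the paper):

```python
from fractions import Fraction
from itertools import combinations

def collinear(p, q, r):
    # Three points in the plane are collinear iff the cross product
    # of the vectors q - p and r - p vanishes (exact arithmetic).
    (px, py), (qx, qy), (rx, ry) = p, q, r
    return (qx - px) * (ry - py) == (rx - px) * (qy - py)

def is_sylvester_gallai(points):
    # True iff every line through two points of the set passes
    # through a third point of the set.
    for p, q in combinations(points, 2):
        if not any(collinear(p, q, r) for r in points if r != p and r != q):
            return False
    return True

# Collinear points form the only Sylvester-Gallai configurations over the reals:
line = [(Fraction(i), Fraction(2 * i)) for i in range(4)]
triangle = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(0)),
            (Fraction(0), Fraction(1))]
print(is_sylvester_gallai(line))      # True
print(is_sylvester_gallai(triangle))  # False
```

The brute-force pairwise check is quadratic in the number of lines; it is meant only to illustrate the incidence condition, not the rank bounds discussed next.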
Definition 4 ().
Let be a set of non-zero vectors in without multiples, i.e. no two vectors in are scalar multiples of each other. Suppose that for every set of linearly independent vectors, the linear span of contains at least vectors of . Then the set is said to be -closed. The largest possible rank of an -closed set of at most vectors in (for any ) is denoted by .
Over the field of real numbers, it is known that [23, 10]. The rank of high dimensional Sylvester-Gallai configurations over was bounded by for a fixed constant in [3]. This bound was further improved to (for a fixed constant ) in [13].
The polynomial time blackbox PIT algorithms for circuits over and follow from some strong structural properties of identically zero circuits. In [27] it was shown that the rank of any identically zero, simple and minimal circuit is at most some constant depending on . This bound was improved in [42, 43], and the theorem below gives the best bound we know.
Theorem 5 ([43]).
Let be a circuit, over field , that is simple, minimal and zero. Then, we have .
Combining the above theorem with the best bounds we know for and we obtain the following,
Theorem 6.
Let be a simple, minimal and identically zero circuit over or . Then there is an absolute constant depending only on such that . If is over then we can bound by . If is over then we can bound by for some absolute constant .
3.2 Essential Variables of a Polynomial
This notion will be useful in reconstruction when the input circuit is low rank, as well as when one or more gates are low rank.
Definition 7 ([28], Essential Variables).
The number of essential variables in is the smallest such that there exists an invertible linear transformation on the variables such that depends on only variables.
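In characteristic zero this quantity has a well-known linear-algebraic characterization, which underlies the algorithm of Carlini [11] invoked later: the number of essential variables of a polynomial equals the dimension of the span of its first-order partial derivatives. A minimal sketch over the rationals (the dict-of-monomials representation and all helper names are ours):

```python
from fractions import Fraction

def partial(poly, i):
    # Partial derivative w.r.t. variable i of a polynomial stored as
    # {exponent-tuple: coefficient}.
    out = {}
    for mono, c in poly.items():
        if mono[i] > 0:
            m = list(mono)
            m[i] -= 1
            key = tuple(m)
            out[key] = out.get(key, Fraction(0)) + c * mono[i]
    return out

def rank(rows):
    # Rank of the span of the given polynomials (rows), via Gaussian
    # elimination over the rationals, one column per monomial.
    cols = sorted({m for r in rows for m in r})
    mat = [[Fraction(r.get(m, 0)) for m in cols] for r in rows]
    rk = 0
    for col in range(len(cols)):
        piv = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if piv is None:
            continue
        mat[rk], mat[piv] = mat[piv], mat[rk]
        for i in range(len(mat)):
            if i != rk and mat[i][col] != 0:
                f = mat[i][col] / mat[rk][col]
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk

def essential_variables(poly, n):
    # Characteristic-0 fact: the number of essential variables equals
    # the dimension of the span of the first-order partials.
    return rank([partial(poly, i) for i in range(n)])

# f = (x + y)^2 viewed as a polynomial in 3 variables (x, y, z):
f = {(2, 0, 0): Fraction(1), (1, 1, 0): Fraction(2), (0, 2, 0): Fraction(1)}
print(essential_variables(f, 3))  # 1: f depends on the single linear form x + y
```

This dense symbolic computation is for intuition only; the paper works with blackbox access, where the same quantity is extracted via the (modified) Carlini procedure.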
4 Upper bounding the size of and
In this section, we will show that if a degree polynomial is computed by a circuit then, other than in certain degenerate cases, it will “vanish” on only finitely many codimension 3 subspaces. In the next sections we will show how to compute these subspaces and then how to extract the linear forms of from these subspaces.
Given linearly independent linear forms , let denote the codimension subspace of corresponding to those vectors where evaluate to . We say that is a vanishing space for a polynomial if vanishes on all points of .
Equivalently, consider any invertible linear transformation such that for all . Let . Then setting to in results in the identically 0 polynomial.
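Given the defining linear forms, whether a subspace is a vanishing space can be tested with blackbox access alone: parameterize the kernel and evaluate the polynomial at random points of it, relying on the Schwartz-Zippel lemma [44, 50]. A hedged sketch (representing linear forms as coefficient rows; all helper names are our own):

```python
import random
from fractions import Fraction

def nullspace_basis(forms, n):
    # Reduced row echelon form of the k x n system, then read off
    # one basis vector of the kernel per free column.
    mat = [list(map(Fraction, row)) for row in forms]
    pivots, rk = [], 0
    for col in range(n):
        piv = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if piv is None:
            continue
        mat[rk], mat[piv] = mat[piv], mat[rk]
        mat[rk] = [a / mat[rk][col] for a in mat[rk]]
        for i in range(len(mat)):
            if i != rk and mat[i][col] != 0:
                f = mat[i][col]
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[rk])]
        pivots.append(col)
        rk += 1
    basis = []
    for free in range(n):
        if free in pivots:
            continue
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -mat[r][free]
        basis.append(v)
    return basis

def vanishes_on(blackbox, forms, n, trials=20):
    # Randomized Schwartz-Zippel style test: sample random points of
    # the kernel V(forms) and evaluate the blackbox polynomial there.
    basis = nullspace_basis(forms, n)
    for _ in range(trials):
        coeffs = [Fraction(random.randint(-10**6, 10**6)) for _ in basis]
        pt = [sum(c * v[j] for c, v in zip(coeffs, basis)) for j in range(n)]
        if blackbox(pt) != 0:
            return False
    return True

# Example: f = x0*x1 + x0*x2 = x0*(x1 + x2) vanishes on V(x1 + x2), not on V(x1).
def f(x):
    return x[0] * x[1] + x[0] * x[2]

print(vanishes_on(f, [[0, 1, 1]], 3))  # True
print(vanishes_on(f, [[0, 1, 0]], 3))  # False (with overwhelming probability)
```

The error of the test is one-sided: a true vanishing space is always accepted, while a non-vanishing one survives all trials only with negligible probability.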
For a polynomial defined over , we will define to be the set of codimension subspaces over which vanishes.
We would like to define to be the set of codimension subspaces over which vanishes and try to show that this is finite. However note that if has even one codimension subspace on which it vanishes, then there will be infinitely many codimension subspaces on which it vanishes, since for any , vanishes on every single codimension 2 subspace of . Thus when we define , we will not consider such subspaces.
Let .
Note that any is of the form for some two linear forms . Moreover any two independent linear forms in the span of and will result in the same space .
Similarly we define to be the set of codimension subspaces over which vanishes such that is not contained in any subspace from or .
.
Note again that any is of the form for some three linearly independent linear forms . Moreover any three independent linear forms in the span of , and will result in the same space .
Lemma 8.
Let be a degree polynomial. Then, .
Proof.
The proof is quite simple. Any linear form such that vanishes on must be a linear factor of . Since has degree , it can have at most distinct factors.
We will also show how to bound the size of and under additional structural assumptions on . Note that if is computed by an circuit of the form , then by picking one linear form from each of the three multiplication gates, we can obtain codimension 3 spaces on which vanishes. Learning these spaces will eventually allow us to learn the linear forms, and it will be an important ingredient of our final algorithm. Note however that might have other codimension 3 vanishing subspaces that are not of this form. The main goal of this section is to show that nevertheless under some structural assumptions, we are able to bound the number of these spaces.
Before we state and prove our upper bounds for and spaces, we state and prove a slightly modified version of a result from [48] that will be useful for us.
Claim 9.
Let be distinct 2-dim subspaces of some vector space such that
and is not contained in . Suppose that is a 2-dim subspace such that for each , . Then either
-
and or
-
and .
Proof.
Case (1): . Then since and are distinct, clearly and .
Case (2): .
Here, we have , but as , we have . As and together span a space of dimension and , we have . Now as , we have . But we also know , and therefore and , which means.
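Dimension counts of the kind used in Claim 9 (and again in Claim 12 below) all reduce to the modular identity dim(U ∩ W) = dim(U) + dim(W) − dim(U + W). A small exact-arithmetic helper illustrating it (names are ours):

```python
from fractions import Fraction

def rank(vectors):
    # Rank over the rationals via Gaussian elimination.
    mat = [list(map(Fraction, v)) for v in vectors]
    if not mat:
        return 0
    rk, n = 0, len(mat[0])
    for col in range(n):
        piv = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if piv is None:
            continue
        mat[rk], mat[piv] = mat[piv], mat[rk]
        for i in range(rk + 1, len(mat)):
            if mat[i][col] != 0:
                f = mat[i][col] / mat[rk][col]
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk

def dim_intersection(U, W):
    # dim(U ∩ W) = dim(U) + dim(W) - dim(U + W), where U, W are given
    # by spanning sets of vectors.
    return rank(U) + rank(W) - rank(U + W)

# Two distinct 2-dim subspaces whose sum is 3-dimensional meet in a line:
U = [[1, 0, 0], [0, 1, 0]]
W = [[0, 1, 0], [0, 0, 1]]
print(dim_intersection(U, W))  # 1
```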
4.1 Bounding the number of vanishing codimension 2 subspaces
Lemma 10.
Let be a field that is or . Let be a degree , variate polynomial in . Then there is an absolute constant such that if is computed by a circuit with , then
Proof.
We will be dividing our analysis into the following cases.
-
1.
Case 1: We bound the number of such that for some , vanishes on .
Wlog . Note that if vanishes on , then there must be some such that divides . Note that since for any , is nonzero, and moreover there is some linear form with such that . There are at most choices for from the factors of , and at most choices for (once we fix ). Therefore, there are at most possibilities for .
-
2.
Case 2: We bound the number of such that no vanishes on .
Consider which is of the form where is a simple, minimal circuit computing the identically zero polynomial and is a product of linear forms. By the rank bound given in Theorem 6, where is some absolute constant. Over the real numbers .
Let be any constant greater than . Therefore, . When we consider , the linear forms appearing in the gates of get mapped to linear forms in or in . The rank of those linear forms that get mapped to can be at most . Thus the rank of the set of linear forms that gets mapped to is at least 8.
Consider 3 linear forms (not all the same) from distinct product gates of such that when we consider them , they map to the same linear form (we don’t distinguish between a linear form and its multiple) and hence all get mapped to the linear forms of . Call such triple of linear forms a “collapsing” triple when we go . These linear forms must be of the form
where the denote field constants.
A collapsing triple can either span a 2 or 3 dimensional space. Now, since the rank of linear forms mapping to is at least , one of the following two scenarios must occur.
-
Case 2(a): There are two collapsing triples and such that each spans a 3-dim space and jointly the two triples span at least a 4-dim space.
Let and . Let . Since and both become 1-dimensional when is projected to 0, it must hold that and . Moreover, as and are distinct 3-dim spaces, it must hold that . Thus and determine . In particular, the two triples of linear forms determine .
Since there can be possibilities for each of , we have possibilities for , and hence for .
-
Case 2(b): The collapsing triples that span 2-dim spaces have combined rank at least 5.
Clearly there must be at least 3 collapsing triples. Let , and be the spans of 3 of the triples such that . Since , and all become 1-dimensional when is projected to 0, it follows that each of , , intersects in a 1-dim space. By Claim 9, it follows that knowing , and is enough to determine a vector . Once is determined, the rest of the argument is similar to Case 1: is nonzero, and moreover there is some linear form with such that . There are at most choices for from the factors of . Since there were at most possibilities for , and , and hence at most possibilities for the choice of , overall there are at most possibilities for and hence for .
-
4.2 Bounding the number of vanishing codimension 3 subspaces
Lemma 11.
Let be a field that is or . Let be a degree , variate polynomial in . Then there is an absolute constant such that if is computed by a circuit with the following properties
-
1.
,
-
2.
There is no linear form such that is nonzero and .
then
Proof.
We will be dividing our analysis into the following cases.
-
1.
Case 1: We bound the number of such that for some , vanishes on .
Note that if vanishes on , then there must be some such that divides . Note that since for any , is nonzero. Moreover by assumption, .
Observe that must vanish when we consider it for any such that . Thus once is fixed, any such is a vanishing codimension 2 space for . By Lemma 10, there are at most choices for . Given that there are at most choices for , there are in total at most possibilities for in this case.
-
2.
Case 2: We bound the number of such that no vanishes on .
Consider which is of the form where is a simple, minimal circuit computing the identically zero polynomial and is a product of linear forms. By the rank bound given in Theorem 6, where is some absolute constant. Over the real numbers .
Let be any constant greater than . Therefore, . When we consider , the linear forms appearing in the gates of get mapped to linear forms in or in . The rank of those linear forms that get mapped to can be at most . Thus the rank of the set of linear forms that gets mapped to is at least 14.
Note that by assumption, there is no linear form such that is nonzero and . However there could exist two independent linear forms , such that . We consider two further sub-cases based on whether this happens or not.
-
(a)
Case 2(a): We bound the number of such that no vanishes on but there exists two independent linear forms , such that .
The analysis for this case is almost identical to that of Case 2 in Lemma 10. By the analysis of Case 2 in Lemma 10, the number of spaces such that is at most . For each fixing of we now count the number of such that . Any such is of the form . Note that since no space in is contained in any space in , we only need to consider those such that . Since but , must be a linear factor of , and hence there can be only choices for . Thus in this case there are at most possibilities for .
-
(b)
Case 2(b): We bound the number of such that no vanishes on and there do not exist two independent linear forms , such that .
Consider 3 linear forms (not all the same) from distinct product gates of such that when we consider them , they map to the same linear form (we don’t distinguish between a linear form and its multiple) and hence all get mapped to the linear forms of . Call such triple of linear forms a “collapsing” triple when we go . These linear forms must be of the form
where the denote field constants.
A collapsing triple can either span a 2 or 3 dimensional space. Now, since the rank of linear forms mapping to is at least , one of the following two scenarios must occur.
Either (i) the set of collapsing triples such that each spans a 3-dim space has combined rank (over all the triples) at least 7, or (ii) the set of collapsing triples such that each spans a 2-dim space has combined rank at least 8.
We will separately analyze both these subcases.
-
Case 2(b)(i): The set of collapsing triples such that each spans a 3-dim space has combined rank (over all the triples) of at least 7.
Let be the vector spaces spanned by three of the triples such that the , and are distinct and . Three such vector spaces must exist.
Now since going maps these triples to a line , letting
, it must hold that for all , . It follows that for all , . Since the spaces are distinct, can equal or . Now if for any , then this intersection must be contained in . Thus knowing and determines a 1-dim subspace of . Let be the linear form representing this space. Since is determined by two of the triples, there are at most possibilities for . Once is determined, we consider . This is nonzero and by assumption, the rank of its simple part is at least . By Lemma 10 there are only choices of co-dim 2 subspaces modulo which vanishes (we disregard those co-dim 2 spaces which contain a co-dim 1 space on which vanishes). Thus in total, in this case there are at most choices of .
We still need to consider the case when for all distinct , . We first prove the following simple claim.
Claim 12.
Let be distinct 3-dim subspaces of some vector space such that . Suppose that for any distinct , . Then there exists a subspace such that for any distinct , .
Proof.
Since are distinct 3-dimensional spaces with , we have . As , we have . If intersects and in distinct 2-dimensional spaces, then . But and as , , which is a contradiction. Therefore, intersects and in the same 2-dimensional plane, let it be . Moreover, this means and , and therefore . As , we have .
By the above claim, it follows that for some 2-dim space . Moreover we know that is a 3-dim space intersecting each in a 2-dim space. It must hold that . If not, then . Thus needs to still intersect each in a vector outside of . These three additional vectors will be distinct and also linearly independent since . This is not possible since .
Thus . Hence and determine , and hence determine a 2-dim subspace of . Since is determined by two of the triples, there are at most possibilities for . Once we fix , it remains to find the number of such that . When we go mod , we can assume that does not vanish (since we are only counting those that are not contained in an space), and hence for each fixing of , since has at most linear factors, there are at most possibilities for .
-
Case 2(b)(ii): The set of collapsing triples such that each spans a 2-dim space, has combined rank at least 8.
Let be the vector spaces spanned by four of the triples such that the , , and are distinct and . Moreover and . Four such vector spaces must exist.
Now since going maps these triples to a single line , letting , it must hold that for all , .
There are three subcases that we will consider. In each case we show that the knowledge of the four subspaces allows us to determine a single linear form . Hence there are at most choices for . Once we find and fix
, we consider . This is nonzero and by assumption, the rank of its simple part is at least . By Lemma 10 there are only choices of co-dim 2 subspaces modulo which vanishes (we disregard those co-dim 2 spaces which contain a co-dim 1 space on which vanishes). Thus in total, the number of choices of is .
-
–
subcase 1: , are such that .
In particular . Thus knowing and determines a 1-dim subspace of .
-
–
subcase 2: The 2-dim space defined by contains the line . Since , we have and .
Using Claim 9, it follows that and hence contains the line .
-
–
subcase 3: i.e. the three 1-dim intersections are independent. But as and , we have , which means . As , we have . By assumption, , and therefore . So is a 1-dim space that is contained in , and is determined by knowing and .
References
- [1] M. Agrawal and V. Vinay. Arithmetic circuits: A chasm at depth four. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 67–75, 2008.
- [2] D. Angluin. Queries and concept learning. Machine Learning, 2:319–342, 1988.
- [3] Boaz Barak, Zeev Dvir, Avi Wigderson, and Amir Yehudayoff. Fractional Sylvester–Gallai theorems. Proceedings of the National Academy of Sciences, 110(48):19213–19219, 2013.
- [4] A. Beimel, F. Bergadano, N. H. Bshouty, E. Kushilevitz, and S. Varricchio. Learning functions represented as multiplicity automata. J. ACM, 47(3):506–530, 2000. doi:10.1145/337244.337257.
- [5] M. Ben-Or and P. Tiwari. A deterministic algorithm for sparse multivariate polynominal interpolation. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing (STOC), pages 301–309, 1988.
- [6] Vishwas Bhargava, Ankit Garg, Neeraj Kayal, and Chandan Saha. Learning generalized depth three arithmetic circuits in the non-degenerate case. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022), volume 245 of LIPIcs, pages 21:1–21:22. Schloss-Dagstuhl-Leibniz Zentrum für Informatik, 2022. doi:10.4230/LIPICS.APPROX/RANDOM.2022.21.
- [7] Vishwas Bhargava, Shubhangi Saraf, and Ilya Volkovich. Reconstruction of depth-4 multilinear circuits. SODA 2020, 2020.
- [8] Vishwas Bhargava, Shubhangi Saraf, and Ilya Volkovich. Reconstruction algorithms for low-rank tensors and depth-3 multilinear circuits. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 809–822, 2021. doi:10.1145/3406325.3451096.
- [9] Vishwas Bhargava and Devansh Shringi. Faster & deterministic FPT algorithm for worst-case tensor decomposition. Electron. Colloquium Comput. Complex., TR24-123, 2024. URL: https://eccc.weizmann.ac.il/report/2024/123.
- [10] William E. Bonnice and Michael Edelstein. Flats associated with finite sets in . Niew. Arch. Wisk, 15:11–14, 1967.
- [11] Enrico Carlini. Reducing the number of variables of a polynomial. In Algebraic geometry and geometric modeling, pages 237–247. Springer, 2006. doi:10.1007/978-3-540-33275-6_15.
- [12] Pranjal Dutta, Prateek Dwivedi, and Nitin Saxena. Deterministic Identity Testing Paradigms for Bounded Top-Fanin Depth-4 Circuits. In Valentine Kabanets, editor, 36th Computational Complexity Conference (CCC 2021), volume 200 of Leibniz International Proceedings in Informatics (LIPIcs), pages 11:1–11:27, Dagstuhl, Germany, 2021. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2021.11.
- [13] Zeev Dvir, Shubhangi Saraf, and Avi Wigderson. Improved rank bounds for design matrices and a new proof of Kelly’s theorem. In Forum of Mathematics, Sigma, volume 2, page e4. Cambridge University Press, 2014.
- [14] Zeev Dvir and Amir Shpilka. Locally decodable codes with 2 queries and polynomial identity testing for depth 3 circuits. In Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’05, pages 592–601, New York, NY, USA, 2005. Association for Computing Machinery. doi:10.1145/1060590.1060678.
- [15] M. A. Forbes and A. Shpilka. Quasipolynomial-time identity testing of non-commutative and read-once oblivious algebraic branching programs. Electronic Colloquium on Computational Complexity (ECCC), 19:115, 2012.
- [16] Abhibhav Garg, Rafael Oliveira, Shir Peleg, and Akash Kumar Sengupta. Radical Sylvester-Gallai theorem for tuples of quadratics. In 38th Computational Complexity Conference (CCC 2023), pages 20:1–20:30. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.CCC.2023.20.
- [17] Abhibhav Garg, Rafael Oliveira, and Akash Kumar Sengupta. Robust Radical Sylvester-Gallai Theorem for Quadratics. In Xavier Goaoc and Michael Kerber, editors, 38th International Symposium on Computational Geometry (SoCG 2022), volume 224 of Leibniz International Proceedings in Informatics (LIPIcs), pages 42:1–42:13, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.SoCG.2022.42.
- [18] Ankit Garg, Neeraj Kayal, and Chandan Saha. Learning sums of powers of low-degree polynomials in the non-degenerate case. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 889–899. IEEE, 2020. doi:10.1109/FOCS46700.2020.00087.
- [19] A. Gupta, P. Kamath, N. Kayal, and R. Saptharishi. Arithmetic circuits: A chasm at depth three. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 578–587, 2013. doi:10.1109/FOCS.2013.68.
- [20] A. Gupta, N. Kayal, and S. V. Lokam. Efficient reconstruction of random multilinear formulas. In IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS, pages 778–787, 2011. doi:10.1109/FOCS.2011.70.
- [21] A. Gupta, N. Kayal, and S. V. Lokam. Reconstruction of depth-4 multilinear circuits with top fanin 2. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing (STOC), pages 625–642, 2012. Full version at https://eccc.weizmann.ac.il/report/2011/153.
- [22] A. Gupta, N. Kayal, and Y. Qiao. Random arithmetic formulas can be reconstructed efficiently. Computational Complexity, 23(2):207–303, 2014. doi:10.1007/s00037-014-0085-0.
- [23] Sten Hansen. A generalization of a theorem of Sylvester on the lines determined by a finite point set. Mathematica Scandinavica, 16(2):175–180, 1965.
- [24] Zohar S Karnin and Amir Shpilka. Black box polynomial identity testing of generalized depth-3 arithmetic circuits with bounded top fan-in. In 2008 23rd Annual IEEE Conference on Computational Complexity, pages 280–291. IEEE, 2008.
- [25] Zohar S Karnin and Amir Shpilka. Reconstruction of generalized depth-3 arithmetic circuits with bounded top fan-in. In 2009 24th Annual IEEE Conference on Computational Complexity, pages 274–285. IEEE, 2009.
- [26] N. Kayal, V. Nair, C. Saha, and S. Tavenas. Reconstruction of full rank algebraic branching programs. In 32nd Computational Complexity Conference, CCC 2017., pages 21:1–21:61, 2017. doi:10.4230/LIPIcs.CCC.2017.21.
- [27] N. Kayal and S. Saraf. Blackbox polynomial identity testing for depth 3 circuits. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 198–207, 2009. Full version at https://eccc.weizmann.ac.il/report/2009/032.
- [28] Neeraj Kayal. Efficient algorithms for some special cases of the polynomial equivalence problem. In Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete algorithms, pages 1409–1421. SIAM, 2011. doi:10.1137/1.9781611973082.108.
- [29] Neeraj Kayal, Vineet Nair, and Chandan Saha. Average-case linear matrix factorization and reconstruction of low width algebraic branching programs. computational complexity, 28:749–828, 2019. doi:10.1007/S00037-019-00189-0.
- [30] Neeraj Kayal and Chandan Saha. Reconstruction of non-degenerate homogeneous depth three circuits. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 413–424, 2019. doi:10.1145/3313276.3316360.
- [31] A. Klivans and A. Shpilka. Learning restricted models of arithmetic circuits. Theory of computing, 2(10):185–206, 2006. doi:10.4086/TOC.2006.V002A010.
- [32] A. Klivans and D. Spielman. Randomness efficient identity testing of multivariate polynomials. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing (STOC), pages 216–223, 2001.
- [33] P. Koiran. Arithmetic circuits: the chasm at depth four gets wider. CoRR, abs/1006.4700, 2010. arXiv:1006.4700.
- [34] Nutan Limaye, Srikanth Srinivasan, and Sébastien Tavenas. Superpolynomial lower bounds against low-depth algebraic circuits. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 804–814, 2022. doi:10.1109/FOCS52979.2021.00083.
- [35] Rafael Oliveira and Akash Kumar Sengupta. Radical Sylvester-Gallai theorem for cubics. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 212–220. IEEE, 2022. doi:10.1109/FOCS54457.2022.00027.
- [36] Rafael Oliveira and Akash Kumar Sengupta. Strong algebras and radical Sylvester-Gallai configurations. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pages 95–105, 2024. doi:10.1145/3618260.3649617.
- [37] Shir Peleg and Amir Shpilka. Polynomial time deterministic identity testing algorithm for circuits via Edelstein–Kelly type theorem for quadratic polynomials. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 259–271, 2021. doi:10.1145/3406325.3451013.
- [38] Shir Peleg and Amir Shpilka. A generalized Sylvester–Gallai-type theorem for quadratic polynomials. In Forum of Mathematics, Sigma, volume 10, page e112. Cambridge University Press, 2022.
- [39] Shir Peleg and Amir Shpilka. Robust Sylvester-Gallai Type Theorem for Quadratic Polynomials. In Xavier Goaoc and Michael Kerber, editors, 38th International Symposium on Computational Geometry (SoCG 2022), volume 224 of Leibniz International Proceedings in Informatics (LIPIcs), pages 43:1–43:15, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.SoCG.2022.43.
- [40] Shir Peleg, Amir Shpilka, and Ben Lee Volk. Tensor Reconstruction Beyond Constant Rank. In Venkatesan Guruswami, editor, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024), volume 287 of Leibniz International Proceedings in Informatics (LIPIcs), pages 87:1–87:20, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ITCS.2024.87.
- [41] Shubhangi Saraf and Devansh Shringi. Reconstruction of depth arithmetic circuits with top fan-in . Electron. Colloquium Comput. Complex., TR25-008, 2025. URL: https://eccc.weizmann.ac.il/report/2025/008.
- [42] Nitin Saxena and Comandur Seshadhri. Blackbox identity testing for bounded top fanin depth-3 circuits: the field doesn’t matter. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 431–440, 2011.
- [43] Nitin Saxena and Comandur Seshadhri. From Sylvester-Gallai configurations to rank bounds: Improved blackbox identity test for depth-3 circuits. Journal of the ACM (JACM), 60(5):1–33, 2013. doi:10.1145/2528403.
- [44] Jacob T Schwartz. Fast probabilistic algorithms for verification of polynomial identities. Journal of the ACM (JACM), 27(4):701–717, 1980. doi:10.1145/322217.322225.
- [45] Amir Shpilka. Interpolation of depth-3 arithmetic circuits with two multiplication gates. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 284–293, 2007. doi:10.1145/1250790.1250833.
- [46] Amir Shpilka. Sylvester-Gallai type theorems for quadratic polynomials. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 1203–1214, 2019. doi:10.1145/3313276.3316341.
- [47] Gaurav Sinha. Reconstruction of Real Depth-3 Circuits with Top Fan-In 2. In Ran Raz, editor, 31st Conference on Computational Complexity (CCC 2016), volume 50 of Leibniz International Proceedings in Informatics (LIPIcs), pages 31:1–31:53, Dagstuhl, Germany, 2016. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CCC.2016.31.
- [48] Gaurav Sinha. Efficient reconstruction of depth three arithmetic circuits with top fan-in two. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022), pages 118:1–118:33. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.ITCS.2022.118.
- [49] S. Tavenas. Improved bounds for reduction to depth 4 and depth 3. In MFCS, pages 813–824, 2013. doi:10.1007/978-3-642-40313-2_71.
- [50] Richard Zippel. Probabilistic algorithms for sparse polynomials. In International symposium on symbolic and algebraic manipulation, pages 216–226. Springer, 1979. doi:10.1007/3-540-09519-5_73.
