
Algebraic Metacomplexity and Representation Theory

Maxim van den Berg, Ruhr University Bochum, Germany, and University of Amsterdam, The Netherlands
Pranjal Dutta, Nanyang Technological University, Singapore
Fulvio Gesmundo, University of Toulouse, France
Christian Ikenmeyer, University of Warwick, UK
Vladimir Lysikov, Ruhr University Bochum, Germany
Abstract

In the algebraic metacomplexity framework we prove that the decomposition of metapolynomials into their isotypic components can be implemented efficiently, namely with only a quasipolynomial blowup in the circuit size. We use this to resolve an open question posed by Grochow, Kumar, Saks & Saraf (2017). Our result means that many existing algebraic complexity lower bound proofs can be efficiently converted into isotypic lower bound proofs via highest weight metapolynomials, a notion studied in geometric complexity theory. In the context of algebraic natural proofs, it means that without loss of generality algebraic natural proofs can be assumed to be isotypic. Our proof is built on the Poincaré–Birkhoff–Witt theorem for Lie algebras and on Gelfand–Tsetlin theory, for which we give the necessary comprehensive background.

Keywords and phrases:
Algebraic complexity theory, metacomplexity, representation theory, geometric complexity theory
Funding:
Maxim van den Berg: supported by the European Research Council through an ERC Starting Grant (grant agreement No. 101040907, SYMOPTIC) and the Dutch National Growth Fund (NGF) as part of the Quantum Delta NL visitor programme.
Pranjal Dutta: The work was done while PD was at the National University of Singapore, funded by NRF Singapore under the project “Computational Hardness of Lattice Problems and Implications” [NRF-NRFI09-0005].
Christian Ikenmeyer: supported by EPSRC grants EP/W014882/1 and EP/W014882/2.
Vladimir Lysikov: supported by the European Research Council through an ERC Starting Grant (grant agreement No. 101040907, SYMOPTIC).
Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.
Copyright and License:
© Maxim van den Berg, Pranjal Dutta, Fulvio Gesmundo, Christian Ikenmeyer, and Vladimir Lysikov; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Algebraic complexity theory
Acknowledgements:
We are grateful to Alessandro Danelon and Nutan Limaye for helpful discussions. We thank anonymous referees for their comments, and especially for bringing our attention to Open Question 2 in [40].
Editors:
Srikanth Srinivasan

1 Introduction

Strong computational complexity lower bounds are often difficult to prove, for many different complexity measures. In most cases, the inability to prove lower bounds can be explained by barrier results such as the Relativization barrier due to Baker, Gill and Solovay [5], the Algebrization barrier due to Aaronson and Wigderson [1] and the Natural Proofs barrier due to Razborov and Rudich [66]. Metacomplexity studies the complexity of computing various complexity measures, such as the minimum circuit complexity of a function. One eventual goal of the metacomplexity program is to understand mathematically why lower bounds are hard to prove. In this paper, we study metacomplexity in the algebraic circuit complexity setup from a representation-theoretic viewpoint.

Algebraic complexity focuses on proving complexity lower bounds for various problems related to the computation of multivariate polynomials and rational functions. Prominent examples include the discrete Fourier transform, matrix multiplication, and solving systems of linear equations; see [14]. In algebraic complexity, the main model of computation is the algebraic circuit, which is a labeled directed graph encoding an algorithm to evaluate a polynomial. Most polynomials require large algebraic circuits; however, exhibiting explicit polynomials with this property is difficult. Valiant’s flagship conjecture [71], known as $\mathsf{VP} \neq \mathsf{VNP}$, states that the algebraic circuit complexity of the permanent polynomial $\mathsf{per}_d(x_{1,1},\dots,x_{d,d}) = \sum_{\pi\in\mathfrak{S}_d}\prod_{i=1}^d x_{i,\pi(i)}$ is not polynomially bounded as a function of $d$. This conjecture is an algebraic analogue of the $\mathsf{P} \neq \mathsf{NP}$ conjecture: indeed, the permanent of the adjacency matrix of a bipartite graph counts the number of perfect matchings, which is a $\#\mathsf{P}$-hard problem.

Many current proof techniques in algebraic complexity theory are implicitly based on metapolynomials. Examples of such techniques are rank methods, dating back to [70], and used more recently in complexity theory both for polynomials [62, 55] and for tensors [69, 54], methods based on geometric features [53, 36], and the geometric complexity program [60, 59]. We refer to Section 2.2 for a more extensive discussion. Informally, metapolynomials are polynomials whose variables are coefficients of polynomials. The concept dates back to resultants and invariants of forms [20, 70], while the term “metapolynomial” was coined in [67] and made popular in [40]. A classical example is the metapolynomial $b^2 - 4ac$ in the coefficients of a univariate degree 2 polynomial $ax^2 + bx + c$, which vanishes exactly when the two roots of the polynomial coincide. Grochow pointed out in [39] that for many border complexity measures the complexity lower bounds can without loss of generality be proved by metapolynomials that lie in separating modules, or are isotypic, or are highest weight vectors. These are all closely related concepts (and we prove analogous results for all of them), where being isotypic is the most natural one because it does not depend on choosing any bases. Being a homogeneous metapolynomial is the same as being isotypic with respect to the action of $1\times 1$ matrices. Being isotypic also generalizes the property of being invariant under the special linear group. While invariants can have large circuit size, see [34], we prove that not all separating isotypic metapolynomials have large circuit size, unless all separating metapolynomials have large circuit size. More precisely, we prove that without loss of generality one may assume the metapolynomial to be isotypic, or to be a highest weight vector, or to lie in a separating module, with only a quasipolynomial blowup in the circuit size. We use this to answer Open Question 2 in [40] by proving that all metapolynomials that lie in an irreducible representation have the same circuit complexity up to quasipolynomial blowup, see Corollary 1.2.

Our result reduces the search space for lower bounds from all homogeneous metapolynomials to the isotypic metapolynomials, which is a technique for example used in [2, 15, 16, 43, 3, 23, 11]. In particular the large-scale cluster computations in [3, 23, 11] would be infeasible without this search space reduction. For example, [43] reduces the search space for proving that the border complexity of the $2\times 2$ matrix multiplication tensor is 7 to a 4-dimensional linear subspace of isotypic metapolynomials in a $\binom{64+20-1}{20} = 8{,}179{,}808{,}679{,}272{,}664{,}720$-dimensional space of homogeneous degree 20 metapolynomials on $4\times 4\times 4$ tensors. On the reduced search space, the problem is then easily solved by linear algebra. This specific approach was used in [8] to prove that the border support rank of the $2\times 2$ matrix multiplication tensor is 7. To this date, this is the only proof of this fact. Similar methods are used in [63, 37], where the polynomials of interest are invariants for the action of a special linear group: this information is crucially used to reduce the search space and determine the polynomial via interpolation methods on a small space. These are finite results, but the techniques work in the asymptotic settings as well, for example for lower bounds on the rank of matrix multiplication as in [15, 16], or the best known lower bound for the border determinantal complexity of the permanent [53].

Our result also points to potential future lower bounds: Our method can be used to extract isotypic metapolynomials for example from rank-based lower bounds proofs, and the obtained metapolynomials give at least the same lower bounds, but potentially better ones. It is conceivable that these metapolynomials can be adjusted to not be affected by barriers for rank methods [26, 33].

Our procedure uses explicit efficient circuit constructions and hence makes a contribution to algorithmic representation theory that is of independent interest.

Metapolynomials

We work over the field $\mathbb{C}$ of complex numbers, and the set $\{x_1, x_2, \dots\}$ of variables. For every sequence $\boldsymbol{i}$ of nonnegative integers with $\sum_j i_j < \infty$, a monomial $x^{\boldsymbol{i}}$ is the finite product of variables $x_1^{i_1}x_2^{i_2}\cdots$, and a polynomial $f$ is a finite linear combination of monomials. For every monomial $x^{\boldsymbol{i}} = x_1^{i_1}x_2^{i_2}\cdots$, define a metavariable $c_{\boldsymbol{i}}$. A metavariable $c_{\boldsymbol{i}}$ has degree 1, and it has weight $\boldsymbol{i}$. Finite products of metavariables are called metamonomials, and finite linear combinations of metamonomials are called metapolynomials. The weight of a metamonomial is the sum of the weights of its metavariables. If every metamonomial in a metapolynomial $\Delta$ has the same weight, then $\Delta$ is called a weight metapolynomial (and the zero metapolynomial has every weight). For a fixed weight $w$ the metapolynomials of that weight form a vector space, the weight $w$ space. Every metapolynomial can be written as a unique sum of weight metapolynomials of different weights. A metapolynomial is called affine linear if all its metamonomials have degree at most 1, i.e., are just a metavariable or a constant. Every metapolynomial $\Delta$ can be evaluated at any polynomial $f$ by assigning to each metavariable the coefficient of the corresponding monomial in $f$. For example, the discriminant of a degree two homogeneous polynomial in two variables $c_{20}x_0^2 + c_{11}x_0x_1 + c_{02}x_1^2$ is the metapolynomial $\Delta = c_{11}^2 - 4c_{20}c_{02}$. It is a classical fact that $\Delta(f) = 0$ if and only if $f = \ell^2$ for some homogeneous linear polynomial $\ell = \alpha_0 x_0 + \alpha_1 x_1$. In principle, metapolynomials can involve metavariables that correspond to monomials of different degrees, but for our purposes it is sufficient to consider the case where all metavariables correspond to monomials of the same degree $d$, see Remark 1.3. Every such metapolynomial can be decomposed uniquely as a sum of its homogeneous degree $\delta$ components, and we call a homogeneous degree $\delta$ metapolynomial a metapolynomial of format $(\delta, d, k)$ if all its metavariables only involve the variables $x_1, \dots, x_k$.
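As a concrete illustration (data representation and names are ours, not from the paper), the following Python snippet stores a binary quadratic form by its coefficients and evaluates the discriminant metapolynomial on it.

```python
# A polynomial is stored as a dictionary mapping exponent vectors to coefficients.
# Here: f = 3*x0^2 + 5*x0*x1 + 2*x1^2, i.e. c20 = 3, c11 = 5, c02 = 2.
f = {(2, 0): 3, (1, 1): 5, (0, 2): 2}

def discriminant(coeffs):
    """Evaluate the metapolynomial Delta = c11^2 - 4*c20*c02 on a degree-2 binary form."""
    c20 = coeffs.get((2, 0), 0)
    c11 = coeffs.get((1, 1), 0)
    c02 = coeffs.get((0, 2), 0)
    return c11**2 - 4 * c20 * c02

print(discriminant(f))                                   # 25 - 24 = 1, so f is not a square
print(discriminant({(2, 0): 1, (1, 1): 2, (0, 2): 1}))   # (x0 + x1)^2 gives 0
```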

Algebraic circuits

In this paper, we are mostly interested in the algebraic circuit size of metapolynomials. Algebraic circuits for metapolynomials are defined in the same way as for usual polynomials, as follows.

An algebraic circuit is a directed acyclic graph in which each vertex has indegree 0 or 2. Every indegree 0 vertex is called an input gate and it is labelled by an affine linear metapolynomial; every indegree 2 vertex is called a computation gate and it is labeled by either “+” or “×”; every edge is labeled by a complex number, which is assumed to be 1 if this scalar is omitted; there is exactly one vertex of outdegree 0. An algebraic circuit computes a metapolynomial at every vertex by induction over the directed acyclic graph structure (the labels on the edges rescale the computed values), i.e., if the values computed at the children of a gate $v$ are $\Delta_1$ and $\Delta_2$ and the incoming edges to $v$ are labelled with $\alpha_1$ and $\alpha_2$, respectively, then the value computed at $v$ is $\alpha_1\Delta_1 + \alpha_2\Delta_2$ or $\alpha_1\alpha_2\Delta_1\Delta_2$, depending on whether $v$ is labelled with a “+” or a “×”. An algebraic circuit computes the metapolynomial computed at its outdegree 0 vertex. The size of a circuit is the total number of its vertices. The algebraic circuit complexity of a metapolynomial is the minimum size of an algebraic circuit computing it. For example, Figure 1 shows that the algebraic circuit complexity of the discriminant is at most 6. Algebraic circuits for polynomials are defined analogously.

Figure 1: An algebraic circuit computing the discriminant $\Delta = c_{11}^2 - 4c_{20}c_{02}$.
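One way to encode such a circuit is as a labeled DAG. The sketch below (our own encoding, which may differ from Figure 1 in inessential details) uses three input gates, two product gates and one sum gate, six vertices in total, with the scalar $-4$ placed on an edge.

```python
# One possible size-6 circuit for Delta = c11^2 - 4*c20*c02.
# Edges carry scalar labels; unlabeled edges carry 1.
gates = {
    "g1": ("input", "c20"),
    "g2": ("input", "c11"),
    "g3": ("input", "c02"),
    "g4": ("*", [("g2", 1), ("g2", 1)]),     # c11 * c11
    "g5": ("*", [("g1", 1), ("g3", -4)]),    # c20 * (-4*c02)
    "g6": ("+", [("g4", 1), ("g5", 1)]),     # c11^2 - 4*c20*c02
}

def evaluate(gate, assignment):
    kind, data = gates[gate]
    if kind == "input":
        return assignment[data]
    values = [scalar * evaluate(child, assignment) for child, scalar in data]
    return values[0] * values[1] if kind == "*" else values[0] + values[1]

print(evaluate("g6", {"c20": 3, "c11": 5, "c02": 2}))   # 1
```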
Representation Theory

Let $\mathbb{N}$ denote the set of natural numbers including zero. A partition $\lambda = (\lambda_1, \lambda_2, \dots)$ is a finite sequence of non-increasing natural numbers. Let $\ell(\lambda) = \max\{j \mid \lambda_j \neq 0\}$. We define $\lambda_j = 0$ for all $j > \ell(\lambda)$. Write $|\lambda| := \sum_j \lambda_j$, and say that $\lambda$ is a partition of $|\lambda|$ of length $\ell(\lambda)$. In this case we write $\lambda \vdash |\lambda|$ or $\lambda \vdash_{\ell(\lambda)} |\lambda|$ depending on whether the length is relevant. For example, $\lambda = (4,2,2)$ is a partition of 8 of length 3, i.e., $\lambda \vdash_3 8$. The Young diagram of a partition $\lambda$ is a top-left justified array of boxes with $\lambda_j$ many boxes in row $j$; for example, the Young diagram of $(4,2,2)$ has rows of 4, 2, and 2 boxes. A Young tableau $T$ is a filling of the boxes of a Young diagram with positive integer numbers. The partition $\lambda$ is called the shape of $T$. A Young tableau is semistandard if the numbers within each row from left to right are non-decreasing, and the numbers within each column from top to bottom are strictly increasing. For each partition $\lambda$ there exists a unique superstandard tableau, that is, the tableau with only the number $j$ in row $j$.
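The tableau conditions are easy to check programmatically; the following Python helpers (ours) test semistandardness and build the superstandard tableau of a given shape.

```python
def is_semistandard(tableau):
    """Rows weakly increase left to right; columns strictly increase top to bottom."""
    for row in tableau:
        if any(row[j] > row[j + 1] for j in range(len(row) - 1)):
            return False
    for i in range(len(tableau) - 1):
        upper, lower = tableau[i], tableau[i + 1]
        if any(upper[j] >= lower[j] for j in range(len(lower))):
            return False
    return True

def superstandard(shape):
    """The tableau with only the number j in row j (rows indexed from 1)."""
    return [[j + 1] * shape[j] for j in range(len(shape))]

print(superstandard((4, 2, 2)))                        # [[1, 1, 1, 1], [2, 2], [3, 3]]
print(is_semistandard(superstandard((4, 2, 2))))       # True
print(is_semistandard([[1, 2, 2], [2, 2]]))            # False (second column not strictly increasing)
```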

The group $GL_k$ acts on the space $\mathbb{C}[x_1,\dots,x_k]_d$ of homogeneous polynomials of degree $d$ on $\mathbb{C}^k$ by linear change of coordinates. This action induces, again by linear change of coordinates, an action on the space of metapolynomials with format $(\delta, d, k)$ for every $\delta$. For instance, the change of basis swapping the two coordinates on $\mathbb{C}^2$ defines the change of basis on $\mathbb{C}[x_0,x_1]_2$ obtained by exchanging $x_0$ and $x_1$. This induces a linear change of coordinates on $\mathbb{C}[c_{20},c_{11},c_{02}]$ which exchanges $c_{20}$ and $c_{02}$ and leaves $c_{11}$ fixed. The discriminant $c_{11}^2 - 4c_{02}c_{20}$ is sent to itself by this change of coordinates. More precisely, the group actions are defined as follows. For every homogeneous degree $d$ polynomial $f$ in $k$ variables and every $g \in GL_k$, the polynomial $gf$ is defined by $(gf)(x) = f(g^{-1}x)$ for every $x \in \mathbb{C}^k$. Similarly, for every format $(\delta,d,k)$ metapolynomial $\Delta$, the metapolynomial $g\Delta$ is defined by $(g\Delta)(f) = \Delta(g^{-1}f)$ for every homogeneous degree $d$ polynomial $f$. The details about this action are provided in Section 4.

The vector space of metapolynomials of format $(\delta,d,k)$ is closed under this action of $GL_k$, and it decomposes into a direct sum of subspaces which are also closed under the action of $GL_k$: these subspaces are called isotypic components and they are in one-to-one correspondence with partitions $\lambda \vdash_k \delta d$. The summand corresponding to the partition $\lambda$ is called the $\lambda$-isotypic component. This is called the isotypic decomposition. Each $\lambda$-isotypic component decomposes even further into a direct sum of subspaces with one summand for each semistandard tableau $T$ of shape $\lambda$. We call the component for $T$ the $T$-isotypic component. If $V$ is $\lambda$-isotypic and irreducible (i.e., has no nontrivial subrepresentations), then $\lambda$ is called the isomorphism type of $V$, and each $T$-isotypic component of $V$ is 1-dimensional. If $T$ is the superstandard tableau of shape $\lambda$, then the corresponding $T$-isotypic component is called the highest weight metapolynomial vector space of weight $\lambda$. Equivalently, a highest weight vector of weight $\lambda$ is a metapolynomial $\Delta$ with $g\Delta = t_1^{\lambda_1}\cdots t_k^{\lambda_k}\Delta$ for every $g \in GL_k$ upper triangular with $t_1,\dots,t_k$ on the main diagonal. The $\lambda$-isotypic component is known to be the linear span of the union of the orbits of all its highest weight vectors.

For example (not easily verifiable by hand, see Example 5.4), the space of metapolynomials of format $(\delta,d,k) = (3,2,3)$ decomposes into three nonzero isotypic components, corresponding to the partitions $(6,0,0)$, $(4,2,0)$ and $(2,2,2)$ of $\delta d = 6$. The isotypic component $(4,2,0)$ further decomposes into three $T$-isotypic components for three semistandard tableaux of shape $(4,2,0)$. The metapolynomial $\Delta = c_{200}c_{020}c_{002}$ can be written according to this decomposition as

$$\begin{aligned}
60\Delta \;=\;& 2\,(c_{101}^2c_{020} + 2c_{110}c_{101}c_{011} + c_{200}c_{011}^2 + c_{110}^2c_{002} + 2c_{200}c_{020}c_{002}) \\
&+ \big[\; 2\,(-c_{020}c_{101}^2 - 2c_{011}c_{101}c_{110} + 4c_{002}c_{110}^2 - c_{011}^2c_{200} + 8c_{002}c_{020}c_{200}) \\
&\quad + 0 \\
&\quad + 5\,(c_{020}c_{101}^2 - c_{011}c_{101}c_{110} - c_{002}c_{110}^2 + c_{011}^2c_{200} + 4c_{002}c_{020}c_{200}) \;\big] \\
&+ 5\,(-c_{101}^2c_{020} + c_{110}c_{101}c_{011} - c_{200}c_{011}^2 - c_{110}^2c_{002} + 4c_{200}c_{020}c_{002}),
\end{aligned}$$

where the first summand is the $(6,0,0)$-isotypic part, the three bracketed summands are the $T$-isotypic parts for the three tableaux of shape $(4,2,0)$ mentioned above, and the last summand is the $(2,2,2)$-isotypic part. The fifth summand is a highest weight vector, because the corresponding tableau is superstandard. All summands have weight $(2,2,2)$, because $\Delta$ has weight $(2,2,2)$.
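The identity can be verified mechanically; the following sympy script (our own verification, not part of the paper) expands the five summands above and confirms that they add up to $60\Delta$.

```python
from sympy import symbols, expand, simplify

c200, c020, c002, c110, c101, c011 = symbols("c200 c020 c002 c110 c101 c011")

part1 = 2*(c101**2*c020 + 2*c110*c101*c011 + c200*c011**2 + c110**2*c002 + 2*c200*c020*c002)
part2 = 2*(-c020*c101**2 - 2*c011*c101*c110 + 4*c002*c110**2 - c011**2*c200 + 8*c002*c020*c200)
part3 = 0
part4 = 5*(c020*c101**2 - c011*c101*c110 - c002*c110**2 + c011**2*c200 + 4*c002*c020*c200)
part5 = 5*(-c101**2*c020 + c110*c101*c011 - c200*c011**2 - c110**2*c002 + 4*c200*c020*c002)

total = expand(part1 + part2 + part3 + part4 + part5)
print(total)                                   # 60*c002*c020*c200
print(simplify(total - 60*c200*c020*c002))     # 0
```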

Theorem 1.1 (Main theorem).

Let $\Delta : \mathbb{C}[x_1,\dots,x_k]_d \to \mathbb{C}$ be a metapolynomial of format $(\delta,d,k)$ computed by an algebraic circuit of size $s$. Then

  1. For every weight $\mu \in \mathbb{N}^k$ with $|\mu| = d\delta$, the projection of $\Delta$ onto the weight space of weight $\mu$ can be computed by a circuit of size $O(s\,(\delta d)^{2k^3})$.

  2. For every partition $\lambda \vdash d\delta$ the projection of $\Delta$ onto the $\lambda$-isotypic component can be computed by a circuit of size $O(s\,k^{2k^2}(\delta d)^{2k^3})$.

  3. For every partition $\lambda \vdash d\delta$ the projection of $\Delta$ onto the highest weight space of weight $\lambda$ can be computed by a circuit of size $O(s\,(k+1)^{2k^2}(\delta d)^{2k^3})$.

  4. For every semistandard tableau $T$ of shape $\lambda \vdash d\delta$ the projection of $\Delta$ onto the $T$-isotypic space can be computed by a circuit of size $O(s\,k^{2k^2}(\delta d)^{2k^4})$.

In all applications of Theorem 1.1, we will always have that k is logarithmic in the number of metavariables, which we explain in Section 2.5. Open Question 2 in [40] asks: given a sequence of irreducible representations of metapolynomials, what is the sequence of metapolynomials in it that has the lowest circuit complexity? Theorem 1.1 answers this question in a very satisfying way: Every such sequence of metapolynomials has the same circuit complexity up to quasipolynomial blowup, as seen in the following corollary.

Corollary 1.2.

Let $s$ be the smallest circuit complexity of a nonzero metapolynomial in an irreducible $GL_k$-representation $V$ of type $\lambda \vdash \delta d$. Then every metapolynomial in $V$ has circuit complexity at most $O(s\,k^{2k^2}(\delta d)^{2k^4 + k^2})$.

Proof.

If $V$ is irreducible of isomorphism type $\lambda$, then it decomposes into a direct sum of 1-dimensional $T$-isotypic components, where $T$ ranges over all semistandard tableaux of shape $\lambda$. Take a nonzero metapolynomial $\Delta$ with minimal complexity in $V$ and apply a generic basis change, i.e., apply a generic element of $GL_k$. Then independently project the result onto all $T$-isotypic components. We claim that since the basis change was generic, the projections are nonzero, and therefore form a basis of $V$. In fact, we can be more precise about how we pick the basis change element $g \in GL_k$. For every semistandard tableau $T$, define $X_T := \{g \in GL_k \mid g\Delta \text{ has zero } T\text{-isotypic projection}\}$. Such $X_T$ is Zariski closed in $GL_k$, but it is not the whole $GL_k$, otherwise the linear span of $GL_k\Delta$ would be a nontrivial subrepresentation of the irreducible representation $V$. Therefore the finite union $X := \bigcup_T X_T$ is also a proper Zariski closed subset of $GL_k$, and its complement $GL_k \setminus X$ is a nonempty Zariski open subset. If we pick $g \in GL_k \setminus X$, then the projections of $g\Delta$ to every $T$-isotypic space are nonzero. Moreover, by Theorem 1.1, all these projections have complexity at most $O(s\,k^{2k^2}(\delta d)^{2k^4})$. But the dimension of $V$ equals the number of semistandard tableaux of shape $\lambda$ with entries $1,\dots,k$, which is at most $(\delta d)^{k^2}$ (each tableau has at most $k$ rows, and each row has length at most $\delta d$, so can be filled in at most $(\delta d)^k$ ways). Every metapolynomial in $V$ is a linear combination of these at most $(\delta d)^{k^2}$ basis elements, so it has circuit complexity at most $O(s\,k^{2k^2}(\delta d)^{2k^4 + k^2})$.

 Remark 1.3.

In general, algebraic complexity classes are defined for possibly nonhomogeneous polynomials. In this paper, we assume that polynomials and metapolynomials are homogeneous. However, one can lift the results to the nonhomogeneous setting as follows.

A polynomial $f(x_1,\dots,x_k)$ is called homogeneous if all its monomials have the same degree. For $d \geq \deg(f)$, define the degree $d$ homogenization

$f_d(x_0, x_1, \dots, x_k) := x_0^d\, f(x_1/x_0, \dots, x_k/x_0).$

If $f$ has an algebraic circuit of size $s$, then $f_d$ has an algebraic circuit of size at most $O(sd)$: $s(d+1)$ for extracting the homogeneous parts of $f$ via interpolation, $d$ operations to compute all values $x_0, x_0^2, \dots, x_0^d$, another $d+1$ for multiplying the homogeneous components with the correct powers of $x_0$, and a final $d$ for adding up the results. Given a metapolynomial $\Delta$ that does not involve $x_0$, define $\Delta_d$ by replacing every metavariable $c_{\mathbf{i}}$ with $c_{(d - |\mathbf{i}|, i_1, i_2, \dots)}$, and setting every metavariable $c_{\mathbf{i}}$ with $|\mathbf{i}| > d$ to zero. Clearly we have $\Delta_d(f_d) = \Delta(f)$, provided that $d \geq \deg(f)$. Hence, in this paper we can always assume that $f$ is homogeneous.
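A minimal sympy sketch of the homogenization map (function name ours):

```python
from sympy import symbols, expand, simplify

x0, x1, x2 = symbols("x0 x1 x2")

def homogenize(f, d, x0, variables):
    """Degree-d homogenization: x0^d * f(x1/x0, ..., xk/x0)."""
    substituted = f.subs({x: x / x0 for x in variables}, simultaneous=True)
    return expand(simplify(x0**d * substituted))

f = x1**2 + 3*x1*x2 + 5                  # nonhomogeneous, degree 2
print(homogenize(f, 2, x0, [x1, x2]))    # x1**2 + 3*x1*x2 + 5*x0**2
print(homogenize(f, 3, x0, [x1, x2]))    # x0*x1**2 + 3*x0*x1*x2 + 5*x0**3
```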

Organization of the paper

In the first part of Section 2 we illustrate the motivations for studying the circuit complexity of the isotypic components and the highest weight vectors in the space of metapolynomials, in the context of algebraic complexity lower bounds and border complexity classes. In Section 2.5, based on Theorem 1.1 we prove that algebraic natural proofs can without loss of generality be assumed to be isotypic, see Theorem 2.4. Section 3 sketches the main ideas behind Theorem 1.1. In Section 4, we provide an introduction to Lie algebras, universal enveloping algebras, and their representation theory, on which the core of the proof is built. In Section 5.1, we explain the construction of the projection maps used in the proof of the main theorem; we describe the efficient circuit implementation in Section 5.2. In Appendix A, we prove the results of Section 2.5. In Appendix B, we show one part of the proof of the Poincaré–Birkhoff–Witt Theorem (Theorem 4.9); this proof is explicit and can be used to construct the projectors in Theorem 1.1.

2 Context and consequences of the main theorem

2.1 Complexity classes defined by invariant complexity measures

A p-family is a sequence of polynomials $(f_n)_n$ for which the number of variables of $f_n$ and the degree of $f_n$ are polynomially bounded functions of $n$. The class $\mathsf{VP}$ is the set of p-families whose algebraic circuit complexity is polynomially bounded. Valiant’s landmark conjecture is that the permanent sequence $(\mathsf{per}_n)_n$ with $\mathsf{per}_n = \sum_{\sigma\in\mathfrak{S}_n}\prod_{i=1}^n x_{i,\sigma(i)}$ is not contained in $\mathsf{VP}$, i.e., $\mathsf{VP} \neq \mathsf{VNP}$; for details see [14, 57, 13]. The extended conjecture states that $(\mathsf{per}_n) \notin \mathsf{VQP}$ [12], i.e., that the algebraic circuit complexity of the permanent is not quasipolynomially bounded.

Let cc(f) denote the algebraic circuit complexity of the polynomial f[x1,,xk]. It is easy to see that cc(f)=cc(gf) for any gGLk: A circuit for gf is obtained from the one for f by applying g at the input gates. Recall that in the definition of algebraic circuits we allow affine linear forms at the input gates. We say that the complexity measure cc is an invariant complexity measure. In more restrictive circuit definitions, for instance if the input gates are required to be variables, the associated circuit complexity c might change under the action of GLk. However, both definitions give the same class 𝖵𝖯.

Several other classical classes in algebraic complexity theory can be defined as the set of p-families for which some invariant complexity measure is polynomially bounded. For example, the class 𝖵𝖥 is the set of p-families whose algebraic formula size is polynomially bounded; this is the smallest size of an algebraic circuit whose underlying directed graph is a tree. The class 𝖵𝖡𝖯 is the set of p-families whose algebraic branching program width is polynomially bounded; we do not define this notion here, but it is an easy consequence of the definition that it is an invariant complexity measure, e.g. [68, Def. 2.2]. More classical classes can be readily defined in this way, for example the class of p-families of polynomially bounded circuit size and constant depth (still allowing arbitrary linear forms as inputs), such as ΣΠΣ, or the class of p-families of sums of polynomially many powering gates ΣΛΣ. Finally, we mention 𝖵𝖭𝖯, which is the class of p-families of polynomially bounded permanental complexity: The smallest r such that f can be written as the permanent of an r×r matrix with affine linear entries. In all these cases, the relevant complexity measure is invariant.

Notable complexity measures that are not invariant are those related to the sparsity of the polynomial, in the sense of [49], or to restrictions on how often a variable can occur, such as read-once or read-$k$ models [27]. A non-invariant complexity measure $c$ can be converted into an invariant complexity measure $c'$ by defining $c'(f) := \min\{c(gf) \mid g \in GL_k\}$, i.e., by taking the minimum complexity over all base changes, but for some non-invariant measures this can change the corresponding complexity class. In this paper, we only study complexity classes defined by invariant measures.

2.2 Lower bounds via isotypic metapolynomials

An important problem in algebraic complexity theory concerns proving concrete lower bounds for interesting complexity measures $c$. More precisely, given a polynomial $f_{\mathrm{hard}}$ and a certain integer $r$, one aims to show $c(f_{\mathrm{hard}}) > r$, or equivalently $f_{\mathrm{hard}} \notin X_r$, where $X_r = \{f \in \mathbb{C}[x_1,\dots,x_k]_d \mid c(f) \leq r\}$ is the set of polynomials satisfying the upper bound $r$ on the complexity measure $c(\cdot)$. A standard approach is to determine a function $\Delta$ with the property that $\Delta(f) = 0$ for every $f \in X_r$ and $\Delta(f_{\mathrm{hard}}) \neq 0$.

Several lower bound methods in complexity theory are based on this approach, and most often the function Δ is a metapolynomial. This is the case for rank methods such as the method of partial derivatives [70, 22], the method of shifted partial derivatives [41, 28], and other augmented flattening methods such as Koszul–Young flattenings [54]: in all these cases the complexity lower bound is obtained in terms of a lower bound of the rank of a matrix whose entries depend on the metavariables, or equivalently on the nonvanishing of a suitable set of minors, which are metapolynomials. Rank methods yield the recent breakthrough of [55] giving the first superpolynomial lower bounds for explicit polynomials to be computed by low-product-depth algebraic circuits in large characteristic, further extended to any field in [30]. Similarly, the current best known lower bounds for the border determinantal complexity of the permanent, as well as the one of the sum of powers, are based on a non-degeneracy condition which can be translated into the vanishing of a suitable metapolynomial [53, 52]. Finally, the algebraic branching program lower bounds discussed in [50, 36] are built on geometric conditions of the zero set of the polynomial of interest, which in turn can be translated into the vanishing of metapolynomials.

When the complexity measure c() is invariant, one can use representation theory to enhance the search for the metapolynomials yielding a separation. Our main result Theorem 1.1 shows that this enhancement is not expensive: the computational overhead is quasi-polynomial.

To illustrate the effect of our result, first consider a simpler reduction. Let $\Delta$ be a metapolynomial vanishing on $X_r$ such that $\Delta(f_{\mathrm{hard}}) \neq 0$. Write $\Delta = \sum_i \Delta^{(i)}$ for homogeneous metapolynomials $\Delta^{(i)}$ of degree $i$. If $X_r$ is invariant under rescaling, one has $\Delta^{(i)}(X_r) = 0$ for every $i$. On the other hand, there exists at least one $i$ such that $\Delta^{(i)}(f_{\mathrm{hard}}) \neq 0$. The relevant homogeneous component $\Delta^{(i)}$ can be extracted from $\Delta$ with a simple interpolation argument, and one has $cc(\Delta^{(i)}) \leq (\deg(\Delta)+1)\,cc(\Delta)$. In summary, if the lower bound $c(f_{\mathrm{hard}}) > r$ can be proved via a metapolynomial $\Delta$ satisfying $cc(\Delta) \leq s$, then the same lower bound can be proved via a homogeneous metapolynomial $\Delta^{(i)}$ satisfying $cc(\Delta^{(i)}) \leq (\deg(\Delta)+1)s$; and even further, the homogeneous metapolynomial $\Delta^{(i)}$ can be realized as the projection of $\Delta$ onto the $i$-th homogeneous component of the space of metapolynomials.
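The interpolation step can be illustrated with sympy (the toy Δ and the names are ours): scaling all metavariables by a parameter $t$ turns the homogeneous components into the coefficients of the powers of $t$, which can be read off from evaluations at $\deg(\Delta)+1$ points.

```python
from sympy import symbols, expand

c20, c11, c02, t = symbols("c20 c11 c02 t")
Delta = c11**2 - 4*c20*c02 + 7*c11 + 3      # toy metapolynomial with mixed degrees 2, 1, 0
D = 2                                       # an upper bound on deg(Delta)

scaled = expand(Delta.subs({c20: t*c20, c11: t*c11, c02: t*c02}, simultaneous=True))
# Delta(t*c) = sum_i t^i * Delta^{(i)}(c); extracting the coefficient of t^i is the
# symbolic counterpart of interpolating over D+1 evaluation points.
components = [scaled.coeff(t, i) for i in range(D + 1)]
print(components)   # [3, 7*c11, c11**2 - 4*c02*c20]
```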

Theorem 1.1 shows similar results for other projections and in particular for the projection onto a specific isotypic component. Indeed, suppose $\Delta$ is a (homogeneous) metapolynomial vanishing on $X_r$ and such that $\Delta(f_{\mathrm{hard}}) \neq 0$. Write $\Delta = \sum_\lambda \Delta^{(\lambda)}$ as the sum of its isotypic components. If $c(\cdot)$ is an invariant complexity measure, then $X_r$ is invariant under the action of $GL_k$. In this case, one has $\Delta^{(\lambda)}(X_r) = 0$ for every $\lambda$. On the other hand, there exists at least one $\lambda$ such that $\Delta^{(\lambda)}(f_{\mathrm{hard}}) \neq 0$. Theorem 1.1 provides an upper bound on the circuit complexity of $\Delta^{(\lambda)}$ in terms of the circuit complexity of $\Delta$, showing that if the lower bound $c(f_{\mathrm{hard}}) > r$ can be proved via a metapolynomial $\Delta$ satisfying $cc(\Delta) \leq s$, then the same lower bound can be proved via an isotypic metapolynomial with circuit complexity controlled by the parameter $s$. Theorem 1.1 further proves analogous results for more refined projections, such as the $T$-isotypic projections or the projection on the highest weight space. These are discussed in Section 2.3.

2.3 Highest weight vectors and representation theoretic obstructions

Theorem 1.1 can be used to convert lower bounds proved by $\Delta$ into lower bounds proved by a highest weight vector (HWV for short). To see this, note that if $\Delta$ vanishes on $X_r$ and $\Delta(f_{\mathrm{hard}}) \neq 0$, then there exists a highest weight vector $\Delta'$ vanishing on $X_r$ and such that $\Delta'(gf_{\mathrm{hard}}) \neq 0$ for some $g \in GL_k$; in fact, this can be achieved with a random $g \in GL_k$. Hence a HWV proves the lower bound $c(gf_{\mathrm{hard}}) > r$; if $c(\cdot)$ is an invariant complexity measure, the same lower bound holds for $f_{\mathrm{hard}}$. Similarly to before, Theorem 1.1 provides an upper bound on the circuit complexity $cc(\Delta')$ in terms of $cc(\Delta)$, showing only a quasipolynomial blowup in the complexity of the metapolynomial.

A coarser view on HWVs is given via representation-theoretic multiplicities, which is a central idea of geometric complexity theory [60, 61, 59]. Let $Y := \overline{GL_k\, f_{\mathrm{hard}}}$ denote the orbit closure, and let $X := \overline{X_s}$. It is easy to see that $f_{\mathrm{hard}} \in X$ if and only if $Y \subseteq X$. The vector space of HWVs of weight $\lambda$ defines a vector space of polynomial functions on $X$; the dimension of this space is called the multiplicity $\mathsf{mult}_\lambda \mathbb{C}[X]$ of $\lambda$ in the coordinate ring $\mathbb{C}[X]$. Standard facts in representation theory, and in particular Schur's Lemma (see Lemma 4.4), guarantee that if $\mathsf{mult}_\lambda \mathbb{C}[X] > \mathsf{mult}_\lambda \mathbb{C}[Y]$ for some $\lambda$, then $X \not\subseteq Y$. Therefore, multiplicities give a method to potentially separate two varieties, and obtain a lower bound on $c(f_{\mathrm{hard}})$. In this case, we say $\lambda$ is a multiplicity obstruction. If additionally $\mathsf{mult}_\lambda \mathbb{C}[X] > 0 = \mathsf{mult}_\lambda \mathbb{C}[Y]$, then $\lambda$ is called an occurrence obstruction. Occurrence obstructions [15, 16] and multiplicity obstructions [46] are sometimes sufficient to obtain separations. However, it is known that occurrence obstructions do not work in certain setups [47, 38, 18, 24].

Theorem 1.1 makes no statement about the viability of the method of multiplicity obstructions. Indeed, one main hope of geometric complexity theory is that information about 𝗆𝗎𝗅𝗍λ[X] can be obtained without ever having to write down any circuit for a HWV. Information can be gained by decomposing much easier coordinate rings of orbits, see [19, 17]. This gives upper bounds on 𝗆𝗎𝗅𝗍λ[X] by using classical representation theoretic branching rules. The lower bounds on 𝗆𝗎𝗅𝗍λ[GLkfhard¯] are more difficult to obtain, see [46] for a recent implementation.

2.4 Border complexity

Lower bounds achieved via metapolynomials are better described in the setting of border complexity. Generally, for a given algebraic complexity measure c(), one can define a corresponding border measure, c¯() as follows:

$\overline{c}(f) = \min\{\, r : \text{there exists a sequence } (f_\varepsilon)_\varepsilon \text{ such that } f = \lim_{\varepsilon\to 0} f_\varepsilon \text{ and } c(f_\varepsilon) = r \text{ for all but finitely many } \varepsilon \,\};$

correspondingly, a complexity class 𝒞 defined in terms of the growth of a given complexity measure naturally induces a border complexity class 𝒞¯, described by the growth of the corresponding border complexity measure. In fact, 𝒞¯ can be defined as the closure of 𝒞 with respect to a suitable topology [48]. In the setting of algebraic circuits, the circuit size cc() has a corresponding border circuit measure cc¯() which defines the complexity class 𝖵𝖯¯.
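A standard toy illustration of a border phenomenon (classical, not specific to this paper): $x^2y$ is a limit of sums of two cubes of linear forms, although it is not itself such a sum; the limit can be checked with sympy.

```python
from sympy import symbols, limit, expand

x, y, eps = symbols("x y eps")

# x^2*y arises as the limit of ((x + eps*y)^3 - x^3) / (3*eps) as eps -> 0,
# a sum of two cubes of linear forms for every fixed eps != 0.
f_eps = ((x + eps*y)**3 - x**3) / (3*eps)
print(expand(f_eps))            # x**2*y + eps*x*y**2 + eps**2*y**3/3
print(limit(f_eps, eps, 0))     # x**2*y
```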

A natural question concerns whether 𝒞=𝒞¯ and more generally how much larger 𝒞¯ is compared to 𝒞. For example, the border class of sequences admitting (border) polynomial size ΣΛΣ circuits is contained in 𝖵𝖡𝖯, see [29] and [9, Theorem 4.2]. In [25], the same containment was shown for the border classes of Σ[k]ΠΣ circuits (for every constant k). In some restricted setting, for instance in the study of matrix multiplication complexity, it is known that a complexity measure and its associated border complexity are equivalent [7]. In general, the question is wide open for most complexity classes and, in particular, for 𝖵𝖯.

Metapolynomials characterize border complexity classes in the following sense: if a sequence of polynomials (fn) does not belong to a border complexity class 𝒞¯ then there is a sequence of metapolynomials witnessing the non-membership. This is made precise in Theorem 2.2.

2.5 Algebraic complexity and algebraic natural proofs

Razborov and Rudich [66] introduced the notion of natural proofs in Boolean circuit complexity.

Definition 2.1 (Natural property).

A subset $P \subseteq \{f : \{0,1\}^n \to \{0,1\}\}$ of Boolean functions is said to be a natural property useful against a class $\mathcal{C}$ of Boolean circuits if the following is true:

  • (Usefulness). Any Boolean function $f : \{0,1\}^n \to \{0,1\}$ that can be computed by a Boolean circuit in $\mathcal{C}$ does not have the property $P$.

  • (Constructivity). Given the truth table of a Boolean function $f : \{0,1\}^n \to \{0,1\}$, whether it has the property $P$ can be decided in time polynomial in the length of the input, i.e., in time $2^{O(n)}$.

  • (Largeness). For all large enough $n$, at least a $2^{-O(n)}$ fraction of all $n$-variate Boolean functions have the property $P$.

A proof that a certain family of Boolean functions cannot be computed by circuits in 𝒞 is said to be a natural lower bound proof if the proof (perhaps implicitly) proceeds via establishing a natural property useful against 𝒞, and showing that the candidate hard function has this property. Razborov and Rudich showed that most of the known Boolean circuit lower bound proofs, such as lower bounds for 𝖠𝖢0 (constant-depth) circuits have the natural property. One can set the same framework in the algebraic setting; see [39, 40, 31, 21, 51]. The exact details of the definitions are not fully settled yet, see page 19 of [21].

Following [40], the stretched classes $\mathsf{metaVP}$ and $\mathsf{metaVQP}$ arise as follows. The class $\mathsf{metaVP}$ is the class of sequences $(\Delta_1, \Delta_2, \dots)$ of metapolynomials of format $(\delta(n), n, k(n))$ with $k(n)$ polynomially bounded in $n$, and $\delta(n)$ and $cc(\Delta_n)$ polynomially bounded in $N(n) = \binom{k(n)+n-1}{n}$, which is the number of degree $n$ monomials in $k(n)$ variables. The class $\mathsf{metaVQP}$ is defined analogously, but with $cc(\Delta_n)$ quasipolynomially bounded in $N(n)$.

We will now argue that $k(n)$ is bounded by a polynomial in $\log(N(n))$. To see this, choose $\gamma$ such that $k(n) \leq n^\gamma$ for $n$ large enough. Take any such large enough $n$ and make a case distinction. If $k(n) \leq n$, then $N = \binom{k+n-1}{k-1} \geq \big(1 + \tfrac{n}{k-1}\big)^{k-1}$ implies $\log N \geq (k-1)\log\big(1+\tfrac{n}{k-1}\big) \geq (k-1)$, since $\tfrac{n}{k-1} \geq 1$; here we use the fact that $\binom{m}{r} \geq (m/r)^r$. On the other hand, if $k(n) \geq n+1$, then since $\binom{m}{r}$ is an increasing function of $m$ for a fixed $r$, we get that $N = \binom{k+n-1}{n} \geq \binom{2n}{n} \geq 2^n$ for $n \geq 1$. Therefore, $(\log N)^\gamma \geq n^\gamma \geq k$. Hence, the claim is proved in both cases.

Fix an invariant complexity measure $c$ and let $X_{d,r}$ denote the set of homogeneous degree $d$ polynomials $f$ with $c(f) \leq r$. For a format $(\delta,d,k)$ metapolynomial $\Delta : \mathbb{C}[x_1,\dots,x_k]_d \to \mathbb{C}$ let $\Delta|_{X_{d,r}} : X_{d,r} \to \mathbb{C}$ denote the restriction of $\Delta$ to $X_{d,r}$. Let $\mathcal{C}$ denote the set of p-families with polynomially bounded $c$. The vanishing ideal of $\mathcal{C}$ is defined as

$I(\mathcal{C}) := \Big\{ (\Delta_n)_n \text{ of format } (*, n, *) \;:\; \text{for every polynomially bounded } r(n), \text{ the sequence } \big(\Delta_1|_{X_{1,r(1)}}, \Delta_2|_{X_{2,r(2)}}, \dots\big) \text{ eventually vanishes identically} \Big\}$

To emphasize how natural this definition is, we provide a corresponding Nullstellensatz.

Theorem 2.2 (Nullstellensatz for 𝒞).

Let $(f_n)_n$ be a p-family of homogeneous polynomials such that $(\deg f_n)_n$ is a strictly increasing sequence. The sequence $(f_n)$ lies in $\overline{\mathcal{C}}$ if and only if for every $(\Delta_n) \in I(\mathcal{C})$ we have $\Delta_{\deg(f_n)}(f_n) = 0$ eventually.

The proof of Theorem 2.2 is given in Appendix A.

Definition 2.3 (Algebraic natural proof).

We say that $\mathcal{C}$ has algebraic natural proofs if $I(\mathcal{C}) \cap \mathsf{metaVQP}$ contains a sequence that is not eventually zero.

Here we loosened the definition from the literature slightly from 𝗆𝖾𝗍𝖺𝖵𝖯 (see [40, 21]) to 𝗆𝖾𝗍𝖺𝖵𝖰𝖯, but note that the literature does not have just one standard definition, see page 19 of [21]. The landmark question in algebraic metacomplexity is whether 𝖵𝖯 has algebraic natural proofs.

Definition 2.3 should be compared with Definition 2.1 in the Boolean setting. If $I(\mathsf{VP}) \cap \mathsf{metaVQP}$ contains an eventually nonzero sequence $\Delta$ of format $(\delta(n), n, k(n))$, then clearly there exists a largest $r(n)$ such that $\Delta_n(X_{n,r(n)}) = \{0\}$, and further $cc(\Delta_n) \leq 2^{\mathrm{poly}(\log N)}$, where $N := \binom{k(n)+n-1}{n}$, and $k(n) \leq \mathrm{poly}(n)$. Therefore, $\Delta$ certifies the non-membership, a useful natural property. Furthermore, it trivially satisfies the largeness criterion, since $\Delta_n$ does not vanish on most polynomials. Finally, the quasipolynomial bound on the algebraic circuit complexity of $\Delta_n$ gives a natural algebraic analogue of the notion of constructivity.

Let Isotypic denote the set of sequences of isotypic metapolynomials. The following theorem follows from Theorem 1.1. It states in a concise way that only isotypic metapolynomials have to be considered for algebraic natural proofs.

Theorem 2.4 (Main theorem about algebraic natural proofs).

$\mathcal{C}$ has algebraic natural proofs if and only if $I(\mathcal{C}) \cap \mathsf{metaVQP} \cap \mathsf{Isotypic}$ contains a sequence that is not eventually zero.

Proof.

One direction is clear, because $I(\mathcal{C}) \cap \mathsf{metaVQP} \cap \mathsf{Isotypic} \subseteq I(\mathcal{C}) \cap \mathsf{metaVQP}$.

For the other direction, we first consider a general property of isotypic components of vanishing ideals. Let $I(X)_\delta = \{\Delta \text{ of format } (\delta,d,k) \mid \Delta|_X = 0\}$. Let $X$ be closed under the action of $GL_k$. If $\Delta \in I(X)_\delta$, then $g\Delta \in I(X)_\delta$ for every $g \in GL_k$. Indeed, in this case for all $x \in X$ we have $(g\Delta)(x) = \Delta(g^{-1}x) = 0$, since $g^{-1}x \in X$. Moreover, $I(X)_\delta$ is closed under linear combinations, hence $I(X)_\delta$ is a vector space, and in particular closed under taking limits and thus closed under the action of the Lie algebra. We have $\Delta \in I(X)_\delta$ if and only if for all $\lambda$ we have $\Delta^{(\lambda)} \in I(X)_\delta$, because the projection of $\Delta$ to an isotypic component $\Delta^{(\lambda)}$ involves only finite linear combinations of Lie algebra actions, see Section 5.

Let $\mathcal{C}$ have algebraic natural proofs, witnessed by a sequence $(\Delta_n)_n \in I(\mathcal{C}) \cap \mathsf{metaVQP}$ of format $(\delta(n), n, k(n))$ that is not eventually zero. Now, let $(\lambda_n)_n$ be a sequence of partitions, $\lambda_n \vdash_{k(n)} n\delta(n)$, such that the isotypic component $\Delta_n^{(\lambda_n)}$ of $\Delta_n$ is nonzero whenever $\Delta_n$ is nonzero. By construction, $(\Delta_n^{(\lambda_n)})_n \in \mathsf{Isotypic}$ and $(\Delta_n^{(\lambda_n)})_n$ is not eventually zero. Since each $X_{n,i}$ is invariant under the group action, we have for all $i$ and $n$: whenever $\Delta_n|_{X_{n,i}} = 0$, then also $\Delta_n^{(\lambda_n)}|_{X_{n,i}} = 0$. For every polynomially bounded $r$ we have $\Delta_n|_{X_{n,r(n)}} = 0$ eventually, hence we have $\Delta_n^{(\lambda_n)}|_{X_{n,r(n)}} = 0$ eventually, and hence $(\Delta_n^{(\lambda_n)})_n \in I(\mathcal{C})$. Let $N(n) = \binom{k(n)+n-1}{n}$. Let $s(n) = cc(\Delta_n) \leq 2^{\mathrm{poly}(\log N)}$. By Theorem 1.1 with $k(n) \leq \mathrm{poly}(\log N)$, $\delta(n) \leq \mathrm{poly}(N)$, and $d = n$, we have $cc(\Delta_n^{(\lambda_n)}) \leq 2^{\mathrm{poly}(\log N)}$, which implies $(\Delta_n^{(\lambda_n)})_n \in \mathsf{metaVQP}$, which finishes the proof.

3 Outline of the proof of Theorem 1.1

The proof of Theorem 1.1 is built on an efficient construction of the image of metapolynomials under projections onto the isotypic spaces.

Roughly speaking, we will construct the projectors of Theorem 1.1 as linear combinations of repeated derivations applied to metapolynomials. These derivations live naturally in a vector space called the Lie algebra $\mathfrak{gl}_k$ of $GL_k$. They are defined as the linearization of linear change of coordinates by elements of $GL_k$ applied to polynomials on which a metapolynomial is evaluated. Repeated application of these operations is formally described using products in an associative algebra $\mathcal{U}(\mathfrak{gl}_k) \supseteq \mathfrak{gl}_k$ called the universal enveloping algebra of $\mathfrak{gl}_k$. More concretely, a standard basis element in $\mathfrak{gl}_k$ corresponds to an analogue of a shifted partial derivative for metapolynomials (see Claim 4.2), and an element of $\mathcal{U}(\mathfrak{gl}_k)$ captures repeated applications of such shifted partial derivatives.

Lie algebras and their universal enveloping algebras have a long history in mathematics, and their theory is rich and well-developed. We review the theory in Section 4 to the extent required to construct the projectors. The construction is completely explicit. The universal enveloping algebra $\mathcal{U}(\mathfrak{gl}_k)$ contains elements $\{C_1,\dots,C_r\}$ which separate isotypic components in the following sense: each $C_i$ acts by scalar multiplication on each isotypic component, and for any two distinct isotypic components there is a $C_i$ for which the corresponding scalars are different; the precise statement is the content of Theorem 4.11 and Theorem 4.13. This property is used to construct the projectors via an analogue of standard multivariate polynomial interpolation, see Section 5.1. The elements $\{C_1,\dots,C_r\}$ are called Casimir operators, and they are defined as generators of certain commutative subalgebras of $\mathcal{U}(\mathfrak{gl}_k)$. For each case of Theorem 1.1, there is a suitable algebra for which the corresponding isotypic components are the spaces of interest, see Table 1.
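The interpolation mechanism can be illustrated on a toy finite-dimensional example (ours; a diagonal matrix stands in for a Casimir operator, whereas the paper's construction works inside $\mathcal{U}(\mathfrak{gl}_k)$ on metapolynomial circuits): if $C$ acts with known distinct scalars on the components, the Lagrange polynomial $\prod_{\lambda \neq \lambda_0}(C - \lambda)/(\lambda_0 - \lambda)$ is the projector onto the $\lambda_0$-component.

```python
import numpy as np

def projector(C, lam0, all_eigenvalues):
    """Lagrange-interpolation projector onto the eigenspace of lam0 of a diagonalizable C."""
    P = np.eye(C.shape[0])
    for lam in all_eigenvalues:
        if lam != lam0:
            P = P @ (C - lam * np.eye(C.shape[0])) / (lam0 - lam)
    return P

# C acts with eigenvalues 0, 2, 6 on a 3-dimensional toy space.
C = np.diag([0.0, 2.0, 6.0])
P6 = projector(C, 6.0, [0.0, 2.0, 6.0])
print(np.round(P6, 6))                 # projects onto the third coordinate
print(P6 @ np.array([1.0, 1.0, 1.0]))  # [0. 0. 1.]
```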

The proof that the projectors can be implemented efficiently is built on the following insights. The Poincaré–Birkhoff–Witt theorem (Theorem 4.9) allows one to express each projector as a linear combination in 𝒰(𝔤𝔩k) of m-fold products of elements of 𝔤𝔩k with m bounded polynomially in the number of isotypic components. This number is exponential in k, which is polylogarithmic in N as observed in Section 2.5. We use this fact to construct circuits of quasipolynomial size for the projectors in Section 5.2. In turn, this will complete the proof of Theorem 1.1.

4 Representations of Lie algebras

In this section, we provide an introduction to Lie algebras, universal enveloping algebras, and their representation theory, following mainly [32, 42]. Afterwards, we introduce and explain the results needed to construct the projectors in Theorem 1.1. These results are typically stated in the language of abstract Lie algebras, and therefore we take time to build up this general theory. In our case of interest, the Lie algebra arises from GLk and many parts simplify. We will regularly indicate this in the text for clarity and concreteness. We establish homogeneous polynomials and metamonomials as a Lie algebra representation, which we will use as running examples.

A Lie algebra $\mathfrak{g}$ is a vector space endowed with a bilinear bracket operation $[\cdot,\cdot]$ which is skew-commutative, i.e., $[X,Y] = -[Y,X]$, and satisfies a relation called the Jacobi identity $[X,[Y,Z]] + [Z,[X,Y]] + [Y,[Z,X]] = 0$; in particular, the bracket operation is usually not associative. Every associative algebra $\mathcal{A}$ is a Lie algebra with the commutator bracket given by $[a,b] := ab - ba$ for every $a, b \in \mathcal{A}$. In this paper, we are interested in the representation theory of the general linear group $GL_k$, and we study it via the representation theory of its Lie algebra $\mathfrak{gl}_k$: this is the space of $k\times k$ matrices endowed with the commutator bracket.

A Lie algebra representation is a vector space $V$ with an associated linear map $\rho : \mathfrak{g} \to \mathrm{End}(V)$ such that $\rho([X,Y]) = [\rho(X), \rho(Y)]$, where the bracket on the right is the commutator bracket. We equivalently say that $\mathfrak{g}$ (or its elements) acts on $V$. For $v \in V$, we often write $X.v := \rho(X)v$ as a shorthand. A simple example is $V = \mathbb{C}^k$, where $\mathfrak{gl}_k$ acts by matrix-vector multiplication. This is called the standard representation of $\mathfrak{gl}_k$. In this paper we examine the space of metapolynomials $\mathbb{C}[\mathbb{C}[x_1,\dots,x_k]_d]_\delta$ as a Lie algebra representation of $\mathfrak{gl}_k$. The Lie algebra action is explicitly given in Claim 4.2.

Representation theory of Lie algebras is used to study the representation theory of Lie groups because it allows one to “linearize” a group action. More precisely, every Lie group $G$ has an associated Lie algebra $\mathfrak{g}$ and every Lie group representation (a non-linear map) naturally defines a Lie algebra representation (a linear map). All Lie algebra representations we will encounter arise from this correspondence. We will describe the $\mathfrak{gl}_k$ action on polynomials and metapolynomials explicitly in Claim 4.1 and Claim 4.2, which may be taken as the definition. The details of the correspondence between Lie groups and Lie algebras are not crucial for the rest of the paper. We briefly outline them in Remark 4.3.

For a vector space $V$ with basis $v_1,\dots,v_r$, let $v_1^*,\dots,v_r^* \in V^*$ be the dual basis of $V^*$. Explicitly, $v_\ell^*$ is the linear form mapping a vector $v = \sum_i c_i v_i \in V$, with $c_i \in \mathbb{C}$, to the coefficient $c_\ell$. The basis dual to the standard basis of $\mathbb{C}^k$ is the set of variables $x_1,\dots,x_k$. Let $E_{i,j} \in \mathfrak{gl}_k$ be the $k\times k$ matrix having 1 at the entry $(i,j)$ and zero elsewhere. Regarding $E_{i,j} : \mathbb{C}^k \to \mathbb{C}^k$ as a linear map, we have $x_\ell \circ E_{i,j} = 0$ if $\ell \neq i$ and $x_\ell \circ E_{i,j} = x_j$ if $\ell = i$; here $\circ$ denotes the composition of functions. Since $\{E_{i,j}\}_{i,j}$ forms a basis for $\mathfrak{gl}_k$, the action of $\mathfrak{gl}_k$ on a representation $V$ is uniquely determined by the action of the $E_{i,j}$'s.

Claim 4.1 (Action on polynomials).

The vector space [x1,,xk]d of homogeneous degree d polynomials is a representation of 𝔤𝔩k. It is the linearization of the action of GLk acting by base change on the variables x1,,xk. For f[x1,,xk]d, the action is defined by the shifted partial derivative

Ei,j.f==1k(xEi,j)xf=xjxif. (1)

In particular, for a monomial $f = x_{\ell_1}\cdots x_{\ell_d}$, we have

$E_{i,j}.f = \sum_{p=1}^{d} x_{\ell_1}\cdots x_{\ell_{p-1}}\,(E_{i,j}.x_{\ell_p})\,x_{\ell_{p+1}}\cdots x_{\ell_d}.$

For a multi-index $\mu \in \mathbb{N}^k$ with $\sum_i \mu_i = d$, let $c_\mu$ be the corresponding metavariable, that is, the linear map sending a polynomial $f$ to the coefficient of $x^\mu$. Let $e_1,\dots,e_k \in \mathbb{N}^k$ be the standard basis vectors: $e_i$ is the vector with 1 at the $i$-th entry and 0 elsewhere.

Claim 4.2 (Action on metapolynomials).

The vector space of metapolynomials of format $(\delta,d,k)$, that is, $\mathbb{C}[\mathbb{C}[x_1,\dots,x_k]_d]_\delta = \mathbb{C}[c_\mu : \mu \in \mathbb{N}^k, |\mu| = d]_\delta$, is a representation of $\mathfrak{gl}_k$. It is the linearization of the action of $GL_k$ acting by base change on the variables $x_1,\dots,x_k$. For a metapolynomial $\Delta \in \mathbb{C}[c_\mu : \mu \in \mathbb{N}^k, |\mu| = d]_\delta$, the action is given by

$E_{i,j}.\Delta = \sum_{\mu} (E_{i,j}.c_\mu)\,\partial_{c_\mu}\Delta, \qquad (2)$

where the summation is over all multi-indices $\mu \in \mathbb{N}^k$ with $\sum_i \mu_i = d$. The action on metavariables is given by

$E_{i,j}.c_\mu = \begin{cases} (\mu_i + 1)\,c_{\mu + e_i - e_j} & \text{if } i \neq j,\\ \mu_i\, c_\mu & \text{otherwise}, \end{cases} \qquad (3)$

where we set $c_{\mu + e_i - e_j} = 0$ if $\mu_j = 0$. In particular, for a metamonomial $\Delta = c_{\mu_1}\cdots c_{\mu_\delta}$, we have

$E_{i,j}.\Delta = \sum_{p=1}^{\delta} c_{\mu_1}\cdots c_{\mu_{p-1}}\,(E_{i,j}.c_{\mu_p})\,c_{\mu_{p+1}}\cdots c_{\mu_\delta}.$
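For concreteness, the following Python sketch (data representation ours, indices 0-based) implements the action of $E_{i,j}$ on metamonomials exactly as in Equation (3) together with the product rule above.

```python
from collections import defaultdict

# A metamonomial is a tuple of multi-indices mu (one per metavariable factor);
# a metapolynomial is a dict mapping metamonomials to coefficients.
def act_on_metavariable(i, j, mu):
    """E_{i,j}.c_mu following Equation (3); returns (scalar, new multi-index) or None."""
    if i == j:
        return (mu[i], mu)
    if mu[j] == 0:
        return None
    new = list(mu)
    new[i] += 1
    new[j] -= 1
    return (mu[i] + 1, tuple(new))

def act_on_metapolynomial(i, j, Delta):
    """E_{i,j}.Delta via the product rule over the factors of each metamonomial."""
    result = defaultdict(int)
    for monomial, coeff in Delta.items():
        for p, mu in enumerate(monomial):
            image = act_on_metavariable(i, j, mu)
            if image is None or image[0] == 0:
                continue
            scalar, new_mu = image
            new_monomial = tuple(sorted(monomial[:p] + (new_mu,) + monomial[p + 1:]))
            result[new_monomial] += coeff * scalar
    return {m: c for m, c in result.items() if c != 0}

# E_{1,1} acting on c_{(3,0)}^2 in format (2,3,2): eigenvalue 3 + 3 = 6.
Delta = {((3, 0), (3, 0)): 1}
print(act_on_metapolynomial(0, 0, Delta))   # {((3, 0), (3, 0)): 6}
```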

The following remark illustrates the correspondence between representations of complex algebraic Lie groups and representations of the corresponding Lie algebras. As mentioned before, it is not essential for the rest of the paper, but the proofs of Claim 4.1 and Claim 4.2 are built on this correspondence.

 Remark 4.3.

A complex algebraic Lie group is a group with the structure of a complex algebraic variety, [32, Ch. 7]. One can show that GLk is such a group. Let V be a finite-dimensional vector space. A representation of a complex algebraic Lie group G on V is a group homomorphism ρ:GGL(V) which is a regular map, in the sense that the entries of ρ(g)GL(V) regarded as a matrix in a fixed basis, are polynomial functions on G.

As a vector space, the Lie algebra of $G$ is, by definition, the tangent space of $G$ at its identity element: $\mathfrak{g} = T_{\mathrm{id}}G$. The bracket operation is defined via the adjoint representation of $G$ on $\mathfrak{g}$, see [32, Ch. 8]. In the case of $GL_k$, one has that $\mathfrak{gl}_k \cong T_{\mathrm{id}_k}GL_k$ is the space of $k\times k$ matrices, and the bracket is the standard commutator.

Every representation $\rho : G \to GL(V)$ of $G$ defines a representation of $\mathfrak{g}$ via its differential at the identity: $d\rho : T_{\mathrm{id}}G \to T_{\mathrm{id}}GL(V)$; this is a linear map and defines a Lie algebra homomorphism from $T_{\mathrm{id}}G = \mathfrak{g}$ to $T_{\mathrm{id}}GL(V) = \mathrm{End}(V) = \mathfrak{gl}(V)$.

Explicitly, the value of $d\rho$ at $X \in \mathfrak{g}$ is the element $d\rho(X) \in \mathfrak{gl}(V) = \mathrm{End}(V)$ defined by

$(d\rho(X))\,v = \tfrac{d}{ds}\big|_{s=0}\big(\rho(\gamma(s))\,v\big) \quad \text{ for every } v \in V; \qquad (4)$

here $\gamma$ is any smooth curve in $G$ such that $\gamma(0) = \mathrm{id}_G$ and $\gamma'(0) = X$. It is easy to see that $d\rho(X)$ does not depend on the choice of $\gamma$.

Proof of Claim 4.1 and Claim 4.2.

We derive the Lie algebra representations from their respective group representations, as in Remark 4.3.

Every representation ρ:GGL(V) defines a pullback representation on the space of polynomials of degree p on V, [V]p. For gG and f[V]p, this is given by gf=fρ(g1) where ρ(g1):VV is regarded as a linear map and denotes the composition of functions.

The standard representation of GLk on k induces in this way a representation on the space of polynomials [x1,,xk]d. Similarly, the action of GL([x1,,xk]d) induces a representation on the space of metapolynomials [[x1,,xk]d]δ; the action of GLk on metapolynomials is the composition of the two representations

GLkGL([x1,,xk]d)GL([[x1,,xk]d]δ).

Now, if ρ:GGL(V) induces the Lie algebra representation dρ:𝔤GL(V), the pullback representation of G on [V]p induces the Lie algebra representation described as follows. For X𝔤, let γ(s) be a smooth curve in G such that γ(0)=𝗂𝖽G and γ(0)=X. The chain rule of derivatives implies dds|s=0γ(s)1=X. We show this in the case where G is a group of matrices: in this case, one can multiply elements of G by elements of 𝔤 and obtain

dds|s=0γ(s)1=[dds|s=0γ(s)][(γ(s)2)|s=0]=γ(0)γ(0)2=X𝗂𝖽2=X.

In general, the multiplication of matrices should be replaced by the action of G on 𝔤.

Let V be the dual space of V with basis z1,,zr; in particular [V]p is the space of polynomials of degree p in z1,,zr. By Equation 4, for every F[V]p, X.F is the polynomial function described as follows, on every ζV:

X.F(ζ)= dds|0(γ(s)F)(ζ)=
dds|0(Fγ(s)1)(ζ)=(using chain rule)
=1rzF(ζ)(z(dds|0γ(s)1ζ))=
=1rzF(ζ)(z(X.ζ))==1rzF(ζ)z(X.ζ).

Specialize the calculation above to the case of polynomials and metapolynomials to obtain Equation 1 and Equation 2. It remains to show Equation 3, which follows from a direct calculation: regarding cμ as a function of a polynomial f, we have

Eij.cμ(f)=cμ(Eij.f)=cμ(xjxif)={(μi+1)cμ+eiej(f) if ij,μicμ(f) otherwise.

Let V be a representation of a Lie algebra 𝔤. A subspace VV that is itself a 𝔤-representation is called a subrepresentation. A representation V is irreducible if it contains no subrepresentations except the zero space of V and V itself. A map ϕ:VW between two representations is called equivariant if it commutes with the action of 𝔤. Schur’s Lemma characterizes equivariant maps between irreducible representations:

Lemma 4.4 (Schur’s Lemma, [32, Lem. 1.7]).

Let V,W be irreducible representations for a Lie algebra 𝔤 and let ϕ:VW be an equivariant map. Then either ϕ=0 or ϕ is an isomorphism. Moreover, if V=W, then ϕ=c𝗂𝖽V for some c.

A complex algebraic Lie group is called reductive if every representation is completely reducible, namely it splits into direct sum of irreducible subrepresentations; the general linear group GLk is reductive, see e.g. [45, Ch.X]. In the following 𝔤 is the Lie algebra of a reductive group, and explicitly we are interested in the case where G=GLk and 𝔤=𝔤𝔩k.

4.1 Cartan subalgebras and weights

A subalgebra $\mathfrak{t} \subseteq \mathfrak{g}$ is called abelian if $[\mathfrak{t},\mathfrak{t}] = 0$. A Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ is a maximal abelian subalgebra with the property that if $[X,\mathfrak{h}] \subseteq \mathfrak{h}$ then $X \in \mathfrak{h}$. The subalgebra of diagonal matrices in $\mathfrak{gl}_k$ is a Cartan subalgebra of $\mathfrak{gl}_k$. It turns out that if $G$ is reductive, then all Cartan subalgebras of $\mathfrak{g}$ have the same dimension; in fact they are all equivalent under a natural action of $G$ on $\mathfrak{g}$ [32, Sec. D3]. In the case of $GL_k$, the action on $\mathfrak{gl}_k$ is given by conjugation: $g \cdot X = gXg^{-1}$ for $g \in GL_k$ and $X \in \mathfrak{gl}_k$. In this case, one can show that any Cartan subalgebra is equivalent to the subspace of diagonal matrices.

If 𝔤 is the Lie algebra of a complex reductive Lie group G, then Cartan subalgebras of 𝔤 correspond to maximal abelian subgroups in G. For a fixed Cartan subalgebra 𝔥, let 𝕋 be the corresponding abelian subgroup of G. If 𝔥 is the subset of diagonal matrices in 𝔤𝔩k, then 𝔥 is the Lie algebra of the abelian group 𝕋 of diagonal matrices with nonzero diagonal entries. Irreducible 𝕋-representations are 1-dimensional and they are in one-to-one correspondence with a lattice Λ in the space 𝔥, called the weight lattice of 𝔤; the elements of Λ are called weights. When 𝕋 is the subgroup of GLk of diagonal matrices, then this definition coincides with the one used in Section 1 for metapolynomials; see also Example 4.7. The construction of the weight lattice is not straightforward for arbitrary Lie algebras, but it is very explicit in the case where 𝔥 is the subalgebra of diagonal elements in 𝔤𝔩k:

Claim 4.5.

The weight lattice of $\mathfrak{gl}_k$ with respect to the Cartan subalgebra $\mathfrak{h}$ of diagonal elements is the abelian group of integer linear combinations of the diagonal elements, that is, the group of linear functions

$\alpha : \mathfrak{h} \to \mathbb{C}, \qquad \mathrm{diag}(h_1,\dots,h_k) \mapsto \alpha_1 h_1 + \cdots + \alpha_k h_k,$

with $\alpha_1,\dots,\alpha_k \in \mathbb{Z}$. In the following, we usually identify the weight $\alpha$ with the $k$-tuple of its integer coefficients $\alpha = (\alpha_1,\dots,\alpha_k)$.

Proof.

Let 𝕋 be the subgroup of diagonal matrices in GLk. Let (α1,,αk)k be an integer vector.

$\rho : \mathbb{T} \to \mathbb{C}^{\times} = GL_1, \qquad \mathrm{diag}(t_1,\dots,t_k) \mapsto t_1^{\alpha_1}\cdots t_k^{\alpha_k}$

is a group homomorphism, hence it defines a 1-dimensional (irreducible) representation of 𝕋. It is not hard to prove that all irreducible representations of 𝕋 arise in this way.

The induced 𝔥-representation is

$d\rho : \mathfrak{h} \to \mathfrak{gl}_1, \qquad \mathrm{diag}(h_1,\dots,h_k) \mapsto \alpha_1 h_1 + \cdots + \alpha_k h_k.$

In other words, irreducible 𝔥-representations induced from 𝕋-representations are in one-to-one correspondence with functions α𝔥 mapping a diagonal element H𝔥 to an integer linear combination of its diagonal entries.

Every $\mathfrak{g}$-representation $V$ restricts to an $\mathfrak{h}$-representation, hence it decomposes into a direct sum of irreducible 1-dimensional $\mathfrak{h}$-representations. Since these are uniquely determined by their weights, we have $V = \bigoplus_{\alpha\in\Lambda} V_\alpha^{\oplus m_\alpha}$; in particular $m_\alpha \neq 0$ only for finitely many $\alpha \in \Lambda$. If $m_\alpha \neq 0$, then $\alpha$ is called a weight of $V$, $m_\alpha$ is called its multiplicity and the direct summand $V_\alpha^{\oplus m_\alpha}$ is called the weight subspace of $V$ of weight $\alpha$. An element $X \in \mathfrak{h}$ acts simultaneously on the $m_\alpha$ copies of $V_\alpha$ via scalar multiplication by $\alpha(X)$. The elements of a weight subspace are called weight vectors; a basis of $V$ consisting of weight vectors is called a weight basis.

Example 4.6.

Consider the 𝔤𝔩3-representation V=[x1,x2,x3]2 with the action described in ˜4.1. Let 𝔥𝔤𝔩3 be the subalgebra of diagonal matrices. Consider the six monomials of V, which define a basis:

x12,x22,x32,x2x3,x1x3,x1x2.

One can verify that they form a weight basis of V. For instance, given H=(t1t2t3)=t1E11+t2E22+t3E33𝔥, we have

H.(x12)=itiEii.(x12)=itijxjxj(x12)=(2t1)x12

showing that x12 is a weight vector of weight (2,0,0). More generally, we have

H.(x12)=2t1x12,H.(x22)=2t2x22,H.(x32)=2t3x32,H.(x2x3)=(t2+t3)x2x3,H.(x1x3)=(t1+t3)x1x3,H.(x1x2)=(t1+t2)x1x2.

This shows that V decomposes into the sum of six weight spaces, each spanned by one of the weight vectors above. The corresponding weights are the six elements of 𝔥 mapping H to the eigenvalues of H on each weight vector. As k-tuple of integers, they are

(2,0,0),(0,2,0),(0,0,2),(0,1,1),(1,0,1),(1,1,0).

More generally, in a space of polynomials V=[x1,,xk]d, the monomials form a weight basis. The weight of a monomial xμ=x1μ1xkμk, regarded as a vector of integers, is μ.

Example 4.7.

Consider the $\mathfrak{gl}_2$-representation $V = \mathbb{C}[\mathbb{C}[x_1,x_2]_3]_2$ of metapolynomials of format $(2,3,2)$, with the action described in Claim 4.2. One can verify that the metamonomials form a weight basis. For instance, given $H = \mathrm{diag}(t_1,t_2) = t_1E_{11} + t_2E_{22} \in \mathfrak{h}$, we have

$H.(c_{30}^2) = \sum_{i=1}^2 t_i\,E_{ii}.(c_{30}^2) = \sum_{i=1}^2 t_i \sum_\mu (E_{ii}.c_\mu)\,\partial_{c_\mu}(c_{30}^2) = 3\cdot 2\, t_1\, c_{30}^2 = (6t_1)\, c_{30}^2,$

showing that $c_{30}^2$ is a weight vector of weight $(6,0)$. Similarly, one can compute the weight of every metamonomial:

$H.(c_{21}^2) = (4t_1+2t_2)\,c_{21}^2, \quad H.(c_{12}^2) = (2t_1+4t_2)\,c_{12}^2, \quad H.(c_{03}^2) = (6t_2)\,c_{03}^2,$
$H.(c_{30}c_{21}) = (5t_1+t_2)\,c_{30}c_{21}, \quad H.(c_{30}c_{12}) = (4t_1+2t_2)\,c_{30}c_{12}, \quad H.(c_{30}c_{03}) = (3t_1+3t_2)\,c_{30}c_{03},$
$H.(c_{21}c_{12}) = (3t_1+3t_2)\,c_{21}c_{12}, \quad H.(c_{21}c_{03}) = (2t_1+4t_2)\,c_{21}c_{03}, \quad H.(c_{12}c_{03}) = (t_1+5t_2)\,c_{12}c_{03}.$

Therefore $V$ has seven weight spaces of weight $(p, 6-p)$ with $p = 0,\dots,6$. The weight spaces of weight $(4,2)$, $(3,3)$, $(2,4)$ have dimension 2, the others have dimension 1.

More generally, the metamonomials give a weight basis for the space of metapolynomials of format $(\delta,d,k)$. A metamonomial $c_{\mu_1}\cdots c_{\mu_\delta}$ has weight $\mu_1 + \cdots + \mu_\delta$, in accordance with what we defined in the introduction.
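The weight-space dimensions in Example 4.7 can be recomputed directly from this description; the short Python script below (ours) enumerates the metamonomials of format $(2,3,2)$ and tallies their weights.

```python
from itertools import combinations_with_replacement
from collections import Counter

# Metamonomials of format (delta, d, k) = (2, 3, 2): products of two metavariables c_mu
# with mu a multi-index of total degree 3 in 2 variables.
multi_indices = [(3, 0), (2, 1), (1, 2), (0, 3)]
metamonomials = list(combinations_with_replacement(multi_indices, 2))

def weight(metamonomial):
    return tuple(sum(mu[i] for mu in metamonomial) for i in range(2))

dimensions = Counter(weight(m) for m in metamonomials)
print(sorted(dimensions.items()))
# [((0, 6), 1), ((1, 5), 1), ((2, 4), 2), ((3, 3), 2), ((4, 2), 2), ((5, 1), 1), ((6, 0), 1)]
```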

4.2 Roots and highest weights

An important representation is given by the action of the Lie algebra on itself, called the adjoint representation $\mathrm{ad} : \mathfrak{g} \to \mathrm{End}(\mathfrak{g})$ and defined by $\mathrm{ad}(X) = [X, \cdot\,]$. By definition, the weight space of weight 0 is $\mathfrak{h}$ itself; moreover, one can prove that all other weight spaces are 1-dimensional [32, Ch. 14]. The nonzero weights of the adjoint representation are called the roots of the Lie algebra $\mathfrak{g}$ (with respect to $\mathfrak{h}$); let $\Phi_{\mathfrak{g}} \subseteq \Lambda$ denote the set of roots of $\mathfrak{g}$. It turns out that $\eta \in \mathfrak{h}^*$ is a root if and only if $-\eta$ is a root, and $\pm\eta$ are the only integer multiples of $\eta$ that are roots.

We compute the roots for 𝔤𝔩k with 𝔥 the algebra of diagonal matrices. Let H=diag(t1,,tk)𝔥. Then

$\mathrm{ad}(H)\,E_{i,j} = HE_{i,j} - E_{i,j}H = t_iE_{i,j} - t_jE_{i,j} = (t_i - t_j)\,E_{i,j}. \qquad (5)$

So $\mathbb{C}E_{i,j}$ is the weight space corresponding to the weight $(e_i - e_j) \in \mathfrak{h}^*$. Since the matrices $E_{i,j}$ span $\mathfrak{gl}_k$, we obtain that the set of roots is $\Phi_{\mathfrak{gl}_k} = \{e_i - e_j \mid i,j \in [k],\ i \neq j\}$.
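Equation (5) can be checked numerically; the following numpy snippet (ours) verifies it for $\mathfrak{gl}_3$ with a sample diagonal $H$.

```python
import numpy as np

k = 3
def E(i, j):
    M = np.zeros((k, k))
    M[i, j] = 1.0
    return M

t = np.array([5.0, 2.0, -1.0])
H = np.diag(t)

# ad(H) E_{i,j} = H E_{i,j} - E_{i,j} H = (t_i - t_j) E_{i,j}   (Equation (5))
for i in range(k):
    for j in range(k):
        if i != j:
            bracket = H @ E(i, j) - E(i, j) @ H
            assert np.allclose(bracket, (t[i] - t[j]) * E(i, j))
print("all roots e_i - e_j verified for gl_3")
```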

The roots of the Lie algebra shift the weights of the Lie algebra representations: more precisely, let V be a representation of 𝔤, vV a weight vector of weight λ and X𝔤α for some root α; then X.v is a weight vector of weight λ+α.

A choice of positive roots is a subset $\Phi_{\mathfrak{g}}^+$ of $\Phi_{\mathfrak{g}}$ with the following two properties: for every root $\alpha$, either $\alpha$ or $-\alpha$ belongs to $\Phi_{\mathfrak{g}}^+$; and for every pair of roots $\alpha_1, \alpha_2$ such that $\alpha_1 + \alpha_2$ is a root, if $\alpha_1, \alpha_2 \in \Phi_{\mathfrak{g}}^+$ then $\alpha_1 + \alpha_2 \in \Phi_{\mathfrak{g}}^+$. For $\mathfrak{gl}_k$, one such choice is $\Phi_{\mathfrak{gl}_k}^+ = \{e_i - e_j \mid i < j\}$.

All choices of positive roots are equivalent under the action of a finite group 𝒲_𝔤, called the Weyl group of 𝔤. This group acts on 𝔥 and hence on 𝔥^*; this action sends the set of roots to itself, and all possible sets of positive roots are obtained one from the other via the action of 𝒲_𝔤. The Weyl group of 𝔤𝔩_k is the permutation group 𝔖_k on k elements. If 𝔥 is the Cartan subalgebra of diagonal matrices, then 𝒲_𝔤 acts by permuting the diagonal entries; the induced action on 𝔥^* permutes the weights e_i: for σ ∈ 𝔖_k, σ·e_i is the weight e_{σ^{-1}(i)}.

Since for every root α, either α or −α is positive, the weight decomposition of 𝔤 gives 𝔤 = 𝔥 ⊕ ⨁_{α∈Φ_𝔤^+}(𝔤_α ⊕ 𝔤_{−α}). In the case of 𝔤𝔩_k, this decomposition is 𝔤𝔩_k = 𝔥 ⊕ ⨁_{i<j}(ℂE_{ij} ⊕ ℂE_{ji}). A raising operator (resp. lowering operator) is an element X ∈ ⨁_{α∈Φ_𝔤^+} 𝔤_α (resp. X ∈ ⨁_{α∈Φ_𝔤^+} 𝔤_{−α}). Let 𝔫 = ⨁_{α∈Φ_𝔤^+} 𝔤_α be the space of raising operators; the condition that a root α_1 + α_2 is positive whenever α_1, α_2 are positive makes 𝔫 a Lie subalgebra of 𝔤.

For 𝔤𝔩k, the space 𝔫=span{Ei,j:i<j} of raising operators is the space of strictly upper triangular matrices and the space of lowering operators is the space of strictly lower triangular matrices.

Let V be a representation of 𝔤. A highest weight vector for V is a weight vector v ∈ V with the property that 𝔫.v = 0. The corresponding weight is called a highest weight for V. When 𝔤 = 𝔤𝔩_k and V is a space of metapolynomials, it is not hard to verify that this condition is equivalent to the characterization of highest weight metapolynomials in Section 1; the characterization in terms of tableaux is explained in detail in Section 4.5. A fundamental result in representation theory is that if V is irreducible, then V has a unique highest weight, and this highest weight uniquely determines V up to isomorphism [32, Prop. 14.13]. In particular, this result classifies the finite-dimensional irreducible representations of 𝔤.

Only a subset of weights occur as highest weights of finite-dimensional irreducible representations. Such weights are called dominant weights; their characterization in general requires the introduction of a suitable inner product structure on 𝔥^*, which is not straightforward. Let Λ^+ denote the subset of the weight lattice consisting of dominant weights. The representation associated to the highest weight λ is denoted V_λ; it is the unique irreducible 𝔤-representation having highest weight λ. In the case of 𝔤𝔩_k, the set of dominant weights consists of the weights λ = (λ_1,…,λ_k) such that λ_j ≥ λ_{j+1}. If λ_i ≥ 0 for every i, we usually identify the weight λ with a partition of length k.

Example 4.8.

We determine the highest weight vectors in the representations ℂ[x_1,x_2,x_3]_2 and ℂ[ℂ[x_1,x_2,x_3]_2]_3. The space of raising operators of 𝔤𝔩_3 is the space 𝔫 = span{E_{12}, E_{23}, E_{13}} of strictly upper triangular matrices. The highest weight vectors are the weight vectors characterized by the condition 𝔫.v = 0. Since E_{13} = [E_{12}, E_{23}], the condition 𝔫.v = 0 is equivalent to E_{12}.v = E_{23}.v = 0.

For ℂ[x_1,x_2,x_3]_2, one can check that the unique highest weight vector (up to scaling) is x_3^2 and the corresponding highest weight is (0,0,2). Therefore ℂ[x_1,x_2,x_3]_2 is irreducible, and it is isomorphic to the 𝔤𝔩_3-representation V_{(0,0,2)}.

For ℂ[ℂ[x_1,x_2,x_3]_2]_3, the highest weight vectors are the metapolynomials

c_{2,0,0}^3,   c_{2,0,0}c_{1,1,0}^2 − 4c_{2,0,0}^2c_{0,2,0},   and   4c_{2,0,0}c_{0,2,0}c_{0,0,2} − c_{0,0,2}c_{1,1,0}^2 + c_{1,0,1}c_{1,1,0}c_{0,1,1} − c_{0,2,0}c_{1,0,1}^2 − c_{2,0,0}c_{0,1,1}^2.

Their weights are (6,0,0), (4,2,0) and (2,2,2). Therefore ℂ[ℂ[x_1,x_2,x_3]_2]_3 is a direct sum of three irreducible 𝔤𝔩_3-representations

ℂ[ℂ[x_1,x_2,x_3]_2]_3 = V_{(6,0,0)} ⊕ V_{(4,2,0)} ⊕ V_{(2,2,2)}.
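The conditions E_{12}.v = E_{23}.v = 0 can be verified mechanically. The sympy sketch below is our own; it assumes the coefficient action E_{i,j}.c_μ = ν_i·c_ν with ν = μ + e_i − e_j (and 0 if μ_j = 0), which is our reading of ˜4.2 and is consistent with the computations in Examples 4.14 and 5.4.

```python
# A sympy sketch (ours) checking that the three metapolynomials above are highest weight vectors.
import itertools
import sympy as sp

k, d = 3, 2
mus = [mu for mu in itertools.product(range(d + 1), repeat=k) if sum(mu) == d]
c = {mu: sp.Symbol('c' + ''.join(map(str, mu))) for mu in mus}

def E_act(i, j, f):
    """Derivation action of E_{i+1,j+1} (0-indexed i, j) on a polynomial f in the c_mu."""
    out = 0
    for mu in mus:
        if mu[j] == 0:
            continue
        nu = list(mu); nu[j] -= 1; nu[i] += 1
        out += nu[i] * c[tuple(nu)] * sp.diff(f, c[mu])
    return sp.expand(out)

c200, c020, c002 = c[(2,0,0)], c[(0,2,0)], c[(0,0,2)]
c110, c101, c011 = c[(1,1,0)], c[(1,0,1)], c[(0,1,1)]

highest_weight_vectors = [
    c200**3,
    c200*c110**2 - 4*c200**2*c020,
    4*c200*c020*c002 - c002*c110**2 + c101*c110*c011 - c020*c101**2 - c200*c011**2,
]
for v in highest_weight_vectors:
    assert E_act(0, 1, v) == 0 and E_act(1, 2, v) == 0   # E_12.v = E_23.v = 0
print("all three metapolynomials are highest weight vectors")
```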

4.3 Universal enveloping algebra

The universal enveloping algebra of a Lie algebra 𝔤 is an associative algebra associated to the Lie algebra 𝔤. It is used to study representation theory of Lie algebras using representation theory of associative algebras.

A representation of an associative (unital) algebra 𝒜 is a vector space V equipped with a homomorphism of associative algebras ρ:𝒜End(V). For vV and U𝒜, write U.v=ρ(U)v, as in the Lie algebra setting. Subrepresentations and equivariant maps are defined for representations of an associative algebra in the same way as for Lie algebra representations. For a subset S𝒜, the subalgebra generated by S is the smallest subalgebra of 𝒜 containing S; explicitly, it can be realized as the set of (noncommutative) polynomials in elements of S.

Let 𝔤 be a Lie algebra. Consider a representation ρ:𝔤End(V). Since End(V) is an associative algebra, we can consider the subalgebra of End(V) generated by the linear maps in the image ρ(𝔤). Since ρ is a representation, these generators satisfy the relations

ρ(X)ρ(Y) − ρ(Y)ρ(X) = ρ([X,Y]). (6)

The universal enveloping algebra 𝒰(𝔤) is an associative algebra constructed in such a way that the relations (6) are forced for every representation ρ of 𝒰(𝔤). It is a quotient of the tensor algebra 𝒯(𝔤) over 𝔤. The tensor algebra is spanned by formal products of elements of 𝔤,

𝒯(𝔤) = ⨁_{m=0}^{∞} 𝔤^{⊗m} = ℂ ⊕ 𝔤 ⊕ 𝔤^{⊗2} ⊕ 𝔤^{⊗3} ⊕ ⋯ (7)

with multiplication induced by the tensor product ⊗ : 𝔤^{⊗p} × 𝔤^{⊗q} → 𝔤^{⊗(p+q)}. In particular, the tensor algebra is graded, with 𝔤^{⊗p} being the component of degree p. Consider the ideal ℐ(𝔤) ⊂ 𝒯(𝔤) generated by the elements of the form

X⊗Y − Y⊗X − [X,Y]

for X, Y ∈ 𝔤. The universal enveloping algebra of 𝔤 is defined as 𝒰(𝔤) = 𝒯(𝔤)/ℐ(𝔤). The multiplication in 𝒰(𝔤) is denoted by the usual multiplication sign rather than ⊗. The inclusion of 𝔤 into 𝒯(𝔤) (as a vector space) gives rise to a linear map 𝔤 → 𝒰(𝔤), which is also injective. We identify 𝔤 with the image of this injection. The elements of 𝒯(𝔤), and therefore of 𝒰(𝔤), can be written as noncommutative polynomials in elements of 𝔤. When 𝔤 is abelian (for instance, in the case of a Cartan subalgebra), the ideal ℐ(𝔤) is homogeneous, since [X,Y] = 0; in this case 𝒰(𝔤) is the symmetric algebra of 𝔤, isomorphic to the polynomial ring on 𝔤.

By construction of the ideal ℐ(𝔤) we have XY − YX = [X,Y] in 𝒰(𝔤) for every X, Y ∈ 𝔤. This implies that every representation of 𝒰(𝔤) restricts to a Lie algebra representation of the Lie algebra 𝔤, via the embedding 𝔤 ⊂ 𝒰(𝔤); conversely, every representation ρ : 𝔤 → End(V) of 𝔤 uniquely extends to a representation of 𝒰(𝔤) by setting ρ(X_1X_2⋯X_ℓ) = ρ(X_1)ρ(X_2)⋯ρ(X_ℓ) for all X_i ∈ 𝔤.

Note that in general the ideal ℐ(𝔤) is not homogeneous, so 𝒰(𝔤) does not inherit the grading from the tensor algebra 𝒯(𝔤). However, 𝒰(𝔤) has a filtration, in the following sense: define 𝒰(𝔤)^{≤ℓ} = {X + ℐ(𝔤) : X ∈ ⨁_{m=0}^{ℓ} 𝔤^{⊗m}}; then 𝒰(𝔤)^{≤ℓ} ⊂ 𝒰(𝔤)^{≤(ℓ+1)} for every ℓ, and if X ∈ 𝒰(𝔤)^{≤p}, Y ∈ 𝒰(𝔤)^{≤q} then XY ∈ 𝒰(𝔤)^{≤(p+q)}.

In particular, the elements of 𝒰(𝔤)^{≤ℓ} are represented by noncommutative polynomials of degree at most ℓ in elements of 𝔤. The length of an element X ∈ 𝒰(𝔤) is the minimal ℓ such that X ∈ 𝒰(𝔤)^{≤ℓ}.

Theorem 4.9 (Poincaré–Birkhoff–Witt [42, Theorem 9.10]).

Let X_1,…,X_r be a basis of 𝔤. Then the set of ordered monomials X_{i_1}X_{i_2}⋯X_{i_ℓ} with i_1 ≤ i_2 ≤ ⋯ ≤ i_ℓ is a basis of 𝒰(𝔤). Moreover, the set of ordered monomials of length at most ℓ is a basis of the subspace 𝒰(𝔤)^{≤ℓ} of elements with length at most ℓ.

In fact, we will only use that the ordered monomials are a set of generators for the spaces 𝒰(𝔤)^{≤ℓ}. This is proved explicitly in Theorem B.1. The Poincaré–Birkhoff–Witt theorem (PBW theorem) shows that elements of 𝒰(𝔤) have a unique representation as noncommutative polynomials in X_1,…,X_r with ordered monomials, and that the length of an element is the degree of the longest monomial in this polynomial.

Example 4.10.

Consider 𝔤 = 𝔤𝔩_2 with a 𝔤𝔩_2-representation V. Then F ≔ E_{2,1}, E ≔ E_{1,2}, H_i ≔ E_{i,i} gives an ordered basis (F, E, H_1, H_2) of 𝔤𝔩_2. Examples of elements of 𝒰(𝔤𝔩_2) are H_1EF and 1 − (1/2)FE + (1/12)F^2E^2. Note that these are formal products, not matrix multiplication (general Lie algebras do not have this additional algebra structure). Taking v ∈ V, the second element acts as (1 − (1/2)FE + (1/12)F^2E^2).v = v − (1/2)F.(E.v) + (1/12)F.(F.(E.(E.v))) ∈ V. The PBW theorem guarantees that these elements can be written as polynomials with ordered monomials. Indeed, using [E,F] = H_1 − H_2 and [H_1, FE] = [H_1,F]E + F[H_1,E] = −FE + FE = 0, we have

H_1EF = H_1(FE + [E,F]) = H_1FE + H_1(H_1 − H_2) = FEH_1 + H_1^2 − H_1H_2.
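Since the reordered expression is an identity in 𝒰(𝔤𝔩_2), it must hold in every representation, which gives an easy sanity check. The numpy sketch below is ours; it picks the 3-dimensional representation ℂ[x_1,x_2]_2 with E_{i,j} acting as x_i ∂/∂x_j (one possible choice of representation, not necessarily the action of ˜4.1).

```python
# A numpy sanity check (ours) of H1*E*F = F*E*H1 + H1^2 - H1*H2 in a concrete gl_2-representation,
# namely C[x1,x2]_2 with basis (x1^2, x1*x2, x2^2).
import numpy as np

E  = np.array([[0., 1., 0.], [0., 0., 2.], [0., 0., 0.]])   # E_{1,2} = x1 d/dx2
F  = np.array([[0., 0., 0.], [2., 0., 0.], [0., 1., 0.]])   # E_{2,1} = x2 d/dx1
H1 = np.diag([2., 1., 0.])                                  # E_{1,1} = x1 d/dx1
H2 = np.diag([0., 1., 2.])                                  # E_{2,2} = x2 d/dx2

# X1 X2 ... Xl in U(gl_2) acts as rho(X1) rho(X2) ... rho(Xl), i.e. the rightmost factor acts first.
lhs = H1 @ E @ F
rhs = F @ E @ H1 + H1 @ H1 - H1 @ H2
assert np.allclose(lhs, rhs)
print("H1 E F = F E H1 + H1^2 - H1 H2 holds on C[x1,x2]_2")
```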

4.4 Central characters and Casimir elements

Let 𝔤 be a Lie algebra. An element X𝒰(𝔤) of the universal enveloping algebra is central if XY=YX for every Y𝒰(𝔤) (or equivalently every Y𝔤). The center 𝒵(𝔤) of 𝒰(𝔤) is the set of all central elements; 𝒵(𝔤) is a commutative associative subalgebra of 𝒰(𝔤).

A representation ρ : 𝔤 → End(V) of 𝔤 extends to a representation of 𝒰(𝔤), and if X ∈ 𝒵(𝔤) is a central element, the map ρ(X) : V → V is 𝒰(𝔤)-equivariant. If V is irreducible, then by Schur’s lemma (Lemma 4.4) ρ(X) is multiplication by some scalar χ_V(X). More precisely, ρ induces a 1-dimensional representation χ_V : 𝒵(𝔤) → ℂ. This representation is called the central character of V.

It is a crucial fact that for Lie algebras 𝔤 of reductive groups central characters separate non-isomorphic irreducible representations.

Theorem 4.11 ([65], see also [10, §VIII.8.5, Cor.2]).

Let 𝔤 be a reductive Lie algebra. If V and W are irreducible representations of 𝔤 and the central characters χV and χW coincide, then V and W are isomorphic.

It follows that for reductive 𝔤 the isotypic components of a 𝔤-representation are also isotypic components of the induced 𝒵(𝔤)-representation. More specifically, if a 𝔤-representation V decomposes as ⨁_{λ∈Λ} V_λ^{⊕m_λ} with pairwise non-isomorphic V_λ, then as a 𝒵(𝔤)-representation we have V ≅ ⨁_{λ∈Λ} χ_λ^{⊕ m_λ·dim V_λ}, where we denote by χ_λ the 1-dimensional representation of 𝒵(𝔤) corresponding to the central character of V_λ.

Example 4.12.

Consider the 𝔤𝔩_3-representation V = ℂ[x_1,x_2,x_3]_d of homogeneous degree-d polynomials in three variables (˜4.1). Consider the element I ≔ E_{1,1} + E_{2,2} + E_{3,3}. Clearly, I ∈ 𝒵(𝔤𝔩_3). We determine the value of the central character. Take f ∈ V. We find

I.f = x_1 ∂f/∂x_1 + x_2 ∂f/∂x_2 + x_3 ∂f/∂x_3 = d·f.

So χ_V(I) = d. As a simple application of Theorem 4.11, this allows us to conclude that ℂ[x_1,x_2,x_3]_d and ℂ[x_1,x_2,x_3]_{d′} contain no isomorphic irreducible subrepresentations unless d = d′.

When 𝔤 is the Lie algebra of a reductive group, 𝒵(𝔤) can be described explicitly as the ring of invariants for the action of the Weyl group on the Cartan subalgebra. Note that since the Cartan subalgebra 𝔥 is abelian, the universal enveloping algebra 𝒰(𝔥) is in fact commutative and can be identified with the polynomial algebra ℂ[𝔥]. Recall that the Weyl group 𝒲_𝔤 acts on 𝔥 and, therefore, on 𝒰(𝔥) ≅ ℂ[𝔥].

Theorem 4.13 (Harish–Chandra isomorphism, [44, Ch. 23];[6]).

Let 𝔤 be a reductive Lie algebra and let 𝔥 be a Cartan subalgebra. The center 𝒵(𝔤) of the universal enveloping algebra is isomorphic to the algebra of invariants 𝒰(𝔥)^{𝒲_𝔤}. Moreover, if dim 𝔥 = k, there exist algebraically independent C_1,…,C_k ∈ 𝒰(𝔤) such that 𝒵(𝔤) = ℂ[C_1,…,C_k]. The lengths of these elements coincide with the degrees of the fundamental invariants of 𝒰(𝔥)^{𝒲_𝔤}.

The generators C1,,Ck of the center 𝒵(𝔤) are known as Casimir elements. These generators are not unique.

Consider the case of 𝔤𝔩k, with 𝔥 the Cartan subalgebra of diagonal matrices. Then 𝒲𝔤𝔩k is the symmetric group 𝔖k, and it acts on 𝔥 by permuting the diagonal elements. Then 𝒰(𝔥) is the polynomial ring in k variables, and 𝒰(𝔥)𝒲𝔤𝔩k is the subring of polynomials invariant under permutation of variables: this is the classical ring of symmetric polynomials in k variables. It is generated, for instance, by the elementary symmetric polynomials, which have degrees 1,,k. These numbers give lengths of Casimir elements generating the center 𝒵(𝔤). There are several constructions of Casimir elements, one possible choice being

C_p ≔ Σ_{i_1=1}^{k} ⋯ Σ_{i_p=1}^{k} E_{i_1,i_2}E_{i_2,i_3}⋯E_{i_p,i_1}   with 1 ≤ p ≤ k. (8)
Example 4.14.

Let 𝔤=𝔤𝔩2. Then the Casimir elements are given by

C_1 = E_{1,1} + E_{2,2}   and   C_2 = E_{1,1}E_{1,1} + E_{1,2}E_{2,1} + E_{2,1}E_{1,2} + E_{2,2}E_{2,2}.

We will use ˜4.2 to evaluate the application of C1 and C2 to the metapolynomials

Δ = c_{11}^2 + 2c_{02}c_{20}   and   Γ = c_{11}^2 − 4c_{02}c_{20}.

We find that

C_1.Δ = E_{1,1}.Δ + E_{2,2}.Δ = (2c_{11}^2 + 4c_{02}c_{20}) + (2c_{11}^2 + 4c_{02}c_{20}) = 4Δ

and similarly C1.Γ=4Γ (in fact, one can show C1 always scales a metapolynomial of format (δ,d,k) by dδ). We also compute

E_{2,1}.Δ = 2c_{11}·2c_{02} + 2(0·c_{20} + c_{02}·c_{11}) = 6c_{02}c_{11},
(E_{1,2}E_{2,1}).Δ = E_{1,2}.(6c_{02}c_{11}) = 6c_{11}·c_{11} + 6·2c_{02}c_{20} = 6Δ.

By symmetry of multiindices in the metapolynomial we know (E2,1E1,2).Δ=6Δ too. An analogous computation shows E2,1.Γ=0 and E1,2.Γ=0. We find

C_2.Δ = (2^2 + 6 + 6 + 2^2)Δ = 20Δ   and   C_2.Γ = (2^2 + 0 + 0 + 2^2)Γ = 8Γ.

We obtain that Δ and Γ live in different isotypic components with respect to 𝒵(𝔤𝔩2), and hence with respect to 𝔤𝔩2.

The eigenvalues of these Casimir elements on irreducible representations of 𝔤𝔩k are computed by Perelomov and Popov [64]:

χ_λ(C_p) = 𝖳𝗋(A_λ^p E) (9)

where E is the k×k matrix consisting of all ones, and the entries of the k×k matrix A_λ = (a_{ij}(λ)) are given by

a_{ij}(λ) = λ_i + k − i if i = j,   a_{ij}(λ) = 0 if i > j,   and a_{ij}(λ) = −1 if i < j. (10)
Example 4.15.

Observe first that 𝖳𝗋(A_λ^p E) is simply the sum of the entries of A_λ^p. Consider again Example 4.14. We evaluate the formula in Equation 9:

A_{(4,0)} = [ 5 −1 ; 0 0 ], and thus χ_{(4,0)}(C_2) = 𝖳𝗋( [ 25 −5 ; 0 0 ] · [ 1 1 ; 1 1 ] ) = 20,
A_{(2,2)} = [ 3 −1 ; 0 2 ], and thus χ_{(2,2)}(C_2) = 𝖳𝗋( [ 9 −5 ; 0 4 ] · [ 1 1 ; 1 1 ] ) = 8.
Observe that this matches our results in Example 4.14. We give one more example, for 𝔤𝔩_3:
A_{(6,0,0)} = [ 8 −1 −1 ; 0 1 −1 ; 0 0 0 ], and thus χ_{(6,0,0)}(C_2) = 𝖳𝗋( [ 64 −9 −7 ; 0 1 −1 ; 0 0 0 ] · [ 1 1 1 ; 1 1 1 ; 1 1 1 ] ) = 48.
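The same eigenvalues can be obtained directly from Equations (9) and (10); the following numpy sketch (ours) reproduces the three values above.

```python
# A numpy sketch (ours) of the Perelomov-Popov formula, reproducing the eigenvalues 20, 8, 48.
import numpy as np

def A(lam):
    k = len(lam)
    M = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            if i == j:
                M[i, j] = lam[i] + k - (i + 1)   # lambda_i + k - i, with 1-based i
            elif i < j:
                M[i, j] = -1.0
    return M

def chi(lam, p):
    k = len(lam)
    return np.trace(np.linalg.matrix_power(A(lam), p) @ np.ones((k, k)))

print(chi((4, 0), 2), chi((2, 2), 2), chi((6, 0, 0), 2))   # 20.0 8.0 48.0
```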

4.5 Gelfand–Tsetlin theory

Gelfand–Tsetlin bases are used to explicitly construct irreducible representations of some classical Lie algebras. The explicit bases first appeared in [35] without much explanation. The construction was expanded upon by Želobenko [72] and independently by Baird and Biedenharn [4]. It can be explained in terms of restrictions of irreducible representations along a chain of Lie algebra inclusions.

Let 𝔞 be a Lie subalgebra of 𝔟. Every representation V of 𝔟 can be regarded as a representation of 𝔞 using this inclusion. The resulting representation is called the restriction of V to 𝔞 and is denoted V↓^𝔟_𝔞.

Let 𝔤_1 ⊂ 𝔤_2 ⊂ ⋯ ⊂ 𝔤_k be an inclusion chain of reductive Lie algebras. We call this chain a Gelfand–Tsetlin chain if 𝔤_1 is abelian and every irreducible representation V of 𝔤_ℓ (1 < ℓ ≤ k), when restricted to 𝔤_{ℓ−1}, is a multiplicity-free completely reducible representation, that is,

V_λ↓^{𝔤_ℓ}_{𝔤_{ℓ−1}} ≅ ⨁_{μ∈R(λ)} V_μ (11)

for some set R(λ) of isomorphism types of irreducible representations of 𝔤_{ℓ−1}. We write μ ≺ λ if V_μ appears in the decomposition of V_λ.

Let V_λ be an irreducible representation of 𝔤_k. By repeatedly restricting irreducible representations along the chain we obtain a decomposition of V_λ as a representation of 𝔤_1:

V_λ↓^{𝔤_k}_{𝔤_1} = ⨁_{λ^{(1)}≺λ^{(2)}≺⋯≺λ^{(k)}=λ} V_{λ^{(1)}}. (12)

Since 𝔤_1 is abelian, its irreducible representations are 1-dimensional, so the decomposition above distinguishes a set of 1-dimensional subspaces of V_λ indexed by chains λ^{(1)} ≺ λ^{(2)} ≺ ⋯ ≺ λ^{(k)} = λ, called Gelfand–Tsetlin patterns. We denote the subspace of V_{λ^{(k)}} corresponding to a Gelfand–Tsetlin pattern T = λ^{(1)} ≺ λ^{(2)} ≺ ⋯ ≺ λ^{(k)} by V_T.

The inclusions 𝔤_ℓ ⊂ 𝔤_k give inclusions of universal enveloping algebras 𝒰(𝔤_ℓ) ⊂ 𝒰(𝔤_k), and therefore a chain 𝒰(𝔤_1) ⊂ 𝒰(𝔤_2) ⊂ ⋯ ⊂ 𝒰(𝔤_k). The Gelfand–Tsetlin algebra 𝒢𝒵(𝔤_k) of 𝔤_k is the subalgebra of 𝒰(𝔤_k) generated by all centers 𝒵(𝔤_ℓ) with ℓ ≤ k. Since the center 𝒵(𝔤_ℓ) of 𝒰(𝔤_ℓ) commutes with all elements of 𝒰(𝔤_ℓ), it commutes in particular with the centers 𝒵(𝔤_j) for j ≤ ℓ; as a result, 𝒢𝒵(𝔤_k) is a commutative subalgebra.

Consider a Gelfand–Tsetlin pattern T = λ^{(1)} ≺ λ^{(2)} ≺ ⋯ ≺ λ^{(k)} and the corresponding 1-dimensional subspace V_T ⊂ V_{λ^{(k)}}. For every ℓ ≤ k the subspace V_T is contained in an irreducible representation of 𝔤_ℓ isomorphic to V_{λ^{(ℓ)}}, and the center 𝒵(𝔤_ℓ) acts on V_T via the central character χ_{λ^{(ℓ)}}. It follows that V_T is a 1-dimensional representation of 𝒢𝒵(𝔤_k) given by a Gelfand–Tsetlin character χ_T : 𝒢𝒵(𝔤_k) → ℂ such that X.v = χ_T(X)·v for X ∈ 𝒢𝒵(𝔤_k), v ∈ V_T. More specifically, on 𝒵(𝔤_ℓ) we have χ_T(X) = χ_{λ^{(ℓ)}}(X). Theorem 4.11 implies that different Gelfand–Tsetlin patterns correspond to different Gelfand–Tsetlin characters.

By the previous discussion, every completely reducible representation of 𝔤k is also completely reducible as a representation of 𝒢𝒵(𝔤k). That is, if V is a completely reducible representation of 𝔤k with isotypic decomposition V=λΛVλmλ, then the action of 𝒢𝒵(𝔤k) on V splits the decomposition further into Gelfand–Tsetlin characters as

V = ⨁_{T = λ^{(1)}≺⋯≺λ^{(k)}, λ^{(k)}∈Λ} χ_T^{⊕ m_{λ^{(k)}}}. (13)

We call the components of this decomposition GZ-isotypic spaces of V.

The classical case – used to construct a basis of irreducible representations of 𝔤𝔩_k – corresponds to the chain of inclusions 𝔤𝔩_1 ⊂ 𝔤𝔩_2 ⊂ ⋯ ⊂ 𝔤𝔩_k, where 𝔤𝔩_{k−1} is included in 𝔤𝔩_k as the subalgebra of matrices having the k-th row and the k-th column identically equal to 0. The Gelfand–Tsetlin algebra in this case is the polynomial algebra 𝒢𝒵(𝔤𝔩_k) = ℂ[C_{ℓ,p}] generated by the Casimir elements

C_{ℓ,p} ≔ Σ_{i_1=1}^{ℓ} ⋯ Σ_{i_p=1}^{ℓ} E_{i_1,i_2}E_{i_2,i_3}⋯E_{i_p,i_1}   with 1 ≤ ℓ ≤ k and 1 ≤ p ≤ ℓ. (14)

The irreducible representations of 𝔤𝔩_k correspond to highest weights λ = (λ_1,…,λ_k) with integer entries λ_1 ≥ ⋯ ≥ λ_k. The restriction V_λ↓^{𝔤𝔩_k}_{𝔤𝔩_{k−1}} is given by the classical branching rule (attributed to Schur in [58]), which states that μ ≺ λ if and only if μ and λ interleave, that is, λ_ℓ ≥ μ_ℓ ≥ λ_{ℓ+1} for all ℓ with 1 ≤ ℓ ≤ k−1. Moreover, E_{k,k} acts on the copy of V_μ in V_λ as multiplication by |λ| − |μ| = Σ_{ℓ=1}^{k} λ_ℓ − Σ_{ℓ=1}^{k−1} μ_ℓ.
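The interleaving condition is easy to enumerate; the short Python sketch below (our own) lists, for a given λ, all μ with μ ≺ λ.

```python
# A Python sketch (ours) enumerating the branching rule: all mu interleaving lambda,
# i.e. lambda_l >= mu_l >= lambda_{l+1} for 1 <= l <= k-1.
import itertools

def branch(lam):
    """All gl_{k-1} highest weights mu with mu interleaving lam."""
    ranges = [range(lam[l + 1], lam[l] + 1) for l in range(len(lam) - 1)]
    return list(itertools.product(*ranges))

print(branch((2, 1, 0)))   # [(1, 0), (1, 1), (2, 0), (2, 1)]
```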

A Gelfand–Tsetlin pattern is given by a sequence λ^{(1)} ≺ λ^{(2)} ≺ ⋯ ≺ λ^{(k)}, that is, by integers λ_i^{(ℓ)} with 1 ≤ ℓ ≤ k and 1 ≤ i ≤ ℓ such that λ_i^{(ℓ)} ≥ λ_i^{(ℓ−1)} ≥ λ_{i+1}^{(ℓ)}, often arranged into a triangular shape, see Figure 2.

Figure 2: Gelfand–Tsetlin pattern of 𝔤𝔩k.

The 1-dimensional subspace V_T corresponding to the pattern T = λ^{(1)} ≺ λ^{(2)} ≺ ⋯ ≺ λ^{(k)} is spanned by a weight vector of weight (|λ^{(1)}|, |λ^{(2)}|−|λ^{(1)}|, …, |λ^{(k)}|−|λ^{(k−1)}|). The generators C_{ℓ,p} of 𝒢𝒵(𝔤𝔩_k) act on this space as multiplication by the corresponding eigenvalues χ_T(C_{ℓ,p}). Since C_{ℓ,p} is a Casimir operator for 𝔤𝔩_ℓ, we have χ_T(C_{ℓ,p}) = χ_{λ^{(ℓ)}}(C_{ℓ,p}) = 𝖳𝗋(A_{λ^{(ℓ)}}^p E) by the Perelomov–Popov formula.

If λ^{(k)} is a partition, that is, all λ_i^{(k)} are nonnegative, then the Gelfand–Tsetlin patterns for 𝔤𝔩_k are in bijective correspondence with semistandard Young tableaux filled with numbers from {1,2,…,k}. Recall that a Young tableau is semistandard whenever the numbers are strictly increasing along the columns and weakly increasing along the rows.

A semistandard tableau T gives rise to a sequence of partitions λ^{(1)},…,λ^{(k)}, where λ^{(ℓ)} is the shape of the tableau T^{(ℓ)} obtained from T by removing all boxes labeled with numbers greater than ℓ. Semistandardness of T guarantees that all T^{(ℓ)} are also semistandard tableaux and the interleaving property λ_i^{(ℓ)} ≥ λ_i^{(ℓ−1)} ≥ λ_{i+1}^{(ℓ)} holds (see e.g. [56, p. 5]). The weight is given by the content of the tableau T, that is, the vector μ = (μ_1,…,μ_k) where μ_i records the number of occurrences of the label i in the tableau.

For example, the tableau with rows (1,1,2,2), (2,3) and (4,4) gives the sequence of shapes λ^{(1)} = (2), λ^{(2)} = (4,1), λ^{(3)} = (4,2,0), λ^{(4)} = (4,2,2,0). The corresponding weight is (2,3,1,2), the content of the tableau.
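The passage from a semistandard tableau to its sequence of shapes and its content is a purely combinatorial computation; the following Python sketch (ours; it drops trailing zeros from the shapes) reproduces the example above.

```python
# A Python sketch (ours): shapes lambda^(l) and content of a semistandard tableau given as rows.
def shapes_and_content(tableau, k):
    shapes = []
    for l in range(1, k + 1):
        # shape of the subtableau consisting of the boxes with entries <= l
        shape = tuple(sum(1 for e in row if e <= l) for row in tableau if any(e <= l for e in row))
        shapes.append(shape)
    content = tuple(sum(row.count(i) for row in tableau) for i in range(1, k + 1))
    return shapes, content

T = [[1, 1, 2, 2], [2, 3], [4, 4]]
print(shapes_and_content(T, 4))
# ([(2,), (4, 1), (4, 2), (4, 2, 2)], (2, 3, 1, 2))
```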

The superstandard tableau of shape λ = (λ_1,…,λ_k) (the tableau whose ℓ-th row is filled with the label ℓ) corresponds to the Gelfand–Tsetlin pattern with λ^{(ℓ)} = (λ_1,…,λ_ℓ). The corresponding 1-dimensional subspace of V_λ contains the highest weight vectors. To prove this, recall that since the E_{i,j} with i ≠ j span root spaces of 𝔤𝔩_k, they shift the weights of weight vectors, that is, E_{i,j}.v is a weight vector of weight μ + e_i − e_j. The content of the superstandard tableau of shape λ is λ, so every nonzero vector v in the corresponding 1-dimensional subspace has weight λ, and for every E_{i,j} the vector E_{i,j}.v has weight λ + e_i − e_j. But there are no semistandard tableaux with shape λ and content λ + e_i − e_j when i < j. Therefore, E_{i,j}.v = 0 if i < j, and it follows that v is annihilated by every strictly upper triangular matrix.

5 Efficient construction of projectors

In this section we prove Theorem 1.1. We explain the construction of the projectors in Section 5.1, and describe the efficient circuit implementation in Section 5.2.

5.1 Construction of projectors

We construct the required projections as projections onto a common eigenspace of several commuting matrices. The construction is elementary and is based on the classical construction of eigenprojectors from basic linear algebra.

We present the construction in full generality in Theorem 5.2. The projectors in the cases relevant for Theorem 1.1 are obtained by applying the general construction to suitable commutative subalgebras of the universal enveloping algebra 𝒰(𝔤𝔩_k), see Table 1. If 𝒜 is a commutative algebra generated by U_1,…,U_q, then a representation ρ : 𝒜 → End(V) is uniquely determined by ρ(U_1),…,ρ(U_q), which are q commuting endomorphisms of V. Every irreducible representation of 𝒜 is 1-dimensional, so it is given by an algebra homomorphism χ : 𝒜 → ℂ. In other words, an irreducible representation is spanned by a common eigenvector of the generators U_1,…,U_q of 𝒜 with the corresponding eigenvalues χ(U_1),…,χ(U_q).

Recall that a representation is called completely reducible if it can be decomposed into a direct sum of irreducible representations. All representations in this paper have this property. For a commutative algebra this means that the generators of the algebra act not merely by commuting linear maps, but by simultaneously diagonalizable ones.

We slightly abuse notation and denote the character χ : 𝒜 → ℂ and the corresponding irreducible representation by the same letter. Every completely reducible representation V of 𝒜 has a unique isotypic decomposition

V ≅ ⨁_{λ∈Λ} χ_λ^{⊕m_λ}, (15)

where Λ is a set indexing irreducible representations of 𝒜. We denote the isotypic component χ_λ^{⊕m_λ} in V by V_{𝒜,λ}.

The length of the projectors as elements of 𝒰(𝔤) depends mainly on the number of different isotypic components of the representation.

Definition 5.1.

Let V be a completely reducible representation of an algebra 𝒜. We call the number of isomorphism types of irreducible subrepresentations of V the diversity of V.

Theorem 5.2.

Let 𝒜 be a commutative algebra generated by U_1,…,U_q and let V be a completely reducible representation of 𝒜 with diversity at most p. For every irreducible representation χ_λ of 𝒜 there exists a polynomial f_λ of degree at most p−1 such that

f_λ(U_1,…,U_q).x = Π_λ x (16)

for every x ∈ V, where Π_λ : V → V is the unique 𝒜-equivariant projection onto the isotypic component V_{𝒜,λ}.

Proof.

Let Λ be the set of isomorphism types appearing in the isotypic decomposition of V, so that V = ⨁_{μ∈Λ} V_{𝒜,μ} with V_{𝒜,μ} ≠ 0. An element U ∈ 𝒜 acts as U.x = χ_μ(U)·x on V_{𝒜,μ}. If λ ∉ Λ, set f_λ = 0.

Fix λ ∈ Λ. For every μ ∈ Λ with μ ≠ λ, let U_{i_μ} be a generator such that χ_μ(U_{i_μ}) ≠ χ_λ(U_{i_μ}). Such a generator exists because χ_λ and χ_μ are different characters of 𝒜. Define

P_λ ≔ Π_{μ∈Λ, μ≠λ} (U_{i_μ} − χ_μ(U_{i_μ})) / (χ_λ(U_{i_μ}) − χ_μ(U_{i_μ})) ∈ 𝒜. (17)

Each factor (U_{i_μ} − χ_μ(U_{i_μ})) / (χ_λ(U_{i_μ}) − χ_μ(U_{i_μ})) acts on V as the zero map on V_{𝒜,μ} and as the identity map on V_{𝒜,λ}; moreover, its action maps each isotypic component to itself. The element P_λ acts as the composition of these factors; therefore its image in End(V) is the projection Π_λ onto V_{𝒜,λ}. Finally, since every factor is linear in U_1,…,U_q, P_λ is a polynomial in U_1,…,U_q. Its degree is at most p−1.
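The construction in the proof is effective and can be illustrated on commuting matrices; the numpy sketch below (our own toy example, with made-up generators and characters) builds the projector of Equation (17) and checks that it is the projection onto the chosen isotypic component.

```python
# A numpy sketch (ours) of the eigenprojector construction from Theorem 5.2.
import numpy as np

def isotypic_projector(gens, characters, target):
    """gens: commuting, simultaneously diagonalizable matrices; characters: list of
    eigenvalue tuples (entry m = eigenvalue of gens[m]); target: the tuple chi_lambda."""
    n = gens[0].shape[0]
    P = np.eye(n)
    for chi_mu in characters:
        if chi_mu == target:
            continue
        # pick a generator on which the two characters differ
        m = next(m for m in range(len(gens)) if chi_mu[m] != target[m])
        P = P @ (gens[m] - chi_mu[m] * np.eye(n)) / (target[m] - chi_mu[m])
    return P

# toy example: two commuting diagonal generators acting on C^4
U1 = np.diag([1.0, 1.0, 2.0, 2.0])
U2 = np.diag([0.0, 3.0, 0.0, 3.0])
chars = [(1, 0), (1, 3), (2, 0), (2, 3)]
P = isotypic_projector([U1, U2], chars, target=(1, 3))
assert np.allclose(P @ P, P)                     # it is a projector
assert np.allclose(P, np.diag([0, 1, 0, 0]))     # onto the (1,3)-eigenspace
print(P)
```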

The projections required in Theorem 1.1 arise as special cases of this construction. In each case, the image of the projection is an isotypic component of some commutative subalgebra 𝒜 ⊂ 𝒰(𝔤𝔩_k). Theorem 5.2 then gives a construction of the projectors as elements of 𝒰(𝔤𝔩_k), expressed as polynomials in the generators of 𝒜. The length of the projectors is easily bounded in terms of the lengths of the generators of 𝒜 in 𝒰(𝔤𝔩_k) and the diversity of ℂ[ℂ[x_1,…,x_k]_d]_δ as a representation of 𝒜. Indeed, if the generators U_1,…,U_q of 𝒜 have length at most ℓ, and the projectors are expressed as polynomials in U_1,…,U_q of degree at most p−1, then the lengths of the projectors are bounded by ℓ(p−1). The following table summarizes these parameters for the three cases considered.

Table 1: Isotypic components, generators of their corresponding subalgebras, and their diversity in ℂ[ℂ[x_1,…,x_k]_d]_δ. See Equation 8 and Equation 14 for the definitions of C_p and C_{ℓ,p}.
Isotypic components:  weight spaces  |  λ-isotypic spaces  |  T-isotypic spaces
Algebra 𝒜:  𝒰(𝔥) ≅ ℂ[𝔥]  |  𝒵(𝔤𝔩_k)  |  𝒢𝒵(𝔤𝔩_k)
Generators U_i:  {E_{i,i} : i ∈ [k]}  |  {C_p : p ∈ [k]}  |  {C_{ℓ,p} : ℓ ∈ [k], p ∈ [ℓ]}
Generator lengths:  1  |  k  |  k
Diversity:  ≤ (δd)^k  |  ≤ (δd)^k  |  ≤ (δd)^{k^2}
 Remark 5.3.

In the construction of Theorem 5.2 we require knowledge of the value of χ_μ on the generators U_i, for all irreducible representation types μ occurring in the representation. For the weight spaces these values are given by the weights themselves, and for the other two algebras they can be computed using Equation 9. It is also possible to remove any choices from the construction by taking, in each factor, a fixed linear combination U of all the U_i with algebraically independent irrational coefficients. Since the values χ_μ(U_i) are always integers, this ensures χ_μ(U) ≠ χ_λ(U) when μ ≠ λ.

Example 5.4.

We construct and evaluate a λ-isotypic projector for metapolynomials of format (3,2,3). From Example 4.8 we know Λ={(6,0,0),(4,2,0),(2,2,2)}. By Equation 9 we find that the Casimir element C2 scales these three irreducible representations with different scalars, namely 48, 28 and 12 respectively. Hence we can use C2 to construct the projectors from Theorem 5.2. Consider first the projector P(6,0,0) to the (6,0,0)-isotypic component. We have that

P_{(6,0,0)} = (C_2 − χ_{(2,2,2)}(C_2)) / (χ_{(6,0,0)}(C_2) − χ_{(2,2,2)}(C_2)) · (C_2 − χ_{(4,2,0)}(C_2)) / (χ_{(6,0,0)}(C_2) − χ_{(4,2,0)}(C_2)) = (C_2 − 12)/36 · (C_2 − 28)/20.

Consider again the example Δ=c002c020c200 from the introduction. We will compute P(6,0,0).Δ. Using the equations in ˜4.2 we find

C_2.Δ = Σ_{i=1}^{3} Σ_{j=1}^{3} (E_{i,j}E_{j,i}).Δ = 2c_{020}c_{101}^2 + 2c_{002}c_{110}^2 + 2c_{011}^2c_{200} + 24c_{002}c_{020}c_{200}.

It directly follows that

C_2.Δ − 28Δ = 2c_{020}c_{101}^2 + 2c_{002}c_{110}^2 + 2c_{011}^2c_{200} − 4c_{002}c_{020}c_{200}.

We apply C2 again (at this point a computer algebra program is recommended) and find

C_2.(C_2.Δ − 28Δ) = 48(c_{020}c_{101}^2 + c_{011}c_{101}c_{110} + c_{002}c_{110}^2 + c_{011}^2c_{200}).

Hence,

60·P_{(6,0,0)}.Δ = (C_2.(C_2.Δ − 28Δ) − 12(C_2.Δ − 28Δ)) / 12
= 2c_{020}c_{101}^2 + 4c_{011}c_{101}c_{110} + 2c_{002}c_{110}^2 + 2c_{011}^2c_{200} + 4c_{002}c_{020}c_{200}.

This is indeed the isotypic component as given in the introduction. The other two can be determined by an analogous computation.
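The computations in this example can be automated. The sympy sketch below (ours) implements the Casimir C_2 via the coefficient action E_{i,j}.c_μ = ν_i·c_ν with ν = μ + e_i − e_j (again our reading of ˜4.2) and evaluates the projector P_{(6,0,0)} = (C_2−12)/36 · (C_2−28)/20 on Δ, reproducing the displayed expressions.

```python
# A sympy sketch (ours) automating Example 5.4.
import itertools
import sympy as sp

k, d = 3, 2
mus = [mu for mu in itertools.product(range(d + 1), repeat=k) if sum(mu) == d]
c = {mu: sp.Symbol('c' + ''.join(map(str, mu))) for mu in mus}

def E_act(i, j, f):
    out = 0
    for mu in mus:
        if mu[j] == 0:
            continue
        nu = list(mu); nu[j] -= 1; nu[i] += 1
        out += nu[i] * c[tuple(nu)] * sp.diff(f, c[mu])
    return sp.expand(out)

def C2(f):
    """C_2 = sum_{i,j} E_ij E_ji acts as the composition with E_ji applied first."""
    return sp.expand(sum(E_act(i, j, E_act(j, i, f)) for i in range(k) for j in range(k)))

Delta = c[(0,0,2)] * c[(0,2,0)] * c[(2,0,0)]
step = sp.expand(C2(Delta) - 28 * Delta)                  # (C_2 - 28).Delta
proj = sp.expand((C2(step) - 12 * step) / (36 * 20))      # P_(6,0,0).Delta
print(sp.expand(C2(Delta)))    # 2*c020*c101**2 + 2*c002*c110**2 + 2*c011**2*c200 + 24*c002*c020*c200
print(sp.expand(60 * proj))    # the displayed expression for 60 * P_(6,0,0).Delta
```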

5.1.1 Weight projectors

Let V be a representation of a complex reductive Lie group G. The action of G on V induces the action of the corresponding Lie algebra 𝔤. Fix a maximal torus TG and let 𝔥𝔤 be the corresponding Cartan subalgebra. The torus T is reductive, and the isotypic components of V with respect to the action of T are exactly the weight spaces. The action of T induces the action of 𝔥 with the same isotypic components, so the action of 𝔥 is completely reducible.

Let H_1,…,H_r be a basis of the Cartan subalgebra 𝔥. Since 𝔥 is abelian, the universal enveloping algebra 𝒰(𝔥) ⊂ 𝒰(𝔤) is commutative. It can be identified with the polynomial algebra ℂ[H_1,…,H_r]. We can now apply Theorem 5.2 to obtain for every weight μ ∈ 𝔥^* an element H_μ ∈ 𝒰(𝔤) such that H_μ.v is the projection of v ∈ V onto the weight space of weight μ. The projectors H_μ are obtained as polynomial expressions H_μ = f_μ(H_1,…,H_r) with deg f_μ ≤ p−1, where p is the diversity of V as a representation of 𝔥 (or, equivalently, of 𝒰(𝔥)), that is, the number of weights of V. Since the H_i as elements of 𝒰(𝔤) have length 1, the length of H_μ is also bounded by p−1.

We can specialize this construction to the action of 𝔤𝔩_k on the space of metapolynomials W = ℂ[ℂ[x_1,…,x_k]_d]_δ. The standard Cartan subalgebra consists of all diagonal matrices, and we can take H_i ≔ E_{i,i}. The weights appearing in W correspond to k-tuples of nonnegative integers summing to δd (see Example 4.7). The number of such tuples is binom(δd+k−1, k−1) ≤ (δd)^k.

Corollary 5.5.

For each weight μ = (μ_1,…,μ_k) with Σ_{i=1}^{k} μ_i = δd there is an element H_μ ∈ 𝒰(𝔤𝔩_k) of length at most (δd)^k which acts on ℂ[ℂ[x_1,…,x_k]_d]_δ as the projector onto the weight space corresponding to μ.

5.1.2 Isotypic projectors

Let V be a representation of a complex reductive Lie group G with the corresponding Lie algebra 𝔤. Then V is completely reducible as a representation of 𝔤, with the same isotypic components as for G. As discussed in Section 4.4, they are also isotypic components of V as a representation of 𝒵(𝔤), which is a commutative algebra generated by Casimir elements C1,,Cr.

Applying Theorem 5.2, we obtain for every highest weight λ a corresponding element Z_λ ∈ 𝒰(𝔤) such that Z_λ.v is the projection of v onto the isotypic component corresponding to the irreducible representation with highest weight λ. The projectors Z_λ are obtained as polynomial expressions Z_λ = f_λ(C_1,…,C_r) with deg f_λ at most p−1, where p is the diversity of V as a representation of 𝔤. It follows that if the Casimir elements C_1,…,C_r have length at most ℓ, then the length of Z_λ is at most ℓ(p−1).

For the Lie algebra 𝔤𝔩_k the Casimir elements have length at most k. Since every highest weight is a weight, the diversity of ℂ[ℂ[x_1,…,x_k]_d]_δ is bounded by (dδ)^k as in the previous section.

Corollary 5.6.

For each partition λ ⊢_k δd there is an element Z_λ ∈ 𝒰(𝔤𝔩_k) of length at most k(δd)^k which acts on ℂ[ℂ[x_1,…,x_k]_d]_δ as the projector onto the isotypic component corresponding to λ.

5.1.3 Highest weight projectors

Since the highest weight subspace of weight λ is exactly the weight λ subspace of the isotypic component corresponding to the irreducible representation Vλ, the projector onto the highest weight subspace is the composition HλZλ of the projectors constructed in the previous sections.

Corollary 5.7.

For each partition λ ⊢_k δd there is an element X_λ = H_λ Z_λ ∈ 𝒰(𝔤𝔩_k) of length at most (k+1)(δd)^k which acts on ℂ[ℂ[x_1,…,x_k]_d]_δ as the projector onto the highest weight space of weight λ.

5.1.4 GZ-isotypic projectors

Let V be a representation of a complex reductive Lie group G with the corresponding Lie algebra 𝔤 admitting a Gelfand–Tsetlin chain 𝔤1𝔤2𝔤k=𝔤. As discussed in Section 4.5, under the action of the Gelfand–Tsetlin algebra 𝒢𝒵(𝔤) the representation V decomposes into GZ-isotypic components indexed by Gelfand–Tsetlin patterns, and Theorem 5.2 can be used to construct for every Gelfand–Tsetlin pattern T an element YT𝒰(𝔤) such that YT.x is the projection of x onto the T-isotypic subspace of V. The length of YT is bounded in terms of the lengths of Casimir elements of 𝔤 and the number of Gelfand–Tsetlin patterns appearing in the decomposition of V.

Consider the case of 𝔤𝔩_k acting on ℂ[ℂ[x_1,…,x_k]_d]_δ. For the chain 𝔤𝔩_1 ⊂ 𝔤𝔩_2 ⊂ ⋯ ⊂ 𝔤𝔩_k, the Casimir elements have length at most k. Every irreducible subrepresentation of ℂ[ℂ[x_1,…,x_k]_d]_δ corresponds to a partition λ ⊢_k dδ. Gelfand–Tsetlin patterns are in bijective correspondence with semistandard Young tableaux with δd cells and at most k rows. Each row of a tableau is an ordered tuple of integers from {1,…,k}, so there are at most binom(dδ+k−1, k−1) ≤ (dδ)^k tuples that can serve as rows of a tableau, and therefore at most (dδ)^{k^2} possible semistandard tableaux.

Corollary 5.8.

For each semistandard tableau T of shape λ ⊢_k δd there is an element Y_T ∈ 𝒰(𝔤𝔩_k) of length at most k(δd)^{k^2} which acts on ℂ[ℂ[x_1,…,x_k]_d]_δ as the projector onto the T-isotypic space.

5.2 Circuit transformations

We now combine the bounds on the lengths of the projectors with the PBW theorem (Theorem 4.9) to prove Theorem 1.1. Pick any ordered basis (X_1,…,X_{k^2}) of 𝔤𝔩_k, for instance {E_{i,j}}_{i,j} with (i,j) ordered lexicographically.

We first introduce some notation. Set K ≔ k^2. For 𝒊 = (i_1,…,i_K) ∈ ℕ^K we use the shorthand notation X^𝒊 ≔ X_1^{i_1}⋯X_K^{i_K} ∈ 𝒰(𝔤𝔩_k). Let 𝒊′ ≤ 𝒊 denote that 𝒊′ = (i_1′,…,i_K′) ∈ ℕ^K satisfies i_r′ ≤ i_r for all r ∈ [K]. Write 𝒊 − 𝒊′ ≔ (i_1−i_1′,…,i_K−i_K′) and binom(𝒊,𝒊′) ≔ binom(i_1,i_1′)⋯binom(i_K,i_K′). Lastly, define |𝒊| ≔ Σ_r i_r and ℕ^K_{≤L} ≔ {𝒊 ∈ ℕ^K : |𝒊| ≤ L}.

By ˜4.2, any element X ∈ 𝔤𝔩_k acts as a derivation on the space of metapolynomials ℂ[ℂ[x_1,…,x_k]_d] = ⨁_{δ≥0} ℂ[ℂ[x_1,…,x_k]_d]_δ.

Lemma 5.9.

Suppose Φ,Γ[[x1,,xk]d] are metapolynomials. Then

  1. X^𝒊.(Φ+Γ) = X^𝒊.Φ + X^𝒊.Γ

  2. X^𝒊.(ΦΓ) = Σ_{𝒊′ : 𝒊′ ≤ 𝒊} binom(𝒊,𝒊′) (X^{𝒊′}.Φ)(X^{𝒊−𝒊′}.Γ)

Proof.

Item 1 follows directly from linearity. Since 𝔤𝔩_k acts by derivations (see ˜4.2) we have X.(fg) = (X.f)g + f(X.g) for every X ∈ 𝔤𝔩_k. For a single basis element X_r and i ∈ ℕ, a straightforward induction on i shows X_r^i.(fg) = Σ_{i′≤i} binom(i,i′) (X_r^{i′}.f)(X_r^{i−i′}.g). Applying this coordinate by coordinate, Item 2 follows.
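The base case of the induction is the classical Leibniz rule for powers of a single derivation. The sympy sketch below (ours) checks it for small orders, using d/dt as a stand-in derivation.

```python
# A sympy sketch (ours) of the Leibniz expansion used in the proof of Lemma 5.9(2), K = 1 case:
# X^i.(f*g) = sum_{i' <= i} binom(i, i') (X^{i'}.f) (X^{i-i'}.g) for the derivation X = d/dt.
import sympy as sp

t = sp.Symbol('t')
f = t**3 + 2*t
g = sp.sin(t)
for i in range(5):
    lhs = sp.diff(f * g, t, i)
    rhs = sum(sp.binomial(i, ip) * sp.diff(f, t, ip) * sp.diff(g, t, i - ip) for ip in range(i + 1))
    assert sp.simplify(lhs - rhs) == 0
print("Leibniz rule verified up to order 4")
```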

Theorem 5.10.

Let P ∈ 𝒰(𝔤𝔩_k) be an element of length L. Suppose there exists an arithmetic circuit C of size s computing a metapolynomial Δ on ℂ[x_1,…,x_k]_d. Then there exists an arithmetic circuit C′ of size O(s·L^{2k^2}) computing P.Δ.

Proof.

Again set K ≔ k^2. Construct C′ from C using the following recursive procedure:

  1. Add for every input gate Θ in C the input gates X^𝒊.Θ for 𝒊 ∈ ℕ^K_{≤L}.

  2. Add for every addition gate Θ = Φ+Γ in C the addition gates computing X^𝒊.Θ for 𝒊 ∈ ℕ^K_{≤L}, using the expression in Lemma 5.9.

  3. Add for every product gate Θ = ΦΓ in C:

    • product gates computing (X^{𝒊′}.Φ)(X^{𝒊−𝒊′}.Γ) for 𝒊 ∈ ℕ^K_{≤L} and 𝒊′ ≤ 𝒊,

    • addition gates computing X^𝒊.Θ for 𝒊 ∈ ℕ^K_{≤L}, using the expression in Lemma 5.9.

  4. The resulting circuit computes X^𝒊.Δ for all 𝒊 ∈ ℕ^K_{≤L}. By the PBW theorem (Theorem 4.9), there exist scalars β_𝒊 such that

    P = Σ_{𝒊 ∈ ℕ^K_{≤L}} β_𝒊 X^𝒊.

    Add to C′ the addition gates computing P.Δ.

The largest blow-up occurs in step 3. Let A ≔ {(𝒊,𝒊′) : 𝒊 ∈ ℕ^K_{≤L}, 𝒊′ ≤ 𝒊}. For every product gate in C we added exactly |A| product gates, computing (X^{𝒊′}.f)(X^{𝒊−𝒊′}.g) for (𝒊,𝒊′) ∈ A, and at most |A| − 1 addition gates, computing the sums X^𝒊.(fg) = Σ_{𝒊′≤𝒊} binom(𝒊,𝒊′)(X^{𝒊′}.f)(X^{𝒊−𝒊′}.g) for all 𝒊 ∈ ℕ^K_{≤L}.

The blow-up of the other steps is bounded above by |A|, and the total blow-up is therefore upper bounded by 2|A| − 1. We now prove that |A| = binom(L+2K, 2K).

Let (𝒊,𝒊′) ∈ A. By definition, i_1 + ⋯ + i_K ≤ L and i_j − i_j′ ≥ 0 for all j ∈ [K], with i_j′ ≥ 0. Let a_j ≔ i_j − i_j′ and b_j ≔ i_j′. Then A is in bijection with the set of pairs (𝒂,𝒃) ∈ ℕ^{2K} such that Σ_{j=1}^{K}(a_j + b_j) ≤ L, and the number of pairs (𝒂,𝒃) ∈ ℕ^{2K} with Σ_{j=1}^{K}(a_j + b_j) = t is binom(t+2K−1, 2K−1). Therefore, by the hockey-stick identity,

|A| = Σ_{t=0}^{L} binom(t+2K−1, 2K−1) = binom(L+2K, 2K).

This readily gives the upper bound |A| = O(L^{2K}), which gives the desired complexity.
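The counting step can be double-checked by brute force; the Python sketch below (ours) compares the direct count of A with binom(L+2K, 2K) for a few small parameters.

```python
# A brute-force check (ours) of |A| = binom(L + 2K, 2K) for small L and K.
import itertools
from math import comb

def count_A(L, K):
    count = 0
    for i in itertools.product(range(L + 1), repeat=K):
        if sum(i) > L:
            continue
        # number of i' with i' <= i componentwise
        count += sum(1 for _ in itertools.product(*[range(v + 1) for v in i]))
    return count

for L, K in [(3, 2), (4, 3), (5, 2)]:
    assert count_A(L, K) == comb(L + 2 * K, 2 * K)
print("matches binom(L+2K, 2K)")
```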

Proof of Theorem 1.1.

Let Δ be a metapolynomial of format (δ,d,k) admitting an algebraic circuit of size s. For each case of Theorem 1.1, we have a corresponding projector:

  1. For every weight μ, Corollary 5.5 provides an element H_μ ∈ 𝒰(𝔤𝔩_k) of length at most O((δd)^k) acting as the projection onto the weight space of weight μ. Theorem 5.10 guarantees that H_μ.Δ admits a circuit of size at most O(s·((δd)^k)^{2k^2}) = O(s·(δd)^{2k^3}).

  2. For every partition λ, Corollary 5.6 provides an element Z_λ ∈ 𝒰(𝔤𝔩_k) of length at most O(k(δd)^k) acting as the projection onto the isotypic component of type λ. Theorem 5.10 guarantees that Z_λ.Δ admits a circuit of size at most O(s·(k(δd)^k)^{2k^2}) = O(s·k^{2k^2}(δd)^{2k^3}).

  3. For every partition λ, Corollary 5.7 provides an element X_λ ∈ 𝒰(𝔤𝔩_k) of length at most O((k+1)(δd)^k) acting as the projection onto the highest weight space of weight λ. Theorem 5.10 guarantees that X_λ.Δ admits a circuit of size at most O(s·((k+1)(δd)^k)^{2k^2}) = O(s·(k+1)^{2k^2}(δd)^{2k^3}).

  4. For every semistandard tableau T of shape λ ⊢_k dδ, Corollary 5.8 provides an element Y_T ∈ 𝒰(𝔤𝔩_k) of length at most O(k(δd)^{k^2}) acting as the projection onto the T-isotypic space. Theorem 5.10 guarantees that Y_T.Δ admits a circuit of size at most O(s·(k(δd)^{k^2})^{2k^2}) = O(s·k^{2k^2}(δd)^{2k^4}).

References

  • [1] S. Aaronson and A. Wigderson. Algebrization: A new barrier in complexity theory. ACM Transactions on Computation Theory (TOCT), 1(1):1–54, 2009. doi:10.1145/1490270.1490272.
  • [2] A. Abdesselam and J. Chipalkatti. Brill–Gordan loci, transvectants and an analogue of the Foulkes conjecture. Advances in Mathematics, 208(2):491–520, 2007. doi:10.1016/j.aim.2006.03.003.
  • [3] A. Abdesselam, C. Ikenmeyer, and G. Royle. 16,051 formulas for Ottaviani’s invariant of cubic threefolds. J. Algebra, 447:649–663, 2016. doi:10.1016/j.jalgebra.2015.11.009.
  • [4] G. E. Baird and L. C. Biedenharn. On the representations of the semisimple Lie groups. II. J. Mathematical Phys., 4:1449–1466, 1963. doi:10.1063/1.1703926.
  • [5] T. Baker, J. Gill, and R. Solovay. Relativizations of the 𝒫=?𝒩𝒫 question. SIAM Journal on computing, 4(4):431–442, 1975. doi:10.1137/0204037.
  • [6] F. Berdjis. A criterion for completeness of Casimir operators. J. Math. Phys., 22:1851–1856, 1981. doi:10.1063/1.525156.
  • [7] D. Bini, M. Capovani, G. Lotti, and F. Romani. O(n2.7799) complexity for n×n approximate matrix multiplication. Inform. Process. Lett., 8(5):234–235, 1979. doi:10.1016/0020-0190(79)90113-3.
  • [8] M. Bläser, M. Christandl, and J. Zuiddam. The border support rank of two-by-two matrix multiplication is seven. Chicago Journal OF Theoretical Computer Science, 5:1–16, 2018.
  • [9] M. Bläser, J. Dörfler, and C. Ikenmeyer. On the Complexity of Evaluating Highest Weight Vectors. 36th Computational Complexity Conference (CCC 2021), 200:29:1–29:36, 2021. doi:10.4230/LIPIcs.CCC.2021.29.
  • [10] N. Bourbaki. Lie groups and Lie algebras. Chapters 7–9. Elements of Mathematics (Berlin). Springer-Verlag, Berlin, 2005. Translated from the 1975 and 1982 French originals by Andrew Pressley.
  • [11] P. Breiding, R. Hodges, C. Ikenmeyer, and M. Michałek. Equations for GL invariant families of polynomials. Vietnam Journal of Mathematics, 50(2):545–556, 2022. doi:10.1007/s10013-022-00549-4.
  • [12] P. Bürgisser. Completeness and reduction in algebraic complexity theory, volume 7. Springer Science & Business Media, 2000. doi:10.1007/978-3-662-04179-6.
  • [13] P. Bürgisser. Completeness classes in algebraic complexity theory. arXiv:2406.06217, 2024.
  • [14] P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic complexity theory, volume 315. Springer Science & Business Media, 2013.
  • [15] P. Bürgisser and C. Ikenmeyer. Geometric complexity theory and tensor rank. In Proceedings of the 43rd Annual ACM Symp. on Th. of Comp., pages 509–518, New York, 2011. STOC ’11, ACM. doi:10.1145/1993636.1993704.
  • [16] P. Bürgisser and C. Ikenmeyer. Explicit Lower Bounds via Geometric Complexity Theory. In Proceedings of the 45th Annual ACM Symp. on Th. of Comp., pages 141–150. ACM, 2013. doi:10.1145/2488608.2488627.
  • [17] P. Bürgisser and C. Ikenmeyer. Fundamental invariants of orbit closures. J. Algebra, 477:390–434, 2017. doi:10.1016/j.jalgebra.2016.12.035.
  • [18] P. Bürgisser, C. Ikenmeyer, and G. Panova. No occurrence obstructions in geometric complexity theory. J. Amer. Math. Soc., 32(1):163–193, 2019. doi:10.1090/jams/908.
  • [19] P. Bürgisser, J. M. Landsberg, L. Manivel, and J. Weyman. An overview of mathematical issues arising in the Geometric Complexity Theory approach to VPVNP. SIAM J. Comput., 40(4):1179–1209, 2011. doi:10.1137/090765328.
  • [20] A. Cayley. On the theory of linear transformations. Cambridge Math. J., iv:193–209, 1845.
  • [21] P. Chatterjee, M. Kumar, C. Ramya, R. Saptharishi, and A. Tengse. On the Existence of Algebraically Natural Proofs. In 2020 IEEE 61st Annual Symp. on Found. of Comp. Sc. (FOCS), pages 870–880, 2020. Full version at arXiv:2004.14147. doi:10.1109/FOCS46700.2020.00085.
  • [22] X. Chen, N. Kayal, and A. Wigderson. Partial derivatives in arithmetic complexity and beyond. Found. Trends Theor. Comput. Sci., 6(1–2):1–138, 2011. doi:10.1561/0400000043.
  • [23] M.-W. Cheung, C. Ikenmeyer, and S. Mkrtchyan. Symmetrizing tableaux and the 5th case of the Foulkes conjecture. J. Symbolic Comp., 80:833–843, 2017. doi:10.1016/j.jsc.2016.09.002.
  • [24] J. Dörfler, C. Ikenmeyer, and G. Panova. On geometric complexity theory: Multiplicity obstructions are stronger than occurrence obstructions. SIAM Journal on Applied Algebra and Geometry, 4(2):354–376, 2020. doi:10.4230/LIPIcs.ICALP.2019.51.
  • [25] P. Dutta, P. Dwivedi, and N. Saxena. Demystifying the border of depth-3 algebraic circuits. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 92–103. IEEE, 2021. doi:10.1109/FOCS52979.2021.00018.
  • [26] K. Efremenko, A. Garg, R. Oliveira, and A. Wigderson. Barriers for Rank Methods in Arithmetic Complexity. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), volume 94 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1:1–1:19, Dagstuhl, Germany, 2018. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.ITCS.2018.1.
  • [27] M. A. Forbes. Polynomial identity testing of read-once oblivious algebraic branching programs. PhD thesis, Massachusetts Institute of Technology, 2014.
  • [28] M. A. Forbes. Deterministic divisibility testing via shifted partial derivatives. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, pages 451–465. IEEE, 2015. doi:10.1109/FOCS.2015.35.
  • [29] M. A. Forbes. Some concrete questions on the border complexity of polynomials. Presentation given at the Workshop on Algebraic Complexity Theory WACT 2016 in Tel Aviv, 2016. URL: https://www.youtube.com/watch?v=1HMogQIHT6Q.
  • [30] M. A. Forbes. Low-Depth Algebraic Circuit Lower Bounds over Any Field. In 39th Computational Complexity Conference, CCC 2024, July 22-25, 2024, volume 300 of LIPIcs, pages 31:1–31:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.CCC.2024.31.
  • [31] M. A. Forbes, A. Shpilka, and B. L. Volk. Succinct hitting sets and barriers to proving lower bounds for algebraic circuits. Theory of Computing, 14(1):1–45, 2018. doi:10.4086/toc.2018.v014a018.
  • [32] W. Fulton and J. Harris. Representation theory: a first course, volume 129 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1991.
  • [33] M. Gałązka. Vector bundles give equations of cactus varieties. Lin. Alg. Appl., 521:254–262, 2017. doi:10.1016/j.laa.2016.12.005.
  • [34] A. Garg, C. Ikenmeyer, V. Makam, R. Oliveira, M. Walter, and A. Wigderson. Search problems in algebraic complexity, GCT, and hardness of generators for invariant rings. In 35th Computational Complexity Conference, page 1, 2020.
  • [35] I. M. Gelfand and M. L. Tsetlin. Finite-dimensional representations of the group of unimodular matrices. Dokl. Akad. Nauk SSSR, 71(8):825, 1950.
  • [36] F. Gesmundo, P. Ghosal, C. Ikenmeyer, and V. Lysikov. Degree-Restricted Strength Decompositions and Algebraic Branching Programs. In 42nd IARCS Ann. Conf. Found. Soft. Tech. and TCS (FSTTCS 2022), volume 250 of Leibniz International Proceedings in Informatics (LIPIcs), pages 20:1–20:15, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.FSTTCS.2022.20.
  • [37] F. Gesmundo, Y.-I. Han, and B. Lovitz. Linear preservers of secant varieties and other varieties of tensors. J. Symbolic Comp., 131:102449, 2025. doi:10.1016/j.jsc.2025.102449.
  • [38] F. Gesmundo, C. Ikenmeyer, and G. Panova. Geometric complexity theory and matrix powering. Diff. Geom. Appl., 55:106–127, 2017. doi:10.1016/j.difgeo.2017.07.001.
  • [39] J. A. Grochow. Unifying known lower bounds via geometric complexity theory. computational complexity, 24:393–475, 2015. doi:10.1007/s00037-015-0103-x.
  • [40] J. A. Grochow, M. Kumar, M. Saks, and S. Saraf. Towards an algebraic natural proofs barrier via polynomial identity testing. arXiv, 2017. arXiv:1701.01717.
  • [41] A. Gupta, P. Kamath, N. Kayal, and R. Saptharishi. Arithmetic circuits: A chasm at depth three. Electronic Colloquium on Computational Complexity (ECCC), 20:26, 2013. doi:10.1109/FOCS.2013.68.
  • [42] B. C. Hall. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, volume 222 of Graduate Texts in Mathematics. Springer, New York, 2010.
  • [43] J. Hauenstein, C. Ikenmeyer, and J. M. Landsberg. Equations for lower bounds on border rank. Experimental Mathematics, 22(4):372–383, 2013. doi:10.1080/10586458.2013.825892.
  • [44] J. E. Humphreys. Introduction to Lie algebras and representation theory, volume 9 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1978.
  • [45] J. E. Humphreys. Linear algebraic groups, volume 21. Springer Science & Business Media, 2012.
  • [46] C. Ikenmeyer and U. Kandasamy. Implementing geometric complexity theory: On the separation of orbit closures via symmetries. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 713–726, 2020. doi:10.1145/3357713.3384257.
  • [47] C. Ikenmeyer and G. Panova. Rectangular Kronecker coefficients and plethysms in geometric complexity theory. Adv. Math., 319:40–66, 2017. doi:10.1016/j.aim.2017.08.024.
  • [48] C. Ikenmeyer and A. Sanyal. A note on VNP-completeness and border complexity. Information Processing Letters, 176:106243, 2022. doi:10.1016/j.ipl.2021.106243.
  • [49] A. G. Khovanskiĭ. Fewnomials, volume 88. American Mathematical Soc., 1991.
  • [50] M. Kumar. A quadratic lower bound for homogeneous algebraic branching programs. computational complexity, 28:409–435, 2019. doi:10.1007/s00037-019-00186-3.
  • [51] M. Kumar, C. Ramya, R. Saptharishi, and A. Tengse. If VNP Is Hard, Then so Are Equations for It. In 39th Int. Symp. on Th. Asp. of Comp. Sc. (STACS 2022), volume 219 of Leibniz International Proceedings in Informatics (LIPIcs), pages 44:1–44:13, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.STACS.2022.44.
  • [52] M. Kumar and B. L. Volk. A lower bound on determinantal complexity. Comput. Complex., 31(2):12, 2022. doi:10.1007/S00037-022-00228-3.
  • [53] J. M. Landsberg, L. Manivel, and N. Ressayre. Hypersurfaces with degenerate duals and the Geometric Complexity Theory program. Comment. Math. Helv., 88(2):469–484, 2013. doi:10.4171/CMH/292.
  • [54] J. M. Landsberg and G. Ottaviani. New lower bounds for the border rank of matrix multiplication. Th. of Comp., 11(11):285–298, 2015. doi:10.4086/toc.2015.v011a011.
  • [55] N. Limaye, S. Srinivasan, and S. Tavenas. Superpolynomial lower bounds against low-depth algebraic circuits. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 804–814. IEEE, 2021. doi:10.1109/FOCS52979.2021.00083.
  • [56] I. G. Macdonald. Symmetric functions and Hall polynomials. Oxford Mathematical Monographs. The Clarendon Press Oxford University Press, New York, second edition, 1995.
  • [57] M. Mahajan. Algebraic complexity classes. Perspectives in Computational Complexity: The Somenath Biswas Anniversary Volume, pages 51–75, 2014. doi:10.1007/978-3-319-05446-9_4.
  • [58] A. I. Molev. Gelfand-Tsetlin bases for classical Lie algebras. In Handbook of algebra. Vol. 4, volume 4 of Handb. Algebr., pages 109–170. Elsevier/North-Holland, Amsterdam, 2006. doi:10.1016/S1570-7954(06)80006-9.
  • [59] K. D. Mulmuley. The GCT program toward the P vs. NP problem. Communications of the ACM, 55(6):98–107, 2012. doi:10.1145/2184319.2184341.
  • [60] K. D. Mulmuley and M. Sohoni. Geometric Complexity Theory I: An Approach to the P vs. NP and Related Problems. SIAM J. Comput., 31(2):496–526, 2001. doi:10.1137/S009753970038715X.
  • [61] K. D. Mulmuley and M. Sohoni. Geometric complexity theory II: towards explicit obstructions for embeddings among class varieties. SIAM Journal on Computing, 38(3):1175–1206, 2008. doi:10.1137/080718115.
  • [62] N. Nisan and A. Wigderson. Lower bounds on arithmetic circuits via partial derivatives. Computational complexity, 6:217–234, 1996. doi:10.1007/BF01294256.
  • [63] L. Oeding and S.V Sam. Equations for the fifth secant variety of Segre products of projective spaces. Exp. Math., 25(1):94–99, 2016. doi:10.1080/10586458.2015.1037872.
  • [64] A. M. Perelomov and V. S. Popov. Casimir Operators for Semisimple Lie Groups. Mathematics of the USSR-Izvestiya, 2(6):1313, December 1968. doi:10.1070/IM1968v002n06ABEH000731.
  • [65] G. Racah. Sulla caratterizzazione delle rappresentazioni irriducibili dei gruppi semisemplici di Lie. Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Nat. (8), 8:108–112, 1950.
  • [66] A. A. Razborov and S. Rudich. Natural proofs. In Proc. of the 26th Ann. ACM Symp. Th. of Comp., pages 204–213, 1994.
  • [67] K. W. Regan. Understanding the Mulmuley–Sohoni approach to P vs. NP. Bulletin of the EATCS, 78:86–99, 2002.
  • [68] R. Saptharishi. A survey of lower bounds in arithmetic circuit complexity. https://github.com/dasarpmar/lowerbounds-survey/, 2021. Version 9.0.3.
  • [69] V. Strassen. Rank and optimal computation of generic tensors. Lin. Alg. Appl., 52/53:645–685, 1983. doi:10.1016/0024-3795(83)80041-X.
  • [70] J. J. Sylvester. On the principles of the calculus of forms. Cambridge and Dublin Math. J., 7:52–97, 1852.
  • [71] L. G. Valiant. Completeness classes in algebra. In Proceedings of the 11th ACM Symp. on Th. of Comp., STOC ’79, pages 249–261, New York, 1979. ACM. doi:10.1145/800135.804419.
  • [72] D. P. Žhelobenko. The classical groups. Spectral analysis of their finite-dimensional representations. Uspehi Mat. Nauk, 17(1(103)):27–120, 1962. doi:10.1070/RM1962v017n01ABEH001123.

Appendix A Proofs on algebraic natural proofs

In this section we prove Theorem 2.2.

Definition A.1.

We call a sequence r : ℕ → ℕ strongly superpolynomial if r(n) eventually dominates every polynomial sequence, that is, for every polynomial p there exists N such that r(n) > p(n) for all n ≥ N.

Lemma A.2.

(Δn)I(𝒞) if and only if there exists a strongly superpolynomial sequence r such that Δn(Xn,r(n))=0.

Proof.

Clearly, if Δ_n(X_{n,r(n)}) = 0, then Δ_n(X_{n,p(n)}) vanishes eventually for every polynomially bounded sequence p, since p(n) ≤ r(n) eventually.

On the other hand, if (Δ_n) is a sequence of metapolynomials in I(𝒞), we can define r(n) as r(n) ≔ max{r : Δ_n(X_{n,r}) = 0}. For every polynomially bounded sequence p(n) we have that Δ_n(X_{n,p(n)}) vanishes eventually, so r(n) ≥ p(n) eventually.

Proof of Theorem 2.2.

We first observe that we may assume deg(f_n) = n. Indeed, for every p-family (f_n) with the property that deg(f_n) is strictly increasing, let (f_n′) be the sequence defined by

f_m′ = f_n if m = deg(f_n) for some n, and f_m′ = 0 otherwise.

Since (f_n) is a p-family, the sequence deg(f_n) is polynomially bounded, hence (f_n′) is a p-family as well. Moreover (f_n) ∈ 𝒞 if and only if (f_n′) ∈ 𝒞, and similarly for 𝒞̄. Finally, for a sequence of metapolynomials (Δ_n) of format (∗,n,∗), we have Δ_{deg(f_n)}(f_n) = 0 eventually if and only if Δ_n(f_n′) = 0 eventually. In particular, proving the statement for (f_n) is equivalent to proving the statement for (f_n′). Since deg(f_n′) = n, this shows we may assume deg(f_n) = n.

Suppose (f_n) ∈ 𝒞̄, that is, the sequence p(n) = c̄(f_n) is polynomially bounded. For every sequence of metapolynomials (Δ_n) ∈ I(𝒞) we have that Δ_n(X_{n,p(n)}) vanishes eventually. Since the metapolynomials Δ_n are continuous, Δ_n(X̄_{n,p(n)}) vanishes eventually, so Δ_n(f_n) = 0 eventually.

Conversely, suppose (f_n) ∉ 𝒞̄. Then r(n) = c̄(f_n) is not polynomially bounded. Let θ(n) = log_n r(n), θ̂(n) = max{θ(m) : m ≤ n}, and R(n) = n^{θ̂(n)}. Note that R(n) is monotonically increasing.

The sequence R(n) is strongly superpolynomial. Indeed, since r(n) is not polynomially bounded, for every k there exists n_0 such that θ(n_0) ≥ k and, therefore, θ̂(n) ≥ k for all n ≥ n_0. It follows that for every k we have R(n) ≥ n^k eventually.

For all Θ ∈ ℕ, if n_Θ is the minimal value such that θ(n_Θ) ≥ Θ, then θ̂(n_Θ) = θ(n_Θ). Since θ is unbounded, it follows that θ and θ̂ coincide on an infinite subset of ℕ. On this subset we also have R(n) = r(n).

Consider a metapolynomial sequence Δ_n such that Δ_n(X̄_{n,R(n)−1}) = 0, and such that whenever r(n) = R(n), the metapolynomial Δ_n is chosen so that Δ_n(f_n) ≠ 0 (this is possible because c̄(f_n) = r(n) = R(n) > R(n)−1). Since R(n) is strongly superpolynomial, R(n)−1 is strongly superpolynomial, so (Δ_n) ∈ I(𝒞) by Lemma A.2; and since R(n) = r(n) infinitely often, Δ_n(f_n) ≠ 0 infinitely often.

Appendix B PBW, proof that monomials span the space

In this section, we give an elementary proof of the implication in Theorem 4.9 that we use. We include this standard result in this appendix, because it is the main result that enables our algorithmic speedup compared to a naive implementation.

Recall that the universal enveloping algebra 𝒰(𝔤) is the quotient of the tensor algebra 𝒯(𝔤) by the ideal ℐ(𝔤) generated by the elements X⊗Y − Y⊗X − [X,Y] for X, Y ∈ 𝔤. The quotient induces a filtration 𝒰(𝔤)^{≤m} ⊂ 𝒰(𝔤)^{≤(m+1)}, and the elements of 𝒰(𝔤)^{≤m} are represented by noncommutative polynomials of degree at most m in the elements of 𝔤.

Theorem B.1 (One direction of the PBW Theorem).

Let X_1,…,X_r be a basis for a Lie algebra 𝔤. Then the set

ℬ_m ≔ {X_{i_1}X_{i_2}⋯X_{i_ℓ} : i_1 ≤ i_2 ≤ ⋯ ≤ i_ℓ and ℓ ≤ m}

spans 𝒰(𝔤)^{≤m}.

Proof.

It is clear that 𝒯(𝔤)^{≤m} = ⨁_{ℓ≤m} 𝔤^{⊗ℓ} is spanned by the elements X_{i_1}⊗⋯⊗X_{i_ℓ} with ℓ ≤ m; hence it suffices to prove that any such element is equivalent, modulo ℐ(𝔤), to a linear combination of elements of ℬ_m. We prove this statement by induction on m.

Let X_{j_1}⊗⋯⊗X_{j_m} ∈ 𝔤^{⊗m}. For every a = 1,…,m−1, we have X_{j_a}⊗X_{j_{a+1}} − X_{j_{a+1}}⊗X_{j_a} − [X_{j_a}, X_{j_{a+1}}] ∈ ℐ(𝔤); in particular X_{j_a}⊗X_{j_{a+1}} = X_{j_{a+1}}⊗X_{j_a} + Y where Y is a linear combination of elements of ℐ(𝔤) and of 𝒯(𝔤)^{≤1}. We obtain

X_{j_1}⊗⋯⊗X_{j_m} = X_{j_1}⊗⋯⊗X_{j_{a−1}}⊗X_{j_{a+1}}⊗X_{j_a}⊗X_{j_{a+2}}⊗⋯⊗X_{j_m} + Y′ (18)

where Y′ is a linear combination of elements of ℐ(𝔤) and of 𝒯(𝔤)^{≤(m−1)}.

Let π ∈ 𝔖_m be a permutation such that (j_{π(1)},…,j_{π(m)}) is nondecreasing. Write π as a composition of elementary transpositions π = σ_s⋯σ_1, where each σ_q is of the form (a, a+1) for some a = 1,…,m−1. Applying Equation 18 s times we obtain

X_{j_1}⊗⋯⊗X_{j_m} = X_{j_{π(1)}}⊗⋯⊗X_{j_{π(m)}} + U

where U ∈ ℐ(𝔤) + 𝒯(𝔤)^{≤(m−1)}. Passing to the quotient modulo ℐ(𝔤), we obtain the following identity in 𝒰(𝔤)^{≤m}:

X_{j_1}⋯X_{j_m} = X_{j_{π(1)}}⋯X_{j_{π(m)}} + Ū

with Ū ∈ 𝒰(𝔤)^{≤(m−1)}. Apply the induction hypothesis to Ū to conclude.