Which Graph Motif Parameters Count?
Abstract
For a fixed graph , the function maps graphs to the count of induced -copies in ; this function obviously “counts something” in that it has a combinatorial interpretation. Linear combinations of such functions are called graph motif parameters and have recently received significant attention in counting complexity after a seminal paper by Curticapean, Dell and Marx (STOC’17). We show that, among linear combinations of functions involving only graphs without isolated vertices, precisely those with positive integer coefficients maintain a combinatorial interpretation. It is important to note that graph motif parameters can be nonnegative for all inputs , even when some coefficients are negative.
Formally, we show that evaluating any graph motif parameter with a negative coefficient is impossible in an oracle variant of #P, where an implicit graph is accessed by oracle queries. Our proof follows the classification of the relativizing closure properties of #P by Hertrampf, Vollmer, and Wagner (SCT’95) and the framework developed by Ikenmeyer and Pak (STOC’22), but our application of the required Ramsey theorem turns out to be more subtle, as graphs do not have the required Ramsey property.
Our techniques generalize from graphs to relational structures, including colored graphs. Vastly generalizing this, we introduce motif parameters over categories that count occurrences of sub-objects in the category. We then prove a general dichotomy theorem that characterizes which such parameters have a combinatorial interpretation. Using known results in Ramsey theory for categories, we obtain a dichotomy for motif parameters of finite vector spaces as well as parameter sets.
Keywords and phrases:
Graph motif parameters, Combinatorics, Combinatorial InterpretabilityFunding:
Radu Curticapean: Funded by the European Union (ERC, CountHom, 101077083). Views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Complexity theory and logic ; Mathematics of computing CombinatoricsEditors:
Paweł Gawrychowski, Filip Mazowiecki, and Michał SkrzypczakSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
1.1 Positivity in Combinatorics and #P
A fundamental task in combinatorics is to assign combinatorial meanings to quantities: For example, the binomial coefficient has the combinatorial interpretation of counting -element subsets of , which may not be obvious from the algebraic formula . In fact, this is a nonnegative combinatorial interpretation, i.e., the binomial coefficient is equal to the cardinality of a naturally defined set. This is stronger than just having a signed combinatorial interpretation, such as for example the determinant of zero-one matrices , which can be expressed as the difference of the number of even versus odd permutations with for all . In the following, when we speak of combinatorial interpretations, we mean nonnegative ones.
In algebraic combinatorics, there are several nonnegative integer quantities for which finding such a combinatorial interpretation is a major open problem, e.g., problems 9, 10, 11, and 12 in [36]. This begs the question whether these quantities actually have combinatorial interpretations, but without a formal definition of this notion, their existence is hard to rule out. Indeed, Stanley [37, Ch. 1] writes that “only through experience does one develop an idea of what is meant by a “determination” of a counting function”. In many settings however, combinatorial interpretations of a nonnegative quantity can be defined (or at least subsumed) formally in terms of containment of the quantity in the counting complexity class #P, see [15, 29, 30]. This means that the quantity can be interpreted as counting accepting paths of some nondeterministic polynomial-time Turing machine. While “containment in #P” is a coarse over-approximation of what some mathematicians would consider “combinatorial interpretations”, this working definition is formal, robust, and arguably natural. Most importantly, it enables us to prove that some very natural quantities indeed do not have a combinatorial interpretation.
For example, if PH does not collapse, then the squares of the symmetric group characters are not in #P and hence do not have a combinatorial interpretation, see [16]. Other fruitful separations from #P are proved in [15], where the counting versions of several TFNP search problems are studied: For example, given oracle access to an exponentially large graph with degree at most two (i.e., a disjoint union of isolated vertices, paths, and cycles), and given the end of a path, it follows that there is at least one other end of a path. But the task of finding a combinatorial interpretation for the nonnegative quantity “number of other path endpoints ” fails due to the oracle separation in [15].
1.2 Do Nonnegative Graph Motif Parameters Count?
A graph is an induced subgraph of if can be obtained from by deleting vertices. An induced copy of a graph in is an induced subgraph of that is isomorphic to . For two graphs and , write for the number of induced copies of in , and when is fixed, write for the function mapping . As defined by Curticapean, Dell, and Marx [7], a graph motif parameter is a rational linear combination of such functions. The step from individual counts to linear combinations turns graph motif parameters into a rich function space that captures interesting counting problems related to small patterns. For example, the number of (not necessarily induced) subgraphs isomorphic to a fixed -vertex graph is a nonnegative linear combination of induced subgraph counts from all -vertex supergraphs . This means that (not necessarily induced) subgraph counts from fixed graphs are graph motif parameters. The same holds for the number of homomorphisms from a fixed graph . Moreover, the coefficients of a graph motif parameter in its linear combination over are unique.
Graph motif parameters have been studied thoroughly before, mostly from a complexity-theoretic perspective [18, 19, 20, 7, 32, 33, 34, 9, 10]: Each fixed graph motif parameter can be evaluated in polynomial time on -vertex input graphs, as it involves a linear combination of induced subgraph counts from fixed graphs. Under complexity-theoretic assumptions, the above works give lower bounds on the exponent of this polynomial. We focus on the separate question which graph motif parameters have combinatorial interpretations.
For example, most of the above papers consider unweighted sums (i.e., linear combinations with coefficients or ) of induced subgraph counts from a set of -vertex graphs. This has the evident combinatorial interpretation of counting -vertex subsets of a graph that induce a graph from . More generally, we can allow arbitrary integers as coefficients; as we show in Lemma 11, all integer-valued graph motif parameters can be obtained this way. In this setting, it is possible for to have negative integer coefficients but still satisfy on all graphs .
Example 1.
Consider the quantity for -vertex graphs . This is nonnegative and, in fact, a graph motif parameter: Abbreviating , we have and , and . Moreover, the quadratic term can be linearized111 This holds because counts the pairs , and so does the linear combination , whose summands correspond to the isomorphism type of , which is either , , or . As we will elaborate later, graph motif parameters enjoy a general linearization property: Any polynomial combination of induced subgraph counts can be expressed as a unique linear combination. This holds even for “subobject” counts under more general notions of objects; see the full version for the details. This motivates our focus on linear combinations., i.e., expressed as the linear combination . Thus, we have
and have constructed a linear combination of induced subgraph counts evaluating to a nonnegative value for every graph , even though the term appears with coefficient .
This example in fact has a very simple combinatorial interpretation for graphs with vertices: After fixing an arbitrary vertex , the quantity simply counts the pairs with and . Let us consider a example where finding such an interpretation seems trickier:
Example 2.
Consider the graph motif parameter
whose non-negativity is shown in the full version. Contrary to our previous example, the graphs in this linear combination feature no isolated vertices. Does have a combinatorial interpretation?
Our main result in Theorem 13 gives a formal argument against having a combinatorial interpretation: In a nutshell, a nonnegative graph motif parameter involving only patterns without isolated vertices has a combinatorial interpretation if and only if all its coefficients are nonnegative. (The “if” part of this result is trivial.) In the next section, we will give the relevant formal definitions to state our results in Section 3 and compare them to related work in Section 4.
2 Preliminaries
2.1 Type-2 Counting Complexity
We study computational problems on inputs given by concise descriptions. Such problems have been studied by Johnson, Papadimitrou, and Yanakakis [21, 31], who introduced subclasses of the complexity class TFNP, such as PLS or PPAD, to capture search problems in exponentially large objects. The problems in these subclasses of TFNP all have a similar structure: Each input implicitly encodes an exponentially large structure, e.g., a graph via a Boolean circuit that computes the edge relation of the graph . The witness sought in the search problem is however still polynomially large in , which is tiny compared to the size of the encoded input instance. The guiding examples in our paper are induced subgraphs of a fixed size in an exponentially large graph . In this problem, the labels of the vertices in that induce constitute a witness.
A natural abstraction of this setting is obtained by providing the input as an oracle, rather than via a succinct encoding, see e.g. [2]. In this setting, which is denoted as type-2 complexity, we can only query the given structure, for instance by making queries to the edge relation of the graph, but we do not have access to a circuit computing the edge relation. Type-2 problems also arise naturally in computability and complexity studies of real functions, see e.g. [38, 22]. Since a real number is an infinite object, it is given as an oracle which one can query to get better and better rational approximations.
To define the type-2 framework formally, fix some alphabet . We assume that , which are interpreted as false and true, respectively. An oracle is a total function of type . The computational model in the type-2 framework is an oracle Turing machine, which gets an input string together with access to an oracle .
Example 3.
For a fixed graph , we consider the problem of finding an induced -copy in a large graph . The graph is given as where is an oracle encoding the edge relation and is the number of nodes of written in binary. The nodes of are , written as bit strings of fixed length . Given two nodes and of , there is an edge between and iff . (Since we consider strings of fixed length , no separation symbol is needed.) In this representation, most of the information about is stored in the oracle; the input only encodes the size of the graph, and the running time of oracle Turing machines is measured in terms of .
A consistency issue can arise in this definition: Since the graph is exponentially large, the oracle Turing machine might not be able to check whether the encoding is consistent. For instance, one might wish to encode an undirected graph but have . In the case of graphs, such inconsistencies can be “repaired” by avoiding observing them, e.g., by only querying strings with . In the context of TFNP and its subclasses, such issues are often elegantly repaired by interpreting any oracle as a graph with the desired properties. In general however, such a fix may not be possible for more complicated structures, in particular in the abstract setting of category theory that we will also study. Therefore, we will also resort to promise problems, in which a Turing machine is only required to work properly on inputs that are a proper encoding of an input object; see Definition 8 for more details. Note that this also encompasses the previous solutions. If we can repair the oracle or choose an interpretation such that every oracle encodes a valid instance, then we simply choose the set of “don’t care” instances to be the empty set.
A -ary type-2 relation gets a tuple of strings as an input, has access to an oracle , and outputs a value . Typically, we will consider unary type-2 relations, but relations with higher arity naturally occur when one considers witnesses, see below, when we define the existential and counting operators. A type-2 relation is polynomial-time computable if there is an oracle Turing machine that given input and oracle access to computes in polynomial-time in the length of the input . Oracle queries have unit costs, but has to write the query strings on the oracle tape. Note that all time bounds we consider depend only on the length of , but not on itself and, more importantly, not on the oracle.
If there is a polynomial-time computable -ary type-2 relation , we define the -ary type-2 relation by iff there is a polynomially long (in the length ) such that .
Example 4.
In our example of finding an induced copy of in the graph , we start with a binary type-2 relation . The relation interprets as an encoding of a set of distinct nodes of with being the number of nodes of . The relation then queries the oracle to obtain the induced subgraph . It then checks by brute force whether and are isomorphic. If yes, it returns the value and otherwise . This is clearly polynomial-time computable in . The relation now returns if contains an induced copy of somewhere, and if does not contain such a copy. See also Lemma 12 for a more formal proof.
As in classical type-1 complexity, we can also introduce a counting operator to type-2 complexity, similar to the existential operator : A polynomial-time computable -ary type-2 relation defines a -ary type-2 counting problem by , where is polynomially long.
Example 5.
Continuing the example above, now counts the number of induced subgraphs in that are isomorphic to , that is, , where is the oracle encoding of .
The goal of this paper is to characterize those integer valued nonnegative graph motif parameters that have a combinatorial interpretation. In other words, we ask which nonnegative graph motif parameters count something. The informal notion of “combinatorial interpretation” is formalized below as follows:
Definition 6.
.
In the notation , the dotted circle represents a placeholder for the input oracle.
Definition 7.
For a graph motif parameter , we denote by the corresponding type-2 function, which gets the graph given in the form . An integer valued nonnegative graph motif parameter is called combinatorially interpretable if (depending on context also , see later for an explanation).
It is open for discussion whether every problem in admits a “nonnegative combinatorial interpretation” in the informal sense used by combinatorists. If a problem is however not in , then it cannot have a nonnegative combinatorial interpretation. Our dichotomy result for integer-valued nonnegative graph motif parameters is particularly clean-cut in this regard: Such a parameter is in iff all its coefficients are nonnegative integers, so it indeed counts induced subgraphs.
To study more complicated structures, we consider promise problems, since the oracle TM might not be able to check whether an oracle is a legal encoding. Hence, we will have a set of “don’t care” inputs , which will correspond to illegal encodings of input objects in our setting. Our oracle TM only needs to compute the correct result on inputs which are not “don’t care” inputs. One such example are vector spaces, where we interpret the oracle as the characteristic function of the given vector space and it is not immediate how to check whether the oracle indeed describes a vector space. In this case our oracle Turing machine only has to work properly if the given oracle encodes indeed a vector space.
Definition 8.
A counting function together with a set of “don’t care” inputs is in if there is a polynomial-time computable binary type-2 relation such that and coincide on all inputs outside .
Every function in is also in with the empty set of “don’t care” inputs.
2.2 What is a Combinatorial Interpretation – and What is Not?
While we do not have an all-encompasing answer to the first part of this question (and we do not need one in this paper), we explain why type-2 complexity gives satisfactory answers to the second part. Intuitively, a combinatorial interpretation should mean “counting something”. But that is not enough. Even the graph motif parameter from Example 2 counts something: For a given graph , we consider the set of witnesses for each pattern in . So are all (induced) occurrences of an edge in , are all (induced) occurrences of a triangle in and so on. The sets and will be multi-sets containing two and four copies of the witnesses, respectively, due to the coefficients. Let be the union of all witness sets that correspond to a term with positive coefficient and be all witnesses that correspond to a term with negative coefficient. Since is nonnegative, . Thus there is an injective map . So we could say that counts all elements in that are not in . However, this “cheating away” of the difference is unsatisfactory, because it is (computationally) hard to decide whether a given element in should be counted or not.
Counting complexity gives a way to define combinatorial interpretations: We are given a set of witnesses , encoded as binary strings, and for each we can decide in polynomial time whether it is in or not. This rules out the construction above, since constructing the embedding is infeasible. Computing the size is a problem in the complexity class #P: A nondeterministic Turing machine guesses a witness and then deterministically verifies whether it is in . The number of accepting paths is precisely . While it might be debatable whether being in #P is the right definition for a combinatorial interpretation, it appears to be at least a necessary condition. In other words, if a quantity is not in #P, then it does not admit a combinatorial interpretation.
In the case of graph motif parameters, the witnesses are given as ordered lists of nodes having constant length, say , since only the target graph varies. If has nodes, each node can be encoded by a bitstring of length and the total witness size is . When the running time should be polynomial, then the Turing machine for checking the witness needs random access to the graph, because it cannot inspect the whole graph. This is modeled using oracles and type-2 complexity.
2.3 Induced Subgraph Numbers
Graphs in this paper are finite, undirected, and simple, i.e., they do not contain multi-edges or self-loops. Given a graph , we write and for its vertex and edge set, respectively. We write for the class of all graphs. Towards applying a Ramsey theorem, we will also consider ordered graphs: These are graphs where is a graph and is a total order on . We write for the class of these graphs. When ordered graphs are represented by an oracle, we always assume the natural lexicographic order on the vertices.
Given graphs and , we say that is a subgraph of if and . In the case of ordered graphs and , we additionally require that is the restriction of to . Given a vertex set , write for the subgraph of induced by . We write to denote that is an induced subgraph of , and we write for the poset of induced subgraphs of a graph , ordered by . We call pure if it contains no isolated vertices, i.e., no connected components isomorphic to , and we write for the class of pure graphs and for the pure ordered graphs.
A homomorphism from a graph into a graph is a function such that implies ; we write for their number. A homomorphism is an embedding if it is injective; we write for their number. A strong embedding is an embedding with the additional condition that also implies ; we write for their number. An isomorphism is a strong embedding that is bijective; two graphs and are isomorphic if there is an isomorphism from to . For isomorphic graphs we write . All of these definitions apply to ordered graphs as well; in this setting, homomorphisms must additionally preserve the orderings of and , i.e., if , then .
An automorphism of is a strong embedding from into ; we write for the numbers of automorphisms of . The numbers of embeddings and strong embeddings from a graph into some graph are multiples of . We obtain the number of subgraphs isomorphic to as and the number of induced subgraphs isomorphic to as . Note that for ordered graphs , we have . For a graph , we write for the function that maps . We use that these functions are linearly independent for non-isomorphic graphs , even when restricted to particular domains:
Fact 9.
Let be a finite set of pairwise non-isomorphic graphs. For , write for the restriction of to . Then the functions are linearly independent. The same applies for ordered graphs.
Having established that the functions are linearly independent, we consider linear combinations of these basis functions. The following term was coined in [7]:
Definition 10.
A graph motif parameter is a function that admits pairwise non-isomorphic pattern graphs and coefficients such that, for all graphs ,
| (1) |
We say that is pure if every graph with non-zero coefficient is pure. An ordered graph motif parameter similarly is a linear combination of induced subgraph counts from pairwise non-isomorphic ordered graphs. The support is the set of all , s.t. .
Note that being pure is well-defined, as the linear combination (1) is unique by Fact 9: The (isomorphism types of) graphs and coefficients appearing in the linear combination are uniquely determined; we therefore speak of the coefficients and the patterns of . Since we investigate whether or not a graph motif parameter has a combinatorial description, i.e., whether counts a set of objects, we will restrict ourselves to graph motif parameters with image , as all others can clearly not count anything. By the following lemma, this implies that the coefficients in the linear combination (1) can be assumed to be integers.
Lemma 11.
Let be an (ordered) graph motif parameter. Then for every (ordered) graph if and only if all coefficients of are integers.
Proof.
The “if” is obvious, since for any . For the “only if”, let be a pattern in with a coefficient , such that is minimal among all such patterns. Then and for some , since all other basis functions with non-integer rational coefficients evaluate to on , as their pattern has at least vertices and is not isomorphic to .
Lemma 12.
Let with all . Then . This holds for unordered and ordered graph motif parameters.
3 Our Results
3.1 Graphs
We characterize the pure graph motif parameters with a combinatorial interpretation as those whose coefficients are nonnegative integers. Such graph motif parameters clearly have a combinatorial interpretation, as they count all induced occurrences of pattern graphs, possibly with multiplicities (see Lemma 12.) On the other hand, if has a negative coefficient, then we show that is not in and thus does not admit a combinatorial interpretation. Formally we prove (recall Def. 7 about combinatorial interpretability):
Theorem 13.
Let be a pure graph motif parameter, i.e., for pairwise non-isomorphic pure and coefficients .222By Lemma 11 the interesting cases are those where all . Then is combinatorially interpretable iff all are nonnegative integers.
We show Theorem 13 in §5 with a delicate and technically demanding proof. The intuition is however rather easy to explain, and we illustrate it on the Example 2. An NTM is supposed to output , but it can only query a small part of . Assume that is either the union of a triangle and exponentially many singletons, or the union of an edge with exponentially many singletons, or is the graph that consists of only exponentially many singletons. If is an edge and singletons, then one computation path of must accept and must have queried the oracle at the edge, because otherwise this computation path would lead to an accepting path even if has no edges, which would be the wrong answer. Now, in the case that is a triangle and singletons, it turns out (which is delicate) that such an accepting path exists for each of the triangle edges, so we get 3 accepting computation paths, but , so the NTM has too many accepting paths.
3.2 More General Structures
The methodology used to prove Theorem 13 can be generalized to more general structures, provided that they satisfy the following conditions:
-
Ramsey property: We require Ramsey-type theorems for the structures, which assert – roughly speaking – that for given objects , there exists an object which contains enough -copies such that, by partitioning the set of all -copies in into a small number of parts, there is some part that contains a full -copy. In the case of pure graph motif parameters, for instance, this is Lemma 20: For any ordered graphs , there is an ordered graph such that for any partitioning of the induced subgraphs of that are isomorphic to , we can find an induced copy of in such that all induced copies of in this copy of are in the same part.
-
A neutral padding operation: We require an operation to enlarge a given object without increasing the number of induced subobjects. In the case of pure graph motif parameters, this can be achieved by adding isolated vertices, however we introduce different possible ways of achieving this for different objects, leading to different definitions of “pure”.
3.2.1 Relational Structures
For instance, we can prove results similar to Theorem 13 for various variants of relational structures, which generalize graphs. In graphs, we consider exactly one binary relation on a set , the edge relation. In a relational structure, we can have multiple relations of various arities. A relational structure is an induced substructure of another relational structure , if we can obtain by restricting the relations of to the vertices of . Similarly to graphs, we can now define motif parameters as linear combinations of induced substructure numbers. Formal definitions are given in the full version. We get the following classification for motif parameters of relational structures.
Theorem 14 (informal, see full version for a precise statement).
The evaluation function of a pure motif parameter of relational structures is combinatorially interpretable (i.e., it is in ) iff all its coefficients are nonnegative integers.
We also consider further types of relations, namely multiset relations (each relation is a multiset and can have repetitions), and list relations (which is the equivalent to directed graphs in the relational setting) with and without repetitions. We call these relational structures mixed. Precise definitions are again given in the full version. For such mixed relational structures, we get similar dichotomies, which we first prove in the ordered setting, since the corresponding Ramsey property only holds for ordered mixed relational structures. Then we transfer this theorem to the unordered setting – a similar detour into the ordered setting is in fact already needed for the case of graphs.
An application of mixed relational structures are colored graphs, where nodes are colored and isomorphisms have to respect the colors. We show a dichotomy result for colored graphs.
Due to space constraints all these generalizations can be found in the full version.
3.2.2 Category Theory
Ramsey-type theorems have even been established within category theory [11, 26]. Since Ramsey arguments are crucial in our method, this raises the interesting question whether we can also transfer our results into this realm. Indeed, we achieve this in the full version. Our objects of consideration are now objects of some category , and the subobjects we consider are now specified using a class of morphisms from . For two objects and of , we call the morphisms in , identified if they only differ by an isomorphism of , the -subobjects of under . In the case of undirected graphs, we choose to be the class of all strong graph embeddings. An important concept in our construction will be so-called factorization systems, used also by Lagodzinski [24] and Isbell [17] in the context of counting problems. We have a second class of morphisms and we require that every morphism factors (essentially uniquely) as with and . In our running example of unordered graphs, we can take to be the class of surjective graph homomorphisms and the above class of strong graph embeddings. Then every graph homomorphism factors through the subgraph of that is induced by the image of .
Recall that for graph motif parameters, we had a notion of pure graphs. We will have the same here, we will have two categories, a category of objects that we consider and category of pure objects that we want to count. In the full version, we collect a number of natural and rather mild requirements on and (maybe except for the property of being Ramsey) in order to prove our main result that -pure motif parameters can only have a combinatorial interpretation if the coefficients are all nonnegative.
Theorem 15 (informal, see full version for the precise statement).
Let be a -pure motif parameter. Then under some mild conditions, we have unless all its coefficients are nonnegative integers.
The theorem does not automatically give a dichotomy, since we do not have a corresponding upper bound when the coefficients are all nonnegative integers. However, our concrete applications in fact do admit corresponding upper bounds (all details can be found in the full version):
-
First we consider finite dimensional vector spaces over finite fields, which were shown to have the Ramsey property in [11] using methods from category theory. In particular, we show that the evaluation function of a motif parameter over a finite vector space is in iff all its coefficients are nonnegative integers.
-
Then we consider so-called parameter sets, which are subsets of for some fixed alphabet where we may set a coordinate to constant or identify any number of coordinates. They are also known to satisfy a Ramsey property, see [25, Theorem 10.4]. As a consequence of our general theorem, a motif parameter over parameter sets is in iff all its coefficients are nonnegative integers.
3.3 Linearization
As shown examplarily on the first pages, graph motif parameters enjoy a useful linearization property: Any polynomial in induced subgraph counts is a graph motif parameter, that is, a linear combination of induced subgraph numbers. Among other applications, this universality of linear combinations can be used to show that induced subgraph counts are algebraically independent, as any annihilating polynomial would in fact translate to a linear combination that always evaluates to zero, thus contradicting linear independence. Linearization also motivates our focus on linear combinations of subobject counts.
In the full version, we establish conditions for subobject numbers of a category to enjoy similar linearization properties. We obtain that the coefficients in the linearized version are positive and have a combinatorial interpretation if the polynomial we started with has positive coefficients in the binomial basis; see the next section.
4 Related Work
Searching for combinatorial interpretations for nonnegative integer quantities has a long tradition in combinatorics dating back to Cayley, Littlewood, Richardson, Schützenberger, Macdonald, Schensted, Knuth, Stanley, and others, e.g., [23, 35, 36], and has recently been studied from the perspective of #P containment and functional #P closure properties [28, 15, 16, 5]. The univariate relativizing functional closure properties of #P have been characterized in [4, Thm 3.1.1(b)]: Consider the binomial coefficient for as a polynomial in , and let be a univariate polynomial with expansion for . This expansion is called the binomial basis expansion of . It is known that iff all (see [15] for a proof). The connection between combinatorial interpretations and functional closure properties of #P is very recent [15].
In the language of graphs, the construction in [4] can be phrased as follows.
Let denote the graphs on whose vertices either form a triangle or a path
![]()
The non-oracle implications in the univariate case were studied in [27, p. 307], which shows that implies and (see [12] for an overview over these classes.) The multivariate case was resolved in [14, Thm 3.14], cf. [13, Thm 23], based on a Ramsey theorem [13, Thm 12]. As in the univariate case, one gets small variants of our main theorem, whereas we consider a larger set of non-isomorphic graphs of the same cardinality, each without isolated vertices. The functional closure properties of other counting classes are mainly unknown, with the exception of GapP [3, Thm 6] and finite automata [8].
While search versions of type-2 problems have been studied already in [1] and later works on TFNP, the counting versions have only recently been studied to understand whether their nonnegative transformations still have combinatorial interpretations [15]. We improve on the work [15] in various ways. First, we show much more general results due to our general theorem in the categorial framework. Second, our results are also valid for unordered objects, like unordered graphs, which do not have the required Ramsey property per se. We overcome this technical difficulty by carefully reducing the unordered case to the ordered one. All graph problems in [15] are on ordered graphs only.
5 Results for Graphs
5.1 Reduction to Ordered Graphs
To prove Theorem 13, we first prove the following stronger result about ordered graphs:
Theorem 16.
Let be a pure ordered graph motif parameter. Then iff all coefficients of are nonnegative integers.
Theorem 13 then follows by the simple observation that
| (2) |
where is any arbitrary linear ordering of and sums over all linear orders of that result in non-isomorphic ordered graphs. Note that all the pattern graphs occurring on the right-hand side are non-isomorphic pure ordered graphs. Even more, for different patterns on the left hand side, all possible occuring patterns on the right hand side are non-isomorphic. Now let be a pure graph motif parameter with a negative coefficient. Then the pure ordered graph motif parameter obtained by applying (2) to each of the basis functions has a negative coefficient and thus by Theorem 16. Assume for the sake of contradiction that , then we can compute by simulating the ordered oracle in polynomial time and running the corresponding NTM computing . This places in , a contradiction.
5.2 Ordered Graphs
The proof of the hardness in Theorem 16 now proceeds in two parts:
First we show how to instantiate any ordered graph and all its induced subgraphs as oracle graphs. Here it is important that the instantiations all have the same size; otherwise, the Turing machine could infer (through the number of vertices) possibly useful information about an instantiation without explicitly querying the oracle. For this purpose, for some subset , we say that an ordered graph motif parameter is -good if all its coefficients are nonnegative integers and all its patterns are from . If any such coefficient is negative, it is instead called -bad. If the domain is clear from the context, we will also simply call them good and bad respectively.
The second part of the proof finds a target graph such that, if there is any counterexample where any (restricted) good function would disagree with our bad , it must already occur on some induced subgraph of . For a fixed bad pure ordered graph motif parameter such a counterexample will always exist. We then instantiate precisely as oracle graphs.
Simply padding all graphs of to the same size however is not enough. The main goal here is that acceptance of a computation path is monotone relative to which part of the graph is observed by a computation, i.e., if we embed , then all previously accepting paths that accept on oracle input should still be accepting when the oracle input is .
Our proof follows the structure of [15]. We start by formalizing computation paths:
Definition 17.
A computation path of a nondeterministic Turing machine on some input is defined as the sequence of its nondeterministic choice bits and the answers to its oracle queries (both types of bits appear in the same list, ordered chronologically). Formally, it is an element of .
The same Turing machine can yield the same computation path on different inputs, for example, when not the whole input is read, or when having access to different oracles, because the oracles can differ in positions that are not queried.
Given a nondeterministic Turing machine and an oracle graph , we are interested in the number of accepting paths of when given oracle access to . We define the number of accepting paths of on input (given in binary) with oracle access to as . (This notation highlights that is given as an oracle while the number of vertices is a classical input.) Further we will also call a computation. In the following, write to denote those graphs in that contain exactly vertices, with the natural order on them, for .
Definition 18 (Set-instantiator against ).
Let be a nondeterministic Turing machine, let and let be an ordered pure graph motif parameter. Let be a symbolic top element above the induced subgraph poset , i.e., for all induced subgraphs . A set-instantiator is a triple of
some ,
an instantiation function , and
a perception function ,
such that both of the following properties hold for all induced subgraphs :
is an accepting path in computation iff ,
.
In the above definition, we think of the perception of an accepting computation path of to be the induced subgraph of that is observed by through oracle queries, while computation paths that do not accept are given perception .
We can now prove by a randomized construction that set-instantiators always exist for polynomial-time NTMs:
Lemma 19.
Let be a pure ordered graph motif parameter, let be a polynomial-time NTM computing and let . Then there is a set-instantiator against .
Proof.
For now let be undetermined, we will choose it accordingly later. Uniformly at random choose a function and let it induce a monotone function by mapping the -th smallest vertex to . We then set .
For an ordered graph , we denote by the graph on the vertices , where the edges are given as the images of the edges of under . Any vertex that is not in the image of is thus an isolated vertex. We are trying to find some , s.t. for all , all accepting paths of the computation do not access the oracle for any edges incident to vertices in . By the union bound, if we can show this happening with high probability for each of the finitely many individual , then there is such a , so for now fix some .
Now group the functions by their image of and fix one such group . Note that the image of the remaining vertices under is still independently and uniformly distributed from possible choices each by the choice of . Consider the set of all vertices queried by all accepting paths of the computation . Let be the worst-case running-time of on input . Each oracle query can query at most two vertices, there are at most many accepting paths and each accepting path can query the oracle at most many times and as such the size is bounded by , which is polynomial in due to being polynomial-time and being fixed. As a rough estimate using the Bernoulli inequality, this means that for a fraction of at least choices of resulting in , none of the accepting paths queries any edges incident to vertices in . By averaging across all possible (no longer restricted to ), we see that a fraction of choices have our desired property.
Using the union bound over all , we get that a fraction of choices of lead to the desired property for all simultaneously. By choosing large enough (remember that and are polynomially related, so this is always possible), this guarantees the existence of our desired , s.t. for all , all accepting paths of the computation do not access the oracle for any edges incident to vertices in .
We can now construct our set-instantiator with chosen as above. The instantiation function for is . For any computation path of a computation , denote by the set of vertices that are queried by the computation. We set
Note that the preimage is defined to be simply the set , even if is not entirely contained in the image of .
We check the properties of a set-instantiator. Clearly for all , since the two graphs are identical except isolated vertices and no pattern in contains any isolated vertices. Further we need that for all we have that is an accepting path for the computation iff . For this fix and let be an accepting path for the computation . By our choice of we know that no other vertices of are queried, or in other words, and thus is also an accepting path for the computation . This directly implies . On the other hand, let , then and is an accepting path of the computation , since , is also an accepting path of the computation .
If and are isomorphic, then it can still be the case that , which we do not want to happen in a good function. For example if is any graph containing some triangles, then we want the instantiation of every induced triangle of to have the same number of accepting paths. We thus want to construct an instantiation function that no longer has this problem. We achieve this by creating a very large set-instantiator and then restricting ourselves to some small part of the set-instantiator where we can enforce this property.
Ensuring this structure is achieved by usage of a Ramsey theorem. For two graphs and we denote by the subset of induced subgraphs of that are isomorphic to .
Lemma 20 (Main Theorem in [26]).
Let and let be fixed. Then there is a , such that for every , there is a with with the property that .
Note that Lemma 20 is the reason why we need to generalize our result to ordered graphs first instead of being able to prove it directly for unordered graphs. It is known that this type of Ramsey theorem does not hold for unordered graphs [25, Theorem 5.1].
By repeatedly applying Lemma 20 we get
Proposition 21 (Ramsey Theorem).
Let and let be fixed. Then there is an , such that for every there is a with with the property, if with , then .
This theorem can now be used to ensure that for every polynomial-time NTM there must be some small set of inputs where the NTM behaves like a good function.
Lemma 22.
Let be a pure ordered graph motif parameter, let be any nondeterministic polynomial-time Turing machine computing and let . Then there is some , a function and a -good function with and for every .
Proof.
We invoke Proposition 21 for and to obtain . We now want to color according to the behaviour of on the elements of it, in particular how many accepting paths query each specific induced subgraph of . Before we can do this we have to first embed all potential elements of as oracles. In order to do this we use Lemma 19 to construct a set-instantiator against to obtain a , and as in Definition 18.
For define . Note that
| (3) |
Using as a coloring for Proposition 21, we obtain . Note that now only depends on the isomorphism type of for all . By induction we see that the number also depends only on the isomorphism type of , whenever . Denote this number by and set . This is only well-defined for , thus we consider the domain of to be . This means that (3) simplifies:
| (4) |
where the sum is over all isomorphism classes of induced subgraphs . This implies that is -good.
It remains to define the function . Let , then for some with . The property now directly follows from being a set-instantiator.
Note that Lemma 22 only guarantees the local behaviour of to be -good. This leaves two possibilities for contradictions: Either the local behaviour is even -good, or the function grows too quickly for inputs with many isolated vertices. This next Witness Theorem shows that in the first case we can always find an ordered graph that proves that does not compute the correct function using a simple linear algebra argument.
Theorem 23 (The Witness Theorem).
Fix a bad . Then there exists such that for every -good function , there exists with .
Proof.
Choose to be the disjoint union of the graphs in and let , keeping only one representative per isomorphism class. By Fact 9, the functions for are linearly independent. Thus the patterns and coefficients of any function with support are uniquely determined by the evaluations on , and there exists a witness with , since is good while is bad (thus, and are not the same function).
With this last tool we are finally able to lead the existence of polynomial-time NTMs computing bad ordered graph motif parameters to a contradiction.
Lemma 24.
Let be any nondeterministic polynomial-time Turing machine computing for some bad pure ordered graph motif parameter . Then there is a and such that .
Proof.
We invoke the Witness Theorem 23 with our to obtain an as in the theorem. Construct such that
| for all | (5) | ||||
| for all | (6) |
by adding sufficiently many isolated vertices to . For this it is sufficient to add isolated vertices in between every pair of adjacent vertices in , relative to .
We now use Lemma 22 to obtain , a function and a -good function with and for every . There are now two possibilities to find the required counterexample : If there are no basis functions present in with a pattern in , we can invoke the Witness Theorem again. If any of these basis functions are present we can construct a direct counterexample.
We first handle the case where such a basis function is present, so let the coefficient of in be positive for some . Then
by (5) and (6), combined with the fact that is pure, and our counterexample is .
Otherwise assume that no such basis function is present. Then obtained from by restricting to (i.e. setting all coefficients of to zero), is -good, which is a requirement for the second part of the Witness Theorem 23 (we already used it to obtain ). Furthermore . We invoke the second part of the Witness Theorem 23 to find a point with which is our counterexample.
Independent of how we have obtained our we observe , which completes the proof for .
6 Overview of Results for Relational Structures and Categories
Induced subgraph counts can also be considered in a colored setting, where each vertex has a color and colored graphs are considered isomorphic if the colors of the vertices that are mapped to each other by the isomorphism have to agree. Such colored patterns occur, e.g., within the proofs of some parameterized hardness results [9, 6], as access to vertex-colors gives more versatility in the reduction. It is natural to ask whether our main theorem generalizes from graphs to colored graphs and to other more general versions of graphs, such as hypergraphs of bounded edge-cardinality. We answer this question positively for general relational structures; due to space limitations, this part only appears in the full version. The proof proceeds along the same lines as Theorem 13, our main theorem about graphs.
Generalizing further from relational structures, we consider the vastly more general setting of category theory in the full version and show a variant of Theorem 13 for motif parameters defined over categories. It turns out that the arguments from the main proof can be adapted for this purpose, but only after significant technical effort: While homomorphisms and embeddings naturally generalize from graphs to categories, nontrivial concepts from category theory are needed to formulate the proper requirements on categories that allow us to handle subobjects similarly to induced subgraphs. Among other such requirements, we require well-powered categories (where every subobject itself has only finitely many isomorphism types of subobjects) and proper factorization systems (that decompose morphisms into compositions of epimorphisms and monomorphisms). We then use the general category-theoretic framework to study combinatorial interpretations of counting problems related to finite vector spaces and to so-called parameter sets, which play a role in enumerative combinatorics.
References
- [1] Paul Beame, Stephen Cook, Jeff Edmonds, Russell Impagliazzo, and Toniann Pitassi. The relative complexity of NP search problems. In Proceedings of the twenty-seventh annual ACM Symposium on Theory of Computing, pages 303–314, 1995. doi:10.1145/225058.225147.
- [2] Paul Beame, Russell Impagliazzo, and Toniann Pitassi. Improved depth lower bounds for small distance connectivity. In 36th Annual Symposium on Foundations of Computer Science, Milwaukee, Wisconsin, USA, 23-25 October 1995, pages 692–701. IEEE Computer Society, 1995. doi:10.1109/SFCS.1995.492671.
- [3] Richard Beigel. Closure properties of GapP and #P. In Proceedings of the Fifth Israeli Symposium on Theory of Computing and Systems, pages 144–146. IEEE, 1997. doi:10.1109/ISTCS.1997.595166.
- [4] Jin-Yi Cai, Thomas Gundermann, Juris Hartmanis, Lane A Hemachandra, Vivian Sewelson, Klaus Wagner, and Gerd Wechsung. The boolean hierarchy II: Applications. SIAM Journal on Computing, 18(1):95–111, 1989. doi:10.1137/0218007.
- [5] Swee Hong Chan and Igor Pak. Computational complexity of counting coincidences. Theoretical Computer Science, 1015:114776, 2024. doi:10.1016/j.tcs.2024.114776.
- [6] Radu Curticapean. Count on CFI graphs for #P-hardness. In Proceedings of the 2024 ACM-SIAM Symposium on Discrete Algorithms, SODA 2024, Alexandria, VA, USA, January 7-10, 2024, pages 1854–1871. SIAM, 2024. doi:10.1137/1.9781611977912.74.
- [7] Radu Curticapean, Holger Dell, and Dániel Marx. Homomorphisms are a good basis for counting small subgraphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, pages 210–223, 2017. doi:10.1145/3055399.3055502.
- [8] Julian Dörfler and Christian Ikenmeyer. Functional closure properties of finite -weighted automata. In 51st International Colloquium on Automata, Languages, and Programming, ICALP 2024, July 8-12, 2024, Tallinn, Estonia, volume 297 of LIPIcs, pages 134:1–134:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.ICALP.2024.134.
- [9] Julian Dörfler, Marc Roth, Johannes Schmitt, and Philip Wellnitz. Counting induced subgraphs: An algebraic approach to #W[1]-hardness. Algorithmica, 84(2):379–404, 2022. doi:10.1007/s00453-021-00894-9.
- [10] Simon Döring, Dániel Marx, and Philip Wellnitz. Counting small induced subgraphs with edge-monotone properties. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 1517–1525. ACM, 2024. doi:10.1145/3618260.3649644.
- [11] RL Graham, K Leeb, and BL Rothschild. Ramsey’s theorem for a class of categories. Advances in Mathematics, 8(3):417–433, 1972.
- [12] Lane A. Hemaspaandra and Mitsunori Ogihara. The Complexity Theory Companion. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2002. doi:10.1007/978-3-662-04880-1.
- [13] Ulrich Hertrampf. Classes of bounded counting type and their inclusion relations. In STACS 95: 12th Annual Symposium on Theoretical Aspects of Computer Science Munich, Germany, March 2–4, 1995 Proceedings 12, pages 60–70. Springer, 1995. doi:10.1007/3-540-59042-0_62.
- [14] Ulrich Hertrampf, Heribert Vollmer, and Klaus W. Wagner. On the power of number-theoretic operations with respect to counting. In Proceedings of Structure in Complexity Theory. Tenth Annual IEEE Conference, pages 299–314, 1995. doi:10.1109/SCT.1995.514868.
- [15] Christian Ikenmeyer and Igor Pak. What is in #P and what is not? In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 860–871, 2022. doi:10.1109/FOCS54457.2022.00087.
- [16] Christian Ikenmeyer, Igor Pak, and Greta Panova. Positivity of the symmetric group characters is as hard as the polynomial time hierarchy. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3573–3586, 2023. doi:10.1137/1.9781611977554.ch136.
- [17] John Isbell. Some inequalities in hom sets. Journal of Pure and Applied Algebra, 76(1):87–110, 1991. doi:10.1016/0022-4049(91)90099-N.
- [18] Mark Jerrum and Kitty Meeks. The parameterised complexity of counting connected subgraphs and graph motifs. J. Comput. Syst. Sci., 81(4):702–716, 2015. doi:10.1016/j.jcss.2014.11.015.
- [19] Mark Jerrum and Kitty Meeks. Some hard families of parameterized counting problems. ACM Trans. Comput. Theory, 7(3):11:1–11:18, 2015. doi:10.1145/2786017.
- [20] Mark Jerrum and Kitty Meeks. The parameterised complexity of counting even and odd induced subgraphs. Comb., 37(5):965–990, 2017. doi:10.1007/s00493-016-3338-5.
- [21] David S. Johnson, Christos H. Papadimitriou, and Mihalis Yannakakis. How easy is local search? J. Comput. Syst. Sci., 37(1):79–100, 1988. doi:10.1016/0022-0000(88)90046-3.
- [22] Ker-I Ko. Complexity Theory of Real Functions. Birkhäuser / Springer, 1991. doi:10.1007/978-1-4684-6802-1.
- [23] Carl Kostka. Über den Zusammenhang zwischen einigen Formen von symmetrischen Functionen. Walter de Gruyter, Berlin/New York Berlin, New York, 1882.
- [24] Gregor Lagodzinski. Counting homomorphisms over fields of prime order (Zählen von Homomorphismen über Körper mit Primzahlordnung). PhD thesis, University of Potsdam, Germany, 2024. doi:10.25932/PUBLISHUP-64603.
- [25] Jaroslav Nešetřil. Ramsey theory. In Handbook of Combinatorics, volume 2, pages 1331–1403. Elsevier Science B.V., Amsterdam, 1995.
- [26] Jaroslav Nešetřil and Vojtech Rödl. Partitions of finite relational and set systems. J. Comb. Theory A, 22(3):289–312, 1977. doi:10.1016/0097-3165(77)90004-8.
- [27] Mitsunori Ogiwara and Lane A Hemachandra. A complexity theory for feasible closure properties. Journal of Computer and System Sciences, 46(3):295–325, 1993. doi:10.1016/0022-0000(93)90006-I.
- [28] Igor Pak. Short stories: Combinatorial inequalities. Notices of the American Mathematical Society, 66(7):1109–1112, 2019.
- [29] Igor Pak. What is a combinatorial interpretation? CoRR, abs/2209.06142, 2022. doi:10.48550/arXiv.2209.06142.
- [30] Greta Panova. Computational complexity in algebraic combinatorics. arXiv:2306.17511, 2023. doi:10.48550/arXiv.2306.17511.
- [31] Christos H. Papadimitriou. On the complexity of the parity argument and other inefficient proofs of existence. J. Comput. Syst. Sci., 48(3):498–532, 1994. doi:10.1016/S0022-0000(05)80063-7.
- [32] Marc Roth. Counting problems on quantum graphs. PhD thesis, Saarland University, Germany, 2019. doi:10.22028/D291-28348.
- [33] Marc Roth, Johannes Schmitt, and Philip Wellnitz. Counting small induced subgraphs satisfying monotone properties. In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 1356–1367. IEEE, 2020. doi:10.1109/FOCS46700.2020.00128.
- [34] Marc Roth, Johannes Schmitt, and Philip Wellnitz. Detecting and counting small subgraphs, and evaluating a parameterized tutte polynomial: Lower bounds via toroidal grids and cayley graph expanders. In 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021, July 12-16, 2021, Glasgow, Scotland (Virtual Conference), volume 198 of LIPIcs, pages 108:1–108:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPIcs.ICALP.2021.108.
- [35] M.-P. Schützenberger. La correspondance de Robinson. In Combinatoire et Représentation du Groupe Symétrique, pages 59–113, Berlin, Heidelberg, 1977. Springer Berlin Heidelberg.
- [36] Richard P Stanley. Positivity problems and conjectures in algebraic combinatorics. Mathematics: frontiers and perspectives, 295:319, 2000.
- [37] Richard P. Stanley. Enumerative Combinatorics, Volume 1, volume 49 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, second edition, 2011.
- [38] Klaus Weihrauch. Computable Analysis - An Introduction. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2000. doi:10.1007/978-3-642-56999-9.
