Hereditary First-Order Logic:
the Tractable Quantifier Prefix Classes
Abstract
Many computational problems can be modelled as the class of all finite structures that satisfy a fixed first-order sentence hereditarily, i.e., we require that every (induced) substructure of satisfies . We call the corresponding computational problem the hereditary model checking problem for , and denote it by .
We present a complete description of the quantifier prefixes for such that is in P; we show that for every other quantifier prefix there exists a formula with this prefix such that is coNP-complete. Specifically, we show that if is of the form or of the form , then can be solved in polynomial time whenever the quantifier prefix of is . Otherwise, contains or as a subword, and in this case, there is a first-order formula whose quantifier prefix is and is coNP-complete. Moreover, we show that there is no algorithm that decides for a given first-order formula whether is in P (unless PNP).
Keywords and phrases:
Quantifier prefix, first-order Logic, Computational Complexity, Polynomial-time algorithm, coNP-completenessCopyright and License:
2012 ACM Subject Classification:
Theory of computation Complexity theory and logicAcknowledgements:
This is a low-co2 research paper: https://tcs4f.org/low-co2-v1. This research was developed, written, submitted and presented without the use of air travel.Funding:
Both authors have been funded by the European Research Council (Project POCOCOP, ERC Synergy Grant 101071674). Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.Editors:
Stefano Guerrini and Barbara KönigSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Vertex-deletion problems were first studied as optimization problems [17, 18]: Given an input graph determine the minimum number of vertices to be deleted so that the remaining (induced) subgraph belongs to a specified class . Krishnamoorthy and Deo [17] showed that this problem is NP-hard for several natural graph classes such as trees, planar, bipartite, hamiltonian, interval, and chordal graphs. Shortly after, Lewis and Yanakkakis [18] proved that if is a non-trivial hereditary class of finite graphs (i.e., is an infinite class of finite graphs which is closed under induced subgraphs and which is not the class of all finite graphs), then the vertex-deletion problem defined by is -hard.
Given the previous hardness results, it was natural to study vertex-deletion problems from the viewpoint of approximation or of parametrized complexity. Regarding the former, Fujito [13] proposed a unified polynomial-time algorithm that approximates an optimal solution to a vertex-deletion problem defined by a non-trivial hereditary class. Some well-known results include the fixed parameter tractability of the odd-cycle transversal problem [21], and of the feedback vertex set problem [20] – the former corresponds to the vertex-deletion problem for bipartite graphs, and the latter to the vertex-deletion problem for forests.
Recently, the parametrized complexity perspective has been studied systematically for vertex-deletion problems defined by graph classes expressible by some first-order sentence [1, 12]. In this paper we are interested in the class of graphs such that no matter how many vertices are removed from , the remaining induced subgraph does not satisfy . Since first-order logic is closed under negations, we prefer the positive phrasing of this problem: Given a graph , test whether all non-empty induced subgraphs of satisfy a specified first-order formula .
Hereditary first-order logic
From now on we work in the setting of relational structures, for which graphs and digraphs are prototypical examples. We use the convention that all structures have a non-empty domain. We begin by introducing hereditary first-order logic. A structure hereditarily satisfies if every substructure111Some authors use the term “induced substructure”; we follow the convention in model theory and omit the adjective ‘induced . of satisfies . We denote by the class of finite structures that hereditarily satisfy . The hereditary model checking problem for a fixed sentence consists of deciding whether an input structure belongs to . Since verifying whether a finite structure models a fixed first-order formula can be done in polynomial time, it follows that is in coNP.
We say that a class of finite -structures is hereditarily first-order definable if there is a first-order sentence such that . In this case, we also say that is in . Clearly, every class in is hereditary, i.e., closed under taking substructures.
Three simple examples
Clearly, every hereditary class in is also in . The following graph and digraph classes are in , but not in .
Example 1 (Forests).
Consider a first-order formula stating that “there is a loopless vertex of degree at most ”. Clearly, every forest hereditarily satisfies . Conversely, suppose that is a finite graph that hereditarily satisfies . Then has a loopless vertex of degree at most . We inductively see that the subgraph with vertex set is a forest, i.e., has no cycles, and since has degree at most and does not have a loop, we conclude that is a forest.222This example naturally generalizes to -degenerate graphs, i.e., the class of graphs such that every subgraph of contains a vertex of degree at most .
Example 2 (Chordal graphs).
A graph is chordal if every cycle of contains a chord; equivalently, if contains no induced cycle of length . Rose [22] proved that a graph is chordal if and only if every induced subgraph contains a vertex whose neighbourhood induces a clique. Hence, if is a first-order sentence stating “there is a (loopless) vertex such that every two neighbours of are adjacent”, then describes the class of chordal graphs. It is known that membership in this class can be decided in polynomial time [23].
Example 3 (Directed acyclic digraphs).
The class of acyclic digraphs belongs to as well. Indeed, it suffices to consider the first-order sentence , i.e., the first order sentence that states that there exists a sink, i.e., a vertex without outgoing edges. Equivalently, the constraint satisfaction problem for (denoted by CSP, see Section 2) is in . Also this computational problem can be solved in polynomial time (e.g., by depth-first-search).
We are interested in the expressive power of hereditary first-order logic, as well as complexity classification for the hereditary model checking problem. Regarding the former, we will see that is a particularly natural formalism when it comes to the description of constraint satisfaction problems (for the definition of CSPs, see Section 2). Regarding the latter, we will study complexity classifications based on the quantifier prefix of the fixed first-order sentence . This is motivated by similar classifications for universal and existential second-order logic [9, 14], which we now review.
Prefix classifications
It is straightforward to observe that can be expressed in universal monadic second-order logic.
Observation 4.
Consider a first-order -formula and a finite structure . If is a unary predicate not in , then hereditarily satisfies if and only if models
where (respectively, ) is the set of indices such that is a universal (respectively, existential) quantifier.
The question whether a given existential second-order (ESO) sentence describes a polynomial-time solvable problem is easily seen to be undecidable (see, e.g., Theorem 1.4.2 in [2]). This motivates quantifier-prefix dichotomy results for existential second-order logic [9, 14]. For instance, the classification for monadic ESO is as follows [14, Figure 1(a)]. For every quantifier prefix the following holds:
-
either is of the form , and in this case every monadic ESO sentence whose first-order part has a quantifier-free prefix is polynomial-time decidable, or
-
contains or as subwords, and in this case there is an ESO sentence whose first-order part has quantifier prefix and deciding is NP-complete.
It follows from this classification, via Observation 4, that is in P whenever is a first-order sentence with a quantifier prefix of the form . However, as we will soon see, HerFO enjoys a richer polynomial-time solvable fragment.
Our contributions
We first study the expressive power of HerFO in the context of CSPs. We see that HerFO can express finite-domain CSPs note expressible in first-order logic (Example 25), that it can express coNP-complete infinite-domain CSPs (Theorem 8), and that every CSP expressible in HerFO is the CSP of an -categorical structure (Remark 6). We then turn to computational aspects of the hereditary model checking problem for fixed sentences . In particular, we show that if the signature of is monadic, then is solvable in polynomial-time (Proposition 11).
Our first main results are two classifications of the computational complexity of problems in HerFO. The first one concerns sentences with at least one binary predicate (Theorem 20), and the second one concerns sentences such that describes a CSP (Corollary 21). These classifications coincide, and are based on the quantifier prefix of the fixed first-order formula:
-
either is of the form or the form , and in this case is polynomial-time decidable for every first-order formula with quantifier prefix , or
-
contains or as a subword, and in this case there are first-order sentences with quantifier prefix such that is -complete.
Our final main result shows that more fine-grained classifications of the complexity are undecidable: there is no algorithm that tests for a given first-order sentence whether can be solved in polynomial time (Theorem 23). Moreover, we show that this is also the case even if is restricted to having quantifier-prefix or (Corollary 24). We leave open the decidability of this meta-problem for first-order sentences with quantifier prefix .
Outline
We begin by recalling some basic concepts from finite model theory (Section 2). We then list further examples of problems in HerFO, and prove that certain problems cannot be expressed in (Section 3); it will also become clear why hereditary model checking is quite natural in the context of constraint satisfaction. The classifications of the complexity of depending on the allowed quantifier prefix are the main results of Section 4. The undecidability of tractability of can be found in Section 5. A series of problems that are left open can be found in Section 6. In Appendix B we present a comparison between our complexity classifications, the complexity landscape for the parametrized version of HerFO from [12], and the general model checking problem for UMSO from [14].
2 Preliminaries
We assume basic familiarity with first-order logic and we follow standard notation from model theory, as, e.g., in [15]. We also use standard notions from complexity theory.
(First-order) structures
Given a relational signature and a -structure , we denote by the interpretation in of a relation symbol . Also, we denote relational structures with letters , and their domains by . In this article, structures have non-empty domains, and we only work with finite signatures .
If and are -structures, then a homomorphism from to is a map such that for all and of arity , if , then . We write if there exists a homomorphism from to , and we denote by the class of finite structures such that . Let be a class of finite -structures. We say that is
-
closed under homomorphisms if for every , if , then as well.
-
closed under inverse homomorphisms if for every , if , then as well (i.e., the complement of in the class of all finite -structures is closed under homomorphisms).
Note that is closed under inverse homomorphisms.
If and are -structures with disjoint domains and , respectively, then the disjoint union is the -structure with domain and the relation for every . Note that is closed under disjoint unions, i.e., if and , then . It is well-known that a class of finite -structures is of the form for some countably infinite -structure if and only if is closed under inverse homomorphisms and disjoint unions (see, e.g., [2, Lemma 1.1.8]).
For examples we will often use graphs and digraphs, and we will think of these as binary structures with signature . We follow and adapt standard notions from graph theory [6] to this setting. In particular, given a digraph , we call the edge set of , and its elements we call edges. A graph is a digraph whose edge set is a symmetric relation, and an oriented graph is a digraph whose edge set is an anti-symmetric relation. Given a positive integer , we denote by the complete graph on vertices, and by the directed path on vertices. An oriented path is an oriented graph whose symmetric closure is a path.
Fragments of first-order logic
A first-order -formula is called positive if it does not use the negation symbol (so it just uses the logical symbols , variables and symbols from ). The negation of a positive formula is called negative; note that every negative formula is equivalent to a formula in prenex conjunctive normal form where every atomic formula appears in negated form, and all negation symbols are in front of atomic formulas; such formulas will also be called negative.
Consider a quantifier-free formula using only variables, the symbols and , and symbols from , and let be the formula obtained from by iteratively removing each equality conjunct and substituting each occurrence of the variable by the variable . We now consider the structure whose vertex set is the set of variables appearing in , and the interpretation of consists of those tuples such that contains the positive literal . We say that a negative first-order sentence in prenex conjunctive normal form is connected if for every clause of the quantifier-free part of , the structure is connected.
3 Hereditary model checking and CSPs
In this section we see that hereditary model checking naturally arises in the context of constraint satisfaction problems. We do so by providing several examples of s in . We also include inexpressible examples of which help build intuition about hereditary model checking.
It is well-known that a universal first-order sentence in conjunctive normal form describes a if is negative and connected (see, e.g., Theorem 5.6.2 in [2]). The following observation extends this to not necessarily universal sentences.
Observation 5.
The following statements hold for every first-order sentence in prenex conjunctive normal form.
-
If is negative, then is closed under inverse homomorphisms.
-
If is negative and connected, then there is a structure such that .
Proof.
To prove the first itemized statement, suppose that there is a homomorphism and . Let be a substructure of . The substructure of with vertex set models . Suppose for contradiction that does not satisfy . Then it would satisfy the sentence , which is equivalent to a positive sentence. Since surjective homomorphisms preserve positive first-order formulas [15, Theorem 2.4.3], we obtain that satisfies , a contradiction.
It is straightforward to observe that if is negative and connected, then the class of finite models of is closed under disjoint unions. Hence, if every substructure of and every substructure of satisfies , then every substructure of satisfies , i.e., is closed under disjoint unions. By the first claim, is also closed under inverse homomorphisms, and it thus follows that there is a structure such that – for any class of -structures there exists a -structure such that if and only if is closed under disjoint unions and inverse homomorphisms [2, Lemma 1.1.8].
Remark 6.
Every in is the of an -categorical structure (see [2] for the many consequences that the model theoretic property of -categoricity has for the study of the complexity of the CSP). This follows from a result in [5, Corollary 14], which states that every in monadic second-order logic (MSO) is the for an -categorical structure; clearly, is a fragment of monadic second-order logic (see, e.g., Observation 4). For a concrete application of this remark see Example 26.
Example 7.
Let be the signature where and are two binary relational symbols which will encode “equal” and “not equal”, respectively. Note that a -structure belongs to if and only if the structure obtained from by contracting all connected components of the edge relation contains no loop . Equivalently, if and only if contains no loop and no contradicting cycle, i.e., vertices such that for all and . Despite the fact that the existence of such a cycle is not a first-order property, the problem is in . For a positive integer we write for the first-order formula stating “ has exactly neighbors in the relation (different from )”, and for its negation. Now, consider the formula
If does not hereditarily model , then there is a subset such that and every has degree exactly in the relation except for and , and so contains a contradicting cycle. Conversely, if has a contradicting cycle one can find a substructure of that does not model , namely, any shortest contradicting cycle. Therefore, is hereditarily defined by the first-order sentence .
Hard examples
Here, we present an example of a coNP-complete CSP in HerFO that we will also use in Corollaries 24 and 21; to see another (possibly better-known) one see the full version of this paper.
Let consist of two binary symbols and ; we think of a -structure is a digraph with blue and red edges. For a positive integer we denote by the structure with vertex set such that is a directed cycle , and is a complete (symmetric) graph. Let be the set containing the one-element structure with a red loop, and the one-element structure with a blue loop, and all structures for . Let be the class of finite -structures for which there is no homomorphism for any . Clearly, is closed under inverse homomorphisms. Since all structures in are connected, is also closed under disjoint unions, and hence for some -structure (see, e.g., [2, Lemma 1.1.8]).333The reader familiar with Fraïsse limits [16] may notice that can be chosen to be a countable homogeneous structure, i.e., such that every isomorphism between finite substructures of can be extended to an automorphism of . We show that is a coNP-complete CSP in . Consider the -sentence
Clearly, no loop satisfies . We now show that no structure , for , satisfies . Consider a pair of vertices . Firstly, there is some such that , because the blue edges define a directed cycle in . Secondly, if , then because the red edges induce a red clique. Hence, does not satisfy the last conjunct of . Also, since is negative and connected, it follows by Observation 5 that if for some structure , then does not hereditarily satisfy . On the other hand, observe that if a -structure does not hereditarily satisfy , then contains a loop, or there is a subset such that is a complete symmetric graph with at least two vertices, and every vertex has a blue out-neighbour. So if contains no loops, then the shortest directed blue cycle in induces a structure isomorphic to for some . Therefore, .
Theorem 8.
is a -complete hereditarily definable by an - and by an -sentence.
Proof.
It is easy to see that is equivalent to the sentence obtained from by changing the prefix to . Hence, it follows from the discussion above that is hereditarily definable by an - and by an -sentence. We now show that is -complete. Consider an instance of 3SAT with variables and clauses , where and for some . We construct a -structure with vertices for each and . The blue edges of consist of all pairs and , for and ; the red edges of correspond to the relation with , i.e., if and only if the literal does not equal the negation of the literal – in particular, is a symmetric relation without loops.
We claim that is satisfiable if and only if . Suppose there is a satisfying assignment for , and consider the vertices where is true in the clause . Then the substructure with domain satisfies that every vertex has a blue out-neighbour. Clearly, cannot be the negation of , so is a complete red graph. This shows that if is satisfiable, then there is a homomorphism .
Conversely, suppose that . Notice that if there is a substructure of that satisfies that every vertex has a blue out-neighbour, then contains a vertex for each . Moreover, if is a complete graph, then contains at most one vertex for each . Then the evaluation defined by if there is some clause such that and shows that is satisfiable.
Inexpressible examples
In this section we study the limitations of the expressive power of HerFO. Clearly, a class is first-order definable if and only if the complement of is first-order definable. A structure is a minimal obstruction of a hereditary class if but every proper substructure of belongs to . We show that if and only if the complement of contains a first-order definable subclass that contains all minimal obstructions of (Lemma 9).
We then apply this observation to show that the class of bipartite graphs is not in HerFO (Example 10). It is well-known that a graph is bipartite if and only if it does not contain an odd cycle, and that the class of odd cycles cannot be expressed by a first-order formula. However, this is not enough to show that the class of bipartite graphs is not in HerFO: there are properties that are not first-order definable, but the class of all -free structures is in ; see Example 7. To prove that the class of bipartite graphs is not in HerFO, we therefore need the following lemma which we prove in Appendix C.
Lemma 9.
Let be a hereditary class of finite -structures and let be the class of minimal obstructions of . Then is hereditarily first-order definable if and only if there is a first-order sentence such that
-
if , then , and
-
if is a finite -structure such that , then there is an embedding for some .
Building on this simple lemma we can now use standard Ehrenfeucht-Fraïssé arguments to show that certain hereditary classes are not in HerFO. If and are -structures, we write if and satisfy the same first-order -sentences with at most variables.
Example 10 (Bipartite graphs not in ).
It is well known that the minimal obstructions of the class of bipartite graphs are all odd symmetric cycles and the non-symmetric edge. For every positive integer , there is a large enough odd cycle and a large enough even cycle such that (this can be shown by an Ehrenfeucht-Fraïssé argument; see, e.g., [8]). Hence, we conclude via Lemma 9 that the class of bipartite graphs is not in . Moreover, note that is the class of bipartite digraphs. So it also follows from the existence of such cycles and Lemma 9 that is not in .
4 Hereditary model checking and quantifier prefixes
A quantifier prefix is a word ; we say that a first-order formula in prenex normal form has quantifier prefix if where and is a quantifier-free formula. In this section we present the following dichotomy for quantifier prefixes: for every quantifier prefix either
-
is in for every first-order formula with quantifier prefix , or
-
there exists a first-order formula with quantifier prefix such that is coNP-complete.
Moreover, we show that in the former case, is expressible by an ESO sentence whose first-order part is universal.
A relational signature is called monadic if all relation symbols in are monadic. It is easy to see that every problem in HerFO with a monadic signature is in FO, and so polynomial-time solvable. We postpone its simple proof to Appendix C.
Proposition 11.
Let be a finite monadic relational signature. For every first-order formula the class is universally definable and hence in .
From now on, we only consider the non-monadic case. The key components in the proof of our classification (Theorem 20) are Algorithm 1 (for one of the tractable cases) and the fact that the problem of deciding whether every directed cycle in an input digraph induces a symmetric edge is coNP-complete (Theorem 18) and expressible in HerFO (Lemma 17).
The fragment
In this section we prove that for every -formula there is a universal formula such that a structure hereditarily satisfies if and only if .
Lemma 12.
Let be a -formula with universally quantified variables. Then a structure hereditarily models if and only if every -element substructure of models .
Proof.
We prove the non-trivial (but straightforward) implication. Suppose that every substructure of with models , and let be a substructure of . If , then ; otherwise, for a -tuple , let be a tuple such that the quantifier-free part of is true of in the substructure of with vertex set . It follows that , and since such a exists for every , we conclude that , and therefore .
Corollary 13.
If is a -sentence, then is universally definable and hence polynomial-time solvable.
Proof.
If where is quantifier-free, let be the formula
By Lemma 12, a structure hereditarily satisfies if and only if it satisfies .
The fragment
An SNP -sentence (short for strict non-deterministic polynomial-time) is a sentence of the form
where is a quantifier free -formula [10, 19]. If a structure satisfies the sentence , we write . We say that a class of finite -structures is in SNP if there exists an SNP -sentence such that if and only if . We show that if is a -formula, then is in SNP P.
In the following we consider a fixed -formula
where is a quantifier-free -formula. We expand with an -ary relation symbol . We will interpret as a reflexive linear order with parameters, i.e., if the first arguments of are fixed, then the binary relation defined by the remaining two free variables is a linear order. It is straightforward to observe that there is a universal -formula such that is true of an -tuple in an -structure if and only if the binary relation defines a reflexive linear order on . Consider now the SNP sentence defined as follows.
Lemma 14.
A finite -structure hereditarily models if and only if it models .
Proof.
For the easy direction, suppose that models the first-order part of . For all , let be the minimum with respect to the linear order , i.e., the element such that for all . In particular, for all and the atomic formula holds in , and thus . Since the first-order part of is universal, every substructure of also models , and by the previous argument we conclude that models . Hence, hereditarily models .
Conversely, suppose that hereditarily satisfies . For every -tuple of we define a reflexive linear order such that the expansion
models the first-order part of . We define elements inductively as follows. Let be any element witnessing that satisfies
For , if no is a coordinate of for , choose to be any element witnessing that the substructure of with domain satisfies (such a vertex exists since hereditarily satisfies ). Otherwise, if some equals some coordinate of , then let be an arbitrary element of . We define the linear ordering if and only if , and let . It follows from the definition of and of that .
Suppose that satisfies for some . Let be the enumeration of corresponding to the linear ordering , and suppose that . Since , i.e., for every , it must be the case that every belongs to . It thus follows from the definition of that the substructure of with vertex set models , and thus . This shows that satisfies .
Certifying polynomial-time algorithm.
Consider a fixed first-order sentence
where is a universal formula.
Lemma 15.
For every first-order formula where is a universal formula for some quantifier-free formula , and for every -structure with domain the following statements hold.
Proof.
To prove the first claim, notice that (by finite induction) at the if statement in the repeat-until loop of the algorithm, the set contains all entries of . Hence, if the algorithm returns , then does not satisfy and hence does not hereditarily satisfy . Now we argue that the second itemized statement holds. It is straightforward to observe that is a reflexive linear order for every . Hence, it follows from the definition of that . To see that models the second conjunct of , let , , and (so is an evaluation of the universally quantified variables of in ). Further, suppose that
otherwise the second conjunct in the definition of is vacuously true for the tuple , . By the definition of , this means that for every . Hence, there is some iteration of the repeat-until loop such that . Let be the set and the end of this iteration of the loop, and . It follows from the definition of and the assumption that for each , that if , then . Since was added to in the -th iteration, it must be the case that in the -th iteration the if statement is not true for , i.e., . Since , we conclude that in particular where the universally quantified variables from are interpreted as . This means that
Since this is true for any choice of elements and in , we conclude that satisfies the first-order part of , and by Lemma 14 we conclude that .
Clearly, Algorithm 1 runs in polynomial time in the representation size of . Hence, the following statement is an immediate consequence of Lemma 15.
Theorem 16.
If is a -sentence, then there is an sentence such that a finite structure satisfies if and only if . Moreover, can be efficiently computed from , and there is a polynomial-time algorithm that either finds an expansion of proving that , or finds a substructure of such that .
Note that this theorem covers the first-order sentences that show that Example 1, Example 2, Example 3, and Example 27 are in , because they are sentences. Hence, the polynomial-time tractability of each of these problems follows from Theorem 16.
- and -sentences
Theorem 8 already provides examples of first-order sentences and with quantifier prefix and such that and are coNP-complete. Moreover, and are formulas with a binary signature (with two relation symbols). In this section we provide first-order sentences with the same quantifier prefixes over the signature of digraphs (they only use one binary symbol), which is the final ingredient needed to prove Theorem 20.
Consider the class of digraphs such that for every directed cycle of there exist such that is a symmetric edge of , i.e., . In this case, we say that every directed cycle of ‘induces a symmetric edge’.
Lemma 17.
Every directed cycle of a finite digraph induces a symmetric edge if and only if hereditarily satisfies the sentence
| (1) |
Proof.
Suppose there is a directed cycle of that does not induce a symmetric edge. Then the substructure of with vertex set satisfies the formula
and so does not hereditarily satisfy (1).
Conversely, suppose that every directed cycle of induces a symmetric edge and let . If contains a sink , then , and so satisfies (1). Otherwise, contains a directed cycle, and by assumption this directed cycle induces a symmetric edge , so by letting and we conclude that satisfies . Again, satisfies (1).
Now, we prove that deciding whether every directed cycle of an input digraph induces a symmetric edge is coNP-complete.
Theorem 18.
The problem of deciding whether every directed cycle in an input digraph induces a symmetric edge is -complete.
Proof.
Let be the class of digraphs described in this theorem, i.e., the class of all finite digraphs such that every directed cycle induces a symmetric edge. Fix to be a no-instance to this class, e.g., . We reduce from the coNP-hard CSP in Theorem 8 (deciding membership in ) to deciding membership to . First, on an input to , we check whether contains a blue loop, a red loop, or (i.e., a pair of vertices connected by symmetric pairs of blue and red edges). If so, the reduction returns a . Otherwise, the reduction returns the following digraph . The vertex set of equals the domain of , there are no loops , and for a pair of distinct vertices and we do the following.
-
If at least one of or is not a red edge in , then we connect and by a symmetric pair of edges and in .
-
If is a symmetric pair of red edges in , and is a blue edge in , then we connect them by a non-symmetric edge in .
-
If is a symmetric pair of red edges in , and and are not connected by blue edges in , we do not connect and by any edge in .
Clearly, the edge relation admits a quantifier-free definition in , and so, if is a substructure of , then is a substructure of . Clearly, for every with , the digraph is a directed cycle that does not induce any symmetric edge. Conversely, if is a directed cycle on vertices, then . It is straightforward to observe that a loopless -structure with no belongs to if and only if it does not embed any for . Hence, we obtained a polynomial-time reduction from deciding to deciding membership to . Therefore, the -hardness of follows from Theorem 8.
Corollary 19.
There are and (digraph) formulas and such that and are -complete.
Proof.
For consider the formula from Lemma 17 and for consider the quantifier reordering of , and notice that and are logically equivalent.
Prefix classification
The classification for general relational signatures follows from the lemmas proved earlier in this section.
Theorem 20.
Let be a relational signature which is not monadic. For every quantifier prefix one of the following statements hold:
-
is of the form , or of the form , and in this case is in for every first-order -sentence with quantifier prefix , or
-
contains a subword or , and in this case there is a first-order -sentence with quantifier prefix such that is -complete.
Proof.
Clearly, both items describe disjoint and complementary cases. The claim in the first item follows from Corollary 13 and Theorem 16. If contains a binary relation symbol, then the claim in the second item follows from Corollary 19. Otherwise, must contain a relation of arity at least three; however, we can use to model a binary relation, so the claim also holds in this case.
This classification also holds for the intersection of HerFO and CSPs (via Theorem 8).
Corollary 21.
For every quantifier prefix one of the following statements hold:
-
is of the form , or of the form , and in this case if for some structure and some first-order sentence with quantifier prefix , then is in , or
-
contains a subword or , and in this case there is a structure such that for some first-order sentence with quantifier prefix , and is -complete.
5 Undecidability of the tractability problem
The tractability problem for asks whether can be solved in polynomial time for a given first-order sentence . In this section we show that if , then the tractability problem for is undecidable. We begin with the following simple observation.
Observation 22.
Consider a first-order -sentence . If is a first-order -formula, then there is a -sentence such that a -structure satisfies if and only if there is no element that satisfies or the substructure of with domain satisfies . Namely, if , then
Otherwise,
where (respectively, ) is the set of indices such that is a universal (respectively, existential) quantifier.
Similarly, if is a monadic predicate and is a first-order sentence, then denotes the relativization of to the vertices in the set , and the relativization of to the complement of .
Theorem 23.
If , then it is undecidable to test whether the hereditary model-checking problem for a given first-order sentence is solvable in polynomial time.
Proof.
We use Trakhtenbrot’s theorem, which states that there is no algorithm that decides whether a given first-order formula has a finite model [24]. We reduce this decision problem to our problem.
Let be a first-order -sentence, let be a monadic predicate not in , and let a binary predicate also not in . Let be a first-order -sentence such that the hereditary model-checking problem for in -complete, e.g., can be chosen to be the sentence hereditarily describing the CSP from Corollary 19. Building on the relativizations and we define the following first-order -sentence
We claim that if does not have a finite model, then is polynomial-time solvable, and if has a finite model, then is -complete. We first observe that if does not have a finite model, then is valid on all finite -structures. To see this, let be a -structure. If , then satisfies the first disjunct of (Observation 22), and if , then the substructure induced by also satisfies , because has no finite models. In particular, this implies that can be solved in polynomial-time, because every instance is a yes-instance.
Now suppose that has a finite model. Let be a minimal model of , i.e., every proper substructure of satisfies . We present a polynomial-time reduction from to , which implies that is coNP-complete. Given an -structure we consider the -structure defined as follows:
-
the domain of is the disjoint union ,
-
the interpretation of in is ,
-
the interpretation of in equals , and
-
for every , the interpretation of in is .
The structure can be computed from the structure in polynomial time (the structure is constant). We now show that every substructure of satisfies if and only if every substructure of satisfies . First, note that if a substructure of does not contain , then satisfies . Indeed, if , then there is no element of that models , so by Observation 22 it follows that and hence . Otherwise, if , then the substructure of with domain satisfies , because is a proper substructure of , and is the smallest model of . Hence, , and so . These observations imply that every substructure of satisfies if and only if every substructure of with satisfies . If , then , so no element of satisfies , and similarly as above, we conclude that because . Finally, we assume that and , and so . Since and , we have that does not satisfy . Hence, if and only if , and the latter holds if and only if the substructure of with domain satisfies . We thus conclude that if and only if .
All together this shows that if , then has a finite model if and only if is not solvable in polynomial time.
We say that two quantifier prefixes are dual to each other if one can be obtained from the other by exchanging the symbols and .
Corollary 24.
Consider a quantifier prefix , and assume . If the finite satisfiability problem for first-order sentences whose quantifier prefix is undecidable, then the tractability problem for is undecidable for first-order sentences whose quantifier prefix is dual to . In particular, we obtain that the tractability problem for is undecidable if the quantifier prefix of the input may contain or as subwords.
Proof.
Denote by the dual of . It follows from [7, Theorem 3.0.1] that if the finite satisfiability problem for first-order sentences with quantifier prefix is undecidable, then contains or as a subword. Dually, contains or as a subword, and so, by Theorem 8, there is a FO sentence with quantifier prefix such that is -complete. Now, we can reduce from the finite satisfiability, because the finite satisfiability problem for the fragments of FO with quantifier prefix or is undecidable [7, Theorem 3.0.1].
6 Conclusion and Open Problems
We introduced the hereditary first-order model checking problem , and presented a complexity classification for based on allowed quantifier prefixes. A number of open problems are left for future research.
-
1.
Are there first-order sentences such that is coNP-intermediate (assuming )? We conjecture that there are.
-
2.
Is every finite-domain CSP which is in also in P? If , then it follows from the P vs. NP-complete finite-domain CSP dichotomy that this answer has a positive question. We ask for a proof which does not use any complexity-theoretic assumptions.
-
3.
Which finite-domain CSPs are in ?
-
4.
Is every CSP in also of the form for some negative connected sentence ?
- 5.
References
- [1] Max Bannach, Florian Chudigiewitsch, and Till Tantau. On the descriptive complexity of vertex deletion problems. In Rastislav Královic and Antonín Kucera, editors, 49th International Symposium on Mathematical Foundations of Computer Science, MFCS 2024, August 26-30, 2024, Bratislava, Slovakia, volume 306 of LIPIcs, pages 17:1–17:14, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.MFCS.2024.17.
- [2] Manuel Bodirsky. Complexity of Infinite-Domain Constraint Satisfaction. Lecture Notes in Logic (52). Cambridge University Press, Cambridge, United Kingdom; New York, NY, 2021. doi:10.1017/9781107337534.
- [3] Manuel Bodirsky and Santiago Guzmán-Pro. Hereditary First-Order Logic: the tractable quantifier prefix classes. Preprint arXiv:2411.10860, 2024. doi:10.48550/arXiv.2411.10860.
- [4] Manuel Bodirsky and Jan Kára. A fast algorithm and datalog inexpressibility for temporal reasoning. ACM Trans. Comput. Log., 11(3):15:1–15:21, 2010. doi:10.1145/1740582.1740583.
- [5] Manuel Bodirsky, Simon Knäuer, and Sebastian Rudolph. Datalog-expressibility for monadic and guarded second-order logic. In Nikhil Bansal, Emanuela Merelli, and James Worrell, editors, 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021, July 12-16, 2021, Glasgow, Scotland (Virtual Conference), volume 198 of LIPIcs, pages 120:1–120:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. Preprint available at https://arxiv.org/abs/2010.05677. doi:10.4230/LIPIcs.ICALP.2021.120.
- [6] J. Adrian Bondy and Uppaluri S. R. Murty. Graph Theory. Graduate Texts in Mathematics. Springer, Berlin, 2008. doi:10.1007/978-1-84628-970-5.
- [7] Egon Börger, Erich Grädel, and Yuri Gurevich. The Classical Decision Problem. Perspectives in Mathematical Logic. Springer, 1997.
- [8] Heinz-Dieter Ebbinghaus and Jörg Flum. Finite model theory. Perspectives in Mathematical Logic. Springer, Berlin, Heidelberg, New York, 1995. Second edition.
- [9] Thomas Eiter, Georg Gottlob, and Thomas Schwentick. The model checking problem for prefix classes of second-order logic: A survey. In Andreas Blass, Nachum Dershowitz, and Wolfgang Reisig, editors, Fields of Logic and Computation, Essays Dedicated to Yuri Gurevich on the Occasion of His 70th Birthday, volume 6300 of Lecture Notes in Computer Science, pages 227–250, Berlin, Heidelberg, 2010. Springer. doi:10.1007/978-3-642-15025-8_13.
- [10] Tomás Feder and Moshe Y. Vardi. The computational structure of monotone monadic SNP and constraint satisfaction: A study through datalog and group theory. SIAM J. Comput., 28(1):57–104, 1998. doi:10.1137/S0097539794266766.
- [11] Tomás Feder and Moshe Y. Vardi. Homomorphism closed vs. existential positive. In 18th IEEE Symposium on Logic in Computer Science (LICS 2003), 22-25 June 2003, Ottawa, Canada, Proceedings, pages 311–320. IEEE Computer Society, 2003. doi:10.1109/LICS.2003.1210071.
- [12] Fedor V. Fomin, Petr A. Golovach, and Dimitrios M. Thilikos. On the parameterized complexity of graph modification to first-order logic properties. Theory Comput. Syst., 64(2):251–271, 2020. doi:10.1007/s00224-019-09938-8.
- [13] Toshihiro Fujito. A unified approximation algorithm for node-deletion problems. Discret. Appl. Math., 86(2-3):213–231, 1998. doi:10.1016/S0166-218X(98)00035-3.
- [14] Georg Gottlob, Phokion G. Kolaitis, and Thomas Schwentick. Existential second-order logic over graphs: Charting the tractability frontier. J. ACM, 51(2):312–362, 2004. doi:10.1145/972639.972646.
- [15] Wilfrid Hodges. Model theory, volume 42 of Encyclopedia of mathematics and its applications. Cambridge University Press, Cambridge, 1993.
- [16] Wilfrid Hodges. A shorter model theory. Cambridge University Press, Cambridge, 1997.
- [17] Mukkai S. Krishnamoorthy and Narsingh Deo. Node-deletion np-complete problems. SIAM J. Comput., 8(4):619–625, 1979. doi:10.1137/0208049.
- [18] John M. Lewis and Mihalis Yannakakis. The node-deletion problem for hereditary properties is np-complete. J. Comput. Syst. Sci., 20(2):219–230, 1980. doi:10.1016/0022-0000(80)90060-4.
- [19] Christos H. Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci., 43(3):425–440, 1991. doi:10.1016/0022-0000(91)90023-X.
- [20] Venkatesh Raman, Saket Saurabh, and C. R. Subramanian. Faster fixed parameter tractable algorithms for finding feedback vertex sets. ACM Trans. Algorithms, 2(3):403–415, July 2006. doi:10.1145/1159892.1159898.
- [21] Bruce A. Reed, Kaleigh Smith, and Adrian Vetta. Finding odd cycle transversals. Oper. Res. Lett., 32(4):299–301, 2004. doi:10.1016/j.orl.2003.10.009.
- [22] Donald J. Rose. A note on consistent ordering and zero circulation. J. ACM, 18(4):573–575, 1971. doi:10.1145/321662.321671.
- [23] Donald J. Rose, Robert Endre Tarjan, and George S. Lueker. Algorithmic aspects of vertex elimination on graphs. SIAM J. Comput., 5(2):266–283, 1976. doi:10.1137/0205021.
- [24] Boris A. Trakhtenbrot. Understanding basic automata theory in the continuous time setting. Fundam. Informaticae, 62(1):69–121, 2004. (in Russian). URL: http://content.iospress.com/articles/fundamenta-informaticae/fi62-1-04.
Appendix A Examples
Polynomial-time solvable examples
Example 3 shows that is in but not in FO. There are also finite-domain CSPs that are in , but not in , as the following examples show. First, recall that the algebraic length of an oriented walk is the absolute value of the difference between the number of forward edges and the number of backward edges in . It is well-known (and straightforward to observe) that a digraph homomorphically maps to the directed path of length if and only if every oriented walk in has algebraic length at most .
Example 25.
The problem is in . Consider a first-order formula saying that there is a loop, or there are (directed) edges and such that the following hold:
-
has exactly one out-neighbour different from , and is the unique in-neighbour of ;
-
has exactly one in-neighbour different from , and is the unique out-neighbour of ;
-
every either
-
–
has no in-neighbours and exactly two out-neighbours, and they are different from , or
-
–
has no out-neighbours and exactly two in-neighbours, and they are different from .
-
–
Suppose that a loopless digraph satisfies , and let be witnesses that show that holds in . By using the first bullet, we find such that , and by using the third point iteratively, we find an alternating oriented path for some , i.e., , and . We claim that we can find an oriented walk of the form
We may choose so that and for . We first consider the case where is an in-neighbour of . In this case, we know by the first item that , because and has exactly one out-neighbour different from , namely . We also know that because the unique out-neighbour of is (second bullet). Finally, , because has exactly two in-neighbours, and they are different from , and so, . In the case where is an out-neighbour of , then because the unique in-neighbour of is . Similarly as before, we see that by applying the third bullet to .
Using the existence of such paths we can easily find an oriented path of algebraic length three in . Indeed, in the case where the path ends with an edge , and , we use the unique in-neighbour of to construct a path of length three . Now, consider the case when the path ends with an edge . If , then is a directed walk of length three, and if , then is an oriented walk of algebraic length three. Hence, if , then either contains a loop or a walk of algebraic length , and so .
Conversely, if , then either contains a loop or it has an oriented path of algebraic length . In the former case, clearly satisfies , and in the latter, by choosing the shortest such path we find a substructure of that models ; namely, the substructure with vertex set . Therefore, if , then .
Inexpressible examples and -categoricity
A digraph is balanced if every oriented cycle in has the same number of forward and of backward edges. Clearly, the class of balanced digraphs is hereditary, and so, it could a priory be expressible in HerFO. Here, we the notion of -categoricity (Remark 6) to prove that the class of balanced digraphs is not expressible in HerFO.
Example 26.
It is folklore that a digraph maps homomorphically to some directed path if and only is balanced. Equivalently, if is the successor relation in , then if and only if is a balanced digraph. Hence, the class of balanced digraphs is in HerFO iff is in HerFO. It is known that there is no -categorical structure whose CSP equals the CSP of [2, Proposition 5.8.2]. Therefore, (equivalently, the class of balanced digraphs) is not expressible in .
HerFO and Datalog
Datalog can be seen as the subclass of SNP where we require that the first-order part of the SNP sentence
-
is Horn, i.e., written in conjunctive normal form such that each clause contains at most one positive literal, and
-
is such that every positive literal in is existentially quantified.
A class of finite models is in Datalog if there exists an SNP sentence of the form described above such that a finite structure is in if and only if it does not satisfy . Note that if a class is in Datalog, then it is closed under homomorphisms444Our definition is standard, but different from the terminology of Feder and Vardi [10, 11], who defined that is in Datalog if its complement is in Datalog in the standard sense. and in P.
A simple example of a CSP which is solved by a Datalog program is . In Example 10 we showed that this class is not in . There are also CSPs that are in and in P, but not in Datalog.
Example 27.
For any positive integer consider a -ary relation symbol . The problem is in P, but not in Datalog [4]. Notice that the problem is in : consider the formula
Corollary 28.
The classes of s in Datalog and in are incomparable.
HerFO and pp definitions
A primitive positive formula is an existential positive formula whose quantifier-free part only uses variables and symbols from (i.e., disjunction is forbidden). Given a structure we say that a relation is primitively positively definable (pp-definable) in if there is a primitive positive formula such that if and only if . A class of structures is preserved by primitive positive definitions if for every , and every pp-definable relation in , the structure belongs to . In Example 3 we showed that is expressible in . The following example shows that is not expressible in , and so, the class of structures whose CSP in is not preserved by primitive positive definitions.
Example 29.
The problem is not hereditarily definable. Consider the family of structures and illustrated below (where undirected blue edges represent the (symmetric) binary relation , and directed black edges represent the binary relation ). It is not hard to see that for every positive integer ,
-
is a minimal obstruction of , and
-
.
Now, for every one can choose appropriately so that , and it thus follows by Lemma 9 that this CSP is not expressible in HerFO.
Appendix B Comparison of complexity classifications
As we discussed in the introduction, for every first-order sentence , there is a UMSO sentence such that is described by , and the first-order part of has the same quantifier-prefix as . Hence, if every problem expressible by a UMSO sentence with quantifier prefix can be solved in polynomial time, then can be solved in polynomial time for with quantifier prefix .
We also discussed the vertex deletion problem (parametrized version of HerFO): for a fixed first-order sentence , decide whether an input structure satisfies after deleting at most vertices. In Table 1, we present a comparison between our complexity classification based on the quantifier prefix and similar known classifications for UMSO and for the parametrized version. We restrict these classification to digraphs, because, as far as we are aware, complexity classifications for the vertex deletion problem have only been considered for digraphs.
| Quantifier prefix | HerFO | UMSO | Vertex deletion |
|---|---|---|---|
| (Theorem 20) | [14, Figure 1(a)] | [12, Theorem 1] | |
| Always in P | Always in P | Always FPT | |
| Always in P | Hard | Always FPT | |
| Always in P | Hard | Hard | |
| Hard | Hard | Always FPT | |
| Hard | Hard | Always FPT | |
| Hard | Hard | Hard |
Appendix C Two missing proofs
Proof of Lemma 9.
Suppose that for some first-order formula . We claim that satisfies both itemized statements. Firstly, note that if , then , so some substructure of does not satisfy . Moreover, all substructures of belong to , so we have that itself does not satisfy , and hence satisfies . If is a finite -structure such that , then , and hence there exists which embeds into . This shows the forward implication of the statement. Conversely, if there exists a formula that satisfies both items of the statement, then it is similarly straightforward to show that .
Proof of Proposition 11.
We claim that hereditarily models if and only if every substructure of with at most many elements satisfies . One direction follows from definition of hereditary satisfiability. For the converse implication, let be a substructure of and consider a minimal subset such that for every there is a such that for all . Clearly, and it is straightforward to observe that if and only if , and the claim follows.
