Two variable logic with ultimately periodic counting

We consider the extension of two variable logic with quantifiers that state that the number of elements where a formula holds should belong to a given ultimately periodic set. We show that both satisfiability and finite satisfiability of the logic are decidable. We also show that the spectrum of any sentence is definable in Presburger arithmetic. In the process we present several refinements to the ``biregular graph method''. In this method, decidability issues concerning two-variable logics are reduced to questions about Presburger definability of integer vectors associated with partitioned graphs, where nodes in a partition satisfy certain constraints on their in- and out-degrees.


Introduction
In the search for expressive logics with a decidable satisfiability problem, two-variable logic, denoted here as FO 2 , is one yardstick. This logic is expressive enough to subsume basic modal logic and many description logics, while satisfiability and finite satisfiability coincide, and both are decidable [23,15,9]. However, FO 2 lacks the ability to count. Two-variable logic with counting, C 2 , is a decidable extension of FO 2 that adds counting quantifiers. In C 2 one can express, for example, ∃ 5 x P (x) and ∀x∃ ≥5 y E(x, y) which, respectively, mean that there are exactly 5 elements in the unary relation P , and that every element in a graph has at least 5 adjacent edges. Satisfiability and finite satisfiability do not coincide for C 2 , but both are decidable [10,16]. In [16] the problems were shown to be NEXPTIME-complete under a unary encoding of numbers, and this was extended to binary encoding in [18]. However, the numerical capabilities of C 2 are quite limited. For example, one cannot express that the number of outgoing edges of each element in the graph is even.
A natural extension is to combine FO 2 with Presburger arithmetic where one is allowed to define collections of tuples of integers from addition and equality using boolean operators and quantifiers. The collections of k-tuples that one can define in this way are the semi-linear sets, and the collections of integers (when k = 1) definable are the ultimately periodic sets. Prior work has considered the addition of Presburger quantification to fragments of two-variable logic. For every definable set φ(x, y) and every ultimately periodic set S, one has a formula ∃ S y φ(x, y) that holds at x when the number of y such that φ(x, y) is in S. We let FO 2 Pres denote the logic that adds this construct to FO 2 .
On the one hand, the corresponding quantification over general k-tuples (allowing semilinear rather than ultimately periodic sets) easily leads to undecidability [11,3]. On the other hand, adding this quantification to modal logic has been shown to preserve decidability [1,7]. Related one-variable fragments in which we have only a unary relational vocabulary and the main quantification is ∃ S x φ(x) are known to be decidable (see, e.g. [2]), and their decidability is the basis for a number of software tools focusing on integration of relational languages with Presburger arithmetic [14]. The decidability of full FO 2 Pres is, to the best of our knowledge, still open [4]. There are a number of other extensions of C 2 that have been shown decidable; for example it has been shown that one can allow a distinguished equivalence relation [22] or a forest-structured relation [6,5]. FO 2 Pres is easily seen to be orthogonal to these other extensions.
In this paper we show that both satisfiability and finite satisfiability of FO 2 Pres are decidable. Our result makes use of the biregular graph method introduced for analyzing C 2 in [13]. The method focuses on the problem of existence of graphs equipped with a partition of vertices based on constraints on the out- and in-degree. Such a partitioned graph can be characterized by the cardinalities of the partition components, and the key step in showing these decidability results is to prove that the set of tuples of integers representing valid sizes of partition components is definable by a formula in Presburger arithmetic. From this "graph constraint Presburger definability" result one can reduce satisfiability in the logic to satisfiability of a Presburger formula, and from there infer decidability using known results on Presburger arithmetic.
The approach is closely-related to the machinery developed by Pratt-Hartmann (the "star types" of [21]) for analyzing the decidability and complexity of C 2 , its fragments [19], and its extensions [22,5]. An advantage of the biregular graph approach is that it is transparent how to extract more information about the shape of witness structures. In particular we can infer that the spectrum of any formula is Presburger definable, where the spectrum of a formula φ is the set of cardinalities of finite models of φ. It is also interesting to note that a more restricted version of our biregular graph method is used to prove the decidability of FO 2 extended with two equivalence relations [12].
Characterising the spectrum for general first order formulas is quite a difficult problem, with ties to major open questions in complexity theory [8]. This work can be seen as a demonstration of the power of the biregular graph method to get new decidability results. We make heavy use of both techniques and results in [13], adapting them to the richer logic. We also require additional inductive arguments to handle the interaction of ordinary counting quantifiers and modulo counting quantification.
Linear and ultimately periodic sets. Let N = {0, 1, 2, . . .} and N ∞ = N ∪ {∞}. A set of the form {a + ip | i ∈ N}, for some a, p ∈ N, is a linear set. We will denote such a set by a +p , where a and p are called the offset and period of the set, respectively. Note that, by definition, a +0 = {a}, which is a linear set. For convenience, we define ∅ and {∞} (which may be written as ∞ +p ) to also be linear sets. An ultimately periodic set (u.p.s.) is a finite union of linear sets.
Two-variable logic with ultimately periodic counting quantifiers. An atomic formula is either an atom R(ū), where R is a predicate and ū is a tuple of variables of appropriate size, or an equality u = u', with u and u' variables, or one of the formulas ⊤ and ⊥ denoting the True and False values. The logic FO 2 Pres is a class of first-order formulas using only variables x and y, built up from atomic formulas and equalities using the usual boolean connectives and also ultimately periodic counting quantification, which is of the form ∃ S x φ where S is a u.p.s. One special case is where S is a singleton {a} with a ∈ N ∞ , which we write ∃ a x φ; in the case of a ∈ N, these are counting quantifiers. The semantics of FO 2 Pres is defined as usual except that, for every a ∈ N, ∃ a x φ holds when there are exactly a elements x such that φ holds, ∃ ∞ x φ holds when there are infinitely many x such that φ holds, and ∃ S x φ holds when there is some a ∈ S such that ∃ a x φ holds.
Note that when S is {∞} ∪ 0 +1 = N ∞ , ∃ S x φ is equivalent to ⊤. When S is 0 +1 , ∃ S x φ semantically means that there are finitely many x such that φ holds. We define ∃ ∅ x φ to be ⊥ for any formula φ. We also note that ∃ 0 x φ is equivalent to ∀x ¬φ, and ¬∃ S x φ is equivalent to ∃ N∞−S x φ.
For example, we can state in FO 2 Pres that every node in a graph has even degree (i.e., the graph is Eulerian). Clearly FO 2 Pres extends C 2 , the fragment of the logic where only counting quantifiers are used, and FO 2 , the fragment where only the classical quantifier ∃x is allowed.
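To make the semantics of ∃ S concrete, the following is a minimal sketch that evaluates the sentence ∀x ∃ S y E(x, y) on a finite structure, with S given as a finite union of linear sets a +p . The names `Linear`, `in_ups` and `forall_exists_S` are illustrative choices, not notation from the paper, and the sketch assumes finite structures only, so the value ∞ is not modeled.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Linear:
    """The linear set a^{+p} = {a + i*p | i in N}; period 0 gives the singleton {a}."""
    offset: int
    period: int

    def contains(self, n: int) -> bool:
        if n < self.offset:
            return False
        if self.period == 0:
            return n == self.offset
        return (n - self.offset) % self.period == 0


def in_ups(n: int, ups: Sequence[Linear]) -> bool:
    """Membership of a finite count n in a finite union of linear sets."""
    return any(s.contains(n) for s in ups)


def forall_exists_S(domain, edges, ups) -> bool:
    """Evaluate 'forall x exists^S y E(x, y)' on a finite structure.

    edges is a set of pairs (x, y) interpreting the binary relation E.
    """
    domain = list(domain)
    return all(
        in_ups(sum(1 for y in domain if (x, y) in edges), ups)
        for x in domain
    )


# S = 0^{+2}: the even numbers.
EVEN = [Linear(0, 2)]
# An undirected triangle and an undirected path, as symmetric relations.
triangle = {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}
path = {(0, 1), (1, 0), (1, 2), (2, 1)}
```

On the triangle every vertex has two neighbours, so the even-degree sentence holds; on the path the two endpoints have odd degree, so it fails.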
Presburger arithmetic. An existential Presburger formula is a formula of the form ∃x 1 . . . x k φ, where φ is a quantifier-free formula over the signature including constants 0, 1, a binary function symbol +, and a binary relation ≤. Such a formula is a sentence if it has no free variables. The notion of a sentence holding in a structure interpreting the function, relations, and constants is defined in the usual way. The structure N = (N, +, ≤, 0, 1) is defined by interpreting +, ≤, 0, 1 in the standard way, while the structure N ∞ = (N ∞ , +, ≤, 0, 1) is the same except that a + ∞ = ∞ and a ≤ ∞ for each a ∈ N ∞ .
It is known that the satisfiability of existential Presburger sentences over N is decidable and belongs to NP [17]. Further, the satisfiability problem for N ∞ can easily be reduced to that for N . Indeed, we can first guess which variables are mapped to ∞ and then which atoms should be true, then check whether each guessed atomic truth value is consistent with other guesses and determine additional variables which must be infinite based on this choice, and finally restrict to atoms that do not involve variables guessed to be infinite, and check that the conjunction is satisfiable by standard integers.

Main result
In this section we prove the decidability of FO 2 Pres satisfiability. Our decision procedure is based on the key notion of regular graphs. Note that whenever we talk about graphs or digraphs (i.e., directed graphs), by default we allow both finite and infinite sets of vertices and edges.

ICALP 2020, 112:4, Two Variable Logic with Ultimately Periodic Counting

Regular graphs
In the following we fix an integer p ≥ 0. Let N ∞,+p denote the set whose elements are either a or a +p , where a ∈ N ∞ . For integers t, m ≥ 1, let N t×m ∞,+p denote the set of matrices with t rows and m columns where each entry is an element from N ∞,+p .
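A t-color bipartite (undirected) graph is G = (U, V, E 1 , . . . , E t ), where U and V are sets of vertices and E 1 , . . . , E t are pairwise disjoint sets of edges between U and V . Edges in E i are called E i -edges. We will write an edge in a bipartite graph as (u, v) ∈ U × V . The E i -degree of a vertex is the number of E i -edges incident to it. We say that G is complete if every pair (u, v) ∈ U × V is an E i -edge for some i. For matrices A ∈ N t×m ∞,+p and B ∈ N t×n ∞,+p , we say that G is an A|B-biregular graph if there exist a partition U 1 , . . . , U m of U and a partition V 1 , . . . , V n of V such that for every 1 ≤ i ≤ t, the E i -degree of every vertex in U k is A i,k for every 1 ≤ k ≤ m, and the E i -degree of every vertex in V l is B i,l for every 1 ≤ l ≤ n. We say that G has size M̄ | N̄ , where M̄ = (|U 1 |, . . . , |U m |) and N̄ = (|V 1 |, . . . , |V n |). The partition U 1 , . . . , U m and V 1 , . . . , V n is called a witness partition. We should remark that some U i and V i are allowed to be empty.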
The above definition can be easily adapted to the case of directed graphs that are not necessarily bipartite. A t-color digraph is G = (V, E 1 , . . . , E t ), where E 1 , . . . , E t are pairwise disjoint sets of directed edges on a set of vertices V . As before, edges in E i are called E i -edges. The E i -indegree and E i -outdegree of a vertex u are defined as the number of incoming and outgoing E i -edges incident to u, respectively.
In a t-color digraph G we will assume that there are no self-loops, that is, (v, v) is not an E j -edge for any vertex v and any E j . This will suffice for the digraphs that arise in our decision procedure. We say that a digraph G is complete if for every u, v ∈ V with u ≠ v, either (u, v) or (v, u) is an E i -edge for some E i . We say that G is an A|B-regular digraph, where A, B ∈ N t×m ∞,+p , if there exists a partition V 1 , . . . , V m of V such that for every 1 ≤ i ≤ t and every 1 ≤ k ≤ m, the E i -indegree and E i -outdegree of every vertex in V k are A i,k and B i,k , respectively. We say that G has size (|V 1 |, . . . , |V m |), and call V 1 , . . . , V m a witness partition.
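As an illustration of the definition of an A|B-regular digraph, the sketch below checks whether a given finite t-color digraph, together with a candidate witness partition, meets the degree constraints. The encoding of a matrix entry as a pair `(a, periodic)` and the function names are assumptions made for this example; infinite degrees and infinite graphs are not modeled.

```python
def matches(degree, entry, p):
    """Does a finite degree satisfy a matrix entry?

    entry = (a, False) stands for the fixed entry a,
    entry = (a, True) stands for the periodic entry a^{+p}.
    """
    a, periodic = entry
    if not periodic or p == 0:
        return degree == a
    return degree >= a and (degree - a) % p == 0


def is_regular_digraph(edge_sets, partition, A, B, p):
    """Check A|B-regularity of a finite t-color digraph for a given partition.

    edge_sets[i] is the set of directed E_i-edges (u, v);
    partition[k] lists the vertices of V_k;
    A[i][k] constrains the E_i-indegree and B[i][k] the E_i-outdegree on V_k.
    """
    for i, edges in enumerate(edge_sets):
        indeg, outdeg = {}, {}
        for (u, v) in edges:
            outdeg[u] = outdeg.get(u, 0) + 1
            indeg[v] = indeg.get(v, 0) + 1
        for k, block in enumerate(partition):
            for v in block:
                if not matches(indeg.get(v, 0), A[i][k], p):
                    return False
                if not matches(outdeg.get(v, 0), B[i][k], p):
                    return False
    return True


# A directed 3-cycle with one color and a single partition class.
cycle = {(0, 1), (1, 2), (2, 0)}
```

For the directed 3-cycle, every vertex has indegree and outdegree 1, so the check succeeds for the fixed entry 1 and, with p = 2, for the periodic entry 1 +2 , but fails for 0 +2 .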
Lemma 2 below will be the main technical tool for our decidability result. Let x̄ and ȳ be vectors of variables of length m and n, respectively.

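Lemma 2. For every A ∈ N t×m ∞,+p and B ∈ N t×n ∞,+p , there exists an (effectively computable) existential Presburger formula c-bireg A|B (x̄, ȳ) such that for every (M̄ , N̄ ) ∈ N m ∞ × N n ∞ , the following holds. There is a complete A|B-biregular graph with size M̄ | N̄ if and only if c-bireg A|B (M̄ , N̄ ) holds in N ∞ .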
Lemma 3 below is the analog for digraphs.

Lemma 3. For every A ∈ N t×m ∞,+p and B ∈ N t×m ∞,+p , there exists an (effectively computable) existential Presburger formula c-reg A|B (x̄) such that for every M̄ ∈ N m ∞ , the following holds. There is a complete A|B-regular digraph with size M̄ if and only if c-reg A|B (M̄ ) holds in N ∞ .

Lemmas 2 and 3 can be easily readjusted when we are interested only in finite sizes, i.e., M̄ ∈ N m and N̄ ∈ N n , by requiring the formulas to hold in N instead of N ∞ . Alternatively, we can also state inside the formulas that none of the variables in x̄ and ȳ are equal to ∞.
The proofs of these two lemmas are discussed in Section 4.

Decision procedure
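Theorem 4 below is the main result in this paper.

Theorem 4. For every FO 2 Pres sentence φ, one can effectively construct an existential Presburger sentence PRES φ such that (i) φ has a model if and only if PRES φ holds in N ∞ , and (ii) φ has a finite model if and only if PRES φ holds in N .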
From the decision procedure for existential Presburger formulas (Theorem 1) mentioned in Section 2, we immediately obtain the following corollary.

Corollary 5. Both satisfiability and finite satisfiability for FO 2 Pres are decidable.
We will sketch how Theorem 4 is proven, making use of Lemmas 2 and 3. We start by observing that satisfiability (and spectrum analysis) for an FO 2 Pres sentence can be converted effectively into the same questions for a sentence in a variant of Scott normal form:

\[ \varphi \;:=\; \forall x \forall y\; \alpha(x,y) \;\wedge\; \bigwedge_{i=1}^{k} \forall x\, \exists^{S_i} y\; \big(\beta_i(x,y) \wedge x \neq y\big) \qquad (1) \]

where α(x, y) is a quantifier-free formula, each β i (x, y) is an atomic formula and each S i is a u.p.s. The proof, which is fairly standard, will appear in the full version of this paper. By taking the least common multiple, we may assume that all the (non-zero) periods in all S i are the same.

We recall some standard terminology. A 1-type is a maximally consistent set of atomic and negated atomic unary formulas using only variable x. A 1-type can be identified with the quantifier-free formula that is the conjunction of its constituent formulas. Thus, we say that an element a in a structure A has 1-type π if π holds on the element a. We denote by A π the set of elements in A with 1-type π. Clearly the domain A of a structure A is partitioned into the sets A π . Similarly, a 2-type is a maximally consistent set of atomic and negated atomic binary formulas using only variables x, y, containing the predicate x ≠ y. The notion of a pair of elements (a, b) in a structure A having 2-type E is defined as with 1-types. We denote by Π = {π 1 , π 2 , . . . , π n } and E = {E 1 , . . . , E t , ←E 1 , . . . , ←E t } the sets of all 1-types and all 2-types, respectively, where ←E i denotes the reversal of E i .

Let g : E × Π → N ∞,+p be a function. We will use such a function g to describe the "behavior" of the elements in the following sense. Let A be a structure. We say that an element a ∈ A behaves according to g if for every E ∈ E and for every π ∈ Π, the number of elements b ∈ A π such that the 2-type of (a, b) is E belongs to g(E, π). We denote by A π,g the set of all elements in A π that behave according to g. The restriction of g on 1-type π is the function g π : E → N ∞,+p , where g π (E) = g(E, π). We call the function g π the behavior (function) towards 1-type π.
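The notion of an element behaving according to g can be illustrated on a small finite structure. The sketch below is only an approximation of the paper's definitions under stated assumptions: a 1-type is represented by the set of unary predicates that hold, a 2-type by the set of binary atoms R(x, y) and R(y, x) that hold, and an entry of N ∞,+p by a pair `(a, periodic)`. All function names are hypothetical.

```python
from collections import Counter


def one_type(a, unary):
    """1-type of a, recorded as the set of unary predicate names holding at a."""
    return frozenset(name for name, ext in unary.items() if a in ext)


def two_type(a, b, binary):
    """2-type of (a, b) for a != b, recorded as the set of atoms that hold.

    ("R", "xy") stands for R(a, b) and ("R", "yx") for R(b, a).
    """
    return frozenset(
        (name, direction)
        for name, ext in binary.items()
        for direction, pair in (("xy", (a, b)), ("yx", (b, a)))
        if pair in ext
    )


def behavior_counts(a, domain, unary, binary):
    """For each pair (E, pi), count the b != a of 1-type pi with 2-type(a, b) = E."""
    return Counter(
        (two_type(a, b, binary), one_type(b, unary))
        for b in domain
        if b != a
    )


def in_entry(n, entry, p):
    """Membership of a finite count in a fixed entry a or a periodic entry a^{+p}."""
    a, periodic = entry
    if periodic and p > 0:
        return n >= a and (n - a) % p == 0
    return n == a


def behaves_according_to(counts, g, p):
    """g maps (E, pi) to an entry; pairs missing from g are treated as the entry 0."""
    keys = set(counts) | set(g)
    return all(in_entry(counts.get(k, 0), g.get(k, (0, False)), p) for k in keys)


domain = [0, 1, 2, 3]
unary = {"P": {1, 2}}
binary = {"E": {(0, 1), (0, 2), (3, 0)}}
counts = behavior_counts(0, domain, unary, binary)
```

Here element 0 has two outgoing E-edges into elements satisfying P and one incoming E-edge from an element not satisfying P, so it behaves according to a function g that assigns 0 +2 to the first combination and the fixed entry 1 to the second.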
We are, of course, only interested in functions g that are consistent with the sentence φ in (1), and we formalize this as follows: for every π ∈ Π and for every i the following holds: 2 If A |= φ then A (π,g) = ∅, whenever π and g are incompatible, and in addition every element in A behaves only according to some good function.
The main idea is to construct the sentence PRES φ that "counts" the cardinality |A π,g | in every structure A |= φ, for every π and g. Toward this end, let G = {g 1 , g 2 , . . . , g m } enumerate all good functions. Note that G can be computed effectively from the sentence φ, since it suffices to consider functions g : E × Π → N ∞,+p with codomain {0, . . . , a, 0 +p , . . . , a +p , ∞}, where a is the maximal offset of the (non-∞) elements in S 1 ∪ · · · ∪ S k . The sentence PRES φ will be of the form

\[ \mathrm{PRES}_{\varphi} \;:=\; \exists \bar{X}\;\; \mathrm{consistent}_1(\bar{X}) \wedge \mathrm{consistent}_2(\bar{X}) \]

where X̄ is a vector of variables (X π1,g1 , X π1,g2 , . . . , X πn,gm ). Intuitively, each X πi,gj represents |A πi,gj |. By the formulas consistent 1 (X̄) and consistent 2 (X̄), we capture the consistency of the integers X̄ with the formulas ∀x∀y α(x, y) and ⋀_{i=1}^{k} ∀x ∃ Si y (β i (x, y) ∧ x ≠ y), respectively.
We start by defining the formula consistent 1 (X̄). Letting H be the set of all pairs (π, g) where π and g are incompatible, the formula consistent 1 (X̄) can be defined as

\[ \mathrm{consistent}_1(\bar{X}) \;:=\; \bigwedge_{(\pi,g) \in H} X_{\pi,g} = 0 . \]

Towards defining the formula consistent 2 (X̄), we introduce some notation. For π ∈ Π, define the matrices M π , ←M π ∈ N t×m ∞,+p as follows: for 1 ≤ i ≤ t and 1 ≤ j ≤ m,

\[ (M_{\pi})_{i,j} := g_j(E_i, \pi), \qquad (\overleftarrow{M}_{\pi})_{i,j} := g_j(\overleftarrow{E}_i, \pi). \]

The idea is that M π captures all possible behavior towards 1-type π, where each column j represents the behavior of g j towards π. Note that for a structure A and 1-type π, the restriction of A on the set A π can be viewed as a t-color digraph G = (V, E 1 , . . . , E t ). It is sufficient to consider only the 2-types E 1 , . . . , E t , because each E i determines its reversal ←E i . Moreover, an element a has an incoming E i -edge if and only if it has an outgoing ←E i -edge. Thus, if A |= φ, the graph G is a complete M π | ←M π -regular digraph.

Now, we explain how to capture the behavior between elements with distinct 1-types. Define matrices L π , ←L π ∈ N 2t×m ∞,+p as follows:

\[ L_{\pi} := \begin{pmatrix} M_{\pi} \\ \overleftarrow{M}_{\pi} \end{pmatrix}, \qquad \overleftarrow{L}_{\pi} := \begin{pmatrix} \overleftarrow{M}_{\pi} \\ M_{\pi} \end{pmatrix}. \]

That is, in L π the first t rows come from M π with the next t rows from ←M π . On the other hand, in ←L π the first t rows come from ←M π , followed by the t rows from M π .
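The idea is that for a structure A, the 2-types that are realized between A π and A π' , for distinct 1-types π and π', can be viewed as a 2t-color bipartite graph G = (A π , A π' , E 1 , . . . , E t , ←E 1 , . . . , ←E t ), where the directions of the edges are ignored. Moreover, a pair (a, b) has 2-type E if and only if (b, a) has 2-type ←E. Thus, if A |= φ, the graph G is a complete L π' | ←L π -biregular graph.

Now we are ready to define the formula consistent 2 (X̄). We enumerate all the 1-types π 1 , . . . , π n , write X̄ π for the vector (X π,g1 , . . . , X π,gm ), and define consistent 2 as follows:

\[ \mathrm{consistent}_2(\bar{X}) \;:=\; \bigwedge_{1 \le i \le n} \text{c-reg}_{M_{\pi_i} | \overleftarrow{M}_{\pi_i}}(\bar{X}_{\pi_i}) \;\wedge\; \bigwedge_{1 \le i < j \le n} \text{c-bireg}_{L_{\pi_j} | \overleftarrow{L}_{\pi_i}}(\bar{X}_{\pi_i}, \bar{X}_{\pi_j}) . \]

The formula consistent 1 (X̄) is Presburger definable by inspection, while consistent 2 (X̄) is Presburger definable using Lemmas 2 and 3. The correctness comes directly from the following lemma.

Lemma 6. For every structure A |= φ, the formula consistent 1 (X̄) ∧ consistent 2 (X̄) holds in N ∞ when each X π,g is assigned the value |A π,g |. Conversely, for every assignment to X̄ under which consistent 1 (X̄) ∧ consistent 2 (X̄) holds in N ∞ , there is a structure A |= φ such that |A π,g | = X π,g for every π and g.

Proof. Let φ be in Scott normal form as in (1). As before, Π = {π 1 , π 2 , . . . , π n } denotes the set of all 1-types and E = {E 1 , . . . , E t , ←E 1 , . . . , ←E t } the set of all 2-types. Recall that each 2-type E contains the predicate x ≠ y and that G = {g 1 , . . . , g m } is the set of all good functions.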
Note that for π, π' ∈ Π and E ∈ E, the conjunction π(x) ∧ E(x, y) ∧ π'(y) corresponds to a boolean assignment of the atomic predicates in α(x, y).
We first prove the first statement in the lemma. Let A |= φ. Partition A into A π,g 's. We will show that consistent 1 (X) ∧ consistent 2 (X) holds when each X π,g is assigned with the value |A π,g |.
Next, we will show that consistent 2 (X̄) holds. Let π ∈ Π. By definition of A π , the restriction of A to A π is a complete M π | ←M π -regular digraph G = (V, E 1 , . . . , E t ), with size (|A π,g1 |, . . . , |A π,gm |). Thus, by Lemma 3, c-reg Mπ|←Mπ (X̄ π ) holds. For π i , π j ∈ Π, where i < j, the structure A restricted to A πi and A πj can be viewed as a complete L πj | ←L πi -biregular graph G = (U, V, E 1 , . . . , E t , ←E 1 , . . . , ←E t ), where U = A πi and V = A πj , and for each 1 ≤ l ≤ t, the interpretation denoted (by a slight abuse of notation) as E l consists of all pairs (a, b) ∈ A πi × A πj whose 2-type is E l , and similarly for ←E l . By Lemma 2, c-bireg Lπj|←Lπi (X̄ πi , X̄ πj ) holds.

Now we prove the second statement. Suppose PRES φ holds. By definition, there exists an assignment to the variables in X̄ such that consistent 1 (X̄) ∧ consistent 2 (X̄) holds. Abusing notation as we often do in this work, we denote the value assigned to each X π,g by the variable X π,g itself.
For each (π, g), we have a set V π,g with cardinality X π,g . We denote by V π the union ⋃_{g} V π,g . We construct a structure A that satisfies φ as follows.
The domain is A = ⋃_{π,g} V π,g . For each π ∈ Π, for each a ∈ V π , the unary atomic formulas on a are defined such that the 1-type of a becomes π.
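For each π ∈ Π, the binary predicates on (u, v) ∈ V π × V π are defined as follows. Since c-reg Mπ|←Mπ (X̄ π ) holds, by Lemma 3 there is a complete M π | ←M π -regular digraph G = (V π , E 1 , . . . , E t ) with size X̄ π . The edges E 1 , . . . , E t define precisely the 2-types among elements in V π . For each π i , π j , where i < j, the binary predicates on (u, v) ∈ V πi × V πj are defined as follows. Since c-bireg Lπj|←Lπi (X̄ πi , X̄ πj ) holds, by Lemma 2 there is a complete L πj | ←L πi -biregular graph G = (V πi , V πj , E 1 , . . . , E t , ←E 1 , . . . , ←E t ) with size X̄ πi | X̄ πj , whose edges define precisely the 2-types on pairs in V πi × V πj .

We first show that A |= ∀x∀y α(x, y). Suppose, to the contrary, that α(u, v) fails in A for some u ∈ V π and v ∈ V π' , and let E be the 2-type of (u, v). By definition, there is g such that u ∈ V π,g and g(E, π') ≠ 0. Thus, V π,g ≠ ∅. This also means that π is incompatible with g, which implies that X π,g = 0 by consistent 1 (X̄), and thus contradicts the assumption that V π,g ≠ ∅.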
Next, we show that A |= ⋀_{i=1}^{k} ∀x ∃ Si y (β i (x, y) ∧ x ≠ y). Note that G = {g 1 , . . . , g m } consists of only good functions. Thus, for every g ∈ G, for every β i , the sum Σ_{π} Σ_{β i (x,y)∈E} g(E, π) is an element in S i .

Proof ideas for Lemmas 2 and 3
We now discuss the proof of the main biregular graph lemmas. Due to space constraints, we deal only with the 1-color case, which gives the flavor of the arguments. The general case, which is much more involved, is deferred to the full version of this paper. This section is organized as follows. In Subsection 4.1 we will focus on a relaxation of Lemma 2 where the requirement being complete is dropped. This will then be used to prove the complete case in Subsection 4.2. Finally, in Subsection 4.3 we present a brief explanation on how to modify the proof for the biregular graphs to the one for regular digraphs.

The case of incomplete 1-color biregular graphs
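This subsection is devoted to the proof of the following lemma.

Lemma 7. For every A ∈ N 1×m ∞,+p and B ∈ N 1×n ∞,+p , there exists an (effectively computable) existential Presburger formula bireg A|B (x̄, ȳ) such that for every (M̄ , N̄ ) ∈ N m ∞ × N n ∞ , the following holds. There is an A|B-biregular graph with size M̄ | N̄ if and only if bireg A|B (M̄ , N̄ ) holds in N ∞ .

The desired formula c-bireg A|B (x̄, ȳ) for complete biregular graphs will be defined using the formula bireg A|B (x̄, ȳ).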
We will use the following notation. The term vector always refers to a row vector, and we usually use ā, b̄, . . . (possibly indexed) to denote vectors. We write (ā, b̄) to denote the vector ā concatenated with b̄. Obviously 1-row matrices can be viewed as row vectors. For ā = (a 1 , . . . , a k ) ∈ N k ∞ , we write ā +p to denote the vector (a +p 1 , . . . , a +p k ). Matrix entries of the form a +p are called periodic entries. Otherwise, they are called fixed entries. By grouping the entries according to whether they are fixed or periodic, we write a 1-row matrix M as (ā, b̄ +p ), where ā and b̄ +p correspond to the fixed and periodic entries in M . Matrices that contain only fixed (or periodic) entries are written as ā (or ā +p ).
As before, we will write x̄, ȳ (possibly indexed) to denote vectors of variables. We write 1̄ to denote the vector with all components being 1. We use · to denote the standard dot product between two vectors. To avoid being repetitive, when dot products are performed, it is implicit that the vector lengths are the same. In particular, x̄ · 1̄ is the sum of all the components in x̄.
We now outline the proof of Lemma 7, focusing only on the case where there is no ∞ degree in the matrices. The case where such a degree exists is similar but simpler. Without loss of generality, we can also assume that none of the fixed entries are zero. For vectors M̄ 0 , M̄ 1 , N̄ 0 , N̄ 1 with the same length as ā, b̄, c̄, d̄, respectively, we say that (M̄ 0 , M̄ 1 )|(N̄ 0 , N̄ 1 ) is big enough for (ā, b̄ +p )|(c̄, d̄ +p ) if the following conditions (a), (b) and (c) hold: Here δ max is max(p, ā, b̄, c̄, d̄), that is, the maximal element among p and the components in ā, b̄, c̄, d̄. When b̄ +p or d̄ +p are missing, the same notion can be defined by dropping condition (b) or (c), respectively. For example, M̄ | N̄ is big enough for ā | b̄ if condition (a) holds.

The proof idea is as follows. We first construct a formula that deals with big enough sizes. Then, we construct a formula for each of the cases when one of the conditions (a), (b) or (c) is violated. The interesting case will be when condition (b) is violated. This means that the number of vertices with degrees from b̄ +p is fixed, and they can be "encoded" inside the Presburger formula.
We start with the big enough case. When there are only fixed entries, we will use the following lemma.

Lemma 8. Let M̄ | N̄ be big enough for ā | b̄. There is an ā | b̄-biregular graph with size M̄ | N̄ if and only if M̄ · ā = N̄ · b̄.

Proof. Note that if we have a biregular graph with the desired degrees on the left, then the total number of edges must be M̄ · ā. Similarly, considering the requirement for vertices on the right, the total number of edges must be N̄ · b̄. Thus this condition is always a necessary one, regardless of whether M̄ | N̄ is big enough.
When both M̄ and N̄ do not contain ∞, [13, Lemma 7.2] shows that when M̄ | N̄ is big enough for ā | b̄, the converse holds: M̄ · ā = N̄ · b̄ implies that there is an ā | b̄-biregular graph with size M̄ | N̄ . We briefly mention the proof idea there, which we will also see later (e.g., in the proof of Lemma 9). There is a preliminary construction that handles the requirement on vertices on one side in isolation, leaving the vertices on the right with outdegree 1. A follow-up construction merges vertices on the right in order to ensure the necessary number of incoming edges on the right. In doing so we exploit the "big enough" property in order to avoid merging two nodes on the right with a common adjacent edge on the left.
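The edge-counting condition M̄ · ā = N̄ · b̄ is easy to experiment with on finite sizes. The sketch below checks the condition and then tries to build a simple bipartite graph with the required degrees using a plain greedy strategy (each left vertex is attached to the right vertices with the largest remaining demand). This greedy routine is a stand-in chosen for brevity, not the construction of [13], and the function names are hypothetical.

```python
def dot(xs, ys):
    """Standard dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(xs, ys))


def edge_count_condition(M, a, N, b):
    """Necessary condition: both sides must count the same number of edges."""
    return dot(M, a) == dot(N, b)


def greedy_biregular(M, a, N, b):
    """Try to build a simple bipartite graph with M[i] left vertices of degree a[i]
    and N[j] right vertices of degree b[j].

    Returns a set of edges (left_index, right_index), or None if the counting
    condition fails or this greedy attempt gets stuck.
    """
    if not edge_count_condition(M, a, N, b):
        return None
    left = [a[i] for i in range(len(M)) for _ in range(M[i])]
    residual = [b[j] for j in range(len(N)) for _ in range(N[j])]
    edges = set()
    # Serve left vertices in order of decreasing degree.
    for u, d in sorted(enumerate(left), key=lambda t: -t[1]):
        # Pick the d right vertices with the largest remaining demand.
        targets = sorted(range(len(residual)), key=lambda v: -residual[v])[:d]
        if len(targets) < d or any(residual[v] == 0 for v in targets):
            return None
        for v in targets:
            residual[v] -= 1
            edges.add((u, v))
    return edges
```

For sizes (2)|(3) and degrees (3)|(2) the routine returns the six edges of the complete bipartite graph on 2 and 3 vertices. For sizes (1)|(1) and degrees (2)|(2) the counting condition holds but no simple graph exists, which is the kind of small case that the "big enough" assumption rules out.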
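We will now prove that the condition is also sufficient when either M̄ or N̄ contains ∞. So assume M̄ · ā = N̄ · b̄. Since none of the fixed entries is zero, an ∞ component on one side makes both dot products equal to ∞, and thus both M̄ and N̄ contain ∞.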
We construct an ā | b̄-biregular graph G = (U, V, E) with size M̄ | N̄ as follows. Let ā = (a 1 , . . . , a m ) and b̄ = (b 1 , . . . , b n ). Let M̄ = (M 1 , . . . , M m ) and N̄ = (N 1 , . . . , N n ). We pick pairwise disjoint sets U 1 , . . . , U m and V 1 , . . . , V n with |U i | = M i and |V j | = N j , and let U and V be the unions of the sets U i and V j , respectively. The edges are constructed as follows. For each 1 ≤ i ≤ m, when |U i | is finite, we make each vertex u ∈ U i have degree a i , as follows. For each 1 ≤ j ≤ t, we pick a i "new" vertices from some infinite set V l , that is, vertices that are not adjacent to any edge, and connect them to u. Likewise for each vertex v ∈ V i when |V i | is finite. After performing this, every vertex in finite U i and V i has degree a i and b i , respectively, and every vertex in infinite sets U i and V i has degree at most 1.
Finally, we iterate the following process. For every infinite U i , if u ∈ U i has degree other than a i , we change the degree to a i by picking "new" vertices from some infinite set V l , and connect them to u by an appropriate number of edges. Likewise, we can make each vertex v in infinite V i have degree b i . Note that in any iteration, for every infinite set U i , the degree of a vertex u ∈ U i is either a i , 1, or 0. Likewise, in any iteration, for every infinite set V i , the degree of a vertex v ∈ V i is either b i , 1, or 0. Since there is an infinite supply of vertices, there are always new vertices that can be picked in any iteration.

Now we move to the case where the entries are still big enough, but some of the entries are periodic on one side. Then we consider the following formula Ψ (ā,b̄ +p )|c̄ (x̄ 0 , x̄ 1 , ȳ):

\[ \Psi_{(\bar{a},\bar{b}^{+p})|\bar{c}}(\bar{x}_0,\bar{x}_1,\bar{y}) \;:=\; \exists z\;\; \bar{a}\cdot\bar{x}_0 + \bar{b}\cdot\bar{x}_1 + p z = \bar{c}\cdot\bar{y} . \]

Note that if G = (U, V, E) is a (ā, b̄ +p )|c̄-biregular graph with size (M̄ 0 , M̄ 1 )|N̄ , then the number of edges |E| should equal the sum of the degrees of the vertices in U , which is ā · M̄ 0 + b̄ · M̄ 1 + zp, for some integer z ≥ 0. Since this quantity must equal the sum of the degrees of the vertices in V , which is c̄ · N̄ , we again conclude that this formula is a necessary condition, regardless of whether the entries are big enough. We again show the converse.

Lemma 9. Let (M̄ 0 , M̄ 1 )|N̄ be big enough for (ā, b̄ +p )|c̄. There is a (ā, b̄ +p )|c̄-biregular graph with size (M̄ 0 , M̄ 1 )|N̄ if and only if Ψ (ā,b̄ +p )|c̄ (M̄ 0 , M̄ 1 , N̄ ) holds.

Proof. Assume that Ψ (ā,b̄ +p )|c̄ (M̄ 0 , M̄ 1 , N̄ ) holds. As before, abusing notation, we denote the value assigned to variable z by z itself. Suppose ā · M̄ 0 + b̄ · M̄ 1 + pz = N̄ · c̄. Since (M̄ 0 , M̄ 1 )|N̄ is big enough for (ā, b̄ +p )|c̄, it follows immediately that (M̄ 0 , M̄ 1 , z)|N̄ is big enough for (ā, b̄, p)|c̄. Applying Lemma 8, there is a (ā, b̄, p)|c̄-biregular graph with size (M̄ 0 , M̄ 1 , z)|N̄ . That is, we have a graph that satisfies our requirements, but there is an additional partition class Z on the left of size z where the number of adjacent vertices is p, rather than being from b̄ +p as we require.
Let G = (U, V, E) be such a graph, and let U = U 0 ∪ U 1 ∪ Z, where U 0 , U 1 , and Z are the sets of vertices whose degrees are from ā, from b̄, and equal to p, respectively. Note that |U 0 | = M̄ 0 · 1̄, |U 1 | = M̄ 1 · 1̄ and |Z| = z.
We will construct a (ā, b̄ +p )|c̄-biregular graph with size (M̄ 0 , M̄ 1 )|N̄ . The idea is to merge the vertices in Z with vertices in U 1 . Let z 0 ∈ Z. The number of vertices in U 1 reachable from z 0 in distance 2 is at most δ² max . Since (M̄ 0 , M̄ 1 )|N̄ is big enough for (ā, b̄ +p )|c̄, we have |U 1 | = M̄ 1 · 1̄ ≥ δ² max + 1. Thus, there is a vertex u ∈ U 1 not reachable from z 0 in distance 2. We merge z 0 and u into one vertex. Since the degree of z 0 is p, such merging increases the degree of u by p, which does not break our requirement. We perform such merging for every vertex in Z.
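The two ingredients of this argument, the counting identity ā · M̄ 0 + b̄ · M̄ 1 + pz = N̄ · c̄ and the merging of the extra class Z into U 1 , can be sketched for finite graphs as follows. The adjacency-dictionary representation and the names `psi_holds` and `merge_extra_vertices` are assumptions made for this illustration.

```python
def dot(xs, ys):
    """Standard dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(xs, ys))


def psi_holds(a, b, c, p, M0, M1, N):
    """Is there an integer z >= 0 with a.M0 + b.M1 + p*z == c.N (finite sizes only)?"""
    gap = dot(c, N) - dot(a, M0) - dot(b, M1)
    if gap < 0:
        return False
    if p == 0:
        return gap == 0
    return gap % p == 0


def merge_extra_vertices(adj, U1, Z):
    """Merge every vertex of Z into a vertex of U1 that shares no neighbour with it.

    adj maps each left vertex to its set of right neighbours. Two left vertices
    are within distance 2 exactly when they share a neighbour, so merging
    vertices with disjoint neighbourhoods keeps the graph simple and adds the
    degree of the merged vertex to its host.
    """
    adj = {u: set(vs) for u, vs in adj.items()}
    for z in Z:
        host = next((u for u in U1 if not (adj[u] & adj[z])), None)
        if host is None:
            raise ValueError("no vertex of U1 at distance more than 2 from z")
        adj[host] |= adj.pop(z)
    return adj


# p = 2: u1, u2 have degree 1 (from b = 1), z has degree p = 2,
# and the four right vertices all have degree 1.
graph = {"u1": {"v1"}, "u2": {"v2"}, "z": {"v3", "v4"}}
merged = merge_extra_vertices(graph, ["u1", "u2"], ["z"])
```

After the merge, u1 has degree 3, which lies in 1 +2 , while the degrees on the right are unchanged.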
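Finally, we turn to the big enough case where there are periodic entries on both sides. There we will deal with the following formula Ψ (ā,b̄ +p )|(c̄,d̄ +p ) (x̄ 0 , x̄ 1 , ȳ 0 , ȳ 1 ):

\[ \Psi_{(\bar{a},\bar{b}^{+p})|(\bar{c},\bar{d}^{+p})}(\bar{x}_0,\bar{x}_1,\bar{y}_0,\bar{y}_1) \;:=\; \exists z_1 \exists z_2\;\; \bar{a}\cdot\bar{x}_0 + \bar{b}\cdot\bar{x}_1 + p z_1 = \bar{c}\cdot\bar{y}_0 + \bar{d}\cdot\bar{y}_1 + p z_2 . \]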

The proof for regular digraphs
Recall that in the prior argument we consider only digraphs without any self-loop. Thus, a digraph can be viewed as a bipartite graph by splitting every vertex u into two vertices, where one is adjacent to all the incoming edges, and the other to all the outgoing edges. Thus, A|B-regular digraphs with size M̄ can be characterized as A|B-biregular graphs with size M̄ | M̄ . For more details, see [13, Section 8].
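The vertex-splitting step can be sketched directly. In the code below, each vertex v of a finite digraph yields a left copy ("in", v) that carries its incoming edges and a right copy ("out", v) that carries its outgoing edges; placing the "in" copies on the left is an assumption made here to line up with A constraining indegrees in the definition of A|B-regular digraphs. The function name is hypothetical.

```python
def split_digraph(vertices, edges):
    """Turn a loop-free digraph into a bipartite graph by splitting vertices.

    A directed edge (u, v) becomes the undirected edge between the left copy
    ("in", v) and the right copy ("out", u).
    """
    left = [("in", v) for v in vertices]
    right = [("out", v) for v in vertices]
    bip_edges = {(("in", v), ("out", u)) for (u, v) in edges}
    return left, right, bip_edges


left, right, bip = split_digraph([0, 1, 2], {(0, 1), (0, 2), (1, 2)})
# Degree of a left copy equals the indegree of the original vertex,
# degree of a right copy equals its outdegree.
left_degree = {x: sum(1 for (l, _) in bip if l == x) for x in left}
right_degree = {y: sum(1 for (_, r) in bip if r == y) for y in right}
```

In this example vertex 2 has indegree 2 and vertex 0 has outdegree 2, and the split graph has the same number of vertices on both sides, matching the size M̄ | M̄ in the text.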

Extensions and applications
A type/behavior profile for a model M is the vector of cardinalities of the sets A π,g computed in M , where π ranges over 1-types and g over behavior functions (for a fixed φ). Recall that in the proof of Theorem 4 we actually showed, in Lemma 6, that we can obtain existential Presburger formulas which define exactly the vectors of integers that arise as the type/behavior profiles of models of φ. The domain of the model can be broken up as a disjoint union of sets A π,g , and thus its cardinality is a sum of numbers in this vector. We can thus add one additional integer variable x total in PRES φ , which will be free, with an additional equation stating that x total is the sum of all X π,g 's. This allows us to conclude definability of the spectrum.

Theorem 12. From an FO 2 Pres sentence φ, we can effectively construct a Presburger formula ψ(n) such that N |= ψ(n) exactly when n is the size of a finite structure that satisfies φ, and similarly a formula ψ ∞ (n) such that N ∞ |= ψ ∞ (n) exactly when n is the size of a finite or countably infinite model of φ.
We say that φ has NP data complexity of (finite) satisfiability if there is a non-deterministic algorithm that takes as input a set of ground atoms A and determines whether φ ∧ A is satisfiable, running in time polynomial in the size of A. Pratt-Hartmann [20] showed that C 2 formulas have NP data complexity of both satisfiability and finite satisfiability. Following the general approach to data complexity from [20], while plugging in our Presburger characterization of FO 2 Pres , we can show that the same data complexity bound holds for FO 2 Pres .
Theorem 13. FO 2 Pres formulas have NP data complexity of satisfiability and finite satisfiability.
Proof. We give only the proof for finite satisfiability. We will follow closely the approach used for C 2 in Section 4 of [20], and the terminology we use below comes from that work.
Given a set of facts D, our algorithm guesses a set of facts (including equalities) on elements of D, giving us a finite set of facts D + extending D, but with the same domain as D. We check that our guess is consistent with the universal part α and that equality satisfies the usual transitivity and congruence rules. Now consider 1-types and 2-types with an additional predicate Observable. Based on this extended language, we consider good functions as before, and define the formulas consistent 1 and consistent 2 based on them. 1-types that contain the predicate Observable will be referred to as observable 1-types. The restriction of a behavior function to observable 1-types will be called an observable behavior. Given a structure M , an observable 1-type π, and an observable behavior function g 0 , we let M π,g0 be the elements of M having 1-type π and observable behavior g 0 , and we analogously let D π,g0 be the elements of D whose 1-type and behavior in D + match π and g 0 .

We declare that all elements in A are in the predicate Observable. Add to the formulas consistent 1 and consistent 2 additional conjuncts stating that for each observable 1-type π and for each observable behavior function g 0 , the total sum of the number of elements with 1-type π and a behavior function g extending g 0 (i.e., the cardinality of M π,g0 ) is the same as |D π,g0 |, with the cardinality being counted modulo equalities of D + .
At this point our algorithm returns true exactly when the sentence obtained by existentially quantifying this extended set of conjuncts is satisfiable in the integers. The solving procedure is certainly in NP. In fact, since the number of variables is fixed, with only the constants varying, it is in PTIME [17].
We argue for correctness, focusing on the proof that when the algorithm returns true we have the desired model. Assuming the constraints above are satisfied, we get a graph, and from the graph we get a model M . M will clearly satisfy φ, but its domain does not contain the domain of D. Letting O be the elements of M satisfying Observable, we know, from the additional constraints imposed, that the cardinality of O matches the cardinality of the domain of D modulo the equalities in D + , and for each observable 1-type π and observable behavior g 0 , |M π,g0 | = |D π,g0 |.
Fix an isomorphism λ taking each M π,g0 to (equality classes of) D π,g0 . Create M' by redefining M on O by connecting pairs (o 1 , o 2 ) via E exactly when λ(o 1 ), λ(o 2 ) are connected via E in D + . We can thus identify O with D + modulo equalities in M'.
Clearly M' now satisfies D. To see that M' satisfies φ, we simply note that since all of the observable behaviors are unchanged in moving from an element e in M to the corresponding element λ(e) in M', and every such modified e has an observable type, it follows that the behavior of every element is unchanged in moving from M to M'. Since the 1-types are also unchanged, M' satisfies φ.
Note that the data complexity result here is best possible, since even for FO 2 the data complexity can be NP-hard [20].