
Positive and Monotone Fragments of FO and LTL

Simon Iosti (LIP, ENS Lyon, France); Denis Kuperberg (CNRS, LIP, ENS Lyon, France); Quentin Moreau (ENS Lyon, France)
Abstract

We study the positive logic FO+ on finite words, and its fragments, pursuing and refining the work initiated in [11]. First, we transpose well-known logical equivalences into positive first-order logic: FO+ is equivalent to LTL+, and its two-variable fragment FO2+ with (resp. without) successor available is equivalent to UTL+ with (resp. without) the “next” operator X available. This shows that despite previous negative results, the class of FO+-definable languages exhibits some form of robustness. We then exhibit an example of an FO-definable monotone language on one predicate that is not FO+-definable, refining the example from [11], which uses three predicates. Moreover, we show that such a counter-example cannot be FO2-definable. Finally, we provide a new example distinguishing the positive and monotone versions of FO2 without quantifier alternation. This does not rely on a variant of the previously known counter-example, and witnesses a new phenomenon.

Keywords and phrases:
Positive logic, LTL, separation, first-order, monotone
Category:
Track B: Automata, Logic, Semantics, and Theory of Programming
Funding:
Denis Kuperberg: ANR ReCiProg.
Copyright and License:
© Simon Iosti, Denis Kuperberg, and Quentin Moreau; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Modal and temporal logics; Theory of computation → Regular languages; Theory of computation → Logic and verification
Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele Puppis

1 Introduction

In various contexts, monotonicity properties play a pivotal role. For instance, the field of monotone complexity investigates negation-free formalisms, and has turned out to be an important tool for complexity in general [7]. From a logical point of view, a sentence is called monotone (with respect to a predicate P) if increasing the set of values where P is true in a structure cannot make the evaluation of the formula switch from true to false. This is crucial e.g. when defining logics with fixed points, where the fixed-point binders μX can only be applied to formulas that are monotone in X. Logics with fixed points are used in various contexts, e.g. to characterise the class PTime on ordered structures [8, 19], as extensions of linear logic such as μMALL [2], or in the μ-calculus formalism used in automata theory and model-checking [3]. Because of the monotonicity constraint, it is necessary to recognise monotone formulas, and to understand whether a syntactic restriction to positive (i.e. negation-free) formulas is semantically complete. Logics on words have also been generalised to inherently negation-free frameworks, such as the framework of cost functions [4].

This motivates the study of whether the semantic monotonicity constraint can be captured by a syntactic one, namely the removal of negations, yielding the class of positive formulas. For instance, the formula ∃x, a(x) states that an element labelled a is present in the structure. It is both monotone and positive. However, its negation ∀x, ¬a(x) is neither positive nor monotone, since it states the absence of a, and increasing the domain where the predicate a is true in a given structure could make the formula become false.

Lyndon’s preservation theorem [13] states that on arbitrary structures, every monotone formula of First-Order Logic (FO) is equivalent to a positive one (FO+ syntactic fragment). The case of finite structures was open for two decades until Ajtai and Gurevich [1] showed that Lyndon’s theorem does not hold in the finite, later refined by Stolboushkin [17] with a simpler proof. Recently, this preservation property of FO was more specifically shown to fail already on finite graphs and on finite words by Kuperberg [11], implying the failure on finite structures with a more elementary proof than [1, 17]. However, the relationship between monotone and positive formulas is still far from being understood. On finite words in particular, the positive fragment FO+ was shown [11] to have undecidable membership (with input an FO formula, or a regular language), which could be interpreted as a sign that this class is not well-behaved. This line of research can be placed in the larger framework of the study of preservation theorems in first-order logic, and their behaviour in the case of finite models, see [16] for a survey on preservation theorems.

In this work we will concentrate on finite words, and investigate this “semantic versus syntactic” relationship for fragments of FO and Linear Temporal Logic (LTL). We will in particular lift the classical equivalence between FO and LTL [9] to their positive fragments, showing that some of the robustness aspects of FO are preserved in the positive fragment, despite the negative results from [11]. This equivalence between FO and LTL is particularly useful when considering implementations and real-world applications, as LTL satisfiability is PSpace-complete while FO satisfiability is non-elementary. It is natural to consider contexts where specifications in LTL can talk about e.g. the activation of a sensor, but not its non-activation, which would correspond to a positive fragment of LTL. We could also want to syntactically force such an event to be “good” in the sense that if a specification is satisfied when a signal is off at some time, it should still be satisfied when the signal is on instead. It is therefore natural to ask whether a syntactic constraint on the positivity of LTL formulas could capture the semantic monotonicity, in the full setting or in some fragments corresponding to particular kinds of specifications.

We will also take a closer look at the two-variable fragment FO2 of FO and its LTL counterpart. It was shown in [11] that there exists a monotone FO-definable language that is not definable in positive FO. We give stronger variants of this counter-example language, and show that such a counter-example cannot be defined in FO2[<]. This is obtained via a stronger result characterising FO2-monotone in terms of positive fragments of bounded quantifier alternation. We also give precise complexity results for deciding whether a regular language is monotone, refining results from [11]. Finally, we provide a counterexample showing that the positive and alternation-free fragment of FO2 does not capture all alternation-free monotone languages of FO2.

The goal of this work is to understand at what point the phenomenon discovered in [11] comes into play: what are the necessary ingredients for such a counter-example (FO-monotone but not FO positive) to exist? And on the contrary, which fragments of FO are better behaved, and can capture the monotonicity property with a semantic constraint, and allow for a decidable membership problem in the positive fragment?

Outline and Contributions

We begin by introducing two logical formalisms in Section 2: First-Order Logic (2.1) and Temporal Logic (2.2).

Then, we lift some classical logical equivalences to positive logic in Section 3. First we show that FO+, FO3+ and LTL+ are equivalent in Theorem 20. We prove that the fragment FO2+ with (resp. without) successor predicate is equivalent to UTL+ with (resp. without) X and Y operators available in Theorem 26 (resp. Corollary 28).

In Section 4, we give a characterisation of monotonicity using monoids (Theorem 29) and deduce from it an algorithm which decides the monotonicity of a regular language given by a monoid (Section 4.2), completing the automata-based algorithms given in [11]. This leads to Proposition 31, which states that deciding the monotonicity of a regular language is in LogSpace when the input is a monoid, while it is NL-complete when the input is a DFA. This completes the previous result from [11] showing PSpace-completeness for NFA input.

Finally, we study the relationship between semantic and syntactic positivity in Section 5. We give some refinements of the counter-example from [11] (a regular and monotone language FO-definable but not definable in FO+). Indeed, we show that the counter-example can be adapted to FO2 with the binary predicate “between” in Proposition 33, and we show that we need only one predicate to find a counter-example in FO in Proposition 34.

We also consider a characterisation of FO2[<] from Thérien and Wilke [18] stating that FO2[<] is equivalent to Σ2 ∩ Π2, where Σ2 and Π2 are fragments of FO with bounded quantifier alternation. We show that FO2-monotone is characterised by Σ2+ ∩ Π2+.

We then show that monotone FO2[<] is included in FO+. In other words no counter-example for FO can be found in FO2 (without successor available) in Corollary 36, and leave open the problem of expressive equivalence between FO2+ and FO2-monotone, as well as decidability of membership in FO2+ for regular languages (see Open 37).

To conclude, we provide in Section 5.3 a negative answer to this problem in the context of the alternation-free fragment of FO2, providing an example of a monotone language definable in FO2 without quantifier alternation, but not definable in the positive fragment of FO2 without alternation. This exhibits a new discrepancy between a positive fragment and its monotone counterpart, not relying on previous constructions.

Due to space constraints, some proofs are omitted or only sketched, and can be found in the long version.

2 𝐅𝐎 and 𝐋𝐓𝐋

We work with a set of atomic unary predicates Σ = {a1, a2, …, a|Σ|}, and consider the set of words on the alphabet A = 𝒫(Σ). To describe a language on this alphabet, we use logical formulas. Here we present the different logics and how they can be used to define languages.

2.1 First-order logics

Let us consider a set of binary predicates, =, ≠, ≤, <, succ and nsucc, which will be used to compare positions in words. We define the subsets of predicates 𝔅0 := {≤, <, succ, nsucc}, 𝔅< := {≤, <} and 𝔅succ := {=, ≠, succ, nsucc}, and a generic binary predicate is denoted 𝔟. As we are going to see, equality can be expressed with the other binary predicates of 𝔅0 and 𝔅< when we have at least two variables. This is why we do not need to impose that = belongs to 𝔅0 or 𝔅<. The same holds for ≠. In general, we will always assume that the predicates = and ≠ are expressible.

Let us start by defining first-order logic FO:

Definition 1.

Let 𝔅 be a set of binary predicates. The grammar of FO[𝔅] is as follows:

φ,ψ ::= ⊤ ∣ ⊥ ∣ 𝔟(x,y) ∣ a(x) ∣ φ∧ψ ∣ φ∨ψ ∣ ∃x, φ ∣ ∀x, φ ∣ ¬φ

where 𝔟 belongs to 𝔅.

Closed FO formulas (those with no free variable) can be used to define languages. Generally speaking, a pair consisting of a word u and a function ν from the free (non-quantified) variables of a formula φ to the positions of u satisfies φ if u satisfies the closed formula obtained from φ by replacing each free variable with its image by ν.

Definition 2.

Let φ be a formula with n free variables x1, …, xn, and u a word. Let ν be a function from {x1,…,xn} to [[0,|u|−1]]. We say that (u,ν) satisfies φ, written u,ν ⊨ φ, which we define by induction on φ as follows:

  • u,ν ⊨ ⊤ always holds, and we never have u,ν ⊨ ⊥,

  • u,ν ⊨ x<y if ν(x)<ν(y),

  • u,ν ⊨ x≤y if ν(x)≤ν(y),

  • u,ν ⊨ succ(x,y) if ν(y)=ν(x)+1,

  • u,ν ⊨ nsucc(x,y) if ν(y)≠ν(x)+1,

  • u,ν ⊨ a(x) if a ∈ u[ν(x)] (note that we only ask inclusion here),

  • u,ν ⊨ φ∧ψ if u,ν ⊨ φ and u,ν ⊨ ψ,

  • u,ν ⊨ φ∨ψ if u,ν ⊨ φ or u,ν ⊨ ψ,

  • u,ν ⊨ ∃x, φ(x,x1,…,xn) if there is a position i of u such that u, ν[x↦i] ⊨ φ,

  • u,ν ⊨ ∀x, φ(x,x1,…,xn) if for every position i of u, u, ν[x↦i] ⊨ φ,

  • u,ν ⊨ ¬φ if we do not have u,ν ⊨ φ.

For a closed formula, we simply write u ⊨ φ.

Here is an example:

Example 3.

The formula φ = ∃x, ∀y, (x=y ∨ ¬a(y)) describes the set of non-empty words that admit at most one a. For example, {a}{a,b} does not satisfy φ because two of its letters contain an a, but {a,b,c}{b} does satisfy φ.
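To make the semantics of Definition 2 concrete, here is a minimal sketch (our own encoding, not from the paper): words over the alphabet 𝒫(Σ) are represented as lists of sets of predicate names, and the formula of Example 3 is evaluated by direct quantification over positions. The function name `satisfies_phi` is ours.

```python
# Words over the alphabet P(Sigma) are lists of sets of predicate names.
# Example 3: phi = "exists x, forall y, (x = y or not a(y))",
# i.e. some position x such that every other position carries no 'a'.

def satisfies_phi(word):
    """Evaluate phi on a word by direct quantification over positions."""
    positions = range(len(word))
    return any(
        all(x == y or 'a' not in word[y] for y in positions)
        for x in positions
    )

# {a}{a,b} has an 'a' in two letters, so phi fails;
# {a,b,c}{b} has a single letter containing 'a', so phi holds.
print(satisfies_phi([{'a'}, {'a', 'b'}]))       # False
print(satisfies_phi([{'a', 'b', 'c'}, {'b'}]))  # True
```

Note that the empty word is rejected, matching the "non-empty" requirement: the existential quantifier has no witness.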

 Remark 4.

The predicates succ and nsucc can be expressed in FO+[𝔅<] with three variables. If there is no restriction on the number of variables, in particular if we can use three variables, all binary predicates of 𝔅0 can be expressed from those of 𝔅<. Thus, we will consider the whole set of binary predicates to be available when the number of variables is not constrained, and we will write FO for FO[𝔅0] or FO[𝔅<], which are equivalent, and similarly for FO+.

Let us now turn our attention to FO+, the set of first-order formulas without negation. We recall definitions from [11].

Definition 5.

The grammar of FO+ is that of FO without the last constructor, ¬.

Let us also define monotonicity properties, starting with an order on words.

Definition 6.

A word u is smaller than a word v if u and v have the same length, and for every index i (common to u and v), the i-th letter of u is included in the i-th letter of v. When a word u is smaller than a word v, we write u ≤A v.

Definition 7.

Let L be a language. We say that L is monotone when for all words u, v such that u ≤A v and u ∈ L, we have v ∈ L.

Definition 8.

If L is a language, we define its monotone closure L↑ := {v ∈ A* ∣ ∃u ∈ L, u ≤A v}. This is the smallest monotone language containing L.
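Definitions 6 to 8 translate directly into code. The sketch below (names `leq` and `in_closure` are ours, with words as lists of sets as above) checks the order ≤A letterwise and tests membership in the monotone closure of a finite sample language.

```python
def leq(u, v):
    """u <=_A v: same length and letterwise inclusion (Definition 6)."""
    return len(u) == len(v) and all(s <= t for s, t in zip(u, v))

def in_closure(v, language):
    """Is v in the monotone closure of a finite language (Definition 8)?"""
    return any(leq(u, v) for u in language)

L = [[{'a'}, {'b'}]]                       # a finite sample language
print(in_closure([{'a', 'c'}, {'b'}], L))  # True: letterwise superset
print(in_closure([{'b'}, {'a'}], L))       # False
```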

Proposition 9 ([11]).

FO+ formulas are monotone in the unary predicates, i.e. if a model (u,ν) satisfies a formula φ of FO+, and u ≤A v, then (v,ν) satisfies φ.

We will also be interested in other logical formalisms, obtained either by restricting FO, or several variants of temporal logics.

First of all, let us review classical results obtained when considering restrictions on the number of variables. While an FO formula on words is always logically equivalent to a three-variable formula [9], two-variable formulas describe a class of languages strictly included in that described by first-order logic. In addition, the logic FO is equivalent to Linear Temporal Logic (see below).

Please note: these equivalences only hold in the framework of word models. In other circumstances, for example when formulas describe graphs, there are formulas with more than three variables that do not admit equivalents with three variables or fewer.

Definition 10.

The set FO3 is the subset of FO formulas using only three different variables, which may be reused. We also define FO3+ for formulas with three variables and without negation. Similarly, we define FO2 and FO2+ with two variables.

Example 11.

The formula ∃y, succ(x,y) ∧ (∃x, (y≤x) ∧ b(x) ∧ (∀z, (x≤z) ∨ (z<y) ∨ a(z))) (a formula with one free variable x that indicates that the letter labelled by x is followed by a factor of the form aa⋯ab) is an FO3 formula, and even an FO3+ formula: there is no negation, and it uses only three variables, x, y and z, with a reuse of x. On the other hand, it does not belong to FO2.

2.2 Temporal logics

Some logics involve an implicit temporal dimension, where positions are identified with time instants. For example, Linear Temporal Logic (LTL) uses operators describing the future, i.e. the indices after the current position in a word. This type of logic can sometimes be more intuitive to manipulate, and present better complexity properties, as explained in the introduction. As mentioned above, FO2 is not equivalent to FO. On the other hand, it is equivalent to UTL, a restriction of LTL to its unary temporal operators.

To begin with, let us introduce LTL, which is equivalent to FO.

Definition 12.

The grammar of LTL is as follows:

φ,ψ ::= ⊤ ∣ ⊥ ∣ a ∣ φ∧ψ ∣ φ∨ψ ∣ Xφ ∣ φUψ ∣ φRψ ∣ ¬φ.

Removing the last constructor gives the grammar of LTL+.

This logic does not use variables. To check that a word satisfies an LTL formula, we evaluate the formula at the initial instant, that is to say, at the word’s first position. The X constructor then describes constraints about the next instant, i.e. the following position in the word. So the word a·u, where a is a letter, satisfies Xφ if and only if the suffix u satisfies φ. The construction φUψ (φ Until ψ) indicates that the formula ψ must be verified at some point in time and that φ must be verified until then. We define φRψ (φ Releases ψ) as being equal to ¬(¬φU¬ψ). This is similar to ψU(φ∧ψ), but allows for the option of φ never being satisfied and ψ being true until the end. The formal definition of the semantics of LTL will be given in a more general setting with additional operators in Definition 16.

 Remark 13.

Let us write φ XU ψ for the formula X(φUψ), for any pair (φ,ψ) of LTL formulas. The advantage of XU is that X and U can both be redefined from XU. This strict version of U is regularly found in the literature.

LTL is included in Temporal Logic, TL. While the former speaks of the future, i.e. of the following indices in the word, thanks to X, U and R, the latter also speaks of the past. Indeed, we introduce Y (Yesterday), S (Since) and Q, the respective past analogues of X, U and R.

Definition 14.

The grammar of TL is as follows:

φ,ψ ::= LTL ∣ Yφ ∣ φSψ ∣ φQψ.

Similarly, the grammar of TL+ is that of LTL+ extended with Y, S and Q.

 Remark 15.

As for XU, we will write φ YS ψ for Y(φSψ), and similarly for XR and YQ. We also write Pφ, Fφ, Hφ and Gφ for ⊤ YS φ, ⊤ XU φ, ⊥ YQ φ and ⊥ XR φ respectively. The formulas Fφ and Gφ mean respectively that the formula φ will be satisfied at least once in the future (F as Future), and that φ will always be satisfied in the future (G as Global). Similarly, the operators P (as Past) and H are the respective past analogues of F and G. Note that the current position is excluded here, while it is included in the general convention (usually Pφ = φ ∨ ⊤ YS φ).

When evaluating an LTL or TL formula on a word u = u0⋯um−1, we start by default at the first position u0. However, we need to define more generally the evaluation of a TL formula on a word from any given position:

Definition 16.

Let φ be a TL formula, u = u0⋯um−1 a word, and i ∈ [[0, m−1]]. We define u,i ⊨ φ by induction on φ:

  • u,i ⊨ ⊤ always holds, and we never have u,i ⊨ ⊥,

  • u,i ⊨ a if a ∈ ui,

  • u,i ⊨ φ∧ψ if u,i ⊨ φ and u,i ⊨ ψ,

  • u,i ⊨ φ∨ψ if u,i ⊨ φ or u,i ⊨ ψ,

  • u,i ⊨ Xφ if i+1 ≤ m−1 and u,i+1 ⊨ φ,

  • u,i ⊨ φUψ if there is j ∈ [[i, m−1]] such that u,j ⊨ ψ and for all k ∈ [[i, j−1]], u,k ⊨ φ,

  • u,i ⊨ ψRφ if u,i ⊨ ¬(¬ψU¬φ),

  • u,i ⊨ ¬φ if we do not have u,i ⊨ φ,

  • u,i ⊨ Yφ if i ≥ 1 and u,i−1 ⊨ φ,

  • u,i ⊨ φSψ if there is j ∈ [[0, i]] such that u,j ⊨ ψ and for all k ∈ [[j+1, i]], u,k ⊨ φ,

  • u,i ⊨ ψQφ if u,i ⊨ ¬(¬ψS¬φ).

We will write u ⊨ φ as a shorthand for u,0 ⊨ φ.
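Definition 16 can be turned into a recursive evaluator almost clause by clause. The sketch below (our own tuple encoding of formulas, not from the paper) implements the semantics above, deriving R and Q from U and S through negation exactly as in the definition.

```python
def sat(u, i, phi):
    """u, i |= phi, following Definition 16. u is a list of sets of
    predicate names, i a position, phi a tuple-encoded TL formula."""
    op, m = phi[0], len(u)
    if op == 'true':  return True
    if op == 'false': return False
    if op == 'atom':  return phi[1] in u[i]
    if op == 'and':   return sat(u, i, phi[1]) and sat(u, i, phi[2])
    if op == 'or':    return sat(u, i, phi[1]) or sat(u, i, phi[2])
    if op == 'not':   return not sat(u, i, phi[1])
    if op == 'X':     return i + 1 <= m - 1 and sat(u, i + 1, phi[1])
    if op == 'Y':     return i - 1 >= 0 and sat(u, i - 1, phi[1])
    if op == 'U':     # exists j >= i with psi at j and phi at i..j-1
        return any(sat(u, j, phi[2]) and
                   all(sat(u, k, phi[1]) for k in range(i, j))
                   for j in range(i, m))
    if op == 'S':     # exists j <= i with psi at j and phi at j+1..i
        return any(sat(u, j, phi[2]) and
                   all(sat(u, k, phi[1]) for k in range(j + 1, i + 1))
                   for j in range(0, i + 1))
    if op == 'R':     # psi R phi := not (not psi U not phi)
        return not sat(u, i, ('U', ('not', phi[1]), ('not', phi[2])))
    if op == 'Q':     # past analogue of R
        return not sat(u, i, ('S', ('not', phi[1]), ('not', phi[2])))
    raise ValueError(op)

word = [{'a'}, {'a'}, {'b'}]
print(sat(word, 0, ('U', ('atom', 'a'), ('atom', 'b'))))  # True
print(sat(word, 0, ('X', ('atom', 'b'))))                 # False
```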

Finally, let us introduce UTL and UTL+, the Unary Temporal Logic and its positive version. The UTL logic does not use the U or R operator, but only X, F and G to talk about the future. Similarly, we cannot use S or Q to talk about the past.

Definition 17.

The grammar of UTL is as follows:

φ,ψ ::= ⊤ ∣ ⊥ ∣ a ∣ φ∧ψ ∣ φ∨ψ ∣ Xφ ∣ Yφ ∣ Pφ ∣ Fφ ∣ Hφ ∣ Gφ ∣ ¬φ.

We define UTL[P,F,H,G] from this grammar by deleting the constructors X and Y.

The grammar of UTL+ is obtained by deleting the last constructor, and similarly, we define UTL+[P,F,H,G] by deleting the negation in UTL[P,F,H,G].

 Remark 18.

In the above definition, H and G can be redefined with P and F thanks to negation, but are necessary in the case of UTL+.

3 Logical equivalences

When two formulas φ and ψ are logically equivalent, i.e. admit exactly the same models, we write φ ≡ ψ. Note that a closed FO formula can be equivalent to an LTL formula, since their models are simply words. Similarly, we can have φ ≡ ψ when φ is an FO formula with one free variable (having models of the form (u,i)) and ψ is an LTL or TL formula, this time not using the default first position for the TL semantics. We will use both these ways of stating equivalence in the following.

We want to lift to positive fragments some classical theorems of equivalence between logics, such as these classical results:

Theorem 19 ([9, 5]).

  • FO and LTL define the same class of languages.

  • FO2 and UTL define the same class of languages.

3.1 Equivalences to 𝐅𝐎+

We aim at proving the following theorem, lifting classical results from FO to FO+:

Theorem 20.

The logics FO+, LTL+ and FO3+ describe the same languages.

Lemma 21.

The set of languages described by LTL+ is included in the set of languages recognised by FO3+.

The proof is direct, see the long version for details. From LTL+ to FO3+, we can interpret in FO3+ all the constructors of LTL+.

Let us introduce definitions that will be used in the proof of the next lemma.

Definition 22.

Let qr(φ) be the quantification rank of a formula φ of FO+ defined inductively by:

  • if φ contains no quantifier then qr(φ)=0,

  • if φ is of the form x,ψ or x,ψ then qr(φ)=qr(ψ)+1,

  • if φ is of the form ψχ or ψχ then qr(φ)=max(qr(ψ),qr(χ)).
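As a sanity check, Definition 22 is a one-line recursion over formulas. The sketch below uses our own tuple encoding of FO+ formulas (not from the paper):

```python
def qr(phi):
    """Quantification rank of a tuple-encoded FO+ formula (Definition 22)."""
    op = phi[0]
    if op in ('exists', 'forall'):   # ('exists', var, body): rank of body + 1
        return qr(phi[2]) + 1
    if op in ('and', 'or'):          # max over the two clauses
        return max(qr(phi[1]), qr(phi[2]))
    return 0                         # atoms contain no quantifier

# exists x, (a(x) and exists y, b(y))  has rank 2
print(qr(('exists', 'x', ('and', ('atom', 'a', 'x'),
                          ('exists', 'y', ('atom', 'b', 'y'))))))  # 2
```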

Definition 23.

A separated formula is a positive Boolean combination of purely past formulas (not using future operators X,U,R), purely present formulas (not using any temporal operators) and purely future formulas (not using past operators Y,S,Q).

We will adapt previous work to show the following auxiliary result:

Lemma 24.

Let φ be a TL+ formula with possible nesting of past and future operators. There is a separated formula of TL+ that is equivalent to φ.

Now we are ready to show the main result of this section:

Lemma 25.

The set of languages described by FO+ is included in the set of languages recognised by LTL+.

Proof.

We follow [12, Prop. 1, Appendix B.3], which shows a translation from FO to TL by induction on the quantification rank. We have adapted this to suit our needs.

Let φ(x) be an FO+ formula with a single free variable. Let us show by induction on qr(φ) that φ is equivalent to a formula of TL+. Notice that we want to show this regardless of the set of unary predicates used, so we might use the induction hypothesis on formulas with an enriched signature, containing additional unary predicates.

Initialisation.

If qr(φ) is zero, then φ(x) translates directly into a TL+ formula. Indeed, disjunctions and conjunctions translate immediately into TL+. Furthermore, unary predicates of the form a(x) translate into a, and binary predicates trivialize into ⊤ and ⊥ (e.g. x<x translates into ⊥ and x=x into ⊤). For example, (x≤x ∧ a(x)) ∨ (b(x)∧c(x)) ∨ x<x translates into (⊤∧a) ∨ (b∧c) ∨ ⊥.

Heredity.

Suppose that any FO+ formula with a single free variable and quantification rank strictly less than qr(φ) translates into a TL+ formula, and that qr(φ) is strictly positive.

If φ is a disjunction or conjunction, we need to transform its various clauses. So, without loss of generality, let us assume that φ(x) is of the form ∃y, ψ(x,y) or ∀y, ψ(x,y).

Let us denote by a1,…,an, where n is a natural number, the letters (which are considered as unary predicates) applied in ψ(x,y) to the free variable x, i.e. ignoring those under the scope of a quantification reusing the variable name x.

For any subset S of [[1,n]], we write ψS(x,y) for the formula ψ(x,y) in which each occurrence of ai(x) is replaced by ⊤ if i belongs to S and by ⊥ otherwise, for every integer i of [[1,n]].

We then have the logical equivalence:

ψ(x,y) ≡ ⋁S⊆[[1,n]] ( ⋀i∈S ai(x) ∧ ⋀i∉S ¬ai(x) ∧ ψS(x,y) ).

We are going to show that the negations in the above formula are optional. Let us note:

ψ+(x,y) := ⋁S⊆[[1,n]] ( ⋀i∈S ai(x) ∧ ψS(x,y) ).

Let us then show the equivalence of the formulas ψ(x,y) and ψ+(x,y) using the monotonicity of ψ as an FO+ formula. First, it is clear that any model satisfying ψ(x,y) satisfies ψ+(x,y).

Conversely, suppose ψ+(x,y) is satisfied. We then have a subset S of [[1,n]] such that (⋀i∈S ai(x)) ∧ ψS(x,y) is satisfied. In particular, according to the values taken by the unary predicates at x, there exists a subset S′ of [[1,n]] containing S such that (⋀i∈S′ ai(x)) ∧ (⋀i∉S′ ¬ai(x)) ∧ ψS(x,y) is satisfied. Now, ψ is monotone in the different predicates a1,…,an, and S ⊆ S′, so ψS(x,y) implies ψS′(x,y). Hence (⋀i∈S′ ai(x)) ∧ (⋀i∉S′ ¬ai(x)) ∧ ψS′(x,y) is also satisfied, and ψ(x,y) is therefore satisfied.

The rest of the proof is similar to the proof from [12]: the quantifiers on y commute with the disjunction on S and the conjunction on i of the formula ψ+. We can therefore fix a subset S of [[1,n]] and simply consider ∃y, ψS(x,y) or ∀y, ψS(x,y). We then replace ψS(x,y) with a formula that depends only on y, by replacing each binary predicate of the form 𝔟(x,z) with a unary predicate P𝔟(z). For example, we can replace x<z, z<x or x=z by a unary predicate P>(z), P<(z) or P=(z). We then obtain a formula ψ′S(y) on a signature enriched with these new unary predicates P>, P< and P=. We can apply the induction hypothesis to ψ′S(y), since it has only one free variable. This yields a formula χ of TL+, equivalent to ψ′S(y). Notice that the words on which χ is evaluated also have to specify the values of the unary predicates P>, P< and P=. If Σ is the original set of unary predicates, let Σ′ = Σ ∪ {P>, P<, P=}. For any u ∈ 𝒫(Σ)* and valuation ν giving the value of the free variable x in u, there is a unique fν(u) ∈ 𝒫(Σ′)* such that u is the projection of fν(u) on Σ, and the values of the new predicates in fν(u) reflect their intended semantics with respect to x, e.g. P=(z) is true if and only if x=z. We then have, for all such u, ν:

u,ν ⊨ ∃y, ψS(x,y) ⟺ fν(u), ν(x) ⊨ Pχ ∨ χ ∨ Fχ,  and  u,ν ⊨ ∀y, ψS(x,y) ⟺ fν(u), ν(x) ⊨ Hχ ∧ χ ∧ Gχ.

Let χ′ be the formula obtained: either Pχ ∨ χ ∨ Fχ, or Hχ ∧ χ ∧ Gχ. This formula χ′ involves unary predicates of the form P𝔟. We then use Lemma 24 to transform χ′ into a positive Boolean combination of purely past, present and future positive formulas, where the predicates P𝔟 trivialize into ⊤ or ⊥, on words of the form fν(u). For example, P< trivializes into ⊤ in purely past formulas, and into ⊥ in purely present or future formulas.

This completes the induction: from a formula in FO+ with one free variable, we can construct an equivalent formula in TL+. Ultimately, we can return to a pure future formula: since we want to evaluate at x=0, the purely past formulas, isolated by the separation lemma (Lemma 24), trivialize into ⊤ or ⊥.

Now, to translate a closed formula φ from FO+ to LTL+, we can add a free variable by setting φ′(x) = φ ∧ (x=0). Then, by the above, φ′ translates into a formula χ of LTL+, logically equivalent to φ. Putting together Lemma 21 and Lemma 25 achieves the proof of Theorem 20.

3.2 Equivalences in fragments of 𝐅𝐎+

Theorem 26.

The languages described by FO2+[𝔅0] formulas with one free variable are exactly those described by UTL+ formulas.

Proof.

First, let us show the UTL+ to FO2+ direction. In the proof of Lemma 21, as is classical, a third variable is introduced only when translating U. By the same reasoning as for X, it is clear that translating Y introduces only two variables. It remains to complete the induction of Lemma 21 with the cases of P, F, H and G, but again we can restrict ourselves to the future operators by symmetry:

  • [Fφ](x) = ∃y, x<y ∧ [φ](y);

  • [Gφ](x) = ∀y, y≤x ∨ [φ](y).
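This direction can be sketched mechanically: two variables suffice because each unary operator reuses the variable freed by its argument. The toy translator below is our own illustration (formulas emitted as plain strings, covering only the future operators listed above).

```python
def to_fo2(phi, var='x'):
    """Translate a future UTL+ formula (tuple-encoded) into an FO2+
    formula with free variable var, alternating between x and y."""
    other = 'y' if var == 'x' else 'x'
    op = phi[0]
    if op == 'atom':
        return f"{phi[1]}({var})"
    if op == 'and':
        return f"({to_fo2(phi[1], var)} & {to_fo2(phi[2], var)})"
    if op == 'or':
        return f"({to_fo2(phi[1], var)} | {to_fo2(phi[2], var)})"
    if op == 'X':   # next: the successor position satisfies phi
        return f"exists {other}, succ({var},{other}) & {to_fo2(phi[1], other)}"
    if op == 'F':   # strictly in the future
        return f"exists {other}, {var}<{other} & {to_fo2(phi[1], other)}"
    if op == 'G':   # [G phi](x) = forall y, y<=x | phi(y)
        return f"forall {other}, {other}<={var} | {to_fo2(phi[1], other)}"
    raise ValueError(op)

print(to_fo2(('F', ('atom', 'a'))))
# exists y, x<y & a(y)
print(to_fo2(('G', ('F', ('atom', 'a')))))
# forall y, y<=x | exists x, y<x & a(x)
```

The second output shows the variable reuse at work: the inner F quantifies on x again, exactly as in Theorem 26.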

For the converse direction from FO2+ to UTL+, we draw inspiration from [5, Theorem 1]. This proof is similar to that of [12] used previously in the proof of Lemma 25: we perform a disjunction on the different valuations of unary predicates in one free variable to build a formula with one free variable. However, the proof of Lemma 25 cannot be adapted as is, since it uses the separation theorem which does not preserve the membership of a formula to UTL, see [6, Lem 9.2.2]. However, the article [5] uses negations, and we must therefore construct our own induction case for the universal quantifier that is treated in [5] via negations.

The beginning of the proof is identical to that of Lemma 25, starting with a formula ∃y, ψ(x,y) or ∀y, ψ(x,y). Using the same notations, we can reduce to the case of a formula ∃y, ψS(x,y) or ∀y, ψS(x,y) with no unary predicate applied to x in ψS(x,y), since all those predicates have been replaced with ⊤ or ⊥ according to S. Contrary to the previous proof, we cannot directly replace binary predicates with unary predicates, because this relied on the separation theorem.

Let us consider, as in [5], the position formulas y<x ∧ nsucc(y,x), succ(y,x), y=x, succ(x,y) and x<y ∧ nsucc(x,y), whose set is denoted T.

We then have the logical equivalence:

ψS(x,y) ≡ ⋁τ∈T (τ(x,y) ∧ ψτS(y)) ≡ ⋀τ∈T (τ(x,y) → ψτS(y))

where ψτS(y) is obtained from the formula ψS(x,y) assuming the relative positions of x and y are described by τ. The above equivalence holds because T forms a partition of the possibilities for the relative positions of x and y: exactly one of the five formulas τ(x,y) from T must hold. Since x and y are the only two variables, any binary predicate involving x is a binary predicate involving x and y (or else it involves only x and is trivial). Binary predicates are therefore trivialized according to the position described by τ.

Example 27.

For ψS(x,y) = nsucc(x,y) ∧ a(y) ∧ (∃x, x≤y ∧ b(y)) and for the position formula τ = y<x ∧ nsucc(y,x), we have ψτS(y) = a(y) ∧ (∃x, x≤y ∧ b(y)). We do not replace the bound variable x. We have obtained a formula with one free variable, so we can indeed use the induction hypothesis.

If we started with the existential form ∃y, ψS(x,y), we will use the disjunctive form ψS(x,y) ≡ ⋁τ∈T (τ(x,y) ∧ ψτS(y)). This allows us to commute the quantifier with the disjunction, and obtain ⋁τ∈T ∃y, (τ(x,y) ∧ ψτS(y)). Similarly, if starting with the universal form ∀y, ψS(x,y), we will use the conjunctive form ψS(x,y) ≡ ⋀τ∈T (τ(x,y) → ψτS(y)).

We then need to translate ∃y, τ(x,y) ∧ ψτS(y) and ∀y, τ(x,y) → ψτS(y), which we denote respectively [τ]∃ and [τ]∀, into UTL+, for any position formula τ. We assume ψτS fixed in this context, so it is omitted from this notation. However, it is important to note that [τ]∃ and [τ]∀ will still depend on ψτS. In each case, we write χ for the UTL+ formula obtained by induction from ψτS(y):

[y<x ∧ nsucc(y,x)]∃ ≡ YPχ    [y<x ∧ nsucc(y,x)]∀ ≡ YHχ
[succ(y,x)]∃ ≡ [succ(y,x)]∀ ≡ Yχ
[y=x]∃ ≡ [y=x]∀ ≡ χ
[succ(x,y)]∃ ≡ [succ(x,y)]∀ ≡ Xχ
[x<y ∧ nsucc(x,y)]∃ ≡ XFχ    [x<y ∧ nsucc(x,y)]∀ ≡ XGχ.

This achieves the proof of Theorem 26.

Corollary 28.

The logic FO2+[𝔅<] is equivalent to UTL+[P,F,H,G].

Proof.

For the right-to-left direction, it suffices to notice that the predicates used to translate the constructors of UTL+[P,F,H,G] in the previous proof belong to 𝔅<.

For the left-to-right direction, simply replace the set T in the proof of Theorem 26 by {y<x, y=x, x<y}. Once again, we obtain an exhaustive system of mutually exclusive position formulas that allows us to trivialize binary predicates. The proof of Theorem 26 thus lifts immediately to this case.

We showed that several classical logical equivalence results can be transposed to their positive variants.

4 Characterisation of monotonicity

So far, we have focused on languages described by positive formulas, from which monotonicity follows. Here, we focus on the monotonicity property and propose a characterisation. We then derive a monoid-based algorithm that decides, given a regular language L, whether it is monotone, refining results from [11] focusing on automata-based algorithms.

4.1 Characterisation by monoids

We assume the reader familiar with monoids (see long version for detailed definitions).

We will write (𝐌,·) for a monoid, 𝐌L for the syntactic monoid of a regular language L, and ≤L for its syntactic order.

Theorem 29.

Let L ⊆ A* be a regular language, and ≤L its syntactic order. The language L is monotone if and only if we have:

∀(s,s′) ∈ A², s ⊆ s′ ⟹ h(s) ≤L h(s′)

where h : A* → 𝐌L denotes the syntactic morphism onto the syntactic monoid.

Proof.

For the left-to-right direction, let L be a monotone language and s ⊆ s′. Let m and n be two elements of 𝐌L such that m·h(s)·n ∈ h(L). Since h : A* → 𝐌L is surjective, let u ∈ h⁻¹(m) and v ∈ h⁻¹(n). Then usv ∈ L since h recognises L. So us′v ∈ L by monotonicity of L, since usv ≤A us′v. Thus m·h(s′)·n ∈ h(L). We can conclude that h(s) ≤L h(s′), since m and n were arbitrary.

For the converse direction, suppose that L verifies the condition of Theorem 29. Remark that ≤L is compatible with the product of the monoid. Let u ∈ L and v be two words such that u ≤A v. Then h(u) ≤L h(v). By definition of the syntactic order, and since h(u) is in the accepting set of the monoid, h(v) must be accepted as well, hence v ∈ L. Thus L is monotone.

4.2 An algorithm to decide monotonicity

We immediately deduce from Theorem 29 an algorithm for deciding the monotonicity of a regular language L from its syntactic monoid. Indeed, it suffices to check, for every pair of letters (s,s′) such that s is included in s′, whether m·h(s)·n ∈ h(L) implies m·h(s′)·n ∈ h(L) for every pair (m,n) of elements of the syntactic monoid, where h denotes the syntactic morphism onto the syntactic monoid.

This algorithm works for any monoid 𝐌 that recognises L through a surjective morphism h : A^* → 𝐌, not just the syntactic monoid. Indeed, given an arbitrary recognising monoid, we first restrict it to the submonoid generated by h(A) to guarantee that h is surjective. Then, checking the above implication is equivalent to checking whether h(s) ≤_L h(s′) for all letters s and s′ such that s ⊆ s′.
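The check described above is direct to implement. Below is a minimal Python sketch (not from the paper; the representation of monoids as multiplication tables, the letters encoded as frozensets of predicates, and all names are ours), illustrating the quadratic test on two toy monoids over a single predicate p:

```python
from itertools import product

def is_monotone(elems, mult, accept, h):
    """Decide monotonicity of L via the condition of Theorem 29.

    elems  : list of monoid elements
    mult   : dict (x, y) -> x*y, the multiplication table
    accept : set of accepting elements, i.e. h(L)
    h      : dict mapping each letter (a frozenset of predicates) to its image;
             we assume the monoid is already restricted so that h is surjective
    """
    for s, s2 in product(h, repeat=2):
        if not s <= s2:            # keep only pairs with s included in s2
            continue
        for m, n in product(elems, repeat=2):
            if (mult[(mult[(m, h[s])], n)] in accept
                    and mult[(mult[(m, h[s2])], n)] not in accept):
                return False       # h(s) is not below h(s2)
    return True

# Toy alphabet over one predicate p: letters {} and {p}, written 0 and 1.
zero, one = frozenset(), frozenset({"p"})

# Monoid {e, a} with a absorbing, recognising "contains a 1" (monotone).
contains = dict(elems=["e", "a"],
                mult={(x, y): ("a" if "a" in (x, y) else "e")
                      for x in "ea" for y in "ea"},
                accept={"a"}, h={zero: "e", one: "a"})

# Monoid {i, z, o} remembering the last letter (i is the identity),
# recognising "ends with a 0" (not monotone).
ends = dict(elems=["i", "z", "o"],
            mult={(x, y): (x if y == "i" else y)
                  for x in "izo" for y in "izo"},
            accept={"z"}, h={zero: "z", one: "o"})
```

Here `is_monotone(**contains)` returns True while `is_monotone(**ends)` returns False, matching the fact that "contains a 1" is monotone and "ends with a 0" is not.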

This is summarised in the following proposition:

Proposition 30.

There is an algorithm which takes as input a monoid (𝐌, ·) recognising a regular language L through a morphism h and decides whether L is monotone in O(|A|² · |𝐌|²).

It was shown in [11, Thm 2.5] that deciding monotonicity is PSpace-complete if the language is given by an NFA, and in P if it is given by a DFA.

Recall that in general, the syntactic monoid of a language L may be exponentially larger than the minimal DFA of L.

We give a more precise result for DFAs, and also a refined complexity result for monoid inputs.

Proposition 31.

Deciding whether a regular language is monotone is in LogSpace when the input is a monoid, while it is NL-complete when the language is given by a DFA.

5 Semantic and syntactic monotonicity

The paper [11, Definition 4.2] exhibits a monotone language definable in FO but not in FO+. The question then arises as to how simple such a counter-example can be. For instance, can it be taken in specific fragments of FO, such as FO2? This section presents a few lemmas that might shed some light on the subject, followed by some conjectures, and a new counterexample to the equivalence between syntactic and semantic monotonicity in the alternation-free fragment of FO2.

5.1 Refinement of the counter-example in the general case

In [11, Sec. 4], the counter-example language that is monotone and FO-definable but not FO+-definable uses three predicates a, b and c and is as follows:

K = ((abc)^*)^↑ ∪ A^* ⊤ A^*

where ↑ is the monotone closure from Definition 8, and ⊤ = {a, b, c} is the letter of A containing all three predicates. Any word containing ⊤ is therefore in K, which intuitively corresponds to ignoring any phenomenon involving this letter, while ensuring monotonicity.

It uses the following words to find a strategy for Duplicator in EF_k^+ (the positive EF game played for k rounds):

u_0 = (abc)^n and u_1 = ((ab)(bc)(ca))^n (ab)(bc)

where n is greater than 2^k, and (st) is just a compact notation for the letter {s, t}, for any predicates s and t.
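To make the roles of u_0 and u_1 concrete, here is a small Python sketch of a direct membership test for K, with letters encoded as frozensets of predicates (the encoding and all helper names are ours, for illustration only):

```python
# K = ((abc)*)^up  ∪  A* ⊤ A*, with ⊤ = {a, b, c}.
TOP = frozenset("abc")

def in_K(w):
    """Membership in K for a word w given as a list of frozensets."""
    if any(letter >= TOP for letter in w):       # the A* ⊤ A* part
        return True
    if len(w) % 3 != 0:                          # the closure of (abc)* fixes |w| mod 3
        return False
    # position i must carry the predicate a, b, c cyclically
    return all("abc"[i % 3] in letter for i, letter in enumerate(w))

def l(*preds):                                   # letter constructor
    return frozenset(preds)

n = 4
u0 = [l("a"), l("b"), l("c")] * n
u1 = [l("a", "b"), l("b", "c"), l("c", "a")] * n + [l("a", "b"), l("b", "c")]
```

As expected, `in_K(u0)` holds while `in_K(u1)` does not: u_1 has length 3n + 2 and contains no letter ⊤.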

This in turn allows one to show the failure of Lyndon's preservation theorem on finite structures [11, Sec. 5]. Our goal in this section is to refine this counter-example to more constrained settings. We hope that by exploring the limits of this behaviour, we achieve a better understanding of the discrepancy between monotone and positive.

In Section 5.1.1, we give a smaller fragment of FO where the counter-example can still be encoded. In Section 5.1.2, we show that the counter-example can still be expressed with a single unary predicate. This means that it could occur for instance in LTL+ where the specification only talks about one sensor being activated or not.

5.1.1 Using the between predicate

Here we investigate the robustness of the counter-example by restricting the fragment of FO+ in which it can be expressed.

First, let us define the “between” binary predicate introduced in [10].

Definition 32 ([10]).

For any unary predicate a (not only predicates from Σ but also Boolean combinations of them), a also designates a binary predicate, called the between predicate, such that for any word u and any valuation ν, we have (u, ν) ⊨ a(x, y) if and only if there exists an index i strictly between ν(x) and ν(y) (both excluded) such that (u, [z ↦ i]) ⊨ a(z).

We denote by 𝔟𝔢 the set of between predicates and by 𝔟𝔢⁺ the set of positive between predicates, i.e. positive Boolean combinations of predicates of the form a(x, y) with a ∈ Σ.
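The semantics of the between predicate can be stated operationally. A small Python sketch (names ours), with a word represented as a list of sets of predicates and positions as 0-based indices:

```python
def between(u, a, x, y):
    """a(x, y): some position strictly between x and y (both excluded)
    carries the unary predicate a."""
    lo, hi = sorted((x, y))
    return any(a in u[i] for i in range(lo + 1, hi))

# On u = ({a})({b})({c}): b holds between positions 0 and 2,
# but nothing lies strictly between adjacent positions.
u = [{"a"}, {"b"}, {"c"}]
```

For instance `between(u, "b", 0, 2)` holds, while `between(u, "b", 0, 1)` does not.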

It is shown in [10] that FO2[𝔅0 ∪ 𝔟𝔢] is strictly less expressive than FO.

Proposition 33.

There exists a monotone language definable in FO2[𝔅0 ∪ 𝔟𝔢] which is not definable in FO2+[𝔅0 ∪ 𝔟𝔢⁺].

Proof.

We will use the following language:

K′ := K ∪ A^*((ab)² + (bc)² + (ca)² + (ab)(ca) + (bc)(ab) + (ca)(bc))A^*.

Indeed, in [11, Sec. 4], it is explained that to recognise K in FO, we need to look for some “anchor positions”. Such positions resolve the possible ambiguity introduced by double letters of the form (ab), which could play two different roles when witnessing membership in ((abc)^*)^↑. Indeed, if (ab) appears in a word, we cannot tell whether it stands for an a or a b. In contrast, anchor letters have only one possible interpretation. They may be singletons ({a}, {b}, {c}) or consecutive double letters such as (ab)(ca), which can only be interpreted one way, here as bc. For our language K′, we accept any word containing an anchor of the second kind. This means that in the remaining words we are only interested in singleton anchors. Thus, we need only two variables to locate consecutive anchors, and between predicates to check that the letters between the anchors are double letters. See the long version for a more detailed description of a formula.

In order to show that K′ is not definable in FO2+[𝔅0 ∪ 𝔟𝔢⁺], it suffices to remark that the argument showing that K is not definable in FO+ still applies, because the words u_0 and u_1 used in the EF⁺-game are still separated by K′. This shows that K′ is not even definable in FO+, so a fortiori it cannot be definable in FO2+[𝔅0 ∪ 𝔟𝔢⁺].

5.1.2 Only one unary predicate

Now, let us show another refinement. We can lift K to a counter-example where the set of predicates Σ is reduced to a singleton.

Proposition 34.

As soon as there is at least one unary predicate, there exists a monotone language definable in FO but not in FO+.

Proof (Sketch).

Suppose that Σ is reduced to a singleton. Then A is reduced to two letters, which we denote 0 and 1, with 1 greater than 0. We will again rely on the proof from [11, Sec. 4]. We encode each predicate from {a, b, c} and a new letter # (the separator) into A^* as follows:

[a] = 100,  [b] = 010,  [c] = 001,  [#] = 100001.

We encode the language K as follows:

[K] = (([a][#][b][#][c][#])^*)^↑ ∪ A^* 1 (A^4 ∖ 0^4) 1 A^* ∪ A^* 1^5 A^*.

It remains to verify that [K] is monotone and FO-definable, and that it is not FO+-definable. The choice of encoding guarantees that this is indeed the case and that we can rely on properties of the original language K, without any ambiguity introduced by the encoding. See the long version for details.
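The letter encoding above is straightforward to implement. A minimal Python sketch (assuming words over {a, b, c, #} are given as strings; names ours):

```python
# Encoding of the predicates a, b, c and the separator # into {0, 1}*.
ENC = {"a": "100", "b": "010", "c": "001", "#": "100001"}

def encode(word):
    """Map a word over {a, b, c, #} to a word over the two-letter alphabet."""
    return "".join(ENC[ch] for ch in word)
```

For instance, `encode("a#")` yields the concatenation of [a] = 100 and [#] = 100001.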

5.2 Stability through monotone closure

It has been shown by Thérien and Wilke [18] that FO2[𝔅<]-definable languages are exactly those that are both Σ2-definable and Π2-definable, where Σ2 is the set of FO-formulas of the form ∃x_1, …, x_n ∀y_1, …, y_m φ(x_1, …, x_n, y_1, …, y_m) where φ is quantifier-free, and Π2-formulas are negations of Σ2-formulas. Hence, Σ2 ∪ Π2 is the set of FO-formulas in prenex normal form with at most one quantifier alternation. Moreover, Pin and Weil [15] showed that Σ2 describes the unions of languages of the form A_0^* s_0 A_1^* s_1 ⋯ s_t A_{t+1}^*, where t is a natural number, the s_i are letters from A, and the A_i are subalphabets of A.
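A single Pin–Weil building block A_0^* s_0 A_1^* s_1 ⋯ s_t A_{t+1}^* translates directly into a regular expression. An illustrative Python sketch (function name ours; letters are assumed to be single characters):

```python
import re

def sigma2_pattern(alphabets, seps):
    """Regex for A0* s0 A1* s1 ... s_t A_{t+1}* (one Pin-Weil building block).

    alphabets : list of t+2 subalphabets (sets of single-character letters)
    seps      : list of t+1 separator letters s0, ..., s_t
    """
    def star(A):
        # empty subalphabet contributes the empty pattern
        return "(?:" + "|".join(map(re.escape, sorted(A))) + ")*" if A else ""
    body = "".join(star(A) + re.escape(s) for A, s in zip(alphabets, seps))
    return body + star(alphabets[-1])

# The block {a,b}* c {a}* accepts "abca" but not "abcb".
pat = sigma2_pattern([{"a", "b"}, {"a"}], ["c"])
```

A Σ2-definable language is then a finite union of such patterns, i.e. an alternation of blocks of this shape.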

In the following, we will use 𝔅< as the default set of binary predicates for FO2.

Even though we do not know yet whether FO2+ captures the set of monotone FO2-definable languages, we can state the following theorem:

Theorem 35.

The set Σ2⁺ ∩ Π2⁺ of languages definable by both positive Σ2-formulas (written Σ2⁺) and positive Π2-formulas (written Π2⁺) is equal to the set of monotone FO2-definable languages.

Proof (Sketch).

We start by showing that Σ2⁺ captures the set of monotone Σ2-definable languages. This can be done thanks to the syntactic characterisation of Σ2-definable languages given by Pin and Weil [15]: a language is Σ2-definable if and only if it is a union of languages of the form A_0^* s_0 A_1^* s_1 ⋯ s_t A_{t+1}^*. Next, we introduce a dual closure operator L^↓ = ((L^c)^↑)^c that allows a finer manipulation of monotone languages. This allows us to lift the previous result to Π2 languages. Taken together, these results show that the set of languages definable by both positive Σ2-formulas and positive Π2-formulas is exactly the set of monotone FO2-definable languages. See the long version for details.
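The closure operators ↑ and ↓ are easy to experiment with on a finite universe of words. A Python sketch (names ours), with letters as frozensets of predicates and words as tuples of letters:

```python
from itertools import product

def leq(v, w):
    """Pointwise inclusion of letters, for words of the same length."""
    return len(v) == len(w) and all(a <= b for a, b in zip(v, w))

def up(L, universe):
    """Monotone closure, restricted to a finite universe of words."""
    return {w for w in universe if any(leq(v, w) for v in L)}

def down(L, universe):
    """Dual closure ((L^c)^up)^c on the same universe: words all of whose
    pointwise-smaller variants belong to L."""
    return set(universe) - up(set(universe) - set(L), universe)

# Universe: all words of length 2 over the letters {} and {p}.
z, o = frozenset(), frozenset({"p"})
universe = set(product([z, o], repeat=2))
```

For example, `up({(z, o)}, universe)` contains exactly (z, o) and (o, o), since only the letter z can be bumped to o.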

This last result shows how close FO2+ is to capturing the monotone FO2-definable languages. However, it does not seem easy to lift the equivalence Σ2 ∩ Π2 = FO2 to the positive fragments as we did for the other classical equivalences in Section 3. Indeed, the proof from [18] itself relies on the proof of [15], which is mostly semantic, while we are dealing with syntactic equivalences.

Since languages in Σ2⁺ ∩ Π2⁺ are in particular in FO+, Theorem 35 implies that a counter-example separating monotone FO from FO+ cannot be FO2[𝔅<]-definable, as stated in the following corollary:

Corollary 36.

Any monotone language described by an FO2[𝔅<] formula is also described by an FO+ formula.

If the monotone closure L^↑ of a language L described by a formula of FO2[𝔅<] is in FO+, nothing guarantees on the other hand that L^↑ is described by a formula of FO2[𝔅<], or even of FO2[𝔅0], as the counterexample L = abcba shows: its monotone closure L^↑ cannot be defined by an FO2[𝔅0] formula. This can be checked using for instance Charles Paperman's software Semigroup Online [14]. Notice that the software uses the following standard names: DA corresponds to FO2[𝔅<], and LDA to FO2[𝔅0].

We leave open the following problem, where FO2 can stand either for FO2[𝔅<] or for FO2[𝔅0]:

Open Problem 37.
  • Is a monotone language definable in FO2 if and only if it is definable in FO2+?

  • Is it decidable whether a given regular language is definable in FO2+?

Since we can decide whether a language is definable in FO2 and whether it is monotone, a positive answer to the first item would imply one to the second.

5.3 The alternation-free fragment of 𝐅𝐎𝟐+

We consider in this section the alternation-free fragment of FO2, and its FO2+ counterpart, as a first step towards understanding the full logic FO2+ (recall that we are using 𝔅< as the default set of binary predicates here). The alternation hierarchy of FO2 has been studied extensively in recent years (for example in [20]), and might be an angle through which to tackle Open Problem 37. The first level of this hierarchy (the alternation-free fragment) enjoys several characterisations. In particular, it corresponds to the variety of languages recognised by J-trivial monoids, which entails in particular the decidability of membership for this fragment. We exhibit a counter-example to an analogue of Open Problem 37 for this fragment, namely a language that is monotone and definable in FO2 without alternation, but not definable in FO2+ without alternation. This constitutes a new counter-example separating monotonicity from positivity, and the first one that does not rely on the language K from [11] but instead exploits a different phenomenon.

We will work over the set of unary predicates Σ = {1}, and A = 𝒫(Σ) = {∅, {1}}. To simplify notation, we will think of A as the set of letters {0, 1}, with 0 representing the empty set and 1 the singleton {1}. The language L is defined as the monotone closure of 1⁺01⁺, that is L = 1⁺01⁺ ∪ 1⁺11⁺. The language L is monotone by definition. It is definable in FO2 without alternation: FO2 allows one to express that a given subword appears, or does not appear, in a word, and it is straightforward to describe L in terms of such properties. See the long version.
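Concretely, over the alphabet {0, 1} the language L coincides with the regular expression 1+(0|1)1+. A quick Python sanity check (names ours):

```python
import re

# L = 1+ 0 1+  ∪  1+ 1 1+, i.e. the monotone closure of 1+ 0 1+.
L_RE = re.compile(r"1+[01]1+")

def in_L(w):
    """Membership in L for a word over the alphabet {0, 1} given as a string."""
    return L_RE.fullmatch(w) is not None
```

In particular, with n = 4, the word 1^{2n}01 belongs to L while its subword 1^{2n}0 does not, which is exactly the separating pair used in the proof below.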

We now show that this language is indeed a counterexample to an analogue of Open Problem 37 for the logic FO2 without alternation:

Proposition 38.

The language L is not definable in FO2+ without alternation.

Proof (Sketch).

We make use of the Ehrenfeucht-Fraïssé game for FO2+ without alternation (see the long version). Fix a natural number n, and consider the words u = 1^{2n}01 ∈ L and v = 1^{2n}0 ∉ L. We show that Duplicator wins the game EF2+_{n, alt-free}(u, v), which entails that no FO2+ formula without alternation can define L. The idea is that Spoiler has to play in u, because v is a subword of u (so Duplicator can copy Spoiler's moves); but when Spoiler plays in u, Duplicator can mimic Spoiler's moves in the block 1^{2n}, and Spoiler can only win by showing that the two words have different lengths, which cannot be done in fewer than n + 1 rounds. The details of the proof can be found in the long version.

Note that this counterexample refutes the analogue of the first item of Open Problem 37 for the alternation-free fragment of FO2, but whether the alternation-free fragment of FO2+ has decidable membership remains open.

Conclusion

We have investigated how the tension between semantic and syntactic positivity plays out for FO-definable languages. We took a close look at its interaction with monoids, the complexity of decision procedures, LTL, and fragments of FO. We showed that despite earlier negative results from [11], such as the undecidability of membership in FO+, this logic retains some robustness, as equivalence with LTL+ can be obtained, even at the level of the specific fragments FO2+[𝔅<] and FO2+[𝔅0]. We showed that the counter-example language K from [11] can be adapted to show that some positive fragments of FO differ from their monotone counterparts, while for another fragment, namely alternation-free FO2, we provided a new counter-example relying on a different mechanism. We leave the question open for the full FO2 fragment, which we consider an interesting challenge.

References

  • [1] Miklos Ajtai and Yuri Gurevich. Monotone versus positive. J. ACM, 34(4):1004–1015, October 1987. doi:10.1145/31846.31852.
  • [2] David Baelde and Dale Miller. Least and greatest fixed points in linear logic. In Nachum Dershowitz and Andrei Voronkov, editors, Logic for Programming, Artificial Intelligence, and Reasoning, pages 92–106, Berlin, Heidelberg, 2007. Springer Berlin Heidelberg. doi:10.1007/978-3-540-75560-9_9.
  • [3] Julian Bradfield and Igor Walukiewicz. The mu-calculus and model checking. In Handbook of Model Checking, pages 871–919. Springer, 2018. doi:10.1007/978-3-319-10575-8_26.
  • [4] Thomas Colcombet. Regular Cost Functions, Part I: Logic and Algebra over Words. Logical Methods in Computer Science, Volume 9, Issue 3, August 2013. doi:10.2168/LMCS-9(3:3)2013.
  • [5] Kousha Etessami, Moshe Y. Vardi, and Thomas Wilke. First-order logic with two variables and unary temporal logic. BRICS Report Series, 4(5), January 1997. doi:10.7146/brics.v4i5.18784.
  • [6] Dov M. Gabbay, Ian Hodkinson, and Mark Reynolds. Temporal logic (vol. 1): mathematical foundations and computational aspects. Oxford University Press, Inc., USA, 1994.
  • [7] Michelangelo Grigni and Michael Sipser. Monotone complexity. In Poceedings of the London Mathematical Society Symposium on Boolean Function Complexity, pages 57–75, USA, 1992. Cambridge University Press.
  • [8] Neil Immerman. Relational queries computable in polynomial time. Information and Control, 68(1):86–104, 1986. doi:10.1016/S0019-9958(86)80029-8.
  • [9] Hans Kamp. Tense Logic and the Theory of Linear Order. PhD thesis, University of California Los Angeles, 1968. Published as Johan Anthony Willem Kamp. URL: http://www.ims.uni-stuttgart.de/archiv/kamp/files/1968.kamp.thesis.pdf.
  • [10] Andreas Krebs, Kamal Lodaya, Paritosh K. Pandya, and Howard Straubing. Two-variable logics with some betweenness relations: Expressiveness, satisfiability and membership. Log. Methods Comput. Sci., 16(3), 2020. URL: https://lmcs.episciences.org/6765.
  • [11] Denis Kuperberg. Positive first-order logic on words and graphs. Log. Methods Comput. Sci., 19(3), 2023. doi:10.46298/lmcs-19(3:7)2023.
  • [12] Denis Kuperberg and Michael Vanden Boom. On the expressive power of cost logics over infinite words. In Artur Czumaj, Kurt Mehlhorn, Andrew Pitts, and Roger Wattenhofer, editors, Automata, Languages, and Programming, pages 287–298, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg. doi:10.1007/978-3-642-31585-5_28.
  • [13] Roger C. Lyndon. Properties preserved under homomorphism. Pacific J. Math., 9(1):143–154, 1959. URL: https://projecteuclid.org:443/euclid.pjm/1103039459.
  • [14] Charles Paperman. Semigroup online. URL: https://paperman.name/semigroup/.
  • [15] Jean-Éric Pin and Pascal Weil. Polynomial closure and unambiguous product. Theory of Computing Systems, 30:383–422, 1995. URL: https://api.semanticscholar.org/CorpusID:850708.
  • [16] Benjamin Rossman. Homomorphism preservation theorems. J. ACM, 55, July 2008. doi:10.1145/1379759.1379763.
  • [17] Alexei P. Stolboushkin. Finitely monotone properties. In LICS, San Diego, California, USA, June 26-29, 1995, pages 324–330. IEEE Computer Society, 1995. doi:10.1109/LICS.1995.523267.
  • [18] Denis Thérien and Thomas Wilke. Over words, two variables are as powerful as one quantifier alternation. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC ’98, pages 234–240, New York, NY, USA, 1998. Association for Computing Machinery. doi:10.1145/276698.276749.
  • [19] Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing, STOC ’82, pages 137–146, New York, NY, USA, 1982. Association for Computing Machinery. doi:10.1145/800070.802186.
  • [20] Philipp Weis and Neil Immerman. Structure Theorem and Strict Alternation Hierarchy for FO2 on Words. In Thomas Schwentick, Denis Thérien, and Heribert Vollmer, editors, Circuits, Logic, and Games, volume 6451 of Dagstuhl Seminar Proceedings (DagSemProc), pages 1–22, Dagstuhl, Germany, 2007. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/DagSemProc.06451.6.