Tree Algebras and Bisimulation-Invariant MSO on Finite Graphs
Abstract
We establish that the bisimulation invariant fragment of MSO over finite transition systems is expressively equivalent over finite transition systems to modal -calculus, a question that had remained open for several decades.
The proof goes by translating the question to an algebraic framework, and showing that the languages of regular trees that are recognised by finitary tree algebras whose sorts zero and one are finite are the regular ones. This corresponds for trees to a weak form of the key translation of Wilke algebras to omega-semigroup over infinite words, and was also a missing piece in the algebraic theory of regular languages of infinite trees for twenty years.
Keywords and phrases:
MSO, mu-calculus, finite graphs, bisimulation, tree algebraCategory:
Track B: Automata, Logic, Semantics, and Theory of ProgrammingFunding:
Denis Kuperberg: ANR ReCiProg.Copyright and License:
![[Uncaptioned image]](x1.png)
2012 ACM Subject Classification:
Theory of computation Finite Model Theory ; Theory of computation Modal and temporal logics ; Theory of computation Algebraic language theory ; Theory of computation Regular languagesAcknowledgements:
We thank Achim Blumensath for helpful discussions, proofreading this work, and finding a major mistake in a previous version of this paper (now solved).Editors:
Keren Censor-Hillel, Fabrizio Grandoni, Joël Ouaknine, and Gabriele PuppisSeries and Publisher:

Low CO2 This is a low-CO2 research paper. The authors did not use plane travel, neither for the development of this work nor its presentation.
1 Introduction
A well-known result of Janin and Walukiewicz [13, Thm 11] states:
Theorem 1.
For a property of transition systems, the following statements are equivalent:
-
being MSO-definable and bisimulation-invariant, and
-
being -calculus-definable111In this work, -calculus refers to the standard propositional modal -calculus, i.e. the temporal logic constructed from propositions, modalities and , boolean connectives, and least and greatest fixpoints..
In which a property of transition systems is bisimulation-invariant if for any two bisimilar transition systems, both satisfy the property, or none.
Keeping the above result in mind, it is enlightening to recall the following well-known finite model property: two -calculus sentences that can be separated by some transition system can always be separated by a finite one. In other words, the semantics of a -calculus sentence is entirely defined by what it expresses over finite transition systems. For this reason, the question of whether a version of Theorem 1 for finite transition systems would hold is extremely natural. It remained an important open question in the field for almost three decades. One contribution of this paper is to provide a positive answer to it:
Theorem 2.
For a property of finite transition systems, the following items are equivalent:
-
being MSO-definable and bisimulation-invariant, and
-
being -calculus-definable.
For both Theorem 1 and Theorem 2, the upward direction is the same. It follows from well-known facts: (1) -calculus sentences can be effectively translated into equivalent MSO-sentences, and (2) properties definable in -calculus are invariant under bisimulation.
The difficult direction is to show that bisimulation-invariant MSO-definable properties can be translated to equivalent -calculus sentences. The explanation of why the original approach of Janin and Walukiewicz cannot used in the finite case has been clearly phrased by Blumensath and Wolf [5]:
The above mentioned result by Janin and Walukiewicz on bisimulation-invariant monadic second-order logic has so far defied all attempts at a similar transfer to the realm of finite structures. The main reason is that the original proof is based on automata-theoretic techniques and an essential ingredient is a reduction to trees, via the unravelling operation. As this operation produces infinite trees, we cannot use it for formulae that are only bisimulation-invariant over finite transition systems.
Our proof of the downward implication of Theorem 2 uses a translation of the problem to an algebraic formalism. Though this approach is similar to other works in the field, it requires to develop several algebraic arguments that are not present in the literature.
Overview of the arguments and key contributions
We present here a high-level overview of the sequence of arguments used in our proof, putting an emphasis on lemmas that involve entirely new arguments. Due to lack of space, only subsets of these arguments will be covered in this extended abstract.
We roughly decompose the arguments in three main parts: from MSO to algebras, from algebras to yield-algebras, and from yield-algebras to -calculus.
PART 1: From MSO to algebras.
The first part of the proof concerns how to transform monadic second-order logic into an algebraic framework. Technically, it follows the algebraic/compositional approach, which is standard in the field, and does not require to develop new arguments.
-
1.
We abstract transition systems by the notion of systems. Systems are finite structures built by combining symbols from a ranked alphabet. Systems also have an initial vertex and variables ( is the rank of the system). See Section 2.1. Finite transition systems can be seen as systems over a suitably chosen ranked alphabet. See Section 2.5.
-
2.
Systems are equipped with composition operations that allow to build complex systems by combining simpler ones together. In the spirit of category theory, all these operations are uniformly described by a single one, . See Section 2.2.
-
3.
From these composition operations, one derives a notion of algebras, that can be used for recognising sets of systems. See Sections 3.1 and 3.2.
-
4.
At this point, the so-called “composition method” can be used for showing that MSO-definable languages of transition systems are recognisable by rankwise-finite algebras (i.e. an algebra in which all ranks are finite). See Section 3.3.
-
5.
We define morphisms between systems, that correspond to folding the source system to the target system (or unfolding when seen the other way round). Two systems are unfold-equivalent if they share a common unfolding. One can identify the usual notion of “infinite regular tree” as an unfold-equivalence-class of systems. See Section 2.4. We also show that unfold-equivalence is a congruence with respect to (Lemma 7).
PART 2: From algebras to yield-algebras.
What we have achieved so far is that we have translated the logical hypothesis of Theorem 2, i.e. an MSO-definable bisimilar-invariant property, into a rankwise-finite algebra that recognises it. The next step toward constructing a -calculus formula is to add some form of non-determinism to systems. This is what we review now.
-
6.
We generalise the notion of systems to the richer notion of set-systems. The generalisation goes along two directions: (1) a form of non-determinism, and (2) some special vertices called “root vertices” are added (we will not discuss this point for the moment). See Section 2.1. Again, these set-systems are equipped with composition operations via a operation. See Section 2.2.
-
7.
Using the notion of morphism, we can define what are the systems that a set-system can produce: its yields. Yields are, up to unfold-equivalence, systems that can be produced by “resolving the non-determinism”. Two set-systems are yield-equivalent if they share the same yields. Over systems, yield-equivalence coincides with unfold-equivalence. See Section 4.1.
-
8.
As we did for systems, we also have to study how yield-equivalence interacts with the operations of composition of set-systems, and as before, the conclusion is that yield-equivalence is a congruence with respect to See Section 4.3.
-
9.
As a consequence, there is a notion of yield-algebra, which corresponds to set-systems modulo yield-equivalence. Such yield-algebras can be used for recognising sets of systems.
-
10.
A key contribution of this work is Lemma 20: A language of regular-trees which is recognisable by a rankwise-finite algebra is recognisable by a rankwise-finite yield-algebra. See Section 4.4.
PART 3: From yield-algebras to -calculus.
At this moment of the proof, the original MSO-definable bisimulation-invariant property has been transformed into a rankwise-finite yield-algebra. The final arguments involve a careful study of such algebraic objects that we overview in Section 5:
-
11.
We first identify inside yield-algebras some elements that we call deterministic. Deterministic elements form a sub-yield-algebra.
-
12.
We then introduce the key automaton property for a yield-algebra that recognises a language : it signifies informally that “all elements can be under-approximated by deterministic ones in a faithful way”.
-
13.
The reason of this definition is that if a language of regular-trees is recognised by a rankwise-finite yield-algebra that has the automaton property, there exists an automaton of infinite trees (in the classical sense) recognising . If the language is furthermore bisimilar-invariant, by adapting the techniques of [13], one can produce a -calculus formula along these lines.
-
14.
The last missing piece, and the second technical contribution of this work is Lemma 25: Syntactic rankwise-finite yield-algebras have the automaton property.
Related work
Bisimulation-invariant logics.
The seminal Hennessy-Milner theorem [12] gives a first characterisation of modal logics via bisimulation-equivalence. Van Benthem generalised it to First-Order logic (FO) in [20]. The result was then transferred to finite structures [18], and specialised to particular classes of finite structures [9, 11]. In [14], the equivalence between the bisimulation-invariant fragment of monadic path logic with CSL* is shown. An intermediate logic strictly between FO and MSO, obtained by extending FO with fixed-point operators, is considered in [16], where it is shown that the two-way bisimulation-invariant fragment of this logic is equivalent to two-way -calculus on finite structures. Concerning the logic MSO, as already explained, Janin and Walukiewicz prove the equivalence with -calculus on general structures [13]. The case of weak Monadic logic was shown to correspond to continuous -calculus in [8].
On finite structures, this equivalence was shown for structures of bounded Cantor-Bendixson rank in [5].
Composition method and tree algebra.
The composition method that we use here was pioneered in [10] for FO and used intensively for MSO in [19, 17]. There is a long line of research pursuing a well-behaved notion of algebra for regular languages of infinite words and trees [21, 15, 1, 6, 2, 3]. The theory has inherent difficulties. For instance [7] gives an example of rankwise-finite algebras that define non-regular languages of infinite trees. In [4], the specific notion of powerset is investigated, it shows that distributivity laws do not exist if there are non-linear identities in the theory (and this is relevant to the present work).
2 Set-systems and Systems
In this section, we introduce systems and set-systems. It may be meaningful to read this part first while concentrating on systems.
We define set-systems and systems in Section 2.1, how to compose them in Section 2.2, and the notion of [set]-context in Section 2.3. In Section 2.4 we introduce morphisms of set-systems, and the notions of folding, unfold-equivalence and regular-trees. We finally explain what are transition systems, how these can be encoded as systems, and how to phrase bisimulation in this framework (Section 2.5).
2.1 Set-Systems and Systems
A >ranked set (or >alphabet) is a family of sets indexed by natural numbers. We write for for some . For , the >rank of , noted , is the integer such that . A >map of ranked sets from to is a map that preserves the ranks, i.e. a family of maps for all . A ranked set is >rankwise-finite if is finite for all . Note that ranked sets equipped with their maps form a category, that we call the >category of ranked sets. We fix an infinite set of >variables. Variables can be seen as having rank . For , we denote with the set , and the variables .
Let us define now the notions of set-systems and systems, illustrated on Figure 1.
Definition 3 (set-systems).
Given a ranked set , a -set-system of >rank is a tuple , composed of
-
a finite set of >vertices , a set of >initial vertices , a set of >root vertices ,
-
a >labelling function ; in practice, we simply write for ,
-
an >edge relation consisting of >edges of the form with ; is called the >direction of the edge; if is a vertex, then is called a >transition edge; otherwise is a variable and is a >variable edge. We also denote the set , and .
We may add subscripts to these elements to identify the set-system they belong to, e.g. for the set of vertices of . We will note the disjoint union of and . A -set-system is >closed if it has rank . Two -set-systems are of the >same shape if they differ only on their labelling, i.e. , , , and .
Given a map of ranked sets from to , we denote the map from -set-systems to -set-systems that sends to .
We may drop the mention of the alphabet , and simply talk about set-systems. The variables used in a [set-]system of rank are . We sometimes use other variable names such as for convenience if there is no ambiguity about the meaning.
Definition 4 (systems).
A -system is a -set-system such that is a singleton, is empty, and for all vertices and all directions , is a singleton.
2.2 Composing systems and set-systems: the operation
We describe in this section the operations used for composing set-systems (or systems), allowing to build complex ones out of simpler elements. We do it using the categorical approach of monads: the idea is to have a single complex operation of composition, called , that can then be specialized into simpler ones.
The operation , takes a set-system of set-systems as input, and produces a set-system as output. We explain it here through an example: the following picture shows a set-system of set-systems and its >flattening. The subsystems are drawn inside the dashed boxes, and boxes are organised themselves as a set-system structure that we call the pattern222Note that the definition enforces that the rank of subsystems (i.e. the number of variables they use) should be consistent with the number of outgoing directions of the vertex in the pattern.. Direction names for edges going out of a node will always be implicit, the topmost one being , then , and so on, ending with for the bottommost.
As is shown, the operation glues the subsystems together, keeping their internal structures. If an edge of a subsystem ends in a variable , then it is connected instead to all the initial vertices of the subsystems reachable following direction in the pattern, as well as to the variables reachable following direction . After flattening, the initial vertices are the initial vertices of the subsystems that were themselves initial vertices in the pattern. Root vertices are treated differently: the root vertices from subsystems remain root vertices during the process, while initial vertices of subsystems that are root get promoted to roots.
Note that the operation, when given a system of systems as input, produces a system. Hence, the same operation shall be used for composing systems or set-systems depending on the context. In a categorical description, , together with the corresponding operation (that turns a single symbol into a one-vertex system), equips the set-system definition (resp. system definition) with a monad structure.
2.3 Set-contexts, and contexts
A special form of composition, which is derived from is the notion of set-context, which are set-systems with a hole that can be filled.
A -set-context with a -hole is a -set-system, in which is a new symbol of rank , called the >hole symbol, and which labels exactly one vertex, called the >hole vertex. Set-contexts are denoted , , … A set-context is >closed if it is of rank . Given a -set-context and a -set-system of rank , denotes the -set-system obtained by substituting for the hole. A >context is a set-context which is a system.
An example of a closed context , a system of rank , and the result are pictured below. The hole vertex is depicted as an empty box.
Formally, this amounts to construct the (set-system)-set-system of the same shape as such that for the hole vertex of , and for all other vertices . We then define to be .
Note in particular that this substitution follows the “rules of flattening”, which means that (1) if the hole vertex is initial in , then all initial vertices of remain initial in , (2) all the root vertices of remain root vertices in , and (3) if the hole vertex is root in , then all initial vertices of get to be promoted into root vertices in .
Depending on the situations, we shall use or [set-]contexts. In many situations, this is just a choice of presentation.
2.4 Morphisms, unfoldings, and regular-trees
So far, we have seen systems and set-systems as “rigid structures”. The notions of morphisms allows us to compare them in more subtle ways.
Set-systems come with a natural definition of morphisms: a >morphism of set-systems from to , set-systems of same rank, is a map from vertices of to vertices of that preserves initial vertices, root vertices and edges.
If we specialise morphisms to -systems we obtain the notions of unfolding, unfold-equivalence and regular-trees:
Definition 5 (unfolding and regular-trees).
If there is a morphism of set-systems from a -system to another one , then is called an >unfolding of , and a >folding of . Two -systems that have a common unfolding are called >unfold-equivalent333Showing that it is an equivalence relation requires a proof. Note that this would be equivalent to require to have a common folding, but this direction is of no use in the present work.. A -regular-tree is an unfold-equivalence class of -systems.
For instance, in the pictures below, the systems and have both as unfolding, and thus are unfold-equivalent. The system is a folding of both and :
Remark 6.
Note that if one would “>fully unfold” a system (as is classically done), one would obtain an infinite tree which is “regular in the classical sense” (meaning that it has a finite number of subtrees), and that all regular trees (in the classical sense) can be obtained as the full unfolding of some system. It is also easy to check that two systems are unfold-equivalent if and only if they have isomorphic full unfoldings. Hence, the map which to a regular tree (in the classical sense) associates the set of systems that fully unfold to it is a bijection between regular trees (in the classical sense) and regular-trees as defined in this work.
At this point, one can prove that unfold-equivalence interacts nicely with :
Lemma 7 (unfold-equivalence is a congruence).
Let be (-system)-systems of the same shape such that is unfold-equivalent to for all vertices , then and are unfold-equivalent.
Equivalently, for all -contexts and unfold-equivalent systems of rank , then and are unfold-equivalent.
From a categorical point of view, this means that systems modulo unfold-equivalence, i.e. regular-trees, equipped with the and operations form a monad.
2.5 Transition systems as systems
In this section, we recall what is the standard definition of a transition system, and we explain how these can be seen as systems as defined above. We also explain how bisimilarity can be phrased in this setting.
A >transition system consists of a finite444In this work, as for systems, transition systems are finite. set of >vertices , an >initial vertex , a binary relation called the >transition relation, and a labelling function . We now explain how to encode transition systems as systems over a specific ranked alphabet .
Let be the ranked alphabet which has an element of rank for all -valuation , and all . Given a closed -system , call its >decoding the transition system that has the same set of vertices with same labelling, the same initial vertex, and holds if is an edge for some . Conversely, is called an >encoding of . Note that the decoding is unique, while several non-isomorphic -systems may encode the same transition system. Note also that all transition systems admit at least one encoding (we use here the assumption that transition systems are finite).
We define the >bisimilarity relation over closed -systems as the least equivalence relation that contains unfold-equivalence and the relation “encoding the same transition system”.
The following lemma states why this is consistent with the standard terminology.
Lemma 8.
Two -systems are bisimilar if and only if their decodings are “bisimilar in the standard sense”.
From now on, we shall only consider transition systems through this encoding as -systems.
3 Algebras
We have seen in the previous section all the necessary material for introducing a natural notion of algebras and using them for recognising languages of systems.
In this section, we define these algebras (Section 3.1),
and describe how they can be used to recognise languages of systems (Section 3.2).
We then explain why MSO-definable languages of transition systems are recognised by rankwise-finite algebras (Section 3.3).
3.1 Algebras
The following definitions follow the standard approach via monads.
Definition 9 (algebras).
An >algebra is a ranked set together with a map of ranked sets (called the >evaluation) from -systems to , such that
-
, for all ,
-
for all (-system)-systems .
A >morphism from an algebra to an algebra is a map from to such that for all -systems .
An algebra is >unfold-invariant if for all unfold-equivalent -systems . Unfold-invariant algebras are called >regular-tree algebras. The ranked set of -systems equipped with as evaluation is an algebra called the >free algebra generated by , or the -free algebra.
Example 10 (regular-tree algebra).
Consider defined by , and to be mapping an -system of rank to if there is a “” element reachable from the initial vertex. Otherwise is the set of indices such that the variable is reachable in from the initial vertex. We leave to the reader to check that satisfies the identities of algebras. This algebra is unfold-invariant, since the symbol (resp. the variable ) is reachable in from the initial vertex if and only if this is also the case in any unfolding of .
3.2 Languages and their recognition by algebras
Call >language of -systems a set of closed -systems. A >language of -regular-trees is a language of -systems which is invariant under unfold-equivalence, i.e. such that if are unfold-equivalent and , then , for all -systems .
Let be a morphism (called the >recognising morphism) from the -free algebra to an algebra , and (called the >accepting set), the >language recognised by is the set of closed -systems defined as
We say that is recognised by if it is recognised by for some and .
This is the standard definition of recognition, as for word languages. The only subtlety is that we focus on closed systems, which is reflected in the fact that . Quite naturally, we shall be interested in languages that are recognisable by rankwise-finite algebras, i.e. algebras such that is finite for all .
Note, as expected, that the language of systems recognised by regular-tree algebras are languages of regular-trees.
Example 11 (reachable symbol).
Consider some ranked alphabet , and a set of letters . Let also be the algebra from Example 10, and define as the unique algebra morphism from the -free algebra to such that for all symbols of rank :
This morphism sends all -systems of rank to if there is a symbol from reachable from the initial vertex, and otherwise the set of numbers of the variables reachable from the initial vertex otherwise. The language recognised by is the set of closed -systems that contain a symbol of reachable from the initial vertex. Since this language is preserved under unfold-equivalence, it is a language of regular-trees. Similarly, the language recognised by is the set of closed -systems that contain no symbol of reachable from the initial vertex.
3.3 Monadic second-order logic, and the composition method
We have seen in Section 2.5 how to see transition systems as -systems. Let us explain now how logic interacts with this view. We assume the reader familiar with >monadic second-order logic (>MSO for short). The signature here is the one of transition systems (one binary relation, unary relations for each predicate, and an initial constant). We shall say that an MSO-sentence is >bisimulation-invariant if for all bisimilar transition systems , we have if and only if .
Using the “composition method” approach, we get the translation from logic to algebra:
Lemma 12.
Given an MSO-sentence , the set is recognisable by a rankwise-finite algebra.
4 Yield Algebras
We aim to introduce a richer notion of algebras, called yield-algebras. For this, we need to first give some form of semantics to set-systems. This is done in Section 4.1 with the notion of yields and yield-equivalence. The notions of yield-algebras and recognition by them follows (Section 4.3). In Section 4.2, we explain a difficulty in working with this definition, and hint at a tool for circumventing it. The section culminates with Lemma 20 explaining why we can restrict our attention to rankwise-finite yield-algebras (Section 4.4).
4.1 Yields of set-systems and yield-equivalence
We shall now see how set-systems can be understood as “finite-non-deterministic machines that would non-deterministically produce systems”. This is the notion of yields, that come in two variants: init and root-yields.
We introduce the notion through an example. A set-system is pictured below, as well as one of its init-yields and one of its root-yields.
Informally, an >init-yield of a set-system is a system that unfolds to a system that can be obtained starting from some initial vertex of by resolving all the non-deterministic choices, potentially partially unravelling the set-system at the same time. A >root-yield is similar, starting from a root vertex instead of an initial vertex. Note that the definition is such that the set of init-yields (resp. root-yields) of a set-system are closed under unfold-equivalence. The real definition makes use of morphisms of set-systems. We set:
Two set-systems of same rank are >yield-equivalent if and .
Note that if is a system, then an init-yield of is nothing but an unfold-equivalent system, and there is no root-yield. As a consequence yield-equivalence coincides with unfold-equivalence over systems.
From now, for simplification of the presentation, we shall do as if only init-yields existed.
4.2 Resolutions
This section illustrates a difficulty arising from working with yields, and briefly describes the tool of resolution that we use to circumvent it in the complete version of this work.
At a high level, we would like, given a set-context , to write the yields of for some set-system of rank as a “composition” of the yields of . The following example shows that it does not work.
Example 13 (yields are not compositional).
Let us work with a ranked alphabet that has symbols and of rank , and of rank . Consider the set-context and the system :
Then:
-
The set-context has two init-yields: and .
-
The system has exactly one init-yield: itself.
-
The set-system has four init-yields up to unfold-equivalence: , , , and .
If we combine the init-yields of and in all possible ways, we only obtain and . These are indeed init-yields of , but we still miss two of them, namely and . The issue is that the double occurrence of in forbids to use it in a combination of systems for producing, eg, .
This situation is in fact a well-identified mathematical difficulty. For instance, this is what prevents the existence of a distributive law with the powerset monad over the monads of non-linear trees [4].
To circumvent this problem, we introduce the notion of >resolution, which combines the yield with some variable renaming.
Definition 14.
Given a -set-system of rank , a >resolution of consists of:
-
a map from to for some , and
-
a -system of rank , such that
-
is an init-yield of , where is with all variables renamed into .
In Example 13, we see that with the constant map is a resolution of , and that it conveys the information that could not be caught by yields.
Now, it makes sense to understand a yield of as a combination of resolutions of and . This notion of resolution is one of the arguments used for showing that yield-equivalence is a congruence:
Lemma 15 (yield-equivalence is a congruence).
Let be (-set-system)-set-systems of the same shape such that is yield-equivalent to for all vertices , then and are yield-equivalent.
4.3 Yield-algebras and recognisability
The definition of a yield-algebra corresponds to the notion of algebras naturally arising from set-systems, quotiented by yield-equivalence. Since set-systems are equipped with a form of non-determinism, yield-algebras are naturally endowed with an order that provides an inf-semi-lattice structure (note that the non-deterministic sum will be noted , i.e. as an infimum rather than a supremum; this is a choice of presentation that corresponds to the fact that we see in this work this non-determinism as “adversarial”, i.e. controlled by an opponent).
Definition 16.
A >yield-algebra is a ranked set together with a map of ranked sets from -set-systems to , such that
-
, for all ,
-
for all (-set-system)-set-systems , and
-
for all unfold-equivalent555This is a generalised version of unfold-equivalence for set-systems, which relies on the natural notion of locally surjective morphisms. -set-systems of same rank , .
-
for all -set-systems of rank and all ,
for all yields of implies .
where for , we note , and holds if .
For , let be .
A >morphism from the yield-algebra to the yield-algebra is a map of ranked sets from to such that for all -set-systems .
Lemma 17.
The operation is associative, commutative, and idempotent. The relation is an order, and computes the infimum with respect to .
Note that since systems are particular cases of set-systems, yield-algebras are in particular algebras. Furthermore, since yield-equivalence coincides with unfold-equivalence over systems, yield-algebras seen as algebras are unfold-invariant, and hence regular-tree algebras. This downgrading of yield-algebras to regular-tree algebras is made implicitly in the rest of this work. This means that we can use yield-algebras for recognising languages of regular-trees:
Definition 18.
A language of regular-trees is >recognisable by a yield-algebra if it is recognisable by seen as an algebra using an accepting set of the form for some .
The following lemma shows the intention behind the definition.
Lemma 19.
Let be a yield-algebra recognising a language of -systems, then, for all closed -set-systems , if and only if .
4.4 rankwise-finite yield-algebras
One can now state the key lemma of this section (which involves new arguments compared to the literature).
Lemma 20.
If a language of regular-trees is recognised by a rankwise-finite algebra, then it is recognised by a rankwise-finite yield-algebra.
For other kind of algebras, such as monoids, deterministic automata over words or trees, forest algebras, …, a similar result is classically obtained by applying a form of powerset construction. Let us explain by an example why this standard approach fails here.
Example 21 (standard approach fails).
Given a rankwise-finite algebra , the “standard approach” would be to add new elements to in order to build a yield-algebra for the same language. With this approach is a sub-algebra of 666For instance, if we apply the powerset operation to a monoid , then by distributive law, we obtain a new monoid , and is isomorphic to the submonoid of restricted to its singletons.. This is not possible here.
Indeed, consider the language of regular-trees over the alphabet with , and that contains the closed -systems that have all their leaves carrying the same letter (only ’s or only ’s). Let be its syntactic-algebra777It always exists by generic algebraic arguments., and be the accepting set. For simplicity, we identify the letters with their image in the algebra and use as recognising morphism. We have since for all closed -contexts , if and only if . We also have , , but .
Assume now that we have built a yield-algebra that has as a sub-algebra, and is the accepting part. This implies that . However, when composed with (as in Example 13), we would have by Lemma 19 since but since . A contradiction.
Our argument for avoiding the problem and proving Lemma 20 involves two ideas. Let be a rankwise-finite regular-tree algebra recognising a language of regular-trees .
Idea 1.
Define the map which maps each -set-systems to the set . The arguments of Section 4.2 can be used to show that >profile-equivalence (i.e. the equivalence relation over -set-systems “having the same ”) is a congruence for over -set-systems, i.e. for all -set-contexts and -set-systems of rank that are profile-equivalent, then and are profile-equivalent. It follows that -set-systems quotiented by profile-equivalence is a yield-algebra that recognises .
The problem is that it is not rankwise-finite: profile-equivalence is too fine.
Idea 2.
This is where the second argument is put in action: we consider small resolutions. A resolution is >small if the map satisfies the condition that for each in its codomain, . We define as but considering only small resolutions. Two set-systems are >small–profile-equivalent if they have the same small profile. Note that over each rank, small-profile-equivalence is of finite index, is coarser than profile-equivalence, and over rank profile-equivalence and small-profile-equivalence coincide.
At this point, a key technical result is used: a “context smallification lemma”, that has as consequence that for closed -set-contexts and -set-systems of rank that are small-profile-equivalent, we obtain that and are profile-equivalent. Note the only difference in this statement compared to the analogue one for profile-equivalence, which is that here is assumed to be closed.
The consequence of this is that the >syntactic yield-algebra for (defined in a natural way) is a yield-algebra that recognises , and is coarser than small-profile-equivalence, and hence rankwise-finite. Lemma 20 has been established.
5 The Automaton Property
In this section, we identify a combinatorial property that allows us to translate languages recognisable by yield-algebras to automata or -calculus sentences.
Consider some rankwise-finite yield-algebra , its >deterministic elements are the ones obtainable by composing only elements of the two lowest ranks, i.e. from :
Note that is in fact a sub-yield-algebra of . The name of deterministic element comes from the following application (we assume knowledge of automata over infinite trees: our automata have finite states, use a parity condition with finitely many priorities, but there is no limit on the cardinality of the size of the input alphabet):
Lemma 22.
Let be a language of regular-trees recognised by where maps symbols to deterministic elements. There exists a deterministic top-down parity automaton over infinite trees that accepts the full unfolding of a system if and only if .
Consider now that is used to recognise some language of regular-trees , using the accepting set with . Let be the recognising morphism. The automaton property intuitively states that deterministic elements are sufficient for describing the behaviour of all elements when put in a closed context.
Definition 23.
have the >automaton property if for all closed -set–contexts and every with , there is a deterministic element such that and .
It is easy to verify that, if has the automaton property, setting for all , we get that for all -set-contexts :
(1) |
Let us very informally attempt to explain how this property allows us to build a finite state automaton over infinite trees that coincides with over regular trees. This automaton is defined as follows: it guesses a labelling of the tree by deterministic elements such that for all nodes labelled by some it associates some , and then checks that the resulting infinite tree is accepted by the automaton from Lemma 22.
Let us explain now why this automaton accepts the full unfolding of a system if and only if . For this, we first establish, using inductively Equation 1, that:
-
, if and only if
-
there exists a -system of the same shape as such that for all vertices , and moreover . Let us call such a >direct run over .
The downward implication indicates that if , then the automaton accepts the full unfolding of using the full unfolding of the direct run as a witness of acceptance. Conversely, if the automaton accepts the full unfolding of , this means, by Rabin’s lemma888Rabin’s lemma states that a non-empty regular language of infinite trees contains a regular tree., that there is a regular witness of acceptance over the full unfolding of . This can be rephrased as the existence of a -system such that and is a direct run over some unfolding of . Hence .
If we combine these arguments with the ones of Janin and Walukiewicz, we get:
Lemma 24.
If is a rankwise-finite yield-algebra such that has the automaton property and recognises a bisimulation-invariant language of -regular trees , then is definable by a -calculus sentence.
We obtain this by showing that if we assume that the original language is bisimulation-invariant, then the automaton built via our procedure going through deterministic elements can be turned into an automaton of the shape requested by [13, Theorem 7]. Thus, we obtain a -calculus sentence that defines , using directly [13].
The last missing argument, which is also a key contribution of this work is to show:
Lemma 25.
Let be the syntactic yield-algebra for a language of regular-trees , and be the accepting set, if is rankwise-finite, then has the automaton property.
We recall the general intuition behind syntactic algebra. In general, the syntactic algebra for a language is the minimal one that recognises it, which is reflected by the fact that any two distinct elements in the algebra can be distinguished by the language in some context.
The proof of Lemma 25 is the second place where new arguments are used. Let be a -set-context, and , such that . Our goal is to find a deterministic element such that and . Notice that here we will informally write instead of to lighten notations.
We proceed in two major steps. First, we transform into a “maximally difficult context” such that , and for all . This is possible since is rankwise-finite and ordered by . This also requires decomposition arguments for set-contexts. In particular we show that they can always be refactored into a normal form that is easy to manipulate, as a context of rank in normal form is completely given by a tuple of elements of rank .
Second, we combine the element with the structure of the context in normal form, and crucially make use of roots, to define a -set-system of rank , i.e. whose evaluation is deterministic, and that satisfies . The maximality assumption in the construction of then results in that for all -set-context , implies that is “less or as difficult” as , which in turns implies . By definition of the order , this shows that as desired. Since we also have by construction, and , we obtain since is upward-closed. The automaton property is proved.
6 Conclusion
Our notion of systems is equivalent to notions of tree algebras developed elsewhere in the literature. The notion of set-systems is a kind of extended powerset construction on it, expanded with root vertices, something that is new to the best of our knowledge, though close to more classical powerset constructions. One originality here is mainly that all operations are performed while keeping the structures folded, and only finitely unfolding it when necessary. More crucial is the argument that transforms algebras to yield-algebras, which circumvents the impossibility to have the powerset distribute over the monad of an algebraic theory that has non-linear identities. We are studying further this categorical aspect. Another interesting direction is to see to which extent an algebraic approach can be used to bypass automata for deciding the MSO-theory of infinite trees, i.e. a purely algebraic proof or Rabin’s theorem.
References
- [1] Peter Aczel, Jiří Adámek, and Jiří Velebil. A coalgebraic view of infinite trees and iteration. Electronic Notes in Theoretical Computer Science, 44(1):1–26, 2001. CMCS 2001, Coalgebraic Methods in Computer Science (a Satellite Event of ETAPS 2001). doi:10.1016/S1571-0661(04)80900-9.
- [2] Achim Blumensath. Recognisability for algebras of infinite trees. Theor. Comput. Sci., 412(29):3463–3486, 2011. doi:10.1016/J.TCS.2011.02.037.
- [3] Achim Blumensath. An algebraic proof of rabin’s tree theorem. Theor. Comput. Sci., 478:1–21, 2013. doi:10.1016/J.TCS.2013.01.026.
- [4] Achim Blumensath. The Power-Set Construction for Tree Algebras. Logical Methods in Computer Science, Volume 19, Issue 4, November 2023. doi:10.46298/lmcs-19(4:9)2023.
- [5] Achim Blumensath and Felix Wolf. Bisimulation invariant monadic-second order logic in the finite. Theoretical Computer Science, 823:26–43, 2020. doi:10.1016/j.tcs.2020.03.001.
- [6] Mikołaj Bojańczyk and Tomasz Idziaszek. Algebra for infinite forests with an application to the temporal logic EF. In Mario Bravetti and Gianluigi Zavattaro, editors, CONCUR 2009 - Concurrency Theory, 20th International Conference, CONCUR 2009, Bologna, Italy, September 1-4, 2009. Proceedings, volume 5710 of Lecture Notes in Computer Science, pages 131–145. Springer, 2009. doi:10.1007/978-3-642-04081-8_10.
- [7] Mikołaj Bojańczyk and Bartek Klin. A non-regular language of infinite trees that is recognizable by a sort-wise finite algebra. Logical Methods in Computer Science, Volume 15, Issue 4, November 2019. doi:10.23638/LMCS-15(4:11)2019.
- [8] Facundo Carreiro, Alessandro Facchini, Yde Venema, and Fabio Zanasi. Weak MSO: automata and expressiveness modulo bisimilarity. In Thomas A. Henzinger and Dale Miller, editors, Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS ’14, Vienna, Austria, July 14 - 18, 2014, pages 27:1–27:27. ACM, 2014. doi:10.1145/2603088.2603101.
- [9] Anuj Dawar and Martin Otto. Modal characterisation theorems over special classes of frames. Annals of Pure and Applied Logic, 161(1):1–42, 2009. doi:10.1016/J.APAL.2009.04.002.
- [10] S. Feferman and R. Vaught. The first order properties of products of algebraic systems. Fundamenta Mathematicae, 47(1):57–103, 1959. URL: http://eudml.org/doc/213526.
- [11] Helle Hvid Hansen, Clemens Kupke, and Eric Pacuit. Neighbourhood structures: Bisimilarity and basic model theory. Logical Methods in Computer Science, Volume 5, Issue 2, April 2009. doi:10.2168/LMCS-5(2:2)2009.
- [12] Matthew Hennessy and Robin Milner. On observing nondeterminism and concurrency. In J. W. de Bakker and Jan van Leeuwen, editors, Automata, Languages and Programming, 7th Colloquium, Noordweijkerhout, The Netherlands, July 14-18, 1980, Proceedings, volume 85 of Lecture Notes in Computer Science, pages 299–309. Springer, 1980. doi:10.1007/3-540-10003-2_79.
- [13] David Janin and Igor Walukiewicz. On the expressive completeness of the propositional mu-calculus with respect to monadic second order logic. In Ugo Montanari and Vladimiro Sassone, editors, CONCUR ’96, Concurrency Theory, 7th International Conference, Pisa, Italy, August 26-29, 1996, Proceedings, volume 1119 of Lecture Notes in Computer Science, pages 263–277. Springer, 1996. doi:10.1007/3-540-61604-7_60.
- [14] Faron Moller and Alexander Moshe Rabinovich. On the expressive power of CTL*. In 14th Annual IEEE Symposium on Logic in Computer Science, Trento, Italy, July 2-5, 1999, pages 360–368. IEEE Computer Society, 1999. doi:10.1109/LICS.1999.782631.
- [15] Damian Niwiński. Fixed point characterization of infinite behavior of finite-state systems. Theoretical Computer Science, 189(1):1–69, 1997. doi:10.1016/S0304-3975(97)00039-X.
- [16] Maximilian Pflueger, Johannes Marti, and Egor V. Kostylev. A characterisation theorem for two-way bisimulation-invariant monadic least fixpoint logic over finite structures. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2024, Tallinn, Estonia, July 8-11, 2024, pages 63:1–63:14. ACM, 2024. doi:10.1145/3661814.3662107.
- [17] Alexander Rabinovich. On compositionality and its limitations. ACM Trans. Comput. Log., 8(1):4, 2007. doi:10.1145/1182613.1182617.
- [18] Eric Rosen. Modal logic over finite structures. J. Log. Lang. Inf., 6(4):427–439, 1997. doi:10.1023/A:1008275906015.
- [19] Saharon Shelah. The monadic theory of order. Annals of Mathematics, 102(3):379–419, 1975.
- [20] Johan Van Benthem. Correspondence theory. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic: Volume II: Extensions of Classical Logic, pages 167–247. Springer Netherlands, Dordrecht, 1984. doi:10.1007/978-94-009-6259-0_4.
- [21] Thomas Wilke. Algebras for classifying regular tree languages and an application to frontier testability. In Andrzej Lingas, Rolf Karlsson, and Svante Carlsson, editors, Automata, Languages and Programming, pages 347–358, Berlin, Heidelberg, 1993. Springer Berlin Heidelberg. doi:10.1007/3-540-56939-1_85.