Abstract 1 Introduction 2 Preliminaries 3 Transducing the laminar-tree 4 Transducing modular decompositions 5 Conclusion References

CMSO-Transducing Tree-Like Graph Decompositions

Rutger Campbell Discrete Mathematics Group, Institute for Basic Science, Daejeon, South Korea Bruno Guillon Université Clermont Auvergne, Clermont Auvergne INP, LIMOS, CNRS, Clermont-Ferrand, France Mamadou Moustapha Kanté ORCID Université Clermont Auvergne, Clermont Auvergne INP, LIMOS, CNRS, Clermont-Ferrand, France Eun Jung Kim ORCID KAIST, Daejeon, South Korea
CNRS, France
Noleen Köhler ORCID University of Leeds, UK
Abstract

We show that given a graph G we can CMSO-transduce its modular decomposition, its split decomposition and its bi-join decomposition. This improves results by Courcelle [Logical Methods in Computer Science, 2006] who gave such transductions using order-invariant MSO, a strictly more expressive logic than CMSO. Our methods more generally yield C2MSO-transductions of the canonical decomposition of weakly-partitive set systems and weakly-bipartitive systems of bipartitions.

Keywords and phrases:
MSO-transduction, MSO-definability, graph decomposisions
Funding:
Rutger Campbell: Supported by the Institute for Basic Science (IBS-R029-C1).
Mamadou Moustapha Kanté: Supported by the French National Research Agency (ANR-18-CE40-0025-01 and ANR-20-CE48-0002).
Copyright and License:
[Uncaptioned image] © Rutger Campbell, Bruno Guillon, Mamadou Moustapha Kanté, Eun Jung Kim, and
Noleen Köhler; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Finite Model Theory
Related Version:
Extended Version: https://arxiv.org/abs/2412.04970 [6]
Editors:
Olaf Beyersdorff, Michał Pilipczuk, Elaine Pimentel, and Nguyễn Kim Thắng

1 Introduction

A decomposition of a graph, especially a tree-like decomposition, is a result of recursive separations of a graph and is extremely useful for investigating combinatorial properties such as colourability, and for algorithm design. Such a decomposition also plays a fundamental role when one wants to understand the relationship between logic and a graph class. Different notions of the complexity of a separation motivate different ways to decompose, such as tree-decomposition, branch-decomposition, rank-decomposition and carving-decomposition. Furthermore, some important graph classes can be defined through the tree-like decomposition they admit; cographs with cotrees and distance-hereditary graphs with split decompositions being prominent examples.

For a logic 𝖫, an 𝖫-transduction is a non-deterministic map from a class of relational structures to a new class of relational structures using 𝖫-formulas. Transducing a tree-like decomposition is of particular interest. Notably, transducing a decomposition of a graph implies that any property that is definable using a decomposition, is also definable on graphs that admit such a decomposition. Moreover, tree-like decompositions can be often represented by labelled trees, for which the equivalence of recognisability by a tree automaton and definability in CMSO is well known [8]. Hence, it is an interesting question to consider what kind of graph decompositions can be transduced using 𝖫-transductions for some extension 𝖫 of MSO.

In a series of papers [8, 9, 10, 12], Courcelle investigated the relationship between the graph properties that can be defined in an extension of MSO and the graph properties that can be recognized by a tree automaton. In particular, for graphs of bounded treewidth, Courcelle’s theorem states that any property that is definable in the logic MSO with modulo counting predicate, denoted CMSO, can be recognized by a tree automaton [8]. Combining this result with the linear time algorithm for computing tree-decompositions [1], yields that CMSO model-checking can be done in linear time on graphs of bounded treewidth. The converse statement – whether recognisability by a tree automaton implies definability in CMSO on graphs of bounded treewidth – was conjectured by Courcelle in [8] and finally settled by Bojańczyk and Pilipczuk [4]. The key step to obtain this result is obtaining a tree-decomposition of a graph via an MSO-transduction, a strategy which was initially proposed in [9] and is now standard.

The obvious next question is whether an analogous result can be proved for graphs of bounded clique-width and for more general combinatorial objects, most notably, matroids representable over a fixed finite field and of bounded branch-width. Due to the above-mentioned strategy, the key challenge is to produce corresponding tree-like decompositions by MSO-transduction. It is known that clique-width decompositions can be MSO-transduced for graphs of bounded linear clique-width [3]. However, it is unknown if clique-width decompositions can be MSO-transduced in general. In fact, this question remains open even for distance-hereditary graphs, which are precisely graphs of rank-width 1 (thus, of constant clique-width).

Besides tree-decompositions, the problem of transducing cotrees, and in general hierarchical decompositions such as modular decompositions and split decompositions were considered in the literature [10, 11, 12, 14]. In [10], Courcelle provides transductions using order-invariant MSO for cographs and modular decompositions of graphs of bounded modular width. Order-invariant MSO allows the use of a linear order of the set of vertices and is more expressive than CMSO [20]. The applicability of these transductions was later generalized using the framework of weakly-partitive set systems111Weakly-partitive set systems are set systems enjoying some nice closure properties, which were then used to show that some set systems allow canonical tree representations, see for instance the thesis by Montgolfier and Rao [17, 24]. to obtain order-invariant MSO-transductions of modular and split decompositions [12]. It was left as an open question whether one can get rid of the order (see for instance [13] where an overview of the result on hierarchical decompositions was given).

1.1 Our results

In this paper, we consider decompositions given by separations that do not overlap with any other separations of the same type. We view separations of a given kind as a “set system”. A set system consists of a set U, the universe, and a set 𝒮 of subsets of U. Two sets overlap if they have non-empty intersection but neither of them is contained in the other. If no two elements in a set system (U,𝒮) overlap, i.e. the set system is laminar, then the subset relation in (U,𝒮) can be described by a rooted tree T, called the laminar tree of (U,𝒮). For any set system (U,𝒮) we can look at the subset of strong sets, i.e., sets that do not overlap with any other set, and the laminar tree T they induce. Additionally, we consider systems of bipartitions of a universe U. Two bipartitions of U overlap if neither side of one of the bipartition is contained in either side of the other bipartition. In case (U,) has no overlapping bipartitions, then (U,) can be described by an undirected tree, also called laminar tree, in which bipartitions correspond to edge cuts. We can define the concept of strong bipartitions equivalently and consider the laminar tree induced by the strong bipartitions in (U,).

Given a graph G we can consider the set system (V(G),) where is the set of all modules in G, or the system of bipartition (V(G),𝒮) where 𝒮 contains all splits in G, or the system of bipartitions (V(G),) where contains all bi-joins in G. We obtain the modular decomposition/cotree/split decomposition/bi-join decomposition by equipping the laminar tree of the suitable set system/system of bipartitions with some additional structure. The additional structure allows us to recover the graph from the respective decomposition.

Abstractly, the set systems/systems of bipartitions mentioned are instances of weakly-partitive set systems/weakly-bipartitive systems of bipartitions. Roughly speaking, if a set system (U,𝒮) is well behaved, i.e. (U,𝒮) is weakly-partitive (definition in Section 2), then there is a labelling λ of T and a partial order < of its nodes such that (U,𝒮) is completely described by (T,λ,<) [7, 24]. A similar statement holds for systems of bipartitions [17].

We show the following, wherein each item λ is a suitable labelling of the nodes of the laminar tree T, < is a partial ordering of its nodes, and F is an additional edge relation defined only on pairs of siblings in T. A visualization how results depend on each other is given in Figure 1. Due to space constraints, some proofs are omitted, but they are available in the extended preprint [6].

Theorem 1.

There are non-deterministic C2MSO-transductions τ1,,τ7 such that:

  1. 1.

    For any laminar set system (U,𝒮), τ1(U,𝒮) is non-empty and every output in τ1(U,𝒮) is a laminar tree T of (U,𝒮) (Theorem 2);

  2. 2.

    For any graph G, τ2(G) is non-empty and every output in τ2(G) is a modular decomposition (T,F) of G (Theorem 18);

  3. 3.

    For any cograph G, τ3(G) is non-empty and every output in τ3(G) is a cotree (T,λ) of G (Corollary 20);

  4. 4.

    For any graph G, τ4(G) is non-empty and every output in τ4(G) is a split decomposition (T,F) of G [6, Theorem 29];

  5. 5.

    For any graph G, τ5(G) is non-empty and every output in τ5(G) is a bi-join decomposition (T,F) of G [6, Theorems 32 & 37];

  6. 6.

    For any weakly-partitive set system (U,𝒮), τ6(U,𝒮) is non-empty and every output in τ6(U,𝒮) is a weakly-partitive tree (T,λ,<) of (U,𝒮) (Theorem 15);

  7. 7.

    For any weakly-bipartitive set system (U,), τ7(U,) is non-empty and every output in τ7(U,) is a weakly-bipartitive tree (T,λ,<) of (U,) [6, Theorem 24].

The key step in obtaining these transductions is to transduce the laminar tree T of a set system (U,𝒮), namely Theorem 2. The crux here is to find a suitable representative of each node of T amongst the elements of U and a non-deterministic coloring which allows the assignment of representatives to nodes by means of a C2MSO-formula. It should be mentioned that a similar result is claimed in the preprint [2], where a proof sketch designing a C3MSO-transduction is described. Once the laminar tree is obtained, the additional relations for each decomposition can be obtained using a deterministic MSO-transduction. Notice that for each of these transductions, there exists an inverse deterministic MSO-transduction, namely a transduction that from the tree-like decomposition outputs the original structure.

Figure 1: Overview of the various transductions in the paper. An arrow from x to y indicates that result x is used in the proof of result y.

1.2 Organization

The paper is organized as follows. In Section 2 we introduce terminology and notation needed. In Section 3 we prove Theorem 2. In Section 4 we provide the proof of Theorems 15 and 18 and obtain Corollary 20.

2 Preliminaries

2.1 Graphs, trees, set systems

Graphs.

We use standard terminology of graph theory, and we fix some notations. Given a directed graph G, its sets of vertices and edges are denoted by V(G) and E(G), respectively. We denote by uv an edge (u,v)E(G). An undirected graph is no more than a directed graph for which E(G) is symmetric (i.e., uvE(G)vuE(G)). The notions of paths, connected components, etc… are defined as usual. Given a subset Z of V(G), we denote by G[Z] the sub-graph of G induced by Z.

Trees.

A tree is a connected undirected graph without cycles. In the context of trees, we use a slightly different terminology than for graphs. In particular, vertices are called nodes, nodes of degree at most 1 are called leaves, and nodes of degree greater than 1 are called inner. The set of leaves is denoted L(T); thus the set of inner nodes is V(T)L(T). For a node t of a tree T and a neighbor s of t, we denote by Tst the connected component of Tt containing s. We sometimes consider rooted trees, namely trees with a distinguished node, called the root. Rooted trees enjoy a natural orientation of their edges toward the root, which induces the usual notions of parent, child, sibling, ancestor and descendant. Hence, we represent a rooted tree by a set of nodes with an ancestor/descendant relationship (instead of specifying the root). We use the convention that every node is one of its own ancestors and descendants. We refer to ancestors (resp. descendants) of a node that are not the node itself as proper ancestors (resp. proper descendants). For a node t of a rooted tree T, we denote by Tt the subtree of T rooted in t (i.e., the restriction of T to the set of descendants of t).

Set systems and laminar trees.

A set system is a pair (U,𝒮) where U is a finite set, called the universe, and 𝒮 is a family of subsets of U where 𝒮, U𝒮, and {a}𝒮 for every aU.222Though these restrictions on 𝒮 are not usual for set systems, it is convenient for our contribution and it does not significantly impact the generality of set systems: every family  of subsets of U can be associated with a set system (U,𝒮) where 𝒮=({}){U}{{a}aU}. Two sets X and Y overlap if they are neither disjoint nor related by containment. A set system (U,𝒮) is said to be laminar (aka overlap-free) when no two sets from 𝒮 overlap. By extension, a set family 𝒮 is laminar if (𝒮,𝒮) is a laminar set system (note this also requires that 𝒮, 𝒮𝒮, and {a}𝒮 for every a𝒮).

A laminar family 𝒮 of subsets of U naturally defines a rooted tree where the nodes are the sets from 𝒮, the root is U, and the ancestor relation corresponds to set inclusion. We call this rooted tree the laminar tree of U induced by 𝒮 (or laminar tree of (U,𝒮)). In this rooted tree, the leaves are the singletons {x} for xU, which we identify with the elements themselves. That is to say, L(T)=U. Laminar trees have the property that each inner node has at least two children. Observe also that the size of a laminar tree is linearly bounded in the size of the universe.

2.2 Logic and transductions

We use relational structures to model both graphs and the various tree-like decompositions used in this paper. In order to concisely model set systems we use the more general notion of extended relational structures, namely relational structure extended with set predicate names. Such structures also naturally arise as outputs of MSO-transductions defined below.

Define an (extended) vocabulary to be a set of symbols, each being either a relation name R with associated arity ar(R), or a set predicate name P with associated arity ar(P). Set predicate names are aimed to describe relations between sets, e.g., one may have a unary set predicate for selecting finite sets of even size, or a binary set predicate for selecting pairs of disjoint sets. We use capital R or names starting with a lowercase letter (e.g., 𝖾𝖽𝗀𝖾, 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋, 𝗍-𝖾𝖽𝗀𝖾) for relation names, and capital P or uppercase names (e.g., 𝖲𝖤𝖳, C2) for set predicate names. To refer to an arbitrary symbol of undetermined kind, we use capital Q. A relational vocabulary is an extended vocabulary in which every symbol is a relation name.

Let Σ be a vocabulary. An extended relational structure over Σ (Σ-structure) is a structure 𝔸=U𝔸,(Q𝔸)QΣ consisting on the one hand of a set U𝔸 called universe, and on the other hand, for each symbol Q in Σ, an interpretation Q𝔸 of Q, which is a relation of arity ar(Q) either over the universe if Q is a relation name, or over the family of subsets of the universe if Q is a set predicate name. When Σ is not extended, 𝔸 is simply a relational structure.

Given a Σ-structure 𝔸 and, for some vocabulary Γ, a Γ-structure 𝔹, we write 𝔸𝔹 if ΣΓ, U𝔸U𝔹 and for each symbol Q in Σ, Q𝔸=Q𝔹.333We require equality here (in particular only elements or subsets of U𝔸 are related in Q𝔹). This differs from classical notions of inclusions of relational structures which typically require equality only on the restriction of the universe to U𝔸, i.e., Q𝔸=Q𝔹/U𝔸, e.g., in order to correspond to induced graphs. We write 𝔸𝔹 to denote the (ΣΓ)-structure consisting of the universe U𝔸U𝔹 and, for each symbol QΣΓ, the interpretation Q𝔸𝔹 which is Q𝔸, Q𝔹, or Q𝔸Q𝔹 according to whether Q belongs to ΣΓ, to ΓΣ, or to ΣΓ.444We do not require 𝔸 and 𝔹 to be disjoint structures whence we may have 𝔸𝔸𝔹.

To describe properties of (extended) relational structures, we use monadic second order logic (MSO) and refer for instance to [15, 19, 22, 25] for the definition of MSO on extended relational structures such as matroids or set systems in general. This logic allows quantification both over single elements of the universe and over subsets of the universe. We also use counting MSO (CMSO), which is the extension of MSO with, for every positive integer p, a unary set predicate Cp that checks whether the size of a subset is divisible by p or not. We only use C2. As usual, lowercase variables indicates first-order-quantified variables, while uppercase variables indicates monadic-quantified variables. For a formula ϕ, we write, e.g.ϕ(x,y,X) to indicate that the variables x, y, and X belong to the set of free variables of ϕ, namely, the set of variables occurring in ϕ that are not bound to a quantifier within ϕ. A sentence is a formula without free variables.

We now fix some (extended) vocabularies that we will use.

Graphs.

To model both graphs, unrooted trees, and directed graphs, we use the relational vocabulary {𝖾𝖽𝗀𝖾} where 𝖾𝖽𝗀𝖾 is a relation name of arity 2. A (directed) graph G=(V,E) is modeled as the {𝖾𝖽𝗀𝖾}-structure 𝔾 with universe U𝔾=V and interpretation 𝖾𝖽𝗀𝖾𝔾=E. In particular if G is undirected, then 𝖾𝖽𝗀𝖾𝔾 is symmetric.

Rooted trees.

We use the relational vocabulary {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋} to model rooted trees where 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋 is a relation name of arity 2. A rooted tree T is modeled as the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋}-structure 𝕋 with universe U𝕋=V(T) and the interpretation 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋𝕋 being the set of pairs (u,v) such that u is an ancestor of v in T. It is routine to define FO-formulas over this vocabulary to express the binary relations parent, child, proper ancestor, (proper) descendant, as well as the unary relations leaf and root.

Set systems.

To model set systems, we use the extended vocabulary {𝖲𝖤𝖳} where 𝖲𝖤𝖳 is a set predicate name of arity 1. A set system S=(U,𝒮) is thus naturally modeled as the {𝖲𝖤𝖳}-structure 𝕊 with universe U𝕊=U and interpretation 𝖲𝖤𝖳𝕊=𝒮.

Transductions

Let Σ and Γ be two extended vocabularies. A Σ-to-Γ transduction is a set τ of pairs formed by a Σ-structure, call the input, and a Γ-structure, called the output. We write 𝔹τ(𝔸) when (𝔸,𝔹)τ. When for every pair (𝔸,𝔹)τ we have 𝔸𝔹, we call τ an overlay transduction. Some transductions can be defined by means of MSO- or C2MSO-formulas. This leads to the notion of MSO- and C2MSO-transductions. Following the presentation of [3], for 𝖫 denoting MSO or C2MSO, define an 𝖫-transduction to be a transduction obtained by composing a finite number of atomic 𝖫-transductions of the following kinds.

Filtering.

An overlay transduction specified by an 𝖫-sentence ϕ over the input vocabulary Σ, which discards the inputs that do not satisfy ϕ and keeps the other unchanged. Hence, it defines a partial function (actually, a partial identity) from Σ-structures to Σ-structures.

Universe restriction.

A transduction specified by a 𝖫-formula ϕ over the input vocabulary Σ, with one free first-order variable, which restricts the universe to those elements that satisfy ϕ. The output vocabulary is Σ and the interpretation of every relation (resp. every predicate) in the output structure is defined as the restriction of its interpretation in the input structure to those tuples of elements satisfying ϕ (resp. tuples of sets of elements that satisfy ϕ). This defines a total function from Σ-structures to Σ-structures.

Interpretation.

A transduction specified by a family (ϕQ)QΓ over the input vocabulary Σ where Γ is the output vocabulary and each ϕQ has ar(Q) free variables which are first-order if Q is a relation name and monadic if it is a set predicate name. The transduction outputs the Γ-structure that has the same universe as the input structure and in which each relation or predicate Q is interpreted as those set of tuples that satisfy ϕQ. This defines a total function from Σ-structures to Γ-structures.

Copying.

An overlay transduction parametrized by a positive integer k that adds k copies of each element to the universe. The output vocabulary consists in the input vocabulary Σ extended with k binary relational symbols (𝖼𝗈𝗉𝗒i)i[k] interpreted as pairs of elements (x,y) saying that “y is the i-th copy of x”. The interpretation of the relations (resp. predicates) of the input structure are preserved, on original elements. This defines a total function from Σ-structures to Γ-structures, where Γ=Σ{𝖼𝗈𝗉𝗒ii[k]}.

Colouring.

An overlay transduction that adds a new unary relation 𝖼𝗈𝗅𝗈𝗋Σ to the structure. Any possible interpretation yields an output; indeed the interpretation is chosen non-deterministically. The interpretation of the relations (resp. predicates) of the input structure are preserved. Hence, it defines a total (non-functional) relation from Σ-structures to Γ-structures where Γ=Σ{𝖼𝗈𝗅𝗈𝗋} (providing 𝖼𝗈𝗅𝗈𝗋Σ).

We say an 𝖫-transduction is deterministic if it does not use colouring, it is non-deterministic otherwise. By definition, deterministic 𝖫-transductions define functions.

3 Transducing the laminar-tree

In this section, we present an overlay C2MSO-transduction that takes as input a laminar set system and outputs the laminar tree it induces.

Theorem 2.

Let Σ be an extended vocabulary, including a unary set predicate name 𝖲𝖤𝖳 and not including the binary relational symbol 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋. There exists a non-deterministic C2MSO-transduction τ such that, for each laminar set system (U,𝒮) represented as the {𝖲𝖤𝖳}-structure 𝕊 and inducing the laminar tree T with L(T)=U, and for each Σ-structure 𝔸 with 𝕊𝔸, τ(𝔸) is non-empty and every output in τ(𝔸) is equal to some {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋}-structure 𝕋 representing T.

Since the sets from 𝒮 are precisely the sets of leaves of the subtrees of T, there is an MSO-transduction which is the inverse of the above C2MSO-transduction; that is to say, given an {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋}-structure 𝕋 representing the laminar tree, it outputs the original set system 𝕊. Namely,

  1. 1.

    An interpretation Φ𝖲𝖤𝖳(S) that is true for a set S when there exists an element a such that an element x is in S if and only if a is an ancestor of x.

  2. 2.

    The filtering ϕ(x) keeping leaves only.

We prove Theorem 2 in Section 3.2, using the key tools developed in Section 3.1, that allow us to represent each inner node of T with a pair of leaves from its subtree, while keeping the number of inner nodes represented by a given leaf bounded.

3.1 Inner node representatives

In this section, the root of any rooted tree is always an inner node (unless the tree is a unique node). We fix a rooted tree T, in which every inner node has at least two children (a necessary assumption that is satisfied by laminar trees), and we let V denote the set of its nodes (V=V(T)) and LV denote the subset of its leaves (L=L(T)).

Let SVL, and let (π,σ) be a pair of injective mappings from S to L. We say that the pair (π,σ) identifies S if for each sS, s is the least common ancestor of π(s) and σ(s). For sS and xV, we say that x is s-requested in (π,σ) if x lies on the path from π(s) to σ(s) (namely, on either of the paths from π(s) to s and from σ(s) to s). We say that x is requested in (π,σ) if it is s-requested for some s. The pair (π,σ) has unique request if every node x of T is requested at most once in (π,σ), i.e., there exists at most one sS such that x is s-requested in (π,σ). Note that if (π,σ) has unique request then the paths from π(s) to σ(s) are pairwise disjoint for all the sS. We now state a few basic observations.

 Remark 3.

Let (π,σ) identifying some subset S of VL.

  1. 1.

    The reversed pair (σ,π) also identifies S and has unique request whenever (π,σ) does.

  2. 2.

    If SS, then (π|S,σ|S) identifies S, and has unique request if (π,σ) does.

  3. 3.

    For each sS, s is s-requested in (π,σ).

  4. 4.

    If (σ,π) has unique request, then π(S) and σ(S) are disjoint subsets of L.

Lemma 4.

Let (π,σ) identifying some subset S of VL with unique request. Then for each aπ(S) the node π1(a) is the least ancestor of a which belongs to S.

Proof.

Let aπ(S) and let s=π1(a). By definition, s is an ancestor of a which belongs to S. Let y be the least ancestor of a that is contained in S. As s is an ancestor of a belonging to S, y must be a descendant of s. Hence, y is s-requested in (π,σ). Additionally, by Item 3 of Remark 3, y is y-requested in (π,σ). Since (π,σ) has unique request, y=s. Thus, s is the least ancestor of a which belongs to S.

Notice that, by Item 1 of Remark 3, a similar result holds for each bσ(S). It follows that the sets π(S) and σ(S) characterize (π,σ).

Lemma 5.

Let SVL, and (π,σ) and (π,σ) be two pairs of injections from S to L identifying S with unique request. If π(S)=π(S) and σ(S)=σ(S) then π=π and σ=σ.

Proof.

Let sS, a=π(s), and s=π1(a). Both s and s are the least ancestor of a which belongs to S, hence s=s and π(s)=a. Thus π=π. By Item 1 of Remark 3, we also obtain that σ=σ.

Let A and B be two disjoint subsets of L. We call such a pair (A,B) a bi-colouring. We say that (A,B) identifies S if there exists a pair (π,σ) identifying S with unique request such that π(S)=A and σ(S)=B. By the previous lemma, for a fixed set SVL the pair (π,σ) identifying S is unique when it exists. We will also prove that S is actually uniquely determined from (A,B). Before that, we state the following technical lemma.

Lemma 6.

Let (A,B) identify some subset S of VL through a pair (π,σ) of injections having unique request. Then, for each inner node x, exactly one of the three following cases holds:

  1. 1.

    xS, x is not requested in (π,σ), and for each child c of x, |AV(Tc)|=|BV(Tc)|;

  2. 2.

    xS, x is requested in (π,σ), and there exists one leaf z(AB)V(Tx) such that, for each child c of x, |(A{z})V(Tc)|=|(B{z})V(Tc)|;

  3. 3.

    xS, x is requested in (π,σ), and there exists two leaves aAV(Tx) and bBV(Tx) such that, for each child c of x, |(A{a})V(Tc)|=|(B{b}V(Tc))| and {a,b}V(Tc).

Proof.

Let x be an inner node. We consider the set S=S(V(Tx){x}) of all nodes which are not proper descendants of x and the restrictions π and σ of, respectively, π and σ to S. By Item 2 of Remark 3 (π,σ) identify S with unique request. We denote A=π(S) and B=σ(S), thus (A,B) identify S. Let Ax=AV(Tx) and Bx=BV(Tx) be the sets of representatives contained in Tx of nodes in S. Let a be an element of Ax. The element sa=π1(a) is an ancestor of x because it is an ancestor of aV(Tx) and belongs to S whence not to V(Tx){x}. Therefore, x is sa-requested. As (π,σ) has unique request, there exists at most one element in Ax. Similarly, Bx has size at most 1. Moreover, AxBx= by Item 4 of Remark 3. We thus we have three cases:

Case AxBx=.

Then there is no sS such that π(s)V(Tx) or σ(s)V(Tx). Hence, x is not requested in (π,σ), in particular, xS.

Let c be a child of x. If |AV(Tc)||BV(Tc)|, then there exists a(AB)V(Tc) such that, assuming without loss of generality that aA and denoting sa=π1(a), σ(sa)V(Tc). Hence, sa is a proper ancestor of c, thus, equivalently, an ancestor of x, implying that x is sa-requested in (π,σ). This contradicts the above argument. So, |AV(Tc)|=|BV(Tc)| for each child c of x.

Case AxBx={a}.

Assume, without loss of generality, that aA, and denote sa=π1(a) and b=σ(sa). We have that sa is a proper ancestor of x, since it is the least common ancestor of aV(Tx) and bV(Tx). Thus, x is sa-requested and sax so xS.

Let c be a child of x. If |AV(Tc)||BV(Tc)|, then it means that there exists z(AB)V(Tc), such that either zA and sz=π1(z)V(Tc), or zB and sz=σ1(z)V(Tc). In both cases, sz is a proper ancestor of c, whence an ancestor of x. Thus, x is sz-requested in (π,σ). However, x is sa-requested in (π,σ) which has unique request, hence sz=sa. If zB, it follows that z=b, which contradicts the fact bV(Tx). Hence, zA and thus, z=π(sa)=a. Therefore, |(A{a})V(Tc)|=|BV(Tc)| for each child c of x.

Case Ax={a} and Bx={b}.

Let sa=π1(a) and sb=σ1(b). The node x is sa-requested and sb-requested in (π,σ), so, by the unique request property, sa=sb. Since sa is the least common ancestor of a and b, it belongs to V(Tx) whence sa=x implying xX.

Let c be a child of x. If |AV(Tc)||BV(Tc)|, then there exists z(AB)V(Tc) such that either zA and sz=π1(z)V(Tc), or zB and sz=σ1(z)V(Tc). In both cases, sz is a proper ancestor of c, whence an ancestor of x, and thus x is sz-requested in (π,σ). However, x is x-requested in (π,σ) which has unique request, hence sz=x and thus z{a,b}. Therefore, |(A{a})V(Tc)|=|(B{b})V(Tc)| for each child c of x. Because the least common ancestor of a and b is x, there is no child c of x containing both a and b as leaves, i.e., {a,b}V(Tc).

This concludes the proof of the statement.

It follows that for each bi-colouring (A,B), there exists at most one set S of inner nodes identified by (A,B).

Lemma 7.

Let (A,B) be two disjoint subsets of L and let S and S be two subsets of VL. If (A,B) identify both S and S, then S=S.

Proof.

We proceed by contradiction and thus assume SS. Let sSS. By Lemma 6, the number of children c of s for which the set (AB)V(Tc) has odd size is 2 since sS, while it should also be less than 2 since sS. A contradiction.

When (A,B) identifies a set S, we call A-representative (resp. B-representative) of sS the leaf π(s)A (resp. σ(s)B), where (π,σ) witnessing that (A,B) identifies S. An example of bi-colouring identifying a subset of inner nodes is given in Figure 2.

Figure 2: Illustration of a bi-colouring (A,B) identifying some set SVL in a binary tree T. Leaves from A are coloured blue, leaves from B are coloured red, and inner nodes from S are marked purple. Furthermore, the two paths connecting an inner node sS to its A- and B-representatives are coloured blue and red, respectively.

Not every set of inner nodes has a bi-colouring identifying it. To ensure that such a pair exists, we consider thin sets of inner nodes. While thin sets always have bi-colourings identifying them, it is also guaranteed that the set of inner nodes can be partitioned into only 4 thin sets. A subset XVL is thin when, for each xX not being the root, on the one hand, the parent px of x does not belong to X, and on the other hand, x admits at least one sibling (including possible leaves) that does not belong to X. Having a thin set X allows to find branches avoiding it.

Lemma 8.

Let XVL and sVX. If X is thin, then there exists a leaf tV(Ts) such that the path from t to s avoids X (i.e., none of the nodes along this path belong to X).

Proof.

If Ts has height 0, then s is a leaf and taking t=s trivially gives the expected path. Otherwise, s is an inner node and, because X is thin, s has at least one child cs not belonging to X. By induction, there is a path from some leaf tV(Tcs) to cs avoiding X and, since sX, this path could be extended into a path from t to s avoiding X.

The previous lemma allows to identify every thin set.

Lemma 9.

If X is a thin set, then there exists a pair (π,σ) of injections from X to L that has unique request and that identifies X.

Proof.

We proceed by induction on the size of X. If X=, the result is trivial. Let n and suppose that for every thin set of size n there exists a pair of injections identifying it with unique request. Let X be a thin set of size n+1, and let sX be of minimal depth. Clearly, X{s} is thin and thus there exists, by induction, a pair (π,σ) of injections from X{s} to L identifying X{s} with unique request. Since X is thin and sX, we can find two distinct children ca and cb of s not belonging to X.555Remember that every inner node of T has at least two children (including possible leaves). Then, by Lemma 8, there exists a leaf aV(Tca) (resp. a leaf bV(Tcb)) such that the path from a to ca (resp. from b to cb) avoids X. In particular, for each node y along these paths, since y has no ancestor that belongs to X but s, y is not requested in (π,σ). Hence, extending π (resp. σ) in such a way that, besides mapping each xX{s} to π(x) (resp. to σ(x)), it maps s to a (resp. to b), we obtain a pair (π^,σ^) of injections from X to L that identifies X with unique request.

A family F=(A1,B1),,(An,Bn) of bi-colourings identifies a set SVL, if there exists a partition (S1,,Sn) of S such that, for each i[n], (Ai,Bi) identifies Si. Whenever S=VL we say that F identifies T. A collection of subsets of VL is thin if each of its subsets is thin. We now show that there exists a thin 4-partition.

Lemma 10.

There exists a thin 4-partition of VL.

Proof.

We build such a thin 4-partition as follows. First, consider the partition (De,Do) of VL in which De (resp. Do=V(DeL)) is the set of all inner nodes of even (resp. odd) depth. Second, arbitrarily fix one child cx of x for each inner node x, and consider the set C={cxxVL}L, inducing a partition (C,C¯) of VL where C¯=V(LC). By refining these two bi-partitions, we obtain a 4-partition which is thin by construction.5

Hence, four bi-colourings are enough to identify T. For an example see Figure 3.

Corollary 11.

There exists a family of four bi-colourings identifying T.

Proof.

The result immediately follows from Lemmas 9 and 10.

Figure 3: Illustration of a partition of the inner nodes into 4 thin sets indicated by different shapes/fillings. The four colourings identifying each thin set are indicated in the table below the leaves.

3.2 The transduction

The goal of this section is to prove Theorem 2, that is, to design a C2MSO-transduction that produces the laminar tree induced by an input laminar set system. We fix a laminar set system (U,𝒮), represented by a {𝖲𝖤𝖳}-structure 𝕊. Before proving the theorem, we make the following basic observation. On 𝕊, we can define two MSO-formulas 𝖽𝖾𝗌𝖼(X,Y) and 𝖼𝗁𝗂𝗅𝖽(X,Y) expressing that in the laminar tree induced by 𝒮, X and Y are nodes and X is a descendant or a child of Y, respectively:

𝖽𝖾𝗌𝖼(X,Y) :=𝖲𝖤𝖳(X)𝖲𝖤𝖳(Y)XY;
𝖼𝗁𝗂𝗅𝖽(X,Y) :=𝖽𝖾𝗌𝖼(X,Y)XYZ((𝖽𝖾𝗌𝖼(Z,Y)ZY)𝖽𝖾𝗌𝖼(Z,X)).

The key point of our construction consists in defining a C2MSO-formula 𝗋𝖾𝗉𝗋A,B(a,X) which, assuming a bi-colouring (A,B) of the universe modeled as disjoint unary relations and identifying a subset S of inner nodes of T, is satisfied exactly when XS and a is its A-representative.

Lemma 12.

Let (A,B) be a bi-colouring of L(T) identifying a subset S of inner nodes of T. There exists a C2MSO-formula 𝗋𝖾𝗉𝗋A,B(a,X) that is satisfied exactly when X is an inner node of T that belongs to S and is A-represented by a.

Proof.

First, we define a formula ϕA,B(X) that, under the above assumption, is satisfied exactly when X belongs to S. According to Lemma 6, this happens if and only if X is a node of T (i.e.𝖲𝖤𝖳(X) is satisfied) and there exists aXA and bXB such that for each child Z of X, {a,b}Z and the set (Z{a,b})(AB) has even size. This property is easily expressed in C2MSO, using the MSO-formula 𝖼𝗁𝗂𝗅𝖽(X,Y) defined previously, as well as the predicate 𝖲𝖤𝖳:

ϕA,B(X):= 𝖲𝖤𝖳(X)ab[a(XA)b(XB)
Z(𝖼𝗁𝗂𝗅𝖽(Z,X)({a,b}ZC2((Z{a,b})(AB))))].

Now, we can easily define 𝗋𝖾𝗉𝗋A,B(a,X) based on Lemma 4:

𝗋𝖾𝗉𝗋A,B(a,X):= ϕA,B(X)a(XA)ZX(aZ¬ϕA,B(Z)).

This concludes the proof.

We are now ready to prove the theorem.

Proof of Theorem 2.

The C2MSO-transduction is obtained by composing the following atomic C2MSO-transductions. The transduction makes use of the formulas 𝗋𝖾𝗉𝗋A,B(a,X) given by Lemma 12.

  1. 1.

    Guess a family of four bi-colourings (Ai,Bi)i[4] identifying T (which exists by Corollary 11).

  2. 2.

    Copy the input graph four times, thus introducing four binary relations (𝖼𝗈𝗉𝗒i)i[4] where 𝖼𝗈𝗉𝗒i(x,y) indicates that x is the i-th copy of the original element y.

  3. 3.

    Filter the universe keeping only the original elements as well as the i-th copy of each vertex a for which there exists X such that 𝗋𝖾𝗉𝗋Ai,Bi(a,X).

  4. 4.

    Define the relation 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋(x,y) so that it is satisfied exactly when there exist x, X, i, and Y such that, on the one hand 𝖽𝖾𝗌𝖼(Y,X) and 𝖼𝗈𝗉𝗒i(x,x)𝗋𝖾𝗉𝗋Ai,Bi(x,X), and, on the other hand, either y is an original element and Y={y}, or there exists y and j such that 𝖼𝗈𝗉𝗒j(y,y)𝗋𝖾𝗉𝗋Aj,Bj(y,Y).

4 Transducing modular decompositions

The set of modules of a directed graph is a specific example of a particular type of set system, a “weakly-partitive set system” (and a “partitive set system” in the case of an undirected graph). In this section, we first give a general C2MSO-transduction to obtain the canonical tree-like decomposition of a weakly-partitive set system from the set system itself. We then show how to obtain the modular decomposition of a graph via a C2MSO-transduction as an application.

4.1 Transducing weakly-partitive trees

A set system (U,𝒮) is weakly-partitive if for every two overlapping sets X,Y𝒮, the sets XY, XY, XY, and YX belong to 𝒮. It is partitive if, moreover, for every two overlapping sets X,Y𝒮, their symmetric difference, denoted XY, also belongs to 𝒮. By extension, a set family 𝒮 is called weakly-partitive or partitive whenever (𝒮,𝒮) is a set system which is weakly-partitive or partitive, respectively (note these also requires that 𝒮, 𝒮𝒮, and {a}𝒮 for every a𝒮).

A member of a set system (U,𝒮) is said to be strong if it does not overlap any other set from 𝒮. The sub-family 𝒮! of strong sets of 𝒮 is thus laminar by definition. Hence, it induces a laminar tree T. By extension, we say that T is induced by the set system (U,𝒮) (or simply by 𝒮). The next result extends Theorem 2, by showing that T can be C2MSO-transduced from 𝒮.

Lemma 13.

Let Σ be an extended vocabulary, including a set unary predicate name 𝖲𝖤𝖳 and not including the binary relational symbol 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋. There exists a non-deterministic C2MSO-transduction τ such that, for each set system (U,𝒮) represented as the {𝖲𝖤𝖳}-structure 𝕊 and inducing the laminar tree T with L(T)=U, and for each Σ-structure 𝔸 with 𝕊𝔸, τ(𝔸) is non-empty and every output in τ(𝔸) is equal to some {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋}-structure 𝕋 representing T.

Proof.

On the {𝖲𝖤𝖳}-structure 𝕊, it is routine to define an MSO-formula ϕ𝖲𝖤𝖳!(Z) that identifies those subsets ZU that are strong members of 𝒮:

ϕ𝖲𝖤𝖳!(Z) :=𝖲𝖤𝖳(Z)X((𝖲𝖤𝖳(X)XZ)(XZZX)).

Hence, we can design an MSO-interpretation that outputs the Σ{𝖲𝖤𝖳!}-structure corresponding to 𝔸 equipped with the set unary predicate 𝖲𝖤𝖳! that selects strong members of 𝒮. Thus, up to renaming the predicates 𝖲𝖤𝖳! and 𝖲𝖤𝖳, by Theorem 2, we can produce, through a C2MSO-transduction, the Σ{𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋}-structure 𝔸𝕋 where 𝕋 is the tree-structure modeling the laminar tree T induced by 𝒮! with L(T)=U (once the output obtained, the interpretation drops the set predicate 𝖲𝖤𝖳! which is no longer needed).

Clearly, the laminar tree T of a weakly-partitive set system (U,𝒮) does not characterize (U,𝒮). However, as shown by the below theorem, a labeling of its inner nodes and a controlled partial ordering of its nodes are sufficient to characterize all the sets of 𝒮. For Z a set equipped with a partial order < and X a subset of Z, we say that X is a <-interval whenever < defines a total order on X and for every a,bX and every cZ, a<c<b implies cX.

Theorem 14 ([7, 24]).

Let 𝒮 be a weakly-partitive family, 𝒮! be its subfamily of strong sets, and T be the laminar tree it induces. There exists a total labeling function λ from the set V(T)L(T) of inner nodes of T to the set {𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾,𝗉𝗋𝗂𝗆𝖾,𝗅𝗂𝗇𝖾𝖺𝗋}, and, for each inner node tλ1(𝗅𝗂𝗇𝖾𝖺𝗋), a linear ordering <t of its children, such that every inner node having exactly two children is labeled by 𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾 and the following two conditions are satisfied:

  • for each X𝒮𝒮!, there exists tV(T) and a subset 𝒞 of children of t such that X=c𝒞L(Tc) and either λ(t)=𝗅𝗂𝗇𝖾𝖺𝗋 and 𝒞 is a <t-interval, or λ(t)=𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾;

  • conversely, for each inner node t and each non-empty subset 𝒞 of children of t, if either λ(t)=𝗅𝗂𝗇𝖾𝖺𝗋 and 𝒞 is a <t-interval, or λ(t)=𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾, then c𝒞L(Tc)𝒮.

Furthermore, T and λ are uniquely determined from 𝒮, and, for each inner node t of T labeled by 𝗅𝗂𝗇𝖾𝖺𝗋, only two orders <t are possible, one being the inverse of the other (indeed, inverting an order < does preserve the property of being a <-interval). Hence, every weakly-partitive family 𝒮 is characterized by a labeled and partially-ordered tree (T,λ,<) where T is the laminar tree induced by the subfamily 𝒮! of strong sets of 𝒮, λ:V(T)L(T){𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾,𝗉𝗋𝗂𝗆𝖾,𝗅𝗂𝗇𝖾𝖺𝗋} is the labeling function, and < is the partial order tλ1(𝗅𝗂𝗇𝖾𝖺𝗋)<t over V(T). As, up-to inverting some of the <t orders, (T,λ,<) is unique, we abusively call it the weakly-partitive tree induced by 𝒮. Conversely, a weakly-partitive tree characterizes the unique weakly-partitive set system which induced it.

We naturally model a weakly-partitive tree (T,λ,<) of a partitive set system (U,𝒮) by the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾,𝖻𝖾𝗍𝗐𝖾𝖾𝗇𝖾𝗌𝗌}-structure 𝕋 of universe U𝕋=V(T) such that V(T),𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋𝕋𝕋 models T with L(T)=U, 𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾𝕋 is a unary relation which selects the inner nodes of T of label 𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾, i.e., 𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾𝕋=λ1(𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾), and 𝖻𝖾𝗍𝗐𝖾𝖾𝗇𝖾𝗌𝗌𝕋 is a ternary relation selecting triples (x,y,z) satisfying x<y<z or z<y<x. (Although it is possible, through a non-deterministic MSO-transduction, to define < from 𝖻𝖾𝗍𝗐𝖾𝖾𝗇𝖾𝗌𝗌𝕋, the use of 𝖻𝖾𝗍𝗐𝖾𝖾𝗇𝖾𝗌𝗌𝕋 rather than <𝕋 ensures unicity of the output weakly-partitive tree.) The inner nodes of T that are labeled by 𝗅𝗂𝗇𝖾𝖺𝗋 can be recovered through an MSO-formula as those inner nodes whose children are related by 𝖻𝖾𝗍𝗐𝖾𝖾𝗇𝖾𝗌𝗌. The inner nodes labeled by 𝗉𝗋𝗂𝗆𝖾 can be recovered through an MSO-formula as those inner nodes that are labeled neither by 𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾 nor by 𝗅𝗂𝗇𝖾𝖺𝗋. Using Theorem 14, it is routine to design an MSO-transduction which takes as input a weakly-partitive tree and outputs the weakly-partitive set system which induced it. The inverse C2MSO-transduction is the purpose of the next result. The proof is omitted due to space constraints.

Theorem 15.

There exists a non-deterministic C2MSO-transduction τ such that, for every weakly-partitive set system (U,𝒮) represented as the {𝖲𝖤𝖳}-structure 𝕊 and inducing the weakly-partitive tree (T,λ,<) represented as the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾,𝖻𝖾𝗍𝗐𝖾𝖾𝗇𝖾𝗌𝗌}-structure 𝕋, we have 𝕋τ(𝕊) and every output in τ(𝕊) is a weakly-partitive tree of (U,𝒮).

If 𝒮 is partitive then the weakly-partitive tree it induces enjoys a simple form, and is truly unique. Indeed, the label 𝗅𝗂𝗇𝖾𝖺𝗋 and, thus, the partial order <, are not needed.

Theorem 16 ([7]).

Let 𝒮 be a weakly-partitive family and (T,λ,<) be the weakly-partitive tree it induces. If 𝒮 is partitive, then λ1(𝗅𝗂𝗇𝖾𝖺𝗋)= and < is empty.

Hence, in case of a partitive set systems (U,𝒮), we can consider the simpler object (T,λ), called the partitive tree induced by 𝒮 (or the partitive tree of (U,𝒮)) in which λ maps V(T)L(T) to {𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾,𝗉𝗋𝗂𝗆𝖾}. As a direct consequence of Theorems 16 and 15, we can produce, through a C2MSO-transduction, the partitive tree induced by a partitive set system and naturally modeled by an {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾}-structure.

Corollary 17.

There exists a non-deterministic C2MSO-transduction τ such that, for each partitive set system (U,𝒮) represented as the {𝖲𝖤𝖳}-structure 𝕊 and inducing the partitive tree (T,λ) represented as the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾}-structure 𝕋, we have 𝕋τ(𝕊) and every output in τ(𝕊) is a partitive tree of (U,𝒮).

4.2 Application to modular decomposition

Let G be a directed graph and let MV(G). We say that M is a module (of G) if for every uM and every v,wM, uvE(G)uwE(G) and vuE(G)wuE(G). Clearly, the empty set, V(G), and all the singletons {x} for xV(G) are modules; they are called the trivial modules of G. We say a non-empty module M is maximal if it is not properly contained in any non-trivial module. Let M and M be two disjoint non-empty modules of G. Considering the edges that go from M to M, namely edges from the set (M×M)E(G), we have two possibilities: either it is empty, or it is equal to M×M. We write M↛M in the former case and MM in the latter. (It is of course possible to have both MM and MM.) A modular partition of G is a partition 𝒫={M1,,M} of V(G) such that every Mi is a non-empty module. A modular partition 𝒫={M1,,M} is called maximal if it is non-trivial and every Mi is strong and maximal. Note that every graph has exactly one maximal modular partition.

A modular decomposition of G is a rooted tree T in which the leaves are the vertices of G, and for each inner node tT, t has at least two children and the set L(Tt) is a module of G. In a modular decomposition T of G, for each inner node tV(T) with children c1,,cr, the family 𝒫t={L(Tc1),,L(Tcr)} is a modular partition of G[V(Tt)]. When each such partition is maximal, the decomposition is unique and it is called the maximal modular decomposition of G. The maximal modular decomposition T of G alone is not sufficient to characterize G. However, enriching T with, for each inner node t with children c1,,cj, the information of which pair of modules (L(Tci),L(Tcj)) is such that L(Tci)L(Tcj), yields a unique canonical representation of G. Formally, the enriched modular decomposition of G is the pair (T,F) where T is the maximal modular decomposition of G (with L(T)=V(G)) and FV(T)×V(T) is a binary relation, that relates a pair (s,t) of nodes of T, denoted stF, exactly when s and t are siblings and L(Ts)L(Tt). The elements of F are called 𝗆-edges.

It should be mentioned that the family of all non-empty modules of G is known to be weakly-partitive (or even partitive when G is undirected). In particular, the maximal modular decomposition T of G is the laminar tree induced by the family of strong modules. Hence Theorem 15 could be used to produce a partially-ordered and labeled tree which displays all the modules of G. However, this weakly-partitive tree is not sufficient for being able to recover the graph G from it. We now prove how to obtain the enriched modular decomposition of G through a C2MSO-transduction.

To model enriched modular decompositions as relational structures we use the relational vocabulary {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,m-𝖾𝖽𝗀𝖾} where 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋 and m-𝖾𝖽𝗀𝖾 are two binary relation names. An enriched modular decomposition (T,F) of a graph G is modeled by the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,m-𝖾𝖽𝗀𝖾}-structure 𝕄 with universe U𝕄=V(T), 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋𝕄 being the set of pairs (s,t) for which s is an ancestor of t in T, and m-𝖾𝖽𝗀𝖾𝕄 being the set of all pairs (s,t) such that stF (in particular, s and t are siblings in T). We use Lemma 13 in order to transduce the maximal modular decomposition of a graph.

Theorem 18.

There exists a non-deterministic C2MSO-transduction τ such that for every directed graph G represented as the {𝖾𝖽𝗀𝖾}-structure 𝔾, τ(𝔾) is non-empty and every output in τ(𝔾) is equal to some {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,m-𝖾𝖽𝗀𝖾}-structure 𝕄 representing the enriched modular decomposition (T,F) of G.

Proof.

Let G be a graph represented by the {𝖾𝖽𝗀𝖾}-structure 𝔾. Several objects are associated to G, and each of them can be described by a structure:

  • let  be the family of non-empty modules of G and let 𝕊 be the {𝖲𝖤𝖳}-structure modeling the weakly-partitive set system (V(G),) with U𝕊=V(G);

  • let T be the laminar tree induced by the weakly-partitive family  and let 𝕋 be the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋}-structure modeling it with U𝕋=V(T) and L(T)=V(G)U𝕋;

  • let F be the 𝗆-edge relation, namely the subset of V(T)×V(T) such that (T,F) is the maximal modular decomposition of G, and let 𝕄 be the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,m-𝖾𝖽𝗀𝖾}-structure modeling it with 𝕋𝕄 and U𝕋=U𝕄.

Our C2MSO-transduction is obtained by composing the three following transductions:

  • τ1: an MSO-interpretation which outputs the {𝖾𝖽𝗀𝖾,𝖲𝖤𝖳}-structure 𝔾𝕊 from 𝔾;

  • τ2: the non-deterministic C2MSO-transduction given by Lemma 13 which produces the {𝖾𝖽𝗀𝖾,𝖲𝖤𝖳,𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋}-structure 𝔾𝕊𝕋 from 𝔾𝕊;

  • τ3: an MSO-interpretation which outputs the {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,m-𝖾𝖽𝗀𝖾}-structure 𝕄 from 𝔾𝕊𝕋.

In order to define τ1 it is sufficient to observe that there exists an MSO-formula ϕ𝖲𝖤𝖳(Z) with one monadic free-variable, which is satisfied exactly when Z is a non-empty module of G. Then, since τ2 is given by Lemma 13, it only remains to define τ3. Given an inner node t, we can select, within MSO, the set L(Tt) of leaves of the subtree rooted in t. We thus assume a function 𝗅𝖾𝖺𝖿𝗌𝖾𝗍, with one first-order free-variable which returns the set of leaves of the subtree rooted at the given node. Equipped with this function, we can define the MSO-formula ϕm-𝖾𝖽𝗀𝖾 which selects pairs (s,r) of siblings such that srF. Remember that this happen exactly when there exist uL(Ts) and vL(Tr) such that uvE(G). Hence, ϕm-𝖾𝖽𝗀𝖾 could be defined as:

ϕm-𝖾𝖽𝗀𝖾(s,r):=sr t(𝗉𝖺𝗋𝖾𝗇𝗍(t,s)𝗉𝖺𝗋𝖾𝗇𝗍(t,r))
xy(x𝗅𝖾𝖺𝖿𝗌𝖾𝗍(s)y𝗅𝖾𝖺𝖿𝗌𝖾𝗍(r)𝖾𝖽𝗀𝖾(x,y)).

Once defined, the MSO-interpretation τ3 simply drops all non-necessary relations and predicates (namely 𝖾𝖽𝗀𝖾 and 𝖲𝖤𝖳) and keeps only the 𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋 and m-𝖾𝖽𝗀𝖾 relations.

Notice that it is routine to design a deterministic MSO-transduction which, given an {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,m-𝖾𝖽𝗀𝖾}-structure 𝕄 representing an enriched modular decomposition of some directed graph G, produces the {𝖾𝖽𝗀𝖾}-structure 𝔾 representing G.

Cographs

Let G be a graph, (T,F) be its modular decomposition, and (T,λ,<) be the weakly-partitive tree induced by the (weakly-partitive) family of its modules. Let t be an inner node of T, let 𝒞 be its set of children, and let C be the graph (V(T)L(T),F)[𝒞] induced by F on the set of children of t. It can be checked that, if λ(t)=𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾 then C is either a clique or an independant, and if λ(t)=𝗅𝗂𝗇𝖾𝖺𝗋 then C is a tournament consistent with <t (i.e., for every x,y𝒞, xy is an edge of C if and only if x<ty) or with the inverse of <t. In the former case, we can refine the 𝖽𝖾𝗀𝖾𝗇𝖾𝗋𝖺𝗍𝖾 label into 𝗌𝖾𝗋𝗂𝖾𝗌 and 𝗉𝖺𝗋𝖺𝗅𝗅𝖾𝗅 labels, thus expressing that C is a clique or an indenpendant, respectively. In the latter case, up-to reversing <t, we can ensure that C is a tournament consistent with <t. This yields a refined weakly-partitive tree (T,γ,<), where γ maps inner nodes to {𝗌𝖾𝗋𝗂𝖾𝗌,𝗉𝖺𝗋𝖺𝗅𝗅𝖾𝗅,𝗉𝗋𝗂𝗆𝖾,𝗅𝗂𝗇𝖾𝖺𝗋} and < is the order tγ1(𝗅𝗂𝗇𝖾𝖺𝗋)<t which ensures that tournaments are consistent with the corresponding <t. Notice that this labeled and partially-ordered tree is now uniquely determined from G. Moreover, edges from F that connect children of a node not labeled by 𝗉𝗋𝗂𝗆𝖾 can be recovered from the so-refined weakly-partitive tree. In particular, if no nodes of T is labeled by 𝗉𝗋𝗂𝗆𝖾, (T,F) and thus G is fully characterized by (T,γ,<). Graphs for which this property holds are known as directed cographs, and can be described by the refined weakly-partitive tree, called cotree, explained above and formalized in the following statement.

Theorem 19.

Let G be a directed cograph and let T be the laminar tree induced by the family of its strong modules. There exists a unique total labeling λ from the set V(T)L(T) of inner nodes of T to the set {𝗌𝖾𝗋𝗂𝖾𝗌,𝗉𝖺𝗋𝖺𝗅𝗅𝖾𝗅,𝗅𝗂𝗇𝖾𝖺𝗋} of labels, and, for each inner node tλ1(𝗅𝗂𝗇𝖾𝖺𝗋), a unique linear ordering <t of its children, such that every inner node having exactly two children is labeled 𝗌𝖾𝗋𝗂𝖾𝗌 or 𝗉𝖺𝗋𝖺𝗅𝗅𝖾𝗅, and the following condition is satisfied:

  • for every two leaves x and y of T, denoting by t their least common ancestor, xy is an edge of G if and only if either t is labeled by 𝗅𝗂𝗇𝖾𝖺𝗋 and x<ty, or t is labeled by 𝗌𝖾𝗋𝗂𝖾𝗌.

We naturally model cotrees as {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,𝗌𝖾𝗋𝗂𝖾𝗌,ε-𝗈𝗋𝖽}-structures as follow. A cotree (T,γ,<) is modeled by  where U=V(T), 𝗌𝖾𝗋𝗂𝖾𝗌=γ1(𝗌𝖾𝗋𝗂𝖾𝗌), and ε-𝗈𝗋𝖽={(x,y)x<y}. The nodes that are labeled by 𝗅𝗂𝗇𝖾𝖺𝗋 could be recovered as those inner nodes whose children are related by ε-𝗈𝗋𝖽, while the nodes that are labeled by 𝗉𝖺𝗋𝖺𝗅𝗅𝖾𝗅 could be recovered as those inner nodes which are labeled neither by 𝗌𝖾𝗋𝗂𝖾𝗌 nor by 𝗅𝗂𝗇𝖾𝖺𝗋. Based on Theorem 19 and as a consequence of Theorem 18, we can design a C2MSO-transduction which produces the cotree of a cograph G from G.

Corollary 20.

There exists a non-deterministic C2MSO-transduction τ such that, for each cograph G modeled by the {𝖾𝖽𝗀𝖾}-structure 𝔾, τ(𝔾) is non-empty and every output in τ(𝔾) is equal to some {𝖺𝗇𝖼𝖾𝗌𝗍𝗈𝗋,𝗌𝖾𝗋𝗂𝖾𝗌,ε-𝗈𝗋𝖽}-structure  representing the cotree (T,γ,<) of G.

5 Conclusion

We provide transductions for obtaining tree-like graph decompositions such as modular decompositions, cotrees, split decompositions and bi-join decompositions from a graph using CMSO. This improves upon results of Courcelle [10, 12] who gave such transductions for ordered graphs. In a more general settings, we also obtain CMSO-transductions outputing weakly-partitive and weakly-bipartitive trees of weakly-partitive and weakly-bipartitive systems (Items 6 and 7 of Theorem 1). It is worth mentionning that the latter transduction can be also used to CMSO-transduce canonical decompositions of other structures such as Tutte’s decomposition of matroids or generally split-decompositions of submodular functions [16] or modular decompositions of 2-structures [18] or of hypergraphs [21]. As shown by the application given in [12] for transducing Whitney’s isomorphism class of a graph, a line of research is to more investigate which structures can be CMSO-transduced from a graph or a set system by using the transductions from Theorem 1. Also, naturally, the question arises whether counting is necessary or whether MSO is sufficient to transduce such decompositions. Furthermore, our results include that transducing rank decompositions for graphs of rank-width 1 is possible using CMSO. But it is not known whether rank-decompositions can be transduced in general using some suitable extension of MSO. Nevertheless, a corollary of our results is that CMSO-transducing rank-decompositions can be now reduced to consider CMSO-transducing rank-decompositions of prime graphs wrt either modular decomposition or split-decomposition as if a graph has rank-width at least 2, then its rank-width is equal to the rank-width of its prime induced graphs wrt either modular or split decomposition. Thus, our results imply that, for any fixed k, there is a CMSO-transduction that computes a clique-decomposition of small width for any graph belonging to a graph class whose prime graphs have sizes bounded by k or prime graphs have linear clique-width bounded by k, e.g., many H-free graphs have small prime graphs or have small linear clique-width (see for instance [5] or [23] for prominent such examples).

References

  • [1] Hans L Bodlaender. A linear time algorithm for finding tree-decompositions of small treewidth. In Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 226–234, 1993. doi:10.1145/167088.167161.
  • [2] Mikołaj Bojańczyk. The category of mso transductions. CoRR, May 2023. arXiv:2305.18039v1.
  • [3] Mikołaj Bojańczyk, Martin Grohe, and Michał Pilipczuk. Definable decompositions for graphs of bounded linear cliquewidth. Log. Methods Comput. Sci., 17(1), 2021. URL: https://lmcs.episciences.org/7125.
  • [4] Mikołaj Bojańczyk and Michał Pilipczuk. Definability equals recognizability for graphs of bounded treewidth. In Martin Grohe, Eric Koskinen, and Natarajan Shankar, editors, Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS ’16, New York, NY, USA, July 5-8, 2016, pages 407–416. ACM, 2016. doi:10.1145/2933575.2934508.
  • [5] Andreas Brandstädt, Feodor F. Dragan, Hoàng-Oanh Le, and Raffaele Mosca. New graph classes of bounded clique-width. Theory Comput. Syst., 38(5):623–645, 2005. doi:10.1007/S00224-004-1154-6.
  • [6] Rutger Campbell, Bruno Guillon, Mamadou Moustapha Kanté, Eun Jung Kim, and Noleen Köhler. CMSO-transducing tree-like graph decompositions, 2024. doi:10.48550/arXiv.2412.04970.
  • [7] Michel Chein, Michel Habib, and Marie-Catherine Maurer. Partitive hypergraphs. Discrete mathematics, 37(1):35–50, 1981. doi:10.1016/0012-365X(81)90138-2.
  • [8] Bruno Courcelle. The monadic second-order logic of graphs. I. recognizable sets of finite graphs. Information and computation, 85(1):12–75, 1990. doi:10.1016/0890-5401(90)90043-H.
  • [9] Bruno Courcelle. The monadic second-order logic of graphs V: On closing the gap between definability and recognizability. Theoretical Computer Science, 80(2):153–202, 1991. doi:10.1016/0304-3975(91)90387-H.
  • [10] Bruno Courcelle. The monadic second-order logic of graphs X: linear orderings. Theor. Comput. Sci., 160(1&2):87–143, 1996. doi:10.1016/0304-3975(95)00083-6.
  • [11] Bruno Courcelle. The monadic second-order logic of graphs XI: hierarchical decompositions of connected graphs. Theor. Comput. Sci., 224(1-2):35–58, 1999. doi:10.1016/S0304-3975(98)00306-5.
  • [12] Bruno Courcelle. The monadic second-order logic of graphs XVI: Canonical graph decompositions. Logical Methods in Computer Science, 2, 2006. doi:10.2168/LMCS-2(2:2)2006.
  • [13] Bruno Courcelle. Canonical graph decompositions. Talk, 2012. Available at https://www.labri.fr/perso/courcell/Conferences/ExpoCanDecsJuin2012.pdf.
  • [14] Bruno Courcelle. The atomic decomposition of strongly connected graphs. Technical report, Université de Bordeaux, 2013. Available at https://www.labri.fr/perso/courcell/ArticlesEnCours/AtomicDecSubmitted.pdf.
  • [15] Bruno Courcelle and Joost Engelfriet. Graph Structure and Monadic Second-Order Logic. Cambridge University Press, 2009. doi:10.1017/cbo9780511977619.
  • [16] William H Cunningham. Decomposition of directed graphs. SIAM Journal on Algebraic Discrete Methods, 3(2):214–228, 1982.
  • [17] Fabien de Montgolfier. Décomposition modulaire des graphes. Théorie, extension et algorithmes. Phd thesis, Université Montpellier II, LIRMM, 2003.
  • [18] Andrzej Ehrenfeucht, Tero Harju, and Grzegorz Rozenberg. The Theory of 2-Structures – A Framework for Decomposition and Transformation of Graphs. World Scientific, 1999. URL: http://www.worldscibooks.com/mathematics/4197.html.
  • [19] Daryl Funk, Dillon Mayhew, and Mike Newman. Tree automata and pigeonhole classes of matroids: I. Algorithmica, 84(7):1795–1834, 2022. doi:10.1007/S00453-022-00939-7.
  • [20] Tobias Ganzow and Sasha Rubin. Order-invariant MSO is stronger than counting MSO in the finite. In Susanne Albers and Pascal Weil, editors, STACS 2008, 25th Annual Symposium on Theoretical Aspects of Computer Science, Bordeaux, France, February 21-23, 2008, Proceedings, volume 1 of LIPIcs, pages 313–324. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany, 2008. doi:10.4230/LIPICS.STACS.2008.1353.
  • [21] Michel Habib, Fabien de Montgolfier, Lalla Mouatadid, and Mengchuan Zou. A general algorithmic scheme for combinatorial decompositions with application to modular decompositions of hypergraphs. Theor. Comput. Sci., 923:56–73, 2022. doi:10.1016/J.TCS.2022.04.052.
  • [22] Petr Hlinený. Branch-width, parse trees, and monadic second-order logic for matroids. J. Comb. Theory B, 96(3):325–351, 2006. doi:10.1016/J.JCTB.2005.08.005.
  • [23] Michaël Rao. MSOL partitioning problems on graphs of bounded treewidth and clique-width. Theor. Comput. Sci., 377(1-3):260–267, 2007. doi:10.1016/J.TCS.2007.03.043.
  • [24] Michaël Rao. Décompositions de graphes et algorithmes efficaces. Phd thesis, Université Paul Verlaine – Metz, 2006.
  • [25] Yann Strozecki. Monadic second-order model-checking on decomposable matroids. Discret. Appl. Math., 159(10):1022–1039, 2011. doi:10.1016/J.DAM.2011.02.005.