Abstract 1 Introduction 2 Definitions 3 The Closure of Context-Free Sets under Fusion 4 Conclusions and Future Work References

Iterating Non-Aggregative Structure Compositions

Marius Bozga ORCID Univ. Grenoble Alpes, CNRS, Grenoble INP, VERIMAG, 38000, France Radu Iosif ORCID Univ. Grenoble Alpes, CNRS, Grenoble INP, VERIMAG, 38000, France Florian Zuleger ORCID Institute of Logic and Computation, Technische Universität Wien, Austria
Abstract

An aggregative composition is a binary operation obeying the principle that the whole is determined by the sum of its parts. The development of graph algebras, on which the theory of formal graph languages is built, relies on aggregative compositions that behave like disjoint union, except for a set of well-marked interface vertices from both sides, that are joined. The same style of composition has been considered in the context of relational structures, that generalize graphs and use constant symbols to label the interface.

In this paper, we study a non-aggregative composition operation, called fusion, that joins non-deterministically chosen elements from disjoint structures. The sets of structures obtained by iteratively applying fusion do not always have bounded tree-width, even when starting from a tree-width bounded set. First, we prove that the problem of the existence of a bound on the tree-width of the closure of a given set under fusion is decidable, when the input set is described inductively by a finite hyperedge-replacement (HR) grammar, written using the operations of aggregative composition, forgetting and renaming of constants. Such sets are usually called context-free. Second, assuming that the closure under fusion of a context-free set has bounded tree-width, we show that it is the language of an effectively constructible HR grammar. A possible application of the latter result is the possiblity of checking whether all structures from a non-aggregatively closed set having bounded tree-width satisfy a given monadic second order logic formula.

Keywords and phrases:
Hyperedge replacement, Tree-width
Copyright and License:
[Uncaptioned image] © Marius Bozga, Radu Iosif, and Florian Zuleger; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Logic and verification
; Theory of computation Grammars and context-free languages
Related Version:
Full Version: https://arxiv.org/abs/2510.06019
Funding:
margin: [Uncaptioned image] Marius Bozga and Radu Iosif wish to acknowledge the support of the French National Research Agency project Non-Aggregative Resource COmpositions (NARCO) under grant number ANR-21-CE48-0011. Florian Zuleger wishes to acknowledge the support of the FWF project AUTOSARD: “Automated Sublinear Amortised Resource Analysis of Data Structures” No. P36623 and the project VASSAL: “Verification and Analysis for Safety and Security of Applications in Life” funded by the European Union under Horizon Europe WIDERA Coordination and Support Action/Grant Agreement No. 101160022.
Editors:
C. Aiswarya, Ruta Mehta, and Subhajit Roy

1 Introduction

The tree-width of a graph is a numerical measure of how “tree-like” the graph is. This notion extends naturally to the relational structures, used to define the semantics of classical first and second-order logic. Relational structures generalize a broad range of graph-like objects, such as edge-labeled graphs, hypergraphs, and multi-edge graphs. Tree-width plays a foundational role in logic and verification. Courcelle’s theorem [6] shows that Monadic Second-order Logic (MSO) is decidable on classes of structures having bounded tree-width, while Seese’s theorem [19] asserts that unbounded tree-width leads to undecidability of MSO theories. Thus, proving that a given class of structures has bounded tree-width is tantamount for establishing the decidability of logical theories for that class.

In principle, one is interested in reasoning about infinite families of structures. These families are typically generated inductively from a finite set of basic building blocks, using operations such as composition and renaming. These operations are usually formalized by the Hyperedge Replacement (𝖧𝖱) algebra introduced by Courcelle [8]. The principle of inductive definition is captured by the notion of a context-free set, i.e., the language of a finite grammar written using 𝖧𝖱 operations, or equivalently, the set of evaluations of a set of ground 𝖧𝖱 terms recognized by a tree automaton.

For both graphs and relational structures, 𝖧𝖱 algebras are built on aggregative composition operations, in which the whole is determined by the sum of its parts. These compositions are typically defined as the disjoint union of the arguments, where the elements designated by shared constants on both sides are joined together. In other words, aggregative composition preserves the identity of the substructures while merging them at known interface points. Importantly, context-free sets of graphs and relational structures defined by grammars based on such bounded-interface111Vertex-replacement algebras [8] using disjoint union and edge addition between arbitrarily large interfaces may produce sets of unbounded tree-width. aggregative compositions have bounded tree-width.

In this paper, we investigate a more flexible but less controlled operation, called non-aggregative fusion. Fusion allows elements from two disjoint structures to be merged nondeterministically, even when they are not marked by constants. To maintain a certain level of semantic coherence, we require fusion to be constrained by a coloring discipline: elements can only be joined if their colors (i.e., sets of designated unary relations) are disjoint. This idea stems from existing work in the area of reasoning about the correctness of systems with dynamically reconfigurable connectivity, such as distributed protocols [1] or pointer structures with aliasing [15].

Due to its nondeterministic nature, fusion can cause a dramatic shift in structure: even if a set 𝐒 consists of structures having bounded tree-width, by taking the closure of 𝐒 under fusion one may introduce infinitely many structures of unbounded tree-width. This phenomenon raises two natural and fundamental questions:

  1. 1.

    Given a context-free set of relational structures, does the closure of this set under fusion have bounded tree-width ?

  2. 2.

    If the answer to the above question is yes, is this closure again a context-free set ?

The result this paper is that both questions have a positive answer (Theorem 9):

  1. 1.

    The existence of a bound on the tree-width of the closure by fusion of a context-free set is a decidable problem.

  2. 2.

    If the fusion-closure of context-free set has bounded tree-width, then it is the language of an effectively constructible context-free grammar, that uses only aggregative composition.

These results provide tools for reasoning about nondeterministic structural iteration. For instance, one can check MSO properties over the tree-width bounded fusion-closure of a context-free set, thereby extending algorithmic verification techniques to a broader class of systems. We sketch below two possible application domains that have motivated our work.

Separation Logic of Relations (SLR).

A key motivation for studying non-aggregative fusion comes from SLR, a generalization of classical Separation Logic to relational structures. This logic has been first considered for relational databases and object-oriented languages [14]. More recently, SLR (combined with inductive definitions [12]) has been proposed as an assertion language for the verification of distributed reconfigurable systems [1]. Here, the separating conjunction ϕψ means that the models of two formulæ ϕ and ψ must not have overlapping interpretations of the same relation symbol.

A subtle but crucial point is that, while the separating conjunction enforces disjointness of the tuples that interpret a relation symbol, i.e., that tuples cannot overlap in all positions, the tuples may overlap in some positions. However, by the disjointness of , such overlapping is only possible if the variables at these positions do not occur within the same unary relation symbol 222For each unary relation symbol 𝗋 in the alphabet, the SLR formula 𝗋(x)𝗋(y) entails xy.. From a semantic point of view, this behavior corresponds precisely to our notion of fusion: joining elements of two separate structures is allowed, as long as their sets of unary relation labels are disjoint. Thus, fusion abstracts the semantics of SLR, where aliasing is controlled implicitly by colors (i.e., sets of relation symbols) and the semantics of the separating conjunction. Moreover, considering inductive definitions on top of the basic SLR logic is akin to considering context-free sets generated by recursive grammars here.

Chemical and Biological Systems.

We believe that fusion-like operations naturally arise in the modeling of chemical and biological systems, where complex structures (e.g., proteins or polymeric carbon chains) are formed by joining smaller components through local interactions. Here non-aggregative fusion abstracts the joining of components not only at fixed attachment points, but also through general, property-based interactions, modeled via color compatibility in our framework. Studying the tree-width of these structures is essential for enabling automated reasoning about their properties. We consider the exploration of this area as future work.

Related Work

The notion of composition is central to substructural logics [17]. One of the foremost such logics is Bunched Implications (BI), whose first definition of semantics is based on partially ordered monoids (i.e., the multiplicative connective is interpreted as the multiplication in the monoid) [16]. In particular, the monoidal semantics of BI (and many other follow-up logics) do not assume the composition to be aggregative. The advent of the more popular semantics of BI based on finite partial functions, called heaps, has made aggregative composition popular among the users of Separation Logic [13, 18]. We list below several substructural logics where composition is not aggregative.

Docherty and Pym developped Intuitionistic Layered Graph Logic (ILGL), a substructural logic tailored to reasoning about graph structures with a fixed, non-commutative and non-associative notion of layering [9]. Calcagno et al. [2] introduce Context Logic as a framework for local reasoning about structured data, emphasizing compositionality through structural connectives that describe data and context separately rather than flattening them into aggregates. In [3], they formalize these connectives as modal operators and demonstrate that such non-aggregative reasoning is essential for expressing weakest preconditions and verifying updates. Cardelli et al. [4] present a spatial logic for reasoning about graphs that, like our work, emphasizes local, non-aggregative composition via structural connectives such as spatial conjunction.

2 Definitions

Given integers i and j, we write [i..j] for the set {i,i+1,,j}, assumed to be empty if i>j. For a set A, we denote by pow(A) its powerset. By writing B𝑓𝑖𝑛A we mean that B is a finite subset of A. The cardinality of a finite (multi)set A is written card(A). By writing A=A1A2 we mean that A1 and A2 partition A, i.e., A=A1A2 and A1A2=. The n-times Cartesian product of A with itself is denoted An and the set of possibly empty (resp. nonempty) sequences of elements from A by A (resp. A+). Multisets are denoted as {{a,b,}}, and denote the operations of multiset union and intersection, respectively. The multi-powerset (i.e., the set of multisets) of A is written mpow(A).

2.1 Relational Structures

Let be a finite alphabet of relation symbols 𝗋, of arities #𝗋1, and be a countably infinite set of constants 𝖼 of arity zero. As usual, relation symbols of arity 1, 2 and 3 are called unary, binary and ternary, respectively. A 𝒞-structure, for some finite set 𝒞𝑓𝑖𝑛 of constants, is a pair 𝖲=(𝖴𝖲,σ𝖲), where 𝖴𝖲 is a finite set called universe and σ𝖲 is an interpretation that maps each relation symbol 𝗋 to a subset of 𝖴𝖲#𝗋 and each constant 𝖼𝒞 to an element of 𝖴𝖲. The sort of the structure 𝖲 is the set 𝒞. Two structures are disjoint iff their universes are disjoint and isomorphic iff they are defined over the same alphabet and differ only by a renaming of their elements333See, e.g., [10, Section A3] for a formal definition of isomorphism between structures.. For a given sort 𝒞𝑓𝑖𝑛, we denote by 𝒮(𝒞) the set of 𝒞-structures.

We define the composition of two relational structures as the component-wise union of disjoint isomorphic copies of the structures followed by joining the elements that interpret the common constants. This is the same as the gluing operation defined by Courcelle [7, Definition 2.1], that we recall below, for self-completeness:

Definition 1.

Let 𝖲𝒮(𝒞) be a structure and 𝖴𝖲×𝖴𝖲 be an equivalence relation, where [u] denotes the equivalence class of u𝖴𝖲. The quotient of 𝖲 with respect to is the 𝒞-structure 𝖲/ defined as follows:

𝖴𝖲/=𝖽𝖾𝖿 {[u]u𝖴𝖲}
σ𝖲/(𝗋)=𝖽𝖾𝖿 {([u1],,[u#𝗋])(u1,,u#𝗋)σ𝖲(𝗋)}, for each 𝗋
σ𝖲/(𝖼)=𝖽𝖾𝖿 [σ𝖲(𝖼)], for each 𝖼𝒞

Let 𝖲i=(𝖴i,σi) be disjoint 𝒞i-structures, for i=1,2, and (𝖴1𝖴2)×(𝖴1𝖴2) be the least equivalence relation such that σ1(𝖼)σ2(𝖼), for all 𝖼𝒞1𝒞2. The composition of 𝖲1 with 𝖲2 is the 𝒞1𝒞2-structure 𝖲1𝖲2=(𝖴,σ), where:

𝖴=𝖽𝖾𝖿 {[u]u𝖴1𝖴2}
σ(𝗋)=𝖽𝖾𝖿 {([u1],,[u#𝗋])(u1,,u#𝗋)σ1(𝗋)σ2(𝗋)}, for each 𝗋
σ(𝖼)=𝖽𝖾𝖿 [σi(𝖼)], if 𝖼𝒞i, for each 𝖼𝒞1𝒞2(i.e., [σ1(𝖼)]=[σ2(𝖼)] if 𝖼𝒞1𝒞2)

We remark that the composition of -structures 𝖲1 and 𝖲2 is the same as their disjoint union, denoted 𝖲1𝖲2.

The composition of structures is aggregative, meaning that it keeps both structures separate except for the interpretation of the common constants. In the following, we define a non-aggregative fusion operation that matches also some of the elements which are not interpretations of common constants. In contrast to the deterministic composition, the equivalence relation that matches elements in the fusion operation is chosen nondeterministically.

Before formalizing the notion of non-aggregative fusion, we introduce a generic mechanism for controlling which pairs of elements are allowed to join. We assume a designated set of unary relation symbols . The sets of relation symbols γpow() are called colors. Given some 𝒞-structure 𝖲=(𝖴𝖲,σ𝖲), we denote by 𝖼𝗈𝗅𝖲(u)={𝗋uσ𝖲(𝗋)} the color of each element u𝖴𝖲. Note that the empty set is a color.

Back to the definition of non-aggregative fusion, we use colors to prevent joining elements labeled with non-disjoint colors. This is captured by the following notion of compatibility:

Definition 2.

Let 𝖲=(𝖴,σ) be a 𝒞-structure. A relation 𝖴×𝖴 is compatible with 𝖲 if and only if 𝖼𝗈𝗅𝖲(u1)𝖼𝗈𝗅𝖲(u2)=, for each pair u1u2.

We now define the non-aggregative fusion operation. The operation is non-deterministic, i.e., returns a (possibly empty) set of structures. For reasons of simplicity, the fusion operation is only defined for structures of sort , i.e., for structures that do not interpret any constants. This restriction can be lifted at the expense of complexifying the definition below, by considering fusion in which the interpretation of common constants in both structures must always be joined, as in Definition 1.

Given disjoint sets A and B, a relation A×B is an A-B matching iff {a,b}{a,b}=, for all distinct pairs (a,b),(a,b). The least equivalence relation that contains is denoted (AB)×(AB). We say that an equivalence relation is k-generated iff is a matching consisting of k pairs.

Definition 3.

Let 𝖲i=(𝖴i,σi), for i=1,2, be two disjoint -structures. The fusion of 𝖲1 and 𝖲2 is the following set of -structures:

𝖥(𝖲1,𝖲2)=𝖽𝖾𝖿{(𝖲1𝖲2)/  non-empty 𝖴1-𝖴2 matching compatible with 𝖲1𝖲2}

Let 𝐒 be a set of -structures. The closure of 𝐒 under fusion is the least set 𝖥(𝐒) such that 𝐒{𝖥(𝖲1,𝖲2)𝖲1,𝖲2𝖥(𝐒)}𝖥(𝐒).

Note that 𝖥(𝖲1,𝖲2)= iff 𝖼𝗈𝗅𝖲1(u1)𝖼𝗈𝗅𝖲1(u2), for all pairs (u1,u2)𝖴1×𝖴2. The problems considered in the rest of this paper concern the uses of fusion, in addition to the composition and the unary operations on structures introduced next.

2.2 An Algebra of Structures

We recall the definitions of sorted terms and algebras [6, Definition 1.1]. Let Σ be a countably infinite set of sorts and let 𝔽 be a countably infinite set of function symbols, where the set 𝔽 is called a signature. Each f𝔽 has an associated tuple of argument sorts α(f) and a value sort ρ(f). The arity of f, denoted #f, is the length of α(f). Moreover, each variable has a sort. A 𝔽-term t[x1,,xn] is built as usual from function symbols and variables x1,,xn of matching sorts. A ground term is a term without variables. A trivial term consists of a single variable. A term t is a subterm of t iff there exists a term u[x] such that t=u[t], where u[t] denotes the replacement of x by t in u. The sort of a term t, denoted ρ(t) is the value sort of the top-most symbol, i.e., either the value sort ρ(f) of the top-most function symbol f, in case of a non-trivial term t, or the sort ρ(x) of the variable x, in case of a trivial term x. A position p in a term t is a node of the tree that uniquely represents t, in the usual way (see [5] for a formal definition). 𝒯() denotes the set of ground terms having function symbols taken from a finite set 𝑓𝑖𝑛𝔽.

An 𝔽-algebra 𝒜=({𝖠s}sΣ,{f𝒜}f𝔽) consists of domains 𝖠s of each sort sΣ and interprets the function symbols f𝔽 as functions f𝒜:𝖠s1××𝖠sn𝖠ρ(f), where α(f)=(s1,,sn). By the domain of 𝒜 we understand the set 𝖠=sΣ𝖠s. We denote by t𝒜 the interpretation of an 𝔽-term t in 𝒜, i.e., the function obtained by replacing each function symbol that occurs in t by its interpretation. In particular, t𝒜 is an element of the domain of 𝒜 if t is ground.

We define an algebra of structures, called , with sorts Σ=𝖽𝖾𝖿{𝒞𝒞𝑓𝑖𝑛}, where each universe 𝖧𝖱𝒞 is the set of 𝒞-structures. The signature 𝔽 consists of:

  • constant symbols 𝗋(𝒞1,,𝒞#𝗋), for 𝗋 and 𝒞i𝑓𝑖𝑛 such that either 𝒞i=𝒞j or 𝒞i𝒞j= for all 1i<j#𝗋, interpreted as 𝗋(𝒞1,,𝒞#𝗋)=𝖽𝖾𝖿(𝖴,σ), where:

    𝖴=𝖽𝖾𝖿 {u1,,u#𝗋} for some (possibly equal) elements u1,,u#𝗋
     such that ui=uj𝒞i=𝒞j, for all 1i<j#𝗋
    σ(𝗋)=𝖽𝖾𝖿 {(u1,,u#𝗋)} and σ(𝗋)=𝖽𝖾𝖿, for all 𝗋{𝗋}
    σ(𝖼)=𝖽𝖾𝖿 ui, for all 𝖼𝒞i and 1i#𝗋
  • binary function symbols 𝒞,𝒞, for 𝒞,𝒞𝑓𝑖𝑛, interpreted by the composition operation from Definition 1 applied to structures of sorts 𝒞 and 𝒞, respectively, see Figure 1 (a).

  • unary function symbols 𝗋𝖾𝗇𝖺𝗆𝖾𝒞α, for all 𝒞,𝒞𝑓𝑖𝑛 and surjective function α:𝒞𝒞, interpreted as the operations 𝗋𝖾𝗇𝖺𝗆𝖾𝒞α:𝖧𝖱𝒞𝖧𝖱𝒞 where, for each 𝒞-structure 𝖲, the output 𝖲=𝗋𝖾𝗇𝖺𝗆𝖾𝒞α(𝖲) is defined below:

    𝖴𝖲=𝖽𝖾𝖿 𝖴𝖲σ𝖲(𝗋)=𝖽𝖾𝖿σ𝖲(𝗋)σ𝖲(α(𝖼))=𝖽𝖾𝖿σ𝖲(𝖼), for all 𝗋 and 𝖼𝒞𝒞
  • unary function symbols 𝖿𝗈𝗋𝗀𝖾𝗍𝒞𝒞, for 𝒞𝑓𝑖𝑛 and 𝒞𝒞, interpreted as the operations 𝖿𝗈𝗋𝗀𝖾𝗍𝒞𝒞:𝖧𝖱𝒞𝖧𝖱𝒞𝒞 where, for each 𝒞-structure 𝖲, the output 𝖲=𝖿𝗈𝗋𝗀𝖾𝗍𝒞𝒞(𝖲) is defined below:

    𝖴𝖲=𝖽𝖾𝖿 𝖴𝖲σ𝖲(𝗋)=𝖽𝖾𝖿σ𝖲(𝗋)σ𝖲(𝖼)=𝖽𝖾𝖿σ𝖲(𝖼), for all 𝗋 and 𝖼𝒞

To ease the notation, we omit the sorts of the arguments from the function symbols 𝔽 when they are understood from the context.

Example 4.

The two leftmost graphs in Figure 1 (a) are the values of the following terms, respectively (singleton sets are denoted by their elements, to avoid clutter):

t1=𝖽𝖾𝖿 𝗋𝖾𝗇𝖺𝗆𝖾𝖼3𝖼1(𝖿𝗈𝗋𝗀𝖾𝗍𝖼1(a(𝖼0,𝖼1,𝖼2)c(𝖼1,𝖼3))b(𝖼0,𝖼3))
t2=𝖽𝖾𝖿 b(𝖼0,𝖼1)𝖿𝗈𝗋𝗀𝖾𝗍𝖼2(a(𝖼1,𝖼3,𝖼2))

The graph in Figure 1 (c) is the value of the term t0=𝖽𝖾𝖿𝗋𝖾𝗇𝖺𝗆𝖾𝖼0𝖼1(𝖿𝗈𝗋𝗀𝖾𝗍𝖼2(t1t2)).

We comment on the relationship to the algebra of relational structures introduced in [7, Definition 2.3], as a generalization of both the hyperedge-replacement (𝖧𝖱) algebra of hypergraphs and the vertex-replacement (𝖵𝖱) algebra of binary graphs, to relational structures. As a matter of fact, we do not use this algebra. Instead, our algebra of relational structures algebra follows the standard 𝖧𝖱 algebra on hypergraphs [6]. This relationship can be easily understood by viewing relational structures as hypergraphs in the adjacency encoding, i.e., each tuple (u1,,u#𝗋) from the interpretation of a relation symbol 𝗋 corresponds to a 𝗋-labeled hyperedge attached to the vertices u1,,u#𝗋 in this order. For reasons of space, we omit further details.

Figure 1: HR operations on structures over the alphabet {a,b,c}, where #a=3 and #b=#c=2. The order of vertices attached to an edge is indicated by an arrow pointing to the last vertex.

We now introduce the subalgebras of that define the tree-width parameter of a structure. For this, we assume some enumeration of the constants ={𝖼0,𝖼1,} and denote by 𝔽k the subset of 𝔽 consisting of the function symbols whose argument and value sorts are all contained in {𝖼0,,𝖼k}.

Definition 5.

The tree-width of a structure 𝖲, denoted tw(𝖲), is the minimal integer k0 for which there exists a ground term t𝒯(𝔽k) such that t=𝖲. A set 𝐒 of structures has bounded tree-width if and only if the set {tw(𝖲)𝖲𝐒} is finite.

In particular, for each tree-width bounded set 𝐒, there exists a set 𝒯 of ground terms and a finite set 𝒞𝑓𝑖𝑛 of constants such that each term t𝒯 uses only constants from 𝒞 and 𝐒={tt𝒯}.

This algebraic definition of tree-width of a relational structure is analogous to the definition of the tree-width of a hypergraph using a subalgebra of 𝖧𝖱 defined by a restriction of sorts to finite sets of vertex labels (see, e.g., [8, Proposition 1.19] for a proof of equivalence between the graph-theoretic and algebraic definitions of tree-width for hypergraphs). Moreover, the tree-width of a structure can be equivalently defined in terms of the tree-width of its Gaifman-graph:

Definition 6.

Let 𝖲 be a 𝒞-structure. The Gaifman graph of 𝖲 is the simple undirected graph 𝖦𝖺𝗂𝖿(𝖲)=(V,E), where V=𝖽𝖾𝖿𝖴𝖲 and E=𝖽𝖾𝖿{{ui,uj}(u1,,u#𝗋)σ𝖲(𝗋),1ij#𝗋,𝗋}.

It is known that the tree-width of a structure equals the tree-width of its Gaifman graph [11, Proposition 11.27].

We recall below two standard notions of graph theory. Given binary graphs G and H, we say that H is a minor of G iff H is obtained from a subgraph of G by edge contractions, where the contraction of a binary edge eEG attached to vertices u and v means deleting e and joining u and v into a single vertex x (the edges attached to x are the ones attached to either u or v). It is well-known that the tree-width of each minor of a graph G is bounded by the tree-width of G.

A n×m-grid is a binary graph whose vertices can be labeled with pairs (i,j)[1..n]×[1..m] such that there is an edge between (i,j) and (i,j) iff either i<n, i=i+1 and j=j or j<m, i=i and j=j+1. It is well-known444This can be shown by using the characterisation of tree-width in terms of the cops and robber game [20]. that each n×n-grid has tree-width n. By bluring the distinction between isomorphic grids, we obtain the following:

Proposition 7.

A set of structures whose Gaifman graphs contain infinitely many non-isomorphic square grids has unbounded tree-width.

2.3 Context-Free Sets of Structures

Context-free sets are usually defined as languages of grammars, i.e., finite sets of inductive rules written using nonterminals and function symbols from a given signature. To simplify some of the upcoming proofs, we use here an equivalent definition of contex-free sets based on recognisable sets of ground terms, defined using tree automata [5]. For self-containment reasons, we briefly introduce context-free grammars and discuss their equivalence with tree automata in Section 3.5 (see the statement of Theorem 24).

Let 𝑓𝑖𝑛𝔽 be a finite signature of function symbols. A tree automaton over is a tuple 𝒜=(Q,F,), where Q is a finite set of states, FQ is a set of accepting states and is a set of transition rules of the form (q1,,q#f)𝑓q, where q1,,q#f,qQ and f. A run π of 𝒜 over a ground term t𝒯() maps each position p within t to a state q=π(p) if the automaton has a rule (π(p1),,π(p#f))𝑓q, where p1,,p#f are the positions of the children of p in t. A ground term t is accepted by 𝒜 iff 𝒜 has a run that labels the root of t with an accepting state. The language of 𝒜, denoted (𝒜), is the set of ground terms accepted by 𝒜. A set T𝒯() is recognisable iff it is the language of a tree automaton over the finite signature .

Definition 8.

A set 𝐒 of structures is context-free if and only if there exists a recognisable set of ground terms T𝒯(), over a finite signature 𝔽, such that 𝐒={ttT}.

Note that a context-free set of structures has finitely many sorts, because the signature of terms used to describe the set is finite. For this reason, any context-free set of structures has bounded tree-width. On the other hand, there are bounded tree-width sets which are not context-free, for instance the set of linear structures over the alphabet {a,b,c} of binary relations of the form anbncn, for all n1.

3 The Closure of Context-Free Sets under Fusion

This section is concerned with the statement and proof of the main result of the paper (Theorem 9). For simplicity, we assume that 𝐒 is a set of connected structures, where connectivity of a structure 𝖲 means that there exists an undirected path in 𝖦𝖺𝗂𝖿(𝖲) between each pair of elements u1,u2𝖴𝖲. We note that, because 𝐒 contains only connected structures, each structure from its closure 𝖥(𝐒) is necessarily connected.

The assumption of 𝐒 being a context-free set of connected structures loses no generality, because it is possible, from a tree automaton 𝒜 such that 𝐒=(A), to build a tree automaton such that (B) is the set of connected substructures of a structure in 𝐒. Intuitively, is obtained from 𝒜 be labeling each state q with finite information concerning the existence of a path between each pair of constant symbols in each structure that is the value of a ground term recognized by q in 𝒜. This construction can be used to generalize the statement of Theorem 9 below from connected to arbitrary structures. We will detail this construction in an extended version of the present article.

Theorem 9.

Let 𝐒 be a context-free set of connected -structures.

  1. 1.

    𝖥(𝐒) has bounded tree-width if and only if 𝖥(𝐒) is context-free.

  2. 2.

    It is decidable whether 𝖥(𝐒) has bounded tree-width.

An obvious consequence of this theorem is the decidability of the problem: given a context-free set 𝐒, is 𝖥(𝐒) context-free?

We give an overview of the proof before going into technical details. The core idea is the equivalence between the (A) tree-width boundedness of the closure 𝖥(𝐒) of a context-free set 𝐒 and (B) the non-existence of two structures 𝖲1,𝖲2𝖥(𝐒) having each at least three elements each ui,vi,wi𝖴𝖲i, labeled with disjoint colors γi, for i=1,2. This equivalence is established via a third, more technical, condition about the disjointness relations between the colors that may occur in a structure from 𝖥(𝐒). The latter implies that the matching relation of each fusion of two structures from 𝖥(𝐒) is generated by at most two pairs of elements with compatible colors.

The equivalence between (A) and (B) is used to prove both points of Theorem 9. For point (1) suppose, for a contradiction, that (B) does not hold. Then, 𝖲1 and 𝖲2 can be composed by joining their ui, vi or wi elements, for i=1,2, respectively, as in Figure 2. Consequently, the set of Gaifman graphs corresponding to the structures in 𝖥(𝐒) contains an infinite set of grid minors, thus 𝖥(𝐒) has unbounded tree-width (Proposition 7). Else, if (B) holds (i.e., such structures cannot be found), we prove that the matching relation considered in the fusion of any two structures is generated by either one or two pairs of elements. In each of these cases, by adding a finite number of constants to the signature of the 𝔽-terms from the language of 𝒜, we can build a tree automaton 𝒜 such that (𝒜)=𝖥(𝐒), thus taking care of point (1) of the theorem.

To prove point (2), we rely on the equivalence of (A) and (B) and show that (B) is decidable. This is done by arguing that the existence of two structures with the above property is equivalent to the existence of two multiset abstractions of structures {{γ1,γ1,γ1}}, {{γ2,γ2,γ2}} in the domain of multisets of colors having multiplicity at most three. These multiset abstractions of colors can be effectively computed by a finite fixpoint iteration over the rules of the tree automaton that recognizes the set of ground terms which evaluates to the elements of 𝐒.

3.1 Color Multisets

In the following, let 𝐒 be a context-free set of -structures. We denote the set of colors by Γ=𝖽𝖾𝖿pow(), where is a fixed finite set of unary relation symbols. First, we define an abstraction of structures as finite multisets of colors:

Definition 10.

The multiset color abstraction 𝖲mpow(Γ) of a structure 𝖲 is 𝖲=𝖽𝖾𝖿{{𝖼𝗈𝗅𝖲(u)u𝖴𝖲}}. For an integer k0, the k-multiset color abstraction 𝖲kmpow(Γ) is 𝖲k=𝖽𝖾𝖿{M𝖲card(M)k}. These abstractions are lifted to sets of structures as sets of multisets 𝐒=𝖽𝖾𝖿{𝖲𝖲𝐒} and 𝐒k=𝖽𝖾𝖿𝖲𝐒𝖲k.

Note that 𝖲 is a multiset, whereas 𝖲k is a set of multisets. When lifted to sets of structures, both 𝐒 and 𝐒k are sets of multisets.

Figure 2: Building structures whose Gaifman graphs have infinitely large grid minors.

The core of the proof of Theorem 9 is the equivalence between the following two conditions stated formally below:

tw(𝖥(𝐒)) k, for some k1 (A)
{{γ1,γ1,γ1}},{{γ2,γ2,γ2}}(𝖥(𝐒))3 γ1γ2, for all γ1,γ2Γ (B)

Note that condition (A) is more general than the premiss of Theorem 9. We prove the (A) (B) direction below.

Lemma 11.

If 𝐒 has bounded tree-width, then (A) implies (B).

Proof.

By contradiction, assume that there exist {{γ1,γ1,γ1}},{{γ2,γ2,γ2}}(𝖥(𝐒))3 such that γ1γ2=. Then, there exist structures 𝖲1,𝖲2𝖥(𝐒) such that {{γ1,γ1,γ1}}𝖲1 and {{γ2,γ2,γ2}}𝖲2. We shall use 𝖲1 and 𝖲2 to build infinitely many structures whose Gaifman graphs have arbitrarily large square grid minors, as illustrated in Figure 2.

First, construct the structure 𝖲12𝖥(𝐒) by fusing one pair (u1,u2), having colors γ1 and γ2, respectively. Let v1 and w1 (resp. v2 and w2) be the remaining distinct elements of 𝖲12, having color γ1 (resp. γ2) from 𝖲1 and 𝖲2, respectively. For an arbitrarily large integer n1, consider n×n disjoint copies (𝖲12i,j)i,j=1,n of 𝖲12. Let 1,j be the equivalence relation generated by {(v11,j,v21,j1)} and i,1 be generated by {(w2i,1,w1i1,1)}, i,j be generated by {(v1i,j,v2i,j1),(w2i,j,w1i1,j)}, for all i,j=2,n. Second, construct the grid-like connected structure Xn,n𝖥(𝐒):

Xn,n=((((𝖲121,1𝖲121,2)/1,2𝖲122,1)/2,1𝖲12i,j)/i,j𝖲12n,n)/n,n

where structures 𝖲12i,j are added to the fusion in the increasing order of i+j. We can show that 𝖦𝖺𝗂𝖿(Xn,n) has an n×n square grid minor. Finally, as n can be taken arbitrarily large, we conclude that 𝖥(𝐒) does not have bounded tree-width, which contradicts (A).

3.2 Color Schemes

For the proof of the (A) “” (B) direction, we first organize the set of colors using the RGB color schemes defined below:

Definition 12.

A partition (Γred,Γgreen,Γblue) of Γ is an RGB color scheme if and only if:

  1. 1.

    γ1γ2, for all γ1,γ2Γblue,

  2. 2.

    γ1γ2, for all γ1Γgreen and all γ2Γblue,

  3. 3.

    for all γ1Γred there exists γ2Γblue such that γ1γ2=.

Note that an RGB color scheme is fully specified by the set Γblue. Indeed, any color not in Γblue is unambiguously placed within Γred or Γgreen, depending on whether or not it is disjoint from some color in Γblue. In particular, if Γblue= then Γred= and Γgreen=Γ. For example, Figure 3 shows several RGB color schemes for the set ={𝖺,𝖻,𝖼} of unary relation symbols.

Figure 3: Examples of RGB color schemes.

Because a fusion operation only joins element with disjoint colors, blue elements can only be joined with red elements, green elements can be joined with green or red elements, whereas red elements can be joined with elements of any other color. We define below what is meant for a set of structures to conform to an RGB color scheme:

Definition 13.

A set 𝐒 of structures conforms to (Γred,Γgreen,Γblue) if and only if:

  1. 1.

    for all structures 𝖲𝐒, if 𝖼𝗈𝗅𝖲(u)Γred, for some element u𝖴𝖲, then 𝖼𝗈𝗅𝖲(u)Γblue, for all other elements u𝖴𝖲{u}, and

  2. 2.

    𝖲Γgreen{{γ,γγΓgreen}}, for all structures 𝖲𝖥(𝐒).

In other words, 𝐒 conforms to a given color scheme if each structure from 𝐒 has either a single red and the rest blue, or at most occurrences of the same green color and the rest blue elements. Moreover, the number of occurrences of a green color must not exceed two, for each structure obtained by taking fusions of some structures in 𝐒. This observation justifies the following notion of type of a structure:

Definition 14.

A structure 𝖲 is of type 𝖱 if it has exactly one red element and the rest blue, 𝖦 if it has at least one green element and the rest blue and 𝖡 if it has only blue elements.

Note that there can be structures of neither 𝖱, 𝖦 or 𝖡 type, but these are the only types of interest, as justified by the following:

Lemma 15.

Let 𝐒 be a set of structures conforming to an RGB color scheme. Then, each structure 𝖲𝖥(𝐒) is of type either 𝖱, 𝖦 or 𝖡.

Proof.

By induction on the construction of 𝖲𝖥(𝐒) from one or more structures from 𝐒. Table 1 summarizes the possible types of 𝖥(𝖲1,𝖲2) on structures 𝖲1 and 𝖲2 of types 𝖱, 𝖦 or 𝖡, respectively.

Table 1: The types of structures obtained by fusion, where means that the result of the fusion is the empty set.
𝖥(𝖲1,𝖲2) 𝖲2 of 𝖱 type 𝖲2 of 𝖦 type 𝖲2 of 𝖡 type
𝖲1 of 𝖱 type 𝖱,𝖦,𝖡 𝖦,𝖡 𝖡
𝖲1 of 𝖦 type 𝖦,𝖡 𝖦,𝖡
𝖲1 of 𝖡 type 𝖡

The (B) “” (A) direction will be established via a third condition (C), which is conformance to an RGB color scheme defined by taking the Γblue set to be the colors occurring three times in some structure from 𝖥(𝐒):

Lemma 16.

If (B) holds then:

𝐒 conforms to (Γred,Γgreen,Γblue), where Γblue=𝖽𝖾𝖿{γΓ{{γ,γ,γ}}(𝖥(𝐒))3} (C)
Proof.

We show that 𝐒 conforms to the (Γred,Γgreen,Γblue) RGB color scheme from the statement, by checking the two points of Definition 13:

  1. (1)

    Let 𝖲𝐒 and prove that for any two colors γ1,γ2, if {{γ1,γ2}}𝖲 and γ1Γred then γ2Γblue. Since γ1Γred, there must exists a color γ1Γblue, such that γ1γ1=, by Definition 12. By the definition of Γblue, this further implies {{γ1,γ1,γ1}}(𝖥(𝐒))3. Henceforth, there exists a structure 𝖲𝖥(𝐒) such that {{γ1,γ1,γ1}}𝖲. We can now use 𝖲 and three disjoint copies of 𝖲 to build a new structure 𝖲′′ by gluing progressively, each one of the three elements of color γ1 in 𝖲 to the element of color γ1 of 𝖲. Then, by construction, the structure 𝖲′′ will also contain three elements of color γ2, one from each disjoint copy of 𝖲. Therefore, {{γ2,γ2,γ2}}𝖲′′ and because 𝖲′′𝖥(𝐒) this implies {{γ2,γ2,γ2}}(𝖥(𝐒))3 and therefore γ2Γblue.

  2. (2)

    By contradiction, let 𝖲𝖥(𝐒) be such that 𝖲Γgreen{{γ,γγΓgreen}}. Then there exists γ(𝖲Γgreen){{γ,γγΓgreen}}, i.e., γΓgreen and {{γ,γ,γ}}𝖲. The latter implies {{γ,γ,γ}}𝖲3(𝖥(𝐒))3. But this implies γΓblue according to the definition of the RGB color scheme, contradicting γΓgreen.

In the rest of this and the next subsections, we are concerned with the proof of the following implication, that establishes the equivalence of (A) and (B). As previously mentioned, this direction of the proof uses the third condition (C) that is, conformance to the RGB color scheme from the statement of Lemma 16:

Lemma 17.

If 𝐒 has bounded tree-width and (C) holds then (A) holds.

The proof of the above lemma is split into two technical results (Lemmas 18 and 19). The first (Lemma 18) involves reasoning about the number of pairs of elements that are joined by the fusion operation in order to obtain a structure from 𝖥(𝐒):

Lemma 18.

If 𝐒 conforms to an RGB scheme (Γred,Γgreen,Γblue) and 𝖲=(𝖲1𝖲2)/ for some 𝖲i=(𝖴i,σi)𝖥(𝐒), for i=1,2, and some equivalence closure of a 𝖴1-𝖴2 matching, then exactly one of the following holds:

  1. 1.

    is 1-generated, or

  2. 2.

    is 2-generated, 𝖲 is of type 𝖡, and either:

    1. (a)

      𝖲1, 𝖲2 are both of type 𝖱, or

    2. (b)

      𝖲1, 𝖲2 are both of type 𝖦 and card(𝖲1Γgreen)=card(𝖲2Γgreen)=2.

Proof.

We distinguish two cases:

  • 𝖲1 is of type 𝖱: If 𝖲2 is of type 𝖡 or 𝖦 then 𝖲1 and 𝖲2 can be fused only by equivalences generated by a single pair, that contains the element from the support of 𝖲1 with color in Γred, thus matching the case (1) from the statement. Else, if 𝖲2 is of type 𝖱 then 𝖲1 and 𝖲2 can be fused by equivalences generated by at most two pairs, each containing an element with color from Γred, from either 𝖲1 or 𝖲2, thus matching the case 2a from the statement. In this latter case, 𝖲 is of type 𝖡 because joining a red with a blue element always results in a blue element.

  • 𝖲1, 𝖲2 are both of type 𝖦: By contradiction, assume they can be fused by an equivalence generated by three pairs of elements (u1i,u2i)i=1,2,3. Let G1i=𝖼𝗈𝗅𝖲1(u1i), G2i=𝖼𝗈𝗅𝖲2(u2i) be the colors from Γgreen of the matching elements in the two structures, for i=1,2,3. Then, we can construct structures using 𝖲1 and 𝖲2 where any of these colors repeat strictly more than twice, henceforth, contradicting the conformance property to the RGB color scheme. The principle of the construction is depicted in Figure 4. Finally, note that the construction depicted in Figure 4 fuse actually only pairs of colors (G1i,G2i) for i=1,2. Henceforth, the conformance property is also contradicted if 𝖲1 and 𝖲2 can be fused by a 2-generated equivalence relation , such that the support of either 𝖲1 or 𝖲2 contains more than three elements with colors in Γgreen. By the same argument, it follows that 𝖲 is of type 𝖡, if is generated by two pairs of green elements.

Figure 4: Fusion of 𝖦 structures by 3-generated matchings.

3.3 Fusions as Operations on Terms

The second technical result required for the proof of Lemma 17 is a characterization of the k-generated fusion, for k=1,2, via operations on witness 𝔽-terms. We assume that 𝐒 is a given tree-width bounded set of structures that conforms to a fixed RGB color scheme (Γred,Γgreen,Γblue). The goal is to prove that tw(𝖥(𝐒))tw(𝐒)+K, for an integer K0.

Figure 5: The 𝗃𝗈𝗂𝗇(t1,t2,𝖼γ1i1,𝖼γ2i2) (a) and 𝖺𝗉𝗉𝖾𝗇𝖽(t1,t2,n,k,𝖼γi) (b) operations.

Since 𝐒 has bounded tree-width, there exists a set 𝒯 of terms such that 𝐒={tt𝒯} and a finite set 𝒞 of constants such that each term t𝒯 uses only constants from 𝒞 and assume w.l.o.g. 𝒞 to be the least such set. Moreover, consider the following set of special constants 𝒞¯=𝖽𝖾𝖿{𝖼γiγΓ, 1i2}, with the following intuition. Recall that each element of a structure t is the common interpretation of all the constants 𝖼𝒞i in the label 𝗋(𝒞1,,𝒞#𝗋) of a leaf of t. By adding 𝖼γi to the set 𝒞i, we mean that the color of that element in the structure is γ. A constant is visible in a term if it is not in the scope of a 𝗋𝖾𝗇𝖺𝗆𝖾 or 𝖿𝗈𝗋𝗀𝖾𝗍 operation. Let K=𝖽𝖾𝖿card(𝒞¯) and note that card(K)=2card(Γ).

We shall prove that tw(𝖲)tw(𝐒)+K by building, for any 𝖲𝖥(𝐒), a term t that uses only constants from 𝒞𝒞¯, such that 𝖿𝗈𝗋𝗀𝖾𝗍𝒞¯(t)=𝖲. We then note that 𝖲 has tree-width at most card(𝒞)+K and obtain tw(𝖲)card(𝒞)+K=tw(𝐒)+K. Since the choice of 𝖲𝖥(𝐒) was arbitrary, this leads to tw(𝖥(𝐒))tw(𝐒)+K.

In the following, we understand terms as trees whose nodes are labeled by function symbols from 𝔽. For each node n of a term t, we write 𝗅𝖺𝖻(n) for its label. The children of each node n form an ordered sequence of length equal to the arity of 𝗅𝖺𝖻(n). In order to the build the witness terms for the structures 𝖲𝖥(𝐒), we make use of the following operations on terms t, t1 and t2 having constants in 𝒞𝒞¯:

  1. 1.

    𝗅𝖺𝖻𝖾𝗅(t,n,k,𝖼γi), where n is a leaf of t, 𝗅𝖺𝖻(n)=𝗋(𝒞1,,𝒞#𝗋) and k[1..#𝗋]: Let 1k1<<k#𝗋 be the indices of the sets of constants from 𝗅𝖺𝖻(n) that are equal to 𝒞k (see the definition of in Section 2.2). The operation changes the label of n in t only, by replacing the set 𝒞kj with 𝒞kj{𝖼γi}, for each 1j.

  2. 2.

    𝗃𝗈𝗂𝗇(t1,t2,𝖼γ1i1,𝖼γ2i2): The result is the following term, for a nondeterministic choice of j, such that 𝖼γ1γ2j is not visible in either t1 or t2 (the operation is undefined otherwise):

    𝖿𝗈𝗋𝗀𝖾𝗍(𝗋𝖾𝗇𝖺𝗆𝖾𝖼γ1i1𝖼γ1γ2j(t1)𝗋𝖾𝗇𝖺𝗆𝖾𝖼γ2i2𝖼γ1γ2j(t2)),=𝖽𝖾𝖿{𝖼γ1γ2jγ1γ2Γblue}

    We refer to Figure 5 (a) for an illustration. In addition, we consider an overloaded version of 𝗃𝗈𝗂𝗇(t1,t2,𝖼γ11i11,𝖼γ12i12,𝖼γ21i21,𝖼γ22i22) that fuses the interpretation of 𝖼γ1ji1j with that of 𝖼γ2ji2j, for both j=1,2. This definition is similar to the one above, thus omitted for brevity.

  3. 3.

    𝖺𝗉𝗉𝖾𝗇𝖽(t1,t2,n,k,𝖼γi), where n is a leaf of t1, 𝗅𝖺𝖻(n)=𝗋(𝒞1,,𝒞#𝗋) and k[1..#𝗋]: the result is the term t1[n/𝖿𝗈𝗋𝗀𝖾𝗍𝖼γi(𝗅𝖺𝖻𝖾𝗅(n,n,k,𝖼γi)t2)], where t[n/s] denotes the substitution of the leaf n by the term s in t. We refer to Figure 5 (b) for an illustration.

Then, Lemma 17 is an immediate consequence of the following lemma:

Lemma 19.

For each structure 𝖲𝖥(𝐒) there exists a 𝔽-term t using only constants from 𝒞𝒞¯, such that (i) 𝖿𝗈𝗋𝗀𝖾𝗍𝒞¯(t)=𝖲 and (ii) for each element u𝖴𝖲 such that γ=𝖼𝗈𝗅𝖲(u)ΓredΓgreen there exists a special constant 𝖼γi𝒞¯ such that σ𝖲(𝖼γi)=u.

Proof (sketch).

We build t by induction of the derivation of 𝖲=(𝖴,σ)𝖥(𝐒). For the base case 𝖲𝐒, let t𝒯 be a term such that 𝖲=t. By repeating the 𝗅𝖺𝖻𝖾𝗅(t,n,k,𝖼γi) operation, we add a special constant 𝖼γi to each leaf n of t, on the appropriate position 1k#𝗋, where 𝗅𝖺𝖻(n)=𝗋(𝒞1,,𝒞#𝗋), such that σ(𝒞k)={u} and γ=𝖽𝖾𝖿𝖼𝗈𝗅𝖲(u)ΓredΓgreen. The choice of 1i2 is nondeterministic. The result of applying these labeling operations to t is t. Then, 𝖿𝗈𝗋𝗀𝖾𝗍𝒞¯(t)=𝖲 and (ii) holds, by construction. For the inductive step, let 𝖲=(𝖲1𝖲2)/, where 𝖲1,𝖲2𝖥(𝐒) and is an equivalence relation that is generated by the set of pairs {(u1i,u2i)}iI, where I is either {1} or {1,2}. Let γji=𝖽𝖾𝖿𝖼𝗈𝗅𝖲j(uji) for all 1j2 and iI. By the inductive hypothesis, there exist terms tj and integers 1kji2, such that (𝖴j,σj)=𝖽𝖾𝖿tj and 𝖿𝗈𝗋𝗀𝖾𝗍𝒞¯(tj)=𝖲j, for all 1j2 and iI.

  1. 1.

    I={1}, i.e., is 1-generated.

    1. (a)

      γ11,γ21ΓredΓgreen: by the inductive hypothesis (ii), there exist 𝖼γ11i1,𝖼γ21i2𝒞¯ such that uj1=σj(𝖼γj1ij), for both 1j2. Suppose that 𝖼γ11γ21 is visible in t1 (visibility in t2 is a symmetric case), for some 12. Then γ11γ21 must belong to ΓredΓgreen, by the inductive hypothesis (ii). If 𝖼γ11γ213 is not visible in t2, suppose first that 𝖼γ11γ213 is visible in t1. Then, 𝖼γ11i1, 𝖼γ11γ21 and 𝖼γ11γ213 are visible in t1. By Lemma 15, γ11,γ11γ21Γgreen and γ11γ21 occurs 3 times in 𝖲, thus contradicting the definition of Γblue. Hence 𝖼γ1γ23 is not visible in either t1 or t2 and t=𝖽𝖾𝖿𝗃𝗈𝗂𝗇(t1,t2,𝖼γ11i1,𝖼γ21i2) is well-defined. Else, if 𝖼γ11γ213 is visible in t2, then γ11γ21 occurs 3 times in 𝖲, thus contradicting the definition of Γblue.

    2. (b)

      γ11Γblue and γ21Γred (γ11Γred and γ21Γblue is a symmetric case): Let n be the leaf of t1 such that 𝗅𝖺𝖻(n)=𝗋(𝒞1,,𝒞#𝗋) and 𝒞k be a set of constants that are (all) interpreted as u11, for some 1k#𝗋. Let 𝖼γ21 be the special constant such that u21=σ2(𝖼γ21), for some 12. We can assume w.l.o.g. that 𝖼γ21 is not visible in n. If this were not to be the case, then n must have been involved in a previous join of a term t3 with another term t1, such that t1 is the outcome of this join. In this case, we change the construction, by first joining t2 with t3, as in the previous case, then joining the result with t1,. Note that this is possible due of the associativity of the operation. Finally, we define t=𝖽𝖾𝖿𝖺𝗉𝗉𝖾𝗇𝖽(t1,t2,n,k,𝖼γ21).

  2. 2.

    I={1,2}, i.e., is 2-generated. By Lemma 18, either one of the following holds:

    1. (a)

      𝖲1 and 𝖲2 are of type 𝖦: let 𝖼γjikji be special constants such that uji=σj(𝖼γjikji), for all 1i,j2. Then, we define t=𝖽𝖾𝖿𝗃𝗈𝗂𝗇(t1,t2,(𝖼γ1iki)iI,(𝖼γ2iki)iI) and check that the operation is well-defined, following a similar argument as in case (1a).

    2. (b)

      𝖲1 and 𝖲2 are of type 𝖱: we can assume w.l.o.g. that uii is the interpretation of a special constant 𝖼γiiji, for some 1ji2, where γiiΓred, for both i=1,2. Let n1 be the leaf of t1 such that 𝗅𝖺𝖻(n1)=𝗋(𝒞1,,𝒞#𝗋) and u11 is the interpretation of (all) constants from 𝒞k1, for some 1k1#𝗋. Analogously, we consider n2 to be the leaf of t2 and k2 the position of the constants from its label, that are interpreted as u22. Under similar assumptions as in the case (2a), ensuring that the result is well defined, we let t=𝖽𝖾𝖿𝖿𝗈𝗋𝗀𝖾𝗍𝖼γ11j1(𝖺𝗉𝗉𝖾𝗇𝖽(t1,𝗅𝖺𝖻𝖾𝗅(t2,n2,k2,𝖼γ11j1),n1,k1,𝖼γ22j2)). The only difference with the previous case is that the indices j1 and j2 must be different to avoid name clashes, hence we require 2 special constants 𝖼γ1 and 𝖼γ2, for each color γΓred.

3.4 Tree-width Bounded Fusion-closed Sets are Context-free

This subsection completes the proof of the first point of Theorem 9. The final ingredient is the following lemma, whose proof relies on the lifting of the construction that simulates the 1- or 2-generated fusion of structures from terms to tree automata recognising sets of terms:

Lemma 20.

Let 𝐒 be a context-free set of structures conforming to some RGB color scheme. Then, the set 𝖥(𝐒) is context-free.

Proof of Theorem 9 (1).

” Since 𝐒 is a context-free set, it has bounded tree-width. If 𝖥(𝐒) has bounded tree-width, then 𝐒 conforms to an RGB color scheme, by the combined results of Lemmas 11 and 16. Because 𝐒 is context-free, we obtain that 𝖥(𝐒) is context-free, by Lemma 20. “” Because each context-free set has bounded tree-width.

3.5 Color Abstractions

This section is concerned with the proof of the second point of Theorem 9. Let 𝐒 be a context-free set of structures given by a tree automaton 𝒜 over a finite signature 𝔽. In the rest of this section, 𝐒, and 𝒜 are considered to be fixed.

Note that the equivalence between (A) and (B) follows from (A) (B) (Lemma 11), (B) (C) (Lemma 16) and (C) (A) (Lemma 17). Hence, it is sufficient to establish the decidability of the condition (B) for 𝐒. To this end, we compute the k-multiset abstraction (𝖥(𝐒))k, for an arbitrary given integer k1 (note that checking (B) requires k=3). First, we reduce the computation of (𝖥(𝐒))k to that of 𝐒k. Second, we sketch the argument behind the effective computability of 𝐒k.

The following lemma shows that, because we are interested only in k-multisets color abstractions, we can restrict fusion to 1-generated equivalence relations, while preserving the k-multiset color abstraction. We denote by 𝖥1(𝐒) the set of structures obtained by taking the closure of 𝐒 only with respect to fusions induced by 1-generated matchings.

Lemma 21.

(𝖥(𝐒))k=(𝖥1(𝐒))k for any set 𝐒 of structures and integer k1.

The set (𝖥1(𝐒))k can be computed by a least fixpoint iteration of the following abstract operation on the domain of k-multiset color abstractions. As the later domain is finite, this fixpoint computation is guaranteed to terminate.

Definition 22.

The single-pair multiset fusion is defined below, for all M1,M2mpow(Γ):

𝖿1(M1,M2)=def{Mmpow(Γ)γ1M1.γ2M2.γ1γ2=,M={{γ1γ2}}i=1,2(Mi{{γi}})}

Given an integer k1, the single-pair k-multiset fusion is defined for M1, M2mpow(Γ), such that card(M1)k and card(M2)k:

𝖿1k(M1,M2)=𝖽𝖾𝖿{M|M𝖿1(M1,M2).MM,card(M)k}

For a set of multisets (resp. k-multisets) of colors, let 𝖿1() (resp. 𝖿1k()) be the closure of under taking single-pair fusion on multisets (resp. k-multisets).

This operation is used to compute (𝖥1(𝐒))k by a iterating 𝖿1k starting with 𝐒k until a fixed point is reached. Since there are finitely many colors, the domain of multisets of colors having multiplicity at most k is finite, hence this iteration is guaranteed to compute 𝖿1k(𝐒k) in finitely many steps.

Lemma 23.

(𝖥1(𝐒))k=𝖿1k(𝐒k), for any set 𝐒 of structures and integer k1.

The last step concerns the effective computation of 𝐒k, i.e., the set of color multisets of multiplicity at most k that occur in the multiset abstraction of some structure 𝖲𝐒. We leverage from the fact that 𝐒 is a context-free set described by the set of terms that forms the language of a tree automaton 𝒜.

We assume basic acquaintance with context-free grammars, i.e., finite sets of rules of the form either qt[q1,,qk] or q, where q,q1,,qk denote nonterminals and t is a 𝔽-term with variables q1,,qk. The language (Γ) of a grammar Γ is the set of interpretations in of the terms produced by derivations starting with an axiom q. It is well known that a tree automaton can be transformed into a context-free grammar having the same language, by turning each transition (q1,,qk)𝑓q into a rule qf[q1,,qk], for k1, and adding an axiom q for each final state q of the tree automaton.

By first-order logic we understand the set of formulæ consisting of equalities between variables, relation atoms of the form 𝗋(x1,,x#𝗋), for some relation symbol 𝗋, composed via boolean operations and quantifiers. A first-order logic sentence φ (i.e., a formula without free variables) is interpreted over a structure 𝖲 by the satisfiability relation 𝖲φ, defined inductively on the structure of φ, as usual.

Over words, it is a well-known result that the non-emptiness of the intersection of a context-free set (e.g., given by a context-free grammar) and a regular set (e.g., given by a regular grammar, DFA, or a MSO formula) is decidable. This result has been generalized by Courcelle to context-free grammars over the 𝖧𝖱-algebra of structures and FO-definable sets of structures555[7, Theorem 3.6] is actually given for Monadic Second Order Logic, which subsumes first-order logic.:

Theorem 24 (Theorem 3.6 in [7]).

For each grammar Γ and first-order sentence φ, one can decide the existence of a structure 𝖲(Γ) such that 𝖲φ.

In order to compute 𝐒k, we build a first-order sentence φM for each multiset of colors M:Γ[1..k] such that 𝖲φM iff 𝖲k. As there are only finitely many such multi-sets M, we are able to construct 𝐒k by finitely many calls to the above decision procedure. We now state the details for the construction of φM. For each color γΓ, we denote by φγ(X) the formula 𝗋𝗋(x)𝗋¬𝗋(x). We then obtain φM as the conjunction of the formulæ =M(γ)x.φγ(x), if M(γ)<k, and M(γ)x.φγ(x), if M(γ)=k, for all colors γΓ. As usual, the quantifier =nx.ϕ(x) (resp. nx.ϕ(x)) means “there exists exactly n (resp. at least n) elements x that satisfy ϕ(x)”. It is now easy to verify that 𝖲φM iff M𝖲k. This concludes the proof of Theorem 9 (2).

4 Conclusions and Future Work

We have defined a non-aggregative and nondeterministic fusion operation on logical structures, that is controlled by a coloring of structures using unary relations. We study the tree-width of the closure of a context-free set under fusion. We prove that it is decidable whether the closure of a context-free set has bounded tree-width. Moreover, if this is the case, we show that the closure set is context-free as well, described by an effectively constructible grammar.

Future work involves considering more general notions of coloring, e.g., coloring functions defined by MSO-definable transductions. Moreover, we plan to investigate generalizations of Theorem 9 for other notions of width, such as generalizations of clique-width and rank-width from graphs and hypergraphs to relational structures.

References

  • [1] Emma Ahrens, Marius Bozga, Radu Iosif, and Joost-Pieter Katoen. Reasoning about distributed reconfigurable systems. Proc. ACM Program. Lang., 6(OOPSLA2):145–174, 2022. doi:10.1145/3563293.
  • [2] Cristiano Calcagno, Philippa Gardner, and Uri Zarfaty. Context logic and tree update. In Jens Palsberg and Martín Abadi, editors, Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2005, Long Beach, California, USA, January 12-14, 2005, pages 271–282. ACM, 2005. doi:10.1145/1040305.1040328.
  • [3] Cristiano Calcagno, Philippa Gardner, and Uri Zarfaty. Context logic as modal logic: completeness and parametric inexpressivity. In Martin Hofmann and Matthias Felleisen, editors, Proceedings of the 34th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2007, Nice, France, January 17-19, 2007, pages 123–134. ACM, 2007. doi:10.1145/1190216.1190236.
  • [4] Luca Cardelli, Philippa Gardner, and Giorgio Ghelli. A Spatial Logic for Querying Graphs. In Peter Widmayer, Francisco Triguero Ruiz, Rafael Morales Bueno, Matthew Hennessy, Stephan Eidenbenz, and Ricardo Conejo, editors, Proceedings of the 29th International Colloquium on Automata, Languages and Programming (ICALP’02), volume 2380 of Lecture Notes in Computer Science, pages 597–610. Springer, July 2002. doi:10.1007/3-540-45465-9_51.
  • [5] Hubert Comon, Max Dauchet, Rémi Gilleron, Florent Jacquemard, Denis Lugiez, Christof Löding, Sophie Tison, and Marc Tommasi. Tree Automata Techniques and Applications. HAL, 2008. URL: https://inria.hal.science/hal-03367725.
  • [6] Bruno Courcelle. The monadic second-order logic of graphs. I. Recognizable sets of finite graphs. Information and Computation, 85(1):12–75, 1990. doi:10.1016/0890-5401(90)90043-H.
  • [7] Bruno Courcelle. The monadic second-order logic of graphs VII: graphs as relational structures. Theor. Comput. Sci., 101(1):3–33, 1992. doi:10.1016/0304-3975(92)90148-9.
  • [8] Bruno Courcelle and Joost Engelfriet. Graph Structure and Monadic Second-Order Logic: A Language-Theoretic Approach. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2012. doi:10.1017/CBO9780511977619.
  • [9] Simon Docherty and David J. Pym. Intuitionistic layered graph logic: Semantics and proof theory. Log. Methods Comput. Sci., 14(4), 2018. doi:10.23638/LMCS-14(4:11)2018.
  • [10] Heinz-Dieter Ebbinghaus and Jörg Flum. Finite model theory. Perspectives in Mathematical Logic. Springer, 1995.
  • [11] Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2006. doi:10.1007/3-540-29953-X.
  • [12] Radu Iosif and Florian Zuleger. Expressiveness results for an inductive logic of separated relations. In Guillermo A. Pérez and Jean-François Raskin, editors, 34th International Conference on Concurrency Theory (CONCUR 2023), volume 279 of Leibniz International Proceedings in Informatics (LIPIcs), pages 20:1–20:20, Dagstuhl, Germany, 2023. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CONCUR.2023.20.
  • [13] Samin S. Ishtiaq and Peter W. O’Hearn. BI as an assertion language for mutable data structures. In Chris Hankin and Dave Schmidt, editors, Conference Record of POPL 2001: The 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, London, UK, January 17-19, 2001, pages 14–26. ACM, 2001. doi:10.1145/360204.375719.
  • [14] Viktor Kuncak and Martin Rinard. Generalized records and spatial conjunction in role logic. In Static Analysis, pages 361–376, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg. doi:10.1007/978-3-540-27864-1_26.
  • [15] Peter W. O’Hearn, John C. Reynolds, and Hongseok Yang. Local reasoning about programs that alter data structures. In Proceedings of the 15th International Workshop on Computer Science Logic, CSL ’01, pages 1–19, 2001. doi:10.1007/3-540-44802-0_1.
  • [16] David J. Pym, Peter W. O’Hearn, and Hongseok Yang. Possible worlds and resources: the semantics of bi. Theoretical Computer Science, 315(1):257–305, 2004. Mathematical Foundations of Programming Semantics. doi:10.1016/j.tcs.2003.11.020.
  • [17] Greg Restall. Substructural Logics. In Edward N. Zalta and Uri Nodelman, editors, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Fall 2024 edition, 2024.
  • [18] John C. Reynolds. Separation logic: A logic for shared mutable data structures. In 17th IEEE Symposium on Logic in Computer Science (LICS 2002), 22-25 July 2002, Copenhagen, Denmark, Proceedings, pages 55–74. IEEE Computer Society, 2002. doi:10.1109/LICS.2002.1029817.
  • [19] D. Seese. The structure of the models of decidable monadic theories of graphs. Annals of Pure and Applied Logic, 53(2):169–195, 1991. doi:10.1016/0168-0072(91)90054-P.
  • [20] P.D. Seymour and R. Thomas. Graph searching and a min-max theorem for tree-width. Journal of Combinatorial Theory, Series B, 58(1):22–33, 1993. doi:10.1006/jctb.1993.1027.