Abstract 1 Introduction 2 Preliminaries: Towards the Consistency Problem for Ontologies 3 Tree Global Constraint Automata 4 Constraint Automata for Checking Consistency 5 Extensions 6 Conclusion References

Robustness of Constraint Automata for Description Logics with Concrete Domains

Stéphane Demri ORCID Université Paris-Saclay, CNRS, ENS Paris-Saclay, LMF, 91190, Gif-Sur-Yvette, France Tianwen Gu ORCID Université Paris-Saclay, CNRS, ENS Paris-Saclay, LMF, 91190, Gif-Sur-Yvette, France
Abstract

Decidability or complexity issues about the consistency problem for description logics with concrete domains have already been analysed with tableaux-based or type elimination methods. Concrete domains in ontologies are essential to consider concrete objects and predefined relations. In this work, we expose an automata-based approach leading to the optimal upper bound ExpTime, that is designed by enriching the transitions with symbolic constraints. We show that the nonemptiness problem for such automata belongs to ExpTime if the concrete domains satisfy a few simple properties. Then, we provide a reduction from the consistency problem for ontologies, yielding ExpTime-membership. Thanks to the expressivity of constraint automata, the results are extended to additional ingredients such as inverse roles, functional role names and constraint assertions, while maintaining ExpTime-membership, which illustrates the robustness of the approach.

Keywords and phrases:
Description logics, concrete domains, constraint automata, complexity
Copyright and License:
[Uncaptioned image] © Stéphane Demri and Tianwen Gu; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Logic
Acknowledgements:
We thank the reviewers for their numerous suggestions that help us to improve the quality of the document, in particular to have spotted an error in the submitted version.
Related Version:
Full Version: https://arxiv.org/abs/2601.19644
Editors:
Stefano Guerrini and Barbara König

1 Introduction

Landmark results for description logics with concrete domains.

In description logics with concrete domains, the elements of the interpretation domain are enriched with tuples of values from the concrete domain, see e.g. [4, 25, 26, 27, 6], enabling reasoning about concrete information. The first decidability result for a large class of concrete domains 𝒟 can be traced back to [27] in which the decidability of the consistency problem with ω-admissible concrete domains is established using tableaux-based decision procedures (apart from the very first decidability result in [4, Theorem 4.3]). Though this was a genuine breakthrough in understanding 𝒜𝒞 with concrete domains, a few restrictions were required: the set of binary relations is finite; 𝒟 satisfies the patchwork property [27, Definition 3] which corresponds to the amalgamation property; and it satisfies compactness, see e.g. [8, Section 6.1] and [27, Definition 4] which corresponds to homomorphism ω-compactness. Furthermore, CD-restrictions are of the form 𝒬v1:p1,v2:p2.P1(v1,v2)Pk(v1,v2) [27, Definition 6], where 𝒬{,} and the Pi’s are binary predicates. Moreover, no ABoxes in the consistency problem are handled in [27]. Among the concrete domains captured by [27], it is worth mentioning Allen’s interval algebra and RCC8, see more details in [27, Section 2].

Though the works [27, 15] mainly focus on decidability, a recent remarkable breakthrough was achieved in [13], which shows that the knowledge base consistency problem for 𝒜𝒞 with ω-admissible concrete domains belongs to ExpTime, refining the decidability results; see also [3]. This complexity result nicely extends the type elimination algorithm for 𝒜𝒞 that leads already to ExpTime, based on the technique using elimination of Hintikka sets established in [29], see also [12, Section 6.8], [28, Section 2.3] and [30, Section 4.2].

Our motivations.

The well-known automata-based approach, see e.g. [14, 35, 36], consists of reducing logical problems to automata-based problems to take advantage of results and decision procedures from automata theory. Among the expected properties, the reduction should be conceptually simple even if showing its correctness requires some substantial work. Ideally, the complexity of the automata-based target problems should be well-understood, so that the reduction yields a tight upper bound on the complexity of the logical problem. Herein, we would like to follow an automata-based approach for reasoning in the description logics 𝒜𝒞𝒪 with concrete domains and regain the ExpTime-membership from [13] (see also the recent work [3]) based on the approach from [19, 20]. Type elimination and tableaux-based techniques are definitely worth being developed but we would like to expose an alternative approach with comprehensively developed formal methods that are equally accessible to a wide audience. Our motivations definitely rest on the use of concrete domains as the automata-based approach for description logics (without concrete domains) with automata over finite alphabets is already advocated in [2, Section 3.2], see also [5] and [7, Section 3.6.1]. This is the research agenda we wish to pursue with constraint automata, see also [19].

Our contributions.

Our first task is to introduce a class of constraint automata parameterised by concrete domains that accept infinite data trees and the scope of the constraints slightly extends the scope involved in the constraint automata in [19, 20] (Section 3). The constraints involved in the transitions have a simple and natural form (constraints between siblings are allowed) that shall lead to a forthcoming smooth reduction from the consistency problem, though technical difficulties still arise. In order to get k-ExpTime-membership of the nonemptiness problem, the concrete domains satisfy (C1.1) the completion property [18, 11] (resp. (C1.1.1) the amalgamation property and (C1.1.2) homomorphism ω-compactness [13]), (C2) the arity of predicates is bounded, (C3.k) the constraint satisfaction problem for the concrete domain 𝒟 is in k-ExpTime and (C4) equality is part of the relations. We provide a parameterised complexity analysis that is helpful to characterise then the complexity of the consistency problem for the description logic 𝒜𝒞𝒪(𝒟). More precisely, we design a reduction to the nonemptiness problem for Büchi tree automata, known to be solved in quadratic time, see e.g. [16]. Along the paper, we are particularly interested in determining which conditions are really needed for each part of our investigations.

Then, we design a reduction from the consistency problem for 𝒜𝒞𝒪(𝒟) with concrete domains 𝒟 satisfying the above conditions into the nonemptiness problem for constraint automata parameterised by 𝒟, leading to ExpTime-membership if (C3.1) holds, which is known to be optimal, see e.g. [13, 3]. We mix standard ingredients to translate modal/temporal/description logics into tree automata with a quite natural handling of CD-restrictions. A few complications arise to handle nominals (see e.g. [34] and the notion of guess in [32, Definition 6] about the μ-calculus with converse, universality modality and nominals), and the fact that concrete features admit partial interpretations. It is notable that we allow slightly more general classes of concrete domains than those in [13] (see Section 2.1 for a brief discussion about the differences). By way of example, we allow more general concrete domain restrictions and finiteness of the set of relations is not always required. Furthermore, we are a bit more liberal with the set of CD-restrictions, which allows us to get rid of the condition JEPD (jointly exhaustive pairwise disjoint), see e.g. [13]. Often, we introduce notions that have counterparts in [13, 3] but tailored to our approach.

Constraint automata allow us to add ingredients and to adapt smoothly the translation, for instance by adding inverse roles while giving up the nominals (as hinted possible in [13, Section 5]), functional role names and predicate assertions (see Section 2.3 and Section 5). The extension with inverse roles, though technically challenging, perfectly illustrates the versatility of our approach. In short, we revisit the complexity results about description logics with concrete domains from [13, 3] by advocating an automata-based approach, while slightly refining some results (use of the completion property) and establishing a few new ones (inverse roles). A forthcoming arXiv report shall contain the missing proofs.

2 Preliminaries: Towards the Consistency Problem for Ontologies

In this section, we introduce the concrete domains 𝒟 and the few assumptions involved in this document. Then, we define the description logics 𝒜𝒞𝒪(𝒟) parameterised by 𝒟.

2.1 Concrete Domains Under Scrutiny

A concrete domain is a relational structure 𝒟=(𝔻,P1𝔻,P2𝔻,), where 𝔻 is a nonempty set and each Pi𝔻𝔻ki interprets a predicate symbol Pi of arity ki. Elements of 𝔻 are written 𝕕0,𝕕1,. Let VAR={v1,v2,} be a countably infinite set of variables. A (Boolean) constraint is an expression following the grammar below:

Θ::=P(v1,,vk)¬ΘΘΘΘΘ,

with P a predicate symbol of arity k and v1,,vkVAR (repetitions allowed). A valuation is a total function 𝔳:VAR𝔻. Satisfaction of atomic formulas is defined by 𝔳P(v1,,vk) def (𝔳(v1),,𝔳(vk))P𝔻, and extended to Boolean combinations in the usual way. A constraint system S is a set of literals of the form either P(v1,,vk) or ¬P(v1,,vk). We write VAR(S) to denote the set of variables occurring in S. Given XVAR, we write S|X to denote the subset of S made of literals using only variables from the set X. A valuation 𝔳:VAR𝔻 satisfies the system S, written 𝔳S, if it satisfies every literal in S. The constraint satisfaction problem for 𝒟, denoted CSP(𝒟), takes as an input a finite constraint system S and asks whether there is a valuation 𝔳 such that 𝔳S.

We assume that the concrete domains satisfy a few properties to guarantee at least decidability of the logical decision problems, and k-ExpTime-membership if possible. The concrete domain 𝒟 satisfies the completion property def for every finite constraint system S and for every valuation 𝔳:X𝔻 with XVAR(S), if 𝔳 satisfies the restriction S|X and S is satisfiable, then there is an extension 𝔳:VAR(S)𝔻 such that 𝔳|X=𝔳 and 𝔳S.

A constraint system S is complete with respect to some set 𝒫 of predicate symbols and some set X of variables def no other predicate symbol or variable occurs in S and for all P𝒫 of arity k and v1,,vkX, we have either P(v1,,vk)S or ¬P(v1,,vk)S (but not both). A concrete domain 𝒟 satisfies the amalgamation property def

  • either 𝒟 has a finite set 𝒫𝒟 of predicate symbols and for all finite constraint systems S, S such that S|VAR(S)VAR(S)=S|VAR(S)VAR(S) and S|VAR(S)VAR(S) is complete w.r.t. 𝒫𝒟 and VAR(S)VAR(S), we have () S and S are satisfiable (separately) iff SS is satisfiable,

  • or 𝒟 has an infinite set of predicate symbols and for all finite sets 𝒫𝒫𝒟, for all finite constraint systems S, S built over 𝒫 such that S|VAR(S)VAR(S)=S|VAR(S)VAR(S) and S|VAR(S)VAR(S) is complete w.r.t. 𝒫 and VAR(S)VAR(S), we have ().

Similar definitions about amalgamation can be found in [27, 31, 9, 13, 3]. Note that the involved constraint systems are finite, and unlike [27, 31], we do not assume completeness of the constraint systems, see also the related notion of symbolic type in Section 3.2. Finally, 𝒟 is homomorphism ω-compact (see e.g. [13]) def if every finite SS of a countable constraint system S is satisfiable, then S itself is satisfiable.

We are ready to define the main properties satisfied by concrete domains involved herein.

(C1)

The concrete domain 𝒟 satisfies the completion property or (it satisfies the amalgamation property and is homomorphism ω-compact).        (from local to global)

(C2)

There is k0 such that all the relations in 𝒟 have arity bounded by k0. (bounded arity)

(C3.k)

CSP(𝒟) is in k-ExpTime (k1). (k-ExpTime CSP problem)

(C4)

𝒟 contains the equality relation. (equality in the concrete domain)

The condition (C4) is a bit less general than the condition JD (jointly diagonal) from [13, 9] but simpler to check. The completion property has been already considered in the literature [10, 18, 11], sometimes under the term “global consistency”, see e.g. [17, Definition 2.3] and [27]. By way of example, concrete domains satisfying the completion property include (,<,=), (,<,=) and (𝔻,=) for some infinite set 𝔻. The concrete domain RCC8 with space regions in 2 equipped with topological relations between spatial regions [37] also satisfies the completion property, see e.g. [27, Section 2.1] and [18]. This applies also to the Allen’s temporal concrete domain 𝒟A=(I,(Ri)i[1,13]), where I is the set of closed intervals [r,r] and (Ri)i[1,13] is the family of 13 Allen’s relations [1], see also [27, Section 2]. Observe that RCC8 and Allen’s interval algebra also satisfy (C2), (C3.1) and (C4), see e.g. [27, Section 2]. The completion property being one alternative in the condition (C1) is a natural condition that happens to be sufficient for our needs. However, this excludes (,<,=) that is handled in [15, 23, 22, 20]. A simple property can be already established.

Lemma 1.

If 𝒟 satisfies the completion prop., then it satisfies the amalgamation prop.

The converse implication is not known to hold in general; that is, whether the amalgamation property (or some variant) implies the completion property remains open. More broadly, the exact relationships between the completion property, the amalgamation property, and ω-compactness are not fully understood. By contrast with [13, 3], observe that the satisfaction of (C1), (C2), (C3.1) and (C4) allows the concrete domains to have an infinite set of relations as in (,<,(=q)q). The condition JEPD from [13] is not required in our framework and (C4) implies JD (jointly diagonal) from [13]. Unlike the completion property in (C1), the second disjunct of (C1) is often assumed in the literature, see e.g. [27, 31] and the ExpTime-ω-admissible concrete domains in [13, 3].

2.2 Reasoning about Ontologies with Concrete Objects

In this section, we define 𝒜𝒞𝒪(𝒟) (over the concrete domain 𝒟) defined as the description logic 𝒜𝒞(𝒟) from [13, Section 2] (see also [3]) except that the assumptions about 𝒟 may differ and the concrete domain restrictions (a.k.a. CD-restrictions) are defined in a slightly more general way. Let 𝖭𝐂={A,B,}, 𝖭𝐑={r,s,}, 𝖭𝐈={a,b,c,}, and 𝖭𝐂𝐅={f0,f1,} be countable sets of concept names, role names, individual names and concrete features, respectively. Following the assumptions from [13], a role path p is either a concrete feature f or the concatenation rf of a role name r and a concrete feature f. The set of complex concepts in 𝒜𝒞𝒪(𝒟) is defined as follows.

C A{a}¬ACCCCr.Cr.C
v1:p1,,vk:pk.Θ(v1,,vk)v1:p1,,vk:pk.Θ(v1,,vk),

where A𝖭𝐂, a𝖭𝐈, r𝖭𝐑, and Θ(v1,,vk) is a 𝒟-constraint built over the variables v1,,vk (not necessarily an atomic formula). The variables vi bind values retrieved via the paths pi. The expression {a} is known as a nominal and is interpreted by a singleton, namely the one containing the interpretation of the individual name a. For instance, let 𝒟=(,<,=), and suppose 𝖯𝖺𝗍𝗂𝖾𝗇𝗍𝖭𝐂, 𝖺𝗀𝖾𝖭𝐂𝐅, and 𝗁𝖺𝗌𝖡𝗋𝗈𝗍𝗁𝖾𝗋𝖭𝐑. The concept 𝖯𝖺𝗍𝗂𝖾𝗇𝗍v1:𝖺𝗀𝖾.v1<18 describes patients younger than 18, while v1:𝖺𝗀𝖾,v2:𝗁𝖺𝗌𝖡𝗋𝗈𝗍𝗁𝖾𝗋𝖺𝗀𝖾.v2<v1 captures individuals older than at least one of their brothers.

A general concept inclusion (GCI) is an expression CD, where C, D are concepts. A role assertion (resp. concept assertion) is an expression (a,b):r (resp. a:C). An ontology is a pair 𝒪=(𝒯,𝒜), where 𝒯 is a finite set of GCIs and 𝒜 is a finite set of assertions. As we shall provide complexity analyses, let us briefly describe how the size of ontologies is defined. The size of 𝒪, written size(𝒪), is defined as the sum of the size of 𝒯, written size(𝒯), and the size of 𝒜, written size(𝒜). The size of a TBox is the sum of the sizes of its GCIs, whereas the size of an ABox is defined as the sum of the sizes of its assertions. Assuming that the size of a concept C, written size(C), is defined as the sum between the number of its subconcepts and the number of its subconstraints occurring in CD-restrictions, the size of a GCI is the sum of the size of its two concepts. Moreover, the size of a role assertion is defined as one, and the size of a concept assertion is defined as the size of its concept.

An interpretation =(Δ,) consists of a nonempty (interpretation) domain Δ (its elements are written d,e,) and an interpretation function such that

  • AΔ for all A𝖭𝐂; rΔ×Δ for all r𝖭𝐑, aΔ for all a𝖭𝐈.

  • f:Δ𝔻 is a partial function, for all f𝖭𝐂𝐅.

  • Given dΔ, p(d) denotes either {f(d)} for p=f or {f(e)(d,e)r} for p=rf. Such sets might be empty in case f is undefined for the involved elements of Δ.

The semantics below is standard, except CD-restrictions admit arbitrary constraints, see also [31, Definition 1].

=defΔ,=def,{a}=def{a},(¬A)=defΔA,(CD)=defCD,(CD)=CD,

r.C=def{dΔe:(d,e)r,eC},r.C=def{dΔe:(d,e)reC},

(v1:p1,,vk:pk.Θ)=def{dΔ|(𝕕1,,𝕕k)pj(d):[v1𝕕1,,vk𝕕k]Θ},

(v1:p1,,vk:pk.Θ)=def{dΔ|(𝕕1,,𝕕k)pj(d):[v1𝕕1,,vk𝕕k]Θ}.

An interpretation satisfies the ontology 𝒪=(𝒯,𝒜) (written 𝒪) def CD for all CD𝒯, aC for all a:C𝒜 and (a,b)r for all (a,b):r𝒜. The consistency problem for 𝒜𝒞𝒪(𝒟) consists in determining for some ontology whether there exists some interpretation that satisfies it. Recall that a:C (resp. (a,b):r) can be encoded by {a}C (resp. {a}r.{b}). Herein, we design an automata-based decision procedure for solving the fundamental consistency problem of 𝒜𝒞𝒪 with a concrete domain 𝒟 satisfying the conditions (C1), (C2), (C3.1), (C4), and get ExpTime-membership. The consistency problem for 𝒜𝒞 (without concrete domains) can take advantage of the finite model property, see e.g. [6, Section 4.2], but this property does not hold for 𝒜𝒞𝒪(𝒟). For instance, within 𝒜𝒞𝒪(,<), the GCI (v1:f1,v2:rf1.v1<v2) can be satisfied only by interpretations with infinite domains Δ. As usual, we write 𝒜𝒞(𝒟) to denote 𝒜𝒞𝒪(𝒟) without nominals. We find it natural to handle nominals in the language of complex concepts as ontologies admit ABoxes (see also [3]). However, there is a price to pay. It is challenging to handle nominals with an automata-based approach as only one node of the accepted trees should correspond to the interpretation of each nominal. Furthermore, infinite interpretations (see the example above for 𝒜𝒞𝒪(,<)) and satisfying r.{a} lead to interpretations such that a has an infinite amount of incoming r-edges.

Given a set X of individual names, an interpretation satisfies the unique name assumption (UNA) w.r.t. X iff for all distinct individual names aaX, we have aa. Below, we state a standard result that allows us to deal with interpretations satisfying (UNA) only, as we are interested in complexity classes equal or above ExpTime. Indeed, suppose that 𝒪 and be the equivalence relation on {a0,,aη1} (individual names in 𝒪) such that ab iff a=b. We write [a] to denote the equivalence class of a. We can build a new ontology 𝒪=(𝒯,𝒜) as the quotient of 𝒪 by the equivalence relation : in each concept, the nominal {a} is replaced by {[a]} and each individual name a in assertions is replaced by [a]. One can show that 𝒪 is consistent iff there is some equivalence relation on {a0,,aη1} such that 𝒪 can be satisfied by an interpretation satisfying (UNA) w.r.t. {[aj]j[0,η1]}. The equivalence classes [aj] are viewed as new individual names.

2.3 Beyond 𝒜𝒞𝒪 with Concrete Domains

We briefly present several ingredients that can be found in description logics, see e.g. [6]. Section 5 is dedicated to showing how the approach developed in Section 4 can be adapted. We write 𝒜𝒞(𝒟) to denote the extension of 𝒜𝒞(𝒟) (without nominals) with inverse roles. The roles are either r or r where r𝖭𝐑 and (r) is equal to the converse of r. Herein, inverse roles can be found in restrictions as in r.C and in CD-restrictions as in v1:f1,v2:rf2.Θ(v1,v2). We write 𝒜𝒞𝒪(𝒟) to denote the extension of 𝒜𝒞𝒪(𝒟) with functional role names, see e.g. [24]. We introduce the set 𝖭𝐅𝐑 of functional role names (a.k.a. abstract features) and the roles in 𝒜𝒞𝒪(𝒟) are the role names from 𝖭𝐑𝖭𝐅𝐑. If r𝖭𝐅𝐑 and is an interpretation of 𝒜𝒞𝒪(𝒟), then r is weakly functional. Functional role names can be used anywhere. We may also enrich the set of assertions by allowing in ontologies, expressions Θ(f1(a1),,fk(ak)) stating constraints between the concrete feature values of a1,,ak (predicate assertions are usually restricted to atomic constraints).

3 Tree Global Constraint Automata

In this section, we introduce a class of tree constraint automata parameterised by concrete domains 𝒟 that accept infinite data trees in which every node is labelled by a finite tuple of data values from 𝔻. This is slightly more general than the automata models from [19], not only because the definition does not assume a fixed concrete domain but more importantly because constraints between siblings are allowed unlike what is done in [19, 20].

3.1 Definitions

Unless otherwise stated, in the rest of this section, we assume that the concrete domain 𝒟 satisfies the conditions (C1) and (C3.k) for some k1 (meaning that (C2) and (C4) are not needed). We introduce the class of tree global constraint automata that accept sets of labelled trees of the form 𝕥:[0,𝚍1]𝚺×𝔻β for some finite alphabet 𝚺, for some 𝚍1 and β0. The transition relation of such automata expresses constraints between the β values at a node and the values at its children. The automaton is equipped with a Büchi acceptance condition. Formally, a tree global constraint automaton (in short, TGCA) over 𝒟 is a tuple 𝔸=(Q,𝚺,𝚍,β,Qin,δ,F), where:

  • Q is a finite set of locations, QinQ are initial locations, FQ are accepting locations, 𝚺 is a finite alphabet;

  • 𝚍1 is the branching degree, and β the number of registers (a.k.a. variables);

  • δQ×𝚺×Cons(β,𝚍)×Q𝚍 is the transition relation where Cons(β,𝚍) denotes the set of 𝒟-constraints built over the distinguished registers in REG(β,𝚍)=def{x1,,xβ}{xji1jβ, 0i<𝚍}. The expression xji refers to the jth register of the ith child. Arbitrary registers in REG(β,𝚍) are also referred to by y0,y1,y2,.

Runs.

Let 𝕥:[0,𝚍1]𝚺×𝔻β be a data tree, with 𝕥(n)=(𝚊n,zn) at each node n. A run of 𝔸 on 𝕥 is a mapping ρ:[0,𝚍1]δ that maps each node n to a transition (qn,𝚊n,Θn,qn0,,qn(𝚍1)) such that:

  1. (i)

    The source location of ρ(ni) is qni for all 0i<𝚍.

  2. (ii)

    Let 𝔳n:REG(β,𝚍)𝔻 be the valuation defined by 𝔳n(xj)=zn[j] and 𝔳n(xji)=zni[j], for all 1jβ, 0i<𝚍. Then 𝔳nΘn.

  3. (iii)

    The source location of ρ(ε) is in Qin.

Given a path π=j1j2 in ρ starting from ε, we define inf(ρ,π) to be the set of locations that appear infinitely often as the source locations of the transitions in ρ(ε)ρ(j1)ρ(j2) A run ρ is accepting if all paths π in ρ starting from ε, we have inf(ρ,π)F. We write L(𝔸) to denote the set of data trees 𝕥 for which there exists some accepting run of 𝔸 on 𝕥. The nonemptiness problem for TGCA, written NE(TGCA), asks whether L(𝔸) for some TGCA 𝔸. Herein, we use Büchi acceptance conditions but such conditions play no essential role in our investigations because we shall always assume that F=Q (as in looping tree automata, see e.g. [7, Section 3.6.1]).

3.2 Reduction to Nonemptiness for Büchi Tree Automata

To analyse the complexity of NE(TGCA), we construct a reduction to the nonemptiness problem for Büchi tree automata (BTA), known to be in PTime [35]. Büchi tree automata are TGCA with β=0 and no 𝒟-constraints. Given 𝔸=(Q,𝚺,𝚍,β,Qin,δ,F), we write 𝒫𝔸={P1,,Pm} to denote either the full set of predicate symbols of 𝒟 if finite, or the finite set of predicate symbols from 𝒟 occurring in 𝔸.

Symbolic types.

We abstract concrete values by symbolic types, following the standard notion of types in first-order languages, see e.g. [30, Section 3.1] and [33]. These are sets of literals that capture all atomic constraints relevant to the current node and its children. A forthcoming notion of projection compatibility ensures the coherence between parent and child types across the tree. Let 𝖠𝗍𝗈𝗆𝗌(𝔸) be the set of atomic formulae of the form P(y1,,yk) for some y1,,ykREG(β,𝚍) and P𝒫𝔸. Similarly, let 𝖫𝗂𝗍𝖾𝗋𝖺𝗅𝗌(𝔸) be the set of all literals over 𝖠𝗍𝗈𝗆𝗌(𝔸), i.e., atomic formulae or their negations. A symbolic type τ is a set τ𝖫𝗂𝗍𝖾𝗋𝖺𝗅𝗌(𝔸) s.t. for every Θ𝖠𝗍𝗈𝗆𝗌(𝔸), either Θτ or ¬Θτ, but not both. Let 𝖲𝖳𝗒𝗉𝖾𝗌(𝔸) denote the set of all symbolic types. Symbolic types can be understood as constraints made of conjunctions of its elements built over the registers in REG(β,𝚍). A symbolic type τ is satisfiable if there is 𝔳:REG(β,𝚍)𝔻 such that 𝔳τ. Let 𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸) denote the set of all satisfiable symbolic types. The properties of satisfiable types that we mainly use are stated in Lemma 2 where 𝒟τ(z,z0,,z𝚍1) means that 𝔳ΘτΘ for the valuation 𝔳 such that 𝔳(xj)=z[j] and 𝔳(xji)=zi[j], for all 1jβ, 0i<𝚍.

Lemma 2 (Three properties about types).

(I) Let z,z0,,z𝚍1𝔻β. There is a unique satisfiable type τ𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸) such that 𝒟τ(z,z0,,z𝚍1). (II) For every constraint Θ built over the predicate symbols in 𝒫𝔸 and the registers REG(β,𝚍), there is a disjunction τ1τγ logically equivalent to Θ and each τi belongs to 𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸) (empty disjunction stands for ). (III) For all distinct types ττ𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸), ττ is not satisfiable.

The proof of Lemma 2 is by an easy verification. We write τΘ to denote that for all valuations 𝔳, if 𝔳τ, then 𝔳Θ. If τ𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸) and Θ satisfies the properties of Lemma 2(II), then τΘ can be checked in polynomial time. Let τ,τ0,,τ𝚍1𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸). We say that τ is projection-compatible with (τ0,,τ𝚍1) def for each i<𝚍 and every predicate symbol P𝒫𝔸 of arity k, and for all tuples (j1,,jk)[1,β]k, we have: P(xj1i,,xjki)τ iff P(xj1,,xjk)τi. We recall that in a type τ, registers associated to the values of the children nodes are also present. Hence, projection-compatibility simply reflects the fact that the constraints for such registers in τ must be compatible with the constraints of the registers in the children types.

Construction of Büchi tree automata.

We now define the Büchi tree automaton 𝔹=(Q,𝚺,δ,𝚍,Q𝗂𝗇,F) that simulates the TGCA 𝔸 using symbolic types. The basic idea consists in abstracting data values in 𝔻 by satisfiable types from 𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸). Moreover, the type at a node needs to be compatible with the children’s types since a type expresses constraints between the values at a node and those at its children. This is where projection compatibility plays a crucial role. In this way, an infinite tree accepted by the Büchi tree automaton can be viewed as an infinite constraint system (each node has its own variables coming from β local registers) for which satisfiability is required. Generally, additional conditions are needed to guarantee satisfiability, see e.g. [23, 22, 19]. In the case of concrete domains satisfying the condition (C1), importantly, this is all we need as shown below. First, let us define the Büchi tree automaton 𝔹 with the remarkably simple construction below.

  • Q=defQ×𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸), Qin=defQin×𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸), F=defF×𝖲𝖺𝗍𝖲𝖳𝗒𝗉𝖾𝗌(𝔸),

  • δ contains the transition ((q,τ),𝚊,(q0,τ0),,(q𝚍1,τ𝚍1)) if (a) (q,𝚊,Θ,q0,,q𝚍1)δ, (b) τΘ and (c) τ is projection-compatible with (τ0,,τ𝚍1).

To build effectively 𝔹 from 𝔸, we only need the decidability of CSP(𝒟) to determine the satisfiable symbolic types. We can establish that this construction preserves nonemptiness. Again, apart from the decidability of CSP(𝒟), only the condition (C1) is required.

Lemma 3 (Soundness).

L(𝔸) implies L(𝔹).

The proof does not require much about the concrete domain 𝒟 since it mainly rests on the construction of satisfiable types from tuples made of data values in 𝔻.

Lemma 4 (Completeness).

L(𝔹) implies L(𝔸).

In its proof, essential for our investigations, each disjunct of (C1) leads to a way to construct the valuation 𝔳 on which the data values for the data trees accepted by 𝔸 are defined. If 𝒟 satisfies the completion property, then the valuation 𝔳 is built incrementally while visiting [0,𝚍1] with a breadth-first search. By contrast, assuming the second disjunct of (C1) allows us to define an infinite chain of constraint systems (thanks to the amalgamation property) and then to obtain the existence of 𝔳 as 𝒟 is homomorphism ω-compact. Today, it is unclear to us what are the exact relationships between these two options; it is not excluded that model-theoretical developments from [31, Chapter 4] could be helpful but we were unable to draw conclusive arguments.

It remains to perform a parameterised analysis of the complexity of the nonemptiness problem for TGCA. We warn the reader that the analysis is rather standard but this needs to be performed with care. Assuming that CSP(𝒟) can be solved in time 𝒪(2𝚙1(n)n) and the nonemptiness problem for Büchi tree automata can be solved in time 𝒪(𝚙2(n)n) for some polynomials 𝚙1 and 𝚙2, the nonemptiness problem for TGCA can be solved in time

𝒪(2𝚙1(m×(β(𝚍+1))k0)+𝚙2(|Q|𝚍+1×|𝚺|×2(𝚍+1)×m×(β(𝚍+1))k0)|δ|(𝙼𝙲𝚂(𝔸)+𝚍×m×(β(𝚍+1))k0)),

where k0 is the maximal arity of predicates in 𝔸 and 𝙼𝙲𝚂(𝔸) is the maximal size of constraints occurring in 𝔸. The use of the polynomials (𝚙i(n)n)’s is convenient for the above expression.

Theorem 5.

Let 𝒟 be a concrete domain satisfying (C1), (C2) and (C3.1). The nonemptiness problem for TGCA over 𝒟 is in ExpTime.

As a slight digression, ExpTime-hardness already holds with the concrete domain (𝔻,=) with infinite domain 𝔻 as a consequence of [20, Appendix B.1]. If we assume (C3.k) for some k2 instead of (C3.1) in Theorem 5, we get k-ExpTime-membership.

4 Constraint Automata for Checking Consistency

Below, we reduce the consistency problem for 𝒜𝒞𝒪(𝒟) to the nonemptiness problem for TGCA over 𝒟. Correctness of the reduction only requires that the concrete domain 𝒟 satisfies the condition (C4). As shown in Section 3, the translation from TGCA to BTA requires the satisfaction of (C1) and CSP(𝒟) is decidable. In order to get ExpTime-membership of the consistency problem for 𝒜𝒞𝒪(𝒟), the condition (C2) and (C3.1) are further assumed.

4.1 Preliminaries

Given an ontology 𝒪=(𝒯,𝒜), we shall build a TGCA 𝔸=(Q,𝚺,𝚍,β,Qin,δ,F) over 𝒟, such that 𝒪 is consistent iff L(𝔸). Below, we introduce the notions of concept types (similar to Hintikka sets), and (local/global/contextual) abstractions that are finite pieces of information about elements in some interpretation domain Δ. The construction of 𝔸 is finalised in Section 4.2 and the correctness is postponed to Section 4.3.

Normalisation.

A concept is in negation normal form (NNF) if the negation occurs only in front of concept names or nominals. Every concept is logically equivalent to a concept in NNF. In particular, ¬(𝒬v1:p1,,vk:pk.Θ)𝒬¯v1:p1,,vk:pk.¬Θ, where 𝒬{,}, ¯= and ¯=. This is valid since the constraints in CD-restrictions are closed under negations. Also, every GCI CD is equivalent to ¬CD. Hence, we assume a fixed ontology 𝒪=(𝒯,𝒜) such that all the concepts are in NNF and the GCIs are of the form C. This normalisation is standard and harmless for complexity. We write 𝖲𝗎𝖻(𝒪) to denote the set of subconcepts occurring in the ontology 𝒪 augmented with the concepts {a} and ¬{a} for all individual names a occurring in 𝒪 in assertions and nominals (details omitted). We introduce the following sets: 𝖭𝐈(𝒪)=def{a0,,aη1} is the set of individual names occurring in assertions/nominals in 𝒪, 𝖭𝐂𝐅(𝒪)=def{f1,,fα} is the set of concrete features in 𝖲𝗎𝖻(𝒪) and 𝖭𝐑(𝒪)=def{r1,,rκ} the set of role names in 𝖲𝗎𝖻(𝒪).

Abstractions.

Consistency of the ontology 𝒪 asks for the existence of an interpretation such that 𝒪. In the automata-based approach, this reduces to the existence of an infinite data tree 𝕥 such that 𝕥L(𝔸). Such a witness tree 𝕥 represents an interpretation (nodes in [0,𝚍1] roughly correspond to elements of Δ), so L(𝔸) is designed to accept such representations. An accepting run on 𝕥 thus ensures that 𝕥, viewed as an 𝒜𝒞𝒪(𝒟) interpretation, satisfies 𝒪. In an accepting run ρ on 𝕥, each node n is labelled by a transition (q,𝚊,Θ,q0,,q𝚍1), where the location q contains some finite symbolic abstraction about the node n viewed as an element from the interpretation domain. This is where local, global and contextual abstractions enter into play.

Given an element dΔ for some interpretation , the local abstraction of the element d in , written locabs(d,), is a triple (T,sl,act) defined as follows.

  • T is the concept type and T=def{C𝖲𝗎𝖻(𝒪)dC},

  • act is the activity vector in {,}α s.t. act[i]= def fi(d) is defined, for all i[1,α],

  • sl is the set of symbolic links in 𝖭𝐑(𝒪)×𝖭𝐈(𝒪) such that (r,a)sl def (d,a)r.

It is not immediate that this notion of local abstraction suffices, but we shall show that it does. Of course, in the forthcoming TGCA 𝔸, we also make use of the registers to encode the concrete features from interpretations of 𝒜𝒞𝒪(𝒟). By contrast, the global abstractions act as persistent memories relevant for the whole interpretation. Unsurprisingly, they assign to each individual name in 𝖭𝐈(𝒪) a local abstraction; due to a technical reason, we can restrict these to just the concept type and activity vector. Since runs of the automaton 𝔸 cannot revisit nodes, the global abstraction enables to recover the local abstraction of the individual names from 𝖭𝐈(𝒪) from any node. We use a standard notion of types to represent locally consistent sets of subconcepts, similarly to Hintikka sets to prove completeness of propositional logic for tableaux-style calculi, see also [6, Section 5.1.2]. The set of concept types, denoted 𝖢𝖳𝗒𝗉𝖾𝗌(𝒪), consists of all subsets T𝖲𝗎𝖻(𝒪) such that

  1. 1.

    T; for all concept names A𝖢𝖳𝗒𝗉𝖾𝗌(𝒪), if AT then ¬AT,

  2. 2.

    for all j[0,η1], either {aj}T or ¬{aj}T (but never both),

  3. 3.

    if CDT, then CT and DT; if CDT, then CT or DT,

  4. 4.

    for every GCI C𝒯, we have CT.

An anonymous concept type T is such that {¬{a0},,¬{aη1}}T. An a-type T is such that {a}T and for all {b} with b𝖭𝐈(𝒪) distinct from a, we have ¬{b}T (see e.g. the named states in [21, Section 5] and the named types in [13, Section 4]).

An activity vector act is a tuple in {,}α where the i-th component act[i] indicates whether the concrete feature fi (from 𝖭𝐂𝐅(𝒪)) is defined () or not (). The set of activity vectors is written 𝖠𝖢𝖳. Since the interpretation of each concrete feature is a partial function, only the registers corresponding to positions where act[i]= represent meaningful feature values. While the data tree assigns values to all registers uniformly, entries with act[i]= are ignored when encoding 𝒜𝒞𝒪(𝒟) interpretations. A symbolic link is a pair (r,a)𝖭𝐑(𝒪)×𝖭𝐈(𝒪) intended to represent that the current element is related to the interpretation of a via the interpretation of r. The set 𝖲𝖫𝗂𝗇𝗄𝗌 of symbolic link sets is the powerset 𝒫(𝖭𝐑(𝒪)×𝖭𝐈(𝒪)). A local abstraction 𝖫𝖠 is a triple in 𝖢𝖳𝗒𝗉𝖾𝗌(𝒪)×𝖲𝖫𝗂𝗇𝗄𝗌×𝖠𝖢𝖳. A global abstraction 𝖦𝖠 is a function 𝖦𝖠:𝖭𝐈(𝒪)𝖢𝖳𝗒𝗉𝖾𝗌(𝒪)×𝖠𝖢𝖳. A location of the forthcoming automaton 𝔸 is structured in such a way that it contains a local abstraction 𝖫𝖠 (related to the associated node) and a global abstraction 𝖦𝖠 that records a fixed knowledge about the individual names. Such pairs are called contextual abstractions 𝖢𝖠=(𝖦𝖠,𝖫𝖠).

Branching degree 𝚍.

We need to determine a branching degree 𝚍 that allows us to satisfy all kinds of existential restrictions. To do so, we introduce the following values.

  • N𝖾𝗑=def|{r.Cr.C𝖲𝗎𝖻(𝒪)}|.

  • N𝖼𝖽=def|{v1:p1,,vk:pk.Θv1:p1,,vk:pk.Θ𝖲𝗎𝖻(𝒪)}|.

  • N𝗏𝖺𝗋=defmax({0}{kv1:p1,,vk:pk.Θ𝖲𝗎𝖻(𝒪)}).

The rationale to define 𝚍 is based on the following requirements. For each r.C𝖲𝗎𝖻(𝒪), one needs one direction to have a witness individual. Similarly, for each v1:p1,,vk:pk.Θ𝖲𝗎𝖻(𝒪), k directions are sufficient to get k witness values in 𝔻. Finally, the tree interpretation property for 𝒜𝒞𝒪(𝒟) is based on the representation of a forest such that each of its trees corresponds to the unfolding of the interpretation from some individual name in 𝖭𝐈(𝒪). As this is encoded at the root of the data tree accepted by 𝔸, the degree should be also at least |𝖭𝐈(𝒪)|. So, it makes sense to define 𝚍 as the value max{N𝖾𝗑+N𝖼𝖽×N𝗏𝖺𝗋,|𝖭𝐈(𝒪)|}; the soundness proof of Lemma 6 shall confirm this formally.

Correspondence between directions, role names and restrictions.

Once the degree 𝚍 is defined, one needs to establish a discipline to put in correspondence directions in [0,𝚍1], role names and restrictions. To do so, below, we define the maps λ and dir. Indeed, a standard approach consists in booking directions in [0,𝚍1] for each role name in 𝖭𝐑(𝒪). The map λ takes care of assigning directions to existential restrictions and to role paths in existential CD-restrictions. By contrast, the map dir is uniquely determined by λ and returns a set of directions among [0,𝚍1] to each role name. Such a discipline is required to construct the transitions in 𝔸 as one needs to agree on the directions that are relevant for getting witnesses of a given restriction. We introduce the following sets.

  • r=def{ri1.C1,,riN𝖾𝗑.CN𝖾𝗑} contains all the existential restrictions from 𝖲𝗎𝖻(𝒪).

  • 𝖼𝖽=def{C1,,CN𝖼𝖽} contains all the existential CD-restrictions from 𝖲𝗎𝖻(𝒪).

We define the labeling domain as: =defr{(Ci,j)1iN𝖼𝖽, 1jN𝗏𝖺𝗋}. The global labeling function λ:[0,𝚍1], that is designed as a bijection, is defined as:

  • λ(rik.Ck)=defk1, for 1kN𝖾𝗑,

  • λ(Ci,j)=defN𝖾𝗑+(i1)×N𝗏𝖺𝗋+(j1), for 1iN𝖼𝖽, 1jN𝗏𝖺𝗋.

The definition of the map dir is provided below. Let r𝖭𝐑(𝒪).

dir(r)=def{i[0,𝚍1]|λ1(i)=r.C for some C, orλ1(i)=(C,j) where C=v1:p1,,vk:pk.Θ,with 1jk and pj=rf}.

Dimension 𝜷 and conventions about registers.

The forthcoming automaton 𝔸 accepts infinite data trees 𝕥:[0,𝚍1]𝚺×𝔻β and it remains to define β before moving to the final steps of the construction. Since 𝖭𝐂𝐅(𝒪) contains α concrete features, we require at least α registers. Though global abstractions are persistent memories, this is not sufficient to handle the values returned by the concrete features for the individual names. This is useful to consider the satisfaction of CD-restrictions requiring as witness individuals those interpreting some individual name in 𝖭𝐈(𝒪). That is why, we augment the dimension by |𝖭𝐈(𝒪)|×α so that it is possible to carry all over the runs the values of the concrete features for such individuals. Consequently, we set β=def(η+1)×α. To use the registers in REG(β,𝚍), we follow conventions that happen to be handy and that diverge slightly from the general notations introduced in Section 3. The distinguished registers x1,,xβ referring to values at the current node shall be represented by x1,,xα,xa0,1,,xa0,α,,xaη1,1,,xaη1,α. Hence, implicitly xj refers to the value of the concrete feature fj, if any. Similarly, xa,j refers to the value of fj for the interpretation of the individual name a. Furthermore, we use xji (resp. xa,ji) to refer to the value of fj for the ith child (resp. for the individual name a). However, by construction of 𝔸, we require that xa,j is enforced to be equal to xa,ji as such registers are dedicated to store global values.

4.2 Definition of the Constraint Automaton 𝔸

Now, we define the TGCA 𝔸=(Q,𝚺,𝚍,β,Qin,δ,F). Each non-root node n[0,𝚍1]+ in a run of the automaton may represent an element of the interpretation domain, but this is not systematic. A node represents a domain element if its location is not the designated sink location – a special marker to identify inactive nodes. In the active case, the location encodes a contextual abstraction as previously defined. By contrast, the root node ε does not correspond to any domain element. It serves as a technical device to assemble a forest-shaped model into a single data tree: its immediate children within [0,η1] encode the interpretations of the individual names in the ABox (recall we can safely assume (UNA)). The transitions are designed to enforce the semantical properties of the description logic 𝒜𝒞𝒪(𝒟) based on the contextual abstractions. Below, we present the definition of the constraint automaton 𝔸 based on the definitions in Section 4.1.

Set of locations and alphabet.

The set Q is defined as 𝒞𝒜{\(\varepsilon\)⃝,} and F=defQ, where

  • 𝒞𝒜=def𝒢𝒜×𝒜 is the set of contextual abstractions, where 𝒢𝒜 is the set of global abstractions and 𝒜 the set of local abstractions;

  • \(\varepsilon\)⃝ is a distinguished location used only at the root (root location) and Qin=def{\(\varepsilon\)⃝},

  • is a sink location used to label nodes that represent no domain elements.

The alphabet 𝚺 is {𝚊}; the automaton 𝔸 does not rely on input symbols – only the structure of the tree and the tuples in 𝔻β matter.

Transition relation.

Before providing the formal definition, let us briefly explain its main principles. Each non-root node in a run is either associated with a contextual abstraction representing a candidate element of the interpretation domain, or is labelled with the sink location . If a node is associated with a contextual abstraction, its concept type encodes the concepts it must belong to in any corresponding interpretation: if a concept appears in the concept type of a node, then the interpreted element is required to satisfy that concept and this creates obligations (see the conditions for general transitions). Role names are interpreted with the help of two kinds of links: (i) direct links to the children, where each direction in [0,𝚍1] is possibly attached to a role name; and (ii) symbolic links connecting a node to the interpretation of individual names from the ABox. These links are used to satisfy CD-restrictions, existential restrictions and value restrictions.

Constraints are evaluated using register values drawn from the current node and its children (as determined by λ). The current node stores its own concrete feature values via the registers xj and also the concrete feature values for the interpretation of the individual names via the registers xa,j. These latter values must remain constant during the run, and this is explicitly enforced by the transition relation with the help of the constraints. Activity vectors determine which concrete feature values are defined and guarantee that only the defined values are referred to in the constraints. The global abstraction – common to all nodes labelled by non-sink locations – must remain constant too. Formally, the transition relation δ is a subset of Q×𝚺×Cons(β,𝚍)×Q𝚍 and contains three kinds of transitions.

(a) Padding transition.

There is a unique padding transition 𝚝=def(,𝚊,,,,).

(b) Root transitions.

From the root location \(\varepsilon\)⃝, the automaton performs an initialization step by assigning contextual abstractions to the first η children, each of which represents the interpretation of an individual name from the ABox. Each such contextual abstraction is of the form (𝖦𝖠i,(Ti,sli,acti)). For each individual name ai, the local abstraction assigned to the i-th child must match the corresponding entry in the global abstraction: specifically, the concept type and activity vector must coincide with 𝖦𝖠(ai). In addition, the value of each concrete feature fj for ai must be consistently stored in the register xai,j at each node, which is treated as a constant throughout the run. This value originates from the local register xj of the i-th child, which represents the interpretation of ai. Therefore, the consistency condition xai,j=xj must hold at that node. These requirements together form the global information consistency check. Beyond this, the local abstraction must correctly reflect the ABox assertions involving ai.

Formally, root transitions are of the form (\(\varepsilon\)⃝,𝚊,Θ𝗋𝗈𝗈𝗍,𝖢𝖠0,,𝖢𝖠η1,,,), where each 𝖢𝖠i=(𝖦𝖠i,(Ti,sli,acti)) is a contextual abstraction and Ti is an ai-type. The constraint Θ𝗋𝗈𝗈𝗍 is equal to the expression i,i[0,η1],j[1,α],actai[j]=xji=xai,ji.

The transition must also satisfy the following structural conditions: 𝖦𝖠0==𝖦𝖠η1 and for all i[0,η1], (Ti,acti)=𝖦𝖠(ai), {Cai:C𝒜}Ti, and {(r,b)(ai,b):r𝒜}sli. From now on, for each individual name a𝖭𝐈(𝒪), we let 𝖦𝖠(a)=(Ta,acta), where Ta denotes the concept type and acta the activity vector associated to a via the global abstraction 𝖦𝖠. This notation is well-defined due to the global information consistency check performed by the root transition.

(c) General transitions.

From a location 𝖢𝖠=(𝖦𝖠,(T,sl,act)), we need to find witness individuals for the existential restrictions and CD–restrictions occurring in T. Each witness can be realized in two ways: either through a child, determined via the branch labeling function λ, or via a symbolic link in sl pointing to an individual name. As for value restrictions, the transition must verify that all possible witnesses satisfy the required concept or constraint. For universal CD–restrictions, the constraint must hold across all applicable combinations of registers extracted from those directions and links. General transitions also maintain some global information and are of the form 𝚝=(𝖢𝖠,𝚊,Θ𝚝,q0,,q𝚍1), where 𝖢𝖠=(𝖦𝖠,(T,sl,act)) satisfies the conditions below.

  1. 1.

    Existential restrictions. For all r.CT, there is (r,a)sl such that CTa or i=λ(r.C), qi is of the form (𝖦𝖠i,(Ti,sli,acti)) and CTi.

  2. 2.

    Value restrictions. For all r.CT, for all idir(r) with qi of the form (𝖦𝖠i,(Ti,sli,acti)), we have CTi and for all (r,a)sl, we have CTa.

  3. 3.

    CD-restrictions. Let CD(T) and CD(T) be the sets of existential and universal CD-restrictions in T. Let C=𝒬v1:p1,,vk:pk.Θ(v1,,vk) be such a CD-restriction with 𝒬{,}. For each pj, we define a set Zj of registers as follows:

    • If pj=fl, then if act[l]=, then Zj=def{xl}, otherwise Zj=def.

    • If pj=rfl, then Zj consists of the values of the feature fl taken from nodes reachable via the role name r, provided the corresponding feature is defined.

      Zj=def{xliidir(r),qi=(𝖦𝖠i,(Ti,sli,acti)),acti[l]=}{xa,l(r,a)sl,acta[l]=}.

    Let Z=defZ1××Zk. If CCD(T), then ΘC=def(y1,,yk)ZΘ(y1,,yk), otherwise ΘC=def(y1,,yk)ZΘ(y1,,yk). The constraint Θ𝚝 is defined below: Θ𝚝=defCCD(T)CD(T)ΘCdedication to CD-restrictionsi[0,𝚍1]qij=1αj[0,η1],actaj[j]=xaj,j=xaj,jipropagation of global values.

    The constraint Θ𝚝 is uniquely determined by the sequence q,q0,,q𝚍1.

  4. 4.

    Global abstraction preservation. For each i<𝚍 with qi, we require 𝖦𝖠i=𝖦𝖠.

  5. 5.

    Inactive children. For i[0,𝚍1], if λ1(i)=r.C and λ1(i)T or λ1(i)=(v1:p1,,vk:pk.Θ,j) and ( j>k or v1:p1,,vk:pk.ΘT), then qi=.

  6. 6.

    Anonymity. For i<𝚍 such that qi is of the form (𝖦𝖠i,(Ti,sli,acti)), Ti is an anonymous concept type.

Such general transitions are reminiscent of the augmented types in [13, Section 3].

4.3 Correctness of the Reduction

It remains to establish that L(𝔸) iff the ontology 𝒪 is consistent. Forthcoming Lemma 6 states the left-to-right direction and its proof amounts to design an interpretation from a data tree in L(𝔸) with its accepting run. Elements of the interpretation domain are non-root nodes that are not labelled by the sink location, interpretation of the concept names can be read from the concept types, and the interpretation of the concrete features are read from values in the tuples from 𝔻β in the data tree. Requirements on the transitions complete the analysis to guarantee that an interpretation satisfying the ontology is indeed produced.

Lemma 6.

L(𝔸) implies 𝒪 is consistent.

Lemma 7.

𝒪 is consistent with an interpretation satisfying (UNA) implies L(𝔸).

The construction of the TGCA 𝔸 can be done in exponential-time in size(𝒪), since all components – locations, transitions, and constraints – can be generated and assembled in exponential time. By substituting these parameters into the general complexity expression for the nonemptiness check of TGCA yields the main result below.

Theorem 8.

The consistency problem for 𝒜𝒞𝒪(𝒟) is in k-ExpTime if 𝒟 satisfies the conditions (C1), (C2), (C3.k) and (C4), for any k1.

As a corollary of the combination of Lemma 6 and Lemma 7, if 𝒪 is consistent, then it admits a forest-like interpretation satisfying it with branching degree bounded by max{N𝖾𝗑+N𝖼𝖽×N𝗏𝖺𝗋,|𝖭𝐈(𝒪)|}. This provides an alternative to the tree unfolding of the interpretations. Besides, the concrete domain 𝒟=(,(+k)k) for which the consistency problem 𝒜𝒞(𝒟) is shown undecidable in [31, Proposition 1], has an infinite set of predicate symbols but does not satisfy the notion of amalgamation used in Section 2.1 (see also [31, Example 3]).

5 Extensions

In this section, we consider the additional ingredients described in Section 2.3 and we show how our automata-based approach can be adapted, preserving the ExpTime-membership. Due to lack of space, we only state their main results about functional role names (resp. constraint assertions).

Theorem 9.

The consistency problem for 𝒜𝒞𝒪(𝒟) augmented either with constraint assertions or with functional role names is in k-ExpTime if 𝒟 satisfies the conditions (C1), (C2), (C3.k) and (C4), for any k1.

Note that nothing prevents us from using constraint assertions and functional role names in the same logic. Now, let us consider in more detail 𝒜𝒞(𝒟), see Section 2.3. The roles R in 𝒜𝒞(𝒟) are either role names r or their inverses r. Role assertions involve only role names because inverse role names add no expressivity in such assertions. By convention, we write (a,b):r𝒜 to denote that the role assertion (b,a):r belongs to 𝒜. Given a role R, we write R to denote its inverse, i.e. R=defr if R=r and R=defr if R=r. Given an ontology 𝒪 for 𝒜𝒞(𝒟), we write 𝖱𝗈𝗅𝖾𝗌(𝒪) to denote the set {R1,,Rκ} of roles occurring in 𝒪. In the definition of 𝖲𝗎𝖻(𝒪), we remove the parts about nominals. To extend the automaton construction to 𝒜𝒞(𝒟), the transitions for 𝔸 need to encode access to new witnesses by considering also the edge to the parent node.

Parent abstractions and new contextual abstractions.

A parent abstraction 𝖯𝖠 is either a pair (γ,𝔣) with γ[0,η1] and 𝔣:𝖭𝐈(𝒪)𝖠𝖢𝖳 (encoding unknown parent, incoming direction and activity vectors of the individual names) or a triple (R,T,act)𝖱𝗈𝗅𝖾𝗌(𝒪)×𝖢𝖳𝗒𝗉𝖾𝗌(𝒪)×𝖠𝖢𝖳. As expected, R encodes the role to reach a node from its parent node, T is the parent’s concept type and act is the parent’s activity vector. The set 𝒞𝒜 of contextual abstractions is now defined as the product 𝒫𝒜×𝒜, where 𝒫𝒜 denotes the set of parent abstractions. Contextual abstractions no longer include global abstractions since we gave up the nominals (but the pair (γ,𝔣) contains a bit of global information). Similarly, the local abstractions have no more symbolic links.

Automaton construction for 𝓐𝓛𝓒𝓘(𝓓).

We define the TGCA 𝔸=(Q,𝚺,𝚍,β,Qin,δ,F) from some ontology 𝒪 as follows, extending the construction in Section 4.2.

  • 𝚍=defmax{N𝖾𝗑+N𝖼𝖽×N𝗏𝖺𝗋,|𝖭𝐈(𝒪)|}, where N𝖾𝗑 is the number of existential restrictions in 𝖲𝗎𝖻(𝒪). To support the reasoning on the concrete feature values of the parent nodes, the new registers x1,,xα are dedicated to the parent nodes. Consequently, β=defα×(η+2). Unlike what happens in the automata for 𝒜𝒞𝒪(𝒟), the registers dedicated to the individual names are not maintained constant along the run but only used at the nodes 0,,η1 (labelled by some location of the form ((γ,𝔣),𝖫𝖠)).

  • Q=def𝒞𝒜{\(\varepsilon\)⃝,} (suitable definition for 𝒞𝒜), F=defQ, Qin=def{\(\varepsilon\)⃝} and 𝚺=def{𝚊}.

  • The relation δ is a subset of Q×𝚺×Cons(β,𝚍)×Q𝚍. As for 𝒜𝒞𝒪(𝒟), we distinguish three kinds of transitions but the general transitions have now two categories. There is still a unique padding transition 𝚝=def(,𝚊,,,,). In the root transitions (\(\varepsilon\)⃝,𝚊,Θ𝗋𝗈𝗈𝗍,𝖢𝖠0,,𝖢𝖠η1,,,), each 𝖢𝖠i is of the form ((i,𝔣i),(Ti,acti)) with the same requirements as for 𝒜𝒞𝒪(𝒟) (except that there is no more global abstractions) but we require that 𝔣0==𝔣η1. Furthermore, Θ𝗋𝗈𝗈𝗍 keeps the same value. There is nevertheless an additional requirement due to the absence of global abstractions: for all (ai,ai):r𝒜, for all r.CTi and r.CTi, we have CTi and CTi. The main changes are for the general transitions 𝚝=(𝖢𝖠,𝚊,Θ𝚝,q0,,q𝚍1) satisfying the following conditions with 𝖢𝖠=(𝖯𝖠,(T,act)).

    (Existential restrictions)

    For all R.CT, i=λ(R.C), qi of the form (𝖯𝖠i,(Ti,acti)) and CTi or R=R and CT if 𝖯𝖠=(R,T,act) (new option).

    (Value restrictions)

    For all R.CT, for all idir(R) with qi=(𝖯𝖠i,(Ti,acti)), we have CTi, and if R=R, then CT if 𝖯𝖠=(R,T,act) (new requirement).

    (Back-propagation)

    If qi=(𝖯𝖠i,(Ti,acti)), R.CTi and idir(R), then CT. This is a new condition due to inverse roles.

    (CD-restrictions)

    Let CD(T) and CD(T) be the sets of existential and universal CD-restrictions in T. Let C=𝒬v1:p1,,vk:pk.Θ be such a CD-restriction with 𝒬{,}. For each role path pj, we define ZjREG(β,𝚍) as follows.

    • If pj=fl, then if act[l]=, then Zj=def{xl} else Zj=def.

    • If pj=Rfl, then Zj is defined as follows. If 𝖯𝖠=(R,T,act), then

      Zj=def{xliidir(R),qi=(𝖯𝖠i,(Ti,acti)),acti[l]=}{xl|R=R,act[l]=}

      Otherwise 𝖯𝖠 is of the form (γ,𝔣), we need to take advantage of the registers dedicated to individual names: Zj=def{xliidir(R),qi=(𝖯𝖠i,(Ti,acti)),acti[l]=}{xaγ,l(aγ,aγ):R𝒜,𝔣(aγ)[l]=}.

    Let Z=defZj. If CCD(T), then ΘC=def(y1,,yk)ZΘ(y1,,yk), otherwise ΘC=def(y1,,yk)ZΘ(y1,,yk). The constraint Θ𝚝 is defined below, where the new part expresses the consistency of the registers dedicated to the parent node (xji denotes the register at the ith child for the parent’s value of fj).

    XΘ𝚝=CCD(T)CD(T)ΘCi[0,𝚍1]qij[1,α],act[j]=xji=xj.

    (Inactive children)

    Same condition as for 𝒜𝒞𝒪(𝒟).

    (Parent information propagation)

    For each i such that qi is a contextual abstraction ((Ri,Ti,acti),(Ti,acti)), we require idir(Ri) and (T,act)=(Ti,acti).

Soundness is shown as for Lemma 10 by extending the interpretation of role names.

Lemma 10.

L(𝔸) implies 𝒪 is consistent.

Similarly, completeness is also shown along the lines of the proof of Lemma 7.

Lemma 11.

𝒪 is consistent with an interpretation satisfying (UNA) implies L(𝔸).

As far as complexity is concerned, the main differences with the construction of 𝔸 for 𝒜𝒞𝒪(𝒟) include β still bilinear in α and η, and the presence of parent abstractions. All of this causes no harm and we can establish Theorem 12.

Theorem 12.

The consistency problem for 𝒜𝒞(𝒟) is in k-ExpTime if 𝒟 satisfies the conditions (C1), (C2), (C3.k) and (C4), for any k1.

As shown herein, our developments for 𝒜𝒞𝒪(𝒟) can be naturally adapted to 𝒜𝒞(𝒟), witnessing the robustness of our approach. It is open whether there is an adaptation for the extension 𝒜𝒞𝒪(𝒟) with inverse role names and nominals. Indeed, in 𝒜𝒞𝒪(,<), the ontology 𝒪=({v1:f1,v2:rf1.v1<v2,r.{a}},{a:v1:f1,v2:rf1.Θ}), can be made consistent and for any interpretation satisfying it, a has an infinite amount of r-successors, which cannot be captured currently with the finite sets Zj. It is not so much the combination of inverse roles and nominals that is problematic, see e.g. the handling of the μ-calculus with converse and nominals in [32], but the additional presence of CD-restrictions.

6 Conclusion

We have considered a class of concrete domains 𝒟 for which the consistency problem for 𝒜𝒞𝒪(𝒟) can be reduced to the nonemptiness problem for constraint automata parameterised by 𝒟, leading to the ExpTime-membership of the consistency problem (Theorem 8). This follows an automata-based approach, whose versatility allowed us to add ingredients such as inverse roles (while giving up nominals) and functional role names (Section 5). Furthermore, we carefully examined the conditions on the concrete domain 𝒟 to get the ExpTime upper bound. In particular, we have shown that the nonemptiness problem for constraint automata is in ExpTime whenever 𝒟 satisfies the conditions (C1), (C2) and (C3.1) (Theorem 5). The decidability/complexity status of the logics 𝒜𝒞𝒪(𝒟) is left open (also evoked in [3, Section 7] for some extension).

References

  • [1] J. Allen. Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11):832–843, 1983. doi:10.1145/182.358434.
  • [2] F. Baader. Description Logics. In Reasoning Web. Semantic Technologies for Information Systems, 5th International Summer School 2009, Tutorial Lectures, volume 5689 of LNCS, pages 1–39. Springer, 2009. doi:10.1007/978-3-642-03754-2_1.
  • [3] F. Baader, S. Borgwardt, F. De Bortoli, and P. Koopmann. Concrete domains meet expressive cardinality restrictions in description logics. In CADE’25, volume 15943 of LNCS, pages 676–695. Springer, 2025. doi:10.1007/978-3-031-99984-0_35.
  • [4] F. Baader and P. Hanschke. A scheme for integrating concrete domains into concept languages. In IJCAI’91, pages 452–457, 1991.
  • [5] F. Baader, J. Hladik, C. Lutz, and F. Wolter. From Tableaux to Automata for Description Logics. Fundamenta Informaticae, 57(2–4):247–279, 2003. URL: http://content.iospress.com/articles/fundamenta-informaticae/fi57-2-4-08.
  • [6] F. Baader, I. Horrocks, C. Lutz, and U. Sattler. An Introduction to Description Logic. Cambridge University Press, 2017.
  • [7] F. Baader, I. Horrocks, and U. Sattler. Description logics. In F. van Harmelen, V. Lifschitz, and B.W. Porter, editors, Handbook of Knowledge Representation, volume 3 of Foundations of Artificial Intelligence, pages 135–179. Elsevier, 2008. doi:10.1016/S1574-6526(07)03003-9.
  • [8] F. Baader and C. Lutz. Description logic. In P. Blackburn, J. van Benthem, and F. Wolter, editors, Handbook of Modal Logic, pages 757–819. Elsevier, 2006.
  • [9] F. Baader and J. Rydval. Using model theory to find decidable and tractable description logics with concrete domains. Journal of Automated Reasoning, 66(3):357–407, 2022. doi:10.1007/S10817-022-09626-2.
  • [10] Ph. Balbiani and J.F. Condotta. Computational complexity of propositional linear temporal logics based on qualitative spatial or temporal reasoning. In FroCoS’02, volume 2309 of LNAI, pages 162–173. Springer, 2002.
  • [11] A. Bhaskar and M. Praveen. Realizability problem for constraint LTL. Information & Computation, 296:105126, 2024. doi:10.1016/J.IC.2023.105126.
  • [12] P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic. Cambridge University Press, 2001.
  • [13] S. Borgwardt, F. De Bortoli, and P. Koopmann. The precise complexity of reasoning in 𝒜𝒞 with ω-admissible concrete domains. In DL’24, volume 3739 of CEUR Workshop Proceedings. CEUR-WS.org, 2024.
  • [14] J.R. Büchi. On a decision method in restricted second-order arithmetic. In International Congress on Logic, Method and Philosophical Science’60, pages 1–11, 1962.
  • [15] C. Carapelle and A.-Y. Turhan. Description Logics Reasoning w.r.t. General TBoxes is Decidable for Concrete Domains with the EHD-property. In ECAI’16, volume 285, pages 1440–1448. IOS Press, 2016. doi:10.3233/978-1-61499-672-9-1440.
  • [16] K. Chatterjee and M. Henzinger. An 𝒪(n2) time algorithm for alternating Büchi games. In SODA’12, pages 1386–1399. SIAM, 2012.
  • [17] R. Dechter. From local to global consistency. Artificial Intelligence, pages 87–107, 1992.
  • [18] S. Demri and D. D’Souza. An automata-theoretic approach to constraint LTL. Information & Computation, 205(3):380–415, 2007. doi:10.1016/J.IC.2006.09.006.
  • [19] S. Demri and K. Quaas. First steps towards taming description logics with strings. In JELIA’23, volume 14281 of LNCS, pages 322–337. Springer, 2023. doi:10.1007/978-3-031-43619-2_23.
  • [20] S. Demri and K. Quaas. Constraint automata on infinite data trees: From CTL(Z)/CTL*(Z) to decision procedures. Logical Methods in Computer Science, 21(2):16:1–16:79, 2025. doi:10.46298/LMCS-21(2:16)2025.
  • [21] S. Demri and U. Sattler. Automata-theoretic decision procedures for information logics. Fundamenta Informaticae, 53(1):1–22, 2002. URL: http://content.iospress.com/articles/fundamenta-informaticae/fi53-1-01.
  • [22] N. Labai. Automata-based reasoning for decidable logics with data values. PhD thesis, TU Wien, May 2021.
  • [23] N. Labai, M. Ortiz, and M. Simkus. An Exptime Upper Bound for 𝒜𝒞 with integers. In KR’20, pages 425–436. Morgan Kaufman, 2020.
  • [24] C. Lutz. NEXPTIME-complete description logics with concrete domains. In IJCAR’01, volume 2083 of LNCS, pages 46–60. Springer, 2001. doi:10.1007/3-540-45744-5_5.
  • [25] C. Lutz. The Complexity of Description Logics with Concrete Domains. PhD thesis, RWTH, Aachen, 2002.
  • [26] C. Lutz. Description logics with concrete domains—a survey. In Advances in Modal Logics Volume 4, pages 265–296. King’s College Publications, 2003.
  • [27] C. Lutz and M. Milicić. A Tableau Algorithm for Description Logics with Concrete Domains and General Tboxes. Journal of Automated Reasoning, 38(1-3):227–259, 2007. doi:10.1007/S10817-006-9049-7.
  • [28] M. Marx. Complexity of modal logic. In P. Blackburn, J. van Benthem, and F. Wolter, editors, Handbook of Modal Logic, pages 139–179. Elsevier, 2006.
  • [29] V. Pratt. Models of program logics. In FoCS’79, pages 115–122, 1979.
  • [30] I. Pratt-Hartmann. Fragments of First-Logic. Oxford University Press, 2023.
  • [31] J. Rydval. Using Model Theory to Find Decidable and Tractable Description Logics with Concrete Domains. PhD thesis, Dresden University, 2022.
  • [32] U. Sattler and M. Vardi. The hybrid mu-calculus. In IJCAR’01, volume 2083 of LNAI, pages 76–91. Springer, 2001.
  • [33] L. Segoufin and S. Toruńczyk. Automata based verification over linearly ordered data domains. In STACS’11, pages 81–92, 2011.
  • [34] S. Tobies. The complexity of reasoning with cardinality restrictions and nominals in expressive description logics. Journal of Artificial Intelligence Research, 12:199–217, 2000. doi:10.1613/JAIR.705.
  • [35] M. Vardi and P. Wolper. Automata-theoretic techniques for modal logics of programs. Journal of Computer and System Sciences, 32:183–221, 1986. doi:10.1016/0022-0000(86)90026-7.
  • [36] M.Y. Vardi. Automata-theoretic techniques for temporal reasoning. In P. Blackburn, J. van Benthem, and F. Wolter, editors, Handbook of Modal Logic, pages 971–989. Elsevier, 2006.
  • [37] F. Wolter and M. Zakharyaschev. Spatio-temporal representation and reasoning based on RCC-8. In KR’00, pages 3–14, 2000.