Expressivity of Bisimulation Pseudometrics over Analytic State Spaces

Luckhardt, Daniel; Beohar, Harsh; Kupke, Clemens

doi:10.4230/LIPIcs.CALCO.2025.13

Daniel Luckhardt, UCL, University College London, UK

Harsh Beohar, University of Sheffield, UK

Clemens Kupke, University of Strathclyde, Glasgow, UK

Abstract

A Markov decision process (MDP) is a state-based dynamical system capable of describing probabilistic behaviour with rewards. In this paper, we view MDPs as coalgebras living in the category of analytic spaces, a very general class of measurable spaces. Note that analytic spaces were already studied in the literature on labelled Markov processes and bisimulation relations. Our results are twofold. First, we define bisimulation pseudometrics over such coalgebras using the framework of fibrations. Second, we develop a quantitative modal logic for such coalgebras and prove a quantitative form of the Hennessy-Milner theorem in this new setting, stating that the bisimulation pseudometric corresponds to the logical distance induced by modal formulae.

Keywords and phrases:

Markov decision process, quantitative Hennessy-Milner theorem

2012 ACM Subject Classification:

Theory of computation $\rightarrow$ Modal and temporal logics

Editors:

Corina Cîrstea and Alexander Knapp

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Markov decision processes (MDPs) are a well known mathematical model for decision-theoretic planning [5] and reinforcement learning [37]. Informally, an MDP can be seen as a generalisation of an automaton, where the transition function (for each action in the alphabet) gives a probability distribution over the state space together with a reward function that for each state and action gives a real-valued number.

Inspired by the previous work on bisimulation pseudometrics on labelled Markov processes [8, 10] and probabilistic transition systems [39, 38], Ferns et al. [17, 18] defined (among other results) a notion of bisimulation pseudometric on the states of an MDP. Unlike [17] and the previous work on bisimulation equivalence for MDPs [23], the systems considered in [18] were MDPs with continuous state spaces. Conformances over continuous state MDPs have found applications in representation learning [21, 43] (a topic studied within the field of reinforcement learning).

In this paper, we propose a modal logic $\mathcal{L}$ (cf. Section 5) with quantitative semantics for MDPs with continuous state space, i.e. the semantics of each formula is given by a real-valued function. We then prove a quantitative version of the Hennessy-Milner theorem (a well-known result [25] from concurrency theory), i.e. we show that the bisimulation pseudometrics on continuous state MDPs coincide with the logical distance in our logic. A major obstacle to overcome in the continuous setting is the definition of bisimulation pseudometrics itself. Moreover, the fundamental question “what is a distance on a measurable space” (besides the usual axioms of a pseudometric, i.e. $d$ is reflexive, symmetric, and satisfies the triangle inequality) needs addressing. In [18] the authors had to invoke an additional Polish structure inducing the $\sigma$ -algebra as their methods forced them to work with lower semi-continuous distance functions $d$ . In the sequel, we work in a purely measure-theoretic set-up with a far more general class of distances, universally measurable distance functions, cf. Subsection 3.1 for details.

Although our approach is rooted in the theory of fibrations [26], the recent approaches [1, 29, 30] to obtain expressive modal logics for coalgebras do not apply. For instance, our fibration $\operatorname{\mathsf{Pred}}$ of predicates over a state space has only countably many meets; thus, it is not a complete lattice fibration as required in [1, 29, 30]. As a result, the codensity lifting used in [1, 29] to derive the Kantorovich lifting for the distribution endofunctor cannot be used to derive the Kantorovich lifting for the Giry endofunctor over measurable spaces. (Footnote 1: Note that a fibration of predicates is more fundamental than a fibration of conformances like pseudometrics and equivalence relations, since the latter can be derived from the former.)

In the sequel, we recalibrate the fibration infrastructure in Subsection 2.1. Our inspiration is [4], which presented a coupling-based lifting for an endofunctor that – when instantiated to the distribution endofunctor – gives rise to the well-known Wasserstein lifting on probability distributions. Analogously, we will show in Subsection 4.1 how to capture the Wasserstein lifting on probability measures. For our definition to work, we will restrict to a full subcategory of the category Meas of measurable spaces – the category Ana of analytic spaces. Note that analytic spaces already appeared in the literature on labelled Markov processes (for instance, see [9]) to show that the logical equivalence induced by the modal logic given in [9] coincides with probabilistic bisimilarity.

After having clarified our measure theoretic assumptions, we will define bisimulation metrics for MDPs as the least fixpoint of the following functional:

\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))\xrightarrow{\sigma_{X}}\operatorname{\mathsf{Pred}}(B_{\mathsf{MDP}}(X,\mathcal{A})\times B_{\mathsf{MDP}}(X,\mathcal{A}))\xrightarrow{\smash{(\gamma\times\gamma)^{*}}}\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A})),

where $B_{\mathsf{MDP}}$ is the endofunctor modelling MDPs as given in Section 3 and $\sigma$ is a lifting of distance functions (or put simply, a distance lifting) for $B_{\mathsf{MDP}}$ as given in Section 4. The definition of our distance lifting $\sigma$ is parameterised by a discount factor $c\in[0,1]$ . Furthermore, thanks to the Kantorovich-Rubinstein duality for measurable spaces [33, Theorem 5], the above functional corresponds to the functional given in [18, Theorem 3.12] whose least fixpoint is the bisimulation pseudometric on the state space of an MDP.

Moving on to our modal logic and comparing with the expressive modal logics for probabilistic systems studied in [7, 10, 39], the key distinguishing feature of our work is the semantics of our diamond modality $\diamond_{a}\varphi$ . Intuitively, $\llbracket\diamond_{a}\varphi\rrbracket(x)$ (for a state $x$ ) gives the expected value of landing in an $a$ -successor from $x$ with some fixed probability $c\in[0,1]$ combined with the reward for $a$ when staying in the state $x$ with probability $1-c$ . In other words, the semantics of $\diamond_{a}\varphi$ is a convex combination of the expected value of moving to an $a$ -successor and the reward for $a$ at a state. Unlike the above references, we were unable (without breaking the proof of the adequacy result) to further decompose this modality into the traditional diamond modality and a 0-ary reward modality as defined in [7, 10, 39].

This paper is organised as follows. In Section 2, we recall the preliminaries from measure theory and calibrate our fibration setup for measurable spaces. In Section 3, we give the concrete definition of the behaviour endofunctors that model Markov reward processes (MRPs) and MDPs and establish a bifibration of predicates; an MRP can be seen as an unlabelled version of an MDP. In Section 4, we capture the bisimulation pseudometrics for both MRPs and MDPs as the least fixpoint of a functional as explained above. In Section 5, we define our modal logic and establish the adequacy and expressivity results. In Section 6, we end this paper with a discussion of related work and potential topics for future work. The proofs of all lemmas and theorems can be found in the clearly marked appendix.

2 Preliminaries

2.1 Capturing behavioural conformances categorically

In this subsection, we refine the construction of [4] of a coupling-based lifting for an endofunctor on Set by working with two different fibrations of predicates (cf. Assumptions A 1 and A 2). Moreover, our presentation works in a category C having products, unlike in [4], where the coalgebras live in Set. This will provide us with a blueprint to define a bisimulation distance for both MRPs and MDPs when viewed as coalgebras in Section 3.

Throughout this section, let Pos be the category of posets and order preserving maps; and, let $B\colon\textnormal{{C}}\to\textnormal{{C}}$ be the functor modelling the branching type of systems of interest.

A1.

There is an indexed category ${\operatorname{\mathsf{Pred}}}\colon\textnormal{{C}}^{\text{op}}\to\textnormal{{Pos}}$ such that $\operatorname{\mathsf{Pred}}(X)$ (for $X\in\textnormal{{C}}$ ) is a poset and $\operatorname{\mathsf{Pred}}(f)\colon\operatorname{\mathsf{Pred}}(Y)\to\operatorname{\mathsf{Pred}}(X)$ (for $f\colon X\to Y\in\textnormal{{C}}$ ) is order preserving.

Henceforth we write the reindexing $f^{*}$ (instead of $\operatorname{\mathsf{Pred}}(f)$ ) which is customary in the literature on fibrations [26]. In the sequel, we will view an element $p\in\operatorname{\mathsf{Pred}}(X)$ intuitively as a predicate over an object $X\in\textnormal{{C}}$ . The idea is to view $\operatorname{\mathsf{Pred}}$ as a semantic universe in which we interpret the formulae of a modal logic. Thus, the operators of a modal logic (like negation, conjunction etcetera) must be operators definable over the fibre $\operatorname{\mathsf{Pred}}(X)$ .

Furthermore, the authors of [4] required $\operatorname{\mathsf{Pred}}$ to be a bifibration, which is difficult to obtain in general for arbitrary measurable spaces (cf. Subsection 3.1). Our observation, which leads to A 2, is that we can arrange both universally measurable predicates and lower semi-measurable predicates in such a way that the latter results in a bifibration structure and the former acts as a semantic universe to interpret our modal formulae.

A2.

There is an indexed category $\mathsf{lsPred}$ such that $\mathsf{lsPred}$ is a subfunctor of $\operatorname{\mathsf{Pred}}$ . Moreover, the indexed category $\mathsf{lsPred}$ has a bifibration structure, i.e. for every $f\colon X\to Y\in\textnormal{{C}}$ the reindexing functor $f^{*}$ has a left adjoint $\exists_{f}\colon\mathsf{lsPred}(X)\to\mathsf{lsPred}(Y)$ .

Now, following [4], one needs a predicate lifting to define a coupling-based lifting, which in our setting due to the presence of two fibrations of predicates takes the following shape.

A3.

There is an indexed morphism $\sigma\colon{\operatorname{\mathsf{Pred}}}\Rightarrow\mathsf{lsPred}\circ B^{\text{op}}$ , i.e. $\sigma$ is a natural transformation of type ${\operatorname{\mathsf{Pred}}}\Rightarrow\mathsf{lsPred}\circ B^{\text{op}}$ .

Thanks to the above three assumptions, every predicate lifting $\sigma$ induces a lifting $\hat{\sigma}$ (which simplifies to the composition given in [4, Eq. 5] when $\mathsf{lsPred}=\operatorname{\mathsf{Pred}}$ ):

\operatorname{\mathsf{Pred}}(X\times X)\xrightarrow{\sigma_{X\times X}}\mathsf{lsPred}(B(X\times X))\xrightarrow{\exists_{\pi_{X}}}\mathsf{lsPred}(BX\times BX)\hookrightarrow\operatorname{\mathsf{Pred}}(BX\times BX),

(1)

where $\pi_{X}\colon B(X\times X)\to BX\times BX$ is the unique map such that $\operatorname{pr}_{i}^{BX}\circ\pi_{X}=B(\operatorname{pr}_{i}^{X})$ and $\operatorname{pr}_{i}\colon X\times X\to X$ are the obvious projection maps (for $i\in\{1,2\}$ ). It is this lifting which will give us the usual Wasserstein lifting for a Giry functor $B$ defined in Section 3. Now, for a given coalgebra $\gamma\colon X\to BX\in\textnormal{{C}}$ , simply take the greatest fixpoint of the functional given below to define a coupling-based lifting for the endofunctor $B$ .

\operatorname{\mathsf{Pred}}(X\times X)\xrightarrow{\hat{\sigma}_{X}}\operatorname{\mathsf{Pred}}(BX\times BX)\xrightarrow{(\gamma\times\gamma)^{*}}\operatorname{\mathsf{Pred}}(X\times X).

(2)

To ensure this fixpoint exists, we require the following assumption.

A4.

The indexed category $\operatorname{\mathsf{Pred}}$ has countable fibred limits, i.e. each fibre of $\operatorname{\mathsf{Pred}}$ has countable meets and these countable meets are preserved by the reindexing operation.

Proposition 1.

If the induced lifting $\hat{\sigma}$ defined in (1) is (Scott) cocontinuous, then the greatest fixpoint of the functional given in (2) exists.

$\blacktriangleright$ Remark 2.

It should be noted that, in the above proposition, we use the dual version to apply Kleene’s fixpoint theorem on the lattice $\operatorname{\mathsf{Pred}}$ . However, our concrete predicates in Section 3 will be ordered by pointwise lifting of the dual order (i.e. $\geq$ ) on the unit interval $[0,1]$ , which then leads to the application of the usual Kleene’s fixpoint theorem.
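The Kleene iteration mentioned in the remark can be illustrated on the simplest possible lattice. The following is only a minimal sketch on the unit interval with its usual order; the names `kleene_fixpoint` and `example_step`, the tolerance, and the example map are assumptions made for illustration and do not come from the paper.

```python
def kleene_fixpoint(step, bottom, tol=1e-12, max_iter=10_000):
    """Iterate bottom, step(bottom), step(step(bottom)), ... until it stabilises.

    For a monotone, omega-continuous `step` the supremum of this chain is the
    least fixpoint; numerically we stop once two iterates differ by at most tol.
    """
    current = bottom
    for _ in range(max_iter):
        nxt = step(current)
        if abs(nxt - current) <= tol:
            return nxt
        current = nxt
    return current


def example_step(x):
    # Hypothetical monotone and continuous map on ([0, 1], <=).
    return 0.5 * x + 0.25


approx = kleene_fixpoint(example_step, 0.0)
print(approx)  # close to the exact fixpoint x = 0.5
```

Under the dual order used for the concrete predicates, the same chain starting from the least element with respect to $\leq$ is read as a descending chain from the top element.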

2.2 Measurable spaces

A measurable space is a pair $(X,\mathcal{A})$ consisting of a set $X$ thought of as a space, e.g. the state space of an MDP, and a $\sigma$ -algebra $\mathcal{A}\subseteq\mathcal{P}(X)$ (whose elements are called measurable sets) of subsets of $X$ stable under the complement operation $(\,\_\,)^{\complement}$ and countable union (including the empty union). In applications the $\sigma$ -algebra often contains a given topology, i.e. the collection of open sets $\mathcal{T}$ , on the state space. Often one considers the minimal such $\sigma$ -algebra, the Borel $\sigma$ -algebra, denoted $\mathcal{B}_{\mathcal{T}}$ . In this context, we use the notation $\langle\mathcal{P}\rangle$ to denote the minimal $\sigma$ -algebra containing a given $\mathcal{P}\subseteq\mathcal{P}(X)$ . The elements of $\mathcal{B}_{\mathcal{T}}=\langle\mathcal{T}\rangle$ are called Borel sets and $(X,\mathcal{B}_{\mathcal{T}})$ is called a Borel space. If a given measurable space $(X,\mathcal{A})$ stems from a Polish space, i.e. a completely metrisable separable topological space, then $(X,\mathcal{A})$ is called standard. The collection of all measurable spaces forms a category Meas when endowed with maps that inversely preserve measurable sets, so-called measurable maps. The term $\mathcal{A}$ - $\mathcal{B}$ -measurable for a measurable morphism $(X,\mathcal{A})\to({Y},\mathcal{B})$ is also used to be specific about the $\sigma$ -algebras. The category Meas has arbitrary products; in particular, $(X,\mathcal{A})\times({Y},\mathcal{B})=(X\times{Y},\mathcal{A}\mathbin{\otimes}\mathcal{B})$ where $\mathcal{A}\otimes\mathcal{B}$ is the $\sigma$ -algebra generated by the set $\{\,U\times V\;|\;U\in\mathcal{A}\land V\in\mathcal{B}\,\}$ .

A probability measure $\mathfrak{m}$ on a measurable space $(X,\mathcal{A})$ is a function of type $\mathcal{A}\to[0,1]$ such that $\mathfrak{m}(X)=1$ and $\mathfrak{m}(\bigcup_{i\in\mathbb{N}}A_{i})=\sum_{i\in\mathbb{N}}\mathfrak{m}(A_{i})$ for any sequence of pairwise disjoint sets $(A_{i}\in\mathcal{A})_{i\in\mathbb{N}}$ . We denote by $\operatorname{\mathcal{G}}(X,\mathcal{A})$ the collection of all probability measures on $(X,\mathcal{A})$ endowed – going back to [22] – with the $\sigma$ -algebra generated by all the evaluation maps $\operatorname{ev}_{A}\colon\operatorname{\mathcal{G}}(X,\mathcal{A})\to[0,1]$ (one for each $A\in\mathcal{A}$ ) given by the mapping $\mathfrak{m}\mapsto\mathfrak{m}(A)$ , i.e. the minimal $\sigma$ -algebra making all maps $\operatorname{ev}_{A}\colon\operatorname{\mathcal{G}}(X,\mathcal{A})\to([0,1],\mathcal{B}_{[0,1]})$ measurable.

We restrict the exposition of this theory to its bare minimum, with some additional background given in Appendix A. We call a subset of a measurable space $(X,\mathcal{A})$ Suslin if it is the image of an element in $\mathcal{B}_{\mathbb{N}^{\mathbb{N}}}\mathbin{\otimes}\mathcal{A}$ along the projection $\mathbb{N}^{\mathbb{N}}\times X\to X$ . Let $\operatorname{\mathfrak{S}}{\mathcal{A}}$ denote the set of all Suslin subsets of $(X,\mathcal{A})$ . A measurable space $(X,\mathcal{A})$ is called analytic if it is isomorphic (as a measurable space) to a Suslin set of a standard space $({Y},\mathcal{B})$ endowed with the restricted $\sigma$ -algebra, i.e. $\mathcal{B}|_{X}\coloneqq\{\,B\cap X\;|\;B\in\mathcal{B}\,\}$ . Such constructions are also a common subject in descriptive set theory, cf. [27] and [20, Ch. 42]. By Ana we denote the full subcategory of Meas of analytic spaces, which admits countable products. The endofunctor $\operatorname{\mathcal{G}}$ restricts to Ana. To provide full generality of our results, cf. Subsection 3.1, we also introduce the concept of a smooth space [16]: $(X,\mathcal{A})$ is smooth, cf. Subsection A.2, if for any other measurable space $({Y},\mathcal{B})$ any projection to ${Y}$ of a Borel (or equivalently Suslin) subset of $({Y},\mathcal{B})\times(X,\mathcal{A})$ is Suslin. Nevertheless, it is perfectly fine to assume analytic spaces throughout, at least for the first reading.

Given a measure space (i.e., a measurable space with a measure) $(X,\mathcal{A},\mathfrak{m})$ one may wish to extend $\mathcal{A}$ . The $\mathfrak{m}$ -completion of $(X,\mathcal{A})$ , $\overline{\mathcal{A}}^{\mathfrak{m}}\supseteq\mathcal{A}$ , is defined as the smallest $\sigma$ -algebra containing $\mathcal{A}\cup\{B\subseteq X\mid\exists A\in\mathcal{A}.\;\mathfrak{m}(A)=0\mbox{ and }B\subseteq A\}$ . A way to describe $\overline{\mathcal{A}}^{\mathfrak{m}}$ , when $\mathfrak{m}$ is a probability measure, is as the set of all $A\subseteq X$ such that there are $A_{-},A_{+}\in\mathcal{A}$ with $\mathfrak{m}(A_{-})=\mathfrak{m}(A_{+})$ and $A_{-}\subseteq A\subseteq A_{+}$ . The measure $\mathfrak{m}$ uniquely extends to a measure on $\overline{\mathcal{A}}^{\mathfrak{m}}$ . Given a measurable space $(X,\mathcal{A})$ , the universal completion of $\mathcal{A}$ (denoted $\overline{\mathcal{A}}$ ) is the intersection of all completions of $\mathcal{A}$ with respect to any (probability) measure on $(X,\mathcal{A})$ . The universal completion is quite big; in particular, it contains $\operatorname{\mathfrak{S}}{\mathcal{A}}$ (so every Suslin set is universally measurable).

3 Markov decision processes

In this section we are going to instantiate our categorical parameters (cf. Assumptions A 1 - A 4) in the setting of measurable spaces. We begin by recalling the definition of a Markov decision process from [18] and view such processes as coalgebras.

Definition 3.

A (continuous) Markov reward process (MRP) is a coalgebra $\gamma$ of type

(X,\mathcal{A})\to\operatorname{\mathcal{G}}(X,\mathcal{A})\times([0,1],\mathcal{B}_{[0,1]})\in\textnormal{{Meas}}.

In other words, $\gamma$ is given by a pair $(\gamma^{0},\gamma^{1})$ of maps satisfying the following properties:

$\blacksquare$ $\gamma^{0}(x)$ (for each $x\in X$ ) is a probability measure;

$\blacksquare$ $\gamma^{0}(\_)(U)\colon X\to[0,1]$ (for each measurable set $U\in\mathcal{A}$ ) is a measurable function; and

$\blacksquare$ $\gamma^{1}$ is a measurable function.

Moreover, given a countable set $\Sigma$ of actions, we define a Markov decision process (MDP) [18] as a coalgebra $\gamma$ of type

(X,\mathcal{A})\to\prod_{\Sigma}\left(\operatorname{\mathcal{G}}(X,\mathcal{A})\times([0,1],\mathcal{B}_{[0,1]})\right)\in\textnormal{{Meas}}.

In other words, the map $x\mapsto\gamma(x)(a)$ (for each action $a\in\Sigma$ ) is a Markov reward process. Henceforth we write $\gamma_{a,x}$ to denote $\gamma(x)(a)$ ; so, $\gamma_{a,x}^{0}$ corresponds to a probability measure and $\gamma_{a,x}^{1}$ corresponds to a “reward” at $x$ for an action $a$ .

Thus, the endofunctors of interest are the following:

$\blacksquare$ $B_{\mathsf{MRP}}=\operatorname{\mathcal{G}}\times[0,1]$ whose coalgebras correspond to Markov reward processes;

$\blacksquare$ $B_{\mathsf{MDP}}=\prod_{\Sigma}B_{\mathsf{MRP}}$ whose coalgebras correspond to Markov decision processes.
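To make the coalgebraic reading concrete, the following sketch encodes a finite MDP as a map from states to one (measure, reward) pair per action. It is only an illustration under strong assumptions: the state space is finite and discrete (so all measurability conditions hold trivially), and the states, actions, numbers, and the helper names `is_finite_mdp`, `gamma0`, `gamma1` are made up for this example.

```python
from typing import Dict, Tuple

State = str
Action = str
Distribution = Dict[State, float]                      # finitely supported probability measure
Behaviour = Dict[Action, Tuple[Distribution, float]]   # one (measure, reward) pair per action

# Hypothetical two-state, two-action MDP; all numbers are invented.
gamma: Dict[State, Behaviour] = {
    "s": {"a": ({"s": 0.5, "t": 0.5}, 0.2), "b": ({"t": 1.0}, 0.0)},
    "t": {"a": ({"t": 1.0}, 1.0), "b": ({"s": 0.25, "t": 0.75}, 0.5)},
}


def is_finite_mdp(coalgebra, actions, tol=1e-9):
    """Check that every state maps to an element of prod_Sigma (G(X) x [0, 1])."""
    states = set(coalgebra)
    for x in states:
        if set(coalgebra[x]) != set(actions):
            return False
        for a in actions:
            measure, reward = coalgebra[x][a]
            if not set(measure) <= states:
                return False
            if any(mass < 0 for mass in measure.values()):
                return False
            if abs(sum(measure.values()) - 1.0) > tol:
                return False
            if not 0.0 <= reward <= 1.0:
                return False
    return True


def gamma0(a, x):
    """Transition measure gamma^0_{a,x}."""
    return gamma[x][a][0]


def gamma1(a, x):
    """Reward gamma^1_{a,x}."""
    return gamma[x][a][1]


print(is_finite_mdp(gamma, {"a", "b"}), gamma0("a", "s"), gamma1("a", "s"))
```

Fixing a single action, e.g. `{x: gamma[x]["a"] for x in gamma}`, gives the corresponding finite Markov reward process.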

3.1 Fibrations induced by universally measurable/l.s.m. predicates

Having fixed the type of systems, we now look into the issue of endowing a bifibration structure on the space of all Boolean/quantitative predicates. Consider the indexed category $\operatorname{\mathsf{Pred}}(X,\mathcal{A})=\textnormal{{Meas}}((X,\mathcal{A}),(2,\mathcal{B}_{2}))$ of Boolean predicates, i.e. a predicate $p\in\operatorname{\mathsf{Pred}}(X,\mathcal{A})$ is a measurable function of type $X\to 2$ . The reindexing functor $f^{*}$ (for a measurable function $f\colon(X,\mathcal{A})\to({Y},\mathcal{B})$ ) is given by the inverse image operation (since the inverse image of a measurable set is measurable). It is well known (originally due to Suslin [36]) that Borel measurable sets, even on standard spaces, are not closed under direct images; as a result, the left adjoint to the reindexing functor cannot exist in general. Nevertheless, if we weaken measurable sets to Suslin sets (which are equivalently analytic sets for analytic spaces, cf. [20, 421K] and [27, 13.3iii)]), then Suslin sets of analytic spaces are preserved by direct images onto analytic spaces, cf. [6] for an in-depth discussion of how to develop these concepts.

In light of the above discussion, we restrict our state spaces to be analytic, i.e. our working category for the remainder is the category Ana of analytic spaces with measurable functions as morphisms. This is, on the one hand, a bit more general than Polish spaces as required in [18] and, on the other hand, conceptually more elegant, as we are only working with measurable spaces and do not require an underlying topology. Moreover, we consider quantitative predicates on an analytic space $(X,\mathcal{A})$ to be lower semi-measurable (l.s.m.) functions (a real-valued generalisation of Suslin sets) of type $X\to[0,1]$ .

Definition 4.

Let $(X,\mathcal{A})$ be an analytic space. Then a function $p\colon X\to[0,1]$ is a lower semi-measurable (l.s.m. for short) predicate (resp. a universally measurable predicate) iff the preimage of the interval $[0,r]$ (for every $r\in[0,1]$ ) under $p$ is a Suslin set (resp. a universally measurable set), i.e. for every $r\in[0,1]$ , $p^{-1}([0,r])\in\operatorname{\mathfrak{S}}\mathcal{A}$ (resp. $p^{-1}([0,r])\in\overline{\mathcal{A}}$ ).

The term “lower semi-measurable” is chosen in parallel to the term “lower semi-continuous” in topology which refers to a real-valued function which is continuous with respect to the upper-interval topology $\{\,(r,\infty)\;|\;r\in\mathbb{R}\,\}$ .

We can arrange l.s.m. predicates in an indexed category as follows. Consider the mapping $\mathsf{lsPred}\colon\textnormal{{Ana}}^{\text{op}}\to\textnormal{{Pos}}$ such that $\mathsf{lsPred}(X,\mathcal{A})$ is the set of all l.s.m. predicates, where the ordering relation is the pointwise lifting of the “greater-than-or-equal” relation on the unit interval. (In other words, we are viewing the unit interval as the Lawvere quantale $([0,1],\geq,+)$ where $+$ is the truncated addition. So, $p\leq q\iff\forall_{x}\ p(x)\geq q(x)$ .) The reindexing $f^{*}$ (for an arrow $f\colon(X,\mathcal{A})\to({Y},\mathcal{B})\in\textnormal{{Ana}}$ ) is given by pre-composition, i.e.

f^{*}(q)=q\circ f\qquad\text{(for every $q\in\mathsf{lsPred}({Y},\mathcal{B})$).}

Lemma 5.

The indexed category $\mathsf{lsPred}$ has countable fibred (co)limits, i.e. each fibre has countable meets and countable joins which are preserved by the reindexing functor. Moreover, $\mathsf{lsPred}$ has a bifibration structure, i.e., for every $f\colon(X,\mathcal{A})\to({Y},\mathcal{B})\in\textnormal{{Ana}}$ , the reindexing functor has a left adjoint $\exists_{f}$ given by:

\exists_{f}(p)(y)=\inf_{f(x)=y}p(x)\qquad\text{(for every $p\in\mathsf{lsPred}(X,\mathcal{A}),y\in Y$)}.
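On finite sets, where every function is a predicate, the adjunction $\exists_{f}\dashv f^{*}$ of the lemma can be checked by brute force. The sketch below does this for a small hypothetical map `f`; the sets, the value grid, and the convention that the infimum over an empty fibre is 1 are assumptions of the example. Note that the fibre order is the reversed pointwise order.

```python
from itertools import product

X = ["x1", "x2", "x3"]
Y = ["y1", "y2", "y3"]
f = {"x1": "y1", "x2": "y1", "x3": "y2"}   # hypothetical map; y3 has an empty fibre


def reindex(q):
    """f^*(q) = q o f."""
    return {x: q[f[x]] for x in X}


def exists_f(p):
    """Left adjoint: infimum of p over each fibre (inf of the empty set is 1)."""
    return {y: min((p[x] for x in X if f[x] == y), default=1.0) for y in Y}


def leq(p, q):
    """Fibre order: p <= q iff p(x) >= q(x) for all x."""
    return all(p[k] >= q[k] for k in p)


grid = [0.0, 0.5, 1.0]
# exists_f(p) <= q  iff  p <= f^*(q), for all predicates with values in the grid.
adjunction_holds = all(
    leq(exists_f(dict(zip(X, pv))), dict(zip(Y, qv)))
    == leq(dict(zip(X, pv)), reindex(dict(zip(Y, qv))))
    for pv in product(grid, repeat=len(X))
    for qv in product(grid, repeat=len(Y))
)
print(adjunction_holds)
```

In the measurable setting the content of the lemma is that this infimum is again l.s.m.; the finite check only illustrates the order-theoretic part.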

$\blacktriangleright$ Remark 6.

It should be noted that the existence of a left adjoint can be stated in more general terms by requiring that $(X,\mathcal{A})$ is a smooth space, cf. Subsection A.2, and $({Y},\mathcal{B})$ is countably separated, i.e. there is a countable family of measurable sets distinguishing every pair of distinct points. Moreover, Assumption A 2 could be weakened to maps of type $f\colon BX\to Y\in\textnormal{{C}}$ :

A2’.

The reindexing functor $f^{*}$ has a left adjoint $\exists_{f}\colon\mathsf{lsPred}(BX)\to\mathsf{lsPred}(Y)$ .

Any universally measurable subset of a standard space is countably separated as a measurable space, but also an analytic space. So our construction generalises to the full subcategory of Meas of measurable spaces expressible in this form.

To show that A 2 is satisfied, it remains to define an indexed category $\operatorname{\mathsf{Pred}}\colon\textnormal{{Ana}}^{\text{op}}\to\textnormal{{Pos}}$ such that each fibre $\mathsf{lsPred}(X,\mathcal{A})$ is contained in $\operatorname{\mathsf{Pred}}(X,\mathcal{A})$ . We disregard the trivial definition, i.e. $\operatorname{\mathsf{Pred}}=\mathsf{lsPred}$ , since Suslin sets are not closed under complementation; with that choice, we could not give semantics to the negation operator in our logic. Nonetheless, it is also known that Suslin sets are universally measurable sets [11], so we simply let $\operatorname{\mathsf{Pred}}(X,\mathcal{A})$ be the set of universally measurable predicates on the analytic space $(X,\mathcal{A})$ .

Proposition 7.

Assumptions A 1 and A 2 are satisfied by $\operatorname{\mathsf{Pred}}$ and $\mathsf{lsPred}$ , respectively.

4 Bisimulation distance

The objective of this section is to define bisimulation pseudometrics (cf. Subsection 4.2) for Markov reward processes and MDPs as the least fixpoint of the functional given in (2), where $B\in\{B_{\mathsf{MRP}},B_{\mathsf{MDP}}\}$ . (Recall that the predicates are ordered by $\geq$ , so the greatest fixpoint is actually the least fixpoint under the usual order $\leq$ .) In both cases, the definition of a pseudometric rests on a coupling-based lifting (1) for the Giry endofunctor $\operatorname{\mathcal{G}}$ which we will work out in the following subsection.

4.1 Wasserstein lifting categorically

We begin by defining a predicate lifting for $\operatorname{\mathcal{G}}$ (i.e. when $B=\operatorname{\mathcal{G}}$ in Assumption A 3). Consider the mapping $\sigma_{(X,\mathcal{A})}\colon\operatorname{\mathsf{Pred}}(X,\mathcal{A})\to\mathsf{lsPred}(\operatorname{\mathcal{G}}(X,\mathcal{A}))$ given by

\sigma_{(X,\mathcal{A})}(p)(\mathfrak{m})=\int p\mathop{}\!\mathrm{d}\mathfrak{m}\quad(\text{for every $p\in\operatorname{\mathsf{Pred}}{(X,\mathcal{A})},\mathfrak{m}\in\operatorname{\mathcal{G}}(X,\mathcal{A})$}).

(3)

Henceforth, we drop the $\sigma$ -algebra from the subscript whenever it is clear from the context. Thus, $\sigma_{X}(p)(\mathfrak{m})$ is the expectation of the random variable $p$ under the measure $\mathfrak{m}$ .
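For finitely supported measures the integral in (3) is a finite weighted sum, and naturality of $\sigma$ amounts to the change-of-variables identity $\sigma_{X}(q\circ f)(\mathfrak{m})=\sigma_{Y}(q)(\operatorname{\mathcal{G}}(f)(\mathfrak{m}))$ . The following sketch checks this identity on one hypothetical example; the function names `sigma` and `pushforward` and all numbers are assumptions made for illustration.

```python
def sigma(p, m):
    """Expectation of the predicate p under the finitely supported measure m."""
    return sum(p[x] * mass for x, mass in m.items())


def pushforward(f, m):
    """G(f)(m): the image measure of m along f."""
    out = {}
    for x, mass in m.items():
        out[f[x]] = out.get(f[x], 0.0) + mass
    return out


# Hypothetical data: a map f, a predicate q on its codomain, a measure m on its domain.
f = {"x1": "y1", "x2": "y1", "x3": "y2"}
q = {"y1": 0.25, "y2": 1.0}
m = {"x1": 0.5, "x2": 0.25, "x3": 0.25}

lhs = sigma({x: q[f[x]] for x in f}, m)   # sigma_X(f^*(q))(m)
rhs = sigma(q, pushforward(f, m))         # sigma_Y(q)(G(f)(m))
print(lhs, rhs)
```

Both sides evaluate to the same expectation, which is the finite shadow of the naturality statement proved for analytic spaces.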

Theorem 8.

The mapping $\sigma$ defined in (3) is a natural transformation valued in $\mathsf{lsPred}$ ; thus, an indexed category morphism. Moreover, $\sigma$ preserves directed suprema which is a consequence of the monotone convergence theorem well known in measure theory.

Note that predicate lifting improves universally measurable predicates even to Borel measurable predicates for analytic spaces; the proof of this fact can be found in [2].

Thus, A 3 is satisfied and invoking the $\hat{\sigma}$ given in (1) gives the usual Wasserstein lifting for the Giry endofunctor $\operatorname{\mathcal{G}}$ as expected.

Definition 9.

A predicate $d\in\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))$ is a pseudometric on $(X,\mathcal{A})\in\textnormal{{Ana}}$ iff $d$ is reflexive, symmetric, and satisfies the triangle inequality.

Moreover, a probability measure $\mathfrak{c}\in\operatorname{\mathcal{G}}((X,\mathcal{A})\times(X,\mathcal{A}))$ is a coupling for two probability measures $\mathfrak{m},\mathfrak{n}$ iff $\operatorname{\mathcal{G}}(\operatorname{pr}_{1})(\mathfrak{c})=\mathfrak{m}$ and $\operatorname{\mathcal{G}}(\operatorname{pr}_{2})(\mathfrak{c})=\mathfrak{n}$ . We write $K(\mathfrak{m},\mathfrak{n})$ to denote the set of all couplings for the probability measures $\mathfrak{m},\mathfrak{n}$ .
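Both notions in the definition can be tested mechanically on a finite set. The sketch below reads "reflexive" as $d(x,x)=0$ and checks the marginal conditions of a coupling; the independent (product) coupling shows that $K(\mathfrak{m},\mathfrak{n})$ is never empty. The helper names, the tolerance, and the example data are assumptions of this illustration.

```python
from itertools import product


def is_pseudometric(d, points, tol=1e-12):
    """Check d(x, x) = 0, symmetry and the triangle inequality on a finite set."""
    reflexive = all(abs(d[x][x]) <= tol for x in points)
    symmetric = all(abs(d[x][y] - d[y][x]) <= tol
                    for x, y in product(points, repeat=2))
    triangle = all(d[x][z] <= d[x][y] + d[y][z] + tol
                   for x, y, z in product(points, repeat=3))
    return reflexive and symmetric and triangle


def is_coupling(c, m, n, points, tol=1e-12):
    """Check that the two marginals of c (images along pr_1 and pr_2) are m and n."""
    first = all(abs(sum(c[x][y] for y in points) - m[x]) <= tol for x in points)
    second = all(abs(sum(c[x][y] for x in points) - n[y]) <= tol for y in points)
    nonneg = all(c[x][y] >= -tol for x, y in product(points, repeat=2))
    return first and second and nonneg


def product_coupling(m, n, points):
    """The independent coupling of m and n."""
    return {x: {y: m[x] * n[y] for y in points} for x in points}


# Hypothetical two-point example.
points = ["u", "v"]
d = {"u": {"u": 0.0, "v": 0.5}, "v": {"u": 0.5, "v": 0.0}}
m = {"u": 0.5, "v": 0.5}
n = {"u": 0.25, "v": 0.75}
c = product_coupling(m, n, points)
print(is_pseudometric(d, points), is_coupling(c, m, n, points))
```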

Proposition 10.

Let $d$ be a pseudometric on a space $(X,\mathcal{A})\in\textnormal{{Ana}}$ . Then $\hat{\sigma}(d)$ is a pseudometric on $\operatorname{\mathcal{G}}(X,\mathcal{A})$ , and the lifting $\hat{\sigma}$ given in (1) evaluates to the following well-known formula associated with the Wasserstein lifting of probability measures:

\hat{\sigma}(d)(\mathfrak{m},\mathfrak{n})=\inf_{\mathfrak{c}\in K(\mathfrak{m},\mathfrak{n})}\int d\mathop{}\!\mathrm{d}\mathfrak{c}
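On a two-point space the infimum over couplings can be computed exactly, because the couplings of $\mathfrak{m}=(m_{0},1-m_{0})$ and $\mathfrak{n}=(n_{0},1-n_{0})$ form a one-parameter family and the cost is linear in that parameter. The sketch below exploits this; it is a special case chosen so that no linear-programming solver is needed, and the function name and numbers are assumptions of the example.

```python
def wasserstein_two_point(d, m0, n0):
    """inf over couplings c of (m0, 1-m0) and (n0, 1-n0) on {0, 1} of the integral of d.

    Every coupling has the form
        c = [[t, m0 - t], [n0 - t, 1 - m0 - n0 + t]]
    with max(0, m0 + n0 - 1) <= t <= min(m0, n0). The cost is linear in t,
    so the infimum is attained at one of the two endpoints.
    """
    def cost(t):
        c = [[t, m0 - t], [n0 - t, 1.0 - m0 - n0 + t]]
        return sum(d[i][j] * c[i][j] for i in range(2) for j in range(2))

    lo = max(0.0, m0 + n0 - 1.0)
    hi = min(m0, n0)
    return min(cost(lo), cost(hi))


# Hypothetical pseudometric with d(0, 1) = 0.5.
d = [[0.0, 0.5], [0.5, 0.0]]
w = wasserstein_two_point(d, 0.5, 0.25)
print(w)  # 0.125, i.e. d(0, 1) * |m0 - n0|
```

For larger finite state spaces the same infimum is a transportation linear program and would need a general solver.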

4.2 Distance lifting for $B\in\{B_{\mathsf{MRP}},B_{\mathsf{MDP}}\}$

One way to define the distance lifting $\sigma^{\mathsf{MRP}}$ for $B_{\mathsf{MRP}}$ , i.e. a map of type

\sigma^{\mathsf{MRP}}_{X}\colon\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))\Rightarrow\operatorname{\mathsf{Pred}}(B_{\mathsf{MRP}}(X,\mathcal{A})\times B_{\mathsf{MRP}}(X,\mathcal{A})),

is to first define a predicate lifting for $B_{\mathsf{MRP}}$ and then use the equation in (1) where $B=B_{\mathsf{MRP}}$ . To this end one may follow [28, Subsection 5.6.2] in deriving a predicate lifting for $B_{\mathsf{MRP}}$ in a compositional manner. These results (though stated for the category of sets) can be generalised to measurable spaces, but they are only applicable when the underlying endofunctor preserves weak pullbacks. In particular, it is known that the Giry functor (a composite functor in the case of $B_{\mathsf{MRP}}$ ) does not preserve weak pullbacks in Meas [41].

So instead of compositionally deriving predicate liftings for $B_{\mathsf{MRP}}$ and then invoking (1), we derive the coupling based lifting for $B_{\mathsf{MRP}}$ in three stages:

$\blacksquare$ First, we view $B_{\mathsf{MRP}}$ as the composition of functors $B_{I}\circ\operatorname{\mathcal{G}}$ where $B_{I}\colon\textnormal{{Ana}}\to\textnormal{{Ana}}$ maps every space to its product with the unit interval, i.e. $B_{I}=\text{Id}\times([0,1],\mathcal{B}_{[0,1]}).$

$\blacksquare$ Second, for a fixed $c\in[0,1]$ , we define

$\sigma^{c}_{X}\colon\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))\Rightarrow\operatorname{\mathsf{Pred}}(B_{I}(X,\mathcal{A})\times B_{I}(X,\mathcal{A}))$

as $\sigma^{c}_{X}(d)((x,r),(y,s))=cd(x,y)+(1-c)|r-s|$ , for $x,y\in X$ and $r,s\in[0,1]$ .

$\blacksquare$ Third, recall $\hat{\sigma}$ from Subsection 4.1 and let $\sigma^{\mathsf{MRP}}$ be the composition:

$\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))\xrightarrow{\hat{\sigma}_{X}}\operatorname{\mathsf{Pred}}(\operatorname{\mathcal{G}}(X,\mathcal{A})\times\operatorname{\mathcal{G}}(X,\mathcal{A}))\xrightarrow{\sigma_{\operatorname{\mathcal{G}}X}^{c}}\operatorname{\mathsf{Pred}}(B_{\mathsf{MRP}}(X,\mathcal{A})\times B_{\mathsf{MRP}}(X,\mathcal{A})).$

Note that $c$ may take the extremal values 0 and 1. This is possible – in contrast to [18] – as the bisimulation distance is not obtained using a contraction-based fixpoint argument. However, in the extreme cases the bisimulation distance would not take into account either the transition or the reward part.
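The second stage and the role of the extremal values of $c$ can be illustrated directly. The sketch below implements $\sigma^{c}$ for an arbitrary distance given as a Python function; the discrete distance used as input, the name `sigma_c`, and the numbers are assumptions of the example (in the construction above, the input would be $\hat{\sigma}(d)$ on $\operatorname{\mathcal{G}}(X,\mathcal{A})$ ).

```python
def sigma_c(d, c):
    """Lift a distance d on Z to Z x [0, 1]: c * d(x, y) + (1 - c) * |r - s|."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("the discount factor c must lie in [0, 1]")

    def lifted(left, right):
        (x, r), (y, s) = left, right
        return c * d(x, y) + (1.0 - c) * abs(r - s)

    return lifted


def discrete(x, y):
    # Hypothetical stand-in distance: 0 on equal points, 1 otherwise.
    return 0.0 if x == y else 1.0


left, right = ("u", 0.25), ("v", 0.75)
mixed = sigma_c(discrete, 0.75)(left, right)        # 0.75 * 1 + 0.25 * 0.5
reward_only = sigma_c(discrete, 0.0)(left, right)   # ignores the first component
transition_only = sigma_c(discrete, 1.0)(left, right)  # ignores the rewards
print(mixed, reward_only, transition_only)
```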

Lemma 11.

The above mapping $\sigma^{c}$ is well defined. Moreover, for any pseudometric $d\in\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))$ , the lift $\sigma^{\mathsf{MRP}}_{X}(d)$ is given by

\sigma^{\mathsf{MRP}}_{X}(d)((\mathfrak{m},r),(\mathfrak{n},s))=c\left(\inf_{\mathfrak{c}\in K(\mathfrak{m},\mathfrak{n})}\int d\mathop{}\!\mathrm{d}\mathfrak{c}\right)+(1-c)|r-s|

for $\mathfrak{m},\mathfrak{n}\in\operatorname{\mathcal{G}}(X,\mathcal{A})$ , and $r,s\in[0,1]$ and is a pseudometric.

In a similar vein, we can now define a distance lifting $\sigma^{\mathsf{MDP}}$ for $B_{\mathsf{MDP}}$ (whose coalgebras model MDPs) by letting $B_{\mathsf{MDP}}=B_{\Sigma}\circ B_{\mathsf{MRP}}$ , where $B_{\Sigma}=\prod_{\Sigma}\text{Id}$ . Now consider the distance lifting $\sigma^{\Sigma}_{(\,\_\,)}$ for $B_{\Sigma}$ as follows:

\sigma^{\Sigma}(d)(\vec{x},\vec{y})=\sup_{a\in\Sigma}d(\vec{x}(a),\vec{y}(a)),% \quad\text{(for $\vec{x},\vec{y}\in\prod_{\Sigma}X$)}.

Lemma 12.

The above mapping $\sigma^{\Sigma}$ is well defined. Moreover, the mapping $\sigma^{\mathsf{MDP}}=\sigma^{\Sigma}_{B_{\mathsf{MRP}}(\,\_\,)}\circ\sigma^{% \mathsf{MRP}}$ for a distance $d\in\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))$ evaluates to

\sigma^{\mathsf{MDP}}(d)((\vec{\mathfrak{m}},\vec{r}),(\vec{\mathfrak{n}},\vec{s}))=\sup_{a\in\Sigma}\left[c\left(\inf_{\mathfrak{c}\in K(\vec{\mathfrak{m}}(a),\vec{\mathfrak{n}}(a))}\int d\mathop{}\!\mathrm{d}\mathfrak{c}\right)+(1-c)|\vec{r}(a)-\vec{s}(a)|\right].
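For intuition, the lifting of Lemma 12 can be evaluated by hand when the state space has only two points. The following sketch is an illustration under assumptions that are not part of the paper: $X=\{0,1\}$, a pseudometric on $X$ is represented by the single number $t=d(0,1)$, a distribution on $X$ is represented by the probability it assigns to state 0, and the infimum over couplings is replaced by its two-point closed form $t\cdot|p-q|$. All function names are hypothetical.

```python
# Illustrative sketch: the lifting sigma^MDP on the two-point space X = {0, 1}.
# Assumption: on two points an optimal coupling keeps as much mass as possible
# on the diagonal, so the coupling infimum equals d(0,1) * |p - q|.

def wasserstein_two_point(t, p, q):
    """Coupling-based lifting of d (with d(0,1) = t) to the distributions
    (p, 1-p) and (q, 1-q) on {0, 1}."""
    return t * abs(p - q)

def sigma_mrp(t, c, m, r, n, s):
    """c * W(d)(m, n) + (1 - c) * |r - s| for rewards r, s in [0, 1]."""
    return c * wasserstein_two_point(t, m, n) + (1 - c) * abs(r - s)

def sigma_mdp(t, c, left, right):
    """Supremum over actions; `left` and `right` map each action to a pair
    (probability of state 0, reward)."""
    return max(
        sigma_mrp(t, c, left[a][0], left[a][1], right[a][0], right[a][1])
        for a in left
    )

left = {"a": (0.9, 1.0), "b": (0.5, 0.2)}
right = {"a": (0.4, 0.0), "b": (0.5, 0.2)}
value = sigma_mdp(1.0, 0.5, left, right)
```

In this example the supremum is attained at action `a`. Setting `c` to 0 or 1 makes the result depend only on the rewards or only on the transitions, respectively, in line with the remark on the extremal values of $c$.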

Composing either of the two distance liftings $\sigma^{B}$ (for $B\in\{B_{\mathsf{MRP}},B_{\mathsf{MDP}}\}$ ) with a $B$ -coalgebra yields the desired functional as given in (2). Clearly, A 4 is satisfied since $\operatorname{\mathsf{Pred}}$ has countable suprema and they are preserved by reindexing functors. We end this subsection by showing that the least fixpoint exists for both of these functionals, which also paves the way to computing bisimulation pseudometrics for these systems. To this end we need a general result, whose proof is based on some classical results: a non-topological version of the Riesz–Markov–Kakutani representation theorem [13, IV.5.1], the Banach-Alaoglu theorem [13, V.4.2] and Sion’s minimax theorem [35, Thm. 3].

Theorem 13.

Let $\gamma\colon(X,\mathcal{A})\to\operatorname{\mathcal{G}}(X,\mathcal{A})\in% \textnormal{{Ana}}$ . Then the functional $\gamma\circ\hat{\sigma}$ is $\omega$ -cpo continuous w.r.t. $\leq$ , i.e. for any $\leq$ -increasing sequence $d_{i}\in\operatorname{\mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))$ of pseudometrics with $i\in\mathbb{N}$ , we have (for each $x,y\in X$ ):

\inf_{\mathfrak{c}\in K(\gamma_{x},\gamma_{y})}\int\sup_{i\in\mathbb{N}}d_{i}% \mathop{}\!\mathrm{d}\mathfrak{c}=\adjustlimits{\sup}_{i\in\mathbb{N}}{\inf}_{% \mathfrak{c}\in K(\gamma_{x},\gamma_{y})}\int d_{i}\mathop{}\!\mathrm{d}% \mathfrak{c}\text{.}

$\blacktriangleright$ Remark 14.

The above theorem can be stated for general measurable spaces as well, but by restricting the coalgebra map so that $\gamma(x)$ (for each $x\in X$ ) is a perfect measure (see Subsection A.1). Perfect measures were introduced by Kolmogorov [24, 22–23] and have many different equivalent definitions. For us a measure space $(X,\mathcal{A},\mathfrak{m})$ is called perfect, if for any separable metrisable space $({Y},\mathcal{T})$ and every measurable map $f\colon X\to{Y}$ we have the following property: For every $A\in\mathcal{A}$ and $r<\mathfrak{m}(A)$ there is a compact set $K\subseteq\operatorname{im}f$ with $\mathfrak{m}(A\cap{f}^{-1}K)\geq r$ , cf. [20, 451O(a)]. In this case, the measure $\mathfrak{m}$ is called perfect.

$\blacktriangleright$ Remark 15.

Let $p\in[1,\infty)$ and recall the $p$ -Wasserstein distance between probability measures $\mathfrak{m},\mathfrak{n}\in\operatorname{\mathcal{G}}(X,\mathcal{A})$ . Below we argue how to capture this lifting in our setup.

W_{p}(\mathit{d})(\mathfrak{m},\mathfrak{n})=\sqrt[{\textstyle p}]{\inf_{% \mathfrak{c}\in K(\mathfrak{m},\mathfrak{n})}\int{\mathit{d}}^{p}\mathop{}\!% \mathrm{d}\mathfrak{c}},\qquad\text{for a pseudometric $d\in\operatorname{% \mathsf{Pred}}((X,\mathcal{A})\times(X,\mathcal{A}))$}.

Note that any monotonically increasing lower semicontinuous function $f\colon[0,1]\to[0,1]$ induces an $\omega$ -cpo-continuous map $f\circ\,\_\,\colon\operatorname{\mathsf{Pred}}(X,\mathcal{A})\to\operatorname{\mathsf{Pred}}(X,\mathcal{A})$ . It is monotone by monotonicity of $f$ and for any increasing sequence $(p_{i}\in\operatorname{\mathsf{Pred}}(X,\mathcal{A}))_{i\in\mathbb{N}}$ we have

\sup_{i\in\mathbb{N}}f(p_{i}(x))=f\left(\sup_{i\in\mathbb{N}}p_{i}(x)\right)% \qquad\text{(for each $x\in X$)}.

Note that both $(\,\_\,)^{p}$ and $\sqrt[p]{\vphantom{l}\,\_\,}$ are monotonically increasing lower-semicontinuous functions. So $W_{p}=\sqrt[p]{\vphantom{l}\,\_\,}\mathbin{\circ}\hat{\sigma}\mathbin{\circ}(\,\_\,)^{p}$ is $\omega$ -cpo-continuous as a composition of $\omega$ -cpo-continuous functions.
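The composition in Remark 15 can be made concrete on a two-point space. The sketch below is only an illustration: it assumes $X=\{0,1\}$, encodes a pseudometric by $t=d(0,1)$ and a distribution by the mass it puts on state 0, and uses the two-point closed form of the coupling infimum. The names `sigma_hat` and `wasserstein_p` are hypothetical.

```python
# Illustrative sketch: W_p as the composition  root . sigma_hat . power
# on the two-point space X = {0, 1} (closed-form coupling infimum assumed).

def sigma_hat(t, m, n):
    """Coupling-based lifting of d with d(0,1) = t to (m, 1-m) and (n, 1-n)."""
    return t * abs(m - n)

def wasserstein_p(t, m, n, p):
    """p-th root of the lifting applied to the p-th power of the distance."""
    return sigma_hat(t ** p, m, n) ** (1.0 / p)

w1 = wasserstein_p(0.5, 0.75, 0.25, 1)
w2 = wasserstein_p(0.5, 0.75, 0.25, 2)
```

For $p=1$ the function reduces to `sigma_hat` itself, and in this example the value for $p=2$ is larger than the value for $p=1$.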

Corollary 16.

Let $\gamma_{B}\colon(X,\mathcal{A})\to B(X,\mathcal{A})\in\textnormal{{Ana}}$ be a coalgebra where $B\in\{B_{\mathsf{MRP}},B_{\mathsf{MDP}}\}$ . Then the least fixpoint for the functionals $\gamma_{B}\circ\sigma^{B}$ exists.

Using the fact that any $\omega$ -cpo-continuous endofunction has a least fixpoint by Kleene’s fixpoint theorem, we write $\operatorname{\mathbf{bd}}_{c}^{\gamma}$ (or simply $\operatorname{\mathbf{bd}}$ whenever the coalgebra structure is clear from the context) to denote the least fixpoint of the functionals in the above corollary.
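The Kleene chain behind this least fixpoint can be traced explicitly on a two-state system. The following is a minimal sketch under assumptions that are not made in the paper: the state space is $X=\{0,1\}$, a pseudometric is encoded by the single number $t=d(0,1)$, the coupling infimum is replaced by its two-point closed form, and the dictionary layout of `mdp` is hypothetical.

```python
# Illustrative sketch: Kleene iteration for the least fixpoint on X = {0, 1}.
# Assumption: a pseudometric on two points is the single number t = d(0, 1),
# and the coupling infimum has the two-point closed form t * |p - q|.

def functional(t, c, mdp):
    """One application of the functional to the distance t = d(0, 1).
    mdp[a][x] is (probability of moving to state 0, reward) for state x."""
    return max(
        c * t * abs(mdp[a][0][0] - mdp[a][1][0])
        + (1 - c) * abs(mdp[a][0][1] - mdp[a][1][1])
        for a in mdp
    )

def kleene_chain(c, mdp, steps):
    """The chain bottom <= F(bottom) <= F(F(bottom)) <= ..., with bottom = 0."""
    chain = [0.0]
    for _ in range(steps):
        chain.append(functional(chain[-1], c, mdp))
    return chain

mdp = {"a": {0: (0.75, 1.0), 1: (0.25, 0.0)}}
chain = kleene_chain(0.5, mdp, 60)
```

With a single action the functional is affine, $t\mapsto c\,t\,|\Delta p|+(1-c)|\Delta r|$, so the supremum of the chain is $(1-c)|\Delta r|/(1-c|\Delta p|)$, which is $2/3$ for the values above. In general the least fixpoint is the supremum of the whole chain and need not be reached after finitely many steps.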

Note that by using Kleene’s fixpoint theorem we require weaker assumptions than [17, 3.12], who use the Banach fixed point theorem to define their bisimulation pseudometric and therefore restrict themselves to a set-up with contractions.

5 A quantitative modal logic and its expressivity

The signatures of the (logical) languages considered in this paper are parametrised by a set $\{\,f_{iy}\;|\;y\in Y_{i}\,\}$ of $\omega$ -indexed families of $Y_{i}$ -indexed function symbols of arity $n_{i}$ as follows:

\top\mid\neg\,\_\,\mid\,\_\,\land\,\_\,\mid\diamond_{a}\,\_\,,a\in A\mid f_{iy% },y\in Y_{i},i\in\omega

(4)

In other words, the signature we are using extends the semi-lattice signature ( $\top$ , $\land$ ) with negation ( $\neg$ ), modalities $\diamond_{a}$ (one for each action $a$ ) and additional function symbols $f_{iy}$ (each $f_{iy}$ could be viewed as an $n_{i}$ -ary predicate on the unit interval). We simply write $\mathcal{L}$ to denote the set of formulae generated by the above signature. The restriction to countably many families of function symbols will become important when we construct a second-countable topology on $\mathcal{L}$ . Note that we can also view the logical symbols, $\top,\neg\,\_\,,\,\_\,\wedge\,\_\,$ , as (singleton index families of) function symbols. This will be very handy for proofs by structural induction over $\mathcal{L}$ . Throughout this section, we let $\Omega=[0,1]$ and consider only MDPs (the modal logic for MRPs can be derived by letting the set $\Sigma$ of actions be a singleton).

The reasons for choosing this general formulation with index sets $Y_{i}$ are twofold. First, it lets us endow $\mathcal{L}$ with a topology, which is needed to prove both adequacy (i.e. the distance induced by formulae in $\mathcal{L}$ is bounded above by the bisimulation distance $\operatorname{\mathbf{bd}}$ for MDPs) and expressivity (which is the converse of adequacy). Second, it allows greater flexibility in applications, cf. (6), by accommodating additional operators for which adequacy and expressivity still hold.

5.1 Interpretation of modal formulae in $\mathcal{L}$

We give semantics to each formula $\varphi\in\mathcal{L}$ by defining an interpretation $\llbracket\varphi\rrbracket$ as a predicate in $\operatorname{\mathsf{Pred}}(X,\mathcal{A})$ . This is done by structural induction over $\varphi$ in the following way. The logical symbols, i.e. $\top$ (truth), $\neg\,\_\,$ (negation), and $\,\_\,\land\,\_\,$ (conjunction), are interpreted as the functions

1,\quad\uplambda\;x.1-x,\quad\uplambda\;x\,y.\min\{x,y\}\quad\text{% respectively.}

(5)

Next we consider function symbols $f_{iy}$ of arity $n_{i}>0$ (for some $y\in Y_{i}$ ) and a sequence $(\varphi_{j})_{1\leq j\leq n_{i}}$ . Its interpretation is given as

\llbracket f_{iy}\rrbracket(\varphi_{1},\cdots,\varphi_{n_{i}})=f_{iy}(% \llbracket\varphi_{1}\rrbracket,\cdots,\llbracket\varphi_{n_{i}}\rrbracket)% \mbox{ for some fixed }f_{iy}:[0,1]^{n_{i}}\to[0,1]

whilst, $\llbracket f_{iy}\rrbracket=f_{iy}\in[0,1]$ when $n_{i}=0$ (i.e. $f_{iy}$ is a constant). Besides the logical symbols we introduce the following basic families of function symbols indexed by $r,c\in[0,1]$ and $\alpha\in(0,\infty)$ : Scalar addition ( $\,\_\,+r$ ), scalar subtraction ( $\,\_\,-r$ ), scalar multiplication ( $r\,\_\,$ ), and convex combination ( $\,\_\,+_{c}\,\_\,$ ) are interpreted as the function assigning to $x$ (and $y$ )

\min\{1,x+r\},\quad\max\{0,x-r\},\quad rx,\quad x+_{c}y\text{,}

(6)

respectively, throughout this paper. Note that only the first two basic operations are required for our main results.

Finally, for the modal operators we fix a parameter $c\in\Omega$ (which was also used to define bisimulation pseudometrics $\operatorname{\mathbf{bd}}$ in the previous section) and a coalgebra $\gamma\colon(X,\mathcal{A})\to B_{\mathsf{MDP}}(X,\mathcal{A})\in\textnormal{{% Ana}}$ to define the interpretation of $\llbracket\diamond_{a}\varphi\rrbracket$ as:

\llbracket\diamond_{a}\varphi\rrbracket_{\gamma}(x)=\int\llbracket\varphi% \rrbracket\mathop{}\!\mathrm{d}\gamma_{a,x}^{0}\ +_{c}\ \gamma_{a,x}^{1},

(7)

where $r+_{c}s\coloneqq cr+(1-c)s$ denotes convex combination of $r,s\in\Omega$ . Intuitively, this means that the value of $\diamond_{a}\varphi$ is determined by a convex combination of the expected value of $\varphi$ and the utility after performing $a$ .
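On a finite state space the integral in (7) becomes a finite sum, so the interpretation can be computed by direct recursion. The sketch below is an illustration only: the tuple encoding of formulae, the operator tags and the layout of `mdp` are assumptions introduced here, and only the logical symbols, scalar addition, scalar subtraction and the modalities are covered.

```python
# Illustrative sketch: interpreting formulae on a finite MDP.
# Assumptions: formulae are nested tuples, states are 0..n-1, and
# mdp[a][x] = (list of transition probabilities, reward in [0, 1]).

def interpret(phi, mdp, c, x):
    """Value of the formula phi at state x, following (5), (6) and (7)."""
    op = phi[0]
    if op == "top":
        return 1.0
    if op == "not":
        return 1.0 - interpret(phi[1], mdp, c, x)
    if op == "and":
        return min(interpret(phi[1], mdp, c, x), interpret(phi[2], mdp, c, x))
    if op == "plus":   # scalar addition  _ + r
        return min(1.0, interpret(phi[1], mdp, c, x) + phi[2])
    if op == "minus":  # scalar subtraction  _ - r
        return max(0.0, interpret(phi[1], mdp, c, x) - phi[2])
    if op == "dia":    # modality: ("dia", action, subformula)
        probs, reward = mdp[phi[1]][x]
        expected = sum(
            p * interpret(phi[2], mdp, c, y) for y, p in enumerate(probs)
        )
        return c * expected + (1 - c) * reward
    raise ValueError(f"unknown operator: {op}")

mdp = {"a": {0: ([0.75, 0.25], 1.0), 1: ([0.25, 0.75], 0.0)}}
one_step = ("dia", "a", ("top",))
two_step = ("dia", "a", one_step)
values = [interpret(two_step, mdp, 0.5, x) for x in (0, 1)]
```

The difference `values[0] - values[1]` is the amount by which the formula `two_step` separates the two states.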

The defined interpretation function $\llbracket\,\_\,\rrbracket$ gives rise to a (quantitative) theory map $\operatorname{qTh}\colon X\to\Omega^{\mathcal{L}}$ defined by

\operatorname{qTh}(x)(\varphi)\coloneqq\llbracket\varphi\rrbracket(x)\quad% \text{(for every $\varphi\in\mathcal{L}$)}.

(8)

The question whether the theory map lives in our working category Ana is an important step for expressivity (cf. Theorem 28). However, for adequacy (cf. Theorem 22), we only require that the function symbols $f_{iy}$ in $\mathcal{L}$ are nonexpansive w.r.t. the supremum distance $d_{\infty}(\vec{x},\vec{y})=\max_{1\leq j\leq n_{i}}\lvert\vec{x}(j)-\vec{y}(j)\rvert$ . Nonetheless, before attempting these results we need the definition of a logical distance, which at this stage is purely a set-theoretic assignment. Later, in the next subsection, we will show that the logical distance defined below is indeed a predicate over the product of a state space with itself (cf. Subsection 5.2).

Definition 17 (Logical distance).

Given an interpretation $\llbracket\,\_\,\rrbracket\colon\mathcal{L}\to\textnormal{{Set}}(X,[0,1])$ , we define the logical distance as follows (for every $x,y\in X$ ):

\mathbf{d}_{\mathcal{L}}(x,y)\ =\ \sup_{\varphi\in\mathcal{L}}\ \left|% \llbracket\varphi\rrbracket(x)-\llbracket\varphi\rrbracket(y)\right|\ =\ \sup_% {\varphi\in\mathcal{L}}\ \llbracket\varphi\rrbracket(x)-\llbracket\varphi% \rrbracket(y)\enspace.

(9)

Note that the absolute value is redundant since the negation operator $\neg$ is in our logic $\mathcal{L}$ .
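Spelled out, the redundancy follows directly from the interpretation of negation in (5): for every formula $\varphi$ and all states $x,y$,

```latex
\llbracket\neg\varphi\rrbracket(x)-\llbracket\neg\varphi\rrbracket(y)
  = \bigl(1-\llbracket\varphi\rrbracket(x)\bigr)-\bigl(1-\llbracket\varphi\rrbracket(y)\bigr)
  = \llbracket\varphi\rrbracket(y)-\llbracket\varphi\rrbracket(x).
```

Hence, whenever the difference for $\varphi$ is negative, the formula $\neg\varphi$ realises its absolute value, and the two suprema in (9) agree.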

5.2 Endowing a topology on $\mathcal{L}$ through its shapes

As we are dealing with more than countably many function symbols, we will have to impose some structure on the set of function symbols in order to prove our expressivity theorem (cf. Theorem 28). Technically, we need a topology $\mathcal{T}_{\mathcal{L}}$ on $\mathcal{L}$ so that the theory map $\operatorname{qTh}$ (8) becomes topologisable in the following sense.

Definition 18.

Let $(X,\mathcal{A})\in\textnormal{{Ana}}$ . Then the theory map $\operatorname{qTh}\colon X\to\Omega^{\mathcal{L}}$ is topologisable iff there is a topology $\mathcal{T}_{\mathcal{L}}$ on $\mathcal{L}$ such that the preimage of every open set $U\subseteq\Omega^{\mathcal{L}}$ (in the compact-open topology on the function space $\Omega^{\mathcal{L}}$ ) is a universally measurable set, i.e. ${\operatorname{qTh}}^{-1}(U)\in\overline{\mathcal{A}}$ .

To be able to do this, we switch our focus from the uncountable language $\mathcal{L}$ to the countable language $\operatorname{Sh}(\mathcal{L})$ of shapes induced by the language $\mathcal{L}$ . This language $\operatorname{Sh}(\mathcal{L})$ of shapes is constructed by collapsing all function symbols indexed by the same index set, i.e.

\top\mid\neg\,\_\,\mid\,\_\,\land\,\_\,\mid\diamond_{a}\,\_\,,a\in A\mid f_{i}% ,i\in\omega\text{.}

(10)

To each formula $\varphi$ in $\mathcal{L}$ one can assign a formula in $\operatorname{Sh}(\mathcal{L})$ by replacing each $f_{iy}$ by $f_{i}$ . This defines an equivalence relation on $\mathcal{L}$ : two formulae are equivalent iff they have the same shape. For each $\psi\in\operatorname{Sh}(\mathcal{L})$ denote by $\widehat{\psi}\subseteq\mathcal{L}$ the corresponding equivalence class. For instance, for an $n_{i}$ -ary function symbol $f_{i}$ the set $\widehat{\psi}$ for $\psi=f_{i}(\top,\ldots,\top)$ is in canonical bijection with $Y_{i}$ , and the shape $r\neg\top\wedge r\top$ (when $\mathcal{L}$ allows for scalar multiplication) corresponds to $\Omega\times\Omega$ , as each $r$ ranges over $\Omega=[0,1]$ . Through this equivalence relation we can subdivide $\mathcal{L}$ into countably many chunks, each of which is associated with a finite product of $Y_{i}$ ’s. If for a family $\{f_{y}\}_{y\in Y}$ of $n$ -ary function symbols the set $Y$ is endowed with a $\sigma$ -algebra (resp. topology), we call $\llbracket\,\_\,\rrbracket$ jointly measurable (resp. jointly continuous), if the function $[0,1]^{n}\times Y\to[0,1]$ , $(\vec{x},y)\mapsto f_{y}(\vec{x})$ , is measurable (resp. continuous) with respect to the respective product $\sigma$ -algebra (resp. product topology).
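The passage from a formula to its shape, and the identification of a shape class with a product of index sets, can be sketched as follows. This is an illustration under assumptions made only here: formulae are nested tuples, the indexed families are scalar addition, subtraction and multiplication (tagged `plus`, `minus`, `scale`), and the index is stored as the last component of the tuple.

```python
# Illustrative sketch: collapsing a formula to its shape.
# Assumption: formulae are nested tuples; indexed function symbols carry
# their index y as the last component, e.g. ("scale", phi, 0.3).

INDEXED = {"plus", "minus", "scale"}  # hypothetical indexed families

def shape(phi):
    """Forget the index of every indexed symbol, keep the syntax tree."""
    op, *args = phi
    if op in INDEXED:
        *subformulae, _index = args
        return (op, *(shape(s) for s in subformulae))
    if op == "dia":  # the action label is part of the shape
        action, sub = args
        return (op, action, shape(sub))
    return (op, *(shape(s) for s in args))

def indices(phi):
    """The tuple of indices that identifies phi within its shape class."""
    op, *args = phi
    if op in INDEXED:
        *subformulae, index = args
        return tuple(i for s in subformulae for i in indices(s)) + (index,)
    if op == "dia":
        return indices(args[1])
    return tuple(i for s in args for i in indices(s))

phi = ("and", ("scale", ("not", ("top",)), 0.3), ("scale", ("top",), 0.8))
psi = ("and", ("scale", ("not", ("top",)), 0.5), ("scale", ("top",), 0.1))
```

Here `phi` and `psi` have the same shape, and `indices` maps each of them to a point of $\Omega\times\Omega$, matching the example $r\neg\top\wedge r\top$ in the text.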

For the remainder of this subsection $\Omega^{(X,\mathcal{T})}$ will denote the set of continuous functions from $(X,\mathcal{T})$ to $\Omega$ , which will be endowed with the compact-open topology [14, 3.4] if not explicitly stated otherwise. Recall that a Hausdorff topological space $(X,\mathcal{T})$ is called locally compact, if each point admits a compact neighbourhood.

Lemma 19 ([14], 3.4.16).

Let $(X,\mathcal{T})$ be a locally compact second countable space. If, moreover, $\Omega$ is second countable, then $\Omega^{(X,\mathcal{T})}$ is second countable w.r.t. the compact-open topology.

$\blacktriangleright$ Remark 20.

Lemma 19 will be used only once but at a crucial point in Theorem 28, which is our main result. One could ask if one can extend the class of spaces for which Lemma 19 holds. For instance, does it hold for arbitrary second countable Hausdorff spaces? An ansatz would be to weaken the notion of compactness further. A promising notion is that of a k-space, the quotient space of some locally compact space, i.e. there is a quotient map $q\colon({Y},\mathcal{T}_{{Y}})\to(X,\mathcal{T})$ for some locally compact space $(X,\mathcal{T})$ [14, p. 152]. The reason is that in this case there is a decomposition $\Omega^{({Y},\mathcal{T}_{{Y}})}=\lim\limits_{\longleftarrow}{}_{i}\Omega^{K_{i}}$ for any directed system of compact sets $K_{i}\subseteq X$ with $\bigcup_{i}K_{i}=X$ [14, 3.4.11]. Second countability follows as soon as $i$ can be chosen to range over a countable set. Unfortunately, this already implies hemicompactness of $({Y},\mathcal{T}_{{Y}})$ in case of regular spaces, cf. [40, 8.1(d)] and [14, 3.4.E(c)]. [40] also discusses other weakenings of compactness, but the mentioned Fact 8.1 therein does not provide a remedy.

We end this subsection with the main results of this section; namely that the logical distance $\mathbf{d}_{\mathcal{L}}$ is a predicate on $(X,\mathcal{A})\times(X,\mathcal{A})$ and the logic $\mathcal{L}$ is adequate w.r.t. $\operatorname{\mathbf{bd}}_{c}$ .

Lemma 21.

If $\operatorname{qTh}$ on $(X,\mathcal{A})$ is topologisable, then $\mathbf{d}_{\mathcal{L}}$ is measurable.

Theorem 22.

If the function symbols $f_{iy}$ are interpreted by $\llbracket\,\_\,\rrbracket$ as nonexpansive functions w.r.t. $\mathrm{d}_{\infty}$ , then the logic $\mathcal{L}$ is adequate, i.e. $\operatorname{\mathbf{bd}}\geq\mathbf{d}_{\mathcal{L}}$ .

Moreover, the language consisting of logical symbols, modalities, scalar addition, subtraction, multiplication, and convex combination is always adequate.

$\blacktriangleright$ Remark 23.

With the aim of developing a continuous version of first-order logic, Yaacov and Berenstein studied metric structures and their model theory in [42]. Simply put, a metric structure consists of a complete bounded metric space $(M,d)$ with a set of $\mathbb{R}$ -valued predicates, a set of operators on the metric space (which are uniformly continuous of type $M^{n}\to M$ for some $n>0$ ), and a set of distinguished elements of $M$ . It is worthwhile to note that the operators defined by our signature (4) form a special case of a metric structure on the unit interval with an empty set of predicates.

5.3 A general expressivity theorem for $\mathcal{L}$

Our main expressivity result (cf. Theorem 28) is based on the well known Stone-Weierstraß theorem and Kantorovich-Rubinstein duality. This follows the tradition of expressivity results from recent papers [29, 1, 19] on coalgebraic modal logic; however, the proof of Theorem 28 does not follow from the abstract results established in the aforementioned articles. In contrast, we need an extra condition that the theory map $\operatorname{qTh}$ is topologisable (cf. Subsection 5.2).

Definition 24.

A set $L$ of functions $X\to[0,1]$ approximates a function $f\colon X\to\mathbb{R}$ at a pair $x,y\in X$ up to $\varepsilon$ (for some $\varepsilon>0$ ) if there is a $g\in L$ with $\lvert g(x)-f(x)\rvert,\lvert g(y)-f(y)\rvert<\varepsilon$ . We further say that $L$ approximates $f$ at $x, y$ if $L$ approximates $f$ at $x, y$ up to $\varepsilon$ for each $\varepsilon>0$ .

Henceforth we write $\llbracket\mathcal{L}\rrbracket=\{\,\llbracket\varphi\rrbracket\;|\;\varphi\in\mathcal{L}\,\}$ . It turns out that the operators (truth, conjunction, positive, and negative scaling) in our logic $\mathcal{L}$ approximate any predicate over a state space.

The following lemma is a continuous (and thus simpler) version of [7, Lem. 10].

Lemma 25.

Assume $\top,\,\_\,\wedge\,\_\,,\,\_\,+r,\,\_\,-r\in\mathcal{L}$ for all $r\in\Omega$ and $\llbracket\,\_\,\rrbracket\colon\mathcal{L}\to\operatorname{\mathsf{Pred}}(X,% \mathcal{A})$ an interpretation on predicates on some measurable space $(X,\mathcal{A})$ . Let $p\in\operatorname{\mathsf{Pred}}(X,\mathcal{A})$ , $x,y\in X$ and $\varepsilon>0$ .

Then

	$\displaystyle\exists\varphi\in\mathcal{L}\colon 0\leq p(x)-p(y)<\llbracket% \varphi\rrbracket(x)-\llbracket\varphi\rrbracket(y)+\varepsilon$	(11a)
	$\displaystyle\implies\bigl{(}\exists\psi\in\mathcal{L}\colon p(x)-\llbracket% \psi\rrbracket(x)\in[0,\varepsilon)\text{ and }p(y)-\llbracket\psi\rrbracket(y% )=0\bigr{)}$	(11b)
	$\displaystyle\implies\llbracket\mathcal{L}\rrbracket\text{ approximates $p$ at% $x,y$ up to }\varepsilon\text{.}$	(11c)

With the help of this lemma, we can approximate any short (aka nonexpansive) predicate w.r.t. the logical distance $\mathbf{d}_{\mathcal{L}}$ by formulae in our logic $\mathcal{L}$ . Given a measurable space $(X,\mathcal{A})$ , we define the set of short predicates with respect to the logical distance $\mathbf{d}_{\mathcal{L}}$ to be

\operatorname{\mathsf{Pred}}(X,\mathcal{A},\mathbf{d}_{\mathcal{L}})\coloneqq% \{\,h\in\operatorname{\mathsf{Pred}}(X,\mathcal{A})\;|\;\forall x,y\in X\colon% \mathbf{d}_{\mathcal{L}}(x,y)\geq h(x)-h(y)\,\}\text{.}

(12)

Corollary 26.

As long as the scalar addition is allowed in the signature of $\mathcal{L}$ in (4), every short predicate can be approximated by the interpretation of modal formulae in $\llbracket\mathcal{L}\rrbracket$ .

Proof.

Take $h\in\operatorname{\mathsf{Pred}}(X,\mathcal{A},\mathbf{d}_{\mathcal{L}})$ . As $\neg\,\_\,\in\mathcal{L}$ we have for each $x,y\in X$ by (9) that $h(x)-h(y)\leq\sup_{f\in\llbracket\mathcal{L}\rrbracket}f(x)-f(y)$ . This implies (11a) for any $\varepsilon>0$ . Thus $\llbracket\mathcal{L}\rrbracket$ approximates $h$ at $x, y$ up to any $\varepsilon>0$ by Lemma 25. Hence the claim follows. $\hfill\blacktriangleleft$

The next lemma is the well-known Kantorovich-Rubinstein duality extended to perfect measures. We provide a proof in the appendix, since the rather sketchy proof in [33, Thm. 5] considers only distances on their induced Borel- $\sigma$ -algebra, while other known proofs, especially [12, 11.8.2&6], chose a topological instead of a purely measure theoretic set-up.

Lemma 27 (Kantorovich-Rubinstein theorem).

Let $\mathfrak{m},\mathfrak{n}\in\operatorname{\mathcal{G}}(X,\mathcal{A})$ be perfect measures and $d\colon(X,\mathcal{A})\times(X,\mathcal{A})\to([0,1],\mathcal{B}_{[0,1]})$ a l.s.m. pseudo-metric such that $(X,\mathcal{A})$ is analytic (or smooth, or $\mathcal{T}_{d}$ , the topology induced by $d$ , is contained in $\overline{\mathcal{A}}$ ) and $\mathit{d}(\,\_\,,x_{0})$ is integrable for every $x_{0}\in X$ . Then

\inf_{\mathfrak{c}\in K(\mathfrak{m},\mathfrak{n})}\int\mathit{d}\mathop{}\!% \mathrm{d}\mathfrak{c}=\sup_{h\in\operatorname{\mathsf{Pred}}(X,\mathcal{A},% \mathit{d})}\int h\mathop{}\!\mathrm{d}(\mathfrak{m}-\mathfrak{n})\text{.}
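As a sanity check, both sides of the duality can be computed by hand on a two-point space. The following worked example is an illustration that is not taken from the paper: it assumes $X=\{0,1\}$ with the discrete $\sigma$-algebra (so every measure is perfect), $t=d(0,1)$, and distributions determined by the mass $p$ resp. $q$ they assign to the point 0.

```latex
% Two-point sanity check of the duality (illustrative; X = \{0,1\}, d(0,1) = t).
\mathfrak{m} = (p, 1-p), \qquad \mathfrak{n} = (q, 1-q), \qquad t = d(0,1) \in [0,1].

% Coupling side: a coupling can keep at most \min\{p,q\} + \min\{1-p,1-q\} = 1 - |p-q|
% of its mass on the diagonal, where d vanishes.
\inf_{\mathfrak{c}\in K(\mathfrak{m},\mathfrak{n})}\int d\,\mathrm{d}\mathfrak{c} = t\,|p-q|.

% Dual side: h is short iff |h(0)-h(1)| \le t.
\int h\,\mathrm{d}(\mathfrak{m}-\mathfrak{n}) = (p-q)\,\bigl(h(0)-h(1)\bigr) \le t\,|p-q|,
\quad\text{with equality for } h(0)=t,\ h(1)=0 \text{ if } p\ge q .
```

For $p<q$ the roles are exchanged, i.e. $h(0)=0$ and $h(1)=t$ attains the supremum, so both sides equal $t\,|p-q|$.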

Theorem 28.

Let $\mathcal{L}$ be a language with a coalgebra $\gamma\colon(X,\mathcal{A})\to B_{\mathsf{MDP}}(X,\mathcal{A})$ so that $\llbracket\,\_\,\rrbracket$ is well defined. Assume the following restrictions:

1. every measure $\gamma_{x,a}^{0}$ (for every $x\in X$ , $a\in\Sigma$ ) is perfect,

2. the theory map $\operatorname{qTh}$ is topologisable by a locally compact and second countable topology, and

3. the scalar addition is in the signature of our language $\mathcal{L}$ ,

then $\mathbf{d}_{\mathcal{L}}$ is a fixpoint of the functional $\sigma^{B_{\mathsf{MDP}}}\circ\gamma$ . As a consequence, we have that the language $\mathcal{L}$ is expressive w.r.t. $\operatorname{\mathbf{bd}}$ (i.e., $\operatorname{\mathbf{bd}}\leq\mathbf{d}_{\mathcal{L}}$ ).

Proof.

Let $\mathfrak{m}_{x,a}=\gamma_{x,a}^{0}$ (for each $x\in X,a\in\Sigma$ ), let $r^{a}_{x}=\gamma_{a,x}^{1}$ and $r^{a}_{x,y}=\lvert r^{a}_{x}-r^{a}_{y}\rvert$ . Recall the distance liftings $\hat{\sigma}$ and $\sigma^{\mathsf{MRP}}$ from Subsection 4.1 and Subsection 4.2, respectively. Using order preservation of $\sigma^{B_{\mathsf{MDP}}}\circ\gamma$ and Theorem 13, the claim $\operatorname{\mathbf{bd}}\leq\mathbf{d}_{\mathcal{L}}$ is equivalent to $\forall\varepsilon>0\colon(\sigma^{B_{\mathsf{MDP}}}\circ\gamma)(\mathbf{d}_{\mathcal{L}})(x,y)\leq\mathbf{d}_{\mathcal{L}}(x,y)+\varepsilon$ for all $x,y\in X$ . From the definition of $\sigma^{B_{\mathsf{MDP}}}$ this translates to the condition

\forall\varepsilon>0\colon\forall a\in\Sigma\colon\qquad\sigma^{\mathsf{MRP}}_{X}(\mathbf{d}_{\mathcal{L}})\circ\gamma(x,y)((\mathfrak{m}_{x,a},r^{a}_{x}),(\mathfrak{m}_{y,a},r^{a}_{y}))\leq\mathbf{d}_{\mathcal{L}}(x,y)+\varepsilon.

(13)

To this end, we begin by expanding the left hand side of the above inequality:

	$\displaystyle\sigma^{\mathsf{MRP}}_{X}(\mathbf{d}_{\mathcal{L}})\circ\gamma(x,% y)((\mathfrak{m}_{x,a},r^{a}_{x}),(\mathfrak{m}_{y,a},r^{a}_{y}))$
	$\displaystyle=\inf_{\mathfrak{c}\in K(\mathfrak{m}_{x,a},\mathfrak{m}_{y,a})}% \int\mathbf{d}_{\mathcal{L}}\mathop{}\!\mathrm{d}\mathfrak{c}+_{c}r^{a}_{xy}$	(14)
we push this expression to $[0,1]^{\mathcal{L}}$ by letting $\widetilde{\mathbf{d}_{\mathcal{L}}}\coloneqq\sup_{\varphi\in\operatorname{im}% (\operatorname{qTh})}$ be a distance on $[0,1]^{\mathcal{L}}$
	$\displaystyle=\inf_{\mathfrak{c}\in K(\operatorname{qTh}_{*}(\mathfrak{m}_{x,a}),\operatorname{qTh}_{*}(\mathfrak{m}_{y,a}))}\int\widetilde{\mathbf{d}_{\mathcal{L}}}\mathop{}\!\mathrm{d}\mathfrak{c}+_{c}r^{a}_{xy}$	(15)
As $[0,1]^{\mathcal{L}}$ is second countable, cf. Lemma 19, we can (depending on $\operatorname{qTh}_{*}(\mathfrak{m}_{x,a})$ and $\operatorname{qTh}_{*}(\mathfrak{m}_{y,a})$ ) restrict the integral to $A\times A$ for some standard (and thus analytic) subspace $A\in\mathcal{B}_{[0,1]^{\mathcal{L}}}$ [15, Thm. 6] (using that $\operatorname{qTh}_{*}(\mathfrak{m}_{x,a})$ and $\operatorname{qTh}_{*}(\mathfrak{m}_{y,a})$ are again perfect). (Note that this argument actually requires only second countability of $\mathcal{L}$ as it can be refined using [14, 3.8.D], [3, 2.1.15] and [15, § 8 Remark].) As $\mathbf{d}_{\mathcal{L}}$ is bounded, $\mathbf{d}_{\mathcal{L}}(\,\_\,,x)$ is certainly integrable for any $x\in X$ . Thus we can finally apply the Kantorovich-Rubinstein duality (Lemma 27), using Subsection 5.2 and Item 1
	$\displaystyle=\sup_{h\in\operatorname{\mathsf{Pred}}([0,1]^{\mathcal{L}},\mathcal{B}_{[0,1]^{\mathcal{L}}},\widetilde{\mathbf{d}_{\mathcal{L}}})}\int h\mathop{}\!\mathrm{d}(\operatorname{qTh}_{*}(\mathfrak{m}_{x,a})-\operatorname{qTh}_{*}(\mathfrak{m}_{y,a}))+_{c}r^{a}_{xy}$	(16)
	$\displaystyle\leq\sup_{h\in\operatorname{\mathsf{Pred}}(X,\mathcal{A},\mathbf{d}_{\mathcal{L}})}\int h\mathop{}\!\mathrm{d}(\mathfrak{m}_{x,a}-\mathfrak{m}_{y,a})+_{c}r^{a}_{xy}$	(17)

By applying Corollary 26 we can approximate any short predicate $h$ by the interpretation of formulae in our logic $\llbracket\mathcal{L}\rrbracket$ .

Define $\mathcal{S}\subseteq\mathcal{A}$ to be $\mathcal{S}=\left\{\,U,U^{\complement}\;\middle|\;U\in{\operatorname{qTh}}^{-1}[\mathcal{S}_{\llbracket\,\_\,\rrbracket}]\cup{h}^{-1}[\mathcal{S}_{\Omega}]\,\right\}\text{.}$ Let $\mathcal{T}$ denote the topology generated by $\mathcal{S}$ and $(X^{\prime},\mathcal{T}^{\prime})$ the Kolmogorov quotient, cf. Appendix C, of $(X,\mathcal{T})$ with unit map $\eta\colon(X,\mathcal{T})\to(X^{\prime},\mathcal{T}^{\prime})$ . Both $\mathfrak{m}_{z,a}^{\flat}\coloneqq\operatorname{\mathcal{G}}(\eta)(\mathfrak{m}_{z,a})$ (for $z\in\{x,y\}$ ) are perfect, since push-forwards of perfect measures are perfect [20, 451Ea] and $\mathfrak{m}_{z,a}$ is perfect by Item 1. By [20, 451M] both measures are inner regular with respect to compact sets; thus, for any given $\delta>0$ , we find compact sets $K_{x},K_{y}\subseteq X^{\prime}$ with $\mathfrak{m}_{x,a}^{\flat}((K_{x})^{\complement}),\mathfrak{m}_{y,a}^{\flat}((K_{y})^{\complement})<\delta$ . Thus $K_{x}\cup K_{y}$ is compact and so is $K\coloneqq{\eta}^{-1}(K_{x}\cup K_{y})$ by Appendix C. Moreover we have $\mathfrak{m}_{x,a}(K^{\complement}),\mathfrak{m}_{y,a}(K^{\complement})<\delta$ . As $\mathcal{S}$ is closed under complement, $(X^{\prime},\mathcal{T}^{\prime})$ is $\mathrm{R}_{2}$ .

So finally, the Stone-Weierstraß Theorem, Appendix C, is applicable to $(K,\mathcal{T}|_{K})$ ; thus, every nonexpansive predicate $h$ can be approximated on $K$ by a function from the family $\{\,\llbracket\varphi\rrbracket|_{K}\;|\;\varphi\in\mathcal{L}\,\}$ . Let $\varphi_{h,\delta}\in\mathcal{L}$ denote a witness of a $\delta$ -approximation from $\llbracket\mathcal{L}\rrbracket|_{K}$ of $h|_{K}$ .

Continuing at (17) we obtain (for all $\delta>0$ ):

		$\displaystyle\sigma(\mathbf{d}_{\mathcal{L}})(\mathfrak{m}_{x,a},\mathfrak{m}_% {y,a})+_{c}r^{a}_{x,y}$
		$\displaystyle\leq\biggl{(}\sup_{h\in\operatorname{\mathsf{Pred}}(X,\mathcal{A}% ,\mathbf{d}_{\mathcal{L}})}\int_{K}h\mathop{}\!\mathrm{d}(\mathfrak{m}_{x,a}-% \mathfrak{m}_{y,a})+\int_{K^{\complement}}h\mathop{}\!\mathrm{d}(\mathfrak{m}_% {x,a}-\mathfrak{m}_{y,a})\biggr{)}+_{c}r^{a}_{xy}$
		$\displaystyle\leq\biggl{(}\sup_{h\in\operatorname{\mathsf{Pred}}(X,\mathcal{A}% ,\mathbf{d}_{\mathcal{L}})}\int_{K}\llbracket\varphi_{h,\delta}\rrbracket% \mathop{}\!\mathrm{d}(\mathfrak{m}_{x,a}-\mathfrak{m}_{y,a})+\delta+\delta% \cdot 1\biggr{)}+_{c}r^{a}_{xy}$
		$\displaystyle\leq\biggl{(}\sup_{h\in\operatorname{\mathsf{Pred}}(X,\mathcal{A}% ,\mathbf{d}_{\mathcal{L}})}\int\llbracket\varphi_{h,\delta}\rrbracket\mathop{}% \!\mathrm{d}(\mathfrak{m}_{x,a}-\mathfrak{m}_{y,a})+3\delta\biggr{)}+_{c}r^{a}% _{xy}$
		$\displaystyle\leq\biggl{(}\sup_{\varphi\in\mathcal{L}}\int\llbracket\varphi% \rrbracket\mathop{}\!\mathrm{d}(\mathfrak{m}_{x,a}-\mathfrak{m}_{y,a})+3\delta% \biggr{)}+_{c}r^{a}_{xy}$
		$\displaystyle=\biggl{(}\sup_{\varphi\in\mathcal{L}}\left(\int\llbracket\varphi% \rrbracket\mathop{}\!\mathrm{d}\mathfrak{m}_{x,a}+_{c}r^{a}_{x}\right)-\left(% \int\llbracket\varphi\rrbracket\mathop{}\!\mathrm{d}\mathfrak{m}_{y,a}+_{c}r^{% a}_{y}\right)\biggr{)}+3c\delta$		(18)
		$\displaystyle=\sup_{\varphi\in\mathcal{L}}\llbracket\diamond_{a}\varphi% \rrbracket(x)-\llbracket\diamond_{a}\varphi\rrbracket(y)+3c\delta$
		$\displaystyle\overset{(9)}{\leq}\mathbf{d}_{\mathcal{L}}(x,y)+3c\delta\text{.}$

Choosing $3c\delta<\varepsilon$ , (13) follows, finishing the proof. $\hfill\blacktriangleleft$

It should be noted that the restriction to perfect measures in the above theorem is redundant when the coalgebra map $\gamma\in\textnormal{{Ana}}$ . Moreover, the second restriction from the previous theorem can also be discarded by imposing the following restrictions on the function symbols $(f_{iy})_{i\in\omega,y\in Y_{i}}$ , which belong to the signature of our language $\mathcal{L}$ .

Theorem 29.

Assume that $\mathcal{L}$ given in (4) is such that each index set $Y_{i}$ is endowed with a second countable Hausdorff topology $\mathcal{T}_{i}$ and the interpretations of the function symbols $f_{iy}$ are jointly continuous with respect to $\mathcal{T}_{i}$ . Then $\operatorname{qTh}$ is topologisable by a second countable Hausdorff topology. This topology can be chosen to be locally compact, if each $\mathcal{T}_{i}$ has this property.

Since $[0,1]$ is compact, the assumptions of Theorem 29 are fulfilled in the following case.

Lemma 30.

For a language $\mathcal{L}$ with the following signature in which the set $\Sigma$ of actions is countable, the theory map $\operatorname{qTh}$ is topologisable.

\mathcal{L}\Coloneqq{\wedge}\mid{\neg}\mid\top\mid\,\_\,+r,r\in[0,1]\mid\,\_\,% -r,r\in[0,1]\mid\diamond_{a},a\in\Sigma.

Combining Lemma 30 with Theorems 22 and 28, we get that the modal language $\mathcal{L}$ defined in Lemma 30 is both adequate and expressive for the bisimulation pseudometrics $\operatorname{\mathbf{bd}}$ defined in Subsection 4.2 for MDPs.

Corollary 31.

Let $\mathcal{L}$ be the language as given in Lemma 30 and let $\gamma\colon(X,\mathcal{A})\to B_{\mathsf{MDP}}(X,\mathcal{A})\in\textnormal{{Ana}}$ be an MDP. Then the bisimulation pseudometric (defined in Subsection 4.2) coincides with the logical distance $\mathbf{d}_{\mathcal{L}}$ .

One might hope, following [7], to decompose the semantics of our diamond modality into two modalities: an expectation modality $\diamond_{a}^{\prime}\varphi$ and a 0-ary reward modality $\operatorname{rew}_{a}$ . The semantics of these two modalities given below in $\mathcal{L}^{\prime}$ is taken from [7]. We argue next that this is unfortunately not possible without jeopardizing the adequacy result.

$\blacktriangleright$ Remark 32.

Assume $c\in[0,1]$ and a language $\mathcal{L}^{\prime}$ with signature

\top\mid\neg\,\_\,\mid\,\_\,\land\,\_\,\mid\operatorname{rew}_{a},a\in\Sigma% \mid\diamond^{\prime}_{a},a\in\Sigma\mid r\,\_\,,r\in[0,1]\mid\,\_\,+\,\_\,

(19)

and an interpretation depending on $\gamma\colon(X,\mathcal{A})\to B_{\mathsf{MDP}}(X,\mathcal{A})\in\textnormal{{% Ana}}$ defined by:

\llbracket\diamond^{\prime}_{a}\varphi\rrbracket_{\gamma}(x)=c\int\llbracket\varphi\rrbracket\mathop{}\!\mathrm{d}\gamma_{a,x}^{0}\qquad\text{and}\qquad\llbracket\operatorname{rew}_{a}\rrbracket(x)=\gamma_{a,x}^{1}.

Then the language $\mathcal{L}^{\prime}$ is expressive w.r.t. $\operatorname{\mathbf{bd}}$ (i.e., $\operatorname{\mathbf{bd}}\leq\mathbf{d}_{\mathcal{L}^{\prime}}$ ). However, $\mathcal{L}^{\prime}$ is not adequate since the binary (truncated) addition is not nonexpansive w.r.t. the supremum distance.
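The failure of nonexpansiveness can be checked on a single pair of inputs. The sketch below is an illustration with hypothetical function names; it compares binary truncated addition, which can double the supremum distance of its arguments, with the unary scalar addition $\,\_\,+r$ from (6), which cannot increase distances.

```python
# Illustrative sketch: truncated addition is not nonexpansive w.r.t. the
# supremum distance on [0, 1]^2, whereas scalar addition _ + r is
# nonexpansive on [0, 1].

def truncated_add(x, y):
    return min(1.0, x + y)

def scalar_add(x, r):
    return min(1.0, x + r)

def sup_distance(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

u, v = (0.2, 0.2), (0.4, 0.4)
input_gap = sup_distance(u, v)
output_gap = abs(truncated_add(*u) - truncated_add(*v))
scalar_gap = abs(scalar_add(0.2, 0.5) - scalar_add(0.4, 0.5))
```

Here `output_gap` exceeds `input_gap`, so a formula built with binary addition can separate two states by more than its subformulae do, while `scalar_gap` stays within `input_gap`.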

6 Related work and concluding remarks

6.1 Related work

Our work is inspired by [18] and establishes a quantitative version of the Hennessy-Milner theorem for the bisimulation pseudometric defined therein. To the best of our knowledge, such a generalisation is novel and has not been studied elsewhere in the literature. The key technical differences between the two works are as follows. First, our notion of conformance on continuous state MDPs is based on universal measurability; whilst, it is based on lower semi-continuity in [18]. Note that every lower semi-continuous function is universally measurable. Second, our MDPs are coalgebras living in Ana and the state space of an MDP is thus an analytic space in our paper; whilst, it is a Polish space in [18]. Third, the bisimulation pseudometric $\operatorname{\mathbf{bd}}$ defined in this paper is based on Wasserstein lifting; whilst, the bisimulation pseudometric (denoted $\operatorname{\mathbf{bd}}_{FPP}$ ) of Ferns et al. is based on Kantorovich lifting. Note that the pseudometrics $\operatorname{\mathbf{bd}}$ and $\operatorname{\mathbf{bd}}_{FPP}$ are equivalent due to the Kantorovich-Rubinstein duality (Lemma 27). Finally, we employ Kleene’s fixpoint theorem to define $\operatorname{\mathbf{bd}}$ , whilst Ferns et al. employed the Banach fixed point theorem to define their bisimulation pseudometric $\operatorname{\mathbf{bd}}_{FPP}$ .

The recent work of Chen et al. [7] on continuous time Markov processes (i.e. a family of $B_{\mathsf{MRP}}$ -coalgebras indexed by non-negative real numbers), in which a quantitative version of the Hennessy-Milner theorem is also proven, is insightful as well. The mathematical development followed in [7] is worth comparing, especially when this family is restricted to a singleton coalgebra (say, for instance, $\gamma\colon(X,\mathcal{A})\to B_{\mathsf{MRP}}(X,\mathcal{A})$ ) and the $\sigma$ -algebra $\mathcal{A}$ is generated by a Polish topology on $X$ . The functional $\mathcal{F}_{c}$ (for some $0<c<1$ ) defined in [7] is not an endofunction in general on the lattice of lower semi-continuous functions on $(X,\mathcal{A})$ . Using the notation of this paper, $\mathcal{F}_{c}$ can be rewritten as: $\mathcal{F}_{c}(d)(x,y)=c\cdot\hat{\sigma}(d)(\gamma^{0}(x),\gamma^{0}(y))$ for every $x,y\in X$ .

Nevertheless, to capture their bisimulation pseudometric (denoted $\operatorname{\mathbf{bd}}_{CCP}$ ) by a fixpoint argument, the authors had to work with continuous distance functions on $X$ . The usual Knaster-Tarski fixpoint theorem is inapplicable, and the authors constructed $\operatorname{\mathbf{bd}}_{CCP}$ as the limit of the following pseudometrics $\delta_{i}$ : $\delta_{0}=(\gamma^{1}\times\gamma^{1})^{*}(d_{E})$ ; $\delta_{i+1}=\mathcal{F}_{c}(\delta_{i}).$ As a result, the two bisimulation pseudometrics $\operatorname{\mathbf{bd}}_{CCP}$ and $\operatorname{\mathbf{bd}}$ (Subsection 4.2) are different.

The recent works [1, 30, 29, 19] on developing expressive modal logics for behavioural conformances that are defined by codensity lifting (called Kantorovich lifting in [19]) can, unfortunately, not be directly applied to the current setting. This is due to the underlying assumption that behavioural conformances are defined internally in a complete lattice fibration (or equivalently using the language of topological functors [19]). We therefore adopted a coupling-based lifting approach (inspired by [4]) to define our bisimulation pseudometric. This adoption required significant effort in recasting old results from measure theory in our framework, as outlined in Assumptions A 1-A 4.

6.2 Concluding remarks

To summarise, we model both MRPs and MDPs with continuous state spaces as coalgebras in Ana and define the notion of bisimulation pseudometric using the well-known Kleene fixpoint theorem. The latter was based on a given coalgebra $\gamma\colon(X,\mathcal{A})\to\operatorname{\mathcal{G}}(X,\mathcal{A})\in\textnormal{{Ana}}$ and the fact that the functional $\gamma\circ\hat{\sigma}\colon\operatorname{\mathsf{Pred}}(X,\mathcal{A})\to\operatorname{\mathsf{Pred}}(X,\mathcal{A})$ is $\omega$ -cpo-continuous (Theorem 13), whose proof was in turn based on classical results from functional analysis. In addition, we presented a “quantitative” modal logic $\mathcal{L}$ whose formulae are interpreted as universally measurable predicates over the state space of an MDP, and showed that the logical distance $\mathbf{d}_{\mathcal{L}}$ generated by $\mathcal{L}$ coincides with the bisimulation distance $\operatorname{\mathbf{bd}}$ . Proving the expressivity result (Theorem 28) is certainly more involved than proving the adequacy result (Theorem 22); nonetheless, both require that the theory map $\operatorname{qTh}$ (8) is topologisable.

For future work, the fact that the key assumption in the expressivity result was a topological structure on the formulas, instead of any requirement on the statement, may stimulate new perspectives. A more concrete worthwhile enterprise would be to generalise the Stone-Weierstraß theorem to measurable spaces. This would help in directly invoking the argument to approximate a nonexpansive map $h$ by logical formulae in the proof of Theorem 28, thus avoiding the topological arguments used here.

References

[1] Harsh Beohar, Sebastian Gurke, Barbara König, Karla Messing, Jonas Forster, Lutz Schröder, and Paul Wild. Expressive quantale-valued logics for coalgebras: An adjunction-based approach. In Olaf Beyersdorff, Mamadou Moustapha Kanté, Orna Kupferman, and Daniel Lokshtanov, editors, 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024), volume 289, pages 10:1–10:19, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.STACS.2024.10.
[2] Harsh Beohar, Clemens Kupke, and Daniel Luckhardt. Hennessy-Milner type theorem for measurable pseudometrics, 2025. arXiv:2505.23635 [cs.LO].
[3] Vladimir I. Bogachev. Measures on topological spaces. Journal of Mathematical Sciences, 91(4):3033–3156, 1998.
[4] Filippo Bonchi, Barbara König, and Daniela Petrisan. Up-to techniques for behavioural metrics via fibrations. In Sven Schewe and Lijun Zhang, editors, 29th International Conference on Concurrency Theory (CONCUR 2018), volume 118 of Leibniz International Proceedings in Informatics (LIPIcs), pages 17:1–17:17, Dagstuhl, Germany, 2018. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CONCUR.2018.17.
[5] Craig Boutilier, Thomas Dean, and Steve Hanks. Decision-theoretic planning: structural assumptions and computational leverage. J. Artif. Int. Res., 11(1):1–94, July 1999. doi:10.1613/JAIR.575.
[6] D. W. Bressler and M. Sion. The current theory of analytic sets. Canadian Journal of Mathematics, 16:207–230, 1964. doi:10.4153/CJM-1964-021-7.
[7] Linan Chen, Florence Clerc, and Prakash Panangaden. A behavioural pseudometrics for continuous-time Markov processes. In Foundations of Software Science and Computation Structures, Lecture Notes in Computer Science, 2025. Accepted; arXiv:2501.13008 [cs.LO]; Event: FoSSaCS, Hamilton, Ontario, Canada, May 5–8, 2025. doi:10.48550/arXiv.2501.13008.
[8] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for labeled Markov systems. In Jos C. M. Baeten and Sjouke Mauw, editors, CONCUR’99 Concurrency Theory, pages 258–273, Berlin, Heidelberg, 1999. Springer. doi:10.1007/3-540-48320-9_19.
[9] Josée Desharnais, Abbas Edalat, and Prakash Panangaden. Bisimulation for labelled Markov processes. Information and Computation, 179(2):163–193, 2002. doi:10.1006/inco.2001.2962.
[10] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for labelled Markov processes. Theoretical Computer Science, 318(3):323–354, 2004. doi:10.1016/j.tcs.2003.09.013.
[11] Ernst-Erich Doberkat. Measures and all that — a tutorial, 2014. arXiv:1409.2662 [math.FA], Version 3.
[12] R. M. Dudley. Real Analysis and Probability. Number 74 in Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2002.
[13] Nelson Dunford and Jacob Schwartz. Linear Operators, volume 3 of Pure and Applied Mathematics. Wiley-InterScience, 1958/1971.
[14] Ryszard Engelking. General topology, volume 6 of Sigma series in pure mathematics. Heldermann Verlag, revised and completed edition, 1989.
[15] Arnold M. Faden. The existence of regular conditional probabilities: Necessary and sufficient conditions. The Annals of Probability, 13(1):288–298, 1985.
[16] Neil Falkner. Generalizations of analytic and standard measurable spaces. Mathematica Scandinavica, pages 283–301, 1981.
[17] Norm Ferns, Prakash Panangaden, and Doina Precup. Metrics for finite Markov decision processes. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI 2004), pages 162–169, 2004. arXiv:1207.4114 [cs.AI]. URL: https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&smnu=2&article_id=1103&proceeding_id=20.
[18] Norm Ferns, Prakash Panangaden, and Doina Precup. Bisimulation metrics for continuous Markov decision processes. SIAM Journal on Computing, 40(6):1662–1714, 2011. doi:10.1137/10080484X.
[19] Jonas Forster, Sergey Goncharov, Dirk Hofmann, Pedro Nora, Lutz Schröder, and Paul Wild. Quantitative Hennessy-Milner theorems via notions of density. In Bartek Klin and Elaine Pimentel, editors, 31st EACSL Annual Conference on Computer Science Logic (CSL 2023), volume 252, pages 22:1–22:20, Dagstuhl, Germany, 2023. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CSL.2023.22.
[20] D.H. Fremlin. Measure Theory, volume 5. Torres Fremlin, 2000/2008.
[21] Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, and Marc G. Bellemare. DeepMDP: Learning continuous latent space models for representation learning. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2170–2179. PMLR, June 2019. URL: https://proceedings.mlr.press/v97/gelada19a.html.
[22] Michèle Giry. A categorical approach to probability. In B. Banaschewski, editor, Categorical Aspects of Topology and Analysis, volume 915 of Lecture Notes in Mathematics, pages 68–85. Springer, 1982.
[23] Robert Givan, Thomas Dean, and Matthew Greig. Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence, 147(1):163–223, 2003. Planning with Uncertainty and Incomplete Information. doi:10.1016/S0004-3702(02)00376-4.
[24] Boris Vladimirovich Gnedenko and Andrey Nikolaevich Kolmogoroff. Predelnye raspredeleniya dlya summ nezavisimykh sluchaynykh velichin. GITTL, 1949.
[25] Matthew Hennessy and Robin Milner. On observing nondeterminism and concurrency. In Jaco de Bakker and Jan van Leeuwen, editors, Automata, Languages and Programming, pages 299–309, Berlin, Heidelberg, 1980. Springer Berlin Heidelberg. doi:10.1007/3-540-10003-2_79.
[26] B. P. F. Jacobs. Categorical Logic and Type Theory. Number 141 in Studies in Logic and the Foundations of Mathematics. North Holland, Amsterdam, 1999.
[27] Alexander Kechris. Classical descriptive set theory, volume 156 of Graduate Texts in Mathematics. Springer Science & Business Media, 2012.
[28] Henning Kerstan. Coalgebraic Behavior Analysis: From Qualitative To Quantitative Analyses. PhD thesis, Universität Duisburg-Essen, May 2016. Submitted on 2016-05-09. URL: https://duepublico2.uni-due.de/receive/duepublico_mods_00041220.
[29] Yuichi Komorida, Shin-ya Katsumata, Clemens Kupke, Jurriaan Rot, and Ichiro Hasuo. Expressivity of quantitative modal logics: Categorical foundations via codensity and approximation. In Proceedings of the Thirty Sixth Annual IEEE Symposium on Logic in Computer Science (LICS 2021), pages 1–14, Rome, Italy, June 2021. IEEE Computer Society Press. doi:10.1109/LICS52264.2021.9470656.
[30] Clemens Kupke and Jurriaan Rot. Expressive Logics for Coinductive Predicates. In Maribel Fernández and Anca Muscholl, editors, 28th EACSL Annual Conference on Computer Science Logic (CSL 2020), volume 152 of Leibniz International Proceedings in Informatics (LIPIcs), pages 26:1–26:18, Dagstuhl, Germany, 2020. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CSL.2020.26.
[31] Jan K Pachl. Two classes of measures. Colloquium Mathematicum, 42(1):331–340, 1979. Erratum in Vol. 45.2 (1981), pp. 331–333.
[32] D. Ramachandran and L. Rüschendorf. On the Monge–Kantorovich duality theorem. Teoriya veroyatnostey i ee primeneniya, 45(2), 2000.
[33] Doraiswamy Ramachandran and Ludger Rüschendorf. A general duality theorem for marginal problems. Probability Theory and Related Fields, 101:311–319, 1995.
[34] Czesław Ryll-Nardzewski. On quasi-compact measures. Fundamenta Mathematicae, 40:125–130, 1953.
[35] Stephen Simons. Minimax Theorems and Their Proofs, chapter 1, pages 1–23. Number 5 in Nonconvex Optimization and Its Applications. Springer, 1995. doi:10.1007/978-1-4613-3557-3.
[36] M. Ya. Suslin. Sur une définition des ensembles mesurables B sans nombres transfinis. Comptes Rendus Mathématique. Académie des Sciences. Paris., 164(2):88–91, 1917.
[37] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. A Bradford Book, Cambridge, MA, USA, 2018.
[38] Franck van Breugel and James Worrell. An algorithm for quantitative verification of probabilistic transition systems. In Kim G. Larsen and Mogens Nielsen, editors, CONCUR 2001 — Concurrency Theory, pages 336–350, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg. doi:10.1007/3-540-44685-0_23.
[39] Franck van Breugel and James Worrell. Towards quantitative verification of probabilistic transition systems. In Fernando Orejas, Paul G. Spirakis, and Jan van Leeuwen, editors, Automata, Languages and Programming, pages 421–432, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg. doi:10.1007/3-540-48224-5_35.
[40] Eric K. van Douwen. The integers and topology. In Handbook of set-theoretic topology, pages 111–167. Elsevier, 1984.
[41] Ignacio D. Viglizzo. Final sequences and final coalgebras for measurable spaces. In Algebra and Coalgebra in Computer Science, pages 395–407, Berlin, Heidelberg, 2005. Springer. doi:10.1007/11548133_25.
[42] Itaï Ben Yaacov, Alexander Berenstein, C. Ward Henson, and Alexander Usvyatsov. Model theory for metric structures. In Zoé Chatzidakis, Dugald Macpherson, Anand Pillay, and Alex Wilkie, editors, Model Theory with Applications to Algebra and Analysis, volume 350 of London Mathematical Society Lecture Note Series, pages 315–427. Cambridge University Press, Cambridge, 2008.
[43] Amy Zhang, Rowan Thomas McAllister, Roberto Calandra, Yarin Gal, and Sergey Levine. Learning invariant representations for reinforcement learning without reconstruction. In International Conference on Learning Representations, 2021. URL: https://openreview.net/forum?id=-2FCwDKRREu.

Appendix A Notations and background

The power set of a set $X$ is denoted by $\mathcal{P}(X)$ . For a function $f\colon X\to{Y}$ we denote the direct image of a subset $U\subseteq X$ under $f$ by ${f}_{*}(U)$ or $f[U]$ , and the inverse image of a subset $V\subseteq{Y}$ by ${f}^{-1}(V)$ .

A.1 Perfect measures

Perfect measures were introduced by Kolmogorov [24, 22–23]. The aim was to provide a convenient subclass of measures general enough for all applications. There are many different equivalent definitions. We choose the following one: A measure space $(X,\mathcal{A},\mathfrak{m})$ is called perfect, if for any separable metrisable space $({Y},\mathcal{T})$ and every measurable map $f\colon X\to{Y}$ we have the following property: For every $A\in\mathcal{A}$ and $r<\mathfrak{m}(A)$ there is a compact set $K\subseteq\operatorname{im}f$ with $\mathfrak{m}(A\cap{f}^{-1}K)\geq r$ , cf. [20, 451O(a)]. Also $\mathfrak{m}$ is called perfect in this case. As a direct consequence of the definition, perfectness is functorial, i.e. the push-forward of a perfect measure is perfect.

In typical real-world applications two points of a measurable space should only be distinguished by the $\sigma$ -algebra if they can be distinguished by an observation. As only a finite number of observations with limited precision can be made per unit of time, there should be a countable subset $\mathcal{S}\subseteq\mathcal{A}$ distinguishing as strongly as $\mathcal{A}$ does ( $\forall x,y\in X\forall A\in\mathcal{A}\colon x\in A\wedge y\notin A\implies(\exists S\in\mathcal{S}\colon\#(S\cap\{x,y\})=1)$ ). A measurable space enjoying this property is called countably fibered. Perfect countably fibered probability spaces are actually – despite being a very general notion, including, e.g. analytic spaces – quite close to standard spaces [15, § 8 Rem.]: Any such space is almost pre-standard with respect to some sub- $\sigma$ -algebra $\mathcal{A}^{\prime}\subseteq\mathcal{A}$ . Almost pre-standard means that a space is standard when restricted to a Borel set of full measure and after identifying all points not distinguished by $\mathcal{A}$ [15]. For countably generated spaces perfectness can even be characterised equivalently by being almost pre-standard [15, Thm. 6].

A.2 Suslin operation and smooth spaces

For a function $f_{(\,\_\,)}\colon\omega\to X$ let $f_{(\,\_\,)}\|_{\leq i}$ denote the restriction to the first $i$ indices. Further let $\omega^{<\omega}$ denote the set of all finite sequences in $\omega$ . The Suslin operation, cf. [27, 25.4] or [20, 421B], is denoted by $\operatorname{\mathfrak{S}}$ ; we recall that for a paving $\mathcal{P}$ it is defined by
	$\operatorname{\mathfrak{S}}\mathcal{P}=\Bigl\{\,\bigcup_{n_{(\,\_\,)}\in\omega^{\omega}}\bigcap_{i\in\omega}A_{n_{(\,\_\,)}\|_{\leq i}}\;\Bigm|\;A_{(\,\_\,)}\colon\omega^{<\omega}\to\mathcal{P}\,\Bigr\}$		(20a)
where $A_{(\,\_\,)}$ ranges over all Suslin schemes in $\mathcal{P}$ . Further, we recall its elementary properties: for any paving $\mathcal{P}$ and countable family $\mathcal{S}\subseteq\operatorname{\mathfrak{S}}\mathcal{P}$

	$\displaystyle\bigcup\mathcal{S}$	$\displaystyle\in\operatorname{\mathfrak{S}}\mathcal{P}$	(20b)
	$\displaystyle\bigcap\mathcal{S}$	$\displaystyle\in\operatorname{\mathfrak{S}}\mathcal{P}$	(20c)
[20, 421E], also for any function $f\colon X\to{Y}$ and paving $\mathcal{Q}$ on ${Y}$ that
	${f}^{-1}[\operatorname{\mathfrak{S}}\mathcal{Q}]=\operatorname{\mathfrak{S}}({f}^{-1}[\mathcal{Q}])$		(20d)
[20, 421Cc], as well as monotonicity [20, 421Ca] and idempotence [20, 421D], that for any paving $\mathcal{P}$

	$\displaystyle\mathcal{P}$	$\displaystyle\subseteq\operatorname{\mathfrak{S}}\mathcal{P}$	(20e)
	$\displaystyle\operatorname{\mathfrak{S}}(\operatorname{\mathfrak{S}}\mathcal{P})$	$\displaystyle=\operatorname{\mathfrak{S}}\mathcal{P}\text{.}$	(20f)
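Property (20e), and the fact that countable intersections of members of $\mathcal{P}$ lie in $\operatorname{\mathfrak{S}}\mathcal{P}$ (a special case of (20c) combined with (20e)), can be seen by writing down suitable Suslin schemes. The following worked example is only a sketch: the schemes $A$, $A^{\prime}$ and the sets $P,P_{i}\in\mathcal{P}$ are introduced here for illustration.

```latex
\begin{align*}
% Constant scheme: A_s = P for every finite sequence s.
\bigcup_{n_{(\,\_\,)}\in\omega^{\omega}}\bigcap_{i\in\omega}A_{n_{(\,\_\,)}\|_{\leq i}}
  &= \bigcup_{n_{(\,\_\,)}\in\omega^{\omega}}\bigcap_{i\in\omega}P
   = P,
  &&\text{hence } P\in\operatorname{\mathfrak{S}}\mathcal{P}, \\
% Scheme depending only on the length of the index: A'_{n|_{\leq i}} = P_i.
\bigcup_{n_{(\,\_\,)}\in\omega^{\omega}}\bigcap_{i\in\omega}A^{\prime}_{n_{(\,\_\,)}\|_{\leq i}}
  &= \bigcup_{n_{(\,\_\,)}\in\omega^{\omega}}\bigcap_{i\in\omega}P_{i}
   = \bigcap_{i\in\omega}P_{i},
  &&\text{hence } \bigcap_{i\in\omega}P_{i}\in\operatorname{\mathfrak{S}}\mathcal{P}.
\end{align*}
```

In both cases the inner intersection does not depend on the branch $n_{(\,\_\,)}$, so the outer union is trivial.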

Denote the Giry monad by $\operatorname{\mathcal{G}}(X,\mathcal{A})=(\mathrm{M}_{(X,\mathcal{A})},\mathrm{A}_{(X,\mathcal{A})})$ .

We also recall a generalisation of analytic spaces: smooth spaces as introduced by Falkner [16]. It can be defined as follows [16, 1.3]:

Definition 33.

A measurable space $(X,\mathcal{A})$ is called smooth if for any measurable space $({Y},\mathcal{B})$ and any $A\in\operatorname{\mathfrak{S}}(\mathcal{B}\mathbin{\otimes}\mathcal{A})$ the projection on the first component $\operatorname{pr}_{{Y}}A$ is in $\operatorname{\mathfrak{S}}{\mathcal{B}}$ .

Note the following fact [16, 1.3]:

Lemma 34.

Every analytic space is smooth.

Appendix B Proof of Theorem 13

The proof of Theorem 13 is based on Sion’s minimax theorem [35, Thm. 3], which we recall first.

Lemma 35.

Let $U$ be a convex subset of a linear topological space, $V$ a compact convex subset of a linear topological space, and $f\colon U\times V\to\mathbb{R}$ be upper semicontinuous on $U$ and lower semicontinuous on $V$ . Suppose that

	$\displaystyle\forall y\in V,\lambda\in\mathbb{R}\colon$	$\displaystyle\{\,x\in U\;|\;f(x,y)\geq\lambda\,\}$	is convex and	(21a)
	$\displaystyle\forall x\in U,\lambda\in\mathbb{R}\colon$	$\displaystyle\{\,y\in V\;|\;f(x,y)\leq\lambda\,\}$	is convex.	(21b)

Then $\displaystyle\adjustlimits{\inf}_{y\in V}{\sup}_{x\in U}f(x,y)=\adjustlimits{\sup}_{x\in U}{\inf}_{y\in V}f(x,y)$ .
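The role of the convexity conditions (21a) and (21b) can be illustrated numerically. The sketch below evaluates both sides of the minimax identity on finite rational grids; this only approximates infima and suprema in general, but for the two hypothetical functions chosen here the extrema are attained at grid points. The helper names are illustrative.

```python
from fractions import Fraction


def grid(lo: int, hi: int, steps: int):
    """Evenly spaced exact rational grid on [lo, hi] with steps + 1 points."""
    return [Fraction(lo) + Fraction(hi - lo) * k / steps for k in range(steps + 1)]


def inf_sup(f, xs, ys):
    """min over y of max over x of f(x, y), evaluated on finite grids."""
    return min(max(f(x, y) for x in xs) for y in ys)


def sup_inf(f, xs, ys):
    """max over x of min over y of f(x, y), evaluated on finite grids."""
    return max(min(f(x, y) for y in ys) for x in xs)


# Bilinear f: level sets in each variable are intervals, so (21a) and (21b) hold.
bilinear = lambda x, y: x * y
xs = ys = grid(-1, 1, 8)
print(inf_sup(bilinear, xs, ys), sup_inf(bilinear, xs, ys))  # both sides agree

# f(x, y) = (x - y)^2 is convex in x, so the sets in (21a) need not be convex.
square = lambda x, y: (x - y) ** 2
us = vs = grid(0, 1, 8)
print(inf_sup(square, us, vs), sup_inf(square, us, vs))      # the two sides differ
```

In the second example all other hypotheses of the lemma hold (both domains are compact convex intervals and the function is continuous), so the gap between the two sides is caused by the failure of (21a) alone.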

In addition, we need some important results from functional analysis; in particular, a non-topological version of the Riesz–Markov–Kakutani representation theorem [13, IV.5.1] and the Banach-Alaoglu theorem [13, V.4.2].

The vector space $\operatorname{B}(X)$ of bounded functions $f\colon X\to\mathbb{R}$ is endowed with the uniform norm given by

\lVert f\rVert_{\textnormal{u}}=\sup\{\,\lvert f(x)\rvert\;|\;x\in X\,\}\text{.}

It is well-known that this space $\operatorname{B}(X)$ is complete [13, IV.5].

Let $\operatorname{B}(X,\mathcal{A})$ denote the Banach space [13, IV.5] consisting of limits of simple functions on $(X,\mathcal{A})$ with respect to the uniform norm $\lVert\,\_\,\rVert_{\textnormal{u}}$ . It is also functorial. Moreover, we recall that $B^{*}$ denotes the dual of a Banach space $B$ , i.e. all bounded linear functionals $B\to\mathbb{R}$ . This dual can be either endowed with the uniform norm obtaining a Banach space or with the weak-* topology. The assignment $(\,\_\,)^{*}$ is functorial with respect to both choices. We write the dual vector space of $\operatorname{B}(X,\mathcal{A})$ simply as $\operatorname{B}^{*}(X,\mathcal{A})$ .

Recall that a charge is the same thing as a measure but only finitely additive. Denote by $\operatorname{Charge}(X,\mathcal{A})$ the set of all charges on $(X,\mathcal{A})$ . We may also view $\operatorname{Charge}$ as a functor $\textnormal{{Meas}}\to\textnormal{{Meas}}$ by endowing $\operatorname{Charge}(X,\mathcal{A})$ with the same $\sigma$ -algebra as $\operatorname{\mathcal{G}}(X,\mathcal{A})$ , i.e. the one making all $\operatorname{ev}_{A}$ with $A\in\mathcal{A}$ measurable. There is the following classical duality between charges and positive functionals given by the isometric (with respect to the uniform norm) isomorphism [13, IV.5.1]

\displaystyle\operatorname{Int}\colon\operatorname{Charge}(X,\mathcal{A})\to\operatorname{B}^{*}(X,\mathcal{A})\text{,}\quad\mathfrak{m}\mapsto\uplambda\;f.\,\int f\mathop{}\!\mathrm{d}\mathfrak{m}\text{.}\tag{22}

The above duality given by $\operatorname{Int}$ can be viewed as a non-topological version of the Riesz-Markov-Kakutani representation theorem. Note that probability charges $\mathfrak{m}$ are characterised by being positive, i.e. $\int f\mathop{}\!\mathrm{d}\mathfrak{m}=\operatorname{Int}(\mathfrak{m})(f)\geq 0$ for any $f\in\operatorname{B}(X,\mathcal{A})$ with $f\geq 0$ , and normed, i.e. $\int 1\mathop{}\!\mathrm{d}\mathfrak{m}=\operatorname{Int}(\mathfrak{m})(1)=1$ .
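On a finite set the pairing behind $\operatorname{Int}$ reduces to a weighted sum, which makes the two conditions "positive" and "normed" easy to test. The sketch below is purely illustrative: the point masses are hypothetical, and on a finite set finite and countable additivity coincide, so the difference between charges and measures is not visible here.

```python
from fractions import Fraction


def integrate(charge: dict, f) -> Fraction:
    """Int(charge)(f): integral of f against a charge on a finite set,
    where the charge is given by its point masses."""
    return sum((f(x) * mass for x, mass in charge.items()), Fraction(0))


def is_probability_charge(charge: dict) -> bool:
    """On a finite set the functional is positive iff all point masses are
    nonnegative (test against indicator functions), and normed iff the
    integral of the constant function 1 equals 1."""
    positive = all(mass >= 0 for mass in charge.values())
    normed = integrate(charge, lambda x: 1) == 1
    return positive and normed


m = {"s": Fraction(1, 4), "t": Fraction(3, 4)}        # hypothetical point masses
signed = {"s": Fraction(3, 2), "t": Fraction(-1, 2)}  # normed but not positive

print(integrate(m, lambda x: 2 if x == "s" else 0))   # integral of 2 * 1_{s}
print(is_probability_charge(m), is_probability_charge(signed))
```

The second example shows that the normalisation condition alone does not single out probability charges; positivity has to be checked separately.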

We are now ready to prove Theorem 13. Actually, we prove the following slight generalisation (recall that every probability measure on an analytic space is perfect).

Theorem 36.

Let $\gamma\colon(X,\mathcal{A})\to\operatorname{\mathcal{G}}(X,\mathcal{A})\in\textnormal{{Meas}}$ be such that $\gamma(x)$ is a perfect measure (for each $x\in X$ ). Then $\hat{\sigma}$ is $\omega$ -cpo-continuous with respect to $\leq$ .

Proof.

Assume a $\leq$ -increasing sequence $(d_{i})_{i\in\mathbb{N}}$ of pseudometrics over $(X,\mathcal{A})$ . We need to show the following equation for every $x,y\in X$ :

\inf_{\mathfrak{c}\in K(\gamma_{x},\gamma_{y})}\int\sup_{i\in\mathbb{N}}d_{i}\mathop{}\!\mathrm{d}\mathfrak{c}=\adjustlimits{\sup}_{i\in\mathbb{N}}{\inf}_{\mathfrak{c}\in K(\gamma_{x},\gamma_{y})}\int d_{i}\mathop{}\!\mathrm{d}\mathfrak{c}\text{.}\tag{23}

Let $x,y\in X$ and let $\mathfrak{m}_{1}=\gamma_{x}$ and $\mathfrak{m}_{2}=\gamma_{y}$ . Recall that a charge is the same thing as a measure but only finitely additive, and note that

	$\displaystyle\{\,\mathfrak{c}\text{ measure on }\mathcal{A}\mathbin{\otimes}\mathcal{A}\;|\;\operatorname{\mathcal{G}}(\operatorname{pr}_{1})\mathfrak{c}=\mathfrak{m}_{1},\operatorname{\mathcal{G}}(\operatorname{pr}_{2})\mathfrak{c}=\mathfrak{m}_{2}\,\}$
	$\displaystyle=\left\{\,\mathfrak{c}\text{ charge on }\mathcal{A}\mathbin{\otimes}\mathcal{A}\;\middle|\;\operatorname{\mathcal{G}}(\operatorname{pr}_{1})\mathfrak{c}=\mathfrak{m}_{1},\operatorname{\mathcal{G}}(\operatorname{pr}_{2})\mathfrak{c}=\mathfrak{m}_{2}\,\right\}$

as $\mathfrak{m}_{1},\mathfrak{m}_{2}$ are perfect [32, D5] (see also [31, Prop. 3], original result [34, Thm. VIII]). We now apply the duality (22), suppressing $\mathcal{A}$ below for readability (i.e. writing $X\times X$ as a shorthand for the measurable space $(X,\mathcal{A})\times(X,\mathcal{A})$ ), and continue the chain:

\displaystyle\mathrel{\mathop{\cong}\limits^{\operatorname{Int}}}\left\{\,c\in\operatorname{B}^{*}(X\times X)\;\middle|\;c\text{ is positive},\ c(1)=1\text{ and }(\operatorname{B}^{*}\operatorname{pr}_{i})c=\operatorname{Int}(\mathfrak{m}_{i})\text{ for }i=1,2\,\right\}

\displaystyle=\bigcap\left\{\,{\operatorname{ev}_{f}}^{-1}([0,\infty))\;\middle|\;f\in\operatorname{B}\bigl(X\times X\bigr),f\geq 0\,\right\}\cap{\operatorname{ev}_{1}}^{-1}(\{1\})\cap\bigcap\nolimits_{i=1,2}{(\operatorname{B}^{*}\operatorname{pr}_{i})}^{-1}(\{\operatorname{Int}\mathfrak{m}_{i}\})\tag{24}

\displaystyle\eqqcolon V_{xx^{\prime}}\text{,}

where $\operatorname{ev}_{f}\colon c\mapsto c(f)$ is the evaluation function.

From the last representation it is apparent that $V_{xx^{\prime}}$ is an intersection of weak-*-closed subsets of $\operatorname{B}^{*}(X\times X)$ , as $\operatorname{B}(\operatorname{pr}_{1}),\operatorname{B}(\operatorname{pr}_{2})\colon\operatorname{B}X\to\operatorname{B}(X\times X)$ are continuous by functoriality. Hence $V_{xx^{\prime}}$ is closed. Moreover, the set $\{\,\phi\in\operatorname{B}^{*}(X\times X)\;|\;\phi(f)\in[-1,1]\text{ for any }f\text{ with }\lVert f\rVert_{\textnormal{u}}\leq 1\,\}$ is compact by the Banach-Alaoglu theorem [13, V.4.2]. This set also contains any positive $c\in\operatorname{B}^{*}(X\times X)$ with $c(1)=1$ : for any $f$ with $\lVert f\rVert_{\textnormal{u}}\leq 1$ note that $\lvert c(f)\rvert\mathrel{\mathop{\leq}\limits_{c\text{ linear}}}\lvert c(\lvert f\rvert)\rvert\mathrel{\mathop{=}\limits_{c\text{ positive}}}\int\lvert f\rvert\mathop{}\!\mathrm{d}c\leq\int 1\mathop{}\!\mathrm{d}c=1$ , where the last inequality uses positivity of $c$ and [13, III.2.22, p. 119; III.1.5]. Thus $V_{xx^{\prime}}$ , being a closed subset of a compact set, is compact.

Set $U=[0,\infty)$ and, for $x\in U$ (here $x$ denotes a point of $U$ , not a state), define $\tilde{d}_{x}\colon X\times X\to[0,1]$ by $\tilde{d}_{x}=d_{i}$ if $x\in[i,i+1)$ . Note that $\tilde{d}_{x}$ is increasing in $x$ . Further define $f\colon U\times V_{xx^{\prime}}\to[0,1]\in\textnormal{{Set}}$ by $f(x,c)\coloneqq c(\tilde{d}_{x})=\int\tilde{d}_{x}\mathop{}\!\mathrm{d}c$ .

For upper semi-continuity of $f$ in its first argument fix a $c\in V_{xx^{\prime}}$ and take any $r\in\mathbb{R}$ . Observe that ${f(\,\_\,,c)}^{-1}([0,r))=\{\,x\in U\;|\;\exists i\in\omega\colon x<i+1\wedge c(d_{i})<r\,\}$ is open. As $r$ was chosen arbitrarily, upper semi-continuity is proven. For lower semi-continuity of $f$ in its second argument fix an $x\in U$ and take again any $r\in[0,\infty)$ . Let $i$ be the element of $\omega$ with $x\in[i,i+1)$ . Observe that ${f(x,\,\_\,)}^{-1}((r,\infty))={f(i,\,\_\,)}^{-1}((r,\infty))=\{\,c\in V_{xx^{\prime}}\;|\;c(d_{i})>r\,\}={\operatorname{ev}_{d_{i}}}^{-1}((r,\infty))$ is open by definition of the weak-*-topology. Since $x$ and $r$ were chosen arbitrarily, lower semi-continuity is proven.

For each $c$ and $\lambda\in\mathbb{R}$ the level set $\{\,x\in U\;|\;f(x,c)\geq\lambda\,\}$ is of the form $[i,\infty)\subseteq\mathbb{R}$ and thus obviously convex. Convexity of $V_{xx^{\prime}}$ is also quickly confirmed by noting that every operand in (24) is closed under convex combinations. For each $x\in U$ and $\lambda\in\mathbb{R}$ the level set $\{\,c\in\operatorname{B}^{*}(X\times X)\;|\;c(\tilde{d}_{x})\leq\lambda\,\}$ is convex by linearity of the $c$ ’s. As the intersection of convex sets is convex, the level set $\{\,c\in V_{xx^{\prime}}\;|\;c(\tilde{d}_{x})\leq\lambda\,\}$ is convex.

Then by Lemma 35, $\inf_{c\in V_{xx^{\prime}}}\sup_{x\in[0,\infty)}c(\tilde{d}_{x})=\sup_{x\in[0,\infty)}\inf_{c\in V_{xx^{\prime}}}c(\tilde{d}_{x})$ . By definition of $\tilde{d}$ this becomes $\inf_{c\in V_{xx^{\prime}}}\sup_{i\in\mathbb{N}}c(d_{i})=\sup_{i\in\mathbb{N}}\inf_{c\in V_{xx^{\prime}}}c(d_{i})$ . By the canonical isomorphism and applying Levi’s theorem (monotone convergence theorem) [20, 123A] the claim in (23) follows. $\hfill\blacktriangleleft$

Appendix C Topology

Recall that a topological space is $\mathrm{R}_{2}$ if any pair of topologically distinct points (i.e. $\exists U\in\mathcal{T}\colon U\cap\{x,y\}\in\{\{x\},\{y\}\}$ ) is separated by disjoint open sets (i.e. $\exists U,V\in\mathcal{T}\colon x\in U\wedge y\in V\wedge U\cap V=\emptyset$ ). Also recall that a topological space is called $\mathrm{T}_{2}$ if it is Hausdorff and $\mathrm{T}_{0}$ if any pair of distinct points is separated by an open set. Obviously, a $\mathrm{T}_{2}$ -space is precisely an $\mathrm{R}_{2}$ -space that is $\mathrm{T}_{0}$ . Any topological space can be transformed canonically into a $\mathrm{T}_{0}$ space by identifying all points that are not topologically distinct, resulting in the so-called Kolmogorov quotient $\operatorname{Kol}$ . We now slightly generalise the well-known Stone-Weierstraß theorem, using the fact that $\mathrm{T}_{0}$ -spaces form a reflective subcategory of topological spaces via the Kolmogorov quotient construction. Proofs of the following results are found in [2].

Lemma 37.

(Stone-Weierstraß) Let $(X,\mathcal{T})$ be a compact $\mathrm{R}_{2}$ -space. Let $L$ be a set of continuous functions $X\to\mathbb{R}$ such that $\min\{f,g\},\max\{f,g\}\in L$ for all $f,g\in L$ . If some continuous function $f\colon X\to\mathbb{R}$ can be approximated at each pair of points by functions in $L$ , then $f$ itself can also be approximated by functions in $L$ with respect to the uniform norm $\lVert\,\_\,\rVert_{\textnormal{u}}$ .

Lemma 38.

The unit of the Kolmogorov construction $\eta_{X}\colon X\to\operatorname{Kol}X$ is proper, i.e. preimages of compact sets are compact.
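For a finite topological space the Kolmogorov quotient can be computed directly from the definition of topological indistinguishability. The following sketch uses a hypothetical three-point space that is $\mathrm{R}_{2}$ but not $\mathrm{T}_{0}$; the function name and the encoding of open sets as frozensets are choices made for this illustration.

```python
def kolmogorov_quotient(points, opens):
    """Group the points of a finite topological space into classes of
    topologically indistinguishable points (same open neighbourhoods)."""
    classes = {}
    for x in points:
        # The family of open sets containing x identifies its class.
        signature = frozenset(u for u in opens if x in u)
        classes.setdefault(signature, set()).add(x)
    return {frozenset(c) for c in classes.values()}


# Hypothetical example: no open set tells the points 0 and 1 apart.
points = {0, 1, 2}
opens = {frozenset(), frozenset({0, 1}), frozenset({2}), frozenset({0, 1, 2})}
quotient = kolmogorov_quotient(points, opens)
print(sorted(sorted(c) for c in quotient))
```

The quotient has two points that are separated by disjoint open sets, so it is $\mathrm{T}_{2}$, in line with the observation that a $\mathrm{T}_{2}$-space is an $\mathrm{R}_{2}$-space that is $\mathrm{T}_{0}$.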

Appendix D Proof of adequacy, Theorem 22

Proof.

Set $\llbracket\,\_\,\rrbracket=\llbracket\,\_\,\rrbracket_{\gamma}$ . We prove $\operatorname{\mathbf{bd}}\geq\llbracket\varphi\rrbracket^{\bullet}\mathrm{d}_{\textnormal{E}}$ for each $\varphi\in\mathcal{L}$ by structural induction over $\varphi$ , where $\llbracket\varphi\rrbracket^{\bullet}={(\llbracket\varphi\rrbracket\times\llbracket\varphi\rrbracket)}^{-1}$ . Recall again that all logical symbols are also function symbols, so the proof consists of two cases: function symbols and modal operators.

Take a formula $\varphi=f(\varphi_{1},\ldots,\varphi_{n})$ for an arbitrary $n$ -ary function symbol $f$ – interpreted by a function also denoted by $f\colon[0,1]^{n}\to[0,1]$ – and formulas $\varphi_{i}$ with $i=1,\ldots,n$ with $\operatorname{\mathbf{bd}}\geq\llbracket\varphi_{i}\rrbracket^{\bullet}\mathrm{d}_{\textnormal{E}}$ for each $i$ . Given two states $x,y\in X$ we find

As $x, y$ were chosen arbitrarily, it follows that $\operatorname{\mathbf{bd}}\geq\llbracket\varphi\rrbracket^{\bullet}\mathrm{d}_{\textnormal{E}}$ .
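The estimate behind the function-symbol case can be sketched as follows. This is a reconstruction under stated assumptions, not the authors' displayed computation: it assumes $n\geq 1$, that $\mathrm{d}_{\textnormal{E}}$ is the absolute difference on $[0,1]$, and that every function symbol is interpreted by a map that is nonexpansive with respect to the supremum distance on $[0,1]^{n}$ (the property whose failure for truncated addition makes $\mathcal{L}^{\prime}$ inadequate).

```latex
\begin{align*}
\mathrm{d}_{\textnormal{E}}\bigl(\llbracket\varphi\rrbracket(x),\llbracket\varphi\rrbracket(y)\bigr)
  &= \bigl\lvert f(\llbracket\varphi_{1}\rrbracket(x),\ldots,\llbracket\varphi_{n}\rrbracket(x))
      - f(\llbracket\varphi_{1}\rrbracket(y),\ldots,\llbracket\varphi_{n}\rrbracket(y))\bigr\rvert \\
  &\leq \max_{1\leq i\leq n}\bigl\lvert\llbracket\varphi_{i}\rrbracket(x)-\llbracket\varphi_{i}\rrbracket(y)\bigr\rvert
  && \text{($f$ nonexpansive, assumed)} \\
  &\leq \operatorname{\mathbf{bd}}(x,y)
  && \text{(induction hypothesis for each $\varphi_{i}$)}
\end{align*}
```

For constants ($n=0$) the left-hand side vanishes, so the inequality holds trivially.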

Turning in the final step to modal operators, take any $a\in\Sigma$ and assume that for a formula $\psi$ we already know that $\operatorname{\mathbf{bd}}\geq\llbracket\psi\rrbracket^{\bullet}\mathrm{d}_{\textnormal{E}}$ . Observe for any two states $x,y\in X$

As $x, y$ were chosen arbitrarily, it follows that $\operatorname{\mathbf{bd}}\geq\llbracket\diamond_{a}\psi\rrbracket^{\bullet}\mathrm{d}_{\textnormal{E}}$ . $\hfill\blacktriangleleft$
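The estimate behind the modal case can be sketched along the following lines. Again this is a reconstruction under assumptions: the modality is taken to be interpreted as $\llbracket\diamond_{a}\psi\rrbracket(x)=c\int\llbracket\psi\rrbracket\mathop{}\!\mathrm{d}\gamma_{a,x}^{0}$ (the form displayed for $\mathcal{L}^{\prime}$), $\mathfrak{c}$ is an arbitrary coupling in $K(\gamma_{a,x}^{0},\gamma_{a,y}^{0})$, and $\operatorname{\mathbf{bd}}(x,y)$ is assumed to dominate $c\cdot\hat{\sigma}(\operatorname{\mathbf{bd}})(\gamma_{a,x}^{0},\gamma_{a,y}^{0})$ by its fixpoint property.

```latex
\begin{align*}
\bigl\lvert\llbracket\diamond_{a}\psi\rrbracket(x)-\llbracket\diamond_{a}\psi\rrbracket(y)\bigr\rvert
  &= c\,\Bigl\lvert\int\llbracket\psi\rrbracket\,\mathrm{d}\gamma_{a,x}^{0}
       -\int\llbracket\psi\rrbracket\,\mathrm{d}\gamma_{a,y}^{0}\Bigr\rvert
   = c\,\Bigl\lvert\int\bigl(\llbracket\psi\rrbracket(u)-\llbracket\psi\rrbracket(v)\bigr)\,\mathrm{d}\mathfrak{c}(u,v)\Bigr\rvert \\
  &\leq c\int\bigl\lvert\llbracket\psi\rrbracket(u)-\llbracket\psi\rrbracket(v)\bigr\rvert\,\mathrm{d}\mathfrak{c}(u,v)
  && \text{(Jensen's inequality)} \\
  &\leq c\int\operatorname{\mathbf{bd}}\,\mathrm{d}\mathfrak{c}
  && \text{(induction hypothesis for $\psi$)}
\end{align*}
```

Taking the infimum over all couplings $\mathfrak{c}$ bounds the left-hand side by $c\cdot\hat{\sigma}(\operatorname{\mathbf{bd}})(\gamma_{a,x}^{0},\gamma_{a,y}^{0})$, which under the stated assumption is at most $\operatorname{\mathbf{bd}}(x,y)$.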

Note that the usage of Jensen’s inequality could be avoided using 9. But we chose to conduct the proof this way in view of possible further research into logics without negation.

Appendix E Remaining proofs

The remaining proofs in this paper and its appendices are found in [2].

[1] Harsh Beohar, Sebastian Gurke, Barbara König, Karla Messing, Jonas Forster, Lutz Schröder, and Paul Wild. Expressive quantale-valued logics for coalgebras: An adjunction-based approach. In Olaf Beyersdorff, Mamadou Moustapha Kanté, Orna Kupferman, and Daniel Lokshtanov, editors, 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024), volume 289, pages 10:1–10:19, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.STACS.2024.10.

[2] Harsh Beohar, Clemens Kupke, and Daniel Luckhardt. Hennessy-Milner type theorem for measurable pseudometrics, 2025. arXiv:2505.23635 [cs.LO].

[3] Vladimir I. Bogachev. Measures on topological spaces. Journal of Mathematical Sciences, 91(4):3033–3156, 1998.

[4] Filippo Bonchi, Barbara König, and Daniela Petrisan. Up-to techniques for behavioural metrics via fibrations. In Sven Schewe and Lijun Zhang, editors, 29th International Conference on Concurrency Theory (CONCUR 2018), volume 118 of Leibniz International Proceedings in Informatics (LIPIcs), pages 17:1–17:17, Dagstuhl, Germany, 2018. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CONCUR.2018.17.

[5] Craig Boutilier, Thomas Dean, and Steve Hanks. Decision-theoretic planning: structural assumptions and computational leverage. J. Artif. Int. Res., 11(1):1–94, July 1999. doi:10.1613/JAIR.575.

[6] D. W. Bressler and M. Sion. The current theory of analytic sets. Canadian Journal of Mathematics, 16:207–230, 1964. doi:10.4153/CJM-1964-021-7.

[7] Linan Chen, Florence Clerc, and Prakash Panangaden. A behavioural pseudometric for continuous-time Markov processes. In Foundations of Software Science and Computation Structures, Lecture Notes in Computer Science, 2025. Accepted; arXiv:2501.13008 [cs.LO]; Event: FoSSaCS, Hamilton, Ontario, Canada, May 5–8, 2025. doi:10.48550/arXiv.2501.13008.

[8] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for labeled Markov systems. In Jos C. M. Baeten and Sjouke Mauw, editors, CONCUR’99 Concurrency Theory, pages 258–273, Berlin, Heidelberg, 1999. Springer. doi:10.1007/3-540-48320-9_19.

[9] Josée Desharnais, Abbas Edalat, and Prakash Panangaden. Bisimulation for labelled Markov processes. Information and Computation, 179(2):163–193, 2002. doi:10.1006/inco.2001.2962.

[10] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for labelled Markov processes. Theoretical Computer Science, 318(3):323–354, 2004. doi:10.1016/j.tcs.2003.09.013.

[11] Ernst-Erich Doberkat. Measures and all that — a tutorial, 2014. arXiv:1409.2662 [math.FA], Version 3.

[12] R. M. Dudley. Real Analysis and Probability. Number 74 in Cambridge Studies in Advanced Mathematics. Cambridge University Press, 2002.

[13] Nelson Dunford and Jacob Schwartz. Linear Operators, volume 3 of Pure and Applied Mathematics. Wiley-InterScience, 1958/1971.

[14] Ryszard Engelking. General topology, volume 6 of Sigma series in pure mathematics. Heldermann Verlag, revised and completed edition, 1989.

[15] Arnold M. Faden. The existence of regular conditional probabilities: Necessary and sufficient conditions. The Annals of Probability, 13(1):288–298, 1985.

[16] Neil Falkner. Generalizations of analytic and standard measurable spaces. Mathematica Scandinavica, pages 283–301, 1981.

[17] Norm Ferns, Prakash Panangaden, and Doina Precup. Metrics for finite Markov decision processes. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI 2004), pages 162–169, 2004. arXiv:1207.4114 [cs.AI]. URL: https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&smnu=2&article_id=1103&proceeding_id=20.

[18] Norm Ferns, Prakash Panangaden, and Doina Precup. Bisimulation metrics for continuous Markov decision processes. SIAM Journal on Computing, 40(6):1662–1714, 2011. doi:10.1137/10080484X.

[19] Jonas Forster, Sergey Goncharov, Dirk Hofmann, Pedro Nora, Lutz Schröder, and Paul Wild. Quantitative Hennessy-Milner theorems via notions of density. In Bartek Klin and Elaine Pimentel, editors, 31st EACSL Annual Conference on Computer Science Logic (CSL 2023), volume 252, pages 22:1–22:20, Dagstuhl, Germany, 2023. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CSL.2023.22.

[20] D.H. Fremlin. Measure Theory, volume 5. Torres Fremlin, 2000/2008.

[21] Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, and Marc G. Bellemare. DeepMDP: Learning continuous latent space models for representation learning. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2170–2179. PMLR, June 2019. URL: https://proceedings.mlr.press/v97/gelada19a.html.

[22] Michèle Giry. A categorical approach to probability. In B. Banaschewski, editor, Categorical Aspects of Topology and Analysis, volume 915 of Lecture Notes in Mathematics, pages 68–85. Springer, 1982.

[23] Robert Givan, Thomas Dean, and Matthew Greig. Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence, 147(1):163–223, 2003. Planning with Uncertainty and Incomplete Information. doi:10.1016/S0004-3702(02)00376-4.

[24] Boris Vladimirovich Gnedenko and Andrey Nikolaevich Kolmogoroff. Predelnye raspredeleniya dlya summ nezavisimykh sluchaynykh velichin. GITTL, 1949.

[25] Matthew Hennessy and Robin Milner. On observing nondeterminism and concurrency. In Jaco de Bakker and Jan van Leeuwen, editors, Automata, Languages and Programming, pages 299–309, Berlin, Heidelberg, 1980. Springer Berlin Heidelberg. doi:10.1007/3-540-10003-2_79.

[26] B. P. F. Jacobs. Categorical Logic and Type Theory. Number 141 in Studies in Logic and the Foundations of Mathematics. North Holland, Amsterdam, 1999.

[27] Alexander Kechris. Classical descriptive set theory, volume 156 of Graduate Texts in Mathematics. Springer Science & Business Media, 2012.

[28] Henning Kerstan. Coalgebraic Behavior Analysis: From Qualitative To Quantitative Analyses. PhD thesis, Universität Duisburg-Essen, May 2016. Submitted on 2016-05-09. URL: https://duepublico2.uni-due.de/receive/duepublico_mods_00041220.

[29] Yuichi Komorida, Shin-ya Katsumata, Clemens Kupke, Jurriaan Rot, and Ichiro Hasuo. Expressivity of quantitative modal logics: Categorical foundations via codensity and approximation. In Proceedings of the Thirty Sixth Annual IEEE Symposium on Logic in Computer Science (LICS 2021), pages 1–14, Rome, Italy, June 2021. IEEE Computer Society Press. doi:10.1109/LICS52264.2021.9470656.

[30] Clemens Kupke and Jurriaan Rot. Expressive Logics for Coinductive Predicates. In Maribel Fernández and Anca Muscholl, editors, 28th EACSL Annual Conference on Computer Science Logic (CSL 2020), volume 152 of Leibniz International Proceedings in Informatics (LIPIcs), pages 26:1–26:18, Dagstuhl, Germany, 2020. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CSL.2020.26.

[31] Jan K. Pachl. Two classes of measures. Colloquium Mathematicum, 42(1):331–340, 1979. Erratum in Vol. 45.2 (1981), pp. 331–333.

[32] D. Ramachandran and L. Rüschendorf. On the Monge–Kantorovich duality theorem. Teoriya veroyatnostey i ee primeneniya, 45(2), 2000.

[33] Doraiswamy Ramachandran and Ludger Rüschendorf. A general duality theorem for marginal problems. Probability Theory and Related Fields, 101:311–319, 1995.

[34] Czesław Ryll-Nardzewski. On quasi-compact measures. Fundamenta Mathematicae, 40:125–130, 1953.

[35] Stephen Simons. Minimax Theorems and Their Proofs, chapter 1, pages 1–23. Number 5 in Nonconvex Optimization and Its Applications. Springer, 1995. doi:10.1007/978-1-4613-3557-3.

[36] M. Ya. Suslin. Sur une définition des ensembles mesurables B sans nombres transfinis. Comptes Rendus Mathématique. Académie des Sciences. Paris., 164(2):88–91, 1917.

[37] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. A Bradford Book, Cambridge, MA, USA, 2018.

[38] Franck van Breugel and James Worrell. An algorithm for quantitative verification of probabilistic transition systems. In Kim G. Larsen and Mogens Nielsen, editors, CONCUR 2001 — Concurrency Theory, pages 336–350, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg. doi:10.1007/3-540-44685-0_23.

[39] Franck van Breugel and James Worrell. Towards quantitative verification of probabilistic transition systems. In Fernando Orejas, Paul G. Spirakis, and Jan van Leeuwen, editors, Automata, Languages and Programming, pages 421–432, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg. doi:10.1007/3-540-48224-5_35.

[40] Eric K. van Douwen. The integers and topology. In Handbook of set-theoretic topology, pages 111–167. Elsevier, 1984.

[41] Ignacio D. Viglizzo. Final sequences and final coalgebras for measurable spaces. In Algebra and Coalgebra in Computer Science, pages 395–407, Berlin, Heidelberg, 2005. Springer. doi:10.1007/11548133_25.

[42] Itaï Ben Yaacov, Alexander Berenstein, C. Ward Henson, and Alexander Usvyatsov. Model theory for metric structures. In Zoé Chatzidakis, Dugald Macpherson, Anand Pillay, and Alex Wilkie, editors, Model Theory with Applications to Algebra and Analysis, volume 350 of London Mathematical Society Lecture Note Series, pages 315–427. Cambridge University Press, Cambridge, 2008.

[43] Amy Zhang, Rowan Thomas McAllister, Roberto Calandra, Yarin Gal, and Sergey Levine. Learning invariant representations for reinforcement learning without reconstruction. In International Conference on Learning Representations, 2021. URL: https://openreview.net/forum?id=-2FCwDKRREu.
