ε-Distance via Lévy-Prokhorov Lifting

Desharnais, Josée; Sokolova, Ana

doi:10.4230/LIPIcs.CSL.2026.26

Josée Desharnais

Laval University, Québec, Canada

Ana Sokolova

University of Salzburg, Austria

Abstract

The most studied and accepted pseudometric for probabilistic processes is one based on the Kantorovich distance between distributions. It comes with many theoretical and motivating results, in particular it is the fixpoint of a given functional and defines a functor on (complete) pseudometric spaces. It is also the foundation for a categorical lifting of pseudometrics.

Other notions of behavioural pseudometrics have also been proposed, one of them ( $\varepsilon$ -distance) based on $\varepsilon$ -bisimulation. $\varepsilon$ -Distance has the advantages that it is intuitively easy to understand, it relates systems that are conceptually close (for example, an imperfect implementation is close to its specification), and it comes equipped with a natural notion of $\varepsilon$ -coupling. Finally, this distance is easy to compute.

We show that $\varepsilon$ -distance is also the greatest fixpoint of a functional and provides a functor. The latter is obtained by replacing the Kantorovich distance in the lifting functor with the Lévy-Prokhorov distance. In addition, we show that $\varepsilon$ -couplings and $\varepsilon$ -bisimulations have an appealing coalgebraic characterization.

Keywords and phrases:

Lévy-Prokhorov metric, behavioural distance, epsilon-bisimulation, reactive probabilistic transition systems, discrete labelled Markov processes, coalgebraic epsilon-(bi)simulation

Funding:

Josée Desharnais: Work funded by NSERC grant.

2012 ACM Subject Classification:

Theory of computation $\rightarrow$ Probabilistic computation; Theory of computation $\rightarrow$ Categorical semantics; Theory of computation $\rightarrow$ Program verification

Related Version:

Full Version: https://arxiv.org/abs/2507.10732

Acknowledgements:

This work was partly done during a sabbatical of Josée Desharnais at the University of Salzburg, as well as during Dagstuhl Seminar 24432 and the Bellairs Workshop on Quantitative Reasoning 2025. We thank these venues, the organizers, and the participants for providing a perfect working environment. We also thank Matteo Mio and Franck van Breugel for some fruitful and motivating discussions, and the anonymous reviewers for their insightful and constructive suggestions.

Event:

34th EACSL Annual Conference on Computer Science Logic (CSL 2026)

Editors:

Stefano Guerrini and Barbara König

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Probabilistic systems [28, 38, 48, 7, 8, 44] and their behaviour have been an object of study for over thirty years in the area of formal verification and analysis of systems. They are used to represent uncertainty, incomplete information, as well as randomized behaviour. One important direction, in order to prove that systems behave the same, is the study of behavioural equivalences: these identify states with (exactly) the same behaviour. Behavioural equivalences are an elegant way to analyse and compare the behaviour of systems; they have nice foundations in concurrency theory [41, 60] as well as elegant abstract generalizations in the theory of coalgebras [20, 49, 32, 36, 33, 35, 13, 14]. However, as was already observed in [29], in a non-exact world, e.g., when the probabilities in the model are approximate and not exactly known, or estimated by sampling, or when small differences should not be treated the same as large differences, behavioural equivalences may be too strong.

One solution is to employ a distance, actually a pseudometric, providing a quantitative notion of how much states in a probabilistic model differ from one another, or how close they are to each other – distance zero corresponding to bisimulation equivalence. The study of pseudometrics [22, 58, 24, 18, 59, 56, 26, 54, 55] and quantitative theories [39, 40, 5, 6, 9, 42, 16, 43] has been a fruitful one in the past decade(s).

Historically, the impulse to define a behavioral distance between probabilistic processes became imperative for those with a continuous state space and continuous probability distributions, called Labelled Markov Processes (LMPs). It started with the idea of defining a real-valued logic, as proposed by Kozen [37], to express properties satisfied by processes: Taking the supremum over the differences of the values of formulas was a natural way of defining a pseudometric, see Desharnais et al. [22, 24]. The authors later on observed that this distance was the greatest fixpoint of some functional [25] and, in between and later, van Breugel and Worrell [58, 59] and van Breugel et al. [56], proved that this gives a functor (monad) that is a lifting of the distribution functor (monad) using the Kantorovich metric on distributions. Moreover, the distance is the one obtained using final coalgebra semantics. Since then, this pseudometric has been studied further in different contexts and algorithms have been proposed to compute it [53, 5, 6]. The distance comes with many theoretical and motivating results, and has also been extended by Baldan et al. [9] and Sprunger et al. [51, 52] to a generic Kantorovich-style (and dually Wasserstein) lifting of arbitrary functors parametric in certain evaluation maps – the two liftings coincide on distributions. With all these seals of approval, the Kantorovich behavioural distance has imposed itself as the one to use and study. One aspect that it lacks is an intuitive accompanying notion of approximate bisimulation.

Approximate bisimulations provide another strategy to circumvent the non-robustness of bisimulation equivalences. They have been studied in non-probabilistic systems [61, 30, 51], and in probabilistic systems [26, 54, 27, 1, 2, 10, 50]. We focus on the notion of $\varepsilon$ -bisimulation defined in [26]. One aim of such approximate bisimulations is to unite the best of both worlds: give a relational structure for reasoning about systems and at the same time define a distance. States related by an $\varepsilon$ -bisimulation are at distance at most $\varepsilon$ from one another. The advantages of such a distance are its intuitive nature and its ability to relate systems that are almost the same in structure, as the next example will illustrate. Finally, the distance can be computed relatively easily.

For comparison with the Kantorovich behavioural distance, consider the imperfect channel (when $\varepsilon>0$ ) depicted in Figure 1. The $\varepsilon$ -distance of this channel to a perfect channel (one with $\varepsilon=0$ ) is $\varepsilon$ , which provides a simple example of an imperfect implementation that can be considered close to its specification. On the other hand, the Kantorovich distance (without discount) gives distance 1 to the pair of perfect and imperfect channels, which reflects the fact that the two channels behave very differently in the long run – see Example 1 for more details. Interestingly, $\varepsilon$ -distance has an almost built-in notion of continuity: two channels with close values of $\varepsilon$ are close in the $\varepsilon$ -distance as well.

Figure 1: A simple channel that fails to send a token with probability $\varepsilon$ .

However, the approximate equivalences had no categorical formulation attached and it was not clear whether the associated distance shares some of the generic good properties identified for the Kantorovich distance. Our main motivation for this work was to investigate whether we can remedy this situation. Indeed, we provide the missing abstract characterization of $\varepsilon$ -bisimulation in two directions: by obtaining a fixpoint characterization of the induced distance and by giving a coalgebraic view following Aczel-Mendler coalgebraic bisimilarity. The fixpoint characterization is the main result of the paper. For this, we show how to replace the Kantorovich metric on distributions with the Lévy-Prokhorov metric on distributions, and show that the distance obtained is indeed the $\varepsilon$ -distance. It is remarkable how well the Lévy-Prokhorov distance fits into the definition of $\varepsilon$ -bisimulation: We show that $\varepsilon$ -distance is the greatest fixpoint of a suitable functional, and that the lifting also lifts the discrete distributions functor to a functor on pseudometric spaces. We also show a result that has no matching for the Kantorovich functional: any fixpoint distance of the Lévy-Prokhorov functional defines an $\varepsilon$ -bisimulation. Applications of our results may take different routes: On the practical side, the distance could be computed by an iterative algorithm given a suitable fixpoint theorem. On the theoretical side, the fixpoint characterization may open the way for studying new problems, one example being $\varepsilon$ -bisimulations up-to as an acceleration to computing $\varepsilon$ -bisimilarity, along the lines of [12].

Our observation may also open a new way of studying other pseudometrics on probabilistic systems that would be constructed from other basic distances on distributions.

Unlike in the case of the Kantorovich lifting, the Lévy-Prokhorov lifting does not yield a monad on the category of pseudometric spaces, see Section 3.4. As a consequence, we could not follow the path of [56] and reproduce their results on the final coalgebra. Instead, we take a different route and describe $\varepsilon$ -(bi)simulations as coalgebraic (bi)simulations, following and extending the framework of Hasuo [32, 33] and the original results of Hughes and Jacobs [34]. We present only the abstraction needed to discuss $\varepsilon$ -(bi)simulations on discrete time Markov chains and labelled Markov processes; generalizations of these notions are possible and we plan to elaborate on them in follow-up work.

All proofs omitted for space reasons are in the appendix.

2 The objects of interest

Our development concerns discrete labelled Markov processes (LMPs) also called reactive probabilistic transition systems. They consist of a set of states $S$ and transitions $\tau_{a}:S\to\mathcal{D}S$ labelled with actions $a$ of some set $\mathcal{A}$ , where $\mathcal{D}S$ is the set of discrete sub-distributions on $S$ :

\mathcal{D}S=\{\varphi\colon S\to[0,1]\mid\sum_{s\in S}\varphi(s)\leq 1\}.

Note that the size of the state set need not be restricted (for the sums to be defined), as the sum is defined as $\sum_{x\in S}\varphi(x)=\sup_{\textrm{finite }X\subseteq S}\left(\sum_{x\in X}\varphi(x)\right)$ . It is easy to prove that discrete probability distributions always have a countable support, i.e., they assign non-zero probability to at most countably many elements of $S$ .

For $s,t\in S$ and $a\in\mathcal{A}$ , the value $\tau_{a}(s)(t)$ (also written $\tau_{a}(s,t)$ ) encodes the probability of jumping from $s$ to $t$ when action $a$ is taken. Figure 1 shows such a system with actions send and ack. The first transition is a subprobability: the missing probability represents the probability of failure. If $X\subseteq S$ , we write $\tau_{a}(s)(X)$ for $\sum_{t\in X}\tau_{a}(s)(t)$ . Coalgebraically, LMPs, for a set of actions or labels $\mathcal{A}$ , are arrows $\tau\colon S\to(\mathcal{D}S)^{\mathcal{A}}$ . We write $\tau_{a}(s)$ for the subdistribution $\tau(s)(a)$ . In case of just one label, LMPs are (discrete time) Markov chains, i.e., coalgebras of the functor $\mathcal{D}$ . The reason for working with subdistributions is that they come equipped with a non-trivial pointwise order. Bisimilarity of LMPs is the classical Larsen-Skou [38] bisimilarity.
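As a concrete illustration, a finite LMP can be encoded with nested dictionaries. The following is a minimal sketch under stated assumptions: the state names `c` and `c1`, the value 1/10 for the failure probability, and the exact shape of the channel (a `send` step that may fail, followed by an `ack` step) are our guesses at the system of Figure 1, and the helper names are hypothetical.

```python
from fractions import Fraction

def is_subdistribution(phi):
    """Check that phi maps states to [0, 1] with total mass at most 1."""
    return all(0 <= p <= 1 for p in phi.values()) and sum(phi.values()) <= 1

def mass(phi, X):
    """tau_a(s)(X): total probability that the sub-distribution phi gives to the set X."""
    return sum((phi.get(t, Fraction(0)) for t in X), Fraction(0))

eps = Fraction(1, 10)  # illustrative failure probability (assumption)

# tau[s][a] is the sub-distribution tau_a(s); missing mass models failure.
# ASSUMED structure: c --send--> c1 with probability 1 - eps, c1 --ack--> c with probability 1.
tau = {
    "c":  {"send": {"c1": 1 - eps}, "ack": {}},
    "c1": {"send": {}, "ack": {"c": Fraction(1)}},
}

print(mass(tau["c"]["send"], {"c", "c1"}))  # mass of the send transition out of c
```

Exact rationals (`Fraction`) are used so that comparisons of probabilities are not affected by floating-point rounding.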

We denote by $\mathbf{PMet_{1}}$ the category of $1$ -bounded pseudometric spaces. It has as objects pseudometric spaces, which are pairs $\langle X,d\rangle$ of a set $X$ and a function $d\colon X\times X\to[0,1]$ which satisfies $d(x,x)=0$ , is symmetric: $d(x,y)=d(y,x)$ and satisfies the triangle inequality: $d(x,y)+d(y,z)\geq d(x,z)$ for all $x,y,z\in X$ . As arrows, $\mathbf{PMet_{1}}$ has non-expansive maps $f\colon\langle X,d_{X}\rangle\to\langle Y,d_{Y}\rangle$ , i.e., functions $f\colon X\to Y$ satisfying $d_{Y}(f(x_{1}),f(x_{2}))\leq d_{X}(x_{1},x_{2})$ . A metric or distance is a pseudometric that additionally satisfies $d(x,y)=0$ iff $x=y$ . Of particular interest for LMPs are pseudometrics whose kernel is bisimilarity, that is, $d(x,y)=0$ iff $x$ and $y$ are bisimilar. As we only deal with pseudometrics here, we often use the word distance or metric as a shorthand for pseudometric. Following [59, 56], we will use the opposite pointwise order on distances (one reason for this choice is to obtain the distance as the greatest, rather than least, fixpoint, just like bisimilarity is the greatest fixpoint of a suitable functional), defined by:

d_{1}\sqsubseteq d_{2}\quad\Longleftrightarrow\quad\forall x,y\in X:d_{1}(x,y)\geq d_{2}(x,y).

However, whenever it is clearer, we may use the direct pointwise order too: $d_{1}\geq d_{2}\Leftrightarrow d_{1}\sqsubseteq d_{2}$ .
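On a finite set, the pseudometric axioms and the opposite pointwise order can be checked mechanically. The sketch below is only illustrative: the three-point set and the helper names `is_pseudometric` and `below` are our own choices, not notation from the text.

```python
from itertools import product

def is_pseudometric(X, d, tol=1e-12):
    """Check 1-boundedness, d(x, x) = 0, symmetry and the triangle inequality on a finite X."""
    for x, y in product(X, repeat=2):
        if not (-tol <= d(x, y) <= 1 + tol) or abs(d(x, y) - d(y, x)) > tol:
            return False
    if any(abs(d(x, x)) > tol for x in X):
        return False
    return all(d(x, y) + d(y, z) >= d(x, z) - tol for x, y, z in product(X, repeat=3))

def below(d1, d2, X):
    """d1 ⊑ d2 in the opposite pointwise order, i.e. d1(x, y) >= d2(x, y) for all x, y."""
    return all(d1(x, y) >= d2(x, y) for x, y in product(X, repeat=2))

X = [0.0, 0.5, 1.0]                               # illustrative three-point set
d_abs = lambda x, y: abs(x - y)                   # usual distance on [0, 1]
d_zero = lambda x, y: 0.0                         # everything at distance 0
d_discrete = lambda x, y: 0.0 if x == y else 1.0  # distinct points at distance 1

print(below(d_discrete, d_abs, X), below(d_abs, d_zero, X))
```

With this order the zero pseudometric is the greatest element and the discrete one is the least among 1-bounded pseudometrics, which is why a greatest fixpoint corresponds to the pointwise smallest distance.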

2.1 The Kantorovich pseudometric

The observation that bisimulation is too strong in the context of probabilistic systems has led to the idea of defining a pseudometric that would give zero distance to bisimilar states. Initially, this pseudometric was defined using a real-valued logic. A set $\cal F$ of functionals was defined from states of LMPs to $[0,1]$ , with the following syntax, mimicking logical formulas [24]:

f:=1\mid\inf(f_{1},f_{2})\mid 1-f\mid f\varominus q\mid\langle a\rangle\,f\mbox{ with }q\in[0,1]\cap\mathbb{Q}.

We omit the semantics, but the next example will give a taste of it. A distance emerged naturally as

d_{K}(s,t)=\sup_{f\in\cal F}\big|f(s)-f(t)\big|.

Example 1.

As an example, we can look back at the channel of Figure 1. The maximum difference over functionals between $c_{\varepsilon}$ and $c_{\gamma}$ , with $\varepsilon,\gamma>0$ is over the functional $\langle\textit{send\,}\rangle\langle\textit{ack}\,\rangle 1$ , which evaluates to $1-\varepsilon$ and $1-\gamma$ , respectively, and thus yields the distance $d_{K}(c_{\varepsilon},c_{\gamma})=|\varepsilon-\gamma|$ . Between $c_{\varepsilon}$ and $c_{0}$ , the functional $(\langle send\rangle\langle ack\rangle)^{n}1$ evaluates to $(1-\varepsilon)^{n}$ on $c_{\varepsilon}$ , and to 1 on $c_{0}$ . So the supremum results in $d_{K}(c_{\varepsilon},c_{0})=1$ .
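The arithmetic behind the last step can be checked numerically. The following minimal sketch (the value 0.1 for $\varepsilon$ and the helper name `gap` are illustrative choices) evaluates the difference between the values 1 and $(1-\varepsilon)^{n}$ stated above for the functional $(\langle send\rangle\langle ack\rangle)^{n}1$ on $c_{0}$ and $c_{\varepsilon}$, for growing $n$.

```python
def gap(eps, n):
    """|f(c_0) - f(c_eps)| for f = (<send><ack>)^n 1, using the values 1 and (1 - eps)^n."""
    return 1.0 - (1.0 - eps) ** n

eps = 0.1  # illustrative failure probability
depths = (1, 10, 50, 200)
gaps = [gap(eps, n) for n in depths]
for n, g in zip(depths, gaps):
    print(n, g)
```

The gap grows with $n$ and approaches 1 for any fixed $\varepsilon>0$, which is why the supremum over all functionals equals 1 no matter how small $\varepsilon$ is.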

Later on, it was proven that the functionals could be any non-expansive maps and that this distance is the distance obtained via a final coalgebra construction for a functor (monad) that is a lifting of the Distribution functor (monad) to pseudometric spaces using the Kantorovich metric, by van Breugel and Worrell [59] and van Breugel et al. [56, 57].

This distance is a fixpoint of the functional on distances on the states of an LMP $\tau:S\to(\mathcal{D}S)^{\mathcal{A}}$

\Delta_{K}(d)(s_{0},s_{1})=\sup_{a\in\mathcal{A}}\,\delta_{K}^{d}(\tau_{a}(s_{0}),\tau_{a}(s_{1}))\tag{1}

where $\delta_{K}^{d}$ is the well-known Kantorovich distance between two distributions. We do not need its definition here, the interested reader can find it in, e.g., [56].

The distance on distributions is just a parameter in Equation (1); changing the Kantorovich distance $\delta_{K}^{d}$ to another distance on distributions and looking for a fixpoint will give us a new distance on states. In fact we are interested in the question formulated the other way around: find a distance on distributions for which the fixpoint is the $\varepsilon$ -distance $d^{*}$ , defined in the next section. Similarly, the functional $\Delta_{K}$ was discovered after the behavioural distance $d_{K}$ was introduced. However, it could have been done the other way around, choosing a distance on distributions, and looking for the resulting distance on states, as is done by Baldan et al. [9]. We discuss this in more detail again in Section 3 below.

2.2 $\varepsilon$ -Bisimulation

Bisimulation, being too strong for the comparison of states in quantitative systems, has also led to relaxing the definition of simulation and bisimulation itself to approximate relations. As mentioned in the introduction, there are a few of those definitions of approximate bisimulations, but we are interested in the following.

Definition 2 ([26]).

Let $\tau:S\to(\mathcal{D}S)^{\mathcal{A}}$ be an LMP and let $\varepsilon\in[0,1]$ . A relation $R\subseteq S\times S$ is an $\varepsilon$ -simulation if whenever $s R t$ , then for all $a\in\mathcal{A}$ , for all $X\subseteq S$

\tau_{a}(s)(X)\leq\tau_{a}(t)({R}(X))+\varepsilon,\qquad\mbox{ where }R(X)=\{y\mid\exists x\in X:xRy\}.

If $R$ is symmetric, it is an $\varepsilon$ -bisimulation. A state $s$ is $\varepsilon$ -simulated by state $t$ , written $s\prec_{\varepsilon}t$ , if $s R t$ for some $\varepsilon$ -simulation $R$ . If $s R t$ for $R$ $\varepsilon$ -bisimulation, we write $s\sim_{\varepsilon}t$ and we say that $s$ and $t$ are $\varepsilon$ -bisimilar.

As expected, ordinary (bi)simulation on LMPs [38, 11] is simply $0$ -(bi)simulation. This definition has an extension to nondeterministic and probabilistic finite systems [54].

The operation $R(X)$ on a set $X$ is what restricts this work to discrete distributions. In general, if $X$ is measurable, we may not have $R(X)$ measurable. Working in analytic spaces, as was done before for LMPs [23] may solve this issue, but we leave it for future work. In particular, one could ask $R(X)$ to be measurable whenever $X$ is, as done in [10], but many results are not proven in that case, like the logical characterisation, or Theorem 5 below.

Example 3.

Consider the following example, with $\gamma\in(0,1]$ .

The relation $R=\{(s,t),(s_{1},t_{1})\}$ is an $\varepsilon$ -simulation for $\varepsilon=\gamma$ , and so is $R\cup R^{-1}$ . Hence, $s\prec_{{\gamma}}t$ , $t\prec_{{\gamma}}s$ and $s\sim_{{\gamma}}t$ . However, the situation is different for $\varepsilon<\gamma$ : for example, taking $\varepsilon=0$ , we observe that $s\prec_{0}t$ (with the relation $R\cup\{(s_{2},t_{1})\}$ ). However, $t\not\prec_{0}s$ , because a relation $R^{\prime}$ relating $t$ and $s$ would have to also relate $t_{1}$ to $s_{1}$ and to $s_{2}$ , if $\gamma<1$ , and $t_{1}$ to $s_{2}$ if $\gamma=1$ . Indeed, taking $X=\{t_{1}\}$ , we need $\tau_{a}(t)(X)\leq\tau_{a}(s)(R^{\prime}(X))+0$ , but for this to be the case, we would need $\tau_{a}(s)(R^{\prime}(X))=1$ and hence $R^{\prime}$ must include the pairs $(t_{1},s_{1})$ and $(t_{1},s_{2})$ if $\gamma<1$ , and otherwise must include $(t_{1},s_{2})$ if $\gamma=1$ . So, in any case $(t_{1},s_{2})\in R^{\prime}$ , but the pair $(t_{1},s_{2})$ cannot be related by any $\varepsilon$ -simulation relation $R^{\prime}$ for $\varepsilon<1$ (which is the case here as $\varepsilon=0$ ), as, for example, taking again $X=\{t_{1}\}$ , we would have $\tau_{a}(t_{1})(X)=1$ but $\tau_{a}(s_{2})(R^{\prime}(X))=0$ . Hence, also $s\not\sim_{0}t$ .
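On a finite LMP, Definition 2 can be checked by brute force over all subsets $X$. The sketch below does this and replays the claims of Example 3. Since the figure of the example is not reproduced here, the transition structure encoded in `tau` is an assumption reconstructed from the surrounding discussion; the value $\gamma=1/4$ and all helper names are likewise our own illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

def subsets(states):
    """All subsets X of a finite list of states."""
    for r in range(len(states) + 1):
        for combo in combinations(states, r):
            yield set(combo)

def mass(phi, X):
    """phi(X): total probability of the set X under the sub-distribution phi."""
    return sum((phi.get(x, Fraction(0)) for x in X), Fraction(0))

def image(R, X):
    """R(X) = {y | there is x in X with x R y}."""
    return {y for (x, y) in R if x in X}

def is_eps_simulation(R, tau, actions, eps):
    """Brute-force check of Definition 2; tau[s][a] is the sub-distribution tau_a(s)."""
    states = list(tau)
    for (s, t) in R:
        for a in actions:
            for X in subsets(states):
                if mass(tau[s].get(a, {}), X) > mass(tau[t].get(a, {}), image(R, X)) + eps:
                    return False
    return True

def is_eps_bisimulation(R, tau, actions, eps):
    """A symmetric eps-simulation."""
    return all((y, x) in R for (x, y) in R) and is_eps_simulation(R, tau, actions, eps)

gamma = Fraction(1, 4)  # illustrative value of the parameter
# ASSUMED transition structure (the figure of Example 3 is not available here).
tau = {
    "s":  {"a": {"s1": 1 - gamma, "s2": gamma}},
    "t":  {"a": {"t1": Fraction(1)}},
    "s1": {"a": {"s1": 1 - gamma}},
    "t1": {"a": {"t1": Fraction(1)}},
    "s2": {"a": {}},
}
R = {("s", "t"), ("s1", "t1")}
R_sym = R | {(y, x) for (x, y) in R}

print(is_eps_simulation(R, tau, ["a"], gamma))
print(is_eps_bisimulation(R_sym, tau, ["a"], gamma))
print(is_eps_simulation(R | {("s2", "t1")}, tau, ["a"], 0))
```

The check is exponential in the number of states and is meant only for experimenting with small examples, not as an efficient decision procedure.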

In contrast to the case of 0-bisimilarity, two-way $\varepsilon$ -similarity is not $\varepsilon$ -bisimilarity [26, 54]. From the notion of $\varepsilon$ -bisimilarity, a pseudometric arises naturally by taking the infimum.

Definition 4 (The $\varepsilon$ -distance $d^{*}$ [26]).

Let $\tau:S\to(\mathcal{D}S)^{\mathcal{A}}$ be an LMP. The pseudometric $d^{*}$ on $S\times S$ , called $\varepsilon$ -distance, is defined as follows:

\begin{array}{lcl}d^{*}:&S\times S&\rightarrow[0,1]\\ &(s,t)&\mapsto\inf\ \left\{\varepsilon\in[0,1]\mid s\sim_{\varepsilon}t\right\}.\end{array}

The function $d^{*}$ is a pseudometric on the states of the considered LMP: the triangle inequality comes from the well-known fact, see e.g. [26], that $s\sim_{\varepsilon_{1}}u$ and $u\sim_{\varepsilon_{2}}t$ imply $s\sim_{\varepsilon_{1}+\varepsilon_{2}}t$ . (The same property holds for $\prec_{\varepsilon}$ as well.) Indeed, $\prec_{\varepsilon}$ and $\sim_{\varepsilon}$ are not transitive. Instead, these relations are entourages that form a uniform structure, or uniformity, cf. [15].

The kernel of $d^{*}$ , that is, the set of pairs at zero distance, is $\sim_{0}$ , the bisimilarity relation on LMPs.

Theorem 5 ([26]).

$d^{*}(s,t)=0\Leftrightarrow s\sim_{0}t$ .

One of the nice properties of this distance is its relatively easy computability, especially regarding computation “by hand”. It suffices to define an $\varepsilon$ -bisimulation to obtain an upper bound to the distance between states. In Example 3, we have $d^{*}(s,t)=\gamma$ , and, indeed, we easily see that the probabilities of $s$ and $t$ seen as processes are within $\gamma$ . Here too, the Kantorovich distance between these states is 1. One way to see that is by using the functional $\langle a\rangle^{n}1$ , which evaluates to $(1-\gamma)^{n}$ on $s$ , and to 1 on $t$ . So $d_{K}(s,t)=1$ . Similarly, as already noted in the introduction, for the channel in Figure 1, we can prove that $d^{*}(c_{\varepsilon},c_{\gamma})=|\varepsilon-\gamma|=d_{K}(c_{\varepsilon},c_{\gamma})$ , for $\varepsilon,\gamma>0$ , but $d^{*}(c_{\varepsilon},c_{0})=\varepsilon\neq d_{K}(c_{\varepsilon},c_{0})$ . In fact, $\varepsilon$ -distance is incomparable to both the discounted and the undiscounted Kantorovich distance, as argued in [26]. That paper also gives a straightforward polynomial algorithm to compute $d^{*}$ . The computation of the Kantorovich distance, discounted or not, has been the subject of many papers, which also led to polynomial algorithms [53].

3 The Lévy-Prokhorov distance lifting

We will now explain in detail what we think should be a behavioural pseudometric on LMPs. These details may not be new, or may appear obvious to the expert reader, but we find it useful to spell them out here. When looking for a distance $d^{\dagger}$ on the states of an LMP, one observes that states are both targets of distributions and (a set of) distributions themselves (since they are defined by their outgoing transition distributions): that is, in Example 3, the states $s$ and $t$ can be viewed as distributions over the set $\{s_{1},s_{2},t_{1}\}$ . As there are already a few distances on distributions that were studied outside computer science and concurrency theory, they give a starting point for distances on states: we could say, given such a distance $\delta$ :

d^{\dagger}(s,t):=\sup_{a\in\mathcal{A}}\,\delta(\tau_{a}(s),\tau_{a}(t)),\qquad\mbox{ for $s,t\in S$.}\tag{2}

We will use the letter $d$ for distances on states, and the symbol $\delta$ for distances on distributions. Analysing the options of this equation in full generality is outside the scope of the current paper but some distances, such as $\delta_{K}^{d}$ (and the one we introduce below $\delta^{d}_{LP}$ ) have a particularity. They are defined using a parameter $d$ , a basic distance on the space where the distributions are defined. This is desirable for concurrency theory because distributions on states that are different but close (or even bisimilar!) should also be close: so a starting distance $d$ on the state space is of key importance to account for these similarities between states/processes. Note that a few papers [30, 19, 51] start from distances on the states that account for extra information (e.g. a distance on the labels/observations that are attached to states) – but this goes beyond the behaviour of states that we want to capture here (although they constitute an interesting line of work to extend our method).

A needed property for $d^{\dagger}$ to be a behavioural distance, as also pointed out in [9] for $\delta_{K}^{d}$ , is that it accords with the starting distance $d$ used in $\delta^{d}$ , as follows.

d^{\dagger}(s,t)=\sup_{a\in\mathcal{A}}\,\delta^{d^{\dagger}}(\tau_{a}(s),\tau_{a}(t)),\qquad\mbox{ for $s,t\in S$.}\tag{3}

This says that the distance between states viewed as simple members of the space (the left hand part of the equality), is the same as their distance when viewed as processes, that is, their outgoing transition distributions. So this really says that $d^{\dagger}$ treats states according to their behaviour. Technically, this is saying that we are looking for a distance fixpoint $d^{\dagger}$ of the functional

\Delta(d)(s,t):=\sup_{a\in\mathcal{A}}\,\delta^{d}(\tau_{a}(s),\tau_{a}(t)),\qquad\mbox{ for $s,t\in S$.}\tag{4}

Of course, by taking $\delta^{d}:=\delta_{K}^{d}$ , this instantiates exactly to Equation (1), the fixpoint property of the Kantorovich behavioural distance, observed by [25, 59] and explicitly proven in [57]. One of our goals in this work was to find a similar functional that would have the $\varepsilon$ -distance $d^{*}$ as its greatest fixed point. That is, we are seeking a suitable distance $\delta^{d}$ on distributions.

$\blacktriangleright$ Remark 6.

Another property that one might consider natural and one might expect from such a distance $\delta^{d}$ is that the distance between the Dirac distributions on states is the same as the starting distance between them. That is, for a distance $d$ on states, one might expect

d(s,t)=\delta^{d}(1_{s},1_{t}),\qquad\mbox{ for $s,t\in S$,}\tag{5}

where we write $1_{s}$ for the Dirac distribution on $s\in S$ . This is an interesting property of the lifting $\delta^{d}$ – it is a stronger version of the non-expansiveness of the unit of the distribution monad, showing that the unit is an isometry, and we will return to it in Section 3.4 below where we show that it holds for the Lévy-Prokhorov distance, for any starting distance $d$ on states. It also holds for the (undiscounted) Kantorovich distance – this is easy to prove using the Wasserstein formulation of the Kantorovich distance, whereas it does not hold for the total variation distance [43]. Therefore, this property is neither unique to the Lévy-Prokhorov distance that we are interested in, nor is it “behavioural” in the sense that we explained so far: it does not take into account the transition structure $\tau$ of the states. For a somewhat behavioural explanation, note that combining it with Equation (2), this condition implies that a pair of states $s^{\prime},t^{\prime}$ each having a single outgoing transition as a Dirac on state $s$ and $t$ respectively, would get the same distance as $s$ and $t$ . In particular, such a distance does not discount the future.

3.1 The Lévy-Prokhorov distance on distributions

While examining equation (3) and trying to make $d^{*}$ fit into it as $d^{\dagger}$ we came up with the following distance on distributions. Only afterwards, we discovered that the distance was actually known as the Lévy-Prokhorov lifting of a distance $d$ to distributions. In this section we introduce the distance and provide an example that illustrates it.

Definition 7 (Lévy-Prokhorov distance [45]).

Let $\langle S,d\rangle$ be a pseudometric space; we endow $\mathcal{D}S$ with the pseudometric (or distance) $\delta^{d}_{LP}:\mathcal{D}S\times\mathcal{D}S\to[0,1]$ , defined as

\delta^{d}_{LP}(\mu_{0},\mu_{1})=\inf\{\varepsilon\mid\forall X\subseteq S:\mu_{i}(X)\leq\mu_{1-i}(X^{d}_{\varepsilon})+\varepsilon,\mbox{ for }i=0,1\},

where $X^{d}_{\varepsilon}=\{y\mid\exists x\in X:d(x,y)<\varepsilon\}$ .
We call $\delta^{d}_{LP}$ the Lévy-Prokhorov (LP, for short) distance.

The strict inequality in the definition of $X_{\varepsilon}^{d}$ is necessary for the proof of Theorem 11.
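On a finite space, Definition 7 can be evaluated exactly by brute force. The sketch below is one possible way to do so and is not an algorithm from the literature cited here: it relies on the observation that $X^{d}_{\varepsilon}$ only changes when $\varepsilon$ crosses a value taken by $d$, so it scans the intervals between consecutive distance values. The function names, the three-point space and the two distributions (with parameter $1/4$) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def subsets(states):
    """All subsets of a finite list of states."""
    for r in range(len(states) + 1):
        for combo in combinations(states, r):
            yield frozenset(combo)

def mass(mu, X):
    """mu(X) for a sub-distribution given as a dict."""
    return sum((mu.get(x, Fraction(0)) for x in X), Fraction(0))

def lp_distance(mu0, mu1, d, states):
    """Infimum of the eps with mu_i(X) <= mu_{1-i}(X_eps) + eps for all X and i = 0, 1."""
    cuts = sorted({Fraction(0)} | {d(x, y) for x in states for y in states})
    best = Fraction(1)  # eps = 1 always works, since all masses are at most 1
    for k, lo in enumerate(cuts):
        hi = cuts[k + 1] if k + 1 < len(cuts) else None
        rep = hi if hi is not None else lo + 1  # the ball is constant for eps in (lo, hi]
        c = Fraction(0)                         # largest deficit mu_i(X) - mu_{1-i}(X_eps)
        for X in subsets(states):
            ball = frozenset(y for y in states if any(d(x, y) < rep for x in X))
            c = max(c, mass(mu0, X) - mass(mu1, ball), mass(mu1, X) - mass(mu0, ball))
        cand = max(lo, c)  # infimum of the admissible eps inside (lo, hi], if any
        if hi is None or cand <= hi:
            best = min(best, cand)
    return best

# Illustrative three-point space with d(x, y) = |pos[x] - pos[y]| (assumption).
pos = {"x0": Fraction(0), "xg": Fraction(1, 4), "x1": Fraction(1)}
states = list(pos)
d = lambda x, y: abs(pos[x] - pos[y])
nu_g = {"x1": Fraction(1, 4), "xg": Fraction(3, 4)}
nu_0 = {"x0": Fraction(1)}

print(lp_distance(nu_g, nu_0, d, states))
```

Note that the returned value is an infimum and need not itself satisfy the inequalities, because of the strict inequality in the definition of $X^{d}_{\varepsilon}$. On this toy space the sketch also returns $d(x,y)$ for two Dirac distributions, in line with Equation (5).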

In the following example we define simple probability measures that help illustrate the need for the extra “ $+\,\varepsilon$ ” in this definition and the need for the “ $\varepsilon$ ball around $X$ ”. At first sight, $\delta^{d}_{LP}$ looks very much like the total variation distance, but this “ $\varepsilon$ ball around $X$ ” makes it very different.

Example 8.

Consider the set of states $S=\{x_{\gamma}\mid\gamma\in[0,1]\}$ and a distance on these states given by $d(x_{\gamma},x_{\xi})=|\gamma-\xi|$ , for $\gamma,\xi\in[0,1]$ . In a probabilistic transition system, one could imagine that $x_{\gamma}$ has an $a$ -loop to itself with probability $1-\gamma$ , $\gamma\in[0,1]$ , as depicted below (adding transitions out of the states to help see their differences as processes in a transition system). We now define a family of distributions on these states, and we picture them below as states with an outgoing transition without label (one could imagine an $a$ -label). Let $\nu_{\gamma}=\gamma 1_{x_{1}}+(1-\gamma)1_{x_{\gamma}}$ , for $\gamma\in[0,\frac{1}{2})$ , with $1_{x}$ the Dirac measure on $x$ . The distributions $\nu_{\gamma}$ and $\nu_{0}$ are illustrated below. For a fixed $\gamma$ , these two distributions are actually non-zero on a three-state space $S^{\prime}=\{x_{0},x_{1},x_{\gamma}\}$ .

We show that $\delta^{d}_{LP}(\nu_{\gamma},\nu_{0})=\gamma$ . Let $1-\gamma>\varepsilon>\gamma$ . We show that all such $\varepsilon$ satisfy the inequalities in the definition of $\delta^{d}_{LP}$ and any $\varepsilon\leq\gamma$ does not, so the distance, being the infimum over all such values, is indeed $\gamma$ . In particular, we need to check that for all $X\subseteq S$ , $\nu_{\gamma}(X)\leq\nu_{0}(X_{\varepsilon}^{d})+\varepsilon$ and $\nu_{0}(X)\leq\nu_{\gamma}(X_{\varepsilon}^{d})+\varepsilon$ . The cases $X=S$ and $X=\emptyset$ are trivial, since the distributions are full. For the sets $X$ of size two, we have
$\begin{array}{lllll}
\mbox{For }X:=\{x_{0},x_{1}\}:&\nu_{\gamma}(X)=\gamma&\!\!\leq 1+\varepsilon&=\nu_{0}(S)+\varepsilon&=\nu_{0}(X^{d}_{\varepsilon})+\varepsilon\\
&\nu_{0}(X)=1&\!\!\leq 1+\varepsilon&=\nu_{\gamma}(S)+\varepsilon&=\nu_{\gamma}(X^{d}_{\varepsilon})+\varepsilon\\
\mbox{For }X:=\{x_{1},x_{\gamma}\}:&\nu_{\gamma}(X)=1&\!\!\leq 1+\varepsilon&=\nu_{0}(S)+\varepsilon&=\nu_{0}(X^{d}_{\varepsilon})+\varepsilon\\
&\nu_{0}(X)=0&\!\!\leq 1+\varepsilon&=\nu_{\gamma}(S)+\varepsilon&=\nu_{\gamma}(X^{d}_{\varepsilon})+\varepsilon\\
\mbox{For }X:=\{x_{0},x_{\gamma}\}:&\nu_{\gamma}(X)=1-\gamma&\!\!\leq 1+\varepsilon&=\nu_{0}(\{x_{0},x_{\gamma}\})+\varepsilon&=\nu_{0}(X^{d}_{\varepsilon})+\varepsilon\\
&\nu_{0}(X)=1&\!\!\leq(1-\gamma)+\varepsilon&=\nu_{\gamma}(\{x_{0},x_{\gamma}\})+\varepsilon&=\nu_{\gamma}(X^{d}_{\varepsilon})+\varepsilon
\end{array}$

The last inequality is satisfied thanks to the room given by the “ $+\varepsilon$ ”. For sets of size one, omitting the checks where the probability is zero on the left-hand side of the inequality:

\begin{array}{lllll}
\mbox{For }X:=\{x_{1}\}:&\nu_{\gamma}(X)=\gamma&\!\!\leq 0+\varepsilon&=\nu_{0}(\{x_{1}\})+\varepsilon&=\nu_{0}(X^{d}_{\varepsilon})+\varepsilon\\
\mbox{For }X:=\{x_{\gamma}\}:&\nu_{\gamma}(X)=1-\gamma&\!\!\leq 1+\varepsilon&=\nu_{0}(\{x_{\gamma},x_{0}\})+\varepsilon&=\nu_{0}(X^{d}_{\varepsilon})+\varepsilon\\
\mbox{For }X:=\{x_{0}\}:&\nu_{0}(X)=1&\!\!\leq 1-\gamma+\varepsilon&=\nu_{\gamma}(\{x_{\gamma},x_{0}\})+\varepsilon&=\nu_{\gamma}(X^{d}_{\varepsilon})+\varepsilon.
\end{array}

In the second-to-last inequality, it is crucial that $x_{\gamma}$ be within $\varepsilon$ of $x_{0}$ , which includes $x_{0}$ in the $\varepsilon$ ball around $x_{\gamma}$ . With $X$ in place of $X^{d}_{\varepsilon}$ , the inequality would not be satisfied, and not even with $X^{d}_{\gamma/2}$ , a set where states at distance zero are included (like bisimilar states – of which there are none here), because we would obtain:

\nu_{\gamma}(\{x_{\gamma}\})=1-\gamma\not\leq 0+\varepsilon=\nu_{0}(\{x_{\gamma}\}^{d}_{\gamma/2})+\varepsilon.

The last inequality also illustrates this, and also the need for the “ $+\varepsilon$ ”, as the $\varepsilon$ -ball itself is not enough for the inequality to be satisfied. Moreover, for a pair of distributions $\nu_{\gamma},\nu_{\xi}$ in our family one can similarly prove that $\delta^{d}_{LP}(\nu_{\gamma},\nu_{\xi})=|\gamma-\xi|$ .

Finally, we note that for $\varepsilon\leq\gamma$ we have $X_{\varepsilon}^{d}=X$ for any $X\subseteq S$ and hence, e.g., for $X=\{x_{\gamma}\}$ , as already noted above we have $\nu_{\gamma}(X)=1-\gamma\not\leq 0+\varepsilon=\nu_{0}(X^{d}_{\varepsilon})+\varepsilon$ as $\varepsilon\leq\gamma<\frac{1}{2}$ .
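The checks above can be replayed mechanically. The following is a minimal brute-force sketch, not the authors' algorithm: it enumerates all subsets and uses the fact that the open ball $X^{d}_{\varepsilon}$ only changes when $\varepsilon$ crosses a value taken by $d$. The function names, the concrete value $\gamma=1/5$, and the ambient distances $d(x_{0},x_{\gamma})=\gamma$ and $d(x_{1},\cdot)=1$ are assumptions chosen to match the sets $X^{d}_{\varepsilon}$ used in the tables.

```python
from fractions import Fraction
from itertools import combinations

def subsets(points):
    """Yield every subset of a finite list of points."""
    for r in range(len(points) + 1):
        for combo in combinations(points, r):
            yield set(combo)

def mass(dist, subset):
    """Mass that a (sub)distribution, given as a dict, assigns to a set."""
    return sum((dist.get(x, Fraction(0)) for x in subset), Fraction(0))

def lp_distance(points, d, mu, nu):
    """Brute-force Levy-Prokhorov distance on a finite pseudometric space.

    Balls are open: A_eps = {y | d(x, y) < eps for some x in A}.
    For eps in (t, next_t], with t and next_t consecutive values of d,
    A_eps is the set of points at distance <= t from A.
    """
    thresholds = sorted({d(x, y) for x in points for y in points} | {Fraction(0)})
    for k, t in enumerate(thresholds):
        upper = thresholds[k + 1] if k + 1 < len(thresholds) else None
        needed = Fraction(0)  # smallest eps that works for the balls of this interval
        for a in subsets(points):
            ball = {y for y in points if any(d(x, y) <= t for x in a)}
            needed = max(needed,
                         mass(mu, a) - mass(nu, ball),
                         mass(nu, a) - mass(mu, ball))
        if upper is None or needed <= upper:
            return max(needed, t)  # infimum of the feasible eps in (t, upper]

# Assumed instance of the example: gamma = 1/5.
gamma = Fraction(1, 5)
points = ["x0", "x1", "xg"]

def d(x, y):
    if x == y:
        return Fraction(0)
    if {x, y} == {"x0", "xg"}:
        return gamma        # assumption: x_gamma is at distance gamma from x_0
    return Fraction(1)      # assumption: x_1 is at distance 1 from the others

nu_gamma = {"x1": gamma, "xg": 1 - gamma}
nu_0 = {"x0": Fraction(1)}
distance = lp_distance(points, d, nu_gamma, nu_0)
```

Under these assumptions `distance` evaluates to $\gamma$, in line with $\delta^{d}_{LP}(\nu_{\gamma},\nu_{\xi})=|\gamma-\xi|$ for $\xi=0$. Exact rationals are used so that the comparisons at the threshold $\varepsilon=\gamma$ are not blurred by floating-point rounding.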

One can notice that when viewed as processes, $\nu_{\gamma}$ and $\nu_{0}$ are the same as $s$ and $t$ of Example 3. Technically, it is rather $\tau_{a}(s)$ and $\tau_{a}(t)$ of course.

Once one has seen the Lévy-Prokhorov distance on distributions it seems not surprising that it has some link with the $\varepsilon$ -distance. However, our surprise was the other way around, when we realized that the distance on distributions $\delta^{d}$ that we had to concoct to make $d^{*}$ a fixed point of $\Delta$ in Equation (4) already existed.

3.2 The Lévy-Prokhorov distance on LMPs

We are now ready to define the functional on distances over LMPs that we were looking for. We expect a fixpoint of this functional to be a behavioural distance, but more importantly we expect $d^{*}$ to be the greatest fixpoint of it.

Definition 9.

Let $\langle S,d\rangle$ be a pseudometric space, on which an LMP $\tau:S\to(\mathcal{D}S)^{\mathcal{A}}$ is defined. Let $s_{0}$ and $s_{1}$ be states. We define the functional $\Delta_{LP}$ as

\begin{aligned}
\Delta_{LP}(d)(s_{0},s_{1}) &= \sup_{a\in\mathcal{A}}\delta_{LP}^{d}(\tau_{a}(s_{0}),\tau_{a}(s_{1}))\\
&= \sup_{a\in\mathcal{A}}\inf\{\varepsilon\mid(\forall A\subseteq S:\tau_{a}(s_{i})(A)\leq\tau_{a}(s_{1-i})(A^{d}_{\varepsilon})+\varepsilon,\quad i=0,1)\}.
\end{aligned}

Hence, for a fixed pseudometric space $\langle S,d\rangle$ , $\Delta_{LP}$ applied to $d$ gives a new pseudometric on $S$ . Clearly, this can be seen as a functional on $\mathbf{PMet_{1}}$ , mapping $\langle S,d\rangle$ to $(S,\Delta_{LP}(d))$ .

Before proving that $d^{*}$ is a fixpoint, we show how the supremum over actions can safely be replaced by inserting a universal quantification on actions inside the set, as will be useful in Proposition 15.

Lemma 10.

Let $\langle S,d\rangle$ be a pseudometric space, on which an LMP $\tau:S\to(\mathcal{D}S)^{\mathcal{A}}$ is defined. Let $s_{0}$ and $s_{1}$ be states. Then

\Delta_{LP}(d)(s_{0},s_{1})=\inf\{\varepsilon\mid(\forall a\in\mathcal{A},\forall A\subseteq S:\tau_{a}(s_{i})(A)\leq\tau_{a}(s_{1-i})(A^{d}_{\varepsilon})+\varepsilon,\quad i=0,1)\}.
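A sketch of why the supremum over actions can be moved inside the set, assuming only that the defining condition is monotone in $\varepsilon$ (since $A^{d}_{\varepsilon}\subseteq A^{d}_{\varepsilon'}$ whenever $\varepsilon\leq\varepsilon'$). The name $E_{a}$ is introduced here for illustration and does not appear in the original statement.

```latex
E_a := \{\varepsilon \mid \forall A \subseteq S:\ \tau_a(s_i)(A) \le \tau_a(s_{1-i})(A^d_\varepsilon) + \varepsilon,\ i = 0,1\}
\qquad \text{(up-closed for each } a \in \mathcal{A}\text{)}

\bigcap_{a \in \mathcal{A}} E_a \subseteq E_a \text{ for all } a
\ \Longrightarrow\
\inf \bigcap_{a \in \mathcal{A}} E_a \ \ge\ \sup_{a \in \mathcal{A}} \inf E_a

\varepsilon > \sup_{a \in \mathcal{A}} \inf E_a
\ \Longrightarrow\ \varepsilon \in E_a \text{ for all } a
\ \Longrightarrow\ \inf \bigcap_{a \in \mathcal{A}} E_a \ \le\ \varepsilon
```

Letting $\varepsilon$ decrease to $\sup_{a}\inf E_{a}$ in the last line gives the reverse inequality, hence equality.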

One remarkable property of prefixpoints of this functional is that they define $\varepsilon$ -bisimulations. Of course the theorem is also true for fixpoints.

Theorem 11.

Let $\tau:S\to(\mathcal{D}S)^{\mathcal{A}}$ be an LMP with $\langle S,d\rangle$ a pseudometric space. If $d$ is a prefixpoint of $\Delta_{LP}$ , then for any $\varepsilon>0$ , the relation $R_{\varepsilon}:=\{(s,t)\in S\times S\mid d(s,t)<\varepsilon\}$ is an $\varepsilon$ -bisimulation.

Proof.

Let $d$ be a prefixpoint, i.e., let $\Delta_{LP}(d)\leq d$ . Let $R_{\varepsilon}=\{(s,t)\mid d(s,t)<\varepsilon\}$ . We show it is an $\varepsilon$ -bisimulation. Let $s_{0}R_{\varepsilon}s_{1}$ , i.e., $d(s_{0},s_{1})<\varepsilon$ , and hence also $\Delta_{LP}(d)(s_{0},s_{1})<\varepsilon$ , because $d$ is a prefixpoint. Let $a\in\mathcal{A}$ ; then $\delta_{LP}^{d}(\tau_{a}(s_{0}),\tau_{a}(s_{1}))<\varepsilon$ . Because $\delta_{LP}^{d}$ is an infimum, there is some $\gamma<\varepsilon$ such that for $i=0,1$ and all $X\subseteq S$

\begin{aligned}
\tau_{a}(s_{i})(X) &\leq \tau_{a}(s_{1-i})(X^{d}_{\gamma})+\gamma && \mbox{by definition of }\delta_{LP}\\
&\leq \tau_{a}(s_{1-i})(X^{d}_{\varepsilon})+\varepsilon && \mbox{because }X^{d}_{\gamma}\subseteq X^{d}_{\varepsilon}\mbox{ and }\gamma\leq\varepsilon\\
&= \tau_{a}(s_{1-i})(R_{\varepsilon}(X))+\varepsilon && \mbox{by definition of }R_{\varepsilon}(X).
\end{aligned}

So $R_{\varepsilon}$ is an $\varepsilon$ -bisimulation, as wanted. $\hfill\blacktriangleleft$

The proof of this theorem requires $X_{\varepsilon}^{d}$ to be defined with a strict inequality: this is necessary for the existence of $\gamma$ (and for the last equality).
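The condition reached at the end of the proof of Theorem 11 can be tested by exhaustive enumeration on a finite system. The sketch below assumes that an $\varepsilon$ -bisimulation is a symmetric relation $R$ with $\tau_{a}(s)(X)\leq\tau_{a}(t)(R(X))+\varepsilon$ for all $(s,t)\in R$ , all actions $a$ and all $X\subseteq S$ , as the displayed chain of inequalities suggests. The function names and the three-state system are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def subsets(states):
    """Yield every subset of a finite list of states."""
    for r in range(len(states) + 1):
        for combo in combinations(states, r):
            yield set(combo)

def mass(dist, subset):
    """Mass that a subdistribution, given as a dict, assigns to a set."""
    return sum((dist.get(x, Fraction(0)) for x in subset), Fraction(0))

def is_eps_bisimulation(states, actions, tau, relation, eps):
    """Check tau_a(s)(X) <= tau_a(t)(R(X)) + eps for all (s, t) in R, a and X.

    R(X) is the image {t | (s, t) in R for some s in X}; R must be symmetric,
    so the check covers both directions.
    """
    if any((t, s) not in relation for (s, t) in relation):
        return False
    for (s, t) in relation:
        for a in actions:
            for x in subsets(states):
                image = {v for (u, v) in relation if u in x}
                if mass(tau[a][s], x) > mass(tau[a][t], image) + eps:
                    return False
    return True

# Hypothetical system: s moves to u with probability 1, t with probability 9/10.
states = ["s", "t", "u"]
actions = ["a"]
tau = {"a": {"s": {"u": Fraction(1)},
             "t": {"u": Fraction(9, 10)},
             "u": {}}}
relation = {("s", "t"), ("t", "s"), ("u", "u")}

ok_at_tenth = is_eps_bisimulation(states, actions, tau, relation, Fraction(1, 10))
ok_at_twentieth = is_eps_bisimulation(states, actions, tau, relation, Fraction(1, 20))
```

On this system the relation passes the check for $\varepsilon=1/10$ and fails it for $\varepsilon=1/20$ , because $s$ and $t$ differ by exactly $1/10$ on the set $\{u\}$ . The enumeration is exponential in the number of states and is meant only as an illustration.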

This theorem is nice in itself but it also has an important corollary, that $d^{*}$ is greater than or equal to all fixpoints of $\Delta_{LP}$ .

Corollary 12.

$d^{*}$ is greater than or equal to all fixpoints, i.e.,

\Delta_{LP}(d)=d\quad\Longrightarrow\quad d\sqsubseteq d^{*}.

Proof.

Let $d$ be a fixpoint of $\Delta_{LP}$ and let $d(s,t)=\varepsilon$ . By the previous theorem, $s\sim_{\gamma}t$ for all $\gamma>\varepsilon$ . Since $d^{*}(s,t)=\inf\{e\mid s\sim_{e}t\}$ , we obtain $d^{*}(s,t)\leq\varepsilon=d(s,t)$ , as wanted. $\hfill\blacktriangleleft$

With the next proposition, we provide another example of a fixpoint of $\Delta_{LP}$ : the exact bisimilarity distance, which we denote by $d_{\sim}$ . It is equal to 0 if the states are bisimilar, otherwise it is 1.

Proposition 13.

The exact bisimilarity distance $d_{\sim}$ is a fixpoint of $\Delta_{LP}$ .

Proof.

We first prove that bisimilar states $s, t$ satisfy $\Delta_{LP}(d_{\sim})(s,t)=0$ . Consider $R$ to be bisimilarity. Because $R$ is a bisimulation, we have that for all $s_{0}Rs_{1}$ , $a\in\mathcal{A}$ and $X\subseteq S$ ,

\tau_{a}(s_{0})(X)\leq\tau_{a}(s_{1})(R(X))=\tau_{a}(s_{1})(X^{d_{\sim}}_{\gamma}),

for any $\gamma\in(0,1)$ , because $X^{d_{\sim}}_{\gamma}$ is just the smallest $R$ -closed set that contains $X$ . A symmetric argument applies with 0 and 1 interchanged. Consequently, $\Delta_{LP}(d_{\sim})(s_{0},s_{1})=0$ , as wanted.

Conversely, if $\Delta_{LP}(d_{\sim})(s_{0},s_{1})=0$ , then there is a sequence of $\gamma_{n}\in(0,1)$ that converges to zero, for which we have

\tau_{a}(s_{0})(X)\leq\tau_{a}(s_{1})(X^{d_{\sim}}_{\gamma_{n}})+\gamma_{n}=\tau_{a}(s_{1})(R(X))+\gamma_{n}.

Since the property is monotone (recall also (6)), for all $\gamma\in(0,1)$ we have

\tau_{a}(s_{0})(X)\leq\tau_{a}(s_{1})(X^{d_{\sim}}_{\gamma})+\gamma=\tau_{a}(s_{1})(R(X))+\gamma.

So $\tau_{a}(s_{0})(X)\leq\tau_{a}(s_{1})(R(X))$ , as wanted. $\hfill\blacktriangleleft$

Proposition 14.

$\Delta_{LP}$ is monotonic.

We can now prove that the distance defined by $\varepsilon$ -bisimulation is a fixpoint of $\Delta_{LP}$ .

Proposition 15.

$d^{*}$ is a fixpoint of $\Delta_{LP}$ , and hence it is the greatest fixpoint.

At this point we remark again that even if it seems not surprising that the Lévy-Prokhorov metric on distributions has some link with the $\varepsilon$ -distance, going from the basic metric on distributions $\delta_{LP}$ to a meaningful distance on LMPs is not direct (one has to find the fixed-point properties).

3.3 The Lévy-Prokhorov lifting of the subdistribution functor to $\mathbf{PMet_{1}}$

Using the LP distance lifting, we define a functor $\mathcal{D}_{LP}$ on $\mathbf{PMet_{1}}$ , the category of 1-bounded pseudometric spaces with nonexpansive functions, as follows: On objects $\langle X,d\rangle$ , we have

\mathcal{D}_{LP}\langle X,d\rangle=(\mathcal{D}X,\delta^{d}_{LP})

and on morphisms $f\colon\langle X_{0},d_{0}\rangle\to\langle X_{1},d_{1}\rangle$ we set $\mathcal{D}_{LP}f=\mathcal{D}f$ , that is

\mathcal{D}_{LP}f\colon\mathcal{D}_{LP}\langle X_{0},d_{0}\rangle\to\mathcal{D}_{LP}\langle X_{1},d_{1}\rangle\textrm{ with }\varphi\mapsto\lambda y.\,\varphi(f^{-1}(\{y\})).
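On finite supports, the action of $\mathcal{D}_{LP}f$ on a subdistribution is the usual pushforward. A minimal sketch, with subdistributions represented as dictionaries; the function name `pushforward` and the sample data are illustrative only.

```python
from fractions import Fraction

def pushforward(f, phi):
    """Compute (D f)(phi)(y) = phi(f^{-1}({y})) for a finitely supported phi."""
    result = {}
    for x, p in phi.items():
        y = f(x)
        # Every point x mapped to y contributes its mass to y.
        result[y] = result.get(y, Fraction(0)) + p
    return result

# Hypothetical subdistribution on {1, 2, 3} with total mass 7/8.
phi = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8)}
parity_image = pushforward(lambda x: x % 2, phi)
```

The total mass is preserved, so a subdistribution is mapped to a subdistribution of the same mass.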

Proposition 16.

$\mathcal{D}_{LP}$ is a functor on $\mathbf{PMet_{1}}$ .

Just as for the Kantorovich lifting of the distribution functor, we can prove that the Lévy-Prokhorov lifting of the distribution functor is locally nonexpansive, as stated in the next proposition. To formulate this, note that given pseudometric spaces $\langle X,d_{X}\rangle$ and $\langle Y,d_{Y}\rangle$ in $\mathbf{PMet_{1}}$ , the hom-set $X\to Y$ of all nonexpansive maps from $X$ to $Y$ carries a metric defined by

d_{X\to Y}(f_{1},f_{2})=\sup_{x\in X}d_{Y}(f_{1}(x),f_{2}(x)).

Proposition 17.

The functor $\mathcal{D}_{LP}$ is locally nonexpansive, that is, for $f_{1},f_{2}\in X\to Y$

\delta_{\mathcal{D}_{LP}X\to\mathcal{D}_{LP}Y}(\mathcal{D}_{LP}f_{1},\mathcal{D}_{LP}f_{2})\leq d_{X\to Y}(f_{1},f_{2})

with $\delta_{\mathcal{D}_{LP}X\to\mathcal{D}_{LP}Y}$ being the metric on the hom-set $\mathcal{D}_{LP}\langle X,d_{X}\rangle\to\mathcal{D}_{LP}\langle Y,d_{Y}\rangle$ .

For the following property, recall that a map $f\colon X\to Y$ for (pseudo)metric spaces $\langle X,d_{X}\rangle$ and $\langle Y,d_{Y}\rangle$ is an isometry if and only if for all $x,y\in X$ , $d_{Y}(f(x),f(y))=d_{X}(x,y)$ .

Proposition 18.

The functor $\mathcal{D}_{LP}$ preserves isometries, i.e., if $f\colon X\to Y$ is an isometry for (pseudo)metric spaces $\langle X,d_{X}\rangle$ and $\langle Y,d_{Y}\rangle$ , then so is $\mathcal{D}_{LP}f$ .

As a consequence, from the results of [31], we obtain that the lifting functor $\mathcal{D}_{LP}$ is Kantorovich, i.e., it has codensity lifting which means that there is a set of predicate liftings of type $\mathcal{D}[0,1]\to[0,1]$ such that codensity lifting gives rise to $\mathcal{D}_{LP}$ . This has already been remarked in [31, Example 5.5.2]. However, the exact set of predicate liftings needed to give rise to the LP-lifting is not provided.

3.4 The lifted functor $\mathcal{D}_{LP}$ is not a monad lifting of $\mathcal{D}$

Behavioural distances have been axiomatized within the line of work on quantitative equational theories [39]. It was therefore a natural question for us whether the $\varepsilon$ -distance, or the Lévy-Prokhorov distance itself can be given a quantitative axiomatization. This is what we briefly investigate in this section.

Mio et al. [43, Lemma 7.2, Theorem 7.7(2)] have shown that:

$\blacksquare$ If a monad on metric spaces is axiomatizable, then (just by being a monad on metric spaces) it has nonexpansive unit and multiplication.

$\blacksquare$ A monad on metric spaces that is a lifting of a monad on $\mathbf{Sets}$ is axiomatizable with a quantitative theory. Being a lifting means that it acts on objects and arrows in metric spaces in the same way it does in $\mathbf{Sets}$ and the unit and multiplication are nonexpansive.

The Kantorovich lifting of $\mathcal{D}$ has been axiomatized, and the total variation distance has been shown non-axiomatizable (as the unit is not nonexpansive). In the case of $\mathcal{D}_{LP}$ we can see that the unit is nonexpansive, but the multiplication is not, and hence $\mathcal{D}_{LP}$ is not a lifting of the monad $\mathcal{D}$ to $\mathbf{PMet_{1}}$ and the Lévy-Prokhorov distance on distributions is not axiomatizable with a quantitative theory, at least not with the standard multiplication.

Recall the definitions of $\eta$ and $\mu$ for the subdistribution monad $\mathcal{D}$ :

\eta_{X}(x)=1_{x},\text{ the Dirac distribution at }x;\text{ and }\mu_{X}(\Phi)(x)=\sum_{\varphi}\Phi(\varphi)\cdot\varphi(x).

Lemma 19.

The unit $\eta$ of $\mathcal{D}$ is nonexpansive with respect to the Lévy-Prokhorov distance. Moreover, it is an isometry, i.e., $\delta_{LP}^{d}(1_{x},1_{y})=d(x,y)$ .
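A sketch of why the isometry holds, assuming that $d$ is 1-bounded and that $A^{d}_{\varepsilon}$ is defined with a strict inequality, as elsewhere in this section.

```latex
\varepsilon > d(x,y):\quad
x \in A \ \Rightarrow\ y \in A^d_\varepsilon
\ \Rightarrow\ 1_x(A) = 1 \le 1 + \varepsilon = 1_y(A^d_\varepsilon) + \varepsilon,
\qquad x \notin A \ \Rightarrow\ 1_x(A) = 0

\varepsilon \le d(x,y),\ \varepsilon < 1:\quad
A = \{x\} \ \Rightarrow\ y \notin A^d_\varepsilon
\ \Rightarrow\ 1_x(A) = 1 \not\le 0 + \varepsilon = 1_y(A^d_\varepsilon) + \varepsilon

\text{hence}\quad \delta^d_{LP}(1_x, 1_y) = d(x,y)
```

The first line, together with its symmetric version, shows that every $\varepsilon>d(x,y)$ is admissible; the second shows that no smaller $\varepsilon<1$ is.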

The unit of the distribution monad is also an isometry with respect to the Kantorovich distance (without discount), as is easy to prove using the duality with the Wasserstein distance. However, the multiplication $\mu$ of the monad $\mathcal{D}$ is not nonexpansive, as the following example shows.

Example 20.

Consider $a$ , $b$ and $c$ in the figure below as distributions over $\{\bot,\bullet\}$ .

The ambient distance is $d(\bot,\bullet)=1$ ; we omit it from the notation and write $\delta_{LP}$ for $\delta_{LP}^{d}$ . Consider the distributions $\varphi:=1_{a}$ and $\psi:=(1-\varepsilon)1_{b}+\varepsilon 1_{c}$ over these distributions, as pictured. We collect several useful facts:

$\blacksquare$ $\delta_{LP}(a,b)=\varepsilon$ and $\delta_{LP}^{\delta_{LP}}(\varphi,\psi)=\varepsilon$ .

$\blacksquare$ $\mu\varphi=a$ and $\mu\psi=(1-\varepsilon)^{2}\bot+(1-\varepsilon)\varepsilon\bullet+\varepsilon\bullet=(1-\varepsilon)^{2}\bot+(2-\varepsilon)\varepsilon\bullet$ .

$\blacksquare$ $\{\bot\}_{\gamma}=\{\bot\}$ for any $\gamma\leq 1$ (since $d(\bot,\bullet)=1$ ) and hence $\mu\varphi(\{\bot\})=1$ and $\mu\psi(\{\bot\}_{\gamma})=\mu\psi(\{\bot\})=(1-\varepsilon)^{2}$ .

For $\gamma<\varepsilon(2-\varepsilon)$ , we have $1\not\leq(1-\varepsilon)^{2}+\gamma$ which therefore yields

\mu\varphi(\{\bot\})=1\not\leq(1-\varepsilon)^{2}+\gamma=\mu\psi(\{\bot\}_{\gamma})+\gamma.

This shows that $\delta_{LP}(\mu\varphi,\mu\psi)\geq\varepsilon(2-\varepsilon)>\varepsilon=\delta_{LP}^{\delta_{LP}}(\varphi,\psi)$ and hence $\mu$ is not nonexpansive.
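The arithmetic of Example 20 can be replayed for a concrete value of $\varepsilon$ . In the sketch below, the shapes $a=1_{\bot}$ , $b=(1-\varepsilon)\bot+\varepsilon\bullet$ and $c=1_{\bullet}$ are inferred from the stated values of $\mu\varphi$ and $\mu\psi$ , since the figure itself is not reproduced here; the choice $\varepsilon=1/10$ and all identifiers are illustrative.

```python
from fractions import Fraction

def multiply(big_phi):
    """mu(Phi)(x) = sum over phi of Phi(phi) * phi(x).

    Phi is given as a list of (weight, distribution) pairs.
    """
    result = {}
    for weight, phi in big_phi:
        for x, p in phi.items():
            result[x] = result.get(x, Fraction(0)) + weight * p
    return result

eps = Fraction(1, 10)          # illustrative value
BOT, DOT = "bot", "dot"

# Shapes inferred from the values of mu(phi) and mu(psi) given in the example.
a = {BOT: Fraction(1)}
b = {BOT: 1 - eps, DOT: eps}
c = {DOT: Fraction(1)}

phi = [(Fraction(1), a)]       # phi = 1_a
psi = [(1 - eps, b), (eps, c)] # psi = (1 - eps) 1_b + eps 1_c

mu_phi = multiply(phi)
mu_psi = multiply(psi)

# With d(bot, dot) = 1, the open ball around {bot} is {bot} for every gamma <= 1,
# so any admissible gamma must be at least this gap.
gap = mu_phi[BOT] - mu_psi[BOT]
```

For $\varepsilon=1/10$ this gives $\mu\psi(\{\bot\})=81/100$ and a gap of $19/100=\varepsilon(2-\varepsilon)$ , which is strictly larger than $\varepsilon$ .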

This result is not surprising, as the binding operator $\mu$ really is a multiplication, and so it accords well with the Kantorovich metric, which does multiply the probabilities of distributions in play. We do not know yet if another operator could help express the functor $\mathcal{D}_{LP}$ as a monad lifting.

4 $\varepsilon$ -(Bi)simulations, coalgebraically

The Kantorovich behavioural distance has a coalgebraic characterization via a final coalgebra semantics. Whether the $\varepsilon$ -distance is obtained by finality is still open, but the results on the Kantorovich distance do not seem to directly apply here. However, as the distance is defined via $\varepsilon$ -bisimulations and bisimilarity, it is natural to see whether a coalgebraic semantics arising from generalizing coalgebraic bisimulations and bisimilarity is in place. The answer to this question is positive and we present the necessary observations in this section. An abstract (in this case coalgebraic) characterization allows for, on the theoretical side, deeper and clearer understanding of the notion under study, and, on the practical side, generalizations. We could use this generality to define notions of $\varepsilon$ -bisimulations for other types of (probabilistic) systems.

In a nutshell, in this section we show that $\varepsilon$ -simulation and $\varepsilon$ -bisimulation have a span-diagram characterization, in a way similar to coalgebraic simulation [34, 32, 33] and Aczel-Mendler coalgebraic bisimulation [3, 35, 47]. The development uses a notion of $\varepsilon$ -coupling defined in [54]. For simplicity, we ignore the labels in this section, i.e., we assume there is a single label.

To start with, we recall the basic notions related to coalgebraic (bi)simulation, formulated for the functor $\mathcal{D}$ , which is our main interest in this work. Recall the definition of the sub-distribution functor $\mathcal{D}$ on $\mathbf{Sets}$ : It maps a set $X$ to $\mathcal{D}X$ , the set of discrete subdistributions on $X$ , and on arrows $f\colon X\to Y$ , $\mathcal{D}f\colon\mathcal{D}X\to\mathcal{D}Y$ is $\mathcal{D}f(\varphi)(y)=\varphi(f^{-1}(\{y\}))$ .

Coalgebras of the functor $\mathcal{D}$ are Markov chains, formally they are pairs $(X,c)$ of a carrier set $X$ and transition map $c\colon X\to\mathcal{D}X$ . We often just refer to the coalgebra by the transition map $c$ . Given two such coalgebras, $c\colon X\to\mathcal{D}X$ and $d\colon Y\to\mathcal{D}Y$ a coalgebra homomorphism from $(X,c)$ to $(Y,d)$ is a map $h\colon X\to Y$ making the left diagram below commute.

Definition 21.

Given two coalgebras $c\colon X\to\mathcal{D}X$ and $d\colon Y\to\mathcal{D}Y$ , a coalgebraic bisimulation is a relation $R\subseteq X\times Y$ such that there exists a coalgebra structure $b\colon R\to\mathcal{D}R$ making the two projections $\pi_{1}\colon R\to X$ and $\pi_{2}\colon R\to Y$ coalgebra homomorphisms, i.e., making the right span diagram above commute.

It is well known ([20, 49]) that this corresponds to the standard notion of probabilistic bisimulation ([38]). For the case of continuous distributions, taking a cospan instead of a span avoids the need for analytic spaces [17]. Back to the discrete case, one way to define the coalgebra structure $b$ is using the notion of a coupling: If for all $(x,y)\in R$ , there is a coupling $\beta$ for $\mu=c(x)$ and $\nu=d(y)$ , then setting $b(x,y)=\beta$ provides the needed transition structure. Recall that $\beta\in\mathcal{D}R$ is a coupling of $\mu\in\mathcal{D}X$ and $\nu\in\mathcal{D}Y$ if its marginals are $\mu$ and $\nu$ , respectively:

\sum_{y\in Y}\beta(x,y)=\mu(x),\quad\sum_{x\in X}\beta(x,y)=\nu(y).

Note that this definition of a coupling, fitting the definition of coalgebraic bisimulation for the subprobability distribution functor, ensures that a coupling of two subdistributions exists only if they have the same total mass. One could also define couplings by adding a dummy element and hence viewing a subdistribution as a full distribution, which provides a more general notion. However, these details are not relevant for what we are really interested in, which are $\varepsilon$ -bisimulations.

We will use a relaxed notion of $\varepsilon$ -coupling [54] to define $\varepsilon$ -(bi)simulation in analogy with the coalgebraic definition above. Before we recall those, let us first mention coalgebraic simulations, due to [34, 32, 33] with a coalgebraic formulation for LMPs already in [21]. For this, note that an ordered functor $F$ is a functor with an order on each object $F X$ . The original definition in [34] uses preorders as a minimal requirement, and also implies that for any map $f\colon X\to Y$ , $F f$ preserves the order, i.e., is monotone – a condition that is not used in our results. The subdistribution functor $\mathcal{D}$ is ordered, e.g., by pointwise order. This order becomes trivial in case of the distribution functor, which is the reason why we work with subdistributions here. Coalgebraic simulation can be defined for ordered functors; here we recall the definition in the special case of the subdistribution functor $\mathcal{D}$ . All notions involve lax and oplax morphisms.

Definition 22.

A lax homomorphism from $(X,c)$ to $(Y,d)$ is a morphism $l\colon X\to Y$ with the property that $d\circ l\sqsubseteq\mathcal{D}l\circ c$ , i.e., it makes the left lax diagram below commute. An oplax homomorphism from $(X,c)$ to $(Y,d)$ is a lax homomorphism for the dual/opposite order, that is a morphism $o\colon X\to Y$ that makes the middle diagram below commute:

The order on distributions is defined pointwise: For a set $A$ and $\varphi,\psi\in\mathcal{D}A$ , we have $\varphi\sqsubseteq\psi$ iff for all $a\in A$ , $\varphi(a)\leq\psi(a)$ .

Note that, equivalently, $\varphi\sqsubseteq\psi$ iff for all $B\subseteq A$ , $\varphi(B)\leq\psi(B)$ , where the right-to-left implication follows from instantiating on singleton subsets and the left-to-right implication follows from the additivity of distributions.

Initially, generic coalgebraic simulations have been studied for ordered functors, by Hughes and Jacobs [34], using a notion of lax relation lifting. On the other hand, Hasuo [32, 33] discovered (generalizations of) forward, backward, and hybrid simulations in Kleisli categories and moreover showed in [33] that Hughes-Jacobs simulations are a special case. While Kleisli categories are appealing for soundness results, and monads suitable for traces come equipped with an order, we may safely ignore all Kleisli aspects here. We now recall the following notions of simulations [32], formulated for general categories of coalgebras (as long as the functor comes equipped with an order) where they amount to different names for lax/oplax morphisms, for the sake of making the connection to [32] explicit.

Definition 23.

A forward simulation from $(X,c)$ to $(Y,d)$ is a lax homomorphism from $(Y,d)$ to $(X,c)$ . A backward simulation from $(X,c)$ to $(Y,d)$ is an oplax homomorphism. Combining the lax-commuting boxes leads to hybrid, forward-backward and backward-forward, simulations. In particular, Hughes-Jacobs simulation is a forward-backward simulation: It is a relation $R$ on which there exists a coalgebra structure $b\colon R\to\mathcal{D}R$ making $\pi_{1}$ a forward simulation from $(X,c)$ to $(R,b)$ and $\pi_{2}$ a backward simulation from $(R,b)$ to $(Y,d)$ , depicted in the right diagram above.

Clearly, a symmetric simulation on a single system is a bisimulation.

In the rest of this section, we will show that $\varepsilon$ -(bi)simulations can be depicted similarly. For this we will need bounded-lax-commutativity of morphisms as well as a notion of $\varepsilon$ -coupling (which then directly gives an $\varepsilon$ -relation-lifting). We will focus here on the functor $\mathcal{D}$ only. These notions can be generalized to generic coalgebras for functors with suitable structure. However, those observations are beyond the scope of this paper and we leave them for future work.

Definition 24.

An $\varepsilon$ -lax homomorphism from $(X,c)$ to $(Y,d)$ is a morphism $l\colon X\to Y$ that makes the left $\varepsilon$ -lax diagram below commute, i.e., $d\circ l\sqsubseteq_{\varepsilon}\mathcal{D}l\circ c$ . An $\varepsilon$ -oplax homomorphism from $(X,c)$ to $(Y,d)$ is a morphism $o\colon X\to Y$ with the property that $\mathcal{D}o\circ c\sqsubseteq_{\varepsilon}d\circ o$ , i.e., it makes the right $\varepsilon$ -lax diagram below commute.

The $\varepsilon$ -order on distributions is defined by: For a set $A$ and $\varphi,\psi\in\mathcal{D}A$ , we have $\varphi\sqsubseteq_{\varepsilon}\psi$ iff for all $B\subseteq A$ , $\varphi(B)\leq\psi(B)+\varepsilon$ .

$\blacktriangleright$ Remark 25.

Note that $\sqsubseteq_{\varepsilon}$ is not an order relation: it is not transitive. Note also that the definition of $\sqsubseteq_{\varepsilon}$ cannot be expressed on elements (of $A$ ) only, as it is stronger than the property: for all $a\in A$ , $\varphi(a)\leq\psi(a)+\varepsilon$ . That is, the role of $\varepsilon$ here, just like in the definition of $\varepsilon$ -(bi)simulation, is global.
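Both observations of Remark 25 can be witnessed on a two-point set. The following sketch checks the set-based definition of $\sqsubseteq_{\varepsilon}$ by enumeration and compares it with the weaker elementwise condition; the function names and the sample subdistributions are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def mass(dist, subset):
    """Mass that a subdistribution, given as a dict, assigns to a set."""
    return sum((dist.get(x, Fraction(0)) for x in subset), Fraction(0))

def eps_below(phi, psi, eps, points):
    """Set-based relation: phi(B) <= psi(B) + eps for every subset B of points."""
    return all(
        mass(phi, combo) <= mass(psi, combo) + eps
        for r in range(len(points) + 1)
        for combo in combinations(points, r)
    )

def pointwise_eps_below(phi, psi, eps, points):
    """Weaker elementwise condition: phi(a) <= psi(a) + eps for every point a."""
    return all(mass(phi, [x]) <= mass(psi, [x]) + eps for x in points)

points = ["a", "b"]
half = Fraction(1, 2)
zero_dist = {}                       # the subdistribution of total mass 0

# Elementwise condition holds, set-based condition fails on B = {a, b}.
spread = {"a": half, "b": half}
pointwise_only = (pointwise_eps_below(spread, zero_dist, half, points),
                  eps_below(spread, zero_dist, half, points))

# Failure of transitivity: top is below mid, mid is below zero_dist,
# but top is not below zero_dist.
top = {"a": Fraction(1)}
mid = {"a": half}
chain = (eps_below(top, mid, half, points),
         eps_below(mid, zero_dist, half, points),
         eps_below(top, zero_dist, half, points))
```

In the second example the two slacks of $1/2$ add up to $1$ , which is why the composite relation fails at $\varepsilon=1/2$ .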

We next recall the notion of $\varepsilon$ -coupling from [54], where it is called an $\varepsilon$ -weight function.

Definition 26.

Let $R\subseteq X\times Y$ . A distribution $\beta\in\mathcal{D}R$ is an $\varepsilon$ -coupling for $\mu\in\mathcal{D}X$ and $\nu\in\mathcal{D}Y$ iff the following three conditions hold:

1. $\sum_{y\in Y}\beta(x,y)\leq\mu(x)$ , for all $x\in X$
2. $\sum_{x\in X}\beta(x,y)\leq\nu(y)$ , for all $y\in Y$
3. $\mu(X)\leq\sum_{x\in X,y\in Y}\beta(x,y)+\varepsilon=\sum_{(x,y)\in R}\beta(x,y)+\varepsilon$ .

To be fully precise, we should consider $\beta\in\mathcal{D}(X\times Y)$ with support contained in $R$ . We write $\beta\in\mathcal{D}R$ and yet sometimes sum over all $x\in X,y\in Y$ , i.e., we identify $\beta\in\mathcal{D}R$ with a distribution in $\mathcal{D}(X\times Y)$ that assigns probability $0$ to all pairs out of $R$ and acts as $\beta$ on $R$ .
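The three conditions of Definition 26 translate directly into a finite check. The sketch below takes $\beta$ as a dictionary on pairs and treats it as a distribution on $X\times Y$ whose support must lie in $R$ ; the function name and the sample data are illustrative assumptions.

```python
from fractions import Fraction

def is_eps_coupling(beta, mu, nu, relation, eps):
    """Check Conditions 1-3 of an eps-coupling for finitely supported data."""
    # beta must put no mass outside the relation R.
    if any(p != 0 and pair not in relation for pair, p in beta.items()):
        return False
    left, right = {}, {}
    for (x, y), p in beta.items():
        left[x] = left.get(x, Fraction(0)) + p     # first marginal of beta
        right[y] = right.get(y, Fraction(0)) + p   # second marginal of beta
    cond1 = all(p <= mu.get(x, Fraction(0)) for x, p in left.items())
    cond2 = all(p <= nu.get(y, Fraction(0)) for y, p in right.items())
    # Condition 3: the total mass of mu exceeds that of beta by at most eps.
    cond3 = sum(mu.values(), Fraction(0)) <= sum(beta.values(), Fraction(0)) + eps
    return cond1 and cond2 and cond3

# Hypothetical data: nu misses 1/10 of the mass that mu sends to x2.
mu = {"x1": Fraction(1, 2), "x2": Fraction(1, 2)}
nu = {"y1": Fraction(1, 2), "y2": Fraction(2, 5)}
relation = {("x1", "y1"), ("x2", "y2")}
beta = {("x1", "y1"): Fraction(1, 2), ("x2", "y2"): Fraction(2, 5)}

coupling_at_tenth = is_eps_coupling(beta, mu, nu, relation, Fraction(1, 10))
coupling_at_twentieth = is_eps_coupling(beta, mu, nu, relation, Fraction(1, 20))
```

Here $\beta$ is a $1/10$ -coupling but not a $1/20$ -coupling, because its total mass is $9/10$ while $\mu(X)=1$ .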

Before we proceed with the main observation of this section, we prove an auxiliary property that allows for rewriting the definition of $\varepsilon$ -coupling.

Lemma 27.

Assume that Condition 1. in Definition 26 holds for $\beta\in\mathcal{D}R$ with $R\subseteq X\times Y$ and $\mu\in\mathcal{D}X$ . Then $\mu(X)\leq\sum_{x\in X,y\in Y}\beta(x,y)+\varepsilon$ (i.e., Condition 3. in Definition 26) is equivalent to the condition

\mu(S)\leq\sum_{x\in S,y\in Y}\beta(x,y)+\varepsilon,\quad\text{for all }S\subseteq X.

This property again emphasises the global nature of $\varepsilon$ in our situation. As a consequence, for an $\varepsilon$ -coupling $\beta\in\mathcal{D}R$ of $\mu\in\mathcal{D}X$ and $\nu\in\mathcal{D}Y$ , with $R\subseteq X\times Y$ , for each $S\subseteq X$ :

\sum_{x\in S,y\in Y}\beta(x,y)\,\,\leq\,\,\mu(S)\,\,\leq\,\,\sum_{x\in S,y\in Y}\beta(x,y)+\varepsilon.
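The nontrivial direction of Lemma 27 can be sketched as follows, using only Condition 1 (summed over $x\in X\setminus S$ ) and Condition 3 of Definition 26; the converse is the instance $S=X$ .

```latex
\begin{aligned}
\mu(S) &= \mu(X) - \mu(X \setminus S) \\
&\le \sum_{x \in X,\, y \in Y} \beta(x,y) + \varepsilon - \mu(X \setminus S)
  && \text{by Condition 3} \\
&\le \sum_{x \in X,\, y \in Y} \beta(x,y) + \varepsilon - \sum_{x \in X \setminus S,\, y \in Y} \beta(x,y)
  && \text{by Condition 1 summed over } X \setminus S \\
&= \sum_{x \in S,\, y \in Y} \beta(x,y) + \varepsilon .
\end{aligned}
```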

Proposition 28.

Let $(X,c)$ and $(Y,d)$ be two $\mathcal{D}$ -coalgebras. The following three properties are equivalent for $R\subseteq X\times Y$ :

1. $R$ is an $\varepsilon$ -simulation.
2. For every $(x,y)\in R$ , there is an $\varepsilon$ -coupling $\beta\in\mathcal{D}R$ of $\mu=c(x)$ and $\nu=d(y)$ .
3. The “ $\varepsilon$ -lax-bounded” span of morphisms on the left below commutes.

Note that Condition 2. in Proposition 28 can be taken as a definition of $\varepsilon$ -relation-lifting, in analogy to relation lifting for $\mathcal{D}$ being defined using the existence of a coupling for any pair of elements in $R$ . This way was also taken in [34] with the definition of lax relation lifting.

As a consequence of Proposition 28, we immediately get that a relation $R\subseteq X\times X$ on the states of a $\mathcal{D}$ -coalgebra is an $\varepsilon$ -bisimulation iff the right diagram above commutes. This diagram resembles the defining diagram of the recently introduced uncertain bisimulations [46], but requires a double property (for $\sqsubseteq_{\varepsilon}$ as well as for $\sqsubseteq^{op}$ ). A detailed comparison with this interesting work remains a future task.

As already mentioned, generalizing the notions of approximate (bi)simulations to generic coalgebras is an interesting direction for future work. A similar notion is being developed for cost automata [4].

5 Concluding remarks

We have shown that $\varepsilon$ -bisimulations are closely connected with the Lévy-Prokhorov metric on (sub)probability distributions: The LP pseudometric yields a lifting of $\mathcal{D}$ to a functor on pseudometric spaces and induces a coinductive behavioural distance on LMPs, which is the greatest fixpoint of a suitable functional and turns out to be exactly the distance induced by $\varepsilon$ -bisimilarity. Remarkably, any fixpoint distance of that functional defines an $\varepsilon$ -bisimulation. This is the first time that a distance on distributions other than the Kantorovich distance is used for characterizing a known behavioural distance.

References

[1] Alessandro Abate. Approximation metrics based on probabilistic bisimulations for general state-space Markov processes: A survey. Electronic Notes in Theoretical Computer Science, 297:3–25, 2013. Proceedings of the first workshop on Hybrid Autonomous Systems. doi:10.1016/j.entcs.2013.12.002.
[2] Alessandro Abate, Marta Z. Kwiatkowska, Gethin Norman, and David Parker. Probabilistic model checking of labelled Markov processes via finite approximate bisimulations. In Franck van Breugel, Elham Kashefi, Catuscia Palamidessi, and Jan Rutten, editors, Horizons of the Mind. A Tribute to Prakash Panangaden - Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday, volume 8464 of Lecture Notes in Computer Science, pages 40–58. Springer, 2014. doi:10.1007/978-3-319-06880-0_2.
[3] Peter Aczel and Nax Mendler. A final coalgebra theorem. In Proc. 3rd CTCS, volume 389 of LNCS, pages 357–365. Springer, 1989. doi:10.1007/BFB0018361.
[4] Pedro Azevedo de Amorim, Mayuko Kori, and Koko Muroya. A framework for coalgebraic reward-sensitive bisimulation, 2025.
[5] Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, and Radu Mardare. On-the-fly computation of bisimilarity distances. Log. Methods Comput. Sci., 13(2), 2017. doi:10.23638/LMCS-13(2:13)2017.
[6] Giovanni Bacci, Giorgio Bacci, Kim G. Larsen, and Radu Mardare. On the metric-based approximate minimization of Markov chains. In Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl, editors, 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, July 10-14, 2017, Warsaw, Poland, volume 80 of LIPIcs, pages 104:1–104:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPIcs.ICALP.2017.104.
[7] C. Baier. On Algorithmic Verification Methods for Probabilistic Systems. Habilitation Thesis, 1998.
[8] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008.
[9] Paolo Baldan, Filippo Bonchi, Henning Kerstan, and Barbara König. Coalgebraic behavioral metrics. Log. Methods Comput. Sci., 14(3), 2018. doi:10.23638/LMCS-14(3:20)2018.
[10] Gaoang Bian and Alessandro Abate. On the relationship between bisimulation and trace equivalence in an approximate probabilistic context. In Javier Esparza and Andrzej S. Murawski, editors, Foundations of Software Science and Computation Structures - 20th International Conference, FOSSACS 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017, Proceedings, volume 10203 of Lecture Notes in Computer Science, pages 321–337, 2017. doi:10.1007/978-3-662-54458-7_19.
[11] Richard Blute, Josée Desharnais, Abbas Edalat, and Prakash Panangaden. Bisimulation for labelled Markov processes. In Proceedings of the Twelfth IEEE Symposium On Logic In Computer Science (LICS), Warsaw, Poland, 1997.
[12] Filippo Bonchi, Barbara König, and Daniela Petrisan. Up-to techniques for behavioural metrics via fibrations. Math. Struct. Comput. Sci., 33(4-5):182–221, 2023. doi:10.1017/S0960129523000166.
[13] Filippo Bonchi, Alexandra Silva, and Ana Sokolova. The Power of Convex Algebras. In CONCUR 2017, volume 85, pages 23:1–23:18. LIPIcs, 2017. doi:10.4230/LIPIcs.CONCUR.2017.23.
[14] Filippo Bonchi, Ana Sokolova, and Valeria Vignudelli. The theory of traces for systems with nondeterminism, probability, and termination. Log. Methods Comput. Sci., 18(2), 2022. doi:10.46298/LMCS-18(2:21)2022.
[15] Nicolas Bourbaki. Elements of Mathematics. Springer-Verlag, 1995. Original French edition published by MASSON, Paris in 1971.
[16] Keri D’Angelo, Sebastian Gurke, Johanna Maria Kirss, Barbara König, Matina Najafi, Wojciech Rozowski, and Paul Wild. Behavioural metrics: Compositionality of the Kantorovich lifting and an application to up-to techniques. In Rupak Majumdar and Alexandra Silva, editors, 35th International Conference on Concurrency Theory, CONCUR 2024, September 9-13, 2024, Calgary, Canada, volume 311 of LIPIcs, pages 20:1–20:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.CONCUR.2024.20.
[17] Vincent Danos, Josée Desharnais, François Laviolette, and Prakash Panangaden. Bisimulation and cocongruence for probabilistic systems. Information and Computation, 2005. Special issue for selected papers from CMCS04. 22 pages.
[18] Luca de Alfaro, Marco Faella, and Mariëlle Stoelinga. Linear and branching metrics for quantitative transition systems. In Josep Díaz, Juhani Karhumäki, Arto Lepistö, and Donald Sannella, editors, Automata, Languages and Programming: 31st International Colloquium, ICALP 2004, Turku, Finland, July 12-16, 2004. Proceedings, volume 3142 of Lecture Notes in Computer Science, pages 97–109. Springer, 2004. doi:10.1007/978-3-540-27836-8_11.
[19] Luca de Alfaro and Rupak Majumdar. Quantitative solution of omega-regular games. In STOC ’01: Proceedings of the thirty-third annual ACM symposium on Theory of computing, pages 675–683, New York, NY, USA, 2001. ACM. doi:10.1145/380752.380871.
[20] Erik P. de Vink and Jan J. M. M. Rutten. Bisimulation for probabilistic transition systems: A coalgebraic approach. Theor. Comput. Sci., 221(1-2):271–293, 1999. doi:10.1016/S0304-3975(99)00035-3.
[21] Josée Desharnais. Labelled Markov Processes. PhD thesis, McGill University, 2000.
[22] Josée Desharnais, Vineet Gupta, R. Jagadeesan, and P. Panangaden. Metrics for labeled Markov processes. In Jos C. M. Baeten and S. Mauw, editors, Proceedings of 10th International Conference on Concurrency Theory, Eindhoven, The Netherlands, Lecture Notes in Computer Science, pages 258–273. Springer-Verlag, August 1999.
[23] Josée Desharnais, Vineet Gupta, R. Jagadeesan, and P. Panangaden. Approximating continuous Markov processes. In Proceedings of the 15th Annual IEEE Symposium On Logic In Computer Science, Santa Barbara, California, USA, pages 95–106, 2000.
[24] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for labelled Markov processes. Theoretical Computer Science, 318(3):323–354, 2004. doi:10.1016/j.tcs.2003.09.013.
[25] Josée Desharnais, R. Jagadeesan, Vineet Gupta, and P. Panangaden. The metric analogue of weak bisimulation for probabilistic processes. In Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science (LICS02), pages 413–422, Copenhagen, Denmark, July 2002. IEEE Computer Society. doi:10.1109/LICS.2002.1029849.
[26] Josée Desharnais, François Laviolette, and Mathieu Tracol. Approximate analysis of probabilistic processes: Logic, simulation and games. In Fifth International Conference on the Quantitative Evaluation of Systems (QEST 2008), 14-17 September 2008, Saint-Malo, France, pages 264–273. IEEE Computer Society, 2008. doi:10.1109/QEST.2008.42.
[27] Alessandro D’Innocenzo, Alessandro Abate, and Joost-Pieter Katoen. Robust PCTL model checking. In Thao Dang and Ian M. Mitchell, editors, Hybrid Systems: Computation and Control (part of CPS Week 2012), HSCC’12, Beijing, China, April 17-19, 2012, pages 275–286. ACM, 2012. doi:10.1145/2185632.2185673.
[28] A. Giacalone, P. Misra, and S. Prasad. Facile: A symmetric integration of concurrent and functional programming. In LNCS 352: TAPSOFT 89, 1989.
[29] Alessandro Giacalone, Chi-Chang Jou, and Scott A. Smolka. Algebraic reasoning for probabilistic concurrent systems. In Proc. IFIP TC2 Working Conference on Programming Concepts and Methods, pages 443–458. North-Holland, 1990.
[30] Antoine Girard and George J. Pappas. Approximate bisimulation relations for constrained linear systems. Automatica, 43(8):1307–1317, 2007. doi:10.1016/j.automatica.2007.01.019.
[31] Sergey Goncharov, Dirk Hofmann, Pedro Nora, Lutz Schröder, and Paul Wild. Kantorovich functors and characteristic logics for behavioural distances. In Orna Kupferman and Pawel Sobocinski, editors, Foundations of Software Science and Computation Structures - 26th International Conference, FoSSaCS 2023, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2023, Paris, France, April 22-27, 2023, Proceedings, volume 13992 of Lecture Notes in Computer Science, pages 46–67. Springer, 2023. doi:10.1007/978-3-031-30829-1_3.
[32] Ichiro Hasuo. Generic forward and backward simulations. In CONCUR 2006, pages 406–420. LNCS 4137, 2006. doi:10.1007/11817949_27.
[33] Ichiro Hasuo. Generic forward and backward simulations II: probabilistic simulation. In CONCUR 2010, pages 447–461. LNCS 6269, 2010. doi:10.1007/978-3-642-15375-4_31.
[34] Jesse Hughes and Bart Jacobs. Simulations in coalgebra. Theor. Comput. Sci., 327(1-2):71–108, 2004. doi:10.1016/J.TCS.2004.07.022.
[35] Bart Jacobs. Introduction to Coalgebra: Towards Mathematics of States and Observation, volume 59 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 2016. doi:10.1017/CBO9781316823187.
[36] Bart Jacobs, Alexandra Silva, and Ana Sokolova. Trace semantics via determinization. J. Comput. Syst. Sci., 81(5):859–879, 2015. doi:10.1016/J.JCSS.2014.12.005.
[37] Dexter Kozen. Semantics of probabilistic programs. Journal of Computer and Systems Sciences, 22:328–350, 1981. doi:10.1016/0022-0000(81)90036-2.
[38] Kim G. Larsen and Arne Skou. Bisimulation through probabilistic testing. Information and Computation, 94:1–28, 1991. doi:10.1016/0890-5401(91)90030-6.
[39] Radu Mardare, Prakash Panangaden, and Gordon D. Plotkin. Quantitative algebraic reasoning. In Martin Grohe, Eric Koskinen, and Natarajan Shankar, editors, Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS ’16, New York, NY, USA, July 5-8, 2016, pages 700–709. ACM, 2016. doi:10.1145/2933575.2934518.
[40] Radu Mardare, Prakash Panangaden, and Gordon D. Plotkin. On the axiomatizability of quantitative algebras. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017, pages 1–12. IEEE Computer Society, 2017. doi:10.1109/LICS.2017.8005102.
[41] R. Milner. Communication and Concurrency. Prentice Hall, 1989.
[42] Matteo Mio, Ralph Sarkis, and Valeria Vignudelli. Combining nondeterminism, probability, and termination: Equational and metric reasoning. In 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021, pages 1–14. IEEE, 2021. doi:10.1109/LICS52264.2021.9470717.
[43] Matteo Mio, Ralph Sarkis, and Valeria Vignudelli. Universal quantitative algebra for fuzzy relations and generalised metric spaces. Log. Methods Comput. Sci., 20(4), 2024. doi:10.46298/LMCS-20(4:19)2024.
[44] Prakash Panangaden. Labelled Markov Processes. Imperial College Press, 2009.
[45] Yu. V. Prokhorov. Convergence of random processes and limit theorems in probability theory. Theory of Probability & Its Applications, 1(2):157–214, 1956. doi:10.1137/1101016.
[46] Jurriaan Rot and Thorsten Wißmann. Bisimilar states in uncertain structures. In Paolo Baldan and Valeria de Paiva, editors, 10th Conference on Algebra and Coalgebra in Computer Science, CALCO 2023, Indiana University Bloomington, IN, USA, June 19-21, 2023, volume 270 of LIPIcs, pages 12:1–12:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.CALCO.2023.12.
[47] J.J.M.M. Rutten. Universal coalgebra: A theory of systems. Theoretical Computer Science, 249:3–80, 2000. doi:10.1016/S0304-3975(00)00056-6.
[48] R. Segala and N.A. Lynch. Probabilistic simulations for probabilistic processes. In Proc. Concur’94, pages 481–496. LNCS 836, 1994.
[49] A. Sokolova. Coalgebraic Analysis of Probabilistic Systems. PhD thesis, TU Eindhoven, 2005.
[50] Timm Spork, Christel Baier, Joost-Pieter Katoen, Jakob Piribauer, and Tim Quatmann. A spectrum of approximate probabilistic bisimulations. In Rupak Majumdar and Alexandra Silva, editors, 35th International Conference on Concurrency Theory, CONCUR 2024, September 9-13, 2024, Calgary, Canada, volume 311 of LIPIcs, pages 37:1–37:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, August 2024. doi:10.4230/LIPIcs.CONCUR.2024.37.
[51] David Sprunger, Shin-ya Katsumata, Jérémy Dubut, and Ichiro Hasuo. Fibrational bisimulations and quantitative reasoning. In Corina Cîrstea, editor, Coalgebraic Methods in Computer Science, pages 190–213, Cham, 2018. Springer International Publishing. doi:10.1007/978-3-030-00389-0_11.
[52] David Sprunger, Shin-ya Katsumata, Jérémy Dubut, and Ichiro Hasuo. Fibrational bisimulations and quantitative reasoning: Extended version. J. Log. Comput., 31(6):1526–1559, 2021. doi:10.1093/LOGCOM/EXAB051.
[53] Qiyi Tang and Franck van Breugel. Algorithms to compute probabilistic bisimilarity distances for labelled Markov chains. In Proc. CONCUR 2017, volume 85 of LIPIcs, pages 27:1–27:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPIcs.CONCUR.2017.27.
[54] Mathieu Tracol, Josée Desharnais, and Abir Zhioua. Computing distances between probabilistic automata. In Mieke Massink and Gethin Norman, editors, Proceedings Ninth Workshop on Quantitative Aspects of Programming Languages, QAPL 2011, Saarbrücken, Germany, April 1-3, 2011, volume 57 of EPTCS, pages 148–162, 2011. doi:10.4204/EPTCS.57.11.
[55] Franck van Breugel. Probabilistic bisimilarity distances. ACM SIGLOG News, 4(4):33–51, November 2017. doi:10.1145/3157831.3157837.
[56] Franck van Breugel, Claudio Hermida, Michael Makkai, and James Worrell. Recursively defined metric spaces without contraction. Theor. Comput. Sci., 380(1-2):143–163, 2007. doi:10.1016/j.tcs.2007.02.059.
[57] Franck van Breugel, Babita Sharma, and James Worrell. Approximating a behavioural pseudometric without discount for probabilistic systems. In Helmut Seidl, editor, FoSSaCS, volume 4423 of Lecture Notes in Computer Science, pages 123–137. Springer, 2007. doi:10.1007/978-3-540-71389-0_10.
[58] Franck van Breugel and James Worrell. Towards quantitative verification of probabilistic transition systems. In ICALP ’01: Proceedings of the 28th International Colloquium on Automata, Languages and Programming, pages 421–432, London, UK, 2001. Springer-Verlag. doi:10.1007/3-540-48224-5_35.
[59] Franck van Breugel and James Worrell. A behavioural pseudometric for probabilistic transition systems. Theor. Comput. Sci., 331(1):115–142, February 2005. doi:10.1016/j.tcs.2004.09.035.
[60] Rob J. van Glabbeek, Scott A. Smolka, and Bernhard Steffen. Reactive, generative and stratified models of probabilistic processes. Inf. Comput., 121(1):59–80, 1995. doi:10.1006/inco.1995.1123.
[61] Mingsheng Ying and Martin Wirsing. Approximate bisimilarity. In AMAST ’00: Proceedings of the 8th International Conference on Algebraic Methodology and Software Technology, pages 309–322, London, UK, 2000. Springer-Verlag. doi:10.1007/3-540-45499-3_23.

Appendix A Proofs

Lemma 10. [Restated, see original statement.]

Let $\langle S,d\rangle$ be a pseudometric space, on which an LMP $\tau:S\to(\mathcal{D}S)^{\mathcal{A}}$ is defined. Let $s_{0}$ and $s_{1}$ be states. Then

$$\Delta_{LP}(d)(s_{0},s_{1})=\inf\{\varepsilon\mid(\forall a\in\mathcal{A},\forall A\subseteq S:\tau_{a}(s_{i})(A)\leq\tau_{a}(s_{1-i})(A^{d}_{\varepsilon})+\varepsilon,\quad i=0,1)\}.$$

Proof.

For $a\in\mathcal{A}$ and $\varepsilon>0$ , we define the following predicate:

$$P(a,\varepsilon):=(\forall A\subseteq S:\tau_{a}(s_{i})(A)\leq\tau_{a}(s_{1-i})(A^{d}_{\varepsilon})+\varepsilon,\quad i=0,1).$$

This family of predicates satisfies

$$\text{if }\gamma>\varepsilon,\quad P(a,\varepsilon)\Rightarrow P(a,\gamma). \tag{6}$$

We now prove that

$$\sup_{a\in\mathcal{A}}\inf\{\varepsilon\mid P(a,\varepsilon)\}=\inf\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\}.$$

$\leq$ :

The following sequence of arguments shows this inequality.

$$\begin{aligned}
\{\varepsilon\mid P(a_{0},\varepsilon)\} &\supseteq\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\} && \text{for all }a_{0}\in\mathcal{A},\text{ so}\\
\inf\{\varepsilon\mid P(a_{0},\varepsilon)\} &\leq\inf\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\} && \text{for all }a_{0}\in\mathcal{A},\text{ and so}\\
\sup_{a\in\mathcal{A}}\inf\{\varepsilon\mid P(a,\varepsilon)\} &\leq\inf\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\}.
\end{aligned}$$

$\geq$ :

Let $\alpha=\sup_{a\in\mathcal{A}}\inf\{\varepsilon\mid P(a,\varepsilon)\}.$ Then for all $a\in\mathcal{A}$ , $\alpha\geq\inf\{\varepsilon\mid P(a,\varepsilon)\}.$ Let $\gamma>\alpha$ . Then we have

$$\begin{aligned}
&\text{for all }a\in\mathcal{A}: && \gamma\in\{\varepsilon\mid P(a,\varepsilon)\} && \text{by (6) and because }\gamma>\inf\{\varepsilon\mid P(a,\varepsilon)\},\\
&\text{so} && \gamma\in\bigcap_{a\in\mathcal{A}}\{\varepsilon\mid P(a,\varepsilon)\},\\
&\text{so} && \gamma\in\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\},\\
&\text{so} && \gamma\geq\inf\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\}.
\end{aligned}$$

This implies $\alpha\geq\inf\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\}$ as well. Otherwise, suppose $\alpha<\iota:=\inf\{\varepsilon\mid(\forall a\in\mathcal{A}:P(a,\varepsilon))\}$ . Then $\alpha<(\alpha+\iota)/2<\iota$ , so the result above with $\gamma=(\alpha+\iota)/2$ yields $(\alpha+\iota)/2\geq\iota$ , a contradiction.

$\hfill\blacktriangleleft$
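The role of the upward-closure property (6) in exchanging $\sup$ and $\inf$ can be illustrated on a toy example. The following Python sketch is not part of the original development: it replaces the sets $\{\varepsilon\mid P(a,\varepsilon)\}$ by hypothetical finite sets of candidate values on a grid, and the helper names `upward`, `sup_of_infs` and `inf_of_common` are chosen here for illustration only.

```python
from fractions import Fraction

# Candidate values 0, 1/10, ..., 1 (a finite stand-in for the reals in [0, 1]).
GRID = [Fraction(k, 10) for k in range(11)]

def upward(threshold):
    """Upward-closed set {e in GRID | e >= threshold}, mimicking property (6)."""
    return {e for e in GRID if e >= threshold}

def sup_of_infs(sets):
    """Supremum, over the actions, of the infimum of each action's set."""
    return max(min(s) for s in sets)

def inf_of_common(sets):
    """Infimum of the values that work for all actions simultaneously."""
    return min(set.intersection(*sets))

# One upward-closed set per (hypothetical) action: both sides coincide.
closed = [upward(Fraction(3, 10)), upward(Fraction(7, 10)), upward(Fraction(1, 2))]

# Sets that are not upward closed: only the inequality "<=" survives.
not_closed = [{Fraction(1, 10), Fraction(1, 2)}, {Fraction(1, 5), Fraction(1, 2)}]

print(sup_of_infs(closed), inf_of_common(closed))
print(sup_of_infs(not_closed), inf_of_common(not_closed))
```

On the second family the two quantities differ, which suggests why the $\geq$ direction of the proof has to invoke (6).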

The proof that $\Delta_{LP}$ is monotonic needs the following simple lemma.

Lemma 29.

If $d_{2}\sqsubseteq d_{1}$ , we have $X^{d_{2}}_{\varepsilon}\subseteq X^{d_{1}}_{\varepsilon}$ .

Proof.

Assume $d_{1}\leq d_{2}$ . Let $y\in X^{d_{2}}_{\varepsilon}$ . Then there is some $x\in X$ such that $d_{2}(x,y)<\varepsilon$ . This gives $d_{1}(x,y)\leq d_{2}(x,y)<\varepsilon$ . So $y\in X^{d_{1}}_{\varepsilon}$ . $\hfill\blacktriangleleft$

Proposition 14. [Restated, see original statement.]

$\Delta_{LP}$ is monotonic.

Proof.

Let $d_{1}\leq d_{2}$ . Let $s_{0}$ , $s_{1}$ be states, let $a\in\mathcal{A}$ and define the two sets

$$\begin{aligned}
E_{1} &=\{\varepsilon\mid(\forall A\subseteq S:\tau_{a}(s_{i},A)\leq\tau_{a}(s_{1-i},A^{d_{1}}_{\varepsilon})+\varepsilon,\quad i=0,1)\}\\
\text{and}\quad E_{2} &=\{\varepsilon\mid(\forall A\subseteq S:\tau_{a}(s_{i},A)\leq\tau_{a}(s_{1-i},A^{d_{2}}_{\varepsilon})+\varepsilon,\quad i=0,1)\}.
\end{aligned}$$

Then $E_{2}\subseteq E_{1}$. Indeed, let $\varepsilon\in E_{2}$ and $A\subseteq S$. Then $\tau_{a}(s_{i},A)\leq\tau_{a}(s_{1-i},A^{d_{2}}_{\varepsilon})+\varepsilon\leq\tau_{a}(s_{1-i},A^{d_{1}}_{\varepsilon})+\varepsilon$, because $A^{d_{2}}_{\varepsilon}\subseteq A^{d_{1}}_{\varepsilon}$ by Lemma 29. So $\inf E_{1}\leq\inf E_{2}$ for every $a\in\mathcal{A}$, and hence $\Delta_{LP}(d_{1})(s_{0},s_{1})=\sup_{a\in\mathcal{A}}\inf E_{1}\leq\sup_{a\in\mathcal{A}}\inf E_{2}=\Delta_{LP}(d_{2})(s_{0},s_{1})$, as wanted. $\hfill\blacktriangleleft$

Proposition 15. [Restated, see original statement.]

$d^{*}$ is a fixpoint of $\Delta_{LP}$ , and hence it is the greatest fixpoint.

Proof.

Let $s_{0}$ , $s_{1}\in S$ . We want $\Delta_{LP}(d^{*})(s_{0},s_{1})=d^{*}(s_{0},s_{1})$ . Consider the following sets

$$\begin{aligned}
E^{\Delta} &=\{\varepsilon\mid\forall a\in\mathcal{A}:(\forall X\subseteq S:\tau_{a}(s_{i},X)\leq\tau_{a}(s_{1-i},X^{d^{*}}_{\varepsilon})+\varepsilon,\quad i=0,1)\}\\
\text{and}\quad E^{*} &=\{\varepsilon\mid s_{0}\sim_{\varepsilon}s_{1}\}\\
&=\{\varepsilon\mid\exists R\text{ an $\varepsilon$-bisimulation s.t. $s_{0}Rs_{1}$, that is, for all $(t_{0},t_{1})\in R$, we have}\\
&\qquad\quad\forall a\in\mathcal{A}:(\forall X\subseteq S:\tau_{a}(t_{i},X)\leq\tau_{a}(t_{1-i},R(X))+\varepsilon,\quad i=0,1)\}.
\end{aligned}$$

Then $\Delta_{LP}(d^{*})(s_{0},s_{1})=\inf E^{\Delta}$ by Lemma 10 and $d^{*}(s_{0},s_{1})=\inf E^{*}$. We prove $\inf E^{\Delta}=\inf E^{*}$.

$\geq$ :

We prove that $E^{\Delta}\subseteq E^{*}$. Let $\varepsilon\in E^{\Delta}$. Let $R_{\varepsilon}=\{(x,y)\mid x\sim_{\varepsilon}y\}$, that is, the biggest $\varepsilon$-bisimulation. We need to prove that $s_{0}\sim_{\varepsilon}s_{1}$. For that we prove that $R_{\varepsilon}^{+}:=R_{\varepsilon}\cup\{(s_{0},s_{1})\}$ is an $\varepsilon$-bisimulation. Let $a\in\mathcal{A}$ and $X\subseteq S$. Then $X_{\varepsilon}^{d^{*}}\subseteq R^{+}_{\varepsilon}(X)$. Indeed, let $y\in X_{\varepsilon}^{d^{*}}$; then $d^{*}(y,x)<\varepsilon$ for some $x\in X$. So, because $d^{*}(y,x)=\inf\{e\mid y\sim_{e}x\}$, we have $y\sim_{\gamma}x$ for some $\gamma<\varepsilon$, and hence $y\sim_{\varepsilon}x$ as every $\gamma$-bisimulation is an $\varepsilon$-bisimulation. Hence $y\in R^{+}_{\varepsilon}(X)$. Pairs in $R_{\varepsilon}$ already satisfy the bisimulation condition, since $R_{\varepsilon}(X)\subseteq R^{+}_{\varepsilon}(X)$, so we only need to prove it for the pair $(s_{0},s_{1})$. For $i=0,1$ we have

$$\begin{aligned}
\tau_{a}(s_{i},X) &\leq\tau_{a}(s_{1-i},X_{\varepsilon}^{d^{*}})+\varepsilon && \text{by choice of }\varepsilon\\
&\leq\tau_{a}(s_{1-i},R^{+}_{\varepsilon}(X))+\varepsilon && \text{because }X_{\varepsilon}^{d^{*}}\subseteq R^{+}_{\varepsilon}(X).
\end{aligned}$$

So $s_{0}\sim_{\varepsilon}s_{1}$ and hence $\varepsilon\in E^{*}$ .

$\leq$ :

Let $\varepsilon\in E^{*}$ . We prove that for all $\gamma>\varepsilon$ , we have $\gamma\in E^{\Delta}$ . This will prove that $\inf E^{\Delta}\leq\varepsilon$ . Since $\varepsilon$ is arbitrary, this will give the result: $\inf E^{\Delta}\leq\inf E^{*}$ .
So let $\gamma>\varepsilon$ . Because $\varepsilon\in E^{*}$ , we have $s_{0}\sim_{\varepsilon}s_{1}$ , and hence there is some $\varepsilon$ -bisimulation $R$ such that $s_{0}Rs_{1}$ . We want $\gamma\in E^{\Delta}$ , so let $a\in\mathcal{A}$ and $X\subseteq S$ . We want $\tau_{a}(s_{i},X)\leq\tau_{a}(s_{1-i},X^{d^{*}}_{\gamma})+\gamma,\quad i=0,1$ . First observe that $R(X)\subseteq X^{d^{*}}_{\gamma}$ . Indeed, if $y\in R(X)$ , then there is some $x\in X$ such that $y R x$ . Because $R$ is an $\varepsilon$ -bisimulation, $d^{*}(y,x)\leq\varepsilon$ , so $y\in X^{d^{*}}_{\gamma}$ (we cannot say $y\in X^{d^{*}}_{\varepsilon}$ because these sets are open balls). Then, for $i=0,1$ :

$$\begin{aligned}
\tau_{a}(s_{i},X) &\leq\tau_{a}(s_{1-i},R(X))+\varepsilon, && \text{because $R$ is an $\varepsilon$-bisimulation}\\
&\leq\tau_{a}(s_{1-i},X^{d^{*}}_{\gamma})+\gamma, && \text{because }R(X)\subseteq X^{d^{*}}_{\gamma}\text{ and }\varepsilon<\gamma.
\end{aligned}$$

So, $\gamma\in E^{\Delta}$ , as wanted. We have proven that $d^{*}$ is a fixpoint of $\Delta_{LP}$ , and it is the greatest by Proposition 12.

$\hfill\blacktriangleleft$
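For a finite LMP, the relation $\sim_{\varepsilon}$ appearing in $E^{*}$ can be computed by brute force, which may help to experiment with the sets used in the proof above. The following Python sketch is only an illustration under stated assumptions: the state space is finite, transitions are given as nested dictionaries `tau[a][s]` of (sub)probabilities, and the function names and the example system are hypothetical. It starts from the full relation and removes pairs that violate the $\varepsilon$-bisimulation condition until nothing changes; since it enumerates all subsets $X\subseteq S$, it is exponential and not meant as an efficient algorithm.

```python
from itertools import combinations

def subsets(states):
    """All subsets of `states`, as frozensets."""
    pts = sorted(states)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def image(R, X):
    """R(X) = {y | x R y for some x in X}."""
    return {y for (x, y) in R if x in X}

def prob(tau, a, s, X):
    """tau_a(s, X): probability of jumping from s into X with action a."""
    return sum(tau[a][s].get(t, 0.0) for t in X)

def largest_eps_bisimulation(states, actions, tau, eps, tol=1e-12):
    """Greatest relation R such that every pair in R satisfies the
    eps-bisimulation condition (both directions, i = 0, 1) with respect to R."""
    R = {(s, t) for s in states for t in states}
    changed = True
    while changed:
        changed = False
        for (s, t) in sorted(R):
            if (s, t) not in R:        # already removed together with (t, s)
                continue
            ok = all(
                prob(tau, a, s, X) <= prob(tau, a, t, image(R, X)) + eps + tol
                and prob(tau, a, t, X) <= prob(tau, a, s, image(R, X)) + eps + tol
                for a in actions
                for X in subsets(states)
            )
            if not ok:
                R -= {(s, t), (t, s)}  # keep the relation symmetric
                changed = True
    return R

# Hypothetical example: s and t differ by 0.05 on their a-transition to u.
states = ["s", "t", "u"]
actions = ["a"]
tau = {"a": {"s": {"u": 0.5}, "t": {"u": 0.45}, "u": {}}}

print(("s", "t") in largest_eps_bisimulation(states, actions, tau, 0.1))
print(("s", "t") in largest_eps_bisimulation(states, actions, tau, 0.01))
```

In this example, `s` and `t` are related for $\varepsilon=0.1$ but not for $\varepsilon=0.01$, while neither is related to the deadlocked state `u` in either case.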

Proposition 16. [Restated, see original statement.]

$\mathcal{D}_{LP}$ is a functor on $\mathbf{PMet_{1}}$ .

Proof.

Let $f\colon\langle X_{0},d_{0}\rangle\to\langle X_{1},d_{1}\rangle$ be a nonexpansive map. The only thing to prove is that $\mathcal{D}_{LP}f$ is nonexpansive. Let $\varphi_{0},\varphi_{1}\in\mathcal{D}_{LP}\langle X_{0},d_{0}\rangle$. Then (we omit the symmetric inequality for simplicity)

$$\begin{aligned}
\delta^{d_{1}}_{LP}(\mathcal{D}_{LP}f(\varphi_{0}),\mathcal{D}_{LP}f(\varphi_{1})) &=\inf\{\varepsilon\mid\forall B\subseteq X_{1}:\varphi_{0}(f^{-1}(B))\leq\varphi_{1}(f^{-1}(B_{\varepsilon}^{d_{1}}))+\varepsilon\}\\
&\stackrel{(*)}{\leq}\inf\{\varepsilon\mid\forall B\subseteq X_{1}:\varphi_{0}(f^{-1}(B))\leq\varphi_{1}((f^{-1}B)_{\varepsilon}^{d_{0}})+\varepsilon\}\\
&=\inf\{\varepsilon\mid\forall A\in\{f^{-1}(B)\mid B\subseteq X_{1}\}:\varphi_{0}(A)\leq\varphi_{1}(A_{\varepsilon}^{d_{0}})+\varepsilon\}\\
&\stackrel{(**)}{\leq}\inf\{\varepsilon\mid\forall A\subseteq X_{0}:\varphi_{0}(A)\leq\varphi_{1}(A_{\varepsilon}^{d_{0}})+\varepsilon\}\\
&=\delta^{d_{0}}_{LP}(\varphi_{0},\varphi_{1})
\end{aligned}$$

where the marked inequalities are justified below. For the first one, marked with $(*)$ , let

$$V=\{\varepsilon\mid\forall B\subseteq X_{1}:\varphi_{0}(f^{-1}(B))\leq\varphi_{1}((f^{-1}B)_{\varepsilon}^{d_{0}})+\varepsilon\}\quad\text{and}$$

$$W=\{\varepsilon\mid\forall B\subseteq X_{1}:\varphi_{0}(f^{-1}(B))\leq\varphi_{1}(f^{-1}(B_{\varepsilon}^{d_{1}}))+\varepsilon\}.$$

We show that $V\subseteq W$ . Let $\varepsilon\in V$ . For every $B\subseteq X_{1}$ , let us first show that $f^{-1}(B)_{\varepsilon}^{d_{0}}\subseteq f^{-1}(B_{\varepsilon}^{d_{1}})$ . Indeed, take $x\in f^{-1}(B)_{\varepsilon}^{d_{0}}$ and let $a\in f^{-1}(B)$ be such that $d_{0}(x,a)<\varepsilon$ . Then by nonexpansivity of $f$ ,

$$d_{1}(f(x),f(a))\leq d_{0}(x,a)<\varepsilon$$

and so $f(x)\in B_{\varepsilon}^{d_{1}}$ , because $f(a)\in B$ . So $x\in f^{-1}(B_{\varepsilon}^{d_{1}})$ . Now combining this with the inequality condition in $V$ we get

$$\varphi_{0}(f^{-1}(B))\leq\varphi_{1}(f^{-1}(B)_{\varepsilon}^{d_{0}})+\varepsilon\leq\varphi_{1}(f^{-1}(B_{\varepsilon}^{d_{1}}))+\varepsilon$$

and so $\varepsilon\in W$ .
For the inequality marked with $(**)$, observe that if $\varepsilon$ satisfies the inequality for all $A\subseteq X_{0}$ then it does for all $A\in\{f^{-1}(B)\mid B\subseteq X_{1}\}$. So the infimum on the left-hand side of the inequality is taken over a possibly bigger set, which concludes the argument. $\hfill\blacktriangleleft$
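The key inclusion $f^{-1}(B)_{\varepsilon}^{d_{0}}\subseteq f^{-1}(B_{\varepsilon}^{d_{1}})$ behind step $(*)$ can be checked mechanically on small finite spaces. The Python sketch below is an illustration only: the spaces, distances, map `f` and helper names are hypothetical, and neighbourhoods are taken to be open, as in the proofs of this appendix. It also shows that the inclusion may fail once $f$ is no longer nonexpansive.

```python
from itertools import combinations

def subsets(points):
    """All subsets of `points`, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def ball(A, d, points, eps):
    """Open neighbourhood A^d_eps = {y | d(x, y) < eps for some x in A}."""
    return {y for y in points if any(d[x][y] < eps for x in A)}

def preimage(f, B):
    """f^{-1}(B) for a map given as a dictionary."""
    return {x for x in f if f[x] in B}

def inclusion_holds(f, X0, d0, X1, d1, eps_values):
    """Check f^{-1}(B)^{d0}_eps <= f^{-1}(B^{d1}_eps) for all B and all eps."""
    return all(
        ball(preimage(f, B), d0, X0, eps) <= preimage(f, ball(B, d1, X1, eps))
        for B in subsets(X1)
        for eps in eps_values
    )

def metric(pairs, points):
    """Symmetric distance table built from the listed off-diagonal pairs."""
    d = {x: {y: 0.0 for y in points} for x in points}
    for (x, y), v in pairs.items():
        d[x][y] = d[y][x] = v
    return d

X0, X1 = [0, 1, 2], ["p", "q"]
d0 = metric({(0, 1): 0.3, (0, 2): 0.5, (1, 2): 0.7}, X0)
f = {0: "p", 1: "p", 2: "q"}
d1 = metric({("p", "q"): 0.4}, X1)       # f is nonexpansive: 0.4 <= 0.5 and 0.4 <= 0.7
d1_far = metric({("p", "q"): 0.9}, X1)   # f is not nonexpansive for this distance
EPS = [0.1, 0.35, 0.45, 0.6, 0.8, 1.0]

print(inclusion_holds(f, X0, d0, X1, d1, EPS))
print(inclusion_holds(f, X0, d0, X1, d1_far, EPS))
```

With `d1_far`, the choice $B=\{q\}$ and $\varepsilon=0.6$ gives a neighbourhood of $f^{-1}(B)$ containing the point `0`, whose image is not within distance $0.6$ of $B$.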

Proposition 17. [Restated, see original statement.]

The functor $\mathcal{D}_{LP}$ is locally nonexpansive, that is, for $f_{1},f_{2}\in X\to Y$

$$\delta_{\mathcal{D}_{LP}X\to\mathcal{D}_{LP}Y}(\mathcal{D}_{LP}f_{1},\mathcal{D}_{LP}f_{2})\leq d_{X\to Y}(f_{1},f_{2})$$

with $\delta_{\mathcal{D}_{LP}X\to\mathcal{D}_{LP}Y}$ being the metric on the hom-set $\mathcal{D}_{LP}\langle X,d_{X}\rangle\to\mathcal{D}_{LP}\langle Y,d_{Y}\rangle$ .

Proof.

Let $f_{1},f_{2}\in X\to Y$ . We have that

$$\delta_{\mathcal{D}_{LP}X\to\mathcal{D}_{LP}Y}(\mathcal{D}_{LP}f_{1},\mathcal{D}_{LP}f_{2})=\sup_{\varphi\in\mathcal{D}X}\delta_{LP}^{d_{Y}}(\mathcal{D}_{LP}f_{1}(\varphi),\mathcal{D}_{LP}f_{2}(\varphi)).$$

Let $\varphi\in\mathcal{D}X$ and recall that $\mathcal{D}_{LP}f_{i}(\varphi)=\varphi(f_{i}^{-1}(\cdot))$ . We will show that

$$\delta_{LP}^{d_{Y}}(\varphi(f_{1}^{-1}(\cdot)),\varphi(f_{2}^{-1}(\cdot)))\leq d_{X\to Y}(f_{1},f_{2}).$$

Let $\alpha=d_{X\to Y}(f_{1},f_{2})=\sup_{x\in X}d_{Y}(f_{1}(x),f_{2}(x))$, let $\gamma>\alpha$ and let $B\subseteq Y$. Since the sets $B^{d_{Y}}_{\gamma}$ are open balls, we work with $\gamma$ rather than $\alpha$ and show $f_{1}^{-1}(B)\subseteq f_{2}^{-1}(B^{d_{Y}}_{\gamma})$:

$$\begin{aligned}
f_{1}^{-1}(B) &=\{x\mid\exists y\in B:y=f_{1}(x)\}\\
&=\{x\mid\exists y\in B:y=f_{1}(x)\text{ and }d_{Y}(y,f_{2}(x))\leq\alpha\} && \text{since }d_{Y}(f_{1}(x),f_{2}(x))\leq\alpha\\
&\subseteq\{x\mid\exists y\in B:d_{Y}(y,f_{2}(x))<\gamma\} && \text{since }\alpha<\gamma\\
&=\{x\mid f_{2}(x)\in B^{d_{Y}}_{\gamma}\}\\
&=f_{2}^{-1}(B^{d_{Y}}_{\gamma}).
\end{aligned}$$

This implies that every $\gamma>\alpha$ satisfies $(\forall B\subseteq Y:\varphi(f_{1}^{-1}(B))\leq\varphi(f_{2}^{-1}(B^{d_{Y}}_{\gamma}))+\gamma)$, and similarly with 1 and 2 inverted. Hence

$$\begin{aligned}
\delta_{LP}^{d_{Y}}(\varphi(f_{1}^{-1}(\cdot)),\varphi(f_{2}^{-1}(\cdot))) &=\inf\{\varepsilon\mid\forall B\subseteq Y:\varphi(f_{1}^{-1}(B))\leq\varphi(f_{2}^{-1}(B^{d_{Y}}_{\varepsilon}))+\varepsilon\}\\
&\leq\alpha\,=\,d_{X\to Y}(f_{1},f_{2}),
\end{aligned}$$

as wanted. $\hfill\blacktriangleleft$

Lemma 19. [Restated, see original statement.]

The unit $\eta$ of $\mathcal{D}$ is nonexpansive with respect to the Lévy-Prokhorov distance. Moreover, it is an isometry, i.e., $\delta_{LP}^{d}(1_{x},1_{y})=d(x,y)$ .

Proof.

We have

$$\begin{aligned}
\delta^{d}_{LP}(\eta_{X}(x),\eta_{X}(y)) &=\delta^{d}_{LP}(1_{x},1_{y})\\
&=\inf\{\varepsilon\mid\forall A\subseteq X:1_{x}(A)\leq 1_{y}(A^{d}_{\varepsilon})+\varepsilon,\ 1_{y}(A)\leq 1_{x}(A^{d}_{\varepsilon})+\varepsilon\}.
\end{aligned}$$

Let $A$ be a subset of $X$ . We distinguish four cases in which we consider the relevant inequalities:

$$\begin{aligned}
1_{x}(A) &\leq 1_{y}(A^{d}_{\varepsilon})+\varepsilon && (7)\\
\text{and}\quad 1_{y}(A) &\leq 1_{x}(A^{d}_{\varepsilon})+\varepsilon && (8)
\end{aligned}$$

1. $x\notin A,y\notin A$: Then $1_{x}(A)=0$ and $1_{y}(A)=0$, and any $\varepsilon\geq 0$ satisfies both (7) and (8).

2. $x\notin A,y\in A$: Here $1_{x}(A)=0$ and $1_{y}(A)=1$, and (7) holds always, but (8) holds for $\varepsilon>d(x,y)$ as then $x\in A^{d}_{\varepsilon}$ and hence $1_{x}(A^{d}_{\varepsilon})=1$, and might not hold for $\varepsilon\leq d(x,y)$, e.g. when $A=\{y\}$ and $\varepsilon<d(x,y)\leq 1$, as then $1_{x}(A^{d}_{\varepsilon})=0$.

3. $x\in A,y\notin A$: This case is dual to the second case. Here (8) always holds, but (7) holds for $\varepsilon>d(x,y)$ and might not hold for $\varepsilon\leq d(x,y)$ as, e.g., for $A=\{x\}$ we have $y\in A^{d}_{\varepsilon}$ if and only if $\varepsilon>d(x,y)$.

4. $x\in A,y\in A$: Then $1_{x}(A)=1$ and $1_{y}(A)=1$, as well as $1_{x}(A^{d}_{\varepsilon})=1_{y}(A^{d}_{\varepsilon})=1$, and again, as in the first case, any $\varepsilon\geq 0$ satisfies both (7) and (8).

Hence, for $\varepsilon<1$, both inequalities are satisfied for all $A$ iff $\varepsilon>d(x,y)$ (for $\varepsilon\geq 1$ they hold trivially), and therefore $\delta^{d}_{LP}(1_{x},1_{y})=d(x,y)$. $\hfill\blacktriangleleft$
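The isometry property can be observed numerically on a small example. The following Python sketch is a brute-force illustration, not an efficient algorithm: it assumes a finite space with distances bounded by 1, checks the two Lévy-Prokhorov inequalities for every subset $A$, and approximates the infimum by bisection (which is justified because the condition is preserved when $\varepsilon$ grows). The three-point space and all function names are hypothetical.

```python
from itertools import combinations

def subsets(points):
    """All subsets of `points`, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def ball(A, d, points, eps):
    """Open neighbourhood A^d_eps = {y | d(x, y) < eps for some x in A}."""
    return frozenset(y for y in points if any(d[x][y] < eps for x in A))

def mass(phi, A):
    """phi(A) for a finitely supported distribution given as a dictionary."""
    return sum(phi.get(x, 0.0) for x in A)

def lp_holds(phi0, phi1, d, points, eps, tol=1e-12):
    """Check both Levy-Prokhorov inequalities for every subset A."""
    for A in subsets(points):
        A_eps = ball(A, d, points, eps)
        if mass(phi0, A) > mass(phi1, A_eps) + eps + tol:
            return False
        if mass(phi1, A) > mass(phi0, A_eps) + eps + tol:
            return False
    return True

def lp_distance(phi0, phi1, d, points, iters=40):
    """Approximate inf{eps | lp_holds(eps)} by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0            # eps = 1 always satisfies the inequalities
    for _ in range(iters):
        mid = (lo + hi) / 2
        if lp_holds(phi0, phi1, d, points, mid):
            hi = mid
        else:
            lo = mid
    return hi

def dirac(x):
    """The Dirac distribution 1_x."""
    return {x: 1.0}

points = ["x", "y", "z"]
d = {
    "x": {"x": 0.0, "y": 0.3, "z": 0.8},
    "y": {"x": 0.3, "y": 0.0, "z": 0.6},
    "z": {"x": 0.8, "y": 0.6, "z": 0.0},
}

print(lp_distance(dirac("x"), dirac("y"), d, points))
print(lp_distance(dirac("x"), dirac("z"), d, points))
```

Up to the bisection tolerance, the computed values agree with $d(x,y)$ and $d(x,z)$, as the lemma predicts.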

Lemma 27. [Restated, see original statement.]

Assume that Condition 1. in Definition 26 holds for $\beta\in\mathcal{D}R$ with $R\subseteq X\times Y$ and $\mu\in\mathcal{D}X$ . Then $\mu(X)\leq\sum_{x\in X,y\in Y}\beta(x,y)+\varepsilon$ (i.e., Condition 3. in Definition 26) is equivalent to the condition

$$\mu(S)\leq\sum_{x\in S,y\in Y}\beta(x,y)+\varepsilon,\quad\text{for all }S\subseteq X.$$

Proof.

The right-to-left implication is immediate, as $X\subseteq X$ . For the left-to-right implication, assume that for some $S\subseteq X$ we have

$$\mu(S)>\sum_{x\in S,y\in Y}\beta(x,y)+\varepsilon.$$

Since, by assumption, for all $x\in X$ , and hence for all $x\in X\setminus S$ , $\mu(x)\geq\sum_{y\in Y}\beta(x,y)$ , we have

$$\begin{aligned}
\mu(X) &=\mu(S)+\mu(X\setminus S)\\
&>\sum_{x\in S,y\in Y}\beta(x,y)+\varepsilon+\sum_{x\in X\setminus S}\mu(x)\\
&\geq\sum_{x\in S,y\in Y}\beta(x,y)+\sum_{x\in X\setminus S}\sum_{y\in Y}\beta(x,y)+\varepsilon\\
&=\sum_{x\in X,y\in Y}\beta(x,y)+\varepsilon.
\end{aligned}$$

The property now follows by contraposition. $\hfill\blacktriangleleft$
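The equivalence, and the role played by Condition 1., can be checked on a small example. In the Python sketch below, which is an illustration only, $\mu$ is a dictionary on $X$ and $\beta$ is a dictionary of weights on pairs $(x,y)$; the data and the function names are hypothetical. The last example drops Condition 1. and shows that the bound for $S=X$ then no longer implies the bound for all subsets.

```python
from itertools import combinations

TOL = 1e-12  # slack for floating-point comparisons

def subsets(points):
    """All subsets of `points`, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def row_mass(beta, S):
    """Sum of beta(x, y) over x in S and all y."""
    return sum(p for (x, _y), p in beta.items() if x in S)

def mu_mass(mu, S):
    """mu(S)."""
    return sum(mu.get(x, 0.0) for x in S)

def condition_1(mu, beta, X):
    """mu(x) >= sum_y beta(x, y) for every x in X."""
    return all(mu_mass(mu, {x}) >= row_mass(beta, {x}) - TOL for x in X)

def global_bound(mu, beta, X, eps):
    """mu(X) <= sum_{x in X, y} beta(x, y) + eps."""
    return mu_mass(mu, X) <= row_mass(beta, set(X)) + eps + TOL

def subset_bound(mu, beta, X, eps):
    """mu(S) <= sum_{x in S, y} beta(x, y) + eps for all S."""
    return all(mu_mass(mu, S) <= row_mass(beta, S) + eps + TOL for S in subsets(X))

X = ["x1", "x2"]
mu = {"x1": 0.5, "x2": 0.5}
beta = {("x1", "y1"): 0.45, ("x2", "y1"): 0.5}      # satisfies Condition 1.
beta_bad = {("x1", "y1"): 0.2, ("x2", "y1"): 0.8}   # violates Condition 1. at x2

for eps in (0.1, 0.01):
    print(eps, global_bound(mu, beta, X, eps), subset_bound(mu, beta, X, eps))
print(global_bound(mu, beta_bad, X, 0.1), subset_bound(mu, beta_bad, X, 0.1))
```

For `beta`, the two bounds agree for both values of $\varepsilon$; for `beta_bad`, the bound for $S=X$ holds while the bound for $S=\{x_1\}$ fails.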

Proposition 28. [Restated, see original statement.]

Let $(X,c)$ and $(Y,d)$ be two $\mathcal{D}$ -coalgebras. The following three properties are equivalent for $R\subseteq X\times Y$ :

1. $R$ is an $\varepsilon$-simulation.

2. For every $(x,y)\in R$, there is an $\varepsilon$-coupling $\beta\in\mathcal{D}R$ of $\mu=c(x)$ and $\nu=d(y)$.

3. The “$\varepsilon$-lax-bounded” span of morphisms on the left below commutes.

Proof.

The equivalence of 1. and 2. has been shown in [54]. Let $R$ be an $\varepsilon$-simulation. For $(x,y)\in R$, let $\mu=c(x)$ and $\nu=d(y)$. Define $b\colon R\to\mathcal{D}R$ by $(x,y)\mapsto\beta$, where $\beta$ is an $\varepsilon$-coupling of $\mu$ and $\nu$, which exists by 2.

Then, unfolding the definitions, we get:

$\blacksquare$ $\mathcal{D}\pi_{1}\circ b\sqsubseteq c\circ\pi_{1}$ is equivalent to: for all $(x,y)\in R$, condition 1. from Definition 26 holds.

$\blacksquare$ $\mathcal{D}\pi_{2}\circ b\sqsubseteq d\circ\pi_{2}$ is equivalent to: for all $(x,y)\in R$, condition 2. from Definition 26 holds.

$\blacksquare$ $c\circ\pi_{1}\sqsubseteq_{\varepsilon}\mathcal{D}\pi_{1}\circ b$ is equivalent to: for all $(x,y)\in R$ and all $S\subseteq X$, $\mu(S)\leq\sum_{x\in S,y\in Y}\beta(x,y)+\varepsilon$, which by Lemma 27 is equivalent to condition 3. from Definition 26.

These facts yield the equivalence of 3. with 2. (and hence 1. as well). $\hfill\blacktriangleleft$

[bib.bib18] [18] Luca de Alfaro, Marco Faella, and Mariëlle Stoelinga. Linear and branching metrics for quantitative transition systems. In Josep Díaz, Juhani Karhumäki, Arto Lepistö, and Donald Sannella, editors, Automata, Languages and Programming: 31st International Colloquium, ICALP 2004, Turku, Finland, July 12-16, 2004. Proceedings, volume 3142 of Lecture Notes in Computer Science, pages 97–109. Springer, 2004. doi:10.1007/978-3-540-27836-8_11.

[bib.bib19] [19] Luca de Alfaro and Rupak Majumdar. Quantitative solution of omega-regular games. In STOC ’01: Proceedings of the thirty-third annual ACM symposium on Theory of computing, pages 675–683, New York, NY, USA, 2001. ACM. doi:10.1145/380752.380871.

[bib.bib20] [20] Erik P. de Vink and Jan J. M. M. Rutten. Bisimulation for probabilistic transition systems: A coalgebraic approach. Theor. Comput. Sci., 221(1-2):271–293, 1999. doi:10.1016/S0304-3975(99)00035-3.

[bib.bib21] [21] Josée Desharnais. Labelled Markov Processes. PhD thesis, McGill University, 2000.

[bib.bib22] [22] Josée Desharnais, Vineet Gupta, R. Jagadeesan, and P. Panangaden. Metrics for labeled Markov processes. In Jos C. M. Baeten and S. Mauw, editors, Proceedings of 10th International Conference on Concurrency Theory, Eindhoven, The Netherlands, Lecture Notes in Computer Science, pages 258–273. Springer-Verlag, August 1999.

[bib.bib23] [23] Josée Desharnais, Vineet Gupta, R. Jagadeesan, and P. Panangaden. Approximating continuous Markov processes. In Proceedings of the 15th Annual IEEE Symposium On Logic In Computer Science, Santa Barbara, Californie, USA, 2000. pp. 95-106.

[bib.bib24] [24] Josée Desharnais, Vineet Gupta, Radha Jagadeesan, and Prakash Panangaden. Metrics for labelled Markov processes. Theoretical Computer Science, 318(3):323–354, 2004. doi:10.1016/j.tcs.2003.09.013.

[bib.bib25] [25] Josée Desharnais, R. Jagadeesan, Vineet Gupta, and P. Panangaden. The metric analogue of weak bisimulation for probabilistic processes. In Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science (LICS02), pages 413–422, Copenhagen, Denmark, July 2002. IEEE Computer Society. doi:10.1109/LICS.2002.1029849.

[bib.bib26] [26] Josée Desharnais, François Laviolette, and Mathieu Tracol. Approximate analysis of probabilistic processes: Logic, simulation and games. In Fifth International Conference on the Quantitative Evaluation of Systems (QEST 2008), 14-17 September 2008, Saint-Malo, France, pages 264–273. IEEE Computer Society, 2008. doi:10.1109/QEST.2008.42.

[bib.bib27] [27] Alessandro D’Innocenzo, Alessandro Abate, and Joost-Pieter Katoen. Robust PCTL model checking. In Thao Dang and Ian M. Mitchell, editors, Hybrid Systems: Computation and Control (part of CPS Week 2012), HSCC’12, Beijing, China, April 17-19, 2012, pages 275–286. ACM, 2012. doi:10.1145/2185632.2185673.

[bib.bib28] [28] A. Giacalone, P. Misra, and S. Prasad. Facile: A symmetric integration of concurrent and functional programming. In LNCS 352: TAPSOFT 89, 1989.

[bib.bib29] [29] Alessandro Giacalone, Chi-Chang Jou, and Scott A. Smolka. Algebraic reasoning for probabilistic concurrent systems. In Proc. IFIP TC2 Working Conference on Programming Concepts and Methods, pages 443–458. North-Holland, 1990.

[bib.bib30] [30] Antoine Girard and George J. Pappas. Approximate bisimulation relations for constrained linear systems. Automatica, 43(8):1307–1317, 2007. doi:10.1016/j.automatica.2007.01.019.

[bib.bib31] [31] Sergey Goncharov, Dirk Hofmann, Pedro Nora, Lutz Schröder, and Paul Wild. Kantorovich functors and characteristic logics for behavioural distances. In Orna Kupferman and Pawel Sobocinski, editors, Foundations of Software Science and Computation Structures - 26th International Conference, FoSSaCS 2023, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2023, Paris, France, April 22-27, 2023, Proceedings, volume 13992 of Lecture Notes in Computer Science, pages 46–67. Springer, 2023. doi:10.1007/978-3-031-30829-1_3.

[bib.bib32] [32] Ichiro Hasuo. Generic forward and backward simulations. In CONCUR 2006, pages 406–420. LNCS 4137, 2006. doi:10.1007/11817949_27.

[bib.bib33] [33] Ichiro Hasuo. Generic forward and backward simulations II: probabilistic simulation. In CONCUR 2010, pages 447–461. LNCS 6269, 2010. doi:10.1007/978-3-642-15375-4_31.

[bib.bib34] [34] Jesse Hughes and Bart Jacobs. Simulations in coalgebra. Theor. Comput. Sci., 327(1-2):71–108, 2004. doi:10.1016/J.TCS.2004.07.022.

[bib.bib35] [35] Bart Jacobs. Introduction to Coalgebra: Towards Mathematics of States and Observation, volume 59 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 2016. doi:10.1017/CBO9781316823187.

[bib.bib36] [36] Bart Jacobs, Alexandra Silva, and Ana Sokolova. Trace semantics via determinization. J. Comput. Syst. Sci., 81(5):859–879, 2015. doi:10.1016/J.JCSS.2014.12.005.

[bib.bib37] [37] Dexter Kozen. Semantics of probabilistic programs. Journal of Computer and Systems Sciences, 22:328–350, 1981. doi:10.1016/0022-0000(81)90036-2.

[bib.bib38] [38] Kim G. Larsen and Arne Skou. Bisimulation through probabilistic testing. Information and Computation, 94:1–28, 1991. doi:10.1016/0890-5401(91)90030-6.

[bib.bib39] [39] Radu Mardare, Prakash Panangaden, and Gordon D. Plotkin. Quantitative algebraic reasoning. In Martin Grohe, Eric Koskinen, and Natarajan Shankar, editors, Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS ’16, New York, NY, USA, July 5-8, 2016, pages 700–709. ACM, 2016. doi:10.1145/2933575.2934518.

[bib.bib40] [40] Radu Mardare, Prakash Panangaden, and Gordon D. Plotkin. On the axiomatizability of quantitative algebras. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017, pages 1–12. IEEE Computer Society, 2017. doi:10.1109/LICS.2017.8005102.

[bib.bib41] [41] R. Milner. Communication and Concurrency. Prentice Hall, 1989.

[bib.bib42] [42] Matteo Mio, Ralph Sarkis, and Valeria Vignudelli. Combining nondeterminism, probability, and termination: Equational and metric reasoning. In 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021, pages 1–14. IEEE, 2021. doi:10.1109/LICS52264.2021.9470717.

[bib.bib43] [43] Matteo Mio, Ralph Sarkis, and Valeria Vignudelli. Universal quantitative algebra for fuzzy relations and generalised metric spaces. Log. Methods Comput. Sci., 20(4), 2024. doi:10.46298/LMCS-20(4:19)2024.

[bib.bib44] [44] Prakash Panangaden. Labelled Markov Processes. Imperial College Press, 2009.

[bib.bib45] [45] Yu. V. Prokhorov. Convergence of random processes and limit theorems in probability theory. Theory of Probability & Its Applications, 1(2):157–214, 1956. doi:10.1137/1101016.

[bib.bib46] [46] Jurriaan Rot and Thorsten Wißmann. Bisimilar states in uncertain structures. In Paolo Baldan and Valeria de Paiva, editors, 10th Conference on Algebra and Coalgebra in Computer Science, CALCO 2023, Indiana University Bloomington, IN, USA, June 19-21, 2023, volume 270 of LIPIcs, pages 12:1–12:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.CALCO.2023.12.

[bib.bib47] [47] J.J.M.M. Rutten. Universal coalgebra: A theory of systems. Theoretical Computer Science, 249:3–80, 2000. doi:10.1016/S0304-3975(00)00056-6.

[bib.bib48] [48] R. Segala and N.A. Lynch. Probabilistic simulations for probabilistic processes. In Proc. Concur’94, pages 481–496. LNCS 836, 1994.

[bib.bib49] [49] A. Sokolova. Coalgebraic Analysis of Probabilistic Systems. PhD thesis, TU Eindhoven, 2005.

[bib.bib50] [50] Timm Spork, Christel Baier, Joost-Pieter Katoen, Jakob Piribauer, and Tim Quatmann. A spectrum of approximate probabilistic bisimulations. In Rupak Majumdar and Alexandra Silva, editors, 35th International Conference on Concurrency Theory, CONCUR 2024, September 9-13, 2024, Calgary, Canada, volume 311 of LIPIcs, pages 37:1–37:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, August 2024. doi:10.4230/LIPIcs.CONCUR.2024.37.

[bib.bib51] [51] David Sprunger, Shin-ya Katsumata, Jérémy Dubut, and Ichiro Hasuo. Fibrational bisimulations and quantitative reasoning. In Corina Cîrstea, editor, Coalgebraic Methods in Computer Science, pages 190–213, Cham, 2018. Springer International Publishing. doi:10.1007/978-3-030-00389-0_11.

[bib.bib52] [52] David Sprunger, Shin-ya Katsumata, Jérémy Dubut, and Ichiro Hasuo. Fibrational bisimulations and quantitative reasoning: Extended version. J. Log. Comput., 31(6):1526–1559, 2021. doi:10.1093/LOGCOM/EXAB051.

[bib.bib53] [53] Qiyi Tang and Franck van Breugel. Algorithms to compute probabilistic bisimilarity distances for labelled Markov chains. In Proc. CONCUR 2017, volume 85 of LIPIcs, pages 27:1–27:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPIcs.CONCUR.2017.27.

[bib.bib54] [54] Mathieu Tracol, Josée Desharnais, and Abir Zhioua. Computing distances between probabilistic automata. In Mieke Massink and Gethin Norman, editors, Proceedings Ninth Workshop on Quantitative Aspects of Programming Languages, QAPL 2011, Saarbrücken, Germany, April 1-3, 2011, volume 57 of EPTCS, pages 148–162, 2011. doi:10.4204/EPTCS.57.11.

[bib.bib55] [55] Franck van Breugel. Probabilistic bisimilarity distances. ACM SIGLOG News, 4(4):33–51, November 2017. doi:10.1145/3157831.3157837.

[bib.bib56] [56] Franck van Breugel, Claudio Hermida, Michael Makkai, and James Worrell. Recursively defined metric spaces without contraction. Theor. Comput. Sci., 380(1-2):143–163, 2007. doi:10.1016/j.tcs.2007.02.059.

[bib.bib57] [57] Franck van Breugel, Babita Sharma, and James Worrell. Approximating a behavioural pseudometric without discount for probabilistic systems. In Helmut Seidl, editor, FoSSaCS, volume 4423 of Lecture Notes in Computer Science, pages 123–137. Springer, 2007. doi:10.1007/978-3-540-71389-0_10.

[bib.bib58] [58] Franck van Breugel and James Worrell. Towards quantitative verification of probabilistic transition systems. In ICALP ’01: Proceedings of the 28th International Colloquium on Automata, Languages and Programming, pages 421–432, London, UK, 2001. Springer-Verlag. doi:10.1007/3-540-48224-5_35.

[bib.bib59] [59] Franck van Breugel and James Worrell. A behavioural pseudometric for probabilistic transition systems. Theor. Comput. Sci., 331(1):115–142, February 2005. doi:10.1016/j.tcs.2004.09.035.

[bib.bib60] [60] Rob J. van Glabbeek, Scott A. Smolka, and Bernhard Steffen. Reactive, generative and stratified models of probabilistic processes. Inf. Comput., 121(1):59–80, 1995. doi:10.1006/inco.1995.1123.

[bib.bib61] [61] Mingsheng Ying and Martin Wirsing. Approximate bisimilarity. In AMAST ’00: Proceedings of the 8th International Conference on Algebraic Methodology and Software Technology, pages 309–322, London, UK, 2000. Springer-Verlag. doi:10.1007/3-540-45499-3_23.
