Abstract 1 Introduction 2 Definitions 3 Unfolding Almost Non-Zeno Weighted Timed Games 4 Simplifying Transformations of Kernel Games 5 Value Iteration in Two-clock Kernel Games Conclusion: Extension to arbitrary weights References Appendix A Proofs of Section 4.4

Deciding the Value of Two-Clock Almost Non-Zeno Weighted Timed Games

Isa Vialard ORCID Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
Abstract

The Value Problem for weighted timed games (wtgs) consists in determining, given a two-player weighted timed game with a reachability objective and a rational threshold, whether or not the value of the game exceeds the threshold. When restrained to wtgs with non-negative weight, this problem is known to be undecidable for weighted timed games with three or more clocks, and decidable for one-clock wtgs. The Value Problem for two-clock non-negative wtgs, which remained stubbornly open for a decade, was recently shown to be undecidable. In this paper, we show that the Value Problem is decidable when considering two-clock almost non-Zeno wtgs.

Keywords and phrases:
Weighted timed games, decidability, real-time systems
Copyright and License:
[Uncaptioned image] © Isa Vialard; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Timed and hybrid models
Related Version:
arXiv Version: https://arxiv.org/abs/2508.00014
Acknowledgements:
I want to thank Quentin Guilmant for his help in formalizing the proofs in Section A.1.
Editors:
Stefano Guerrini and Barbara König

1 Introduction

Introduced by Alur and Dill ([2]) in the early 1990s, a timed automaton is an automaton where transitions are limited by time constraints on a set of finite clocks. Weighted timed automata, also known as priced timed automata, are timed automata with integer costs added to locations and transitions. These costs can be punctual, or linear in terms of time spent in a location. Timed automata and weighted timed automata are powerful models for real-time systems – for instance task scheduling, controller synthesis, energy-aware systems, etc.

Real-time systems often have to deal with perturbations from an uncontrollable environment (for instance, a user). This can be modelled by timed games: timed automata where transitions are divided among two players, the control, who has a reachability objective, versus the environment.

When adding costs to timed games, we obtain weighted timed games (wtgs): 𝖬𝗂𝗇 (the control) now attempts to reach a goal location while minimizing the cost of doing so, against her opponent 𝖬𝖺𝗑 (the environment).

A natural problem on wtgs is the following: Given a wtg, can we compute its Value, i.e., the infimum of the optimal cost111where the optimal cost is the supremum on all possible strategies of 𝖬𝖺𝗑 of the weight of the path produced by the strategy profile. on all strategies of 𝖬𝗂𝗇? Or, formulated as a decision problem: is its Value less than or equal to c?

This is the Value Problem, not to be confused with the Existence Problem: Given a wtg and a threshold c, does 𝖬𝗂𝗇 have a strategy to reach her goal location with cost at most c? These two problems can yield different answers (see Figure 1).

Figure 1: Example of a wtg where the Value is 0, but every strategy of 𝖬𝗂𝗇 yields cost >0.

In this paper, we focus on the Value Problem. While decidable for weighted timed automata, the Value Problem is undecidable for weighted timed games ([6]).

However, one can recover decidability by restricting the number of clocks. Bouyer et al. ([7]) establish that the Value problem is decidable for one-clock wtg with non-negative weights. This decidability result is extended to one-clock wtg with arbitrary weights in [16]. On the other hand, Bouyer et al. ([6]) prove undecidability for wtgs with non-negative weight and 3-clocks or more. Brihaye et al. ([10]) show undecidability of two-clock wtg with arbitrary weights. Only recently, Guilmant et al. ([15]) proved undecidability of two-clock, time-bounded wtg with non-negative weights.

Another way to recover decidability is with non-Zeno (or divergence) properties. A wtg with non-negative weights has a strictly non-Zeno cost property when every cycle is of cost at least 1. Intuitively, this property forbids any “Zeno paradox” behaviour. Strictly non-Zeno wtgs can be enfolded into acyclic wtgs; hence, the Value Problem is decidable ([5]). This result was generalized to wtgs with arbitrary weight in [11], for which non-Zenoness becomes divergence.222A wtg is divergent if every strongly connected component has either cycles of weight in (,1] or cycles of weight in [1,).

Divergence properties can be weakened into almost-divergence properties. A wtg with non-negative weight is said to be almost non-Zeno (or almost strictly non-Zeno, or almost strongly non-Zeno) if its cycles are of weight 0, or at least 1. Bouyer et al. ([6]) establish that the Value of such wtgs is approximable333i.e., can be computed to arbitrary precision. (but still undecidable). Busatto-Gaston et al. ([12]) extend this result to almost divergent444A wtg is almost divergent if every strongly connected component has either cycles of weight in (,1]{0}, or cycles of weight in {0}[1,). wtgs with arbitrary weights.

Figure 2: Landscape of WTG decidability and approximability.

Contributions

The main theorem of this paper is the following:

Theorem 1.

Given a two-player, turn-based, two-clocks, almost non-Zeno weighted timed game with non-negative integer weights, the Value Problem is decidable.

The proof of Theorem 1 relies on the partial unfolding used in the approximability proof of [6], and on several techniques to turn a wtg into an equivalent, simpler game with desirable properties, such as the relaxing of guards, or adding clock resets to every transition.

2 Definitions

Let 𝒳 be a finite set of clocks. Clock constraints (or guards) over 𝒳 are expressions of the form xn, where x,y𝒳 are clocks, {<,,=,,>} is a comparison symbol, and n is a natural number. We write 𝒞 to denote the set of all clock constraints over 𝒳. A valuation on 𝒳 is a function ν:𝒳0. For d0 we denote by ν+d the valuation such that, for every clock x𝒳, (ν+d)(x)=ν(x)+d. Let X𝒳 be a subset of all clocks. We write ν[X:=0] for the valuation such that, for every clock xX, ν[X:=0](x)=0, and ν[X:=0](y)=ν(y) for all other clocks yX. For C𝒞 a set of clock constraints over 𝒳, we say that the valuation ν satisfies C, denoted νC, if and only if all the comparisons in C hold when replacing each clock x by its corresponding value ν(x).

Definition 2.

A (turn-based) weighted timed game is given by a tuple 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w), where:

  • L𝖬𝗂𝗇 and L𝖬𝖺𝗑 are the (disjoint) sets of locations belonging to Players 𝖬𝗂𝗇 and 𝖬𝖺𝗑 respectively; we let L=L𝖬𝗂𝗇L𝖬𝖺𝗑 denote the set of all locations. (In drawings, locations belonging to 𝖬𝗂𝗇 are depicted by blue circles, and those belonging to 𝖬𝖺𝗑 are depicted by red squares.)

  • GL𝖬𝗂𝗇 are the goal locations.

  • 𝒳 is a set of clocks.

  • T(LG)×2𝒞×2𝒳×L is a set of (discrete) transitions. A transition C,X enables moving from location to location , provided all clock constraints in C are satisfied, and afterwards resetting all clocks in X to zero.

  • w:(LG)T is a weight function.

In the above, we assume that all data (set of locations, set of clocks, set of transitions, set of clock constraints) are finite.

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w) be a wtg. A configuration over 𝒢 is a pair (,ν), where L and ν is a valuation on 𝒳. Let d0 be a delay and t=C,XT be a discrete transition. One then has a delayed transition (or simply a transition if the context is clear) (,ν)d,t(,ν) provided that ν+dC and ν=(ν+d)[X:=0]. Intuitively, control remains in location for d time units, after which it transitions to location , resetting all the clocks in X to zero in the process. The weight of such a delayed transition is dw()+w(t), taking account both of the time spent in as well as the weight of the discrete transition t.

As noted in [13], without loss of generality one can assume that no configuration (other than those associated with goal locations) is deadlocked; in other words, for any location LG and valuation ν0𝒳, there exists d0 and tT such that (,ν)d,t(,ν).555This can be achieved by adding unguarded transitions to a sink location for all locations controlled by 𝖬𝗂𝗇 and unguarded transitions to a goal location for the ones controlled by 𝖬𝖺𝗑.

Let k. A run ρ of length k over 𝒢 from a given configuration (0,ν0) is a sequence of matching delayed transitions, as follows:

ρ=(0,ν0)d0,t0(1,ν1)d1,t1dk1,tk1(k,νk).

The weight of ρ is the cumulative weight of the underlying delayed transitions:

𝗐𝖾𝗂𝗀𝗁𝗍(ρ)=i=0k1(diw(i)+w(ti)).

An infinite run ρ is defined in the obvious way; however, since no goal location is ever reached, its weight is defined to be infinite: 𝗐𝖾𝗂𝗀𝗁𝗍(ρ)=+.

A run is maximal if it is either infinite or cannot be extended further. Thanks to our deadlock-freedom assumption, finite maximal runs must end in a goal location. We refer to maximal runs as plays.

We now define the notion of strategy. Recall that locations of 𝒢 are partitioned into sets L𝖬𝗂𝗇 and L𝖬𝖺𝗑, belonging respectively to Players 𝖬𝗂𝗇 and 𝖬𝖺𝗑. Let Player 𝖯{𝖬𝗂𝗇,𝖬𝖺𝗑}, and write 𝒢𝖯 to denote the collection of all non-maximal finite runs of 𝒢 ending in a location belonging to Player 𝖯. A strategy for Player 𝖯 is a mapping σ𝖯:𝒢𝖯0×T such that for all finite runs ρ𝒢𝖯 ending in configuration (,ν) with L𝖯, the delayed transition (,ν)d,t(,ν) is valid, where σ𝖯(ρ)=(d,t) and (,ν) is some configuration (uniquely determined by σ𝖯(ρ) and ν).

Let us fix a starting configuration (0,ν0), and let σ𝖬𝗂𝗇 and σ𝖬𝖺𝗑 be strategies for Players 𝖬𝗂𝗇 and 𝖬𝖺𝗑 respectively (one speaks of a strategy profile). Let us denote by 𝗉𝗅𝖺𝗒𝒢((0,ν0),σ𝖬𝗂𝗇,σ𝖬𝖺𝗑) the unique maximal run starting from configuration (0,ν0) and unfolding according to the strategy profile (σ𝖬𝗂𝗇,σ𝖬𝖺𝗑): in other words, for every strict finite prefix ρ of 𝗉𝗅𝖺𝗒𝒢((0,ν0),σ𝖬𝗂𝗇,σ𝖬𝖺𝗑) in 𝒢𝖯, the delayed transition immediately following ρ in 𝗉𝗅𝖺𝗒𝒢((0,ν0),σ𝖬𝗂𝗇,σ𝖬𝖺𝗑) is labelled with σ𝖯(ρ).

Recall that the objective of Player 𝖬𝗂𝗇 is to reach a goal location through a play whose weight is as small possible. Player 𝖬𝖺𝗑 has an opposite objective, trying to avoid goal locations, and, if not possible, to maximise the cumulative weight of any attendant play. This gives rise to the following two symmetrical definitions:

𝖵𝖺𝗅¯𝒢(0,ν0) =infσ𝖬𝗂𝗇{supσ𝖬𝖺𝗑{𝗐𝖾𝗂𝗀𝗁𝗍(𝗉𝗅𝖺𝗒𝒢((0,ν0),σ𝖬𝗂𝗇,σ𝖬𝖺𝗑))}} and
𝖵𝖺𝗅¯𝒢(0,ν0) =supσ𝖬𝖺𝗑{infσ𝖬𝗂𝗇{𝗐𝖾𝗂𝗀𝗁𝗍(𝗉𝗅𝖺𝗒𝒢((0,ν0),σ𝖬𝗂𝗇,σ𝖬𝖺𝗑))}}.

𝖵𝖺𝗅¯𝒢(0,ν0) represents the smallest possible weight that Player 𝖬𝗂𝗇 can possibly achieve, starting from configuration (0,ν0), against best play from Player 𝖬𝖺𝗑, and conversely for 𝖵𝖺𝗅¯𝒢(0,ν0): the latter represents the largest possible weight that Player 𝖬𝖺𝗑 can enforce, against best play from Player 𝖬𝗂𝗇.666Technically speaking, these values may not be literally achievable; however given any ε>0, both players are guaranteed to have strategies that can take them to within ε of the optimal value. As noted in [13], turned-based wtgs are determined, and therefore 𝖵𝖺𝗅¯𝒢(0,ν0)=𝖵𝖺𝗅¯𝒢(0,ν0) for any starting configuration (0,ν0); we denote this common value by 𝖵𝖺𝗅𝒢(0,ν0).

 Remark 3.

Note that 𝖵𝖺𝗅𝒢(0,ν0) can take on real numbers, or either of the values and +. However, since reachability is decidable in timed games, it is decidable whether 𝖵𝖺𝗅𝒢(0,ν0)=+ or not.

In the remainder of this paper, every weighted timed game is turn-based, with non-negative weights, of value in .

3 Unfolding Almost Non-Zeno Weighted Timed Games

Let us first give an informal definition of the region construction (𝒢) of a wtg 𝒢. In Section 4.2, we will give a more formal definition, in the special case of [0,1)-wtgs. The region partition of 𝒢 is the finest partition of the clock valuations into regions, where each region is defined by guards of the form xk or xyn with x,y𝒳, {<,,=,,>} and nN the largest constant in guards of 𝒢. For instance, {(x,y):x=3,N<y} and {(x,y): 1<x<2,3<y<4,x2<y3} are regions.

We denote by (𝒢) the region automaton associated with a wtg 𝒢(see [2]). In (𝒢), every location is assigned a unique region 𝗋𝖾𝗀() of accessible valuations.

Bouyer et al. ([6]) showed that, even though the Value Problem is undecidable for wtg with non-negative weight and 3 clocks or more, it is approximable in the subclass of almost non-Zeno wtg. In this section, we use the structure of their proof of approximability to prove decidability for almost non-Zeno wtgs with 2 clocks.

Definition 4 (Almost non-Zeno wtg).

A wtg 𝒢 is almost non-Zeno if there exists κ>0 such that for any finite run ρ in 𝒢 that follows a region cycle of (𝒢), 𝗐𝖾𝗂𝗀𝗁𝗍(ρ)=0 or κ.

 Remark 5.

It is decidable whether a weighted timed game is almost non-Zeno or not (by enumerating all simple cycles in the corner-point abstraction of 𝒢, see [4]).

In an acyclic wtg, the value is decidable and can be computed from the target locations up to the initial location, by computing for each node a function W:𝗋𝖾𝗀() which assigns to a valuation ν𝗋𝖾𝗀() the optimal weight 𝖵𝖺𝗅𝒢(,ν). By construction, every W is a piecewise linear function.

Intuitively, we will unfold cycles of weight κ to obtain a “tree-like” wtg where only cycles of weight 0 are left; we will deal with them separately.

Semi-unfolding

For any wtg 𝒢, let 𝒢~ be the semi-unfolded wtg built from (𝒢) in [6]:

First color in green every location and edge that are part of a cycle of weight 0. Observe that you can modify any wtg such that any green location has weight 0777[6] make a similar observation, but their construction implies adding a clock.: in a trimmed888A trimmed region wtg is a region wtg where we have erased inaccessible locations and redundant guards. It will be detailed later in Definition 13. region wtg, if a location L𝖯 of weight p>0 is part of a cycle of weight 0, then there exists an outgoing transition from with a guard x=0 for some clock x. Therefore, as in Figure 3, one can add a location 0 of weight 0 in L𝖯 such that:

  • every transition arriving in arrives in 0 instead.

  • every green transition leaving leaves 0 instead.

  • there is a transition 0x=0.

Thus let us assume that every green location has weight 0.

Figure 3: How to ensure that every green location has weight 0. Thick green transitions and locations are part of cycles of weight 0, locations labeled with weight 0 or p belong to the same player.

We define 𝒦 the kernel of 𝒢 as the restriction of (𝒢) to fully-green strongly connected components. Edges that leave 𝒦 are called the output edges of 𝒦.

Then we partially unfold (𝒢) into a finite tree structure 𝒯(𝒢) : starting from the initial location i as a root, we follow every possible path in 𝒢, with a node for each time we visit a (non kernel) location, as to avoid creating cycles. However when along a branch we enter the kernel in some location , we create a node 𝒦 instead of , and for each output edge t of 𝒦 accessible from , with t leading to a location , let (or 𝒦 if 𝒦) be a child of 𝒦, and continue to unfold from there.

We stop unfolding when, along any branch, a location or edge with positive weight of (𝒢) is visited at least W/κ+2 times, where W is an upper bound on the value of 𝒢.999obtained by using the corner-point abstraction, or considering a memoryless region-uniform strategy for 𝖬𝗂𝗇.

To obtain 𝒢~ from 𝒯(𝒢), replace each node 𝒦 by a copy of the strongly connected component of 𝒦 that contains (see [6] for the formal construction). Then 𝖵𝖺𝗅𝒢~(i,ν)=𝖵𝖺𝗅𝒢(i,ν) for any ν𝗋𝖾𝗀(i).

In the partially unfolded games 𝒢~ with three clocks or more, the cause of undecidability is inside the kernel nodes. Hence, in [6], the approximation happens in the kernels. However, with only two clocks, the value is decidable in kernel weighted timed games:

Definition 6 (Kernel weighted timed games).

A kernel weighted timed game 𝒢 is a [0,1]-wtg (L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w,w𝗈𝗎𝗍) where every location or transition is of weight 0, and each target location G has an output weight function w𝗈𝗎𝗍(,):𝗋𝖾𝗀() which is continuous and piecewise linear. In later notations, we omit w.

Theorem 7.

For any two-clock kernel wtg 𝒢, for any location i𝒢, Wi is a continuous piecewise linear function which can be computed through the value iteration algorithm.

This is the main technical result of this paper, which we will prove in Section 5. Let us show first how Theorem 7 entails value decidability of the partial enfolding 𝒢~:

Lemma 8.

For every node n in the tree 𝒯(𝒢), one can compute Wn a continuous piecewise linear function such that Wn:ν𝖵𝖺𝗅𝒢~(,ν), where is either n if n𝒦, or the entrance location of n=𝒦.

Proof.

In the tree structure of 𝒢~, consider a node n: if n is a leaf, then nG. Thus let Wn be the constant null function. Now consider that n is not a leaf, and by induction hypothesis assume that for every child n of n, Wn is continuous and piecewise linear. If n=𝒦 then

Wn:νinf/supnC,Xn𝒯(𝒢)ν+δCWn(ν+δ[X:=0]).

Thus by induction Wn is also continuous and piecewise linear.

Otherwise, n=𝒦 for some 𝒦. Let K be the SCC containing , and T𝗈𝗎𝗍 the output edges leaving from K. Consider the kernel wtg K=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w𝗈𝗎𝗍) where

  • G={t:tTout}.

  • For every 𝖯, L𝖯=KL𝖯.

  • T=T|K×K{C,Xt|t:C,X′′Tout}

  • for every t:C,X′′T𝗈𝗎𝗍, w𝗈𝗎𝗍(t,):νW′′(ν)+w(t), which is piecewise linear by induction hypothesis.

  • w maps to 0 always.

Then Wn=𝖵𝖺𝗅K(,), which is piecewise linear according to Theorem 7.

This is sufficient to conclude the proof of Theorem 1.

Theorem 1. [Restated, see original statement.]

Given a two-player, turn-based, two-clocks, almost non-Zeno weighted timed game with non-negative integer weights, the Value Problem is decidable.

4 Simplifying Transformations of Kernel Games

Before proving Theorem 7, let us apply some useful simplifying transformations that preserve the value. These transformations happen in four steps:

Step 1:

Transform a WTG into a WTG with clock values in [0,1).

Step 2:

Transform a wtg into a region trimmed wtg, i.e., a wtg where to every location is assigned a region, and without any “useless” transition, or guard on transition.

Step 3:

Transform a trimmed region kernel wtg by relaxing every strict guard into a strict-or-equal guard.

Step 4:

Transform a relaxed trimmed region kernel wtg such that every transition resets at least one clock.

Commentaries

Step 1 only serves to lighten notations in the rest of this paper. In terms of state complexity, the transformations 1+2 increase the number of locations as much as the classical region construction.

Trimming a wtg in step 2 is necessary: without it, Step 3 would create pathological cases where relaxing some guards would allow a player to take transitions that would have been unreachable in the original wtg.

Relaxing guards in step 3 is a technique that has merits of its own outside of the scope of the proof. For instance, the value of a wtg might not be reached by an optimal strategy for 𝖬𝗂𝗇, but could be the infimum produced by a set of strategies ϵ-close to an optimal strategy in the relaxed wtg.

Relaxing guard is also a prerequisite for step 4, which is the most important of all four steps. In a two-clock wtg, resetting at least one clock in each transition allows us to consider only regions in one dimension. Reducing a two-dimension problem to a one-dimension one is key to the termination argument in Section 5.

4.1 Restraining Clock Values to [0,1)

Before presenting the well-known notions of regions and region wtgs, let us first restrict the setting to [0,1)-wtgs, which will simplify the region notations.

Definition 9.

A [0,1)-wtg is a weighted timed game where for every reachable configuration (,ν), 0ν(x)<1 for any clock x.

Lemma 10.

For any wtg 𝒢, there is an equivalent [0,1)-wtg 𝒢.

Proof.

See [7], Proposition 2 for a detailed proof. Their proof is for 1-clock wtg, however, the construction can easily be generalized to any number of clocks.

The intuition of the construction is that the information of the integer parts of the clock can be contained in the locations, while the clocks keep track only of the fractional part: Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w) and let us build 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w). First, w.l.o.g., let us assume that all clocks are bounded by an integer M ([3]). Then, for 𝖯{𝖬𝗂𝗇,𝖬𝖺𝗑}, let L𝖯=L𝖯×M|𝒳|, and define T and w such that a valuation ν=(x1,,x|𝒳|)[0,1)𝒳 in a location (,n1,,n|𝒳|) in 𝒢 is equivalent to a valuation (n1+x1,,n|𝒳|+x|𝒳|) in location in 𝒢. Note that every transition of 𝒢, with a guard x=1 for some clock x, resets x.

4.2 Regions and Region Trimmed Games

Definition 11.

Let 𝒳 be a set of clocks in a [0,1)-wtg. A region over 𝒳 is a tuple r=(X0,,Xp,X=1) such that Xi for all 1ip, and {X0,,Xp,X=1} is a partition of 𝒳: 𝒳=i=0pXi

We denote by Reg𝒳 the set of regions over 𝒳. A valuation ν is said to belong to the region r, denoted by νr, whenever

  • x𝒳, ν(x)=0xX0,

  • x𝒳, ν(x)=1xX=1,

  • x,y𝒳, ν(x)<ν(y)<1i,j{0,,p} s.t. i<jxXiyXj.

For r=(X0,,Xp,X=1) a region and X a non-empty subset of 𝒳, we denote by r[X:=0] the region (X0X,X1X,,XpX,X=1X). In other words, r[X:=0] is the region such that if ν belongs to r then ν[X:=0] belongs to r[X:=0]. A [0,1)-region is a region (X0,,Xp,X=1) where X=1=. Let us abuse notation and denote r by (X0,,Xp). We denote by Reg𝒳< the set [0,1)-regions over 𝒳.

A time-successor of a region r, with r=(X0,,Xp)Reg𝒳< is a region r=(X0,,Xp,X=1) such that either r=r; or X0= and Xi=Xi1 for 1ip, and either Xp=X=1 or Xp=Xp+1 (and then X=1=).

We often abuse notation and write r for the set of valuations νr it represents.

Definition 12 (Region wtg [4, 12]).

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w) be a [0,1)-wtg. The region wtg (𝒢) of 𝒢 is the [0,1)-wtg (𝒢)=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w) with

  • L𝖯=L𝖯×Reg𝒳< for 𝖯{𝖬𝗂𝗇,𝖬𝖺𝗑}.

  • G=G×Reg𝒳<.

  • For every r=(X0,Xp)Reg𝒳<, for every r=(X0,,Xp,X=1)Reg𝒳 a time-successor of r, if C,XT then (,r)CC(r),X(,r[X:=0]), with

    C(r)={(x=0):xX0}{(x=1):xX=1}{(0<x<1):xXi,1ip}.
  • For L𝖬𝗂𝗇L𝖬𝖺𝗑 and rReg𝒳<, w(,r)=w().

  • For t=(,r)CC(r),X(,r[X:=0])T, w(t)=w(C,X).

While applying simplifying transformations to (𝒢), we wish to preserve the “one-region-per-location” property. Thus let us formally define what is a region wtg (as opposed to the region wtg). A wtg 𝒢 is a region weighted timed game if there is a region assignment 𝗋𝖾𝗀:LReg𝒳<, such that for any transition t:C,X, the valuations ν+δ with ν𝗋𝖾𝗀() and δ0 that satisfy C are contained in a unique region r, such that r[X:=0]=𝗋𝖾𝗀(). Furthermore, for any initial configuration (i,ν), we require ν𝗋𝖾𝗀(i).

Obviously (𝒢) is a region wtg for any [0,1)-wtg 𝒢. For any location in a region wtg, let X=𝖽𝖾𝖿Xp with 𝗋𝖾𝗀()=(X0,Xp).

Let us now “trim” (𝒢), i.e., delete every useless transition, and every useless guard on transitions :

Definition 13 (Trimmed region wtg).

A region [0,1)-wtg 𝒢 is trimmed if for any transition t:C,X and any region r=𝗋𝖾𝗀() in 𝒢,

  • for any valuation νr there exist some δ0 such that ν+δC.

  • for any cC, there exists a valuation νr and some δ0 such that ν+δ is in [0,1]𝒳 and ν+δ⊧̸c.

In other words, there are no inaccessible transitions from any tuple location-region. Furthermore, there are no unnecessary clauses in C (the ones that are always verified from the region). Removing inaccessible transitions and unnecessary clauses can always be done from any region wtg without change in value.

Since every [0,1)-wtg 𝒢 is equivalent to the region [0,1)-wtg (𝒢), and every region wtg is equivalent to a trimmed region wtg, we can always assume 𝒢 to be a trimmed region wtg.

Observation 14.

For 𝒢 a trimmed region [0,1)-wtg, for any transition t:C,X, with 𝗋𝖾𝗀()=(X0,,Xp):

  • if (y=0)C or (y>0)C for some clock y then yX0, in other words y is 0 on the whole region r.

  • if (y=1)C for some clock y, then yXp in other words y is one of the clocks with largest value in r.

  • there cannot be both x=0 and y=1 in C for any two clocks x,y.

Proof.

For the first point, notice that if y was not 0, then either the transition or the clause would have been trimmed. For the second one, if yXp then upon taking the transition with condition y=1 there is a clock xXp such that x>y=1. For the third point, for any valuation in r, there is no δ such that ν+δ satisfies both clause.

4.3 Relaxing Strict Guards

A kernel wtg can easily be transformed into a kernel wtg without strict guards, without change in value.

Definition 15.

Let r=(X0,,Xp). Then the adherence or r, denoted by r¯, is the set of regions of the form (Y0,,Yp,Y=1) with pp and ι:[0,p+1][0,p] strictly increasing such that ι(p+1)=p and Y0=X0Xι0 and Yi=Xι(i1)+1Xι(i) for all 1ip and Y=1=Xι(p)+1Xι(p+1).

Let us abuse notation and write νr¯ when νr for rr¯.

Lemma 16.

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w𝗈𝗎𝗍) be a trimmed region kernel wtg. Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w𝗈𝗎𝗍) be a copy of 𝒢 where

  • every guard has been relaxed, i.e., every guard of the form x>0 and x<1 have been replaced by x0 or x1, respectively, for all x𝒳.

  • For any G, the output function w𝗈𝗎𝗍(,) is w𝗈𝗎𝗍(,) extended continuously to 𝗋𝖾𝗀()¯ in 𝒢.

Then 𝖵𝖺𝗅𝒢=𝖵𝖺𝗅𝒢.

The proof of this theorem relies on a bisimulation argument that is detailed in the appendix.

Note that 𝒢 is not a [0,1)-wtg, but a [0,1]-wtg, i.e., every accessible valuation is in [0,1]𝒳. Furthermore, 𝒢 is not a region wtg:

Definition 17.

A relaxed region wtg is a [0,1]-wtg without strict guard where to each location is assigned a region 𝗋𝖾𝗀(): for any transition t:C,X, the valuations ν+δ, with ν𝗋𝖾𝗀())¯ and δ0, that satisfy C are contained in r¯ for a unique region r, such that r[X:=0]=𝗋𝖾𝗀(). Furthermore, initial configuration (i,ν) must verify ν𝗋𝖾𝗀(i)¯. A relaxed trimmed region wtg is a relaxed region wtg if for any transition t:C,X and any region r=𝗋𝖾𝗀() in 𝒢,

  • for any valuation νr¯ there exist some δ0 such that ν+δC.

  • for any cC, there exists a valuation νr¯ and some δ0 such that ν+δ is in [0,1]𝒳 and ν+δ⊧̸c.

Lemma 18.

Let 𝒢 be a region trimmed [0,1)-wtg of region assignment 𝗋𝖾𝗀. Let 𝒢 be a copy of 𝒢 where

  • every guard has been relaxed, i.e., every guard of the form x>0 and x<1 have been replaced by x0 or x1 respectively, for all x𝒳.

  • useless guards (see the second point of Definition 17) have been removed.

Then 𝒢 is a relaxed trimmed region wtg, of region assignment 𝗋𝖾𝗀.

4.4 Adding Resets to every Transition

Lemma 19.

For any relaxed region trimmed kernel [0,1]-wtg 𝒢, such that 𝒢 has no requirement x<1 for any x𝒳 then there exists a relaxed region trimmed kernel [0,1]-wtg 𝒢 of same value and verifying the same conditions such that every transition of 𝒢 is a reset transition or a transition to the target location. Furthermore, any transition of 𝒢 with, for some clock x, a guard of the form x=0 or x=1, resets x.

See Section A.2 for the proof.

5 Value Iteration in Two-clock Kernel Games

In this section, we prove Theorem 7 using the value iteration paradigm (see [1, 9, 8]):

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w or w𝗈𝗎𝗍) be a trimmed region (kernel) wtg. In a wtg, the value iteration algorithm builds, for each location and for all k0, a function 𝗈𝗉𝗍k:0𝒳 such that 𝗈𝗉𝗍k(ν) is the value of the game started in with clock valuation ν, where 𝖬𝗂𝗇 has to win in at most k steps. The 𝗈𝗉𝗍 functions are built inductively:

  • 𝗈𝗉𝗍0 is the constant 0 function if G (or w𝗈𝗎𝗍(,) in the case of a kernel wtg), or the constant + function otherwise.

  • for any k, 𝗈𝗉𝗍k+1 is obtained from the 𝗈𝗉𝗍 functions at step k: if belongs to 𝖬𝗂𝗇 (resp. 𝖬𝖺𝗑), then

    𝗈𝗉𝗍k+1(ν)=inf(resp. sup{𝗈𝗉𝗍k((ν+δ)[X:=0]):C,XT,ν+δC}.

Note that 𝗈𝗉𝗍k(ν)=𝖵𝖺𝗅k(,ν) the value of the game started from configuration ,ν when Min must reach G in at most k steps. Naturally, 𝗈𝗉𝗍k+1(ν)𝗈𝗉𝗍k(ν) for all valuation ν. If there exists k such that, for all locations , 𝗈𝗉𝗍k+1=𝗈𝗉𝗍k, then the value iteration algorithm terminates.

In general, there is no termination guarantee. However, if there exists k such that 𝗈𝗉𝗍k+1=𝗈𝗉𝗍k for all , then 𝗈𝗉𝗍k(ν)=𝖵𝖺𝗅𝒢(,ν). This means that the value of the wtg is obtained even when considering plays of length at most k.

Here is an example where the value iteration algorithm does not terminate.

Figure 4: A wtg where the value iteration algorithm does not terminate. Blue circle locations belong to 𝖬𝗂𝗇, red square locations belong to 𝖬𝖺𝗑, the green circle location G is the target. The flags serve to easily refer to locations. The +1 label is a transition cost.
Example 20.

The wtg 𝒢 in Figure 4 is a 3-clock, almost non-Zeno wtg with a value of 1. The kernel of 𝒢 contains only the cycle between and . The cost of the output edge in is a+2δ for a valuation (x,y,t)=(δ,a+δ,0), whereas the cost of the output edge in is exactly 1, for a valuation (x,y,t)=(0,1,0).

An optimal strategy for 𝖬𝗂𝗇 is to loop in the kernel an arbitrary number of times: In this strategy, each time she enters with valuation (x,y,t)=(0,a,t), she should waits δ such that 2δ=1a and enters , then 𝖬𝖺𝗑 can either reach G with a cost of exactly a+2δ=1, or return to . When 𝖬𝗂𝗇 decides to end the game, she then picks δ=1a instead. Then 𝖬𝖺𝗑 chooses between going to G at cost 1+(1a), or letting 𝖬𝗂𝗇 leave from at cost 1. Each time 𝖬𝗂𝗇 takes the cycle with delay δ such that 2δ=1a , y gets closer to 1, thus minimizing the cost of picking δ=1a at some point. Thus 𝖬𝗂𝗇 has a strategy to reach G with cost >1, but arbitrarily close to 1 depending on how long she plays. This entails that the value of this wtg is obtained by considering arbitrarily long plays. Thus, the value iteration algorithm does not terminate on this example.

However, the value iteration algorithm terminates for two-clock kernel wtgs.

Theorem 7. [Restated, see original statement.]

For any two-clock kernel wtg 𝒢, for any location i𝒢, Wi is a continuous piecewise linear function which can be computed through the value iteration algorithm.

Proof.

Without loss of generality, let us assume that 𝒢 is a relaxed trimmed region kernel [0,1]-wtg where every transition is either a transition to a target location, or resets at least one clock (see Lemmas 10, 16, and 19). Furthermore, we assume that there is no transition to the initial location i.101010This can be done by making a copy of the initial location, such that all transition that should enter the initial location only enter the copy instead.

For every location G{i}, the set of valuations 𝗋𝖾𝗀()¯ is either {(0,y):y[0,1]} or {(x,0):x[0,1]}, or the singleton {(0,0)}. Note that the 𝗈𝗉𝗍 functions will be defined on 𝗋𝖾𝗀()¯ for any location . This entails that the value iteration algorithm will mainly build 1-dimensional functions.

Let us highlight this observation with a subtle change of variable: Let Δ the circular clock difference of a valuation ν𝗋𝖾𝗀()¯ be defined as:

Δ(ν)={y when ν=(0,y) with y0,1x when ν=(x,0) with x>0.

Now, for any G{i}, for any k, let 𝖮𝗉𝗍k(Δ(ν))=𝖽𝖾𝖿𝗈𝗉𝗍k(ν) for all ν𝗋𝖾𝗀()¯. The function 𝖮𝗉𝗍k is either defined on {0} if 𝗋𝖾𝗀()¯={(0,0)}, or on [0,1].

Figure 5: Evolution of Δ in a one-clock-reset transition, without guards =0 or =1, from region {(0,y):y[0,1]}.
Figure 6: Evolution of Δ in a one-clock-reset transition, without guards =0 or =1, from region {(x,0):x[0,1]}.

The following observation motivates this change of variable: In Figures 5 and 6, we consider a transition t, from locations to such that the 𝖮𝗉𝗍 and 𝖮𝗉𝗍 functions are defined on [0,1], with no =0 or =1 guards. For a transition (,ν)t,δ(,ν), we then observe that Δ(ν)[Δ(ν),1] if ν(x)=0 (Figure 5), and Δ(ν)[0,Δ(ν)] if ν(y)=0 (Figure 6). Thus the variation of Δ only depends on the region of , not . In other words, it does not matter which clocks are reset by a transition, just the number of clocks that are reset.

Let us now define the induction relation between the 𝖮𝗉𝗍 functions. Let be a location such that G{i}. Moreover, assume that belongs to 𝖬𝗂𝗇 (resp. 𝖬𝖺𝗑 ). For k=0, 𝖮𝗉𝗍k=Δ+. Consider 𝖫𝖾𝖺𝗏𝗂𝗇𝗀() the set of outgoing transition from a location belonging to 𝖬𝗂𝗇 (resp. 𝖬𝖺𝗑 ), with i. For each t:C,X, for any k, let us define a function 𝖮𝗉𝗍kt such that 𝖮𝗉𝗍k+1=mint𝖫𝖾𝖺𝗏𝗂𝗇𝗀()𝖮𝗉𝗍kt (resp. max):

  • if is a goal location, then for all k,

    • If 𝗋𝖾𝗀()¯={(0,y):y[0,1]}, then

      𝖮𝗉𝗍kt=𝖮𝗉𝗍0t:Δinf0δ1Δ(δ,Δ+δ)Cw𝗈𝗎𝗍(,(δ,Δ+δ)[X:=0]) (resp. sup).
    • If 𝗋𝖾𝗀()¯={(x,0):x[0,1]}, then

      𝖮𝗉𝗍kt=𝖮𝗉𝗍0t:Δinf0δΔ(1Δ+δ,δ)Cw𝗈𝗎𝗍(,(1Δ+δ,δ)[X:=0]) (resp. sup).
    • If 𝗋𝖾𝗀()¯={0,0}, then 𝖮𝗉𝗍kt(0)=𝖮𝗉𝗍0t(0)=inf0δ1(δ,δ)Cw𝗈𝗎𝗍(,(δ,δ)[X:=0]) (resp. sup).

  • if is not a goal location, then X (t resets one or two clocks).

    • If X={x,y} then, since 𝒢 is almost trimmed, every valuation in 𝗋𝖾𝗀()¯ can reach (,(0,0)). Therefore 𝖮𝗉𝗍kt=Δ𝖮𝗉𝗍k(0).

    • Otherwise, 𝖮𝗉𝗍k is defined on [0,1]. Then:

      • *

        If 𝗋𝖾𝗀()¯={(0,y):y[0,1]},

        • ·

          If (x=0)C (thus X={x}) or (y=1)C (thus X={y}),111111Note that since 𝒢 is almost trimmed, both guards cannot be in C at the same time. then this forces a delay such that t preserves Δ.

        • ·

          Otherwise, as observed in Figure 5,

          𝖮𝗉𝗍kt:ΔinfΔΔ1𝖮𝗉𝗍k(Δ) (resp. sup).
      • *

        Symetrically, if 𝗋𝖾𝗀()¯={(x,0):x[0,1]},

        • ·

          If (y=0)C or (x=1)C, then 𝖮𝗉𝗍kt=Δ𝖮𝗉𝗍k(Δ).

        • ·

          Otherwise, as observed in Figure 6,

          𝖮𝗉𝗍kt:Δinf0ΔΔ𝖮𝗉𝗍k(Δ) (resp. sup).
      • *

        If 𝗋𝖾𝗀()¯={(0,0)}, 𝖮𝗉𝗍kt(0)=inf0Δ1𝖮𝗉𝗍k(Δ) (resp. sup).

We call the functions 𝖮𝗉𝗍0t, for all transitions t to a goal location, the projected output functions of 𝒢. The projected output functions serve to initialize the value iteration algorithm on the 𝖮𝗉𝗍 functions, as do w𝗈𝗎𝗍(,) functions for the value iteration algorithm on the 𝗈𝗉𝗍 functions. Observe that, since w𝗈𝗎𝗍(,) is piecewise linear and continuous for any G, the projected output functions are continuous piecewise linear functions.

Figure 7: 𝖮𝗉𝗍k induction relation, with t a transition from a 𝖬𝗂𝗇 location where xy. t has no =0 or =1 guards, and applies y:=0.

Thus the functions 𝖮𝗉𝗍k are by construction continuous piecewise linear functions. Furthermore, as can be seen in Figure 7, 𝖮𝗉𝗍kt is obtained from 𝖮𝗉𝗍k by replacing 𝖮𝗉𝗍k on a finite number of intervals by constant functions Δc (while preserving continuity) where the constants c are taken among local extremums of 𝖮𝗉𝗍k.

Hence every linear piece of a function 𝖮𝗉𝗍k is:

  • either equal to some projected output function,

  • or of slope 0, equal to some z+, where z is a local extremum of some function 𝖮𝗉𝗍k for k<k. Hence, by induction, z is either a local extremum of some projected output function, or a local extremum of the minimum or maximum of two projected output functions.

There is a finite number of such pieces, hence only a finite number of way they can be assembled to make a continuous piecewise linear function on [0,1]. Thus there is a finite number of functions of this form.

Since the 𝗈𝗉𝗍 and 𝖮𝗉𝗍 functions decrease at each iteration (𝖮𝗉𝗍k+1(Δ)𝖮𝗉𝗍k(Δ)), then the value iteration algorithm on the 𝖮𝗉𝗍 functions terminates.

Furthermore, since no transition enters i, 𝗈𝗉𝗍ki does not affect 𝗈𝗉𝗍k+1 for any location . Therefore, the value iteration algorithm on the 𝗈𝗉𝗍 functions aside from 𝗈𝗉𝗍i stabilizes. Thus, if it terminates in k steps for every location except i, then, adding 𝗈𝗉𝗍i, the value iteration algorithm terminates in at most k+1 steps.

Note that all transformations in Section 4.4 only serve to make the termination argument of Theorem 7 more visible. However, the value iteration algorithm on two-clock kernel wtgs terminates even without these simplifications. Indeed, termination in k steps entails that 𝖬𝗂𝗇 needs only to consider strategies that access a goal location in at most k steps. Since the transformations described in Section 4 do not make arbitrarily long paths equivalent to one shorter path, then termination of the value algorithm on the transformed kernel wtgs immediately implies termination on the original kernel wtgs. This in turn entails that the value iteration algorithm terminates on the semi-unfolding 𝒢~, thus on any two-clock wtg with non-negative weight.

Complexity analysis

Assume that the output weight functions of a kernel game 𝒢 consist of a total of k pieces. Then the number of possible 𝖮𝗉𝗍 functions, which bounds the number of steps of the value iteration algorithm according to Theorem 7, is exponential in k.

In the semi-unfolding of a weighted timed game 𝒢, the number of pieces of the piecewise linear value functions Wn of nodes n increases at most exponentially along a branch. Each branch has length bounded by |(𝒢)|(W/κ+2) (see [6]). Since W|(𝒢)|P, where P is the maximal rate appearing in the automaton, applying the Value Iteration algorithm requires double exponential time in |(𝒢)|.

Finally, the size of 𝒯(𝒢) is bounded by |(𝒢)||(𝒢)|(W/κ+2)+1, hence computing the Value is doubly exponential in |(𝒢)| (or triple-exponential in the size of the original game).

Note that the approximation algorithm for almost non-Zeno wtgs given in [6] is in double-exponential time. Hence computing the exact value of a two-clock almost non-Zeno wtg is exponentially more complex than computing an approximation.

Conclusion: Extension to arbitrary weights

Decidability actually holds even for almost divergent wtgs with arbitrary (negative and positive) weights. There exists a similar semi-unfolding construction in [12], for approximation of almost-divergent wtgs with arbitrary weights. The main technical difficulty is that a kernel in this semi-unfolding is defined as an SCC where every cycle is of weight 0. However, this does not mean that every location or transition of the kernel has weight 0. Thus accumulated cost can increase (or decrease) while crossing a kernel, even though cycling in a SCC does not change the accumulated cost. The construction to transform such a kernel game into a zero-weight kernel game is quite technical; it will be developed in a journal version of this article.

References

  • [1] Rajeev Alur, Mikhail Bernadsky, and P. Madhusudan. Optimal reachability for weighted timed games. In Josep Díaz, Juhani Karhumäki, Arto Lepistö, and Donald Sannella, editors, Automata, Languages and Programming, pages 122–133, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg. doi:10.1007/978-3-540-27836-8_13.
  • [2] Rajeev Alur and David L. Dill. A theory of timed automata. Theor. Comput. Sci., 126(2):183–235, 1994. doi:10.1016/0304-3975(94)90010-8.
  • [3] Gerd Behrmann, Ansgar Fehnker, Thomas Hune, Kim Guldstrand Larsen, Paul Pettersson, Judi Romijn, and Frits W. Vaandrager. Minimum-cost reachability for priced timed automata. In Maria Domenica Di Benedetto and Alberto L. Sangiovanni-Vincentelli, editors, Hybrid Systems: Computation and Control, 4th International Workshop, HSCC 2001, Rome, Italy, March 28-30, 2001, Proceedings, volume 2034 of Lecture Notes in Computer Science, pages 147–161, Berlin, Heidelberg, 2001. Springer. doi:10.1007/3-540-45351-2_15.
  • [4] Patricia Bouyer, Ed Brinkshma, and Kim G. Larsen. Optimal infinite scheduling for multi-priced timed automata. Formal Methods Syst. Des., 32:3–23, 2008. doi:10.1007/s10703-007-0043-4.
  • [5] Patricia Bouyer, Franck Cassez, Emmanuel Fleury, and Kim Guldstrand Larsen. Optimal strategies in priced timed game automata. In Kamal Lodaya and Meena Mahajan, editors, FSTTCS 2004: Foundations of Software Technology and Theoretical Computer Science, 24th International Conference, Chennai, India, December 16-18, 2004, Proceedings, volume 3328 of Lecture Notes in Computer Science, pages 148–160, Berlin, Heidelberg, 2004. Springer. doi:10.1007/978-3-540-30538-5_13.
  • [6] Patricia Bouyer, Samy Jaziri, and Nicolas Markey. On the Value Problem in Weighted Timed Games. In Luca Aceto and David de Frutos Escrig, editors, 26th International Conference on Concurrency Theory (CONCUR 2015), volume 42 of Leibniz International Proceedings in Informatics (LIPIcs), pages 311–324, Dagstuhl, Germany, 2015. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CONCUR.2015.311.
  • [7] Patricia Bouyer, Kim Guldstrand Larsen, Nicolas Markey, and Jacob Illum Rasmussen. Almost optimal strategies in one clock priced timed games. In S. Arun-Kumar and Naveen Garg, editors, FSTTCS 2006: Foundations of Software Technology and Theoretical Computer Science, 26th International Conference, Kolkata, India, December 13-15, 2006, Proceedings, volume 4337 of Lecture Notes in Computer Science, pages 345–356, Berlin, Heidelberg, 2006. Springer. doi:10.1007/11944836_32.
  • [8] Thomas Brihaye, Gilles Geeraerts, Axel Haddad, Engel Lefaucheux, and Benjamin Monmege. One-clock priced timed games with negative weights. Logical Methods in Computer Science, Volume 18, Issue 3, August 2022. doi:10.46298/lmcs-18(3:17)2022.
  • [9] Thomas Brihaye, Gilles Geeraerts, Axel Haddad, and Benjamin Monmege. Pseudopolynomial iterative algorithm to solve total-payoff games and min-cost reachability games. Acta Informatica, 54(1):85–125, February 2017. doi:10.1007/s00236-016-0276-z.
  • [10] Thomas Brihaye, Gilles Geeraerts, Shankara Narayanan Krishna, Lakshmi Manasa, Benjamin Monmege, and Ashutosh Trivedi. Adding negative prices to priced timed games. CoRR, abs/1404.5894, 2014. doi:10.48550/arXiv.1404.5894.
  • [11] Damien Busatto-Gaston, Benjamin Monmege, and Pierre-Alain Reynier. Optimal reachability in divergent weighted timed games. In Javier Esparza and Andrzej S. Murawski, editors, Foundations of Software Science and Computation Structures, pages 162–178, Berlin, Heidelberg, 2017. Springer Berlin Heidelberg. doi:10.1007/978-3-662-54458-7_10.
  • [12] Damien Busatto-Gaston, Benjamin Monmege, and Pierre-Alain Reynier. Symbolic Approximation of Weighted Timed Games. In Sumit Ganguly and Paritosh Pandya, editors, 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2018), volume 122 of Leibniz International Proceedings in Informatics (LIPIcs), pages 28:1–28:16, Dagstuhl, Germany, 2018. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.FSTTCS.2018.28.
  • [13] Damien Busatto-Gaston, Benjamin Monmege, and Pierre-Alain Reynier. Optimal controller synthesis for timed systems. Log. Methods Comput. Sci., 19(1), 2023. doi:10.46298/lmcs-19(1:20)2023.
  • [14] Quentin Guilmant and Joël Ouaknine. Inaproximability in Weighted Timed Games. In Rupak Majumdar and Alexandra Silva, editors, 35th International Conference on Concurrency Theory (CONCUR 2024), volume 311 of Leibniz International Proceedings in Informatics (LIPIcs), pages 27:1–27:15, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.CONCUR.2024.27.
  • [15] Quentin Guilmant, Joël Ouaknine, and Isa Vialard. The value problem for weighted timed games with two clocks is undecidable, 2025. doi:10.48550/arXiv.2507.10550.
  • [16] Benjamin Monmege, Julie Parreaux, and Pierre-Alain Reynier. Decidability of one-clock weighted timed games with arbitrary weights. CoRR, abs/2207.01608, 2022. doi:10.48550/arXiv.2207.01608.

Appendix A Proofs of Section 4.4

A.1 Proof of Lemma 16

Definition 21 (Simulation and bisimulation).

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w) and 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w) be wtgs. Then a relation R(L𝖬𝗂𝗇×𝒳×L𝖬𝗂𝗇×𝒳)(L𝖬𝖺𝗑×𝒳×L𝖬𝖺𝗑×𝒳) is a simulation relation when

(,ν)R(,ν){G and G,or, for all tTδ, there exist t and δ such that if (,ν)δ,t(l2,ν2) then (,ν)δ,t(2,ν2) where (2,ν2)R(2,ν2).

R is a bisimulation if R and its converse are both simulations.

Example 22.

The region relation R between configurations of a same wtg, where (l1,ν1)R(l2,ν2) iff l1=l2 and ν1,ν2 are in the same region, is a bisimulation.

A (bi)simulation relation R can be extended to runs: ρ1Rρ2 when |ρ1|=|ρ2|, and ρ1𝒞(n)Rρ2𝒞(n) for all n<|ρi|.

Lemma 23.

Let 𝒢1 and 𝒢2 be two wtgs. Let R be a bisimulation relation between 𝒢1 and 𝒢2, such that 𝒢1 and 𝒢2 start from configurations c1Rc2. Assume that 𝖵𝖺𝗅𝒢i+ for i{1,2}. Then

|𝖵𝖺𝗅𝒢1𝖵𝖺𝗅𝒢2|sup{|𝗐𝖾𝗂𝗀𝗁𝗍(ρ1)𝗐𝖾𝗂𝗀𝗁𝗍(ρ2)||ρ1,ρ2 plays of 𝒢1,𝒢2ρ1Rρ2}.
Proof.

For all σ1𝖬𝗂𝗇 a strategy for Min on 𝒢1 and σ2𝖬𝖺𝗑 a strategy for Max on 𝒢2, there exists σ2𝖬𝗂𝗇 and σ1𝖬𝖺𝗑 strategies for Min and Max on 𝒢2 and 𝒢1 respectively, such that ρ1Rρ2 for ρi=play(ci,σi𝖬𝗂𝗇,σi𝖬𝖺𝗑). (In any Min location, σ2𝖬𝗂𝗇 simulates σ1𝖬𝗂𝗇 following R. Similarly in any Max location, σ1𝖬𝖺𝗑 simulates σ2𝖬𝖺𝗑 following R.)

Note that σ1𝖬𝖺𝗑,σ2𝖬𝗂𝗇 and ρ1,ρ2 are functions of σ1𝖬𝗂𝗇,σ2𝖬𝖺𝗑. However, we omit these arguments to lighten notations.

Since the set of strategies obtained by such a simulation is included in the set of all strategies,

supσ2𝖬𝖺𝗑𝗐𝖾𝗂𝗀𝗁𝗍(ρ1)supσ𝖬𝖺𝗑V𝒢1(σ1𝖬𝗂𝗇,σ𝖬𝖺𝗑) for a fixed σ1𝖬𝗂𝗇,
and
infσ1𝖬𝗂𝗇𝗐𝖾𝗂𝗀𝗁𝗍(ρ2)infσ𝖬𝗂𝗇V𝒢2(σ𝖬𝗂𝗇,σ2𝖬𝖺𝗑) for a fixed σ2𝖬𝖺𝗑.

Therefore

infσ1𝖬𝗂𝗇supσ2𝖬𝖺𝗑𝗐𝖾𝗂𝗀𝗁𝗍(ρ1)infσ𝖬𝗂𝗇supσ𝖬𝖺𝗑𝗐𝖾𝗂𝗀𝗁𝗍(play𝒢1(c1,σ𝖬𝗂𝗇,σ𝖬𝖺𝗑))=𝖵𝖺𝗅𝒢1

and

infσ1𝖬𝗂𝗇supσ2𝖬𝖺𝗑𝗐𝖾𝗂𝗀𝗁𝗍(ρ2) supσ2𝖬𝖺𝗑infσ1𝖬𝗂𝗇𝗐𝖾𝗂𝗀𝗁𝗍(ρ2)
supσ𝖬𝖺𝗑infσ𝖬𝗂𝗇𝗐𝖾𝗂𝗀𝗁𝗍(play𝒢2(c2,σ𝖬𝗂𝗇,σ𝖬𝖺𝗑))=𝖵𝖺𝗅𝒢2

Therefore 𝖵𝖺𝗅𝒢2𝖵𝖺𝗅𝒢1sup{𝗐𝖾𝗂𝗀𝗁𝗍(ρ2)𝗐𝖾𝗂𝗀𝗁𝗍(ρ1):ρ1Rρ2}.

Reasoning in mirror, one can obtain

𝖵𝖺𝗅𝒢1𝖵𝖺𝗅𝒢2sup{𝗐𝖾𝗂𝗀𝗁𝗍(ρ1)𝗐𝖾𝗂𝗀𝗁𝗍(ρ2):ρ1Rρ2}.

and combine to conclude.

Now let us define the following relation between valuations: For ϵ>0, two valuations ν,ν are ϵ-neighbours if there exists ϵ1,ϵ20 such that ϵ1+ϵ2<ϵ, and for any clock x𝒳, ν(x)ν(x)[ϵ1,ϵ2].

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w𝗈𝗎𝗍) be a trimmed region weighted timed game. We consider 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w𝗈𝗎𝗍) a copy of 𝒢 where every strict guard has been relaxed into a strict-or-equal guard.

Lemma 24.

For any configurations (,ν) in 𝒢, and (,ν) in 𝒢,

Let (,ν)Rϵ(,ν) iff

  • =

  • ν𝗋𝖾𝗀() and ν𝗋𝖾𝗀()¯

  • ν and ν are ϵ-neighbours

Then Rϵ is a bisimulation relation.

Proof.

First observe that one side of the bisimulation is easier than the other one, since transitions in 𝒢 are more permissive. Therefore we will detail only the other side:

Let t:1C,X2. Let C¯ be the relaxation of C. Let ν1 be a valuation belonging to 𝗋𝖾𝗀(1) in 𝒢 and let ν1 be an ϵ-neighbour of ν1. By definition, there are ϵ1,ϵ20 such that ϵ1+ϵ2<ϵ and for any clock x𝒳, ν1(x)ν1(x)[ϵ1,ϵ2].

Let δ0 such that ν1+δC¯ and let ν2=(ν1+δ)[X:=0]. Then let us show that there exists δ0 such that ν1+δC, and (2,ν2)Rϵ(2,ν2) with ν2=(ν1+δ)[X:=0]. To prove that, one only needs to show that ν1+δ is in the adherence of the region of ν1+δ, and that ν1+δ and ν1+δ are ϵ-neighbours.

Consider the interval

Δ={δ0|ν1+δCν1+δr s.t. ν1+δr¯}

Since 𝒢 is region trimmed, there exists δ such that ν1+δC, and every valuation of the form ν+δ that satisfy C with ν𝗋𝖾𝗀(1) belong to a unique region rt. Since ν1𝗋𝖾𝗀(1)¯ and ν1+δC¯, ν1+δrt¯. Therefore, Δ is not empty.

If δΔ, then (ν1+δ)[X:=0] and ν1+δ are ϵ-neighbours. Otherwise, when δΔ, it entails that Δ is constrained by some guards in C. These guards can be of the form x=1, x>0, x<1 or x1 for any x𝒳 121212There cannot be any guard of the form x=0 in C, otherwise Δ={δ}={0}. .

  • If there is a guard x=1 in C for some x𝒳, then Δ is the singleton {1ν1(x)}. Indeed Δ is not empty and δ=𝖽𝖾𝖿1ν1(x) is the only delay that can satisfy the guard x=1. Note that, for the same reason, δ=1ν1(x). By definition of Δ, ν2=(ν1+δ) and ν2 are in the same region. Moreover, for all y𝒳, if yX, then ν2(y)=ν2(y)=0 and if y𝒳X, then (ν1(y)+δ)(ν1(y)+δ[(ϵ1+δδ),ϵ2(δδ)]. Now, δδ=ν1(x)ν1(x)[ϵ1,ϵ2] entails that ϵ2=𝖽𝖾𝖿ϵ2(δδ)0, and ϵ1=𝖽𝖾𝖿ϵ1+(δδ)0. Finally, ϵ1+ϵ2=ϵ1+ϵ2<ϵ.

Now assume that there are no such guards in C:

  • If there is a guard x>0 in C for some x𝒳 such that ν1+δ⊧̸(x>0) but ν1+δ(x>0) then ν1(x)+δ=0. Furthermore, since 𝒢 is trimmed, ν1(x)=0. Therefore, Δ is an interval of the form ]0,]. Pick δΔ such that 0<δ<ϵ(ϵ1+ϵ2). Let ϵ2=𝖽𝖾𝖿ϵ2(δδ)=ϵ2+δ, and ϵ1=𝖽𝖾𝖿min{0,ϵ1+(δδ)}0. Then ϵ1+ϵ2ϵ1+ϵ2+δ<ϵ, hence ν1+δ and ν1+δ are ϵ-neighbourss.

  • If there is a guard x<1 in C (resp. x1) for some x𝒳, such that ν1+δ(x1) but ν1+δ⊧̸(x<1) (resp. x1), then Δ<δ. If δ+ν1(x)ν1(x)Δ then pick δ=δ+ν1(x)ν1(x). Since δδ=ν1(x)ν1(x)[ϵ1,ϵ2], ν1+δ and ν1+δ are ϵ-neighbours.

    Otherwise pick δΔ such that 1ϵ+(ϵ1+ϵ2)<ν1(x)+δ<1. Then ν1(x)ν1(x)<δδ<ν1(x)ν1(x)+ϵ(ϵ1+ϵ2), with ν1(x)ν1(x)[ϵ1,ϵ2]. Let ϵ2=𝖽𝖾𝖿min(0,ϵ2(δδ))0 and ϵ1=𝖽𝖾𝖿ϵ1+(δδ)0. Then ϵ1+ϵ2<ϵ, hence ν1+δ is an ϵ-neighbour of ν1+δ.

We are now ready to prove Lemma 16.

Lemma 16. [Restated, see original statement.]

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w𝗈𝗎𝗍) be a trimmed region kernel wtg. Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w𝗈𝗎𝗍) be a copy of 𝒢 where

  • every guard has been relaxed, i.e., every guard of the form x>0 and x<1 have been replaced by x0 or x1, respectively, for all x𝒳.

  • For any G, the output function w𝗈𝗎𝗍(,) is w𝗈𝗎𝗍(,) extended continuously to 𝗋𝖾𝗀()¯ in 𝒢.

Then 𝖵𝖺𝗅𝒢=𝖵𝖺𝗅𝒢.

Proof.

𝒢 and 𝒢 start from the same configuration, thus their first configurations are in bisimulation Rϵ for any ϵ>0.

Therefore, according to Lemma 23,

|𝖵𝖺𝗅𝒢𝖵𝖺𝗅𝒢|sup{|𝗐𝖾𝗂𝗀𝗁𝗍(ρ)𝗐𝖾𝗂𝗀𝗁𝗍(ρ)||ρ,ρ plays of 𝒢,𝒢ρRϵρ}.

In a kernel, for any finite run ρ of length n, 𝗐𝖾𝗂𝗀𝗁𝗍(ρ)=w𝗈𝗎𝗍(ρC(n)). Let s be the maximal slope (in absolute value, in one variable) in functions w𝗈𝗎𝗍(,). Then

sup {|𝗐𝖾𝗂𝗀𝗁𝗍(ρ1)𝗐𝖾𝗂𝗀𝗁𝗍(ρ2)|:ρ1Rϵρ2}
sup{|w𝗈𝗎𝗍(,ν1)w𝗈𝗎𝗍(,ν2)|:G,(,ν1)Rϵ(,ν2)}
ϵs|𝒳|

Conclude with Lemma 23.

A.2 Proof of Lemma 19

Lemma 25.

For A,B=𝖬𝗂𝗇,𝖬𝖺𝗑 or 𝖬𝖺𝗑,𝖬𝗂𝗇. In a relaxed region kernel [0,1]-wtg, consider a transition t:AC,XB between A a location belonging to Player A, and B a location belonging to Player B, such that C has no guard of the form x=0,x=1,x<1, and X=. Then adding the guard x=1 to C for some xXA does not change the value.

Proof.

Pick some xXA. By picking a delay δ<1ν(x), Player A offers Player B more options than if they picked δ=1ν(x). From the perspective of B, if a larger delay is advantageous, then they can take it from B at cost 0. Hence it is optimal for either A or B to pick the largest delay possible, i.e. δ=1ν(x). However, since w(A)=w(B)=0, forcing A to take a delay in A which would have been taken in B by Player B given the chance does not change the value. Thus, restricting A to strategies which, when choosing t from a valuation ν, choose a delay δ such that (ν+δ)(x)=1 for all xXA does not change the value of the wtg. See 19

Proof.

Let 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w,w𝗈𝗎𝗍). We assume that 𝖬𝖺𝗑 does not have full control over any cycle (i.e., in any cycle there is a 𝖬𝗂𝗇 location from which she can decide to leave the cycle). Indeed, if 𝖬𝖺𝗑 could reach such a cycle, the Value of 𝒢 would be +. Furthermore, let us assume that there is no 𝖬𝗂𝗇 self-loop (a transition L𝖬𝗂𝗇C,X) with X= in 𝒢: it does not make strategic sense for 𝖬𝗂𝗇 to take such a loop, hence they can be deleted without change in value.

Let us transform 𝒢 through the following operations. For any t:1C,X2 of 𝒢 such that 2 is not a target location, and X= and C has no guards of the form x=1 for any clock x:

  • If C has a x=0 requirement for some clock x, then add x to X.

  • If C has no x=0 requirement for every clock x, and 1 and 2 belong to the same player, then remove t and, for any t:23 with 31, 131313Adding a transition in the case 3=1 would create a self-loop: 𝖬𝗂𝗇 has no use for self-loops without reset, and we assume that 𝖬𝖺𝗑 has full control over no cycle, so this situation never happens when 1 belongs to 𝖬𝖺𝗑. create a transition t:13 such that C(t′′)=C(t)C(t)

  • If C has no x=0 requirement for every clock x, and 1 and 2 do not belong to the same player, then let us add a x=1 requirement to C where xX1. According to Lemma 25, it does not change the value.

After these transformations, every transition without reset in 𝒢 is either a transition to a target location, or has a x=1 guard for some x𝒳.

Then let us build 𝒢=(L𝖬𝗂𝗇,L𝖬𝖺𝗑,G,𝒳,T,w,w𝗈𝗎𝗍) a kernel wtg where:

  • For any player 𝖯, let L𝖯=L𝖯{:L𝖯}.

  • Let G=G{:G}.

  • Start from T=. For any t:C,X in T,

    • if G then for all XX add t:C,X and t:C,X to T, where C is C{x=0:xX} deprived of guards (x=1) for all xX. Note that here X is defined according to 𝗋𝖾𝗀() the region assignment in the relaxed trimmed region wtg 𝒢.

    • if G and X= then (x=1)C for some xX. Then add t:C,X and t′′:C,X to T.

    • if G and X=X, then add t:C,X and t′′:C,X to T, such that X=X and C=C{x=0:xX1}{x=1:xX1}.

    • if G and X, then add t:C,X and t′′:C,X to T such that X=X and C=C{x=0:xX1}{x=1:xX1}.

  • For all G, let w𝗈𝗎𝗍(,ν)=w𝗈𝗎𝗍(,ν), w𝗈𝗎𝗍(,ν)=w𝗈𝗎𝗍(,ν) where ν(x)=ν(x) for all xX, and ν(x)=1 otherwise.

Intuitively, in 𝒢, when one or several clocks are made to reach 1 by a guard, there will usually be some urgent transitions taken until all these clocks have been reset. In 𝒢, those clocks are immediately reset, and control moves to a location . All paths leaving have (x=0) conditions (for all x that have been reset in 𝒢 but not in 𝒢) to guarantee urgency. A valuation ν in a location in 𝒢 is thus equivalent to a valuation ν in in 𝒢 iff ν(x)=ν(x) if xX, and ν(x)=1 and ν(x)=0 for xX.

𝒢 is a relaxed region wtg with region assignment 𝗋𝖾𝗀()=𝗋𝖾𝗀() and 𝗋𝖾𝗀()=(X0Xp,X1,,Xp1) when 𝗋𝖾𝗀()=(X0,X1,,Xp). From there, trim 𝒢 to obtain a relaxed trimmed region wtg.