Conditional simple temporal networks with uncertainty and decisions

Abstract
A Conditional Simple Temporal Network with Uncertainty (CSTNU) is a formalism able to model temporal plans subject to both conditional constraints and uncertain durations. The combination of these two characteristics represents the uncontrollable part of the network. That is, before the network starts executing, we do not know completely which time points and constraints will be taken into consideration nor how long the uncertain durations will last. Dynamic Controllability (DC) implies the existence of a strategy scheduling the time points of the network in real time depending on how the uncontrollable part behaves. Despite all this, CSTNUs fail to model temporal plans in which a few conditional constraints are under control and may therefore influence (or be influenced by) the uncontrollable part. To bridge this gap, this paper proposes Conditional Simple Temporal Networks with Uncertainty and Decisions (CSTNUDs), which introduce decision time points into the specification in order to operate on this conditional part under control. We model the dynamic controllability checking (DC-checking) of a CSTNUD as a two-player game in which each player makes his moves in his turn at a specific time instant. We give an encoding into timed game automata for a sound and complete DC-checking. We also synthesize memoryless execution strategies for CSTNUDs proved to be DC and carry out an experimental evaluation with a tool that we have designed for CSTNUDs to make the approach fully automated.


Introduction
Temporal networks are a framework to model temporal plans and check the coherence of their temporal constraints, which impose a minimal and maximal temporal distance between the occurrence of the events specified in the plan. Temporal plans mainly divide into plans having everything under control and plans having something out of control. The main components of a temporal network are time points and constraints. Time points are variables having continuous domain and model the occurrence of events as soon as these variables are assigned real values (i.e., executed). Constraints regulate the minimal and maximal temporal distance between the occurrence of pairs of events and are formalized as linear inequalities. Whenever both components are under control we simply deal with a consistency problem asking us to find an assignment of real values to all time points satisfying all constraints. Simple temporal networks (STNs) model exactly this case [10], whereas Drake [9] addresses temporal plans with choices that are, however, under control; therefore, we keep dealing with a consistency problem asking us to further find suitable values for such choices.
Instead, when some component is out of control, satisfiability is, in general, not enough. In such a case, we deal with a controllability problem.
Conditional simple temporal networks (CSTNs) [14,20] address conditional constraints to enable or disable some parts of the network (i.e., a subset of time points and constraints) during execution. Conditionals are expressed as labels consisting of conjunctions of literals whose atoms are Boolean propositions. The truth value assignments to such propositions are out of control and depend on the behavior of unpredictable external events which are only observed to occur while executing the network.
Simple temporal networks with uncertainty (STNUs) [17,18] address uncertain (but bounded) durations. Such durations are modeled by contingent links, i.e., pairs of distinct time points specifying a range of allowed values for their distance. One of these time points is called activation and it is under control, whereas the other one is called contingent and it is not. The real value assignment to the contingent one depends again on the behavior of unpredictable external events which are only observed to occur while executing the network.
Conditional simple temporal networks with uncertainty (CSTNUs) [7,13] merge the semantics of CSTNs and STNUs, addressing both conditional constraints and uncertain durations.
Controllability of a temporal network implies the existence of a strategy operating on the controllable part such that all constraints will eventually be satisfied. Controllability mainly divides into three kinds: weak, strong and dynamic. Weak controllability ensures the existence of a (possibly different) strategy to operate on the controllable part whenever we are able to predict how the entire uncontrollable part will behave before the execution starts. Strong controllability is the opposite case, ensuring the existence of a strategy operating always the same way on the controllable part no matter how the uncontrollable part will behave. However, strong controllability is "too strong". If a temporal network is not strongly controllable, it could still be executable by operating on the controllable part reacting to the uncontrollable one as soon as it becomes known. Dynamic controllability addresses exactly this case.
However, none of the formalisms mentioned so far tackles temporal plans in which some conditional constraints under control may influence (or be influenced by) some uncontrollable part. An initial discussion is given in [3] where CSTNs are extended with decision nodes regulating the truth value assignments to some propositions under control.
We give here the first attempt to address temporal plans in which decisions may influence (or be influenced by) both conditional and temporal uncertainty.
Toward this aim our contributions are three-fold. First, we define conditional simple temporal networks with uncertainty and decisions (CSTNUDs) as a unified formalism for temporal networks expressing uncontrollable parts and model dynamic controllability as a two-player game in which players make moves in their turns. Second, we provide an encoding into timed game automata for a sound and complete DC-checking and synthesize execution strategies by means of the UPPAAL-TIGA software [2]. Third, we automate our approach by discussing a proof of concept tool we came up with.
The rest of the paper is organized as follows. Section 2 provides essential background on CSTNUs, timed game automata (TGAs) and the DC-checking of CSTNUs via TGAs. Section 3 introduces our main contribution: CSTNUs with Decisions along with a new semantics given in move-based strategies. Section 4 extends the encoding given in Section 2 to address the DC-checking of CSTNUDs. Section 5 discusses our tool and a preliminary experimental evaluation. Section 6 discusses the correctness and complexity of the encoding. Section 7 discusses related work. Section 8 draws conclusions and discusses future work.

Background: CSTNUs, TGAs and Dynamic Controllability

Conditional Simple Temporal Networks with Uncertainty
Given a set P of Boolean propositions, a label ℓ = λ_1 ∧ ... ∧ λ_n is any finite conjunction of literals λ_i, where a literal is either a proposition p (positive literal) or its negation ¬p (negative literal). The empty label is denoted by ⊡. The label universe of P, denoted by P*, is the set of all possible (consistent) labels drawn from P; e.g., if P = {p, q}, then P* = {⊡, p, q, ¬p, ¬q, p ∧ q, p ∧ ¬q, ¬p ∧ q, ¬p ∧ ¬q}. Two labels […] for some ℓ ∈ P*, then k = ∞ (for that label). A CSTNU is well-defined if and only if all the following properties hold.
We execute a time point by assigning it a real value (modeling the occurrence of some temporal event). We execute a CSTNU by executing all relevant non-contingent time points (see below). For any contingent link (A, x, y, C), A is the activation time point, whereas C is the contingent time point. A is under control, C is not. Once we execute A, we can merely observe the execution of C (by the environment). However, C is guaranteed to occur such that C − A ∈ [x, y]. A contingent link has a unique implicit label given by ℓ = L(A) = L(C).
Likewise, an observation time point P? ∈ OT is under control, whereas the truth value assignment to its associated Boolean proposition p is not. Once we execute P? we can merely observe such an assignment. Before executing P? the value of p is unknown, and after executing P? it is either ⊤ (true) or ⊥ (false). As we execute observation time points, the truth value assignments to the associated propositions generate the current partial scenario, that is, a label cps ∈ P* consisting of the conjunction of these literals. Initially, cps = ⊡, and whenever a proposition is assigned a truth value, the resulting literal λ is appended to cps. Time points and constraints are relevant if their labels are not falsified by cps. Before executing the network all time points and constraints are relevant. If a time point turns irrelevant, we will not execute it. If a constraint does, we will not be obliged to satisfy it.
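The notions of label universe, current partial scenario and relevance introduced above can be illustrated with a minimal Python sketch. The representation of a label as a dictionary from propositions to Boolean values, and the function names label_universe, falsifies and is_relevant, are assumptions made for illustration only; they are not part of the formalism.

```python
from itertools import product

# Assumption: a label is a dict {proposition: bool}; the empty label is {}.

def label_universe(props):
    """Enumerate all consistent labels over props. Each proposition is
    absent, positive or negative, so there are 3**len(props) labels."""
    labels = []
    for choice in product((None, True, False), repeat=len(props)):
        labels.append({p: v for p, v in zip(props, choice) if v is not None})
    return labels

def falsifies(cps, label):
    """cps falsifies label if they assign opposite values to a proposition."""
    return any(p in cps and cps[p] != v for p, v in label.items())

def is_relevant(cps, label):
    """A time point or constraint is relevant if cps does not falsify its label."""
    return not falsifies(cps, label)

universe = label_universe(["p", "q"])

cps = {}                                   # initially the empty label
before = is_relevant(cps, {"p": False})    # label ¬p before observing p
cps["p"] = True                            # observing p = true appends literal p
after = is_relevant(cps, {"p": False})     # label ¬p is now falsified
```

For P = {p, q} the enumeration yields the nine labels listed in the text, and a component labeled ¬p stops being relevant as soon as p is observed to be true.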

A CSTNU is said to be dynamically controllable (DC) if there exists a strategy executing all relevant non-contingent time points such that all (relevant) constraints are satisfied no matter how the uncertain durations and truth value assignments turn out during execution.
We graphically represent a CSTNU as a labeled (multi)graph, where the set of nodes coincides with the set of time points (labels are shown below the nodes), whereas the set of edges divides into contingent links and requirement links. A contingent link (shown as a double arrow A ⇒ C labeled by [x, y]) models (A, x, y, C) ∈ L. A requirement link (shown as a single arrow […]
Figure 1 shows an example of a CSTNU having two observation time points P?, D? and four contingent links (A_1, 1, 6, C_1), (A_2, 8, 12, C_2), (A_3, 3, 5, C_3) and (A_4, 6, 10, C_4). P? is the first time point to execute, whereas E is the last. If P? assigns ⊤ to p, then A_2 and C_2 along with the constraints labeled by ¬p turn irrelevant as cps = p falsifies ¬p. If P? assigns ⊥ to p, we will ignore A_1, C_1 and all constraints labeled by p. Likewise, if D? assigns ⊤ (resp., ⊥) to d, we will ignore A_4 and C_4 (resp., A_3 and C_3) and all constraints labeled by ¬d (resp., d). The CSTNU in Figure 1 […]

Timed Game Automata
A timed automaton (TA) [1] refines a finite automaton [12] by adding real-valued clocks and clock constraints. All clocks increase at the same uniform rate and may be reset many times.

Definition 2 (TGA). A Timed Automaton (TA) is a tuple ⟨Loc, Act, X, →, Inv⟩, where Loc is a finite set of locations (one is initial). A location is urgent if time freezes in it. Act is a finite set of actions and X is a finite set of real-valued clocks. […] is the set of clocks to reset (i.e., set to 0). Inv : Loc → H(X) is a function assigning an invariant (modeled as a conjunction of clock constraints) to each location. Inv(L) says when the TA is allowed to remain in L. A Timed Game Automaton (TGA) [15] extends a TA by dividing transitions into controllable and uncontrollable. Uncontrollable transitions have priority over controllable ones.

Figure 2: TGA encoding the CSTNU in Fig. 1. L_0 is the initial location; L_1, L_⊡, L_p, L_¬p, L_d and goal are urgent. Solid (resp., dashed) edges model controllable (resp., uncontrollable) transitions.

We graphically represent a TGA as a (multi)graph where the set of nodes coincides with that of locations whereas the set of edges models controllable transitions (solid edges) and uncontrollable ones (dashed edges). Figure 2 depicts a TGA encoding the CSTNU in Figure 1.
In what follows we sum up how this encoding is achieved and dynamic controllability checked.

Dynamic Controllability
The DC-checking problem is the problem of deciding if a CSTNU is DC. We can answer the DC-checking problem by using sound and complete TGA reachability algorithms [4,5]. We model the DC-checking as a two-player game between a controller (ctrl) and the environment (env). The aim of ctrl is to reach a specific location as soon as all relevant time points have been executed and all constraints are satisfied, whereas env's goal is to prevent ctrl from doing that. If ctrl wins, the network is DC, otherwise it is not. An important aspect of this encoding is that ctrl is assigned uncontrollable transitions, whereas env is assigned controllable ones. This is necessary to allow env's instantaneous reactions since, in the TGA semantics, uncontrollable transitions go first [4,5,6]. The encoding is as follows.
Clocks. X contains a clock cX for each time point X ∈ T and a clock bP for each proposition p ∈ P. X also contains two special clocks ĉ (modeling the global time) and c_δ (regulating the interplay of the game). cX = ĉ means that X has not been executed, whereas cX < ĉ means that X was executed at time ĉ − cX (when this difference is > 0). Likewise, bP = ĉ means that p = ⊤, whereas bP < ĉ means that p = ⊥ (both when cP < ĉ). Each cX and bP may be reset at most once. For our example we have X = {[…], cE, bP, bD}.
Locations. Loc contains three core locations L_0 (initial), L_1 (urgent) and goal (urgent), and n − 1 urgent locations L_{ℓ_1}, ..., L_{ℓ_{n−1}} where n is the number of distinct labels in the […]

TIME 2017
Transitions. → contains controllable and uncontrollable transitions to model the following:
Game interplay. pass and gain are uncontrollable transitions regulating the game interplay. In particular, gain can be taken only when c_δ > 0, modeling the reaction time needed to observe how the uncontrollable part behaves.
Non-contingent time point executions. For each non-contingent time point X there is an uncontrollable self-loop transition ⟨L_1, cX = ĉ, exX, {cX}, L_1⟩ modeling the execution of X. The guard says that X has not been executed yet, while the reset fixes the execution time of X to ĉ − cX by resetting cX.
Contingent time point executions. For each contingent link (A, x, y, C) […] and a fail transition ⟨L_0, cA < ĉ ∧ cC = ĉ ∧ cA > y, failC, {cC, c_δ}, goal⟩ to allow ctrl to move to goal if env fails or refuses to take the transition.
Truth value assignments. For each proposition p ∈ P there is a controllable self-loop transition ⟨L_0, cP < ĉ ∧ cP = 0 ∧ bP = ĉ, pFalse, {bP, c_δ}, L_0⟩ to allow env to assign ⊥ to p, if it decides so. If it does not, the truth value of p will remain ⊤ forever.
Winning conditions. To check that all relevant time points have been executed and all constraints are satisfied we connect each pair of locations […] goal by means of a set of uncontrollable transitions. Each set of transitions going from L_{ℓ_{i−1}} to L_{ℓ_i} verifies that if cps does not falsify ℓ_i, then all time points labeled by ℓ_i must have been executed and all constraints labeled by ℓ_i are satisfied. If cps falsifies ℓ_i, a skip transition allows us to ignore this check. In this way, the problem is decomposed with respect to the specific labels, avoiding the combinatorial explosion of all arising cases. For example, the set of transitions going from L_⊡ to L_p is generated as follows. In the scenario where P? has been executed and p assigned ⊤ (i.e., cps = p), then A_1 and C_1 must have been executed, and […] In other words, the meta conditional constraint (cP […] since TGAs do not allow negations or disjunctions of clock constraints in the guards. Finally, we generate a transition for each disjunct (sat_p, skip^1_p, skip^2_p).
DC-checking is done by looking for a control strategy for env to always prevent ctrl from getting to goal. If such a strategy exists, the initial CSTNU is not DC, otherwise it is (as ctrl has a counter-strategy to react to any combination of env's moves).
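The clock encoding described above can be made concrete with a small Python sketch that simulates a clock valuation and decodes it. This is only an illustration of the encoding as stated in the text (cX = ĉ means "not executed", cX < ĉ means "executed at ĉ − cX", bP = ĉ means ⊤ and bP < ĉ means ⊥); the dictionary layout and the helper names elapse, reset, decode_time_point and decode_proposition are hypothetical and do not belong to any TGA tool.

```python
def elapse(clocks, delta):
    """All clocks increase at the same rate."""
    for name in clocks:
        clocks[name] += delta

def reset(clocks, name):
    """Mimic a transition whose reset set contains the given clock."""
    clocks[name] = 0.0

def decode_time_point(c_x, c_hat):
    """Execution time of X, or None if cX == ĉ (X not executed yet)."""
    return None if c_x == c_hat else c_hat - c_x

def decode_proposition(c_p, b_p, c_hat):
    """Truth value of p once P? is executed (cP < ĉ); None if unknown."""
    if c_p == c_hat:
        return None
    return b_p == c_hat

# c_hat plays the role of the global clock ĉ.
clocks = {"c_hat": 0.0, "cP": 0.0, "bP": 0.0, "cA": 0.0}

elapse(clocks, 2.0)     # global time 2
reset(clocks, "cP")     # exP: guard cP == ĉ holds, so P? is executed at 2
reset(clocks, "bP")     # pFalse: guard cP < ĉ, cP == 0, bP == ĉ holds
elapse(clocks, 3.0)     # global time 5

time_of_P = decode_time_point(clocks["cP"], clocks["c_hat"])
value_of_p = decode_proposition(clocks["cP"], clocks["bP"], clocks["c_hat"])
time_of_A = decode_time_point(clocks["cA"], clocks["c_hat"])
```

Resetting a clock at most once is what lets the difference ĉ − cX keep the execution time fixed while global time continues to elapse.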

CSTNUs with Decisions
In this section we extend CSTNUs by injecting a new kind of time point: the decision time point. A decision time point D! dualizes an observation one P? as the truth value assignment to the associated proposition is under control. As a result, the controllable and uncontrollable parts may now mutually influence one another. That is, deciding some truth value may restrict (or even exclude) some uncontrollable part and vice versa. Several interesting cases may arise depending on whether a few truth values are decided before or after having full information on how the uncontrollable part will behave or has behaved. We go ahead with this discussion by taking Figure 3 as an example. There, we took the initial CSTNU in Figure 1 and substituted decision time points for observation ones in all possible combinations. We discuss these examples focusing on the combinations of minimal and maximal durations of contingent links only. If it works for them, then it must work for any other combination of durations.

Figure 3: (a) A decision before any uncontrollable part. […] A decision after all observation and some contingent.
In Figure 3a P! is a decision time point. The resulting CSTNUD is uncontrollable. If we decide p (i.e., assign ⊤ to p), then observe ¬d (i.e., D? assigns ⊥ to d) and C_1, C_4 take their maximal durations, then we will have to execute E at 20, violating (P? − E ≤ −21, ¬d) as P? is executed at 0. Conversely, if we decide ¬p, then observe d and C_2 and C_3 take their maximal durations, then we will have to execute E at 21 (violating (E − P? ≤ 20, d)).
In Figure 3b D […]
In Figure 3c P! and D! are both decision time points. The resulting CSTNUD is of course dynamically controllable. If we decide p, then deciding d is always going to be fine. If we decide ¬p, then we will decide either d or ¬d depending on how long C_2 lasts. If C_2 takes its minimal duration, then we will decide d (but not ¬d since C_4 could then take its minimal duration). If C_2 takes its maximal duration, then we will decide ¬d (but not d since C_3 could then take its maximal duration).
Hence, decisions are dynamic. A CSTNUD is well-defined if and only if the underlying CSTNU is well-defined and time point label honesty extends to decidable propositions as follows: For each X ∈ T, if λ ∈ L(X), where λ ∈ {d, ¬d} and d ∈ DP, then L(X) […]

Definition 3 (CSTNUD). A Conditional Simple Temporal
That is, X can be executed at the same time as D! (but instantaneously after D! since time points executed at the same instant must in general follow an order of execution).
We model the execution semantics of a CSTNUD as a two-player game in which Player1 models the controller and Player2 models the environment. We employ execution sequences [16] to model the state of the game and define players' strategies as mappings from execution sequences considered at specific time instants to moves.
A sequence {x_1, x_2, ..., x_n} is a totally ordered collection of elements such that for any pair of elements x_i, x_j, if i < j (resp., i > j), then x_i is before (resp., after) x_j. We abuse notation and write {x_1, x_2, ..., x_n} ∪ {x_p} to mean the appending operation resulting in {x_1, x_2, ..., x_n, x_p} where n < p. We write x_i ∈ {x_1, x_2, ..., x_n} iff there exists j ∈ N, 1 ≤ j ≤ n such that x_i = x_j (membership), and |{x_1, x_2, ..., x_n}| = n (cardinality). A partial schedule for a subset of time points T′ ⊆ T is a mapping S_T′ : T′ → R assigning a real value to each X ∈ T′. A partial schedule for a subset of Boolean propositions P′ ⊆ P is a mapping S_P′ : P′ → {⊤, ⊥} assigning either ⊤ or ⊥ to each p ∈ P′. We write b for a generic Boolean value (i.e., b ∈ {⊤, ⊥}). We write S_T′ ∪ {S_T′(Y) = k} as a shorthand for extending the domain of S_T′ with time point Y such that S_T′(Y) = k. Similarly, we write S_P′ ∪ {S_P′(p) = b} for Boolean propositions.

Definition 4 (Instantiation sequence). An instantiation sequence is a quadruple ⟨E, K, S_E, S_K⟩, where E is a finite sequence of distinct time points in T, K is a finite sequence of distinct propositions in P, and S_E, S_K are partial schedules whose domains are E and K, respectively.

Definition 5 (Execution sequence). An execution sequence Z = ⟨E, K, S_E, S_K⟩ is an instantiation sequence satisfying the following properties:
S_E Monotonicity. For any pair X_i, X_j ∈ E, if i < j, then S_E(X_i) ≤ S_E(X_j).
(Time Point Label) Honesty. For each X ∈ E and each literal λ ∈ L(X) where λ ∈ {p, ¬p}, […].
Z* represents the set of all execution sequences. t_last(Z) = max{S_E(X) | X ∈ E} represents the last time instant in which a time point was executed in Z. last(Z) = {X | X ∈ E ∧ S_E(X) = t_last(Z)} represents the set of the last executed time points.
Therefore, an execution sequence models the ordered sequence of executed time points and assigned propositions according to the well-definedness of a CSTNUD. As an example, consider again Figure 3b. Assume that we execute P? at 0 and observe ¬p. Assume then that we execute A_2 at 1 and observe C_2 to occur at 13 (i.e., at its maximal duration). The execution sequence is Z = ⟨{P?, A_2, C_2}, {p}, {S_E(P?) = 0, S_E(A_2) = 1, S_E(C_2) = 13}, {S_K(p) = ⊥}⟩. We can now compute the current partial scenario as the conjunction of all positive and negative literals arising from all propositions in K according to S_K and define local consistency.

Definition 6 (Current partial scenario). Given any Z = ⟨E, K, S_E, S_K⟩, the current partial scenario is given by cps […]
For Z we have that cps = ¬p since p ∈ K and S_K(p) = ⊥.
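The running example can be replayed with a short Python sketch that stores an execution sequence and derives t_last, last and the current partial scenario from it. The class name ExecutionSequence and the representation of cps as a dictionary of truth values are assumptions made for illustration; the values are those of the example above (P? at 0, A_2 at 1, C_2 at 13, p = ⊥).

```python
from dataclasses import dataclass

@dataclass
class ExecutionSequence:
    E: list      # executed time points, in execution order
    K: list      # assigned propositions, in assignment order
    S_E: dict    # time point -> execution time
    S_K: dict    # proposition -> truth value

def is_monotone(z):
    """S_E monotonicity: checking adjacent pairs suffices by transitivity."""
    return all(z.S_E[a] <= z.S_E[b] for a, b in zip(z.E, z.E[1:]))

def t_last(z):
    """Last time instant at which a time point was executed."""
    return max(z.S_E[x] for x in z.E)

def last(z):
    """Set of time points executed at t_last."""
    return {x for x in z.E if z.S_E[x] == t_last(z)}

def cps(z):
    """Current partial scenario: one literal per proposition in K."""
    return {p: z.S_K[p] for p in z.K}

z = ExecutionSequence(
    E=["P?", "A2", "C2"],
    K=["p"],
    S_E={"P?": 0, "A2": 1, "C2": 13},
    S_K={"p": False},
)
```

Here cps(z) evaluates to {"p": False}, which is the dictionary counterpart of cps = ¬p.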

Definition 7 (Local consistency). An execution sequence Z = ⟨E, K, S_E, S_K⟩ is locally consistent if and only if for each (Y […]
Z is locally consistent since the schedule S_E satisfies (A_2 − P? ≤ 1, ¬p) and (P? − A_2 ≤ −1, ¬p). An execution sequence evolves over time according to the evolution of the game that Player1 (the controller) plays against Player2 (the environment). Each player follows a strategy saying what moves to make and when. Moreover, many moves can be made at the same time instant (provided that they respect an order) and sometimes moves are mandatory.
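The local consistency of the example can be checked mechanically. The sketch below assumes that local consistency requires every constraint (Y − X ≤ k, ℓ) whose time points are both executed and whose label is entailed by cps to be satisfied by S_E; this reading is an assumption, and the tuple layout (Y, X, k, label) is hypothetical.

```python
def entails(cps, label):
    """cps entails label if every literal of label also appears in cps."""
    return all(cps.get(p) == v for p, v in label.items())

def locally_consistent(S_E, cps, constraints):
    """Check each constraint (Y - X <= k, label) that applies so far."""
    for y, x, k, label in constraints:
        if y in S_E and x in S_E and entails(cps, label):
            if S_E[y] - S_E[x] > k:
                return False
    return True

cps = {"p": False}                       # cps = ¬p
constraints = [
    ("A2", "P?", 1, {"p": False}),       # (A_2 − P? ≤ 1, ¬p)
    ("P?", "A2", -1, {"p": False}),      # (P? − A_2 ≤ −1, ¬p)
]

ok = locally_consistent({"P?": 0, "A2": 1, "C2": 13}, cps, constraints)
late = locally_consistent({"P?": 0, "A2": 2}, cps, constraints)
```

Executing A_2 at 1 satisfies both constraints, whereas executing it at 2 would violate the first one.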

Definition 8 (Move).
A move m is either X meaning "execute time point X" or (p, b) meaning "assign b ∈ {⊤, ⊥} to proposition p". A move for Player1 requires that X is a non-contingent time point and p is a decidable proposition. A move for Player2 requires that X is a contingent time point and p is an observable proposition. M*_1 and M*_2 represent the sets of all moves for Player1 and Player2, respectively.
A move-based strategy is a mapping from execution sequences considered at particular time instants to moves augmented with a wait condition modeling the absence of a move. A strategy tells a player to make a move at a particular time instant only if the move is applicable at that particular time. Therefore, a strategy specifies applicability conditions saying when a move can be made, obligations saying when a move has to be made and postconditions saying how the execution sequence evolves accordingly.

Definition 9 (Move-based strategy). A move-based strategy for Player1 is a mapping σ_1 : Z* × R → M*_1 ∪ {wait} such that its applicability conditions are:
1. For any execution sequence Z and any time instant t, σ_1(Z, t) is applicable iff t ∼ t_last(Z), where ∼ is > if last(Z) contains a contingent time point C or an observation time point P? such that K contains its related proposition p (reaction time enforcement), and ≥ otherwise.
2. For any execution sequence Z and any time instant t, σ_1(Z, t) = X is applicable if (1) holds and X is an unexecuted non-contingent time point such that the current partial scenario entails L(X) (i.e., X ∉ E ∧ X ∉ Contingent ∧ cps ⇒ L(X)).
3. For any execution sequence Z and any time instant t, σ_1(Z, t) = wait is applicable if (1) holds and there is no obligation at time t.
The unique obligation involves decidable propositions requiring that whenever a decision time point D! has been executed and its related proposition d has not been assigned yet, then the strategy must issue a move to assign d a truth value instantaneously. In symbols: (D! ∈ E ∧ d ∉ K) ⟹ σ_1(Z, S_E(D!)) = (d, b).

A move-based strategy for Player2 is a mapping σ_2 : Z* × R → M*_2 ∪ {wait} such that its applicability conditions are:
1. For any execution sequence Z and any time instant t, σ_2(Z, t) is applicable iff t ≥ t_last(Z).
2. For any execution sequence Z, any time instant t and any contingent link (A, x, y, C), σ_2(Z, t) = C is applicable if (1) holds, A has already been executed, C has not, and executing C at this time satisfies the bounds of the link, i.e., t − S_E(A) ∈ [x, y].
3. For any execution sequence Z and any time instant t, σ_2(Z, t) = wait is applicable if (1) holds and there is no obligation at time t.
Obligations are of two kinds. The first obligation involves observable propositions requiring that whenever an observation time point P? has been executed and its related proposition p has not been assigned yet, then the strategy must issue a move to assign p a truth value instantaneously. In symbols: (P? ∈ E ∧ p ∉ K) ⟹ σ_2(Z, S_E(P?)) = (p, b). The second obligation involves contingent links (A, x, y, C) requiring that if A has already been executed, C has not and the current time t is the last instant in which C can be executed, then the strategy must issue a move to execute C at t. In symbols: […]
Postconditions for both σ_1 and σ_2 are the same. If the strategy tells the player to execute a time point X at time t, then Z updates by appending X to E and extending S_E such that S_E(X) = t. If the strategy tells the player to assign the truth value b to the proposition p, then Z updates by appending p to K and extending S_K such that S_K(p) = b. In symbols: […]
Getting back to our example we have that t_last(Z) = 13 and last(Z) = {C_2}. Suppose that the current time is t = 14. σ_1(Z, 14) = D! is applicable since t > t_last(Z) and D! has not been executed yet, whereas […]
We now model Player2 as the most powerful player possible. If Player1 can beat this (worst-case) environment, then Player1 must be able to beat any other less powerful environment playing the same game. To achieve this purpose we model the game in turns. That is, at any time instant t, there exist two turns: T_1(t) (occurring first) and T_2(t) (occurring last). Player1 makes his moves in T_1(t), whereas Player2 makes his in T_2(t). If player i does not make any move in T_i(t), then he loses forever the possibility to play at time t. As a result, Player2, making his moves in T_2(t), is guaranteed to always have full information on what Player1 has done in T_1(t) (worst-case scenario). In what remains of this section we define the concept of snapshot, modeling an execution sequence at a particular time instant t (after the players are done in T_1(t) and T_2(t)), the continuous game evolution, modeling how the execution sequence evolves, and the winning conditions for each player.
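The applicability of σ_1(Z, 14) = D! in the running example can be reproduced with a minimal Python sketch of Player1's first two applicability conditions (reaction time enforcement and execution of an unexecuted, non-contingent, entailed time point). The function names and argument layout are hypothetical, and the label of D! is assumed to be empty, which the text does not state.

```python
def needs_reaction_time(last_points, contingent, observed_by, K):
    """'>' is required if last(Z) holds a contingent time point, or an
    observation time point whose proposition is already in K."""
    return any(
        x in contingent or (x in observed_by and observed_by[x] in K)
        for x in last_points
    )

def entails(cps, label):
    return all(cps.get(p) == v for p, v in label.items())

def can_execute(x, t, S_E, K, cps, labels, contingent, observed_by):
    """Conditions 1 and 2 for sigma_1(Z, t) = X (Z must be non-empty)."""
    t_last = max(S_E.values())
    last_points = {y for y, ty in S_E.items() if ty == t_last}
    strict = needs_reaction_time(last_points, contingent, observed_by, K)
    time_ok = t > t_last if strict else t >= t_last
    return (time_ok and x not in S_E and x not in contingent
            and entails(cps, labels[x]))

S_E = {"P?": 0, "A2": 1, "C2": 13}
K = ["p"]
cps = {"p": False}
contingent = {"C1", "C2", "C3", "C4"}
observed_by = {"P?": "p"}
labels = {"D!": {}}          # assumption: L(D!) is the empty label

at_14 = can_execute("D!", 14, S_E, K, cps, labels, contingent, observed_by)
at_13 = can_execute("D!", 13, S_E, K, cps, labels, contingent, observed_by)
```

Since last(Z) = {C_2} contains a contingent time point, Player1 must let a positive amount of time elapse: executing D! is applicable at 14 but not at 13.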
Definition 10 (Snapshot). Let Z = ⟨E, K, S_E, S_K⟩ be any execution sequence. Z(t) = ⟨E′, K′, S′_E, S′_K⟩ models the snapshot of Z at time t, where […], and ∀p ∈ K′, S′_K(p) = S_K(p).
To give an example, let us get back to the execution sequence we have discussed before. At t = 11, we have Z(11) […]

Definition 11 (Continuous game evolution). Let t ∈ R≥0 be the global time. The continuous game evolution is modeled by an infinite sequence of snapshots Z(t) defined as: […] where T_i(t) models the evolution of Z during turn i at time t according to σ_i, whereas ε > 0.

Definition 12 (Winning conditions). Player1 wins the game if and only if the game evolution leads to a snapshot Z(t) such that for each unexecuted time point X, cps falsifies L(X) and for each constraint (Y − X ≤ k, ℓ) where X, Y ∈ E and cps ⇒ ℓ, S_E(Y) − S_E(X) ≤ k holds. Player2 wins otherwise. σ_i is a winning strategy if player i wins the game by following σ_i.
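The winning condition for Player1 can be checked on a snapshot with a few lines of Python. The sketch follows the wording of Definition 12; the three-node network used to exercise it is a hypothetical toy fragment (only the labels L(A_1) = p and L(A_2) = ¬p are taken from the paper's example), and the data layout is an assumption.

```python
def falsifies(cps, label):
    return any(p in cps and cps[p] != v for p, v in label.items())

def entails(cps, label):
    return all(cps.get(p) == v for p, v in label.items())

def player1_wins(time_points, S_E, cps, constraints):
    """time_points maps each time point to its label; constraints are
    tuples (Y, X, k, label) standing for (Y - X <= k, label)."""
    # Every unexecuted time point must have a label falsified by cps.
    for x, label in time_points.items():
        if x not in S_E and not falsifies(cps, label):
            return False
    # Every applicable constraint between executed time points must hold.
    for y, x, k, label in constraints:
        if y in S_E and x in S_E and entails(cps, label):
            if S_E[y] - S_E[x] > k:
                return False
    return True

time_points = {"P?": {}, "A1": {"p": True}, "A2": {"p": False}}
constraints = [
    ("A2", "P?", 1, {"p": False}),
    ("P?", "A2", -1, {"p": False}),
]
cps = {"p": False}

won = player1_wins(time_points, {"P?": 0, "A2": 1}, cps, constraints)
pending = player1_wins(time_points, {"P?": 0}, cps, constraints)
```

In the first snapshot A_1 is unexecuted but irrelevant under cps = ¬p, so Player1 wins; in the second one A_2 is still relevant and unexecuted, so the winning condition does not hold yet.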

Definition 13 (Dynamic controllability). A CSTNUD is dynamically controllable if Player1 has a winning strategy such that for any t > 0 and any pair of execution sequences […] In other words, whenever Player2 has made the same (infinite) sequence of moves up to time t − ε, then Player1 will make the same move(s) at time t.

Dynamic Controllability of CSTNUDs via TGAs
In this section we extend the encoding given for CSTNUs in Section 2. As an example, we consider Figure 4 depicting the encoding of the CSTNUD in Figure 3b.
Once again, we have three core locations but this time we borrow a few names from Section 3 and rename them to T_1 (formerly L_1), T_2 (formerly L_0) and win (formerly goal). T_1 and T_2 model the two turns T_1(t) and T_2(t) when global time is > 0. T_2 is the initial location. The winning path is computed the same way, only renaming each L_i to w_i. The transitions gain and pass regulate the turns at any time instant t. We still have a clock cX for each X ∈ T (considering decision time points too) and a clock bP for each p ∈ P (considering decidable propositions too).
We optimize the guard of each uncontrollable self-loop at T_1 by exploiting what we know of the CSTNUD. That is, we extend the guards so that they enforce time point label honesty as well as the partial order among the time points when not ambiguous. This optimization was first discussed in [8] but there it dealt with disjunctive constraints and exploited internal data structures provided by the UPPAAL-TIGA software. Here, we propose a more formal definition avoiding such data structures. Moreover, [8] does not address decisions.
To give an example of this optimization, consider time points A_1 and A_4 of the CSTNUD in Figure 3b. L(A_1) = p and L(A_4) = ¬d. Recall that the encoding models p and d as two dedicated clocks bP and bD such that if each of these clocks is equal to (resp., less than) ĉ, once its related observation or decision time point has been executed, then the related proposition is ⊤ (resp., ⊥). Moreover, time point label honesty also requires that […] Therefore, considering the time point label honesty property for CSTNUDs, it is possible to extend the guards of exA_1 and exA_4 by appending bP = ĉ ∧ cP < ĉ ∧ cP > 0 and bD < ĉ ∧ cD < ĉ ∧ cD ≥ 0, respectively. The former models the fact that A_1 must be executed if and only if p = ⊤ (i.e., bP = ĉ), which also implies that A_1 must be executed after P? (i.e., P? has been executed (cP < ĉ)) and a positive amount of time has elapsed (cP > 0). The latter models the fact that A_4 must be executed if and only if d = ⊥ (i.e., bD < ĉ), which also implies that A_4 must be executed after D! (i.e., D! has been executed (cD < ĉ)), possibly immediately or after a positive amount of time has elapsed (cD ≥ 0).

Definition 14 (Encoding time point label honesty). A label encoder is a mapping L_enc : T → H(X) translating the label of a time point into the equivalent clock constraint L_enc(X) = L^OP_enc(X) ∧ L^DP_enc(X), where L^OP_enc(X) and L^DP_enc(X) encode all literals containing observable and decidable propositions, respectively.
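A label encoder of this kind can be sketched in Python as a function producing a guard string from a label. The sketch generalizes the two worked cases given for exA_1 and exA_4 (a positive observable literal and a negative decidable literal); the treatment of the two remaining cases by symmetry, the textual guard syntax with && and c_hat standing for ĉ, and the function names are assumptions, not the paper's formal definition.

```python
def encode_literal(prop, positive, decidable):
    """Clock constraint for one literal over proposition `prop`.
    bX == c_hat encodes true, bX < c_hat encodes false; the related time
    point must have been executed (cX < c_hat), strictly earlier for an
    observation (cX > 0), possibly at the same instant for a decision
    (cX >= 0)."""
    b, c = f"b{prop.upper()}", f"c{prop.upper()}"
    value = f"{b} == c_hat" if positive else f"{b} < c_hat"
    delay = f"{c} >= 0" if decidable else f"{c} > 0"
    return f"{value} && {c} < c_hat && {delay}"

def encode_label(label, decidable_props):
    """Conjunction of the encodings of all literals; 'true' if empty."""
    parts = [encode_literal(p, v, p in decidable_props)
             for p, v in label.items()]
    return " && ".join(parts) or "true"

guard_A1 = encode_label({"p": True}, decidable_props={"d"})    # L(A_1) = p
guard_A4 = encode_label({"d": False}, decidable_props={"d"})   # L(A_4) = ¬d
```

The two guards obtained for A_1 and A_4 coincide with the conjuncts appended to exA_1 and exA_4 in the example, up to the textual syntax.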
We now focus on constraints. Consider the requirement link P? → A_1 labeled by [1, 1], p in the CSTNUD that we are discussing. Such a constraint says that A_1 must be executed at least 1 and at most 1 time unit after P? (thus, exactly 1 time unit after P?). This requirement link also has an important characteristic: L(A_1) coincides with the label of the link. Therefore, whenever A_1 is executed, the constraint must hold. Thus, we extend the original guard of exA_1 (formerly cA_1 = ĉ) to cA_1 = ĉ ∧ cP < ĉ ∧ cP = 1, where the new conjuncts say that P? has already been executed (cP < ĉ) and A_1 − P? ∈ [1, 1].

Figure 4: TGA encoding the CSTNUD in Figure 3b. T_2 (formerly L_0) is the initial location (modeling T_2(t) for t > 0). T_1 (formerly L_1) models T_1(t) for t > 0. w_⊡, w_p, w_¬p, w_d, win model the winning path.

[…] mapping Π_enc : T → H(X) translating each X ∈ Π(Y) (along with its temporal bounds) into an equivalent clock constraint as follows.
After optimizing the guard of each exX transition we now discuss how to model the truth value assignment to the decidable propositions. Dually to observable propositions, for each decidable proposition d ∈ DP we generate an uncontrollable self-loop transition ⟨T_1, cD < ĉ ∧ cD = 0 ∧ bD = ĉ, dFalse, {bD}, T_1⟩ at T_1. If we take this transition, it means that we decide ¬d. If we do not, it means that we decide (actually confirm) d. In the former case, such a transition has to be taken at the same instant in which D! was executed but after exD was taken. In this way we model "how" to decide the truth values of the propositions in DP. All other transitions remain the same as those given for CSTNUs.

Automated Planning: A Tool for the Experimental Evaluation
We made a tool for CSTNUDs which takes as input a CSTNUD specification and allows for the automated encoding into the corresponding UPPAAL-TIGA specification as well as execution simulation. To get the UPPAAL-TIGA specification we run ./Cstnud Network.cstnud --encode TGA.xml, where Network.cstnud is the CSTNUD specification and TGA.xml is the TGA encoding that the tool returns as output. To synthesize a strategy we use UPPAAL-TIGA by querying the TGA with verifytga -s -q -w0 TGA.xml dc.q > strategy, where dc.q contains the TCTL query control: A[] not tga.win
[…] and (if p ∈ K and S_K(p) = ⊤) and cP < ĉ ∧ bP < ĉ (if p ∈ K and S_K(p) = ⊥), cP = bP = ĉ otherwise. Finally, Player2 has finished taking controllable transitions at t. When t = 0 (i.e., ĉ = 0) Player2 cannot play in T_2 as no controllable transition is enabled. Player1 cannot play either because the current location is not T_1 and he can only get there after a positive amount of time has elapsed. Therefore, at t = 0, Z(0) = ⟨∅, ∅, ∅, ∅⟩.
When t > 0 (i.e., ĉ > 0) both Player1 and Player2 can play in their respective turns T1(t) and T2(t). Player1 can take gain to enter T1 at time t. Player2 cannot prevent him from doing so because gain, being urgent, has priority over any other controllable transition that Player2 could take at that time. So, Player1 plays first. Once in T1, Player1 can take (in general) a non-empty sequence of transitions to execute a few non-contingent time points and decide the truth values of some decidable propositions if he has executed some decision time points. Such a sequence is finite since there is a finite number of time points to execute and a finite number of propositions to assign. Furthermore, each time point (resp., proposition) can be executed (resp., assigned a value) only once. When this sequence of transitions is over, Player1 ends his turn by taking pass to lead the run back to T2. Since T1 is urgent, time has not elapsed. Therefore, the sequence of transitions taken at T1 corresponds to the sequence of moves made by Player1 in T1(t). Instead, if Player1 wants to wait at time t, he can either take gain and pass immediately after or just avoid taking gain. Now, at T2, Player2 does the same for contingent time points and observable propositions if some observation time points have been executed by Player1 in T1(t). When Player2 is done, the sequence of transitions taken models the sequence of moves made in T2(t). Since Player2 does not make any other move in T2(t), Z(t) can no longer be modified.
Player1 and Player2 are driven by their strategies σ1 and σ2, which say what moves to make (i.e., transitions to take) in T1(t) and T2(t) at any time t depending on the current Z. The purpose of σ1 is to keep Z(t) locally consistent, whereas that of σ2 is the opposite.
The strategies also satisfy their applicability conditions as Player1 can make his moves in T1(t) according to σ1 iff Player2 has not played yet in T2(t), whereas Player2 can make his moves in T2(t) according to σ2 iff either Player1 has not played in T1(t) or Player2 is not done in T2(t). We have already proved that for any t > 0, Player1 plays first.
The strategies satisfy their obligations as each time Player1 executes a decision time point D!, he also assigns a truth value to the associated decidable proposition d. This occurs at the same time but sequentially after the execution of D!. Player1 assigns ⊤ to d by not taking dFalse and assigns ⊥ to d by taking dFalse. If Player1 takes the transition, then he cannot take it again in the same turn (as the guard of dFalse no longer holds). If he does not, then he will never be able to take dFalse in any T1(t′) where t′ > t. Likewise, σ2 satisfies its similar obligation for observable propositions. Furthermore, σ2 also satisfies the obligation regarding contingent time points as the encoding generates a failC transition for each contingent time point C (belonging to a contingent link (A, x, y, C) ∈ ℒ) allowing Player1 to move to win if Player2 does not take exC within its upper bound y. Since Player2 wants to prevent Player1 from getting to win, σ2 is obliged to schedule C.

Both σ1 and σ2 satisfy their postconditions: the reset of cX clocks says when the time points were executed, whereas the values of bP clocks say what truth values the propositions have been assigned.

Finally, winning conditions are modeled differently depending on the player. For Player1 they are abstracted as a winning path checking that all time points and constraints whose labels are not falsified by cps have been executed and satisfied, respectively. For Player2 winning conditions correspond to scheduling a contingent time point at a particular time or deciding a truth value for an observable proposition (or any combination of these moves) such that Player1 is unable to satisfy at least one constraint and ends up blocked somewhere while going through the winning path before entering win.

Theorem 17. Any CSTNUD can be encoded into a TGA in polynomial time.
Proof Sketch. Our encoding subsumes that for CSTNUs, which runs in polynomial time [4,5]. We "worsen" that encoding by adding a dFalse transition for each d ∈ DP. For each X ∈ T, L_enc(X) and Π_enc(X) are computed in polynomial time by analyzing L(X) and C.

Related Work
STNs [10] and Drake [9] differ from CSTNUDs since they do not specify any uncontrollable part. Therefore, they are incomparable with CSTNUDs. STNUs [18] specify contingent durations as the unique uncontrollable part. The execution of non-contingent time points cannot influence any contingent duration. Instead, contingent durations do influence the real-value assignment to the non-contingent time points. However, such durations never prevent any non-contingent time point from being executed. This work also addresses the influence of the controllable part over the uncontrollable one.
CSTNs [14,20] specify conditional constraints as the unique uncontrollable part. Again, the execution of non-contingent time points cannot prevent any truth value assignment from happening. Instead, depending on what truth value a propositional variable is assigned, some time point might be excluded, at runtime, from the execution of the network. Similar explanations hold for CSTNUs [7,13], which merge CSTNs and STNUs. CSTNUDs are also able to prevent uncontrollable truth value assignments and durations from happening.
In [3] CSTNs are extended with decision nodes regulating the truth value assignments to some propositions under control. That work focuses on the complexity analysis of the DC-checking problem and provides constraint-propagation algorithms for special cases in which either the network specifies only decisions and no observations or all decisions are made before any observation. Moreover, contingent durations are not addressed. This work follows a completely different direction, starting from CSTNUs and relying on TGAs.
In temporal workflow management, the difference between controllable and uncontrollable XOR splits is introduced in [11], and a technique based on PERT-nets computes internal activity deadlines in order to meet the global ones. Some missed deadlines require human interaction for recovery. We rely on DC, which guarantees that we never miss any deadline.
In [19] UPPAAL-TIGA is used to synthesize a controller for timeline-based plans which consider multivalued state variables and networks of TGAs. Apart from time points, our variables are Boolean and our encoding involves one TGA only.

Conclusions and Future Work
We defined conditional simple temporal networks with uncertainty and decisions (CSTNUDs) as a unified formalism. CSTNUDs implicitly embed all minor temporal network formalisms such as STNs (if ℒ = OT = DT = ∅), CSTNs (if ℒ = DT = ∅), STNUs (if OT = DT = ∅), CSTNUs (if DT = ∅), STNDs (if ℒ = OT = ∅), CSTNDs (if ℒ = ∅), and STNUDs (if OT = ∅). We modeled the DC-checking of a CSTNUD as a two-player game where Player1 models the controller and Player2 models the environment, and gave the execution semantics in terms of move-based strategies. We provided an encoding from CSTNUDs into TGAs as an optimized extension of that given for CSTNUs and discussed the correctness and complexity of such an encoding. We automated the approach by making a tool we used to analyze and simulate the execution of the examples discussed in this paper. We also provided a preliminary experimental evaluation of the approach against a set of 1000 randomly generated CSTNUDs. As future work, we plan to address weak and strong controllability of CSTNUDs.

TIME 2017

Figure 3 Possible cases of the CSTNU in Figure 1 when substituting decision time points for observation ones. Missing labels on requirement links X → Y are all [1, 1], L(X) ∧ L(Y) (Figure 1).
! is a decision time point. The resulting CSTNUD is DC. Assume that we observe p. Regardless of what duration C1 takes, we can only decide d. Indeed, if we decided ¬d, regardless of the duration of C4 we would have to execute E before time 21, violating (P? − E ≤ −21, ¬d). Assume now that we observe ¬p. If C2 takes its minimal duration, d is the only good decision. If we decided ¬d and then C4 took its minimal duration, we would execute E at 18, violating (P? − E ≤ −21, ¬d). On the contrary, if C2 takes its maximal duration then we can only decide ¬d. If we decided d and C3 took its maximal duration, we would have to execute E at 21, violating (E − P? ≤ 20, d).
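The two violations mentioned in this caption can be checked mechanically. The sketch below is a minimal illustration, assuming that P? is executed at time 0 (which the passage implies but does not state here) and representing a label as a dictionary from propositions to truth values; the Constraint type and both helper functions are hypothetical names introduced only for this example.

```python
from typing import Dict, NamedTuple


class Constraint(NamedTuple):
    x: str
    y: str
    bound: float             # encodes x − y ≤ bound
    label: Dict[str, bool]   # {"d": False} stands for the label ¬d


def applies(c: Constraint, scenario: Dict[str, bool]) -> bool:
    """A labeled constraint matters only if the scenario makes its label true."""
    return all(scenario.get(p) == v for p, v in c.label.items())


def violated(c: Constraint, schedule: Dict[str, float], scenario: Dict[str, bool]) -> bool:
    return applies(c, scenario) and schedule[c.x] - schedule[c.y] > c.bound


lower = Constraint("P?", "E", -21, {"d": False})   # (P? − E ≤ −21, ¬d)
upper = Constraint("E", "P?", 20, {"d": True})     # (E − P? ≤ 20, d)

# Assumption: P? is executed at time 0.
print(violated(lower, {"P?": 0, "E": 18}, {"p": False, "d": False}))  # E at 18 after deciding ¬d
print(violated(upper, {"P?": 0, "E": 21}, {"p": False, "d": True}))   # E at 21 after deciding d
```

Both calls report a violation, whereas executing E at 18 after deciding d violates neither constraint, since the first one does not apply and the second one is satisfied.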
Network with Uncertainty and Decisions (CSTNUD) is a tuple ⟨T, OT, DT, P, O, L, ℒ, C⟩, where:
T, OT, P, L, ℒ, C are exactly the same as those given for CSTNUs. Furthermore, we denote the set of contingent time points as Contingent = {C | (A, x, y, C) ∈ ℒ}.
DT ⊆ T = {D!, E!, ...} is a set of decision time points such that OT ∩ DT = ∅.
O : P → DT ∪ OT is a bijection associating a unique observation or decision time point to each proposition. If O(p) ∈ OT, then p is called observable, whereas if O(d) ∈ DT, then d is called decidable.
OP ⊆ P = {p | O(p) ∈ OT} and DP ⊆ P = {d | O(d) ∈ DT} shorten the sets of all observable and decidable propositions, where OP ∩ DP = ∅.
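As a rough sketch of how this definition might be held in memory, the following Python dataclass stores the components that the definition adds or constrains and derives Contingent, OP and DP from them. The class layout and field names are assumptions made for illustration (labels and constraints are omitted for brevity); this is not the input format of the tool described in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class CSTNUD:
    time_points: Set[str]        # T
    observation_tps: Set[str]    # OT ⊆ T
    decision_tps: Set[str]       # DT ⊆ T
    propositions: Set[str]       # P
    obs: Dict[str, str]          # O : P → DT ∪ OT
    links: Set[Tuple[str, float, float, str]] = field(default_factory=set)  # (A, x, y, C)

    def __post_init__(self) -> None:
        if not (self.observation_tps | self.decision_tps) <= self.time_points:
            raise ValueError("OT and DT must be subsets of T")
        if self.observation_tps & self.decision_tps:
            raise ValueError("OT and DT must be disjoint")
        targets = list(self.obs.values())
        is_bijection = (
            set(self.obs) == self.propositions
            and len(set(targets)) == len(targets)
            and set(targets) == self.observation_tps | self.decision_tps
        )
        if not is_bijection:
            raise ValueError("O must be a bijection from P onto DT ∪ OT")

    @property
    def contingent(self) -> Set[str]:
        return {c for (_, _, _, c) in self.links}

    @property
    def observable(self) -> Set[str]:   # OP
        return {p for p, tp in self.obs.items() if tp in self.observation_tps}

    @property
    def decidable(self) -> Set[str]:    # DP
        return {p for p, tp in self.obs.items() if tp in self.decision_tps}


net = CSTNUD({"P?", "D!", "A", "C"}, {"P?"}, {"D!"}, {"p", "d"},
             {"p": "P?", "d": "D!"}, {("A", 1, 3, "C")})
print(net.observable, net.decidable, net.contingent)
```

Because O is a bijection and OT ∩ DT = ∅, the derived sets OP and DP partition P, which is why the constructor rejects overlapping OT and DT.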

[1,1] (cP = 1). More formally:

Definition 15 (Encoding predecessors). Given a CSTNUD, a predecessor of a time point Y ∈ T is a time point X ∈ T such that there exists a constraint (X − Y ≤ −x, L(Y)) ∈ C where x > 0. Π : T → 2^T returns the predecessors of a given time point and it is formalized as Π(Y) = {X | (X − Y ≤ −x, ℓ) ∈ C ∧ x > 0 ∧ ℓ = L(Y)}. A predecessor encoder is a

Two labels ℓ1, ℓ2 ∈ P* are consistent if and only if their conjunction ℓ1 ∧ ℓ2 is satisfiable. A label ℓ1 entails a label ℓ2 (written ℓ1 ⇒ ℓ2) if and only if all literals in ℓ2 appear in ℓ1 too (i.e., if ℓ1 is more specific than ℓ2). A label ℓ1 falsifies a label ℓ2 iff ℓ1 ∧ ℓ2 is inconsistent. For instance, if ℓ1 = p ∧ ¬q and ℓ2 = p, then ℓ1 and ℓ2 are consistent since p ∧ ¬q ∧ p is satisfiable, and ℓ1 entails ℓ2 since p ∧ ¬q ⇒ p.
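The label relations and the predecessor function Π of Definition 15 can be sketched in a few lines of Python. The representation is an assumption made for illustration: a label is a dictionary from propositions to truth values (a conjunction of literals), and a constraint (X − Y ≤ k, ℓ) is a tuple (X, Y, k, ℓ). The example constraints correspond to a requirement link P? → A1 labeled [1, 1], p with L(A1) = p.

```python
from typing import Dict, List, Set, Tuple

Label = Dict[str, bool]                      # {"p": True, "q": False} stands for p ∧ ¬q
Constraint = Tuple[str, str, float, Label]   # (X, Y, k, ℓ) stands for (X − Y ≤ k, ℓ)


def consistent(l1: Label, l2: Label) -> bool:
    """ℓ1 ∧ ℓ2 is satisfiable: no proposition occurs with opposite signs."""
    return all(l2.get(p, v) == v for p, v in l1.items())


def entails(l1: Label, l2: Label) -> bool:
    """ℓ1 ⇒ ℓ2: every literal of ℓ2 also appears in ℓ1."""
    return all(l1.get(p) == v for p, v in l2.items())


def falsifies(l1: Label, l2: Label) -> bool:
    return not consistent(l1, l2)


def predecessors(y: str, constraints: List[Constraint], label_of: Dict[str, Label]) -> Set[str]:
    """Π(Y) = {X | (X − Y ≤ −x, ℓ) ∈ C ∧ x > 0 ∧ ℓ = L(Y)}."""
    return {x for (x, y2, k, lab) in constraints
            if y2 == y and k < 0 and lab == label_of[y]}


l1, l2 = {"p": True, "q": False}, {"p": True}
print(consistent(l1, l2), entails(l1, l2), entails(l2, l1))

constraints = [("P?", "A1", -1, {"p": True}), ("A1", "P?", 1, {"p": True})]
label_of = {"P?": {}, "A1": {"p": True}}
print(predecessors("A1", constraints, label_of))
```

Only the constraint with a negative bound contributes a predecessor, so P? is a predecessor of A1 while A1 is not a predecessor of P?.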
is a tuple ⟨T, OT, P, O, L, ℒ, C⟩, where:
T = {X, Y, ...} is a finite set of time points (i.e., variables with continuous domain).
OT ⊆ T = {P?, Q?, ...} is a set of observation time points.
P = {p, q, ...} is a finite set of Boolean propositions.
O : P → OT is a bijection associating a unique P? ∈ OT to each p ∈ P (i.e., O(p) = P?).
L : T → P* is a function assigning a label to each time point X ∈ T.
ℒ is a finite set of contingent links (A, x, y, C), where A