Abstract 1 Introduction 2 Multiplayer reachability games 3 Permissiveness in strategies 4 Characterizations of permissive equilibria 5 Computation of permissive equilibria 6 Conclusion References

Permissive Equilibria in Multiplayer Reachability Games

Aline Goeminne F.R.S.-FNRS & UMONS – Université de Mons, Belgium Benjamin Monmege ORCID Aix-Marseille Univ, CNRS, LIS, Marseille, France
Abstract

We study multi-strategies in multiplayer reachability games played on finite graphs. A multi-strategy prescribes a set of possible actions, instead of a single action as usual strategies: it represents a set of all strategies that are consistent with it. We aim for profiles of multi-strategies (a multi-strategy per player), where each profile of consistent strategies is a Nash equilibrium, or a subgame perfect equilibrium. The permissiveness of two multi-strategies can be compared with penalties, as already used in the two-player zero-sum setting by Bouyer, Duflot, Markey and Renault [3]. We show that we can decide the existence of a multi-strategy profile that is a Nash equilibrium or a subgame perfect equilibrium, while satisfying some upper-bound constraints on the penalties in PSPACE, if the upper-bound penalties are given in unary. The same holds when we search for multi-strategies where certain players are asked to win in at least one play or in all plays.

Keywords and phrases:
multiplayer reachability games, penalties, permissive equilibria
Funding:
Aline Goeminne: Postdoctoral Researcher of the Fonds de la Recherche Scientifique – FNRS.
Benjamin Monmege: This author was partially funded by ANR JCJC Quasy ANR-23-CE48-0008.
Copyright and License:
[Uncaptioned image] © Aline Goeminne and Benjamin Monmege; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Software and its engineering Formal methods
; Theory of computation Logic and verification ; Theory of computation Solution concepts in game theory
Related Version:
Full Version: https://arxiv.org/abs/2411.13296 [14]
Editors:
Jörg Endrullis and Sylvain Schmitz

1 Introduction

Nowadays, computer systems are ubiquitous and increasingly complex. Errors in such systems can have dramatic consequences. This is why model checking provides a formal tool to ensure these systems are correct and meet certain specifications. Synthesis, on the other hand, allows for the construction of a correct-by-construction system model: concepts from game theory can be used for this purpose.

Two-player zero-sum games are commonly used to model a system interacting with its environment. In this model, the system aims to achieve a goal while the environment acts antagonistically to prevent it. This situation can be abstracted as a game played on a graph involving two players (the system and the environment). The graph represents the different possible configurations of the system, and an infinite path in this graph is a sequence of interactions between the system and the environment. In this model, building a correct system amounts to synthesizing a winning strategy, that is, a way for the system to play that ensures its goal is met regardless of the environment’s behavior.

Unlike the purely antagonistic view of two-player zero-sum games, multiplayer games allow for modeling situations where the environment may have its own goals, or where the system consists of different interacting components, each with its own specification. In this context, the notion of a winning strategy is no longer appropriate, hence notions of equilibria are studied: Nash equilibria or subgame perfect equilibria, which more adequately account for the sequential aspect of games played on graphs (avoiding non-credible threats). Intuitively, an equilibrium can be seen as a contract among players such that no player has an incentive to unilaterally change his strategy.

It is well known that different equilibria can coexist in the same game. In particular, a game may include an equilibrium where no player achieves his goal and an equilibrium where all players achieve their goals. The latter equilibrium is more relevant than the former. Therefore, it seems appropriate to focus on the existence and synthesis of relevant equilibria (according to certain relevance criteria).

Even if the synthesis process provides an equilibrium, its implementation may fail. This can be due to the occurrence of errors; for example, the action prescribed by the equilibrium may be unavailable. Synthesizing robust equilibria against such perturbations is therefore essential. To address these robustness issues, the classic notion of a player’s strategy can be replaced by the notion of a multi-strategy: unlike a classic strategy that provides a single action at each decision point, a multi-strategy provides a subset of possible actions (see, for example, [2, 3]).

Intuitively, a multi-strategy is more permissive than another if the first allows more behaviors than the second. There are different ways to express this permissiveness. A qualitative view of permissiveness is studied in [2], where a multi-strategy is more permissive than another if the set of resulting plays includes those of the second multi-strategy. A quantitative view is addressed in [3] via the notion of penalty of multi-strategies, where a cost is associated with each edge not chosen by the multi-strategy. Thus, the penalty of a multi-strategy is the highest sum of blocked edges along a play consistent with the multi-strategy.

Related works.

In [2], permissiveness in parity games (a highly expressive winning condition) is studied: considering the qualitative view of permissiveness, there does not necessarily exist a most permissive strategy. However, one exists when restricted to memoryless strategies (which always make the same decision in any given vertex of the game). By reducing to safety games, the authors show that it is possible to compute the most permissive strategy. In [3], the above-mentioned quantitative view of permissiveness is implemented. Several penalty measures and games are used, and the complexity of computing the most permissive strategies in this context is given. More general parity objectives are then studied in [5]. Recently, other methods have explored permissiveness in two-player games using templates to concisely represent multiple strategies in graph games [1]. This approach is also used in multiplayer games for the synthesis of secure equilibria [16].

Independently, different equilibria (Nash or subgame perfect) have been studied in multiplayer games to ensure a strategy profile where no player has an incentive to deviate. Several works have characterized such equilibria and studied the complexity of decision problems related to the existence of relevant equilibria. Notably, these works have focused on the study and characterization of (i) Nash equilibria in games where players have classical ω-regular objectives [17, 11], (ii) weak subgame perfect equilibria (a variant of subgame perfect equilibria) where players have classical ω-regular objectives [9] (this work also characterizes subgame perfect equilibria when the studied objectives are either qualitative reachability or safety objectives); (iii) subgame perfect equilibria for games with quantitative reachability objectives [10]; (iv) subgame perfect equilibria for games with parity objectives [6] or (v) mean-payoff objectives [8, 7].

Contribution.

Our goal is to combine these two research directions by studying permissiveness in strategy profiles that describe equilibria (Nash or subgame perfect). In this first work, we focus on reachability games only. We study permissive strategy profiles such that all the fully described strategy profiles they contain are equilibria. The motivation is to allow greater latitude and robustness of equilibrium profiles without losing quality in the final goal of secure synthesis. With the qualitative view, as in the two-player game framework [2], it is not difficult to show that there does not necessarily exist most permissive profiles that are Nash equilibria (or subgame perfect equilibria). We will thus consider a quantitative view of permissiveness similar to the penalty measures introduced in [3] for two-player games. We obtain a characterization based on trees, and decision algorithms with penalties bounded by a given threshold in polynomial space with respect to the size of the game and the maximal penalty bound (if this is encoded in unary). We also solve the problem of synthesis of robust and relevant equilibria, where the relevance is the constraint that all derived equilibria ensure that all players in a fixed subset satisfy their objective (strongly winning), or that at least one derived equilibrium ensures this guarantee (weakly winning).

All missing proofs can be found in the long version of this article [14].

2 Multiplayer reachability games

A (multiplayer) reachability games is a tuple (N,V,(Vi)iN,E,(Fi)iN,v0), that we denote (𝒢,v0) to emphasize the v0 component, where N={1,,n} is a finite set of n players, (V,E) is a finite directed graph without deadlocks (for all vV, there exists vV such that (v,v)E), (Vi)iN is a partition of V between the players, FiV is the set of target vertices, called target set, of player iN, and v0 is an initial vertex. Given a vertex vV, we let Succ(v)={vV(v,v)E} be the set of all successors of v.

A play in 𝒢 is an infinite sequence of vertices consistent with the graph structure, i.e., if ρ=ρ0ρ1 is a play, then for all k, ρkV and (ρk,ρk+1)E. The set of plays is denoted by Plays, while Plays(v) denotes the set of plays beginning in v. Given a play ρ=ρ0ρ1 and k, ρk is the suffix ρkρk+1 of ρ.

For each player iN, we let Gaini be the gain function that associates with each play the value 1 if the play is winning for player i, 0 if it is losing. For a reachability game as above, we have Gaini(ρ)=1 iff player i reaches his target set in ρ, i.e., ρ=ρ0ρ1 and there exists k with ρkFi. In the rest of this article, (𝒢,v𝟎) will always denote a reachability game associated with these gain functions.

A history is a finite sequence of vertices h=h0h1hk with k defined similarly. The set of histories is denoted by Hist, while Hist(v) denotes the set of histories beginning in v. For all iN, we write Histi to denote the set of histories ending in a vertex owned by player i. If h=h0hk with k is a history, Last(h) denotes the last vertex hk, while |h| denotes its length k. Given a history h=h0hk, Visit(h)={iN1khFi} is the set of players who visit their target set along h.

A strategy of player i is a function σi:Histi(v0)V that assigns to each history hvHisti(v0) a vertex v such that (v,v)E. A play ρ=ρ0ρ1 is consistent with a strategy σi if for all ρkVi, σi(ρ0ρk)=ρk+1. A strategy profile is a tuple σ=(σi)iN of strategies, one per player: there is a unique play from v0 which is consistent with each strategy σi, and we call this play the outcome of σ, denoted by σv0. To highlight the role of player i, we sometimes write σ=(σi,σi) where σi denotes the strategy profile of the players other than player i.

The strategy profile σ is a Nash equilibrium (NE) in (𝒢,v0) if no player has an incentive to deviate unilaterally from his strategy to increase his gain, i.e., if for all players iN and all strategies σi of player i, Gaini(σv0)Gaini(σi,σiv0).

The concept of subgame perfect equilibrium (SPE) takes more into account the sequential nature of games played on graphs by avoiding non-credible threat, a well-known weakness of NEs in this setting. Informally, a strategy profile is an SPE if it is an NE in all subgames. Given a history hvHist(v0), the subgame (𝒢h,v) is obtained from 𝒢 by changing the initial vertex to v, and by considering the gain functions (Gainih)iN taking into account the players that have won in history h: we thus write, for each iN, Gainih(ρ)=Gaini(hρ) for all ρPlays(v). Moreover, if σi is a strategy of player i in 𝒢, then σih is the strategy of player i in the subgame (𝒢h,v) such that for all hHisti(v), σih(h)=σi(hh). In the same way, from a strategy profile σ in 𝒢, we can derive a strategy profile σh in (𝒢h,v). We now define formally the concept of SPEs: a strategy profile σ is an SPE in 𝒢 if for all iN, for all hvHisti(v0), σh is an NE in (𝒢h,v). Notice that an SPE is an NE and that there always exists an SPE (and thus an NE) in a reachability game [17].

3 Permissiveness in strategies

Our goal is to allow for some permissiveness in strategies of all players, i.e., being able to underspecify the strategies of the players, while maintaining that they describe an NE or an SPE.

A multi-strategy of player i is a function Θi:Histi(v0)2V{} that assigns to each history hvHisti(v0) a non-empty set of vertices AV such that for all vA, (v,v)E. Notice that a strategy σi can be seen as a multi-strategy Θi where, for all hvHisti(v0), Θi(hv) is the singleton {σi(hv)}. A multi-strategy profile Θ=(Θi)iN is a tuple of multi-strategies, one per player.

Unlike strategies, when we fix a game 𝒢 and a multi-strategy profile Θ, there exist several plays beginning in v0 that are consistent with all the multi-strategies Θi. To describe them, we say that a strategy σi is consistent with a multi-strategy Θi, written σiΘi if for all hvHisti(v0), σi(hv)Θi(hv). We extend this notation to profiles of strategies, as expected. Then, we let Θv0 be the set of plays σv0 for all profiles σ of strategies consistent with the multi-strategy Θ. We call this set the outcomes of Θ. We also let Θv0H be the set of histories consistent with the multi-strategy Θ, i.e., the finite prefixes of plays in Θv0.

Our goal is to compute profiles of multi-strategies such that all profiles of consistent strategies are NEs or SPEs: such profiles of multi-strategies are called permissive NEs or permissive SPEs. By the existence of NEs and SPEs in reachability games, we straightforwardly obtain the existence of permissive NEs and permissive SPEs. We thus want to study most permissive NEs or SPEs, i.e., profiles of multi-strategies that are permissive NEs or SPEs, and such that no “more permissive” multi-strategies are still permissive NEs or SPEs.

The natural first attempt would be to look for a notion of “more permissive” that is set-theoretic, with respect to a given solution concept. We would thus say that a profile of multi-strategies Θ is at least as permissive as a profile of multi-strategies Θ if for all iN, for all histories hHisti(v0), Θi(h)Θi(h). Then, Θ would be more permissive than Θ if it is at least as permissive, while being different (for at most one history). Finally, Θ would be a most permissive NE or SPE if it is a permissive NE or SPE, respectively, and no permissive NE or SPE, respectively, is more permissive than Θ.

This natural definition is very problematic in the realm of reachability games (as already noticed in the context of winning strategies in parity games by [2]) where no most permissive NE or SPE could exist, as demonstrated by the game in Figure 1.

v0v1
Figure 1: In this game, player 1 owns all vertices and wants to reach v1. For all k, we define the multi-strategy Θ1k such that for all hHist(v0), Θ1k(h)={v0,v1} if Last(h)=v0 and |{nhn=v0}|k, and Θ1k(h)={v1} otherwise. We have that for all k, for all σ1Θ1k, Gain1(σ1v0)=1 (and thus Θ1k is a permissive SPE), but for all k, Θ1kv0Θ1k+1v0.

We thus propose another way to measure the permissiveness of a multi-strategy, inspired by the definition of penalty used in [3] to describe permissive winning strategies in two-player games. To define the notion of penalty in our context, we equip the game with a function w:E assigning a non-negative weight to each edge: if unspecified, we will consider that every edge has weight 1. The player who owns the vertex at the source of an edge e will pay the penalty w(e) if he decides to not include the edge e in his multi-strategy. All penalties are then counted additively. Formally, for a multi-strategy profile Θ, we first define for each player iN the penalty of player i w.r.t. Θ in a play ρ=ρ0ρ1 by induction on the length of its prefixes:

  • PenaltyiΘ(ε)=0 where ε denotes the empty prefix;

  • for h=ρ0ρk, PenaltyiΘ(hv)={PenaltyiΘ(h)+vSucc(v)Θi(hv)w(v,v) if vViPenaltyiΘ(h) otherwise;

  • PenaltyiΘ(ρ)=limk+PenaltyiΘ(ρ0ρk): this limit exists (it may be equal to +) since (PenaltyiΘ(ρ0ρk))k is a non-decreasing sequence of natural numbers.

There are several ways to associate a penalty with a multi-strategy profile Θ, depending on how we take into account the non-determinism offered in the multi-strategies. A first choice consists in considering a worst-case scenario in the outcomes (without considering the possible deviations). A second choice consists in considering only the deviations of one player, i.e., to consider that the retaliation of other players with respect to the deviation of a player will count in the final penalty. It is then possible to combine both types of penalties, though we will treat them separately in the rest of this article.

Definition 1 (Penalties).

Let Θ be a multi-strategy profile in (𝒢,v0). The main penalty and retaliation penalty of player i with respect to Θ are defined respectively as

MPenaltyi(Θ,v0) =supρΘv0PenaltyiΘ(ρ)
RPenaltyi(Θ,v0) =suphvHisti(v0)Θv0Hsup{PenaltyiΘ(ρ)ρΘhvv}

If there are no histories hv in Histi(v0)Θv0H, we let RPenaltyi(Θ,v0)=0.

The existence of a multi-strategy profile which satisfies some upper-bounds on penalties does not provide any certainty about the satisfaction of the reachability objectives of the players. For this reason, we also consider multi-strategy profiles that satisfy some properties on the set of players who satisfy their objective. Let Win be a subset of players and Θ be a multi-strategy profile. Then, Θ is said weakly winning if there exists a strategy profile σ which is consistent with Θ and such its outcome is winning for all players in Win. Similarly, Θ is said strongly winning if for each strategy profile σ which is consistent with Θ, its outcome is winning for all players in Win.

Definition 2 (Weakly and strongly winning).

Given a subset of player WinN and a multi-strategy profile Θ,

  • Θ is said weakly winning with respect to Win if there exists a strategy profile σ such that σΘ and for all iWin, Gaini(σv0)=1.

  • Θ is said strongly winning with respect to Win if for all strategy profiles σ such that σΘ, we have that for all iWin, Gaini(σv0)=1.

v0v2v1v3v4v5v6v7v8v910
Figure 2: An example of a reachability game where player 1 (resp. player 2) owns circle (resp. rectangle) vertices. The initial vertex is v0. Target vertices F1={v3,v6,v8,v9} of player 1 and F2={v4,v6} of player 2 are drawn with gray vertices and double-bordered vertices respectively.
v0v2v1v3v4v5v6v7v8v910(a)
v0v2v1v3v4v5v6v7v8v910(b)
Figure 3: Examples of permissive equilibria: (a) a permissive NE and (b) a permissive SPE.
Example 3.

An example of a reachability game with two players is depicted in Figure 2. The edge labelled with 10 corresponds to the penalty if player 2 decides not to allow this edge: all other penalties are set to 1 by default. A multi-strategy is represented with red edges (black dotted edges are thus the ones that are not selected in the multi-strategy) in Figure 3(a).111Notice that, in this example, the set of successors prescribed by multi-strategies only depends on the current vertex and not on the past history. All strategy profiles that are consistent with this multi-strategy depend on the choice of successor for v0 among {v1,v2}. It is indeed a permissive NE since the consistent strategies are NEs: player 1 has no interest in deviating from either v1 or v2 in v0, since all strategies lead to plays where he visits his target set, while going to v5 make him lose. It has a main penalty of 2 for player 1 and 0 for player 2. Player 1 can do slightly better by allowing the edge (v1,v4) in the multi-strategy: this remains a permissive NE (now player 2 wins in certain plays, but he is left with no real choices to make), and player 1 now gets a main penalty of 1. This modified permissive NE is strongly winning w.r.t. {1}, and weakly winning w.r.t. {1,2}. It is not a permissive SPE since player 2 has a profitable deviation from v5 by going to v6 where he wins. A permissive SPE is depicted in Figure 3(b), that is strongly winning w.r.t. {1}, but only weakly winning w.r.t {1,2}. Player 2 has a main penalty of 11 (because he cuts edges (v5,v5) and (v5,v7)), while player 1 has a retaliation penalty of 1 (because he cuts edge (v7,v9)). If we want a permissive SPE that is strongly winning w.r.t. {1,2}, we need to increase the main penalty of player 1 to 2 by removing edges (v1,v3) and (v0,v2). However, we may decrease to 0 the retaliation penalty of player 1 by adding the edge (v7,v9) (since it is equally good to him anyway).

We now define the problems we study in the rest of the article, where we use the word “equilibrium” to either mean NE or SPE, depending on the solution concept we want to check. In all these problems, we give different penalty bounds for the main penalty and the retaliation penalty. Notice though that the bounds can be set to +, relaxing the constraints in this case.

Problem 1 (Constrained penalty problem).

Given a reachability game (𝒢,v0), m({})n and r({})n, does there exist a permissive equilibrium Θ in (𝒢,v0) such that for all iN, MPenaltyi(Θ,v0)mi and RPenaltyi(Θ,v0)ri?

Problem 2 (Weakly winning with constrained penalty problem).

Given a reachability game (𝒢,v0), m({})n, r({})n and WinN, does there exist a permissive equilibrium Θ in (𝒢,v0) such that (i) for all iN, MPenaltyi(Θ,v0)mi and RPenaltyi(Θ,v0)ri and (ii) Θ is weakly winning w.r.t. Win?

Problem 3 (Strongly winning with constrained penalty problem).

Given a reachability game (𝒢,v0), m({})n, r({})n and WinN, does there exist a permissive equilibrium Θ in (𝒢,v0) such that (i) for all iN, MPenaltyi(Θ,v0)mi and RPenaltyi(Θ,v0)ri and (ii) Θ is strongly winning w.r.t. Win?

We show in the rest of this article that all these problems, for NEs and SPEs, are decidable in PSPACE, if the upper-bound penalties are encoded in unary. To do so, we characterize the permissive equilibria in the various problems in Section 4. In Section 5, we then show that tree-like witnesses can be found if the according permissive equilibria exist. These witnesses have a height bounded by a polynomial depending on the size of the game and the largest upper-bound on penalties. We use these witnesses to obtain the PSPACE decision procedures.

4 Characterizations of permissive equilibria

We now characterize permissive equilibria of the reachability game (𝒢,v0). This is a first step towards their computation in the next section. We provide a characterization for permissive NEs in Section 4.1 and one for permissive SPEs in Section 4.2. These characterizations are inspired by existing ones for classical NEs (resp. SPEs) [11, 9]. The latter rely on properties that a play (resp. a set of plays) must satisfy in order to be the outcome of an NE (resp. the set of subgame outcomes of an SPE). However, the outcomes of permissive equilibria are a set of plays and not a simple play. For that reason, the characterizations of permissive equilibria employ trees that we first formally define.

Trees.

We call tree over 𝒢 rooted at v (for some vV) any subset 𝒯 of non-empty histories of 𝒢 that contains v and such that if hu𝒯 then h𝒯. All h𝒯 are called nodes of the tree, the particular node v is called the root of the tree, and for all hu𝒯, h is called the parent of hu, and hu a child of h.

As for histories in an arena, for all hu𝒯, we let Last(hu)=u. The depth of a node h𝒯, written depth(h), is equal to |h| and its height, denoted by height(h), is given by sup{|Last(h)h|hHist and hh𝒯}. The height of the tree corresponds to the height of its root. A node h𝒯 is called a leaf if height(h)=0.

We denote by 𝒯hu, the subtree of 𝒯 rooted at u for some hu𝒯, that is the set of non-empty histories hHist(u) such that hh𝒯.

A (finite or infinite) branch of the tree is a maximal (finite or infinite) sequence of nodes h0h1 such that for all k, hk is the parent of hk+1. Finally, we denote by 𝒯 the set of plays in 𝒢 represented by infinite branches in 𝒯, i.e.,

𝒯={ρ0ρ1Playsthere exists a branch h0h1𝒯 st. k,ρk=Last(hk)}

In what follows, we consider outcomes of multi-strategies as trees. Indeed, given a multi-strategy Θ, Θv0H can be seen as a tree 𝒯 over 𝒢 rooted at v0 and Θv0 corresponds to 𝒯. In particular, penalties can also be defined on trees, mimicking the definition for profiles of multi-strategies. The penalty of a tree 𝒯 for a player i, denoted by Penaltyi(𝒯), is the maximal penalty of a branch of 𝒯, the penalty of a branch being equal to the penalty of the associated play ρ w.r.t. any profile of multi-strategies that is consistent with the choices appearing in 𝒯 along the play ρ. Formally, let 𝒯 be a tree and iN be a player. For each hv=v1vkv𝒯, we define Blocked(h)={uSucc(vk)hu𝒯} as the set of blocked successors of h in 𝒯 and

Penaltyi(hv)={0 if h=εPenaltyi(h)+uBlocked(h)w(vk,u) if vkViPenaltyi(h) otherwise.

Moreover, for all plays ρ=ρ0ρ1𝒯, we let Penaltyi(ρ)=limk+Penaltyi(ρ0ρk). Thus, the penalty of a tree 𝒯 for a player iN is naturally defined as:

Penaltyi(𝒯)=sup{Penaltyi(ρ)ρ𝒯}.

4.1 Characterization of permissive Nash equilibria

In order to characterize permissive Nash equilibria, we start by defining good trees, by checking two conditions. The first one, called resistance to internal deviations, means that at any node h of the tree such that Last(h) belongs to player i, if h has at least two children, the plays starting with h are either all losing, or all winning, for player i. The second one, called resistance to external deviations, means that at any node hu of the tree with u belonging to player i, if player i has the possibility to play to a successor u not in the tree from which he has a winning strategy, then all plays in the subtree from hu must be winning for player i.

Definition 4.

Let 𝒯 be a tree over (𝒢,v0).

  1. 1.

    Given a subset of players DN, the tree 𝒯 is D-resistant to internal deviations if for all iD and for all hv𝒯 such that vVi and |{hvv𝒯vV}|2, we have that for all ρ,ρ𝒯hv, Gaini(hρ)=Gaini(hρ). If D=N, we simply say that 𝒯 is resistant to internal deviations.

  2. 2.

    The tree 𝒯 is resistant to external deviations if for all hu𝒯 with uVi and iVisit(hu), if there exists uSucc(u) such that huu𝒯 and player i has a winning strategy from u (against the coalition of the other players), then for all plays ρ𝒯hu, Gaini(ρ)=1.

  3. 3.

    The tree 𝒯 is good if it is resistant to internal and external deviations.

The resistance to internal and external deviations leads to the characterization of outcomes of permissive NEs (Theorem 5): given a good tree 𝒯, there exists a permissive NE such that its outcomes are the plays corresponding to the infinite branches of 𝒯 iff 𝒯 is good.

Theorem 5.

Let 𝒯 be a tree over (𝒢,v0) rooted at v0. The following assertions are equivalent:

  1. 1.

    There exists a permissive NE Θ in (𝒢,v0) such that Θv0H=𝒯;

  2. 2.

    The tree 𝒯 is good.

 Remark 6.

For all multi-strategies Θ, and all players iN, the penalty MPenaltyi(Θ,v0) is equal to the penalty of player i in the good tree Θv0H, i.e., Penaltyi(Θv0H). The construction of Theorem 5 thus also preserves the main penalties.

Proof sketch.

For (12) let us assume that Θ is a permissive NE and that Θv0H=𝒯. We have to prove that 𝒯 is good. If 𝒯 is not resistant to internal deviations that means that from some vertex v there exists two plays ρ, crossing u, and ρ, crossing uu, such that: ρ is winning for player i and ρ is losing for player i, see Figure 4(a). In particular, we can build a strategy profile σ consistent with Θ such that σhv=ρ and σhvu=ρ1. Meaning that player i should deviate by choosing u instead of u from v, meaning that σ is not an NE and Θ is not a permissive NE. If 𝒯 is not resistant to external deviations, that means that from some vertex u of player i there exists a play ρ such that Gaini(ρ)=0 and u a successor of u outside 𝒯 from which player i can win, see Figure 4(b). Thus we can build a strategy profile σ consistent with Θ such that σv0=hρ. In this way, player i should choose to go in u and then follow a winning strategy meaning that σ is not an NE and Θ not a permissive NE.

For (21), let us assume that 𝒯 is a good tree. We build a permissive NE Θ such that its outcomes are the plays corresponding to the infinite branches of 𝒯. Additionally, if a player i deviates from 𝒯, the coalition of the other players plays its retaliation222This retaliation strategy corresponds to the winning strategy of player 2 in a two-player zero-sum reachability game in which player 1 is player i and wants to reach Fi and player 2 is the coalition of the other players and wants to avoid visiting Fi [15, Chapter 2]. strategy to prevent player i from deviating.

v0vVihuuρρ(a)
v0uVihuρ(b)
Figure 4: Examples of trees that do not respect: (a) the resistance to internal deviations since Gaini(ρ)=0 but Gaini(ρ)=1; (b) the resistance to external deviations since Gaini(ρ)=0 but Player i can win from u.

4.2 Characterization of permissive subgame perfect equilibria

Permissive subgame perfect equilibria are intrinsically more complex than permissive Nash equilibria. Thus their characterization cannot only rely on the outcomes from the initial vertex, it should also take into account the outcomes in all subgames. This is the reason why, in order to deal with a compact representation of outcomes of a permissive SPE and its subgames, we introduce the notion of forest. Then, we generalize the definition of good trees to define good forests needed to characterize SPEs instead of NEs.

Forests and penalties of forests.

Trees of the forest are indexed by tuples (i,v,I)N×V×2N. More precisely, we let

={(0,v0,I0)}{(i,v,I)N×V×2NhvHisti(v0) st. vSucc(v)I=Visit(hvv)}

where I0={iNv0Fi}. Apart from the special tuple (0,v0,I0), a tuple (i,v,I) represents the fact that v is a vertex played by player i and reachable from v0, and that all players in I have already seen their target when v is reached. A forest in (𝒢,v0) is thus a set of trees ={𝒯i,v,I(i,v,I)} such that 𝒯i,v,I is a tree without leaves over 𝒢 rooted at v. The intuition behind this object is that the tree 𝒯0,v0,I0 represents the outcomes of a multi-strategy Θ and the other trees 𝒯i,v,I represent the outcomes of Θhv in the subgames (𝒢hv,v) for all hvHisti(v0) such that Visit(hvv)=I.

Moreover the main (resp. retaliation) penalty of a forest for a player iN are respectively given by

MPenaltyi()=Penaltyi(𝒯0,v0,I0) and RPenaltyi()=sup𝒯i,v,I(i,v,I)OutPenaltyi(𝒯i,v,I)

where Out={(i,v,I){(0,v0,I0)}hvHist(v0) st. hv𝒯(0,v0,I0)Last(h)ViI=Visit(hv)} described the indices of trees in the forest that are deviations from the main tree 𝒯0,v0,I0. If Out is empty, we let RPenaltyi()=0.

Characterization.

Following the same philosophy as for permissive NEs, a forest is good if each tree 𝒯i,v,I of the forest satisfies two properties. The first one is that 𝒯i,v,I has to be (NI)-resistant to internal deviations, exactly as for permissive NEs except that we take into account players who have already visited their target set, i.e., players in I. The second one, called resistance to constrained external deviations, means that at any node hu of the tree such that u belongs to player j, if player j has the possibility to jump to another tree 𝒯j,u,I by playing to a successor u not in the tree and if there exists a play in this latter tree which is winning for player j, then all plays after hu in 𝒯i,v,I have to be winning for player j.

Definition 7 (Good forest).

Let be a forest in (𝒢,v0).

  1. 1.

    A tree 𝒯i,v,I is resistant to constrained external deviations if it satisfies the following property: for all hu𝒯i,v,I and jN such that we have that (i) uVj and jIVisit(hu) and (ii) there exists uSucc(u) such that huu𝒯i,v,I, if there exists ρ𝒯j,u,I, where I=IVisit(huu), such that Gainj(ρ)=1, then for all ρ𝒯i,v,Ihu, Gainj(ρ)=1.

  2. 2.

    The forest is good if each tree 𝒯i,v,I is (NI)-resistant to internal deviations (see (1) in Definition 4) and resistant to constrained external deviations.

Thanks to good forests, we are able to characterize the outcomes of permissive SPEs: given a good tree 𝒯, there exists a permissive SPE such that its outcomes correspond to 𝒯 iff there exists a good forest whose “main” tree is 𝒯, i.e., 𝒯0,v0,I0=𝒯. With some other constraints, this also preserves strongly (resp. weakly) winning and penalty properties.

Theorem 8.

Let m({})n and r({})n be upper thresholds. Let 𝒯 be a tree rooted at v0 and WinN be a set of players. The following assertions are equivalent:

  1. 1.

    There exists a permissive SPE Θ in (𝒢,v0) such that:

    1. (a)

      Θv0H=𝒯;

    2. (b)

      Θ is strongly winning w.r.t. Win;

    3. (c)

      for all iN, MPenaltyi(Θ,v0)mi and RPenaltyi(Θ,v0)ri.

  2. 2.

    There exists a good forest in (𝒢,v0) such that:

    1. (a)

      𝒯0,v0,I0=𝒯;

    2. (b)

      for all ρ𝒯0,v0,I0, for all iWin, Gaini(ρ)=1;

    3. (c)

      for all iN, MPenaltyi()mi and RPenaltyi()ri.

These assertions are still equivalent by replacing 1b by “Θ is weakly winning w.r.t. Win” and 2b by “there exists ρ𝒯0,v0,I0 such that for all iWin, Gaini(ρ)=1”.

Proof sketch.

For (12), let us assume that Θ is a permissive SPE. We build a good forest such that 𝒯0,v0,I0 is the outcomes of Θ, i.e., 𝒯0,v0,I0=Θv0H, and a tree 𝒯i,v,I is a representative of the outcomes of Θhv in some subgame (𝒢hv,v) such that vVi and Visit(hvv)=I. In order to obtain a good forest and since several hvv could satisfy those properties, each representative 𝒯i,v,I has to be chosen in a proper way: it has to minimize the maximal gain of player i for plays in 𝒯i,v,I. More formally, for each (i,v,I), we let 𝒪(i,v,I)={ΘhvvHhvvHist(v0)vViI=Visit(hvv)} and we choose 𝒯i,v,I𝒪(i,v,I) such that max{Gaini(ρ)ρ𝒯i,v,I}=min𝒯𝒪(i,v,I)max{Gaini(ρ)ρ𝒯}.

Thanks to this latter property, is good. Indeed, let 𝒯i,v,I be a tree of . Exactly as for permissive NEs, if 𝒯i,v,I is not (NI)-resistant to internal deviations, we can build a strategy profile σ consistent with Θ such that the restriction of σ is not an NE in a subgame corresponding to 𝒯i,v,I. If 𝒯i,v,I is not resistant to constrained external deviations that means that from some node u, owned by player j, there exists a play ρ losing for player j and player j could choose to play outside 𝒯i,v,I by jumping to a tree 𝒯j,u,I in which there exists a play ρ winning for him, see Figure 5. Let g be the history such that 𝒯i,v,I represents the outcomes of Θg in (𝒢g,v). Notice that Θghu may be different from 𝒯j,u,I. However thanks to the way in which this representative is chosen, we have that there exists a play ρ′′ in Θghu with Gainj(ρ′′)=1. Thus, we can build a strategy σ consistent with Θ such that σghu=ρ and σghuu=ρ′′. This means that player j could deviate by choosing u instead of u from v in the subgame (𝒢g,v), thus σ would not be an SPE and Θ not a permissive SPE.

For (21), from a good forest a multi-strategy is build such that its subgame outcomes are the trees of . This forms a permissive SPE because is good.

v𝒯i,v,IuVjhρu𝒯j,u,Iρ
Figure 5: Example of forest that does not respect the resistance to constrained external deviations since Gaini(ρ)=0 but Gaini(ρ)=1.

For now, good trees and trees in good forests are infinite, but Section 5 will show that we can represent some trees using a finite representation (intuitively, by supposing that every branch ends with a lasso in the game). It is this finite representation of good trees and good forests that will be used to decide the constrained penalty problems for permissive NEs and permissive SPEs, thanks to the characterizations of Theorems 5 and 8.

5 Computation of permissive equilibria

Theorems 5 and 8 characterize permissive NEs and SPEs with respect to infinite tree-shaped objects. In this section, we use these characterizations in order to decide the various penalty problems defined in Section 3: we check the existence of the good infinite tree-shaped objects by checking the existence of finite symbolic representations of such objects. We start by describing for a single tree this symbolic representation, and show that there exists a polynomial-size such representation (when the penalty upper-bounds are encoded in unary).

5.1 Symbolic trees and forests

Definition 9.

A symbolic tree is a pair 𝒰=(𝒯,f) with 𝒯 a finite tree (i.e., a finite subset of non-empty histories of 𝒢), and f a function mapping each leaf h of 𝒰 to a non-empty set of successor nodes h that are ancestors of h in 𝒰 such that (Last(h),Last(h))E.

A symbolic tree can be unfolded into an infinite tree by repeatedly expanding the leaves of 𝒰 using as successors the choice prescribed by f. We denote by 𝒰~ the infinite tree obtained by unfolding the symbolic tree 𝒰. Similarly, the notions of symbolic forest , where every tree in it is a symbolic tree, and unfolding of symbolic forest ~ can be defined.

In order to treat simultaneously NEs and SPEs, we introduce a new definition generalizing the resistance to external deviations and constrained external deviations. For a vector γ{0,1}N×V×2N of gains, and a subset DN of players (that represent players that did not already win at the beginning of the tree), we say that a tree 𝒯 is (γ,D)-resistant if for all hu𝒯 with uVi and uSucc(u) with huu𝒯, if γi,u,(ND)Visit(huu)=1, if i(ND)Visit(hu), then for all plays ρ𝒯hu, Gaini(ρ)=1.

 Remark 10.

The notion of (γ,D)-resistance is close to the resistance to external deviations and constrained external deviations, so that we directly obtain from Theorems 5 and 8:

  • Let γ𝒢 be defined as follows: for all (i,u,I), we let γi,u,I𝒢 equals 1 iff player i belongs to I or can win from u against the coalition of the other players in 𝒢. Let 𝒯 be a tree. Then, 𝒯 is a good tree iff 𝒯 is resistant to internal deviations and (γ𝒢,N)-resistant.

  • Let be a forest and let γ defined as follows: for all (j,u,J), we let γj,u,J equals 1 iff player j belongs to J or the tree 𝒯j,u,J contains at least one branch with a vertex of Fj. Then, is a good forest iff each each tree 𝒯i,v,I of is (NI)-resistant to internal deviations and (γ,NI)-resistant.

The challenge to make this remark a decision procedure is to make the tree and forest finitely representable. We treat each tree independently of each other, thus explaining how to symbolically represent one single tree in the following proposition:

Proposition 11.

Let 𝒯 be a tree that is D-resistant to internal deviations, with DN. We let γ{0,1}N×V×2N be a vector of gains such that 𝒯 is (γ,D)-resistant, and (Pi)iN be finite constraints on penalties for a subset NN of players. There exists a symbolic tree 𝒰, that is a subtree of 𝒯, of height polynomial in the number of players and vertices of 𝒢, and in the largest bound on penalty Pi, such that the infinite tree 𝒰~ satisfies the following properties:

  1. 1.

    𝒰~ is D-resistant to internal deviations;

  2. 2.

    in 𝒰~, every player iN has a penalty at most Pi;

  3. 3.

    𝒰~ is (γ,D)-resistant.

Moreover, for a subset Win of players, if we start with 𝒯 that is strongly (respectively, weakly) winning w.r.t. Win, then we can make the above construction so that moreover 𝒰~ is strongly (respectively, weakly) winning w.r.t. Win.

Figure 6: Construction of the symbolic tree.

The proof of this result goes by several steps, that we briefly sketch here only in the case where 𝒯 is strongly winning w.r.t. Win. Figure 6 depicts the notions used in the construction of the symbolic tree. First, we consider the smallest subtree of 𝒯 where leaves are such that all players of Win have visited their target set: this subtree is finite by König’s lemma, since all branches of 𝒯 have such a node where all players of Win have won, and the tree is finitely branching. This subtree is called the core. We then continue considering the parts of 𝒯 outside the core, in order to complete the branches so that: the D-resistance to internal deviations is fulfilled (if a player has won in a certain branch of a subtree, he must win in all of them), the (γ,D)-resistance is fulfilled (if γ gives a constraint in the current node for a player i, all the branches of this subtree should visit a target vertex of i). This extension of the core is cut into two parts: the expanded core that ends in places where all the new players that must visit their target because of D-resistance to internal deviations and (γ,D)-resistance have indeed won; the completion of branches in order to then find leaves of the symbolic tree where all successors can be replaced (with function f) by similar nodes in the same branch, and the lassos thus formed are such that the penalty of players that have a finite penalty threshold does not increase along them. We show that these completions of branches can be chosen of polynomial length. We then compress the core and expanded core so that they also have polynomial height.

The symbolic tree 𝒰 thus built is a subtree of 𝒯 (even if its unfolding 𝒰~ is not): in particular, as a corollary, if a player j has no winning play in 𝒯, he does not have a winning play in 𝒰 neither. In particular, when we apply independently this proposition to all the trees of a forest , to obtain a symbolic forest , this remark allows us to check that the new vector γ~ has all its components not above the corresponding ones in γ (if γi,v,I=0 then γi,v,I~=0). In particular, if the tree 𝒯i,v,I of the forest is (γ,NI)-resistant, then the tree 𝒰~i,v,I of the symbolic forest is (γ~,NI)-resistant.

Finally, by combining this result with Remark 10, we obtain the following corollaries that allow us to obtain the PSPACE decision procedures:

Corollary 12.

Let m({})n be upper thresholds, and M be the largest such upper threshold. The following assertions are equivalent:

  1. 1.

    There exists a permissive NE Θ in (𝒢,v0) such that:

    (a) Θ is strongly winning w.r.t. Win; (b) for all iN, MPenaltyi(Θ,v0)mi.

  2. 2.

    There exists a symbolic tree 𝒯~ in (𝒢,v0) of height polynomial in the number of players and vertices of 𝒢 and in M, such that

    (a) 𝒯~ is resistant to internal deviations, and (γ𝒢,N)-resistant, with γ𝒢 defined in Remark 10; and (b) for all ρ𝒯~ and iWin, Gaini(ρ)=1; and (c) for all iN, Penaltyi(𝒯~)mi.

These assertions are still equivalent by replacing 1(a) by “Θ is weakly winning w.r.t. Win” and 2(b) by “there exists ρ𝒯~ such that for all iWin, Gaini(ρ)=1”.

Corollary 13.

Let m({})n and r({})n be upper thresholds, and M be the largest such upper threshold. The following assertions are equivalent:

  1. 1.

    There exists a permissive SPE Θ in (𝒢,v0) such that:

    (a) Θ is strongly winning w.r.t. Win; and (b) for all iN, MPenaltyi(Θ,v0)mi and RPenaltyi(Θ,v0)ri.

  2. 2.

    There exists a symbolic forest in (𝒢,v0), where each symbolic tree has a height polynomial in the number of players and vertices of 𝒢 and in M, such that (a) each tree 𝒯~i,v,I is (NI)-resistant to internal deviations, and (γ,N)-resistant, with γ defined in Remark 10; (b) for all ρ𝒯~0,v0,I0 and iWin, Gaini(ρ)=1; and (c) for all iN, MPenaltyi(~)mi and RPenaltyi(~)ri.

These assertions are still equivalent by replacing 1(a) by “Θ is weakly winning w.r.t. Win” and 2(b) by “there exists ρ𝒯~0,v0,I0 such that for all iWin, Gaini(ρ)=1”.

5.2 Decision problems over permissive Nash equilibria

For permissive NEs, it makes little sense to take into consideration the retaliation penalties, since the punishment after a deviation should definitely make the deviator lose whatever the penalty from now on. We thus obtain the following decision result:

Theorem 14.

The constrained penalty problem, the weakly winning with constrained penalty problem and the strongly winning with constrained penalty problem, all with infinite (and thus no) constraints on retaliation penalties and for NEs are decidable in PSPACE (when the penalty bounds are encoded in unary).

Proof.

We build upon Corollary 12, looking for a finite symbolic tree with the corresponding properties. We first explain how to solve the constrained penalty problem, and explain afterwards the adaptation for the two other problems. The idea is to use an alternating polynomial time Turing machine (since AP=PSPACE [12]) to guess a symbolic tree, checking the various constraints over it by using branch per branch. We describe the construction by supposing that the states of the Turing machine are split between existential states (where the machine accepts if at least one execution accepts) and universal states (where the machine accepts if all the executions accept). Existential states thus allow us to non-deterministically guess the finite symbolic tree node after node. We use a polynomial counter to keep track of the polynomially bounded height of the tree: if the counter goes over the polynomial bound, the execution of the alternating machine fails. At each node, existential states guess non-deterministically the set of successors on the working tape.

Universal states allow us to check several pieces of information on the guessed symbolic tree: the resistance to internal deviations, the constraint on the penalty for each player, and the (γ𝒢,N)-resistance, with γ𝒢 as in Remark 10. Notice that this vector has exponential size, but the index I in a triple (i,v,I) is useless (apart from knowing if iI), and can thus be ignored: moreover, this set I will be maintained along the execution of the algorithm. This vector can thus be precomputed in (deterministic) polynomial time by determining, for each player, their set of winning vertices (against the coalition of the other players) [15].

The various checks can be performed branch per branch by keeping some pieces of information in memory, not only for the current node of the symbolic tree, but also for the whole current branch (this remains in polynomial space). Universal states are thus used to perform the checks on all the branches of the guessed tree.

  • Checking the penalty for player i. If we have to check that the main penalty of player i is bounded by a threshold mi (i.e., that the penalty of player i over each branch is bounded by mi), we keep in memory the current penalty, forbidding for it to go above mi.

  • Checking the resistance to internal deviations and (γ𝒢,N)-resistance. At each node of the guessed tree, if the existential states guessed at least two successors, or depending on the vector γ𝒢 (for a vertex v where γ𝒢 has value 1, and that has not been chosen among the set of successors), we must remember constraints on the successors: either (a) all plays in their subtrees must be winning for a certain player i, or (b) none. We could add neither constraint (a) nor (b) for a certain player (if only one successor has been chosen, and the γ𝒢 value of all the other successors is 0). In the case where only the resistance to internal deviation applies (if at least two successors have been chosen, and the γ𝒢 value of all the other successors is 0), the choice of constraint (a) or (b) is guessed non-deterministically. These constraints are kept all along the guessed branch except if a vertex of the target set of player i is visited; in this case the constraint (a) is released. Moreover, the constraint (b) for a player i forbids to select a successor in the future where player i visits one of his target vertices.

  • The end of the branches. The existential states decide when to stop the branch of the symbolic tree (before the counter runs out of the polynomial bound). Notice that the branch cannot stop if one of the type (a) constraints is not released. Then existential states provide the set of successors taken in the ancestors so that for players that have a finite upper threshold on their penalty, ancestors must have the same current penalty as the leaf (to ensure that their penalty does not raise to + in the long run).

For the strongly winning variants, universal states also check the constraint that every player of Win must win at the end of each branch. For the weakly winning variant, the existential states are also used to propose a branch where all players of Win will win. The universal states moreover check whether this condition is fulfilled for this particular branch.

5.3 Decision problems over permissive subgame perfect equilibria

Theorem 15.

The constrained penalty problem, the weakly winning with constrained penalty problem and the strongly winning with constrained penalty problem for SPEs are decidable in PSPACE (when the penalty bounds are encoded in unary).

Proof.

The proof is the same as for NEs, instead of the fact that we use Corollary 13, with a vector γ that is partially guessed non-deterministically when it is needed. When the existential states extend a branch of the tree 𝒯 from a vertex of player i, the universal states does not only explore the chosen successors (with constraints (a) or (b) as in the previous proof), but now also explores the other vertices u by starting a fresh exploration of another tree 𝒯i,u,I of the forest. Existential states also non-deterministically guess if player i is weakly winning in 𝒯i,u,I. If so, this gives new constraints (a) in the tree 𝒯. The guessed weakly winning constraints are then checked in the fresh exploration: if player i must be weakly winning, this is a constraint of the same type as a weakly winning constraint in the “main” tree; if player i must not be weakly winning, this is a constraint of type (b) (none of the play must be winning for player i) that we deal as before.

Main penalties are checked as before. For the retaliation penalties, for each player, we check that the total penalty of all new symbolic trees 𝒯i,u,I is below the given upper threshold. To ensure polynomial time termination, we maintain a polynomial counter, and the set of trees (more precisely, the set of triples (i,u,I) used to index the trees of the forest) we jumped in so far. The polynomial counter again takes care of the depth of the branch we explore in the current tree (we reset this counter when we jump from a tree to another one). The set of trees we jumped in so far is maintained to forbid several explorations of the same tree of the forest. As for NEs, the exploration is losing if the depth of the current branch is longer than the polynomial bound. The cardinal of the set of triples (i,u,I) we must maintain is also polynomial (bounded by |N|×|V|×|N|, even though there are exponentially many trees in a forest), since the subset I of winning players does not decrease along the jumps from a tree to the next one. This also implies that the total length of the executions of the Turing machine is indeed polynomial.

Notice that weakly and strongly winning conditions have only to be checked on the “main” tree as for permissive NEs.

6 Conclusion

We studied the permissiveness in Nash, and subgame perfect equilibria over multiplayer reachability games. We showed that several associated problems are decidable in PSPACE: they ask for the existence of such equilibria with various constraints, both on the set of players who reach their target set, and on the penalties that allow us to compare the permissiveness of two equilibria. The polynomial space depends on the size of the game, and the largest upper threshold on the penalties. We were not able to decrease the space dependency to be only polynomial in the logarithm of the penalty thresholds: we leave for future work to investigate if this is possible, or if there is a matching lower bound on complexity.

As other ideas for future works, we would like to extend our study to other objectives than reachability, like more general ω-regular objectives (e.g., parity games), but also weighted games like mean-payoff games, discounted-payoff games, or shortest-path games (where the reachability objective is combined with an objective to reach the target with the smallest possible total weight). An even more challenging problem is to extend this study to the setting of timed games, where the permissiveness is not only on the choice of edges, but also on the choice of delays spent in a given vertex. Work along these lines has been carried out on timed automata and two-player timed games [4, 13].

References

  • [1] Ashwani Anand, Satya Prakash Nayak, and Anne-Kathrin Schmuck. Synthesizing permissive winning strategy templates for parity games. In CAV 2023, volume 13964 of LNCS, pages 436–458. Springer, 2023. doi:10.1007/978-3-031-37706-8_22.
  • [2] Julien Bernet, David Janin, and Igor Walukiewicz. Permissive strategies: from parity games to safety games. RAIRO Theor. Informatics Appl., 36(3):261–275, 2002. doi:10.1051/ITA:2002013.
  • [3] Patricia Bouyer, Marie Duflot, Nicolas Markey, and Gabriel Renault. Measuring permissivity in finite games. In CONCUR 2009, volume 5710 of LNCS, pages 196–210. Springer, 2009. doi:10.1007/978-3-642-04081-8_14.
  • [4] Patricia Bouyer, Erwin Fang, and Nicolas Markey. Permissive strategies in timed automata and games. Electron. Commun. Eur. Assoc. Softw. Sci. Technol., 72, 2015. doi:10.14279/TUJ.ECEASST.72.1015.
  • [5] Patricia Bouyer, Nicolas Markey, Jörg Olschewski, and Michael Ummels. Measuring permissiveness in parity games: Mean-payoff parity games revisited. In ATVA 2011, volume 6996 of LNCS, pages 135–149. Springer, 2011. doi:10.1007/978-3-642-24372-1_11.
  • [6] Léonard Brice, Jean-François Raskin, and Marie van den Bogaard. On the Complexity of SPEs in Parity Games. In CSL 2022, volume 216 of LIPIcs, pages 10:1–10:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.CSL.2022.10.
  • [7] Léonard Brice, Jean-François Raskin, and Marie van den Bogaard. The Complexity of SPEs in Mean-Payoff Games. In ICALP 2022, volume 229 of LIPIcs, pages 116:1–116:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.ICALP.2022.116.
  • [8] Léonard Brice, Jean-François Raskin, and Marie van den Bogaard. Subgame-perfect equilibria in mean-payoff games. Logical Methods in Computer Science, 19, 2023. doi:10.46298/LMCS-19(4:6)2023.
  • [9] Thomas Brihaye, Véronique Bruyère, Aline Goeminne, and Jean-François Raskin. Constrained existence problem for weak subgame perfect equilibria with ω-regular boolean objectives. In GandALF 2018, volume 277 of EPTCS, pages 16–29, 2018. doi:10.4204/EPTCS.277.2.
  • [10] Thomas Brihaye, Véronique Bruyère, Aline Goeminne, Jean-François Raskin, and Marie van den Bogaard. The complexity of subgame perfect equilibria in quantitative reachability games. In CONCUR 2019, volume 140 of LIPIcs, pages 13:1–13:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.CONCUR.2019.13.
  • [11] Thomas Brihaye, Véronique Bruyère, Aline Goeminne, and Nathan Thomasset. On relevant equilibria in reachability games. In RP 2019, volume 11674 of LNCS, pages 48–62. Springer, 2019. doi:10.1007/978-3-030-30806-3_5.
  • [12] Ashok K. Chandra, Dexter C. Kozen, and Larry J. Stockmeyer. Alternation. Journal of the ACM, 28(1):114–133, 1981. doi:10.1145/322234.322243.
  • [13] Emily Clement, Thierry Jéron, Nicolas Markey, and David Mentré. Computing maximally-permissive strategies in acyclic timed automata. In FORMATS 2020, volume 12288 of LNCS, pages 111–126. Springer, 2020. doi:10.1007/978-3-030-57628-8_7.
  • [14] Aline Goeminne and Benjamin Monmege. Permissive equilibria in multiplayer reachability games. Technical Report 2411.13296, arXiv, 2024. doi:10.48550/arXiv.2411.13296.
  • [15] Erich Grädel, Wolfgang Thomas, and Thomas Wilke, editors. Automata, Logics, and Infinite Games: A Guide to Current Research [outcome of a Dagstuhl seminar, February 2001], volume 2500 of LNCS. Springer, 2002. doi:10.1007/3-540-36387-4.
  • [16] Satya Prakash Nayak and Anne-Kathrin Schmuck. Most general winning secure equilibria synthesis in graph games. In TACAS 2024, volume 14572 of LNCS, pages 173–193. Springer, 2024. doi:10.1007/978-3-031-57256-2_9.
  • [17] Michael Ummels. Rational behaviour and strategy construction in infinite multiplayer games. In FSTTCS 2006, volume 4337 of LNCS, pages 212–223. Springer, 2006. doi:10.1007/11944836_21.