
Positional-Player Games

Orna Kupferman, School of Engineering and Computer Science, Hebrew University, Jerusalem, Israel
Noam Shenwald, School of Engineering and Computer Science, Hebrew University, Jerusalem, Israel
Abstract

In reactive synthesis, we transform a specification to a system that satisfies the specification in all environments. For specifications in linear-temporal logic, research on bounded synthesis, where the sizes of the system and the environment are bounded, captures realistic settings and has led to algorithms of improved complexity and implementability. In the game-based approach to synthesis, the system and its environment are modeled by strategies in a two-player game with an ω-regular objective, induced by the specification. There, bounded synthesis corresponds to bounding the memory of the strategies of the players. The memory requirement for various objectives has been extensively studied. In particular, researchers have identified positional objectives, where the winning player can follow a memoryless strategy – one that needs no memory.

In this work we study bounded synthesis in the game setting. Specifically, we define and study positional-player games, in which one or both players are restricted to memoryless strategies, which correspond to non-intrusive control in various applications. We study positional-player games with Rabin, Streett, and Muller objectives, as well as with weighted multiple Büchi and reachability objectives. Our contribution covers their theoretical properties as well as a complete picture of the complexity of deciding the game in the various settings.

Keywords and phrases:
Games, ω-Regular Objectives, Memory, Complexity
Copyright and License:
© Orna Kupferman and Noam Shenwald; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Formal languages and automata theory; Theory of computation → Semantics and reasoning
Editors:
Paweł Gawrychowski, Filip Mazowiecki, and Michał Skrzypczak

1 Introduction

Synthesis is the automated construction of a system from its specification [46]. A reactive system interacts with its environment and has to satisfy its specification in all environments [28]. A useful way to approach synthesis of reactive systems is to consider the situation as a game between the system and its environment. In each round of the game, the environment provides an assignment to the input signals and the system responds with an assignment to the output signals. The system wins if the generated computation satisfies the specification. The system and the environment are modeled by transducers, which direct them how to assign values to the signals given the history of the interaction so far.

Aiming to study realistic settings, researchers have studied bounded synthesis [50]. There, the input to the problem also contains bounds on the sizes of the transducers. Beyond modeling the setting more accurately, bounded synthesis has turned out to have practical advantages. Indeed, bounding the system enables a reduction of synthesis to SAT [19, 22]. When both the system and the environment are bounded, a naive algorithm, which essentially checks all systems and environments, was shown to be optimal [38]. In richer settings, for example ones with concurrency, partial visibility, or probability, restricting the available memory is sometimes the key to decidability or to significantly improved complexity [9, 42, 16].

Algorithms for synthesis reduce the problem to deciding a two-player graph game. The arena of the game is a graph induced from the specification. The vertices are partitioned between the two players, namely the system and the environment. Starting from an initial vertex, the players jointly generate a play, namely a path in the graph, with each player deciding the successor vertex when the play reaches a vertex she owns. The objectives of the players refer to the infinite play that they generate. Each objective α defines a subset of V^ω [41], where V is the set of vertices of the game graph. (As we elaborate in Section 2, all our results apply also for edge-based objectives, namely when α defines a subset of E^ω, for the set E of edges of the graph.) For example, in games with Büchi objectives, α is a subset of V, and a play satisfies α if it visits vertices in α infinitely often.

In the graph-game setting, a strategy for a player directs her how to proceed in vertices she owns, and it is winning if it guarantees the satisfaction of the player’s objective. Winning strategies may need to choose different successors of a vertex in different visits to the vertex. Indeed, choices may depend on the history of the play so far. The number of histories is unbounded, and extensive research has concerned the memory requirements for strategies in games with ω-regular objectives, namely the minimal number of equivalence classes to which the histories can be partitioned [51, 18, 8, 10]. For example, it is well known that a winning strategy for a conjunction of k Büchi objectives requires memory k [18]. In practice, the strategies of the system and the environment are implemented by controllers whose state spaces correspond to the different memories that the strategies require. Clearly, we seek winning strategies whose controllers are of minimal size. Of special interest are memoryless strategies (also called positional strategies), which depend only on the current vertex. For them, all histories are in one equivalence class, leading to trivial controllers.

The need to design efficient controllers has led to extensive research on memoryless strategies. Researchers identified positional objectives, namely ones in which both players can use memoryless strategies (formally, an objective α is positional if in all games with objective α, the winner of the game has a memoryless winning strategy), and half-positional objectives, namely ones in which one of the players can use a memoryless strategy (formally, an objective α is 1-positional (2-positional) if in all games with objective α, if the system (environment, respectively) wins the game, then it has a memoryless winning strategy). For example, it is well known that parity (and hence, also Büchi) objectives are positional [21, 55, 26]. Then, Rabin objectives are 1-positional, and their dual Streett objectives are 2-positional [21, 35]. On the other hand, Muller objectives are not even half-positional. In addition, researchers study positional [25, 14, 45] and 1-positional [36, 44, 7] fragments of objectives that are in general not positional. For example, [11] identifies a class of parity automata that recognize general ω-regular positional objectives.

In this work we take a different view on the topic. Rather than studying the memory required for different types of objectives, we take the approach of bounded synthesis and study games in which the memory of the players is bounded, possibly to a level that prevents them from winning. We focus on controllers of size 1. Note that while transducers of size 1 are not of much interest, controllers of size 1 in the game setting correspond to memoryless strategies, and are thus of great interest. Indeed, in applications like program repair [33, 27], supervisory control [17], and regret minimization [31], researchers have studied non-intrusive control, which is modeled by memoryless strategies. Although controllers of size 1 do not always exist, their obvious implementation advantages make it worthwhile to first search for a controller of size 1, and to extend the search only if none exists. Since memoryless strategies amount to consistent behavior, restricting the memory of the environment models settings in which the system may learn an unknown yet static environment [47, 48].

We define and study positional-player games (PPGs, for short), in which both players are restricted to memoryless strategies, and half-positional-player games (HPPGs, for short), in which only one player is restricted to memoryless strategies. We distinguish between positional-Player 1 games (PP1Gs, for short), in which only the system is restricted to memoryless strategies, and positional-Player 2 games (PP2Gs, for short), in which only the environment is restricted.

We study PPGs and HPPGs with Rabin, Streett, and Muller objectives, as well as with weighted multiple objectives [40]. Such objectives are of the form ⟨α, w, t⟩, where α ⊆ 2^V is a set of objectives that are all Büchi, co-Büchi, reachability, or avoid objectives, w : 2^α → ℕ is a non-decreasing, non-negative weight function that maps each subset S of α to a reward earned when exactly the objectives in S are satisfied, and t ≥ 0 is a threshold. An objective can be viewed as a maximization objective, in which case the goal is to maximize the earned reward (and t serves as a lower bound), or a minimization objective, in which case the goal is to minimize the earned reward (and t serves as an upper bound). Weighted multiple objectives can express generalized objectives, namely when α contains several objectives, all of which have to be satisfied. A weight function allows for a much richer reference to the underlying objectives: prioritizing them, referring to desired and less desired combinations, and addressing settings where we cannot expect all sub-specifications to be satisfied together.

Studying the theoretical properties of PPGs and HPPGs, we start with some easy observations on how the positionality of the objective type causes different variants of PPGs to coincide. We then study determinacy for PPGs and HPPGs; that is, whether there always exists a player that has a winning strategy. Clearly, if a player wins in a game but needs memory in order to win, she no longer wins when restricted to memoryless strategies. Can the other player always win in this case? We prove that the answer is positive for prefix-independent objectives (and only for them). On the other hand, when the other player is restricted too, the answer may be negative; thus PPGs of all objective types that are not 1-positional or not 2-positional need not be determined. Also, interestingly, even for objectives that are 1-positional, Player 1 may need memory in order to win against a positional Player 2. Indeed, in games in which Player 2 wins when her memory is not bounded, Player 1 can win only by learning and remembering the memoryless strategy of Player 2.

We continue and study the complexity of deciding whether Player 1 or Player 2 wins in a given PPG or HPPG. Our results are summarized in Tables 1 and 2. Since memoryless strategies are polynomial in the game graph, the problems for PPGs are clearly in Σ2P, namely solvable by a nondeterministic polynomial-time algorithm with access to a co-NP oracle. Indeed, one can guess a memoryless strategy for the system, and then use the oracle to verify that every memoryless strategy of the environment loses against it. Our main technical contribution here is to identify cases in which this naive algorithm is tight and cases where it can be improved. Moving to HPPGs, deciding the winner involves reasoning about the graph induced by a given strategy, and again, the complexity picture is diverse and depends on the half-positionality of the objective, the determinacy of HPPGs, and the succinctness of the objective type. In particular, handling Muller objectives, we have to cope with their complementation (a naive dualization may be exponential) and to introduce and study the alternating vertex-disjoint paths problem, which adds alternation to the graph.

2 Preliminaries

2.1 Two-player games

A two-player game graph is a tuple G = ⟨V₁, V₂, v₀, E⟩, where V₁ and V₂ are finite disjoint sets of vertices, controlled by Player 1 and Player 2, respectively, and we let V = V₁ ∪ V₂. Then, v₀ ∈ V is an initial vertex, and E ⊆ V × V is a total edge relation, thus for every v ∈ V, there is u ∈ V such that (v, u) ∈ E. The size of G, denoted |G|, is |E|, namely the number of edges in it.

In the beginning of a play in the game, a token is placed on v₀. Then, in each turn, the player that owns the vertex that hosts the token chooses a successor vertex and moves the token to it. Together, the players generate a play ρ = v₀, v₁, … in G, namely an infinite path that starts in v₀ and respects E: for all i ≥ 0, we have that (vᵢ, vᵢ₊₁) ∈ E.

For i ∈ {1, 2}, a strategy for Player i is a function fᵢ : V*·Vᵢ → V that maps prefixes of plays that end in a vertex that belongs to Player i to possible extensions in a way that respects E. That is, for every ρ ∈ V* and v ∈ Vᵢ, we have that (v, fᵢ(ρ·v)) ∈ E. Intuitively, a strategy for Player i directs her how to move the token, and the direction may depend on the history of the game so far. The strategy fᵢ is finite-memory if it is possible to replace the unbounded histories in V* by a finite number of memories. The strategy fᵢ is memoryless if it depends only on the current vertex, thus for all ρ, ρ′ ∈ V* and v ∈ Vᵢ, we have that fᵢ(ρ·v) = fᵢ(ρ′·v). (Memoryless strategies are sometimes termed positional strategies. We use “positional” in order to describe objectives and games in which the players use memoryless strategies, and prefer to keep the adjective used for describing strategies different from the one used for describing objectives and games.) Accordingly, a memoryless strategy is given by a function fᵢ : Vᵢ → V.

A profile is a pair π = ⟨f₁, f₂⟩ of strategies, one for each player. The outcome of a profile π = ⟨f₁, f₂⟩ is the play obtained when the players follow their strategies in π. Formally, outcome(π) = v₀, v₁, … ∈ V^ω is such that for all j ≥ 0, we have that vⱼ₊₁ = fᵢ(v₀, v₁, …, vⱼ), where i ∈ {1, 2} is such that vⱼ ∈ Vᵢ.
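In a finite game graph, the outcome of a profile of two memoryless strategies is ultimately periodic: a finite prefix followed by a cycle repeated forever. It can therefore be computed explicitly. The following Python sketch (our own helper, with hypothetical names; strategies are dicts from vertices to chosen successors) computes this lasso decomposition.

```python
def outcome(f1, f2, owner, v0):
    """Outcome of a profile of memoryless strategies.

    f1 and f2 map vertices to chosen successors, and owner[v] in {1, 2}
    tells which player moves at v.  Since both strategies are memoryless,
    the play closes a cycle at the first repeated vertex.
    Returns (prefix, cycle), so the play is prefix followed by cycle forever.
    """
    seen, play, v = {}, [], v0
    while v not in seen:
        seen[v] = len(play)          # position of the first visit to v
        play.append(v)
        v = f1[v] if owner[v] == 1 else f2[v]
    i = seen[v]                      # first repeated vertex closes the cycle
    return play[:i], play[i:]
```

For example, with owner = {'a': 1, 'b': 2}, f1 = {'a': 'b'}, and f2 = {'b': 'a'}, the outcome from 'a' is the play that alternates between 'a' and 'b' forever, i.e., an empty prefix and the cycle ['a', 'b'].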

A two-player game is a pair 𝒢 = ⟨G, ψ⟩, where G = ⟨V₁, V₂, v₀, E⟩ is a two-player game graph, and ψ is a winning condition for Player 1, specifying a subset of V^ω, namely the set of plays in which Player 1 wins. The game is zero-sum, thus Player 2 wins when the play does not satisfy ψ. A strategy f₁ is a winning strategy for Player 1 if for every strategy f₂ for Player 2, Player 1 wins in the profile ⟨f₁, f₂⟩, thus outcome(⟨f₁, f₂⟩) satisfies ψ. Dually, a strategy f₂ for Player 2 is a winning strategy for Player 2 if for every strategy f₁ for Player 1, we have that Player 2 wins in ⟨f₁, f₂⟩. We say that Player i wins in 𝒢 if she has a winning strategy. A game is determined if Player 1 or Player 2 wins in it.

2.2 Boolean objectives

For a play ρ = v₀, v₁, …, we denote by reach(ρ) the set of vertices that are visited at least once along ρ, and we denote by inf(ρ) the set of vertices that are visited infinitely often along ρ. That is, reach(ρ) = {v ∈ V : there exists i ≥ 0 such that vᵢ = v}, and inf(ρ) = {v ∈ V : there are infinitely many i ≥ 0 such that vᵢ = v}. For a set of vertices α ⊆ V, a play ρ satisfies the reachability objective α iff reach(ρ) ∩ α ≠ ∅, and satisfies the Büchi objective α iff inf(ρ) ∩ α ≠ ∅. The objectives dual to reachability and Büchi are avoid (also known as safety) and co-Büchi, respectively. Formally, a play ρ satisfies an avoid objective α iff reach(ρ) ∩ α = ∅, and satisfies a co-Büchi objective α iff inf(ρ) ∩ α = ∅.
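On an ultimately periodic (lasso) play, reach(ρ) is the set of all visited vertices and inf(ρ) is exactly the set of vertices on the cycle, so all four objectives can be checked directly. A minimal sketch (our own helper, not from the paper):

```python
def satisfies(prefix, cycle, kind, alpha):
    """Check whether the lasso play prefix followed by cycle repeated
    forever satisfies the objective alpha (a set of vertices) of the
    given kind.  On such plays, reach(ρ) is every visited vertex and
    inf(ρ) is exactly the set of cycle vertices."""
    reach = set(prefix) | set(cycle)
    inf = set(cycle)
    if kind == 'reachability':
        return bool(reach & alpha)       # reach(ρ) ∩ α ≠ ∅
    if kind == 'avoid':
        return not (reach & alpha)       # reach(ρ) ∩ α = ∅
    if kind == 'buchi':
        return bool(inf & alpha)         # inf(ρ) ∩ α ≠ ∅
    if kind == 'cobuchi':
        return not (inf & alpha)         # inf(ρ) ∩ α = ∅
    raise ValueError(kind)
```

Note how the duality is reflected in the code: avoid is the negation of reachability, and co-Büchi is the negation of Büchi, over the same sets.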

A Rabin objective is a set α = {⟨Lᵢ, Rᵢ⟩ : i ∈ [k]} ⊆ 2^V × 2^V of pairs of sets of vertices. A play ρ satisfies α iff there exists i ∈ [k] such that ρ visits Lᵢ infinitely often and visits Rᵢ only finitely often. That is, inf(ρ) ∩ Lᵢ ≠ ∅ and inf(ρ) ∩ Rᵢ = ∅, for some i ∈ [k]. The objective dual to Rabin is Streett. Formally, a play ρ satisfies a Streett objective α = {⟨Lᵢ, Rᵢ⟩ : i ∈ [k]} ⊆ 2^V × 2^V iff inf(ρ) ∩ Lᵢ = ∅ or inf(ρ) ∩ Rᵢ ≠ ∅, for every i ∈ [k].
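For a lasso play, both conditions reduce to set operations on the cycle vertices. A sketch (our own helpers; pairs are (L, R) tuples of vertex sets):

```python
def satisfies_rabin(cycle, pairs):
    """ρ satisfies the Rabin objective {⟨L_i, R_i⟩} iff for some pair,
    inf(ρ) meets L_i and avoids R_i.  For a lasso play, inf(ρ) is the
    set of cycle vertices."""
    inf = set(cycle)
    return any(inf & L and not (inf & R) for L, R in pairs)

def satisfies_streett(cycle, pairs):
    """Streett is the dual condition: for every pair, inf(ρ) avoids L_i
    or meets R_i.  On any fixed play, it holds iff the Rabin condition
    with the same pairs fails."""
    inf = set(cycle)
    return all(not (inf & L) or bool(inf & R) for L, R in pairs)
```

The two functions return complementary answers on every cycle, mirroring the duality of the two objective types.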

Finally, a Muller objective over a set of colors [k] is a pair α = ⟨ℱ, χ⟩, where ℱ ⊆ 2^[k] specifies desired subsets of colors and χ : V → [k] colors the vertices in V. A play ρ satisfies α iff the set of colors visited infinitely often along ρ is in ℱ. That is, {i ∈ [k] : inf(ρ) ∩ χ⁻¹(i) ≠ ∅} ∈ ℱ. The objective dual to α is the Muller objective ⟨2^[k] ∖ ℱ, χ⟩.

We define the size of an objective as the size of the sets in it. For Muller objectives, we also add the number k of colors. That is, the size of ⟨ℱ, χ⟩ is k + Σ_{F∈ℱ} |F|.

 Remark 1.

Objectives in two-player games can also be defined with respect to the edges (rather than vertices) traversed during plays. For some problems, the change is significant. For example, minimization of some types of automata is NP-complete in the state-based setting and can be solved in polynomial time in the edge-based setting [49, 1]. For our study here, all the results also apply for edge-based objectives. For example, for Muller objectives, by adding a new color, we can allow uncolored vertices, which enable a translation of games with colored-edges objectives to games with colored-vertices objectives in a way that preserves winning and memoryless winning strategies. Similar translations can be applied for other types of objectives. ∎

For two types of objectives γ and γ′, we use γ ≤ γ′ to indicate that every set of plays that satisfies an objective of type γ can be specified also as an objective of type γ′. For example, Büchi ≤ Rabin, as every Büchi objective α is equivalent to the Rabin objective {⟨α, ∅⟩}.

An objective type γ is prefix-independent if the satisfaction of any γ objective ψ in a play depends only on the infinite suffixes of the play. That is, for every play ρ, we have that ρ satisfies ψ iff every infinite suffix ρ′ of ρ satisfies ψ. An objective that is not prefix-independent is prefix-dependent. Note that objectives defined with respect to inf(ρ) only are prefix-independent, whereas objectives defined with respect to reach(ρ) are prefix-dependent.

2.3 Weighted multiple objectives

A weighted objective is a pair ⟨α, w⟩, where α = {α₁, …, αₖ} is a set of k objectives, all of the same type, and w : 2^α → ℕ is a weight function that maps subsets of objectives in α to natural numbers. We assume that w is non-decreasing: for all sets S, S′ ⊆ α, if S ⊆ S′, then w(S) ≤ w(S′). In the context of game theory, non-decreasing functions are very useful, as they correspond to settings with free disposal, namely when satisfaction of additional objectives does not decrease the utility [43]. We also assume that w(∅) = 0. A non-decreasing weight function is additive if for every set S ⊆ α, the weight of S equals the sum of the weights of the singleton subsets that constitute S. That is, w(S) = Σ_{αᵢ∈S} w({αᵢ}). An additive weight function is thus given by w : α → ℕ, and is extended to sets of objectives in the expected way, thus w(S) = Σ_{αᵢ∈S} w(αᵢ), for every S ⊆ α.

For a play ρ, let sat(ρ, α) ⊆ α be the set of objectives in α that are satisfied in ρ. The satisfaction value of ⟨α, w⟩ in ρ, denoted val(ρ, ⟨α, w⟩), is then the weight of the set of objectives in α that are satisfied in ρ. That is, val(ρ, ⟨α, w⟩) = w(sat(ρ, α)).
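On an ultimately periodic play, the satisfaction value can be computed directly. A small sketch for the Büchi case (our own encoding, not from the paper: objectives are named, their vertex sets are given in a dict, and w is given on frozensets of names):

```python
def weighted_buchi_value(cycle, alpha, w):
    """Satisfaction value of a weighted multiple Büchi objective ⟨α, w⟩
    on a lasso play.  alpha maps objective names to vertex sets; a Büchi
    objective is satisfied iff the cycle meets its set.  w maps frozensets
    of names to naturals (assumed non-decreasing, with w(∅) = 0)."""
    inf = set(cycle)
    sat = frozenset(name for name, vs in alpha.items() if inf & vs)
    return w[sat]
```

For instance, with alpha = {'b1': {'u'}, 'b2': {'d'}}, a cycle visiting both 'u' and 'd' earns the weight of {'b1', 'b2'}, while a cycle visiting only 'u' earns the weight of {'b1'}.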

Weighted objectives can be viewed as either maximization or minimization objectives. That is, the goal is to maximize or minimize the weight of the set of objectives satisfied. (Note that adding a threshold to the objective makes satisfaction binary, and the corresponding optimization problem can be solved using a binary search.) Note that when w is uniform, thus when w(S) = |S| for all S ⊆ α, the goal is to maximize or minimize the number of satisfied objectives. A special case of the latter, known in the literature as generalized conditions, is when we aim to satisfy all or at least one objective. We denote different classes of weighted objectives by acronyms in {MaxW, MinW, All, Exists} × {R, A, B, C}, describing the way we refer to the satisfaction value and the objectives type: reachability (R), avoid (A), Büchi (B), or co-Büchi (C). Formally, for a play ρ ∈ V^ω, an objective type γ ∈ {R, A, B, C}, a set α = {α₁, …, αₖ} of objectives, a weight function w : 2^α → ℕ, and a threshold t ∈ ℕ, we have the following winning conditions.

  • ρ satisfies a MaxW-γ objective ⟨α, w, t⟩ if val(ρ, ⟨α, w⟩) ≥ t.

  • ρ satisfies a MinW-γ objective ⟨α, w, t⟩ if val(ρ, ⟨α, w⟩) ≤ t.

  • ρ satisfies an All-γ objective α if |sat(ρ, α)| = |α|.

  • ρ satisfies an Exists-γ objective α if |sat(ρ, α)| ≥ 1.

For All-γ and Exists-γ objectives, we omit the weight function from the specification of the objective. Note that for all objective types γ, we have that All-γ ≤ MaxW-γ and Exists-γ ≤ MaxW-γ. Also, by [40], MaxWB ≤ Muller. Indeed, it is easy to specify in a Muller objective all sets S such that w(S) ≥ t. Since we assume non-decreasing weight functions, the other direction does not hold, thus Muller ≰ MaxWB. In Section 8, we study dualities in weighted objectives and extend the above observations to minimization games and to underlying co-Büchi and avoid objectives. Finally, note that MaxW-γ and MinW-γ objectives are prefix-independent iff γ is prefix-independent.

3 Positional Objectives and PPGs

For i ∈ {1, 2}, an objective type γ is i-positional if for every γ-game 𝒢, Player i wins in 𝒢 iff she has a memoryless winning strategy in 𝒢. Then, γ is positional iff it is both 1-positional and 2-positional, thus the winner of every γ-game has a memoryless winning strategy. If γ is neither 1-positional nor 2-positional, we say that it is non-positional. (In the literature, 1-positional and 2-positional objectives are sometimes termed positional, and positional objectives are sometimes termed bipositional [11].) It is known that the objective types reachability, Büchi, and parity (and thus also avoid and co-Büchi) are positional [21], Rabin is 1-positional (and thus Streett is 2-positional) [35], and Muller is non-positional [18]. As for weighted multiple objectives, MaxWB and MinWB are 2-positional and 1-positional, respectively, whereas MaxWR and MinWR are non-positional [40].

We lift the notion of positionality to games, studying settings in which one or both players are restricted to memoryless strategies. For i ∈ {1, 2}, a game is a positional-Player i game if Player i is restricted to memoryless strategies, and is a positional-player game (PPG, for short) if both players are restricted to memoryless strategies. Positional-Player 1 and positional-Player 2 games are also called half-positional-player games (HPPG, for short). (An automaton is determinizable by pruning if it embodies an equivalent deterministic automaton, thus if nondeterminism can be resolved in a memoryless manner [3]. This may hint at a relation to PP1Gs. However, the labels on the transitions of automata make the setting different, and reasoning about determinization by pruning is different from reasoning about PP1Gs [37, 2].)

For a game 𝒢, we use Pos-𝒢, 1Pos-𝒢, and 2Pos-𝒢 to denote the positional-player and half-positional-player variants of 𝒢, respectively. Formally, for a game 𝒢 = ⟨G, ψ⟩, we have the following.

  • Player 1 wins Pos-𝒢 iff she has a memoryless strategy f₁ such that for every memoryless strategy f₂ for Player 2, we have that outcome(⟨f₁, f₂⟩) satisfies ψ.

  • Player 1 wins 1Pos-𝒢 iff she has a memoryless strategy f₁ such that for every strategy f₂ for Player 2, we have that outcome(⟨f₁, f₂⟩) satisfies ψ.

  • Player 1 wins 2Pos-𝒢 iff she has a strategy f₁ such that for every memoryless strategy f₂ for Player 2, we have that outcome(⟨f₁, f₂⟩) satisfies ψ.

Example 2.

Consider the AllB game 𝒢 = ⟨G, α⟩, for the game graph G that appears in Fig. 1, and α = {{u1, u2}, {d1, d2}}. Since the objective of Player 1 is to satisfy both Büchi objectives in α, she wins iff for infinitely many traversals of G, the choices of the players from the vertices v1 and v2 as to whether they go up or down do not match.

Figure 1: The game graph G. In drawings of game graphs, vertices owned by Player 1 are circles, and vertices owned by Player 2 are squares.

It is easy to see that Player 1 wins in 𝒢. Indeed, a winning strategy f1 for Player 1 can move the token up to u1 after the token visits d2, and move the token down to d1 after the token visits u2. Since 2Pos-𝒢 only restricts the strategies that Player 2 may use, the strategy f1 is a winning strategy for Player 1 also in 2Pos-𝒢.

On the other hand, Player 1 does not win in 1Pos-𝒢. Indeed, for every memoryless strategy f1 for Player 1, Player 2 has a strategy f2, in fact a memoryless one, that matches the choice Player 1 makes in v1. That is, if f1(v1)=u1, then f2(v2)=u2, and if f1(v1)=d1, then f2(v2)=d2. Since f2 is memoryless, Player 1 does not win in Pos-𝒢 either.

As for Player 2, she clearly does not win in 𝒢 and 2Pos-𝒢. In 1Pos-𝒢, a winning strategy f2 for Player 2 can move the token up to u2 after visits in u1, and move the token down to d2 after visits in d1. The strategy f2 requires memory, and Player 2 does not have a winning strategy in Pos-𝒢. Indeed, for every memoryless strategy for Player 2, Player 1 has a memoryless strategy in which the choice in v1 does not match the choice Player 2 makes in v2, causing both Büchi objectives to be satisfied. Thus, interestingly, even though AllB objectives are 2-positional, Player 2 needs memory in order to win in 1Pos-𝒢. ∎
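The claims of this example can also be checked mechanically, since each player has only finitely many memoryless strategies. The sketch below is our own; it assumes a game graph consistent with the description (the figure itself is not reproduced here): v1 and v2 are the choice vertices of Players 1 and 2, their successors u1/d1 and u2/d2 lead back to the other choice vertex. It enumerates all memoryless profiles and confirms that neither player wins Pos-𝒢, so the PPG is undetermined.

```python
def lasso_inf(f, v0='v1'):
    """inf(ρ) of the play from v0 under the joint memoryless strategy f."""
    seen, v = [], v0
    while v not in seen:
        seen.append(v)
        v = f[v]
    return set(seen[seen.index(v):])   # cycle = vertices visited infinitely often

def all_buchi(inf):
    """AllB objective α = {{u1, u2}, {d1, d2}}: both sets met infinitely often."""
    return bool(inf & {'u1', 'u2'}) and bool(inf & {'d1', 'd2'})

# All memoryless strategies: the only real choices are at v1 and v2.
P1 = [{'v1': c, 'u1': 'v2', 'd1': 'v2'} for c in ('u1', 'd1')]   # Player 1
P2 = [{'v2': c, 'u2': 'v1', 'd2': 'v1'} for c in ('u2', 'd2')]   # Player 2

p1_wins_pos = any(all(all_buchi(lasso_inf({**f1, **f2})) for f2 in P2)
                  for f1 in P1)
p2_wins_pos = any(all(not all_buchi(lasso_inf({**f1, **f2})) for f1 in P1)
                  for f2 in P2)
# Neither holds: each memoryless strategy of one player is beaten by a
# matching (or mismatching) memoryless strategy of the other.
```

This is exactly the naive ∃∀ check over memoryless strategies discussed in the introduction, feasible here because the strategy spaces have size two.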

We conclude with some easy observations about PPGs and HPPGs and positional and half-positional objectives. The first follows immediately from the fact that the restriction to memoryless strategies only reduces the set of possible strategies. For a player i ∈ {1, 2}, we use ĩ to denote the other player; thus, if i = 1, then ĩ = 2, and if i = 2, then ĩ = 1.

Theorem 3.

For every game 𝒢 and i ∈ {1, 2}, all the following hold.

  • If Player i wins iPos-𝒢, then she also wins 𝒢 and Pos-𝒢.

  • If Player i wins Pos-𝒢, then she also wins ĩPos-𝒢.

We continue with observations that use the positionality of the objective type in order to relate different variants of PPGs. The proof of Theorem 4 follows immediately from the definitions and the proof of Theorem 5 can be found in the full version.

Theorem 4.

Consider i ∈ {1, 2} and an i-positional objective type γ. For every γ-game 𝒢, Player i wins 𝒢 iff Player i wins iPos-𝒢. In particular, if γ is positional, then Player i wins 𝒢 iff Player i wins Pos-𝒢, 1Pos-𝒢, and 2Pos-𝒢.

Theorem 5.

Consider i ∈ {1, 2} and an ĩ-positional objective type γ. For every γ-game 𝒢, Player i wins iPos-𝒢 iff Player i wins Pos-𝒢.

4 Determinacy of PPGs and HPPGs

Games with ω-regular objectives enjoy determinacy: in every game, one of the players wins [41]. In this section we study the determinacy of PPGs and HPPGs. As expected, restricting the strategies of both players makes some games undetermined. Surprisingly, we are able to prove that for every prefix-independent objective ψ that is not positional, there is an undetermined PPG with the objective ψ. On the other hand, all HPPGs with prefix-independent objectives are determined. Thus, if a player needs memory in order to win a game with a prefix-independent objective, then restricting her to use only memoryless strategies without restricting her opponent not only prevents her from winning, but also makes her opponent win.

Theorem 6.

For every objective ψ that is prefix-independent, not positional, and requires finite memory, there is an undetermined PPG with objective ψ. In particular, AllB and ExistsC PPGs need not be determined.

Proof.

Since ψ is not positional, it is not 1-positional or not 2-positional. We consider the case where ψ is not 1-positional. The case where ψ is not 2-positional follows, as Player 2 has an objective that is not 1-positional. In the full version, we prove that for every objective ψ as above, there is a set V of vertices and a finite graph G over V consisting of a vertex v ∈ V and two simple vertex-disjoint paths p, q ∈ V*, such that v·p and v·q are cycles, and either (v·p·v·q)^ω satisfies ψ while (v·p)^ω and (v·q)^ω do not satisfy ψ, or (v·p·v·q)^ω does not satisfy ψ while (v·p)^ω and (v·q)^ω satisfy ψ. We then take two copies of G in order to construct a PPG with the same objective in which no player wins.

We continue to HPPGs. Note that for 𝒢 in Example 2, allowing a single player to use memory makes this player win. Indeed, as the other player uses a memoryless strategy, she commits to fixed choices in the vertices visited along the play. As we formalize in Theorem 7 below, the player that uses memory can then learn these commitments, and follow a strategy that is tailored for them. Since learning is performed during a traversal of a prefix of the play, this works only for objectives that are prefix-independent.

Theorem 7.

Consider a prefix-independent objective type γ. For every γ-game 𝒢 and i ∈ {1, 2}, we have that iPos-𝒢 is determined.

Proof.

Consider a γ-game 𝒢 = ⟨G, ψ⟩, with G = ⟨V₁, V₂, v₀, E⟩. We show that Player 1 wins 1Pos-𝒢 iff Player 2 does not win 1Pos-𝒢. The proof for 2Pos-𝒢 is similar.

Clearly, if Player 1 wins 1Pos-𝒢, then Player 2 does not win 1Pos-𝒢. For the second direction, assume that Player 1 does not win 1Pos-𝒢. We show that then, Player 2 has a strategy that wins against all memoryless strategies of Player 1. Let V′ ⊆ V be the set of vertices from which Player 1 does not have a memoryless winning strategy in G, and let G′ be the sub-graph of G that contains only vertices in V′. That is, G′ = ⟨V₁ ∩ V′, V₂ ∩ V′, v₀, E ∩ (V′ × V′)⟩. Note that indeed v₀ ∈ V′, since Player 1 does not win 1Pos-𝒢.

Let f₁, …, fₖ be all the memoryless strategies for Player 1 in G′. Since none of them is winning, there exist strategies g₁, …, gₖ for Player 2 in G′ such that for every i ∈ [k], the outcome of ⟨fᵢ, gᵢ⟩ does not satisfy ψ from every vertex in V′. Note that for every i ∈ [k], such a strategy gᵢ exists only because fᵢ is restricted to vertices in V′. We construct from g₁, …, gₖ a winning strategy g for Player 2 in 1Pos-𝒢.

Intuitively, starting with i = 1, the strategy g follows gᵢ, and whenever Player 1 takes a transition from a vertex v ∈ V₁ ∩ V′ to a vertex u, it checks whether the transition is consistent with fᵢ; that is, whether fᵢ(v) = u. If this is not the case, the strategy g updates i to the minimal j > i such that fⱼ(v) = u. In the full version, we define g formally and prove that an index j as above always exists, and that the process eventually stabilizes, simulating a game in which Player 1 uses some memoryless strategy fᵢ and Player 2 uses a strategy that wins against fᵢ, making g a winning strategy for Player 2.

As for prefix-dependent objective types, here the vertices traversed while the unrestricted player learns the memoryless strategy of the restricted player may play a role in the satisfaction of the objective, and the picture is different (see detailed proof in the full version):

Theorem 8.

For every objective type γ such that AllR ≤ γ or ExistsA ≤ γ, both γ-PPGs and γ-HPPGs need not be determined.

5 The Complexity of PPGs and HPPGs

Given a game 𝒢, we would like to decide whether Player 1 wins in 1Pos-𝒢, 2Pos-𝒢, and Pos-𝒢. The complexity of the problem depends on the type of objective in 𝒢. In this section we present general complexity results for the problem. Then, in Sections 6, 7, and 8, we provide an analysis for the different objective types. Note that since not all PPGs and HPPGs are determined, the results for Player 2 do not follow immediately from the results for Player 1. They do, however, follow from results about the dual objective.

5.1 General upper bound results

We start with upper bounds. We say that an objective type γ is path-efficient if given a lasso-shaped path ρ₁·(ρ₂)^ω, for ρ₁ ∈ V* and ρ₂ ∈ V⁺, checking whether ρ₁·(ρ₂)^ω satisfies a γ objective ψ can be done in time polynomial in |ρ₁|, |ρ₂|, and |ψ|. We say that an objective type γ is all-path-efficient if given a graph G, checking whether all the infinite paths in G satisfy a γ objective ψ can be done in time polynomial in |G| and |ψ|.

The complexity class Σ2P contains all the problems that can be solved in polynomial time by a nondeterministic Turing machine augmented by a co-NP oracle. Since memoryless strategies are of polynomial size, the complexity class Σ2P is a natural class in the context of PPGs. Formally, we have the following (see proof in the full version).

Theorem 9.

Consider a path-efficient objective type γ. For every γ-game 𝒢, deciding whether Player 1 wins in Pos-𝒢 can be done in Σ2P.

For HPPGs, once a memoryless strategy for Player 1 is guessed, one has to check all the paths in the graph that are consistent with it. Formally, consider a γ-game 𝒢 = ⟨G, ψ⟩ and a memoryless strategy f₁ for Player 1. Let Gf₁ be the graph obtained from G by removing the edges that leave vertices in V₁ and do not agree with f₁. Clearly, f₁ is a winning strategy iff all the paths in Gf₁ satisfy ψ. Hence, we have the following (see proof in the full version).
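For Büchi objectives, for example, the all-paths check is polynomial: a path that visits α only finitely often has an infinite suffix inside V ∖ α, and in a finite graph such a suffix must close a cycle there. So all infinite paths from v₀ visit α infinitely often iff the reachable subgraph restricted to V ∖ α is acyclic. A sketch of this check (our own code, with hypothetical names; succ maps each vertex to its successor list):

```python
def all_paths_satisfy_buchi(succ, alpha, v0):
    """True iff every infinite path from v0 visits alpha infinitely often.

    Reduces to: the subgraph reachable from v0, restricted to vertices
    outside alpha, contains no cycle (checked by DFS cycle detection).
    """
    reach, stack = {v0}, [v0]          # vertices reachable from v0
    while stack:
        for u in succ[stack.pop()]:
            if u not in reach:
                reach.add(u)
                stack.append(u)
    # restriction of the reachable subgraph to V \ alpha
    sub = {v: [u for u in succ[v] if u in reach and u not in alpha]
           for v in reach if v not in alpha}
    color = {}                         # 0 = on the DFS stack, 1 = finished
    def has_cycle(v):
        color[v] = 0
        for u in sub[v]:
            if color.get(u) == 0 or (u not in color and has_cycle(u)):
                return True            # back edge: a cycle avoiding alpha
        color[v] = 1
        return False
    return not any(v not in color and has_cycle(v) for v in sub)
```

Applied to the pruned graph Gf₁, this realizes the polynomial witness check behind the NP upper bound for Büchi 1Pos-games; other all-path-efficient types admit analogous checks.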

Theorem 10.

Consider an all-path-efficient objective type γ. For every γ-game 𝒢, deciding whether Player 1 wins 1Pos-𝒢 can be done in NP.

Note that beyond the fact that not all objective types are known to be all-path-efficient, the upper bounds that follow from Theorems 9 and 10 need not be tight.

5.2 General lower bound constructions

For our lower bounds, we use reductions from two well-known problems: QBF (Quantified Boolean Formulas) and its special case 2QBF, which are standard problems for proving hardness in PSPACE and Σ2P, respectively. Here, we define these problems and set some infrastructure for the reductions. Consider a set X = {x1,…,xn} of variables. Let X¯ = {x1¯,…,xn¯}. A QBF formula is of the form Φ = Q1x1Q2x2⋯Qnxn φ, where Q1,…,Qn ∈ {∃,∀} are existential and universal quantifiers, and φ is a propositional formula over X ∪ X¯. The QBF problem is to decide whether Φ is valid.

A QBF formula Φ is in 2QBF if there is only one alternation between existential and universal quantification in Φ, and the external quantification is existential. In 2QBF, we can assume that φ is given in 3DNF. For a nicer presentation, we use X = {x1,…,xn} and Y = {y1,…,ym} for the sets of existentially and universally quantified variables, respectively. That is, a 2QBF formula is of the form Φ = ∃x1⋯∃xn∀y1⋯∀ym φ, where φ = C1 ∨ ⋯ ∨ Ck, for some k ≥ 1, and for every 1 ≤ i ≤ k, we have Ci = (li1 ∧ li2 ∧ li3), with li1, li2, li3 ∈ X ∪ X¯ ∪ Y ∪ Y¯.
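To fix the notation, here is a brute-force (exponential, for illustration only; the encoding of literals as (variable, polarity) pairs is our own) decision procedure for 2QBF with a 3DNF matrix:

```python
from itertools import product

def eval_dnf(clauses, a):
    """clauses: list of lists of literals (var, polarity);
    a: an assignment, mapping variables to booleans."""
    return any(all(a[v] == p for v, p in c) for c in clauses)

def valid_2qbf(X, Y, clauses):
    """Decide  exists X forall Y . phi,  for phi in DNF, by enumerating
    all assignments to X and, for each, all assignments to Y."""
    for xs in product([False, True], repeat=len(X)):
        ax = dict(zip(X, xs))
        if all(eval_dnf(clauses, {**ax, **dict(zip(Y, ys))})
               for ys in product([False, True], repeat=len(Y))):
            return True
    return False
```

For example, ∃x∀y ((x ∧ y) ∨ (x ∧ y¯)) is valid (take x = true), while ∃x∀y (x ∧ y) is not.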

Below we describe two game graphs that are used in our reductions.

Consider a set of variables X = {x1,…,xn}, and a QBF formula Φ = Q1x1Q2x2⋯Qnxn φ, where Q1,…,Qn ∈ {∃,∀}. The two-player game graph GΦ lets Player 1 choose assignments to the existentially-quantified variables and Player 2 choose assignments to the universally-quantified variables. The idea is that when the players use memoryless strategies, they repeat the same choices. For the special case of 2QBF (see Figure 2), the order of the quantifiers on the variables corresponds to the way the strategies of the players are quantified. For QBF, the quantification on the variables is arbitrary, and GΦ is used in the context of HPPGs. In the full version, we define GΦ formally.

Figure 2: The game graph GΦ for a 2QBF formula Φ = ∃x1⋯∃xn∀y1⋯∀ym φ.

In GΦ, the vertices that correspond to the assignment to the last variable go back to the initial vertex. We use Reach(GΦ) to denote the game graph obtained from GΦ by replacing these edges by self-loops. Note that GΦ and Reach(GΦ) are independent of φ and depend only on the partition of the variables in Φ into existentially and universally quantified ones.

Next, we define a game graph FΦ, induced by a QBF formula Φ = Q1x1Q2x2⋯Qnxn φ, for φ in 3DNF. The game proceeds in two phases that are repeated infinitely often and are described as follows (see the full version for the full details, and Figure 3 below for an example). In the assignment phase, the players choose an assignment to the variables in X. This is done as in GΦ. Then, the game continues to a checking phase, in which Player 2 tries to refute the chosen assignment by showing that every clause has a literal that is evaluated to 𝐟𝐚𝐥𝐬𝐞. For this, the game sequentially traverses, for every clause Ci, a vertex from which Player 2 can visit one of the refute-literal vertices associated with li1¯, li2¯, and li3¯, and then continue to the next clause. Thus, for every clause, Player 2 chooses a vertex that corresponds to the negation of one of its literals.

Intuitively, when the players use memoryless strategies, the assignment phase induces an assignment to the variables. An assignment satisfies φ iff there exists a clause all whose literals are evaluated to 𝐭𝐫𝐮𝐞, which holds iff Player 2 is forced to choose a refute-literal vertex that corresponds to a literal evaluated to 𝐟𝐚𝐥𝐬𝐞. Then, Φ is valid iff there exists a memoryless strategy for Player 1 such that for every memoryless strategy for Player 2, there exists a literal lij such that both lij and a refute-literal vertex that corresponds to lij¯ are visited infinitely often.

Figure 3: The game graph FΦ, for a QBF formula Φ over x1, x2, x3 with φ = (x1 ∧ x2 ∧ x3) ∨ (x1 ∧ x2¯ ∧ x3¯) ∨ (x1¯ ∧ x2 ∧ x3¯).

We use Reach(FΦ) to denote the game graph obtained from FΦ by replacing the edges from the rightmost refute-literal vertices to the initial vertex by self-loops.

6 Rabin and Streett PPGs and HPPGs

In this section we study Rabin and Streett (and their respective special cases ExistsC and AllB) PPGs and HPPGs. Our results are summarized in Table 1 below.

Table 1: Complexity results for ω-regular PPGs and HPPGs. For positional objectives, the problems coincide with deciding usual games. Accordingly, they can be solved in PTIME for reachability, avoid, Büchi, and co-Büchi objectives [5, 32, 55, 53], and in UP ∩ co-UP for parity objectives [34].

| Type | Positionality | P1 wins 𝒢 | P1 wins Pos-𝒢 | P1 wins 1Pos-𝒢 | P1 wins 2Pos-𝒢 |
|---|---|---|---|---|---|
| ExistsC | 1-positional | PTIME [12] | Σ2P-complete (Theorem 12) | PTIME (Theorem 11) | co-NP-complete (Theorem 11) |
| Rabin | 1-positional | NP-complete [20] | Σ2P-complete (Theorem 12) | NP-complete (Theorem 11) | co-NP-complete (Theorem 11) |
| AllB | 2-positional | PTIME [12] | NP-complete (Theorem 13) | NP-complete (Theorem 11) | PTIME (Theorem 11) |
| Streett | 2-positional | co-NP-complete [20] | NP-complete (Theorem 13) | NP-complete (Theorem 11) | co-NP-complete (Theorem 11) |
| Muller | non-positional | PSPACE-complete [30] | Σ2P-complete (Theorem 15) | NP-complete (Theorem 17) | co-NP-complete (Theorem 17) |

We start with HPPGs (see proof in the full version).

Theorem 11.

Deciding whether Player 1 wins:

  • an ExistsC PP1G or an AllB PP2G can be done in polynomial time.

  • a γ-PP1G is NP-complete, for γ ∈ {Rabin, AllB, Streett}.

  • a γ-PP2G is co-NP-complete, for γ ∈ {Streett, ExistsC, Rabin}.

Theorem 12.

Deciding whether Player 1 wins a Rabin PPG is Σ2P-complete. Hardness in Σ2P applies already for ExistsC PPGs.

Proof.

The upper bound follows from Theorem 9. For the lower bound, we describe a reduction from 2QBF. That is, given a 2QBF formula Φ, we construct an ExistsC game 𝒢Φ such that Φ=𝐭𝐫𝐮𝐞 iff Player 1 wins Pos-𝒢Φ.

Consider a 2QBF formula Φ = ∃X∀Y φ such that φ = C1 ∨ ⋯ ∨ Ck and Ci = (li1 ∧ li2 ∧ li3). Recall the game graph GΦ defined in Section 5.2, and recall that memoryless strategies for the players induce assignments to the variables in X and Y in a way that corresponds to their quantification in Φ. Thus, we only need to define an objective that captures the satisfaction of some clause in φ. For this, we define the ExistsC objective α = {{li1¯, li2¯, li3¯} : i ∈ [k]}. Thus, each clause Ci of φ contributes to α a set with the negations of the literals in Ci. Since for each variable z ∈ X ∪ Y, a play visits exactly one of the literal vertices z and z¯ infinitely often, the play satisfies a co-Büchi objective in α iff the chosen assignment satisfies φ, and so Φ = 𝐭𝐫𝐮𝐞 iff Player 1 wins in Pos-⟨GΦ, α⟩ (see proof in the full version).
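The interplay between the chosen assignment and the objective α can be sketched as follows (our own encoding, with literals as (variable, polarity) pairs; the game graph itself is abstracted away): a co-Büchi set in α is satisfied iff none of its three literals is visited infinitely often, i.e., iff all literals of the corresponding clause are true.

```python
def negate(l):
    v, p = l
    return (v, not p)

def alpha_of(clauses):
    """Each 3DNF clause contributes the set of its negated literals."""
    return [{negate(l) for l in c} for c in clauses]

def wins_exists_cobuchi(alpha, a):
    """Under memoryless strategies, the literal vertices visited
    infinitely often are exactly the literals true under a; some
    co-Buchi set is satisfied iff it avoids all of them."""
    inf = {(v, a[v]) for v in a}
    return any(not (s & inf) for s in alpha)
```

For a single clause x ∧ y¯, the assignment {x ↦ true, y ↦ false} satisfies the corresponding co-Büchi set, while {x ↦ true, y ↦ true} does not.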

Theorem 12 shows that Rabin PPGs are strictly more complex than general Rabin games. For the dual Streett objective, the 1-positionality of Rabin objectives does make the problem easier. Formally, we have the following (see proof in the full version).

Theorem 13.

Deciding whether Player 1 wins a Streett PPG is NP-complete. Hardness in NP applies already for AllB PPGs.

7 Muller PPGs and HPPGs

In this section, we study the complexity of Muller PPGs and HPPGs. Our results are summarized in Table 1. We start with PPGs. It is not hard to see that Muller is path-efficient, and so Theorem 9 implies that the problem of deciding whether Player 1 wins a Muller PPG is in Σ2P. In Theorem 12, we proved that the problem is Σ2P-hard for Rabin and even ExistsC PPGs, and as ExistsC objectives can be expressed as Muller objectives, it may seem that a similar lower bound would be easy to obtain. The translation from an ExistsC objective to a Muller objective may, however, be exponential [6], which is in particular the case for the objective used in the lower bound proof of Theorem 12. Note that the AllB objective used in the lower bound proof of Theorem 13 can be translated with no blow-up to a Muller objective, but deciding whether Player 1 wins an AllB PPG is NP-complete.

Accordingly, our first heavy technical result in the context of Muller PPGs is a proof of their Σ2P-hardness. For this, we add an alternation to the problem of vertex-disjoint paths, used in [33] in order to prove NP-hardness for AllB PPGs. We show that a Muller objective can capture the alternation, which lifts the complexity from NP to Σ2P.

We first need some definitions and notations. Consider a directed graph G = ⟨V, E⟩. A path p in G is simple if each vertex appears in p at most once. Two simple paths p and q in G are vertex-disjoint iff they have no vertices in common, except maybe their first and last vertices. The vertex-disjoint-paths (VDP, for short) problem is to decide, given G and two vertices s, t ∈ V, whether there exist two vertex-disjoint paths from s to t and from t to s in G. The complementing problem, termed NVDP, is to decide whether there do not exist two vertex-disjoint paths from s to t and from t to s in G.
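An equivalent view that we use later is that ⟨G, s, t⟩ is in VDP iff some simple cycle visits both s and t: the two arcs of the cycle are the two vertex-disjoint paths. A brute-force sketch of this check (exponential, for illustration only; the helper name is ours):

```python
from itertools import permutations

def in_vdp(V, E, s, t):
    """<G,s,t> in VDP iff some simple cycle visits both s and t.
    Enumerates candidate cycles: s, a permutation of other vertices,
    and t inserted at every possible position after s."""
    others = [v for v in V if v not in (s, t)]
    for r in range(len(others) + 1):
        for mid in permutations(others, r):
            prefix = [s, *mid]
            for i in range(1, len(prefix) + 1):
                c = prefix[:i] + [t] + prefix[i:]
                if all((c[j], c[(j + 1) % len(c)]) in E
                       for j in range(len(c))):
                    return True
    return False
```

For the 4-cycle s → a → t → b → s, the check succeeds; if t has no path back to s, it fails.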

Now, the alternating NVDP problem (ANVDP, for short) is to decide, given a two-player game graph G = ⟨V1, V2, E⟩ and two vertices s, t ∈ V, whether there exists a memoryless strategy f1 for Player 1 in G such that ⟨Gf1, s, t⟩ is in NVDP. That is, there do not exist two vertex-disjoint paths from s to t and from t to s in Gf1.

Lemma 14.

ANVDP is Σ2P-complete.

Proof.

VDP is known to be NP-complete [24, 33], making NVDP co-NP-complete and inducing a Σ2P upper bound for ANVDP. NP-hardness is shown in [24] by a reduction from 3SAT (see the full version for details): given a 3CNF formula φ over variables in X, a graph Gφ with vertices s and t is constructed such that paths from s to t correspond to choosing an assignment to the variables in X, and then choosing one literal in every clause in φ. Then, a path p from s to t has a path q from t to s that is vertex-disjoint from p iff all the chosen literals are evaluated to 𝐭𝐫𝐮𝐞 in the chosen assignment. Accordingly, ⟨Gφ, s, t⟩ is in VDP iff φ is satisfiable.

For ANVDP, we use a similar reduction, but from 2QBF. Given a 2QBF formula Φ=XYφ where φ is in 3DNF, we construct a game graph similar to Gφ¯ for the 3CNF formula φ¯ with vertices s and t, where memoryless strategies for Player 1 correspond to assignments to the variables in X. Every such assignment defines a sub-graph that is in VDP iff there exists an assignment to the variables in Y that, together with the assignment Player 1 chose, satisfies φ¯. Accordingly, there exists a memoryless strategy for Player 1 such that the corresponding sub-graph, together with s and t, is in NVDP iff Φ=𝐭𝐫𝐮𝐞.

Theorem 15.

Deciding whether Player 1 wins a Muller PPG is Σ2P-complete.

Proof.

The upper bound follows from Theorem 9. For the lower bound, we describe a reduction from ANVDP, which, by Lemma 14, is Σ2P-hard. Consider a two-player game graph G = ⟨V1, V2, s, E⟩, and a vertex t ∈ V∖{s}. We define a Muller game 𝒢 = ⟨G, ℱ, χ⟩ such that Player 1 wins Pos-𝒢 iff ⟨G, s, t⟩ is in ANVDP.

Consider the set of colors {1,2,3}. We define χ : V → {1,2,3}, where χ(s) = 1, χ(t) = 2, and χ(v) = 3 for every v ∈ V∖{s,t}. Then, ℱ = {{1},{2},{3},{1,3},{2,3}}. Thus, a play is winning for Player 1 iff it does not visit both s and t infinitely often. For the correctness of the construction, note that for every memoryless strategy f1 for Player 1 in G, we have that f1 is winning in Pos-𝒢 iff Player 2 does not have a memoryless winning strategy in Gf1, which she has iff there exists a simple cycle in Gf1 that visits both s and t. Such a simple cycle exists iff there exist two vertex-disjoint paths from s to t and from t to s in Gf1, that is, iff ⟨Gf1, s, t⟩ is in VDP. Accordingly, Player 1 wins Pos-𝒢 iff ⟨G, s, t⟩ is in ANVDP.
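The winning condition of this reduction on a lasso can be sketched as follows (our own encoding; the colors visited infinitely often along a lasso are exactly those of its cycle part):

```python
def color(v, s, t):
    """The coloring of the reduction: s gets 1, t gets 2, others 3."""
    return 1 if v == s else 2 if v == t else 3

F = [{1}, {2}, {3}, {1, 3}, {2, 3}]

def player1_wins_lasso(cycle, s, t):
    """Player 1 wins iff the set of colors visited infinitely often is
    in F, i.e., iff the cycle does not contain both s and t."""
    return {color(v, s, t) for v in cycle} in F
```

For example, a lasso whose cycle visits s but not t is winning for Player 1, while one whose cycle visits both s and t is not.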

We continue to Muller HPPGs. Note that beyond implying membership in NP for the problem of deciding whether Player 1 wins a Muller PP1G, establishing the all-path-efficiency of the Muller objective also implies that the universality problem for universal Muller word automata can be solved in polynomial time. An alternative proof of Theorem 16 can use the known polynomial translation of Muller objectives to Zielonka DAGs [30] and the fact that Zielonka-DAG automata can be complemented in polynomial time [29]; in the full version, we describe this approach in detail. We find our direct proof useful, as it shows that translating full CNF formulas to equivalent DNF formulas can be done in polynomial time.

Theorem 16.

Muller objectives are all-path-efficient.

Proof.

Consider a graph G, and a Muller objective α = ⟨ℱ, χ⟩ defined over a set of colors [k]. Deciding whether every infinite path in G satisfies α can be reduced to deciding whether there is an infinite path in G that satisfies the dual Muller objective α~ = ⟨2^[k]∖ℱ, χ⟩. Since the size of α~ need not be polynomial in |ℱ|, a naive algorithm that checks the existence of a path that satisfies α~ does not run in polynomial time.

The key point in our algorithm is to complement a given Muller objective not to another Muller objective, but rather to view ℱ as a DNF formula φ over [k], dualize it to a CNF formula φ¯, and then convert φ¯ to an equivalent DNF formula. Specifically, φ = ⋁_{F∈ℱ} ((⋀_{i∈F} i) ∧ (⋀_{i∈[k]∖F} i¯)), with the semantics that a literal i ∈ [k] (i¯, respectively) requires vertices with the color i to be visited infinitely (finitely, respectively) often [4]. The dual Muller objective then corresponds to the complementing CNF formula; thus it corresponds to the Emerson-Lei objective [30] φ¯ = ⋀_{F∈ℱ} ((⋁_{i∈F} i¯) ∨ (⋁_{i∈[k]∖F} i)).

Note that deciding whether a graph G has a path that satisfies a DNF formula can be done in polynomial time (for example, by checking whether there exists a path in G that respects the requirements induced by one clause of the formula). On the other hand, deciding whether a graph G has a path that satisfies a CNF formula is hard, which is why we convert φ¯ to an equivalent DNF formula. Converting CNF to DNF is in general exponential. The CNF formula φ¯, however, is full: for every variable i, every conjunct contains the literal i or the literal i¯. In the full version, we show that full CNF formulas can be converted to DNF in polynomial time. The conversion is based on the equality φ ≡ (φ ∧ xn) ∨ (φ ∧ xn¯), recursively applied to (φ ∧ xn) and (φ ∧ xn¯). The blow-up is kept polynomial by simplifying (φ ∧ xn): clauses of φ that contain the literal xn are satisfied and dropped, and the literal xn¯ is removed from the remaining clauses; (φ ∧ xn¯) is simplified similarly.
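The recursion can be sketched as follows (our own encoding: a clause or term is a dict from variables to polarities; details of the full version may differ). Since the CNF is full, each clause survives in exactly one of the two cofactors, which keeps the number of generated terms polynomial:

```python
def full_cnf_to_dnf(clauses, variables):
    """clauses: list of dicts {var: polarity}; the CNF is full, i.e.,
    every clause assigns a polarity to every variable in `variables`.
    Returns an equivalent DNF as a list of terms (same encoding)."""
    def rec(cls, vs):
        if any(not c for c in cls):   # an empty clause: branch is false
            return []
        if not cls:                   # no clauses left: branch is true
            return [{}]
        x = vs[-1]
        rest = vs[:-1]
        # Cofactor x = true: clauses with literal x are satisfied and
        # dropped; clauses with literal ~x lose that literal.
        pos = [{v: p for v, p in c.items() if v != x}
               for c in cls if not c[x]]
        # Cofactor x = false: symmetrically.
        neg = [{v: p for v, p in c.items() if v != x}
               for c in cls if c[x]]
        return ([dict(t, **{x: True}) for t in rec(pos, rest)] +
                [dict(t, **{x: False}) for t in rec(neg, rest)])
    return rec(clauses, variables)
```

For example, the full CNF (x ∨ y) ∧ (x ∨ y¯), which is equivalent to x, is converted to the DNF (x ∧ y) ∨ (x ∧ y¯).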

We can now conclude with the complexity of Muller HPPGs (see proof in the full version).

Theorem 17.

Deciding whether Player 1 wins a Muller PP1G is NP-complete, and deciding whether Player 1 wins a Muller PP2G is co-NP-complete.

8 Weighted Multiple Objectives PPGs and HPPGs

In this section we study PPGs and HPPGs with weighted multiple objectives. We focus on games with underlying Büchi or reachability objectives. In the full version, we prove that the results for underlying co-Büchi or avoid objectives follow. Essentially, this follows from the fact that weighted objectives may be dualized by complementing either the type of the objective or the way we refer to the satisfaction value. Specifically, for an objective type γ ∈ {R, A, B, C}, let γ~ be the dual objective type, thus R~ = A and B~ = C. Then, as shown in [40], for every γ ∈ {R, A, B, C}, every MaxW-γ objective has an equivalent MinW-γ~ objective of polynomial size, and vice versa. Our results are summarized in Table 2 below.

Table 2: Complexity results for PPGs with weighted multiple objectives.

| Type | Positionality | P1 wins 𝒢 | P1 wins Pos-𝒢 | P1 wins 1Pos-𝒢 | P1 wins 2Pos-𝒢 |
|---|---|---|---|---|---|
| MaxWB / MinWC | 2-positional [40] | co-NP-complete [40] | Σ2P-complete (Theorem 18) | Σ2P-complete (Theorem 20) | co-NP-complete (Theorem 20) |
| MinWB / MaxWC | 1-positional [40] | NP-complete [40] | Σ2P-complete (Theorem 18) | NP-complete (Theorem 20) | Π2P-complete (Theorem 20) |
| MaxWR / MinWA | non-positional [40] | PSPACE-complete [40] | Σ2P-complete (Theorem 19) | Σ2P-complete (Theorem 21) | PSPACE-complete (Theorem 21) |
| MinWR / MaxWA | non-positional [40] | PSPACE-complete [40] | Σ2P-complete (Theorem 19) | Σ2P-complete (Theorem 21) | PSPACE-complete (Theorem 21) |

We start with PPGs with underlying Büchi and reachability objectives.

Theorem 18.

Deciding whether Player 1 wins a MinWB or MaxWB PPG is Σ2P-complete. Hardness in Σ2P applies already for games with uniform weight functions.

Proof.

Both upper bounds follow from Theorem 9. Since the ExistsC objective used in the lower bound in the proof of Theorem 12 can be specified as a MinWB objective with a uniform weight function, a matching lower bound for MinWB PPGs is easy.

A lower bound for MaxWB PPGs is less easy. Consider a 2QBF formula Φ = ∃X∀Y φ such that φ is in 3DNF. We construct a MaxWB game 𝒢Φ over the game graph FΦ defined in Section 5.2 such that Φ = 𝐭𝐫𝐮𝐞 iff Player 1 wins Pos-𝒢Φ.

For every literal l in φ, we define a Büchi objective αl as the set of vertices associated with l: the literal vertex l from the assignment phase of the game, and the refute-literal vertices in the checking phase that originate from l¯ appearing in some clause. The objective of Player 1 is to satisfy at least |X|+|Y|+1 different Büchi objectives, which can be expressed with a uniform weight function. Formally, 𝒢Φ = ⟨FΦ, α, |X|+|Y|+1⟩, where α = {αl : l ∈ X ∪ X¯ ∪ Y ∪ Y¯}, with αl = {l} ∪ {Cij : i ∈ [k], j ∈ [3], and lij = l¯}, for every l ∈ X ∪ X¯ ∪ Y ∪ Y¯.

Intuitively (see proof in the full version), the literal vertices that a play visits infinitely often are exactly those that correspond to literals evaluated to 𝐭𝐫𝐮𝐞 in the chosen assignment, and thus the play satisfies |X|+|Y| Büchi objectives during the assignment phase – one for every literal evaluated to 𝐭𝐫𝐮𝐞. Then, the play satisfies an additional Büchi objective in the checking phase iff Player 2 chooses a refute-literal vertex that corresponds to a literal evaluated to 𝐟𝐚𝐥𝐬𝐞. Since Player 2 is forced to choose such a refute-literal vertex iff there exists a clause all of whose literals are evaluated to 𝐭𝐫𝐮𝐞, Player 2 is forced to allow the satisfaction of an additional Büchi objective iff the chosen assignment satisfies φ. Therefore, Φ = 𝐭𝐫𝐮𝐞 iff Player 1 can force the satisfaction of at least |X|+|Y|+1 Büchi objectives.
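The counting argument can be sketched as follows (our own simplified encoding, abstracting the game graph away: an objective αl is identified by its literal l, and Player 2's checking-phase behavior by the literal she refutes in each clause):

```python
def satisfied_objectives(a, clauses, picks):
    """a: the assignment induced by the memoryless strategies;
    picks[i]: the position j of the literal that Player 2 refutes in
    clause i.  Returns the set of literals l with alpha_l satisfied:
    every literal true under a (assignment phase), plus, for every
    chosen refute vertex, the negation of the picked literal
    (checking phase)."""
    sat = {(v, a[v]) for v in a}
    for i, j in enumerate(picks):
        v, p = clauses[i][j]
        sat.add((v, not p))
    return sat
```

Player 2 minimizes the count: picking a false literal adds nothing new, so she can keep the count at |X|+|Y| iff every clause contains a false literal; with an all-true clause, any pick yields an extra objective.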

Theorem 19.

Deciding whether Player 1 wins a MinWR or MaxWR PPG is Σ2P-complete. Hardness in Σ2P applies already for uniform weight functions.

Proof.

The upper bounds follow from Theorem 9. The reductions from 2QBF to MinWB and MaxWB PPGs in the proof of Theorem 18 are also valid for MinWR and MaxWR PPGs. Indeed, when the players are restricted to memoryless strategies in GΦ and FΦ, every vertex is visited infinitely often iff it is reached. Thus, a Büchi (co-Büchi) objective α is satisfied iff the reachability (avoid, respectively) objective α is satisfied.

Moving to HPPGs, we start with underlying Büchi objectives, where things are easy (see proof in the full version):

Theorem 20.

Deciding whether Player 1 wins:

  • a MinWB PP1G is NP-complete.

  • a MaxWB PP1G is Σ2P-complete.

  • a MaxWB PP2G is co-NP-complete.

  • a MinWB PP2G is Π2P-complete.

In all cases, hardness holds already for games with a uniform weight function.

For HPPGs with underlying reachability objectives, things are more complicated. First, recall that MinWR and MaxWR HPPGs need not be determined, and thus the results for PP2Gs cannot be inferred from the results for PP1Gs. In addition, since MinWR and MaxWR objectives are not half-positional, the memory requirement for the unrestricted player adds to the complexity.

Theorem 21.

Deciding whether Player 1 wins:

  • a MinWR or MaxWR PP1G is Σ2P-complete.

  • a MinWR or MaxWR PP2G is PSPACE-complete.

In all cases, hardness holds already for games with a uniform weight function.

Proof.

(sketch, see full details in the full version). We start with PP1Gs. For the upper bounds, consider a MinWR or MaxWR game 𝒢 = ⟨G, ψ⟩. A memoryless strategy f1 for Player 1 in 𝒢 is winning iff the objectives that are satisfied in ψ are reached within polynomially many rounds of the game. Hence, an NP algorithm guesses a memoryless strategy f1 for Player 1, and uses a co-NP oracle to check that ψ is satisfied along every path of the appropriate length in Gf1.

For the lower bounds, we argue that the reductions from 2QBF to MinWR and MaxWR PPGs in the proof of Theorem 19 are valid also for MinWR and MaxWR PP1Gs when replacing the game graphs GΦ and FΦ by Reach(GΦ) and Reach(FΦ), respectively. Essentially, this follows from the fact that in these games each vertex is visited at most once.

We continue to PP2Gs. For the PSPACE upper bound, we describe an ATM T that runs in polynomial time and accepts a MinWR or a MaxWR game 𝒢 = ⟨G, ψ⟩ iff Player 1 wins 2Pos-𝒢. The alternation of T is used in order to simulate the game, and we show that, for MaxWR, we can require the reachability objectives to be satisfied within |V|·|α| rounds, and for MinWR, we can require the reachability objectives to not be satisfied while traversing a cycle. Accordingly, we can bound the number of rounds in the simulation, with T maintaining on the tape the set of reachability objectives satisfied so far, as well as the information required to detect when it may terminate and to ensure that Player 2 uses a memoryless strategy.

For the lower bounds, we describe reductions from QBF. That is, given a QBF formula Φ = Q1x1Q2x2⋯Qnxn φ, we construct a MinWR and a MaxWR game 𝒢Φ = ⟨Reach(GΦ), α⟩ such that Φ = 𝐭𝐫𝐮𝐞 iff Player 1 wins 2Pos-𝒢Φ. For MinWR, we assume that φ is in DNF, and the winning objective α is similar to the ExistsC objective in Theorem 12. For MaxWR, we assume that φ is in CNF, and define an AllR objective in which each clause induces the set of its literals. It is easy to see that the reduction is valid when the strategies of the players are not restricted. We argue that since each vertex in Reach(GΦ) is visited in each outcome at most once, Player 1 wins 𝒢Φ iff Player 1 wins 2Pos-𝒢Φ, and so we are done.

9 Discussion

We introduced and studied positional-player games, where one or both players are restricted to memoryless strategies. Below we discuss two directions for future research.

The focus on memoryless strategies corresponds to non-intrusiveness in applications like program repair [33, 27], supervisory control [17], regret minimization [31], and more. More intrusive approaches involve executing controllers of bounded size in parallel with the system or environment, leading to bounded-player games. There, the input to the problem contains, in addition to the game 𝒢, also bounds m1 and m2 on the sizes of the system and the environment. Player 1 wins (m1,m2)-𝒢 iff she has a strategy with memory of size at most m1 that wins against all strategies of size at most m2 of Player 2.

Note that the games (1,1)-𝒢, (1,∞)-𝒢, and (∞,1)-𝒢 coincide with Pos-𝒢, 1Pos-𝒢, and 2Pos-𝒢, respectively, and so our complexity lower bounds here apply also to the bounded setting. Moreover, if we assume, as in work about bounded synthesis [38], that m1 and m2 are given in unary, then many of our upper bounds here extend easily to the bounded setting. Indeed, upper bounds that guess memoryless strategies for a player can now guess memory structures of the given size, and reason about the game obtained by taking the product with the structures. Some of our results, however, are not extended easily. For example, the NP upper bound for Streett PPGs relies on the fact that for every memoryless strategy f1 for Player 1, we have that Player 2 has a winning strategy in Gf1 iff she has a memoryless winning strategy in Gf1. When m2<m1, the latter is not helpful for Player 2, and we conjecture that the problem in the bounded setting is more complex.
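The product construction mentioned above can be sketched as follows (our own minimal encoding, with a deterministic memory-update function; a strategy with |M| memory states in the original graph corresponds to a memoryless strategy in the product):

```python
def product_with_memory(V, E, M, update):
    """Product of a game graph with a memory structure: vertices are
    pairs (v, m), and traversing an edge to u updates the memory state
    via update(m, u)."""
    PV = [(v, m) for v in V for m in M]
    PE = {((v, m), (u, update(m, u))) for (v, u) in E for m in M}
    return PV, PE
```

For instance, with two memory states that flip on every move, a strategy in the product may choose different successors at (v, 0) and (v, 1), i.e., alternate between two choices at v in the original graph.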

The second direction concerns positional-player non-zero-sum games, namely multi-player games in which the objectives of the players may overlap [13, 52]. There, typical questions concern the stability of the game and equilibria the players may reach [54]. In particular, in rational synthesis, we seek an equilibrium in which the objective of the system is satisfied [23, 15]. The study of positional strategies in the non-zero-sum setting is of particular interest, as it also restricts the type of deviations that players may perform, and our ability to incentivize or block such deviations [39].

References

  • [1] B. Abu Radi and O. Kupferman. Minimization and canonization of GFG transition-based automata. Log. Methods Comput. Sci., 18(3), 2022. doi:10.46298/LMCS-18(3:16)2022.
  • [2] B. Abu Radi, O. Kupferman, and O. Leshkowitz. A hierarchy of nondeterminism. In 46th Int. Symp. on Mathematical Foundations of Computer Science, volume 202 of LIPIcs, pages 85:1–85:21, 2021. doi:10.4230/LIPICS.MFCS.2021.85.
  • [3] B. Aminof, O. Kupferman, and R. Lampert. Formal analysis of online algorithms. In 9th Int. Symp. on Automated Technology for Verification and Analysis, volume 6996 of Lecture Notes in Computer Science, pages 213–227. Springer, 2011. doi:10.1007/978-3-642-24372-1_16.
  • [4] T. Babiak, F. Blahoudek, A. Duret-Lutz, J. Klein, J. Kretínský, D. Müller, D. Parker, and J. Strejcek. The hanoi omega-automata format. In Proc. 27th Int. Conf. on Computer Aided Verification, volume 9206 of Lecture Notes in Computer Science, pages 479–486. Springer, 2015. doi:10.1007/978-3-319-21690-4_31.
  • [5] C. Beeri. On the membership problem for functional and multivalued dependencies in relational databases. ACM Trans. on Database Systems, 5:241–259, 1980. doi:10.1145/320613.320614.
  • [6] U. Boker. On the (in)succinctness of muller automata. In Proc. 26th Annual Conf. of the European Association for Computer Science Logic, volume 82 of LIPIcs, pages 12:1–12:16. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPICS.CSL.2017.12.
  • [7] P. Bouyer, A. Casares, M. Randour, and P. Vandenhove. Half-Positional Objectives Recognized by Deterministic Büchi Automata. In Proc. 33rd Int. Conf. on Concurrency Theory, volume 243 of Leibniz International Proceedings in Informatics (LIPIcs), pages 20:1–20:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.CONCUR.2022.20.
  • [8] P. Bouyer, N. Fijalkow, M. Randour, and P. Vandenhove. How to play optimally for regular objectives? 2023. doi:10.4230/LIPICS.ICALP.2023.118.
  • [9] N. Bulling and V. Goranko. Combining quantitative and qualitative reasoning in concurrent multi-player games. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, pages 1926–1928. International Foundation for Autonomous Agents and Multiagent Systems, 2022.
  • [10] A. Casares. On the minimisation of transition-based rabin automata and the chromatic memory requirements of muller conditions. In Proc. 30th Annual Conf. of the European Association for Computer Science Logic, volume 216 of LIPIcs, pages 12:1–12:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.CSL.2022.12.
  • [11] A. Casares and P. Ohlmann. Positional ω-regular languages. In Proc. 39th ACM/IEEE Symp. on Logic in Computer Science. Association for Computing Machinery, 2024.
  • [12] K. Chatterjee, W. Dvorak, M. Henzinger, and V. Loitzenbauer. Conditionally optimal algorithms for generalized büchi games. In 41st Int. Symp. on Mathematical Foundations of Computer Science, volume 58 of LIPIcs, pages 25:1–25:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPICS.MFCS.2016.25.
  • [13] K. Chatterjee, R. Majumdar, and M. Jurdzinski. On Nash equilibria in stochastic games. In Proc. 13th Annual Conf. of the European Association for Computer Science Logic, volume 3210 of Lecture Notes in Computer Science, pages 26–40. Springer, 2004. doi:10.1007/978-3-540-30124-0_6.
  • [14] T. Colcombet and D. Niwiński. On the positional determinacy of edge-labeled games. Theoretical Computer Science, 352(1):190–196, 2006. doi:10.1016/J.TCS.2005.10.046.
  • [15] R. Condurache, E. Filiot, R. Gentilini, and J.-F. Raskin. The complexity of rational synthesis. In Proc. 43th Int. Colloq. on Automata, Languages, and Programming, volume 55 of LIPIcs, pages 121:1–121:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPICS.ICALP.2016.121.
  • [16] F. Delgrange, J.P. Katoen, T. Quatmann, and M. Randour. Simple strategies in multi-objective mdps. In Proc. 26th Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, pages 346–364. Springer International Publishing, 2020.
  • [17] J. Dubreil, Ph. Darondeau, and H. Marchand. Supervisory control for opacity. IEEE Transactions on Automatic Control, 55(5):1089–1100, 2010. doi:10.1109/TAC.2010.2042008.
  • [18] S. Dziembowski, M. Jurdzinski, and I. Walukiewicz. How much memory is needed to win infinite games. In Proc. 12th ACM/IEEE Symp. on Logic in Computer Science, pages 99–110, 1997.
  • [19] R. Ehlers. Symbolic bounded synthesis. In Proc. 22nd Int. Conf. on Computer Aided Verification, volume 6174 of Lecture Notes in Computer Science, pages 365–379. Springer, 2010. doi:10.1007/978-3-642-14295-6_33.
  • [20] E.A. Emerson and C. Jutla. The complexity of tree automata and logics of programs. In Proc. 29th IEEE Symp. on Foundations of Computer Science, pages 328–337, 1988.
  • [21] E.A. Emerson and C. Jutla. Tree automata, μ-calculus and determinacy. In Proc. 32nd IEEE Symp. on Foundations of Computer Science, pages 368–377, 1991.
  • [22] E. Filiot, N. Jin, and J.-F. Raskin. An antichain algorithm for LTL realizability. In Proc. 21st Int. Conf. on Computer Aided Verification, volume 5643, pages 263–277, 2009. doi:10.1007/978-3-642-02658-4_22.
  • [23] D. Fisman, O. Kupferman, and Y. Lustig. Rational synthesis. In Proc. 16th Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, volume 6015 of Lecture Notes in Computer Science, pages 190–204. Springer, 2010. doi:10.1007/978-3-642-12002-2_16.
  • [24] S. Fortune, J. Hopcroft, and J. Wyllie. The directed subgraph homeomorphism problem. Theoretical Computer Science, 10(2):111–121, 1980. doi:10.1016/0304-3975(80)90009-2.
  • [25] H. Gimbert and W. Zielonka. Games where you can play optimally without any memory. In Proc. 16th Int. Conf. on Concurrency Theory, volume 3653 of Lecture Notes in Computer Science, pages 428–442, 2005. doi:10.1007/11539452_33.
  • [26] E. Grädel. Positional determinacy of infinite games. In Proc. 21st Symp. on Theoretical Aspects of Computer Science, volume 2996 of Lecture Notes in Computer Science, pages 4–18, 2004. doi:10.1007/978-3-540-24749-4_2.
  • [27] D. Harel, G. Katz, A. Marron, and G. Weiss. Non-intrusive repair of reactive programs. In 17th IEEE International Conference on Engineering of Complex Computer Systems, ICECCS 2012, Paris, France, July 18-20, 2012, pages 3–12. IEEE Computer Society, 2012. doi:10.1109/ICECCS.2012.25.
  • [28] D. Harel and A. Pnueli. On the development of reactive systems. In K. Apt, editor, Logics and Models of Concurrent Systems, volume F-13 of NATO Advanced Summer Institutes, pages 477–498. Springer, 1985.
  • [29] C. Hugenroth. Zielonka dag acceptance and regular languages over infinite words. In Developments in Language Theory, pages 143–155. Springer Nature Switzerland, 2023.
  • [30] P. Hunter and A. Dawar. Complexity bounds for regular games. In 30th Int. Symp. on Mathematical Foundations of Computer Science, volume 3618, pages 495–506. Springer, 2005. doi:10.1007/11549345_43.
  • [31] P. Hunter, G.A. Pérez, and J.F. Raskin. Reactive synthesis without regret. In Proc. 26th Int. Conf. on Concurrency Theory, volume 42 of Leibniz International Proceedings in Informatics (LIPIcs), pages 114–127. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2015. doi:10.4230/LIPICS.CONCUR.2015.114.
  • [32] N. Immerman. Number of quantifiers is better than number of tape cells. Journal of Computer and Systems Science, 22(3):384–406, 1981. doi:10.1016/0022-0000(81)90039-8.
  • [33] B. Jobstmann, A. Griesmayer, and R. Bloem. Program repair as a game. In Proc. 17th Int. Conf. on Computer Aided Verification, volume 3576 of Lecture Notes in Computer Science, pages 226–238, 2005. doi:10.1007/11513988_23.
  • [34] M. Jurdzinski. Deciding the winner in parity games is in UP ∩ co-UP. Information Processing Letters, 68(3):119–124, 1998.
  • [35] N. Klarlund. Progress measures, immediate determinacy, and a subset construction for tree automata. Annals of Pure and Applied Logic, 69(2):243–268, 1994. doi:10.1016/0168-0072(94)90086-8.
  • [36] E. Kopczyński. Half-positional determinacy of infinite games. In Proc. 33rd Int. Colloq. on Automata, Languages, and Programming, volume 4052 of Lecture Notes in Computer Science, pages 336–347. Springer, 2006. doi:10.1007/11787006_29.
  • [37] D. Kuperberg and A. Majumdar. Width of non-deterministic automata. In Proc. 35th Symp. on Theoretical Aspects of Computer Science, volume 96 of LIPIcs, pages 47:1–47:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPICS.STACS.2018.47.
  • [38] O. Kupferman, Y. Lustig, M.Y. Vardi, and M. Yannakakis. Temporal synthesis for bounded systems and environments. In Proc. 28th Symp. on Theoretical Aspects of Computer Science, pages 615–626, 2011.
  • [39] O. Kupferman and N. Shenwald. Games with trading of control. In Proc. 34th Int. Conf. on Concurrency Theory, volume 279 of Leibniz International Proceedings in Informatics (LIPIcs), pages 19:1–19:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.CONCUR.2023.19.
  • [40] O. Kupferman and N. Shenwald. Games with weighted multiple objectives. In 22nd Int. Symp. on Automated Technology for Verification and Analysis, Lecture Notes in Computer Science. Springer, 2024.
  • [41] D.A. Martin. Borel determinacy. Annals of Mathematics, 102(2):363–371, 1975.
  • [42] S. Nain and M.Y. Vardi. Solving partial-information stochastic parity games. In Proc. 28th ACM/IEEE Symp. on Logic in Computer Science, pages 341–348. IEEE Computer Society, 2013. doi:10.1109/LICS.2013.40.
  • [43] N. Nisan, T. Roughgarden, E. Tardos, and V.V. Vazirani. Algorithmic Game Theory. Cambridge University Press, 2007.
  • [44] P. Ohlmann. Characterizing positionality in games of infinite duration over infinite graphs. In Proc. 37th ACM/IEEE Symp. on Logic in Computer Science. Association for Computing Machinery, 2022.
  • [45] P. Ohlmann and M. Skrzypczak. Positionality in Σ⁰₂ and a completeness result, 2024. arXiv:2309.17022.
  • [46] A. Pnueli and R. Rosner. On the synthesis of a reactive module. In Proc. 16th ACM Symp. on Principles of Programming Languages, pages 179–190, 1989.
  • [47] D. Raju, R. Ehlers, and U. Topcu. Playing against opponents with limited memory. CoRR, abs/2002.07274, 2020. arXiv:2002.07274.
  • [48] D. Raju, R. Ehlers, and U. Topcu. Adapting to the behavior of environments with bounded memory. In Proc. 12th International Symposium on Games, Automata, Logics and Formal Verification, 2021.
  • [49] S. Schewe. Minimising good-for-games automata is NP-complete. In Proc. 40th Conf. on Foundations of Software Technology and Theoretical Computer Science, volume 182 of LIPIcs, pages 56:1–56:13. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPICS.FSTTCS.2020.56.
  • [50] S. Schewe and B. Finkbeiner. Bounded synthesis. In 5th Int. Symp. on Automated Technology for Verification and Analysis, volume 4762 of Lecture Notes in Computer Science, pages 474–488. Springer, 2007. doi:10.1007/978-3-540-75596-8_33.
  • [51] W. Thomas. On the synthesis of strategies in infinite games. In Proc. 12th Symp. on Theoretical Aspects of Computer Science, volume 900 of Lecture Notes in Computer Science, pages 1–13. Springer, 1995. doi:10.1007/3-540-59042-0_57.
  • [52] M. Ummels. The complexity of Nash equilibria in infinite multiplayer games. In Proc. 11th Int. Conf. on Foundations of Software Science and Computation Structures, pages 20–34, 2008.
  • [53] M.Y. Vardi and P. Wolper. Automata-theoretic techniques for modal logics of programs. Journal of Computer and Systems Science, 32(2):182–221, 1986. doi:10.1016/0022-0000(86)90026-7.
  • [54] J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, 1953.
  • [55] W. Zielonka. Infinite games on finitely coloured graphs with applications to automata on infinite trees. Theoretical Computer Science, 200(1-2):135–183, 1998. doi:10.1016/S0304-3975(98)00009-7.