Memory Requirements in Non-Zero-Sum Games

Feinstein, Yoav; Kupferman, Orna

doi:10.4230/LIPIcs.CSL.2026.34

Yoav Feinstein

School of Engineering and Computer Science, Hebrew University, Jerusalem, Israel

Orna Kupferman

School of Engineering and Computer Science, Hebrew University, Jerusalem, Israel

Abstract

The interaction between a system and the components modeling its environment is traditionally modeled by a multi-player game played on a finite graph. In zero-sum games, the players have conflicting objectives, and it is clear that increasing the memory of the environment players can only make it harder for the system to win. In non-zero-sum games, the objectives of the players may overlap. There, typical questions concern the stability of the game and the equilibria the players may reach. In particular, in rational synthesis (RS), the goal is to find an equilibrium that satisfies the objective of the system.

We study how the memory of the environment players may affect the existence of an RS solution. As we show, the picture is diverse, even when the objectives of all players are memoryless. On the one hand, when stability amounts to a Nash equilibrium (NE), then increasing the memory of the environment may only help the system to suggest an RS solution. On the other hand, when the notion of stability involves deviations by coalitions of environment players, for example in a strong Nash equilibrium (SNE), then increasing their memory may sometimes enable and sometimes prevent the existence of an RS solution. We study memory bounds for the players, showing that the memory required may be polynomial in an NE-RS solution and exponential in an SNE-RS solution. We also solve the SNE-RS problem, show that it is PSPACE-complete, and relate the differences between NE and SNE with the differences between cooperative and non-cooperative RS.

Keywords and phrases:

Non-Zero-Sum Games, Synthesis, Memory

2012 ACM Subject Classification:

Theory of computation $\rightarrow$ Formal languages and automata theory; Theory of computation $\rightarrow$ Semantics and reasoning

Event:

34th EACSL Annual Conference on Computer Science Logic (CSL 2026)

Editors:

Stefano Guerrini and Barbara König

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Synthesis is the automated construction of systems from their specifications [28]. Modern systems often consist of interacting components. The interaction is modeled by a multi-player game played on a finite graph. In the turn-based setting, the vertices of the game graph are partitioned among the players. A token is placed on an initial vertex, and in each turn, the player that owns the vertex with the token moves it to a successor vertex. Each player has a strategy that directs her how to move the token when it reaches vertices she owns. A profile is a vector of strategies, one for each player. The outcome of a profile is a play – an infinite path in the game graph, obtained when the players follow their strategies. The goal of each player is to direct the game into a play that satisfies her objective. Each objective $\alpha$ defines a subset of $V^{\omega}$ [25], where $V$ is the set of vertices of the game graph. For example, in games with Büchi objectives, $\alpha$ is a subset of $V$ , and a play satisfies $\alpha$ if it visits vertices in $\alpha$ infinitely often.

In zero-sum games, the players compete with each other on the satisfaction of contradicting objectives. Zero-sum games with $\omega$ -regular objectives are determined: in every game, exactly one player has a winning strategy – one that achieves her objective against all strategies of the other players [25]. Deciding a zero-sum game amounts to finding this player. In contrast, in non-zero-sum games, the objectives of the players may overlap [12, 31]. There, typical questions concern the stability of the game and the equilibria the players may reach [32]. The most common notion of stability is Nash equilibrium (NE) [26]. A profile of strategies is an NE if no (single) player can benefit from unilaterally changing her strategy.

Two-player zero-sum games model the interaction between a system and its environment. The system aims to satisfy its specification in all environments. Accordingly, the environment is assumed to be hostile, as if its objective is to violate the specification [4]. Often, however, the components composing the environment have objectives of their own, and they act to achieve their objectives. For example, clients interacting with a server typically have objectives other than to fail the server. Accordingly, the setting induces a non-zero-sum game, where the objectives of the players are induced from their specifications. In rational synthesis, we let the system exploit the rationality of the environment. In particular, in cooperative rational synthesis (CRS) [18], the desired output is an NE profile whose outcome satisfies the objective of the system. Thus, in CRS, we assume that we can suggest strategies to the environment players, and if they have no incentive to deviate from these strategies, they follow them. Rational synthesis has been extensively studied in various settings and variants [33, 13, 1, 22, 9, 10].

Recall that strategies for the players direct them how to move the token in vertices they own. A strategy may depend on the history of the game so far. Thus, in different visits of the token to the same vertex $v$, a strategy may direct the owner of $v$ to move the token to different successors. Extensive research has concerned the memory requirements for strategies in zero-sum games with $\omega$-regular objectives [30, 14, 5, 11, 8]. For example, it is well known that a winning strategy for a Büchi objective can be memoryless, thus it may depend only on the current vertex. On the other hand, a winning strategy for a conjunction of $k$ Büchi objectives may require memory $k$ [14]. Researchers have also studied games in which the memory of the players is bounded [29, 15, 17, 20, 23]. Clearly, increasing the memory of the system or reducing the memory of the environment in a zero-sum game can only help the system to win a game.

For non-zero-sum games, the situation is less clear. First, as we show in Section 3, even when its objective is memoryless, the system may need memory for its strategy in a CRS solution. Moreover, unlike the situation in zero-sum games, an increase of memory to the environment players may be helpful for the system. That is, even when the objectives are memoryless, the only possible CRS solutions may require the environment players to have memory. Intuitively, increasing the memory of the players in non-zero-sum games enables them to satisfy multiple objectives, which may be essential for achieving both stability and the satisfaction of the system’s objective [31]. In fact, we show that when the objectives of the environment players are memoryless, then adding memory to the environment players may only help the system to win.

Our basic observations above raise several interesting questions about memory bounds in non-zero-sum games. We first prove that a CRS solution in a $k$ -player non-zero-sum game with sink objectives (that is, ones that can be specified with all reachability or $\omega$ -regular objectives) may require each of the players to have memory $O(k)$ , matching the known upper bound [24, 31]. Further questions concern richer settings, detailed below.

The notion of an NE corresponds to deviations of single players. In some applications, a coalition of players may deviate together. For example, protocols for voting, mechanisms for exchange of messages, and allocation and construction of shared resources should all take into account the possibility of players that deviate together. The different applications induce different notions of stability. The first such definition is of strong Nash equilibrium (SNE). A profile is an SNE if no subset of players can deviate in a way that benefits all its members [3]. Then, for $b_{1},b_{2}\geq 0$, a profile is a $(b_{1},b_{2})$-robust equilibrium if no coalition of size $b_{1}$ can deviate in a way that benefits at least one of its members without harming the other members, and no coalition of size $b_{2}$ can deviate in a way that harms the other players [6]. Finally, a profile is a strong secure equilibrium (SSE) if every deviation of a coalition of players that harms some player also harms a player in the coalition [7].

We study the CRS problem when stability is defined with respect to deviations by a coalition of players. For example, in the SNE-CRS problem, we seek a profile of strategies that satisfies the objective of the system and is an SNE. The contributions in [6, 7] include a study of the complexity of the CRS problem when the solution concepts are robust-equilibrium and SSE. PSPACE upper bounds for the problems involve a reduction to a two-player game, termed the deviator game [6], which we easily extend to SNE-CRS. For the lower bound, our contribution is more interesting and involves relating non-cooperation in rational synthesis with deviations of coalitions in SNEs. Recall that in cooperative rational synthesis, the desired output is an NE that satisfies the system objective. In non-cooperative rational synthesis (NRS) [21], the desired output is a strategy for the system player such that the objective of the system is satisfied in the outcome of all NE profiles that include this strategy. Thus, in NRS, the environment players are rational, but we cannot suggest a strategy to them. As shown in [1], the cooperative and non-cooperative approaches are related to the two stability-inefficiency measures of price of stability [2] and price of anarchy [19, 27]. We relate NE-NRS (that is, NRS when stability amounts to being an NE) with SNE-CRS, showing that the challenge of coping with deviations of a coalition is similar to the challenge of coping with non-cooperation. Intuitively, in both cases, all the environment players may deviate simultaneously, as long as these deviations are beneficial for them. The relation implies that the PSPACE-hardness of the NE-NRS problem [13] applies also to the SNE-CRS problem.

Back to the study of memory requirements, we examine how the transition to solution concepts that involve deviations by a coalition of players affects these requirements. Since players in a coalition care for the satisfaction of the objectives of all the players in the coalition, their strategies have to satisfy multiple objectives. Since the latter typically requires memory, the study of the memory requirements in non-zero-sum games with deviations by coalitions is more interesting and involved. We start with some observations about the effect of increasing the memory to the environment players and show that, unlike the case of NE-CRS, here an increase to the memory of the environment players may prevent the existence of an SNE-CRS solution even when the objectives are memoryless.

On the other hand, for some games, the existence of an SNE-CRS solution requires the environment players to have memory, and we examine the memory requirements for them. For the upper bound, it is not hard to extend the analysis in [6] and describe an exponential upper bound for the required memory. Our main technical contribution is a matching lower bound. We show that even in $k$-player games with sink objectives, an SNE-CRS solution may require $O(k)$ players to have memory $2^{\Theta(k)}$. Moreover, our bounds apply also to the solution concepts of $(k,0)$-robust-equilibrium and SSE, completing the picture for all known solution concepts with deviations of a coalition of players.

2 Preliminaries

Games.

For $k\geq 1$ , let $[k]=\{1,\ldots,k\}$ . A $k$ -player (turn-based) game graph is a tuple $G=\langle\{V_{i}\}_{i\in[k]},v_{0},E\rangle$ , where $V_{1},\ldots,V_{k}$ are disjoint sets of vertices. For every $i\in[k]$ , the vertices in $V_{i}$ are owned by Player $i$ , and we let $V=\bigcup_{i\in[k]}V_{i}$ . Then, $v_{0}\in V$ is an initial vertex, and $E\subseteq V\times V$ is a total edge relation, thus for every $v\in V$ , there is $u\in V$ such that $\langle v,u\rangle\in E$ . For $v\in V$ , we denote by ${\sf owner}(v)$ the player $i\in[k]$ such that $v\in V_{i}$ . The size of $G$ , denoted $|G|$ , is $|E|$ , namely the number of edges in it.

A game is a tuple ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ , where $G$ is a $k$ -player game graph, and $\alpha_{i}$ , for $i\in[k]$ , is a winning condition (a.k.a. objective) for Player $i$ . In the beginning of a play in the game, a token is placed on $v_{0}$ . Then, in each turn, the player that owns the vertex that hosts the token chooses a successor vertex and moves the token to it. Together, the players generate a play $\rho=v_{0},v_{1},v_{2}\ldots\in V^{\omega}$ in ${\mathcal{G}}$ , namely an infinite path that starts in $v_{0}$ and respects $E$ : for all $i\geq 0$ , we have that $\langle v_{i},v_{i+1}\rangle\in E$ .

Each winning condition $\alpha_{i}$ defines a subset of $V^{\omega}$. The objective of Player $i$ is to cause the interaction to generate a play that satisfies $\alpha_{i}$. We describe some types of winning conditions below. For a play $\rho=v_{0},v_{1}\ldots$, we denote by ${\sf reach}(\rho)$ the set of vertices visited at least once along $\rho$, and by ${\sf inf}(\rho)$ the set of vertices visited infinitely often along $\rho$. That is, ${\sf reach}(\rho)=\{v\in V:\,$ there exists some $i\geq 0$ such that $v_{i}=v\}$ and ${\sf inf}(\rho)=\{v\in V:\,$ there are infinitely many $i\geq 0$ such that $v_{i}=v\}$. A reachability objective is given by a set of vertices $\alpha\subseteq V$, and it requires some vertex in $\alpha$ to be visited at least once; thus a play $\rho\in V^{\omega}$ satisfies $\alpha$ iff ${\sf reach}(\rho)\cap\alpha\neq\emptyset$. A Büchi objective is given by a set of vertices $\alpha\subseteq V$, and it requires some vertex in $\alpha$ to be visited infinitely often; thus $\rho$ satisfies $\alpha$ iff ${\sf inf}(\rho)\cap\alpha\neq\emptyset$. The objectives dual to reachability and Büchi are avoid (also known as safety) and co-Büchi, respectively. Formally, a play $\rho$ satisfies an avoid objective $\alpha\subseteq V$ iff ${\sf reach}(\rho)\cap\alpha=\emptyset$, and satisfies a co-Büchi objective $\alpha\subseteq V$ iff ${\sf inf}(\rho)\cap\alpha=\emptyset$. A generalized Büchi objective is a set $\alpha=\{\alpha_{1},\ldots,\alpha_{m}\}$ of Büchi objectives. A play $\rho\in V^{\omega}$ satisfies $\alpha$ if it satisfies all the objectives in $\alpha$; thus if for all $j\in[m]$, we have that ${\sf inf}(\rho)\cap\alpha_{j}\neq\emptyset$. Generalized reachability objectives are defined similarly, requiring all underlying reachability objectives to be satisfied.
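For an ultimately periodic play of the form $\rho=u\cdot w^{\omega}$, both ${\sf reach}(\rho)$ and ${\sf inf}(\rho)$ can be read off the finite words $u$ and $w$, so the winning conditions above can be checked directly. The following Python sketch illustrates this. The lasso representation (a finite `prefix` and a non-empty `cycle`) and all function names are assumptions made for illustration and do not appear in the paper.

```python
from typing import Hashable, Iterable, Sequence, Set

Vertex = Hashable


def reach(prefix: Sequence[Vertex], cycle: Sequence[Vertex]) -> Set[Vertex]:
    """Vertices visited at least once along the play prefix . cycle^omega."""
    return set(prefix) | set(cycle)


def inf(prefix: Sequence[Vertex], cycle: Sequence[Vertex]) -> Set[Vertex]:
    """Vertices visited infinitely often: exactly those on the cycle."""
    if not cycle:
        raise ValueError("the cycle of a lasso play must be non-empty")
    return set(cycle)


def satisfies_reachability(prefix, cycle, alpha: Set[Vertex]) -> bool:
    return bool(reach(prefix, cycle) & alpha)


def satisfies_avoid(prefix, cycle, alpha: Set[Vertex]) -> bool:
    return not (reach(prefix, cycle) & alpha)


def satisfies_buchi(prefix, cycle, alpha: Set[Vertex]) -> bool:
    return bool(inf(prefix, cycle) & alpha)


def satisfies_co_buchi(prefix, cycle, alpha: Set[Vertex]) -> bool:
    return not (inf(prefix, cycle) & alpha)


def satisfies_generalized_buchi(prefix, cycle, alphas: Iterable[Set[Vertex]]) -> bool:
    """All underlying Büchi objectives must hold."""
    return all(satisfies_buchi(prefix, cycle, alpha) for alpha in alphas)


# Hypothetical example play: v0 (v2 v3 v2 v4)^omega.
PREFIX, CYCLE = ["v0"], ["v2", "v3", "v2", "v4"]
```

On the example play, the Büchi objectives $\{v_{3}\}$ and $\{v_{1},v_{4}\}$ both hold, while the reachability objective $\{v_{1}\}$ does not.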

Strategies, profiles, and equilibria.

For $i\in[k]$ , a strategy for Player $i$ is a function $f_{i}:V^{*}\cdot V_{i}\rightarrow V$ that maps prefixes of plays that end in a vertex owned by Player $i$ to possible extensions in a way that respects $E$ . That is, for every history $h\in V^{*}$ and $v\in V_{i}$ , we have that $\langle v,f_{i}(h\cdot v)\rangle\in E$ . Intuitively, a strategy for Player $i$ directs her how to move the token, and the direction may depend on the history of the play so far.

A profile is a tuple $\pi=\langle f_{1},\ldots,f_{k}\rangle$ of strategies, one for each player. The outcome of a profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ is the play obtained when the players follow their strategies. Formally, ${\sf outcome}(\pi)=v_{0},v_{1},v_{2},\ldots\in V^{\omega}$ is such that for all $j\geq 0$, we have that $v_{j+1}=f_{{\sf owner}(v_{j})}(v_{0}\cdots v_{j})$. Consider a game ${\mathcal{G}}$ and a profile $\pi$. The set of winners in ${\mathcal{G}}$ when the players follow $\pi$, denoted ${\sf Win}({\mathcal{G}},\pi)$, is the set of players whose objectives are satisfied in ${\sf outcome}(\pi)$. Formally, $i\in{\sf Win}({\mathcal{G}},\pi)$ iff ${\sf outcome}(\pi)$ satisfies $\alpha_{i}$. The set of losers in $\pi$, denoted ${\sf Lose}({\mathcal{G}},\pi)$, is then $[k]\setminus{\sf Win}({\mathcal{G}},\pi)$, namely the set of players whose objectives are not satisfied in ${\sf outcome}(\pi)$. When ${\mathcal{G}}$ is known from the context we write ${\sf Win}(\pi)$ and ${\sf Lose}(\pi)$, respectively.
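The recurrence $v_{j+1}=f_{{\sf owner}(v_{j})}(v_{0}\cdots v_{j})$ can be unrolled for any finite number of steps. The sketch below computes a finite prefix of ${\sf outcome}(\pi)$ for strategies given as Python functions of the history. The toy game (vertices `a` and `b`) and all names are hypothetical and serve only as an illustration.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Vertex = Hashable
History = Tuple[Vertex, ...]
Strategy = Callable[[History], Vertex]


def outcome_prefix(
    v0: Vertex,
    owner: Dict[Vertex, int],
    edges: Set[Tuple[Vertex, Vertex]],
    profile: Dict[int, Strategy],
    n: int,
) -> List[Vertex]:
    """Return the first n vertices of the outcome of `profile` from v0."""
    play = [v0]
    while len(play) < n:
        v = play[-1]
        # The owner of the current vertex picks the successor from the history.
        nxt = profile[owner[v]](tuple(play))
        if (v, nxt) not in edges:
            raise ValueError(f"strategy does not respect E at {v!r}")
        play.append(nxt)
    return play


# Hypothetical toy game: Player 1 owns "a", Player 2 owns "b".
OWNER = {"a": 1, "b": 2}
EDGES = {("a", "a"), ("a", "b"), ("b", "a")}


def f1(history: History) -> Vertex:
    # Needs the history: moves to "b" on every second visit to "a".
    return "b" if history.count("a") % 2 == 0 else "a"


def f2(history: History) -> Vertex:
    # Memoryless: always returns to "a".
    return "a"


PLAY = outcome_prefix("a", OWNER, EDGES, {1: f1, 2: f2}, 6)
```

Here `f1` depends on the number of earlier visits to `a`, so it is not memoryless, whereas `f2` depends only on the current vertex.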

For a subset $S\subseteq[k]$ of players, an $S$ -profile is a set of strategies, one for each player in $S$ . We say that a profile $\pi$ extends an $S$ -profile $\pi^{\prime}$ if the players in $S$ use in $\pi$ their strategies in $\pi^{\prime}$ . For a profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ , a non-empty subset $C\subseteq[k]$ , and a $C$ -profile $\pi^{\prime}_{C}=\bigcup_{i\in C}\{f^{\prime}_{i}\}$ , we denote by $\pi[C\leftarrow\pi^{\prime}_{C}]$ the profile in which the players in $C$ follow their strategies in $\pi^{\prime}_{C}$ and the players in $[k]\setminus{C}$ follow their strategies in $\pi$ . Formally, $\pi[C\leftarrow\pi^{\prime}_{C}]=\langle g_{1},\ldots,g_{k}\rangle$ , where for every $i\in[k]$ , we have that $g_{i}=f_{i}^{\prime}$ , if $i\in C$ , and $g_{i}=f_{i}$ , otherwise. When $C=\{i\}$ is a singleton, for some $i\in[k]$ , we simplify the notation and use $\pi[i\leftarrow f^{\prime}_{i}]$ rather than $\pi[\{i\}\leftarrow\{f^{\prime}_{i}\}]$ .

A profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ is a Nash Equilibrium (NE, for short) [26] if no single player has an incentive to deviate from $\pi$. Formally, $\pi$ is an NE if for every $i\in[k]$, if $i\in{\sf Lose}(\pi)$, then for every strategy $f^{\prime}_{i}$ for Player $i$, we have that $i\in{\sf Lose}(\pi[i\leftarrow f^{\prime}_{i}])$. The notion of an NE assumes deviation by single players. In some applications, a coalition of players may deviate together. The different applications induce different definitions of stability. We consider three definitions. First, a profile $\pi$ is a Strong Nash Equilibrium (SNE, for short) [3] if no coalition of players can jointly deviate in a way that strictly benefits all its members. Formally, $\pi$ is an SNE if for every non-empty subset $C\subseteq{\sf Lose}(\pi)$, and every $C$-profile $\pi^{\prime}_{C}$, there exists $j\in C$ such that $j\in{\sf Lose}(\pi[C\leftarrow\pi^{\prime}_{C}])$. Note that the definition of an NE is the special case of the definition of an SNE in which only deviations of coalitions of size $1$ are considered. Then, for $b_{1},b_{2}\geq 0$, a profile is a $(b_{1},b_{2})$-robust equilibrium if no coalition of size at most $b_{1}$ can deviate in a way that benefits at least one of its members without harming at least one of the other members, and no coalition of size at most $b_{2}$ can deviate in a way that harms at least one of the other players [6].¹ Formally, $\pi$ is $(b_{1},b_{2})$-robust if $\pi$ is $b_{1}$-resilient: for every subset $C\subseteq[k]$ of size at most $b_{1}$, and every $C$-profile $\pi^{\prime}_{C}$, if ${\sf Lose}(\pi)\cap{\sf Win}(\pi[C\leftarrow\pi^{\prime}_{C}])\neq\emptyset$, then ${\sf Win}(\pi)\cap{\sf Lose}(\pi[C\leftarrow\pi^{\prime}_{C}])\neq\emptyset$, and $\pi$ is $b_{2}$-immune: for every subset $C\subseteq[k]$ of size at most $b_{2}$, and every $C$-profile $\pi^{\prime}_{C}$, we have that ${\sf Win}(\pi)\cap{\sf Lose}(\pi[C\leftarrow\pi^{\prime}_{C}])\cap([k]\setminus C)=\emptyset$. Finally, a profile is a strong secure equilibrium (SSE) if every deviation of a coalition of players that harms some player also harms a player in the coalition [7]. Formally, $\pi$ is an SSE if for every subset $C\subseteq[k]$, and every $C$-profile $\pi^{\prime}_{C}$, if ${\sf Win}(\pi)\cap{\sf Lose}(\pi[C\leftarrow\pi^{\prime}_{C}])\cap([k]\setminus C)\neq\emptyset$, then ${\sf Win}(\pi)\cap{\sf Lose}(\pi[C\leftarrow\pi^{\prime}_{C}])\cap C\neq\emptyset$.

¹ The setting in [6] considers weighted objectives, where rather than winning or losing, each profile induces a payoff for each player, which enables also a quantitative definition of “harm” and “benefit”. Here we consider $\omega$-regular Boolean objectives, inducing a Boolean interpretation for “harm” and “benefit”.

For a subset $W\subseteq[k]$ of players, we say that $\pi$ is a $W$ -NE if $\pi$ is an NE with $W={\sf Win}(\pi)$ , and similarly for the other solution concepts.
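To make the difference between NE and SNE concrete, the following sketch checks both definitions by brute force. It assumes, purely for illustration, that each player chooses from a finite list of candidate strategies and that a function `winners` returns ${\sf Win}(\pi)$ for a profile; the names `is_sne`, `is_ne`, `STRATEGIES`, and the toy coordination game are hypothetical and not taken from the paper.

```python
from itertools import combinations, product
from typing import Callable, Dict, List, Optional, Set

Profile = Dict[int, str]


def is_sne(
    profile: Profile,
    strategies: Dict[int, List[str]],
    winners: Callable[[Profile], Set[int]],
    max_coalition: Optional[int] = None,
) -> bool:
    """True iff no non-empty coalition C of losers has a joint deviation after
    which every member of C wins (coalitions limited to max_coalition if given)."""
    losers = [i for i in sorted(strategies) if i not in winners(profile)]
    limit = len(losers) if max_coalition is None else min(max_coalition, len(losers))
    for size in range(1, limit + 1):
        for coalition in combinations(losers, size):
            for choice in product(*(strategies[i] for i in coalition)):
                deviated = dict(profile)
                deviated.update(zip(coalition, choice))
                if all(i in winners(deviated) for i in coalition):
                    return False  # a deviation that benefits all of C
    return True


def is_ne(profile, strategies, winners) -> bool:
    """NE: the same check restricted to coalitions of size 1."""
    return is_sne(profile, strategies, winners, max_coalition=1)


# Hypothetical coordination game: both players win only if both play "b".
STRATEGIES = {1: ["a", "b"], 2: ["a", "b"]}


def winners(profile: Profile) -> Set[int]:
    return {1, 2} if all(s == "b" for s in profile.values()) else set()


ALL_A = {1: "a", 2: "a"}
```

In this toy game the profile `ALL_A` is an NE, since no single player can win by deviating alone, but it is not an SNE, since the two players can deviate together to a profile in which both win.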

Rational Synthesis.

We consider a setting in which the players model a controllable system and its rational environment. Technically, we assume that Player $1$ models the system (a system may be composed of several components, but since the system is controllable, we can merge them into a single player), and the other players model the components of the environment. Let ${\sf Env}=\{2,\ldots,k\}$. The basic problem we consider is deciding the existence of, and finding, stable profiles that satisfy the objective of the system.

We refine the notions of NE and SNE to take into account our ability to control the system. For a profile $\pi$ , we say that $\pi$ is a $1$ -fixed NE if no player in ${\sf Env}$ can benefit from unilaterally changing her strategy. Likewise, we say that $\pi$ is a $1$ -fixed SNE if no coalition of players in ${\sf Env}$ can jointly deviate in a way that strictly benefits all its members.

Consider a $k$ -player game ${\mathcal{G}}$ . The problem of NE (SNE) cooperative rational synthesis, denoted NE-CRS (resp., SNE-CRS) is to return a $1$ -fixed NE (resp., SNE) in ${\mathcal{G}}$ in which Player $1$ wins. As in traditional synthesis, one can also define the corresponding decision problems, of rational realizability, where we only need to decide whether the desired strategies exist. In order to avoid additional notations, we sometimes refer to NE-CRS and SNE-CRS also as decision problems.

Finite-Memory Strategies.

A strategy $f_{i}:V^{*}\cdot V_{i}\rightarrow V$ is finite-memory if it is possible to replace the unbounded histories in $V^{*}$ by finitely many memories. Formally, a memory structure for a game ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ with $G=\langle\{V_{i}\}_{i\in[k]},v_{0},E\rangle$ is $\mathcal{M}=\left\langle M,\mu_{0},\delta\right\rangle$, consisting of a finite set $M$ of memory states, an initial memory state $\mu_{0}\in M$, and an update function $\delta:M\times E\rightarrow M$. A memory structure is similar to an automaton with alphabet $E$, which is executed in parallel to the game: it starts in $\mu_{0}$ and reads the edges traversed by the token. Then, a strategy for Player $i$ that relies on $\mathcal{M}$ replaces the dependency on the history of the play by dependency on the current memory state of $\mathcal{M}$. Thus, the strategy is given by a function $f_{i}:M\times V_{i}\rightarrow V$, such that for all $\mu\in M$ and $v\in V_{i}$, we have that $\langle v,f_{i}(\mu,v)\rangle\in E$. When the current memory state is $\mu$ and the token is in vertex $v\in V_{i}$, Player $i$ moves the token to $f_{i}(\mu,v)$ and $\mathcal{M}$ moves to state $\delta\left(\mu,\langle v,f_{i}(\mu,v)\rangle\right)$. A strategy $f_{i}$ is memoryless if it only depends on the current vertex. That is, if for every two histories $h,h^{\prime}\in V^{*}$ and vertex $v\in V_{i}$, we have that $f_{i}(h\cdot v)=f_{i}(h^{\prime}\cdot v)$. Note that a memoryless strategy can be viewed as a function $f_{i}:V_{i}\rightarrow V$, and corresponds to the case $|M|=1$. An objective type $\gamma$ is called memoryless if in every zero-sum game with an objective of type $\gamma$, if Player $1$ wins, then she has a memoryless winning strategy. Reachability, avoid, Büchi, and co-Büchi objectives are all memoryless [30].

For $l\geq 1$ , we say that a strategy $f_{i}:V^{*}\cdot V_{i}\rightarrow V$ uses memory $l$ if a memory structure that generates $f_{i}$ needs $l$ states. Given a profile $\pi=\langle f_{1},\cdots,f_{k}\rangle$ , we say that Player $i$ uses memory $l$ (in $\pi$ ) if $f_{i}$ uses memory $l$ .
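As an illustration of how a memory structure runs in parallel to the game, the sketch below simulates a play in which moves depend on the pair (memory state, current vertex) and the memory is updated on every traversed edge. The class `MemoryStructure`, the function `run`, and the three-vertex toy graph are assumptions made for this example only. For brevity, a single `move` function covers all vertices, which is harmless here because the vertices `l` and `r` have a single successor.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, List, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


@dataclass
class MemoryStructure:
    initial: int                          # the initial memory state mu_0
    update: Callable[[int, Edge], int]    # the update function delta: M x E -> M


def run(
    v0: Vertex,
    memory: MemoryStructure,
    move: Callable[[int, Vertex], Vertex],
    steps: int,
) -> List[Vertex]:
    """Return the first `steps` vertices of the play induced by `move`."""
    play, mu = [v0], memory.initial
    while len(play) < steps:
        v = play[-1]
        nxt = move(mu, v)                  # the choice depends on (mu, v) only
        mu = memory.update(mu, (v, nxt))   # the memory reads the traversed edge
        play.append(nxt)
    return play


# Hypothetical toy graph: "c" has successors "l" and "r"; both return to "c".
# Two memory states: flip the state whenever the token leaves "c".
two_states = MemoryStructure(initial=0, update=lambda mu, e: 1 - mu if e[0] == "c" else mu)
# One memory state: the memoryless case |M| = 1.
one_state = MemoryStructure(initial=0, update=lambda mu, e: 0)


def alternate(mu: int, v: Vertex) -> Vertex:
    if v == "c":
        return "l" if mu == 0 else "r"
    return "c"


ALTERNATING = run("c", two_states, alternate, 7)
MEMORYLESS = run("c", one_state, alternate, 7)
```

With two memory states the play alternates between `l` and `r`. With a single memory state the same `move` function can only ever pick one successor of `c`, which matches the view of a memoryless strategy as the case $|M|=1$.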

3 Is Memory for the Environment Helpful?

In zero-sum games, increasing the memory of the environment may only decrease the ability of the system to satisfy the specification. Formally, for every zero-sum game ${\mathcal{G}}$ and for every bound $m\geq 1$ , if Player $1$ wins against Player $2$ that uses memory $m$ , then for every $m^{\prime}\leq m$ , Player $1$ also wins against Player $2$ that uses memory $m^{\prime}$ , and possibly there is $m^{\prime\prime}>m$ such that Player $1$ loses against Player $2$ that uses memory $m^{\prime\prime}$ .

In this section we show that the picture in non-zero-sum games is different and more involved. First, the system may need memory in order to have a CRS solution in a game with memoryless objectives for all players. In addition, memory for the environment is required for the existence of a CRS solution in some cases yet prevents the existence of a CRS solution in other cases. Intuitively, memory for the environment enables the system to suggest to the environment richer strategies, but also enables the environment to have richer deviations.

We first describe cases in which increasing the memory of the system and the environment is required for a CRS solution, even in games with memoryless objectives for all players. Our examples are with two-player games, and thus apply to both NE-CRS and SNE-CRS.

Theorem 1.

There are two-player Büchi (or reachability) games ${\mathcal{G}}_{1}$ and ${\mathcal{G}}_{2}$ such that the following hold.

1. There is a CRS solution for ${\mathcal{G}}_{1}$ in which Player $1$ uses a memory of size $2$ and there is no CRS solution for ${\mathcal{G}}_{1}$ in which Player $1$ is memoryless.

2. There is a CRS solution for ${\mathcal{G}}_{2}$ in which Player $2$ uses a memory of size $2$ and there is no CRS solution for ${\mathcal{G}}_{2}$ in which Player $2$ is memoryless.

Proof.

We describe ${\mathcal{G}}_{1}$ and ${\mathcal{G}}_{2}$ with Büchi objectives. The same games and considerations apply when the games have reachability objectives. Consider the Büchi game ${\mathcal{G}}_{1}$ in Figure 1 (left). Let $\alpha_{1}=\{v_{3}\}$ and $\alpha_{2}=\{v_{1},v_{4}\}$. When drawing two-player games, we use circles and boxes to describe the vertices in $V_{1}$ and $V_{2}$, respectively.

Figure 1: The games ${\mathcal{G}}_{1}$ and ${\mathcal{G}}_{2}$.

Consider the strategy $f_{1}$ for Player $1$ that, in $v_{2}$ , alternates between $v_{3}$ and $v_{4}$ . That is $f_{1}(v_{0},v_{2},(v_{3},v_{2},v_{4},v_{2})^{*})=v_{3}$ and $f_{1}(v_{0},v_{2},(v_{3},v_{2},v_{4},v_{2})^{*},v_{3},v_{2})=v_{4}$ . Note that in order to implement the alternation, the strategy $f_{1}$ requires a memory of size $2$ . Consider the strategy $f_{2}$ for Player $2$ that, in $v_{0}$ , takes the token down to $v_{2}$ . Note that the profile $\langle f_{1},f_{2}\rangle$ is a CRS solution. Indeed, its outcome is $v_{0},(v_{2},v_{3},v_{2},v_{4})^{\omega}$ , which visits both $v_{3}$ and $v_{4}$ infinitely often. Thus, both objectives are satisfied, and the profile is a CRS solution.

On the other hand, consider a memoryless strategy for Player $1$ . If, in $v_{0}$ , Player $2$ moves the token to $v_{1}$ , then the play reaches and stays forever in $v_{1}$ no matter what the strategy of Player $1$ is, and $\alpha_{1}$ is not satisfied. If, in $v_{0}$ , Player $2$ moves the token to $v_{2}$ , then either, in $v_{2}$ , Player $1$ always moves the token to $v_{4}$ , in which case the play never visits $v_{3}$ , and so $\alpha_{1}$ is not satisfied, or Player $1$ always moves the token to $v_{3}$ , in which case $\alpha_{2}$ is not satisfied, causing Player $2$ to deviate in $v_{0}$ . Thus, there is no CRS solution in which Player $1$ uses a memoryless strategy.
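The case analysis above can be replayed mechanically. The sketch below enumerates all memoryless choices in ${\mathcal{G}}_{1}$ and checks whether any of them yields a CRS solution. The edge set is reconstructed from the proof text rather than from Figure 1, so it should be read as an assumption; it also assumes that $v_{0}$ is the only vertex in which Player $2$ has a choice and that $v_{0}$ is never revisited, so that the deviations of Player $2$ amount to choosing the other successor of $v_{0}$. All names in the code are hypothetical.

```python
from itertools import product

# Edges of G_1 as reconstructed from the proof text (an assumption).
EDGES = {
    "v0": ["v1", "v2"],   # Player 2 chooses here
    "v1": ["v1"],
    "v2": ["v3", "v4"],   # Player 1 chooses here
    "v3": ["v2"],
    "v4": ["v2"],
}
ALPHA = {1: {"v3"}, 2: {"v1", "v4"}}   # Büchi objectives of the two players


def winners(choice):
    """Players whose Büchi objective holds in the outcome of a memoryless profile.

    `choice` maps every vertex to the successor chosen in it.
    """
    seen, v = [], "v0"
    while v not in seen:              # follow the play until a vertex repeats
        seen.append(v)
        v = choice[v]
    cycle = set(seen[seen.index(v):])  # the vertices visited infinitely often
    return {i for i, alpha in ALPHA.items() if cycle & alpha}


def memoryless_crs_solutions():
    solutions = []
    for at_v0, at_v2 in product(EDGES["v0"], EDGES["v2"]):
        choice = {"v0": at_v0, "v1": "v1", "v2": at_v2, "v3": "v2", "v4": "v2"}
        win = winners(choice)
        if 1 not in win:
            continue                  # the system objective is not satisfied
        # Player 2 has no beneficial deviation if she already wins, or if no
        # alternative move in v0 makes her win.
        stable = 2 in win or all(
            2 not in winners({**choice, "v0": alt}) for alt in EDGES["v0"]
        )
        if stable:
            solutions.append((at_v0, at_v2))
    return solutions
```

Under these assumptions, `memoryless_crs_solutions()` returns an empty list, in line with the argument that Player $1$ needs memory in ${\mathcal{G}}_{1}$.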

Consider now the Büchi game ${\mathcal{G}}_{2}$ in Figure 1 (right). Let $\alpha_{1}=\{v_{3}\}$ and $\alpha_{2}=\{v_{1},v_{4}\}$ . Note that all the vertices with more than one successor belong to Player $2$ , and so there is a single strategy $f_{1}$ for Player $1$ in the game. Consider the strategy $f_{2}$ for Player $2$ that, in $v_{0}$ , takes the token down to $v_{2}$ , and, in $v_{2}$ , alternates between $v_{3}$ and $v_{4}$ . That is $f_{2}(v_{0},v_{2},(v_{3},v_{2},v_{4},v_{2})^{*})=v_{3}$ and $f_{2}(v_{0},v_{2},(v_{3},v_{2},v_{4},v_{2})^{*},v_{3},v_{2})=v_{4}$ . Note that in order to implement the alternation, the strategy $f_{2}$ requires a memory of size $2$ . It is easy to see that the profile $\langle f_{1},f_{2}\rangle$ is a CRS solution. Indeed, its outcome is $v_{0},(v_{2},v_{3},v_{2},v_{4})^{\omega}$ , which satisfies both objectives.

On the other hand, consider a memoryless strategy for Player $2$ . If, in $v_{0}$ , Player $2$ moves the token to $v_{1}$ , then the play reaches and stays forever in $v_{1}$ and $\alpha_{1}$ is not satisfied. If, in $v_{0}$ , Player $2$ moves the token to $v_{2}$ , then either, in $v_{2}$ , Player $2$ always moves the token to $v_{4}$ , in which case $\alpha_{1}$ is not satisfied, or Player $2$ always moves the token to $v_{3}$ , in which case $\alpha_{2}$ is not satisfied, causing Player $2$ to deviate in either $v_{0}$ or $v_{2}$ . Thus, there is no CRS solution in which Player $2$ uses a memoryless strategy. $\hfill\blacktriangleleft$

The example of ${\mathcal{G}}_{2}$ in Theorem 1 shows that increasing the memory of the environment may help the system to achieve a CRS solution. We continue and examine whether this is always the case. We first need some notations. Consider a non-zero-sum game ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ . For $m\geq 1$ , we say that a profile $\pi=\langle f_{1},f_{2},\ldots,f_{k}\rangle$ is an $m$ -bounded $1$ -fixed NE if for every $2\leq i\leq k$ , the strategy $f_{i}$ uses memory at most $m$ , and if $i\in{\sf Lose}(\pi)$ , then for every strategy $f^{\prime}_{i}$ for Player $i$ that uses memory at most $m$ , we have that $i\in{\sf Lose}(\pi[i\leftarrow f^{\prime}_{i}])$ . Thus, environment players are restricted to strategies that use memory at most $m$ , in both $\pi$ and their deviations. Likewise, $\pi$ is an $m$ -bounded $1$ -fixed SNE if all the strategies of the environment players in $\pi$ use memory at most $m$ and no coalition of environment players can jointly deviate to strategies that use memory at most $m$ in a way that strictly benefits all its members. Then, we say that $\pi$ is an $m$ -bounded NE-CRS (SNE-CRS) solution if $\pi$ is an $m$ -bounded $1$ -fixed NE (SNE, respectively) that satisfies $\alpha_{1}$ . Note that the usual CRS problem coincides with the case $m=\infty$ .

We first show that, unsurprisingly, when the objectives of the environment players require memory, in particular when they consist of a conjunction of objectives, then a system may have a CRS solution only thanks to bounds on the memory of the environment players.

Theorem 2.

There is a two-player generalized-Büchi (or generalized reachability) game ${\mathcal{G}}_{3}$ such that there is no CRS solution for ${\mathcal{G}}_{3}$ , yet there is a $1$ -bounded CRS solution for ${\mathcal{G}}_{3}$ .

Proof.

We prove the theorem for the generalized Büchi case. The same game and considerations apply for generalized-reachability objectives.² Consider the generalized Büchi game ${\mathcal{G}}_{3}$ played on the graph of ${\mathcal{G}}_{2}$ (Figure 1, right), now with $\alpha_{1}=\{\{v_{1}\}\}$ and $\alpha_{2}=\{\{v_{3}\},\{v_{4}\}\}$. Note that Player $1$ has a Büchi objective.

² The application to reachability is less straightforward here. In particular, for Büchi, one could give up the vertex $v_{2}$ and let $v_{0}$ have three successors. For reachability, we need the decision of Player $2$ about not visiting $v_{1}$ in her first transition to be un-recoverable.

Recall that all the vertices with more than one successor belong to Player $2$, and so there is a single strategy $f_{1}$ for Player $1$ in the game. Consider the strategy $f_{2}$ for Player $2$ that, in $v_{0}$, takes the token to $v_{1}$, where it loops forever. Clearly, the induced play satisfies $\alpha_{1}$. When Player $2$ is restricted to memoryless strategies, she cannot satisfy her objective, making this profile a $1$-bounded CRS solution. On the other hand, when Player $2$ is not memoryless, she can take the token down to $v_{2}$ and then alternate between $v_{3}$ and $v_{4}$. Hence, when Player $2$ has memory $2$ or more, she would deviate from every profile that leads to $v_{1}$, and so there is no CRS solution in ${\mathcal{G}}_{3}$. $\hfill\blacktriangleleft$

The example of ${\mathcal{G}}_{3}$ in the proof of Theorem 2 heavily relies on Player $2$ having an objective that requires memory. We now show that for SNE-CRSs, when a deviation of a player may need to be beneficial to several players, a system may have an SNE-CRS solution only thanks to bounds on the memory of the environment players, even when all players have memoryless objectives.

Theorem 3.

There is a $3$ -player Büchi game ${\mathcal{G}}_{4}$ such that there is no SNE-CRS solution for ${\mathcal{G}}_{4}$ , yet there is a $1$ -bounded SNE-CRS solution for ${\mathcal{G}}_{4}$ .

Proof.

We prove the theorem for the Büchi case. The same game and considerations apply for reachability objectives. Consider the $3$ -player Büchi game ${\mathcal{G}}_{4}$ in Figure 2. We use diamonds to denote vertices controlled by Player $3$ . Let $\alpha_{1}=\{v_{1}\}$ , $\alpha_{2}=\{v_{3}\}$ , and $\alpha_{3}=\{v_{4}\}$ .

Figure 2: The game ${\mathcal{G}}_{4}$ .

Note that all the vertices with more than one successor belong to Player $2$ or Player $3$ , and so there is a single strategy $f_{1}$ for Player $1$ in the game. We first describe a $1$ -bounded SNE-CRS solution for ${\mathcal{G}}_{4}$ . Let $f_{2}$ be the memoryless strategy for Player $2$ that, in $v_{0}$ , takes the token to $v_{1}$ and, in $v_{2}$ , takes the token to $v_{3}$ . Let $f_{3}$ be the memoryless strategy for Player $3$ that loops in $v_{5}$ , and let $\pi=\langle f_{1},f_{2},f_{3}\rangle$ . Clearly, ${\sf outcome}(\pi)=v_{0},v_{1}^{\omega}$ , and so ${\sf Win}(\pi)=\{1\}$ . We prove that Player $2$ and Player $3$ cannot deviate to memoryless strategies in a way that causes their objectives to be satisfied. Clearly, a deviation by Player $3$ alone cannot affect the outcome of the game. Also, a deviation by Player $2$ alone cannot cause the outcome to reach $v_{3}$ or $v_{4}$ . Consider now a joint deviation of Player $2$ and Player $3$ . When Player $2$ uses a memoryless strategy, she cannot cause the outcome of the profile to visit both $v_{3}$ and $v_{4}$ infinitely often. Thus, the deviation is not beneficial to both Player $2$ and Player $3$ . Hence, $\pi$ is a $1$ -bounded SNE-CRS solution.

On the other hand, when Player $2$ has memory $m\geq 2$ , then for every profile whose outcome reaches $v_{1}$ , Player $2$ and Player $3$ would deviate to an outcome that satisfies both their objectives, and so no SNE-CRS solution exists for ${\mathcal{G}}_{4}$ . $\hfill\blacktriangleleft$

We now complete the picture and show that when the objectives of the environment players are memoryless and deviations are allowed only for single players, then adding memory to the environment may only help the system. Thus, for NE-CRS with memoryless objectives, we cannot have examples such as those in Theorems 2 and 3.

Theorem 4.

For every Büchi (or reachability) game ${\mathcal{G}}$ , if there is a $1$ -bounded NE-CRS solution in ${\mathcal{G}}$ , then there is also an NE-CRS solution in ${\mathcal{G}}$ .

Proof.

Consider a Büchi (or reachability) game ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ . Let $\pi=\langle f_{1},f_{2},\ldots,f_{k}\rangle$ be a $1$ -bounded NE-CRS solution. We prove that there exists a strategy $g_{1}$ for Player $1$ such that the profile $\pi^{\prime}=\pi[1\leftarrow g_{1}]$ is an NE-CRS solution.³

³ In Appendix A.1, we show that the transition to $g_{1}$ is essential, thus $\pi$ need not be an NE-CRS solution. In fact, the example is stronger, showing that a profile $\pi$ may be a $1$ -bounded NE-CRS solution and still no profile in which the system follows its strategy in $\pi$ is an NE-CRS solution. We also show that the strategy of Player $1$ in a $1$ -bounded NE-CRS solution may require memory of size $2$ .

For every $i\in{\sf Env}$ , let $G^{\pi,i}$ be the graph obtained from $G$ by removing edges that leave vertices in $V_{j}$ that do not agree with the memoryless strategy $f_{j}$ , for all $j\in[k]\setminus\{1,i\}$ . Consider the zero-sum two-player game ${\mathcal{G}}^{\pi,i}=\langle G^{\pi,i},\alpha_{i}\rangle$ between Player $i$ and Player $1$ . Let $W_{i}\subseteq V$ be the winning region of Player $i$ in ${\mathcal{G}}^{\pi,i}$ , and let $f^{\prime}_{i}$ be a memoryless strategy for Player $i$ that is winning from all the vertices in $W_{i}$ . That is, $v\in W_{i}$ iff for every strategy $f^{\prime}_{1}$ for Player $1$ , the outcome in ${\mathcal{G}}^{\pi,i}$ of $f^{\prime}_{i}$ and $f^{\prime}_{1}$ from $v$ satisfies $\alpha_{i}$ . Likewise, let $f^{i}_{1}$ be a memoryless strategy for Player $1$ in ${\mathcal{G}}^{\pi,i}$ that is winning in all vertices not in $W_{i}$ . That is, $v\not\in W_{i}$ iff for every strategy $g_{i}$ for Player $i$ , the outcome in ${\mathcal{G}}^{\pi,i}$ of $g_{i}$ and $f^{i}_{1}$ from $v$ does not satisfy $\alpha_{i}$ . Note that since $\alpha_{i}$ is a Büchi objective, memoryless strategies $f^{\prime}_{i}$ and $f^{i}_{1}$ exist.

Let $\rho=v_{0},v_{1},v_{2},\ldots={\sf outcome}(\pi)$ . We first argue that for all $i\in{\sf Lose}(\pi)$ and vertices $v\in{\sf reach}(\rho)\cap V_{i}$ , we have that $v\notin W_{i}$ . To see this, assume by way of contradiction that there exists $i\in{\sf Lose}(\pi)$ and a vertex $v\in{\sf reach}(\rho)\cap V_{i}$ such that $v\in W_{i}$ . Let $v=v_{j}$ be the first such vertex. That is, $v_{0},\ldots,v_{j-1}$ are all not in $W_{i}$ and $v_{j}\in W_{i}$ . Consider the strategy $g_{i}$ for Player $i$ that agrees with $f_{i}$ in the vertices $v_{0},\ldots,v_{j-1}$ and agrees with $f^{\prime}_{i}$ in all other vertices. Consider the profile $\pi^{\prime}_{i}=\pi[i\leftarrow g_{i}]$ . Note that $g_{i}$ is memoryless, ${\sf outcome}(\pi^{\prime}_{i})$ has a prefix $v_{0},\ldots,v_{j}$ , and, as $f^{\prime}_{i}$ is winning in ${\mathcal{G}}^{\pi,i}$ , the outcome continues in a way that satisfies $\alpha_{i}$ . Thus, $i\in{\sf Win}(\pi[i\leftarrow g_{i}])$ , contradicting the fact that $\pi$ is a $1$ -bounded $1$ -fixed NE.

We can now define the strategy $g_{1}$ , as follows. As long as the generated play follows $\rho$ , then $g_{1}$ agrees with $f_{1}$ . If for some $i\in{\sf Env}$ and vertex $v\in{\sf reach}(\rho)\cap V_{i}$ , Player $i$ deviates and moves the token to a successor of $v$ that is different from $f_{i}(v)$ , then $g_{1}$ follows the strategy $f^{i}_{1}$ for the rest of the game. Note that $g_{1}$ need not be memoryless (even when $f_{1}$ is memoryless).

We argue that the profile $\pi^{\prime}=\pi[1\leftarrow g_{1}]$ is an NE-CRS solution. First, since $g_{1}$ agrees with $f_{1}$ as long as the environment players follow their strategies in $\pi$ , then ${\sf outcome}(\pi^{\prime})={\sf outcome}(\pi)$ , and so ${\sf Win}(\pi^{\prime})={\sf Win}(\pi)$ . Thus, $1\in{\sf Win}(\pi^{\prime})$ . It is left to show that $\pi^{\prime}$ is a $1$ -fixed NE. Consider a player $i\in{\sf Lose}(\pi)$ and a strategy $f^{\prime}_{i}$ for Player $i$ . Let $h\cdot v\in V^{*}\cdot V_{i}$ be the longest prefix of ${\sf outcome}(\pi^{\prime}[i\leftarrow f^{\prime}_{i}])$ that agrees with $\rho$ . Thus, $f^{\prime}_{i}(h\cdot v)\neq f_{i}(v)$ , and so, by its definition, the strategy $g_{1}$ starts to follow the strategy $f_{1}^{i}$ after the history $h\cdot v$ . Since $v\in{\sf reach}(\rho)\cap V_{i}$ , then $v\not\in W_{i}$ . Therefore, the strategy $f_{1}^{i}$ is winning in ${\mathcal{G}}^{\pi,i}$ when the game starts in $v$ . Hence, ${\sf outcome}(\pi^{\prime}[i\leftarrow f^{\prime}_{i}])$ does not satisfy $\alpha_{i}$ , and so $f^{\prime}_{i}$ is not a beneficial deviation for Player $i$ . Thus, $\pi^{\prime}$ is an NE-CRS solution, and we are done. $\hfill\blacktriangleleft$
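The switching behaviour of $g_{1}$ in the proof of Theorem 4 can be sketched as a strategy with a small memory: follow $f_{1}$ while no environment player has left her prescribed move, and follow $f^{i}_{1}$ once Player $i$ deviates. The class and function names below are hypothetical, and the toy game at the end is an assumption that only exercises the switch; it is not claimed to be an instance in which $\pi$ is a $1$ -bounded NE-CRS solution.

```python
class PunishingStrategy:
    """Sketch of g_1: follow f_1 on the path, punish the first deviator afterwards."""

    def __init__(self, f1, env, owner, punish):
        self.f1 = f1          # memoryless f_1: vertex -> successor
        self.env = env        # i -> memoryless f_i: vertex -> successor
        self.owner = owner    # vertex -> index of the owning player
        self.punish = punish  # i -> memoryless f_1^i: vertex -> successor
        self.mode = None      # None while on the path, otherwise the punished player

    def observe(self, u, v):
        """Update the memory after the token moved from u to v."""
        i = self.owner[u]
        if self.mode is None and i != 1 and self.env[i][u] != v:
            self.mode = i

    def move(self, v):
        """Successor chosen by Player 1 at her vertex v."""
        table = self.f1 if self.mode is None else self.punish[self.mode]
        return table[v]


def run(g1, env, owner, start, deviation=None, steps=6):
    """Play a few moves; `deviation` overrides the environment's choice at some vertices."""
    deviation = deviation or {}
    path, u = [start], start
    for _ in range(steps):
        if owner[u] == 1:
            v = g1.move(u)
        else:
            v = deviation.get(u, env[owner[u]][u])
        g1.observe(u, v)
        path.append(v)
        u = v
    return path


# Hypothetical toy game: Player 2 owns a, Player 1 owns everything else.
OWNER = {"a": 2, "b": 1, "c": 1, "good": 1, "bad": 1}
F1 = {"b": "b", "c": "good", "good": "good", "bad": "bad"}
ENV = {2: {"a": "b"}}
PUNISH = {2: {"b": "b", "c": "bad", "good": "good", "bad": "bad"}}
```

The memory of `PunishingStrategy` has one state per environment player plus the on-path state, which is why $g_{1}$ need not be memoryless even when $f_{1}$ is.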

4 Memory Requirements for NE-CRS

In this section we consider the memory requirements for NE-CRS, namely when the solution concept is an NE. Consider a Büchi game ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ . In [31], Ummels shows⁴ that for a desired set $W$ of winners, if there exists a $W$ -NE in ${\mathcal{G}}$ , then there exists a $W$ -NE in which the memory of all players is of size $O(k)$ . Since an NE-CRS solution corresponds to a $1$ -fixed $W$ -NE with $1\in W$ , and once $1\in W$ , then a profile is a $1$ -fixed $W$ -NE iff it is a $W$ -NE, the result provides an upper bound also to our problem. Below we prove a matching lower bound. The proof is not too complicated, and mainly serves as a warm-up to the study of SNE-CRS.

⁴ The study in [31] considers Streett objectives, and includes also the parameter of the number of pairs in the objectives. It also combines the memory of the strategy with the size of the state space. The presentation here includes a straightforward adjustment to our setting.

The lower bound holds already for the class of sink games, defined below. Consider a graph $G$ . A vertex in $G$ is a sink if it has only one outgoing edge, which is a self-loop. Then, a $k$ -player sink game ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ is a game in which the only cycles in $G$ are sinks, and for every $i\in[k]$ , the objective $\alpha_{i}\subseteq V$ is a set of sinks. Note that since once a play reaches a sink it stays there forever, the objective in sink games can be described using reachability, avoid, Büchi, or co-Büchi objectives.

Theorem 5.

For every $k>2$ , we can construct a $k$ -player sink game ${\mathcal{G}}_{k}$ such that ${\mathcal{G}}_{k}$ has an NE-CRS solution, and every NE-CRS solution for ${\mathcal{G}}_{k}$ requires all the environment players to have memory $k-2$ .

Proof.

We define ${\mathcal{G}}_{k}$ as follows (see an illustration for the case $k=4$ in Figure 3).

Figure 3: The game $G_{4}$ . Each vertex is labeled by its owner (top). The colors correspond to owners.

A play in $G_{k}$ starts at the initial vertex $d_{2}$ . For each $i\in{\sf Env}$ , Player $i$ controls the vertex $d_{i}$ and decides whether to move the token to $d_{i+1}$ or to a vertex $c_{j}$ for some $j\in{\sf Env}\setminus{\{i\}}$ . For each $j\in{\sf Env}$ , Player $j$ also controls the vertex $c_{j}$ . From $c_{j}$ , Player $j$ chooses a successor $s_{i}$ for some $i\in{\sf Env}$ . The vertex $s_{i}$ is a sink that satisfies the objectives of all players in ${\sf Env}\setminus\{i\}$ . The vertex $d_{k+1}$ is a sink that satisfies the objective of Player $1$ .

Note that the only play in which Player $1$ achieves her objective is $d_{2},d_{3},\ldots,d_{k},(d_{k+1})^{\omega}$ , in which all players in ${\sf Env}$ , when controlling their $d_{i}$ vertices, choose to move the token towards $d_{k+1}$ . If for some $i\in{\sf Env}$ , Player $i$ instead chooses to move the token to a vertex $c_{j}$ for some $j\in{\sf Env}\setminus{\{i\}}$ , then Player $j$ can move the token from $c_{j}$ to $s_{i}$ , making the deviation non-beneficial for Player $i$ . Thus, an NE-CRS solution for ${\mathcal{G}}_{k}$ consists of strategies that direct the token to $d_{k+1}$ and “punish” players that do not follow such a direction. Consider $j\in{\sf Env}$ . In order to punish Player $i$ by moving the token from $c_{j}$ to $s_{i}$ , Player $j$ has to remember the vertex from which the token has reached $c_{j}$ . Since there are $k-2$ such possible vertices, ${\mathcal{G}}_{k}$ has an NE-CRS solution where the strategy of each player uses $k-2$ memory. Also, if for some $i\in{\sf Env}$ , the strategy of Player $i$ has a memory smaller than $k-2$ , then there exists a player $j\in{\sf Env}\setminus{\{i\}}$ that can move from $d_{j}$ to $c_{i}$ without being punished, which implies that the profile is not an NE-CRS solution.

In Appendix A.2, we describe ${\mathcal{G}}_{k}$ and prove the bounds formally. $\hfill\blacktriangleleft$
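The punishment mechanism in ${\mathcal{G}}_{k}$ can be simulated directly. The sketch below follows the informal description in the proof; the vertex encoding (`("d", i)`, `("c", j)`, `("s", i)`) and all function names are assumptions made for illustration. It plays the profile in which every player moves right and every $c_{j}$ punishes the player whose vertex preceded it, and it also models a memoryless choice at $c_{j}$ , which always leaves some deviator unpunished when $k\geq 4$ .

```python
def winners(sink, k):
    """Players whose objective contains the given sink."""
    if sink == ("d", k + 1):
        return {1}
    return set(range(2, k + 1)) - {sink[1]}  # s_i is winning for Env \ {i}


def play(k, profile):
    """Run the game from d_2; profile[i] maps a history (tuple) to a successor."""
    history = [("d", 2)]
    while history[-1][0] in ("d", "c") and history[-1] != ("d", k + 1):
        owner = history[-1][1]  # d_i is owned by Player i, c_j by Player j
        history.append(profile[owner](tuple(history)))
    return history[-1]


def honest(k):
    """Move right on d-vertices; at c_j, punish the owner of the preceding d-vertex."""
    def f(history):
        kind, idx = history[-1]
        if kind == "d":
            return ("d", idx + 1)
        return ("s", history[-2][1])  # uses memory: where did the token come from?
    return {j: f for j in range(2, k + 1)}


def with_deviation(profile, i, j):
    """Player i moves from d_i to c_j and otherwise keeps her strategy."""
    base, new = profile[i], dict(profile)
    new[i] = lambda h: ("c", j) if h[-1] == ("d", i) else base(h)
    return new


def with_memoryless_c(profile, j, x):
    """Player j always moves from c_j to s_x, ignoring the history."""
    base, new = profile[j], dict(profile)
    new[j] = lambda h: ("s", x) if h[-1] == ("c", j) else base(h)
    return new
```

For example, with `k = 4`, every single-player deviation from `honest(4)` ends in the sink that is losing for the deviator, whereas any fixed choice `x` at `c_j` is exploited by a player other than `j` and `x`.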

5 On the SNE-CRS Problem

In this section we analyze the complexity of the SNE-CRS problem and the memory required for the players in an SNE-CRS solution. For both problems, the upper bounds follow easily from the study of robust equilibrium and SSE. Our main contributions are the lower bounds. For the complexity of the SNE-CRS problem, we relate the collaboration of players in SNE with the concept of non-cooperation in rational synthesis. For the results on the memory, lower bounds are open also for the other solution concepts, and we show that our contribution applies for them too.

5.1 Solving SNE-CRS

In this section we prove that the SNE-CRS problem is PSPACE-complete. The upper bound is similar to the one presented for other types of equilibria with respect to deviations by a coalition of players and is based on adjusting the objectives in the deviator game of Brenguier [6] to the solution concept of SNE. The adjustment is quite straightforward, and we describe it in the full version. The PSPACE algorithm holds for every objective that can be translated in polynomial space to an Emerson-Lei objective⁵ [16]. This clearly includes (but is not limited to) Büchi and co-Büchi objectives.

⁵ An Emerson-Lei objective is given by a Boolean assertion $\theta$ over subsets of $V$ . A play $\rho\in V^{\omega}$ induces an assignment $f_{\rho}:2^{V}\rightarrow\{F,T\}$ , where for every set $S\in 2^{V}$ , we have that $f_{\rho}(S)=T$ iff ${\sf inf}(\rho)\cap S\neq\emptyset$ . Then, $\rho$ satisfies $\theta$ iff $f_{\rho}$ satisfies $\theta$ . For example, the Emerson-Lei objective $\alpha_{1}\wedge\alpha_{2}\wedge\cdots\wedge\alpha_{k}$ is equivalent to the generalized Büchi objective $\{\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\}$ , and the objective $\alpha_{1}\wedge\neg\alpha_{2}$ is satisfied by plays that visit vertices in $\alpha_{1}$ infinitely often and visit vertices in $\alpha_{2}$ only finitely often.
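The semantics of Emerson-Lei objectives can be illustrated by a small evaluator. This is a sketch under an assumed encoding: a formula is a nested tuple whose atoms are `("set", S)` for a vertex set `S`, and the play is represented only by the set of vertices it visits infinitely often. The function name `holds` is hypothetical.

```python
def holds(theta, inf):
    """Evaluate a Boolean assertion over vertex sets against inf(rho)."""
    op = theta[0]
    if op == "set":  # atom S is true iff inf(rho) intersects S
        return bool(inf & theta[1])
    if op == "not":
        return not holds(theta[1], inf)
    if op == "and":
        return all(holds(t, inf) for t in theta[1:])
    if op == "or":
        return any(holds(t, inf) for t in theta[1:])
    raise ValueError(f"unknown operator: {op}")


# Two atoms over hypothetical vertices u and w.
A1 = ("set", {"u"})
A2 = ("set", {"w"})
GENERALIZED_BUCHI = ("and", A1, A2)      # visit both sets infinitely often
BUCHI_AND_CO_BUCHI = ("and", A1, ("not", A2))  # A1 infinitely often, A2 finitely often
```

With this encoding, a Büchi objective is a single atom and a co-Büchi objective is the negation of a single atom.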

Our lower bound involves an interesting relation between the collaboration of players in an SNE and the concept of non-cooperation in rational synthesis. We start with a definition of the latter. Recall that in CRS, we assume that the environment players are collaborative, in the sense that they would follow a suggested equilibrium. In non-cooperative rational synthesis (NRS), we cannot suggest a strategy to the environment players and only know they would reach an equilibrium [21]. Accordingly, for NE-NRS, the goal is to return a strategy $f_{1}$ for Player $1$ such that Player $1$ wins in every $1$ -fixed NE $\langle f_{1},f_{2},\ldots,f_{k}\rangle$ . In other words, Player $1$ follows $f_{1}$ , and no matter how the environment players behave, as long as they are rational, and so the resulting profile is a $1$ -fixed NE, the objective of Player $1$ is satisfied.

Note that in NE-NRS, the system has to cope only with deviations of single players, yet the players are non-cooperative. On the other hand, in SNE-CRS, the system has to cope with deviations of coalitions of players, yet the players are cooperative, and would follow a suggested $1$ -fixed SNE. In this section we relate NE-NRS with SNE-CRS, showing that the challenge of coping with deviations of coalitions is similar to the challenge of coping with non-cooperation. Intuitively, in both cases, all the environment players may deviate simultaneously, as long as these deviations are beneficial for them.

We formalize the connection by describing a class ${\cal C}$ of games such that for every game ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ in the class ${\cal C}$ , there is a game ${\mathcal{G}}^{\prime}=\langle G,\{\alpha^{\prime}_{i}\}_{i\in[k]}\rangle$ such that there is an NE-NRS in ${\mathcal{G}}$ iff there is an SNE-CRS in ${\mathcal{G}}^{\prime}$ . Note that ${\mathcal{G}}$ and ${\mathcal{G}}^{\prime}$ are defined with respect to the same game graph, and only the objectives are different. Moreover, the class ${\cal C}$ is strong, in the sense that the PSPACE-hardness of NE-NRS applies already to a game in ${\cal C}$ [13]. Accordingly, the results in this section, beyond the interesting connection between non-cooperation and coalitions, also imply PSPACE-hardness for the problem of SNE-CRS.

In the rest of this section we define the class ${\cal C}$ and describe the reduction from ${\mathcal{G}}$ to ${\mathcal{G}}^{\prime}$ . Recall that in a sink game, the only cycles are sinks, and objectives are sets of sinks. The class ${\cal C}$ consists of special sink games, defined below, which restrict sink games further. Nevertheless, the class of special sink games captures the PSPACE-hardness of NE-NRS for all types of objectives, and we are going to relate NE-NRS solutions in special sink games to SNE-CRS solutions in sink games.

Definition 6.

A $k$ -player sink game ${\mathcal{G}}=\langle G,\{\alpha_{i}\}_{i\in[k]}\rangle$ , with $G=\langle\{V_{i}\}_{i\in[k]},v_{0},E\rangle$ , is special if the following holds:

1. $\alpha_{1}=\bigcap_{i\in\{3,\ldots,k\}}\alpha_{i}$ .
2. $\alpha_{2}=V$ .
3. For every $i\in\{3,\ldots,k\}$ and vertex $v\in V_{i}$ , there is a vertex $u\in\alpha_{1}$ such that $\langle v,u\rangle\in E$ .

Note that the third condition implies that $\alpha_{1}$ is not empty and that whenever a token reaches a vertex of Player $i$ , for $i\in\{3,\ldots,k\}$ , she can move the token to $\alpha_{1}$ and cause all players to win.
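The three conditions of Definition 6 are easy to check on an explicit game. The sketch below applies them literally; it does not verify that the input is a sink game, it assumes $k\geq 3$ , and the data layout (`owner`, `edges`, `alpha`) and the toy game are hypothetical.

```python
def is_special(owner, edges, alpha, k):
    """Check the three conditions of a special sink game (assumes k >= 3).

    owner: vertex -> player index
    edges: vertex -> set of successors
    alpha: player index -> set of vertices (the objective)
    """
    vertices = set(owner)
    # 1. alpha_1 is the intersection of alpha_3, ..., alpha_k.
    cond1 = alpha[1] == set.intersection(*(set(alpha[i]) for i in range(3, k + 1)))
    # 2. alpha_2 is the whole vertex set.
    cond2 = alpha[2] == vertices
    # 3. Every vertex of Players 3..k has a successor in alpha_1.
    cond3 = all(edges[v] & alpha[1] for v in vertices if owner[v] >= 3)
    return cond1 and cond2 and cond3


# Hypothetical 3-player example: Player 1 owns v0 and the sinks, Player 3 owns a.
OWNER = {"v0": 1, "a": 3, "good": 1, "bad": 1}
EDGES = {"v0": {"a"}, "a": {"good", "bad"}, "good": {"good"}, "bad": {"bad"}}
ALPHA = {1: {"good"}, 2: set(OWNER), 3: {"good"}}
```

In the toy game, Player $3$ can always move from `a` to `good`, which is the situation described in the remark following the definition.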

Theorem 7.

Consider a $k$ -player special sink game ${\mathcal{G}}=\langle G,\{\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\}\rangle$ . The game ${\mathcal{G}}$ has an NE-NRS solution iff the game ${\mathcal{G}}^{\prime}=\langle G,\{\alpha_{1},\alpha_{2}\setminus{\alpha_{1}},\alpha_{3}\setminus{\alpha_{1}},\ldots,\alpha_{k}\setminus{\alpha_{1}}\}\rangle$ has an SNE-CRS solution.

Proof.

Let $G=\langle\{V_{i}\}_{i\in[k]},v_{0},E\rangle$ , and let ${\sf Env}=\{2,\ldots,k\}$ be the set of environment players.

Given a play $p\in V^{\omega}$ in $G$ , let ${\sf relevant}(p)\subseteq{\sf Env}$ denote the set of environment players that own at least one vertex visited along $p$ . For a profile $\pi$ , we use ${\sf relevant}(\pi)$ to denote ${\sf relevant}({\sf outcome}(\pi))$ . Note that when we consider deviations in a $1$ -fixed NE $\pi$ , only deviations of players in ${\sf relevant}(\pi)$ are of interest. Indeed, for every $i\in{\sf Env}\setminus{\sf relevant}(\pi)$ , if Player $i$ changes her strategy, the outcome of the profile is not changed. We say that a play $p\in V^{\omega}$ is a trap for Player $1$ in ${\mathcal{G}}$ if it reaches a sink $s\not\in\alpha_{1}$ such that for every $i\in{\sf relevant}(p)$ , we have that $s\in\alpha_{i}$ . Given a strategy $f_{i}$ for Player $i$ , and a play $p=v_{0},v_{1},v_{2},\ldots$ in $G$ , we say that $p$ agrees with $f_{i}$ (and that $f_{i}$ agrees with $p$ ) if for every $j\geq 0$ , if $v_{j}\in V_{i}$ , then $f_{i}(v_{0},v_{1},\ldots,v_{j})=v_{j+1}$ .

In Appendix A.3, we prove that the following three claims are equivalent. The theorem follows from the equivalence between $(\mathbf{C}_{1})$ and $(\mathbf{C}_{2})$ .

$(\mathbf{C}_{1})$: The game ${\mathcal{G}}$ has an NE-NRS solution.
$(\mathbf{C}_{2})$: The game ${\mathcal{G}}^{\prime}$ has an SNE-CRS solution.
$(\mathbf{C}_{3})$: There exists a strategy $f_{1}$ for Player $1$ such that there does not exist a trap in ${\mathcal{G}}$ that agrees with $f_{1}$ .

Intuitively, both NE-NRS and SNE-CRS solutions consider stable profiles in which Player $1$ uses a strategy $f_{1}$ as described in $(\mathbf{C}_{3})$ . In NE-NRS, stability amounts to an NE, and the universal quantification on all stable profiles is explicit in the definition of NRS. In SNE-CRS, we do start with a single profile, but deviations of sets of players capture the many profiles that have to be considered in NRS. Then, as specified in $(\mathbf{C}_{3})$ , in the setting of special sink games, the relevant deviations correspond to traps, and the two notions coincide. $\hfill\blacktriangleleft$

Recall that Büchi and co-Büchi objectives are special cases of Emerson-Lei objectives, for which the SNE-CRS problem can be solved in PSPACE. Also, sink objectives are a special case of both Büchi and co-Büchi objectives. Since the NE-NRS problem is PSPACE-hard already for special sink games [13], Theorem 7 enables us to conclude with the following.

Theorem 8.

The SNE-CRS problem for Büchi and co-Büchi games is PSPACE-complete.

5.2 Memory Requirements for SNE-CRS

In this section we consider the memory requirements for SNE-CRS. The upper bound is similar to the one known for resilient-CRS and is based on an analysis of the memory required for the players in the corresponding deviator game. Essentially (see details in the full version), the players have to maintain in their memory the set of players who have deviated, as well as memory required for the satisfaction of the objectives of the winning players.

Theorem 9.

Consider a $k$ -player Büchi game ${\mathcal{G}}$ . If there exists an SNE-CRS solution in ${\mathcal{G}}$ , then there also exists an SNE-CRS solution in which the strategy of each player uses memory at most $2^{O(k)}$ .

Our main contribution is a lower bound, which was not studied before, and applies also to other solution concepts.

Theorem 10.

For every $k\geq 4$ , and $m\leq k-2$ , there is a sink game ${\mathcal{G}}_{k,m}=\langle G_{k},\{\alpha_{i}\}_{i\in[k]}\rangle$ such that ${\mathcal{G}}_{k,m}$ has an SNE-CRS solution, and every SNE-CRS solution requires $O(k)$ environment players to have memory $2^{\Theta(k)}$ .

Proof.

We define ${\mathcal{G}}_{k,m}$ as follows (see $G_{6,2}$ in Figure 4).

Figure 4: The sink game $G_{6,2}$ . Each vertex is labeled by its owner (top). The colors correspond to owners. Some edges are omitted for clarity. In particular, all the vertices in ${\cal S}_{2}\times\{1,2\}$ have an edge to the vertex $\bot$ .

Let ${\cal S}_{m}$ denote the set of all subsets of $\{3,\ldots,k\}$ of size $m$ . That is, ${\cal S}_{m}=\{A\subseteq\{3,\ldots,k\}:\,|A|=m\}$ . For every set of players $A\in{\cal S}_{m}$ and $l\in[m]$ , let $A[l]$ be the $l$ -th element in $A$ when the elements are ordered by the usual $\leq$ order on $\{3,\ldots,k\}$ .

The game begins at the initial vertex $q_{0}$ , which is owned by Player $2$ . From $q_{0}$ , Player $2$ chooses a subset $A\in{\cal S}_{m}$ and moves the token to the vertex $\langle A,1\rangle$ . For each subset $A\in{\cal S}_{m}$ and $l\in[m]$ , Player $A[l]$ controls the vertex $\langle A,l\rangle$ . If $l\neq m$ , then the successors of $\langle A,l\rangle$ are the vertices $\bot$ and $\langle A,l+1\rangle$ . If $l=m$ , then the successors of $\langle A,l\rangle$ are the vertices $\bot$ and $c_{j}$ , for $j\in\{3,\ldots,k\}$ . For each $i\in\{3,\ldots,k\}$ , Player $i$ controls the vertex $c_{i}$ . From $c_{i}$ , Player $i$ chooses a subset $A\in{\cal S}_{m}$ and moves the token to the vertex $\langle A,i,\mbox{{\sc p}}\rangle$ , where p is a symbol, indicating the game is in the “punishment” layer. For every $A\in{\cal S}_{m}$ and $i\in\{3,\ldots,k\}$ , the vertex $\langle A,i,\mbox{{\sc p}}\rangle$ is owned by Player $2$ , who chooses an index $j\in A$ and moves the token to the vertex $s_{i,j}$ . For every $i,j\in\{3,\ldots,k\}$ , the vertex $s_{i,j}$ is a sink that satisfies the objectives of all players in $[k]\setminus\{1,i,j\}$ . The vertex $\bot$ is a sink that satisfies the objective of Player $1$ .

The idea behind ${\mathcal{G}}_{k,m}$ is as follows. Consider a profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ . Note that if Player $1$ wins in $\pi$ , then its outcome reaches the vertex $\bot$ , in which case all the other players lose in $\pi$ . Also, Player $2$ wins in $\pi$ iff the outcome of $\pi$ reaches a sink of the form $s_{i,j}$ , in which case Player $1$ loses. Thus, the objectives of Player $1$ and Player $2$ complement each other. Assume that Player $1$ wins in $\pi$ , thus its outcome reaches $\bot$ . In order for $\pi$ to be an SNE, every deviation of players in $\{2,\ldots,k\}$ should result in a profile in which at least one of the players that deviate does not satisfy its objective. Thus, when the coalition of deviators is $C\subseteq\{2,\ldots,k\}$ , then there should be $i\in C$ such that the outcome of the new profile either continues to reach $\bot$ or, in case $i\neq 2$ , reaches a sink $s_{i,j}$ or $s_{j,i}$ for some $j\in\{3,\ldots,k\}$ . Before we explain why this implies the existence of an SNE-CRS solution, and why strategies for players $3,\ldots,k$ in any SNE-CRS solution require exponential memory, let us define ${\mathcal{G}}_{k,m}$ formally.

We define ${\mathcal{G}}_{k,m}=\langle\langle\{V_{i}\}_{i\in[k]},q_{0},E\rangle,\{\alpha_{i}\}_{i\in[k]}\rangle$ , as follows.

1. The set of vertices and its partition to owners is as follows.
   - $V_{1}=\{\bot\}\cup\{s_{i,j}:\,i,j\in\{3,\ldots,k\}\}$ .
   - $V_{2}=\{q_{0}\}\cup\{\langle A,i,\mbox{{\sc p}}\rangle:\,A\in{\cal S}_{m}\mbox{ and }i\in\{3,\ldots,k\}\}$ .
   - For every $i\in\{3,\ldots,k\}$ , we define $V_{i}=\{c_{i}\}\cup\{\langle A,l\rangle:\,A\in{\cal S}_{m},l\in[m],\mbox{ and }i=A[l]\}$ .
2. The set $E$ contains edges of the following types:
   - For every $A\in{\cal S}_{m}$ , there is an edge from $q_{0}$ to $\langle A,1\rangle$ .
   - For every $A\in{\cal S}_{m}$ and $l\in[m-1]$ , there is an edge from $\langle A,l\rangle$ to $\bot$ and to $\langle A,l+1\rangle$ .
   - For every $A\in{\cal S}_{m}$ , there is an edge from $\langle A,m\rangle$ to $\bot$ and to $c_{j}$ , for all $j\in\{3,\ldots,k\}$ .
   - For every $A\in{\cal S}_{m}$ and $i\in\{3,\ldots,k\}$ , there is an edge from $c_{i}$ to $\langle A,i,\mbox{{\sc p}}\rangle$ .
   - For every $A\in{\cal S}_{m}$ , $i\in\{3,\ldots,k\}$ , and $j\in A$ , there is an edge from $\langle A,i,\mbox{{\sc p}}\rangle$ to $s_{i,j}$ .
   - The vertices $\bot$ and $s_{i,j}$ , for all $i,j\in\{3,\ldots,k\}$ , are sinks.
3. The objectives of the players are defined as follows.
   - $\alpha_{1}=\{\bot\}$ .
   - For every $i\in\{2,\ldots,k\}$ , we have $\alpha_{i}=\{s_{l,j}:\,l,j\in\{3,\ldots,k\}\setminus{\{i\}}\}$ .
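The formal definition of ${\mathcal{G}}_{k,m}$ translates directly into a graph construction. The following sketch builds the ownership map, the edge relation, and the objectives; the vertex encoding (strings `"bot"` and `"q0"`, and tuples for the remaining vertices) and the function name `build_game` are assumptions made for illustration.

```python
from itertools import combinations


def build_game(k, m):
    """Return (owner, edges, alpha) for the sink game G_{k,m}."""
    players = range(3, k + 1)
    # S_m: all size-m subsets of {3..k}, each stored as a sorted tuple.
    subsets = [tuple(a) for a in combinations(players, m)]
    owner = {"bot": 1, "q0": 2}
    edges = {"bot": {"bot"}, "q0": {(a, 1) for a in subsets}}
    for i in players:
        owner[("c", i)] = i
        edges[("c", i)] = {(a, i, "P") for a in subsets}
        for j in players:
            owner[("s", i, j)] = 1
            edges[("s", i, j)] = {("s", i, j)}  # sinks only have a self-loop
    for a in subsets:
        for l in range(1, m + 1):
            owner[(a, l)] = a[l - 1]  # Player A[l]
            onward = {(a, l + 1)} if l < m else {("c", j) for j in players}
            edges[(a, l)] = {"bot"} | onward
        for i in players:
            owner[(a, i, "P")] = 2  # punishment layer, owned by Player 2
            edges[(a, i, "P")] = {("s", i, j) for j in a}
    alpha = {1: {"bot"}}
    for i in range(2, k + 1):
        alpha[i] = {
            ("s", l, j) for l in players for j in players if i not in (l, j)
        }
    return owner, edges, alpha
```

For `k = 6` and `m = 2`, the construction yields 58 vertices: 2 for $\bot$ and $q_{0}$ , 4 vertices $c_{i}$ , 16 sinks $s_{i,j}$ , 12 vertices $\langle A,l\rangle$ , and 24 vertices in the punishment layer.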

We first describe an SNE-CRS solution $\pi=\langle f_{1},\ldots,f_{k}\rangle$ in ${\mathcal{G}}_{k,m}$ . Since Player $1$ only owns sinks, her strategy is straightforward. Moreover, the notions of SNE and $1$ -fixed SNE coincide in ${\mathcal{G}}_{k,m}$ . For Player $2$ , the strategy $f_{2}$ is an arbitrary memoryless strategy. As for $i\in\{3,\ldots,k\}$ , the strategy $f_{i}$ directs Player $i$ to move the token to $\bot$ from vertices in ${\cal S}_{m}\times[m]$ that she owns and to move the token to $\langle A,i,\mbox{{\sc p}}\rangle$ from $c_{i}$ , where $A$ is the set chosen by Player $2$ at $q_{0}$ . It is easy to see that ${\sf outcome}(\pi)=q_{0},f_{2}(q_{0}),(\bot)^{\omega}$ , and so ${\sf Win}(\pi)=\{1\}$ . We prove that $\pi$ is a $1$ -fixed SNE. Consider a coalition $C\subseteq\{2,\ldots,k\}$ and a deviation profile $\{f^{\prime}_{i}\}_{i\in C}$ . Since Player $2$ is losing in $\pi$ , and wins in every outcome that does not reach $\bot$ , a deviation for Player $2$ that does not reach $\bot$ is always beneficial, and so, we can assume that $2\in C$ . Let $\pi^{\prime}=\pi[C\leftarrow\{f^{\prime}_{i}\}_{i\in C}]$ . Let $A\in{\cal S}_{m}$ be the set of players that Player $2$ chooses from $q_{0}$ in $\pi^{\prime}$ ; thus $f^{\prime}_{2}(q_{0})=\langle A,1\rangle$ . If there is a player $i\in A$ that in $\pi^{\prime}$ moves the token from the vertex $\langle A,l\rangle$ with $A[l]=i$ to the sink $\bot$ , then the outcome of $\pi^{\prime}$ reaches $\bot$ . Thus, ${\sf Win}(\pi^{\prime})=\{1\}$ , and the deviation of the players in $C$ is not beneficial. If all players in $A$ do not move the token to $\bot$ in their strategies in $\pi^{\prime}$ (in particular, this means that $A\subseteq C$ ), then let $c_{i}$ be the vertex to which Player $A[m]$ moves the token from $\langle A,m\rangle$ in $\pi^{\prime}$ . Since all the sinks $s_{i,l}$ that are reachable from $c_{i}$ are losing for Player $i$ , the deviation is not beneficial for Player $i$ , implying that $i\notin C$ . Hence, Player $i$ follows $f_{i}$ , which directs her to move the token from $c_{i}$ to $\langle A,i,\mbox{{\sc p}}\rangle$ . Since every successor of $\langle A,i,\mbox{{\sc p}}\rangle$ is losing for some player in $A$ , the deviation is not beneficial to all the players in $C$ , and so $\pi$ is an SNE in which Player $1$ wins.

We continue and prove that every SNE-CRS solution in ${\mathcal{G}}_{k,m}$ requires each of the players $3,\ldots,k$ to have memory $\binom{k-3}{m}$ . Assume by contradiction that there exists an SNE profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ in which Player $1$ wins and there exists $i\in\{3,\ldots,k\}$ such that the strategy $f_{i}$ has a memory structure ${\cal M}_{i}$ with fewer than $\binom{k-3}{m}$ states. Since there are fewer than $\binom{k-3}{m}$ states in ${\cal M}_{i}$ , and there are $\binom{k-3}{m}$ subsets of $\{3,\ldots,k\}\setminus\{i\}$ of size $m$ , there exists a set $A\in{\cal S}_{m}$ such that $i\notin A$ and $f_{i}$ never (that is, no matter what the history along which $c_{i}$ has been reached) directs Player $i$ to move the token from $c_{i}$ to $\langle A,i,\mbox{{\sc p}}\rangle$ .

Consider the coalition $C=A\cup\{2\}$ , and the deviation profile $f^{\prime}_{C}=\{f^{\prime}_{j}\}_{j\in C}$ , where for every $j\in A$ , the strategy $f^{\prime}_{j}$ agrees with $f_{j}$ , except that from $\langle A,l\rangle$ , with $A[l]=j$ , the strategy $f^{\prime}_{j}$ directs Player $j$ to move the token to $\langle A,l+1\rangle$ , in case $l<m$ , and to $c_{i}$ , in case $l=m$ . Finally, the strategy $f^{\prime}_{2}$ agrees with $f_{2}$ , except that from $q_{0}$ , the strategy $f^{\prime}_{2}$ directs Player $2$ to move the token to $\langle A,1\rangle$ , and for every $A^{\prime}\in{\cal S}_{m}$ such that $A^{\prime}\neq A$ , the strategy $f^{\prime}_{2}$ directs Player $2$ to move the token from $\langle A^{\prime},i,\mbox{{\sc p}}\rangle$ to $s_{i,\min A^{\prime}\setminus A}$ . Note that since both $A$ and $A^{\prime}$ are of size $m$ , the set $A^{\prime}\setminus A$ is not empty, and thus $\min A^{\prime}\setminus A$ exists and is in $\{3,\ldots,k\}$ . Let $\pi^{\prime}=\pi[C\leftarrow f^{\prime}_{C}]$ . By the definition of the strategies in $\pi^{\prime}$ , there exists $A^{\prime}\in{\cal S}_{m}$ such that $A^{\prime}\neq A$ and ${\sf outcome}(\pi^{\prime})=q_{0},\langle A,1\rangle,\langle A,2\rangle,\ldots,\langle A,m\rangle,c_{i},\langle A^{\prime},i,\mbox{{\sc p}}\rangle,(s_{i,\min A^{\prime}\setminus A})^{\omega}$ . Thus, ${\sf Win}(\pi^{\prime})=[k]\setminus\{1,i,\min A^{\prime}\setminus A\}$ . Since $1,i$ , and $\min A^{\prime}\setminus A$ are all not in $A$ , and thus also not in $C$ , it follows that $C\subseteq{\sf Win}(\pi^{\prime})$ . Hence, the deviation to $f^{\prime}_{C}$ is beneficial to all the players in $C$ , contradicting the assumption that $\pi$ is an SNE. $\hfill\blacktriangleleft$
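The counting step and the coalition's deviation in the lower-bound argument can be replayed on small parameters. The sketch below is illustrative only; the function names are hypothetical, and `chosen` stands for the family of sets that a given strategy of Player $i$ ever selects at $c_{i}$ , which is assumed to contain fewer than $\binom{k-3}{m}$ sets.

```python
from itertools import combinations
from math import comb


def missing_set(k, m, i, chosen):
    """A size-m subset of {3..k} \\ {i} that is never selected, or None if all are."""
    pool = sorted(set(range(3, k + 1)) - {i})
    candidates = [frozenset(a) for a in combinations(pool, m)]
    assert len(candidates) == comb(k - 3, m)  # the count used in the pigeonhole step
    for a in candidates:
        if a not in chosen:
            return a
    return None


def coalition_wins(a, i, a_prime):
    """Does every member of C = A + {2} win at the sink s_{i, min(A' \\ A)}?"""
    j = min(a_prime - a)  # nonempty because |A| = |A'| and A != A'
    losers = {1, i, j}
    return not (losers & (a | {2}))


def all_subsets(k, m):
    """The family S_m of size-m subsets of {3..k}."""
    return [frozenset(a) for a in combinations(range(3, k + 1), m)]
```

Choosing $m$ close to $(k-3)/2$ makes $\binom{k-3}{m}$ exponential in $k$ , which is where the $2^{\Theta(k)}$ bound comes from.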

6 Memory Requirements for Additional Solution Concepts

Recall that different applications have initiated the study of different solution concepts for settings in which a coalition of players may deviate together. While upper bounds on the concepts of robust equilibria and SSE serve as a basis to our upper bounds here, no lower bounds are known on the memory requirements for CRS solutions with respect to these concepts. In this section we show that the construction in Theorem 10 can be modified to show a lower bound on the memory required for the environment players in CRS solutions when the solution concepts are resilient (and hence, robust) equilibria and SSE.

For $k\geq 4$ and $m\in[k-2]$ , consider the $k$ -player sink game ${\mathcal{G}}^{\prime}_{k,m}$ obtained from ${\mathcal{G}}_{k,m}$ by changing the objectives of the players so that now the sink $\bot$ is winning for all players in $[k]\setminus\{2\}$ (rather than for Player $1$ only in ${\mathcal{G}}_{k,m}$ ). Thus, as in ${\mathcal{G}}_{k,m}$ , the objectives of Player $1$ and Player $2$ still complement each other, and now, Players $3,\ldots,k$ win also in profiles that reach the vertex $\bot$ . Also, as in ${\mathcal{G}}_{k,m}$ , all the vertices owned by Player $1$ have only one successor, and the notions of $1$ -fixed equilibrium and (usual) equilibrium coincide.

We start with resilient equilibria, and show that ${\mathcal{G}}^{\prime}_{k,m}$ has a resilient-CRS solution, and that every resilient-CRS solution requires $O(k)$ environment players to have memory $2^{\Theta(k)}$. Recall that a profile is a resilient equilibrium if for every subset $C\subseteq{\sf Env}$, every deviation of the players in $C$ that benefits a player in $C$ also harms a player in $C$. The proof is similar to the proof of Theorem 10. In particular, the profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ described in the proof is a resilient-CRS solution in ${\mathcal{G}}^{\prime}_{k,m}$. To see this, recall that ${\sf outcome}(\pi)$ reaches $\bot$, and is thus winning in ${\mathcal{G}}^{\prime}_{k,m}$ for all the players in $[k]\setminus\{2\}$. Thus, only Player $2$ may benefit from a deviation. When Player $2$ chooses a set $A\in{\cal S}_{m}$, all the players in $A$ have to deviate in order for $\bot$ to be avoided, and so, by the definition of resilient equilibrium, all of them have to win when the game reaches the punishment layer. This, however, is impossible in deviations from $\pi$, as for all $i\in\{3,\ldots,k\}$, the only way for Player $i$ to make sure she does not lose is to remember the set $A$ and proceed as in $\pi$, from $c_{i}$ to the vertex $\langle A,i,\mbox{{\sc p}}\rangle$. Moreover, using considerations similar to those in the proof of Theorem 10 (see details in Appendix A.4), if there is $i\in\{3,\ldots,k\}$ such that Player $i$ has no memory to keep track of the set $A$ chosen by Player $2$, then there is no resilient-CRS solution. Essentially, in such a case Player $2$ can convince a coalition $A$ of players to join the deviation by letting the outcome of the new profile reach $c_{i}$, where a player not in $A$ would be punished.

For the solution concept of SSE, we show that ${\mathcal{G}}^{\prime}_{k,m}$ has an SSE-CRS solution, and that every SSE-CRS solution requires $O(k)$ environment players to have memory $2^{\Theta(k)}$. Recall that a profile is an SSE if for every coalition $C\subseteq{\sf Env}$, every deviation that harms a player not in $C$ also harms a player in $C$. In ${\mathcal{G}}^{\prime}_{k,m}$, a deviation that does not reach the vertex $\bot$ causes Player $1$ to lose. Thus, a profile is an SSE iff for every coalition $C\subseteq{\sf Env}$ and every deviation that causes the game to reach the punishment layer, at least one player in the coalition loses. Accordingly, the profile $\pi$ described in the proof of Theorem 10 is an SSE-CRS solution in ${\mathcal{G}}^{\prime}_{k,m}$, and in every SSE-CRS solution, every player in $\{3,\ldots,k\}$ has to remember the set $A$ chosen by Player $2$ from $q_{0}$. Indeed (see details in Appendix A.5), otherwise, a coalition of players in $A$ can deviate by letting the outcome of the new profile reach the vertex $c_{i}$, where a player not in $A$ would be punished.
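The stability notions used in the proof of Theorem 10 and in this section differ only in which changes of the winning set turn a coalition deviation into a counterexample. The following Python sketch spells this out on winning sets alone. The function names are ours, and the concrete values ($k=6$, $i=3$, $A=\{4,5\}$, $A^{\prime}=\{5,6\}$) are hypothetical, chosen only to instantiate ${\sf Win}(\pi^{\prime})=[k]\setminus\{1,i,\min A^{\prime}\setminus A\}$ from the proofs.

```python
def benefits(p, before, after):
    """Player p loses before the deviation and wins after it."""
    return p not in before and p in after


def harms(p, before, after):
    """Player p wins before the deviation and loses after it."""
    return p in before and p not in after


def refutes_sne(coalition, before, after):
    # An SNE is refuted when the deviation benefits every coalition member.
    return all(benefits(p, before, after) for p in coalition)


def refutes_resilient(coalition, before, after):
    # Resilience is refuted when some member benefits and no member is harmed.
    return (any(benefits(p, before, after) for p in coalition)
            and not any(harms(p, before, after) for p in coalition))


def refutes_sse(coalition, players, before, after):
    # An SSE is refuted when an outsider is harmed and no member is harmed.
    outsiders = players - coalition
    return (any(harms(p, before, after) for p in outsiders)
            and not any(harms(p, before, after) for p in coalition))


# Hypothetical instance (illustrative values only).
k, i = 6, 3
A, A_prime = {4, 5}, {5, 6}
players = set(range(1, k + 1))
C = A | {2}
win_after = players - {1, i, min(A_prime - A)}  # Win(pi') = {2, 4, 5}

win_before_G = {1}                 # in G_{k,m}, bot is winning for Player 1 only
win_before_G_prime = players - {2}  # in G'_{k,m}, bot is winning for [k] minus {2}

print(refutes_sne(C, win_before_G, win_after))                   # True
print(refutes_resilient(C, win_before_G_prime, win_after))       # True
print(refutes_sse(C, players, win_before_G_prime, win_after))    # True
print(refutes_sne(C, win_before_G_prime, win_after))             # False
```

The last line illustrates why the objectives are changed for this section: in ${\mathcal{G}}^{\prime}_{k,m}$ the members of $A$ already win before deviating, so the same deviation no longer benefits every member of $C$, yet it still refutes resilience and SSE.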

References

[1] S. Almagor, O. Kupferman, and G. Perelli. Synthesis of controllable Nash equilibria in quantitative objective game. In Proc. 27th Int. Joint Conf. on Artificial Intelligence, pages 35–41, 2018.
[2] E. Anshelevich, A. Dasgupta, J. Kleinberg, E. Tardos, T. Wexler, and T. Roughgarden. The price of stability for network design with fair cost allocation. In Proc. 45th IEEE Symp. on Foundations of Computer Science, pages 295–304. IEEE Computer Society, 2004. doi:10.1109/FOCS.2004.68.
[3] R. Aumann. Acceptable Points in General Cooperative $n$ -Person Games. In Contributions to the Theory of Games, volume 4, 1959.
[4] R. Bloem, K. Chatterjee, and B. Jobstmann. Graph games and reactive synthesis. In Handbook of Model Checking, pages 921–962. Springer, 2018. doi:10.1007/978-3-319-10575-8_27.
[5] P. Bouyer, N. Fijalkow, M. Randour, and P. Vandenhove. How to play optimally for regular objectives? In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023), volume 261 of Leibniz International Proceedings in Informatics (LIPIcs), pages 118:1–118:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ICALP.2023.118.
[6] R. Brenguier. Robust equilibria in mean-payoff games. In Proc. 19th Int. Conf. on Foundations of Software Science and Computation Structures, volume 9634 of Lecture Notes in Computer Science, pages 217–233. Springer, 2016. doi:10.1007/978-3-662-49630-5_13.
[7] L. Brice, J-F. Raskin, M. Sassolas, G. Scerri, and M. Bogaard. Pessimism of the will, optimism of the intellect: Fair protocols with malicious but rational agents. In IEEE 38th Computer Security Foundations Symposium, pages 33–47, 2025.
[8] T. Brihaye, A. Goeminne, J.C.A. Main, and M. Randour. Reachability games and friends: A journey through the lens of memory and complexity. In 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023), volume 284 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1:1–1:26. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.FSTTCS.2023.1.
[9] V. Bruyère, C. Grandmont, and J-F. Raskin. As soon as possible but rationally. In 35th International Conference on Concurrency Theory (CONCUR 2024), volume 311 of Leibniz International Proceedings in Informatics (LIPIcs), pages 14:1–14:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.CONCUR.2024.14.
[10] V. Bruyère, J-F. Raskin, A. Reynouard, and M. Bogaard. The non-cooperative rational synthesis problem for SPEs and omega-regular objectives. In 36th International Conference on Concurrency Theory (CONCUR 2025), volume 348 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1–12:23. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025. doi:10.4230/LIPIcs.CONCUR.2025.12.
[11] A. Casares. On the minimisation of transition-based Rabin automata and the chromatic memory requirements of Muller conditions. In 30th EACSL Annual Conference on Computer Science Logic (CSL 2022), volume 216 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1–12:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.CSL.2022.12.
[12] K. Chatterjee, R. Majumdar, and M. Jurdzinski. On Nash equilibria in stochastic games. In Proc. 13th Annual Conf. of the European Association for Computer Science Logic, volume 3210 of Lecture Notes in Computer Science, pages 26–40. Springer, 2004. doi:10.1007/978-3-540-30124-0_6.
[13] R. Condurache, E. Filiot, R. Gentilini, and J.-F. Raskin. The complexity of rational synthesis. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55 of Leibniz International Proceedings in Informatics (LIPIcs), pages 121:1–121:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ICALP.2016.121.
[14] S. Dziembowski, M. Jurdzinski, and I. Walukiewicz. How much memory is needed to win infinite games. In Proc. 12th ACM/IEEE Symp. on Logic in Computer Science, pages 99–110, 1997.
[15] R. Ehlers. Symbolic bounded synthesis. In Proc. 22nd Int. Conf. on Computer Aided Verification, volume 6174 of Lecture Notes in Computer Science, pages 365–379. Springer, 2010. doi:10.1007/978-3-642-14295-6_33.
[16] E.A. Emerson and C.-L. Lei. Modalities for model checking: Branching time logic strikes back. Science of Computer Programming, 8:275–306, 1987. doi:10.1016/0167-6423(87)90036-0.
[17] E. Filiot, N. Jin, and J.-F. Raskin. An antichain algorithm for LTL realizability. In Proc. 21st Int. Conf. on Computer Aided Verification, volume 5643, pages 263–277, 2009. doi:10.1007/978-3-642-02658-4_22.
[18] D. Fisman, O. Kupferman, and Y. Lustig. Rational synthesis. In Proc. 16th Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, volume 6015 of Lecture Notes in Computer Science, pages 190–204. Springer, 2010. doi:10.1007/978-3-642-12002-2_16.
[19] E. Koutsoupias and C. Papadimitriou. Worst-case equilibria. Computer Science Review, 3(2):65–69, 2009. doi:10.1016/J.COSREV.2009.04.003.
[20] O. Kupferman, Y. Lustig, M.Y. Vardi, and M. Yannakakis. Temporal synthesis for bounded systems and environments. In Proc. 28th Symp. on Theoretical Aspects of Computer Science, pages 615–626, 2011.
[21] O. Kupferman, G. Perelli, and M.Y. Vardi. Synthesis with rational environments. Annals of Mathematics and Artificial Intelligence, 78(1):3–20, 2016. doi:10.1007/S10472-016-9508-8.
[22] O. Kupferman and N. Shenwald. Games with trading of control. In 34th International Conference on Concurrency Theory (CONCUR 2023), volume 279 of Leibniz International Proceedings in Informatics (LIPIcs), pages 19:1–19:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.CONCUR.2023.19.
[23] O. Kupferman and N. Shenwald. Positional-Player Games. In 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025), volume 345 of Leibniz International Proceedings in Informatics (LIPIcs), pages 64:1–64:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025. doi:10.4230/LIPIcs.MFCS.2025.64.
[24] J. C. A. Main. Arena-independent memory bounds for Nash equilibria in reachability games. In 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024), volume 289 of Leibniz International Proceedings in Informatics (LIPIcs), pages 50:1–50:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.STACS.2024.50.
[25] D.A. Martin. Borel determinacy. Annals of Mathematics, 65:363–371, 1975.
[26] J.F. Nash. Equilibrium points in $n$ -person games. In Proceedings of the National Academy of Sciences of the United States of America, 1950.
[27] C. H. Papadimitriou. Algorithms, games, and the internet. In Proc. 33rd ACM Symp. on Theory of Computing, pages 749–753, 2001.
[28] A. Pnueli and R. Rosner. On the synthesis of a reactive module. In Proc. 16th ACM Symp. on Principles of Programming Languages, pages 179–190, 1989.
[29] S. Schewe and B. Finkbeiner. Bounded synthesis. In 5th Int. Symp. on Automated Technology for Verification and Analysis, volume 4762 of Lecture Notes in Computer Science, pages 474–488. Springer, 2007. doi:10.1007/978-3-540-75596-8_33.
[30] W. Thomas. On the synthesis of strategies in infinite games. In Proc. 12th Symp. on Theoretical Aspects of Computer Science, volume 900 of Lecture Notes in Computer Science, pages 1–13. Springer, 1995. doi:10.1007/3-540-59042-0_57.
[31] M. Ummels. The complexity of Nash equilibria in infinite multiplayer games. In Proc. 11th Int. Conf. on Foundations of Software Science and Computation Structures, pages 20–34, 2008.
[32] J. von Neumann and O. Morgenstern. Theory of games and economic behavior. Princeton University Press, 1953.
[33] M. Wooldridge, J. Gutierrez, P. Harrenstein, E. Marchioni, G. Perelli, and A. Toumi. Rational verification: From model checking to equilibrium checking. In Proc. of 30th Conf. on Artificial Intelligence, pages 4184–4190, 2016.

Appendix A Missing Proofs and Examples

A.1 On strategies for Player 1 in a $1$ -bounded NE-CRS solution

We describe two interesting examples. The first is a game ${\mathcal{G}}_{5}$ and a CRS solution $\langle f_{1},f_{2}\rangle$ such that there is no $1$-bounded CRS solution $\langle f_{1},f^{\prime}_{2}\rangle$ for a memoryless $f^{\prime}_{2}$. The second is a game ${\mathcal{G}}_{6}$ such that every $1$-bounded CRS solution $\langle f_{1},f_{2}\rangle$ requires $f_{1}$ to have memory of size at least $2$.

Consider the Büchi game ${\mathcal{G}}_{5}$ in Figure 5 (left). Let $\alpha_{1}=\{v_{3},v_{4}\}$ and $\alpha_{2}=\{v_{4}\}$. Consider the following strategy $f_{1}$ of Player $1$:

- $f_{1}(v_{0},v_{2},(v_{3},v_{2})^{*})=v_{3}$.
- $f_{1}(v_{0},v_{1},v_{0},v_{2},(v_{4},v_{2})^{*})=v_{4}$.

Thus, if the token reaches $v_{2}$ without visiting $v_{1}$ before, Player $1$ always directs it to $v_{3}$, where only Player $1$ wins, and if the token visits $v_{1}$ once before reaching $v_{2}$, then Player $1$ always directs the token to $v_{4}$, where both players win. Let $f_{2}$ be a strategy for Player $2$ that proceeds to $v_{1}$ once and then to $v_{2}$. It is easy to see that while $\langle f_{1},f_{2}\rangle$ is a CRS solution, there is no $1$-bounded CRS solution $\langle f_{1},f^{\prime}_{2}\rangle$ for a memoryless $f^{\prime}_{2}$.

Figure 5: The games ${\mathcal{G}}_{5}$ (left) and ${\mathcal{G}}_{6}$ (right).

We continue and show that the strategy of Player $1$ in a $1$ -bounded CRS solution may require memory of size $2$ . Consider the Büchi game ${\mathcal{G}}_{6}$ in Figure 5 (right). Let $\alpha_{1}=\{v_{3}\}$ and $\alpha_{2}=\{v_{4}\}$ .

Consider a profile $\pi=\langle f_{1},f_{2}\rangle$ for a memoryless strategy $f_{1}$ for Player $1$ . We claim that $\pi$ is not a $1$ -bounded CRS solution. Indeed, if $f_{1}(v_{0})=v_{1}$ , then $\alpha_{1}$ is not satisfied, and if $f_{1}(v_{0})=v_{2}$ , then Player $2$ can deviate to a memoryless strategy $f^{\prime}_{2}(v_{2})=v_{4}$ , where only $\alpha_{2}$ is satisfied.

Nevertheless, the game ${\mathcal{G}}_{6}$ does have a $1$ -bounded CRS solution $\pi=\langle f_{1},f_{2}\rangle$ , for $f_{1}$ that is not memoryless. To see this, consider a strategy $f_{1}$ of Player $1$ that behaves as follows:

- $f_{1}((v_{0},v_{2},v_{3})^{*},v_{0})=v_{2}$.
- $f_{1}(v_{0},v_{2},v_{4},v_{0})=v_{1}$.

Thus, when the game starts and as long as Player $2$ moves the token from $v_{2}$ to $v_{3}$ , Player $1$ moves the token from $v_{0}$ down to $v_{2}$ . If Player $2$ moves the token from $v_{2}$ to $v_{4}$ , then on the next visit of the token in $v_{0}$ , Player $1$ moves it to $v_{1}$ , where no player wins. Note that when Player $2$ uses a memoryless strategy, then this “next” visit must be the second one, and so the definition of $f_{1}$ above covers all possible histories.

It is not hard to see that $\pi=\langle f_{1},f_{2}\rangle$, for the memoryless strategy $f_{2}$ with $f_{2}(v_{2})=v_{3}$, is a $1$-bounded CRS solution.
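The strategy $f_{1}$ above needs only two memory states: one while Player $2$ cooperates, and one after the token has visited $v_{4}$. The following Python sketch simulates it against the two memoryless strategies of Player $2$. Since Figure 5 is not reproduced here, the arena is an assumption read off the histories above: $v_{0}$ has successors $v_{1}$ and $v_{2}$, $v_{2}$ has successors $v_{3}$ and $v_{4}$, both $v_{3}$ and $v_{4}$ lead back to $v_{0}$, and $v_{1}$ is assumed to loop on itself. All function names are ours.

```python
# Assumed arena for G_6 (see the lead-in): forced moves from v1, v3, v4.
FORCED = {"v1": "v1", "v3": "v0", "v4": "v0"}
# Buechi objectives from the text: alpha_1 = {v3}, alpha_2 = {v4}.
ALPHA = {1: {"v3"}, 2: {"v4"}}


def update(mem, vertex):
    """Memory of f_1: 'ok' until the token visits v4, then 'punish' forever."""
    return "punish" if vertex == "v4" else mem


def f1(mem):
    """Move of Player 1 from v0, determined by the memory state only."""
    return "v2" if mem == "ok" else "v1"


def recurring(f2_choice):
    """Vertices visited infinitely often when Player 2 always moves
    the token from v2 to f2_choice (a memoryless strategy)."""
    vertex, mem = "v0", "ok"
    first_seen, trace = {}, []
    # The play is determined by (vertex, memory), so it ends in a cycle.
    while (vertex, mem) not in first_seen:
        first_seen[(vertex, mem)] = len(trace)
        trace.append(vertex)
        mem = update(mem, vertex)
        if vertex == "v0":
            vertex = f1(mem)
        elif vertex == "v2":
            vertex = f2_choice
        else:
            vertex = FORCED[vertex]
    return set(trace[first_seen[(vertex, mem)]:])


def winners(f2_choice):
    """Players whose Buechi objective is visited infinitely often."""
    inf = recurring(f2_choice)
    return {p for p, alpha in ALPHA.items() if inf & alpha}


print(winners("v3"))  # {1}
print(winners("v4"))  # set()
```

Under these assumptions, Player $2$ loses both when she follows $f_{2}(v_{2})=v_{3}$ and when she deviates to $v_{4}$, so the deviation is not beneficial, in line with the claim that $\pi$ is a $1$-bounded CRS solution.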

A.2 Missing Details in the proof of Theorem 5

The game graph $G_{k}=\langle\{V_{i}\}_{i\in[k]},d_{2},E\rangle$ and the objectives $\{\alpha_{i}\}_{i\in[k]}$ are defined as follows:

1. The vertex sets of the players are defined as follows:
   - Player $1$ controls the set $V_{1}=\{d_{k+1},c_{1}\}\cup\{s_{2},\ldots,s_{k}\}$.
   - For every $i\in\{2,\ldots,k\}$, Player $i$ controls the set $V_{i}=\{c_{i},d_{i}\}$.
2. The set $E$ contains edges of the following types:
   - For every $i\in\{2,\ldots,k\}$, there is an edge from $d_{i}$ to $d_{i+1}$.
   - For every $i\in\{2,\ldots,k\}$ and $j\in[k]\setminus{\{i\}}$, there is an edge from $d_{i}$ to $c_{j}$.
   - For every $i\in[k]$ and $j\in\{2,\ldots,k\}$, there is an edge from $c_{i}$ to $s_{j}$.
   - The vertices in $\{d_{k+1}\}\cup\{s_{2},\ldots,s_{k}\}$ are sink vertices.
3. The objectives are defined as follows:
   - The objective of Player $1$ is $\alpha_{1}=\{d_{k+1}\}$.
   - For every $i\in\{2,\ldots,k\}$, the objective of Player $i$ is $\alpha_{i}=\{s_{j}:\,j\in[k]\setminus{\{i\}}\}$.
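To make the definition concrete, the following Python sketch builds $G_{k}$ and the objectives for a given $k$. It is a minimal illustration under our own encoding: vertices are strings such as `"d3"` or `"c1"`, the function name `build_game` is ours, and modelling a sink by a self-loop is an assumption made for convenience.

```python
def build_game(k):
    """Return (owner, edges, alpha) for the game graph G_k described above."""
    env = range(2, k + 1)
    sinks = [f"d{k + 1}"] + [f"s{j}" for j in env]

    # Player 1 owns d_{k+1}, c_1 and the sinks s_2..s_k;
    # every Player i >= 2 owns c_i and d_i.
    owner = {v: 1 for v in sinks}
    owner["c1"] = 1
    for i in env:
        owner[f"c{i}"] = i
        owner[f"d{i}"] = i

    edges = {v: set() for v in owner}
    for i in env:
        edges[f"d{i}"].add(f"d{i + 1}")  # continue towards d_{k+1}
        edges[f"d{i}"] |= {f"c{j}" for j in range(1, k + 1) if j != i}
    for i in range(1, k + 1):
        edges[f"c{i}"] = {f"s{j}" for j in env}
    for v in sinks:
        edges[v] = {v}  # assumption: a sink is modelled by a self-loop

    alpha = {1: {f"d{k + 1}"}}
    for i in env:
        # s_1 does not exist, so j effectively ranges over {2..k} minus {i}.
        alpha[i] = {f"s{j}" for j in env if j != i}
    return owner, edges, alpha


owner, edges, alpha = build_game(4)
print(len(owner))           # 11
print(sorted(edges["d2"]))  # ['c1', 'c3', 'c4', 'd3']
print(sorted(alpha[2]))     # ['s3', 's4']
```

For $k=4$ the graph has $11$ vertices, and every $c_{i}$ has $k-1=3$ successors, which is the count used in the pigeonhole argument for the lower bound.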

We begin by describing an NE profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ in which Player $1$ wins. For every $i\in\{2,\ldots,k\}$, Player $i$ always moves the token from $d_{i}$ towards $d_{k+1}$. If there exists $i\in[k]\setminus{\{1\}}$ such that Player $i$ deviates and moves the token from $d_{i}$ to $c_{j}$ for some $j\in[k]\setminus{\{i\}}$, then Player $j$ moves the token to $s_{i}$. The outcome of the profile $\pi$ is $d_{2},d_{3},\ldots,d_{k},d_{k+1}^{\omega}$, and thus ${\sf Win}(\pi)=\{1\}$. If for some $l\in\{2,\ldots,k\}$, Player $l$ deviates and moves the token from $d_{l}$ to $c_{j}$ for some $j\in[k]\setminus{\{l\}}$, then Player $j$ moves the token to $s_{l}$ and Player $l$ loses. Thus, the deviation is not beneficial, and $\pi$ is an NE in which Player $1$ wins. Note that the strategy of each player in $\pi$ can be implemented with a finite memory structure of size $k-2$. Indeed, each player only needs to remember whether a deviation has happened, and if so, which player (besides themselves and Player $1$) has deviated.
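The argument that $\pi$ is an NE can be checked mechanically for small $k$. The sketch below is a minimal illustration with our own encoding: sinks are pairs such as `("d", k + 1)` or `("s", l)`, the function names are ours, and only deviations at the vertices $d_{2},\ldots,d_{k}$ are enumerated, since the vertices $c_{j}$ are not visited by ${\sf outcome}(\pi)$ and a change of strategy there alone does not change the outcome.

```python
def play(k, deviation=None):
    """Sink reached under pi, optionally with one deviation (l, j):
    Player l moves the token from d_l to c_j instead of d_{l+1}."""
    for i in range(2, k + 1):  # the token is at d_i
        if deviation is not None and deviation[0] == i:
            l, j = deviation
            assert j != l and 1 <= j <= k
            # The token is at c_j; its owner punishes by moving it to s_l.
            return ("s", l)
    return ("d", k + 1)


def winners(k, sink):
    """Players whose objective contains the reached sink."""
    kind, idx = sink
    if kind == "d":
        return {1}  # alpha_1 = {d_{k+1}}
    # alpha_i contains every sink s_j with j != i.
    return {i for i in range(2, k + 1) if i != idx}


def no_beneficial_deviation(k):
    """True iff no single Player l gains by leaving the path at d_l."""
    base = winners(k, play(k))
    for l in range(2, k + 1):
        for j in range(1, k + 1):
            if j == l:
                continue
            after = winners(k, play(k, (l, j)))
            if l not in base and l in after:
                return False
    return True


print(winners(5, play(5)))          # {1}
print(winners(5, play(5, (3, 1))))  # {2, 4, 5}
print(no_beneficial_deviation(5))   # True
```

The second line shows the effect of the punishment: when Player $3$ deviates, the token ends in $s_{3}$, which is winning for every environment player except Player $3$.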

We now show that in every NE in which Player $1$ wins, the strategy of every player uses memory at least $k-2$. Assume by contradiction that there exists an NE profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ where Player $1$ wins and there exists $i\in[k]$ such that the strategy $f_{i}$ has a memory structure ${\cal M}_{i}$ with fewer than $k-2$ states. Then, by the pigeonhole principle, since there are fewer than $k-2$ states in ${\cal M}_{i}$ but $c_{i}$ has $k-1$ successors, there exists a successor of $c_{i}$, which we denote by $s_{j}$, such that $f_{i}$ never chooses $s_{j}$ as a successor from $c_{i}$. Since $j\geq 2$, we have that $j\in{\sf Lose}(\pi)$, and so Player $j$ may deviate. Let $f^{\prime}_{j}$ be a strategy for Player $j$ that agrees with $f_{j}$, except that from $d_{j}$, the strategy $f^{\prime}_{j}$ always moves the token to $c_{i}$. Then, since $f_{i}$ never chooses $s_{j}$ as a successor from $c_{i}$, the profile $\pi^{\prime}=\pi[j\leftarrow f^{\prime}_{j}]$ visits a sink $s_{m}$ where $m\in[k]\setminus{\{1,j\}}$. Thus, Player $j$ wins in $\pi^{\prime}$ and so $\pi$ is not stable.

A.3 Missing Details in the proof of Theorem 7

We prove that the following three claims are equivalent:

$(\mathbf{C}_{1})$: The game ${\mathcal{G}}$ has an NE-NRS solution.
$(\mathbf{C}_{2})$: The game ${\mathcal{G}}^{\prime}$ has an SNE-CRS solution.
$(\mathbf{C}_{3})$: There exists a strategy $f_{1}$ for Player $1$ such that there does not exist a trap in ${\mathcal{G}}$ that agrees with $f_{1}$ .

We first prove that $(\mathbf{C}_{1})$ iff $(\mathbf{C}_{3})$ . In fact we prove a stronger claim, namely that for every strategy $f_{1}$ for Player $1$ , we have that $f_{1}$ is an NE-NRS solution in ${\mathcal{G}}$ iff there does not exist a trap in ${\mathcal{G}}$ that agrees with $f_{1}$ .

Consider a strategy $f_{1}$ for Player $1$ . Assume first that $f_{1}$ is an NE-NRS solution in ${\mathcal{G}}$ , and assume by way of contradiction that there exists a trap $p$ that agrees with $f_{1}$ . Thus, $p$ reaches a sink $s\not\in\alpha_{1}$ , and for every $i\in{\sf relevant}(p)$ , we have that $s\in\alpha_{i}$ . Consider a profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ , such that for every $i\in{\sf Env}$ , the strategy $f_{i}$ agrees with $p$ . Since $s\not\in\alpha_{1}$ , Player $1$ loses in $\pi$ . Since for every $i\in{\sf relevant}(p)$ , we have that $s\in\alpha_{i}$ , then ${\sf relevant}(\pi)\subseteq{\sf Win}({\mathcal{G}},\pi)$ . Thus, no player in ${\sf relevant}(\pi)$ has an incentive to deviate in ${\mathcal{G}}$ . Hence, $\pi$ is a $1$ -fixed NE in ${\mathcal{G}}$ in which Player $1$ loses, contradicting the fact $f_{1}$ is an NE-NRS solution for ${\mathcal{G}}$ .

For the second direction, assume that there does not exist a trap in ${\mathcal{G}}$ that agrees with $f_{1}$, and assume by way of contradiction that $f_{1}$ is not an NE-NRS solution for ${\mathcal{G}}$. That is, there is a $1$-fixed NE profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ such that $1\notin{\sf Win}({\mathcal{G}},\pi)$. Consider the play $p={\sf outcome}(\pi)$. Since $1\notin{\sf Win}({\mathcal{G}},\pi)$, the play $p$ reaches a sink $s\not\in\alpha_{1}$. Also, since $p$ is not a trap, there exists $i\in{\sf relevant}(p)$ such that $s\notin\alpha_{i}$, which implies that $i\notin{\sf Win}({\mathcal{G}},\pi)$. By the third condition on the structure of special sink games, Player $i$ can deviate and move the token from the first vertex she owns in $p$ to a vertex in $\alpha_{1}$. Since $\alpha_{i}\subseteq\alpha_{1}$, such a deviation causes Player $i$ to win in ${\mathcal{G}}$, contradicting the fact that $\pi$ is a $1$-fixed NE.

We continue and prove that $(\mathbf{C}_{2})$ iff $(\mathbf{C}_{3})$ . Here too, we prove a stronger claim, namely that for every strategy $f_{1}$ for Player $1$ , we have that there is an SNE-CRS solution $\langle f_{1},f_{2},\ldots,f_{k}\rangle$ for ${\mathcal{G}}^{\prime}$ iff there does not exist a trap in ${\mathcal{G}}$ that agrees with $f_{1}$ .

Assume first that $f_{1}$ is such that there is an SNE-CRS solution $\pi=\langle f_{1},f_{2},\ldots,f_{k}\rangle$ for ${\mathcal{G}}^{\prime}$ . Recall that for every $i\in{\sf Env}$ , we have that $\alpha^{\prime}_{i}=\alpha_{i}\setminus\alpha_{1}$ . Thus, as $1\in{\sf Win}({\mathcal{G}}^{\prime},\pi)$ , it must be that ${\sf Env}\subseteq{\sf Lose}({\mathcal{G}}^{\prime},\pi)$ .

Assume by way of contradiction there exists a trap $p$ that agrees with $f_{1}$ . Let $C={\sf relevant}(p)$ . Since $C\subseteq{\sf Env}$ , then $C\subseteq{\sf Lose}({\mathcal{G}}^{\prime},\pi)$ . For every $i\in C$ , let $g_{i}$ be a strategy for Player $i$ that agrees with $p$ . Let $g_{C}=\{g_{i}:i\in C\}$ , and consider the profile $\pi^{\prime}=\pi[C\leftarrow g_{C}]$ . Since $f_{1}$ agrees with $p$ and for all $i\in{\sf relevant}(p)$ , the strategy $g_{i}$ agrees with $p$ , we have that ${\sf outcome}(\pi^{\prime})=p$ . Since $p$ is a trap, we have that $C\subseteq{\sf Win}({\mathcal{G}}^{\prime},\pi^{\prime})$ . Thus, the deviation strictly benefits all the players in $C$ and so $\pi$ is not a $1$ -fixed SNE.

Assume now that $f_{1}$ is such that there does not exist a trap $p$ that agrees with $f_{1}$ . Consider the profile $\pi=\langle f_{1},f_{2},\ldots,f_{k}\rangle$ , where $f_{2}$ is some memoryless strategy, and for every $i\in\{3,\ldots,k\}$ the strategy $f_{i}$ is a memoryless strategy that moves the token from every vertex to $\alpha_{1}$ . Note that since ${\mathcal{G}}^{\prime}$ is a special sink game, such strategies $f_{3},\ldots,f_{k}$ exist. We prove that $\pi$ is an SNE-CRS solution for ${\mathcal{G}}^{\prime}$ .

Let $p={\sf outcome}(\pi)$ . First, we show that $p$ reaches $\alpha_{1}$ , which implies that $1\in{\sf Win}({\mathcal{G}}^{\prime},\pi)$ . Clearly, if ${\sf relevant}(p)\cap\{3,\ldots,k\}\neq\emptyset$ , then, by the definition of the strategies $f_{3},\ldots,f_{k}$ , the play $p$ reaches $\alpha_{1}$ . Otherwise, ${\sf relevant}(p)\subseteq\{1,2\}$ . Then, if $p$ does not reach $\alpha_{1}$ , then, by the definition of $\alpha^{\prime}_{2}=V\setminus\alpha_{1}$ , we have that $p$ reaches $\alpha^{\prime}_{2}$ , implying that $p$ is a trap that agrees with $f_{1}$ and contradicting the fact that no such trap exists.

Second, we show that $\pi$ is a $1$ -fixed SNE in ${\mathcal{G}}^{\prime}$ . Assume by way of contradiction that $\pi$ is not a $1$ -fixed SNE in ${\mathcal{G}}^{\prime}$ . Then, there exists a coalition $C\subseteq{\sf Lose}({\mathcal{G}}^{\prime},\pi)$ and a strategy profile $g_{C}$ such that for the profile $\pi^{\prime}=\pi[C\leftarrow g_{C}]$ , we have that $C\subseteq{\sf Win}({\mathcal{G}}^{\prime},\pi^{\prime})$ . By the definition of $\alpha^{\prime}_{2},\ldots,\alpha^{\prime}_{k}$ , the latter implies that ${\sf outcome}(\pi^{\prime})$ reaches a sink $s\not\in\alpha_{1}$ . Let $p^{\prime}={\sf outcome}(\pi^{\prime})$ . By the definition of the strategies $f_{i}$ , it must be that ${\sf relevant}(p^{\prime})\cap\{3,\ldots,k\}\subseteq C$ . Indeed, otherwise $p^{\prime}$ would have reached $\alpha_{1}$ . Thus, as $C\subseteq{\sf Win}({\mathcal{G}}^{\prime},\pi^{\prime})$ , it follows that for every $i\in{\sf relevant}(p^{\prime})\cap\{3,\ldots,k\}$ , we have that $s\in\alpha_{i}$ . In addition, as $\alpha_{2}=V$ , we also have that $s\in\alpha_{2}$ . Thus, the play $p^{\prime}$ reaches a sink $s\not\in\alpha_{1}$ , and for every $i\in{\sf relevant}(\pi^{\prime})$ , we have that $s\in\alpha_{i}$ . Thus, $p^{\prime}$ is a trap that agrees with $f_{1}$ , contradicting the assumption that no such trap exists.

A.4 Resilient-CRS memory lower bound

We first show that $\pi=\langle f_{1},\ldots,f_{k}\rangle$ is a resilient-CRS solution. Consider a coalition $C\subseteq\{2,\ldots,k\}$ and a deviation profile $\{f^{\prime}_{i}\}_{i\in C}$. Let $\pi^{\prime}=\pi[C\leftarrow\{f^{\prime}_{i}\}_{i\in C}]$. Let $A\in{\cal S}_{m}$ be the set of players that Player $2$ chooses from $q_{0}$ in $\pi^{\prime}$; thus $f^{\prime}_{2}(q_{0})=\langle A,1\rangle$. If there is a player $i\in A$ that in $\pi^{\prime}$ moves the token from the vertex $\langle A,l\rangle$ with $A[l]=i$ to the sink $\bot$, then the outcome of $\pi^{\prime}$ reaches $\bot$. Thus, ${\sf Win}(\pi^{\prime})=[k]\setminus\{2\}$, and the deviation of the players in $C$ is not strictly beneficial to a player in the coalition. If no player in $A$ moves the token to $\bot$ in $\pi^{\prime}$ (in particular, this means that $A\subseteq C$), then let $c_{i}$ be the vertex to which Player $A[m]$ moves the token from $\langle A,m\rangle$ in $\pi^{\prime}$. Since all the sinks $s_{i,l}$ that are reachable from $c_{i}$ are losing for Player $i$, the deviation is harmful for Player $i$, implying that $i\notin C$. Hence, Player $i$ follows $f_{i}$, which directs her to move the token from $c_{i}$ to $\langle A,i,\mbox{{\sc p}}\rangle$. Since every successor of $\langle A,i,\mbox{{\sc p}}\rangle$ is losing for some player in $A$, the deviation harms a player in $C$, and so $\pi$ is a resilient equilibrium in which Player $1$ wins.

We continue and prove that every resilient-CRS solution in ${\mathcal{G}}^{\prime}_{k,m}$ requires each of the players $3,\ldots,k$ to have memory $\binom{k-3}{m}$ . Assume by contradiction that there exists a resilient profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ in which Player $1$ wins and there exists $i\in\{3,\ldots,k\}$ such that the strategy $f_{i}$ has a memory structure ${\cal M}_{i}$ with fewer than $\binom{k-3}{m}$ states. Since there are fewer than $\binom{k-3}{m}$ states in ${\cal M}_{i}$ , and there are $\binom{k-3}{m}$ subsets of $\{3,\ldots,k\}\setminus\{i\}$ of size $m$ , there exists a set $A\in{\cal S}_{m}$ such that $i\notin A$ and $f_{i}$ never (that is, no matter what the history along which $c_{i}$ has been reached) directs Player $i$ to move the token from $c_{i}$ to $\langle A,i,\mbox{{\sc p}}\rangle$ .
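The counting step can be illustrated with a short Python sketch. It enumerates the candidate sets $A$ for a given player $i$ and compares their number with $\binom{k-3}{m}$. The function name and the sample values are ours. The choice $m=\lfloor(k-3)/2\rfloor$ in the second part is an assumption made only to illustrate exponential growth; it uses the standard bound $\binom{n}{\lfloor n/2\rfloor}\geq 2^{n}/(n+1)$.

```python
from itertools import combinations
from math import comb


def candidate_sets(k, m, i):
    """All sets A of size m that contain only players from {3..k} other than i."""
    others = [p for p in range(3, k + 1) if p != i]
    return [frozenset(c) for c in combinations(others, m)]


# Hypothetical sample values.
k, m, i = 8, 2, 3
sets = candidate_sets(k, m, i)
print(len(sets), comb(k - 3, m))  # 10 10

# Growth for the (assumed) choice m = floor((k-3)/2):
# the central binomial coefficient is at least 2^n / (n + 1) for n = k - 3.
for k in range(5, 40):
    n = k - 3
    assert comb(n, n // 2) * (n + 1) >= 2 ** n
```

A memory structure with fewer states than `len(sets)` cannot reserve a separate state for each candidate set, and the argument above turns this counting fact into a set $A$ whose punishment vertex $\langle A,i,\mbox{{\sc p}}\rangle$ is never chosen by $f_{i}$.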

Consider the coalition $C=A\cup\{2\}$, and the deviation profile $f^{\prime}_{C}=\{\{f^{\prime}_{j}\}_{j\in C}\}$, where for every $j\in A$, the strategy $f^{\prime}_{j}$ agrees with $f_{j}$, except that from $\langle A,l\rangle$, with $A[l]=j$, the strategy $f^{\prime}_{j}$ directs Player $j$ to move the token to $\langle A,l+1\rangle$, in case $l<m$, and to $c_{i}$, in case $l=m$. Finally, the strategy $f^{\prime}_{2}$ agrees with $f_{2}$, except that from $q_{0}$, the strategy $f^{\prime}_{2}$ directs Player $2$ to move the token to $\langle A,1\rangle$, and for every $A^{\prime}\in{\cal S}_{m}$ such that $A^{\prime}\neq A$, the strategy $f^{\prime}_{2}$ directs Player $2$ to move the token from $\langle A^{\prime},i,\mbox{{\sc p}}\rangle$ to $s_{i,\min A^{\prime}\setminus A}$. Note that since both $A$ and $A^{\prime}$ are of size $m$, the set $A^{\prime}\setminus A$ is not empty, and thus $\min A^{\prime}\setminus A$ exists and is in $\{3,\ldots,k\}$. Let $\pi^{\prime}=\pi[C\leftarrow f^{\prime}_{C}]$. By the definition of the strategies in $\pi^{\prime}$, there exists $A^{\prime}\in{\cal S}_{m}$ such that $A^{\prime}\neq A$ and ${\sf outcome}(\pi^{\prime})=q_{0},\langle A,1\rangle,\langle A,2\rangle,\ldots,\langle A,m\rangle,c_{i},\langle A^{\prime},i,\mbox{{\sc p}}\rangle,(s_{i,\min A^{\prime}\setminus A})^{\omega}$. Thus, ${\sf Win}(\pi^{\prime})=[k]\setminus\{1,i,\min A^{\prime}\setminus A\}$. Since $1,i$, and $\min A^{\prime}\setminus A$ are all not in $A$, and thus also not in $C$, it follows that $C\subseteq{\sf Win}(\pi^{\prime})$.

Hence, the deviation to $f^{\prime}_{C}$ is strictly beneficial for Player $2$ , and does not harm any member of $A$ , contradicting the assumption that $\pi$ is resilient.

A.5 SSE-CRS memory lower bound

We first show that $\pi=\langle f_{1},\ldots,f_{k}\rangle$ is an SSE-CRS solution. Consider a coalition $C\subseteq\{2,\ldots,k\}$ and a deviation profile $\{f^{\prime}_{i}\}_{i\in C}$. Let $\pi^{\prime}=\pi[C\leftarrow\{f^{\prime}_{i}\}_{i\in C}]$. Let $A\in{\cal S}_{m}$ be the set of players that Player $2$ chooses from $q_{0}$ in $\pi^{\prime}$; thus $f^{\prime}_{2}(q_{0})=\langle A,1\rangle$. If there is a player $i\in A$ that in $\pi^{\prime}$ moves the token from the vertex $\langle A,l\rangle$ with $A[l]=i$ to the sink $\bot$, then the outcome of $\pi^{\prime}$ reaches $\bot$. Thus, ${\sf Win}(\pi^{\prime})=[k]\setminus{\{2\}}$, and the deviation of the players in $C$ does not harm a player outside $C$. If no player in $A$ moves the token to $\bot$ in $\pi^{\prime}$ (in particular, this means that $A\subseteq C$), then let $c_{i}$ be the vertex to which Player $A[m]$ moves the token from $\langle A,m\rangle$ in $\pi^{\prime}$. Since all the sinks $s_{i,l}$ that are reachable from $c_{i}$ are losing for Player $i$, the deviation harms Player $i$, implying that $i\notin C$. Hence, Player $i$ follows $f_{i}$, which directs her to move the token from $c_{i}$ to $\langle A,i,\mbox{{\sc p}}\rangle$. Since every successor of $\langle A,i,\mbox{{\sc p}}\rangle$ is losing for some player in $A$, the deviation harms a player in $C$, and so $\pi$ is an SSE in which Player $1$ wins.

We continue and prove that every SSE-CRS solution in ${\mathcal{G}}^{\prime}_{k,m}$ requires each of the players $3,\ldots,k$ to have memory $\binom{k-3}{m}$. Assume by contradiction that there exists an SSE profile $\pi=\langle f_{1},\ldots,f_{k}\rangle$ in which Player $1$ wins and there exists $i\in\{3,\ldots,k\}$ such that the strategy $f_{i}$ has a memory structure ${\cal M}_{i}$ with fewer than $\binom{k-3}{m}$ states. Since there are fewer than $\binom{k-3}{m}$ states in ${\cal M}_{i}$, and there are $\binom{k-3}{m}$ subsets of $\{3,\ldots,k\}\setminus\{i\}$ of size $m$, there exists a set $A\in{\cal S}_{m}$ such that $i\notin A$ and $f_{i}$ never (that is, no matter what the history along which $c_{i}$ has been reached) directs Player $i$ to move the token from $c_{i}$ to $\langle A,i,\mbox{{\sc p}}\rangle$.

Consider the coalition $C=A\cup\{2\}$, and the deviation profile $f^{\prime}_{C}=\{\{f^{\prime}_{j}\}_{j\in C}\}$, where for every $j\in A$, the strategy $f^{\prime}_{j}$ agrees with $f_{j}$, except that from $\langle A,l\rangle$, with $A[l]=j$, the strategy $f^{\prime}_{j}$ directs Player $j$ to move the token to $\langle A,l+1\rangle$, in case $l<m$, and to $c_{i}$, in case $l=m$. Finally, the strategy $f^{\prime}_{2}$ agrees with $f_{2}$, except that from $q_{0}$, the strategy $f^{\prime}_{2}$ directs Player $2$ to move the token to $\langle A,1\rangle$, and for every $A^{\prime}\in{\cal S}_{m}$ such that $A^{\prime}\neq A$, the strategy $f^{\prime}_{2}$ directs Player $2$ to move the token from $\langle A^{\prime},i,\mbox{{\sc p}}\rangle$ to $s_{i,\min A^{\prime}\setminus A}$. Note that since both $A$ and $A^{\prime}$ are of size $m$, the set $A^{\prime}\setminus A$ is not empty, and thus $\min A^{\prime}\setminus A$ exists and is in $\{3,\ldots,k\}$. Let $\pi^{\prime}=\pi[C\leftarrow f^{\prime}_{C}]$. By the definition of the strategies in $\pi^{\prime}$, there exists $A^{\prime}\in{\cal S}_{m}$ such that $A^{\prime}\neq A$ and ${\sf outcome}(\pi^{\prime})=q_{0},\langle A,1\rangle,\langle A,2\rangle,\ldots,\langle A,m\rangle,c_{i},\langle A^{\prime},i,\mbox{{\sc p}}\rangle,(s_{i,\min A^{\prime}\setminus A})^{\omega}$. Thus, ${\sf Win}(\pi^{\prime})=[k]\setminus\{1,i,\min A^{\prime}\setminus A\}$. Since $1,i$, and $\min A^{\prime}\setminus A$ are all not in $A$, and thus also not in $C$, it follows that $C\subseteq{\sf Win}(\pi^{\prime})$.

Hence, the deviation to $f^{\prime}_{C}$ does not harm any member of $C$ while harming Player $1$ , contradicting the assumption that $\pi$ is SSE.
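As a sanity check of the last step, the following Python sketch verifies on a small instance that $C\subseteq{\sf Win}(\pi^{\prime})$ for every possible choice of $A^{\prime}\neq A$. It encodes only the winner set $[k]\setminus\{1,i,\min(A^{\prime}\setminus A)\}$ stated in the proof, not the game graph itself. The function names are hypothetical, and restricting $A^{\prime}$ to size-$m$ subsets of $\{3,\ldots,k\}\setminus\{i\}$ is an assumption made for the illustration.

```python
from itertools import combinations


def winners_after_deviation(k, i, A, A_prime):
    """Winner set [k] minus {1, i, min(A' minus A)}, as stated in the proof."""
    loser = min(A_prime - A)  # well defined: |A| = |A'| and A' != A
    return set(range(1, k + 1)) - {1, i, loser}


def coalition_all_win(k, m, i, A):
    """Check that C = A ∪ {2} is inside the winner set for every A' != A.

    Assumption: A' ranges over the size-m subsets of {3..k} minus {i}.
    """
    C = set(A) | {2}
    ground = sorted(set(range(3, k + 1)) - {i})
    for combo in combinations(ground, m):
        A_prime = frozenset(combo)
        if A_prime == A:
            continue
        if not C <= winners_after_deviation(k, i, A, A_prime):
            return False
    return True


# Illustration with k = 7, m = 2, i = 3, A = {5, 7}, A' = {4, 6}:
# min(A' minus A) = 4, so the winners are [7] minus {1, 3, 4}.
A = frozenset({5, 7})
print(sorted(winners_after_deviation(7, 3, A, frozenset({4, 6}))))  # [2, 5, 6, 7]
print(coalition_all_win(7, 2, 3, A))  # True
```

The check succeeds for the same reason as in the proof: the three excluded players $1$, $i$, and $\min(A^{\prime}\setminus A)$ all lie outside $A\cup\{2\}$.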

References

[1] S. Almagor, O. Kupferman, and G. Perelli. Synthesis of controllable Nash equilibria in quantitative objective games. In Proc. 27th Int. Joint Conf. on Artificial Intelligence, pages 35–41, 2018.

[2] E. Anshelevich, A. Dasgupta, J. Kleinberg, E. Tardos, T. Wexler, and T. Roughgarden. The price of stability for network design with fair cost allocation. In Proc. 45th IEEE Symp. on Foundations of Computer Science, pages 295–304. IEEE Computer Society, 2004. doi:10.1109/FOCS.2004.68.

[3] R. Aumann. Acceptable Points in General Cooperative $n$-Person Games. In Contributions to the Theory of Games, volume 4, 1959.

[4] R. Bloem, K. Chatterjee, and B. Jobstmann. Graph games and reactive synthesis. In Handbook of Model Checking, pages 921–962. Springer, 2018. doi:10.1007/978-3-319-10575-8_27.

[5] P. Bouyer, N. Fijalkow, M. Randour, and P. Vandenhove. How to play optimally for regular objectives? In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023), volume 261 of Leibniz International Proceedings in Informatics (LIPIcs), pages 118:1–118:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.ICALP.2023.118.

[6] R. Brenguier. Robust equilibria in mean-payoff games. In Proc. 19th Int. Conf. on Foundations of Software Science and Computation Structures, volume 9634 of Lecture Notes in Computer Science, pages 217–233. Springer, 2016. doi:10.1007/978-3-662-49630-5_13.

[7] L. Brice, J-F. Raskin, M. Sassolas, G. Scerri, and M. Bogaard. Pessimism of the will, optimism of the intellect: Fair protocols with malicious but rational agents. In IEEE 38th Computer Security Foundations Symposium, pages 33–47, 2025.

[8] T. Brihaye, A. Goeminne, J.C.A. Main, and M. Randour. Reachability games and friends: A journey through the lens of memory and complexity. In 43rd IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2023), volume 284 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1:1–1:26. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.FSTTCS.2023.1.

[9] V. Bruyère, C. Grandmont, and J-F. Raskin. As soon as possible but rationally. In 35th International Conference on Concurrency Theory (CONCUR 2024), volume 311 of Leibniz International Proceedings in Informatics (LIPIcs), pages 14:1–14:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.CONCUR.2024.14.

[10] V. Bruyère, J-F. Raskin, A. Reynouard, and M. Bogaard. The non-cooperative rational synthesis problem for SPEs and omega-regular objectives. In 36th International Conference on Concurrency Theory (CONCUR 2025), volume 348 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1–12:23. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025. doi:10.4230/LIPIcs.CONCUR.2025.12.

[11] A. Casares. On the minimisation of transition-based Rabin automata and the chromatic memory requirements of Muller conditions. In 30th EACSL Annual Conference on Computer Science Logic (CSL 2022), volume 216 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1–12:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPIcs.CSL.2022.12.

[12] K. Chatterjee, R. Majumdar, and M. Jurdzinski. On Nash equilibria in stochastic games. In Proc. 13th Annual Conf. of the European Association for Computer Science Logic, volume 3210 of Lecture Notes in Computer Science, pages 26–40. Springer, 2004. doi:10.1007/978-3-540-30124-0_6.

[13] R. Condurache, E. Filiot, R. Gentilini, and J.-F. Raskin. The complexity of rational synthesis. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55 of Leibniz International Proceedings in Informatics (LIPIcs), pages 121:1–121:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ICALP.2016.121.

[14] S. Dziembowski, M. Jurdzinski, and I. Walukiewicz. How much memory is needed to win infinite games. In Proc. 12th ACM/IEEE Symp. on Logic in Computer Science, pages 99–110, 1997.

[15] R. Ehlers. Symbolic bounded synthesis. In Proc. 22nd Int. Conf. on Computer Aided Verification, volume 6174 of Lecture Notes in Computer Science, pages 365–379. Springer, 2010. doi:10.1007/978-3-642-14295-6_33.

[16] E.A. Emerson and C.-L. Lei. Modalities for model checking: Branching time logic strikes back. Science of Computer Programming, 8:275–306, 1987. doi:10.1016/0167-6423(87)90036-0.

[17] E. Filiot, N. Jin, and J.-F. Raskin. An antichain algorithm for LTL realizability. In Proc. 21st Int. Conf. on Computer Aided Verification, volume 5643, pages 263–277, 2009. doi:10.1007/978-3-642-02658-4_22.

[18] D. Fisman, O. Kupferman, and Y. Lustig. Rational synthesis. In Proc. 16th Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, volume 6015 of Lecture Notes in Computer Science, pages 190–204. Springer, 2010. doi:10.1007/978-3-642-12002-2_16.

[19] E. Koutsoupias and C. Papadimitriou. Worst-case equilibria. Computer Science Review, 3(2):65–69, 2009. doi:10.1016/J.COSREV.2009.04.003.

[20] O. Kupferman, Y. Lustig, M.Y. Vardi, and M. Yannakakis. Temporal synthesis for bounded systems and environments. In Proc. 28th Symp. on Theoretical Aspects of Computer Science, pages 615–626, 2011.

[21] O. Kupferman, G. Perelli, and M.Y. Vardi. Synthesis with rational environments. Annals of Mathematics and Artificial Intelligence, 78(1):3–20, 2016. doi:10.1007/S10472-016-9508-8.

[22] O. Kupferman and N. Shenwald. Games with trading of control. In 34th International Conference on Concurrency Theory (CONCUR 2023), volume 279 of Leibniz International Proceedings in Informatics (LIPIcs), pages 19:1–19:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPIcs.CONCUR.2023.19.

[23] O. Kupferman and N. Shenwald. Positional-Player Games. In 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025), volume 345 of Leibniz International Proceedings in Informatics (LIPIcs), pages 64:1–64:19. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025. doi:10.4230/LIPIcs.MFCS.2025.64.

[24] J.C.A. Main. Arena-independent memory bounds for Nash equilibria in reachability games. In 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024), volume 289 of Leibniz International Proceedings in Informatics (LIPIcs), pages 50:1–50:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.STACS.2024.50.

[25] D.A. Martin. Borel determinacy. Annals of Mathematics, 102(2):363–371, 1975.

[26] J.F. Nash. Equilibrium points in $n$-person games. In Proceedings of the National Academy of Sciences of the United States of America, 1950.

[27] C.H. Papadimitriou. Algorithms, games, and the internet. In Proc. 33rd ACM Symp. on Theory of Computing, pages 749–753, 2001.

[28] A. Pnueli and R. Rosner. On the synthesis of a reactive module. In Proc. 16th ACM Symp. on Principles of Programming Languages, pages 179–190, 1989.

[29] S. Schewe and B. Finkbeiner. Bounded synthesis. In 5th Int. Symp. on Automated Technology for Verification and Analysis, volume 4762 of Lecture Notes in Computer Science, pages 474–488. Springer, 2007. doi:10.1007/978-3-540-75596-8_33.

[30] W. Thomas. On the synthesis of strategies in infinite games. In Proc. 12th Symp. on Theoretical Aspects of Computer Science, volume 900 of Lecture Notes in Computer Science, pages 1–13. Springer, 1995. doi:10.1007/3-540-59042-0_57.

[31] M. Ummels. The complexity of Nash equilibria in infinite multiplayer games. In Proc. 11th Int. Conf. on Foundations of Software Science and Computation Structures, pages 20–34, 2008.

[32] J. von Neumann and O. Morgenstern. Theory of games and economic behavior. Princeton University Press, 1953.

[33] M. Wooldridge, J. Gutierrez, P. Harrenstein, E. Marchioni, G. Perelli, and A. Toumi. Rational verification: From model checking to equilibrium checking. In Proc. of 30th Conf. on Artificial Intelligence, pages 4184–4190, 2016.

A.1 On strategies for Player 1 in a $1$-bounded NE-CRS solution