Games with -Automatic Preference Relations
Abstract
This paper investigates Nash equilibria (NEs) in multi-player turn-based games on graphs, where player preferences are modeled as -automatic relations via deterministic parity automata. Unlike much of the existing literature, which focuses on specific reward functions, our results apply to any preference relation definable by an -automatic relation. We analyze the computational complexity of determining the existence of an NE (possibly under some constraints), verifying whether a given strategy profile forms an NE, and checking whether a specific outcome can be realized by an NE. When a (constrained) NE exists, we show that there always exists one with finite-memory strategies. Finally, we explore fundamental properties of -automatic relations and their implications in the existence of equilibria.
Keywords and phrases:
Games played on graphs, Nash equilibrium, -automatic relations, -recognizable relations, constrained Nash equilibria existence problemFunding:
Jean-François Raskin: Supported by Fondation ULB (https://www.fondationulb.be/en/) and the Thelam Fondation.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Automata over infinite objects ; Software and its engineering Formal methods ; Theory of computation Solution concepts in game theory ; Theory of computation Exact and approximate computation of equilibriaFunding:
This work has been supported by the Fonds de la Recherche Scientifique – FNRS under Grant n° T.0023.22 (PDR Rational).Editors:
Paweł Gawrychowski, Filip Mazowiecki, and Michał SkrzypczakSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Non-zero-sum games on graphs provide a powerful framework for analyzing rational behavior in multi-agent systems, see, e.g., [9, 10, 14, 22, 27, 30, 33]. By modeling settings where agents have individual objectives, this approach captures the complexity of real-world scenarios where the interests of agents (modeled by players) are neither fully aligned nor entirely antagonistic. It enables the study of solution concepts such as Nash and subgame-perfect equilibria [36, 37], offering insight into strategic decision making. This, in turn, can aid in designing systems that anticipate and respond to rational behaviors, enriching reactive synthesis methodologies.
In this context, specifying player objectives [26] is central to reasoning about strategies and equilibria. In qualitative games, objectives determine whether an execution (an infinite path in the graph) is winning or losing for a given player. In quantitative games, executions are instead assigned numerical values, allowing players to compare and rank them based on accumulated rewards, with higher values being preferable. From this perspective, the qualitative setting can be viewed as a special case where the values are Boolean, typically captured by parity acceptance conditions, which encompass all -regular objectives. In the quantitative setting, a variety of reward functions have been explored, including total sum, limsup and liminf, discounted-sum, and hybrid models such as cost-optimal reachability. For each of these functions, dedicated techniques have been developed to design algorithms that analyze optimal strategies and, more broadly, equilibria.
However, these solutions are often tightly coupled to the specific reward function used, which limits their generality. When a new reward function or combination thereof is introduced, significant technical effort is required, as existing techniques rarely transfer across different reward models. This lack of general results – where solutions remain specialized to the underlying evaluation model, preventing knowledge transfer between different classes of objectives – has been noted in related contexts such as quantitative verification (see, e.g., [2]).
To address this, we propose a general approach based on automata-based preference relations to compare infinite paths in the graph. This framework provides a structured and unified method for reasoning about strategies and equilibria across various reward models. A similar use of automata-based preference relations has been explored in [2, 4, 8], and here we demonstrate how this idea can be adapted to fit the non-zero-sum game setting.
Contributions.
Our contributions center on using -automatic relations on infinite words [41], as introduced in [23], to define a general framework for preference relations over paths in game graphs, thereby establishing a generic method to compare executions for players in non-zero-sum games. These relations are specified by deterministic parity automata that read pairs of words synchronously and accept them whenever is preferred to .
Our main contributions focus on the computational complexity of four key problems related to NEs in non-zero-sum games [36] with -automatic preference relations. First, we study the problem of verifying whether a given strategy profile, specified by Mealy machines, one per strategy, constitutes an NE in the given game. We prove that this problem is -complete (Theorem 3). Second, we examine whether a lasso-shaped path (i.e., a regular path) is the outcome of an NE, showing that this problem is in and -hard (Theorem 4). Third, we establish the existence of games without any NE, motivating the fundamental problem of determining whether a given game admits at least one NE. This problem turned out to be particularly challenging, and we reduce it to a three-player zero-sum game with imperfect information. We provide an algorithm for solving this problem with exponential complexity in the size of the graph, the parity automata defining the preference relation, and the number of their priorities, and doubly exponential complexity in the number of players. However, since the number of players is a natural parameter that tends to be small in practical scenarios,111In robotic systems or in security protocols, the number of agents is usually limited to a few. For example, in a security protocol, the players are Alice and Bob who exchange messages, the trusted third party, and a fourth player for the network (see, e.g., [19, 29]) we refine this result by proving that for a fixed number of players, the problem lies in and is -hard (Theorem 5). In addition, our approach has the advantage of being modular and therefore easily adapts to question the existence of a constrained NE. When we attach one constraint to each player given as a lasso-shaped path and ask for an NE whose outcome is preferred to any of those constraints, the adapted algorithm keeps the same complexity except that it becomes doubly exponential in the number of priorities of the parity conditions. Yet the number of priorities is often small222important classes of objectives such as Büchi, co-Büchi, reachability, and safety require at most three priorities and when we fix it and the number of players, the algorithm remains in and -hard (Theorem 6). Note that our approach allows to show that when there exists an (constrained) NE, there exists one composed of finite-memory strategies.
Additionally, we analyze the algorithmic complexity of verifying whether an -automatic relation satisfies the axioms of a strict partial order (irreflexivity and transitivity) or of a preorder (reflexivity and transitivity) which are two classical requirements for a relation to model preferences. We show that these problems are -complete (Proposition 9). Finally, we show that when the -automatic preference relations are all -recognizable (a strict subclass of -automatic relations where the two input words can be processed independently) and preorders, the existence of at least one NE is always guaranteed (Theorem 11).
Related work.
A well-established hierarchy of rational relations holds for both finite and infinite words [17, 40]. The -automatic relations – also called synchronized -rational relations – were first studied in [23]. Some decision problems about -automatic and -recognizable relations were solved in [35] and improved in [3]. The study of automatic structures has also led to results involving rational relations, notably within first-order logic (see, e.g., [5, 25, 31, 39]).
The problems we study in this paper were widely investigated in the literature for specific reward functions, including functions that mix different objectives, see, e.g., [6, 28, 42, 43]. There are also works that study these problems across large classes of reward functions rather than individual ones, or that consider general notions of preference relations. For instance, in [12], the authors prove the existence of finite-memory NEs for all cost-semi-linear reward functions. In [7], a complete methodology is developed to solve the (constrained) NE existence problem, thanks to the concept of suspect game, encompassing all reward functions definable by a class of monotone circuits over the set of states that appear (finitely or infinitely often) along paths in a game graph. The preference relations studied in [7] are all -automatic. In [20], the authors study NEs for games with a reward function that, given a finite set of objectives of the same type, associates an integer with each subset of satisfied objectives of . Again, if the objectives of are -regular, the reward functions of [20] lead to -automatic preference relations. The existence of NEs is guaranteed within a broad setting, both in [27] and [34], without relying on an automata-based approach, however with no complexity result about the constrained NE existence problem. In case of games with -recognizable preference relations, our proof that NEs always exist relies on the technique developed in [27].
The results we obtain with games with -recognizable preference relations cover a large part of the games studied classically. In addition, our setting allows any combinations of objectives as soon as they are expressible by automata. However, it does not cover games with mean-payoff or energy objectives. Indeed, in the first case, it is proved in [2] that the related preference relation is not -automatic; and in the second case, the constrained NE existence problem is undecidable [11]. Note that the general concepts of -automatic and -recognizable relations have also been used to study imperfect information in games in [4, 8] and formal verification of quantitative systems in [2].
2 Preliminaries
In this section, we introduce the useful definitions of games with -automatic preference relations and give several illustrative examples.
Automatic Relations.
Let be a fixed finite alphabet. We consider binary relations on infinite words over . The relation is -automatic if it is accepted by a deterministic finite parity automaton over the alphabet , that is, is an -regular language over . The automaton reads pairs of letters by advancing synchronously on the two words. This behavior is illustrated in Figure 1 below. A relation is -recognizable if it is equal to where are -regular languages over [35]. Any -recognizable relation is -automatic [40].
We suppose that the reader is familiar with the usual notion of deterministic parity automaton (DPA) used to accept -automatic relations [35]. A run is accepting if the maximum priority seen infinitely often is even. In this paper, we also use other classical notions of automata: deterministic Büchi automata (DBA) and Rabin automata. See, e.g., [26] for general definitions, or [1, 32] for deeper details. We also need the concept of generalized parity automaton which is an automaton with a positive333The negation is not allowed in the Boolean combination. Boolean combination of parity conditions. Given an automaton , its size is its number of states.
Games with Preference Relations.
An arena is a tuple where is a finite set of vertices, is a set of edges, is a finite set of players, and is a partition of , where is the set of vertices owned by player . We assume, w.l.o.g., that each has at least one successor, i.e., there exists such that . We define a play (resp. a history ) as an infinite (resp. finite) sequence of vertices such that for any two consecutive vertices . The set of all plays of an arena is denoted , and we write when the context is clear. The length of a history is the number of its vertices. The empty history is denoted .
A game is an arena equipped with -automatic relations over the alphabet , one for each player , called his preference relation. For any two plays , player prefers to if . In the sequel, we write instead of and for all , , or , instead of . We also say that is maximal (resp. minimal) for if for all , we have (resp. ). Below we give various examples of games whose preference relations are all strict partial orders. At this stage, is just an -automatic relation without any additional hypotheses. Such hypotheses will be discussed in Section 6.
Given a play and an index , we write the suffix of . We denote the first vertex of by . These notations are naturally adapted to histories. We also write for the last vertex of a history . We can concatenate two non-empty histories and into a single one, denoted or if . When a history can be concatenated to itself, we call it cycle. A play , where is a history and a cycle, is called a lasso. The length of is then the length of , denoted .444To have a well-defined length for a lasso , we assume that with of minimal length.
Let be an arena. A strategy for player maps any history to a successor of , which is the next vertex that player chooses to move after reaching the last vertex in . A play is consistent with if for all such that . Consistency is naturally extended to histories. A tuple of strategies with a strategy for player , is called a strategy profile. The play starting from an initial vertex and consistent with each is denoted by and called outcome.
A strategy for player is finite-memory [26] if it can be encoded by a Mealy machine where is the finite set of memory states, is the initial memory state, is the update function, and is the next-move function. Such a machine defines the strategy such that for all histories , where extends to histories as expected. A strategy is memoryless if it is encoded by a Mealy machine with only one state.
Generality of the -Automatic Preference Framework.
Let us show that the above notion of game encompasses many cases of classic games and more. We begin with games where each player has an -regular objective , such as a reachability or a Büchi objective [21, 26]. In this case, the preference relation is defined by if and only if , where is seen as a function . As is -regular, it follows that is -automatic. For instance, given a target set , the first DPA of Figure 1 accepts when is a reachability objective ; the second DPA accepts when is a Büchi objective , where is the set of vertices seen infinitely many times in .
More general preference relations can be defined from several -regular objectives for player . With each is associated the payoff vector . Given a strict partial order on these payoff vectors, we define a preference relation such that if and only if [7]. There exist several strict partial orders on the payoff vectors, like, for example, the lexicographic order, or the counting order, i.e., if and only if . One can check that all preference relations studied in [7] are -automatic.
Let us move on to classical quantitative objectives, such as quantitative reachability, limsup or discounted-sum objectives [21, 26]. In this case, an objective for player is now a function .555It can also be a function . We then define a preference relation such that if and only if . Bansal et al. showed in [2] that such a relation is -automatic for a limsup objective and for a discounted-sum objective with an integer discount factor. They also proved that is not -automatic for a mean-payoff objective. The first DPA of Figure 1 where the label on the loop on the vertex with priority is replaced by , accepts a preference relation defined from a min-cost-reachability objective as follows: if and only if there exists such that and, for all , (player prefers plays with fewer steps to reach the target set ). The variant where player prefers to maximize the number of steps to reach ,666as each step corresponds to a reward. called max-reward-reachability, is accepted by the third DPA in Figure 1.
Hence, there are many ways to envision -automatic relations. Note that in our framework, the preference relations of a game can vary from one player to another, where each relation can be defined from a combination of several objectives (see Example 1 below).
3 Decision Problems about Nash Equilibria
In this section, we state the decision problems studied in this paper and we provide our main results regarding their complexity classes.
Studied Problems.
A Nash Equilibrium (NE) from an initial vertex is a strategy profile such that for all players and all strategies of player , we have , where denotes . So, NEs are strategy profiles where no single-player has an incentive to unilaterally deviate from his strategy. When there exists a strategy such that , we say that (or, by notation abuse, ) is a profitable deviation for player . The set of players is called coalition , sometimes seen as one player opposed to player .
Example 1.
Let us illustrate the NE definition with two examples. We consider the two-player arena depicted in Figure 2 such that player owns only and player owns all other vertices. The preference relation for player is defined from a min-cost-reachability objective with a target set . The preference relation for player is defined from a Büchi objective with a target set . Let us consider the strategy profile defined by two memoryless strategies such that and .777As and have only one successor, the strategy is trivially defined for those vertices. It is an NE from the initial vertex with outcome . Player has no profitable deviation if player sticks on his strategy : it is not possible to visit vertex . Player has also no profitable deviation. There exists another NE from such that
-
if the history visits , and to otherwise,
-
is the memoryless strategy such that .
In that case, the NE outcome is . Note that both players prefer the second NE as for .
Let us slightly modify the relation of player such that is defined from a lexicographic order using two objectives: a min-cost-reachability objective with and a Büchi objective with . We have if and only if () or ( and ). If we consider the two previous strategy profiles, is still an NE, but is no longer an NE as player has a profitable deviation. Indeed, with the memoryless strategy such that , we get .
Example 2.
In this example, we show that there does not always exist an NE in games with -automatic preference relations. Consider the simple one-player game with two vertices , the edges , and whose preference relation is defined from a max-reward-reachability objective with a target set . This game has no NE from the initial vertex .888A similar example is given in [34]. Indeed, if the strategy of player is to loop on , then he has a profitable deviation by going to at some point, and if his strategy is to loop times in and then go to , then he has a profitable deviation by looping one more time in before going to .
In this paper, we investigate the following problems.
Problems.
-
The NE checking problem is to decide, given a game , an initial vertex , and a strategy profile where each strategy is defined by a Mealy machine , whether is an NE from in .
-
The NE outcome checking problem is to decide, given a game and a lasso , whether is the outcome of an NE in .
-
The NE existence problem is to decide, given a game and an initial vertex , whether there exists an NE from in .
-
The constrained NE existence problem is to decide, given a game , an initial vertex , and a lasso for each player , whether there exists an NE from in with an outcome such that for all players .
Main Results.
Let us state our main results. We consider games on the arena , where each preference relation is -automatic. We denote by the DPA accepting and by its set of priorities. We say that a problem is -hard if there exists a polynomial reduction from the problem of deciding the winner of a two-player zero-sum parity game to .
Theorem 3.
The NE checking problem is -complete.
Theorem 4.
The NE outcome checking problem is in and -hard.
Theorem 5.
The NE existence problem is exponential in , , and , thus doubly exponential in . If the number of players is fixed (resp. for a one-player game), this problem is in and -hard (resp. -complete).
Theorem 6.
The constrained NE existence problem, with the constraints given by lassoes , is exponential in , , , and doubly exponential in , thus also doubly exponential in . If the number of players and each are fixed (resp. for a one-player game), this problem is in and -hard (resp. -complete).
The proofs of these theorems are detailed in the next two sections. In Section 7, we reconsider the studied problems in the special case of games with -recognizable relations.
4 NE Checking and NE Outcome Checking Problems
We first prove Theorem 3, stating the -completeness of the NE checking problem. The hardness is limited to a sketch of proof, the full technical details are given in the long version of this paper [15].
Proof of Theorem 3.
We begin with the membership result. Given the Mealy machines , , and the strategies they define, we have to check whether is an NE from a given initial vertex . Equivalently, we have to check whether there exists a strategy for some player such that (in which case is not an NE). That is, whether there exists such that the language
is non-empty. We are going to describe a generalized DPA , with a conjunction of three parity conditions, that accepts . We proceed as follows.
-
1.
The set is accepted by the given DPA that accepts .
-
2.
The outcome is a lasso obtained from the product of the arena and all . We can define a DPA, of size exponential in the number of players, that only accepts .
-
3.
Finally, consider the product of the arena with all , with . We denote by the set of vertices of , where each vertex is of the form , with and a memory state of . The set of plays consistent with and starting at is accepted by a DPA whose set of states is with , a new state, its initial state, all those states with priority , and whose transition function is such that for , and . Note that is a function as each is deterministic and that this DPA is of exponential size in the number of players.
The announced automaton is the product of the automata defined in the previous steps. It has exponential size and can be constructed on the fly, hence leading to a algorithm. Indeed, to check whether is non-empty, we guess a lasso and its exponential length, and check whether the guessed lasso is accepted by . This only requires a polynomial space as the lasso is guessed on the fly, state by state, while computing the maximum priority occurring in for each priority function, and the length is stored in binary. Finally, we repeat this procedure for each automaton , .
For the -hardness, we use a reduction from the membership problem for linear bounded deterministic Turing machines (LBTMs), known to be -complete [24], to the complement of the NE checking problem. Recall that an LBTM has a limited memory such that the tape head must remain in the cells that contain the input word .
We give only a sketch of proof. First, let us show how we encode any configuration of the LBTM. For the current word written on the tape, we associate one player per cell, and we say that the letter in the -th cell, , is the current memory state of the Mealy machine of player . Then we define an arena where each vertex is of the form , for a state of and the current position of the tape head, and such that player owns all the vertices . Second, we simulate transitions of the LBTM with the Mealy machines: can describe the next vertex according to its memory state. For example, from vertex and memory state for player , moves to vertex and updates its memory state to if the LBTM says that from state and letter , the tape head must write and go right, and that the next state is .
This construction allows us to completely simulate the LBTM with an arena, described in Figure 3. We add an extra player who decides whether to let the other players follow their Mealy machine to simulate the LBTM on the given word, or go to a sink state . With his preference relation , player prefers a play visiting a vertex , for any , to any other play. His Mealy machine goes from to . Thus, the strategy profile given by all Mealy machines is not an NE if and only if it is profitable for player to let the other players simulate the LBTM on , i.e., this simulation visits .
Let us now prove Theorem 4 stating the complexity of the NE outcome checking problem.
Proof of Theorem 4.
Let us begin with the membership result. Given a lasso starting at , checking whether is an NE outcome amounts to finding a strategy profile with outcome such that for all and all strategies , we have . In other words, given a strategy profile partially defined such that , our goal is to check whether, for all , there exists that extends this partially defined profile such that for all , . For this purpose, we explain the algorithm in for one given player , and then repeat it for the other players.
Let us consider . This set is accepted by a DPA constructed as the product of the complement of and the lasso . Clearly, the size of is polynomially bounded in the sizes of and . So, contains all the deviations that are not profitable for player compared to . Now, it suffices to decide whether the coalition has a strategy against player to ensure that every play consistent with lies in . As is accepted by the DPA , this amounts to solving a zero-sum parity game (of polynomial size) defined directly from . The details are as follows.
Suppose that has a set of states, an initial state , and a transition function . Let us define the game , where the two players are and . Its set of vertices is the Cartesian product , such that player (resp. player ) controls the vertices with (resp. ). In other words, has the role of player while has the role of the coalition . As is deterministic, it is seen as an observer, and its states are information added to the vertices of . Hence, the edges of are of the form such that and . We define a parity objective for player as follows: the priority of each vertex of is equal to the priority of in . Consequently, a play in is won by player if and only if the projection on its first component belongs to . For the constructed game , from every vertex , we can decide in which player wins in together with a memoryless winning strategy for that player [26].
To obtain an algorithm in , it remains to check whether , seen as a lasso in , only crosses vertices that are winning for player whenever . Indeed, in this case, we can deduce from a winning strategy from for player , a strategy for the coalition such that for all , . Similarly, to obtain an algorithm in , we check whether in crosses at least one vertex that is winning for player and deduce a winning strategy for player .
We continue with the hardness result, with a reduction from the problem of deciding whether player has a winning strategy in a zero-sum parity game. We reduce this problem to the complement of the NE outcome checking problem, to establish its -hardness. Let be a parity game with players and , an arena with as set of vertices, an initial vertex , and a priority function . We construct a new game with the same players, whose arena is a copy of with an additional vertex owned by player , with and itself as successors (see Figure 5). Given , the preference relation is empty, accepted by a one-state DPA, while the preference relation is defined as follows: if and only if and , with and is a play in starting at and satisfying the parity condition . A DPA accepting is depicted in Figure 5, it is constructed with a copy of the arena and a new state with priority 1.
The proposed reduction is correct. Indeed, suppose that is not an NE outcome. As is empty, there cannot be profitable deviations for player . This means that for each strategy profile with outcome , there exists a deviating strategy of player such that with . Thus, by definition of , is equal to with a winning play in . Hence, transferred to , we get that for each strategy of player , there exists a strategy of player such that is winning. By determinacy, player has thus a winning strategy in from . The other direction is proved similarly, if player has a winning strategy in , then transferring this strategy to gives a profitable deviation from a strategy with outcome .
5 NE Existence and Constrained NE Existence Problems
This section is devoted to the NE existence problem and its constrained variant. We mainly focus on the NE existence problem and explain at the end of the section how to take into account the constraints imposed on the NE outcome.
To solve the NE existence problem, we adapt a recent approach proposed in [16]. The idea is to reduce our problem to solving a three-player game with imperfect information.999Although the underlying game structure of [16] is reused, the player roles and correctness arguments differ entirely. Let us first give some intuition (see also Figure 5) and then the formal definition. We use a reduction to a game with three players: two Provers and and one Challenger . The two Provers aim to build an NE outcome while Challenger contests that it is an NE outcome: has the task of building edge by edge, while has the task of showing that the deviation of player proposed by is not profitable, i.e., . We need two Provers (we cannot use a two-player zero-sum game), as the construction of cannot depend on one specific deviation and must be fixed, i.e., its construction cannot change according to the deviation to artificially force . This also means that has to build without knowing when deviates: he has partial observation of the game, while and have perfect information. This game, called game, is articulated in two parts. The first part consists of vertices where does not deviate, where an action of is to suggest an edge to extend the current construction of , and an action of is either to accept it or to deviate from by choosing another edge with . Such a deviation corresponds to a deviation by the player who owns , leading to the second part of the game. In this part, the vertices must retain the construction of the play , the construction of the deviation , and the component to identify the player who deviated: continues to propose an extension for with no interaction with , and and , representing respectively the deviating player and the opposed coalition , interact to construct . When the game stays in the first part, the objective of is to produce an NE outcome , and if it goes in the second part, has the same goal and the aim of is to retaliate on the deviations proposed by to guarantee that is not a profitable deviation. Hence, the vertices of the game also store the current states of the DPAs accepting the preference relations, in a way to compare the outcome with the deviation .
We now proceed to the formal definition of the game. Suppose that we are given a game with and as the initial vertex, and each relation accepted by a DPA . We denote each automaton as with its set of states, its initial state, its alphabet, its transition function, and its priority function. The game
is a three-player game with partial observation for , defined as follows.
-
The set of vertices are of the form or such that , , , and . Coming back to the intuition given above, is the current vertex of , is the deviating player (or if did not deviate yet), is the current vertex of (if it exists, otherwise ), is the current state of while comparing and .
Given that we are looking for an NE in from some initial vertex , we consider the initial vertex in the game.
-
The set is partitioned as such that is composed of the vertices , is composed of the vertices such that either and , or and , and is composed of the vertices such that and .
-
The set of actions101010We introduce actions in a way to easily define the transition function . is, respectively for each player, equal to: ( chooses an edge to extend the current construction of ) and ( and choose the next vertex of in case deviates, otherwise accepts the vertex of the edge proposed by .)
-
The transition function is defined as follows:
-
–
for : for each and each , we have ,
-
–
for who has not yet deviated: for each and each with , we have either and (which means that accepts the edge proposed by ), or , , and (which means that starts deviating), with , for each , in both cases, i.e., the states of the DPAs are updated.
-
–
for who has deviated and : for each and each with , we have either and thus , or and thus , and in both cases, with , for each .
-
–
-
The observation function for 111111Recall that and have total observation of the game. is such that and . When and , we consider that cannot distinguish and . Hence, can only observe the vertices of the initial game and the edges that he proposes. We naturally extend to histories and plays of the game by applying the observation function on each of their vertices.
-
To complete the definition of the game, it remains to define the winning condition . Let us introduce some notation. Given a vertex , we denote by (resp. , ) the projection on its first (resp. second, third) component. For a vertex , we denote by this last component of . Note that if , then , and if , then . Given a play of the game starting at the initial vertex , is an alternation of vertices of and vertices of . Moreover, looking at the first (resp. third) components of the vertices of , each such component is repeated from one vertex to the next one. Thus, we denote by the projection on the first component of the vertices of . Similarly, we use notation for the projection on the third component. We also define the notation for the projection of on the last component of its vertices. Note that and . In the play , either the second component always remains equal to or ultimately becomes equal to some . We use notation to denote this value or . All these notations are also used for histories.
The set is defined as where is the set of plays where always agreed with and is the set of plays where deviated but was able to show that this deviation is not profitable, i.e.,
-
–
,
-
–
.
This set is the winning condition for both and while has the complementary winning condition .
-
–
The next theorem states how the game helps to solve the NE existence problem. A strategy of is observation-based if for all histories ending in a vertex of such that , we have .
Theorem 7.
The following statements are equivalent:
-
In , there exists an NE from ,
-
In , there exists an observation-based strategy of such that for all strategies of , there is a strategy of such that .
Theorem 7 is the key tool to solve the NE existence problem. It is proved in detail in the long version of this paper [15]. We give hereafter a sketch of proof for the membership result of Theorem 5, which follows the approach proposed in [16]. The -hardness already holds for one-player games, with a reduction from the existence of a maximal element in a relation , which is a -complete problem and close to the existence of NEs in one-player games. All details are given in the long version [15]
Sketch of Proof of Theorem 5, Membership.
By Theorem 7, deciding whether there exists an NE from in reduces to deciding whether there exists an observation-based strategy of in such that for all strategies of , there is a strategy of such that . In [16], the authors solve the problem they study by solving a similar three-player game with imperfect information. They proceed at follows: (i) the winning condition is translated into a Rabin condition121212Recall that a Rabin condition uses a finite set of pairs in a way to accept plays such that there exists with and . on the arena of the game, (ii) the three-player game is transformed into a two-player zero-sum Rabin game with imperfect information, and finally (iii) classical techniques to remove imperfect information are used to obtain a two-player zero-sum parity game with perfect information.
In this sketch of proof, we only explain the first step, i.e., how to translate into a Rabin condition, as the second and third steps heavily use the arguments of [16]. To translate , we use one pair such that and . To translate , notice that is equivalent to , and thus . Recall that each relation is accepted by the DPA with the priority function , thus also with the modified priority function . Therefore, can be translated into a Rabin condition on the vertices of with Rabin pairs [32]. Steps (ii) and (iii) are detailed in the long version [15], leading to the announced complexity: the NE existence problem is exponential in , , and .
Let us finally comment on Theorem 6 stating the complexity class of the constrained NE existence problem. The detailed proof is presented in the long version of this paper [15]. The approach to proving membership is very similar to that of the NE existence problem, as we only need to modify in a way to include the constraints imposed on the NE outcome. A constraint imposed by a lasso can be represented by a DPA accepting the language , with a polynomial size . Then it suffices to extend the arena the game with the states of each . The hardness result is obtained by a reduction from the NE existence problem, already for one-player games.
Note that by steps (i)-(iii), solving the (constrained) NE existence problem is equivalent to solving a zero-sum parity game with memoryless winning strategies for both players. Therefore, we get the following property:
Corollary 8.
If there exists a (constrained) NE, then there exists one with finite-memory strategies.
There is a great interest in using the concept of game, as it provides a unified approach to solve the NE existence problem and its constrained variant. With this approach, we could also decide the existence of an NE whose outcome satisfies various combinations of constraints, such as, e.g., for one or several players . The chosen constraints only impact the winning condition and thus its translation into a Rabin condition.
6 Hypotheses on Preference Relations
In the previous sections, we presented several decision algorithms. Since the players’ relations are intended to formalize how they prefer one play to another, we may naturally expect them to satisfy certain properties, such as irreflexivity and transitivity. However, since the relation is accepted by a DPA, its structure can be intricate, and it becomes relevant to verify whether such properties hold. In this section, we address decision problems related to the algorithmic verification of properties of . We also explore alternative approaches to modeling preferences between plays, focusing in particular on cases where the DPA accepts a non-strict preference relation , i.e., a preorder, rather than a strict partial order.
Hypotheses on Relations.
Given a relation , it is:
-
reflexive (resp. irreflexive) if for all , we have (resp. ),
-
transitive (resp. -transitive) if for all , we have (resp. ),
-
total if for all , we have ,
A reflexive and transitive relation is a preorder. An irreflexive and transitive relation is a strict partial order. When, in addition, a strict partial order is -transitive, it is a strict weak order. The next proposition states that all the relevant properties mentioned above can be efficiently verified on the DPA accepting the relation . It is proved in the long version [15].
Proposition 9.
The problem of deciding whether an -automatic relation is reflexive (resp. irreflexive, transitive, -transitive, total) is -complete.
Variants on our Setting.
Let us first observe that all the lower bounds established for the decision problems about NEs remain valid even when the players’ preference relations are assumed to be strict partial orders.131313The upper bounds do not need more than the -automaticity of each This implies that taking this additional property into account does not yield any important advantage in terms of complexity class.
We now consider an alternative setting in which each relation is not a strict preference, but rather a preorder (where some plays are declared equivalent). To discuss this further, let us recall the relationship between strict partial orders and preorders. From a strict partial order , one can obtain a preorder by taking its reflexive closure, i.e., if or . When is a strict weak order, another preorder which is total, is obtained by defining if . In both cases, if is -automatic, is also -automatic (resp. by a generalized DPA with a disjunction of two parity conditions, or by a DPA). Conversely, from any preorder , we can define a strict partial order such that if . Every strict partial order can be constructed this way. Moreover, if is total, then is a strict weak order. We can also define the equivalence relation such that if . The equivalence class of is denoted . Again, if is -automatic, then and are both -automatic (by a generalized DPA with a conjunction of two parity conditions).
For games with preorders , we keep the same upper bounds, except that we only have an membership for Theorem 4. Indeed, it requires solving a generalized parity game with a disjunction of two parity conditions (instead of a parity game), solvable in [18]. All lower bounds remain valid by carefully modifying the preference relations used in the reductions into preorders, see the long version [15].
Finally, recall that given a lasso , it is easy to construct an automaton that accepts all plays related to according to an -automatic preference relation. This set being -regular, all standard verification techniques for -regular languages can be applied. For example, one may wish to verify that all plays preferred to satisfy a given -regular property. In such cases, the full range of verification algorithms developed for -regular languages can be used.
Alternative Definition of NE.
Given a game with preorders , an NE is a strategy profile such that for all players and all strategies of player , we have . An alternative definition asks for all and that [7]. The two definitions yield different notions of NE (unless all are total). In this paper, we do not consider the second definition, due to the nonexistence of NEs in very simple games. Let us consider the one-player game depicted in Figure 6, where, from the initial vertex , player has the choice between and . We consider the preorder equal to . Clearly, and , showing that there is no NE from for this alternative definition (while and are both NE outcomes with the first definition). This phenomenon appears as soon as there are two incomparable plays.
7 -Recognizable Relations
In this section, we suppose that we have a game whose relations are -recognizable and preorders. We recall that is -recognizable if it is of the form where are -regular languages over . Any -recognizable relation is -automatic (see [40]), and deciding whether an -automatic relation accepted by a DPA is -recognizable is -complete [3]. For each , we use the related relations and as defined in the previous section.
In Example 2, we presented a one-player game with no NE. The reason for the absence of NE is that has an unbounded infinite ascending chain. This situation cannot happen for -recognizable preorders, as highlighted in the next proposition, easily derived from [35] (its proof is given in the long version [15]). This motivates the interest of games with -recognizable preference relations.
Proposition 10.
An -automatic preorder is -recognizable if and only if its induced equivalence relation has finite index.
Thanks to this result, we can partition as a finite lattice given by a partial order induced by on the equivalence classes of . In particular, there always exists a maximal (resp. minimal) element in this lattice. Examples of -recognizable preorders are numerous: those deriving from any Boolean combination of -regular objectives or any multidimensional objective where each dimension is defined using an -regular objective. In the subclass of games with -recognizable preorders, the main difference is the existence of NEs.
Theorem 11.
When the preference relations of a game are all -recognizable preorders, then there always exists an NE composed of finite-memory strategies.
The proof of Theorem 11 requires two steps. We first prove the existence of an NE under the assumption that each preference relation is a total preorder and then without this assumption. (Note that we get an NE composed of finite-memory strategies by Corollary 8). The first step can be obtained as a corollary of [34, Theorem 15] that guarantees the existence of an NE in the case of strict weak orders . Recall that the relation induced by a preorder is a strict weak order if is total. Nevertheless, we provide a proof of this first step in the long version of this paper [15], inspired by the work of [27] and [12], where the existence of NEs is studied through the concept of value and optimal strategy (see below and in the long version [15]).
The second step is obtained thanks to an embedding of partial preorders into total preorders, as described in the next proposition. Theorem 11 easily follows (see the long version [15]).
Proposition 12.
Any -recognizable preorder can be embedded into an -recognizable total preorder . Moreover, for all , if , then , for .
We now focus on prefix-independent relations , such that for all , . From our proof of Theorem 11, when the relations are total and prefix-independent, we can derive the following characterization of NE outcomes in terms of values (the proof is given in the long version [15]). In this context, for each player and vertex of , there always exists a value (which is an equivalence class of ) and optimal strategies for player and for the coalition such that (resp. ) ensures consistent plays starting at such that (resp. ) (see the long version [15]). Such an NE characterization is well-known for games with classical objectives (see, e.g., the survey [13]).
Theorem 13.
Let be a game such that each preference relation is an -recognizable preorder, total, and prefix-independent. Then a play is an NE outcome if and only if for all vertices of , if , then .
In this theorem, we can weaken the hypothesis of prefix-independency into prefix-linearity. A relation is prefix-linear if, for all and , implies . In that case, the condition in Theorem 13 must be replaced by . Moreover, deciding whether a relation is prefix-independent (resp. prefix-linear) is -complete. All proofs and details are provided in the long version [15].
8 Conclusion
In this work, we have introduced a general framework for defining players’ preferences via -automatic preference relations instead of fixed reward functions. It subsumes several classical settings, including the Boolean setting with -regular objectives and quantitative models such as min-cost-reachability, as well as combinations of several such objectives.
In this framework, we have studied the complexity of four fundamental problems related to NEs, notably with a novel use of the game setting recently introduced in [16]. This approach enables a broader applicability and more reusable results. It contrasts sharply with most existing work that is typically focused on specific reward functions.
We hope that our framework will serve as a basis for exploring additional problems such as decision problems about subgame perfect equilibria (which are NEs in any subgame of a game [38]), or the rational synthesis problem as studied in [33]. New results will lead to the development of general and modular solutions for a wider class of questions in the theory of infinite games played on graphs.
References
- [1] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008.
- [2] Suguman Bansal, Swarat Chaudhuri, and Moshe Y. Vardi. Comparator automata in quantitative verification. Logical Methods in Computer Science, Volume 18, Issue 3, July 2022. doi:10.46298/lmcs-18(3:13)2022.
- [3] Pascal Bergsträßer and Moses Ganardi. Revisiting Membership Problems in Subclasses of Rational Relations. In 38th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2023, Boston, MA, USA, June 26-29, 2023, pages 1–14. IEEE, 2023. doi:10.1109/LICS56636.2023.10175722.
- [4] Dietmar Berwanger and Laurent Doyen. Observation and Distinction: Representing Information in Infinite Games. Theory of Computing Systems, 67(1):4–27, 2023. doi:10.1007/s00224-021-10044-x.
- [5] Achim Blumensath and Erich Grädel. Automatic structures. In 15th Annual IEEE Symposium on Logic in Computer Science, Santa Barbara, California, USA, June 26-29, 2000, pages 51–62. IEEE Computer Society, 2000. doi:10.1109/lics.2000.855755.
- [6] Patricia Bouyer, Romain Brenguier, and Nicolas Markey. Nash Equilibria for Reachability Objectives in Multi-player Timed Games. In Paul Gastin and François Laroussinie, editors, CONCUR 2010 - Concurrency Theory, 21th International Conference, CONCUR 2010, Paris, France, August 31-September 3, 2010. Proceedings, volume 6269 of Lecture Notes in Computer Science, pages 192–206. Springer, 2010. doi:10.1007/978-3-642-15375-4_14.
- [7] Patricia Bouyer, Romain Brenguier, Nicolas Markey, and Michael Ummels. Pure Nash Equilibria in Concurrent Deterministic Games. Logical Methods in Computer Science, 11(2), 2015. doi:10.2168/lmcs-11(2:9)2015.
- [8] Laura Bozzelli, Bastien Maubert, and Sophie Pinchinat. Uniform strategies, rational relations and jumping automata. Information and Computation, 242, 2015. doi:10.1016/j.ic.2015.03.012.
- [9] Romain Brenguier, Lorenzo Clemente, Paul Hunter, Guillermo A. Pérez, Mickael Randour, Jean-François Raskin, Ocan Sankur, and Mathieu Sassolas. Non-Zero Sum Games for Reactive Synthesis. In Adrian-Horia Dediu, Jan Janousek, Carlos Martín-Vide, and Bianca Truthe, editors, Language and Automata Theory and Applications - 10th International Conference, LATA 2016, Prague, Czech Republic, March 14-18, 2016, Proceedings, volume 9618 of Lecture Notes in Computer Science, pages 3–23. Springer, 2016. doi:10.1007/978-3-319-30000-9_1.
- [10] Léonard Brice, Jean-François Raskin, and Marie van den Bogaard. Rational Verification for Nash and Subgame-Perfect Equilibria in Graph Games. In Jérôme Leroux, Sylvain Lombardy, and David Peleg, editors, 48th International Symposium on Mathematical Foundations of Computer Science, MFCS 2023, August 28 to September 1, 2023, Bordeaux, France, volume 272 of LIPIcs, pages 26:1–26:15. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/lipics.mfcs.2023.26.
- [11] Léonard Brice, Marie van den Bogaard, and Jean-François Raskin. Subgame-perfect Equilibria in Mean-payoff Games (journal version). Logical Methods in Computer Science, 19(4), 2023. doi:10.46298/lmcs-19(4:6)2023.
- [12] Thomas Brihaye, Julie De Pril, and Sven Schewe. Multiplayer Cost Games with Simple Nash Equilibria. In Sergei N. Artëmov and Anil Nerode, editors, Logical Foundations of Computer Science, International Symposium, LFCS 2013, San Diego, CA, USA, January 6-8, 2013. Proceedings, volume 7734 of Lecture Notes in Computer Science, pages 59–73. Springer, 2013. doi:10.1007/978-3-642-35722-0_5.
- [13] Véronique Bruyère. Computer Aided Synthesis: A Game-Theoretic Approach. In Émilie Charlier, Julien Leroy, and Michel Rigo, editors, Developments in Language Theory - 21st International Conference, DLT 2017, Liège, Belgium, August 7-11, 2017, Proceedings, volume 10396 of Lecture Notes in Computer Science, pages 3–35. Springer, 2017. doi:10.1007/978-3-319-62809-7_1.
- [14] Véronique Bruyère. Synthesis of Equilibria in Infinite-Duration Games on Graphs. ACM SIGLOG News, 8(2):4–29, May 2021. doi:10.1145/3467001.3467003.
- [15] Véronique Bruyère, Christophe Grandmont, and Jean-François Raskin. Games with -Automatic Preference Relations. CoRR, abs/2503.04759, 2025. doi:10.48550/arXiv.2503.04759.
- [16] Véronique Bruyère, Jean-François Raskin, Alexis Reynouard, and Marie Van Den Bogaard. The Non-Cooperative Rational Synthesis Problem for Subgame Perfect Equilibria and omega-regular Objectives, 2024. doi:10.48550/arxiv.2412.08547.
- [17] Olivier Carton, Christian Choffrut, and Serge Grigorieff. Decision problems among the main subfamilies of rational relations. RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications, 40(2):255–275, 2006. doi:10.1051/ita:2006005.
- [18] Krishnendu Chatterjee, Thomas A. Henzinger, and Nir Piterman. Generalized Parity Games. In Helmut Seidl, editor, Foundations of Software Science and Computational Structures, 10th International Conference, FOSSACS 2007, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2007, Braga, Portugal, March 24-April 1, 2007, Proceedings, volume 4423 of Lecture Notes in Computer Science, pages 153–167. Springer, 2007. doi:10.1007/978-3-540-71389-0_12.
- [19] Krishnendu Chatterjee and Vishwanath Raman. Synthesizing Protocols for Digital Contract Signing. In Viktor Kuncak and Andrey Rybalchenko, editors, Verification, Model Checking, and Abstract Interpretation - 13th International Conference, VMCAI 2012, Philadelphia, PA, USA, January 22-24, 2012. Proceedings, volume 7148 of Lecture Notes in Computer Science, pages 152–168. Springer, 2012. doi:10.1007/978-3-642-27940-9_11.
- [20] Yoav Feinstein, Orna Kupferman, and Noam Shenwald. Non-Zero-Sum Games with Multiple Weighted Objectives. In Proceedings of the 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2025), volume To appear of Lecture Notes in Computer Science, page To appear. Springer, 2025. URL: https://www.cs.huji.ac.il/˜ornak/publications/tacas25b.pdf.
- [21] Nathanaël Fijalkow, Nathalie Bertrand, Patricia Bouyer-Decitre, Romain Brenguier, Arnaud Carayol, John Fearnley, Hugo Gimbert, Florian Horn, Rasmus Ibsen-Jensen, Nicolas Markey, Benjamin Monmege, Petr Novotný, Mickael Randour, Ocan Sankur, Sylvain Schmitz, Olivier Serre, and Mateusz Skomra. Games on Graphs. Online, 2023. doi:10.48550/arxiv.2305.10546.
- [22] János Flesch, Jeroen Kuipers, Ayala Mashiah-Yaakovi, Gijs Schoenmakers, Eilon Solan, and Koos Vrieze. Perfect-Information Games with Lower-Semicontinuous Payoffs. Mathematics of Operations Research, 35(4):742–755, 2010. doi:10.1287/moor.1100.0469.
- [23] Christiane Frougny and Jacques Sakarovitch. Synchronized rational relations of finite and infinite words. Theoretical Computer Science, 108(1):45–82, 1993. doi:10.1016/0304-3975(93)90230-Q.
- [24] M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979. URL: https://perso.limos.fr/˜palafour/PAPERS/PDF/Garey-Johnson79.pdf.
- [25] Erich Grädel. Automatic Structures: Twenty Years Later. In Holger Hermanns, Lijun Zhang, Naoki Kobayashi, and Dale Miller, editors, LICS ’20: 35th Annual ACM/IEEE Symposium on Logic in Computer Science, Saarbrücken, Germany, July 8-11, 2020, LICS’20, pages 21–34, New York, NY, USA, 2020. Association for Computing Machinery. doi:10.1145/3373718.3394734.
- [26] Erich Grädel, Wolfgang Thomas, and Thomas Wilke, editors. Automata, Logics, and Infinite Games: A Guide to Current Research [outcome of a Dagstuhl seminar, February 2001], volume 2500 of Lecture Notes in Computer Science. Springer, 2002. doi:10.1007/3-540-36387-4.
- [27] Erich Grädel and Michael Ummels. Solution Concepts and Algorithms for Infinite Multiplayer Games. In New Perspectives on Games and Interaction, volume 4, pages 151–178. Amsterdam University Press, 2008. URL: https://core.ac.uk/reader/36484420.
- [28] Julian Gutierrez, Aniello Murano, Giuseppe Perelli, Sasha Rubin, Thomas Steeples, and Michael J. Wooldridge. Equilibria for games with combined qualitative and quantitative objectives. Acta Informatica, 58(6):585–610, 2021. doi:10.1007/s00236-020-00385-4.
- [29] Julian Gutierrez, Muhammad Najib, Giuseppe Perelli, and Michael J. Wooldridge. Automated temporal equilibrium analysis: Verification and synthesis of multi-player games. Artificial Intelligence, 287:103353, 2020. doi:10.1016/j.artint.2020.103353.
- [30] Julian Gutierrez, Muhammad Najib, Giuseppe Perelli, and Michael J. Wooldridge. On the complexity of rational verification. Ann. Math. Artif. Intell., 91(4):409–430, 2023. doi:10.1007/s10472-022-09804-3.
- [31] Bernard Ralph Hodgson. Décidabilité par automate fini. Annales des sciences mathématiques du Québec, 7:39–57, 1983. URL: https://www.labmath.uqam.ca/˜annales/volumes/07-1/PDF/039-057.pdf.
- [32] Orna Kupferman. Automata Theory and Model Checking. In Edmund M. Clarke, Thomas A. Henzinger, Helmut Veith, and Roderick Bloem, editors, Handbook of Model Checking, pages 107–151. Springer, 2018. doi:10.1007/978-3-319-10575-8_4.
- [33] Orna Kupferman, Giuseppe Perelli, and Moshe Y. Vardi. Synthesis with rational environments. Annals of Mathematics and Artificial Intelligence, 78(1):3–20, 2016. doi:10.1007/s10472-016-9508-8.
- [34] Stéphane Le Roux and Arno Pauly. Equilibria in multi-player multi-outcome infinite sequential games. Information and Computation, 276:104557, 2021. 5th International Workshop on Strategic Reasoning (SR 2017). doi:10.1016/j.ic.2020.104557.
- [35] Christof Löding and Christopher Spinrath. Decision Problems for Subclasses of Rational Relations over Finite and Infinite Words. Discrete Mathematics & Theoretical Computer Science, 21(3), 2019. doi:10.23638/dmtcs-21-3-4.
- [36] John F. Nash. Equilibrium points in -person games. Proceedings of the National Academy of Sciences of the United States of America, 36(1):48–49, 1950. doi:10.1073/pnas.36.1.48.
- [37] Martin J. Osborne. An introduction to game theory. Oxford Univ. Press, 2004.
- [38] Martin J. Osborne and Ariel Rubinstein. A course in game theory. The MIT Press, Cambridge, USA, 1994. electronic edition.
- [39] Sasha Rubin. Automatic structures. In Jean-Éric Pin, editor, Handbook of Automata Theory, pages 1031–1070. European Mathematical Society Publishing House, Zürich, Switzerland, 2021. doi:10.4171/automata-2/6.
- [40] Jacques Sakarovitch. Elements of Automata Theory. Cambridge University Press, 2009. URL: http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521844253.
- [41] Wolfgang Thomas. Automata on Infinite Objects. In Jan Van Leeuwen, editor, Formal Models and Semantics, Handbook of Theoretical Computer Science, pages 133–191. Elsevier, Amsterdam, 1990. doi:10.1016/B978-0-444-88074-1.50009-3.
- [42] Michael Ummels. The Complexity of Nash Equilibria in Infinite Multiplayer Games. In Roberto Amadio, editor, Foundations of Software Science and Computational Structures, 11th International Conference, FOSSACS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29 - April 6, 2008. Proceedings, volume 4962 of Lecture Notes in Computer Science, pages 20–34. Springer, 2008. doi:10.1007/978-3-540-78499-9_3.
- [43] Michael Ummels and Dominik Wojtczak. The Complexity of Nash Equilibria in Limit-Average Games. In Joost-Pieter Katoen and Barbara König, editors, CONCUR 2011 - Concurrency Theory - 22nd International Conference, CONCUR 2011, Aachen, Germany, September 6-9, 2011. Proceedings, volume 6901 of Lecture Notes in Computer Science, pages 482–496. Springer, 2011. doi:10.1007/978-3-642-23217-6_32.
