Wait-Only Broadcast Protocols Are Easier to Verify
Abstract
We study networks of processes that all execute the same finite-state protocol and communicate via broadcasts. We are interested in two problems with a parameterized number of processes: the synchronization problem, which asks whether there is an execution that puts all processes in a given state; and the repeated coverability problem, which asks whether there is an infinite execution where a given transition is taken infinitely often. Since both problems are undecidable in the general case, we investigate them when the protocol is Wait-Only, i.e., it has no state from which a process can both broadcast and receive messages. We establish that the synchronization problem becomes Ackermann-complete, and that the repeated coverability problem is in ExpSpace and PSpace-hard.
Keywords and phrases: Parameterised verification, Reachability, Broadcast
2012 ACM Subject Classification: Theory of computation → Formal languages and automata theory
Funding: ANR project PaVeDyS (ANR-23-CE48-0005)
1 Introduction
Distributed systems are at the core of modern computing, and are widely used in critical applications such as sensor networks or distributed databases. These systems rely on protocols to enable communication and to maintain coherence and coordination between the different processes. Because of their distributed nature, such protocols have proved to be error-prone. As a result, the formal verification of distributed systems has become an essential area of research. Formal verification of distributed systems presents unique challenges compared to that of centralized systems. One of the most significant issues is the state explosion problem: the behavior of multiple processes that execute concurrently and exchange information often leads to state spaces that grow exponentially with the number of processes, making the analysis highly challenging. When the systems are parameterized, meaning designed to operate for an arbitrary number of processes, which is often the case in distributed protocols, classical techniques are not useful anymore. The difficulty shifts from state explosion to dealing with an infinite number of system configurations, which leads to undecidability in the general case [1]. Despite these challenges, verification becomes more tractable in certain restricted settings. For instance, in parameterized systems, the behavior of individual processes need not always be explicitly represented, which can mitigate state explosion. This has motivated researchers to focus on specific subclasses of systems or communication models where decidability can be achieved without sacrificing too much expressiveness. One such restriction involves protocols where processes are anonymous (i.e., without identities), and communicate via simple synchronous mechanisms, such as rendezvous [9], where two processes exchange a message synchronously. Several variants of this model have been studied, such as communication via a shared register containing some value from a finite set [7], or non-blocking rendezvous [10, 4], where a sender is not prevented from sending a message when no process is ready to receive it. In all these cases, the processes execute the same protocol, which is described by a finite automaton.
A more expressive model is broadcast protocols, introduced by Emerson and Namjoshi [6]. In these protocols, a process can send a broadcast message that is received simultaneously by all other processes (or by all neighboring processes in a network). Several key verification problems arise in this context, including the coverability problem, which asks whether a process can eventually reach a specific (bad) state starting from an initial configuration. Another important property is synchronization, which asks whether all processes can simultaneously reach a given state. Liveness properties, like determining whether certain infinite executions exist, ensure that some behaviors occur indefinitely. While broadcast protocols allow more powerful communication than rendezvous protocols, this added expressiveness comes at the cost of increased verification complexity. Indeed, rendezvous communication can be simulated using broadcasts [8], but verification becomes significantly harder: while coverability and synchronization can be solved in polynomial time for rendezvous protocols [9], they become respectively Ackermann-complete [8, 20] and undecidable (as we show in this paper) for broadcast protocols. Liveness properties are also undecidable for broadcast protocols [8]. The challenge in verifying broadcast protocols can be understood through their relationship with Vector Addition Systems with States (VASS). Rendezvous-based protocols can be encoded into a VASS, where counters track the number of processes in each state. A rendezvous is simulated by decreasing the counters corresponding to the sender and receiver states and increasing the counters for their post-communication states. While going through a VASS is certainly not the right approach for rendezvous protocols, given the complexity gap between problems on VASS and the known polynomial time algorithms for their verification problems, it is interesting to note that this encoding fails for broadcast protocols: a broadcast message must be received by all processes in a given state, requiring a bulk transfer of a counter value – something not expressible in classical VASS without adding powerful operations such as transfer arcs, which lead to an undecidable reachability problem [21]. However, in some cases, verification of broadcast protocols can become easier, complexity-wise. One example is the setting of Reconfigurable Broadcast Networks, in which communication between processes can be lossy [5]. Liveness properties become decidable in that case [5], and even solvable in polynomial time [2].
Another such case is given by a syntactic restriction known as Wait-Only protocols, introduced in [10] in the context of non-blocking rendezvous communication. In Wait-Only protocols, a process cannot both send and receive a message from the same state. From a practical perspective, Wait-Only protocols were proposed to model the non-blocking rendezvous semantics used in Java’s wait/notify/notifyAll synchronization mechanism and C’s wait/signal/broadcast operations on condition variables, commonly found in multi-threaded programs. In both languages, threads can be awakened by the reception of a message. This naturally leads to distinguishing between action and waiting states: a sleeping thread cannot act on its own and can only be awakened by a signal. This restriction simplifies the possible behaviours of the system, potentially reducing the complexity of verification. Indeed, the coverability problem for Wait-Only broadcast protocols becomes PSpace-complete [11]: when processes are in a state where they can receive broadcasts, they cannot send messages, meaning they will move together to the next state, reducing the need for precise counting. Moreover, when a process is in a broadcasting state, it remains there until it decides to send a message, simplifying the reconstruction of executions and enabling better verification algorithms. By leveraging these properties, we design algorithms for the synchronization problem and liveness properties, obtaining new decidability results for these challenging problems. Proofs omitted due to space constraints can be found in [12].
2 Model and Verification Problems
In this section, we provide the description of a model for networks with broadcast communication together with some associated verification problems we are interested in. First we introduce some useful mathematical notations. We use $\mathbb{N}$ to represent the set of natural numbers, $\mathbb{N}_{>0}$ to represent $\mathbb{N} \setminus \{0\}$, and for $i, j \in \mathbb{N}$ with $i \le j$, we denote by $[i, j]$ the set $\{k \in \mathbb{N} \mid i \le k \le j\}$. For a finite set $E$ and a natural number $n \in \mathbb{N}_{>0}$, the set $E^n$ represents the $n$-dimensional vectors with values in $E$, and for $v \in E^n$ and $i \in [1, n]$, we use $v[i]$ to represent the $i$-th value of $v$. Furthermore, for such a vector $v$, we denote by $\dim(v) = n$ its dimension.
Networks of Processes using Broadcast Communication.
In the networks we consider, all the processes execute the same protocol, given by a finite state machine. The actions of this machine can either broadcast a message to the other processes (denoted by $!!a$) or receive a message (denoted by $??a$). For a finite alphabet of messages $\Sigma$, we use the following notations: $!!\Sigma = \{!!a \mid a \in \Sigma\}$, and $??\Sigma = \{??a \mid a \in \Sigma\}$.
Definition 2.1.
A protocol is a tuple $\mathcal{P} = (Q, \Sigma, q_{in}, T)$ where $Q$ is a finite set of states, $\Sigma$ is a finite set of messages, $q_{in} \in Q$ is an initial state, and $T \subseteq Q \times (!!\Sigma \cup ??\Sigma) \times Q$ is a set of transitions.
For all $q \in Q$, we define $R(q)$ as the set of messages that can be received from state $q$, given by $R(q) = \{a \in \Sigma \mid (q, ??a, q') \in T \text{ for some } q' \in Q\}$. A protocol is Wait-Only whenever for all $q \in Q$, either $T \cap (\{q\} \times !!\Sigma \times Q) = \emptyset$, or $T \cap (\{q\} \times ??\Sigma \times Q) = \emptyset$. In a Wait-Only protocol, a state $q$ such that $R(q) \neq \emptyset$ is called a waiting state. A state that is not a waiting state is an action state. We denote by $Q_W$ the set of waiting states and by $Q_A$ the set of action states (hence $Q = Q_W \uplus Q_A$). Observe that if $q_{in} \in Q_W$, then no process is ever able to broadcast messages; for this reason we will always assume that $q_{in} \in Q_A$.
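To make these definitions concrete, here is a minimal Python sketch (an illustrative encoding of transitions, not fixed by the paper) that computes $R(q)$, checks the Wait-Only restriction, and partitions states into waiting and action states.

```python
from typing import Set, Tuple

Transition = Tuple[str, str, str, str]  # (source, "!!" or "??", message, target)

def received_messages(q: str, transitions: Set[Transition]) -> Set[str]:
    """R(q): the messages that can be received from state q."""
    return {a for (q1, kind, a, _) in transitions if q1 == q and kind == "??"}

def is_wait_only(states: Set[str], transitions: Set[Transition]) -> bool:
    """No state may have both an outgoing broadcast and an outgoing reception."""
    for q in states:
        kinds = {kind for (q1, kind, _, _) in transitions if q1 == q}
        if kinds == {"!!", "??"}:
            return False
    return True

def partition_states(states: Set[str], transitions: Set[Transition]):
    """Waiting states are those with non-empty R(q); the rest are action states."""
    waiting = {q for q in states if received_messages(q, transitions)}
    return waiting, states - waiting
```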
For $n \in \mathbb{N}_{>0}$, an $n$-process configuration of a protocol $\mathcal{P}$ is a vector $C \in Q^n$ where $C[i]$ represents the state of the $i$-th process in $C$ for $i \in [1, n]$. The configuration $C$ is said to be initial whenever $C[i] = q_{in}$ for all $i \in [1, n]$. The dimension $\dim(C)$ hence characterizes the number of processes in $C$. For a state $q \in Q$, we use $\mathsf{proc}_C(q)$ to represent the set of processes in state $q$, formally $\mathsf{proc}_C(q) = \{i \in [1, \dim(C)] \mid C[i] = q\}$. For a subset of processes $P \subseteq [1, \dim(C)]$, we use $\mathsf{st}_C(P)$ to identify the set of states of processes in $P$, formally $\mathsf{st}_C(P) = \{C[i] \mid i \in P\}$. We let $\mathcal{C}(\mathcal{P})$ [resp. $\mathcal{I}(\mathcal{P})$] be the set of all configurations [resp. initial configurations] of $\mathcal{P}$.
We now define the broadcast network semantics associated to a protocol $\mathcal{P}$. For configurations $C, C' \in Q^n$, a transition $t = (q, !!a, q') \in T$ and $i \in [1, n]$, we write $C \xrightarrow{t, i} C'$ whenever $C[i] = q$, $C'[i] = q'$, and, for all $j \in [1, n] \setminus \{i\}$, either a reception occurs, i.e., $(C[j], ??a, C'[j]) \in T$, or the process cannot receive $a$, in which case $a \notin R(C[j])$ and $C'[j] = C[j]$. Intuitively, the $i$-th process of $C$ broadcasts a message $a$, which is received by all processes able to receive it, while the other processes remain in their current state. We note $C \to C'$ if there exist $t \in T$ and $i \in [1, n]$ such that $C \xrightarrow{t, i} C'$ and use $\to^*$ [resp. $\to^+$] for the reflexive and transitive [resp. transitive] closure of $\to$. A finite [resp. infinite] execution is a finite [resp. infinite] sequence of configurations $\rho = C_0 C_1 \dots C_k$ [resp. $\rho = C_0 C_1 \dots$] such that $C_0 \in \mathcal{I}(\mathcal{P})$ and $C_m \to C_{m+1}$ for all $m \in [0, k-1]$ [resp. for all $m \in \mathbb{N}$]. Its length is equal to $k+1$ [resp. $\omega$]. For an execution $\rho$ and an index $m$, we use $\rho_m$ to denote $C_m$, the $m$-th configuration, and $\dim(\rho)$ to represent $\dim(C_0)$, the number of processes in the execution. For $i \in [1, \dim(\rho)]$, we denote by $\rho[i]$ the sequence of states $C_0[i] C_1[i] \dots$ followed by the $i$-th process along $\rho$.
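The step relation itself is easy to prototype. The sketch below, under the same illustrative encoding as above, computes a successor configuration for a chosen sender and broadcast transition; when a reception is non-deterministic it takes the first matching target, one possible resolution.

```python
from typing import List, Optional, Tuple

Transition = Tuple[str, str, str, str]  # (source, "!!" or "??", message, target)

def broadcast_step(config: Tuple[str, ...], i: int, t: Transition,
                   transitions: List[Transition]) -> Optional[Tuple[str, ...]]:
    """C --t,i--> C': process i broadcasts; every process able to receive the
    message moves along a reception, the others stay in place."""
    q, kind, a, q2 = t
    if kind != "!!" or config[i] != q:
        return None                        # process i cannot take this broadcast
    succ = list(config)
    succ[i] = q2
    for j, s in enumerate(config):
        if j == i:
            continue
        targets = [p2 for (p1, k, m, p2) in transitions
                   if p1 == s and k == "??" and m == a]
        if targets:                        # reception is mandatory when possible
            succ[j] = targets[0]           # (first target: one resolution of
    return tuple(succ)                     # non-determinism)
```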
Example 2.2.
Figure 1 illustrates an example of a Wait-Only protocol, together with one of the executions of the associated network.
Verification Problems.
We investigate two verification problems over broadcast networks: the synchronization problem (Synchro) and the repeated coverability problem (RepCover). Synchro asks, given a protocol $\mathcal{P}$ and a state $q_f \in Q$, whether there exist $C_0 \in \mathcal{I}(\mathcal{P})$ and $C \in \mathcal{C}(\mathcal{P})$ such that $C_0 \to^* C$, and $C[i] = q_f$ for all $i \in [1, \dim(C)]$. RepCover asks, given a protocol $\mathcal{P}$ and a transition $t \in T$, whether there exist $C_0 \in \mathcal{I}(\mathcal{P})$ and an infinite execution $\rho = C_0 C_1 C_2 \dots$ such that for all $m \in \mathbb{N}$, there exists $m' \ge m$ verifying $C_{m'} \xrightarrow{t, i} C_{m'+1}$ for some process $i$.
Theorem 2.3.
Synchro and RepCover are undecidable.
We give the proof of the undecidability of Synchro in [12], while the undecidability of RepCover was established in [8, Theorem 5.1]. In the remainder of the paper, we show that by restricting to Wait-Only protocols, one can regain decidability.
Remark 2.4.
Throughout the remainder of this paper, we will consider protocols without self-loop broadcast transitions of the form $(q, !!a, q)$. Such a transition can be transformed into two transitions $(q, !!a, q_a)$ and $(q_a, !!w_a, q)$, where $q_a$ is a new state added to the set of states and $w_a$ is a new message, received by no state, added to the alphabet. The transformed protocol is equivalent to the original one with respect to Synchro and RepCover.
Vector Addition Systems with States.
In the following, we will extensively rely on the model of Vector Addition Systems with States (VASS), which we introduce below. A VASS is a tuple $V = (\mathit{Loc}, \ell_{in}, X, \Delta)$ where $\mathit{Loc}$ is a finite set of locations, $\ell_{in} \in \mathit{Loc}$ is the initial location, $X$ is a finite set of natural variables, called counters, and $\Delta \subseteq \mathit{Loc} \times \mathbb{Z}^X \times \mathit{Loc}$ is a finite set of transitions. A configuration of $V$ is a pair $(\ell, v)$ in $\mathit{Loc} \times \mathbb{N}^X$ where $v$ provides a value for each counter. The initial configuration is $(\ell_{in}, \mathbf{0})$ where $\mathbf{0}(x) = 0$ for all $x \in X$. Given two configurations $(\ell, v)$, $(\ell', v')$ and a transition $\delta = (\ell, u, \ell') \in \Delta$, we write $(\ell, v) \xrightarrow{\delta} (\ell', v')$ whenever $v'(x) = v(x) + u(x)$ for all counters $x \in X$. We simply use $(\ell, v) \to (\ell', v')$ when there exists $\delta \in \Delta$ such that $(\ell, v) \xrightarrow{\delta} (\ell', v')$. We denote by $\to^*$ the transitive and reflexive closure of $\to$. An (infinite) execution (or run) of the VASS is a (infinite) sequence of configurations $(\ell_0, v_0)(\ell_1, v_1)\dots$, starting with the initial configuration and such that $(\ell_m, v_m) \to (\ell_{m+1}, v_{m+1})$ for all $m$.
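For illustration, a minimal Python sketch of VASS steps, assuming counter updates are encoded as dictionaries from counter names to integers:

```python
from typing import Dict, Optional, Tuple

Delta = Tuple[str, Dict[str, int], str]  # (source location, update, target location)

def vass_step(loc: str, v: Dict[str, int],
              d: Delta) -> Optional[Tuple[str, Dict[str, int]]]:
    """Fire d from (loc, v) if enabled; counters must remain non-negative."""
    src, update, tgt = d
    if loc != src:
        return None
    w = dict(v)
    for x, u in update.items():
        w[x] = w.get(x, 0) + u
        if w[x] < 0:
            return None                    # disabled: a counter would go negative
    return (tgt, w)
```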
Reach, the well-known reachability problem for VASS, is defined as follows: given a VASS $V$ and a location $\ell_f \in \mathit{Loc}$, is there an execution from $(\ell_{in}, \mathbf{0})$ to $(\ell_f, \mathbf{0})$?
Theorem 2.5 ([17, 3, 18]).
Reach is decidable and Ackermann-complete.
3 Solving Synchro for Wait-Only Protocols
To solve Synchro we show in this section that we can build a VASS that simulates the behavior of a broadcast network running a Wait-Only protocol. An intuitive way to proceed, inspired by the counting abstraction proposed for protocols communicating by pairwise rendezvous [9], consists in having one counter per state of the protocol, storing the number of processes that populate that state. Initially, the counter associated with the initial state can be incremented as much as one wants: this determines the number of processes of the execution simulated in the VASS. Then, the simulation of a broadcast transition $(q, !!a, q')$ amounts to decrementing the counter associated to $q$ and incrementing the counter associated to $q'$. However, simulating receptions of the message (e.g. a transition $(q_1, ??a, q_2)$) requires emptying the counter associated to $q_1$ and transferring its value to the counter associated to $q_2$. This transfer operation is not something that can be done in a VASS unless the model is extended with special transfer transitions, leading to an undecidable reachability problem [21]. To circumvent this problem, we rely on two properties of Wait-Only protocols: (1) processes that happen to be in the same waiting state follow the same path of receptions in the protocol, and end in the same action state (we show how to take care of potential non-determinism in the next section), and (2) processes in an action state cannot be influenced by the behaviour of other processes as they cannot receive any message. Thanks to property (1), instead of precisely tracking the number of processes in each waiting state, we only count the processes in a “waiting region” – a connected component of waiting states populated by processes that will all reach the same action state simultaneously. The waiting region also allows us to monitor when the processes leave a waiting region for an action state. We will explain later how we use property (2) to handle the transfer of counter values within the VASS semantics, and thus simulate processes leaving a waiting region.
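The following sketch (illustrative, with receptions assumed deterministic) contrasts the two kinds of updates: the sender's move is an ordinary decrement/increment, while receptions require zeroing a counter and transferring its whole value – precisely the bulk operation that a plain VASS cannot perform.

```python
from typing import Dict, List, Tuple

def naive_broadcast_effect(counters: Dict[str, int],
                           sender: Tuple[str, str, str],
                           receptions: List[Tuple[str, str, str]]) -> Dict[str, int]:
    """Effect of a broadcast (q, !!a, q') on per-state counters, with the
    receptions of a given as triples (p, a, p'), assumed deterministic."""
    q, _a, q2 = sender
    v = dict(counters)
    v[q] = v.get(q, 0) - 1                 # ordinary VASS update: the sender
    v[q2] = v.get(q2, 0) + 1               # leaves q and reaches q'
    for (p, _m, p2) in receptions:
        moved = v.get(p, 0)                # bulk transfer: ALL processes in p
        v[p] = 0                           # receive a simultaneously -- not a
        v[p2] = v.get(p2, 0) + moved       # plain VASS operation
    return v
```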
For the rest of this section, we fix a Wait-Only protocol $\mathcal{P} = (Q, \Sigma, q_{in}, T)$ (with $Q_W$ [resp. $Q_A$] the set of waiting states [resp. action states]) for which we assume w.l.o.g. the existence of an uncoverable state $\bot \in Q_W$ verifying $\bot \neq q_{in}$ and $(q, \alpha, \bot) \notin T$ for all $q \in Q$ and $\alpha \in !!\Sigma \cup ??\Sigma$. Furthermore, we consider a final waiting state $q_f \in Q_W$. To ease the reading, we assume in this section that the final state is a waiting state, but our construction could be adapted to the case where $q_f$ is an action state. Moreover, we show in the next section that Synchro in this latter case is easier to solve.
3.1 Preliminary properties
We present here properties satisfied by Wait-Only protocols, on which we will rely for our construction. We first show with Lemma 3.1 that if, during an execution, two processes fulfill two conditions: (i) they are on the same waiting state $q$, and (ii) the next action state they reach (not necessarily at the same time) is the same (namely $q'$), then one can build an execution where they both follow the same path between $q$ and $q'$. This is trivial for deterministic protocols (where for all $q \in Q$ and $a \in \Sigma$, there is at most one transition of the form $(q, ??a, q') \in T$) and is also true for non-deterministic protocols.
For an execution $\rho = C_0 C_1 \dots$ of $\mathcal{P}$, an index $m$, a waiting state $q \in Q_W$ and a process number $i$ such that $C_m[i] = q$, we define $\mathsf{next}_\rho(i, m) = \min\{m' \ge m \mid C_{m'}[i] \in Q_A\}$, i.e. the first moment after $m$ where the process $i$ reaches an action state (if such a moment does not exist, it is set to $\omega$). We note $\mathsf{nextst}_\rho(i, m)$ the next action state for process $i$ from $m$: $\mathsf{nextst}_\rho(i, m) = C_{\mathsf{next}_\rho(i, m)}[i]$ if $\mathsf{next}_\rho(i, m) \neq \omega$, and otherwise we take the convention that $\mathsf{nextst}_\rho(i, m)$ is equal to $\bot$, the uncoverable state of the protocol $\mathcal{P}$. Finally, we let $\mathsf{nextpair}_\rho(i, m) = (\mathsf{nextst}_\rho(i, m), \mathsf{next}_\rho(i, m))$.
Lemma 3.1.
Let $\rho = C_0 \dots C_k$ be an execution of $\mathcal{P}$. If there exist $m \in [0, k]$ and two processes $i \neq j$ such that (i) $C_m[i] = C_m[j] \in Q_W$, (ii) $\mathsf{nextst}_\rho(i, m) = \mathsf{nextst}_\rho(j, m)$, and (iii) $\mathsf{next}_\rho(i, m) \le \mathsf{next}_\rho(j, m)$, then there exists an execution of the form $\rho' = C_0 \dots C_m C'_{m+1} \dots C'_k$ such that, for all $m' \in [m+1, k]$, $C'_{m'}[j] = C_{m'}[i]$ if $m' \le \mathsf{next}_\rho(i, m)$ and $C'_{m'}[j] = C'_{\mathsf{next}_\rho(i, m)}[j]$ otherwise, and $C'_{m'}[j'] = C_{m'}[j']$ for all $j' \neq j$. In particular $\mathsf{next}_{\rho'}(j, m) = \mathsf{next}_{\rho'}(i, m)$.
Example 3.2.
Consider the protocol in Figure 1 and the execution of Example 2.2. Two of its processes share the same waiting state and the same next action state; applying Lemma 3.1, we can build an execution in which these two processes follow the same path of receptions between this waiting state and this action state. However, the lemma cannot always be applied: when, from a common waiting state, two processes reach two different next action states, their paths of receptions may legitimately differ.
The above lemma enables us to focus on a specific subset of executions for Synchro, which we refer to as well-formed. In these executions, at any moment, two processes in the same waiting state with the same next action state follow the same path of receptions. Formally, an execution $\rho = C_0 C_1 \dots C_k$ of a protocol $\mathcal{P}$ is well-formed iff for all $m \in [0, k]$, for all $i, j \in [1, \dim(\rho)]$ such that $C_m[i] = C_m[j] \in Q_W$ and $\mathsf{nextst}_\rho(i, m) = \mathsf{nextst}_\rho(j, m)$, it holds that $C_{m'}[i] = C_{m'}[j]$ for all $m' \in [m, \mathsf{next}_\rho(i, m)]$. From Lemma 3.1, we immediately get:
Corollary 3.3.
There exists an execution from $C_0 \in \mathcal{I}(\mathcal{P})$ to a configuration $C$ such that $C[i] = q_f$ for all $i \in [1, \dim(C)]$ iff there exists a well-formed execution from $C_0$ to such a configuration $C$.
We finally provide a result which will allow us to bound the size of the VASS that we will build from the given Wait-Only protocol. Given an execution $\rho = C_0 C_1 \dots$ of $\mathcal{P}$, an index $m$ and an action state $q \in Q_A \cup \{\bot\}$, we define $\mathsf{Arr}_\rho(m, q)$ as the set of indices at which processes in a waiting state at time $m$ will reach $q$, if it is their next action state: $\mathsf{Arr}_\rho(m, q) = \{\mathsf{next}_\rho(i, m) \mid i \in [1, \dim(\rho)],\ C_m[i] \in Q_W \text{ and } \mathsf{nextst}_\rho(i, m) = q\}$.
Lemma 3.4.
For all well-formed executions $\rho$, for all $m$, and for all $q \in Q_A \cup \{\bot\}$, we have $|\mathsf{Arr}_\rho(m, q)| \le |Q_W|$.
Proof.
If we have $|\mathsf{Arr}_\rho(m, q)| > |Q_W|$, by the pigeonhole principle there exist $i, j \in [1, \dim(\rho)]$ such that $C_m[i] = C_m[j] \in Q_W$ and such that $\mathsf{nextst}_\rho(i, m) = \mathsf{nextst}_\rho(j, m) = q$ with $\mathsf{next}_\rho(i, m) \neq \mathsf{next}_\rho(j, m)$. Consequently, well-formedness would impose $C_{m'}[i] = C_{m'}[j]$ for all $m' \in [m, \mathsf{next}_\rho(i, m)]$ and hence $\mathsf{next}_\rho(i, m) = \mathsf{next}_\rho(j, m)$, which contradicts the definition of well-formedness.
3.2 Building the VASS that Simulates a Broadcast Network
Summaries.
We present here the formal tool we use to represent processes in waiting states. We begin by introducing some notations. A print $\mathsf{pr}$ is a set of waiting states. Given a non-empty subset of states $S = \{q_1, \dots, q_k\} \subseteq Q$, we define a configuration $C_S \in Q^k$ such that $C_S[i] = q_i$ for all $i \in [1, k]$. When $S = \{q\}$, we write $C_q$ instead of $C_{\{q\}}$. Conversely, given a configuration $C$, we define $\mathsf{st}(C) = \mathsf{st}_C([1, \dim(C)])$. For two configurations $C$ and $C'$, we let $C \oplus C'$ be the configuration defined by: $(C \oplus C')[i] = C[i]$ if $i \in [1, \dim(C)]$ and $(C \oplus C')[i] = C'[i - \dim(C)]$ if $i \in [\dim(C) + 1, \dim(C) + \dim(C')]$. A summary is a tuple $s = (\mathsf{pr}, q, id)$ such that $\mathsf{pr} \subseteq Q_W$ is a print, $q \in Q_A \cup \{\bot\}$ is an exit state, and $id \in [1, |Q_W| + 1]$ is an identifier. The label of $s$ is $\mathsf{label}(s) = (q, id)$ and we denote its print by $\mathsf{pr}(s) = \mathsf{pr}$. Finally, we define a special summary Done.
In our construction, each process in a waiting state is associated with a summary while processes in action states are handled by the counters of the VASS. Intuitively, a summary $(\mathsf{pr}, q, id)$ represents a set of processes lying in the states of $\mathsf{pr}$ that will reach the same next action state $q$ simultaneously (this restriction is justified by Lemma 3.1). Since the set of waiting states $\mathsf{pr}$ evolves during the simulation, each summary must be uniquely identified. To achieve this, we use an integer $id$. Hence each summary is identified with the pair $(q, id)$. We do not need more than $|Q_W| + 1$ different identifiers, because when a process enters a new summary (i.e. it arrives in a waiting state from an action state), aiming for next action state $q$, it either joins an existing summary with exit state $q$, or creates a new one. In the latter case, well-formed executions ensure that no existing summary with exit state $q$ contains this waiting state in its print: otherwise two processes would be in the same state, aiming for the same action state, but at different moments, contradicting Lemma 3.1. Note that we need $|Q_W| + 1$ different identifiers, and not $|Q_W|$, for technical reasons. Finally, the special summary Done is used to indicate when the processes described by a summary reach the exit action state.
We now describe how summaries evolve along a sequence of transitions. Let $s = (\mathsf{pr}, q, id)$ and $s' = (\mathsf{pr}', q, id)$ be two summaries (with the same exit state and identifier) and $a \in \Sigma$ such that there exists a configuration $C'$ verifying: for all $i \in [1, \dim(C_{\mathsf{pr}})]$, either $(C_{\mathsf{pr}}[i], ??a, C'[i]) \in T$, or $a \notin R(C_{\mathsf{pr}}[i])$ and $C'[i] = C_{\mathsf{pr}}[i]$. We then write $s \xrightarrow{a} s'$ whenever $\mathsf{pr}' = \mathsf{st}(C') \subseteq Q_W$. This represents the evolution of a summary upon reception of message $a$, when the processes all stay in waiting states, and no new process joins the “waiting region” of the given summary. We write $s \xrightarrow{a}_{+p} s'$ whenever $p \in Q_W$ and $\mathsf{pr}' = \mathsf{st}(C' \oplus C_p) \subseteq Q_W$. In that latter case, the process responsible for the broadcast of $a$ joins the “waiting region” represented by $s'$. This typically occurs when the process's next action state is $q$, and it reaches $q$ simultaneously with the processes described by $s$. Finally, we use $s \xrightarrow{a} \mathrm{Done}$ whenever $\mathsf{st}(C') = \{q\}$. This represents the evolution (and disappearance) of $s$ when all the processes reach $q$ (they all reach it simultaneously).
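A minimal sketch of this evolution, under the illustrative encoding used before and with one fixed resolution of non-determinism per summary (as justified by Lemma 3.1): the print is replaced by the set of states reached upon receiving $a$; the result is Done when every summarized process lands on the exit state, and the evolution is invalid when some process exits elsewhere.

```python
from typing import FrozenSet, List, Set, Tuple, Union

Transition = Tuple[str, str, str, str]  # (source, "!!" or "??", message, target)

def summary_receive(pr: FrozenSet[str], a: str, transitions: List[Transition],
                    waiting: Set[str],
                    exit_state: str) -> Union[FrozenSet[str], str, None]:
    """Evolution of a summary's print upon reception of message a."""
    reached = set()
    for q in pr:
        targets = [q2 for (q1, k, m, q2) in transitions
                   if q1 == q and k == "??" and m == a]
        reached.add(targets[0] if targets else q)  # unable to receive: stay put
    if reached == {exit_state}:
        return "Done"                     # all processes exit simultaneously
    if reached <= waiting:
        return frozenset(reached)         # the summary keeps waiting
    return None                           # some process exits elsewhere: invalid
```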
Example 3.5.
Returning to the protocol of Figure 1, observe that a summary may a priori evolve in several ways upon receiving the same message when receptions are non-deterministic. By definition, however, the processes summarized together are forced to all move to the same successor state: thanks to Corollary 3.3, we consider only well-formed executions, where all processes summarized together choose the same next state. Additionally, a summary evolves to Done only when all its processes reach the exit state: if the action state reached upon receiving a message is not the exit state, the corresponding evolution is not allowed.
Definition of the VASS.
We now explain how we use the summaries in the VASS simulating the executions of the network. First we say that a set of summaries $S$ is coherent if for all distinct pairs $s, s' \in S$ such that $s$ and $s'$ have the same exit state, we have $id(s) \neq id(s')$ and $\mathsf{pr}(s) \cap \mathsf{pr}(s') = \emptyset$. We denote by CoSets the set of coherent sets of summaries. For a set of summaries $S$ we let $\mathsf{labels}(S) = \{\mathsf{label}(s) \mid s \in S\}$. Observe that, when $S$ is coherent, for a label $(q, id) \in \mathsf{labels}(S)$, there is a unique $s \in S$ such that $\mathsf{label}(s) = (q, id)$.
We define the VASS $V_\mathcal{P}$ simulating the protocol $\mathcal{P}$ as follows: $V_\mathcal{P} = (\mathit{Loc}, \ell_{in}, X, \Delta)$, where $\mathit{Loc} = \{\ell_{in}, \ell_f\} \cup \mathrm{CoSets}$, the set of counters is $X = \{x_q \mid q \in Q_A\} \cup \{x_{(q, id)} \mid q \in Q_A \cup \{\bot\},\ id \in [1, |Q_W| + 1]\}$ (one counter per action state and one per summary label), and the set of transitions $\Delta = \Delta_1 \cup \Delta_2 \cup \Delta_3$ is defined as follows:
- $\Delta_1$ contains exactly the following transitions:
  - $(\ell_{in}, u, \ell_{in})$ where $u(x_{q_{in}}) = 1$ and $u(x) = 0$ for all other counters;
  - $(\ell_{in}, \mathbf{0}, \emptyset)$ where $\mathbf{0}$ is the null vector (note that the empty set is coherent);
  - $(S, \mathbf{0}, \ell_f)$ where $S = \{(\{q_f\}, \bot, id)\}$ for some identifier $id$;
  - $(\ell_f, u, \ell_f)$ where $u(x_{(\bot, id)}) = -1$ and $u(x) = 0$ for all other counters.
- For $S, S' \in \mathrm{CoSets}$ and a transition $t = (q_1, !!a, q_2) \in T$, we have $(S, u, S') \in \Delta_2$ iff one of the following conditions holds:
  - a. $q_2 \in Q_A$. Then, for all $s \in S$, there exists $s' \in S' \cup \{\mathrm{Done}\}$ such that $s \xrightarrow{a} s'$, and $S'$ contains exactly the summaries obtained this way. Moreover, $u(x_{q_1}) = -1$, $u(x_{q_2}) = +1$ and $u(x) = 0$ for all other counters. Here we simply update the sets of states populated by processes in waiting states after the reception of $a$. Some summaries may have disappeared from $S'$ if the processes represented by such a summary have reached their exit action state when receiving $a$. For now, we only modify the counters associated to the states $q_1$ and $q_2$. Note that this is well-defined, since we assume in Remark 2.4 that there is no self-loop of the form $(q, !!a, q)$.
  - b. $q_2 \in Q_W$, and there exist $s \in S$ and $s' \in S'$ such that $s \xrightarrow{a}_{+q_2} s'$. For all $s'' \in S \setminus \{s\}$, there exists $s''' \in S' \cup \{\mathrm{Done}\}$ such that $s'' \xrightarrow{a} s'''$, and $S'$ contains exactly the summaries obtained this way. Moreover, $u(x_{q_1}) = -1$, $u(x_{\mathsf{label}(s)}) = +1$ and $u(x) = 0$ for all other counters. In that case, the process having performed the broadcast joins an existing summary $s$. We have then modified accordingly the counters associated to $q_1$ and to the appropriate summary.
  - c. $q_2 \in Q_W$. For all $s \in S$, there exists $s' \in S' \cup \{\mathrm{Done}\}$ such that $s \xrightarrow{a} s'$. Then $S' = \{s' \mid s \xrightarrow{a} s' \text{ for some } s \in S,\ s' \neq \mathrm{Done}\} \cup \{(\{q_2\}, q, id)\}$ for some $q \in Q_A \cup \{\bot\}$ and $id \in [1, |Q_W| + 1]$ such that $S'$ is coherent. Moreover, $u(x_{q_1}) = -1$, $u(x_{(q, id)}) = +1$, and $u(x) = 0$ for all other counters. This case happens when the process having sent the message reaches a waiting state, and its expected next action pair (index and state) does not correspond to any existing summary. In that case, it joins a new summary; the counter associated to the state $q_1$ is decremented and the counter of this new summary is incremented.
- Finally, $\Delta_3$ allows to empty the counters associated to summaries that have disappeared from the current location. It is defined as the set of transitions $(S, u, S)$ such that: $S \in \mathrm{CoSets}$ and there exists a label $(q, id)$ with $q \in Q_A$, $(q, id) \notin \mathsf{labels}(S)$, $u(x_{(q, id)}) = -1$, $u(x_q) = +1$ and, for all other counters $x$, $u(x) = 0$.
Transitions from $\Delta_3$ allow the transfer of counter values from summaries that have disappeared to the counter of their exit action state. This transfer is not immediate, as it is done through iterative decrements and increments of counters. In particular, it is possible for the execution of the VASS to continue without the counter of a disappeared summary being fully transferred. The processes represented by the counter of a disappeared summary behave in the same way as processes in the exit action state that do not take any action. These counters can be emptied later, but always from a location where no summary with the same label exists. This ensures that there is no ambiguity between processes in a waiting region and those in an action state that have not yet been transferred to the appropriate counter.
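A tiny sketch of this transfer, for illustration: each firing of the $\Delta_3$ loop moves one unit from the counter of a disappeared summary to the counter of its exit action state, so the transfer may stop early and be resumed later.

```python
from typing import Dict

def flush_step(v: Dict[str, int], dead_summary: str, exit_state: str) -> bool:
    """One firing of the Delta_3 loop; returns False when nothing is left."""
    if v.get(dead_summary, 0) == 0:
        return False                           # counter already empty
    v[dead_summary] -= 1                       # leftover processes behave like
    v[exit_state] = v.get(exit_state, 0) + 1   # idle processes in the exit state
    return True
```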
Example 3.6.
Figure 4 illustrates the VASS built for the protocol from Figure 1. It first shows the overall structure: any run reaching $\ell_f$ starts by incrementing the counter of the initial state and ends by decrementing the counter of the summary with exit state $\bot$. Here, $\bot$ is an artificial, unreachable state used to ensure that the counted processes must end in $q_f$ (since they cannot be in $\bot$). The figure also illustrates how message receptions and summary creations are handled. The top-left location contains a single summary, representing configurations where all processes in waiting states lie in its print, all progressing to the same exit state. After a broadcast, three behaviors may follow. In the first case (right), the sender creates a new summary (pink). In the second (diagonal), it creates another new summary (orange), indicating a different arrival time at the exit state. In the third (below), it joins an existing summary, and its waiting state is added to the print. Well-formed executions restrict the number of summaries with the same exit state, as there are at most $|Q_W|$ distinct moments at which processes can reach a given action state (Lemma 3.4). Finally, the figure illustrates summary deletion. A broadcast causes the sender to join the pink summary (bottom one), incrementing its counter. Processes in the green summary receive the message, reaching their exit state; this summary is deleted in the next location. From there, the value of its counter is transferred to the counter of the exit state with the loop transition of $\Delta_3$. If this loop is not taken enough times, the counter may remain non-zero. This does not affect correctness: it will be decremented eventually, from a location in which there is no summary with the same label. Not decrementing it soon enough only delays the moment the corresponding processes will move from the exit action state.
Remark 3.7.
In the sequel of this paper, the size of $V_\mathcal{P}$ will be of interest to us. Hence, observe that the number of summaries is bounded by $(|Q_A| + 1) \cdot (|Q_W| + 1) \cdot 2^{|Q_W|}$ (as one summary is composed of a state in $Q_A \cup \{\bot\}$, one identifier in $[1, |Q_W| + 1]$ and one set of waiting states); hence $|X|$ is polynomial in $|Q|$, while $|\mathit{Loc}|$, which contains all coherent sets of summaries, is doubly exponential in $|Q|$.
Soundness of the construction.
We show now that if there exists a run in the VASS $V_\mathcal{P}$ from $(\ell_{in}, \mathbf{0})$ to $(\ell_f, \mathbf{0})$, then in the network built from $\mathcal{P}$, there exist $C_0 \in \mathcal{I}(\mathcal{P})$ and $C$ such that $C_0 \to^* C$ and $C[i] = q_f$ for all $i \in [1, \dim(C)]$.
We say that a configuration $(\ell, v)$ of $V_\mathcal{P}$ is an S-configuration if $\ell = S$ for some $S \in \mathrm{CoSets}$. The implementation of an S-configuration $(S, v)$ is the set of network configurations $\mathsf{Impl}(S, v)$ defined as follows: $C \in \mathsf{Impl}(S, v)$ if and only if there exists a function $f : [1, \dim(C)] \to X$ associating each process with a counter, such that $|f^{-1}(x)| = v(x)$ for all $x \in X$, and
- CondImpl1: for all $q \in Q_A$, we have $C[i] = q$ for all $i \in f^{-1}(x_q)$;
- CondImpl2: for all $(q, id)$, if there exists $s \in S$ with $\mathsf{label}(s) = (q, id)$, then $C[i] \in \mathsf{pr}(s) \cup \{q\}$ for all $i \in f^{-1}(x_{(q, id)})$;
- CondImpl3: for all $(q, id)$, if $(q, id) \notin \mathsf{labels}(S)$, then $C[i] = q$ for all $i \in f^{-1}(x_{(q, id)})$.
In an implementation of $(S, v)$, the processes populate states according to the values of the counters. All processes associated with the counter of an action state populate this exact state (CondImpl1). However, processes associated with the counter of a summary do not necessarily populate waiting states. This occurs when the label does not appear in $S$ while the associated counter remains strictly positive. Such a situation arises when a summary has been previously deleted, but its counter has not been emptied. Since all associated processes have already reached the exit action state, the processes associated with the corresponding counter have to populate this action state (CondImpl3). If the summary with label $(q, id)$ is active (i.e. its label appears in $S$), the processes associated with $x_{(q, id)}$ will be in the corresponding waiting region, or in the exit action state (CondImpl2). This may seem contradictory, as all processes of a summary are supposed to reach their exit state simultaneously. However, the processes already in the exit action state are “old” processes, that remained after a previous deletion of a summary with the same label (cf. CondImpl3). These processes will be allowed to send messages only once they are identified with the counter of the action state, that is, once the corresponding transition in $\Delta_3$ has been taken. The next lemma allows to build an execution of the protocol from an execution of the VASS $V_\mathcal{P}$.
Lemma 3.8.
Let $(S_1, v_1)$ and $(S_2, v_2)$ be two S-configurations of $V_\mathcal{P}$ and $C_1 \in \mathsf{Impl}(S_1, v_1)$. If $(S_1, v_1) \to (S_2, v_2)$ in $V_\mathcal{P}$ then there exists $C_2 \in \mathsf{Impl}(S_2, v_2)$ such that $C_1 \to^* C_2$ in $\mathcal{P}$.
Sketch of Proof.
Assume $(S_1, v_1) \to (S_2, v_2)$ and $C_1 \in \mathsf{Impl}(S_1, v_1)$. Then either the transition taken belongs to $\Delta_3$, or it belongs to $\Delta_2$ and is associated with some $t = (q_1, !!a, q_2) \in T$. In the first case, we show that $S_2 = S_1$ and $C_1 \in \mathsf{Impl}(S_2, v_2)$. Otherwise, since $v_1(x_{q_1}) > 0$, there is a process $i$ such that $C_1[i] = q_1$. The transition in the VASS makes all the summaries and counters evolve according to the reception of the message $a$. According to the different cases a, b or c, one can build a configuration $C_2 \in \mathsf{Impl}(S_2, v_2)$ such that $C_1 \xrightarrow{t, i} C_2$.
Lemma 3.9.
If $(\ell_{in}, \mathbf{0}) \to^* (\ell_f, \mathbf{0})$ in $V_\mathcal{P}$, then there exist $C_0 \in \mathcal{I}(\mathcal{P})$ and $C$ such that $C_0 \to^* C$ and $C[i] = q_f$ for all $i \in [1, \dim(C)]$.
Proof.
From the definition of $\Delta$, any run from $(\ell_{in}, \mathbf{0})$ to $(\ell_f, \mathbf{0})$ is of the form $(\ell_{in}, \mathbf{0}) \to^* (\ell_{in}, v_0) \to (S_0, v_0) \to^* (S_k, v_k) \to (\ell_f, v_k) \to^* (\ell_f, \mathbf{0})$ where $S_0 = \emptyset$, $v_0(x_{q_{in}}) = n$ for some $n \in \mathbb{N}$ and $v_0(x) = 0$ for all other counters $x$, whereas $S_k = \{(\{q_f\}, \bot, id)\}$ and $v_k(x) = 0$ for all $x \neq x_{(\bot, id)}$. Define $C_0$ as the initial configuration in $\mathcal{I}(\mathcal{P})$ with $n$ processes (i.e. $C_0[i] = q_{in}$ for all $i \in [1, n]$). Trivially, $C_0 \in \mathsf{Impl}(S_0, v_0)$ and, with Lemma 3.8 and a simple induction, we get that there exists $C \in \mathsf{Impl}(S_k, v_k)$ such that $C_0 \to^* C$. Observe that by definition of $\mathsf{Impl}$, and since $v_k(x) = 0$ for all $x \neq x_{(\bot, id)}$, there exists $f$ such that $f^{-1}(x_{(\bot, id)}) = [1, n]$. Using CondImpl2, we have $C[i] \in \{q_f, \bot\}$ for all $i \in [1, n]$. Since $\bot$ is unreachable by construction, we deduce that $C[i] = q_f$ for all $i \in [1, n]$.
Completeness of the construction.
Let us first introduce some new definitions. Given a well-formed execution $\rho = C_0 C_1 \dots$, two indices $m \le m'$ and an action state $q \in Q_A \cup \{\bot\}$, we define $P_\rho(m, m', q)$, the set of processes in waiting states in $C_m$, and whose next action state is $q$, reached at $m'$. Formally, $P_\rho(m, m', q) = \{i \in [1, \dim(\rho)] \mid C_m[i] \in Q_W \text{ and } \mathsf{nextpair}_\rho(i, m) = (q, m')\}$. Furthermore, an S-configuration $(S, v)$ is a representative of the configuration $C_m$ in $\rho$ iff the following conditions are respected:
- CondRepr1: for all $q \in Q_A$, $v(x_q) = |\mathsf{proc}_{C_m}(q)|$, and
for all $q \in Q_A \cup \{\bot\}$, there is an injective function $g_q : \{m' \mid P_\rho(m, m', q) \neq \emptyset\} \to [1, |Q_W| + 1]$ s.t.:
- CondRepr2: for all $m'$ such that $P_\rho(m, m', q) \neq \emptyset$, we have $v(x_{(q, g_q(m'))}) = |P_\rho(m, m', q)|$ and $(\mathsf{st}_{C_m}(P_\rho(m, m', q)), q, g_q(m')) \in S$;
- CondRepr3: for all $q \in Q_A \cup \{\bot\}$, we have $v(x_{(q, id)}) = 0$ for all $id$ outside the image of $g_q$,
and $S$ contains no summary with label $(q, id)$ for such an $id$.
In a representative of the configuration $C_m$, the counters faithfully count the number of processes on action states (CondRepr1), and on waiting regions. The set $P_\rho(m, m', q)$ gathers exactly the set of processes that populate a same waiting region in a representative of $C_m$ in $\rho$: they all reach the same next action state $q$ at the same instant $m'$. The number of such sets being bounded by $|Q_W|$ (cf. Lemma 3.4), we can associate with each of these indices $m'$ a unique identifier, given by the injective function $g_q$. We use a spare identifier for technical reasons, to handle more easily situations where summaries exist in the location of the VASS, and a new summary is created while one of the previous summaries is deleted at the next step. Then, CondRepr2 and CondRepr3 require that the counters corresponding to the summaries and the summaries themselves reflect faithfully the situation in the configuration $C_m$ with respect to $\rho$. Observe that such a representative is tightly linked to the simulated execution, since we need to know the future behavior of the different processes to determine the different summaries. Finally, if we let $\mathsf{Repr}_\rho(C_m)$ be the set of all S-configurations that are representatives of $C_m$ in $\rho$, we establish the following result before proving the main lemma of this subsection (Lemma 3.11).
Lemma 3.10.
Let $\rho = C_0 \dots C_k$ be a well-formed execution, and $m < k$. For all $(S, v) \in \mathsf{Repr}_\rho(C_m)$, there exists $(S', v') \in \mathsf{Repr}_\rho(C_{m+1})$ such that $(S, v) \to^+ (S', v')$ in $V_\mathcal{P}$.
Sketch of Proof.
From an S-configuration $(S, v)$ in $\mathsf{Repr}_\rho(C_m)$, we can compute the effect on $S$ of the transition taken in $\rho$ to go from $C_m$ to $C_{m+1}$, and find a matching transition in $\Delta_2$ (according to whether the broadcasting process reaches a waiting state or not, and in the former case, whether it joins an existing summary or a new one, depending on the execution $\rho$). We then obtain a new configuration $(S', v'')$ such that $(S, v) \to (S', v'')$. Note that $(S', v'')$ is not necessarily in $\mathsf{Repr}_\rho(C_{m+1})$: if some summaries have been deleted between $S$ and $S'$, the transition in $\Delta_2$ has not updated the corresponding counters. Hence we need to apply transitions from $\Delta_3$ to empty the counters corresponding to deleted summaries and accordingly increase the corresponding counters on matching action states to obtain $(S', v') \in \mathsf{Repr}_\rho(C_{m+1})$.
Lemma 3.11.
If there exist $n \in \mathbb{N}_{>0}$, $C_0 \in \mathcal{I}(\mathcal{P})$ with $\dim(C_0) = n$ and $C$ such that $C_0 \to^* C$ in $\mathcal{P}$ and $C[i] = q_f$ for all $i \in [1, n]$, then $(\ell_{in}, \mathbf{0}) \to^* (\ell_f, \mathbf{0})$ in $V_\mathcal{P}$.
Proof.
Thanks to Corollary 3.3, we know there exists a well-formed execution $\rho = C_0 \dots C_k$ from $C_0$ to $C_k = C$. Let $n = \dim(\rho)$. First, consider the sequence $(\ell_{in}, \mathbf{0}) \to^* (\ell_{in}, v_0) \to (\emptyset, v_0)$ where $v_0(x_{q_{in}}) = n$ and, for all other counters $x$, $v_0(x) = 0$. Observe that, since $C_0[i] = q_{in}$ for all $i \in [1, n]$, $(\emptyset, v_0) \in \mathsf{Repr}_\rho(C_0)$. By applying inductively Lemma 3.10, we get that $(\emptyset, v_0) \to^* (S_k, v_k)$ with $(S_k, v_k) \in \mathsf{Repr}_\rho(C_k)$. By definition, every time that $C_m[i] = q_f$ for some process $i$, then $\mathsf{nextst}_\rho(i, m) = \bot$, hence there will always be at most one summary with exit state $\bot$. So in the execution we build, every time some summary with exit state $\bot$ is created, we choose the identifier $1$. We then have $S_k = \{(\{q_f\}, \bot, 1)\}$ and $v_k(x) = 0$ for all counters $x$ other than $x_{(\bot, 1)}$. Hence, by construction of $\Delta_1$, we have $(S_k, v_k) \to (\ell_f, v_k) \to^* (\ell_f, \mathbf{0})$.
Using Lemma 3.9 and Lemma 3.11 and the fact that the reachability problem for VASS is decidable (see Theorem 2.5), we get the main theorem of this section.
Theorem 3.12.
Synchro restricted to Wait-Only protocols is decidable.
3.3 Lower Bound for Synchro in Wait-Only Protocols
In this subsection we reduce the reachability problem for VASS to Synchro. We fix a VASS $V = (\mathit{Loc}, \ell_{in}, X, \Delta)$ and a final location $\ell_f$. W.l.o.g., we suppose that any transition in $\Delta$ either increments or decrements exactly one counter by 1. Hence, for a transition $\delta = (\ell, u, \ell')$, we might describe $u$ by giving only $u(x)$ for the only $x \in X$ such that $u(x) \neq 0$.
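This normalization is standard; here is a sketch of one way to perform it under the dictionary encoding of updates used earlier: every transition is split into a chain of unit updates through fresh intermediate locations (zero-effect transitions are kept unchanged in this sketch).

```python
from typing import Dict, List, Tuple

Delta = Tuple[str, Dict[str, int], str]  # (source, update, target)

def normalize(transitions: List[Delta]) -> List[Delta]:
    """Split every transition into steps changing exactly one counter by 1."""
    out: List[Delta] = []
    fresh = 0
    for (src, update, tgt) in transitions:
        units = [(x, 1 if u > 0 else -1)
                 for x, u in update.items() if u != 0
                 for _ in range(abs(u))]
        if not units:                      # no counter touched: keep as-is
            out.append((src, {}, tgt))
            continue
        cur = src
        for k, (x, step) in enumerate(units):
            last = (k == len(units) - 1)
            nxt = tgt if last else f"mid_{fresh}"  # fresh intermediate location
            if not last:
                fresh += 1
            out.append((cur, {x: step}, nxt))
            cur = nxt
    return out
```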
We explain how to build a protocol $\mathcal{P}_V$ that simulates $V$: it is depicted in Figure 5, in which we do not consider the dashed edge. The state zero and the states associated with the counters represent the evolution of the different counters of the VASS, while the blue box describes the evolution of $V$ with respect to its locations. Formally, $\mathcal{P}_V$ consists of a set of states $Q_V$ and a set of transitions $T_V$. Some transitions of $T_V$ are depicted in Figure 6.
We then obtain the following theorem.
Theorem 3.13.
Synchro restricted to Wait-Only protocols is Ackermann-complete.
Sketch of Proof.
The upper bound follows from Theorem 3.12 and Theorem 2.5. For the lower bound, we give the ideas that prove that the reduction described above is correct. In particular, there exist $C_0 \in \mathcal{I}(\mathcal{P}_V)$ and $C$ such that $C_0 \to^* C$ in $\mathcal{P}_V$ and $C[i] = q_f$ for all $i \in [1, \dim(C)]$ if and only if $(\ell_{in}, \mathbf{0}) \to^* (\ell_f, \mathbf{0})$ in $V$.
Assume that $(\ell_{in}, \mathbf{0}) \to^* (\ell_f, \mathbf{0})$ in $V$, and let $N$ be the maximum value taken by the counters during the run. We choose a number of processes large enough with respect to $N$. The execution proceeds as follows: all processes except one (the leader) broadcast a dummy message. The leader then broadcasts start and reaches the state representing $\ell_{in}$, while the others move to zero. These processes simulate the counter values, and the leader simulates the VASS by moving through the states of the blue box. The value of a counter $x$ is represented by the number of processes in the corresponding state. To simulate an increment of $x$ in a transition from $\ell$ to $\ell'$, a process in zero sends a message and moves to the state of $x$, which the leader receives to transition to the state of $\ell'$. If the current configuration correctly represents $(\ell, v)$, the new one faithfully represents $(\ell', v')$. At the end of the simulation, the leader reaches the state of $\ell_f$, and the other processes remain in zero. They then broadcast end, and the leader broadcasts verif, gathering all processes in $q_f$.
To prove the other direction, we must show that the processes cannot cheat while simulating the VASS. First, note that reaching $q_f$ requires exactly one broadcast of verif: a second one would send the original sender to err. This ensures that only one process (the leader) simulates the VASS by moving through the blue box. Second, counter updates must follow the VASS transitions: any unauthorized update would cause the leader to receive an unexpected message and transition to a losing state. Finally, when the leader reaches the state of $\ell_f$, all other processes must be in zero. If not, problems occur during the final steps: one process (the helper) broadcasts end, which sends the leader towards $q_f$. Then, all remaining processes must reach z-end before the unique verif broadcast. If a process simulating a counter moves to z-end, it must broadcast a message, which is received by the other processes in z-end, sending them, including the helper, to a losing state. As a result, the helper misses verif and cannot reach $q_f$, preventing a successful execution. Thus, the simulation must end with the leader in the state of $\ell_f$ and all others in zero.
4 Synchro is ExpSpace-complete when the Target State is an Action State
Upper bound.
We show that Synchro when $q_f \in Q_A$ can be reduced to the mutual reachability problem on VASS, defined as follows: given a VASS $V$ and a location $\ell$, do there exist a run from $(\ell_{in}, \mathbf{0})$ to $(\ell, \mathbf{0})$ and one from $(\ell, \mathbf{0})$ to $(\ell_{in}, \mathbf{0})$? This problem is shown to be in ExpSpace [16]. More precisely, Leroux establishes the ExpSpace-membership and provides a bound on the lengths of the runs for the mutual reachability problem in VAS (Vector Addition Systems). By applying the VASS-to-VAS transformation presented in [14], we get the following result.
Theorem 4.1 ([16, Corollary 10.6], with the transformation of [14]).
Given a VASS $V = (\mathit{Loc}, \ell_{in}, X, \Delta)$, if there are two runs $(\ell_{in}, \mathbf{0}) \to^* (\ell, \mathbf{0})$ and $(\ell, \mathbf{0}) \to^* (\ell_{in}, \mathbf{0})$, then there are two such runs whose lengths are bounded by $N^{2^{c \cdot |X|}}$ where $N$ is the size of $V$ and $c$ is a constant independent of $V$.
Given a Wait-Only protocol $\mathcal{P}$ and a target state $q_f \in Q_A$, we explain how to transform it into a VASS where mutual reachability is equivalent to Synchro. We build the VASS as described in Section 3.2, with some modifications. The first two modifications simply encode the fact that the target state is not a waiting state anymore, while the third one ensures the mutual reachability.
- We add an intermediate location between $\ell_{in}$ and the location $\emptyset$, and the transition $(\ell_{in}, \mathbf{0}, \emptyset)$ is replaced by two transitions going through it, the first of which increments the counter $x_{q_{in}}$ while the second leaves the counters unchanged. This prevents the reachability of $(\ell_f, \mathbf{0})$ from being trivial with the run $(\ell_{in}, \mathbf{0}) \to (\emptyset, \mathbf{0}) \to (\ell_f, \mathbf{0})$ simulating zero processes.
- The transitions leading to and looping on $\ell_f$ now refer to the counter of the target action state instead of the counter of a summary: in particular, the loop on $\ell_f$ has $u(x_{q_f}) = -1$ and $u(x) = 0$ for all other counters $x$.
- We add a transition $(\ell_f, \mathbf{0}, \ell_{in})$ to allow the resetting of the VASS to $(\ell_{in}, \mathbf{0})$ after reaching $\ell_f$.
Lemma 4.2.
There exist $C_0 \in \mathcal{I}(\mathcal{P})$ and $C$ such that $C_0 \to^* C$ and $C[i] = q_f$ for all $i \in [1, \dim(C)]$, if and only if there are two runs $(\ell_{in}, \mathbf{0}) \to^* (\ell_f, \mathbf{0})$ and $(\ell_f, \mathbf{0}) \to^* (\ell_{in}, \mathbf{0})$ in the modified VASS.
Sketch of Proof.
Assume that there exist $C_0 \in \mathcal{I}(\mathcal{P})$ and $C$ such that $C_0 \to^* C$ and, for all $i \in [1, \dim(C)]$, $C[i] = q_f$. Since the main body of the modified VASS is the same as in Section 3.2, we can, like in Lemma 3.11, build a run from $(\ell_{in}, \mathbf{0})$ to $(\ell_f, \mathbf{0})$. By construction of the modified VASS, $(\ell_f, \mathbf{0}) \to (\ell_{in}, \mathbf{0})$. Assume now that $(\ell_{in}, \mathbf{0}) \to^* (\ell_f, \mathbf{0})$ (and $(\ell_f, \mathbf{0}) \to^* (\ell_{in}, \mathbf{0})$). Observe that it might be the case that the run visits $\ell_f$ and resets to $\ell_{in}$ several times before reaching $(\ell_f, \mathbf{0})$. By construction of the VASS, we know that there is an execution $(\ell_{in}, \mathbf{0}) \to^* (\ell_{in}, v_1) \to^* (\ell_f, v_2) \to (\ell_{in}, v_2) \to^* (\ell_f, \mathbf{0})$. Here $v_1$ is the valuation obtained after having increased the counter $x_{q_{in}}$, $v_2$ is the valuation obtained just before going to $\ell_f$ for the first time, and the final valuation is obtained after having decremented the counter $x_{q_f}$. Adapting the proof of Lemma 3.9 to this case, we deduce the existence of an execution of the protocol from an initial configuration to a configuration in which the processes counted by $x_{q_f}$ in $v_2$ are in $q_f$, and all the others are in action states. From the portion of run of the VASS from $(\ell_{in}, v_2)$ onwards, we similarly get another sequence of configurations of the protocol from a configuration (not necessarily initial, because $v_2$ might not be equal to 0 for some counters other than $x_{q_{in}}$) to a configuration in which all the processes are in $q_f$. Observe however that these two sequences can be merged into a single one (the second part corresponds to processes added in $q_{in}$ with the self-loop on $\ell_{in}$, whose counter is increased again after the reset). The merged execution then first behaves like the first execution, potentially leaving some processes in the initial state. Once this first part is over, all the processes are either in the initial state, or in an action state. Then, one can simulate the second sequence (processes already in $q_f$ from the first execution won't be affected because they are in an action state, so they won't receive any message; this would not be true if $q_f \in Q_W$). Since the execution of the VASS eventually reaches $(\ell_f, \mathbf{0})$, the merged execution eventually reaches a configuration where every counted process is in $q_f$. Then, by iterating the construction described above, one obtains an execution of $\mathcal{P}$ from an initial configuration to a configuration $C'$ such that $C'[i] = q_f$ for all $i \in [1, \dim(C')]$.
As runs of doubly exponential length can be guessed in exponential space in the size of the protocol, we get the following theorem using Remark 3.7 and Theorem 4.1. Observe that even if the number of locations is doubly exponential in the size of the protocol, the bound of Theorem 4.1 is polynomial in the size of the VASS and doubly exponential only in the number of counters; hence the lengths of the runs to guess remain doubly exponential in the size of the protocol.
Theorem 4.3.
Synchro for Wait-Only protocols is in ExpSpace when $q_f \in Q_A$.
Lower bound.
ExpSpace-hardness follows from a reduction from the coverability problem in VASS, which is ExpSpace-hard [19], and stated as follows: given a VASS $V$ and a location $\ell$, decide whether there is an execution from $(\ell_{in}, \mathbf{0})$ to $(\ell, v)$ for some $v \in \mathbb{N}^X$.
The reduction uses the protocol depicted in Figure 5, this time including the dashed edge but excluding the thick edges: $q_f$ is now an action state. Then $\ell$ is coverable iff there exists an execution in which all processes reach $q_f$. The execution of the VASS can be simulated in the protocol as in Section 3.3. The processes that may still remain in the counter states can reach z-end even when $q_f$ is populated (recall that all the thick edges have been removed). Once this is done, all processes can gather in $q_f$. Conversely, if there exists an execution in which all processes reach $q_f$, it means that $\ell$ is coverable. Let $p$ be the first process to broadcast start; an execution of the VASS covering $\ell$ can then be built from the sequence of states visited by $p$ between this broadcast and its arrival in $q_f$. This relies on three key observations: (1) The process $p$ must visit the state of $\ell$, otherwise it cannot reach $q_f$. (2) While $p$ is in the blue box, it is the only process in this region. If another process attempts to join by broadcasting start, $p$ will receive the message and go to err, preventing it from reaching $q_f$. (3) When $p$ enters the blue box for the first time, no other process is in its states, since start is sent for the first time at this point. Hence, together with Theorem 4.3, this establishes the following result.
Theorem 4.4.
Synchro for Wait-Only protocols is ExpSpace-complete when $q_f \in Q_A$.
5 Solving the Repeated Coverability problem for Wait-Only Protocols
Upper Bound.
We show that RepCover on Wait-Only protocols is in ExpSpace, via the repeated coverability problem in VASS: given a VASS $V$ and a location $\ell$, is there an infinite execution $(\ell_0, v_0)(\ell_1, v_1)\dots$ such that, for all $m \in \mathbb{N}$, there exists $m' \ge m$ such that $\ell_{m'} = \ell$? We rely on the following theorem.
Theorem 5.1 (Theorem 3.1 of [13]).
For a VASS $V = (\mathit{Loc}, \ell_{in}, X, \Delta)$, the repeated coverability problem is solvable in nondeterministic space $2^{c \cdot |X| \cdot \log |X|} \cdot \mathrm{poly}(\log |\mathit{Loc}|, \log W)$ for some constant $c$ independent of $V$, and where $W$ is a bound such that the absolute values of the components of the vectors in $\Delta$ are smaller than $W$.
Let $\mathcal{P}$ be a Wait-Only protocol and $t \in T$ the transition that has to occur infinitely often. We build a VASS based on the construction presented in Section 3.2. This time, we add a new set of locations tagged with $t$, and adapt the set of transitions accordingly. Hence, when a process takes the transition $t$, this is highlighted in the VASS by going through a location tagged with $t$ before continuing the execution. Taking the protocol of Figure 1, a transition from a location $S$ of the VASS to a location $S'$ simulating a broadcast that uses $t$ (see Figure 4) is now transformed into two transitions in the VASS we build here: one from $S$ to the tagged location $(S', t)$ and one from $(S', t)$ to $S'$. There is an execution of $\mathcal{P}$ with a process that takes $t$ infinitely often iff there is an execution of the VASS where one location tagged with $t$ is visited infinitely often. The procedure to decide RepCover is then: guess a tagged location $\ell$ and solve the repeated coverability problem for this VASS and $\ell$. From Remark 3.7 and Theorem 5.1, this can be done in nondeterministic exponential space, since the number of locations is doubly exponential while the number of counters is polynomial in the size of the protocol. The overall procedure is then in ExpSpace. Hence, the following theorem.
Theorem 5.2.
RepCover for Wait-Only protocols is in ExpSpace.
Remark 5.3.
One could define RepCover with a reception transition that has to occur infinitely often. The VASS we described above can be adapted by enlarging each location with the current state of the witness process. This can also be done in ExpSpace.
Lower bound.
We reduce the intersection non-emptiness problem for deterministic finite automata, which is known to be PSpace-complete [15], to RepCover. Let $\mathcal{A}_1, \dots, \mathcal{A}_K$ be deterministic, complete finite automata, with $\mathcal{A}_k = (S_k, \Gamma, \delta_k, s^k_{in}, f_k)$ for $k \in [1, K]$. The reduction assumes a single accepting state per automaton, which does not affect complexity. We denote by $\Gamma^*$ the set of words over the alphabet $\Gamma$, and extend each transition function $\delta_k$ to $\Gamma^*$ by defining: $\delta_k(s, \varepsilon) = s$, and $\delta_k(s, wa) = \delta_k(\delta_k(s, w), a)$ for all $w \in \Gamma^*$ and $a \in \Gamma$.
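For concreteness, a short sketch of this extension, assuming each $\delta_k$ is encoded as a dictionary from (state, letter) pairs to states (the automata being complete, the lookup always succeeds):

```python
from typing import Dict, Tuple

def delta_star(delta: Dict[Tuple[str, str], str], s: str, w: str) -> str:
    """Extension of a complete DFA's transition function to words:
    delta*(s, epsilon) = s and delta*(s, wa) = delta(delta*(s, w), a)."""
    for a in w:
        s = delta[(s, a)]
    return s
```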
We define a protocol $\mathcal{P}$ composed of several components. For each automaton $\mathcal{A}_k$, $\mathcal{P}$ includes a component simulating $\mathcal{A}_k$ (see Figure 8). The main control component (see Figure 8) includes the transition $t$, which must occur infinitely often. A process taking $t$ infinitely often also receives infinitely many sequences of messages, where each sequence is broadcast by a component simulating some $\mathcal{A}_k$. In addition to the transitions depicted in these figures, $\mathcal{P}$ also includes transitions connecting the components. One can show that there exists a word in the intersection of the languages of the automata iff there is an execution in which $t$ is taken infinitely often.
Theorem 5.4.
RepCover for Wait-Only protocols is PSpace-hard.
References
- [1] K. R. Apt and D. C. Kozen. Limits for automatic verification of finite-state concurrent systems. Inf. Process. Lett., 22(6):307–309, 1986. doi:10.1016/0020-0190(86)90071-2.
- [2] Peter Chini, Roland Meyer, and Prakash Saivasan. Liveness in broadcast networks. Computing, 104(10):2203–2223, 2022. doi:10.1007/S00607-021-00986-Y.
- [3] Wojciech Czerwinski and Lukasz Orlikowski. Reachability in vector addition systems is ackermann-complete. In 62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022, pages 1229–1240. IEEE, 2021. doi:10.1109/FOCS52979.2021.00120.
- [4] Giorgio Delzanno, Jean-François Raskin, and Laurent Van Begin. Towards the automated verification of multithreaded java programs. In TACAS 2002, volume 2280 of LNCS, pages 173–187. Springer, 2002. doi:10.1007/3-540-46002-0_13.
- [5] Giorgio Delzanno, Arnaud Sangnier, and Gianluigi Zavattaro. Parameterized verification of ad hoc networks. In Paul Gastin and François Laroussinie, editors, CONCUR 2010 - Concurrency Theory, pages 313–327, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg. doi:10.1007/978-3-642-15375-4_22.
- [6] E.A. Emerson and K.S. Namjoshi. On model checking for non-deterministic infinite-state systems. In Proceedings. Thirteenth Annual IEEE Symposium on Logic in Computer Science (LICS), pages 70–80, 1998. doi:10.1109/LICS.1998.705644.
- [7] J. Esparza, P. Ganty, and R. Majumdar. Parameterized verification of asynchronous shared-memory systems. In CAV’13, volume 8044 of LNCS, pages 124–140. Springer-Verlag, 2013. doi:10.1007/978-3-642-39799-8_8.
- [8] Javier Esparza, Alain Finkel, and Richard Mayr. On the verification of broadcast protocols. In 14th Annual IEEE Symposium on Logic in Computer Science, Trento, Italy, July 2-5, 1999, pages 352–359. IEEE Computer Society, 1999. doi:10.1109/LICS.1999.782630.
- [9] S. M. German and A. P. Sistla. Reasoning about systems with many processes. Journal of the ACM, 39(3):675–735, 1992. doi:10.1145/146637.146681.
- [10] Lucie Guillou, Arnaud Sangnier, and Nathalie Sznajder. Safety analysis of parameterised networks with non-blocking rendez-vous. In CONCUR'23, volume 279 of LIPIcs, pages 7:1–7:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.CONCUR.2023.7.
- [11] Lucie Guillou, Arnaud Sangnier, and Nathalie Sznajder. Safety verification of wait-only non-blocking broadcast protocols. In Application and Theory of Petri Nets and Concurrency - 45th International Conference, PETRI NETS 2024, volume 14628 of Lecture Notes in Computer Science, pages 291–311. Springer, 2024. doi:10.1007/978-3-031-61433-0_14.
- [12] Lucie Guillou, Arnaud Sangnier, and Nathalie Sznajder. Wait-only broadcast protocols are easier to verify, 2025. arXiv:2506.22144.
- [13] Peter Habermehl. On the complexity of the linear-time mu-calculus for petri nets. In Pierre Azéma and Gianfranco Balbo, editors, Application and Theory of Petri Nets 1997, 18th International Conference, ICATPN '97, Toulouse, France, June 23-27, 1997, Proceedings, volume 1248 of Lecture Notes in Computer Science, pages 102–116. Springer, 1997. doi:10.1007/3-540-63139-9_32.
- [14] John E. Hopcroft and Jean-Jacques Pansiot. On the reachability problem for 5-dimensional vector addition systems. Theor. Comput. Sci., 8:135–159, 1979. doi:10.1016/0304-3975(79)90041-0.
- [15] Dexter Kozen. Lower bounds for natural proof systems. In 18th Annual Symposium on Foundations of Computer Science, Providence, Rhode Island, USA, 31 October - 1 November 1977, pages 254–266. IEEE Computer Society, 1977. doi:10.1109/SFCS.1977.16.
- [16] Jérôme Leroux. Vector addition system reversible reachability problem. In Joost-Pieter Katoen and Barbara König, editors, CONCUR 2011 - Concurrency Theory - 22nd International Conference, CONCUR 2011, Aachen, Germany, September 6-9, 2011. Proceedings, volume 6901 of Lecture Notes in Computer Science, pages 327–341. Springer, 2011. doi:10.1007/978-3-642-23217-6_22.
- [17] Jérôme Leroux. The reachability problem for petri nets is not primitive recursive. In 62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022, pages 1241–1252. IEEE, 2021. doi:10.1109/FOCS52979.2021.00121.
- [18] Jérôme Leroux and Sylvain Schmitz. Reachability in vector addition systems is primitive-recursive in fixed dimension. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, Vancouver, BC, Canada, June 24-27, 2019, pages 1–13. IEEE, 2019. doi:10.1109/LICS.2019.8785796.
- [19] R.J. Lipton. The reachability problem requires exponential space. Research report (Yale University. Department of Computer Science). Department of Computer Science, Yale University, 1976. URL: https://books.google.fr/books?id=7iSbGwAACAAJ.
- [20] S. Schmitz and P. Schnoebelen. The power of well-structured systems. In CONCUR’13, volume 8052 of Lecture Notes in Computer Science, pages 5–24. Springer, 2013. doi:10.1007/978-3-642-40184-8_2.
- [21] Rüdiger Valk. Self-modifying nets, a natural extension of petri nets. In ICALP’78, volume 62 of LNCS, pages 464–476. Springer, 1978. doi:10.1007/3-540-08860-1_35.
