Abstract 1 Introduction 2 Preliminaries 3 Algorithm 4 Main Result 5 Concluding Remarks References

Deterministic Synchronous Self-Stabilizing BFS Construction with Constant Space Complexity

Lélia Blin ORCID Université Paris Cité, CNRS
IRIF, F-75013 Paris, France
Franck Petit ORCID Sorbonne Université, CNRS
LIP6, Paris, France
Sébastien Tixeuil ORCID Sorbonne Université, CNRS
LIP6, IUF, Paris, France
Abstract

In this paper, we resolve a long-standing open problem in self-stabilization asking whether it is possible to construct a spanning tree using constant memory per node in a synchronous semi-uniform networks, i.e., networks in which one node is distinguished. We design a synchronous self-stabilizing algorithm that constructs a breadth-first search (BFS) tree in any anonymous semi-uniform network using only a constant number of bits of memory per node. Crucially, our approach operates without any prior knowledge of global network parameters such as maximum degree, diameter, or number of nodes. In contrast to traditional self-stabilizing methods – such as pointer-to-neighbors, distance-to-root, or identifiers – that are unsuitable under strict memory constraints, our solution employs an innovative constant-space token dissemination mechanism. This mechanism effectively eliminates cycles and rectifies errors in the BFS structure, ensuring both correctness and memory efficiency. The proposed algorithm not only meets the stringent requirements of memory-constrained distributed systems, but also opens new avenues for research in the design of self-stabilizing protocols under severe resource limitations: constant space-complexity may not systematically prevent the existence of self-stabilizing algorithms for important non-trivial tasks.

Keywords and phrases:
Distributed algorithms, fault-tolerance, transient faults, self-stabilization, memory optimization
Funding:
Lélia Blin: Additional support from ANR projects ENEDISC (ANR-24-CE48-7768-01), and from the InIDEX Project METALG.
Franck Petit: Additional support from ANR project ANR SKYDATA (ANR-22-CE25-0008-02).
Sébastien Tixeuil: Additional support from ANR project SAPPORO (2019-CE25-0005).
Copyright and License:
[Uncaptioned image] © Lélia Blin, Franck Petit, and Sébastien Tixeuil; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Design and analysis of algorithms
; Mathematics of computing Discrete mathematics
Editor:
Dariusz R. Kowalski

1 Introduction

This paper addresses the challenge of developing memory-efficient self-stabilizing algorithms for the Breadth-First Search (BFS) Spanning Tree Construction problem. Self-stabilization [3, 18, 20, 23] is a versatile technique that facilitates recovery after arbitrary transient faults impact the distributed system, affecting both the participating processes and the communication medium. Essentially, a self-stabilizing protocol can restore the system to a legitimate configuration (from which its behavior satisfies its specification), beginning from an arbitrary, potentially corrupted, initial global state, without the need of any human intervention. A BFS spanning tree construction involves (i) each node identifying a single neighbor as its parent, with the exception of the root node, which has no parent, and (ii) the set of parent-child relationships forming an acyclic connected structure, ensuring that following the parent nodes from any node results in the shortest path to the root node. Memory efficiency pertains to the amount of information transmitted to neighboring nodes to enable stabilization. A smaller space complexity results in reduced information transmission, which (i) decreases the overhead of self-stabilization in the absence of faults or after stabilization, and (ii) facilitates the integration of self-stabilization and replication [22, 24].

We adopt the classical state model introduced in [18]. In this model, a node can read its own memory and the memory of its neighbors, and execute its algorithm accordingly when needed. This means that a node’s memory is accessible to all its neighbors, ensuring they all obtain the same information. Consequently, if a node v wants to communicate information exclusively to its neighbor u, v must indicate in its memory that the information is intended only for u. This indication is typically made by adding node u’s identifier. This type of communication is inherent to wireless networks, where nodes exchange information by broadcasting messages over one or more carrier waves. In such networks, the notion of neighboring node of v refers to any other node able to directly receiving messages emitted by v, typically without the aid of relays. Consequently, targeting a specific neighbor necessitates that the sender explicitly includes the recipient’s ID within the transmitted message. This identification requirement adds a significant overhead in memory space, specifically O(logn) bits in a network containing n nodes. Within anonymous semi-uniform networks, nodes are devoid of identifiers, and only a single node assumes a role distinct from the rest. More precisely, the distinguished node operates under a dedicated protocol, whereas all other nodes execute an identical protocol. As a result of anonymity, it is not possible to rely on identifiers to interact with or refer to a specific node.

In [21], the authors introduce the notion of a silent algorithm (defined in a model slightly different from the state model [18]), a fundamental concept that significantly influences the space complexity of self-stabilizing algorithms. An algorithm is said to be silent if every execution eventually reaches a point beyond which the state of all nodes remains unchanged. They show that in networks of n nodes, solving tasks such as leader election or spanning tree construction silently requires at least Ω(logn) bits of memory per node. However, in the state model, a self-stabilizing algorithm is not necessary silent, and talkative (i.e., non-silent) approaches were successful in reducing the memory cost of self-stabilizing algorithms [10, 11].

In general networks, such as those considered in this paper, self-stabilizing leader election is closely linked to self-stabilizing tree construction. On one hand, the presence of a leader enables efficient self-stabilizing tree construction [9, 13, 15, 16, 19, 25, 26, 27]. On the other hand, growing and merging trees is the primary technique for designing self-stabilizing leader election algorithms in networks, as the leader often serves as the root of an inward tree [1, 2, 4, 7, 10, 11].

From a memory perspective, the aforementioned works employ two algorithmic techniques for self-stabilization that negatively impact memory complexity. The first technique involves using a distance variable to store the distance of each node to the elected node in the network. This distance variable is employed in self-stabilizing spanning tree construction to break cycles resulting from arbitrary initial states, see for instance [1, 2, 4]. Clearly, storing distances in n-node networks may require Ω(logn) bits per node. Some works [6, 10, 11] distribute pieces of information about the distances to the leader among the nodes using different mechanisms, allowing for the storage of o(logn) bits per node. However, in general networks, the best result so far [11] uses O(loglogn+logΔ) bits per node, closely matched by the Ω(loglogn) bits per node lower bound obtained in [8].

The second technique involves using a pointer-to-neighbor variable to unambiguously designate a specific neighbor for each node (with the designated neighbor being aware of this). For tree construction, pointer-to-neighbor variables typically store the parent node in the constructed tree (with the parent being aware of its children). In a naive implementation, the parent of each node is designated by its identifier, requiring Ω(logn) bits for each pointer variable. However, it is possible to reduce the memory requirement to O(logΔ) bits per pointer variable in networks with a maximum degree of Δ by using node-coloring at distance 2 instead of identifiers to identify neighbors. The best available deterministic self-stabilizing distance-2 coloring protocol [11] achieves a memory footprint of O(loglogn+logΔ) bits per node, assuming unique identifiers. This implies that even self-stabilizing protocols solely relying on a constant number of pointer-to-neighbor variables per node [26, 16] actually use at best O(loglogn+logΔ) bits per node (assuming unique identifiers) in the classical state model, and that a lower bound of Ω(logΔ) bits per node also holds for these solutions (regardless of the nodes being anonymous or not).

We now turn to the results of Itkis and Levin [25], which constitute one of the works most closely related to ours. their approach, the authors propose a randomized asynchronous self-stabilizing algorithm for leader election, based on a deterministic construction of a BFS tree. Their algorithms are designed under the same model as in this paper, namely the state model, but augmented with pointer-to-neighbor variables. As already mentioned above, using such additional assumption, each node u can maintain a unique pointer to a neighbor vN(u); more importantly, this neighbor v is aware that it is being pointed to by u. Three deterministic BFS construction algorithms are presented by Itkis and Levin [25], proving successive improvements on their space complexity. Their first algorithm is a standard BFS construction based on the distance from each node to the root, and it is requiring O(logn) bits of memory per node. The other two algorithms are presented in natural language, and their formalization as a set of guarded rules is left undefined. The former is restricted to graphs of degree 2. It is based on a sophisticated and elegant cycle-detection mechanism based on the Thue-Morse sequence. Note that although no formal proof exist in the literature, the inherent properties of the Thue-Morse sequence and finite automata theory suggest that it is impossible to implement this sequence within finite memory. Nevertheless, the authors use a Thue-Morse-based certificate to obtain a constant space complexity per node. The second of the two algorithms generalizes the first to arbitrary graphs. This algorithm maintains a forest of pointers, and guarantees acyclicity of these pointers thanks to the use of the degree-2 algorithm. The latter is applied to each path from the root of the BFS tree to its leaves, where each path is identified thanks to a depth-first search (DFS) traversal of the BFS tree. These DFS traversals are benefiting from the pointer-to-neighbors variables. To keep the space complexity constant, the DFS traversals are performed successively. This requires to keep track of which DFS traversals have already been performed. This authors argue that this can be done using constant space complexity using the pointer-to-neighbor variables allowed by their model. However, in absence of such pointers, it is not clear how to implement this sequence of DFS traversals using less than Ω(logΔ) bits of memory per node [15, 17].

Overall, up to this paper, there exists no deterministic self-stabilizing solution to the spanning tree construction problem that uses a constant number of bits of memory per process. Furthermore, the reuse of known techniques makes this goal unattainable.

Our contribution.

We introduce the first deterministic self-stabilizing Breath-First Search Spanning Tree construction algorithm that utilizes a constant amount of memory per node in the classical state model, irrespective of network parameters such as degree, diameter, or size. Our algorithm functions in arbitrary topology networks and does not depend on any global knowledge.

In [8], it is shown that constructing a self-stabilizing spanning tree without a pre-existing leader using constant memory is impossible. Therefore, a necessary assumption for achieving constant space per node is that the anonymous network is semi-uniform, with a single node designated as the leader, serving as the root of the spanning tree, while all other nodes are anonymous and uniformly execute the same algorithm. Additionally, for synchronization purposes, our solution assumes a fully synchronous network, where all nodes operate at the same pace in a lock-step manner.

Our solution relies on a few key components.

First, each non-root node maintains a rank variable, which classifies neighbors as parents (if their rank is lower), siblings (if their rank is the same), or children (if their rank is higher). In the stabilized phase, the rank value corresponds to the distance to the root modulo three (hence, the BFS property of the constructed spanning tree), an idea already suggested in various non self-stabilizing settings [5, 12, 14]. The self-stabilizing property adds a significant challenge to the task, as the initial configuration may make the rank variable induce cycles or multiple roots, from which the algorithm must recover. Since identifying the single parent of each node (if it exists) cannot make use of node identifiers (as all non-root nodes are anonymous), instead we select the parent node (among the set of parents) whose port number is minimal. However, nodes do not explicitly communicate the specific parent they have selected, as this would necessitate O(logΔ) bits of memory.

The second component of our algorithm uses a similar technique as the Power Supply introduced by Afek and Bremler [1]. In both cases, the root continuously floods the network with tokens that have a limited lifespan to circumvent one of the main issues associated with token-based approaches: the infinite circulation of tokens within a cycle. The token’s lifespan is controlled by a binary counter, where each bit is stored on a distinct node. So, the tokens generated by the root decay exponentially (half of the tokens disappear at the root’s neighbors, half of the remaining ones at the neighbors’ downward neighbors, and so on). Tokens are used to destroy cycles and correct inconsistent situations that may appear in arbitrary initial configurations, observing that in synchronous networks, tokens passed deterministically arrive simultaneously at all nodes equidistant from the root (and thus at all neighbors with the same rank for any given node, if the ranks are correct). The exponential decay prevents tokens from looping indefinitely in the network, without relying on any global knowledge.

Overall, our algorithm uses only 6 bits of memory per node (actually, only 36 states), and has a stabilization time of O(2ε) time units, where ε denotes the root eccentricity (unknown to the participating nodes).

2 Preliminaries

In this section, we define the underlying distributed execution model considered in this paper, and state what it means for a protocol to be self-stabilizing.

A distributed system is an undirected connected graph, G=(V,E), where V is a set of vertices (or nodes) – |V|=n,n2 – and E is the set of edges connecting those vertices. The distance between two nodes u and v, denoted by dvu is the length of the shortest path between u and v in G. Note that since G is not oriented, dvu=duv, and if u=v, then dvu=0. Then, the eccentricity of a node v is the maximum distance from v to any other node in the network.

Vertices represent processes, and edges represent bidirectional communication links (that is, the ability for two nodes to communicate directly). We assume that each node v is able to distinguish each incident edge with a locally assigned unique label called port number. Port numbers are immutable and are not coordinated in any way: the edge (u,v) may correspond to port number k at u and to port number kk at v. Let N(v) be the set of port numbers, also called set of neighbors of v. The degree of v denotes |N(v)|.

We consider the state model in the context of semi-uniform anonymous networks. In this model, the communication paradigm allows each node to read its own memory as well as the memories of its neighbors. Let v be any node in V, and let {u0,,u|N(v)|} denote its set of neighbors. No neighbor ui,i{0,,|N(v)|}, can determine which port number in N(v) corresponds to the edge (v,ui). Conversely, since nodes have no identifiers, v cannot designate any particular ui to receive v’s information. Hence, a pointer-to-neighbor cannot be implemented solely by reading the neighbors’ variables.

A semi-uniform anonymous network is characterized by the absence of node identifiers and the presence of a unique distinguished node. This node serves as the root of the BFS tree and follows a distinct protocol, while all other nodes adhere to a uniform protocol. The protocol of a node consists of a set of variables and a set of (guarded) rules, of the following form:

<label>:<guard><statement>

Each node may write its own variables, and read its own variables and those of its neighboring nodes. The guard of a rule is a predicate on the variables of v and its neighbors. The statement of a rule of v updates one or more variables of v. A rule may be executed only if its guard evaluates to true. The rules are atomically executed, so the evaluation of a guard and the execution of the corresponding statement, if any, are done in one atomic step.

The state of a node is defined by the values of its variables. The configuration of a system is the product of the states of all nodes. Let Γ be the set of all possible configurations of the system. A distributed algorithm (or, protocol) 𝒜 is a collection of binary transition relations, denoted by , on Γ such that given two configurations γ1 and γ2, γ1γ2 by the atomic execution of guarded action of one or more nodes. A protocol 𝒜 induces an oriented graph Γ=(Γ,), called the transition graph of 𝒜. A sequence e=γ0,γ1,,γi,γi+1,, i0,γiΓ, is called an execution of 𝒜 if and only if i0,γiγi+1.

A node v is said to be enabled in a configuration γΓ if there exists a rule R such that the guard of R is true at v in γ. In this paper, we assume that each transition of Γ is driven by a synchronous scheduler. This means that, given a pair of configurations γ1,γ2Γ, then γ1γ2Γ if and only if all enabled nodes in γ1 execute an atomic rule during the transition γ1γ2.

Self-Stabilization.

A predicate P is closed for a transition graph Γ if and only if every configuration in any execution e starting from a configuration satisfying P also satisfies P. A predicate Q is an attractor of P, denoted PQ, if and only if: (i) Q is closed for Γ, and (ii) for every execution e on Γ beginning in a configuration satisfying P, e contains at least one configuration satisfying Q. A transition graph Γ is self-stabilizing for a predicate P when P is an attractor of the predicate true (formally, 𝑡𝑟𝑢𝑒P), ensuring convergence to P from any initial configuration.

Problem to be solved.

We assume a rooted graph, meaning that the set of vertices includes a specific node called the root, denoted by r. Let us call Vr the set V with the extra property that each node vV is labeled with dvr. For every node vr, let Predv be the set of neighboring nodes of v in Vr that are labeled with dvr1. Let Er be the subset of edges (ErE) such that for every edge uvEr, either u belongs to Predv, or v belongs to Predu. So, in Gr=(Vr,Er), no two neighbors have the same label. Then Gr is the directed version of Gr every edge is oriented toward the smaller label. By construction, Gr is a Directed Acyclic Graph (DAG) rooted in r such that every decreasing label path from every vVr is a path of minimum length between v and r. Note that this DAG is uniquely defined. In the remainder of the paper, we refer to Gr as BFS-DAG (rooted at r).

Let BFS3 be the projection of the BFS-DAG, where each vertex v is labeled with dvrmod3. The problem we consider in this paper consists in the design of a self-stabilizing distributed algorithm that builds the BFS3 using a constant amount of memory. From this construction, one can retrieve the BFS tree by choosing for each node (except the root) the parent port number of minimum value.

3 Algorithm

3.1 Overview of the algorithm

Our algorithm formally described in Algorithm 1 aims to build a distributed version of BFS3, meaning each node v computes its distance to the root r modulo 3. Our algorithm makes use of an infinite flow of tokens originating from the root downwards the BFS-DAG. Of course, since the initial configuration is arbitrary, the nodes’ variables may not originally match the BFS3, so the tokens may not initially flow according to the BFS-DAG.

For this algorithm to function properly, it is crucial that only the root creates tokens. When the root creates a token, all of its neighbors take that token in the next synchronous step, which implies that after one step, there are as many tokens as neighbors of the root. This process is then repeated from parents to children (according to the distance modulo 3 variables).

Since our model is synchronous, all tokens generated by the root at a given step progress while preserving an identical distance from the root.

A node v updates its distance modulo 3 based on the distance (modulo 3) of the neighbors from which it receives one or more tokens at the same synchronous step; if multiple neighbors send tokens, v merges them into a single token.

A common issue when using tokens to construct a spanning tree starting from an arbitrary configuration is the possibility of infinite token circulation within cycles, which prevents a valid spanning tree from forming. This problem may occur if tokens (not generated by the root) are already present in the initial configuration.

To address this, our algorithm ensures that each token has a limited lifespan, thereby avoiding infinite circulation. This limited lifespan is based on the following idea: each node maintains a bit, whose value indicate whether the token should be forwarded downwards; each time a token is received, the bit is flipped, effectively dropping half of the tokens that flow through the node. Globally, this process can be seen as creating a binary word along the paths from the root to the nodes. Tokens move along these paths, effectively performing binary addition one by one. A node v is allowed to collect a token if and only if all upward nodes having sent their tokens have their bit set to 1 before the addition. Otherwise, node v does not forward the tokens, leading to their disappearance.

Consider an execution of this approach on a chain of length , where all nodes v1 through v have their bits set to 0. Consequently, the binary word of length is 000000, note that the root does not contribute to this binary word. For simplicity, assume that a token moves from one neighbor to the next in just one step. If the root offers a token, its neighbor v0 does not pass the token to its neighbor v1 but changes its bit to 1. The resulting binary word is 100000. Next, when v1 acquires the token, it can pass the token to v2. At this point, v1 changes its bit to 0 and v2 changes its bit to 1, but the token is not forwarded to v3 and thus disappears. The resulting binary word is 010000. This process continues accordingly. From this scenario, observe that the token’s lifespan is very short and that for the node farthest from the root to receive the token, it may have to wait 2 steps.

This approach also circumvents one of the major issues in self-stabilizing spanning tree construction: the spanning tree is built here from the root toward the other nodes, and the root never waits for information from the spanning tree – typically expected from leaves up to the root – thus avoiding deadlock when no leaves exist (i.e., when only cycles map the network).

Although our algorithm is designed so that the distance modulo 3 of each node is eventually stable (i.e., eventually remains unchanged), Algorithm 1 is not silent. Indeed, even when the distributed version of BFS3 is built, an underlying token-passing mechanism keeps operating indefinitely.

3.2 Variables

Each node v maintains three variables. A variable that, upon stabilization of the algorithm, stores the distance to the root modulo 3, this variable, called the rank, is denoted by k. Consequently, for each node v, the variable kv takes values in {0,1,2,}. As explained in the sequel, the extra value is used to handle errors. When the system is stabilized, i.e., Variable kv is supposed to induce a kinship relationship matching BFS3. However, in the arbitrary initial configuration, this variable may be incorrect, affecting the relationship that should normally exist to form BFS3. Hence, cycles (induced by the parent relationships in the graph G) and incorrect DAGs (i.e., DAGs rooted on nodes other than the root, or with multiple roots) may exist in an initial configuration.

Therefore, we add a token diffusion mechanism to stabilize kv. The circulation of tokens is handled by the variable t. This variable takes values in {𝑡𝑟𝑢𝑒,𝑓𝑎𝑙𝑠𝑒,𝑤𝑎𝑖𝑡}. The states 𝑡𝑟𝑢𝑒 and 𝑓𝑎𝑙𝑠𝑒 indicate whether a node holds a token or not. The 𝑤𝑎𝑖𝑡 state serves two purposes: it prevents a node that already holds a token from immediately receiving another token, and it informs a token-holding node about which node(s) sent it the token. After stabilization, nodes cycle through the states 𝑓𝑎𝑙𝑠𝑒, 𝑡𝑟𝑢𝑒, and 𝑤𝑎𝑖𝑡, in that order. A node with a 𝑡𝑟𝑢𝑒 token moves to a 𝑤𝑎𝑖𝑡 token in the subsequent step, while a node with a 𝑤𝑎𝑖𝑡 token moves to a 𝑓𝑎𝑙𝑠𝑒 token in the subsequent step. A node may remain in the 𝑓𝑎𝑙𝑠𝑒 state for more than one step.

The last local variable, called bit, and denoted by b, is used to construct binary words originating at the root. For each node v, the variable bv can take values in {𝟎,𝟏,}. The states 𝟎 and 𝟏 are the normal states of a bit, and as explained further, is used to manage a reset mechanism when an error is detected locally.

Note that those three variables are also present at the root r. But, they are considered as constant with the following values: kr=0, tr=𝑡𝑟𝑢𝑒, and br=𝟏. The root has no rules to execute. Note that since the root has tr=𝑡𝑟𝑢𝑒 and br=𝟏, it continuously proposes a token to its neighbors.

3.3 Parent-child relationships sets and predicates

In the following, we denote by N(v) the set of neighbors of node v, and by pt(v,u) the port number of v leading to u. The parents relationship is defined by the rank variable (i.e., k) of each node. Let us denote by P(v) the set of v’s parents: P(v)={uN(v)|ku=(kv1)mod3}.

To obtain a BFS spanning tree, all nodes must choose a single parent among its parents’ set P(v). We use the minimum port number to break such symmetric cases.

p(v)={u|wP(v)pt(v,u)=min{pt(v,w)}if P(v)otherwise

Children of node v, noted C(v), are formally defined as: C(v)={uN(v)|ku=(kv+1)mod3}

Let S(v) be the set of siblings of node v defined as follow: S(v)={uN(v)|ku=kv}

3.4 Algorithm 𝚃𝚘𝚔𝙱𝚒𝚗

Algorithm 1 Algorithm 𝚃𝚘𝚔𝙱𝚒𝚗 for Node vV{r}.

Our algorithm 𝚃𝚘𝚔𝙱𝚒𝚗 (see Algorithm 1) is composed of seven rules. Note that our approach is not silent: once the system has stabilized, rules Rtok, Radd, and Rready are activated infinitely often by all nodes in the network. The others rules are executed in the stabilizing phase only.

Predicates, Macros, and Functions used in Algorithm 1 are formally defined in Subsections 3.4.1 and 3.4.2.

We use the state of variable bv to indicate that a node v is in an error state. Consequently, rule Rer detects local errors: a node v that is not in error yet observes an error (predicate Er(v)) activates this rule to transition to an error state. In the next step, a node already in an error state resets itself via rule Rreset.

For our algorithm to function correctly, it is necessary to clean as much as possible of the network when an error is detected. So, when an error is detected when executing rule Rer, the predicate Er(v) of its neighbors also becomes true, allowing the the error to propagate when the neighbors also execute rule Rer. This process may not propagate to the entire network if the root is an articulation point of the graph (that is, the graph minus the root node G is disconnected). However, the error does propagate to the entire connected component of G.

Rules Rtok, Radd, and Rready govern token circulation. The system is synchronous, and once the algorithm converges, node ranks remain unchanged. Therefore, when tokens circulate, a node v must receive its token(s) from every parent u satisfying bu=1. These conditions are captured by the predicate 𝑇𝑎𝑘𝑒𝑃(v), described formally later. If 𝑇𝑎𝑘𝑒𝑃(v) is true, v executes rule Rtok to acquire a token (possibly merging its parents’ tokens). When a node has a token, its children must retrieve it if its bit is 𝟏; otherwise, the token disappears. In either case, the node holding the token releases it by executing rule Radd, which also increments its b variable. Note that, to enable neighbors with tokens to detect which nodes have sent one token, rule Radd sets the token variable to 𝑤𝑎𝑖𝑡 rather than directly setting it to false. This delay simplifies the proof of the algorithm. Finally, after sending a token and incrementing its variable bv, node v can execute rule Rready to indicate that it is ready to receive a new token. Figure 1 illustrates a possible execution of these three rules.

Observe that a node v must receive token(s) from all of its parents to maintain the parent-child relationships; otherwise, v detects an error. This error is captured by predicate 𝑇𝑎𝑘𝑒𝑂(v), described formally below. In such a case, v executes rule RerRank to declare itself in an error state.

The last rule Rjoin, concerns nodes that have been reset, and therefore do not hold a rank. A reset node v that is offered a token by a set of nodes P – all sharing the same rank (predicate 𝑇𝑎𝑘𝑒𝑅(v)) – joins the spanning structure by taking the nodes in P as its parents. To do so, v adjusts its rank based on the nodes in P and also acquires a token.

(a)
(b)
(c)
(d)
(e)
Figure 1: The gray nodes in each subfigure represent activatable nodes. (a) Node v possesses and offers a token. Node u acquires the token by applying the rule Rtok, and node v increments its bit using Radd. (b) Node v applies rule Rready to indicate that it is ready to receive a new token. Node u holds the token, it releases the token and increments its bit by applying Radd. Simultaneously, the node w executes rule Rtok to acquire the token. (d) Node x holds the token; however, node y does not accept it because node x has bx=𝟎.

3.4.1 Token reception sets and predicates

In the remainder, any node satisfying predicate Reset(v) is referred to as a reset node:

Reset(v)kv=tv=𝑓𝑎𝑙𝑠𝑒bv=𝟎 (1)

In our approach, a node v may receive tokens from a neighbor u if and only if the neighbor’s binary variable satisfies bu=𝟏. The set 𝑇𝑜𝑘(v) represents the neighbors of v that may send a token to v.

𝑇𝑜𝑘(v)={uN(v)|tu=𝑡𝑟𝑢𝑒bu=𝟏} (2)

For our purpose, a distinction is made based on which node transmits the token to v. The two following predicates apply when node v has a valid rank (i.e., kv), allowing it to identify its parent set. The predicate 𝑇𝑎𝑘𝑒𝑃(v) holds if all proposed tokens are from all parents of v, while predicate 𝑇𝑎𝑘𝑒𝑂(v) holds if at least one token sender is not the parent of v, or if not all parents of v propose a token.

𝑇𝑎𝑘𝑒𝑃(v)𝑇𝑜𝑘(v)𝑇𝑜𝑘(v)=P(v) (3)
𝑇𝑎𝑘𝑒𝑂(v)𝑇𝑜𝑘(v)𝑇𝑜𝑘(v)P(v) (4)

The final predicate in this section is used by a reset node v. It verifies that all token sender nodes share the same rank, and that no node outside the token sender set possesses that same rank.

𝑇𝑜𝑘rk(v)=min{ku|u𝑇𝑜𝑘(v)}
𝑇𝑎𝑘𝑒𝑅(v)𝑇𝑜𝑘(v)(u,w𝑇𝑜𝑘(v)|ku=kw)(uN(v)𝑇𝑜𝑘(v)|ku𝑇𝑜𝑘rk(v)) (5)

When a new reset node v satisfies predicate 𝑇𝑎𝑘𝑒𝑅(v), it can join the spanning structure by becoming a child of nodes in Tok(v). In doing so, v adjust its rank accordingly, using function kt(v):

kt(v)=({ku|u𝑇𝑜𝑘(v)}+1)mod3 (6)

3.4.2 Error Detection sets and predicates

Let us introduce some predicates for all nodes in V{r}. Inconsistencies between variables can arise due to a corrupted initial configuration, necessitating detection by the algorithm. A node v with an invalid rank (kv=) is considered a reset node, and thus all its variables must reflect the state of a reset node. When a node v detects an error, it sets bv=. After detecting an error, the decision is made to clean the network. Consequently, any non-reset node that has an erroneous neighbor transitions to an error state. In other words, errors propagate from a node to its neighborhood, as captured by the following predicate.

Erprg(v)uN(v)|kubv= (7)

Predicate Ervar(v) detects inconsistencies between variables of a reset node.

Ervar(v)¬Erprg(v)bv(kv=(tv𝑓𝑎𝑙𝑠𝑒bv0)) (8)

If a node v has at least one child with a token, and v has not sent a token (i.e., ¬(tv=𝑤𝑎𝑖𝑡bv=𝟎)), then v detects an error thanks to predicate Ertp(v).

Ertp(v)¬Erprg(v)bvC(v)(uC(v)|tu=𝑡𝑟𝑢𝑒)¬(tv=𝑤𝑎𝑖𝑡bv=𝟎) (9)

We deal with an synchronous scheduler, and we construct a BFS3. As a consequence, all parents of a node v must have the same state (see predicate ErP(v)), all children of v must share the same state (see predicate ErC(v)), and all siblings of v must be in the same state as v (see predicate ErS(v)). Moreover, a node that has a valid rank, children and no token must not have any reset nodes as neighbors (see predicate Rt_nd(v)). In a normal execution of the algorithm, when a node offers the token, all nodes that do not have a valid rank join it as children. Consequently, a node cannot end up with both children and neighbors lacking a valid rank. All these constrains are captured by the following predicates.

ErP(v)(u,wP(v)|bubwtutw)(kvP(v)=) (10)
ErC(v)u,wC(v)|bubwtutw (11)
ErS(v)uS(v)|bubvtutv (12)
Rt_nd(v)kvC(v)tv𝑡𝑟𝑢𝑒(uN(v)ku=) (13)
ErN(v)¬Erprg(v)bv¬𝑇𝑎𝑘𝑒𝑂(v)(ErP(v)ErC(v)ErS(v)Rt_nd(v)) (14)

A node v with a valid rank (kv) must have a non empty set of parents, otherwise the node is in error:

FalseR(v)kvP(v)= (15)

Neighbors u and w of a reset node v with the same rank must have identical values for variables b and t. Otherwise, v detects an error and enters the error state (i.e., bv=) so that, in the next step, nodes u and w can also transition to the error state. Predicate Erπ(v) is dedicated to this purpose:

Erπ(v)Reset(v)(u,wN(v)|ku=kw(bubwtutw)) (16)

A node v can then apply the rest of the rules of the algorithm if it does not find an error in its neighborhood, and it is not in an error state itself. Predicate Er(v) captures this.

Er(v)Ervar(v)Ertp(v)ErN(v)FalseR(v)Erprg(v)Erπ(v) (17)

4 Main Result

We now state our main technical result.

Theorem 1.

In a semi-uniform model where r is the distinguished node, and every other node is anonymous, our synchronous deterministic self-stabilizing algorithm 𝚃𝚘𝚔𝙱𝚒𝚗 constructs a BFS tree rooted in r using O(1) bits per node in O(2ε) steps, where ε denotes the eccentricity of r.

Overview of Correctness.

To establish space complexity, we simply observe that each node vV maintains three local variables: kv,tv, and bv. Together, those variables require only 6 bits, ensuring a per-node space complexity of O(1) bits. The correctness proof is carried out in several stages. First, we demonstrate – using a potential function – that starting from an arbitrary configuration, our system converges and remains within a subset of configurations denoted by Γe¯. More precisely, Γe¯ is the set of configurations where obvious errors Ervar(γ0,v)=𝑡𝑟𝑢𝑒, Ertp(γ0,v)=𝑡𝑟𝑢𝑒, or ErN(γ0,v)=𝑡𝑟𝑢𝑒 (see Equations 8, 9, and 14) are no longer present.

Afterwards, we define the set of legal configurations Γ. A configuration is considered legal if, for every node v, v’s rank is equal to its distance from the root in the graph modulo 3. Moreover, we must ensure that the nodes’ ranks no longer change. In our synchronous setting, this requires that all shortest paths from the root to any node v are identical with respect to the nodes’ states on the paths. In other words, every node v at the same distance from the root must be in the same state (i.e., has identical values for variables tv and bv).

Then, we need to prove that, (i) starting from a possibly illegal (but free of obvious errors) configuration in Γe¯, the system converges to a legal configuration, and (ii) the system remains in the set of legal configurations Γ, thanks to our algorithm. Note that in Γe¯, only the rules Rtok, Radd, and Rready are executable – these are the rules that govern token circulation.

The second part of the proof (closure) is established by observing that at each step of the algorithm execution, the predicate defining legal configurations is maintained. The first par of the proof (convergence) is more involved, and is broken into several steps.

The remainder of the correctness proof is as follows. First, we show that within O(n) steps, all tokens present in the initial configuration are destroyed, resulting in configurations belonging to the set Γtr. A key component in achieving this is the predicate Ertp(v) (see Equation 9) and the rule Radd, which modifies the variable b. Recall that a node can receive the token if and only if its parent p has bp=𝟏. Therefore, if a node v receives the token when its parent offers it (i.e., when bp=𝟏), then the next time the parent offers it, v cannot accept the token since the parent has bp=0.

Now we can prove that, starting from a configuration in Γtr, if a node receives the token, it must have a legal rank (i.e., kv=dvrmod3). Next, we formalize the paths originating from the root to demonstrate that any node v whose every path from the root is legal receives the token in at most O(2dvr1) steps, where dvr is the distance from the root r to v in the graph (dvr is bounded by ε).

With these results, we have all the necessary tools to prove that our system converges to a set of error-free configurations, denoted by Γcl. More precisely, Γcl comprises configurations where nodes are either legal or have been restarted. Finally, starting from a configuration in Γcl, we prove convergence toward the set of legal configurations Γ.

5 Concluding Remarks

We presented the first constant space deterministic self-stabilizing BFS tree construction algorithm that operate in semi-uniform synchronous networks of arbitrary topology. Our solution 𝚃𝚘𝚔𝙱𝚒𝚗 is independent of any global network parameter (maximum degree, diameter, number of nodes, etc.). As classical techniques such as point-to-neighbor and distance-to-the-root cannot be used in such a constrained setting, we introduced a novel infinite token stream technique to break cycles and restore BFS construction that may be of independent interest for other tasks. We briefly mention how our BFS construction can be used to optimally color bipartite graphs (that is, with two colors Black and White). All nodes run 𝚃𝚘𝚔𝙱𝚒𝚗 so that eventually a BFS3 is constructed. With respect to colors, the root has color Black and retains it forever, while every other node take the opposite color of its parents whenever it executes 𝚃𝚘𝚔𝙱𝚒𝚗 (if it has parents). After stabilization of the BFS3 construction, all children of the root take the same color, and after one step their children take the root color, and so on, so that a 2-coloring of the bipartite graph is eventually achieved. This very simple application of 𝚃𝚘𝚔𝙱𝚒𝚗 still uses constant memory, as opposed to dedicated previous solutions to the same problem found in the literature [29, 28] that require Ω(logn) bits per node.

Two important open problems remain. First, our solution heavily relies on the network operation being fully synchronous (all nodes execute code in a lock-step synchronous fashion). It would be interesting to extend our token stream technique to an asynchronous setting. In this setting, we expect that some synchronization technique will be necessary, and it remains unclear whether such additional mechanism can be done in constant space. Second, while our algorithm uses a constant amount of memory per node, its time complexity (measured in steps) is exponential. This is due to the exponential decay mechanism for tokens that may require the root to send an exponential number of them to finally reach the furthest nodes in the network. One could speed up this process by adding more states to the b variables: if the domain of b becomes {𝟎,𝟏,,𝐤,}, and token are accepted for any value that is not 𝟎 or (assuming b is incremented modulo k anytime the incoming token is accepted), less tokens are destroyed, but the overall time complexity remains exponential. By contrast, solutions that make use of O(logΔ+loglogn) bits per node have polynomial time complexity [10]. Whether there exists a constant space and polynomial time solution to the problem is left for future research.

References

  • [1] Yehuda Afek and Anat Bremler-Barr. Self-stabilizing unidirectional network algorithms by power supply. Chic. J. Theor. Comput. Sci., 1998, 1998. URL: http://cjtcs.cs.uchicago.edu/articles/1998/3/contents.html.
  • [2] Yehuda Afek, Shay Kutten, and Moti Yung. Memory-efficient self stabilizing protocols for general networks. In Jan van Leeuwen and Nicola Santoro, editors, Distributed Algorithms, 4th International Workshop, WDAG ’90, Bari, Italy, September 24-26, 1990, Proceedings, volume 486 of Lecture Notes in Computer Science, pages 15–28. Springer, 1990. doi:10.1007/3-540-54099-7_2.
  • [3] Karine Altisen, Stéphane Devismes, Swan Dubois, and Franck Petit, editors. Introduction to Distributed Self-Stabilizing Algorithms. Synthesis Lectures on Distributed Computing. Morgan & Claypool Publishers, 2019.
  • [4] Anish Arora and Mohamed G. Gouda. Distributed reset. IEEE Trans. Computers, 43(9):1026–1038, 1994. doi:10.1109/12.312126.
  • [5] James Aspnes. Distributed breadth first search, 2005. Lecture notes of Distributed Computing, Fall 2005, http://pine.cs.yale.edu/pinewiki/DistributedBreadthFirstSearch.
  • [6] Baruch Awerbuch and Rafail Ostrovsky. Memory-efficient and self-stabilizing network {RESET} (extended abstract). In James H. Anderson, David Peleg, and Elizabeth Borowsky, editors, Proceedings of the Thirteenth Annual ACM Symposium on Principles of Distributed Computing, Los Angeles, California, USA, August 14-17, 1994, pages 254–263. ACM, 1994. doi:10.1145/197917.198104.
  • [7] Lélia Blin, Shlomi Dolev, Maria Gradinariu Potop-Butucaru, and Stephane Rovedakis. Fast self-stabilizing minimum spanning tree construction - using compact nearest common ancestor labeling scheme. In Nancy A. Lynch and Alexander A. Shvartsman, editors, Distributed Computing, 24th International Symposium, DISC 2010, Cambridge, MA, USA, September 13-15, 2010. Proceedings, volume 6343 of Lecture Notes in Computer Science, pages 480–494. Springer, 2010. doi:10.1007/978-3-642-15763-9_46.
  • [8] Lélia Blin, Laurent Feuilloley, and Gabriel Le Bouder. Optimal space lower bound for deterministic self-stabilizing leader election algorithms. Discret. Math. Theor. Comput. Sci., 25(1), 2023. doi:10.46298/DMTCS.9335.
  • [9] Lélia Blin, Maria Potop-Butucaru, and Stephane Rovedakis. A super-stabilizing log(n)log(n)-approximation algorithm for dynamic steiner trees. Theor. Comput. Sci., 500:90–112, 2013. doi:10.1016/J.TCS.2013.07.003.
  • [10] Lélia Blin and Sébastien Tixeuil. Compact deterministic self-stabilizing leader election on a ring: the exponential advantage of being talkative. Distributed Computing, 31(2):139–166, 2018. doi:10.1007/s00446-017-0294-2.
  • [11] Lélia Blin and Sébastien Tixeuil. Compact self-stabilizing leader election for general networks. J. Parallel Distributed Comput., 144:278–294, 2020. doi:10.1016/J.JPDC.2020.05.019.
  • [12] Christian Boulinier, Ajoy Kumar Datta, Lawrence L. Larmore, and Franck Petit. Time and space optimal distributed BFS tree construction. Information Processing Letters, 108(5):273–278, 2008. doi:10.1016/j.ipl.2008.05.016.
  • [13] NS Chen, HP Yu, and ST Huang. A self-stabilizing algorithm for constructing spanning trees. Information Processing Letters, 39:147–151, 1991. doi:10.1016/0020-0190(91)90111-T.
  • [14] Reuven Cohen, Pierre Fraigniaud, David Ilcinkas, Amos Korman, and David Peleg. Label-guided graph exploration by a finite automaton. ACM Trans. Algorithms, 4(4):42:1–42:18, 2008. doi:10.1145/1383369.1383373.
  • [15] Zeev Collin and Shlomi Dolev. Self-stabilizing depth-first search. Inf. Process. Lett., 49(6):297–301, 1994. doi:10.1016/0020-0190(94)90103-1.
  • [16] Ajoy K. Datta, Stéphane Devismes, Colette Johnen, and Lawrence L. Larmore. Analysis of a memory-efficient self-stabilizing BFS spanning tree construction. Theor. Comput. Sci., 955:113804, 2023. doi:10.1016/J.TCS.2023.113804.
  • [17] Ajoy Kumar Datta, Colette Johnen, Franck Petit, and Vincent Villain. Self-stabilizing depth-first token circulation in arbitrary rooted networks. Distributed Comput., 13(4):207–218, 2000. doi:10.1007/PL00008919.
  • [18] Edsger W. Dijkstra. Self-stabilizing systems in spite of distributed control. Commun. ACM, 17(11):643–644, 1974. doi:10.1145/361179.361202.
  • [19] S Dolev, A Israeli, and S Moran. Self-stabilization of dynamic systems assuming only read/write atomicity. Distributed Computing, 7:3–16, 1993. doi:10.1007/BF02278851.
  • [20] Shlomi Dolev. Self-Stabilization. MIT Press, 2000.
  • [21] Shlomi Dolev, Mohamed G. Gouda, and Marco Schneider. Memory requirements for silent stabilization. Acta Informatica, 36(6):447–462, 1999. doi:10.1007/S002360050180.
  • [22] Mohamed G. Gouda, Jorge Arturo Cobb, and Chin-Tser Huang. Fault masking in tri-redundant systems. In Ajoy Kumar Datta and Maria Gradinariu, editors, Stabilization, Safety, and Security of Distributed Systems, 8th International Symposium, SSS 2006, Dallas, TX, USA, November 17-19, 2006, Proceedings, volume 4280 of Lecture Notes in Computer Science, pages 304–313. Springer, 2006. doi:10.1007/978-3-540-49823-0_21.
  • [23] Ted Herman. Origin of self-stabilization. In Krzysztof R. Apt and Tony Hoare, editors, Edsger Wybe Dijkstra: His Life, Work, and Legacy, volume 45 of ACM Books, pages 81–104. ACM / Morgan & Claypool, 2022. doi:10.1145/3544585.3544592.
  • [24] Ted Herman and Sriram V. Pemmaraju. Error-detecting codes and fault-containing self-stabilization. Inf. Process. Lett., 73(1-2):41–46, 2000. doi:10.1016/S0020-0190(99)00164-7.
  • [25] Gene Itkis and Leonid A. Levin. Fast and lean self-stabilizing asynchronous protocols. In 35th Annual Symposium on Foundations of Computer Science, Santa Fe, New Mexico, USA, 20-22 November 1994, pages 226–239. IEEE Computer Society, 1994. doi:10.1109/SFCS.1994.365691.
  • [26] C Johnen. Memory-efficient self-stabilizing algorithm to construct BFS spanning trees. In Proceedings of the Third Workshop on Self-Stabilizing Systems, pages 125–140. Carleton University Press, 1997.
  • [27] Amos Korman, Shay Kutten, and Toshimitsu Masuzawa. Fast and compact self stabilizing verification, computation, and fault detection of an MST. In Cyril Gavoille and Pierre Fraigniaud, editors, Proceedings of the 30th Annual ACM Symposium on Principles of Distributed Computing, PODC 2011, San Jose, CA, USA, June 6-8, 2011, pages 311–320. ACM, 2011. doi:10.1145/1993806.1993866.
  • [28] Adrian Kosowski and Lukasz Kuszner. Self-stabilizing algorithms for graph coloring with improved performance guarantees. In Leszek Rutkowski, Ryszard Tadeusiewicz, Lotfi A. Zadeh, and Jacek M. Zurada, editors, Artificial Intelligence and Soft Computing - ICAISC 2006, 8th International Conference, Zakopane, Poland, June 25-29, 2006, Proceedings, volume 4029 of Lecture Notes in Computer Science, pages 1150–1159. Springer, 2006. doi:10.1007/11785231_120.
  • [29] Sumit Sur and Pradip K. Srimani. A self-stabilizing algorithm for coloring bipartite graphs. Inf. Sci., 69(3):219–227, 1993. doi:10.1016/0020-0255(93)90121-2.