DAG It Off: Latency Prefers No Common Coins

Amores-Sesar, Ignacio; Grøndal, Viktor; Holmgård, Adam; Ottendal, Mads

doi:10.4230/LIPIcs.DISC.2025.5

DAG It Off: Latency Prefers No Common Coins

Ignacio Amores-Sesar

Aarhus University, Denmark Viktor Grøndal

Aarhus University, Denmark Adam Holmgård

Aarhus University, Denmark Mads Ottendal

Aarhus University, Denmark

Abstract

We introduce Black Marlin, the first Directed Acyclic Graph (DAG)-based Byzantine atomic broadcast protocol in a partially synchronous setting that successfully forgoes the reliable broadcast and common coin primitives. Black Marlin achieves the optimal latency of 3 rounds of communication (4.25 with Byzantine faults) while maintaining optimal communication and amortized communication complexities. We present a formal security analysis of the protocol, accompanied by empirical evidence that Black Marlin outperforms state-of-the-art DAG-based protocols in both throughput and latency.

Keywords and phrases:

Atomic broadcast, DAG-based, Partial synchrony

Copyright and License:

2012 ACM Subject Classification:

Security and privacy

\rightarrow

Distributed systems security

Funding:

This work has been mostly funded by the Cryptographic Foundations for Digital Society, CryptoDigi, DFF Research Project 2, Grant ID 10.46540/3103-00077B, and partially funded by the Swiss National Science Foundation(SNSF) under grant agreement Nr. 188443 (Advanced Consensus Protocols) and by a grant from Avalanche, Inc. to the University of Bern.

Editor:

Dariusz R. Kowalski

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

Scalability poses a significant challenge for consensus protocols. Paradoxically, increasing the number of participants in consensus degrades its performance rather than enhancing it.

Traditional consensus protocols, such as Paxos [19], PBFT [10], and HotStuff [27], primarily adopt a leader-based approach. In this model, a designated leader proposes a value, which the remaining participants validate. This design results in an uneven distribution of workload, with the leader becoming a performance bottleneck. If we add more parties to a consensus protocol, we simply increase the communication load of the leader, degrading the performance of the protocol. Furthermore, if the leader behaves in a Byzantine manner, the system must invoke a view-change mechanism to replace it. These view-changes are sensitive to timing assumptions and become progressively more expensive [8] as the number of parties increases.

To address these limitations, Keidar et al. [16] introduced DAG-Rider, an elegant asynchronous solution that leverages the common core abstraction [9]. This approach enables every participant to act as a leader in the protocol. These instances are interwoven into a directed acyclic graph (DAG), on top of which a total ordering is achieved using a common coin [8], applied at the end of every wave: a set of four consecutive rounds.

DAG-Rider demonstrates remarkable throughput compared to leader-based protocols, but this comes at the cost of significantly increased latency. The primary cause of this latency is the repeated use of reliable broadcast [8] across each of the four rounds required to implement the common core primitive [9]. This triggered the effort of the research community [12, 26, 17, 15, 24, 6, 25, 5] aimed at mitigating this latency overhead. These contributions generally fall into two broad categories.

On the one hand, Cordial Miners [17] replaces the reliable broadcast rounds with bare-message communications, reducing communication overhead at the cost of increasing the wave length to five rounds (three in partial synchrony). On the other hand, a majority of the works [12, 26, 15, 24, 6, 25, 5] pursue orthogonal improvements, either through more efficient use of the common coin or by shortening the length of the waves under partial synchronous assumptions. These approaches have collectively succeeded in halving the latency of DAG-based protocols, marking a significant advance in the state of the art. However, their reliance on the reliable broadcast primitive continues to be a limiting factor. Moreover, to the best of our knowledge, no existing protocol has yet successfully elected an anchor in every round.

These are precisely the two limitations addressed in this paper. We introduce Black Marlin, the first partially synchronous, DAG-based protocol that elects an anchor in every round without relying on reliable broadcast. The simplest – yet most impactful – innovation is the substitution of the common coin with a deterministic round-robin mechanism. This substitution is valid because the FLP impossibility [14] only applies in asynchronous settings.

The primary advantage of the round-robin approach is that it allows block creation to be conditioned on the reception of the anchor block from a given round. To handle scenarios in which a malicious anchor fails to behave correctly, Black Marlin employs timeouts, thereby preventing deadlocks. This conditional block creation enables Black Marlin to elect an anchor in every round, fulfilling the long-standing goal of the research on DAG-based protocols.

Moreover, by operating in a partially synchronous model, Black Marlin is able to adapt ideas from Cordial Miners [17] to eliminate the need for reliable broadcast, using only digital signatures without the need for extra rounds of communication. The trade-off for achieving anchor election without reliable broadcast is a small constant delay and a minor reduction in commitment probability under Byzantine behavior, while maintaining optimal message complexity in both cases.

Specifically, Black Marlin commits the anchor of round $r-2$ upon completing round $r$ with probability $4/9$ under adversarial conditions. This probability increases to 1 when all participants behave honestly. As a result, the expected latency is 4.25 communication rounds under Byzantine behavior and the optimal 3 rounds in the honest case.

Additionally, timeouts are only triggered during asynchronous periods or when facing Byzantine faults. Thus, Black Marlin remains a reactive protocol, effectively leveraging the advantages of both synchronous and asynchronous models. We prove that Black Marlin implements atomic broadcast [8] together with all the performance results mentioned above. Furthermore, we benchmark Black Marlin against Bullshark, showing its practicality.

1.1 Structure

The paper is structured as follows. Section 1 briefly describes the paper. Section 2 details the state of the art of DAG-based protocols. Section 3 sets the notation and key concepts used in the paper. Section 4 describes Black Marlin in full detail. Section 5 shows that Black Marlin implements atomic broadcast and studies its complexities. Section 6 describes a comparison of the throughput and latency of Black Marlin and Bullshark.

2 Related work

DAG-based protocols have constituted a significant transformation in the field of consensus over the past few years.

This revolution began in the permissionless setting, with protocols such as Avalanche [22] and IOTA [20]. However, these protocols were initially introduced without thorough security analyses. As a result, several vulnerabilities were later discovered [11, 4, 3], which limited their success despite recent work demonstrating their optimality [2].

DAG-based consensus protocols were later extended to the permissioned setting, most notably with DAG-Rider by Keidar et al. [16]. DAG-Rider was the first protocol to leverage the properties of the common core primitive [9] to implement atomic broadcast in asynchronous networks. It outperforms traditional leader-based protocols [19, 10, 27] in terms of throughput. However, DAG-Rider requires on average six instances of reliable broadcast [7], resulting in 18 rounds of communication. Since its introduction, several works have aimed to address the latency limitations of DAG-Rider.

On one hand, Cordial Miners [17] reduces latency to an average of 7.5 communication rounds by replacing reliable broadcast with bare-message communication. The tradeoff of this approach is that its new communication paradigm requires an additional round per wave to compensate for the properties that reliable broadcast provides. When a party creates a block in Cordial Miners, it must include any blocks the recipient may have missed. We adopt this idea in the design of Black Marlin.

On the other hand, a widely adopted strategy to improve DAG-Rider’s latency has involved either assuming synchronous networks or modifying the core and coin primitives. Narwhal and Tusk [12] introduced a clever transaction preprocessing mechanism, where transactions are grouped into batches, so only the hashes of these batches are submitted to consensus. This batching is delegated to specialized workers, allowing the system to scale dynamically with transaction load. However, Narwhal and Tusk lack a rigorous security analysis, and liveness vulnerabilities have been identified [23].

Bullshark [26], which builds on top of Narwhal and Tusk, also introduces a variant for partial synchrony that improves latency to just three instances of reliable broadcast, about nine rounds of communication, on average. Bullshark is particularly relevant to our work, as its authors provide a public implementation [21]¹¹1Bullshark has been chosen over more recent frameworks such as Mysticeti [18] due to its closer similarity to Black Marlin and the timing of this project., enabling a fair, side-by-side comparison with Black Marlin. It is worth noting that both Narwhal and Tusk, as well as Bullshark, rely on reliable broadcast as a fundamental building block.

More recently, several protocols have been proposed targeting latency. These works largely follow the above trajectory. Mysticeti [6] reduces latency further by employing consistent broadcast, achieving an average-case latency of only three instances [8]. Sailfish [24] also improves latency, though it continues to rely on reliable broadcast. Similarly, Shoal and Shoal++ [25] provide enhancements in anchor selection, but still depend on reliable broadcast.

To the best of our knowledge, Black Marlin is the first DAG-based protocol to achieve optimal latency in partial synchrony, while electing an anchor in every round, a crucial feature for minimizing the latency of non-anchor blocks without relying on reliable broadcast. We also highlight that Black Marlin achieves these properties while preserving simplicity.

3 Model

3.1 Notation

We consider a set $\mathcal{N}=\{0,...,n-1\}$ of $n$ parties running a round-based algorithm. In each round, these parties create blocks that form the vertices of a directed acyclic graph $(\textit{DAG},\mathcal{R})$ with references that constitute the edges. Edge $(B_{1},B_{2})\in\mathcal{R}$ if $B_{1}$ includes a reference to $B_{2}$ . A block $B=[r,i,\textit{refs},\textit{refs}^{\prime}]$ is a tuple formed by a round number $r$ , an identifier for party $i$ , a set of strong references refs, and a set of weak references $\textit{refs}^{\prime}$ . For simplicity, we omit the set of transactions and the signature from the block, as these transactions play no role in the consensus mechanism. Given a block $B$ , $\textsl{creator}(B)$ denotes the party that created $B$ , $\textsl{round}(B)$ denotes the round in which $B$ was created. The variable $\textit{DAG}(r)$ denotes the set of block $B^{\prime}\in\textit{DAG}$ such that $\textsl{round}(B^{\prime})=r$ . A party selected by a round-robin mechanism is referred to as anchor, blocks produced by the anchor party are referred to as anchor blocks.

3.2 Adversary and Network

We model parties as interactive Turing machines (ITM). An interactive Turing machine is a Turing machine with an input and an output tape that allows the Turing machine to communicate with other Turing machines and make decisions based on the content of their input tape. The adversary is modeled as another ITM that corrupts up to $f$ parties at the beginning of the execution, the actual number of corruptions is denoted by $t\leq f<\frac{n}{3}$ . These corrupted parties obey the adversary; i.e., they may diverge from the execution of the protocol. Corrupted parties are often referred to as Byzantine and non-corrupted as honest.

We assume a partial synchronous communication model [13] on point-to-point links. The network begins in an asynchronous state and turns synchronous after a global stabilization time (GST). After this point, every message sent is delivered with a delay of up to $\Delta$ . The adversary not only decides the global stabilization time but also when the message is made available to the party within the $\Delta$ time limit. We also assume that the local computations are instantaneous, this assumption can easily be relaxed by factoring the computation of the honest parties in the network delay. We assume the existence of a public key infrastructure. We also assume that every block and a message produced by every party is signed, and that the signature scheme is secure.

3.3 Atomic broadcast

We model our protocol as atomic broadcast. Our atomic broadcast primitive is accessed with the events $\textsl{ab-broadcast}(B,i,r)$ and outputs events $\textsl{ab-deliver}(B,j,r)$ . The event $\textsl{ab-broadcast}(B,i,r)$ can only be triggered by party $i$ , but $\textsl{ab-deliver}(B,j,r)$ may be output by every party.

Definition 1 (Atomic broadcast).

A protocol solves atomic broadcast if it satisfies the following properties, except with negligible probability:

Validity.: If an honest party $i$ invokes $\textsl{ab-broadcast}(B,i,r)$ after the global stabilization time, then party $i$ eventually outputs $\textsl{ab-deliver}(B,i,r)$ .
Agreement.: If an honest party outputs $\textsl{ab-deliver}(B,i,r)$ , then all honest parties eventually output $\textsl{ab-deliver}(B,i,r)$ .
Integrity.: For any party $i$ and round $r$ , every honest party outputs $\textsl{ab-deliver}(B,i,r)$ at most once regardless of $B$ . Moreover, if $i$ is honest, then $i$ has invoked $\textsl{ab-broadcast}(B,i,r)$ .
Total order.: If an honest party outputs $\textsl{ab-deliver}(B,i,r)$ before $\textsl{ab-deliver}(B^{\prime},j,r^{\prime})$ , then no honest party outputs $\textsl{ab-deliver}(B^{\prime},j,r^{\prime})$ before $\textsl{ab-deliver}(B,i,r)$ .

Note that our definition of atomic broadcast is adapted to the partial synchrony model; the validity property applies to blocks that are broadcast after the network becomes synchronous. However, the safety properties are satisfied during both synchrony and asynchrony.

4 Protocol

Black Marlin (Algorithm 2) proceeds in rounds. In round $r$ , honest parties wait to receive blocks from at least $n!-!f$ parties. They then check if an anchor for round $r$ has been received and the anchors for rounds $r-2$ and $r-1$ appear in the past of at least $n-f$ blocks from rounds $r-1$ and $r$ , respectively. If both hold, a new block is created referencing all received blocks, and the party advances to round $r+1$ . Otherwise, the party waits until the conditions are met or a timeout triggers block creation. These timeouts ensure liveness under Byzantine anchors and are only used when progress is blocked, allowing the protocol to be responsive. Figure 1 illustrates an example execution.

Figure 1: An example of execution of Black Marlin (Algorithm 2) with

f=1

and

n=4

. Every block has references to at least

3

blocks from the previous round. The blocks

B_{0},...,B_{5}

are the anchor blocks for the corresponding round. Note that blocks

B_{0}

,

B_{1}

,

B_{2}

, and

B_{3}

have enough support (i.e., at least 3 references) to be committed. However,

B_{3}\notin\textsl{strong}(B_{4})

, thus

B_{0}

,

B_{1}

, and

B_{2}

, satisfy the commitment rule (L18–32) while

B_{3}

does not. After the global stabilization time, this situation cannot occur without malicious behavior, as

B_{3}

would satisfy

B_{3}\in\textsl{past}(B_{4})

. This situation has been constructed to illustrate the commitment rule.

4.1 Validity predicates

A block from round $r\neq 1$ and creator $j$ is valid if it is signed by party $j$ and contains references to valid blocks created by at least $n-f$ parties from round $r-1$ . Since blocks in round $r=1$ do not reference any previous blocks, a block is valid if it has been signed by its creator. A history $\mathcal{H}$ is valid if all the blocks contained in it are also valid.

When receiving a message $[\textsc{block},{B^{\prime},\mathcal{H}^{\prime},j,r^{\prime}}]$ (L57), party $i$ checks the validity of both the block $B^{\prime}$ and the history $\mathcal{H}^{\prime}$ , and adds the block $B^{\prime}$ along with every unknown block from the history $\mathcal{H}^{\prime}$ to its local view (DAG). Party $i$ also updates party $j$ ’s local history (L60), used solely to optimize message complexity by sending only strictly necessary blocks.

4.2 Initialization

The first round of the protocol differs from the rest as there are no blocks from earlier rounds. Party $i$ creates a block with an empty set of references and invokes the event $\textsl{ab-broadcast}(B,i)$ before sending the message $[\textsc{block},{B,\{\},i,0}]$ to every party and starting a timeout with time $2\Delta$ . Party $i$ then moves to the next round, a standard round.

4.3 Round structure

Every round beyond the first shares the same structure. Consider an honest party $i$ in round $r$ , i.e., it has just produced a block corresponding to round $r$ . Party $i$ waits to receive messages $[\textsc{block},{B,\mathcal{H},j,r^{\prime}}]$ from at least $n-f$ different parties for some round $r^{\prime}\geq r$ . Once sufficient messages for such a round have been received, party $i$ checks whether it has received a message $[\textsc{block},{B^{\prime},\mathcal{H}^{\prime},j,r^{\prime}}]$ with $j$ being the anchor for the current round, and verifies that the previous two anchors have enough support, or if a timeout has been triggered, to conclude the round. This dual condition to start a new round allows the protocol to be responsive, i.e., it runs at the speed of the actual network delay instead of the estimated network delay $\Delta$ without deadlocking when an anchor party behaves maliciously.

Party $i$ concludes the round by performing the following sequential operations. First, $i$ attempts the delivery of blocks (L42); this function is described in its own paragraph below. Secondly, party $i$ creates a new block $B$ corresponding to the next round $r^{\prime}+1$ , including strong references refs to every block in $\textit{DAG}(r^{\prime})$ (L43–50). A set of weak references $\textit{refs}^{\prime}$ to blocks from previous rounds are also included; this secondary set is time-bounded, preventing the usual garbage collection problem associated with these solutions. Party $i$ concludes these operations by sending $B$ to every other party, together with the history of the parties (L54). Note that party $i$ may skip one or multiple rounds if an appropriate quorum is received.

Figure 2: Illustration of the commit rule of Black Marlin. When a party invokes the commit function on an anchor block

B_{1}

, it will recursively commit earlier uncommitted anchor blocks first. The subscript numbers on the blocks indicate the order in which the commit function is called. In this process, blocks

B_{1}

B_{2}

,

B_{3}

, and

B_{4}

are ab-delivered, in the reverse order of their commitment. Before committing any anchor block, all non-delivered blocks in their causal history are deterministically sorted and committed. In the illustration, these correspond to the smaller vertices contained within the history boxes associated with each anchor block.

4.4 Delivery

The delivery function determines which blocks are delivered and in what order, based on the local view of the DAG. Upon concluding round $r$ , party $i$ attempts the commitment of the anchor block $B$ from round $r-2$ (L14–17). $B$ is an anchor block if its creator has been elected by the round-robin mechanism. However, it is only committed if it is supported by at least $n-f$ parties in round $r-1$ and including an anchor of round $r-1$ $B^{\prime}$ , and $B^{\prime}$ is also supported by at least $n-f$ parties in round $r$ . These conditions prevent honest parties from committing different blocks when the anchor party is Byzantine.

If this check succeeds, party $i$ calls the function $\textsl{commit}(B)$ (L18–32), which iterates through the local view of the DAG. Party $i$ searches for a non-delivered block $B^{\prime}$ from a higher round reachable through strong references from $B$ . If such a block $B^{\prime}$ exists, the function $\textsl{commit}(B^{\prime})$ is called recursively. If no such block exists, party $i$ deterministically orders the blocks in $\textsl{past}(B)$ before invoking ab-deliver on them. In case of conflicting blocks from the same party and round, only the first block – determined by the deterministic ordering above – is ab-delivered. Figure 2 shows an example of the commitment rule.

4.5 Functions

Black Marlin (Algorithm 2) uses special functions as described in the paragraphs above. For the sake of the reader, we summarize the functions together in this section.

Algorithm 1 State and functions (party

i

).

Given a block $B$ , the function $\textsl{creator}(B)$ returns the party that created and signed $B$ , and the function $\textsl{round}(B)$ returns the round in which $B$ was created. The validity predicate $V(B)$ returns $1$ if $B$ has at least $n-f$ blocks from different parties and no two blocks from the same party corresponding to $\textsl{round}(B)-1$ in its references, and $B$ has been signed by its creator. When invoked with a set of blocks $\mathcal{H}$ , the validity predicate $V(\mathcal{H})$ returns $1$ if every block $B^{\prime}\in\mathcal{H}$ satisfies $V(B^{\prime})$ . The function $\textsl{past}(B)$ returns the set of blocks reachable from $B$ through strong and weak links, and the function $\textsl{strong}(B)$ returns the set of blocks reachable from $B$ through strong links, $B$ is not part of the output of either function. The function $\textsl{supp}(B)$ counts the number of parties $i$ that created a block $B_{i}$ in round $\textsl{round}(B)+1$ such that $B\in\textsl{strong}(B_{i})$ and no other block from the same creator and round is in $\textsl{strong}(B_{i})$ . The function $H(B)$ is a collision-resistant hash function. The operation $\textsl{setTimeout}(r,\rho)$ starts a timeout of duration $\rho$ corresponding to round $r$ , whereas $\textsl{timeout}(r)$ triggers when the timeout is completed and $\textsl{clearTimeout}(r)$ removes any active timeout corresponding to round $r$ . The function $\textsl{time}()$ returns the local time of the party and when input with a round $r$ , $\textsl{time}(r)$ returns the local time when the party adopted round $r$ . In the case that the party never adopted round $r$ , $\textsl{time}(r)$ returns the time when the party adopted the smallest round $r^{\prime}>r$ . The function $\textsl{RR}(r)$ takes a round as an input and returns the party by the round-robin mechanism in round $r$ , a block from this party is also referred to as anchor. In case multiple blocks exist, e.g. due to Byzantine behavior, the function returns both blocks. The variable $\textit{DAG}(r)$ returns the set of blocks from round $r$ in the local view of the party. The operation $\textsl{maxAnchor}(\mathcal{A})$ takes a set of blocks $\mathcal{A}$ as an input and returns the anchor from the highest round. Given a round $r$ , the function $\textsl{quorum}(r)$ returns 1 if there are blocks from at least $n-f$ parties in $\textit{DAG}(r)$ . The function $\textsl{anchor}(r)$ returns 1 if there is a block created by the anchor of $r$ in $\textit{DAG}(r)$ , and $\textsl{suppAnchor}(r)$ returns 1 if there is an anchor block in $r$ that is supported by at least $n-f$ parties in $\textit{DAG}(r+1)$ .

5 Analysis

Black Marlin (Algorithm 2) is designed for partially synchronous networks, which means that it may not be alive during the asynchronous period. However, its safety properties are still guaranteed during this period. This fact is reflected in the analysis below.

5.1 Safety

The results in this section apply to every block, even during the asynchronous phase.

Lemma 2.

Every honest message $[\textsc{block},{B^{\prime},\mathcal{H}^{\prime},j,r}]$ received by an honest party $i$ is valid, i.e., satisfies line L58 in the view of $i$ .

Proof.

A message $[\textsc{block},{B^{\prime},\mathcal{H}^{\prime},j,r}]$ is valid if block $B^{\prime}$ and history $\mathcal{H}^{\prime}$ satisfy the validity predicate $V$ . If the sender is honest, both $B^{\prime}$ and $\mathcal{H}^{\prime}$ satisfy the validity predicate. Thus, it is only left to prove that party $i$ has enough information to verify the validity predicate. The construction of the history $\mathcal{H}^{\prime}$ guarantees this fact. $\hfill\blacktriangleleft$

Lemma 2 applies whenever an honest party sends a message. For the sake of readability, we omit explicit references to this lemma each time an honest message is sent.

Lemma 3.

Given two honest parties $i$ and $j$ and blocks $B_{i}$ and $B_{j}$ from round $r$ , if $B_{i}$ satisfies line L16 in the view of party $i$ and $B_{j}$ satisfies line L16 in the view of party $j$ , then $B_{i}=B_{j}$ .

Proof.

Given a block $B_{i}$ created by party $k$ in round $r$ that satisfies line L16 in the view of some party, it follows that $\textsl{creator}(B_{i})=\textsl{RR}(r)$ and $\textsl{supp}(B_{i})\geq n-f$ . Recall that, by definition of the support function, an honest party supports at most one block per party per round. Therefore, if blocks $B_{i}$ and $B_{j}$ both satisfy line L16 in the views of parties $i$ and $j$ respectively, then the following condition must be satisfied $\textsl{creator}(B_{i})=\textsl{creator}(B_{j})=\textsl{RR}(r)$ and $\textsl{supp}(B_{i}),\textsl{supp}(B_{j})\geq n-f$ . Given the corruption threshold condition of $n\geq 3f+1$ , it is impossible for two distinct blocks to each have support from at least $n-f$ parties. Hence, it must be the case that $B_{i}=B_{j}$ . $\hfill\blacktriangleleft$ Lemma 3 states that at most one block is honestly committed per round. Furthermore, these committed blocks create a chain in the DAG, as we prove in the following lemmas.

Algorithm 2 Black Marlin (party

i

).

Lemma 4.

Consider an honest party $i$ and the maximal round $r$ such that $\textit{DAG}(r)\neq\emptyset$ in the view of party $i$ , then $|\textit{DAG}(r^{\prime})|\geq n-f$ for every round $r^{\prime}<r$ .

Proof.

Consider the maximal round $r$ and a block $B\in\textit{DAG}(r)$ . Since $B$ is a valid block, there exists valid blocks from at least $n-f$ parties in $\textsl{strong}(B)$ corresponding to round $r-1$ . Thus, $|\textit{DAG}(r-1)|\geq n-f$ . Now, consider any block in $\textit{DAG}(r-1)$ and iterate recursively. We conclude that $|\textit{DAG}(r^{\prime})|\geq n-f$ for every round $r^{\prime}<r$ . $\hfill\blacktriangleleft$

In contrast to traditional leader-based protocols, a block in DAG-based protocols must be in the past of an overwhelming majority of blocks to ensure its delivery. The following result formalizes this requirement for committed blocks.

Lemma 5.

If a block $B$ from round $r$ satisfies line L16 in the view of some honest party $i$ , then every valid block $B^{\prime}$ from round $r^{\prime}\geq r+2$ satisfies $B\in\textsl{past}(B^{\prime})$ .

Proof.

By definition of line L16, $\textsl{supp}(B)\geq n-f$ , thus $B\in\textsl{past}(B^{\prime})$ of blocks $B^{\prime}$ from at least $n-f$ different parties in round $r+1$ . We proceed by induction over $\rho:=r^{\prime}-r$ :

$\blacksquare$

Let $B^{\prime}$ be any block from round $r+2$ . By construction, there are blocks from at least $n-f$ in $\textit{DAG}(r+1)$ , and up to $f$ of them could be Byzantine. Block $B$ is in the past of blocks from round $r+1$ from at least $n-f$ different parties, thus $B\in\textsl{past}(B^{\prime})$ . We conclude $B\in\textsl{past}(B^{\prime})$ of any block $B^{\prime}$ from round $r+2$ .
$\blacksquare$

Assume that every block $B_{0}$ from round $r+\rho_{0}$ satisfies $B\in\textsl{past}(B_{0})$ and let $B^{\prime}$ be a block from round $r^{\prime}=r+\rho_{0}+1$ . We have that $B\in\textsl{past}(B^{\prime})$ , since there are at least $n-f$ blocks in $\textsl{past}(B^{\prime})$ and every block $B_{0}$ from round $r+\rho_{0}$ fulfills $B\in\textsl{past}(B_{0})$ .

We conclude that any block $B$ from round $r$ that satisfies line L16 in the view of some honest party, then every block $B^{\prime}$ from round $r^{\prime}\geq r+2$ satisfies $B\in\textsl{past}(B^{\prime})$ . $\hfill\blacktriangleleft$

Lemma 6.

If honest parties $i$ and $j$ commit blocks $B_{i}$ and $B_{j}$ respectively, then $B_{i}=B_{j}$ , $B_{i}\in\textsl{past}(B_{j})$ , or $B_{j}\in\textsl{past}(B_{i})$ .

Proof.

Let $\textsl{round}(B_{i})$ and $\textsl{round}(B_{j})$ denote the rounds in which parties $i$ and $j$ commit blocks $B_{i}$ and $B_{j}$ , respectively. We analyze three cases based on the values of these rounds:

$\blacksquare$

Case 1: $\textsl{round}(B_{i})=\textsl{round}(B_{j})$ .
By Lemma 3, at most one block can be committed per round. Hence, $B_{i}=B_{j}$ .
$\blacksquare$
Case 2: $\textsl{round}(B_{i})<\textsl{round}(B_{j})$ . We further distinguish two sub-cases:
- –
  
  Case 2a: $\textsl{round}(B_{j})=\textsl{round}(B_{i})+1$ .
  By the second condition in line L16, there exists a block $B_{j}^{\prime}\in\textit{DAG}(\textsl{round}(B_{j}))$ such that $\textsl{creator}(B_{j})=\textsl{creator}(B_{j}^{\prime})$ and $B_{i}\in\textsl{strong}(B_{j}^{\prime})$ . Since Byzantine parties may create multiple blocks, we cannot immediately conclude that $B_{j}=B_{j}^{\prime}$ . However, the third condition of line L16 ensures that both $B_{j}$ and $B_{j}^{\prime}$ satisfy $\textsl{supp}(B_{j}),\textsl{supp}(B_{j}^{\prime})\geq n-f$ , which, by Lemma 3, implies $B_{j}=B_{j}^{\prime}$ . Thus, $B_{i}\in\textsl{past}(B_{j})$ .
- –
  
  Case 2b: $\textsl{round}(B_{i})+1<\textsl{round}(B_{j})$ .
  In this case, Lemma 5 guarantees that $B_{i}\in\textsl{past}(B_{j})$ .
$\blacksquare$

Case 3: $\textsl{round}(B_{i})>\textsl{round}(B_{j})$ .
This is symmetric to Case 2 and follows by applying the same reasoning, yielding $B_{j}\in\textsl{past}(B_{i})$ .

We conclude that any two blocks committed by honest parties satisfy that both parties commit the same block or blocks in the past of each other. $\hfill\blacktriangleleft$ This chain of committed blocks is common across honest parties, thus defining a partial order common to every honest party.

5.2 Liveness

Liveness is guaranteed only after the global stabilization time. For this reason, the majority of the results presented here apply only to blocks that are ab-broadcast after time GST.

Lemma 7.

If an honest party invokes $\textsl{ab-broadcast}(B,i,r)$ at time $T\geq\textit{GST}$ , then every honest party $j$ that invokes $\textsl{ab-broadcast}(B^{\prime},j,r)$ , does it before time $T+3\Delta$ .

Proof.

Assume that party $i$ invokes $\textsl{ab-broadcast}(B,i,r)$ at time $T\geq\textit{GST}$ . Then, according to line L55, $i$ sends a message of the form $[\textsc{block},{B,\textit{DAG}\setminus\textsl{history}[j],i,r}]$ to every party. An honest party $j$ receives this message by time $T+\Delta$ . Since the message includes all the strong references of $B$ , party $j$ ’s view satisfies $|\textit{DAG}(r-1)|\geq n-f$ . As a result, $j$ can create a block for round $r$ once the anchors have sufficient support or a timeout of duration $2\Delta$ triggers. Therefore, $j$ can create its block by time $T+3\Delta$ . $\hfill\blacktriangleleft$

Lemma 5 guarantees that committed anchor blocks are in the past of future blocks. The result below generalizes this property to all honest blocks created after the global stabilization time GST.

Lemma 8.

If honest party $i$ invokes $\textsl{ab-broadcast}(B,i,r)$ at time $T\geq\textit{GST}$ , then $B\in\textsl{past}(B^{\prime})$ of every honestly $\textsl{ab-broadcast}(B^{\prime})$ block $B^{\prime}$ after time $T+\Delta$

Proof.

Assume that party $i$ invokes $\textsl{ab-broadcast}(B,i,r)$ at time $T$ , then $i$ sends a message of the form $[\textsc{block},{B,\textit{DAG}\setminus\textsl{history}[j],i,r}]$ to every party (L55). Any block $B^{\prime}$ created by party $j$ after receiving the message $[\textsc{block},{B,\textit{DAG}\setminus\textsl{history}[j],i,r}]$ satisfies $B\in\textsl{past}(B^{\prime})$ . Lemma 7 guarantees that the message $[\textsc{block},{B,\textit{DAG}\setminus\textsl{history}[j],i,r}]$ is not garbage collected, thus proving the statement. $\hfill\blacktriangleleft$

Lemma 9.

Given an honest party $i$ and a round $r\in\mathbb{N}$ , $\exists r^{\prime}\geq r$ such that $i$ invokes $\textsl{ab-broadcast}(B,i,r^{\prime})$ for some block $B$ .

Proof.

Let $r$ be a round, and define $r^{\prime}\geq r$ as the first round such that no honest party has executed it before time GST, and $\textsl{RR}(r^{\prime})=i$ . We show that party $i$ eventually invokes $\textsl{ab-broadcast}(B,i,r^{\prime})$ for some block $B$ .

Consider line L40 as executed in round $r^{\prime}$ by any honest party $j$ . Since $i$ is the leader, no honest party concludes round $r$ without a block from $i$ unless its timeout triggers. Consider $j$ to be the first honest party that concluded round $r^{\prime}-1$ , the duration of its timeout is $2\Delta$ . This guarantees that party $i$ receives the quorum of round $r-1$ sent by $j$ (at most $\Delta$ delay) before any honest party concludes round $r^{\prime}$ , thus party $i$ invokes ab-broadcast( $B,i,r^{\prime}$ ) $\hfill\blacktriangleleft$

Lemma 10.

At time $T\geq\textit{GST}$ , if an honest party concludes a round $r$ (L40) such that the anchors from rounds $r-2$ and $r-1$ are honest, then $\textsl{suppAnchor}(r-2)\geq n-f$ and $\textsl{suppAnchor}(r-1)\geq n-f$ .

Proof.

At time $T\geq\textit{GST}$ , assume an honest party concludes round $r$ (L40) and that the anchors from rounds $r-2$ and $r-1$ are also honest. By construction, if a party creates a block for round $r$ at time $T^{\prime}$ , then every honest party receives this block by time $T^{\prime}+\Delta$ and subsequently creates its own block, which is received by all honest parties by time $T^{\prime}+2\Delta$ .

Given that both anchors from rounds $r-2$ and $r-1$ are honest, it follows that $\textsl{suppAnchor}(r-2)\geq n-f$ and $\textsl{suppAnchor}(r-1)\geq n-f$ , since there are at least $n-f$ honest parties contributing to support anchors in each round. $\hfill\blacktriangleleft$

Lemma 10 states that after the GST, the timeout is not triggered with honest anchors.

Lemma 11.

After the global stabilization time, given a round $r$ , the probability that honest party $i$ commits a block in $r$ is at least $\frac{(n-t)^{2}}{n^{2}}$ . Thus the expected number of rounds until a block is committed is at most $\frac{n^{2}}{(n-t)^{2}}$ .

Proof.

Given a round $r$ , party $i$ commits a block $B\in\textit{DAG}(r-2)$ (L16) if and only if the following conditions are satisfied:

$\blacksquare$

The creator of $B$ has been elected by the round-robin mechanism in round $r-2$ , i.e., $\textsl{creator}(B)=\textsl{RR}(r-2)$ , and the block has sufficient support, $\textsl{supp}(B)\geq n-f$ .
$\blacksquare$

There exists a block $B^{\prime}\in\textit{DAG}(r-1)$ such that $B\in\textsl{strong}(B^{\prime})$ , $\textsl{creator}(B^{\prime})=\textsl{RR}(r-1)$ , and $\textsl{supp}(B^{\prime})\geq n-f$ .

Assume that the anchors from rounds $r-2$ and $r-1$ are both honest; this implies the existence of $B\in\textit{DAG}(r-2)$ and $B^{\prime}\in\textit{DAG}(r-1)$ , as described above. According to Lemma 10, both blocks have enough support: $\textsl{supp}(B)\geq n-f$ and $\textsl{supp}(B^{\prime})\geq n-f$ . Since there are $n-t$ honest parties, the anchors from rounds $r-2$ and $r-1$ are both honest with probability $\frac{(n-t)^{2}}{n^{2}}$ . Hence, party $i$ commits a block in round $r$ with probability at least $\frac{(n-t)^{2}}{n^{2}}$ and commits every a block every at most $\frac{n^{2}}{(n-t)^{2}}$ rounds on average. $\hfill\blacktriangleleft$ This lemma shows that, under maximum number of corruptions, honest parties commit a block every at most $9/4$ rounds. Nonetheless, a clever adversary may delay commitment for a short period of time ( $t$ rounds), at the cost of allowing the network to commit optimally (every round) outside this period.

Lemma 12.

Given two honest parties $i$ and $j$ and a block $B$ , then $i$ commits $B$ if and only if $j$ eventually commits $B$ .

Proof.

Assume that an honest party $i$ commits a block $B$ . By Lemma 11, any other honest party $j$ will eventually commit a block $B^{\prime}$ such that $\textsl{round}(B^{\prime})\geq\textsl{round}(B)$ . Lemma 6 applied to blocks $B$ and $B^{\prime}$ implies that either $B=B^{\prime}$ or $B\in\textsl{past}(B^{\prime})$ . Since $\textsl{round}(B^{\prime})\geq\textsl{round}(B)$ , it cannot be the case that $B^{\prime}\in\textsl{past}(B)$ unless $B=B^{\prime}$ . If $B^{\prime}=B$ , then party $j$ directly commits the same block. If $B\in\textsl{past}(B^{\prime})$ and $B\neq B^{\prime}$ , then by construction of the delivery function, party $j$ must have also committed $B$ . Therefore, we conclude that if an honest party $i$ commits a block $B$ , then every honest party $j$ eventually commits $B$ . $\hfill\blacktriangleleft$

5.3 Main Result

Previous lemmas are sufficient to show that Black Marlin implements atomic broadcast.

Theorem 13.

Black Marlin (Algorithm 2) implements atomic broadcast.

Proof.

We proceed property by property:

Validity.: Assume an honest party $i$ invokes $\textsl{ab-broadcast}(B,i,r)$ after the global stabilization time GST. According to the delivery function (L14), party $i$ outputs $\textsl{ab-deliver}(B,i,r)$ the first time it observes a block $B^{\prime}$ such that $B\in\textsl{past}(B^{\prime})$ and $B^{\prime}$ satisfies line L16. Lemma 8 guarantees that $B$ is eventually in the past of every valid block. Lemmas 9 and 11 ensure that $B^{\prime}$ is eventually committed, hence party $i$ outputs $\textsl{ab\text{-}deliver}(B,i,r)$ .
Agreement.: Suppose an honest party $i$ outputs $\textsl{ab-deliver}(B,j,r)$ . By Algorithm 2, this means a block $B^{*}$ with $B\in\textsl{past}(B^{*})$ satisfies line L16. For any other honest party $j$ , Lemmas 9 and 11 guarantee that $j$ eventually commits a block $B^{\prime}$ with $B^{*}\in\textsl{past}(B^{\prime})$ , and thus $B\in\textsl{past}(B^{\prime})$ . Therefore, party $j$ eventually $\textsl{ab-delivers}(B,j,r)$ , ensuring agreement.
Integrity.: The delivery function (L16) ensures that honest party $j$ keeps track of all previously delivered blocks. Consequently, $j$ outputs $\textsl{ab-deliver}(B,i,r)$ at most once per party $i$ and round $r$ . Furthermore, if $i$ is honest, $j$ outputs $\textsl{ab-deliver}(B,i,r)$ only if $i$ previously invoked $\textsl{ab\text{-}broadcast}(B,i,r)$ as Byzantine parties cannot impersonate honest parties.
Total order.: By definition of the delivery function, the order of delivery of non-anchor blocks is determined by the order in which the anchor blocks are committed (L16). Thus, it suffices to show that honest parties commit anchors in the same order. Lemma 12, guarantees that every honest party commits the same set of blocks. The delivery function called with block $B$ guarantees that any anchorin the past of $B$ is committed before $B$ , Lemma 6 guarantees that any committed blocks (possibly by different parties) are in the past of each other. Thus, honest parties commit anchors in the same order.

$\hfill\blacktriangleleft$

5.4 Communication and Time Complexity

Previously, we showed that Algorithm 2 is both safe and live in a partial synchronous network. In particular, we demonstrated that blocks are eventually delivered, avoiding any concrete analysis of the communication and time complexity of the algorithm.

5.4.1 Communication complexity

Each round, a party sends at most one message of the form $[\textsc{block},{B,\textit{DAG}\setminus\textsl{history}[j],i,r}]$ . We assume that the round number can be expressed in logarithmic size $\mathcal{O}(\ln(n))$ (note that the round number grows more slowly than the number of blocks). The identifier of the party can also be expressed in $\mathcal{O}(\ln(n))$ number of bits. For the computation of the size of $\textit{DAG}\setminus\textsl{history}[j]$ we assume that the Byzantine parties do not send multiple equivocating blocks, as these blocks include signatures that implicate the party, thus the size of $\textit{DAG}\setminus\textsl{history}[j]\leq\mathcal{O}(n)$ . We conclude that the amount of bits sent per honest party and round is $\mathcal{O}(n(|B|+n))$ , hence packing $n$ transactions of constant size in $B$ we obtain a communication complexity of $\mathcal{O}(n^{2})$ with an amortized complexity of $\mathcal{O}(n)$ , both optimal [1].

5.4.2 Time complexity

Lemma 11 gives an upper bound on the expected number of rounds of communication between anchor blocks are committed. An anchor block is committed on average every $\frac{n^{2}}{(n-t)^{2}}$ rounds in the presence of $t$ corruptions: in the worst case, every $9/4$ round of communication, and every round of communication in the good case. Note that in round $r$ , an anchor from round $r-2$ is committed. Considering latency to be the number of rounds of communication since an anchor block is broadcast until it is delivered we obtain $4.25$ and $3$ rounds of communication in the average and good case, respectively. It is important to remark that in contrast to most DAG-based protocols, we are measuring in rounds of communication and not instances of reliable broadcast. Thus, Black Marlin not only achieves $\mathcal{O}(1)$ time complexity in expectation, but also the best concrete time complexity concerning DAG-based protocols [24, 25, 5, 15, 6], as their use of reliable broadcast hinders the protocols.

6 Benchmarking

We evaluate the performance of our implementation with respect to the metrics of throughput and latency for a varying number of parties and network delays.

6.1 Experiments

We conducted all benchmarks locally on a MacBook Pro equipped with an M1 Pro chip, 16 GB of RAM, 10 cores (8 performance and 2 efficiency), and running macOS Sequoia Version 15.0.1. To ensure a fair comparison, we modified the benchmarking framework developed for Bullshark [26, 21]²²2Bullshark has been chosen over more recent frameworks such as Mysticeti [18] due to its closer similarity to Black Marlin and the timing of this project. to incorporate our implementation of Black Marlin. The biggest difference between Black Marlin and Bullshark is the use of common coins by Bullshark. However, their implementation also uses round-robin³³3https://github.com/facebookresearch/narwhal/blob/bullshark/consensus/src/lib.rs L202.. This approach allows a close performance comparison between Black Marlin and Bullshark and also demonstrates that Bullshark’s preprocessing techniques can be adapted to Black Marlin. Note that the actual performance of Bullshark is worse due to the lack of common coins in the implementation.

To emulate realistic network conditions despite running benchmarks on a single machine, we modified both the Black Marlin and Bullshark implementations to introduce configurable network delays. Specifically, messages are delivered based on a Poisson distribution with the expected value $\Delta/2$ , where $\Delta$ represents the maximum network delay assumed by the protocol. This design ensures that with high probability, the actual message delay remains below $\Delta$ . For instance, when $\Delta=400$ ms, the average delivery time for a message is 200 ms.

For each experimental run, we measure throughput as the total number of transactions delivered divided by the total duration of the run. The latency of a transaction tx is defined as the time elapsed from the creation of the block containing tx until it is ab-delivered. The latency for a run is the average latency of all ab-delivered transactions in the run. Every party behaves honestly during the benchmarking.

Each data point in Figure 3 represents the average throughput or latency over 10 independent runs, each lasting 60 seconds. Error bars indicate two standard deviations. We report results across varying numbers of parties (4, 10, and 13) and network delays ( $\Delta=200$ , 400, 600, 800, and 1000 ms). All other parameters remained constant throughout the experiments: one worker per party, a transaction size of 512 bytes, and a transaction injection rate of 50,000 transactions per second.

6.2 Results

The results of the experiments described above are summarized in Figure 3, Black Marlin is represented by blue tones and Bullshark by red ones. The two figures of merit that we consider are throughput and latency.

Black Marlin consistently achieves lower latency than Bullshark across all configurations. The latency improvement ranges from a factor of 2 to 3, depending on the network delay. For example, with 10 parties and a network delay of $\Delta=200$ ms, Black Marlin achieves a latency of $696\pm 5$ ms, compared to $1800\pm 54$ ms for Bullshark. At $\Delta=1000$ ms, Black Marlin’s latency is $3226\pm 55$ ms, while Bullshark’s latency increases to $7773\pm 360$ ms. This improvement is expected: Black Marlin eliminates the need for reliable broadcast, reducing latency by approximately a factor of 2. Additionally, its more frequent anchor selection contributes to a constant-factor latency reduction, further widening the latency gap.

Black Marlin also achieves consistently higher throughput than Bullshark across all network delay scenarios. In contrast to the more pronounced latency gains, the throughput improvement is by a constant margin. This behavior can be explained by the way Bullshark constructs blocks: a party running Bullshark accumulates batches of transactions from workers until it is ready to broadcast a new block. As network delay increases, blocks become larger, which helps mitigate throughput loss. Therefore, the $\sim 2\times$ latency improvement observed with Black Marlin does not directly translate into an equivalent gain in throughput – unless the number of workers or the network delay is significantly increased. However, these adjustments were beyond the scope of this study. We were unable to increase the number of workers due to hardware limitations, and we excluded network delays greater than 1 second, as such conditions are considered unrealistic. This block size phenomenon also explains the erratic throughput results observed for Bullshark under configurations with 4 parties and low network delays. In these cases, small block sizes limited Bullshark’s throughput.

Figure 3 also demonstrates that Black Marlin’s garbage collection mechanism is robust to overestimations of the actual network delay. Specifically, a constant-factor overestimation leads to only a constant increase in the number of rounds that parties must retain in memory. This implies that even for arbitrarily long executions, Black Marlin only requires $\mathcal{O}(n)$ memory, provided the network delay is overestimated by a constant factor.

Refer to caption — (a) Latency of Black Marlin and Bullshark.

Overall, Black Marlin outperforms Bullshark in our implementation, which uses the same codebase and experimental conditions for both protocols. Unfortunately, source code for more recent DAG-based protocols is not publicly available, preventing a direct empirical comparison. However, as we demonstrate in Section 5.4, Black Marlin also surpasses these protocols at a theoretical level in terms of communication and latency complexities while allowing implementations with $\mathcal{O}(n)$ memory.

References

[1] Ittai Abraham, Kartik Nayak, Ling Ren, and Zhuolun Xiang. Good-case latency of byzantine broadcast: a complete categorization. In PODC, pages 331–341. ACM, 2021. doi:10.1145/3465084.3467899.
[2] Ignacio Amores-Sesar and Christian Cachin. We will DAG you. In ESORICS Workshops (1), volume 15263 of Lecture Notes in Computer Science, pages 276–291. Springer, 2024. doi:10.1007/978-3-031-82349-7_19.
[3] Ignacio Amores-Sesar, Christian Cachin, and Philipp Schneider. An analysis of avalanche consensus. In SIROCCO, volume 14662 of Lecture Notes in Computer Science, pages 27–44. Springer, 2024. doi:10.1007/978-3-031-60603-8_2.
[4] Ignacio Amores-Sesar, Christian Cachin, and Enrico Tedeschi. When is spring coming? A security analysis of avalanche consensus. In OPODIS, volume 253 of LIPIcs, pages 10:1–10:22. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.OPODIS.2022.10.
[5] Balaji Arun, Zekun Li, Florian Suri-Payer, Sourav Das, and Alexander Spiegelman. Shoal++: High throughput DAG BFT can be fast and robust! In NSDI, pages 813–826. USENIX Association, 2025. URL: https://www.usenix.org/conference/nsdi25/presentation/arun.
[6] Kushal Babel, Andrey Chursin, George Danezis, Anastasios Kichidis, Lefteris Kokoris-Kogias, Arun Koshy, Alberto Sonnino, and Mingwei Tian. Mysticeti: Reaching the latency limits with uncertified dags. In NDSS. The Internet Society, 2025. URL: https://www.ndss-symposium.org/ndss-paper/mysticeti-reaching-the-latency-limits-with-uncertified-dags/.
[7] Gabriel Bracha and Sam Toueg. Asynchronous consensus and broadcast protocols. J. ACM, 32(4):824–840, 1985. doi:10.1145/4221.214134.
[8] Christian Cachin, Rachid Guerraoui, and Luís E. T. Rodrigues. Introduction to Reliable and Secure Distributed Programming (2. ed.). Springer, 2011. doi:10.1007/978-3-642-15260-3.
[9] Ran Canetti and Tal Rabin. Fast asynchronous byzantine agreement with optimal resilience. In STOC, pages 42–51. ACM, 1993. doi:10.1145/167088.167105.
[10] Miguel Castro and Barbara Liskov. Practical byzantine fault tolerance. In OSDI, pages 173–186. USENIX Association, 1999. URL: https://dl.acm.org/citation.cfm?id=296824.
[11] Andrew Cullen, Pietro Ferraro, Christopher K. King, and Robert Shorten. On the resilience of dag-based distributed ledgers in iot applications. IEEE Internet Things J., 7(8):7112–7122, 2020. doi:10.1109/JIOT.2020.2983401.
[12] George Danezis, Lefteris Kokoris-Kogias, Alberto Sonnino, and Alexander Spiegelman. Narwhal and tusk: a dag-based mempool and efficient BFT consensus. In EuroSys, pages 34–50. ACM, 2022. doi:10.1145/3492321.3519594.
[13] Cynthia Dwork, Nancy A. Lynch, and Larry J. Stockmeyer. Consensus in the presence of partial synchrony. J. ACM, 35(2):288–323, 1988. doi:10.1145/42282.42283.
[14] Michael J. Fischer, Nancy A. Lynch, and Mike Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374–382, 1985. doi:10.1145/3149.214121.
[15] Neil Giridharan, Florian Suri-Payer, Ittai Abraham, Lorenzo Alvisi, and Natacha Crooks. Autobahn: Seamless high speed BFT. In SOSP, pages 1–23. ACM, 2024. doi:10.1145/3694715.3695942.
[16] Idit Keidar, Eleftherios Kokoris-Kogias, Oded Naor, and Alexander Spiegelman. All you need is DAG. In PODC, pages 165–175. ACM, 2021. doi:10.1145/3465084.3467905.
[17] Idit Keidar, Oded Naor, Ouri Poupko, and Ehud Shapiro. Cordial miners: Fast and efficient consensus for every eventuality. In DISC, volume 281 of LIPIcs, pages 26:1–26:22. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.DISC.2023.26.
[18] Mysten Labs. Github, accessed on April 2025. URL: https://github.com/asonnino/mysticeti/tree/paper.
[19] Leslie Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133–169, 1998. doi:10.1145/279227.279229.
[20] Serguei Popov, Olivia Saa, and Paulo Finardi. Equilibria in the tangle. Comput. Ind. Eng., 136:160–172, 2019. doi:10.1016/J.CIE.2019.07.025.
[21] Facebook Research. Bullshark implementation. Github, accessed on April 2025. https://github.com/facebookresearch/narwhal/tree/bullshark.
[22] Team Rocket, Maofan Yin, Kevin Sekniqi, Robbert van Renesse, and Emin Gün Sirer. Scalable and probabilistic leaderless BFT consensus through metastability. CoRR, abs/1906.08936, 2019. arXiv:1906.08936.
[23] Victor Shoup. Blue fish, red fish, live fish, dead fish. IACR Cryptol. ePrint Arch., page 1235, 2024. URL: https://eprint.iacr.org/2024/1235.
[24] Nibesh Shrestha, Aniket Kate, and Kartik Nayak. Sailfish: Towards improving latency of dag-based BFT. IACR Cryptol. ePrint Arch., page 472, 2024. URL: https://eprint.iacr.org/2024/472.
[25] Alexander Spiegelman, Balaji Arun, Rati Gelashvili, and Zekun Li. Shoal: Improving DAG-BFT latency and robustness. In FC (1), volume 14744 of Lecture Notes in Computer Science, pages 92–109. Springer, 2024. doi:10.1007/978-3-031-78676-1_6.
[26] Alexander Spiegelman, Neil Giridharan, Alberto Sonnino, and Lefteris Kokoris-Kogias. Bullshark: DAG BFT protocols made practical. In CCS, pages 2705–2718. ACM, 2022. doi:10.1145/3548606.3559361.
[27] Maofan Yin, Dahlia Malkhi, Michael K. Reiter, Guy Golan-Gueta, and Ittai Abraham. Hotstuff: BFT consensus with linearity and responsiveness. In PODC, pages 347–356. ACM, 2019. doi:10.1145/3293611.3331591.

[bib.bib1] [1] Ittai Abraham, Kartik Nayak, Ling Ren, and Zhuolun Xiang. Good-case latency of byzantine broadcast: a complete categorization. In PODC, pages 331–341. ACM, 2021. doi:10.1145/3465084.3467899.

[bib.bib2] [2] Ignacio Amores-Sesar and Christian Cachin. We will DAG you. In ESORICS Workshops (1), volume 15263 of Lecture Notes in Computer Science, pages 276–291. Springer, 2024. doi:10.1007/978-3-031-82349-7_19.

[bib.bib3] [3] Ignacio Amores-Sesar, Christian Cachin, and Philipp Schneider. An analysis of avalanche consensus. In SIROCCO, volume 14662 of Lecture Notes in Computer Science, pages 27–44. Springer, 2024. doi:10.1007/978-3-031-60603-8_2.

[bib.bib4] [4] Ignacio Amores-Sesar, Christian Cachin, and Enrico Tedeschi. When is spring coming? A security analysis of avalanche consensus. In OPODIS, volume 253 of LIPIcs, pages 10:1–10:22. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.OPODIS.2022.10.

[bib.bib5] [5] Balaji Arun, Zekun Li, Florian Suri-Payer, Sourav Das, and Alexander Spiegelman. Shoal++: High throughput DAG BFT can be fast and robust! In NSDI, pages 813–826. USENIX Association, 2025. URL: https://www.usenix.org/conference/nsdi25/presentation/arun.

[bib.bib6] [6] Kushal Babel, Andrey Chursin, George Danezis, Anastasios Kichidis, Lefteris Kokoris-Kogias, Arun Koshy, Alberto Sonnino, and Mingwei Tian. Mysticeti: Reaching the latency limits with uncertified dags. In NDSS. The Internet Society, 2025. URL: https://www.ndss-symposium.org/ndss-paper/mysticeti-reaching-the-latency-limits-with-uncertified-dags/.

[bib.bib7] [7] Gabriel Bracha and Sam Toueg. Asynchronous consensus and broadcast protocols. J. ACM, 32(4):824–840, 1985. doi:10.1145/4221.214134.

[bib.bib8] [8] Christian Cachin, Rachid Guerraoui, and Luís E. T. Rodrigues. Introduction to Reliable and Secure Distributed Programming (2. ed.). Springer, 2011. doi:10.1007/978-3-642-15260-3.

[bib.bib9] [9] Ran Canetti and Tal Rabin. Fast asynchronous byzantine agreement with optimal resilience. In STOC, pages 42–51. ACM, 1993. doi:10.1145/167088.167105.

[bib.bib10] [10] Miguel Castro and Barbara Liskov. Practical byzantine fault tolerance. In OSDI, pages 173–186. USENIX Association, 1999. URL: https://dl.acm.org/citation.cfm?id=296824.

[bib.bib11] [11] Andrew Cullen, Pietro Ferraro, Christopher K. King, and Robert Shorten. On the resilience of dag-based distributed ledgers in iot applications. IEEE Internet Things J., 7(8):7112–7122, 2020. doi:10.1109/JIOT.2020.2983401.

[bib.bib12] [12] George Danezis, Lefteris Kokoris-Kogias, Alberto Sonnino, and Alexander Spiegelman. Narwhal and tusk: a dag-based mempool and efficient BFT consensus. In EuroSys, pages 34–50. ACM, 2022. doi:10.1145/3492321.3519594.

[bib.bib13] [13] Cynthia Dwork, Nancy A. Lynch, and Larry J. Stockmeyer. Consensus in the presence of partial synchrony. J. ACM, 35(2):288–323, 1988. doi:10.1145/42282.42283.

[bib.bib14] [14] Michael J. Fischer, Nancy A. Lynch, and Mike Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374–382, 1985. doi:10.1145/3149.214121.

[bib.bib15] [15] Neil Giridharan, Florian Suri-Payer, Ittai Abraham, Lorenzo Alvisi, and Natacha Crooks. Autobahn: Seamless high speed BFT. In SOSP, pages 1–23. ACM, 2024. doi:10.1145/3694715.3695942.

[bib.bib16] [16] Idit Keidar, Eleftherios Kokoris-Kogias, Oded Naor, and Alexander Spiegelman. All you need is DAG. In PODC, pages 165–175. ACM, 2021. doi:10.1145/3465084.3467905.

[bib.bib17] [17] Idit Keidar, Oded Naor, Ouri Poupko, and Ehud Shapiro. Cordial miners: Fast and efficient consensus for every eventuality. In DISC, volume 281 of LIPIcs, pages 26:1–26:22. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.DISC.2023.26.

[bib.bib18] [18] Mysten Labs. Github, accessed on April 2025. URL: https://github.com/asonnino/mysticeti/tree/paper.

[bib.bib19] [19] Leslie Lamport. The part-time parliament. ACM Trans. Comput. Syst., 16(2):133–169, 1998. doi:10.1145/279227.279229.

[bib.bib20] [20] Serguei Popov, Olivia Saa, and Paulo Finardi. Equilibria in the tangle. Comput. Ind. Eng., 136:160–172, 2019. doi:10.1016/J.CIE.2019.07.025.

[bib.bib21] [21] Facebook Research. Bullshark implementation. Github, accessed on April 2025. https://github.com/facebookresearch/narwhal/tree/bullshark.

[bib.bib22] [22] Team Rocket, Maofan Yin, Kevin Sekniqi, Robbert van Renesse, and Emin Gün Sirer. Scalable and probabilistic leaderless BFT consensus through metastability. CoRR, abs/1906.08936, 2019. arXiv:1906.08936.

[bib.bib23] [23] Victor Shoup. Blue fish, red fish, live fish, dead fish. IACR Cryptol. ePrint Arch., page 1235, 2024. URL: https://eprint.iacr.org/2024/1235.

[bib.bib24] [24] Nibesh Shrestha, Aniket Kate, and Kartik Nayak. Sailfish: Towards improving latency of dag-based BFT. IACR Cryptol. ePrint Arch., page 472, 2024. URL: https://eprint.iacr.org/2024/472.

[bib.bib25] [25] Alexander Spiegelman, Balaji Arun, Rati Gelashvili, and Zekun Li. Shoal: Improving DAG-BFT latency and robustness. In FC (1), volume 14744 of Lecture Notes in Computer Science, pages 92–109. Springer, 2024. doi:10.1007/978-3-031-78676-1_6.

[bib.bib26] [26] Alexander Spiegelman, Neil Giridharan, Alberto Sonnino, and Lefteris Kokoris-Kogias. Bullshark: DAG BFT protocols made practical. In CCS, pages 2705–2718. ACM, 2022. doi:10.1145/3548606.3559361.

[bib.bib27] [27] Maofan Yin, Dahlia Malkhi, Michael K. Reiter, Guy Golan-Gueta, and Ittai Abraham. Hotstuff: BFT consensus with linearity and responsiveness. In PODC, pages 347–356. ACM, 2019. doi:10.1145/3293611.3331591.

DAG It Off: Latency Prefers No Common Coins

Abstract

Keywords and phrases:

Copyright and License:

2012 ACM Subject Classification:

Funding:

Related Version:

DOI:

Event:

Editor:

Series and Publisher:

1 Introduction

1.1 Structure

2 Related work

3 Model

3.1 Notation

3.2 Adversary and Network

3.3 Atomic broadcast

Definition 1 (Atomic broadcast).

4 Protocol

4.1 Validity predicates

4.2 Initialization

4.3 Round structure

4.4 Delivery

4.5 Functions

5 Analysis

5.1 Safety

Lemma 2.

Proof.

Lemma 3.

Proof.

Lemma 4.

Proof.

Lemma 5.

Proof.

Lemma 6.

Proof.

5.2 Liveness

Lemma 7.

Proof.

Lemma 8.

Proof.

Lemma 9.

Proof.

Lemma 10.

Proof.

Lemma 11.

Proof.

Lemma 12.

Proof.

5.3 Main Result

Theorem 13.

Proof.

5.4 Communication and Time Complexity

5.4.1 Communication complexity

5.4.2 Time complexity

6 Benchmarking

6.1 Experiments

6.2 Results

References