Distributed Download from an External Data Source in Asynchronous Faulty Settings

Augustine, John; Chatterjee, Soumyottam; King, Valerie; Kumar, Manish; Meir, Shachar; Peleg, David

doi:10.4230/LIPIcs.OPODIS.2025.18


John Augustine
Indian Institute of Technology Madras, Chennai, India

Soumyottam Chatterjee
CISPA – Helmholtz Center for Information Security, Saarbrücken, Germany

Valerie King
University of Victoria, Canada

Manish Kumar
Indian Institute of Technology Madras, Chennai, India

Shachar Meir
Weizmann Institute of Science, Rehovot, Israel

David Peleg
Weizmann Institute of Science, Rehovot, Israel

Abstract

The distributed Data Retrieval (DR) model consists of $k$ peers connected by a complete peer-to-peer communication network, and a trusted external data source that stores an array X of $n$ bits ( $n\gg k$ ). Up to $\beta k$ of the peers might fail in any execution (for $\beta\in[0,1)$ ). Peers can obtain the information either by inexpensive messages passed among themselves or through expensive queries to the source array X. In the DR model, we focus on designing protocols that minimize the number of queries performed by any nonfaulty peer (a measure referred to as the query complexity) while maximizing the resiliency parameter $\beta$ .

The Download problem requires each nonfaulty peer to correctly learn the entire array X. Earlier work on this problem focused on synchronous communication networks and established several deterministic and randomized upper and lower bounds. Our work is the first to extend the study of distributed data retrieval to asynchronous communication networks. We address the Download problem under both the Byzantine and crash failure models. We present query-optimal deterministic solutions in an asynchronous model that can tolerate any fixed fraction $\beta<1$ of crash faults. In the Byzantine failure model, it is known that deterministic protocols incur a query complexity of $\Omega(n)$ per peer, even under synchrony. We extend this lower bound to randomized protocols in the asynchronous model for $\beta\geq 1/2$ , and further show that for $\beta<1/2$ , a randomized protocol exists with near-optimal query complexity.

Keywords and phrases:

Byzantine Fault Tolerance, Blockchain Oracle, Data Retrieval Model, Distributed Download, asynchrony

Funding:

John Augustine: Supported by the Centre for Cybersecurity, Trust and Reliability (CyStar) Centre, IIT Madras.

Soumyottam Chatterjee: Work done while at IIT Madras supported by CyStar.

Manish Kumar: Supported by CyStar, IIT Madras.

David Peleg: Venky Harinarayanan and Anand Rajaraman Visiting Chair Professor. The funds from this professorship enabled exchange visits between IIT Madras, India, and the Weizmann Institute of Science, Israel.

Copyright and License:

© John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, and
David Peleg; licensed under Creative Commons License CC-BY 4.0

2012 ACM Subject Classification:

Theory of computation $\rightarrow$ Distributed algorithms

Editors:

Andrei Arusoaie, Emanuel Onica, Michael Spear, and Sara Tucci-Piergiovanni

Series and Publisher:

Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik

1 Introduction

1.1 Background and motivation

The Data Retrieval Model (DR) was first introduced in [3] to abstract the fundamental process of a group learning from a reliable external data source, where the data source is too large or it is too expensive to be learned individually (i.e., requires members of the group to collaborate), and some members of the group might crash during execution or act in other ways to deliberately sabotage the learning process. One key example of systems where this process takes place is blockchain oracles [12, 16]. We address this Oracle data delivery process in detail later and present a method for improving its performance using the DR model and the protocols designed in this work.

The DR model contains two entities: (i) a peer-to-peer network and (ii) an external data source in the form of an $n$-bit array X. There are $k$ peers, up to a $\beta$ fraction of which may be faulty (and at least a $\gamma=1-\beta$ fraction of which are nonfaulty). Each peer has access to the content of the array through queries. The general class of retrieval problems consists of problems ${\textsf{Retrieve}}(f)$ requiring every peer to output $f(\textbf{X})$ for some computable function $f$ of the input array X. In this work, we focus on the most fundamental retrieval problem, ${\textsf{Retrieve}}(f_{id})$ where $f_{id}(\textbf{X})=\textbf{X}$ , referred to hereafter as the Download problem, where every peer needs to learn the entire input X. (It is fundamental since every retrieval problem ${\textsf{Retrieve}}(f)$ can be solved by first performing Download and then locally computing $f(\textbf{X})$ .)

In the absence of failures, the problem can be easily solved in a query-balanced manner. Even with failures, the problem can be trivially solved at the cost of a large number of queries, as the nonfaulty peers can directly query all the bits. This solution is prohibitively expensive; thus, we focus on minimizing the number of queries made by each nonfaulty peer. For synchronous systems with Byzantine faults, a lower bound of $\Omega(\beta n)$ on the query complexity of deterministic Download is shown in [3] for every $\beta<1$ , followed by a matching upper bound when $\beta<1/2$ . This implies that in the presence of Byzantine faults, one cannot attain the ideal query complexity of $\frac{n}{\gamma k}$ without using randomization.

In this work, we consider Download protocols in the asynchronous setting for both the crash and Byzantine fault models. In the asynchronous Byzantine fault setting, we prove that, unlike the synchronous setting, where randomization can overcome the deterministic lower bound, $\Omega(n)$ queries per peer are required when $\beta\geq 1/2$ , even for randomized protocols. We complement this lower bound with a protocol for the $\beta<1/2$ regime that achieves a query complexity of $\tilde{O}\left(\frac{n}{(\gamma-\beta)k}\right)$ , which, for a constant $\gamma-\beta$ , is within log factors of the generic lower bound of $\Omega(n/\gamma k)$ .

Turning to the more benign setting of crash faults (i.e., where all peers are honest but some $\beta$ fraction may stop functioning), the picture is brighter. For this model, it turns out that even in the asynchronous setting, one can get efficient deterministic Download protocols that achieve the optimal query complexity of $O\left(\frac{n}{\gamma k}\right)$ , for any fraction $\beta<1$ of crashes.

1.2 The Model

In the Data Retrieval (DR) model, the system consists of two components. The first is a collection of $k$ peers, each equipped with a unique ID from the range $[1,k]$ , connected by a complete communication network (or clique). The network provides peer-to-peer message passing, namely, every peer can send at time $t$ a (possibly different) message of size at most $\phi$ bits to each other peer.

The second component of the DR model is an external data source. The source stores an $n$ -bit input array $\textbf{X}=\{b_{1},\ldots,b_{n}\}$ . It provides the peers with read-only access, allowing each peer to retrieve the data through queries of the form $\mbox{\tt Query}(i)$ , for $1\leq i\leq n$ . The answer returned by the source would then be $b_{i}$ , the $i^{th}$ element in the array. This type of communication is referred to as source-to-peer communication.
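As an illustration of the source-to-peer interface described above, here is a minimal Python sketch. The class name `Source`, the `peer_id` argument, and the per-peer counter are assumptions made for this example only; the model itself specifies just the read-only operation $\mbox{\tt Query}(i)$.

```python
class Source:
    """Hypothetical read-only data source holding an n-bit array X."""

    def __init__(self, bits):
        self._bits = list(bits)   # b_1, ..., b_n stored at positions 0..n-1
        self.query_count = {}     # peer id -> number of queries issued so far

    def query(self, peer_id, i):
        """Return b_i for 1 <= i <= n and charge one query to peer_id."""
        if not 1 <= i <= len(self._bits):
            raise IndexError("query index out of range")
        self.query_count[peer_id] = self.query_count.get(peer_id, 0) + 1
        return self._bits[i - 1]
```

Counting queries per peer mirrors the query complexity measure, which is taken as a maximum over nonfaulty peers rather than as a total.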

We consider asynchronous communication, where any communication (both peer-to-peer and source-to-peer) can be delayed by any finite amount of time. For randomized protocols, we use the following notion of cycles.

Cycles.

In the asynchronous model, there is no global notion of rounds, as each peer operates at a different pace. Nevertheless, to describe our protocols and analyze their performance, it is convenient to divide the local execution of each peer $\mu$ into (varying time) cycles. Each such local cycle consists of the following stages.

$\blacksquare$ Sending (0 or more) queries and getting answers.
$\blacksquare$ Sending (0 or more) messages.
$\blacksquare$ Waiting to receive messages.

We assume that local computation takes 0 time and can be performed at any point in a cycle. Moreover, when waiting for messages, after every message is received, the peer can adaptively decide whether to keep waiting for an additional message or continue to the next cycle. Note that the local cycle $r$ of peer $\mu$ might coincide with a different local cycle $r^{\prime}$ of another peer $\mu^{\prime}$ .

In the absence of global time units, it is convenient to break the time axis into “virtual blocks” by defining $t^{r}$ , for integer $r\geq 1$ , as the first time any peer started its local cycle $r$ .

Every message is of size at most $\phi$ bits, where $\phi$ is a system parameter. Note that throughout the paper, we either set $\phi$ to a specific value or leave it as a parameter, in which case increasing the message size parameter $\phi$ would result in faster protocols.

The adversary.

Our analysis uses the notion of an adversary, representing the adverse conditions in which the system operates, including the asynchronous communication and the possibility of failures.

The adversary has two types of operations. First, it can fail up to $\beta k$ peers, under the restriction that it can only fail a peer between its cycles (or before the first cycle), meaning that a peer can make random decisions in its current cycle without the adversary being able to react until the end of the cycle. Second, it can set the time $t_{\mu,\mu^{\prime}}^{r}$ it takes a message sent by peer $\mu$ in its local cycle $r$ to reach peer $\mu^{\prime}$ , under the restriction that it must set the time $t_{\mu,\mu^{\prime}}^{r}$ for every pair of peers $\mu,\mu^{\prime}$ , before time $t^{r}$ . In other words, the adversary must set the latency of each message sent during a cycle $r$ before any peer starts cycle $r$ . The adversary can also decide when every peer starts its execution (i.e., we do not assume a simultaneous start). Note that in the case of deterministic protocols, the notion of cycles is irrelevant, and we consider a standard adversary that can fail a peer at any point of the execution and can delay messages for any finite amount of time.

The adversary $\mathcal{A}$ selects the input data and determines the failure pattern of the peers. In the crash failure model, the adversary’s power is limited to crashing some of the peers in every execution of the protocol. Once a peer crashes, it stops its local execution of the protocol arbitrarily and permanently. This could happen in the middle of operation, e.g., after the peer has already sent some, but perhaps not all, of the messages it was instructed by the protocol to send out at a given point in time. In contrast, in the Byzantine failure model, a failed peer can deviate from the protocol in arbitrary ways. We assume that the adversary can fail at most $\beta k$ peers, for some given $\beta\in[0,1)$ (we do not assume $\beta$ to be a fixed constant unless mentioned otherwise). We let $\gamma=1-\beta$ , so there is (at least) a $\gamma$ fraction of nonfaulty peers in every execution. Denote the set of faulty (respectively, nonfaulty) peers in the execution by $\mathcal{F}$ (resp., $\mathcal{H}$ ).

We assume that the adversary knows the protocol and hence can simulate it (up to random coins).

We concentrate on the following complexity measures.

Query Complexity ( $\mathcal{Q}$ ): the maximum number of bits queried by a nonfaulty peer during the execution.
Time Complexity ( $\mathcal{T}$ ): the time it takes for the protocol to terminate.
Message Complexity ( $\mathcal{M}$ ): the total number of messages sent by nonfaulty peers during the execution.

We assume that queries to the source are the more expensive component in the system, so we focus mainly on optimizing the query complexity $\mathcal{Q}$ . Measuring the maximum cost per peer (rather than the total cost) gives priority to a balanced load of queries over the nonfaulty peers.

Let us now formally define the Download problem. Consider a DR network with $k$ peers, where at most $\beta k$ can be faulty, and a source that stores a bit array $\textbf{X}=[b_{1},\dots,b_{n}]$ . Each peer is required to learn X. Formally, each nonfaulty peer $\mu$ outputs a bit array $res^{\mu}$ , and it is required that, upon termination, $res^{\mu}[i]=b_{i}$ for every $i\in\{1,\cdots,n\}$ and $\mu\in\mathcal{H}$ .

In the absence of failures, this problem can be solved by sharing the task of querying all $n$ bits evenly among the $k$ peers, yielding $\mathcal{Q}=\Theta(n/k)$ . The message complexity is $\mathcal{M}=\tilde{O}(nk)$ , assuming small messages of size $\tilde{O}(1)$ , and the time complexity is $\mathcal{T}=\tilde{O}(n/k)$ since $\Omega(n/k)$ bits need to be sent along each communication link when the workload is shared.
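The failure-free baseline can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the round-robin assignment and the function names `even_assignment` and `failure_free_download` are hypothetical, and message passing is collapsed into a shared dictionary, which is reasonable only because no peer fails.

```python
def even_assignment(n, k):
    """Map each bit index 1..n to a peer id 1..k in round-robin order."""
    return {i: (i - 1) % k + 1 for i in range(1, n + 1)}


def failure_free_download(X, k):
    """Return the reconstructed array and the maximum queries made by any peer."""
    n = len(X)
    sigma = even_assignment(n, k)
    queries = {peer: 0 for peer in range(1, k + 1)}
    shared = {}
    for i, peer in sigma.items():
        shared[i] = X[i - 1]      # the assigned peer queries bit i and sends it to all
        queries[peer] += 1
    res = [shared[i] for i in range(1, n + 1)]
    return res, max(queries.values())
```

With $n=7$ and $k=3$, the most loaded peer issues $\lceil 7/3\rceil=3$ queries, matching $\mathcal{Q}=\Theta(n/k)$.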

1.3 Related Work

The vast literature on fault-tolerant distributed computing includes extensive work on both crash faults and Byzantine faults. A foundational result by Fischer, Lynch, and Paterson (FLP) [20] demonstrated that in asynchronous networks, even a single crash fault renders many fundamental problems – such as consensus and reliable broadcast – impossible to solve deterministically. Specifically, they showed that the adversary can indefinitely delay progress, violating the termination property. To circumvent this impossibility, many subsequent works in asynchronous settings have adopted randomized techniques [7, 22] or relaxed the termination requirement [9].

Given the practical relevance of the asynchronous model, it has been widely adopted for studying fault-tolerant protocols under crash faults [19, 24, 17, 6] and Byzantine faults [1, 8, 10, 13, 14, 15, 18, 21, 23, 28]. Fundamental problems such as agreement and reliable broadcast have often served as building blocks in distributed protocol design, typically assuming that all input data is already locally available to the peers.

In contrast, our work addresses the Data Retrieval (DR) model, where each peer must actively fetch data from a trusted external source and disseminate it to the rest of the peers while minimizing the cost associated with querying the source. We focus on the Download problem, which requires every nonfaulty peer to correctly learn the entire data. Unlike classical problems, we show that Download – despite requiring termination – can be solved deterministically in asynchronous networks. This highlights that employing reliable broadcast or agreement as foundational components is not necessary to solve the Download problem. Moreover, this can be done with optimal query complexity for any fraction $\beta<1$ of crash faults. In the more adversarial Byzantine setting, we also design randomized protocols that solve the problem with near-optimal query complexity whenever fewer than half of the peers are Byzantine ( $\beta<1/2$ ).

To the best of our knowledge, this work is the first to study retrieval problems in the Data Retrieval (DR) model under asynchronous communication. The DR model has previously been explored in synchronous networks, most notably in [3, 5]. In particular, [3] introduced the Download problem, motivated by practical applications such as Distributed Oracle Networks (DONs), which form a crucial component of blockchain systems and employ protocols like OCR and DORA [12, 16].

The primary focus of [3] was on minimizing query complexity in the presence of Byzantine faults in synchronous settings. They proved that any deterministic protocol must incur a query complexity of at least $\mathcal{Q}=\Omega(\beta n)$ and matched this with an upper bound when $\beta<1/2$ . On the randomized front, they proposed two protocols. The first tolerates any constant fraction $\beta<1$ of Byzantine faults but has suboptimal query complexity, achieving $\mathcal{Q}=O\left(\frac{n}{\gamma k}+\sqrt{n}\right)$ with high probability. The second protocol improves upon this by achieving near-optimal query complexity $\tilde{O}\left(\frac{n}{\gamma k}\right)$ with high probability, but it can only tolerate up to a $\beta<1/3$ fraction of Byzantine faults. (We use the $\tilde{O}(\cdot)$ notation to hide $\beta$ factors and polylogarithmic terms in $n$ and $k$ .)

The paper [5] builds upon the foundational work in [3] by closing several gaps in the randomized setting. It presents a randomized Download protocol with query complexity $\mathcal{Q}=O\left(\frac{n}{\gamma k}\right)$ , time complexity $\mathcal{T}=O(n\log k)$ , and message complexity $\mathcal{M}=O(nk^{2})$ for any Byzantine-fault fraction $\beta\in[0,1)$ . Additionally, it establishes a lower bound showing that in any single-round randomized protocol, each peer must essentially query the entire input, indicating the inherent limitations of extremely fast protocols.

Moving to two-round protocols, [5] proposes a randomized solution that achieves query complexity $\mathcal{Q}=O\left(\frac{n}{\gamma k}+\sqrt{n}\right)$ with high probability, improving upon the time complexity of [3] under the same fault threshold $\beta<1$ . Notably, this protocol operates under a stronger adversarial model – referred to as Dynamic Byzantine – in which the set of Byzantine peers may change from one round to another. Under this dynamic fault model, the authors further develop a protocol that achieves expected query complexity $\tilde{O}\left(\frac{n}{\gamma k}\right)$ (within logarithmic factors), at the expense of a higher time complexity of $O(\log k)$ .

In contrast to prior work, this paper explores the Download problem in asynchronous networks, under both crash and Byzantine faults. Our results are summarized in the next section; a concise comparison to prior synchronous protocols is provided in Table 1.

Table 1: An overview of our results with prior closely related work.

Prior Work
Synchrony	Query	Fault Model	Resilience	Type	Reference
Synchronous	$\tilde{O}\left(\frac{n}{\gamma k}\right)$	Byzantine	$\beta<\frac{1}{3}$	Randomized	[3]
Synchronous	$\tilde{O}\left(\frac{n}{\gamma k}+\sqrt{n}\right)$	Byzantine	$\beta<1$	Randomized	[3]
Synchronous	$\tilde{O}\left(\frac{n}{\gamma k}\right)$	Byzantine	$\beta<1$	Randomized	[5]
This Paper
Asynchronous	$\Theta\left(\frac{n}{\gamma k}\right)$	Crash	$\beta<1$	Deterministic	Thm 8
Asynchronous	$\Omega(n)$	Byzantine	$\beta\geq 1/2$	Randomized	Thm 10
Asynchronous	$\tilde{O}\left(\frac{n}{(\gamma-\beta)k}\right)$	Byzantine	$\beta<1/2$	Randomized	Thm 12

1.4 Contributions

We study the Download problem in asynchronous communication networks, under both crash and Byzantine failure settings. In the crash-fault setting, our deterministic results are optimal w.r.t. the resilience (for any $\beta<1$ ) and query complexity. Notice that this optimality also holds for randomized algorithms. In the Byzantine failure setting, we provide deterministic and randomized lower bounds as well as upper bounds. The main results are:

(1) Deterministic Download in Crash-Fault: We present a deterministic protocol for solving the Download problem in the asynchronous setting with at most $f<k$ crash faults ( $\gamma=1-f/k$ ) with $\mathcal{Q}=\Theta(\frac{n}{\gamma k})$ , $\mathcal{T}=O\left(\frac{n}{\phi}+\log_{k/f}(\phi)\right)$ and $\mathcal{M}=O(nk^{2})$ , where $\phi$ is the message size. Our result achieves the optimal query complexity for any fraction $\beta<1$ of crash faults.

(2) Deterministic Lower Bound in Byzantine Fault: We show that for $\beta\geq 1/2$ , every deterministic asynchronous Download protocol that is resilient to Byzantine faults requires $\mathcal{Q}=n$ .

(3) Deterministic Download in Byzantine Fault: We show that for $\beta<1/2$ , there exists a deterministic asynchronous protocol that solves Download with $\mathcal{Q}=O(\beta n)$ , $\mathcal{T}=O\left(\frac{\beta n}{\phi}\right)$ and $\mathcal{M}=O(f\cdot n)$ .

(4) Randomized Lower Bound in Byzantine Fault: We show that for any randomized asynchronous Download protocol where $\beta\geq 1/2$ , there does not exist any execution in which every peer queries at most $n/2$ bits.

(5) 2-cycle Randomized Download in Byzantine Fault: We present a 2-cycle asynchronous randomized protocol for Download with $\mathcal{Q}=O\left(\sqrt{\frac{n}{\gamma-\beta}}+\frac{n\log n}{(\gamma-\beta)k}\right)$ and $\mathcal{M}=O(k^{2})$ , where $\beta<1/2$ and the message size is $\phi=O(\frac{n}{\gamma k})$ .

(6) Randomized Download in Byzantine Fault: We present an $O\left(\log\left(\frac{\gamma k}{\ln n}\right)\right)$ -cycle protocol that computes Download whp in the point-to-point model, with expected query complexity $\mathcal{Q}=O\left(\frac{n\log n}{(\gamma-\beta)k}\right)$ and $\mathcal{M}=O\left(\log\left(\frac{\gamma k}{\ln n}\right)k^{2}\right)$ , where $\beta<1/2$ and the message size is $\phi=O(n)$ .

2 Deterministic Download in the Asynchronous Model with Crash Faults

In this section, we present deterministic protocols that solve the Download problem in an asynchronous setting. In the full version of the paper, we show a deterministic protocol that handles a single crash failure. In the current version, we only present the extended solution, which tolerates $f$ crashes for $f>1$ , in Section 2.1.

2.1 Tolerating any Number $f<k$ of Crashes

In this subsection, we present a protocol that can tolerate up to $f$ crashes for any $f<k$ (for a more basic algorithm for the case of $f=1$ see [4]). The main difficulty in achieving tolerance with up to $f$ crashes is that in the presence of asynchrony, one cannot distinguish between a slow peer and a crashed peer, making it difficult to coordinate.

Algorithm 1 executes in phases, each consisting of three stages. Each peer $\mu$ stores the following local variables. (We omit the superscript $\mu$ when it is clear from the context.)

$\blacksquare$ $phase(\mu)$ : $\mu$ ’s current phase.
$\blacksquare$ $stage(\mu)$ : $\mu$ ’s current stage within the phase.
$\blacksquare$ $H_{p}^{\mu}$ : the correct set of $\mu$ for phase $p$ , i.e., the set of peers $\mu$ heard from during phase $p$ .
$\blacksquare$ $\sigma_{p}^{\mu}$ : the assignment function of $\mu$ for phase $p$ , which assigns the responsibility for querying each bit $i$ to some peer $\mu^{\prime}$ .
$\blacksquare$ $res^{\mu}$ : the output array.

In the first stage of phase $p$ , each peer $\mu$ queries bits according to its local assignment $\sigma_{p}$ and sends a ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ 1$ request (asking for bit values according to $\sigma_{p}$ , namely $\{i\mid\sigma_{p}(i)=\mu^{\prime}\}$ ) to every other peer $\mu^{\prime}$ and then continues to stage 2. Upon receiving a ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ 1$ request, $\mu$ waits until it is at least in stage 2 of phase $p$ and returns the requested bit values that it knows.

In stage 2 of phase $p$ , $\mu$ waits until it has heard from at least $k-f$ peers, i.e., until $|H_{p}^{\mu}|\geq k-f$ (waiting for the remaining $f$ peers risks deadlock). Then, it sends a ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ 2$ request containing the set of peers’ IDs $F_{p}^{\mu}=\{1,\dots,k\}\setminus H_{p}^{\mu}$ (namely, all the peers it did not hear from during phase $p$ ) and continues to stage 3. Upon receiving a ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ 2$ request from a peer $\mu^{\prime}$ , $\mu$ waits until it is at least in stage 3 of phase $p$ , and replies to $\mu^{\prime}$ as follows. For every $j\in F_{p}^{\mu^{\prime}}$ , it sends $\mu_{j}$ ’s bits if $j\in H_{p}^{\mu}$ (i.e., if $\mu$ itself heard from $\mu_{j}$ ) and “me neither” otherwise.

In stage 3 of phase $p$ , $\mu$ waits for $k-f$ ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ 2$ responses. Then, for every $j\in F_{p}^{\mu}$ , if it received only “me neither” messages, it reassigns $\mu_{j}$ ’s bits evenly between peers $1,\dots,k$ . Otherwise, it updates $res$ in the appropriate indices. Finally, it continues to stage 1 of phase $p+1$ . Upon receiving a ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ i$ response, $\mu$ updates $res$ in the appropriate index and updates $H_{p}$ for every bit value in the message. The pseudocode is provided in Algorithm 1.
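The stage 3 reassignment step can be illustrated with a small Python sketch. This is not the pseudocode of Algorithm 1: the function name `reassign`, the dictionary representation of $\sigma_{p}$, and the round-robin rule over sorted bit indices are all assumptions chosen for the example. The only property it aims to show is that the rule is deterministic, so two peers that start from the same assignment for $\mu_{j}$ ’s bits and both fail to hear from $\mu_{j}$ end up with the same new assignment.

```python
def reassign(sigma, missing_peer, k):
    """Spread the bits currently assigned to missing_peer evenly over peers 1..k.

    sigma maps bit index -> peer id. A new dictionary is returned; all
    other assignments are left unchanged.
    """
    new_sigma = dict(sigma)
    orphaned = sorted(i for i, owner in sigma.items() if owner == missing_peer)
    for pos, i in enumerate(orphaned):
        # Deterministic round-robin over all k peers, including missing_peer,
        # since a slow peer cannot be told apart from a crashed one.
        new_sigma[i] = pos % k + 1
    return new_sigma
```

For example, with `sigma = {1: 1, 2: 2, 3: 3, 4: 1, 5: 2, 6: 3}` and `missing_peer = 2`, bits 2 and 5 are reassigned to peers 1 and 2 respectively.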

Before diving into the analysis, we overview the following intuitive flow of the protocol’s execution. At the beginning of phase 1, the assignment function $\sigma_{1}$ is the same for every peer. Every peer is assigned $n/k$ bits, which it queries and sends to every other peer. Every peer $\mu$ hears from at least $k-f$ peers, meaning that it has at most $f\cdot n/k$ unknown bits after phase 1. In the following phases, every peer $\mu$ reassigns its unknown bits uniformly among all the peers, such that the bits assigned to every peer $\mu^{\prime}$ are either known to it from a previous phase or $\mu^{\prime}$ is about to query them in the current phase (i.e., $\mu^{\prime}$ assigned itself the same bits). Hence, after every phase, the number of unknown bits diminishes by a factor of $f/k$ . After sufficiently many phases, the number of unknown bits will be small enough to be directly queried by every peer.
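The shrinking of the unknown set can be made concrete with a short Python sketch. It performs only the worst-case arithmetic from the paragraph above (each phase, the bits of at most $f$ unheard peers stay unknown); it does not simulate messages, crashes, or asynchrony. The function name `phase_schedule` and the stopping rule (stop once each peer's share is at most one bit) are assumptions made for illustration.

```python
def phase_schedule(n, k, f):
    """Worst-case unknown-bit counts and per-peer query loads, phase by phase.

    Returns (rows, remaining), where each row is
    (phase index, unknown bits at phase start, bits queried per peer),
    and remaining is the number of unknown bits left when the loop stops.
    """
    if not 0 < f < k:
        raise ValueError("requires 0 < f < k")
    unknown = float(n)
    rows = []
    phase = 0
    while unknown / k > 1:
        per_peer = unknown / k          # unknown bits are split evenly over k peers
        rows.append((phase, unknown, per_peer))
        unknown = per_peer * f          # shares of at most f unheard peers stay unknown
        phase += 1
    return rows, unknown
```

For $n=1024$, $k=8$, $f=4$, the sketch runs $7=\log_{k/f}(n/k)$ phases and the per-peer loads sum to $254$, just below $\frac{n}{\gamma k}=256$.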

Protocol 1 Async Download version 2 for peer $\mu$ .

We start the analysis by showing some properties of the relations between local variables.

Observation 1.

For every nonfaulty peer $\mu$ , if $H_{p}^{\mu}=\{1,\dots,k\}$ for some phase $p\geq 0$ , then $res^{\mu}=\textbf{X}$ .

Proof.

Let $p\geq 0$ be such that $H_{p}=\{1,\dots,k\}$ , and consider $1\leq i\leq n$ . There exists some $1\leq j\leq k$ such that $\sigma_{p}(i)=\mu_{j}$ . Since $j\in H_{p}^{\mu}$ , $\mu$ has heard from $\mu_{j}$ , so $res^{\mu}[i]\neq\bot$ , and overall $res^{\mu}=\textbf{X}$ . $\hfill\blacktriangleleft$

Denote by $\sigma^{\mu}_{p}$ the local value of $\sigma_{p}$ for peer $\mu$ at the beginning of phase $p$ . Denote by $res^{\mu}_{p}[i]$ the local value of $res[i]$ for peer $\mu$ after stage 1 of phase $p$ .

$\vartriangleright$ Claim 2.

For every phase $p$ , two nonfaulty peers $\mu,\mu^{\prime}$ , and bit $i$ , one of the following holds.

$(1_{p})$ $\sigma^{\mu}_{p}(i)=\sigma^{\mu^{\prime}}_{p}(i)$ , i.e., both $\mu$ and $\mu^{\prime}$ assign the task of querying $i$ to the same peer, or

$(2_{p})$ $res^{\mu}_{p}[i]\neq\bot$ or $res^{\mu^{\prime}}_{p}[i]\neq\bot$ .

Proof.

By induction on $p$ . For the basis, $p=0$ , the claim is trivially true because of the initialization values (specifically, property ( $1_{0}$ ) holds).

For $p\geq 1$ . By the induction hypothesis, either $(1_{p-1})$ or $(2_{p-1})$ holds. Suppose first that $(2_{p-1})$ holds, i.e., $res^{\mu}_{p-1}[i]\neq\bot$ or $res^{\mu^{\prime}}_{p-1}[i]\neq\bot$ . Without loss of generality, assume that $res^{\mu}_{p-1}[i]\neq\bot$ . Then, since values are never overwritten, $res^{\mu}_{p}[i]\neq\bot$ , so $(2_{p})$ holds as well.

Now suppose that $(1_{p-1})$ holds, i.e., $\sigma^{\mu}_{p-1}(i)=\sigma^{\mu^{\prime}}_{p-1}(i)$ . Let $j$ be an index such that $\sigma^{\mu}_{p-1}(i)=\mu_{j}$ . If neither $\mu$ nor $\mu^{\prime}$ heard from $\mu_{j}$ during phase $p-1$ , then both peers will assign the same peer to $i$ in stage 3 of phase $p-1$ (see Line 22), so $(1_{p})$ holds. If one of the peers heard from $\mu_{j}$ , w.l.o.g. assume $\mu$ did, then $res^{\mu}_{p}[i]\neq\bot$ . Hence, $(2_{p})$ holds. $\hfill\vartriangleleft$

Claim 2 yields the following corollary.

Corollary 3.

Every phase $p$ stage $1$ request received by a nonfaulty peer is answered with the correct bit values.

Next, we show that the protocol never deadlocks, i.e., whenever a nonfaulty peer waits in stages 2 and 3 (see Lines 14 and 18), it will eventually continue.

$\vartriangleright$ Claim 4.

If one nonfaulty peer has terminated, then every nonfaulty peer will eventually terminate.

Proof.

Let $\mu$ be a nonfaulty peer that has terminated. Prior to terminating, $\mu$ queried all the remaining unknown bits and sent all the bits to every other peer. Since $\mu$ is nonfaulty every other nonfaulty peer $\mu^{\prime}$ will eventually receive the message sent by $\mu$ and will set $H_{p}^{\mu^{\prime}}=\{1,\dots,k\}$ , resulting in $res^{\mu^{\prime}}=\textbf{X}$ by Observation 1. Subsequently, $\mu^{\prime}$ will terminate as well. $\hfill\vartriangleleft$

$\vartriangleright$ Claim 5.

While no nonfaulty peer has terminated, a nonfaulty peer will not wait infinitely for $k-f$ responses.

Proof.

Let $\mu$ be the least advanced nonfaulty peer, i.e., for every nonfaulty peer $\mu^{\prime}$ either $phase(\mu)<phase(\mu^{\prime})$ , or $phase(\mu)=phase(\mu^{\prime})$ with $stage(\mu)\leq stage(\mu^{\prime})$ . Note that there are at least $k-f$ nonfaulty peers. As none of these peers terminate before receiving a request from $\mu$ (premise of the claim), they will send back a response. Hence, $\mu$ will receive responses from at least $k-f$ different peers, and will not wait infinitely. $\hfill\vartriangleleft$

The combination of Claims 5 and 4 implies that eventually, every nonfaulty peer satisfies the termination condition (see Line 41) and subsequently terminates correctly (since it queries all unknown bits beforehand). That is because by Claim 5 some nonfaulty peer $\mu$ will get to phase $\log_{k/f}(n/k)$ , or set $H_{p}^{\mu}=\{1,\dots,k\}$ prior to that, and terminate, which will lead to the termination of every nonfaulty peer by Claim 4.

$\vartriangleright$ Claim 6.

At the start of phase $p\geq 0$ , every nonfaulty peer has at most $n\cdot\left(\frac{f}{k}\right)^{p}$ unknown bits.

Proof.

By induction on $p$ . Consider nonfaulty peer $\mu$ . For the base step $p=0$ the claim holds trivially by the initialization values.

Now consider $p\geq 1$ . By the induction hypothesis on $p-1$ , $\mu$ has at most $\hat{n}=n\cdot\left(\frac{f}{k}\right)^{p-1}$ unknown bits at the start of phase $p-1$ . Since unknown bits are assigned evenly in stage $3$ (see Line 22), each peer is assigned $\hat{n}/k$ unknown bits (to be queried during phase $p-1$ ). During stage 2 of phase $p-1$ , $\mu$ waits until $|H_{p-1}^{\mu}|\geq k-f$ , meaning that $\mu$ did not receive the assigned bits from at most $f$ peers. Hence, at most $\hat{n}/k\cdot f=n\cdot\left(\frac{f}{k}\right)^{p}$ bits are unknown after stage 2 of phase $p-1$ , and therefore at the start of phase $p$ . The claim follows. $\hfill\vartriangleleft$

From the above discussion, we have the following lemma.

Lemma 7.

Algorithm 1 solves Download in the asynchronous setting with at most $f$ crash faults after $\log_{k/f}(n/k)$ phases with $\mathcal{Q}=O(\frac{n}{\gamma k})$ and $\mathcal{T}=O\left((\frac{\beta}{\gamma}+1)\cdot\frac{n}{\phi}+\log_{\frac{k}{f}}(\phi)\right)$ .

Proof.

By Claim 6 and since unknown bits are distributed evenly among the peers $\{1,\dots,k\}$ , every nonfaulty peer queries at most $\frac{n}{k}\cdot\left(\frac{f}{k}\right)^{p}$ bits in phase $p$ , for $0\leq p\leq\log_{k/f}(n/k)$ , and at most $\frac{n}{k}\cdot\left(\frac{f}{k}\right)^{\log_{k/f}(n/k)}=1$ additional bits when terminating (by Observation 1). Hence, the worst case query complexity (per peer) is bounded by

$$\mathcal{Q}\ \leq\ 1+\sum_{p=0}^{\log_{k/f}(n/k)}\frac{n}{k}\cdot\left(\frac{f}{k}\right)^{p}\ =\ O\left(\frac{n}{\gamma k}\right).$$
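For completeness, the geometric-series step behind this bound can be spelled out. The sketch below bounds the finite sum by the infinite series starting at $p=0$ and uses only $\gamma=1-f/k$ :

```latex
\sum_{p\geq 0}\frac{n}{k}\cdot\left(\frac{f}{k}\right)^{p}
  \;=\; \frac{n}{k}\cdot\frac{1}{1-f/k}
  \;=\; \frac{n}{k-f}
  \;=\; \frac{n}{\gamma k},
\qquad\text{since } \gamma k = k-f .
```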

We next turn to time analysis. Consider a peer $\mu$ . For every phase $p$ , after $\lceil\frac{n}{k}\cdot(\frac{f}{k})^{p}/\phi\rceil$ time, every ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ 1$ response by a nonfaulty peer is heard by $\mu$ (even slow ones), and stage 2 starts. After that, it takes at most $\lceil n\cdot(\frac{f}{k})^{p+1}/\phi\rceil$ time units for every ${\mathtt{phase}}\ p\ {\mathtt{stage}}\ 2$ response to be heard by $\mu$ , allowing it to move to stage 3. Hence, it takes at most $\lceil n/k\cdot(\frac{f}{k})^{p}/\phi\rceil+\lceil n\cdot(\frac{f}{k})^{p+1}/\phi\rceil$ time for phase $p$ to finish once $\mu$ started it. Finally, upon termination, $\mu$ sends $res^{\mu}$ , which takes $n/\phi$ time. Let $p^{\prime}$ be the phase such that $(n/k)\cdot(\frac{f}{k})^{p^{\prime}}/\phi=1$ and $p^{\prime\prime}$ be the phase such that $(nf/k)\cdot(\frac{f}{k})^{p^{\prime\prime}}/\phi=1$ . Hence, $p^{\prime}=\log_{k/f}\left(\frac{n}{k\phi}\right)$ and $p^{\prime\prime}=\log_{k/f}\left(\frac{nf}{k\phi}\right)$ . Overall the time complexity is

$$
\begin{aligned}
\mathcal{T} &\leq \frac{n}{\phi}+\sum_{p=0}^{\log_{k/f}(n/k)}\left(\left\lceil\frac{n}{k}\cdot\left(\frac{f}{k}\right)^{p}\middle/\phi\right\rceil+\left\lceil n\cdot\left(\frac{f}{k}\right)^{p+1}\middle/\phi\right\rceil\right)\\
&= \frac{n}{\phi}+\sum_{p=0}^{p^{\prime}}\frac{n}{\phi k}\cdot\left(\frac{f}{k}\right)^{p}+\sum_{p=p^{\prime}+1}^{\log_{k/f}(n/k)}1+\sum_{p=0}^{p^{\prime\prime}}\frac{n}{\phi}\cdot\left(\frac{f}{k}\right)^{p+1}+\sum_{p=p^{\prime\prime}+1}^{\log_{k/f}(n/k)}1\\
&\leq \frac{n}{\phi}+\frac{n}{\gamma k\phi}+\log_{k/f}(\phi)+\frac{nf}{\gamma k\phi}+\log_{k/f}(\phi/f)\\
&= \frac{n}{\phi}+O\left(\frac{n\cdot f}{\gamma k\phi}+\log_{\frac{k}{f}}(\phi)\right)=O\left(\left(\frac{\beta}{\gamma}+1\right)\cdot\frac{n}{\phi}+\log_{\frac{k}{f}}(\phi)\right).
\end{aligned}
$$

The lemma follows. $\hfill\blacktriangleleft$
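As a sanity check on the query bound in the proof above, the following minimal sketch sums the per-phase bound from Claim 6 exactly and compares it with the closed form. It assumes $\gamma=1-\beta=(k-f)/k$, so that $n/(\gamma k)=n/(k-f)$; the function name `per_peer_query_bound` and the sample parameters are illustrative and not part of Algorithm 1.

```python
from fractions import Fraction


def per_peer_query_bound(n: int, k: int, f: int) -> Fraction:
    """Sum (n/k) * (f/k)^p over phases p = 0, 1, ... while the term exceeds 1,
    then add 1 for the final bits queried on termination."""
    assert 0 < f < k <= n
    ratio = Fraction(f, k)
    term = Fraction(n, k)      # bound on the bits queried by one peer in phase 0
    total = Fraction(0)
    while term > 1:
        total += term
        term *= ratio          # Claim 6: unknown bits shrink by a factor f/k per phase
    return total + 1


if __name__ == "__main__":
    n, k, f = 10**6, 100, 50
    # Geometric series: the sum is below (n/k) / (1 - f/k) = n / (k - f).
    print(float(per_peer_query_bound(n, k, f)), n / (k - f) + 1)
```

Because the terms form a geometric series with ratio $f/k<1$, the returned value never exceeds $n/(k-f)+1$ under the stated assumption on $\gamma$.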

Finally, a modification of the protocol (see [4]) yields an improved time complexity, resulting in the following theorem.

Theorem 8.

There is a deterministic protocol for solving Download in the asynchronous setting with at most $f$ crash faults (for any $f<k$ ) with $\mathcal{Q}=O(\frac{n}{\gamma k})$ , $\mathcal{T}=O\left(\frac{n}{\phi}+\log_{k/f}(\phi)\right)$ and $\mathcal{M}=O(nk^{2})$ .

3 Download in the Asynchronous Model with Byzantine Faults

In this section, we consider the asynchronous model with Byzantine faults, rather than crashes. In this setting, the Download problem is still solvable by the naive protocol where each honest peer queries all bits, but it is unclear whether one can do better. It turns out that this depends on whether $\beta\geq 1/2$ or not. The rest of the section handles these two cases.

3.1 Majority Byzantine Failures ( $\beta\geq 1/2$ )

When $\beta\geq 1/2$ , any asynchronous Download protocol that is resilient to Byzantine faults requires $\mathcal{Q}=\Omega(n)$ . Moreover, any deterministic asynchronous Download protocol resilient to Byzantine faults requires $\mathcal{Q}=n$ , namely, the only such protocol is the naive one.

Theorem 9.

When $\beta\geq 1/2$ , every deterministic asynchronous Download protocol that is resilient to Byzantine faults has $\mathcal{Q}=n$ .

We defer the proof to [4] since we next establish a similar result for randomized protocols (with a slightly weaker bound of $n/2$ instead of $n$ ).

One subtle point that makes the randomized lower bound more complicated has to do with the limitations imposed on the adversary due to the fact that honest peers are aware of the minimal number of honest peers in every execution, and are therefore entitled to wait for these many messages. This point deserves further scrutiny. In the asynchronous model, each peer operates in an event-driven mode. This means that its typical cycle consists of (a) performing some local computation, (b) sending some messages, and then (c) entering a waiting period, until “something happens.” Typically, this “something” is the arrival of a new message. However, the algorithm may instruct the peer $\mu$ to continue waiting until it receives new messages from $k-f-1$ distinct peers. Since it is guaranteed that at least this many honest peers exist in the execution, this instruction is legitimate (in the sense that it cannot cause the peer to deadlock). Moreover, in case $\mu$ has already identified and “blacklisted” a set $F^{\prime}$ of peers as failed peers, it is entitled to wait until it receives new messages from $k-f-1$ distinct peers in $V\setminus F^{\prime}$ . This observation restricts the ability of the adversary to delay messages sent by honest peers indefinitely. In particular, at some point during the execution, it may happen that all honest peers are waiting for new messages from other honest peers and will not take any additional actions until they do. In such a situation, sometimes referred to in the literature as reaching quiescence, the adversary may not continue delaying messages indefinitely and is compelled to “release” some of the delayed messages and let them reach their destination. Throughout, we assume that the adversary abides by this restriction.
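The waiting rule described above can be sketched as follows. This is an illustrative model only: the function `wait_for_quorum`, the representation of arrivals as `(sender, payload)` pairs, and the `blacklist` argument (standing in for $F^{\prime}$) are assumptions made for the example, not part of any protocol in the paper.

```python
from typing import Hashable, Iterable, List, Optional, Set, Tuple


def wait_for_quorum(
    arrivals: Iterable[Tuple[Hashable, object]],
    k: int,
    f: int,
    blacklist: Optional[Set[Hashable]] = None,
) -> Optional[List[object]]:
    """Consume (sender, payload) pairs in arrival order and return the payloads
    of the first k - f - 1 distinct non-blacklisted senders. Returns None if
    the stream ends first (a real peer would simply keep waiting)."""
    blacklist = blacklist or set()
    needed = k - f - 1
    if needed <= 0:
        return []
    seen = {}
    for sender, payload in arrivals:
        if sender in blacklist or sender in seen:
            continue               # ignore blacklisted peers and repeated senders
        seen[sender] = payload
        if len(seen) >= needed:
            return list(seen.values())
    return None
```

Since at least $k-f$ peers are honest, a peer waiting for $k-f-1$ distinct other senders cannot deadlock once the adversary is forced to release delayed messages.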

Theorem 10.

For any asynchronous Download protocol where $\beta\geq 1/2$ , there are executions in which some peer queries more than $n/2$ bits.

Proof.

Assume towards contradiction that there exists a randomized protocol $\mathcal{P}$ such that in every execution of $\mathcal{P}$ every peer queries at most $n/2$ bits. Consider the following (types of) executions.

Execution $A$: The input is all 0’s. The adversary corrupts the peers $\mu_{2},\dots,\mu_{k/2}$ and delays the peers $\mu_{k/2+1},\dots,\mu_{k}$ until $\mu_{1}$ terminates. (If quiescence is reached before $\mu_{1}$ terminates, then the adversary forwards all delayed messages and abandons its attempt to fail the protocol.) The adversary sets the random string used by the peers $\mu_{2},\dots,\mu_{k/2}$ to $\hat{r}$ ($\hat{r}$ should be picked in a way that makes the probability of quiescence negligible; we explain later how this is done). The corrupted peers act as they would in an honest execution (except they use $\hat{r}$ set by the adversary).
Execution $B$: The input is all 0’s, except for one index $i$ ($i$ should be picked according to the random distribution used by $\mu_{1}$ such that the probability that $i$ gets queried by $\mu_{1}$ is less than $1/2$; we explain later how this is done). The adversary corrupts the peers $\mu_{2},\dots,\mu_{k/2}$ and delays the peers $\mu_{k/2+1},\dots,\mu_{k}$ until $\mu_{1}$ terminates (same as in execution $A$). The adversary sets the random string used by the peers $\mu_{2},\dots,\mu_{k/2}$ to the same $\hat{r}$ as in execution $A$. The corrupted peers act as if they are in execution $A$.

Denote by $A_{r,\hat{r}}$ (respectively $B_{r,\hat{r}}$ ) an execution of type $A$ (respectively $B$ ) where $\mu_{1}$ uses $r$ as its random string and $\mu_{2},\dots,\mu_{k/2}$ use $\hat{r}$ . We denote by $A_{r}$ (respectively, $B_{r}$ ) the execution $A_{r,\hat{r}}$ (resp., $B_{r,\hat{r}}$ ) where $\hat{r}$ is chosen by the adversary.

Note that from the point of view of $\mu_{1}$ , executions $A_{r}$ and $B_{r}$ are indistinguishable if $\mu_{1}$ does not query bit $i$ . Also note that the adversary’s strategy is only valid if $\mu_{1}$ does not reach a quiescent state, namely, one in which it will not terminate before receiving a message from at least one of the peers $\mu_{k/2+1},\dots,\mu_{k}$ . We now show that $\hat{r}$ and $i$ can be chosen such that the probability (over the possible random choices $r$ of $\mu_{1}$ ) of reaching quiescence is negligible and the fraction of pairs $(A_{r},B_{r})$ in which $\mu_{1}$ queries bit $i$ is at most $1/2$ .

W.l.o.g., we assume that $\mu_{1}$ picks exactly $n/2$ bits to query. First, for every set $S$ of $n/2$ bits, the adversary knows the probability $p_{S}$ that $\mu_{1}$ will query $S$. For every $i$, denote the probability that bit $i$ gets queried by $\mu_{1}$ by $x_{i}=\sum_{S:i\in S}p_{S}$. Note that $\sum_{i=1}^{n}x_{i}=\sum_{S}\frac{n}{2}\cdot p_{S}=n/2$ (since every set $S$ contributes its probability $p_{S}$ to the sum $n/2$ times). The adversary picks bit $i$ with probability $p_{i}=\frac{1-x_{i}}{n/2}$ (note that $\sum_{i}p_{i}=1$).

Let $\hat{S}$ and $I$ be random variables indicating the random selection of a set $S$ by $\mu_{1}$ and an index $i$ by the adversary, respectively. Then

$$
\begin{aligned}
P[I\in\hat{S}] &= \sum_{i=1}^{n}P[i\in\hat{S}\wedge I=i]=\sum_{i=1}^{n}P[i\in\hat{S}]\cdot P[I=i]=\sum_{i=1}^{n}x_{i}\cdot\frac{1-x_{i}}{n/2}\\
&= \frac{2}{n}\sum_{i=1}^{n}x_{i}\cdot(1-x_{i})\leq\frac{2}{n}\cdot\frac{n}{4}=\frac{1}{2},
\end{aligned}
$$

where the second equality follows since $\hat{S}$ and $I$ are independent and the inequality is derived by the Cauchy–Schwarz inequality.
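The computation above can be checked numerically on small instances. The sketch below, using exact rational arithmetic, takes a distribution $p_{S}$ over $(n/2)$-subsets, derives $x_{i}$ and the adversary's distribution $p_{i}=\frac{1-x_{i}}{n/2}$, and evaluates $P[I\in\hat{S}]$. The function name `hit_probability` and the dictionary encoding of $p_{S}$ are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def hit_probability(n, p_S):
    """Given mu_1's distribution p_S over (n/2)-subsets of range(n), return
    (P[I in S_hat], [p_0, ..., p_{n-1}]) for the adversary's choice of I."""
    half = n // 2
    x = [Fraction(0)] * n
    for S, prob in p_S.items():
        assert len(S) == half
        for i in S:
            x[i] += prob                       # x_i = sum of p_S over sets containing i
    p = [(1 - xi) / half for xi in x]          # adversary picks i with probability p_i
    hit = sum(xi * pi for xi, pi in zip(x, p)) # uses independence of S_hat and I
    return hit, p


if __name__ == "__main__":
    n = 4
    subsets = list(combinations(range(n), n // 2))
    uniform = {S: Fraction(1, len(subsets)) for S in subsets}
    print(hit_probability(n, uniform))          # worst case: every x_i equals 1/2
    print(hit_probability(n, {(0, 1): Fraction(1)}))
```

The uniform distribution attains the bound of $1/2$, since then every $x_{i}=1/2$ and each term $x_{i}(1-x_{i})$ equals its maximum $1/4$; a deterministic choice of $S$ gives probability $0$.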

Next, we show that there is a choice of $\hat{r}$ for which the probability (over the possible random choices $r$ of $\mu_{1}$ ) of reaching quiescence in $A_{r}$ is negligible. Denote by $Q_{\hat{r}}$ the event of reaching quiescence in $A_{r,\hat{r}}$ (where $\mu_{1}$ uses the random string $r$ ). Denote by $Q$ the event of reaching quiescence in $A_{r,\hat{r}}$ (with the random strings $r$ , $\hat{r}$ ). Assume towards contradiction that for every value of $\hat{r}$ , the probability of $Q_{\hat{r}}$ , over the possible random choices $r$ of $\mu_{1}$ , is $P[Q_{\hat{r}}]\geq 1/n$ . Consider an execution $A^{\prime}_{r,\hat{r}}$ , where the input is all 0’s, the adversary crashes peers $\mu_{k/2+1},\dots,\mu_{k}$ , and the random strings used by the peers $\mu_{1}$ and $\mu_{2},\dots,\mu_{k/2}$ are $r$ and $\hat{r}$ , respectively. Note that if quiescence is reached in $A_{r,\hat{r}}$ , then quiescence is also reached in $A^{\prime}_{r,\hat{r}}$ . Hence, $A^{\prime}_{r,\hat{r}}$ does not terminate. Noting that

$$P[Q]=\sum_{\hat{r}}P[Q_{\hat{r}}]\cdot P[\hat{r}]\geq\frac{1}{n}\sum_{\hat{r}}P[\hat{r}]=\frac{1}{n},$$

we get that the probability that the algorithm fails (does not terminate) in $A^{\prime}_{r,\hat{r}}$ is at least $1/n$ , a contradiction. It follows that there must be a choice of $\hat{r}$ such that the probability of $Q_{\hat{r}}$ over the possible random choices $r$ of $\mu_{1}$ satisfies $P[Q_{\hat{r}}]<1/n$ . The theorem follows. $\hfill\blacktriangleleft$

3.2 Minority Byzantine Failures ( $\beta<1/2$ )

For $\beta<1/2$, we consider an input vector X that is divided into $\mathcal{K}=\lceil n/\varphi\rceil$ contiguous segments, each of length approximately $\varphi$. The $\ell$th segment is denoted by $\textbf{X}[\ell,\varphi]$. Peers issue queries for specific segments and obtain the corresponding bit strings as responses. These responses are then broadcast to all other peers in the form of messages $\langle\ell,s\rangle$, where $\ell$ denotes the segment index and $s$ the returned bit string. Two messages are said to be overlapping if they refer to the same segment and consistent if, in addition, they carry identical bit strings. A set of consistent responses from at least $t$ peers is referred to as a $t$-frequent string, denoted by the function $\mbox{\sf FS}(MS,t)$ applied to a multiset $MS$ of overlapping responses.
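One possible reading of $\mbox{\sf FS}(MS,t)$ is sketched below: it returns every bit string that occurs at least $t$ times in a multiset of overlapping responses. The sketch assumes that each peer contributes at most one response per segment, so that occurrences correspond to distinct peers; the function name `frequent_strings` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def frequent_strings(ms: Iterable[str], t: float) -> List[str]:
    """Return the bit strings occurring at least t times in the multiset ms of
    overlapping responses (all referring to the same segment)."""
    counts = Counter(ms)
    return sorted(s for s, c in counts.items() if c >= t)


if __name__ == "__main__":
    responses = ["0110", "0110", "0111", "0110", "1110", "0111"]
    print(frequent_strings(responses, 2))
```

In this example the strings `0110` and `0111` are 2-frequent, while `1110` is reported only once and is discarded.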

To handle inconsistencies caused by Byzantine peers, we employ a decision-tree construction for each set $S$ of overlapping responses corresponding to a particular segment. If $S$ contains only one candidate, the tree consists of a single leaf labeled by that bit string. Otherwise, two non-consistent bit strings $s,s^{\prime}\in S$ are chosen, and the first index at which they differ is identified as a separating index. An internal node is created at this index, and the set $S$ is split into two subsets according to the value of the separating bit. The process is applied recursively until the leaves correspond to distinct candidate bit strings. By querying X at the separating indices, inconsistencies can be resolved, and the correct bit string is determined as long as it appears among the leaves of the decision tree.
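The decision-tree construction described above can be sketched as follows. This is a minimal illustration, assuming all candidates are equal-length strings over `'0'`/`'1'` and that `query(idx)` returns the true bit of the segment at position `idx` (relative to the segment start); the names `build_tree`, `resolve`, and the tuple encoding of internal nodes are choices made for the example.

```python
from typing import Callable, Sequence, Tuple, Union

# A tree is either a leaf (a candidate bit string) or a node
# (separating_index, subtree_for_bit_0, subtree_for_bit_1).
Tree = Union[str, Tuple[int, "Tree", "Tree"]]


def build_tree(candidates: Sequence[str]) -> Tree:
    cands = sorted(set(candidates))
    if not cands:
        raise ValueError("need at least one candidate")
    if len(cands) == 1:
        return cands[0]                       # single candidate: a leaf
    s, t = cands[0], cands[1]                 # two non-consistent strings
    idx = next(i for i, (a, b) in enumerate(zip(s, t)) if a != b)
    zeros = [c for c in cands if c[idx] == "0"]
    ones = [c for c in cands if c[idx] == "1"]
    # Both sides are nonempty (s and t differ at idx), so the recursion terminates.
    return (idx, build_tree(zeros), build_tree(ones))


def resolve(tree: Tree, query: Callable[[int], str]) -> Tuple[str, int]:
    """Walk the tree by querying separating indices; return (leaf, #queries)."""
    queries = 0
    while not isinstance(tree, str):
        idx, zero, one = tree
        tree = one if query(idx) == "1" else zero
        queries += 1
    return tree, queries


if __name__ == "__main__":
    truth = "0110"
    tree = build_tree(["0110", "0111", "1110"])
    print(tree, resolve(tree, lambda i: truth[i]))
```

Each query eliminates at least one candidate, so resolving a segment with $c$ distinct candidates costs at most $c-1$ queries, and the returned leaf is the correct string whenever the correct string is among the candidates.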

The protocol then proceeds as follows. The input is first partitioned into $\mathcal{K}$ segments, and each peer $\mu$ independently selects one segment uniformly at random, queries it, and broadcasts the result to all other peers. Subsequently, each peer constructs decision trees for every segment using the overlapping responses it received. A segment $\ell$ is determined by forming a decision tree from the responses that appeared at least $t=\frac{(\gamma-\beta)k}{2\mathcal{K}}$ times, ensuring that the correct bit string for each segment is eventually recovered. From the above discussion, we have the following results (due to space constraints, details are deferred to [4]).

Theorem 11.

There is a 2-cycle asynchronous randomized protocol for Download with $\mathcal{Q}=O\left(\sqrt{\frac{n}{\gamma-\beta}}+\frac{n\ln n}{(\gamma-\beta)k}\right)$.

The above $2$-cycle protocol can be extended into a multi-cycle protocol that improves the expected query complexity. The first cycle matches the first cycle of the $2$-cycle protocol, but in each later cycle $i>1$, the input is divided into $\mathcal{K}_{i}=\frac{n}{2^{i}\varphi}$ segments of size $\varphi_{i}=2^{i}\cdot\varphi$. Each $i$-segment consists of two $(i-1)$-segments. Every peer $\mu$ selects an $i$-segment uniformly at random, reconstructs its value from decision trees built in the previous cycle, and then broadcasts the result. Since the segment size doubles each cycle, after $O(\log\mathcal{K})$ cycles each peer has the entire input and outputs correctly. This leads to the following result (details are deferred to [4]).

Theorem 12.

There is an $O\left(\log\left(\frac{\gamma k}{\ln n}\right)\right)$-cycle protocol that w.h.p. computes Download in the point-to-point model with expected query complexity $\mathcal{Q}=O(n\log n/((\gamma-\beta)k))$ and message size $O(n)$.

3.3 Deterministic Download protocol

We now present a deterministic asynchronous Download protocol for Byzantine faults where $\beta<1/2$ . Consider the deterministic synchronous protocol presented in [3], where a committee $C_{i}$ of size $2f+1$ is formed for every $i\in[1,n]$ , and every peer $\mu\in C_{i}$ queries the bit $\textbf{X}[i]$ and broadcasts the message $(\mu,\textbf{X}[i]=b_{i})$ to all other peers. In order to adapt that protocol to the asynchronous model, we modify the final part of the protocol, requiring each non-committee peer $\mu\notin C_{i}$ to wait until it gets messages $(\mu^{\prime},\textbf{X}[i]=b)$ with identical value $b$ from at least $f+1$ peers $\mu^{\prime}$ , and then decide $res^{\mu}[i]\leftarrow b$ . We get the following result (see [4]).
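The modified decision rule for a non-committee peer can be sketched as follows. The sketch handles a single index $i$ and models incoming messages as `(sender, b)` pairs in arrival order; the function name `decide_bit` and this message encoding are assumptions made for illustration and do not reproduce the protocol of [3] or [4] in full.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Optional, Tuple


def decide_bit(arrivals: Iterable[Tuple[Hashable, int]], f: int) -> Optional[int]:
    """Process (sender, b) reports for one index i in arrival order and return
    b as soon as f + 1 distinct senders have reported the same value b.
    Returns None if the stream ends first (a real peer would keep waiting)."""
    supporters = defaultdict(set)
    for sender, b in arrivals:
        supporters[b].add(sender)      # repeated reports by one sender count once
        if len(supporters[b]) >= f + 1:
            return b
    return None


if __name__ == "__main__":
    # f = 2, committee of size 2f + 1 = 5: m1, m2 are Byzantine and report 1.
    msgs = [("m1", 1), ("m2", 1), ("m3", 0), ("m3", 0), ("m4", 0), ("m1", 1), ("m5", 0)]
    print(decide_bit(msgs, f=2))
```

With at most $f$ Byzantine peers, a wrong value can gather at most $f$ distinct supporters, while the at least $f+1$ honest members of a committee of size $2f+1$ eventually all report the correct value, so the wait terminates with the correct bit.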

Theorem 13.

When $\beta<1/2$, there exists a deterministic asynchronous protocol that solves Download with $\mathcal{Q}=O(\beta n)$, $\mathcal{T}=O\left(\frac{\beta n}{\phi}\right)$ and $\mathcal{M}=O(f\cdot n)$.

4 Application: Efficient Blockchain Oracles

Blockchain systems [25] have seen a rise in popularity due to their ability to provide both transparency and strong cryptographic guarantees of agreement on the order of transactions, without the need for trusted third-party entities. More general computational abilities have also been sought after for blockchains. Smart contracts [27] fulfill that need by providing users of the blockchain a way to run programs on the blockchain that ensures reliable and deterministic execution while providing transparency and immutability of both the program code and its state(s). Note that since the execution is required to be deterministic, i.e., every node must produce the same result, smart contracts are restricted to accessing (agreed upon) on-chain data, as off-chain data may introduce non-determinism to the execution.

Blockchain oracles [2, 11, 26] are components of blockchain systems that provide multiple services that support and extend the functionality of smart contracts (and other on-chain entities). The most important and fundamental service a blockchain oracle provides is bridging between the on-chain network and off-chain resources [12, 16], providing smart contracts access to external data without introducing non-determinism into their execution. We focus on this service and artificially consider it to be the sole responsibility of a blockchain oracle. In the remainder of this section, we explain in detail a possible application of the DR model and the Download problem for improved query efficiency within the context of blockchain oracles.

Blockchain oracles general structure.

Blockchain oracles consist of an on-chain component and an off-chain component. The off-chain component encompasses the different data sources that store the required external information (e.g., stock prices, weather predictions) and the network of nodes in charge of retrieving that information and transmitting it to the on-chain component. The on-chain component can be thought to be (but is not necessarily) a smart contract that is responsible for verifying the validity of the report, making the information public on the blockchain it is hosted on, and using it for its execution.

Formally, the off-chain component consists of two parts: an asynchronous oracle network (the network is sometimes assumed to be partially synchronous in blockchain oracles), with peers (nodes) $v_{i}$, $i\in[1,k]$, capable of exchanging direct messages among themselves, and data sources $DS_{j}$, $j\in[1,m]$, each storing an array $\textbf{X}_{j}$ of $n$ variables in which the on-chain component is interested. Each peer can read the $i$-th cell from the $j$-th data source $DS_{j}$ by invoking $\mbox{\tt Query}(i,j)$. A fraction of up to $\beta_{t}\leq 1/3$ of the peers may be Byzantine, and a fraction of up to $\beta_{d}\leq 1/2$ of the $m$ data sources may be Byzantine. Denote the on-chain component by $SC$.

The goal of blockchain oracles, as mentioned above, is to pull information from external sources and push a final value on-chain. There are a few difficulties that may arise when trying to develop such a system. First, it might be the case that different data sources report slightly different values (e.g., prices of a specific stock), even if all of them act honestly. Moreover, corrupted data sources might provide false and even inconsistent values (providing some nodes with value $\alpha$ and another with $\alpha^{\prime}$ ). The system needs to pick a final value in a way that (1) represents the range of honest values pulled from honest data sources and (2) malicious players (both data sources and peers) cannot force the system to pick a final value that does not represent the range of honest values.

The Oracle Data Delivery (ODD) problem.

Denote by $\mathcal{H}_{ds}$ the set of honest data sources. Let $h_{\min}(i)=\min_{j\in\mathcal{H}_{ds}}\{\textbf{X}_{j}[i]\}$ and $h_{\max}(i)=\max_{j\in\mathcal{H}_{ds}}\{\textbf{X}_{j}[i]\}$. The honest range of $i$ is the range $\sigma(i)=[h_{\min}(i),h_{\max}(i)]$. The ODD problem requires the on-chain $SC$ to publish an array of values $res$ to the target blockchain such that $res[i]\in\sigma(i)$, for every $i\in[1,n]$.

A blockchain oracle protocol can generally be split into three distinct steps: (1) collecting data, (2) reaching an agreement on the collected data, and (3) deriving and publishing a final value. Note that this abstraction is the minimum required abstraction to capture the operation of blockchain oracle protocols such as OCR [12] and DORA [16]. (These protocols have many additional technical aspects, different structures, and different ways of handling steps (2) and (3). As our focus is on improving step (1), we may w.l.o.g. assume the abstract structure.)

We now show how our Download protocols can be used to significantly reduce the cost of the Oracle Data Collection (ODC), i.e., step (1) of blockchain oracles.

Improving ODC by blockchain oracles via Download.

Current protocols perform the data collection step by the following ODC process. For every node:

$\blacksquare$ Pick $2m\beta_{d}+1$ data sources into a set $ADS$.
$\blacksquare$ Perform $o_{i,j}\leftarrow\mbox{\tt Query}(i,j)$, for every $i\in[1,n]$ and $j\in ADS$.
$\blacksquare$ Calculate the median $o_{i}\leftarrow median(\{o_{i,j}\mid j\in ADS\})$ and proceed to step (2).
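The median step can be illustrated with a short sketch. It assumes that the $2m\beta_{d}+1$ observations for one cell $i$ contain at most $m\beta_{d}$ values from Byzantine sources, so that honest values form a majority; the function name `odc_value` and the sample numbers are hypothetical.

```python
from statistics import median
from typing import Sequence


def odc_value(observations: Sequence[float]) -> float:
    """Median of the values o_{i,j} read for one cell i from the sources in ADS."""
    if len(observations) % 2 == 0:
        raise ValueError("expected an odd number (2*m*beta_d + 1) of observations")
    return median(observations)


if __name__ == "__main__":
    honest = [100.0, 101.0, 102.0]        # m*beta_d + 1 = 3 honest readings
    byzantine = [0.0, 1e9]                # m*beta_d = 2 arbitrary readings
    o_i = odc_value(honest + byzantine)
    print(o_i, min(honest) <= o_i <= max(honest))
```

Because fewer than half of the observations are Byzantine, the median always has an honest value at or below it and an honest value at or above it, which places it within the range spanned by the honest readings.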

The results of [12, 16], cast in our abstract formulation, yield the following result.

Theorem 14 ([12, 16]).

The ODC process guarantees that $o_{i}\in\sigma(i)$ for every $i\in[1,n]$ and has total query cost $O(mnk)$ and worst case individual query cost $\mathcal{Q}=O(mn)$ .

Instead, we propose utilizing the guarantees of Download protocols, namely, that for an honest data source $DS_{j}$ , the output of each peer is exactly $\textbf{X}_{j}$ , to construct the following modification of the ODC steps.

$\blacksquare$ For every node, pick $2m\beta_{d}+1$ data sources into a set $ADS$.
$\blacksquare$ For every data source $j\in ADS$, run a Download protocol (denote the result for cell $i$ from data source $j$ by $o_{i,j}$).
$\blacksquare$ Calculate the median $o_{i}\leftarrow median(\{o_{i,j}\mid j\in ADS\})$ and proceed to step (2).

It is easy to verify that this modified construction yields the following.

Theorem 15.

The Download-based ODC process guarantees $o_{i}\in\sigma(i)$ for every $i\in[1,n]$ and takes $\tilde{O}(mn)$ total queries and $\mathcal{Q}=\tilde{O}(mn/k)$ w.h.p.

Note that the Download protocol presented in this paper assumes a binary input array, but this can be extended to numbers via a relatively simple extension. However, it is important to note that our solution relies on the following restrictive assumption. For two honest peers $v,v^{\prime}$ , if both $v$ and $v^{\prime}$ issue the query $\mbox{\tt Query}(i,j)$ , then they get the same result, for every $i\in[1,n]$ and honest data source $j$ (i.e., the data does not change if queried at different times). Getting rid of this assumption and solving the problem efficiently for dynamic data is left as an open problem for future study.

References

[1] Ittai Abraham, Dahlia Malkhi, and Alexander Spiegelman. Asymptotically optimal validated asynchronous byzantine agreement. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, pages 337–346, 2019. doi:10.1145/3293611.3331612.
[2] John Adler, Ryan Berryhill, Andreas Veneris, Zissis Poulos, Neil Veira, and Anastasia Kastania. Astraea: A decentralized blockchain oracle. In 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pages 1145–1152, 2018. doi:10.1109/Cybermatics_2018.2018.00207.
[3] John Augustine, Jeffin Biju, Shachar Meir, David Peleg, Srikkanth Ramachandran, and Aishwarya Thiruvengadam. Byzantine Resilient Distributed Computing on External Data. In Dan Alistarh, editor, 38th International Symposium on Distributed Computing (DISC 2024), volume 319 of Leibniz International Proceedings in Informatics (LIPIcs), pages 3:1–3:23, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2024.3.
[4] John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, and David Peleg. Distributed download from an external data source in asynchronous faulty settings. CoRR, abs/2509.03755, 2025. doi:10.48550/arXiv.2509.03755.
[5] John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, and David Peleg. Distributed Download from an External Data Source in Byzantine Majority Settings. In Dariusz R. Kowalski, editor, 39th International Symposium on Distributed Computing (DISC 2025), volume 356 of Leibniz International Proceedings in Informatics (LIPIcs), pages 9:1–9:22, Dagstuhl, Germany, 2025. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2025.9.
[6] Adam D Barwell, Ping Hou, Nobuko Yoshida, and Fangyi Zhou. Crash-stop failures in asynchronous multiparty session types. Logical Methods in Computer Science, 21, 2025. doi:10.46298/LMCS-21(2:5)2025.
[7] Michael Ben-Or. Another advantage of free choice (extended abstract): Completely asynchronous agreement protocols. In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing, PODC ’83, pages 27–30, New York, NY, USA, 1983. Association for Computing Machinery. doi:10.1145/800221.806707.
[8] Michael Ben-Or. Another advantage of free choice (extended abstract) completely asynchronous agreement protocols. In Proceedings of the second annual ACM symposium on Principles of distributed computing, pages 27–30, 1983.
[9] Gabriel Bracha. Asynchronous byzantine agreement protocols. Information & Computation, 75:130–143, 1987. doi:10.1016/0890-5401(87)90054-X.
[10] Gabriel Bracha. Asynchronous byzantine agreement protocols. Information and Computation, 75(2):130–143, 1987. doi:10.1016/0890-5401(87)90054-X.
[11] Lorenz Breidenbach, Christian Cachin, Benedict Chan, Alex Coventry, Steve Ellis, Ari Juels, Farinaz Koushanfar, Andrew Miller, Brendan Magauran, Daniel Moroz, Sergey Nazarov, Alexandru Topliceanu, Florian Tramèr, and Fan Zhang. Chainlink 2.0: Next steps in the evolution of decentralized oracle networks. Technical report, Chainlink Labs, 2021.
[12] Lorenz Breidenbach, Christian Cachin, Alex Coventry, Ari Juels, and Andrew Miller. Chainlink off-chain reporting protocol. Technical report, Chainlink Labs, 2021.
[13] Christian Cachin, Klaus Kursawe, Frank Petzold, and Victor Shoup. Secure and efficient asynchronous broadcast protocols. In Annual International Cryptology Conference, pages 524–541. Springer, 2001. doi:10.1007/3-540-44647-8_31.
[14] Christian Cachin, Klaus Kursawe, and Victor Shoup. Random oracles in Constantinople: practical asynchronous byzantine agreement using cryptography. In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing, pages 123–132, 2000.
[15] Ran Canetti and Tal Rabin. Fast asynchronous byzantine agreement with optimal resilience. In Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 42–51, 1993. doi:10.1145/167088.167105.
[16] Prasanth Chakka, Saurabh Joshi, Aniket Kate, Joshua Tobkin, and David Yang. DORA: distributed oracle agreement with simple majority. CoRR, abs/2305.03903, 2023. doi:10.48550/arXiv.2305.03903.
[17] Brian A. Coan. A compiler that increases the fault tolerance of asynchronous protocols. IEEE Transactions on Computers, 37(12):1541–1553, 1988. doi:10.1109/12.9732.
[18] Sisi Duan, Michael K Reiter, and Haibin Zhang. Beat: Asynchronous bft made practical. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 2028–2041, 2018. doi:10.1145/3243734.3243812.
[19] Alan David Fekete. Asynchronous approximate agreement. In Proceedings of the sixth annual ACM Symposium on Principles of distributed computing, pages 64–76, 1987. doi:10.1145/41840.41846.
[20] Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374–382, April 1985. doi:10.1145/3149.214121.
[21] Bruce M Kapron, David Kempe, Valerie King, Jared Saia, and Vishal Sanwalani. Fast asynchronous byzantine agreement and leader election with full information. ACM Transactions on Algorithms (TALG), 6(4):1–28, 2010. doi:10.1145/1824777.1824788.
[22] Valerie King and Jared Saia. Breaking the $O(n^{2})$ bit barrier: Scalable byzantine agreement with an adaptive adversary. J. ACM, 58(4), July 2011. doi:10.1145/1989727.1989732.
[23] Julian Loss and Tal Moran. Combining asynchronous and synchronous byzantine agreement: The best of both worlds. Cryptology ePrint Archive, 2018.
[24] Nancy Lynch and Srikanth Sastry. Consensus using asynchronous failure detectors. arXiv preprint arXiv:1502.02538, 2015. arXiv:1502.02538.
[25] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. whitepaper, May 2009.
[26] Supra Research. Supra’s blockchain infrastructure stack. Whitepaper, Supra Labs, November 2024. URL: https://supra.com/documents/SupraTech-Whitepaper.pdf.
[27] Nick Szabo. Formalizing and securing relationships on public networks. First Monday, 2, 1997. doi:10.5210/FM.V2I9.548.
[28] Lewis Tseng and Nitin H Vaidya. Asynchronous convex hull consensus in the presence of crash faults. In Proceedings of the 2014 ACM symposium on Principles of distributed computing, pages 396–405, 2014. doi:10.1145/2611462.2611470.

[bib.bib1] [1] Ittai Abraham, Dahlia Malkhi, and Alexander Spiegelman. Asymptotically optimal validated asynchronous byzantine agreement. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, pages 337–346, 2019. doi:10.1145/3293611.3331612.

[bib.bib2] [2] John Adler, Ryan Berryhill, Andreas Veneris, Zissis Poulos, Neil Veira, and Anastasia Kastania. Astraea: A decentralized blockchain oracle. In 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pages 1145–1152, 2018. doi:10.1109/Cybermatics_2018.2018.00207.

[bib.bib3] [3] John Augustine, Jeffin Biju, Shachar Meir, David Peleg, Srikkanth Ramachandran, and Aishwarya Thiruvengadam. Byzantine Resilient Distributed Computing on External Data. In Dan Alistarh, editor, 38th International Symposium on Distributed Computing (DISC 2024), volume 319 of Leibniz International Proceedings in Informatics (LIPIcs), pages 3:1–3:23, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2024.3.

[bib.bib4] [4] John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, and David Peleg. Distributed download from an external data source in asynchronous faulty settings. CoRR, abs/2509.03755, 2025. doi:10.48550/arXiv.2509.03755.

[bib.bib5] [5] John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, and David Peleg. Distributed Download from an External Data Source in Byzantine Majority Settings. In Dariusz R. Kowalski, editor, 39th International Symposium on Distributed Computing (DISC 2025), volume 356 of Leibniz International Proceedings in Informatics (LIPIcs), pages 9:1–9:22, Dagstuhl, Germany, 2025. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2025.9.

[bib.bib6] [6] Adam D Barwell, Ping Hou, Nobuko Yoshida, and Fangyi Zhou. Crash-stop failures in asynchronous multiparty session types. Logical Methods in Computer Science, 21, 2025. doi:10.46298/LMCS-21(2:5)2025.

[bib.bib7] [7] Michael Ben-Or. Another advantage of free choice (extended abstract): Completely asynchronous agreement protocols. In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing, PODC ’83, pages 27–30, New York, NY, USA, 1983. Association for Computing Machinery. doi:10.1145/800221.806707.

[bib.bib8] [8] Michael Ben-Or. Another advantage of free choice (extended abstract) completely asynchronous agreement protocols. In Proceedings of the second annual ACM symposium on Principles of distributed computing, pages 27–30, 1983.

[bib.bib9] [9] Gabriel Bracha. Asynchronous byzantine agreement protocols. Information & Computation, 75:130–143, 1987. doi:10.1016/0890-5401(87)90054-X.

[bib.bib10] [10] Gabriel Bracha. Asynchronous byzantine agreement protocols. Information and Computation, 75(2):130–143, 1987. doi:10.1016/0890-5401(87)90054-X.

[bib.bib11] [11] Lorenz Breidenbach, Christian Cachin, Benedict Chan, Alex Coventry, Steve Ellis, Ari Juels, Farinaz Koushanfar, Andrew Miller, Brendan Magauran, Daniel Moroz, Sergey Nazarov, Alexandru Topliceanu, Florian Tram‘er, and Fan Zhang. Chainlink 2.0: Next steps in the evolution of decentralized oracle networks. Technical report, Chainlink Labs, 2021.

[bib.bib12] [12] Lorenz Breidenbach, Christian Cachin, Alex Coventry, Ari Juels, and Andrew Miller. Chainlink off-chain reporting protocol. Technical report, Chainlink Labs, 2021.

[bib.bib13] [13] Christian Cachin, Klaus Kursawe, Frank Petzold, and Victor Shoup. Secure and efficient asynchronous broadcast protocols. In Annual International Cryptology Conference, pages 524–541. Springer, 2001. doi:10.1007/3-540-44647-8_31.

[bib.bib14] [14] Christian Cachin, Klaus Kursawe, and Victor Shoup. Random oracles in constantipole: practical asynchronous byzantine agreement using cryptography. In Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing, pages 123–132, 2000.

[bib.bib15] [15] Ran Canetti and Tal Rabin. Fast asynchronous Byzantine agreement with optimal resilience. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, pages 42–51, 1993. doi:10.1145/167088.167105.

[bib.bib16] [16] Prasanth Chakka, Saurabh Joshi, Aniket Kate, Joshua Tobkin, and David Yang. DORA: distributed oracle agreement with simple majority. CoRR, abs/2305.03903, 2023. doi:10.48550/arXiv.2305.03903.

[bib.bib17] [17] Brian A. Coan. A compiler that increases the fault tolerance of asynchronous protocols. IEEE Transactions on Computers, 37(12):1541–1553, 1988. doi:10.1109/12.9732.

[bib.bib18] [18] Sisi Duan, Michael K Reiter, and Haibin Zhang. BEAT: Asynchronous BFT made practical. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 2028–2041, 2018. doi:10.1145/3243734.3243812.

[bib.bib19] [19] Alan David Fekete. Asynchronous approximate agreement. In Proceedings of the sixth annual ACM Symposium on Principles of distributed computing, pages 64–76, 1987. doi:10.1145/41840.41846.

[bib.bib20] [20] Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374–382, April 1985. doi:10.1145/3149.214121.

[bib.bib21] [21] Bruce M Kapron, David Kempe, Valerie King, Jared Saia, and Vishal Sanwalani. Fast asynchronous Byzantine agreement and leader election with full information. ACM Transactions on Algorithms (TALG), 6(4):1–28, 2010. doi:10.1145/1824777.1824788.

[bib.bib22] [22] Valerie King and Jared Saia. Breaking the $O(n^2)$ bit barrier: Scalable Byzantine agreement with an adaptive adversary. J. ACM, 58(4), July 2011. doi:10.1145/1989727.1989732.

[bib.bib23] [23] Julian Loss and Tal Moran. Combining asynchronous and synchronous Byzantine agreement: The best of both worlds. Cryptology ePrint Archive, 2018.

[bib.bib24] [24] Nancy Lynch and Srikanth Sastry. Consensus using asynchronous failure detectors. arXiv preprint arXiv:1502.02538, 2015. arXiv:1502.02538.

[bib.bib25] [25] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. Whitepaper, May 2009.

[bib.bib26] [26] Supra Research. Supra’s blockchain infrastructure stack. Whitepaper, Supra Labs, November 2024. URL: https://supra.com/documents/SupraTech-Whitepaper.pdf.

[bib.bib27] [27] Nick Szabo. Formalizing and securing relationships on public networks. First Monday, 2, 1997. doi:10.5210/FM.V2I9.548.

[bib.bib28] [28] Lewis Tseng and Nitin H Vaidya. Asynchronous convex hull consensus in the presence of crash faults. In Proceedings of the 2014 ACM symposium on Principles of distributed computing, pages 396–405, 2014. doi:10.1145/2611462.2611470.
