Brief Announcement: Distributed Download from an External Data Source in Asynchronous Faulty Settings
Abstract
The distributed Data Retrieval (DR) model consists of peers connected by a complete peer-to-peer communication network, and a trusted external data source that stores an array X of bits (). Up to of the peers might fail in any execution (for ). Peers can obtain the information either by inexpensive messages passed among themselves or through expensive queries to the source array X. In the DR model, we focus on designing protocols that minimize the number of queries performed by any nonfaulty peer (a measure referred to as query complexity) while maximizing the resilience parameter .
The Download problem requires each nonfaulty peer to correctly learn the entire array X. Earlier work on this problem focused on synchronous communication networks and established several deterministic and randomized upper and lower bounds. Our work is the first to extend the study of distributed data retrieval to asynchronous communication networks. We address the Download problem under both the Byzantine and crash failure models. We present query-optimal deterministic solutions in an asynchronous model that can tolerate any fixed fraction of crash faults. In the Byzantine failure model, it is known that deterministic protocols incur a query complexity of per peer, even under synchrony. We extend this lower bound to randomized protocols in the asynchronous model for , and further show that for , a randomized protocol exists with near-optimal query complexity. To the best of our knowledge, this is the first work to address the Download problem in asynchronous communication networks.
Keywords and phrases:
Byzantine Fault Tolerance, Blockchain Oracle, Data Retrieval Model, Distributed DownloadFunding:
John Augustine: Supported by the Centre for Cybersecurity, Trust and Reliability (CyStar), IIT Madras.Copyright and License:
2012 ACM Subject Classification:
Theory of computation Distributed algorithmsEditor:
Dariusz R. KowalskiSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
1.1 Background and motivation
The Data Retrieval Model (DR) was first introduced in [2] to abstract the fundamental process of a group learning from a reliable external data source, where the data source is too large or it is too expensive to be learned individually (i.e., requires members of the group to collaborate), and some members of the group might crash during execution or act in other ways to deliberately sabotage the learning process. One key example of systems where this process takes place is blockchain oracles [6, 7]. We address this Oracle data delivery process in detail later and present a method for improving its performance using the DR model and the protocols designed in this work.
The DR model contains two entities: (i) a peer-to-peer network and (ii) an external data source in the form of an bit array X. There are peers, up to fraction of which may be faulty (and at least fraction of which are nonfaulty). Each peer has access to the content of the array through queries. The general class of retrieval problems consists of problems requiring every peer to output for some computable function of the input array X. In this work, we focus on the most fundamental retrieval problem, where , referred to hereafter as the Download problem111It is fundamental since every retrieval problem can be solved by first performing download and then locally computing ., where every peer needs to learn the entire input X.
In [2] it is shown that deterministic Download in synchronous systems with Byzantine faults requires queries, for every . This implies that in the presence of Byzantine faults, one cannot attain the ideal query complexity of without using randomization.
In this work, we consider Download protocols in the asynchronous setting for both the crash and Byzantine fault models. In the asynchronous Byzantine fault setting, we prove that, unlike the synchronous setting, where randomization can overcome the deterministic lower bound, queries per peer are required when , even for randomized protocols. We complement this lower bound with a protocol for the regime that achieves a query complexity of , which, for a constant , is within log factors of the generic lower bound of .
Turning our attention to the more benign setting of crash faults (i.e., where all peers are honest but some fraction may stop functioning), the picture is brighter. For this model, it turns out that even in the asynchronous setting, one can get efficient deterministic Download protocols that achieve the optimal query complexity of , for any fraction of crashes.
1.2 The Model
In the Data Retrieval (DR) model, the system consists of two components. The first is a collection of peers, each equipped with a unique ID from the range , connected by a complete communication network (or clique). The network provides peer-to-peer message passing, namely, every peer can send at time a (possibly different) message of size at most bits to each other peer.
The second component of the DR model is an external data source. The source stores an -bit input array . It provides the peers with read-only access, allowing each peer to retrieve the data through queries of the form , for . The answer returned by the source would then be , the element in the array. This type of communication is referred to as source-to-peer communication.
We consider asynchronous communication, where any communication (both among peer-to-peer and source-to-peer) can be delayed by any finite amount of time. For randomized protocols, we use the following notion of cycles.
Cycles.
In the asynchronous model, there is no global notion of rounds, as each peer operates at a different pace. Nevertheless, to describe our protocols and analyze their performance, it is convenient to divide the local execution of each peer into (varying time) cycles. Each such local cycle consists of the following stages.
-
Sending (0 or more) queries and getting answers.
-
Sending (0 or more) messages.
-
Waiting to receive messages.
We assume that local computation takes 0 time and can be performed at any point in a cycle. Moreover, when waiting for messages, after every message is received, the peer can adaptively decide whether to keep waiting for an additional message or continue to the next cycle. Note that the local cycle of peer might coincide with a different local cycle of another peer .
In the absence of global time units, it is convenient to break the time axis into “virtual blocks” by defining , for integer , as the first time any peer started its local cycle .
Every message is of size at most bits, where is a system parameter. Note that throughout the paper, we either set to a specific value, or leave it as a parameter, in which case increasing the message size parameter would result in faster protocols.
The adversary.
Our analysis uses the notion of an adversary, representing the adverse conditions in which the system operates, including the asynchronous communication and the possibility of failures.
The adversary has two types of operation. First, it can fail up to peers, under the restriction that it can only fail a peer between its cycles (or before the first cycle), meaning that a peer can make random decisions in its current cycle without the adversary being able to react until the end of the cycle. Second, it can set the time it takes a message sent by peer in its local cycle to reach peer , under the restriction that it must set the time for every pair of peers , before time . In other words, the adversary must set the latency of each message sent during a cycle before any peer starts cycle . The adversary can also decide when every peer starts its execution (i.e., we do not assume a simultaneous start). Note that in the case of deterministic protocols, the notion of cycles is irrelevant, and we consider a standard adversary that can fail a peer at any point of the execution and can delay messages for any finite amount of time.
The adversary selects the input data and determines the failure pattern of the peers. In the crash failure model, the adversary’s power is limited to crashing some of the peers in every execution of the protocol. Once a peer crashes, it stops its local execution of the protocol arbitrarily and permanently. This could happen in the middle of operation, e.g., after the peer has already sent some, but perhaps not all, of the messages it was instructed by the protocol to send out at a given point in time. In contrast, in the Byzantine failure model, a failed peer can deviate from the protocol in arbitrary ways. We assume that the adversary can fail at most peers, for some given222We do not assume to be a fixed constant (unless mentioned otherwise). . We let , so there is (at least) a fraction of nonfaulty peers in every execution. Denote the set of faulty (respectively, nonfaulty) peers in the execution by . (resp., ). We assume that the adversary knows the protocol and hence can simulate it (up to random coins).
We concentrate on the following complexity measures.
- Query Complexity ():
-
the maximum number of bits queried by a nonfaulty peer during the execution.
- Time Complexity ():
-
the time it takes for the protocol to terminate.
- Message Complexity ():
-
the total number of messages sent by nonfaulty peers during the execution.
We assume that queries to the source are the more expensive component in the system, so we focus mainly on optimizing the query complexity . Measuring the maximum cost per peer (rather than the total cost) gives priority to a balanced load of queries over the nonfaulty peers.
Let us now formally define the Download problem. Consider a DR network with peers, where at most can be faulty, and a source that stores a bit array . Each peer is required to learn X. Formally, each nonfaulty peer outputs a bit array , and it is required that, upon termination, for every and .
In the absence of failures, this problem can be solved by sharing the task of querying all bits evenly among the peers, yielding . The message complexity is , assuming small messages of size , and the time complexity is since bits need to be sent along each communication link when the workload is shared.
1.3 Contributions
We present the Download problem in asynchronous communication networks, under both crash and Byzantine failures settings. In the crash-fault setting, our deterministic results are optimal w.r.t. to the resilience (for any ) and query complexity. Notice that this optimality also holds for randomized algorithms. In the Byzantine failure setting, we provide deterministic and randomized lower bounds as well as upper bounds. The main results are the following (more details can be found in [3]).
-
(1)
Deterministic Download in Crash-Fault: We present a deterministic protocol for solving Download problem in the asynchronous setting with at most crash faults () with , and where is the message size. Our result achieves the optimal query complexity for any fraction of crash fault, .
-
(2)
Deterministic Lower Bound in Byzantine Fault: We show that for , every deterministic asynchronous Download protocol that is resilient to Byzantine faults requires .
-
(3)
Deterministic Download in Byzantine Fault: We show that for , there exists a deterministic asynchronous protocol that solves Download with , and .
-
(4)
Randomized Lower Bound in Byzantine Fault: We show that for any randomized asynchronous Download protocol where , there does not exist any execution in which every peer queries less than or equal bits.
-
(5)
2-cycle Randomized Download in Byzantine Fault: We present a 2-cycle asynchronous randomized protocol for Download with and where , and the message size is .
-
(6)
Randomized Download in Byzantine Fault: We present a -cycle protocol that computes Download whp in the point-to-point model having expected query complexity and where , and the message size is .
To the best of our knowledge, this work is the first to study retrieval problems in the Data Retrieval (DR) model under asynchronous communication. The DR model has previously been explored in synchronous networks, most notably in [2] and the companion paper [4]. In particular, [2] introduced the Download problem, motivated by practical applications such as Distributed Oracle Networks (DONs), which form a crucial component of blockchain systems and employ protocols like OCR and DORA[6, 7].
2 Application: Efficient Blockchain Oracles
Blockchain systems [8] have seen a rise in popularity due to their ability to provide both transparency and strong cryptographic guarantees of agreement on the order of transactions, without the need for trusted third-party entities. More general computational abilities have also been well sought out for blockchains. Smart contracts [10] fulfill that need by providing users of the blockchain a way to run programs on the blockchain that ensures reliable and deterministic execution while providing transparency and immutability of both the code of the program and its state(s). Note that since the execution is required to be deterministic, i.e., every node must produce the same result, smart contracts are restricted to accessing on-chain data (that has been agreed upon), as off-chain data may introduce non-determinism to the execution.
Blockchain oracles [1, 5, 9] are components of blockchain systems that provide multiple services that support and extend the functionality of smart contracts (and other on-chain entities). The most important and fundamental service a blockchain oracle provides is bridging between the on-chain network and off-chain resources [6, 7], providing smart contracts access to external data without introducing non-determinism into their execution. We focus on this service and artificially consider it to be the sole responsibility of a blockchain oracle. In the remainder of this section, we explain in detail a possible application of the DR model and the Download problem for improved query efficiency within the context of blockchain oracles.
Blockchain oracles general structure.
Blockchain oracles consist of an on-chain component and an off-chain component. The off-chain component encompasses the different data sources that store the required external information (e.g., stock prices, weather predictions) and the network of nodes in charge of retrieving that information and transmitting it to the on-chain component. The on-chain component can be thought to be (but is not necessarily) a smart contract that is responsible for verifying the validity of the report, making the information public on the blockchain it is hosted on, and using it for its execution.
Formally, the off-chain component consists of two parts: an asynchronous333The network is sometimes assumed to be partially synchronous in blockchain oracles. oracle network, with peers (nodes) , , capable of exchanging direct messages among themselves, and data sources , , each storing an array of variables in which the on-chain component is interested. Each peer can read the -th cell from the -th data source by invoking . A fraction of up to of the peers may be Byzantine, and a fraction of up to of the data sources may be Byzantine. Denote the on-chain component by .
The goal of blockchain oracles, as mentioned above, is to pull information from external sources and push a final value on-chain. There are a few difficulties that may arise when trying to develop such a system. First, it might be the case that different data sources report slightly different values (e.g., prices of a specific stock), even if all of them act honestly. Moreover, corrupted data sources might provide false and even inconsistent values (providing some nodes with value and another with ). The system needs to pick a final value in a way that (1) represents the range of honest values pulled from honest data sources and (2) malicious players (both data sources and peers) cannot force the system to pick a final value that does not represent the range of honest values.
The Oracle Data Delivery (ODD) problem.
Denote by the set of honest data sources. Let and . The honest range of is the range . The ODD problem requires the on-chain to publish an array of values to the target blockchain such that , for every .
A blockchain oracle protocol can generally be split into three distinct steps: (1) collecting data, (2) reaching an agreement on the collected data, and (3) deriving and publishing a final value. Note that this abstraction is the minimum required abstraction to capture the operation of blockchain oracle protocols such as OCR [6] and DORA [7]444These protocols have many additional technical aspects, different structures, and different ways of handling steps (2) and (3). As our focus is on improving step (1), we may w.l.o.g. assume the abstract structure..
We now show how our Download protocols can be used to significantly reduce the cost of the Oracle Data Collection (ODC), i.e, step (1) of blockchain oracles.
Improving ODC by blockchain oracles via Download.
Current protocols perform the data collection
step by the following ODC process:
For every node:
-
Pick data sources into a set .
-
Perform , for every and .
-
Calculate the median and proceed to step (2).
Theorem 1.
Instead, we propose utilizing the guarantees of Download protocols, namely, that for an honest data source , the output of each peer is exactly , to construct the following modification of the ODC steps.
-
For every node, pick data sources into a set .
-
For every data source , run a Download protocol (denote the result for cell from data source by ).
-
Calculate the median and proceed to step (2).
It is easy to verify that this modified construction yields the following.
Theorem 2.
The Download-based ODC process guarantees for every and takes total queries and w.h.p.
Note that the Download protocol presented in this paper assumes a binary input array, but this can be extended to numbers via a relatively simple extension. However, it is important to note that our solution relies on the following restrictive assumption. For two honest peers , if both and issue the query , then they get the same result, for every and honest data source (i.e., the data does not change if queried at different times). Getting rid of this assumption and solving the problem efficiently for dynamic data is left as an open problem for future study.
References
- [1] John Adler, Ryan Berryhill, Andreas Veneris, Zissis Poulos, Neil Veira, and Anastasia Kastania. Astraea: A decentralized blockchain oracle. In 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pages 1145–1152, 2018. doi:10.1109/Cybermatics_2018.2018.00207.
- [2] John Augustine, Jeffin Biju, Shachar Meir, David Peleg, Srikkanth Ramachandran, and Aishwarya Thiruvengadam. Byzantine Resilient Distributed Computing on External Data. In Dan Alistarh, editor, 38th International Symposium on Distributed Computing (DISC 2024), volume 319 of Leibniz International Proceedings in Informatics (LIPIcs), pages 3:1–3:23, Dagstuhl, Germany, 2024. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2024.3.
- [3] John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, and David Peleg. Distributed download from an external data source in faulty majority settings. CoRR, abs/2412.19649, 2024. doi:10.48550/arXiv.2412.19649.
- [4] John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, and David Peleg. Distributed Download from an External Data Source in Byzantine Majority Settings. In DISC, 2025. These proceedings.
- [5] Lorenz Breidenbach, Christian Cachin, Benedict Chan, Alex Coventry, Steve Ellis, Ari Juels, Farinaz Koushanfar, Andrew Miller, Brendan Magauran, Daniel Moroz, Sergey Nazarov, Alexandru Topliceanu, Florian Tram‘er, and Fan Zhang. Chainlink 2.0: Next steps in the evolution of decentralized oracle networks. Technical report, Chainlink Labs, 2021.
- [6] Lorenz Breidenbach, Christian Cachin, Alex Coventry, Ari Juels, and Andrew Miller. Chainlink off-chain reporting protocol. Technical report, Chainlink Labs, 2021.
- [7] Prasanth Chakka, Saurabh Joshi, Aniket Kate, Joshua Tobkin, and David Yang. DORA: distributed oracle agreement with simple majority. CoRR, abs/2305.03903, 2023. doi:10.48550/arXiv.2305.03903.
- [8] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. whitepaper, May 2009.
- [9] Supra Research. Supra’s blockchain infrastructure stack. Whitepaper, Supra Labs, November 2024. URL: https://supra.com/documents/SupraTech-Whitepaper.pdf.
- [10] Nick Szabo. Formalizing and securing relationships on public networks. First Monday, 2, 1997. doi:10.5210/FM.V2I9.548.
