Round-Preserving Parallel Composition of Probabilistic-Termination Cryptographic Protocols

An important benchmark for multi-party computation (MPC) protocols is their round complexity. For several important MPC tasks, such as broadcast, (tight) lower bounds on the round complexity are known. However, some of these lower bounds can be circumvented when the termination round of every party is not a priori known, and simultaneous termination is not guaranteed. Protocols with this property are called probabilistic-termination (PT) protocols. Running PT protocols in parallel affects the round complexity of the resulting protocol in somewhat unexpected ways. For instance, an execution of m protocols with constant expected round complexity might take $O(\log m)$ rounds to complete. In a seminal work, Ben-Or and El-Yaniv (Distributed Computing '03) developed a technique for parallel execution of arbitrarily many broadcast protocols, while preserving expected round complexity. More recently, Cohen et al. (CRYPTO '16) devised a framework for universal composition of PT protocols, and provided the first composable parallel-broadcast protocol with a simulation-based proof. These constructions crucially rely on the fact that broadcast is “privacy-free,” and do not generalize to arbitrary protocols in a straightforward way. This raises the question of whether it is possible to execute arbitrary PT protocols in parallel, without increasing the round complexity. In this paper we tackle this question and provide both feasibility and infeasibility results. We construct a round-preserving protocol compiler, tolerating any dishonest minority of actively corrupted parties, that compiles arbitrary protocols into a protocol realizing their parallel composition, while having black-box access to the underlying protocols.
Furthermore, we prove that the same cannot be achieved, using known techniques, given only black-box access to the functionalities realized by the protocols, unless merely security against semi-honest corruptions is required, for which case we provide a protocol. To prove our results, we utilize the language and results by Cohen et al., which we extend to capture parallel composition and reactive functionalities, and to handle the case of an honest majority.


Introduction
Secure multi-party computation (MPC) [33,61] allows a set of parties to jointly perform a computation on their inputs, in such a way that no coalition of cheating parties can learn any information beyond what is revealed by their outputs (privacy) or affect the outputs of the computation in any way other than by choosing their own inputs (correctness).
Since the first seminal works on MPC [6,13,33,56,61], it has been studied in a variety of different settings and for numerous security notions: There exist protocols secure against passively corrupted (aka semi-honest) parties and against actively corrupted (aka malicious) parties; the underlying network can be synchronous or asynchronous; and the required security guarantees can be information-theoretic or computational, to name but a few of the axes along which the MPC task can be evaluated.
The prevalent model for the design of MPC protocols is the synchronous model, where the protocol proceeds in rounds. In this setting, the round complexity, i.e., the number of rounds it takes for a protocol to deliver outputs, is arguably the most important efficiency metric. Tight lower bounds are known on the round complexity of several MPC tasks. For example, for the well-known problems of Byzantine agreement (BA) and broadcast [48,53], it is known that any deterministic protocol against an active attacker corrupting a linear fraction of the parties has linear round complexity [25,28]. This result has quite far-reaching consequences as, starting with the seminal MPC works mentioned above, a common assumption in the design of secure protocols has been that the parties have access to a broadcast channel, which they potentially invoke in every round. In reality, such a broadcast channel might not be available and would have to be implemented by a broadcast protocol designed for a point-to-point network. It follows that even though the round complexity of many MPC protocols is linear in the multiplicative depth of the circuit being computed, their actual running time depends on the number of parties, when executed over point-to-point channels.
The above lower bound on the number of rounds for deterministic BA holds even if fewer than t parties are cheating, as long as all honest parties are required to complete the protocol together, at the same round [26]. Indeed, randomized BA protocols that circumvent this lower bound and run in an expected constant number of rounds (e.g., [4,27,29,44,50,55]) do not provide simultaneous termination, i.e., once a party completes the protocol's execution, it cannot know whether all honest parties have also terminated or if some honest parties are still running the protocol; in particular, the termination round of each party is not a priori known. A protocol with this property is said to have probabilistic termination (PT).
As pointed out by Ben-Or and El-Yaniv [5], when several such PT protocols are executed in parallel, the expected round complexity of the combined execution might no longer be constant (specifically, it might not be equal to the maximum of the expected running times of the individual protocols). Indeed, when m protocols, whose termination round is geometrically distributed (and so have constant expected round complexity), are run in parallel, the expected number of rounds that elapse before all of them terminate is $\Theta(\log m)$ [16]. While an elegant mechanism was proposed in [5] for implementing parallel calls to broadcast such that the total expected number of rounds remains constant, it did not provide any guarantee of remaining secure under composition, raising questions about its usability in a higher-level protocol (such as the MPC setting described above). This shortcoming was recently addressed by Cohen et al. [16], who provided a framework for universal composition of PT protocols (building upon the framework of universal composability [8]). An application of their result was the first composable protocol for parallel broadcast (with a simulation-based proof) that can be used for securely replacing broadcast channels in arbitrary protocols, and whose round complexity is constant in expectation.
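The logarithmic growth mentioned above can be checked numerically. The following is a minimal sketch, assuming that every protocol terminates in each round independently with the same probability p (a hypothetical parameter chosen for illustration); the function name is likewise an assumption. It evaluates the expected maximum of m geometric termination rounds through the tail-sum identity E[max] = sum over r >= 0 of Pr[max > r].

```python
def expected_max_geometric(m: int, p: float, tol: float = 1e-12) -> float:
    """Expected maximum of m i.i.d. geometric variables on {1, 2, ...}.

    Assumption: each protocol terminates in every round independently with
    probability p, so Pr[X > r] = (1 - p) ** r for a single protocol.
    """
    q = 1.0 - p
    total, r = 0.0, 0
    while True:
        # Probability that at least one of the m protocols is still
        # running after r rounds.
        term = 1.0 - (1.0 - q ** r) ** m
        if term < tol:
            return total
        total += term
        r += 1


if __name__ == "__main__":
    # The printed expectation grows by a roughly constant amount each time
    # m is multiplied by 16, i.e., logarithmically in m.
    for m in (1, 16, 256, 4096):
        print(m, round(expected_max_geometric(m, 0.5), 2))
```

For a single protocol the function returns the usual expectation 1/p, while for growing m the value increases with the logarithm of m, which is the effect that a round-preserving parallel composition has to avoid.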
Indeed, an immediate application of the composable parallel-broadcast protocol from [16] is plugging it into broadcast-model MPC protocols in order to obtain point-to-point protocols with a round complexity that is independent of the number of parties. In the information-theoretic setting, this approach yields protocols whose round complexity depends on the depth of the circuit computing the function [6,13,21,56], whereas in the computational setting, assuming standard cryptographic assumptions, this approach yields expected-constant-round protocols [2,3,23,32,34,41,47,51]. However, the resulting point-to-point protocols have probabilistic-termination on their own. The techniques used for composing PT broadcast protocols in parallel crucially rely on the fact that broadcast is a privacy-free functionality, and a naïve generalization of this approach to arbitrary PT protocols fails to be secure. This raises the question of whether it is possible to execute arbitrary PT protocols in parallel, without increasing the round complexity.
We remark that circumventing lower bounds on round complexity is just one of the areas where such PT protocols have been successfully used. Indeed, randomizing the termination round has been proven to be a very useful technique in circumventing impossibilities and improving efficiency for many cryptographic protocols. Notable examples include non-committing encryption [24], cryptographic protocols designed for rational parties [1,30,31,35,37,52], concurrent zero-knowledge protocols [10,14], and parallel repetition of interactive arguments [36,38]. The rich literature on such protocols motivates a thorough investigation of their security and composability. As mentioned above, in [16] the initial foundations were laid out for such an investigation, but what was proven for arbitrary PT protocols was a round-preserving sequential composition theorem, leaving parallel composition as an open question.

Our Contributions
Protocol-black-box composition In this work, we investigate the issue of parallel composition for arbitrary protocols with probabilistic termination. In particular, we develop a compiler such that given functionalities $F_1, \ldots, F_M$ and protocols $\pi_1, \ldots, \pi_M$, where for every $i \in [M]$, protocol $\pi_i$ realizes $F_i$ (possibly using correlated randomness as setup¹), the compiled protocol realizes the parallel composition of the functionalities, denoted as $(F_1 \| \cdots \| F_M)$.
Our compiler uses the underlying protocols in a black-box manner,² guarantees output delivery (i.e., secure without abort), and is resilient against a computationally unbounded active adversary, adaptively corrupting up to t < n/2 parties (which is optimal [56]). Moreover, our compiler is round-preserving, meaning that if the maximal (expected) round complexity of each protocol is μ, then the expected round complexity of the compiled protocol is O(μ). For example, if each protocol $\pi_i$ has constant expected round complexity, then so does the compiled protocol. Recall that this task is quite complicated even for the simple case of BA ([5,16]). For arbitrary functionalities it is even more involved, since as we show, the approach from [5] cannot be applied in a functionally black-box way in this case.³ Thus, effectively, our result is the first round-preserving parallel composition result for arbitrary multi-party protocols/tasks with probabilistic termination.
We now describe the ideas underlying our compiler. In [5] (see also [16]), a round-preserving parallel-broadcast protocol was constructed by iteratively running, for a constant number of rounds, multiple instances of BA protocols (each instance is executed multiple times in parallel, in a batch), hoping that at least one execution of every BA instance will complete. By choosing the multiplicity suitably, this occurs with constant probability, and therefore the process needs to be repeated only a constant expected number of times.
At first sight it might seem that this idea can be applied to arbitrary tasks, but this is not the case. Intuitively, the reason is that if the tasks that we want to compose in parallel have privacy requirements, then making the parties run them in (parallel) "batches" with the same input might compromise privacy, since the adversary will be able to use different inputs and learn multiple outputs of the function(s). This issue is not relevant for broadcast, because it is a "privacy-free" functionality; the adversary may learn the result of multiple computations using the same inputs for honest parties, without compromising security.
To cope with the above issue, our parallel-composition compiler generalizes the approach of [5] in a privacy-preserving manner. At a high level, it wraps the batching technique by an MPC protocol which restricts the parties to use the same input in all protocols for the same function. In particular, the compiler is defined in the Setup-Commit-then-Prove hybrid model [11,42], where the parties receive private correlated randomness that allows each party to commit to its input values and later execute multiple instances of every protocol, each time proving that the same input value is used in all executions.

¹ A trusted setup phase is needed for implementing broadcast in the honest-majority setting. Note that as shown in [15,17], some interesting functions can be computed without such a setup phase.

² Following [40], by black-box access to a protocol we mean a black-box usage of a semi-honest MPC protocol computing its next-message function.

³ Loosely speaking, a functionally black-box protocol, as defined in [57], is a protocol that can compute a function $f$ without knowing the code of $f$, i.e., given only oracle access to the function $f$. Note that in this model, each ideal functionality $F_i$ has oracle access to the function $f_i$ it computes. We note that most MPC protocols, e.g., [6,33,61], are not functionally black-box, as they need an explicit representation of the function in the form of a circuit.
The constructions in [11,42] for realizing the Setup-Commit-then-Prove functionality are designed for the dishonest-majority setting and therefore allow for a premature abort. Since we assume an honest majority, we require security with guaranteed output delivery. A possible way around would be, as is common in the MPC literature, to restart the protocol upon discovering some cheating or add for each abort a recovery round; this, however, would induce a linear overhead (in the number of parties) on the round complexity of the protocol.
Instead, in order to recover from misbehavior by corrupted parties, we modify the Setup-Commit-then-Prove functionality and secret-share every committed random string between all the parties, using an error-correcting secret-sharing scheme (aka robust secret sharing [12,22,56]).⁴ In case a party is identified as cheating, every party broadcasts the share of the committed randomness corresponding to that party, reconstructs the correlated randomness for that party, and locally computes the messages corresponding to this party in every instance of every protocol. We also prove that the modified Setup-Commit-then-Prove functionality can be realized in a constant number of rounds, thus yielding no (asymptotic) overhead on the round complexity of the compiler. We refer the reader to Sect. 6.1 for more details.

Functionally black-box composition Our feasibility result described above shows how to compute the parallel composition $(F_1 \| \cdots \| F_M)$ of the functionalities $F_1, \ldots, F_M$, given black-box access to the protocols $\pi_1, \ldots, \pi_M$. That is, the compiler crucially relies on knowing the messages (i.e., the transcript) of the protocols' executions, but it does not need to know how these messages are generated. Indeed, given the messages of the underlying protocols, it is possible to exploit MPC techniques to enforce consistency between multiple executions of the protocols.
We proceed to ask whether a similar compiler can be constructed without having access to the transcripts of the protocols' executions. Specifically, we investigate the question of whether there exists a protocol that securely realizes $(F_1 \| \cdots \| F_M)$ given only black-box access to the functionalities $F_1, \ldots, F_M$, but not to protocols realizing them. Each $F_i$ is modeled as an ideal functionality that receives the inputs from the parties and returns the output in a probabilistic-termination fashion. We note that this question only makes sense if the functionalities $F_1, \ldots, F_M$ are not fixed in advance, but belong to some class of functionalities (see [57]), since otherwise a protocol may always ignore the functionalities $F_1, \ldots, F_M$ and implement $(F_1 \| \cdots \| F_M)$ from scratch. We thus consider M classes of functions $C_1, \ldots, C_M$, where the i'th ideal functionality $F_i$ has oracle access to one of the functions in $C_i$, denoted as $F_i^{C_i}$.

When considering semi-honest corruptions, we show that there indeed exists a round-preserving protocol for parallel composition of arbitrary deterministic functionalities $F_1^{C_1}, \ldots, F_M^{C_M}$ in a functionally black-box manner. The composition follows the "batching-technique" from [5,16] of invoking each functionality $F_i^{C_i}$ multiple times in parallel for a constant number of rounds, such that with constant probability at least one computation in a batch will successfully produce an output. The reason that this approach works is that if the adversary follows the protocol, it cannot provide different inputs to different invocations of $F_i^{C_i}$. This works for deterministic computations since it is crucial to ensure that the adversary obtains one output value per batch even if more than one invocation produces outputs. Note that using standard techniques a randomized functionality can be reduced to a deterministic one in which each party provides an additional random string $r_i$, and the functionality uses $r_1 \oplus \cdots \oplus r_n$ as its random coins. We refer the reader to Sect. 5 for more details.
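The reduction from a randomized to a deterministic functionality can be illustrated as follows. This is only a sketch under stated assumptions: the helper names (`xor_bytes`, `combined_coins`, `run_derandomized`) and the byte-string encoding of the random strings are hypothetical choices made for this example and are not taken from the paper.

```python
import secrets
from functools import reduce
from typing import Any, Callable, Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("random strings must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def combined_coins(random_strings: Sequence[bytes]) -> bytes:
    """Random coins r_1 XOR ... XOR r_n derived from the parties' strings."""
    return reduce(xor_bytes, random_strings)


def run_derandomized(
    f: Callable[[Sequence[Any], bytes], Any],
    inputs: Sequence[Any],
    random_strings: Sequence[bytes],
) -> Any:
    """Deterministic evaluation: the coins of f are fixed by the inputs.

    `f` is a hypothetical randomized function that takes the parties'
    inputs together with an explicit string of random coins.
    """
    return f(inputs, combined_coins(random_strings))


if __name__ == "__main__":
    # Each party samples its own 16-byte string r_i.
    strings = [secrets.token_bytes(16) for _ in range(3)]
    print(combined_coins(strings).hex())
```

Because the output is now a deterministic function of the extended inputs, repeated invocations on the same inputs return the same value, which is what the batching approach requires. The combined coins are uniform as long as at least one contributed string is uniform and independent of the others.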
When considering malicious corruptions, we prove that there exist function classes for which no "natural" protocol (that is based on any of the known techniques from the literature) can compute their parallel composition in a round-preserving manner, while accessing the functions in a black-box way, tolerating even a single adversarial party. More precisely, we consider the function class C of distributed point functions, consisting of functions $f_\alpha(x_1, x_2, \lambda, \ldots, \lambda)$ (parametrized by $\alpha \in \{0, 1\}^\kappa$, where κ is the security parameter and λ is the empty string) that return α if $x_1 \oplus x_2 = \alpha$ and 0 otherwise. We show that for this class it holds that:
1. Calling each of the ideal functionalities $F_i^{C}$ until termination (i.e., until all parties receive the output) will not be round-preserving.
2. It is impossible to compute the parallel composition without calling every ideal functionality (until some parties receive the output).
3. Using the same input value in more than one call to any of the ideal functionalities will break privacy.
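To make the function class concrete, the following is a minimal sketch of a single distributed point function. The name `make_point_function`, the byte-string encoding of κ-bit values, and the choice to represent the output 0 as an all-zero string of the same length are assumptions made for this illustration.

```python
from typing import Callable


def make_point_function(alpha: bytes) -> Callable[..., bytes]:
    """Return f_alpha for a fixed parameter alpha (encoded as bytes)."""
    zero = bytes(len(alpha))  # assumed encoding of the output 0

    def f_alpha(x1: bytes, x2: bytes, *empty_inputs: bytes) -> bytes:
        # Only the first two parties contribute inputs; the remaining
        # parties hold the empty string and are ignored here.
        if len(x1) != len(alpha) or len(x2) != len(alpha):
            raise ValueError("inputs must have the same length as alpha")
        xor = bytes(a ^ b for a, b in zip(x1, x2))
        return alpha if xor == alpha else zero

    return f_alpha


if __name__ == "__main__":
    alpha = b"\x5a\xa5"
    f = make_point_function(alpha)
    x1 = b"\x0f\x0f"
    x2 = bytes(a ^ b for a, b in zip(x1, alpha))  # chosen so that x1 XOR x2 = alpha
    print(f(x1, x2, b"", b"").hex())
    print(f(x1, x1, b"", b"").hex())
```

A protocol with only oracle access to such a function learns nothing about α unless the two inputs happen to XOR to it, which is why the functionality cannot simply be bypassed.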
We refer the reader to Sect. 6.2 for the details of the lower bound. We phrase our results using the framework for composition of protocols with probabilistic termination [16], and extend it (a side result of independent interest) to include parallel composition, reactive functionalities (in order to capture the Setup-Commit-then-Prove functionality), and the higher corruption threshold of t < n/2.
Related work Most relevant to our work are results on compositional aspects of broadcast protocols with probabilistic termination. Lindell et al. [49] studied sequential composition of probabilistic-termination Byzantine agreement protocols, and Ben-Or and El-Yaniv [5] a round-preserving parallel composition of such protocols. A simplification for parallel composition of probabilistic-termination BA protocols that are based on leader election was presented by Fitzi and Garay [29], and was used by Katz and Koo [44] for analyzing the exact round complexity of probabilistic-termination broadcast in the context of secure computation. Cohen et al. [16] studied probabilistic-termination protocols in the UC framework and constructed a composable parallel-broadcast protocol with a simulation-based proof.
Organization of the paper The rest of the paper is organized as follows. In Sect. 2 we describe the network model, the basics of the probabilistic-termination framework by Cohen et al. [16], and other tools that are used throughout the paper. In Sect. 3 we extend the framework to the honest-majority setting, and in Sect. 4 define parallel composition of PT protocols. Section 5 presents the protocol that achieves round-preserving parallel composition for arbitrary functionalities in a functionally black-box manner against semi-honest adversaries. Section 6 is dedicated to active corruptions; first, the round-preserving protocol-black-box construction is presented, followed by the negative result on round-preserving functionally black-box composition in the case of active corruptions. For ease of exposition, finer details of the model and PT framework are presented in "Appendix."

Model and Preliminaries
In the following, we introduce some necessary notation and terminology. We denote by κ the security parameter. For $n \in \mathbb{N}$, let $[n] = \{1, \ldots, n\}$. Let poly denote the set of all positive polynomials, and let PPT denote a probabilistic algorithm that runs in strictly polynomial time. A function $\nu : \mathbb{N} \to [0, 1]$ is negligible if $\nu(\kappa) < 1/p(\kappa)$ for every $p \in$ poly and large enough κ. Given a random variable X, we write $x \leftarrow X$ to indicate that x is selected according to X.
The statistical distance between two random variables X and Y over a finite set U, denoted SD(X, Y), is defined as $\frac{1}{2} \sum_{u \in U} \left| \Pr[X = u] - \Pr[Y = u] \right|$.
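As a small worked example, the sketch below computes the statistical distance under its standard definition, namely half the sum over the set U of the absolute differences between the two probability mass functions. Representing a distribution as a dictionary from outcomes to probabilities is an assumption made for this illustration.

```python
from typing import Dict, Hashable


def statistical_distance(
    p: Dict[Hashable, float], q: Dict[Hashable, float]
) -> float:
    """Half the L1 distance between two finite probability mass functions."""
    support = set(p) | set(q)
    # Outcomes missing from a dictionary are treated as having probability 0.
    return 0.5 * sum(abs(p.get(u, 0.0) - q.get(u, 0.0)) for u in support)


if __name__ == "__main__":
    fair = {0: 0.5, 1: 0.5}
    biased = {0: 0.75, 1: 0.25}
    print(statistical_distance(fair, biased))
```

Identical distributions are at distance 0 and distributions with disjoint supports are at distance 1, the two extremes of the measure.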

Synchronous Protocols in UC
We consider synchronous protocols in the model of Katz et al. [46], which is designed on top of the universal composability framework of Canetti [8]. More specifically, we consider n parties $P_1, \ldots, P_n$ and a computationally unbounded, adaptive t-adversary that can dynamically corrupt up to t parties during the protocol execution. Synchronous protocols in [46] are protocols that run in a hybrid model where parties have access to a simple "clock" functionality $F_{\text{clock}}$. This functionality keeps an indicator bit, which is switched once all honest parties request the functionality to do so, i.e., once all honest parties have completed their operations for the current round. In addition, all communication is done over bounded-delay secure channels, where each party requests the channel to fetch messages that are sent to it, such that the adversary is allowed to delay the message delivery by a bounded and a priori known number of fetch requests. Stated differently, once the sender has sent some message, it is guaranteed that the message will be delivered within a known number of activations of the receiver. For simplicity, we assume that every message is delivered within a single fetch request. A more detailed overview of [46] can be found in "Appendix B."

The Probabilistic-Termination Framework
Cohen et al. [16] extended the UC framework to capture protocols with probabilistic termination, i.e., protocols without a fixed output round and without simultaneous termination. This section outlines their techniques; additional details can be found in "Appendix C."

Canonical synchronous functionalities The main idea behind modeling probabilistic termination is to separate the functionality to be computed from the round complexity that is required for the computation. The atomic building block in [16] is a functionality template called a canonical synchronous functionality (CSF), which is a simple two-round functionality with explicit (one-round) input and (one-round) output phases. The functionality $F_{\text{csf}}$ has two parameters: (1) a (possibly) randomized function f that receives n + 1 inputs (n inputs from the parties and one additional input from the adversary) and (2) a leakage function l that determines what information about the input values is leaked to the adversary.
The functionality $F_{\text{csf}}$ proceeds in two rounds: In the first (input) round, all the parties hand $F_{\text{csf}}$ their input values, and in the second (output) round, each party receives its output. Whenever some input is submitted to $F_{\text{csf}}$, the adversary is handed some leakage function of this input; the adversary can use this leakage for deciding which parties to corrupt and which input values to use for corrupted parties. Additionally, the adversary is allowed to input an extra message, which, depending on the function f, might affect the output(s). The detailed description of $F_{\text{csf}}$ is given in Fig. 7 in "Appendix C.1." As a side contribution, in Definition C.1, we extend the definition of CSF to the reactive setting.
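The two-round structure can be pictured with the following toy model. It is a simplified sketch and not the functionality of Fig. 7: it ignores corruptions, the clock, and the UC machinery, and the class name `ToyCSF`, its method names, and the use of `None` for a missing input are all assumptions introduced for this example.

```python
from typing import Any, Callable, List, Optional, Sequence


class ToyCSF:
    """Toy two-round functionality parametrized by a function f and a leakage l."""

    def __init__(
        self,
        n: int,
        f: Callable[[Sequence[Any], Any], Sequence[Any]],
        leak: Callable[[Sequence[Any]], Any],
    ) -> None:
        self.n = n
        self.f = f
        self.leak = leak
        self.inputs: List[Optional[Any]] = [None] * n
        self.round = 1

    def submit_input(self, party: int, value: Any) -> Any:
        """Input round: record the value and return the adversary's leakage."""
        if self.round != 1:
            raise RuntimeError("inputs are only accepted in the input round")
        self.inputs[party] = value
        return self.leak(self.inputs)

    def deliver_outputs(self, adversary_input: Any = None) -> List[Any]:
        """Output round: evaluate f on the n inputs plus the adversary's input."""
        self.round = 2
        return list(self.f(self.inputs, adversary_input))


if __name__ == "__main__":
    # Example instantiation: party 0 sends a message to everyone, and only
    # the length of each submitted input is leaked.
    send_all = lambda xs, adv: [xs[0]] * len(xs)
    lengths = lambda xs: [None if x is None else len(x) for x in xs]
    csf = ToyCSF(3, send_all, lengths)
    print(csf.submit_input(0, "hello"))
    print(csf.deliver_outputs())
```

Keeping f and the leakage function as parameters mirrors the idea that a CSF is a template: different tasks are obtained by plugging in different pairs of functions.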
Wrappers and traces Computation with probabilistic termination is captured by defining output-round randomizing wrappers. Such wrappers address the issue that while an ideal functionality abstractly describes a protocol's task, it does not describe its round complexity. Each wrapper is parametrized by a distribution (more precisely, an efficient probabilistic sampling algorithm) D that may depend on a specific protocol implementing the functionality. The wrapper samples a round $\rho_{\text{term}} \leftarrow D$, by which all parties are guaranteed to receive their outputs. Two wrappers are considered: The first, denoted $W_{\text{strict}}$, ensures in a strict manner that all (honest) parties terminate together in round $\rho_{\text{term}}$; the second, denoted $W_{\text{flex}}$, is more flexible and allows the adversary to deliver outputs to individual parties at any time before round $\rho_{\text{term}}$. The detailed descriptions of the two wrappers can be found in "Appendix C.4."

As pointed out in [16], it is not sufficient to inform the simulator S about the round $\rho_{\text{term}}$. In many cases, the wrapper should explain to S how this round was sampled; concretely, the wrapper provides S with the random coins that are used to sample $\rho_{\text{term}}$. In particular, S learns the entire trace of calls to ideal functionalities that are made by the protocol in order to complete by round $\rho_{\text{term}}$. A trace basically records which hybrids were called by a protocol's execution, and in a recursive way, for each hybrid, which hybrids would have been called by a protocol realizing that hybrid. The recursion ends when the base case is reached, i.e., when the protocol is defined using the atomic functionalities that are "assumed" by the model.⁵ We refer the reader to "Appendix C.3" for further intuition and an illustrating example, and formally define a trace as follows:

Definition 2.1. (Traces)
A trace is a rooted tree of depth at least 1, in which all nodes are labeled by functionalities and where every node's children are ordered. The root and all internal nodes are labeled by wrapped CSFs (by either of the two wrappers), and the leaves are labeled by unwrapped CSFs. The trace complexity of a trace T, denoted $c_{\text{tr}}(T)$, is the number of leaves in T. Moreover, denote by $\text{flex}_{\text{tr}}(T)$ the number of nodes labeled by flexibly wrapped CSFs in T.
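The two quantities attached to a trace can be computed by a simple recursion over the tree. The sketch below covers only the basic (non-augmented) definition; the class `TraceNode`, the string encoding of the wrapper type, and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TraceNode:
    """A node of a trace, labeled by a (possibly wrapped) CSF."""

    label: str
    wrapper: str = "none"  # "strict", "flex", or "none" for an unwrapped leaf
    children: List["TraceNode"] = field(default_factory=list)  # ordered


def trace_complexity(t: TraceNode) -> int:
    """Number of leaves of the trace."""
    if not t.children:
        return 1
    return sum(trace_complexity(c) for c in t.children)


def flex_count(t: TraceNode) -> int:
    """Number of nodes labeled by flexibly wrapped CSFs."""
    own = 1 if t.wrapper == "flex" else 0
    return own + sum(flex_count(c) for c in t.children)


if __name__ == "__main__":
    # A flexibly wrapped root calling one atomic hybrid and one strictly
    # wrapped hybrid that itself calls two atomic hybrids.
    trace = TraceNode("F", "flex", [
        TraceNode("F1"),
        TraceNode("F2", "strict", [TraceNode("F3"), TraceNode("F4")]),
    ])
    print(trace_complexity(trace), flex_count(trace))
```

Since every leaf corresponds to one call to an atomic two-round functionality, the number of leaves is a natural proxy for the number of rounds spent along the trace.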
In this work, we consider an augmented definition of traces, which allows parallel calls to ideal functionalities at the same round. Specifically, a trace is augmented with another (potentially empty) layer of nodes, such that each leaf, in the original definition of a trace, may have a list of (unordered) children. The trace complexity is defined as in Definition 2.1, as the number of original leaves (before augmenting with unordered node-lists). We note that the composition theorems (below) trivially extend to use this augmented definition of a trace. To simplify notation, we denote by $[F_1, \ldots, F_m]$ the node with unordered list of children $F_1, \ldots, F_m$ (modeling a parallel call to these functionalities). We will also denote by a trace consisting of l (ordered) sequences of leaves, each augmented with an (unordered) list of k nodes of each $F_i$. This corresponds to l sequential calls to k parallel instances of each $F_i$.
Sequential composition of probabilistic-termination protocols When a set of parties execute a probabilistic-termination protocol, or equivalently, invoke a flexibly wrapped CSF, they might get out-of-sync and start the next protocol in different rounds. The approach in [16] for dealing with sequential composition is to start by designing simpler protocols, that are in a so-called synchronous normal form, where the parties remain in-sync throughout the execution, and next, compile these protocols into slack-tolerant protocols.

Definition 2.2. (Synchronous normal form)
Let $F_1, \ldots, F_m$ be canonical synchronous functionalities. A synchronous protocol π in the $(F_1, \ldots, F_m)$-hybrid model is in synchronous normal form (SNF) if in every round exactly one ideal functionality $F_i$ is invoked by all honest parties, and in addition, no honest party hands inputs to other CSFs before this instance halts.

SNF protocols are designed as an intermediate step only, since the hybrid functionalities $F_1, \ldots, F_m$ are two-round CSFs and, in general, cannot be realized by real-world protocols. In order to obtain protocols that can be realized in the real world, [16] introduced slack-tolerant variants of both the strict and the flexible wrappers, denoted $W_{\text{sl-strict}}$ and $W_{\text{sl-flex}}$. These wrappers are parametrized by a slack parameter c ≥ 0 and can be used even if parties provide inputs within c + 1 consecutive rounds (i.e., they tolerate input slack of c rounds); furthermore, the wrappers ensure that all honest parties obtain output within two consecutive rounds (i.e., they reduce the slack to c = 1). The detailed definitions of the slack-tolerant wrappers are given in "Appendix C.5."

In order to convert SNF protocols into protocols that realize functionalities with slack tolerance, [16] constructed a deterministic-termination compiler $\text{Comp}_{\text{dt}}$, a probabilistic-termination compiler $\text{Comp}_{\text{pt}}$, and a probabilistic-termination with slack-reduction compiler $\text{Comp}_{\text{ptr}}$. Loosely speaking, the composition theorems provide the following guarantees: 3. If an SNF protocol π realizes a wrapped CSF

The compilers maintain the security and the asymptotic (expected) round complexity of the original SNF protocols. At the same time, the compilers take care of any potential slack that is introduced by the protocol and ensure that the resulting protocol can be safely executed even if the parties do not start the protocol simultaneously. More precise descriptions of the compilers can be found in "Appendix C.6."
As a side contribution, we extend this framework to the honest-majority setting in "Appendix C." Finally, in [16], the authors also provided protocols for realizing wrapped variants of the atomic CSF functionality for secure point-to-point communication. This suggested the following design paradigm for realizing a wrapped functionality $W_{\text{sl-strict}}(F)$ (resp., $W_{\text{sl-flex}}(F)$): First, construct an SNF protocol for realizing $W_{\text{strict}}(F)$ (resp., $W_{\text{flex}}(F)$) using CSF hybrids $F_1, \ldots, F_m$. Next, for each of the non-atomic hybrids $F_i$, show how to realize $W_{\text{strict}}(F_i)$ (resp., $W_{\text{flex}}(F_i)$) using CSF hybrids $F_1, \ldots, F_m$. Proceed in this manner until all CSF hybrids are atomic functionalities. Finally, repeated applications of the composition theorems above yield a protocol for $W_{\text{sl-strict}}(F)$ (resp., $W_{\text{sl-flex}}(F)$) using only atomic functionalities as hybrids.

A Lemma on Termination Probabilities
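The following lemma, which will be used in our positive results, provides a constant lower bound on the probability that when running simultaneously (i.e., in parallel) N copies of M probabilistic-termination protocols $\pi_1, \ldots, \pi_M$, at least one copy of each $\pi_i$ will complete after R rounds, for suitable choices of N and R.

Lemma. Let M, N, R be positive integers. For $i \in [M]$ and $j \in [N]$, let $X_{ij}$ be independent random variables over the natural numbers, such that $X_{i1}, \ldots, X_{iN}$ are identically distributed with expectation $\mu_i$. Denote $Y_i = \min\{X_{i1}, \ldots, X_{iN}\}$ and $\mu = \max\{\mu_1, \ldots, \mu_M\}$. Then, for any constant $0 < \epsilon < 1$, if $R > \mu$ and $N > \frac{\log(M/\epsilon)}{\log(R/\mu)}$, it holds that
$$\Pr[\forall i : Y_i < R] \geq 1 - \epsilon.$$

Proof. For every $i \in [M]$,
$$\Pr[Y_i \geq R] = \prod_{j=1}^{N} \Pr[X_{ij} \geq R] \overset{(*)}{\leq} \left(\frac{\mu_i}{R}\right)^{N} \leq \left(\frac{\mu}{R}\right)^{N},$$
where $(*)$ follows from Markov's inequality. Therefore, by a union bound over $i \in [M]$,
$$\Pr[\forall i : Y_i < R] \geq 1 - M \cdot \left(\frac{\mu}{R}\right)^{N}.$$
Finally, since $R > \mu$ it holds that $1 > \mu/R$. Therefore, for constant $0 < \epsilon < 1$, by setting $N > \frac{\log(M/\epsilon)}{\log(R/\mu)}$ it holds that $M \cdot (\mu/R)^{N} < \epsilon$, and hence $\Pr[\forall i : Y_i < R] \geq 1 - \epsilon$.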
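The parameter choice in the lemma can be made concrete with a short calculation. The sketch below assumes the condition N > log(M/ε)/log(R/μ), where ε denotes the allowed failure probability, and the failure bound M·(μ/R)^N that follows from Markov's inequality and a union bound; the function names and the sample parameters are hypothetical.

```python
import math


def min_copies(M: int, R: float, mu: float, eps: float) -> int:
    """Smallest integer N with N > log(M / eps) / log(R / mu)."""
    if not (M >= 1 and R > mu > 0 and 0 < eps < 1):
        raise ValueError("need M >= 1, R > mu > 0 and 0 < eps < 1")
    return math.floor(math.log(M / eps) / math.log(R / mu)) + 1


def failure_bound(M: int, N: int, R: float, mu: float) -> float:
    """Upper bound M * (mu / R)**N on Pr[some protocol has no finished copy]."""
    return M * (mu / R) ** N


if __name__ == "__main__":
    # Hypothetical parameters: 100 protocols with expected round complexity
    # at most 2, each batch cut off after R = 4 rounds, failure at most 1/2.
    M, R, mu, eps = 100, 4, 2, 0.5
    N = min_copies(M, R, mu, eps)
    print(N, failure_bound(M, N, R, mu))
```

Note that N grows only logarithmically in M, and that each batch succeeds with constant probability, so the number of batches that must be run is constant in expectation.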

Probabilistic Termination with an Honest Majority
In this section, we extend the probabilistic-termination framework [16] (that was defined for the 2/3-majority setting) to the honest-majority regime.

Fast Sequential Composition
The composition theorems from [16] are defined for t < n/3 (as they focused on perfect security). When moving to the honest-majority setting, i.e., t < n/2, the compilers and composition theorems follow in a straightforward way. The main difference lies in the usage of the Bracha-termination technique (specifically, in Theorem 3.2), where the "termination messages" in the compiled protocol $\pi' = \text{Comp}^{c}_{\text{ptr}}(\pi, D_1, \ldots, D_m, I)$ must be authenticated. Therefore, there is an additional hybrid functionality that is required for generating correlated randomness to be used for authenticating messages. If information-theoretic security is required, correlated randomness for information-theoretic signatures [54] can be used, whereas if computational security suffices, a public-key infrastructure (PKI) can be used.
-Correlated Randomness. The correlated-randomness functionality, parametrized by a distribution D, is defined as follows. The function to compute is f corr (λ, . . . , λ, a) = (R 1 , . . . , R n ), where (R 1 , . . . , R n ) ← D, and the leakage function is l corr (λ, . . . , λ) = ⊥. We denote by F D corr the functionality F csf when parametrized with the above functions f corr and l corr . We denote by F corr-bc the functionality F D bc corr , where D bc is the distribution for correlated randomness needed for information-theoretic broadcast [54], and by F pki the functionality F D pki corr , where D pki is the distribution for correlated randomness needed for a public-key infrastructure using standard digital signatures.
We state without proof the composition theorems for the honest-majority setting. The proofs follow along similar lines to [16]. See "Appendix C.6" for additional details, including the definition of full-trace.

Then, protocol π′ = Comp c dt (π, D 1 , . . . , D m ) UC-realizes W D full ,c sl-strict (F), with information-theoretic (resp., computational) security, in the (W D 1 ,c sl-strict (F 1 ), . . . , W D m ,c sl-strict (F m ))-hybrid model, in the presence of an adaptive, malicious t-adversary, assuming that all honest parties receive their inputs within c + 1 consecutive rounds.
Furthermore, the expected round complexity of the compiled protocol π is where d i is the expected number of calls in π to hybrid F i , T i is a trace sampled from D i , and B c = 3c + 1 is the blow-up factor. Then, the compiled protocol π = Comp c ptr (π, D 1 , . . . , D m , I ) UC-realizes W D full ,c sl-flex (F), with information-theoretic (resp., computational) security, in the (F corr-bc , in the presence of an adaptive, malicious t-adversary, assuming that all honest parties receive their inputs within c + 1 consecutive rounds.
Furthermore, the expected round complexity of the compiled protocol π is where d i is the expected number of calls in π to hybrid F i , T i is a trace sampled from D i , and B c = 3c + 1 is the blow-up factor.

Fast Parallel Broadcast
Cohen et al. [16], based on Hirt and Zikas [39], defined the unfair parallel-broadcast functionality, in which the functionality informs the adversary which messages it received, and allows the adversary, based on this information, to corrupt senders and replace their input messages.
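-Unfair Parallel Broadcast. In the unfair parallel-broadcast functionality, each party P i with input x i distributes its input to all the parties. The adversary is allowed to learn the content of each input value from the leakage function, i.e., l upbc (x 1 , . . . , x n ) = (x 1 , . . . , x n ), and can therefore corrupt senders and replace their messages before distribution. The function to compute is f upbc (x 1 , . . . , x n , a) = ((x 1 , . . . , x n ), . . . , (x 1 , . . . , x n )). We denote by F upbc the functionality F csf when parametrized with the above functions f upbc and l upbc .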
The protocol of Katz and Koo [45] realizes (a wrapped version of) F upbc when the parties have correlated-randomness setup. The following result follows.
Theorem 3.4. Let c ≥ 0 and t < n/2. There exists an efficiently sampleable distribution D such that the functionality W D,c sl-flex (F upbc ) has an expected-constant-round complexity, and can be UC-realized in the (F smt , F corr-bc )-hybrid model, with information-theoretic security, in the presence of an adaptive, malicious t-adversary, assuming that all honest parties receive their inputs within c + 1 consecutive rounds.
The parallel-broadcast functionality is similar to the unfair version, except that the adversary cannot corrupt parties based on the messages they send.
-Parallel Broadcast. In the parallel-broadcast functionality, each party P i with input x i distributes its input to all the parties. Unlike the unfair version, the adversary only learns the length of the honest parties' messages before their distribution, i.e., the leakage function is l pbc (x 1 , . . . , x n ) = (|x 1 |, . . . , |x n |). It follows that the adversary cannot use the leaked information in a meaningful way when deciding which parties to corrupt. The function to compute is identical to the unfair version, i.e., f pbc (x 1 , . . . , x n , a) = f upbc (x 1 , . . . , x n , a). We denote by F pbc the functionality F csf when parametrized with the above functions f pbc and l pbc .

We next show how to realize the parallel-broadcast functionality F pbc in the F upbc -hybrid model, in the honest-majority setting. The construction follows [16], where the only difference is that for t < n/2, perfectly correct error-correcting secret sharing (see Definition A.1) cannot be achieved, and a negligible error probability is introduced. We describe this protocol, denoted π pbc , in Fig. 1.
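The difference between the two broadcast functionalities lies only in what is leaked to the adversary. The following minimal sketch makes this concrete: both variants deliver the full input vector to every party, but the unfair variant leaks the message contents while the fair variant leaks only their lengths. The names `f_bc`, `l_upbc`, and `l_pbc` are illustrative, and the adversarial input a is omitted for simplicity (an assumption of this sketch).

```python
from typing import List, Sequence, Tuple


def f_bc(inputs: Sequence[bytes]) -> List[Tuple[bytes, ...]]:
    """Function computed by both variants: every party gets all inputs.

    The adversarial input `a` of the formal definition is omitted here.
    """
    return [tuple(inputs) for _ in inputs]


def l_upbc(inputs: Sequence[bytes]) -> Tuple[bytes, ...]:
    """Unfair variant: the adversary learns the messages themselves."""
    return tuple(inputs)


def l_pbc(inputs: Sequence[bytes]) -> Tuple[int, ...]:
    """Fair variant: the adversary learns only the message lengths."""
    return tuple(len(x) for x in inputs)


if __name__ == "__main__":
    a, b = [b"yes", b"no"], [b"nay", b"ok"]
    # Same lengths, different contents: only the unfair leakage differs.
    print(l_pbc(a) == l_pbc(b), l_upbc(a) == l_upbc(b))
```

Since two input vectors with the same lengths produce the same fair leakage, a corruption strategy that depends only on the leakage cannot depend on the honest messages.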
Theorem 3.5. Let c ≥ 0 and t < n/2. There exists an efficiently sampleable distribution D such that the functionality W D,c sl-flex (F pbc ) has an expected-constant-round complexity, and can be UC-realized in the (F smt , F corr-bc )-hybrid model, with information-theoretic security, in the presence of an adaptive, malicious t-adversary, assuming that all honest parties receive their inputs within c + 1 consecutive rounds.
The proof of the theorem follows along the same lines as the proof of [16, Theorem 5.6].

Fast SFE in the Point-to-Point Model
We conclude this section by showing how to construct a UC-secure SFE protocol which computes a given circuit in expected O(d) rounds, independently of the number of parties, in the point-to-point channels model. The protocol is obtained by taking the protocol of Cramer et al. [21], denoted π sfe . This protocol relies on (parallel) broadcast and (parallel) point-to-point channels, and therefore it can be described in the (F psmt , F pbc )-hybrid model. Using an adaptively secure constant-round MPC in the broadcast model, such as the protocol of Damgård and Ishai [23] (or the two-round protocol that exists under stronger assumptions [20]), we obtain expected-constant-round MPC over point-to-point channels.
Theorem 3.7. Let f be an n-party function, let c ≥ 0, let t < n/2, and assume that one-way functions exist. Then, there exists an efficiently sampleable distribution D such that the functionality W D,c sl-flex (F f sfe ) has round complexity O(1) in expectation, and can be UC-realized in the (F smt , F pki )-hybrid model, with computational security, in the presence of an adaptive, malicious t-adversary, assuming that all honest parties receive their inputs within c + 1 consecutive rounds.

Functionally Black-Box Protocols and Parallel Composition
In this section, we extend the probabilistic-termination framework [16] to capture the notions of functionally black-box protocols [57] and of parallel composition of canonical synchronous functionalities.
Functionally black-box protocols. Most MPC protocols in the literature explicitly require a representation of the function to be computed, usually in the form of a Boolean circuit, an arithmetic circuit, or a RAM program. This representation is a common parameter to the protocol and is known to all of the parties. Rosulek [57] asked whether this is inherent or whether it is possible to securely compute a function without "knowing its code." Specifically, Rosulek [57] defined functionally black-box (FBB) protocols, in which the parties compute a function from within some function class, where each party is given only local oracle access to the function (i.e., the party can query the oracle to learn the evaluation of the function on inputs of its choice). A feasibility result for semi-honest two-party FBB protocols was given in [57] for a class of functions related to blind signatures, as well as an impossibility result (that was strengthened in [43]) ruling out generic FBB protocols, also in the semi-honest case.
Looking ahead, in Sects. 5 and 6.2 we consider parallel composition of FBB computations. That is, given public function-classes C 1 , . . . , C M and (hidden) functions g i ∈ C i , each party is given a local oracle access to g 1 , . . . , g M (as in [57]), but in addition the parties can globally use M ideal functionalities to jointly compute each g i ; every ideal functionality is parametrized by a function class C i and is given an oracle access to the corresponding g i . Because the parties can use an ideal functionality to compute each g i , the impossibility results from [43,57] do not apply, and indeed, in Sect. 5 we show that in the semi-honest setting it is possible to compose in parallel FBB computations in a round-preserving manner. In Sect. 6.2 we extend the technique from [43] to rule out round-preserving FBB parallel composition in the malicious setting.
We formalize the notion of functionally black-box protocols of Rosulek [57] in the language of canonical synchronous functionalities. As in [57], we focus on secure function evaluation. The SFE functionality F g sfe (see Sect. C.1), parametrized by an n-party function g, is defined as the CSF F  Denote by F C sfe the CSF, implemented as an (uninstantiated) oracle machine that in order to compute f C sfe (x 1 , . . . , x n , a), queries the oracle with (x 1 , . . . , x n ) and stores the response (y 1 , . . . , y n ). The leakage function l sfe (x 1 , . . . ,

Parallel composition of CSFs. The parallel composition of CSFs is defined in a natural way as the CSF that evaluates the corresponding functions in parallel. The parallel composition of CSFs F 1 , . . . , F M with functions f 1 , . . . , f M and leakage functions l 1 , . . . , l M , denoted (F 1 ∥ · · · ∥ F M ), is the CSF defined by the function ( f 1 ∥ · · · ∥ f M ) and the leakage function (l 1 ∥ · · · ∥ l M ).

Round-Preserving Parallel Composition: Passive Security
In this section, we show that round-preserving parallel composition is feasible, in a functionally black-box manner, facing semi-honest adversaries. The underlying idea of our protocol π pfbb (standing for parallel functionally black-box), formally presented in Fig. 2, is based on a simplified form of the parallel-broadcast protocol of Ben-Or and El-Yaniv [5]. The protocol proceeds in iterations, where in each iteration, the parties invoke, in parallel and using the same input values, sufficiently many instances of each (oracle-aided) ideal functionality, but only for a constant number of rounds. If some party received an output in at least one invocation of every ideal functionality, it distributes all output values and the protocol completes; otherwise, the protocol resumes with another iteration. This protocol retains privacy for deterministic functions with public output, since the adversary is semi-honest, and so corrupted parties will provide the same input values to all instances of each ideal functionality.
Intuitively, during the simulation of the protocol, the simulator should imitate every call for every ideal functionality toward the adversary. A subtle issue is that in order to do so, the simulator must know the exact trace that is sampled by each instance of each ideal functionality during the execution of the real protocol. Therefore, it is indeed essential for the simulator to receive the random coins used to sample the trace for the entire protocol, by the ideal functionality computing the parallel composition (see Sect. 2.2). By defining the trace-distribution sampler in a way that consists of all (potential) subtraces for every instance of every ideal functionality, the simulator can induce the exact random coins used to sample the correct sub-trace for every ideal functionality that is invoked.
)-hybrid model, with information-theoretic security, in the presence of an adaptive, semi-honest t-adversary, assuming that all honest parties receive their inputs at the same round. In particular, if for every j ∈ [M], the expectation μ j is constant, then μ is constant.
The proof of Theorem 5.1 follows immediately from the following lemma. Then, for any R > μ, -hybrid model, with information-theoretic security, in the presence of an adaptive, semi-honest t-adversary, assuming that all honest parties receive their inputs at the same round.
Proof We start by defining the sampling algorithm for the distribution D pfbb , parametrized by N, R, L, and distributions D 1 , . . . , D M . The sampler initially sets α ← 0 and a trace T with a root labeled by W that c tr (Tk j ) < R, then output T and halt. Else, set α ← α + 1. If α < L, repeat the sampling process; otherwise output T and halt.
Following Lemma 2.3, for R > μ and N > log(M/ε) / log(R/μ), it holds that in every iteration, with constant probability, at least one invocation of W D j flex (F C j sfe ) produces output for every j ∈ [M]. It follows that the expected number of iterations until all honest parties receive output and the protocol terminates is constant, and since each iteration consists of O(R) rounds, the entire execution completes within O(R) rounds in expectation, as required. The probability that the protocol does not terminate within L = poly(κ) iterations is negligible.
Let A be a semi-honest adversary; we now construct a simulator S for A. Initially, S sets the values α ← 0 and y ← ⊥, and starts by receiving leakage messages (leakage, sid, P i , (l 1 , . . . , l M )) and a trace message (trace, sid, , followed by a call to F psmt ). More precisely, S receives the coins that were used by the functionality W to sample the trace T and, using these coins, S samples the same traces $T_{\tilde{k}}^{j}$ that were used to define T .
In order to simulate the α'th iteration, S sends the message (leakage, $\mathsf{sid}_{j,\tilde{k}}$, P i , l j ) to A, for every j ∈ [M], every k ∈ [N], and every honest P i (where $\tilde{k} = \alpha \cdot N + k$), and receives (input, $\mathsf{sid}_{j,\tilde{k}}$, x j i ) from A on behalf of every corrupted party P i . Since A is semi-honest, it holds that the same x j i is used for each corrupted party P i in all instances of the functionality W on behalf of every party. Proving that no environment can distinguish between its view in an execution of π pfbb in the (F smt , )-hybrid model, and its view when interacting with S in the ideal computation of W , follows via a standard hybrid argument. Starting with the execution of π pfbb , the invocations of the ideal functionalities are replaced, one by one, with the answers of the simulator S. Since S perfectly emulates each such call using the trace it received from the ideal functionality W D pfbb , it follows immediately that the views of the environment in two neighboring hybrids are indistinguishable; therefore, the view of the environment in the simulation is indistinguishable from its view in the execution of π pfbb .

Round-Preserving Parallel Composition: Active Security
In this section, we consider security against active adversaries. First, in Sect. 6.1, we show how to compute the parallel composition of probabilistic-termination functionalities, in a round-preserving manner, using black-box access to protocols realizing the individual functionalities. In Sect. 6.2, we investigate the question of whether there exists a functionally black-box round-preserving malicious protocol for the parallel composition of probabilistic-termination functionalities, and show that for a natural extension of protocols, following the techniques from [5], this is not the case, i.e., there exist functions such that no such protocol with black-box access to them can compute their parallel composition, in a round-preserving manner, tolerating even a single adversarial party.

Feasibility of Round-Preserving Parallel Composition
In this section, we show how to compile multiple protocols, realizing probabilistic-termination functionalities, into a single protocol that realizes the parallel composition of the functionalities, in a round-preserving manner, and while only using black-box access to the underlying protocols. We start by providing a high-level description of the compiler.
The compiler receives as input protocols π 1 , . . . , π M , where each protocol π j is defined in the point-to-point model, in which the parties are given correlated randomness in a secure setup phase, i.e., in the (F smt , F D corr j corr )-hybrid model. It follows that the next-message function for each party in each protocol is a deterministic function that receives the input value, correlated randomness, private randomness, and history of incoming messages, and outputs a vector of n messages to be sent in the following round (one message for each party); we denote by f π j ,i nxt-msg the next-message function for party P i in protocol π j . In particular, we note that the entire transcript of the protocol is fixed once the input value, correlated randomness, and private randomness of each party are determined.
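The observation that the transcript is fixed by the inputs and randomness is what later allows parties to recompute the messages of an identified cheater. The following toy sketch illustrates it: given deterministic next-message functions, replaying the protocol with the same inputs, correlated randomness, and private coins reproduces the same transcript. The next-message function used here (a hash of its arguments) and the names `toy_next_msg` and `transcript` are hypothetical stand-ins, not part of the paper.

```python
import hashlib
from typing import Callable, List

# (input, correlated randomness, private coins, history) -> n outgoing messages
NextMsg = Callable[[bytes, bytes, bytes, List[List[bytes]]], List[bytes]]


def toy_next_msg(n: int) -> NextMsg:
    """Hypothetical deterministic next-message function for an n-party protocol."""
    def f(x: bytes, r_corr: bytes, r_priv: bytes,
          history: List[List[bytes]]) -> List[bytes]:
        flat = b"|".join(b",".join(rnd) for rnd in history)
        return [hashlib.sha256(b"/".join([x, r_corr, r_priv, flat, bytes([u])])).digest()
                for u in range(n)]
    return f


def transcript(fs: List[NextMsg], xs: List[bytes], corr: List[bytes],
               coins: List[bytes], rounds: int) -> List[List[List[bytes]]]:
    """Run the protocol; result[rho][i][u] is the round-rho message from P_i to P_u."""
    n = len(fs)
    histories: List[List[List[bytes]]] = [[] for _ in range(n)]
    log = []
    for _ in range(rounds):
        sent = [fs[i](xs[i], corr[i], coins[i], histories[i]) for i in range(n)]
        for i in range(n):
            # P_i's incoming messages of this round, one from every P_u.
            histories[i].append([sent[u][i] for u in range(n)])
        log.append(sent)
    return log


if __name__ == "__main__":
    fs = [toy_next_msg(3)] * 3
    args = ([b"a", b"b", b"c"], [b"r0", b"r1", b"r2"], [b"c0", b"c1", b"c2"])
    print(transcript(fs, *args, rounds=2) == transcript(fs, *args, rounds=2))
```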
The underlying ideas of the compiler are inspired by the constructions in [5,16], where a round-preserving parallel-broadcast protocol was constructed by iteratively running, for a constant number of rounds, multiple instances of BA protocols (each instance is executed multiple times in parallel), until at least one execution of every BA instance is completed. This approach is indeed suitable for computing "privacy-free" functionalities (such as broadcast), where the adversary may learn the result of multiple computations using the same inputs for honest parties, without compromising security. However, when considering the parallel composition of arbitrary functions, running two instances of a protocol using the same input values will violate privacy, since the adversary can use different inputs to learn multiple outputs of the function.
The parallel-composition compiler generalizes the above approach in a privacy-preserving manner. The compiler follows the GMW paradigm [33] and is defined in the Setup-Commit-then-Prove hybrid model [11,42], which generates committed correlated randomness for the parties and ensures that all parties follow the protocol specification. This mechanism allows each party to commit to its input values and later execute multiple instances of each protocol, while proving that the same input value is used in all executions. For simplicity and without loss of generality, we assume that each function is deterministic and has a public output. In this case, it is ensured that if two parties receive output values in two executions of π j , then they receive the same output value. The private random coins that are used in each execution only affect the termination round, but not the output value. Using this simplification, we can remove the leader-election phase from the output-agreement technique in [5,16] and directly use the termination technique from Bracha [7].
Another obstacle is to recover from corruptions without increasing the round complexity. Indeed, in case some party misbehaves, e.g., by using different input values in different instances of the same protocol π j , then the Setup-Commit-then-Prove functionality ensures that all honest parties will identify the cheating party. In this case, the parties cannot recover by, for example, backtracking and simulating the cheating party, as this will yield a round complexity that is linear in the number of parties. Furthermore, the protocol must resume in a way such that all instances of a specific protocol π j will use the same input value that the identified corrupted party used throughout the protocol's execution until it misbehaved (since the cheating party might have learned an output value in one of the executed protocols).
To this end, we slightly adjust the Setup-Commit-then-Prove functionality and secret-share every committed random string r i (the correlated randomness for party P i ) among all the parties, using an error-correcting secret-sharing scheme (see Sect. A.1). Note that this can be done information-theoretically, as we assume an honest majority [12,22,56]. In case a party is identified as cheating, every party broadcasts the share of the committed randomness corresponding to that party, reconstructs the correlated randomness for that party, and from that point onwards, locally computes the messages corresponding to this party in every instance of every protocol. Using this approach, every round in the original protocols π 1 , . . . , π M is expanded by a constant number of rounds, and the overall round complexity is preserved.
We prove the following theorem. Then, W D,c sl-flex (F 1 ∥ · · · ∥ F M ), for some distribution D with expectation μ′ = O(μ), can be UC-realized with information-theoretic security by a protocol π in the (F smt , F corr-bc )-hybrid model, in the same adversarial setting, assuming that all honest parties receive their inputs within c + 1 consecutive rounds. In addition, protocol π requires only black-box access to the protocols π 1 , . . . , π M .
In particular, if for every j ∈ [M], the expectation μ j is constant, then μ′ is constant.
Proof (sketch) Without loss of generality, we assume that every CSF F j is deterministic and has public output. The proof for randomized functionalities with private output follows using standard techniques. In Lemma 6.3, we prove that the compiled protocol realizes the wrapped parallel composition of F 1 , . . . , F M with expected round complexity O(R) and information-theoretic security, in the (F pbc , F scp )-hybrid model. In Lemma 6.2, we show that a wrapped version of F scp (P, D parallel (π 1 , . . . , π M , N · L, R, ), R parallel , ) (explained below) can be implemented, such that every call can be UC-realized in the (F psmt , F pbc )-hybrid model with constant round complexity and information-theoretic security. Following Theorem 3.5, F pbc can be UC-realized in the F smt -hybrid model with expected-constant-round complexity and information-theoretic security. The proof follows from the sequential composition theorems, Theorems 3.1, 3.2 and 3.3.
We now proceed to define the Setup-Commit-Then-Prove Functionality F scp and prove Lemma 6.2 in Sect. 6.1.1, and to prove Lemma 6.3 in Sect. 6.1.2.

The Setup-Commit-Then-Prove Functionality
An important building block in our parallel-composition compiler is the Setup-Commit-then-Prove functionality. This functionality is used in order to allow parties to execute multiple instances of a protocol, using the same inputs, while ensuring input consistency. In addition, in case a party misbehaves and tries to deviate from the protocol or to use different inputs in different executions, the functionality allows all parties to identify this misbehavior and recover, while increasing the round complexity only by a constant factor. This functionality was defined by Ishai et al. [42], based on the Commit-then-Prove functionality of Canetti et al. [11], and was used in order to compile any semi-honest protocol into a protocol that is secure with identifiable abort, facing malicious adversaries, by generating committed setup for the parties and allowing them to prove NP-statements in zero knowledge.
The Setup-Commit-then-Prove functionality F scp is a reactive functionality. (The notion of CSF is extended to the reactive setting in "Appendix C.2.") The functionality is formally defined in Fig. 3 and is parametrized by a party-set P, a distribution D, a vector of n NP-relations R, and a (t, n) error-correcting secret-sharing scheme . In the first call to the functionality, the parties do not send inputs (more precisely, send the empty input λ). The functionality samples correlated randomness (r 1 , . . . , r n ) ← D, hands r i to P i and stores r i as the committed witness for P i . In addition, the functionality secret shares r i between all parties. All subsequent calls are used to prove NP-statements on the committed witnesses, i.e., on the k'th call, for k > 1, P i sends as its input a statement x i,k ; the functionality verifies whether R i (x i,k , r i ) = 1 and sends x i,k and the result to all parties.
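The two phases of the functionality can be summarized by the following minimal sketch, which is not the formal definition of Fig. 3. It models a trusted party that, on the first call, samples and stores the committed witnesses and shares each of them among all parties, and on later calls publishes a statement together with the result of the relation check. The class and method names are hypothetical, and plain n-out-of-n XOR sharing is used only as a placeholder for the (t, n) error-correcting secret-sharing scheme.

```python
import secrets
from typing import Callable, List, Sequence, Tuple

Relation = Callable[[bytes, bytes], bool]  # R_i(statement, witness)


def xor_share(secret: bytes, n: int) -> List[bytes]:
    """n-out-of-n XOR sharing: a placeholder, NOT error-correcting."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = secret
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    return shares + [last]


class SetupCommitThenProve:
    """Toy model of the reactive functionality (names are hypothetical)."""

    def __init__(self, n: int, sample: Callable[[], Sequence[bytes]],
                 relations: Sequence[Relation]) -> None:
        self.n, self.sample, self.relations = n, sample, relations
        self.witness: List[bytes] = []

    def setup(self) -> List[Tuple[bytes, List[bytes]]]:
        """First call: sample (r_1, ..., r_n), store them, share every r_i."""
        self.witness = list(self.sample())
        all_shares = [xor_share(r, self.n) for r in self.witness]
        # Party i receives r_i and its share of every party's randomness.
        return [(self.witness[i], [all_shares[u][i] for u in range(self.n)])
                for i in range(self.n)]

    def prove(self, i: int, statement: bytes) -> Tuple[bytes, bool]:
        """Later calls: publish the statement and whether R_i(x, r_i) = 1."""
        return statement, bool(self.relations[i](statement, self.witness[i]))
```

In this toy version, a party proves a statement about its committed randomness without revealing it: all other parties learn only the statement and the accept/reject bit.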
Ishai et al. [42] showed how to realize a slightly reduced version of the Setup-Commit-then-Prove functionality (in which the functionality does not secret share the witnesses), unconditionally, with identifiable abort, facing an arbitrary number of corrupted parties. Their protocol is based on the "MPC-in-the-head" approach that was put forth by Ishai et al. [40] with the goal of using information-theoretic MPC to construct a zero-knowledge protocol. In a nutshell, given a statement x and a witness w, the prover secret shares the witness as ω = ω 1 ⊕ · · · ⊕ ω m (for some m) and emulates in its head an m-party t-secure MPC protocol with perfect correctness (e.g., BGW [6]) where the i'th party has input (ω i , x). The MPC task is to reconstruct the witness ω = ω 1 ⊕ · · · ⊕ ω m , evaluate the relation R(x, ω) and provide each party with the resulting bit. Next, the prover sends to the verifier a commitment to the local views of each of the virtual m parties (consisting of the input ω i , the random coins, and the incoming messages). In turn, the verifier asks the prover to open t of these commitments, and verifies that these views are consistent and that the output is 1. Intuitively, completeness holds by the correctness of the MPC, soundness holds by the perfect security, and zero knowledge holds since the views of t parties leak no information about other inputs; so the overall security of the zero-knowledge protocol reduces to that of the commitment scheme.
In the following lemma, we adjust and simplify the protocol from [42] for the honest-majority setting, and show how to realize F scp with guaranteed output delivery.

Proof (sketch) We start with a high-level description of the protocol in [42], for a single prover that proves a single statement (the extension to the multi-instance version of many provers that prove many statements follows via the JUC theorem [9]). As this protocol is designed for the dishonest-majority setting, it does not guarantee output delivery, but achieves identifiable abort.
As mentioned above, the protocol follows the "MPC-in-the-head" approach [40,41], where the prover emulates in its head a protocol, where m servers, each has as input a share of the witness ω, compute for a public statement x the function b = R(x, ω), and output (x, b). The prover publicly commits to the view of each server, and the verifiers challenge the prover to open the views of some of the servers. The verifiers accept the statement x, if and only if all opened views are consistent.
For simplicity, consider an ideal functionality F 1scp corr for generating the correlated randomness to the parties. As we argue later, this functionality can be realized in constant rounds in the broadcast model via generic MPC. (Looking ahead, the distribution D will consist of n independent copies of the following values, one for each party that acts as the prover.) The parties receive the following correlated randomness from F 1scp corr :
• The prover receives signed secret shares of the witness (ω i , σ (ω i )), for i ∈ [n], where (ω 1 , . . . , ω n ) are shares of the prover's witness ω, and σ (ω i ) is an information-theoretic signature of ω i .
• The prover also receives random strings v 1 , . . . , v m along with corresponding signatures σ (v 1 ), . . . , σ (v m ), that will be used for committing to the servers' views in the m-party protocol.
• Every P i receives a challenge string c i along with an information-theoretic signature σ (c i ).
• Every party P i receives a verification key for each of the signature values (note that all of the signing keys are hidden from the parties).
The prove phase consists of three rounds:
1. The prover emulates in its head the protocol for computing b = R(x, ω 1 ⊕ · · · ⊕ ω m ) and broadcasts, for every j ∈ [m], a commitment to the view of the j'th server as view j ⊕ v j along with σ (v j ).
2. Every party P i broadcasts the committed random string c i along with its signature σ (c i ). All parties locally compute $c = \bigoplus_{j} c_j$ and use it to choose a subset J ⊆ [m]. In case some party P i sends invalid values, all parties identify P i as corrupted and abort.
3. The prover broadcasts view j and σ (ω j ), for every j ∈ J , and all parties validate consistency.
Since in this work we consider an honest majority, we can simplify the protocol and achieve security with guaranteed output delivery. In the second round, instead of having each P i broadcast the committed randomness (c i , σ (c i )), we have the parties jointly compute the XOR function to agree on a subset J ⊆ [m]; that is, each party enters a (locally chosen) random string as its input and receives the XOR of all strings (if a party does not input a value, it is simply ignored and treated as 0). Since this functionality can be represented using a constant-depth circuit, we can use the protocol of Cramer et al. [21], whose round complexity is O(d), where d is the depth of the circuit, and which provides information-theoretic security in the broadcast-hybrid model against an adaptive, malicious t-adversary. A second modification we require is that F 1scp corr secret shares the witness ω between all parties using an error-correcting secret-sharing scheme, which can be easily achieved in the honest-majority setting.
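The XOR functionality used to agree on the challenge can be sketched as follows. This is only the input/output behavior of the ideal functionality, not the protocol of [21] that realizes it. The way the XOR value c is mapped to a subset J is not specified in the text; `choose_subset` below is a hypothetical choice (seeding a pseudorandom generator with c), and treating wrong-length inputs like missing ones is likewise an assumption of this sketch.

```python
import random
from typing import List, Optional, Sequence


def xor_all(contributions: Sequence[Optional[bytes]], length: int) -> bytes:
    """XOR of all parties' strings; a missing input counts as the zero string."""
    acc = bytearray(length)
    for c in contributions:
        if c is None or len(c) != length:
            continue  # assumption: malformed inputs are ignored like missing ones
        for idx, byte in enumerate(c):
            acc[idx] ^= byte
    return bytes(acc)


def choose_subset(c: bytes, m: int, t: int) -> List[int]:
    """Hypothetical map from the joint string c to a size-t subset J of [m]."""
    rng = random.Random(int.from_bytes(c, "big"))
    return sorted(rng.sample(range(m), t))


if __name__ == "__main__":
    c = xor_all([b"\x0f\x00", None, b"\xf0\x01"], length=2)
    print(c.hex())  # prints: ff01
    print(choose_subset(c, m=10, t=3))
```

As long as at least one contributed string is uniformly random and independent of the others, the XOR is uniformly random, so no coalition of the remaining parties can bias the challenge within this ideal functionality.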
It is left to show that the sampling algorithm for the correlated randomness that is used in [42] can be represented using a constant-depth circuit; the proof will then follow using again the protocol from [21]. By assumption, sampling a value from the distribution D can be done using a constant-depth circuit. In addition, the information-theoretic commitments in [42] are computed using information-theoretic signatures (see "Appendix A.2"), such that no party knows the signing key. This means that the adversary cannot generate signatures on its own. In addition, each signature will only need to be verified once (by each party). Using the information-theoretic signatures construction from [60] (see Theorem A.3), the degree of the polynomial that is used to generate the signature is bounded by the number of signatures the adversary is allowed to see from each honest party, which in our case is constant. We conclude that the randomness-sampling algorithm can be represented using a constant-depth circuit.
In the following, we will consider the distribution D parallel (π 1 , . . . , π M , q, R, ℓ) for protocols π 1 , . . . , π M , where each π j is defined in the (F psmt , F D corr j corr )-hybrid model, that prepares (at most) q executions of each protocol, for exactly R rounds. For each of the q instances of every protocol π j , the correlated randomness consists of three parts: first, sample correlated randomness for π j from the distribution D corr j ; next, sample independent random coins (local for each party) from the corresponding distribution D π j , suitable for R rounds; finally, sample random coins that are used to mask all the communication (explained below). The parameter ℓ = poly(κ) represents the maximum between the input length and the maximal message length in the protocols π 1 , . . . , π M . The distribution is formally defined in Fig. 4.
The masking of the communication in the ρ'th round of the k'th instance of protocol π j is performed as follows:
• When P i wants to send messages $(m_i^1, \ldots, m_i^n)$ (where $m_i^u$ is sent privately to P u ), party P i sets, for every u ∈ [n], $\tilde{m}_i^u = m_i^u \oplus r^{\mathsf{mask}}_{i,j,k,\rho,u}$ and broadcasts $\tilde{m}_i = (\tilde{m}_i^1, \ldots, \tilde{m}_i^n)$.
• When P i receives messages $\tilde{m}_u = (\tilde{m}_u^1, \ldots, \tilde{m}_u^n)$ from P u , P i computes $m_u^i = \tilde{m}_u^i \oplus r^{\mathsf{mask}}_{u,j,k,\rho,i}$, and uses $m_u^i$ as the message P u sent to it.
The vector of relations R parallel = (R 1 parallel , . . . , R n parallel ), described below, will be used to verify that every party sends its messages in multiple executions of a protocol according to the specification of the protocol, while using the same input value in all executions. Formally, for every i ∈ [n], the relation R i parallel consists of pairs $((\alpha, \rho, \tilde{m}, \tilde{h}), r_i)$, where α is an integer representing the iteration number and the unmasked messages are denoted by $m_{i,j,k,\rho} = (m_{i,j,k,\rho,1}, \ldots, m_{i,j,k,\rho,n})$. That is, $m_{i,j,k,\rho}$ is the output of the next-message function of P i in protocol π j on input $x_i^j$, correlated randomness $r^{\mathsf{corr}}_{i,j,k}$, private randomness $r^{\mathsf{prot}}_{i,j,k}$ and history $h_{i,j,k}$. In the sequel, for simplicity, we will denote by F scp the functionality F scp (P, D parallel (π 1 , . . . , π M , N · L, R, ), R parallel , ).
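The masking step is a one-time pad applied per recipient, which lets private point-to-point messages travel over the broadcast channel. A minimal sketch follows, with hypothetical helper names; it assumes that the recipient P i holds the pad for the channel from P u to P i, as its unmasking rule requires, and that every pad has the same length as the message it masks.

```python
import secrets
from typing import List, Sequence


def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("message and pad must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def mask_outgoing(messages: Sequence[bytes], pads: Sequence[bytes]) -> List[bytes]:
    """Sender P_i: mask the message for each P_u with its pad; broadcast the result."""
    return [xor(m, r) for m, r in zip(messages, pads)]


def unmask_incoming(broadcast_from_u: Sequence[bytes], i: int, pad_u_to_i: bytes) -> bytes:
    """Receiver P_i: recover its own message from P_u's broadcast vector."""
    return xor(broadcast_from_u[i], pad_u_to_i)


if __name__ == "__main__":
    msgs = [b"to-P0", b"to-P1", b"to-P2"]
    pads = [secrets.token_bytes(len(m)) for m in msgs]  # fresh pads for this round
    sent = mask_outgoing(msgs, pads)
    print(unmask_incoming(sent, 1, pads[1]))  # prints: b'to-P1'
```

Because each pad is used once (it is indexed by sender, protocol, instance, round, and recipient), the broadcast vector reveals nothing about a message to parties that do not hold the corresponding pad.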

Round-Preserving Parallel-Composition Compiler
We are now ready to present our protocol-black-box (PBB) parallel-composition compiler, formally described in Fig. 5. To prove security of the construction, we construct a simulator for the dummy adversary, which simulates the functionality F scp and all honest parties. At a high level, the simulation proceeds as follows. Since every protocol π j realizes F j , there exists a simulator S j for the dummy adversary. In order to simulate the k'th instance of each protocol π j , the simulator S invokes an instance of S j , denoted S k j , and receives correlated randomnessr corr i, j,k for every corrupted party P i . The simulator S samples randomness from the distribution D parallel , adjusts the correlated randomness for every corrupted party accordingly and hands the adversary (r i , v i 1 , . . . , v i n ) as the answer from the first call to F scp , where (v 1 i , . . . , v n i ) are shares of zero, for every i ∈ [n]. For the inputcommitment message, the simulator broadcasts commitments of zero for the honest parties (i.e., random messages). The k'th instance of π j is simulated now using S k j , where S masks/unmasks the messages between A and S k j , appropriately. Every message sent by A on behalf of a corrupted party P i is validated by S according to the relation R i parallel , and in case it is invalid, S locally computes the messages for P i , using its input and correlated randomness. In case party P i gets corrupted, the simulator corrupts the dummy party in the ideal computation and learns its input; next, S hands the input to each simulator S k j , receives the random coins for P i in each instance of π j , updates the random coins r i accordingly and hands it to A. This is a valid simulation for the dummy adversary, following the security guarantees of each simulator S j and of the secret-sharing scheme .
Proof For simplicity of notation, denote F pbb = W D pbb flex (F 1 · · · F M ). We start by defining the sampling algorithm for the distribution D pbb , parametrized by N, R, L and distributions D 1 , . . . , D M . The sampler initially sets α ← 0 and a trace T with root labeled by F pbb and children (F 1 scp , F pbc ). Next, independently sample traces T k̃ j ← D j , for j ∈ [M] and k ∈ [N] (where k̃ = α · N + k), and append to the leaves of the trace T (i.e., R sequential calls to (F i scp , F pbc ), followed by three sequential calls to F pbc ). If for every j ∈ [M], there exists k ∈ [N] such that c tr (T k̃ j ) < R, then output T and halt. Else, set α ← α + 1. If α < L, repeat the sampling process; otherwise output T and halt.
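The control flow of the sampler can be sketched as follows. This is only an illustration under simplifying assumptions: the trace itself is not built, each sampled trace is represented solely by its round count, and `sample_rounds` is a hypothetical stand-in for sampling from D j and applying c tr. The function returns the number of iterations the sampler performs.

```python
import random


def geometric_half(rng):
    """Illustrative round-count sampler: geometric with parameter 1/2
    on {1, 2, ...}."""
    rounds = 1
    while rng.random() < 0.5:
        rounds += 1
    return rounds


def count_iterations(M, N, R, L, sample_rounds, rng):
    """Mirror the sampler's loop: in each iteration, N round counts are
    drawn for each of the M protocols; the loop halts once every protocol
    has at least one instance finishing in fewer than R rounds, or after
    L iterations."""
    for alpha in range(L):
        every_protocol_done = all(
            any(sample_rounds(rng) < R for _ in range(N))
            for _ in range(M)
        )
        if every_protocol_done:
            return alpha + 1
    return L


iterations = count_iterations(
    M=8, N=4, R=4, L=50, sample_rounds=geometric_half, rng=random.Random(1)
)
```

The two degenerate samplers used in the checks (always fast, never fast) correspond to halting in the first iteration and to exhausting all L iterations, respectively.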
Following Lemma 2.3, for R > μ and N > log(M/ε) / log(R/μ), it holds that in every iteration, at least one execution of π j will produce output, for every j ∈ [M], with a constant probability. It follows that the expected number of iterations until all honest parties receive output and the protocol terminates is constant, and since each iteration consists of R rounds, the entire execution completes within expected O(R) rounds, as required. The probability that the protocol does not terminate within L = poly(κ) iterations is negligible.
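To get a feel for the size of N, the following sketch computes the smallest N that pushes a union-bound failure estimate below a target. It assumes (this is not taken from the statement of Lemma 2.3) that a single instance fails to finish within R rounds with probability at most μ/R, as a Markov-style bound would give, and that instances are independent; `eps` is a hypothetical name for the target per-iteration failure bound.

```python
def min_instances(M, eps, R, mu):
    """Smallest N with M * (mu / R) ** N < eps, i.e. the smallest integer
    N satisfying N > log(M / eps) / log(R / mu)."""
    if not (M >= 1 and 0 < eps < 1 and R > mu > 0):
        raise ValueError("need M >= 1, 0 < eps < 1 and R > mu > 0")
    per_instance_failure = mu / R  # assumed Markov-style bound
    N = 1
    while M * per_instance_failure ** N >= eps:
        N += 1
    return N


# Example: 16 protocols, expected 2 rounds each, run for R = 4 rounds.
needed = min_instances(M=16, eps=0.5, R=4, mu=2)
```

In this example log(M/eps) / log(R/mu) equals exactly 5, so the strict inequality forces N = 6.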
We construct a simulator S for the dummy adversary A. Let Z be an environment. The simulator S uses, in a black-box way, the simulators S j , for j ∈ [M], where every S j simulates the dummy adversary for protocol π j . The simulators S j are guaranteed to exist since every π j UC-realizes F j . Each simulator S j is invoked (at most) q = L · N times; denote by S k j the k'th invocation of S j . We consider the representation of the reactive CSF F scp as a sequence of (non-reactive) CSFs (Ff The simulator S proceeds as follows: • Initially, set x 1 = · · · = x n = ⊥ and y = ⊥.
• Simulating the first (input-commitment) message:
1. For every honest party P i , send a random string m̃ input i to A.
• Send inputs to F pbb :
1. For every corrupted party P i , send the message (input, sid, x i ) to F pbb .
2. Upon receiving message (leakage, sid, P i , (l 1 , . . . , l M )) from F pbb , for an honest party P i , store the leakage information.
3. In addition, S receives (trace, sid, T ), from F pbb , where T is a depth-1 trace of the form (i.e., initially, calls to F 1 scp and F pbc , followed by q iterations of calling R times (F i scp , F pbc ) and 3 rounds of F pbc in order to agree on termination). More precisely, S receives the coins that were used by the functionality F pbb to sample the trace T from D pbb . Using these coins, S samples the same traces T k̃ j that were used to define T .
   )) as the response from F scp , on behalf of every corrupted party.
(e) For every corrupted party P i with b i = 1 and every honest P u , forward to S k̃ j the message m i, j,k,ρ,u .
(f) For every corrupted party P i with b i = 0, compute (v 1 i , . . . , v n i ) ← Share(r i ) and send v u i to A, on behalf of every honest party P u . (Note that r i may change as the simulation proceeds, in case P i gets corrupted dynamically.)
(g) For every corrupted party that has been previously identified, locally compute the messages m i, j,k,ρ,u , for every honest party P u , using (x i , r i ) and h i, j,k , and send them to S k̃ j .
• Explaining corruption requests: upon a corruption request for P i , proceed as follows.
1. Corrupt the dummy party P i and learn its input
3. Provide the input x i and the adjusted correlated randomness r i to Z as the internal state of party P i .
We next prove that no environment can distinguish between interacting with the dummy adversary and the honest parties running protocol π in the (F pbc , F scp )-hybrid model, from interacting with the simulator S and the dummy honest parties computing F pbb , except for a negligible probability. We prove this using a series of hybrid games. The output of each game is the output of the environment.
The game HYB 1 π pbb ,A,Z. This is exactly the execution of the protocol π pbb in the (F pbc , F scp )-hybrid model with environment Z and dummy adversary A.
The games HYB 2,i * π pbb ,A,Z , for 0 ≤ i * ≤ n. In these games, we modify HYB 1 π pbb ,A,Z as follows. During the first call to F scp , for as the shares for every party P i . For every Next, send to every corrupted P k the shares (ṽ k 1 , . . . , ṽ k n ) and to every honest party the shares
Proof Note that HYB 1 π pbb ,A,Z ≡ HYB 2,0 π pbb ,A,Z . For every 0 ≤ ĩ < n, it holds that HYB 2,ĩ π pbb ,A,Z s ≡ HYB 2,ĩ+1 π pbb ,A,Z . This follows from the security of the ECSS scheme and the honest-majority assumption. In particular, the shares {v u 1 , . . . , v u n } u held by the adversary (for all corrupted P u ) completely hide the random coins r i of every honest party P i , and are compatible even if P i gets adaptively corrupted and r i is revealed. This holds since all honest parties receive valid shares of r i ; therefore, r i will be correctly reconstructed even if all corrupted parties have incorrect shares. The claim follows using a standard hybrid argument.
The game HYB 3 π pbb ,A,Z . In this game, we modify HYB 2,n π pbb ,A,Z as follows. The input-commitment message m̃ input i , sent by an honest party, is uniformly chosen. In addition, in every call to F scp the returned value b i for every honest party P i is always set to 1. Finally, the random coins for honest parties that are corrupted adaptively are set as r
Proof The claim immediately follows since honest parties always send correct messages.
The games HYB 4, j * ,k * π pbb ,A,Z , for 0 ≤ j * ≤ M and 0 ≤ k * ≤ q = N · L. In these games, we modify HYB 3 π pbb ,A,Z as follows. For ( j, k) < ( j * , k * ) (i.e., for j < j * , or j = j * and k < k * ), the k'th execution of protocol π j is replaced with the simulated transcript generated by S k j . More specifically, the experiment interacts with the ideal functionality F pbb . Initially, it receives the leakage messages (leakage, sid, P i , (l 1 , . . . , l M )) and the coins used to sample a trace T from D pbb (via the message (trace, sid, T )); using these random coins, sample the traces T k j from D j . Each simulator S k j (for ( j, k) < ( j * , k * )) is invoked and is given the leakage information l j and the trace T k j . The simulator S k j provides correlated randomness r corr i, j,k for every corrupted party; the random coins r i for corrupted P i are adjusted accordingly. Once A sends the input-commitment message m̃ input i , for a corrupted P i , extract the input value x i = (x 1 i , . . . , x M i ) and hand S k j the value x j i as the input value for P i . The interaction with S k j is done as in the simulation, i.e., validate the messages from A and unmask valid messages before forwarding to S k j , and mask messages from S k j using r corr i, j,k . Messages from identified corrupted parties are locally computed and sent to S k j . When S k j sends (early-output, sid, ·) requests, respond with the output value y j from y = (y 1 , . . . , y j ), where if y = ⊥, then forward the message to F pbb .

An Impossibility of FBB Round-Preserving Parallel Composition
In this section, we prove that for a natural class of protocols, following and/or extending in various ways the techniques from Ben-Or and El-Yaniv [5], there exist functions such that no protocol can compute their parallel composition in a round-preserving manner, while accessing the functions in a black-box way, tolerating even a single adversarial party. Although this is not a general impossibility result, it indicates that the batching approach of [5] is limited to semi-honest security (Sect. 5) and/or functionally white-box transformations.
We observe that this impossibility serves as an additional justification for the optimality of our protocol-black-box parallel composition (Sect. 6.1). Indeed, on the one hand, it formally confirms the generic observation that the naïve parallel composition of a set of PT functionalities does not preserve their round complexity. On the other hand, and most importantly, it proves that all existing techniques for composing PT functionalities in parallel in the natural (FBB) manner fail to preserve the round complexity. Hence, the only known round-preserving compositions for such functionalities are the protocol-black-box compiler presented in Sect. 6.1 and less efficient non-black-box techniques. The breadth of the class of protocols excluded by our impossibility result justifies our conjecture that there exists no round-preserving FBB protocol for parallel composition of PT functionalities. Proving this conjecture is, in our opinion, a very interesting research direction.
We first argue informally why the approach of [5] cannot be directly extended to privacy-sensitive functions. The idea in [5] for allowing each of the n parties to broadcast its value is to have each of the n parties participate in m = O(log n) parallel invocations (hereafter called batches, to avoid confusion with the goal of parallel broadcast for different messages) of broadcast as sender with the same input. Each of those batches is executed in parallel for a fixed (constant) number of rounds (for the same broadcast message); this increases the probability that sufficiently many parties receive output from each batch. At the end of each batch execution, the parties check whether they jointly hold the output, and if not, they repeat the computation of the batches. It might seem that this idea can be applied to arbitrary tasks, but this is not the case. The idea fails if the functionality has any privacy requirements, because the adversary can input different values on different calls of the functionality within a batch and thereby learn more information on the input.
Batched-parallel composition The above issue with privacy appears whenever a function is invoked twice in the same round on the same inputs from honest parties. Indeed, in this case the adversary can use different inputs to each invocation and learn information as sketched above. The same attack can be extended to composition protocols which invoke the function in two different rounds ρ and ρ′; as long as the adversary knows these rounds, he can still launch the above attack on privacy. Generalizing the claim even further, for specific classes of functions, it suffices that there are two (possibly different) functions which are evaluated on the same inputs in rounds ρ and ρ′. This excludes protocols that might attempt to avoid using some functionality W To capture the above generalization, we define the class of batched-parallel composition protocols: A protocol π implementing the PT parallel composition D, D 1 , . . . , D M ) is a batched-parallel composition protocol if it has the following structure. It proceeds in rounds, where in each round the protocol might initiate (possibly multiple) calls to any number of the hybrid functionalities W D j flex (F j ) and/or continue calls that were initiated in previous rounds. Furthermore, there exist two publicly known protocol rounds ρ and ρ′, and indices j, j′, ℓ ∈ [M], such that for the input vector ) the following properties are satisfied:
1. In round ρ the functionality W D j flex (F j ) is called on input x = (x 1 , . . . , x n ) and at least two of its rounds are executed. 9
2. In round ρ′ the functionality W D j′ flex (F j′ ) is also called on input x and at least two of its rounds are executed.
Note that the protocol in [5] (as well as our semi-honest protocol from Sect. 5) is an example of a batched-parallel composition protocol for ρ = ρ′ = 1 and for j = j′ = ℓ being the index of any one of the hybrid functionalities. Indeed, in the first round of these protocols, each functionality is invoked at least twice on the same inputs. In particular, protocols that follow this structure, e.g., even ones that do not call all functionalities in every phase, or those that have variable batch sizes, can be described as such batched-parallel composition protocols.
We next show that there are classes of functions C 1 , . . . , C M such that, for any protocol π that securely computes the parallel composition W D flex (F C 1 sfe · · · F C M sfe ) while given hybrid access to PT functionalities W D i flex (F C i sfe ), the following properties hold simultaneously:
1. π has to call each of the hybrids W D i flex (F C i sfe ) (for at least 2 rounds each). 10
2. The naïve solution of π calling each of the W D i flex (F C i sfe )'s in parallel until they terminate is not round-preserving (for an appropriate choice of D i 's).
3. π cannot be a batched-parallel composition protocol.
The above shows that the classes C 1 , . . . , C M not only exclude the existence of a batched-parallel composition protocol, but they also exclude all other known solutions. This implies that for these classes of functions, every known approach, and generalizations thereof, fails to compute the parallel composition of the corresponding functionality in an FBB and round-preserving manner.
1. The protocol that calls each W D i flex (F C i sfe ) in parallel (once) until it terminates is not round-preserving (i.e., its expected round complexity is asymptotically higher than that of the distributions D i ).

Any
has to make a meaningful call (i.e., a call that executes at least two rounds) to each PT hybrid

There exists no functionally black-box batched-parallel composition protocol for
))hybrid model, where D has (asymptotically) the same expectation as D 1 , . . . , D M .
9 Note that any call to a PT (i.e., wrapped) functionality that executes fewer than two rounds is useless, as it can be simulated by the protocol that does nothing (without, of course, increasing the round complexity). The reason is that, by definition of PT functionalities, any such call gives no output (the first round is an input-only round).
10 Note that this does not mean that π is not round preserving, as the calls might be in parallel.
Proof Let D 1 = · · · = D M be the geometric distribution with parameter 1/2. (This means that the expected round complexity of each W D i flex (F C i sfe ) is constant.) Then, Property 1 follows immediately from the observation in [5] (see also [16]), which implies that the expected round complexity of the naïve protocol that executes each functionality in parallel until its completion is Θ(log M), which is super-constant.
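The logarithmic growth can be checked numerically. The sketch below assumes M independent termination rounds, each geometric with parameter 1/2 on {1, 2, ...}, and evaluates the expected maximum through the tail-sum identity E[max] = Σ_{t≥0} Pr[max > t]; the function name and the truncation point are illustrative choices.

```python
import math


def expected_max_geometric_half(M, max_rounds=200):
    """Expected maximum of M independent geometric(1/2) variables on
    {1, 2, ...}: Pr[max > t] = 1 - (1 - 2**-t)**M, summed over t >= 0.
    The sum is truncated at max_rounds, where the tail is negligible
    for moderate M."""
    return sum(1.0 - (1.0 - 2.0 ** -t) ** M for t in range(max_rounds))


single = expected_max_geometric_half(1)        # one instance: mean 2
thousand = expected_max_geometric_half(1024)   # about log2(1024) plus a constant
```

A single instance has expectation 2, while the maximum over 1024 instances lies between log2(1024) = 10 and 12, consistent with Θ(log M) growth.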
We next turn to Property 2. Toward proving it we first prove the following useful lemma. Lemma 6.8. There exists a family of functions C = { f α } α∈{0,1} κ such that there exists no FBB protocol for computing the family of (oracle-aided) n-party SFE-functionalities {F f α } f α ∈C , which is secure against a semi-honest adversary corrupting any one of the n parties.
Proof Let f α be the function that takes inputs x 1 ∈ {0, 1} κ and x 2 ∈ {0, 1} κ from P 1 and P 2 , respectively (and a default empty string λ from every P j with j ∈ {3, . . . , n}) and outputs y i to each P i as follows:
The argument that there exists no FBB protocol for the above function family is inspired by [43, Theorem 1]. Concretely, assume toward contradiction that such a protocol π exists and consider the following experiment. Pick x 1 , x 2 , α uniformly and independently at random and run π f α with inputs x 1 and x 2 for P 1 and P 2 , respectively, and input λ for all other parties. Then, we argue that the following events have negligible probability of occurring (where the probability is taken over the choice of x 1 , x 2 , α and the random coins r = (r 1 , . . . , r n ) of the parties): (A) Any of the parties (i.e., any of the π i 's) queries its oracle f α with ( p, q, λ, . . . , λ) such that p ⊕ q = α. (B) Any of the parties queries its oracle f α with ( p, q, λ, . . . , λ) such that p ⊕ q = x 1 ⊕ x 2 .
Event (A) occurs with negligible probability because α is uniformly random.
The fact that (B) occurs with negligible probability follows from the protocol's privacy (and is, in fact, independent of the distribution of α). Indeed, suppose that the probability that P i makes a query ( p, q, λ, . . . , λ) such that p ⊕ q = x 1 ⊕ x 2 is noticeable (i.e., not negligible). Consider an adversary corrupting P i and outputting the list of values p ⊕ q, for each ( p, q, λ, . . . , λ) that P i makes to its oracle. By assumption, this list will include x 1 ⊕ x 2 with noticeable probability. However, in the ideal-world execution, the simulator, even if it knows α, will be unable to produce a list of values containing x 1 ⊕ x 2 with noticeable probability, since x 1 and x 2 are chosen uniformly at random, and a simulator corrupting a single party cannot learn both x 1 and x 2 . This implies that the above adversary cannot be simulated, which contradicts the protocol's privacy.
Let now R denote the set of all (r, x 1 , x 2 , α) such that in the above experiment, neither event (A) nor (B) occurs. From the above argument we know that Pr[(r, x 1 , x 2 , α) ∈ R] > 1 − ν(κ), for a negligible function ν.
Next, as in [43, Theorem 1], we consider the coupled experiment in which we use the same r, x 1 , x 2 as above, but run the protocol π f α * where α * = x 1 ⊕ x 2 . As in [43, Theorem 1], we can prove that this experiment proceeds identically to the original one (which, recall, differs only in the oracle calls); in particular, all oracle queries will be answered by 0 κ to all parties. The reason for this is that an oracle query ( p, q, λ, . . . , λ) is answered by a value other than 0 κ in the first experiment only if p ⊕ q = α and in the second experiment only if p ⊕ q = x 1 ⊕ x 2 , which for the combinations of (r, x 1 , x 2 , α) ∈ R does not occur. This in particular implies that the output vector y = (y 1 , . . . , y n ) that the parties P 1 , . . . , P n receive from the protocol will be identical in both experiments for (r, It follows that, since Pr[(r, , the distributions of y in the two experiments are at most negligibly far apart. However, since the protocol is required to output the correct value with overwhelming probability, in the first experiment (where
Proof Assume toward contradiction that there exists a protocol π computing W D flex (F C 1 sfe · · · F C M sfe ) such that for some i, protocol π might execute at most one round of W D i flex (F C i sfe ). We will prove that no such protocol can exist for a uniformly random choice of α = (α 1 , . . . , α M ). 12 This is sufficient, since by an averaging argument, it implies that there exists no such protocol that is secure for all choices of α, as required by the definition in [57]. (Indeed, if there exists a protocol that is secure for all choices of α (independent of α), there exists one that is secure for a randomly chosen α.) First, we observe that any call to a functionality W D j flex (F C j sfe ) that executes fewer than two rounds can be simulated by the protocol that does nothing during this execution (without, of course, increasing the round complexity).
The reason is that, by definition of flexibly wrapped CSFs (Sect. 2.2), any such call gives no output (the first round is an input-only round). Thus, we can assume without loss of generality that π makes no call to W D i flex (F C i sfe ). Next, observe that any protocol that computes . Indeed, one simply needs to invoke the protocol for W D flex (F C 1 sfe · · · F C M sfe ) and take its i'th output as the output of W D i flex (F C i sfe ). Hence, π can be trivially turned to a protocol for computing ). Note that since we assume that π is FBB, the parties are given access to all other functionalities , and can make oracle queries to all underlying functions Finally, we observe that the argument of Lemma 6.8 can be easily extended to the above scenario, where the aim is to compute {F f α i } f α i ∈C in an FBB manner, but where the parties have oracle access to the function f α i (as required by the definition of FBB protocols [57]) and, in addition, have oracle access to f α 1 , . . . , . . , f α M is independent of f α i and can be therefore trivially simulated by means of an information-theoretic MPC protocol (recall we only have one corruption) that implements a globally accessible oracle to f α 1 , . . . , (Since α is chosen uniformly at random their role in computing f α i can be emulated by choosing independent α j 's for j ∈ [M] \ {i}.) Thus, if there were a protocol which would compute {F f α i } f α i ∈C in the above hybrid world, then it can be trivially converted to a protocol which does not access the hybrids or the oracle calls to the functions other than f α i , which would contradict Lemma 6.8.
12 Consistently with [57], we assume that although α is chosen uniformly at random, it is known to the environment, the adversary and the simulator in the proof (in particular, we can assume that the environment chooses α to be uniformly random in each of the experiments and hands it to the adversary/simulator).
Observe that the above extension is independent of the round complexity, and adding a PT structure to the hybrids does not affect the impossibility.
We complete the proof of the theorem by proving Property 3 for the above choice of C 1 , . . . , C M and D 1 , . . . , D M .

Lemma 6.10. There exists no functionally black-box batched-parallel composition protocol for computing
)-hybrid model, tolerating a static adversary actively corrupting any one of the parties.
Proof As before, it suffices to prove that there exists no batched-parallel composition protocol π that is secure for computing W D flex (F )-hybrid model, for a uniformly random choice of α = (α 1 , . . . , α M ).
Assume toward contradiction that such a protocol π exists, which is secure against a malicious (i.e., active) adversary corrupting any one party, and assume without loss of generality that party P 1 is corrupted. Let (ρ, j), (ρ′, j′), and ℓ denote the values that are assumed to exist by the fact that π is a batched-parallel composition protocol and denote by W ) the corresponding functionalities indexed by j and j′. We will denote by x 1 = (x 1 1 , . . . , x M 1 ) ∈ ({0, 1} κ ) M the input of P 1 and by . Consider an environment that chooses all inputs to the parties uniformly at random but hands its adversary the first κ − 2 bits of the input x 2 . (Recall that the batched-parallel composition ( y 1 , . . . , y n ), where each y i = (y 1 i , . . . , y M i ) is the output of P i . Because all inputs are independently and uniformly distributed, the simulator gains no information on the missing (i.e., last) two bits of x 2 , neither by using its knowledge of the α , nor by the inputs and outputs of any of the parallelly composed W D flex (F other than its ℓ'th output. In other words, the only way that the simulator might learn additional information on the missing two last bits of the input x 2 of the honest party P 2 is from the corrupted P 1 's ℓ'th output y 1 of the ideal functionality W D flex (F
In the analysis below we use the following notation. Given a string x = (x 1 , . . . , x m ) ∈ {0, 1} m , denote by x[i, . . . , j], for i < j, the substring (x i , . . . , x j ). Consider the following cases for the input x 1 that S hands to the ℓ'th functionality in . In this case, by correctness of π , the ℓ'th output y 1 equals 0 κ independently of the last two bits of x 1 . Hence, in this case the simulator is able to output the last two bits of x 2 with probability 1/4 (i.e., the best he can do is guess). [1, . . . , κ − 2].
In this case, we consider the following event
- If E 1 occurs, then the simulator will see that the output y 1 ≠ 0 κ , so it can output as its guess for the last two bits of x 2 , which will always (with probability 1) be the correct guess (by definition of the function). Note that Pr[E 1 ] = 1/4 since the simulator knows α (by the definition of FBB [57]) and has no information on x 2 [κ − 1, κ].
- If E 1 does not occur, then S will see that y 1 = 0 κ , from which it can deduce that , but gets no more information on x 2 [κ − 1, κ]. Hence, the probability of outputting x 2 [κ − 1, κ] in this case is at most 1/3 (i.e., the probability of guessing among the 2-bit strings that are not equal to ).
Hence, the total probability that a simulator outputs a correct guess for x 2 [κ − 1, κ] is at most 1/4 · 1 + 3/4 · 1/3 = 1/2.
To complete the proof, we will describe an adversary who outputs the two last bits of x 2 , i.e., x 2 [κ − 1, κ], with probability noticeably higher than 1/2; this implies a noticeable distinguishing advantage between the real world and the ideal world.
The adversary chooses two different random two-bit strings b, b′ ∈ {0, 1} 2 and
Once the adversary receives P 1 's outputs from the above two functionalities (denote them as ŷ 1, j and ŷ 1, j′ ) 13 he does the following: If ŷ 1, j ≠ 0 κ or ŷ 1, j′ ≠ 0 κ (an event that happens with probability 1/2, since there are four possible two-bit strings and exactly one of them makes the output ≠ 0 κ ) then the adversary outputs b ⊕ α j [κ − 1, κ] (or b′ ⊕ α j′ [κ − 1, κ], respectively) as his guess of the last two bits of x 2 . By correctness of the protocol, except for negligible probability the guess will be correct in this case. Otherwise, the adversary outputs a random string from T = {00, 01, 10, 11} \ {b, b′}; the probability of outputting a correct guess in this case is 1/2 since it has to be one of the two strings in T . Hence, the overall probability that this adversary outputs the right guess for the last two bits of x 2 is 1/2 · 1 + 1/2 · 1/2 − ν = 3/4 − ν, where ν is a negligible function implied by the error probabilities in the above protocol. Hence, the output of the adversary is distinguishable from the output of the best simulator, which contradicts the assumed security of π.
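The gap between the two guessing probabilities can be verified by exhaustive enumeration. The sketch below abstracts the attack to its combinatorial core, which is an assumption of this illustration: a uniformly random two-bit secret, an adversary that gets to test two distinct candidates and guesses uniformly among the remaining two otherwise, and a simulator that gets to test only one candidate. Negligible error terms are ignored, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

TWO_BIT_STRINGS = ["00", "01", "10", "11"]


def adversary_success():
    """Average over the secret and over ordered pairs of distinct test
    strings (b, b'): a hit reveals the secret; otherwise the adversary
    guesses among the two untested strings."""
    pairs = list(permutations(TWO_BIT_STRINGS, 2))
    total = Fraction(0)
    for secret in TWO_BIT_STRINGS:
        for b, b_prime in pairs:
            if secret in (b, b_prime):
                total += 1
            else:
                total += Fraction(1, 2)
    return total / (len(TWO_BIT_STRINGS) * len(pairs))


def simulator_success():
    """A single test hits with probability 1/4; otherwise the simulator
    guesses among the three remaining strings."""
    p_hit = Fraction(1, 4)
    return p_hit * 1 + (1 - p_hit) * Fraction(1, 3)


gap = adversary_success() - simulator_success()
```

Under these assumptions the adversary succeeds with probability 3/4 and the simulator with probability 1/2, so the distinguishing gap is 1/4 up to negligible terms.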
This completes the proof of Theorem 6.7.

ECSS can be constructed information-theoretically, with a negligible positive error probability, when t < n/2 [12,22,56].

A.2. Information-Theoretic Signatures
Parts of the following section are taken almost verbatim from [42].
13 Recall that, by definition of probabilistic-termination SFE, the adversary is always able to learn the output in the second round.

P -verifiable Information-Theoretic Signatures
We recall the definition and construction of information-theoretic signatures [58,59] but slightly modify the terminology to what we consider to be more intuitive. The signature scheme (in particular the key-generation algorithm) needs to know the total number of verifiers or, alternatively, the list P of their identities. Furthermore, as usual for information-theoretic primitives, the key length needs to be proportional to the number of times that the key is used. Therefore, the scheme is parameterized by two natural numbers S and V, which are upper bounds on the number of signatures that can be generated and verified, respectively, without violating security.
A P-verifiable signature scheme consists of a triple of randomized algorithms (Gen, Sign, Ver), where: · (consistency) 14
In [59,60] a signature scheme satisfying the above notion of security was constructed. These signatures have a deterministic signature generation algorithm Sign. In the following (Fig. 6) we describe the construction from [59] (as described by [60] but for a single signer). We point out that the keys and signatures in the described scheme are elements of a sufficiently large finite field F (i.e., |F| = O(2^poly(κ))); one can easily derive a scheme for strings of length poly(κ) by applying an appropriate encoding: for example, map the i'th element of F to the i'th string (in the lexicographic order) and vice versa. We say that a value σ is a valid signature on message m (with respect to a given key setup (sk, vk)), if for every honest P i it holds that Ver(m, σ, vk i ) = 1.

B. Synchronous Protocols in UC (Cont'd)
In this section, we give complementary material to Sect. 2.1; in particular, we include a high-level overview of the formulation of synchronous UC from [46]. More concretely, Katz et al. [46] introduced a framework for universally composable synchronous computation. For self-containment, we describe here the basics of the model and introduce some terminology that simplifies the description of corresponding functionalities.
Synchronous protocols can be cast as UC protocols which have access to a special clock functionality F clock , which allows them to coordinate round switches as described below, and communicate over bounded-delay channels. 15 In a nutshell, the clock functionality works as follows: It stores a bit b which is initially set to 0 and it accepts from each party two types of messages: clock-update and clock-read. The response to clock-read is the value of the bit b to the requestor. Each clock-update is forwarded to the adversary, but it is also recorded, and upon receiving such a clock-update message from all honest parties, the clock functionality updates b to b ⊕ 1. It then keeps working as above, until it receives again a clock-update message from all honest parties, in which case it resets b to b ⊕ 1 and so on.
Such a clock can be used as follows to ensure that honest parties remain synchronized, i.e., no honest party proceeds to the next round before all (honest) parties have finished the current round: Every party stores a local variable where it keeps (its view of) the current value of the clock indicator b. At the beginning of the protocol execution this variable is 0 for all parties. In every round, every party uses all its activations (i.e., messages it receives) to complete all its current-round instructions, and only then sends clock-update to the clock, signaling that it has completed its round; following clock-update, all future activations result in the party sending clock-read to the clock until its bit b is flipped; once the party observes that the bit b has flipped, it starts its next round. For the sake of clarity, we do not explicitly mention F clock in our constructions.
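The behavior of the clock described above can be sketched as a small state machine. This is a simplified model, not the functionality from [46]: the set of honest parties is fixed up front, corruption is not modeled, and the class and attribute names (`Clock`, `adversary_log`) are hypothetical.

```python
class Clock:
    """Toy model of the clock: a bit that flips once every honest party
    has sent clock-update since the last flip."""

    def __init__(self, honest_parties):
        self.honest = set(honest_parties)
        self.bit = 0
        self.updated = set()      # honest parties that updated this round
        self.adversary_log = []   # every clock-update is forwarded

    def clock_read(self):
        """Return the current value of the bit b to the requestor."""
        return self.bit

    def clock_update(self, party):
        """Record the update, forward it to the adversary, and flip the
        bit once all honest parties have updated."""
        self.adversary_log.append(("clock-update", party))
        if party in self.honest:
            self.updated.add(party)
        if self.updated == self.honest:
            self.bit ^= 1
            self.updated.clear()


clock = Clock({"P1", "P2", "P3"})
clock.clock_update("P1")
clock.clock_update("P2")
before_last_update = clock.clock_read()   # still 0: P3 has not finished
clock.clock_update("P3")
after_last_update = clock.clock_read()    # flipped to 1
```

A party that has sent its update would poll `clock_read` until the value differs from its locally stored copy, and only then start its next round.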
In [46], for each message that is to be sent in the protocol, the sender and the receiver are given access to an independent single-use channel. 16 We point out, that instead of the bounded-delay channels, in this work we will assume very simple CSFs that take as input from the sender the message he wishes to send (and a default input from other parties) and deliver the output to the receiver in a fetch mode. Such a simple secure-channel SFE can be realized in a straightforward manner from bounded-delay channels and a clock F clock .
As is common in the synchronous protocols literature, throughout this work we will assume that protocols have the following structure: In each round every party sends/receives a (potentially empty) message to all parties and hybrid functionalities. Such a protocol can be described in UC in a regular form using the methodology from [46] as follows: Let μ ∈ N denote the maximum number of messages that any party P i might send to all its hybrids during some round. 17 Every party in the protocol uses exactly μ activations in each round. That is, once a party P i observes that the round has changed, i.e., the indicator-bit b of the clock has been flipped, P i starts its next round as described above. However, this round finishes only after P i receives μ additional activations. Note that P i uses these activations to execute his current round instructions; since μ is a bound on the number of hybrids used in any round by any party, μ activations are enough for the party to complete its round. (If P i finishes the round early, i.e., in fewer than μ activations, it simply does nothing until the μ activations are received.) Once μ activations are received in the current round, P i sends clock-update to the clock and then keeps sending clock-read messages, as described above, until it observes a flip of b indicating that P i can go to the next round.
15 As argued in [46], bounded-delay channels are essential as they allow parties to detect whether or not a message was sent within a round.
16 As pointed out in [46], an alternative approach would be to have a multi-use communication channel; as modeling the actual communication network is out of the scope of the current work, we will use the more standard and formally treated model of single-use channels from [46].
In addition to the regular form of protocol execution, Katz et al. [46] described a way of capturing in UC the property that a protocol is guaranteed to terminate in a given number of rounds. The idea is that a synchronous protocol in regular form, which terminates after $r$ rounds, realizes the following functionality $F$. The functionality $F$ keeps track of the number of times every honest party sends $\mu$ activations/messages and delivers output as soon as this has happened $r$ times. More concretely, imitating an $r$-round synchronous protocol with $\mu$ activations per party per round, upon being instantiated, $F$ initializes a global round counter $\tau := 0$ and an indicator variable $\tau_i := 0$ for each $P_i \in \mathcal{P}$; as soon as some party $P_i$ sends $\mu$ messages to $F$ while the round counter $\tau$ is unchanged, $F$ sets $\tau_i := 1$ and performs the following check:$^{18}$ if $\tau_i = 1$ for every honest $P_i$, then it increases $\tau := \tau + 1$ and resets $\tau_i := 0$ for all $P_i \in \mathcal{P}$. As soon as $\tau = r$, the functionality $F$ enters a "delivery" mode. In this mode, whenever a message fetch-output is received from some party $P_i$, $F$ outputs to $P_i$ its output. (If $F$ has no output for $P_i$, it outputs $\bot$.) We refer to a functionality that has the above structure, i.e., which keeps track of the current round $\tau$ by counting how many times every honest party has sent a certain number $\mu$ of messages, as a synchronous functionality. To simplify the description of our functionalities, we introduce the following terminology. We say that a synchronous functionality $F$ is in round $\rho$ if the current value of the above internal counter in $F$ is $\tau = \rho$.
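The round-counting logic described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the functionality of [46]: the class name `SyncRoundCounter`, its method `receive`, and the per-party message tally are our own hypothetical choices, and adversarial notification and output delivery are omitted.

```python
class SyncRoundCounter:
    """Illustrative round bookkeeping of a synchronous functionality (names are hypothetical)."""

    def __init__(self, parties, honest, mu, r):
        self.honest = set(honest)
        self.mu = mu                              # activations per party per round
        self.r = r                                # rounds before the delivery mode
        self.tau = 0                              # global round counter
        self.msgs = {p: 0 for p in parties}       # messages seen in the current round
        self.done = {p: 0 for p in parties}       # the indicators tau_i

    def receive(self, party):
        """Register one message from `party`; return True once in delivery mode."""
        if self.tau >= self.r:
            return True
        self.msgs[party] += 1
        if self.msgs[party] >= self.mu:
            self.done[party] = 1
        # Advance the round once every honest party has sent mu messages.
        if all(self.done[p] == 1 for p in self.honest):
            self.tau += 1
            for p in self.msgs:
                self.msgs[p] = 0
                self.done[p] = 0
        return self.tau >= self.r
```

In this sketch, a corrupted party's messages never block the round from advancing, mirroring the fact that the check is performed over honest parties only.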
We note that protocols in the synchronous model of [46] enjoy the strong composition properties of the UC framework. However, in order to have protocols executed in a lock-step mode, i.e., where all protocols complete their round within the same clock tick, Katz et al. [46] make use of composition with joint state (JUC) [9]. The idea is that the parties use an $F_{\mathrm{clock}}$-hybrid protocol $\hat{\pi}$ that emulates sub-clocks toward each of the protocols and assigns to each sub-clock a unique sub-session ID (ssid). Each of these sub-clocks is local to its calling protocol, but $\hat{\pi}$ makes sure that it gives a clock-update to the actual (joint) clock functionality $F_{\mathrm{clock}}$ only when all sub-clocks have received such a clock-update message. This ensures that all clocks switch their internal bits at the same time as the joint clock, which means that the protocols using them will be mutually synchronized. This property can be formally proved by a direct application of the JUC theorem. For further details the interested reader is referred to [9,46].
$^{17}$ In the simple case where the parties only use point-to-point channels, $\mu = 2(n - 1)$, since each party uses $n - 1$ channels as sender and $n - 1$ as receiver to exchange its messages for each round with all other parties.
$^{18}$ To make sure that the simulator can keep track of the round index, $F$ notifies $S$ about each received input, unless it has reached its delivery state defined below.
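As a rough illustration of the sub-clock idea, the following Python sketch models a single party's view: clock-update messages are collected per ssid, and a clock-update is forwarded to the joint clock only once every sub-clock has received one. The class `SubClockMux` and the callback `send_joint_update` are hypothetical names introduced here; the actual protocol and its proof via the JUC theorem are given in [9,46].

```python
class SubClockMux:
    """One party's emulation of several sub-clocks over a single joint clock (illustrative)."""

    def __init__(self, ssids, send_joint_update):
        self.ssids = set(ssids)
        self.updated = set()                      # sub-clocks that already got clock-update
        self.send_joint_update = send_joint_update

    def clock_update(self, ssid):
        if ssid not in self.ssids:
            raise KeyError(ssid)
        self.updated.add(ssid)
        # Forward to the joint clock only when all sub-clocks were updated.
        if self.updated == self.ssids:
            self.updated.clear()
            self.send_joint_update()
```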

C. The Probabilistic-Termination Framework (Cont'd)
In this section, we provide supplementary material for Sect. 2.2.

C.1. Canonical Synchronous Functionalities
The description of the canonical synchronous functionality (CSF) is given in Fig. 7. As a generalization of the SFE functionality, CSFs are powerful enough to capture any deterministic well-formed functionality. In fact, all the basic (unwrapped) functionalities considered in this work will be CSFs. The functionality $F_{\mathrm{csf}}$ is parametrized by a (possibly) randomized function $f$ that receives $n + 1$ inputs ($n$ inputs from the parties and one additional input from the adversary) and a leakage function $l$ that determines what information about the input values is leaked to the adversary. In the first (input) round, all the parties hand $F_{\mathrm{csf}}$ their input values, and in the second (output) round, each party receives its output. Whenever some input is submitted to $F_{\mathrm{csf}}$, the adversary is given some leakage function of this input; the adversary can use this leakage for deciding which parties to corrupt and which input values to use for corrupted parties. For example, in a broadcast protocol such as [25] the adversary may decide to adaptively corrupt the broadcaster and replace its message based on the leakage received. Additionally, the adversary is allowed to input an extra message that for some functionalities (e.g., Byzantine agreement) might affect the output.
We now describe a few standard functionalities from the MPC literature as CSFs; we refer the reader to [16] for additional examples.
- Secure Message Transmission (aka Secure Channel). In the secure message transmission (SMT) functionality, a sender $P_i$ with input $x_i$ sends its input to a receiver $P_j$. The function to compute is $f_{\mathrm{smt}}^{i,j}(x_1, \ldots, x_n, a) = (\lambda, \ldots, x_i, \ldots, \lambda)$ (where $x_i$ is the value of the $j$'th coordinate) and the leakage function is $l_{\mathrm{smt}}^{i,j}(x_1, \ldots, x_n) = y$, where $y = |x_i|$ in case $P_j$ is honest and $y = x_i$ in case $P_j$ is corrupted. We denote by $F_{\mathrm{smt}}^{i,j}$ the functionality $F_{\mathrm{csf}}$ when parametrized with the above functions $f_{\mathrm{smt}}^{i,j}$ and $l_{\mathrm{smt}}^{i,j}$, for sender $P_i$ and receiver $P_j$.
- Broadcast. In the (standard) broadcast functionality, a sender $P_i$ with input $x_i$ distributes its input to all the parties, i.e., the function to compute is $f_{\mathrm{bc}}^{i}(x_1, \ldots, x_n, a) = (x_i, \ldots, x_i)$. The adversary only learns the length of the message $x_i$ before its distribution, i.e., the leakage function is $l_{\mathrm{bc}}^{i}(x_1, \ldots, x_n) = |x_i|$. This means that the adversary does not gain new information about the input of an honest sender before the output value for all the parties is determined, and in particular, the adversary cannot corrupt an honest sender and change its input after learning the input message. We denote by $F_{\mathrm{bc}}^{i}$ the functionality $F_{\mathrm{csf}}$ when parametrized with the above functions $f_{\mathrm{bc}}^{i}$ and $l_{\mathrm{bc}}^{i}$, for sender $P_i$.
- Secure Function Evaluation. In the secure function evaluation functionality, the parties compute a randomized function $g(x_1, \ldots, x_n)$, i.e., the function to compute is $f_{\mathrm{sfe}}^{g}(x_1, \ldots, x_n, a) = g(x_1, \ldots, x_n)$. The adversary learns the length of the input values via the leakage function, i.e., the leakage function is $l_{\mathrm{sfe}}(x_1, \ldots, x_n) = (|x_1|, \ldots, |x_n|)$. We denote by $F_{\mathrm{sfe}}^{g}$ the functionality $F_{\mathrm{csf}}$ when parametrized with the above functions $f_{\mathrm{sfe}}^{g}$ and $l_{\mathrm{sfe}}$, for computing the $n$-party function $g$.
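The three parametrizations above can be written down as plain functions. The following Python sketch is only meant to make the input/output shapes concrete: party indices are 0-based, the empty output $\lambda$ is modeled by the empty string, inputs are strings, the set of corrupted parties is passed explicitly to the SMT leakage function, and the broadcast function is taken to hand $x_i$ to every party, as the prose describes. All function names are ours.

```python
LAMBDA = ""  # stands in for the empty output (lambda in the text)

def f_smt(i, j, xs, a=None):
    """SMT: receiver j obtains x_i; every other coordinate is empty."""
    out = [LAMBDA] * len(xs)
    out[j] = xs[i]
    return tuple(out)

def l_smt(i, j, xs, corrupted=frozenset()):
    """SMT leakage: the message itself if the receiver is corrupted, else its length."""
    return xs[i] if j in corrupted else len(xs[i])

def f_bc(i, xs, a=None):
    """Broadcast: every party obtains the sender's input x_i."""
    return tuple(xs[i] for _ in xs)

def l_bc(i, xs):
    """Broadcast leakage: only the length of x_i."""
    return len(xs[i])

def f_sfe(g, xs, a=None):
    """SFE: evaluate the n-party function g on the parties' inputs."""
    return g(*xs)

def l_sfe(xs):
    """SFE leakage: the lengths of all inputs."""
    return tuple(len(x) for x in xs)
```

Note that the adversarial input `a` is accepted but ignored by all three functions, in line with the definitions above.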

C.2. Reactive CSFs
We proceed to extend the notion of CSF to reactive CSFs (RCSFs), i.e., CSFs with multiple input/output phases. Correspondingly, a reactive CSF is parametrized by two vectors of functions $f = (f_1, \ldots, f_q)$ and $l = (l_1, \ldots, l_q)$. The description of reactive CSFs can be found in Fig. 8.
Let $F_{\mathrm{rcsf}}$ be a reactive CSF, with $f = (f_1, \ldots, f_q)$ and $l = (l_1, \ldots, l_q)$, let $t < n/2$, and let $(\mathsf{Share}, \mathsf{Recon})$ be a $(t, n)$ error-correcting secret-sharing scheme. For every $k \in [q]$, denote by $\tilde{f}_k$ the function that on inputs $(\tilde{x}_1, \ldots, \tilde{x}_n, a)$, with $\tilde{x}_i = (x_i, s_i)$, first reconstructs the state $s = \mathsf{Recon}(s_1, \ldots, s_n)$, next samples random coins $r$ and computes $(y_1, \ldots, y_n) = f_k(s, x_1, \ldots, x_n, a, r)$, and finally shares the new state $s' = (s, x_1, \ldots, x_n, a, r)$ as $(s'_1, \ldots, s'_n) \leftarrow \mathsf{Share}(s')$ and [...], for a vector of distributions $D = (D_1, \ldots, D_q)$, if $\pi$ consists of $q$ sub-protocols $(\pi_1, \ldots, \pi_q)$, such that for every $k \in [q]$, sub-protocol $\pi_k$ UC-realizes $W_{\mathrm{strict}}^{D_k}(F_{\mathrm{csf}}^{\tilde{f}_k, l_k})$. In addition, each party $P_i$ in $\pi$ keeps a value $s_i$, initially set to $\bot$, that is used as the second input for each sub-protocol. Upon completing the execution of each sub-protocol, party $P_i$ updates $s_i$ to be the second output value received.

C.3. Execution Traces
As discussed in Sect. 2.2, a computation with probabilistic termination is modeled in [16] by augmenting CSFs that capture the actual computational task with output-round randomizing wrappers that capture the round structure of the protocol realizing the task. The underlying idea is to ensure that an environment, which can always observe how many rounds the execution of a protocol takes, will see an indistinguishable round structure in the corresponding ideal computation. Before formally describing the wrappers in "Appendix C.4", we illustrate the notion of execution traces, which is central to enabling the simulator to emulate the round structure.
Recall that the randomizing wrappers are parametrized by a round sampler $D$ that may depend on a specific protocol implementing the functionality. The round sampler $D$ samples a round number $\rho_{\mathrm{term}}$ by which all parties are guaranteed to receive their outputs, no matter what the adversary's strategy is. The strict wrapper $W_{\mathrm{strict}}$ ensures that all (honest) parties terminate together in round $\rho_{\mathrm{term}}$, whereas the flexible wrapper $W_{\mathrm{flex}}$ allows the adversary to deliver outputs to individual parties at any time before round $\rho_{\mathrm{term}}$.
Consider an arbitrary functionality F that is realized by some protocol π . If F is to provide guaranteed termination (whether probabilistic or not), it must enforce an upper bound on the number of rounds that elapse until all parties receive their outputs. If the termination round of π is not fixed (but may depend on random choices made during its execution), this upper bound must be chosen according to the distribution induced by π .
Thus, in order to simulate correctly, the functionality $F$ and $\pi$'s simulator $S$ must coordinate the termination round, and therefore, $F$ must pass the upper bound it samples to $S$. However, it is not sufficient to simply inform the simulator about the guaranteed-termination upper bound $\rho_{\mathrm{term}}$. Intuitively, the reason is that protocol $\pi$ may make probabilistic choices as to the order in which it calls its hybrids (and, even worse, these hybrids may even have probabilistic termination themselves). Thus, $F$ needs to sample the upper bound based on $\pi$ and the protocols realizing the hybrids called by $\pi$. As $S$ needs to emulate the entire protocol execution, it is now left with the task of trying to sample the protocol's choices conditioned on the upper bound it receives from $F$. In general, however, it is unclear whether such a reverse sampling can be performed in (strict) polynomial time.
To avoid this issue and allow for an efficient simulation, we have $F$ output to $S$ all the coins that were used for sampling the round $\rho_{\mathrm{term}}$. Because $S$ knows the round-sampler algorithm, it can reproduce the entire computation of the sampler and use it in its simulation. In fact, as we discuss below, it suffices for our proofs to have $F$ output a trace of its choices to the simulator instead of all the coins that were used to sample this trace.
Execution traces
As mentioned above, in the synchronous communication model, the execution of the ideal functionality must take the same number of rounds as the protocol. For example, suppose that the functionality $F$ in our illustration above is used as a hybrid by a higher-level protocol $\pi'$. The functionality $G$ realized by $\pi'$ must, similarly to $F$, choose an upper bound on the number of rounds that elapse before parties obtain their outputs. However, this upper bound now depends not only on $\pi'$ itself but also on $\pi$ (in particular, when $\pi$ is a probabilistic-termination protocol).
Given the above, the round sampler of a functionality needs to keep track of how the functionality was realized. This can be achieved via the notion of a trace. A trace basically records which hybrids were called by a protocol, and in a recursive way, for each hybrid, which hybrids would have been called by a protocol realizing that hybrid. The recursion ends with the hybrids that are "assumed" by the model, called atomic functionalities (in this paper atomic functionalities are the secure point-to-point communication functionality and the correlated-randomness functionality for broadcast).
Building on our running illustration above, suppose protocol $\pi'$ (realizing $G$) makes ideal hybrid calls to $F$ and to some atomic functionality $H$. Assume that in an example execution, $\pi'$ happens to make (sequential) calls to instances of $H$ and $F$ in the following order: $F$, then $H$, and finally $F$ again. Moreover, assume that $F$ is replaced by protocol $\pi$ (realizing $F$) and that $\pi$ happens to make two (sequential) calls to $H$ upon the first invocation by $\pi'$, and three (sequential) calls to $H$ the second time. (We assume that both $\pi$ and $\pi'$ call exactly one hybrid in every round.) Then, this would result in the trace depicted in Fig. 9.
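The example execution just described can be represented as a small labeled tree. The Python sketch below uses a hypothetical `Trace` class of our own (it is not taken from [16]) with root $G$, children $F$, $H$, $F$ in call order, and the two $F$ nodes expanded into two and three calls to $H$, respectively.

```python
from dataclasses import dataclass, field

@dataclass
class Trace:
    """A rooted, ordered tree whose nodes are labeled by functionality names."""
    label: str
    children: list = field(default_factory=list)

    def leaves(self):
        """Return the labels of the leaves in left-to-right (call) order."""
        if not self.children:
            return [self.label]
        return [x for child in self.children for x in child.leaves()]

def example_trace():
    first_f = Trace("F", [Trace("H"), Trace("H")])
    second_f = Trace("F", [Trace("H"), Trace("H"), Trace("H")])
    return Trace("G", [first_f, Trace("H"), second_f])
```

In this example the trace has six leaves, all labeled $H$, i.e., six sequential calls to the atomic functionality.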
Assume that $\pi$ is a probabilistic-termination protocol and $\pi'$ a deterministic-termination protocol. Consequently, $F$ is in fact a flexibly wrapped functionality of some CSF $F'$, i.e., $F = W_{\mathrm{flex}}^{D_F}(F')$, where the distribution $D_F$ samples (from a distribution induced by $\pi$) depth-1 traces with root $W_{\mathrm{flex}}^{D_F}(F')$ and leaves $H$.$^{19}$ Similarly, $G$ is a strictly wrapped functionality of some CSF $G'$, i.e., $G = W_{\mathrm{strict}}^{D_G}(G')$, where the distribution $D_G$ first samples (from a distribution induced by $\pi'$) a depth-1 [...] Fig. 9 would look as in Fig. 10.
$^{19}$ Note that the root node of the trace sampled from $D_F$ is merely labeled by $W_{\mathrm{flex}}^{D_F}(F')$, i.e., this is not a circular definition.

C.4. Strict and Flexible Wrappers
This section contains the definitions of the strict wrapper $W_{\mathrm{strict}}$ and of the flexible wrapper $W_{\mathrm{flex}}$.

Strict-wrapper functionality
The strict-wrapper functionality, formally defined in Fig. 11, is parametrized by (a sampler that induces) a distribution $D$ over traces, and internally runs a copy of a CSF functionality $F$. Initially, a trace $T$ is sampled from $D$; this trace is given to the adversary once the first honest party provides its input. The trace $T$ is used by the wrapper to define the termination round $\rho_{\mathrm{term}} \leftarrow c_{\mathrm{tr}}(T)$. In the first round, the wrapper forwards all the messages from the parties and the adversary to (and from) the functionality $F$. Next, the wrapper essentially waits until round $\rho_{\mathrm{term}}$, with the exception that the adversary is allowed to send (adv-input, sid, ·) messages and change its input to the function computed by the CSF. Finally, when round $\rho_{\mathrm{term}}$ arrives, the wrapper provides the output generated by $F$ to all parties.

Flexible-wrapper functionality
The flexible-wrapper functionality, defined in Fig. 12, follows along similar lines to the strict wrapper. The difference is that the adversary is allowed to instruct the wrapper to deliver the output to each party at any round. In order to accomplish this, the wrapper assigns a termination indicator $\mathrm{term}_i$, initially set to 0, to each party. Once the wrapper receives an (early-output, sid, ·) request from the adversary for $P_i$, it sets $\mathrm{term}_i \leftarrow 1$. Now, when a party $P_i$ sends a (fetch-output, sid) request, the wrapper checks if $\mathrm{term}_i = 1$, and lets the party receive its output in this case (by forwarding the (fetch-output, sid) request to $F$). When the guaranteed-termination round $\rho_{\mathrm{term}}$ arrives, the wrapper provides the output to all parties that did not receive it yet.
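The output-release rule of the flexible wrapper can be illustrated with the following Python sketch. It models only the termination indicators and the guaranteed-termination round; input handling, leakage, and the trace handed to the adversary are left out, and the class and method names are hypothetical.

```python
class FlexibleWrapperSketch:
    """Output-release logic only; the wrapped CSF is abstracted by fixed outputs."""

    def __init__(self, outputs, rho_term):
        self.outputs = dict(outputs)              # party -> output of the wrapped CSF
        self.rho_term = rho_term                  # guaranteed-termination round
        self.term = {p: 0 for p in outputs}       # termination indicators term_i
        self.round = 0

    def advance_round(self):
        self.round += 1

    def early_output(self, party):
        """Adversary's (early-output, sid, P_i) request."""
        self.term[party] = 1

    def fetch_output(self, party):
        """Party's (fetch-output, sid) request; None means no output yet."""
        if self.term[party] == 1 or self.round >= self.rho_term:
            self.term[party] = 1
            return self.outputs[party]
        return None
```

A strict wrapper corresponds to the special case in which `early_output` is never available, so that all parties obtain their output exactly when the round counter reaches `rho_term`.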

C.5. Slack-Tolerant Wrappers
Slack-tolerant strict wrapper
The slack-tolerant strict wrapper $W_{\text{sl-strict}}^{D,c}$, formally defined in Fig. 13, is parametrized by an integer $c \ge 0$, which denotes the amount of slack tolerance that is added, and a distribution $D$ over traces. The wrapper $W_{\text{sl-strict}}$ is similar to $W_{\mathrm{strict}}$ but allows parties to provide input within a window of $2c + 1$ rounds and ensures that they obtain output with the same slack they started with. The wrapper essentially increases the termination round by a factor of $B_c = 3c + 1$, which is due to the slack-tolerance technique used to implement the wrapped version of the atomic parallel SMT functionality (see [16]).

Slack-tolerant flexible wrapper
The slack-tolerant flexible wrapper $W_{\text{sl-flex}}^{D,c}$, formally defined in Fig. 14, is parametrized by an integer $c \ge 0$, which denotes the amount of slack tolerance that is added, and a distribution $D$ over traces. The wrapper $W_{\text{sl-flex}}$ is similar to $W_{\mathrm{flex}}$ but allows parties to provide input within a window of $2c + 1$ rounds and ensures that all honest parties will receive their output within two consecutive rounds. The wrapper essentially increases the termination round to $\rho_{\mathrm{term}} = B_c \cdot c_{\mathrm{tr}}(T) + 2 \cdot \mathrm{flex}_{\mathrm{tr}}(T) + c$, where the blow-up factor $B_c$ is as explained above, the additional factor of 2 results from the termination protocol of Bracha [7] used for every flexibly wrapped CSF, which increases the round complexity by at most two additional rounds (recall that $\mathrm{flex}_{\mathrm{tr}}(T)$ denotes the number of such CSFs), and $c$ is due to the potential slack. $W_{\text{sl-flex}}$ allows the adversary to deliver output at any round prior to $\rho_{\mathrm{term}}$ but ensures that all parties obtain output with a slack of at most one round. Moreover, it allows the adversary to obtain the output using the (get-output, sid) command, which is necessary in order to simulate the termination protocol.
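For concreteness, the termination-round arithmetic of the two slack-tolerant wrappers can be evaluated as follows. The sketch takes the values $c_{\mathrm{tr}}(T)$ and $\mathrm{flex}_{\mathrm{tr}}(T)$ as plain integers (their definitions are given in Sect. 2.2), and the function names are ours.

```python
def blow_up(c):
    """Blow-up factor B_c = 3c + 1 for slack parameter c >= 0."""
    return 3 * c + 1

def rho_term_sl_strict(c_tr, c):
    """Termination round of the slack-tolerant strict wrapper: B_c * c_tr(T)."""
    return blow_up(c) * c_tr

def rho_term_sl_flex(c_tr, flex_tr, c):
    """Termination round of the slack-tolerant flexible wrapper:
    B_c * c_tr(T) + 2 * flex_tr(T) + c."""
    return blow_up(c) * c_tr + 2 * flex_tr + c
```

For example, with $c = 1$, $c_{\mathrm{tr}}(T) = 6$ and $\mathrm{flex}_{\mathrm{tr}}(T) = 2$, the flexible variant gives $4 \cdot 6 + 2 \cdot 2 + 1 = 29$ rounds, whereas with $c = 0$ the blow-up factor is 1 and only the $2 \cdot \mathrm{flex}_{\mathrm{tr}}(T)$ term is added.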

C.6. Compilers and Composition Theorems
Deterministic-termination compiler
Let $F, F_1, \ldots, F_m$ be canonical synchronous functionalities, and let $\pi$ be an SNF protocol that UC-realizes the strictly wrapped functionality $W_{\mathrm{strict}}^{D}(F)$, for some depth-1 distribution $D$, in the $(F_1, \ldots, F_m)$-hybrid model, assuming that all honest parties receive their inputs at the same round. The compiler $\mathrm{Comp}_{\mathrm{dt}}^{c}$, parametrized with a slack parameter $c \ge 0$, receives as input the protocol $\pi$ and distributions $D_1, \ldots, D_m$ over traces and replaces every call to a CSF $F_i$ with a call to the wrapped CSF $W_{\text{sl-strict}}^{D_i,c}(F_i)$. We denote the output of the compiler by $\pi' = \mathrm{Comp}_{\mathrm{dt}}^{c}(\pi, D_1, \ldots, D_m)$. The compiled protocol $\pi'$ realizes $W_{\text{sl-strict}}^{D^{\mathrm{full}},c}(F)$, for a suitably adapted distribution $D^{\mathrm{full}}$, assuming all parties start within $c + 1$ consecutive rounds. Consequently, the compiled protocol $\pi'$ can handle a slack of up to $c$ rounds while using hybrids that are realizable themselves. Calling the wrapped CSFs instead of the original CSFs $F_1, \ldots, F_m$ affects the trace corresponding to $F$. The new trace $D^{\mathrm{full}} = \text{full-trace}(D, D_1, \ldots, D_m)$ is obtained as follows:
1. Sample a trace $T \leftarrow D$, which is a depth-1 tree with a root label $W_{\mathrm{strict}}^{D}(F)$ and leaves from the set $\{F_1, \ldots, F_m\}$.
2. Construct a new trace $T'$ with a root label $W_{\text{sl-strict}}^{D^{\mathrm{full}},c}(F)$.
3. For each leaf node $F' = F_i$, for some $i \in [m]$, sample a trace $T_i \leftarrow D_i$ and append the trace $T_i$ to the first layer in $T'$ (i.e., replace the node $F'$ with $T_i$).
4. Output the resulting trace $T'$.
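The four steps of the full-trace construction can be sketched in Python as follows. This is an illustration under simplifying assumptions: distributions are modeled as sampler functions taking a random generator, hybrid distributions are looked up by leaf label rather than by index $i$, and the helper samplers (a Dolev-Strong-like broadcast with $t + 1$ rounds, each blown up to $3c + 1$ rounds, and a two-round protocol calling parallel broadcast twice) are hypothetical stand-ins. The `Trace` class and all names are ours.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Trace:
    label: str
    children: list = field(default_factory=list)

def full_trace(sample_root, leaf_samplers, new_root_label, rng):
    t = sample_root(rng)                          # step 1: sample T from D
    t_new = Trace(new_root_label)                 # step 2: new root label
    for leaf in t.children:                       # step 3: replace each leaf F_i
        t_new.children.append(leaf_samplers[leaf.label](rng))
    return t_new                                  # step 4: output T'

def dolev_strong_sampler(t, c):
    """Sampler for a (t + 1)-round broadcast; each round costs 3c + 1 rounds."""
    def sample(rng):
        kids = [Trace("F_psmt") for _ in range((t + 1) * (3 * c + 1))]
        return Trace("W_sl-strict(F_pbc)", kids)
    return sample

def two_broadcast_rounds(rng):
    """Depth-1 trace of a two-round protocol that calls F_pbc in both rounds."""
    return Trace("W_strict(F_sfe)", [Trace("F_pbc"), Trace("F_pbc")])
```

With $t = 1$ and $c = 1$, each of the two broadcast subtrees has $(t + 1)(3c + 1) = 8$ leaves.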
To illustrate what a full trace is, consider a two-round SFE protocol $\pi$ that uses a broadcast channel in both rounds (e.g., the protocol from [20] that is adaptively secure and guarantees output delivery). That is, the SNF protocol $\pi$ is defined in the $F_{\mathrm{pbc}}$-hybrid model, where $F_{\mathrm{pbc}}$ is an uninstantiable two-round CSF.$^{20}$ The distribution $D_\pi$ outputs a trace consisting of a root labeled with $W_{\mathrm{strict}}^{D_\pi}(F_{\mathrm{sfe}})$ and two (ordered) children $(F_{\mathrm{pbc}}, F_{\mathrm{pbc}})$. Now, consider the compiled protocol $\pi' = \mathrm{Comp}_{\mathrm{dt}}^{c}(\pi, D_1, D_2)$, where $D_1$ and $D_2$ are the distributions corresponding to the protocol of Dolev and Strong [25], consisting of $t + 1$ point-to-point rounds, i.e., a root labeled with $W_{\text{sl-strict}}^{D_1,c}(F_{\mathrm{pbc}})$ and $(t + 1)(3c + 1)$ (ordered) children $(F_{\mathrm{psmt}}, \ldots, F_{\mathrm{psmt}})$ (recall that each round is replaced by $3c + 1$ rounds to support a slack of $c$ rounds). Then, the full trace $D_\pi^{\mathrm{full}} = \text{full-trace}(D_\pi, D_1, D_2)$ outputs a root labeled with $W_{\text{sl-strict}}^{D_\pi^{\mathrm{full}},c}(F_{\mathrm{sfe}})$ with two (ordered) children $W_{\text{sl-strict}}^{D_1,c}(F_{\mathrm{pbc}})$ and $W_{\text{sl-strict}}^{D_2,c}(F_{\mathrm{pbc}})$, each of which has $(t + 1)(3c + 1)$ (ordered) children $(F_{\mathrm{psmt}}, \ldots, F_{\mathrm{psmt}})$. (To simplify the example and focus on the meaning of a full trace, we sweep some details under the rug, such as the correlated-randomness functionality that is needed for the protocol from [20] and the fact that the protocol from [25] achieves unfair broadcast rather than broadcast.)

Probabilistic-termination compiler
Let $F, F_1, \ldots, F_m$ be canonical synchronous functionalities, and let $\pi$ be an SNF protocol that UC-realizes the flexibly wrapped functionality $W_{\mathrm{flex}}^{D}(F)$ in the $(F_1, \ldots, F_m)$-hybrid model, for some depth-1 distribution $D$, assuming all parties start at the same round. Define the following compiler $\mathrm{Comp}_{\mathrm{ptr}}^{c}$, parametrized by a slack parameter $c \ge 0$. The compiler receives as input the protocol $\pi$, distributions $D_1, \ldots, D_m$ over traces, and a subset $I \subseteq [m]$ indexing which CSFs $F_i$ are to be wrapped with $W_{\text{sl-flex}}$ and which with $W_{\text{sl-strict}}$; every call in $\pi$ to a CSF $F_i$ is replaced with a call to the wrapped CSF $W_{\text{sl-flex}}^{D_i,c}(F_i)$ if $i \in I$ or to $W_{\text{sl-strict}}^{D_i,c}(F_i)$ if $i \notin I$. In addition, the compiler adds the termination procedure, based on an approach originally suggested by Bracha [7], which ensures all honest parties will terminate within two consecutive rounds:
• As soon as a party is ready to output a value y (according to the prescribed protocol) or upon receiving at least t + 1 messages (end, sid, y) for the same value y (whichever happens first), it sends (end, sid, y) to all parties.
• Upon receiving n − t messages (end, sid, y) for the same value y, a party outputs y as the result of the computation and halts.
This termination technique only applies to public-output functionalities; therefore, only CSFs with public output can be wrapped by $W_{\text{sl-flex}}$. We denote the output of the compiler by $\pi' = \mathrm{Comp}_{\mathrm{ptr}}^{c}(\pi, D_1, \ldots, D_m, I)$.
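The two rules of the termination procedure can be sketched from the point of view of a single party as follows. The sketch is a simplification under our own assumptions: messages are pairs `("end", y)` with the session ID omitted, each sender is counted at most once per value, network delivery (including delivery of a party's own message to itself) is left to the caller, and the class and method names are hypothetical.

```python
from collections import defaultdict

class TerminationParty:
    """Single-party state machine for the (end, sid, y) termination rule."""

    def __init__(self, n, t):
        self.n, self.t = n, t
        self.senders = defaultdict(set)           # value y -> parties that sent (end, y)
        self.sent_end = False
        self.output = None

    def _send_end_once(self, y):
        if self.sent_end:
            return []
        self.sent_end = True
        return [("end", y)]                       # to be sent to all parties

    def ready(self, y):
        """The prescribed protocol is ready to output y."""
        return self._send_end_once(y)

    def on_end(self, sender, y):
        """Process (end, y) from `sender`; return the messages to send to all parties."""
        self.senders[y].add(sender)
        to_send = []
        if len(self.senders[y]) >= self.t + 1:    # echo after t + 1 matching messages
            to_send = self._send_end_once(y)
        if len(self.senders[y]) >= self.n - self.t and self.output is None:
            self.output = y                       # output and halt after n - t messages
        return to_send
```

For instance, with $n = 4$ and $t = 1$, a party that has not yet produced its own output starts echoing after two matching messages and outputs after three.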
The compiled protocol $\pi'$ UC-realizes the wrapped functionality $W_{\text{sl-flex}}^{D^{\mathrm{full}},c}(F)$, for the adapted distribution $D^{\mathrm{full}} = \text{full-trace}(D, D_1, \ldots, D_m)$. Consequently, the compiled protocol $\pi'$ can handle a slack of up to $c$ rounds, while using hybrids that are realizable themselves, and ensuring that the output slack is at most one round (as opposed to $\pi$). Calling the wrapped hybrids instead of the CSFs affects the trace corresponding to $F$ in exactly the same way as in the case with deterministic termination.$^{21}$ The probabilistic-termination compiler $\mathrm{Comp}_{\mathrm{ptr}}^{c}$ is suitable for SNF protocols that implement a flexibly wrapped functionality, e.g., the (adjusted) protocol of Feldman and Micali [27] that realizes randomized Byzantine agreement. Indeed, such protocols introduce new slack; hence, the slack-reduction technique described above is needed to control the new slack and reduce it to $c = 1$. As pointed out in [16], in some cases the SNF protocol may realize a strictly wrapped functionality; however, some of the hybrids are to be wrapped using the flexible wrapper. An example of the latter type of probabilistic-termination protocols is the BGW protocol [6], which has deterministic termination in the broadcast model; yet, once the broadcast channel is implemented using randomized protocols, the obtained protocol has probabilistic termination. For that reason, a second probabilistic-termination compiler $\mathrm{Comp}_{\mathrm{pt}}^{c}$, without the slack-reduction procedure, was introduced in [16].
To continue with our illustration, consider again a two-round SFE protocol $\pi$ that uses a broadcast channel in both rounds, where the distribution $D_\pi$ outputs a trace consisting of a root labeled with $W_{\mathrm{strict}}^{D_\pi}(F_{\mathrm{sfe}})$ and two (ordered) children $(F_{\mathrm{pbc}}, F_{\mathrm{pbc}})$. Now, instead of using the protocol of Dolev and Strong [25], consider the parallel-broadcast protocol from Theorem 3.5 that consists of $r$ constant-round phases, followed by two rounds for slack reduction; the number of rounds in a phase is denoted by $\mathrm{const}$, and the number of phases $r$ is sampled from a geometric distribution. The compiled protocol is $\pi' = \mathrm{Comp}_{\mathrm{pt}}^{c}(\pi, D_1, D_2)$, where $D_1$ and $D_2$ are the distributions that output a root labeled with $W_{\text{sl-flex}}^{D_1,c}(F_{\mathrm{pbc}})$ and $r \cdot \mathrm{const} \cdot (3c + 1) + 2$ (ordered) children $(F_{\mathrm{psmt}}, \ldots, F_{\mathrm{psmt}})$, where $r$ is sampled from the corresponding geometric distribution (recall that each round is replaced by $3c + 1$ rounds to support a slack of $c$ rounds). Then, the full trace $D_\pi^{\mathrm{full}} = \text{full-trace}(D_\pi, D_1, D_2)$ outputs a root labeled with $W_{\text{sl-flex}}^{D_\pi^{\mathrm{full}},c}(F_{\mathrm{sfe}})$ with two (ordered) children $W_{\text{sl-flex}}^{D_1,c}(F_{\mathrm{pbc}})$ and $W_{\text{sl-flex}}^{D_2,c}(F_{\mathrm{pbc}})$, where the first child has $r_1 \cdot \mathrm{const} \cdot (3c + 1) + 2$ (ordered) children $(F_{\mathrm{psmt}}, \ldots, F_{\mathrm{psmt}})$ and the second child has $r_2 \cdot \mathrm{const} \cdot (3c + 1) + 2$ (ordered) children $(F_{\mathrm{psmt}}, \ldots, F_{\mathrm{psmt}})$, where $r_1$ and $r_2$ are independently sampled from the corresponding geometric distribution.
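The number of leaves under each broadcast node in this illustration can be computed as in the following Python sketch. The success probability `p` of the geometric distribution is a hypothetical parameter (the text does not fix it here), as are the function names; the two phase counts are drawn independently, as in the description above.

```python
import random

def sample_phases(p, rng):
    """Sample r >= 1 from a geometric distribution with success probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    r = 1
    while rng.random() >= p:
        r += 1
    return r

def num_children(r, const, c):
    """Leaves under one flexibly wrapped broadcast node: r * const * (3c + 1) + 2."""
    return r * const * (3 * c + 1) + 2

def sample_child_counts(p, const, c, rng):
    """Independently sample r_1, r_2 and return the leaf counts of the two children."""
    r1, r2 = sample_phases(p, rng), sample_phases(p, rng)
    return num_children(r1, const, c), num_children(r2, const, c)
```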