
Approximating Optimal Broadcast of Files in a Hose-Model Network

Thomas Erlebach, Department of Computer Science, Durham University, UK · Naveen Garg, Department of Computer Science and Engineering, Indian Institute of Technology Delhi, India · Sukriti Gupta, Department of Computer Science and Engineering, Indian Institute of Technology Delhi, India · Amitabh Trehan, Department of Computer Science, Durham University, UK
Abstract

The paper considers the problem of file sharing among peers who are connected to a common core network through links of differing upload and download capacities, as is the case in networks provisioned according to the hose model. The file is assumed to be divided into equal-sized chunks, and a peer can start sending a “chunk” of the file to another peer only after it has received the entire chunk. The objective is to share a chunk, initially residing on one of the peers, with all other peers in the least time possible. Peers can simultaneously send/receive parts of a chunk to/from multiple peers, subject to the upload and download capacity constraints. We only consider the problem of broadcasting one chunk to all peers.

We consider two different models – in the migratory model, a peer can receive the chunk from multiple peers, while in the non-migratory model, any peer can receive the chunk only from one peer. For the migratory model, introduced in this paper, we give a novel integer program and use the optimum solution to the LP-relaxation to give a schedule with makespan $e^{1/e}\,\mathtt{OPT}+P$, where $P$ is the time required by the slowest peer to download the chunk.

Minimising makespan in the non-migratory model is known to be NP-hard. We give a solution with makespan $18\,\mathtt{OPT}+P$; this is the first approximation algorithm for heterogeneous and asymmetric upload/download capacities. We also consider two special cases. For uniform download capacities, we obtain a solution with makespan $2\,\mathtt{OPT}$, extending a result due to Liu [12]. For uniform upload capacities, we give the first approximation algorithm, producing makespan at most $2\,\mathtt{OPT}+2P$.

Keywords and phrases:
File sharing, scheduling, peer-to-peer networks
Funding:
Naveen Garg: Supported by Usha Hasteer Chair.
Copyright and License:
© Thomas Erlebach, Naveen Garg, Sukriti Gupta, and Amitabh Trehan; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Design and analysis of algorithms
Editors:
C. Aiswarya, Ruta Mehta, and Subhajit Roy

1 Introduction

We consider the problem of file sharing between peers in networks. The file to be shared initially resides with a root peer and has to be broadcast, i.e., sent to all the remaining $n$ peers. The peers are all connected to a core network with links of bounded upload and download capacities. The core network is suitably provisioned so that any traffic matrix between the peers that meets the upload and download capacity constraints on each peer can be supported; this is commonly referred to as the hose model [6, 1]. The rate of transfer between two peers is only constrained by the upload and download capacities of the sending and receiving peer, respectively. For instance, if peer $i$ has an upload capacity of 10 Mbps and peers $j,k$ have download capacities of 6 Mbps, then peer $i$ can transfer data to peers $j,k$ at a rate of 5 Mbps each, or to peer $j$ at 6 Mbps and peer $k$ at 4 Mbps, or vice-versa. Our objective is to schedule the transfers between peers efficiently.

Such a scenario would, for instance, arise when the core is a cloud network [2] to which a set of enterprise clients is connected, and a file that becomes available at one client has to be shared with all of them. While there has been a lot of previous work on broadcasting in networks, most of it focuses on the structure of the communication network [9, 10], which in our model plays no role.

Large files are typically distributed in “chunks”. A chunk is the smallest unit of the file that must be completely available to a peer before it can start sharing any part of it with another peer. A small file can be treated as a single chunk. In this paper, we shall only be concerned with distributing one chunk residing at the root peer to the remaining peers in the network. Our model permits a chunk to be split and for a peer to send parts of the chunk to multiple peers concurrently. We consider two variants of this model. In the non-migratory setting, a peer must receive the entire chunk from only one peer (its parent peer) while in the migratory setting, a peer could receive parts of the chunk from different peers. Our objective in both settings is to schedule these transfers so that the time by which all peers receive the chunk (the makespan) is minimised.

We remark that it is standard in networking that certain units (such as IP packets) can only be forwarded once they have been received in full, for example, because a checksum needs to be verified before one can be sure that the unit has been received correctly and forwarding can start. Dropping this requirement for chunks in our setting, i.e., allowing parts of a chunk to be forwarded immediately, gives rise to the fluid limit model, in which the makespan can be minimised in polynomial time. We emphasise that most results on broadcasting treat the message to be broadcast as an atomic unit; the chunk, on the other hand, can be split and reach a peer along different paths.

If upload capacities were uniform, it would be natural to distribute the chunk in order of decreasing download capacities, with the peer having the largest download capacity receiving it first. Similarly, if the download capacities were uniform, it would make sense to distribute the chunk in order of decreasing upload capacity, first sending it to the peer with the largest upload capacity. When upload and download capacities are identical (the symmetric setting), both these ideas yield the same order for distributing the chunk to the peers. However, when upload and download capacities differ (for a peer and across peers), there seems to be no natural preference order for distributing the chunk. For the non-migratory setting, minimizing makespan is NP-hard [8]; for the migratory setting, we do not know whether the problem is NP-hard.

The problem of scheduling the broadcast of multiple chunks is also of much interest but in this paper, we limit our discussion to a single chunk.

1.1 Our Results

This paper is the first to address broadcast in the hose model in its full generality, i.e., the asymmetric heterogeneous setting where the upload and download capacities of each peer can be different. Previous papers have only considered the symmetric setting when the upload and download capacity of each peer are identical [8]; see Section 1.2 for details. The problem of finding a constant approximation for the asymmetric setting has been open, and even the setting of uniform upload capacities had not been considered earlier.

Additionally, the paper introduces the novel migratory setting that allows a peer to download from multiple peers subject to its download capacity constraint. This is a natural relaxation and does not lead to loss of data identity, as we still process the chunk as a whole. The constraint that a peer starts uploading only after receiving the complete chunk is important and separates our model from the fluid limit model.

Our first result is a novel completion time integer program for the problem of minimising makespan in the migratory setting. Formulating this integer program requires proving that a certain non-trivial property is necessary and sufficient for characterising valid broadcast schedules. Relaxing the integrality constraints gives a linear program, and we exploit the structure of the optimum solution of this LP to obtain a solution with makespan at most $e^{1/e}\,\mathtt{OPT}+P$ (Theorem 5). Here $P$ is the time required for the slowest peer to download the chunk; while $P$ is never more than $\mathtt{OPT}$, it can be significantly smaller.

For the non-migratory setting, we do not know how to formulate a suitable integer program, but we circumvent this difficulty by restricting attention to so-called slotted schedules. We formulate an LP by restricting attention to downloads that fit into “slots”; we establish that this restriction does not lead to a significant increase in makespan. Our algorithm for rounding the optimum solution to the linear program proceeds through three stages, each of which requires careful reasoning about the structure of the solution. This yields a solution of makespan $3\,\mathtt{OPT}'+P$, where $\mathtt{OPT}'$ is the minimum makespan of any slotted schedule. Since $\mathtt{OPT}'\le 6\,\mathtt{OPT}$ (Lemma 7), we obtain a solution of makespan at most $18\,\mathtt{OPT}+P$ (Theorem 11).

In the Appendix, we also consider two special cases in the non-migratory setting. When all download capacities are identical, we extend the results of [12] to obtain a 2-approximation algorithm. When all upload capacities are identical, we modify the download capacities of some peers and add some dummy peers to obtain a modified instance with makespan at most $2\,\mathtt{OPT}+P$. We then find an optimum schedule for this modified instance and convert it into a schedule for the original instance with makespan at most $2\,\mathtt{OPT}+2P$.

In the results mentioned above, we assume that the root peer has an upload capacity at least as large as the largest download capacity. If such is not the case, we can, in $\max\{P,1/u(0)\}$ time units (where $u(0)$ is the upload capacity of the root peer), transfer the chunk to the peer with the largest upload capacity and then run our algorithms. This increases the makespan by at most $\mathtt{OPT}$, as $1/u(0)$ is also a lower bound on $\mathtt{OPT}$. In the non-migratory model, no peer can receive the chunk at a rate larger than the largest upload capacity, and so we can cap all download capacities at this value. For the migratory model, we show (Theorem 5) that the schedule obtained by our algorithm has makespan $P^{1/(PQ)}\,\mathtt{OPT}+P$, where $Q$ is the ratio of the largest upload capacity to the largest download capacity of the peers.

1.2 Previous Work

The problem of broadcasting, where the task is to disseminate a message from a source node to the remaining nodes of a network as quickly as possible, has been studied under various models (see [9] for a survey). A popular model is the telephone model, where communication happens in rounds. In each round, a peer with the message can communicate it to exactly one other peer, and the objective is to minimize the number of rounds. If every node can communicate with any other node, Bar-Noy et al. [5] gave a simple recursive doubling algorithm which is optimal when the number of nodes is even and an additive 1 approximation when the number of nodes is odd. When communication can happen only between adjacent nodes of a given graph, Ravi [14] defined the notion of poise and used it to give an $O(\log^2 n/\log\log n)$-approximation, later improved to an $O(\log n/\log\log n)$-approximation by Elkin and Kortsarz [7].

In the simultaneous send/receive model a peer can send the message to one peer and receive a message from another peer in the same round. Bar-Noy et al. [5] gave an optimal algorithm for broadcasting multiple messages for this model. A natural generalisation of this model is the k-port model which allows each peer to send k messages to k peers and receive k messages from k peers in a single round. An additive 1 approximation for the minimum number of rounds needed to send multiple messages was obtained in [4].

Another natural generalisation of these models is allowing for network heterogeneity. Banikazemi et al. [3] considered a setting of heterogeneous processors (peers) capable of point-to-point communication with each other. The peers could have different sending speeds/rates, and the model requires that any peer take part in at most a single message transmission at any given time. For this model they proposed a fastest node first (FNF) algorithm: in each iteration, the peer with the least transmission time among those that have the message sends it to the peer with the least transmission time among those that do not. Liu [12] gave a 2-approximation algorithm for minimizing makespan in this model and showed that the fastest node first algorithm is optimum if sending rates are integer multiples of each other. Khuller and Kim [11] proved that the fastest node first algorithm is a 1.5-approximation.

Mundinger et al. [13] considered the file-sharing problem and introduced the uplink-sharing model. The file is broken into multiple chunks, and a peer can receive a chunk only from one other peer. However, a peer could simultaneously send a chunk to multiple peers. The rate of transfer is constrained by the upload capacity of the sending peer, and the authors assume that concurrent uploads from one peer share the upload capacity equally (the fair-sharing assumption). Under this assumption, they prove that for identical upload capacities, a simple greedy algorithm is optimal. The authors [13] also studied the fluid limit model, where the chunks are infinitesimally small.

Goetzmann et al. [8] removed the fair-sharing assumption from [13] and generalised the model by introducing download capacity constraints. They consider the setting of symmetric heterogeneous capacities – equal upload and download capacities per peer that may differ among peers. Continuing with the assumption that a peer can receive a chunk from only one other peer, they prove that the problem of minimizing makespan is strongly NP-hard and give a $(1+2\sqrt{2})$-approximation algorithm for a single chunk and a 5-approximation algorithm for multiple chunks.

2 Preliminaries and Notations

For $a,b\in\mathbb{Z}_+$, let $[a..b]$ denote the set of integers between (and including) $a$ and $b$. We use $[n]$ as shorthand for $[1..n]$ and $(a..b]$ as shorthand for $[a+1..b]$. For $a,b\in\mathbb{R}$, $[a,b]$ denotes the set of reals between (and including) $a$ and $b$, and $(a,b]$ the reals $x$ such that $a<x\le b$.

The peers are numbered 0 through $n$, and we assume that the chunk resides at peer 0, which we refer to as the root peer. We define a unit of time to be the minimum time needed by the peer with the highest download capacity to download the chunk. The upload/download capacity of a peer is defined as the fraction of the chunk it can upload/download in one time unit. From our definition of a time unit, it follows that the maximum download capacity of any peer is 1. Let $u:[0..n]\to\mathbb{R}_{\ge 0}$ be the upload and $d:[n]\to(0,1]$ the download capacities. Thus, a peer $j$ with download capacity $d(j)$ can download the entire chunk in $1/d(j)$ time units. Note that the peer may download the chunk at a rate lower than its download capacity; in this case, the time it takes is longer than $1/d(j)$. Let $P=\max_{j\in[n]}1/d(j)$ be the maximum time a peer would take to download the chunk if it does so at the maximum permissible rate.
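
To make the normalisation concrete, the following Python sketch computes the time unit, the normalised capacities, and $P$ for a small made-up instance (the chunk size and link speeds are illustrative, not from the paper):

```python
# Illustrative numbers (not from the paper): a 100 MB chunk and raw
# link speeds in Mbps for a root peer 0 and three peers 1..3.
chunk_megabits = 100 * 8
down_mbps = [50, 25, 10]          # download speeds of peers 1..3
up_mbps = [40, 20, 20, 5]         # upload speeds of peers 0..3

# One time unit = time the fastest downloader needs for the whole chunk.
time_unit = chunk_megabits / max(down_mbps)          # 16 seconds here

# Capacity = fraction of the chunk a peer can move per time unit.
d = [s * time_unit / chunk_megabits for s in down_mbps]
u = [s * time_unit / chunk_megabits for s in up_mbps]

P = max(1 / dj for dj in d)       # slowest peer needs P time units
print(d, P)                       # -> [1.0, 0.5, 0.2] 5.0
```

As expected, the fastest peer ends up with capacity exactly 1, and $P$ is the reciprocal of the smallest normalised download capacity.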

To allow us to model the problems as time-indexed integer programs, we discretize our schedules. For the migratory setting, we assume that all downloads and uploads begin and end at times that are integer multiples of $\epsilon/n$. If a peer is the $i$th peer to receive the chunk in the optimum schedule, its completion time increases by at most $i\epsilon/n$ due to this assumption, and this increases the makespan of the optimum schedule by at most $\epsilon$. In this setting, time-slot $t\in\mathbb{Z}_+$ will refer to the time interval $[\epsilon(t-1)/n,\,\epsilon t/n)$.

In the non-migratory setting, our schedules are “slotted”: all uploads and downloads begin and end at integer times. We shall argue (Lemma 7) that this assumption does not cause a significant increase in the makespan of our solution. In this setting, time-slot $t\in\mathbb{Z}_+$ will refer to the time interval $[t-1,t)$. In both settings, time $t\in\mathbb{R}_+$ will refer to a point in time, while time-slot $t\in\mathbb{Z}_+$ will refer to an interval.

Let $\gamma=\epsilon/n$ for migratory schedules and $\gamma=1$ for non-migratory schedules. A schedule is a function $x:[0..n]\times[1..n]\times[1..T]\to[0,1]$, where $x(i,j,t)$ is the fraction of the chunk sent from peer $i$ to peer $j$ in time-slot $t$. Further, $x$ satisfies the following constraints.

Upload Capacity:

The total upload by peer $i$ in time-slot $t$, $\sum_j x(i,j,t)$, cannot exceed its upload capacity $\gamma\,u(i)$.

Download Capacity:

The total download by peer $j$ in time-slot $t$, $\sum_i x(i,j,t)$, cannot exceed its download capacity $\gamma\,d(j)$.

Full Download before Upload:

A peer $i$ can start uploading only after it has received the entire chunk. Thus, if $x(i,j,t)>0$, then $\sum_{t'<t}\sum_{j'}x(j',i,t')=1$.

Figure 1: An example schedule: $x(0,1,1)=x(0,2,1)=x(0,1,2)=x(0,2,2)=1/2$, $x(2,3,3)=x(0,4,3)=1$. The x-axis is time and the y-axis is the total upload capacity.

Figure 1 shows a schedule for an instance with four peers $1,2,3,4$ with (upload, download) capacities $(0,1/2),(1,1),(0,1),(0,1)$, respectively. The root peer has upload capacity 1 and sends the chunk to peers 1 and 2 during $(0,2)$. At $t=2$, the total upload capacity is 2 (root and peer 2 uploading), and the chunk is sent to peers 3 and 4 during $(2,3)$. If the root peer sends the chunk to peer 4 and peer 2 sends the chunk to peer 3 during $(2,3)$, this schedule is non-migratory, as every peer receives the chunk from a single other peer.
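
As a sanity check, the schedule of Figure 1 can be verified against the three constraints programmatically. The sketch below (our own helper, with unit-length slots so $\gamma=1$) encodes the schedule and tests it:

```python
u = {0: 1.0, 1: 0.0, 2: 1.0, 3: 0.0, 4: 0.0}       # upload capacities
d = {1: 0.5, 2: 1.0, 3: 1.0, 4: 1.0}               # download capacities
# x[(i, j, t)] = fraction of the chunk sent from i to j in unit slot t
x = {(0, 1, 1): 0.5, (0, 2, 1): 0.5, (0, 1, 2): 0.5, (0, 2, 2): 0.5,
     (2, 3, 3): 1.0, (0, 4, 3): 1.0}

def feasible(x, u, d, T=3):
    for t in range(1, T + 1):
        for i in u:     # upload capacity constraint
            if sum(v for (a, b, s), v in x.items() if a == i and s == t) > u[i] + 1e-9:
                return False
        for j in d:     # download capacity constraint
            if sum(v for (a, b, s), v in x.items() if b == j and s == t) > d[j] + 1e-9:
                return False
    for (i, j, t), v in x.items():   # full download before upload
        if v > 0 and i != 0:
            if sum(w for (a, b, s), w in x.items() if b == i and s < t) < 1 - 1e-9:
                return False
    return True

print(feasible(x, u, d))   # -> True
```

Adding, say, an upload by peer 2 in slot 2 (before it has finished downloading) makes the check fail, as the third constraint requires.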

3 Migratory Schedules

We first consider the problem of finding a migratory schedule that minimizes makespan. In such schedules, a peer can download the chunk from multiple peers simultaneously and hence the key challenge is to ensure that there is enough upload capacity available to support all the downloads at any point in time.

3.1 Formulating a Linear Program

We formulate an integer program with completion times to determine if a single chunk can be shared by all the peers within time C. We then relax the integrality constraint to obtain a linear program that we solve to obtain a fractional solution.

3.1.1 Feasibility of download completion times

Given times $\{c_i\in\mathbb{R}_+ : i\in[n]\}$, is there a schedule in which peer $i$ completes downloading the chunk at time $c_i$? We renumber the peers so that $c_1\le c_2\le\cdots\le c_n=C$. The chunk is initially with peer 0, and so $c_0=0$. Let $U(t)$ be the total upload capacity of all peers who have received the chunk at or before time $t$. Thus, for $t\in[c_{i-1},c_i)$, $i\in[n]$, $U(t)=\sum_{j=0}^{i-1}u(j)$, and $U(0)=u(0)$. Note that $U:[0,C]\to\mathbb{R}_+$ is a monotonically non-decreasing step function with steps at $c_i$, $i\in[n-1]$; the steps of $U(t)$ are its points of discontinuity. It now remains to check whether there is a way of scheduling downloads such that the total download at time $t$ does not exceed $U(t)$ and at any time the download of peer $j$ does not exceed $d(j)$.

By time $c_i$ the chunk has been downloaded by $i$ peers. The total upload capacity available till time $c_i$, $\int_0^{c_i}U(t)\,dt$, must thus be at least $i$. This condition, although necessary for the $c_i$, $i\in[n]$, to form a feasible set of download completion times, is not sufficient, since we have the added constraint that at any time a peer $j$ can download at a rate of at most $d(j)$.

Consider a time $t\in[0,C)$ and a peer $j$ such that $t<c_j$. Note that at most a $(c_j-t)\,d(j)$ fraction of the chunk can be downloaded in the interval $[t,c_j]$, and if $(c_j-t)\,d(j)<1$, at least a $1-(c_j-t)\,d(j)$ fraction of the chunk would have to be downloaded by peer $j$ before $t$. Since this is true for every peer $j$ with $c_j>t$, it follows that all such peers together download at least a $\sum_{j:c_j>t}\max\{0,\,1-(c_j-t)\,d(j)\}$ fraction of the chunk before $t$. Peers whose completion time is at or before $t$ download the entire chunk before $t$, and hence

$$\int_0^{t}U(t')\,dt' \;\ge\; |\{j : c_j\le t\}| \;+\; \sum_{j:c_j>t}\max\{0,\,1-(c_j-t)\,d(j)\}$$ (1)

is a necessary condition for the feasibility of $\{c_j : j\in[n]\}$.
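
The condition can be checked directly. The Python sketch below (our own helpers) evaluates condition (1); by Lemma 1 below, if the completion times are infeasible then the condition is already violated at a step of $U$, i.e., at some completion time, so it suffices to test those points:

```python
def upload_area(t, c, u):
    # \int_0^t U(s) ds for the step profile: u[0] is available from time 0
    # and u[j] from peer j's completion time c[j]
    return u[0] * t + sum(u[j] * (t - c[j])
                          for j in range(1, len(c)) if c[j] <= t)

def completion_times_feasible(c, u, d):
    """Check condition (1) at every completion time.  c[0] = 0 and c[1..n]
    are the (sorted) completion times; u[0..n] are upload and d[1..n]
    download capacities (d[0] is unused)."""
    n = len(c) - 1
    for t in (c[k] for k in range(n + 1)):
        need = sum(1 for j in range(1, n + 1) if c[j] <= t)
        need += sum(max(0.0, 1 - (c[j] - t) * d[j])
                    for j in range(1, n + 1) if c[j] > t)
        if upload_area(t, c, u) < need - 1e-9:
            return False
    return True
```

For the instance of Figure 1, the completion times $(2,2,3,3)$ pass the check, while asking all four peers to finish by time 1 fails it.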

3.1.2 Proving sufficiency

We next give an algorithm for scheduling downloads that establishes the sufficiency of condition (1). We pack the downloads into the upload profile $U$ in order of increasing completion times. Let $U_i$ denote the residual upload profile after the downloads of peers 1 to $i$ have been scheduled (see Figure 2). Thus $U_0=U$, and since the downloads of peers 1 to $i$ are scheduled before time $c_i$, $U_i$ and $U$ are identical after time $c_i$. For every $i$, we ensure that $U_i(t)$ is a monotonically non-decreasing step function.

Algorithm 1 Scheduling download for peer j.

We now give the procedure for scheduling the download of peer $j$. Initially set $\alpha=c_j$ and $\beta=1$. We wish to schedule the download of a $\beta$ fraction of the chunk for peer $j$ in $[0,\alpha]$. Let $\lambda\ge 0$ be such that $\int_0^{\alpha}\max\{U_{j-1}(t)-\lambda,\,0\}\,dt=\beta$. We would like to use the part of the profile of $U_{j-1}$ above $\lambda$ to pack the download of peer $j$. Since $U_{j-1}$ is monotonically non-decreasing, such a packing would be valid if and only if $U_{j-1}(\alpha)-\lambda\le d(j)$. If this condition is true, then we have completed scheduling the download of peer $j$. Else, we schedule a download at rate $d(j)$ for peer $j$ at every time in the interval $[\alpha',\alpha]$, where $\alpha'$ is the time at which $U_{j-1}$ has the step immediately preceding the step at $\alpha$. The remaining download of peer $j$ would be scheduled in the interval $[0,\alpha']$, and this is done by repeating the above process after setting $\alpha$ to $\alpha'$ and $\beta$ to $\beta-d(j)(\alpha-\alpha')$.

Note that when we repeat the above process, the value of $\lambda$ decreases. Our procedure would fail if at some step we realize that the total residual upload capacity in $[0,\alpha]$, which is $\int_0^{\alpha}U_{j-1}(t)\,dt$, is less than $\beta$ and thus insufficient to schedule the download of peer $j$. We next argue that this $\alpha$ is a witness to the violation of condition (1).
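
The packing procedure can be sketched on a discretised profile. The following simplification (ours, not the paper's continuous version) binary-searches the water level $\lambda$, packs the part of the profile above $\lambda$ when the rate bound $d(j)$ permits, and otherwise peels the last step of the profile at rate $d(j)$ and recurses on the prefix:

```python
def schedule_download(U, deadline, d, dt=0.01):
    """Pack one unit of download, with rate cap d and finish time `deadline`,
    into the residual profile U (U[k] = rate available on [k*dt, (k+1)*dt)).
    Returns the assigned rate per step, or None if the profile is
    insufficient -- a witness that condition (1) is violated.
    A discretised sketch; U is consumed (mutated) in place."""
    n = int(round(deadline / dt))
    assign, beta, alpha = [0.0] * len(U), 1.0, n
    while beta > 1e-12:
        if alpha == 0 or sum(U[:alpha]) * dt < beta - 1e-9:
            return None
        # water level lam: area of U[:alpha] above lam equals beta
        lo, hi = 0.0, max(U[:alpha])
        for _ in range(60):
            lam = (lo + hi) / 2
            if sum(max(h - lam, 0.0) for h in U[:alpha]) * dt > beta:
                lo = lam
            else:
                hi = lam
        lam = (lo + hi) / 2
        if U[alpha - 1] - lam <= d + 1e-9:   # packing respects the rate cap
            for k in range(alpha):
                take = max(U[k] - lam, 0.0)
                assign[k] += take
                U[k] -= take
            beta = 0.0
        else:                                # peel the last step at rate d
            a2 = alpha - 1
            while a2 > 0 and abs(U[a2 - 1] - U[alpha - 1]) < 1e-12:
                a2 -= 1
            for k in range(a2, alpha):
                assign[k] += d
                U[k] -= d
            beta -= d * (alpha - a2) * dt
            alpha = a2
    return assign
```

On a profile with a step (height 1 on $[0,0.5)$, height 3 on $[0.5,1)$) and rate cap $1.5$, the sketch peels the top segment at rate $1.5$ and packs the remaining quarter of the chunk above the water level in the prefix.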

Figure 2: The upload capacities are $u(0)=0.5$, $u(1)=0.25$, $u(2)=0.3$, $u(3)=0.15$, and the download capacities are $d(1)=1/3$, $d(2)=0.22$, $d(3)=1$, $d(4)=0.3$. The figure shows the scheduling of the downloads of peers 1, 2, 3, and 4.

Let $\delta$ be the value of $\alpha$ at the step at which our procedure fails to schedule the download of peer $j$. Thus $\int_0^{\delta}U_{j-1}(t)\,dt < 1-d(j)(c_j-\delta)$.

Lemma 1.

If the algorithm fails to schedule downloads that respect the given completion times, then

$$\int_0^{\delta}U(t)\,dt \;<\; |\{j' : c_{j'}\le\delta\}| \;+\; \sum_{j':c_{j'}>\delta}\max\{0,\,1-(c_{j'}-\delta)\,d(j')\}$$
Proof.

To prove this lemma, we start with the following Claims.

Claim 2.

For all $j\in[0..n]$, $U_j(t)$ is a monotonically non-decreasing step function. Every step of $U_j$ is also a step of $U$.

Proof.

We prove the statement by induction on $j$; it is true for $j=0$ since $U_0=U$.

From our procedure for scheduling the download of peer $j$, it follows that there exist steps at $a,b$ with $a\le b\le c_j$ in $U_{j-1}$ such that

$$U_j(t)=\begin{cases}U_{j-1}(t) & t\in[0,a)\cup[c_j,T]\\ \lambda & t\in[a,b)\\ U_{j-1}(t)-d(j) & t\in[b,c_j)\end{cases}$$

It also follows from our choice of $\lambda$ that $U_j(t)=U_{j-1}(t)<\lambda$ for $t\in[0,a)$. Monotonicity of $U_j$ in $[b,c_j)$ follows from the fact that $U_{j-1}$ is monotonically non-decreasing and $U_j(t)=U_{j-1}(t)-d(j)$ for $t\in[b,c_j)$. Thus it remains to argue that $U_j(b)\ge\lambda$. If $U_j(b)<\lambda$, then our choice of $\lambda$ in the penultimate step of our process would have been such that $U_{j-1}(\alpha)=U_{j-1}(b)<\lambda+d(j)$, and so our procedure would have terminated at the penultimate step.

From the above argument it follows that all steps of $U_{j-1}$ in the intervals $[0,a]$ and $[b,c_j]$ are retained in $U_j$. However, the steps of $U_{j-1}$ in $(a,b)$ are no longer steps of $U_j$, since $U_j(t)=\lambda$ for $t\in[a,b)$. This implies that all steps of $U_j$ are also steps of $U_{j-1}$ and hence of $U$.

Since $\delta$ is a step of $U_{j-1}$, the above claim implies that there exists $k\in[j-1]$ such that $c_k=\delta$.

Claim 3.

If a peer $i$, $i<j$, downloads at a rate less than $d(i)$ at some time $t<c_k=\delta$, then peer $i$ downloads at rate exactly $d(i)$ throughout $[c_k,c_i]$.

Proof.

Since $\delta=c_k$ is a step of $U_{j-1}$, it is also a step of $U_i$ for all $i\in[j-1]$. If peer $i$ downloaded at a rate less than $d(i)$ at times both before and after $\delta$, then $\delta\in(a_i,b_i)$, which would imply that $\delta$ is no longer a step of $U_i$, a contradiction. The above claim implies that if any part of the download of a peer $i$, $k+1\le i\le j$, is scheduled before $\delta$, then for all $t\in[\delta,c_i]$, peer $i$ downloads at rate $d(i)$. This in turn implies that a $\max\{0,\,1-d(i)(c_i-\delta)\}$ fraction of the chunk is downloaded by peer $i$ before $\delta$. Peers 1 to $k$ complete their entire download before $\delta$, and since the algorithm fails while scheduling the download of peer $j$, it must be the case that

$$k+\sum_{i=k+1}^{j}\max\{1-d(i)(c_i-\delta),\,0\} \;>\; \int_0^{\delta}U(t)\,dt$$

which completes the proof of the Lemma.

Thus, constraint (1) is violated, and this establishes the sufficiency of the condition.

3.1.3 An LP relaxation

The constraint (1) gives a necessary and sufficient condition for $\{c_j : j\in[n]\}$ to be a feasible set of completion times. We use this to formulate a completion time integer program that is feasible if and only if all downloads can be completed by time $C$. Our first step is to round the download capacities down to the nearest power of $1/(1+\delta)$, where $\delta$ is a constant to be picked later. To simplify notation, we continue to use $d(j)$ for the rounded download capacities. We define class $k$ as the set of peers with download capacity $1/(1+\delta)^k$; the number of classes is at most $\log_{1+\delta}P$, and we denote it by $L$.
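
The rounding step is straightforward to implement; a small sketch (our helper, with illustrative capacities and $\delta$ in the usage below):

```python
import math

def round_capacities(d, delta):
    """Round each download capacity down to the nearest power of 1/(1+delta);
    returns the rounded capacities and each peer's class index.  The small
    epsilon guards against ceil() overshooting on exact powers."""
    rounded, cls = [], []
    for dj in d:
        k = math.ceil(-math.log(dj) / math.log(1 + delta) - 1e-12)
        rounded.append((1 + delta) ** (-k))
        cls.append(k)
    return rounded, cls
```

Each rounded capacity is at most the original and within a factor $1+\delta$ of it, which is exactly what Claim 6 later charges to the makespan.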

Let $T=nC/\epsilon$, and let $x_{jt}$, $j\in[n]$, $t\in[T]$, be an indicator variable that is 1 if peer $j$ finishes downloading the chunk in time slot $t$ and 0 otherwise. Clearly,

$$\sum_{t\in[T]}x_{jt}=1 \qquad \forall j\in[n]$$ (2)

Then $U(t)=u(0)+\sum_{j\in[n]}\sum_{t'\in[t-1]}x_{jt'}\,u(j)$, where $U(t)$ denotes the total upload capacity in time slot $t$. The constraint (1) can now be formulated as

$$\sum_{j\in[n]}\Big(\sum_{t'\in[t]}x_{jt'}+\sum_{t'\in(t..T]}x_{jt'}\max\{1-d(j)(t'-t),\,0\}\Big) \;\le\; \sum_{t'\in[t]}U(t') \qquad \forall t\in[T]$$ (3)

Relaxing the integrality constraints on the variables $\{x_{jt} : j\in[n],\,t\in[T]\}$ gives a linear program with $n^2C/\epsilon$ variables and $n+nC/\epsilon$ constraints. In Theorem 5 we argue that $C\le n^2$, and therefore the LP can be solved in time polynomial in $n$ and $1/\epsilon$.

Figure 3 shows an instance with an integrality gap of $3/2$. Peers $1,2,3$ have (upload, download) capacities $(0,1/2),(2,1),(0,1)$, respectively, and the root peer has upload capacity 1. The linear program is feasible for $C=2$, while the best integral solution has makespan 3.

Figure 3: The LP solution on the right has $x_{1,2}=1$, $x_{2,1}=1/2$, $x_{2,2}=1/2$, $x_{3,2}=1$ and a makespan of 2. The best integral solution is on the left and has a makespan of 3. In both figures, the x-axis is time and the y-axis is the total upload capacity.

3.2 Constructing an integer solution

Let $x$ be an optimum solution to the linear program and let $U^f(t)=U(t)$, $t\in[T]$. We can modify the algorithm from Section 3.1.2 to compute the schedule corresponding to the solution $x$. With each $x_{jt}>0$ we associate a pseudo-peer $\langle j,t\rangle$ which has download capacity $x_{jt}d(j)$, upload capacity $x_{jt}u(j)$, and needs to download only an $x_{jt}$ fraction of the chunk by time slot $t$. The algorithm can be used to schedule the download of the chunk to these pseudo-peers in order of increasing $t$. The only modification needed is that $\beta$, the fraction of the chunk that remains to be downloaded to the pseudo-peer $\langle j,t\rangle$, is initially set to $x_{jt}$, and the largest fraction of the chunk that can be downloaded in a time-slot (the download capacity) is now $x_{jt}d(j)$.

Thus, if the upload capacity in time slot $t$ is $U^f(t)$, we can build a schedule such that an $x_{jt}$ fraction of the chunk finishes downloading to pseudo-peer $\langle j,t\rangle$, and hence to peer $j$, by time slot $t$. Let $c(j)$ be the earliest time slot such that $\sum_{t\in[c(j)]}x_{jt}=1$. Peer $j$ completes downloading the chunk at time slot $c(j)$ in this schedule, and hence it can start uploading the chunk only after time slot $c(j)$. Thus the true (or integral) upload capacity in time slot $t$ in this schedule is $U^i(t)=u(0)+\sum_{j:c(j)<t}u(j)$. Note that $U^f(t)$, the fractional upload capacity at time slot $t$, is the sum of the upload capacities of all peers weighted by the fraction of the chunk that they have received by time slot $t$. It follows, therefore, that for all time slots $t$, $U^f(t)\ge U^i(t)$. A feasible schedule can, in any time slot $t$, only permit downloads of total capacity at most the true upload capacity $U^i(t)$, and hence the schedule obtained from solution $x$ need not be feasible. In order to build a feasible schedule, we need a better understanding of the difference between $U^f$ and $U^i$. Informally, the following lemma shows that for two peers in the same class, the download of the peer with the higher upload capacity precedes the download of the peer with the lower upload capacity.

Lemma 4.

There exists an optimum fractional solution in which, for all $j_1,j_2$ such that $d(j_1)=d(j_2)$, if $u(j_1)>u(j_2)$ then $x_{j_2t}=0$ for all $t<c(j_1)$.

Proof.

For contradiction, assume that there exist $j_1,j_2$ with $d(j_1)=d(j_2)$, $u(j_1)>u(j_2)$, and time slots $t_1,t_2$ with $t_1<t_2$ such that $x_{j_2t_1},x_{j_1t_2}>0$. Let $\delta=\min(x_{j_2t_1},x_{j_1t_2})$, and construct a new LP solution $x'$ which is identical to $x$ except that $x'_{j_2t_1}=x_{j_2t_1}-\delta$, $x'_{j_1t_1}=x_{j_1t_1}+\delta$ and $x'_{j_1t_2}=x_{j_1t_2}-\delta$, $x'_{j_2t_2}=x_{j_2t_2}+\delta$.

We next argue that $x'$ is also a feasible solution to the linear program. Replacing the download of peer $j_2$ that finishes in time slot $t_1$ with an equal download of peer $j_1$ does not change the left-hand side of inequality (3), since $d(j_1)=d(j_2)$. The same holds for replacing the download of peer $j_1$ ending in time slot $t_2$ with a download of peer $j_2$. Performing both exchanges together ensures that constraint (2) is also met.

Let $U'(t)=\sum_{t'\in[t-1]}\sum_{j\in[n]}x'_{jt'}\,u(j)$. Since $u(j_1)>u(j_2)$, exchanging the downloads of $j_2$ and $j_1$ in time slots $t_1,t_2$ will only increase the upload capacity in time slots $t\in(t_1..t_2]$. Thus $U'(t)>U(t)$ for $t\in(t_1..t_2]$, and they are equal otherwise. This implies that the right-hand side of inequality (3) is only larger in $x'$ than in $x$, and hence the constraint is met.

3.2.1 Shifting downloads within a class

Let $j_1,j_2,\ldots,j_p$ be the peers of a class $l$ such that $u(j_1)\ge u(j_2)\ge\cdots\ge u(j_p)$. From the above lemma, it follows that there exists an optimum LP solution, say $x$, such that $c(j_1)\le c(j_2)\le\cdots\le c(j_p)$. In a time slot $t$, $t\in(c(j_i)..c(j_{i+1})]$, the total contribution (in solution $x$) of the peers of class $l$ to the true upload capacity is exactly $\sum_{h=1}^{i}u(j_h)$, while their contribution to the fractional upload capacity is at most $\sum_{h=1}^{i+1}u(j_h)$.

We now build a solution $x^{(1)}$ by shifting the downloads of peers within each class. This is done by downloading the chunk for peer $j_{i+1}$ in the time slots in which the download of peer $j_i$ was scheduled in the solution $x$. This can be done since the restriction on the fraction of the chunk that can be downloaded in a time slot (the download capacity) is identical for peers $j_i,j_{i+1}$. Thus, for all $t$ and $1\le i<p$, we set $x^{(1)}_{j_{i+1},t}=x_{j_i,t}$, where $j_1,j_2,\ldots,j_p$ are the peers of a class. For now, we will assume that peer $j_1$ has already received the chunk from the root peer, and shall remove this assumption later.

We repeat this process of shifting downloads of peers for all classes. Once this shifting is done, consider $t\in(c(j_i)..c(j_{i+1})]$ and the true upload capacity in solution $x^{(1)}$ at time slot $t$. The total contribution of the peers of class $l$ to this quantity is $\sum_{h=1}^{i+1}u(j_h)$, which is at least as large as their contribution to the fractional upload capacity in time slot $t$ in solution $x$. Since the fractional upload capacity was sufficient to schedule all downloads, it follows that the solution $x^{(1)}$ is indeed a feasible solution. We note our assumption that the peers with the highest upload capacity in each class (the leaders) have already received the chunk at time $t=0$ from the root peer, and hence their upload capacity is included in the true upload capacity available in each time slot.
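
The shifting step itself is mechanical; a dictionary-based sketch (our own encoding of the solution, with illustrative peer names):

```python
def shift_within_class(x, peers_by_class, T):
    """Build x1 from x: within each class (peers listed in order of
    non-increasing upload capacity), peer i+1 inherits the download slots
    of peer i.  The class leader receives the chunk from the root instead.
    x maps (peer, slot) -> fraction; a sketch of the shifting step."""
    x1 = {}
    for peers in peers_by_class:
        for i in range(len(peers) - 1):
            for t in range(1, T + 1):
                v = x.get((peers[i], t), 0.0)
                if v:
                    x1[(peers[i + 1], t)] = v
    return x1
```

With a single class ("a", "b", "c") and "a" finishing in slot 1 and "b" in slot 2, the shifted solution gives slot 1 to "b" and slot 2 to "c", while "a" (the leader) is left to be served by the root.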

3.2.2 Distributing chunks to class-leaders

We now show how to distribute the chunk to the leaders. The root peer simultaneously uploads the chunk to each of the $L$ class leaders at rate $1/P$, and hence all class leaders receive the chunk by time $P$. This requires that the root peer has an upload capacity of at least $L/P=P^{-1}\log_{1+\delta}P$.

We assume $\epsilon$ is chosen such that $nP/\epsilon$ is an integer. We construct a new solution $x^{(2)}$ by shifting the solution $x^{(1)}$ forward in time by $P$ units. Formally, for peers $j\in[n]$ and time slots $t\in[0,T]$ we set $x^{(2)}_{j,t+nP/\epsilon}=x^{(1)}_{j,t}$. This shifting creates $nP/\epsilon$ empty time slots at the start, which we use for downloading the chunk from the root to all class leaders. The makespan of this modified solution is at most $C+P$.

We are now ready to prove the following theorem.

Theorem 5.

In the migratory setting, there is a polynomial-time algorithm to compute a schedule of transfers such that a root peer with upload capacity $Q$ can share a chunk with all peers in time at most $P^{1/(PQ)}\,\mathtt{OPT}+P$, where $P$ is the time required by the slowest peer to download the chunk. When $Q\ge 1$, our schedule has makespan at most $e^{1/e}\,\mathtt{OPT}+P$.

Proof.

Suppose $P\ge n/Q$. All peers have download capacity at least $1/P$. Since $1/P\le Q/n$, they can trivially download the chunk simultaneously from the root peer (which has upload capacity $Q$) in time $P$, and this is optimal. Hence, we assume $P<n/Q$. All peers can also download the chunk sequentially from the root peer in time at most $nP\le n^2/Q$, and this is an upper bound on the makespan $C$.

Let $I$ be the original instance and $I'$ the instance with download capacities rounded down to a power of $1/(1+\delta)$.

Claim 6.

$\mathtt{OPT}(I') \le (1+\delta)\,\mathtt{OPT}(I)$.

Let $C$ be a guess of the makespan of the instance $I'$. Note that $C$ is at least $P$ and at most $nP$. By doing a binary search on $C$ and solving the linear program for the instance $I'$ for each guess, we determine the smallest value of $C$, say $C^*$, for which the linear program is feasible. Note that $C^* \le \mathtt{OPT}(I')$, and let $x$ be the corresponding solution to the linear program. We modify $x$ as discussed in the previous sections to obtain a feasible solution $x^{(2)}$ with makespan $C^*+P \le \mathtt{OPT}(I')+P \le (1+\delta)\,\mathtt{OPT}(I)+P$.
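The binary search over makespan guesses can be sketched as follows; this is an illustrative sketch, not code from the paper, working over integer guesses with a hypothetical `feasible` oracle standing in for an LP solve:

```python
def smallest_feasible(lo, hi, feasible):
    """Return the smallest integer C in [lo, hi] with feasible(C) True,
    assuming feasibility is monotone in C (a longer horizon only relaxes
    the LP, so once a guess is feasible, all larger guesses are too)."""
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid          # mid works; the answer is at most mid
        else:
            lo = mid + 1      # mid fails; the answer is strictly larger
    return lo

# stand-in oracle; in the algorithm this would solve the LP for horizon C
C_star = smallest_feasible(1, 100, lambda C: C >= 17)
```

Each probe costs one LP solve, so only logarithmically many solves are needed.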

The upload capacity $Q$ of the root peer must be at least $P^{-1}\log_{1+\delta} P$, and this holds whenever $\delta \ge P^{1/(PQ)}-1$. Choosing $\delta = P^{1/(PQ)}-1$, the schedule returned by our algorithm has makespan $P^{1/(PQ)}\,\mathtt{OPT}+P$. When $Q \ge 1$, the makespan is at most $e^{1/e}\,\mathtt{OPT}+P$, since $P^{1/P} \le e^{1/e}$.

If the peer with the largest upload capacity is different from the root, then we incur an additional $\max\{1/u(0), P\}$ time in transferring the chunk from the root to this peer.

4 The non-migratory setting

This section considers the setting where a peer can download the chunk from only one peer, its parent peer. Let $T$ be an estimate of the minimum makespan of a slotted schedule, defined below. We divide the interval $[0,T]$ into slots of different sizes. Let $K = \log_2 T$. For $k \in [0..K]$, a class-$k$ slot is a time interval $[p2^k, (p+1)2^k)$, $p \in [0..T/2^k)$, and we refer to this interval as the slot $(k,p)$. For a slot $(k,p)$ and time $t$, we let "$t < (k,p)$" denote $t < p2^k$, "$t > (k,p)$" denote $t \ge (p+1)2^k$, and "$t \in (k,p)$" denote $t \in [p2^k, (p+1)2^k)$. Note that any two slots, of the same or different classes, are either disjoint or one slot contains the other.
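The disjoint-or-nested property of these dyadic slots can be checked exhaustively for a small horizon; the following sketch (with illustrative values $T=16$, $K=4$, not from the paper) does so:

```python
from itertools import product

def slot_interval(k, p):
    """Half-open interval [p*2^k, (p+1)*2^k) covered by slot (k, p)."""
    return p * 2**k, (p + 1) * 2**k

def disjoint_or_nested(a, b):
    """True iff slots a and b are disjoint or one contains the other."""
    (s1, e1), (s2, e2) = slot_interval(*a), slot_interval(*b)
    disjoint = e1 <= s2 or e2 <= s1
    nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
    return disjoint or nested

T, K = 16, 4  # T a power of two, K = log2(T)
slots = [(k, p) for k in range(K + 1) for p in range(T // 2**k)]
ok = all(disjoint_or_nested(a, b) for a, b in product(slots, repeat=2))
```

The property holds because class-$k$ slots start at multiples of $2^k$, so the slot boundaries form a laminar (dyadic) family.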

4.1 Slotted Schedules

A peer $j$ is in class $l$ if $1/2^l \le d(j) < 1/2^{l-1}$; let $\phi(j)$ denote the class of peer $j$. A peer downloads the chunk in slot $(k,p)$ if the chunk is downloaded in the interval $[p2^k,(p+1)2^k)$ at a uniform rate $2^{-k}$. A peer $j$ can download the chunk in a slot of class $k$ only if $k \ge \phi(j)$. We call a schedule slotted if every peer downloads the chunk in some slot, and we formulate an LP to minimize the makespan of a slotted schedule. The following lemma, whose proof we omit due to space constraints, argues that the makespan of a slotted schedule is not significantly more than the makespan of the optimum schedule.

Lemma 7.

Any non-migratory schedule $S$ with makespan $T$ can be converted into a non-migratory slotted schedule $S'$ with makespan at most $6T$.

4.2 An LP for slotted schedules

Let $T$ be our estimate of the minimum makespan and let $x_{ijkp}$, $i \in [0..n]$, $j \in [n]$, $k \in [0..K]$, $p \in [0..T/2^k)$, be 1 if peer $j$ downloads the chunk from peer $i$ in slot $(k,p)$. Note that peer $i$ can upload the chunk in a slot $(k,p)$ only if $2^{-k} \le u(i)$. We combine this with our earlier observation that $k \ge \phi(j)$ to define variables $x_{ijkp}$ only for $k \ge \kappa_{ij}$, where $\kappa_{ij} = \max(\phi(j), \lceil \log_2 u(i)^{-1}\rceil)$.
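A small helper for computing $\phi(j)$ and $\kappa_{ij}$ might look as follows; this is an illustrative sketch, not code from the paper:

```python
import math

def peer_class(d):
    """phi(j): the class l with 1/2**l <= d < 1/2**(l-1), for d in (0, 1]."""
    return math.ceil(-math.log2(d))

def kappa(u_i, d_j):
    """Smallest slot class k in which peer i may upload to peer j:
    we need k >= phi(j) (download rate fits) and 2**-k <= u(i)
    (upload rate fits); for u(i) >= 1 the second condition is vacuous."""
    return max(peer_class(d_j), max(0, math.ceil(-math.log2(u_i))))
```

For example, a peer with download capacity 0.3 lands in class 2, since $1/4 \le 0.3 < 1/2$.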

  1. The constraint $\sum_{k,p}\sum_{i\in[0..n]} x_{ijkp} \ge 1$ ensures that peer $j$ downloads the chunk in some slot.

  2. Let $y_{jt}$ denote the fraction of the chunk downloaded by peer $j$ before time $t$. Note that this is at most $\sum_{k,p:(k,p)<t}\sum_{i\in[0..n]} x_{ijkp}$. Peer 0 (the root) has the chunk available to it at time 0 and hence $y_{0t}=1$ for all $t \in [T]$.

  3. The total download from peer $i$ assigned to slots containing $t$ is $\sum_{k,p:t\in(k,p)}\sum_{j\in[n]} x_{ijkp}2^{-k}$, and this cannot exceed $u(i)\,y_{it}$, which is the total upload capacity of peer $i$ at time $t$.

This then yields the following system of linear inequalities.

$\sum_{k,p}\sum_{i\in[0..n]} x_{ijkp} \ge 1 \qquad j\in[n]$ (4)
$\sum_{k,p:(k,p)<t}\sum_{i\in[0..n]} x_{ijkp} \ge y_{jt} \qquad j\in[n],\ t\in[T]$ (5)
$\sum_{k,p:t\in(k,p)}\sum_{j\in[n]} x_{ijkp}2^{-k} \le u(i)\,y_{it} \qquad i\in[0..n],\ t\in[T]$ (6)
$x_{ijkp} \in [0,1] \qquad i\in[0..n],\ j\in[n],\ k\in[\kappa_{ij}..K],\ p\in[0..T/2^k)$
$y_{jt} \in [0,1] \qquad j\in[n],\ t\in[T]$
$y_{0t} = 1 \qquad t\in[T]$

If the above linear system is infeasible, our estimate of $T$ is too low and we double $T$. By performing a binary search on $T$, we determine the smallest $T$, say $T^*$, for which the linear system is feasible, and let $x$ be the corresponding feasible solution.

4.3 Rounding to a non-migratory schedule

An important property of $x$, along the lines of Lemma 4, is that if $j_1, j_2$ are peers of the same class and $u(j_1) > u(j_2)$, then $j_2$ cannot download the chunk in a slot that ends strictly before any slot in which $j_1$ downloads the chunk.

Lemma 8.

Let $j_1, j_2 \in \phi^{-1}(l)$ be such that $u(j_1) > u(j_2)$. Let $x_{ij_1kp} > 0$ and $t = (p+1)2^k$. Then for all $i', k', p'$ such that $(k',p') < t$ and $i' \in [0..n]$, we have $x_{i'j_2k'p'} = 0$.

Thus, for each class, the chunk is received by the peers of this class in decreasing order of their upload capacity. For each class $l$, order the peers of this class by decreasing upload capacity, breaking ties arbitrarily, and let $\langle l,i\rangle$ be the $i$th peer in this ordering; it will be convenient to define $\langle l,0\rangle$ as 0 (the root peer). Let $y_{lkp} = \sum_{i\in[0..n]}\sum_{j\in\phi^{-1}(l)} x_{ijkp}$ be the total extent to which peers of class $l$ download the chunk in a slot $(k,p)$. To determine the slot in which each peer of class $l$ downloads the chunk, we order the slots $(k,p)$ by increasing end time. Then peer $\langle l,1\rangle$ downloads the chunk in the earliest slots of this ordering for which the sum of $y_{lkp}$ is 1. In general, if $c(\langle l,i\rangle)$ denotes the time at which the chunk is fully downloaded by peer $\langle l,i\rangle$, then $c(\langle l,i\rangle) = t$ is the earliest time at which $\sum_{k,p:(k,p)\le t} y_{lkp} \ge i$. We define $c(\langle l,0\rangle)$ as 0. Thus, $0 = c(\langle l,0\rangle) \le c(\langle l,1\rangle) \le c(\langle l,2\rangle) \le \cdots$. Note that a peer could be downloading the chunk from many different peers, and hence $x, y$ is a migratory schedule. We now modify $x, y$ in a sequence of steps to obtain a non-migratory schedule.
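The computation of the completion times $c(\langle l,i\rangle)$ for one class can be sketched as follows; the dict-based representation and the tolerance constant are illustrative assumptions, not from the paper:

```python
def completion_times(y_l, n_class):
    """Given y_l mapping a slot (k, p) to the fractional download mass of
    one class in that slot, return c(l, i) for i = 1..n_class: the earliest
    slot end-time by which the cumulative mass reaches i."""
    slots = sorted(y_l, key=lambda kp: (kp[1] + 1) * 2**kp[0])  # by end time
    c, total, i = [], 0.0, 1
    for k, p in slots:
        total += y_l[(k, p)]
        while i <= n_class and total >= i - 1e-9:   # tolerance for floats
            c.append((p + 1) * 2**k)                # end time of this slot
            i += 1
    return c

# toy class with two peers whose mass is spread over three slots
y_l = {(1, 0): 0.5, (0, 2): 0.5, (2, 1): 1.0}
```

Here the slots end at times 2, 3 and 8 respectively, so the first peer finishes at time 3 and the second at time 8.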

4.3.1 Ensuring a peer uploads only after downloading the entire chunk

From the preceding discussion it follows that for $j = \langle l,h\rangle$, $y_{jt} = 0$ for $t < c(\langle l,h-1\rangle)$ and $y_{jt} = 1$ for $t \ge c(\langle l,h\rangle)$. Note that $y_{jt}$ is monotonically non-decreasing in $t$ and increases from 0 to 1 over the interval $[c(\langle l,h-1\rangle), c(\langle l,h\rangle)]$. In the linear system, a $y_{jt}$ fraction of the upload capacity of peer $j$ is available at time $t$, and thus in the solution $x$, peer $j$ can start uploading the chunk after time $c(\langle l,h-1\rangle)$ and before time $c(\langle l,h\rangle)$. However, recall that in a valid schedule, peer $j$ can upload the chunk to other peers only after it has completed its download at time $c(\langle l,h\rangle)$. We resolve this by

  1. assuming that at time 0, the chunk is available to the first peer of each class, and

  2. scheduling the downloads by peer $\langle l,h\rangle$, $h \ge 2$, in the slots which peer $\langle l,h-1\rangle$ uses to download the chunk.

Peer $\langle l,h\rangle$ then completes its download by time $c(\langle l,h-1\rangle)$, and the uploads by peer $\langle l,h\rangle$ after time $c(\langle l,h-1\rangle)$ in the modified solution are now valid. The modified solution $(x^{(1)}, y^{(1)})$ is defined by

$x^{(1)}_{i\langle l,h\rangle kp} = x_{i\langle l,h-1\rangle kp} \qquad i\in[0..n],\ k\in[\kappa_{ij}..K],\ p\in[0..T/2^k),\ l\in[0..L],\ h\ge 2$
$y^{(1)}_{\langle l,h\rangle t} = 1 \qquad t\in[c(\langle l,h-1\rangle)..T],\ l\in[0..L],\ h\ge 1$
$y^{(1)}_{\langle l,h\rangle t} = 0 \qquad t\in[0..c(\langle l,h-1\rangle)-1],\ l\in[0..L],\ h\ge 2$
$y^{(1)}_{0t} = 1 \qquad t\in[0..T]$

Let $Z \subseteq [n]$ be the set of peers that are not class leaders. We next argue that the solution $(x^{(1)}, y^{(1)})$ satisfies the constraints of our linear system for peers in $Z$.

Lemma 9.

The solution (x(1),y(1)) satisfies constraint (6) and constraints (4) and (5) for peers in Z.

Proof.

Consider a peer $\langle l,h\rangle$, $h > 1$. Since $x^{(1)}_{i\langle l,h\rangle kp} = x_{i\langle l,h-1\rangle kp}$, we have

$\sum_{k,p}\sum_{i\in[0..n]} x^{(1)}_{i\langle l,h\rangle kp} = \sum_{k,p}\sum_{i\in[0..n]} x_{i\langle l,h-1\rangle kp} \ge 1$

where the inequality follows from the fact that $x$ is a feasible solution and satisfies constraint (4).

When viewed as a function of $t$, $y^{(1)}_{\langle l,h\rangle t}$ increases from 0 to 1 at $t = c(\langle l,h-1\rangle)$. Since $y_{\langle l,h-1\rangle t} = 1$ for all $t \ge c(\langle l,h-1\rangle)$, it follows that $y^{(1)}_{\langle l,h\rangle t} \le y_{\langle l,h-1\rangle t}$ for all $t \in [0..T]$. Thus,

$\sum_{k,p:(k,p)<t}\sum_{i\in[0..n]} x^{(1)}_{i\langle l,h\rangle kp} = \sum_{k,p:(k,p)<t}\sum_{i\in[0..n]} x_{i\langle l,h-1\rangle kp} \ge y_{\langle l,h-1\rangle t} \ge y^{(1)}_{\langle l,h\rangle t}$

and this establishes constraint (5).

Finally, to prove constraint (6), we observe that for any peer $i \in [0..n]$ and slot $(k,p)$, $\sum_{j\in Z} x^{(1)}_{ijkp} \le \sum_{j\in[n]} x_{ijkp}$. Thus

$\sum_{k,p:t\in(k,p)}\sum_{j\in Z} x^{(1)}_{ijkp}2^{-k} \le \sum_{k,p:t\in(k,p)}\sum_{j\in[n]} x_{ijkp}2^{-k} \le u(i)\,y_{it} \le u(i)\,y^{(1)}_{it}$

where the last inequality follows from the fact that for any peer $\langle l,h\rangle$, $h \ge 2$, $y^{(1)}_{\langle l,h\rangle t}$ increases from 0 to 1 at $t = c(\langle l,h-1\rangle)$, while $y_{\langle l,h\rangle t}$ remains 0 till $c(\langle l,h-1\rangle)-1$ and is at most 1 for $t \ge c(\langle l,h-1\rangle)$. For a peer $\langle l,1\rangle$, $y^{(1)}_{\langle l,1\rangle t} = 1$ for all $t \in [0..T]$, while $y_{\langle l,1\rangle t} \le 1$.

4.3.2 Ensuring a peer downloads from only one peer

Note that in $(x^{(1)}, y^{(1)})$ a peer could be downloading the chunk from multiple peers, so $(x^{(1)}, y^{(1)})$ is a migratory solution. We now modify this solution so that each peer downloads the chunk completely in one slot from one peer. Let $x^{(1)}_{ikp} = \sum_{j\in[n]} x^{(1)}_{ijkp}$ be the total extent to which the chunk is downloaded from peer $i$ in slot $(k,p)$.

We now construct a bipartite graph $G = (U, V, E)$, where $U$ contains a vertex for each peer that is not a class leader. For each triple $(i,k,p)$, $i\in[0..n]$, $k\in[0..K]$, $p\in[0..T/2^k)$, we have a vertex in $V$ if $x^{(1)}_{ikp} > 0$. Let $b: V \to \mathbb{Z}_{>0}$ be a function on the vertices in $V$ defined by $b(i,k,p) = \lceil x^{(1)}_{ikp}\rceil$ for $(i,k,p) \in V$. For peer $j = \langle l,h\rangle$, $h \ge 2$, we include the edge $(j, (i,k,p))$ in $E$ if $x^{(1)}_{ijkp} > 0$ and $(k,p) < c(\langle l,h-1\rangle)$.

The graph $G$ and the function $b$ define an instance of an assignment problem in which we wish to assign every vertex $u \in U$ to a vertex $v \in V$ such that $(u,v) \in E$ and the number of vertices assigned to any vertex $v \in V$ is at most $b(v)$. For integer $b$, it is known that a fractional assignment implies the existence of an integer assignment. The variables $x^{(1)}_{ijkp}$ define a fractional assignment as follows.

  1. For an edge $(j,(i,k,p)) \in E$, assign vertex $j \in U$ to vertex $(i,k,p) \in V$ to an extent $x^{(1)}_{ijkp}$.

  2. For $j = \langle l,h\rangle$, $t = c(\langle l,h-1\rangle)$, $h \ge 2$, the feasibility of the solution $(x^{(1)}, y^{(1)})$ implies

     $\sum_{k,p:(k,p)<t}\sum_{i\in[0..n]} x^{(1)}_{ijkp} \ge y^{(1)}_{jt} = 1.$

     Hence vertex $j$ is assigned to a total extent of at least 1 in this fractional assignment.

  3. $\sum_j x^{(1)}_{ijkp} = x^{(1)}_{ikp} \le \lceil x^{(1)}_{ikp}\rceil = b(i,k,p)$, and hence the total fractional assignment of vertices of $U$ to a vertex $(i,k,p) \in V$ is no more than $b(i,k,p)$.

An integer assignment can be computed by finding a maximum flow in a network obtained by adding vertices $s, t$ and edges $(s,u)$, $u \in U$, and $(v,t)$, $v \in V$, to the graph $G$. All edges of $G$ and all edges incident to $s$ are assigned unit capacity, while each edge $(v,t)$, $v \in V$, is assigned capacity $b(v)$. Since all edge capacities are integers, a maximum $s$-$t$ flow in this network is integral, and the edges of $G$ carrying unit flow give an integer assignment. We use this assignment to define an integral solution $x^{(2)}$ by setting $x^{(2)}_{ijkp}$ to 1 iff the vertex corresponding to peer $j$ in $U$ is assigned to vertex $(i,k,p) \in V$. We also set $y^{(2)}_{jt} = y^{(1)}_{jt}$ for $j \in [n]$, $t \in [0..T]$.
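The integral assignment can also be found by a direct augmenting-path search, which is equivalent to the unit-capacity max-flow construction just described; the following sketch (illustrative, not from the paper) implements it:

```python
def integral_assignment(edges, b):
    """Round a fractional assignment: give every left vertex one adjacent
    right vertex so that right vertex v receives at most b[v] left
    vertices.  Augmenting-path search on the bipartite graph, equivalent
    to the unit-capacity max-flow construction in the text."""
    match = {}                       # left vertex -> chosen right vertex
    holders = {v: [] for v in b}     # right vertex -> left vertices on it

    def place(u, v):
        if u in match:
            holders[match[u]].remove(u)
        match[u] = v
        holders[v].append(u)

    def augment(u, seen):
        for v in edges[u]:
            if v in seen:
                continue
            seen.add(v)
            # use spare capacity at v, or reroute one current holder of v
            if len(holders[v]) < b[v] or any(augment(w, seen)
                                             for w in list(holders[v])):
                place(u, v)
                return True
        return False

    for u in edges:
        if not augment(u, set()):
            raise ValueError("no integral assignment exists")
    return match

# peers a, b, c; (peer, slot) vertices v1 (capacity 2) and v2 (capacity 1)
match = integral_assignment({"a": ["v1"], "b": ["v1", "v2"], "c": ["v1"]},
                            {"v1": 2, "v2": 1})
```

Each failed reroute marks its right vertex as seen, so one augmentation explores every edge at most once.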

Lemma 10.

$(x^{(2)}, y^{(2)})$ satisfies constraints (4) and (5) for peers in $Z$. If upload capacities were three times the original, then $(x^{(2)}, y^{(2)})$ would also satisfy constraint (6).

Proof.

In the assignment used to define $x^{(2)}$, every peer in $Z$ is assigned to a (peer, slot) pair. This implies that for $j \in Z$,

$\sum_{k,p}\sum_{i\in[0..n]} x^{(2)}_{ijkp} \ge 1,$

which is constraint (4).

For a peer $\langle l,h\rangle$, $h \ge 2$, we have set $y^{(2)}_{\langle l,h\rangle t} = y^{(1)}_{\langle l,h\rangle t} = 0$ for $t < c(\langle l,h-1\rangle)$, and hence we only need to check constraint (5) for $t = c(\langle l,h-1\rangle)$. From our construction of the graph $G$, it follows that in the integer assignment, peer $\langle l,h\rangle$ is assigned to a vertex $(i,k,p)$ such that $(k,p) < c(\langle l,h-1\rangle)$. This implies that for $t = c(\langle l,h-1\rangle)$,

$\sum_{k,p:(k,p)<t}\sum_{i\in[0..n]} x^{(2)}_{i\langle l,h\rangle kp} \ge 1 \ge y^{(2)}_{\langle l,h\rangle t},$

which shows that constraint (5) is satisfied.

As concerns constraint (6), note that for any time $t \in [0..T]$ and peer $i \in [0..n]$,

$\sum_{k,p:t\in(k,p)}\sum_{j\in[n]} x^{(2)}_{ijkp}2^{-k} < \sum_{k,p:t\in(k,p)} x^{(1)}_{ikp}2^{-k} + \sum_{k,p:t\in(k,p),\,2^{-k}\le u(i)} 2^{-k}$
$= \sum_{k,p:t\in(k,p)}\sum_{j\in[n]} x^{(1)}_{ijkp}2^{-k} + \sum_{k=\lceil\log_2 \min\{1,u(i)\}^{-1}\rceil}^{K} 2^{-k}$
$\le u(i)\,y^{(1)}_{it} + 2\min\{1,u(i)\} \le 3u(i)\,y^{(2)}_{it},$

where the last inequality follows from the fact that peer $i$ is uploading the chunk at time $t$ in the solution $(x^{(1)}, y^{(1)})$, and hence $y^{(1)}_{it} = y^{(2)}_{it} = 1$.

Since every peer downloads the chunk from one peer, the solution $(x^{(2)}, y^{(2)})$ is non-migratory. However, it violates the constraint that the total download scheduled from a peer at any time should not exceed the upload capacity of that peer. In addition, we still need to distribute the chunk to the class leaders.

4.3.3 Scaling and shifting the schedule

We first project the solution $(x^{(2)}, y^{(2)})$ to obtain a solution that does not reference slots. Let $x^{(3)}_{ijt} = \sum_{k,p:t\in(k,p)} x^{(2)}_{ijkp}2^{-k}$ be the fraction of the chunk sent from peer $i$ to peer $j$ in time slot $t$. For $j \in Z$, we have

$\sum_{i\in[0..n]}\sum_{t\in[0..T]} x^{(3)}_{ijt} = \sum_{i\in[0..n]}\sum_{t\in[0..T]}\sum_{k,p:t\in(k,p)} x^{(2)}_{ijkp}2^{-k} = \sum_{i\in[0..n]}\sum_{k,p} x^{(2)}_{ijkp} \ge 1,$

and hence every peer in $Z$ receives the complete chunk. Since $x^{(2)}$ is 0/1, peer $j$ downloads the chunk in some slot $(k,p)$ with $k \ge \phi(j)$. This implies $x^{(3)}_{ijt} \le 2^{-\phi(j)} \le d(j)$ for all $t \in [0..T]$, and hence no peer downloads at a rate exceeding its download capacity.

Since $y^{(2)}$ is 0/1 and

$x^{(3)}_{ijt} = \sum_{k,p:t\in(k,p)} x^{(2)}_{ijkp}2^{-k} \le \sum_{j\in[n]}\sum_{k,p:t\in(k,p)} x^{(2)}_{ijkp}2^{-k} \le 3u(i)\,y^{(2)}_{it},$

$x^{(3)}_{ijt} > 0$ implies $y^{(2)}_{it} = 1$. Hence, peer $i$ uploads the chunk only after it has completed its download. Finally, the total upload by peer $i$ in time slot $t$ can be bounded as

$\sum_{j\in[n]} x^{(3)}_{ijt} \le \sum_{j\in[n]}\sum_{k,p:t\in(k,p)} x^{(2)}_{ijkp}2^{-k} \le 3u(i)\,y^{(2)}_{it} \le 3u(i).$

To ensure that this does not exceed the upload capacity of peer $i$, we scale and stretch the solution $x^{(3)}$ by a factor of 3. Formally, for $t \in [0..T]$, $j \in Z$ and $i \in [0..n]$, we set

$x^{(3)}_{ij(3t)} = x^{(3)}_{ij(3t+1)} = x^{(3)}_{ij(3t+2)} = x^{(3)}_{ijt}/3,$

where the right-hand side refers to the solution before stretching. With this modification, the total upload by peer $i$ in a time slot does not exceed its upload capacity $u(i)$.
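The scale-and-stretch step can be sketched as follows, assuming schedules are maps from (uploader, downloader, slot) to rates; this representation is illustrative and not from the paper:

```python
def stretch_by_three(x3):
    """Replace each unit time slot t by three slots 3t, 3t+1, 3t+2, each
    carrying a third of the rate, so the per-slot upload of a peer drops
    from at most 3*u(i) to at most u(i) while the total transferred
    fraction is preserved."""
    out = {}
    for (i, j, t), rate in x3.items():
        for s in range(3):
            out[(i, j, 3 * t + s)] = rate / 3
    return out

# peer 0 sends 0.9 of the chunk to peer 1 in slot 0
x4 = stretch_by_three({(0, 1, 0): 0.9})
```

The makespan triples, but no other constraint is disturbed, since download rates only decrease.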

Finally, we shift this solution by $P$ time units and use the interval $[0,P]$ to distribute the chunk to the class leaders, as was done in the migratory setting. The final solution, which we denote by $\bar{x}$, is defined as $\bar{x}_{ij(t+P)} = x^{(3)}_{ijt}$. Note that the makespan of the solution $\bar{x}$ is $3T^*+P$, where $T^*$ is the minimum makespan of a slotted schedule. By Lemma 7, $T^* \le 6\tilde{T}$, where $\tilde{T}$ is the minimum makespan of a non-migratory schedule.

Theorem 11.

There is a polynomial-time algorithm to find a non-migratory schedule of makespan at most $18\tilde{T}+P$, where $\tilde{T}$ is the minimum makespan of a non-migratory schedule and $P \le \tilde{T}$ is the maximum, over all peers, of the minimum time that a peer needs to download the chunk from the root.

5 Conclusion

This paper takes a linear programming approach to scheduling the transfers of a chunk of a file in a hose-model network and presents the first results for the general setting in which the upload and download capacities of peers may differ, both for a single peer and across peers. We revisit the assumption made in earlier papers that a peer can download the chunk from only one other peer (we call this the non-migratory setting); the migratory setting has no such constraint. We present approximation results for the makespan in both settings and in various special cases. The non-migratory assumption means that the solution defines a tree, and hence all previous approaches were in effect focused on constructing a suitable tree. Whether the migratory setting (like the non-migratory setting) is NP-hard remains to be studied.

Extending our approach to multiple chunks seems challenging. For sharing multiple chunks, one would like to design a schedule that is periodic: the schedule for distributing each chunk is identical, albeit shifted in time. Periodic schedules are easy to implement in a network since the same upload and download policies are followed for every chunk. However, in a periodic schedule the upload capacity for a chunk is no longer a monotonically non-decreasing function, and this causes most of our arguments to break down.

It would be interesting to see whether the algorithm in this paper can be redesigned so that it does not involve solving a linear program, or whether the linear program can be solved very efficiently, perhaps using a combinatorial algorithm.

References

  • [1] Satyajeet Singh Ahuja, Varun Gupta, Vinayak Dangui, Soshant Bali, Abishek Gopalan, Hao Zhong, Petr Lapukhov, Yiting Xia, and Ying Zhang. Capacity-efficient and uncertainty-resilient backbone network planning with hose. In Fernando A. Kuipers and Matthew C. Caesar, editors, ACM SIGCOMM 2021 Conference, Virtual Event, USA, August 23-27, 2021, pages 547–559. ACM, 2021. doi:10.1145/3452296.3472918.
  • [2] Amazon Web Services. What is cloud networking? Accessed on 27/01/2025. URL: https://aws.amazon.com/what-is/cloud-networking/.
  • [3] Mohammad Banikazemi, Vijay Moorthy, and Dhabaleswar K. Panda. Efficient collective communication on heterogeneous networks of workstations. In 1998 International Conference on Parallel Processing (ICPP ’98), 10-14 August 1998, Minneapolis, Minnesota, USA, Proceedings, pages 460–467. IEEE Computer Society, 1998. doi:10.1109/ICPP.1998.708518.
  • [4] Amotz Bar-Noy and Ching-Tien Ho. Broadcasting multiple messages in the multiport model. IEEE Trans. Parallel Distributed Syst., 10(5):500–508, 1999. doi:10.1109/71.770196.
  • [5] Amotz Bar-Noy, Shlomo Kipnis, and Baruch Schieber. Optimal multiple message broadcasting in telephone-like communication systems. Discret. Appl. Math., 100(1-2):1–15, 2000. doi:10.1016/S0166-218X(99)00155-9.
  • [6] Nick G. Duffield, Pawan Goyal, Albert G. Greenberg, Partho Pratim Mishra, K. K. Ramakrishnan, and Jacobus E. van der Merwe. A flexible model for resource management in virtual private networks. In Lyman Chapin, James P. G. Sterbenz, Guru M. Parulkar, and Jonathan S. Turner, editors, Proceedings of the ACM SIGCOMM 1999 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, August 30 - September 3, 1999, Cambridge, Massachusetts, USA, pages 95–108. ACM, 1999. doi:10.1145/316188.316209.
  • [7] Michael Elkin and Guy Kortsarz. Sublogarithmic approximation for telephone multicast. Journal of Computer and System Sciences, 72(4):648–659, 2006. doi:10.1016/j.jcss.2005.12.002.
  • [8] Kai-Simon Goetzmann, Tobias Harks, Max Klimm, and Konstantin Miller. Optimal file distribution in peer-to-peer networks. In Takao Asano, Shin-Ichi Nakano, Yoshio Okamoto, and Osamu Watanabe, editors, Algorithms and Computation - 22nd International Symposium, ISAAC 2011, Yokohama, Japan, December 5-8, 2011. Proceedings, volume 7074 of Lecture Notes in Computer Science, pages 210–219. Springer, 2011. doi:10.1007/978-3-642-25591-5_23.
  • [9] Sandra Mitchell Hedetniemi, Stephen T. Hedetniemi, and Arthur L. Liestman. A survey of gossiping and broadcasting in communication networks. Networks, 18(4):319–349, 1988. doi:10.1002/net.3230180406.
  • [10] J. Hromkovic, R. Klasing, B. Monien, and R. Peine. Dissemination of information in interconnection networks (broadcasting and gossiping). In F. Hsu and D.-Z. Du, editors, Combinatorial Network Theory, pages 125–212. Kluwer Academic Publishers, 1995.
  • [11] Samir Khuller and Yoo Ah Kim. Broadcasting in heterogeneous networks. Algorithmica, 48(1):1–21, 2007. doi:10.1007/s00453-006-1227-9.
  • [12] Pangfeng Liu. Broadcast scheduling optimization for heterogeneous cluster systems. Journal of Algorithms, 42(1):135–152, 2002. doi:10.1006/jagm.2001.1204.
  • [13] Jochen Mundinger, Richard R. Weber, and Gideon Weiss. Optimal scheduling of peer-to-peer file dissemination. J. Sched., 11(2):105–120, 2008. doi:10.1007/s10951-007-0017-9.
  • [14] R. Ravi. Rapid rumor ramification: Approximating the minimum broadcast time (extended abstract). In 35th Annual Symposium on Foundations of Computer Science, Santa Fe, New Mexico, USA, 20-22 November 1994, pages 202–213. IEEE Computer Society, 1994. doi:10.1109/SFCS.1994.365693.

Appendix A Special cases of non-migratory schedules

The high approximation guarantee of Theorem 11 motivates us to study two special cases of the non-migratory model: when all peers have the same download capacity and when all peers have the same upload capacity. For uniform download capacities, we adapt a proof by Liu [12] to show that the Greedy algorithm produces the optimum makespan for instances where all upload capacities are powers of 2; the full details of this special case and the proofs of lemmas marked with (★) are deferred to the full paper.

A.1 Uniform upload capacity

We now assume that $u(j) = u$ for all $j$, and recall that $d(j) \in (0,1]$. If $u < 1$, then peer $i$ can transfer the chunk to peer $j$ at a maximum rate $\min\{d(j), u\}$. Since a peer can receive the chunk from only one peer, there is no loss of generality in limiting download capacities to $u$; hence, for all $j$, $d(j) \in (0,u]$. Next, we divide all download and upload capacities by $u$ to get an instance, $I$, with upload capacity 1 and $d(j) \in (0,1]$ for all $j \in [n]$.

We now describe a procedure for scheduling the downloads of peers that involves modifying the download capacities of some peers as well as adding some dummy peers. We begin by rounding down each $d(j)$ to the nearest power of $1/2$ and renumbering the peers so that $d(1) \ge d(2) \ge \cdots \ge d(n)$. For the rest of this section, we assume that $d(j)$, $j \in [n]$, is a power of $1/2$. We schedule the downloads of the peers in blocks, in the order 1 to $n$. Let $j$ be the earliest peer whose download has not yet been scheduled, and let $t$ be the earliest time by which peers $1..(j-1)$ have completed their downloads; note that $j$ peers (including the root) have received the chunk by time $t$, so the upload capacity at time $t$ is $U(t) = j$. Initially, at $t = 0$, we have $U(0) = j = 1$.

Let $k = U(t)/d(j)$ and note that $k$ downloads at rate $d(j)$ can be scheduled during $(t, t+1/d(j)]$; in the subsequent discussion, we refer to this time interval as a block. Peers $j, j+1, \ldots, j+k-1$ are the $k$ peers downloading the chunk in this block, and we increase their download capacities to $d(j)$. If $n < j+k-1$, i.e., the number of remaining peers is less than $k$, then we create $j+k-n-1$ dummy peers with upload capacity 1 and download capacity $d(j)$. At the end of this round, the value of $j$ increases to $j+k$, while that of $t$ increases to $t+1/d(j)$. We continue to the next round if $j \le n$. Let $T_0$ be the value of $t$ when this procedure completes, and let $I'$ be the instance obtained. Note that $I'$ might have more peers than the original instance $I$, and in $I'$ some peers might have a download capacity different from their download capacity in $I$. Let $S$ be the schedule obtained; note that $S$ is a valid schedule for $I'$ and has makespan $T_0$.
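The block-construction procedure can be sketched as follows; the list-based representation and the implicit treatment of dummy peers are illustrative assumptions, not from the paper:

```python
def block_schedule(d):
    """Greedy block construction sketched above: d lists download
    capacities (powers of 1/2) in non-increasing order; all upload
    capacities are 1.  Returns blocks (start, length, first_peer,
    num_downloads) and the completion time T0; dummy peers are implicit
    when a block has more room than remaining peers."""
    t, j = 0.0, 1                 # current time, next unscheduled peer
    blocks = []
    n = len(d)
    while j <= n:
        rate = d[j - 1]
        U = j                     # upload capacity: j finishers incl. root
        k = round(U / rate)       # downloads at rate d(j) fitting in U
        blocks.append((t, 1 / rate, j, k))
        t += 1 / rate
        j += k                    # dummy peers counted when j+k-1 > n
    return blocks, t

blocks, T0 = block_schedule([1, 0.5, 0.5, 0.25])
```

On this toy instance the root serves peer 1 in $(0,1]$; then upload capacity 2 supports four rate-$1/2$ downloads in $(1,3]$ (peer 4's capacity is boosted to $1/2$ and one dummy peer is added), giving $T_0 = 3$.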

Lemma 12 (★).

$\mathit{OPT}(I') \le 2\,\mathit{OPT}(I)+P$.

We now convert the schedule $S$ into a schedule $S'$ that is valid for the instance $I$ and has makespan at most $T_0+P$. Consider a block $B$ in $S$ in which peers $j$ to $j+k-1$ download the chunk during the interval $(t, t+1/d(j)]$. Let $j' = j+k$ and $t' = t+1/d(j)$, and note that $B'$, the block following $B$ in $S$, corresponds to the interval $(t', t'+1/d(j')]$. We observe that the peers who download the chunk in block $B$ have download capacity (in $I$) at least $d(j')$ and can therefore complete their downloads within block $B'$ in a valid schedule for instance $I$. This suggests constructing $S'$ from $S$ by simply shifting the downloads of the peers in each block to the following block. A block $(T_0, T_0+P]$ is appended to $S'$ to accommodate the downloads of the peers in the last block of $S$. Hence $S'$ has makespan $T_0+P$.

The last piece of the argument is to prove that $T_0 \le \mathit{OPT}(I')$. To do so, we show that a makespan of $T_0-1$ is infeasible for the instance $I'$, using a general technique for arguing the infeasibility of a makespan value for a given instance of the problem.

Let $y: [T] \to \mathbb{R}_{\ge 0}$ be a monotonically non-increasing assignment of weights to time slots. If $U(t)$ is the upload capacity in time slot $t$ in a given schedule $x$, then we refer to the quantity $\sum_{t\in[T]} y(t)\,U(t)$ as the total weighted upload capacity of $x$ and denote it by $W(x,y)$.

Let $y[a..b]$ denote the sum $\sum_{t\in[a..b]} y(t)$. If peer $j$ finishes its download at time $t$ in $x$, then it contributes $y(t..T]\,u(j)$ to $W(x,y)$, while its download consumes at least $y((t-1/d(j))..t]\,d(j)$ units of $W(x,y)$. Thus the net contribution of peer $j$ to $W(x,y)$, if it completes its download at time $t$ in $x$, is $w(j,t) = y(t..T]\,u(j) - y((t-1/d(j))..t]\,d(j)$. Finally, for $j \in [n]$, let $w(j) = \max_{t\in[1/d(j)..T]} w(j,t)$, and let $w(0)$, the net contribution of peer 0 to $W(x,y)$, equal $u(0)\,y[1..T]$. Note that $w(j)$, $j \in [0..n]$, is independent of $x$ but depends on our choice of $y$.

Claim 13.

If $T$ is a feasible value for the makespan, then for any choice of $y$, $\sum_{j\in[0..n]} w(j) \ge 0$.

Proof.

Let $x$ be a schedule with makespan $T$ and let $x_{ijt}$, $i\in[0..n]$, $j\in[n]$, $t\in[T]$, be the fraction of the chunk that peer $j$ downloads from peer $i$ in time slot $t$. Let $c(j)$ be the time at which peer $j$ completes its download. For any peer $j$, $\sum_{t\in[c(j)]}\sum_{i\in[0..n]} x_{ijt} = 1$, and for any time $t \in [c(j)]$, $\sum_{i\in[0..n]} x_{ijt} \le d(j)$. Together with the fact that $y$ is non-increasing, this implies that for all $j \in [n]$,

$\sum_{t\in[c(j)]}\sum_{i\in[0..n]} x_{ijt}\,y(t) \ge y((c(j)-1/d(j))..c(j)]\,d(j).$ (7)

For all $t \in [T]$,

$\sum_{i\in[0..n]}\sum_{j\in[n]} x_{ijt} \le U(t) = \sum_{j\in[0..n]:\, c(j)<t} u(j),$

where we set $c(0)=0$. Multiplying the above inequality by $y(t)$ and summing over all $t \in [T]$ yields

$\sum_{t\in[T]} y(t)\sum_{i\in[0..n]}\sum_{j\in[n]} x_{ijt} \le \sum_{t\in[T]} y(t)\sum_{j\in[0..n]:\, c(j)<t} u(j) = \sum_{j\in[0..n]} u(j) \sum_{t\in[c(j)+1..T]} y(t).$

We now sum inequality (7) over all $j \in [n]$ and combine it with the above inequality to obtain

$\sum_{j\in[n]} y((c(j)-1/d(j))..c(j)]\,d(j) \le \sum_{t\in[T]}\sum_{i\in[0..n]}\sum_{j\in[n]} x_{ijt}\,y(t) \le \sum_{j\in[0..n]} y(c(j)..T]\,u(j),$

which implies that $\sum_{j\in[0..n]} w(j, c(j)) \ge 0$, where $w(0,c(0)) = w(0)$. The claim follows from the fact that $w(j) \ge w(j, c(j))$ for $j \in [n]$.

The above claim implies that if we can find an assignment of weights $y: [T_0-1] \to \mathbb{R}_{\ge 0}$ such that $\sum_{j\in[0..n]} w(j) < 0$, then we can argue that $\mathit{OPT}(I') > T_0-1$.

Lemma 14 (★).

$T_0 \le \mathit{OPT}(I')$.

Combining Lemmas 12 and 14, we obtain $T_0 \le 2\,\mathit{OPT}(I)+P$. Hence the makespan of our schedule $S'$ is at most $T_0+P \le 2\,\mathit{OPT}(I)+2P$.

Theorem 15.

There exists a polynomial time algorithm for the broadcast problem in the hose model with uniform upload capacities that finds a solution with makespan at most 2𝙾𝙿𝚃+2P.