
An Efficient Polynomial Time Approximation Scheme for Minimizing the Total Weighted Completion Time on Uniformly Related Machines

Leah Epstein, Department of Mathematics, Faculty of Natural Sciences, University of Haifa, Israel; Asaf Levin, Faculty of Data and Decision Sciences, The Technion, Haifa, Israel
Abstract

We study a classic scheduling problem on uniformly related machines for which we show an efficient polynomial time approximation scheme (EPTAS), where an EPTAS is a fast and practical approximation scheme. For a desired approximation ratio of 1+ε for ε>0, the running time of an EPTAS is a function of ε multiplied by a polynomial function of the input length. New methods and techniques are essential in developing such improved approximation schemes, and their design is a primary goal of this research agenda. We present an EPTAS for the scheduling problem of a set of jobs on uniformly related machines so as to minimize the total weighted completion time. The problem is NP-hard in the strong sense, and therefore an EPTAS is the best possible approximation scheme for the problem, unless P=NP. Prior to our work, only a PTAS was known for the problem, while an EPTAS was known only for the special case of identical machines.

Keywords and phrases:
Scheduling algorithms, Approximation schemes, Min-sum objective
Funding:
Asaf Levin: Partially supported by ISF – Israel Science Foundation grant number 1467/22.
Copyright and License:
© Leah Epstein and Asaf Levin; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Scheduling algorithms
Editors:
Pat Morin and Eunjin Oh

1 Introduction

We consider one of the most basic multiprocessor scheduling problems: scheduling on uniformly related machines with the goal of minimizing the total weighted completion time of jobs. More precisely, our scheduling problem is defined as follows. We are given a set of n jobs, where each job has a positive size and a positive weight associated with it. We are also given a set of m machines to be used for the processing of jobs, such that each machine has a given positive speed. Running job j on machine i requires the allocation of a time interval on this machine, whose length is precisely the size of j divided by the speed of i, which is called the processing time of j on i.

We consider non-preemptive schedules and thus every job is assigned to a machine and to a single (continuous) time slot (or interval) on that machine, under the following conditions. The length of the time slot assigned to job j (on one of the machines) has to be the processing time of j on that machine. A machine can process at most one job at each time, so the time intervals assigned to two jobs that are assigned to a common machine do not intersect in an inner point. Given such a schedule, the completion time of job j is defined as the ending point of the time interval of j, and the weighted completion time of j is the product of its weight and its completion time. The goal is to find a schedule for which the sum of the weighted completion times of all jobs is minimized.

In the scheduling community there is a consensus that the objective of weighted sum of job completion times is the most important min-sum objective, and together with the makespan minimization these are the two most important objective functions in the scheduling literature. A special case of our problem is that of identical machines, where each machine has speed 1. A generalization of these problems is the one where each job also has a release date associated with it, which restricts its starting time such that it cannot start running before its release date. Here, all release dates are equal to zero. Another well-motivated special case of our problem is the unweighted one, where all weights are equal to 1.

We study here a core problem in scheduling theory. We refer to [13, 27, 16, 29, 35, 5] for some recent work on the problem with respect to various aspects.

Now, we define the notions of approximation algorithms and the different types of approximation schemes. A ρ-approximation algorithm for a minimization problem is a polynomial time algorithm that always finds a feasible solution of cost at most ρ times the cost of an optimal solution. The infimum value of ρ for which an algorithm is a ρ-approximation is called the approximation ratio or the performance guarantee of the algorithm. A polynomial time approximation scheme (PTAS) is a family of approximation algorithms such that the family has a (1+ε)-approximation algorithm for any ε>0. An efficient polynomial time approximation scheme (EPTAS) is a PTAS whose time complexity is of the form f(1/ε) multiplied by poly(n), where f is some (not necessarily polynomial) computable function, and poly(n) is a polynomial function of the length of the (binary) encoding of the input. A fully polynomial time approximation scheme (FPTAS) is a stronger concept, defined like an EPTAS, but the function f must be a polynomial in 1/ε.

In this paper, we are interested in EPTAS's, and we say that an algorithm (for some problem) has such a running time if its time complexity is of the form f(1/ε)·poly(n). Note that while a PTAS may have time complexity of the form n^{g(1/ε)}, where g can be polynomial or even super-exponential, this cannot be the case for an EPTAS. The notion of an EPTAS is modern and finds its roots in the FPT (fixed parameter tractable) literature (see [7, 9, 14, 31]). It was introduced in order to distinguish practical from impractical running times of PTAS's, for cases where an FPTAS does not exist (unless P=NP). In this work, we design an EPTAS for the scheduling problem defined above, for which an FPTAS cannot exist unless P=NP.

The seminal work of Smith [34] established the existence of a polynomial time algorithm (of running time O(n log n)) for solving the problem of minimizing the total weighted completion time on a single machine. This algorithm can be described as follows. The jobs are scheduled according to a non-increasing order of their densities, starting at time zero, and without any idle time (where the density of job j is the ratio between its weight and its size). This ratio is called Smith's ratio, and any tie-breaking policy leads to an optimal solution. The correctness of this algorithm follows by a simple exchange argument. This algorithm generalizes the SPT (shortest-processing-time) approach for the case of equal weights. In our setting, we can conclude the following property for the problem. Once the jobs have been assigned to machines (but not to time slots), the permutation of jobs assigned to a given machine is fixed according to Smith's algorithm.
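As an illustration, the following minimal Python sketch implements Smith's rule on a single machine of speed 1 (the function name and the job data are ours):

def smith_schedule(jobs):
    """Schedule jobs on one unit-speed machine by Smith's rule and return
    the order and the total weighted completion time (a minimal sketch).

    jobs: list of pairs (size, weight); the data below is hypothetical.
    """
    order = sorted(jobs, key=lambda job: job[1] / job[0], reverse=True)
    time, total = 0.0, 0.0
    for size, weight in order:
        time += size              # completion time of this job
        total += weight * time    # its weighted completion time
    return order, total

print(smith_schedule([(4, 2), (1, 3), (2, 2)]))   # densities 0.5, 3.0, 1.0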

For a constant number of machines (at least two machines), the problem is NP-hard in the ordinary sense, but it is solvable in pseudo-polynomial time and has an FPTAS [32]. For the case where the number of machines is a part of the input, the problem is strongly NP-hard (see e.g. problem SS13 in [15]). The case of equal weights is polynomially solvable even for the more general case of unrelated machines [22]. The fact that our problem is strongly NP-hard excludes the possibility of designing an FPTAS for it, and thus an EPTAS is the best possible result (unless P=NP).

The development of good approximation algorithms for the problems studied here was fairly slow. Until the late 1990's, only constant-factor approximation algorithms were known for min-sum scheduling problems such as the ones we study. We refer to the papers cited in [33, 1, 8] for a survey of such results. Here we elaborate on the approximation schemes in the literature. The first approximation scheme for a special case of our problem was introduced by Skutella and Woeginger [33], and it was designed for the special case of identical machines. Moreover, that scheme is in fact an EPTAS for this special case. Shortly afterwards, Afrati et al. [1] presented a PTAS for the problem on identical machines with release dates. Their approach was generalized by Chekuri and Khanna [8], who showed the existence of a PTAS for uniformly related machines with release dates (see also [2]). The discussed schemes of [1, 8] are not EPTAS's, and they cannot be converted into EPTAS's (using known methods) even for the case without release dates studied here.

Before explaining the methods and techniques of these last schemes, as well as the limitations of those approaches, we mention the state of the art of approximation schemes for general load balancing problems on identical machines and uniformly related machines. The relation between the problem studied here and this family of problems will become clear later. In load balancing problems, the goal is to minimize functions of the vector of machine completion times (and not of job completion times, as we study here). Hochbaum and Shmoys presented the dual approximation framework and used it to show that the makespan minimization problem has a PTAS for identical machines [20] and for uniformly related machines [21]. In the dual approximation framework, the problem is converted into a bin packing type problem by fixing the machine loads. It was noted in [18] that the PTAS of [20] for identical machines can be converted into an EPTAS by using an integer program in fixed dimension instead of dynamic programming. Jansen [23] developed an EPTAS for the makespan minimization problem on uniformly related machines (see [24] for an alternative EPTAS for this problem and [25, 6] for EPTAS's with improved time complexity for this problem). The ℓ_p norm minimization problem (of the vector of machine completion times) has an EPTAS for identical machines [3, 4], and a PTAS and an EPTAS for uniformly related machines [10, 11]. The EPTAS for the ℓ_p norm minimization problem on uniformly related machines presented in [11] provides a method for constructing an EPTAS for our problem if all job densities are equal. However, this does not translate into an EPTAS for the problem even if there are only a constant number of distinct job densities. After the preliminary version of our work was posted [12], Kones and Levin [28] used the framework we present here for the case of a constant number of job densities in order to establish an alternative EPTAS for the ℓ_p norm minimization problem on uniformly related machines, and they used this framework to approximate a generalization of this problem.

Previous directions.

Next, we elaborate on the known approximation schemes for related problems. Skutella and Woeginger [33] observed that for identical machines, the ℓ_2 norm minimization of the vector of machine loads is equivalent to minimizing the total weighted completion time, if the jobs have equal densities. This equivalence means that an optimal solution to one problem is an optimal solution to the other, as the values of the two objective functions differ by an additive constant (which is common to all feasible solutions). Using this approach, they showed that if the ratio between the maximum density and the minimum density of jobs is upper bounded by a constant, then one can adapt the ideas of Alon et al. [3, 4] and obtain an EPTAS for this restricted setting.
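In concrete terms, for identical machines of speed 1 and a common density Δ (so ω_j = Δ·a_j), with L_i denoting the total size of the jobs assigned to machine i, the standard computation behind this observation gives

Σ_j ω_j·C_j = Δ·Σ_i Σ_{j on machine i} a_j·C_j = (Δ/2)·Σ_i L_i² + (Δ/2)·Σ_j a_j²,

and the second summand is common to all schedules, so minimizing the total weighted completion time amounts to minimizing Σ_i L_i², the squared ℓ_2 norm of the vector of machine loads.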

The scheme of [33] for the problem on identical machines is based on the following ideas. First, round all the job sizes and job weights (up or down) to integer powers of 1+ε. Next, apply a randomized geometric partitioning of the jobs based on their (rounded) densities, solve each sub-instance consisting of all jobs of one part of the partition using the scheme (which is similar to the one of [3, 4]), and schedule the partial solutions on every machine sorted by non-increasing job densities. This last step of combining the solutions for the different parts of the partition was more delicate in [33], but as noted in [1], the last step can be made much simpler. Afrati et al. [1] also noted that this approach can be extended to obtain an approximation scheme (a PTAS) for the problem without release dates in other machine environments (see the last remark in [1]). Chekuri and Khanna [8] extended the techniques of [1] to the setting of uniformly related machines (with release dates). The methods of [1, 8] fail completely when one tries to obtain an EPTAS for these problems with or without release dates (even for identical machines).

Our results.

An EPTAS for the total weighted completion time on uniformly related machines was not known prior to our work; such a scheme was known only for the much simpler case of identical machines. We design such an EPTAS here, using a number of methods explained in the next section. In an accompanying article (see the second part of [12]), we design an EPTAS for the total weighted completion time on uniformly related machines with release dates. The current EPTAS is designed for the important special case without release dates, and while this EPTAS requires new ideas, the scheme is significantly simpler than the one for the general case. Before presenting the detailed description of the scheme and its analysis, we present an overview of the scheme. A forthcoming full version will contain additional details and omitted proofs.

2 An Overview of Our Scheme

In our scheme, all parameters are rounded to integer powers of a parameter 1+δ (where δ = ε/Υ for a constant Υ > 0 that is independent of ε) [33, 1]. Thus, the first step is rounding the job sizes, the job weights, and the speeds of the machines to integer powers of 1+δ. Observe that this step has a minor effect on the performance guarantee of the approximation algorithm, and it is used to structure the input in the later steps. This is a standard approach which is suitable here. We say that a set of values is discrete if for every interval [L,U] (for every 0 < L < U) we have at most O(log(U/L)) distinct values of the set in the interval. Then, we would like to consider discrete values for sizes, weights, and speeds, which gives us also discrete densities and discrete job processing times on the machines.

Next, we use the shifting technique [19] by modifying the weights of some jobs to separate the input into bounded instances in terms of job sets. For each bounded instance, it will hold that its jobs have sufficiently similar densities. We eliminate intervals of densities from the input (and not intervals of weights). This is done by increasing weights of jobs whose densities belong to intervals that will be forbidden, so not just one interval of densities is removed but a regular pattern of densities is removed, creating gaps between density intervals. Finding the best sequence of gaps requires running our algorithm for all options.

Once there are regular gaps between density intervals, we show that each instance can be solved separately and independently from other instances, and the solutions can be combined without further modifications. That is, we apply an approach that is similar to [33], where we replace the randomized geometric partitioning by the stronger shifting technique (resulting in a slightly simpler analysis, as one of the cases in [33], that of jobs of similar densities in different instances, is avoided, because there will be no such cases). In our shifting procedure, we ensure the following for a pair of jobs j and j′ such that the density of j is not smaller than the density of j′. If j and j′ belong to a common bounded instance, then the ratio between the density of j and the density of j′ is upper bounded by a constant (that depends on ε), while if j and j′ belong to different bounded instances, the ratio between the two densities is very large (at least a larger constant that also depends on ε). When we obtain an approximated solution for each such bounded instance, we simply concatenate the solutions by processing each job on the machine it is assigned to (in the solution to its bounded instance) and processing the jobs assigned to each machine according to their Smith's ratios. By the separation of densities, we justify this, and conclude that the cost of the resulting solution is only slightly larger than the sum of the costs of the solutions to the bounded instances. Therefore, if we are able to ensure a (1+δ)-approximation for each bounded instance, we will get an EPTAS for the instance resulting from this shifting technique.

The new part, where we cannot follow previous work, is where we solve each bounded instance (due to shifting, there is a polynomial number of such instances). Unlike the approach in [33], since we deal with machines of possibly different speeds, we use configuration mixed-integer linear programs (MILPs), where we require a variable of a machine configuration to be integral if the machine is heavily loaded, that is, if it receives a large total size of jobs. A configuration is a list of jobs assigned to one machine of a certain speed, where the list of relatively small jobs is not defined precisely, but the total size is within a given short interval. In order to define the notion of a heavily loaded machine, we “guess” the (approximate) load of the fastest machine (breaking ties in favor of low indices) counting the job set of the current bounded instance. Then, for the correct guess we can upper bound the load of each one of the other machines (where the load of a machine is its completion time, that is, the work divided by the speed, where the work of a machine is the total size of jobs of the current bounded instance assigned to it). In this way, we obtain a valid upper bound on the work of every other machine. This upper bound on the load of every machine or on its work can be seen as an alternative to the dual approximation method of Hochbaum and Shmoys [20, 21] for cases where this dual approximation method cannot be applied. In that method there is one upper bound for the completion time of every machine, which is not possible here since our objective is not the makespan.

We say that a machine is heavily loaded if its assigned work is at least a constant fraction of this upper bound. By splitting the machine set into slow and fast machines, we show that a slow machine is never heavily loaded in optimal solutions. Moreover, as mentioned earlier there is a limit to how much a fast machine can be loaded (in terms of the load of the first fastest machine, which we guess approximately), and as a result, the number of heavy configurations is a constant.

A machine configuration consists of the speed of the machine having this configuration, the load of this machine, and a complete list of relatively large jobs together with their densities (the jobs are large compared to the total size of jobs assigned to this machine), where a job is specified by its size and weight but not by its identity; additionally, a configuration contains an approximate total size of smaller jobs of each density. The partition of jobs into small and large (for a given configuration) is based both on the load and on the speed of that machine (it is not based only on the speed, in the way that it is defined in many approximation schemes). This property is essential since this problem does not allow the use of the dual approximation method. The assignment of jobs that are scheduled as small jobs is carried out by another set of assignment variables. These last assignment variables can be fractional in a feasible solution to the MILP. The job sets that are small for their configurations are different for distinct configurations due to different loads and speeds. Such jobs are assigned (fractionally) to configurations. Thus, for every job type (which consists of a rounded size and a rounded density), there are variables counting how many copies of the job are assigned as large jobs to each configuration (so every machine scheduled based on such a configuration receives this number of jobs of this type) and how many jobs are assigned as small jobs to such a configuration (in which case the job is not assigned to a specific copy of the configuration). Note that no jobs are treated as “sand”: every job is assigned to a configuration, since (due to the speeds) jobs cannot simply be partitioned into small and large, while assigning jobs to specific copies of configurations would not allow us to obtain the required running time. It is required to know the speed of the machine that will receive a certain job, because the objective of the MILP contains only a part of the cost incurred by small jobs, and the remainder, which depends on the speed, has to be added.

As in [33], we approximate the increase of cost due to the contribution of slightly larger set of small jobs using the total size of jobs of the configuration (in the sense that the increase is minor compared to this total size), and thus, the usage of a fractional assignment becomes possible, and it can be converted in turn into an integral one by slightly increasing the cost of each configuration. As for fractional configurations, we round down the number of machines with each such configuration. The total size of jobs that remain unassigned is sufficiently small to be combined into the schedule of a heavily loaded machine (at least one such fastest machine must exist, and this is sufficient for our purposes). Using the fact that we solve a bounded instance, we conclude that slightly increasing the load of a machine with respect to its value in the optimal solution to the MILP has a minor impact on the cost of the resulting solution.

Our scheme for the bounded instance is of independent interest, as it extends the methods of [23] for the ℓ_2-norm minimization problem (the case of one common density for all jobs) to obtain an EPTAS for a constant number of densities (where the constant depends on δ), whereas extending the original EPTAS for this ℓ_2-norm minimization problem [11] results in (only) a PTAS for the case of a constant number of densities. Thus, presenting a new EPTAS for the ℓ_2-norm minimization problem is indeed necessary for our generalization. We refer to Section 3 for a complete description of the approximation scheme explained in this overview, and we conclude that there is an EPTAS for our problem.

3 The Detailed Description and Formal Analysis of Our EPTAS

In this section we present our scheme and prove its correctness.

Properties.

Obviously, since any job can start running at any time, an optimal solution (or schedule) never introduces idle time, and moreover, every machine runs its assigned jobs in an optimal order, that is, the jobs are sorted by Smith's ratios. As any tie-breaking policy leads to the same objective function value, we use a specific tie-breaking policy in this section. More specifically, we will always break ties in favor of running larger jobs first, and in the case of equal size jobs (of the same weight), jobs of smaller indices are scheduled earlier. We call this ordering the natural ordering. Thus, we can define a solution or a schedule as a partition of the jobs into m subsets, one subset for each machine, and the order for each machine will always be the natural one. In some cases, we will compute the total weighted completion time of another permutation (not of the natural ordering). This will be done for the sake of analysis, when this calculation is easier, for cases where we are interested in an upper bound on the objective value and this upper bound is sufficient.

For an input X and a solution B, we let B(X) denote the output and the objective value of B for the input X. Recall that for a job j, the density of j is the ratio between its weight and its size. Thus, running the jobs sorted according to the natural ordering is equivalent to first sorting them by non-increasing density, and for each density, the jobs are sorted by non-increasing size, breaking ties (among equal size jobs) in favor of scheduling jobs of smaller indices earlier.

For a fixed schedule, the work of machine i is the total size of jobs assigned to it, and its load is its completion time, that is, the work divided by the speed (as we only consider schedules without any idle time). Let the original input instance be denoted by A, where job j has size aj>0 and weight ωj>0, and machine i has speed vi>0.

Let C_j denote the completion time of j under a given schedule, that is, the total size of jobs that run before j on the same machine (including j), divided by the speed of this machine. We let Γ_j be defined as ω_j·(C_j − a_j/(2v_i)), where C_j − a_j/(2v_i) is the time when half of job j is completed (known as the mean busy time of job j [17]). We call these values Γ-values, and obviously, the cost of a solution is at least the sum of the Γ-values. Moreover, for identical machines, the difference between the cost of a solution and the sum of the Γ-values is independent of the solution (see also [33]), whereas for uniformly related machines, this difference depends on the speeds, as we establish in the following lemma.

Lemma 1.

Consider a set of jobs Ĩ ⊆ A assigned to machine i. Let Φ = Σ_{j∈Ĩ} a_j and let ϕ = min_{j∈Ĩ} ω_j/a_j be the minimum density of any job of Ĩ. The sum of the Γ-values (and the cost) of any solution that selects to run all jobs of Ĩ on machine i (possibly with other jobs) is at least ϕ·Φ²/(2v_i).
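For intuition, we sketch the standard computation behind this bound (writing k ⪯ j when k∈Ĩ runs no later than j on machine i, so that C_j ≥ (1/v_i)·Σ_{k⪯j} a_k):

Σ_{j∈Ĩ} Γ_j ≥ ϕ·Σ_{j∈Ĩ} a_j·(C_j − a_j/(2v_i)) ≥ (ϕ/v_i)·Σ_{j∈Ĩ} a_j·(Σ_{k⪯j} a_k − a_j/2) = (ϕ/v_i)·Φ²/2,

where the last equality uses Σ_{j∈Ĩ} a_j·Σ_{k⪯j} a_k = (Φ² + Σ_{j∈Ĩ} a_j²)/2.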

Rounding.

Given the original instance A, where job j has size a_j and weight ω_j, and machine i has speed v_i, we create a (rounded) instance A′ as follows. Let 0 < δ ≤ 1/8 be an accuracy factor, which is a function of ε (where δ < ε), and such that 1/δ is an integer. The sets of jobs and machines in A′ are the same as in A. Let s_i = (1+δ)^{⌊log_{1+δ} v_i⌋} be the speed of machine i in instance A′. Let w_j = (1+δ)^{⌈log_{1+δ} ω_j⌉} and p_j = (1+δ)^{⌈log_{1+δ} a_j⌉} be the weight and size (respectively) of job j in instance A′. That is, we round up the weights and the sizes of jobs, while we round down the speeds of the machines. We will sometimes use Θ_j = log_{1+δ}(w_j/p_j), which is always an integer due to the rounding.
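A schematic Python transcription of this rounding step (the function name is ours):

import math

def round_instance(sizes, weights, speeds, delta):
    """Round an instance A to A' (a schematic sketch; names are ours).

    Sizes and weights are rounded up, speeds are rounded down, all to
    integer powers of 1+delta; floating-point logarithms are used here
    only for illustration.
    """
    up   = lambda x: (1 + delta) ** math.ceil(math.log(x, 1 + delta))
    down = lambda x: (1 + delta) ** math.floor(math.log(x, 1 + delta))
    p = [up(a) for a in sizes]       # p_j >= a_j
    w = [up(o) for o in weights]     # w_j >= omega_j
    s = [down(v) for v in speeds]    # s_i <= v_i
    return p, w, s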

Next, we bound the increase of the approximation ratio due to this step. Our goal is to conclude that we can safely consider A′ instead of A. Let 𝒪 and 𝒪′ denote optimal solutions for A and A′, respectively. Let SOL denote a given solution for both instances, where a solution consists of a partition of the set of jobs to the machines. The ordering for each machine may be different for the two inputs, since the natural ordering is used. Using the new notation, for a given schedule, the completion time of j is still denoted by C_j, and its weighted completion time is w_j·C_j.

Proposition 2.

We have SOL(A) ≤ SOL(A′) ≤ (1+δ)³·SOL(A) and 𝒪(A) ≤ 𝒪′(A′) ≤ (1+δ)³·𝒪(A). Furthermore, if a solution SOL satisfies SOL(A′) ≤ (1+kδ)·𝒪′(A′) for some k > 0, then it holds that SOL(A) ≤ (1+(2k+4)δ)·𝒪(A).

By the last proposition, we only consider the instance A′, and use the obtained schedule as an output for A. Thus, it suffices to present an EPTAS for the rounded instance A′.

On the use of Γ-values for A′.

Given the input A′, if j is executed on machine i, then Γ_j = w_j·(C_j − p_j/(2s_i)), where C_j − p_j/(2s_i) is the time when half of the job is completed. We say that for a given schedule, a block of jobs is a set of jobs of equal density that are assigned to one machine to run consecutively on that machine (there may be additional jobs of the same density, each assigned to run later or earlier than these jobs). The next well-known lemma shows that the sum of the Γ-values for a block of jobs is a function of their total size, their common density, and the starting time of the block, and it is independent of the other properties of these jobs (but it depends on the speed of the machine). We prove the lemma for completeness, similarly to the proof of Lemma 1.

Lemma 3.

Let Ī ⊆ A′ be a set of jobs, all having densities equal to some Δ > 0, and assume that these jobs are scheduled to run consecutively on machine i starting at time τ. Then Σ_{j∈Ī} Γ_j = Δ·(τ + (1/(2s_i))·Σ_{j∈Ī} p_j)·(Σ_{j∈Ī} p_j).
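For intuition, the identity follows as in the proof of Lemma 1: writing P = Σ_{j∈Ī} p_j and C_j = τ + (1/s_i)·Σ_{k⪯j} p_k (where k ⪯ j means that k∈Ī runs no later than j), we get

Σ_{j∈Ī} Γ_j = Δ·Σ_{j∈Ī} p_j·(τ + (1/s_i)·Σ_{k⪯j} p_k − p_j/(2s_i)) = Δ·(τ·P + P²/(2s_i)) = Δ·(τ + P/(2s_i))·P,

using Σ_{j∈Ī} p_j·(Σ_{k⪯j} p_k − p_j/2) = P²/2.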

Shifting.

Based on the value of δ we define a new parameter ξ for the shifting step. Its value is a constant (as a function of δ). Let ξ = ⌈log_{1+δ}(1/δ^ℓ)⌉ = ⌈ℓ·log_{1+δ}(1/δ)⌉ for a fixed integer 3 ≤ ℓ ≤ 5. Thus, 1/δ³ ≤ 1/δ^ℓ ≤ (1+δ)^ξ < (1+δ)/δ^ℓ < 2/δ^ℓ ≤ 1/δ^{ℓ+1} ≤ 1/δ⁶. Next, we use the integrality of 1/δ² and of ℓ. We have (1+δ)^{1/δ²} > 1 + δ·(1/δ²) > 1/δ, and hence ξ ≤ ℓ/δ² < 1/δ³. However, (9/8)^ξ ≥ (1+δ)^ξ ≥ 1/δ³ ≥ 8³, so ξ ≥ 53.

We form a partition of the set of integers based on the value of ξ, such that every subset has ξ consecutive integers. For an integer c, let Ω_c = {cξ+1, …, (c+1)ξ}. The next step will be to partition the collections of integers in a cyclic way. There will be 1/δ + 1 possible collections. Let 0 ≤ ζ ≤ 1/δ be an integer. The collection for the value of ζ will consist of the indices of sets for which the index modulo 1/δ + 1 is equal to ζ. We will be interested in densities that are powers of 1+δ and belong to a set Ω_c where (c mod (1/δ + 1)) = ζ, and the weights (and thus densities) of jobs with densities that belong to such a set will be increased by a constant multiplicative factor.

We define the instance A_ζ by modifying the weights of some jobs in A′. For a job j, if for some (not necessarily positive) integer v we have Θ_j ∈ Ω_{v(1/δ+1)+ζ} (recall that the density of j is an integer power of 1+δ and Θ_j = log_{1+δ}(w_j/p_j)), then w_j^ζ = w_j·(1+δ)^ξ, and otherwise w_j^ζ = w_j. We let Θ_j^ζ = log_{1+δ}(w_j^ζ/p_j), which means that in the first case Θ_j^ζ = Θ_j + ξ and in the second case Θ_j^ζ = Θ_j. In the first case, Θ_j^ζ ∈ Ω_{v(1/δ+1)+ζ+1} (where v(1/δ+1)+ζ+1 < (v+1)(1/δ+1)+ζ, so the density was moved to a different collection out of the 1/δ + 1 collections). Furthermore, weights (and densities) are increased by a multiplicative factor of at most (1+δ)^ξ < (1+δ)/δ^ℓ. For any ζ, let 𝒪_ζ be an optimal solution for A_ζ. As the set of jobs and machines is the same in A, A′, and A_ζ, the sets of feasible solutions for the three instances are the same.
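A schematic Python transcription of this weight modification (names and the encoding of jobs are ours):

import math

def shift_weights(jobs, delta, xi, zeta):
    """Schematic transcription of the weight modification (names are ours).

    jobs: list of pairs (p_j, w_j), both integer powers of 1+delta.
    A density exponent Theta_j lying in a collection Omega_c with
    c mod (1/delta + 1) == zeta is pushed up by a factor (1+delta)^xi.
    """
    period = int(1 / delta) + 1
    shifted = []
    for p, w in jobs:
        theta = round(math.log(w / p, 1 + delta))   # integer by the rounding step
        c = (theta - 1) // xi                       # theta is in Omega_c = {c*xi+1, ..., (c+1)*xi}
        if c % period == zeta:                      # a forbidden collection for this zeta
            w *= (1 + delta) ** xi
        shifted.append((p, w))
    return shifted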

Bounding the increase of the cost due to modifying A′ into A_ζ.

Next, we bound the increase of the cost due to the transformation from A′ to A_ζ. As a result of increasing the weights of jobs with densities in a fixed density collection, no job j of A_ζ has a density such that Θ_j^ζ = log_{1+δ}(w_j^ζ/p_j) ∈ Ω_{v(1/δ+1)+ζ} for any integer v. Any value (1+δ)^β where β ∈ Ω_{v(1/δ+1)+ζ} for an integer v is called a forbidden density for ζ, and other values (1+δ)^β for integer values of β are called allowed densities for ζ. The next lemma compares the costs of 𝒪′ (which was defined to be an optimal solution for A′) and 𝒪_ζ for specific values of ζ (for the corresponding input A_ζ). This will allow us to guess a suitable value of ζ (by enumeration) and approximate 𝒪_ζ rather than 𝒪′.

Lemma 4.

Given a solution SOL, any 0 ≤ ζ ≤ 1/δ satisfies SOL(A′) ≤ SOL(A_ζ), and there exists a value 0 ≤ ζ̄ ≤ 1/δ such that we have SOL(A_{ζ̄}) ≤ (1+2δ)·SOL(A′). Additionally, there exists a value 0 ≤ ζ ≤ 1/δ such that 𝒪′(A′) ≤ 𝒪_ζ(A_ζ) ≤ (1+2δ)·𝒪′(A′), and if a solution SOL1 satisfies SOL1(A_ζ) ≤ (1+kδ)·𝒪_ζ(A_ζ) for some k > 0, then SOL1(A′) ≤ (1+(2k+2)δ)·𝒪′(A′).

The exhaustive enumeration algorithm implementing the shifting step.

Below we present an algorithm that receives an input where the ratio between the maximum density of any job and the minimum density of any job is constant (as a function of δ), and outputs a solution of cost at most 1+δ times the cost of an optimal solution for this input. The ratio between densities will be at most (1+δ)^y, where y = ξ(1/δ+1) − ξ − 1 = ξ/δ − 1. The motivation for the value of y is that every density collection has ξ different densities, and since there are 1/δ + 1 options for ζ, there are (1/δ+1) − 1 consecutive collections that are not removed between two removed collections. Sequences of possible densities without gaps introduced by considering A_ζ for a fixed value of ζ will therefore contain at most ξ·((1/δ+1) − 1) = ξ/δ = y+1 densities which are integer powers of 1+δ, such that the largest one is larger than the smallest one by a multiplicative factor of at most (1+δ)^y.

We next use this algorithm as a black box. For every value of ζ, we apply the following process and create a schedule for the input A_ζ. Afterwards, we choose a solution of minimum cost among the 1/δ + 1 resulting solutions.

Let 0 ≤ ζ ≤ 1/δ. Decompose the allowed densities for ζ into collections of consecutive densities that are separated by intervals of forbidden densities for ζ. Since densities were rounded, two densities are consecutive if the larger one is larger by a factor of exactly 1+δ than the smaller one. By our construction, this results in subsets of allowed densities with very different densities, such that each subset has allowed densities in an interval of the form [(1+δ)^{−y}·ρ, ρ], where ρ = (1+δ)^{(v(1/δ+1)+ζ)·ξ} for some integer v (this is the last density just before the density collection Ω_{v(1/δ+1)+ζ}), y is as defined above (y = ξ/δ − 1), and additionally there is a gap between the allowed densities of one set and another set (because there are no jobs with densities in Ω_{v(1/δ+1)+ζ} in A_ζ). More precisely, if I and I′ are two such subsets and the allowed densities of I are smaller than those of I′, then the largest allowed density of I is smaller by a multiplicative factor of at least (1+δ)^{ξ+1} ≥ (1+δ)/δ^ℓ than the smallest allowed density of I′.

A sub-instance is defined to be a non-empty subset of the jobs corresponding to an interval of allowed densities, together with the complete set of machines. Let q denote the number of such (sub-)instances (where q ≤ n). Let the instances be denoted by I_1, …, I_q, such that the maximum allowed density in I_p is ρ_p, and for p > 1, ρ_p > ρ_{p−1}; in fact ρ_p ≥ (1+δ)^{ξ+1}·ρ_{p−1} ≥ ((1+δ)/δ^ℓ)·ρ_{p−1}.
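The resulting decomposition can be sketched as follows (a schematic Python fragment; names are ours):

import math

def split_into_subinstances(jobs, delta):
    """Group the jobs of A_zeta into maximal runs of consecutive density
    exponents (a schematic fragment; names are ours). Jobs are pairs
    (p_j, w_j); consecutive allowed densities differ by a factor of
    exactly 1+delta, so a jump of more than one exponent is a gap of
    forbidden densities and starts a new sub-instance.
    """
    by_theta = {}
    for p, w in jobs:
        theta = round(math.log(w / p, 1 + delta))
        by_theta.setdefault(theta, []).append((p, w))
    instances, current, prev = [], [], None
    for theta in sorted(by_theta):
        if prev is not None and theta > prev + 1:   # gap: close the current run
            instances.append(current)
            current = []
        current.extend(by_theta[theta])
        prev = theta
    if current:
        instances.append(current)
    return instances    # I_1, ..., I_q in order of increasing density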

Given a set of solutions SOLp, for 1pq (where SOLp is a solution for Ip), define a combined solution SOL, where the jobs assigned to machine i in SOL are all jobs assigned to this machine in all the solutions. Obviously, in SOL the jobs are scheduled sorted by non-increasing indices of their sets Ip.

Bounding the cost of the combined solution.

We will prove that the effect of concatenating solutions is not harmful even though earlier solutions delay the processing of later solutions. This will hold due to the gap between densities of jobs of different instances.

Lemma 5.

We have OPT(A_ζ) ≥ Σ_{p=1}^{q} OPT(I_p). Furthermore, if for every 1 ≤ p ≤ q it holds that SOL_p(I_p) ≤ (1+ν)·OPT(I_p) for a fixed value ν > 0, then the combined solution SOL satisfies SOL(A_ζ) ≤ (1+ν)(1+8δ)·OPT(A_ζ).

In what follows, we design a (1+δ)-approximation algorithm for the bounded ratio problem, which gives a (1+δ)(1+8δ)-approximation algorithm for A_ζ. Since (1+δ)(1+8δ) < 1+10δ, for an appropriate choice of ζ, we get a (1+22δ)-approximation algorithm for A′, and a (1+48δ)-approximation algorithm for A. Thus, letting δ = ε/48 will give a (1+ε)-approximation algorithm for the general problem. The algorithm for the bounded ratio problem is applied at most n times for each choice of ζ, i.e., at most n/δ⁶ times in total. The reason is that we split every input A_ζ into parts with similar densities, and the number of parts cannot exceed the number of jobs.

An EPTAS for the bounded ratio problem.

Let I be a bounded instance such that all densities are in [1, ρ], where ρ = ρ(δ) = (1+δ)^y and y is a function of δ (y and ξ are as defined in the previous section). We will use not only the property that the number of distinct densities is a constant, but also the constant maximum ratio between densities. By scaling (and possibly increasing the interval of densities), any valid input of the bounded ratio problem can be transformed into this form. We have the following properties. First, we have y = ξ/δ − 1 ≥ 8ξ − 1 > 400 (since δ ≤ 1/8 and ξ ≥ 53). Second, y+2 < y+ξ+1 = ξ/δ + ξ ≤ 6ξ/δ ≤ 1/δ⁹ (since ξ < 1/δ³ and δ ≤ 1/8), and due to integrality, y+3 ≤ 1/δ⁹. Third, we also have (using the properties of ξ) that (1+δ)^y ≥ (1+δ)^{ξ/δ}/(1+δ) ≥ (1/δ^ℓ)^{1/δ}·δ³ ≥ (1/δ³)^{1/δ}·δ³ > 1/δ^{15}, and (1+δ)^y ≤ (1+δ)^{ξ(1/δ+1)} ≤ (1/δ^{ℓ+1})^{1/δ+1} ≤ (1/δ)^{6/δ+6} ≤ (1/δ)^{1/δ⁷−12}, and last, we conclude that y+1 ≤ δ²·(1+δ)^y, since y+1 ≤ 1/δ⁹ while (1+δ)^y ≥ 1/δ^{15}.

Since the density of every job is in [1, (1+δ)^y], every job j∈I has p_j = (1+δ)^k and w_j = (1+δ)^{k′} for some integers k, k′ such that 0 ≤ k′−k ≤ y. Let γ = δ^{12}/(1+δ)^y. We have γ ≥ δ^{1/δ⁷} and γ ≤ δ^{15+12} = δ^{27}. Let ℐ denote the set of values i such that there is at least one job of size (1+δ)^i. Clearly, |ℐ| ≤ n. Let n_{r,i} denote the number of jobs of size (1+δ)^i and density (1+δ)^r. By scaling the machine speeds (that are integer powers of 1+δ) and possibly reordering the machines, let the speeds of machines be s_1 ≥ s_2 ≥ … ≥ s_m, where without loss of generality s_1 = 1, and for i ≥ 2, s_i = (1+δ)^{k_i} for some non-positive integer k_i ≤ 0. It is possible that there is more than one machine of speed 1, and we consider all such machines in the next paragraph.

First guessing step and its analysis.

Recall that the work of a machine is the total size of jobs assigned to it. We would like to assume that job sizes are scaled such that the work of the most loaded machine (the most loaded machine in terms of work, without taking speeds into account) of speed 1 is in [1/(1+δ), 1) (while since we have jobs of varying densities, the work of slower machines may be arbitrary). We will now show that this is possible. For a job j and 1 ≤ b ≤ n/δ, let D_{j,b} be the interval [(1+δ)^{b−1}·p_j, (1+δ)^b·p_j) (we have that (1+δ)^{n/δ} ≥ n+1 holds by a direct calculation, since n/δ is integral).

Without loss of generality we consider solutions where at least one machine of speed 1 has at least one job. This holds since otherwise it is possible to move an arbitrary job to such a machine and get a solution of a smaller cost.

Lemma 6.

For any solution, there exist a job 1 ≤ j ≤ n and an integer 1 ≤ b ≤ n/δ, such that the work of the most loaded machine among the machines of speed 1 is in D_{j,b}.

For every choice of a pair j, b (n²/δ choices in total), we scale (i.e., divide) the job sizes by (1+δ)^b·p_j (which is an integer power of 1+δ), and apply the algorithm described later in this section (this algorithm enforces the existence of a fastest machine with a suitable load, where it is possible that there is no such solution for some choices). By the claim, given an optimal solution OPT, for at least one choice of j, b the assumption regarding the work of the most loaded machine of speed 1 (that the work is at least 1/(1+δ) and below 1) holds as a result of the scaling. We will pick the best solution, whose objective function value cannot be larger than that of the solution obtained for the correct choice of j, b. In what follows, we analyze the properties of the correct choice of j, b for a given optimal solution OPT after the scaling. Moreover, the job sizes are according to the scaling.

Machine configurations – preliminaries.

Let U be a threshold on job sizes separating jobs seen as relatively large and jobs seen as relatively small. Let Î be the subset of jobs assigned to machine i in some solution. The U-cost of machine i in this solution is defined as the total weighted completion time of the jobs whose sizes are at least U, plus the Γ-values of the jobs whose sizes are below U. The cost for machine i is therefore its U-cost plus (1/(2s_i))·Σ_{j∈Î : p_j<U} w_j·p_j. Obviously, the U-cost never exceeds the cost, for any value of U. Additionally, similarly to the cost, the U-cost for a fixed value of U is monotone in the sense that removing a job from the machine decreases its U-cost, and thus, if we consider a block of jobs of a common density, each of which is shorter than U, decreasing the total size of jobs in the block also decreases the U-cost.

We use the following two functions of δ: f(δ) and g(δ), both integral, non-decreasing, and negative. A machine i is called slow if its speed is at most (1+δ)^{f(δ)} (and otherwise we say that it is not slow, or that it is fast). We say that a machine is lightly loaded if its work does not exceed (1+δ)^{g(δ)} (and otherwise it is heavily loaded). Let

f(δ) = 2g(δ)/δ²  and  g(δ) = −(1/δ)^{1/δ^{300}}.

Consider a fixed optimal solution OPT (for the input considered in this section which consists of a subset of jobs, and given the scaling of machine speeds and job sizes).

Lemma 7.

No machine has a load strictly above 2/δ in OPT, and this is an upper bound on the work of any machine as well. Furthermore, in OPT every slow machine is lightly loaded.

Formal definition of machine configuration.

Next, we define machine configurations. A configuration is a vector that defines most properties of the schedule of one machine. This consists of a set of jobs assigned to it in terms of the types of jobs, that is, a job is specified by its size and weight but not by its identity, and jobs that are relatively small are defined by their approximate total sizes and densities. The set of all configurations will be denoted by 𝒞. For a configuration C ∈ 𝒞, the first component is an integer j1(C), such that the total work of the machine is in ((1+δ)^{j1(C)−2}, (1+δ)^{j1(C)}]. The second component is a non-positive integer j2(C), such that the speed of the machine is (1+δ)^{j2(C)}. The third component is an integer j3(C), such that j3(C) = j1(C) − j2(C), and therefore the completion time (or load) of the machine is in ((1+δ)^{j3(C)−2}, (1+δ)^{j3(C)}].

Recall that γ = δ^{12}/(1+δ)^y. For integers r, i such that 0 ≤ r ≤ y, i ≤ j1(C), and γ(1+δ)^{j1(C)−1} ≤ (1+δ)^i ≤ (1+δ)^{j1(C)} (i.e., j1(C)−1+log_{1+δ}γ ≤ i ≤ j1(C)), there is an integer component n_{r,i}(C) ≥ 0 stating how many jobs of size (1+δ)^i and density (1+δ)^r are assigned to this machine. These jobs are called large (large for configuration C, or large for a machine that has configuration C). We let n_{r,i}(C) = 0 for other (smaller or larger) values of i (these are constants that are not a part of the configuration). There are additional components for other jobs assigned to a machine whose configuration is C. These jobs (which are not taken into account in the components of the form n_{r,i}(C)) must have smaller values of i, as jobs with larger values of i (jobs of sizes above (1+δ)^{j1(C)}) cannot be assigned to a machine that has configuration C. This is so since they are too large, given the upper bound on the work of the machine. These remaining jobs are called small jobs for C, or small jobs for a machine whose schedule is according to configuration C. For every r, there is an integer component t_r(C) ≥ 0 such that the total size of the small jobs (of sizes in (0, γ(1+δ)^{j1(C)−1})) of density (1+δ)^r assigned to a machine with configuration C is in ((t_r(C)−1)·γ(1+δ)^{j1(C)−1}, t_r(C)·γ(1+δ)^{j1(C)−1}]. We have t_r(C) = 0 if and only if there are no such jobs for a machine with this configuration.

Valid configurations.

Recall that ℐ is the set of indices i such that there is at least one job of size (1+δ)^i. A configuration C is valid if the total size of its jobs is sufficiently close to its required work, and its load does not exceed 2/δ. Formally, letting ℒ(C) = {i ∈ ℐ : j1(C)−1+log_{1+δ}γ ≤ i ≤ j1(C)}, it is required that Σ_{0≤r≤y, i∈ℒ(C)} (1+δ)^i·n_{r,i}(C) + Σ_{r=0}^{y} t_r(C)·γ(1+δ)^{j1(C)−1} ∈ ((1+δ)^{j1(C)−2}, (1+δ)^{j1(C)+1}]. The reason for the increased exponent of the right endpoint is that for every density the total size of small jobs may be smaller by an additive term of γ(1+δ)^{j1(C)−1} compared to its upper bound, and since (y+1)γ < δ, it suffices to increase the right endpoint by a multiplicative factor of 1+δ. We also let ℐ(C) = {i ∈ ℐ : i ≤ j1(C)−2+log_{1+δ}γ}, and we require (1+δ)^{j3(C)−2} < 2/δ. In the remainder of this section, given C, we will use the threshold U(C) = γ(1+δ)^{j1(C)−1} for computing the U-cost of a machine with configuration C (this is the value U of the configuration C).
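A Python sketch of these validity conditions, for a configuration encoded as a small dictionary (the encoding and names are ours):

def is_valid_configuration(config, delta, gamma):
    """Check the validity conditions defined above for a configuration
    (a schematic sketch; the dictionary encoding is ours).

    config: {"j1": int, "j2": int, "j3": int,
             "n": {(r, i): count of large jobs},   # i in L(C)
             "t": {r: t_r(C)}}
    """
    j1, j2, j3 = config["j1"], config["j2"], config["j3"]
    if j3 != j1 - j2 or (1 + delta) ** (j3 - 2) >= 2 / delta:
        return False                                # load requirement violated
    unit = gamma * (1 + delta) ** (j1 - 1)          # U(C): small/large threshold
    total = sum((1 + delta) ** i * cnt for (r, i), cnt in config["n"].items())
    total += sum(t * unit for t in config["t"].values())
    return (1 + delta) ** (j1 - 2) < total <= (1 + delta) ** (j1 + 1)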

Bounding the number of configurations.

The number of components is therefore at most 3 + (y+1)·(log_{1+δ}(1/γ) + 2) + (y+1) ≤ 3 + (y+1)·(log_{1+δ}(1/γ) + 3). Since log_{1+δ}(1/γ) = y + log_{1+δ}(1/δ^{12}) ≤ y + 1/δ^{13}, and y+1 ≤ 1/δ⁹, we have 3 + (y+1)·(log_{1+δ}(1/γ) + 3) ≤ 3 + (1/δ⁹)·(1/δ⁹ + 1/δ^{13} + 3) < 1/δ^{23}. We will use the following bounds.

Lemma 8.

For every pair of values j1 and j2, there are at most (1/δ)^{1/δ^{33}} configurations C with j1(C) = j1 and j2(C) = j2.

Lemma 9.

The number of options for j1, j2, j3 (for which there are valid configurations) is O(n²m/δ).

Let J = {σ_1, σ_2, …, σ_κ} be the set of all κ different speeds of machines (where 1 ≤ κ ≤ m), such that 1 = σ_1 > σ_2 > … > σ_κ, and let N_i ≥ 1 be the number of machines of speed σ_i.

An auxiliary mathematical program.

Consider the following mathematical program 𝒫. The goal of 𝒫 is to determine a partial schedule via machine configurations. For every configuration C, X_C is a decision variable stating how many machines have this configuration. Obviously, the number of used configurations whose second component corresponds to σ_i cannot exceed N_i (it is not required to use all machines, but it is required to assign all jobs). For every triple of density, size, and configuration (density (1+δ)^r, size (1+δ)^i, and configuration C), there is a decision variable Y_{r,i,C} corresponding to the number of jobs of this size and density that are assigned to machines whose configuration is C as small jobs for this configuration. The variable Y_{r,i,C} may be positive only if a job of size (1+δ)^i is small for configuration C, i.e., i ≤ j1(C)−2+log_{1+δ}γ, and in all other cases we set Y_{r,i,C} = 0 to be a constant rather than a variable. Each such variable is used for all copies of C together, since the jobs assigned as small jobs within the copies of the configuration may be of different sizes. Based on these decision variables, we define a set of constraints and solve the resulting mixed-integer linear program 𝒫.

A configuration C has a cost denoted by cost(C) associated with it, which is the U-cost for the threshold U(C) = γ(1+δ)^{j1(C)−1}. Recall that for the purpose of calculating the U-cost of a machine, a list of its large jobs is needed. However, for its jobs of size below U (i.e., small jobs), the exact list of such small jobs is not needed. The only needed property of the subset of small jobs (for configuration C) of each one of the densities is their total size. This allows us to assign very different jobs that are small to machines with the same configuration, as long as these jobs are indeed small for the configuration and have the same density.

For large jobs, the exact identities of jobs are not needed as well, but a list of densities and sizes is needed. Thus, the U-cost for configuration C is calculated assuming that the machine has exactly n_{r,i}(C) jobs of size (1+δ)^i and weight (1+δ)^{i+r} (i.e., density (1+δ)^r) assigned to it, and that the total size of small jobs (of sizes in (0, γ(1+δ)^{j1(C)−1})) with densities equal to (1+δ)^r is exactly t_r(C)·γ(1+δ)^{j1(C)−1}. The objective of 𝒫 is to minimize the sum of the costs of configurations plus the missing parts of the costs of jobs that are assigned as small. For that, for a job of size (1+δ)^i and density (1+δ)^r that is assigned to a machine of speed s as a small job for its configuration, an additive term of (1+δ)^{r+2i}/(2s) is incurred (this is the difference between the actual cost of this job and its Γ-value, which is the part of the cost already included in the U-cost). This last term depends only on the speed of the machine that runs the job rather than on the specific machine. Thus, for each r and i we add

(1+δ)^{r+2i} · Σ_{C∈𝒞} Y_{r,i,C}/(2(1+δ)^{j2(C)})

to the cost of the configurations to get the total cost of the schedule, where Σ_{C∈𝒞} Y_{r,i,C} is the number of jobs of size (1+δ)^i and density (1+δ)^r that are assigned as small jobs for their configurations.

min Σ_{C∈𝒞} cost(C)·X_C + Σ_{r=0}^{y} Σ_{i∈ℐ} (1+δ)^{r+2i} · Σ_{C∈𝒞} Y_{r,i,C}/(2(1+δ)^{j2(C)})   s.t.

Σ_{C∈𝒞 : (1+δ)^{j2(C)} = σ_i} X_C ≤ N_i    ∀ σ_i ∈ J    (1)

Σ_{C∈𝒞} n_{r,i}(C)·X_C + Σ_{C∈𝒞} Y_{r,i,C} = n_{r,i}    ∀ 0 ≤ r ≤ y, i ∈ ℐ    (2)

Σ_{i∈ℐ(C)} (1+δ)^i·Y_{r,i,C} − (t_r(C)+1)·γ(1+δ)^{j1(C)−1}·X_C ≤ 0    ∀ 0 ≤ r ≤ y, C ∈ 𝒞    (3)

Σ_{C∈𝒞 : j1(C)∈{−1,0,1}, j2(C)=0} X_C ≥ 1    (4)

Y_{r,i,C} ≥ 0    ∀ C ∈ 𝒞, 0 ≤ r ≤ y, i ∈ ℐ(C)    (5)

X_C ≥ 0    ∀ C ∈ 𝒞    (6)
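Purely to illustrate the structure of 𝒫 (our scheme itself solves 𝒫 via mixed-integer programming in fixed dimension, as discussed below, and not via a generic solver), the following Python sketch builds a toy version of the program with the open-source PuLP package; the two configurations, the collapsed capacity value standing in for (t_r(C)+1)·γ(1+δ)^{j1(C)−1}, and all names are hypothetical:

import pulp

delta = 0.125
configs = {
    "C_heavy": {"cost": 1.0, "j2": 0, "heavy": True,  "n": {(0, 0): 1}, "cap": 0.5},
    "C_light": {"cost": 0.3, "j2": 0, "heavy": False, "n": {(0, 0): 0}, "cap": 0.5},
}
N = {0: 3}             # machines per speed exponent j2
n_jobs = {(0, 0): 3}   # n_{r,i}

prob = pulp.LpProblem("configuration_MILP", pulp.LpMinimize)
X = {c: pulp.LpVariable("X_" + c, lowBound=0,
                        cat="Integer" if d["heavy"] else "Continuous")
     for c, d in configs.items()}
Y = {(r, i, c): pulp.LpVariable("Y_%d_%d_%s" % (r, i, c), lowBound=0)
     for (r, i) in n_jobs for c in configs}

prob += (pulp.lpSum(configs[c]["cost"] * X[c] for c in configs)
         + pulp.lpSum((1 + delta) ** (r + 2 * i) * Y[r, i, c]
                      / (2 * (1 + delta) ** configs[c]["j2"])
                      for (r, i, c) in Y))                          # objective
for j2, count in N.items():                                        # (1)
    prob += pulp.lpSum(X[c] for c in configs if configs[c]["j2"] == j2) <= count
for (r, i), count in n_jobs.items():                               # (2)
    prob += (pulp.lpSum(configs[c]["n"].get((r, i), 0) * X[c] for c in configs)
             + pulp.lpSum(Y[r, i, c] for c in configs)) == count
for c in configs:                                                  # (3), one density
    prob += pulp.lpSum((1 + delta) ** i * Y[r, i, c]
                       for (r, i) in n_jobs) <= configs[c]["cap"] * X[c]
prob += pulp.lpSum(X.values()) >= 1                                # (4), simplified

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], {v.name: v.value() for v in prob.variables()})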

Motivation for the constraints.

Later, we will use the last mathematical program to direct our algorithm. Before we continue, we explain the intuition behind its constraints. We stress that the formal proof does not use this intuition, so it is given only to assist the reader. Condition (1) ensures that the number of used machines of each speed does not exceed the existing number of such machines. Condition (2) states that every job is assigned (either as a large job or as a small job). Condition (3) considers the jobs of density (1+δ)^r that are assigned as small to machines scheduled according to configuration C, and verifies that sufficient space is allocated for them if the space for them is slightly extended (this extended space may be used, as we show later). Condition (4) ensures that indeed there is a machine of the maximum speed that has a work that is close to 1, that is, above 1/(1+δ)³ and at most 1+δ (the condition that the work is in [1/(1+δ), 1) is slightly relaxed).

Observe that we distinguish between large and small jobs for a machine based on the configuration and not solely by its speed. This feature is needed since our problem does not allow the use of the dual-approximation method (because the objective is not based on machine loads).

Formal definition of the subset of integer variables.

We define the set of heavy configurations 𝒞_H as 𝒞_H = {C ∈ 𝒞 : j1(C) > g(δ)} (note that the definition of a heavy configuration or a heavily loaded machine depends only on the work, and it is independent of the speed), and the complement set of light configurations is 𝒞_L = 𝒞 ∖ 𝒞_H. By Lemma 7, it is sufficient to consider light configurations for slow machines. Therefore, we delete heavy configurations for slow machines and do not consider them (which is justified later in the comparison between the cost of an optimal schedule and the optimal solution of 𝒫).

We see 𝒫 as a mixed-integer linear program and thus we define next the set of variables that are forced to be integral and the set of variables that are allowed to be fractional. All variables Yr,i,C may be fractional. The variables of (light) configurations corresponding to slow machines and variables of light configurations of fast machines may be fractional, whereas the variables of heavy configurations corresponding to fast machines must be integral. The number of variables is polynomial, and the number of integral variables will be constant (a function of δ).

Recall that for every pair of speed and work (i.e., a pair j1, j2), the number of different valid configurations is constant (as a function of δ). The number of different speeds of fast machines is at most |f(δ)| = −f(δ). The number of different values of j1(C) such that C is a heavy configuration is at most −g(δ) + log_{1+δ}(2/δ) ≤ −(2/δ²)·g(δ) = −f(δ), and j2(C) of a heavy configuration C must be a fast speed (so there are at most −f(δ) values for it). As the number of integral variables is constant (as a function of δ), an optimal solution can be found in polynomial time [30, 26], that is, in a time which is a function of δ multiplied by a polynomial in the input encoding length.

Presenting a feasible solution to 𝓟 whose cost is at most the cost of an optimal schedule.

We will first compare the cost of the optimal schedule OPT to the cost of an optimal solution of 𝒫. Later, we will show how to obtain an actual schedule given a solution of 𝒫, such that the cost of the schedule is larger by a factor of at most 1+δ than the objective function value of the solution to 𝒫. Let (X, Y) denote an optimal solution to the mixed-integer linear program, and let Z be its objective value. Let Z_{OPT} denote the cost of OPT.

Theorem 10.

If (X, Y) is an optimal solution of 𝒫, then Z ≤ Z_{OPT}.

Using the solution (𝑿,𝒀) for finding an approximate schedule.

Our last rounding steps are designed in order to use (X, Y) for getting an approximate schedule for the bounded ratio problem. These steps are the constructive methods for establishing our scheme.

We start with constructing an alternative set of values of the variables, and bound the resulting cost from above. We will later convert this alternative solution (that may violate some of the constraints of 𝒫) into a schedule. For every C ∈ 𝒞, let X′_C = ⌊X_C⌋. If X_C = 0, then X′_C = 0; in this case Y_{r,i,C} = 0 must hold for all r, i by (3), and we set Y′_{r,i,C} = 0. If X_C > 0, we consider two cases. If C ∈ 𝒞_H, then the variables of configurations are integral, and therefore X′_C = X_C; in this case we set Y′_{r,i,C} = ⌈Y_{r,i,C}⌉ for any 0 ≤ r ≤ y, i ∈ ℐ(C). Otherwise, that is, in the case C ∈ 𝒞_L, let Y′_{r,i,C} = (X′_C/X_C)·Y_{r,i,C}. Using these definitions, in the case Y_{r,i,C} > 0, if C ∈ 𝒞_H, then we have Y′_{r,i,C} ≤ Y_{r,i,C} + 1, and otherwise we have Y′_{r,i,C} ≤ Y_{r,i,C}. Moreover, if C ∈ 𝒞_L and X′_C > 0, then we have Y′_{r,i,C} = (X′_C/X_C)·Y_{r,i,C} > ((X_C−1)/X_C)·Y_{r,i,C} = Y_{r,i,C} − Y_{r,i,C}/X_C, by X_C < X′_C + 1 (which holds even if X′_C = 0).
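A compact Python sketch of this rounding of the variables (schematic; names are ours, and the rounding up of the Y-variables of heavy configurations matches the bounds used below):

import math

def round_variables(X, Y, heavy):
    """Round an optimal (X, Y) to (X', Y') as described above (schematic;
    names are ours). X: {C: value}; Y: {(r, i, C): value}; heavy: the set
    of heavy configurations, whose X-variables are integral in the MILP.
    """
    Xp = {C: math.floor(x) for C, x in X.items()}
    Yp = {}
    for (r, i, C), y in Y.items():
        if X[C] == 0:
            Yp[r, i, C] = 0.0                  # forced by constraint (3)
        elif C in heavy:
            Yp[r, i, C] = math.ceil(y)         # X'_C = X_C; round Y up
        else:
            Yp[r, i, C] = (Xp[C] / X[C]) * y   # scale Y down with X
    return Xp, Yp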

Bounding the cost of (𝑿,𝒀).

We will define a schedule and find an upper bound on its objective function value. Before that, we analyze the first increase in the cost, which is due to the transformation from (X, Y) to (X′, Y′), and show that it is by an additive term of at most (δ/100)·Z. The solution (X′, Y′) that we consider is not a feasible solution for 𝒫, as (for example) constraint (2) does not necessarily hold, since it is possible that Σ_{C∈𝒞} n_{r,i}(C)·X′_C + Σ_{C∈𝒞} Y′_{r,i,C} < n_{r,i}; but we can still bound its objective function value using the objective function value of the optimal (and feasible) solution.

Since X′_C ≤ X_C for C ∈ 𝒞, and Y′_{r,i,C} ≤ Y_{r,i,C} for all C ∈ 𝒞_L, 0 ≤ r ≤ y, and i ∈ ℐ(C), the objective function value for these variables is at most Z plus

Σ_{r=0}^{y} Σ_{i∈ℐ} (1+δ)^{r+2i} · Σ_{C∈𝒞_H : i∈ℐ(C)} (Y′_{r,i,C} − Y_{r,i,C})/(2(1+δ)^{j2(C)})

≤ Σ_{r=0}^{y} Σ_{C∈𝒞_H} Σ_{i∈ℐ(C)} (1+δ)^{r+2i} · X_C/(2(1+δ)^{j2(C)}),

since for C ∈ 𝒞_H it holds that Y′_{r,i,C} = 0 if X_C = 0, and otherwise Y′_{r,i,C} − Y_{r,i,C} ≤ 1 ≤ X_C. Next, we show that

Σ_{r=0}^{y} Σ_{i∈ℐ(C)} (1+δ)^{r+2i}/(2(1+δ)^{j2(C)}) ≤ (δ/100)·cost(C)

for any C ∈ 𝒞_H, and thus we will conclude that the objective function value of the set of variables X′_C, Y′_{r,i,C} is at most (1 + δ/100)·Z. We have cost(C) ≥ (1+δ)^{2j1(C)−4}/(2(1+δ)^{j2(C)}) (by Lemma 1, since cost(C) is computed for jobs of total size above (1+δ)^{j1(C)−2}, and the job densities are no smaller than 1). Let i* = max ℐ(C). We have

(1+δ)^{2i*} ≤ (γ(1+δ)^{j1(C)−1})²

and

Σ_{i≤i*} (1+δ)^{2i} < (1+δ)^{2i*+1}/δ.

Additionally, Σ_{r=0}^{y} (1+δ)^r < (1+δ)^{y+1}/δ. For a given C ∈ 𝒞_H, we get

Σ_{r=0}^{y} Σ_{i∈ℐ(C)} (1+δ)^{r+2i}/(2(1+δ)^{j2(C)}) ≤ (1+δ)^{y+2i*+2}/(2δ²(1+δ)^{j2(C)}) ≤ (1+δ)^{y+2}·(γ(1+δ)^{j1(C)−1})²/(2δ²(1+δ)^{j2(C)})

= (1+δ)^{y+2}·(δ^{24}/(1+δ)^{2y})·(1+δ)^{2j1(C)−2}/(2δ²(1+δ)^{j2(C)}) ≤ cost(C)·δ^{22}·(1+δ)^{4−y} ≤ (δ/100)·cost(C)

(since (1+δ)^y ≥ 1/δ^{15}). Thus the claim regarding the cost of (X′, Y′) holds.

Defining a schedule of most jobs while leaving a small subset of unassigned jobs.

We will define a schedule that is based on the rounded variables. The schedule will require additional space for jobs that are small for their configurations (in some configurations), both due to the rounding of variables and due to the actual assignment of jobs to machines, which does not exist in 𝒫 (for such jobs). Due to the rounding of the numbers of copies of configurations, some jobs that are large for their configurations will also remain unassigned. Such jobs will obviously have to be assigned, thus increasing the cost. For the resulting schedule, the objective function value of 𝒫 may no longer be correct, though we show that the true objective is not much larger.

Next, we find additional upper bounds that are based on the rounded variables (only those that were rounded up). We prove upper bounds on the space used for jobs that are small for their configurations where a possible increase is due to the rounding of variables only. Additional space to be used for such jobs will be defined and analyzed later, and here we only analyze the space that was already defined.

Using condition (3), for every C ∈ 𝒞_L such that X_C > 0 and 0 ≤ r ≤ y, we have

Σ_{i∈ℐ(C)} (1+δ)^i·Y′_{r,i,C} = Σ_{i∈ℐ(C)} (1+δ)^i·(X′_C/X_C)·Y_{r,i,C}

≤ (X′_C/X_C)·(t_r(C)+1)·γ(1+δ)^{j1(C)−1}·X_C = (t_r(C)+1)·γ(1+δ)^{j1(C)−1}·X′_C.

Similarly, for every C ∈ 𝒞_H such that X_C > 0 and 0 ≤ r ≤ y, we have

Σ_{i∈ℐ(C)} (1+δ)^i·Y′_{r,i,C} ≤ Σ_{i∈ℐ(C)} (1+δ)^i·(Y_{r,i,C} + 1)

≤ (t_r(C)+1)·γ(1+δ)^{j1(C)−1}·X_C + Σ_{i∈ℐ(C)} (1+δ)^i

≤ (t_r(C)+2+2/δ)·γ(1+δ)^{j1(C)−1}·X′_C

as X′_C = X_C ≥ 1, and

Σ_{i∈ℐ(C)} (1+δ)^i ≤ γ(1+δ)^{j1(C)−1}·Σ_{i=0}^{∞} (1+δ)^{−i} = γ(1+δ)^{j1(C)−1}·(1+δ)/δ.

If X_C = 0, then the corresponding upper bound (the first one if C ∈ 𝒞_L, and the second one if C ∈ 𝒞_H) holds trivially, as all the variables are equal to zero.

We let n′_{r,i} = n_{r,i} − Σ_{C∈𝒞} n_{r,i}(C)·X′_C − Σ_{C∈𝒞} Y′_{r,i,C}. If n′_{r,i} > 0, then we say that n′_{r,i} is the number of unassigned jobs of size (1+δ)^i and density (1+δ)^r. These jobs will be called unassigned jobs, as they will remain unassigned if we assign jobs exactly using configurations that are based on the modified set of variables (including the assignment to configurations of jobs that are small for their configurations). We explain this assignment, which is called the first assignment step, before we proceed.

The first assignment step will be to assign jobs according to the configurations for which X′_C > 0, such that there will be X′_C machines whose sets of large jobs will be as required (based on the definition of C). We will then assign Y′_{r,i,C} jobs of size (1+δ)^i and density (1+δ)^r to the machines whose configuration is C (these are small jobs for C). In order to ensure that all jobs can be scheduled, for each machine that is assigned the configuration C, in addition to the total size of at most t_r(C)·γ(1+δ)^{j1(C)−1} of jobs of density (1+δ)^r that are small jobs for configuration C that the machine can contain, such jobs of total size at most (3/δ)·γ(1+δ)^{j1(C)−1} are allocated to this machine. These jobs are called additional jobs, and we will calculate the increase in the total cost as a result. The unassigned jobs will all be assigned to a machine of speed 1 whose configuration C has j1(C) = 0 or j1(C) = 1. We will compute an upper bound on the total size of these unassigned jobs, which will allow us to find an upper bound on the increase in the cost.

Consider the machines whose configuration is C. For every 0 ≤ r ≤ y, create X′_C bins of size t_r(C)·γ(1+δ)^{j1(C)−1} each (these bins are called bins of the first kind), and X′_C bins of size (3/δ)·γ(1+δ)^{j1(C)−1} (these bins are called bins of the second kind); if t_r(C) = 0, then we introduce only the second kind of bins. The additional jobs are those that are packed into bins of the second kind. We define the allocation of jobs to machines by packing them as items into these bins. If the number of items of a certain type assigned using this packing process exceeds the existing number of jobs, then some of the spaces allocated in this process for items will not receive jobs. For every i ∈ ℐ(C), pack Y′_{r,i,C} items of size (1+δ)^i into these bins using First Fit.

We show that all items are packed. Assume by contradiction that this is not the case. Since the size of each item is below $\gamma(1+\delta)^{j_1(C)-1}$ and there is an item that cannot be packed, each bin of the first kind (if such a bin exists) is occupied by at least $(t_r(C)-1)\gamma(1+\delta)^{j_1(C)-1}$, and each bin of the second kind is occupied by at least $\left(\frac{3}{\delta}-1\right)\gamma(1+\delta)^{j_1(C)-1}$. We find that the total size of items of density $(1+\delta)^r$ that are to be packed into these bins as small items, which is equal to $\sum_{i\in I(C)}(1+\delta)^i Y'_{r,i,C}$, is above $\left(t_r(C)+\frac{3}{\delta}-2\right)\gamma(1+\delta)^{j_1(C)-1}X'_C$, contradicting the upper bound that we proved on $\sum_{i\in I(C)}(1+\delta)^i Y'_{r,i,C}$, since $\frac{3}{\delta}-2 > 2+\frac{2}{\delta}$ (as $\delta \le \frac{1}{8}$).
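The packing step itself is elementary; the following is a minimal sketch of it (our own illustration with arbitrary numeric parameters, not the authors' code), creating the two kinds of bins for a single configuration and density class and packing the items by First Fit.

def first_fit(item_sizes, bin_capacities):
    """Pack items into bins of the given capacities by First Fit; return the
    bin loads, or None if some item fits nowhere (the counting argument in
    the text shows that this cannot happen here)."""
    loads = [0.0] * len(bin_capacities)
    for size in item_sizes:
        for b, capacity in enumerate(bin_capacities):
            if loads[b] + size <= capacity:
                loads[b] += size
                break
        else:
            return None  # no bin had room for this item
    return loads

# Illustrative parameters only (not values from the paper):
delta, gamma, j1, t_r, machines_with_C = 0.125, 0.01, 1, 4, 2
unit = gamma * (1 + delta) ** (j1 - 1)  # the quantity gamma * (1+delta)^(j1(C)-1)
capacities = [t_r * unit] * machines_with_C  # bins of the first kind
capacities += [(3 / delta) * unit] * machines_with_C  # bins of the second kind
items = [0.9 * unit] * 10  # every item is smaller than one unit
assert first_fit(items, capacities) is not None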

Bounding the increase in the cost of a machine due to the first assignment step.

The increase in the cost of each machine (since the total size of small jobs of density $(1+\delta)^r$, for each $0 \le r \le y$, may become larger) can be upper bounded as follows. A possible schedule, i.e., a permutation of the job set of the machine, is obtained by assigning the items of the bins of the first kind as a block, for each value of $r$, of size at most $t_r(C)\gamma(1+\delta)^{j_1(C)-1}$, and the jobs of the bins of the second kind as a block of the last jobs assigned to the machine (this is possibly not an optimal ordering). For a job $j$ that is small for its configuration, the difference between its cost and $\Gamma_j$ is already included in the objective function value (since it is assigned as a small job), and thus we compute the total $\Gamma$-values of the added blocks. Instead of considering these blocks (for different values of $r$) separately, we view them as one block, assuming that all densities are equal to $(1+\delta)^y$ (this assumption cannot reduce the cost). We assume without loss of generality that $j_2(C)=0$. Let

$W=\sum_{0\le r\le y,\, i\in I(C)}(1+\delta)^i n_{r,i}(C)+\sum_{r=0}^{y}t_r(C)\left(\gamma(1+\delta)^{j_1(C)-1}\right)$

be the work that was used for calculating $\mathrm{cost}(C)$ in 𝒫. The total size of this united block of small jobs added to this machine is at most $\frac{3(y+1)\gamma}{\delta}(1+\delta)^{j_1(C)-1}$, and thus the sum of $\Gamma$-values for all these blocks is at most

$(1+\delta)^y\left(W+\frac{3(y+1)\gamma(1+\delta)^{j_1(C)-1}}{2\delta}\right)\cdot\frac{3(y+1)\gamma(1+\delta)^{j_1(C)-1}}{\delta}$

by Lemma 3. We use $W\ge(1+\delta)^{j_1(C)-2}$, $\delta\le 1/8$, $y+1\le\frac{1}{\delta^9}$, $\gamma\le\delta^{1512}$, and $(1+\delta)^y\gamma=\delta^{12}$, and get

$\frac{3(y+1)\gamma(1+\delta)^{j_1(C)-1}}{2\delta}\le\delta^{1501}W,$

and $(1+\delta)^y\cdot\frac{3(y+1)\gamma(1+\delta)^{j_1(C)-1}}{\delta}\le 3(1+\delta)\delta^2 W$, and get that the total $\Gamma$-value of the added blocks is at most

$(1+\delta^{1501})W\cdot 3(1+\delta)\delta^2 W\le 7\delta^2W^2/2,$

where the last inequality holds since $3(1+\delta)(1+\delta^{1501})\le 7/2$ for $\delta\le 1/8$.

By Lemma 1, we have $\mathrm{cost}(C)\ge W^2/2$, as all densities are at least 1. Thus, the second increase in the cost is by an additive factor of at most $7\delta^2 Z$.

Bounding the cost of scheduling the unassigned jobs.

Recall that $n'_{r,i}$ is the number of unassigned jobs of size $(1+\delta)^i$ and density $(1+\delta)^r$, and that for $C\in C_H$, $X'_C=X_C$ and $Y'_{r,i,C}\ge Y_{r,i,C}$. Using (2), we have

$n'_{r,i}=n_{r,i}-\sum_{C\in C_H}n_{r,i}(C)X'_C-\sum_{C\in C_H:\,i\in I(C)}Y'_{r,i,C}-\sum_{C\in C_L}n_{r,i}(C)X'_C-\sum_{C\in C_L:\,i\in I(C)}Y'_{r,i,C}$
$\le n_{r,i}-\sum_{C\in C_H}n_{r,i}(C)X_C-\sum_{C\in C_H:\,i\in I(C)}Y_{r,i,C}-\sum_{C\in C_L}n_{r,i}(C)(X_C-1)-\sum_{C\in C_L:\,i\in I(C),\,X_C>0}\left(Y_{r,i,C}-\frac{Y_{r,i,C}}{X_C}-1\right)$
$=\sum_{C\in C_L}n_{r,i}(C)+\sum_{C\in C_L:\,i\in I(C),\,X_C>0}\left(\frac{Y_{r,i,C}}{X_C}+1\right).$

The total size of unassigned jobs is at most

$\sum_{r=0}^{y}\sum_{C\in C_L}\sum_{i\in I(C)}(1+\delta)^i n_{r,i}(C)+\sum_{r=0}^{y}\sum_{C\in C_L:\,X_C>0}\sum_{i\in I(C)}(1+\delta)^i\left(\frac{Y_{r,i,C}}{X_C}+1\right).$

Using

$\sum_{i\in I(C)}(1+\delta)^i\,\frac{Y_{r,i,C}}{X_C}\le(t_r(C)+1)\gamma(1+\delta)^{j_1(C)-1}$

for $C\in C_L$ such that $X_C>0$ (which holds by constraint (3) divided by $X_C$), the total size of unassigned jobs is at most

$\sum_{r=0}^{y}\sum_{C\in C_L}\left((t_r(C)+1)\gamma(1+\delta)^{j_1(C)-1}+\sum_{i\in I(C)}(1+\delta)^i n_{r,i}(C)\right)+\sum_{r=0}^{y}\sum_{C\in C_L}\sum_{i\in I(C)}(1+\delta)^i.$

Using

$\sum_{0\le r\le y,\, i\in I(C)}(1+\delta)^i n_{r,i}(C)+\sum_{r=0}^{y}t_r(C)\left(\gamma(1+\delta)^{j_1(C)-1}\right)\le(1+\delta)^{j_1(C)+1}$

(by the definition of a configuration) and $\sum_{i\in I(C)}(1+\delta)^i\le\gamma(1+\delta)^{j_1(C)}/\delta$ for any $C\in\mathcal{C}$, the total size of unassigned jobs is at most

$\sum_{C\in C_L}(1+\delta)^{j_1(C)+1}+\sum_{r=0}^{y}\sum_{C\in C_L}\gamma(1+\delta)^{j_1(C)-1}+\sum_{r=0}^{y}\sum_{C\in C_L}\gamma(1+\delta)^{j_1(C)}/\delta<\sum_{C\in C_L}2(1+\delta)^{j_1(C)},$

since $1+\delta+(y+1)\gamma\left(\frac{1}{\delta}+1\right)\le 1+\delta+\frac{1}{\delta^9}\cdot\delta^{1500}\cdot\frac{2}{\delta}<2$.

The proof of the following lemma is derived using the definition of $g(\delta)$.

Lemma 11.

The total size of unassigned jobs is at most $2\gamma$.

Now we show that this leads to the required result. The unassigned jobs are assigned to a machine with a configuration $C$ such that $j_1(C)\in\{-1,0,1\}$. Let $x$ be the completion time of this machine before the modifications, where $x\in\left(\frac{1}{(1+\delta)^3},(1+\delta)^2\right]$. The modifications described so far during the first assignment step may increase the completion time of the machine by at most

$\frac{3\gamma(y+1)}{\delta}<\delta^{1400}$

(since $3\gamma(y+1)/\delta\le\frac{3}{\delta}\cdot\delta^{1500}\cdot\frac{1}{\delta^9}<\delta^{1400}$). Even if the density of each such job is $(1+\delta)^y$, and the unassigned jobs are now assigned as a block on the most loaded machine of speed 1, starting at time $x+\delta^{1400}$, the sum of their weighted completion times will be at most

$2\gamma(1+\delta)^y\left(x+\delta^{1400}+2\gamma\right)<2\delta^{12}\left(x+\delta^{1399}\right),$

while the value of $\mathrm{cost}(C)$ is at least $x^2/2$. It holds that

$\frac{2\delta^{12}\left(x+\delta^{1399}\right)}{x^2/2}=\frac{4\delta^{12}}{x}+\frac{4\delta^{1411}}{x^2}\le 4\delta^{12}(1+\delta)^3+4\delta^{1411}(1+\delta)^6<6\delta^{12},$

so the increase in the cost is by an additive factor of at most $6\delta^{12}Z$. The total cost will be at most $Z\left(1+\delta^{100}+7\delta^2+6\delta^{12}\right)<(1+\delta)Z$ (as $7\delta^2\le\frac{7}{8}\delta$ and $\delta^{100}+6\delta^{12}\le 7\delta^{12}\le\frac{7}{8^{11}}\delta$ for $\delta\le\frac{1}{8}$), and thus the following theorem holds.

Theorem 12.

There exists a schedule whose cost is at most $(1+\delta)Z$, and such a schedule can be constructed in polynomial time from $(X,Y)$.

Summary of the algorithm for the bounded ratio problem.

In summary, in this section we provided a $(1+\delta)$-approximation algorithm for the bounded ratio scheduling problem. The algorithm tests all possible intervals $D_{j,b}$; for each one it constructs the MILP 𝒫 and finds an optimal solution, and the best resulting solution is converted into a schedule (a schematic sketch of this control flow is given below). We conclude the paper with the statement of our main result.
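The following minimal sketch is our own illustration of the control flow just summarized, not code from the paper; build_milp, solve_milp, round_and_convert, and schedule_cost are hypothetical callables supplied by the caller (standing for the construction of 𝒫, its optimization, and the conversion of Theorem 12).

def approximate_bounded_ratio(candidate_intervals, build_milp, solve_milp,
                              round_and_convert, schedule_cost):
    """Test every guessed interval D_{j,b}; keep the cheapest schedule found."""
    best_schedule, best_cost = None, float("inf")
    for interval in candidate_intervals:
        milp = build_milp(interval)              # construct the MILP P for this guess
        solution = solve_milp(milp)              # optimal solution, or None if infeasible
        if solution is None:
            continue                             # discard an infeasible guess
        schedule = round_and_convert(solution)   # rounding plus the two assignment steps
        cost = schedule_cost(schedule)
        if cost < best_cost:
            best_schedule, best_cost = schedule, cost
    return best_schedule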

Theorem 13.

There is an EPTAS for the problem of minimizing the total weighted completion time on uniformly related machines.

References

  • [1] Foto N. Afrati, Evripidis Bampis, Chandra Chekuri, David R. Karger, Claire Kenyon, Sanjeev Khanna, Ioannis Milis, Maurice Queyranne, Martin Skutella, Clifford Stein, and Maxim Sviridenko. Approximation schemes for minimizing average weighted completion time with release dates. In Proc. 40th Symp. Foundations of Computer Science (FOCS), pages 32–44, 1999. doi:10.1109/SFFCS.1999.814574.
  • [2] Foto N. Afrati and Ioannis Milis. Designing PTASs for MIN-SUM scheduling problems. Discrete Applied Mathematics, 154(4):622–639, 2006. doi:10.1016/J.DAM.2005.05.014.
  • [3] N. Alon, Y. Azar, G. J. Woeginger, and T. Yadid. Approximation schemes for scheduling. In Proc. 8th Symp. on Discrete Algorithms (SODA), pages 493–500, 1997.
  • [4] N. Alon, Y. Azar, G. J. Woeginger, and T. Yadid. Approximation schemes for scheduling on parallel machines. Journal of Scheduling, 1(1):55–66, 1998.
  • [5] Alok Baveja, Xiaoran Qu, and Aravind Srinivasan. Approximating weighted completion time via stronger negative correlation. Journal of Scheduling, 27(4):319–328, 2024. doi:10.1007/S10951-023-00780-Y.
  • [6] Sebastian Berndt, Hauke Brinkop, Klaus Jansen, Matthias Mnich, and Tobias Stamm. New support size bounds for integer programming, applied to makespan minimization on uniformly related machines. In Proc. 34th International Symp. on Algorithms and Computation (ISAAC), 2023. doi:10.4230/LIPIcs.ISAAC.2023.13.
  • [7] M. Cesati and L. Trevisan. On the efficiency of polynomial time approximation schemes. Information Processing Letters, 64(4):165–171, 1997. doi:10.1016/S0020-0190(97)00164-6.
  • [8] Chandra Chekuri and Sanjeev Khanna. A PTAS for minimizing weighted completion time on uniformly related machines. In Proc. 28th International Colloquium on Automata, Languages, and Programming (ICALP), pages 848–861, 2001. doi:10.1007/3-540-48224-5_69.
  • [9] R. G. Downey and M. R. Fellows. Parameterized Complexity. Springer-Verlag, Berlin, 1999.
  • [10] L. Epstein and J. Sgall. Approximation schemes for scheduling on uniformly related and identical parallel machines. Algorithmica, 39(1):43–57, 2004. doi:10.1007/S00453-003-1077-7.
  • [11] Leah Epstein and Asaf Levin. An efficient polynomial time approximation scheme for load balancing on uniformly related machines. Mathematical Programming, 147(1-2):1–23, 2014. doi:10.1007/S10107-013-0706-4.
  • [12] Leah Epstein and Asaf Levin. Minimum total weighted completion time: Faster approximation schemes. CoRR, abs/1404.1059, 2014. arXiv:1404.1059.
  • [13] Leah Epstein, Asaf Levin, Alan J. Soper, and Vitaly A. Strusevich. Power of preemption for minimizing total completion time on uniform parallel machines. SIAM Journal on Discrete Mathematics, 31(1):101–123, 2017. doi:10.1137/16M1066610.
  • [14] J. Flum and M. Grohe. Parameterized Complexity Theory. Springer-Verlag, Berlin, 2006.
  • [15] M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
  • [16] Stanisław Gawiejnowicz. A review of four decades of time-dependent scheduling: Main results, new topics, and open problems. Journal of Scheduling, 23(1):3–47, 2020. doi:10.1007/S10951-019-00630-W.
  • [17] M. X. Goemans. Improved approximation algorithms for scheduling with release dates. In Proc. 8th Symp. on Discrete Algorithms (SODA), pages 591–598, 1997.
  • [18] D. S. Hochbaum. Various notions of approximations: Good, better, best, and more. In D. S. Hochbaum, editor, Approximation algorithms. PWS Publishing Company, 1997.
  • [19] D. S. Hochbaum and W. Maass. Approximation schemes for covering and packing problems in image processing and VLSI. Journal of the ACM, 32(1):130–136, 1985. doi:10.1145/2455.214106.
  • [20] D. S. Hochbaum and D. B. Shmoys. Using dual approximation algorithms for scheduling problems: theoretical and practical results. Journal of the ACM, 34(1):144–162, 1987. doi:10.1145/7531.7535.
  • [21] D. S. Hochbaum and D. B. Shmoys. A polynomial approximation scheme for scheduling on uniform processors: Using the dual approximation approach. SIAM Journal on Computing, 17(3):539–551, 1988. doi:10.1137/0217033.
  • [22] W. A. Horn. Minimizing average flow time with parallel machines. Operations Research, 21(3):846–847, 1973.
  • [23] K. Jansen. An EPTAS for scheduling jobs on uniform processors: Using an MILP relaxation with a constant number of integral variables. SIAM Journal on Discrete Mathematics, 24(2):457–485, 2010. doi:10.1137/090749451.
  • [24] K. Jansen and C. Robenek. Scheduling jobs on identical and uniform processors revisited. In Proc. of the 9th International Workshop on Approximation and Online Algorithms (WAOA), pages 109–122, 2011.
  • [25] Klaus Jansen, Kim-Manuel Klein, and José Verschae. Closing the gap for makespan scheduling via sparsification techniques. Mathematics of Operations Research, 45(4):1371–1392, 2020. doi:10.1287/MOOR.2019.1036.
  • [26] Ravi Kannan. Improved algorithms for integer programming and related lattice problems. In Proc. 15th Symp. Theory of Computing (STOC), pages 193–206, 1983. doi:10.1145/800061.808749.
  • [27] Dušan Knop and Martin Kouteckỳ. Scheduling meets n-fold integer programming. Journal of Scheduling, 21:493–503, 2018. doi:10.1007/S10951-017-0550-0.
  • [28] Ishai Kones and Asaf Levin. A unified framework for designing eptas for load balancing on parallel machines. Algorithmica, 81(7):3025–3046, 2019. doi:10.1007/S00453-019-00566-9.
  • [29] Hanane Krim, Nicolas Zufferey, Jean-Yves Potvin, Rachid Benmansour, and David Duvivier. Tabu search for a parallel-machine scheduling problem with periodic maintenance, job rejection and weighted sum of completion times. Journal of Scheduling, pages 1–17, 2022.
  • [30] H. W. Lenstra Jr. Integer programming with a fixed number of variables. Mathematics of Operations Research, 8(4):538–548, 1983. doi:10.1287/MOOR.8.4.538.
  • [31] D. Marx. Parameterized complexity and approximation algorithms. The Computer Journal, 51(1):60–78, 2008. doi:10.1093/COMJNL/BXM048.
  • [32] Sartaj K. Sahni. Algorithms for scheduling independent tasks. Journal of the ACM, 23:116–127, 1976. doi:10.1145/321921.321934.
  • [33] Martin Skutella and Gerhard J. Woeginger. A PTAS for minimizing the total weighted completion time on identical parallel machines. Mathematics of Operations Research, 25(1):63–75, 2000. doi:10.1287/MOOR.25.1.63.15212.
  • [34] W. E. Smith. Various optimizers for single-stage production. Naval Research Logistics Quarterly, 3:59–66, 1956.
  • [35] Vipin Ravindran Vijayalakshmi, Marc Schröder, and Tami Tamir. Minimizing total completion time with machine-dependent priority lists. European Journal of Operational Research, 315(3):844–854, 2024. doi:10.1016/J.EJOR.2023.12.030.