
Theoretical Foundations of Utility Accrual for Real-Time Systems

Jian-Jia Chen, TU Dortmund, Germany; Junjie Shi, TU Dortmund, Germany; Mario Günzel, TU Dortmund, Germany; Georg von der Brüggen, TU Dortmund, Germany; Kuan-Hsun Chen, University of Twente, The Netherlands; Peter Bella, TU Dortmund, Germany
Abstract

Providing guaranteed quantification of properties of soft real-time systems is important in practice to ensure that a system performs correctly most of the time. We study utility accrual for real-time systems, in which the utility of a real-time job is defined as a time utility function with respect to its response time. Essentially, we answer the fundamental questions: Does the utility accrual of a periodic real-time task in the long run converge to a single value? If yes, to which value?

We first show that concrete problem instances exist where evaluating the utility accrual by simulating the scheduling algorithm or conducting scheduling experiments in a long run is erroneous. Afterwards, we show how to construct a Markov chain to model the interactions between the scheduling policy, the probabilistic workload of a periodic real-time task, the service provided by the system to serve the task, and the effect on the utility accrual. For such a Markov chain, we also provide the theoretical fundamentals to determine whether the utility accrual converges in the long run and the derivation of the utility accrual if it converges.

Keywords and phrases:
Soft Real-Time Systems, Utility Accrual, Markov Chains, Dismiss Points
Copyright and License:
© Jian-Jia Chen, Junjie Shi, Mario Günzel, Georg von der Brüggen, Kuan-Hsun Chen, and Peter Bella; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Computer systems organization → Embedded software; Computer systems organization → Real-time system specification
Funding:
This result is part of a project (PropRT) that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 865170).
Editors:
Renato Mancuso

1 Introduction

Soft real-time systems, which are common among industrial real-time systems, can tolerate occasional deadline misses. Specifically, a recent empirical study on industrial real-time systems by Akesson et al. [4] reports that 67% of the investigated systems have soft timing constraints. Furthermore, safety standards such as IEC-61508 [30] and ISO-26262 [31] allow occasional deadline misses if the probability of deadline failure is upper bounded.

Hence, providing guaranteed quantification of properties of soft real-time systems is important in practice to ensure that a system performs correctly most of the time. Such quantitative assessments can be deterministic measures (e.g., the maximum number of deadline misses for every k jobs [28, 20, 6, 59, 58], bounded tardiness [22, 36, 11], etc.) or probabilistic measures. According to the survey by Davis and Cucu-Grosjean [21], two probabilistic guarantees for real-time systems are primarily studied. Suppose that DM(j) is the probability that the j-th job misses its deadline.

  • The worst-case deadline failure probability (WCDFP) is the maximal probability (among all jobs) that a job misses its deadline, i.e., $\sup_j DM(j)$, and is studied in [23, 63, 40, 48, 49, 14, 57, 46, 50, 13, 66, 15, 67, 8, 47].

  • The deadline miss probability (also referred to as deadline miss rate) is the number of jobs missing their deadlines divided by the number of released jobs, i.e., informally, $\lim_{N\to\infty} \frac{1}{N}\sum_{j=1}^{N} DM(j)$, and is studied in [21, 1, 2, 41, 3, 55, 54, 16, 24, 27, 25, 26, 42, 43, 45, 44, 10].

For adaptive real-time systems, the scheduling decisions are made at run-time to incorporate a changing environment. This allows a more efficient resource usage and graceful degradation of system performance, as pointed out by Prasad et al. [56]. Instead of applying the metric of deadline misses, Jensen et al. [32] proposed to utilize a time utility function to specify the semantics of utility accrual, based on the response time of a real-time job. Burns et al. [9] further discussed utility accrual (denoted as value assignment) and presented a framework for value-based scheduling. Prasad et al. [56] investigated both the assignment and the usage of the values.

Let $V(t)$ be the utility accrual of a real-time job when its response time is $t$. Suppose that $R(j)$ is a discrete random variable, indicating the response time of the $j$-th job of a recurrent task. The utility accrual in the long run is informally defined as $\lim_{N\to\infty} \frac{1}{N}\sum_{j=1}^{N} V(R(j))$.

Hence, to derive the utility accrual in the long run, it is necessary to derive the probability distribution of the response time of the jobs. Furthermore, when designing a scheduler to maximize the utility accrual, the key component is incorporating decisions on whether a job should be admitted for execution, whether a job should be aborted if it misses its deadline, etc. Towards this, several studies, e.g., [35, 37, 39, 5, 38, 34, 51, 12, 19, 29, 71, 34], have developed algorithms to maximize the utility accrual for different system models.

Specifically, Locke’s Best Effort Scheduling Algorithm [39] (LBESA) is the first publicized algorithm for maximizing utility accrual for preemptive jobs with arbitrary time-utility functions. The LBESA examines the jobs in the earliest-deadline-first (EDF) order and rejects jobs with lower potential utility densities until the schedule is feasible. LBESA has been extended in the literature, e.g., [5, 51, 38, 37]. Further studies considering non-preemptive schedulers [12, 70] and resource sharing [17, 18] have been provided.

Koren and Shasha [35] designed an online algorithm for maximizing the utility accrual of overloaded systems and proved that their algorithm achieves the optimal competitive factor against an offline (clairvoyant) algorithm. However, their algorithm was later shown to have poor average performance [37]. Utility accrual has also been adopted as a constraint when considering measures related to power consumption [34]. Tidwell et al. [65, 64] developed analysis and strategies based on Markov decision processes to maximize the expected utility accrual.

Despite the rich literature on scheduling policies aimed at maximizing the utility accrual of real-time tasks, to the best of our knowledge, no formal derivation exists for the utility accrual of a scheduling algorithm handling real-time tasks with stochastic behavior. Instead, existing studies primarily assess utility accrual through long-term simulations of scheduling algorithms or empirical scheduling experiments.

In this paper, we lay the theoretical foundations for deriving the utility accrual of scheduling algorithms when real-time tasks have stochastic behavior. Not only is such fundamental work, to the best of our knowledge, currently missing in the literature; we also show in Section 3 that intuitive approaches to determine the utility accrual, like conducting long-running simulations or considering the expected value, may lead to erroneous results.

Our Contributions:

We analyze the utility accrual (in the long run) of a periodic soft real-time task τ, served by a reservation server, time division multiple access (TDMA), or a similar mechanism. To the best of our knowledge, this is the first paper that provides an analytical framework for such a scenario.

  • In Section 3, we provide a rigorous definition of the utility accrual in the long run. We show that concrete problem instances exist where evaluating the utility accrual by simulating the scheduling algorithm or conducting scheduling experiments in a long run is erroneous. The key issue is that the utility accrual of a scheduling algorithm can converge towards different values depending on initial conditions. Therefore, validating the convergence to a unique value is necessary. Furthermore, we explain why the expected value of the utility accrual can be different from the utility accrual in the long run.

  • Our analysis is based on Markov chain models, convergence, and ergodicity – concepts we recap in Section 4. In Section 5, we show how to construct a Markov chain to model the interactions between the scheduling policy, the probabilistic workload of a periodic real-time task, and the service provided by the system to serve the task, as well as their effect on the utility accrual. For such a Markov chain, we also provide the theoretical fundamentals to determine whether the utility accrual converges in the long run and the derivation of the utility accrual if it converges.

  • Section 6 adopts three example schedulers to demonstrate how to utilize the constructions of Markov chains in Section 5. Specifically, we demonstrate in Section 6.1 how to systematically prove the non-existence of convergence for the example in Section 3.2. We evaluate the impact of parameter settings on the utility accrual and examine the scalability of our approach in Section 7.

2 System Model

This section provides the studied periodic task model in Section 2.1, the definition of time utility functions in Section 2.2, a description of supply functions for the service provided to the periodic task in Section 2.3, and how scheduling policies serve the periodic task when considering specific supply functions in Section 2.4.

2.1 Task Model

We consider a soft real-time system which can tolerate occasional deadline misses. We focus on one periodic real-time task τ, which is serviced by a greedy processing component (GPC) [60] in a uniprocessor system.

The periodic soft real-time task τ is modeled by a tuple $(T,D,C)$, where $T>0$ is its period, $D>0$ is its relative deadline, and $C$ is a random variable that describes the execution time. The task τ releases an infinite number of task instances, called jobs, where its $j$-th job is denoted by $J_j$. The first job is assumed to be released at time 0. Hence, job $J_j$ is released at time $(j-1)T$ and its absolute deadline is $(j-1)T+D$.

We assume $C$ follows a discrete distribution; that is, $C$ takes one of $h \geq 1$ distinct values $e_1 < e_2 < \dots < e_h$, each with a related probability. The probability that a random variable $X$ is equal to $x$ is denoted as $\mathbb{P}(X=x)$. The actual execution time of job $J_j$ is assumed to be one of the $h$ distinct values. Therefore, $\sum_{k=1}^{h} \mathbb{P}(C=e_k) = 1$. The execution times of the soft real-time jobs are assumed independent and identically distributed (i.i.d.). Thus, their joint probability is equal to the product of their probabilities. The execution times of jobs $J_j$, $j \in \{1,2,3,\dots\}$, are described by the random variables $C_j$, $j \in \mathbb{N}$, which are independent copies of $C$.

The response time of a job, denoted as $R_j$ for job $J_j$, is defined as its finishing time minus its release time. We note that a job may be dismissed (aborted) by the system before its completion. In such cases, $R_j$ is set to $\infty$ for brevity. In this work, we consider discrete time, i.e., all the above values relevant to time are integers.

2.2 Time Utility Function

As we consider soft real-time systems, even after job $J_j$ misses its deadline at $(j-1)T+D$, it may be reasonable to continue executing the job, since, contrary to hard real-time tasks, a late result may still provide a certain utility to the system. To quantify the utility accrual of a job according to its response time, the task τ is associated with a time/utility function (TUF), denoted as $V$.

The TUF is a function that maps the response time of a job to a utility value, i.e., $V: \mathbb{R}_{\geq 0} \cup \{\infty\} \to \mathbb{R}$. We assume that the time utility function of the task τ is piecewise defined and that the domain is partitioned at two time points: at the deadline $D$ and at a termination time point $H \geq D$. We assume $V(t)$ is 1 for any $0 \leq t \leq D$ (i.e., the job meets its deadline and provides full utility), $V(t)$ is less than 1 for any $D < t \leq H$ (i.e., the job misses its deadline but still provides some utility), and $V(t)$ is $\sigma \leq 0$ if the job finishes after $H$ or is dismissed (i.e., the job result is too late to be useful and the job gets a utility penalty). We assume $V(t)$ to be non-increasing with respect to $t$ for $D < t \leq H$.

In summary, $V: \mathbb{R}_{\geq 0} \cup \{\infty\} \to \mathbb{R}$ with

$$V(t) = \begin{cases} 1, & 0 \leq t \leq D \\ \text{non-increasing w.r.t. } t, & D < t \leq H \\ \sigma \leq 0, & t > H \text{ or } t = \infty \end{cases} \qquad (4)$$

We note that the domain of $V$ could be set to $\mathbb{N} \cup \{\infty\}$, as all timing parameters are assumed to be integers. We choose the domain $\mathbb{R}_{\geq 0} \cup \{\infty\}$, as this in many situations allows to more easily specify the utility function.

We do not assume any relationship between $D$ and $T$ (i.e., τ is an arbitrary-deadline task) and only make the non-trivial assumptions $V(H) \geq \sigma$ and $\sigma \leq 0$. This model generalizes the existing hard, soft, and firm real-time models:

  • For a hard real-time task that does not allow deadline misses, $H$ is set to $D$ and the penalty $\sigma$ is set to $-\infty$. We note that this implies the necessity to meet all deadlines and is not studied in this paper.

  • For a firm real-time task, a job is considered not useful after its deadline miss; thus, $H$ is set to $D$ and the penalty $\sigma$ is set to 0.

  • For a soft real-time task, multiple scenarios (and their combinations) are applicable:

    • If a job is only considered useful up to a certain point after its deadline miss, denoted in this paper by a relative dismiss point δ, then H can be set to δ.

    • If a job receives a penalty when it is not completed, σ is set to the penalty (i.e., it has a negative utility).

Figure 1 illustrates some possible scenarios. A firm real-time task is shown in Figure 1(a). For soft real-time tasks, between $D$ and $H$, $V(t)$ could be linear, concave, or step-wise decreasing, as shown in Figures 1(b), 1(c), and 1(d), respectively. A negative utility at $V(H)$ can still be more beneficial than dismissing a job, which results in a penalty of $\sigma$, as shown in Figure 1(e).

(a) Firm real-time tasks.
(b) V(t): linear, D<tH.
(c) V(t): concave, D<tH.
(d) V(t) is a step function.
(e) V(t) is negative before H.
Figure 1: Examples of Time Utility Functions based on Eq. (4).

The utility accrual for the case when $H$ is set to $D$ and the penalty $\sigma$ is set to 0 is identical to 1 minus the deadline miss rate. Literature results for the deadline miss rate, e.g., [21, 1, 2, 41, 3, 55, 54, 16, 24, 27, 25, 26, 42, 43, 45, 44], can therefore be applied and are not further discussed here. Instead, we focus on the analysis of the utility accrual and utilize it to explore scheduler configurations.

We note that our results do not depend on the concrete structure of the utility function – it suffices that the Markov chain representation can be constructed for the given utility function. We restrict ourselves to functions according to the specification in Eq. (4) for simplicity of presentation. Further discussions can be found in Section 8.
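To make the shape of Eq. (4) concrete, the following Python snippet sketches a TUF factory with a linear decay on $(D, H]$, as in Figure 1(b). This is an illustrative sketch, not part of the formal model; the factory name `make_tuf` and the linear shape are our own choices (any non-increasing shape on $(D, H]$ is admissible).

```python
# Illustrative sketch of a TUF per Eq. (4) with linear decay on (D, H],
# as in Figure 1(b). The name make_tuf is our own choice.
def make_tuf(D, H, sigma=0.0):
    """Return V: response time -> utility (float('inf') encodes a dismissed job)."""
    assert 0 < D <= H and sigma <= 0.0
    def V(t):
        if t == float('inf') or t > H:
            return sigma                     # too late or dismissed: penalty
        if t <= D:
            return 1.0                       # deadline met: full utility
        return 1.0 - (t - D) / (H - D)       # non-increasing on (D, H]
    return V
```

For instance, `make_tuf(6, 11)` matches the utility function used later in Section 3.2, while setting `H = D` yields the firm real-time case.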

2.3 Supply Functions

We assume that the soft real-time task τ is served by a reservation server, time division multiple access (TDMA), or a similar mechanism. The jobs of τ are processed in a first-come-first-serve (FCFS) manner. As in Real-Time Calculus [62, 69, 68], we abstractly model this as a greedy processing component (GPC) [60].

We describe the supply provided by the GPC by specifying the supply in intervals of length $G$. This approach makes it easy to specify relatively complex but recurrent supply patterns. Specifically, for $0 \leq t \leq G$ and $j \in \{1,2,3,\dots\}$, let the supply function $\beta_j(t)$ be the amount of accumulated time, i.e., supply, the GPC provides in $[(j-1)G, (j-1)G+t)$. By definition, $\beta_j(0)=0$. Furthermore, $\beta_j(t)$ is a non-decreasing function.

We implicitly assume in this paper that $\beta_j(t)$ is deterministic; how to extend our analysis to systems with upper/lower bounds on $\beta_j(t)$ is discussed in Section 8. By reformulating $\beta_j$, let $\rho(t)$ be the supply the GPC provides in $[0,t)$. The accumulated service provided by the GPC from $(j-1)T$ to $(j-1)T+t$ is denoted as

$$service(j,t) = \rho((j-1)T+t) - \rho((j-1)T). \qquad (5)$$
 Example 1 (Supply Function based on TDMA).

Suppose that the TDMA cycle is 5 and the service is provided from time 1 to time 5 (i.e., 4 units of time) within the TDMA cycle. Then, for $0 \leq t \leq 5$ and $j \in \{1,2,3,\dots\}$,

$$\beta_j(t) = \begin{cases} 0 & \text{if } 0 \leq t < 1 \\ t-1 & \text{otherwise} \end{cases} \qquad (8)$$

Based on the above discussions, the interaction between the period of the soft real-time task and the accumulated service repeats periodically with a period of $LCM(T,G)$, i.e., the least common multiple (LCM) of $T$ and $G$. Let $Q$ be $LCM(T,G)/T$. By definition, $Q$ is a positive integer. That is, the interaction between the period of the soft real-time task and the accumulated service repeats after considering $Q$ jobs of τ.
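The quantities $\beta$, $\rho$, $service(j,t)$, and $Q$ can be sketched in code for the TDMA pattern of Example 1. The snippet below is an illustrative sketch; the task period $T = 3$ at the end is a hypothetical value chosen only to illustrate $Q = LCM(T,G)/T$, as Example 1 does not fix $T$.

```python
from math import lcm

# Sketch of the supply abstraction for Example 1: TDMA cycle G = 5,
# service provided in [1, 5) of every cycle, hence beta(G) = 4.
G = 5

def beta(t):                    # beta_j(t) for 0 <= t <= G (identical for all j)
    return 0 if t < 1 else t - 1

def rho(t):                     # accumulated supply in [0, t)
    full, rest = divmod(t, G)
    return full * beta(G) + beta(rest)

def service(j, t, T):           # Eq. (5): supply in [(j-1)T, (j-1)T + t)
    return rho((j - 1) * T + t) - rho((j - 1) * T)

# With a hypothetical task period T = 3, the task/supply interaction
# repeats every Q = lcm(T, G) / T = 5 jobs.
Q = lcm(3, G) // 3
```

For example, `rho(10)` evaluates to 8 and `service(2, 3, 3)` to 2, i.e., two supply units are provided in $[3, 6)$.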

In the above TDMA example, the supply function is the same for each interval of length $G=5$. This approach can easily be extended to supply functions which are constructed piecewise. An example with two alternating supply functions (i.e., the supply in intervals $1,3,5,\dots$ is described by $\beta_1(t)$ and the supply in intervals $2,4,6,\dots$ is described by $\beta_2(t)$) can be found in Section 3.2 and the related Figure 2.

2.4 Scheduling Policies

We assume that the scheduler 𝒜 makes a decision on how to execute a job Jj with the following properties:

  • When a job $J_j$ arrives at time $(j-1)T$, the scheduler decides whether this job $J_j$ is admitted, i.e., placed in the ready queue. This decision is based only on information that is known at time $(j-1)T$.

  • The admission of $J_j$ is accompanied by a maximum waiting point $\omega_j$: $J_j$ is dismissed if it does not start its execution before $(j-1)T+\omega_j$. We assume that this configuration of $\omega_j$ is decided when $J_j$ is admitted and that $0 \leq \omega_j \leq H$.

  • The admission of $J_j$ is accompanied by a relative dismiss point $\delta_j$: $J_j$ is dismissed if it is not completed at the absolute dismiss point $(j-1)T+\delta_j$. We assume that this configuration of $\delta_j$ is decided either when $J_j$ is admitted or when $J_j$ starts its execution. For the rest of this paper, for simplicity, we assume that a job $J_j$ is always dismissed immediately if it has not been completed at time $(j-1)T+H$. Therefore, the response time of a job is either no more than $H$ or defined as $\infty$.

If a job is not admitted to the ready queue or is dismissed, its utility accrual is the penalty value $\sigma$. Without loss of generality, we assume that the first job $J_1$ is always admitted with a given relative dismiss point $\delta_1$.

Although the above assumptions may not hold for all schedulers, we believe that they provide a good coverage. For instance, the following scheduling approaches are covered.

 Example 2 (Constant Relative Dismiss Point and Constant Maximum Waiting Point).

At the time job $J_j$ arrives, the scheduler unconditionally admits $J_j$ and configures a constant maximum waiting point $\omega$ as well as a constant relative dismiss point $\delta$ specified in the scheduler. By definition, $0 \leq \omega \leq \delta \leq H$. If $J_j$ has not started by time $(j-1)T+\omega$, then $J_j$ is dismissed directly at the time when $J_j$ becomes the first job of τ in the ready queue (i.e., $J_j$ is not executed at all). If $J_j$ is not completed by time $(j-1)T+\delta$, then $J_j$ is dismissed at time $(j-1)T+\delta$. We denote such schedulers as $\mathcal{A}_{const}$. ∎

 Example 3 (Number of Pending Jobs).

When job $J_j$ arrives, it is admitted if the total number of pending jobs of task τ (including the currently executing job and the unstarted jobs in the ready queue) does not exceed a specified threshold. Once $J_j$ is admitted, a maximum waiting point and a dismiss point can be configured. We denote this as $\mathcal{A}_{num}$. ∎

 Example 4 (Variable Relative Dismiss Points).

If $J_j$ is admitted, an offset $o_j$ is configured depending on the number of pending jobs, and the absolute dismiss point of $J_j$ is set to $t+o_j$, where $t$ is the time point when $J_j$ starts its execution. We denote this as $\mathcal{A}_{var}$. ∎

We provide a general framework to construct Markov chains based on the above properties of the schedulers. In case the target scheduler does not comply with these properties, a revision of the construction of the Markov chain in Section 5 is needed. We provide further discussions on this matter in Section 8.

3 Problem Definition

This paper studies the following question:

For a given scheduling policy (scheduler) $\mathcal{A}$, what utility accrual can be expected in the long run?

To answer this question, in Section 3.1, we first define the utility accrual for bounded intervals and then extend it to unbounded intervals. Afterwards, we explain why simulating the scheduling policy or conducting scheduling experiments in a long run (Section 3.2) and considering the expected utility accrual (Section 3.3) may lead to erroneous results.

3.1 Utility Accrual of the First N Jobs and in the Long Run

Under a given scheduling policy $\mathcal{A}$, let $R(j)$ be a discrete random variable indicating the response time of the $j$-th job $J_j$ of the real-time task τ. By definition, $R(j)$ is in $\mathbb{N} \cup \{\infty\}$. Hence, the utility accrual of the first $N$ jobs of task τ is a random variable that is determined by accumulating their corresponding time utility and dividing by $N$:

$$UA_N = \frac{1}{N}\sum_{j=1}^{N} V(R(j)) \qquad (9)$$

Note that jobs which are not admitted or are dismissed contribute the penalty value $\sigma$ in Equation (9).

The utility accrual of task τ in the long run is $UA_N$ as $N \to \infty$. However, as $UA_N$ is a random variable, it must be shown that $UA_N$ converges towards a single value almost surely. Essentially, we need to answer the following fundamental questions:

Does $UA_N$ actually converge to a single value? If yes, to which value?

In case $UA_N$ converges to a constant value $UA$ almost surely, then $UA$ is denoted as the long-term utility accrual and satisfies

$$\mathbb{P}\left(\lim_{N\to\infty} UA_N = UA\right) = 1 \qquad (10)$$

3.2 Convergence of Utility Accrual

We now demonstrate, based on the concrete example in Figure 2, why simulating the scheduling policy or conducting scheduling experiments in a long run may converge towards different values depending on the initial conditions; thus, the utility accrual is not well-defined. Assume the following implementation of a scheduler based on Example 4:

  • When a job $J_j$ arrives at $(j-1)T$, it is admitted unconditionally. Let $b_j$ be a Boolean variable indicating whether there is any unfinished job at the time $J_j$ is released.

  • The absolute dismiss point is then set dependent on the start time of the job $J_j$. Suppose that $J_j$ starts its execution at time $t$. The absolute dismiss point is set to $t+15$ if $b_j$ is false or to $t+5$ if $b_j$ is true.

Assume we have two alternating supply functions $\beta_1, \beta_2: [0,5] \to \mathbb{R}_{\geq 0}$:

$$\beta_1(t) = \begin{cases} t & \text{if } 0 \leq t < 2 \\ 2 & \text{otherwise} \end{cases} \qquad \beta_2(t) = \begin{cases} t & \text{if } 0 \leq t < 3 \\ 3 & \text{otherwise} \end{cases} \qquad (11)$$

The intervals where supply is provided are depicted in gray in Figure 2.

The periodic task τ has the parameters $T=5$, $D=6$, and its execution time $C$ is described by $\mathbb{P}(C=3)=0.5$ and $\mathbb{P}(C=6)=0.5$. The utility function is

$$V(t) = \begin{cases} 1, & 0 \leq t \leq D \\ 1 - \frac{t-D}{T}, & D < t \leq D+T \\ 0, & t > D+T \text{ or } t = \infty \end{cases} \qquad (15)$$

We now show that the long-term utility accrual is not well-defined for this concrete example. Specifically, we show that the system behavior depends on the execution time of the first job.

We depict the two possible schedules in Figure 2, where upward arrows are job releases and the small gray boxes indicate the service for task τ. The box labeled $J_i$ marks the interval where $J_i$ can be executed. A box with blue dashed lines indicates a job where $V(t)$ is always 1, a red box a job where $V(t)$ is always 0, and a white box a job with unclear utility (either 0 or 1).

  • For $C_1=6$, the schedule is depicted in Figure 2(a). $J_1$ finishes at time 11 and $J_2$ starts executing at time 11, setting its absolute dismiss point to 16. Regardless of whether $C_2$ is 3 or 6, $J_2$ is dismissed at time 16. Thus, $J_3$ starts at time 16 with an absolute dismiss point at 21. At time 21, $J_3$ either finishes its execution for $C_3=3$ or is dismissed for $C_3=6$. Each of the subsequent jobs $J_4, J_5, \dots$ either has a response time of 11 or is dismissed. That is, $R(j)=11$ or $R(j)=\infty$ for $j \geq 2$. Since $V(11)=0=V(\infty)$, the utility accrual for every job and, therefore, also in the long run is 0.

  • For $C_1=3$, the schedule is depicted in Figure 2(b). $J_1$ finishes at time 6 and $J_2$ starts at time 6 with an absolute dismiss point of 11. If $C_2$ is 3, then $J_2$ finishes at time 11 with utility 1; otherwise, $J_2$ is dismissed at time 11 with utility 0. Hence, the scheduler unconditionally starts the execution of $J_3$ at time 11, with an absolute dismiss point of 16. $J_3$ is always dismissed at time 16. The scheduler repeats the above pattern for $J_4, J_5, \dots$. In summary, we have the following cases:

    • For $j = 2,4,6,\dots$, with probability 50% each, (i) when $C_j$ is 3 then $R(j)$ is 6 and $V(R(j))=1$, and (ii) when $C_j$ is 6 then $R(j)$ is $\infty$ and $V(R(j))=0$.

    • For $j = 3,5,7,\dots$, $R(j)$ is $\infty$ and $V(R(j))=0$.

    Interestingly, in this example $R(2), R(4), R(6), \dots$ are independent and identically distributed (i.i.d.) random variables for $C_1=3$.

    Hence, if $C_1=3$, we can calculate the utility accrual as:

    $$UA_N = \frac{1}{N}\sum_{j=1}^{N} V(R(j)) = \frac{1}{N}\left(1 + \sum_{j=1}^{\lfloor N/2 \rfloor} V(R(2j))\right) = \frac{1}{N} + \frac{\lfloor N/2 \rfloor}{N} \cdot \frac{\sum_{j=1}^{\lfloor N/2 \rfloor} V(R(2j))}{\lfloor N/2 \rfloor}$$

    For $N \to \infty$, we get $\frac{1}{N} \to 0$ and $\frac{\lfloor N/2 \rfloor}{N} \to 0.5$. Furthermore, since the $V(R(2j))$ are i.i.d. random variables, we can apply the strong law of large numbers. That is, $\frac{\sum_{j=1}^{\lfloor N/2 \rfloor} V(R(2j))}{\lfloor N/2 \rfloor}$ converges to the expected value $\mathbb{E}[V(R(2j))]=0.5$ for $N \to \infty$ almost surely. Hence, the long-term utility in that case is $0 + 0.5 \cdot 0.5 = 0.25$.

Since the utility accrual converges towards two different values with a probability of 0.5 each (dependent only on the execution behavior of the first job), the long-term utility accrual is undefined. By simulating the scheduling policy or conducting scheduling experiments, the utility accrual in the long run is either 0 or 0.25, each with 50% probability. Hence, although each simulation or experiment returns a concrete value, these values do not represent the long-run utility accrual.

(a) Schedule when C1=6.
(b) Schedule when C1=3.
Figure 2: Schedules for the example in Section 3.2.
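The bifurcation above can be reproduced with a small discrete-time simulation. The following Python sketch encodes our reading of the example (unit-granularity supply, FCFS queue, dismissal checked before execution in each time slot); all function and variable names are our own.

```python
import random

T, H = 5, 11   # period T = 5; jobs are hard-dismissed H = D + T = 11 after release

def supply(t):
    # One unit of supply in slot [t, t+1)? Intervals of length 5 alternate:
    # beta_1 yields 2 units, beta_2 yields 3 units (gray areas in Figure 2).
    k, p = divmod(t, 5)
    return p < (2 if k % 2 == 0 else 3)

def V(t):      # Eq. (15) with D = 6, T = 5; None encodes a dismissed job
    if t is None or t > 11:
        return 0.0
    return 1.0 if t <= 6 else 1.0 - (t - 6) / 5

def simulate(n_jobs, c1, rng):
    """Mean utility of the first n_jobs; the first job has execution time c1."""
    queue, results, t = [], {}, 0
    while len(results) < n_jobs:
        j, phase = divmod(t, T)
        if phase == 0 and j < n_jobs:                  # release of job J_{j+1}
            c = c1 if j == 0 else rng.choice([3, 6])   # P(C=3) = P(C=6) = 0.5
            queue.append({'j': j, 'r': t, 'rem': c, 'busy': bool(queue),
                          'start': None, 'dismiss': t + H})
        for jb in [jb for jb in queue if t >= jb['dismiss']]:
            results[jb['j']] = None                    # dismissed, utility 0
            queue.remove(jb)
        if queue and supply(t):
            head = queue[0]
            if head['start'] is None:                  # dismiss point set at start
                head['start'] = t
                head['dismiss'] = min(head['dismiss'],
                                      t + (5 if head['busy'] else 15))
            head['rem'] -= 1
            if head['rem'] == 0:                       # completes at time t + 1
                results[head['j']] = t + 1 - head['r']
                queue.pop(0)
        t += 1
    return sum(V(results[j]) for j in range(n_jobs)) / n_jobs
```

Under this encoding, every run with $C_1 = 6$ yields exactly 0, while runs with $C_1 = 3$ approach 0.25, mirroring the two branches above; a single long simulation therefore reports one of the two values only.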

3.3 Convergence versus Expected Value

We now show that the expected value of utility accrual is also different from the utility accrual in the long run. The expected utility accrual of the schedule can be expressed as:

$$\mathbb{E}[UA_N] = \frac{1}{N}\sum_{j=1}^{N}\left(\mathbb{P}(R(j)=\infty) \cdot \sigma + \sum_{t \neq \infty} \mathbb{P}(R(j)=t) \cdot V(t)\right) \qquad (16)$$

We use the example in Section 3.2 to demonstrate that the expected utility accrual is not necessarily the same as the utility accrual in the long run. Recall that $\mathbb{P}(R(1)=11)=0.5$ and $\mathbb{P}(R(1)=6)=0.5$. Furthermore, when $R(1)$ is 11, the conditional probabilities for $j = 2,3,\dots$ are

$$\mathbb{P}(R(j)=11 \mid R(1)=11) = \mathbb{P}(R(j)=\infty \mid R(1)=11) = 50\%.$$

In addition, when $R(1)$ is 6, the conditional probabilities for $j = 2,4,\dots$ are

$$\mathbb{P}(R(j)=6 \mid R(1)=6) = \mathbb{P}(R(j)=\infty \mid R(1)=6) = 50\%$$

and for $j = 3,5,\dots$ are

$$\mathbb{P}(R(j)=\infty \mid R(1)=6) = 100\%.$$

Hence, we get an expected value of

$$\mathbb{E}[UA_N] = \frac{1}{N}(0.5 \cdot 1 + 0.5 \cdot 0) + \frac{1}{N}\sum_{j=2}^{N}\sum_{t \neq \infty} \mathbb{P}(R(j)=t) \cdot V(t)$$
$$= \frac{0.5}{N} + \frac{1}{N}\sum_{j=1}^{\lfloor N/2 \rfloor} \mathbb{P}(R(1)=6) \cdot \mathbb{P}(R(2j)=6 \mid R(1)=6) \cdot V(6)$$
$$\quad + \frac{1}{N}\sum_{j=1}^{\lfloor N/2 \rfloor} \mathbb{P}(R(1)=6) \cdot \mathbb{P}(R(2j+1)=\infty \mid R(1)=6) \cdot V(\infty)$$
$$= \frac{0.5}{N} + \frac{\lfloor N/2 \rfloor \cdot 0.5 \cdot 0.5 \cdot 1}{N}$$

For $N \to \infty$, we have $\frac{1}{N} \to 0$ and $\frac{\lfloor N/2 \rfloor}{N} \to 0.5$. Therefore, $\lim_{N\to\infty} \mathbb{E}[UA_N] = 0.5 \cdot 0.25 = 0.125$.

We have shown that the expected utility accrual is 0.125, whilst the utility accrual in a long run is either 0.25 or 0, each with 50% probability. We note that the definition in Eq. (16) derives the expected utility accrual over all possible realizations of $C_1, C_2, \dots, C_N$. However, our focus is the convergence of the utility accrual of any possible realization of $C_1, C_2, \dots, C_N$ in the long run, which can differ from the expected utility accrual, as demonstrated here.

4 Markov Chains, Convergence, and Ergodicity

As discussed in Section 3.1, the key ingredient for determining the long-term utility accrual $UA$ is the calculation of the limit $\lim_{N\to\infty} UA_N$. To achieve this, we build on the well-established foundation of Markov chains. We summarize the parts we need based on the textbook by Norris [52, Chapter 1]. More specifically, in this section we discuss the properties required for convergence of the utility accrual, assuming that the behavior of the scheduler can be described by a Markov chain. In Section 5, we demonstrate that the schedulers specified in Section 2.4 can indeed be described by Markov chains.

We start by recapitulating the definition of a Markov chain.

Definition 5 (Markov chain).

A discrete-time Markov chain $X=(X_n)_{n\in\mathbb{N}}$ is a sequence of random variables $X_n: \Omega \to S$ from the sample space $\Omega$ to a finite set $S$, the so-called state space. The fundamental Markov property of a Markov chain is being memoryless, i.e., the probabilistic behavior of $X_{n+1}$ only depends on the result of $X_n$. Formally, the Markov property is fulfilled if

$$\mathbb{P}(X_{n+1}=s_{n+1} \mid X_1=s_1, \dots, X_n=s_n) = \mathbb{P}(X_{n+1}=s_{n+1} \mid X_n=s_n) \qquad (17)$$

where $\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}$ is the conditional probability of $A$ given $B$. The Markov chain has an initial distribution $\lambda$, which is the probability distribution of $X_1$. The probabilistic behavior is described by the transition matrix $P = (P_{j,i})_{i,j \in S}$, defined as

$$P_{j,i} = \mathbb{P}(X_{n+1}=j \mid X_n=i) \qquad (18)$$

That is, if $X_n=i$ for any $n$, then $X_{n+1}=j$ with probability $P_{j,i}$. ∎

In this section, we assume that the behavior of the scheduler is described by a given Markov chain. That is, a finite set of states $S$ is defined, describing the possible job behavior. Each $X_n$ is the random variable describing the behavior of the $n$-th job $J_n$. If $X_n=s$, then the job $J_n$ behaves as described by state $s \in S$. Section 5 further discusses how to construct suitable Markov chains for different schedulers.

Let the function $UA: S \to \mathbb{R}$ describe the utility accrual $UA(s)$ of a job indicated by state $s \in S$. In this situation, if the first $N$ jobs are indicated by the states $s_1, \dots, s_N \in S$, then the utility accrual of the first $N$ jobs can be calculated as $\frac{1}{N}\sum_{i=1}^{N} UA(s_i)$. More generally, we formulate the random variable $UA_N$ in terms of the Markov chain:

$$UA_N = \frac{1}{N}\sum_{i=1}^{N} UA(X_i) \qquad (19)$$

Thus, determining the long-term utility accrual reduces to the question of whether Eq. (19) converges almost surely to $UA$.

Ergodic theory studies the limiting behavior of Markov chains. Intuitively, it states that, in the long run, the fraction of times a certain state is visited can be described by a stationary distribution, defined below. That is, if a stationary distribution $\pi = (\pi_s)_{s \in S}$ can be calculated, then in the long run the ratio of jobs in state $s \in S$ is $\pi_s$.

Definition 6 (Stationary distribution).

A probability distribution $\pi = (\pi_s)_{s \in S}$ with $\sum_{s \in S} \pi_s = 1$ is stationary if $P\pi = \pi$.

Ergodic theory is applicable when the Markov chain is irreducible. Intuitively, irreducibility means that all nodes in the graph describing the Markov chain are connected by paths of non-zero probability.

Definition 7 (Irreducibility).

A Markov chain $X$ is called irreducible if for all states $s, s' \in S$, there exists a sequence of states $s_1, \dots, s_\psi \in S$ such that

$$P_{s_1,s} \cdot P_{s_2,s_1} \cdots P_{s_\psi,s_{\psi-1}} \cdot P_{s',s_\psi} > 0 \qquad (20)$$

This leads us to the ergodic theorem, which ensures the existence (and uniqueness) of a stationary distribution, as well as the limiting behavior towards that distribution. We reformulate from the textbook by Norris [52, Theorem 1.10.2].¹ (¹ [52, Theorem 1.10.2] further requires a positive recurrent Markov chain, which holds by definition if the Markov chain is finite and irreducible.)

Theorem 8 (Ergodic Theorem [52]).

Consider a Markov chain $X=(X_n)_{n\in\mathbb{N}}$ with finite state space $S$, transition matrix $P$, and initial distribution $\lambda$. Let $f: S \to \mathbb{R}$ be a bounded function. If $X$ is irreducible, then a unique stationary distribution $\pi$ exists, and

$$\mathbb{P}\left(\lim_{N\to\infty} \frac{1}{N}\sum_{j=1}^{N} f(X_j) = \sum_{s \in S} \pi_s f(s)\right) = 1. \qquad (21)$$

By choosing the function $f = UA: S \to \mathbb{R}$ to capture the utility accrual, we can use the ergodic theorem to calculate the long-run utility accrual.

Theorem 9 (Utility Accrual).

If X is irreducible and has a finite state space S, then the unique stationary distribution π exists, and

$$\mathbb{P}\left(\lim_{N\to\infty} UA_N = \sum_{s \in S} \pi_s \, UA(s)\right) = 1. \qquad (22)$$

In particular, the long-term utility accrual of the system described by the Markov chain is

$$UA = \sum_{s \in S} \pi_s \, UA(s) \qquad (23)$$

Proof.

We apply Theorem 8 with $f = UA: S \to \mathbb{R}$ to obtain

$$\mathbb{P}\left(\lim_{N\to\infty} \frac{1}{N}\sum_{j=1}^{N} UA(X_j) = \sum_{s \in S} \pi_s \, UA(s)\right) = 1. \qquad (24)$$

Since $\frac{1}{N}\sum_{i=1}^{N} UA(X_i) = UA_N$ by Equation (19), this is equivalent to Equation (22). In particular, $UA_N$ converges almost surely to $\sum_{s \in S} \pi_s \, UA(s)$. Hence, the long-term utility accrual is $UA = \sum_{s \in S} \pi_s \, UA(s)$, if $X$ is irreducible and has a finite state space $S$.

Since the state space of the Markov chains considered in this work is finite, checking whether $X$ is irreducible can be done efficiently using Tarjan's strongly-connected-components algorithm [61, 53]. Moreover, the stationary distribution can be calculated by (i) solving the linear system $(P-E)\pi = 0$, where $E$ is the identity matrix with 1 on the diagonal and 0 otherwise, and (ii) normalizing $\pi$ by setting $\pi$ to $\frac{1}{\sum_{s \in S} \pi_s} \pi$.
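The recipe above can be sketched for a small chain. The following Python code is an illustrative sketch for a hypothetical two-state chain with utilities $UA = (1, 0)$; for brevity it checks irreducibility by plain reachability (instead of Tarjan's algorithm) and solves the linear system with textbook Gaussian elimination, using the column-stochastic convention of Eq. (18).

```python
# Convention as in Eq. (18): P[j][i] = Prob(X_{n+1} = j | X_n = i),
# so every column of P sums to 1.

def irreducible(P):
    """Every state reaches every other state via edges of positive probability."""
    n = len(P)
    def reach(s):
        seen, stack = {s}, [s]
        while stack:
            i = stack.pop()
            for j in range(n):
                if P[j][i] > 0 and j not in seen:
                    seen.add(j); stack.append(j)
        return seen
    return all(len(reach(s)) == n for s in range(n))

def stationary(P):
    """Solve (P - E) pi = 0 with the last row replaced by sum(pi) = 1."""
    n = len(P)
    A = [[P[j][i] - (1 if i == j else 0) for i in range(n)] for j in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):                       # Gaussian elimination, partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):             # back substitution
        s = sum(A[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = (b[r] - s) / A[r][r]
    return pi

# Hypothetical 2-state example with utilities UA = (1, 0):
P = [[0.5, 1.0],
     [0.5, 0.0]]
assert irreducible(P)
pi = stationary(P)                 # pi = (2/3, 1/3)
ua = sum(p * u for p, u in zip(pi, (1.0, 0.0)))   # Eq. (23): UA = 2/3
```

For this chain, the stationary distribution is $\pi = (2/3, 1/3)$ and the long-term utility accrual per Eq. (23) is $2/3$.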

5 Construction of Markov Chains for Utility Accrual

In this section, we show the construction of a Markov chain that captures the execution of task τ served by a GPC with a supply function $\beta_j(t)$ for each $j$, and analyze its correctness. We exemplify this construction process in Section 6.

To generate a Markov chain for a given system, we need to construct a set of states S and transition probabilities Ps,s for all s,sS such that:

  • the behavior of a realization of every job can be described by exactly one state in $S$, and

  • if a realization of a job $J_n$ is described by a state $s \in S$, then the probability that a realization of the next job $J_{n+1}$ can be described by a state $s' \in S$ is exactly $P_{s,s'}$.

The amount of information stored in a state largely depends on the chosen scheduling policy. The more information is stored, the more complex scheduling policies can be described. However, more information increases the number of states and results in higher time/space complexity of the generation process. If the information of the state is too specific, e.g., storing the absolute finishing time in every state, then this might even result in an infinite state space. To avoid this, we exploit the periodicity of job releases and supply function.

We first detail the necessary and policy-dependent information stored in the individual states. Afterwards, we explain the general procedure to generate states and transition probabilities, starting with the construction of initial states representing the first job $J_1$ and then constructing states of the subsequent jobs $J_{n+1}$, $n \in \mathbb{N}$, based on the states of their preceding job $J_n$.

States of the chain.

Assume that a realization of a job $J_n$ is in a state $s$. To apply our analysis, the state needs to reflect the correct utility accrual $v$. Furthermore, to correctly calculate the finishing time (and therefore the utility) of a subsequent job $J_{n+1}$, we need information about the remaining execution time after the subsequent job $J_{n+1}$ is released. That is, we add the remaining execution time at time $nT$, denoted as $rem$, to the state. Furthermore, we need to know the service that will be provided to the subsequent state. To achieve this, we add an index $\xi$ to the state, indicating that the current job $J_n$ was served by the service function $service(\xi,\cdot)$. Hence, the subsequent job $J_{n+1}$ will be served by $service(\xi+1,\cdot)$. Besides that, any further information that is needed for the decision-making of the scheduling policy (e.g., the number of pending jobs, to be demonstrated in Section 6.2) is denoted as $I$. In conclusion, any state $s \in S$ has four entries, i.e., $s=(v,I,rem,\xi)$. By storing only the index $\xi$ into the $Q$ repetitive service functions, we exploit the periodic behavior of the service.
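A minimal encoding of such a state could look as follows; the field types are our own assumption. Making the state immutable renders it hashable, which is exactly what the later merging of duplicate states requires:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical encoding of a state s = (v, I, rem, xi).
# frozen=True makes instances immutable and hashable, so identical
# states can be merged via set/dict membership during construction.
@dataclass(frozen=True)
class State:
    v: float                        # utility accrual of the described job
    I: Optional[Tuple[int, ...]]    # policy-specific info, e.g. pending-job counts
    rem: int                        # remaining execution time at the next release nT
    xi: int                         # index of the service function (1..Q)

s = State(v=1.0, I=None, rem=1, xi=1)
assert s == State(1.0, None, 1, 1)          # structurally equal states compare equal
assert len({s, State(1.0, None, 1, 1)}) == 1  # ... and merge in a set
```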

Initialization.

We start with an empty set of states $S=\emptyset$ and an empty transition matrix $P=(P_{j,i})_{i,j\in S}$. In the following, we assume that all values of $P$ are set to 0 if not actively set to a value $>0$. For the first job $J_1$ of τ, for each possible realization (i.e., execution-time value) $e_k$, $k=1,2,\ldots,h$, of the random variable $C_1$, we generate a state $s$ as follows:

  • If $e_k \le service(1,D)$, then job $J_1$ meets its deadline for a realization $e_k$ of $C_1$. This realization $e_k$ of $C_1$ has

    $rem = \max\{e_k - service(1,T),\, 0\}$ (25)

    remaining execution time that will be executed in the interval $[T, \max\{D,T\})$.
    The utility accrual is $v=1$.

  • If $e_k > service(1,D)$, then job $J_1$ misses its deadline for a realization $e_k$ of $C_1$. Let $r$ be the minimum value with $service(1,r)=e_k$. We must consider two sub-cases:

    • $J_1$ finishes its execution no later than $\delta_1$, i.e., $r \le \delta_1$. The utility accrual $v$ is $V(r)$. The remaining execution time that will be executed in the interval $[T, \max\{\delta_1,T\})$ is again calculated with Eq. (25).

    • $J_1$ does not finish its execution at time $\delta_1$, i.e., $r > \delta_1$. The utility accrual $v$ is the penalty $\sigma$, as $J_1$ is dismissed before its completion. Some of the remaining execution time $\max\{e_k - \beta_1(T), 0\}$ is dismissed after its relative dismiss point $\delta_1$. Thus,

      $rem = \max\{\min\{e_k,\, service(1,\delta_1)\} - service(1,T),\, 0\}$ (26)

      remaining execution time will be executed in the interval $[T, \max\{\delta_1,T\})$.

By using the values of $rem$ and $v$ constructed above, for each $e_k$ we generate a state $s=(v,I,rem,\xi=1)$, where $I$ is the initial information revealed to the policy by job $J_1$ for the realization $e_k$ of $C_1$. We add all $h$ states to $S$ and remove duplicates. That is, if two realizations result in the same state, the state is not considered twice. The initial distribution $\lambda(s)$ for $s \in S$ is the sum of the probabilities of the realizations that result in the state $s$.
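The initialization step can be sketched as follows in discrete time. The function `initial_states` and all parameter names are our own illustration; the unit-rate supply and the linear utility in the usage example are assumptions, not the supply function of Section 3.2:

```python
# Hypothetical sketch of the initialization step in discrete time.
# `service(1, t)`, the time utility function `V`, the parameters `D`, `T`,
# `delta1`, the penalty `sigma`, and the distribution `C1` (mapping an
# execution-time value e_k to its probability) are assumed to be given.
def initial_states(C1, service, V, D, T, delta1, sigma):
    lam = {}                                     # initial distribution lambda
    for e_k, prob in C1.items():
        if e_k <= service(1, D):                 # J1 meets its deadline
            v, rem = 1.0, max(e_k - service(1, T), 0)          # Eq. (25)
        else:
            # r = earliest (integer) time with service(1, r) >= e_k
            r = next((t for t in range(1, delta1 + 1)
                      if service(1, t) >= e_k), None)
            if r is not None:                    # finishes by the dismiss point
                v, rem = V(r), max(e_k - service(1, T), 0)     # Eq. (25)
            else:                                # still unfinished: dismissed
                v = sigma
                rem = max(min(e_k, service(1, delta1)) - service(1, T), 0)  # Eq. (26)
        s = (v, None, rem, 1)                    # state (v, I, rem, xi = 1)
        lam[s] = lam.get(s, 0.0) + prob          # merge duplicate states
    return lam

# Unit-rate supply, D = T = 5, dismiss point 8, linear utility after D:
lam = initial_states({3: 0.5, 6: 0.5}, service=lambda _, t: t,
                     V=lambda r: 1 - 0.1 * (r - 5), D=5, T=5,
                     delta1=8, sigma=0.0)
assert lam == {(1.0, None, 0, 1): 0.5, (0.9, None, 1, 1): 0.5}
```

Accumulating probabilities in the dictionary `lam` implements both the duplicate removal and the definition of $\lambda(s)$ as the sum over realizations leading to $s$.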

State Constructions for 𝑱𝒏+𝟏.

We assume that the state set $S$ and the transition matrix $P$ are already constructed for all jobs $J_1,\ldots,J_n$. Specifically, we assume that for every realization of $C_1,\ldots,C_n$, a state $s \in S$ describing the behavior of $J_n$ exists. Let $S_n \subseteq S$ be the set of all states without any successor states, i.e., $S_n=\{s\in S \mid P_{s,s'}=0\ \forall s'\in S\}$. By assumption, all these states describe a possible behavior of job $J_n$, since otherwise states that describe the behavior of the subsequent job would have been created. For each $s=(v,I,rem,\xi)\in S_n$ we generate new states $s'=(v',I',rem',\xi')$ describing the behavior of job $J_{n+1}$.

First, $\xi'=(\xi \bmod Q)+1$, since we move to the service function for job $J_{n+1}$. For each of the $h$ realizations $e_k$, $k=1,2,\ldots,h$, of the random variable $C_{n+1}$ for the job $J_{n+1}$ of τ, we generate an $s'$ by determining the following values:

  • If the scheduler does not admit the job $J_{n+1}$ based on the revealed information in $I$, then $J_{n+1}$ is dismissed before its completion. The utility accrual $v'$ is the penalty $\sigma$. The remaining execution time at $(n+1)T$ is

    $rem' = \max\{rem - service(\xi',T),\, 0\}$ (27)

  • If the scheduler admits the job $J_{n+1}$ based on the revealed information in $I$, then $J_{n+1}$ is placed into the ready queue and potentially executed.

    • If the job $J_{n+1}$ cannot start before its configured maximum starting time, i.e., $rem > service(\xi',\omega_{n+1})$, then $J_{n+1}$ is dismissed before it starts. The utility accrual $v'$ is the penalty $\sigma$. The remaining execution time $rem'$ at time $(n+1)T$ is the same as in Eq. (27).

    • If $e_k + rem \le service(\xi',D)$, then job $J_{n+1}$ meets its deadline for realization $e_k$ of $C_{n+1}$. The remaining execution time at $(n+1)T$ is

      $rem' = \max\{e_k + rem - service(\xi',T),\, 0\}$ (28)

      This is similar to Eq. (25) by considering $rem$. The utility accrual is $v'=1$.

    • If $e_k + rem > service(\xi',D)$, then $J_{n+1}$ misses its deadline for realization $e_k$ of $C_{n+1}$. Let $r$ be the minimum value with $service(\xi',r) = rem + e_k$. Two sub-cases must be considered:

      • $J_{n+1}$ has a response time no later than $\delta_{n+1}$, i.e., $r \le \delta_{n+1}$. The utility accrual $v'$ is $V(r)$. This realization $e_k$ of $C_{n+1}$ has the same remaining execution time at $(n+1)T$ as in Eq. (28).

      • $J_{n+1}$ cannot be finished by time $nT+\delta_{n+1}$, i.e., $r > \delta_{n+1}$. The utility accrual $v'$ is $\sigma$, as $J_{n+1}$ is dismissed before its completion. Thus, similar to Eq. (26), the remaining execution time is

        $rem' = \max\{\min\{rem + e_k,\, service(\xi',\delta_{n+1})\} - service(\xi',T),\, 0\}$ (29)

For a state $s'$ generated from $s$ with realization $e_k$, we define the transition probability as $P_{s,s'}=\mathbb{P}(C_{n+1}=e_k)$. If $s'$ is not already in $S$, then we add $s'$ to $S$. Otherwise, if there is another state $\tilde{s} \in S$ that has identical information as $s'$, then we merge $s'$ into $\tilde{s}$ by updating the transition probability as $P_{s,\tilde{s}} := P_{s,\tilde{s}} + P_{s,s'}$. By definition, after merging all states into $S$, $\sum_{s'\in S} P_{s,s'} = 1$.

The creation of the Markov chain is complete once all states in $S$ have successor states. In that case, we can prove the following correctness properties of the Markov chain.
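The generation procedure above can be sketched as a worklist loop. Here, `successors` is a hypothetical policy-specific callback implementing the case analysis for one state; the toy two-state policy in the usage example exists only to exercise the loop:

```python
# Abstract sketch of the chain-generation loop. `successors(s)` is assumed
# to return {s': probability} for the behavior of the next job, following
# the case analysis above for a given state s.
def build_chain(initial_states, successors):
    S = set(initial_states)
    P = {}                                   # sparse matrix: (s, s') -> prob
    frontier = list(initial_states)          # states without successors yet
    while frontier:
        s = frontier.pop()
        for s_next, prob in successors(s).items():
            if s_next not in S:              # a genuinely new state: explore it
                S.add(s_next)
                frontier.append(s_next)
            # merging identical states accumulates their probability
            P[(s, s_next)] = P.get((s, s_next), 0.0) + prob
    return S, P

# Toy policy with two states to exercise the loop:
succ = {'a': {'a': 0.5, 'b': 0.5}, 'b': {'a': 1.0}}
S, P = build_chain(['a'], lambda s: succ[s])
assert S == {'a', 'b'}
assert P == {('a', 'a'): 0.5, ('a', 'b'): 0.5, ('b', 'a'): 1.0}
```

The loop terminates exactly when every state has successors, i.e., when the condition stated above for completing the Markov chain is reached (assuming the reachable state space is finite).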

Theorem 10.

For every $n \in \mathbb{N}$ and every sequence of realizations of $C_1,\ldots,C_n$, the behavior of job $J_n$ can be described by one of the states $s \in S$. Furthermore, the probability that $J_{n+1}$ is in a state $s' \in S$ is $P_{s,s'}$.

Proof.

We show the first part via contradiction. To that end, we assume that $n$ is the minimal natural number for which a sequence $e_{k_1},\ldots,e_{k_n}$ of realizations of $C_1,\ldots,C_n$ exists such that the job $J_n$ cannot be described by any state in $S$.

If n=1, then the state for realization ek1 must have been added to S during the initialization phase. Hence, that state describes the behavior of Jn=J1.

Furthermore, if $n>1$, then let $\tilde{s}$ be the state of $J_{n-1}$ in the above sequence. Since the generation process only stops when $\tilde{s}$ has successor states, those successor states must have been created. In particular, the successor state $s$ that has been generated for realization $e_{k_n}$ describes the behavior of $J_n$.

This contradicts the assumption that Jn cannot be described by a state in S, which proves the first part.

For the second part, we know that the successor states of $s$ have been generated to describe the behavior of $J_{n+1}$ for all possible realizations $e_1,\ldots,e_h$ of $C_{n+1}$. Let $E$ be the set of all realizations that lead to $J_{n+1}$ being in state $s'$. Then, by construction, $P_{s,s'}=\sum_{e_k\in E}\mathbb{P}(C_{n+1}=e_k)$, which means that $P_{s,s'}$ is the probability that $J_{n+1}$ is in state $s'$.

6 Case Studies

In the following subsections, we demonstrate how to utilize the framework discussed in Section 4 to construct a Markov chain for the three scheduling policies defined in Section 2.4. Further notes on the extension to different scheduling policies are provided in Section 8.

6.1 Variable Relative Dismiss Points: 𝓐𝒗𝒂𝒓

We first explain how to construct a Markov chain for the example used in Section 3.2. We also utilize this example to show that the non-existence of convergence can be observed directly in the constructed Markov chain. Recall that the periodic soft real-time task τ has $T=5$ and $D=6$, and two realizations $e_1=3$ and $e_2=6$, each with a 50% probability. The supply functions are described in Eq. (11) and the time utility function is given in Eq. (15). We also know that $Q$ is 2.

As Figure 2 already illustrates parts of the possible schedules, we utilize it to simplify the explanation, instead of applying the equivalent calculations of the remaining execution time using the equations stated in Section 5. Here, we do not need to utilize $I$, since the remaining values (i.e., $v$, $rem$, and $\xi$) already reveal sufficient information to determine the subsequent states. Thus, we always denote $I$ as $\perp$.

Initialization.

For both realizations $e_1=3$ and $e_2=6$, we construct the initialization states. For realization $e_1=3$, $J_1$ finishes at time 6 (i.e., with remaining execution time of 1 at time 5) and the utility is 1. For realization $e_2=6$, $J_1$ finishes at time 11 (i.e., with remaining execution time of 4 at time 5) and the utility is 0.

We create two states (in terms of $(v,I,rem,\xi)$ defined in Section 5) for $J_1$: one state is $s_1=(1,\perp,1,1)$ and the other is $s_2=(0,\perp,4,1)$. We have $\lambda_{s_1}=\lambda_{s_2}=0.5$.

State Constructions for 𝑱𝟐,𝑱𝟑,𝑱𝟒,.

We consider J2 based on s1 (i.e., C1=3) and s2 (i.e., C1=6) as illustrated by the schedules of J1 in Figure 2(b) and 2(a), respectively.

  • For $s_1$, when job $J_2$ arrives at time 5, the ready queue has unfinished workload that will be finished at time $5+1=6$. Hence, $J_2$ starts its execution at time 6 and sets its absolute dismiss point to time 11.

    • If $C_2$ is 3 (i.e., the schedule in Figure 2(b)), then $J_2$ finishes at time 11 and its utility accrual is 1. That is, the remaining execution time of the ready jobs at time 10 is 1. We create a new state $s_3=(1,\perp,1,2)$. The transition probability from $s_1$ to $s_3$ is 50%.

    • If $C_2$ is 6, then $J_2$ is dismissed at time 11 and its utility accrual is 0. That is, the remaining execution time of the ready jobs at time 10 is 1. We create a new state $s_4=(0,\perp,1,2)$. The transition probability from $s_1$ to $s_4$ is 50%.

  • For $s_2$, when job $J_2$ arrives at time 5, the ready queue has unfinished workload that will be finished at time $5+6=11$. Hence, $J_2$ starts executing at time 11 and sets its absolute dismiss point to time 16. No matter whether $C_2$ is 3 or 6, $J_2$ misses its deadline and the remaining execution time of the ready jobs at time 10 is 3, as illustrated in Figure 2(a). We create a new state $s_5=(0,\perp,3,2)$. The transition probability from $s_2$ to $s_5$ is 100%.

We then build the transitions from s3, s4, and s5 for J3:

  • For both $s_3$ and $s_4$, job $J_3$ starts its execution at time 11 and is dismissed unconditionally at time 16 (as depicted in Figure 2(b)). The resulting state $(0,\perp,1,(2 \bmod 2)+1)$ is not identical to any of the existing states. Therefore, we create a new state $s_6=(0,\perp,1,1)$. The transition probability from both $s_3$ and $s_4$ to $s_6$ is 100%.

  • For $s_5$, job $J_3$ starts its execution at time 16 and is dismissed unconditionally at time 21 (as depicted in Figure 2(a)). This results in a state $(0,\perp,3,(2 \bmod 2)+1)$, which is not identical to any of the existing states. Therefore, we create a new state $s_7=(0,\perp,3,1)$. The transition probability from $s_5$ to $s_7$ is 100%.

Now, we have two states $s_6$ and $s_7$ without successor states and continue by considering $J_4$. As the remaining execution time of $s_6$ is the same as that of $s_1$, we can apply the same analysis as for $s_1$. That is, if $C_4$ is 3, the resulting state is $(1,\perp,1,2)$, which is identical to $s_3$; if $C_4$ is 6, the resulting state is $(0,\perp,1,2)$, which is identical to $s_4$. As both $s_3$ and $s_4$ were constructed before, we do not have to create new states. The transition probability from $s_6$ to each of $s_3$ and $s_4$ is 50%. As for the state $s_7$, job $J_4$ starts its execution at time 21 and is dismissed unconditionally at time 26. This results in the state $(0,\perp,3,2)$, which is identical to $s_5$. Therefore, we use $s_5$ instead of creating a new state. The transition probability from $s_7$ to $s_5$ is 100%.

Discussions.

The constructed Markov chain, shown in Figure 3, captures all job realizations of τ according to Theorem 10. It consists of two strongly connected components, so it is not irreducible and Theorem 9 cannot be applied to compute the long-term utility accrual. Rather than invalidating our analysis, this example highlights the necessity of rigorous analytical methods, as presented in this work.
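This non-irreducibility can also be confirmed programmatically. The sketch below encodes the transition matrix read off from the construction above and counts the closed (recurrent) strongly connected components, using SciPy's SCC routine as a stand-in for Tarjan's algorithm:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# Transition matrix of the chain constructed above (states s1..s7, 0-indexed).
P = np.zeros((7, 7))
P[0, 2] = P[0, 3] = 0.5   # s1 -> s3, s4
P[1, 4] = 1.0             # s2 -> s5
P[2, 5] = 1.0             # s3 -> s6
P[3, 5] = 1.0             # s4 -> s6
P[4, 6] = 1.0             # s5 -> s7
P[5, 2] = P[5, 3] = 0.5   # s6 -> s3, s4
P[6, 4] = 1.0             # s7 -> s5

n_comp, labels = connected_components(csr_matrix(P > 0), directed=True,
                                      connection='strong')
assert n_comp > 1  # not irreducible: Theorem 9 does not apply

# Closed (recurrent) components: no transition leaves the component.
closed = {c for c in range(n_comp)
          if all(labels[j] == c for i in range(7) if labels[i] == c
                 for j in np.nonzero(P[i])[0])}
assert len(closed) == 2  # {s3, s4, s6} and {s5, s7}
```

Which of the two closed components is eventually entered depends on the realization of $C_1$, which is why the utility accrual does not converge to a single value.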

Figure 3: Markov chain constructed in Section 6.1 for the example from Section 3.2, indicating that there is no convergence of the utility accrual as the chain has two strongly connected components. Recall that each state is represented by (v,I,𝑟𝑒𝑚,ξ), where v is the utility accrual, I is the further information needed for decision making, 𝑟𝑒𝑚 is the remaining execution time, and ξ is the indication of the service function.

6.2 Number of Pending Jobs: 𝓐𝒏𝒖𝒎

We demonstrate how to utilize the internal state $I$ when the scheduling policy also refers to the number of pending jobs, with a case study of the scheduler explained in Example 3, denoted as 𝒜num. Consider a periodic soft real-time task τ with $T=D=5$ and two possible execution times:

  • $e_1=2$ and $e_2=6$, and

  • $\mathbb{P}(C_j=e_1)=0.5$ and $\mathbb{P}(C_j=e_2)=0.5$, $\forall j \in \mathbb{N}$.

Suppose τ is served by the supply function in Example 1 and that the time utility function of τ is defined as follows:

$V(t)=\begin{cases} 1, & 0 \le t \le D \\ 1-0.1\,(t-D), & D < t \le D+10 \\ \sigma, & t > D+10 \text{ or } t=\infty \end{cases}$ (33)

The maximum number of pending jobs is set to 2. Consequently, when a job $J_n$ is released at time $t$ and the number of jobs in the ready queue at time $t$ (including one that has started executing but is not yet finished) has already reached this limit, $J_n$ is not admitted, and the utility accrual of $J_n$ is $\sigma$. The relative dismiss point is set to $\delta=15=3T$.

We use $I$ to monitor the number of pending jobs. That is, we store the number of jobs completing or being dismissed in each upcoming period $T$, and the number of pending jobs is the sum of these list entries. More specifically, let $a_i(t)$ be the number of jobs completing or being dismissed in the interval $(t+iT, t+(i+1)T]$. Then, for a job released at time $t+T$, the number of pending jobs can be calculated as $a_1(t)+a_2(t)$, because no preceding job is executed after $t+3T$ due to the dismiss point $\delta=3T$. Hence, it is sufficient to store $I=[a_1(t), a_2(t)]$ in a state that describes a job released at time $t$, since the subsequent job is admitted if and only if $a_1(t)+a_2(t)<2$.
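The admission test induced by $I$ can be stated compactly; `admitted` is a hypothetical helper name for illustration:

```python
# Minimal sketch of the bookkeeping for A_num: I = [a1, a2] stores how many
# jobs complete or are dismissed in the next two periods; the sum a1 + a2
# is the number of pending jobs, and a job is admitted iff it is below the limit.
def admitted(I, limit=2):
    return sum(I) < limit

assert admitted([0, 0]) and admitted([1, 0]) and admitted([0, 1])
assert not admitted([2, 0]) and not admitted([1, 1])
```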

Due to space limitation, we only sketch the construction process with a focus on how to utilize the internal state I.

Initialization.

For each of the two realizations $e_1=2$ and $e_2=6$, we construct one initialization state (in terms of $(v,I,rem,\xi)$ defined in Section 5), namely $s_1=(1,[0,0],0,1)$ and $s_2=(0.7,[1,0],2,1)$, respectively. We have $\lambda_{s_1}=\lambda_{s_2}=0.5$.

State Constructions for 𝑱𝟐,𝑱𝟑,𝑱𝟒,.

For both $s_1$ and $s_2$, the next job is always admitted. State $s_1$ has a 50% probability to return to itself (when the next execution time is 2) and a 50% probability to transit to $s_2$ (when it is 6). State $s_2$ has a 50% probability to transit to $s_1$ (when the next execution time is 2) and a 50% probability to transit to a new state $s_3=(0.5,[1,0],4,1)$ (when it is 6).

We continue with $s_3$, as it has no successor state. The next job is always admitted, as there is only one pending job. It has a 50% probability to transit to $s_2$ (when the next execution time is 2) and a 50% probability to transit to a new state $s_4=(0.2,[0,1],6,1)$ (when it is 6).

We continue with $s_4$, as it has no successor state. The next job is always admitted, as there is only one pending job. When its execution time is 2, this job has a response time of 10, and we obtain a new state $s_5=(0.5,[2,0],4,1)$. When its execution time is 6, this job has a response time of 15, and we obtain a new state $s_6=(0,[1,1],8,1)$. The transition probability from $s_4$ to $s_5$ is 50%, as is the transition probability from $s_4$ to $s_6$.

For state $s_5$, since there are two pending jobs, the next job is not admitted, resulting in a new state $s_7=(\sigma,[0,0],0,1)$ with 100% transition probability from $s_5$ to $s_7$. For state $s_6$, there are also two pending jobs; thus, the next job is not admitted, and a new state $s_8=(\sigma,[1,0],4,1)$ is constructed with 100% transition probability from $s_6$ to $s_8$.

The state s7 has no remaining execution time. Therefore, it has 50% transition probability to s1 and 50% transition probability to s2. The state s8 has the same remaining execution time and internal state as state s3. Therefore, it has 50% transition probability to s2 and 50% transition probability to s4.

Discussions.

The construction of the Markov chain terminates with the 8 states depicted in Figure 4. Whenever a state is reached where the number of jobs in the ready queue hits the upper limit of 2 (highlighted in red in Figure 4), the release of the next job is prohibited, and the process transitions to a penalty state (marked in blue). The resulting Markov chain is irreducible, and a unique stationary distribution $\pi$ exists due to Theorem 8. More specifically, the resulting stationary distribution is $\pi=[\frac{7}{22}, \frac{6}{22}, \frac{3}{22}, \frac{2}{22}, \frac{1}{22}, \frac{1}{22}, \frac{1}{22}, \frac{1}{22}]$, where the $i$-th entry is the probability of state $s_i$. (We omit the calculation due to space limitations.) As a result, the UA in the long run is $\frac{7+4.2+1.5+0.4+0.5+\sigma+\sigma}{22}=\frac{13.6+2\sigma}{22}$.
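The stated stationary distribution and the resulting UA can be double-checked numerically. The transition matrix below is read off from the construction steps above, and the penalty σ is set to 0 for the check:

```python
import numpy as np

# Transition matrix of the 8-state chain of Section 6.2 (s1..s8, 0-indexed),
# reconstructed from the construction steps above.
P = np.zeros((8, 8))
P[0, 0] = P[0, 1] = 0.5   # s1 -> s1, s2
P[1, 0] = P[1, 2] = 0.5   # s2 -> s1, s3
P[2, 1] = P[2, 3] = 0.5   # s3 -> s2, s4
P[3, 4] = P[3, 5] = 0.5   # s4 -> s5, s6
P[4, 6] = 1.0             # s5 -> s7
P[5, 7] = 1.0             # s6 -> s8
P[6, 0] = P[6, 1] = 0.5   # s7 -> s1, s2
P[7, 1] = P[7, 3] = 0.5   # s8 -> s2, s4

pi = np.array([7, 6, 3, 2, 1, 1, 1, 1]) / 22
assert np.allclose(pi @ P, pi)            # pi is indeed stationary

sigma = 0.0                               # penalty (set to 0 for this check)
v = np.array([1, 0.7, 0.5, 0.2, 0.5, 0, sigma, sigma])
assert abs(pi @ v - (13.6 + 2 * sigma) / 22) < 1e-9   # long-run UA
```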

Figure 4: Markov chain of the example in Section 6.2 under 𝒜num with 8 states.

6.3 Constant Relative Dismiss Point and Constant Maximum Waiting Point: 𝓐𝒄𝒐𝒏𝒔𝒕

We consider the example from Section 6.2 under a different scheduling policy by setting the relative dismiss point $\delta$ to 8, without any constraint on the maximum waiting point. The initial states $s_1$ and $s_2$ are the same as in Section 6.2. Only one additional state $s_3=(\sigma,[1,0],2,1)$ is created. The detailed construction process is omitted due to space limitations. The resulting Markov chain is illustrated in Figure 5, marking the state with a dismissed job in blue. The Markov chain is irreducible, and a unique stationary distribution $\pi$ exists due to Theorem 8. More specifically, the resulting stationary distribution is $\pi=[\frac{2}{4}, \frac{1}{4}, \frac{1}{4}]$. As a result, the UA in the long run is $0.675+0.25\sigma$.

Figure 5: Markov chain of the example in Section 6.3 under 𝒜const with 3 states.

7 Exploration of Policy Parameters

In this section, we analyze the impact of parameter configurations on utility accrual for a given task. Additionally, we demonstrate the scalability of our approach by showing the average execution times for different steps relative to the number of states.

Figure 6: Utility Accrual (UA) in the long run under different parameter configurations: (a) UA when varying the relative dismiss point δ; (b) UA when varying the constant maximum waiting point ω; (c) UA when varying the number of pending jobs for 𝒜num.

Setup.

We revisit the task in Section 6.2 by individually considering three parameters for the scheduling policy:

  a) the relative dismiss point, i.e., $\delta=5,6,\ldots,15$;

  b) the constant maximum waiting point, i.e., $\omega=1,2,\ldots,10$; and

  c) the maximum number of pending jobs for 𝒜num, i.e., $1,2,\ldots,10$. Here, we set a relative dismiss point of 15, as indicated by the time utility function in Eq. (33), with a penalty $\sigma$ of 0.

Impact on utility accrual.

For each scenario, an irreducible Markov chain based on these scheduling strategies is constructed and analyzed. The results are depicted in Figure 6. Depending on the parameter, different patterns can be observed. Even minor changes can have a drastic effect; e.g., changing the maximum waiting point from 2 to 3 increases the utility accrual from 0.33 to 0.57.

Figure 7: Relationship between scheduling parameters and the size of generated Markov chain.

Complexity of Markov chain.

To study the impact of scheduling parameters on the complexity of the generated Markov chain, we configured three input dimensions: the maximum number of pending jobs, the relative dismiss point, and the maximum waiting point, each ranging from 5 to 60 in increments of 5. This results in a complete grid of 1,728 configurations. The output is the number of reachable states in the constructed Markov chain, which ranges from 4 to 4,702. The results are shown in Figure 7. The analysis reveals that the relative dismiss point and the maximum waiting point have a greater influence on the number of states: as these two parameters increase, the number of states grows rapidly. In contrast, the maximum number of pending jobs has a smaller effect on the number of states.

Figure 8: Average runtime resulting from experiments by setting the relative dismiss points δ from 5 to 3005 in steps of 100. As the number of states increases when δ increases, we use the corresponding number of states in the x-axis.

Scalability.

The scalability of our approach highly depends on the number of states in the constructed Markov chain, which is influenced by system configurations such as the relative dismiss points. A larger relative dismiss point increases the number of states. Using the described setting, we incrementally increase the relative dismiss point from 5 to 3005 in steps of 100, with the termination time H set to 3005, where the time utility function decreases linearly from D=5 to H.

Following the method outlined in Section 5, we (i) construct the Markov chain, (ii) verify its irreducibility, and (iii) compute the stationary distribution π (utilizing scipy.sparse.linalg.eigs to numerically find eigenvectors of a sparse matrix in Python). Figure 8 illustrates the average runtime (over 10 iterations) for these steps on a laptop equipped with an Intel i7-1360P processor and 32 GB RAM. The results indicate that the time required to construct the Markov chain exhibits non-linear growth, and the time to compute the stationary distribution depends on the number of states. Nonetheless, the time spent verifying irreducibility remains minimal, i.e., less than 0.1 seconds.
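Step (iii) can be sketched as follows. The function name and the 3-state sanity-check chain are our own illustration, while scipy.sparse.linalg.eigs is the routine named above; the stationary distribution is the left eigenvector of the (sparse) transition matrix for eigenvalue 1:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import eigs

def stationary_sparse(P):
    # Left eigenvector of P for eigenvalue 1, via ARPACK on the transpose.
    # For large, sparse chains this avoids forming a dense linear system.
    vals, vecs = eigs(csr_matrix(P).T, k=1, which='LM')
    pi = np.real(vecs[:, 0])
    return pi / pi.sum()              # normalize to a probability distribution

# Sanity check on a small doubly stochastic chain (uniform stationary pi):
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
pi = stationary_sparse(P)
assert np.allclose(pi, [1/3, 1/3, 1/3], atol=1e-8)
```

Dividing by the (possibly negative) sum also fixes the arbitrary sign of the eigenvector returned by ARPACK.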

Stochastic Releases.

In addition to the deterministic scheduling policy discussed earlier, we investigate the use of a stochastic release mechanism. In this approach, the decision to release a new job is made probabilistically, depending on the current number of pending jobs in the ready queue. The release probability is fine-tuned using a model-based optimization approach, specifically Bayesian Optimization [33], with the goal of maximizing utility accrual.

We adopt the same setting as in Section 6.2, where the focus is on the number of pending jobs, i.e., 𝒜num. In the example provided in Section 6.2, once the number of pending jobs reaches 𝒜num=2, no further jobs are released, while prior to this, jobs are always released. For the stochastic release mechanism, we define an array Pr=[p0,p1,,p𝒜num], which specifies the release probability based on the current number of pending jobs. Specifically, if there are a pending jobs at a given time, a job is released with probability pa.

A subset of the states for the generated Markov chain is shown in Figure 9, with Pr=[p0,p1,p2]. Here, p0 is set to 1, indicating that when there are no pending jobs, a new job is released in the next period with probability 1 (states s1 and s6). Conversely, p2 is set to 0, consistent with the design outlined in Section 6.2 (state s5). The release probability for the case with one pending job, denoted by p1, can be adjusted to maximize utility accrual.

For instance, when there is one pending job, as represented by state $s_2$, the probability of releasing a job is $p_1$. Specifically, there is a probability of $\frac{p_1}{2}$ that the released job has an execution time of 2 (transitioning to state $s_1$) and another $\frac{p_1}{2}$ that it has an execution time of 6 (transitioning to state $s_3$). Alternatively, with probability $1-p_1$, no job is released in the next period, resulting in a transition to state $s_6$.

An optimization approach was applied to Pr, with each element assigned a value in the range $[0,1]$, aiming to improve the utility accrual. In particular, the release probability $p_1$ was optimized to 0.737, improving the utility accrual from 0.618 (cf. Figure 6(c), column 2) to 0.632. This result demonstrates that optimizations can be easily integrated, and further dedicated optimization efforts may yield even better outcomes.
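A simple stand-in for this optimization can be sketched as a grid search over $p_1$. Here, `chain_for` and the toy two-state chain are hypothetical; in practice, Bayesian optimization [33] would replace the exhaustive grid:

```python
import numpy as np

# Hypothetical sketch: tune a release probability p1 by exhaustive grid
# search over the long-run utility accrual. `chain_for(p1)` is assumed to
# return the transition matrix P and the per-state utilities v of the
# Markov chain parameterized by p1.
def optimize_release(chain_for, grid=np.linspace(0.0, 1.0, 101)):
    best_p, best_ua = None, -np.inf
    for p1 in grid:
        P, v = chain_for(p1)
        n = len(v)
        # stationary distribution: (P^T - E) pi = 0 plus sum(pi) = 1
        A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
        b = np.zeros(n + 1)
        b[-1] = 1.0
        pi, *_ = np.linalg.lstsq(A, b, rcond=None)
        ua = pi @ v                   # long-run utility accrual for this p1
        if ua > best_ua:
            best_p, best_ua = p1, ua
    return best_p, best_ua

# Toy 2-state chain: releasing with probability p moves to a state of
# utility 1, whose stationary probability is p / (1 + p).
toy = lambda p: (np.array([[1 - p, p], [1.0, 0.0]]), np.array([0.0, 1.0]))
best_p, best_ua = optimize_release(toy)
assert abs(best_p - 1.0) < 1e-12 and abs(best_ua - 0.5) < 1e-9
```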

Figure 9: A subset of states for Markov chain with stochastic release mechanism and 𝒜num=2.

8 Remarks and Extensions

Supply Bound Functions Based on Reservation Servers.

We have limited our attention to concrete supply functions. If the supply can only be described as upper and lower bounds, our construction framework in Section 5 requires a significant revision. The main reason is that the constructions of the states and transitions require very precise information. However, if we assume that the scheduler only makes a decision to dismiss a job at time points when the upper and lower bounds on the supply function coincide, then the construction in Section 5 can be applied. For example, for a hard constant bandwidth server (hard-CBS) [7], our method can be applied when the dismiss points always coincide with the budget replenishment period of the hard-CBS, as the upper and lower service curves are the same whenever the budget is replenished.

Arbitrary Time Utility Functions.

We note that utility functions which do not comply with Eq. (4) may better describe the behavior of a concrete system and are also considered in the literature. The Markov chain construction for a scheduler can accommodate any time utility function with minor adjustments. We adopt Eq. (4) only to simplify the presentation and to illustrate the impact of parameter optimizations in Section 7.

Stochastic Scheduling Decisions.

We only consider schedulers with deterministic decision strategies. However, a scheduler can also adopt stochastic scheduling decisions. For example, the scheduler studied in Section 6.2 can stochastically admit a job with a certain probability when the number of pending jobs is 1. This can be directly integrated into the construction procedure in Section 5.

9 Conclusion

In this paper, we lay the theoretical foundations for deriving the utility accrual of scheduling algorithms when real-time tasks have a stochastic behaviour. We point out fundamental problems that have to be properly answered. The focus of this paper is the theoretical fundamentals to validate the existence of the convergence of the utility accrual in the long run and the derivation of the utility accrual if it converges.

Although our construction framework in Section 5 is generic, its automation depends on an ad-hoc configuration of the information $I$ needed for the decision making of the scheduling policy. How to design $I$ automatically and efficiently is an open problem. Furthermore, we plan to generalize our results to more general task models in the future.

References

  • [1] Luca Abeni and Giorgio C. Buttazzo. Integrating multimedia applications in hard real-time systems. In IEEE Real-Time Systems Symposium, pages 4–13, 1998. doi:10.1109/REAL.1998.739726.
  • [2] Luca Abeni and Giorgio C. Buttazzo. QoS guarantee using probabilistic deadlines. In Euromicro Conference on Real-Time Systems ECRTS, pages 242–249, 1999. doi:10.1109/EMRTS.1999.777471.
  • [3] Luca Abeni, Nicola Manica, and Luigi Palopoli. Efficient and robust probabilistic guarantees for real-time tasks. J. Syst. Softw., 85(5):1147–1156, 2012. doi:10.1016/j.jss.2011.12.042.
  • [4] Benny Akesson, Mitra Nasri, Geoffrey Nelissen, Sebastian Altmeyer, and Robert I. Davis. A comprehensive survey of industry practice in real-time systems. Real Time Syst., 58(3):358–398, 2022. doi:10.1007/s11241-021-09376-1.
  • [5] Saud Ahmed Aldarmi and Alan Burns. Dynamic value-density for scheduling real-time systems. In 11th Euromicro Conference on Real-Time Systems (ECRTS), pages 270–277, 1999. doi:10.1109/EMRTS.1999.777474.
  • [6] G. Bernat, A. Burns, and A. Liamosi. Weakly hard real-time systems. IEEE Transactions on Computers, 50(4):308–321, 2001. doi:10.1109/12.919277.
  • [7] Alessandro Biondi, Alessandra Melani, and Marko Bertogna. Hard constant bandwidth server: Comprehensive formulation and critical scenarios. In IEEE International Symposium on Industrial Embedded Systems, SIES, pages 29–37. IEEE, 2014. doi:10.1109/SIES.2014.6871182.
  • [8] Sergey Bozhko, Georg von der Brüggen, and Björn B. Brandenburg. Monte carlo response-time analysis. In 2021 IEEE Real-Time Systems Symposium (RTSS), pages 342–355, 2021. doi:10.1109/RTSS52674.2021.00039.
  • [9] Alan Burns, Divya Prasad, Andrea Bondavalli, Felicita Di Giandomenico, Krithi Ramamritham, John A. Stankovic, and Lorenzo Strigini. The meaning and role of value in scheduling flexible real-time systems. J. Syst. Archit., 46(4):305–325, 2000. doi:10.1016/S1383-7621(99)00008-9.
  • [10] Jian-Jia Chen, Mario Günzel, Peter Bella, Georg von der Brüggen, and Kuan-Hsun Chen. Dawn of the dead(line misses): Impact of job dismiss on the deadline miss rate. CoRR, abs/2401.15503, 2024. doi:10.48550/arXiv.2401.15503.
  • [11] Jian-Jia Chen, Wen-Hung Huang, Zheng Dong, and Cong Liu. Fixed-priority scheduling of mixed soft and hard real-time tasks on multiprocessors. In 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA, pages 1–10, 2017. doi:10.1109/RTCSA.2017.8046312.
  • [12] Ken Chen and Paul Mühlethaler. A scheduling algorithm for tasks described by time value function. Real Time Syst., 10(3):293–312, 1996. doi:10.1007/BF00383389.
  • [13] Kuan-Hsun Chen and Jian-Jia Chen. Probabilistic schedulability tests for uniprocessor fixed-priority scheduling under soft errors. In 2017 12th IEEE International Symposium on Industrial Embedded Systems (SIES), pages 1–8, 2017. doi:10.1109/SIES.2017.7993392.
  • [14] Kuan-Hsun Chen, Mario Günzel, Georg von der Brüggen, and Jian-Jia Chen. Critical instant for probabilistic timing guarantees: Refuted and revisited. In IEEE Real-Time Systems Symposium, RTSS, pages 145–157, 2022. doi:10.1109/RTSS55097.2022.00022.
  • [15] Kuan-Hsun Chen, Niklas Ueter, Georg von der Brüggen, and Jian-Jia Chen. Efficient computation of deadline-miss probability and potential pitfalls. In 2019 Design, Automation Test in Europe Conference Exhibition (DATE), pages 896–901, 2019. doi:10.23919/DATE.2019.8714908.
  • [16] Kuan-Hsun Chen, Georg von der Brüggen, and Jian-Jia Chen. Analysis of deadline miss rates for uniprocessor fixed-priority scheduling. In 24th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA, pages 168–178, 2018. doi:10.1109/RTCSA.2018.00028.
  • [17] Hyeonjoong Cho. Utility Accrual Real-Time Scheduling and Synchronization on Single and Multiprocessors: Models, Algorithms, and Tradeoffs. PhD thesis, Virginia Tech, Blacksburg, VA, USA, 2006. URL: https://hdl.handle.net/10919/28855.
  • [18] Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen. On utility accrual processor scheduling with wait-free synchronization for embedded real-time software. In Proceedings of the 2006 ACM Symposium on Applied Computing, pages 918–922, 2006. doi:10.1145/1141277.1141490.
  • [19] Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen. Utility accrual real-time scheduling for multiprocessor embedded systems. J. Parallel Distributed Comput., 70(2):101–110, 2010. doi:10.1016/J.JPDC.2009.10.003.
  • [20] Hyunjong Choi, Hyoseung Kim, and Qi Zhu. Job-class-level fixed priority scheduling of weakly-hard real-time systems. In 2019 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 241–253, 2019. doi:10.1109/RTAS.2019.00028.
  • [21] Robert I. Davis and Liliana Cucu-Grosjean. A survey of probabilistic schedulability analysis techniques for real-time systems. Leibniz Trans. Embed. Syst., 6(1):04:1–04:53, 2019. doi:10.4230/LITES-v006-i001-a004.
  • [22] UmaMaheswari C. Devi and James H. Anderson. Tardiness bounds under global EDF scheduling on a multiprocessor. In Proceedings of the 26th IEEE Real-Time Systems Symposium (RTSS), pages 330–341, 2005.
  • [23] J. L. Diaz, D. F. Garcia, Kanghee Kim, Chang-Gun Lee, L. Lo Bello, J. M. Lopez, Sang Lyul Min, and O. Mirabella. Stochastic analysis of periodic real-time systems. In 23rd IEEE Real-Time Systems Symposium (RTSS), 2002. doi:10.1109/REAL.2002.1181583.
  • [24] Bernardo Villalba Frias, Luigi Palopoli, Luca Abeni, and Daniele Fontanelli. Probabilistic real-time guarantees: There is life beyond the i.i.d. assumption. In IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS, pages 175–186, 2017. doi:10.1109/RTAS.2017.18.
  • [25] Anna Friebe, Filip Markovic, Alessandro V. Papadopoulos, and Thomas Nolte. Adaptive runtime estimate of task execution times using bayesian modeling. In IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA, pages 1–10, 2021. doi:10.1109/RTCSA52859.2021.00008.
  • [26] Anna Friebe, Filip Markovic, Alessandro V. Papadopoulos, and Thomas Nolte. Continuous-emission markov models for real-time applications: Bounding deadline miss probabilities. In IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS, pages 14–26, 2023. doi:10.1109/RTAS58335.2023.00009.
  • [27] Anna Friebe, Alessandro Vittorio Papadopoulos, and Thomas Nolte. Identification and validation of markov models with continuous emission distributions for execution times. In IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA, pages 1–10, 2020. doi:10.1109/RTCSA50079.2020.9203594.
  • [28] M. Hamdaoui and P. Ramanathan. A dynamic priority assignment technique for streams with (m, k)-firm deadlines. IEEE Transactions on Computers, 44(12):1443–1451, 1995. doi:10.1109/12.477249.
  • [29] Xinfa Hu and Joseph Y.-T. Leung. Integrating communication cost into the utility accrual model for the resource allocation in distributed real-time systems. In The Fourteenth IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA, pages 217–226, 2008. doi:10.1109/RTCSA.2008.34.
  • [30] International Electrotechnical Commission (IEC). Functional safety of electrical / electronic / programmable electronic safety-related systems, edition 2.0, 2010.
  • [31] International Organization for Standardization (ISO). ISO/FDIS 26262: Road vehicles - functional safety, 2000.
  • [32] E. Douglas Jensen, C. Douglass Locke, and Hideyuki Tokuda. A time-driven scheduling model for real-time operating systems. In Proceedings of the 6th IEEE Real-Time Systems Symposium (RTSS), pages 112–122, 1985.
  • [33] Donald R. Jones, Matthias Schonlau, and William J. Welch. Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization, 13(4):455–492, 1998. doi:10.1023/A:1008306431147.
  • [34] Mehdi Kargahi and Ali Movaghar. Performance optimization based on analytical modeling in a real-time system with constrained time/utility functions. IEEE Trans. Computers, 60(8):1169–1181, 2011. doi:10.1109/TC.2010.151.
  • [35] Gilad Koren and Dennis E. Shasha. D^over: An optimal on-line scheduling algorithm for overloaded real-time systems. In Proceedings of the Real-Time Systems Symposium, pages 290–299, 1992. doi:10.1109/REAL.1992.242650.
  • [36] H. Leontyev and J. Anderson. Tardiness bounds for FIFO scheduling on multiprocessors. In Proceedings of the 19th Euromicro Conference on Real-Time Systems, pages 71–80, 2007.
  • [37] Peng Li. Utility Accrual Real-Time Scheduling: Models and Algorithms. PhD thesis, Virginia Tech, Blacksburg, VA, USA, 2004. URL: https://hdl.handle.net/10919/28582.
  • [38] Peng Li, Hyeonjoong Cho, Binoy Ravindran, and E. Douglas Jensen. Stochastic, utility accrual real-time scheduling with task-level and system-level timeliness assurances. In Eighth IEEE International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC), pages 216–223, 2005. doi:10.1109/ISORC.2005.52.
  • [39] Carey Douglass Locke. Best-Effort Decision Making for Real-Time Scheduling. PhD thesis, Carnegie Mellon University, 1986.
  • [40] José María López, José Luis Díaz, Joaquín Entrialgo, and Daniel F. García. Stochastic analysis of real-time systems under preemptive priority-driven scheduling. Real Time Syst., 40(2):180–207, 2008. doi:10.1007/s11241-008-9053-6.
  • [41] Nicola Manica, Luigi Palopoli, and Luca Abeni. Numerically efficient probabilistic guarantees for resource reservations. In IEEE 17th International Conference on Emerging Technologies & Factory Automation, ETFA, pages 1–8, 2012. doi:10.1109/ETFA.2012.6489566.
  • [42] Sorin Manolache, Petru Eles, and Zebo Peng. Memory and time-efficient schedulability analysis of task sets with stochastic execution time. In Euromicro Conference on Real-Time Systems (ECRTS), page 19, 2001. doi:10.1109/EMRTS.2001.933991.
  • [43] Sorin Manolache, Petru Eles, and Zebo Peng. Schedulability analysis of multiprocessor real-time applications with stochastic task execution times. In IEEE/ACM International Conference on Computer-aided Design, ICCAD, pages 699–706, 2002. doi:10.1145/774572.774675.
  • [44] Sorin Manolache, Petru Eles, and Zebo Peng. Optimization of soft real-time systems with deadline miss ratio constraints. In Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 562–570, 2004. doi:10.1109/RTTAS.2004.1317304.
  • [45] Sorin Manolache, Petru Eles, and Zebo Peng. Schedulability analysis of applications with stochastic task execution times. ACM Trans. Embed. Comput. Syst., 3(4):706–735, 2004. doi:10.1145/1027794.1027797.
  • [46] Filip Marković, Alessandro Vittorio Papadopoulos, and Thomas Nolte. On the Convolution Efficiency for Probabilistic Analysis of Real-Time Systems. In 33rd Euromicro Conference on Real-Time Systems (ECRTS 2021), pages 16:1–16:22, 2021. doi:10.4230/LIPIcs.ECRTS.2021.16.
  • [47] Filip Markovic, Pierre Roux, Sergey Bozhko, Alessandro V. Papadopoulos, and Björn B. Brandenburg. CTA: A correlation-tolerant analysis of the deadline-failure probability of dependent tasks. In IEEE Real-Time Systems Symposium, RTSS, pages 317–330, 2023. doi:10.1109/RTSS59052.2023.00035.
  • [48] Dorin Maxim and Liliana Cucu-Grosjean. Response time analysis for fixed-priority tasks with multiple probabilistic parameters. In IEEE 34th Real-Time Systems Symposium, pages 224–235, 2013. doi:10.1109/RTSS.2013.30.
  • [49] Dorin Maxim, Robert I. Davis, Liliana Cucu-Grosjean, and Arvind Easwaran. Probabilistic analysis for mixed criticality systems using fixed priority preemptive scheduling. In Proceedings of the 25th International Conference on Real-Time Networks and Systems (RTNS), pages 237–246, 2017. doi:10.1145/3139258.3139276.
  • [50] Dorin Maxim, Mike Houston, Luca Santinelli, Guillem Bernat, Robert I. Davis, and Liliana Cucu-Grosjean. Re-sampling for statistical timing analysis of real-time systems. In Proceedings of the 20th International Conference on Real-Time and Network Systems, pages 111–120, New York, NY, USA, 2012. Association for Computing Machinery. doi:10.1145/2392987.2393001.
  • [51] Daniel Mossé, Martha E. Pollack, and Yagíl Ronén. Value-density algorithms to handle transient overloads in scheduling. In 11th Euromicro Conference on Real-Time Systems (ECRTS), pages 278–286, 1999. doi:10.1109/EMRTS.1999.777475.
  • [52] J. R. Norris. Markov Chains. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1997. doi:10.1017/CBO9780511810633.
  • [53] Esko Nuutila and Eljas Soisalon-Soininen. On finding the strongly connected components in a directed graph. Inf. Process. Lett., 49(1):9–14, 1994. doi:10.1016/0020-0190(94)90047-7.
  • [54] Luigi Palopoli, Daniele Fontanelli, Luca Abeni, and Bernardo Villalba Frias. An analytical solution for probabilistic guarantees of reservation based soft real-time systems. IEEE Trans. Parallel Distributed Syst., 27(3):640–653, 2016. doi:10.1109/TPDS.2015.2416732.
  • [55] Luigi Palopoli, Daniele Fontanelli, Nicola Manica, and Luca Abeni. An analytical bound for probabilistic deadlines. In Euromicro Conference on Real-Time Systems, ECRTS, pages 179–188, 2012. doi:10.1109/ECRTS.2012.19.
  • [56] Divya Prasad, Alan Burns, and Martin C. Atkins. The valid use of utility in adaptive real-time systems. Real Time Syst., 25(2-3):277–296, 2003. doi:10.1023/A:1025184411567.
  • [57] Khaled S. Refaat and Pierre-Emmanuel Hladik. Efficient stochastic analysis of real-time systems via random sampling. In 2010 22nd Euromicro Conference on Real-Time Systems, pages 175–183, 2010. doi:10.1109/ECRTS.2010.29.
  • [58] Junjie Shi, Niklas Ueter, Jian-Jia Chen, and Kuan-Hsun Chen. Average task execution time minimization under (m, k) soft error constraint. In IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS, pages 1–13. IEEE, 2023. doi:10.1109/RTAS58335.2023.00008.
  • [59] Youcheng Sun and Marco Di Natale. Weakly hard schedulability analysis for fixed priority scheduling of periodic real-time tasks. ACM Trans. Embed. Comput. Syst., 16(5s):171:1–171:19, 2017. doi:10.1145/3126497.
  • [60] Yue Tang, Nan Guan, Weichen Liu, Linh Thi Xuan Phan, and Wang Yi. Revisiting GPC and AND connector in real-time calculus. In 2017 IEEE Real-Time Systems Symposium, RTSS, pages 255–265, 2017. doi:10.1109/RTSS.2017.00031.
  • [61] Robert Endre Tarjan. Depth-first search and linear graph algorithms. SIAM J. Comput., 1(2):146–160, 1972. doi:10.1137/0201010.
  • [62] L. Thiele, S. Chakraborty, and M. Naedele. Real-time calculus for scheduling hard real-time systems. In Proceedings of the 2000 IEEE International Symposium on Circuits and Systems (ISCAS), Geneva, volume 4, pages 101–104, 2000. doi:10.1109/ISCAS.2000.858698.
  • [63] Too-Seng Tia, Zhong Deng, Mallikarjun Shankar, Matthew F. Storch, Jun Sun, L.-C. Wu, and Jane W.-S. Liu. Probabilistic performance guarantee for real-time tasks with varying computation times. In 1st IEEE Real-Time Technology and Applications Symposium, pages 164–173, 1995. doi:10.1109/RTTAS.1995.516213.
  • [64] Terry Tidwell, Carter Bass, Eli Lasker, Micah Wylde, Christopher D. Gill, and William D. Smart. Scalable utility aware scheduling heuristics for real-time tasks with stochastic non-preemptive execution intervals. In Euromicro Conference on Real-Time Systems, ECRTS, pages 238–247, 2011. doi:10.1109/ECRTS.2011.30.
  • [65] Terry Tidwell, Robert Glaubius, Christopher D. Gill, and William D. Smart. Optimizing expected time utility in cyber-physical systems schedulers. In IEEE Real-Time Systems Symposium, RTSS, pages 193–201, 2010. doi:10.1109/RTSS.2010.28.
  • [66] Georg von der Brüggen, Nico Piatkowski, Kuan-Hsun Chen, Jian-Jia Chen, and Katharina Morik. Efficiently Approximating the Probability of Deadline Misses in Real-Time Systems. In 30th Euromicro Conference on Real-Time Systems (ECRTS 2018), volume 106, pages 6:1–6:22, 2018. doi:10.4230/LIPIcs.ECRTS.2018.6.
  • [67] Georg von der Brüggen, Nico Piatkowski, Kuan-Hsun Chen, Jian-Jia Chen, Katharina Morik, and Björn B Brandenburg. Efficiently approximating the worst-case deadline failure probability under EDF. In IEEE Real-Time Systems Symposium (RTSS), pages 214–226, 2021. doi:10.1109/RTSS52674.2021.00029.
  • [68] Ernesto Wandeler and Lothar Thiele. Workload correlations in multi-processor hard real-time systems. J. Comput. Syst. Sci., 73(2):207–224, 2007. doi:10.1016/J.JCSS.2006.04.005.
  • [69] Ernesto Wandeler, Lothar Thiele, Marcel Verhoef, and Paul Lieverse. System architecture evaluation using modular performance analysis - a case study. Software Tools for Technology Transfer (STTT), 8(6):649–667, October 2006. doi:10.1007/S10009-006-0019-5.
  • [70] Jinggang Wang and Binoy Ravindran. Time-utility function-driven switched ethernet: Packet scheduling algorithm, implementation, and feasibility analysis. IEEE Trans. Parallel Distributed Syst., 15(2):119–133, 2004. doi:10.1109/TPDS.2004.1264796.
  • [71] Haisang Wu, Binoy Ravindran, and E. Douglas Jensen. On the joint utility accrual model. In 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004. doi:10.1109/IPDPS.2004.1303086.