Scheduling real-time DAGs in heterogeneous clusters by combining imprecise computations and bin packing techniques for the exploitation of schedule holes

https://doi.org/10.1016/j.future.2012.03.002Get rights and content

Abstract

In this paper, we investigate the improvement that can be gained in the performance of a heterogeneous cluster dedicated to real-time jobs, by exploiting schedule holes with a novel approach that combines imprecise computations and bin packing strategies. According to the imprecise computations technique, a real-time job can trade off precision for timeliness. Each job that arrives at the system has an end-to-end deadline and is a directed acyclic graph of component tasks, where the output data of a task may be used as input by another task. In case the input data of a task are imprecise, the task’s execution time is extended, in order to correct the error and produce a result of acceptable quality. Due to the data dependencies between the tasks of a job, schedule holes may appear in the schedule of a particular processor. Our approach is compared with other scheduling policies, under various workloads. The simulation results show that in the framework under study, the proposed strategy can lead to a better system performance.

Highlights

► Investigation of dynamic real-time scheduling of DAGs in a heterogeneous cluster. ► A novel scheduling approach is used for the exploitation of schedule holes. ► The proposed heuristic combines imprecise computations and bin packing techniques. ► The proposed heuristic outperforms other scheduling policies. ► Significant performance improvement in the case of computationally intensive jobs.

Introduction

Many scientific and commercial real-time applications are executed on heterogeneous clusters which comprise different types of processors and networks, as a more cost-effective solution. The heterogeneity of the distributed computing environment, along with the inherent need of real-time applications for high quality results within strict timing constraints, constitute the major challenge in the scheduling of real-time jobs. Therefore, it is crucial that an effective scheduling technique is employed, in order to guarantee that every real-time job will produce high quality results within the imposed timing constraints [1].

In distributed platforms, jobs can usually be modeled as directed acyclic graphs (DAGs or task graphs). In each DAG, the nodes represent the component tasks of the job and the edges represent the data dependencies between the tasks. The output data of a task are used as input by other tasks of the job. In order for a task to start execution, all of its predecessor tasks must have been completed. An entry task is a task with no predecessors, whereas an exit task is a task with no successors. The immediate predecessors of a task are called parents of the particular task, while the immediate successors of a task are called children of the particular task. Each job has an end-to-end deadline that must be met. That is, the tasks of a job do not have any specific individual deadlines. In order for a job to be guaranteed, all of its exit tasks must finish execution before the job’s deadline.

It is often more desirable for a real-time job to produce an approximate result by its deadline, than to produce a precise result late. Based on this observation, Lin et al. proposed the imprecise computations technique [2]. According to this method, a real-time job is allowed to return intermediate (imprecise) results of poorer, but still acceptable quality, when the deadline of the job cannot be met. To achieve this, it is assumed that every component task in a job is monotone. That is, the quality of a task’s results is improved as more time is spent to produce them. Examples of monotone tasks include numerical computation, statistical estimation and prediction, sorting, video and voice transmission, heuristic search and database query processing. The intermediate results produced by a monotone task are recorded at appropriate instants of its execution. When a task is terminated prematurely, its latest recorded result can be used. The results of a monotone task are precise only when the task is fully completed. This method for returning imprecise results is called the milestone method [3].

A commonly used approach in imprecise computations is to consider that each monotone task of a job consists of a mandatory part, followed by an optional part [4]. In order for a task to return an acceptable result, its mandatory part must be completed. The optional part refines the result produced by the mandatory part. Due to the data dependencies between the tasks of a job, input error may affect the processing time of a task. Specifically, when the input data of a task are imprecise, the processing time of the task is extended, in order to correct the error and produce a result of acceptable quality [5]. A job is completed only when all of its tasks have completed at least their mandatory part, before the job’s deadline.

Imprecise computations can provide scheduling flexibility in real-time systems, by trading off precision for timeliness [6]. For example, a video-on-demand server which streams movies and other video content to users over the internet can benefit from this technique. The server may unexpectedly encounter network congestion, causing delays during the transmission of video files to users. Imprecise computations can allow the system to reduce the quality of some video frames during a transmission, so that the video delivered to a user maintains an acceptable frame rate.

The imprecise computations technique has been studied and extended by many authors [7]. In [8], Haweet et al. examine the offline scheduling of a single task graph with an end-to-end deadline on a single processor. However, the effects of input error on the execution time of the component tasks of the DAG are not taken into account. On the other hand, in [9], Feng and Liu investigate the impact of input error on the static scheduling of a single, linear chain of tasks with an end-to-end deadline on a single processor. Even though the effects of input error are not ignored in this case, the workload has a very simple structure.

Among the real-time scheduling policies that have been proposed in the literature [10], [11], [12], the Earliest Deadline First (EDF) algorithm is the most commonly used [13]. According to this strategy, the job with the earliest deadline has the highest priority for scheduling. On the other hand, task graph scheduling is usually based on the list scheduling heuristic, which consists of two phases: the task selection phase and the processor selection phase. During the first phase, tasks are arranged in a list according to their priorities. The task with the highest priority is selected for scheduling. In the second phase, the selected task is scheduled to the processor that minimizes a specific cost function, such as the estimated start time of the task [14], [15], [16], [17], [18], [19], [20].

Based on the observation that schedule holes (idle time slots) may form in the schedule, due to the communication delays between succeeding tasks in a task graph, Kruatrachue and Lewis [21] proposed the Insertion Scheduling Heuristic (ISH). ISH is a list scheduling heuristic, where in the processor selection phase, a task may be inserted into a schedule hole on a particular processor, if it is able to execute within the particular idle time slot. This can occur when the task is not able to start earlier on any other processor, either with exploitation of schedule holes or not. The task prioritization phase in ISH is based on the position of each task in the graph.

Many variations and applications of the bin packing problem have been proposed [22], [23], [24], [25]. The traditional bin packing problem concerns the packing of a set of objects into a set of bins, using as few bins as possible. First Fit (FF), Best Fit (BF) and Worst Fit (WF) are probably the most popular bin packing policies. According to FF, an object is placed into the first bin where it fits. On the other hand, BF assigns an object to the bin where it leaves the minimum unused space possible, whereas WF places an object into the bin where it leaves the maximum unused space possible.

Previous work has shown that the imprecise computations technique, as well as the exploitation of idle time slots, can both improve the system performance in the case of real-time job scheduling [2], [21], [26], [27]. However, to the best of our knowledge, no scheduling policy that combines both techniques has been proposed in the literature. The benefits of the two techniques could be combined and integrated effectively in the scheduling of real-time DAGs, taking into account both the characteristics of the workload, as well as the heterogeneity of the system.

In this paper, we investigate the improvement that can be gained in the performance of a heterogeneous cluster dedicated to real-time jobs, by exploiting schedule holes with a novel approach that combines imprecise computations and bin packing strategies. Each job that arrives at the system is a directed acyclic graph of component tasks and has an end-to-end deadline. The output data of a task may be used as input by more than one task, according to the data dependencies between the tasks in a DAG. In case the input data of a task are imprecise, the task’s execution time is extended, in order to correct the error and produce a result of acceptable quality. There is a result precision threshold under which the results of a component task are not acceptable. Moreover, each task has an input error limit, beyond of which the error in its input cannot be corrected and an acceptable result cannot be produced, no matter how long the task executes.

The online scheduling of the component tasks of the DAGs is based on a list scheduling heuristic, where the task prioritization and selection phase is performed according to EDF. In the processor selection phase, the selected ready task is allocated to the processor that can provide it with the earliest estimated start time. The estimated start time of the ready task on each processor is evaluated by exploiting schedule holes, using a strategy that combines imprecise computations and various bin packing techniques (either FF, BF or WF). This constitutes the main difference between our approach and the other methods proposed in the literature, such as the ISH policy mentioned above, which essentially exploits possible idle time slots according to FF and without the incorporation of imprecise computations.

The work presented in this paper differs from our previous work in [26], [27], [28]. Specifically, in [26] we consider a homogeneous system and there is no limit on the portion of the optional part that each parent task can discard. Moreover, schedule holes are not exploited during the scheduling process. On the other hand, in [27] we exploit idle time slots with bin packing techniques, but there is no utilization of imprecise computations at all. Finally, in [28] imprecise computations are incorporated differently in the scheduling process, since there is no exploitation of schedule holes. The main focus of this paper is the combination of imprecise computations and bin packing techniques for the exploitation of idle time slots during the scheduling process, in order to improve the system performance. Our approach is compared with other scheduling policies, under various workloads. Our foremost goal is to guarantee that all jobs that arrive at the system will meet their deadlines, providing results as precise as possible.

The remainder of this paper is organized as follows: Section 2 gives a description of the system and the workload model under study, as well as the definition of the imprecise computations model. Section 3 describes the proposed online strategy used to schedule the DAGs that arrive at the system and explains how imprecise computations and bin packing techniques are combined and incorporated into the processor selection process in order to exploit schedule holes. Section 4 presents and analyzes the simulation results and Section 5 concludes this paper, providing suggestions for further research.

Section snippets

Problem formulation

In our case study, we consider a heterogeneous cluster, consisting of a set P of q processors, each serving its own local queue (memory). The cluster is dedicated to real-time jobs and it may be part of a computational grid. Each processor pi is characterized by an execution rate μi. Since the system is heterogeneous, the processor execution rates may vary. It is assumed that the processors in the cluster are fully connected and that the communication between them is contention-free. It is also

Scheduling algorithm

For the scheduling of the ready tasks in the global waiting queue we employ a list scheduling heuristic which consists of a task selection phase and a processor selection phase. The key feature of our proposed strategy is the integration of imprecise computations along with bin packing techniques into the processor selection phase, in order to exploit possible idle time slots that may form in the schedule of a particular processor.

Performance metric

We evaluated the performance of the system in the framework described in the previous sections, by conducting a series of simulation runs using the independent replications method. According to this method, the performance measure of interest is estimated over a number n of simulation runs, each run using the same set of input parameters and terminating condition, but different seeds of random numbers [30]. Due to the complexity of the system and the workload model under study, we implemented

Conclusions and future work

In this paper, we investigated the improvement that can be gained in the performance of a heterogeneous cluster dedicated to real-time jobs, by exploiting schedule holes with a novel approach that combines imprecise computations and various bin packing techniques. The performance of the system was evaluated by simulation under various workloads, taking into account the effects of input error on the processing time of the tasks, as well as the input error limit of each component task. The

Georgios L. Stavrinides is a Ph.D. student in the Department of Informatics, Aristotle University of Thessaloniki, Greece. He received the M.Sc. degree in Advanced Computing from Imperial College London, UK in 2007. His research interests include performance evaluation of distributed real-time systems, simulation and scheduling algorithms.

References (30)

  • D. Hull, A. Shankar, K. Nahrstedt, J.W.S. Liu, An end-to-end qos model and management architecture, in: Proceedings of...
  • W.C. Feng, Applications and extensions of the imprecise computation model, Ph.D. Thesis, University of Illinois at...
  • J.W.S. Liu et al.

    Imprecise computations

    Proceedings of the IEEE

    (1994)
  • W.A.E. Haweet, H.H.E. Meligy, I.A.E. Salam, Imprecise computation technique to schedule and/or tasks with global...
  • W.C. Feng et al.

    Algorithms for scheduling real-time tasks with input error and end-to-end deadlines

    IEEE Transactions on Software Engineering

    (1997)
  • Cited by (51)

    • Modelling and simulation of security-aware task scheduling in cloud computing based on Blockchain technology

      2020, Simulation Modelling Practice and Theory
      Citation Excerpt :

      In batch scheduling, the submitted tasks are grouped into batches and the scheduler assigns each batch to the resources. The tasks may be independently sent and executed in the cloud system or they can be submitted as parallel applications with priority criteria and interrelations among the application components (usually modelled by a Directed Acyclic Graph (DAG) [21]). Security, among the other scheduling aspects and criteria such as high availability of computing and data resources, optimization of the scheduling costs paid by the end-users, energy optimization in the cloud systems and personalization of the cloud services based on the end-users’ requirements, is one of the most challenging problems in today’s cloud cloud systems [11], [23]. ”

    • A DAG task scheduling scheme on heterogeneous cluster systems using discrete IWO algorithm

      2018, Journal of Computational Science
      Citation Excerpt :

      However, this method has one disadvantage, that is, the same task may have to run several times on different computing nodes, which makes this method resource consuming. Ref. [38] employed schedule holes with a novel approach that combines imprecise computations and bin packing strategies. According to the imprecise computation technique, a real-time job can trade off precision for timeliness.

    View all citing articles on Scopus

    Georgios L. Stavrinides is a Ph.D. student in the Department of Informatics, Aristotle University of Thessaloniki, Greece. He received the M.Sc. degree in Advanced Computing from Imperial College London, UK in 2007. His research interests include performance evaluation of distributed real-time systems, simulation and scheduling algorithms.

    Helen D. Karatza is a Professor in the Department of Informatics, Aristotle University of Thessaloniki, Greece. Her research interests include performance evaluation of parallel and distributed systems, cluster, grid and cloud computing, scheduling and simulation.

    View full text