Efficient dynamic task scheduling in virtualized data centers with fuzzy prediction
Introduction
The trend towards server-side computing and the exploding popularity of Internet services have rapidly made data centers an integral part of the Internet fabric. Data centers are increasingly popular in large enterprises, banks, telecom, portal sites, etc. (Joseph et al., 2008, Arregoces and Portolani, 2003, Snevely, 2002). As data centers inevitably grow larger and more complex, they pose many challenges for deployment, resource management, service dependability, etc. (Snyder, 2007). Virtualization is viewed as an efficient way to meet these challenges. Server virtualization opens up the possibility of achieving higher server consolidation and more agile dynamic resource provisioning than is possible in traditional platforms (Govindan et al., 2009). A data center built using server virtualization technology with virtual machines (VMs) as the basic processing elements is called a virtualized (or virtual) data center (VDC) (Graupner et al., 2002, Xu et al., 2007, Zhu et al., 2009). Owing to their advantages in deployment, management, dependability and cost, VDCs are becoming the next infrastructure trend with the popularity of Cloud computing and Infrastructure as a Service (IaaS) (Luis et al., 2009, Xiangzhen Kong et al., 2009), as exemplified by Amazon EC2 (Amazon) and VMware vCloud (VMware).
However, the characteristics of virtualization also bring new challenges to VDCs, particularly to task scheduling and resource management. Server consolidation (Marty and Hill, 2007, VMware, 2009) places many VMs on one physical server. VMs are loosely coupled with the underlying hardware and share the hardware resources of the physical server, such as CPU, memory and network. These loosely coupled and highly shared features of virtualization make it difficult to accurately measure the running parameters and resource usage of each VM, which complicates resource management and task scheduling in VDCs. In addition, the mechanism of live migration (Clark et al., 2005) makes it possible for a VM appliance to move between different physical servers in a VDC, which further exacerbates the dynamicity and nondeterminacy of VDCs. These characteristics make it difficult for traditional task scheduling schemes to work well in VDCs.
We focus on the task scheduling problem of VDCs in this paper. Performance metrics such as high throughput, low response delay and short makespan are the conventional optimization goals for task scheduling. Traditional scheduling algorithms usually assume that all server nodes are always available for processing. In practice, this assumption often does not hold, because breakdowns, maintenance requirements or other constraints can make server nodes unavailable for processing (Qin and Xie, 2008). For example, in VDCs, a node is sometimes unavailable during backup, update maintenance or live migration. However, many service applications require a data center platform with high availability, particularly critical services such as military and healthcare applications (Qin and Xie, 2008). Availability is therefore also a critical metric that a scheduling policy in VDCs should take into consideration. In practical applications, however, it is impractical for users to accurately specify their availability requirements in the Service Level Agreements (SLAs) for their submitted tasks. A friendlier way is to let users designate a fuzzy level of availability requirement, such as high, medium or low. How to deal with the vagueness of availability requirements in task scheduling is then also a challenge. Moreover, improving availability incurs extra processing overhead that may degrade scheduling performance. Achieving high performance and high availability simultaneously is therefore also a concern, as the two goals usually conflict with each other (Qin and Xie, 2008).
Furthermore, existing research on scheduling in VDCs mostly focuses on the infrastructure layer, such as resource provisioning and VM placement (Xu et al., 2007, Zhu et al., 2009, Wood et al., 2007, Song et al., 2009, Meng et al., 2010). These works are dedicated to improving the performance of the data center infrastructure, but they ignore the service layer requirements specified by SLAs. Different task classes for different applications (FTP, streaming media, etc.) often have distinct requirements in the SLAs, such as response time and availability, which are closely related to the Quality of Service (QoS) from the users’ view. To improve the user experience, we are motivated to study the task scheduling problem from the service layer by considering the different performance and availability requirements in the SLAs.
In this paper, we propose an effective task scheduling scheme for VDCs, which makes a good trade-off between availability and performance. A multiclass task model is introduced in our scheme, and different classes of tasks are characterized by their distinct arrival rates, service times and availability requirements. Our contributions are summarized as follows: (1) We build a general model for task scheduling in VDCs and formulate it as a two-objective optimization problem; (2) We give a graceful fuzzy prediction method to model the uncertain workload and the vague availability of virtualized server nodes, using type-I and type-II fuzzy logic systems; (3) We design and evaluate a dynamic task scheduling algorithm, which can efficiently improve the total availability of the VDC while maintaining good responsiveness.
The rest of this paper is organized as follows. Section 2 introduces the background. The general formal model of the task scheduling system in VDCs is introduced in Section 3. In Section 4, availability and load-balance predictors are constructed using type-I and interval type-II fuzzy logic systems, respectively. Section 5 presents the new dynamic scheduling algorithm. Section 6 gives the performance evaluation of the algorithm, followed by the conclusions in Section 7.
Section snippets
Task scheduling in virtualized data centers
Task scheduling assigns tasks to different executive units while satisfying some constraints. In VDCs, a VM with the corresponding VMM and HW works as the basic executive unit, called a virtual executive unit (VEU), which is the provider of the services specified in the SLAs. Task scheduling techniques can be either static or dynamic. Static scheduling schemes assume a fixed task set and a priori knowledge of the characteristics of the workload with respect to the systems. It is usually
The framework of task scheduling in virtualized data centers
We use the Multi-classes Single Queue to Multiple Servers with Local Queues (MSQMS-LQ) model for task scheduling in a VDC (see Fig. 1). Tasks wait in a shared queue before entering the scheduler. Each virtual machine acts as a server with a local queue for arrived tasks. Since different classes of tasks may have different characteristics and availability requirements, a multiclass task model (Qin and Xie, 2008, Sethuraman and Squillante, 1999) is applied in our framework. Assume that there
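The queue layout described above can be sketched in Python as follows. This is a minimal illustration, not the scheme from the paper: the class names, the first-in-first-out order of the shared queue and the "shortest local queue" placement rule are all assumptions made only to show how one shared queue feeds several servers with local queues.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: int
    task_class: int  # index of the task class in the multiclass model


@dataclass
class VirtualMachine:
    name: str
    local_queue: deque = field(default_factory=deque)


class Scheduler:
    """One shared waiting queue feeding several servers with local queues."""

    def __init__(self, vms):
        self.shared_queue = deque()
        self.vms = list(vms)

    def submit(self, task):
        # Every arriving task first waits in the shared queue.
        self.shared_queue.append(task)

    def dispatch_one(self):
        """Move the oldest waiting task to a VM; return that VM, or None if idle."""
        if not self.shared_queue:
            return None
        task = self.shared_queue.popleft()  # FIFO order (assumption)
        # Placeholder policy (assumption): shortest local queue, ties by list order.
        target = min(self.vms, key=lambda vm: len(vm.local_queue))
        target.local_queue.append(task)
        return target
```

A real policy would replace the placeholder rule in `dispatch_one` with a decision based on the predicted availability and load of each server.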
Availability fuzzy prediction
This subsection presents the construction of the Availability Fuzzy Predictor (AFP) shown in Fig. 1. Availability is defined as the fraction of a given interval during which a server node is functional. For simplicity, we use the concept of Intrinsic Availability (Birolini, 2007), which is commonly used in engineering and is defined as
$$\text{Intrinsic Availability} = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$$
where MTBF is the mean time between failures and MTTR the mean time to repair.
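As a quick numerical illustration, intrinsic availability is MTBF / (MTBF + MTTR). The Python sketch below computes it for a single component. It then multiplies the values for several components, under the additional assumption (not stated in this excerpt) that the components fail independently and must all be functional. The function names and the sample figures are hypothetical.

```python
def intrinsic_availability(mtbf: float, mttr: float) -> float:
    """Return MTBF / (MTBF + MTTR); both arguments use the same time unit."""
    if mtbf <= 0 or mttr < 0:
        raise ValueError("mtbf must be positive and mttr non-negative")
    return mtbf / (mtbf + mttr)


def series_availability(components) -> float:
    """Product of per-component availabilities.

    Assumption: components fail independently and all must be up.
    `components` is an iterable of (mtbf, mttr) pairs.
    """
    result = 1.0
    for mtbf, mttr in components:
        result *= intrinsic_availability(mtbf, mttr)
    return result


if __name__ == "__main__":
    # Hypothetical (MTBF, MTTR) pairs in hours for HW, VMM and VM.
    stack = [(10_000.0, 8.0), (5_000.0, 1.0), (1_000.0, 0.5)]
    print(f"{series_availability(stack):.5f}")
```

For example, a component with an MTBF of 99 hours and an MTTR of 1 hour is functional 99% of the time.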
In the scheduling model of virtualized data centers, the
The dynamic task scheduling algorithm
After building the availability and the load-balance fuzzy predictors, we use them in the general scheduling framework of virtualized data center, as shown in Fig. 1. We propose an on-line dynamic task scheduling algorithm called Scheduling Algorithm based on Load-balance and Availability Fuzzy prediction (SALAF). The details of the algorithm are depicted in Fig. 7.
Every arrived task is placed in the scheduling queue after it is submitted to the VDC. The scheduler gets tasks from the First-In
Simulations
In this section, we evaluate SALAF using simulation experiments. It is assumed that task arrivals follow a Poisson process and that task execution times are uniformly distributed. The availability-related parameters, including the failure rates and repair rates of HW, VMM and VM, are chosen to represent the characteristics of software and hardware in real-world systems (Birolini, 2007). The parameter settings are shown in Table 4.
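A synthetic workload with these two properties could be generated as in the Python sketch below. It is only an illustration of the stated assumptions (Poisson arrivals, uniformly distributed execution times). The function name, its parameters and the sample values are hypothetical and are not taken from Table 4.

```python
import random


def generate_workload(n_tasks, arrival_rate, exec_min, exec_max, seed=None):
    """Return a list of (arrival_time, execution_time) tuples.

    Arrivals form a Poisson process: the gaps between consecutive arrivals
    are exponentially distributed with mean 1 / arrival_rate.
    Execution times are drawn uniformly between exec_min and exec_max.
    """
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    if exec_min > exec_max:
        raise ValueError("exec_min must not exceed exec_max")
    rng = random.Random(seed)  # private generator, reproducible when seeded
    clock = 0.0
    workload = []
    for _ in range(n_tasks):
        clock += rng.expovariate(arrival_rate)
        workload.append((clock, rng.uniform(exec_min, exec_max)))
    return workload


if __name__ == "__main__":
    # Hypothetical settings: 2 tasks per second, 0.1 s to 0.5 s per task.
    for arrival, duration in generate_workload(5, 2.0, 0.1, 0.5, seed=1):
        print(f"arrival={arrival:.3f}s  execution={duration:.3f}s")
```

Passing a fixed `seed` makes repeated simulation runs comparable across scheduling algorithms, since each algorithm then sees the same task stream.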
We select two scheduling algorithms in common use
Conclusions
In this paper, we have studied the task scheduling problem in virtual data centers, considering the performance and availability requirements of SLAs. The general model of task scheduling in a VDC is built on MSQMS-LQ, and the problem is formulated as an optimization problem with two objectives: average response time and availability satisfaction percentage. Then we give a graceful fuzzy prediction method to model the uncertain workload and the vague availability of virtualized server nodes,
Acknowledgments
We are much obliged to Prof. Zuhtu Hakan Akpolat (Firat University, Turkey) for the interval type-II FLS Matlab toolbox.
This work is financially supported by the National Grand Fundamental Research 973 Program of China (nos. 2010CB328105 and 2009CB320504), and the National Natural Science Foundation of China (nos. 60932003 and 90718040).
References (30)
- et al. Fuzzy scheduling: modelling flexible constraints vs. coping with incomplete knowledge. European Journal of Operational Research (2003)
- The concept of a linguistic variable and its application to approximate reasoning-I. Information Sciences (1975)
- Amazon, Amazon Elastic Compute Cloud (Amazon EC2)....
- The ChinaGrid website [online]. Available:...
- VMware. Vmware vcloud....
- et al. Data center fundamentals (2003)
- Reliability engineering theory and practice (2007)
- Clark C, Fraser K, Hand S, Hansen J, Jul E, Limpach C, Pratt I, Warfield A. Live migration of virtual machines. In:...
- et al. Host load prediction using linear models. Cluster Computing (2000)
- Xen and Co.: communication-aware CPU management in consolidated xen-based hosting platforms. IEEE Transactions on Computers (2009)
- A policy-aware switching layer for data centers. ACM SIGCOMM Computer Communication Review
- Type-2 fuzzy logic systems. IEEE Transactions on Fuzzy Systems
- Queueing systems: volume I: theory
- Interval type-2 fuzzy logic systems: theory and design. IEEE Transactions on Fuzzy Systems