A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems

https://doi.org/10.1016/j.jpdc.2011.04.007Get rights and content

Abstract

In this paper, we investigate the problem of scheduling precedence-constrained parallel applications on heterogeneous computing systems (HCSs) like cloud computing infrastructures. This kind of application was studied and used in many research works. Most of these works propose algorithms to minimize the completion time (makespan) without paying much attention to energy consumption.

We propose a new parallel bi-objective hybrid genetic algorithm that takes into account, not only makespan, but also energy consumption. We particularly focus on the island parallel model and the multi-start parallel model. Our new method is based on dynamic voltage scaling (DVS) to minimize energy consumption.

In terms of energy consumption, the obtained results show that our approach outperforms previous scheduling methods by a significant margin. In terms of completion time, the obtained schedules are also shorter than those of other algorithms. Furthermore, our study demonstrates the potential of DVS.

Highlights

► Addressing the precedence-constrained parallel applications for cloud computing. ► Proposing a new parallel bi-objective hybrid genetic algorithm. ► Taking into account the resolution time and the energy consumption. ► Giving the end-user a set of Pareto solutions to choose the right compromise. ► Improving on average the best known results obtained in the literature for FFT.

Introduction

Precedence-constrained parallel applications are one of the most typical application models used in scientific and engineering fields. Such applications can be deployed on homogeneous or heterogeneous systems (HCSs) like cloud computing infrastructures.

Cloud computing is a simple concept that has emerged from heterogeneous distributed computing, grid computing, utility computing, and autonomic computing. In cloud computing, end-users do not own any part of the infrastructure. The end-users simply use the services available through the cloud computing paradigm and pay for the used services. The cloud computing paradigm can offer any conceivable form of services, such as computational resources for high performance computing applications, web services, social networking, and telecommunications services. Several criteria determine the quality of the provided service, the production cost of this service, and therefore the price paid by the end-user. The duration of this service (makespan) and the consumed energy are among of these criteria. The idea is to provide end-users with a more flexible service that takes into account the duration of the service and the consumed energy. End-users could then find the right compromise between these two conflicting objectives to solve their precedence-constrained parallel applications.

The problem of finding the right compromise between the resolution time and the energy consumed of a precedence-constrained parallel application is a bi-objective optimization problem. The solution to this problem is a set of Pareto points. Pareto solutions are those for which improvement in one objective can only occur with the worsening of at least one other objective. Thus, instead of a unique solution to the problem, the solution to a bi-objective problem is a (possibly infinite) set of Pareto points. To the best of our knowledge, there is no research published in the literature to solve the above problem with a Pareto approach.

However, many works have focused on precedence-constrained parallel applications (e.g., [7], [31], [25], [19]). Most of these works propose algorithms to minimize the makespan. Only recently have some works been interested in minimizing the energy consumption (e.g., [13], [2], [29], [30], [10], [23]).

Using green IT techniques can significantly reduce an organization and ultimately a country’s carbon footprint. The UN bought 461 tons of carbon offsets to ensure that the September 2009 Summit on Climate Change in New York was carbon neutral. According to [3], if green IT techniques were implemented that reduced the energy use of 10,000 computers in West Virginia by 50%, the deployment would produce 33 times less carbon in one year than that produced by the summit. A recent study on power consumption by servers [16] shows that, in 2005, the power used by servers represented about 0.6% of total US electricity consumption. That number grows to 1.2% when cooling and auxiliary infrastructures are included. In the same year, the aggregate electricity bill for operating those servers and associated infrastructure was about $2.7 billion and $7.2 billion for the US and the world, respectively. The total electricity use for servers doubled over the period 2000–2005 worldwide. The number of transistors integrated into today’s Intel Itanium 2 processor reaches nearly 1 billion. If this rate continues, the heat (per square centimeter) produced by future Intel processors would exceed that of the surface of the sun [15]. This implies the possibility of worsening system reliability, eventually resulting in poor system performance.

Due to the importance of energy consumption, various techniques including dynamic voltage scaling (DVS), resource hibernation, and memory optimizations have been investigated and developed [26]. DVS among these has been proven to be a very promising technique with its demonstrated capability for energy savings (e.g., [2], [29], [10]). For this reason, we adopt this technique and it is of particular interest to this study. DVS enables processors to dynamically adjust voltage supply levels (VSLs) aiming to reduce power consumption. However, this reduction is achieved at the expense of sacrificing clock frequencies.

In this paper, we investigate the energy issue in task scheduling particularly on HCSs like cloud computing systems. We propose a new parallel bi-objective hybrid genetic algorithm that takes into account, not only makespan, but also energy consumption. Our new approach is a hybrid between a multi-objective parallel genetic algorithm and energy-conscious scheduling heuristic (ECS) [20]. The results clearly demonstrate the superior performance of ECS over the other algorithms like DBUS [1] and HEFT [25]. Genetic algorithms make it possible to explore a great range of potential solutions to a problem. The exploration capability of the genetic algorithm and the intensification power of ECS are complementary. A skillful combination of a metaheuristic with concepts originating from other types of algorithms lead to more efficient behavior.

Our algorithm is effective as it profits from the exploration power of the genetic algorithm, the intensification capability of ECS, the cooperative approach of the island model, and the parallelism of the multi-start model. The island model and the hybridization improve the quality of the obtained results. The multi-start model reduces the running time of a resolution. Furthermore, one of the major interests of our approach is to give the end-user a set of Pareto solutions to choose according to the desired quality of service, in particular the completion time, and the cost of the service in terms of energy and consequently in terms of price willing to pay. The proposed method can easily be applied to loosely coupled HCSs (e.g., cloud computing systems) using advance reservation and various sets of frequency–voltage pairs.

Our new approach is evaluated with the Fast Fourier Transformation task graph which is a real-world application. Experiments show that (1) the hybridization improves on average the best known results obtained in the literature (by 47.5% for the energy consumption and 12% for the completion time), (2) the island model significantly improves the results obtained using only the hybridization, and (3) the multi-start model accelerates our approach with an average speedup of 13 using 21 cores.

The remainder of the paper is organized as follows. Section 2 presents the application, system, energy and scheduling models used in this paper. Section 3 describes the related work. Our algorithm is presented in Section 4. The results of our comparative experimental study are discussed in Section 5. The conclusion is drawn in Section 6. The paper ends with an Appendix which describes our approaches using pseudo-code.

Section snippets

Problem modeling

In this section, we describe the system, application, energy and scheduling models used in our study.

Related work

In this section, we present some noteworthy works in task scheduling, particularly for HCSs, and then scheduling algorithms with power/energy consciousness.

A parallel hybrid approach

This section starts with a brief overview on multi-objective combinatorial optimization and genetic algorithms. Then, our new parallel bi-objective hybrid approach is presented.

Experiments and results

This section presents the results obtained from our comparative experimental study. The experiments aim to demonstrate and evaluate the contribution of the hybridization, the insular approach and the multi-start approach respectively compared to ECS, the hybrid approach and the insular approach.

Conclusions

In this paper, we have investigated the precedence-constrained parallel applications particularly on high-performance computing systems like cloud computing infrastructures. Precedence-constrained parallel applications are designed mostly with the sole goal of minimizing completion time without paying much attention to energy consumption.

We presented a new parallel bi-objective hybrid genetic algorithm to solve this problem. The algorithm minimizes energy consumption and makespan. The energy

Acknowledgments

Professor Albert Zomaya’s work is support by an Australian Research GrantDP1097110. Experiments presented in this paper were carried out using the Grid’5000 experimental testbed, being developed under the INRIA ALADDIN development action with support from CNRS, RENATER and several universities as well as other funding bodies (see https://www.grid5000.fr). Clusters of the University of Mons were also used. We would like to thank the technical staffs of the Grid’5000 and the clusters of the

M. Mezmaz received the Master’s degree (2002), and a Ph.D. in Computer Science (2007) both from the Laboratoire d’Informatique Fondamentale de Lille (LIFL, Université de Lille1). He is currently a researcher in the Mathematics and Operational Research Department (MathRO) of the University of Mons. His research interests are in the fields of grid/cloud computing and combinatorial optimization. He is the author of one book and the co-author of 3 book chapters, 5 journal papers, and about 20 other

References (31)

  • A. Liefooghe et al.

    A software framework based on a conceptual unified approach for evolutionary multiobjective optimization: ParadisEO-MOEO

    European Journal of Operational Research (EJOR)

    (2011)
  • D. Bozdag, U. Catalyurek, F. Ozguner, A task duplication based bottom-up scheduling algorithm for heterogeneous...
  • D.P. Bunde

    Power-aware scheduling for makespan and flow

    Journal of Scheduling

    (2009)
  • K.W. Cameron

    Trading in green it

    Computer

    (2010)
  • J.J. Chen, T.W. Kuo, Multiprocessor energy-efficient scheduling for real-time tasks with different power...
  • J.P. Cohoon, S.U. Hegde, W.N. Martin, D. Richards, Punctuated equilibria: a parallel genetic algorithm,...
  • T.H. Cormen et al.

    Introduction to Algorithms

    (2001)
  • S. Darbha et al.

    Optimal scheduling algorithm for distributed-memory machines

    IEEE Transactions on Parallel and Distributed Systems

    (2002)
  • K. Deb

    Multi-objective Optimization Using Evolutionary Algorithms

    (2001)
  • M.R. Garey et al.

    Computers and Intractability: A Guide to the Theory of NP-Completeness

    (1979)
  • R. Ge, X. Feng, K.W. Cameron, Performance-constrained distributed dvs scheduling for scientific applications on...
  • Intel. Intel pentium m processor datasheet,...
  • S.U. Khan et al.

    A cooperative game theoretical technique for joint optimization of energy consumption and response time in computational grids

    IEEE Transactions on Parallel and Distributed Systems

    (2009)
  • K.H. Kim, R. Buyya, J. Kim, Power aware scheduling of bag-of-tasks applications with deadline constraints on...
  • S.C. Kim et al.

    Push–pull: deterministic search-based dag scheduling for heterogeneous cluster systems

    IEEE Transactions on Parallel and Distributed Systems

    (2007)
  • Cited by (298)

    View all citing articles on Scopus

    M. Mezmaz received the Master’s degree (2002), and a Ph.D. in Computer Science (2007) both from the Laboratoire d’Informatique Fondamentale de Lille (LIFL, Université de Lille1). He is currently a researcher in the Mathematics and Operational Research Department (MathRO) of the University of Mons. His research interests are in the fields of grid/cloud computing and combinatorial optimization. He is the author of one book and the co-author of 3 book chapters, 5 journal papers, and about 20 other papers in different conferences and workshops.

    N. Melab received the Master’s, Ph.D. and HDR degrees in Computer Science from the Laboratoire d’Informatique Fondamentale de Lille (LIFL, Université Lille 1). He is a Full Professor at Université Lille 1 and a member of the DOLPHIN research group at LIFL and INRIA Lille - Nord Europe. He is the head of the Grid5000 French Nation-Wide grid project and the EGI grid at Lille. His major research interests include parallel, GPU and grid/cloud computing, combinatorial optimization algorithms and software frameworks. Professor Melab has more than 80 international publications including journal papers, book chapters and conference proceedings.

    Y. Kessaci received the Master’s degree in Computer Science from the Laboratoire d’Informatique Fondamentale de Lille (LIFL, Université de Lille1). Currently he is a Ph.D. student within the DOLPHIN project team of INRIA Lille Nord Europe. His major research interests include scheduling, combinatorial optimization algorithms and applications, green computing and cloud computing.

    Y.C. Lee received the B.Sc. (hons) in Computer Science in 2003 and the Ph.D. degree from the School of Information Technologies at the University of Sydney in 2008. He is currently with the Centre for Distributed and High Performance Computing, School of Information Technologies. His current research interests include scheduling and resource allocation for distributed computing systems including clouds, nature-inspired techniques, and parallel and distributed algorithms. He is a member of the IEEE and the IEEE Computer Society.

    E.-G. Talbi received the Master and Ph.D. degrees in Computer Science, both from the Institut National Polytechnique de Grenoble in France. Then he became an Associate Professor in Computer Sciences at the University of Lille (France). Since 2001, he has been a full Professor at the University of Lille and the head of the optimization group of the Computer Science laboratory (LIFL). His current research interests are in the field of multi-objective optimization, parallel algorithms, metaheuristics, combinatorial optimization, cluster and grid computing, hybrid and cooperative optimization, and applications to logistics/transportation, bioinformatics and networking. Professor Talbi has to his credit more than 100 publications in journals, chapters in books, and conferences. He is the co-editor of three books. He was a guest editor of more than 10 special issues in different journals (Journal of Heuristics, Journal of Parallel and Distributed Computing, European Journal of Operational Research, Theoretical Computer Science, Journal of Global Optimization). He is the head of the INRIA Dolphin project and the bioinformatics platform of the Genopole of Lille. He has many collaborative national, European and international projects. He is the co-founder and the coordinator of the research group dedicated to Metaheuristics: Theory and Applications (META). He is the founding co-chair of the NIDISC workshop on nature inspired computing (IEEE/ACM IPDPS). He served in different capacities on the programs of more than 100 national and international conferences. He has also organized many conferences (e.g. EA’2005, ROADEF’2006, META’2008, IEEE AICCSA’2010).

    A.Y. Zomaya is currently the Chair Professor of High Performance Computing & Networking and Australian Research Council Professorial Fellow in the School of Information Technologies, the University of Sydney. He is also the Director of the Centre for Distributed and High Performance Computing which was established in late 2009. Professor Zomaya is the author/co-author of seven books, more than 380 papers, and the editor of 9 books and 11 conference proceedings. He is the Editor in Chief of the IEEE Transactions on Computers and serves as an associate editor for 19 leading journals. Professor Zomaya is the recipient of the Meritorious Service Award (in 2000) and the Golden Core Recognition (in 2006), both from the IEEE Computer Society. He is a Chartered Engineer (CEng), a Fellow of the AAAS, the IEEE, the IET (UK), and a Distinguished Engineer of the ACM.

    D. Tuyttens received the Ph.D. degree in Mathematics from the University of Namur (1991). He is currently a Full Professor in the Polytechnic Faculty of the University of Mons. Professor Tuyttens is author/co-author of about 30 papers. His current research interests include combinatorial optimization, exact methods, metaheuristics and multi-objective optimization.

    View full text