
Open Access | Published: 11-06-2025

Energy-aware cooperative multi-fitness evolutionary algorithm for workflow scheduling in cloud computing

Authors: Pablo Barredo, Jorge Puente

Published in: Natural Computing


Abstract

This article explores the critical challenge of optimizing energy consumption in cloud computing, particularly in the context of scientific workflow scheduling. It introduces an innovative multi-fitness evolutionary algorithm that leverages cooperative strategies to enhance the efficiency of workflow processing. The article delves into the complexities of cloud infrastructure, highlighting the importance of balancing makespan and energy consumption to achieve sustainable computing practices. By integrating multiple heuristic functions, the proposed algorithm offers a robust solution for minimizing total energy consumption while maintaining high performance. The study also presents a detailed experimental analysis, demonstrating the effectiveness of the multi-fitness approach across various cloud scenarios. This research provides valuable insights into the trade-offs between energy efficiency and computational performance, paving the way for more sustainable and efficient cloud computing practices. The article concludes with a comprehensive discussion on the implications of the findings and suggests avenues for future research in this rapidly evolving field.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Cloud computing has emerged as a cornerstone technology for processing large volumes of data, offering unparalleled flexibility and scalability. Among its diverse service models, Infrastructure as a Service (IaaS) stands out by allowing users to rent virtual machines (VMs) with customizable capabilities. This model enables horizontal scalability, by increasing the number of VMs, and vertical scalability, by enhancing the capacity of individual servers. Users benefit from a pay-as-you-go pricing scheme, which charges for computational resources like CPU and memory, as well as additional costs for network bandwidth and storage based on usage. Furthermore, IaaS provides significant control, letting users tailor their environment by selecting specific operating systems, processor configurations, and storage options, making platforms like Microsoft Azure, Google Compute Engine, and Amazon EC2 ideal for researchers needing dynamic and cost-efficient infrastructure for their experiments.
This research addresses the scheduling problem for scientific workflows, typically represented as Directed Acyclic Graphs (DAGs). These graphs model a collection of interdependent tasks, where the execution of one task may depend on the completion of others. Data exchange between tasks may range from a few bytes to several gigabytes, and the complexity of DAGs can vary significantly, from just a few dozen tasks to thousands. This variability results in execution times ranging from mere seconds to hours or even days Sahni and Vidyarthi (2018). The execution time of each task depends not only on the specific virtual machine selected but also on factors such as storage systems and network connections between VMs, which can significantly affect the overall execution time. To run experiments effectively, researchers often rely on scalable and flexible platforms, such as the cloud IaaS mentioned above.
Scheduling these workflows requires assigning tasks to VMs in a way that satisfies various quality of service (QoS) constraints, such as minimising total makespan, operational costs or total energy consumption. This optimisation problem, known as the Workflow Scheduling Problem, is NP-complete Madni et al. (2017), highlighting the inherent complexity of finding efficient solutions.
Addressing this problem requires producing efficient schedules, whose quality is assessed through various metrics. Among them, the total execution time is one of the main objectives of scientists. Researchers are motivated to accelerate their experimental results for several reasons. In many scenarios, research is an incremental task where new experimental designs heavily depend on the results of previous experiments. The time required for these tasks is directly tied to the computational infrastructure involved. While such infrastructure is well-defined in on-premises settings, in cloud environments researchers have access to vast computational resources. In that case, the main constraint is not the availability of resources but rather the budget allocated to utilize them.
In recent years, cloud computing has seen a significant increase in global energy consumption, particularly in the context of artificial intelligence research, which has grown exponentially. It is predicted that by 2025 data centres will consume around 4.5% of global energy and continue to account for 3.5% of global carbon emissions Materwala and Ismail (2021); Chhabra et al. (2022), equivalent to around 100 million tonnes of carbon emissions, a level that is not environmentally sustainable Chhabra et al. (2022).
It is therefore essential to develop efficient schedules that minimise overall energy consumption while ensuring the achievement of other research objectives, such as minimising completion times. However, exploring the interrelationships between these objectives could provide insights to help design heuristics capable of optimising each of them separately, while also considering potential synergies in their simultaneous optimisation.
As mentioned earlier, the inherent complexity of the workflow scheduling problem justifies the use of approximate optimisation methods. Exact methods, while theoretically appealing, are often impractical for large-scale problems due to the computational effort required. Consequently, approximate methods, such as classical heuristics and metaheuristics, have been widely adopted in the literature. A notable example is the Heterogeneous Earliest Finish Time (HEFT) algorithm Topcuoglu et al. (2002), a classical list heuristic used to optimise total completion time in distributed environments. Similarly, green-HEFT Durillo et al. (2014) extends HEFT to prioritise energy efficiency by assigning tasks to resources with the lowest energy consumption.
In addition to heuristics, various metaheuristic techniques have been explored to address the multi-objective nature of workflow scheduling. These include Evolutionary Algorithms Cao et al. (2023), such as the approach by Cao et al., which minimises both cost and energy consumption; Simulated Annealing Yuan et al. (2021), which optimises cost and user QoS; and Ant Colony Optimisation, as in Liu et al. (2023), which optimises energy. Other methods inspired by physical processes, such as the Gravitational Search Algorithm (GSA) Biswas et al. (2019) and the Intelligent Water Drop Algorithm Adhikari and Amgoth (2019), have also demonstrated their effectiveness in optimising objectives like makespan and resource utilisation. Hybrid strategies, such as the combination of the Moth Search Algorithm (MSA) and Differential Evolution (DE) Elaziz et al. (2019), or the integration of heuristic techniques with genetic algorithms Ye et al. (2019), further extend the versatility of metaheuristics. Another example is the HEFT-ACO algorithm Belgacem and Beghdad-Bey (2022), which combines the Heterogeneous Earliest Finish Time heuristic with Ant Colony Optimisation to address both makespan and cost. The ability of these methods to balance conflicting objectives, such as makespan and energy consumption, makes them indispensable for workflow optimisation in complex cloud computing environments.
Several evolutionary algorithms in the literature employ a multi-fitness strategy to enhance solution quality beyond that achieved by conventional single-objective methods. Such strategies either apply distinct fitness functions at different stages of the algorithm Wu et al. (2021) or evaluate different aspects of the same solution, as observed in multi-fitness learning approaches Yates et al. (2020). However, to the best of our knowledge, no existing method integrates multiple decoding functions to optimise the same fitness objective. This gap has been addressed by the cooperative multi-fitness approach Barredo and Puente (2024), which improves the makespan quality of solutions obtained from an evolutionary algorithm. It introduces a new decoding method that incorporates different supporting heuristic fitness functions to guide the standard decoding fitness function towards search-space subregions where higher-quality makespan solutions are likely to be found.
In this work, we address the problem of optimizing the total energy consumption of scientific workflow scheduling in cloud environments. Building upon the previously proposed Multi-Fitness Evolutionary Algorithm from Barredo and Puente (2024) and inspired by the proposed model in García Gómez et al. (2023), we aim to evaluate how this approach improves the quality of schedules generated by a standard mono-objective Genetic Algorithm (GA). To achieve this, we introduce an energy model compatible with the Disk-Network-Computation (DNC) evaluation model Barredo and Puente (2023), enabling an accurate estimation of energy consumption across various components of the cloud infrastructure. Additionally, we propose new heuristic functions designed specifically to optimize total energy consumption, which serve as support functions to the GA’s direct decoding mechanism. Finally, we analyze the makespan of the obtained solutions to investigate potential correlations between energy consumption and execution time. This analysis highlights cloud infrastructure scenarios where such correlations emerge, providing deeper insights into the trade-offs between these critical objectives.
This proposal serves to expand upon the conference paper presented in Barredo and Puente (2024), with the objective of further elaborating and advancing the concepts introduced therein as follows:
  • Extension of the Multi-Fitness Evolutionary Algorithm to address total energy optimization of workflow scheduling in the cloud.
  • Development of an energy model compatible with the DNC framework for evaluating energy consumption.
  • Introduction of new heuristic support functions to improve schedule quality for energy optimization.
  • Analysis of the correlation between makespan and total energy, identifying cloud scenarios where this relationship is significant.
The remainder of this paper is organized as follows: Sect. 2 introduces the Scientific Workflow Scheduling Model, detailing the workflow structure, and extending the DNC model with a comprehensive energy model. Section 3 provides an overview of the Genetic Algorithm approach and heuristic functions for makespan and total energy optimization. Section 4 introduces the Cooperative Multi-Fitness approach and its specialization for total energy optimisation. Section 5 presents the Experimental Study, detailing the evaluation and analysis of the proposed approach. Finally, the conclusions are summarized in Sect. 6.

2 The scientific workflow scheduling model

Scientific workflows are usually represented as a directed acyclic graph (DAG). The DAG, denoted by \(G=(T,A)\), provides a structured representation of the workflow's tasks, dependencies, and communication requirements. The set of nodes \(T=\{t_1, t_2, \ldots, t_n\}\) represents the individual tasks within the scientific workflow. Each node is identified by a label that denotes the computational workload of the corresponding task in terms of Million Floating Point Operations (MFLOP). The arcs or dependency links \(A=\{(t_i,t_j) \mid 1\le i \le n, 1 \le j \le n, i \ne j\}\) define the relationships between tasks in the workflow. Each arc is labelled data(i, j), indicating the size of the data in Megabytes (MB) that needs to be transferred from task \(t_i\) to task \(t_j\). This representation captures the dependencies between tasks, allowing for the identification of critical paths and bottlenecks in the workflow.
The graph includes two dummy tasks: \(t_{entry}\) and \(t_{end}\). These nodes serve as the entry point (\(t_{entry}\)) and exit point (\(t_{end}\)) of the workflow, respectively. As such, they have no computational or communication requirements; they only serve for the purpose of having a unique entry and exit point.
The resource model consists of a cloud service provider that offers an IaaS platform as a set of mixed types of hosts, or VMs, to its clients. Let \(M=\{vm_1, vm_2, \ldots, vm_m\}\) be the set of VMs, each one modelled as a tuple \(<pc, nb, ds, pp, ap>\), where pc represents the processing capacity measured in GFLOP/s (GFLOP per second), while nb and ds denote the network bandwidth and the disk read/write speed, both in MB/s. Additionally, pp and ap denote the passive and active power, measured in Watts.
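For concreteness, the following is a minimal sketch of this model in code; all type and field names are illustrative assumptions, not the authors' implementation.

```java
// Illustrative data model for a workflow G = (T, A) and a VM pool M.
import java.util.List;

record Task(int id, double load) {}             // load: computational cost of the task
record Arc(int from, int to, double dataMB) {}  // data(i, j) in MB
record Vm(double pc, double nb, double ds,      // GFLOP/s, MB/s, MB/s
          double pp, double ap) {}              // passive/active power in Watts
record Workflow(List<Task> tasks, List<Arc> arcs) {}
```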
Given a workflow \(G=(T,A)\) and an IaaS infrastructure as a set of VMs M, a feasible solution \(S=(Hosts,Order)\) of our problem is composed of two parts, Hosts is a mapping from tasks to VMs and Order is a topological order of G. Our target is to find a feasible solution S that minimises the following two objectives:
$$\text{minimize}\; EFT(t_{end})$$
(1)
and
$$\text{minimize}\; TE(S)$$
(2)
where \(EFT(t_{end})\) is the estimated finishing time of the dummy exit task \(t_{end}\), i.e. the makespan, and TE(S) is the total energy consumption incurred by solution S.

2.1 Workflow schedule evaluation model

This work uses the Disk-Network-Computation (DNC) model for optimizing workflow scheduling in cloud computing environments, defined in Barredo and Puente (2023).
Let \(ct_i^k\) represent the computation time for executing task \(t_i\) on virtual machine \(vm_k\). This computation time is calculated as follows:
$$\begin{aligned} ct_i^k = \frac{size(i)}{pc_k} \end{aligned}$$
(3)
where size(i) denotes the computational load of task \(t_i\), quantified in GFLOP, and \(pc_k\) refers to the processing capability of the virtual machine \(vm_k\), measured in GFLOP/s.
The time required for transferring data between tasks \(t_i\) and \(t_j\), assigned to virtual machines \(vm_k\) and \(vm_l\) respectively, is given by:
$$dt_{i,j}^{k,l} = \begin{cases} \dfrac{data(i,j)}{ds_k} & \text{if } k = l \\ \dfrac{data(i,j)}{\min(ds_k,\, nb_k,\, nb_l)} & \text{if } k \ne l \end{cases}$$
(4)
where data(i, j) represents the size of the data to be transferred from \(t_i\) to \(t_j\), \(ds_k\) is the disk speed of \(vm_k\), and \(nb_k\) and \(nb_l\) are the network bandwidths of \(vm_k\) and \(vm_l\), respectively. According to the DNC model, if tasks \(t_i\) and \(t_j\) are hosted on the same VM, data transfer only involves local disk reads; otherwise, \(nb_k\) and \(nb_l\) must also be considered.
The Estimated Finish Time (EFT) of task \(t_i\) on virtual machine \(vm_k\) is determined by adding its Estimated Start Time (EST), the time for input data acquisition, computation time, and the time for output data handling, represented as:
$$\begin{aligned} EFT(i,k) = EST(i, k) + input(i,k) + ct_i^k + output(i,k) \end{aligned}$$
(5)
The term input(i, k) denotes the time to receive all input data for task \(t_i\) on \(vm_k\) from its predecessors, calculated as:
$$\begin{aligned} input(i,k) = \sum _{t_j \in pred(t_i)}dt_{j,i}^{l,k} \end{aligned}$$
(6)
with each predecessor task \(t_j\) executed on its designated machine \(vm_l\). The term output(i, k) refers to the time to write all output data of task \(t_i\) to the local disk of \(vm_k\), defined by:
$$\begin{aligned} output(i,k) = \frac{\sum _{t_j \in succ(t_i)}data(i,j)}{ds_k} \end{aligned}$$
(7)
The estimated start time of task \(t_i\) on \(vm_k\) is expressed as:
$$\begin{aligned} EST(i,k) = avail(i,k,\max _{t_j \in pred(t_i)}(EFT(j,l))) \end{aligned}$$
(8)
where avail(i, k, m) indicates the earliest available time slot on \(vm_k\) to process \(t_i\), with m being the maximum of the estimated finishing times of all predecessor tasks of \(t_i\) on their respective machines.
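The timing equations (3)–(8) can be sketched as follows. The sketch simplifies avail by assuming a VM is free right after its last assigned task (i.e., no insertion policy), so it illustrates the equations rather than the full DNC scheduler; method and parameter names are assumptions.

```java
// Sketch of the DNC timing equations (3)-(8); simplified (no insertion
// policy). Task, Arc, Vm, Workflow are the records sketched in Sect. 2.
import java.util.Map;

class DncTiming {
    static double ct(Task t, Vm vm) {                  // Eq. (3)
        return t.load() / vm.pc();
    }

    static double dt(double dataMB, Vm src, Vm dst) {  // Eq. (4)
        if (src == dst) return dataMB / src.ds();      // same VM: disk read only
        return dataMB / Math.min(src.ds(), Math.min(src.nb(), dst.nb()));
    }

    // Eq. (5): EFT = EST + input + ct + output, where input sums the
    // transfer times from all predecessors (Eq. 6), output writes all
    // outgoing data to the local disk (Eq. 7), and EST (Eq. 8) is here
    // simplified to max(VM release time, predecessors' finish times).
    static double eft(Task t, Vm vm, Workflow wf, double vmFreeAt,
                      Map<Integer, Double> eftOf, Map<Integer, Vm> hostOf) {
        double input = 0, outMB = 0, ready = 0;
        for (Arc a : wf.arcs()) {
            if (a.to() == t.id()) {                    // a predecessor of t
                input += dt(a.dataMB(), hostOf.get(a.from()), vm);
                ready = Math.max(ready, eftOf.get(a.from()));
            }
            if (a.from() == t.id()) outMB += a.dataMB(); // data for successors
        }
        double est = Math.max(vmFreeAt, ready);        // Eq. (8), simplified
        return est + input + ct(t, vm) + outMB / vm.ds();
    }
}
```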

2.2 Total energy model

Once the makespan has been determined, the total energy consumption associated with executing a scientific workflow under a given solution S is computed from two distinct sources: passive and active energy consumption. First, there is a permanent passive power draw from the instant a VM is switched on until it is switched off. The energy required by a \(vm_k \in M\) depends on its passive power consumption, \(P_p(k)\), and is proportional to the time the machine remains on. Assuming that all machines are switched on and off at the same instants (0 and \(EFT(t_{end})\), respectively), the passive energy consumption of a solution S is defined as follows:
$$\begin{aligned} E_p(S) = \sum _{k \in M} P_p(k)EFT(t_{end}) \end{aligned}$$
(9)
Furthermore, processing task \(t_i\) on \(vm_k\) incurs an additional energy cost \(E_a(i,k)\), determined by the active power \(P_a(i,k)\) needed to execute \(t_i\) on \(vm_k\) and proportional to the total time the task occupies the machine, i.e. input acquisition, computation, and output writing:
$$\begin{aligned} E_a(i,k) = P_a(i,k) \times (input(i,k) + ct_i^k + output(i,k)) \end{aligned}$$
(10)
Therefore, the active energy of a solution S will be:
$$\begin{aligned} E_a(S) = \sum _{i \in T,<i,k>\in Hosts}E_a(i,k) \end{aligned}$$
(11)
Following an additive model, the total energy consumption of a solution S is the sum of its passive and active energy consumption:
$$\begin{aligned} TE(S) = E_p(S)+E_a(S) \end{aligned}$$
(12)
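A minimal sketch of Eqs. (9)–(12), assuming the per-task busy times and active powers have already been produced by the DNC evaluation (array and method names are illustrative):

```java
// Sketch of Eqs. (9)-(12). busyTime[i] = input + ct + output of task i
// on its assigned host; activePower[i] = P_a(i, k) for that assignment.
import java.util.List;

class EnergyModel {
    static double passiveEnergy(List<Vm> vms, double makespan) { // Eq. (9)
        return vms.stream().mapToDouble(Vm::pp).sum() * makespan;
    }

    static double activeEnergy(double[] activePower, double[] busyTime) {
        double ea = 0;                                           // Eqs. (10)-(11)
        for (int i = 0; i < busyTime.length; i++)
            ea += activePower[i] * busyTime[i];
        return ea;
    }

    static double totalEnergy(List<Vm> vms, double makespan,
                              double[] activePower, double[] busyTime) {
        return passiveEnergy(vms, makespan)                      // Eq. (12)
             + activeEnergy(activePower, busyTime);
    }
}
```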

3 Overview of the genetic algorithm approach

This section introduces the GA approach developed to optimize the Workflow Scheduling Problem. It describes the core elements of the GA from Barredo and Puente (2023), and the new decoding scheme that evaluates the effectiveness of each solution through multiple fitness metrics.
The algorithm, illustrated in Algorithm 1, follows a generational framework, integrating mechanisms such as random parent selection and tournament-based replacement among parents and offspring, which inherently incorporates elitism. The genetic algorithm is configured to operate with a variable number of heuristic fitness functions (\(l_{fitness}\)) and requires several essential parameters: the population size (\(pop_{size}\)), the maximum number of generations (\(max_{gens}\)), and the crossover and mutation probabilities (\(p_c\) and \(p_m\), respectively).
Algorithm 1: Genetic Algorithm
The encoding operator is built upon task permutations with designated VM allocations Ye et al. (2019); Zhu et al. (2016). A gene is defined as a tuple (i, k), where \(1 \le i \le |T|\) denotes the task index and \(1 \le k \le |M|\) the VM index; a chromosome contains one such gene per task. Task sequences must follow a topological order, ensuring that each task is positioned within the chromosome after all of its predecessors and before any of its successors. Therefore, all initial individuals, as well as those generated through genetic operations, must strictly adhere to the dependency constraints of the tasks.
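As a minimal sketch, the gene could be represented as follows (names are illustrative):

```java
// Sketch of the encoding: a gene is a (task, vm) tuple, and a chromosome
// is a List<Gene> whose task sequence is a topological order of G.
record Gene(int task, int vm) {}
```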
The crossover process manages both task sequencing and VM allocations for the offspring. The \(\texttt{CrossoverOrder}\) approach, as described by Zhu et al. (2016), ensures that task dependencies are preserved by maintaining the relative order of each task pair from at least one parent. This guarantees that no violations of task precedence constraints occur during the crossover operation.
The mutation mechanism modifies the task sequence while preserving dependency constraints. First, a task \(t_i\) is randomly selected and relocated within the segment of feasible positions in the sequence. Next, \(t_i\) is randomly reassigned to one of the available VMs. These steps guarantee that the mutated chromosome remains valid and compliant with the task dependencies.
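A sketch of this dependency-preserving mutation, assuming the Gene record above and an arc list where each int[] holds a (from, to) pair:

```java
// Sketch of the mutation operator: remove a random gene, compute the
// feasible insertion window (after its last predecessor, before its
// first successor in the sequence), reinsert it at a random position
// in that window, and reassign it a random VM.
import java.util.List;
import java.util.Random;

class Mutation {
    static void mutate(List<Gene> chrom, List<int[]> arcs, int nVms, Random rnd) {
        Gene g = chrom.remove(rnd.nextInt(chrom.size()));
        int lo = 0, hi = chrom.size();                    // allowed insert indices
        for (int i = 0; i < chrom.size(); i++) {
            int t = chrom.get(i).task();
            for (int[] a : arcs) {
                if (a[0] == t && a[1] == g.task()) lo = Math.max(lo, i + 1);
                if (a[0] == g.task() && a[1] == t) hi = Math.min(hi, i);
            }
        }
        int pos = lo + rnd.nextInt(hi - lo + 1);          // lo <= pos <= hi
        chrom.add(pos, new Gene(g.task(), rnd.nextInt(nVms)));
    }
}
```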
The initial population, consisting of \(pop_{size}\) members, is generated by building task sequences that adhere to a topological order, ensuring compliance with task dependency constraints. For each task, a valid VM assignment is randomly selected from the available options, guaranteeing the feasibility of the resulting chromosome.
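The initialisation can be sketched as a randomised topological sort (Kahn-style), which yields feasible chromosomes by construction; again, the names are illustrative assumptions:

```java
// Sketch of feasible random initialisation: repeatedly pick any task
// whose predecessors are all already placed and assign it a random VM.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class InitialPopulation {
    static List<Gene> randomChromosome(int nTasks, int nVms,
                                       List<int[]> arcs, Random rnd) {
        int[] pending = new int[nTasks];                  // unplaced predecessors
        for (int[] a : arcs) pending[a[1]]++;
        List<Integer> ready = new ArrayList<>();
        for (int t = 0; t < nTasks; t++) if (pending[t] == 0) ready.add(t);

        List<Gene> chrom = new ArrayList<>();
        while (!ready.isEmpty()) {
            int t = ready.remove(rnd.nextInt(ready.size()));
            chrom.add(new Gene(t, rnd.nextInt(nVms)));
            for (int[] a : arcs)                          // release successors
                if (a[0] == t && --pending[a[1]] == 0) ready.add(a[1]);
        }
        return chrom;                                     // topological by construction
    }
}
```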
New decoding schema. The construction of a schedule from a chromosome using the standard fitness function (StdFit) follows the DNC evaluation model for decoding. Genes within a chromosome are processed sequentially from left to right. For each gene (i, k), task \(t_i\) is scheduled on \(vm_k\) at the earliest feasible starting time, using an insertion policy that respects both the completion times of its predecessors and the task's processing requirements. The makespan of the resulting schedule is the latest completion time over all tasks. This standard decoding function (StdFit) is assisted by the other supporting fitness functions (\(l_{fitness}\)) in the cooperative multi-fitness evaluation described in the next section.

4 Cooperative multi-fitness energy function evaluation

This work introduces the novel concept of using a set of different schedule-building functions to solve the energy minimization problem. Various heuristics can be applied, each of which may perform better for different workflows, as they explore distinct subregions of the solution space. While one heuristic may converge to a local minimum, another may uncover a different, potentially better one. The multi-fitness approach combines chromosome information with different heuristics to generate alternative schedules and improve the overall solution quality.
To avoid over-reliance on heuristics, a standard fitness function is employed to directly decode the chromosome. This ensures the validity of the solution while allowing genetic operators (crossover and mutation) to function as intended. By preserving the StdFit function, the algorithm maintains compatibility with these operators. Without this standard decoding function, however, any minor adjustments made by crossover or mutation could be undone by the heuristic’s recoding process.

4.1 Cooperative multi-fitness algorithm

For each newly generated individual, the evolutionary algorithm uses the cooperative multi-fitness procedure (shown in Algorithm 2) to generate and evaluate the corresponding schedule. This procedure is designed to evaluate a given solution using both a standard fitness function and a set of supporting heuristic fitness functions; a code sketch is given after the step list below. The algorithm runs through the following steps:
Algorithm 2: Cooperative-Multi-Fitness function
1. Initialization: The algorithm begins by initializing a variable, minValue, which stores the best (i.e., minimal total energy) schedule evaluation found so far. It is initialized by decoding the solution with the standard fitness function, StdFit. This decoding interprets the chromosome according to the standard schema, giving a baseline schedule evaluation.
2. Fitness function evaluation: Next, the algorithm iterates over each supporting fitness function in the list \(l_{fitness}\). For each heuristic fitness function, the solution is decoded and evaluated, and the result is compared with the current minValue. If the evaluation is better (i.e., smaller), minValue is updated to reflect the new best result.
3. Lamarckism: After all fitness functions have been evaluated, the best schedule found (the one corresponding to the smallest minValue) is selected and re-encoded using the standard fitness function's decoding schema, following a Lamarckian approach. This ensures that the schedule is mapped back into a chromosome compatible with the genetic algorithm's operators.
4. Return the best schedule: Finally, the algorithm returns the schedule corresponding to the best evaluation found, along with its evaluation value (minValue). This schedule is the one most likely to yield the best makespan or total-energy optimization, as guided by the cooperative use of the standard and supporting fitness functions.
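The four steps above can be condensed into the following sketch. The Decoder and Schedule interfaces are assumptions of this sketch, standing in for the paper's decoding machinery; they are not the authors' API.

```java
// Sketch of Algorithm 2: cooperative multi-fitness evaluation.
import java.util.List;

interface Decoder {                        // a "fitness function"
    Schedule decode(List<Gene> chromosome);
}

interface Schedule {
    double totalEnergy();                  // TE(S), the value being minimised
    List<Gene> toChromosome();             // standard-schema re-encoding
}

class CooperativeMultiFitness {
    static double evaluate(List<Gene> chromosome, Decoder stdFit,
                           List<Decoder> supportFns) {
        Schedule best = stdFit.decode(chromosome);     // 1. baseline via StdFit
        double minValue = best.totalEnergy();
        for (Decoder h : supportFns) {                 // 2. supporting decoders
            Schedule s = h.decode(chromosome);
            if (s.totalEnergy() < minValue) {
                minValue = s.totalEnergy();
                best = s;
            }
        }
        chromosome.clear();                            // 3. Lamarckian write-back
        chromosome.addAll(best.toChromosome());
        return minValue;                               // 4. best evaluation
    }
}
```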

4.2 Standard and supporting functions

The following functions are defined to address total energy minimization, with the standard function serving to maintain genetic diversity in the population while the heuristic functions specifically target energy reduction.
  • StdFit: A straightforward decoding of the solution: it reads the solution as-is, strictly following both the order and the VM mapping. Its main strength is the ability to reuse solutions from previous generations and to be modified directly by the different operators. Note that this is the same GA fitness function used as the basis for makespan optimisation in Barredo and Puente (2023).
  • \(H_1^{TE}\): This heuristic support function takes the task order defined in the chromosome. For each task, it calculates the total energy consumption and the EFT of processing it on every possible VM, and sorts the candidate VMs with respect to these values. The function follows a soft constraint: the resulting makespan after choosing a VM should not exceed the current makespan. To achieve this, it searches for the first task-to-VM mapping whose makespan does not surpass the current makespan. By doing so, it avoids increasing the runtime of the workflow, which in turn avoids increasing the passive energy consumption, as the machines do not need to stay on longer than necessary. If no VM mapping satisfies this criterion, the heuristic selects the first item of the sorted list, i.e. the one that consumes the least energy; in case of a tie in total energy consumption, it selects the one with the smaller makespan (a sketch of this selection rule is given after this list).
  • \(H_2^{TE}\): This support function stems from the observation that some tasks are critical, or at least may increase the makespan if executed on a slower machine; if the makespan increases, so does the total passive energy consumption. The heuristic therefore creates a list of VMs ordered by energy consumption. The VM with the lowest energy consumption is selected, unless the time required to execute the task on that VM exceeds the average execution time of the tasks in the workflow. This average is computed a priori from two components: (1) the average computation time of each task, based on the execution durations on the available CPU models, and (2) the average communication time, estimated using the average disk speed.
As can be seen from the definitions of the new heuristic functions, makespan is employed as an optimisation criterion. Although this choice may initially seem counter-intuitive, since makespan and energy are typically regarded as orthogonal objectives, it will be shown in the next section that minimising makespan within our energy model is effective for reducing passive energy consumption.
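As an illustration, the soft-constraint selection rule of \(H_1^{TE}\) for a single task might be sketched as follows, assuming the per-VM energy and EFT values for the task have already been computed (the Candidate record and method names are assumptions of this sketch):

```java
// Sketch of the H1^TE per-task VM selection. Candidate holds the
// precomputed total energy and EFT of running the task on one VM.
import java.util.Comparator;
import java.util.List;

class H1Selection {
    record Candidate(int vm, double energy, double eft) {}

    static int pickVm(List<Candidate> cands, double currentMakespan) {
        // Sort by total energy, breaking ties by EFT (makespan impact).
        cands.sort(Comparator.comparingDouble(Candidate::energy)
                             .thenComparingDouble(Candidate::eft));
        for (Candidate c : cands)                 // soft constraint: do not
            if (c.eft() <= currentMakespan)       // enlarge the makespan
                return c.vm();
        return cands.get(0).vm();                 // fallback: cheapest option
    }
}
```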
In order to facilitate a meaningful comparison in the experimental Sect. 5.4, we recall the heuristic functions proposed in Barredo and Puente (2024) for makespan minimisation. These functions are necessary for evaluating the performance of our energy optimisation approach in contrast to makespan optimisation:
RankFit (\(H_1^{MKP}\)): This fitness function is inspired by the first phase of the HEFT algorithm Topcuoglu et al. (2002). It generates a task ranking based on HEFT's upward rank and uses it as the task order instead of the one from the chromosome. The only information taken from the chromosome is the VM mapping, which is combined with the new order.
HeftFit (\(H_2^{MKP}\)): This fitness function uses the second phase of the HEFT algorithm. It follows the chromosome order but ignores the mapping; instead, it calculates the EFT of the task on every VM and chooses the one with the lowest EFT.
The same standard function StdFit is used in the optimisation of both total energy and makespan objectives.

5 Experimental study

In this section, we evaluate the contribution of different support functions to the minimisation of total energy, in addition to the standard decoding function. To achieve this, we start with the GA proposed in Barredo and Puente (2023) and incorporate the supporting functions \(H_1^{TE}\) and \(H_2^{TE}\) into the algorithm, following the multi-fitness approach described in the current study. The behaviour of these approaches is analysed under different cloud infrastructure scenarios. Finally, the correlation between total energy minimisation and makespan optimisation is evaluated by comparing the solutions obtained by the multi-fitness algorithm optimising each of these two objectives.

5.1 Benchmark instances

The experimental instances used in this study were derived from seven different scientific workflows available in the WFCommons repository Coleman et al. (2022). These workflows were generated using the Pegasus workflow management system Deelman et al. (2019). The selected workflows exhibit varying characteristics and can be broadly categorized into two main types: compute-intensive and data-intensive. Within the data-intensive category, a specific subtype emerges that is characterised by a high degree of task dependencies, often referred to as 'collision-prone'. These workflows can lead to resource bottlenecks, such as saturation of disk and network interfaces. An illustrative example of this subtype is shown in Fig. 1.
Fig. 1 Diagram of the Montage scientific workflow (source: wfcommons.org)
All workflow problems used in this study are introduced below with a brief description:
  • 1000Genome: This workflow uses data from the 1000 Genomes Project and finds mutational overlaps in order to provide data for the evaluation of health conditions caused by mutations.
  • Cycles: An agroecosystem model used to simulate the perturbations of biogeochemical processes caused by different agricultural practices.
  • Epigenomics: This workflow is related to genome sequencing operations.
  • Montage: It consists of the reprojection and background correction of Flexible Image Transport System (FITS) telescope images to compose a mosaic.
  • Seismology: This workflow represents the process of taking measurements from multiple seismic stations and cross-correlating their acceleration signals.
  • SoyKB: A genomics pipeline that re-sequences soybean germplasm lines for trait analysis.
  • SRASearch: This workflow searches the Sequence Read Archive (SRA) and transforms the data to obtain aligned sequencing reads.
To evaluate the impact of workflow size for each problem, we selected four instances per problem, covering four approximate sizes where available in the repository: extra-small (50-100 tasks), small (100-200), medium (200-500), and large (500-1000).
Table 1 presents all the instances employed in the experimentation, along with the number of tasks and the communication to computation ratio (CCR). This metric, adapted to the DNC model in Barredo and Puente (2023), indicates the proportion of total execution time spent on communication relative to computation.
$$\begin{aligned} CCR = \sum _{i \in T}{\frac{\left( \overline{input_{i}} + \overline{output_{i}}\right) }{\overline{ct_i}}} \end{aligned}$$
(13)
where \(\overline{input_{i}}\) and \(\overline{output_{i}}\) represent the average communication times between task \(t_i\) and its predecessors and successors, respectively, while \(\overline{ct_i}\) denotes the average execution time of task \(t_i\).
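A direct transcription of Eq. (13), assuming the per-task averages have been precomputed from the instance and the VM pool (names are illustrative):

```java
// Sketch of the CCR metric, Eq. (13): per-task average communication
// time (input + output) over average computation time, summed over all
// tasks of the workflow.
class Ccr {
    static double ccr(double[] avgInput, double[] avgOutput, double[] avgCt) {
        double sum = 0;
        for (int i = 0; i < avgCt.length; i++)
            sum += (avgInput[i] + avgOutput[i]) / avgCt[i];
        return sum;
    }
}
```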
Table 1 Analysis of CCR for every instance grouped by problem

| Problem | Type | Instance | Tasks | CCR |
|---|---|---|---|---|
| Seismology | data | seismology-chameleon-100p-001 | 101 | 0.02% |
| | | seismology-chameleon-500p-001 | 501 | 0.02% |
| | | seismology-chameleon-700p-001 | 701 | 0.02% |
| | | seismology-chameleon-1000p-001 | 1001 | 0.02% |
| Cycles | compute | cycles-chameleon-1l-1c-9p-001 | 67 | 0.06% |
| | | cycles-chameleon-2l-1c-12p-001 | 437 | 0.03% |
| | | cycles-chameleon-2l-1c-9p-001 | 133 | 0.04% |
| | | cycles-chameleon-5l-1c-12p-001 | 1091 | 0.06% |
| Epigenomics | data | epigenomics-chameleon-hep-1seq-100k-001 | 41 | 1.55% |
| | | epigenomics-chameleon-ilmn-1seq-100k-001 | 125 | 1.13% |
| | | epigenomics-chameleon-hep-6seq-100k-001 | 507 | 0.84% |
| | | epigenomics-chameleon-ilmn-6seq-100k-001 | 863 | 1.74% |
| SRASearch | data | srasearch-chameleon-10a-005 | 22 | 2.44% |
| | | srasearch-chameleon-20a-003 | 42 | 0.86% |
| | | srasearch-chameleon-40a-003 | 84 | 1.41% |
| | | srasearch-chameleon-50a-003 | 104 | 1.19% |
| Montage | compute | montage-chameleon-2mass-005d-001 | 58 | 2.27% |
| | | montage-chameleon-2mass-01d-001 | 103 | 3.11% |
| | | montage-chameleon-dss-10d-001 | 472 | 2.15% |
| | | montage-chameleon-dss-125d-001 | 1066 | 5.09% |
| SoyKB | data | soykb-chameleon-10fastq-10ch-001 | 96 | 17.24% |
| | | soykb-chameleon-10fastq-20ch-001 | 156 | 14.46% |
| | | soykb-chameleon-30fastq-10ch-001 | 256 | 15.67% |
| | | soykb-chameleon-40fastq-20ch-001 | 546 | 13.31% |
| 1000genome | data | 1000genome-chameleon-2ch-250k-001 | 82 | 25.53% |
| | | 1000genome-chameleon-4ch-250k-001 | 164 | 19.09% |
| | | 1000genome-chameleon-12ch-250k-001 | 492 | 24.44% |
| | | 1000genome-chameleon-18ch-250k-001 | 738 | 24.14% |

5.2 Benchmark platform

The experiments are conducted by running the defined workflows under different conditions to analyse the impact of the number of hosts and the effect of varying processing capacities and their associated power consumption. The infrastructure consists of two types of hosts, whose detailed characteristics are summarised in Table 2. This table describes each host by its CPU model, processing capacity (in GFLOP/s) and power consumption (in Watts) under identical conditions: both in standby mode (passive power) and the additional power consumed when running tasks (active power). Both configurations share the same local disk model, with an I/O speed of 540 MB/s, and a network bandwidth of 125 MB/s. The Intel configuration delivers roughly 1.4 times the computing performance of the AMD configuration, but draws about 4.8 times more active power (330.8 W vs 68.5 W). Clearly, in this configuration, the higher performance of the faster machine comes at the cost of much higher passive and active power.
The scenarios tested use 2, 4, 8, and 16 hosts, with half of the hosts of each type. The second host type (the Intel model) is objectively better, with a higher GFLOP/s throughput, but it also has a significantly higher power consumption. This setup is designed to test how the different heuristics behave and to determine whether they strike a balance or tend to use only one type of host.
Table 2 Characteristics of the hosts used in the benchmark platform. Units: throughput in GFLOP/s, active power and passive power in Watts. (Power specifications derived from the SPECpower benchmark SPEC (2024))

| Processor | Throughput | Active Power | Passive Power |
|---|---|---|---|
| AMD EPYC 4584PX | 67.2 | 68.5 | 43.5 |
| Intel Xeon Platinum 8471N | 93.6 | 330.8 | 88.2 |
Each experiment is repeated 10 times, with the crossover and mutation probabilities set to 1 and 0.1, respectively. The GA uses an initial population of 100 individuals and runs for 1000 generations.
The GA is developed in Java using the JMetal framework Nebro (2021), which provides several implementations of both single-objective and multi-objective algorithms and a full set of tools for both experimentation and analysis of the results. All experiments were run on a Linux computer with a 12th Gen Intel(R) Core(TM) i9-12900 at 2.40 GHz and 32 GB of RAM. The full set of results is available at the following URL: https://github.com/iScOp-uniovi/Paper_NACO_Barredo_2025.

5.3 Efficiency of the cooperative multi-fitness approach in energy scenario

In this section, we evaluate the efficiency of the proposed supporting functions using the new multi-fitness decoding model to optimize the total energy consumption of the workflow. Rather than directly comparing energy values, which can vary significantly depending on the size and complexity of each workflow, we introduce the energy percentage error (EPE) as a performance metric:
$$\begin{aligned} EPE = \frac{energy - best^{TE}}{best^{TE}} \end{aligned}$$
(14)
The term energy refers to the total energy consumption of the solution being evaluated, and \(best^{TE}\) denotes the lowest energy value found across all experimental runs for the same (instance, number of hosts) pair. It is used as a heuristic reference for evaluating the performance of the proposed supporting fitness functions.
The EPE value serves as a comparative measure, with values closer to zero reflecting solutions that are closer to the best solution found in this experimentation for a given instance and scenario.

5.3.1 Average EPE results across configurations and instances

To evaluate the efficiency of the proposed cooperative multi-fitness approach, we conducted an extensive experimental study across various host configurations and workflow problem instances. The experiments include a range of scenarios defined by different combinations of infrastructure hosts and workflow characteristics. For each scenario, the GA was tested with four evaluation configurations: the standard fitness function, both individual energy heuristics, and the proposed multi-fitness approach that integrates the former. The results are compared using the EPE metric, which provides a normalized measure of the solution quality relative to the best solution found. This analysis aims to understand the contributions of each evaluation function to the optimization of total energy and to assess the robustness of the multi-fitness approach across diverse conditions.
Tables 3, 4, 5 and 6 summarize the EPE values achieved by the GA under different scenarios, from 2 to 16 hosts, and different evaluation functions: the standard fitness function (StdFit), heuristic \(H_1^{TE}\), heuristic \(H_2^{TE}\), and the proposed multi-fitness approach \(MF_{H_1,H_2}^{TE}\). Each row represents a specific workflow instance, while each column corresponds to an evaluation method.
Table 3 EPE over the different instances for 2 hosts (the best value for each instance is the row minimum)

| Problem | Tasks | StdFit (%) | \(H_1^{TE}\) (%) | \(H_2^{TE}\) (%) | \(MF_{H_1,H_2}^{TE}\) (%) |
|---|---|---|---|---|---|
| 1000genome | 82 | 16 | 5 | 14 | 5 |
| | 164 | 10 | 4 | 9 | 4 |
| | 492 | 5 | 1 | 5 | 1 |
| | 738 | 10 | 4 | 9 | 4 |
| Cycles | 67 | 26 | 14 | 6 | 7 |
| | 133 | 13 | 3 | 4 | 3 |
| | 437 | 6 | 2 | 2 | 2 |
| | 1091 | 5 | 2 | 2 | 1 |
| Epigenomics | 41 | 55 | 23 | 16 | 20 |
| | 507 | 11 | 5 | 10 | 5 |
| | 125 | 22 | 10 | 21 | 10 |
| | 863 | 5 | 2 | 4 | 2 |
| Montage | 103 | 35 | 12 | 11 | 11 |
| | 58 | 46 | 13 | 13 | 11 |
| | 472 | 24 | 10 | 10 | 9 |
| | 1066 | 14 | 5 | 3 | 3 |
| Seismology | 101 | 22 | 6 | 8 | 6 |
| | 501 | 10 | 3 | 4 | 3 |
| | 701 | 6 | 1 | 3 | 1 |
| | 1001 | 6 | 2 | 3 | 1 |
| Soykb | 96 | 21 | 12 | 4 | 4 |
| | 156 | 17 | 12 | 2 | 2 |
| | 256 | 13 | 9 | 3 | 3 |
| | 546 | 14 | 12 | 1 | 1 |
| SRASearch | 22 | 53 | 15 | 17 | 14 |
| | 42 | 39 | 10 | 11 | 9 |
| | 84 | 22 | 5 | 5 | 4 |
| | 104 | 24 | 7 | 8 | 6 |
Table 4 EPE over the different instances for 4 hosts (the best value for each instance is the row minimum)

| Problem | Tasks | StdFit (%) | \(H_1^{TE}\) (%) | \(H_2^{TE}\) (%) | \(MF_{H_1,H_2}^{TE}\) (%) |
|---|---|---|---|---|---|
| 1000genome | 82 | 10 | 4 | 22 | 3 |
| | 164 | 14 | 6 | 13 | 6 |
| | 492 | 5 | 1 | 5 | 1 |
| | 738 | 4 | 1 | 4 | 1 |
| Cycles | 67 | 23 | 17 | 5 | 5 |
| | 133 | 15 | 8 | 6 | 6 |
| | 437 | 9 | 5 | 3 | 3 |
| | 1091 | 10 | 6 | 4 | 4 |
| Epigenomics | 41 | 47 | 17 | 20 | 19 |
| | 507 | 10 | 3 | 7 | 3 |
| | 125 | 14 | 5 | 12 | 5 |
| | 863 | 5 | 1 | 3 | 1 |
| Montage | 103 | 35 | 12 | 11 | 11 |
| | 58 | 57 | 22 | 16 | 16 |
| | 472 | 19 | 6 | 6 | 5 |
| | 1066 | 19 | 7 | 5 | 5 |
| Seismology | 101 | 20 | 8 | 7 | 6 |
| | 501 | 9 | 2 | 3 | 2 |
| | 701 | 6 | 2 | 3 | 2 |
| | 1001 | 6 | 2 | 3 | 1 |
| Soykb | 96 | 15 | 12 | 2 | 2 |
| | 156 | 18 | 17 | 2 | 2 |
| | 256 | 14 | 12 | 2 | 2 |
| | 546 | 17 | 17 | 1 | 1 |
| SRASearch | 22 | 42 | 10 | 9 | 8 |
| | 42 | 44 | 14 | 10 | 9 |
| | 84 | 24 | 6 | 6 | 5 |
| | 104 | 19 | 6 | 5 | 4 |
Table 5 EPE over the different instances for 8 hosts (the best value for each instance is the row minimum)

| Problem | Tasks | StdFit (%) | \(H_1^{TE}\) (%) | \(H_2^{TE}\) (%) | \(MF_{H_1,H_2}^{TE}\) (%) |
|---|---|---|---|---|---|
| 1000genome | 82 | 9 | 3 | 63 | 3 |
| | 164 | 7 | 3 | 15 | 2 |
| | 492 | 6 | 2 | 7 | 2 |
| | 738 | 5 | 1 | 7 | 1 |
| Cycles | 67 | 22 | 23 | 6 | 6 |
| | 133 | 19 | 14 | 9 | 9 |
| | 437 | 8 | 4 | 1 | 1 |
| | 1091 | 8 | 4 | 1 | 1 |
| Epigenomics | 41 | 27 | 3 | 7 | 7 |
| | 507 | 16 | 6 | 11 | 7 |
| | 125 | 24 | 11 | 21 | 12 |
| | 863 | 8 | 3 | 6 | 3 |
| Montage | 103 | 22 | 7 | 5 | 5 |
| | 58 | 56 | 22 | 16 | 16 |
| | 472 | 28 | 8 | 11 | 8 |
| | 1066 | 18 | 5 | 4 | 3 |
| Seismology | 101 | 20 | 8 | 7 | 6 |
| | 501 | 7 | 2 | 2 | 2 |
| | 701 | 7 | 3 | 3 | 2 |
| | 1001 | 6 | 2 | 2 | 2 |
| Soykb | 96 | 18 | 18 | 2 | 3 |
| | 156 | 21 | 22 | 1 | 1 |
| | 256 | 16 | 16 | 1 | 1 |
| | 546 | 22 | 23 | 1 | 1 |
| SRASearch | 22 | 49 | 3 | 21 | 14 |
| | 42 | 51 | 7 | 19 | 12 |
| | 84 | 30 | 13 | 11 | 11 |
| | 104 | 21 | 7 | 6 | 6 |
Table 6 EPE over the different instances for 16 hosts (the best value for each instance is the row minimum)

| Problem | Tasks | StdFit (%) | \(H_1^{TE}\) (%) | \(H_2^{TE}\) (%) | \(MF_{H_1,H_2}^{TE}\) (%) |
|---|---|---|---|---|---|
| 1000genome | 82 | 14 | 4 | 140 | 3 |
| | 164 | 10 | 3 | 37 | 3 |
| | 492 | 5 | 3 | 16 | 2 |
| | 738 | 4 | 1 | 9 | 1 |
| Cycles | 67 | 17 | 14 | 5 | 4 |
| | 133 | 10 | 6 | 1 | 1 |
| | 437 | 13 | 6 | 2 | 3 |
| | 1091 | 10 | 6 | 1 | 1 |
| Epigenomics | 41 | 20 | 4 | 4 | 5 |
| | 507 | 24 | 4 | 13 | 6 |
| | 125 | 32 | 5 | 35 | 11 |
| | 863 | 17 | 2 | 14 | 12 |
| Montage | 103 | 29 | 12 | 9 | 9 |
| | 58 | 30 | 7 | 12 | 5 |
| | 472 | 26 | 7 | 11 | 7 |
| | 1066 | 21 | 3 | 3 | 2 |
| Seismology | 101 | 21 | 9 | 13 | 4 |
| | 501 | 7 | 2 | 2 | 1 |
| | 701 | 7 | 2 | 4 | 2 |
| | 1001 | 5 | 2 | 2 | 2 |
| Soykb | 96 | 14 | 19 | 3 | 2 |
| | 156 | 18 | 28 | 3 | 3 |
| | 256 | 13 | 16 | 1 | 1 |
| | 546 | 16 | 27 | 1 | 1 |
| SRASearch | 22 | 29 | 1 | 15 | 12 |
| | 42 | 27 | 7 | 13 | 8 |
| | 84 | 27 | 6 | 12 | 8 |
| | 104 | 29 | 4 | 14 | 9 |
The results indicate that both heuristic functions \(H_1^{TE}\) and \(H_2^{TE}\) outperform the standard fitness function StdFit in terms of energy optimisation. Overall, there are no significant differences between the performance of the two heuristics. However, \(H_1^{TE}\) shows better results on the 1000genome, Epigenomics and Seismology problems, while \(H_2^{TE}\) shows superior performance on the Cycles and Soykb problems; on Montage and SRASearch the two show no significant differences. This variation in behaviour is probably due to the specific organisation of the DAG tasks in each problem.
Although the multi-fitness approach \(MF_{H_1,H_2}^{TE}\) does not always achieve the absolute lowest EPE value, it consistently combines the best results of the individual heuristics and achieves EPE values that are equal to or lower than those of the individual heuristics in most cases.
A statistical analysis using the non-parametric Friedman test for paired samples (at the standard 0.05 significance level) signals significant differences among the studied methods (p-value \(< 2.2\textrm{e}{-16}\)). A pairwise Wilcoxon rank-sum test with Bonferroni correction as post-hoc analysis reveals that the multi-fitness version is statistically significantly better than both heuristic support functions (p-values \(< 1.5\textrm{e}{-08}\) and \(< 5.4\textrm{e}{-13}\), respectively), and that all three approaches (multi-fitness and both heuristics) outperform the StdFit results. The statistical results are summarised in Table 7. This demonstrates the advantage of using a multi-fitness approach to optimise total energy by exploiting the strengths of multiple supporting functions.
Table 7 Summary of the statistical study comparing the total energy values of the standard, heuristic and multi-fitness fitness functions. The symbol (\(\checkmark\)) means that the row method is significantly better than the corresponding column method

| Method | StdFit | \(H_1^{TE}\) | \(H_2^{TE}\) | \(MF_{H_1,H_2}^{TE}\) |
|---|---|---|---|---|
| StdFit | - | | | |
| \(H_1^{TE}\) | \(\checkmark\) | - | | |
| \(H_2^{TE}\) | \(\checkmark\) | | - | |
| \(MF_{H_1,H_2}^{TE}\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | - |

5.4 Bidirectional correlation of makespan and energy objectives in multi-fitness algorithms

To better assess the performance of \(MF_{H_1,H_2}^{TE}\) in terms of makespan, we use the makespan percentage error (MPE) as a metric:
$$\begin{aligned} MPE = \frac{makespan - best^{MKP}}{best^{MKP}} \end{aligned}$$
(15)
Here, makespan refers to the completion time of the solution being evaluated, and \(best^{MKP}\) denotes the lowest makespan value found across all experimental runs and algorithms for the same (instance, number of hosts) pair. This value is used as a heuristic reference to evaluate how close each solution is to the best-performing one under the tested conditions.
To put this performance in context, we compare it with \(MF_{H_1,H_2}^{MKP}\), the multi-fitness GA variant that includes the StdFit function and both makespan heuristics (\(H_1^{MKP}\) and \(H_2^{MKP}\)). Table 8 shows the MPE and EPE values together and reveals an unexpected, asymmetric behaviour between the two multi-fitness approaches.
Specifically, while \(MF_{H_1,H_2}^{TE}\) (optimising energy) consistently achieves low makespan values in addition to its primary goal of energy reduction, \(MF_{H_1,H_2}^{MKP}\) (optimising makespan) performs poorly in terms of energy, with significant increases in energy consumption. This asymmetry is evident across most cases, with \(MF_{H_1,H_2}^{TE}\) often approaching the best-known makespan solutions, except for the Epigenomics problem. In the context of the \(MF_{H_1,H_2}^{MKP}\) algorithm, the expected behaviour can be observed. By focusing solely on the makespan, the total energy consumption is neglected and can significantly increase. On average, this can lead to an energy usage increment of up to 41%.
This behaviour can be attributed to three key factors.
First, the relationship between the active and passive energy use of a VM and the duration of its tasks plays a significant role: faster machines consume more energy, both active and passive, than slower machines. Second, the energy model assumes that all machines are turned off after completing the last task, which directly links passive energy consumption to the total completion time (makespan). Finally, total energy optimisation is inherently complex, and our approach incorporates two heuristic functions specifically designed for this purpose. Notably, both heuristics leverage makespan as a guiding factor when generating energy-efficient solutions, which further explains the strong performance of \(MF_{H_1,H_2}^{TE}\) in minimising makespan alongside energy consumption.
Table 8 Average energy and makespan error metrics (EPE and MPE, respectively) for each problem family, reported for both multi-fitness GA variants

| Problem | \(MF_{H_1,H_2}^{TE}\) EPE | \(MF_{H_1,H_2}^{TE}\) MPE | \(MF_{H_1,H_2}^{MKP}\) MPE | \(MF_{H_1,H_2}^{MKP}\) EPE |
|---|---|---|---|---|
| 1000genome | 3% | 4% | 1% | 10% |
| Cycles | 4% | 3% | 2% | 15% |
| Epigenomics | 8% | 14% | 0% | 23% |
| Montage | 8% | 3% | 0% | 36% |
| Seismology | 3% | 1% | 0% | 12% |
| Soykb | 2% | 1% | 0% | 24% |
| Srasearch | 9% | 3% | 1% | 39% |

6 Conclusion

In this paper we have proposed a novel multi-fitness approach to minimise the total energy consumption of processing scientific workflows in cloud computing. We have built on our previous work, whose main focus was makespan minimisation. The proposed idea is to combine multiple heuristic fitness functions with a standard decoding function in order to obtain better solutions through an evolutionary algorithm.
This work also proposes two energy heuristics that act as the support functions of the multi-fitness approach. The first focuses on reducing total energy consumption while minimising the impact on the makespan. The second takes a different approach: it prioritizes fast VMs for tasks with large computation and communication loads, while using energy-efficient VMs for the remaining tasks.
Another finding is the close relationship between makespan and energy. Since they are usually regarded as opposing objectives, one would assume that decreasing one increases the other. This is not the case here, mainly due to three key factors: the relationship between the throughput and energy-consumption ratios of the different servers considered; the energy model, in which virtual machines keep consuming passive energy until the makespan is reached; and the dependency of the energy heuristics on makespan, which is treated as a soft constraint.
Experimental results corroborate this analysis, showing that the algorithm optimising energy consistently achieves solutions comparable to those of a makespan-focused algorithm. However, the reverse does not hold; prioritising makespan reduction can lead to scenarios where selecting specific hosts yields marginal improvements in completion time but significantly increases energy consumption, depending on the power characteristics of the available physical machine models.
The experimental results indicate the effectiveness of the proposed algorithm. In the vast majority of the analysed instances, the results are similar to, if not better than, those of both individual supporting heuristics. This supports the claim that a multi-fitness approach leads to better and more consistent results. The primary benefit of the cooperative multi-fitness approach is its high degree of reusability, which stems from its independence from any specific number or set of support functions. Furthermore, multi-fitness is not constrained to a particular problem or optimisation algorithm. Its only requirement is the availability of fitness support functions to be exploited in conjunction with a predefined standard function, all operating on the same solution representation scheme for the problem under consideration.

Acknowledgements

This research has been supported by the Spanish Government under research Grants TED2021-131938B-I00 and PID2022-141746OB-I00, and by the Principality of Asturias under research Grant GRU-GIC-24-018.

Declarations

Conflict of interest

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://​creativecommons.​org/​licenses/​by/​4.​0/​.
