Introduction
Related work
Deadline-aware workflow scheduling
Budget-aware workflow scheduling
Multi-objective workflow scheduling
Problem statement
Workflow model
Resource model
Definitions
Task execution time
Communication time
Earliest start time (EST) and earliest finish time (EFT)
Actual start time (AST) and actual finish time (AFT)
Critical path and critical tasks
Linear graph and nonlinear graph
Problem formulation
Proposed method
Workflow division algorithm
Linear graph scheduling algorithm
Initialization phase
Internal combination phase
-
Definition: G1 and G2 are linear graphs and D1 and D2 are their deadlines, respectively. These linear graphs are uncombinable if the total execution time of their tasks is greater than both D1 and D2.
-
Definition: Linear graphs, G1 and G2, are not time-overlapped graphs if:
External combination phase
Evaluation
Experimental setting and evaluation criteria
Type | ECU | Memory (GB) | Cost($)(per hour) |
---|---|---|---|
m3.medium | 3 | 3.75 | 0.067 |
m4.large | 6.5 | 8 | 0.126 |
m3.xlarge | 13 | 15 | 0.266 |
m4.2xlarge | 26 | 32 | 0.504 |
m4.4xlarge | 53.5 | 64 | 1.008 |
Scientific Workflow | Small | Medium | Large | Extra Large |
---|---|---|---|---|
CyberShake | 30 | 50 | 100 | 500 |
Epigenomics | 24 | 46 | 100 | 500 |
LIGO | 30 | 50 | 100 | 500 |
Motif | 30 | 50 | 100 | – |
-
Normalized cost: The overall workflow execution cost obtained by each of the algorithms is a suitable criterion for evaluating the compared algorithms. Because of the difference in the structure and characteristics of the benchmark workflows, the current paper employs a normalized cost metric for comparison. The normalized cost of executing each workflow with any of the benchmark algorithms is obtained by dividing its overall execution cost, cost(wf), by its scheduling cost based on the cheapest scheduling strategy(costlowest(wf)) [18], as follows:
-
Success rate: To evaluate the capability of each algorithm to meet the deadline constraints, the present study defines the metric of success rate as the ratio between the number of successful schedules and the total number of schedules.
-
Instance reduction: To compare the performance of the proposed algorithm against that of the current authors’ previous research (DQWS), an instance reduction parameter is introduced. This parameter shows the reduction factor in the percentage of the number of instances needed for scheduling each workflow type by EDQWS in comparison to DQWS. Since the present study’s focus in the linear graph scheduling phase is the combination of linear graphs, reducing the number of instances, in addition to minimizing the cost, can serve as a metric for comparing the two methods. A reduction in the number of instances required to schedule workflows can lower the probability of instance failure. This is also helpful for cloud providers which have a limitation in the number of requested instances.
Experimental result
Normalized cost analysis
Success rate analysis
The comparison of DQWS and EDQWS
Workflow | Average reduction in the normalized cost of EDQWS vs DQWS | Average reduction in the number of instances of EDQWS vs DQWS | Experiments that decrease the number of instances of EDQWS vs DQWS |
---|---|---|---|
CyberShake | 8.6% | 10.98% | 68.92% |
Epigenomics | 3.4% | 24.58% | 81.6% |
LIGO | 10.5% | −0.36% | 51.6% |
Motif | 0% | 0.05% | 8.4% |