1 Introduction
2 Related surveys
2.1 Cloud resource management
2.2 Adaptability using control theory
2.3 Cloud elasticity
3 Taxonomy
3.1 Control solution view
3.2 Elasticity view
4 Review of existing control theoretical approaches of elasticity
Reference | [72] | [60] | [44] | [43] | [48] | [53] | [50] | [61] |
---|---|---|---|---|---|---|---|---|
Type | PID | PID | PID | PI | PD | Fixed gain | Integral | Integral |
Model | State-space | Queuing | – | Grey-box | – | Black-box | Black-box | Black-box |
Architecture | Centralized | Distributed | Centralized | Centralized | Centralized | Centralized | Centralized | Centralized |
Control objectives | 99th % read operation latency | Application SLO (response time) at a pre-defined level | Maintain a desired response time | Ensure service time constraints | Optimal number of VMs avoiding under/over utilized scenarios | Desired memory utilization | Desired CPU utilization | Maintain a desired response time |
Reference inputs | Service time | CPU utilization | CPU utilization | Service time | Server load and memory utilization | Memory utilization | CPU utilization | CPU utilization |
Control input | Number of Voldemort nodes | Number of VMs | Number of VMs | Number of map-reduce nodes, number of clients | Number of VMs | Memory allocation | Number of VMs | Number of VMs |
Monitoring metrics | Read latency without round-trip time | Mean CPU utilization of tier’s VMs | Mean CPU utilization of cluster | Service time and number of clients | Server load and memory utilization | CPU and memory usage, page fault rates | CPU load, arrival rate | CPU load, arrival rate |
Ingredients | SP St R Ho | SP W R Ho | SP W R Ho | – D Hy Ho | CP W R Ho | – W R V | SP W R Ho | SP St R Ho |
Workloads | Synthetic: generated using YCSB | Real: FIFA | Synthetic | Synthetic | Synthetic | Real | Synthetic | Synthetic |
Applications used | YCSB | – | Hogna Framework | MapReduce Benchmark Suite | – | httperf, MemAccess | – | Cloudstone |
Environment | Real: Voldemort | Real: Amazon | Real: Amazon and SAVI [117] | Real: Grid5000 [118] | Real: Amazon | Real: Custom (4 HP Proliant servers) + Xen 2.6.16 | Real: ORCA [119] + Xen as Hypervisor | Real: ORCA [119] + Xen as Hypervisor |
Compared with | Not provided | Threshold based, proportional controller | Not provided | Not provided | Compared with [81] | Not provided | Static threshold, integral control | Static provisioning |
Reference | [28] | [29] | [30] | [31] | [32] | [33] | [86] | [68] |
---|---|---|---|---|---|---|---|---|
Type | SSF + FC | SSF | SSF (adaptive) | Adaptive | Adaptive + hill climbing | Adaptive | Adaptive | Adaptive (integral) |
Models | State-space + LQR | State-space | Black-box + LQR | Black-box | Queuing | Queuing | Queuing | Black-box |
Architecture | Centralized and cascade | Centralized | Distributed | – | Centralized | Centralized | Centralized | Hierarchical |
Control objectives | To meet SLO (CPU load, response time and bandwidth) | To meet performance (CPU load, response time and throughput) | To achieve applications’ SLO (throughput and response time) | To guarantee target performance (response time) | To achieve application performance (latency) | To guarantee SLA (max number of requests handled per time unit) | Macro level analysis to evaluate controller performance | To achieve application level QoS (response time, throughput) |
Reference input | CPU load, response time, cost | CPU load, response time, throughput | Response time, throughput | Response time | Latency | Un-handled service request in past interval | Over- and under-provisioned systems | CPU utilization |
Control input | Number of storage nodes | Number of VMs | CPU, memory and I/O allocation | Memory size | Virtual CPUs | Number of VMs | Number of VMs | CPU entitlement |
Monitoring metrics | CPU load, response time, bandwidth, total cost | CPU usage, response time, throughput | CPU and memory usage, I/O consumption | Response time | Arrival rate, performance, CPU usage, queue length | Arrival requests, service rate | Service capacity, arrived requests, buffer size, processed requests | CPU usage, workload arrivals |
Ingredients | SP St R Ho | – G R Ho | SP G R V | SP W R V | SP G R V | CP G Hy Ho | SP/CP Sc Hy Ho | CP W R V |
Workloads | Synthetic | Synthetic | Synthetic | Real: FIFA, Wikipedia | Real: FIFA | Real: FIFA | Real: FIFA, Google traces | Synthetic |
Applications used | – | httperf | httperf, RUBiS, TPC-W | RUBBoS | Zimbra load generator | – | – | RUBiS, TPC-W |
Environment | Simulation: EstoreSim | Real: Eucalyptus private cloud | Real: Linux machines + KVM | Real: Linux machine + KVM | Real: Linux machines + Zimbra [120] | Simulation: discrete event simulator | Simulation: discrete event simulator [121] | Real: 5 HP Proliant Servers + Xen |
Compared with | Scenario without any auto-scaling | Not provided | Scenario without any controller | Non-adaptive (details not provided) | Threshold-based utilization controller | Regression based controller [122] | Reactive controller [123] | None provided |
Reference | [62] | [63] | [73] | [75] | [49] | [55] | [54] |
---|---|---|---|---|---|---|---|
Types | Adaptive (optimal) | Adaptive PI + pole placement | Adaptive PI | Nested adaptive (integral) | MIMO adaptive PI + RL | SISO, MIMO and adaptive MIMO | Adaptive |
Model | Black-box (ARMA 2nd order) | Black-box (1st order AR model) | Least square regression | – | ARMAX (2nd order) + SVM [124] | Kalman filter | Linear regression (1st order) |
Architecture | Distributed | – | Distributed | Cascade and distributed | Centralized | Centralized | Centralized (node level) |
Control objectives | To achieve application level QoS (response time) | To achieve application level QoS (response time) | To obtain target job progress to meet deadline | To achieve target QoS goal (response time) | To maximize application benefit (QoS) within time and budget constraints | To maintain CPU allocation right above the CPU utilization | Adjustment of memory size to achieve the desired application response time |
Reference input | Per application response time | Response time | Target job progress | Response time, CPU utilization | Benefit function, execution time, resource cost | CPU utilization | Response time |
Control input | CPU entitlement and IO allocation | CPU entitlement | CPU share | CPU allocation | Adaptive parameters | CPU allocation | Memory size allocation |
Monitoring metrics | CPU usage, response time, disk usage | Response time | Job progress, milestone | CPU utilization, measured response time | Adaptive parameters, CPU and memory usage | CPU utilization | Measured response time, memory utilization |
Ingredients | CP W P V | – W R V | SP Sc R V | – G R V | SP/CP G P V | – W P V | SP W R V |
Workloads | Synthetic | Synthetic | Synthetic | Real: SPECweb99 [125] | – | Synthetic | Real: FIFA, Wikipedia |
Application used | RUBiS, TPC-W, custom: secure media server | httperf | ADCIRC, OpenLB, WRF, BLAST and Montage | httperf | Great Lake nowcasting and forecasting, volume rendering | RUBiS | RUBBoS, httpmon |
Environment | Real: two test-beds (HP C-class blades and Emulab [126]) | Real: two machines, i.e., HP9000-R server and Pentium III | Real: 8-core AMD server + Hyper-V and Xen | Real: two machines, i.e., HP9000-L server and HP LPr Netserver | Real: two private clusters (each with 64 nodes) | Real: three machines with Xen 3.0.2 Hypervisor | Real: machine with 32 cores and 56 GB memory + Xen hypervisor |
Compared with | Two cases: work-conserving and static allocation | Fixed PI controller | Feedback approach of [127] | Single loop QoS controller, utilization controller | Work conserving, static scheduling | Not provided |
Reference | [45] | [37] | [38] | [39] | [34] | [59] | |
---|---|---|---|---|---|---|---|
Types | Distributed MIMO MPC | Stochastic MPC | MPC + event triggering | LLC | LLC | LLC + search methods | |
Models | Fuzzy logic + ANN | Queuing + Kalman filter | G | ARIMA [129] | ARIMA + Kalman filter | Kalman filter | ARIMA + Kalman filter |
Architecture | Distributed | – | Centralized | Centralized | Centralized | Hierarchical | Centralized |
Control objectives | To achieve end-to-end desired response time of co-located web applications | To minimize application response time and cost | To reduce cluster reconfiguration decisions | To obtain optimal trade-off between energy saving and reconfiguration cost | To maximize profit and minimize power consumption | To maximize profit and minimize power consumption | To obtain a desired response time while reducing power consumption |
Reference input | Per application response time | – | – | SLA (average task scheduling delay) | Response time of each client class | Response time of each client class | Response time |
Control input | CPU and memory allocation | Number of VMs | Map-reduce cluster size | Number of machines | Cluster size, operating frequencies | Cluster size, CPU share, active host machines, workload share of VM | Operating frequency |
Monitoring metrics | CPU and memory usage of local and neighbour VMs, energy usage, response time | Resource usage, response time, cost, service rate of cluster, queue length, arrival rate | Number of clients, service time, availability | Queue length, SLA, CPU and memory usage, arrival rate | Arrival rate, response time | Queue length, arrival rate, processing rate, response time | Arrival rate, queue length, power consumption |
Ingredients | CP W P V | SP G P Ho | – St Hy Ho | CP G P Ho | CP G P Ho/V | CP G P Ho/V | CP G P V |
Workloads | Real: FIFA | Real: FIFA | Real: Taobao [130] | Real: Google [131] | Real: FIFA | Real: FIFA | Real: http traces [132] |
Application used | RUBiS and custom web service | Convex optimization solver | Map-reduce framework | – | – | IBM’s trade6 benchmark, httperf | – |
Environment | Real: Dell PowerEdge R610 servers + VMware environment | Simulation: custom simulator | Simulation: Matlab Simulink | Simulation: Matlab simulation | Real: Dell PowerEdge servers + VMware ESX | Simulation: Matlab | |
Compared with | None provided | Time based control mechanism, error based event mechanism | A system without LLC | A system without LLC | Comparison among various search methods |
Reference | [42] | [74] | [64] | [56] | [57] | [46] | [69] |
---|---|---|---|---|---|---|---|
Types | Adaptive integral + fixed gain + heuristic | Feedback (PI) + feed-forward (MPC) | Feedback + feed-forward + integral | Feedback (PI) + feed-forward (P) | Optimal + feedback (PI) | Adaptive + fuzzy controller | Multiple fuzzy controllers |
Model | Black-box | Black-box | Queuing + transaction mix [134] | Queuing | Queuing + ARMA | Fuzzy models + black-box | Fuzzy models |
Architecture | Distributed | Centralized | Hierarchical and distributed | Centralized | Centralized | Distributed | Centralized |
Control objectives | Desired application performance and reduction of SLO violations | To regulate target performance of key-value store | Desired response time of multiple multi-tier applications | To minimize CPU capacity allocation per application, and maintain desired response time | To obtain optimal partitioning scheme that reduces round trip time | To maintain a target response time | Adaptive allocation of multi-objective resources and service differentiation |
Reference input | Response time | 99th percentile of read latency | Response times of applications | Response time | Round trip time | Response time | Response time, throughput, play rate |
Control input | CPU capacity, memory allocation, MaxClient value | Average throughput per storage server | CPU capacity | Shared CPU capacity | CPU budget | CPU and memory allocation | CPU, memory, disk bandwidth |
Monitoring metrics | CPU and memory usage, response time, throughput, number of Apache processes | R99, average throughput per server | Transaction mix, measured response time, CPU utilization | Arrival rate, response time, CPU usage | Arrival rate, average round trip time, CPU usage, average resident time | Average CPU and memory usage, mean response time | Response time, throughput, play rate |
Ingredients | SP W P V | SP St Hy Ho | CP W Hy V | – W Hy V | CP W R V | SP W R V | – G P V |
Workloads | Synthetic | Synthetic | Real: VDR traces [135] | Real: news portal traces | Real: FIFA | Synthetic | Synthetic |
Applications used | httperf, autobench tool | Voldemort | Customized RUBiS | CRIS tool | RUBiS | RUBiS, RUBBoS, Olio and httpmon | TPC-W, key-value store, video streaming |
Environment | Real: Intel Quad Core i7 + Xen 3.3 | Real: OpenStack (cluster of 11 nodes) | Real: HP ProLiant servers + Xen 2.6 | Java simulation and real set-up on Linux server | Real: Linux machines + Xen 3.0.3 | Real: one machine with 32 cores + 56 GB memory + Xen | Real: two Dell servers + Xen |
Compared with | Static scenario without reconfiguring resources | Not provided | Static, utilization only, nested control, feed-forward + utilization | Schemes like equal shares and equal utilization | Not provided | Kalman filter, adaptive PI, ARMA |
Reference | [58] | [40] | [52] | [65] | [51] | [41] | [66] |
---|---|---|---|---|---|---|---|
Types | GS (PID) | GS (SSF + LQR) | GS (LPV-\(H_{\infty}\) + LQR) | Multi-model PI | Integral + fuzzy controller | SSF + pole placement | LPV controller |
Models | – | Grey-box + least square regression | ARX + LPV-ARX | Black-box | Black-box + fuzzy model | State-space + grey-box | State-space LPV models |
Architecture | Centralized | Distributed | – | Centralized | Centralized | Distributed | Centralized |
Control objectives | To maintain a desired CPU load | To maintain web server performance (response time and throughput) | To dynamically control the aggregate CPU frequency to maintain target response time | To optimize share of CPU capacity among all VMs to meet desired response time | To maintain a desired CPU utilization of overall cluster | To maintain a desired response time | To maintain target response time for each application |
Reference input | CPU load | Response time | Response time | Per application/VM average response time | CPU utilization | Response time | Response time |
Control input | Number of VMs | Number of VMs, admission control | Aggregate CPU frequency | CPU capacity | Number of VMs | Number of VMs, admission control | Per VM CPU capacity share |
Monitoring metric | CPU utilization, number of VMs | Utilization of VMs, throughput, measured response time, service rate | Number of jobs, queue length, throughput, mean response time | Per application response time | Response time, arrival rate, CPU utilization | Utilization of VMs, throughput, response time | Per application (arrival rate, effective service time, response time) |
Ingredients | SP G R V | – W R Ho | CP G R Ho | CP G R V | SP W R Ho | – W R Ho | CP W R V |
Workloads | Synthetic + real | Synthetic | Real: web traces [138] | – | Real: NASA + FIFA | Synthetic | Real: web traces |
Applications used | httpmon | httperf | – | RUBiS | – | httperf | Customized Apache JMeter, micro-benchmarking web service |
Environment | Real: Amazon | Real: Eucalyptus + KVM hypervisor | Simulation: customized CSIM simulation | Real: custom (three machines) + Xen 2.6 | Simulation: CloudSim | Real: Eucalyptus + KVM hypervisor | Real: custom (eight VMs) + Xen 3.0 |
Compared with | Fixed gain PID | None provided | Linear models, queuing theory | Single model controllers | Fixed integral controller, Rightscale | None provided | None provided |
Reference | [47] | [76] | [77] | [97] | [70] | [71] | [67] |
---|---|---|---|---|---|---|---|
Types | Fuzzy type-2 controller | Fuzzy controller | Fuzzy controller | Fuzzy + optimization | Fuzzy controller | Fuzzy controllers | Fuzzy self-learning controller |
Models | Fuzzy model | Zero-order Sugeno-type fuzzy model + clustering | Fuzzy model + classification and clustering | Fuzzy model + queuing | Fuzzy model + stochastic approximation algorithm [139] | Fuzzy model + t-digest [140] | Fuzzy + Q-learning |
Architecture | Centralized | Distributed | Distributed | Centralized | Distributed | Distributed and centralized | Centralized |
Control objectives | To maintain a target response time | To adjust per VM CPU capacity to meet performance goals | To adjust per VM CPU and disk IO to meet performance goals | To guarantee 90th percentile end-to-end delay in multi-tier systems | To maintain performance SLOs of an application | To maintain performance SLOs of an application | To maintain a desired response time |
Reference input | 95th percentile of response time | – | – | 90th Percentile end-to-end delay | Response time, throughput | Response time and throughput | Response time |
Control input | Number of VMs | CPU capacity | CPU capacity, disk IO bandwidth | Number of servers | CPU capacity allocation for each tier | CPU and memory allocation for each tier | Number of VMs |
Monitoring metrics | Measured response time, arrival rate | CPU usage, throughput (reply rate), workload | CPU and disk IO usage, average query throughput, measured response time | End-to-end delay | Mean CPU usage per tier/VM, measured SLO | Mean CPU and memory usage per tier/VM, measured SLO | Arrival rate, measured response time, throughput |
Ingredients | SP G R Ho | SP G P V | – DB P V | – W R Ho | CP W/D P V | CP W/DB/St P V | SP G R Ho |
Workloads | Synthetic: adapted from [138] | Synthetic + real: FIFA | Synthetic + real: FIFA | Synthetic | Synthetic | Synthetic | Synthetic |
Applications used | JMeter | Java Pet store, httperf | TPC-H, RUBiS | Three tier application | RUBiS, Olio, Cassandra, RAIN, YCSB | RUBiS, RUBBoS, Olio, Cassandra, Redis | ElasticBench |
Environment | Real: Microsoft Azure | Real: custom (16-CPU IBM x 336 based cluster) + VMware ESX 3.0 | Real: custom (Quad-core Intel Q6600) + Xen 3.3.1 | Simulation (no information provided) | Real: custom (two Fujitsu Servers) + Xen 4.2 | Real: custom (two Fujitsu Servers) + Xen 4.4 | Real: Azure and OpenStack |
Compared with | Over-/under-provisioning scenarios | Not provided | Peak load based allocation scheme | With/without optimization scenarios | Fuzzy controller without learning, Azure rule-based |
4.1 Control objective
- Regulatory purpose: a feedback controller developed for regulatory purposes maintains the system output close to a desired reference value, for example, keeping the average CPU utilisation of the cluster at 60%.
- Optimisation: the controller is responsible for obtaining the best settings for the system output under given constraints, for example, minimising the system’s response time at the lowest possible cost.
- Disturbance rejection: such a controller manages and adjusts the level of disturbances. An admission control system is an example: it admits only as much workload as the system can handle without degrading its performance.
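The regulatory objective can be illustrated with a minimal sketch, assuming an integral control law on the number of VMs and a toy plant in which utilisation is simply demand divided by VM count. The class name `IntegralScaler`, the gain value and the plant model are hypothetical choices for illustration and are not taken from any of the surveyed solutions.

```python
class IntegralScaler:
    """Integral control law u(k+1) = u(k) + K * (y(k) - r) on the VM count."""

    def __init__(self, target_util, gain, initial_vms, min_vms=1, max_vms=100):
        self.target_util = target_util      # reference input r (e.g., 0.60)
        self.gain = gain                    # integral gain K (assumed value)
        self.min_vms = min_vms
        self.max_vms = max_vms
        self.state = float(initial_vms)     # continuous integrator state

    def update(self, measured_util):
        """Accumulate the utilisation error and return the VM count to apply."""
        error = measured_util - self.target_util   # positive when overloaded
        self.state += self.gain * error
        # Clamp the integrator so it cannot wind up beyond the actuator limits.
        self.state = max(self.min_vms, min(self.max_vms, self.state))
        return round(self.state)


def simulate(demand, steps=20):
    """Toy plant (assumption): utilisation = demand / VMs, capped at 100%."""
    scaler = IntegralScaler(target_util=0.60, gain=5.0, initial_vms=2)
    vms = 2
    trace = []
    for _ in range(steps):
        util = min(1.0, demand / vms)   # monitored metric
        vms = scaler.update(util)       # control input for the next interval
        trace.append((vms, util))
    return trace
```

Calling `simulate(6.0)` models a constant load worth six fully busy VMs. Under these assumptions the controller settles at ten VMs, which corresponds to the 60% utilisation target.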
4.2 Reference input
4.3 Control input
4.4 Controller
4.4.1 Classic
4.4.1.1 Fixed gain
4.4.1.2 State space feedback (SSF)
4.4.1.3 Adaptive
4.4.2 Optimal
4.4.2.1 Model predictive controller (MPC)
4.4.2.2 Limited lookahead controller (LLC)
4.4.3 Advanced
4.4.3.1 Hybrid
4.4.3.2 Gain scheduled/switched controllers
4.4.4 Intelligent
4.5 Architecture
- Centralised: a control system following this architecture is implemented as a single unit that manages the control objective from a central place, e.g., at the global system level. Tables 1, 2, 3, 4, 5, 6 and 7 show that the majority of the control solutions are centralised. A solution can be centralised at one of three levels: application, node or cloud. The solutions that focus on horizontal elasticity from the SP perspective are centralised at the application level (e.g., [43, 44, 48, 50, 61, 72]), whereas the control solutions that cater to the CP perspective run centrally at the cloud level, where they can be responsible for different applications (e.g., [33, 48, 86]). Application-level control solutions can be executed outside the cloud environment and can therefore control interactions with multiple clouds. Solutions centralised at the node level are responsible for the resource management of the VMs running on that computational node (e.g., [49, 53, 54]).
- Distributed: control systems adopting the distributed pattern are implemented at the sub-system level and are usually responsible for achieving the control objective at that level. Tables 1, 2, 3, 4, 5, 6 and 7 show that this approach is most common for vertical elasticity, where a control solution is deployed at each VM of the cluster (e.g., [30, 62, 73]). For horizontal elasticity, there are only a few approaches, including [40, 41, 60], where sub-controllers handle the resource management task at the per-tier (or per-objective) level.
- Hierarchical: the control system is implemented at two levels, lower and upper. At the lower level, distributed controllers each manage a sub-system, whereas at the upper level another controller mediates between the distributed controllers to achieve the control objective at the global scale. This category includes only the control solutions [34, 64, 68].
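The two-level structure of the hierarchical architecture can be sketched as follows. This is an illustrative toy under stated assumptions, not the design of [34, 64, 68]: the function names `local_request` and `mediate`, the proportional-style adjustment rule, and the proportional scaling used by the upper level are all hypothetical.

```python
def local_request(current_share, measured_rt, target_rt, gain=0.5):
    """Lower level: one sub-system controller asks for more or less CPU share
    in proportion to its relative response-time error (assumed rule)."""
    error = (measured_rt - target_rt) / target_rt
    return max(0.0, current_share * (1.0 + gain * error))


def mediate(requests, capacity):
    """Upper level: grant all requests if they fit within the node capacity,
    otherwise scale every request down by the same factor."""
    total = sum(requests.values())
    if total <= capacity:
        return dict(requests)
    scale = capacity / total
    return {name: share * scale for name, share in requests.items()}


def control_round(shares, measurements, targets, capacity):
    """One control interval: local controllers decide, then the mediator arbitrates."""
    requests = {
        name: local_request(shares[name], measurements[name], targets[name])
        for name in shares
    }
    return mediate(requests, capacity)
```

For example, `control_round({"web": 2.0, "db": 2.0}, {"web": 300, "db": 200}, {"web": 200, "db": 200}, capacity=4.0)` lets the overloaded web tier gain share at the expense of the database tier while the total stays within the four available CPU units.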
5 Discussion, issues and challenges
6 Conclusion
7 Summarized results
- Provider: the possible values are cloud provider (CP) and service provider (SP).
- Application type: the possible values are Generic (G), Web (W), Scientific (Sc), Storage (St) and Database (DB).
- Trigger: the possible values are Reactive (R), Predictive (P) and Hybrid (Hy).
- Elasticity type: the possible values are Horizontal (Ho) and Vertical (V).
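The "Ingredients" rows of the summary tables pack these four attributes into one cell, such as `SP W R Ho`. A minimal decoding sketch follows, assuming the cell always lists provider, application type, trigger and elasticity type in that order, with `–` for an unspecified value and `/` separating multiple values. The helper name `decode_ingredients` is hypothetical, the expansions of CP and SP are assumed, and both spellings of the hybrid and database codes are accepted because the tables and the legend differ; unknown codes are passed through unchanged.

```python
# Field order assumed from the legend: provider, application type, trigger, elasticity.
LEGEND = [
    ("provider", {"CP": "cloud provider", "SP": "service provider"}),
    ("application_type", {"G": "generic", "W": "web", "Sc": "scientific",
                          "St": "storage", "Db": "database", "DB": "database"}),
    ("trigger", {"R": "reactive", "P": "predictive", "H": "hybrid", "Hy": "hybrid"}),
    ("elasticity_type", {"Ho": "horizontal", "V": "vertical"}),
]


def decode_ingredients(cell):
    """Expand one 'Ingredients' cell into a dict of attribute -> list of values."""
    tokens = cell.split()
    if len(tokens) != len(LEGEND):
        raise ValueError(f"expected {len(LEGEND)} codes, got {len(tokens)}: {cell!r}")
    decoded = {}
    for token, (field, codes) in zip(tokens, LEGEND):
        if token in ("–", "-"):
            decoded[field] = None          # attribute not specified in the table
        else:
            # Codes missing from the legend are kept as-is rather than guessed.
            decoded[field] = [codes.get(code, code) for code in token.split("/")]
    return decoded
```

For instance, `decode_ingredients("CP G P Ho/V")` reports a cloud-provider, generic, predictive solution that covers both horizontal and vertical elasticity.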