1 Introduction
2 Motivation
2.1 Document processing SaaS application
2.2 Scenarios
| # | Scenario/condition | Adaptation actions |
|---|---|---|
| 1 | CSP2 outperforms CSP1 (i.e., performance optimization) | 1. Change the storage policy to use CSP2 (instead of CSP1) for future data storage requests, and either 1.1. keep the existing data in CSP1, or 1.2. migrate the existing data from CSP1 to CSP2 |
| 2 | CSP2 suffers from ongoing performance issues under peak load | 1. Add more storage nodes in CSP2 (i.e., scale out), or 2. temporarily spill over to CSP1 |
| 3 | CSP2 offers a discount and its storage price drops below that of CSP1 (i.e., cost optimization) | 1. Change the storage policy to use CSP2 (instead of CSP1) for future data storage requests, and either 1.1. keep the existing data in CSP1, or 1.2. migrate the existing data from CSP1 to CSP2 |
| 4 | The SLA of CSP1 offers higher availability than that of CSP2 | 1. Use CSP1 for data that requires higher availability, and either 1.1. keep the existing data in CSP2, or 1.2. migrate the existing data from CSP2 to CSP1 |
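Scenarios like these map naturally onto condition-action rules of the kind SCOPE's reconfiguration policies express. The following is a minimal sketch of how scenario 3 (cost optimization) might be encoded; the `Rule` structure, metric names, and price values are hypothetical illustrations, not SCOPE's actual policy format:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    """A condition-action reconfiguration rule (hypothetical structure)."""
    name: str
    condition: Callable[[dict], bool]  # evaluated against monitored metrics
    action: str                        # adaptation action to trigger

# Scenario 3: CSP2's storage price drops below CSP1's (cost optimization).
rules = [
    Rule(
        name="cost-optimization",
        condition=lambda m: m["CSP2"]["price_per_gb"] < m["CSP1"]["price_per_gb"],
        action="route-new-writes-to:CSP2",
    ),
]

# Illustrative monitored values; in practice these come from the providers.
metrics = {"CSP1": {"price_per_gb": 0.023}, "CSP2": {"price_per_gb": 0.019}}
triggered = [r.action for r in rules if r.condition(metrics)]
print(triggered)  # ['route-new-writes-to:CSP2']
```

Because the condition and action are data rather than application code, such rules can be swapped or extended without redeploying the application.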
2.3 Requirements
3 SCOPE: a self-adaptive middleware
Data Placement component in Fig. 2). The second principle is that the reconfiguration of the underlying federated cloud storage setup is accomplished using external and reusable reconfiguration policies; the complexity of performing reconfiguration is thus abstracted away and externalized from the application (the Adaptation Controller component in Fig. 2). To provide a brief overview, the Monitoring component periodically collects metrics/statistics of the underlying federated cloud storage setup and stores them in the cache/database. These metrics are then used by (i) the Data Placement component to make dynamic data placement decisions, and (ii) the Adaptation Controller component to support adaptation actions (e.g., scale up, scale out, temporary spill-over, etc.).
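The collection cycle described above can be sketched as a simple loop that polls each provider and caches the results for the other components to read. This is an illustrative sketch only; `probe`, `metrics_cache`, and the metric names are placeholders, not SCOPE's actual interfaces:

```python
import time

metrics_cache = {}  # stands in for the cache/database of Fig. 2

def probe(provider):
    """Placeholder for querying a provider's monitoring interface."""
    return {"write_latency_ms": 8.0, "uptime_pct": 99.2}

def collect_metrics(providers):
    """Poll each cloud storage provider and cache its QoS statistics."""
    for p in providers:
        metrics_cache[p] = probe(p)

def monitoring_loop(providers, interval_s=60):
    """Periodic collection cycle; the Data Placement and Adaptation
    Controller components read metrics_cache between iterations."""
    while True:
        collect_metrics(providers)
        time.sleep(interval_s)

collect_metrics(["CSP1", "CSP2"])
print(sorted(metrics_cache))  # ['CSP1', 'CSP2']
```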
3.1 SaaS application
Application-wide-SLA, which is a declarative, model-based description of the different SLA requirements (including different QoS metrics) that the application has to satisfy. In addition, SaaS providers can also specify the Reconfiguration-policy, which is defined as a set of rules, each of which contains a set of conditions over specific target variables and an action part. The Reconfiguration-policy enables service providers to specify reconfiguration rules for the underlying federated cloud storage setup without writing the reconfiguration logic in the application code (addresses R3).

3.2 Adaptive data management

The Adaptive Data Management layer consists of five components: (i) the Data Management component, (ii) the SLA Management component, (iii) the Multi-Objective Decision component, (iv) the Monitoring component, and (v) the Reconfiguration Support component, as shown in Fig. 3.

Application-wide-SLA. In a simplistic case (e.g., storing invoices), multiple data stores can be elected as suitable candidates for data placement decisions. However, the question remains which properties must be considered to efficiently select the data stores for data placement decisions. To make this decision efficiently, the application requirements (expressed in the Application-wide-SLA), the current state, and the dynamic properties of the cloud storage providers and their respective storage systems must be taken into consideration. There are a number of dynamic properties (e.g., cloud storage provider availability, performance, evolving price conditions, etc.) that are supported and can be considered by SCOPE for data management decisions. In this paper, however, we mainly focus on the performance aspect of the cloud storage provider, simply because this aspect is not yet sufficiently reflected in existing state-of-the-art federated cloud systems [7, 12, 18-21]. To perform data placement decisions that are compatible with the operational environment (i.e., the underlying federated cloud storage setup), the Data Management component obtains the persistence configuration details of the different storage systems distributed across multiple cloud storage providers from the Persistence Management component (Step 1 in Fig. 3) and the application-specific SLA requirements from the SLA Management component (Step 2 in Fig. 3), and passes this information to the Multi-Objective Decision component (Step 3 in Fig. 3). The latter component is responsible for making appropriate optimization decisions (i.e., selecting suitable candidates for data placement), taking into account the different requirements of the application. The Data Management component then performs data placement operations that are consistent with the operational environment (Step 5 in Fig. 3), based on the information returned by the Multi-Objective Decision component about the most suited cloud storage providers (addresses R1).

The SLA Management component stores the SLA requirements specified by the application. The component exposes an interface that allows the application to specify SLAs, which are usually expressed in terms of different optimization objectives (e.g., performance, cost, availability, etc.). Listing 1 shows an example of the SLA agreement for the document processing SaaS application. The SLO parameter values (i.e., SloID) show the quality of service required by the document processing SaaS application (e.g., response time for write operations, response time for read operations, uptime, etc.). Furthermore, we also define the threshold values that guide the enforcement of these SLAs. The SLA agreement of the document processing SaaS application combines performance and availability as follows: the request response time should not exceed 10 ms and at least 97% of the requests should be served.

The Multi-Objective Decision component sends a request to the Storage Monitor sub-component of the Monitoring component, which continuously monitors different QoS metrics (see Listing 2 for an example of monitored QoS metrics such as write latency, read latency, and uptime), and receives the monitored metrics in response. The Multi-Objective Decision component compares the monitored metrics (i.e., the observed QoS) with the expected performance SLAs specified by the SaaS application (see Listing 1 for an example of the expected SLA policy for the document processing SaaS application). For example, as shown in Listing 2, three data stores on potentially different cloud providers (i.e., Cassandra-Private, Cassandra-Public, and MongoDB-Private) satisfy the imposed SLA requirements for storing invoices. Based on the SLA requirements and the monitored QoS metrics, the Multi-Objective Decision component makes optimization decisions and selects the cloud provider best suited for data storage.

The Monitoring component is responsible for monitoring different QoS metrics of the back-end storage systems operating at different cloud providers. Table 2 shows the list of supported monitoring metrics for different cloud storage systems (including both relational and NoSQL databases). An example of monitored QoS metrics (i.e., write latency, read latency, and uptime) for cloud storage technologies operating at different cloud providers is shown in Listing 2.
| Metric type | Cassandra (NoSQL) | MongoDB (NoSQL) | Redis (NoSQL) | PostgreSQL (RDBMS) |
|---|---|---|---|---|
| Read latency | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Write latency | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Uptime | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) |
| Average object size | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) |
| Total object size | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) |
| Connected clients | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Total connections | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Keycache capacity | \(\boxtimes\) | | | |
| Keycache hitrate | \(\boxtimes\) | | | |
| Keycache size | \(\boxtimes\) | | | |
| Memory allocated | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Memory used | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Rowcache capacity | \(\boxtimes\) | | | |
| Rowcache hitrate | \(\boxtimes\) | | | |
| Rowcache size | \(\boxtimes\) | | | |
| Read request count | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Write request count | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | |
| Total object count | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) | \(\boxtimes\) |
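Using metrics such as those listed above, the selection step of the Multi-Objective Decision component amounts to an SLA filter followed by a ranking. The sketch below is illustrative only: the store names echo Listing 2, the thresholds echo Listing 1 (response time below 10 ms, at least 97% uptime), but the data values, function name, and restriction to two metrics are assumptions for the example:

```python
# Monitored QoS metrics per data store (illustrative values, cf. Listing 2).
monitored = {
    "Cassandra-Private": {"write_latency_ms": 6.0, "uptime_pct": 99.5},
    "Cassandra-Public":  {"write_latency_ms": 9.0, "uptime_pct": 98.0},
    "MongoDB-Private":   {"write_latency_ms": 7.5, "uptime_pct": 99.0},
    "Redis-Public":      {"write_latency_ms": 12.0, "uptime_pct": 99.9},
}

# SLA thresholds in the spirit of Listing 1: <= 10 ms latency, >= 97% uptime.
SLA = {"max_latency_ms": 10.0, "min_uptime_pct": 97.0}

def select_store(monitored, sla):
    # 1) Keep only the candidates that satisfy every SLA threshold.
    candidates = {
        name: qos for name, qos in monitored.items()
        if qos["write_latency_ms"] <= sla["max_latency_ms"]
        and qos["uptime_pct"] >= sla["min_uptime_pct"]
    }
    # 2) Among the candidates, pick the best-performing one.
    return min(candidates, key=lambda n: candidates[n]["write_latency_ms"])

print(select_store(monitored, SLA))  # Cassandra-Private
```

Here Redis-Public is filtered out by the latency threshold, and Cassandra-Private wins the ranking among the three remaining SLA-compliant stores.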
The Storage Monitor component continuously monitors the QoS metrics (Step A in Fig. 3) and stores up-to-date QoS metrics in the database. These up-to-date monitored QoS metrics are accessed by different components of the Adaptive Data Management layer for different purposes. For example, as stated above, to perform data management decisions that are consistent with the operational environment (i.e., the underlying federated cloud storage setup), the up-to-date QoS metrics are accessed by the Multi-Objective Decision component. Similarly, to autonomously (re)configure the federated cloud storage setup or to react to unusual demand situations (e.g., add more nodes, change the replication factor, etc.), the monitored QoS metrics are also accessed by the Adaptation Controller sub-component of the Reconfiguration Support component.

The Reconfiguration Support component provides an interface to set configuration details, and performs the initial deployment and configuration of the heterogeneous storage systems distributed across multiple clouds. The component comprises three sub-components: (i) the Persistence Management component, (ii) the Adaptation Controller component, and (iii) the Deployment Agent Service component.

The Persistence Management component contains the persistence configuration details of the different storage systems and provides an interface to look up and update these details.

Adaptation Controller component. The Adaptation Controller component is responsible for the management of resources and triggers an appropriate action (e.g., install new instances in a database cluster, change the replication factor, etc.) if the resources suffer from ongoing issues or if unusual demand situations occur. The component reads the up-to-date monitored QoS metrics (Step B in Fig. 3) from the Storage Monitor sub-component of the Monitoring component, which provides continuous monitoring capabilities. The Adaptation Controller component contains a number of (re)configuration rules, and based on these rules and the monitored QoS metrics it makes effective decisions. For example, Listing 3 shows a (re)configuration rule to keep the average latency of the Cassandra storage system below 30 ms. The appropriate (re)configuration action (i.e., adding a new storage node), specified in the action part of the rule, executes in case of a service violation. To perform such an action, as part of Step D in Fig. 3, the component dispatches a change notification signal to the Deployment Agent component, which is responsible for providing the needed deployment support. Another example of a service violation is an indication that a system (e.g., Cassandra, MongoDB, etc.) is running out of storage or memory. In such a case, the Deployment Agent component acts to increase the system memory or storage.

The Deployment Agent component is responsible for performing the desired (re)configuration and deployment autonomously (Step E in Fig. 3), including adding more nodes to a cluster (i.e., scaling out), removing nodes from the cluster, changing the consistency level, increasing system memory and storage, etc. (addresses R2).

3.3 Federated cloud storage setup
Data Management component, the Storage Monitor component, and the Deployment Agent component) and uses the right database-specific storage driver to perform an operation.

4 Prototype implementation of SCOPE
Monitoring component and the Adaptation Controller component) of the prototype start up and run continuously in the background.

5 Evaluation
5.1 Application/experimental setup
5.2 Functional validation of dynamic data placement
5.2.1 Metrics
5.2.2 Results
Application prototype | Write | Read |
---|---|---|
SCOPE-MR | 1012 | 656 |
SCOPE | 322 | 501 |
% Performance improvement | 212% | 40% |
5.3 Performance overhead
5.3.1 Setup
5.3.2 Results
Application prototype | Write | Write replication | Read |
---|---|---|---|
SCOPE-MR | 314 | 1051 | 435 |
SCOPE | 322 | 1089 | 501 |
Monitoring overhead | 3% | 4% | 15% |
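The monitoring-overhead percentages in the table follow from comparing SCOPE against the monitoring-free SCOPE-MR baseline. A quick check of the arithmetic (units as reported in the table):

```python
# Mean latencies from the table above; SCOPE-MR is the baseline without
# the monitoring components running.
baseline = {"write": 314, "write replication": 1051, "read": 435}  # SCOPE-MR
scope = {"write": 322, "write replication": 1089, "read": 501}     # SCOPE

for op in baseline:
    overhead = (scope[op] - baseline[op]) / baseline[op] * 100
    print(f"{op}: {overhead:.0f}% overhead")
# write: 3% overhead
# write replication: 4% overhead
# read: 15% overhead
```

Each rounded figure matches the monitoring overhead reported in the table.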