A lightweight plug-and-play elasticity service for self-organizing resource provisioning on parallel applications

https://doi.org/10.1016/j.future.2017.02.023

Highlights

  • We propose a lightweight plug-and-play elasticity service for self-organizing resource provisioning.

  • Based on TCP (Transmission Control Protocol) congestion control, we propose an algorithm named Live Thresholding (LT).

  • The results highlight performance competitiveness in terms of application time (performance) and cost (performance × energy) metrics.

  • The Helpar model can be seen as an elasticity service for HPC applications.

Abstract

Today, cloud elasticity can benefit parallel applications in addition to its traditional targets, such as Web and business-critical demands. Elasticity consists of adapting the number of resources and processes at runtime, so users do not need to choose the best configuration beforehand. The most common approaches to accomplish this use threshold-based reactive elasticity or time-consuming proactive elasticity. However, both present at least one of the following problems: the need for previous user experience, poor handling of load peaks, manual completion of parameters, or a design tied to a specific infrastructure and workload setting. In this context, we developed a hybrid elasticity service for master–slave parallel applications named Helpar. The proposal presents a closed-control-loop elasticity architecture that adapts the values of the lower and upper thresholds at runtime. The main scientific contribution is the Live Thresholding (LT) technique for controlling elasticity. LT is based on the TCP congestion algorithm and automatically manages the elasticity bounds to improve the reactiveness of resource provisioning. The idea is to provide a lightweight plug-and-play service at the PaaS (Platform-as-a-Service) level of a cloud, in which users are completely unaware of the elasticity feature and only need to compile their applications with the Helpar prototype. For evaluation, we used a numerical integration application and OpenNebula to compare Helpar against two scenarios: a set of static thresholds and a non-elastic application. The results confirm the lightweight nature of Helpar, besides highlighting its competitiveness in terms of application time (performance) and cost (performance × energy) metrics.

Introduction

Cloud computing has been gaining increasing adherence particularly because it presents an easier way to auto-scale systems [1]. The elasticity service is best represented in Web-based business-critical systems, which must satisfy certain service level agreements (SLAs), e.g., upper bounds on user-perceived response time [2], [3], [4]. Besides acting on performance and energy-saving issues, horizontal and/or vertical elasticity actions are also especially pertinent in dynamic environments, where human intervention is becoming more and more difficult or even impossible [5]. In addition to the aforementioned client–server target, cloud elasticity is increasingly perceived as a key service to support the execution of HPC (High Performance Computing) applications. Traditionally, these kinds of applications run on clusters or even on grid architectures: both have a fixed number of resources that must be maintained in terms of infrastructure configuration, scheduling tools and energy consumption. Besides hiding these procedures from programmers, an elasticity service integrated with parallel applications can reveal one of its main contributions for the HPC context: self-organizing the number of resources (and, consequently, processes or threads) in accordance with the application's demands. Defining the exact number of processes is decisive for getting better performance, and is sometimes estimated by hand through several tuning executions [6]. First, either a too-small or a too-large value will not efficiently exploit the distributed system. Second, a fixed value cannot fit irregular applications, in which the workload varies during execution and/or is occasionally not predictable in advance.

Deciding the right amount of cloud computing resources to execute a parallel application is a double-edged sword, which may lead to either under-provisioning or over-provisioning [7], [8]. These mismatched application-resource mappings result in saturation or waste of resources, and are among the most significant challenges cloud elasticity clients face. Today, most elasticity control strategies can be classified as either reactive or proactive (also called predictive by some authors) [5], [7], [9], [10]. In the first case, users typically define, in an ad-hoc manner, an upper bound tu and a lower bound tl over a target metric (e.g., CPU utilization level, throughput or average response time) to trigger, respectively, the activation and deactivation of a certain number of resources [6]. This threshold-based technique is the most used in Web-based commercial auto-scaling systems: its simplicity and intuitive nature drive this trend [3], [11]. A proactive approach, on the other hand, employs prediction techniques to anticipate the behavior of the system (its load) and thereby decide the reconfiguration actions. This capability in turn enables the application to be ready to handle a load increase when it actually occurs. To accomplish this, it is common to use machine learning algorithms, including Neural Networks, Linear Regression, Support Vector Machines, Reinforcement Learning and Pattern Matching techniques [5]. To improve their accuracy in forecasting load values, these are commonly combined with popular Time-Series-based prediction techniques, such as Exponential Smoothing, Moving Averages and Autoregressive models [7].
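The reactive, threshold-based rule described above fits in a few lines. The following is a minimal sketch; the function name and the concrete bound values are illustrative assumptions, not part of any system described in the article:

```python
# Reactive threshold-based elasticity: one decision per monitoring observation.
# tl and tu are the lower/upper bounds over the target metric (here, CPU load
# normalized to [0, 1]); the concrete values below are arbitrary examples.

T_LOWER = 0.30  # tl: scale in (deallocate) below this utilization
T_UPPER = 0.80  # tu: scale out (allocate) above this utilization

def reactive_decision(cpu_load, t_lower=T_LOWER, t_upper=T_UPPER):
    """Map one observed load value to a scaling action."""
    if cpu_load > t_upper:
        return "scale_out"   # add resources, e.g., launch a VM
    if cpu_load < t_lower:
        return "scale_in"    # remove resources
    return "no_action"       # load is inside the [tl, tu] band
```

The simplicity of this rule explains its popularity in commercial auto-scalers, but also its weakness: the quality of every decision hinges entirely on how well tl and tu were chosen.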

Although the word automatic is used for both auto-scaling mechanisms, current implementations commonly require some kind of user input, preliminary configuration and/or use of APIs (Application Programming Interfaces) to adjust resources as workloads change [1], [10]. For the reactive approach in particular, the tasks of choosing tl and tu and writing if-then rules are not trivial and sometimes require deep knowledge about the behavior of the system over time [3], [8], [11]. This makes the accuracy of the policy subjective and prone to uncertainty: the same set of thresholds that fits a specific infrastructure/application well can cause undesired emergent behaviors, such as instability and resource thrashing, in other settings [6], [12]. In addition, another problem of using thresholds is the lack of reactivity. There are situations in which the cloud controller could anticipate the (de)allocation of resources, but the resource configuration remains the same due to bad choices in setting tl and tu. The proactive elasticity mechanism, on the other hand, does not need thresholds, but it is based on robust mathematical modeling and is commonly criticized as time-consuming for sensitive performance-driven applications [7], [13]. Besides runtime model tuning, there is also the need to train the predictive technique and to run the application beforehand to optimize and select parameters [5]. Finally, Netto et al. [6] affirm that proactive elasticity strategies focus only on method accuracy and ignore cloud technical limitations, besides being highly dependent on workload characteristics and precise prediction models.
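As a concrete illustration of the Time-Series techniques mentioned above, a Moving Average forecaster is only a few lines of code, although, as the text notes, its usefulness still depends on workload characteristics and training data. The names below are illustrative assumptions:

```python
from collections import deque

def moving_average_forecast(history, window=3):
    """Predict the next load value as the mean of the last `window` samples."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent)

# A bounded history buffer: old samples fall off automatically.
loads = deque([0.40, 0.55, 0.70], maxlen=10)
prediction = moving_average_forecast(loads)  # mean of the last 3 samples
```

Even such a simple predictor smooths out transient spikes, which is why Moving Averages are often combined with the heavier machine-learning models cited above rather than used alone.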

Considering this background, we work with the following problem statement in mind: how can a totally automatic elasticity service for parallel applications bypass the aforementioned drawbacks of proactive and reactive approaches? To answer it, we designed an elasticity model called Helpar (Hybrid Elasticity Model for Parallel Applications). As a blended elastic approach, Helpar acts as a resource provisioning service at the PaaS (Platform as a Service) level of the cloud, joining the lightweight threshold-based feature of the reactive approach with the prediction and feedback control of the proactive one. In our understanding, a cloud scenario of “Complete Computing” must combine the keywords automatic, effortless, proactiveness, performance and intelligence. More precisely, our idea is to provide a plug-and-play, parameterless model that adapts tl and tu on the fly in accordance with the application’s demand to enhance the reactiveness of the system. The current version of Helpar addresses iterative master–slave parallel applications in such a way that users must only compile their applications with the middleware developed as a product of the Helpar model. This middleware is in charge of transforming a non-elastic application into an elastic one without imposing any API or previous execution. Helpar provides a controller and a framework to manage resource (de)allocation and process (dis)connection without blocking the application while elasticity actions take place. Regarding its scientific contributions, Helpar brings the following additions to the state of the art:

  • (i)

    A model of a closed control-theoretic [14] infrastructure to support hybrid elasticity behavior in parallel cloud-based applications;

  • (ii)

    Based on the TCP (Transmission Control Protocol) congestion control, we propose an algorithm named Live Thresholding (LT) to handle both application load projection and tl and tu adaptivity.
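This excerpt does not reproduce LT's exact update rules, but its TCP congestion-control basis suggests an additive-increase/multiplicative-decrease (AIMD) flavor: probe gently while the projected load stays below the bound, back off sharply when a surge is projected. The following is a speculative sketch of one such adaptation step; the function name, step sizes and clamping bounds are all assumptions for illustration, not Helpar's actual algorithm:

```python
def adapt_upper_threshold(t_upper, load, predicted_load,
                          additive_step=0.05, multiplicative_factor=0.5,
                          floor=0.5, ceiling=0.95):
    """One AIMD-style adaptation of the upper threshold tu.

    If the projected load is about to cross tu, cut tu multiplicatively so
    the next monitoring cycle triggers scale-out earlier; otherwise raise
    tu additively to avoid premature (wasteful) allocations. The result is
    clamped to a sane [floor, ceiling] band.
    """
    if predicted_load >= t_upper:
        t_upper *= multiplicative_factor   # back off: react sooner
    else:
        t_upper += additive_step           # probe: tolerate more load
    return min(max(t_upper, floor), ceiling)
```

The appeal of borrowing AIMD is that it needs no training phase and no user-supplied parameters at runtime, which matches the plug-and-play goal stated above.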

We designed LT to provide both elasticity reactiveness and application performance without neglecting energy consumption. In other words, it is not worthwhile to halve the application time while spending four times more resources to do so. Accordingly, the Helpar evaluation analyzes performance (time) and energy consumption (resource), as well as the cost metric (time × resource), in comparison with non-elastic and purely reactive elasticity approaches. We modeled and developed a numerical integration parallel application that was executed in the three aforesaid scenarios while varying the input workload pattern as follows: Ascending, Descending, Wave and Constant. In our understanding, despite using a single application, the strategy of adopting multiple evaluation metrics, scenarios and, mainly, different workloads was essential to discuss the resource provisioning proposal.
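The trade-off captured by the cost metric can be checked with trivial arithmetic: halving the time while quadrupling the resources doubles the cost, which is exactly the situation the text flags as not worthwhile. The function below merely restates the metric's definition; the concrete numbers are illustrative:

```python
def cost(time, resources):
    """Cost metric as defined in the article: time x resource (lower is better)."""
    return time * resources

baseline = cost(100, 4)     # e.g., 100 time units on 4 resources -> 400
aggressive = cost(50, 16)   # half the time on 4x the resources  -> 800
# aggressive > baseline: the speedup does not pay for its resource bill.
```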

The remainder of this article is organized as follows. Section 2 introduces related studies. Section 3 presents the Helpar model, revealing how we developed the aforesaid contributions. Sections 4 and 5 describe the evaluation methodology and discuss the results, respectively. Finally, Section 6 expresses the final remarks, highlighting the contributions with quantitative data.

Section snippets

Motivation and related work

Resource provisioning and cloud elasticity are the subject of a vast number of scientific articles. Here, we present a set of initiatives, both for Web and HPC applications, that guide our motivation and expose research gaps.

Helpar elasticity model

This section describes Helpar, including its architecture, the supported parallel programming model and how threshold adaptation takes place. Helpar provides horizontal elasticity through a hybrid model: although based on thresholds to launch elasticity actions reactively, we adapt their values on the fly to provide better elasticity reactivity. Here, reactivity means the speed of reacting through resource reorganization, considering load prediction based on the execution history. In

Evaluation methodology

This section presents all aspects related to the Helpar evaluation, starting with its prototype in Section 4.1. The parallel application and details about the evaluated metrics are addressed in Sections 4.2 (Writing a parallel application) and 4.3 (Load patterns, thresholds and metrics).

Evaluation

This section first presents which LT approach we are using as the final Helpar strategy in Section 5.1. Section 5.2 addresses an analysis of performance, energy and cost over our evaluated scenarios. Finally, Section 5.3 presents the relation between allocated and used resources across the considered workloads.

Conclusion

This article presented the Helpar model, which can be seen as an elasticity service for HPC applications. Our work advances the current state of research by offering a totally plug-and-play service from the user's viewpoint, both in terms of application and parameter writing. Helpar offers hybrid elasticity through the Live Thresholding technique, self-organizing threshold values and resource allocation to offer a competitive solution at performance and cost levels. To accomplish this, we designed

Acknowledgments

This work was partially supported by the following Brazilian Agencies: FAPERGS (352-2551/16-0), CAPES (BEX 8792/14-3) and CNPq (305531/2015-8 and 457501/2014-6).


References (40)

  • L.R. Moore et al.

    Transforming reactive auto-scaling into proactive auto-scaling

  • N. Roy et al.

    Efficient autoscaling in the cloud using predictive models for workload forecasting

  • L. Yazdanov et al.

    Vertical scaling for prioritized VMs provisioning

  • D. Breitgand, E. Henis, O. Shehory, Automated and adaptive threshold setting: Enabling technology for autonomy and...
  • E. Caron, F. Desprez, A. Muresan, Forecasting for grid and cloud computing on-demand resources based on pattern...
  • H. Ghanbari, B. Simmons, M. Litoiu, G. Iszlai, Exploring alternative approaches to implement an elasticity policy, in:...
  • M.Z. Hasan, E. Magana, A. Clemm, L. Tucker, S.L.D. Gudreddi, Integrated and autonomic cloud resource scaling, in: 2012...
  • H.C. Lim et al.

    Automated control in cloud computing: Challenges and opportunities

  • K. Gorlach, F. Leymann, Dynamic service provisioning for the cloud, in: 2012 IEEE Ninth International Conference on...
  • P. Leitner, C. Inzinger, W. Hummer, B. Satzger, S. Dustdar, Application-level performance monitoring of cloud services...

Rodrigo da Rosa Righi is a professor and researcher at the University of Vale do Rio dos Sinos (Unisinos), Brazil. He concluded his post-doctoral studies at KAIST (Korea Advanced Institute of Science and Technology), South Korea, on the topics of RFID and cloud computing. He obtained his M.S. and Ph.D. degrees in Computer Science from the Federal University of Rio Grande do Sul, Brazil, in 2005 and 2009, respectively. His research interests include performance analysis, process scheduling and migration, load balancing and resource provisioning in cluster, grid and cloud environments. In addition, he has expertise in mobile and embedded computing systems that employ Linux or WinCE operating systems.

Vinicius Facco Rodrigues completed his Bachelor's degree in Computer Science at Unisinos University in 2012 and started his master's in Applied Computing at the same university in 2014. His research areas include distributed systems and computer networks. Currently, Vinicius is focusing his research on cloud computing and, particularly, on the elasticity feature of this new paradigm.

Gustavo Rostirolla completed his Bachelor's degree in Computer Engineering at Univates in 2014 and started his master's in Applied Computing at Unisinos University in the same year. His research areas include distributed systems and computer networks. Currently, Gustavo is focusing his research on the topics of cloud computing, smart cities and energy consumption.

Cristiano André da Costa is a full professor at the Universidade do Vale do Rio dos Sinos (Unisinos), in São Leopoldo, Brazil, and a researcher of productivity at CNPq (National Council for Scientific and Technological Development). He has worked as a professor in higher education since 1997. Currently, he is the coordinator of the Applied Computing Graduate Program (PIPCA) at Unisinos. His research interests include ubiquitous, mobile, parallel and distributed computing. He obtained his Ph.D. and M.S. degrees in Computer Science from the Federal University of Rio Grande do Sul, Brazil, in 2008 and 1997, respectively. He is a member of the ACM, IEEE, IADIS and the Brazilian Computer Society (SBC).

Eduardo Roloff completed his Bachelor's degree in Computing at Unisinos University in 2005 and obtained his master's degree at the Federal University of Rio Grande do Sul in 2013, where he is now a Ph.D. student. His research areas include distributed systems and computer networks. Currently, Eduardo is focusing his research on the topic of cloud computing.

Philippe Olivier Alexandre Navaux obtained his doctorate in Computer Science from the Institut National Polytechnique de Grenoble (1979). He is currently coordinator of the Computing area at CAPES (the Brazilian Higher Education Personnel Improvement Coordination) and a professor at the Federal University of Rio Grande do Sul. He has experience in computer science with an emphasis on computer systems, acting on the following topics: computer architecture, parallel processing and databases.
