research-article

A Cooperative Predictive Control Approach to Improve the Reconfiguration Stability of Adaptive Distributed Parallel Applications

Published: 01 March 2014

Abstract

Adaptiveness in distributed parallel applications is a key feature for providing satisfactory performance in the face of unexpected events such as workload variations and time-varying user requirements. The adaptation process is based on the ability to change specific characteristics of parallel components (e.g., their parallelism degree) and to guarantee that such modifications of the application configuration are effective and durable. Reconfigurations often incur a cost on the execution (a performance overhead and/or an economic cost). For this reason, advanced adaptation strategies have become of paramount importance. Effective strategies must achieve properties such as control optimality (making decisions that optimize the global application QoS), reconfiguration stability (expressed in terms of the average time between consecutive reconfigurations of the same component), and an optimized reconfiguration amplitude (the number of allocated/deallocated resources). To control such parameters, in this article we propose a method based on a Cooperative Model-based Predictive Control approach in which application controllers cooperate to make optimal reconfigurations while taking into account the durability and amplitude of their control decisions. The effectiveness and feasibility of the methodology are demonstrated through experiments performed in a simulation environment and by comparison with other existing techniques.
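The two reconfiguration metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical trace of `(time, parallelism_degree)` samples for a single component, and it assumes that amplitude is averaged over reconfigurations as the absolute change in parallelism degree. The function name `reconfiguration_metrics` and the sample values are invented for illustration.

```python
from statistics import mean


def reconfiguration_metrics(trace):
    """Compute (stability, amplitude) for ONE component.

    trace: list of (time, parallelism_degree) samples sorted by time
           (hypothetical input format, assumed for this sketch).
    stability: average time between consecutive reconfigurations,
               or None if fewer than two reconfigurations occurred.
    amplitude: average number of resources allocated or deallocated
               per reconfiguration (assumed averaging; 0 if none occurred).
    """
    change_times = []
    amplitudes = []
    # A reconfiguration is any sample whose degree differs from the previous one.
    for (_, prev_degree), (t, degree) in zip(trace, trace[1:]):
        if degree != prev_degree:
            change_times.append(t)
            amplitudes.append(abs(degree - prev_degree))

    # Gaps between successive reconfiguration instants.
    gaps = [later - earlier for earlier, later in zip(change_times, change_times[1:])]
    stability = mean(gaps) if gaps else None
    amplitude = mean(amplitudes) if amplitudes else 0
    return stability, amplitude


# Illustrative trace: the degree changes at t=10, t=30, and t=60.
trace = [(0, 2), (10, 4), (20, 4), (30, 3), (60, 6)]
stability, amplitude = reconfiguration_metrics(trace)
print(stability, amplitude)
```

Under these assumptions, a larger stability value means each configuration lasts longer, and a smaller amplitude means fewer resources are moved per reconfiguration.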

      • Published in

        ACM Transactions on Autonomous and Adaptive Systems  Volume 9, Issue 1
        March 2014
        121 pages
        ISSN:1556-4665
        EISSN:1556-4703
        DOI:10.1145/2597760
        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 March 2014
        • Accepted: 1 December 2013
        • Revised: 1 July 2013
        • Received: 1 June 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
