ABSTRACT
Cloud computing benefits extensively from economies of scale to provide cost-effective computing. Recently, reliability has been introduced as a potential tradeoff point for delivering compute resources while further decreasing the price of cloud resources. The use of fair market conditions creates an environment where sellers and buyers of compute resources can benefit from trading their resources, and more efficient resource use can potentially be achieved as a result. While auction-based infrastructure has many advantages, there are currently no practical computing platforms that can harness such volatile environments effectively. This work reports a methodology and a toolkit designed to address the challenges of using volatile, auction-based cloud resources for MPI applications.
Specifically, we emphasize the use of dynamically adjusted optimal checkpoint-restart (CPR) intervals. We discuss an initial analytical model for dealing with price histories and selecting optimal checkpoint intervals. We also describe the SpotMPI toolkit, which can be used to achieve practical execution of MPI applications on volatile auction-based cloud platforms. The result of this exploration is a synthesis of the intrinsic dependencies that exist in MPI-based parallel applications with the publicly available price histories of HPC cloud resources on the Amazon cloud. We study algorithms with different computation vs. communication complexities. Our results reveal counter-intuitive insights into optimal bidding and application scaling strategies.
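To make the link between a price history and a checkpoint interval concrete, the following is a minimal sketch and not the SpotMPI model itself. It assumes a uniformly sampled price trace, treats every sample above the bid as an out-of-bid termination, and applies the classical first-order approximation of the optimum checkpoint interval, sqrt(2 * C * M), where C is the cost of one checkpoint and M is the mean time between failures. The function names, the sample prices, and the one-hour sampling step are hypothetical.

```python
import math


def mean_time_to_out_of_bid(prices, bid, step_hours=1.0):
    """Estimate the mean uninterrupted run length in hours.

    Assumption: the instance runs while price <= bid and is terminated
    as soon as a sample exceeds the bid. The final run is counted even
    if the trace ends before a termination, which is a simplification.
    """
    runs = []
    current = 0
    for price in prices:
        if price <= bid:
            current += 1
        elif current > 0:
            runs.append(current)
            current = 0
    if current > 0:
        runs.append(current)
    if not runs:
        return 0.0  # the bid never covers the price in this trace
    return step_hours * sum(runs) / len(runs)


def first_order_checkpoint_interval(checkpoint_cost, mtbf):
    """First-order optimum checkpoint interval: sqrt(2 * C * M).

    Both arguments must use the same time unit (hours here).
    """
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


# Hypothetical hourly price trace (USD per instance-hour) and bid.
trace = [0.10, 0.10, 0.10, 0.10, 0.50, 0.10, 0.10, 0.10, 0.10, 0.50]
mtbf = mean_time_to_out_of_bid(trace, bid=0.20)
interval = first_order_checkpoint_interval(checkpoint_cost=0.02, mtbf=mtbf)
print(f"estimated MTBF: {mtbf:.2f} h, checkpoint every {interval:.2f} h")
```

Raising the bid lengthens the estimated uninterrupted runs and therefore the checkpoint interval, at the price of a higher hourly cost. That is the kind of tradeoff a bidding strategy has to weigh against the application's scaling behavior.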
Index Terms
- ACM SRC poster: SpotMPI: auction-based high performance cloud computing