DOI: 10.1145/1383422.1383434 (ACM, HPDC Conference Proceedings)
research-article

Combining batch execution and leasing using virtual machines

Published: 23 June 2008

ABSTRACT

As cluster computers are used for a wider range of applications, we encounter the need to deliver resources at particular times, to meet particular deadlines, and/or at the same time as other resources are provided elsewhere. To address such requirements, we describe a scheduling approach in which users request resource leases, where leases can request either as-soon-as-possible ("best-effort") or reservation start times. We present the design of a lease management architecture, Haizea, that implements leases as virtual machines (VMs), leveraging their ability to suspend, migrate, and resume computations and to provide leased resources with customized application environments. We discuss methods to minimize the overhead introduced by having to deploy VM images before the start of a lease. We also present the results of simulation studies that compare alternative approaches. Using workloads with various mixes of best-effort and advance reservation requests, we compare the performance of our VM-based approach with that of non-VM-based schedulers. We find that a VM-based approach can provide better performance (measured in terms of both total execution time and average delay incurred by best-effort requests) than a scheduler that does not support task pre-emption, and only slightly worse performance than a scheduler that does support task pre-emption. We also compare the impact of different VM image popularity distributions and VM image caching strategies on performance. These results emphasize the importance of VM image caching for the workloads studied and quantify the sensitivity of scheduling performance to VM image popularity distribution.
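
The two lease types described in the abstract, and the reason suspend/resume helps best-effort requests, can be illustrated with a small sketch. This is not Haizea's scheduling algorithm: the `Lease` class, the `best_effort_finish` function, and the abstract time units are hypothetical names introduced here. The sketch assumes a single best-effort lease that starts at time `now`, a single advance reservation that starts no earlier than `now`, and zero cost for suspending, resuming, and deploying VM images (overheads the paper itself treats as significant).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Hypothetical lease request; not Haizea's actual data model."""
    name: str
    nodes: int                   # number of nodes requested
    duration: int                # requested run time, in abstract time units
    start: Optional[int] = None  # None = best-effort; a value = advance reservation

    @property
    def is_best_effort(self) -> bool:
        return self.start is None


def best_effort_finish(be: Lease, ar: Lease, total_nodes: int,
                       now: int = 0, preemptible: bool = True) -> int:
    """Finish time of best-effort lease `be` when reservation `ar` is also admitted.

    Assumes ar.start >= now and ignores all suspend/resume/deployment overhead.
    """
    ar_end = ar.start + ar.duration
    # Both leases fit side by side, or `be` completes before the reservation starts.
    if be.nodes + ar.nodes <= total_nodes or now + be.duration <= ar.start:
        return now + be.duration
    if preemptible:
        # Run until the reservation starts, suspend, then resume the remaining work.
        done_before_ar = ar.start - now
        return ar_end + (be.duration - done_before_ar)
    # Without preemption the lease cannot start until the reservation ends.
    return ar_end + be.duration
```

For example, on a 4-node cluster, a 10-unit best-effort lease on 4 nodes competing with a 5-unit reservation on 4 nodes starting at time 6 would finish at time 15 with suspension and at time 21 without it, under the zero-overhead assumption above.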
