research-article

Modeling virtualized applications using machine learning techniques

Published: 03 March 2012

Abstract

With the growing adoption of virtualized datacenters and cloud hosting services, the allocation and sizing of resources such as CPU, memory, and I/O bandwidth for virtual machines (VMs) is becoming increasingly important. Accurate performance modeling of an application would help users size their VMs better, thus reducing costs. It can also benefit cloud service providers, who can offer a new charging model based on the VMs' performance instead of their configured sizes. In this paper, we present techniques to model the performance of a VM-hosted application as a function of the resources allocated to the VM and the resource contention it experiences. To address this multi-dimensional modeling problem, we propose and refine the use of two machine learning techniques: artificial neural network (ANN) and support vector machine (SVM). We evaluate these modeling techniques using five virtualized applications from the RUBiS and Filebench suites of benchmarks and demonstrate that their median and 90th percentile prediction errors are within 4.36% and 29.17% respectively. These results are substantially better than regression-based approaches as well as direct applications of machine learning techniques without our refinements. We also present a simple and effective approach to VM sizing and empirically demonstrate that it delivers optimal results for 65% of the sizing problems that we studied and produces close-to-optimal sizes for the remaining 35%.
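The two quantities the abstract reports, percentile prediction error and model-driven VM sizing, can be illustrated with a minimal sketch. This is not the paper's method: `toy_model` is a hypothetical stand-in for a trained ANN or SVM, the sample measurements and candidate sizes are invented for illustration, the nearest-rank percentile and the additive cost function are assumptions, and the exhaustive search is just one plausible way to use a performance model for sizing.

```python
import math
from itertools import product
from statistics import median


def relative_errors(actual, predicted):
    """Absolute relative prediction error, in percent, for each sample."""
    return [abs(p - a) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def smallest_sufficient_size(model, candidates, target, cost):
    """Cheapest candidate allocation whose predicted performance meets the target."""
    feasible = [c for c in candidates if model(c) >= target]
    return min(feasible, key=cost) if feasible else None


def toy_model(alloc):
    """Hypothetical model: throughput limited by the scarcer of CPU and memory."""
    cpu_mhz, mem_mb = alloc
    return min(cpu_mhz / 10.0, mem_mb / 5.0)


# Invented measurements and predictions, only to exercise the error metrics.
errors = relative_errors([100, 200, 50, 80], [104, 190, 50, 100])
median_error = median(errors)
p90_error = percentile(errors, 90)

# Invented candidate sizes: every (CPU MHz, memory MB) combination.
candidates = list(product([500, 1000, 1500, 2000], [256, 512, 1024]))
best = smallest_sufficient_size(toy_model, candidates, target=100.0, cost=sum)

print(median_error, p90_error, best)
```

Reporting both the median and a high percentile, as the abstract does, separates typical accuracy from tail behavior: a model can have a small median error and still mispredict badly on a minority of configurations.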


      • Published in

        ACM SIGPLAN Notices  Volume 47, Issue 7
        VEE '12
        July 2012
        229 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/2365864
        • ACM Conferences
          VEE '12: Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
          March 2012
          248 pages
          ISBN:9781450311762
          DOI:10.1145/2151024

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 3 March 2012

        Qualifiers

        • research-article
