DOI: 10.1145/3127479.3128605
research-article

BestConfig: tapping the performance potential of systems via automatic configuration tuning

Published: 24 September 2017

ABSTRACT

An ever-increasing number of configuration parameters are provided to system users. But many users apply a single configuration setting across different workloads, leaving the performance potential of systems untapped. A good configuration setting can greatly improve the performance of a deployed system under certain workloads. But with tens or hundreds of parameters, deciding which configuration setting leads to the best performance becomes a highly costly task. Such a task requires strong expertise in both the system and the application, and users commonly lack that expertise.

To help users tap the performance potential of systems, we present BestConfig, a system that automatically finds a best configuration setting within a resource limit for a deployed system under a given application workload. BestConfig is designed with an extensible architecture to automate configuration tuning for general systems. To tune system configurations within a resource limit, we propose the divide-and-diverge sampling method and the recursive bound-and-search algorithm. Solely by configuration adjustment, BestConfig can improve the throughput of Tomcat by 75%, that of Cassandra by 63%, and that of MySQL by 430%, and reduce the running time of a Hive join job by about 50% and that of a Spark join job by about 80%.
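The abstract names the two techniques without describing them, so the following Python sketch is only one plausible reading and not the authors' implementation. It assumes that divide-and-diverge sampling splits each parameter range into k intervals and uses every interval exactly once (Latin-hypercube style), and that bound-and-search shrinks the search space to the region between the best sample's nearest sampled neighbours before sampling again. The function names, the `measure` callback, and the default `k` and `rounds` values are all hypothetical.

```python
import random


def divide_and_diverge(bounds, k, rng):
    """Draw k points so that each of the k intervals of every parameter
    is used exactly once (an assumed, Latin-hypercube-style reading)."""
    columns = []
    for low, high in bounds:
        width = (high - low) / k
        # One random value inside each interval, then shuffle the order.
        values = [low + (i + rng.random()) * width for i in range(k)]
        rng.shuffle(values)
        columns.append(values)
    # Point j combines the j-th shuffled value of every parameter.
    return [tuple(col[j] for col in columns) for j in range(k)]


def bound_around(best, samples, bounds):
    """Shrink each dimension to the nearest sampled values on either
    side of the best point (assumed bounding rule)."""
    new_bounds = []
    for d, (low, high) in enumerate(bounds):
        below = [s[d] for s in samples if s[d] < best[d]]
        above = [s[d] for s in samples if s[d] > best[d]]
        new_bounds.append((max(below, default=low), min(above, default=high)))
    return new_bounds


def tune(measure, bounds, k=6, rounds=4, seed=0):
    """Spend rounds * k measurements (the resource limit) looking for
    the configuration with the highest measured score."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        samples = divide_and_diverge(bounds, k, rng)
        round_score, round_best = max((measure(s), s) for s in samples)
        if round_score > best_score:
            best, best_score = round_best, round_score
            bounds = bound_around(best, samples, bounds)
        # Otherwise keep the current bounds and simply resample them.
    return best, best_score
```

In this sketch `measure` stands in for running the workload against a deployed system and returning a score to maximise, such as throughput; for a running-time target the score could be negated.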


            Published in

              SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing
              September 2017
              672 pages
              ISBN: 9781450350280
              DOI: 10.1145/3127479

              Copyright © 2017 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

              Publisher

              Association for Computing Machinery

              New York, NY, United States


              Acceptance Rates

              Overall Acceptance Rate: 169 of 722 submissions, 23%
