ABSTRACT
An ever-increasing number of configuration parameters are exposed to system users. Yet many users apply a single configuration setting across different workloads, leaving the performance potential of their systems untapped. A good configuration setting can greatly improve the performance of a deployed system under a given workload. But with tens or hundreds of parameters, deciding which configuration setting leads to the best performance becomes highly costly. Such a task requires strong expertise in both the system and the application, which users commonly lack.
To help users tap the performance potential of systems, we present BestConfig, a system for automatically finding a best configuration setting within a resource limit for a deployed system under a given application workload. BestConfig is designed with an extensible architecture to automate configuration tuning for general systems. To tune system configurations within a resource limit, we propose the divide-and-diverge sampling method and the recursive bound-and-search algorithm. BestConfig can improve the throughput of Tomcat by 75%, that of Cassandra by 63%, and that of MySQL by 430%, and reduce the running time of a Hive join job by about 50% and that of a Spark join job by about 80%, solely through configuration adjustment.
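The abstract names the two techniques without describing them. As a rough illustration only, the sketch below assumes that divide-and-diverge sampling behaves like Latin hypercube sampling (each parameter range is split into `k` intervals, and each interval is used exactly once) and that bound-and-search repeatedly narrows every range to the sampled neighbours around the best point found so far. The function names, the `evaluate` callback, and the parameters `k` and `rounds` are hypothetical and are not BestConfig's actual interface; the resource limit is modelled simply as `k * rounds` performance tests.

```python
import random


def divide_and_diverge(bounds, k, rng):
    """Latin-hypercube-style sampling (an assumed reading of the method):
    split each parameter range into k intervals and use every interval
    exactly once per parameter."""
    # One independent permutation of interval indices per parameter.
    perms = [rng.sample(range(k), k) for _ in bounds]
    samples = []
    for i in range(k):
        point = []
        for d, (lo, hi) in enumerate(bounds):
            width = (hi - lo) / k
            start = lo + perms[d][i] * width
            point.append(start + rng.random() * width)
        samples.append(point)
    return samples


def bound_and_search(evaluate, bounds, k, rounds, seed=0):
    """Simplified bound-and-search loop: sample, keep the best point,
    shrink the ranges around it, and repeat until the budget is spent.
    Higher scores from evaluate() are treated as better."""
    rng = random.Random(seed)
    best_point, best_score = None, float("-inf")
    tests_used = 0
    for _ in range(rounds):
        samples = divide_and_diverge(bounds, k, rng)
        scored = [(evaluate(p), p) for p in samples]
        tests_used += k
        round_score, round_best = max(scored, key=lambda sp: sp[0])
        if round_score > best_score:
            best_score, best_point = round_score, round_best
        # Bound step: for each parameter, the new range runs from the
        # closest sampled value below the best point to the closest one
        # above it; the old bound is kept when no such sample exists.
        new_bounds = []
        for d, (lo, hi) in enumerate(bounds):
            below = [p[d] for _, p in scored if p[d] < best_point[d]]
            above = [p[d] for _, p in scored if p[d] > best_point[d]]
            new_bounds.append((max(below, default=lo), min(above, default=hi)))
        bounds = new_bounds
    return best_point, best_score, tests_used
```

In a real tuning run, `evaluate` would apply a configuration to the deployed system, run the workload, and return the measured throughput (or the negated running time). For a quick check, a toy function such as `lambda p: -((p[0] - 3.0) ** 2)` can stand in for the system under test.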