ABSTRACT
Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper takes a first step toward understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings that should motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. For Storage-A, for example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. We also study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with over-designed configuration, and to provide practices for building navigation support in system software.