Abstract
Configuration errors are one of the major causes of system failures and of compromised reliability in today's software systems. Although many techniques have been proposed for detecting configuration errors, these approaches can generally be applied only after an error has occurred. Proactively verifying configuration files is challenging because 1) software configurations are typically written in poorly structured and untyped “languages”, and 2) specifying rules for configuration verification is difficult in practice. This paper presents ConfigV, a verification framework for general software configurations. The framework works in two stages: in a pre-processing stage, it automatically derives a specification; it then checks whether a given configuration file adheres to that specification. Learning a specification takes three steps. First, ConfigV parses a training set of configuration files (not necessarily all correct) into a well-structured, probabilistically typed intermediate representation. Second, using association rule learning, ConfigV learns rules from these intermediate representations; the rules establish relationships between the keywords appearing in the files. Finally, ConfigV employs rule graph analysis to refine the resulting rules. ConfigV can detect various configuration errors, including ordering errors, integer correlation errors, type errors, and missing entry errors. We evaluated ConfigV by verifying public configuration files on GitHub, and we show that ConfigV can detect known configuration errors in these files.
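To make the rule-learning step concrete, the following is a minimal sketch of association rule learning over configuration keywords. It is not ConfigV's implementation: it assumes each training file has already been reduced to the set of keywords it contains, it learns only pairwise "if A appears then B appears" rules (roughly the missing-entry case), and the function names, thresholds, and MySQL-style keywords are hypothetical choices for illustration.

```python
from itertools import permutations


def learn_rules(training_sets, min_support=0.5, min_confidence=0.7):
    """Learn pairwise rules (a, b) meaning 'if a appears, b should appear'.

    training_sets: list of keyword sets, one per training file (assumed input).
    Returns a dict mapping (a, b) to (support, confidence).
    """
    n = len(training_sets)
    keywords = set().union(*training_sets)
    rules = {}
    for a, b in permutations(sorted(keywords), 2):
        count_a = sum(1 for s in training_sets if a in s)
        count_ab = sum(1 for s in training_sets if a in s and b in s)
        support = count_ab / n           # fraction of files containing both
        confidence = count_ab / count_a  # fraction of files with a that also have b
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules


def check(config_keywords, rules):
    """Return the learned rules that a configuration's keyword set violates."""
    return sorted(
        (a, b) for (a, b) in rules
        if a in config_keywords and b not in config_keywords
    )


# Hypothetical training set; the last file is deliberately imperfect,
# since the training files are not assumed to be all correct.
training = [
    {"port", "socket", "datadir"},
    {"port", "socket"},
    {"port", "socket", "log_error"},
    {"port", "datadir"},
]

rules = learn_rules(training)
violations = check({"port", "datadir"}, rules)
print(violations)
```

Because rules are kept only above the support and confidence thresholds, a single incorrect training file does not prevent a rule from being learned; the thresholds used here are arbitrary and would need tuning on real data.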
Index Terms
- Synthesizing configuration file specifications with association rule learning