ABSTRACT
Infrastructure as Code (IaC) is the practice of managing IT infrastructure via programmable configuration files (also called IaC scripts). Like other software artifacts, IaC scripts may contain security smells, which are coding patterns that can result in security weaknesses. Automated analysis tools to detect security smells in IaC scripts exist, but each focuses on a specific technology such as Puppet, Ansible, or Chef. As a result, when the detection of a new smell is implemented in one of the tools, it is not immediately available for the technologies supported by the other tools; the only option is to duplicate the effort.
This paper presents an approach that enables consistent security smell detection across different IaC technologies. We conduct a large-scale empirical study that analyzes security smells on three large datasets containing 196,755 IaC scripts and 12,281,251 LOC. We show that all categories of security smells are identified across all datasets and we identify some smells that might affect many IaC projects. To conduct this study, we developed GLITCH, a new technology-agnostic framework that enables automated polyglot smell detection by transforming IaC scripts into an intermediate representation, on which different security smell detectors can be defined. GLITCH currently supports the detection of nine different security smells in scripts written in Ansible, Chef, or Puppet. We compare GLITCH with state-of-the-art security smell detectors. The results obtained not only show that GLITCH can reduce the effort of writing security smell analyses for multiple IaC technologies, but also that it has higher precision and recall than the current state-of-the-art tools.
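The idea of defining a detector once on an intermediate representation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not GLITCH's actual design or API: the `Unit` and `Attribute` classes, the two toy front ends (`from_task_dicts` for an Ansible-like input shape, `from_resource_tuples` for a Puppet-like one), and the naming heuristic used to flag a hard-coded secret.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """One key/value setting in the hypothetical intermediate representation."""
    name: str
    value: str


@dataclass
class Unit:
    """One configured resource (e.g. a user or a file), independent of the source language."""
    kind: str
    name: str
    attributes: list = field(default_factory=list)


def from_task_dicts(tasks):
    """Toy front end for an Ansible-like shape: [{module: {arg: value, ...}}, ...]."""
    units = []
    for task in tasks:
        for kind, args in task.items():
            attrs = [Attribute(k, str(v)) for k, v in args.items() if k != "name"]
            units.append(Unit(kind, str(args.get("name", "")), attrs))
    return units


def from_resource_tuples(resources):
    """Toy front end for a Puppet-like shape: [(type, title, {param: value}), ...]."""
    return [
        Unit(kind, title, [Attribute(k, str(v)) for k, v in params.items()])
        for kind, title, params in resources
    ]


# Illustrative heuristic only: attribute names that suggest a secret.
SECRET_NAME = re.compile(r"pass(word)?|pwd|secret|token", re.IGNORECASE)


def detect_hard_coded_secrets(units):
    """Single detector, written once against the shared representation."""
    findings = []
    for unit in units:
        for attr in unit.attributes:
            looks_secret = SECRET_NAME.search(attr.name) is not None
            # Values that reference a variable or template are not literals.
            is_reference = attr.value.startswith("$") or "{{" in attr.value
            if looks_secret and attr.value and not is_reference:
                findings.append((unit.kind, unit.name, attr.name))
    return findings


# Hypothetical inputs standing in for already-parsed scripts.
tasks = [
    {"user": {"name": "deploy", "password": "hunter2"}},
    {"user": {"name": "ci", "password": "{{ vault_ci_password }}"}},
]
resources = [("user", "admin", {"password": "s3cret", "shell": "/bin/bash"})]

units = from_task_dicts(tasks) + from_resource_tuples(resources)
print(detect_hard_coded_secrets(units))
```

In this sketch, supporting another technology would mean adding one more front end, while the detector itself stays unchanged; that is the effort reduction the abstract refers to.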
Index Terms
- GLITCH: Automated Polyglot Security Smell Detection in Infrastructure as Code