Skip to main content
Erschienen in: Empirical Software Engineering 1/2024

01.02.2024

An empirical study of task infections in Ansible scripts

verfasst von: Akond Rahman, Dibyendu Brinto Bose, Yue Zhang, Rahul Pandita

Erschienen in: Empirical Software Engineering | Ausgabe 1/2024

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Context

Despite being beneficial for managing computing infrastructure at scale, Ansible scripts include security weaknesses, such as hard-coded passwords. Security weaknesses can propagate into tasks, i.e., code constructs used for managing computing infrastructure with Ansible. Propagation of security weaknesses into tasks makes the provisioned infrastructure susceptible to security attacks. A systematic characterization of task infection, i.e., the propagation of security weaknesses into tasks, can aid practitioners and researchers in understanding how security weaknesses propagate into tasks and derive insights for practitioners to develop Ansible scripts securely.

Objective

The goal of the paper is to help practitioners and researchers understand how Ansible-managed computing infrastructure is impacted by security weaknesses by conducting an empirical study of task infections in Ansible scripts.

Method

We conduct an empirical study where we quantify the frequency of task infections in Ansible scripts. Upon detection of task infections, we apply qualitative analysis to determine task infection categories. We also conduct a survey with 23 practitioners to determine the prevalence and severity of identified task infection categories. With logistic regression analysis, we identify development factors that correlate with presence of task infections.

Results

In all, we identify 1,805 task infections in 27,213 scripts. We identify six task infection categories: anti-virus, continuous integration, data storage, message broker, networking, and virtualization. From our survey, we observe tasks used to manage data storage infrastructure perceived to have the most severe consequences. We also find three development factors, namely age, minor contributors, and scatteredness to correlate with the presence of task infections.

Conclusion

Our empirical study shows computing infrastructure managed by Ansible scripts to be impacted by security weaknesses. We conclude the paper by discussing the implications of our findings for practitioners and researchers.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Fußnoten
1
https://pyyaml.org/
 
2
https://www.vaultproject.io/
 
Literatur
Zurück zum Zitat Agrawal A, Rahman A, Krishna R, Sobran A, Menzies T (2018) We don’t need another hero?: The impact of “heroes” on software development. In: Proceedings of the 40th international conference on software engineering: software engineering in practice, ACM, New York, NY, USA, ICSE-SEIP ’18, pp 245–253. https://doi.org/10.1145/3183519.3183549 Agrawal A, Rahman A, Krishna R, Sobran A, Menzies T (2018) We don’t need another hero?: The impact of “heroes” on software development. In: Proceedings of the 40th international conference on software engineering: software engineering in practice, ACM, New York, NY, USA, ICSE-SEIP ’18, pp 245–253. https://​doi.​org/​10.​1145/​3183519.​3183549
Zurück zum Zitat Aho AV, Sethi R, Ullman JD (1986) Compilers, principles, techniques. Addison wesley 7(8):9 Aho AV, Sethi R, Ullman JD (1986) Compilers, principles, techniques. Addison wesley 7(8):9
Zurück zum Zitat Banavar G, Chandra TD, Strom RE, Sturman DC (1999) A case for message oriented middleware. In: Proceedings of the 13th international symposium on distributed computing, Springer-Verlag, Berlin, Heidelberg, pp 1–18 Banavar G, Chandra TD, Strom RE, Sturman DC (1999) A case for message oriented middleware. In: Proceedings of the 13th international symposium on distributed computing, Springer-Verlag, Berlin, Heidelberg, pp 1–18
Zurück zum Zitat Bird C, Nagappan N, Murphy B, Gall H, Devanbu P (2011) Don’t touch my code! examining the effects of ownership on software quality. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European conference on foundations of software engineering, association for computing machinery, New York, NY, USA, ESEC/FSE ’11, pp 4–14. https://doi.org/10.1145/2025113.2025119 Bird C, Nagappan N, Murphy B, Gall H, Devanbu P (2011) Don’t touch my code! examining the effects of ownership on software quality. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European conference on foundations of software engineering, association for computing machinery, New York, NY, USA, ESEC/FSE ’11, pp 4–14. https://​doi.​org/​10.​1145/​2025113.​2025119
Zurück zum Zitat Borovits N, Kumara I, Di Nucci D, Krishnan P, Palma SD, Palomba F, Tamburri DA, Heuvel WJvd, (2022) Findici: Using machine learning to detect linguistic inconsistencies between code and natural language descriptions in infrastructure-as-code. Empir Softw Eng 27(7):178CrossRef Borovits N, Kumara I, Di Nucci D, Krishnan P, Palma SD, Palomba F, Tamburri DA, Heuvel WJvd, (2022) Findici: Using machine learning to detect linguistic inconsistencies between code and natural language descriptions in infrastructure-as-code. Empir Softw Eng 27(7):178CrossRef
Zurück zum Zitat Carver JC (2010) Towards reporting guidelines for experimental replications: a proposal. In: 1st international workshop on replication in empirical software engineering, vol 1, pp 1–4 Carver JC (2010) Towards reporting guidelines for experimental replications: a proposal. In: 1st international workshop on replication in empirical software engineering, vol 1, pp 1–4
Zurück zum Zitat Cohen P, West SG, Aiken LS (2014) Applied multiple regression/correlation analysis for the behavioral sciences. Psychology PressCrossRef Cohen P, West SG, Aiken LS (2014) Applied multiple regression/correlation analysis for the behavioral sciences. Psychology PressCrossRef
Zurück zum Zitat Cramer D, Howitt DL (2004) The Sage dictionary of statistics: a practical resource for students in the social sciences. SageCrossRef Cramer D, Howitt DL (2004) The Sage dictionary of statistics: a practical resource for students in the social sciences. SageCrossRef
Zurück zum Zitat Da Silva FQ, Suassuna M, França ACC, Grubb AM, Gouveia TB, Monteiro CV, dos Santos IE (2014) Replication of empirical studies in software engineering research: a systematic mapping study. Empir Softw Eng 19:501–557 Da Silva FQ, Suassuna M, França ACC, Grubb AM, Gouveia TB, Monteiro CV, dos Santos IE (2014) Replication of empirical studies in software engineering research: a systematic mapping study. Empir Softw Eng 19:501–557
Zurück zum Zitat Dalla Palma S, Di Nucci D, Palomba F, Tamburri DA (2020) Toward a catalog of software quality metrics for infrastructure code. J Syst Softw 170:110726CrossRef Dalla Palma S, Di Nucci D, Palomba F, Tamburri DA (2020) Toward a catalog of software quality metrics for infrastructure code. J Syst Softw 170:110726CrossRef
Zurück zum Zitat Duvall P, Matyas SM, Glover A (2007) Continuous integration: improving software quality and reducing risk (The Addison-Wesley Signature Series). Addison-Wesley Professional Duvall P, Matyas SM, Glover A (2007) Continuous integration: improving software quality and reducing risk (The Addison-Wesley Signature Series). Addison-Wesley Professional
Zurück zum Zitat Easterbrook S, Singer J, Storey MA, Damian D (2008) Selecting Empirical Methods for Software Engineering Research. Springer, London, London, pp 285–311 Easterbrook S, Singer J, Storey MA, Damian D (2008) Selecting Empirical Methods for Software Engineering Research. Springer, London, London, pp 285–311
Zurück zum Zitat Gelman A, Hill J (2006) Data analysis using regression and multilevel/hierarchical models. Cambridge University PressCrossRef Gelman A, Hill J (2006) Data analysis using regression and multilevel/hierarchical models. Cambridge University PressCrossRef
Zurück zum Zitat Greenwood PE, Nikulin MS (1996) A guide to chi-squared testing, vol 280. Wiley Greenwood PE, Nikulin MS (1996) A guide to chi-squared testing, vol 280. Wiley
Zurück zum Zitat Hortlund A (2021) Security smells in open-source infrastructure as code scripts: A replication study Hortlund A (2021) Security smells in open-source infrastructure as code scripts: A replication study
Zurück zum Zitat Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, vol 398. WileyCrossRef Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, vol 398. WileyCrossRef
Zurück zum Zitat Humble J, Farley D (2010) Continuous delivery: reliable software releases through build, test, and deployment automation, 1st edn. Addison-Wesley Professional Humble J, Farley D (2010) Continuous delivery: reliable software releases through build, test, and deployment automation, 1st edn. Addison-Wesley Professional
Zurück zum Zitat Krein JL, Knutson CD (2010) A case for replication : synthesizing research methodologies in software engineering Krein JL, Knutson CD (2010) A case for replication : synthesizing research methodologies in software engineering
Zurück zum Zitat Krishna R, Agrawal A, Rahman A, Sobran A, Menzies T (2018) What is the connection between issues, bugs, and enhancements?: Lessons learned from 800+ software projects. In: Proceedings of the 40th international conference on software engineering: software engineering in practice, ACM, New York, NY, USA, ICSE-SEIP ’18, pp 306–315. https://doi.org/10.1145/3183519.3183548 Krishna R, Agrawal A, Rahman A, Sobran A, Menzies T (2018) What is the connection between issues, bugs, and enhancements?: Lessons learned from 800+ software projects. In: Proceedings of the 40th international conference on software engineering: software engineering in practice, ACM, New York, NY, USA, ICSE-SEIP ’18, pp 306–315. https://​doi.​org/​10.​1145/​3183519.​3183548
Zurück zum Zitat Lombardi MM, Oblinger DG (2007) Authentic learning for the 21st century: an overview. Educause learning initiative 1(2007):1–12 Lombardi MM, Oblinger DG (2007) Authentic learning for the 21st century: an overview. Educause learning initiative 1(2007):1–12
Zurück zum Zitat Long JS, Freese J (2006) Regression models for categorical dependent variables using Stata, vol 7. Stata Press Long JS, Freese J (2006) Regression models for categorical dependent variables using Stata, vol 7. Stata Press
Zurück zum Zitat Meli M, McNiece MR, Reaves B (2019) How bad can it git? characterizing secret leakage in public github repositories. In: NDSS Meli M, McNiece MR, Reaves B (2019) How bad can it git? characterizing secret leakage in public github repositories. In: NDSS
Zurück zum Zitat Menard S (2002) Applied logistic regression analysis. 106, Sage Menard S (2002) Applied logistic regression analysis. 106, Sage
Zurück zum Zitat Opdebeeck R, Zerouali A, De Roover C (2022) Smelly variables in ansible infrastructure code: Detection, prevalence, and lifetime. In: 2022 IEEE/ACM 18th international conference on mining software repositories (MSR), IEEE Opdebeeck R, Zerouali A, De Roover C (2022) Smelly variables in ansible infrastructure code: Detection, prevalence, and lifetime. In: 2022 IEEE/ACM 18th international conference on mining software repositories (MSR), IEEE
Zurück zum Zitat Opdebeeck R, Zerouali A, Roover CD (2023) Control and data flow in security smell detection for infrastructure as code: Is it worth the effort? In: 2023 IEEE/ACM 19th international conference on mining software repositories (MSR) Opdebeeck R, Zerouali A, Roover CD (2023) Control and data flow in security smell detection for infrastructure as code: Is it worth the effort? In: 2023 IEEE/ACM 19th international conference on mining software repositories (MSR)
Zurück zum Zitat Radjenović D, Heričko M, Torkar R, Živkovič A (2013) Software fault prediction metrics: a systematic literature review. Inf Softw Technol 55(8):1397–1418CrossRef Radjenović D, Heričko M, Torkar R, Živkovič A (2013) Software fault prediction metrics: a systematic literature review. Inf Softw Technol 55(8):1397–1418CrossRef
Zurück zum Zitat Rahman A, Agrawal A, Krishna R, Sobran A (2018a) Characterizing the influence of continuous integration: Empirical results from 250+ open source and proprietary projects. In: Proceedings of the 4th ACM SIGSOFT International Workshop on Software Analytics, ACM, New York, NY, USA, SWAN 2018, pp 8–14. https://doi.org/10.1145/3278142.3278149 Rahman A, Agrawal A, Krishna R, Sobran A (2018a) Characterizing the influence of continuous integration: Empirical results from 250+ open source and proprietary projects. In: Proceedings of the 4th ACM SIGSOFT International Workshop on Software Analytics, ACM, New York, NY, USA, SWAN 2018, pp 8–14. https://​doi.​org/​10.​1145/​3278142.​3278149
Zurück zum Zitat Rahman A, Parnin C, Williams L (2019) The seven sins: security smells in infrastructure as code scripts. In: 2019 IEEE/ACM 41st international conference on software engineering (ICSE), IEEE, pp 164–175 Rahman A, Parnin C, Williams L (2019) The seven sins: security smells in infrastructure as code scripts. In: 2019 IEEE/ACM 41st international conference on software engineering (ICSE), IEEE, pp 164–175
Zurück zum Zitat Rahman A, Farhana E, Parnin C, Williams L (2020a) Gang of eight: a defect taxonomy for infrastructure as code scripts. In: Proceedings of the ACM/IEEE 42nd international conference on software engineering, association for computing machinery, New York, NY, USA, ICSE ’20, p 752–764. https://doi.org/10.1145/3377811.3380409 Rahman A, Farhana E, Parnin C, Williams L (2020a) Gang of eight: a defect taxonomy for infrastructure as code scripts. In: Proceedings of the ACM/IEEE 42nd international conference on software engineering, association for computing machinery, New York, NY, USA, ICSE ’20, p 752–764. https://​doi.​org/​10.​1145/​3377811.​3380409
Zurück zum Zitat Rahman A, Farhana E, Williams L (2020) The ’as code’activities: development anti-patterns for infrastructure as code. Empir Softw Eng 25(5):3430–3467CrossRef Rahman A, Farhana E, Williams L (2020) The ’as code’activities: development anti-patterns for infrastructure as code. Empir Softw Eng 25(5):3430–3467CrossRef
Zurück zum Zitat Reis S, Abreu R, d’Amorim M, Fortunato D (2023) Leveraging practitioners’ feedback to improve a security linter. In: Proceedings of the 37th IEEE/ACM international conference on automated software engineering, association for computing machinery, New York, NY, USA, ASE ’22. https://doi.org/10.1145/3551349.3560419 Reis S, Abreu R, d’Amorim M, Fortunato D (2023) Leveraging practitioners’ feedback to improve a security linter. In: Proceedings of the 37th IEEE/ACM international conference on automated software engineering, association for computing machinery, New York, NY, USA, ASE ’22. https://​doi.​org/​10.​1145/​3551349.​3560419
Zurück zum Zitat Saavedra N, Ferreira JaF (2023) Glitch: Automated polyglot security smell detection in infrastructure as code. In: Proceedings of the 37th IEEE/ACM international conference on automated software engineering, association for computing machinery, New York, NY, USA, ASE ’22. https://doi.org/10.1145/3551349.3556945 Saavedra N, Ferreira JaF (2023) Glitch: Automated polyglot security smell detection in infrastructure as code. In: Proceedings of the 37th IEEE/ACM international conference on automated software engineering, association for computing machinery, New York, NY, USA, ASE ’22. https://​doi.​org/​10.​1145/​3551349.​3556945
Zurück zum Zitat Saldaña J (2015) The coding manual for qualitative researchers. Sage Saldaña J (2015) The coding manual for qualitative researchers. Sage
Zurück zum Zitat Shull FJ, Carver JC, Vegas S, Juristo N (2008) The role of replications in empirical software engineering. Empir Softw Eng 13:211–218CrossRef Shull FJ, Carver JC, Vegas S, Juristo N (2008) The role of replications in empirical software engineering. Empir Softw Eng 13:211–218CrossRef
Zurück zum Zitat Smith J, Johnson B, Murphy-Hill E, Chu B, Lipford HR (2015) Questions developers ask while diagnosing potential security vulnerabilities with static analysis. In: Proceedings of the 2015 10th joint meeting on foundations of software engineering, association for computing machinery, New York, NY, USA, ESEC/FSE 2015, p 248–259. https://doi.org/10.1145/2786805.2786812 Smith J, Johnson B, Murphy-Hill E, Chu B, Lipford HR (2015) Questions developers ask while diagnosing potential security vulnerabilities with static analysis. In: Proceedings of the 2015 10th joint meeting on foundations of software engineering, association for computing machinery, New York, NY, USA, ESEC/FSE 2015, p 248–259. https://​doi.​org/​10.​1145/​2786805.​2786812
Zurück zum Zitat Tan PN, Steinbach M, Kumar V (2005) Introduction to Data Mining, 1st edn. Addison-Wesley Longman Publishing Co., Inc, Boston, MA, USA Tan PN, Steinbach M, Kumar V (2005) Introduction to Data Mining, 1st edn. Addison-Wesley Longman Publishing Co., Inc, Boston, MA, USA
Metadaten
Titel
An empirical study of task infections in Ansible scripts
verfasst von
Akond Rahman
Dibyendu Brinto Bose
Yue Zhang
Rahul Pandita
Publikationsdatum
01.02.2024
Verlag
Springer US
Erschienen in
Empirical Software Engineering / Ausgabe 1/2024
Print ISSN: 1382-3256
Elektronische ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-023-10432-6

Weitere Artikel der Ausgabe 1/2024

Empirical Software Engineering 1/2024 Zur Ausgabe

Premium Partner