DOI: 10.1145/3377811.3380395

Taxonomy of real faults in deep learning systems

Published: 01 October 2020

ABSTRACT

The growing application of deep neural networks in safety-critical domains makes the analysis of faults that occur in such systems of enormous importance. In this paper we introduce a large taxonomy of faults in deep learning (DL) systems. We have manually analysed 1059 artefacts gathered from GitHub commits and issues of projects that use the most popular DL frameworks (TensorFlow, Keras and PyTorch) and from related Stack Overflow posts. Structured interviews with 20 researchers and practitioners describing the problems they have encountered in their experience have enriched our taxonomy with a variety of additional faults that did not emerge from the other two sources. Our final taxonomy was validated with a survey involving an additional set of 21 developers, confirming that almost all fault categories (13/15) were experienced by at least 50% of the survey participants.

    • Published in

      ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
      June 2020
      1640 pages
      ISBN: 9781450371216
      DOI: 10.1145/3377811

      Copyright © 2020 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate: 276 of 1,856 submissions, 15%