
Automatic Fault Detection for Deep Learning Programs Using Graph Transformations

Published: 28 September 2021

Abstract

We are witnessing an increasing demand, in both industry and academia, for exploiting Deep Learning (DL) to solve complex real-world problems. A DL program encodes the network structure of a desirable DL model and the process by which the model learns from the training dataset. Like any software, a DL program can be faulty, which poses substantial software quality assurance challenges, especially in safety-critical domains. It is therefore crucial to equip DL development teams with efficient fault detection techniques and tools. In this article, we propose NeuraLint, a model-based fault detection approach for DL programs that uses meta-modeling and graph transformations. First, we design a meta-model for DL programs that captures their base skeleton and fundamental properties. Then, we construct a graph-based verification process covering 23 rules, defined on top of the meta-model and implemented as graph transformations, to detect faults and design inefficiencies in the generated models (i.e., instances of the meta-model). We first evaluate the proposed approach by finding faults and design inefficiencies in 28 synthesized examples built from common problems reported in the literature. NeuraLint then successfully finds 64 faults and design inefficiencies in 34 real-world DL programs extracted from Stack Overflow posts and GitHub repositories. The results show that NeuraLint effectively detects faults and design issues in both synthesized and real-world examples, with a recall of 70.5% and a precision of 100%. Although the proposed meta-model is designed for feedforward neural networks, it can be extended to support other neural network architectures such as recurrent neural networks. Researchers can also expand our set of verification rules to cover more types of issues in DL programs.



Published in

ACM Transactions on Software Engineering and Methodology, Volume 31, Issue 1 (January 2022), 665 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/3481711
Editor: Mauro Pezzè

Copyright © 2021 Association for Computing Machinery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 28 September 2021
• Accepted: 1 June 2021
• Revised: 1 May 2021
• Received: 1 July 2020


Qualifiers

• research-article
• Refereed
