Abstract
Nowadays, there is growing demand in both industry and academia for exploiting Deep Learning (DL) to solve complex real-world problems. A DL program encodes the network structure of a desirable DL model and the process by which the model learns from the training dataset. Like any software, DL programs can be faulty, which poses substantial challenges for software quality assurance, especially in safety-critical domains. It is therefore crucial to equip DL development teams with efficient fault detection techniques and tools. In this article, we propose NeuraLint, a model-based fault detection approach for DL programs that uses meta-modeling and graph transformations. First, we design a meta-model for DL programs that captures their base skeleton and fundamental properties. Then, we construct a graph-based verification process covering 23 rules defined on top of the meta-model and implemented as graph transformations to detect faults and design inefficiencies in the generated models (i.e., instances of the meta-model). We evaluate the proposed approach first on 28 synthesized examples built from common problems reported in the literature, and then on 34 real-world DL programs extracted from Stack Overflow posts and GitHub repositories, in which NeuraLint successfully finds 64 faults and design inefficiencies. The results show that NeuraLint effectively detects faults and design issues in both synthesized and real-world examples, with a recall of 70.5% and a precision of 100%. Although the proposed meta-model is designed for feedforward neural networks, it can be extended to support other architectures such as recurrent neural networks. Researchers can also expand our set of verification rules to cover more types of issues in DL programs.
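To illustrate the idea of rule-based verification over a model of a DL program, the sketch below is a minimal, hypothetical Python example (not NeuraLint's actual implementation or its graph-transformation engine). It abstracts a feedforward network as an ordered list of layer nodes and checks two well-known issues: a dropout layer placed immediately before batch normalization (a documented design inefficiency), and a multi-class output layer that does not use softmax. All function and field names here are assumptions made for illustration.

```python
# Illustrative sketch only: a DL program's network structure is abstracted
# as a sequence of layer nodes, and each verification rule is a pattern
# match over that sequence. NeuraLint itself encodes such rules as graph
# transformations over instances of its meta-model.

def layer_pairs(layers):
    """Pair each layer node with its successor (None for the last layer)."""
    return list(zip(layers, layers[1:] + [None]))

def check_dropout_before_batchnorm(layers):
    """Rule sketch: dropout immediately before batch normalization causes
    a train/test variance shift and is flagged as a design inefficiency."""
    issues = []
    for node, succ in layer_pairs(layers):
        if node["type"] == "Dropout" and succ and succ["type"] == "BatchNorm":
            issues.append("dropout placed immediately before batch normalization")
    return issues

def check_last_layer_activation(layers, num_classes):
    """Rule sketch: a multi-class classifier should end with softmax."""
    last = layers[-1]
    if num_classes > 2 and last.get("activation") != "softmax":
        return ["multi-class output layer should use softmax"]
    return []

# A toy faulty model: both rules above fire on it.
model = [
    {"type": "Dense", "units": 128, "activation": "relu"},
    {"type": "Dropout", "rate": 0.5},
    {"type": "BatchNorm"},
    {"type": "Dense", "units": 10, "activation": "sigmoid"},
]

for fault in check_dropout_before_batchnorm(model) + check_last_layer_activation(model, 10):
    print("fault:", fault)
```

In the actual approach, such checks are expressed declaratively as graph-transformation rules applied to a graph extracted from the DL program, rather than as hand-written Python predicates.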
Index Terms
- Automatic Fault Detection for Deep Learning Programs Using Graph Transformations