ABSTRACT
While Deep Neural Networks (DNNs) have established the fundamentals of image-based autonomous driving systems, they may exhibit erroneous behaviors and cause fatal accidents. To address the safety issues in autonomous driving systems, a recent set of testing techniques have been designed to automatically generate artificial driving scenes to enrich test suite, e.g., generating new input images transformed from the original ones. However, these techniques are insufficient due to two limitations: first, many such synthetic images often lack diversity of driving scenes, and hence compromise the resulting efficacy and reliability. Second, for machine-learning-based systems, a mismatch between training and application domain can dramatically degrade system accuracy, such that it is necessary to validate inputs for improving system robustness.
In this paper, we propose DeepRoad, an unsupervised DNN-based framework for automatically testing the consistency of DNN-based autonomous driving systems and online validation. First, DeepRoad automatically synthesizes large amounts of diverse driving scenes without using image transformation rules (e.g. scale, shear and rotation). In particular, DeepRoad is able to produce driving scenes with various weather conditions (including those with rather extreme conditions) by applying Generative Adversarial Networks (GANs) along with the corresponding real-world weather scenes. Second, DeepRoad utilizes metamorphic testing techniques to check the consistency of such systems using synthetic images. Third, DeepRoad validates input images for DNN-based systems by measuring the distance of the input and training images using their VGGNet features. We implement DeepRoad to test three well-recognized DNN-based autonomous driving systems in Udacity self-driving car challenge. The experimental results demonstrate that DeepRoad can detect thousands of inconsistent behaviors for these systems, and effectively validate input images to potentially enhance the system robustness as well.
- Paul Ammann and Jeff Offutt. 2016. Introduction to software testing. Cambridge University Press. Google Scholar
- Shane Barratt and Rishi Sharma. 2018. A Note on the Inception Score. arXiv preprint arXiv:1801.01973 (2018).Google Scholar
- Christopher M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer. http://research.microsoft.com/en-us/um/people/cmbishop/prml/ Google ScholarDigital Library
- Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, et al. 2016. End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016).Google Scholar
- Tsong Y Chen, Shing C Cheung, and Siu Ming Yiu. 1998. Metamorphic testing: a new approach for generating next test cases. Technical Report. Technical Report HKUST-CS98-01, Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong.Google Scholar
- Leon Gatys, Alexander S Ecker, and Matthias Bethge. 2015. Texture synthesis using convolutional neural networks. In Advances in Neural Information Processing Systems. 262–270. Google ScholarDigital Library
- Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on. IEEE, 2414–2423.Google ScholarCross Ref
- Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672–2680. Google ScholarDigital Library
- Divya Gopinath, Guy Katz, Corina S Pasareanu, and Clark Barrett. 2017. Deepsafe: A data-driven approach for checking adversarial robustness in neural networks. arXiv preprint arXiv:1710.00486 (2017).Google Scholar
- Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual losses for realtime style transfer and super-resolution. In European Conference on Computer Vision. Springer, 694–711.Google ScholarCross Ref
- Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jungkwon Lee, and Jiwon Kim. 2017. Learning to discover cross-domain relations with generative adversarial networks. arXiv preprint arXiv:1703.05192 (2017).Google ScholarDigital Library
- Diederik P Kingma and Max Welling. 2013. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114 (2013).Google Scholar
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097–1105. Google ScholarDigital Library
- Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. nature 521, 7553 (2015), 436.Google Scholar
- Nuo Li, Tao Xie, Maozhong Jin, and Chao Liu. 2010. Perturbation-based userinput-validation testing of web applications. Journal of Systems and Software 83, 11 (2010), 2263–2274. Google ScholarDigital Library
- Peilun Li, Xiaodan Liang, Daoyuan Jia, and Eric P Xing. 2018. Semanticaware Grad-GAN for Virtual-to-Real Urban Scene Adaption. arXiv preprint arXiv:1801.01726 (2018).Google Scholar
- Ming-Yu Liu, Thomas Breuel, and Jan Kautz. 2017. Unsupervised image-to-image translation networks. In Advances in Neural Information Processing Systems. 700– 708.Google Scholar
- Alexis C Madrigal. 2017. Inside waymo’s secret world for training self-driving cars. The Atlantic (2017).Google Scholar
- William M McKeeman. 1998. Differential testing for software. Digital Technical Journal 10, 1 (1998), 100–107.Google Scholar
- Christian Murphy, Gail E Kaiser, Lifeng Hu, and Leon Wu. 2008. Properties of Machine Learning Applications for Use in Metamorphic Testing. In SEKE, Vol. 8. 867–872.Google Scholar
- Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. 2017. Deepxplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles. ACM, 1–18. Google ScholarDigital Library
- Dean A. Pomerleau. 1989. Advances in Neural Information Processing Systems 1. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, Chapter ALVINN: An Autonomous Land Vehicle in a Neural Network, 305–313. Google ScholarDigital Library
- Haşim Sak, Andrew Senior, and Françoise Beaufays. 2014. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In Fifteenth annual conference of the international speech communication association.Google Scholar
- Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. 2016. Improved techniques for training gans. In Advances in Neural Information Processing Systems. 2234–2242. Google ScholarDigital Library
- S. Segura, G. Fraser, A. B. Sanchez, and A. Ruiz-Cortes. 2016. A Survey on Metamorphic Testing. IEEE Transactions on Software Engineering 42, 9 (Sept 2016), 805–824.Google ScholarCross Ref
- Sergio Segura, Gordon Fraser, Ana B Sanchez, and Antonio Ruiz-Cortés. 2016. A survey on metamorphic testing. IEEE Transactions on software engineering 42, 9 (2016), 805–824.Google ScholarCross Ref
- Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).Google Scholar
- Yuchi Tian, Kexin Pei, Suman Jana, and Baishakhi Ray. 2018. DeepTest: Automated Testing of Deep-Neural-Network-driven Autonomous Cars. In Proceedings of the 40th International Conference on Software Engineering, Gothenburg, Sweden, May 27 - June 3, 2018 (ICSE 2018). Google ScholarDigital Library
- Ian H Witten, Eibe Frank, Mark A Hall, and Christopher J Pal. 2016. Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann. Google ScholarDigital Library
- Xiaoyuan Xie, Joshua Ho, Christian Murphy, Gail Kaiser, Baowen Xu, and Tsong Yueh Chen. 2009. Application of metamorphic testing to supervised classifiers. In Quality Software, 2009. QSIC’09. 9th International Conference on. IEEE, 135–144. Google ScholarDigital Library
- Xiaoyuan Xie, Joshua WK Ho, Christian Murphy, Gail Kaiser, Baowen Xu, and Tsong Yueh Chen. 2011. Testing and validating machine learning classifiers by metamorphic testing. Journal of Systems and Software 84, 4 (2011), 544–558. Google ScholarDigital Library
- Luona Yang, Xiaodan Liang, and Eric Xing. 2018. Unsupervised Real-to-Virtual Domain Unification for End-to-End Highway Driving. arXiv preprint arXiv:1801.03458 (2018).Google Scholar
- Zili Yi, Hao Zhang, Ping Tan, and Minglun Gong. 2017. Dualgan: Unsupervised dual learning for image-to-image translation. arXiv preprint (2017).Google Scholar
- Zhi Quan Zhou, DH Huang, TH Tse, Zongyuan Yang, Haitao Huang, and TY Chen. 2004. Metamorphic testing and its applications. In Proceedings of the 8th International Symposium on Future Software Technology (ISFST 2004). 346–351.Google Scholar
- Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593 (2017).Google Scholar
Index Terms
- DeepRoad: GAN-based metamorphic testing and input validation framework for autonomous driving systems
Recommendations
Distribution-Aware Testing of Neural Networks Using Generative Models
ICSE '21: Proceedings of the 43rd International Conference on Software EngineeringThe reliability of software that has a Deep Neural Network (DNN) as a component is urgently important today given the increasing number of critical applications being deployed with DNNs. The need for reliability raises a need for rigorous testing of the ...
Covering code behavior on input validation in functional testing
Input validation is the enforcement built in software systems to ensure that only valid input is accepted to raise external effects. It is essential and very important to a large class of systems and usually forms a major part of a data-intensive ...
Numerical Analysis of Tractor Accidents using Driving Simulator for Autonomous Driving Tractor
ICMRE'19: Proceedings of the 5th International Conference on Mechatronics and Robotics EngineeringAutonomous driving of automobiles is a hot research topic in recent years. The autonomous driving tractor also has been studied in the agricultural field as well as an autonomous driving automobile. On the other hand, tractor accidents frequently occur ...
Comments