
21-09-2016

A Practical and Highly Optimized Convolutional Neural Network for Classifying Traffic Signs in Real-Time

Authors: Hamed Habibi Aghdam, Elnaz Jahani Heravi, Domenec Puig

Published in: International Journal of Computer Vision | Issue 2/2017

Abstract

Classifying traffic signs is an indispensable part of Advanced Driver Assistant Systems. This strictly requires that the traffic sign classification model accurately classifies the images and consumes as few CPU cycles as possible so that the CPU is immediately released for other tasks. In this paper, we first propose a new ConvNet architecture. Then, we propose a new method for creating an optimal ensemble of ConvNets with the highest possible accuracy and the lowest number of ConvNets. Our experiments show that an ensemble of our proposed ConvNets (constructed using our method) reduces the number of arithmetic operations by 88 and \(73\,\%\) compared with two state-of-the-art ensembles of ConvNets. In addition, our ensemble is \(0.1\,\%\) more accurate than one of these state-of-the-art ensembles and only \(0.04\,\%\) less accurate than the other when tested on the same dataset. Moreover, an ensemble of our compact ConvNets reduces the number of multiplications by 95 and \(88\,\%\), while the classification accuracy drops by only 0.2 and \(0.4\,\%\) compared with these two ensembles. We also evaluate the cross-dataset performance of our ConvNet and analyze its transferability in different layers. We show that our network scales easily to new datasets with many more traffic sign classes and only needs fine-tuning of the weights starting from the last convolution layer. We further assess our ConvNet through different visualization techniques. Finally, we propose a new method for finding the minimum additive noise that causes the network to incorrectly classify the image, with the smallest possible difference from the highest score in the loss vector.
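As a minimal, self-contained illustration of the minimum-additive-noise idea mentioned above, the sketch below searches for a small perturbation that flips the prediction of a toy linear classifier. The model, step size, and stopping rule are illustrative assumptions and do not reproduce the formulation proposed in the paper.

# Toy stand-in for the ConvNet: a linear classifier scores = W @ x.
# We grow an additive noise in the direction that closes the gap between
# the runner-up class score and the predicted class score, and stop as
# soon as the prediction flips, keeping the noise small.
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_features = 5, 32
W = rng.normal(size=(n_classes, n_features))   # toy "network" weights
x = rng.normal(size=n_features)                # toy "image"

def scores(v):
    return W @ v

y = int(np.argmax(scores(x)))                  # originally predicted class

noise = np.zeros_like(x)
step = 1e-3
for _ in range(100000):
    s = scores(x + noise)
    if int(np.argmax(s)) != y:                 # prediction flipped: stop early
        break
    j = int(np.argsort(s)[-2])                 # current runner-up class
    direction = W[j] - W[y]                    # gradient of (s_j - s_y) w.r.t. the input
    noise += step * direction / np.linalg.norm(direction)

print("L2 norm of additive noise:", np.linalg.norm(noise))
print("new prediction:", int(np.argmax(scores(x + noise))), "original:", y)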


Footnotes
2
The ConvNet architecture and its trained models are available at https://github.com/pcnn/traffic-sign-recognition.
 
3
The percentage of samples that always fall within the top-2 classification scores.
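As a minimal illustration (not the authors' evaluation code), this quantity can be computed as follows, assuming a score matrix and integer labels as inputs:

import numpy as np

def top2_percentage(scores, labels):
    # scores: (n_samples, n_classes) classification scores, labels: (n_samples,)
    top2 = np.argsort(scores, axis=1)[:, -2:]        # indices of the two highest scores
    hits = (top2 == labels[:, None]).any(axis=1)     # true label among the top 2?
    return 100.0 * hits.mean()                       # percentage of samples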
 
4
We calculated the number of multiplications of a ConvNet by taking into account the multiplications for convolving the filters of each layer with the N-channel input from the previous layer, the multiplications required for computing the activations of each layer, and the multiplications imposed by the normalization layers. We showed in Sect. 3 that the tanh function utilized in Ciresan et al. (2012) can be efficiently computed using 10 multiplications. The ReLU activation used in Jin et al. (2014) does not need any multiplications, and the Leaky ReLU units in our ConvNet compute their results using only 1 multiplication. Finally, considering that the pow(float, float) function needs only 1 multiplication and 64 shift operations (http://tinyurl.com/yehg932), the normalization layer in Jin et al. (2014) requires \(k\times k+3\) multiplications per element in the feature map.
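The counting rule above can be sketched as follows. The layer shapes in the example call are hypothetical placeholders; only the per-operation costs (k x k x C multiplications per convolution output element, 10 for tanh, 0 for ReLU, 1 for Leaky ReLU, and k x k + 3 per element for the normalization layer) come from this footnote.

# Rough multiplication count for one convolution layer followed by an
# activation and, optionally, a normalization layer.
ACTIVATION_MULTS = {"tanh": 10, "relu": 0, "leaky_relu": 1}

def conv_layer_mults(out_h, out_w, out_ch, in_ch, k, activation, norm_k=None):
    conv = out_h * out_w * out_ch * (k * k * in_ch)       # filter/input products
    act = out_h * out_w * out_ch * ACTIVATION_MULTS[activation]
    norm = out_h * out_w * out_ch * (norm_k * norm_k + 3) if norm_k else 0
    return conv + act + norm

# Example: a hypothetical 7x7 convolution mapping 3 to 100 channels
# on a 42x42 output map, followed by Leaky ReLU units.
print(conv_layer_mults(42, 42, 100, 3, 7, "leaky_relu"))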
 
Literature
Aghdam, H. H., Heravi, E. J., & Puig, D. (2015). A unified framework for coarse-to-fine recognition of traffic signs using Bayesian network and visual attributes. In 10th international conference on computer vision theory and applications (VISAPP) (pp. 87–96). doi:10.5220/0005303500870096
Baró, X., Escalera, S., Vitrià, J., Pujol, O., & Radeva, P. (2009). Traffic sign recognition using evolutionary adaboost detection and forest-ECOC classification. IEEE Transactions on Intelligent Transportation Systems, 10(1), 113–126. doi:10.1109/TITS.2008.2011702
Coates, A., & Ng, A. Y. (2012). Learning feature representations with K-means. Lecture Notes in Computer Science, 7700, 561–580. doi:10.1007/978-3-642-35289-8-30
Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., & Darrell, T. (2014). DeCAF: A deep convolutional activation feature for generic visual recognition. In International conference on machine learning (pp. 647–655). arXiv:1310.1531
Dosovitskiy, A., & Brox, T. (2015). Inverting convolutional networks with convolutional networks (pp. 1–15). arXiv preprint arXiv:1506.02753
Gao, X. W., Podladchikova, L., Shaposhnikov, D., Hong, K., & Shevtsova, N. (2006). Recognition of traffic signs based on their colour and shape features extracted using human vision models. Journal of Visual Communication and Image Representation, 17(4), 675–685. doi:10.1016/j.jvcir.2005.10.003
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. arXiv preprint arXiv:1502.01852
Hinton, G. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research (JMLR), 15, 1929–1958.
Huang, G., Mao, K. Z., Siew, C., & Huang, D. (2013). A hierarchical method for traffic sign classification with support vector machines. In The 2013 international joint conference on neural networks (IJCNN) (pp. 1–6). IEEE. doi:10.1109/IJCNN.2013.6706803
Jin, J., Fu, K., & Zhang, C. (2014). Traffic sign recognition with hinge loss trained convolutional neural networks. IEEE Transactions on Intelligent Transportation Systems, 15(5), 1991–2000. doi:10.1109/TITS.2014.2308281
Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105). Curran Associates, Inc.
Larsson, F., & Felsberg, M. (2011). Using Fourier descriptors and spatial models for traffic sign recognition. In Image analysis, Lecture Notes in Computer Science (Vol. 6688, pp. 238–249). Springer. doi:10.1007/978-3-642-21227-7_23
Maas, A. L., Hannun, A. Y., & Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In International conference on machine learning (ICML) workshop on deep learning (Vol. 30).
Maldonado-Bascon, S., Lafuente-Arroyo, S., Gil-Jimenez, P., Gomez-Moreno, H., & Lopez-Ferreras, F. (2007). Road-sign detection and recognition based on support vector machines. IEEE Transactions on Intelligent Transportation Systems, 8(2), 264–278. doi:10.1109/TITS.2007.895311
Maldonado Bascón, S., Acevedo Rodríguez, J., Lafuente Arroyo, S., Fernández Caballero, A., & López-Ferreras, F. (2010). An optimization on pictogram identification for the road-sign recognition task using SVMs. Computer Vision and Image Understanding, 114(3), 373–383. doi:10.1016/j.cviu.2009.12.002
Mathias, M., Timofte, R., Benenson, R., & Van Gool, L. (2013). Traffic sign recognition – How far are we from the solution? In International joint conference on neural networks. doi:10.1109/IJCNN.2013.6707049
Møgelmose, A., Trivedi, M. M., & Moeslund, T. B. (2012). Vision-based traffic sign detection and analysis for intelligent driver assistance systems: Perspectives and survey. IEEE Transactions on Intelligent Transportation Systems, 13(4), 1484–1497. doi:10.1109/TITS.2012.2209421
Moiseev, B., Konev, A., Chigorin, A., & Konushin, A. (2013). Evaluation of traffic sign recognition methods trained on synthetically generated data. In 15th international conference on advanced concepts for intelligent vision systems (ACIVS) (pp. 576–583). Springer, Poznań. doi:10.1007/978-3-319-02895-8_52
Sermanet, P., & LeCun, Y. (2011). Traffic sign recognition with multi-scale convolutional networks. In Proceedings of the international joint conference on neural networks (pp. 2809–2813). doi:10.1109/IJCNN.2011.6033589
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., & LeCun, Y. (2013). OverFeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229 (pp. 1–15).
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference on learning representations (ICLR) (pp. 1–13). arXiv:1409.1556v5
Simonyan, K., Vedaldi, A., & Zisserman, A. (2013). Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 1–8.
Szegedy, C., Reed, S., Sermanet, P., Vanhoucke, V., & Rabinovich, A. (2014a). Going deeper with convolutions. arXiv preprint arXiv:1409.4842 (pp. 1–12).
Timofte, R., & Van Gool, L. (2011). Sparse representation based projections. In 22nd British Machine Vision Conference (pp. 61.1–61.12). BMVA Press. doi:10.5244/C.25.61
Timofte, R., Zimmermann, K., & Van Gool, L. (2011). Multi-view traffic sign detection, recognition, and 3D localisation. Machine Vision and Applications, 1–15. doi:10.1007/s00138-011-0391-3
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., & Gong, Y. (2010). Locality-constrained linear coding for image classification. In IEEE computer vision and pattern recognition (CVPR) (pp. 3360–3367). doi:10.1109/CVPR.2010.5540018
Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014). How transferable are features in deep neural networks? Neural Information Processing Systems (NIPS), 27. arXiv:1411.1792v1
Yuan, X., Hao, X., Chen, H., & Wei, X. (2014). Robust traffic sign recognition based on color global and local oriented edge magnitude patterns. IEEE Transactions on Intelligent Transportation Systems, 15(4), 1466–1474. doi:10.1109/TITS.2014.2298912
Zaklouta, F., Stanciulescu, B., & Hamdoun, O. (2011). Traffic sign classification using K-d trees and random forests. In Proceedings of the international joint conference on neural networks (pp. 2151–2155). doi:10.1109/IJCNN.2011.6033494
Zeng, Y., Xu, X., Fang, Y., & Zhao, K. (2015). Traffic sign recognition using deep convolutional networks and extreme learning machine. In Intelligence science and big data engineering: Image and video data engineering (IScIDE) (pp. 272–280). Springer. doi:10.1007/978-3-319-23989-7_28
Metadata
Title
A Practical and Highly Optimized Convolutional Neural Network for Classifying Traffic Signs in Real-Time
Authors
Hamed Habibi Aghdam
Elnaz Jahani Heravi
Domenec Puig
Publication date
21-09-2016
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 2/2017
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-016-0955-9
