2020 | Original Paper | Book Chapter

A Unit Softmax with Laplacian Smoothing Stochastic Gradient Descent for Deep Convolutional Neural Networks

Authors: Jamshaid Ul Rahman, Akhtar Ali, Masood Ur Rehman, Rafaqat Kazmi

Published in: Intelligent Technologies and Applications

Publisher: Springer Singapore


Abstract

Several techniques have been designed in the last few years to improve the performance of deep architectures by means of appropriate loss functions or activation functions. Softmax is arguably the conventional choice for training Deep Convolutional Neural Networks (DCNNs) on classification tasks. However, modern deep learning architectures have exposed its limitations in feature discriminability. In this paper, we offer a supervision signal for discriminative image features through a modification of softmax that strengthens the loss function. Amending the original softmax loss, and motivated by the A-softmax loss for face recognition, we fix the angular margin to introduce a unit margin softmax loss. The resulting alternative form of softmax is trainable, easy to optimize, stable when used with Stochastic Gradient Descent (SGD) and Laplacian Smoothing Stochastic Gradient Descent (LS-SGD), and applicable to classifying digits in images. Experimental results demonstrate state-of-the-art performance on a well-known database of handwritten digits, the Modified National Institute of Standards and Technology (MNIST) database.
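The abstract names LS-SGD but does not spell out its update rule. As a rough illustration only, the sketch below assumes the formulation commonly used for Laplacian smoothing gradient descent: the stochastic gradient is multiplied by the inverse of I - sigma * L, where L is the one-dimensional discrete Laplacian with periodic boundary, and the resulting circulant system is solved with the FFT. The function names `laplacian_smooth` and `ls_sgd_step`, the flattening of the gradient into one vector, and the default values of `sigma` and `lr` are assumptions made for this example, not details taken from the chapter.

```python
import numpy as np


def laplacian_smooth(grad, sigma=1.0):
    """Return (I - sigma * L)^{-1} grad for the 1-D periodic Laplacian L.

    Hypothetical helper: the gradient is flattened, smoothed, and reshaped.
    """
    g = np.asarray(grad, dtype=float).ravel()
    n = g.size
    if n < 3 or sigma == 0.0:
        # Too short for a meaningful stencil, or no smoothing requested.
        return g.reshape(np.shape(grad)).copy()

    # First column of the circulant matrix I - sigma * L:
    # [1 + 2*sigma, -sigma, 0, ..., 0, -sigma]
    kernel = np.zeros(n)
    kernel[0] = 1.0 + 2.0 * sigma
    kernel[1] = -sigma
    kernel[-1] = -sigma

    # A circulant solve is an element-wise division in Fourier space.
    smoothed = np.fft.ifft(np.fft.fft(g) / np.fft.fft(kernel)).real
    return smoothed.reshape(np.shape(grad))


def ls_sgd_step(weights, grad, lr=0.1, sigma=1.0):
    """One assumed LS-SGD update: plain SGD applied to the smoothed gradient."""
    return np.asarray(weights, dtype=float) - lr * laplacian_smooth(grad, sigma)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=8)
    g = rng.normal(size=8)
    print(ls_sgd_step(w, g, lr=0.1, sigma=1.0))
```

With `sigma = 0` the step reduces to ordinary SGD. For `sigma > 0` the smoothing damps high-frequency components of the gradient while leaving its mean unchanged.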


Metadata
Title
A Unit Softmax with Laplacian Smoothing Stochastic Gradient Descent for Deep Convolutional Neural Networks
Authors
Jamshaid Ul Rahman
Akhtar Ali
Masood Ur Rehman
Rafaqat Kazmi
Copyright year
2020
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-5232-8_14