2022 | Original Paper | Book Chapter

Comparative Study of Optimization Algorithm in Deep CNN-Based Model for Sign Language Recognition

Authors: Rajesh George Rajan, P. Selvi Rajendran

Published in: Computer Networks and Inventive Communication Technologies

Publisher: Springer Nature Singapore


Abstract

The learning rate is a fundamental hyperparameter of a neural network, and the strategy for steering the learning process is implemented by optimization algorithms, or optimizers. An optimizer improves the model by adjusting learnable parameters such as weights and biases, i.e., it minimizes the error function with respect to these parameters. In this paper, we examine how an end-to-end CNN model named ASLNET recognizes the alphabet of American Sign Language under various optimizers: Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMSProp), Adaptive Gradient Algorithm (Adagrad), Adaptive Delta (Adadelta), Adaptive Moment Estimation (Adam), Adam with Nesterov Momentum (Nadam), LookAhead, and Rectified Adam (RAdam). Among these, LookAhead and RAdam are the most recently developed. To avoid overfitting, traditional data augmentation techniques are applied, and the model is compared with and without augmentation for each optimizer. The experiments are conducted on two NVIDIA Tesla P100 GPUs with a batch size of 64, using the benchmark ASL Finger Spelling dataset.
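To illustrate what is being compared, the core update rules of three of the optimizers mentioned above (SGD, RMSProp, and Adam) can be sketched on a toy one-dimensional problem. This is a minimal illustration, not code from the paper: the function `run`, the quadratic objective, and the hyperparameter values are assumptions chosen for clarity, not the paper's ASLNET training setup.

```python
import math

def run(update, steps=100, lr=0.1):
    """Minimize f(w) = w^2 from w = 5.0 using the given update rule."""
    w, state = 5.0, {}
    for t in range(1, steps + 1):
        g = 2 * w  # gradient of f(w) = w^2
        w = update(w, g, state, lr, t)
    return w

def sgd(w, g, state, lr, t):
    # Plain gradient descent: step proportional to the raw gradient.
    return w - lr * g

def rmsprop(w, g, state, lr, t, beta=0.9, eps=1e-8):
    # Divide by a running RMS of gradients to normalize the step size.
    state["v"] = beta * state.get("v", 0.0) + (1 - beta) * g * g
    return w - lr * g / (math.sqrt(state["v"]) + eps)

def adam(w, g, state, lr, t, b1=0.9, b2=0.999, eps=1e-8):
    # Momentum (m) plus RMS normalization (v), with bias correction.
    state["m"] = b1 * state.get("m", 0.0) + (1 - b1) * g
    state["v"] = b2 * state.get("v", 0.0) + (1 - b2) * g * g
    m_hat = state["m"] / (1 - b1 ** t)
    v_hat = state["v"] / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps)

for name, rule in [("SGD", sgd), ("RMSProp", rmsprop), ("Adam", adam)]:
    print(f"{name}: final w = {run(rule):.6f}")
```

All three rules drive `w` toward the minimum at 0, but along different trajectories: the adaptive methods take roughly fixed-size, gradient-normalized steps, while SGD's steps shrink with the gradient. The paper's comparison asks the analogous question at scale: which of these update dynamics yields the best ASLNET accuracy on the fingerspelling task.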


Metadata
Title
Comparative Study of Optimization Algorithm in Deep CNN-Based Model for Sign Language Recognition
Authors
Rajesh George Rajan
P. Selvi Rajendran
Copyright year
2022
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-16-3728-5_35
