2018 | Original Paper | Book Chapter

Hand Gesture Recognition by Using 3DCNN and LSTM with Adam Optimizer

Authors: Siyu Jiang, Yimin Chen

Published in: Advances in Multimedia Information Processing – PCM 2017

Publisher: Springer International Publishing

Abstract

A two-step hand gesture recognition system is proposed to classify gestures performed by different subjects under widely varying lighting conditions. First, a 3D convolutional neural network (3DCNN) is fine-tuned to classify each hand gesture. Then, the fine-tuned 3DCNN automatically learns spatiotemporal features that are fed to a long short-term memory (LSTM) network. We also perform spatiotemporal data augmentation to make training more effective and to reduce potential overfitting. In addition, the Adam optimizer is employed to improve training speed in both steps. On the VIVA challenge dataset, our method achieves a correct classification rate of \(94.5\%\), and experimental results show that the Adam optimizer outperforms SGD, the most commonly used optimizer. Moreover, our system is robust to different lighting conditions.
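To make the optimizer comparison concrete, here is a minimal sketch of one plain SGD step next to one Adam step, written in pure Python. The abstract does not state the hyperparameters used, so the learning rates and the values of `beta1`, `beta2` and `eps` below are assumptions (the commonly cited Adam defaults), and the names `sgd_step` and `AdamSketch` are hypothetical illustration names, not the authors' code.

```python
import math


def sgd_step(params, grads, lr=0.01):
    """One plain SGD update: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]


class AdamSketch:
    """Illustrative Adam update with bias-corrected moment estimates.

    Hyperparameter values are assumed defaults, not taken from the chapter.
    """

    def __init__(self, n_params, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * n_params  # running mean of gradients
        self.v = [0.0] * n_params  # running mean of squared gradients
        self.t = 0                 # number of steps taken so far

    def step(self, params, grads):
        self.t += 1
        new_params = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            # Bias correction offsets the zero initialisation of m and v.
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            new_params.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return new_params


params = [1.0, -2.0]
grads = [0.5, -4.0]
after_sgd = sgd_step(params, grads)
after_adam = AdamSketch(len(params)).step(params, grads)
```

On the first step Adam moves every parameter by roughly the learning rate regardless of gradient magnitude, whereas the SGD step scales directly with the gradient. This per-parameter scaling is one commonly given reason adaptive methods can train faster. The sketch does not reproduce the chapter's experiments.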

Metadata
Title
Hand Gesture Recognition by Using 3DCNN and LSTM with Adam Optimizer
Authors
Siyu Jiang
Yimin Chen
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-77380-3_71
