
20-05-2022 | Regular Article

A convolutional neural network and classical moments-based feature fusion model for gesture recognition

Authors: Abul Abbas Barbhuiya, Ram Kumar Karsh, Rahul Jain

Published in: Multimedia Systems | Issue 5/2022


Abstract

Hand gesture recognition is a significant and challenging building block of many computer vision applications involving controlling, conversational, manipulative, and communicative gestures. Several systems have been suggested to address the hand gesture recognition and classification challenges. Convolutional neural networks (CNNs) are widely used for different pattern recognition problems. Besides CNN features, moment-based features are considered among the most effective and transparent descriptors for image recognition and classification. However, most moment-based approaches consider only global image features while neglecting the discriminative properties of local image features. This paper proposes a new, efficient gesture recognition approach that combines CNN features with conventional Zernike moment-based features. Because global Zernike moment-based features alone are not sufficient to distinguish between very similar hand postures, two groups of Zernike moment-based features are extracted: global features are supplemented with local modified Zernike moment-based features, which improve recognition accuracy by capturing the local pattern information of the image. Furthermore, we introduce an improved architecture that fuses the features derived from the whitening-transformed Zernike moments computed for each image with those from the CNN's last convolutional layer. Finally, the library for support vector machines (LIBSVM) is used for classification. The proposed model achieves recognition accuracies of 98.41%, 94.33%, 97.27%, and 99.84% on four standard datasets. Performance comparisons show that the proposed model outperforms state-of-the-art methods.
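
The pipeline the abstract describes, CNN features taken from the last convolutional layer, fused with whitened Zernike moment features and classified by an SVM, can be illustrated with a short sketch. The code below is a minimal, hypothetical illustration rather than the authors' implementation: it assumes a pretrained VGG-16 backbone from torchvision, the mahotas library for global Zernike moments, PCA whitening as a stand-in for the whitening transform, and scikit-learn's SVC (which is built on LIBSVM) in place of calling LIBSVM directly; the local modified Zernike moments used in the paper are omitted for brevity.

```python
# Minimal sketch of the fusion pipeline summarised in the abstract -- not the
# authors' code. Assumptions (not taken from the paper): pretrained VGG-16
# backbone, `mahotas` for global Zernike moments, PCA whitening, and
# scikit-learn's SVC (built on LIBSVM) as the classifier.

import numpy as np
import torch
import torchvision
import mahotas
from sklearn.decomposition import PCA
from sklearn.svm import SVC


def cnn_features(images: torch.Tensor) -> np.ndarray:
    """Flattened activations of the last convolutional block of VGG-16.

    `images` is a batch of normalised RGB tensors of shape (N, 3, 224, 224).
    """
    backbone = torchvision.models.vgg16(
        weights=torchvision.models.VGG16_Weights.DEFAULT
    ).features.eval()
    with torch.no_grad():
        acts = backbone(images)              # (N, 512, 7, 7) for 224x224 input
    return acts.flatten(start_dim=1).numpy()


def zernike_features(gray_images, radius=100, degree=8) -> np.ndarray:
    """Global Zernike moment magnitudes for each grayscale image."""
    return np.stack([
        mahotas.features.zernike_moments(img, radius, degree=degree)
        for img in gray_images
    ])


def fuse_and_train(images, gray_images, labels):
    zm = zernike_features(gray_images)
    whitener = PCA(whiten=True).fit(zm)      # whitening of the moment features
    fused = np.hstack([cnn_features(images), whitener.transform(zm)])
    clf = SVC(kernel="rbf").fit(fused, labels)
    return clf, whitener
```

In this sketch the two feature streams are simply concatenated before classification; the paper's local modified Zernike moments would add a third, block-wise set of moment features to the same fused vector.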


Metadata
Title
A convolutional neural network and classical moments-based feature fusion model for gesture recognition
Authors
Abul Abbas Barbhuiya
Ram Kumar Karsh
Rahul Jain
Publication date
20-05-2022
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 5/2022
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-022-00951-5
