
20-05-2022 | Regular Article

A convolutional neural network and classical moments-based feature fusion model for gesture recognition

Authors: Abul Abbas Barbhuiya, Ram Kumar Karsh, Rahul Jain

Published in: Multimedia Systems | Issue 5/2022


Abstract

Hand gesture recognition is a significant and challenging building block of many computer vision applications involving controlling, conversational, manipulative, and communicative gestures. Several systems have been suggested to address the hand gesture recognition and classification challenges. Convolutional neural networks (CNNs) are widely used for different pattern recognition problems. Besides CNN features, moment-based features are considered among the most effective and transparent descriptors for image recognition and classification. However, most moment-based approaches consider only global image features while neglecting the discriminative properties of local image features. This paper proposes a new, efficient gesture recognition approach that combines CNN features with conventional Zernike moment-based features. Because global Zernike moment-based features alone are not sufficient to distinguish between very similar hand postures, two groups of Zernike moment-based features are extracted: global features are supplemented with local modified Zernike moment-based features, which improve recognition accuracy by capturing the local pattern information of the image. Furthermore, we introduce an improved architecture that fuses the features derived from the whitening-transformed Zernike moments computed for each image with those from the CNN's last convolutional layer. Finally, the library for support vector machines (LIBSVM) is used for classification. The proposed model achieves recognition accuracies of 98.41%, 94.33%, 97.27%, and 99.84% on four standard datasets. Performance comparisons show that the proposed model outperforms state-of-the-art methods.
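
The pipeline the abstract describes, CNN features taken from the last convolutional layer, fused with whitened Zernike moment features and classified by an SVM, can be illustrated with a short sketch. The code below is a minimal, hypothetical illustration rather than the authors' implementation: it assumes a pretrained VGG-16 backbone from torchvision, the mahotas library for global Zernike moments, PCA whitening as a stand-in for the whitening transform, and scikit-learn's SVC (which is built on LIBSVM) in place of calling LIBSVM directly; the local modified Zernike moments used in the paper are omitted for brevity.

```python
# Minimal sketch of the fusion pipeline summarised in the abstract -- not the
# authors' code. Assumptions (not taken from the paper): pretrained VGG-16
# backbone, `mahotas` for global Zernike moments, PCA whitening, and
# scikit-learn's SVC (built on LIBSVM) as the classifier.

import numpy as np
import torch
import torchvision
import mahotas
from sklearn.decomposition import PCA
from sklearn.svm import SVC


def cnn_features(images: torch.Tensor) -> np.ndarray:
    """Flattened activations of the last convolutional block of VGG-16.

    `images` is a batch of normalised RGB tensors of shape (N, 3, 224, 224).
    """
    backbone = torchvision.models.vgg16(
        weights=torchvision.models.VGG16_Weights.DEFAULT
    ).features.eval()
    with torch.no_grad():
        acts = backbone(images)              # (N, 512, 7, 7) for 224x224 input
    return acts.flatten(start_dim=1).numpy()


def zernike_features(gray_images, radius=100, degree=8) -> np.ndarray:
    """Global Zernike moment magnitudes for each grayscale image."""
    return np.stack([
        mahotas.features.zernike_moments(img, radius, degree=degree)
        for img in gray_images
    ])


def fuse_and_train(images, gray_images, labels):
    zm = zernike_features(gray_images)
    whitener = PCA(whiten=True).fit(zm)      # whitening of the moment features
    fused = np.hstack([cnn_features(images), whitener.transform(zm)])
    clf = SVC(kernel="rbf").fit(fused, labels)
    return clf, whitener
```

In this sketch the two feature streams are simply concatenated before classification; the paper's local modified Zernike moments would add a third, block-wise set of moment features to the same fused vector.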


Metadata
Title
A convolutional neural network and classical moments-based feature fusion model for gesture recognition
Authors
Abul Abbas Barbhuiya
Ram Kumar Karsh
Rahul Jain
Publication date
20-05-2022
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 5/2022
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-022-00951-5
