Skip to main content
Top

2018 | OriginalPaper | Chapter

A Kinematic Gesture Representation Based on Shape Difference VLAD for Sign Language Recognition

Authors : Jefferson Rodríguez, Fabio Martínez

Published in: Computer Vision and Graphics

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Automatic Sign language recognition (SLR) is a fundamental task to help with inclusion of deaf community in society, facilitating, noways, many conventional multimedia interactions. In this work is proposed a novel approach to represent gestures in SLR as a shape difference-VLAD mid level coding of kinematic primitives, captured along videos. This representation capture local salient motions together with regional dominant patterns developed by articulators along utterances. Also, the special VLAD representation allows to quantify local motion pattern but also capture shape of motion descriptors, that achieved a proper regional gesture characterization. The proposed approach achieved an average accuracy of 85,45% in a corpus data of 64 sign words captured in 3200 videos. Additionally, for Boston sign dataset the proposed approach achieve competitive results with \(82\%\) of accuracy in average.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Ali, S., Shah, M.: Human action recognition in videos using kinematic features and multiple instance learning. IEEE Trans. Pattern Anal. Mach. Intell. 32(2), 288–303 (2010)CrossRef Ali, S., Shah, M.: Human action recognition in videos using kinematic features and multiple instance learning. IEEE Trans. Pattern Anal. Mach. Intell. 32(2), 288–303 (2010)CrossRef
2.
go back to reference Brox, T., Bregler, C., Malik, J.: Large displacement optical flow. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 41–48. IEEE (2009) Brox, T., Bregler, C., Malik, J.: Large displacement optical flow. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 41–48. IEEE (2009)
4.
go back to reference Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2(3), 27 (2011) Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2(3), 27 (2011)
5.
go back to reference Chaudhry, R., Ravichandran, A., Hager, G., Vidal, R.: Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 1932–1939. IEEE (2009) Chaudhry, R., Ravichandran, A., Hager, G., Vidal, R.: Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 1932–1939. IEEE (2009)
7.
go back to reference Duta, I.C., Uijlings, J.R., Ionescu, B., Aizawa, K., Hauptmann, A.G., Sebe, N.: Efficient human action recognition using histograms of motion gradients and vlad with descriptor shape information. Multimedia Tools Appl. 76, 22445–22472 (2017)CrossRef Duta, I.C., Uijlings, J.R., Ionescu, B., Aizawa, K., Hauptmann, A.G., Sebe, N.: Efficient human action recognition using histograms of motion gradients and vlad with descriptor shape information. Multimedia Tools Appl. 76, 22445–22472 (2017)CrossRef
8.
go back to reference Jain, M., Jegou, H., Bouthemy, P.: Better exploiting motion for better action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2555–2562 (2013) Jain, M., Jegou, H., Bouthemy, P.: Better exploiting motion for better action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2555–2562 (2013)
9.
go back to reference Jégou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3304–3311. IEEE (2010) Jégou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3304–3311. IEEE (2010)
10.
go back to reference Konecnỳ, J., Hagara, M.: One-shot-learning gesture recognition using HOG-HOF. J. Mach. Learn. Res. 15, 2513–2532 (2014)MathSciNet Konecnỳ, J., Hagara, M.: One-shot-learning gesture recognition using HOG-HOF. J. Mach. Learn. Res. 15, 2513–2532 (2014)MathSciNet
12.
go back to reference Paulraj, M., Yaacob, S., Desa, H., Hema, C., Ridzuan, W.M., Ab Majid, W.: Extraction of head and hand gesture features for recognition of sign language. In: International Conference on Electronic Design, ICED 2008, pp. 1–6. IEEE (2008) Paulraj, M., Yaacob, S., Desa, H., Hema, C., Ridzuan, W.M., Ab Majid, W.: Extraction of head and hand gesture features for recognition of sign language. In: International Conference on Electronic Design, ICED 2008, pp. 1–6. IEEE (2008)
13.
go back to reference Peng, X., Wang, L., Wang, X., Qiao, Y.: Bag of visual words and fusion methods for action recognition: comprehensive study and good practice. Comput. Vis. Image Underst. 150, 109–125 (2016)CrossRef Peng, X., Wang, L., Wang, X., Qiao, Y.: Bag of visual words and fusion methods for action recognition: comprehensive study and good practice. Comput. Vis. Image Underst. 150, 109–125 (2016)CrossRef
15.
go back to reference Ronchetti, F., Quiroga, F., Estrebou, C.A., Lanzarini, L.C., Rosete, A.: LSA64: an Argentinian sign language dataset. In: XXII Congreso Argentino de Ciencias de la Computación (CACIC 2016) (2016) Ronchetti, F., Quiroga, F., Estrebou, C.A., Lanzarini, L.C., Rosete, A.: LSA64: an Argentinian sign language dataset. In: XXII Congreso Argentino de Ciencias de la Computación (CACIC 2016) (2016)
16.
go back to reference Wan, J., Ruan, Q., Li, W., Deng, S.: One-shot learning gesture recognition from RGB-D data using bag of features. J. Mach. Learn. Res. 14(1), 2549–2582 (2013) Wan, J., Ruan, Q., Li, W., Deng, S.: One-shot learning gesture recognition from RGB-D data using bag of features. J. Mach. Learn. Res. 14(1), 2549–2582 (2013)
17.
go back to reference Zafrulla, Z., Brashear, H., Starner, T., Hamilton, H., Presti, P.: American sign language recognition with the kinect. In: Proceedings of the 13th International Conference on Multimodal Interfaces, pp. 279–286. ACM (2011) Zafrulla, Z., Brashear, H., Starner, T., Hamilton, H., Presti, P.: American sign language recognition with the kinect. In: Proceedings of the 13th International Conference on Multimodal Interfaces, pp. 279–286. ACM (2011)
18.
go back to reference Zahedi, M., Keysers, D., Deselaers, T., Ney, H.: Combination of tangent distance and an image distortion model for appearance-based sign language recognition. In: Kropatsch, W.G., Sablatnig, R., Hanbury, A. (eds.) DAGM 2005. LNCS, vol. 3663, pp. 401–408. Springer, Heidelberg (2005). https://doi.org/10.1007/11550518_50CrossRef Zahedi, M., Keysers, D., Deselaers, T., Ney, H.: Combination of tangent distance and an image distortion model for appearance-based sign language recognition. In: Kropatsch, W.G., Sablatnig, R., Hanbury, A. (eds.) DAGM 2005. LNCS, vol. 3663, pp. 401–408. Springer, Heidelberg (2005). https://​doi.​org/​10.​1007/​11550518_​50CrossRef
20.
go back to reference Zaki, M.M., Shaheen, S.I.: Sign language recognition using a combination of new vision based features. Pattern Recogn. Lett. 32(4), 572–577 (2011)CrossRef Zaki, M.M., Shaheen, S.I.: Sign language recognition using a combination of new vision based features. Pattern Recogn. Lett. 32(4), 572–577 (2011)CrossRef
Metadata
Title
A Kinematic Gesture Representation Based on Shape Difference VLAD for Sign Language Recognition
Authors
Jefferson Rodríguez
Fabio Martínez
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-00692-1_38

Premium Partner