
2017 | Original Paper | Book Chapter

Towards Automatic Recognition of Sign Language Gestures Using Kinect 2.0


Abstract

We present a prototype of a new computer system for the recognition of manual gestures using Kinect 2.0 for Windows. This sensor provides a stream of Full HD optical images at 30 frames per second (fps) together with a depth map of the scene. At present, our system recognizes continuous fingerspelling gestures and sequences of digits in Russian and Kazakh sign languages (SL). Our gesture vocabulary contains 52 fingerspelling gestures. We have collected a visual database of SL gestures consisting of Kinect-based recordings of two persons (a man and a woman) demonstrating manual gestures. Five samples of each gesture were used to train the models, and the remaining data were used for tuning and testing the developed recognition system. Each gesture model is represented as a vector of informative visual features calculated for the palm and all fingers. Feature vectors are extracted from both training and test samples of gestures; reference patterns (models) and sequences of test vectors are then compared using the Euclidean distance. Sequences of vectors are aligned with the dynamic time warping method (dynamic programming), and the reference pattern with the minimal distance is selected as the recognition result. In our experiments in the signer-dependent mode with the two demonstrators from the visual database, the average gesture recognition accuracy is 87% for 52 manual signs.
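The matching scheme described in the abstract (feature-vector sequences compared with the Euclidean distance, aligned by dynamic time warping, nearest reference wins) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensionality and the `recognize` helper are assumptions for the example.

```python
import numpy as np

def dtw_distance(ref, test):
    """Accumulated Euclidean cost of aligning two feature-vector sequences.

    ref, test: arrays of shape (n, d) and (m, d); each row is one frame's
    feature vector (e.g., hypothetical palm/finger measurements).
    """
    n, m = len(ref), len(test)
    # cost[i, j] = minimal accumulated distance aligning ref[:i] with test[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(ref[i - 1] - test[j - 1])  # Euclidean distance
            cost[i, j] = d + min(cost[i - 1, j],       # skip a reference frame
                                 cost[i, j - 1],       # skip a test frame
                                 cost[i - 1, j - 1])   # match both frames
    return cost[n, m]

def recognize(test_seq, references):
    """Return the name of the reference pattern with minimal DTW distance."""
    return min(references, key=lambda name: dtw_distance(references[name], test_seq))
```

Because DTW warps the time axis, a test gesture signed faster or slower than the stored reference can still be matched frame-by-frame before the minimal-distance decision is taken.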

Metadata
Title
Towards Automatic Recognition of Sign Language Gestures Using Kinect 2.0
Authors
Dmitry Ryumin
Alexey A. Karpov
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-58703-5_7
