Published in: Information Systems Frontiers 6/2017

30.05.2017

Grammatical facial expression recognition in sign language discourse: a study at the syntax level

Authors: Fernando A. Freitas, Sarajane M. Peres, Clodoaldo A. M. Lima, Felipe V. Barbosa

Abstract

Facial expression recognition is a well-established research area, largely because of its applicability to many types of systems. Facial expressions are especially important in the construction of discourse through sign language. Sign languages are visual-spatial languages that are not supported by voice intonation; instead, they use facial expressions to convey prosodic aspects and certain grammatical constructions. Such expressions are called Grammatical Facial Expressions (GFEs), and they operate at the morphological and syntactic levels of sign language. GFEs are particularly relevant to automated sign language recognition, since they help remove ambiguity among signs and contribute to the semantic meaning of discourse. This paper presents a study that applies inductive pattern recognition to the problem of automatically recognizing GFEs at the syntactic level of discourse in Libras (Brazilian Sign Language). A Microsoft Kinect sensor was used to capture three-dimensional points on the faces of subjects fluent in sign language, producing a corpus of Libras phrases comprising different syntactic constructions. This corpus was analyzed with classifiers implemented as Multilayer Perceptron neural networks in a series of experiments. The experiments investigated: the recognition complexity inherent to each GFE present in the corpus; the suitability of different vector representations, with descriptive features based on three-dimensional point coordinates and the distances and angles derived from them; the need for temporal information about the execution of expressions during signing; and particularities of data labeling and classifier evaluation in the context of a sign language.
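The abstract describes feature vectors built from three-dimensional facial points and from the distances and angles derived between them. The following Python sketch illustrates one plausible construction under stated assumptions: the landmark indices and the chosen point pairs are hypothetical, since the paper's exact point set and pairings are not given in this abstract.

```python
import numpy as np

# Hypothetical landmark indices into the sensor's facial point array; the
# paper's exact point set and pairings are not specified in this abstract.
LEFT_BROW, RIGHT_BROW, NOSE_TIP, MOUTH_LEFT, MOUTH_RIGHT = 0, 1, 2, 3, 4

def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return float(np.linalg.norm(a - b))

def angle(a, vertex, b):
    """Angle (in radians) at `vertex`, formed by points a and b."""
    u, v = a - vertex, b - vertex
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def frame_features(points):
    """Build one feature vector for a single frame.

    `points` is an (N, 3) array of 3-D facial coordinates. The three
    kinds of representation the abstract mentions correspond to
    (a) raw coordinates, (b) distances, and (c) angles; here they are
    simply concatenated for illustration.
    """
    coords = points.reshape(-1)                              # (a) raw x, y, z
    dists = np.array([
        distance(points[LEFT_BROW], points[RIGHT_BROW]),     # (b) distances
        distance(points[MOUTH_LEFT], points[MOUTH_RIGHT]),
    ])
    angles = np.array([                                      # (c) angles
        angle(points[MOUTH_LEFT], points[NOSE_TIP], points[MOUTH_RIGHT]),
    ])
    return np.concatenate([coords, dists, angles])
```

Per frame, or concatenated over a temporal window, vectors like these would then be passed to a Multilayer Perceptron classifier, as the abstract describes.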


Footnotes
2
NMMs are characterized by head position and movement, body position and movement, eye gaze, and facial expressions (FEs).
 
3
Notation indicating that the interrogative facial expression (WH-question) was used throughout the phrase. The symbols < > mark the span during which the expression is executed.
 
4
It was not applied in this study.
 
6
A device capable of capturing RGB images enriched with depth information, as well as acoustic information (http://msdn.microsoft.com/en-us/library/hh855347.aspx).
 
8
This parameter takes a different value in each experiment, always based on the shortest execution time of an expression in the phrases. This prevents a “window” from being large enough to contain frames representing the sequence non-expression – expression – non-expression (a minimal sketch of this constraint follows these footnotes).
 
9
This study offers no evidence for deciding whether this GFE is harder for human annotators to label, or whether classifiers struggle to interpret the transition phases between non-expression – expression – non-expression.
 
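Footnote 8 constrains the temporal window so that no window can contain a complete non-expression – expression – non-expression sequence. A minimal Python sketch of that constraint, assuming binary per-frame labels (an assumption; the paper's labeling format is not given here):

```python
def max_window_size(labels):
    """Largest window length footnote 8 permits: the shortest run of
    consecutive expression frames observed in the labeled phrases.

    `labels` is a per-frame sequence of 0 (non-expression) / 1 (expression);
    this binary labeling scheme is an assumption made for illustration.
    """
    runs, count = [], 0
    for lab in labels:
        if lab == 1:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return min(runs) if runs else 0

def sliding_windows(frames, size):
    """Yield every window of `size` consecutive frames."""
    for i in range(len(frames) - size + 1):
        yield frames[i:i + size]

# Example: the shortest expression below lasts 2 frames, so windows of
# size <= 2 can never span a whole non-expression - expression -
# non-expression sequence.
labels = [0, 0, 1, 1, 1, 0, 1, 1, 0]
assert max_window_size(labels) == 2
```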
Metadata
Title
Grammatical facial expression recognition in sign language discourse: a study at the syntax level
Authors
Fernando A. Freitas
Sarajane M. Peres
Clodoaldo A. M. Lima
Felipe V. Barbosa
Publication date
30.05.2017
Publisher
Springer US
Published in
Information Systems Frontiers / Issue 6/2017
Print ISSN: 1387-3326
Electronic ISSN: 1572-9419
DOI
https://doi.org/10.1007/s10796-017-9765-z
