Published in: Information Systems Frontiers 6/2017

30.05.2017

Grammatical facial expression recognition in sign language discourse: a study at the syntax level

Authors: Fernando A. Freitas, Sarajane M. Peres, Clodoaldo A. M. Lima, Felipe V. Barbosa

Abstract

Facial expression recognition is a well-established research area, largely because of its applicability to many types of systems. Facial expressions are especially important in the construction of discourse through sign language. Sign languages are visual-spatial languages that are not supported by voice intonation; instead, they use facial expressions to convey prosodic aspects and certain grammatical constructions. Such expressions are called Grammatical Facial Expressions (GFEs), and they operate at the morphological and syntactic levels of sign language. GFEs are particularly relevant to automated sign language recognition, since they help remove ambiguity among signs and contribute to the semantic meaning of discourse. This paper presents a study that applies inductive pattern recognition to the problem of automatically recognizing GFEs at the syntactic level of discourse in Libras (Brazilian Sign Language). A Microsoft Kinect sensor was used to capture three-dimensional points on the faces of subjects fluent in sign language, producing a corpus of Libras phrases comprising different syntactic constructions. This corpus was analyzed with classifiers implemented as Multilayer Perceptron neural networks in a series of experiments. The experiments investigated: the recognition complexity inherent to each GFE present in the corpus; the suitability of different vector representations, with descriptive features based on three-dimensional point coordinates and the distances and angles derived from them; the need for temporal information about the execution of expressions during signing; and particularities of data labeling and classifier evaluation in the context of a sign language.
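The abstract describes feature vectors built from three-dimensional facial points and from the distances and angles derived between them. The following Python sketch illustrates one plausible construction under stated assumptions: the landmark indices and the chosen point pairs are hypothetical, since the paper's exact point set and pairings are not given in this abstract.

```python
import numpy as np

# Hypothetical landmark indices into the sensor's facial point array; the
# paper's exact point set and pairings are not specified in this abstract.
LEFT_BROW, RIGHT_BROW, NOSE_TIP, MOUTH_LEFT, MOUTH_RIGHT = 0, 1, 2, 3, 4

def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return float(np.linalg.norm(a - b))

def angle(a, vertex, b):
    """Angle (in radians) at `vertex`, formed by points a and b."""
    u, v = a - vertex, b - vertex
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def frame_features(points):
    """Build one feature vector for a single frame.

    `points` is an (N, 3) array of 3-D facial coordinates. The three
    kinds of representation the abstract mentions correspond to
    (a) raw coordinates, (b) distances, and (c) angles; here they are
    simply concatenated for illustration.
    """
    coords = points.reshape(-1)                              # (a) raw x, y, z
    dists = np.array([
        distance(points[LEFT_BROW], points[RIGHT_BROW]),     # (b) distances
        distance(points[MOUTH_LEFT], points[MOUTH_RIGHT]),
    ])
    angles = np.array([                                      # (c) angles
        angle(points[MOUTH_LEFT], points[NOSE_TIP], points[MOUTH_RIGHT]),
    ])
    return np.concatenate([coords, dists, angles])
```

Per frame, or concatenated over a temporal window, vectors like these would then be passed to a Multilayer Perceptron classifier, as the abstract describes.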


Footnotes
2
NMMs are characterized by head position and movement, body position and movement, eye gaze, and facial expressions (FEs).
 
3
Notation indicating that the interrogative facial expression (WH-question) was used throughout the phrase. The symbols < > mark the span during which the expression is executed.
 
4
It was not applied in this study.
 
6
A device capable of capturing RGB images enriched with depth information, as well as acoustic information (http://msdn.microsoft.com/en-us/library/hh855347.aspx).
 
8
This parameter takes a different value in each experiment, always based on the shortest execution time of an expression in the phrases. This prevents a “window” from being large enough to contain frames representing the sequence non-expression – expression – non-expression (a minimal sketch of this constraint follows these footnotes).
 
9
This study offers no evidence for deciding whether this GFE is harder for human annotators to label, or whether classifiers struggle to interpret the transition phases between non-expression – expression – non-expression.
 
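Footnote 8 constrains the temporal window so that no window can contain a complete non-expression – expression – non-expression sequence. A minimal Python sketch of that constraint, assuming binary per-frame labels (an assumption; the paper's labeling format is not given here):

```python
def max_window_size(labels):
    """Largest window length footnote 8 permits: the shortest run of
    consecutive expression frames observed in the labeled phrases.

    `labels` is a per-frame sequence of 0 (non-expression) / 1 (expression);
    this binary labeling scheme is an assumption made for illustration.
    """
    runs, count = [], 0
    for lab in labels:
        if lab == 1:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return min(runs) if runs else 0

def sliding_windows(frames, size):
    """Yield every window of `size` consecutive frames."""
    for i in range(len(frames) - size + 1):
        yield frames[i:i + size]

# Example: the shortest expression below lasts 2 frames, so windows of
# size <= 2 can never span a whole non-expression - expression -
# non-expression sequence.
labels = [0, 0, 1, 1, 1, 0, 1, 1, 0]
assert max_window_size(labels) == 2
```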
Metadata
Title
Grammatical facial expression recognition in sign language discourse: a study at the syntax level
Authors
Fernando A. Freitas
Sarajane M. Peres
Clodoaldo A. M. Lima
Felipe V. Barbosa
Publication date
30.05.2017
Publisher
Springer US
Published in
Information Systems Frontiers / Issue 6/2017
Print ISSN: 1387-3326
Electronic ISSN: 1572-9419
DOI
https://doi.org/10.1007/s10796-017-9765-z
