
2019 | Original Paper | Chapter

Give Ear to My Face: Modelling Multimodal Attention to Social Interactions

Authors: Giuseppe Boccignone, Vittorio Cuculo, Alessandro D’Amelio, Giuliano Grossi, Raffaella Lanzarotti

Published in: Computer Vision – ECCV 2018 Workshops

Publisher: Springer International Publishing


Abstract

We address the deployment of perceptual attention to social interactions displayed in conversational clips, relying on multimodal information (audio and video). A probabilistic modelling framework is proposed that goes beyond the classic saliency paradigm by integrating multiple information cues. Attentional allocation is determined not just by stimulus-driven selection but, importantly, by social value, which modulates the selection history of relevant multimodal items. The construction of attentional priority is thus the result of a sampling procedure conditioned on the potential value dynamics of socially relevant objects emerging moment to moment within the scene. Preliminary experiments on a publicly available dataset are presented.
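
To make the idea concrete, below is a minimal sketch (not the authors' implementation) of how an attentional priority map could be formed by modulating a stimulus-driven saliency map with value-weighted masks of socially relevant items (e.g. speaker faces), and how the next gaze target could then be drawn by sampling that map. All names, shapes, and the additive combination rule are illustrative assumptions.

import numpy as np

def priority_map(saliency, object_masks, values):
    """Combine a bottom-up saliency map with value-weighted masks of
    socially relevant objects (e.g. the face of the current speaker).

    saliency     : (H, W) array, stimulus-driven conspicuity
    object_masks : list of (H, W) arrays, one per detected object
    values       : list of scalars, momentary social value of each object
    """
    prio = saliency.copy()
    for mask, value in zip(object_masks, values):
        prio += value * mask                  # value modulates object relevance
    prio = np.clip(prio, 0.0, None)
    return prio / prio.sum()                  # normalise to a probability map

def sample_gaze(prio, rng=None):
    """Draw the next fixation by sampling the priority map."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(prio.size, p=prio.ravel())
    return np.unravel_index(idx, prio.shape)  # (row, col) of the sampled fixation

In this toy reading, re-estimating the per-object values frame by frame (e.g. raising the value of whichever face is currently speaking) is what conditions the sampling on the scene's moment-to-moment social dynamics.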


Metadata
Title: Give Ear to My Face: Modelling Multimodal Attention to Social Interactions
Authors: Giuseppe Boccignone, Vittorio Cuculo, Alessandro D’Amelio, Giuliano Grossi, Raffaella Lanzarotti
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-11012-3_27
