2018 | OriginalPaper | Book chapter

Connecting Gaze, Scene, and Attention: Generalized Attention Estimation via Joint Modeling of Gaze and Scene Saliency

Authors: Eunji Chong, Nataniel Ruiz, Yongxin Wang, Yun Zhang, Agata Rozga, James M. Rehg

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

This paper addresses the challenging problem of estimating the general visual attention of people in images. Our proposed method is designed to work across multiple naturalistic social scenarios and provides a full picture of the subject’s attention and gaze. In contrast, earlier works on gaze and attention estimation have focused on constrained problems in more specific contexts. In particular, our model explicitly represents the gaze direction and handles out-of-frame gaze targets. We leverage three different datasets using a multi-task learning approach. We evaluate our method on widely used benchmarks for single tasks such as gaze angle estimation and attention-within-an-image, as well as on the new challenging task of generalized visual attention prediction. In addition, we have created extended annotations for the MMDB and GazeFollow datasets, which are used in our experiments and which we will publicly release.
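One common way to train a single model on several datasets that each label only some of the tasks is to skip the loss terms whose labels are missing. The sketch below illustrates that idea only; it is not the loss used in the paper. The task names (`gaze_angle`, `heatmap`, `in_frame`), the squared-error criterion, and the function `masked_multitask_loss` are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional


def masked_multitask_loss(
    pred: Dict[str, List[float]],
    label: Dict[str, Optional[List[float]]],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Sum weighted per-task mean squared errors, skipping unlabeled tasks.

    A task whose label is None contributes nothing, so a sample from a
    dataset that annotates only one task still yields a valid loss.
    """
    weights = weights or {}
    total = 0.0
    for task, target in label.items():
        if target is None:
            continue  # this dataset does not annotate this task
        output = pred[task]
        mse = sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)
        total += weights.get(task, 1.0) * mse
    return total


# Hypothetical model outputs for one image.
pred = {
    "gaze_angle": [0.1, -0.2],         # e.g. yaw and pitch
    "heatmap": [0.0, 1.0, 0.0, 0.0],   # flattened attention map
    "in_frame": [0.9],                 # score that the target is inside the image
}

# A sample from a dataset that labels only gaze angle.
angle_only = {"gaze_angle": [0.0, 0.0], "heatmap": None, "in_frame": None}

# A sample from a dataset that labels only the attention target.
target_only = {"gaze_angle": None, "heatmap": [0.0, 1.0, 0.0, 0.0], "in_frame": [1.0]}

loss_angle = masked_multitask_loss(pred, angle_only)
loss_target = masked_multitask_loss(pred, target_only)
```

With this masking, batches drawn from different datasets can be mixed freely, since each sample only updates the outputs it has ground truth for. A real system would likely use task-appropriate criteria (for example, a classification loss for the in-frame score) rather than squared error throughout.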

Metadata
Title
Connecting Gaze, Scene, and Attention: Generalized Attention Estimation via Joint Modeling of Gaze and Scene Saliency
Authors
Eunji Chong
Nataniel Ruiz
Yongxin Wang
Yun Zhang
Agata Rozga
James M. Rehg
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01228-1_24
