
2019 | OriginalPaper | Chapter

Deep Multitask Gaze Estimation with a Constrained Landmark-Gaze Model

Authors: Yu Yu, Gang Liu, Jean-Marc Odobez

Published in: Computer Vision – ECCV 2018 Workshops

Publisher: Springer International Publishing


Abstract

As an indicator of attention, gaze is an important cue for analyzing human behavior and social interaction. Recent deep learning methods for gaze estimation rely on plain regression of the gaze direction from images, without accounting for potential mismatches in eye image cropping and normalization. This can degrade the estimation of the implicit relation between visual cues and the gaze direction when dealing with low-resolution images or when training with a limited amount of data. In this paper, we propose a deep multitask framework for gaze estimation with the following contributions. (i) We propose a multitask framework that relies on both synthetic and real data for end-to-end training; during training, each dataset provides the label of only one task, but the two tasks are combined in a constrained way. (ii) We introduce a Constrained Landmark-Gaze Model (CLGM) that models the joint variation of eye landmark locations (including the iris center) and gaze directions. By explicitly relating visual information (landmarks) to the more abstract gaze values, we show that the estimator is more accurate and easier to learn. (iii) We decompose our deep network into, on one hand, a network jointly inferring the parameters of the CLGM model together with the scale and translation of the eye region and, on the other hand, a CLGM-based decoder deterministically inferring landmark positions and gaze from these parameters and the head pose. This decouples gaze estimation from irrelevant geometric variations in the eye image (scale, translation), resulting in a more robust model. Thorough experiments on public datasets demonstrate that our method achieves competitive results, improving over the state of the art on challenging free-head-pose gaze estimation and on eye landmark localization (iris localization).
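To make contribution (iii) concrete, below is a minimal sketch of what a CLGM-style decoder could look like, assuming a linear basis model in which landmarks and gaze share a single coefficient vector. All names (CLGMDecoder, B_l, B_g), the array shapes, and the additive use of head pose are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class CLGMDecoder:
    """Deterministic decoder for a hypothetical linear landmark-gaze model."""

    def __init__(self, mean_landmarks, landmark_basis, mean_gaze, gaze_basis):
        # mean_landmarks: (K, 2) mean 2D positions of the K eye landmarks
        # landmark_basis: (M, K, 2) variation modes for M model coefficients
        # mean_gaze:      (2,) mean gaze angles (yaw, pitch)
        # gaze_basis:     (M, 2) linear map from the same coefficients to gaze
        self.mu_l, self.B_l = mean_landmarks, landmark_basis
        self.mu_g, self.B_g = mean_gaze, gaze_basis

    def decode(self, alpha, scale, translation, head_pose):
        # alpha: (M,) CLGM coefficients predicted by the network.
        # scale / translation: geometric nuisance parameters of the eye crop;
        # they act only on the landmarks, so gaze stays invariant to them.
        landmarks = self.mu_l + np.tensordot(alpha, self.B_l, axes=1)  # (K, 2)
        landmarks = scale * landmarks + translation
        # The paper conditions gaze on head pose; simply adding it to the
        # gaze angles here is a placeholder simplification for illustration.
        gaze = self.mu_g + self.B_g.T @ alpha + np.asarray(head_pose)
        return landmarks, gaze

# Illustration only: random parameters stand in for a learned CLGM.
rng = np.random.default_rng(0)
decoder = CLGMDecoder(rng.normal(size=(25, 2)), rng.normal(size=(8, 25, 2)),
                      np.zeros(2), rng.normal(size=(8, 2)))
landmarks, gaze = decoder.decode(rng.normal(size=8), scale=1.2,
                                 translation=np.array([4.0, -2.0]),
                                 head_pose=(0.1, -0.05))
```

Because scale and translation are predicted separately and applied only to the landmarks, the gaze output is by construction unaffected by how the eye crop was scaled or shifted, which is the decoupling the abstract refers to.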


Footnotes
1
Note that the corrected model relies on real data. In all experiments, the subject(s) in the test set are never used to compute the corrected CLGM model.
 
Metadata
Title: Deep Multitask Gaze Estimation with a Constrained Landmark-Gaze Model
Authors: Yu Yu, Gang Liu, Jean-Marc Odobez
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-11012-3_35
