research-article

Learning an appearance-based gaze estimator from one million synthesised images

Authors:
Erroll Wood, University of Cambridge, United Kingdom
Tadas Baltrušaitis, Carnegie Mellon University
Louis-Philippe Morency, Carnegie Mellon University
Peter Robinson, University of Cambridge, United Kingdom
Andreas Bulling, Max Planck Institute for Informatics, Germany

ETRA '16: Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications, March 2016, Pages 131–138. https://doi.org/10.1145/2857491.2857492

Published: 14 March 2016

ABSTRACT

Learning-based methods for appearance-based gaze estimation achieve state-of-the-art performance in challenging real-world settings but require large amounts of labelled training data. Learning-by-synthesis was proposed as a promising solution to this problem but current methods are limited with respect to speed, appearance variability, and the head pose and gaze angle distribution they can synthesize. We present UnityEyes, a novel method to rapidly synthesize large amounts of variable eye region images as training data. Our method combines a novel generative 3D model of the human eye region with a real-time rendering framework. The model is based on high-resolution 3D face scans and uses real-time approximations for complex eyeball materials and structures as well as anatomically inspired procedural geometry methods for eyelid animation. We show that these synthesized images can be used to estimate gaze in difficult in-the-wild scenarios, even for extreme gaze angles or in cases in which the pupil is fully occluded. We also demonstrate competitive gaze estimation results on a benchmark in-the-wild dataset, despite only using a light-weight nearest-neighbor algorithm. We are making our UnityEyes synthesis framework available online for the benefit of the research community.
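
The abstract mentions a lightweight nearest-neighbor algorithm but does not specify the image features, the distance metric, or the gaze parameterization. As a minimal sketch of the general idea only, the Python below assumes each eye image has already been reduced to a fixed-length feature vector and that gaze is labelled as (pitch, yaw) in degrees. The function names, the squared Euclidean distance, and the angle convention are illustrative assumptions, not the authors' implementation.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor_gaze(query, train_features, train_gaze):
    """Return the gaze label of the training sample closest to the query."""
    best = min(
        range(len(train_features)),
        key=lambda i: squared_distance(query, train_features[i]),
    )
    return train_gaze[best]


def angular_error_deg(g1, g2):
    """Angle in degrees between two gaze directions given as (pitch, yaw).

    Assumed convention: pitch rotates about the horizontal axis, yaw about
    the vertical axis, and (0, 0) looks along +z.
    """
    def to_vec(pitch, yaw):
        p, y = math.radians(pitch), math.radians(yaw)
        return (math.cos(p) * math.sin(y), math.sin(p), math.cos(p) * math.cos(y))

    v1, v2 = to_vec(*g1), to_vec(*g2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(v1, v2))))
    return math.degrees(math.acos(dot))


# Toy stand-in for a synthesized training set (hypothetical values).
train_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
train_gaze = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

predicted = nearest_neighbor_gaze([0.9, 0.1], train_features, train_gaze)
```

A brute-force search like this costs time linear in the number of training samples per query, so with a training set on the order of a million images an approximate nearest-neighbor index would be a common substitute.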

Index Terms

Computing methodologies

Published in
ETRA '16: Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications
March 2016
378 pages
ISBN: 9781450341257
DOI: 10.1145/2857491
Conference Chairs:
Pernilla Qvarfordt, FX Palo Alto Laboratory
Dan Witzner Hansen, IT University of Copenhagen, Denmark
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Emerging Investigator
Author Tags
3D morphable model
appearance-based gaze estimation
learning-by-synthesis
real-time rendering
Conference

Acceptance Rates
Overall Acceptance Rate: 69 of 137 submissions, 50%