Abstract
The field of 3D face modeling has a large gap between high-end and low-end methods. At the high end, the best facial animation is indistinguishable from real humans, but this comes at the cost of extensive manual labor. At the low end, face capture from consumer depth sensors relies on 3D face models that are not expressive enough to capture the variability in natural facial shape and expression. We seek a middle ground by learning a facial model from thousands of accurately aligned 3D scans. Our FLAME model (Faces Learned with an Articulated Model and Expressions) is designed to work with existing graphics software and be easy to fit to data. FLAME uses a linear shape space trained from 3800 scans of human heads. FLAME combines this linear shape space with an articulated jaw, neck, and eyeballs, pose-dependent corrective blendshapes, and additional global expression blendshapes. The pose and expression dependent articulations are learned from 4D face sequences in the D3DFACS dataset along with additional 4D sequences. We accurately register a template mesh to the scan sequences and make the D3DFACS registrations available for research purposes. In total the model is trained from over 33, 000 scans. FLAME is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model. We compare FLAME to these models by fitting them to static 3D scans and 4D sequences using the same optimization method. FLAME is significantly more accurate and is available for research purposes (http://flame.is.tue.mpg.de).
Supplemental Material
Available for Download
Supplemental material.
- O. Alexander, M. Rogers, W. Lambeth, M. Chiang, and P. Debevec. 2009. The Digital Emily Project: Photoreal Facial Modeling and Animation. In SIGGRAPH 2009 Courses. 12:1--12:15. Google ScholarDigital Library
- B. Allen, B. Curless, and Z. Popović. 2003. The space of human body shapes: Reconstruction and parameterization from range scans. In Transactions on Graphics (Proceedings of SIGGRAPH), Vol. 22. 587--594. Google ScholarDigital Library
- B. Allen, B.n Curless, Z. Popović, and A. Hertzmann. 2006. Learning a Correlated Model of Identity and Pose-dependent Body Shape Variation for Real-time Synthesis. In ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '06). 147--156. Google ScholarDigital Library
- B. Amberg, R. Knothe, and T. Vetter. 2008. Expression Invariant 3D Face Recognition with a Morphable Model. In International Conference on Automatic Face Gesture Recognition. 1--6.Google Scholar
- B. Amberg, S. Romdhani, and T. Vetter. 2007. Optimal Step Nonrigid ICP Algorithms for Surface Registration. In Conference on Computer Vision and Pattern Recognition. 1--8.Google Scholar
- D. Anguelov, P. Srinivasan, D. Koller, S. Thrun, J. Rodgers, and J. Davis. 2005. SCAPE: Shape Completion and Animation of People. Transactions on Graphics (Proceedings of SIGGRAPH) 24, 3 (2005), 408--416. Google ScholarDigital Library
- T. Beeler and D. Bradley. 2014. Rigid Stabilization of Facial Expressions. Transactions on Graphics (Proceedings of SIGGRAPH) 33, 4 (2014), 44:1--44:9. Google ScholarDigital Library
- T. Beeler, F. Hahn, D. Bradley, B. Bickel, P. Beardsley, C. Gotsman, R. W. Sumner, and M. Gross. 2011. High-quality passive facial performance capture using anchor frames. Transactions on Graphics (Proceedings of SIGGRAPH) 30, 4 (2011), 75:1--75:10. Google ScholarDigital Library
- V. Blanz and T. Vetter. 1999. A morphable model for the synthesis of 3D faces. In SIGGRAPH. 187--194. Google ScholarDigital Library
- F. Bogo, M. J. Black, M. Loper, and J. Romero. 2015. Detailed Full-Body Reconstructions of Moving People from Monocular RGB-D Sequences. In International Conference on Computer Vision. 2300--2308. Google ScholarDigital Library
- F. Bogo, J. Romero, M. Loper, and M.J. Black. 2014. FAUST: Dataset and Evaluation for 3D Mesh Registration. In Conference on Computer Vision and Pattern Recognition. 3794--3801. Google ScholarDigital Library
- T. Bolkart and S. Wuhrer. 2015. A Groupwise Multilinear Correspondence Optimization for 3D Faces. In International Conference on Computer Vision. 3604--3612. Google ScholarDigital Library
- J. Booth, A. Roussos, A. Ponniah, D. Dunaway, and S. Zafeiriou. 2017. Large Scale 3D Morphable Models. International Journal of Computer Vision (2017), 1--22.Google Scholar
- J. Booth, A. Roussos, S. Zafeiriou, A. Ponniahy, and D. Dunaway. 2016. A 3D Morphable Model Learnt from 10,000 Faces. In Conference on Computer Vision and Pattern Recognition. 5543--5552.Google Scholar
- S. Bouaziz, Y. Wang, and M. Pauly. 2013. Online modeling for realtime facial animation. Transactions on Graphics (Proceedings of SIGGRAPH) 32, 4 (2013), 40:1--40:10. Google ScholarDigital Library
- A. Bronstein, M. Bronstein, and R. Kimmel. 2008. Numerical Geometry of Non-Rigid Shapes (1 ed.). Springer Publishing Company, Incorporated. Google ScholarDigital Library
- A. Brunton, T. Bolkart, and S. Wuhrer. 2014. Multilinear wavelets: A statistical shape space for human faces. In European Conference on Computer Vision. 297--312.Google Scholar
- C. Cao, D. Bradley, K. Zhou, and T. Beeler. 2015. Real-time High-fidelity Facial Performance Capture. Transactions on Graphics (Proceedings of SIGGRAPH) 34, 4 (2015), 46:1--46:9. Google ScholarDigital Library
- C. Cao, Y. Weng, S. Zhou, Y. Tong, and K. Zhou. 2014. FaceWarehouse: A 3D facial expression database for visual computing. Transactions on Visualization and Computer Graphics 20, 3 (2014), 413--425. Google ScholarDigital Library
- Y. Chen, D. Robertson, and R. Cipolla. 2011. A Practical System for Modelling Body Shapes from Single View Measurements. In British Machine Vision Conference. 82.1--82.11.Google Scholar
- D. Cosker, E. Krumhuber, and A. Hilton. 2011. A FACS valid 3D dynamic action unit data-base with applications to 3D dynamic morphable facial modeling. In International Conference on Computer Vision. 2296--2303. Google ScholarDigital Library
- R. Davies, C. Twining, and C. Taylor. 2008. Statistical Models of Shape: Optimisation and Evaluation. Springer. Google ScholarDigital Library
- L. Dutreve, A. Meyer, and S. Bouakaz. 2011. Easy acquisition and real-time animation of facial wrinkles. Computer Animation and Virtual Worlds 22, 2--3 (2011), 169--176. Google ScholarDigital Library
- P. Ekman and W. Friesen. 1978. Facial Action Coding System: A Technique for the Measurement of Facial Movement. Consulting Psychologists Press.Google Scholar
- C. Ferrari, G. Lisanti, S. Berretti, and A. Del Bimbo. 2015. Dictionary Learning based 3D Morphable Model Construction for Face Recognition with Varying Expression and Pose. In International Conference on 3D Vision. 509--517. Google ScholarDigital Library
- FLAME. 2017. http://flame.is.tue.mpg.de. (2017).Google Scholar
- P. Garrido, L. Valgaerts, C. Wu, and C. Theobalt. 2013. Reconstructing detailed dynamic face geometry from monocular video. Transactions on Graphics (Proceedings of SIGGRAPH Asia) 32, 6 (2013), 158:1--158:10. Google ScholarDigital Library
- P. Garrido, M. Zollhöfer, D. Casas, L. Valgaerts, K. Varanasi, P. Perez, and C. Theobalt. 2016. Reconstruction of Personalized 3D Face Rigs from Monocular Video. Transactions on Graphics (Presented at SIGGRAPH 2016) 35, 3 (2016), 28:1--28:15. Google ScholarDigital Library
- S. Geman and D. E. McClure. 1987. Statistical methods for tomographic image reconstruction. Proceedings of the 46th Session of the International Statistical Institute, Bulletin of the ISI 52 (1987).Google Scholar
- N. Hasler, C. Stoll, M. Sunkel, B. Rosenhahn, and H.-P. Seidel. 2009. A Statistical Model of Human Pose and Body Shape. Computer Graphics Forum (2009).Google Scholar
- D.A. Hirshberg, M. Loper, E. Rachlin, and M.J. Black. 2012. Coregistration: Simultaneous alignment and modeling of articulated 3D shape. In European Conference on Computer Vision. 242--255. Google ScholarDigital Library
- A. E. Ichim, S. Bouaziz, and M. Pauly. 2015. Dynamic 3D Avatar Creation from Handheld Video Input. Transactions on Graphics (Proceedings of SIGGRAPH) 34, 4 (2015), 45:1--45:14. Google ScholarDigital Library
- I. Kemelmacher-Shlizerman and S. M. Seitz. 2011. Face Reconstruction in the Wild. In International Conference on Computer Vision. 1746--1753. Google ScholarDigital Library
- L. Kobbelt, S. Campagna, J. Vorsatz, and H.-P. Seidel. 1998. Interactive Multi-resolution Modeling on Arbitrary Meshes. In SIGGRAPH. 105--114. Google ScholarDigital Library
- Y. Kozlov, D. Bradley, M. Bächer, B. Thomaszewski, T. Beeler, and M. Gross. 2017. Enriching Facial Blendshape Rigs with Physical Simulation. Computer Graphics Forum (2017). Google ScholarDigital Library
- H. Li, T. Weise, and M. Pauly. 2010. Example-based facial rigging. Transactions on Graphics (Proceedings of SIGGRAPH) 29, 4 (2010), 32:1--32:6. Google ScholarDigital Library
- H. Li, J. Yu, Y. Ye, and C. Bregler. 2013. Realtime facial animation with on-the-fly correctives. Transactions on Graphics (Proceedings of SIGGRAPH) 32, 4 (2013), 42:1--42:10. Google ScholarDigital Library
- J. Li, W. Xu, Z. Cheng, K. Xu, and R. Klein. 2015. Lightweight wrinkle synthesis for 3D facial modeling and animation. Computer-Aided Design 58 (2015), 117--122. Google ScholarDigital Library
- M. Loper, N. Mahmood, J. Romero, G. Pons-Moll, and M.J. Black. 2015. SMPL: A Skinned Multi-person Linear Model. Transactions on Graphics (Proceedings of SIGGRAPH Asia) 34, 6 (2015), 248:1--248:16. Google ScholarDigital Library
- M. M. Loper and M. J. Black. 2014. OpenDR: An Approximate Differentiable Renderer. In European Conference on Computer Vision. 154--169.Google Scholar
- T. Neumann, K. Varanasi, S. Wenger, M. Wacker, M. Magnor, and C. Theobalt. 2013. Sparse Localized Deformation Components. Transactions on Graphics (Proceedings of SIGGRAPH Asia) 32, 6 (2013), 179:1--179:10. Google ScholarDigital Library
- J. Nocedal and S. J. Wright. 2006. Numerical Optimization. Springer.Google Scholar
- P. Paysan, R. Knothe, B. Amberg, S. Romdhani, and T. Vetter. 2009. A 3D Face Model for Pose and Illumination Invariant Face Recognition. In International Conference on Advanced Video and Signal Based Surveillance. 296--301. Google ScholarDigital Library
- F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825--2830. Google ScholarDigital Library
- L. Pishchulin, S. Wuhrer, T. Helten, C. Theobalt, and B. Schiele. 2017. Building Statistical Shape Spaces for 3D Human Modeling. Pattern Recognition 67, C (July 2017), 276--286. Google ScholarDigital Library
- K. Robinette, S. Blackwell, H. Daanen, M. Boehmer, S. Fleming, T. Brill, D. Hoeferlin, and D. Burnsides. 2002. Civilian American and European Surface Anthropometry Resource (CAESAR) Final Report. Technical Report AFRL-HE-WP-TR-2002-0169. US Air Force Research Laboratory.Google Scholar
- A. Salazar, S. Wuhrer, C. Shu, and F. Prieto. 2014. Fully automatic expression-invariant face correspondence. Machine Vision and Applications 25, 4 (2014), 859--879. Google ScholarDigital Library
- F. Shi, H.-T. Wu, X. Tong, and J. Chai. 2014. Automatic Acquisition of High-fidelity Facial Performances Using Monocular Videos. Transactions on Graphics (Proceedings of SIGGRAPH Asia) 33, 6 (2014), 222:1--222:13. Google ScholarDigital Library
- R.W. Sumner and J. Popović. 2004. Deformation transfer for triangle meshes. Transactions on Graphics (Proceedings of SIGGRAPH) 23, 3 (2004), 399--405. Google ScholarDigital Library
- S. Suwajanakorn, I. Kemelmacher-Shlizerman, and S. M. Seitz. 2014. Total Moving Face Reconstruction. In European Conference on Computer Vision. 796--812.Google Scholar
- S. Suwajanakorn, S. M. Seitz, and I. Kemelmacher-Shlizerman. 2015. What Makes Tom Hanks Look Like Tom Hanks. In International Conference on Computer Vision. 3952--3960. Google ScholarDigital Library
- J. Thies, M. Zollhöfer, M. Nießner, L. Valgaerts, M. Stamminger, and C. Theobalt. 2015. Real-time Expression Transfer for Facial Reenactment. Transactions on Graphics (Proceedings of SIGGRAPH Asia) 34, 6 (2015), 183:1--183:14. Google ScholarDigital Library
- D. Vlasic, M. Brand, H. Pfister, and J. Popović. 2005. Face transfer with multilinear models. Transactions on Graphics (Proceedings of SIGGRAPH) 24, 3 (2005), 426--433. Google ScholarDigital Library
- T. Weise, S. Bouaziz, H. Li, and M. Pauly. 2011. Realtime performance-based facial animation. Transactions on Graphics (Proceedings of SIGGRAPH) 30, 4 (2011), 77:1--77:10. Google ScholarDigital Library
- E. Wood, T. Baltrušaitis, L.-P. Morency, P. Robinson, and A. Bulling. 2016. A 3D morphable eye region model for gaze estimation. In European Conference on Computer Vision. 297--313. Google ScholarDigital Library
- C. Wu, D. Bradley, M. Gross, and T. Beeler. 2016. An Anatomically-constrained Local Deformation Model for Monocular Face Capture. Transactions on Graphics (Proceedings of SIGGRAPH) 35, 4 (2016), 115:1--115:12. Google ScholarDigital Library
- X. Xiong and F. De la Torre. 2013. Supervised descent method and its applications to face alignment. In Conference on Computer Vision and Pattern Recognition. 532--539. Google ScholarDigital Library
- F. Xu, J. Chai, Y. Liu, and X. Tong. 2014. Controllable High-fidelity Facial Performance Transfer. Transactions on Graphics (Proceedings of SIGGRAPH) 33, 4 (2014), 42:1--42:11. Google ScholarDigital Library
- F. Yang, J. Wang, E. Shechtman, L. Bourdev, and D. Metaxas. 2011. Expression flow for 3D-aware face component transfer. Transactions on Graphics (Proceedings of SIGGRAPH) 30, 4 (2011), 60:1--10. Google ScholarDigital Library
- L. Yin, X. Wei, Y. Sun, J. Wang, and M. J. Rosato. 2006. A 3D Facial Expression Database for Facial Behavior Research. In International Conference on Automatic Face and Gesture Recognition. 211--216. Google ScholarDigital Library
Index Terms
- Learning a model of facial shape and expression from 4D scans
Recommendations
3D/4D facial expression analysis: An advanced annotated face model approach
Facial expression analysis has interested many researchers in the past decade due to its potential applications in various fields such as human-computer interaction, psychological studies, and facial animation. Three-dimensional facial data has been ...
Natural facial expression recognition using differential-AAM and manifold learning
This paper proposes a novel natural facial expression recognition method that recognizes a sequence of dynamic facial expression images using the differential active appearance model (AAM) and manifold learning as follows. First, the differential-AAM ...
A bi-modal face recognition framework integrating facial expression with facial appearance
Among many biometric characteristics, the facial biometric is considered to be the least intrusive technology that can be deployed in the real-world visual surveillance environment. However, in facial biometric, little research attention has been paid ...
Comments