
Face transfer with multilinear models

Published: 01 July 2005

Abstract

Face Transfer is a method for mapping video-recorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into the target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target---the attributes are separably controllable. This supports a wide variety of video-rewrite and puppetry applications.

Face Transfer is based on a multilinear model of 3D face meshes that separably parameterizes the space of geometric variations due to different attributes (e.g., identity, expression, and viseme). Separability means that each of these attributes can be varied independently. A multilinear model can be estimated from a Cartesian product of examples (identities × expressions × visemes) with techniques from statistical analysis, but only after careful preprocessing of the geometric data set to secure one-to-one correspondence, to minimize cross-coupling artifacts, and to fill in any missing examples. Face Transfer offers new solutions to these problems and links the estimated model with a face-tracking algorithm to extract pose, expression, and viseme parameters.
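To make the separability idea concrete, a Tucker-style multilinear model can be sketched as a core tensor contracted with one weight vector per attribute (identity, expression, viseme): changing one attribute's weights leaves the others untouched. The dimensions, random core, and weight vectors below are illustrative placeholders, not the paper's actual model sizes or estimated data.

```python
import numpy as np

# Placeholder mode sizes: stacked vertex coordinates, and the number of
# basis elements retained per attribute (hypothetical, for illustration).
n_coords = 9                       # e.g., 3 vertices x 3 coordinates
n_identity, n_expression, n_viseme = 4, 3, 2

# Core tensor of a Tucker decomposition. In the actual system this would be
# estimated offline from a Cartesian product of example meshes
# (identities x expressions x visemes); here it is random for the sketch.
rng = np.random.default_rng(0)
core = rng.standard_normal((n_coords, n_identity, n_expression, n_viseme))

def synthesize(core, w_id, w_expr, w_vis):
    """Contract the core tensor with one parameter vector per attribute
    (mode products), yielding a stacked vertex-coordinate vector."""
    return np.einsum('cijk,i,j,k->c', core, w_id, w_expr, w_vis)

# Separability: varying only the expression weights keeps the identity
# and viseme parameters fixed.
w_id  = np.array([1.0, 0.0, 0.0, 0.0])
w_vis = np.array([0.5, 0.5])
neutral = synthesize(core, w_id, np.array([1.0, 0.0, 0.0]), w_vis)
smiling = synthesize(core, w_id, np.array([0.0, 1.0, 0.0]), w_vis)
```

Because the model is linear in each attribute's weights separately, interpolating expression weights interpolates the resulting mesh while identity and viseme stay fixed, which is what makes the extracted parameters independently editable.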


Supplemental Material

pps004.mp4 (MP4, 31 MB)



Published in

ACM Transactions on Graphics, Volume 24, Issue 3 (July 2005), 826 pages
ISSN: 0730-0301, EISSN: 1557-7368
DOI: 10.1145/1073204

        Copyright © 2005 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

