research-article

Video face replacement

Published: 12 December 2011

Abstract

We present a method for replacing facial performances in video. Our approach accounts for differences in identity, visual appearance, speech, and timing between source and target videos. Unlike prior work, it does not require substantial manual operation or complex acquisition hardware, only single-camera video. We use a 3D multilinear model to track the facial performance in both videos. Using the corresponding 3D geometry, we warp the source to the target face and retime the source to match the target performance. We then compute an optimal seam through the video volume that maintains temporal consistency in the final composite. We showcase the use of our method on a variety of examples and present the result of a user study that suggests our results are difficult to distinguish from real video footage.
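The abstract does not say how the source is retimed to the target. One standard way to align two performances in time is dynamic time warping (DTW), sketched below purely as an illustration and not as the authors' implementation. The function name `dtw_path` and the use of a single scalar feature per frame (for example, a mouth-opening measure derived from the tracked geometry) are assumptions made for this sketch.

```python
def dtw_path(source, target):
    """Align two 1-D per-frame feature sequences with dynamic time warping.

    Returns (total_cost, path), where path is a list of
    (source_frame, target_frame) index pairs.
    Hypothetical helper: the feature choice and cost are assumptions.
    """
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(source[i - 1] - target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # source frame advances
                                 cost[i][j - 1],      # target frame advances
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the last cell to recover the frame correspondences.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

Given the resulting path, each target frame can be mapped to the source frame it is paired with, which yields a retimed source sequence of the same length as the target.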

Supplemental Material

a130-dale.mp4 (MP4, 66.5 MB)

Published in

ACM Transactions on Graphics, Volume 30, Issue 6
December 2011
678 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2070781

Copyright © 2011 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
