Abstract
We present a method for replacing facial performances in video. Our approach accounts for differences in identity, visual appearance, speech, and timing between source and target videos. Unlike prior work, it does not require substantial manual operation or complex acquisition hardware, only single-camera video. We use a 3D multilinear model to track the facial performance in both videos. Using the corresponding 3D geometry, we warp the source to the target face and retime the source to match the target performance. We then compute an optimal seam through the video volume that maintains temporal consistency in the final composite. We showcase the use of our method on a variety of examples and present the result of a user study that suggests our results are difficult to distinguish from real video footage.
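The retiming step mentioned above can be illustrated with a small sketch. The abstract does not say how retiming is computed, so the code below swaps in classic dynamic time warping (DTW) purely as an assumption. The per-frame scalar feature (for example, a mouth-opening measure) and the function names `dtw_path` and `retime` are hypothetical, and both sequences are assumed to be non-empty.

```python
from typing import Dict, List, Sequence, Tuple


def dtw_path(source: Sequence[float], target: Sequence[float]) -> List[Tuple[int, int]]:
    """Return a minimum-cost monotone alignment as (source_idx, target_idx) pairs.

    Assumption: each frame is summarized by one scalar feature, and the
    mismatch between two frames is the absolute difference of those scalars.
    """
    n, m = len(source), len(target)
    inf = float("inf")
    # cost[i][j] = best cost of aligning source[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(source[i - 1] - target[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the last pair to the first along cheapest predecessors.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # advance both sequences
            (cost[i - 1][j], i - 1, j),          # advance source only
            (cost[i][j - 1], i, j - 1),          # advance target only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def retime(source: Sequence[float], target: Sequence[float]) -> List[int]:
    """For every target frame, choose the index of one source frame to show."""
    mapping: Dict[int, int] = {}
    for s, t in dtw_path(source, target):
        mapping[t] = s  # if several source frames match, the last one wins
    return [mapping[t] for t in range(len(target))]


if __name__ == "__main__":
    src = [0.0, 1.0, 0.0]            # hypothetical source feature track
    tgt = [0.0, 0.0, 1.0, 1.0, 0.0]  # hypothetical, slower target track
    print(retime(src, tgt))
```

A real system would likely compare richer per-frame descriptors derived from the tracked 3D geometry and constrain how far the warp may stretch or compress time. The scalar feature here only keeps the sketch short.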
Index Terms
- Video face replacement
Published in SA '11: Proceedings of the 2011 SIGGRAPH Asia Conference.