research-article

Real-time non-rigid reconstruction using an RGB-D camera

Authors:
Michael Zollhöfer (University of Erlangen-Nuremberg),
Matthias Nießner (Stanford University),
Shahram Izadi (Microsoft Research),
Christoph Rhemann (Microsoft Research),
Christopher Zach (Microsoft Research),
Matthew Fisher (Stanford University),
Chenglei Wu (Max Planck Institute for Informatics),
Andrew Fitzgibbon (Microsoft Research),
Charles Loop (Microsoft Research),
Christian Theobalt (Max Planck Institute for Informatics),
Marc Stamminger (University of Erlangen-Nuremberg)

ACM Transactions on Graphics, Volume 33, Issue 4, Article No.: 156, pp 1–12, https://doi.org/10.1145/2601097.2601165

Published: 27 July 2014

Abstract

We present a combined hardware and software solution for markerless reconstruction of non-rigidly deforming physical objects with arbitrary shape in real-time. Our system uses a single self-contained stereo camera unit built from off-the-shelf components and consumer graphics hardware to generate spatio-temporally coherent 3D models at 30 Hz. A new stereo matching algorithm estimates real-time RGB-D data. We start by scanning a smooth template model of the subject as they move rigidly. This geometric surface prior avoids strong scene assumptions, such as a kinematic human skeleton or a parametric shape model. Next, a novel GPU pipeline performs non-rigid registration of live RGB-D data to the smooth template using an extended non-linear as-rigid-as-possible (ARAP) framework. High-frequency details are fused onto the final mesh using a linear deformation model. The system is an order of magnitude faster than state-of-the-art methods, while matching the quality and robustness of many offline algorithms. We show precise real-time reconstructions of diverse scenes, including: large deformations of users' heads, hands, and upper bodies; fine-scale wrinkles and folds of skin and clothing; and non-rigid interactions performed by users on flexible objects such as toys. We demonstrate how acquired models can be used for many interactive scenarios, including re-texturing, online performance capture and preview, and real-time shape and motion re-targeting.
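The as-rigid-as-possible (ARAP) idea mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' GPU pipeline and omits the data terms that tie the template to live RGB-D input; it only evaluates the classic ARAP regularizer in the style of Sorkine and Alexa (2007) on a toy 2D mesh, assuming uniform edge weights. All function and variable names (`best_rotation_angle`, `arap_energy`, `rigid_transform`, the square mesh) are hypothetical and chosen for illustration. In 2D the best per-vertex rotation has a closed form, so no linear algebra library is needed.

```python
import math

def best_rotation_angle(rest_edges, deformed_edges):
    """Angle of the 2D rotation R that minimizes sum ||d - R e||^2 over edge pairs."""
    dot = sum(e[0] * d[0] + e[1] * d[1] for e, d in zip(rest_edges, deformed_edges))
    cross = sum(e[0] * d[1] - e[1] * d[0] for e, d in zip(rest_edges, deformed_edges))
    return math.atan2(cross, dot)

def arap_energy(rest, deformed, neighbors):
    """Sum over vertices i and neighbors j of ||(q_j - q_i) - R_i (p_j - p_i)||^2.

    Assumes uniform edge weights; R_i is the best-fit rotation of vertex i's edge fan.
    """
    total = 0.0
    for i, nbrs in neighbors.items():
        rest_edges = [(rest[j][0] - rest[i][0], rest[j][1] - rest[i][1]) for j in nbrs]
        def_edges = [(deformed[j][0] - deformed[i][0], deformed[j][1] - deformed[i][1])
                     for j in nbrs]
        theta = best_rotation_angle(rest_edges, def_edges)
        c, s = math.cos(theta), math.sin(theta)
        for e, d in zip(rest_edges, def_edges):
            # Rotate the rest-pose edge and compare it with the deformed edge.
            rx, ry = c * e[0] - s * e[1], s * e[0] + c * e[1]
            total += (d[0] - rx) ** 2 + (d[1] - ry) ** 2
    return total

def rigid_transform(points, angle, tx, ty):
    """Rotate all points by `angle` (radians) and translate them by (tx, ty)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

# Toy template: a unit square whose vertices are connected along its boundary.
rest = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

rigid = rigid_transform(rest, math.radians(30), 2.0, -1.0)
stretched = [(2.0 * x, y) for x, y in rest]

print(arap_energy(rest, rigid, neighbors))      # close to zero: rigid motion is free
print(arap_energy(rest, stretched, neighbors))  # clearly positive: stretching is penalized
```

A rigid motion of the template costs essentially nothing, while a stretch is penalized, which is the property that lets such a term act as a deformation prior without a skeleton or parametric shape model.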

Supplemental Material

a156-sidebyside.mp4 (mp4, 22.9 MB)
a156-zollhofer.zip (zip, 116.5 MB): Supplemental material.

Index Terms

Real-time non-rigid reconstruction using an RGB-D camera
1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Document scanning
      2. Graphics recognition and interpretation
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
  2. Computer graphics
    1. Image manipulation

Published in

ACM Transactions on Graphics Volume 33, Issue 4
July 2014
1366 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/2601097

Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
3D scanning
deformation
depth camera
non-rigid
shape
stereo matching
surface reconstruction

Article Metrics
- Total Citations: 317
- Total Downloads: 3,483
- Downloads (Last 12 months): 151
- Downloads (Last 6 weeks): 15