Abstract
We present a combined hardware and software solution for markerless reconstruction of non-rigidly deforming physical objects with arbitrary shape in real time. Our system uses a single self-contained stereo camera unit built from off-the-shelf components and consumer graphics hardware to generate spatio-temporally coherent 3D models at 30 Hz. A new stereo matching algorithm estimates RGB-D data in real time. We start by scanning a smooth template model of the subject as they move rigidly. This geometric surface prior avoids strong scene assumptions, such as a kinematic human skeleton or a parametric shape model. Next, a novel GPU pipeline performs non-rigid registration of live RGB-D data to the smooth template using an extended non-linear as-rigid-as-possible (ARAP) framework. High-frequency details are fused onto the final mesh using a linear deformation model. The system is an order of magnitude faster than state-of-the-art methods, while matching the quality and robustness of many offline algorithms. We show precise real-time reconstructions of diverse scenes, including large deformations of users' heads, hands, and upper bodies; fine-scale wrinkles and folds of skin and clothing; and non-rigid interactions performed by users on flexible objects such as toys. We demonstrate how acquired models can be used for many interactive scenarios, including re-texturing, online performance capture and preview, and real-time shape and motion re-targeting.
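To make the as-rigid-as-possible idea concrete, the following is a minimal sketch of the classic ARAP energy term only, assuming uniform edge weights by default and per-vertex rotations that are already known. It is not the extended non-linear energy or the GPU solver described in the abstract; the names `arap_energy` and `rot_z`, the data layout, and the weighting are all illustrative assumptions.

```python
import math


def arap_energy(rest, deformed, neighbors, rotations, weights=None):
    """Classic ARAP energy (hypothetical helper, not the paper's solver).

    Sums w_ij * ||(p'_i - p'_j) - R_i (p_i - p_j)||^2 over all directed
    edges (i, j), where p are rest positions, p' deformed positions and
    R_i a 3x3 rotation (nested lists) attached to vertex i.
    """
    energy = 0.0
    for i, nbrs in neighbors.items():
        R = rotations[i]
        for j in nbrs:
            # Edge vectors in the rest pose and in the deformed pose.
            e_rest = [a - b for a, b in zip(rest[i], rest[j])]
            e_def = [a - b for a, b in zip(deformed[i], deformed[j])]
            # Rest edge rotated by the local rotation of vertex i.
            e_rot = [sum(R[r][c] * e_rest[c] for c in range(3)) for r in range(3)]
            # Uniform weights are an assumption made for this sketch.
            w = 1.0 if weights is None else weights[(i, j)]
            energy += w * sum((d - r) ** 2 for d, r in zip(e_def, e_rot))
    return energy


def rot_z(theta):
    """3x3 rotation about the z axis, used only to build a toy example."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

In this sketch, a mesh that has only been rotated rigidly has near-zero energy when every vertex carries that same rotation, while local stretching or bending raises it. A registration method in this family would trade such a term off against data terms that pull the template toward the observed RGB-D samples.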
Index Terms
- Real-time non-rigid reconstruction using an RGB-D camera