Skip to main content
Top

2020 | OriginalPaper | Chapter

TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video

Authors : Tiancheng Zhi, Christoph Lassner, Tony Tung, Carsten Stoll, Srinivasa G. Narasimhan, Minh Vo

Published in: Computer Vision – ECCV 2020

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

We present TexMesh, a novel approach to reconstruct detailed human meshes with high-resolution full-body texture from RGB-D video. TexMesh enables high quality free-viewpoint rendering of humans. Given the RGB frames, the captured environment map, and the coarse per-frame human mesh from RGB-D tracking, our method reconstructs spatiotemporally consistent and detailed per-frame meshes along with a high-resolution albedo texture. By using the incident illumination we are able to accurately estimate local surface geometry and albedo, which allows us to further use photometric constraints to adapt a synthetically trained model to real-world sequences in a self-supervised manner for detailed surface geometry and high-resolution texture estimation. In practice, we train our models on a short example sequence for self-adaptation and the model runs at interactive framerate afterwards. We validate TexMesh on synthetic and real-world data, and show it outperforms the state of art quantitatively and qualitatively.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Literature
1.
go back to reference Alldieck, T., Magnor, M., Bhatnagar, B.L., Theobalt, C., Pons-Moll, G.: Learning to reconstruct people in clothing from a single RGB camera. In: CVPR (2019) Alldieck, T., Magnor, M., Bhatnagar, B.L., Theobalt, C., Pons-Moll, G.: Learning to reconstruct people in clothing from a single RGB camera. In: CVPR (2019)
2.
go back to reference Alldieck, T., Magnor, M., Xu, W., Theobalt, C., Pons-Moll, G.: Detailed human avatars from monocular video. In: 3DV (2018) Alldieck, T., Magnor, M., Xu, W., Theobalt, C., Pons-Moll, G.: Detailed human avatars from monocular video. In: 3DV (2018)
3.
go back to reference Alldieck, T., Magnor, M., Xu, W., Theobalt, C., Pons-Moll, G.: Video based reconstruction of 3D people models. In: CVPR (2018) Alldieck, T., Magnor, M., Xu, W., Theobalt, C., Pons-Moll, G.: Video based reconstruction of 3D people models. In: CVPR (2018)
4.
go back to reference Alldieck, T., Pons-Moll, G., Theobalt, C., Magnor, M.: Tex2shape: detailed full human body geometry from a single image. In: CVPR (2019) Alldieck, T., Pons-Moll, G., Theobalt, C., Magnor, M.: Tex2shape: detailed full human body geometry from a single image. In: CVPR (2019)
5.
go back to reference Barron, J.T.: A general and adaptive robust loss function. In: CVPR (2019) Barron, J.T.: A general and adaptive robust loss function. In: CVPR (2019)
6.
go back to reference Bhatnagar, B.L., Tiwari, G., Theobalt, C., Pons-Moll, G.: Multi-Garment Net: learning to dress 3D people from images. In: ICCV (2019) Bhatnagar, B.L., Tiwari, G., Theobalt, C., Pons-Moll, G.: Multi-Garment Net: learning to dress 3D people from images. In: ICCV (2019)
7.
go back to reference Blinn, J.F., Newell, M.E.: Texture and reflection in computer generated images. Commun. ACM 19(10), 542–547 (1976)CrossRef Blinn, J.F., Newell, M.E.: Texture and reflection in computer generated images. Commun. ACM 19(10), 542–547 (1976)CrossRef
8.
go back to reference Bogo, F., Black, M.J., Loper, M., Romero, J.: Detailed full-body reconstructions of moving people from monocular RGB-D sequences. In: ICCV (2015) Bogo, F., Black, M.J., Loper, M., Romero, J.: Detailed full-body reconstructions of moving people from monocular RGB-D sequences. In: ICCV (2015)
9.
10.
go back to reference Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: Openpose: realtime multi-person 2D pose estimation using part affinity fields. TPAMI (2019) Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: Openpose: realtime multi-person 2D pose estimation using part affinity fields. TPAMI (2019)
11.
go back to reference Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017) Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:​1706.​05587 (2017)
12.
go back to reference Collet, A., et al.: High-quality streamable free-viewpoint video. TOG 34, 1–13 (2015)CrossRef Collet, A., et al.: High-quality streamable free-viewpoint video. TOG 34, 1–13 (2015)CrossRef
13.
go back to reference Gardner, M.A., et al.: Learning to predict indoor illumination from a single image. TOG (SIGGRAPH Asia) 9(4) (2017) Gardner, M.A., et al.: Learning to predict indoor illumination from a single image. TOG (SIGGRAPH Asia) 9(4) (2017)
14.
go back to reference Grigorev, A., Sevastopolsky, A., Vakhitov, A., Lempitsky, V.: Coordinate-based texture inpainting for pose-guided human image generation. In: CVPR (2019) Grigorev, A., Sevastopolsky, A., Vakhitov, A., Lempitsky, V.: Coordinate-based texture inpainting for pose-guided human image generation. In: CVPR (2019)
15.
go back to reference Habermann, M., Xu, W., Zollhoefer, M., Pons-Moll, G., Theobalt, C.: Livecap: real-time human performance capture from monocular video. TOG 38(2), 1–17 (2019)CrossRef Habermann, M., Xu, W., Zollhoefer, M., Pons-Moll, G., Theobalt, C.: Livecap: real-time human performance capture from monocular video. TOG 38(2), 1–17 (2019)CrossRef
16.
go back to reference Huang, Y., et al.: Towards accurate marker-less human shape and pose estimation over time. In: 3DV (2017) Huang, Y., et al.: Towards accurate marker-less human shape and pose estimation over time. In: 3DV (2017)
17.
go back to reference Huang, Z., Xu, Y., Lassner, C., Li, H., Tung, T.: ARCH: animatable reconstruction of clothed humans. In: CVPR (2020) Huang, Z., Xu, Y., Lassner, C., Li, H., Tung, T.: ARCH: animatable reconstruction of clothed humans. In: CVPR (2020)
18.
go back to reference Jain, A., Thormählen, T., Seidel, H.P., Theobalt, C.: MovieReshape: tracking and reshaping of humans in videos. TOG 29(6), 1–10 (2010)CrossRef Jain, A., Thormählen, T., Seidel, H.P., Theobalt, C.: MovieReshape: tracking and reshaping of humans in videos. TOG 29(6), 1–10 (2010)CrossRef
20.
go back to reference Kanade, T., Rander, P., Narayanan, P.: Virtualized reality: constructing virtual worlds from real scenes. IEEE Multimed. 4(1), 34–47 (1997)CrossRef Kanade, T., Rander, P., Narayanan, P.: Virtualized reality: constructing virtual worlds from real scenes. IEEE Multimed. 4(1), 34–47 (1997)CrossRef
21.
go back to reference Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: CVPR (2018) Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: CVPR (2018)
22.
go back to reference Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015) Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
24.
go back to reference Lassner, C., Romero, J., Kiefel, M., Bogo, F., Black, M.J., Gehler, P.V.: Unite the people: closing the loop between 3D and 2D human representations. In: CVPR (2017) Lassner, C., Romero, J., Kiefel, M., Bogo, F., Black, M.J., Gehler, P.V.: Unite the people: closing the loop between 3D and 2D human representations. In: CVPR (2017)
25.
go back to reference Lengyel, E.: Mathematics for 3D Game Programming and Computer Graphics. Cengage Learning, Boston (2012)MATH Lengyel, E.: Mathematics for 3D Game Programming and Computer Graphics. Cengage Learning, Boston (2012)MATH
26.
go back to reference Li, H., Sumner, R.W., Pauly, M.: Global correspondence optimization for non-rigid registration of depth scans. In: CGF (2008) Li, H., Sumner, R.W., Pauly, M.: Global correspondence optimization for non-rigid registration of depth scans. In: CGF (2008)
27.
go back to reference Liu, S., Li, T., Chen, W., Li, H.: Soft Rasterizer: a differentiable renderer for image-based 3d reasoning. In: ICCV (2019) Liu, S., Li, T., Chen, W., Li, H.: Soft Rasterizer: a differentiable renderer for image-based 3d reasoning. In: ICCV (2019)
28.
go back to reference Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: SMPL: a skinned multi-person linear model. TOG 34(6), 1–16 (2015)CrossRef Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: SMPL: a skinned multi-person linear model. TOG 34(6), 1–16 (2015)CrossRef
29.
go back to reference Matsuyama, T., Takai, T.: Generation, visualization, and editing of 3D video. In: 3DPVT (2002) Matsuyama, T., Takai, T.: Generation, visualization, and editing of 3D video. In: 3DPVT (2002)
30.
go back to reference Newcombe, R.A., Fox, D., Seitz, S.M.: DynamicFusion: reconstruction and tracking of non-rigid scenes in real-time. In: CVPR (2015) Newcombe, R.A., Fox, D., Seitz, S.M.: DynamicFusion: reconstruction and tracking of non-rigid scenes in real-time. In: CVPR (2015)
31.
go back to reference Newcombe, R.A., et al.: KinectFusion: real-time dense surface mapping and tracking. In: 2011 10th IEEE International Symposium on Mixed and Augmented Reality, pp. 127–136. IEEE (2011) Newcombe, R.A., et al.: KinectFusion: real-time dense surface mapping and tracking. In: 2011 10th IEEE International Symposium on Mixed and Augmented Reality, pp. 127–136. IEEE (2011)
32.
go back to reference Oechsle, M., Mescheder, L., Niemeyer, M., Strauss, T., Geiger, A.: Texture fields: learning texture representations in function space. In: ICCV (2019) Oechsle, M., Mescheder, L., Niemeyer, M., Strauss, T., Geiger, A.: Texture fields: learning texture representations in function space. In: ICCV (2019)
33.
go back to reference Omran, M., Lassner, C., Pons-Moll, G., Gehler, P., Schiele, B.: Neural body fitting: unifying deep learning and model based human pose and shape estimation. In: 3DV (2018) Omran, M., Lassner, C., Pons-Moll, G., Gehler, P., Schiele, B.: Neural body fitting: unifying deep learning and model based human pose and shape estimation. In: 3DV (2018)
34.
go back to reference Piccardi, M.: Background subtraction techniques: a review. In: 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No. 04CH37583), vol. 4, pp. 3099–3104. IEEE (2004) Piccardi, M.: Background subtraction techniques: a review. In: 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No. 04CH37583), vol. 4, pp. 3099–3104. IEEE (2004)
35.
go back to reference Ramamoorthi, R., Hanrahan, P.: An efficient representation for irradiance environment maps. In: SIGGRAPH (2001) Ramamoorthi, R., Hanrahan, P.: An efficient representation for irradiance environment maps. In: SIGGRAPH (2001)
36.
38.
go back to reference Saito, S., Huang, Z., Natsume, R., Morishima, S., Kanazawa, A., Li, H.: PIFu: pixel-aligned implicit function for high-resolution clothed human digitization. In: ICCV (2019) Saito, S., Huang, Z., Natsume, R., Morishima, S., Kanazawa, A., Li, H.: PIFu: pixel-aligned implicit function for high-resolution clothed human digitization. In: ICCV (2019)
39.
go back to reference Sela, M., Richardson, E., Kimmel, R.: Unrestricted facial geometry reconstruction using image-to-image translation. In: ICCV (2017) Sela, M., Richardson, E., Kimmel, R.: Unrestricted facial geometry reconstruction using image-to-image translation. In: ICCV (2017)
40.
go back to reference Sengupta, S., Kanazawa, A., Castillo, C.D., Jacobs, D.W.: SfSNet: learning shape, reflectance and illuminance of faces ‘in the wild’. In: CVPR (2018) Sengupta, S., Kanazawa, A., Castillo, C.D., Jacobs, D.W.: SfSNet: learning shape, reflectance and illuminance of faces ‘in the wild’. In: CVPR (2018)
41.
go back to reference Shysheya, A., et al.: Textured neural avatars. In: CVPR (2019) Shysheya, A., et al.: Textured neural avatars. In: CVPR (2019)
42.
go back to reference Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: ICLR (2015) Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: ICLR (2015)
43.
go back to reference Sorkine, O.: Differential representations for mesh processing. In: CGF (2006) Sorkine, O.: Differential representations for mesh processing. In: CGF (2006)
44.
go back to reference Tewari, A., et al.: FML: face model learning from videos. In: CVPR (2019) Tewari, A., et al.: FML: face model learning from videos. In: CVPR (2019)
45.
go back to reference Tewari, A., et al.: Self-supervised multi-level face model learning for monocular reconstruction at over 250 Hz. In: CVPR (2018) Tewari, A., et al.: Self-supervised multi-level face model learning for monocular reconstruction at over 250 Hz. In: CVPR (2018)
46.
go back to reference Ulyanov, D., Vedaldi, A., Lempitsky, V.: Deep image prior. In: CVPR (2018) Ulyanov, D., Vedaldi, A., Lempitsky, V.: Deep image prior. In: CVPR (2018)
47.
go back to reference Vlasic, D., Peers, P., Baran, I., Debevec, P., Popović, J., Rusinkiewicz, S., Matusik, W.: Dynamic shape capture using multi-view photometric stereo. TOG (SIGGRAPH Asia) 28(5) (2009) Vlasic, D., Peers, P., Baran, I., Debevec, P., Popović, J., Rusinkiewicz, S., Matusik, W.: Dynamic shape capture using multi-view photometric stereo. TOG (SIGGRAPH Asia) 28(5) (2009)
48.
go back to reference Vo, M., Narasimhan, S.G., Sheikh, Y.: Spatiotemporal bundle adjustment for dynamic 3D reconstruction. In: CVPR (2016) Vo, M., Narasimhan, S.G., Sheikh, Y.: Spatiotemporal bundle adjustment for dynamic 3D reconstruction. In: CVPR (2016)
49.
go back to reference Walsman, A., Wan, W., Schmidt, T., Fox, D.: Dynamic high resolution deformable articulated tracking. In: 3DV (2017) Walsman, A., Wan, W., Schmidt, T., Fox, D.: Dynamic high resolution deformable articulated tracking. In: 3DV (2017)
50.
go back to reference Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. TIP 13(4), 600–612 (2004) Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. TIP 13(4), 600–612 (2004)
51.
go back to reference Xu, W., et al.: MonoPerfCap: human performance capture from monocular video. TOG 37, 27:1–27:15 (2018) Xu, W., et al.: MonoPerfCap: human performance capture from monocular video. TOG 37, 27:1–27:15 (2018)
52.
go back to reference Xu, Y., Zhu, S.C., Tung, T.: DenseRaC: joint 3D pose and shape estimation by dense render and compare. In: ICCV (2019) Xu, Y., Zhu, S.C., Tung, T.: DenseRaC: joint 3D pose and shape estimation by dense render and compare. In: ICCV (2019)
54.
go back to reference Yu, T., et al.: BodyFusion: real-time capture of human motion and surface geometry using a single depth camera. In: ICCV (2017) Yu, T., et al.: BodyFusion: real-time capture of human motion and surface geometry using a single depth camera. In: ICCV (2017)
55.
go back to reference Yu, T., et al.: DoubleFusion: real-time capture of human performances with inner body shapes from a single depth sensor. In: CVPR (2018) Yu, T., et al.: DoubleFusion: real-time capture of human performances with inner body shapes from a single depth sensor. In: CVPR (2018)
56.
go back to reference Yu, T., et al.: SimulCap: single-view human performance capture with cloth simulation. In: CVPR (2019) Yu, T., et al.: SimulCap: single-view human performance capture with cloth simulation. In: CVPR (2019)
57.
go back to reference Zheng, Z., Yu, T., Wei, Y., Dai, Q., Liu, Y.: DeepHuman: 3D human reconstruction from a single image. In: ICCV (2019) Zheng, Z., Yu, T., Wei, Y., Dai, Q., Liu, Y.: DeepHuman: 3D human reconstruction from a single image. In: ICCV (2019)
58.
go back to reference Zhou, S., Fu, H., Liu, L., Cohen-Or, D., Han, X.: Parametric reshaping of human bodies in images. TOG 29(4), 1–10 (2010)CrossRef Zhou, S., Fu, H., Liu, L., Cohen-Or, D., Han, X.: Parametric reshaping of human bodies in images. TOG 29(4), 1–10 (2010)CrossRef
59.
go back to reference Zhu, H., Zuo, X., Wang, S., Cao, X., Yang, R.: Detailed human shape estimation from a single image by hierarchical mesh deformation. In: CVPR (2019) Zhu, H., Zuo, X., Wang, S., Cao, X., Yang, R.: Detailed human shape estimation from a single image by hierarchical mesh deformation. In: CVPR (2019)
Metadata
Title
TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video
Authors
Tiancheng Zhi
Christoph Lassner
Tony Tung
Carsten Stoll
Srinivasa G. Narasimhan
Minh Vo
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-58607-2_29

Premium Partner