2016 | Original Paper | Book Chapter

Combining Human Body Shape and Pose Estimation for Robust Upper Body Tracking Using a Depth Sensor

Authors: Thomas Probst, Andrea Fossati, Luc Van Gool

Published in: Computer Vision – ECCV 2016 Workshops

Publisher: Springer International Publishing

Abstract

Rapid and accurate estimation of a person’s upper body shape and real-time tracking of the pose in the presence of occlusions are crucial for many future assistive technologies, health care applications and telemedicine systems. We propose to tackle this challenging problem by combining data-driven and generative methods for both body shape and pose estimation. Our strategy comprises a subspace-based method that predicts body shape directly from a single depth map, and a random forest regression approach that provides a sound initialization for upper body pose estimation. We propose a model-fitting strategy that refines the estimated body shape and exploits body shape information to improve pose accuracy. During tracking, we feed the refinement results back into the forest-based joint position regressor to stabilize and accelerate pose estimation over time. Our tracking framework is designed to cope with viewpoint limitations and occlusions due to dynamic objects.

Footnotes
2
Note that solving the equation \(I({\mathbf {c}})=\varPi (P{\mathbf {c}})\) for \({\mathbf {c}}\) is an overdetermined problem as long as there are more than \(N_d\) visible points/pixels. If all pixels are taken into account, the result of the optimization equals the projection of the canonical depth map onto the eigenimages.
 
3
Note that this vector concatenates the leaves of all the trees in the forest.
 
4
Evaluation hardware: Intel Core i7-4790K CPU, 4.00 GHz, 16 GB RAM.
 
5
Robot Operating System (ROS). http://www.ros.org/
 
6
Hardware: Intel Core i7-4510U, 2.0 GHz, 8 GB RAM.
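
The observation in footnote 2 can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes a purely linear subspace model with orthonormal eigenimages and a mean-subtracted depth map, and it replaces the projection \(\varPi\) with a simple visibility mask. The function name `fit_coefficients`, the sizes and the noise level are all hypothetical, chosen only for illustration.

```python
import numpy as np


def fit_coefficients(basis, observed, visible):
    """Least-squares subspace coefficients using only the visible pixels.

    basis:    (n_pixels, n_d) matrix whose columns are eigenimages
    observed: (n_pixels,) flattened, mean-subtracted depth map
    visible:  (n_pixels,) boolean mask of pixels with valid depth
    """
    # The system is overdetermined only if there are more visible
    # pixels than unknown coefficients.
    if visible.sum() <= basis.shape[1]:
        raise ValueError("need more visible pixels than coefficients")
    c, *_ = np.linalg.lstsq(basis[visible], observed[visible], rcond=None)
    return c


# Synthetic data (assumption): random orthonormal eigenimages and a
# slightly noisy depth map generated from known coefficients.
rng = np.random.default_rng(0)
n_pixels, n_d = 200, 5
basis, _ = np.linalg.qr(rng.standard_normal((n_pixels, n_d)))
c_true = rng.standard_normal(n_d)
observed = basis @ c_true + 0.01 * rng.standard_normal(n_pixels)

# Case 1: part of the image is occluded, so only visible rows are used.
visible = rng.random(n_pixels) > 0.4
c_masked = fit_coefficients(basis, observed, visible)

# Case 2: every pixel is visible. With orthonormal eigenimages the
# least-squares solution coincides with a plain projection.
c_full = fit_coefficients(basis, observed, np.ones(n_pixels, dtype=bool))
c_proj = basis.T @ observed
```

Under these assumptions, `c_full` and `c_proj` agree up to floating-point error, while `c_masked` stays close to `c_true` as long as enough pixels remain visible.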
 
Metadata
Title
Combining Human Body Shape and Pose Estimation for Robust Upper Body Tracking Using a Depth Sensor
Authors
Thomas Probst
Andrea Fossati
Luc Van Gool
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-48881-3_20