Real-time human pose recognition in parts from single depth images

Published: 01 January 2013

Abstract

We propose a new method to quickly and accurately predict human pose (the 3D positions of body joints) from a single depth image, without depending on information from preceding frames. Our approach is strongly rooted in current object recognition strategies. By designing an intermediate representation in terms of body parts, the difficult pose estimation problem is transformed into a simpler per-pixel classification problem, for which efficient machine learning techniques exist. By using computer graphics to synthesize a very large dataset of training image pairs, one can train a classifier that estimates body part labels from test images invariant to pose, body shape, clothing, and other irrelevant variation. Finally, we generate confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.

The system runs in under 5 ms on the Xbox 360. Our evaluation shows high accuracy on both synthetic and real test sets, and investigates the effect of several training parameters. We achieve state-of-the-art accuracy in our comparison with related work and demonstrate improved generalization over exact whole-skeleton nearest neighbor matching.
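
The final step of the abstract (reprojecting the per-pixel classification into 3D and finding local modes) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, a per-pixel probability map `part_prob` for one body part produced by some classifier, and weighted Gaussian mean shift as the mode finder; the abstract itself does not name the mode-finding algorithm, and the function names, `bandwidth`, and `min_prob` are illustrative choices.

```python
import numpy as np

def reproject(depth, fx, fy, cx, cy):
    """Back-project a depth image (metres) into camera-space 3D points, shape (H, W, 3).
    Assumes a simple pinhole model; the intrinsics are hypothetical parameters."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]          # v = row index, u = column index
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def find_mode(points, weights, start, bandwidth=0.06, iters=30, tol=1e-5):
    """Weighted Gaussian mean shift started at `start`.
    Returns the converged mode and a confidence (kernel-weighted sum at the mode)."""
    mode = np.asarray(start, dtype=float)
    for _ in range(iters):
        d2 = np.sum((points - mode) ** 2, axis=1)
        k = weights * np.exp(-d2 / (2.0 * bandwidth ** 2))
        total = k.sum()
        if total <= 0.0:
            break
        new_mode = (k[:, None] * points).sum(axis=0) / total
        shift = np.linalg.norm(new_mode - mode)
        mode = new_mode
        if shift < tol:
            break
    d2 = np.sum((points - mode) ** 2, axis=1)
    confidence = float((weights * np.exp(-d2 / (2.0 * bandwidth ** 2))).sum())
    return mode, confidence

def joint_proposal(depth, part_prob, fx, fy, cx, cy, min_prob=0.1):
    """Return one (position, confidence) proposal for a body part, or None
    if no valid pixel has probability above `min_prob`."""
    pts = reproject(depth, fx, fy, cx, cy).reshape(-1, 3)
    w = part_prob.reshape(-1)
    keep = (w > min_prob) & (pts[:, 2] > 0)   # drop unlikely and invalid-depth pixels
    if not keep.any():
        return None
    pts, w = pts[keep], w[keep]
    start = pts[np.argmax(w)]                 # seed at the most confident pixel
    return find_mode(pts, w, start)
```

A fuller system would seed mean shift from several pixels to recover multiple modes per part; this sketch returns a single proposal from one seed to keep the idea visible.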

Published in

Communications of the ACM, Volume 56, Issue 1
January 2013
117 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/2398356

Copyright © 2013 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
