Real-time human pose recognition in parts from single depth images

Authors:
Jamie Shotton, Microsoft Research, Cambridge, UK
Toby Sharp, Microsoft Research, Cambridge, UK
Alex Kipman, Xbox Incubation
Andrew Fitzgibbon, Microsoft Research, Cambridge, UK
Mark Finocchio, Xbox Incubation
Andrew Blake, Microsoft Research, Cambridge, UK
Mat Cook, Microsoft Research, Cambridge, UK
Richard Moore, ST-Ericsson

Communications of the ACM, Volume 56, Issue 1, January 2013, pp. 116–124. https://doi.org/10.1145/2398356.2398381

Published: 01 January 2013

Abstract

We propose a new method to quickly and accurately predict human pose (the 3D positions of body joints) from a single depth image, without depending on information from preceding frames. Our approach is strongly rooted in current object recognition strategies. By designing an intermediate representation in terms of body parts, the difficult pose estimation problem is transformed into a simpler per-pixel classification problem, for which efficient machine learning techniques exist. By using computer graphics to synthesize a very large dataset of training image pairs, one can train a classifier that estimates body part labels from test images, invariant to pose, body shape, clothing, and other irrelevant factors. Finally, we generate confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.

The system runs in under 5 ms on the Xbox 360. Our evaluation shows high accuracy on both synthetic and real test sets, and investigates the effect of several training parameters. We achieve state-of-the-art accuracy in our comparison with related work and demonstrate improved generalization over exact whole-skeleton nearest neighbor matching.
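
The final step described in the abstract, reprojecting the per-pixel classification result into 3D and finding local modes, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera with placeholder intrinsics, uses the classifier probability of a single body part directly as the pixel weight, and uses Gaussian mean shift as one common mode-finding technique. The function names `reproject` and `mean_shift_mode`, the bandwidth value, and the toy pixels are all hypothetical.

```python
import math


def reproject(u, v, depth_m, fx=525.0, fy=525.0, cx=320.0, cy=240.0):
    """Back-project pixel (u, v) at depth_m metres into camera space.

    The pinhole intrinsics are placeholder values, not taken from the paper.
    """
    return ((u - cx) * depth_m / fx, (v - cy) * depth_m / fy, depth_m)


def mean_shift_mode(points, weights, start, bandwidth=0.06,
                    max_iters=50, tol=1e-7):
    """Climb to a local mode of a weighted Gaussian kernel density.

    Returns (mode, confidence), where confidence is the summed kernel
    weight at the mode. The bandwidth (in metres) is an assumed value.
    """
    def kernel_weights(x):
        return [w * math.exp(-sum((a - b) ** 2 for a, b in zip(p, x))
                             / (2.0 * bandwidth ** 2))
                for p, w in zip(points, weights)]

    x = tuple(start)
    for _ in range(max_iters):
        k = kernel_weights(x)
        total = sum(k)
        if total == 0.0:
            break  # no support near x; keep the current estimate
        new_x = tuple(sum(ki * p[i] for ki, p in zip(k, points)) / total
                      for i in range(3))
        moved = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_x, x)))
        x = new_x
        if moved < tol:
            break
    return x, sum(kernel_weights(x))


# Toy input: (u, v, depth in metres, classifier probability for one body part).
pixels = [
    (320, 240, 2.0, 0.9),
    (321, 240, 2.0, 0.9),
    (320, 241, 2.0, 0.9),
    (321, 241, 2.0, 0.9),
    (600, 400, 3.0, 0.2),  # isolated low-confidence pixel
]
points = [reproject(u, v, d) for u, v, d, _ in pixels]
weights = [p for _, _, _, p in pixels]

# Start the climb from the most confident pixel.
start = points[max(range(len(weights)), key=weights.__getitem__)]
mode, confidence = mean_shift_mode(points, weights, start)
print(mode, confidence)
```

In this toy example the four neighbouring high-probability pixels dominate the density, so the proposal lands at their centre and the isolated low-probability pixel has essentially no influence. A fuller system would run this once per body part and could start from several seeds to collect multiple proposals.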

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Human-centered computing
  1. Human computer interaction (HCI)

Published in

Communications of the ACM Volume 56, Issue 1
January 2013
117 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/2398356

Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Popular
- Refereed