Accurate realtime full-body motion capture using a single depth camera

Published: 01 November 2012

Abstract

We present a fast, automatic method for accurately capturing full-body motion data using a single depth camera. At the core of our system lies a realtime registration process that accurately reconstructs 3D human poses from single monocular depth images, even in the case of significant occlusions. The idea is to formulate the registration problem in a Maximum A Posteriori (MAP) framework and iteratively register a 3D articulated human body model with monocular depth cues via linear system solvers. We integrate depth data, silhouette information, full-body geometry, temporal pose priors, and occlusion reasoning into a unified MAP estimation framework. Our 3D tracking process, however, requires manual initialization and recovery from failures. We address this challenge by combining 3D tracking with 3D pose detection. This combination not only automates the whole process but also significantly improves the robustness and accuracy of the system. Our whole algorithm is highly parallel and is therefore easily implemented on a GPU. We demonstrate the power of our approach by capturing a wide range of human movements in real time and achieve state-of-the-art accuracy in our comparison against alternative systems such as Kinect [2012].
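
The registration idea in the abstract (a MAP objective minimized by repeatedly solving linear systems) can be illustrated on a toy problem. The Python sketch below is not the system described above. It assumes a planar two-link chain instead of a full 3D body model, directly observed joint positions instead of depth and silhouette cues, Gaussian noise, and a quadratic temporal prior that pulls the pose toward the previous frame. The names `forward`, `jacobian`, `map_register`, and `prior_weight` are hypothetical and chosen only for this illustration.

```python
import math


def forward(q, lengths=(1.0, 1.0)):
    """Positions of the elbow and tip of a planar two-link chain: [ex, ey, tx, ty]."""
    l1, l2 = lengths
    a, b = q[0], q[0] + q[1]
    ex, ey = l1 * math.cos(a), l1 * math.sin(a)
    return [ex, ey, ex + l2 * math.cos(b), ey + l2 * math.sin(b)]


def jacobian(q, lengths=(1.0, 1.0)):
    """4x2 matrix of partial derivatives of forward(q) with respect to the two angles."""
    l1, l2 = lengths
    a, b = q[0], q[0] + q[1]
    return [
        [-l1 * math.sin(a), 0.0],
        [l1 * math.cos(a), 0.0],
        [-l1 * math.sin(a) - l2 * math.sin(b), -l2 * math.sin(b)],
        [l1 * math.cos(a) + l2 * math.cos(b), l2 * math.cos(b)],
    ]


def map_register(observed, q_prev, prior_weight=0.1, iters=20):
    """Minimize ||forward(q) - observed||^2 + prior_weight * ||q - q_prev||^2.

    Under the Gaussian assumptions stated above, this is the negative log
    posterior up to constants. Each Gauss-Newton iteration linearizes the
    data term and solves one 2x2 linear system for the pose update.
    """
    q = list(q_prev)
    for _ in range(iters):
        r = [f - o for f, o in zip(forward(q), observed)]  # data residuals
        J = jacobian(q)
        # Normal equations: (J^T J + w I) dq = -(J^T r + w (q - q_prev))
        A = [[sum(J[k][i] * J[k][j] for k in range(4))
              + (prior_weight if i == j else 0.0)
              for j in range(2)] for i in range(2)]
        g = [-(sum(J[k][i] * r[k] for k in range(4))
               + prior_weight * (q[i] - q_prev[i])) for i in range(2)]
        # Solve the 2x2 system with Cramer's rule (A is positive definite for w > 0).
        det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
        dq = [(g[0] * A[1][1] - A[0][1] * g[1]) / det,
              (A[0][0] * g[1] - A[1][0] * g[0]) / det]
        q = [q[0] + dq[0], q[1] + dq[1]]
        if max(abs(d) for d in dq) < 1e-10:
            break
    return q


if __name__ == "__main__":
    true_pose = [0.5, 0.4]          # pose used to synthesize the observation
    observation = forward(true_pose)
    previous_pose = [0.4, 0.3]      # estimate from the previous frame
    print(map_register(observation, previous_pose, prior_weight=1e-3))
```

With a small `prior_weight` the estimate follows the observation closely; with a very large one it stays near the previous pose. That trade-off is the role a temporal pose prior plays in this simplified setting.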

References

  1. Amit, Y., and Geman, D. 1997. Shape quantization and recognition with randomized trees. Neural Computation. 9(7):1545–1588.
  2. Baak, A., Müller, M., Bharaj, G., Seidel, H.-P., and Theobalt, C. 2011. A data-driven approach for real-time full body pose reconstruction from a depth camera. In IEEE 13th International Conference on Computer Vision (ICCV), 1092–1099.
  3. Baker, S., and Matthews, I. 2004. Lucas-Kanade 20 years on: A unifying framework. International Journal of Computer Vision. 56(3):221–255.
  4. Bregler, C., Malik, J., and Pullen, K. 2004. Twist based acquisition and tracking of animal and human kinematics. International Journal of Computer Vision. 56(3):179–194.
  5. Chai, J., and Hodgins, J. 2005. Performance animation from low-dimensional control signals. ACM Transactions on Graphics. 24(3):686–696.
  6. Ganapathi, V., Plagemann, C., Koller, D., and Thrun, S. 2010. Real time motion capture using a single time-of-flight camera. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 755–762.
  7. Girshick, R., Shotton, J., Kohli, P., Criminisi, A., and Fitzgibbon, A. 2011. Efficient regression of general-activity human poses from depth images. In Proceedings of IEEE 13th International Conference on Computer Vision, 415–422.
  8. Grest, D., Kruger, V., and Koch, R. 2007. Single view motion tracking by depth and silhouette information. In Proceedings of the 15th Scandinavian Conference on Image Analysis (SCIA), 719–729.
  9. Kinect, 2012. Microsoft Kinect for Xbox 360.
  10. Knoop, S., Vacek, S., and Dillmann, R. 2006. Sensor fusion for 3D human body tracking with an articulated 3D body model. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 1686–1691.
  11. Lepetit, V., and Fua, P. 2006. Keypoint recognition using randomized trees. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(9):1465–1479.
  12. Liu, H., Wei, X., Chai, J., Ha, I., and Rhee, T. 2011. Realtime human motion control with a small number of inertial sensors. In Symposium on Interactive 3D Graphics and Games, ACM, I3D '11, 133–140.
  13. Microsoft Kinect API for Windows, 2012. http://www.microsoft.com/en-us/kinectforwindows/.
  14. Moeslund, T. B., Hilton, A., and Kruger, V. 2006. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding. 104:90–126.
  15. Plagemann, C., Ganapathi, V., Koller, D., and Thrun, S. 2010. Realtime identification and localization of body parts from depth images. In Proceedings of International Conference on Robotics and Automation (ICRA 2010), 3108–3113.
  16. Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., and Blake, A. 2011. Real-time human pose recognition in parts from a single depth image. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1297–1304.
  17. Siddiqui, M., and Medioni, G. 2010. Human pose estimation from a single view point, real-time range sensor. In CVCG at CVPR.
  18. Slyper, R., and Hodgins, J. 2008. Action capture with accelerometers. In ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 193–199.
  19. Tautges, J., Zinke, A., Krüger, B., Baumann, J., Weber, A., Helten, T., Müller, M., Seidel, H.-P., and Eberhardt, B. 2011. Motion reconstruction using sparse accelerometer data. ACM Transactions on Graphics. 30(3):18:1–18:12.
  20. Vicon Systems, 2011. http://www.vicon.com.
  21. Ye, M., Wang, X., Yang, R., Ren, L., and Pollefeys, M. 2011. Accurate 3D pose estimation from a single depth image. In Proceedings of IEEE 13th International Conference on Computer Vision, 731–738.

Published in

ACM Transactions on Graphics, Volume 31, Issue 6
November 2012
794 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2366145

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
