Published in: International Journal of Computer Vision 12/2021

15.10.2021

DeMoCap: Low-Cost Marker-Based Motion Capture

Authors: Anargyros Chatzitofis, Dimitrios Zarpalas, Petros Daras, Stefanos Kollias



Abstract

Optical marker-based motion capture (MoCap) remains the predominant way to acquire high-fidelity articulated body motion. We introduce DeMoCap, the first data-driven approach to end-to-end marker-based MoCap that uses only a sparse setup of spatio-temporally aligned, consumer-grade infrared-depth cameras. While trading off some of the features of high-end systems, our approach is the only robust option for marker-based MoCap at a far lower cost. We introduce an end-to-end differentiable markers-to-pose model that addresses a set of challenges such as under-constrained position estimates, noisy input data, and spatial-configuration invariance. The model simultaneously handles depth and marker-detection noise, labels and localizes the markers, and estimates the 3D pose through a novel spatial 3D coordinate regression technique under a multi-view rendering and supervision scheme. DeMoCap is trained on a purpose-built dataset captured with four spatio-temporally aligned low-cost Intel RealSense D415 sensors, used as input, and a professional 24-camera MXT40S MoCap system, used as ground truth.
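The "spatial 3D coordinate regression" the abstract mentions is, in the broader pose-estimation literature, typically realized as a differentiable soft-argmax: a softmax turns a predicted heatmap into a probability distribution, and the regressed coordinate is the expectation of the pixel grid under that distribution, so gradients flow through the coordinate. The sketch below is a minimal 2D illustration of that general technique, not the paper's exact model; the `beta` temperature and the toy heatmap are illustrative assumptions.

```python
import numpy as np

def soft_argmax(heatmap: np.ndarray, beta: float = 100.0) -> tuple[float, float]:
    """Differentiable coordinate regression over a 2D heatmap.

    Applies a temperature-scaled softmax to the heatmap, then returns the
    expected (x, y) coordinate under the resulting distribution.
    """
    h, w = heatmap.shape
    logits = heatmap.flatten() * beta
    logits -= logits.max()                 # numerical stability
    p = np.exp(logits)
    p /= p.sum()                           # softmax probabilities
    ys, xs = np.mgrid[0:h, 0:w]            # per-pixel coordinate grids
    x = float((p * xs.flatten()).sum())    # expectation of x
    y = float((p * ys.flatten()).sum())    # expectation of y
    return x, y

# A single confident detection at column 5, row 3:
hm = np.zeros((8, 8))
hm[3, 5] = 1.0
x, y = soft_argmax(hm)   # ~ (5.0, 3.0)
```

Unlike a hard argmax, this estimate is sub-pixel and differentiable, which is what makes end-to-end supervision of marker positions possible; extending it to 3D just adds a depth axis to the grid and the expectation.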

Metadata
Title: DeMoCap: Low-Cost Marker-Based Motion Capture
Authors: Anargyros Chatzitofis, Dimitrios Zarpalas, Petros Daras, Stefanos Kollias
Publication date: 15.10.2021
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 12/2021
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-021-01526-z
