
2016 | Original Paper | Book Chapter

Semi-supervised Learning for Human Pose Recognition with RGB-D Light-Model

Authors: Xinbo Wang, Guoshan Zhang, Dahai Yu, Dan Liu

Published in: Advances in Multimedia Information Processing - PCM 2016

Publisher: Springer International Publishing


Abstract

This work targets human pose recognition from RGB-D videos. Recent RGB-D based methods can typically be categorized as either map-based or skeleton-based approaches. This paper proposes a semi-supervised learning method for evaluating human posture using RGB-D data and a light-model. The light-model is generated with a dynamic-fusion strategy to represent the depth sequence. As a result, the light-model carries richer information than a depth image, and a CNN classifier trained on labeled light-model data is constructed to recognize human pose. Soft correlation and hard correlation are used to adjust the CNN output for unlabeled data. The paper also constructs a posture dataset consisting of RGB images and light-models. Experimental results show that the method is more accurate than the state of the art, with competitive efficiency. The study suggests that features extracted from 3D models are reliable for human pose recognition, especially for sitting postures.
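The abstract does not define how soft and hard correlation adjust the CNN output, so the following is only a generic illustration of how a classifier's predictions on unlabeled data can be turned into training targets. It is a minimal sketch of standard pseudo-labeling, not the authors' method. The function names, the confidence threshold, the sharpening temperature, and the example logits are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_pseudo_label(probs, threshold=0.9):
    """Return the most likely class index if it is confident enough, else None.

    The threshold is an assumed value, not taken from the paper.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None

def soft_pseudo_label(probs, temperature=0.5):
    """Sharpen the distribution but keep it as a soft (probabilistic) target.

    The temperature is an assumed value, not taken from the paper.
    """
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

# Hypothetical CNN outputs (logits) for two unlabeled samples, three posture classes.
unlabeled_logits = [[4.0, 0.5, 0.1], [1.0, 0.9, 0.8]]
for logits in unlabeled_logits:
    probs = softmax(logits)
    hard = hard_pseudo_label(probs)
    soft = [round(p, 3) for p in soft_pseudo_label(probs)]
    print("hard:", hard, "soft:", soft)
```

In this sketch, a confident prediction receives a hard label and can be treated like labeled data, while an ambiguous prediction is either skipped or used only as a soft target.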


Metadata
Title
Semi-supervised Learning for Human Pose Recognition with RGB-D Light-Model
Authors
Xinbo Wang
Guoshan Zhang
Dahai Yu
Dan Liu
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-48896-7_72
