Abstract
In this paper, we present an approach for identification of actions within depth action videos. First, we process the video to get motion history images (MHIs) and static history images (SHIs) corresponding to an action video based on the use of 3D Motion Trail Model (3DMTM). We then characterize the action video by extracting the Gradient Local Auto-Correlations (GLAC) features from the SHIs and the MHIs. The two sets of features i.e., GLAC features from MHIs and GLAC features from SHIs are concatenated to obtain a representation vector for action. Finally, we perform the classification on all the action samples by using the l2-regularized Collaborative Representation Classifier (l2-CRC) to recognize different human actions in an effective way. We perform evaluation of the proposed method on three action datasets, MSR-Action3D, DHA and UTD-MHAD. Through experimental results, we observe that the proposed method performs superior to other approaches.
Similar content being viewed by others
References
Blank M, Gorelick L, Shechtman E, Irani M, Basri R (2005) Actions as space-time shapes. In Proc. 10th IEEE Int. Conf. Comput. Vis., Beijing, pp. 1395–1402
Bobick AF, Davis JW (2001) The recognition of human movement using temporal templates. IEEE Trans Pattern Anal Mach Intell 23(3):257–267
Bulbul MF, Jiang Y, Ma J (2015) DMMs-based multiple features fusion for human action recognition. International Journal of Multimedia Data Engineering and Management (IJMDEM) 6(4):23–39
Chaaraoui AA, Climent-Pérez P, Flórez-Revuelta F (2012) A Review on Vision Techniques Applied to Human Behaviour Analysis for Ambient-Assisted Living. International Journal of Expert Systems with Applications 39(12):10873–10888
Chaaraoui AA, Padilla-López JR, Climent-Pérez P, Flórez-Revuelta F (2014) Evolutionary joint selection to improve human action recognition with rgb-d devices. Expert Syst Appl 41(3):786–794
Chen C, Fowler JE (2012) Single-image super-resolution using multi hypothesis prediction. In: Proceedings of the 46th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, 608–612
Chen, C., Hou, Z., Zhang, B., Jiang, J., & Yang, Y. (2015). Gradient local auto-correlations and extreme learning machine for depth-based activity recognition. In International Symposium on Visual Computing (pp. 613-623). Springer International Publishing
Chen C, Jafari R, Kehtarnavaz N (2015) Improving human action recognition using fusion of depth camera and inertial sensors. IEEE Transactions on Human-Machine Systems 45(1):51–61
Chen C, Jafari R, Kehtarnavaz N (2015) UTD-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor. In: Proc. IEEE Int. Conf. Image Process., pp. 168–172
Chen C, Jafari R, Kehtarnavaz N (2015) Action recognition from depth sequences using depth motion maps-based local binary patterns. In: WACV, pp. 1092–1099
Chen C, Kehtarnavaz N, Jafari R (2014) A medication adherence monitoring system for pill bottles based on a wearable inertial sensor. In: EMBC, pp. 4983–4986
Chen C, Li W, Tramel EW, Fowler JE (2014) Reconstruction of hyperspectral imagery from random projections using multi hypothesis prediction. IEEE Trans Geosci Remote Sens 52(1):365–374
Chen C, Liu K, Jafari R, Kehtarnavaz N (2014) Home-based senior fitness test measurement system using collaborative inertial and depth sensors. In: EMBC, pp. 4135–4138
Chen C, Liu K, Kehtarnavaz N (2013) Real-time human action recognition based on depth motion maps. J Real-Time Image Process:1–9. https://doi.org/10.1007/s11554-013-0370-1
Chen C, Liu M, Zhang B, Han J, Jiang J, Liu H (2016) 3D action recognition using multi-temporal depth motion maps and Fisher vector. In: Proc. Int. Joint Conf. Artif. Intell., pp. 3331–3337
Chen C, Tramel W, Fowler JE (2011) Compressed sensing recovery of images and video using multi hypothesis predictions. In: Proceedings of the 45th Asilomar Conference on signals, Systems, and Computers, Pacific Grove, 1193–1198
Chen L, Wei H, Ferryman J (2013) A survey of human motion analysis using depth imagery. Pattern Recognition Letters, 1995–2006
Chen C, Zhang B, Hou Z, Jiang J, Liu M, Yang Y (2017) Action recognition from depth sequences using weighted fusion of 2D and 3D auto-correlation of gradients features. Multimed Tools Appl 76(3):4651–4669
Chen E, Zhang S, Liang C (2017) Action Recognition Using Motion History Image and Static History Image-based Local Binary Patterns. International Journal of Multimedia and Ubiquitous Engineering 12(1):203–214
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR, pp. 886–893
Elmadany NED, He Y, Guan L (2018) Information Fusion for Human Action Recognition via Biset/Multiset Globality Locality Preserving Canonical Correlation Analysis. IEEE Trans Image Process 27(11):5275–5287
Evangelidis G, Singh G, Horaud R (2014) Skeletal quads: human action recognition using joint quadruples. In ICPR, pp. 4513–4518
Farhad M, Jiang Y, Ma J (2015) Human Action Recognition Based On DMMs, HOGs and Contourlet Transform. In: Proceedings of IEEE international conference on multimedia big data, Beijing, China, 389–394
Farhad M, Jiang Y, Ma J (2015) Real-time human action recognition using DMMs-Based LBP and EOH feautres. In Proceedings of the International Conference on Intelligent Computing. Fuzhou
Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139
Gao Z, Zhang H, Xu GP, Xue YB (Mar. 2015) Multi-perspective and multi-modality joint representation and recognition model for 3D action recognition. Neuro-computing 151:554–564
Golub G, Hansen PC, O’Leary D (1999) Tikhonov regularization and total least squares. SIAM Journal on Matrix Analysis and Applications 21(1):185–194
Gorelick L, Blank M, Irani ESM, Basri R (2007) Actions as space-time shapes. TPMAI 29(12):2247–2253
Hossein Rahmani Q, Du H, Mahmood A, Mian A (2015) Discriminative human action classification using locality-constrained linear coding. PRL
Kobayashi T, Otsu N (2008) Image feature extraction using gradient local auto-correlations. In: Forsyth D, Torr P, Zisserman A (eds) ECCV 2008, Part I. LNCS, vol 5302. Springer, Heidelberg, pp 346–358
Lei Q, Zhang H, Xin M, Cai Y (2018) A hierarchical representation for human action recognition in realistic scenes. Multimed Tools Appl 77(9):11403–11423
Li B, He M, Dai Y, Cheng X, Chen Y (2018) 3D skeleton based action recognition by video-domain translation-scale invariant mapping and multi-scale dilated CNN. Multimed Tools Appl 77(17):22901–22921
Li W, Zhang Z, Liu Z (2010) Action recognition based on a bag of 3D points. In: CVPRW, pp. 9–14
Liang B, Zheng L (2013) Three dimensional motion trail model for gesture recognition. In: Computer Vision Workshops (ICCVW), 2013 IEEE International Conference on, pp. 684–691
Lin YC, Hu MC, Cheng WH, Hsieh YH, Chen HM (2012) Human action recognition and retrieval using sole depth information. In: Proc. ACM MM, pp. 1053–1056
Liu H, Tian L, Liu M, Tang H (2015) SDM-BSM: A fusing depth scheme for human action recognition. In: Proc. ICIP, pp. 4674–4678
Luo J, Wang W, Qi H (2014) Spatio-Temporal Feature Extraction and Representation for RGB-D Human Action Recognition. Pattern Recognition Letters, 139–148
Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Oreifej O, Liu Z (2013) HON4D: Histogram of oriented 4D normals for activity recognition from depth sequences. In: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 716–723
Poppe R (2010) A Survey on Vision-Based Human Action Recognition. J Image Vision Comput 28(6):976–990
Rahmani H, Mahmood A, Huynh DQ, Mian A (2014) Real-time action recognition using histograms of depth gradients and random decision forests. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, (pp. 626–633). RI
Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, Blake A (2011) Real-time human pose recognition in parts from single depth images. In: CVPR, pp. 1297–1304
Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R et al (2013) Real-Time Human Pose Recognition in Parts from Single Depth Images. Commun ACM 56(1):116–124
Theodoridis T, Agapitos A, Hu H, Lucas SM (2008) Ubiquitous robotics in physical human action recognition: a comparison between dynamic ANNs and GP. In: ICRA, pp. 3064–3069
Vemulapalli R, Arrate F, Chellappa R (2014) Human action recognition by representing 3D skeletons as points in a lie group. In: CVPR, pp. 588–595
Vieira AW, Nascimento ER, Oliveira GL, Liu Z, Campos MF (2012) STOP: space-time occupancy patterns for 3D action recognition from depth map sequences. In Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, pp. 252–259
Vieira AW, Nascimento ER, Oliveira GL, Liu Z, Campos MF (2014) On the improvement of human action recognition from depth map sequences using space-time occupancy patterns. Pattern Recogn Lett 36:221–227
Wang J, Liu Z, Chorowski J, Chen Z, Wu Y (2012) Robust 3D action recognition with random occupancy patterns. In: Proc. Eur. Conf. Comput. Vis., pp. 872–885
Wang J, Liu Z, Wu Y, Yuan J (2012b) Mining actionlet ensemble for action recognition with depth cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (pp. 1290–1297). Providence
Wang J, Liu Z, Wu Y, Yuan J (2014) Learning actionlet ensemble for 3D human action recognition. TPAMI 36(5):914–927
Wang H, Schmid C (2013) Action recognition with improved trajectories. In Proceedings of the IEEE International Conference on Computer Vision, (pp. 3551–3558). Sydney
Wang L, Zhang B, Yang W (2015) Boosting-like deep convolutional network for pedestrian detection. In: Proc. Chin. Conf. Biometric Recognit., pp. 581–588
Wiliem A, Madasu V, Boles W, Yarlagadda P (2010) An update-describe approach for human action recognition in surveillance video. In: Proceedings of the International Conference on Digital Image Computing: Techniques and Applications , (pp. 270–275). Sydney
Wright J, Ma Y, Mairal J, Sapiro G, Huang T, Yan S (2010) Sparse representation for computer vision and pattern recognition. Proc IEEE 98(6):1031–1044
Xia L, Aggarwal JK (2013) Spatio-temporal depth cuboid similarity feature for action recognition using depth camera. In: CVPR, pp. 2834–2841
Xia L, Chen C-C, Aggarwal J (2012) View invariant human action recognition using histograms of 3d joints. In: CVPR Workshops, pp. 20–27
Yang X, Tian Y (2012) eigenjoints-based action recognition using naïve-bayes-nearest-neighbor. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, (pp. 14–19). Province
Yang X, Tian Y (2014) Super normal vector for action recognition using depth sequences. In: CVPR, pp. 804–811
Yang R, Yang R (2014) DMM-pyramid based deep architectures for action recognition with depth cameras. In: Proc. Asian Conf. Comput.Vis., pp. 37–49
Yang X, Zhang C, Tian Y (2012) Recognizing actions using depth motion maps-based histograms of oriented gradients. In: ACM Multimedia, pp. 1057–1060
Yu Kong B, Satarboroujeni B, Fu Y (2015) Hierarchical 3D kernel descriptors for action recognition using depth sequences. In FG, pages 1–6
Zanfir M, Leordeanu M, Sminchisescu C (2013) The moving pose: An efficient 3d kinematics descriptor for low-latency action recognition and detection. In ICCV, pp. 2752–2759
Zeng S, Lu G, Yan P (2018) Enhancing human action recognition via structural average curves analysis. SIViP 12(8):1551–1558
Zhang Y-Z, Pan C, Sun J, Tang C (2018) Multiple sclerosis identification by convolutional neural network with dropout and parametric ReLU. J Comput Sci 28:1–10
Zhang B, Yang Y, Chen C, Yang L, Han J, Shao L (2017) Action Recognition Using 3D Histograms of Texture and A Multi-Class Boosting Classifier. IEEE Trans Image Process 26(10)
Zhang Y-D, Zhang Y, Hou X-X, Chen H, Wang S-H (2018) Seven-layer deep neural network based on sparse autoencoder for voxelwise detection of cerebral microbleed. Multimed Tools Appl 77(9):10521–10538
Zhu H-M, Pun C-M (2013) Human Action Recognition with Skeletal Information from Depth Camera. In: Proceedings of the IEEE International Conference Information and Automation, (pp. 1082–1085). Yinchuan
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Bulbul, M.F., Islam, S. & Ali, H. 3D human action analysis and recognition through GLAC descriptor on 2D motion and static posture images. Multimed Tools Appl 78, 21085–21111 (2019). https://doi.org/10.1007/s11042-019-7365-2
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-019-7365-2