
Published in: Neural Processing Letters

30 July 2020

A Review of Dynamic Maps for 3D Human Motion Recognition Using ConvNets and Its Improvement

Authors: Zhimin Gao, Pichao Wang, Huogen Wang, Mingliang Xu, Wanqing Li

Published in: Neural Processing Letters | Issue 2/2020

Abstract

RGB-D based action recognition is attracting growing attention in both the research and industrial communities. Because training data are scarce, methods based on pre-training are popular in this field. This paper reviews the concept of dynamic maps for RGB-D based human motion recognition using models pretrained in the image domain. Dynamic maps recursively encode the spatial, temporal and structural information contained in a video sequence into dynamic motion images. They make it possible to apply Convolutional Neural Networks, and models pretrained on ImageNet, to 3D human motion recognition. This simple, compact and effective representation achieves state-of-the-art results on various gesture, action and activity recognition datasets. Based on the review of previous methods that apply this concept to different modalities (depth, skeleton or RGB-D data), a novel encoding scheme is developed and presented in this paper. The improved method generates effective flow-guided dynamic maps, which can select the high-motion window and distinguish the order among frames with small motion. The improved flow-guided dynamic maps achieve state-of-the-art results on the large ChaLearn LAP IsoGD and NTU RGB+D datasets.
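To make the idea of collapsing a video into a single motion image concrete, the sketch below uses approximate rank pooling, one well-known way to build a dynamic image as a weighted sum of frames. It is only an illustration under stated assumptions: it is not the flow-guided encoding proposed in the paper, and the function names `rank_pooling_coefficients` and `dynamic_image` are hypothetical. Frames are assumed to be a NumPy array of shape (T, H, W) or (T, H, W, C).

```python
import numpy as np


def rank_pooling_coefficients(num_frames):
    """Weights alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1}) for t = 1..T.

    H_t is the t-th harmonic number, with H_0 = 0.
    """
    T = num_frames
    # harmonic[t] holds H_t; index 0 is H_0 = 0.
    harmonic = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, T + 1))))
    t = np.arange(1, T + 1)
    return 2.0 * (T - t + 1) - (T + 1) * (harmonic[T] - harmonic[t - 1])


def dynamic_image(frames):
    """Collapse a (T, H, W[, C]) sequence into one 8-bit image.

    Assumption: min-max rescaling to [0, 255] is used here only so the
    result can be fed to an image-domain ConvNet; other scalings are possible.
    """
    frames = np.asarray(frames, dtype=np.float64)
    alpha = rank_pooling_coefficients(frames.shape[0])
    # Weighted sum over the time axis.
    weighted = np.tensordot(alpha, frames, axes=(0, 0))
    lo, hi = weighted.min(), weighted.max()
    if hi == lo:
        # No temporal variation: return a blank image.
        return np.zeros(weighted.shape, dtype=np.uint8)
    return np.round(255.0 * (weighted - lo) / (hi - lo)).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((16, 32, 32))  # synthetic 16-frame depth-like clip
    image = dynamic_image(video)
    print(image.shape, image.dtype)
```

Because the weights sum to zero, static content cancels out and the resulting image mainly reflects how pixels change over time, with later frames weighted more heavily. This is what lets an ImageNet-pretrained network consume the sequence as a single image.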

Title: A Review of Dynamic Maps for 3D Human Motion Recognition Using ConvNets and Its Improvement
Authors: Zhimin Gao, Pichao Wang, Huogen Wang, Mingliang Xu, Wanqing Li
Publication date: 30 July 2020
Publisher: Springer US
Published in: Neural Processing Letters / Issue 2/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-020-10320-w
