Published in: Wireless Personal Communications 2/2023

12.03.2023

Multi-view Multi-modal Approach Based on 5S-CNN and BiLSTM Using Skeleton, Depth and RGB Data for Human Activity Recognition

Authors: Rahul Kumar, Shailender Kumar


Abstract

Recognition of human activity is a challenging problem, especially in the presence of multiple actions and multiple viewing scenarios. This paper therefore proposes a multi-view, multi-modal approach to human action recognition (HAR). First, a motion representation of each data stream, namely depth motion maps, motion history images, and skeleton images, is created from the depth, RGB, and skeleton data of an RGB-D sensor. Each motion representation is then trained separately using a 5-stack convolutional neural network (5S-CNN). To improve the recognition rate and accuracy, the skeleton representation is trained with a hybrid 5S-CNN and Bi-LSTM classifier. Decision-level fusion is then applied to combine the score values of the three streams, and the human activity is identified from the fused score. To evaluate the efficiency of the proposed 5S-CNN with Bi-LSTM method, we conduct experiments on the UTD-MHAD dataset. The results show that the proposed HAR method outperforms existing approaches.
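The pipeline described above (a per-modality motion representation, one 5-stack CNN per stream, a CNN + Bi-LSTM hybrid for the skeleton stream, and decision-level score fusion) can be illustrated with a minimal PyTorch sketch. The layer widths, kernel sizes, frame count, and the simple score-averaging fusion rule below are assumptions for illustration only; the abstract does not give the exact 5S-CNN or Bi-LSTM configuration used in the paper.

# Hedged sketch of the multi-stream pipeline; sizes and fusion rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 27  # UTD-MHAD contains 27 action classes

class FiveStackCNN(nn.Module):
    """Five stacked convolution blocks followed by a classification head."""
    def __init__(self, in_channels, num_classes=NUM_CLASSES):
        super().__init__()
        chans = [in_channels, 32, 64, 128, 128, 256]
        blocks = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            blocks += [nn.Conv2d(c_in, c_out, 3, padding=1),
                       nn.BatchNorm2d(c_out),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
        self.features = nn.Sequential(*blocks)
        self.head = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = F.adaptive_avg_pool2d(x, 1).flatten(1)
        return self.head(x)                    # per-class scores (logits)

class SkeletonCNNBiLSTM(nn.Module):
    """Per-frame CNN features, a BiLSTM over time, then a classification head."""
    def __init__(self, in_channels=3, hidden=128, num_classes=NUM_CLASSES):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1))
        self.bilstm = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                       # x: (batch, time, C, H, W)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).flatten(1).view(b, t, -1)
        out, _ = self.bilstm(feats)
        return self.head(out[:, -1])            # score from the last time step

def decision_level_fusion(score_list):
    """Average per-stream softmax scores and pick the top class."""
    probs = torch.stack([F.softmax(s, dim=1) for s in score_list]).mean(0)
    return probs.argmax(dim=1)

# Example forward pass with dummy inputs (shapes are illustrative only).
dmm_net, mhi_net, skel_net = FiveStackCNN(1), FiveStackCNN(1), SkeletonCNNBiLSTM()
dmm = torch.randn(2, 1, 64, 64)          # depth motion maps
mhi = torch.randn(2, 1, 64, 64)          # motion history images
skel = torch.randn(2, 20, 3, 32, 32)     # skeleton images over 20 frames
pred = decision_level_fusion([dmm_net(dmm), mhi_net(mhi), skel_net(skel)])
print(pred)

In this sketch each stream produces its own class scores, and only the scores are combined, which matches the decision-level fusion described in the abstract; equal weighting of the three streams is an assumed choice.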


Metadata
Title
Multi-view Multi-modal Approach Based on 5S-CNN and BiLSTM Using Skeleton, Depth and RGB Data for Human Activity Recognition
Authors
Rahul Kumar
Shailender Kumar
Publication date
12.03.2023
Publisher
Springer US
Published in
Wireless Personal Communications / Issue 2/2023
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI
https://doi.org/10.1007/s11277-023-10324-4
