Human Action Recognition using Pre-trained Convolutional Neural Networks

Authors:
Cherry Aly, Multimedia University, Malaysia
Fazly Salleh Abas, Multimedia University, Malaysia
Hock Ann Goh, Multimedia University, Malaysia

VSIP '20: Proceedings of the 2020 2nd International Conference on Video, Signal and Image Processing, December 2020, Pages 30–34
https://doi.org/10.1145/3442705.3442710

Published: 21 March 2021

ABSTRACT

Human action recognition remains one of the challenging problems in artificial intelligence. Deep learning models have become a focus of action recognition research because they can outperform traditional machine learning approaches. The Convolutional Neural Network (CNN) is one of the architectures most commonly used in action recognition work. Several CNN models exist, but no study has evaluated which of them performs best at understanding human actions. In this paper, we therefore compare the performance of two pre-trained deep CNN models in classifying human actions. The two models, GoogleNet and AlexNet, are used with fine-tuned parameters, and a Long Short-Term Memory (LSTM) network predicts the video labels. The paper's main contribution is a performance analysis of the two fine-tuned pre-trained deep CNN models, compared against the results of other recently proposed human action recognition methods on the KTH, Weizmann, UCF11 (YouTube actions), and UCF-Sports datasets.
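The abstract describes per-frame CNN features being passed to an LSTM that predicts a label for the whole video. The following is a minimal sketch of that second stage only, not the authors' implementation. It assumes the frame features have already been extracted by some pre-trained CNN; here they are replaced by random vectors. The class name `TinyLSTMClassifier`, the feature and hidden sizes, the six-class output, and the random untrained weights are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """Hypothetical single-layer LSTM with a softmax output (untrained)."""

    GATES = ("i", "f", "o", "g")  # input, forget, output, candidate

    def __init__(self, feat_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to [frame_feature ; previous_hidden].
        self.W = {k: rand_matrix(hidden_dim, feat_dim + hidden_dim, rng) for k in self.GATES}
        self.b = {k: [0.0] * hidden_dim for k in self.GATES}
        self.W_out = rand_matrix(num_classes, hidden_dim, rng)

    def step(self, x, h, c):
        z = x + h  # list concatenation of the frame feature and hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in self.GATES}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, frame_features):
        # Run the LSTM over the frames, then classify from the last hidden state.
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in frame_features:
            h, c = self.step(list(x), h, c)
        logits = matvec(self.W_out, h)
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Stand-in for CNN features: 5 frames, each an 8-dimensional vector (assumed sizes).
rng = random.Random(1)
video = [[rng.random() for _ in range(8)] for _ in range(5)]

model = TinyLSTMClassifier(feat_dim=8, hidden_dim=4, num_classes=6)
probs = model.predict_proba(video)
label = max(range(len(probs)), key=lambda k: probs[k])
print("predicted class index:", label)
```

Because the weights are random, the predicted index is meaningless; in a real pipeline the LSTM would be trained on labelled videos, and the feature vectors would come from a fine-tuned CNN rather than a random generator.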

Published in

VSIP '20: Proceedings of the 2020 2nd International Conference on Video, Signal and Image Processing, December 2020, 108 pages
ISBN: 9781450388931
DOI: 10.1145/3442705

Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Action recognition
AlexNet
Convolutional neural network (CNN)
Deep learning
GoogleNet
Long-Short Term Memory (LSTM)
Qualifiers
- research-article
- Research
- Refereed limited