ABSTRACT
Human action recognition is one of the challenges in the field of artificial intelligence. Deep learning models have become a research focus in action recognition applications because they can outperform traditional machine learning approaches. The Convolutional Neural Network (CNN) is one of the architectures most commonly used in action recognition work. Several CNN models exist, but no study has evaluated which model performs best at understanding human actions. Thus, in this paper, we compare the performance of two pre-trained deep CNN models in classifying human actions to identify different behaviours. The two models compared are GoogLeNet and AlexNet, both with fine-tuned parameters; in addition, Long Short-Term Memory (LSTM) is used to predict the labels of the videos. The paper's main contribution is a performance analysis of the two fine-tuned, pre-trained deep CNN models, compared against the results of other recently proposed human action recognition methods on the KTH, Weizmann, UCF11 (YouTube Actions) and UCF-Sports datasets.
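As an illustration only, the pipeline summarised above (per-frame CNN features passed to an LSTM that predicts one label per video) might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the class `TinyLSTMClassifier`, the feature and hidden sizes, the random untrained weights, and the random vectors standing in for GoogLeNet or AlexNet activations are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """Hypothetical LSTM head over per-frame CNN features (random, untrained weights)."""

    def __init__(self, feat_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [h_prev, x_t].
        self.w = {k: rand_matrix(hidden_dim, hidden_dim + feat_dim, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}
        self.w_out = rand_matrix(num_classes, hidden_dim, rng)

    def step(self, x, h, c):
        z = h + x  # list concatenation: [h_prev, x_t]
        pre = {k: [a + b for a, b in zip(matvec(self.w[k], z), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, frame_features):
        # Run the LSTM over the frames, then classify from the last hidden state.
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in frame_features:
            h, c = self.step(x, h, c)
        logits = matvec(self.w_out, h)
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]  # numerically stable softmax
        total = sum(exps)
        return [e / total for e in exps]


# Stand-in for CNN activations: 5 frames, 8 features each (assumed sizes).
rng = random.Random(1)
video = [[rng.random() for _ in range(8)] for _ in range(5)]

model = TinyLSTMClassifier(feat_dim=8, hidden_dim=4, num_classes=6)
probs = model.predict_proba(video)
label = max(range(len(probs)), key=probs.__getitem__)  # index of the predicted class
```

In a real system the per-frame vectors would come from a layer of the fine-tuned CNN and the LSTM weights would be learned from labelled videos; the sketch only shows how a variable-length frame sequence is reduced to a single class distribution.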