DOI: 10.1145/3442705.3442710
research-article

Human Action Recognition using Pre-trained Convolutional Neural Networks

Published: 21 March 2021

ABSTRACT

Recognition of human actions is one of the challenges in the field of artificial intelligence. Deep learning models have become a focus of research in action recognition because they can outperform traditional machine learning approaches. The Convolutional Neural Network (CNN) is one of the architectures most commonly used in action recognition work. Several CNN models exist, but no study has evaluated which of them performs best at understanding human actions. In this paper, we therefore compare the performance of two pre-trained deep CNN models in classifying human actions to identify different behaviours. The two models compared are GoogleNet and AlexNet, both with fine-tuned parameters; in addition, Long Short-Term Memory (LSTM) is used to predict the videos' labels. The paper's main contribution is a performance analysis of the two fine-tuned, pre-trained deep CNN models, compared against the results of other recently proposed human action recognition methods on the KTH, Weizmann, UCF11 (YouTube actions) and UCF-Sports datasets.
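
The abstract does not say how the CNN features are handed to the LSTM, so the following is only a minimal sketch, assuming that each video frame has already been reduced to a feature vector by a pre-trained CNN and that these vectors are fed in order to a single LSTM layer followed by a softmax. The names `TinyLSTMClassifier` and `predict_label`, the dimensions, the class names, and the randomly initialised (untrained) weights are all hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """Hypothetical stand-in for the LSTM stage; weights are random, not trained."""

    def __init__(self, feat_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [h_prev, x_t].
        self.W = {k: init_matrix(hidden_dim, hidden_dim + feat_dim, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}
        self.W_out = init_matrix(num_classes, hidden_dim, rng)

    def step(self, x, h, c):
        """One LSTM time step for a single frame feature vector x."""
        z = h + x  # list concatenation [h_prev, x_t]
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, frame_features):
        """Run the LSTM over all frames, then apply softmax to the last hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in frame_features:
            h, c = self.step(x, h, c)
        logits = matvec(self.W_out, h)
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]


def predict_label(model, frame_features, class_names):
    """Return the most probable class name and the full probability vector."""
    probs = model.predict_proba(frame_features)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs
```

In a real pipeline the `frame_features` argument would hold activations taken from a fine-tuned network such as GoogleNet or AlexNet, and the LSTM weights would be learned from labelled videos; here both are placeholders, so the predicted label carries no meaning beyond showing the data flow.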

  • Published in

    VSIP '20: Proceedings of the 2020 2nd International Conference on Video, Signal and Image Processing
    December 2020
    108 pages
    ISBN: 9781450388931
    DOI: 10.1145/3442705

    Copyright © 2020 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited