Published in: Cluster Computing 1/2019

07-11-2017

More efficient and effective tricks for deep action recognition

Authors: Zheyuan Liu, Xiaoteng Zhang, Lei Song, Zhengyan Ding, Huixian Duan


Abstract

Deep convolutional networks have achieved great success in visual recognition of static images, yet they are less advantageous than traditional methods for action recognition in videos. Although two-stream-style convolutional networks achieve the best performance in human action recognition, obstacles remain, such as selecting pre-trained models and hyper-parameters, and high computational cost. In this paper, we propose two efficient and effective methods for action recognition based on the two-stream convolutional network: (1) reducing the computational cost of the temporal stream while achieving the same accuracy, and (2) providing techniques, such as selection of the optical flow algorithm, the pre-training datasets/architectures, and the hyper-parameters, for assembly in the action recognition task. Experimental results show that we obtain performance on a par with the state of the art on the HMDB51 (70.9%) and UCF101 (95.4%) datasets.
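As an illustration of the two-stream idea the abstract builds on, the sketch below shows late fusion of per-class scores from a spatial stream (RGB frames) and a temporal stream (stacked optical flow). This is a minimal, hypothetical example, not the paper's exact pipeline: the function names, the toy logits, and the temporal weight of 1.5 (a common choice in two-stream work, since motion cues often dominate) are all assumptions for demonstration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_two_stream(spatial_logits, temporal_logits, temporal_weight=1.5):
    """Weighted average of the two streams' class probabilities.

    The temporal stream is up-weighted, reflecting the common finding
    that motion features carry more signal for action recognition.
    Returns the predicted class index and the fused distribution.
    """
    p_spatial = softmax(np.asarray(spatial_logits, dtype=float))
    p_temporal = softmax(np.asarray(temporal_logits, dtype=float))
    fused = (p_spatial + temporal_weight * p_temporal) / (1.0 + temporal_weight)
    return int(np.argmax(fused)), fused

# Toy 4-class example: the streams disagree on the top class,
# and the weighted fusion resolves the disagreement.
pred, probs = fuse_two_stream([2.0, 1.0, 0.1, 0.0], [0.5, 3.0, 0.2, 0.1])
```

Here the spatial stream favours class 0 while the temporal stream strongly favours class 1; with the temporal stream up-weighted, the fused prediction follows the motion cue.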


Metadata
Title
More efficient and effective tricks for deep action recognition
Authors
Zheyuan Liu
Xiaoteng Zhang
Lei Song
Zhengyan Ding
Huixian Duan
Publication date
07-11-2017
Publisher
Springer US
Published in
Cluster Computing / Special Issue 1/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-017-1309-2
