
2018 | Original Paper | Book Chapter

Classification of Human Actions Using 3-D Convolutional Neural Networks: A Hierarchical Approach

Authors: Shaival Thakkar, M. V. Joshi

Published in: Computer Vision, Pattern Recognition, Image Processing, and Graphics

Publisher: Springer Singapore


Abstract

In this paper, we present a hierarchical approach to human action classification using 3-D convolutional neural networks (3-D CNNs). In general, human actions involve the positioning and movement of the hands and legs, and can therefore be grouped into those performed by the hands, those performed by the legs, or, in some cases, both. This observation motivates our hierarchical classification scheme. In this work, we treat actions as tasks performed by hand or leg movements. Instead of using a single 3-D CNN to classify the given actions, we use multiple networks and perform the classification hierarchically: we first perform a binary classification to separate hand actions from leg actions, and then use two separate networks, one for hand actions and one for leg actions, to classify among the target action categories. For example, for the KTH dataset we train three networks to classify six actions, three performed by the hands and three by the legs. The novelty of our approach lies in separating hand and leg actions first, so that each subsequent classifier receives features corresponding only to hands or only to legs, which leads to better classification accuracy. Moreover, the 3-D CNN automatically extracts features in both the spatial and temporal domains, avoiding the need for hand-crafted features and making it well suited to video classification. We evaluate our method on the KTH, Weizmann, and UCF-Sports datasets, and a comparison with state-of-the-art methods shows that our approach outperforms most of them.
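The following is a minimal sketch of the two-stage routing idea described in the abstract, written in PyTorch. The layer counts, kernel sizes, input clip shape, and the specific three-way splits per branch are illustrative assumptions (loosely modelled on KTH), not the chapter's actual architecture; only the hierarchy itself, a binary hand-vs-leg network followed by two branch-specific 3-D CNNs, comes from the abstract.

```python
# Hierarchical 3-D CNN routing: a sketch under assumed layer sizes and input shape.
import torch
import torch.nn as nn


def make_3d_cnn(num_classes: int) -> nn.Module:
    """A small 3-D CNN: spatio-temporal convolution blocks plus a classifier head."""
    return nn.Sequential(
        nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool3d(kernel_size=2),
        nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool3d(1),   # pool over time and space
        nn.Flatten(),
        nn.Linear(32, num_classes),
    )


# Stage 1: binary hand-vs-leg classifier. Stage 2: one network per branch.
binary_net = make_3d_cnn(num_classes=2)
hand_net = make_3d_cnn(num_classes=3)   # e.g. KTH hand actions: boxing, hand waving, hand clapping
leg_net = make_3d_cnn(num_classes=3)    # e.g. KTH leg actions: walking, jogging, running


def classify(clip: torch.Tensor) -> tuple[str, int]:
    """Route a video clip (batch, channel, frames, H, W) through the hierarchy."""
    branch = binary_net(clip).argmax(dim=1).item()   # 0 = hand action, 1 = leg action
    if branch == 0:
        return "hand", hand_net(clip).argmax(dim=1).item()
    return "leg", leg_net(clip).argmax(dim=1).item()


# Example: one grayscale clip of 16 frames at 60x80 resolution (KTH-like input).
dummy_clip = torch.randn(1, 1, 16, 60, 80)
print(classify(dummy_clip))
```

In this setup the branch classifiers only ever see clips routed to them, which is the property the abstract credits for the improved accuracy: each second-stage network specializes in either hand or leg features.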


Metadata
Title
Classification of Human Actions Using 3-D Convolutional Neural Networks: A Hierarchical Approach
Authors
Shaival Thakkar
M. V. Joshi
Copyright year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-0020-2_2