2018 | Original Paper | Book Chapter

Convolutional Neural Network-Based Action Recognition on Depth Maps

Authors: Jacek Trelinski, Bogdan Kwolek

Published in: Computer Vision and Graphics

Publisher: Springer International Publishing

Abstract

In this paper, we present an algorithm for action recognition that uses only depth maps. We propose a set of handcrafted features to describe a person’s shape in noisy depth maps. We extract features with a convolutional neural network (CNN) that has been trained on multi-channel input sequences consisting of two consecutive depth maps and a depth map projected onto an orthogonal Cartesian plane. We show experimentally that combining the features extracted by the CNN with the proposed handcrafted features leads to better classification performance. We demonstrate that an LSTM trained on such aggregated features achieves state-of-the-art classification performance on the UTKinect dataset. We also propose a global statistical descriptor of temporal features and show experimentally that such a descriptor has high discriminative power on time series of CNN features concatenated with handcrafted features.
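
The abstract does not say which statistics make up the global descriptor, so the following is only a minimal sketch of the general idea: per-frame CNN and handcrafted feature vectors are concatenated, and the resulting time series is summarised by per-dimension statistics. The function names `concat_features` and `global_descriptor`, the toy data, and the choice of mean, standard deviation, minimum and maximum are all assumptions made for illustration, not the authors' method.

```python
from statistics import fmean, pstdev


def concat_features(cnn_feats, handcrafted_feats):
    """Concatenate per-frame CNN and handcrafted feature vectors."""
    if len(cnn_feats) != len(handcrafted_feats):
        raise ValueError("both sequences must have the same number of frames")
    return [list(c) + list(h) for c, h in zip(cnn_feats, handcrafted_feats)]


def global_descriptor(sequence):
    """Summarise a (frames x dims) sequence with per-dimension statistics.

    Assumption: mean, population standard deviation, min and max are used
    here only as an example; the abstract does not name the statistics.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one frame")
    columns = list(zip(*sequence))  # one tuple per feature dimension
    descriptor = []
    for col in columns:
        descriptor.extend([fmean(col), pstdev(col), min(col), max(col)])
    return descriptor


# Hypothetical toy data: 3 frames, 2 CNN features and 1 handcrafted feature.
cnn = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
hand = [[10.0], [10.0], [13.0]]
frames = concat_features(cnn, hand)
desc = global_descriptor(frames)
print(len(frames[0]), len(desc))
```

A descriptor of this kind has a fixed length regardless of how many frames the sequence contains, so it could be passed to a conventional classifier, whereas the LSTM mentioned in the abstract consumes the frame-by-frame sequence directly.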

Metadata
Title
Convolutional Neural Network-Based Action Recognition on Depth Maps
Authors
Jacek Trelinski
Bogdan Kwolek
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-00692-1_19