
2018 | OriginalPaper | Chapter

Exploiting Sub-region Deep Features for Specific Action Recognition in Combat Sports Video

Authors: Yongqiang Kong, Zhaoqiang Wei, Zhengang Wei, Shengke Wang, Feng Gao

Published in: Advances in Multimedia Information Processing – PCM 2017

Publisher: Springer International Publishing


Abstract

Current research on human action recognition in videos has mainly focused on coarse recognition, i.e., classifying actions across different types of videos. However, these methods may fail to recognize the specific actions of a single object of interest, especially when the video contains multiple moving objects performing different actions. In this paper, we propose a novel method for specific player action recognition in combat sports video. Object tracking combined with body segmentation is used to generate sub-frame sequences. Action recognition is then achieved by training a new three-stream Convolutional Neural Network (CNN) model, whose inputs are the horizontal components of optical flow, a single sub-frame, and the vertical components of optical flow, respectively, and whose streams are fused at both the convolutional and softmax layers. Extensive experiments on real broadcast combat sports videos demonstrate the advantages and effectiveness of the proposed method.
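The abstract describes the network arrangement only at a high level. The following is a minimal PyTorch sketch of a three-stream CNN with the inputs named above (stacked horizontal flow components, a single sub-frame, stacked vertical flow components) and fusion at both a convolutional layer and the softmax outputs. The backbone depth, channel sizes, number of stacked flow frames, and fusion weighting are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (not the authors' code): three convolutional streams whose inputs
# are the horizontal optical-flow components, a single RGB sub-frame, and the
# vertical optical-flow components, fused at a convolutional layer and again at
# the softmax outputs. All sizes below are assumed for illustration.
import torch
import torch.nn as nn

def conv_backbone(in_channels):
    # A small VGG-style stack standing in for whichever backbone the paper uses.
    return nn.Sequential(
        nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
        nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
        nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
    )

class ThreeStreamNet(nn.Module):
    def __init__(self, num_classes, flow_len=10):
        super().__init__()
        self.flow_x = conv_backbone(flow_len)   # stacked horizontal flow components
        self.rgb    = conv_backbone(3)          # single sub-frame (spatial stream)
        self.flow_y = conv_backbone(flow_len)   # stacked vertical flow components
        # Convolutional fusion: concatenate the three feature maps, fuse with a 1x1 conv.
        self.fuse_conv = nn.Conv2d(256 * 3, 256, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc_fused = nn.Linear(256, num_classes)
        # Per-stream classifiers for softmax-level (late) fusion.
        self.fc_x = nn.Linear(256, num_classes)
        self.fc_rgb = nn.Linear(256, num_classes)
        self.fc_y = nn.Linear(256, num_classes)

    def forward(self, flow_x, rgb, flow_y):
        fx, fr, fy = self.flow_x(flow_x), self.rgb(rgb), self.flow_y(flow_y)
        # Fusion at the convolutional layer.
        fused = self.pool(self.fuse_conv(torch.cat([fx, fr, fy], dim=1))).flatten(1)
        logits_fused = self.fc_fused(fused)
        # Fusion at the softmax layer: average the per-stream class scores.
        scores = [
            torch.softmax(self.fc_x(self.pool(fx).flatten(1)), dim=1),
            torch.softmax(self.fc_rgb(self.pool(fr).flatten(1)), dim=1),
            torch.softmax(self.fc_y(self.pool(fy).flatten(1)), dim=1),
            torch.softmax(logits_fused, dim=1),
        ]
        return torch.stack(scores).mean(dim=0)
```

At inference time, the averaged class scores over a tracked player's sub-frame sequence would give the predicted action; how the paper actually weights the convolutional and softmax fusion paths is not specified in the abstract.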


Metadata
Title
Exploiting Sub-region Deep Features for Specific Action Recognition in Combat Sports Video
Authors
Yongqiang Kong
Zhaoqiang Wei
Zhengang Wei
Shengke Wang
Feng Gao
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-77383-4_19