Published in: International Journal of Multimedia Information Retrieval 4/2019

28-10-2019 | Short Paper

Probabilistic selection of frames for early action recognition in videos

Authors: Mehrin Saremi, Farzin Yaghmaee

Abstract

Early action recognition seeks to recognize human actions in a video while the video has been only partially observed. In this paper, we introduce an approach to this recognition task. Some offline (non-early) recognition works sample the frames of a video uniformly and use them to train the model. However, there is no reason to expect uniform sampling to be optimal, so we propose a non-uniform sampling scheme tailored to early recognition. The proposed method samples frames so that earlier frames are more likely to be chosen. These frames are then used to train a deep network architecture. We compare our sampling approach with uniform sampling, using the HMDB51 dataset as a benchmark. We further compare our method with other state-of-the-art early recognition works. The experimental results suggest that our sampling process yields better recognition accuracy than uniform sampling at the early stages of the video, and that our proposed algorithm outperforms the state of the art.
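
The idea of favoring earlier frames can be illustrated with a minimal sketch. The abstract does not state which probability distribution the authors use, so the geometric weighting below, the `decay` parameter, and both function names are assumptions chosen purely for illustration; this is not the paper's implementation.

```python
import random


def sample_frame_indices(num_frames, num_samples, decay=0.97, seed=None):
    """Draw sorted, distinct frame indices, favoring earlier frames.

    ASSUMPTION: the geometric weights decay**i are illustrative only;
    the abstract does not specify the authors' sampling distribution.
    """
    if not 0 < num_samples <= num_frames:
        raise ValueError("num_samples must be between 1 and num_frames")
    rng = random.Random(seed)
    remaining = list(range(num_frames))
    weights = [decay ** i for i in remaining]
    chosen = []
    for _ in range(num_samples):
        # Weighted draw without replacement: pick one position, then remove it.
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return sorted(chosen)


def uniform_frame_indices(num_frames, num_samples, seed=None):
    """Baseline: every frame is equally likely to be chosen."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_frames), num_samples))
```

With `decay` below 1, the drawn indices concentrate near the start of the video; setting `decay=1.0` gives every frame equal weight, which corresponds to the uniform baseline the abstract compares against.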


Metadata
Title: Probabilistic selection of frames for early action recognition in videos
Authors: Mehrin Saremi, Farzin Yaghmaee
Publication date: 28-10-2019
Publisher: Springer London
Published in: International Journal of Multimedia Information Retrieval, Issue 4/2019
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI: https://doi.org/10.1007/s13735-019-00182-x
