2018 | Original Paper | Book Chapter

Action Recognition from Optical Flow Visualizations

Authors: Arpan Gupta, M. Sakthi Balan

Published in: Proceedings of 2nd International Conference on Computer Vision & Image Processing

Publisher: Springer Singapore

Abstract

Optical flow is an important computer vision technique used for motion estimation, object tracking and activity recognition. In this paper, we study how effective the optical flow feature is at recognizing simple actions when only its RGB visualizations are given as input to a deep neural network. Feeding only the optical flow visualizations, instead of the raw multimedia content, ensures that a single motion feature is the sole classification criterion. We treat human action recognition as a multi-class classification problem. To categorize an action, we train an AlexNet-like Convolutional Neural Network (CNN) on Farnebäck optical flow visualizations of the action videos. We use the KTH data set, which contains six types of action videos: walking, running, boxing, jogging, hand-clapping and hand-waving. The accuracy obtained on the test set is 84.72%. This is naturally below the state of the art, since only a single motion feature is used for classification, but it is high enough to show that optical flow visualization is a good distinguishing criterion for action recognition. The AlexNet-like CNN was trained in Caffe on two NVIDIA Quadro K4200 GPU cards, and the Farnebäck optical flow features were calculated using the OpenCV library.
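As a rough illustration of what an optical flow visualization of this kind can look like, the sketch below computes dense Farnebäck flow with OpenCV and maps it to an RGB image. The abstract does not state the color encoding or the Farnebäck parameters the authors used, so both are assumptions here: the encoding is the common HSV scheme (hue for flow direction, brightness for flow magnitude), and the parameter values are those of the OpenCV tutorial. The function names `farneback_flow` and `flow_to_rgb` are hypothetical.

```python
import numpy as np


def farneback_flow(prev_gray, next_gray):
    """Dense Farneback flow between two grayscale frames (H x W, uint8)."""
    import cv2  # imported lazily so flow_to_rgb below runs without OpenCV

    # Positional arguments after the frames: flow=None, pyr_scale=0.5,
    # levels=3, winsize=15, iterations=3, poly_n=5, poly_sigma=1.2, flags=0.
    # These values come from the OpenCV tutorial, not from the paper.
    return cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0
    )


def flow_to_rgb(flow):
    """Map an (H, W, 2) flow field to an (H, W, 3) uint8 RGB image.

    Assumed encoding: hue = flow direction, value = magnitude scaled by
    the per-image maximum, saturation fixed at 1.
    """
    fx, fy = flow[..., 0], flow[..., 1]
    mag = np.hypot(fx, fy)
    ang = np.mod(np.arctan2(fy, fx), 2 * np.pi)
    hue6 = ang / (2 * np.pi) * 6.0  # hue expressed as a sector in [0, 6)

    peak = mag.max()
    val = mag / peak if peak > 0 else np.zeros_like(mag)

    def channel(n):
        # Closed-form HSV -> RGB conversion for saturation 1.
        k = np.mod(n + hue6, 6.0)
        return val * (1.0 - np.clip(np.minimum(k, 4.0 - k), 0.0, 1.0))

    rgb = np.stack([channel(5), channel(3), channel(1)], axis=-1)
    return np.round(rgb * 255).astype(np.uint8)
```

In a pipeline like the one described, each pair of consecutive grayscale video frames would go through `farneback_flow`, and the resulting `flow_to_rgb` image would be saved as one training sample for the CNN. Static regions come out black, so only moving body parts contribute color.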

Metadata
Title
Action Recognition from Optical Flow Visualizations
Authors
Arpan Gupta
M. Sakthi Balan
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7895-8_31