2018 | OriginalPaper | Chapter

Action Recognition from Optical Flow Visualizations

Authors: Arpan Gupta, M. Sakthi Balan

Published in: Proceedings of 2nd International Conference on Computer Vision & Image Processing

Publisher: Springer Singapore

Abstract

Optical flow is an important computer vision technique used for motion estimation, object tracking and activity recognition. In this paper, we study how effective the optical flow feature is at recognizing simple actions when only its RGB visualizations are given as input to a deep neural network. Feeding only the optical flow visualizations, instead of the raw multimedia content, ensures that a single motion feature is the sole classification criterion. We treat human action recognition as a multi-class classification problem. To categorize an action, we train an AlexNet-like Convolutional Neural Network (CNN) on Farneback optical flow visualizations of the action videos. We use the KTH data set, which contains six types of action videos: walking, running, boxing, jogging, hand-clapping and hand-waving. The accuracy obtained on the test set is 84.72%. This is naturally below the state of the art, since only a single motion feature is used for classification, but it is high enough to show that optical flow visualization is a good distinguishing criterion for action recognition. The AlexNet-like CNN was trained in Caffe on two NVIDIA Quadro K4200 GPU cards, and the Farneback optical flow features were calculated using the OpenCV library.
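
To make the idea of an RGB visualization of optical flow concrete, the following minimal sketch maps a dense flow field to colors. It assumes the common color-wheel encoding (flow direction as hue, flow magnitude as brightness); the abstract does not state which encoding the authors used, so this is an illustration and not their pipeline. The function names `flow_to_rgb` and `visualize_flow` are hypothetical. In practice the flow field would come from a dense estimator such as OpenCV's `cv2.calcOpticalFlowFarneback`; here it is passed in as plain nested lists so the example has no external dependencies.

```python
import colorsys
import math


def flow_to_rgb(dx, dy, max_magnitude):
    """Map one flow vector (dx, dy) to an (R, G, B) tuple of ints in 0..255.

    Assumed encoding (not taken from the paper): direction -> hue,
    magnitude relative to max_magnitude -> brightness, full saturation.
    """
    magnitude = math.hypot(dx, dy)
    # atan2 returns an angle in (-pi, pi]; wrap it into [0, 2*pi).
    angle = math.atan2(dy, dx) % (2.0 * math.pi)
    hue = angle / (2.0 * math.pi)
    if max_magnitude > 0:
        value = min(magnitude / max_magnitude, 1.0)
    else:
        value = 0.0  # no motion anywhere: render black
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return (round(r * 255), round(g * 255), round(b * 255))


def visualize_flow(flow):
    """Convert a flow field (rows of (dx, dy) tuples) to rows of RGB tuples.

    Brightness is normalized by the largest magnitude in this field.
    """
    max_mag = max(
        (math.hypot(dx, dy) for row in flow for dx, dy in row),
        default=0.0,
    )
    return [[flow_to_rgb(dx, dy, max_mag) for dx, dy in row] for row in flow]


if __name__ == "__main__":
    # Tiny synthetic field: a static pixel, rightward motion, leftward motion.
    field = [[(0.0, 0.0), (2.0, 0.0), (-2.0, 0.0)]]
    print(visualize_flow(field))
```

Under this encoding a static pixel is black, and opposite directions of motion receive complementary hues, which is what lets a CNN that sees only the image still distinguish motion patterns.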

Metadata
Title
Action Recognition from Optical Flow Visualizations
Authors
Arpan Gupta
M. Sakthi Balan
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7895-8_31