Published in: Arabian Journal for Science and Engineering 8/2020

31.03.2020 | Research Article - Computer Engineering and Computer Science

Multiple Batches of Motion History Images (MB-MHIs) for Multi-view Human Action Recognition

By: Hajra Binte Naeem, Fiza Murtaza, Muhammad Haroon Yousaf, Sergio A. Velastin

Abstract

The recognition of human actions recorded in a multi-camera environment faces the challenging issue of viewpoint variation. Multi-view methods employ videos from different views to generate a compact, view-invariant representation of human actions. This paper proposes a novel multi-view human action recognition approach that uses multiple low-dimensional temporal templates and a reconstruction-based encoding scheme. The approach extracts multiple 2D motion history images (MHIs) of human action videos over non-overlapping temporal windows, constructing multiple batches of motion history images (MB-MHIs). Two kinds of descriptions are then computed for these MHI batches, based on (1) a deep residual network (ResNet) and (2) histograms of oriented gradients (HOG), to effectively quantify changes in gradient. ResNet descriptions are average-pooled at each batch. HOG descriptions are processed independently at each batch to learn a class-based dictionary using the K-singular value decomposition (K-SVD) algorithm. Sparse codes of the feature descriptions are then obtained with an orthogonal matching pursuit approach and average-pooled to extract encoded feature vectors. The encoded feature vectors at each batch are fused to form the final view-invariant feature representation, and a linear support vector machine classifier is trained for action recognition. Experimental results are given on three versions of a multi-view dataset: MuHAVi-8, MuHAVi-14, and MuHAVi-uncut. The proposed approach shows promising results when tested on a novel camera. Results on deep features indicate that the MB-MHI action representation is more view-invariant than a single MHI.
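The temporal template underlying the pipeline is the motion history image of Bobick and Davis, extended here to multiple non-overlapping windows per video. The following is a minimal NumPy sketch of that MB-MHI step, assuming a simple frame-differencing motion mask; the function names, the difference threshold, and the linear decay of one intensity level per frame are illustrative assumptions, not the paper's exact parameters.

```python
import numpy as np

def motion_history_image(frames, tau, thresh=30):
    """Single MHI: pixels that just moved are set to tau; older motion decays by 1 per frame."""
    mhi = np.zeros(frames[0].shape, dtype=np.float64)
    for prev, curr in zip(frames[:-1], frames[1:]):
        moving = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) >= thresh
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi

def mb_mhi(frames, n_batches):
    """MB-MHI: split the video into non-overlapping temporal windows, one MHI per window."""
    windows = np.array_split(np.asarray(frames), n_batches)
    return [motion_history_image(w, tau=len(w)) for w in windows]

# Synthetic clip: a bright row sweeping down an 8x8 frame over 12 frames.
frames = np.zeros((12, 8, 8), dtype=np.uint8)
for t in range(12):
    frames[t, t % 8, :] = 255

batches = mb_mhi(frames, n_batches=3)
print(len(batches), batches[0].shape)  # 3 (8, 8)
```

Each window's MHI would then be described with HOG or ResNet features and encoded via K-SVD dictionaries and orthogonal matching pursuit, as the abstract describes.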


Metadata
Title
Multiple Batches of Motion History Images (MB-MHIs) for Multi-view Human Action Recognition
Authors
Hajra Binte Naeem
Fiza Murtaza
Muhammad Haroon Yousaf
Sergio A. Velastin
Publication date
31.03.2020
Publisher
Springer Berlin Heidelberg
Published in
Arabian Journal for Science and Engineering / Issue 8/2020
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI
https://doi.org/10.1007/s13369-020-04481-y
