Published in: Arabian Journal for Science and Engineering 8/2020

31.03.2020 | Research Article - Computer Engineering and Computer Science

Multiple Batches of Motion History Images (MB-MHIs) for Multi-view Human Action Recognition

By: Hajra Binte Naeem, Fiza Murtaza, Muhammad Haroon Yousaf, Sergio A. Velastin

Abstract

The recognition of human actions recorded in a multi-camera environment faces the challenging issue of viewpoint variation. Multi-view methods employ videos from different views to generate a compact, view-invariant representation of human actions. This paper proposes a novel multi-view human action recognition approach that uses multiple low-dimensional temporal templates and a reconstruction-based encoding scheme. The approach extracts multiple 2D motion history images (MHIs) of human action videos over non-overlapping temporal windows, constructing multiple batches of motion history images (MB-MHIs). Two kinds of descriptions are then computed for these MHI batches, based on (1) a deep residual network (ResNet) and (2) histograms of oriented gradients (HOG), to effectively quantify changes in gradient. ResNet descriptions are average-pooled at each batch. HOG descriptions are processed independently at each batch to learn a class-based dictionary using the K-singular value decomposition (K-SVD) algorithm. Sparse codes of the feature descriptions are then obtained with an orthogonal matching pursuit approach and average-pooled to extract encoded feature vectors. The encoded feature vectors at each batch are fused to form the final view-invariant feature representation, and a linear support vector machine classifier is trained for action recognition. Experimental results are given on three versions of a multi-view dataset: MuHAVi-8, MuHAVi-14, and MuHAVi-uncut. The proposed approach shows promising results when tested on a novel camera. Results on deep features indicate that the MB-MHI action representation is more view-invariant than a single MHI.
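The temporal template underlying the pipeline is the motion history image of Bobick and Davis, extended here to multiple non-overlapping windows per video. The following is a minimal NumPy sketch of that MB-MHI step, assuming a simple frame-differencing motion mask; the function names, the difference threshold, and the linear decay of one intensity level per frame are illustrative assumptions, not the paper's exact parameters.

```python
import numpy as np

def motion_history_image(frames, tau, thresh=30):
    """Single MHI: pixels that just moved are set to tau; older motion decays by 1 per frame."""
    mhi = np.zeros(frames[0].shape, dtype=np.float64)
    for prev, curr in zip(frames[:-1], frames[1:]):
        moving = np.abs(curr.astype(np.int16) - prev.astype(np.int16)) >= thresh
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi

def mb_mhi(frames, n_batches):
    """MB-MHI: split the video into non-overlapping temporal windows, one MHI per window."""
    windows = np.array_split(np.asarray(frames), n_batches)
    return [motion_history_image(w, tau=len(w)) for w in windows]

# Synthetic clip: a bright row sweeping down an 8x8 frame over 12 frames.
frames = np.zeros((12, 8, 8), dtype=np.uint8)
for t in range(12):
    frames[t, t % 8, :] = 255

batches = mb_mhi(frames, n_batches=3)
print(len(batches), batches[0].shape)  # 3 (8, 8)
```

Each window's MHI would then be described with HOG or ResNet features and encoded via K-SVD dictionaries and orthogonal matching pursuit, as the abstract describes.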


Metadata
Title
Multiple Batches of Motion History Images (MB-MHIs) for Multi-view Human Action Recognition
Authors
Hajra Binte Naeem
Fiza Murtaza
Muhammad Haroon Yousaf
Sergio A. Velastin
Publication date
31.03.2020
Publisher
Springer Berlin Heidelberg
Published in
Arabian Journal for Science and Engineering / Issue 8/2020
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI
https://doi.org/10.1007/s13369-020-04481-y
