
2016 | OriginalPaper | Book Chapter

Reverse Testing Image Set Model Based Multi-view Human Action Recognition

Authors: Z. Gao, Y. Zhang, H. Zhang, G. P. Xu, Y. B. Xue

Published in: MultiMedia Modeling

Publisher: Springer International Publishing


Abstract

Recognizing human activities from videos has become a hot research topic in computer vision, but many studies show that action recognition based on a single view cannot achieve satisfactory performance. Many researchers have therefore turned to multi-view action recognition, yet how to mine the relationships among different views remains a challenging problem. Image-set-based video face recognition has shown that image set algorithms can effectively exploit the complementary properties of images from different views and achieve satisfactory performance. Inspired by this, we utilize image sets to mine the relationships among views in multi-view action recognition. However, studies also show that the number of samples in the gallery and query sets affects the performance of image-set-based face recognition, where several tens to several hundreds of samples are available; in multi-view action recognition, by contrast, each query set contains only 3–5 views (samples), which limits the effectiveness of the image set approach.
To address these issues, we propose multi-view human action recognition based on a reverse testing image set model (RTISM). We first extract dense trajectory features for each camera, construct a shared codebook for all cameras with k-means, and then encode the features of each camera with a Bag-of-Words (BoW) weighting scheme. Second, for each query set we compute the compound distance to every image subset in the gallery set and add the nearest image subset (called RTIS) to the query set. Finally, RTISM is optimized so that the query set and the RTIS are jointly reconstructed over the gallery set; in this way, the relationships among different actions in the gallery set and the complementary properties of the samples in the query set are exploited at the same time. Large-scale experiments on two public multi-view 3D action datasets, Northwestern-UCLA and CVS-MV-RGBD-Single, show that reconstructing the query set over the gallery set is very effective, that adding the RTIS to the query set clearly helps classification, and that the performance of RTISM is comparable to state-of-the-art methods.
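The abstract describes the RTISM pipeline step by step (shared k-means codebook, BoW coding, compound distance between the query set and gallery subsets, RTIS augmentation, and reconstruction over the gallery). The Python sketch below illustrates those steps under stated assumptions: the random descriptors, the mean pairwise Euclidean distance used as the compound distance, and the plain least-squares reconstruction are placeholders and do not reproduce the authors' formulation.

```python
# Minimal sketch of the RTISM pipeline from the abstract, on toy random data.
# Descriptor generation, the compound set-to-set distance, and the plain
# least-squares reconstruction below are illustrative assumptions; they stand
# in for the dense-trajectory features and the joint optimization in the paper.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against the shared codebook (BoW weights)."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Step 1: shared codebook learned by k-means over descriptors from all cameras.
all_descriptors = rng.normal(size=(2000, 30))   # stand-in for dense trajectories
codebook = KMeans(n_clusters=64, n_init=4, random_state=0).fit(all_descriptors)

# Gallery: one image subset (a set of BoW vectors) per action class.
gallery = [np.stack([bow_histogram(rng.normal(size=(200, 30)), codebook)
                     for _ in range(5)])
           for _ in range(10)]

# Query set: only a few views (3-5 samples) of the unknown action.
query = np.stack([bow_histogram(rng.normal(size=(200, 30)), codebook)
                  for _ in range(4)])

# Step 2: compound distance between the query set and each gallery subset
# (mean pairwise Euclidean distance is used here as a simple placeholder).
def compound_distance(set_a, set_b):
    d = np.linalg.norm(set_a[:, None, :] - set_b[None, :, :], axis=-1)
    return d.mean()

nearest = min(range(len(gallery)),
              key=lambda i: compound_distance(query, gallery[i]))

# Step 3 (reverse testing): add the nearest gallery subset (RTIS) to the query
# set, then reconstruct the augmented set over the whole gallery. The paper
# optimizes a joint reconstruction model; ordinary least squares is used here.
augmented = np.vstack([query, gallery[nearest]])
dictionary = np.vstack(gallery)                 # gallery samples as atoms
coeffs, *_ = np.linalg.lstsq(dictionary.T, augmented.T, rcond=None)

# Classify by the class whose atoms yield the smallest reconstruction residual.
residuals, start = [], 0
for subset in gallery:
    rows = slice(start, start + len(subset))
    recon = dictionary[rows].T @ coeffs[rows]
    residuals.append(np.linalg.norm(augmented.T - recon))
    start += len(subset)
print("predicted class index:", int(np.argmin(residuals)))
```

With real dense-trajectory descriptors and the paper's joint reconstruction model in place of the least-squares step, the same overall structure applies.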


Metadata
Title
Reverse Testing Image Set Model Based Multi-view Human Action Recognition
Authors
Z. Gao
Y. Zhang
H. Zhang
G. P. Xu
Y. B. Xue
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-27671-7_33
