2014 | Original Paper | Book Chapter

A Stochastic Late Fusion Approach to Human Action Recognition in Unconstrained Images and Videos

Written by: Muhammad Shahzad Cheema, Abdalrahman Eweiwi, Christian Bauckhage

Published in: Pattern Recognition

Publisher: Springer International Publishing

Abstract

Recognizing human actions in unconstrained videos and still images has attracted considerable research interest. An increasingly popular approach is to use ensembles of multiple features and classifiers to cope with different aspects such as motion, scene, pose, and context. It has been observed that late fusion of predictions from individual classifiers is more robust than early fusion of feature descriptors. In this paper, we present a novel framework for the late fusion of probabilistic predictions from different classifiers, based on formulating and solving constrained quadratic optimization problems. In contrast to late fusion methods such as the sum rule and linear weighting, our approach constrains the mixture coefficients so that they represent the posterior of every participating classifier for each class. Further, unlike fusion by Bayesian inference, the proposed approach minimizes an error function that also considers correlations among different models. Experiments on three video and image action datasets show that our approach outperforms other late fusion techniques. In particular, we report a 6%–8% improvement over previously published results on two benchmark datasets.
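
The idea of late fusion as a constrained quadratic problem can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a single non-negative weight per classifier (rather than per-class mixture coefficients), a squared-error objective against one-hot labels, and a projected-gradient solver chosen only to keep the example dependency-free. All function names and the toy data are hypothetical. The matrix `Q` holds inner products between the classifiers' predictions, which is where correlations among models enter the objective.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    sorted_desc = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for i, value in enumerate(sorted_desc, start=1):
        running_sum += value
        candidate = (running_sum - 1.0) / i
        if value - candidate > 0:
            theta = candidate  # last candidate that keeps the entry positive
    return [max(x - theta, 0.0) for x in v]


def _inner(a, b):
    """Sum of elementwise products over all samples and classes."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def fit_fusion_weights(predictions, targets, steps=500):
    """Minimize sum_n ||sum_m w_m p_m(x_n) - y_n||^2 over the simplex.

    predictions[m][n][k]: probability from classifier m, sample n, class k.
    targets[n][k]: one-hot ground truth (assumed label encoding).
    """
    m = len(predictions)
    # Quadratic term: pairwise agreement (correlation) between classifiers.
    Q = [[_inner(predictions[i], predictions[j]) for j in range(m)]
         for i in range(m)]
    # Linear term: agreement of each classifier with the labels.
    c = [_inner(predictions[i], targets) for i in range(m)]
    # Step 1/L with L <= 2 * trace(Q), an upper bound on the gradient's
    # Lipschitz constant, so projected gradient descent converges.
    step = 1.0 / (2.0 * sum(Q[i][i] for i in range(m)))
    w = [1.0 / m] * m  # start from the sum rule (uniform weights)
    for _ in range(steps):
        grad = [2.0 * (sum(Q[i][j] * w[j] for j in range(m)) - c[i])
                for i in range(m)]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(m)])
    return w


def fuse(sample_predictions, weights):
    """Weighted combination of one sample's per-classifier probabilities."""
    num_classes = len(sample_predictions[0])
    return [sum(wm * p[k] for wm, p in zip(weights, sample_predictions))
            for k in range(num_classes)]


# Toy data (made up): classifier 0 is accurate, classifier 1 is close to chance.
targets = [[1, 0], [0, 1], [1, 0], [0, 1]]
predictions = [
    [[0.9, 0.1], [0.2, 0.8], [0.8, 0.2], [0.1, 0.9]],
    [[0.4, 0.6], [0.6, 0.4], [0.5, 0.5], [0.7, 0.3]],
]
weights = fit_fusion_weights(predictions, targets)
```

With uniform weights, `fuse` reduces to the sum rule up to a constant factor; the learned `weights` instead shift mass toward the classifier that agrees better with the labels on the toy data.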

Metadata
Title
A Stochastic Late Fusion Approach to Human Action Recognition in Unconstrained Images and Videos
Written by
Muhammad Shahzad Cheema
Abdalrahman Eweiwi
Christian Bauckhage
Copyright Year
2014
DOI
https://doi.org/10.1007/978-3-319-11752-2_51