Published in: Pattern Analysis and Applications 1/2015

1 February 2015 | Industrial and Commercial Application

Robust relative attributes for human action recognition

Authors: Zhong Zhang, Chunheng Wang, Baihua Xiao, Wen Zhou, Shuang Liu


Abstract

High-level semantic features are important for recognizing human actions. Relative attributes, which describe relative relationships, have recently been proposed as one such high-level semantic feature and have shown promising performance. However, their training process is very sensitive to noise, and it is not robust in zero-shot learning. To overcome these drawbacks, we propose a robust learning framework that uses relative attributes for human action recognition. We simultaneously add sigmoid and Gaussian envelopes to the loss objective. This greatly reduces the influence of outliers during optimization and thus improves accuracy. In addition, we adopt Gaussian mixture models to better fit the distribution of actions in rank score space. Correspondingly, we propose a novel transfer strategy to estimate the parameters of the Gaussian mixture models for unseen classes. We verify our method on three challenging datasets (KTH, UIUC and HOLLYWOOD2); the experimental results demonstrate that it achieves better results than previous methods in both zero-shot classification and the traditional recognition task for human action recognition.
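
The abstract does not give the exact objective, so the following is only a minimal sketch of why a bounded envelope can limit the influence of outliers in pairwise ranking. The function names, the squared-hinge baseline, and the choice to apply the sigmoid envelope to ordered pairs and the Gaussian envelope to "similar" pairs are all assumptions made for illustration, not the authors' formulation.

```python
import math

def squared_hinge(margin):
    """Unbounded baseline loss (assumed): grows quadratically with the violation."""
    return max(0.0, 1.0 - margin) ** 2

def sigmoid_envelope(margin, scale=1.0):
    """Hypothetical bounded loss for an ordered pair: sigmoid(-scale * margin).

    It stays below 1 however badly the pair is mis-ranked, so a single
    outlier pair cannot dominate the summed objective.
    """
    z = scale * margin
    if z >= 0:
        e = math.exp(-z)          # numerically stable branch for z >= 0
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def gaussian_envelope(diff, sigma=1.0):
    """Hypothetical bounded loss for a 'similar' pair: 0 when scores coincide, at most 1."""
    return 1.0 - math.exp(-(diff ** 2) / (2.0 * sigma ** 2))

# margin = score(higher-ranked sample) - score(lower-ranked sample);
# the last value plays the role of an outlier (e.g. a mislabeled pair).
for margin in [2.0, 0.5, -0.5, -20.0]:
    print(margin, squared_hinge(margin), sigmoid_envelope(margin), gaussian_envelope(margin))
```

In this sketch the outlier pair contributes 441 to the squared-hinge sum but less than 1 under either envelope, which is the intuition behind the robustness claim. The price is that the enveloped losses are no longer convex.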


Metadata
Title
Robust relative attributes for human action recognition
Authors
Zhong Zhang
Chunheng Wang
Baihua Xiao
Wen Zhou
Shuang Liu
Publication date
1 February 2015
Publisher
Springer London
Published in
Pattern Analysis and Applications, Issue 1/2015
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-013-0349-3
