2017 | Original Paper | Book Chapter

Alternative Semantic Representations for Zero-Shot Human Action Recognition

Authors: Qian Wang, Ke Chen

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

A proper semantic representation for encoding side information is key to the success of zero-shot learning. In this paper, we explore two alternative semantic representations specifically for zero-shot human action recognition: textual descriptions of human actions and deep features extracted from still images relevant to human actions. Such side information is accessible on the Web at little cost, which paves a new way of obtaining side information for large-scale zero-shot human action recognition. We investigate different encoding methods to generate semantic representations for human actions from such side information. Based on our zero-shot visual recognition method, we conducted experiments on UCF101 and HMDB51 to evaluate the two proposed semantic representations. The results suggest that our proposed text- and image-based semantic representations considerably outperform traditional attributes and word vectors for zero-shot human action recognition. In particular, the image-based semantic representations yield favourable performance even though the representation is extracted from a small number of images per class.

Code related to this chapter is available at: http://staff.cs.manchester.ac.uk/~kechen/BiDiLEL/

Data related to this chapter are available at: http://staff.cs.manchester.ac.uk/~kechen/ASRHAR/
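
As an illustration of the idea in the abstract, the following is a minimal sketch, not the authors' implementation, of how a class-level semantic representation might be built from a few still images per action and then used to label a video from an unseen class. It assumes mean pooling as the encoding step (the abstract only says that different encoding methods are investigated), and it assumes the video has already been embedded into the same space as the image features; that embedding step belongs to the zero-shot method itself and is not shown. All function names and the toy three-dimensional feature vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def class_representation(image_features):
    """Mean-pool per-image feature vectors into one class-level vector.

    Mean pooling is an assumption made for this sketch; it is only one
    possible way of encoding a set of image features.
    """
    n = len(image_features)
    dim = len(image_features[0])
    mean = [sum(f[i] for f in image_features) / n for i in range(dim)]
    return l2_normalize(mean)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def predict_unseen(embedded_video, class_reps):
    """Return the unseen class whose representation is most similar."""
    return max(class_reps, key=lambda name: cosine(embedded_video, class_reps[name]))


# Toy stand-ins for deep features of a few images per unseen action class.
toy_images = {
    "archery": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "surfing": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
class_reps = {name: class_representation(feats) for name, feats in toy_images.items()}

# A video feature assumed to be already mapped into the same space.
print(predict_unseen([0.7, 0.2, 0.1], class_reps))
```

In this toy setting the query vector lies closest to the pooled "archery" features, so that label is returned. With real data the per-image vectors would come from a pretrained convolutional network and would have far more dimensions.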

The image scraper tool is available at: http://staff.cs.manchester.ac.uk/kechen/ASRHAR/.

Like attributes and word vectors, our proposed semantic representations may be directly deployed in all the existing zero-shot human action recognition methods.

The scripts and data used in our experiments are available on our project page: http://staff.cs.manchester.ac.uk/kechen/ASRHAR/.

https://code.google.com/p/word2vec/.

http://www.vlfeat.org/matconvnet/.

Title: Alternative Semantic Representations for Zero-Shot Human Action Recognition
Authors: Qian Wang, Ke Chen
Publisher: Springer International Publishing
Book: Machine Learning and Knowledge Discovery in Databases
Print ISBN: 978-3-319-71248-2
Electronic ISBN: 978-3-319-71249-9
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-71249-9_6