2020 | OriginalPaper | Chapter

Adversarial Query-by-Image Video Retrieval Based on Attention Mechanism

Authors: Ruicong Xu, Li Niu, Liqing Zhang

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

Query-by-image video retrieval (QBIVR) is a difficult cross-modal feature matching task. A growing number of retrieval tasks require indexing videos that contain the activities shown in a query image, which makes extracting meaningful spatio-temporal video features crucial. In this paper, we propose an approach based on adversarial learning, termed the Adversarial Image-to-Video (AIV) approach. To capture the temporal patterns of videos, we detect temporal regions that are likely to contain activities using fully convolutional 3D ConvNet features, and then obtain video bag features by 3D RoI pooling. To resolve the mismatch with image vector features and to identify how important each piece of information in a video is, we add a Multiple Instance Learning (MIL) module that assigns a different weight to each piece of temporal information in a video bag. Moreover, we use a triplet loss to distinguish different semantic categories and to support the intra-class variability of images and videos. In particular, AIV introduces a modality loss as an adversary to the triplet loss in adversarial learning. The interplay between the two losses bridges the domain gap across modalities. Extensive experiments on two widely used datasets verify the effectiveness of the proposed method compared with other methods.
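The abstract names three ingredients without giving formulas: attention-style weighting of the instances in a video bag, a triplet loss, and a modality loss that acts as its adversary. The NumPy sketch below illustrates one common form of each; it is not the authors' implementation. All function names, parameter shapes, the tanh scoring function, the squared Euclidean distance, the margin value, and the linear modality classifier are assumptions made for illustration only.

```python
import numpy as np


def attention_pool(bag, w, v):
    """Attention-weighted pooling of one video bag (illustrative form).

    bag: (n_instances, d) features, one row per temporal region.
    w:   (d, h) projection matrix (hypothetical parameter).
    v:   (h,) scoring vector (hypothetical parameter).
    Returns the pooled (d,) feature and the (n_instances,) weights.
    """
    scores = np.tanh(bag @ w) @ v
    scores = scores - scores.max()          # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ bag, weights


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on squared Euclidean distances (assumed form)."""
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    return max(0.0, d_pos - d_neg + margin)


def modality_loss(features, is_video, theta):
    """Binary cross-entropy of a linear image-vs-video classifier.

    features: (n, d) embedded images and videos.
    is_video: (n,) array of 0/1 modality labels.
    theta:    (d,) classifier weights (hypothetical parameter).
    In an adversarial setup the classifier would minimise this value
    while the feature extractors try to make the modalities
    indistinguishable; that training loop is not shown here.
    """
    probs = 1.0 / (1.0 + np.exp(-(features @ theta)))
    eps = 1e-12                              # avoid log(0)
    return float(-np.mean(is_video * np.log(probs + eps)
                          + (1 - is_video) * np.log(1 - probs + eps)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bag = rng.normal(size=(5, 8))            # 5 temporal regions, 8-dim features
    video_feat, weights = attention_pool(bag, rng.normal(size=(8, 4)),
                                         rng.normal(size=4))
    image_feat = rng.normal(size=8)          # stand-in for an image embedding
    other_video = rng.normal(size=8)         # stand-in for a non-matching video
    print(weights)
    print(triplet_loss(image_feat, video_feat, other_video))
```

Pooling a bag into a single vector gives the video the same shape as an image embedding, which is what allows a distance-based loss to compare the two modalities directly.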

Title: Adversarial Query-by-Image Video Retrieval Based on Attention Mechanism
Authors: Ruicong Xu, Li Niu, Liqing Zhang
Publisher: Springer International Publishing
Book: MultiMedia Modeling
Print ISBN: 978-3-030-37730-4
Electronic ISBN: 978-3-030-37731-1
Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-37731-1_63
