
01.06.2023 | Regular Paper

CLIP-based fusion-modal reconstructing hashing for large-scale unsupervised cross-modal retrieval

Authors: Li Mingyong, Li Yewen, Ge Mingyuan, Ma Longfei

Published in: International Journal of Multimedia Information Retrieval | Issue 1/2023


Abstract

As multi-modal data proliferate, a single retrieval modality no longer satisfies users' information needs. Deep hashing retrieval algorithms have attracted much attention for their efficient storage and fast query speed. Existing unsupervised hashing methods generally suffer from two limitations: (1) they fail to adequately capture the latent semantic relevance and co-occurrence information in data from different modalities, so the learned features and hash codes cannot effectively bridge the heterogeneity and semantic gaps in multi-modal data; (2) they typically construct a similarity matrix to guide hash-code learning, and inaccuracies in that matrix lead to sub-optimal retrieval performance. To address these issues, we propose a novel CLIP-based fusion-modal reconstructing hashing method for large-scale unsupervised cross-modal retrieval. First, we use CLIP to encode cross-modal features of the image and text modalities, and we learn a common hash-code representation space with modality-specific autoencoders. Second, we propose an efficient fusion approach to construct a semantically complementary affinity matrix that maximizes the potential semantic relevance between instances of different modalities. Furthermore, to retain the intrinsic semantic similarity of all similar pairs in the learned hash codes, we design a similarity-reconstruction objective based on semantic complementation, which learns high-quality hash-code representations. Extensive experiments on four multi-modal benchmark datasets (WIKI, MIRFLICKR, NUS-WIDE, and MS COCO) show that the proposed method achieves state-of-the-art image-text retrieval performance compared with several representative unsupervised cross-modal hashing methods.
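
To make the pipeline in the abstract concrete, the following is a minimal PyTorch sketch of the three stages it describes: frozen CLIP features feed modality-specific autoencoders, intra-modal cosine similarities are fused into a complementary affinity matrix, and a similarity-reconstruction loss aligns the hash codes with that matrix. Everything here is illustrative rather than the authors' exact formulation: the random tensors stand in for precomputed 512-d CLIP embeddings, and the fusion weight `alpha`, layer widths, code length, and equal loss weighting are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityAutoencoder(nn.Module):
    """Modality-specific autoencoder: CLIP features -> relaxed hash codes -> reconstruction."""
    def __init__(self, dim=512, bits=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(dim, 1024), nn.ReLU(),
            nn.Linear(1024, bits), nn.Tanh(),   # tanh keeps relaxed codes in (-1, 1)
        )
        self.decoder = nn.Sequential(
            nn.Linear(bits, 1024), nn.ReLU(),
            nn.Linear(1024, dim),
        )

    def forward(self, x):
        h = self.encoder(x)
        return h, self.decoder(h)

def fused_affinity(img_feat, txt_feat, alpha=0.5):
    """Fuse intra-modal cosine-similarity matrices into one complementary affinity matrix."""
    s_img = F.normalize(img_feat, dim=1) @ F.normalize(img_feat, dim=1).t()
    s_txt = F.normalize(txt_feat, dim=1) @ F.normalize(txt_feat, dim=1).t()
    return alpha * s_img + (1.0 - alpha) * s_txt

def similarity_reconstruction_loss(h_img, h_txt, S, bits=64):
    """Push intra- and inter-modal hash-code similarities toward the fused affinity S."""
    loss = 0.0
    for ha, hb in [(h_img, h_img), (h_txt, h_txt), (h_img, h_txt)]:
        loss = loss + F.mse_loss(ha @ hb.t() / bits, S)
    return loss

# Toy batch: random tensors standing in for precomputed CLIP image/text embeddings.
img_feat, txt_feat = torch.randn(8, 512), torch.randn(8, 512)
ae_img, ae_txt = ModalityAutoencoder(), ModalityAutoencoder()

h_img, rec_img = ae_img(img_feat)
h_txt, rec_txt = ae_txt(txt_feat)
S = fused_affinity(img_feat, txt_feat)

loss = (similarity_reconstruction_loss(h_img, h_txt, S)
        + F.mse_loss(rec_img, img_feat) + F.mse_loss(rec_txt, txt_feat))
loss.backward()

binary_codes = torch.sign(h_img.detach())   # binarize only at retrieval time
```

Note that the relaxed codes are binarized with sign() only at retrieval time, so the similarity-reconstruction objective remains differentiable during training.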


Metadata
Title
CLIP-based fusion-modal reconstructing hashing for large-scale unsupervised cross-modal retrieval
Authors
Li Mingyong
Li Yewen
Ge Mingyuan
Ma Longfei
Publication date
01.06.2023
Publisher
Springer London
Published in
International Journal of Multimedia Information Retrieval / Issue 1/2023
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI
https://doi.org/10.1007/s13735-023-00268-7
