Published in: International Journal of Machine Learning and Cybernetics 10/2023

11.05.2023 | Original Article

Pseudo-label driven deep hashing for unsupervised cross-modal retrieval

Authors: XianHua Zeng, Ke Xu, YiCai Xie

Abstract

With the rapid development of big data and the Internet, cross-modal retrieval has become a popular research topic. Cross-modal hashing is an important research direction within cross-modal retrieval because of its high efficiency and small memory consumption. Recently, many unsupervised cross-modal hashing methods have achieved strong results on cross-modal retrieval tasks. However, narrowing the heterogeneity gap between different modalities and generating more discriminative hash codes remain the main problems of unsupervised hashing. In this paper, we propose a novel unsupervised cross-modal hashing method, Pseudo-label Driven Deep Hashing, to solve the aforementioned problems. We introduce clustering into our model to obtain initial semantic information, called pseudo-labels, and we propose a novel adjustment method that uses the pseudo-labels to adjust a joint-semantic similarity matrix. We construct a similarity consistency loss function that targets the heterogeneity gap between different modalities, and a fine-tuning strategy over real values and binary codes that closes the gap between the real-valued space and the Hamming space. We conduct experiments on five datasets: three natural-image datasets with larger inter-class distances and two medical datasets with smaller inter-class distances. The results demonstrate the superiority of our method compared with several unsupervised cross-modal hashing methods.
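The high-level pipeline described in the abstract, clustering fused features to obtain pseudo-labels and then using those labels to adjust a joint-semantic similarity matrix, can be illustrated with a toy NumPy sketch. Everything here is a hypothetical illustration, not the paper's actual formulation: the function names (`adjusted_similarity`, `kmeans`), the fusion weight `alpha`, the `margin` adjustment rule, and the plain k-means routine are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means; returns one cluster id (pseudo-label) per sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return labels

def cosine_sim(F):
    """Pairwise cosine similarity of row features."""
    Fn = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    return Fn @ Fn.T

def adjusted_similarity(img_feat, txt_feat, k=10, alpha=0.5, margin=0.2):
    """Fuse per-modality similarities into a joint matrix, then push
    same-cluster pairs up and cross-cluster pairs down via pseudo-labels."""
    S = alpha * cosine_sim(img_feat) + (1 - alpha) * cosine_sim(txt_feat)
    labels = kmeans(np.hstack([img_feat, txt_feat]), k)
    same = labels[:, None] == labels[None, :]
    S = np.where(same, np.minimum(S + margin, 1.0),
                       np.maximum(S - margin, -1.0))
    return S, labels
```

The adjusted matrix could then serve as the supervision target for hash-code learning, with same-cluster pairs encouraged toward small Hamming distances.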

Metadata
Title
Pseudo-label driven deep hashing for unsupervised cross-modal retrieval
Authors
XianHua Zeng
Ke Xu
YiCai Xie
Publication date
11.05.2023
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 10/2023
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-023-01842-5
