Published in: International Journal of Machine Learning and Cybernetics 10/2023

11.05.2023 | Original Article

Pseudo-label driven deep hashing for unsupervised cross-modal retrieval

Authors: XianHua Zeng, Ke Xu, YiCai Xie

Abstract

With the rapid development of big data and the Internet, cross-modal retrieval has become a popular research topic. Cross-modal hashing is an important research direction within cross-modal retrieval because of its high efficiency and small memory consumption. Recently, many unsupervised cross-modal hashing methods have achieved strong results on cross-modal retrieval tasks. However, narrowing the heterogeneity gap between different modalities and generating more discriminative hash codes remain the main problems of unsupervised hashing. In this paper, we propose a novel unsupervised cross-modal hashing method, Pseudo-label Driven Deep Hashing, to solve the aforementioned problems. We introduce clustering into our model to obtain initial semantic information, called pseudo-labels, and we propose a novel adjustment method that uses the pseudo-labels to adjust a joint-semantic similarity matrix. We construct a similarity consistency loss function that targets the heterogeneity gap between different modalities, and a fine-tuning strategy over real values and binary codes that closes the gap between the real-valued space and the Hamming space. We conduct experiments on five datasets: three natural-image datasets with larger inter-class distances and two medical datasets with smaller inter-class distances. The results demonstrate the superiority of our method compared with several unsupervised cross-modal hashing methods.
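The high-level pipeline described in the abstract, clustering fused features to obtain pseudo-labels and then using those labels to adjust a joint-semantic similarity matrix, can be illustrated with a toy NumPy sketch. Everything here is a hypothetical illustration, not the paper's actual formulation: the function names (`adjusted_similarity`, `kmeans`), the fusion weight `alpha`, the `margin` adjustment rule, and the plain k-means routine are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means; returns one cluster id (pseudo-label) per sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return labels

def cosine_sim(F):
    """Pairwise cosine similarity of row features."""
    Fn = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    return Fn @ Fn.T

def adjusted_similarity(img_feat, txt_feat, k=10, alpha=0.5, margin=0.2):
    """Fuse per-modality similarities into a joint matrix, then push
    same-cluster pairs up and cross-cluster pairs down via pseudo-labels."""
    S = alpha * cosine_sim(img_feat) + (1 - alpha) * cosine_sim(txt_feat)
    labels = kmeans(np.hstack([img_feat, txt_feat]), k)
    same = labels[:, None] == labels[None, :]
    S = np.where(same, np.minimum(S + margin, 1.0),
                       np.maximum(S - margin, -1.0))
    return S, labels
```

The adjusted matrix could then serve as the supervision target for hash-code learning, with same-cluster pairs encouraged toward small Hamming distances.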

Metadata
Title
Pseudo-label driven deep hashing for unsupervised cross-modal retrieval
Authors
XianHua Zeng
Ke Xu
YiCai Xie
Publication date
11.05.2023
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 10/2023
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-023-01842-5
