Published in: International Journal of Machine Learning and Cybernetics | Issue 10/2021

2 August 2021 | Original Article

Discrete matrix factorization hashing for cross-modal retrieval

Authors: Xiaozhao Fang, Zhihu Liu, Na Han, Lin Jiang, Shaohua Teng

Abstract

Cross-modal hashing has recently attracted considerable attention for large-scale retrieval because of its low storage cost and high retrieval efficiency. However, existing hashing methods still have unresolved issues. First, most existing cross-modal hashing methods map the original data into a common Hamming space to learn unified hash codes, which ignores the modality-specific properties of multi-modal data. Second, most of them relax the discrete constraint when learning hash codes, which may lead to quantization loss and suboptimal performance. To address these problems, this paper proposes a novel cross-modal retrieval method named discrete matrix factorization hashing (DMFH). DMFH is a two-stage approach. In the first stage, given training data, DMFH uses matrix factorization to learn a modality-specific semantic representation for each modality and then generates the corresponding hash codes by linear projection. To ensure that the hash codes preserve the semantic similarity between different modalities, DMFH optimizes the hash codes with an affinity matrix constructed from the label information. Also in the first stage, DMFH proposes a discrete optimization algorithm to handle the discrete constraint in learning hash codes. In the second stage, given the hash codes learned in the first stage, DMFH uses kernel logistic regression to learn nonlinear features of unseen instances and then generates the corresponding hash codes for each modality. Extensive experiments on three public benchmark datasets show that DMFH outperforms several state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
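
The abstract does not give DMFH's formulas, so the following is only a minimal sketch of two generic ingredients it mentions: an affinity matrix built from label information, and retrieval across modalities by Hamming distance over binary codes. Everything here is an assumption made for illustration: the "share at least one label" affinity rule, the function names (`label_affinity`, `sign_codes`, `pack`, `rank_by_hamming`), and the toy projection values, which stand in for the output of a learned linear projection. It is not the authors' matrix factorization or discrete optimization procedure.

```python
from typing import List, Sequence


def label_affinity(labels: Sequence[Sequence[int]]) -> List[List[int]]:
    """Assumed rule: S[i][j] = 1 if instances i and j share a label, else 0."""
    n = len(labels)
    return [
        [1 if any(a and b for a, b in zip(labels[i], labels[j])) else 0
         for j in range(n)]
        for i in range(n)
    ]


def sign_codes(projections: Sequence[Sequence[float]]) -> List[List[int]]:
    """Binarize real-valued projections into {-1, +1}; zero maps to +1."""
    return [[1 if v >= 0 else -1 for v in row] for row in projections]


def pack(code: Sequence[int]) -> int:
    """Pack a {-1, +1} code into an integer bitmask (+1 -> bit 1, -1 -> bit 0)."""
    bits = 0
    for v in code:
        bits = (bits << 1) | (1 if v > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: Sequence[int]) -> List[int]:
    """Database indices sorted by Hamming distance to the query (ties keep index order)."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Toy multi-label annotations for three training pairs (hypothetical data).
labels = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
S = label_affinity(labels)

# Hypothetical 4-bit projections of the text modality (the database side).
text_proj = [[0.3, -1.2, 0.8, -0.1], [0.5, -0.4, -0.9, 0.2], [-0.7, 0.6, -0.3, 0.9]]
db = [pack(c) for c in sign_codes(text_proj)]

# Hypothetical projection of an image query, matched against the text codes.
query = pack(sign_codes([[0.9, -0.2, 0.1, -0.6]])[0])
ranking = rank_by_hamming(query, db)
print(S)        # [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
print(ranking)  # [0, 1, 2]
```

Packing each code into an integer makes the storage and speed argument concrete: a code occupies a few bits, and comparing two codes is one XOR followed by a bit count.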

Metadata
Title
Discrete matrix factorization hashing for cross-modal retrieval
Authors
Xiaozhao Fang
Zhihu Liu
Na Han
Lin Jiang
Shaohua Teng
Publication date
2 August 2021
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 10/2021
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-021-01395-5
