2019 | OriginalPaper | Chapter

Semisupervised Cross-Media Retrieval by Distance-Preserving Correlation Learning and Multi-modal Manifold Regularization

Authors: Ting Wang, Hong Zhang, Bo Li, Xin Xu

Published in: PRICAI 2019: Trends in Artificial Intelligence

Publisher: Springer International Publishing


Abstract

Cross-media data such as text, images, audio, video, and 3D models have heterogeneous representations and inconsistent distributions, so capturing the correlations among them for cross-media retrieval is a challenging problem. To handle multiple media types, this paper proposes a novel distance-preserving correlation learning and multi-modal manifold regularization (DCLMM) approach that learns a common representation of heterogeneous data. The method mines the distance-preserving correlation by minimizing the distances between media samples with positive semantic correlations and maximizing the distances between samples with negative semantic correlations, whereas most existing methods focus only on positive correlations between pairs of media types. DCLMM also uses an intrinsic multi-modal manifold to describe the geometric distribution of both labeled and unlabeled heterogeneous cross-media data. Moreover, DCLMM incorporates the distance-preserving correlation and the multi-modal manifold into a kernel-based regularization framework to exploit richer complementary information from the high-dimensional space. Extensive experimental results on two widely used cross-media datasets with up to five media types demonstrate the effectiveness of DCLMM for cross-media retrieval compared with state-of-the-art methods.
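The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes, not the authors' DCLMM formulation. It assumes two media types with linear projections into a shared space, a sign matrix that marks positive (+1), negative (-1), and unknown (0) pairs, and a per-modality k-nearest-neighbor graph Laplacian as a stand-in for the multi-modal manifold. All names (`correlation_signs`, `distance_preserving_loss`, `knn_laplacian`, `manifold_penalty`, `lam`) and the toy data are hypothetical, and the kernel-based part of the framework is omitted.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distance between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def correlation_signs(labels_a, labels_b, unlabeled=-1):
    """+1 for same-label pairs, -1 for different-label pairs, 0 if either is unlabeled."""
    same = labels_a[:, None] == labels_b[None, :]
    known = (labels_a[:, None] != unlabeled) & (labels_b[None, :] != unlabeled)
    return np.where(known, np.where(same, 1.0, -1.0), 0.0)

def distance_preserving_loss(F_a, F_b, S):
    """Lower when positive pairs are close and negative pairs are far apart."""
    return float(np.sum(S * pairwise_sq_dists(F_a, F_b)))

def knn_laplacian(X, k=2):
    """Unnormalized Laplacian of a symmetrized k-nearest-neighbor graph on X."""
    D = pairwise_sq_dists(X, X)
    np.fill_diagonal(D, np.inf)               # a sample is not its own neighbor
    n = X.shape[0]
    W = np.zeros((n, n))
    nearest = np.argsort(D, axis=1)[:, :k]
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def manifold_penalty(F, L):
    """Smoothness term tr(F^T L F): small when graph neighbors stay close in F."""
    return float(np.trace(F.T @ L @ F))

# Toy data (assumed): 6 image and 6 text samples, the last two of each unlabeled.
rng = np.random.default_rng(0)
X_img = rng.normal(size=(6, 4))
X_txt = rng.normal(size=(6, 3))
y_img = np.array([0, 0, 1, 1, -1, -1])
y_txt = np.array([0, 1, 0, 1, -1, -1])
P_img = rng.normal(size=(4, 2))               # hypothetical projection matrices
P_txt = rng.normal(size=(3, 2))

F_img, F_txt = X_img @ P_img, X_txt @ P_txt   # common 2-D representation
S = correlation_signs(y_img, y_txt)
lam = 0.1                                     # assumed trade-off weight
objective = (distance_preserving_loss(F_img, F_txt, S)
             + lam * (manifold_penalty(F_img, knn_laplacian(X_img))
                      + manifold_penalty(F_txt, knn_laplacian(X_txt))))
print(objective)
```

Unlabeled samples contribute nothing to the distance term but still shape the graph Laplacians, which is how the semisupervised setting enters this sketch. On its own the signed distance term is unbounded below, so a practical method would need an additional constraint or regularizer on the projections.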


Metadata
Title
Semisupervised Cross-Media Retrieval by Distance-Preserving Correlation Learning and Multi-modal Manifold Regularization
Authors
Ting Wang
Hong Zhang
Bo Li
Xin Xu
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-29908-8_3
