Multimedia Systems

01 November 2014 | Regular Paper

Relative image similarity learning with contextual information for Internet cross-media retrieval

Authors: Shuqiang Jiang, Xinhang Song, Qingming Huang

Published in: Multimedia Systems | Issue 6/2014

Abstract

With the explosive growth of image data on the Internet, using it efficiently in cross-media scenarios has become an urgent problem. Images are usually accompanied by contextual textual information, and these two heterogeneous modalities reinforce each other, making Internet content more informative. In most cases, visual information can be regarded as enhanced content of the textual document. To make image-to-image similarity more consistent with document-to-document similarity, this paper proposes a method that learns image similarities from the relations among the accompanying textual documents. More specifically, instead of using static quantitative relations, the paper adopts a rank-based learning procedure built on structural SVM, in which the ranking structure is established by comparing the relative relations of the textual information. The learned similarities agree more closely with human perception. The proposed method can be used not only for image-to-image retrieval but also for cross-modality multimedia retrieval, for which a query expansion framework is proposed to obtain more satisfactory results. Extensive experimental evaluations on a large-scale Internet dataset validate the performance of the proposed methods.
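
The idea of learning image similarity from relative comparisons of the accompanying text can be illustrated with a minimal sketch. This is not the structural SVM formulation used in the paper: it substitutes a simpler triplet hinge loss minimized by subgradient descent over a linear map. The function names, the use of cosine similarity on text features, the margin, the step size, and the number of steps are all assumptions made for illustration.

```python
import numpy as np


def text_triplets(T, margin=0.1):
    """Return (i, j, k) where document j is more similar to i than k is.

    Similarity is cosine similarity between rows of the text feature
    matrix T (an assumption; any text similarity could be plugged in).
    """
    Tn = T / np.linalg.norm(T, axis=1, keepdims=True)
    S = Tn @ Tn.T
    n = len(T)
    return [(i, j, k)
            for i in range(n) for j in range(n) for k in range(n)
            if len({i, j, k}) == 3 and S[i, j] > S[i, k] + margin]


def triplet_loss(L, X, triplets):
    """Mean hinge loss: image j should be closer to i than k by at least 1."""
    Z = X @ L.T  # image features after the linear map
    total = 0.0
    for i, j, k in triplets:
        d_ij = np.sum((Z[i] - Z[j]) ** 2)
        d_ik = np.sum((Z[i] - Z[k]) ** 2)
        total += max(0.0, 1.0 + d_ij - d_ik)
    return total / len(triplets)


def learn_map(X, triplets, steps=200, lr=0.01):
    """Subgradient descent on the triplet hinge loss, starting from identity."""
    L = np.eye(X.shape[1])
    for _ in range(steps):
        Z = X @ L.T
        G = np.zeros_like(L)
        for i, j, k in triplets:
            a, b = X[i] - X[j], X[i] - X[k]
            za, zb = Z[i] - Z[j], Z[i] - Z[k]
            if 1.0 + za @ za - zb @ zb > 0:  # triplet violates the margin
                # gradient of ||L a||^2 - ||L b||^2 with respect to L
                G += 2.0 * (np.outer(za, a) - np.outer(zb, b))
        L -= lr * G / len(triplets)
    return L
```

Given image features `X` and text features `T` with one row per image-document pair, `learn_map(X, text_triplets(T))` returns a matrix `L`, and squared distances between rows of `X @ L.T` then serve as the learned image dissimilarity. Only the ordering induced by the text is used, not the text similarity values themselves, which mirrors the abstract's preference for relative over static quantitative relations.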

Title: Relative image similarity learning with contextual information for Internet cross-media retrieval
Authors: Shuqiang Jiang, Xinhang Song, Qingming Huang
Publication date: 01 November 2014
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems | Issue 6/2014
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-012-0299-4
