Multimedia ontology matching by using visual and textual modalities

Multimedia Tools and Applications

Abstract

Ontologies have been applied intensively to improve multimedia search and retrieval by giving explicit meaning to visual content. Several multimedia ontologies have recently been proposed as knowledge models suitable for narrowing the well-known semantic gap and for enabling the semantic interpretation of images. Since these ontologies were created in different application contexts, establishing links between them, a task known as ontology matching, promises to fully unlock their potential in support of multimedia search and retrieval. This paper proposes and empirically compares two extensional ontology matching techniques applied to an important issue in semantic image retrieval: automatically associating common-sense knowledge with multimedia concepts. First, we extend a previously introduced textual concept matching approach to use both textual and visual representations of images. In addition, we propose a novel matching technique based on a multi-modal graph. We argue that the textual and visual modalities should be seen as complementary rather than exclusive sources of extensional information in order to improve the efficiency of ontology matching in the multimedia domain. An experimental evaluation is included in the paper.
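
As a rough illustration of the extensional idea summarised in the abstract, the sketch below scores two concepts by comparing the instances (images) attached to them, once through textual features and once through visual features, and then blends the two scores. It is not the matching procedure of the paper: the centroid representation, the cosine measure, the weight `alpha`, the function names, and the toy vectors are all assumptions chosen only to show how the two modalities can act as complementary evidence.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def concept_similarity(c1, c2, alpha=0.5):
    """Blend textual and visual similarity of two concepts.

    Each concept is a dict with "text" and "visual" keys holding the
    feature vectors of its instances (a hypothetical layout).
    alpha weights the textual modality, 1 - alpha the visual one.
    """
    text_sim = cosine(centroid(c1["text"]), centroid(c2["text"]))
    visual_sim = cosine(centroid(c1["visual"]), centroid(c2["visual"]))
    return alpha * text_sim + (1 - alpha) * visual_sim

def best_match(concept, candidates, alpha=0.5):
    """Name of the candidate concept with the highest blended similarity."""
    return max(
        candidates,
        key=lambda name: concept_similarity(concept, candidates[name], alpha),
    )

# Toy data (invented): term-frequency style text vectors and
# two-dimensional stand-ins for visual descriptors.
bus = {
    "text": [[1, 0, 1], [1, 0, 0]],
    "visual": [[0.9, 0.1], [0.8, 0.2]],
}
candidates = {
    "vehicle": {"text": [[1, 0, 1]], "visual": [[1.0, 0.0]]},
    "tree": {"text": [[0, 1, 0]], "visual": [[0.0, 1.0]]},
}

print(best_match(bus, candidates))
```

Setting `alpha` to 1.0 or 0.0 reduces the score to a single modality, which makes it easy to compare text-only, visual-only, and combined matching on the same data.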

Notes

  1. The temporal aspect of applying ontologies to multimedia falls outside the scope of the current study. We refer the interested reader to the project www.vidivideo.info and the publications found there.

  2. This constraint will be lifted for the graph-based approach.

  3. http://www-nlpir.nist.gov/projects/tv2005/

  4. http://www.ee.columbia.edu/ln/dvmm/lscom/

  5. Pearson’s measure, also discussed in [27], proved to compete closely with Spearman’s.

  6. A good heuristic is to set that threshold at the number of concepts to be kept, k′.

  7. The VSBM approaches have also been tested on this larger concept collection and vocabulary. The results are not reported here, as they did not differ significantly from those achieved on the smaller scale.

References

  1. Athanasiadis T, Tzouvaras V, Petridis K, Precioso F, Avrithis Y, Kompatsiaris Y (2005) Using a multimedia ontology infrastructure for semantic annotation of multimedia content. In: SemAnnot’05

  2. Dasiopoulou S, Kompatsiaris I, Strintzis M (2008) Using fuzzy DLs to enhance semantic image analysis. In: Semantic multimedia. Springer, pp 31–46

  3. Dasiopoulou S, Tzouvaras V, Kompatsiaris I, Strintzis M (2010) Enquiring MPEG-7 based multimedia ontologies. Multimed Tools Appl 46(2):1–40

  4. Deng J, Dong W, Socher R, Li L, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: CVPR, pp 710–719

  5. Doan A, Madhavan J, Domingos P, Halevy A (2002) Learning to map between ontologies on the semantic web. In: WWW’02. ACM Press, pp 662–673

  6. Euzenat J, Shvaiko P (2007) Ontology matching, 1st edn. Springer

  7. Fan J, Luo H, Shen Y, Yang C (2009) Integrating visual and semantic contexts for topic network generation and word sense disambiguation. In: ACM CIVR’09, pp 1–8

  8. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182

  9. Haveliwala T (2003) Topic-sensitive PageRank: a context-sensitive ranking algorithm for web search. IEEE Trans Knowl Data Eng 15(4):784–796

  10. Hudelot C, Atif J, Bloch I (2008) Fuzzy spatial relation ontology for image interpretation. Fuzzy Sets Syst 159:1929–1951

  11. Hudelot C, Maillot N, Thonnat M (2005) Symbol grounding for semantic image interpretation: from image data to semantics. In: SKCV-workshop, ICCV

  12. Inoue M (2004) On the need for annotation-based image retrieval. In: Proceedings of the workshop on information retrieval in context (IRiX), Sheffield, UK, pp 44–46

  13. James N, Todorov K, Hudelot C (2010) Ontology matching for the semantic annotation of images. In: FUZZ-IEEE. IEEE Computer Society Press, Los Alamitos

  14. Koskela M, Smeaton A (2007) An empirical study of inter-concept similarities in multimedia ontologies. In: CIVR’07. ACM, pp 464–471

  15. Lacher MS, Groh G (2001) Facilitating the exchange of explicit knowledge through ontology mappings. In: Proceedings of the 14th international FLAIRS conference. AAAI Press, pp 305–309

  16. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60:91–110. doi:10.1023/B:VISI.0000029664.99615.94

  17. Mihalcea R, Tarau P, Figa E (2004) PageRank on semantic networks, with application to word sense disambiguation. In: ICCL. Association for Computational Linguistics, p 1126

  18. Miller G (1995) WordNet: a lexical database for English. Commun ACM 38(11):39–41

  19. Pan J, Yang H, Faloutsos C, Duygulu P (2004) Automatic multimedia cross-modal correlation discovery. In: ACM SIGKDD. ACM, p 658

  20. Peraldi ISE, Kaya A, Möller R (2009) Formalizing multimedia interpretation based on abduction over description logic ABoxes. In: Description logics

  21. Russell B, Torralba A, Murphy K, Freeman W (2008) LabelMe: a database and web-based tool for image annotation. IJCV 77(1):157–173

  22. Smeulders A, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380

  23. Smith J, Chang S (2006) Large-scale concept ontology for multimedia. IEEE Multimed 13(3):86–91

  24. Snoek C, Huurnink B, Hollink L, De Rijke M, Schreiber G, Worring M (2007) Adding semantics to detectors for video retrieval. IEEE Trans Multimedia 9(5):975–986

  25. Stumme G, Maedche A (2001) FCA-Merge: bottom-up merging of ontologies. In: International joint conference on artificial intelligence, pp 225–230

  26. Tansley R (1998) The multimedia thesaurus: an aid for multimedia information retrieval and navigation. Master’s thesis

  27. Todorov K, Geibel P, Kühnberger K-U (2010) Extensional ontology matching with variable selection for support vector machines. In: CISIS. IEEE Computer Society Press, Los Alamitos, pp 962–968

  28. Tong H, Faloutsos C, Pan J-Y (2006) Fast random walk with restart and its applications. In: Industrial conference on data mining 06. IEEE Computer Society, Washington, pp 613–622

  29. Wang C, Jing F, Zhang L, Zhang H (2006) Image annotation refinement using random walk with restarts. In: ACM multimedia, p 650

  30. Wu L, Hua X-S, Yu N, Ma W-Y, Li S (2008) Flickr distance. In: Multimedia 08. ACM, pp 31–40

  31. Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: Fourteenth ICML. Morgan Kaufmann, San Mateo, pp 412–420

  32. Yao BZ, Yang X, Lin L, Lee MW, Zhu S-C (2010) I2T: image parsing to text description. Proc IEEE 98(8):1485–1508

Author information

Corresponding author

Correspondence to Konstantin Todorov.

About this article

Cite this article

Todorov, K., James, N. & Hudelot, C. Multimedia ontology matching by using visual and textual modalities. Multimed Tools Appl 62, 401–425 (2013). https://doi.org/10.1007/s11042-011-0912-0

  • DOI: https://doi.org/10.1007/s11042-011-0912-0
