2017 | Original Paper | Book Chapter

Fusion Strategies for Large-Scale Multi-modal Image Retrieval

Authors: Petra Budikova, Michal Batko, Pavel Zezula

Published in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXIII

Publisher: Springer Berlin Heidelberg

Abstract

Large-scale data management and retrieval in complex domains such as images, videos, or biometric data remains one of the most important and challenging information processing tasks. Even after two decades of intensive research, many questions remain to be answered before working tools become available for everyday use. In this work, we focus on the practical applicability of different multi-modal retrieval techniques. Multi-modal searching, which combines several complementary views on complex data objects, follows the human thinking process and represents a very promising retrieval paradigm. However, the rapid development of modality fusion techniques in several diverse directions and a lack of comparisons between individual approaches have resulted in a confusing situation in which the applicability of individual solutions is unclear. Aiming to improve the research community's comprehension of this topic, we analyze and systematically categorize existing multi-modal search techniques, identify their strengths, and describe selected representatives. In the second part of the paper, we focus on the specific problem of large-scale multi-modal image retrieval on the web. We analyze the requirements of such a task, implement several applicable fusion methods, and experimentally evaluate their performance in terms of both efficiency and effectiveness. The extensive experiments provide a unique comparison of diverse approaches to modality fusion in equal settings on two large real-world datasets.

Literatur
1.
Zurück zum Zitat Abu-Shareha, A.A., Mandava, R., Khan, L., Ramachandram, D.: Multimodal concept fusion using semantic closeness for image concept disambiguation. Multimedia Tools Appl. 61(1), 69–86 (2011). doi:10.1007/s11042-010-0707-8 CrossRef Abu-Shareha, A.A., Mandava, R., Khan, L., Ramachandram, D.: Multimodal concept fusion using semantic closeness for image concept disambiguation. Multimedia Tools Appl. 61(1), 69–86 (2011). doi:10.​1007/​s11042-010-0707-8 CrossRef
2.
Zurück zum Zitat Ah-Pine, J., Csurka, G., Clinchant, S.: Unsupervised visual and textual information fusion in CBMIR using graph-based methods. ACM Trans. Inform. Syst. 33(2), 9:1–9:31 (2015). doi:10.1145/2699668 Ah-Pine, J., Csurka, G., Clinchant, S.: Unsupervised visual and textual information fusion in CBMIR using graph-based methods. ACM Trans. Inform. Syst. 33(2), 9:1–9:31 (2015). doi:10.​1145/​2699668
3.
Zurück zum Zitat Andrade, F.S.P., Almeida, J., Pedrini, H., S.Torres, R.: Fusion of local and global descriptors for content-based image and video retrieval. In: Alvarez, L., Mejail, M., Gomez, L., Jacobo, J. (eds.) CIARP 2012. LNCS, vol. 7441, pp. 845–853. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33275-3_104 CrossRef Andrade, F.S.P., Almeida, J., Pedrini, H., S.Torres, R.: Fusion of local and global descriptors for content-based image and video retrieval. In: Alvarez, L., Mejail, M., Gomez, L., Jacobo, J. (eds.) CIARP 2012. LNCS, vol. 7441, pp. 845–853. Springer, Heidelberg (2012). doi:10.​1007/​978-3-642-33275-3_​104 CrossRef
4.
Zurück zum Zitat Arampatzis, A., Zagoris, K., Chatzichristofis, S.A.: Dynamic two-stage image retrieval from large multimodal databases. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Mudoch, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 326–337. Springer, Heidelberg (2011). doi:10.1007/978-3-642-20161-5_33 CrossRef Arampatzis, A., Zagoris, K., Chatzichristofis, S.A.: Dynamic two-stage image retrieval from large multimodal databases. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Mudoch, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 326–337. Springer, Heidelberg (2011). doi:10.​1007/​978-3-642-20161-5_​33 CrossRef
6.
Zurück zum Zitat Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval - The Concepts and Technology Behind Search, 2nd edn. Pearson Education Ltd., Harlow (2011) Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval - The Concepts and Technology Behind Search, 2nd edn. Pearson Education Ltd., Harlow (2011)
7.
Zurück zum Zitat Barrios, J.M., Bustos, B.: Automatic weight selection for multi-metric distances. In: Proceedings of the 4th International Conference on Similarity Search and Applications (SISAP 2011), pp. 61–68 (2011). doi:10.1145/1995412.1995425 Barrios, J.M., Bustos, B.: Automatic weight selection for multi-metric distances. In: Proceedings of the 4th International Conference on Similarity Search and Applications (SISAP 2011), pp. 61–68 (2011). doi:10.​1145/​1995412.​1995425
8.
Zurück zum Zitat Batko, M., Falchi, F., Lucchese, C., Novak, D., Perego, R., Rabitti, F., Sedmidubsky, J., Zezula, P.: Building a web-scale image similarity search system. Multimedia Tools Appl. 47(3), 599–629 (2010). doi:10.1007/s11042-009-0339-z CrossRef Batko, M., Falchi, F., Lucchese, C., Novak, D., Perego, R., Rabitti, F., Sedmidubsky, J., Zezula, P.: Building a web-scale image similarity search system. Multimedia Tools Appl. 47(3), 599–629 (2010). doi:10.​1007/​s11042-009-0339-z CrossRef
9.
Zurück zum Zitat Batko, M., Kohoutkova, P., Zezula, P.: Combining metric features in large collections. In: 24th International Conference on Data Engineering Workshops (ICDE 2008), pp. 370–377 (2008). doi:10.1109/ICDEW.2008.4498347 Batko, M., Kohoutkova, P., Zezula, P.: Combining metric features in large collections. In: 24th International Conference on Data Engineering Workshops (ICDE 2008), pp. 370–377 (2008). doi:10.​1109/​ICDEW.​2008.​4498347
10.
Zurück zum Zitat Batko, M., Novak, D., Zezula, P.: MESSIF: metric similarity search implementation framework. In: Thanos, C., Borri, F., Candela, L. (eds.) DELOS 2007. LNCS, vol. 4877, pp. 1–10. Springer, Heidelberg (2007). doi:10.1007/978-3-540-77088-6_1 CrossRef Batko, M., Novak, D., Zezula, P.: MESSIF: metric similarity search implementation framework. In: Thanos, C., Borri, F., Candela, L. (eds.) DELOS 2007. LNCS, vol. 4877, pp. 1–10. Springer, Heidelberg (2007). doi:10.​1007/​978-3-540-77088-6_​1 CrossRef
11.
Zurück zum Zitat Benavent, X., Garcia-Serrano, A., Granados, R., Benavent, J., de Ves, E.: Multimedia information retrieval based on late semantic fusion approaches: experiments on a wikipedia image collection. IEEE Trans. Multimedia 15(8), 2009–2021 (2013). doi:10.1109/TMM.2013.2267726 CrossRef Benavent, X., Garcia-Serrano, A., Granados, R., Benavent, J., de Ves, E.: Multimedia information retrieval based on late semantic fusion approaches: experiments on a wikipedia image collection. IEEE Trans. Multimedia 15(8), 2009–2021 (2013). doi:10.​1109/​TMM.​2013.​2267726 CrossRef
12.
Zurück zum Zitat Blanken, H., de Vries, A., Blok, H., Feng, L.: Multimedia Retrieval. Data-Centric Systems and Applications. Springer, Secaucus (2007) Blanken, H., de Vries, A., Blok, H., Feng, L.: Multimedia Retrieval. Data-Centric Systems and Applications. Springer, Secaucus (2007)
13.
Zurück zum Zitat Bossé, É., Roy, J., Wark, S.: Concepts, Models, and Tools for Information Fusion. Artech House, Inc., Norwood (2007) Bossé, É., Roy, J., Wark, S.: Concepts, Models, and Tools for Information Fusion. Artech House, Inc., Norwood (2007)
14.
Zurück zum Zitat Bozzon, A., Fraternali, P.: Chapter 8: multimedia and multimodal information retrieval. In: Ceri, S., Brambilla, M. (eds.) Search Computing. LNCS, vol. 5950, pp. 135–155. Springer, Heidelberg (2010). doi:10.1007/978-3-642-12310-8_8 CrossRef Bozzon, A., Fraternali, P.: Chapter 8: multimedia and multimodal information retrieval. In: Ceri, S., Brambilla, M. (eds.) Search Computing. LNCS, vol. 5950, pp. 135–155. Springer, Heidelberg (2010). doi:10.​1007/​978-3-642-12310-8_​8 CrossRef
16.
Zurück zum Zitat Budikova, P., Batko, M., Zezula, P.: Evaluation platform for content-based image retrieval systems. In: Gradmann, S., Borri, F., Meghini, C., Schuldt, H. (eds.) TPDL 2011. LNCS, vol. 6966, pp. 130–142. Springer, Heidelberg (2011). doi:10.1007/978-3-642-24469-8_15 CrossRef Budikova, P., Batko, M., Zezula, P.: Evaluation platform for content-based image retrieval systems. In: Gradmann, S., Borri, F., Meghini, C., Schuldt, H. (eds.) TPDL 2011. LNCS, vol. 6966, pp. 130–142. Springer, Heidelberg (2011). doi:10.​1007/​978-3-642-24469-8_​15 CrossRef
17.
Zurück zum Zitat Budikova, P., Batko, M., Zezula, P.: Similarity query postprocessing by ranking. In: Detyniecki, M., Knees, P., Nürnberger, A., Schedl, M., Stober, S. (eds.) AMR 2010. LNCS, vol. 6817, pp. 159–173. Springer, Heidelberg (2012). doi:10.1007/978-3-642-27169-4_12 CrossRef Budikova, P., Batko, M., Zezula, P.: Similarity query postprocessing by ranking. In: Detyniecki, M., Knees, P., Nürnberger, A., Schedl, M., Stober, S. (eds.) AMR 2010. LNCS, vol. 6817, pp. 159–173. Springer, Heidelberg (2012). doi:10.​1007/​978-3-642-27169-4_​12 CrossRef
19.
20.
Zurück zum Zitat Chatzichristofis, S.A., Zagoris, K., Boutalis, Y., Arampatzis, A.: A fuzzy rank-based late fusion method for image retrieval. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, C.-W., Andreopoulos, Y., Breiteneder, C. (eds.) MMM 2012. LNCS, vol. 7131, pp. 463–472. Springer, Heidelberg (2012). doi:10.1007/978-3-642-27355-1_43 CrossRef Chatzichristofis, S.A., Zagoris, K., Boutalis, Y., Arampatzis, A.: A fuzzy rank-based late fusion method for image retrieval. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, C.-W., Andreopoulos, Y., Breiteneder, C. (eds.) MMM 2012. LNCS, vol. 7131, pp. 463–472. Springer, Heidelberg (2012). doi:10.​1007/​978-3-642-27355-1_​43 CrossRef
21.
Zurück zum Zitat Chen, L., Cong, G., Jensen, C.S., Wu, D.: Spatial keyword query processing: an experimental evaluation. In: The Proceedings of the VLDB Endowment (PVLDB), pp. 217–228 (2013). doi:10.14778/2535569.2448955 Chen, L., Cong, G., Jensen, C.S., Wu, D.: Spatial keyword query processing: an experimental evaluation. In: The Proceedings of the VLDB Endowment (PVLDB), pp. 217–228 (2013). doi:10.​14778/​2535569.​2448955
22.
Zurück zum Zitat Chen, Y., Yu, N., Luo, B., wen Chen, X.: iLike: integrating visual and textual features for vertical search. In: 18th International Conference on Multimedia (ACM Multimedia 2010), pp. 221–230 (2010). doi:10.1145/1873951.1873984 Chen, Y., Yu, N., Luo, B., wen Chen, X.: iLike: integrating visual and textual features for vertical search. In: 18th International Conference on Multimedia (ACM Multimedia 2010), pp. 221–230 (2010). doi:10.​1145/​1873951.​1873984
24.
Zurück zum Zitat Clinchant, S., Ah-Pine, J., Csurka, G.: Semantic combination of textual and visual information in multimedia retrieval. In: Proceedings of the 1st International Conference on Multimedia Retrieval (ICMR 2011), p. 44 (2011). doi:10.1145/1991996.1992040 Clinchant, S., Ah-Pine, J., Csurka, G.: Semantic combination of textual and visual information in multimedia retrieval. In: Proceedings of the 1st International Conference on Multimedia Retrieval (ICMR 2011), p. 44 (2011). doi:10.​1145/​1991996.​1992040
26.
Zurück zum Zitat Datta, R., Joshi, D., Li, J., Wang, J.Z.: Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv. 40(2), 5:1–5:60 (2008). doi:10.1145/1348246.1348248 Datta, R., Joshi, D., Li, J., Wang, J.Z.: Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv. 40(2), 5:1–5:60 (2008). doi:10.​1145/​1348246.​1348248
27.
Zurück zum Zitat Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Li, F.F.: ImageNet: a large-scale hierarchical image database. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), pp. 248–255 (2009). doi:10.1109/CVPRW.2009.5206848 Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Li, F.F.: ImageNet: a large-scale hierarchical image database. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), pp. 248–255 (2009). doi:10.​1109/​CVPRW.​2009.​5206848
28.
Zurück zum Zitat Depeursinge, A., Müller, H.: Fusion techniques for combining textual and visual information retrieval. In: ImageCLEF. The Kluwer International Series on Information Retrieval, vol. 32, pp. 95–114. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15181-1_6 Depeursinge, A., Müller, H.: Fusion techniques for combining textual and visual information retrieval. In: ImageCLEF. The Kluwer International Series on Information Retrieval, vol. 32, pp. 95–114. Springer, Heidelberg (2010). doi:10.​1007/​978-3-642-15181-1_​6
31.
Zurück zum Zitat Eickhoff, C., Li, W., Vries, A.P.: Exploiting user comments for audio-visual content indexing and retrieval. In: Serdyukov, P., Braslavski, P., Kuznetsov, S.O., Kamps, J., Rüger, S., Agichtein, E., Segalovich, I., Yilmaz, E. (eds.) ECIR 2013. LNCS, vol. 7814, pp. 38–49. Springer, Heidelberg (2013). doi:10.1007/978-3-642-36973-5_4 CrossRef Eickhoff, C., Li, W., Vries, A.P.: Exploiting user comments for audio-visual content indexing and retrieval. In: Serdyukov, P., Braslavski, P., Kuznetsov, S.O., Kamps, J., Rüger, S., Agichtein, E., Segalovich, I., Yilmaz, E. (eds.) ECIR 2013. LNCS, vol. 7814, pp. 38–49. Springer, Heidelberg (2013). doi:10.​1007/​978-3-642-36973-5_​4 CrossRef
32.
Zurück zum Zitat Escalante, H.J., Montes, M., Sucar, L.E.: Multimodal indexing based on semantic cohesion for image retrieval. Inform. Retrieval 15(1), 1–32 (2012). doi:10.1007/s10791-011-9170-z Escalante, H.J., Montes, M., Sucar, L.E.: Multimodal indexing based on semantic cohesion for image retrieval. Inform. Retrieval 15(1), 1–32 (2012). doi:10.​1007/​s10791-011-9170-z
34.
Zurück zum Zitat Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. The MIT Press, Cambridge (1998)MATH Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. The MIT Press, Cambridge (1998)MATH
36.
Zurück zum Zitat Ha, H., Yang, Y., Fleites, F., Chen, S.: Correlation-based feature analysis and multi-modality fusion framework for multimedia semantic retrieval. In: Proceedings of the 2013 IEEE International Conference on Multimedia and Expo (ICME 2013), pp. 1–6 (2013). doi:10.1109/ICME.2013.6607639 Ha, H., Yang, Y., Fleites, F., Chen, S.: Correlation-based feature analysis and multi-modality fusion framework for multimedia semantic retrieval. In: Proceedings of the 2013 IEEE International Conference on Multimedia and Expo (ICME 2013), pp. 1–6 (2013). doi:10.​1109/​ICME.​2013.​6607639
37.
Zurück zum Zitat Hemayati, R., Meng, W., Yu, C.: Semantic-based grouping of search engine results using wordnet. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds.) APWeb/WAIM -2007. LNCS, vol. 4505, pp. 678–686. Springer, Heidelberg (2007). doi:10.1007/978-3-540-72524-4_70 CrossRef Hemayati, R., Meng, W., Yu, C.: Semantic-based grouping of search engine results using wordnet. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds.) APWeb/WAIM -2007. LNCS, vol. 4505, pp. 678–686. Springer, Heidelberg (2007). doi:10.​1007/​978-3-540-72524-4_​70 CrossRef
38.
Zurück zum Zitat Hoque, E., Strong, G., Hoeber, O., Gong, M.: Conceptual query expansion and visual search results exploration for web image retrieval. In: 7th Atlantic Web Intelligence Conference (AWIC 2011), pp. 73–82 (2011). doi:10.1007/978-3-642-18029-3_8 Hoque, E., Strong, G., Hoeber, O., Gong, M.: Conceptual query expansion and visual search results exploration for web image retrieval. In: 7th Atlantic Web Intelligence Conference (AWIC 2011), pp. 73–82 (2011). doi:10.​1007/​978-3-642-18029-3_​8
39.
Zurück zum Zitat Hörster, E., Slaney, M., Ranzato, M., Weinberger, K.: Unsupervised image ranking. In: 1st ACM Workshop on Large-Scale Multimedia Retrieval and Mining (LS-MMRM 2009), pp. 81–88 (2009). doi:10.1145/1631058.1631074 Hörster, E., Slaney, M., Ranzato, M., Weinberger, K.: Unsupervised image ranking. In: 1st ACM Workshop on Large-Scale Multimedia Retrieval and Mining (LS-MMRM 2009), pp. 81–88 (2009). doi:10.​1145/​1631058.​1631074
41.
Zurück zum Zitat Jain, R., Sinha, P.: Content without context is meaningless. In: International Conference on Multimedia (ACM Multimedia 2010), pp. 1259–1268. ACM (2010). doi:10.1145/1873951.1874199 Jain, R., Sinha, P.: Content without context is meaningless. In: International Conference on Multimedia (ACM Multimedia 2010), pp. 1259–1268. ACM (2010). doi:10.​1145/​1873951.​1874199
43.
Zurück zum Zitat Jegou, H., Schmid, C., Harzallah, H., Verbeek, J.J.: Accurate image search using the contextual dissimilarity measure. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 2–11 (2010). doi:10.1109/TPAMI.2008.285 CrossRef Jegou, H., Schmid, C., Harzallah, H., Verbeek, J.J.: Accurate image search using the contextual dissimilarity measure. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 2–11 (2010). doi:10.​1109/​TPAMI.​2008.​285 CrossRef
45.
Zurück zum Zitat Khasanova, R., Dong, X., Frossard, P.: Multi-modal image retrieval with random walk on multi-layer graphs. In: IEEE International Symposium on Multimedia (ISM 2016), pp. 1–6 (2016). doi:10.1109/ISM.2016.0011 Khasanova, R., Dong, X., Frossard, P.: Multi-modal image retrieval with random walk on multi-layer graphs. In: IEEE International Symposium on Multimedia (ISM 2016), pp. 1–6 (2016). doi:10.​1109/​ISM.​2016.​0011
47.
Zurück zum Zitat Kludas, J., Bruno, E., Marchand-Maillet, S.: Information fusion in multimedia information retrieval. In: Boujemaa, N., Detyniecki, M., Nürnberger, A. (eds.) AMR 2007. LNCS, vol. 4918, pp. 147–159. Springer, Heidelberg (2008). doi:10.1007/978-3-540-79860-6_12 CrossRef Kludas, J., Bruno, E., Marchand-Maillet, S.: Information fusion in multimedia information retrieval. In: Boujemaa, N., Detyniecki, M., Nürnberger, A. (eds.) AMR 2007. LNCS, vol. 4918, pp. 147–159. Springer, Heidelberg (2008). doi:10.​1007/​978-3-540-79860-6_​12 CrossRef
50.
Zurück zum Zitat Lan, Z., Bao, L., Yu, S.-I., Liu, W., Hauptmann, A.G.: Double fusion for multimedia event detection. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, C.-W., Andreopoulos, Y., Breiteneder, C. (eds.) MMM 2012. LNCS, vol. 7131, pp. 173–185. Springer, Heidelberg (2012). doi:10.1007/978-3-642-27355-1_18 CrossRef Lan, Z., Bao, L., Yu, S.-I., Liu, W., Hauptmann, A.G.: Double fusion for multimedia event detection. In: Schoeffmann, K., Merialdo, B., Hauptmann, A.G., Ngo, C.-W., Andreopoulos, Y., Breiteneder, C. (eds.) MMM 2012. LNCS, vol. 7131, pp. 173–185. Springer, Heidelberg (2012). doi:10.​1007/​978-3-642-27355-1_​18 CrossRef
52.
Zurück zum Zitat Li, J.: Reachability based ranking in interactive image retrieval. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2015), pp. 867–870 (2015). doi:10.1145/2766462.2767777 Li, J.: Reachability based ranking in interactive image retrieval. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2015), pp. 867–870 (2015). doi:10.​1145/​2766462.​2767777
53.
Zurück zum Zitat Li, J., Ma, Q., Asano, Y., Yoshikawa, M.: Re-ranking by multi-modal relevance feedback for content-based social image retrieval. In: Sheng, Q.Z., Wang, G., Jensen, C.S., Xu, G. (eds.) APWeb 2012. LNCS, vol. 7235, pp. 399–410. Springer, Heidelberg (2012). doi:10.1007/978-3-642-29253-8_34 CrossRef Li, J., Ma, Q., Asano, Y., Yoshikawa, M.: Re-ranking by multi-modal relevance feedback for content-based social image retrieval. In: Sheng, Q.Z., Wang, G., Jensen, C.S., Xu, G. (eds.) APWeb 2012. LNCS, vol. 7235, pp. 399–410. Springer, Heidelberg (2012). doi:10.​1007/​978-3-642-29253-8_​34 CrossRef
54.
Zurück zum Zitat Liu, Y., Mei, T., Hua, X.S.: CrowdReranking: exploring multiple search engines for visual search reranking. In: 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009), pp. 500–507 (2009). doi:10.1145/1571941.1572027 Liu, Y., Mei, T., Hua, X.S.: CrowdReranking: exploring multiple search engines for visual search reranking. In: 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009), pp. 500–507 (2009). doi:10.​1145/​1571941.​1572027
55.
Zurück zum Zitat Lokoč, J., Novák, D., Batko, M., Skopal, T.: Visual image search: feature signatures or/and global descriptors. In: Navarro, G., Pestov, V. (eds.) SISAP 2012. LNCS, vol. 7404, pp. 177–191. Springer, Heidelberg (2012). doi:10.1007/978-3-642-32153-5_13 CrossRef Lokoč, J., Novák, D., Batko, M., Skopal, T.: Visual image search: feature signatures or/and global descriptors. In: Navarro, G., Pestov, V. (eds.) SISAP 2012. LNCS, vol. 7404, pp. 177–191. Springer, Heidelberg (2012). doi:10.​1007/​978-3-642-32153-5_​13 CrossRef
57.
Zurück zum Zitat Magalhães, J., Rüger, S.: An information-theoretic framework for semantic-multimedia retrieval. ACM Trans. Inform. Syst. 28(4), 1–32 (2010). doi:10.1145/1852102.1852105 Magalhães, J., Rüger, S.: An information-theoretic framework for semantic-multimedia retrieval. ACM Trans. Inform. Syst. 28(4), 1–32 (2010). doi:10.​1145/​1852102.​1852105
59.
Zurück zum Zitat McCandless, M., Hatcher, E., Gospodnetić, O.: Lucene in Action: Covers Apache Lucene V. 3. 0. Manning Pubs Co Series, Manning (2010) McCandless, M., Hatcher, E., Gospodnetić, O.: Lucene in Action: Covers Apache Lucene V. 3. 0. Manning Pubs Co Series, Manning (2010)
61.
Zurück zum Zitat Mironica, I., Ionescu, B., Vertan, C.: Hierarchical clustering relevance feedback for content-based image retrieval. In: 10th International Workshop on Content-Based Multimedia Indexing (CBMI 2012), pp. 1–6 (2012). doi:10.1109/CBMI.2012.6269811 Mironica, I., Ionescu, B., Vertan, C.: Hierarchical clustering relevance feedback for content-based image retrieval. In: 10th International Workshop on Content-Based Multimedia Indexing (CBMI 2012), pp. 1–6 (2012). doi:10.​1109/​CBMI.​2012.​6269811
62.
Zurück zum Zitat MPEG-7: Multimedia content description interfaces. Part 3: Visual. ISO/IEC 15938–3:2002 (2002) MPEG-7: Multimedia content description interfaces. Part 3: Visual. ISO/IEC 15938–3:2002 (2002)
63.
Zurück zum Zitat Müller, H., Clough, P., Deselaers, T., Caputo, B.: ImageCLEF: Experimental Evaluation in Visual Information Retrieval, 1st edn. Springer, Heidelberg (2010)CrossRefMATH Müller, H., Clough, P., Deselaers, T., Caputo, B.: ImageCLEF: Experimental Evaluation in Visual Information Retrieval, 1st edn. Springer, Heidelberg (2010)CrossRefMATH
67.
Zurück zum Zitat Oh, S., McCloskey, S., Kim, I., Vahdat, A., Cannons, K.J., Hajimirsadeghi, H., Mori, G., Perera, A.G.A., Pandey, M., Corso, J.J.: Multimedia event detection with multimodal feature fusion and temporal concept localization. Mach. Vis. Appl. 25(1), 49–69 (2013). doi:10.1007/s00138-013-0525-x CrossRef Oh, S., McCloskey, S., Kim, I., Vahdat, A., Cannons, K.J., Hajimirsadeghi, H., Mori, G., Perera, A.G.A., Pandey, M., Corso, J.J.: Multimedia event detection with multimodal feature fusion and temporal concept localization. Mach. Vis. Appl. 25(1), 49–69 (2013). doi:10.​1007/​s00138-013-0525-x CrossRef
70.
Zurück zum Zitat Pedronette, D.C.G., da Silva Torres, R.: Combining re-ranking and rank aggregation methods for image retrieval. Multimedia Tools Appl. 75(15), 9121–9144 (2016). doi:10.1007/s11042-015-3044-0 Pedronette, D.C.G., da Silva Torres, R.: Combining re-ranking and rank aggregation methods for image retrieval. Multimedia Tools Appl. 75(15), 9121–9144 (2016). doi:10.​1007/​s11042-015-3044-0
71.
Zurück zum Zitat Pham, T.T., Maillot, N., Lim, J.H., Chevallet, J.P.: Latent semantic fusion model for image retrieval and annotation. In: Sixteenth ACM Conference on Information and Knowledge Management (CIKM 2007), pp. 439–444 (2007). doi:10.1145/1321440.1321503 Pham, T.T., Maillot, N., Lim, J.H., Chevallet, J.P.: Latent semantic fusion model for image retrieval and annotation. In: Sixteenth ACM Conference on Information and Knowledge Management (CIKM 2007), pp. 439–444 (2007). doi:10.​1145/​1321440.​1321503
72.
Zurück zum Zitat Pulla, C., Jawahar, C.V.: Multi modal semantic indexing for image retrieval. In: 9th ACM International Conference on Image and Video Retrieval (CIVR 2010), pp. 342–349 (2010). doi:10.1145/1816041.1816091 Pulla, C., Jawahar, C.V.: Multi modal semantic indexing for image retrieval. In: 9th ACM International Conference on Image and Video Retrieval (CIVR 2010), pp. 342–349 (2010). doi:10.​1145/​1816041.​1816091
73.
74.
Zurück zum Zitat Richter, F., Romberg, S., Hörster, E., Lienhart, R.: Multimodal ranking for image search on community databases. In: Proceedings of the International Conference on Multimedia Information Retrieval (MIR 2010), pp. 63–72 (2010). doi:10.1145/1743384.1743402 Richter, F., Romberg, S., Hörster, E., Lienhart, R.: Multimodal ranking for image search on community databases. In: Proceedings of the International Conference on Multimedia Information Retrieval (MIR 2010), pp. 63–72 (2010). doi:10.​1145/​1743384.​1743402
78.
Zurück zum Zitat Safadi, B., Sahuguet, M., Huet, B.: When textual and visual information join forces for multimedia retrieval. In: International Conference on Multimedia Retrieval (ICMR 2014), p. 265 (2014). doi:10.1145/2578726.2578760 Safadi, B., Sahuguet, M., Huet, B.: When textual and visual information join forces for multimedia retrieval. In: International Conference on Multimedia Retrieval (ICMR 2014), p. 265 (2014). doi:10.​1145/​2578726.​2578760
79.
Zurück zum Zitat Samet, H.: Foundations of Multidimensional and Metric Data Structures. Computer Graphics and Geometric Modeling. Morgan Kaufmann Publishers Inc. (2005) Samet, H.: Foundations of Multidimensional and Metric Data Structures. Computer Graphics and Geometric Modeling. Morgan Kaufmann Publishers Inc. (2005)
80.
Zurück zum Zitat Santos, J.M., Cavalcanti, J.M.B., Saraiva, P.C., Moura, E.S.: Multimodal re-ranking of product image search results. In: Serdyukov, P., Braslavski, P., Kuznetsov, S.O., Kamps, J., Rüger, S., Agichtein, E., Segalovich, I., Yilmaz, E. (eds.) ECIR 2013. LNCS, vol. 7814, pp. 62–73. Springer, Heidelberg (2013). doi:10.1007/978-3-642-36973-5_6 CrossRef Santos, J.M., Cavalcanti, J.M.B., Saraiva, P.C., Moura, E.S.: Multimodal re-ranking of product image search results. In: Serdyukov, P., Braslavski, P., Kuznetsov, S.O., Kamps, J., Rüger, S., Agichtein, E., Segalovich, I., Yilmaz, E. (eds.) ECIR 2013. LNCS, vol. 7814, pp. 62–73. Springer, Heidelberg (2013). doi:10.​1007/​978-3-642-36973-5_​6 CrossRef
81.
82.
Zurück zum Zitat Siddiquie, B., White, B., Sharma, A., Davis, L.S.: Multi-modal image retrieval for complex queries using small codes. In: International Conference on Multimedia Retrieval (ICMR 2014), p. 321 (2014). doi:10.1145/2578726.2578767 Siddiquie, B., White, B., Sharma, A., Davis, L.S.: Multi-modal image retrieval for complex queries using small codes. In: International Conference on Multimedia Retrieval (ICMR 2014), p. 321 (2014). doi:10.​1145/​2578726.​2578767
83.
Zurück zum Zitat Smeulders, A., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1349–1380 (2000). doi:10.1109/34.895972 CrossRef Smeulders, A., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern Anal. Mach. Intell. 22(12), 1349–1380 (2000). doi:10.​1109/​34.​895972 CrossRef
84.
Zurück zum Zitat Snoek, C., Worring, M., Smeulders, A.W.M.: Early versus late fusion in semantic video analysis. In: 13th ACM International Conference on Multimedia (ACM Multimedia), pp. 399–402 (2005). doi:10.1145/1101149.1101236 Snoek, C., Worring, M., Smeulders, A.W.M.: Early versus late fusion in semantic video analysis. In: 13th ACM International Conference on Multimedia (ACM Multimedia), pp. 399–402 (2005). doi:10.​1145/​1101149.​1101236
85.
Zurück zum Zitat Sugiyama, Y., Kato, M.P., Ohshima, H., Tanaka, K.: Relative relevance feedback in image retrieval. In: International Conference on Multimedia and Expo (ICME 2012), pp. 272–277 (2012). doi:10.1109/ICME.2012.161 Sugiyama, Y., Kato, M.P., Ohshima, H., Tanaka, K.: Relative relevance feedback in image retrieval. In: International Conference on Multimedia and Expo (ICME 2012), pp. 272–277 (2012). doi:10.​1109/​ICME.​2012.​161
86.
Zurück zum Zitat Tollari, S., Detyniecki, M., Marsala, C., Fakeri-Tabrizi, A., Amini, M.-R., Gallinari, P.: Exploiting visual concepts to improve text-based image retrieval. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds.) ECIR 2009. LNCS, vol. 5478, pp. 701–705. Springer, Heidelberg (2009). doi:10.1007/978-3-642-00958-7_70 CrossRef Tollari, S., Detyniecki, M., Marsala, C., Fakeri-Tabrizi, A., Amini, M.-R., Gallinari, P.: Exploiting visual concepts to improve text-based image retrieval. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds.) ECIR 2009. LNCS, vol. 5478, pp. 701–705. Springer, Heidelberg (2009). doi:10.​1007/​978-3-642-00958-7_​70 CrossRef
87.
Zurück zum Zitat Tran, T., Phung, D., Venkatesh, S.: Learning sparse latent representation and distance metric for image retrieval. In: IEEE International Conference on Multimedia and Expo (ICME 2013), pp. 1–6. IEEE (2013). doi:10.1109/ICME.2013.6607435 Tran, T., Phung, D., Venkatesh, S.: Learning sparse latent representation and distance metric for image retrieval. In: IEEE International Conference on Multimedia and Expo (ICME 2013), pp. 1–6. IEEE (2013). doi:10.​1109/​ICME.​2013.​6607435
88.
Zurück zum Zitat Uluwitige, D., Chappell, T., Geva, S., Chandran, V.: Improving retrieval quality using pseudo relevance feedback in content-based image retrieval. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pp. 873–876 (2016). doi:10.1145/2911451.2914747 Uluwitige, D., Chappell, T., Geva, S., Chandran, V.: Improving retrieval quality using pseudo relevance feedback in content-based image retrieval. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pp. 873–876 (2016). doi:10.​1145/​2911451.​2914747
89.
Zurück zum Zitat Wang, L., Yang, L., Tian, X.: Query aware visual similarity propagation for image search reranking. In: ACM Multimedia 2009, pp. 725–728 (2009). doi:10.1145/1631272.1631398 Wang, L., Yang, L., Tian, X.: Query aware visual similarity propagation for image search reranking. In: ACM Multimedia 2009, pp. 725–728 (2009). doi:10.​1145/​1631272.​1631398
92.
Wei, Y., Song, Y., Zhen, Y., Liu, B., Yang, Q.: Heterogeneous translated hashing: A scalable solution towards multi-modal similarity search. ACM Trans. Knowl. Discov. Data 10(4), 36:1–36:28 (2016). doi:10.1145/2744204
93.
Wilkins, P., Smeaton, A.F., Ferguson, P.: Properties of optimally weighted data fusion in CBMIR. In: 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2010), pp. 643–650 (2010). doi:10.1145/1835449.1835556
94.
96.
Xu, S., Li, H., Chang, X., Yu, S., Du, X., Li, X., Jiang, L., Mao, Z., Lan, Z., Burger, S., Hauptmann, A.G.: Incremental multimodal query construction for video search. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval (ICMR 2015), pp. 675–678 (2015). doi:10.1145/2671188.2749413
99.
Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search - The Metric Space Approach, Advances in Database Systems, vol. 32. Springer (2006)
101.
Zhang, S., Yang, M., Cour, T., Yu, K., Metaxas, D.N.: Query specific fusion for image retrieval. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, pp. 660–673. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33709-3_47
102.
Zheng, L., Wang, S., Tian, L., He, F., Liu, Z., Tian, Q.: Query-adaptive late fusion for image search and person re-identification. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), pp. 1741–1750 (2015). doi:10.1109/CVPR.2015.7298783
103.
Zitouni, H., Sevil, S.G., Ozkan, D., Duygulu, P.: Re-ranking of web image search results using a graph algorithm. In: 19th International Conference on Pattern Recognition (ICPR 2008), pp. 1–4 (2008). doi:10.1109/ICPR.2008.4761472
Metadata
Title
Fusion Strategies for Large-Scale Multi-modal Image Retrieval
Authors
Petra Budikova
Michal Batko
Pavel Zezula
Copyright Year
2017
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-55696-2_5