Published in: Multimedia Systems 4/2013

1 July 2013 | Regular Paper

Multiple visual concept discovery using concept-based visual word clustering

Authors: Jun-Bin Yeh, Chung-Hsien Wu, Shi-Xin Mai


Abstract

In recent research, visual concept discovery has been used to bridge the semantic gap in representing visual content. However, the presence of multiple concepts in an image generally degrades discovery accuracy. In this paper, a Concept-based Visual Word Clustering (CVWC) method is proposed to discover multiple concepts in an image without requiring pre-segmented training images. CVWC is based on prior knowledge of concepts, which are trained from the meta-text of web images. First, concepts are obtained by clustering the visual words (VWs) in the regions extracted by image segmentation. A concept-based genetic algorithm (CBGA) is applied to search for near-optimal clusters according to the VWs in a concept and the co-occurrence probability of two concepts. The clustering procedure is also performed on neighboring VWs to discover all the regions that represent a concept. A concept extension (CE) method is then applied to iteratively update the discovered concepts from the clustering results. In video retrieval experiments, the proposed CVWC method based on CBGA and CE improved the mean average precision (mAP) by 0.04 and 0.06 over the pixel-based image segmentation approach and the conventional concept model approach, respectively, for the category “nation defense,” and by 0.06 and 0.05, respectively, for the category “ecology.”
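The retrieval results above are reported as mAP. As a reading aid, the following minimal sketch shows the standard (non-interpolated) definition of average precision and its mean over queries. The function names, the toy document IDs, and the relevance sets are illustrative assumptions; the evaluation protocol actually used in the paper is not described in this excerpt.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one query.

    Sums precision@k over every rank k that holds a relevant item and
    divides by the total number of relevant items. Assumes the ranked
    list contains no duplicate IDs.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranked, qrels.get(query, set()))
        for query, ranked in runs.items()
    )
    return total / len(runs)


if __name__ == "__main__":
    # Hypothetical toy data: two queries with made-up shot IDs.
    runs = {
        "q1": ["a", "b", "c", "d"],
        "q2": ["x", "y"],
    }
    qrels = {
        "q1": {"a", "c"},  # AP = (1/1 + 2/3) / 2
        "q2": {"y"},       # AP = (1/2) / 1
    }
    print(f"{mean_average_precision(runs, qrels):.4f}")  # prints 0.6667
```

On this scale, an absolute gain of 0.04 to 0.06 means the average precision per query rises by four to six percentage points.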


Metadata
Title
Multiple visual concept discovery using concept-based visual word clustering
Authors
Jun-Bin Yeh
Chung-Hsien Wu
Shi-Xin Mai
Publication date
1 July 2013
Publisher
Springer-Verlag
Published in
Multimedia Systems / Issue 4/2013
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-012-0294-9
