Multimedia Systems

01.07.2013 | Regular Paper

Multiple visual concept discovery using concept-based visual word clustering

Authors: Jun-Bin Yeh, Chung-Hsien Wu, Shi-Xin Mai

Published in: Multimedia Systems | Issue 4/2013

Abstract

Recent research has used visual concept discovery to bridge the semantic gap in representing visual content. However, the presence of multiple concepts in a single image generally degrades discovery accuracy. This paper proposes a Concept-based Visual Word Clustering (CVWC) method that discovers multiple concepts in an image without requiring pre-segmented training images. CVWC relies on prior knowledge of concepts, which are trained from the meta-text of web images. First, concepts are obtained by clustering the visual words (VWs) in the regions produced by image segmentation. A concept-based genetic algorithm (CBGA) searches for near-optimal clusters according to the VWs in a concept and the co-occurrence probability of two concepts. The clustering procedure is also applied to neighboring VWs to discover all the regions that represent a concept. A concept extension (CE) method then iteratively updates the discovered concepts from the clustering results. In video retrieval experiments, the proposed CVWC method based on CBGA and CE improved the mean average precision (mAP) by 0.04 and 0.06 over a pixel-based image segmentation approach and a conventional concept model approach, respectively, for the category “nation defense,” and by 0.06 and 0.05, respectively, for the category “ecology.”
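The abstract reports its retrieval results as mAP. As a reminder of how that metric is commonly computed, here is a minimal sketch in Python. It uses the standard textbook definition of average precision over a ranked list and is not taken from the paper; the function names and the toy relevance lists are illustrative assumptions, and the paper's exact evaluation protocol (for example, any rank cutoff) is not stated in the abstract.

```python
from typing import Optional, Sequence


def average_precision(ranked_relevance: Sequence[bool],
                      num_relevant: Optional[int] = None) -> float:
    """Average precision for one query.

    ranked_relevance[i] is True if the item at rank i+1 is relevant.
    num_relevant is the total number of relevant items for the query;
    if omitted, it is assumed that all relevant items appear in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    total = num_relevant if num_relevant is not None else hits
    return precision_sum / total if total else 0.0


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


# Toy example (hypothetical relevance judgments, not data from the paper).
toy_queries = [
    [True, False, True],   # AP = (1/1 + 2/3) / 2
    [False, True],         # AP = (1/2) / 1
]
toy_map = mean_average_precision(toy_queries)
```

Under this definition, an absolute gain of 0.04 to 0.06 means the per-query average precision rose by that amount on average across the evaluated queries.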

Title: Multiple visual concept discovery using concept-based visual word clustering
Authors: Jun-Bin Yeh, Chung-Hsien Wu, Shi-Xin Mai
Publication date: 01.07.2013
Publisher: Springer-Verlag
Published in: Multimedia Systems / Issue 4/2013
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-012-0294-9
