Published in: Soft Computing 5/2011

01-05-2011 | Focus

Image annotation by incorporating word correlations into multi-class SVM

Authors: Lei Zhang, Jun Ma

Abstract

Image annotation systems aim to annotate images automatically with semantic keywords, and machine learning approaches are often used to build them. In this paper, we propose an image annotation approach that incorporates word correlations into a multi-class support vector machine (SVM). First, each image is divided into five fixed-size blocks, which avoids time-consuming object segmentation. Every keyword of a training image is manually assigned to its corresponding block, and word correlations are computed from a co-occurrence matrix. Then, MPEG-7 visual descriptors are applied to these blocks to represent visual features, and the minimal-redundancy-maximal-relevance (mRMR) method is used to reduce the feature dimension. A block-feature-based multi-class SVM classifier is trained for 80 semantic concepts. Finally, the probabilistic outputs of the SVM and the word correlations are integrated to obtain the final annotation keywords. Experiments on the Corel 5000 dataset demonstrate that our approach is effective and efficient.
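The last two steps of the abstract (word correlations from a co-occurrence matrix, then integration with the SVM's probabilistic outputs) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual formulas, so everything specific here is an assumption made for illustration: the conditional-frequency normalisation, the max-over-blocks visual score, the mixing weight `alpha`, and the function names `cooccurrence_correlations` and `rerank`. The per-block probability dictionaries stand in for the output of a trained multi-class SVM.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_correlations(annotations):
    """Estimate word correlations from the keyword sets of training images.

    corr[a][b] = (#images containing both a and b) / (#images containing a).
    This normalisation is an assumption; the abstract only states that a
    co-occurrence matrix is used.
    """
    count = defaultdict(int)
    pair = defaultdict(lambda: defaultdict(int))
    for words in annotations:
        unique = set(words)
        for w in unique:
            count[w] += 1
        for a, b in combinations(sorted(unique), 2):
            pair[a][b] += 1
            pair[b][a] += 1
    return {a: {b: pair[a][b] / count[a] for b in pair[a]} for a in count}


def rerank(block_probs, corr, alpha=0.5, top_k=3):
    """Combine per-block class probabilities with word correlations.

    block_probs: one dict per block mapping keyword -> probability
                 (a stand-in for multi-class SVM probability outputs).
    alpha:       hypothetical weight of the correlation (context) term.
    """
    # Visual score: the best probability a keyword reaches in any block.
    visual = defaultdict(float)
    for probs in block_probs:
        for w, p in probs.items():
            visual[w] = max(visual[w], p)

    # Context score: correlation-weighted support from the other keywords.
    scores = {}
    for w in visual:
        context = sum(visual[v] * corr.get(v, {}).get(w, 0.0)
                      for v in visual if v != w)
        scores[w] = (1 - alpha) * visual[w] + alpha * context

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Toy data (invented for illustration, not taken from Corel 5000).
train = [["sky", "plane"], ["sky", "plane"], ["sky", "water"], ["tiger", "grass"]]
corr = cooccurrence_correlations(train)
blocks = [{"sky": 0.6, "tiger": 0.4}, {"plane": 0.35, "tiger": 0.4, "sky": 0.25}]
print(rerank(blocks, corr, alpha=0.5, top_k=2))
```

In this toy case, setting `alpha=0.0` ranks keywords by visual evidence alone, while a positive `alpha` lets a keyword that frequently co-occurs with a confidently predicted one move up the ranking. That is the general intuition behind using word correlations, though the paper's exact integration rule may differ.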

Footnotes
1. More instances of blocks are shown in Fig. 9. To avoid ambiguity, in what follows we use “block” to denote a tile or sub-image and “image” to denote the full-size picture.
2. Sequence numbers 1–4 in Fig. 3.
3. Sequence number 5 in Fig. 3.

Metadata
Title
Image annotation by incorporating word correlations into multi-class SVM
Authors
Lei Zhang
Jun Ma
Publication date
01-05-2011
Publisher
Springer-Verlag
Published in
Soft Computing / Issue 5/2011
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-010-0558-2
