Published in: Neural Computing and Applications 6/2015

01.08.2015 | Original Article

A novel image annotation model based on content representation with multi-layer segmentation

Authors: Jing Zhang, Yaxin Zhao, Da Li, Zhihua Chen, Yubo Yuan

Abstract

Automatic image annotation is an important problem in semantic-based image retrieval, and it remains challenging because of the semantic gap. In this paper, a novel model with three parts is proposed. The first part is multi-layer image segmentation: in the first layer, saliency analysis and normalized cut are combined to segment images into semantic regions, and in the second layer, the semantic regions are further segmented into grids. The second part is image content representation by a region-based bag-of-words (RBoW) model, which is a variant of the BoW model. Considering the correlations between labels, we adopt second-order conditional random fields (CRFs) as the third part of our model to ensure the accuracy of automatic image annotation. Experimental results show that our multi-layer segmentation-based image annotation model achieves promising performance for multi-labeling and outperforms both the model based on single-layer segmentation and previous algorithms on the Corel 5K and Pascal VOC 2007 datasets.
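
As a rough illustration of the region-based bag-of-words idea, the sketch below builds one visual-word histogram per semantic region from the grid cells that fall inside it. It is not the authors' implementation: the function name `region_bow_histograms`, its parameters, and the nearest-word assignment by Euclidean distance are assumptions made for this example, and the abstract does not specify the features or the codebook used.

```python
import numpy as np


def region_bow_histograms(descriptors, region_ids, codebook):
    """Build one normalized visual-word histogram per region (hypothetical helper).

    descriptors: (n_cells, d) array, one feature vector per second-layer grid cell.
    region_ids:  (n_cells,) int array, the first-layer region each cell belongs to.
    codebook:    (k, d) array of visual words (assumed given, e.g. cluster centers).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    region_ids = np.asarray(region_ids)
    codebook = np.asarray(codebook, dtype=float)

    # Assumption: each grid cell is assigned to its nearest visual word
    # under Euclidean distance.
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)

    histograms = {}
    for region in np.unique(region_ids):
        # Count how often each visual word occurs among this region's cells.
        counts = np.bincount(words[region_ids == region], minlength=len(codebook))
        histograms[int(region)] = counts / counts.sum()
    return histograms


# Toy input: four grid cells, two regions, a two-word codebook.
cells = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [0.9, 1.0]]
regions = [0, 0, 1, 1]
words = [[0.0, 0.0], [1.0, 1.0]]
hists = region_bow_histograms(cells, regions, words)
```

Keeping a separate histogram per region, rather than one histogram per image, is what lets a later labeling stage attach different labels to different regions of the same image.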

Metadata
Title
A novel image annotation model based on content representation with multi-layer segmentation
Authors
Jing Zhang
Yaxin Zhao
Da Li
Zhihua Chen
Yubo Yuan
Publication date
01.08.2015
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 6/2015
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-014-1815-6
