
2018 | OriginalPaper | Chapter

MKL Based Local Label Diffusion for Automatic Image Annotation

Authors: Abhijeet Kumar, Anjali Anil Shenoy, Avinash Sharma

Published in: Computer Vision, Pattern Recognition, Image Processing, and Graphics

Publisher: Springer Singapore


Abstract

Automatic image annotation attempts to predict a set of semantic labels for an image. The majority of existing methods discover a common latent space that combines content and semantic image similarity using a metric-learning style of global learning framework. This limits their applicability to large datasets. On the other hand, a few methods focus entirely on learning a local latent space for every test image; however, they completely ignore the global structure of the data. In this work, we propose a novel image annotation method that attempts to combine the best of both local and global learning methods. We introduce the notion of neighborhood-types, based on the hypothesis that images that are similar in content/feature space should also have overlapping neighborhoods. We also use graph diffusion as a mechanism for label transfer. Experiments on publicly available datasets show promising performance.
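The abstract names graph diffusion as the label-transfer mechanism but does not give the formulation. As a rough illustration only, the sketch below shows one generic form of graph-based label diffusion: labels are spread over a k-nearest-neighbour graph by iterating a normalized-affinity update. It is not the authors' MKL-based method. The function names, the Gaussian edge weights, and the parameters `k`, `sigma`, `alpha`, and `steps` are all assumptions made for this example.

```python
import numpy as np

def knn_affinity(features, k=2, sigma=1.0):
    """Symmetric Gaussian-weighted k-nearest-neighbour affinity matrix (illustrative)."""
    n = features.shape[0]
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    # Exclude self-matches when searching for neighbours.
    masked = sq_dists.copy()
    np.fill_diagonal(masked, np.inf)
    affinity = np.zeros_like(weights)
    for i in range(n):
        nearest = np.argsort(masked[i])[:k]
        affinity[i, nearest] = weights[i, nearest]
    # Keep an edge if either endpoint selected the other.
    return np.maximum(affinity, affinity.T)

def diffuse_labels(affinity, labels, alpha=0.8, steps=50):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y, with S = D^-1/2 W D^-1/2."""
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    s = affinity * inv_sqrt[:, None] * inv_sqrt[None, :]
    labels = labels.astype(float)
    scores = labels.copy()
    for _ in range(steps):
        scores = alpha * (s @ scores) + (1.0 - alpha) * labels
    return scores

# Toy data (hypothetical): two clusters of labelled images plus one unlabelled test image.
features = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],   # cluster A, carries label 0
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],   # cluster B, carries label 1
    [0.05, 0.05],                         # unlabelled test image near cluster A
])
labels = np.array([
    [1, 0], [1, 0], [1, 0],
    [0, 1], [0, 1], [0, 1],
    [0, 0],
])

affinity = knn_affinity(features, k=2)
scores = diffuse_labels(affinity, labels)
predicted = int(np.argmax(scores[-1]))  # label index with the highest diffused score
```

In this toy setup the test image is linked only to cluster A, so the diffused score for label 0 dominates. A local method in the spirit of the abstract would build such a graph per test image from its neighbourhood rather than over the full dataset, but that detail is not specified on this page.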


Metadata

Title: MKL Based Local Label Diffusion for Automatic Image Annotation
Authors: Abhijeet Kumar, Anjali Anil Shenoy, Avinash Sharma
Copyright Year: 2018
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-13-0020-2_34
