International Journal of Machine Learning and Cybernetics

06-02-2016 | Original Article

A context-aware semantic modeling framework for efficient image retrieval

Authors: K. S. Arun, V. K. Govindan

Published in: International Journal of Machine Learning and Cybernetics | Issue 4/2017

Abstract

In recent years, high-level image representation has gained popularity in image classification and retrieval tasks. This paper proposes an efficient scheme, the semantic context model, for deriving high-level image descriptors that are well suited to retrieval. The semantic context model uses a formulation based on an undirected graphical model, which jointly exploits low-level visual features and contextual information to classify local image blocks into predefined concept classes. The contextual information consists of concept co-occurrences and their spatial correlation statistics. More expressive potential functions are introduced to capture the structural dependencies among the semantic concepts. The proposed framework proceeds in three steps. First, optimal values of the model parameters, which impose spatial consistency of concept labels among local image blocks, are learned from the training data. Then, the semantics associated with the constituent blocks of an unseen image are inferred using an improved message-passing algorithm. Finally, a compact but discriminative image signature is derived by integrating the frequency of occurrence of the regional semantics. Experimental results on several benchmark datasets show that the semantic context model can effectively resolve local ambiguities and consequently improve concept recognition performance in complex images. Moreover, the retrieval efficiency of the new semantics-based image feature is found to be much better than that of state-of-the-art approaches.
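The final step described in the abstract, building an image signature from the frequency of regional semantics, can be illustrated with a minimal sketch. It assumes that every image block has already been assigned a concept label by the inference step. The concept vocabulary, the function names, and the use of histogram intersection to compare signatures are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Hypothetical concept vocabulary; the abstract does not list the actual classes.
CONCEPTS = ["sky", "water", "grass", "rock"]


def image_signature(block_labels: Sequence[Sequence[str]],
                    concepts: Sequence[str] = CONCEPTS) -> List[float]:
    """Return the normalized frequency of each concept over all image blocks."""
    counts = Counter(label for row in block_labels for label in row)
    total = sum(counts[c] for c in concepts)
    if total == 0:
        # No block carries a known concept: return an all-zero signature.
        return [0.0] * len(concepts)
    return [counts[c] / total for c in concepts]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity of two signatures (assumed measure, not from the abstract)."""
    return sum(min(x, y) for x, y in zip(a, b))


def rank(query: Sequence[float], database: Dict[str, List[float]]) -> List[str]:
    """Order database image names by decreasing similarity to the query."""
    return sorted(database,
                  key=lambda name: histogram_intersection(query, database[name]),
                  reverse=True)


if __name__ == "__main__":
    # A 2x2 grid of block labels, as the inference step might produce them.
    labels = [["sky", "sky"], ["water", "grass"]]
    signature = image_signature(labels)
    database = {"beach": [0.5, 0.25, 0.25, 0.0], "cliff": [0.0, 0.0, 0.0, 1.0]}
    print(signature)
    print(rank(signature, database))
```

The signature has one entry per concept, so its length is independent of the image size and of the number of blocks, which is what makes such a descriptor compact.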

Title: A context-aware semantic modeling framework for efficient image retrieval
Authors: K. S. Arun, V. K. Govindan
Publication date: 06-02-2016
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics / Issue 4/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-016-0498-y
