Skip to main content
Top
Published in: International Journal of Multimedia Information Retrieval 3/2015

01-09-2015 | Regular Paper

Optimizing visual dictionaries for effective image retrieval

Authors: K. S. Arun, V. K. Govindan

Published in: International Journal of Multimedia Information Retrieval | Issue 3/2015

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Characterizing images by high-level concepts from a learned visual dictionary is extensively used in image classification and retrieval. This paper deals with inferring discriminative visual dictionaries for effective image retrieval and examines a non-negative visual dictionary learning scheme towards this direction. More specifically, a non-negative matrix factorization framework with \(\ell _0\)-sparseness constraint on the coefficient matrix for optimizing the dictionary is proposed. It is a two-step iterative process composed of sparse encoding and dictionary enhancement stages. An initial estimate of the visual dictionary is updated in each iteration with the proposed \(\ell _0\)-constraint gradient projection algorithm. A desirable attribute of this formulation is an adaptive sequential dictionary initialization procedure. This leads to a sharp drop down of the approximation error and a faster convergence. Finally, the proposed dictionary optimization scheme is used to derive a compact image representation for the retrieval task. A new image signature is obtained by projecting local descriptors on to the basis elements of the optimized visual dictionary and then aggregating the resulting sparse encodings in to a single feature vector. Experimental results on various benchmark datasets show that the proposed system can infer enhanced visual dictionaries and the derived image feature vector can achieve better retrieval results as compared to state-of-the-art techniques.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Rebollo-Neira L (2004) Dictionary redundancy elimination. IEE Proc Vis Image Signal Process 151(1):31–34CrossRef Rebollo-Neira L (2004) Dictionary redundancy elimination. IEE Proc Vis Image Signal Process 151(1):31–34CrossRef
2.
go back to reference Lewicki M, Sejnowski T (2000) Learning overcomplete representations. Neural Comput 12(2):337–365CrossRef Lewicki M, Sejnowski T (2000) Learning overcomplete representations. Neural Comput 12(2):337–365CrossRef
3.
go back to reference Lee DD, Seung HS (1999) Learning the parts of objects by nonnegative matrix factorization. Nature 401:788–791CrossRef Lee DD, Seung HS (1999) Learning the parts of objects by nonnegative matrix factorization. Nature 401:788–791CrossRef
4.
go back to reference Berry M, Browne M, Langville A, Pauca P, Plemmons R (2007) Algorithms and applications for approximate nonnegative matrix factorization. Comput Stat Data Anal 52:55–173MathSciNetCrossRef Berry M, Browne M, Langville A, Pauca P, Plemmons R (2007) Algorithms and applications for approximate nonnegative matrix factorization. Comput Stat Data Anal 52:55–173MathSciNetCrossRef
5.
6.
go back to reference Xinhui H, Ryosuke I, Hisashi K Satoshi N (2010) Clustered-based language model for spoken document retrieval using NMF-based document clustering. In: Interspeech proceeding, pp 705–708 Xinhui H, Ryosuke I, Hisashi K Satoshi N (2010) Clustered-based language model for spoken document retrieval using NMF-based document clustering. In: Interspeech proceeding, pp 705–708
7.
go back to reference Dhillon IS, Modha DM (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42:143–175CrossRefMATH Dhillon IS, Modha DM (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42:143–175CrossRefMATH
8.
go back to reference Cadzow JA (2002) Minimum \(\ell _1\), \(\ell _2\) and \(\ell _{\infty }\) norm approximate solutions to an overdetermined system of linear equations. Digit Signal Process 12(4):524–560CrossRef Cadzow JA (2002) Minimum \(\ell _1\), \(\ell _2\) and \(\ell _{\infty }\) norm approximate solutions to an overdetermined system of linear equations. Digit Signal Process 12(4):524–560CrossRef
9.
go back to reference Aharon M, Elad M, Bruckstein A (2005) K-SVD and its non-negative variant for dictionary design. In: Proceedings of the SPIE conference on curvelet, directional, and sparse representations, vol 5914, pp 11.1–11.13 Aharon M, Elad M, Bruckstein A (2005) K-SVD and its non-negative variant for dictionary design. In: Proceedings of the SPIE conference on curvelet, directional, and sparse representations, vol 5914, pp 11.1–11.13
10.
go back to reference Peharz R, Pernkopf F (2012) Sparse nonnegative matrix factorization with \(\ell ^0\)-constraints. Neurocomput Spec Issue Mach Learn Signal Process 80(1):38–46 Peharz R, Pernkopf F (2012) Sparse nonnegative matrix factorization with \(\ell ^0\)-constraints. Neurocomput Spec Issue Mach Learn Signal Process 80(1):38–46
11.
go back to reference Bevilacqua M, Roumy A, Guillemot C, Morel MLA (2013) K-WEB: nonnegative dictionary learning for sparse image representations. In: Proceedings of the IEEE international conference on image processing Bevilacqua M, Roumy A, Guillemot C, Morel MLA (2013) K-WEB: nonnegative dictionary learning for sparse image representations. In: Proceedings of the IEEE international conference on image processing
12.
go back to reference Shneier M, Abdel-Mottaleb M (1996) Exploiting the JPEG compression scheme for image retrieval. IEEE Trans Pattern Anal Mach Intell 18(8):849–853CrossRef Shneier M, Abdel-Mottaleb M (1996) Exploiting the JPEG compression scheme for image retrieval. IEEE Trans Pattern Anal Mach Intell 18(8):849–853CrossRef
13.
go back to reference Jacobs CE, Finkelstein A, Salesin DH (1995) Fast multi resolution image querying. In: Proceedings of the 22nd ACM annual conference on computer graphics and interactive techniques, pp 277–286 Jacobs CE, Finkelstein A, Salesin DH (1995) Fast multi resolution image querying. In: Proceedings of the 22nd ACM annual conference on computer graphics and interactive techniques, pp 277–286
14.
go back to reference Zhou W, Sei-ichiro K (2013) Face recognition with learned local curvelet patterns and 2-directional l1-norm based 2DPCA. In: Proceedings of the 10th Asian conference on computer vision Zhou W, Sei-ichiro K (2013) Face recognition with learned local curvelet patterns and 2-directional l1-norm based 2DPCA. In: Proceedings of the 10th Asian conference on computer vision
15.
go back to reference Mallat S, Pennec EL (2005) Bandelet image approximation and compression. SIAM Multiscale Model Simul 4(3):992–1039CrossRefMATH Mallat S, Pennec EL (2005) Bandelet image approximation and compression. SIAM Multiscale Model Simul 4(3):992–1039CrossRefMATH
16.
go back to reference Mairal J, Bach F, Ponce J, Sapiro G (2010) Online learning for matrix factorization and sparse coding. J Mach Learn Res 11:19–60MathSciNetMATH Mairal J, Bach F, Ponce J, Sapiro G (2010) Online learning for matrix factorization and sparse coding. J Mach Learn Res 11:19–60MathSciNetMATH
17.
go back to reference Lu G, Teng S (1999) A novel image retrieval technique based on vector quantization. In: Proceedings of the international conference on computational intelligence for modelling, control and automation, pp 36–41 Lu G, Teng S (1999) A novel image retrieval technique based on vector quantization. In: Proceedings of the international conference on computational intelligence for modelling, control and automation, pp 36–41
18.
go back to reference Belhumeur PN, Hespanha JP, Kriegman D (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720CrossRef Belhumeur PN, Hespanha JP, Kriegman D (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720CrossRef
19.
go back to reference Bartlett MS, Movellan JR, Sejnowski TJ (2002) Face recognition by independent component analysis. IEEE Trans Neural Netw 13(6):1450–1464CrossRef Bartlett MS, Movellan JR, Sejnowski TJ (2002) Face recognition by independent component analysis. IEEE Trans Neural Netw 13(6):1450–1464CrossRef
20.
go back to reference Wang N, Jingdong W, Yeung DY (2013) Online robust non-negative dictionary learning for visual tracking. In: Proceedings of IEEE international conference on computer vision, pp 657–664 Wang N, Jingdong W, Yeung DY (2013) Online robust non-negative dictionary learning for visual tracking. In: Proceedings of IEEE international conference on computer vision, pp 657–664
21.
go back to reference Ross DA, Zemel RS (2006) Learning parts-based representations of data. J Mach Learn Res 7:2369–2397MathSciNetMATH Ross DA, Zemel RS (2006) Learning parts-based representations of data. J Mach Learn Res 7:2369–2397MathSciNetMATH
22.
go back to reference Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791CrossRef Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791CrossRef
23.
go back to reference Lee H, Battle A, Raina R, Ng AY (2006) Efficient sparse coding algorithms. In: Advances in neural information processing systems, pp 801–808 Lee H, Battle A, Raina R, Ng AY (2006) Efficient sparse coding algorithms. In: Advances in neural information processing systems, pp 801–808
24.
go back to reference Olshausen BA, Field DJ (1997) Sparse coding with an over complete basis set: a strategy employed by V1? Vis Res 37(23):3311–3325CrossRef Olshausen BA, Field DJ (1997) Sparse coding with an over complete basis set: a strategy employed by V1? Vis Res 37(23):3311–3325CrossRef
25.
go back to reference Hoyer PO (2004) Non-negative matrix factorization with sparseness constraints. J Mach Learn Res 5:1457–1469MathSciNetMATH Hoyer PO (2004) Non-negative matrix factorization with sparseness constraints. J Mach Learn Res 5:1457–1469MathSciNetMATH
26.
go back to reference Lee DD, Seung HS (2000) Algorithms for non-negative matrix factorization. In: Proceedings of advances in neural information processing systems, pp 556–562 Lee DD, Seung HS (2000) Algorithms for non-negative matrix factorization. In: Proceedings of advances in neural information processing systems, pp 556–562
27.
go back to reference Kim H, Park H (2008) Non negative matrix factorization based on alternating non negativity constrained least squares and active set method. SIAM J Matrix Anal Appl 30(2):713–730MathSciNetCrossRefMATH Kim H, Park H (2008) Non negative matrix factorization based on alternating non negativity constrained least squares and active set method. SIAM J Matrix Anal Appl 30(2):713–730MathSciNetCrossRefMATH
29.
go back to reference Mallat S, Zhang Z (1993) Matching pursuits with time–frequency dictionaries. IEEE Trans Signal Process 41:3397–3415CrossRefMATH Mallat S, Zhang Z (1993) Matching pursuits with time–frequency dictionaries. IEEE Trans Signal Process 41:3397–3415CrossRefMATH
30.
go back to reference Pati YC, Rezaiifar R, Krishnaprasad PS (1993) Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of the twenty-seventh IEEE conference on signals, systems and computers, pp 40–44 Pati YC, Rezaiifar R, Krishnaprasad PS (1993) Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of the twenty-seventh IEEE conference on signals, systems and computers, pp 40–44
31.
32.
go back to reference Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288MathSciNetMATH Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288MathSciNetMATH
33.
go back to reference Gorodnitsky IF, Rao BD (1997) Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Trans Signal Process 45(3):600–616CrossRef Gorodnitsky IF, Rao BD (1997) Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Trans Signal Process 45(3):600–616CrossRef
34.
go back to reference Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing over complete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311–4322CrossRef Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing over complete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311–4322CrossRef
35.
go back to reference Patrik OH (2004) Non-negative matrix factorization with sparseness constraints. J Mach Learn Res 5:1457–1469MATH Patrik OH (2004) Non-negative matrix factorization with sparseness constraints. J Mach Learn Res 5:1457–1469MATH
36.
go back to reference Nakayama H, Harada T, Kuniyoshi Y (2010) Dense sampling low-level statistics of local features. IEICE Trans Inf Syst 93(7):1727–1736CrossRef Nakayama H, Harada T, Kuniyoshi Y (2010) Dense sampling low-level statistics of local features. IEICE Trans Inf Syst 93(7):1727–1736CrossRef
37.
go back to reference Nowak E, Jurie F, Triggs B (2006) Sampling strategies for bag-of-features image classification. In: Proceedings of the European conference on computer vision, pp 490–503 Nowak E, Jurie F, Triggs B (2006) Sampling strategies for bag-of-features image classification. In: Proceedings of the European conference on computer vision, pp 490–503
38.
go back to reference Langville AN, Meyer CD, Albright R, Cox J, Duling D (2006) Initializations for the non negative matrix factorization. In: Proceedings of the twelfth ACM SIGKDD international conference on knowledge discovery and data mining, pp 23–26 Langville AN, Meyer CD, Albright R, Cox J, Duling D (2006) Initializations for the non negative matrix factorization. In: Proceedings of the twelfth ACM SIGKDD international conference on knowledge discovery and data mining, pp 23–26
39.
go back to reference Rezaei M, Boostani R, Rezaei M (2011) An efficient initialization method for non negative matrix factorization. J Appl Sci 11(2):354–359CrossRef Rezaei M, Boostani R, Rezaei M (2011) An efficient initialization method for non negative matrix factorization. J Appl Sci 11(2):354–359CrossRef
40.
go back to reference Jafari MG, Plumbley MD (2011) Fast dictionary learning for sparse representations of speech signals. J Sel Top Signal Process 5(5):1025–1031CrossRef Jafari MG, Plumbley MD (2011) Fast dictionary learning for sparse representations of speech signals. J Sel Top Signal Process 5(5):1025–1031CrossRef
42.
go back to reference Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, London Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, London
43.
go back to reference Vartak MN (1955) On an application of Kronecker product of matrices to statistical designs. Ann Math Stat 26(3):420–438 Vartak MN (1955) On an application of Kronecker product of matrices to statistical designs. Ann Math Stat 26(3):420–438
44.
45.
go back to reference Philbin J, Chum O, Isard M, Sivic J, Zisserman A (2007) Object retrieval with large vocabularies and fast spatial matching. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1–8 Philbin J, Chum O, Isard M, Sivic J, Zisserman A (2007) Object retrieval with large vocabularies and fast spatial matching. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1–8
46.
go back to reference Zhao Y, Hong R, Jiang J, Wen J, Zhang H (2013) Image matching by fast random sample consensus. In: Proceedings of the fifth international conference on internet multimedia computing and service, pp 159–162 Zhao Y, Hong R, Jiang J, Wen J, Zhang H (2013) Image matching by fast random sample consensus. In: Proceedings of the fifth international conference on internet multimedia computing and service, pp 159–162
47.
go back to reference Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the international conference on computer vision and pattern recognition, vol 2, pp 2169–2178 Lazebnik S, Schmid C, Ponce J (2006) Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the international conference on computer vision and pattern recognition, vol 2, pp 2169–2178
48.
go back to reference Zhang Y, Jia Z, Chen T (2011) Image retrieval with geometry-preserving visual phrases. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 809–816 Zhang Y, Jia Z, Chen T (2011) Image retrieval with geometry-preserving visual phrases. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 809–816
49.
go back to reference Torralba A, Fergus R, Weiss Y (2008) Small codes and large image databases for recognition. In: Proceedings on computer vision and pattern recognition, pp 1–8 Torralba A, Fergus R, Weiss Y (2008) Small codes and large image databases for recognition. In: Proceedings on computer vision and pattern recognition, pp 1–8
50.
go back to reference Jgou H, Douze M, Schmid C, Prez P (2010) Aggregating local descriptors into a compact image representation. In: Proceeding of IEEE conference on computer vision and pattern recognition (CVPR), pp 3304–3311 Jgou H, Douze M, Schmid C, Prez P (2010) Aggregating local descriptors into a compact image representation. In: Proceeding of IEEE conference on computer vision and pattern recognition (CVPR), pp 3304–3311
51.
go back to reference Perronnin F, Liu Y, Snchez J, Poirier H (2010) Large-scale image retrieval with compressed fisher vectors. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 3384–3391 Perronnin F, Liu Y, Snchez J, Poirier H (2010) Large-scale image retrieval with compressed fisher vectors. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 3384–3391
52.
go back to reference Chatfield K, Lempitsky V, Vedaldi A, Zisserman A (2011) The devil is in the details: an evaluation of recent feature encoding methods. In: Proceedings of the 22nd british machine vision conference (BMVC), pp 76.1–76.12 Chatfield K, Lempitsky V, Vedaldi A, Zisserman A (2011) The devil is in the details: an evaluation of recent feature encoding methods. In: Proceedings of the 22nd british machine vision conference (BMVC), pp 76.1–76.12
53.
go back to reference Tamura H, Mori S, Yamawaki T (1978) Textural features corresponding to visual perception. IEEE Trans Syst Man Cybern 8:460–472CrossRef Tamura H, Mori S, Yamawaki T (1978) Textural features corresponding to visual perception. IEEE Trans Syst Man Cybern 8:460–472CrossRef
54.
go back to reference Sivic J, Zisserman A (2003) Video Google: a text retrieval approach to object matching in videos. In: Proceedings of ninth IEEE international conference on computer vision, pp 1470–1477 Sivic J, Zisserman A (2003) Video Google: a text retrieval approach to object matching in videos. In: Proceedings of ninth IEEE international conference on computer vision, pp 1470–1477
55.
go back to reference Herve J, Matthijs D, Cordelia S (2008) Hamming embedding and weak geometric consistency for large scale image search. In: European conference on computer vision 2008 (ECCV 2008). Springer, Berlin, pp 304–317 Herve J, Matthijs D, Cordelia S (2008) Hamming embedding and weak geometric consistency for large scale image search. In: European conference on computer vision 2008 (ECCV 2008). Springer, Berlin, pp 304–317
57.
go back to reference Lindeberg T (1998) Feature detection with automatic scale selection. Int J Comput Vis 30(2):79–116CrossRef Lindeberg T (1998) Feature detection with automatic scale selection. Int J Comput Vis 30(2):79–116CrossRef
58.
go back to reference Mikolajczyk K, Schmid C (2004) Scale & affine invariant interest point detectors. Int J Comput Vis 60(1):63–86CrossRef Mikolajczyk K, Schmid C (2004) Scale & affine invariant interest point detectors. Int J Comput Vis 60(1):63–86CrossRef
59.
go back to reference Lowe DG (2004) Distinctive image features from scale-invariant key points. Int J Comput Vis 60(2):91–110CrossRef Lowe DG (2004) Distinctive image features from scale-invariant key points. Int J Comput Vis 60(2):91–110CrossRef
60.
go back to reference Tola E, Lepetit V, Fua P (2010) Daisy: an efficient dense descriptor applied to wide-baseline stereo. IEEE Trans Pattern Anal Mach Intell 32(5):815–830CrossRef Tola E, Lepetit V, Fua P (2010) Daisy: an efficient dense descriptor applied to wide-baseline stereo. IEEE Trans Pattern Anal Mach Intell 32(5):815–830CrossRef
61.
go back to reference Bouachir W, Kardouchi M, Belacel N (2009) Improving bag of visual words image retrieval: a fuzzy weighting scheme for efficient indexation. In: Proceedings of fifth IEEE international conference on signal-image technology & internet-based systems (SITIS), pp 215–220 Bouachir W, Kardouchi M, Belacel N (2009) Improving bag of visual words image retrieval: a fuzzy weighting scheme for efficient indexation. In: Proceedings of fifth IEEE international conference on signal-image technology & internet-based systems (SITIS), pp 215–220
62.
go back to reference Chum O, Philbin J, Zisserman A (2008) Near duplicate image detection: min-Hash and tf-idf weighting. In BMVC, vol 810, pp 812–815 Chum O, Philbin J, Zisserman A (2008) Near duplicate image detection: min-Hash and tf-idf weighting. In BMVC, vol 810, pp 812–815
63.
go back to reference Ke Y, Sukthankar R (2004) PCA-SIFT: a more distinctive representation for local image descriptors. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition (CVPR), vol 2, pp II-506 Ke Y, Sukthankar R (2004) PCA-SIFT: a more distinctive representation for local image descriptors. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition (CVPR), vol 2, pp II-506
Metadata
Title
Optimizing visual dictionaries for effective image retrieval
Authors
K. S. Arun
V. K. Govindan
Publication date
01-09-2015
Publisher
Springer London
Published in
International Journal of Multimedia Information Retrieval / Issue 3/2015
Print ISSN: 2192-6611
Electronic ISSN: 2192-662X
DOI
https://doi.org/10.1007/s13735-015-0076-1

Premium Partner