Published in: Soft Computing 18/2018

27-11-2017 | Foundations

Learning visual codebooks for image classification using spectral clustering

Authors: Yi Hong, Weiping Zhu

Abstract

This study explores learning visual codebooks with spectral clustering, an approach we call spectral visual codebook learning (SVCL). Although spectral clustering has been widely applied to unsupervised segmentation, clustering, and manifold learning, its use for learning codebooks on standard image benchmark datasets has not been thoroughly studied. We show how codebooks learned by SVCL can be used for scene classification, texture recognition, and image categorization. We describe several implementations for constructing the similarity graph and for handling the large number of local image patches. We show that our approach captures the nonlinear manifolds of semantic image patches. A further advantage is that both label and spatial information can be incorporated without increasing model complexity. We validate SVCL on datasets including KTH-TIPS, Scene-15, Graz-02, and Caltech-101.
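
To make the general idea concrete, here is a minimal sketch of learning a codebook by spectral clustering and encoding an image as a codeword histogram. It is not the SVCL implementation from the paper: it assumes a dense Gaussian similarity graph, a standard normalized spectral clustering recipe, and cluster means as codewords. The function names `spectral_codebook` and `encode` and the parameter `sigma` are hypothetical. The dense graph costs O(n^2) memory, so this sketch does not address the large-scale patch problem mentioned in the abstract.

```python
import numpy as np


def spectral_codebook(patches, k, sigma=1.0, n_iter=20):
    """Return codewords (cluster means) found by normalized spectral clustering.

    Illustrative sketch only: dense Gaussian graph, not the paper's method.
    """
    # Gaussian similarity between every pair of patch descriptors.
    sq = ((patches[:, None, :] - patches[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetrically normalized affinity D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    M = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Spectral embedding: top-k eigenvectors, rows scaled to unit length.
    _, vecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
    U = vecs[:, -k:]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialization for k-means.
    idx = [0]
    for _ in range(1, k):
        gap = ((U[:, None, :] - U[idx][None, :, :]) ** 2).sum(axis=-1).min(axis=1)
        idx.append(int(gap.argmax()))
    centers = U[idx].copy()

    # Lloyd iterations in the embedded space.
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)

    # Codewords live in descriptor space; empty clusters, if any, are dropped.
    return np.stack([patches[labels == j].mean(axis=0) for j in np.unique(labels)])


def encode(patches, codewords):
    """Normalized histogram of nearest-codeword assignments for one image."""
    dist = ((patches[:, None, :] - codewords[None, :, :]) ** 2).sum(axis=-1)
    hist = np.bincount(dist.argmin(axis=1), minlength=len(codewords)).astype(float)
    return hist / hist.sum()
```

In a bag-of-features pipeline, `patches` would hold local descriptors pooled from training images, and the histogram returned by `encode` would serve as the image representation passed to a classifier. The choice of `sigma` controls how quickly similarity decays with distance and would need tuning per descriptor type.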

Metadata
Title
Learning visual codebooks for image classification using spectral clustering
Authors
Yi Hong
Weiping Zhu
Publication date
27-11-2017
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 18/2018
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-017-2937-4