Published in: International Journal of Machine Learning and Cybernetics, Issue 1-4/2010

1 December 2010 | Original Article

Understanding bag-of-words model: a statistical framework

Authors: Yin Zhang, Rong Jin, Zhi-Hua Zhou

Abstract

The bag-of-words model is one of the most popular representation methods for object categorization. The key idea is to quantize each extracted key point into one of a set of visual words, and then to represent each image by a histogram of the visual words. For this purpose, a clustering algorithm (e.g., K-means) is generally used to generate the visual words. Although a number of studies have shown encouraging results for the bag-of-words representation in object categorization, the theoretical properties of the bag-of-words model remain almost unstudied, possibly because of the difficulty introduced by the heuristic clustering process. In this paper, we present a statistical framework that generalizes the bag-of-words representation. In this framework, the visual words are generated by a statistical process rather than by a clustering algorithm, while the empirical performance is competitive with that of clustering-based methods. A theoretical analysis based on statistical consistency is presented for the proposed framework. Moreover, based on the framework, we develop two algorithms that do not rely on clustering, yet achieve competitive performance in object categorization when compared with clustering-based bag-of-words representations.
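
As an illustration of the conventional clustering-based representation that the abstract takes as its starting point, the following is a minimal sketch in Python with NumPy. It is not the statistical framework or either of the two algorithms proposed in the paper. The function names (`kmeans`, `quantize`, `bow_histogram`), the plain Lloyd iteration, the squared Euclidean distance, and the toy data are all assumptions made for this example; in practice the key-point descriptors would come from a feature extractor.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm (illustrative); returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct points drawn from the data.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = quantize(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def quantize(points, centroids):
    """Index of the nearest centroid (visual word) for each key point."""
    # Pairwise squared Euclidean distances, shape (n_points, n_centroids).
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bow_histogram(descriptors, centroids):
    """Normalized histogram of visual-word counts for one image."""
    words = quantize(descriptors, centroids)
    counts = np.bincount(words, minlength=len(centroids)).astype(float)
    return counts / counts.sum()

# Toy usage: a hand-picked two-word vocabulary and three key points.
vocab = np.array([[0.0, 0.0], [10.0, 10.0]])
image_descriptors = np.array([[0.0, 1.0], [9.0, 9.0], [10.0, 11.0]])
hist = bow_histogram(image_descriptors, vocab)  # one point near word 0, two near word 1
```

In a full pipeline, `kmeans` would be run once on key points pooled from the training images to build the vocabulary, and `bow_histogram` would then be applied to each image separately to obtain the feature vector passed to a classifier.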

Metadata
Title
Understanding bag-of-words model: a statistical framework
Authors
Yin Zhang
Rong Jin
Zhi-Hua Zhou
Publication date
1 December 2010
Publisher
Springer-Verlag
Published in
International Journal of Machine Learning and Cybernetics / Issue 1-4/2010
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-010-0001-0