Published in: International Journal of Computer Vision, Issue 3/2014

1 July 2014

Mining Mid-level Features for Image Classification

Authors: Basura Fernando, Elisa Fromont, Tinne Tuytelaars

Abstract

Mid-level or semi-local features learnt using class-level information are potentially more distinctive than the traditional low-level local features constructed in a purely bottom-up fashion. At the same time they preserve some of the robustness properties with respect to occlusions and image clutter. In this paper we propose a new and effective scheme for extracting mid-level features for image classification, based on relevant pattern mining. In particular, we mine relevant patterns of local compositions of densely sampled low-level features. We refer to the new set of obtained patterns as Frequent Local Histograms or FLHs. During this process, we pay special attention to keeping all the local histogram information and to selecting the most relevant reduced set of FLH patterns for classification. The careful choice of the visual primitives and an extension to exploit both local and global spatial information allow us to build powerful bag-of-FLH-based image representations. We show that these bag-of-FLHs are more discriminative than traditional bag-of-words and yield state-of-the-art results on various image classification benchmarks, including Pascal VOC.
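
The idea of mining patterns over local compositions of densely sampled features can be illustrated with a small sketch. This is not the authors' implementation: it assumes the low-level features have already been quantized into integer visual-word IDs on a regular grid, it encodes each local histogram as a set of (word, count) items, and it replaces a real frequent-itemset miner and the paper's relevance-based selection with brute-force support counting. All function names and parameters (`local_histograms`, `frequent_patterns`, `bag_of_patterns`, `radius`, `min_support`, `max_size`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_histograms(word_grid, radius=1):
    """Build one transaction per grid cell (hypothetical helper).

    Each transaction is a frozenset of (visual_word, count) items that
    describes the histogram of visual words in the cell's neighbourhood.
    """
    rows, cols = len(word_grid), len(word_grid[0])
    transactions = []
    for r in range(rows):
        for c in range(cols):
            counts = Counter()
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        counts[word_grid[rr][cc]] += 1
            transactions.append(frozenset(counts.items()))
    return transactions


def frequent_patterns(transactions, min_support, max_size=2):
    """Brute-force support counting; a stand-in for a real itemset miner."""
    support = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                support[combo] += 1
    return {p: s for p, s in support.items() if s >= min_support}


def bag_of_patterns(transactions, patterns):
    """Count, for each pattern, how many local histograms contain it."""
    order = sorted(patterns)
    return [sum(1 for t in transactions if set(p) <= t) for p in order]


if __name__ == "__main__":
    # Toy 3x3 grid of visual-word IDs (assumed input, not real data).
    grid = [[0, 0, 1],
            [0, 1, 1],
            [2, 1, 1]]
    tx = local_histograms(grid, radius=1)
    pats = frequent_patterns(tx, min_support=3)
    print(bag_of_patterns(tx, pats))
```

Keeping the count inside each item, rather than only recording which words are present, mirrors the abstract's emphasis on retaining the local histogram information. A practical system would additionally rank the mined patterns by their relevance to the class labels before building the image-level representation.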

Metadata
Title
Mining Mid-level Features for Image Classification
Authors
Basura Fernando
Elisa Fromont
Tinne Tuytelaars
Publication date
1 July 2014
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 3/2014
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-014-0700-1
