Published in: International Journal of Computer Vision, Issue 2/2013

01.09.2013

Selective Search for Object Recognition

Authors: J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, A. W. M. Smeulders


Abstract

This paper addresses the problem of generating possible object locations for use in object recognition. We introduce selective search, which combines the strengths of both exhaustive search and segmentation. Like segmentation, we use the image structure to guide our sampling process. Like exhaustive search, we aim to capture all possible object locations. Instead of a single technique to generate possible object locations, we diversify our search and use a variety of complementary image partitionings to deal with as many image conditions as possible. Our selective search results in a small set of data-driven, class-independent, high-quality locations, yielding 99% recall and a Mean Average Best Overlap of 0.879 at 10,097 locations. The reduced number of locations compared to an exhaustive search enables the use of stronger machine learning techniques and stronger appearance models for object recognition. In this paper we show that our selective search enables the use of the powerful Bag-of-Words model for recognition. The selective search software is made publicly available (Software: http://disi.unitn.it/~uijlings/SelectiveSearch.html).
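
The two quality measures quoted in the abstract, recall and Mean Average Best Overlap (MABO), can be illustrated with a minimal sketch. It assumes the usual Pascal overlap (intersection over union) between boxes given as continuous `(x1, y1, x2, y2)` coordinates, and counts an object as recalled when its best overlap exceeds 0.5. The function names `iou` and `evaluate` and the input layout are hypothetical choices made for this illustration; they are not taken from the authors' software.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(ground_truth, proposals, threshold=0.5):
    """Return (recall, mabo) for a set of box proposals.

    ground_truth: list of (image_id, class_name, box) tuples (assumed layout).
    proposals:    dict mapping image_id to a list of proposed boxes.
    """
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one object")

    # Best overlap achieved by any proposal, per ground-truth object.
    best_by_class = defaultdict(list)
    for image_id, cls, gt_box in ground_truth:
        candidates = proposals.get(image_id, [])
        best = max((iou(gt_box, p) for p in candidates), default=0.0)
        best_by_class[cls].append(best)

    # Recall: fraction of objects whose best overlap exceeds the threshold.
    all_best = [b for scores in best_by_class.values() for b in scores]
    recall = sum(b > threshold for b in all_best) / len(all_best)

    # Average Best Overlap per class, then the mean over classes (MABO).
    abo = {c: sum(s) / len(s) for c, s in best_by_class.items()}
    mabo = sum(abo.values()) / len(abo)
    return recall, mabo


if __name__ == "__main__":
    gt = [("img1", "cat", (0, 0, 10, 10)), ("img1", "dog", (20, 20, 30, 30))]
    props = {"img1": [(0, 0, 10, 10), (20, 20, 30, 25)]}
    print(evaluate(gt, props))
```

Averaging per class before taking the mean keeps frequent classes from dominating the score, which is why MABO and recall can rank proposal methods differently.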


Footnotes
1
We found no difference in recognition accuracy between the Random Forest assignment of Uijlings et al. (2010) and the k-means nearest-neighbour assignment of van de Sande et al. (2010) on the Pascal dataset.
 
References
Alexe, B., Deselaers, T., & Ferrari, V. (2010). What is an object? In CVPR.
Alexe, B., Deselaers, T., & Ferrari, V. (2012). Measuring the objectness of image windows. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2189–2202.
Arbeláez, P., Maire, M., Fowlkes, C., & Malik, J. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 898–916.
Carreira, J., & Sminchisescu, C. (2010). Constrained parametric min-cuts for automatic object segmentation. In CVPR.
Chum, O., & Zisserman, A. (2007). An exemplar model for learning object classes. In CVPR.
Comaniciu, D., & Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 603–619.
Csurka, G., Dance, C. R., Fan, L., Willamowski, J., & Bray, C. (2004). Visual categorization with bags of keypoints. In ECCV Statistical Learning in Computer Vision.
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In CVPR.
Endres, I., & Hoiem, D. (2010). Category independent object proposals. In ECCV.
Everingham, M., van Gool, L., Williams, C., Winn, J., & Zisserman, A. (2011). The Pascal visual object classes challenge workshop: Overview and results of the detection challenge.
Everingham, M., van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The Pascal visual object classes (VOC) challenge. International Journal of Computer Vision, 88, 303–338.
Felzenszwalb, P. F., Girshick, R. B., McAllester, D., & Ramanan, D. (2010). Object detection with discriminatively trained part based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1627–1645.
Felzenszwalb, P. F., & Huttenlocher, D. P. (2004). Efficient graph-based image segmentation. International Journal of Computer Vision, 59, 167–181.
Geusebroek, J. M., van den Boomgaard, R., Smeulders, A. W. M., & Geerts, H. (2001). Color invariance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23, 1338–1350.
Gu, C., Lim, J. J., Arbeláez, P., & Malik, J. (2009). Recognition using regions. In CVPR.
Harzallah, H., Jurie, F., & Schmid, C. (2009). Combining efficient object localization and image classification. In ICCV.
Lampert, C. H., Blaschko, M. B., & Hofmann, T. (2009). Efficient subwindow search: A branch and bound framework for object localization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 2129–2142.
Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR.
Li, F., Carreira, J., & Sminchisescu, C. (2010). Object recognition as ranking holistic figure-ground hypotheses. In CVPR.
Liu, C., Sharan, L., Adelson, E. H., & Rosenholtz, R. (2010). Exploring features in a Bayesian framework for material recognition. In CVPR.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110.
Maji, S., Berg, A. C., & Malik, J. (2008). Classification using intersection kernel support vector machines is efficient. In CVPR.
Maji, S., & Malik, J. (2009). Object detection using a max-margin Hough transform. In CVPR.
Ojala, T., Pietikainen, M., & Maenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971–987.
Perronnin, F., Sánchez, J., & Mensink, T. (2010). Improving the Fisher Kernel for large-scale image classification. In ECCV.
Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 888–905.
Sivic, J., & Zisserman, A. (2003). Video Google: A text retrieval approach to object matching in videos. In ICCV.
Sonnenburg, S., Raetsch, G., Henschel, S., Widmer, C., Behr, J., Zien, A., et al. (2010). The shogun machine learning toolbox. Journal of Machine Learning Research, 11, 1799–1802.
Tu, Z., Chen, X., Yuille, A. L., & Zhu, S. (2005). Image parsing: Unifying segmentation, detection and recognition. Marr Prize Issue. International Journal of Computer Vision.
Uijlings, J. R. R., Smeulders, A. W. M., & Scha, R. J. H. (2010). Real-time visual concept classification. IEEE Transactions on Multimedia, 12(7), 665–681.
van de Sande, K. E. A., & Gevers, T. (2012). Illumination-invariant descriptors for discriminative visual object categorization. Technical report, University of Amsterdam.
van de Sande, K. E. A., Gevers, T., & Snoek, C. G. M. (2010). Evaluating color descriptors for object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1582–1596.
van de Sande, K. E. A., Gevers, T., & Snoek, C. G. M. (2011). Empowering visual categorization with the GPU. IEEE Transactions on Multimedia, 13(1), 60–70.
Vedaldi, A., Gulshan, V., Varma, M., & Zisserman, A. (2009). Multiple kernels for object detection. In ICCV.
Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In CVPR, Volume 1, 511–518.
Viola, P., & Jones, M. J. (2004). Robust real-time face detection. International Journal of Computer Vision, 57, 137–154.
Zhou, X., Yu, K., Zhang, T., & Huang, T. S. (2010). Image classification using super-vector coding of local image descriptors. In ECCV.
Zhu, L., Chen, Y., Yuille, A., & Freeman, W. (2010). Latent hierarchical structural learning for object detection. In CVPR.
Metadata
Title
Selective Search for Object Recognition
Authors
J. R. R. Uijlings
K. E. A. van de Sande
T. Gevers
A. W. M. Smeulders
Publication date
01.09.2013
Publisher
Springer US
Published in
International Journal of Computer Vision, Issue 2/2013
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-013-0620-5
