
2015 | OriginalPaper | Book chapter

Non-maximum Suppression for Object Detection by Passing Messages Between Windows

Written by: Rasmus Rothe, Matthieu Guillaumin, Luc Van Gool

Published in: Computer Vision – ACCV 2014

Publisher: Springer International Publishing


Abstract

Non-maximum suppression (NMS) is a key post-processing step in many computer vision applications. In the context of object detection, it is used to transform a smooth response map that triggers many imprecise object window hypotheses into, ideally, a single bounding box for each detected object. The most common approach to NMS for object detection is a greedy, locally optimal strategy with several hand-designed components (e.g., thresholds). Such a strategy inherently suffers from several shortcomings, such as the inability to detect nearby objects. In this paper, we try to alleviate these problems and explore a novel formulation of NMS as a well-defined clustering problem. Our method builds on the recent Affinity Propagation Clustering algorithm, which passes messages between data points to identify cluster exemplars. Contrary to the greedy approach, our method is solved globally and its parameters can be automatically learned from training data. In experiments, we show in two contexts, object class detection and generic object detection, that it provides a promising solution to the shortcomings of greedy NMS.
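To make the baseline concrete, the following is a minimal sketch of the greedy NMS strategy that the abstract criticizes. It is not the authors' code: the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, the overlap threshold values, and the toy boxes are all assumptions chosen for illustration. Windows are visited in decreasing score order, and a window is kept only if its intersection-over-union with every already kept window does not exceed a hand-set threshold.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x2 > x1, y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A box is suppressed as soon as it overlaps an already kept,
    higher-scoring box by more than `iou_threshold` (hand-set).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Toy data (hypothetical): box 1 is a near-duplicate of box 0,
# box 2 is meant to be a second, nearby object.
boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 0.0, 11.0, 10.0), (3.0, 0.0, 13.0, 10.0)]
scores = [0.9, 0.8, 0.7]

print(greedy_nms(boxes, scores, iou_threshold=0.5))  # [0]
print(greedy_nms(boxes, scores, iou_threshold=0.6))  # [0, 2]
```

In this toy case the nearby object (box 2, IoU of about 0.54 with box 0) is lost at a threshold of 0.5 and recovered only by raising the threshold to 0.6, which illustrates the sensitivity to hand-designed thresholds and the difficulty with nearby objects that the abstract mentions.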


Metadata
Title
Non-maximum Suppression for Object Detection by Passing Messages Between Windows
Written by
Rasmus Rothe
Matthieu Guillaumin
Luc Van Gool
Copyright year
2015
Publisher
Springer International Publishing
DOI
https://doi.org/10.1007/978-3-319-16865-4_19
