Published in: International Journal of Computer Vision, Issue 1-3/2017

15 March 2017

Holistically-Nested Edge Detection

Authors: Saining Xie, Zhuowen Tu


Abstract

We develop a new edge detection algorithm that addresses two important issues in this long-standing vision problem: (1) holistic image training and prediction; and (2) multi-scale and multi-level feature learning. Our proposed method, holistically-nested edge detection (HED), performs image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are important in order to resolve the challenging ambiguity in edge and object boundary detection. We significantly advance the state-of-the-art on the BSDS500 dataset (ODS F-score of 0.790) and the NYU Depth dataset (ODS F-score of 0.746), and do so with an improved speed (0.4 s per image) that is orders of magnitude faster than some CNN-based edge detection algorithms developed before HED. We also observe encouraging results on other boundary detection benchmark datasets such as Multicue and PASCAL-Context.
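To make "deep supervision on side responses" concrete, the following is a minimal sketch and not the authors' implementation. It assumes each side response is a flat list of per-pixel logits already resized to the label resolution, that the fused prediction is a weighted sum of the side logits, and that the loss is a plain (unweighted) binary cross-entropy. The names `bce`, `deeply_supervised_loss` and `fusion_weights` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function; adequate for the moderate logits used here."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(logits, labels):
    """Mean binary cross-entropy between per-pixel logits and 0/1 edge labels."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(labels)


def deeply_supervised_loss(side_logits, fusion_weights, labels):
    """Sum of one loss per side response plus a loss on the fused output.

    Assumption: the fused output is a weighted sum of the side logits.
    """
    # Deep supervision: every side response is compared to the ground truth.
    side_losses = [bce(side, labels) for side in side_logits]
    # Fuse the side responses pixel by pixel.
    fused = [
        sum(w * side[i] for w, side in zip(fusion_weights, side_logits))
        for i in range(len(labels))
    ]
    fused_loss = bce(fused, labels)
    return sum(side_losses) + fused_loss, side_losses, fused_loss


# Toy data: a 4-pixel "image" with two side responses (made-up values).
labels = [0, 1, 1, 0]
side_logits = [[-2.0, 1.5, 0.5, -1.0], [-1.0, 2.0, 1.0, -3.0]]
fusion_weights = [0.5, 0.5]
total, side_losses, fused_loss = deeply_supervised_loss(
    side_logits, fusion_weights, labels
)
```

Because every side response contributes its own loss term, each intermediate layer receives a direct training signal instead of relying only on the gradient from the final output.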

Metadata
Title: Holistically-Nested Edge Detection
Authors: Saining Xie, Zhuowen Tu
Publication date: 15 March 2017
Publisher: Springer US
Published in: International Journal of Computer Vision, Issue 1-3/2017
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-017-1004-z
