Published in: International Journal of Computer Vision 4/2021

30.01.2021

DeepFlux for Skeleton Detection in the Wild

Authors: Yongchao Xu, Yukang Wang, Stavros Tsogkas, Jianqiang Wan, Xiang Bai, Sven Dickinson, Kaleem Siddiqi

Abstract

The medial axis, or skeleton, is a fundamental object representation that has been extensively used in shape recognition. Yet, its extension to natural images has been challenging due to the large appearance and scale variations of objects and complex background clutter that appear in this setting. In contrast to recent methods that address skeleton extraction as a binary pixel classification problem, in this article we present an alternative formulation for skeleton detection. We follow the spirit of flux-based algorithms for medial axis recovery by training a convolutional neural network to predict a two-dimensional vector field encoding the flux representation. The skeleton is then recovered from the flux representation, which captures the position of skeletal pixels relative to semantically meaningful entities (e.g., image points in spatial context, and hence the implied object boundaries), resulting in precise skeleton detection. Moreover, since the flux representation is a region-based vector field, it is better able to cope with object parts of large width. We evaluate the proposed method, termed DeepFlux, on six benchmark datasets, consistently achieving superior performance over state-of-the-art methods. Finally, we demonstrate an application of DeepFlux, augmented with a skeleton scale estimation module, to detect objects in aerial images. This combination yields results that are competitive with models trained specifically for object detection, showcasing the versatility and effectiveness of mid-level representations in high-level tasks. An implementation of our method is available at https://github.com/YukangWang/DeepFlux.
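The recovery step described above can be illustrated with a minimal NumPy sketch. This is not the paper's exact post-processing (the names `skeleton_from_flux` and the divergence threshold are illustrative assumptions); it only shows the underlying idea: if each context pixel carries a unit vector pointing toward its nearest skeleton point, skeleton pixels act as sinks of the field and can be localized where the divergence is strongly negative.

```python
import numpy as np

def skeleton_from_flux(flux, thresh=0.5):
    """Recover a skeleton mask from a two-dimensional flux field.

    `flux` is an (H, W, 2) array: at each context pixel it stores a unit
    vector pointing toward the nearest skeleton point (zero elsewhere).
    Skeleton pixels are sinks of this field, so they appear as pixels of
    strongly negative divergence.
    """
    fx, fy = flux[..., 0], flux[..., 1]
    dfx_dx = np.gradient(fx, axis=1)  # derivative of x-component along columns
    dfy_dy = np.gradient(fy, axis=0)  # derivative of y-component along rows
    divergence = dfx_dx + dfy_dy
    return divergence < -thresh

# Synthetic example: a vertical skeleton along column 8 of a 16x16 grid.
H, W, c = 16, 16, 8
cols = np.tile(np.arange(W), (H, 1))
flux = np.zeros((H, W, 2))
flux[..., 0] = np.sign(c - cols)  # unit vectors point horizontally toward column c
mask = skeleton_from_flux(flux)   # True exactly on the skeleton column
```

In the synthetic example the flux flips sign across column 8, so the central-difference divergence reaches its most negative value there and only that column survives the threshold.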


Footnotes
1
In fact, in the context of skeletonization of binary objects (Siddiqi and Pizer 2008), this flux vector would be in the direction opposite to that of the spoke vector from a skeletal pixel to its associated boundary pixel.
 
References
Ahn, J., Cho, S., & Kwak, S. (2019). Weakly supervised learning of instance segmentation with inter-pixel relations. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 2209–2218).
Bai, M., & Urtasun, R. (2017). Deep watershed transform for instance segmentation. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 2858–2866).
Bai, X., Wang, X., Latecki, L. J., Liu, W., & Tu, Z. (2009). Active skeleton for non-rigid object detection. In Proceedings of IEEE international conference on computer vision (pp. 575–582).
Blum, H. (1973). Biological shape and visual science (part I). Journal of Theoretical Biology, 38(2), 205–287.
Borenstein, E., & Ullman, S. (2002). Class-specific, top-down segmentation. In Proceedings of European conference on computer vision (pp. 109–122).
Chen, L. C., Hermans, A., Papandreou, G., Schroff, F., Wang, P., & Adam, H. (2018). MaskLab: Instance segmentation by refining object detection with semantic and direction features. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 4013–4022).
Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2018). DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.
Chen, X., Fang, H., Lin, T. Y., Vedantam, R., Gupta, S., Dollár, P., & Zitnick, C. L. (2015). Microsoft COCO captions: Data collection and evaluation server. CoRR abs/1504.00325.
Ci, H., Wang, C., & Wang, Y. (2018). Video object segmentation by learning location-sensitive embeddings. In Proceedings of European conference on computer vision (pp. 501–516).
Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Li, F. F. (2009). ImageNet: A large-scale hierarchical image database. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 248–255).
Dickinson, S. J. (2009). Object categorization: Computer and human vision perspectives. Cambridge: Cambridge University Press.
Dimitrov, P., Damon, J. N., & Siddiqi, K. (2003). Flux invariants for shape. In Proceedings of IEEE international conference on computer vision and pattern recognition.
Ding, J., Xue, N., Long, Y., Xia, G. S., & Lu, Q. (2019). Learning RoI transformer for oriented object detection in aerial images. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 2849–2858).
Dollár, P., & Zitnick, C. L. (2015). Fast edge detection using structured forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(8), 1558–1570.
Dufresne-Camaro, C. O., Rezanejad, M., Tsogkas, S., Siddiqi, K., & Dickinson, S. (2020). Appearance shock grammar for fast medial axis extraction from real images. In Proceedings of IEEE international conference on computer vision and pattern recognition.
Everingham, M., Van Gool, L., Williams, C. K., Winn, J., & Zisserman, A. (2010). The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2), 303–338.
Felzenszwalb, P. F., & Huttenlocher, D. P. (2005). Pictorial structures for object recognition. International Journal of Computer Vision, 61(1), 55–79.
Felzenszwalb, P. F., & Huttenlocher, D. P. (2012). Distance transforms of sampled functions. Theory of Computing, 8(1), 415–428.
Girshick, R., Shotton, J., Kohli, P., Criminisi, A., & Fitzgibbon, A. (2011). Efficient regression of general-activity human poses from depth images. In Proceedings of IEEE international conference on computer vision (pp. 415–422).
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 770–778).
Jang, J. H., & Hong, K. S. (2001). A pseudo-distance map for the segmentation-free skeletonization of gray-scale images. In Proceedings of IEEE international conference on computer vision (vol. 2, pp. 18–23).
Jerripothula, K. R., Cai, J., Lu, J., & Yuan, J. (2017). Object co-skeletonization with co-segmentation. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 3881–3889).
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., & Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. In Proceedings of ACM multimedia (pp. 675–678).
Jiang, Y., Zhu, X., Wang, X., Yang, S., Li, W., Wang, H., Fu, P., & Luo, Z. (2017). R2CNN: Rotational region CNN for orientation robust scene text detection. Preprint arXiv:1706.09579.
Ke, W., Chen, J., Jiao, J., Zhao, G., & Ye, Q. (2017). SRN: Side-output residual network for object symmetry detection in the wild. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 302–310).
Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In Proceedings of international conference on learning representations.
Kreiss, S., Bertoni, L., & Alahi, A. (2019). PifPaf: Composite fields for human pose estimation. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 11977–11986).
Levinshtein, A., Sminchisescu, C., & Dickinson, S. (2013). Multiscale symmetric part detection and grouping. International Journal of Computer Vision, 104(2), 117–134.
Lindeberg, T. (1998). Edge detection and ridge detection with automatic scale selection. International Journal of Computer Vision, 30(2), 117–156.
Lindeberg, T. (2013). Scale selection properties of generalized scale-space interest point detectors. Journal of Mathematical Imaging and Vision, 46(2), 177–210.
Liu, C., Ke, W., Qin, F., & Ye, Q. (2018). Linear span network for object skeleton detection. In Proceedings of European conference on computer vision (pp. 136–151).
Liu, T. L., Geiger, D., & Yuille, A. L. (1998). Segmenting by seeking the symmetry axis. In Proceedings of international conference on pattern recognition (vol. 2, pp. 994–998).
Liu, X., Lyu, P., Bai, X., & Cheng, M. M. (2017). Fusing image and segmentation cues for skeleton extraction in the wild. In Proceedings of ICCV workshop on detecting symmetry in the wild (vol. 6, p. 8).
Liu, Y., Cheng, M. M., Hu, X., Wang, K., & Bai, X. (2017). Richer convolutional features for edge detection. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 5872–5881).
Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 3431–3440).
Luo, W., Li, Y., Urtasun, R., & Zemel, R. (2016). Understanding the effective receptive field in deep convolutional neural networks. In Proceedings of advances in neural information processing systems (pp. 4898–4906).
Ma, J., Shao, W., Ye, H., Wang, L., Wang, H., Zheng, Y., et al. (2018). Arbitrary-oriented scene text detection via rotation proposals. IEEE Transactions on Multimedia, 20(11), 3111–3122.
Maninis, K. K., Pont-Tuset, J., Arbeláez, P., & Van Gool, L. (2018). Convolutional oriented boundaries: From image segmentation to high-level tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 819–833.
Marr, D., & Nishihara, H. K. (1978). Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society of London B: Biological Sciences, 200(1140), 269–294.
Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of IEEE international conference on computer vision (vol. 2, pp. 416–423).
Martin, D. R., Fowlkes, C. C., & Malik, J. (2004). Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5), 530–549.
Máttyus, G., Luo, W., & Urtasun, R. (2017). DeepRoadMapper: Extracting road topology from aerial images. In Proceedings of the IEEE international conference on computer vision.
Máttyus, G., Wang, S., Fidler, S., & Urtasun, R. (2015). Enhancing road maps by parsing aerial images around the world. In Proceedings of the IEEE international conference on computer vision (pp. 1689–1697).
Nedzved, A., Ablameyko, S., & Uchida, S. (2006). Gray-scale thinning by using a pseudo-distance map. In Proceedings of IEEE international conference on pattern recognition.
Peng, S., Liu, Y., Huang, Q., Zhou, X., & Bao, H. (2019). PVNet: Pixel-wise voting network for 6DoF pose estimation. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 4561–4570).
Ren, Z., Yuan, J., Meng, J., & Zhang, Z. (2013). Robust part-based hand gesture recognition using Kinect sensor. IEEE Transactions on Multimedia, 15(5), 1110–1120.
Shen, W., Bai, X., Hu, R., Wang, H., & Latecki, L. J. (2011). Skeleton growing and pruning with bending potential ratio. Pattern Recognition, 44(2), 196–209.
Shen, W., Bai, X., Hu, Z., & Zhang, Z. (2016). Multiple instance subspace learning via partial random projection tree for local reflection symmetry in natural images. Pattern Recognition, 52, 306–316.
Shen, W., Zhao, K., Jiang, Y., Wang, Y., Bai, X., & Yuille, A. (2017). DeepSkeleton: Learning multi-task scale-associated deep side outputs for object skeleton extraction in natural images. IEEE Transactions on Image Processing, 26(11), 5298–5311.
Shen, W., Zhao, K., Jiang, Y., Wang, Y., Zhang, Z., & Bai, X. (2016). Object skeleton extraction in natural images by fusing scale-associated deep side outputs. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 222–230).
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., & Blake, A. (2011). Real-time human pose recognition in parts from single depth images. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 1297–1304).
Siddiqi, K., Bouix, S., Tannenbaum, A., & Zucker, S. W. (2002). Hamilton–Jacobi skeletons. International Journal of Computer Vision, 48(3), 215–231.
Siddiqi, K., & Pizer, S. M. (2008). Medial representations: Mathematics, algorithms and applications. Berlin: Springer.
Siddiqi, K., Shokoufandeh, A., Dickinson, S. J., & Zucker, S. W. (1999). Shock graphs and shape matching. International Journal of Computer Vision, 35(1), 13–32.
Sie Ho Lee, T., Fidler, S., & Dickinson, S. (2013). Detecting curved symmetric parts using a deformable disc model. In Proceedings of IEEE international conference on computer vision (pp. 1753–1760).
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In Proceedings of international conference on learning representations.
Sironi, A., Lepetit, V., & Fua, P. (2014). Multiscale centerline detection by learning a scale-space distance transform. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 2697–2704).
Trinh, N. H., & Kimia, B. B. (2011). Skeleton search: Category-specific object recognition and segmentation using a skeletal shape model. International Journal of Computer Vision, 2, 215–240.
Tsogkas, S., & Dickinson, S. (2017). AMAT: Medial axis transform for natural images. In Proceedings of IEEE international conference on computer vision (pp. 2727–2736).
Tsogkas, S., & Kokkinos, I. (2012). Learning-based symmetry detection in natural images. In Proceedings of European conference on computer vision (pp. 41–54).
Wang, Y., Xu, Y., Tsogkas, S., Bai, X., Dickinson, S., & Siddiqi, K. (2019). DeepFlux for skeletons in the wild. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 5287–5296).
Wei, S. E., Ramakrishna, V., Kanade, T., & Sheikh, Y. (2016). Convolutional pose machines. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 4724–4732).
Xia, G., Hu, J., Hu, F., Shi, B., Bai, X., Zhong, Y., et al. (2017). AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Transactions on Geoscience and Remote Sensing, 55(7), 3965–3981.
Xia, G. S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., Datcu, M., Pelillo, M., & Zhang, L. (2018). DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 3974–3983).
Xie, S., & Tu, Z. (2015). Holistically-nested edge detection. In Proceedings of IEEE international conference on computer vision (pp. 1395–1403).
Xu, W., Parmar, G., & Tu, Z. (2019). Geometry-aware end-to-end skeleton detection. In British machine vision conference.
Xu, Y., Wang, Y., Zhou, W., Wang, Y., Yang, Z., & Bai, X. (2019). TextField: Learning a deep direction field for irregular scene text detection. IEEE Transactions on Image Processing, 28(11), 5566–5579.
Yang, X., Sun, H., Fu, K., Yang, J., Sun, X., Yan, M., et al. (2018). Automatic ship detection in remote sensing images from Google Earth of complex scenes based on multiscale rotation dense feature pyramid networks. Remote Sensing, 10(1), 132.
Yu, Z., & Bajaj, C. (2004). A segmentation-free approach for skeletonization of gray-scale images via anisotropic vector diffusion. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 415–420).
Zhang, Q., & Couloigner, I. (2007). Accurate centerline detection and line width estimation of thick lines using the Radon transform. IEEE Transactions on Image Processing, 16(2), 310–316.
Zhang, Z., Shen, W., Yao, C., & Bai, X. (2015). Symmetry-based text line detection in natural scenes. In Proceedings of IEEE international conference on computer vision and pattern recognition (pp. 2558–2567).
Zhao, K., Shen, W., Gao, S., Li, D., & Cheng, M. M. (2018). Hi-Fi: Hierarchical feature integration for skeleton detection. In Proceedings of international joint conference on artificial intelligence (pp. 1191–1197).
Zhu, S. C., & Yuille, A. L. (1996). FORMS: A flexible object recognition and modelling system. International Journal of Computer Vision, 20(3), 187–212.
Zucker, S. W. (2012). Local field potentials and border ownership: A conjecture about computation in visual cortex. Journal of Physiology-Paris, 106, 297–315.
Metadata
Title
DeepFlux for Skeleton Detection in the Wild
Authors
Yongchao Xu
Yukang Wang
Stavros Tsogkas
Jianqiang Wan
Xiang Bai
Sven Dickinson
Kaleem Siddiqi
Publication date
30.01.2021
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 4/2021
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-021-01430-6
