Published in: Neural Processing Letters 1/2023

16-06-2022

Adaptive Generation of Weakly Supervised Semantic Segmentation for Object Detection

Authors: Shibao Li, Yixuan Liu, Yunwu Zhang, Yi Luo, Jianhang Liu

Abstract

Object detection and semantic segmentation are fundamental tasks in computer vision, and methods that combine the two have recently made great progress. With a box-level weakly supervised semantic segmentation (WSSS) method, we predict segmentation from feature maps extracted by an object detector. Existing methods require both box-level and pixel-level annotations to train the shared backbone network simultaneously in order to obtain both bounding boxes and segmentation. However, without pixel-level annotations and without changing the parameters of the network, object detectors cannot predict semantic segmentation. We design a compact, plug-and-play object detection to semantic segmentation (O2S) module that enables object detectors to predict semantic masks, making full use of the training set and intermediate feature maps of object detection. We also propose a box-level weakly supervised probabilistic gap adaptive (PGA) method, which enables O2S to learn semantic masks from the object detection training set. We evaluate the proposed approach on Pascal VOC 2007 and Pascal VOC 2012 and show its feasibility. With only 3.5 million parameters, O2S trained with PGA achieves results very close to those of whole networks trained with WSSS methods. Our work has important implications for exploring the commonality of multiple visual tasks.
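
To make "box-level weak supervision" concrete, the sketch below shows the naive baseline that such methods typically start from: each bounding box in a detection training set is painted into a coarse pseudo-mask. This is not the PGA method described in the abstract, whose details are not given here. The function name `boxes_to_pseudo_mask`, the box tuple layout, and the "smaller box wins in overlaps" rule are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max, class_id),
# with x_max and y_max exclusive. Class 0 is reserved for background.
Box = Tuple[int, int, int, int, int]

BACKGROUND = 0


def boxes_to_pseudo_mask(height: int, width: int, boxes: List[Box]) -> List[List[int]]:
    """Paint each box's class id into a height x width label map.

    Every pixel inside a box is treated as belonging to that box's class,
    which over-labels the object; weakly supervised methods refine this.
    """
    mask = [[BACKGROUND] * width for _ in range(height)]

    # Assumed heuristic: paint larger boxes first so that smaller boxes,
    # which are more likely to be foreground objects, overwrite overlaps.
    ordered = sorted(
        boxes,
        key=lambda b: (b[2] - b[0]) * (b[3] - b[1]),
        reverse=True,
    )

    for x_min, y_min, x_max, y_max, class_id in ordered:
        # Clip the box to the image bounds before painting.
        for y in range(max(0, y_min), min(height, y_max)):
            for x in range(max(0, x_min), min(width, x_max)):
                mask[y][x] = class_id
    return mask


if __name__ == "__main__":
    # A 4x6 image with one large box (class 1) and a smaller box (class 2) inside it.
    demo = boxes_to_pseudo_mask(4, 6, [(0, 0, 4, 4, 1), (2, 1, 4, 3, 2)])
    for row in demo:
        print(row)
```

A pseudo-mask like this labels every pixel inside a box as foreground, including background pixels near the box corners, which is the kind of label noise a box-level WSSS method has to cope with.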

Metadata

Title: Adaptive Generation of Weakly Supervised Semantic Segmentation for Object Detection
Authors: Shibao Li, Yixuan Liu, Yunwu Zhang, Yi Luo, Jianhang Liu
Publication date: 16-06-2022
Publisher: Springer US
Published in: Neural Processing Letters, Issue 1/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-022-10902-w
