Published in: Neural Processing Letters 4/2023

09-11-2022

A Strip Dilated Convolutional Network for Semantic Segmentation

Authors: Yan Zhou, Xihong Zheng, Wanli Ouyang, Baopu Li


Abstract

Segmentation scenes frequently contain a large number of strip-shaped objects, and conventional square convolution may yield redundant information for them. Building on our previously proposed SA-FFNet (Zhou et al. in Neurocomputing 453:50–59, 2021), we study the effect of strip sub-region information extraction on semantic segmentation and propose a strip dilated convolutional network. Our method helps extract the multi-scale strip objects that often appear in segmentation scenes and uses strip dilated convolution to further extract contextual dependencies in other directions. First, we propose a multi-scale strip pooling module that enables the backbone network to effectively obtain multi-scale contexts. Then, we introduce a strip dilated convolution module, which supplements the vertical contexts of the strip pooling. Finally, we construct a novel network that integrates the two proposed modules. The method explicitly takes the horizontal and vertical contexts of multi-scale strip objects into consideration, so that scene understanding can benefit from long-range dependencies. Experimental results on the widely used PASCAL VOC 2012 and Cityscapes scene analysis benchmark datasets show that the method outperforms existing methods such as OCRNet, DeepLabV3+, and SPNet, both qualitatively and quantitatively.
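The two operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the module design, so the function names `strip_pool` and `strip_dilated_conv_vertical`, the zero padding, and the single-channel list-of-lists feature map are all assumptions made for illustration. The sketch shows only the generic ideas: averaging along whole rows and columns (strip pooling) and applying a k×1 kernel whose taps are spaced `dilation` rows apart (a vertical strip dilated convolution).

```python
from typing import List, Tuple

Matrix = List[List[float]]


def strip_pool(feature: Matrix) -> Tuple[List[float], List[float]]:
    """Average over each full row (horizontal strip) and each full column
    (vertical strip) of a single-channel feature map."""
    h, w = len(feature), len(feature[0])
    row_means = [sum(row) / w for row in feature]
    col_means = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]
    return row_means, col_means


def strip_dilated_conv_vertical(
    feature: Matrix, kernel: List[float], dilation: int
) -> Matrix:
    """Apply a k x 1 kernel down every column, with kernel taps spaced
    `dilation` rows apart. Out-of-range taps are treated as zero
    (zero padding is an assumption of this sketch)."""
    h, w = len(feature), len(feature[0])
    center = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for t, weight in enumerate(kernel):
                src = i + (t - center) * dilation  # row sampled by tap t
                if 0 <= src < h:
                    acc += weight * feature[src][j]
            out[i][j] = acc
    return out


# Hypothetical toy inputs, for illustration only.
rows, cols = strip_pool([[1, 2], [3, 4], [5, 6], [7, 8]])
column = strip_dilated_conv_vertical([[1], [2], [3], [4], [5]], [1, 1, 1], 2)
```

With a dilation of 2, a three-tap vertical kernel covers five rows, which is how a narrow strip kernel can reach longer-range context along one direction without adding weights.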

Literature
1. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969
2. Teichmann M, Weber M, Zoellner M, Cipolla R, Urtasun R (2018) MultiNet: real-time joint semantic reasoning for autonomous driving. In: 2018 IEEE intelligent vehicles symposium, pp 1013–1020
3. Chen C, Wei J, Peng C, Qin H (2021) Depth-quality-aware salient object detection. IEEE Trans Image Process 30:2350–2363
4. Wu Z, Li S, Chen C, Hao A, Qin H (2020) A deeper look at image salient object detection: bi-stream network with a small training dataset. IEEE Trans Multimedia 24:73–86
5. Ma G, Li S, Chen C, Hao A, Qin H (2021) Rethinking image salient object detection: object-level semantic saliency reranking first, pixelwise saliency refinement later. IEEE Trans Image Process 30:4238–4252
6. Ma G, Chen C, Li S, Peng C, Hao A, Qin H (2019) Salient object detection via multiple instance joint re-learning. IEEE Trans Multimedia 22(2):324–336
7. Chen C, Wei J, Peng C, Zhang W, Qin H (2020) Improved saliency detection in RGB-D images using two-phase depth estimation and selective deep fusion. IEEE Trans Image Process 29:4296–4307
8. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, pp 234–241
9. He W, Song H, Guo Y, Bian G, Sun Y, Zhou X, Wang X (2020) Multiscale matters for part segmentation of instruments in robotic surgery. IET Image Proc 14(13):3215–3222
10. Liu C, Zhao R, Xie W, Pang M (2020) Pathological lung segmentation based on random forest combined with deep model and multi-scale superpixels. Neural Process Lett 52(2):1631–1649
11. Mo Y, Wu Y, Yang X, Liu F, Liao Y (2022) Review the state-of-the-art technologies of semantic segmentation based on deep learning. Neurocomputing 493:626–646
12. Al-Huda Z, Peng B, Yang Y, Algburi RNA, Ahmad M, Khurshid F, Moghalles K (2021) Weakly supervised semantic segmentation by iteratively refining optimal segmentation with deep cues guidance. Neural Comput Appl 33(15):9035–9060
14. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440
15. Zhao B, Zhang X, Li Z, Hu X (2019) A multi-scale strategy for deep semantic segmentation with convolutional neural networks. Neurocomputing 365:273–284
16. Ding H, Jiang X, Shuai B, Liu AQ, Wang G (2018) Context contrasted feature and gated multi-scale aggregation for scene segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2393–2402
17. Gao S-H, Cheng M-M, Zhao K, Zhang X-Y, Yang M-H, Torr P (2019) Res2Net: a new multi-scale backbone architecture. IEEE Trans Pattern Anal Mach Intell 43(2):652–662
18. Xia H, Sun W, Song S, Mou X (2020) Md-Net: multi-scale dilated convolution network for CT images segmentation. Neural Process Lett 51(3):2915–2927
19. Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881–2890
20. Lin G, Milan A, Shen C, Reid I (2017) RefineNet: multi-path refinement networks for high-resolution semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1925–1934
21. Yu C, Wang J, Peng C, Gao C, Yu G, Sang N (2018) Learning a discriminative feature network for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1857–1866
22. Li H, Xiong P, Fan H, Sun J (2019) DFANet: deep feature aggregation for real-time semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9522–9531
23. Badrinarayanan V, Kendall A, Cipolla R (2017) SegNet: a deep convolutional encoder–decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481–2495
24. Chen LC, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder–decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision, pp 801–818
25. Noh H, Hong S, Han B (2015) Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1520–1528
26. Zhang H, Dana K, Shi J, Zhang Z, Wang X, Tyagi A, Agrawal A (2018) Context encoding for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7151–7160
27. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848
28.
29. Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H (2019) Dual attention network for scene segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3146–3154
30.
31. Liu Y, Xu C, Chen Z, Chen C, Zhao H, Jin X (2020) Deep dual-stream network with scale context selection attention module for semantic segmentation. Neural Process Lett 51(3):2281–2299
32. Peng G, Yang S, Wang H (2021) Refine for semantic segmentation based on parallel convolutional network with attention model. Neural Process Lett 53(6):4177–4188
33. Fan Z, Hu G, Sun X, Wang G, Dong J, Su C (2022) Self-attention neural architecture search for semantic image segmentation. Knowl-Based Syst 239:107968
34. Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141
35. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
36. Zhou Z, Zhou Y, Wang D, Mu J, Zhou H (2021) Self-attention feature fusion network for semantic segmentation. Neurocomputing 453:50–59
37. Dai J, Qi H, Xiong Y, Li Y, Zhang G, Hu H, Wei Y (2017) Deformable convolutional networks. In: Proceedings of the IEEE international conference on computer vision, pp 764–773
38. He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916
39. Woo S, Park J, Lee J-Y, Kweon IS (2018) CBAM: convolutional block attention module. In: Proceedings of the European conference on computer vision, pp 3–19
40. Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13713–13722
41. Zhou F, Hu Y, Shen X (2020) Scale-aware spatial pyramid pooling with both encoder-mask and scale-attention for semantic segmentation. Neurocomputing 383:174–182
42. Huang Z, Wang X, Huang L, Huang C, Wei Y, Liu W (2019) CCNet: criss-cross attention for semantic segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 603–612
43. He J, Deng Z, Zhou L, Wang Y, Qiao Y (2019) Adaptive pyramid context network for semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7519–7528
44. Hou Q, Zhang L, Cheng M-M, Feng J (2020) Strip pooling: rethinking spatial pooling for scene parsing. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4003–4012
45. Liu J, He J, Zhang J, Ren JS, Li H (2020) EfficientFCN: holistically-guided decoding for semantic segmentation. In: European conference on computer vision, pp 1–17. Springer
46. Zhang H, Xue J, Dana K (2017) Deep TEN: texture encoding network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 708–717
47. Bai S, Wang C (2021) Information aggregation and fusion in deep neural networks for object interaction exploration for semantic segmentation. Knowl-Based Syst 218:106843
49. Hu Y, Long Z, AlRegib G (2019) Multi-level texture encoding and representation (MuLTER) based on deep neural networks. In: 2019 IEEE international conference on image processing, pp 4410–4414. IEEE
51. Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
52. Cordts M, Omran M, Ramos S, Scharwächter T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2015) The cityscapes dataset. In: CVPR workshop on the future of datasets in vision, vol 2
53. Yuan Y, Chen X, Wang J (2020) Object-contextual representations for semantic segmentation. In: Proceedings of the European conference on computer vision, pp 173–190
Metadata
Title
A Strip Dilated Convolutional Network for Semantic Segmentation
Authors
Yan Zhou
Xihong Zheng
Wanli Ouyang
Baopu Li
Publication date
09-11-2022
Publisher
Springer US
Published in
Neural Processing Letters / Issue 4/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-11048-5
