Top

Neural Processing Letters

Published in:

09-11-2022

A Strip Dilated Convolutional Network for Semantic Segmentation

Authors: Yan Zhou, Xihong Zheng, Wanli Ouyang, Baopu Li

Published in: Neural Processing Letters | Issue 4/2023

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

There are frequently a large number of strip objects in segmentation scenarios, and the use of conventional square convolution may yield redundant information. Based on our previously proposed SA-FFNet (Zhou et al. in Neurocomputing 453:50–59, 2021), we study the effect of strip sub-region information extraction on semantic segmentation and propose a network. Our method is conducive to extracting multi-scale strip objects that often appear in segmentation scenes, and using strip dilated convolution to further extract contextual dependencies in other directions. First, we propose a multi-scale strip pooling module that enables the backbone network to effectively obtain multi-scale contexts; Then, we introduce a strip dilated convolution module, which supplements the vertical contexts of the strip pooling by using strip dilated convolution; Finally, we construct a novel network integrating the proposed two modules. The method explicitly takes horizontal and vertical contexts of multi-scale strip objects into consideration, so that scene understanding could benefit from long-range dependencies. The experimental results on the widely used PASCAL VOC 2012 and Cityscapes scene analysis benchmark datasets, which are better than the existing OCRNet, DeeplabV3+, SPNet, etc, both qualitatively and quantitatively.

previous article MatchACNN: A Multi-Granularity Deep Matching Model

next article Transfer Learning with Ensembles of Deep Neural Networks for Skin Cancer Detection in Imbalanced Data Sets

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969

Teichmann M, Weber M, Zoellner M, Cipolla R, Urtasun R (2018) MultiNet: real-time joint semantic reasoning for autonomous driving. In: 2018 IEEE intelligent vehicles symposium, pp 1013–1020

Chen C, Wei J, Peng C, Qin H (2021) Depth-quality-aware salient object detection. IEEE Trans Image Process 30:2350–2363CrossRef

Wu Z, Li S, Chen C, Hao A, Qin H (2020) A deeper look at image salient object detection: bi-stream network with a small training dataset. IEEE Trans Multimedia 24:73–86CrossRef

Ma G, Li S, Chen C, Hao A, Qin H (2021) Rethinking image salient object detection: object-level semantic saliency reranking first, pixelwise saliency refinement later. IEEE Trans Image Process 30:4238–4252CrossRef

Ma G, Chen C, Li S, Peng C, Hao A, Qin H (2019) Salient object detection via multiple instance joint re-learning. IEEE Trans Multimedia 22(2):324–336CrossRef

Chen C, Wei J, Peng C, Zhang W, Qin H (2020) Improved saliency detection in RGB-D images using two-phase depth estimation and selective deep fusion. IEEE Trans Image Process 29:4296–4307CrossRefMATH

Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, pp 234–241

He W, Song H, Guo Y, Bian G, Sun Y, Zhou X, Wang X (2020) Multiscale matters for part segmentation of instruments in robotic surgery. IET Image Proc 14(13):3215–3222CrossRef

10.

Liu C, Zhao R, Xie W, Pang M (2020) Pathological lung segmentation based on random forest combined with deep model and multi-scale superpixels. Neural Process Lett 52(2):1631–1649CrossRef

11.

Mo Y, Wu Y, Yang X, Liu F, Liao Y (2022) Review the state-of-the-art technologies of semantic segmentation based on deep learning. Neurocomputing 493:626–646CrossRef

12.

Al-Huda Z, Peng B, Yang Y, Algburi RNA, Ahmad M, Khurshid F, Moghalles K (2021) Weakly supervised semantic segmentation by iteratively refining optimal segmentation with deep cues guidance. Neural Comput Appl 33(15):9035–9060CrossRef

13.

Rainarli E (2021) A decade: review of scene text detection methods. Comput Sci Rev 42:100434MathSciNetCrossRef

14.

Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440

15.

Zhao B, Zhang X, Li Z, Hu X (2019) A multi-scale strategy for deep semantic segmentation with convolutional neural networks. Neurocomputing 365:273–284CrossRef

16.

Ding H, Jiang X, Shuai B, Liu AQ, Wang G (2018) Context contrasted feature and gated multi-scale aggregation for scene segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2393–2402

17.

Gao S-H, Cheng M-M, Zhao K, Zhang X-Y, Yang M-H, Torr P (2019) Res2Net: a new multi-scale backbone architecture. IEEE Trans Pattern Anal Mach Intell 43(2):652–662CrossRef

18.

Xia H, Sun W, Song S, Mou X (2020) Md-Net: multi-scale dilated convolution network for CT images segmentation. Neural Process Lett 51(3):2915–2927CrossRef

19.

Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881–2890

20.

Lin G, Milan A, Shen C, Reid I (2017) RefineNet: multi-path refinement networks for high-resolution semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1925–1934

21.

Yu C, Wang J, Peng C, Gao C, Yu G, Sang N (2018) Learning a discriminative feature network for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1857–1866

22.

Li H, Xiong P, Fan H, Sun J (2019) DFANet: deep feature aggregation for real-time semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9522–9531

23.

Badrinarayanan V, Kendall A, Cipolla R (2017) SegNet: a deep convolutional encoder–decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481–2495CrossRef

24.

Chen LC, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder–decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision, pp 801–818

25.

Noh H, Hong S, Han B (2015) Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1520–1528

26.

Zhang H, Dana K, Shi J, Zhang Z, Wang X, Tyagi A, Agrawal A (2018) Context encoding for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7151–7160

27.

Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848CrossRef

28.

Chen L-C, Papandreou G, Schroff F, Adam H (2017) Rethinking atrous convolution for semantic image segmentation.arXiv:1706.05587

29.

Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H (2019) Dual attention network for scene segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3146–3154

30.

Li H, Xiong P, An J, Wang L (2018) Pyramid attention network for semantic segmentation. BMVC. arXiv preprint arXiv:1805.10180

31.

Liu Y, Xu C, Chen Z, Chen C, Zhao H, Jin X (2020) Deep dual-stream network with scale context selection attention module for semantic segmentation. Neural Process Lett 51(3):2281–2299CrossRef

32.

Peng G, Yang S, Wang H (2021) Refine for semantic segmentation based on parallel convolutional network with attention model. Neural Process Lett 53(6):4177–4188CrossRef

33.

Fan Z, Hu G, Sun X, Wang G, Dong J, Su C (2022) Self-attention neural architecture search for semantic image segmentation. Knowl-Based Syst 239:107968CrossRef

34.

Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141

35.

He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

36.

Zhou Z, Zhou Y, Wang D, Mu J, Zhou H (2021) Self-attention feature fusion network for semantic segmentation. Neurocomputing 453:50–59CrossRef

37.

Dai J, Qi H, Xiong Y, Li Y, Zhang G, Hu H, Wei Y (2017) Deformable convolutional networks. In: Proceedings of the IEEE international conference on computer vision, pp 764–773

38.

He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916CrossRef

39.

Woo S, Park J, Lee J-Y, Kweon IS (2018) CBAM: convolutional block attention module. In: Proceedings of the European conference on computer vision, pp 3–19

40.

Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13713–13722

41.

Zhou F, Hu Y, Shen X (2020) Scale-aware spatial pyramid pooling with both encoder-mask and scale-attention for semantic segmentation. Neurocomputing 383:174–182CrossRef

42.

Huang Z, Wang X, Huang L, Huang C, Wei Y, Liu W (2019) CCNet: criss-cross attention for semantic segmentation. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 603–612

43.

He J, Deng Z, Zhou L, Wang Y, Qiao Y (2019) Adaptive pyramid context network for semantic segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7519–7528

44.

Hou Q, Zhang L, Cheng M-M, Feng J (2020) Strip pooling: rethinking spatial pooling for scene parsing. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4003–4012

45.

Liu J, He J, Zhang J, Ren JS, Li H (2020) EfficientFCN: holistically-guided decoding for semantic segmentation. In: European conference on computer vision, pp 1–17 . Springer

46.

Zhang H, Xue J, Dana K (2017) Deep TEN: texture encoding network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 708–717

47.

Bai S, Wang C (2021) Information aggregation and fusion in deep neural networks for object interaction exploration for semantic segmentation. Knowl-Based Syst 218:106843CrossRef

48.

Srivastava V, Biswas B (2022) CNN-EFF: CNN based edge feature fusion in semantic image labelling and parsing. Neural Process Lett. https://doi.org/10.1007/s11063-021-10704-6CrossRef

49.

Hu Y, Long Z, AlRegib G (2019) Multi-level texture encoding and representation (MuLTER) based on deep neural networks. In: 2019 IEEE international conference on image processing, pp 4410–4414 . IEEE

50.

Yu F, Koltun V (2015) Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122

51.

Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338CrossRef

52.

Cordts M, Omran M, Ramos S, Scharwächter T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2015) The cityscapes dataset. In: CVPR workshop on the future of datasets in vision, vol 2

53.

Yuan Y, Chen X, Wang J (2020) Object-contextual representations for semantic segmentation. In: Proceedings of the European conference on computer vision, pp 173–190

Title: A Strip Dilated Convolutional Network for Semantic Segmentation
Authors: Yan Zhou
Xihong Zheng
Wanli Ouyang
Baopu Li
Publication date: 09-11-2022
Publisher: Springer US
Published in: Neural Processing Letters / Issue 4/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-022-11048-5

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Other articles of this Issue 4/2023

Active Learning by Extreme Learning Machine with Considering Exploration and Exploitation Simultaneously

Deep Convolutional Neural Networks with Transfer Learning for Visual Sentiment Analysis

Self-Transfer Learning Network for Multicolor Fabric Defect Detection

CDMC-Net: Context-Aware Image Deblurring Using a Multi-scale Cascaded Network

DRFS: Detecting Risk Factor of Stroke Disease from Social Media Using Machine Learning Techniques

An Attention-Based Spatio-temporal Model for Methane Concentration Forecasting in Coal Mine