Published in: Pattern Recognition and Image Analysis 2/2021

01-04-2021 | MATHEMATICAL THEORY OF IMAGES AND SIGNALS REPRESENTING, PROCESSING, ANALYSIS, RECOGNITION AND UNDERSTANDING

Efficient Residual Neural Network for Semantic Segmentation

Authors: Bin Li, Junyue Zang, Jie Cao

Abstract

In this paper, we present an improved Efficient Neural Network (ENet) for semantic segmentation, which we name the Efficient Residual Neural Network (ERNet). ERNet contains two processing streams: a pooling stream, which extracts high-dimensional semantic information, and a residual stream, which preserves low-dimensional boundary information. ERNet has five stages, each containing several bottleneck modules, and the output of every bottleneck is fed into the residual stream. Starting from the second stage, the concatenation of the pooling stream and the residual stream serves as input to each down-sampling or up-sampling bottleneck. The identity mapping of the residual stream shortens the path between the input and output of each stage, alleviates the vanishing-gradient problem, strengthens the propagation of low-dimensional boundary features, and encourages their reuse. We tested ERNet on the CamVid, Cityscapes, and SUN RGB-D datasets. The segmentation speed of ERNet is close to that of ENet, while its segmentation accuracy is higher.
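The two-stream fusion described in the abstract can be sketched in plain Python. This is a minimal toy illustration, not the paper's implementation: the nested-list tensor layout, the 2x2 pooling window, and the names `max_pool_2x2` and `fuse_streams` are all illustrative assumptions. The key idea it shows is that, at a down-sampling bottleneck, the full-resolution residual stream must first be pooled to the pooling stream's spatial size before the two streams can be concatenated along the channel axis.

```python
# Toy sketch of ERNet-style two-stream fusion at a down-sampling bottleneck.
# Feature maps are nested lists indexed [channel][row][col]; all names and
# shapes here are illustrative, not taken from the paper's code.

def max_pool_2x2(feat):
    """2x2 max pooling with stride 2 applied to each channel of [C][H][W]."""
    pooled = []
    for ch in feat:
        h, w = len(ch), len(ch[0])
        pooled.append([
            [max(ch[i][j], ch[i][j + 1], ch[i + 1][j], ch[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)
        ])
    return pooled

def fuse_streams(pooling_stream, residual_stream):
    """Pool the full-resolution residual stream down to the pooling stream's
    spatial size, then concatenate the two streams channel-wise."""
    residual_down = max_pool_2x2(residual_stream)
    return pooling_stream + residual_down  # list '+' = channel concatenation

# One-channel pooling stream at 2x2 and one-channel residual stream at 4x4.
p = [[[1.0, 2.0], [3.0, 4.0]]]
r = [[[i + 4 * j for i in range(4)] for j in range(4)]]
fused = fuse_streams(p, r)
print(len(fused))     # 2 channels after concatenation
print(fused[1])       # residual channel, now pooled to 2x2
```

In the actual network the concatenated tensor would then pass through the bottleneck's convolutions; this sketch only demonstrates the resolution matching and channel stacking that make the fusion well-defined.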


Literature
1. J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," IEEE Trans. Pattern Anal. Mach. Intell. 39 (4), 640–651 (2014).
2. V. Badrinarayanan, A. Handa, and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," arXiv (2015). arXiv:1511.00561
3. M. Treml, J. Arjona-Medina, T. Unterthiner, R. Durgesh, F. Friedmann, P. Schuberth, A. Mayr, M. Heusel, M. Hofmarcher, M. Widrich, B. Nessler, and S. Hochreiter, "Speeding up semantic segmentation for autonomous driving," in NIPS Workshop (2016).
4. L.-C. Chen et al., "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs," IEEE Trans. Pattern Anal. Mach. Intell. 40 (4), 834–848 (2016).
5. G. Ghiasi and C. C. Fowlkes, "Laplacian pyramid reconstruction and refinement for semantic segmentation," arXiv (2016). arXiv:1605.02264 [cs.CV]
6. L.-C. Chen et al., "Attention to scale: Scale-aware semantic image segmentation," arXiv (2015). arXiv:1511.03339 [cs.CV]
7. B. Hariharan et al., "Hypercolumns for object segmentation and fine-grained localization," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015).
8. F. Xia et al., "Zoom better to see clearer: Human and object parsing with hierarchical auto-zoom net," in Computer Vision–ECCV 2016 (2016), pp. 648–663.
9. T. Pohlen et al., "Full-resolution residual networks for semantic segmentation in street scenes," arXiv (2016). arXiv:1611.08323 [cs.CV]
10. V. Badrinarayanan, A. Handa, and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for robust semantic pixel-wise labelling," arXiv (2015). arXiv:1505.07293
11. E. Romera, J. M. Alvarez, L. M. Bergasa, and R. Arroyo, "Efficient ConvNet for real-time semantic segmentation," in 2017 IEEE Intelligent Vehicles Symposium (IV) (IEEE, 2017).
12. A. Paszke et al., "ENet: A deep neural network architecture for real-time semantic segmentation," arXiv (2016). arXiv:1606.02147 [cs.CV]
13. G. J. Brostow, J. Fauqueur, and R. Cipolla, "Semantic object classes in video: A high-definition ground truth database," Pattern Recognit. Lett. 30 (2), 88–97 (2009).
14. M. Cordts et al., "The Cityscapes dataset for semantic urban scene understanding," arXiv (2016). arXiv:1604.01685 [cs.CV]
15. S. Song, S. P. Lichtenberg, and J. Xiao, "SUN RGB-D: A RGB-D scene understanding benchmark suite," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 567–576.
16. J. Tompson, R. Goroshin, A. Jain, Y. LeCun, and C. Bregler, "Efficient object localization using convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 648–656.
17. S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in International Conference on Machine Learning (2015).
18. K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," in Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV) (2015), pp. 1026–1034.
19. F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," arXiv (2015). arXiv:1511.07122
20. J. Jin, A. Dundar, and E. Culurciello, "Flattened convolutional neural networks for feedforward acceleration," arXiv (2014). arXiv:1412.5474
Metadata
Title
Efficient Residual Neural Network for Semantic Segmentation
Authors
Bin Li
Junyue Zang
Jie Cao
Publication date
01-04-2021
Publisher
Pleiades Publishing
Published in
Pattern Recognition and Image Analysis / Issue 2/2021
Print ISSN: 1054-6618
Electronic ISSN: 1555-6212
DOI
https://doi.org/10.1134/S1054661821020103
