2020 | OriginalPaper | Chapter

Compact Position-Aware Attention Network for Image Semantic Segmentation

Authors: Yajun Xu, Zhendong Mao, Peng Zhang, Bin Wang

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

In intelligent multimedia security, automatic image semantic segmentation is a fundamental research problem: it enables accurate recognition of important targets in multimedia data and supports subsequent security analysis. Most existing semantic segmentation methods have made remarkable progress by modeling interactions between image pixels on top of fully convolutional networks (FCN). However, they neglect the fact that semantic features extracted by an FCN represent original image details poorly, which makes it hard for interaction-modeling methods to attend to truly relevant information within spatially adjacent regions. To tackle this problem, we take position information into account and adaptively model position relevance between pixels to enhance local consistency in segmentation results. We propose a novel compact position-aware attention network (CPANet), containing a spatial augmented attention module and a channel augmented attention module, to simultaneously learn semantic relevance and position relevance between image pixels in a mutually reinforcing way. In the spatial augmented module, we introduce relative height and width distances to model position relevance based on the self-attention mechanism. In the channel augmented module, we exploit bilinear pooling to model compact correlations between pixels at any position across any channels. The proposed CPANet can jointly learn accurate position and semantic information of image pixels in a compact manner, improving semantic segmentation performance. Experimental results demonstrate that our approach achieves state-of-the-art performance on the Cityscapes dataset.
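
The abstract does not state the exact formulation of the spatial augmented module, so the following is only a minimal sketch of the general idea it describes: self-attention over a feature map whose logits combine a semantic term (scaled dot product of pixel features) with a position term built from the relative height and width offsets between two pixels. The function name `position_aware_attention`, the bias tables `rel_h_bias` and `rel_w_bias`, and the choice of a simple additive bias are all assumptions made for illustration, not the authors' implementation. In a trained network the bias tables would be learned; here they are fixed to the negative absolute offset so that nearer pixels receive more weight.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def position_aware_attention(feats, height, width, rel_h_bias, rel_w_bias):
    """Self-attention with an additive relative-position term (illustrative).

    feats:      list of height * width feature vectors in row-major order.
    rel_h_bias: dict mapping a signed row offset to a scalar bias (hypothetical
                stand-in for a learned relative-height term).
    rel_w_bias: dict mapping a signed column offset to a scalar bias
                (hypothetical stand-in for a learned relative-width term).
    Returns (weights, outputs): the attention matrix and aggregated features.
    """
    n = height * width
    dim = len(feats[0])
    scale = 1.0 / math.sqrt(dim)
    weights, outputs = [], []
    for i in range(n):
        yi, xi = divmod(i, width)
        logits = []
        for j in range(n):
            yj, xj = divmod(j, width)
            # Semantic relevance: scaled dot product of the two pixel features.
            semantic = scale * sum(a * b for a, b in zip(feats[i], feats[j]))
            # Position relevance: bias looked up by relative height and width.
            position = rel_h_bias[yj - yi] + rel_w_bias[xj - xi]
            logits.append(semantic + position)
        row = softmax(logits)
        weights.append(row)
        # Aggregate features from all pixels using the attention weights.
        outputs.append(
            [sum(row[j] * feats[j][c] for j in range(n)) for c in range(dim)]
        )
    return weights, outputs


# Toy 2 x 3 feature map with identical features, so only position matters.
H, W = 2, 3
demo_feats = [[1.0, 0.0] for _ in range(H * W)]
# Fixed biases that decay with distance (a learned table in practice).
demo_h_bias = {d: -float(abs(d)) for d in range(-(H - 1), H)}
demo_w_bias = {d: -float(abs(d)) for d in range(-(W - 1), W)}
demo_weights, demo_outputs = position_aware_attention(
    demo_feats, H, W, demo_h_bias, demo_w_bias
)
```

Because every pixel in the toy map carries the same feature vector, the semantic term is constant and the attention weights are determined by the position term alone, which makes the effect of the relative height and width offsets easy to inspect.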

Metadata

Title: Compact Position-Aware Attention Network for Image Semantic Segmentation
Authors: Yajun Xu, Zhendong Mao, Peng Zhang, Bin Wang
Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-37734-2_52