26-04-2024 | Original Article

Enhancing the ability of convolutional neural networks for remote sensing image segmentation using transformers

Author: Mohammad Barr

Published in: Neural Computing and Applications

Abstract

Remote sensing image segmentation has become an important task in computer vision because it underpins a wide range of applications. The U-Net architecture has been used extensively for image segmentation and has achieved remarkable results. However, U-Net has several limitations for remote sensing image segmentation, stemming mainly from the limited receptive field of its convolution kernels. The transformer is a deep learning model originally developed for sequence-to-sequence translation; its self-attention mechanism processes many inputs efficiently, weighting relevant information up and irrelevant information down. On its own, however, the transformer lacks low-level features and therefore has limited localization capability. This work presents a novel approach, the U-Net–transformer, which combines the U-Net and transformer models for remote sensing image segmentation. The proposed method outperforms each individual model by combining their complementary strengths. First, the transformer captures global context by encoding tokenized image patches derived from the feature maps of the convolutional neural network (CNN). The encoded feature maps are then upsampled by a decoder and merged with the high-resolution feature maps of the CNN, enabling more accurate localization. The transformer thus serves as an alternative encoder for remote sensing image segmentation, complementing the U-Net's ability to capture localized spatial detail. The proposed U-Net–transformer achieves excellent performance on several benchmark remote sensing segmentation datasets, and the reported results demonstrate the effectiveness of integrating the U-Net and transformer models for this task.
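
To make the encoder–decoder fusion described above concrete, the following minimal sketch (in PyTorch; the framework choice, layer widths, and module names such as UNetTransformer are illustrative assumptions rather than the paper's implementation) shows how the deepest CNN feature map can be flattened into tokens, passed through a transformer encoder for global context, and then upsampled and concatenated with the high-resolution CNN skip features for localization.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Two 3x3 convolutions, as in a standard U-Net stage."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class UNetTransformer(nn.Module):
    """Hypothetical U-Net + transformer hybrid: CNN encoder -> transformer
    bottleneck over tokenized feature-map positions -> U-Net-style decoder
    with skip connections to the high-resolution CNN features."""
    def __init__(self, in_ch=3, num_classes=6, base=64, depth=4, heads=8):
        super().__init__()
        self.enc1 = ConvBlock(in_ch, base)          # high-resolution skip features
        self.enc2 = ConvBlock(base, base * 2)
        self.enc3 = ConvBlock(base * 2, base * 4)   # feature maps to be tokenized
        self.pool = nn.MaxPool2d(2)
        # Each spatial position of the deepest CNN feature map becomes one token.
        layer = nn.TransformerEncoderLayer(d_model=base * 4, nhead=heads,
                                           dim_feedforward=base * 8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
        self.dec2 = ConvBlock(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = ConvBlock(base * 2, base)
        self.head = nn.Conv2d(base, num_classes, 1)

    def forward(self, x):
        s1 = self.enc1(x)                 # (B, base,   H,   W)
        s2 = self.enc2(self.pool(s1))     # (B, 2*base, H/2, W/2)
        f = self.enc3(self.pool(s2))      # (B, 4*base, H/4, W/4)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)         # (B, h*w, c): tokenized patches
        tokens = self.transformer(tokens)             # global context via self-attention
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        d2 = self.dec2(torch.cat([self.up2(f), s2], dim=1))  # merge with CNN skip features
        d1 = self.dec1(torch.cat([self.up1(d2), s1], dim=1))
        return self.head(d1)              # per-pixel class logits
```

Positional embeddings are omitted for brevity; in practice they would be added to the tokens before the transformer, and the CNN encoder would typically be deeper (for example, a pretrained backbone) than the two-stage encoder shown here.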

Metadata
Title
Enhancing the ability of convolutional neural networks for remote sensing image segmentation using transformers
Author
Mohammad Barr
Publication date
26-04-2024
Publisher
Springer London
Published in
Neural Computing and Applications
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-024-09743-6
