2021 | OriginalPaper | Book chapter

R2SN: Refined Semantic Segmentation Network of City Remote Sensing Image

Authors: Chenglong Wang, Dong Wu, Jie Nie, Lei Huang

Published in: Pattern Recognition. ICPR International Workshops and Challenges

Publisher: Springer International Publishing

Abstract

Semantic segmentation is a key problem in remote sensing image analysis and is especially useful for city-scale vehicle detection. However, remote sensing images contain many object types and imbalanced classes, which poses a major challenge and leaves many traditional segmentation approaches with unsatisfactory results. In this paper, we propose a novel Refined Semantic Segmentation Network (R2SN), which applies the classic encoder-decoder framework to the segmentation problem. We add convolution layers to both the encoder and the decoder so that the network captures more local information during training, a design better suited to high-resolution remote sensing images. More specifically, the classic Focal loss is introduced into the network; it guides the model to focus on the difficult objects in remote sensing images and handles the multi-object segmentation problem effectively. In addition, the classic Hinge loss is used to increase the distinction between classes, which yields more refined segmentation results. We validate our approach on the International Society for Photogrammetry and Remote Sensing (ISPRS) semantic segmentation benchmark dataset. The evaluation and comparison results show that our method outperforms state-of-the-art remote sensing image segmentation methods in terms of mean intersection over union (MIoU), pixel accuracy, and F1-score.
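The abstract names two losses and three metrics without giving formulas. The sketch below illustrates the standard textbook forms for a single pixel, using only the Python standard library. It is not the authors' implementation: the function names, the `gamma` and `margin` defaults, the Crammer-Singer style sum for the hinge term, and the macro-averaging of IoU and F1 over classes are all assumptions, and the abstract does not say how R2SN weights or combines the two losses.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one pixel: -(1 - p_t)^gamma * log(p_t).

    Confidently correct pixels (p_t near 1) are down-weighted, so hard
    pixels dominate the loss. gamma=0 reduces to plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def hinge_loss(logits, target, margin=1.0):
    """Multi-class hinge loss for one pixel (assumed summed form).

    Penalises every wrong class whose score comes within `margin`
    of the true-class score.
    """
    true_score = logits[target]
    return sum(
        max(0.0, margin + score - true_score)
        for k, score in enumerate(logits)
        if k != target
    )


def confusion_matrix(preds, labels, num_classes):
    """cm[t][p] counts pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, labels):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Pixel accuracy, mean IoU and mean F1 from a confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    ious, f1s = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[j][k] for j in range(n)) - tp
        if tp + fp + fn == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return {
        "pixel_accuracy": correct / total,
        "miou": sum(ious) / len(ious),
        "f1": sum(f1s) / len(f1s),
    }
```

In a training loop these per-pixel losses would be averaged over all pixels of a batch, and the metrics would be computed from a confusion matrix accumulated over the whole test set.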

Footnotes
1
image ID: 1, 3, 11, 13, 15, 17, 21, 26, 28, 32, 34, 37.

2
image ID: 5, 7, 23, 30.

Metadata
Title
R2SN: Refined Semantic Segmentation Network of City Remote Sensing Image
Authors
Chenglong Wang
Dong Wu
Jie Nie
Lei Huang
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-68821-9_34