Skip to main content
Top

2021 | OriginalPaper | Chapter

Semantic Segmentation of Aerial Images Using Binary Space Partitioning

Authors : Daniel Gritzner, Jörn Ostermann

Published in: KI 2021: Advances in Artificial Intelligence

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The semantic segmentation of aerial images enables many useful applications such as tracking city growth, tracking deforestation, or automatically creating and updating maps. However, gathering enough training data to train a proper model for the automated analysis of aerial images is usually too labor-intensive and thus too expensive in most cases. Therefore, domain adaptation techniques are often necessary to be able to adapt existing models or to transfer knowledge from existing datasets to new unlabeled aerial images. Modern adaptation approaches make use of complex architectures involving many model components, losses and loss weights. These approaches are hard to apply in practice since their hyperparameters are hard to optimize for a given adaptation problem. This complexity is the result of trying to separate domain-invariant elements, e.g., structures and shapes, from domain-specific elements, e.g., textures. In this paper, we present a novel model for semantic segmentation, which not only achieves state-of-the-art performance on aerial images, but also inherently learns separate feature representations for shapes and textures. Our goal is to provide a model which can serve as the basis for future domain adaptation approaches which are simpler but still effective. Through end-to-end training our deep learning model learns to map aerial images to feature representations which can be decoded into binary space partitioning trees, a resolution-independent representation of the semantic segmentation, which can then be rendered into a pixelwise semantic segmentation in a differentiable way.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Literature
1.
go back to reference Chang, W.L., Wang, H.P., Peng, W.H., Chiu, W.C.: All about structure: adapting structural information across domains for boosting semantic segmentation, In: CVPR. pp. 1900–1909 (2019) Chang, W.L., Wang, H.P., Peng, W.H., Chiu, W.C.: All about structure: adapting structural information across domains for boosting semantic segmentation, In: CVPR. pp. 1900–1909 (2019)
2.
go back to reference Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, Atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2017)CrossRef Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, Atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2017)CrossRef
3.
go back to reference Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017) Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:​1706.​05587 (2017)
5.
go back to reference Chen, Z., Tagliasacchi, A., Zhang, H.: BSP-NET: generating compact meshes via binary space partitioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 45–54 (2020) Chen, Z., Tagliasacchi, A., Zhang, H.: BSP-NET: generating compact meshes via binary space partitioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 45–54 (2020)
6.
go back to reference Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017) Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251–1258 (2017)
7.
go back to reference Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) MICCAI 2016. LNCS, vol. 9901, pp. 424–432. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46723-8_49CrossRef Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O.: 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (eds.) MICCAI 2016. LNCS, vol. 9901, pp. 424–432. Springer, Cham (2016). https://​doi.​org/​10.​1007/​978-3-319-46723-8_​49CrossRef
8.
go back to reference Fuchs, H., Kedem, Z.M., Naylor, B.F.: On visible surface generation by a priori tree structures. In: Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques, pp. 124–133 (1980) Fuchs, H., Kedem, Z.M., Naylor, B.F.: On visible surface generation by a priori tree structures. In: Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques, pp. 124–133 (1980)
10.
go back to reference Gkioxari, G., Malik, J., Johnson, J.: Mesh R-CNN. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9785–9795 (2019) Gkioxari, G., Malik, J., Johnson, J.: Mesh R-CNN. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9785–9795 (2019)
11.
go back to reference Gómez, J.A., Patiño, J.E., Duque, J.C., Passos, S.: Spatiotemporal modeling of urban growth using machine learning. Remote Sens. 12(1), 109 (2020)CrossRef Gómez, J.A., Patiño, J.E., Duque, J.C., Passos, S.: Spatiotemporal modeling of urban growth using machine learning. Remote Sens. 12(1), 109 (2020)CrossRef
12.
go back to reference He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017) He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
13.
go back to reference He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
15.
go back to reference Jégou, S., Drozdzal, M., Vazquez, D., Romero, A., Bengio, Y.: The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 11–19 (2017) Jégou, S., Drozdzal, M., Vazquez, D., Romero, A., Bengio, Y.: The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 11–19 (2017)
17.
go back to reference Kirillov, A., He, K., Girshick, R., Rother, C., Dollár, P.: Panoptic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9404–9413 (2019) Kirillov, A., He, K., Girshick, R., Rother, C., Dollár, P.: Panoptic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9404–9413 (2019)
18.
go back to reference Kirillov, A., Wu, Y., He, K., Girshick, R.: PointRend: image segmentation as rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9799–9808 (2020) Kirillov, A., Wu, Y., He, K., Girshick, R.: PointRend: image segmentation as rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9799–9808 (2020)
19.
go back to reference Lee, S.H., Han, K.J., Lee, K., Lee, K.J., Oh, K.Y., Lee, M.J.: Classification of landscape affected by deforestation using high-resolution remote sensing data and deep-learning techniques. Remote Sens. 12(20), 3372 (2020)CrossRef Lee, S.H., Han, K.J., Lee, K., Lee, K.J., Oh, K.Y., Lee, M.J.: Classification of landscape affected by deforestation using high-resolution remote sensing data and deep-learning techniques. Remote Sens. 12(20), 3372 (2020)CrossRef
20.
go back to reference Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR, pp. 3431–3440 (2015) Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR, pp. 3431–3440 (2015)
23.
go back to reference Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: Learning 3d reconstruction in function space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4460–4470 (2019) Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: Learning 3d reconstruction in function space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4460–4470 (2019)
24.
go back to reference Papandreou, G., Chen, L.C., Murphy, K., Yuille, A.: Weakly-and semi-supervised learning of a DCNN for semantic image segmentation. arXiv:1502.02734 (2015) Papandreou, G., Chen, L.C., Murphy, K., Yuille, A.: Weakly-and semi-supervised learning of a DCNN for semantic image segmentation. arXiv:​1502.​02734 (2015)
26.
go back to reference Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: Mobilenetv 2: Inverted residuals and linear bottlenecks. In: CVPR, pp. 4510–4520 (2018) Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: Mobilenetv 2: Inverted residuals and linear bottlenecks. In: CVPR, pp. 4510–4520 (2018)
27.
go back to reference Sanglard, F.: Game Engine Black Book: DOOM v1.1. Sanglard, Fabien (2019) Sanglard, F.: Game Engine Black Book: DOOM v1.1. Sanglard, Fabien (2019)
28.
go back to reference Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:​1409.​1556 (2014)
29.
go back to reference Sofiiuk, K., Barinova, O., Konushin, A.: Adaptis: adaptive instance selection network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7355–7363 (2019) Sofiiuk, K., Barinova, O., Konushin, A.: Adaptis: adaptive instance selection network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7355–7363 (2019)
30.
go back to reference Takikawa, T., Acuna, D., Jampani, V., Fidler, S.: Gated-SCNN: gated shape CNNs for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5229–5238 (2019) Takikawa, T., Acuna, D., Jampani, V., Fidler, S.: Gated-SCNN: gated shape CNNs for semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5229–5238 (2019)
31.
go back to reference Tatarchenko, M., Dosovitskiy, A., Brox, T.: Octree generating networks: efficient convolutional architectures for high-resolution 3D outputs. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2088–2096 (2017) Tatarchenko, M., Dosovitskiy, A., Brox, T.: Octree generating networks: efficient convolutional architectures for high-resolution 3D outputs. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2088–2096 (2017)
32.
go back to reference Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., Jiang, Y.G.: Pixel2Mesh: generating 3D mesh models from single RGB images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 52–67 (2018) Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., Jiang, Y.G.: Pixel2Mesh: generating 3D mesh models from single RGB images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 52–67 (2018)
33.
go back to reference Wu, H., Zhang, J., Huang, K., Liang, K., Yu, Y.: FastFCN: rethinking dilated convolution in the backbone for semantic segmentation. arXiv preprint arXiv:1903.11816 (2019) Wu, H., Zhang, J., Huang, K., Liang, K., Yu, Y.: FastFCN: rethinking dilated convolution in the backbone for semantic segmentation. arXiv preprint arXiv:​1903.​11816 (2019)
34.
go back to reference Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853 (2015) Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:​1505.​00853 (2015)
36.
go back to reference Zhu, Y., et al.: Improving semantic segmentation via video propagation and label relaxation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8856–8865 (2019) Zhu, Y., et al.: Improving semantic segmentation via video propagation and label relaxation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8856–8865 (2019)
Metadata
Title
Semantic Segmentation of Aerial Images Using Binary Space Partitioning
Authors
Daniel Gritzner
Jörn Ostermann
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-87626-5_10

Premium Partner