2021 | OriginalPaper | Book chapter

Semantic Segmentation of Aerial Images Using Binary Space Partitioning

Written by: Daniel Gritzner, Jörn Ostermann

Published in: KI 2021: Advances in Artificial Intelligence

Publisher: Springer International Publishing

Abstract

The semantic segmentation of aerial images enables many useful applications such as tracking city growth, tracking deforestation, or automatically creating and updating maps. However, gathering enough training data to train a proper model for the automated analysis of aerial images is usually too labor-intensive and thus too expensive. Therefore, domain adaptation techniques are often necessary to adapt existing models or to transfer knowledge from existing datasets to new unlabeled aerial images. Modern adaptation approaches use complex architectures involving many model components, losses, and loss weights. These approaches are hard to apply in practice because their hyperparameters are hard to optimize for a given adaptation problem. This complexity results from trying to separate domain-invariant elements, e.g., structures and shapes, from domain-specific elements, e.g., textures. In this paper, we present a novel model for semantic segmentation that not only achieves state-of-the-art performance on aerial images but also inherently learns separate feature representations for shapes and textures. Our goal is to provide a model that can serve as the basis for future domain adaptation approaches that are simpler but still effective. Through end-to-end training, our deep learning model learns to map aerial images to feature representations that can be decoded into binary space partitioning trees, a resolution-independent representation of the semantic segmentation. These trees can then be rendered into a pixelwise semantic segmentation in a differentiable way.
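
To make the idea of rendering a binary space partitioning (BSP) tree into a pixelwise segmentation more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation, since the abstract does not specify the rendering. It assumes that each inner node holds a line given by a normal and an offset, that each leaf holds a class distribution, and that the hard "which side of the line" test is relaxed with a sigmoid so the output varies smoothly with the line parameters. The names `render_point`, `render`, `argmax_labels`, `EXAMPLE_TREE`, and the `sharpness` parameter are all hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def render_point(node, x, y, sharpness=50.0):
    """Return the class distribution at point (x, y) in normalized [0, 1] coordinates.

    Assumed node layout (illustrative only):
      leaf:  {"probs": [p_0, ..., p_k]}
      inner: {"normal": (nx, ny), "offset": d, "left": node, "right": node}
    """
    if "probs" in node:
        return list(node["probs"])
    nx, ny = node["normal"]
    # Signed distance to the partitioning line n . p = d.
    signed = nx * x + ny * y - node["offset"]
    # Soft membership of the positive half-plane; a smooth stand-in for a hard test.
    w_right = sigmoid(sharpness * signed)
    left = render_point(node["left"], x, y, sharpness)
    right = render_point(node["right"], x, y, sharpness)
    # Blend the two subtrees; the result is smooth in normal and offset.
    return [(1.0 - w_right) * l + w_right * r for l, r in zip(left, right)]


def render(tree, height, width, sharpness=50.0):
    """Sample the tree at pixel centers; any output resolution works."""
    return [
        [render_point(tree, (c + 0.5) / width, (r + 0.5) / height, sharpness)
         for c in range(width)]
        for r in range(height)
    ]


def argmax_labels(rendered):
    """Turn per-pixel class distributions into hard class labels."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row] for row in rendered]


# Hypothetical two-level tree: the root splits at x = 0.5, and the right half
# is split again at y = 0.5. Leaves carry one-hot class distributions.
EXAMPLE_TREE = {
    "normal": (1.0, 0.0), "offset": 0.5,
    "left": {"probs": [1.0, 0.0, 0.0]},
    "right": {
        "normal": (0.0, 1.0), "offset": 0.5,
        "left": {"probs": [0.0, 1.0, 0.0]},
        "right": {"probs": [0.0, 0.0, 1.0]},
    },
}

labels_4x4 = argmax_labels(render(EXAMPLE_TREE, 4, 4))
labels_8x8 = argmax_labels(render(EXAMPLE_TREE, 8, 8))
```

The same tree is rendered at two resolutions without any change to its parameters, which is one way to read "resolution-independent". Because every step is a smooth function of the line parameters and leaf distributions, the same computation written in an automatic differentiation framework would presumably allow gradients to flow from a pixelwise loss back to the tree parameters; that step is not shown here.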

Metadata
Title
Semantic Segmentation of Aerial Images Using Binary Space Partitioning
Written by
Daniel Gritzner
Jörn Ostermann
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-87626-5_10