2024 | Original Paper | Book Chapter

Review on Vision Transformer for Satellite Image Classification

Authors: Himanshu Srivastava, Akansha Singh, Anuj Kumar Bharti

Published in: Proceedings of Third International Conference on Computing and Communication Networks

Publisher: Springer Nature Singapore

Abstract

Satellite image classification has been a topic of interest in the research community for the last two decades. Neural network researchers have proposed increasingly advanced models for this problem year after year. Transformers have received a great deal of attention since 2017, and the vision transformer is a transformer-based model for computer vision problems. This paper reviews the application of vision transformers (and their variants) to satellite image classification. The article describes in detail how the vision transformer works and traces its chronological development. The review finds that the vision transformer is suitable when a sufficiently large dataset is available. For small datasets, convolutional network-based models perform well compared to vision transformers, but pre-trained vision transformers used with transfer learning have produced better results than convolutional models. Because the model is fairly new and research on applying vision transformers to satellite data is still limited, the review suggests that there is considerable potential for this model to produce good results. The article also explores the ongoing challenges and research opportunities in vision transformer development for satellite image classification.
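
As a rough illustration of the input step that vision transformers rely on, the sketch below splits an image into fixed-size, non-overlapping patches and flattens each one into a token vector, in the spirit of the 16 × 16 patches named in the title of Dosovitskiy et al. It is not taken from the chapter: the function name `image_to_patches`, the 64 × 64 RGB tile, and the use of NumPy are assumptions made for illustration, and the learned linear projection and position embeddings that follow in a full model are omitted.

```python
import numpy as np


def image_to_patches(image: np.ndarray, patch_size: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C).
    Hypothetical helper for illustration only.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, P, cols, P, C) -> (rows, cols, P, P, C) -> (rows * cols, P * P * C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


# Assumed example input: random values stand in for a real satellite tile.
rng = np.random.default_rng(0)
tile = rng.random((64, 64, 3))
tokens = image_to_patches(tile)
print(tokens.shape)  # (16, 768): a 4 x 4 grid of patches, each 16 * 16 * 3 values
```

Each row of `tokens` corresponds to one patch in row-major order, so the sequence length grows with image area. This is one reason patch size is a relevant design choice for large satellite scenes.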

References
1.
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16 × 16 words: transformers for image recognition at scale (2020). https://doi.org/10.48550/arXiv.2010.11929
2.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need (2017). arXiv:1706.03762
6.
Chen, M., Radford, A., Child, R., Wu, J., Jun, H., Luan, D., Sutskever, I.: Generative pretraining from pixels. In: International Conference on Machine Learning, pp. 1691–1703. PMLR (2020, November)
11.
Kaselimi, M., Voulodimos, A., Daskalopoulos, I., Doulamis, N., Doulamis, A.: A vision transformer model for convolution-free multilabel classification of satellite imagery in deforestation monitoring. In: IEEE Transactions on Neural Networks and Learning Systems (2022). https://doi.org/10.1109/TNNLS.2022.3144791
18.
Nabi, M., Maggiolo, L., Moser, G., Serpico, S.B.: A CNN-transformer knowledge distillation for remote sensing scene classification. In: IGARSS 2022–2022 IEEE International Geoscience and Remote Sensing Symposium, pp. 663–666. IEEE (2022). https://doi.org/10.3390/rs13204143
Metadata
Copyright year
2024
DOI
https://doi.org/10.1007/978-981-97-0892-5_16