
Multimedia Systems

09-05-2023 | Regular Paper

A cross-view geo-localization method guided by relation-aware global attention

Authors: Jing Sun, Rui Yan, Bing Zhang, Bing Zhu, Fuming Sun

Published in: Multimedia Systems | Issue 4/2023

Abstract

Cross-view geo-localization matches a query image against images of the same geographic location captured from different platforms. Most existing methods do not adequately consider the effect of image structural information on cross-view geo-localization, so the extracted features cannot fully characterize the image, which reduces localization accuracy. To address this, this paper proposes a cross-view geo-localization method guided by relation-aware global attention, which captures rich global structural information by integrating the attention mechanism into the feature extraction network, thereby improving the representational ability of the features. In addition, considering the important role of semantic and context information in geo-localization, a joint training structure with parallel global and local branches is designed to mine multi-scale context features for image matching, which further improves the accuracy of cross-view geo-localization. Quantitative and qualitative experimental results on the University-1652, CVUSA, and CVACT datasets show that the proposed algorithm outperforms other advanced methods in recall accuracy (Recall) and image retrieval average precision (AP).
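The two evaluation measures named in the abstract, Recall and AP, can be illustrated with a minimal sketch. It assumes a generic retrieval setup in which gallery images are ranked by cosine similarity to the query feature; the function names, the toy features, and the labels below are hypothetical and are not taken from the paper. The AP here follows the common textbook definition, and the official evaluation scripts of the benchmark datasets may compute it slightly differently.

```python
import numpy as np


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted by descending cosine similarity to the query."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    return np.argsort(-(g @ q), kind="stable")


def recall_at_k(ranked_labels, query_label, k):
    """1.0 if a true match appears among the top-k ranked items, else 0.0."""
    return float(np.any(ranked_labels[:k] == query_label))


def average_precision(ranked_labels, query_label):
    """Mean of the precision values at each rank where a true match occurs."""
    hits = (ranked_labels == query_label).astype(float)
    if hits.sum() == 0:
        return 0.0
    precision_at_rank = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_rank * hits).sum() / hits.sum())


# Toy data (hypothetical): four gallery features with location labels.
gallery_feats = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0], [0.8, 0.6]])
gallery_labels = np.array([7, 3, 7, 5])
query_feat = np.array([0.0, 1.0])
query_label = 7

order = rank_gallery(query_feat, gallery_feats)
ranked = gallery_labels[order]

print(recall_at_k(ranked, query_label, k=1))   # 1.0
print(average_precision(ranked, query_label))  # 0.75
```

In this toy case the two true matches land at ranks 1 and 4, so the precision values at those ranks are 1 and 0.5, giving an AP of 0.75. In practice both measures are averaged over all queries in the test set.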




Title: A cross-view geo-localization method guided by relation-aware global attention
Authors: Jing Sun, Rui Yan, Bing Zhang, Bing Zhu, Fuming Sun
Publication date: 09-05-2023
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems / Issue 4/2023
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-023-01101-1

