Published in: International Journal of Computer Vision, Issue 4/2024

18 November 2023

Image and Object Geo-Localization

Authors: Daniel Wilson, Xiaohan Zhang, Waqas Sultani, Safwan Wshah

Abstract

The concept of geo-localization broadly refers to the process of determining an entity’s geographical location, typically in the form of Global Positioning System (GPS) coordinates. The entity of interest may be an image, a sequence of images, a video, a satellite image, or even objects visible within the image. Recently, massive datasets of GPS-tagged media have become available due to smartphones and the internet, and deep learning has risen to prominence and enhanced the performance capabilities of machine learning models. These developments have enabled the rise of image and object geo-localization, which has impacted a wide range of applications such as augmented reality, robotics, self-driving vehicles, road maintenance, and 3D reconstruction. This paper provides a comprehensive survey of visual geo-localization, which may involve either determining the location at which an image has been captured (image geo-localization) or geolocating objects within an image (object geo-localization). We will provide an in-depth study of visual geo-localization including a summary of popular algorithms, a description of proposed datasets, and an analysis of performance results to illustrate the current state of the field.

Zurück zum Zitat Santana, L.V., Brandao, A.S., & Sarcinelli-Filho, M. (2015). Outdoor waypoint navigation with the ar. drone quadrotor. International conference on unmanned aircraft systems (ICUAS) (pp. 303–311). Santana, L.V., Brandao, A.S., & Sarcinelli-Filho, M. (2015). Outdoor waypoint navigation with the ar. drone quadrotor. International conference on unmanned aircraft systems (ICUAS) (pp. 303–311).
Zurück zum Zitat Saputra, M. R. U., Markham, A., & Trigoni, N. (2018). Visual slam and structure from motion in dynamic environments. ACM Computing Surveys (CSUR), 51, 1–36.CrossRef Saputra, M. R. U., Markham, A., & Trigoni, N. (2018). Visual slam and structure from motion in dynamic environments. ACM Computing Surveys (CSUR), 51, 1–36.CrossRef
Zurück zum Zitat Schroff, F., Kalenichenko, D., & Philbin, J. (2015). Facenet: A unified embedding for face recognition and clustering. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR). Schroff, F., Kalenichenko, D., & Philbin, J. (2015). Facenet: A unified embedding for face recognition and clustering. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR).
Zurück zum Zitat Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Gradcam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV). Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Gradcam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV).
Zurück zum Zitat Seo, P. H., Weyand, T., Sim, J., & Han, B. (2018). Cplanet: Enhancing image geolocalization by combinatorial partitioning of maps. In Ferrari, V., Hebert, M., Sminchisescu, C., & Weiss, Y. (Eds.) European conference on computer vision (ECCV) (pp. 544–560). Springer. Seo, P. H., Weyand, T., Sim, J., & Han, B. (2018). Cplanet: Enhancing image geolocalization by combinatorial partitioning of maps. In Ferrari, V., Hebert, M., Sminchisescu, C., & Weiss, Y. (Eds.) European conference on computer vision (ECCV) (pp. 544–560). Springer.
Zurück zum Zitat Shermeyer, J., & Etten, A. V. (2019). The effects of super-resolution on object detection performance in satellite imagery. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019, 1432–1441. Shermeyer, J., & Etten, A. V. (2019). The effects of super-resolution on object detection performance in satellite imagery. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019, 1432–1441.
Zurück zum Zitat Shi, Y., Campbell, D., Yu, X., & Li, H. (2021). Geometry-guided street-view panorama synthesis from satellite imagery. arXiv preprint arXiv:2103.01623. Shi, Y., Campbell, D., Yu, X., & Li, H. (2021). Geometry-guided street-view panorama synthesis from satellite imagery. arXiv preprint arXiv:​2103.​01623.
Zurück zum Zitat Shi, Y., Liu, L., Yu, X., & Li, H. (2019). Spatial-aware feature aggregation for image based cross-view geo-localization. Advances in Neural Information Processing Systems (NeurIPS), 32, 10090–10100. Shi, Y., Liu, L., Yu, X., & Li, H. (2019). Spatial-aware feature aggregation for image based cross-view geo-localization. Advances in Neural Information Processing Systems (NeurIPS), 32, 10090–10100.
Zurück zum Zitat Shi, Y., Yu, X., Campbell, D., & Li, H. (2020, June). Where am i looking at? Joint location and orientation estimation by cross-view matching. In Proceedings of the institute of electrical and electronics engineers (IEEE)/CVF conference on computer vision and pattern recognition (CVPR). Shi, Y., Yu, X., Campbell, D., & Li, H. (2020, June). Where am i looking at? Joint location and orientation estimation by cross-view matching. In Proceedings of the institute of electrical and electronics engineers (IEEE)/CVF conference on computer vision and pattern recognition (CVPR).
Zurück zum Zitat Shi, Y., Yu, X., Wang, S., & Li, H. (2022). Cvlnet: Cross-view semantic correspondence learning for video-based camera localization. arXiv preprint arXiv:2208.03660. Shi, Y., Yu, X., Wang, S., & Li, H. (2022). Cvlnet: Cross-view semantic correspondence learning for video-based camera localization. arXiv preprint arXiv:​2208.​03660.
Zurück zum Zitat Sinkhorn, R., & Knopp, P. (1967). Concerning nonnegative matrices and doubly stochastic matrices. Pacific Journal of Mathematics, 21(2), 343–348.MathSciNetCrossRef Sinkhorn, R., & Knopp, P. (1967). Concerning nonnegative matrices and doubly stochastic matrices. Pacific Journal of Mathematics, 21(2), 343–348.MathSciNetCrossRef
Zurück zum Zitat Suenderhauf, N., Shirazi, S., Jacobson, A., Dayoub, F., Pepperell, E., Upcroft, B., & Milford, M. (2015). Place recognition with convnet landmarks: Viewpoint-robust, condition-robust, training-free. Hsu, D. (Ed.) Robotics: Science and systems xi (pp. 1–10). Robotics: Science and Systems Conference. Suenderhauf, N., Shirazi, S., Jacobson, A., Dayoub, F., Pepperell, E., Upcroft, B., & Milford, M. (2015). Place recognition with convnet landmarks: Viewpoint-robust, condition-robust, training-free. Hsu, D. (Ed.) Robotics: Science and systems xi (pp. 1–10). Robotics: Science and Systems Conference.
Zurück zum Zitat Tang, H., Liu, H., Xu, D., Torr, P. H., & Sebe, N. (2021). Attentiongan: Unpaired image-to-image translation using attention-guided generative adversarial networks. Institute of Electrical and Electronics Engineers (IEEE) Transactions on Neural Networks and Learning Systems (TNNLS). Tang, H., Liu, H., Xu, D., Torr, P. H., & Sebe, N. (2021). Attentiongan: Unpaired image-to-image translation using attention-guided generative adversarial networks. Institute of Electrical and Electronics Engineers (IEEE) Transactions on Neural Networks and Learning Systems (TNNLS).
Zurück zum Zitat Tang, H., Xu, D., Sebe, N.,Wang, Y., Corso, J. J., & Yan, Y. (2019). Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation. Proceedings of the institute of electrical and electronics engineers (IEEE)/CVF conference on computer vision and pattern recognition (CVPR). Tang, H., Xu, D., Sebe, N.,Wang, Y., Corso, J. J., & Yan, Y. (2019). Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation. Proceedings of the institute of electrical and electronics engineers (IEEE)/CVF conference on computer vision and pattern recognition (CVPR).
Zurück zum Zitat Tian, Y., Chen, C., & Shah, M. (2017). Cross-view image matching for geo-localization in urban environments. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR). Tian, Y., Chen, C., & Shah, M. (2017). Cross-view image matching for geo-localization in urban environments. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR).
Zurück zum Zitat Toker, A., Zhou, Q., Maximov, M., & Leal-Taixe, L. (2021). Coming down to earth: Satelliteto- street view synthesis for geo-localization. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (cvpr) (pp. 6488–6497). Toker, A., Zhou, Q., Maximov, M., & Leal-Taixe, L. (2021). Coming down to earth: Satelliteto- street view synthesis for geo-localization. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (cvpr) (pp. 6488–6497).
Zurück zum Zitat Tomešek, J., Čadík, M., & Brejcha, J. (2022). Crosslocate: Cross-modal large-scale visual geolocalization in natural environments using rendered modalities. In 2022 IEEE/CVF winter conference on applications of computer vision (WACV) (pp. 2193–2202). https://doi.org/10.1109/WACV51458.2022.00225 Tomešek, J., Čadík, M., & Brejcha, J. (2022). Crosslocate: Cross-modal large-scale visual geolocalization in natural environments using rendered modalities. In 2022 IEEE/CVF winter conference on applications of computer vision (WACV) (pp. 2193–2202). https://​doi.​org/​10.​1109/​WACV51458.​2022.​00225
Zurück zum Zitat Torii, A., Arandjelović, R., Sivic, J., Okutomi, M., & Pajdla, T. (2015). 24/7 place recognition by view synthesis. In Institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR) (pp. 1808–1817). https://doi.org/10.1109/CVPR.2015.7298790 Torii, A., Arandjelović, R., Sivic, J., Okutomi, M., & Pajdla, T. (2015). 24/7 place recognition by view synthesis. In Institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR) (pp. 1808–1817). https://​doi.​org/​10.​1109/​CVPR.​2015.​7298790
Zurück zum Zitat Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., & Jegou, H. (2021). Training data efficient image transformers distillation through attention. International Conference on Machine Learning, 139, 10347–10357. Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., & Jegou, H. (2021). Training data efficient image transformers distillation through attention. International Conference on Machine Learning, 139, 10347–10357.
Zurück zum Zitat Vishal, K., Jawahar, C. V., & Chari, V. (2015). Accurate localization by fusing images and GPS signals. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR) workshops. Vishal, K., Jawahar, C. V., & Chari, V. (2015). Accurate localization by fusing images and GPS signals. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR) workshops.
Zurück zum Zitat Vo, N., & Hays, J. (2016). Localizing and orienting street views using overhead imagery. Leibe, B., Matas, J., Sebe, N., & Welling, M. (Eds.) European conference on computer vision (ECCV) (pp. 494–509). Springer. Vo, N., & Hays, J. (2016). Localizing and orienting street views using overhead imagery. Leibe, B., Matas, J., Sebe, N., & Welling, M. (Eds.) European conference on computer vision (ECCV) (pp. 494–509). Springer.
Zurück zum Zitat Vo, N., Jacobs, N., & Hays, J. (2017). Revisiting im2gps in the deep learning era. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV). Vo, N., Jacobs, N., & Hays, J. (2017). Revisiting im2gps in the deep learning era. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV).
Zurück zum Zitat Vyas, S., Chen, C., & Shah, M. (2022). Gama: Cross-view video geo-localization. Avidan, S., Brostow, G., Cissé, M., Farinella, G. M., & Hassner, T (Eds.) Computer vision—ECCV 2022 (pp. 440–456). Springer. Vyas, S., Chen, C., & Shah, M. (2022). Gama: Cross-view video geo-localization. Avidan, S., Brostow, G., Cissé, M., Farinella, G. M., & Hassner, T (Eds.) Computer vision—ECCV 2022 (pp. 440–456). Springer.
Zurück zum Zitat Wang, T., Zheng, Z., Yan, C., Zhang, J., Sun, Y., Zheng, B., & Yang, Y. (2021). Each part matters: Local patterns facilitate cross-view geo-localization. Institute of Electrical and Electronics Engineers (IEEE) Transactions on Circuits and Systems for Video Technology (TCSVT), 1-1. https://doi.org/10.1109/TCSVT.2021.3061265 Wang, T., Zheng, Z., Yan, C., Zhang, J., Sun, Y., Zheng, B., & Yang, Y. (2021). Each part matters: Local patterns facilitate cross-view geo-localization. Institute of Electrical and Electronics Engineers (IEEE) Transactions on Circuits and Systems for Video Technology (TCSVT), 1-1. https://​doi.​org/​10.​1109/​TCSVT.​2021.​3061265
Zurück zum Zitat Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., & Change Loy, C. (2018). Esrgan: Enhanced super-resolution generative adversarial networks. Proceedings of the European conference on computer vision (ECCV) workshops Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., & Change Loy, C. (2018). Esrgan: Enhanced super-resolution generative adversarial networks. Proceedings of the European conference on computer vision (ECCV) workshops
Zurück zum Zitat Weyand, T., Kostrikov, I., & Philbin, J. (2016). Planet—photo geolocation with convolutional neural networks. In Leibe, B., Matas, J., Sebe, N., & Welling, W. (Eds.) European conference on computer vision (eccv) (pp. 37–55). Springer. Weyand, T., Kostrikov, I., & Philbin, J. (2016). Planet—photo geolocation with convolutional neural networks. In Leibe, B., Matas, J., Sebe, N., & Welling, W. (Eds.) European conference on computer vision (eccv) (pp. 37–55). Springer.
Zurück zum Zitat Wilson, D., Alshaabi, T., Oort, C. M. V., Zhang, X., Nelson, J., & Wshah, S. (2021). Object tracking and geo-localization from street images. CoRR arXiv:2107.06257. Wilson, D., Alshaabi, T., Oort, C. M. V., Zhang, X., Nelson, J., & Wshah, S. (2021). Object tracking and geo-localization from street images. CoRR arXiv:​2107.​06257.
Zurück zum Zitat Woo, S., Park, J., Lee, J.-Y., & Kweon, I.S. (2018). Cbam: Convolutional block attention module. In Proceedings of the European conference on computer vision (ECCV). Woo, S., Park, J., Lee, J.-Y., & Kweon, I.S. (2018). Cbam: Convolutional block attention module. In Proceedings of the European conference on computer vision (ECCV).
Zurück zum Zitat Workman, S., Souvenir, R., & Jacobs, N. (2015). Wide-area image geolocalization with aerial reference imagery. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV). Workman, S., Souvenir, R., & Jacobs, N. (2015). Wide-area image geolocalization with aerial reference imagery. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV).
Zurück zum Zitat Xia, G.-S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., & Zhang, L. (2018). Dota: A large-scale dataset for object detection in aerial images. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR) (pp. 3974–3983). Xia, G.-S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., & Zhang, L. (2018). Dota: A large-scale dataset for object detection in aerial images. Proceedings of the institute of electrical and electronics engineers (IEEE) conference on computer vision and pattern recognition (CVPR) (pp. 3974–3983).
Zurück zum Zitat Xia, H., Zhao, H., & Ding, Z. (2021). Adaptive adversarial network for source-free domain adaptation. Proceedings of the IEEE/CVF international conference on computer vision (ICCV) (pp. 9010–9019). Xia, H., Zhao, H., & Ding, Z. (2021). Adaptive adversarial network for source-free domain adaptation. Proceedings of the IEEE/CVF international conference on computer vision (ICCV) (pp. 9010–9019).
Zurück zum Zitat Xiao, J., Hays, J., Ehinger, K.A., Oliva, A., & Torralba, A. (2010). Sun database: Large-scale scene recognition from abbey to zoo. Institute of electrical and electronics engineers (IEEE) computer society conference on computer vision and pattern recognition (CVPR) (p. 3485–3492). https://doi.org/10.1109/CVPR.2010.5539970 Xiao, J., Hays, J., Ehinger, K.A., Oliva, A., & Torralba, A. (2010). Sun database: Large-scale scene recognition from abbey to zoo. Institute of electrical and electronics engineers (IEEE) computer society conference on computer vision and pattern recognition (CVPR) (p. 3485–3492). https://​doi.​org/​10.​1109/​CVPR.​2010.​5539970
Zurück zum Zitat Yi, Z., Zhang, H., Tan, P., & Gong, M. (2017). Dualgan: Unsupervised dual learning for image-toimage translation. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV). Yi, Z., Zhang, H., Tan, P., & Gong, M. (2017). Dualgan: Unsupervised dual learning for image-toimage translation. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV).
Zurück zum Zitat You, K., Long, M., Cao, Z., Wang, J., & Jordan, M. I. (2019). Universal domain adaptation. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (cvpr). You, K., Long, M., Cao, Z., Wang, J., & Jordan, M. I. (2019). Universal domain adaptation. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (cvpr).
Zurück zum Zitat Zamir, A. R., & Shah, M. (2010). Accurate image localization based on google maps street view. In Daniilidis, K., Maragos, P., & Paragios, N. (Eds.) European conference on computer vision (eccv) (pp. 255–268). Springer. Zamir, A. R., & Shah, M. (2010). Accurate image localization based on google maps street view. In Daniilidis, K., Maragos, P., & Paragios, N. (Eds.) European conference on computer vision (eccv) (pp. 255–268). Springer.
Zurück zum Zitat Zhai, M., Bessinger, Z., Workman, S., & Jacobs, N. (2017). Predicting ground-level scene layout from aerial imagery. In Proceedings of the ieee conference on computer vision and pattern recognition (cvpr). Zhai, M., Bessinger, Z., Workman, S., & Jacobs, N. (2017). Predicting ground-level scene layout from aerial imagery. In Proceedings of the ieee conference on computer vision and pattern recognition (cvpr).
Zurück zum Zitat Zhang, H., Berg, A., Maire, M., & Malik, J. (2006). Svm-knn: Discriminative nearest neighbor classification for visual category recognition. Institute of electrical and electronics engineers (IEEE) computer society conference on computer vision and pattern recognition (CVPR) (Vol. 2, pp. 2126–2136). https://doi.org/10.1109/CVPR.2006.301 Zhang, H., Berg, A., Maire, M., & Malik, J. (2006). Svm-knn: Discriminative nearest neighbor classification for visual category recognition. Institute of electrical and electronics engineers (IEEE) computer society conference on computer vision and pattern recognition (CVPR) (Vol. 2, pp. 2126–2136). https://​doi.​org/​10.​1109/​CVPR.​2006.​301
Zurück zum Zitat Zhang, X., Sultani, W., & Wshah, S. (2023). Cross-view image sequence geo-localization. Proceedings of the IEEE/CVF winter conference on applications of computer vision (WACV) (pp. 2914–2923). Zhang, X., Sultani, W., & Wshah, S. (2023). Cross-view image sequence geo-localization. Proceedings of the IEEE/CVF winter conference on applications of computer vision (WACV) (pp. 2914–2923).
Zurück zum Zitat Zheng, Z., Wei, Y., & Yang, Y. (2020). University- 1652: A multi-view multi-source benchmark for drone-based geo-localization. In Proceedings of the 28th acm international conference on multimedia (p. 1395–1403). Association for Computing Machinery. https://doi.org/10.1145/3394171.3413896 Zheng, Z., Wei, Y., & Yang, Y. (2020). University- 1652: A multi-view multi-source benchmark for drone-based geo-localization. In Proceedings of the 28th acm international conference on multimedia (p. 1395–1403). Association for Computing Machinery. https://​doi.​org/​10.​1145/​3394171.​3413896
Zurück zum Zitat Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., & Oliva, A. (2014). Learning deep features for scene recognition using places database. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., & Weinberger, K. Q. (Eds.) Advances in neural information processing systems (neurips) (Vol. 27). Curran Associates, Inc. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., & Oliva, A. (2014). Learning deep features for scene recognition using places database. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., & Weinberger, K. Q. (Eds.) Advances in neural information processing systems (neurips) (Vol. 27). Curran Associates, Inc.
Zurück zum Zitat Zhou, B., Liu, L., Oliva, A., & Torralba, A. (2014). Recognizing city identity via attribute analysis of geo-tagged images. In Fleet, D., Pajdla, T., Schiele, B., & Tuytelaars, T. (Eds.) European conference on computer vision (eccv) (pp. 519–534). Springer. Zhou, B., Liu, L., Oliva, A., & Torralba, A. (2014). Recognizing city identity via attribute analysis of geo-tagged images. In Fleet, D., Pajdla, T., Schiele, B., & Tuytelaars, T. (Eds.) European conference on computer vision (eccv) (pp. 519–534). Springer.
Zurück zum Zitat Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV). Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the institute of electrical and electronics engineers (IEEE) international conference on computer vision (ICCV).
Zurück zum Zitat Zhu, J.-Y., Zhang, R., Pathak, D., Darrell, T., Efros, A. A., Wang, O., & Shechtman, E. (2017). Toward multimodal image-to-image translation. In Guyon, I. et al. (Eds.) Advances in neural information processing systems (Vol. 30, pp. 465–476). Curran Associates, Inc. Zhu, J.-Y., Zhang, R., Pathak, D., Darrell, T., Efros, A. A., Wang, O., & Shechtman, E. (2017). Toward multimodal image-to-image translation. In Guyon, I. et al. (Eds.) Advances in neural information processing systems (Vol. 30, pp. 465–476). Curran Associates, Inc.
Zurück zum Zitat Zhu, S., Shah, M., & Chen, C. (2022). Transgeo: Transformer is all you need for cross view image geo-localization. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 1162–1171). Zhu, S., Shah, M., & Chen, C. (2022). Transgeo: Transformer is all you need for cross view image geo-localization. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 1162–1171).
Zurück zum Zitat Zhu, S., Yang, T., & Chen, C. (2021a). Revisiting street-to-aerial view image geo-localization and orientation estimation. In Proceedings of the institute of electrical and electronics engineers (IEEE)/cvf winter conference on applications of computer vision (wacv) (pp. 756–765). Zhu, S., Yang, T., & Chen, C. (2021a). Revisiting street-to-aerial view image geo-localization and orientation estimation. In Proceedings of the institute of electrical and electronics engineers (IEEE)/cvf winter conference on applications of computer vision (wacv) (pp. 756–765).
Zurück zum Zitat Zhu, S., Yang, T., & Chen, C. (2021b). Vigor: Cross-view image geo-localization beyond oneto- one retrieval. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 3640–3649). Zhu, S., Yang, T., & Chen, C. (2021b). Vigor: Cross-view image geo-localization beyond oneto- one retrieval. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 3640–3649).
Metadata
Title
Image and Object Geo-Localization
Authors
Daniel Wilson
Xiaohan Zhang
Waqas Sultani
Safwan Wshah
Publication date
18 November 2023
Publisher
Springer US
Published in
International Journal of Computer Vision, Issue 4/2024
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-023-01942-3
