2015 | Original Paper | Book Chapter

3. Large-Scale Image Geolocalization

Written by: James Hays, Alexei A. Efros

Published in: Multimodal Location Estimation of Videos and Images

Publisher: Springer International Publishing

Abstract

In this chapter, we explore the task of global image geolocalization: estimating where on the Earth a photograph was captured. We examine variants of the “im2gps” algorithm using millions of “geotagged” Internet photographs as training data. We first discuss a simple-to-understand nearest-neighbor baseline. Next, we introduce a lazy-learning approach with more sophisticated features that doubles the performance of the original “im2gps” algorithm. Beyond quantifying geolocalization accuracy, we also analyze (a) how the nonuniform distribution of training data impacts the algorithm, (b) how performance compares to baselines such as random guessing and land-cover recognition, and (c) whether geolocalization is simply landmark or “instance-level” recognition at a large scale. We also show that geolocation estimates can provide the basis for image understanding tasks such as population density estimation or land-cover estimation. This work was originally described, in part, in “im2gps” [9], which was the first attempt at global geolocalization using Internet-derived training data.
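
To make the idea of a nearest-neighbor baseline concrete, the following is a minimal sketch of what such a geolocalizer could look like. It is not the chapter's implementation: the three-dimensional feature vectors, the tiny database, the Euclidean feature distance, and the function names `nearest_neighbor_geolocate` and `haversine_km` are all illustrative assumptions.

```python
import math

def nearest_neighbor_geolocate(query_feature, database):
    """Return the (lat, lon) tag of the database photo closest to the query in feature space."""
    # Each database item is (feature_vector, (lat, lon)); Euclidean distance is an assumption.
    best = min(database, key=lambda item: math.dist(query_feature, item[0]))
    return best[1]

def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))

# Hypothetical geotagged "photos": made-up features paired with real-world coordinates.
database = [
    ((0.9, 0.1, 0.3), (48.8584, 2.2945)),     # Paris
    ((0.2, 0.8, 0.5), (40.6892, -74.0445)),   # New York
    ((0.4, 0.4, 0.9), (-33.8568, 151.2153)),  # Sydney
]

query = (0.85, 0.15, 0.35)                    # hypothetical query feature
estimate = nearest_neighbor_geolocate(query, database)
error_if_truth_were_new_york = haversine_km(estimate, database[1][1])
```

In this toy setup the query's feature vector lies closest to the first entry, so the estimate is that photo's geotag. The great-circle helper shows one way a geolocation error could be measured against a ground-truth position.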

Footnotes
1
This value was calculated by counting the number of database photos close enough to each query in the test set. Alternatively, each geolocation guess has an area of 126,663 km² and the land area of the Earth is 148,940,000 km², suggesting that a truly uniform test set would have a chance-guessing accuracy of 0.084 %. Chance is higher for our test set because our database (and thus our test set) contains no photographs in some regions of Siberia, the Sahara, and Antarctica.
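
The arithmetic behind the uniform-chance figure in this footnote can be checked with a short sketch. The variable names are illustrative. The 200 km radius in the second calculation is an assumption and is not stated in the footnote; it is included only because a circle of that radius has an area close to the quoted one.

```python
import math

guess_area_km2 = 126_663            # area covered by one guess, as quoted in the footnote
earth_land_area_km2 = 148_940_000   # land area of the Earth, as quoted in the footnote

# Chance accuracy for a uniformly distributed test set: fraction of land covered by one guess.
chance_percent = 100 * guess_area_km2 / earth_land_area_km2
print(f"quoted area:   {chance_percent:.3f} %")

# Assumption: treat a guess as a circle of radius 200 km on a locally flat surface.
assumed_radius_km = 200
circle_area_km2 = math.pi * assumed_radius_km ** 2
circle_percent = 100 * circle_area_km2 / earth_land_area_km2
print(f"200 km circle: {circle_percent:.3f} %")
```

With the quoted area the ratio comes out at about 0.085 %, while the assumed 200 km circle (about 125,664 km²) gives about 0.084 %. Either way the uniform-chance level is below one tenth of a percent.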
 
References
1. G. Baatz, O. Saurer, K. Köser, M. Pollefeys, Large scale visual geo-localization of images in mountainous terrain, in Proceedings of the 12th European Conference on Computer Vision - Volume Part II (2012), pp. 517–530
2. M. Bar, The proactive brain: using analogies and associations to generate predictions. Trends Cogn. Sci. 11(7), 280–289 (2007)
3. C. Atkeson, A. Moore, S. Schaal, Locally weighted learning. AI Review 11, 11–73 (1997)
4. O. Chum, J. Philbin, J. Sivic, M. Isard, A. Zisserman, Total recall: Automatic query expansion with a generative feature model for object retrieval, in Proceedings of ICCV, 2007
5. D. Comaniciu, P. Meer, Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002)
6. D.J. Crandall, L. Backstrom, D. Huttenlocher, J. Kleinberg, Mapping the world’s photos, in WWW ’09: Proceedings of the 18th International Conference on World Wide Web, pp. 761–770, 2009
7. J. Hays, A. Efros, Where in the world? Human and computer geolocation of images, in Vision Sciences Society Meeting, 2009
8. J. Hays, A.A. Efros, Scene completion using millions of photographs, in ACM Transactions on Graphics (SIGGRAPH 2007), 26(3), 2007
9. J. Hays, A.A. Efros, im2gps: estimating geographic information from a single image, in CVPR, 2008
10. D. Hoiem, A. Efros, M. Hebert, Recovering surface layout from an image. Int. J. Comput. Vis. 75(1), 151–172 (2007)
11. N. Jacobs, S. Satkin, N. Roman, R. Speyer, R. Pless, Geolocating static cameras, in Proceedings, ICCV, 2007
12. E. Kalogerakis, O. Vesselova, J. Hays, A.A. Efros, A. Hertzmann, Image sequence geolocation with human travel priors, in Proceedings of the IEEE International Conference on Computer Vision (ICCV ’09) (2009)
13. J. Kosecka, W. Zhang, Video compass, in ECCV ’02: Proceedings of the 7th European Conference on Computer Vision-Part IV, 2002, pp. 476–490
14. J.-F. Lalonde, D. Hoiem, A.A. Efros, C. Rother, J. Winn, A. Criminisi, Photo clip art. ACM Transactions on Graphics (SIGGRAPH 2007), vol. 26(3) (August 2007)
15. S. Lazebnik, C. Schmid, J. Ponce, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories, in CVPR (2006)
16. L.-J. Li, L. Fei-Fei, What, where and who? Classifying events by scene and object recognition, in Proceedings, ICCV (2007)
17. T.-Y. Lin, S. Belongie, J. Hays, Cross-view image geolocalization, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Portland, OR, June 2013)
18. D. Lowe, Object recognition from local scale-invariant features. ICCV 2, 1150–1157 (1999)
19. J. Luo, D. Joshi, J. Yu, A. Gallagher, Geotagging in multimedia and computer vision: a survey. Multimed. Tools Appl. 51, 187–211 (2011)
20. D. Martin, C. Fowlkes, D. Tal, J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in Proceedings ICCV (July 2001)
21. J. Matas, O. Chum, M. Urban, T. Pajdla, Robust wide-baseline stereo from maximally stable extremal regions. Image Vis. Comput. 22(10), 761–767 (2004)
22. A. Oliva, A. Torralba, Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)
23. A. Oliva, A. Torralba, Building the gist of a scene: The role of global image features in recognition, in Visual Perception, Progress in Brain Research, 2006, vol. 155
24. J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, in CVPR (2007)
25. J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman, Lost in quantization: Improving particular object retrieval in large scale image databases, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2008)
26. T. Quack, B. Leibe, L. Van Gool, World-scale mining of objects and events from community photo collections, in CIVR ’08: Proceedings of the 2008 International Conference on Content-based Image and Video Retrieval (2008)
27. L.W. Renninger, J. Malik, When is scene recognition just texture recognition? Vis. Res. 44, 2301–2311 (2004)
28. I. Simon, N. Snavely, S.M. Seitz, Scene summarization for online image collections, in Proceedings, ICCV (2007)
29. J. Sivic, A. Zisserman, Video Google: A text retrieval approach to object matching in videos. ICCV 2, 1470–1477 (2003)
30. N. Snavely, S.M. Seitz, R. Szeliski, Photo tourism: exploring photo collections in 3D. ACM Trans. Graph. 25(3), 835–846 (2006)
32. W. Thompson, C. Valiquette, B. Bennett, K. Sutherland, Geometric reasoning for map-based localization. Spatial Cogn. Comput. 1(3), 291–321 (1999)
33. A. Torralba, R. Fergus, W.T. Freeman, 80 million tiny images: a large dataset for non-parametric object and scene recognition. IEEE PAMI 30(11), 1958–1970 (2008)
34. J. Vogel, B. Schiele, Semantic modeling of natural scenes for content-based image retrieval. Int. J. Comput. Vis. 72(2), 133–157 (2007)
35. J. Xiao, J. Hays, K. Ehinger, A. Oliva, A. Torralba, SUN database: Large-scale scene recognition from abbey to zoo, in CVPR (2010)
36. H. Zhang, A.C. Berg, M. Maire, J. Malik, SVM-KNN: Discriminative nearest neighbor classification for visual category recognition, in CVPR ’06 (2006)
37. W. Zhang, J. Kosecka, Image based localization in urban environments, in 3DPVT ’06 (2006)
38. Y. Zheng, M. Zhao, Y. Song, H. Adam, U. Buddemeier, A. Bissacco, F. Brucher, T.-S. Chua, H. Neven, Tour the world: building a web-scale landmark recognition engine, in CVPR (2009)
Metadata
Title
Large-Scale Image Geolocalization
Written by
James Hays
Alexei A. Efros
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-09861-6_3