Top

International Journal of Computer Vision

Published in:

01-07-2016

Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

Authors: Petr Gronát, Josef Sivic, Guillaume Obozinski, Tomas Pajdla

Published in: International Journal of Computer Vision | Issue 3/2016

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

The aim of this work is to localize a query photograph by finding other images depicting the same place in a large geotagged image database. This is a challenging task due to changes in viewpoint, imaging conditions and the large size of the image database. The contribution of this work is two-fold. First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database in a similar manner to per-exemplar SVMs in object recognition. Second, as only one or a few positive training examples are available for each location, we propose two methods to calibrate all the per-location SVM classifiers without the need for additional positive training data. The first method relies on p-values from statistical hypothesis testing and uses only the available negative training data. The second method performs an affine calibration by appropriately normalizing the learnt classifier hyperplane and does not need any additional labelled training data. We test the proposed place recognition method with the bag-of-visual-words and Fisher vector image representations suitable for large scale indexing. Experiments are performed on three datasets: 25,000 and 55,000 geotagged street view images of Pittsburgh, and the 24/7 Tokyo benchmark containing 76,000 images with varying illumination conditions. The results show improved place recognition accuracy of the learnt image representation over direct matching of raw image descriptors.

previous article Correctness Prediction, Accuracy Improvement and Generalization of Stereo Matching Using Supervised Learning

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Available only for authorised users

The notion most commonly used in statistics is in fact the p value. The p value associated to a score is the quantity \(\alpha (s)\) defined by \(\alpha (s)=1-F_0(s)\); so the more significant the score is, the closer to 1 the cdf value is, and the closer to 0 the p-value is. To keep the presentation simple, we avoid the formulation in terms of p-values and we only talk of the probabilistic calibrated values obtained from the cdf \(F_0\).

When the calibration by re-normalization method is used the \(\mathbf {w}_j\) contains the re-normalized weights and the bias \(b_j\) is zero. However, to cover both calibration methods we include the bias term in the derivations in this section.

Agarwal, S., Snavely, N., Simon, I., Seitz, S. & Szeliski, R. (2009). Building Rome in a day. In ICCV (pp. 72–79).

Arandjelović, R. & Zisserman, A. (2012). Three things everyone should know to improve object retrieval. In IEEE PAMI.

Aubry, M., Maturana, D., Efros, A., Russell, B. & Sivic, J. (2014). Seeing 3D chairs: exemplar part-based 2D–3D alignment using a large dataset of CAD models. In CVPR.

Aubry, M., Russell, B. & Sivic, J. (2014) Painting-to-3D model alignment via discriminative visual elements. ACM Transactions on Graphics.

Bay, H., Tuytelaars, T. & Van Gool, L. (2006). SURF: Speeded up robust features. In ECCV.

Cao, S. & Snavely, N. (2013). Graph-based discriminative learning for location recognition. In IEEE Conference on CVPR (pp. 700–707).

Casella, G. & Berger, R. (2001). Statistical inference.

Chen, D., Baatz, G., Köser, Tsai, S., Vedantham, R., Pylvanainen, T., Roimela, K., Chen, X., Bach, J., Pollefeys, M., Girod, B. & Grzeszczuk, R. (2011). City-scale landmark identification on mobile devices. In CVPR.

Chum, O., Philbin, J., Sivic, J., Isard, M. & Zisserman, A. (2007). Total recall: Automatic query expansion with a generative feature model for object retrieval. In ICCV.

Csurka, G., Bray, C., Dance, C., & Fan, L. (2004). Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, ECCV (pp. 1–22).

Cummins, M. & Newman, P. (2009). Highly scalable appearance-only SLAM - FAB-MAP 2.0. In Proceedings of Robotics: Science and Systems, Seattle, USA.

Dalal, N. & Triggs, B. (2005). Histogram of oriented gradients for human detection. In CVPR.

Doersch, C., Gupta, A. & Efros, A.A. (2013). Mid-level visual element discovery as discriminative mode seeking. In NIPS.

Doersch, C., Singh, S., Gupta, A., Sivic, J., & Efros, A. A. (2012). What makes Paris look like Paris? SIGGRAPH, 31(4), 101.

Fan, R. E., Chang, K. W., Hsieh, C. J., Wang, X. R., & Lin, C. J. (2008). LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9, 1871–1874.MATH

Gebel, M., & Weihs, C. (2007). Calibrating classifier scores into probabilities. Advances in Data Analysis (pp. 141–148). Berlin: Springer.CrossRef

Gharbi, M., Malisiewicz, T., Paris, S., & Durand, F. (2012). A Gaussian approximation of feature space for fast image similarity. Technical Report, MIT.

Google: ICMLA 2011 streetview recognition challenge. http://www.icmla-conference.org/icmla11/challenge.htm.

Gronát, P.: Project webpage: Learning and calibrating per-location classifiers for visual place recognition. http://www.di.ens.fr/willow/research/perlocation/.

Gronát, P. (2015). streetget. http://www.di.ens.fr/willow/research/streetget/.

Gronát, P., Obozinski, G., Sivic, J. & Pajdla, T. (2013). Learning and calibrating per-location classifiers for visual place recognition. In CVPR.

Hariharan, B., Malik, J. & Ramanan, D. (2012). Discriminative decorrelation for clustering and classification. In ECCV.

Hays, J. & Efros, A.A. (2008). im2gps: estimating geographic information from a single image. In CVPR.

Irschara, A., Zach, C., Frahm, J.M. & Bischof, H. (2009). From structure-from-motion point clouds to fast location recognition. In CVPR.

Jégou, H. & Chum, O. (2012). Negative evidences and co-occurences in image retrieval: The benefit of PCA and whitening. In ECCV (pp. 774–787).

Jegou, H., Douze, M., & Schmid, C. (2011). Product quantization for nearest neighbor search. IEEE Transactions on PAMI, 33(1), 117–128.CrossRef

Jégou, H., Perronnin, F., Douze, M., Sánchez, J., Pérez, P., & Schmid, C. (2012). Aggregating local image descriptors into compact codes. IEEE Transactions on PAMI, 34, 1704–1716.CrossRef

Kalogerakis, E., Vesselova, O., Hays, J., Efros, A. & Hertzmann, A. (2009). Image sequence geolocation with human travel priors. In ICCV (pp. 253–260).

Klingner, B., Martin, D. & Roseborough, J. (2013). Street view motion-from-structure-from-motion. In ICCV.

Knopp, J., Sivic, J. & Pajdla, T. (2010). Avoidng confusing features in place recognition. In ECCV.

Li, Y., Crandall, D. & Huttenlocher, D. (2009). Landmark classification in large-scale image collections. In ICCV.

Li, Y., Snavely, N. & Huttenlocher, D. (2010). Location recognition using prioritized feature matching. In ECCV.

Li, Y., Snavely, N., Huttenlocher, D. & Fua, P. (2012). Worldwide pose estimation using 3d point clouds. In ECCV.

Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. IJCV, 60(2), 91–110.CrossRef

Malisiewicz, T., Gupta, A. & Efros, A.A. (2011). Ensemble of exemplar-svms for object detection and beyond. In ICCV.

Muja, M. & Lowe, D.G. (2014). Scalable nearest neighbor algorithms for high dimensional data. In IEEE Transactions on PAMI 36.

Nister, D. & Stewenius, H. (2006). Scalable recognition with a vocabulary tree. In CVPR.

Philbin, J., Chum, O., Isard, M., Sivic, J. & Zisserman, A. (2007). Object retrieval with large vocabularies and fast spatial matching. In CVPR.

Philbin, J., Sivic, J. & Zisserman, A. (2010). Geometric latent dirichlet allocation on a matching graph for large-scale image datasets. In IJCV.

Platt, J. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3), 61–74.

Sattler, T., Leibe, B. & Kobbelt, L. (2012). Improving image-based localization by active correspondence search. In ECCV.

Sattler, T., Weyand, T., Leibe, B., & Kobbelt, L. (2012). Image retrieval for image-based localization revisited. In Proceedings of BMVC.

Scheirer, W., Kumar, N., Belhumeur, P.N. & Boult, T.E. (2012). Multi-attribute spaces: Calibration for attribute fusion and similarity search. In CVPR.

Schindler, G., Brown, M. & Szeliski, R. (2007). City-scale location recognition. In CVPR.

Scholkopf, B., & Smola, A. (2002). Learning with kernels. Cambridge: MIT press.MATH

Shrivastava, A., Malisiewicz, T., Gupta, A. & Efros, A.A. (2011). Data-driven visual similarity for cross-domain image matching. In SIGGRAPH ASIA.

Singh, S., Gupta, A. & Efros, A.A. (2012). Unsupervised discovery of mid-level discriminative patches. In ECCV.

Sivic, J., Zisserman, A. (2003). Video Google: A text retrieval approach to object matching in videos. In ICCV. http://www.robots.ox.ac.uk/vgg.

Tighe, J. & Lazebnik, S. (2013). Finding things: Image parsing with regions and per-exemplar detectors. In CVPR.

Torii, A. (2015). Project webpage: 24/7 place recognition by view synthesis. http://www.ok.ctrl.titech.ac.jp/torii/project/247/.

Torii, A., Arandjelović, R., Sivic, J., Okutomi, M. & Pajdla, T. (2015). 24/7 place recognition by view synthesis. In CVPR.

Torii, A., Sivic, J. & Pajdla, T. (2011). Visual localization by linear combination of image descriptors. In IEEE Workshop on Mobile Vision.

Torii, A., Sivic, J., Pajdla, T. & Okutomi, M. (2013) Visual place recognition with repetitive structures. In CVPR.

Turcot, P., & Lowe, D. (2009). Better matching with fewer features: The selection of useful features in large database recognition problem. In WS-LAVD, ICCV.

Zadrozny, B. & Elkan, C. (2002) Transforming classifier scores into accurate multiclass probability estimates. In ACM SIGKDD.

Zamir, A. & Shah, M. (2010) Accurate image localization based on google maps street view. In ECCV.

Title: Learning and Calibrating Per-Location Classifiers for Visual Place Recognition
Authors: Petr Gronát
Josef Sivic
Guillaume Obozinski
Tomas Pajdla
Publication date: 01-07-2016
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 3/2016
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-015-0878-x

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Other articles of this Issue 3/2016

Correctness Prediction, Accuracy Improvement and Generalization of Stereo Matching Using Supervised Learning

Bounding Multiple Gaussians Uncertainty with Application to Object Tracking

Similarity Fusion for Visual Tracking

Cyclic Schemes for PDE-Based Image Analysis

Spectral Generalized Multi-dimensional Scaling

Premium Partner