Understanding and Improving Kernel Local Descriptors

Authors: Arun Mukundan, Giorgos Tolias, Andrei Bursuc, Hervé Jégou, Ondřej Chum

Published in: International Journal of Computer Vision | Issue 11-12/2019

Published: 3 December 2018

Abstract

We propose a multiple-kernel local-patch descriptor based on efficient match kernels from pixel gradients. It combines two parametrizations of gradient position and direction; each parametrization provides robustness to a different type of patch mis-registration: the polar parametrization to noise in the detected dominant orientation of the patch, the Cartesian parametrization to imprecise localization of the feature point. Whitening of the descriptor space, learned with or without supervision, significantly improves performance. We analyze the effect of whitening on patch similarity and demonstrate its semantic meaning. Our unsupervised variant is the best-performing descriptor constructed without labeled data. Despite its simplicity, the proposed descriptor competes well with deep learning approaches on a number of different tasks.
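The idea behind an efficient match kernel can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each pixel is reduced to a (gradient magnitude, gradient orientation) pair, ignores pixel position entirely, and uses a truncated Fourier series with made-up coefficients `GAMMAS` as the orientation kernel. All function names are hypothetical. The sketch shows that summing explicit feature maps over a patch yields a fixed-length descriptor whose dot product equals the kernel summed over all pixel pairs.

```python
import math

# Hypothetical non-negative Fourier coefficients; not taken from the paper.
GAMMAS = (0.5, 0.3, 0.15, 0.05)


def kernel(delta):
    """Shift-invariant kernel on angles: k(delta) = sum_n gamma_n * cos(n * delta)."""
    return sum(g * math.cos(n * delta) for n, g in enumerate(GAMMAS))


def feature_map(theta):
    """Explicit map psi such that dot(psi(a), psi(b)) == kernel(a - b)."""
    vec = [math.sqrt(GAMMAS[0])]
    for n, g in enumerate(GAMMAS[1:], start=1):
        root = math.sqrt(g)
        vec.append(root * math.cos(n * theta))
        vec.append(root * math.sin(n * theta))
    return vec


def patch_descriptor(pixels):
    """Sum magnitude-weighted feature maps over (magnitude, orientation) pairs."""
    desc = [0.0] * (2 * len(GAMMAS) - 1)
    for magnitude, theta in pixels:
        for i, value in enumerate(feature_map(theta)):
            desc[i] += magnitude * value
    return desc


def match_kernel(pixels_a, pixels_b):
    """Brute-force reference: kernel summed over every pixel pair."""
    return sum(
        ma * mb * kernel(ta - tb)
        for ma, ta in pixels_a
        for mb, tb in pixels_b
    )


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Toy patches: lists of (gradient magnitude, gradient orientation in radians).
patch_a = [(1.0, 0.1), (0.5, 1.2), (2.0, 3.0)]
patch_b = [(0.8, 0.2), (1.5, 2.9)]

fast = dot(patch_descriptor(patch_a), patch_descriptor(patch_b))
slow = match_kernel(patch_a, patch_b)
print(abs(fast - slow) < 1e-9)
```

The descriptor length depends only on the number of Fourier terms, not on the number of pixels, which is what makes comparing two patches a single dot product. A descriptor of the kind described in the abstract would additionally encode pixel position (in polar or Cartesian form) and be followed by whitening; both are omitted here.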

Also known as the periodic normal distribution.

L2Net and HardNet descriptors were provided by the authors of HardNet (Mishchuk et al. 2017).

Ahonen, T., Matas, J., He, C., & Pietikäinen, M. (2009). Rotation invariant image description with local binary pattern histogram Fourier features. In Scandinavian conference on image analysis (pp. 61–70). Berlin.

Alahi, A., Ortiz, R., & Vandergheynst, P. (2012). Freak: Fast retina keypoint. In CVPR.

Ambai, M., & Yoshida, Y. (2011). Card: Compact and real-time descriptors. In ICCV.

Arandjelovic, R., & Zisserman, A. (2012). Three things everyone should know to improve object retrieval. In CVPR.

Babenko, A., & Lempitsky, V. (2015). Aggregating deep convolutional features for image retrieval. In ICCV.

Balntas, V., Johns, E., Tang, L., & Mikolajczyk, K. (2016). PN-Net: Conjoined triple deep network for learning local image descriptors. arXiv preprint arXiv:1601.05030

Balntas, V., Riba, E., Ponsa, D., & Mikolajczyk, K. (2016). Learning local feature descriptors with triplets and shallow convolutional neural networks. In BMVC.

Balntas, V., Lenc, K., Vedaldi, A., & Mikolajczyk, K. (2017). Hpatches: A benchmark and evaluation of handcrafted and learned local descriptors. In CVPR.

Bau, D., Zhou, B., Khosla, A., Oliva, A., & Torralba, A. (2017). Network dissection: Quantifying interpretability of deep visual representations. In CVPR (pp. 3319–3327). IEEE.

Bay, H., Ess, A., Tuytelaars, T., & Van Gool, L. (2008). Speeded-up robust features (SURF). CVIU, 110(3), 346–359.

Bo, L., Ren, X., & Fox, D. (2010). Kernel descriptors for visual recognition. In NIPS.

Bo, L., Ren, X., & Fox, D. (2011). Depth kernel descriptors for object recognition. In IROS.

Bo, L., & Sminchisescu, C. (2009). Efficient match kernels between sets of features for visual recognition. In NIPS.

Brown, M., Hua, G., & Winder, S. (2011). Discriminative learning of local image descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(1), 43–57.

Brown, M., Szeliski, R., & Winder, S. (2005). Multi-image matching using multi-scale oriented patches. CVPR, 1, 510–517.

Bursuc, A., Tolias, G., & Jégou, H. (2015). Kernel local descriptors with implicit rotation matching. In ICMR.

Calonder, M., Lepetit, V., Strecha, C., & Fua, P. (2010). Brief: Binary robust independent elementary features. In ECCV.

Chum, O. (2015). Low dimensional explicit feature maps. In ICCV.

Delhumeau, J., Gosselin, P. H., Jégou, H., & Pérez, P. (2013). Revisiting the VLAD image representation. In ACM multimedia.

Dong, J., & Soatto, S. (2015). Domain-size pooling in local descriptors: Dsp-sift. In CVPR.

Forssén, P. E., & Lowe, D. G. (2007). Shape descriptors for maximally stable extremal regions. In IEEE 11th international conference on computer vision, ICCV 2007 (pp. 1–8). IEEE.

Frahm, J. M., Fite-Georgel, P., Gallup, D., Johnson, T., Raguram, R., Wu, C., et al. (2010). Building Rome on a cloudless day. In ECCV.

Han, X., Leung, T., Jia, Y., Sukthankar, R., & Berg, A. C. (2015). Matchnet: Unifying feature and metric learning for patch-based matching. In CVPR.

Heikkila, M., Pietikainen, M., & Schmid, C. (2009). Description of interest regions with local binary patterns. Pattern Recognition, 42(3), 425–436.

Heinly, J., Schönberger, J. L., Dunn, E., & Frahm, J. M. (2015). Reconstructing the world* in six days *(as captured by the Yahoo 100 million image dataset). In CVPR.

Jaderberg, M., Simonyan, K., Zisserman, A., et al. (2015). Spatial transformer networks. In NIPS (pp. 2017–2025).

Jégou, H., & Chum, O. (2012). Negative evidences and co-occurrences in image retrieval: The benefit of PCA and whitening. In ECCV.

Ke, Y., & Sukthankar, R. (2004). PCA-SIFT: a more distinctive representation for local image descriptors. In CVPR (pp. 506–513).

Kokkinos, I., & Yuille, A. (2008). Scale invariance without scale selection. In CVPR.

Lazebnik, S., Schmid, C., & Ponce, J. (2005). A sparse texture representation using local affine regions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1265–1278.

Ledoit, O., & Wolf, M. (2004). Honey, I shrunk the sample covariance matrix. The Journal of Portfolio Management, 30(4), 110–119.

Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2), 365–411.

Leutenegger, S., Chli, M., & Siegwart, R. Y. (2011). Brisk: Binary robust invariant scalable keypoints. In ICCV.

Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. IJCV, 60(2), 91–110.

Mahendran, A., & Vedaldi, A. (2016). Visualizing deep convolutional neural networks using natural pre-images. IJCV, 120(3), 233–255.

Mairal, J., Koniusz, P., Harchaoui, Z., & Schmid, C. (2014). Convolutional kernel networks. In NIPS (pp. 2627–2635).

Mikolajczyk, K., & Matas, J. (2007). Improving descriptors for fast tree matching by optimal linear projection. In ICCV.

Mikolajczyk, K., & Schmid, C. (2005). A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615–1630.

Mishchuk, A., Mishkin, D., Radenovic, F., & Matas, J. (2017). Working hard to know your neighbor’s margins: Local descriptor learning loss. In NIPS.

Mishkin, D., Matas, J., Perdoch, M., & Lenc, K. (2015). WxBS: Wide baseline stereo generalizations. arXiv preprint arXiv:1504.06603

Mukundan, A., Tolias, G., & Chum, O. (2017). Multiple-kernel local-patch descriptor. In BMVC.

Ojala, T., Pietikainen, M., & Maenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971–987.

Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42(3), 145–175.

Paulin, M., Douze, M., Harchaoui, Z., Mairal, J., Perronnin, F., & Schmid, C. (2015). Local convolutional features with unsupervised training for image retrieval. In ICCV.

Paulin, M., Mairal, J., Douze, M., Harchaoui, Z., Perronnin, F., & Schmid, C. (2017). Convolutional patch representations for image retrieval: An unsupervised approach. IJCV, 121(1), 149–168.

Philbin, J., Isard, M., Sivic, J., & Zisserman, A. (2010). Descriptor learning for efficient retrieval. In ECCV.

Radenović, F., Tolias, G., & Chum, O. (2016). CNN image retrieval learns from BoW: Unsupervised fine-tuning with hard examples. In ECCV.

Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011). Orb: An efficient alternative to sift or surf. In ICCV.

Schmid, C., & Mohr, R. (1997). Local grayvalue invariants for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(5), 530–535.

Schönberger, J. L., & Frahm, J. M. (2016). Structure-from-motion revisited. In CVPR.

Schönberger, J. L., Hardmeier, H., Sattler, T., & Pollefeys, M. (2017). Comparative evaluation of hand-crafted and learned local features. In CVPR.

Schönberger, J. L., Radenović, F., Chum, O., & Frahm, J. M. (2015). From single image query to detailed 3D reconstruction. In CVPR.

Scovanner, P., Ali, S., & Shah, M. (2007). A 3-dimensional sift descriptor and its application to action recognition. In Proceedings of the 15th ACM international conference on multimedia (pp. 357–360).

Shechtman, E., & Irani, M. (2007). Matching local self-similarities across images and videos. In CVPR (pp. 1–8). IEEE.

Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P., & Moreno-Noguer, F. (2015). Discriminative learning of deep convolutional feature point descriptors. In ICCV.

Simonyan, K., Vedaldi, A., & Zisserman, A. (2014). Learning local feature descriptors using convex optimisation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(8), 1573–1585.

Taira, H., Torii, A., & Okutomi, M. (2016). Robust feature matching by learning descriptor covariance with viewpoint synthesis. In ICPR.

Tian, Y., Fan, B., & Wu, F. (2017). L2-net: Deep learning of discriminative patch descriptor in euclidean space. In CVPR.

Tola, E., Lepetit, V., & Fua, P. (2010). Daisy: An efficient dense descriptor applied to wide-baseline stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 815–830.

Tolias, G., Bursuc, A., Furon, T., & Jégou, H. (2015). Rotation and translation covariant match kernels for image retrieval. CVIU, 140, 9–20.

Trzcinski, T., Christoudias, M., Lepetit, V., & Fua, P. (2012). Learning image descriptors with the boosting-trick. In NIPS.

van de Sande, K. E. A., Gevers, T., & Snoek, C. G. M. (2010). Evaluating color descriptors for object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1582–1596.

Vedaldi, A., & Zisserman, A. (2010). Efficient additive kernels via explicit feature maps. In CVPR.

Vedaldi, A., & Zisserman, A. (2012). Efficient additive kernels via explicit feature maps. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34, 480–492.

Wang, P., Wang, J., Zeng, G., Xu, W., Zha, H., & Li, S. (2013). Supervised kernel descriptors for visual recognition. In CVPR.

Winder, S., & Brown, M. (2007). Learning local image descriptors. In CVPR.

Yi, K. M., Trulls, E., Lepetit, V., & Fua, P. (2016). Lift: Learned invariant feature transform. In ECCV (pp. 467–483). Springer.

Yosinski, J., Clune, J., Nguyen, A., Fuchs, T., & Lipson, H. (2015). Understanding neural networks through deep visualization. arXiv preprint arXiv:1506.06579

Yu, G., & Morel, J. M. (2009). A fully affine invariant image comparison method. In ICASSP (pp. 1597–1600). IEEE.

Zagoruyko, S., & Komodakis, N. (2015). Learning to compare image patches via convolutional neural networks. In CVPR.

Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In ECCV.

Zhou, L., Zhu, S., Shen, T., Wang, J., Fang, T., & Quan, L. (2017). Progressive large scale-invariant image matching in scale space. In ICCV.

Title: Understanding and Improving Kernel Local Descriptors
Authors: Arun Mukundan
Giorgos Tolias
Andrei Bursuc
Hervé Jégou
Ondřej Chum
Publication date: 3 December 2018
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 11-12/2019
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-018-1137-8
