Published in: Autonomous Robots 6/2018

16.11.2017

BoCNF: efficient image matching with Bag of ConvNet features for scalable and robust visual place recognition

Authors: Yi Hou, Hong Zhang, Shilin Zhou


Abstract

Recent advances in visual place recognition (VPR) have exploited ConvNet features to improve recognition accuracy under significant environmental and viewpoint changes. However, it remains an open problem how to implement efficient image matching with high-dimensional ConvNet features. In this paper, we tackle the problem of matching efficiency using ConvNet features for VPR, where the task is to accurately and quickly recognize a given place in large-scale challenging environments. The paper makes two contributions. First, we propose an efficient solution to VPR, based on the well-known bag-of-words (BoW) framework, to speed up image matching with ConvNet features. Second, to alleviate the problem of perceptual aliasing in BoW, we adopt a coarse-to-fine approach where we first, in the coarse stage, search for the top-K candidate images via BoW and then, in the fine stage, identify the best match among the candidates using a hash-based voting scheme. We conduct extensive experiments on six popular VPR datasets to validate the effectiveness of our method. Experimental results show that, in terms of recognition accuracy, our method is comparable to linear search, and outperforms other methods such as FABMAP and SeqSLAM by a significant margin. In terms of efficiency, our method achieves a significant speed-up over linear search, with an average matching time as low as 23.5 ms per query on a dataset with 21K images.
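The coarse-to-fine pipeline described in the abstract can be sketched as follows. This is a minimal illustrative sketch based only on the abstract, not the authors' implementation: the function names `bow_topk` and `hash_votes`, the cosine similarity on BoW histograms, and the exact-match voting over precomputed integer hash codes are all simplifying assumptions standing in for the paper's tf-idf BoW retrieval and hash-based voting over ConvNet features.

```python
import numpy as np

def bow_topk(query_hist, db_hists, k):
    """Coarse stage: rank database images by cosine similarity of
    BoW histograms and return the indices of the top-k candidates."""
    q = query_hist / (np.linalg.norm(query_hist) + 1e-12)
    db = db_hists / (np.linalg.norm(db_hists, axis=1, keepdims=True) + 1e-12)
    sims = db @ q                      # one similarity score per database image
    return np.argsort(-sims)[:k]       # indices of the k most similar images

def hash_votes(query_codes, db_codes_by_image, candidates):
    """Fine stage: each hashed local feature of the query votes for every
    candidate image whose stored codes contain it; the candidate with the
    most votes is declared the best match."""
    votes = {c: 0 for c in candidates}
    for code in query_codes:
        for c in candidates:
            if code in db_codes_by_image[c]:
                votes[c] += 1
    return max(candidates, key=lambda c: votes[c])
```

In a real system the coarse stage would use an inverted index over a large vocabulary rather than a dense similarity scan, and the hash codes would be derived from the ConvNet features; the sketch only shows how the two stages hand off the top-K candidate set.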


References

Arandjelovic, R., & Zisserman, A. (2013). All about VLAD. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1578–1585).
Babenko, A., & Lempitsky, V. (2015). Aggregating deep convolutional features for image retrieval. In IEEE international conference on computer vision (ICCV).
Bay, H., Tuytelaars, T., & Van Gool, L. (2006). SURF: Speeded up robust features. In European conference on computer vision (ECCV) (Vol. 3951, pp. 404–417).
Chen, Z., Lam, O., Jacobson, A., & Milford, M. (2014). Convolutional neural network-based place recognition. In Australasian conference on robotics and automation (ACRA) (pp. 2–4).
Cheng, M.-M., Zhang, Z., Lin, W.-Y., & Torr, P. (2014). BING: Binarized normed gradients for objectness estimation at 300fps. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3286–3293).
Cummins, M., & Newman, P. (2011). Appearance-only SLAM at large scale with FAB-MAP 2.0. The International Journal of Robotics Research, 30(9), 1100–1123.
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 886–893).
Gionis, A., Indyk, P., & Motwani, R. (1999). Similarity search in high dimensions via hashing. In International conference on very large data bases, San Francisco, CA (pp. 518–529).
Glover, A., Maddern, W., Milford, M., & Wyeth, G. (2010). FAB-MAP + RatSLAM: Appearance-based SLAM for multiple times of day. In IEEE international conference on robotics and automation (ICRA) (pp. 3507–3512).
Glover, A., Maddern, W., Warren, M., Reid, S., Milford, M., & Wyeth, G. (2012). OpenFABMAP: An open source toolbox for appearance-based loop closure detection. In IEEE international conference on robotics and automation (ICRA) (pp. 4730–4735).
Hosang, J., Benenson, R., Dollár, P., & Schiele, B. (2016). What makes for effective detection proposals? IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(4), 814–830.
Hou, Y., Zhang, H., & Zhou, S. (2015). Convolutional neural network-based image representation for visual loop closure detection. In IEEE international conference on information and automation (ICIA) (pp. 2238–2245).
Hou, Y., Zhang, H., Zhou, S., & Zou, H. (2017). Efficient ConvNet feature extraction with multiple RoI pooling for landmark-based visual localization of autonomous vehicles. Mobile Information Systems, 2017 (in press).
Jégou, H., Douze, M., & Schmid, C. (2008). Hamming embedding and weak geometric consistency for large scale image search. In European conference on computer vision (ECCV) (pp. 304–317).
Jégou, H., Douze, M., Schmid, C., & Pérez, P. (2010). Aggregating local descriptors into a compact image representation. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3304–3311).
Kalantidis, Y., Mellina, C., & Osindero, S. (2015). Cross-dimensional weighting for aggregated deep convolutional features. In European conference on computer vision (ECCV) (pp. 685–701).
Kosecka, J., & Li, F. (2004). Vision based topological Markov localization. In IEEE international conference on robotics and automation (ICRA) (Vol. 2, pp. 1481–1486).
Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (NIPS) (pp. 1097–1105).
Li, F., & Kosecka, J. (2006). Probabilistic location recognition using reduced feature set. In IEEE international conference on robotics and automation (ICRA) (pp. 3405–3410).
Liu, Y., & Zhang, H. (2012). Visual loop closure detection with a compact image descriptor. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 1051–1056).
Liu, Y., & Zhang, H. (2013). Towards improving the efficiency of sequence-based SLAM. In IEEE international conference on mechatronics and automation (ICMA) (pp. 1261–1266).
Liu, Y., Feng, R., & Zhang, H. (2015). Keypoint matching by outlier pruning with consensus constraint. In IEEE international conference on robotics and automation (ICRA) (pp. 5481–5486).
Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110.
Lowry, S., Sünderhauf, N., Newman, P., Leonard, J., Cox, D., Corke, P., et al. (2016). Visual place recognition: A survey. IEEE Transactions on Robotics, 32(1), 1–19.
Milford, M. (2013). Vision-based place recognition: How low can you go? The International Journal of Robotics Research, 32(7), 766–789.
Milford, M., & Wyeth, G. (2012). SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights. In IEEE international conference on robotics and automation (ICRA) (pp. 1643–1649).
Naseer, T., Spinello, L., Burgard, W., & Stachniss, C. (2014). Robust visual robot localization across seasons using network flows. In AAAI conference on artificial intelligence.
Neubert, P., & Protzel, P. (2015). Local region detector + CNN based landmarks for practical place recognition in changing environments. In European conference on mobile robots (ECMR) (pp. 1–6).
Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3), 145–175.
Pepperell, E., Corke, P., & Milford, M. (2014). All-environment visual place recognition with SMART. In IEEE international conference on robotics and automation (ICRA) (pp. 1612–1618).
Perronnin, F., & Dance, C. (2007). Fisher kernels on visual vocabularies for image categorization. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1–8).
Perronnin, F., Sánchez, J., & Mensink, T. (2010). Improving the Fisher kernel for large-scale image classification. In European conference on computer vision (ECCV) (pp. 143–156).
Philbin, J., Chum, O., Isard, M., Sivic, J., & Zisserman, A. (2007). Object retrieval with large vocabularies and fast spatial matching. In IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1–8).
Singh, G., & Kosecka, J. (2010). Visual loop closing using gist descriptors in Manhattan world. In IEEE international conference on robotics and automation (ICRA) omnidirectional robot vision workshop.
Sivic, J., & Zisserman, A. (2003). Video Google: A text retrieval approach to object matching in videos. In IEEE international conference on computer vision (ICCV) (pp. 1470–1477).
Sünderhauf, N., & Protzel, P. (2011). BRIEF-Gist: Closing the loop by simple means. In IEEE/RSJ international conference on intelligent robots and systems (IROS) (pp. 1234–1241).
Sünderhauf, N., Dayoub, F., Shirazi, S., Upcroft, B., & Milford, M. (2015a). On the performance of ConvNet features for place recognition. In IEEE/RSJ international conference on intelligent robots and systems (IROS).
Sünderhauf, N., Neubert, P., & Protzel, P. (2013). Are we there yet? Challenging SeqSLAM on a 3000 km journey across all four seasons. In IEEE international conference on robotics and automation (ICRA) workshop on long-term autonomy.
Sünderhauf, N., Shirazi, S., Jacobson, A., Dayoub, F., Pepperell, E., Upcroft, B., & Milford, M. (2015b). Place recognition with ConvNet landmarks: Viewpoint-robust, condition-robust, training-free. In Robotics: Science and systems (RSS), Rome.
Zhang, H. (2011). BoRF: Loop-closure detection with scale invariant visual features. In IEEE international conference on robotics and automation (ICRA) (pp. 3125–3130).
Zhang, H., Han, F., & Wang, H. (2016). Robust multimodal sequence-based loop closure detection via structured sparsity. In Robotics: Science and systems (RSS).
Zheng, L., Yang, Y., & Tian, Q. (2016). SIFT meets CNN: A decade survey of instance retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence (in press).
Zitnick, C. L., & Dollár, P. (2014). Edge boxes: Locating object proposals from edges. In European conference on computer vision (ECCV) (pp. 391–405).
Metadata
Title
BoCNF: efficient image matching with Bag of ConvNet features for scalable and robust visual place recognition
Authors
Yi Hou
Hong Zhang
Shilin Zhou
Publication date
16.11.2017
Publisher
Springer US
Published in
Autonomous Robots / Issue 6/2018
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI
https://doi.org/10.1007/s10514-017-9684-3
