2023 | OriginalPaper | Book chapter

3D Reconstruction by Pretrained Features and Visual-Inertial Odometry

Written by: Park Kunbum, Takeshi Tsuchiya

Published in: The Proceedings of the 2021 Asia-Pacific International Symposium on Aerospace Technology (APISAT 2021), Volume 2

Publisher: Springer Nature Singapore

Abstract

The goal of my paper is to create a new framework that provides sufficient semantic information for decision making, as a core component of applications such as SLAM (Simultaneous Localization and Mapping), robotics, AR (Augmented Reality), and autonomous driving.
This framework does not provide dense point clouds. Rather, a scene is described by a set of features that the agent has been trained to recognize. Specifically, scenes are generated by extracting features of the space's occupancy, the location of the light source, the shapes of objects, colors, and textures from the images. The features are extracted by an ensemble of deep-learning-based feature extractors, traditional machine learning algorithms, and image processing techniques. The deep-learning-based feature extractors are trained by supervised learning with data augmentation over 3D models; the inference results are evaluated against a depth camera, and the extractors are retrained by unsupervised learning. With existing methods it is difficult to utilize semantic information for unknown objects, but the agent in this methodology tries to describe them as fully as possible using information learned in advance.
The odometry module, which estimates attitudes and positions, is implemented with a typical feature-based visual odometry methodology. Features are extracted by a typical corner detection algorithm and tracked by an optical flow algorithm; the depth camera's points in the camera coordinate frame and the corresponding points on the pixel plane are then optimized with the Levenberg–Marquardt algorithm. Using several key-frames, a sliding-window PnP (Perspective-n-Point) algorithm can be constructed. The visual odometry of the camera and the attitude estimate from the IMU (Inertial Measurement Unit) are loosely coupled. An ARS (Attitude Reference System) is built with a quaternion-based linear Kalman filter, and mainly compensates for the rotation error of the sliding-window PnP algorithm at each frame. The positions of recognized objects are also included in the PnP algorithm to compensate for the shortage of features caused by low light or motion blur, which is a major problem in feature-based odometry. Since the positions of re-recognized objects anchor the odometry, the commonly occurring drift problem can also be solved.
This framework performs a rough 3D reconstruction by interpreting the scene with minimal computing resources, and obtains the location of the agent. As by-products, it therefore offers applications a simple 3D map and a graph structure, and provides the attitude and position of the agent for each frame. For applications that do not necessarily require dense geometric results, I propose this framework as a flexible and versatile component that requires fewer computing resources.
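To make the idea of a quaternion-based linear Kalman filter concrete, the following is a minimal sketch and not the authors' implementation. It relies on the fact that the quaternion kinematics are linear in the quaternion once the gyro rates are known. The class name, the noise values, and the source of the attitude measurement `q_meas` (for example, an attitude derived from accelerometer readings) are all assumptions for illustration. The sketch further assumes isotropic noise, in which case the 4x4 covariance stays a multiple of the identity and can be stored as a single scalar.

```python
import math


def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)


class QuaternionKalmanFilter:
    """Linear Kalman filter on the quaternion state with isotropic noise.

    Assumption (not from the paper): process and measurement noise are
    isotropic, so the 4x4 covariance stays p * I and is stored as a scalar.
    """

    def __init__(self, p0=1.0, process_var=1e-4, meas_var=1e-2):
        self.q = (1.0, 0.0, 0.0, 0.0)   # identity attitude
        self.p = p0
        self.process_var = process_var  # hypothetical tuning values
        self.meas_var = meas_var

    def predict(self, gyro, dt):
        """Propagate with body rates (rad/s): q <- (I + dt/2 * Omega(w)) q."""
        qw, qx, qy, qz = self.q
        wx, wy, wz = gyro
        # Omega(w) q equals the quaternion product q * (0, w).
        dq = (
            -qx * wx - qy * wy - qz * wz,
            qw * wx + qy * wz - qz * wy,
            qw * wy - qx * wz + qz * wx,
            qw * wz + qx * wy - qy * wx,
        )
        self.q = normalize(tuple(c + 0.5 * dt * d for c, d in zip(self.q, dq)))
        # F F^T = (1 + dt^2 |w|^2 / 4) I because Omega(w) is skew-symmetric.
        rate_sq = wx * wx + wy * wy + wz * wz
        self.p = self.p * (1.0 + 0.25 * dt * dt * rate_sq) + self.process_var * dt

    def update(self, q_meas):
        """Correct with a measured attitude quaternion (measurement matrix H = I)."""
        # q and -q encode the same rotation; pick the sign closer to the state.
        if sum(a * b for a, b in zip(self.q, q_meas)) < 0.0:
            q_meas = tuple(-c for c in q_meas)
        k = self.p / (self.p + self.meas_var)   # scalar Kalman gain
        self.q = normalize(tuple(a + k * (b - a) for a, b in zip(self.q, q_meas)))
        self.p = (1.0 - k) * self.p
        return self.q


# Usage: integrate a constant yaw rate of 90 deg/s for one second.
kf = QuaternionKalmanFilter()
for _ in range(1000):
    kf.predict((0.0, 0.0, math.pi / 2), dt=0.001)
print(kf.q)  # close to (cos 45 deg, 0, 0, sin 45 deg)
```

Renormalizing after each step keeps the state a unit quaternion at the cost of strict Kalman optimality, which is a common practical compromise. In a loosely coupled design, the filtered attitude would then be passed to the visual odometry as a separate rotation estimate and not fused at the raw-measurement level.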
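As an illustration of what a simple 3D map with a graph structure and per-frame poses might look like, here is a small sketch. The abstract does not specify the data layout, so every class, field, and label below is hypothetical: objects are nodes carrying a position and a dictionary of recognized features, edges store pairwise distances, and agent poses are appended frame by frame.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ObjectNode:
    """A recognized object (hypothetical layout, not from the paper)."""
    label: str
    position: tuple                               # (x, y, z) in the map frame
    features: dict = field(default_factory=dict)  # e.g. shape, color, texture


@dataclass
class SceneGraph:
    """Sparse scene description: object nodes, distance edges, agent poses."""
    nodes: dict = field(default_factory=dict)     # node id -> ObjectNode
    edges: dict = field(default_factory=dict)     # (id_a, id_b) -> distance
    poses: list = field(default_factory=list)     # per-frame (position, quaternion)

    def add_object(self, node_id, node):
        # Connect the new object to every existing one by Euclidean distance.
        for other_id, other in self.nodes.items():
            self.edges[(other_id, node_id)] = math.dist(other.position, node.position)
        self.nodes[node_id] = node

    def add_pose(self, position, quaternion):
        # Store the agent's position and attitude for the current frame.
        self.poses.append((position, quaternion))


# Usage with made-up objects and a single agent pose.
graph = SceneGraph()
graph.add_object("chair_0", ObjectNode("chair", (0.0, 0.0, 0.0), {"color": "red"}))
graph.add_object("lamp_0", ObjectNode("lamp", (3.0, 4.0, 0.0), {"light_source": True}))
graph.add_pose((1.0, 1.0, 0.0), (1.0, 0.0, 0.0, 0.0))
print(graph.edges[("chair_0", "lamp_0")])
```

A structure of this kind stays small because it stores one entry per recognized object and not one per point, which fits the stated aim of avoiding dense point clouds.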

Metadata
Title
3D Reconstruction by Pretrained Features and Visual-Inertial Odometry
Written by
Park Kunbum
Takeshi Tsuchiya
Copyright year
2023
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-19-2635-8_18