Abstract
The goal of my paper is to create a new framework that provides sufficient semantic information for decision making as a core component of applications such as SLAM (Simultaneous Localization and Mapping), robotics, AR (Augmented Reality), and autonomous driving.
This framework does not provide dense point clouds. Rather, a scene is described by a set of features that the agent has been trained to recognize. Specifically, scenes are generated by extracting features from the space's occupancy, the location of the light source, and the shapes, colors, and textures of objects in the images. Feature extraction is performed by an ensemble of deep-learning-based feature extractors, traditional machine-learning algorithms, and image-processing techniques. The deep-learning-based feature extractors are trained by supervised learning with data augmentation over 3D models; the inference results are then evaluated against a depth camera and refined by unsupervised retraining. With existing methods it is difficult to utilize semantic information for unknown objects, but the agent in this methodology describes them as far as possible using the information it has been trained on in advance.
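The ensemble described above can be illustrated with a minimal soft-voting sketch. This is only an assumption about how the per-extractor outputs might be combined; the function name `classify_ensemble`, the three score sources, and the weights are all hypothetical, not taken from the paper.

```python
import numpy as np

def classify_ensemble(scores_dnn, scores_classical, scores_image,
                      weights=(0.6, 0.25, 0.15)):
    """Weighted soft voting over per-class scores from three extractors.

    scores_dnn       -- class scores from a deep-learning feature extractor
    scores_classical -- class scores from a traditional ML algorithm
    scores_image     -- class scores from an image-processing heuristic
    weights          -- illustrative weighting (an assumption, not tuned)
    """
    stacked = np.stack([scores_dnn, scores_classical, scores_image])
    w = np.asarray(weights, dtype=float)[:, None]
    combined = (w * stacked).sum(axis=0)        # weighted average per class
    return int(np.argmax(combined)), combined

# Toy example: three extractors score the same three candidate classes.
label, combined = classify_ensemble(
    np.array([0.1, 0.7, 0.2]),
    np.array([0.2, 0.5, 0.3]),
    np.array([0.3, 0.3, 0.4]),
)
```

Because all three inputs here are normalized score vectors and the weights sum to one, the combined vector is also normalized; the winning class is the one with the highest weighted average.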
The odometry module, which estimates attitude and position, is implemented as a typical feature-based visual odometry pipeline. Corners are extracted with a standard corner detection algorithm and tracked by an optical flow algorithm; the correspondences between the depth camera's points in the camera coordinate frame and the points on the pixel plane are then optimized with the Levenberg–Marquardt algorithm. Using several key-frames, a sliding-window PnP (Perspective-n-Point) algorithm can be constructed. The visual odometry of the camera and the attitude estimate from the IMU (Inertial Measurement Unit) are loosely coupled: an ARS (Attitude Reference System) built on a quaternion-based linear Kalman filter mainly compensates the rotation error of the sliding-window PnP algorithm for each frame. The positions of recognized objects are also included in the PnP algorithm to cover the lack of features caused by poor lighting or motion blur, a major problem in feature-based odometry. Since the positions of re-recognized objects anchor the odometry, the drift problem that commonly occurs can also be mitigated.
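A quaternion-based linear Kalman filter of the kind used in the ARS can be sketched as follows. This is a minimal illustration under simplifying assumptions (the quaternion itself is the state, the measurement model is identity, and the noise covariances are placeholder values); the class name and parameters are hypothetical and not the paper's implementation.

```python
import numpy as np

def omega(w):
    """Quaternion kinematics matrix for body rates w = [wx, wy, wz],
    so that q_dot = 0.5 * omega(w) @ q with q = [qw, qx, qy, qz]."""
    wx, wy, wz = w
    return np.array([
        [0.0, -wx, -wy, -wz],
        [wx,  0.0,  wz, -wy],
        [wy, -wz,  0.0,  wx],
        [wz,  wy, -wx,  0.0],
    ])

class QuaternionKalmanARS:
    """Hypothetical sketch: gyro rates drive the prediction; an external
    attitude measurement (e.g. from visual odometry) provides the linear
    update. Noise values are illustrative assumptions."""

    def __init__(self, p=1e-2, q_proc=1e-4, r_meas=1e-2):
        self.q = np.array([1.0, 0.0, 0.0, 0.0])  # identity attitude
        self.P = np.eye(4) * p                    # state covariance
        self.Q = np.eye(4) * q_proc               # process noise
        self.R = np.eye(4) * r_meas               # measurement noise

    def predict(self, gyro, dt):
        # First-order integration of the quaternion kinematics, then renormalize.
        F = np.eye(4) + 0.5 * dt * omega(gyro)
        self.q = F @ self.q
        self.q /= np.linalg.norm(self.q)
        self.P = F @ self.P @ F.T + self.Q

    def update(self, q_meas):
        # Linear measurement model H = I: the quaternion is observed directly.
        q_meas = np.asarray(q_meas, dtype=float)
        if q_meas @ self.q < 0.0:   # resolve the q ~ -q sign ambiguity
            q_meas = -q_meas
        K = self.P @ np.linalg.inv(self.P + self.R)   # Kalman gain
        self.q = self.q + K @ (q_meas - self.q)
        self.q /= np.linalg.norm(self.q)
        self.P = (np.eye(4) - K) @ self.P
```

In a loosely coupled scheme like the one described above, `predict` would run at the IMU rate while `update` is called with the attitude recovered per frame, so the filtered quaternion can correct the rotation part of the sliding-window PnP estimate.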
This framework performs a rough 3D reconstruction by interpreting the scene with minimal computing resources and obtains the location of the agent. It therefore offers a simple 3D map and a graph structure as by-products for applications built on it, and provides the attitude and position of the agent for each frame. For applications that do not strictly require dense geometric results, I propose that this framework can be utilized as a flexible and versatile component with low computational cost.