Published in: Pattern Analysis and Applications 4/2023

07.10.2023 | Theoretical Advances

A semantic-aware monocular projection model for accurate pose measurement

Authors: Libo Weng, Xiuqi Chen, Qi Qiu, Yaozhong Zhuang, Fei Gao



Abstract

Monocular vision systems are widely used in many fields owing to their simple structure, high speed, and low cost for object measurement. However, most current monocular methods rely on complicated mathematical models or artificial markers to achieve accurate measurements. In addition, it is difficult to precisely extract object features from captured images, which are affected by many factors. In this paper, we present a semantic-aware monocular projection model for accurate pose measurement. Our mathematical model is simple and compact, and a deep learning network extracts the semantic features from images. Finally, the relevant parameters of the projection model are further optimized with a Kalman filter to make the measurement results more accurate and stable. Extensive experiments demonstrate that the proposed method is robust, with high performance and accuracy. Because few constraints are imposed on the measured object and the environment, our method is easy to install.
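The pipeline the abstract outlines — projecting through a camera model and then stabilizing the measurements with a Kalman filter — can be sketched in minimal form. The sketch below is illustrative only: it assumes a basic pinhole projection and a scalar constant-state Kalman filter, which are simplifications and not the paper's actual model; the function names `project` and `kalman_smooth` and all parameter values are invented for this example.

```python
import random
import statistics

def project(point_3d, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates onto the image plane
    with a basic pinhole model (fx, fy: focal lengths in pixels;
    cx, cy: principal point)."""
    X, Y, Z = point_3d
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    return u, v

def kalman_smooth(measurements, q=1e-4, r=1e-2):
    """Smooth a scalar measurement sequence with a constant-state
    Kalman filter (q: process noise variance, r: measurement noise
    variance). Returns the filtered sequence."""
    x, p = measurements[0], 1.0        # initial state estimate and covariance
    smoothed = [x]
    for z in measurements[1:]:
        p = p + q                      # predict: covariance grows by process noise
        k = p / (p + r)                # Kalman gain
        x = x + k * (z - x)            # update: blend prediction with measurement
        p = (1 - k) * p
        smoothed.append(x)
    return smoothed

# Example: a point half a meter right, 2 m ahead, with 800 px focal length
u, v = project((0.5, 0.2, 2.0), 800, 800, 320, 240)

# Example: filtering a noisy sequence of distance measurements around 2 m
random.seed(0)
noisy = [2.0 + random.gauss(0, 0.05) for _ in range(200)]
filtered = kalman_smooth(noisy)
```

After a short transient, the filtered sequence has markedly lower variance than the raw measurements, which mirrors the stabilizing role the abstract assigns to the Kalman filter.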


Metadata
Title: A semantic-aware monocular projection model for accurate pose measurement
Authors: Libo Weng, Xiuqi Chen, Qi Qiu, Yaozhong Zhuang, Fei Gao
Publication date: 07.10.2023
Publisher: Springer London
Published in: Pattern Analysis and Applications / Issue 4/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-023-01197-1
