Published in: Pattern Analysis and Applications 4/2023

07.10.2023 | Theoretical Advances

A semantic-aware monocular projection model for accurate pose measurement

Authors: Libo Weng, Xiuqi Chen, Qi Qiu, Yaozhong Zhuang, Fei Gao



Abstract

Monocular vision systems are widely used in many fields owing to their simple structure, high speed, and low cost for object measurement. However, most current monocular methods rely on complicated mathematical models or artificial markers to achieve accurate measurements. In addition, it is difficult to precisely extract object features from captured images, which are affected by many factors. In this paper, we present a semantic-aware monocular projection model for accurate pose measurement. Our mathematical model is simple and compact, and a deep learning network extracts the semantic features from images. Finally, the relevant parameters of the projection model are further optimized with a Kalman filter to make the measurement results more accurate and stable. Extensive experiments demonstrate that the proposed method is robust, with high performance and accuracy. Because few constraints are imposed on the measured object and the environment, our method is easy to install.
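The pipeline the abstract outlines — projecting through a camera model and then stabilizing the measurements with a Kalman filter — can be sketched in minimal form. The sketch below is illustrative only: it assumes a basic pinhole projection and a scalar constant-state Kalman filter, which are simplifications and not the paper's actual model; the function names `project` and `kalman_smooth` and all parameter values are invented for this example.

```python
import random
import statistics

def project(point_3d, fx, fy, cx, cy):
    """Project a 3D point in camera coordinates onto the image plane
    with a basic pinhole model (fx, fy: focal lengths in pixels;
    cx, cy: principal point)."""
    X, Y, Z = point_3d
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    return u, v

def kalman_smooth(measurements, q=1e-4, r=1e-2):
    """Smooth a scalar measurement sequence with a constant-state
    Kalman filter (q: process noise variance, r: measurement noise
    variance). Returns the filtered sequence."""
    x, p = measurements[0], 1.0        # initial state estimate and covariance
    smoothed = [x]
    for z in measurements[1:]:
        p = p + q                      # predict: covariance grows by process noise
        k = p / (p + r)                # Kalman gain
        x = x + k * (z - x)            # update: blend prediction with measurement
        p = (1 - k) * p
        smoothed.append(x)
    return smoothed

# Example: a point half a meter right, 2 m ahead, with 800 px focal length
u, v = project((0.5, 0.2, 2.0), 800, 800, 320, 240)

# Example: filtering a noisy sequence of distance measurements around 2 m
random.seed(0)
noisy = [2.0 + random.gauss(0, 0.05) for _ in range(200)]
filtered = kalman_smooth(noisy)
```

After a short transient, the filtered sequence has markedly lower variance than the raw measurements, which mirrors the stabilizing role the abstract assigns to the Kalman filter.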


Metadata
Title: A semantic-aware monocular projection model for accurate pose measurement
Authors: Libo Weng, Xiuqi Chen, Qi Qiu, Yaozhong Zhuang, Fei Gao
Publication date: 07.10.2023
Publisher: Springer London
Published in: Pattern Analysis and Applications / Issue 4/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-023-01197-1
