
Pattern Analysis and Applications

18.04.2023 | Industrial and Commercial Application

Shape completion using orthogonal views through a multi-input–output network

Authors: Leonardo Delgado, Eduardo F. Morales

Published in: Pattern Analysis and Applications | Issue 3/2023

Abstract

Knowing the shape of objects is essential to many robotics tasks. However, obtaining the full shape is not always feasible. Recent approaches based on point clouds and voxel cubes have been proposed for shape completion from a single depth view, but they tend to be computationally expensive and require tuning many weights. This paper presents a novel architecture for shape completion based on six orthogonal views obtained from a point cloud (they can be seen as the six faces of a die). Our network uses one branch for each orthogonal view as input–output and mixes them in the middle of the architecture. Using orthogonal views significantly reduces the number of required parameters. We also introduce a novel method to filter the output of networks based on orthogonal views and describe algorithms to convert orthogonal views to a voxel cube and a point cloud. We compared our approach against state-of-the-art approaches on the YCB and ShapeNet datasets using the Chamfer distance and mean squared error, and it showed very competitive performance with less than 5% of their parameters.
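For readers unfamiliar with the Chamfer distance named above, the following is a minimal NumPy sketch of one common symmetric variant (mean squared nearest-neighbour distance in both directions). The abstract does not state which variant the authors use (squared or unsquared, summed or averaged), so the function name `chamfer_distance` and the exact normalisation are assumptions for illustration only.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: mean squared distance to the nearest neighbour,
    added over both directions. Other normalisations are also in use.
    """
    # Pairwise squared Euclidean distances, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # For each point in a, the nearest point in b, and vice versa.
    a_to_b = sq_dist.min(axis=1).mean()
    b_to_a = sq_dist.min(axis=0).mean()
    return float(a_to_b + b_to_a)
```

This brute-force form builds the full N × M distance matrix, so it is only practical for small clouds; larger clouds would normally use a spatial index such as a k-d tree.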

PointNet and PointNet++ [8, 9] were proposed to operate directly on point clouds. Several works build on these frameworks to solve problems such as semantic segmentation, classification, and point completion [3–7, 17]; in this work, we compare our results against [3], which is based on PointNet.

Note that parts of the shape can touch a face of the voxel cube. Our representation stores the distance to the first occupied voxel (as shown in Algorithm 2). If that voxel is at position 0 (meaning the shape touches a face of the voxel cube), those pixels would take the same value as the background; for this reason, we add an offset. The offset was set empirically to four.
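The role of the offset can be sketched in a few lines of NumPy. Algorithm 2 is not reproduced on this page, so this is not the authors' implementation: the function names `orthogonal_view` and `six_views`, the background value of 0, and the axis conventions are all assumptions. Only the offset of four comes from the text.

```python
import numpy as np

OFFSET = 4  # value reported in the text; background value 0 is an assumption


def orthogonal_view(voxels: np.ndarray, axis: int,
                    from_far_side: bool = False,
                    offset: int = OFFSET) -> np.ndarray:
    """Depth image seen along `axis` of a cubic occupancy grid.

    Each pixel holds offset + (index of the first occupied voxel),
    or 0 where the ray hits nothing (background).
    """
    # Looking from the opposite face is the same as flipping the grid.
    vol = np.flip(voxels, axis=axis) if from_far_side else voxels
    occupied = vol.astype(bool)
    hit = occupied.any(axis=axis)
    # argmax returns the first True along the axis (0 if there is none).
    first = occupied.argmax(axis=axis)
    return np.where(hit, first + offset, 0)


def six_views(voxels: np.ndarray) -> list:
    """The six orthogonal views: two viewing directions per axis."""
    return [orthogonal_view(voxels, axis, far)
            for axis in range(3) for far in (False, True)]
```

With this encoding, a voxel lying directly on the near face is stored as 4 rather than 0, so it stays distinguishable from the background.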

For the ShapeNet and YCB datasets, we use the versions proposed in [3] and [1], respectively.

They use PCN as the base network, which has 6.8 M parameters; in comparison, our network uses only 4.4% as many weights (roughly 0.3 M parameters).

Note that the comparison could not be run on a lower-capacity GPU, since both networks require at least 8 GB of VRAM for training.

This means that inner points are not taken into account; only the border points are.
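One possible reading of "border points" on a voxel grid is sketched below: an occupied voxel counts as a border voxel if at least one of its six face neighbours is empty or lies outside the cube. The page does not define the term, so the 6-connectivity rule and the function name `border_points` are assumptions, not the authors' definition.

```python
import numpy as np


def border_points(voxels: np.ndarray) -> np.ndarray:
    """Return (K, 3) integer coordinates of occupied voxels on the surface.

    Assumed rule: a voxel is interior only if all six face neighbours
    are occupied; everything outside the grid counts as empty.
    """
    occ = voxels.astype(bool)
    # Pad with empty voxels so faces of the cube see empty neighbours.
    padded = np.pad(occ, 1, constant_values=False)
    interior = np.ones_like(occ)
    for axis in range(3):
        for shift in (-1, 1):
            # Neighbour occupancy along this axis, cropped back to size.
            neighbour = np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1]
            interior &= neighbour
    return np.argwhere(occ & ~interior)
```

For a solid 3 × 3 × 3 block this keeps the 26 outer voxels and drops only the centre one.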

1. Varley J, DeChant C, Richardson A, Ruales J, Allen P (2017) Shape completion enabled robotic grasping. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 2442–2447. https://doi.org/10.1109/iros.2017.8206060

2. Yang B, Rosa S, Markham A, Trigoni N, Wen H (2019) Dense 3D object reconstruction from a single depth view. IEEE Trans Pattern Anal Mach Intell 41(12):2820–2834. https://doi.org/10.1109/TPAMI.2018.2868195

3. Yuan W, Khot T, Held D, Mertz C, Hebert M (2018) PCN: point completion network. In: 2018 international conference on 3D vision (3DV), pp 728–737

4. Liu M, Sheng L, Yang S, Shao J, Hu S-M (2019) Morphing and sampling network for dense point cloud completion. In: The thirty-fourth AAAI conference on artificial intelligence

5. Peng Y, Chang M, Wang Q, Qian Y, Zhang Y, Wei M, Liao X (2020) Sparse-to-dense multi-encoder shape completion of unstructured point cloud. IEEE Access 8:30969–30978

6. Yu X, Rao Y, Wang Z, Liu Z, Lu J, Zhou J (2021) PoinTr: diverse point cloud completion with geometry-aware transformers. In: 2021 IEEE/CVF international conference on computer vision (ICCV), pp 12478–12487. https://doi.org/10.1109/ICCV48922.2021.01227

7. Xiang P, Wen X, Liu Y-S, Cao Y-P, Wan P, Zheng W, Han Z (2021) SnowflakeNet: point cloud completion by snowflake point deconvolution with skip-transformer. In: Proceedings of the IEEE international conference on computer vision (ICCV)

8. Charles RQ, Su H, Kaichun M, Guibas LJ (2017) PointNet: deep learning on point sets for 3D classification and segmentation. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 77–85. https://doi.org/10.1109/CVPR.2017.16

9. Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems, vol 30. Curran Associates Inc, USA

10. Hu T, Han Z, Shrivastava A, Zwicker M (2019) Render4Completion: synthesizing multi-view depth maps for 3D shape completion. In: 2019 IEEE/CVF international conference on computer vision workshop (ICCVW), pp 4114–4122. https://doi.org/10.1109/ICCVW.2019.00506

11. Hu T, Han Z, Zwicker M (2020) 3D shape completion with multi-view consistent inference. In: The Thirty-Fourth AAAI conference on artificial intelligence, AAAI 2020, The Thirty-Second innovative applications of artificial intelligence conference, IAAI 2020, The tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, 7–12 Feb 2020, pp 10997–11004

12. Chang AX, Funkhouser T, Guibas L, Hanrahan P, Huang Q, Li Z, Savarese S, Savva M, Song S, Su H, Xiao J, Yi L, Yu F (2015) ShapeNet: an information-rich 3D model repository. Technical Report arXiv:1512.03012 [cs.GR], Toyota Technological Institute, Chicago

13. Calli B, Singh A, Walsman A, Srinivasa S, Abbeel P, Dollar AM (2015) The YCB object and model set: towards common benchmarks for manipulation research. In: 2015 international conference on advanced robotics (ICAR), pp 510–517. https://doi.org/10.1109/ICAR.2015.7251504

14. Kappler D, Bohg J, Schaal S (2015) Leveraging big data for grasp planning. In: 2015 IEEE international conference on robotics and automation (ICRA), pp 4304–4311. https://doi.org/10.1109/ICRA.2015.7139793

15. Koenig N, Howard A (2004) Design and use paradigms for Gazebo, an open-source multi-robot simulator. In: 2004 IEEE/RSJ international conference on intelligent robots and systems (IROS) (IEEE Cat. No.04CH37566), vol 3, pp 2149–2154. https://doi.org/10.1109/IROS.2004.1389727

16. Min P (2019) Binvox. http://www.patrickmin.com/binvox or https://www.google.com/search?q=binvox. Accessed on 05 Oct 2019

17. Saha M, Amin SB, Sharma A, Kumar TKS, Kalia RK (2022) AI-driven quantification of ground glass opacities in lungs of COVID-19 patients using 3D computed tomography imaging. PLoS ONE 17:1–14. https://doi.org/10.1371/journal.pone.0263916

18. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Proceedings of the Thirty-First AAAI conference on artificial intelligence. AAAI’17, pp 4278–4284

19. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF (eds) Medical image computing and computer-assisted intervention – MICCAI 2015. Springer, Cham, pp 234–241

20. Riegler G, Ulusoy AO, Geiger A (2017) OctNet: learning deep 3D representations at high resolutions. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6620–6629. https://doi.org/10.1109/CVPR.2017.701

21. Choy CB, Xu D, Gwak J, Chen K, Savarese S (2016) 3D-R2N2: a unified approach for single and multi-view 3D object reconstruction. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision – ECCV 2016. Springer, Cham, pp 628–644

22. Rusu RB, Cousins S (2011) 3D is here: point cloud library (PCL). In: IEEE international conference on robotics and automation (ICRA). Shanghai, China, pp 1–4

23. Han X, Li Z, Huang H, Kalogerakis E, Yu Y (2017) High-resolution shape completion using deep neural networks for global structure and local geometry inference. In: 2017 IEEE international conference on computer vision (ICCV), pp 85–93. https://doi.org/10.1109/ICCV.2017.19

24. Oliphant T (2006) A guide to NumPy. Trelgol Publishing, USA

25. Kingma D, Ba J (2014) Adam: a method for stochastic optimization. In: International conference on learning representations. arXiv:1412.6980

26. Chollet F (2021) Deep learning with Python, 2nd edn. ISBN 9781617296864

27. Abadi M, Agarwal A et al (2015) TensorFlow: large-scale machine learning on heterogeneous systems. Software available from https://www.tensorflow.org/

28. Chollet F (2017) Deep learning with Python, 1st edn. Manning Publications Co., Greenwich

29. Do T-T, Nguyen A, Reid I (2018) AffordanceNet: an end-to-end deep learning approach for object affordance detection. In: 2018 IEEE international conference on robotics and automation (ICRA), pp 5882–5889. https://doi.org/10.1109/icra.2018.8460902

Title: Shape completion using orthogonal views through a multi-input–output network
Authors: Leonardo Delgado, Eduardo F. Morales
Publication date: 18.04.2023
Publisher: Springer London
Published in: Pattern Analysis and Applications / Issue 3/2023
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-023-01154-y
