Pattern Analysis and Applications

11-06-2022 | Industrial and Commercial Application

Segmentation based 6D pose estimation using integrated shape pattern and RGB information

Authors: Chaochen Gu, Qi Feng, Changsheng Lu, Shuxin Zhao, Rui Xu

Published in: Pattern Analysis and Applications | Issue 4/2022

Abstract

Point clouds are currently the most typical representation for describing the 3D world. However, recognizing objects and their poses from point clouds remains a great challenge because 3D point data is unordered. This paper proposes a unified deep learning framework for 3D scene segmentation and 6D object pose estimation. To segment foreground objects accurately, a novel shape pattern aggregation module called PointDoN is proposed, which learns meaningful deep geometric representations from both the Difference of Normals (DoN) and the initial spatial coordinates of the point cloud. PointDoN can be flexibly applied to any convolutional network and shows improvements on the popular tasks of point cloud classification and semantic segmentation. Once the objects are segmented, the range of points belonging to each object in the scene can be specified, which makes it possible to estimate the 6D pose of each object within its local region of interest. To obtain a good estimate, we propose a new 6D pose estimation approach that incorporates both 2D and 3D features, generated from RGB images and point clouds, respectively. Specifically, 3D features are extracted via a CNN-based architecture whose input is an XYZ map converted from the initial point cloud. Experiments show that the method achieves satisfactory results on publicly available point cloud datasets in both segmentation and 6D pose estimation.
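
The Difference of Normals operator named in the abstract can be illustrated with a minimal sketch. This is not the authors' PointDoN module, which learns features from DoN and point coordinates; it only computes the classical DoN vector of Ioannou et al., defined as half the difference between the unit normals estimated at a small and a large support radius. The function names, the brute-force neighbour search, the PCA-based normal estimate, and the viewpoint used to orient the normals are all assumptions made for illustration.

```python
import numpy as np


def estimate_normals(points, radius, viewpoint=None):
    """Estimate one unit normal per point by PCA over neighbours within `radius`.

    Hypothetical helper: brute-force search, suitable only for small clouds.
    """
    if viewpoint is None:
        viewpoint = np.zeros(3)  # assumed sensor position at the origin
    normals = np.zeros_like(points, dtype=float)
    for i, p in enumerate(points):
        # All points (including p itself) inside the support radius.
        nbrs = points[np.linalg.norm(points - p, axis=1) <= radius]
        if len(nbrs) < 3:
            continue  # too few neighbours: leave a zero normal
        cov = np.cov(nbrs.T)  # 3x3 covariance of the neighbourhood
        # eigh sorts eigenvalues in ascending order, so column 0 is the
        # direction of least variance, taken here as the surface normal.
        _, vecs = np.linalg.eigh(cov)
        n = vecs[:, 0]
        # Flip towards the viewpoint so both scales share one orientation.
        if np.dot(viewpoint - p, n) < 0:
            n = -n
        normals[i] = n
    return normals


def difference_of_normals(points, r_small, r_large, viewpoint=None):
    """Return the DoN vector (n_small - n_large) / 2 for every point."""
    n_small = estimate_normals(points, r_small, viewpoint)
    n_large = estimate_normals(points, r_large, viewpoint)
    return 0.5 * (n_small - n_large)
```

On a flat patch the normals at both radii agree, so the DoN vector is close to zero; larger magnitudes indicate that the surface orientation changes between the two scales.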

Title: Segmentation based 6D pose estimation using integrated shape pattern and RGB information
Authors: Chaochen Gu, Qi Feng, Changsheng Lu, Shuxin Zhao, Rui Xu
Publication date: 11-06-2022
Publisher: Springer London
Published in: Pattern Analysis and Applications / Issue 4/2022
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI: https://doi.org/10.1007/s10044-022-01078-z
