Published in: Pattern Analysis and Applications 4/2022

11-06-2022 | Industrial and Commercial Application

Segmentation based 6D pose estimation using integrated shape pattern and RGB information

Authors: Chaochen Gu, Qi Feng, Changsheng Lu, Shuxin Zhao, Rui Xu



Abstract

Point clouds are currently the most common representation for describing the 3D world. However, recognizing objects and their poses from point clouds remains a great challenge because the 3D data are unordered. In this paper, a unified deep learning framework for 3D scene segmentation and 6D object pose estimation is proposed. To segment foreground objects accurately, a novel shape pattern aggregation module called PointDoN is proposed, which learns meaningful deep geometric representations from both the Difference of Normals (DoN) and the initial spatial coordinates of the point cloud. PointDoN can be flexibly applied to any convolutional network and shows improvements in the popular tasks of point cloud classification and semantic segmentation. Once the objects are segmented, the range of the point cloud belonging to each object in the entire scene can be specified, which enables us to further estimate the 6D pose of each object within a local region of interest. To obtain a good estimate, we propose a new 6D pose estimation approach that incorporates both 2D and 3D features generated from RGB images and point clouds, respectively. Specifically, 3D features are extracted via a CNN-based architecture whose input is an XYZ map converted from the initial point cloud. Experiments show that our method achieves satisfactory results on publicly available point cloud datasets in both segmentation and 6D pose estimation.
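For readers unfamiliar with the Difference of Normals operator named in the abstract, the following is a minimal NumPy sketch of the classical DoN definition: half the difference between the unit normals estimated at a small and a large support radius. It is not the PointDoN module itself, whose details the abstract does not give. The function names, the brute-force neighbor search, the PCA-based normal estimate, and the `viewpoint` used to orient the normals are all assumptions made for illustration.

```python
import numpy as np


def estimate_normal(points, center, radius, viewpoint):
    """Unit normal at `center`, from PCA over neighbors within `radius`.

    Hypothetical helper: returns a zero vector when fewer than three
    neighbors are found, since a plane cannot be fitted in that case.
    """
    dists = np.linalg.norm(points - center, axis=1)
    neighbors = points[dists <= radius]
    if len(neighbors) < 3:
        return np.zeros(3)
    centered = neighbors - neighbors.mean(axis=0)
    cov = centered.T @ centered
    # eigh returns eigenvalues in ascending order, so the first
    # eigenvector is the direction of least variance (the normal).
    _, eigvecs = np.linalg.eigh(cov)
    normal = eigvecs[:, 0]
    # Assumption: resolve the sign ambiguity by pointing the normal
    # toward a known viewpoint (for example the sensor position).
    if np.dot(normal, viewpoint - center) < 0:
        normal = -normal
    return normal


def difference_of_normals(points, r_small, r_large, viewpoint=None):
    """Per-point DoN vector: (n(p, r_small) - n(p, r_large)) / 2."""
    if viewpoint is None:
        viewpoint = np.zeros(3)  # assumption: sensor at the origin
    don = np.empty_like(points, dtype=float)
    for i, p in enumerate(points):
        n_small = estimate_normal(points, p, r_small, viewpoint)
        n_large = estimate_normal(points, p, r_large, viewpoint)
        don[i] = (n_small - n_large) / 2.0
    return don
```

On a flat surface the normals at both scales agree, so the DoN vector is close to zero; larger magnitudes indicate geometry that changes between the two scales, such as edges and corners. The per-point loop is quadratic in the number of points, so a spatial index such as a k-d tree would be the usual choice for real scenes.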


Metadata
Title
Segmentation based 6D pose estimation using integrated shape pattern and RGB information
Authors
Chaochen Gu
Qi Feng
Changsheng Lu
Shuxin Zhao
Rui Xu
Publication date
11-06-2022
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 4/2022
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-022-01078-z
