
2023 | OriginalPaper | Chapter

4. 3D Object and Hand Pose Estimation

Author: Vincent Lepetit

Published in: Springer Handbook of Augmented Reality

Publisher: Springer International Publishing


Abstract

3D object and hand pose estimation have huge potential for Augmented Reality: they enable tangible interfaces and natural interfaces, and blur the boundaries between the real and virtual worlds. In this chapter, we first motivate the topic and explain the challenges. After a brief review of early work and of the Deep Learning techniques on which recent works are based, we present the recent developments for 3D object and hand pose estimation using cameras, when considered separately and together. We examine the abilities and limitations of each technique. We conclude by discussing the possible future developments of the field.


Literature
1.
go back to reference Almahairi, A., Rajeswar, S., Sordoni, A., Bachman, P., Courville, A.: Augmented CycleGAN: Learning many-to-many mappings from unpaired data. In: International Conference on Machine Learning (2018) Almahairi, A., Rajeswar, S., Sordoni, A., Bachman, P., Courville, A.: Augmented CycleGAN: Learning many-to-many mappings from unpaired data. In: International Conference on Machine Learning (2018)
2.
go back to reference Armagan, A., Garcia-Hernando, G., Baek, S., Hampali, S., Rad, M., Zhang, Z., Xie, S., Chen, M., Zhang, B., Xiong, F., Xiao, Y., Cao, Z., Yuan, J., Ren, P., Huang, W., Sun, H., Hrúz, M., Kanis, J., Krn̂oul, Z., Wan, Q., Li, S., Yang, L., Lee, D., Yao, A., Zhou, W., Mei, S., Liu, Y., Spurr, A., Iqbal, U., Molchanov, P., Weinzaepfel, P., Brégier, R., Rogez, G., Lepetit, V., Kim, T.-K.: Measuring generalisation to unseen viewpoints, articulations, shapes and objects for 3D hand pose estimation under hand-object interaction. In: European Conference on Computer Vision. Springer, Berlin (2020) Armagan, A., Garcia-Hernando, G., Baek, S., Hampali, S., Rad, M., Zhang, Z., Xie, S., Chen, M., Zhang, B., Xiong, F., Xiao, Y., Cao, Z., Yuan, J., Ren, P., Huang, W., Sun, H., Hrúz, M., Kanis, J., Krn̂oul, Z., Wan, Q., Li, S., Yang, L., Lee, D., Yao, A., Zhou, W., Mei, S., Liu, Y., Spurr, A., Iqbal, U., Molchanov, P., Weinzaepfel, P., Brégier, R., Rogez, G., Lepetit, V., Kim, T.-K.: Measuring generalisation to unseen viewpoints, articulations, shapes and objects for 3D hand pose estimation under hand-object interaction. In: European Conference on Computer Vision. Springer, Berlin (2020)
4.
go back to reference Aubry, M., Russell, B.: Understanding deep features with computer-generated imagery. In: IEEE Conference on Computer Vision and Pattern Recognition (2015) Aubry, M., Russell, B.: Understanding deep features with computer-generated imagery. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)
5.
go back to reference Ballan, L., Taneja, A., Gall, J., Van Gool, L., Pollefeys, M.: Motion capture of hands in action using discriminative salient points. In: European Conference on Computer Vision. Springer, Berlin (2012) Ballan, L., Taneja, A., Gall, J., Van Gool, L., Pollefeys, M.: Motion capture of hands in action using discriminative salient points. In: European Conference on Computer Vision. Springer, Berlin (2012)
6.
go back to reference Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: SURF: Speeded up robust features. In: Computer Vision and Image Understanding, pp. 346–359 (2008) Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: SURF: Speeded up robust features. In: Computer Vision and Image Understanding, pp. 346–359 (2008)
7.
go back to reference Bazarevsky, V., Zhang, F.: Google Project on Hand Pose Prediction (2019) Bazarevsky, V., Zhang, F.: Google Project on Hand Pose Prediction (2019)
8.
go back to reference Billinghurst, M., Kato, H., Poupyrev, I.: The magicbook: a transitional AR interface. Comput. Graphics 25, 745–753 (2001)CrossRef Billinghurst, M., Kato, H., Poupyrev, I.: The magicbook: a transitional AR interface. Comput. Graphics 25, 745–753 (2001)CrossRef
9.
go back to reference Boukhayma, A., De Bem, R., Torr, P.H.S.: 3D hand shape and pose from images in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)CrossRef Boukhayma, A., De Bem, R., Torr, P.H.S.: 3D hand shape and pose from images in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)CrossRef
10.
go back to reference Brachmann, E., Krull, A., Michel, F., Gumhold, S., Shotton, J., Rother, C.: Learning 6D object pose estimation using 3D object coordinates. In: European Conference on Computer Vision. Springer, Berlin (2014) Brachmann, E., Krull, A., Michel, F., Gumhold, S., Shotton, J., Rother, C.: Learning 6D object pose estimation using 3D object coordinates. In: European Conference on Computer Vision. Springer, Berlin (2014)
11.
go back to reference Brachmann, E., Michel, F., Krull, A., Yang, M.M., Gumhold, S., Rother, C.: Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image. In: IEEE Conference on Computer Vision and Pattern Recognition (2016) Brachmann, E., Michel, F., Krull, A., Yang, M.M., Gumhold, S., Rother, C.: Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
12.
go back to reference Brégier, R., Devernay, F., Leyrit, L., Crowley, J.L.: Defining the pose of any 3D rigid object and an associated distance. Int. J. Comput. Vis. 126(6), 571–596 (2018)CrossRefMATH Brégier, R., Devernay, F., Leyrit, L., Crowley, J.L.: Defining the pose of any 3D rigid object and an associated distance. Int. J. Comput. Vis. 126(6), 571–596 (2018)CrossRefMATH
14.
go back to reference Calli, B., Singh, A., Bruce, J., Walsman, A., Konolige, K., Srinivasa, S., Abbeel, P., Dollar, A.M.: Yale-CMU-Berkeley dataset for robotic manipulation research. Int. J. Rob. Res. 36(3), 261–268 (2017)CrossRef Calli, B., Singh, A., Bruce, J., Walsman, A., Konolige, K., Srinivasa, S., Abbeel, P., Dollar, A.M.: Yale-CMU-Berkeley dataset for robotic manipulation research. Int. J. Rob. Res. 36(3), 261–268 (2017)CrossRef
15.
go back to reference Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., Xiao, J., Yi, L., Yu, F.: ShapeNet: An Information-Rich 3D Model Repository. In: arXiv Preprint (2015) Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., Xiao, J., Yi, L., Yu, F.: ShapeNet: An Information-Rich 3D Model Repository. In: arXiv Preprint (2015)
16.
go back to reference Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems (2016) Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. In: Advances in Neural Information Processing Systems (2016)
17.
go back to reference Drummond, T., Cipolla, R.: Real-time visual tracking of complex structures. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 932–946 (2002)CrossRef Drummond, T., Cipolla, R.: Real-time visual tracking of complex structures. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 932–946 (2002)CrossRef
18.
go back to reference Eldan, R., Shamir, O.: The power of depth for feedforward neural networks. In: Conference on Learning Theory (2016) Eldan, R., Shamir, O.: The power of depth for feedforward neural networks. In: Conference on Learning Theory (2016)
19.
go back to reference Gao, X., Hou, X., Tang, J., Cheng, H.: Complete solution classification for the perspective-three-point problem. IEEE Trans. Pattern Anal. Mach. Intell. 25(8), 930–943 (2003)CrossRef Gao, X., Hou, X., Tang, J., Cheng, H.: Complete solution classification for the perspective-three-point problem. IEEE Trans. Pattern Anal. Mach. Intell. 25(8), 930–943 (2003)CrossRef
20.
go back to reference Garcia-Hernando, G., Yuan, S., Baek, S., Kim, T.-K.: First-person hand action benchmark with RGB-D videos and 3D hand pose annotations. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 409–419 (2018) Garcia-Hernando, G., Yuan, S., Baek, S., Kim, T.-K.: First-person hand action benchmark with RGB-D videos and 3D hand pose annotations. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 409–419 (2018)
21.
go back to reference Ge, L., Ren, Z., Li, Y., Xue, Z., Wang, Y., Cai, J., Yuan, J.: 3D hand shape and pose estimation from a single RGB Image. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)CrossRef Ge, L., Ren, Z., Li, Y., Xue, Z., Wang, Y., Cai, J., Yuan, J.: 3D hand shape and pose estimation from a single RGB Image. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)CrossRef
22.
go back to reference Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, A.Y.: Generative adversarial nets. In: Advances in Neural Information Processing Systems (2014) Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, A.Y.: Generative adversarial nets. In: Advances in Neural Information Processing Systems (2014)
23.
go back to reference Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge, MA 02142 (2016) Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge, MA 02142 (2016)
24.
25.
go back to reference Grabner, A., Roth, P.M., Lepetit, V.: 3D pose estimation and 3D model retrieval for objects in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)CrossRef Grabner, A., Roth, P.M., Lepetit, V.: 3D pose estimation and 3D model retrieval for objects in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)CrossRef
26.
go back to reference Grabner, A., Roth, P.M., Lepetit, V.: Location field descriptors: single image 3D model retrieval in the wild. In: IEEE International Conference on 3D Vision (2019) Grabner, A., Roth, P.M., Lepetit, V.: Location field descriptors: single image 3D model retrieval in the wild. In: IEEE International Conference on 3D Vision (2019)
27.
go back to reference Groueix, T., Fisher, M., Kim, V., Russell, B., Aubry, M.: Atlasnet: A Papier-Mâché approach to learning 3D surface generation. In: IEEE Conference on Computer Vision and Pattern Recognition (2018) Groueix, T., Fisher, M., Kim, V., Russell, B., Aubry, M.: Atlasnet: A Papier-Mâché approach to learning 3D surface generation. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)
28.
go back to reference Gustus, A., Stillfried, G., Visser, J., Jörntell, H., Van Der Smagt, P.: Human hand modelling: kinematics, dynamics, applications. Comput. Anim. 106(11), 741–755 (2012)MATH Gustus, A., Stillfried, G., Visser, J., Jörntell, H., Van Der Smagt, P.: Human hand modelling: kinematics, dynamics, applications. Comput. Anim. 106(11), 741–755 (2012)MATH
29.
go back to reference Hampali, S., Rad, M., Oberweger, M., Lepetit, A.V.: HOnnotate: A method for 3D annotation of hand and object poses. In: IEEE Conference on Computer Vision and Pattern Recognition (2020) Hampali, S., Rad, M., Oberweger, M., Lepetit, A.V.: HOnnotate: A method for 3D annotation of hand and object poses. In: IEEE Conference on Computer Vision and Pattern Recognition (2020)
30.
go back to reference Harris, C., Stennett, C.: RAPiD-A video rate object tracker. In: British Machine Vision Conference (1990)CrossRef Harris, C., Stennett, C.: RAPiD-A video rate object tracker. In: British Machine Vision Conference (1990)CrossRef
31.
go back to reference Harris, C.G., Stephens, M.J.: A combined corner and edge detector. In: Fourth Alvey Vision Conference (1988) Harris, C.G., Stephens, M.J.: A combined corner and edge detector. In: Fourth Alvey Vision Conference (1988)
32.
go back to reference Hasson, Y., Varol, G., Tzionas, D., Kalevatykh, I., Black, M.J., Laptev, I., Schmid, C.: Learning joint reconstruction of hands and manipulated objects. In: IEEE Conference on Computer Vision and Pattern Recognition (2019) Hasson, Y., Varol, G., Tzionas, D., Kalevatykh, I., Black, M.J., Laptev, I., Schmid, C.: Learning joint reconstruction of hands and manipulated objects. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
33.
go back to reference He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2016) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
34.
go back to reference He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: IEEE International Conference on Computer Vision (2017) He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: IEEE International Conference on Computer Vision (2017)
35.
go back to reference Hinterstoisser, S., Cagniart, C., Ilic, S., Sturm, P.F., Navab, N., Fua, P., Lepetit, V.: Gradient response maps for real-time detection of textureless objects. IEEE Trans. Pattern Anal. Mach. Intell. 34(5) 876–888 (2012)CrossRef Hinterstoisser, S., Cagniart, C., Ilic, S., Sturm, P.F., Navab, N., Fua, P., Lepetit, V.: Gradient response maps for real-time detection of textureless objects. IEEE Trans. Pattern Anal. Mach. Intell. 34(5) 876–888 (2012)CrossRef
36.
go back to reference Hinterstoisser, S., Lepetit, V., Ilic, S., Holzer, S., Bradski, G.R., Konolige, K., Navab, N.: Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In: Asian Conference on Computer Vision. Springer, Berlin (2012) Hinterstoisser, S., Lepetit, V., Ilic, S., Holzer, S., Bradski, G.R., Konolige, K., Navab, N.: Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In: Asian Conference on Computer Vision. Springer, Berlin (2012)
37.
go back to reference Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRef Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRef
38.
go back to reference Hodan, T., Haluza, P., Obdrzalek, S., Matas, J., Lourakis, M., Zabulis, X.: On evaluation of 6D object pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2016) Hodan, T., Haluza, P., Obdrzalek, S., Matas, J., Lourakis, M., Zabulis, X.: On evaluation of 6D object pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2016)
39.
go back to reference Hoell, M., Oberweger, M., Arth, C., Lepetit, A.V.: Efficient physics-based implementation for realistic hand-object interaction in virtual reality. In: IEEE Conference on Virtual Reality (2018) Hoell, M., Oberweger, M., Arth, C., Lepetit, A.V.: Efficient physics-based implementation for realistic hand-object interaction in virtual reality. In: IEEE Conference on Virtual Reality (2018)
40.
go back to reference Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal approximators. Neural Networks 2(5), 359–366 (1989)CrossRefMATH Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal approximators. Neural Networks 2(5), 359–366 (1989)CrossRefMATH
41.
go back to reference Hu, Y., Hugonot, J., Fua, P., Salzmann, M.: Segmentation-driven 6D object pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019) Hu, Y., Hugonot, J., Fua, P., Salzmann, M.: Segmentation-driven 6D object pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
42.
go back to reference Iqbal, U., Molchanov, P., Breuel, T., Gall, J., Kautz, J.: Hand pose estimation via Latent 2.5D heatmap regression. In: European Conference on Computer Vision. Springer, Berlin (2018) Iqbal, U., Molchanov, P., Breuel, T., Gall, J., Kautz, J.: Hand pose estimation via Latent 2.5D heatmap regression. In: European Conference on Computer Vision. Springer, Berlin (2018)
43.
go back to reference Izadinia, H., Shan, Q., Seitz, S.: Im2cad. In: IEEE Conference on Computer Vision and Pattern Recognition (2017) Izadinia, H., Shan, Q., Seitz, S.: Im2cad. In: IEEE Conference on Computer Vision and Pattern Recognition (2017)
44.
go back to reference Kehl, W., Manhardt, F., Tombari, F., Ilic, S., Navab, N.: SSD-6D: making RGB-Based 3D detection and 6D pose estimation great again. In: IEEE International Conference on Computer Vision (2017) Kehl, W., Manhardt, F., Tombari, F., Ilic, S., Navab, N.: SSD-6D: making RGB-Based 3D detection and 6D pose estimation great again. In: IEEE International Conference on Computer Vision (2017)
45.
go back to reference Kim, K., Grundmann, M., Shamir, A., Matthews, I., Hodgins, J., Essa, I.: Motion fields to predict play evolution in dynamic sport scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (2010) Kim, K., Grundmann, M., Shamir, A., Matthews, I., Hodgins, J., Essa, I.: Motion fields to predict play evolution in dynamic sport scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (2010)
46.
go back to reference Kokic, M., Kragic, D., Bohg, J.: Learning to estimate pose and shape of hand-held objects from RGB images. In: IEEE International Conference on Intelligent Robots and Systems (2019) Kokic, M., Kragic, D., Bohg, J.: Learning to estimate pose and shape of hand-held objects from RGB images. In: IEEE International Conference on Intelligent Robots and Systems (2019)
47.
go back to reference Kulon, D., Guler, R.A., Kokkinos, I., Bronstein, M., Zafeiriou, S.: Weakly-supervised mesh-convolutional hand reconstruction in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2020) Kulon, D., Guler, R.A., Kokkinos, I., Bronstein, M., Zafeiriou, S.: Weakly-supervised mesh-convolutional hand reconstruction in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2020)
48.
go back to reference Kyriazis, N., Argyros, A.A.: Physically plausible 3D scene tracking: the single actor hypothesis. In: IEEE Conference on Computer Vision and Pattern Recognition (2013) Kyriazis, N., Argyros, A.A.: Physically plausible 3D scene tracking: the single actor hypothesis. In: IEEE Conference on Computer Vision and Pattern Recognition (2013)
49.
go back to reference Labbé, Y., Carpentier, J., Aubry, M., Sivic, J.: CosyPose: consistent multi-view multi-object 6D pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2020) Labbé, Y., Carpentier, J., Aubry, M., Sivic, J.: CosyPose: consistent multi-view multi-object 6D pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2020)
50.
go back to reference Lepetit, V., Lagger, P., Fua, P.: Randomized trees for real-time keypoint recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2005) Lepetit, V., Lagger, P., Fua, P.: Randomized trees for real-time keypoint recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2005)
51.
go back to reference Li, Y., Fu, J.L., Pollard, N.S.: Data-driven grasp synthesis using shape matching and task-based pruning. IEEE Trans. Visual Comput. Graphics 13(4), 732–747 (2007)CrossRef Li, Y., Fu, J.L., Pollard, N.S.: Data-driven grasp synthesis using shape matching and task-based pruning. IEEE Trans. Visual Comput. Graphics 13(4), 732–747 (2007)CrossRef
52.
go back to reference Li, Y., Wang, G., Ji, X., Xiang, Y., Fox, D.: DeepIM: deep iterative matching for 6D pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2018) Li, Y., Wang, G., Ji, X., Xiang, Y., Fox, D.: DeepIM: deep iterative matching for 6D pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2018)
53.
go back to reference Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: European Conference on Computer Vision, pp. 740–755. Springer, Berlin (2014) Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: European Conference on Computer Vision, pp. 740–755. Springer, Berlin (2014)
54.
go back to reference Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.E., Fu, C.-Y., Berg, A.C.: SSD: single shot multibox detector. In: European Conference on Computer Vision. Springer, Berlin (2016) Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.E., Fu, C.-Y., Berg, A.C.: SSD: single shot multibox detector. In: European Conference on Computer Vision. Springer, Berlin (2016)
55.
go back to reference Lowe, D.G.: Fitting parameterized three-dimensional models to images. IEEE Trans. Pattern Anal. Mach. Intell. 13(5), 441–450 (1991)CrossRef Lowe, D.G.: Fitting parameterized three-dimensional models to images. IEEE Trans. Pattern Anal. Mach. Intell. 13(5), 441–450 (1991)CrossRef
56.
go back to reference Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 20(2), 91–110 (2004)CrossRef Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 20(2), 91–110 (2004)CrossRef
57.
go back to reference Miller, A.T., Allen, P.K.: Graspit! a versatile simulator for robotic grasping. IEEE Rob. Autom. Mag. 11(4), 110–122 (2004)CrossRef Miller, A.T., Allen, P.K.: Graspit! a versatile simulator for robotic grasping. IEEE Rob. Autom. Mag. 11(4), 110–122 (2004)CrossRef
58.
go back to reference Miller, A.T., Knoop, S., Christensen, H.I., Allen, P.K.: Automatic grasp planning using shape primitives. In: International Conference on Robotics and Automation (2003) Miller, A.T., Knoop, S., Christensen, H.I., Allen, P.K.: Automatic grasp planning using shape primitives. In: International Conference on Robotics and Automation (2003)
59.
go back to reference Moon, G., Chang, J.Y., Lee, K.M.: V2V-PoseNet: Voxel-To-Voxel Prediction network for accurate 3D hand and human pose estimation from a single depth map. In: IEEE Conference on Computer Vision and Pattern Recognition (2018) Moon, G., Chang, J.Y., Lee, K.M.: V2V-PoseNet: Voxel-To-Voxel Prediction network for accurate 3D hand and human pose estimation from a single depth map. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)
60.
go back to reference Mueller, F., Bernard, F., Sotnychenko, O., Mehta, D., Sridhar, S., Casas, D., Theobalt, C.: GANerated hands for real-time 3D hand tracking from monocular RGB. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)CrossRef Mueller, F., Bernard, F., Sotnychenko, O., Mehta, D., Sridhar, S., Casas, D., Theobalt, C.: GANerated hands for real-time 3D hand tracking from monocular RGB. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)CrossRef
61.
go back to reference Mueller, F., Bernard, F., Sotnychenko, O., Mehta, D., Sridhar, S., Casas, D., Theobalt, C.: GANerated hands for real-time 3D hand tracking from monocular RGB. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)CrossRef Mueller, F., Bernard, F., Sotnychenko, O., Mehta, D., Sridhar, S., Casas, D., Theobalt, C.: GANerated hands for real-time 3D hand tracking from monocular RGB. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)CrossRef
62.
go back to reference Oberweger, M., Wohlhart, P., Lepetit, V.: Training a feedback loop for hand pose estimation. In: IEEE International Conference on Computer Vision (2015) Oberweger, M., Wohlhart, P., Lepetit, V.: Training a feedback loop for hand pose estimation. In: IEEE International Conference on Computer Vision (2015)
63.
go back to reference Oberweger, M., Wohlhart, P., Lepeti, V.: DeepPrior++: improving fast and accurate 3D hand pose estimation. In: IEEE International Conference on Computer Vision, p. 2 (2017) Oberweger, M., Wohlhart, P., Lepeti, V.: DeepPrior++: improving fast and accurate 3D hand pose estimation. In: IEEE International Conference on Computer Vision, p. 2 (2017)
64.
go back to reference Oberweger, M., Rad, M., Lepetit, V.: Making deep heatmaps robust to partial occlusions for 3D object pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2018) Oberweger, M., Rad, M., Lepetit, V.: Making deep heatmaps robust to partial occlusions for 3D object pose estimation. In: European Conference on Computer Vision. Springer, Berlin (2018)
66.
go back to reference Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Full DoF tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: IEEE International Conference on Computer Vision (2011) Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Full DoF tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: IEEE International Conference on Computer Vision (2011)
67.
go back to reference Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, A.S.: Deepsdf: learning continuous signed distance functions for shape representation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019) Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, A.S.: Deepsdf: learning continuous signed distance functions for shape representation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
68.
go back to reference Park, K., Patten, T., Vincze, M.: Pix2pose: pixel-wise coordinate regression of objects for 6D pose estimation. In: IEEE International Conference on Computer Vision (2019) Park, K., Patten, T., Vincze, M.: Pix2pose: pixel-wise coordinate regression of objects for 6D pose estimation. In: IEEE International Conference on Computer Vision (2019)
69.
go back to reference Pavlakos, G., Zhou, X., Chan, A., Derpanis, K., Daniilidis, K.: 6-DoF object pose from semantic keypoints. In: International Conference on Robotics and Automation, pp. 2011–2018 (2017) Pavlakos, G., Zhou, X., Chan, A., Derpanis, K., Daniilidis, K.: 6-DoF object pose from semantic keypoints. In: International Conference on Robotics and Automation, pp. 2011–2018 (2017)
70.
go back to reference Peng, S., Liu, Y., Huang, Q., Bao, H., Zhou, X.: PVNet: pixel-wise voting network for 6DoF pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019) Peng, S., Liu, Y., Huang, Q., Bao, H., Zhou, X.: PVNet: pixel-wise voting network for 6DoF pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
71.
go back to reference Pepik, B., Stark, M., Gehler, P., Ritschel, T., Schiele, B.: 3D object class detection in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)CrossRef Pepik, B., Stark, M., Gehler, P., Ritschel, T., Schiele, B.: 3D object class detection in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)CrossRef
72.
go back to reference Pinkus, A.: Approximation theory of the MLP model in neural networks. Comput. Anim 8, 143–195 (1999)MATH Pinkus, A.: Approximation theory of the MLP model in neural networks. Comput. Anim 8, 143–195 (1999)MATH
73.
go back to reference Pitteri, G., Bugeau, A., Ilic, S., Lepetit, V.: 3D object detection and pose estimation of unseen objects in color images with local surface EMbeddings. In: Asian Conference on Computer Vision Springer, Berlin (2020) Pitteri, G., Bugeau, A., Ilic, S., Lepetit, V.: 3D object detection and pose estimation of unseen objects in color images with local surface EMbeddings. In: Asian Conference on Computer Vision Springer, Berlin (2020)
74.
go back to reference Poirson, P., Ammirato, P., Fu, C.-Y., Liu, W., Kosecka, J., Berg, A.C.: Fast single shot detection and pose estimation. In: IEEE International Conference on 3D Vision (2016) Poirson, P., Ammirato, P., Fu, C.-Y., Liu, W., Kosecka, J., Berg, A.C.: Fast single shot detection and pose estimation. In: IEEE International Conference on 3D Vision (2016)
75.
go back to reference Pumarola, A., Agudo, A., Porzi, L., Sanfeliu, A., Lepetit, V., Moreno-Noguer, F.: Geometry-aware network for non-rigid shape prediction from a single view. In: IEEE Conference on Computer Vision and Pattern Recognition (2018) Pumarola, A., Agudo, A., Porzi, L., Sanfeliu, A., Lepetit, V., Moreno-Noguer, F.: Geometry-aware network for non-rigid shape prediction from a single view. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)
76.
go back to reference Rad, M., Lepetit, V.: BB8: a scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: IEEE International Conference on Computer Vision (2017) Rad, M., Lepetit, V.: BB8: a scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth. In: IEEE International Conference on Computer Vision (2017)
77.
go back to reference Rad, M., Oberweger, M., Lepetit, V.: Feature mapping for learning fast and accurate 3D pose inference from synthetic images. In: IEEE Conference on Computer Vision and Pattern Recognition (2018) Rad, M., Oberweger, M., Lepetit, V.: Feature mapping for learning fast and accurate 3D pose inference from synthetic images. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)
78.
go back to reference Redmon, J., Farhadi, A.: Yolo9000: better, faster, stronger. In: IEEE conference on computer vision and pattern recognition (2017) Redmon, J., Farhadi, A.: Yolo9000: better, faster, stronger. In: IEEE conference on computer vision and pattern recognition (2017)
79.
go back to reference Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2016) Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
80.
go back to reference Rijpkema, H., Girard, M.: Computer animation of knowledge-based human grasping. In: ACM SIGGRAPH, pp. 339–348 (1991) Rijpkema, H., Girard, M.: Computer animation of knowledge-based human grasping. In: ACM SIGGRAPH, pp. 339–348 (1991)
81.
go back to reference Romero, J., Tzionas, D., Black, M.J.: Embodied hands: modeling and capturing hands and bodies together. ACM Trans. Graphics 36, 245:1 – 245:17 (2017) Romero, J., Tzionas, D., Black, M.J.: Embodied hands: modeling and capturing hands and bodies together. ACM Trans. Graphics 36, 245:1 – 245:17 (2017)
82.
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: IEEE International Conference on Computer Vision (2011)
83.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)
84.
Salzmann, M., Fua, P.: Deformable surface 3D reconstruction from monocular images. Morgan-Claypool Publishers, San Rafael (2010)
85.
Sharp, T., Keskin, C., Robertson, D.P., Taylor, J., Shotton, J., Kim, D., Rhemann, C., Leichter, I., Vinnikov, A., Wei, Y., Freedman, D., Kohli, P., Krupka, E., Fitzgibbon, A.W., Izadi, A.S.: Accurate, robust, and flexible real-time hand tracking. In: Association for Computing Machinery (2015)
86.
Shin, D., Fowlkes, C.C., Hoiem, D.: Pixels, voxels, and views: a study of shape representations for single view 3D object shape prediction. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)
87.
Shotton, J., Girshick, R., Fitzgibbon, A., Sharp, T., Cook, M., Finocchio, M., Moore, R., Kohli, P., Criminisi, A., Kipman, A., Blake, A.: Efficient human pose estimation from single depth images. IEEE Trans. Pattern Anal. Mach. Intell. 35(12), 2821–2840 (2013)
88.
Simon, T., Joo, H., Matthews, I.A., Sheikh, A.Y.: Hand keypoint detection in single images using multiview bootstrapping. In: IEEE Conference on Computer Vision and Pattern Recognition (2017)
89.
Sock, J., Garcia-Hernando, G., Armagan, A., Kim, T.-K.: Introducing pose consistency and warp-alignment for self-supervised 6D object pose estimation in color images. In: IEEE International Conference on 3D Vision (2020)
90.
Sun, X., Wei, Y., Liang, S., Tang, X., Sun, J.: Cascaded hand pose regression. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)
91.
Sun, X., Shang, J., Liang, S., Wei, Y.: Compositional human pose regression. In: IEEE International Conference on Computer Vision (2017)
92.
Sundermeyer, M., Marton, Z.-C., Durner, M., Triebel, R.: Augmented autoencoders: implicit 3D orientation learning for 6D object detection. Int. J. Comput. Vis. 128(3), 714–729 (2019)
93.
Sundermeyer, M., Durner, M., Puang, E.Y., Vaskevicius, N., Marton, Z.C., Arras, K.O., Triebel, R.: Multi-path learning for object pose estimation across domains. In: IEEE Conference on Computer Vision and Pattern Recognition (2020)
94.
Supancic, J.S., Rogez, G., Yang, Y., Shotton, J., Ramanan, D.: Depth-based hand pose estimation: data, methods, and challenges. In: IEEE International Conference on Computer Vision (2015)
95.
Talvas, A., Marchal, M., Duriez, C., Otaduy, M.A.: Aggregate constraints for virtual manipulation with soft fingers. IEEE Trans. Visual Comput. Graphics 21(4), 452–461 (2015)
96.
Tang, D., Chang, H.J., Tejani, A., Kim, T.-K.: Latent regression forest: structured estimation of 3D articulated hand posture. In: IEEE Conference on Computer Vision and Pattern Recognition (2014)
97.
Taylor, J., Shotton, J., Sharp, T., Fitzgibbon, A.: The vitruvian manifold: inferring dense correspondences for one-shot human pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
98.
Tekin, B., Sinha, S.N., Fua, P.: Real-time seamless single shot 6D object pose prediction. In: IEEE Conference on Computer Vision and Pattern Recognition (2018)
99.
Tekin, B., Bogo, F., Pollefeys, M.: H+O: unified egocentric recognition of 3D hand-object poses and interactions. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
100.
Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: IEEE International Conference on Intelligent Robots and Systems (2017)
101.
Tolani, D., Goswami, A., Badler, N.I.: Real-time inverse kinematics techniques for anthropomorphic limbs. Comput. Anim. 62(5), 353–388 (2000)
102.
Tompson, J., Jain, A., LeCun, Y., Bregler, C.: Joint training of a convolutional network and a graphical model for human pose estimation. In: Advances in Neural Information Processing Systems (2014)
103.
Toshev, A., Szegedy, C.: DeepPose: human pose estimation via deep neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition (2014)
104.
Tulsiani, S., Malik, J.: Viewpoints and keypoints. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)
105.
Tulsiani, S., Carreira, J., Malik, J.: Pose induction for novel object categories. In: IEEE International Conference on Computer Vision (2015)
106.
Tzionas, D., Ballan, L., Srikantha, A., Aponte, P., Pollefeys, M., Gall, J.: Capturing hands in action using discriminative salient points and physics simulation. Int. J. Comput. Vis. 118(2), 172–193 (2016)
107.
Vacchetti, L., Lepetit, V., Fua, P.: Stable real-time 3D tracking using online and offline information. IEEE Trans. Pattern Anal. Mach. Intell. 26(10), 1385–1391 (2004)
108.
Wan, C., Probst, T., Van Gool, L., Yao, A.: Self-supervised 3D hand pose estimation through training by fitting. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
109.
Wang, R., Paris, S., Popović, J.: 6D Hands: markerless hand-tracking for computer aided design. In: Association for Computing Machinery, pp. 549–558 (2011)
110.
Wei, S.E., Ramakrishna, V., Kanade, T., Sheikh, Y.: Convolutional pose machines. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
111.
Xiang, Y., Kim, W., Chen, W., Ji, J., Choy, C., Su, H., Mottaghi, R., Guibas, L., Savarese, S.: ObjectNet3D: a large scale database for 3D object recognition. In: European Conference on Computer Vision. Springer, Berlin (2016)
112.
Xiang, Y., Schmidt, T., Narayanan, V., Fox, D.: PoseCNN: a convolutional neural network for 6D object pose estimation in cluttered scenes. In: Robotics: Science and Systems Conference (2018)
113.
Xiang, Y., Schmidt, T., Narayanan, V., Fox, D.: PoseCNN: a convolutional neural network for 6D object pose estimation in cluttered scenes. In: Robotics: Science and Systems Conference (2018)
114.
Xiao, Y., Marlet, R.: Few-shot object detection and viewpoint estimation for objects in the wild. In: European Conference on Computer Vision. Springer, Berlin (2020)
115.
Xiong, F., Zhang, B., Xiao, Y., Cao, Z., Yu, T., Zhou, J.T., Yuan, J.: A2J: anchor-to-joint regression network for 3D articulated pose estimation from a single depth image. In: IEEE International Conference on Computer Vision (2019)
116.
Yuan, S., Ye, Q., Stenger, B., Jain, S., Kim, T.-K.: BigHand2.2M benchmark: hand pose dataset and state of the art analysis. In: IEEE Conference on Computer Vision and Pattern Recognition (2017)
117.
Zakharov, S., Shugurov, I., Ilic, S.: DPOD: 6D pose object detector and refiner. In: IEEE International Conference on Computer Vision (2019)
118.
Zhu, J.-Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: IEEE International Conference on Computer Vision (2017)
119.
Zimmermann, C., Brox, T.: Learning to estimate 3D hand pose from single RGB images. In: IEEE International Conference on Computer Vision (2017)
120.
Zimmermann, C., Ceylan, D., Yang, J., Russell, B.C., Argus, M.J., Brox, T.: FreiHAND: a dataset for markerless capture of hand pose and shape from single RGB images. In: IEEE International Conference on Computer Vision (2019)
Metadata
Title
3D Object and Hand Pose Estimation
Author
Vincent Lepetit
Copyright Year
2023
DOI
https://doi.org/10.1007/978-3-030-67822-7_4
