Published in: Neural Computing and Applications 11/2021

30.09.2020 | Original Article

Learning-detailed 3D face reconstruction based on convolutional neural networks from a single image

Authors: Asad Khan, Sakander Hayat, Muhammad Ahmad, Jinde Cao, Muhammad Faizan Tahir, Asad Ullah, Muhammad Sufyan Javed

Abstract

Convolutional neural networks (CNNs) have made it practical to reconstruct detailed 3D face geometry from a single input image. The success of CNN-based techniques depends on large-scale labelled data. However, no publicly available dataset provides a large number of face images annotated with the corresponding 3D face geometry. State-of-the-art learning-based 3D face reconstruction methods therefore synthesize their training data from a coarse morphable face model, which yields non-photo-realistic face images. In this article, we propose a novel data-generation technique that uses learning-based inverse face rendering to render a large number of photo-realistic face images with varied properties. Using the datasets constructed in this way, which build on real-time fine-scale textured 3D face reconstruction, we train two cascaded CNNs in a coarse-to-fine manner for detailed 3D face reconstruction from a single image. Experimental results demonstrate that our method efficiently reconstructs 3D face shapes with geometric detail from only one input image. The results further demonstrate that the technique remains effective under variations in pose, expression and lighting.
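The abstract does not specify what each of the two cascaded networks outputs, so the following is only a minimal sketch of the coarse-to-fine idea under stated assumptions. It assumes that the coarse stage yields coefficients of a linear morphable model (shape = mean + weighted sum of basis shapes) and that the fine stage yields one scalar offset per vertex, applied along the vertex normal. The function names `coarse_shape` and `refine_shape`, the data layout, and the per-vertex offset representation are hypothetical and are not taken from the paper; the CNNs themselves are not modelled.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def coarse_shape(mean: Sequence[Vec3],
                 basis: Sequence[Sequence[Vec3]],
                 coeffs: Sequence[float]) -> List[Vec3]:
    """Evaluate a linear morphable model: mean + sum_k coeffs[k] * basis[k].

    Assumption: a coarse network would regress `coeffs`; here they are given.
    """
    if len(basis) != len(coeffs):
        raise ValueError("one coefficient is needed per basis shape")
    vertices: List[Vec3] = []
    for i, (x, y, z) in enumerate(mean):
        for c, component in zip(coeffs, basis):
            bx, by, bz = component[i]
            x += c * bx
            y += c * by
            z += c * bz
        vertices.append((x, y, z))
    return vertices


def refine_shape(vertices: Sequence[Vec3],
                 normals: Sequence[Vec3],
                 offsets: Sequence[float]) -> List[Vec3]:
    """Hypothetical fine stage: shift each vertex along its unit normal.

    Assumption: a fine network would predict `offsets`; here they are given.
    """
    if not (len(vertices) == len(normals) == len(offsets)):
        raise ValueError("vertices, normals and offsets must have equal length")
    return [
        (vx + d * nx, vy + d * ny, vz + d * nz)
        for (vx, vy, vz), (nx, ny, nz), d in zip(vertices, normals, offsets)
    ]


if __name__ == "__main__":
    # Toy two-vertex "face" with a single basis shape (illustrative values only).
    mean = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    basis = [[(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]]
    coarse = coarse_shape(mean, basis, coeffs=[0.5])
    fine = refine_shape(coarse, normals=[(0.0, 0.0, 1.0)] * 2, offsets=[0.25, -0.25])
    print(coarse)
    print(fine)
```

Splitting the reconstruction this way keeps the coarse stage low-dimensional (a handful of coefficients), while the fine stage only has to account for small residual detail on top of an already plausible face shape.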

Footnotes
1. Both fine-scale and coarse-scale photo-realistic face image datasets will be publicly available once the present work is published.
 
Metadata
Title
Learning-detailed 3D face reconstruction based on convolutional neural networks from a single image
Authors
Asad Khan
Sakander Hayat
Muhammad Ahmad
Jinde Cao
Muhammad Faizan Tahir
Asad Ullah
Muhammad Sufyan Javed
Publication date
30.09.2020
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 11/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-05373-w
