
2018 | Original Paper | Book chapter

Generating 3D Faces Using Convolutional Mesh Autoencoders

Authors: Anurag Ranjan, Timo Bolkart, Soubhik Sanyal, Michael J. Black

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing


Abstract

Learned 3D representations of human faces are useful for computer vision problems such as 3D face tracking and reconstruction from images, as well as graphics applications such as character generation and animation. Traditional models learn a latent representation of a face using linear subspaces or higher-order tensor generalizations. Due to this linearity, they cannot capture extreme deformations and non-linear expressions. To address this, we introduce a versatile model that learns a non-linear representation of a face using spectral convolutions on a mesh surface. We introduce mesh sampling operations that enable a hierarchical mesh representation that captures non-linear variations in shape and expression at multiple scales within the model. In a variational setting, our model samples diverse realistic 3D faces from a multivariate Gaussian distribution. Our training data consists of 20,466 meshes of extreme expressions captured over 12 different subjects. Despite limited training data, our trained model outperforms state-of-the-art face models with 50% lower reconstruction error, while using 75% fewer parameters. We also show that replacing the expression space of an existing state-of-the-art face model with our model achieves a lower reconstruction error. Our data, model and code are available at http://coma.is.tue.mpg.de/.
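
To make the idea of a spectral convolution on a mesh more concrete, the following is a minimal NumPy sketch. The abstract does not state which spectral formulation is used, so the Chebyshev-polynomial filter here is an assumption, chosen because it is a widely used way to approximate spectral graph filters. The function names (scaled_laplacian, chebyshev_conv), the toy tetrahedron mesh, and the filter sizes are hypothetical and for illustration only; this is not the authors' implementation.

```python
import numpy as np


def scaled_laplacian(adjacency):
    """Normalized graph Laplacian, rescaled so its eigenvalues lie in [-1, 1].

    Assumes every vertex has at least one neighbor (no zero degrees).
    """
    n = len(adjacency)
    degree = adjacency.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # L = I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(n) - d_inv_sqrt[:, None] * adjacency * d_inv_sqrt[None, :]
    lambda_max = np.linalg.eigvalsh(laplacian).max()
    return 2.0 * laplacian / lambda_max - np.eye(n)


def chebyshev_conv(features, laplacian_scaled, weights):
    """Apply a Chebyshev filter of order K = len(weights).

    features: (n_vertices, f_in) per-vertex input, e.g. xyz coordinates.
    weights:  (K, f_in, f_out) one weight matrix per polynomial order.
    """
    t_prev = features                              # T_0(L) x = x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        t_curr = laplacian_scaled @ features       # T_1(L) x = L x
        out = out + t_curr @ weights[1]
        for k in range(2, len(weights)):
            # Chebyshev recurrence: T_k = 2 L T_{k-1} - T_{k-2}
            t_next = 2.0 * (laplacian_scaled @ t_curr) - t_prev
            out = out + t_next @ weights[k]
            t_prev, t_curr = t_curr, t_next
    return out


# Toy mesh: a tetrahedron, where every pair of the 4 vertices shares an edge.
adjacency = np.ones((4, 4)) - np.eye(4)
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 3, 8))   # K = 3 orders, 3 input -> 8 output features
lap = scaled_laplacian(adjacency)
out = chebyshev_conv(vertices, lap, weights)
print(out.shape)
```

Each polynomial order k mixes information from vertices up to k edges away, so the filter stays local on the mesh while being defined through the Laplacian spectrum. In a real model the weights would be learned, and the mesh would be down- and up-sampled between such layers to obtain the hierarchical representation the abstract describes.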


Metadata
Title
Generating 3D Faces Using Convolutional Mesh Autoencoders
Authors
Anurag Ranjan
Timo Bolkart
Soubhik Sanyal
Michael J. Black
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01219-9_43