Published in: International Journal of Computer Vision 6-7/2019

13 February 2019

Deep, Landmark-Free FAME: Face Alignment, Modeling, and Expression Estimation

Authors: Feng-Ju Chang, Anh Tuan Tran, Tal Hassner, Iacopo Masi, Ram Nevatia, Gérard Medioni

Abstract

We present a novel method for modeling 3D face shape, viewpoint, and expression from a single, unconstrained photo. Our method uses three deep convolutional neural networks to estimate each of these components separately. Importantly, unlike others, our method does not use facial landmark detection at test time; instead, it estimates these properties directly from image intensities. In fact, rather than using detectors, we show how accurate landmarks can be obtained as a by-product of our modeling process. We rigorously test our proposed method. To this end, we raise a number of concerns with existing practices used in evaluating face landmark detection methods. In response to these concerns, we propose novel paradigms for testing the effectiveness of rigid and non-rigid face alignment methods without relying on landmark detection benchmarks. We evaluate rigid face alignment by measuring its effects on face recognition accuracy on the challenging IJB-A and IJB-B benchmarks. Non-rigid, expression estimation is tested on the CK+ and EmotiW’17 benchmarks for emotion classification. We do, however, report the accuracy of our approach as a landmark detector for 3D landmarks on AFLW2000-3D and 2D landmarks on 300W and AFLW-PIFA. A surprising conclusion of these results is that better landmark detection accuracy does not necessarily translate to better face processing. Parts of this paper were previously published by Tran et al. (2017) and Chang et al. (2017, 2018).
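As a rough illustration of how 2D landmarks could fall out of a 3D face model as a by-product, the sketch below projects reference 3D landmark points into the image using an estimated viewpoint. It is not the authors' implementation. It assumes the viewpoint is a rigid rotation and translation and that the camera is a simple pinhole; the function names, the Euler-angle convention, and the focal length and image center values are all hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_from_euler(yaw, pitch, roll):
    """Return R = Rz(roll) * Ry(yaw) * Rx(pitch), angles in radians.

    The axis order is an assumption made for this sketch only.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    rx = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rz = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(ry, rx))


def project_landmarks(points_3d, rotation, translation, focal, center):
    """Project 3D model landmarks to 2D pixel coordinates (pinhole camera)."""
    landmarks_2d = []
    for p in points_3d:
        # Rigid transform from model coordinates to camera coordinates.
        cam = [sum(rotation[i][k] * p[k] for k in range(3)) + translation[i]
               for i in range(3)]
        if cam[2] <= 0:
            raise ValueError("landmark lies behind the camera")
        # Perspective division followed by a shift to the image center.
        u = focal * cam[0] / cam[2] + center[0]
        v = focal * cam[1] / cam[2] + center[1]
        landmarks_2d.append((u, v))
    return landmarks_2d


if __name__ == "__main__":
    # Hypothetical values: one model landmark, frontal pose, 10 units away.
    pose = rotation_from_euler(0.0, 0.0, 0.0)
    print(project_landmarks([(1.0, 2.0, 0.0)], pose, (0.0, 0.0, 10.0),
                            100.0, (50.0, 50.0)))
```

In a setup like this, no landmark detector runs on the image: the landmark positions follow entirely from the estimated shape and pose, which is the sense in which they are a by-product of the modeling.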

Footnotes
1. FPN, bundled with rendering and alignment code, is publicly available from: https://github.com/fengju514/Face-Pose-Net.
2. The train/test partitions of PIFA are available at http://cvlab.cse.msu.edu/project-pifa.html.
 
References
Artizzu, X. P., Perona, P., & Dollár, P. (2013). Robust face landmark estimation under occlusion. In Proceedings of the international conference on computer vision.
Asthana, A., Zafeiriou, S., Cheng, S., & Pantic, M. (2014). Incremental face alignment in the wild. In Proceedings of the conference on computer vision pattern recognition.
Baltrušaitis, T., Robinson, P., & Morency, L. P. (2013). Constrained local neural fields for robust facial landmark detection in the wild. In Proceedings of the conference on computer vision pattern recognition workshops.
Baltrušaitis, T., Robinson, P., & Morency, L. P. (2016). Openface: An open source facial behavior analysis toolkit. In Winter conference on applications of computer vision.
Bansal, A., Russell, B., & Gupta, A. (2016). Marr revisited: 2D-3D alignment via surface normal prediction. In Proceedings of the conference on computer vision pattern recognition.
Bas, A., Smith, W. A. P., Bolkart, T., & Wuhrer, S. (2016). Fitting a 3D morphable model to edges: A comparison between hard and soft correspondences. In ACCV workshops.
Belhumeur, P. N., Jacobs, D. W., Kriegman, D. J., & Kumar, N. (2013). Localizing parts of faces using a consensus of exemplars. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12), 2930–2940.
Bhagavatula, C., Zhu, C., Luu, K., & Savvides, M. (2017). Faster than real-time facial alignment: A 3D spatial transformer network approach in unconstrained poses. In Proceedings of the international conference on computer vision.
Blanz, V., & Vetter, T. (1999). Morphable model for the synthesis of 3D faces. In Proceedings of ACM SIGGRAPH conference on computer graphics.
Blanz, V., & Vetter, T. (2003). Face recognition based on fitting a 3d morphable model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9), 1063–1074.
Blanz, V., Romdhani, S., & Vetter, T. (2002). Face identification across different poses and illuminations with a 3d morphable model. In International conference on automatic face and gesture recognition.
Blanz, V., Scherbaum, K., Vetter, T., & Seidel, H. P. (2004). Exchanging faces in images. Computer Graphics Forum, 23(3), 669–676.
Booth, J., Antonakos, E., Ploumpis, S., Trigeorgis, G., Panagakis, Y., & Zafeiriou, S. (2017). 3D face morphable models “in-the-wild”. In Proceedings of conference on computer vision pattern recognition.
Bulat, A., & Tzimiropoulos, G. (2017a). Binarized convolutional landmark localizers for human pose estimation and face alignment with limited resources. In Proceedings of the international conference on computer vision.
Bulat, A., & Tzimiropoulos, G. (2017b). How far are we from solving the 2d and 3d face alignment problem? (and a dataset of 230,000 3d facial landmarks). In Proceedings of the international conference on computer vision.
Cao, X., Wei, Y., Wen, F., & Sun, J. (2014). Face alignment by explicit shape regression. International Journal of Computer Vision, 107(2), 177–190.
Chang, F. J., Tran, A., Hassner, T., Masi, I., Nevatia, R., & Medioni, G. (2017). Faceposenet: Making a case for landmark-free face alignment. In Proceedings of international conference on computer vision workshops.
Chang, F. J., Tran, A. T., Hassner, T., Masi, I., Nevatia, R., & Medioni, G. (2018). Expnet: Landmark-free, deep, 3D facial expressions. In International conference on automatic face and gesture recognition.
Chu, B., Romdhani, S., & Chen, L. (2014). 3D-aided face recognition robust to expression and pose variations. In Proceedings of conference on computer vision pattern recognition.
Crosswhite, N., Byrne, J., Stauffer, C., Parkhi, O., Cao, Q., & Zisserman, A. (2017). Template adaptation for face verification and identification. In International conference on automatic face and gesture recognition.
Dantone, M., Gall, J., Fanelli, G., & Van Gool, L. (2012). Real-time facial feature detection using conditional regression forests. In Proceedings of conference on computer vision pattern recognition.
Dhall, A., Goecke, R., Lucey, S., & Gedeon, T. (2012). Collecting large, richly annotated facial-expression databases from movies. IEEE MultiMedia, 19(3), 34–41.
Dhall, A., Goecke, R., Ghosh, S., Joshi, J., Hoey, J., & Gedeon, T. (2017). From individual to group-level emotion recognition: Emotiw 5.0. In ACM ICMI.
Dhall, A., Murthy, O. R., Goecke, R., Joshi, J., & Gedeon, T. (2015). Video and image based emotion recognition challenges in the wild: EmotiW 2015. In ACM ICMI.
Dong, X., Yan, Y., Ouyang, W., & Yang, Y. (2018a). Style aggregated network for facial landmark detection. In Proceedings of conference on computer vision pattern recognition.
Dong, X., Yu, S. I., Weng, X., Wei, S. E., Yang, Y., & Sheikh, Y. (2018b). Supervision-by-registration: An unsupervised approach to improve the precision of facial landmark detectors. In Proceedings of conference on computer vision pattern recognition.
Dou, P., Shah, S. K., & Kakadiaris, I. A. (2017). End-to-end 3D face reconstruction with deep neural networks. In Proceedings of conference on computer vision pattern recognition.
Eidinger, E., Enbar, R., & Hassner, T. (2014). Age and gender estimation of unfiltered faces. IEEE Transactions on Information Forensics and Security, 9(12), 2170–2179.
Everingham, M., Sivic, J., & Zisserman, A. (2006). “Hello! My name is... Buffy”—Automatic naming of characters in TV video. In Proceedings of British machine vision conference.
Fabian Benitez-Quiroz, C., Srinivasan, R., & Martinez, A. M. (2016). Emotionet: An accurate, real-time algorithm for the automatic annotation of a million facial expressions in the wild. In Proceedings of conference on computer vision pattern recognition.
Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge: Cambridge University Press.
Hassner, T., & Basri, R. (2006). Example based 3D reconstruction from single 2D images. In Proceedings of conference on computer vision pattern recognition workshops.
Hassner, T., Harel, S., Paz, E., & Enbar, R. (2015). Effective face frontalization in unconstrained images. In Proceedings of conference on computer vision pattern recognition.
Hassner, T., Masi, I., Kim, J., Choi, J., Harel, S., Natarajan, P., & Medioni, G. (2016). Pooling faces: Template based face recognition with pooled face images. In Proceedings of conference on computer vision pattern recognition workshops.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of conference on computer vision pattern recognition.
Huang, G. B., Jain, V., & Learned-Miller, E. (2007). Unsupervised joint alignment of complex images. In Proceedings of the international conference on computer vision.
Huber, P., Hu, G., Tena, R., Mortazavian, P., Koppen, W., Christmas, W., Rätsch, M., & Kittler, J. (2016). A multiresolution 3D morphable face model and fitting framework. In VISAPP.
Jackson, A. S., Bulat, A., Argyriou, V., & Tzimiropoulos, G. (2017). Large pose 3D face reconstruction from a single image via direct volumetric CNN regression. In Proceedings of the international conference on computer vision.
Jeni, L. A., Cohn, J. F., & Kanade, T. (2015). Dense 3D face alignment from 2D videos in real-time. In International conference on automatic face and gesture recognition.
Jourabloo, A., & Liu, X. (2015). Pose-invariant 3d face alignment. In Proceedings of conference on computer vision pattern recognition.
Jourabloo, A., & Liu, X. (2016). Large-pose face alignment via cnn-based dense 3D model fitting. In Proceedings of conference on computer vision pattern recognition.
Kazemi, V., & Sullivan, J. (2014). One millisecond face alignment with an ensemble of regression trees. In Proceedings of conference on computer vision pattern recognition.
Kemelmacher-Shlizerman, I., & Basri, R. (2011). 3D face reconstruction from a single image using a single reference face shape. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(2), 394–405.
King, D. E. (2009). Dlib-ml: A machine learning toolkit. Journal of Machine Learning Research, 10, 1755–1758.
Klare, B. F., Klein, B., Taborsky, E., Blanton, A., Cheney, J., Allen, K., Grother, P., Mah, A., Burge, M., & Jain, A. K. (2015). Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark-A. In Proceedings of conference on computer vision pattern recognition.
Kosti, R., Alvarez, J. M., Recasens, A., & Lapedriza, A. (2017). Emotion recognition in context. In Proceedings of conference on computer vision pattern recognition.
Köstinger, M., Wohlhart, P., Roth, P. M., & Bischof, H. (2011). Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization. In Proceedings of the international conference on computer vision workshops.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Neural information processing systems.
Kumar, A., Alavi, A., & Chellappa, R. (2017). Kepler: Keypoint and pose estimation of unconstrained faces by learning efficient h-cnn regressors. In Automatic face and gesture recognition.
Kumar, A., & Chellappa, R. (2018). Disentangling 3D pose in a dendritic cnn for unconstrained 2d face alignment. In Proceedings of conference on computer vision pattern recognition.
Le, V., Brandt, J., Lin, Z., Bourdev, L., & Huang, T. (2012). Interactive facial feature localization. In European conference on computer vision.
Levi, G., & Hassner, T. (2015). Emotion recognition in the wild via convolutional neural networks and mapped binary patterns. In ACM ICMI.
Li, C., Zhou, K., & Lin, S. (2014). Intrinsic face image decomposition with human face priors. In European conference on computer vision.
Liu, Y., Jourabloo, A., Ren, W., & Liu, X. (2017). Dense face alignment. In Proceedings of conference on computer vision pattern recognition.
Liu, Z., Luo, P., Wang, X., & Tang, X. (2015). Deep learning face attributes in the wild. In Proceedings of the international conference on computer vision.
Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The extended cohn-kanade dataset (ck+): A complete dataset for action unit and emotion-specified expression. In Proceedings of conference on computer vision pattern recognition workshops.
Zurück zum Zitat Masi, I., Ferrari, C., Del Bimbo, A., & Medioni, G. (2014). Pose independent face recognition by localizing local binary patterns via deformation components. In International conference on pattern recognition (pp. 4477–4482). IEEE. Masi, I., Ferrari, C., Del Bimbo, A., & Medioni, G. (2014). Pose independent face recognition by localizing local binary patterns via deformation components. In International conference on pattern recognition (pp. 4477–4482). IEEE.
Zurück zum Zitat Masi, I., Chang, F. J., Choi, J., Harel, S., Kim, J., Kim, K., et al. (2018a). Learning pose-aware models for pose-invariant face recognition in the wild. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 379–393.CrossRef Masi, I., Chang, F. J., Choi, J., Harel, S., Kim, J., Kim, K., et al. (2018a). Learning pose-aware models for pose-invariant face recognition in the wild. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 379–393.CrossRef
Zurück zum Zitat Masi, I., Hassner, T., Tran, A. T., & Medioni, G. (2017). Rapid synthesis of massive face sets for improved face recognition. In International conference on automatic face and gesture recognition (pp. 604–611). IEEE. Masi, I., Hassner, T., Tran, A. T., & Medioni, G. (2017). Rapid synthesis of massive face sets for improved face recognition. In International conference on automatic face and gesture recognition (pp. 604–611). IEEE.
Zurück zum Zitat Masi, I., Rawls, S., Medioni, G., & Natarajan, P. (2016a). Pose-aware face recognition in the wild. In Proceedings of conference on computer vision pattern recognition. Masi, I., Rawls, S., Medioni, G., & Natarajan, P. (2016a). Pose-aware face recognition in the wild. In Proceedings of conference on computer vision pattern recognition.
Zurück zum Zitat Masi, I., Wu, Y., Hassner, T., & Natarajan, P. (2018b). Deep face recognition: A survey. In Conference on graphics, patterns and images. Masi, I., Wu, Y., Hassner, T., & Natarajan, P. (2018b). Deep face recognition: A survey. In Conference on graphics, patterns and images.
Zurück zum Zitat Parkhi, O. M., Vedaldi, A., & Zisserman, A. (2015). Deep face recognition. In Proceedings of British machine vision conference. Parkhi, O. M., Vedaldi, A., & Zisserman, A. (2015). Deep face recognition. In Proceedings of British machine vision conference.
Zurück zum Zitat Paysan, P., Knothe, R., Amberg, B., Romhani, S., & Vetter, T. (2009). A 3D face model for pose and illumination invariant face recognition. In International conference on advanced video and signal based surveillance. Paysan, P., Knothe, R., Amberg, B., Romhani, S., & Vetter, T. (2009). A 3D face model for pose and illumination invariant face recognition. In International conference on advanced video and signal based surveillance.
Zurück zum Zitat Poirson, P., Ammirato, P., Fu, C. Y., Liu, W., Kosecka, J., & Berg, A. C. (2016). Fast single shot detection and pose estimation. In 3DV. Poirson, P., Ammirato, P., Fu, C. Y., Liu, W., Kosecka, J., & Berg, A. C. (2016). Fast single shot detection and pose estimation. In 3DV.
Zurück zum Zitat Ranjan, R., Castillo, C. D., & Chellappa, R. (2017). L2-constrained softmax loss for discriminative face verification. arXiv preprint arXiv:1703.09507. Ranjan, R., Castillo, C. D., & Chellappa, R. (2017). L2-constrained softmax loss for discriminative face verification. arXiv preprint arXiv:​1703.​09507.
Zurück zum Zitat Ren, S., Cao, X., Wei, Y., & Sun, J. (2014). Face alignment at 3000 fps via regressing local binary features. In Proceedings of conference on computer vision pattern recognition. Ren, S., Cao, X., Wei, Y., & Sun, J. (2014). Face alignment at 3000 fps via regressing local binary features. In Proceedings of conference on computer vision pattern recognition.
Zurück zum Zitat Richardson, E., Sela, M., & Kimmel, R. (2016). 3d face reconstruction by learning from synthetic data. In 3DV. Richardson, E., Sela, M., & Kimmel, R. (2016). 3d face reconstruction by learning from synthetic data. In 3DV.
Zurück zum Zitat Richardson, E., Sela, M., Or-El, R., & Kimmel, R. (2017). Learning detailed face reconstruction from a single image. In Proceedings of conference on computer vision pattern recognition. Richardson, E., Sela, M., Or-El, R., & Kimmel, R. (2017). Learning detailed face reconstruction from a single image. In Proceedings of conference on computer vision pattern recognition.
Zurück zum Zitat Romdhani, S., & Vetter, T. (2003). Efficient, robust and accurate fitting of a 3D morphable model. In Proceedings of the international conference on computer vision. Romdhani, S., & Vetter, T. (2003). Efficient, robust and accurate fitting of a 3D morphable model. In Proceedings of the international conference on computer vision.
Romdhani, S., & Vetter, T. (2005). Estimating 3D shape and texture using pixel intensity, edges, specular highlights, texture constraints and a prior. In Proceedings of conference on computer vision pattern recognition.
Sagonas, C., Tzimiropoulos, G., Zafeiriou, S., & Pantic, M. (2013). 300 faces in-the-wild challenge: The first facial landmark localization challenge. In Proceedings of conference on computer vision pattern recognition workshops.
Sagonas, C., Antonakos, E., Tzimiropoulos, G., Zafeiriou, S., & Pantic, M. (2016). 300 faces in-the-wild challenge: Database and results. Image and Vision Computing, 47, 3–18.
Sela, M., Richardson, E., & Kimmel, R. (2017). Unrestricted facial geometry reconstruction using image-to-image translation. In Proceedings of the international conference on computer vision.
Sengupta, S., Kanazawa, A., Castillo, C. D., & Jacobs, D. (2018). SfSNet: Learning shape, reflectance and illuminance of faces in the wild. In Proceedings of conference on computer vision pattern recognition.
Su, H., Qi, C. R., Li, Y., & Guibas, L. J. (2015). Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views. In Proceedings of the international conference on computer vision.
Surace, L., Patacchiola, M., Battini Sönmez, E., Spataro, W., & Cangelosi, A. (2017). Emotion recognition in the wild using deep neural networks and Bayesian classifiers. In ACM ICMI.
Tang, H., Hu, Y., Fu, Y., Hasegawa-Johnson, M., & Huang, T. S. (2008). Real-time conversion from a single 2D face image to a 3D text-driven emotive audio-visual avatar. In International conference on multimedia and expo.
Tewari, A., Zollhöfer, M., Garrido, P., Bernard, F., Kim, H., Pérez, P., & Theobalt, C. (2018). Self-supervised multi-level face model learning for monocular reconstruction at over 250 Hz. In Proceedings of conference on computer vision pattern recognition.
Tran, A., Hassner, T., Masi, I., & Medioni, G. (2017). Regressing robust and discriminative 3D morphable models with a very deep neural network. In Proceedings of conference on computer vision pattern recognition.
Tran, A. T., Hassner, T., Masi, I., Paz, E., Nirkin, Y., & Medioni, G. (2018). Extreme 3D face reconstruction: Looking past occlusions. In Proceedings of conference on computer vision pattern recognition.
Vetter, T., & Blanz, V. (1998). Estimating coloured 3D face models from single images: An example based approach. In European conference on computer vision.
Whitelam, C., Taborsky, E., Blanton, A., Maze, B., Adams, J., Miller, T., Kalka, N., Jain, A. K., Duncan, J. A., Allen, K., et al. (2017). IARPA Janus Benchmark-B face dataset. In Proceedings of conference on computer vision pattern recognition workshops.
Wolf, L., Hassner, T., & Maoz, I. (2011). Face recognition in unconstrained videos with matched background similarity. In Proceedings of conference on computer vision pattern recognition.
Wu, Y., Hassner, T., Kim, K., Medioni, G., & Natarajan, P. (2017). Facial landmark detection with tweaked convolutional neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12), 3067–3074.
Xiang, Y., Mottaghi, R., & Savarese, S. (2014). Beyond pascal: A benchmark for 3D object detection in the wild. In Winter conference on applications of computer vision.
Xiang, Y., Kim, W., Chen, W., Ji, J., Choy, C., Su, H., Mottaghi, R., Guibas, L., & Savarese, S. (2016). Objectnet3D: A large scale database for 3D object recognition. In European conference on computer vision.
Xie, L., Wang, J., Wei, Z., Wang, M., & Tian, Q. (2016). DisturbLabel: Regularizing CNN on the loss layer. In Proceedings of conference on computer vision pattern recognition (pp. 4753–4762).
Xiong, X., & De la Torre, F. (2013). Supervised descent method and its applications to face alignment. In Proceedings of conference on computer vision pattern recognition.
Yang, Z., & Nevatia, R. (2016). A multi-scale cascade fully convolutional network face detector. In ICPR.
Yang, F., Wang, J., Shechtman, E., Bourdev, L., & Metaxas, D. (2011). Expression flow for 3D-aware face component transfer. ACM Transactions on Graphics, 30(4), 60.
Yu, X., Huang, J., Zhang, S., Yan, W., & Metaxas, D. N. (2013). Pose-free facial landmark fitting via optimized part mixtures and cascaded deformable shape model. In Proceedings of the international conference on computer vision (pp. 1944–1951). IEEE.
Zadeh, A., Baltrušaitis, T., & Morency, L. P. (2016). Deep constrained local models for facial landmark detection. arXiv preprint arXiv:1611.08657.
Zafeiriou, S., Chrysos, G. G., Roussos, A., Ververas, E., Deng, J., & Trigeorgis, G. (2017). The 3D menpo facial landmark tracking challenge. In Proceedings of international conference on computer vision workshops.
Zafeiriou, S., Papaioannou, A., Kotsia, I., Nicolaou, M., & Zhao, G. (2016). Facial affect “in-the-wild”. In Proceedings of conference on computer vision pattern recognition workshops (pp. 36–47).
Zhang, J., Shan, S., Kan, M., & Chen, X. (2014). Coarse-to-fine auto-encoder networks (CFAN) for real-time face alignment. In European conference on computer vision. Springer.
Zhang, K., Tan, L., Li, Z., & Qiao, Y. (2016). Gender and smile classification using deep convolutional neural networks. In Proceedings of conference on computer vision pattern recognition workshops (pp. 34–38).
Zhu, S., Li, C., Change Loy, C., & Tang, X. (2015a). Face alignment by coarse-to-fine shape searching. In Proceedings of conference on computer vision pattern recognition.
Zhu, S., Li, C., Loy, C. C., & Tang, X. (2016a). Unconstrained face alignment via cascaded compositional learning. In Proceedings of conference on computer vision pattern recognition.
Zhu, X., Lei, Z., Liu, X., Shi, H., & Li, S. (2016b). Face alignment across large poses: A 3D solution. In Proceedings of conference on computer vision pattern recognition.
Zhu, X., Lei, Z., Yan, J., Yi, D., & Li, S. Z. (2015b). High-fidelity pose and expression normalization for face recognition in the wild. In Proceedings of conference on computer vision pattern recognition (pp. 787–796).
Zhu, X., & Ramanan, D. (2012). Face detection, pose estimation, and landmark localization in the wild. In Proceedings of conference on computer vision pattern recognition.
Metadata
Title: Deep, Landmark-Free FAME: Face Alignment, Modeling, and Expression Estimation
Authors: Feng-Ju Chang, Anh Tuan Tran, Tal Hassner, Iacopo Masi, Ram Nevatia, Gérard Medioni
Publication date: 13 February 2019
Publisher: Springer US
Published in: International Journal of Computer Vision, Issue 6-7/2019
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-019-01151-x
