Published in: International Journal of Computer Vision 1/2015

1 May 2015

Discriminative Deep Face Shape Model for Facial Point Detection

Authors: Yue Wu, Qiang Ji

Abstract

Facial point detection is an active area in computer vision due to its relevance to many applications. It is a nontrivial task, since facial shapes vary significantly with facial expressions, poses, or occlusion. In this paper, we address this problem by proposing a discriminative deep face shape model built on an augmented factorized three-way Restricted Boltzmann Machine. Specifically, the discriminative deep model combines the top-down information from the embedded face shape patterns and the bottom-up measurements from local point detectors in a unified framework. In addition, we propose effective algorithms to learn the model and to infer the true facial point locations from their measurements. Based on the discriminative deep face shape model, 68 facial points are detected on facial images in both controlled and “in-the-wild” conditions. Experiments on benchmark data sets show the effectiveness of the proposed facial point detection algorithm against state-of-the-art methods.
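
To make the three-way coupling more concrete, the following is a minimal sketch of how hidden units could be activated in a generic factored three-way RBM: each factor multiplies a projection of a candidate shape x by a projection of the detector measurements m, and the hidden units pool over the factors. This is the textbook gated conditional, not the authors' augmented model, whose details the abstract does not give. All names (`project`, `hidden_probabilities`, `Wx`, `Wm`, `Wh`, `b`) and the toy weights are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def project(vec, weights):
    """Project vec onto each factor: out[f] = sum_i vec[i] * weights[i][f]."""
    n_factors = len(weights[0])
    return [sum(v * row[f] for v, row in zip(vec, weights))
            for f in range(n_factors)]


def hidden_probabilities(x, m, Wx, Wm, Wh, b):
    """p(h_k = 1 | x, m) in a generic factored three-way RBM (assumed form).

    x  : candidate landmark coordinates (hypothetical input)
    m  : measurements from local point detectors (hypothetical input)
    Wx : len(x) rows, one column per factor
    Wm : len(m) rows, one column per factor
    Wh : one row per hidden unit, one column per factor
    b  : hidden biases
    """
    fx = project(x, Wx)
    fm = project(m, Wm)
    # Multiplicative interaction: each factor gates x by m.
    gated = [a * c for a, c in zip(fx, fm)]
    return [sigmoid(sum(w * g for w, g in zip(row, gated)) + bias)
            for row, bias in zip(Wh, b)]


# Toy usage: two coordinates, two measurements, one factor, one hidden unit.
probs = hidden_probabilities(
    x=[1.0, 0.0], m=[0.0, 1.0],
    Wx=[[1.0], [0.0]], Wm=[[0.0], [2.0]],
    Wh=[[0.5]], b=[0.0],
)
print(probs)  # one probability, approximately 0.731
```

The product of the two projections is what lets the measurements modulate which shape patterns the hidden units respond to; a two-way RBM would only add the two contributions.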

Metadata
Title: Discriminative Deep Face Shape Model for Facial Point Detection
Authors: Yue Wu, Qiang Ji
Publication date: 1 May 2015
Publisher: Springer US
Published in: International Journal of Computer Vision, Issue 1/2015
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-014-0775-8
