Published in: International Journal of Computer Vision, Issue 5/2019

31.08.2018

Deep Appearance Models: A Deep Boltzmann Machine Approach for Face Modeling

By: Chi Nhan Duong, Khoa Luu, Kha Gia Quach, Tien D. Bui



Abstract

The “interpretation through synthesis” approach to analyzing face images, particularly the Active Appearance Models (AAMs) method, has become one of the most successful face modeling approaches over the last two decades. AAMs can represent face images through synthesis using a controllable, parameterized Principal Component Analysis (PCA) model. However, the accuracy and robustness of the faces synthesized by AAMs depend strongly on the training sets and, inherently, on the generalizability of PCA subspaces. This paper presents a novel Deep Appearance Models (DAMs) approach, an efficient replacement for AAMs, to accurately capture both the shape and texture of face images under large variations. In this approach, three crucial components, represented in hierarchical layers, are modeled using Deep Boltzmann Machines (DBMs) to robustly capture the variations of facial shapes and appearances. DAMs are therefore superior to AAMs at inferring a representation for new face images under various challenging conditions. The proposed approach is evaluated in several applications to demonstrate its robustness and capabilities, i.e., facial super-resolution reconstruction, facial off-angle reconstruction (face frontalization), facial occlusion removal, and age estimation, using challenging face databases, i.e., Labeled Face Parts in the Wild, Helen, and FG-NET. Compared to AAMs and other deep-learning-based approaches, the proposed DAMs achieve competitive results in these applications, demonstrating their advantages in handling occlusions, facial representation, and reconstruction.
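To make the modeling idea concrete, the sketch below trains a single Restricted Boltzmann Machine layer with one step of contrastive divergence (CD-1); DBMs of the kind used by DAMs are built by stacking such layers over shape and texture inputs. This is a minimal illustrative sketch, not the authors' implementation: the class name, hyperparameters, and the toy binary "shape" data are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BernoulliRBM:
    """One Bernoulli-Bernoulli RBM layer, the building block of a DBM."""

    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one reconstruction step (CD-1).
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # Approximate likelihood gradient: data statistics minus model statistics.
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))   # reconstruction error

# Toy binary vectors standing in for binarized facial shape patterns.
data = (rng.random((64, 20)) < 0.3).astype(float)
rbm = BernoulliRBM(n_visible=20, n_hidden=8)
errors = [rbm.cd1_step(data) for _ in range(200)]
print(round(errors[0], 3), round(errors[-1], 3))
```

Stacking a second RBM on the hidden activations of the first, then jointly refining all layers, yields a DBM; in DAMs, separate pathways of this kind model shape and texture before being coupled at a shared top layer.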


Footnotes
1
Note that the term DAM is also used for “Direct Appearance Models” (Hou et al. 2001).
 
References
Amberg, B., Blake, A., & Vetter, T. (2009). On compositional image alignment, with an application to active appearance models. In CVPR (pp. 1714–1721). IEEE.
Anderson, R., Stenger, B., Wan, V., & Cipolla, R. (2013). Expressive visual text-to-speech using active appearance models. In CVPR (pp. 3382–3389). IEEE.
Antonakos, E., Alabort-i Medina, J., Tzimiropoulos, G., & Zafeiriou, S. (2014). HOG active appearance models. In ICIP (pp. 224–228). IEEE.
Antonakos, E., Alabort-i Medina, J., Tzimiropoulos, G., & Zafeiriou, S. P. (2015). Feature-based Lucas–Kanade and active appearance models. IEEE Transactions on Image Processing, 24(9), 2617–2632.
Antonakos, E., Snape, P., Trigeorgis, G., & Zafeiriou, S. (2016). Adaptive cascaded regression. In ICIP (pp. 1649–1653). IEEE.
Belhumeur, P. N., Jacobs, D. W., Kriegman, D., & Kumar, N. (2011). Localizing parts of faces using a consensus of exemplars. In CVPR (pp. 545–552). IEEE.
Burgos-Artizzu, X. P., Perona, P., & Dollár, P. (2013). Robust face landmark estimation under occlusion. In ICCV (pp. 1513–1520). IEEE.
Chen, K., Gong, S., Xiang, T., & Loy, C. (2013). Cumulative attribute space for age and crowd density estimation. In CVPR (pp. 2467–2474).
Cootes, T. F., & Taylor, C. J. (2006). An algorithm for tuning an active appearance model to new data. In BMVC (pp. 919–928).
Cootes, T. F., Edwards, G. J., & Taylor, C. J. (1998). Interpreting face images using active appearance models. In FG (pp. 300–305).
Cootes, T. F., Edwards, G. J., & Taylor, C. J. (2001). Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681–685.
Ding, C., & Tao, D. (2015). Robust face recognition via multimodal deep face representation. IEEE Transactions on Multimedia, 17(11), 2049–2058.
Dong, C., Loy, C. C., He, K., & Tang, X. (2014). Learning a deep convolutional network for image super-resolution. In ECCV (pp. 184–199). Berlin: Springer.
Donner, R., Reiter, M., Langs, G., Peloschek, P., & Bischof, H. (2006). Fast active appearance model search using canonical correlation analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1690.
Duong, C. N., Quach, K. G., Luu, K., Le, H. B., & Ricanek, K. (2011). Fine tuning age-estimation with global and local facial features. In ICASSP. IEEE.
Duong, C. N., Luu, K., Gia Quach, K., & Bui, T. D. (2015). Beyond principal components: Deep Boltzmann machines for face modeling. In CVPR (pp. 4786–4794).
Edwards, G. J., Cootes, T. F., & Taylor, C. J. (1998). Face recognition using active appearance models. In ECCV (pp. 581–595). Berlin: Springer.
Eslami, S. A., Heess, N., Williams, C. K., & Winn, J. (2014). The shape Boltzmann machine: A strong model of object shape. International Journal of Computer Vision, 107(2), 155–176.
Ferrari, C., Lisanti, G., Berretti, S., & Del Bimbo, A. (2016). Effective 3D based frontalization for unconstrained face recognition. In ICPR (pp. 1047–1052). IEEE.
Fu, Y., & Huang, T. S. (2008). Human age estimation with regression on discriminative aging manifold. IEEE Transactions on Multimedia, 10(4), 578–584.
Gao, S., Zhang, Y., Jia, K., Lu, J., & Zhang, Y. (2015). Single sample face recognition via learning deep supervised autoencoders. IEEE Transactions on Information Forensics and Security, 10(10), 2108–2118.
Ge, Y., Yang, D., Lu, J., Li, B., & Zhang, X. (2013). Active appearance models using statistical characteristics of Gabor based texture representation. Journal of Visual Communication and Image Representation, 24(5), 627–634.
Gross, R., Matthews, I., & Baker, S. (2005). Generic vs. person specific active appearance models. Image and Vision Computing, 23(12), 1080–1093.
Haase, D., Rodner, E., & Denzler, J. (2014). Instance-weighted transfer learning of active appearance models. In CVPR (pp. 1426–1433). IEEE.
Hassner, T., Harel, S., Paz, E., & Enbar, R. (2015). Effective face frontalization in unconstrained images. In CVPR (pp. 4295–4304).
Hou, X., Li, S. Z., Zhang, H., & Cheng, Q. (2001). Direct appearance models. In CVPR (Vol. 1, pp. I-828–I-833). IEEE.
Huang, G. B., Lee, H., & Learned-Miller, E. (2012). Learning hierarchical representations for face verification with convolutional deep belief networks. In CVPR (pp. 2518–2525). IEEE.
Huiskes, M. J., Thomee, B., & Lew, M. S. (2010). New trends and ideas in visual concept detection: The MIR Flickr retrieval evaluation initiative. In ICMR (pp. 527–536). ACM.
Jeni, L. A., & Cohn, J. F. (2016). Person-independent 3D gaze estimation using face frontalization. In CVPRW (pp. 87–95).
Kan, M., Shan, S., Chang, H., & Chen, X. (2014). Stacked progressive auto-encoders (SPAE) for face recognition across poses. In CVPR (pp. 1883–1890).
Le, V., Brandt, J., Lin, Z., Bourdev, L., & Huang, T. S. (2012). Interactive facial feature localization. In ECCV (pp. 679–692). Berlin: Springer.
Li, C., Liu, Q., Liu, J., & Lu, H. (2012). Learning ordinal discriminative features for age estimation. In CVPR (pp. 2570–2577). IEEE.
Li, C., Zhou, K., & Lin, S. (2014). Intrinsic face image decomposition with human face priors. In ECCV (pp. 218–233). Berlin: Springer.
Liu, L., Xiong, C., Zhang, H., Niu, Z., Wang, M., & Yan, S. (2016). Deep aging face verification with large gaps. IEEE Transactions on Multimedia, 18(1), 64–75.
Luu, K., Ricanek, K., Bui, T. D., & Suen, C. Y. (2009). Age estimation using active appearance models and support vector machine regression. In BTAS (pp. 1–5). IEEE.
Luu, K., Bui, T. D., Suen, C. Y., & Ricanek, K. (2010). Spectral regression based age determination. In CVPRW. IEEE.
Luu, K., Bui, T. D., & Suen, C. Y. (2011a). Kernel spectral regression of perceived age from hybrid facial features. In FG. IEEE.
Luu, K., Keshav Seshadri, M. S., Bui, T. D., & Suen, C. Y. (2011b). Contourlet appearance model for facial age estimation. In IJCB. IEEE.
Martínez, A., & Benavente, R. (1998). The AR face database. Rapport technique 24.
Matthews, I., & Baker, S. (2004). Active appearance models revisited. International Journal of Computer Vision, 60(2), 135–164.
Alabort-i Medina, J., & Zafeiriou, S. (2014). Bayesian active appearance models. In CVPR (pp. 3438–3445). IEEE.
Alabort-i Medina, J., & Zafeiriou, S. (2015). Unifying holistic and parts-based deformable model fitting. In CVPR (pp. 3679–3688).
Alabort-i Medina, J., & Zafeiriou, S. (2017). A unified framework for compositional fitting of active appearance models. International Journal of Computer Vision, 121(1), 26–64.
Alabort-i Medina, J., Antonakos, E., Booth, J., Snape, P., & Zafeiriou, S. (2014). Menpo: A comprehensive platform for parametric image alignment and visual deformable models. In Proceedings of the 22nd ACM International Conference on Multimedia (pp. 679–682). ACM.
Mollahosseini, A., & Mahoor, M. H. (2013). Bidirectional warping of active appearance model. In CVPRW (pp. 875–880). IEEE.
Navarathna, R., Sridharan, S., & Lucey, S. (2011). Fourier active appearance models. In ICCV (pp. 1919–1926). IEEE.
Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., & Ng, A. Y. (2011). Multimodal deep learning. In ICML (pp. 689–696).
Papandreou, G., & Maragos, P. (2008). Adaptive and constrained algorithms for inverse compositional active appearance model fitting. In CVPR (pp. 1–8). IEEE.
Pizarro, D., Peyras, J., & Bartoli, A. (2008). Light-invariant fitting of active appearance models. In CVPR (pp. 1–6). IEEE.
Sagonas, C., Tzimiropoulos, G., Zafeiriou, S., & Pantic, M. (2013). A semi-automatic methodology for facial landmark annotation. In CVPRW (pp. 896–903). IEEE.
Sagonas, C., Panagakis, Y., Zafeiriou, S., & Pantic, M. (2015). Robust statistical face frontalization. In ICCV (pp. 3871–3879).
Salakhutdinov, R., & Hinton, G. E. (2009). Deep Boltzmann machines. In AISTATS (pp. 448–455).
Salakhutdinov, R. R. (2009). Learning in Markov random fields using tempered transitions. In NIPS (pp. 1598–1606).
Saragih, J., & Goecke, R. (2007). A nonlinear discriminative approach to AAM fitting. In ICCV (pp. 1–8). IEEE.
Srivastava, N., & Salakhutdinov, R. (2012). Multimodal learning with deep Boltzmann machines. In NIPS (pp. 2222–2230).
Sun, Y., Wang, X., & Tang, X. (2013). Deep convolutional network cascade for facial point detection. In CVPR (pp. 3476–3483).
Sun, Y., Wang, X., & Tang, X. (2014). Deep learning face representation from predicting 10,000 classes. In CVPR (pp. 1891–1898).
Sung, J., & Kim, D. (2008). Pose-robust facial expression recognition using view-based 2D + 3D AAM. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 38(4), 852–866.
Taigman, Y., Yang, M., Ranzato, M., & Wolf, L. (2014). DeepFace: Closing the gap to human-level performance in face verification. In CVPR (pp. 1701–1708).
Tang, Y., Salakhutdinov, R., & Hinton, G. (2012a). Deep Lambertian networks. In ICML.
Tang, Y., Salakhutdinov, R., & Hinton, G. (2012b). Robust Boltzmann machines for recognition and denoising. In CVPR (pp. 2264–2271). IEEE.
Taylor, G. W., Sigal, L., Fleet, D. J., & Hinton, G. E. (2010). Dynamical binary latent variable models for 3D human pose tracking. In CVPR (pp. 631–638). IEEE.
Tzimiropoulos, G., & Pantic, M. (2013). Optimization problems for fast AAM fitting in-the-wild. In ICCV (pp. 593–600). IEEE.
Tzimiropoulos, G., & Pantic, M. (2017). Fast algorithms for fitting active appearance models to unconstrained images. International Journal of Computer Vision, 122(1), 17–33.
Van Der Maaten, L., & Hendriks, E. (2010). Capturing appearance variation in active appearance models. In CVPRW (pp. 34–41). IEEE.
Wang, B., Feng, X., Gong, L., Feng, H., Hwang, W., & Han, J. J. (2015a). Robust pose normalization for face recognition under varying views. In ICIP (pp. 1648–1652). IEEE.
Wang, X., Guo, R., & Kambhamettu, C. (2015b). Deeply-learned feature for age estimation. In WACV (pp. 534–541). IEEE.
Wang, Z., & Bovik, A. C. (2009). Mean squared error: Love it or leave it? A new look at signal fidelity measures. IEEE Signal Processing Magazine, 26(1), 98–117.
Wu, Y., Wang, Z., & Ji, Q. (2013). Facial feature tracking under varying facial expressions and face poses based on restricted Boltzmann machines. In CVPR (pp. 3452–3459). IEEE.
Xing, J., Niu, Z., Huang, J., Hu, W., & Yan, S. (2014). Towards multi-view and partially-occluded face alignment. In CVPR (pp. 1829–1836).
Yang, C. Y., Liu, S., & Yang, M. H. (2013). Structured face hallucination. In CVPR (pp. 1099–1106). IEEE.
Yang, J., Wright, J., Huang, T. S., & Ma, Y. (2010). Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11), 2861–2873.
Yildirim, I., Kulkarni, T. D., Freiwald, W. A., & Tenenbaum, J. B. (2015). Efficient analysis-by-synthesis in vision: A computational framework, behavioral tests, and comparison with neural representations. In CogSci.
Zhai, H., Liu, C., Dong, H., Ji, Y., Guo, Y., & Gong, S. (2015). Face verification across aging based on deep convolutional networks and local binary patterns. In IScIDE (pp. 341–350). Berlin: Springer.
Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016a). Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10), 1499–1503.
Zhang, Z., Luo, P., Loy, C. C., & Tang, X. (2016b). Learning deep representation for face alignment with auxiliary attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(5), 918–930.
Zhu, C., Zheng, Y., Luu, K., & Savvides, M. (2017). CMS-RCNN: Contextual multi-scale region-based CNN for unconstrained face detection. In Deep Learning for Biometrics (pp. 57–79). Berlin: Springer.
Zhu, J., Hoi, S. C., & Lyu, M. R. (2006). Real-time non-rigid shape recovery via active appearance models for augmented reality. In ECCV (pp. 186–197). Berlin: Springer.
Zhu, Z., Luo, P., Wang, X., & Tang, X. (2013). Deep learning identity-preserving face space. In CVPR (pp. 113–120).
Zhu, Z., Luo, P., Wang, X., & Tang, X. (2014). Multi-view perceptron: A deep model for learning face identity and view representations. In NIPS (pp. 217–225).
Metadata
Title
Deep Appearance Models: A Deep Boltzmann Machine Approach for Face Modeling
Authors
Chi Nhan Duong
Khoa Luu
Kha Gia Quach
Tien D. Bui
Publication date
31.08.2018
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 5/2019
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-018-1113-3
