International Journal of Computer Vision

Published in:

07-11-2019

Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks

Authors: Xin Yu, Fatih Porikli, Basura Fernando, Richard Hartley

Published in: International Journal of Computer Vision | Issue 2/2020

Abstract

Conventional face hallucination methods rely heavily on accurate alignment of low-resolution (LR) faces before upsampling them. Misalignment often leads to deficient results and unnatural artifacts at large upscaling factors. However, because of the diverse range of poses and facial expressions, aligning an LR input image is extremely difficult, particularly when the image is tiny. In addition, when the resolutions of LR input images vary, previous deep neural network based face hallucination methods require the interocular distances of the input faces to be similar to those in the training datasets. Downsampling LR input faces to the required resolution discards high-frequency information in the original images, which may lead to suboptimal super-resolution performance for state-of-the-art face hallucination networks. To overcome these challenges, we present an end-to-end multiscale transformative discriminative neural network devised for super-resolving unaligned and very small face images of different resolutions, ranging from \(16 \times 16\) to \(32 \times 32\) pixels, in a unified framework. Our proposed network embeds spatial transformation layers that allow local receptive fields to line up with similar spatial supports, thus obtaining a better mapping between LR and high-resolution (HR) facial patterns. Furthermore, we incorporate a class-specific loss, designed to classify upright realistic faces, into our objective through a successive discriminative network, which improves alignment and upsampling performance with semantic information. Extensive experiments on a large face dataset show that the proposed method significantly outperforms the state-of-the-art.
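
The abstract mentions spatial transformation layers but gives no layer details on this page. As a rough illustration only, the sketch below shows the generic resampling step such a layer typically performs: an affine map applied to a normalized sampling grid, followed by bilinear interpolation. The function name `affine_warp`, the \([-1, 1]\) coordinate convention, the zero padding outside the image, and the hand-written matrix `theta` are all assumptions made for this example; they are not taken from the authors' implementation, where the transformation parameters would be predicted by the network itself.

```python
import math


def affine_warp(image, theta, out_h, out_w):
    """Resample a 2D image (list of rows) with a 2x3 affine matrix `theta`.

    Hypothetical helper for illustration. Output pixel centres are mapped to
    normalized coordinates in [-1, 1], transformed by `theta` into normalized
    input coordinates, and read back with bilinear interpolation. Samples that
    fall outside the input contribute zero (assumed padding rule).
    """
    in_h, in_w = len(image), len(image[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        # Normalized target row coordinate in [-1, 1].
        yt = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        for j in range(out_w):
            # Normalized target column coordinate in [-1, 1].
            xt = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            # Affine map from target to source coordinates.
            xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
            ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
            # Convert normalized source coordinates to pixel coordinates.
            px = (xs + 1.0) * (in_w - 1) / 2.0
            py = (ys + 1.0) * (in_h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            wx, wy = px - x0, py - y0
            # Bilinear blend of the four neighbouring input pixels.
            value = 0.0
            for yy, w_row in ((y0, 1.0 - wy), (y0 + 1, wy)):
                for xx, w_col in ((x0, 1.0 - wx), (x0 + 1, wx)):
                    if 0 <= yy < in_h and 0 <= xx < in_w:
                        value += w_row * w_col * image[yy][xx]
            out[i][j] = value
    return out


if __name__ == "__main__":
    # Toy 3x4 "image"; the identity matrix leaves it unchanged up to rounding.
    img = [[0.0, 1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0, 7.0],
           [8.0, 9.0, 10.0, 11.0]]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    for row in affine_warp(img, identity, 3, 4):
        print(row)
```

With the identity matrix the output reproduces the input up to floating-point rounding, and negating both diagonal entries rotates the image by 180 degrees, which is the kind of correction an upside-down or rotated face would need. Because every step is a weighted sum, the operation is differentiable with respect to `theta`, which is what lets such a layer be trained end to end.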

Title: Hallucinating Unaligned Face Images by Multiscale Transformative Discriminative Networks
Authors: Xin Yu, Fatih Porikli, Basura Fernando, Richard Hartley
Publication date: 07-11-2019
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 2/2020
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-019-01254-5
