Elsevier

Pattern Recognition

Volume 40, Issue 11, November 2007, Pages 3178-3194

Hallucinating faces: LPH super-resolution and neighbor reconstruction for residue compensation

https://doi.org/10.1016/j.patcog.2007.03.011

Abstract

This paper proposes a two-phase face hallucination approach that infers a high-resolution face image from a low-resolution observation, based on a set of training image pairs. The proposed locality preserving hallucination (LPH) algorithm combines locality preserving projections (LPP) with radial basis function (RBF) regression to hallucinate the global high-resolution face. To compensate the inferred global face with detailed, natural-looking facial features, a neighbor reconstruction based face residue hallucination is then applied. Compared with existing approaches, the proposed LPH algorithm efficiently generates a global face that is more similar to the ground truth face. Moreover, the patch structure and search strategy designed for the neighbor reconstruction algorithm greatly reduce the computational complexity without diminishing the quality of the high-resolution face detail. The details of the synthesized high-resolution face are further improved by a global linear smoother. Experiments indicate that our approach can efficiently synthesize distinct high-resolution faces with various facial appearances, such as facial expressions and eyeglasses.

Introduction

Video surveillance cameras are already prevalent in banks, stores and parking lots, but in many cases the faces of interest are small because of the distance between the camera and the subject. The low resolution of the face image then becomes a primary obstacle to face identification and recognition. To recover detailed facial features for recognition, it is therefore necessary to infer a high-resolution face image from the low-resolution one. This technique is called face hallucination [1], [2], [3], [4], [5], [6], and it incorporates image super-resolution techniques into facial image synthesis. Current image super-resolution techniques can be divided into two classes: learning-based techniques [1], [2], [7], [8] and reconstruction-based techniques [9], [10], [11]; the former usually generate better results than the latter. This paper focuses on learning-based face hallucination.

A number of learning-based image hallucination techniques have been proposed in recent years [7], [8], [12]. Their common idea is to learn hallucinated high-resolution images from a training database that comprises pairs of high-resolution and low-resolution image samples. For example, Freeman et al. [7] proposed a sample-based method: they trained Markov networks to learn the relationship between a low-resolution image and the fine details of the corresponding high-resolution image, and then used the learned relationship to infer the fine details of other low-resolution images. Hertzmann et al. [8] proposed a generalized local feature transform approach called “image analogies”, in which the high-resolution image is inferred from the low-resolution input by the local similarity between the training pairs, based on a multi-scale auto-regression. These approaches are suited to generic images rather than face images because they ignore the special properties of faces. The term “face hallucination” was first proposed by Baker and Kanade [1]. They developed a hallucination approach based on the properties of human faces, which learned a prior on the spatial distribution of the image gradient for frontal face images and employed Bayesian theory to infer the high-resolution face image from the low-resolution one. The approach intrinsically relied on a complicated statistical model. Su et al. [13] used a steerable pyramid to extract multi-orientation and multi-scale information of low-level facial features from both the input low-resolution face and other high-resolution ones, and then adopted a pyramid parent structure and local best match to optimize the prior for solving a Bayesian MAP problem. Wang and Tang [3] developed an efficient face hallucination approach based on an eigen-transformation algorithm. They used principal component analysis (PCA) to fit the input face image as a linear combination of the training low-resolution face images, and then synthesized the high-resolution image by replacing the training low-resolution face images with their high-resolution counterparts while retaining the same combination weights. However, this method focused only on global information without paying attention to local details, so the results looked blurred and lacked detailed features. Liu et al. [2] developed a two-step statistical approach that integrated a global parametric model and a local model with a Markov random field (MRF) prior. This approach depended on an explicit down-sampling function, which is sometimes unavailable in practice. Inspired by Liu's work, a number of existing works treated face hallucination as a two-step problem [4], [5], [6]. First, a global face image that keeps the main characteristics of the high-resolution face is obtained; it looks smooth and lacks some detailed features. Second, an optimal residue face image containing the high-frequency image information is synthesized, and the residue image is added to the global face image to obtain the final result.
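The weight-transfer idea behind the eigen-transformation approach of Wang and Tang [3] can be illustrated with a minimal sketch. It assumes all principal components are retained, in which case the PCA fit reduces to a minimum-norm least-squares fit on the mean-centered training faces; the function name, the array layout (one flattened image per row) and the use of `numpy.linalg.lstsq` are choices made here for illustration and are not taken from [3].

```python
import numpy as np

def eigentransform_hallucinate(x_low, train_low, train_high):
    """Transfer combination weights from low- to high-resolution training faces.

    x_low:      (d_low,)    flattened low-resolution input face
    train_low:  (n, d_low)  flattened low-resolution training faces
    train_high: (n, d_high) corresponding high-resolution training faces
    """
    mean_low = train_low.mean(axis=0)
    mean_high = train_high.mean(axis=0)
    L = (train_low - mean_low).T      # (d_low, n) centered low-res faces as columns
    H = (train_high - mean_high).T    # (d_high, n) centered high-res faces as columns
    # Minimum-norm weights c such that L @ c approximates the centered input.
    c, *_ = np.linalg.lstsq(L, x_low - mean_low, rcond=None)
    # Keep the same weights, but combine the high-resolution counterparts.
    return mean_high + H @ c
```

Because the output is a single linear combination of whole training faces, it can only reproduce global structure, which is consistent with the lack of local detail noted above.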

Motivated by the two-step architecture, we propose a statistical learning-based face hallucination approach that includes two phases. In the first phase, instead of the PCA used by Liu et al., locality preserving projections (LPP) are adopted to learn the intrinsic features of the training low-resolution face images and to build a set of transformation vectors beforehand. The transformation between low- and high-resolution image pairs is then achieved by a radial basis function (RBF) regression between the intrinsic features of the low-resolution images and the corresponding high-resolution images. Inference is done by projecting the input low-resolution face onto the transformation vectors and feeding the resulting intrinsic features to the RBF regression to obtain the global high-resolution face. We call this method locality preserving hallucination (LPH). In the second phase, unlike Liu's approach, we infer the high-resolution residue image from the low-resolution residue image rather than from the global high-resolution image, based on the low- and high-resolution training residue image pairs. In this phase, inspired by locally linear embedding (LLE), a neighbor reconstruction based method fits the input low-resolution residue patch as a linear combination of its K nearest neighbor (K-NN) low-resolution patches in the training set; the high-resolution residue patch is then inferred by replacing the training low-resolution residue patches with their high-resolution counterparts while preserving the same combination weights. The overlapping high-resolution residue patches are then smoothed by a linear operator to form the high-resolution residue face. The final hallucinated face image is the composition of the global high-resolution image and the high-resolution residue image. The outline of our approach is shown in Fig. 1.
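The first phase described above (LPP features followed by RBF regression) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the neighborhood size `k`, the heat-kernel width heuristic, the regularizers `reg` and `ridge`, the Gaussian RBF centered on every training feature, and all function names are hypothetical choices that this excerpt does not specify.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between the rows of A and the rows of B."""
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)

def fit_lpp(X, n_components, k=5, reg=1e-6):
    """Return a (d, n_components) LPP projection for the samples in the rows of X."""
    n, d = X.shape
    sq = pairwise_sq_dists(X, X)
    t = sq.mean()                                # heat-kernel width (assumed heuristic)
    nn = np.argsort(sq, axis=1)[:, 1:k + 1]      # k nearest neighbors, skipping self
    rows = np.repeat(np.arange(n), k)
    W = np.zeros((n, n))
    W[rows, nn.ravel()] = np.exp(-sq[rows, nn.ravel()] / t)
    W = np.maximum(W, W.T)                       # symmetric adjacency weights
    D = np.diag(W.sum(axis=1))
    A = X.T @ (D - W) @ X                        # graph-Laplacian term
    B = X.T @ D @ X + reg * np.eye(d)            # degree term, regularized (assumption)
    # Solve A a = lambda B a by whitening with the Cholesky factor of B.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    _, vecs = np.linalg.eigh((M + M.T) / 2)      # eigenvalues in ascending order
    return C_inv.T @ vecs[:, :n_components]      # smallest eigenvalues first

def fit_rbf(F, Y, sigma, ridge=1e-8):
    """Fit Gaussian RBF regression weights mapping features F to targets Y."""
    K = np.exp(-pairwise_sq_dists(F, F) / (2 * sigma ** 2))
    return np.linalg.solve(K + ridge * np.eye(len(F)), Y)

def lph_predict(x_low, P, F_train, weights, sigma):
    """Project a low-resolution face with P and regress the global high-res face."""
    f = (x_low @ P)[None, :]
    K = np.exp(-pairwise_sq_dists(f, F_train) / (2 * sigma ** 2))
    return (K @ weights)[0]
```

A typical use under these assumptions would compute `P = fit_lpp(train_low, m)`, `F = train_low @ P` and `weights = fit_rbf(F, train_high, sigma)` once, and then call `lph_predict` for each new low-resolution face. Since the projection is an explicit matrix, no re-embedding of the training set is needed for new inputs.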

It is worthwhile to highlight several aspects of our proposed approach here:

  • 1.

    Unlike “batch” manifold algorithms such as isometric feature mapping (ISOMAP) and LLE, LPP is a linear method that is defined everywhere, because it explicitly gives a set of transformation vectors. LPP can therefore be applied easily to any new data point outside the training set. Furthermore, by preserving the local structure of the image space, LPP provides better results than PCA in many real-world applications.

  • 2.

    We build a new global face hallucination framework, named LPH, by combining a manifold learning algorithm (LPP) with a neural network (RBF regression). In this framework, LPP captures the most intrinsic features of the training images, while RBF regression bridges the intrinsic image features and the original images. Compared with linear PCA, LPH generates a global high-resolution face that is more similar to the ground truth image, with high computational efficiency.

  • 3.

    Unlike previous works [2], [14], our search strategy in the second phase is position-dependent: when fitting an input patch at a specific image position, the search is restricted to the same position throughout the training image set. This does not diminish the quality of the face details, because patches at the same position naturally cover roughly the same region of the human face, so the K-NNs are more likely to appear among these patches. In addition, our K-NN search is instead applied to the low-resolution image patches. Our search strategy therefore greatly reduces the computational complexity of the neighbor search.
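The position-dependent neighbor reconstruction of the second phase can be sketched as below. This is a minimal illustration, assuming residue patches have already been extracted and flattened into arrays indexed by image and patch position; the LLE-style regularizer `reg`, the array layout and the function names are assumptions made here, and the linear smoothing of overlapping patches is omitted.

```python
import numpy as np

def reconstruction_weights(query, neighbors, reg=1e-3):
    """LLE-style weights that sum to one and reconstruct `query` from `neighbors`."""
    Z = neighbors - query                         # (k, d) neighbors shifted to the query
    G = Z @ Z.T                                   # local Gram matrix
    # Regularize so the system stays solvable even if neighbors coincide.
    G = G + np.eye(len(G)) * (reg * np.trace(G) + 1e-12)
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()

def hallucinate_residue_patch(patch_low, train_low, train_high, pos, k=5):
    """Infer a high-resolution residue patch at patch position `pos`.

    patch_low:  (d_low,)                   input low-resolution residue patch
    train_low:  (n_images, n_pos, d_low)   low-resolution training residue patches
    train_high: (n_images, n_pos, d_high)  high-resolution training residue patches
    """
    candidates = train_low[:, pos]                # search only the same position
    d2 = ((candidates - patch_low) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]                      # K-NN among low-resolution patches
    w = reconstruction_weights(patch_low, candidates[idx])
    # Reuse the same weights on the high-resolution counterparts.
    return w @ train_high[idx, pos]
```

With this layout, each query compares against `n_images` candidate patches rather than `n_images * n_pos`, and the distances are computed in the smaller low-resolution patch space.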

The rest of this paper is organized as follows: Section 2 reviews related work, Section 3 discusses the proposed approach to face hallucination in detail, Section 4 presents experiments and discussion, and Section 5 concludes the paper and outlines future work.


Related works

Manifold learning can be regarded as a kind of non-linear dimensionality reduction that aims to uncover the most intrinsic characteristics of high-dimensional observations distributed in a non-linear subspace. Frequently used manifold learning methods include ISOMAP [15], LLE [16], local tangent space alignment (LTSA) [17], and LPP [18]. ISOMAP strives to maintain the geodesic distance between two data points when reducing the dimensionality; LLE supposes that the

Low-resolution manifold vs. high-resolution manifold

Like many learning-based face hallucination approaches, our training data set contains a group of low- and high-resolution image pairs. The low-resolution images are down-sampled versions of their high-resolution counterparts, so it is reasonable to make the following assumptions:

  • 1.

    All the training samples with the same resolution form a manifold in the low-dimensional feature space.

  • 2.

    The low- and high-resolution training pairs are embedded into respective manifold space in approximately

Data set

Our experiments are performed on the Asian Face Image Database PF01 [26]. PF01 contains images of 107 persons (56 men and 51 women), with 17 images per person (1 frontal face, 4 illumination variations, 8 pose variations, 4 expression variations). Twenty-four men and eight women wear glasses, and most of the volunteers are between 20 and 30 years old. Since our focus is frontal face hallucination under average illumination, we eliminate the images with illumination as well as pose variations and

Conclusions and future work

In this paper, we propose a two-phase face hallucination approach to infer high-resolution faces from low-resolution ones. In the global phase, based on the fact that the low- and high-resolution face images have very similar distributions in their respective low-dimensional feature spaces, the LPH algorithm is proposed to synthesize the global high-resolution face with global smoothness and continuity. Compared with other methods, LPH can be applied to new data more easily and generates more visually

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 60525108, No. 60533090), the 973 Program (No. 2002CB312101), the Science and Technology Project of Zhejiang Province (2005C13032, 2006C13097), and the China–US Million Book Digital Library Project (www.cadal.zju.edu.cn).

About the Author—YUE-TING ZHUANG was born in 1965 and he received his B.S., M.S., and Ph.D. degrees from Zhejiang University in 1986, 1989, and 1998, respectively. Currently, he is a professor and Ph.D. supervisor at College of Computer Science, Zhejiang University. His research interests mainly include multimedia database, artificial intelligence, content based image/video information retrieval and video based animation.

References (26)

  • C.Y. Su et al., Steerable pyramid-based face hallucination, Pattern Recognition (2005)
  • Y. Chang et al., Manifold based analysis of facial expression, Image and Vision Comput. (2006)
  • S. Baker et al., Hallucinating faces
  • C. Liu et al., A two-step approach to hallucinating faces: global parametric model and local nonparametric model
  • X.G. Wang et al., Hallucinating face by eigentransformation, IEEE Trans. Syst. Man Cybern. (2005)
  • W. Liu, D.H. Lin, X.O. Tang, Neighbor combination and transformation for hallucinating faces, in: Proceedings of IEEE...
  • W. Liu, D.H. Lin, X.O. Tang, Hallucinating faces: TensorPatch super-resolution and coupled residue compensation, in:...
  • Y. Li et al., An improved two-step approach to hallucinating faces
  • W.T. Freeman et al., Learning low-level vision
  • A. Hertzmann et al., Image analogies
  • R. Kimmel, Demosaicing: image reconstruction from color CCD samples, IEEE Trans. Image Process. (1999)
  • C. Lee et al., High-quality image resizing using oblique projection operators, IEEE Trans. Image Process. (1998)
  • X. Li et al., New edge-directed interpolation, IEEE Trans. Image Process. (2001)


    About the Author—JIAN ZHANG received his B.S. and M.S. degrees from Shandong University of Science and Technology in 2000 and 2003, respectively. He is currently a Ph.D. candidate of Zhejiang University–Microsoft visual perception laboratory. His current research interests include image/video analysis, pattern recognition, computer vision and machine learning.

    About the Author—FEI WU received his B.S., M.S. and Ph.D. degrees from Lanzhou University, Macao University and Zhejiang University, respectively. Dr. Wu's research interests include multimedia retrieval, video-based human animation and pattern recognition.
