Image and Vision Computing

Volume 30, Issue 9, September 2012, Pages 668-679

Efficient 3D face recognition handling facial expression and hair occlusion

https://doi.org/10.1016/j.imavis.2012.07.011

Abstract

This paper presents an efficient 3D face recognition method that handles facial expression and hair occlusion. The proposed method uses facial curves to form a rejection classifier and to produce a facial deformation mapping, and then adaptively selects regions for matching. When a new 3D face with an arbitrary pose and expression is queried, the pose is normalized based on the automatically detected nose tip, followed by principal component analysis (PCA). Then, the facial curve in the nose region is extracted and used to form the rejection classifier, which quickly eliminates dissimilar faces in the gallery for efficient recognition. Next, six facial regions covering the face are segmented, and curves in these regions are used to map facial deformation. Regions used for matching are automatically selected based on the deformation mapping. Finally, the results of all the matching engines are fused by a weighted sum rule. The approach is applied to the FRGC v2.0 dataset and a verification rate of 96.0% for ROC III is achieved at a false acceptance rate (FAR) of 0.1%. In the identification scenario, a rank-one accuracy of 97.8% is achieved.

Highlights

► An efficient 3D face preprocessing method is used.
► The partial central profile is used to produce a rejection classifier.
► An adaptive region selection scheme is proposed.

Introduction

Face recognition is required in a wide range of applications such as surveillance, access control and human–machine interaction because of its non-intrusive nature. 2D face recognition has been extensively researched in the last two decades, but it is sensitive to illumination and pose. Therefore, researchers are now investigating other data acquisition modalities of the face to overcome these limitations. One of the most promising is the 3D shape of the face, which is considered inherently robust to the pose and illumination changes that adversely affect 2D face data [1].

The 3D shape of the human face deforms sharply and in complex ways under facial expressions, which degrades recognition accuracy. Expression variation is regarded as one of the most difficult factors in 3D face recognition.

Several attempts have been made to overcome expression variation in 3D face recognition. Bronstein et al. [2] demonstrate that facial expressions can be modeled as isometries of the facial surface. They assume the 3D facial surface to be isometric and remove the effects of expressions by finding an isometry-invariant representation of the face. Lu et al. [3] learn expression deformations from a control group of neutral and non-neutral scans; the learned deformations are then used to synthesize expression models for the neutral faces in the gallery, and recognition is performed by comparing the probe with the deformed gallery models. In the work by Kakadiaris et al. [4], the 3D face is parameterized using the annotated face model (AFM). In the work by Wang et al. [5], deformation fields are applied to deform the non-neutral 3D scans into neutral scans, with constraints applied to avoid overfitting. Al-Osaimi et al. [6] propose a 3D face recognition approach which differentiates between expression deformations and interpersonal disparities using deformable models.

Different parts of the face deform differently when expression changes: some regions deform substantially while others deform little. Using rigid regions which are relatively insensitive to expressions is another way to overcome expression variation in 3D face recognition. Chang et al. [7] use multiple overlapping nose regions and obtain increased performance relative to using the whole frontal face region. Mian et al. [8] describe a multimodal 2D + 3D face recognition system combining three methods: a spherical face representation (SFR), the scale-invariant feature transform (SIFT) descriptor and a region-based matching method. Their system improves the efficiency of recognition by using SFR and SIFT to reject a subset of gallery subjects and achieves promising results, but some expression-sensitive information is used to produce the SFR, which deteriorates recognition. Neither method, however, makes full use of the discriminative information in the entire face. Queirolo et al. [9] use a simulated annealing-based approach (SA) for range image registration and then compute the similarity between two faces using the surface interpenetration measure (SIM). Four regions are matched with a hierarchical classifier, and significant recognition accuracy is reported for that approach.

In the work by Wang et al. [10], the rigid parts of the facial surface are dynamically extracted by selecting a subset of the nearest point pairs to calculate the dissimilarity measure during registration of facial surfaces. Faltemier et al. [11] select 28 different regions around the face and then perform score-based fusion on the individual region match scores. A hierarchical component-based face registration method is used in [12]. In the work by Lin et al. [13], every matching point is weighted by its distance to the bilateral symmetry plane of the face. The methods above use the iterative closest point (ICP) algorithm [14] to perform matching and achieve good performance, but their time consumption is large because all of the face regions must be matched during recognition. Ocegueda et al. [15] propose a framework that finds the most discriminative areas of the face for recognition using a Markov random field model; for each 3D face, a compact signature of 360 coefficients is extracted and used for recognition. Contrary to our approach, this method requires mapping each face to a geometry image.

Considering that the face deforms differently under different expressions, some researchers combine facial expression categorization with face recognition. In the work by Cook et al. [16], images are manually allocated to three groups based on expression, and a part-face recognition system using a subspace projection method is then constructed. However, this method assumes that the expression of the face is known. In [17], an automated variant of the approach is detailed, in which a "happy face" recognition system aids the face recognition.

Beumier and Acheroy [18] develop an integrated 3D face acquisition and comparison system based on multiple-profile comparisons, in which facial curves are extracted and used to compare two faces. Many other works also employ facial curves for 3D face recognition [19]. It is reported that the central profile shows the greatest distinctiveness [20]. Generally, facial curves are less discriminative than regions, but they require little storage and can be matched quickly.

Another factor that often corrupts the facial surface is hair occlusion, which most frequently affects the forehead region. In this paper, we present a robust 3D face recognition method that handles facial expression and hair occlusion. A complete framework for face recognition is presented, with five main stages (see Fig. 1): (1) preprocessing, (2) rejection, (3) adaptive region selection, (4) region-based matching, and (5) evaluation.

In the preprocessing stage, the input face is first smoothed and the nose tip is automatically detected. Then, the face is cropped and normalized. The facial curve in the nose region, also called "the partial central profile", is matched to reject dissimilar faces in the gallery. Curves in the forehead, cheek and mouth regions are extracted and matched to produce a facial deformation mapping for expression and hair occlusion. Regions used for matching are adaptively selected based on the deformation mapping, and the selected regions are matched to the remaining gallery using a modified ICP algorithm. Lastly, the similarity score is computed as the weighted sum of the elementary scores given by each region, and the face with the highest similarity score is taken as the recognition result. Fig. 1 shows the block diagram of our method.
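To make the region-matching step concrete, the following is a minimal point-to-point ICP sketch in NumPy/SciPy. The paper uses a modified ICP whose details are not given in this excerpt, so this plain Kabsch-based variant with a KD-tree for correspondences is an assumption-level illustration, not the authors' implementation; the function name and parameters are hypothetical.

import numpy as np
from scipy.spatial import cKDTree

def icp_residual(probe, gallery, n_iter=30, tol=1e-6):
    """Rigidly align probe (N x 3) to gallery (M x 3); return the RMS residual."""
    tree = cKDTree(gallery)
    src = probe.copy()
    prev_err = np.inf
    for _ in range(n_iter):
        dist, idx = tree.query(src)              # closest-point correspondences
        dst = gallery[idx]
        # Optimal rigid transform via SVD of the cross-covariance (Kabsch)
        mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
        H = (src - mu_s).T @ (dst - mu_d)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T                       # rotation, reflection-corrected
        t = mu_d - R @ mu_s
        src = src @ R.T + t                      # apply the update to all points
        err = np.sqrt((dist ** 2).mean())        # RMS distance as a match score
        if abs(prev_err - err) < tol:            # stop when converged
            break
        prev_err = err
    return err

A lower residual indicates a better region match; the elementary scores from the six regions would then be combined by the weighted sum rule described above.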

3D face recognition is computationally expensive, and a brute-force matching approach does not scale well to large galleries. AdaBoost has been used to select a small number of critical features in 2D face detection [21], and a cascade method has been used for fast human detection [22]. In this paper, a rejection classifier is used to reduce the number of gallery faces before accurate matching. The key problem is which feature to use for rejection, and solving it is the first contribution of our paper. We use the partial central profile, a vertical facial curve in the nose region, to produce the rejection classifier. This curve is simple and robust to facial expression and hair occlusion, so it can reject a large number of ineligible candidate faces in a short time.
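The rejection step can be pictured with a short sketch: assuming each face is reduced to a depth curve resampled along the vertical line through the nose tip, gallery faces whose curve lies too far from the probe's are discarded before the expensive ICP stage. The function name, the Euclidean curve distance, and the threshold are illustrative assumptions, since the exact matching rule is not given in this excerpt.

import numpy as np

def reject_gallery(probe_curve, gallery_curves, threshold):
    """Keep only the gallery subjects whose partial central profile lies
    within `threshold` of the probe's curve (both resampled to equal length).
    The threshold is a hypothetical bound tuned on training data."""
    return [gid for gid, curve in gallery_curves.items()
            if np.linalg.norm(probe_curve - curve) <= threshold]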

The second contribution of our paper is an adaptive region selection scheme. After rejection, our scheme automatically selects as much expression- and occlusion-robust information as possible for accurate matching. Facial curves in the forehead, cheek and mouth regions are extracted and compared to produce the facial deformation mapping, and the regions used for matching are then selected according to this mapping. By examining the distortions on a region-by-region basis, our scheme scales to any number of facial expressions without requiring an expression recognition system as in [17]. Compared with the work by Wang et al. [10] and Faltemier et al. [11], only regions that are not influenced by expression or hair occlusion are matched and used for recognition, which improves efficiency. This is also an advantage over systems such as those in [7], [8], where only the nose and forehead regions are used for recognition.
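A minimal sketch of such a selection scheme follows, assuming one distortion score per region obtained from the curve comparison. The region names follow the paper's six probes; the scoring and the limit are hypothetical placeholders.

# Hypothetical deformation mapping: one distortion score per region, obtained
# by comparing the facial curve extracted from that region against the gallery.
DEFORMABLE_REGIONS = ["forehead", "left_cheek", "right_cheek",
                      "left_mouth", "right_mouth"]

def select_regions(distortion, limit):
    """Return the regions whose curve distortion is below `limit`; the nose
    region is always kept because it is nearly rigid under expression."""
    selected = ["nose"]
    selected += [r for r in DEFORMABLE_REGIONS if distortion[r] < limit]
    return selected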

This paper is an extension of our work presented in [23] and is organized as follows. Section 2 describes the preprocessing stage. In Section 3, the rejection classifier based on the partial central profile is presented. Section 4 gives details of our adaptive region selection scheme. Section 5 lists and compares the recognition results of our algorithm with others. Conclusions are discussed in Section 6.

Section snippets

Face smoothing

We perform experiments on the Face Recognition Grand Challenge (FRGC) v2.0 dataset [24]. Our algorithm operates automatically using only the 3D shape from the frontal view of the face. Many images in the FRGC v2.0 dataset contain outlier and missing points, which deteriorate recognition performance, so the images are first smoothed. Spikes caused by outlier points are removed with a Gaussian filter, and small holes in the face are then filled. The X, Y and Z coordinates of the…
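A plausible NumPy/SciPy reading of this smoothing step is sketched below: depth values that deviate strongly from a Gaussian-blurred version are treated as spikes, and missing pixels are patched from a median of their neighbourhood. The filter sizes and thresholds are illustrative, not the paper's values.

import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def smooth_range_image(z, spike_sigma=2.0, spike_thresh=5.0):
    """Remove spikes and fill small holes in a range image z (NaN = missing)."""
    filled = np.where(np.isnan(z), np.nanmedian(z), z)   # crude placeholder fill
    blurred = gaussian_filter(filled, sigma=spike_sigma)
    spikes = np.abs(filled - blurred) > spike_thresh     # depth outliers
    filled[spikes] = blurred[spikes]                     # replace spike depths
    patched = median_filter(filled, size=3)              # local consensus values
    filled[np.isnan(z)] = patched[np.isnan(z)]           # patch former holes
    return filled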

Rejection classifier

A rejection classifier is defined as one which quickly eliminates a large percentage of the candidate classes with high probability. A rejector is an algorithm $\psi$ that, given an input $x \in \mathbb{R}^d$, returns a set of class labels $\psi(x)$ such that $x \in W_i \Rightarrow i \in \psi(x)$, where $x$ is a measurement vector, $\mathbb{R}^d$ is a classifier space of $d$ measurements, and $W_i$ is the $i$th class such that $W_i \subset \mathbb{R}^d$ [8], [28].

The effectiveness of a rejection classifier $\psi$ can be defined as
$$\mathrm{Eff}(\psi) = \frac{E_{x \in S}(|\psi(x)|)}{N},$$
where $E_{x \in S}(|\psi(x)|)$ is the…
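In code, this measure is simply the mean number of gallery labels surviving rejection over a test set $S$, normalised by the gallery size $N$; lower values indicate a more effective rejector. A toy computation:

import numpy as np

def rejector_effectiveness(surviving_counts, n_classes):
    """Eff(psi) = E_{x in S}(|psi(x)|) / N, the mean number of classes
    surviving rejection divided by the gallery size N."""
    return np.mean(surviving_counts) / n_classes

# Toy example: a 400-class gallery where on average 60 classes survive
print(rejector_effectiveness([55, 62, 58, 65], 400))     # -> 0.15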

Part-face methodology

Face recognition techniques typically employ a monolithic representation of the face during recognition; however, approaches which decompose the face into sub-regions have shown considerable promise [7], [8]. Based on the analysis of facial action under expressions [30], we divide the face into six regions: the nose (Probe A), forehead (Probe B), left cheek (Probe C), right cheek (Probe D), left mouth (Probe E) and right mouth (Probe F) (see Fig. 9). The six regions are…
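One way to realise such a segmentation, assuming the face has been pose-normalised with the nose tip at the origin, is to crop axis-aligned boxes around the nose. The box extents below (in mm) are placeholders; the paper's exact region bounds are not given in this excerpt.

import numpy as np

REGION_BOXES = {
    "A_nose":        (-20,  20, -25, 25),   # (x_min, x_max, y_min, y_max)
    "B_forehead":    (-45,  45,  35, 80),
    "C_left_cheek":  (-60, -20, -20, 30),
    "D_right_cheek": ( 20,  60, -20, 30),
    "E_left_mouth":  (-35,   0, -60, -25),
    "F_right_mouth": (  0,  35, -60, -25),
}

def crop_region(points, box):
    """Select the 3D points (N x 3) of a normalised face falling inside an
    axis-aligned x/y box centred on the nose-tip coordinate frame."""
    x_min, x_max, y_min, y_max = box
    m = ((points[:, 0] >= x_min) & (points[:, 0] <= x_max) &
         (points[:, 1] >= y_min) & (points[:, 1] <= y_max))
    return points[m]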

Experiments and analysis

We employ two different scenarios for our experiments: verification and identification. In a verification scenario, the objective is to confirm whether a person is who he/she claims to be, and we measure the verification rate at a 0.1% false acceptance rate (FAR). Each face in the probe set is compared to a gallery set and the result is compared against a threshold. The results are summarized using receiver operating characteristic (ROC) curves. In an identification scenario, the objective is to…
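The verification rate at a fixed FAR can be estimated empirically from sets of genuine and impostor comparison scores. This sketch assumes higher scores mean greater similarity; the function name is illustrative.

import numpy as np

def verification_rate_at_far(genuine, impostor, far=0.001):
    """Pick the score threshold so that roughly `far` of impostor comparisons
    are accepted, then report the fraction of genuine scores above it."""
    impostor = np.sort(np.asarray(impostor))[::-1]        # descending scores
    k = max(int(np.floor(far * len(impostor))) - 1, 0)
    threshold = impostor[k]                               # empirical FAR cut
    return np.mean(np.asarray(genuine) > threshold)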

Conclusions

In this paper, we have proposed a robust 3D face recognition method based on a rejection classifier and an adaptive region selection scheme. Experiments have been conducted on the FRGC v2.0 dataset. Encouraging results and comparisons with other 3D face recognition methods have demonstrated the effectiveness of our method.

Although our method works well on common faces with approximately symmetrical shapes, it can fail when the face is normalized incorrectly, which results in an incorrect PCS. Some failure cases are provided…

Acknowledgments

The authors would like to thank the FRGC organizers and Zhejiang University for providing the face data. This research is sponsored by the National Natural Science Foundation of China (Grant No. 51175081), the Jiangsu Natural Science Foundation (Grant No. BK2010058), and the Mass Innovation Foundation of Shanghai Municipal Electrical Power Company, China (Grant No. 52097012000A-2).

References (31)

  • K.W. Bowyer et al., A survey of approaches and challenges in 3D and multi-modal 3D + 2D face recognition, Comp. Vision Image Underst. (2006)
  • C. Beumier et al., Automatic 3D face authentication, Image Vision Comput. (2000)
  • A.M. Bronstein et al., Expression-invariant representations of faces, IEEE Trans. Image Process. (2007)
  • X.G. Lu et al., Deformation modeling for robust 3D face matching, IEEE Trans. Pattern Anal. Mach. Intell. (2008)
  • I.A. Kakadiaris et al., Three-dimensional face recognition in the presence of facial expressions: an annotated deformable model approach, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
  • Y.M. Wang et al., 3D face recognition in the presence of expression: a guidance-based constraint deformation approach
  • F. Al-Osaimi et al., An expression deformation approach to non-rigid 3D face recognition, Int. J. Comput. Vis. (2009)
  • K.I. Chang et al., Multiple nose region matching for 3D face recognition under varying facial expression, IEEE Trans. Pattern Anal. Mach. Intell. (2006)
  • A. Mian et al., An efficient multimodal 2D–3D hybrid approach to automatic face recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
  • C.C. Queirolo et al., 3D face recognition using simulated annealing and the surface interpenetration measure, IEEE Trans. Pattern Anal. Mach. Intell. (2010)
  • Y.M. Wang et al., Exploring facial expression effects in 3D face recognition using partial ICP
  • T.C. Faltemier et al., A region ensemble for 3-D face recognition, IEEE Trans. Inf. Forensics Secur. (2008)
  • N. Alyuz et al., Component-based registration with curvature descriptors for expression insensitive 3D face recognition
  • T.H. Lin et al., Automatic face authentication with self compensation, Image Vision Comput. (2007)
  • P.J. Besl et al., A method for registration of 3-D shapes, IEEE Trans. Pattern Anal. Mach. Intell. (1992)

    This paper has been recommended for acceptance by Ioannis A. Kakadiaris, PhD.
