Adaptive active appearance model with incremental learning
Introduction
Active appearance models (AAMs) are flexible linear models that can represent the shape and appearance variations of non-rigid objects (Cootes et al., 2001, Matthews and Baker, 2004). They are particularly popular for tracking facial feature points because the human face exhibits a wide variety of shape and appearance variations across different people. Matthews and Baker (2004) modeled the feature points of the human face using the AAM and proposed a fast fitting algorithm. Xiao et al. (2004) introduced a 3D shape model that limited the variation of the 2D shape in order to construct a more robust AAM. Sung and Kim (2006) proposed a stereo AAM that could fit to a stereo image and developed a facial expression system by tracking the facial feature points.
However, AAMs are extremely sensitive to several factors such as pose, expression, and illumination changes. In particular, the fitting performance of an AAM degrades drastically when the illumination characteristics of an input image deviate from those of the training images. Many researchers have tried to overcome the effect of illumination. Cao et al. (2003) used histogram equalization and gamma correction. Wang et al. (2003) introduced a ratio image that transformed the illumination of the input image into the illumination of a reference image; however, the ratio image is not appropriate for real-time applications due to its long computation time. Blanz and Vetter (1999) and Wang et al. (2003) built parametric 3D face models that allowed for variations of the 3D shape and texture, and solved the illumination problem by estimating the illumination effects from 3D surface normals. Hager and Belhumeur (1998) removed the illumination effects by finding the illumination basis from the normal vectors using the SVD method. However, the data points in these 3D face models are so dense that they cannot be handled in real time.
One simple way to solve this problem is to train the AAM with a large number of training images covering all possible illumination conditions. However, it is difficult to collect training images under all possible illumination conditions, and no matter how rich the training set, the trained AAM cannot represent every possible illumination change.
In this paper, we propose an adaptive AAM that updates its linear appearance model to represent the given data well through the incremental PCA technique (Artac et al., 2002, Hall et al., 1998, Li, 2004). The proposed adaptive AAM can handle an input image with an unknown illumination condition because it incorporates the new illumination condition into the updated appearance basis vectors using the incremental PCA technique. However, an erroneous update caused by an ill-fitted input image can degrade the fitting performance. Therefore, we need to update the appearance basis vectors selectively, only when the AAM fitting is good, which requires a quantitative measure that evaluates the goodness of the AAM fitting.
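The incremental PCA update of the appearance basis can be sketched roughly as follows, in the spirit of Hall et al. (1998): the new sample's residual extends the basis, a small eigenproblem is solved, and the basis is rotated and truncated. The function and parameter names are ours for illustration, not the authors' implementation.

```python
import numpy as np

def ipca_update(mean, U, evals, n, x, max_basis=20):
    """One incremental PCA step (sketch after Hall et al., 1998).

    mean  : (d,)   current sample mean
    U     : (d, k) current orthonormal basis vectors (columns)
    evals : (k,)   current eigenvalues
    n     : int    number of samples seen so far
    x     : (d,)   new appearance sample (e.g. a warped face patch)
    """
    xc = x - mean
    a = U.T @ xc                       # coefficients in the current subspace
    r = xc - U @ a                     # residual orthogonal to the subspace
    rnorm = np.linalg.norm(r)
    r_hat = r / rnorm if rnorm > 1e-8 else np.zeros_like(r)

    # Small (k+1)x(k+1) problem mixing the old eigenvalues with the
    # new sample's coefficients in the extended basis [U, r_hat].
    k = U.shape[1]
    D = np.zeros((k + 1, k + 1))
    D[:k, :k] = (n / (n + 1.0)) * np.diag(evals)
    q = np.append(a, rnorm)
    D += (n / (n + 1.0) ** 2) * np.outer(q, q)

    new_evals, R = np.linalg.eigh(D)           # ascending eigenvalues
    order = np.argsort(new_evals)[::-1]        # sort descending
    new_evals, R = new_evals[order], R[:, order]

    U_new = np.hstack([U, r_hat[:, None]]) @ R  # rotate extended basis
    mean_new = (n * mean + x) / (n + 1.0)

    m = min(max_basis, U_new.shape[1])          # keep the basis bounded
    return mean_new, U_new[:, :m], new_evals[:m], n + 1
```

Truncating back to a fixed number of basis vectors after each update keeps the per-frame cost constant, which matters for real-time tracking.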
Generally, the AAM fitting error is used as a measure of the goodness of AAM fitting. However, computing the AAM fitting error requires the ground truth of the shape vector, which is unavailable in the test phase. To solve this problem, we propose to take the number of outlier pixels as an alternative measure of the goodness of AAM fitting, because there is a strong correlation between ill-fitting and outlier pixels in that an ill-fitted pixel is treated as an outlier pixel. Gross et al. (2004) introduced the idea of outlier pixels, using robust statistics to improve the AAM fitting. However, this method is not effective when the illumination of the input face image is changing. To overcome this limitation, we adopt the adaptive appearance model (McKenna et al., 1997), which makes the computation of outlier pixels robust to illumination changes.
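As a minimal sketch of the outlier-based quality measure: a pixel whose residual exceeds a few standard deviations of the (adaptively maintained) per-pixel model is treated as an outlier, and the fitting is judged good when the outlier count stays below a threshold. The thresholds and names below are illustrative assumptions, not the paper's actual values.

```python
import numpy as np

def count_outlier_pixels(warped, model_mean, model_var, c=2.5):
    """Count pixels whose residual exceeds c standard deviations.

    warped     : flattened warped input patch
    model_mean : per-pixel mean of the (online) appearance model
    model_var  : per-pixel variance of the (online) appearance model
    """
    residual = np.abs(warped - model_mean)
    return int(np.sum(residual > c * np.sqrt(model_var)))

def fitting_is_good(warped, model_mean, model_var, max_outlier_ratio=0.1):
    """Declare the fitting good when few pixels are outliers."""
    n_out = count_outlier_pixels(warped, model_mean, model_var)
    return n_out <= max_outlier_ratio * warped.size
```

Because the per-pixel mean and variance are themselves adapted over time, a global brightness shift that the model has already absorbed does not inflate the outlier count, unlike a raw fitting-error threshold.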
Fleet et al. (2003) proposed an online appearance model (OAM) for robust visual tracking. Their approach involves a mixture of Gaussians with three components: the stable component S for slowly varying image structure, the lost component L for outliers, and the wandering component W for two-frame variations. Zhou et al. (2004) modified this online appearance model (Fleet et al., 2003) by using a fixed template F instead of the lost component L for effective occlusion handling, and obtained the outlier pixels by means of robust statistics. We propose a novel online appearance model that modifies the existing OAM by taking the reconstructed appearance image R instead of the fixed template F, where the reconstructed appearance image adapts to the input image better than the fixed template F. In this work, the OAM parameters such as the means, variances, and mixing densities of the three components S, L, and R are updated by EM learning only when the AAM fitting is good, as determined by the number of outlier pixels.
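One online EM step for such a per-pixel mixture can be sketched as below, with a Gaussian stable component S, a Gaussian component R centered at the current model reconstruction, and a uniform lost component L for outliers. Everything here (names, forgetting rate, the uniform density) is an illustrative assumption, not the authors' exact update.

```python
import numpy as np

def oam_em_update(obs, recon, state, alpha=0.05, uniform_p=1.0 / 256):
    """One online EM step for a per-pixel 3-component mixture:
    S ~ N(mu_s, var_s), R ~ N(recon, var_r), L ~ uniform.

    obs   : observed warped patch (flattened, values in [0, 1])
    recon : reconstruction by the current appearance model
    state : (mu_s, var_s, var_r, m) with mixing weights m (3,)
    """
    mu_s, var_s, var_r, m = state
    # E-step: weighted component likelihoods per pixel.
    p_s = m[0] * np.exp(-0.5 * (obs - mu_s) ** 2 / var_s) / np.sqrt(2 * np.pi * var_s)
    p_r = m[1] * np.exp(-0.5 * (obs - recon) ** 2 / var_r) / np.sqrt(2 * np.pi * var_r)
    p_l = m[2] * uniform_p * np.ones_like(obs)
    total = p_s + p_r + p_l
    o_s, o_r, o_l = p_s / total, p_r / total, p_l / total  # ownerships

    # M-step with exponential forgetting (rate alpha).
    mu_s = mu_s + alpha * o_s * (obs - mu_s)
    var_s = var_s + alpha * o_s * ((obs - mu_s) ** 2 - var_s)
    m_new = (1 - alpha) * m + alpha * np.array(
        [o_s.mean(), o_r.mean(), o_l.mean()])
    m_new /= m_new.sum()

    # Pixels mostly owned by L are the outliers used to judge the fit.
    return (mu_s, var_s, var_r, m_new), o_l
```

Running this update only on well-fitted frames keeps a bad fit from corrupting the per-pixel statistics, which is exactly the selectivity the paper argues for.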
With the help of the online appearance model, we can maintain a good AAM fitting as long as the illumination condition remains steady. However, when the illumination condition of the input image changes at a certain point in time, the AAM fitting becomes unstable because the current appearance basis vectors are no longer appropriate for representing the changed input image, which causes an increase in the reconstruction error. To avoid this situation, we update the current appearance basis vectors of the AAM with the changed input image using the incremental PCA technique. This selective update of the appearance basis vectors provides illumination adaptiveness to the AAM with a minimal computational load, performing only the necessary updates.
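The selective-update logic amounts to a two-level gate: a frame must first be judged well fitted (few outlier pixels) before any model is touched, and the more expensive basis update runs only when the reconstruction error indicates the basis no longer covers the input. A compact sketch, with purely illustrative thresholds:

```python
def selective_update(n_outliers, recon_error,
                     outlier_thresh=50, recon_thresh=0.1):
    """Decide which updates to run after fitting one frame.

    Returns (update_oam, update_basis). Thresholds are illustrative
    placeholders, not the paper's actual values.
    """
    if n_outliers > outlier_thresh:
        # Ill-fitted frame: updating anything would corrupt the models.
        return False, False
    # Good fit: always refresh the online appearance model; refresh the
    # AAM appearance basis only when the current basis reconstructs the
    # input poorly (e.g. the lighting has changed).
    return True, recon_error > recon_thresh
```

This ordering matters: gating on the outlier count first prevents a drift-inducing basis update from an ill-fitted frame, while gating the incremental PCA on the reconstruction error avoids needless updates under stable illumination.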
The organization of this paper is as follows. Section 2 reviews related studies: the active appearance model, incremental principal component analysis, and the online appearance model. Section 3 explains the proposed adaptive AAM algorithm. Section 4 presents experimental results and discussion. Finally, Section 5 draws our conclusions.
Active appearance models
In AAMs (Cootes et al., 2001), the 2D shape is represented by a triangulated mesh with $l$ vertices, which correspond to salient points of an object. Mathematically, the shape vector $\mathbf{s}$ is defined by the 2D coordinates of the $l$ vertices that make up the mesh as
$$\mathbf{s} = (x_1, y_1, x_2, y_2, \ldots, x_l, y_l)^T,$$
and shape variation is expressed by a linear combination of a mean shape $\mathbf{s}_0$ and $n$ shape basis vectors $\mathbf{s}_i$ as
$$\mathbf{s} = \mathbf{s}_0 + \sum_{i=1}^{n} p_i \, \mathbf{s}_i,$$
where $\mathbf{p} = (p_1, \ldots, p_n)^T$ is a shape parameter vector. The shape basis vectors are normally obtained by applying PCA to a set of aligned training shapes: $\mathbf{s}_0$ is the mean shape and the $\mathbf{s}_i$ are the $n$ eigenvectors corresponding to the largest eigenvalues.
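As a rough illustration of the shape model above (not the authors' code), the basis can be built with PCA via an SVD of the centered training shapes, and a shape instance composed from the parameters:

```python
import numpy as np

def build_shape_model(shapes, n_basis):
    """Build (s0, S) from aligned training shapes.

    shapes : (N, 2l) array, one Procrustes-aligned shape per row
    Returns the mean shape s0 (2l,) and basis S (2l, n_basis).
    """
    s0 = shapes.mean(axis=0)
    # PCA via SVD of the centered data; rows of Vt are the eigenvectors
    # of the sample covariance, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(shapes - s0, full_matrices=False)
    return s0, Vt[:n_basis].T

def shape_instance(s0, S, p):
    """Compose a shape from the linear model s = s0 + sum_i p_i * s_i."""
    return s0 + S @ p
```

Projecting a shape onto the basis and re-composing it recovers the shape exactly whenever the basis spans the training variation, which is the property the AAM fitting relies on.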
AAM fitting quality measurement
As mentioned earlier, the AAM fitting error, defined as the difference between the warped patch image and the model appearance image, is not appropriate for measuring the AAM fitting quality, i.e., for determining whether the AAM fitting is good or bad. The reason is that we cannot tell whether a high AAM fitting error comes from an ill-fitted result or from an illumination change.
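For concreteness, the fitting error in question can be sketched as the residual between the warped patch and its closest reconstruction in the appearance subspace; the names below are illustrative. Note that this residual grows both for a misplaced mesh and for an unmodeled lighting change, which is precisely the ambiguity discussed above.

```python
import numpy as np

def aam_fitting_error(warped, mean_app, A):
    """RMS residual between the warped input patch and its
    reconstruction by the appearance model.

    warped   : (d,)   shape-normalized input patch
    mean_app : (d,)   mean appearance
    A        : (d, m) appearance basis vectors as columns (orthonormal)
    """
    coeffs = A.T @ (warped - mean_app)       # project into the subspace
    recon = mean_app + A @ coeffs            # best subspace reconstruction
    return np.sqrt(np.mean((warped - recon) ** 2))
```

Any appearance that lies inside the subspace yields zero error regardless of how it arose, while any component outside it, whether misalignment or new lighting, inflates the error identically.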
Fig. 1 shows a case in which the AAM fitting error is misleading as an evaluation of the goodness of the AAM fitting.
Data set
To demonstrate the effectiveness of the proposed adaptive AAM algorithm, we performed several experiments using two face image sequences that were captured under different lighting conditions; the images were saved as 320 × 240 grayscale images.
The first image sequence consists of 100 face images that were captured under a normal lighting condition. Fig. 5a shows some face images from the first image sequence, in which the facial expression varies over the sequence. The second image sequence consists of
Conclusion
Because the AAM uses a subspace appearance model learned from a set of training images, it can fit to images that are similar to the training images. However, it is so sensitive to illumination change, especially when the illumination condition of the input images differs from that of the training images, that it produces unstable fitting performance.
To alleviate this problem, we proposed an adaptive AAM that updates the appearance bases incrementally to adapt itself to the changing illumination conditions of the input images.
Acknowledgement
This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the Biometrics Engineering Research Center (BERC) at Yonsei University (R112002105070030(2008)) and also was supported by the Intelligent Robotics Development Program, one of the 21st Century Frontier R&D Programs funded by the Ministry of Commerce, Industry and Energy (MOCIE).
References (20)
- Artac, M., Jogan, M., Leonardis, A., 2002. Incremental PCA for on-line visual learning and recognition. In: Proc. ...
- Baker, S., Matthews, I., 2004. Lucas–Kanade 20 years on: A unifying framework. Internat. J. Comput. Vision.
- Blanz, V., Vetter, T., 1999. A morphable model for the synthesis of 3D faces. In: Proc. Computer Graphics, Annual Conf. ...
- Cao, B., Shan, S., Gao, W., Zhao, D., 2003. Illumination normalization for robust face recognition against varying ...
- Cootes, T.F., et al., 2001. Active appearance models. IEEE Trans. Pattern Anal. Machine Intell.
- Fleet, D.J., et al., 2003. Robust online appearance models for visual tracking. IEEE Trans. Pattern Anal. Machine Intell.
- Gross, R., Matthews, I., Baker, S., 2003. Lucas–Kanade 20 years on: A unifying framework: Part 3. CMU-RI-TR-03-05, ...
- Gross, R., Matthews, I., Baker, S., 2004. Constructing and fitting active appearance models with occlusion. In: Proc. ...
- Hager, G.D., Belhumeur, P.N., 1998. Efficient region tracking with parametric models of geometry and illumination. IEEE Trans. Pattern Anal. Machine Intell.
- Li, Y., 2004. On incremental and robust subspace learning. Pattern Recognition.