
Pattern Recognition

Volume 44, Issue 2, February 2011, Pages 401-411

Particle swarm optimization based fusion of near infrared and visible images for improved face verification

https://doi.org/10.1016/j.patcog.2010.08.006

Abstract

This paper presents two novel image fusion schemes for combining visible and near infrared (NIR) face images, with the aim of improving verification performance. Sub-band decomposition is first performed on the visible and NIR images separately. In both schemes, we then employ particle swarm optimization (PSO) to find an optimal strategy for fusing the visible and NIR sub-band coefficients. In the first scheme, PSO computes the optimal weights of a weighted linear combination of the coefficients. In the second scheme, PSO selects an optimal subset of features from the visible and near infrared face images. To evaluate and compare the efficacy of the proposed schemes, we performed extensive verification experiments on the IRVI database. This database was acquired in our laboratory using a new sensor capable of acquiring visible and near infrared face images simultaneously, thereby avoiding the need for image calibration. The experiments show that our first scheme clearly outperforms both the NIR-only system and score-level fusion, which themselves already exhibit good stability to illumination variations.

Introduction

Recent improvements in sensor technology have established the face as a reliable trait for identifying persons in security applications. However, face verification still performs poorly in the presence of strong illumination and of variations in pose and facial expression [4]. Recently, some studies have demonstrated that thermal/infrared sensors acquire images that are less sensitive to ambient light or illumination changes [10], facial expression [10], [2] and face pose [3]. Unfortunately, these images carry less information about face shape and texture [4]. Several authors [15], [18], [4] have recently shown that near infrared (NIR) images, which use an intermediate spectral range between visible and infrared light (790–900 nm), offer some robustness to illumination variations while preserving meaningful aspects of the face such as appearance and structure. Face recognition under degraded illumination conditions therefore remains possible with this type of image [15], [18].

We have developed in our team an original acquisition device that acquires visible and NIR images almost simultaneously (20 ms delay) [4]. This successive capture uses the video protocol, in which two frames (odd and even) are captured to form an image. During the odd frame, we acquire a near infrared image by activating a synchronized pulsed illuminator during the integration time of the CCD (charge-coupled device); the acquisition time is reduced so that little ambient light is captured, and the resulting image is composed of the reflection of the flash's light. During the even frame, the illuminator is deactivated and a normal acquisition is performed. The output of this sensor is a video flux that interleaves both images; we therefore separate the odd and even frames to form two images of 640×480 pixels. Fig. 1 presents the acquisition sequence. This original device allows us to avoid the calibration step when the two images are to be used together, whether for comparison or for fusion. A detailed description of our sensor is given in [5].
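The separation of odd and even frames described above can be sketched as follows. This is a minimal illustration, assuming the video flux is already available as a NumPy array of frames; the function name and the exact frame ordering (which frame carries the NIR flash) are illustrative assumptions, not details from the paper.

```python
import numpy as np

def split_interlaced_frames(video_flux):
    """Split an interleaved capture into NIR and visible frame stacks.

    `video_flux` is assumed to have shape (n_frames, 480, 640), with
    alternating frames: flash-on frames (near infrared) interleaved
    with flash-off frames (visible, ambient light only).
    """
    nir_frames = video_flux[0::2]      # flash active -> near infrared
    visible_frames = video_flux[1::2]  # flash off    -> visible
    return nir_frames, visible_frames

# Hypothetical example: 10 interleaved 640x480 frames
flux = np.zeros((10, 480, 640), dtype=np.uint8)
nir, vis = split_interlaced_frames(flux)
```

Each call yields two registered stacks of 640×480 images, one per spectrum, with no further calibration step.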

In our previous work [4], our aim was first to compare the performance of different classical face verification approaches on both visible and near infrared images acquired with our sensor. We then studied the degradation of performance caused by illumination variations on visible images and verified the stability of NIR images under poor illumination conditions. We confirmed that the error rates of two algorithms, one chosen as representative of global face verification methods and the other a variant of elastic graph matching (EGM), were very high on visible images under strong illumination while remaining limited on NIR images. We performed these experiments on a database acquired with our sensor, on which we defined several protocols designed to emphasize the difference between controlled and non-controlled conditions. We also performed some limited and unsatisfactory score fusion experiments: due to the poor performance of the systems working on visible images, fusion performance was roughly identical to that of the best system, namely the one working on near infrared light.

In this study, we propose to perform fusion at the image level in order to exploit, at an earlier stage, the inherent complementarity of the two images. The problem in this context is to build a single image from the two available images of the same object. The image fusion literature on visible and infrared images shows an interest in performing multi-resolution fusion rather than direct fusion at the pixel level, as this makes it possible to integrate more information at different decomposition levels. The general scheme for multi-resolution image fusion is the following [6], [13], [14], [16]: after a multi-scale transformation of each source image, which yields a distinct feature set per image, a composite feature set is built from these two feature sets according to specific rules; an inverse transform then reconstructs an image that can be seen as the fusion of the two originals. In the literature on infrared and visible image fusion, the proposed fusion rules fall into two groups. The first idea is to select the features from the visible and thermal face images that will be retained in the fused feature set; to this end, Singh et al. [16], for instance, employed a genetic algorithm. The other idea is to assign weights to the two types of features (selection can of course be seen as a particular case of weighting in which the weights take values 0 and 1). Heo et al. [6] proposed to fuse the pixels of visible and thermal face images using an empirically designed weighted linear combination. In [14], the authors weight the features by relating the weights to the activity (energy) level of each image, labeled "good" or "bad" by a granular SVM classifier.
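The decompose–combine–reconstruct scheme above can be sketched with a one-level Haar transform and a single global weight. This is a toy illustration only: the paper's schemes fuse sub-band coefficients with weights (or selections) optimized by PSO, not a fixed global weight, and the function names here are assumptions.

```python
import numpy as np

def haar_decompose(img):
    """One-level 2-D Haar transform: approximation (LL) + 3 detail sub-bands."""
    a = img[0::2, 0::2].astype(float)
    b = img[0::2, 1::2].astype(float)
    c = img[1::2, 0::2].astype(float)
    d = img[1::2, 1::2].astype(float)
    LL = (a + b + c + d) / 4
    LH = (a + b - c - d) / 4
    HL = (a - b + c - d) / 4
    HH = (a - b - c + d) / 4
    return LL, LH, HL, HH

def haar_reconstruct(LL, LH, HL, HH):
    """Exact inverse of haar_decompose (even-sized images assumed)."""
    h, w = LL.shape
    img = np.empty((2 * h, 2 * w))
    img[0::2, 0::2] = LL + LH + HL + HH
    img[0::2, 1::2] = LL + LH - HL - HH
    img[1::2, 0::2] = LL - LH + HL - HH
    img[1::2, 1::2] = LL - LH - HL + HH
    return img

def fuse_weighted(vis, nir, w=0.5):
    """Fuse two registered images by a weighted combination of their
    Haar sub-band coefficients, then inverse-transform the result."""
    bands_v = haar_decompose(vis)
    bands_n = haar_decompose(nir)
    fused = [w * bv + (1 - w) * bn for bv, bn in zip(bands_v, bands_n)]
    return haar_reconstruct(*fused)
```

With w=1 the fused image reduces to the visible image and with w=0 to the NIR image; intermediate weights blend the two spectra sub-band by sub-band.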

In this work, we provide two novel solutions for fusing features from two different image sources; one is a weighting scheme, the other a selection scheme. The novelty lies in the use of particle swarm optimization (PSO) [8], a well-known optimization algorithm that we have already used successfully [12] in the context of biometric feature fusion. In the first proposed scheme, we use the real-valued version of PSO to find optimal weights for the weighted summation of visible and infrared features, while in the second scheme we use the binary version of PSO to select, in each spectrum, the features that will constitute the fused feature vector. In both cases, PSO searches for the feature fusion strategy that optimizes verification performance. The proposed image fusion techniques are evaluated on visible and NIR images acquired by our original sensor. Note that this sensor provides automatic registration between the visible and NIR images, since both are acquired simultaneously; this acquisition framework therefore makes the idea of image fusion natural. More precisely, we use the IRVI database previously acquired in our laboratory, with the associated protocols defined in [4]. Our experiments compare the two proposed image fusion schemes to the unimodal systems and to several systems performing fusion at different stages, namely image, feature and match score level. The results show that the proposed image fusion offers substantial improvements in performance over all the other schemes.
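The two fusion strategies, weighting and selection, differ only in the form of the combination, which a short sketch makes explicit (the function names and the 0/1 mask convention are illustrative assumptions):

```python
import numpy as np

def fuse_by_selection(feat_vis, feat_nir, mask):
    """Build a fused feature vector by picking, coefficient by coefficient,
    either the visible or the NIR feature. `mask` is a binary vector
    (1 -> take visible, 0 -> take NIR), the kind of solution a binary
    PSO searches for."""
    mask = np.asarray(mask, dtype=bool)
    return np.where(mask, feat_vis, feat_nir)

def fuse_by_weighting(feat_vis, feat_nir, weights):
    """Weighted linear combination of the two feature vectors; selection
    is the special case where every weight is exactly 0 or 1."""
    w = np.asarray(weights, dtype=float)
    return w * np.asarray(feat_vis) + (1.0 - w) * np.asarray(feat_nir)
```

A real-valued PSO optimizes `weights` directly, while a binary PSO optimizes `mask`; both are scored by the resulting verification performance.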

The rest of the paper is organized as follows: Section 2 briefly describes the PSO algorithm principles including the definition of the real and binary implementations. Section 3 describes the two proposed image fusion schemes that we designed for the image fusion of two different images resulting from different sources and the related verification systems. Section 4 describes the experimental setup, the PSO parameters that were chosen for performing this fusion task and the experimental protocols defined on the IRVI database. Section 5 describes and analyses the results of the proposed schemes and finally Section 6 draws the conclusion.

Section snippets

General description

PSO is a stochastic, population-based optimization technique aiming at finding a solution to an optimization problem in a search space. The PSO algorithm was first described by Kennedy and Eberhart in 1995 [8]. The main objective of PSO is to optimize a given function called the fitness function. PSO is initialized with a population of particles distributed randomly over the search space. Each particle corresponds to a point in the N-dimensional space. The ith particle is represented as Xi={x1, x2,…,x
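A minimal real-valued PSO loop of the kind described above might look like the sketch below. The parameter values are common defaults from the PSO literature, not the ones tuned in the paper, and the sphere function stands in for the verification-performance fitness.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_minimize(fitness, dim, n_particles=30, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-1.0, 1.0)):
    """Minimal real-valued PSO: each particle is pulled toward its own
    best position (pbest) and the swarm's best position (g)."""
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # positions
    v = np.zeros((n_particles, dim))              # velocities
    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    g = pbest[np.argmin(pbest_val)].copy()        # global best
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # velocity update: inertia + cognitive pull + social pull
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([fitness(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, float(pbest_val.min())

# Toy fitness: sphere function, minimum at the origin
best, best_val = pso_minimize(lambda p: float(np.sum(p**2)), dim=3)
```

In the paper's setting, the fitness would instead evaluate verification performance for the candidate fusion weights.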

Multispectral image fusion using particle swarm optimization

In this section, we describe the two novel image fusion schemes that we propose for fusing visible and near infrared images. As previously mentioned, we use PSO algorithms in both schemes to fuse the visible and near infrared features. The fused image is then used in a verification system that matches two images after a feature extraction step. The verification decision (accept/reject) is taken on the basis of the matching scores between the

Parameters of PSO

In this work, the particles are associated with the weights that can be assigned to the features extracted from the visible and near infrared images. In this section, we describe several PSO parameters, such as the fitness function, acceleration constants, inertia weight and velocity limitation, which need to be adapted to our specific application.
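For the binary variant of PSO, the velocity update takes the same form as in real-valued PSO, but the velocity is clamped (the "velocity limitation" above) and mapped through a sigmoid giving the probability that each bit of the position is set to 1. The sketch below shows one update step; the constants are illustrative defaults, not the values selected for this application.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def binary_pso_step(x, v, pbest, gbest, rng,
                    w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One step of binary PSO (Kennedy & Eberhart's discrete variant).

    x, pbest, gbest are binary vectors; v is the real-valued velocity.
    The sigmoid of the clamped velocity is the probability of each
    bit being 1 in the new position."""
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    v = np.clip(v, -v_max, v_max)                 # velocity limitation
    x = (rng.random(x.shape) < sigmoid(v)).astype(int)
    return x, v
```

The fitness function, evaluated on the verification performance obtained with the candidate bit mask, then decides which positions become personal and global bests.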

Results and discussion

In this section, we use protocols S2P1 and S2PA, as defined in Section 4.2, to evaluate the performance of the proposed image fusion algorithms and to compare them with unimodal systems, feature-level and match-score-level fusion, and state-of-the-art image fusion algorithms [6], [16].
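Verification performance in such experiments is commonly summarized by the equal error rate (EER). A simple way to approximate it from genuine and impostor matching scores is sketched below; the assumption that higher scores mean better matches is ours, for illustration.

```python
import numpy as np

def eer(genuine, impostor):
    """Approximate the equal error rate: the operating point where the
    false rejection rate (genuine pairs rejected) and the false
    acceptance rate (impostor pairs accepted) are closest, reporting
    their average there."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, best_eer = np.inf, 1.0
    for t in thresholds:
        frr = np.mean(genuine < t)    # genuine scores below threshold
        far = np.mean(impostor >= t)  # impostor scores at/above threshold
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2
    return float(best_eer)
```

On perfectly separable score distributions the EER is 0; overlapping distributions push it toward 0.5.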

Conclusion

In this work we have considered the difficult problem of face recognition in the presence of non-controlled illumination. Following previous work initiated in our team, we propose the use of an original sensor allowing the simultaneous acquisition of visible and near infrared face images. We (and other authors) have shown in earlier work [4] that the use of NIR light significantly decreases the EER compared to visible light in uncontrolled illumination conditions. Our aim in


References (21)


Cited by (94)


R. Raghavendra received his B.E. and M.Tech. degrees in Electronics and Communication Engineering from the University of Mysore and Visvesvaraya Technological University, respectively. He also holds a Ph.D. degree in Computer Science and Technology from the University of Mysore, India. He is presently working as a Scientific Researcher at Institut Télécom; Télécom SudParis, Évry, France. His current research interests include multimodal biometrics, DSP, pattern recognition, feature selection, optimization techniques and finite mixture models. He has authored more than 25 research papers in these areas.

Bernadette Dorizzi received her Ph.D. (Thèse d'état) in Theoretical Physics from the University of Orsay (Paris XI, France) in 1983, on the integrability of dynamical systems. She has been a Professor at Télécom and Management SudParis (formerly INT) since September 1989 and head of the Electronics and Physics Department since 1995, where she leads the Intermedia (Interaction for Multimedia) research team. Her present research activity is in the biometric domain. She is coordinator of the Bio-Identity Institut Télécom research project (http://www.int-evry/biometrics) and of the BioSecure Foundation (http://biosecure.info). She has authored more than 300 research papers and supervised more than 15 Ph.D. theses.

Ashok Rao received his B.E., M.E. and Ph.D. in 1982, 1985 and 1991, respectively, all in EE, from the University of Mysore, IISc Bangalore and IIT Bombay. His current areas of research include biometrics, DSP and image processing, bio-medical signal processing, applied linear algebra, social computing, renewable energy and engineering education. From 1999 to 2005 he was Head of the Network Project, CEDT, IISc. Since August 2009 he has been a Professor at CIT, Gubbi, Tumkur. He has authored over 75 research publications in these areas.

G. Hemantha Kumar received B.Sc., M.Sc. and Ph.D. from University of Mysore. He is working as a Professor in the Department of Studies in Computer Science, University of Mysore, Mysore. He has published more than 200 papers in Journals, Edited Books and Refereed Conferences. His current research interest includes Numerical Techniques, Digital Image Processing, Pattern Recognition and Multimodal Biometrics.
