
Pattern Recognition

Volume 33, Issue 9, September 2000, Pages 1525-1539

Real-time face location on gray-scale static images

https://doi.org/10.1016/S0031-3203(99)00130-2

Abstract

This work presents a new approach to automatic face location on gray-scale static images with complex backgrounds. In the first stage, our technique approximately detects the image positions where the probability of finding a face is high; in the second stage, the location accuracy of the candidate faces is improved and their existence is verified. The experimentation shows that the algorithm performs very well both in terms of detection rate (just one missed detection over 70 images) and of efficiency (about 13 images/s processed on an Intel Pentium II 266 MHz).

Introduction

Automatic face location is a very important task which constitutes the first step of a large area of applications: face recognition, face retrieval by similarity, face tracking, surveillance, etc. (e.g. Ref. [1]). In the opinion of many researchers, face location is the most critical step towards the development of practical face-based biometric systems, since its accuracy and efficiency have a direct impact on system usability. Several factors contribute to making this task very complex, especially for applications that must operate in real time on gray-scale static images. Complex backgrounds, illumination changes, pose and expression changes, head rotation in 3D space and different distances between the subject and the camera are the main sources of difficulty.

Many face-location approaches have been proposed in the literature, depending on the type of images (gray-scale images, color images or image sequences) and on the constraints considered (simple or complex background, scale and rotation changes, different illuminations, etc.). Briefly summarizing such a conspicuous number of works requires a pre-classification; unfortunately, the wide variety of techniques adopted by researchers makes this far from easy. Aware of the unavoidable inaccuracies, we propose the following tentative classification:

  • Methods based on template matching with static masks and heuristic algorithms which use images taken at different resolutions (multiresolution approaches) [2], [3].

  • Computational approaches based on deformable templates which characterize the human face [4] or internal features [5], [6], [7], [8]: eyes, nose, mouth. These methods can be conceived as an evolution of the previous class, since the templates can be adapted to the different shapes characterizing the searched objects. The templates are generally defined in terms of geometric primitives like lines, polygons, circles and arcs; a fitness criterion is employed to determine the degree of matching.

  • Face and facial parts detection using dynamic contours or snakes [6], [9], [10], [11]. These techniques involve a constrained global optimization, which usually gives very accurate results but at the same time is computationally expensive.

  • Methods based on elliptical approximation and on face searching via least-squares minimization [12], incremental ellipse fitting [13] and elliptic region growing [14].

  • Approaches based on the Hough transform [7] and the adaptive Hough transform [15].

  • Methods based on the search for a significant group of features (triplets, constellations, etc.) in the context considered: for example, two eyes and a mouth suitably located constitute a significant group in the context of a face [7], [16], [17], [18], [19].

  • Face search on the eigenspace determined via PCA [20] and face location approaches based on the information theory [21], [22].

  • Neural network approaches [23], [24], [25], [26], [27], [28], [29], [30]. The best results have been obtained by using feed-forward networks to classify image portions normalized with respect to scale and illumination. During training, examples of face objects and non-face objects are presented to the network. The main drawback of these methods is their high computational cost, induced by the need to process all possible face positions in the image at several resolutions.

  • Face location on color images through segmentation in a color space: YIQ, YES, HSI, HSV, Farnsworth, etc. [27], [31], [32], [33], [34], [35], [36]. Generally, color information greatly simplifies the localization task: a simple spectrographic analysis shows that face-skin pixels are usually clustered in a color space, so an ad hoc segmentation either isolates the face from the background or at least drastically reduces the amount of information that must be processed in the subsequent stages.

  • Face detection on image sequences using motion information: optical flow, spatio-temporal gradient, etc. [27], [33], [37].

Since in several applications it is mandatory (or preferable) to deal with static gray-scale images, we believe it is important to develop a method that does not exploit additional information such as color and motion. For example, most of the surveillance cameras installed nowadays in shops, banks and airports are still gray-scale cameras (due to their lower cost), and the electronic processing of mug-shot or identity-card databases may require detecting faces on static gray-scale pictures printed on paper. Unfortunately, if we discard color- and motion-based approaches, the most robust methods are generally time-consuming and cannot be used in real-time applications.

The aim of this work is to provide a new method which is capable of processing gray-scale static images in real time. The algorithm must operate with structured backgrounds and must tolerate illumination changes, scale variations and small head rotations.

Our approach (Fig. 1(a)) is based on a location technique which starts by approximately detecting the image positions (or candidate positions) where the probability of finding a face is high (module AL) and then, for each of them, improves the location accuracy and verifies the presence of a true face (module FLFV). Actually, most applications in the field of biometric systems require the detection of just one object in the image (i.e. the foreground object): under this hypothesis, a more efficient implementation of our method is reported in Fig. 1(b), where at each step the module AL passes only the most likely position to FLFV, and FLFV keeps requesting new positions until a valid face is detected or no more candidates are available. It should be noted that, even in this case, the system could be used to detect multiple faces in an image, provided the iterative process is not prematurely interrupted.

Although AL and FLFV have been implemented in very different manners, both modules work on the same kind of data: the directional image extracted from the original gray-scale image.

In Section 2 the directional image is defined and some comments about its computation are reported. Section 3 describes the module AL, which is based on the search for elliptical blobs in the directional image by means of the generalized Hough transform. In Section 4 we present the dynamic-mask-based technique used for fine location and face verification (module FLFV), and in Section 5 we discuss how to combine AL and FLFV in practice in order to implement the functional schema of Fig. 1(b). Section 6 reports the results of our experimentation over a 70-image database; finally, in Section 7, we present our conclusions and discuss future research.

Directional image

Most face-location approaches perform an initial edge extraction by means of a gradient-like operator; a few methods also exploit additional features such as directional information, intensity maxima and minima, etc. Our technique strongly relies on the edge phase angles contained in a directional image.

A directional image is a matrix defined over a discrete grid, superimposed on the gray-scale image, whose elements are in correspondence with the grid nodes. Each element is a vector
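As a concrete illustration of this definition, the following is a minimal sketch, assuming each grid-node vector carries an edge phase angle and a strength; simple central differences stand in for whatever gradient operator is actually used, and the doubled-angle averaging trick keeps opposite gradient directions within a cell from cancelling:

```python
import numpy as np

def directional_image(gray, cell=8):
    """Sketch: one (phase angle, strength) vector per node of a grid
    superimposed on the gray-scale image. The gradient operator and
    cell size are illustrative assumptions, not the paper's choices."""
    gy, gx = np.gradient(gray.astype(float))
    h, w = gray.shape
    H, W = h // cell, w // cell
    phase = np.zeros((H, W))      # edge phase angle at each grid node
    strength = np.zeros((H, W))   # coherence of directions in the cell
    for i in range(H):
        for j in range(W):
            bx = gx[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            by = gy[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            mag = np.hypot(bx, by)
            # average on the doubled angle so that opposite gradients
            # (same edge orientation) reinforce instead of cancelling
            s = np.sum(mag * np.exp(1j * np.arctan2(2*bx*by, bx*bx - by*by)))
            phase[i, j] = 0.5 * np.angle(s)
            strength[i, j] = np.abs(s) / (mag.sum() + 1e-9)
    return phase, strength
```

On a uniform intensity ramp, every cell yields the same phase angle with strength close to 1, since all gradients agree.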

AL - approximate location

The analysis of a certain number of directional images suggested the formulation of a simple method for detecting faces. In particular, we noted that when a face is present in an image the corresponding directional image region is characterized by vectors producing an elliptical blob. For this reason, the module AL is based on the search for ellipses on the directional image. Several techniques could be used for this purpose, for example multiresolution template matching [39] and least-squares
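The core idea, elements of the directional image voting for ellipse centers via the generalized Hough transform, can be sketched as follows; for brevity this toy version assumes axis-aligned ellipses and treats each element's phase angle as the border tangent direction, whereas the paper's scheme is richer:

```python
import numpy as np

def hough_ellipse_centers(phase, strength, a_vals, b_vals, thr=0.3):
    """Hedged sketch of the AL idea: each strong directional element
    votes for the centers of the (axis-aligned) ellipses whose border
    tangent would produce its phase angle."""
    H, W = phase.shape
    acc = np.zeros((H, W))                   # accumulator over centers
    for i, j in zip(*np.nonzero(strength > thr)):
        theta = phase[i, j]                  # border tangent angle
        for a in a_vals:
            for b in b_vals:
                # border parameter t whose tangent matches theta
                t = np.arctan2(-b * np.cos(theta), a * np.sin(theta))
                dy, dx = b * np.sin(t), a * np.cos(t)
                for s in (+1, -1):           # tangent is ambiguous mod pi
                    ci = int(round(i - s * dy))
                    cj = int(round(j - s * dx))
                    if 0 <= ci < H and 0 <= cj < W:
                        acc[ci, cj] += strength[i, j]
    return acc
```

Peaks of the accumulator then play the role of the candidate positions that AL hands over to FLFV.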

FLFV - fine location and face verification

Different strategies can be adopted to improve the location accuracy and to verify whether an elliptical object really is a face. Some of the alternatives we explored are reported in the following:

  • Improving the ellipse center location through AHT (Adaptive Hough Transform) [41], [42] which requires the granularity of the hot accumulator cells to be gradually refined.

  • Local optimization of the center [xc,yc], of the semi-axes a and b and of the ellipse tilt angle ξ through a local
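The second alternative, a local search over the five ellipse parameters, can be sketched as a greedy coordinate descent; the fitness function below is a hypothetical stand-in (the paper's actual criterion is not reproduced here), as are the step sizes:

```python
import math

def refine_ellipse(score, xc, yc, a, b, xi, steps=50):
    """Greedy coordinate-descent sketch of locally optimizing the center
    (xc, yc), semi-axes a, b and tilt angle xi. `score` is any fitness
    function (higher is better), e.g. agreement between the ellipse
    border and the directional image."""
    params = [xc, yc, a, b, xi]
    deltas = [1.0, 1.0, 0.5, 0.5, math.radians(2)]  # illustrative steps
    best = score(*params)
    for _ in range(steps):
        improved = False
        for k, d in enumerate(deltas):
            for step in (+d, -d):            # try nudging parameter k
                trial = list(params)
                trial[k] += step
                v = score(*trial)
                if v > best:
                    best, params, improved = v, trial, True
        if not improved:                     # local optimum reached
            break
    return tuple(params), best
```

A multi-scale variant (shrinking the deltas once no step improves) would mirror the granularity refinement mentioned for the AHT alternative.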

Combining AL and FLFV

Depending on the application requirements, there are several ways of adjusting and combining the AL and FLFV modules. Since at this stage our aim is to develop a method capable of efficiently detecting the foreground face, we adopted the functional schema of Fig. 1(b). In particular, the algorithm searches for just one face in the image; it returns the face position [xf, yf] and sizes af, bf in case of detection, and null otherwise. A pseudo-code version of the whole face detection method is reported:
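The loop described above (Fig. 1(b)) can be sketched as follows; the function and parameter names are illustrative, not the paper's:

```python
def detect_face(image, approximate_locations, refine_and_verify):
    """Sketch of the Fig. 1(b) schema. `approximate_locations` (the AL
    module) yields candidate positions most-likely first;
    `refine_and_verify` (FLFV) returns the refined face parameters
    (xf, yf, af, bf) for a true face, or None otherwise."""
    for pos in approximate_locations(image):
        face = refine_and_verify(image, pos)
        if face is not None:
            return face          # detection: position and semi-axes
    return None                  # candidates exhausted: no face found
```

Because AL yields candidates in decreasing order of likelihood, the loop stops as soon as the foreground face is verified, which is what makes this schema efficient for single-face applications.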

 

Experimentation

Experimental results have been produced on a database of 70 images, each of which contains at least one human face (Fig. 11). All the images (384×288 pixels, 256 gray levels) were acquired in offices and laboratories of our department, under different illuminations (sometimes rather critical: backlighting, semidarkness, …) and with the subject at different distances from the camera. In 10 images people wear spectacles. The subjects were required to gaze at the camera. Each of the 70 images was

Conclusions

This work proposes a two-stage approach to face location on gray-scale static images with complex backgrounds. Both modules operate on the elements constituting the directional image, which proved very effective in providing reliable information even in the presence of critical illumination and semidarkness.

The approximate location module searches for the most likely positions in the image by means of a particular implementation of the generalized Hough transform. Great

References (48)

  • L.S. Davis, Hierarchical generalized Hough transform and line segment based generalized Hough transforms, Pattern Recognition (1982).
  • R. Chellappa, S. Sirohey, C.L. Wilson, C.S. Barnes, Human and machine recognition of faces: a survey, Tech. Report...
  • I. Craw, D. Tock, A. Bennet, Finding face features, Proceedings of ECCV,...
  • A. Yuille, D. Cohen, P. Hallinan, Facial features extraction by deformable templates, Tech. Report 88-2, Harvard...
  • K. Lam et al., Locating and extracting the eye in human face images, Pattern Recognition (1996).
  • A. Lanitis, C.J. Taylor, T.F. Cootes, T. Ahmed, Automatic interpretation of human faces and hand gesture using flexible...
  • R. Funayama, N. Yokoya, H. Iwasa, H. Takemura, Facial component extraction by cooperative active nets with global...
  • S.R. Gunn, M.S. Nixon, Snake head boundary extraction using global and local energy minimisation, Proceedings of the...
  • S.A. Sirohey, Human face segmentation and identification, Tech. Report CAR-TR-695, Center for Automation Research,...
  • A. Jacquin, A. Eleftheriadis, Automatic location tracking of faces and facial features in video sequences, Proceedings...
  • R. Herpers, H. Kattner, H. Rodax, G. Sommer, GAZE: an attentive processing strategy to detect and analyze the prominent...
  • V. Govindaraju, S.N. Srihari, D.B. Sher, A computational model for face location, Proceedings of the 3rd ICCV, 1990,...
  • H.P. Graf, T. Chen, E. Petajan, E. Cosatto, Locating faces and facial parts, Proceedings of the International Workshop...
  • M.C. Burl, T.K. Leung, P. Perona, Face localization via shape statistics, Proceedings of the International Workshop on...

About the Author—DARIO MAIO is Full Professor at the Computer Science Department, University of Bologna, Italy. He has published in the fields of distributed computer systems, computer performance evaluation, database design, information systems, neural networks, biometric systems, and autonomous agents. Before joining the Computer Science Department, he received a fellowship from the C.N.R. (Italian National Research Council) for participation in the Air Traffic Control Project. He received the degree in Electronic Engineering from the University of Bologna in 1975. He is an IEEE member. He is with CSITE-C.N.R. and with DEIS; he teaches database and information systems at the Computer Science Dept., Cesena.

About the Author—DAVIDE MALTONI is an Associate Researcher at the Computer Science Department, University of Bologna, Italy. He received the degree in Computer Science from the University of Bologna in 1993. In 1998 he received his Ph.D. in Computer Science and Electronic Engineering at DEIS, University of Bologna, with a dissertation on "Biometric Systems". His research interests also include autonomous agents, pattern recognition and neural nets. He is an IAPR member.
