A novel method for detecting lips, eyes and faces in real time
Introduction
In recent years, the rapid advancement of image processing techniques and the falling cost of image/video acquisition devices have encouraged the development of many computer vision applications, such as vision-based surveillance, vision-based man–machine interfaces, and vision-based biometrics. Among these applications, face recognition is a central task that attracts the attention of more and more researchers. A number of works in the literature have presented face recognition applications at laboratory and commercial scales [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11]. One of the important tasks in designing a good face recognition system is the design of an efficient algorithm to locate faces in captured images or video. Face detection is also a central task in applications other than recognition systems. For example, in some video transmission applications, human faces are the only changing foreground objects in the video frames; by segmenting the faces from the video background, the repeated encoding, transmission, and decoding of unchanged background parts can be avoided, saving network bandwidth and computation. As face detection is always the first step in such recognition or transmission systems, its performance places an upper bound on the performance of the whole system. Ideally, a good face detector should accurately extract all faces in images regardless of their positions, scales, orientations, colors, shapes, poses, expressions, and lighting conditions. However, given the current state of the art in image processing, this goal remains a major challenge. For this reason, many face detectors deal only with upright, frontal faces in well-constrained environments [1], [12], [13], [14], [15], [16].
In addition to accuracy, another important concern is detection speed. In many video-phone and surveillance applications, for instance, real-time operation is a critical requirement. This requirement rules out many algorithms that extract faces precisely but at the cost of extensive computation time. High-speed CPUs may provide a hardware solution to the speed requirement; however, their high cost may also reduce the acceptability of such systems for common users.
In this paper, we propose a novel real-time face detection algorithm that can accurately locate the face regions in images as well as the eyes and lips of each located face. The capabilities of the proposed algorithm are as follows:
- 1.
Users can tilt their faces left or right by about 45°.
- 2.
Users can raise, lower, or rotate their heads as long as neither lips nor eyes are occluded.
- 3.
Face sizes are limited to between 1600 (= 40×40) and 9216 (= 96×96) pixels. This limitation is set to fit the resolution requirements of general face recognition engines; the values can easily be adjusted if different resolutions are demanded.
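The size constraint in item 3 amounts to a simple area filter on candidate bounding boxes. A minimal sketch of such a filter follows; the box format and the candidate values are hypothetical illustrations, not part of the proposed algorithm's actual data structures:

```python
# Hypothetical area filter implementing the paper's size limits:
# candidates whose pixel area falls outside [40*40, 96*96] are discarded.
# Boxes are (x, y, width, height) tuples.

MIN_AREA = 40 * 40   # 1600 pixels
MAX_AREA = 96 * 96   # 9216 pixels

def within_size_limits(box):
    """Return True if the box's area lies in [1600, 9216] pixels."""
    _, _, w, h = box
    return MIN_AREA <= w * h <= MAX_AREA

# Example: only the middle candidate survives the filter.
candidates = [(10, 10, 30, 30), (50, 40, 60, 60), (0, 0, 120, 120)]
faces = [b for b in candidates if within_size_limits(b)]   # -> [(50, 40, 60, 60)]
```

Changing the resolution requirement for a different recognition engine then reduces to adjusting the two constants.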
The rest of the paper is organized as follows. Section 2 briefly surveys related work. The details of the proposed algorithm are presented in Section 3. To show the effectiveness of the proposed algorithm, experimental results are provided in Section 4, together with a performance evaluation and comparisons in terms of accuracy and speed. Finally, we conclude the paper in Section 5.
Related works
A straightforward approach for detecting faces in images is template correlation matching [1], [2], [3], [4]. The template can be designed by hand or learned from a collection of face patterns. During the matching process, the template is correlated with subimages at every location in the input image, and candidates are identified based on a predefined similarity or distance measure. To handle possible variations in size, orientation, shape, etc., two methods are usually adopted.
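As an illustration of this matching step, the sketch below slides a template over a grayscale image and scores each position by normalized cross-correlation. The array sizes, the planted template, and the `correlate` helper are hypothetical, not the implementation of any of the surveyed works:

```python
import numpy as np

def correlate(image, template):
    """Score every template position in the image by normalized
    cross-correlation (1.0 = perfect match)."""
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.sqrt((t * t).sum())
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    scores = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p * p).sum()) * tnorm
            scores[y, x] = (p * t).sum() / denom if denom > 0 else 0.0
    return scores

# Plant an exact copy of the template in a random image and recover it.
rng = np.random.default_rng(0)
img = rng.random((20, 20))
tmpl = img[5:13, 7:15].copy()
scores = correlate(img, tmpl)
best = np.unravel_index(np.argmax(scores), scores.shape)   # -> (5, 7)
```

The exhaustive double loop is exactly why this approach is slow: the cost grows with both image and template size, which motivates the speed-oriented design discussed in the paper.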
Rule-based face detection algorithm
Following the framework of the bottom-up detection approach, the proposed algorithm is designed to extract facial components, namely lips and eyes. To reduce the search area in the input images, the proposed algorithm also extracts skin pixels. However, instead of using probabilistic models, we use a quadratic polynomial model for the color of skin pixels to reduce computation time. Moreover, we also extend this polynomial model to the
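A quadratic-polynomial skin test of the kind described above might be sketched as follows. The coefficients here are illustrative values in normalized r–g chromaticity space, in the style of skin-locus models from the literature; they are not the coefficients of the proposed algorithm:

```python
# Minimal sketch of a quadratic-polynomial skin classifier.
# A pixel is mapped to normalized r-g chromaticity and labeled skin
# if g lies between two quadratics in r. Coefficients are illustrative.

def is_skin(rgb):
    """Classify one RGB pixel (0-255 channels) as skin or non-skin."""
    r8, g8, b8 = rgb
    s = r8 + g8 + b8
    if s == 0:
        return False          # pure black carries no chromaticity
    r, g = r8 / s, g8 / s     # normalized chromaticity coordinates
    upper = -1.376 * r * r + 1.074 * r + 0.145   # hypothetical bounds
    lower = -0.776 * r * r + 0.560 * r + 0.177
    return lower < g < upper

print(is_skin((200, 120, 90)))   # reddish skin tone -> True
print(is_skin((50, 120, 200)))   # bluish pixel     -> False
```

Evaluating two quadratics per pixel is far cheaper than evaluating a probabilistic (e.g. Gaussian-mixture) model, which is the computational argument the paragraph makes.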
Performance evaluation
For the performance evaluation, we implemented the proposed algorithm on a PC with a Pentium III 800 MHz CPU and 128 MB of RAM. The implemented system has two modes of operation. The first is an on-line mode that detects faces in video frames captured from a PC camera in real time. The other is an off-line mode that detects faces in still images. To evaluate the accuracy and speed of the proposed algorithm, we prepared a test set that contains 1000 images. Among
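An evaluation of this kind, measuring detection rate and average per-image processing time over a labeled test set, can be scripted along the following lines. The `detector` interface and the dummy test set are assumptions for illustration, not the paper's actual test protocol:

```python
import time

def evaluate(detector, test_set):
    """Run a detector over (image, n_faces) pairs, returning the
    detection rate and the average per-image time in milliseconds."""
    hits, total_ms = 0, 0.0
    for image, n_faces in test_set:
        t0 = time.perf_counter()
        found = detector(image)                      # number of faces found
        total_ms += (time.perf_counter() - t0) * 1000.0
        hits += min(found, n_faces)                  # cap at ground truth
    n_true = sum(n for _, n in test_set)
    return hits / n_true, total_ms / len(test_set)

# Dummy detector that always reports one face, just to exercise the harness.
def dummy_detector(image):
    return 1

rate, avg_ms = evaluate(dummy_detector, [(None, 1), (None, 2), (None, 1)])
# rate == 0.75 (3 of 4 true faces), avg_ms is the mean per-image latency
```

Dividing the total time by the number of images gives the per-frame cost, from which the achievable frame rate of the on-line mode follows directly.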
Concluding remarks
According to the experimental results, the proposed algorithm exhibits satisfactory performance in both accuracy and speed. For applications with well-constrained usage and environmental conditions, the algorithm can be further improved in both speed and accuracy through further simplification and refinement of the system design. However, two main restrictions remain in using the proposed algorithm:
- 1.
The lighting conditions must be normal. In other
Acknowledgements
This work was supported in part by the National Science Council of the Republic of China under Grant NSC-90-2218-E-259-001.
References (37)
- et al., Face recognition using the mixture-of-eigenfaces method, Pattern Recognition Letters (2002)
- et al., Face recognition with one training image per person, Pattern Recognition Letters (2002)
- et al., Adaptive skin color modeling using the skin locus for selecting training pixels, Pattern Recognition (2003)
- et al., Mixture model for face-color modeling and segmentation, Pattern Recognition Letters (2001)
- et al., Adaptive skin-color filter, Pattern Recognition (2001)
- et al., Face detection and location based on skin chrominance and lip chrominance transformation from color images, Pattern Recognition (2001)
- et al., A novel approach for human face detection from color images under complex background, Pattern Recognition (2001)
- et al., Detecting human faces in color images, Image and Vision Computing (1999)
- et al., Modelling facial colour and identity with Gaussian mixtures, Pattern Recognition (1998)
- Pentland A, Moghaddam B, Starner T, Oliyide O, Turk M. View-based and modular eigenspaces for face recognition. In: IEEE...
- Eigenfaces vs. Fisherfaces: recognition using class specific linear projection, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Face recognition using radial basis function (RBF) neural networks, IEEE Transactions on Neural Networks
- Face recognition using line edge map, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Face recognition algorithm using local and global information, Electronics Letters
- Face recognition using kernel principal component analysis, IEEE Signal Processing Letters
- Face recognition using support vector machines with local correlation kernels, International Journal of Pattern Recognition and Artificial Intelligence
- BioID: a multimodal biometric identification system, IEEE Computer
- Human and machine recognition of faces: a survey, Proceedings of the IEEE