2009 | Book

Advanced Concepts for Intelligent Vision Systems

11th International Conference, ACIVS 2009, Bordeaux, France, September 28–October 2, 2009. Proceedings

Edited by: Jacques Blanc-Talon, Wilfried Philips, Dan Popescu, Paul Scheunders

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science

About this book

This book constitutes the refereed proceedings of the 11th International Conference on Advanced Concepts for Intelligent Vision Systems, ACIVS 2009, held in Bordeaux, France in September/October 2009. The 43 revised full papers and 25 posters presented were carefully reviewed and selected from 115 submissions. The papers are organized in topical sections on technovision, fundamental mathematical techniques, image processing, coding and filtering, image and video analysis, computer vision, tracking, color, multispectral and special-purpose imaging, medical imaging, and biometrics.

Table of Contents

Frontmatter

Technovision

Evaluation of Interest Point Detectors for Non-planar, Transparent Scenes

The detection of stable, distinctive and rich feature point sets has been an active area of research in the field of video and image analysis. Transparency imaging, such as X-ray, has also benefited from this research. However, an evaluation of the performance of the various available detectors for this type of image is lacking. The differences with natural imaging stem not only from the transparency but, in the case of medical X-ray, also from the non-planarity of the scenes, a factor that complicates the evaluation. In this paper, a method is proposed to perform this evaluation on non-planar, calibrated X-ray images. Repeatability and accuracy of nine interest point detectors are demonstrated on phantom and clinical images. The evaluation has shown that the Laplacian-of-Gaussian and Harris-Laplace detectors show overall the best performance for the datasets used.
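
As a rough illustration of the kind of detector and repeatability measurement discussed above, the sketch below (not the authors' evaluation code) runs a Laplacian-of-Gaussian blob detector on a test image and on a noisy copy, then scores repeatability as the fraction of points recovered within a pixel tolerance. The registered-image assumption and the skimage test image are placeholders for the calibrated, non-planar X-ray setup of the paper.

```python
# Hedged sketch: LoG interest points plus a naive repeatability score.
import numpy as np
from skimage import data, feature

def log_keypoints(image, max_sigma=10, threshold=0.05):
    """Detect LoG blobs; returns an (N, 2) array of (row, col) locations."""
    blobs = feature.blob_log(image, max_sigma=max_sigma, threshold=threshold)
    return blobs[:, :2]

def repeatability(kps_a, kps_b, tol=2.0):
    """Fraction of points in A that have a point in B within `tol` pixels."""
    if len(kps_a) == 0 or len(kps_b) == 0:
        return 0.0
    d = np.linalg.norm(kps_a[:, None, :] - kps_b[None, :, :], axis=2)
    return float(np.mean(d.min(axis=1) <= tol))

img = data.camera() / 255.0                     # stand-in for an X-ray frame
noisy = np.clip(img + 0.02 * np.random.randn(*img.shape), 0, 1)
score = repeatability(log_keypoints(img), log_keypoints(noisy))
print(f"repeatability under noise: {score:.2f}")
```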

Chrysi Papalazarou, Peter M. J. Rongen, Peter H. N. de With
On the Evaluation of Segmentation Methods for Wildland Fire

This paper focuses on the study of fire color spaces and the evaluation of image segmentation methods commonly available in the literature on wildland and urban fires. The evaluation method, based on the determination of a segmentation quality index, is applied to three series of fire images obtained at the usual scales of validation of forest fire models (laboratory scale, fire tunnel scale and field scale). Depending on the considered scale, different methods reveal themselves as being the most appropriate. In this study we present the advantages and drawbacks of different segmentation algorithms and color spaces used in fire detection and characterization.

Steve Rudz, Khaled Chetehouna, Adel Hafiane, Olivier Sero-Guillaume, Hélène Laurent
2D Face Recognition in the IV2 Evaluation Campaign

In this paper, the first evaluation campaign on 2D-face images using the multimodal IV² database is presented. The five appearance-based algorithms in competition are evaluated on four experimental protocols, including experiments with challenging illumination and pose variabilities. The results confirm the advantages of Linear Discriminant Analysis (LDA) and the importance of the training set for the Principal Component Analysis (PCA) based approaches. The experiments show the robustness of the Gabor-based approach combined with LDA in coping with challenging face recognition conditions. This evaluation shows the interest and the richness of the IV² multimodal database.

Anouar Mellakh, Anis Chaari, Souhila Guerfi, Johan Dhose, Joseph Colineau, Sylvie Lelandais, Dijana Petrovska-Delacrètaz, Bernadette Dorizzi
Background Subtraction Techniques: Systematic Evaluation and Comparative Analysis

Moving object detection is a critical task for many computer vision applications: the objective is the classification of the pixels in the video sequence into either foreground or background. A commonly used technique to achieve this in scenes captured by a static camera is Background Subtraction (BGS). Several BGS techniques have been proposed in the literature, but a rigorous comparison that analyzes the different parameter configurations for each technique in different scenarios with precise ground-truth data is still lacking. To this end, we have implemented and evaluated the most relevant BGS techniques, and performed a quantitative and qualitative comparison between them.
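
For readers unfamiliar with the family of techniques being compared, the following minimal running-Gaussian background model is one classic BGS baseline; it is a sketch, not any of the specific implementations evaluated in the paper. Frames are assumed to be greyscale arrays from a static camera, and the learning rate and threshold factor are illustrative defaults.

```python
# Minimal running-Gaussian background subtraction sketch (a generic baseline).
import numpy as np

class RunningGaussianBGS:
    def __init__(self, alpha=0.01, k=2.5):
        self.alpha, self.k = alpha, k        # learning rate, threshold factor
        self.mean = None
        self.var = None

    def apply(self, frame):
        if self.mean is None:                # initialise the model from the first frame
            self.mean = frame.astype(np.float64)
            self.var = np.full_like(self.mean, 15.0 ** 2)
            return np.zeros(frame.shape, dtype=bool)
        diff = frame - self.mean
        foreground = diff ** 2 > (self.k ** 2) * self.var
        upd = ~foreground                    # update statistics only where the pixel looks like background
        self.mean[upd] += self.alpha * diff[upd]
        self.var[upd] += self.alpha * (diff[upd] ** 2 - self.var[upd])
        return foreground
```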

Sonsoles Herrero, Jesús Bescós

Fundamental Mathematical Techniques

Kolmogorov Superposition Theorem and Wavelet Decomposition for Image Compression

The Kolmogorov Superposition Theorem states that any multivariate function can be decomposed into two types of monovariate functions, called inner and external functions: each inner function is associated with one dimension and linearly combined to construct a hash-function that associates every point of a multidimensional space with a value of the real interval [0,1]. These intermediate values are then associated by external functions with the corresponding value of the multidimensional function. Thanks to the decomposition into monovariate functions, our goal is to apply this decomposition to images and obtain image compression.

We propose a new algorithm to decompose images into continuous monovariate functions, and propose a compression approach: thanks to the decomposition scheme, the quantity of information taken into account to define the monovariate functions can be adapted: only a fraction of the pixels of the original image have to be contained in the network used to build the correspondence between monovariate functions. To improve the reconstruction quality, we combine KST and multiresolution approach, where the low frequencies will be represented with the highest accuracy, and the high frequencies representation will benefit from the adaptive aspect of our method to achieve image compression.

Our main contribution is the proposition of a new compression scheme: we combine KST and multiresolution approach. Taking advantage of the KST decomposition scheme, the low frequencies will be represented with the highest accuracy, and the high frequencies representation will be replaced by a decomposition into simplified monovariate functions, preserving the reconstruction quality. We detail our approach and our results on different images and present the reconstruction quality as a function of the quantity of pixels contained in monovariate functions.

Pierre-Emmanuel Leni, Yohan D. Fougerolle, Frédéric Truchetet
Theorems Relating Polynomial Approximation, Orthogonality and Balancing Conditions for the Design of Nonseparable Bidimensional Multiwavelets

We relate different properties of nonseparable quincunx multiwavelet systems, such as polynomial approximation order, orthonormality and balancing, to conditions on the matrix filters. We give mathematical proofs for these relationships. The results obtained are necessary conditions on the filterbank. This simplifies the design of such systems.

Ana M. C. Ruedin
Mixtures of Normalized Linear Projections

High dimensional spaces pose a challenge to any classification task. In fact, these spaces contain much redundancy and it becomes crucial to reduce the dimensionality of the data to improve analysis, density modeling, and classification. In this paper, we present a method for dimensionality reduction in mixture models and its use in classification. For each component of the mixture, the data are projected by a linear transformation onto a lower-dimensional space. Subsequently, the projection matrices and the densities in such compressed spaces are learned by means of an Expectation Maximization (EM) algorithm. However, two main issues arise as a result of implementing this approach, namely: 1) the scale of the densities can be different across the mixture components and 2) a singularity problem may occur. We suggest solutions to these problems and validate the proposed method on three image data sets from the UCI Machine Learning Repository. The classification performance is compared with that of a mixture of probabilistic principal component analysers (MPPCA). Across the three data sets, our accuracy always compares favourably, with improvements ranging from 2.5% to 35.4%.

Ahmed Fawzi Otoom, Oscar Perez Concha, Hatice Gunes, Massimo Piccardi
A New Feasible Approach to Multi-dimensional Scale Saliency

In this paper, we present a multi-dimensional extension of an image feature extractor, the scale saliency algorithm by Kadir and Brady. In order to avoid the curse of dimensionality, our algorithm is based on a recent Shannon entropy estimator and on a new divergence metric in the spirit of Friedman and Rafsky's estimation of the Henze-Penrose divergence. The experiments show that, compared to our previous method based on entropic graphs, this approach remarkably decreases computation time while not significantly deteriorating the quality of the results.

Pablo Suau, Francisco Escolano
Attributed Graph Matching Using Local Descriptions

In the pattern recognition context, objects can be represented as graphs with attributed nodes and edges describing their relations. Consequently, matching attributed graphs plays an important role in object recognition. In this paper, node signature extraction is combined with an optimal assignment method for matching attributed graphs. In particular, we show how local descriptions are used to define a node-to-node cost in an assignment problem solved with the Hungarian method. Moreover, we propose a distance formula to compute the distance between attributed graphs. The experiments demonstrate that the newly presented algorithm is well-suited to pattern recognition applications. Compared with well-known methods, our algorithm gives good results for retrieving images.
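
The assignment step described above can be illustrated with SciPy's Hungarian solver; the Euclidean distance between random signature vectors below is a placeholder for the paper's local node descriptions, and `match_graphs` is a hypothetical helper, not the authors' code.

```python
# Sketch: node-to-node matching of two attributed graphs via the Hungarian method.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_graphs(signatures_a, signatures_b):
    """signatures_*: (n, d) arrays of per-node signature vectors."""
    cost = np.linalg.norm(signatures_a[:, None, :] - signatures_b[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)      # optimal node assignment
    distance = cost[rows, cols].sum()             # one simple graph-to-graph distance
    return list(zip(rows, cols)), distance

pairs, d = match_graphs(np.random.rand(4, 8), np.random.rand(5, 8))
print(pairs, round(d, 3))
```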

Salim Jouili, Ines Mili, Salvatore Tabbone
A Template Analysis Methodology to Improve the Efficiency of Fast Matching Algorithms

Several methods aimed at effectively speeding up the block matching and template matching tasks have been recently proposed. A class of these methods, referred to as exhaustive due to the fact that they optimally solve the minimization problem of the matching cost, often deploys a succession of bounding functions based on a partitioning of the template and subwindow to perform rapid and reliable detection of non-optimal candidates. In this paper we propose a study aimed at improving the efficiency of one of these methods, that is, a state-of-the-art template matching technique known as Incremental Dissimilarity Approximations (IDA). In particular, we outline a methodology to order the succession of bounding functions deployed by this technique based on the analysis of the template only. Experimental results prove that the proposed approach is able to achieve improved efficiency.

Federico Tombari, Stefano Mattoccia, Luigi Di Stefano, Fabio Regoli, Riccardo Viti
Enhanced Low-Resolution Pruning for Fast Full-Search Template Matching

Gharavi-Alkhansari [1] proposed a full-search equivalent algorithm for speeding up template matching based on Lp-norm distance measures. This algorithm performs a pruning of mismatching candidates based on multilevel pruning conditions, and it has been shown that, under certain assumptions on the distortion between the image and the template, it is faster than the other full-search equivalent algorithms proposed so far, including algorithms based on the Fast Fourier Transform. In this paper we propose an original contribution with respect to Gharavi-Alkhansari's work that is based on the exploitation of an initial estimation of the global minimum, aimed at increasing the efficiency of the pruning process.
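
The general idea of pruning mismatching candidates can be sketched as follows: the partial L2 cost of each candidate window is accumulated in chunks, and the candidate is abandoned as soon as it exceeds the best score found so far. This is a simplified stand-in for the multilevel pruning conditions and the initial global-minimum estimate discussed in the paper; chunk size and the brute-force scan order are arbitrary choices.

```python
# Sketch: exhaustive template matching with partial-cost pruning.
import numpy as np

def match_with_pruning(image, template):
    th, tw = template.shape
    best_score, best_pos = np.inf, None
    tvec = template.ravel().astype(np.float64)
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            win = image[y:y + th, x:x + tw].ravel().astype(np.float64)
            score, pruned = 0.0, False
            for start in range(0, tvec.size, 256):      # evaluate the cost in chunks
                d = win[start:start + 256] - tvec[start:start + 256]
                score += float(d @ d)
                if score >= best_score:                  # partial cost already too high
                    pruned = True
                    break
            if not pruned and score < best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score
```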

Stefano Mattoccia, Federico Tombari, Luigi Di Stefano
A Novel Approach to Geometric Fitting of Implicit Quadrics

This paper presents a novel approach for estimating the geometric distance from a given point to the corresponding implicit quadric curve/surface. The proposed estimation is based on the height of a tetrahedron, which is used as a coarse but reliable estimation of the real distance. The estimated distance is then used for finding the best set of quadric parameters, by means of the Levenberg-Marquardt algorithm, which is a common framework in other geometric fitting approaches. Comparisons of the proposed approach with previous ones are provided to show both improvements in CPU time as well as in the accuracy of the obtained results.
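
As a toy illustration of geometric fitting driven by Levenberg-Marquardt, the sketch below fits a circle (a simple implicit quadric) to noisy points, using the exact point-to-circle distance as the residual; the tetrahedron-height distance estimate that is the paper's actual contribution is not reproduced, and the data are synthetic.

```python
# Sketch: Levenberg-Marquardt geometric fit of a circle to noisy 2D points.
import numpy as np
from scipy.optimize import least_squares

def residuals(params, points):
    cx, cy, r = params
    return np.hypot(points[:, 0] - cx, points[:, 1] - cy) - r   # signed geometric distance

theta = np.linspace(0, 2 * np.pi, 60)
pts = np.c_[2 + 1.5 * np.cos(theta), -1 + 1.5 * np.sin(theta)]
pts += 0.05 * np.random.randn(*pts.shape)                       # noisy samples on a circle

fit = least_squares(residuals, x0=[0.0, 0.0, 1.0], args=(pts,), method='lm')
print("centre and radius:", fit.x)
```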

Mohammad Rouhani, Angel D. Sappa
Two-Level Bimodal Association for Audio-Visual Speech Recognition

This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, where cross-modal association is considered in two levels. First, the acoustic and the visual data streams are combined at the feature level by using the canonical correlation analysis, which deals with the problems of audio-visual synchronization and utilizing the cross-modal correlation. Second, information streams are integrated at the decision level for adaptive fusion of the streams according to the noise condition of the given speech datum. Experimental results demonstrate that the proposed method is effective for producing noise-robust recognition performance without a priori knowledge about the noise conditions of the speech data.
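
A minimal sketch of the feature-level step, assuming synthetic audio and visual feature matrices in place of real acoustic and lip-region streams: canonical correlation analysis projects the two modalities into a shared correlated subspace and the projections are concatenated for a downstream recogniser. The decision-level adaptive fusion of the paper is not shown; dimensions and component counts are arbitrary.

```python
# Sketch: feature-level audio-visual fusion via canonical correlation analysis.
import numpy as np
from sklearn.cross_decomposition import CCA

audio = np.random.randn(500, 39)        # stand-in for per-frame acoustic features
visual = np.random.randn(500, 30)       # stand-in for per-frame visual features

cca = CCA(n_components=10)
audio_c, visual_c = cca.fit_transform(audio, visual)
fused = np.hstack([audio_c, visual_c])  # fused representation fed to the recogniser
print(fused.shape)                      # (500, 20)
```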

Jong-Seok Lee, Touradj Ebrahimi
Level Set-Based Fast Multi-phase Graph Partitioning Active Contours Using Constant Memory

We present multi-phase FastGPAC, which extends our dramatic improvement of memory requirements and computational complexity on two-class GPAC into multi-class image segmentation. Graph partitioning active contours (GPAC) is a recently introduced approach that elegantly embeds the graph-based image segmentation problem within a continuous level set-based active contour paradigm. However, GPAC, similar to many other graph-based approaches, has quadratic memory requirements. For example, a 1024x1024 grayscale image requires over one terabyte of working memory. Approximations of GPAC reduce this complexity by trading off accuracy. Our FastGPAC approach implements an exact GPAC segmentation using a constant memory requirement of a few kilobytes and enables the use of GPAC on high-throughput and high-resolution images. The extension to multi-phase enables segmentation of multiple regions of interest with different appearances. We have successfully applied FastGPAC to different types of images, particularly biomedical images of different modalities. Experiments on various image types (natural, biomedical, etc.) show promising segmentation results with substantially reduced computational requirements.

Filiz Bunyak, Kannappan Palaniappan

Image Processing, Coding and Filtering

Image Quality Assessment Based on Edge-Region Information and Distorted Pixel for JPEG and JPEG2000

The main objective of image quality assessment metrics is to provide an automatic and efficient system to evaluate visual quality. It is imperative that these measures exhibit good correlation with perception by the human visual system (HVS). This paper proposes a new algorithm for image quality assessment, which supplies more flexibility than previous methods in using the distorted pixels in the assessment. First, the distorted and original images are divided into blocks of 11×11 pixels; then the distorted pixels are identified, and visual regions of interest and edge information are computed, from which the global error is derived. Experimental comparisons demonstrate the effectiveness of the proposed method.

Zianou Ahmed Seghir, Fella Hachouf
Fast Multi Frames Selection Algorithm Based on Macroblock Reference Map for H.264/AVC

Variable block size motion estimation (ME) and compensation (MC) using multiple reference frames is adopted in H.264/AVC to improve coding efficiency. However, the computational complexity of ME/MC increases proportionally to the number of reference frames. In this paper, we propose a new efficient reference frame selection algorithm to reduce this complexity. The proposed algorithm selects suitable reference frames by exploiting the spatial and temporal correlation of the video sequence. The experimental results show that the proposed algorithm decreases video encoding time while maintaining similar visual quality and bit rates.

Kyung-Hee Lee, Jae-Won Suh
Highlight Removal from Single Image

A method for removing highlights from a single image without knowledge of the illuminant is presented. The method is based on Principal Component Analysis (PCA), histogram equalization and a second-order polynomial transformation. The proposed method does not require color segmentation or normalization of the image by the illuminant. The method has been tested on different types of images: images with or without texture, and images taken in different, unknown lighting environments. The results show the feasibility of the method. Implementation of the method is straightforward and computationally fast.

Pesal Koirala, Markku Hauta-Kasari, Jussi Parkkinen
Parameter Estimation in Bayesian Super-Resolution Image Reconstruction from Low Resolution Rotated and Translated Images

This paper deals with the problem of high-resolution (HR) image reconstruction from a set of degraded, under-sampled, shifted and rotated images, utilizing the variational approximation within the Bayesian paradigm. The proposed inference procedure requires the calculation of the covariance matrix of the HR image given the LR observations and the unknown hyperparameters of the probabilistic model. Unfortunately, the size and complexity of such a matrix render its calculation impossible, and we propose and compare three alternative approximations. The estimated HR images are compared with images provided by other HR reconstruction methods.

Salvador Villena, Miguel Vega, Rafael Molina, Aggelos K. Katsaggelos
A New Approach to Sparse Image Representation Using MMV and K-SVD

This paper addresses the problem of image representation based on a sparse decomposition over a learned dictionary. We propose an improved matching pursuit algorithm for Multiple Measurement Vectors (MMV) and an adaptive algorithm for dictionary learning based on multi-Singular Value Decomposition (SVD), and combine them for image representation. Compared with the traditional K-SVD and orthogonal matching pursuit MMV (OMPMMV) methods, the proposed method runs faster and achieves a higher overall reconstruction accuracy.

Jie Yang, Abdesselam Bouzerdoum, Son Lam Phung
3D Filtering of Colour Video Sequences Using Fuzzy Logic and Vector Order Statistics

The novel approach designed in this paper permits the suppression of impulsive noise in multichannel video sequences. It employs fuzzy logic and vector order statistic methods to detect motion and noise during the spatio-temporal processing of neighbouring video frames, preserving edges, fine details and colour properties. Numerous simulation results have justified its excellent performance in terms of objective criteria, namely Peak Signal-to-Noise Ratio (PSNR), Mean Absolute Error (MAE) and Normalized Colour Difference (NCD), as well as in subjective perception by human viewers.

Volodymyr Ponomaryov, Alberto Rosales-Silva, Francisco Gallegos-Funes
A Performance Comparison of De-convolution Algorithms on Transmission Terahertz Images

Terahertz imaging has found applications in many fields. To explore these applications, we have built a coherent transmission terahertz imaging system operating at 186 GHz. De-convolution algorithms were tested for improving the image resolution beyond the diffraction limit of the imaging system. Tested algorithms include the Wiener, Tikhonov, and Richardson-Lucy algorithms. Their performances are discussed and compared in this paper. Experimental results have demonstrated that coherent de-convolution algorithms are capable of improving the resolution of images formed with this imaging system.
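
Two of the compared algorithms are standard enough to sketch directly; the frequency-domain Wiener filter and the multiplicative Richardson-Lucy iteration below are generic textbook forms with assumed parameters, not the implementations or the measured point spread function of the 186 GHz system.

```python
# Sketch: generic Wiener and Richardson-Lucy deconvolution.
import numpy as np
from scipy.signal import fftconvolve

def wiener_deconvolve(blurred, psf, nsr=0.01):
    """Frequency-domain Wiener filter with a constant noise-to-signal ratio.
    Note: the output is circularly shifted by the PSF origin (fine for a sketch)."""
    H = np.fft.fft2(psf, s=blurred.shape)
    G = np.fft.fft2(blurred)
    F = np.conj(H) / (np.abs(H) ** 2 + nsr) * G
    return np.real(np.fft.ifft2(F))

def richardson_lucy(blurred, psf, n_iter=30):
    """Multiplicative RL iterations; psf is assumed non-negative and normalised to 1."""
    blurred = blurred.astype(np.float64)
    estimate = np.full_like(blurred, blurred.mean())
    psf_mirror = psf[::-1, ::-1]
    for _ in range(n_iter):
        conv = fftconvolve(estimate, psf, mode='same') + 1e-12
        estimate *= fftconvolve(blurred / conv, psf_mirror, mode='same')
    return estimate
```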

Yue Li, Li Li, Juan Tello, Dan Popescu, Andrew Hellicar
Content-Based Annotation of User Generated Videos on a Mobile Platform

This paper focuses on the problem of annotating outdoor user generated low bit-rate videos. An effective algorithm for estimating the percentage of artificial (man-made) and natural content in a video frame is presented. The algorithm is based on edge information from the luminance frame only and is intended to be executed on a mobile platform in an acceptable time frame. The experiments, performed on a large set of user generated videos, show good agreement with human perception of artificial and natural content.

Hristina Pavlovska, Tomislav Kartalov, Zoran Ivanovski
Dynamic Texture Extraction and Video Denoising

According to recent works introduced by Y. Meyer [1], decomposition models based on Total Variation (TV) appear to be a very good way to extract texture from image sequences. Indeed, videos show characteristic variations along the temporal dimension which can be captured in the decomposition framework. However, very few works in the literature deal with spatio-temporal decompositions. Thus, we devote this paper to a spatio-temporal extension of the spatial color decomposition model. We provide a relevant method to accurately capture Dynamic Textures (DT) present in videos. Moreover, we obtain the spatio-temporally regularized part (the geometrical component) and distinctly separate the highly oscillatory variations (the noise). Furthermore, we present some elements of comparison between several models for denoising purposes.

Mathieu Lugiez, Michel Ménard, Abdallah El-Hamidi

Image and Video Analysis

Unsupervised Detection of Gradual Video Shot Changes with Motion-Based False Alarm Removal

The temporal segmentation of a video into shots is a fundamental prerequisite for video retrieval. There are two types of shot boundaries: abrupt shot changes (“cuts”) and gradual transitions. Several high-quality algorithms have been proposed for detecting cuts, but the successful detection of gradual transitions remains a surprisingly difficult problem in practice. In this paper, we present an unsupervised approach for detecting gradual transitions. It has several advantages. First, in contrast to alternative approaches, no training stage and hence no training data are required. Second, no thresholds are needed, since the used clustering approach separates classes of gradual transitions and non-transitions automatically and adaptively for each video. Third, it is a generic approach that does not employ a specialized detector for each transition type. Finally, the issue of removing false alarms caused by camera motion is addressed: in contrast to related approaches, it is not only based on low-level features, but on the results of an appropriate algorithm for camera motion estimation. Experimental results show that the proposed approach achieves very good performance on TRECVID shot boundary test data.
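
The threshold-free flavour of such an approach can be illustrated with a toy two-cluster separation of per-frame histogram dissimilarities; the choice of histogram distance and k-means here is an assumption for illustration, and the motion-based false-alarm removal stage is omitted.

```python
# Sketch: adaptive, threshold-free separation of transition candidates per video.
import numpy as np
from sklearn.cluster import KMeans

def transition_candidates(frames, bins=32):
    """frames: list of greyscale uint8 arrays; returns indices of likely transitions."""
    hists = [np.histogram(f, bins=bins, range=(0, 256))[0].astype(float) for f in frames]
    hists = [h / h.sum() for h in hists]
    diffs = np.array([np.abs(hists[i + 1] - hists[i]).sum() for i in range(len(hists) - 1)])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(diffs.reshape(-1, 1))
    transition_label = labels[np.argmax(diffs)]     # the cluster containing the largest jump
    return np.where(labels == transition_label)[0]
```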

Ralph Ewerth, Bernd Freisleben
VISRET – A Content Based Annotation, Retrieval and Visualization Toolchain

This paper presents a system for content-based video retrieval, with a complete toolchain for annotation, indexing, retrieval and visualization of imported data. The system contains around 20 feature descriptors, a modular infrastructure for descriptor addition and indexing, a web-based search interface and an easy-to-use query-annotation-result visualization module. The features that distinguish this system from others are the support of all the steps of the retrieval chain, the modular support for standard MPEG-7 and custom descriptors, and the easy-to-use tools for query formulation and retrieval visualization. The intended use cases of the system are content- and annotation-based retrieval applications, ranging from community video portals to indexing of image, video, judicial, and other multimedia databases.

Levente Kovács, Ákos Utasi, Tamás Szirányi
Combination of Attributes in Stereovision Matching for Fish-Eye Lenses in Forest Analysis

This paper describes a novel stereovision matching approach that combines several attributes at the pixel level for omni-directional images obtained with fish-eye lenses in forest environments. The goal is to obtain a disparity map as a previous step for determining distances to the trees and then the volume of wood in the imaged area. The interest is focused on the trunks of the trees. Because of the irregular distribution of the trunks, the most suitable features are the pixels. A set of six attributes is used for establishing the matching between the pixels in both images of the stereo pair. The final decision about the matched pixels is taken by combining the attributes. Two combination strategies are proposed: the Sugeno Fuzzy Integral and the Dempster-Shafer theory. These combination strategies, applied to our specific stereovision matching problem, constitute the main finding of the paper. In both, the combination is based on the application of three well known matching constraints. The proposed approaches are compared with each other and compare favourably against the use of simple features.

P. Javier Herrera, Gonzalo Pajares, María Guijarro, J. Jaime Ruz, Jesús M. De la Cruz
Image Categorization Using ESFS: A New Embedded Feature Selection Method Based on SFS

Feature subset selection is an important subject when training classifiers in Machine Learning (ML) problems. Too many input features in a ML problem may lead to the so-called “curse of dimensionality”, which describes the fact that the complexity of classifier parameter adjustment during training increases exponentially with the number of features. Thus, ML algorithms are known to suffer from a significant decrease in prediction accuracy when faced with many features that are not necessary. In this paper, we introduce a novel embedded feature selection method, called ESFS, which is inspired by the wrapper method SFS since it relies on the simple principle of incrementally adding the most relevant features. Its originality concerns the use of mass functions from evidence theory, which allows the information carried by features to be merged elegantly, in an embedded way, leading to a lower computational cost than the original SFS. This approach has successfully been applied to the domain of image categorization and has shown its effectiveness through comparison with other feature selection methods.

Huanzhang Fu, Zhongzhe Xiao, Emmanuel Dellandréa, Weibei Dou, Liming Chen
Pattern Analysis for an Automatic and Low-Cost 3D Face Acquisition Technique

This paper proposes an automatic 3D face modeling and localizing technique based on active stereovision. In the offline stage, the optical and geometrical parameters of the stereosensor are estimated. In the online acquisition stage, alternate complementary patterns are successively projected. The captured right and left images are separately analyzed in order to localize left and right primitives with sub-pixel precision. This analysis also provides us with an efficient segmentation of the informative facial region. Epipolar geometry transforms the stereo matching problem into a one-dimensional search problem. Indeed, we apply an adapted, optimized dynamic programming algorithm to pairs of primitives already located on each epiline. 3D geometry is retrieved by computing the intersection of optical rays coming from the pairs of matched features. A pipeline of geometric modeling techniques is applied to densify the obtained 3D point cloud, and to mesh and texturize the final 3D face model. An appropriate evaluation strategy is proposed and experimental results are provided.

Karima Ouji, Mohsen Ardabilian, Liming Chen, Faouzi Ghorbel
Bayesian Pressure Snake for Weld Defect Detection

Image segmentation plays a key role in automatic weld defect detection and classification in radiographic testing. Among the segmentation methods, boundary extraction based on deformable models is a powerful technique to describe the shape and then deduce, after the analysis stage, the type of the defect under investigation. This paper describes a method for automatic estimation of the contours of weld defects in radiographic images. The method uses a statistical formulation of contour estimation by exploiting a statistical pressure snake based on non-parametric modeling of the image. Here the edge energy is replaced by a region energy which is a function of the statistical characteristics of the area of interest.

Aicha Baya Goumeidane, Mohammed Khamadja, Nafaa Naceredine
Parallel Blob Extraction Using the Multi-core Cell Processor

The rapid increase in pixel density and frame rates of modern imaging sensors is accelerating the demand for fine-grained and embedded parallelization strategies to achieve real-time implementations for video analysis. The IBM Cell Broadband Engine (BE) processor has an appealing multi-core chip architecture with multiple programming models suitable for accelerating multimedia and vector processing applications. This paper describes two parallel algorithms for blob extraction in video sequences: binary morphological operations and connected components labeling (CCL), both optimized for the Cell-BE processor. Novel parallelization and explicit instruction level optimization techniques are described for fully exploiting the computational capacity of the Synergistic Processing Elements (SPEs) on the Cell processor. Experimental results show significant speedups ranging from a factor of nearly 300 for binary morphology to a factor of 8 for CCL in comparison to equivalent sequential implementations applied to High Definition (HD) video.

Praveen Kumar, Kannappan Palaniappan, Ankush Mittal, Guna Seetharaman
Quality Fusion Rule for Face Recognition in Video

Face recognition in video is confronted with many problems: varying illumination, pose and expression. Their compensation algorithms may produce much noise and make the face appear abnormal, which degrades the face image quality. In this paper, motivated by the human cognitive process, a quality fusion rule is designed to reduce the influence of compensated face image quality on recognition performance. Combined with video features and the recognition contribution degrees of the compensated face images, the rule fuses the recognition results of every face video frame to select the best result. In this paper, the quality fusion rule for illumination compensation is mainly involved. In the experiment, the proposed quality fusion rule is evaluated on a face video database with varied illumination. In contrast to other state-of-the-art methods, the novel approach has better recognition performance.

Chao Wang, Yongping Li, Xinyu Ao

Computer Vision

Decorrelation and Distinctiveness Provide with Human-Like Saliency

In this work, we show the capability of a new model of saliency to reproduce remarkable psychophysical results. The model presents low computational complexity compared to other models of the state of the art. It is based on biologically plausible mechanisms: the decorrelation and the distinctiveness of local responses. Decorrelation of scales is obtained from principal component analysis of multiscale low-level features. Distinctiveness is measured through Hotelling's T² statistic. The model is conceived to be used in a machine vision system, in which attention would contribute to enhance performance together with other visual functions. Experiments demonstrate consistency with a wide variety of psychophysical phenomena referenced in the visual attention modeling literature, with results that outperform other state-of-the-art models.
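
A compact sketch of the two named ingredients, under an assumed feature set (Gaussian-blurred copies of the image at several scales standing in for the low-level multiscale features): PCA decorrelates and whitens the per-pixel responses, and a Hotelling T²-style statistic scores how far each pixel lies from the mean response. This is a schematic of the mechanism, not the authors' model.

```python
# Sketch: per-pixel distinctiveness from PCA-whitened multiscale responses.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.decomposition import PCA

def t2_saliency(image, sigmas=(1, 2, 4, 8)):
    feats = np.stack([gaussian_filter(image.astype(float), s) for s in sigmas], axis=-1)
    X = feats.reshape(-1, len(sigmas))
    Z = PCA(whiten=True).fit_transform(X)   # decorrelated, unit-variance components
    t2 = (Z ** 2).sum(axis=1)               # Hotelling T^2-style distance from the mean
    return t2.reshape(image.shape)
```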

Antón Garcia-Diaz, Xosé R. Fdez-Vidal, Xosé M. Pardo, Raquel Dosil
Intelligent Vision: A First Step – Real Time Stereovision

We describe a real time stereo vision system capable of processing high resolution (1Mpixel or more) images at 30 fps with disparity ranges of 100 pixels or more. This system has a fast rectification module associated with each camera which uses a look up table approach to remove lens distortion and correct camera misalignment in a single step. The corrected, aligned images are passed through a module which generates disparity and occlusion maps with a latency of two camera scan line intervals. This module implements a version of the Symmetric Dynamic Programming Stereo (SDPS) algorithm which has a small, compact hardware realization, permitting many copies to be instantiated to accommodate large disparity ranges. Snapshots from videos taken in our laboratory demonstrate that the system can produce effective depth maps in real time. The occlusion maps that the SDPS algorithm produces clearly outline distinct objects in scenes and present a powerful tool for segmenting scenes rapidly into objects of interest.

John Morris, Khurram Jawed, Georgy Gimel’farb
Engineering of Computer Vision Algorithms Using Evolutionary Algorithms

Computer vision algorithms are currently developed by looking up the available operators from the literature and then arranging those operators such that the desired task is performed. This is often a tedious process which also involves testing the algorithm with different lighting conditions or at different sites. We have developed a system for the automatic generation of computer vision algorithms at interactive frame rates using GPU accelerated image processing. The user simply tells the system which object should be detected in an image sequence. Simulated evolution, in particular Genetic Programming, is used to automatically generate and test alternative computer vision algorithms. Only the best algorithms survive and eventually provide a solution to the user’s image processing task.

Marc Ebner
Shape Recognition by Voting on Fast Marching Iterations

In this study, we present an integrated Fast Marching (FM) and shape description methodology that is capable of both extracting object boundaries and recognizing shapes. A new speed formula is proposed, and the local front stopping algorithm in [1] is enhanced to freeze the active contour near real object boundaries. GBSD [2] is utilized as the shape descriptor on the evolving contour. The shape description process starts when a certain portion of the contour is stopped and continues with the FM iterations. The shape description at each iteration is treated as a different source of shape information, and these sources are fused to get better recognition results. This approach removes the limitation of traditional recognition systems that have only one chance for shape classification. Test results shown in this study prove that the voted decision result among these iterated contours outperforms ordinary individual shape recognizers.

Abdulkerim Capar, Muhittin Gokmen
Unusual Activity Recognition in Noisy Environments

In this paper we present a method for unusual activity recognition that is used in home environment monitoring. Monitoring systems are needed in elderly persons' homes to generate automatic alarms in case of emergency. The unusual activity recognition method presented here is based on a body part segmentation algorithm that gives an estimate of how similar the current pose is to the poses in the training data. As there is an arbitrary number of possible unusual activities, it is impossible to train a system to recognize every unusual activity. We train our system to recognize a set of normal poses and consider everything else unusual. Normal activities in our case are walking and sitting down.

Matti Matilainen, Mark Barnard, Olli Silvén
Real-Time Center Detection of an OLED Structure

The research presented in this paper focuses on real-time image processing for visual servoing, i.e. the positioning of an x-y table by using a camera only instead of encoders. A camera image stream plus real-time image processing determines the position in the next iteration of the table controller. With a frame rate of 1000 fps, a maximum processing time of only 1 millisecond is allowed for each image of 80x80 pixels. This visual servoing task is performed on an OLED (Organic Light Emitting Diode) substrate that can be found in displays, with a typical size of 100 by 200 μm. The presented algorithm detects the center of an OLED well with sub-pixel accuracy (1 pixel equals 4 μm, sub-pixel accuracy reliable up to ±1 μm) and a computation time of less than 1 millisecond.
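
Sub-pixel centre localisation itself can be sketched with an intensity-weighted centroid after a crude threshold; the real-time constraints, the OLED-specific well model and the 1 ms budget of the paper are out of scope of this illustration, and the thresholding rule is an arbitrary assumption.

```python
# Sketch: sub-pixel centre of a bright structure via intensity-weighted centroid.
import numpy as np

def subpixel_center(image):
    """Return the (row, col) centre of the bright structure with sub-pixel precision."""
    img = image.astype(np.float64)
    mask = img > img.mean() + 2 * img.std()          # crude foreground selection
    weights = np.where(mask, img, 0.0)
    total = weights.sum()
    if total == 0:
        return None
    rows, cols = np.indices(img.shape)
    return (rows * weights).sum() / total, (cols * weights).sum() / total
```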

Roel Pieters, Pieter Jonker, Henk Nijmeijer
Comparing Feature Matching for Object Categorization in Video Surveillance

In this paper we consider an object categorization system using local HMAX features. Two feature matching techniques are compared: the MAX technique, originally proposed in the HMAX framework, and the histogram technique originating from the Bag-of-Words literature. We have found that each of these techniques has its own field of operation. The histogram technique clearly outperforms the MAX technique by 5–15% for small dictionaries up to 500–1,000 features, favoring this technique for embedded (surveillance) applications. Additionally, we have evaluated the influence of interest point operators in the system. A first experiment analyzes the effect of dictionary creation and has shown that random dictionaries outperform dictionaries created from Hessian-Laplace points. Secondly, the effect of operators in the dictionary matching stage has been evaluated. Processing all image points outperforms the point selection from the Hessian-Laplace operator.

Rob G. J. Wijnhoven, Peter H. N. de With
Self Organizing and Fuzzy Modelling for Parked Vehicles Detection

Our aim is to distinguish moving and stopped objects in digital image sequences taken from stationary cameras by a model-based approach. A self-organizing model is adopted both for the scene background and for the scene foreground, which can handle scenes containing moving backgrounds or gradual illumination variations, helping to distinguish between moving and stopped foreground regions. The model is enriched by spatial coherence to enhance robustness against false detections, and by fuzzy modelling to deal with decision problems that typically arise when crisp settings are involved. We show through experimental results and comparisons that good accuracy values can be reached for color video sequences representing typical situations critical for vehicles stopped in no-parking areas.

Lucia Maddalena, Alfredo Petrosino
Rapid Detection of Many Object Instances

We describe an algorithm capable of detecting multiple object instances within a scene in the presence of changes in object viewpoint. Our approach consists of first calculating frequency vectors for discrete feature vector clusters (visual words) within a sliding window as a representation of the image patch. We then classify each patch using an AdaBoost classifier whose weak classifier simply applies a threshold to one visual word’s frequency within the patch. Compared to previous work, our algorithm is simpler yet performs remarkably well on scenes containing many object instances. The method requires relatively few training examples and consumes 2.2 seconds on commodity hardware to process an image of size 640×480. In a test on a challenging car detection problem using a relatively small training set, our implementation dramatically outperforms the detection performance of a standard AdaBoost cascade using Haar-like features.

Suwan Tongphu, Naddao Thongsak, Matthew N. Dailey
Concealed Object Perception and Recognition Using a Photometric Stereo Strategy

Following a review of current hidden object detection techniques in a range of security applications, a strategy based on an innovative, low-cost photometric stereo technique is proposed to reveal concealed objects. By taking advantage of information rapidly acquired under different illumination conditions, various enhanced real-time images can be produced, free from the confusion of textured camouflage. The extracted surface normals can be used for the calculation of curvature and flatness attributes, providing clues for subsequent hidden object detection and recognition tasks. Experiments on both simulated and real data have verified that the strategy is useful for stealthy object detection and may provide another modality of data for current monitoring systems. The results demonstrate good potential for application to the detection of concealed objects in security and military settings through the deployment of image enhancement and augmented reality devices.

Jiuai Sun, Melvyn Smith, Abdul Farooq, Lyndon Smith

Tracking

Tracking 3D Orientation through Corresponding Conics

We propose here a new method to recover the 3D orientation of a rigid body by matching corresponding conics embedded in the object itself. The method is based on writing the projective equations of the conics and rearranging them in a suitable way. This leads to a very simple linear system. Results from simulated experiments show good accuracy and suggest that this method could be used for instance in augmented reality surgery to effectively track surgery instruments inside the operating room.

Alberto Alzati, Marina Bertolini, N. Alberto Borghese, Cristina Turrini
Parallel Region-Based Level Set Method with Displacement Correction for Tracking a Single Moving Object

We propose a parallel level set method with displacement correction (DC) to solve collision problems when tracking a single moving object. The major collision scenarios are that the target cell collides with other cells, air bubbles, or a wall of the water pool in which the cells swim. These collisions result in the detected contour of the target spreading onto the other obstacles, which causes the target to be missed and tracking to fail. To overcome this problem, we add displacement correction to the boundary detection procedure once a collision occurs. The intensity summation inside the detected contour is utilized to determine whether a collision occurs. After the collision is detected, we translate the current level set function according to the displacement information of the target cell. To demonstrate the ability of our proposed method, we perform cell (paramecium) tracking with visual feedback control to keep the target cell at the center of the field of view under a microscope. To reduce computational time, we implement our proposed method on a column parallel vision (CPV) system. We experimentally show that the combination of our proposed method and the CPV system can detect the boundary of the target cell within about 2 ms for each frame and robustly track the cell even when collisions occur.

Xianfeng Fei, Yasunobu Igarashi, Koichi Hashimoto
Lane Detection and Tracking Using a Layered Approach

A new night-time lane detection system that extends the idea of a Layered Approach [1] is presented in this document. The extension includes the incorporation of (1) Inverse Perspective Mapping (IPM) to generate a bird’s-eye view of the road surface, (2) application of Random Sample Consensus (RANSAC) to rid outliers from the data, and (3) Kalman filtering to smooth the output of the lane tracker. Videos of driving scenarios on local city roads and highways were used to test the new system. Quantitative analysis shows higher accuracy in detecting lane markers in comparison to other approaches.
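
Of the three listed components, the RANSAC stage is the easiest to sketch; the function below fits a single line to candidate lane-marker pixels in an already-warped bird's-eye view. The IPM warp and the Kalman smoothing are omitted, and all parameters are illustrative assumptions rather than the paper's settings.

```python
# Sketch: RANSAC line fitting for lane-marker pixels in the bird's-eye view.
import numpy as np

def ransac_line(points, n_iter=200, tol=2.0, seed=0):
    """points: (N, 2) array of (x, y) candidate lane-marker pixels."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        norm = np.hypot(*d)
        if norm < 1e-9:
            continue
        # perpendicular distance of every point to the line through p and q
        dist = np.abs(d[0] * (points[:, 1] - p[1]) - d[1] * (points[:, 0] - p[0])) / norm
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # least-squares refit on the inliers: y = a*x + b (assumes a non-vertical line)
    x, y = points[best_inliers, 0], points[best_inliers, 1]
    a, b = np.polyfit(x, y, 1)
    return a, b, best_inliers
```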

Amol Borkar, Monson Hayes, Mark T. Smith
Temporal Templates for Detecting the Trajectories of Moving Vehicles

In this study, we deal with the problem of detecting the trajectories of moving vehicles. We introduce a method, based on spatio-temporal connectivity analysis, to extract the vehicle trajectories from temporal templates spanning a short period of time. The temporal templates are formed from successive image differences. The trajectories are computed using the centers of the blobs in the temporal template. A Kalman filter for a constant value, with emphasis on the measurement uncertainty, is used to smooth the result. The algorithm is tested extensively using a sequence taken from a tower overlooking a vehicular intersection. Our approach allows us to detect the vehicle trajectories without the need to construct a background model or use a sophisticated tracking strategy for the moving objects. Our experiments show that the scheme we propose is reliable and fast.
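
A minimal sketch of the temporal-template idea, assuming a short window of greyscale frames: successive absolute differences are accumulated into a binary template and blob centres are read off it. The spatio-temporal connectivity analysis and the constant-value Kalman filter of the paper are not reproduced, and the thresholds are arbitrary.

```python
# Sketch: temporal template from frame differences, with blob centroids.
import numpy as np
from scipy import ndimage

def temporal_template_centroids(frames, diff_thresh=25, min_area=50):
    """frames: list of greyscale uint8 arrays covering a short time window."""
    template = np.zeros(frames[0].shape, dtype=bool)
    for prev, curr in zip(frames[:-1], frames[1:]):
        template |= np.abs(curr.astype(int) - prev.astype(int)) > diff_thresh
    labels, n = ndimage.label(template)
    centroids = ndimage.center_of_mass(template, labels, range(1, n + 1))
    sizes = ndimage.sum(template, labels, range(1, n + 1))
    return [c for c, s in zip(centroids, sizes) if s >= min_area]
```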

Hugo Jiménez, Joaquín Salas
Robust Detection and Tracking of Moving Objects in Traffic Video Surveillance

Building an efficient and robust system capable of working in harsh real world conditions represents the ultimate goal of the traffic video surveillance. Despite an evident progress made in the area of statistical background modeling over the last decade or so, moving object detection is still one of the toughest problems in video surveillance, and new approaches are still emerging. Based on our published method for motion detection in the wavelet domain, we propose a novel, wavelet-based method for robust feature extraction and tracking. Hereby, a more efficient approach is proposed that relies on a non-decimated wavelet transformation to achieve both motion segmentation and selection of features for tracking. The use of wavelet transformation for selection of robust features for tracking stems from the persistence of actual edges and corners across the scales of the wavelet transformation. Moreover, the output of the motion detector is used to limit the search space of the feature tracker to those areas where moving objects are found. The results demonstrate a stable and efficient performance of the proposed approach in the domain of traffic video surveillance.

Borislav Antić, Jorge Oswaldo Niño Castaneda, Dubravko Ćulibrk, Aleksandra Pižurica, Vladimir Crnojević, Wilfried Philips
Vehicle Tracking Using Geometric Features

Applications such as traffic surveillance require a real-time and accurate method for object tracking. We propose to represent scene observations with parabola segments, using an algorithm that fits parabola segments to edge pixels in real time. The motion vectors for these parabola segments are obtained in consecutive frames by a matching technique based on distance and intensity. Furthermore, moving rigid objects are detected by an original method that clusters comparable motion vectors. The result is a robust detection and tracking method, which can cope with small changes in viewpoint on the moving rigid object.

Francis Deboeverie, Kristof Teelen, Peter Veelaert, Wilfried Philips
Object Tracking by Non-overlapping Distributed Camera Network

People tracking is a problem of great interest for wide-area video surveillance systems. In these large areas, it is not possible for a single camera to observe the complete area of interest. Surveillance system architectures require algorithms with the ability to track objects while observing them through multiple cameras. We focus our work on multi-camera tracking with non-overlapping fields of view (FOV). In particular, we propose a multi-camera architecture for wide-area surveillance and a real-time people tracking algorithm across non-overlapping cameras. In this scenario it is necessary to track objects both within a camera's FOV and across cameras; we consider both problems in this paper. In particular, we have investigated different techniques to evaluate intra-camera and inter-camera tracking based on color histograms. For intra-camera tracking we have proposed different methodologies to extract the color histogram information from each object patch. For inter-camera tracking we have compared different methods to estimate the colour Brightness Transfer Function (BTF) between non-overlapping cameras. These approaches are based on color histogram mapping between pairs of images of the same object in different FOVs. We have therefore combined the different histogram methodologies in order to assess the performance of the different colour BTFs. Preliminary results demonstrate that the proposed method combined with the BTF improves the matching rate between different cameras.

Pier Luigi Mazzeo, Paolo Spagnolo, Tiziana D’Orazio
Relational Dynamic Bayesian Networks to Improve Multi-target Tracking

Tracking relations between moving objects is a big challenge for Computer Vision research. Relations can be useful to better understand the behaviors of the targets, and the prediction of trajectories can become more accurate. Moreover, they can be useful in a variety of situations like monitoring terrorist activities, anomaly detection, sport coaching, etc.

In this paper we propose a model based on Relational Dynamic Bayesian Networks (RDBNs) that uses first-order logic to model particular correlations between object behaviors, and we show that the prediction performance increases significantly. In our experiments we consider the problem of multi-target tracking on a highway, where the behavior of targets is often correlated to the behavior of the targets near them. We compare the performance of a Particle Filter that does not take into account relations between objects with the performance of a Particle Filter that makes inference over the proposed RDBN.

We show that our method can follow the targets' paths more closely than the standard methods, being able to better predict their behaviors while decreasing the complexity of the tracker task.

Cristina Manfredotti, Enza Messina
Multiple Human Tracking in High-Density Crowds

In this paper, we present a fully automatic approach to multiple human detection and tracking in high density crowds in the presence of extreme occlusion. Human detection and tracking in high density crowds is an unsolved problem. Standard preprocessing techniques such as background modeling fail when most of the scene is in motion. We integrate human detection and tracking into a single framework, and introduce a confirmation-by-classification method to estimate confidence in a tracked trajectory, track humans through occlusions, and eliminate false positive detections. We use a Viola and Jones AdaBoost cascade classifier for detection, a particle filter for tracking, and color histograms for appearance modeling. An experimental evaluation shows that our approach is capable of tracking humans in high density crowds despite occlusions.
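
A rough sketch of the detection and appearance pieces, using OpenCV's stock frontal-face Haar cascade as a stand-in for the detector actually trained for crowded scenes, plus an HSV colour histogram per detection; the particle filter and the confirmation-by-classification step are omitted, and all parameters are illustrative.

```python
# Sketch: Haar-cascade detections with per-detection colour-histogram appearance models.
import cv2
import numpy as np

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_and_describe(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
    detections = []
    for (x, y, w, h) in boxes:
        patch = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([patch], [0, 1], None, [16, 16], [0, 180, 0, 256])
        cv2.normalize(hist, hist, 1.0, 0.0, cv2.NORM_L1)
        detections.append(((x, y, w, h), hist))   # bounding box plus appearance model
    return detections
```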

Irshad Ali, Matthew N. Dailey
3D Face Alignment via Cascade 2D Shape Alignment and Constrained Structure from Motion

In this paper, we consider fitting a 3D wireframe face model to continuous video sequences for the task of simultaneously tracking rigid head motion and non-rigid facial animation. We propose a two-level integrated model for accurate 3D face alignment. At the low level, the 2D shape is accurately extracted via a regularized shape model that relies on cascaded parameter/constraint prediction and optimization. At the high level, the points already accurately aligned at the low level are used to constrain the projected 3D wireframe alignment. Using a steepest descent approach, the algorithm is able to simultaneously extract the parameters related to the facial expression and to the 3D posture. Extensive fitting and tracking experiments demonstrate the feasibility, accuracy and effectiveness of the developed methods. A performance evaluation also shows that the proposed methods can outperform fitting based on an active appearance model search and can tackle many disadvantages associated with such approaches.

Yunshu Hou, Ping Fan, Ilse Ravyse, Hichem Sahli

Color, Multispectral and Special-Purpose Imaging

Carotenoid Concentration of Arctic Charr (Salvelinus Alpinus L.) from Spectral Data

The most striking feature of Arctic Charr (Salvelinus alpinus L.) is the red abdomen area during the mating season. This colouration is assumed to be related to the vitality, nutritional status, foraging ability and general health of the fish, important knowledge for fisheries and researchers. The colouration should be assessed numerically, and the amount of pigment (carotenoid) causing the colour should be known for quality evaluation of the fish. The carotenoid amount in particular is thought to be directly connected to the investment of the individual, since carotenoids are energetically costly. To assess this amount, we investigate the relationship between chemical and spectral data. We also tested a simple model for approximating carotenoid content from spectral measurements. The preliminary results indicate a reasonably good correlation between the two types of data.

J. Birgitta Martinkappi, Jukka Kekäläinen, Yevgeniya Shatilova, Jussi Parkkinen
Quality of Reconstructed Spectrum for Watermarked Spectral Images Subject to Various Illumination Conditions

Digital imaging continues its expansion to various applications, and spectral images are becoming more popular as one field of digital imaging. In this study we utilize a watermarking method for spectral images based on the three-dimensional wavelet transform. We study the influence of the watermarking process on illuminated watermarked images. In particular, we focus on how the watermarking affects the spectrum of the restored images. The experiments were performed on a large dataset of 58 spectral images. The experiments indicate that, using the proposed watermarking method, the quality of the reconstructed image depends more on the illumination and the embedding strength controller than on compression, with respect to the L*a*b* color difference.

Konstantin Krasavin, Jussi Parkkinen, Arto Kaarna, Timo Jaaskelainen
Compression of Remote Sensing Images for the PROBA-V Satellite Mission

We investigate compression of remote sensing images with a special geometry of non-square pixels. Two fundamentally different data reduction strategies are compared: a combination of pixel binning with near-lossless compression, and a method operating at higher compression rates. To measure the real impact of the compression, the image processing flow up to the final products is included in the experiments. The effects of sensor non-uniformities and their corrections are explicitly modeled and measured. We conclude that it is preferable to apply higher compression rates than to rely on pixel binning, even if the derived images have lower resolutions.

Stefan Livens, Richard Kleihorst
Estimating Color Signal at Different Correlated Color Temperature of Daylight

The color signal changes with changes in the illuminant. This study focuses on estimating color signals at different Correlated Color Temperatures (CCT) of daylight. We selected a set of color signals at different CCTs of daylight for estimation. An experiment was conducted by generating color signals from 24 color samples of the Macbeth Color Checker and 1645 daylight spectral power distributions (SPD), with CCTs ranging from 3757 K to 28322 K. By uniform sampling, we collected 84 color signals from each color sample and combined them to form a training dataset. Principal Component Analysis (PCA) has been applied to the training dataset to find the basis vectors and the number of color signals needed for estimation. We apply Wiener estimation with different orders of polynomials to estimate the color signal of the color samples. Interestingly, a good estimation of all 1645 color signals of a given color sample from the Macbeth color chart is obtained by selecting the five best CCT color signals of that sample in association with a third-order polynomial. However, higher-order polynomials yield significant errors in the Wiener estimation.

Paras Pant, Pesal Koirala, Markku Hauta-Kasari, Jussi Parkkinen
Local Color Descriptor for Object Recognition across Illumination Changes

In the context of object recognition, it is useful to extract from the images efficient local descriptors that are insensitive to the illumination conditions, to the camera scale factor and to the position and orientation of the object. In this paper, we propose to cope with this invariance problem by applying a spatial transformation to the local regions around detected key points. The new position of each pixel after this local spatial transformation is evaluated according to both the colors and the relative positions of all the pixels in the original local region. The descriptor of the considered local region is the set of the new positions of three particular pixels in this region. The invariance and the discriminating power of our local descriptor are assessed on a public database.

Xiaohu Song, Damien Muselet, Alain Trémeau
A New Method for Segmentation of Images Represented in a HSV Color Space

This paper presents an original low-level system for color image segmentation in the Hue-Saturation-Value (HSV) color space. Many difficulties of color image segmentation may be resolved by using the correct color space in order to increase the effectiveness of the color components in discriminating color data. The technique proposed in the article uses new data structures that lead to simpler and more efficient segmentation algorithms. We introduce a flexible hexagonal network structure on the image pixels, and we extract for each segmented region the syntactic features that can be used in the shape recognition process. Our technique has a lower time complexity than the methods studied in the specialized literature, and the experimental results on the Berkeley Segmentation Dataset color image database show that the performance of the method is robust.
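
To make the role of the colour space concrete, the sketch below segments an image by a plain hue/saturation/value threshold; the bounds are arbitrary examples, and this is not the hexagonal-network method of the paper.

```python
# Sketch: simple segmentation by thresholding in the HSV colour space.
import cv2
import numpy as np

def segment_hsv(image_bgr, lower=(35, 60, 40), upper=(85, 255, 255)):
    """Return a binary mask of pixels whose HSV values fall in [lower, upper]."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower, np.uint8), np.array(upper, np.uint8))
    # small morphological clean-up of the mask
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```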

Dumitru Dan Burdescu, Marius Brezovan, Eugen Ganea, Liana Stanescu
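
As a minimal illustration of the color space the method builds on, the snippet below converts an RGB image to HSV with the standard formulas; the hexagonal network structure and the syntactic features of the paper are not reproduced.

import numpy as np

def rgb_to_hsv(img):
    """Standard RGB -> HSV conversion for an H x W x 3 float image in [0, 1]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    v = img.max(axis=-1)                                # value = max channel
    c = v - img.min(axis=-1)                            # chroma
    s = np.where(v > 0, c / np.maximum(v, 1e-12), 0.0)  # saturation
    safe_c = np.maximum(c, 1e-12)
    h = np.zeros_like(v)
    h = np.where(v == r, ((g - b) / safe_c) % 6, h)
    h = np.where(v == g, (b - r) / safe_c + 2, h)
    h = np.where(v == b, (r - g) / safe_c + 4, h)
    h = np.where(c == 0, 0.0, h) * 60.0                 # hue in degrees, [0, 360)
    return np.stack([h, s, v], axis=-1)

hsv = rgb_to_hsv(np.random.rand(4, 4, 3))               # toy usage
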
Radar Imager for Perception and Mapping in Outdoor Environments

Perception remains a challenge in outdoor environments. Microwave radar offers considerable potential for overcoming the limitations of vision-based sensors. Such a sensor, called K2Pi, has been designed for environment mapping. In order to build radar maps, an algorithm named R-SLAM has been developed. The global radar map is constructed through a data-merging process, using map matching of successive radar image sequences. An occupancy grid approach is used to describe the environment. First results obtained in urban and natural environments are presented, which show the ability of the microwave radar to deal with extended environments.

Raphaël Rouveure, Patrice Faure, Marie-Odile Monod
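
The abstract mentions an occupancy grid representation; a generic log-odds occupancy grid update (not the R-SLAM algorithm itself, and with illustrative increment values) looks roughly as follows.

import numpy as np

L_OCC, L_FREE = 0.85, -0.4         # log-odds increments (illustrative values)

def update_grid(log_odds, hit_cells, free_cells):
    """Accumulate evidence for occupied (hit) and traversed (free) cells."""
    for (i, j) in hit_cells:
        log_odds[i, j] += L_OCC
    for (i, j) in free_cells:
        log_odds[i, j] += L_FREE
    return log_odds

def occupancy_prob(log_odds):
    """Convert accumulated log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + np.exp(log_odds))

grid = np.zeros((100, 100))        # toy 100 x 100 map
grid = update_grid(grid, hit_cells=[(50, 60)],
                   free_cells=[(50, k) for k in range(60)])
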
Phantom-Based Point Spread Function Estimation for Terahertz Imaging System

We present a terahertz imaging system designed to operate in reflection mode and propose a method for estimating its point spread function. A phantom with known geometry is built so as to generate a regular pattern with sharp edges under an ideal delta-like point spread function. The phantom is imaged with the terahertz system operating at 186 GHz. Several masking alterations applied to the beam pattern are also tested. The corresponding point spread functions are obtained by a deconvolution technique in the Fourier domain. We validate our results by using the estimated point spread functions to deblur the imaging results of a natural scene, and by direct comparison with a point source response.

Dan C. Popescu, Andrew Hellicar, Yue Li
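
The deconvolution step can be pictured as a regularized spectral division: with a known phantom pattern x and its measured image y = h * x, the point spread function h follows from dividing the spectra. The sketch below uses an illustrative regularization constant and a synthetic stripe phantom, not the paper's 186 GHz data.

import numpy as np

def estimate_psf(y, x, eps=1e-3):
    """Wiener-style spectral division: recover h from y = h * x (circular model)."""
    Y, X = np.fft.fft2(y), np.fft.fft2(x)
    H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)
    return np.real(np.fft.ifft2(H))

# Toy usage: a stripe phantom blurred by a small Gaussian kernel.
x = np.zeros((64, 64)); x[::8, :] = 1.0
k = np.exp(-0.5 * (np.arange(-3, 4) ** 2) / 1.5)
k2 = np.outer(k, k); k2 /= k2.sum()
pad = np.zeros_like(x); pad[:7, :7] = k2                # kernel embedded in image grid
y = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(pad)))
psf_hat = estimate_psf(y, x)
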
Advanced Vision Processing Systems: Spike-Based Simulation and Processing

In this paper we briefly summarize the fundamental properties of spike-event processing applied to artificial vision systems. This sensing and processing technology is capable of very high-speed throughput, because it does not rely on sensing and processing sequences of frames, and because it allows for complex, hierarchically structured, neuro-cortical-like layers for sophisticated processing. The paper briefly describes cortex-like spike-event vision processing principles and the AER (Address Event Representation) technique used in hardware spiking systems. We present an AER simulation tool developed entirely in Visual C++ 6.0. We have validated it using real AER stimuli and by comparing its outputs with real outputs obtained from AER-based devices. With this tool we can predict the eventual performance of AER-based systems, before the technology becomes mature enough to allow such large systems.

José-Antonio Pérez-Carrasco, Carmen Serrano-Gotarredona, Begoña Acha-Piñero, Teresa Serrano-Gotarredona, Bernabe Linares-Barranco

Medical Imaging

Self-assessed Contrast-Maximizing Adaptive Region Growing

In the context of an experimental virtual-reality surgical planning software platform, we propose a fully self-assessed adaptive region growing segmentation algorithm. Our method successfully delineates main tissues relevant to head and neck reconstructive surgery, such as skin, fat, muscle/organs, and bone. We rely on a standardized and self-assessed region-based approach to deal with a great variety of imaging conditions with minimal user intervention, as only a single-seed selection stage is required. The detection of the optimal parameters is managed internally using a measure of the varying contrast of the growing regions. Validation based on synthetic images, as well as truly-delineated real CT volumes, is provided for the reader’s evaluation.

Carlos S. Mendoza, Begoña Acha, Carmen Serrano, Tomás Gómez-Cía
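
For orientation, a bare-bones seeded region growing routine is sketched below; the paper's actual contribution, the self-assessed, contrast-maximizing choice of the growing criterion, is reduced here to a fixed tolerance `tol`, which is purely illustrative.

from collections import deque
import numpy as np

def region_grow(img, seed, tol=10.0):
    """Grow a 4-connected region around `seed` within an intensity tolerance."""
    h, w = img.shape
    seed_val = float(img[seed])
    grown = np.zeros((h, w), dtype=bool)
    grown[seed] = True
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and not grown[ni, nj]:
                if abs(float(img[ni, nj]) - seed_val) <= tol:
                    grown[ni, nj] = True
                    queue.append((ni, nj))
    return grown

# Toy usage on random data:
mask = region_grow(np.random.randint(0, 255, (64, 64)).astype(float),
                   seed=(32, 32), tol=30)
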
Convex Hull-Based Feature Selection in Application to Classification of Wireless Capsule Endoscopic Images

In this paper we propose and examine a Vector Supported Convex Hull method for feature subset selection. Within feature subspaces, the method checks the locations of vectors belonging to one class with respect to the convex hull of vectors belonging to the other class. Based on this analysis, a coefficient is proposed for evaluating the discrimination ability of a subspace. The method allows for finding subspaces in which vectors of one class cluster together and are surrounded by vectors of the other class. The method is applied to the selection of color and texture descriptors of capsule endoscopy images. The study aims at finding a small set of descriptors for the detection of pathological changes in the gastrointestinal tract. The results obtained by means of the Vector Supported Convex Hull are compared with results produced by a Support Vector Machine with a radial basis function kernel.

Piotr Szczypiński, Artur Klepaczko
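
The core geometric test behind the proposed coefficient can be illustrated in a few lines: in a candidate feature subspace, check how many vectors of one class fall inside the convex hull of the other class. The coefficient used in the paper is more elaborate; the function and data below are only a toy stand-in.

import numpy as np
from scipy.spatial import Delaunay

def inside_hull_fraction(class_a, class_b):
    """Fraction of class-B points lying inside the convex hull of class-A points."""
    tri = Delaunay(class_a)                          # the triangulation spans the hull
    return (tri.find_simplex(class_b) >= 0).mean()   # -1 means outside the hull

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, (200, 2))   # class A in a 2-D feature subspace
b = rng.normal(3.0, 1.0, (200, 2))   # class B, shifted away
print(inside_hull_fraction(a, b))    # low value -> well-separated subspace
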
Pattern Analysis of Dermoscopic Images Based on FSCM Color Markov Random Fields

In this paper a method for pattern analysis in dermoscopic images of abnormally pigmented skin (melanocytic lesions) is presented. In order to diagnose a possible skin cancer, physicians assess the lesion according to different rules. The new trend in dermatology is to classify the lesion by means of pattern irregularity. In order to analyze pattern turbulence, lesions must first be segmented into single-pattern regions. Our classification method, when applied to overlapping lesion patches, provides a pattern chart that could ultimately allow for in-region single-texture turbulence analysis. Due to the color-textured appearance of these patterns, we present a novel method based on a Finite Symmetric Conditional Model (FSCM) Markov Random Field (MRF) color extension for the characterization and discrimination of pattern samples. Our classification success rate reaches 86%.

Carlos S. Mendoza, Carmen Serrano, Begoña Acha

Biometrics

A 3D Statistical Facial Feature Model and Its Application on Locating Facial Landmarks

3D face landmarking aims at the automatic localization of 3D facial features and has a wide range of applications, including face recognition, face tracking and facial expression analysis. Methods developed so far for 2D images have been shown to be sensitive to changes in lighting conditions. In this paper, we propose a learning-based approach for the reliable localization of face landmarks in 3D. Our approach relies on a statistical model, called the 3D Statistical Facial feAture Model (SFAM), which learns both global variations in 3D face morphology and local variations around the 3D face landmarks in terms of local texture and shape. Evaluated on the FRGC v1.0 dataset, our approach shows its effectiveness and achieves a locating accuracy of 99.09% within 10 mm precision. The mean error and standard deviation of each landmark are respectively less than 5 mm and 4.

Xi Zhao, Emmanuel Dellandréa, Liming Chen
Behavioral State Detection of Newborns Based on Facial Expression Analysis

Prematurely born infants are observed in a Neonatal Intensive Care Unit (NICU) for medical treatment. Whereas vital body functions are continuously monitored, their incubator is covered by a blanket for medical reasons. This prevents visual observation of the newborns during most of the day, while it is known that the facial expression can give valuable information about the presence of discomfort.

This prompted the authors to develop a prototype of an automated video survey system for the detection of discomfort in newborn babies by analysis of their facial expression. Since only a reliable and situation-independent system is useful, we focus on robustness against non-ideal viewpoints and lighting conditions. Our proposed algorithm automatically segments the face from the background and localizes the eye, eyebrow and mouth regions. Based upon measurements in these regions, a hierarchical classifier is employed to discriminate between the behavioral states sleep, awake and cry.

We have evaluated the described prototype system on recordings of three healthy newborns, and we show that our algorithm operates with approximately 95% accuracy. Small changes in viewpoint and lighting conditions are allowed, but when there is a major reduction in light, or when the viewpoint is far from frontal, the algorithm fails.

Lykele Hazelhoff, Jungong Han, Sidarto Bambang-Oetomo, Peter H. N. de With
Supervised Face Recognition for Railway Stations Surveillance

The feasibility of a supervised surveillance system for railway stations (or airports) is evaluated. Surveillance is based on the recognition of suspicious individuals by means of video cameras. As the problem involves both face detection and face recognition, we have evaluated the best performing algorithms in each of these two areas. For face detection, we have selected the Viola-Jones algorithm, while for face recognition we have performed tests with an appearance-based algorithm (PCA) and an interest-point-based algorithm (SIFT). We have used both the AT&T database and the LFW database for our tests. The results obtained show that face detection works reliably and fast enough, but face recognition cannot cope with highly non-homogeneous images like those of LFW and requires parallel computing in order to work in real time. In conclusion, supervised surveillance is feasible provided that image homogeneity fulfils some minimum standards and parallel computing is used. Moreover, interest-point-based methods are more robust to variations in image quality, so their use is encouraged.

Maria Asuncion Vicente, Cesar Fernandez, Angela M. Coves
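
The detection stage the authors selected (Viola-Jones) is readily reproduced with OpenCV's bundled Haar cascade; the sketch below is such a stand-in, with typical (not tuned) parameters, and omits the PCA/SIFT recognition stage entirely. The file name "frame.png" is hypothetical.

import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(bgr_image):
    """Return bounding boxes (x, y, w, h) of detected faces."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

img = cv2.imread("frame.png")       # hypothetical surveillance frame
if img is not None:
    for (x, y, w, h) in detect_faces(img):
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
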
Person’s Recognition Using Palmprint Based on 2D Gabor Filter Response

Palmprint recognition is very important in automatic personal identification. The objective of this study is to develop an efficient prototype system for automatic personal identification using palmprint technology. In this work, a new texture feature based on Gabor filtering is proposed. First, the region of interest (ROI) is filtered by a 2D Gabor filter; then, the principal lines, wrinkles and ridges are extracted by simple thresholding of the complex magnitude of the filtered ROI; finally, the candidate is found by a matching process. We have tested our algorithm on several images taken from the palmprint database collected by the Hong Kong Polytechnic University. The test results show that the designed system achieves an acceptable level of performance.

Abdallah Meraoumia, Salim Chitroub, Mohamed Saigaa
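
A minimal sketch of the line-extraction step, assuming an even/odd Gabor pair whose complex magnitude is thresholded; the kernel parameters below are illustrative, not those of the paper, and the ROI is a random stand-in.

import cv2
import numpy as np

def gabor_magnitude(roi, theta):
    """Complex magnitude of an even/odd 2D Gabor filter pair at orientation theta."""
    params = dict(ksize=(31, 31), sigma=4.0, theta=theta, lambd=10.0, gamma=0.5)
    even = cv2.getGaborKernel(psi=0.0, **params)
    odd = cv2.getGaborKernel(psi=np.pi / 2, **params)
    re = cv2.filter2D(roi.astype(np.float32), cv2.CV_32F, even)
    im = cv2.filter2D(roi.astype(np.float32), cv2.CV_32F, odd)
    return np.sqrt(re ** 2 + im ** 2)

roi = np.random.randint(0, 255, (128, 128), dtype=np.uint8)   # stand-in palm ROI
mag = gabor_magnitude(roi, theta=np.pi / 4)
lines = (mag > mag.mean() + mag.std()).astype(np.uint8)       # simple threshold
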
Retina Identification Based on the Pattern of Blood Vessels Using Angular and Radial Partitioning

This paper presents a new human identification system based on features obtained from retina images using angular and radial partitioning of the images. The proposed algorithm is composed of two principal stages: feature extraction and decision making. In the feature extraction stage, all images are first normalized in a preprocessing step. Then, the blood vessel pattern is extracted from the retina images and a morphological thinning process is applied to the extracted pattern. After thinning, two feature vectors based on the angular and radial partitioning of the pattern image are extracted from the blood vessel pattern. The extracted features are rotation and scale invariant and robust against translation. In the next stage, the extracted feature vectors are analyzed using the 1D discrete Fourier transform, and the Manhattan metric is used to measure the closeness of the feature vectors for comparison. Experimental results on a database of 360 retina images obtained from 40 subjects demonstrated an average identification accuracy of 98.75 percent for the proposed system.

Mehran Deljavan Amiri, Fardin Akhlaqian Tab, Wafa Barkhoda
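
To illustrate the angular-partitioning idea, the toy feature below counts vessel pixels in K angular sectors around the image centre and takes the DFT magnitude, so that a rotation of the retina (a circular shift of the sector histogram) leaves the feature unchanged; vectors are then compared with the Manhattan distance. The radial partitioning and the preprocessing of the paper are omitted, and the vessel mask here is random.

import numpy as np

def angular_feature(vessel_mask, k=36):
    """Rotation-invariant descriptor from an angular histogram of vessel pixels."""
    h, w = vessel_mask.shape
    ys, xs = np.nonzero(vessel_mask)
    angles = np.arctan2(ys - h / 2.0, xs - w / 2.0)              # in [-pi, pi]
    bins = ((angles + np.pi) / (2 * np.pi) * k).astype(int) % k
    hist = np.bincount(bins, minlength=k).astype(float)
    return np.abs(np.fft.fft(hist))                              # shift-invariant magnitude

def manhattan(f1, f2):
    return np.sum(np.abs(f1 - f2))

mask = np.random.rand(128, 128) > 0.95                           # stand-in vessel mask
d = manhattan(angular_feature(mask), angular_feature(mask.T))
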
Backmatter
Metadata
Title: Advanced Concepts for Intelligent Vision Systems
Edited by: Jacques Blanc-Talon, Wilfried Philips, Dan Popescu, Paul Scheunders
Copyright year: 2009
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-04697-1
Print ISBN: 978-3-642-04696-4
DOI: https://doi.org/10.1007/978-3-642-04697-1