
2014 | Book

Image Analysis and Recognition

11th International Conference, ICIAR 2014, Vilamoura, Portugal, October 22-24, 2014, Proceedings, Part II


About this book

The two volumes LNCS 8814 and 8815 constitute the thoroughly refereed proceedings of the 11th International Conference on Image Analysis and Recognition, ICIAR 2014, held in Vilamoura, Portugal, in October 2014. The 107 revised full papers presented were carefully reviewed and selected from 177 submissions. The papers are organized in the following topical sections: image representation and models; sparse representation; image restoration and enhancement; feature detection and image segmentation; classification and learning methods; document image analysis; image and video retrieval; remote sensing; applications; action, gestures and audio-visual recognition; biometrics; medical image processing and analysis; medical image segmentation; computer-aided diagnosis; retinal image analysis; 3D imaging; motion analysis and tracking; and robot vision.

Table of Contents

Frontmatter

Action, Gestures and Audio-Visual Recognition

Frontmatter
Audio-Visual Emotion Analysis Using Semi-Supervised Temporal Clustering with Constraint Propagation

In this paper, we investigate applying semi-supervised clustering to audio-visual emotion analysis, a complex problem that is traditionally solved using supervised methods. We propose an extension to the semi-supervised aligned cluster analysis algorithm (SSACA), a temporal clustering algorithm that incorporates pairwise constraints in the form of must-link and cannot-link. We incorporate an exhaustive constraint propagation mechanism to further improve the clustering process. To validate the proposed method, we apply it to emotion analysis on a multimodal naturalistic emotion database. Results show substantial improvements compared to the original aligned clustering analysis algorithm (ACA) and to our previously proposed semi-supervised approach.

Rodrigo Araujo, Mohamed S. Kamel
Exemplar-Based Human Action Recognition with Template Matching from a Stream of Motion Capture

Recent works on human action recognition have focused on representing and classifying articulated body motion. These methods require a detailed knowledge of the action composition in both the spatial and temporal domains, which is a difficult task, most notably under real-time conditions. As such, there has been a recent shift towards the exemplar paradigm as an efficient low-level and invariant modelling approach. Motivated by recent success, we believe a real-time solution to the problem of human action recognition can be achieved. In this work, we present an exemplar-based approach where only a single action sequence is used to model an action class. Notably, rotations for each pose are parameterised in Exponential Map form. Delegate exemplars are selected using k-means clustering, where the cluster criterion is selected automatically. For each cluster, a delegate is identified and denoted as the exemplar by means of a similarity function. The number of exemplars is adaptive based on the complexity of the action sequence. For recognition, Dynamic Time Warping and template matching are employed to compare the similarity between a streamed observation and the action model. Experimental results using motion capture demonstrate that our approach is superior to the current state of the art, with the additional ability to handle large and varied action sequences.

Daniel Leightley, Baihua Li, Jamie S. McPhee, Moi Hoon Yap, John Darby
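As an illustration of the exemplar-selection and DTW-matching steps described in the abstract above, here is a minimal Python sketch (not the authors' implementation): k-means picks one delegate pose per cluster of a single training sequence, and a streamed observation is compared to that exemplar model with classic dynamic time warping. Feature dimensions and data are toy assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_exemplars(poses, n_clusters=8):
    """Pick one delegate pose per cluster: the pose closest to its centroid."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(poses)
    exemplars = []
    for k in range(n_clusters):
        members = np.where(km.labels_ == k)[0]
        d = np.linalg.norm(poses[members] - km.cluster_centers_[k], axis=1)
        exemplars.append(poses[members[np.argmin(d)]])
    return np.array(exemplars)

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping with Euclidean cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Usage: model an action class by exemplars of a single training sequence,
# then compare a streamed observation by DTW distance to the model.
rng = np.random.default_rng(0)
train_seq = rng.normal(size=(120, 30))       # 120 poses, 30-D pose features
observation = rng.normal(size=(40, 30))
model = select_exemplars(train_seq)
print("DTW distance to action model:", dtw_distance(observation, model))
```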
A New Visual Speech Recognition Approach for RGB-D Cameras

Visual speech recognition remains a challenging topic due to various speaking characteristics. This paper proposes a new approach for lipreading to recognize isolated speech segments (words, digits, phrases, etc.) using both 2D image and depth data. The process of the proposed system is divided into three consecutive steps, namely, mouth region tracking and extraction, motion and appearance descriptor (HOG and MBH) computation, and classification using the Support Vector Machine (SVM) method. To evaluate the proposed approach, three public databases (MIRALC, Ouluvs, and CUAVE) were used. Speaker dependent and speaker independent settings were considered in the evaluation experiments. The obtained recognition results demonstrate that lipreading can be performed effectively, and the proposed approach outperforms recent works in the literature for the speaker dependent setting while being competitive for the speaker independent setting.

Ahmed Rekik, Achraf Ben-Hamadou, Walid Mahdi
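The following sketch illustrates only the descriptor-plus-SVM stage of a pipeline like the one above (mouth tracking, depth data and MBH features are omitted): HOG descriptors of mouth-region frames are averaged per segment and classified with an SVM. Frame sizes, pooling and labels are made-up placeholders, not the paper's setup.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def segment_descriptor(frames):
    """Average HOG over the frames of one speech segment (crude temporal pooling)."""
    feats = [hog(f, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2)) for f in frames]
    return np.mean(feats, axis=0)

rng = np.random.default_rng(1)
# Two toy classes of "mouth region" sequences, 10 segments each, 15 frames of 64x64.
X = [segment_descriptor(rng.random((15, 64, 64)) + c) for c in (0, 1) for _ in range(10)]
y = [c for c in (0, 1) for _ in range(10)]

clf = SVC(kernel="rbf", C=10.0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```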
2D Appearance Based Techniques for Tracking the Signer Configuration in Sign Language Video Recordings

Current linguistic research on sign language is often based on analysing large corpora of video recordings. The videos must be annotated either manually or automatically. Automatic methods for estimating the signer body configuration—especially the hand positions and shapes—would thus be of great practical interest. Methods based on rigorous 3D and 2D modelling of the body parts have been presented. However, they face insurmountable problems of computational complexity due to the large sizes of modern linguistic corpora. In this paper we look at an alternative approach and investigate what can be achieved with straightforward local 2D appearance based methods: template matching-based tracking of local image neighbourhoods and supervised skin blob category detection based on local appearance features. After describing these techniques, we combine them with others to construct a signer configuration estimation system, and demonstrate the system on the video material of the Suvi dictionary of Finnish Sign Language.

Ville Viitaniemi, Matti Karppa, Jorma Laaksonen
Computer Aided Hearing Assessment: Detection of Eye Gesture Reactions as a Response to the Sound

A methodology for the detection of eye gestural reactions as a response to auditory stimuli is presented in this work. A precise hearing evaluation is important to improve the quality of life of those who suffer from hearing loss. In the case of patients with cognitive decline or other communication disorders, this evaluation becomes much more complicated. The audiologist needs to focus his attention on spontaneous gestural reactions that might indicate some sort of perception. The detection of these gestural reactions is sometimes imprecise and requires broad experience from the audiologist. To facilitate this task, we present a fully automated method that analyzes video sequences recorded during the audiometric evaluation and identifies these unconscious gestural reactions. The presented methodology achieves an accuracy of 94.21% in the detection of these reactions to the auditory stimuli, which makes it an interesting tool to assist audiologists in the hearing assessment of this specific group of patients.

A. Fernández, Marcos Ortega, Manuel G. Penedo
Multi-sensor Acceleration-Based Action Recognition

In this paper, a framework to recognize human actions from acceleration data is proposed. An important step for accurate recognition is the pre-processing of the input data and the subsequent classification by a machine learning algorithm. In this paper, we propose to combine Dynamic Time Warping (DTW) with Random Forest. The intention of using DTW is to pre-process the data to eliminate outliers and to align the time series. Many applications require more than one inertial sensor for an accurate prediction of actions. In this paper, nine inertial sensors are deployed to ensure an accurate recognition of actions. Further, sensor fusion approaches are introduced and the most promising strategy is identified. The proposed framework is evaluated on a self-recorded dataset consisting of six human actions. Each action was performed three times by 20 subjects. The dataset is publicly available for download.

Florian Baumann, Irina Schulz, Bodo Rosenhahn
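A minimal sketch of the idea described above, under assumed data shapes: acceleration time series are aligned to a reference with DTW before a Random Forest classifies them. The real framework fuses nine sensors; a single 3-axis sensor and synthetic data are used here for brevity.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def dtw_path(a, b):
    """Return the warping path aligning sequence a to reference b."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = np.linalg.norm(a[i - 1] - b[j - 1]) + \
                      min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    i, j, path = n, m, []
    while i > 0 and j > 0:               # backtrack the optimal alignment
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def align_to_reference(seq, ref):
    """Resample seq onto the time base of ref using the DTW path."""
    aligned = np.zeros_like(ref)
    counts = np.zeros(len(ref))
    for i, j in dtw_path(seq, ref):
        aligned[j] += seq[i]
        counts[j] += 1
    return aligned / np.maximum(counts, 1)[:, None]

rng = np.random.default_rng(2)
ref = rng.normal(size=(100, 3))                      # reference action recording
X = [align_to_reference(rng.normal(size=(80 + 5 * k, 3)), ref).ravel()
     for k in range(20)]
y = [k % 2 for k in range(20)]                       # two toy action classes
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print("training accuracy:", clf.score(X, y))
```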
Incremental Learning of Hand Gestures Based on Submovement Sharing

This paper presents an incremental learning method for hand gesture recognition that learns the individual movements in each gesture of a user. To recognize the movement, we use a subunit-based dynamic time warping method, which treats a hand movement as a sequence of ubmovements. In our method, each hand movement is decomposed into submovements and the arrangement of submovements is reflected in the training sample database. Experimental results from the lassification of ten gestures demonstrate that our method can improve the recognition rate compared with a method without incremental learning. In addition, the experimental results show that incremental learning of a single class of gestures can improve the recognition rate of multi-class gestures using our method.

Ryo Kawahata, Yanrung Wang, Atsushi Shimada, Takayoshi Yamashita, Rin-ichiro Taniguchi
Gait Analysis from Video: Camcorders vs. Kinect

Gait analysis is a domain of interest in clinical medical practice, for both neurological and non-neurological disorders. Marker-based systems are the most favored methods of human motion assessment and gait analysis; however, these systems require specific equipment and expertise, and are cumbersome, costly and difficult to use. In this paper we compare two low-cost, marker-less systems: (1) a Kinect in front of a treadmill and (2) a set of two camcorders on the sides of the treadmill, used to reconstruct the skeleton of a subject during walking. We validated our method with ground truth data obtained with markers manually placed on the subject’s body. Finally, we present an application for asymmetric gait recognition. Our results on different subjects showed that, compared to the Kinect, the two-camcorder approach was very efficient and provided accurate measurements for gait assessment.

Hoang Anh Nguyen, Jean Meunier

Biometrics

Frontmatter
Person Re-identification Using Region Covariance in a Multi-feature Approach

Person re-identification is an important requirement for modern video surveillance systems and relevant for human tracking, especially over camera networks. Many different approaches have been proposed, but robust identification under real-life conditions remains hard. In this paper we investigate the fusion of multiple person descriptors in order to increase the performance using complementary feature vectors. As an additional improvement to state-of-the-art region covariance descriptors, an extension of the comparison metric is proposed which increases the robustness and performance of the system in cases of rank deficiency. The proposed system is evaluated on the well-known benchmarks CAVIAR4REID, VIPeR, ETHZ and PRID 2011 and shows significant improvements over existing re-identification algorithms.

Volker Eiselein, Gleb Sternharz, Tobias Senst, Ivo Keller, Thomas Sikora
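For context, here is a sketch of a standard region covariance descriptor and the common log-eigenvalue dissimilarity between two regions; the paper's rank-deficiency extension and multi-feature fusion are not reproduced. The per-pixel feature choice (position, colour, gradient magnitudes) is a typical assumption.

```python
import numpy as np
from scipy.linalg import eigvalsh
from scipy.ndimage import sobel

def region_covariance(rgb_patch):
    """Covariance of per-pixel features [x, y, R, G, B, |Ix|, |Iy|]."""
    h, w, _ = rgb_patch.shape
    gray = rgb_patch.mean(axis=2)
    ix, iy = np.abs(sobel(gray, axis=1)), np.abs(sobel(gray, axis=0))
    yy, xx = np.mgrid[0:h, 0:w]
    feats = np.stack([xx, yy,
                      rgb_patch[..., 0], rgb_patch[..., 1], rgb_patch[..., 2],
                      ix, iy], axis=-1).reshape(-1, 7)
    return np.cov(feats, rowvar=False) + 1e-6 * np.eye(7)   # regularised

def covariance_distance(c1, c2):
    """sqrt(sum of squared log generalized eigenvalues) of (c1, c2)."""
    lam = eigvalsh(c1, c2)
    return np.sqrt(np.sum(np.log(lam) ** 2))

rng = np.random.default_rng(3)
a, b = rng.random((64, 32, 3)), rng.random((64, 32, 3))
print(covariance_distance(region_covariance(a), region_covariance(b)))
```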
Multi-biometric Score-Level Fusion and the Integration of the Neighbors Distance Ratio

Multi-biometrics aims at building more accurate unified biometric decisions based on the information provided by multiple biometric sources. Information fusion is used to optimize the process of creating this unified decision. In previous works dealing with score-level multi-biometric fusion, the scores of the different biometric sources belonging to the comparison of interest are used to create the fused score. The novelty of this work lies in integrating the relation of the fused scores to other comparisons within a 1:N comparison. This is performed by considering the neighbors distance ratio in the ranked comparison set within a classification-based fusion approach. The evaluation was performed on the Biometric Scores Set BSSR1 database, and the enhanced performance induced by the integration of the neighbors distance ratio was clearly demonstrated.

Naser Damer, Alexander Opel
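A small sketch of the general idea described above, not the paper's exact setup: per-candidate fusion features are augmented with a neighbors-distance-ratio (each score relative to the runner-up score of the ranked 1:N list), and a classifier produces the fused decision. The classifier choice and the simulated scores are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fusion_features(scores_per_source):
    """scores_per_source: (n_sources, N) similarity scores of one probe against a
    gallery of N identities. Returns per-candidate features combining each
    source's raw score with its neighbors distance ratio."""
    feats = []
    for scores in scores_per_source:
        order = np.argsort(scores)[::-1]
        second_best = scores[order[1]]
        ndr = scores / (second_best + 1e-9)      # ratio to the runner-up score
        feats.append(np.stack([scores, ndr], axis=1))
    return np.concatenate(feats, axis=1)          # (N, 2 * n_sources)

rng = np.random.default_rng(4)
X, y = [], []
for _ in range(200):                              # simulated 1:N comparisons
    genuine = rng.integers(0, 50)
    scores = rng.normal(0.3, 0.1, size=(2, 50))   # two biometric sources
    scores[:, genuine] += 0.4                     # genuine comparison scores higher
    X.append(fusion_features(scores))
    y.append(np.arange(50) == genuine)
X, y = np.vstack(X), np.concatenate(y)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy:", clf.score(X, y))
```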
Adaptive Haar-Like Features for Head Pose Estimation

This paper presents a work on head pose estimation, in which face images are tagged with head pose information. To achieve head pose estimation, anatomic regions (eyes, nose and mouth) are extracted using a facial descriptor. Candidates for these regions are extracted from an energy map based on Haar-like features. Then, a multi-threshold analysis is applied to find the position and the size of each region. Projections of these regions onto the vertical and horizontal axes make it possible to define a set of rules for estimating the head pose.

Nam-Jun Pyun, Halima Sayah, Nicole Vincent
Face and Palmprint Recognition Using Hierarchical Multiscale Adaptive LBP with Directional Statistical Features

In this paper, a new method for face and palmprint recognition is proposed, based on the hierarchical multiscale local binary pattern (HMLBP) and the adaptive LBP with directional statistical features (ALBPF). The method builds a hierarchical multiscale ALBPF histogram for an image. By using HMLBP, the representative features of “non-uniform” patterns are extracted at a smaller scale. On the other hand, using ALBPF improves the efficiency by incorporating mean and standard deviation statistical features, in addition to minimizing the directional difference along different orientations. The results of experiments conducted on two public face databases and two public palmprint databases show that our proposed method has better recognition accuracy than the LBP, ALBPF and HMLBP methods.

Ghada Shams, Mohamed Ismail, Sohier Bassiouny, Nagia Ghanem
Multispectral Iris Recognition Using Patch Based Game Theory

Multispectral imaging offers potential to improve the recognition performance of an iris biometric system. The novelty of this research effort is that a Coalition Game Theory (CGT) approach is proposed to select only the important patches obtained using the modified Local Binary Pattern (mLBP) operator. The mLBP fuses both the sign and magnitude difference vectors to extract features from normalized iris images. The CGT selects, based on the Shapley value, patches that have high individual importance along with a strong interaction with other patches, in order to improve the overall performance. Results show that the CGT model maintains better recognition accuracy while reducing the overall surface area needed for recognition.

Foysal Ahmad, Kaushik Roy, Khary Popplewell

Medical Image Processing and Analysis

Frontmatter
Periodic Background Pattern Detection and Removal for Cell Tracking

The study of cell morphology and cell mobility variation when cells are grown on top of patterned substrates is becoming a very important factor in tissue regeneration.

In this paper we present a novel approach to automatically detect and remove periodic background patterns in brightfield microscopy images. This background removal process is fundamental for the analysis of cell mobility as the periodic background pattern would otherwise lead to erroneous cell analysis. The detection of the background is performed by searching for the periodic background pattern organization through the analysis of keypoints automatically obtained from images. Using this information we are able to both detect and reconstruct the periodic background and finally remove it from the original images.

We tested the proposed approach on microscopy images with different periodic background patterns. The effectiveness of the method was validated both by visual inspection and by the cell tracking results obtained.

Tiago Esteves, Ângela Carvalho, Fernando Jorge Monteiro, Pedro Quelhas
Nerve Detection in Ultrasound Images Using Median Gabor Binary Pattern

Ultrasound in regional anesthesia (RA) has increased in popularity over the last years. Nerve localization is a key step in RA practice, so it is valuable to develop a tool able to facilitate this practice. Nerve detection in ultrasound images is a challenging task, since noise and other artifacts corrupt the visual properties of this kind of tissue. In this paper we propose a new method to address this problem. The proposed technique operates in two steps. As the median nerve belongs to a hyperechoic region, the first step consists in the segmentation of this type of region using the k-means algorithm. The second step is more critical; it deals with nerve structure detection in noisy data. For that purpose, a new descriptor is developed. It combines two methods, the median binary pattern (MBP) and Gabor filters, to obtain the median Gabor binary pattern (MGBP). The method was tested on 173 ultrasound images of the median nerve obtained from three patients. The results showed that the proposed approach achieves better accuracy than the original MBP, the Gabor descriptor and other popular descriptors.

Oussama Hadjerci, Adel Hafiane, Pascal Makris, Donatello Conte, Pierre Vieyres, Alain Delbos
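A rough sketch of the two-stage idea above: (1) k-means on pixel intensity to isolate hyperechoic regions, (2) Gabor filtering of the candidate region, on top of which an LBP-style descriptor could then be computed (the MGBP descriptor itself is not reproduced here). The synthetic image stands in for an ultrasound frame.

```python
import numpy as np
from sklearn.cluster import KMeans
from skimage.filters import gabor

rng = np.random.default_rng(5)
image = rng.random((128, 128))
image[40:70, 50:90] += 0.8                      # bright (hyperechoic-like) blob

# Stage 1: cluster pixel intensities and keep the brightest cluster.
km = KMeans(n_clusters=3, n_init=10).fit(image.reshape(-1, 1))
bright_label = np.argmax(km.cluster_centers_.ravel())
hyperechoic_mask = (km.labels_ == bright_label).reshape(image.shape)

# Stage 2: Gabor responses inside the candidate region.
real, imag = gabor(image, frequency=0.2, theta=0)
response = np.hypot(real, imag) * hyperechoic_mask
print("candidate pixels:", hyperechoic_mask.sum(),
      "mean Gabor magnitude in region:", response[hyperechoic_mask].mean())
```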
Automatic Localization of Skin Layers in Reflectance Confocal Microscopy

Reflectance Confocal Microscopy (RCM) is a noninvasive imaging tool used in clinical dermatology and skin research, allowing real time visualization of skin structural features at different depths at a resolution comparable to that of conventional histology [1]. Currently, RCM is used to generate a rich skin image stack (about 60 to 100 images per scan) which is visually inspected by experts, a process that is tedious, time consuming and exclusively qualitative. Based on the observation that each of the skin images in the stack can be characterized as a texture, we propose a quantitative approach for automatically classifying the images in the RCM stack as belonging to the different skin layers: stratum corneum, stratum granulosum, stratum spinosum, stratum basale, and the papillary dermis. A reduced set of images in the stack is used to generate a library of representative texture features named textons. This library is employed to characterize all the images in the stack with a corresponding texton histogram. The stack is ultimately separated into 5 different sets of images, each corresponding to a different skin layer, exhibiting good correlation with expert grading. The performance of the method is tested on three RCM stacks, with promising classification results. The proposed method is especially valuable considering the currently scarce landscape of quantitative solutions for RCM imaging.

Eduardo Somoza, Gabriela Oana Cula, Catherine Correa, Julie B. Hirsch
Thermal Signature Using Non-redundant Temporal Local Binary-Based Features

In this paper, we propose a method for event recognition from 1-D signals based on thermal signatures. We use the non-redundant temporal Local Binary Pattern (NRTLBP) as a descriptor of the Pattern Of Interest (POI) signal. The original signal is extracted directly from a local patch in the region of interest. We introduce wavelet decomposition as a pre-processing stage in order to extract the approximation wave-components of the raw signal. Then, NRTLBP is applied to the wave-components, which provides a wavelet-domain descriptor of the raw thermal signature. Finally, we evaluate our method on a real dataset (Preterm Pain in Infants, “PPI”) composed of thermal videos, developed in the context of the Infant Pain project, a French project supported by the French National Research Agency (ANR).

Adnan Al Alwani, Youssef Chahir, Francois Jouen
Image Warping in Dermatological Image Hair Removal

The paper focuses on solving the problem of hair removal in dermatology applications. The proposed hair removal algorithm is based on Gabor filtering and PDE-based image reconstruction. It also includes an edge sharpening stage based on a new warping algorithm. The idea of warping is to move pixels from the neighborhood of a blurred edge closer to the edge. The proposed technique preserves the overall luminosity and textures of the image, while making the edges sharper and less noisy.

Alexandra Nasonova, Andrey Nasonov, Andrey Krylov, Ivan Pechenko, Alexey Umnov, Natalia Makhneva
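This is not the authors' Gabor/PDE pipeline, but a commonly used simplified stand-in for dermatoscopic hair removal that illustrates the same masking-and-reconstruction idea: a morphological black-hat transform masks dark hairs, and OpenCV's PDE-style inpainting reconstructs the masked pixels. Kernel size and threshold are illustrative.

```python
import cv2
import numpy as np

def remove_hair(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 17))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)   # dark thin structures
    _, mask = cv2.threshold(blackhat, 10, 255, cv2.THRESH_BINARY)
    return cv2.inpaint(bgr_image, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)

# Usage (file path is a placeholder):
# clean = remove_hair(cv2.imread("lesion.jpg"))
if __name__ == "__main__":
    dummy = np.full((64, 64, 3), 128, np.uint8)
    cv2.line(dummy, (0, 0), (63, 63), (20, 20, 20), 1)              # fake hair
    print(remove_hair(dummy).shape)
```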
3D Multimodal Visualization of Subdural Electrodes with Cerebellum Removal to Guide Epilepsy Resective Surgery Procedures

Patients with medically refractory epilepsy may benefit from surgical resection of the epileptic focus. Subdural electrodes are implanted to accurately locate the seizure onset and to identify the eloquent areas to be spared. However, the visualization of the subdural electrodes may be limited by current methods. The aim of this work was to assist physicians in the localization of subdural electrodes in relation to anatomical landmarks using co-registration methods and by removing the cerebellum from MRI images. Three patients with refractory epilepsy, in whom subdural electrodes were implanted, were studied. All electrodes were correctly localized in a 3D view over the cortex and their visualization was improved by the removal of the cerebellum. This method promises to be useful in the optimization of the surgical plan.

Nádia Moreira da Silva, Ricardo Rego, João Paulo Silva Cunha

Medical Image Segmentation

Frontmatter
On the Automatic Normalization of Plaque Regions in Ultrasound Images of the Carotid

In order to assess the atherosclerotic plaque disruption risk from B-mode ultrasound images of the carotid, an appropriate normalization of the plaque regions is required. This is usually achieved through the manual selection of two sample regions in the image containing blood and adventitia tissues, which are used as reference.

In this work, we propose a new plaque region normalization method that takes advantage of multiple blood and adventitia reference samples per image, and a method for the automatic selection of these reference samples. Preliminary results are provided in order to demonstrate the capabilities of the proposed methods.

José Rouco, Jorge Novo, Aurélio Campilho
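For reference, a sketch of the conventional reference-based normalization that the work above builds on: grey levels are mapped linearly so that a blood sample goes to 0 and an adventitia sample to a fixed reference value (190 is a value commonly used in the carotid ultrasound literature; treat it, and the sample regions below, as assumptions).

```python
import numpy as np

def normalize_plaque(image, blood_sample, adventitia_sample, adventitia_ref=190.0):
    """Linearly rescale grey levels using the median of two reference samples."""
    blood = np.median(blood_sample)
    advent = np.median(adventitia_sample)
    out = (image - blood) * adventitia_ref / max(advent - blood, 1e-6)
    return np.clip(out, 0, 255)

rng = np.random.default_rng(6)
img = rng.integers(0, 256, size=(256, 256)).astype(float)
norm = normalize_plaque(img, blood_sample=img[:20, :20], adventitia_sample=img[-20:, -20:])
print(norm.min(), norm.max())
```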
Automatic Tear Film Segmentation Based on Texture Analysis and Region Growing

Dry eye syndrome is a prevalent disease characterized by symptoms of discomfort and ocular surface damage. It can be identified by several types of diagnostic tests, one of which consists in capturing the appearance of the tear film by means of the Doane interferometer. Previous research has demonstrated that this manual test can be automated, with the benefits of saving time for experts and providing unbiased results. However, most images are made up of a combination of different patterns which makes their classification into one single category per eye not always possible. In this sense, this paper presents a first attempt to segment tear film images based on the interference patterns, and so to detect multiple categories in each individual subject. The adequacy of the proposed methodology was demonstrated since it provides reliable results in comparison with the practitioners’ annotations.

Beatriz Remeseiro, Katherine M. Oliver, Eilidh Martin, Alan Tomlinson, Daniel G. Villaverde, Manuel G. Penedo
An Improved Segmentation Method for Non-melanoma Skin Lesions Using Active Contour Model

Computer-Aided Diagnosis (CAD) systems are widely used to classify skin lesions in dermoscopic images. The segmentation of the lesion area is the initial and key step to automate this process in a CAD system. In this paper, an improved segmentation algorithm is developed based on the following steps: (1) color space transform to the perception-oriented CIECAM02 color model, (2) a preprocessing step to correct specular reflection, (3) contrast enhancement using a homomorphic transform filter (HTF) and a nonlinear sigmoidal function (NSF), and (4) segmentation with relative entropy (RE) and an active contour model (ACM). To validate the proposed technique, comparisons with three other state-of-the-art segmentation algorithms were performed on 210 non-melanoma lesions. From these experiments, an average true detection rate of 91.01, a false positive rate of 6.35 and an error probability of 7.8 were obtained. These experimental results indicate that the proposed technique is useful for CAD systems to detect non-melanoma skin lesions in dermoscopy images.

Qaisar Abbas, Irene Fondón, Auxiliadora Sarmiento, M. Emre Celebi
Statistical-Based Segmentation of Bone Structures via Continuous Max-Flow Optimization

In this paper, an automatic algorithm for the segmentation of bone structures in CT volumes is developed. This is a complicated task because bones present intensities overlapping with those of surrounding tissues. This overlapping arises from the presence of some diseases and from the different densities within the bones, which yield values similar to those of other tissues such as muscle, fat or some organs. In our implementation, gray-level information and statistical information are combined and used as input to a continuous max-flow algorithm to obtain accurate and fast bone segmentation. Twenty CT images have been automatically segmented and several coefficients, such as the Dice, Jaccard, sensitivity and positive predictive value (PPV) indexes, have been computed. High sensitivity values above 0.97 were obtained, which shows that the results are promising. In addition, low computational times, under 0.6 s for the max-flow algorithm, were obtained, which compares favourably with many algorithms in the literature.

Jose-Antonio Pérez Carrasco, Carmen Serrano Gotarredona, Cristina Suárez-Mejías, Begoña Acha-Piñero
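The evaluation indices mentioned above, written out for a pair of binary volumes (ground truth vs. automatic segmentation), as a straightforward sketch on synthetic data:

```python
import numpy as np

def segmentation_scores(gt, seg):
    gt, seg = gt.astype(bool), seg.astype(bool)
    tp = np.logical_and(gt, seg).sum()
    fp = np.logical_and(~gt, seg).sum()
    fn = np.logical_and(gt, ~seg).sum()
    dice = 2 * tp / (2 * tp + fp + fn)
    jaccard = tp / (tp + fp + fn)
    sensitivity = tp / (tp + fn)                 # true positive rate
    ppv = tp / (tp + fp)                         # positive predictive value
    return dice, jaccard, sensitivity, ppv

rng = np.random.default_rng(7)
gt = rng.random((64, 64, 64)) > 0.7
seg = np.logical_or(gt, rng.random((64, 64, 64)) > 0.95)   # noisy over-segmentation
print(segmentation_scores(gt, seg))
```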
A Portable Multi-CPU/Multi-GPU Based Vertebra Localization in Sagittal MR Images

Accurate vertebra localization is an essential step towards automating the diagnosis of many spinal disorders. In the case of MR images of the lumbar spine, this task becomes more challenging due to the complex shape of the vertebrae and the high variation of soft tissue. In this paper, we propose an efficient framework for spine curve extraction and vertebra localization in T1-weighted MR images. Our method is a fast parametrized algorithm based on three steps: (1) image enhancement, (2) mean-shift clustering [5], and (3) pattern recognition techniques. We also propose an adapted and effective exploitation of new parallel and hybrid platforms, consisting of both central (CPU) and graphics (GPU) processing units, in order to accelerate our vertebra localization method. The latter can exploit both NVIDIA and ATI graphics cards, since we provide CUDA and OpenCL implementations of the vertebra localization steps. Our experiments are conducted on 16 MR images of the lumbar spine. The results achieved a vertebra detection rate of 95% with an acceleration ranging from 4$$\times$$ to 173$$\times$$ thanks to the exploitation of multi-CPU/multi-GPU platforms.

Mohamed Amine Larhmam, Sidi Ahmed Mahmoudi, Mohammed Benjelloun, Saïd Mahmoudi, Pierre Manneback
An Automated Level-set Approach for Identification of Aortic Valve Borders in Short Axis Windows of Transesophageal Echo Sequences (TEE)

The ability to identify and quantify two-dimensional (2D) and three-dimensional (3D) morphological parameters of the aortic valve (AV) apparatus from transesophageal echocardiographic (TEE) imaging constitutes a valuable tool in the diagnosis, treatment and follow-up of patients with aortic valve related diseases, as well as in image-based morphological assessment for surgical interventions, so there is a considerable need to develop standardized frameworks for 2D-3D valve segmentation and shape representation.

The quantification of AV borders and leaflets is still a challenging task, and is commonly based on intensive user interaction, which limits its applicability. We propose a fast, accurate, model-free, automated method for segmenting the valve and extracting morphological parameters. This work integrates level-set techniques to automatically delineate and quantitatively describe aortic geometry in echocardiographic images, a challenging task that has been explored only to a limited extent. The accuracy of the algorithm was tested on 5 patients against “gold standard” manual analysis, showing strong agreement between the two. The proposed technique appears promising for clinical application.

César Veiga, Francisco Calvo, Emilio Paredes-Galán, Pablo Pazos, Carlos Peña, Andrés Íñiguez
Reliable Lung Segmentation Methodology by Including Juxtapleural Nodules

In a lung nodule detection task, parenchyma segmentation is crucial to obtain the region of interest containing all the nodules. Thus, the challenge is to devise a methodology that includes all the lung nodules, particularly those close to the walls, such as the juxtapleural nodules. In this paper, different region growing approaches are proposed for the automatic segmentation of the lung parenchyma. The methodology is organized in five steps: first, the image intensity is corrected to improve the contrast of the lungs. With that, the fat area is obtained, from which the interior of the lung region is automatically derived. Then, the trachea is extracted by a 3D region growing and subtracted from the lung region. The next step is the division of the two lungs, to guarantee that both are separated. Finally, the lung contours are refined to provide appropriate final results.

The methodology was tested on 50 images taken from the LIDC image database, with a large variability and, especially, including different types of lung nodules. In particular, this dataset contains 158 nodules, of which 40 are juxtapleural nodules. Experimental results demonstrate that the method provides accurate lung regions, notably including the centers of 36 of the juxtapleural nodules. For the other 4, although the centers are not included, parts of their areas are retained in the segmentation, which is useful for lung nodule detection.

J. Novo, J. Rouco, A. Mendonça, Aurélio Campilho

Computer-Aided Diagnosis

Frontmatter
Fully Automatic 3D Glioma Extraction in Multi-contrast MRI

This work deals with the fully automatic extraction of a glioma, the most common type of brain tumor, in multi-contrast 3D magnetic resonance volumes. The detection is based on locating the area that breaks the left-right symmetry of the brain. The proposed method uses multi-contrast MRI, where FLAIR and T2-weighted volumes are employed. The algorithm was designed to extract the whole pathology as one region.

The algorithm was tested on 80 volumes from the publicly available BRATS databases containing multi-contrast 3D brain volumes afflicted by a brain tumor. These pathological structures had various sizes and shapes and were located in various parts of the brain. The extraction process was evaluated by the Dice coefficient (0.75). The proposed algorithm also detected and extracted multifocal tumors as separate regions.

Pavel Dvorak, Karel Bartusek
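A minimal sketch of the symmetry cue described above: flip the volume across the mid-sagittal plane and look for regions of large left-right intensity difference. On real data the mid-sagittal plane would first need to be registered, and the FLAIR/T2 intensities would disambiguate which side is pathological (the mirror counterpart is also flagged here); the threshold is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, label

rng = np.random.default_rng(8)
flair = gaussian_filter(rng.random((64, 128, 128)), 2)
flair[30:45, 60:80, 20:45] += 0.5              # synthetic hyperintense lesion in one hemisphere

mirrored = flair[:, :, ::-1]                   # flip across the left-right axis
asymmetry = gaussian_filter(np.abs(flair - mirrored), 3)
candidates = asymmetry > 0.25                  # strongly asymmetric voxels
labels, n = label(candidates)
sizes = np.bincount(labels.ravel())[1:]
print("candidate regions:", n, "largest region (voxels):", sizes.max() if n else 0)
```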
Grading Cancer from Liver Histology Images Using Inter and Intra Region Spatial Relations

Histology image analysis is widely used in cancer studies since it preserves the tissue structure. In this paper, we propose a framework to grade metastatic liver histology images based on the spatial organization within and between regions. After detecting the presence of metastases, we first decompose the image into regions corresponding to the tissue types (healthy, cancerous, vessels and gaps). A sample of each type is further decomposed into the contained biological objects (nuclei, stroma, gaps). The spatial relations between all the pairs of regions and objects are measured using a Force Histogram Decomposition. A specimen is described using a Bag of Words model aggregating the features measured on all its randomly acquired images. The grading is made using a Naive Bayes classifier. Experiments on a dataset of 23 mice with CT26 intrasplenic tumors highlight the relevance of the spatial relations, with a correct grading rate of $$78.95\,\%$$.

Mickaël Garnier, Maya Alsheh Ali, Johanne Seguin, Nathalie Mignet, Thomas Hurtut, Laurent Wendling
eFis: A Fuzzy Inference Method for Predicting Malignancy of Small Pulmonary Nodules

Predicting the malignancy of small pulmonary nodules from computed tomography scans is a difficult and important problem in the diagnosis of lung cancer. This paper presents a rule-based fuzzy inference method for predicting the malignancy rating of small pulmonary nodules. We use the nodule characteristics provided by the Lung Image Database Consortium dataset to determine the malignancy rating. The proposed fuzzy inference method uses the outputs of ensemble classifiers and rules derived from radiologist agreements on the nodules. The results are evaluated in terms of classification accuracy and compared with single classifier methods. We observed that the preliminary results are very promising and the system is open to further development.

Aydın Kaya, Ahmet Burak Can
Degradation Adaptive Texture Classification: A Case Study in Celiac Disease Diagnosis Brings New Insight

Degradation adaptive texture classification has been claimed to be a powerful instrument for classifying images suffering from degradations of dissimilar extent. The main goal of this framework is to separate the image database into smaller sets, each showing a high degree of similarity with respect to degradations. Up to now, only scenarios with different types of synthetic degradations have been investigated. In this work we generalize the adaptive classification framework and introduce new degradation measures to extensively analyze the effects of the approach on real world data for the first time. In particular, computer aided celiac disease diagnosis based on endoscopic images, which has become a major field of research, is investigated. Due to the weak illumination and the downsized sensors, the images often suffer from various distortions, and the type as well as the strength of these degradations varies significantly over the image data. In a large experimental setup, we show that the average classification accuracies can be improved significantly.

Michael Gadermayr, Andreas Uhl, Andreas Vécsei

Retinal Image Analysis

Frontmatter
Optic Disk Localization for Gray-Scale Retinal Images Based on Patch Filtering

In this paper, an optic disk (OD) localization method is proposed for retinal images based on a novel patch filtering approach. The patch filtering is performed sequentially, based on clustering, in two stages. In the first stage, the patches are selected by exploiting an 'isotropic' measure based on the ratio of the maximum and minimum eigenvalues of the moment matrix representing the structure tensor. In the second stage, the patch filtering is based on a saliency measure. Finally, the optic disk is located from the centroids of the selected patches. Promising results, with high localization accuracy, are obtained for low-contrast pathological retinal images from the STARE database.

F. Sattar, Aurélio Campilho, M. Kamel
Automatic Optic Disc Detection in Retinal Fundus Images Based on Geometric Features

Regular eye examinations are the key to limiting the vision loss caused by glaucoma and diabetic retinopathy. Optic disc (OD) detection is of vital importance in developing automated diagnosis systems for these diseases. In this work we present a method for automatic localization and boundary detection of the optic disc in retinal fundus images. In the first step, we rely on the geometric features of the OD and utilize Gaussian and mean curvatures for the localization of the OD. In the second step, we extract a region of interest based on the OD localization. Then, a circular Hough transform is applied to the edges of this extracted region to segment the OD boundary. The experimental results on three public datasets show the efficacy of the proposed method.

Isabel N. Figueiredo, Sunil Kumar
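The sketch below illustrates only the boundary step described above: a circular Hough transform applied to the edges of a region of interest (the curvature-based localization of the ROI is not reproduced). The radius range and the synthetic ROI are placeholders.

```python
import numpy as np
from skimage.draw import disk
from skimage.feature import canny
from skimage.transform import hough_circle, hough_circle_peaks

roi = np.zeros((200, 200))
rr, cc = disk((100, 110), 45)
roi[rr, cc] = 1.0                                   # bright disc standing in for the OD

edges = canny(roi, sigma=2)
radii = np.arange(30, 60, 2)
acc = hough_circle(edges, radii)
_, cx, cy, r = hough_circle_peaks(acc, radii, total_num_peaks=1)
print("estimated OD centre:", (cy[0], cx[0]), "radius:", r[0])
```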
Optic Nerve Head Detection via Group Correlations in Multi-orientation Transforms

Optic nerve head detection is a fundamental step in automated retinal image analysis algorithms. In this paper, we propose a new optic nerve head detection algorithm that is based on the efficient analysis of multi-orientation patterns. To this end, we make use of invertible orientation scores, which are functions on the Euclidean motion group. We apply the classical and fast approach of template matching via cross-correlation, however, we do this in the domain of an orientation score rather than the usual image domain. As such, this approach makes it possible to efficiently detect multi-orientation patterns. The method is extensively tested on public and private databases and we show that the method is generically applicable to images originating from traditional fundus cameras as well as from scanning laser ophthalmoscopes.

Erik Bekkers, Remco Duits, Bart ter Haar Romeny
A Robust Algorithm for Optic Disc Segmentation from Colored Fundus Images

Efficient and accurate optic disc (OD) segmentation is an essential task in the automated diagnosis of different retinal diseases from digital fundus images. Due to the presence of non-uniform illumination, noise, vessels and other lesions in the fundus images, it is challenging to come up with an algorithm which can accurately segment the OD. It is even more difficult to detect the OD accurately in real patient data in which the images are not captured in a well-controlled environment. This paper presents a novel approach for efficient and robust OD segmentation even in the presence of severe retinal pathologies and noise. The proposed system consists of four modules: preprocessing, candidate OD region detection, vessel segmentation, and OD detection based on vessel density in the candidate regions. The proposed system is tested and validated on publicly available fundus image databases and on images gathered locally from real patients. The experimental results validate the proposed system.

Anam Usman, Sarmad Abbas Khitran, M. Usman Akram, Yasser Nadeem
Coupled Parallel Snakes for Segmenting Healthy and Pathological Retinal Arteries in Adaptive Optics Images

In this paper, we propose two important improvements of an existing approach for automatically segmenting the walls of retinal arteries of healthy/pathological subjects in adaptive optics images. We illustrate the limits of the previous approach and propose to (i) modify the pre-segmentation step, and (ii) embed additional information through coupling energy terms in the parallel active contour model. The interest of these new elements, as well as of the pre-segmentation step, is then evaluated against manual segmentations. They improve the robustness against low-contrast walls and the morphological deformations that occur along vessels in the case of pathologies. Notably, this strategy yields a mean error of $$13.4\,\%$$, compared to an inter-physician error of $$17\,\%$$, for the wall thickness, which is the most sensitive measure used. Additionally, this mean error is in the same range as for healthy subjects.

Nicolas Lermé, Florence Rossant, Isabelle Bloch, Michel Paques, Edouard Koch
Automatic Arteriovenous Nicking Identification by Color Fundus Images Analysis

Retinal arteriovenous nicking (AVN) assessment has been considered a very important indicator of cardiovascular and cerebrovascular diseases. A computerized method to infer the presence of AVN in retinal images could increase the reproducibility and accuracy of this analysis, which until now has been performed by ophthalmologists in a subjective and qualitative manner. Therefore, a new approach is proposed for AVN assessment in color fundus images. First, the algorithm segments the blood vessels by means of a multi-scale line detector. The arteriovenous cross points are then detected and classified as AVN presence or absence with an SVM. The proposed approach is clearly efficient in separating normal cases from evident or severe AVN cases.

Carla Pereira, Diana Veiga, Luís Gonçalves, Manuel Ferreira
Detection of Hemorrhages in Colored Fundus Images Using Non Uniform Illumination Estimation

Hemorrhages are retinal lesions caused by different eye diseases such as diabetic retinopathy, hypertensive retinopathy and macular oedema. This paper presents a novel method for the detection of hemorrhages from digital fundus images. The proposed system consists of preprocessing, candidate hemorrhage detection, removal of false regions and hemorrhage detection. The system also includes illumination estimation using a non-uniform circular point grid for the proper detection of hemorrhages. The evaluation of the proposed system is done using publicly available fundus image databases along with some locally collected images. The analysis has been carried out at the image level and the results are compared with existing techniques for hemorrhage detection.

M. Usman Akram, Sarmad Abbas Khitran, Anam Usman, Ubaid ullah Yasin
Automatic Robust Segmentation of Retinal Layers in OCT Images with Refinement Stages

At present, Optical Coherence Tomography (OCT) is a very promising imaging technique used by ophthalmologists for diagnosis, because it provides more information than other classical modalities. Retinal structures can be studied in these images, so image processing-based methods are emerging to extract their information. Prior to any automatic feature extraction process, the delimitation of the retinal layers must be automated. With that purpose, this paper presents an active contour-based method to segment retinal layer boundaries. With respect to previous work, it is remarkable that this proposal includes refinement processes for the segmented layers. Validation by an ophthalmologic expert shows that the method obtains accurate results even when some of these layers present alterations or low definition, which makes it robust, a very important feature.

Ana González-López, Marcos Ortega, Manuel G. Penedo, Pablo Charlón

3D Imaging

Frontmatter
Accurate Multi-View Stereo 3D Reconstruction for Cost-Effective Plant Phenotyping

Phenotyping, which underpins much of plant biology and breeding, involves the measurement of characteristics or traits. Traditionally, this has often been destructive and/or subjective, but the dynamic, objective measurement of traits as they change in response to genetic mutation or environmental influences is an important goal. 3D imaging technologies are increasingly incorporated into mass-produced consumer goods (3D laser scanning, structured light and digital photography) and may represent a cost-effective alternative to current commercial phenotyping platforms. We evaluate their performance, cost and practicability for plant phenotyping and present a 3D reconstruction method for plants from multi-view images acquired with domestic-quality cameras. We exploit an efficient Structure-from-Motion pipeline followed by stereo matching and depth-map merging processes. Experimental results show that the proposed method is flexible, adaptable and inexpensive, and is promising as a generalized groundwork for phenotyping various plant species.

Lu Lou, Yonghuai Liu, Jiwan Han, John H. Doonan
Truncated Signed Distance Function: Experiments on Voxel Size

Real-time 3D reconstruction is a hot topic in current research. Several popular approaches are based on the truncated signed distance function (TSDF), a volumetric scene representation that allows for integration of multiple depth images taken from different viewpoints. Aiming at a deeper understanding of TSDF we discuss its parameters, conduct experiments on the influence of voxel size on reconstruction accuracy and derive practical recommendations.

Diana Werner, Ayoub Al-Hamadi, Philipp Werner
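A compact sketch of TSDF integration for a single depth image and a camera looking down the +z axis (no rotation), to make the roles of the voxel size and truncation distance discussed above explicit. The grid placement, intrinsics and parameters are toy values for illustration, not the paper's configuration.

```python
import numpy as np

def integrate_tsdf(depth, K, voxel_size=0.02, trunc=0.08, grid_dim=64):
    """Update a TSDF volume placed in front of the camera from one depth map."""
    tsdf = np.ones((grid_dim,) * 3)
    weights = np.zeros_like(tsdf)
    # World coordinates of every voxel centre (volume starts 0.2 m from the camera).
    idx = np.arange(grid_dim)
    X, Y, Z = np.meshgrid(idx, idx, idx, indexing="ij")
    pts = np.stack([X, Y, Z], -1) * voxel_size + np.array([-0.64, -0.64, 0.2])
    # Project voxel centres into the image.
    u = np.round(K[0, 0] * pts[..., 0] / pts[..., 2] + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts[..., 1] / pts[..., 2] + K[1, 2]).astype(int)
    valid = (u >= 0) & (u < depth.shape[1]) & (v >= 0) & (v < depth.shape[0])
    d = np.where(valid, depth[np.clip(v, 0, depth.shape[0] - 1),
                              np.clip(u, 0, depth.shape[1] - 1)], 0.0)
    sdf = d - pts[..., 2]                       # signed distance along the optical axis
    update = (d > 0) & (sdf > -trunc)           # skip voxels far behind the surface
    tsdf_new = np.clip(sdf / trunc, -1.0, 1.0)
    tsdf[update] = (tsdf[update] * weights[update] + tsdf_new[update]) / (weights[update] + 1)
    weights[update] += 1
    return tsdf

K = np.array([[525.0, 0, 160], [0, 525.0, 120], [0, 0, 1]])
depth = np.full((240, 320), 0.8)                # flat wall 0.8 m away
vol = integrate_tsdf(depth, K)
print("voxels near the surface:", int(np.sum(np.abs(vol) < 0.2)))
```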
Human Activity Analysis in a 3D Bird’s-eye View

Efficient and reliable human tracking in arbitrary environments is challenging, as there is currently no single solution that can successfully handle all scenarios. In this paper we present a novel approach that uses a top view 3D camera, which employs a simplified yet expressive human body model for effective multi-target detection and tracking. Both bottom-up and high level processes are involved to construct a saliency map with selective visual information. We handle the tracking task in a hierarchical data association framework, and a novel salience occupancy pattern (SOP) descriptor is proposed as the motion representation for action recognition. Our real-time bird’s-eye multi-person tracking and recognition approach is being applied in a human-computer interaction (HCI) research prototype, and has a wide range of applications.

Gang Hu, Derek Reilly, Ben Swinden, Qigang Gao
3D Spatial Layout Propagation in a Video Sequence

Intelligent autonomous systems need detailed models of their environment to achieve sophisticated tasks. Vision sensors provide rich information and are broadly used to obtain these models; in particular, indoor scene understanding has been widely studied. A common initial step to solve this problem is the estimation of the 3D layout of the scene. This work addresses the problem of scene layout propagation along a video sequence. We use a Particle Filter framework to propagate the scene layout obtained using a state-of-the-art technique on the initial frame, and propose how to generate, evaluate and sample new layout hypotheses on each frame. Our intuition is that we can obtain a better layout estimate at each frame through propagation than by running the estimation separately on each image. The experimental validation shows promising results for the presented approach.

Alejandro Rituerto, Roberto Manduchi, Ana C. Murillo, J. J. Guerrero
SASCr3: A Real Time Hardware Coprocessor for Stereo Correspondence

The main focus of this paper is to present the relevant improvements to a real-time hardware co-processor for stereo matching. The approach follows the well-known scheme for string alignment proposed by Needleman and Wunsch, commonly used in bioinformatics. The principal improvement concerns the parallelization of the algorithm in the FPGA design: in a hardware architecture, many resources can work at the same time, avoiding a reduction of system performance. The architecture, which is highly modular, was designed using the Bluespec SystemVerilog development tool and is described in detail. Synthesis and performance results are shown for several degrees of parallelism, using a Lattice ECP3-70 as the target device. The aim of this project is to build a stereo vision system for embedded applications, characterized by low power usage and device cost. The current circuit is an updated version of the SASCr2 design. Performance is benchmarked against the former implementation.

Luca Puglia, Mario Vigliar, Giancarlo Raiconi

Motion Analysis and Tracking

Frontmatter
Adaptive Feature Selection for Object Tracking with Particle Filter

Object tracking is an important topic in the field of computer vision. Commonly used color-based trackers rely on a fixed set of color features such as RGB or HSV and, as a result, fail to adapt to changing illumination conditions and background clutter. These drawbacks can be overcome to an extent by using an adaptive framework which selects, for each frame of a sequence, the features that best discriminate the object from the background. In this paper, we use such an adaptive feature selection method embedded into a particle filter mechanism and show that our tracking method is robust to lighting changes and background distractions. Different experiments also show that the proposed method outperforms other approaches.

Darshan Venkatrayappa, Désiré Sidibé, Fabrice Meriaudeau, Philippe Montesinos
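A skeleton sketch of the general idea above, not the authors' tracker: per frame, the colour channel whose object/background histograms differ most is selected, and a simple particle filter (random-walk motion, Bhattacharyya-based likelihood, resampling) tracks the target in that channel. Channels, histogram sizes and the motion model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)

def channel_histogram(patch, bins=16):
    h, _ = np.histogram(patch, bins=bins, range=(0, 1))
    h = h.astype(float) + 1e-6
    return h / h.sum()

def select_channel(frame, box, margin=10):
    """Pick the colour channel whose object/background histograms differ most."""
    x, y, w, h = box
    best, best_score = 0, -np.inf
    for c in range(frame.shape[2]):
        fg = channel_histogram(frame[y:y + h, x:x + w, c])
        bg = channel_histogram(frame[max(0, y - margin):y + h + margin,
                                     max(0, x - margin):x + w + margin, c])
        score = np.sum((fg - bg) ** 2)          # crude separability measure
        if score > best_score:
            best, best_score = c, score
    return best

def track_step(frame, particles, ref_hist, channel, box=(20, 20)):
    """One predict / weight / resample cycle; particles are (x, y) box corners."""
    particles = particles + rng.normal(0, 3, particles.shape)   # random-walk motion
    limits = np.array([frame.shape[1] - box[0], frame.shape[0] - box[1]])
    particles = np.clip(particles, 0, limits).astype(int)
    weights = np.empty(len(particles))
    for i, (x, y) in enumerate(particles):
        hist = channel_histogram(frame[y:y + box[1], x:x + box[0], channel])
        bc = np.sum(np.sqrt(hist * ref_hist))                   # Bhattacharyya coefficient
        weights[i] = np.exp(-20.0 * (1.0 - bc))
    weights /= weights.sum()
    estimate = weights @ particles
    resampled = particles[rng.choice(len(particles), len(particles), p=weights)]
    return resampled.astype(float), estimate

# Toy usage: a bright square on a dark background.
frame = np.zeros((120, 160, 3))
frame[40:60, 50:70] = 0.9
channel = select_channel(frame, (50, 40, 20, 20))
ref = channel_histogram(frame[40:60, 50:70, channel])
particles = np.tile([50.0, 40.0], (100, 1)) + rng.normal(0, 2, (100, 2))
particles, estimate = track_step(frame, particles, ref, channel)
print("selected channel:", channel, "estimated top-left corner:", estimate)
```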
Exploiting Color Constancy for Robust Tracking Under Non-uniform Illumination

Tracking objects in environments under non-uniform illumination conditions is particularly challenging, as the observed appearance may change in space and time. Adapting the appearance model increases the risk of drift, while filtering out the illumination information through built-in invariance reduces the discriminative capabilities. In this work we adhere to color constancy principles to learn the appearance variation induced by non-uniform illumination, and we use this information to perform location-dependent color corrections to boost tracking performance. The training procedure is carried out in an unsupervised manner by exploiting walking people as illumination probes, and an on-line, non-parametric regression method is developed to densely predict the location-specific color transformations.

Sinan Mutlu, Samuel Rota Bulò, Oswald Lanz
Wavelet Subspace Analysis of Intraoperative Thermal Imaging for Motion Filtering

Intraoperative thermography allows fast capturing of small temperature variations during neurosurgical operations. External influences induce periodic vibrational motion in the whole camera system, superimposing motion artifacts on the signals of high-frequency neuronal activity, heart rate activity and injected perfusion tracers. In this work, we propose a robust method to eliminate the effects induced by this vibrational motion, allowing further inference of clinical information. For this purpose, an efficient wavelet shrinkage scheme is developed based on subspace analysis in the 1D wavelet domain to recognize and remove motion related patterns. The approach does not require any specific motion modeling or image warping, making it fast and preventing image deformations. Promising results from a simulation study and from intraoperative measurements make this a reliable and efficient method for improving subsequent perfusion and neuronal activity analysis.

Nico Hoffmann, Julia Hollmach, Christian Schnabel, Yordan Radev, Matthias Kirsch, Uwe Petersohn, Edmund Koch, Gerald Steiner
A Spatio-temporal Approach for Multiple Object Detection in Videos Using Graphs and Probability Maps

This paper presents a novel framework for object detection in videos that considers both structural and temporal information. Detection is performed by first applying low-level feature extraction techniques in each frame of the video. Then, additional robustness is obtained by considering the temporal stability of videos, using particle filters and probability maps, which encode information about the expected location of each object. Lastly, structural information of the scene is described using graphs, which allows us to further improve the results. As a practical application, we evaluate our approach on table tennis video databases: the UCF101 table tennis shots and an in-house one. The observed results indicate that the proposed approach is robust, showing a high hit rate on the two databases.

Henrique Morimitsu, Roberto M. Cesar Jr., Isabelle Bloch

Robot Vision

Frontmatter
Adopting Feature-Based Visual Odometry for Resource-Constrained Mobile Devices

In many practical applications of mobile devices, self-localization of the user in a GPS-denied indoor environment is required. Among the available approaches, the visual odometry concept enables continuous, precise egomotion estimation in previously unknown environments. In this paper we examine the usual pipeline of a monocular visual odometry system, identifying the bottlenecks and demonstrating how to circumvent the resource constraints in order to implement a real-time visual odometry system on a smartphone or tablet.

Michał Fularz, Michał Nowicki, Piotr Skrzypczyński
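A condensed sketch of the standard feature-based monocular VO pipeline profiled above (detect, match, estimate the essential matrix, recover the relative pose). ORB with brute-force Hamming matching is a typical resource-friendly choice on mobile hardware; the intrinsic matrix and image paths are placeholders.

```python
import cv2
import numpy as np

def relative_pose(img0, img1, K):
    orb = cv2.ORB_create(nfeatures=1000)
    kp0, des0 = orb.detectAndCompute(img0, None)
    kp1, des1 = orb.detectAndCompute(img1, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des0, des1), key=lambda m: m.distance)
    p0 = np.float32([kp0[m.queryIdx].pt for m in matches])
    p1 = np.float32([kp1[m.trainIdx].pt for m in matches])
    E, inliers = cv2.findEssentialMat(p0, p1, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, p0, p1, K, mask=inliers)
    return R, t        # rotation and unit-norm translation direction

# Usage (file paths are placeholders):
# K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])
# R, t = relative_pose(cv2.imread("frame0.png", 0), cv2.imread("frame1.png", 0), K)
# print(R, t)
```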
Strategy for Folding Clothing on the Basis of Deformable Models

In this study, a strategy is given for automatically reshaping an item of clothing from an arbitrary shape into a fixed shape by using its deformable model. The strategy consists of three stages that correspond to the clothing state: unknown (before recognition), unknown to known (recognition), and known (after recognition). In the first stage, a clothing item that is initially placed in an arbitrary shape is picked up and observed after some recognition-aid actions. In the second stage, the clothing state is recognized by matching the deformable clothing model to the observed 3D data [1]. In the third stage, a proper sequence of grasps toward the goal state is selected according to the clothing state. As an instance of this strategy, a folding task was implemented on a humanoid robot. Experimental results using pullovers show that the folding task can be achieved with a small number of grasping steps.

Yasuyo Kita, Fumio Kanehiro, Toshio Ueshiba, Nobuyuki Kita
Multiple Camera Approach for SLAM Based Ultrasonic Tank Roof Inspection

This paper presents an approach for an autonomous system that measures the wall thickness of tank roofs. It consists of a mobile robot, a multi-camera system and an ultrasonic sensor. Simultaneous localization and mapping with six degrees of freedom (SLAM-6D) is performed. The developed algorithms are optimized to work in man-made environments. Because of the great number of line features in such environments, feature extraction is based on line segment detection. We propose a simple method for matching line segments across the images of three cameras. The correspondences are used to calculate 3D lines, which are used for localization. The accuracy of the system is evaluated by measurements on a prepared scene with unique markers.

Christian Freye, Christian Bendicks, Erik Lilienblum, Ayoub Al-Hamadi
On Tracking and Matching in Vision Based Navigation

The paper presents a thorough comparative analysis of the feature tracking and feature matching approaches applied to visual navigation. The evaluation was performed on a synthetic dataset with perfect ground truth to ensure maximum reliability of the results. The presented results include an analysis of both the feature localization accuracy and the computational costs of the different methods. Additionally, the distribution of the uncertainty of the feature localization was analyzed and parametrized.

Adam Schmidt, Marek Kraft, Michal Fularz
Biologically Inspired Vision for Indoor Robot Navigation

Ultrasonic, infrared, laser and other sensors are being applied in robotics. Although combinations of these have allowed robots to navigate, they are only suited to specific scenarios, depending on their limitations. Recent advances in computer vision are turning cameras into useful low-cost sensors that can operate in most types of environments. Cameras enable robots to detect obstacles, recognize objects, obtain visual odometry, and detect and recognize people and gestures, among other possibilities. In this paper we present a completely biologically inspired vision system for robot navigation. It comprises stereo vision for obstacle detection and object recognition for landmark-based navigation. We employ a novel keypoint descriptor which codes the responses of cortical complex cells. We also present a biologically inspired saliency component based on disparity and colour.

M. Saleiro, K. Terzić, D. Lobato, J. M. F. Rodrigues, J. M. H. du Buf
Backmatter
Metadata
Title
Image Analysis and Recognition
Edited by
Aurélio Campilho
Mohamed Kamel
Copyright Year
2014
Electronic ISBN
978-3-319-11755-3
Print ISBN
978-3-319-11754-6
DOI
https://doi.org/10.1007/978-3-319-11755-3
