
2018 | Book

Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

22nd Iberoamerican Congress, CIARP 2017, Valparaíso, Chile, November 7–10, 2017, Proceedings

About this book

This book constitutes the refereed post-conference proceedings of the 22nd Iberoamerican Congress on Pattern Recognition, CIARP 2017, held in Valparaíso, Chile, in November 2017.

The 87 papers presented were carefully reviewed and selected from 156 submissions. The papers feature research results in the areas of pattern recognition, image processing, computer vision, multimedia and related fields.

Table of Contents

Frontmatter
A Citation k-NN Approach for Facial Expression Recognition

The identification of facial expressions with human emotions plays a key role in non-verbal human communication and has applications in several areas. In this work, we propose a descriptor based on the areas and angles of triangles formed by landmarks from face images. We test these descriptors for facial expression recognition by means of an adaptation of the k-Nearest Neighbors classifier called Citation-kNN, in which the training examples come in the form of sets of feature vectors. Comparisons with other state-of-the-art techniques on the CK+ dataset are shown. The descriptor remains robust and precise in the recognition of expressions.

Daniel Acevedo, Pablo Negri, María Elena Buemi, Francisco Gómez Fernández, Marta Mejail
Mining Generalized Closed Patterns from Multi-graph Collections

Frequent approximate subgraph (FAS) mining has become an important technique in data mining. However, FAS miners produce a large number of FASs, affecting the computational performance of the methods that use them. To address this problem, several algorithms for mining only maximal or closed patterns have been proposed in the literature. However, there is no algorithm for mining such FASs from multi-graph collections. For this reason, in this paper, we introduce an algorithm for mining generalized closed FASs from multi-graph collections. The proposed algorithm obtains more patterns than the maximal ones, but fewer than the closed ones, covering patterns with small frequency differences. In our experiments over two real-world multi-graph collections, we show how our proposal reduces the size of the FAS set.

Niusvel Acosta-Mendoza, Andrés Gago-Alonso, Jesús Ariel Carrasco-Ochoa, José Francisco Martínez-Trinidad, José Eladio Medina-Pagola
Edge Detection Based on Digital Shape Elongation Measure

In this paper, we justify the hypothesis that methods based on tools designed for digital images can outperform the standard techniques, which usually come from differential calculus and differential geometry. Herein, we employ the shape elongation measure, a well-known shape-based image analysis tool, to offer a solution to the edge detection problem. The shape elongation measure, as used in this paper, is a numerical characteristic of a discrete shape, computable for all discrete point sets, including digital images. Such a measure does not involve any infinitesimal processes in its computation. The proposed method can be applied to any digital image directly, without the need for any pre-processing.

Faisal Alamri, Joviša Žunić
Mining the Criminal Data of Rio de Janeiro: Analyzing the Impact of the Pacifying Police Units Deployment

Rio de Janeiro is recognized as a violent city, but the pacifying police program was established with the goal of changing this situation. This paper shows that after six years and many pacified communities, the program has had an impact on the decrease in violent deaths, with effects both before and after the deployment of the pacifying police units (UPPs). We also show a spatial dynamic in which the areas neighboring the UPPs present a more pronounced decrease. These results raise new issues in the analysis of public safety.

Cassio Almeida, Sonia F. Gonzalez, Pedro C. L. Souza, Simone D. J. Barbosa, Hélio Lopes
Going Deeper on BioImages Classification: A Plant Leaf Dataset Case Study

In this paper, we present and evaluate the accuracy of a Deep Convolutional Neural Network (DCNN) architecture, compared with other traditional methods, to solve a bioimage classification problem. The main contributions of this work are the application of a DCNN architecture and the comparison of different types of classification and feature extraction techniques applied to a plant leaf image dataset. Furthermore, we go deeper into the analysis of a cross-domain transfer learning approach using a state-of-the-art deep neural network called Inception-v3. Our results show that we manage to classify a subset of 53 species of leaves with a notable mean accuracy of 98.2%.

Daniel H. A. Alves, Luís F. Galonetti, Claiton de Oliveira, Pedro H. Bugatti, Priscila T. M. Saito
Benchmarking Head Pose Estimation in-the-Wild

Head pose estimation systems have quickly evolved from simple classifiers estimating a few yaw angles, to the most recent regression approaches that provide precise 3D face orientations in images acquired “in-the-wild”. Accurate evaluation of these algorithms is an open issue. Although the most recent approaches are tested using a few challenging annotated databases, their published results are not comparable. In this paper we review these works, define a common evaluation methodology, and establish a new state-of-the-art for this problem.

Elvira Amador, Roberto Valle, José M. Buenaposada, Luis Baumela
Forensic Document Examination: Who Is the Writer?

This work presents a baseline system for automatic handwriting identification based only on graphometric features. Initially, a set of 12 features is presented and its extraction process demonstrated. In order to evaluate the efficiency of these features, a selection process was applied, and a smaller group of only 4 features (GS = Goodness Subset) presented the best writer identification rates. Experiments were conducted to evaluate the performance, individually and in groups, of the graphometric features, and to identify the number of writers that significantly affects the accuracy of the system. The accuracy of the system applied to 100 different writers using the GS feature set was 84% (TOP1), 96% (TOP5) and 98% (TOP10). These results are comparable to others in the literature on graphometric features. It can be observed that the relation between the number of writers and accuracy gradually stabilizes, and with 200 writers the results are maintained.

Aline Maria M. M. Amaral, Cinthia O. de Almendra Freitas, Flávio Bortolozzi, Yandre Maldonado e Gomes da Costa
Traversability Cost Identification of Dynamic Environments Using Recurrent High Order Neural Networks for Robot Navigation

In this paper, we present a neural network methodology for learning traversability cost maps to aid autonomous robotic navigation. This proposal is based on the control-theoretic idea of dynamical system identification: we solve the problem of learning and recognizing the pattern that best describes the behavior of the cost function representing the environment, obtaining traversability cost maps as if we were identifying a dynamical system, namely the rough terrain where the robot navigates. Recurrent High Order Neural Networks (RHONN) trained with an Extended Kalman Filter (EKF) are used to identify rough-terrain traversability costs; besides the good results in the identification tasks, we gain the advantages of using a robust machine learning method such as RHONNs. Our proposal gives the robot the capability to generalize the knowledge learned in previous navigation episodes when it navigates similar (but not necessarily equal) environments, so the robot can re-use learned knowledge, generalize it, and recognize hidden states. Experimental results show that our proposed approach can identify and learn very complex cost maps; we demonstrate this with artificially generated maps as well as satellite maps of real terrain.

Nancy Arana-Daniel, Julio Valdés-López, Alma Y. Alanís, Carlos López-Franco
Texture Classification of Phases of Ti-6Al-4V Titanium Alloy Using Fractal Descriptors

Traditionally, the evaluation of metal microstructures and their physical properties is a subject of study in Metallography. Through microscopy, images of the microstructures of the evaluated material are obtained, and a human expert performs their analysis. However, texture is an important image descriptor, as it is directly related to the physical properties of the surface of the object. Thus, in this paper, we propose to use texture analysis methods to automatically classify metal microstructures, more specifically, the phases of a titanium alloy, Ti-6Al-4V. We performed texture analysis using the Bouligand-Minkowski fractal dimension method, which enables us to describe a texture image in terms of its irregularity. Experiments were performed using 3900 texture samples of 2 different phases of the titanium alloy. We used LDA (Linear Discriminant Analysis) to evaluate the computed texture descriptors. The results indicated that the fractal dimension is a feasible tool for the evaluation of the microstructures present in the metal samples.

André R. Backes, Jarbas Joaci de Mesquita Sá Junior
Pyramidal Zernike Over Time: A Spatiotemporal Feature Descriptor Based on Zernike Moments

This paper presents an approach to recognizing human activities in videos through the application of Zernike invariant moments. Instead of computing the regular Zernike moments, our technique, named Pyramidal Zernike Over Time (PZOT), creates a pyramidal structure and uses the Zernike response at different levels to associate subsequent frames, adding temporal information. Finally, the feature response is associated with Gabor filters to generate video descriptions. To evaluate the present approach, experiments were performed on the UCFSports dataset using a standard protocol, achieving an accuracy of 86.05%, comparable to the results achieved by other widely employed spatiotemporal feature descriptors in the literature.

Igor L. O. Bastos, Larissa Rocha Soares, William Robson Schwartz
A Multilabel Extension of LDA Based on the Gram-Schmidt Orthogonalization Procedure

Multilabel classification is a generalization of the traditional unidimensional classification problem; the goal of multilabel classification is to learn a function that maps instances into a set of relevant labels. This article proposes an extension of linear discriminant analysis in the context of multilabel classification. The new method is based on the Gram-Schmidt orthogonalization procedure. The theoretical basis and underlying assumptions of the new model are described, and the method is experimentally evaluated on the Emotions data set for multilabel classification. The analysis of the empirical results supports that this new method is competitive and in some instances superior to the baseline.

Juan Bekios-Calfa, Brian Keith
Linear Modelling of Cerebral Autoregulation System Using Genetic Algorithms

Cerebral autoregulation (CA) represents the brain's capacity to maintain cerebral blood flow constant, independent of the activities performed by an individual. There are pathologies, such as Alzheimer's disease, vascular dementia, ischemic stroke, subarachnoid haemorrhage and severe brain injury, where a degradation of CA can be found. Despite limited understanding of its physiological basis, assessment of CA is relevant for diagnosing, monitoring and treating some of these pathologies. CA modelling is done by using mean arterial blood pressure (MABP) as input and cerebral blood flow velocity (CBFV) as output; the standard model used is transfer function analysis, although CA has also been modelled with support vector machines (SVM) and other methods. In this work a resistive-capacitive (R-C) model is presented whose parameters can be estimated from MABP and CBFV signals through Genetic Algorithms (GA), and its discrimination capacity is compared against SVM models. Signals from 16 healthy subjects were used, with 5 min of spontaneous variations (SV) and 5 min breathing oxygen with 5% CO2 (hypercapnia). Results show that both models can capture CA and the degradation induced by hypercapnia. Using the autoregulation index (ARI), the R-C model discriminates with a ROC area of 0.89 against 0.72 for SVM, thus representing a promising alternative for assessing CA.

Felipe-Andrés Bello Robles, Ronney B. Panerai, Max Chacón Pacheco
Knowledge Transfer for Writer Identification

The technical literature on writer identification usually considers the best-case scenario in terms of data availability, i.e., a database composed of hundreds of writers with several documents per writer is available to train the machine learning models. However, in real-life problems such a database may not be available. In this context, learning from one dataset and transferring the knowledge to another would be extremely useful. In this paper we show how to transfer knowledge from one dataset to another through a framework that uses a writer-independent approach based on dissimilarity. Experiments on five different databases under single- and multi-script environments showed that the proposed approach achieves good results. This is an important contribution, since it makes it possible to deploy the writer identification system even when no data from a particular writer are available for training.

Diego Bertolini, Luiz S. Oliveira, Yandre M. G. Costa, Lucas G. Helal
Real-Time Brand Logo Recognition

The increasing popularity of Social Networks is changing the way people interact. These interactions produce a huge amount of data, opening the door to new strategies and marketing analyses. According to Instagram (https://instagram.com/press/) and Tumblr (https://www.tumblr.com/press), an average of 80 and 59 million photos, respectively, are published every day, and those pictures contain several implicit or explicit brand logos. The analysis and detection of logos in natural images can provide information about how widespread a brand is. In this paper, we propose a real-time brand logo recognition system that outperforms all other state-of-the-art methods on the challenging FlickrLogos-32 dataset. We experimented with 5 different approaches, all based on the Single Shot MultiBox Detector (SSD). Our best results were achieved with the pretrained SSD 512, where we outperform the best previous results on this dataset by 2.5% in F-score and by 7.4% in recall. Besides the higher accuracy, this approach is also relatively fast and can process 19 images per second on a single Nvidia Titan X.

Leonardo Bombonato, Guillermo Camara-Chavez, Pedro Silva
Retinal Vessels Segmentation Based on a Convolutional Neural Network

We present a supervised method for vessel segmentation in retinal images. The segmentation issue has been addressed as a pixel-level binary classification task, where the image is divided into patches and the classification (vessel or non-vessel) is performed on the central pixel of the patch. The input image is then segmented by classifying all of its pixels. A Convolutional Neural Network (CNN) has been used for the classification task, and the network has been trained on a large number of samples, in order to obtain an adequate generalization ability. Since blood vessels are characterized by a linear structure, we have introduced a further layer into the classic CNN including directional filters. The method has been tested on the DRIVE dataset producing satisfactory results, and its performance has been compared to that of other supervised and unsupervised methods.

Nadia Brancati, Maria Frucci, Diego Gragnaniello, Daniel Riccio
DNLM-IIFFT: An Implementation of the Deceived Non Local Means Filter Using Integral Images and the Fast Fourier Transform for a Reduced Computational Cost

In this paper we propose an efficient implementation of the Deceived Non Local Means filter, using Integral Images and the Fast Fourier Transform, named DNLM-IIFFT. The Deceived Non Local Means filter is part of the Deceived Weighted Averaging Filter Framework (DeWAFF), which defines an approach for image abstraction combining unsharp masking for contrast and edge enhancement with weighted averaging filtering for noise reduction. The proposed optimization approach achieved a speedup factor of up to 10.

Saúl Calderón Ramírez, Manuel Zumbado Corrales
Utilizing Deep Learning and 3DLBP for 3D Face Recognition

Methods based on biometrics can help prevent fraud and perform personal identification in day-to-day activities. Automated Face Recognition is one of the most popular research subjects, since it has several important properties, such as universality, acceptability, low cost, and covert identification. In constrained environments, methods based on 2D features can outperform the human capacity for face recognition, but once occlusion and other types of challenges are present, the aforementioned methods do not perform so well. To deal with such problems, 3D data and deep learning based methods can be a solution. In this paper we propose the use of Convolutional Neural Networks (CNN) with low-level 3D local features (3DLBP) for face recognition. The 3D local features are extracted from depth maps captured by a Kinect sensor. Experimental results on the Eurecom database show that this proposal is promising, since, on average, almost 90% of the faces were correctly recognized.

João Baptista Cardia Neto, Aparecido Nilceu Marana
Efficient Hyperparameter Optimization in Convolutional Neural Networks by Learning Curves Prediction

In this work, we present an automatic framework for hyperparameter selection in Convolutional Neural Networks. In order to achieve fast evaluation of several hyperparameter combinations, prediction of learning curves using non-parametric regression models is applied. Considering that the “trend” is the most important feature in any learning curve, our prediction method focuses on trend detection. Results show that our forecasting method is able to capture the complete behavior of future iterations in the learning process.

Andrés F. Cardona-Escobar, Andrés F. Giraldo-Forero, Andrés E. Castro-Ospina, Jorge A. Jaramillo-Garzón
Evaluation of Deep Feedforward Neural Networks for Classification of Diffuse Lung Diseases

Diffuse Lung Diseases (DLDs) are a challenge for physicians due to their wide variety. Computer-Aided Diagnosis (CAD) systems are able to help physicians in their diagnoses by combining information provided by experts with Machine Learning (ML) methods. Among ML techniques, Deep Learning has recently established itself as one of the preferred methods, with state-of-the-art performance in several fields. In this paper, we analyze the discriminatory power of Deep Feedforward Neural Networks (DFNN) when applied to DLDs. We classify six radiographic patterns related to DLDs: pulmonary consolidation, emphysematous areas, septal thickening, honeycomb, ground-glass opacities, and normal lung tissue. We compare DFNN with other ML methods. The obtained results show the high performance of the DFNN method, with an overall accuracy of 99.60%, about 10% higher than the other studied ML methods.

Isadora Cardoso, Eliana Almeida, Héctor Allende-Cid, Alejandro C. Frery, Rangaraj M. Rangayyan, Paulo M. Azevedo-Marques, Heitor S. Ramos
Non-dermatoscopic Image Analysis for the Recognition of Malignant Skin Diseases with Convolutional Neural Network and Autoencoders

Every year, people around the world are affected by different skin diseases or cancer. Nowadays, these can only be detected accurately by clinical analysis and skin biopsy. However, the diagnosis of this malignant disease does not ensure the survival of the patient, since many clinical cases are detected in the terminal phases. Only early diagnosis would increase the life expectancy of patients. In this paper, we propose a method for recognizing malignant skin diseases, identifying malignant lesions in non-dermatoscopic images. For the method, we use a Convolutional Neural Network and propose the use of autoencoders as another classification model that provides more information on the diagnosis. Experiments show that our proposal reaches up to 84.4% accuracy on the well-known ISIC-2016 dataset. In addition, we collected non-dermatoscopic images of skin lesions and developed a new dataset to demonstrate the advantage of our method.

Ricardo Coronado, Alexander Ocsa, Oscar Quispe
Non-stationary Multi-output Gaussian Processes for Enhancing Resolution over Diffusion Tensor Fields

Diffusion magnetic resonance imaging (dMRI) is an advanced technique derived from magnetic resonance imaging (MRI) that allows the study of internal structures in biological tissue. Due to acquisition protocols and hardware limitations of the equipment employed to obtain the data, the spatial resolution of the images is often low. This inherent shortcoming of dMRI is a considerable difficulty because clinical applications are affected. The scientific community has proposed several methodologies for enhancing the spatial resolution of dMRI data, based on interpolation of diffusion tensor fields. However, most of these methods have considerable drawbacks when they interpolate strong transitions, such as crossing fibers. Also, relevant clinical information from tensor fields is modified when interpolation is performed. In this work, we propose a probabilistic methodology for interpolation of diffusion tensor fields using multi-output Gaussian processes with a non-stationary kernel function. First, each tensor is decomposed into shape and orientation features. Then, the model interpolates the features jointly. Results show that the proposed approach outperforms state-of-the-art methods regarding resolution enhancement accuracy on synthetic and real data, when interpolation quality is evaluated with the Frobenius and Riemann metrics. Also, the proposed method demonstrates an adequate characterization of both stationary and non-stationary fields, contrary to previous approaches, whose performance is seriously reduced when complex fields are interpolated.

Jhon F. Cuellar-Fierro, Hernán Darío Vargas-Cardona, Mauricio A. Álvarez, Andrés M. Álvarez, Álvaro A. Orozco
Improving the Classification of Volcanic Seismic Events Extracting New Seismic and Speech Features

This paper presents a study of features extracted from the seismic and speech domains that were used to classify four groups of seismic events of the Llaima volcano, located in the Araucanía Region of Chile. 63 features were extracted from 769 events that were labeled and segmented by experts. A feature selection process based on a genetic algorithm was implemented to select the best descriptors for the classifying structure, formed by one SVM per class. The process identified a few features for each class, and achieved a performance that surpasses the results of previous similar works, reaching over 95% accuracy and showing the importance of the feature selection process in improving classification. These are the newest results obtained from a technology transfer project in which advanced signal processing tools are being applied, in collaboration with the Southern Andes Volcano Observatory (OVDAS), to develop a support system for the monitoring of the Llaima volcano.

Millaray Curilem, Camilo Soto, Fernando Huenupan, Cesar San Martin, Gustavo Fuentealba, Carlos Cardona, Luis Franco
An Automatic System for Computing Malaria Parasite Density in Thin Blood Films

Malaria is a major worldwide health problem, especially in countries with tropical climates and remote areas. In this paper, we present an automatic system for estimating malaria parasite density in thin blood smears. The proposed approach is based on simple image processing methods that can be implemented efficiently even on low-budget devices. The method has been tested on images acquired under different illumination and acquisition setups and has produced encouraging results, achieving a sensitivity of 89.3%.

Allisson Dantas Oliveira, Bruno M. Carvalho, Clara Prats, Mateu Espasa, Jordi Gomez i Prat, Daniel Lopez Codina, Jones Albuquerque
A Local Branching Heuristic for the Graph Edit Distance Problem

In graph matching, the Graph Edit Distance (GED) is a well-known distance measure for graphs, and computing it is an NP-hard minimization problem. Many heuristics have been defined in the literature to give approximate solutions in a reasonable time. Other works have used mathematical programming tools to derive Mixed Integer Linear Program (MILP) models. In this work, a heuristic from the Operations Research domain is proposed and adapted to handle the GED problem. It is called Local Branching and operates over a MILP model, defining neighborhoods in the solution space by adding the local branching constraint. A black-box MILP solver is then used to intensify the search in a neighborhood. This makes the solution search very fast and allows exploring different sub-regions. It also includes a diversification mechanism to escape local solutions, which in this work is modified and improved. Finally, it is evaluated against other heuristics in order to show its efficiency and precision.

Mostafa Darwiche, Romain Raveaux, Donatello Conte, Vincent T’Kindt
Emotion Assessment by Variability-Based Ranking of Coherence Features from EEG

The automatic assessment of emotional states has important applications in human-computer interfaces and marketing. Several approaches use a dimensional characterization of emotional states along with features extracted from physiological signals to classify emotions elicited from complex audiovisual stimuli; however, the classification accuracy remains low. Here, we develop an emotion assessment approach using a variability-based ranking scheme to reveal relevant coherence features from electroencephalography (EEG) signals. Our method achieves higher classification accuracies than comparable state-of-the-art methods and almost matches the performance of multimodal strategies that require information from several physiological signals.

Iván De La Pava, Andres Álvarez-Meza, Alvaro-Angel Orozco
Fusion of Deep Learning Descriptors for Gesture Recognition

In this paper, we propose an approach for dynamic hand gesture recognition that exploits depth and skeleton joint data captured by a Kinect™ sensor. We also select the most relevant points in the hand trajectory with our proposed keyframe extraction method, reducing the processing time per video. In addition, this approach combines the pose and motion information of a dynamic hand gesture, taking advantage of the transfer learning property of CNNs. First, we use the optical flow method to generate a flow image for each keyframe; next, we extract the pose and motion information using two pre-trained CNNs: a CNN-flow for flow images and a CNN-pose for depth images. Finally, we analyze different schemes to fuse both types of information in order to determine the best method. The proposed approach was evaluated on different datasets, achieving promising results and outperforming state-of-the-art methods.

Edwin Escobedo Cardenas, Guillermo Camara-Chavez
Ultimate Leveling Based on Mumford-Shah Energy Functional Applied to Plant Detection

This paper presents a filter based on energy functions applied to ultimate levelings, which are powerful image operators based on numerical residues. Within a multi-scale framework, these operators analyze a given image under a series of levelings. Thus, contrasted objects can be detected if a relevant residue is generated when they are filtered out by one of these levelings. During the residual extraction process, it is very common for undesirable regions of the input image to contain residual information that should be filtered out. These undesirable residual regions often include desirable residual regions, due to the design of ultimate levelings, which consider maximum residues. In this paper, we improve the residual information by filtering out residues extracted from undesirable regions. In order to test our approach, experiments were conducted on a plant dataset, and the results show the robustness of our approach.

Charles Gobber, Wonder A. L. Alves, Ronaldo F. Hashimoto
Sclera Segmentation in Face Images Using Image Foresting Transform

The sclera is the part of the eye surrounding the iris; it is white and contains blood vessels that can be used for biometric recognition. In this paper, we propose a new method for sclera segmentation in face images. The method is divided into two steps: (1) eye location and (2) sclera segmentation. Eyes are located using a Color Distance Map (CDM), the Histogram of Oriented Gradients (HOG) descriptor and a Random Forest (RF). The sclera is segmented by the Image Foresting Transform (IFT). The first step has an accuracy of 95.95%.

Jullyana Fialho Pinheiro, João Dallyson Sousa de Almeida, Geraldo Braz Junior, Anselmo Cardoso de Paiva, Aristófanes Corrêa Silva
Use of Machine Learning to Improve the Robustness of Spatial Estimation of Evapotranspiration

Estimation of the crop water requirement is critical in the optimization of the agricultural production process, since yield and costs are directly affected by this estimation. Nowadays, remote sensing is a useful tool for estimating Evapotranspiration (ET), since it makes it possible to map its spatial and temporal variability. ET models using satellite images have been developed in recent decades, in most cases using the surface energy balance, which has produced good ET representations in different study sites. One of these models is METRIC (Mapping EvapoTranspiration at high Resolution using Internalized Calibration), which estimates ET mainly using data from Landsat 7 and 5 images, and a physical-empirical basis to solve the surface energy balance. The main drawback of the METRIC model is the low robustness in the selection of two parameters called anchor pixels. Even though the rules to select anchor pixels are standardized, the procedure requires a user to choose the area where these pixels will be selected. In this sense, ET estimation is highly sensitive to this selection, producing important differences when different anchor pixels are selected. In this study, a machine learning method is implemented through the GEOBIA (Geographic Object Based Image Analysis) approach for the identification of anchor objects, shifting the focus from pixels to objects. Image segmentation and classification processes are used for an adequate selection of anchor objects, considering spectral and contextual information. The main contribution of this work is to show that it is not necessary to choose an area to select the anchor parameters, improving the numerical stability of the METRIC model and increasing the robustness of the ET estimation. Results were validated by comparison against the original selection of anchor pixels, as well as in-situ ET estimation using data obtained from Surface Renewal stations, in sugar beet crops.

David Fonseca-Luengo, Mario Lillo-Saavedra, L. O. Lagos, Angel García-Pedrero, Consuelo Gonzalo-Martín
Automatic Peripheral Nerve Segmentation in Presence of Multiple Annotators

Peripheral Nerve Blocking (PNB) is a technique commonly used to perform regional anesthesia. The success of PNB procedures lies in the accurate location of the target nerve. Ultrasound images (UI) have frequently been used to locate nerve structures in the context of PNB procedures. This type of image allows a direct visualization of the target nerve and the anatomical structures around it. Notwithstanding, nerve segmentation in UI by an anesthesiologist is not straightforward, since these images are affected by several artifacts; hence, the accuracy of nerve segmentation depends on the anesthesiologist's expertise. In this sense, we face a scenario where we have multiple manual nerve segmentations performed by several anesthesiologists with different levels of expertise. In this paper, we propose a nerve segmentation approach based on supervised learning. For the classification step, we compare two schemes based on the concept of “learning from crowds”, aiming to encode the information of multiple manual segmentations. The attained results show that our approach finds a suitable UI approximation by ensuring the identification of discriminative nerve patterns according to the opinions given by multiple specialists.

Julián Gil González, Andrés M. Álvarez, Andrés F. Valencia, Álvaro A. Orozco
Sparse Hilbert Embedding-Based Statistical Inference of Stochastic Ecological Systems

The growth rate of a population is an important aspect of ecological and biological applications. Since such systems exhibit non-linear stochastic behavior, inferring the model parameters is a challenging task. Approximate Bayesian Computation (ABC) can be used to handle the intractability of the likelihood function caused by the model characteristics. Recently, some methods based on Hilbert Space Embedding (HSE) have been proposed in the context of ABC; nevertheless, the relevance of the observations and simulations is not taken into account. Here, we develop a sparse HSE-based distance, termed SHSED, to compare distributions associated with two random variables through sparse estimations of the densities in a Reproducing Kernel Hilbert Space (RKHS). Namely, SHSED highlights relevant information using a sparse weighted representation of the data within ABC-based inference. Our method improves the inference accuracy of a Ricker map-based population model in comparison with other state-of-the-art ABC-based approaches.

Wilson González-Vanegas, Andrés Alvarez-Meza, Álvaro Orozco-Gutierrez
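HSE-based distances compare two distributions through their kernel mean embeddings in an RKHS; the canonical example is the (squared) Maximum Mean Discrepancy. A minimal 1-D sketch of that baseline distance (SHSED's sparse weighting of samples is not reproduced here):

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-gamma * (x - y) ** 2)

def mmd2(xs, ys, gamma=1.0):
    """Squared Maximum Mean Discrepancy between two 1-D samples:
    the RKHS distance between their empirical kernel mean embeddings."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy
```

In an ABC loop, such a distance between observed and simulated data decides whether a sampled parameter is accepted; identical samples give a distance of zero.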
Fingerprint Presentation Attack Detection Method Based on a Bag-of-Words Approach

Fingerprint-based biometric systems are not entirely secure due to their vulnerability to presentation attacks. In this paper, we propose a new presentation attack detection method based on a Bag-of-Words approach, which, by combining local and global information of the fingerprint, can correctly distinguish bona fide presentations from presentation attacks. The experimental evaluation of our proposal over the well-known LivDet 2011 dataset showed an Average Classification Error of 4.73%, outperforming the state of the art.

Lázaro Janier González-Soler, Leonardo Chang, José Hernández-Palancar, Airel Pérez-Suárez, Marta Gomez-Barrero
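The core of any Bag-of-Words pipeline is quantizing local descriptors against a learned codebook and pooling them into a histogram. A minimal sketch of that step (codebook construction, e.g. by k-means, and the classifier on top are omitted; names are illustrative):

```python
def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword (squared
    Euclidean distance) and return an L1-normalized histogram."""
    hist = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda i: sum((a - b) ** 2
                                        for a, b in zip(d, codebook[i])))
        hist[nearest] += 1
    total = sum(hist) or 1
    return [c / total for c in hist]
```

The resulting fixed-length histogram is what lets a global classifier consume a variable number of local fingerprint descriptors.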
Multi-objective Overlapping Community Detection by Global and Local Approaches

Overlapping community detection in social networks has received much attention recently, and it has been addressed through Multi-objective Optimization Evolutionary Algorithms (MOEAs). In this paper, we introduce a new algorithm, named MOGLAOC, which is based on Pareto-dominance MOEAs and combines global and local approaches for discovering overlapping communities. The experimental evaluation over four classical real-life networks showed that our proposal is promising and effective for overlapping community detection in social networks.

Darian H. Grass-Boada, Airel Pérez-Suárez, Andrés Gago-Alonso, Rafael Bello, Alejandro Rosete
Impulse Response Estimation of Linear Time-Invariant Systems Using Convolved Gaussian Processes and Laguerre Functions

This paper presents a novel method to estimate the impulse response function of Linear Time-Invariant systems from input-output data by means of Laguerre functions and Convolved Gaussian Processes. We define a new non-stationary covariance function that encodes the convolution between the Laguerre functions and the input. The input (excitation) is modelled by a Gaussian Process prior. Thus, we are able to estimate the system's impulse response by performing maximum likelihood estimation over the model hyperparameters. Moreover, the proposed model performs well in missing- and noisy-data scenarios.

Cristian Guarnizo, Mauricio A. Álvarez
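A common way to use Laguerre functions in system identification is to expand the impulse response in the orthonormal Laguerre basis, so only a few coefficients need to be estimated. A stdlib-only sketch of that expansion (the time-scale parameter `a` and the coefficients are illustrative; the paper's GP machinery is not reproduced):

```python
import math

def laguerre_poly(k, x):
    """Laguerre polynomial L_k(x) via the three-term recurrence
    (k+1) L_{k+1}(x) = (2k+1-x) L_k(x) - k L_{k-1}(x)."""
    if k == 0:
        return 1.0
    lm, l = 1.0, 1.0 - x
    for n in range(1, k):
        lm, l = l, ((2 * n + 1 - x) * l - n * lm) / (n + 1)
    return l

def laguerre_function(k, t, a=1.0):
    """Orthonormal Laguerre basis function on [0, inf):
    l_k(t) = sqrt(2a) * exp(-a t) * L_k(2 a t)."""
    return math.sqrt(2 * a) * math.exp(-a * t) * laguerre_poly(k, 2 * a * t)

def impulse_response(coeffs, t, a=1.0):
    """Impulse response as a finite Laguerre expansion
    h(t) = sum_k c_k * l_k(t)."""
    return sum(c * laguerre_function(k, t, a) for k, c in enumerate(coeffs))
```

Because each basis function decays exponentially, a short expansion can represent stable impulse responses compactly.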
A Novel Hybrid Data Reduction Strategy and Its Application to Intrusion Detection

The presence of useless information and the huge amount of data generated by telecommunication services can affect the efficiency of traditional Intrusion Detection Systems (IDSs). This fact encourages the development of data preprocessing strategies for improving the efficiency of IDSs. On the other hand, improving such efficiency through data reduction strategies without affecting the quality of the reduced dataset (i.e., keeping the accuracy of the classification process) represents a challenge. In addition, the runtime of commonly used strategies is usually high. In this paper, a novel hybrid data reduction strategy is presented. The proposed strategy reduces the number of features and instances in the training collection without greatly affecting the quality of the reduced dataset, while improving the efficiency of the classification process. Finally, our proposal compares favorably with other hybrid data reduction strategies.

Vitali Herrera-Semenets, Osvaldo Andrés Pérez-García, Andrés Gago-Alonso, Raudel Hernández-León
Clustering-Based Undersampling to Support Automatic Detection of Focal Cortical Dysplasias

Focal Cortical Dysplasias (FCDs) are cerebral cortex abnormalities that cause epileptic seizures. Recently, machine learning techniques have been developed to detect FCDs automatically. However, dysplasia datasets contain substantially fewer lesional samples than healthy ones, causing a high class imbalance that affects the performance of machine learning algorithms. Here, we propose a novel FCD automatic detection strategy that addresses the class imbalance through relevant sampling with a clustering-based strategy, in cooperation with a bagging-based neural network classifier. We assess our methodology on a public FCD database, using a cross-validation scheme to quantify classifier sensitivity, specificity, and geometric mean. The obtained results show that our proposal achieves both high sensitivity and specificity, improving the classification performance in FCD detection in comparison to state-of-the-art methods.

Keider Hoyos-Osorio, Andrés M. Álvarez, Álvaro A. Orozco, Jorge I. Rios, Genaro Daza-Santacoloma
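The idea behind clustering-based undersampling is to replace the abundant majority class with a small set of representative points, rather than discarding samples at random. A 1-D stdlib sketch under that assumption (a tiny k-means whose centroids become the undersampled majority class; not the paper's exact procedure):

```python
import random

def kmeans_undersample(majority, k, iters=20, seed=0):
    """Undersample a 1-D majority class by replacing it with the k
    centroids found by Lloyd's k-means algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(majority, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in majority:
            # assign each sample to its nearest centroid
            clusters[min(range(k), key=lambda i: abs(x - centroids[i]))].append(x)
        # recompute centroids; keep the old one if a cluster empties
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids
```

Training then proceeds on the centroids plus the untouched minority (lesional) samples, which restores balance while preserving the majority class structure.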
Ultrasonic Assessment of Platelet-Rich Plasma by Digital Signal Processing Techniques

This paper presents the implementation of a non-invasive and non-destructive ultrasonic system for the acoustic characterization of bovine Platelet-Rich Plasma based on advanced digital signal processing techniques. The system comprises computational procedures that allow the spectral estimation of parameters such as the angular coefficients with linear frequency dependence, and the measurement of the speed of sound in regions of interest of the studied samples. The results show that the relationship between acoustic parameters obtained from backscattered ultrasonic signals contributes to the hematological prediction of platelet concentration based on a linear regression model.

Julián A. Villamarín, Yady M. Jiménez, Tatiana Molano, Edgar W. Gutiérrez, Luis F. Londoño, David Gutiérrez, Daniela Montilla
3D Probabilistic Morphable Models for Brain Tumor Segmentation

Segmenting abnormal areas in brain volumes is a difficult task, due to the shape variability that brain tumors exhibit between patients. The main problem is that the common segmentation techniques used in these tasks lack the ability to model the shape structure of the tumor, which leads to inaccurate segmentations. In this paper, we propose a probabilistic framework to model the shape variations of the abnormal tissues relevant to brain tumor segmentation procedures. For this purpose, the database of the Brain Tumor Image Segmentation Challenge (BraTS) 2015 is used. We use a probabilistic extension of the 3D morphable model to learn those tumor variations between patients. Then, from the trained model, we perform non-rigid matching to fit the deformed modeled tumor to the medical image. The experimental results show that by using probabilistic morphable models, the non-rigid properties of the abnormal tissues can be learned, hence improving the segmentation task.

David A. Jimenez, Hernán F. García, Andres M. Álvarez, Álvaro A. Orozco
Multi Target Tracking Using Determinantal Point Processes

Multi-target tracking has many applications, such as video surveillance and event recognition. In this paper, we present a multi-object tracking (MOT) method based on point processes and random finite sets theory. The Probability Hypothesis Density (PHD) filter is a MOT algorithm that deals with missed, false and redundant detections. However, the PHD filter, like other conventional tracking-by-detection approaches, requires some pre-processing technique such as non-maximum suppression (NMS) to eliminate redundant detections. In this paper, we show that using NMS is sub-optimal and therefore propose Determinantal Point Processes (DPP) to select the final set of detections based on quality and similarity terms. We conclude that the PHD filter with DPP outperforms the PHD filter with NMS.

Felipe Jorquera, Sergio Hernández, Diego Vergara
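DPP-based detection selection balances a per-detection quality term against pairwise similarity, so that redundant boxes repel each other. A deliberately simplified greedy sketch in that spirit (quality discounted by overlap with already-selected boxes; this is an assumption for illustration, not the paper's DPP kernel or its MAP inference):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def greedy_dpp(boxes, qualities, sim, k):
    """Greedily pick up to k detections, scoring each candidate by its
    quality discounted by its worst similarity to those already chosen."""
    selected = []
    candidates = list(range(len(boxes)))
    while candidates and len(selected) < k:
        def gain(i):
            penalty = max((sim(boxes[i], boxes[j]) for j in selected),
                          default=0.0)
            return qualities[i] * (1.0 - penalty)
        best = max(candidates, key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Unlike hard-threshold NMS, the quality and similarity terms trade off continuously, which is the property the paper exploits.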
Boosted Projection: An Ensemble of Transformation Models

Computer vision problems usually suffer from very high dimensionality, which can make classifiers hard to learn. One way to overcome this problem is to reduce the dimensionality of the input. This work presents a novel method for tackling this problem, referred to as Boosted Projection. It relies on several projection models, based on Principal Component Analysis or Partial Least Squares, to build a more compact and richer data representation. We conducted experiments on two important computer vision tasks: pedestrian detection and image classification. Our experimental results demonstrate that the proposed approach outperforms many baselines and provides better results than the original partial least squares dimensionality reduction technique.

Ricardo Barbosa Kloss, Artur Jordão, William Robson Schwartz
Full-Quaternion Color Correction in Images for Person Re-identification

Nowadays, video surveillance systems are widely used to safeguard airports, train stations and public places, among others. In these systems, person re-identification is an automated task. However, many problems affect the performance of person re-identification algorithms; for example, illumination change across scenes is one of the essential problems, since it increases the false colors in the person's appearance. Moreover, the extraction of low-level (color) features requires reliable colors from the image of the person's appearance. To this end, we propose a new algorithm that uses full quaternions for color image representation. Quaternionic trigonometry and the Quaternion Fast Fourier Transform are used to improve the person image in the frequency domain. Finally, to view the transformed image, an adaptive gamma function is developed. Experimental results on two datasets (VIPeR and GRID) show consistent improvements over state-of-the-art approaches.

Reynolds León Guerra, Edel B. García Reyes, Francisco J. Silva Mata
Long Short-Term Memory Networks Based in Echo State Networks for Wind Speed Forecasting

Integrating increasing amounts of wind generation requires power system operators to improve their wind forecasting tools. Echo State Networks (ESNs) are a good option for wind speed forecasting because of their capacity to process sequential data, having achieved good performance in different forecasting tasks. However, the simplicity of not training the hidden layer may limit performance. This paper proposes to use an ESN architecture, but replacing its hidden units with LSTM blocks and training the whole network under some restrictions. We tested the proposal by forecasting wind speeds from 1 to 24 h ahead. Results demonstrate that our proposal outperforms ESNs in terms of different error metrics such as MSE, MAE and MAPE.

Erick López, Carlos Valle, Héctor Allende, Esteban Gil
On the Use of Pre-trained Neural Networks for Different Face Recognition Tasks

Deep Convolutional Neural Networks (DCNN) are the state-of-the-art in face recognition. In this paper, we study different representations obtained from a pre-trained DCNN, in order to determine the best way in which they can be used in different tasks. In particular, we evaluate the use of intermediate representations independently or combined with a Fisher Vector approach, or with a Bilinear model. From our study, we found that convolutional features may be more suitable than the features obtained from the last fully connected layers for different applications.

Leyanis López-Avila, Yenisel Plasencia-Calaña, Yoanna Martínez-Díaz, Heydi Méndez-Vázquez
Many-Objective Ensemble-Based Multilabel Classification

This paper proposes a many-objective ensemble-based algorithm to explore the relations among labels in multilabel classification problems. The proposal consists of two phases. In the first, a many-objective optimization method generates a set of candidate components exploring the relations among the labels; in the second, a stacking method aggregates the components for each label. Two versions were conceived, depending on whether or not the relevance of each label is balanced. The balanced version performed well on the recall and F1 metrics, and the unbalanced one on the 1-Hamming loss and precision metrics.

Marcos M. Raimundo, Fernando J. Von Zuben
Bio-Chemical Data Classification by Dissimilarity Representation and Template Selection

The identification and classification of bio-chemical substances are very important tasks in chemical, biological and forensic analysis. In this work, we present a new strategy to improve the accuracy of the supervised classification of this type of data, obtained from different analytical techniques, which combines two processes: first, a dissimilarity representation of the data, and second, the selection of templates for the refinement of the representative samples in each class set. In order to evaluate the performance of our proposal, a comparative study of three approaches is presented. As a baseline, entropy template selection (ETS) is performed in the original feature space and the selected templates are used for training. The underlying concept of the other two alternatives is the combination of dissimilarity representations and ETS. The first alternative performs ETS in the original feature space and uses the selected templates both as prototypes for the generation of the dissimilarity space and as the training set. The second represents the data in the dissimilarity space first, and then performs ETS. The experimental results showed that an adequate combination of the representation in the dissimilarity space and entropy-based template selection outperformed the baseline in accuracy and/or efficiency for the majority of the problems studied.

Victor Mendiola-Lau, Francisco José Silva Mata, Yenisel Plasencia Calaña, Isneri Talavera Bustamante, Maria de Marsico
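A dissimilarity representation maps each sample to the vector of its distances to a set of prototypes (here, the selected templates), turning any dissimilarity measure into an ordinary feature space. A minimal sketch of that mapping (Euclidean distance as the default measure is an illustrative choice):

```python
import math

def dissimilarity_representation(x, templates, dist=None):
    """Represent sample x as its vector of dissimilarities to the
    templates; any distance function can be plugged in."""
    if dist is None:
        dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    return [dist(x, t) for t in templates]
```

Any standard classifier can then be trained on these fixed-length vectors, which is what allows the ETS-selected templates to double as both prototypes and training set.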
Abnormal Event Detection in Video Using Motion and Appearance Information

This paper presents an approach for the detection and localization of abnormal events in pedestrian areas. The goal is to design a model to detect abnormal events in video sequences using motion and appearance information. Motion information is represented by the velocity and acceleration of optical flow, and appearance information by texture and the optical flow gradient. Unlike methods in the literature, our approach provides a general solution that detects both global and local abnormal events. Furthermore, in the detection stage, we propose a classification by local regions. Experimental results on the UMN and UCSD datasets confirm that the detection accuracy of our method is comparable to state-of-the-art methods.

Neptalí Menejes Palomino, Guillermo Cámara Chávez
Efficient and Effective Face Frontalization for Face Recognition in the Wild

Face image alignment is one of the most important steps in a face recognition system, and is directly linked to its accuracy. In this work, we propose a method for face frontalization based on 3D models obtained from 2D images. We first extend the 3D Generic Elastic Model method to make it suitable for real applications; once we have the dense 3D model of a face image, we obtain its frontal projection, introducing a new method for the synthesis of occluded regions. We evaluate the proposal by frontalizing the face images of the LFW database and compare it with other frontalization techniques using different face recognition methods. We show that the proposed method aligns the images effectively and efficiently.

Nelson Méndez, Luis A. Bouza, Leonardo Chang, Heydi Méndez-Vázquez
Path-Gradient – A Theory of Computing Full Intensity-Transition Between Two Points

A major challenge for path-based segmentation methods is to select the optimum scale capturing the total intensity variation across object interfaces without losing small-scale structures. Minimum barrier distance (MBD) attempts to alleviate this issue using a unique path-cost function that computes the maximum intensity variation along the path. Two major concerns with MBD are its high computational complexity and the convoluted trajectory of the optimum path between two points on either side of an object interface, which limit its benefits. Here, we introduce the notion of path-gradient (PG), which exhibits similar behavior to MBD for object segmentation at significantly reduced computation. The formulation of PG allows the addition of a regularization term to the path cost, which improves segmentation at a computation cost still considerably lower than that of regular MBD. Efficient algorithms for computing PG and regularized PG are presented, and their segmentation performance is compared with that of MBD.

Syed Ahmed Nadeem, Eric A. Hoffman, Punam K. Saha
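The MBD path cost is the "barrier" of a path: the difference between the maximum and minimum intensity seen along it, and the distance between two points is the minimum barrier over all connecting paths. A minimal sketch of that cost (paths are given directly as intensity sequences; the efficient grid algorithms the paper discusses are not reproduced):

```python
def barrier_cost(intensities):
    """Barrier of a single path: total intensity variation (max - min)
    along the sequence of intensities the path visits."""
    return max(intensities) - min(intensities)

def minimum_barrier(paths):
    """Minimum barrier distance over a set of candidate paths between
    two points, each path given as its intensity sequence."""
    return min(barrier_cost(p) for p in paths)
```

Note that a longer path through gently varying intensities can have a lower barrier than a short path over a sharp interface, which is exactly what makes MBD (and PG) useful for segmentation.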
A Robust Indoor Scene Recognition Method Based on Sparse Representation

In this paper, we present a robust method for scene recognition that leverages Convolutional Neural Network (CNN) features and sparse coding to create a new representation of indoor scenes. Although CNNs have greatly benefited the fields of computer vision and pattern recognition, convolutional layers adjust weights with a global approach, which might lose important local details such as objects and small structures. Our proposed scene representation relies on both global features, which mostly refer to the environment's structure, and local features, which are sparsely combined to capture characteristics of common objects of a given scene. This new representation is based on fragments of the scene and leverages features extracted by CNNs. The experimental evaluation shows that the resulting representation outperforms previous scene recognition methods on the Scene15 and MIT67 datasets, and performs competitively on SUN397, while being highly robust to perturbations in the input image such as noise and occlusion.

Guilherme Nascimento, Camila Laranjeira, Vinicius Braz, Anisio Lacerda, Erickson R. Nascimento
Deep Convolutional Neural Networks and Noisy Images

The presence of noise is a relevant issue in image feature extraction and classification. In deep learning, the representation is learned directly from the data and, therefore, the classification model is influenced by the quality of the input. However, the ability of deep convolutional neural networks to deal with images whose quality differs from that of the training images is still to be fully understood. In this paper, we evaluate the generalization of models learned by different networks on noisy images. Our results show that noise makes the classification problem harder. However, when image quality is prone to variations after deployment, it might be advantageous to employ models learned using noisy data.

Tiago S. Nazaré, Gabriel B. Paranhos da Costa, Welinton A. Contato, Moacir Ponti
Spiking Hough for Shape Recognition

This paper implements a spiking neural model methodology inspired by the Hough Transform. On-line, event-driven spikes from Dynamic Vision Sensors are evaluated to characterize and recognize the shape of poker signs. The multi-class system, referred to as Spiking Hough, shows good performance on the public POKER-DVS dataset.

Pablo Negri, Teresa Serrano-Gotarredona, Bernabe Linares-Barranco
Region-Based Classification of PolSAR Data Through Kernel Methods and Stochastic Distances

Stochastic distances combined with the Minimum Distance method for region-based classification of Polarimetric Synthetic Aperture Radar (PolSAR) images were successfully verified in Silva et al. (2013). Methods like K-Nearest Neighbors may also adopt stochastic distances and be used for a similar purpose. The present study investigates the use of kernel methods for PolSAR region-based classification. For this purpose, the Jeffries-Matusita stochastic distance between Complex Multivariate Wishart distributions is integrated into a kernel function and then used in Support Vector Machine and graph-based kernel methods. A case study on PolSAR remote sensing image classification is carried out to assess the above-mentioned methods. The results show the superiority of kernel methods in comparison to the other analyzed methods.

Rogério G. Negri, Wallace C. O. Casaca, Erivaldo A. Silva
Hand Posture Recognition Using Convolutional Neural Network

In this work, we present a convolutional neural network-based algorithm for the recognition of hand postures in images acquired by a single color camera. The hand is extracted in advance on the basis of skin color distribution. A neural network-based regressor is applied to locate the wrist. Finally, a convolutional neural network trained on 6000 manually labeled images representing ten classes is executed to recognize the hand posture in a sub-window determined on the basis of the wrist. We show that our model achieves high classification accuracy, including scenarios where a different camera is used for testing. We also show that the convolutional network achieves better results on images pre-filtered by a Gabor filter.

Dennis Núñez Fernández, Bogdan Kwolek
On Semantic Solutions for Efficient Approximate Similarity Search on Large-Scale Datasets

Approximate similarity search algorithms based on hashing were proposed for querying high-dimensional datasets due to their fast retrieval speed and low storage cost. Recent studies promote the use of Convolutional Neural Network (CNN) features with hashing techniques to improve search accuracy. However, some challenges must be solved to find a practical and efficient solution for indexing CNN features, such as the heavy training process needed to achieve accurate query results and the critical dependency on data parameters. Aiming to overcome these issues, we propose a new method for scalable similarity search, Deep frActal based Hashing (DAsH), which computes the best data-parameter values for optimal sub-space projection by exploring the correlations among CNN feature attributes using fractal theory. Moreover, inspired by recent advances in CNNs, we use not only activations of lower layers, which are more general-purpose, but also previous knowledge of the semantic data in the last CNN layer to improve search accuracy. Thus, our method produces a better representation of the data space with lower computational cost and better accuracy. This significant gain in speed and accuracy allows us to evaluate the framework on a large, realistic, and challenging set of datasets.

Alexander Ocsa, Jose Luis Huillca, Cristian Lopez del Alamo
Self-organizing Maps for Motor Tasks Recognition from Electrical Brain Signals

Recently, there has been significant progress and interest in brain–computer interface (BCI) technology as a potential channel of communication and control for the motor disabled, including post-stroke and spinal cord injury patients. Different mental tasks, including motor imagery, generate changes in the electro-physiological signals of the brain, which can be registered non-invasively using electroencephalography (EEG). The success of mental motor imagery classification depends on the choice of features used to characterize the raw EEG signals, and on an adequate classifier. As a novel alternative for recognizing motor imagery tasks in EEG-based BCI, this work proposes the use of self-organizing maps (SOM) for the classification stage. To do so, an experiment was carried out to predict three-class motor tasks (rest versus left motor imagery versus right motor imagery) using spectral power-based features of recorded EEG signals. Three pattern recognition algorithms were applied to classify the data offline: supervised SOM, SOM+k-means, and k-means. The best results were obtained with the SOM trained in a supervised way, with a mean performance of 77% and a maximum of 85% over all classes. The results indicate potential for the development of BCI systems.

Alvaro D. Orjuela-Cañón, Osvaldo Renteria-Meza, Luis G. Hernández, Andrés F. Ruíz-Olaya, Alexander Cerquera, Javier M. Antelis
Semantic Segmentation of Color Eye Images for Improving Iris Segmentation

Iris segmentation under the visible spectrum (VIS) is a topic that has gained attention in much research in recent years, due to an increasing interest in iris recognition at a distance and in non-cooperative environments, with images affected by blur, off-axis gaze, occlusions and specular reflections, among others. In this paper, we propose a new approach to detect the iris region in eye images acquired under VIS. We introduce the semantic information of the different classes of an eye image (such as sclera, pupil, iris and eyebrows, among others) in order to segment the iris region. Experimental results on the UBIRIS v2 database show that the semantic segmentation improves iris segmentation by reducing the intra-class variability, especially for the non-iris classes.

Dailé Osorio-Roig, Annette Morales-González, Eduardo Garea-Llano
A Single-Step 2D Thinning Scheme with Deletion of P-Simple Points

Thinning is a frequently applied technique for producing skeletons from digital binary pictures in a topology-preserving way. Bertrand proposed a two-step thinning scheme that is based on P-simple points. In this paper, we give two sufficient conditions for topology-preserving reductions working on the three possible 2D regular grids. The new conditions combined with parallel thinning strategies and geometrical constraints yield a single-step thinning scheme that deletes solely P-simple points.

Kálmán Palágyi, Péter Kardos
Non Local Means Image Filtering Using Clustering

In this work, we study improvements to the Non Local Means image filter using clustering in the space of patches. Patch clustering is proposed to guide the selection of the best patches for filtering each pixel. Besides clustering, we incorporate spatial coherence by keeping some local patches extracted from a local search window around each pixel. Therefore, for each pixel we use local patches together with non-local ones extracted from the corresponding cluster. The proposed method outperforms the classical Non Local Means filter; moreover, when compared with other methods from the literature based on random sampling, our results confirm the benefits of sampling inside clusters combined with local information.

Alvaro Pardo
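The classical Non Local Means filter that this work builds on replaces each sample by a weighted average of all samples, with weights decaying with the dissimilarity between the surrounding patches. A 1-D sketch of that baseline (the parameter names and values are illustrative; the paper's clustering of patches is not reproduced):

```python
import math

def nlm_1d(signal, patch_radius=1, h=0.5):
    """1-D Non Local Means: each sample becomes a weighted average of
    all samples, weighted by Gaussian-decayed patch dissimilarity."""
    n = len(signal)

    def patch(i):
        # patch around i, with edge samples replicated at the borders
        return [signal[min(max(j, 0), n - 1)]
                for j in range(i - patch_radius, i + patch_radius + 1)]

    out = []
    for i in range(n):
        pi = patch(i)
        weights = [math.exp(-sum((a - b) ** 2
                                 for a, b in zip(pi, patch(j))) / h ** 2)
                   for j in range(n)]
        total = sum(weights)
        out.append(sum(w * s for w, s in zip(weights, signal)) / total)
    return out
```

The cost of the all-pairs patch comparison is what motivates restricting candidates to a cluster of similar patches plus a local window, as the paper proposes.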
Neural Networks for the Reconstruction and Separation of High Energy Particles in a Preshower Calorimeter

Particle detectors have important applications in fields such as high energy physics and nuclear medicine. For instance, they are used in huge particle accelerators to study the elementary constituents of matter. The analysis of the data produced by these detectors requires powerful statistical and computational methods, and machine learning has become a key tool for this. We propose a reconstruction algorithm for a preshower detector, in charge of identifying and classifying the particles spotted by the detector. More importantly, we propose to use a machine learning algorithm to solve the problem of particle identification in difficult cases where the reconstruction algorithm fails. We show that our reconstruction algorithm, together with the machine learning rejection method, is able to identify most of the incident particles. Moreover, we found that machine learning methods greatly outperform the cut-based techniques commonly used in high energy physics.

Juan Pavez, Hayk Hakobyan, Carlos Valle, William Brooks, Sergey Kuleshov, Héctor Allende
Noisy Character Recognition Using Deep Convolutional Neural Networks

Due to degradation and low quality in noisy images, such as natural scene images and text-based CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart), the character recognition problem continues to be extremely challenging. In this work, we study two convolutional neural network approaches (filter learning and architecture optimization) to improve the feature representations of these images through deep learning. We perform experiments on the widely used Street View House Numbers (SVHN) dataset and on a new dataset of CAPTCHAs created by us. Learning filter weights through the back-propagation algorithm with data augmentation, and adding a few locally-connected layers to the Convolutional Neural Network (CNN), obtained promising results on the CAPTCHA dataset (97.36% accuracy for characters and 85.4% for CAPTCHAs) and results very close to the state of the art on the SVHN dataset (97.45% accuracy for digits).

Sirlene Peixoto, Gabriel Gonçalves, Andrea Bianchi, Alceu De S. Brito, William Robson Schwartz, David Menotti
A Flexible Framework for the Evaluation of Unsupervised Image Annotation

Automatic Image Annotation (AIA) consists in assigning keywords to images describing their visual content. A prevalent way to address the AIA task is based on supervised learning. However, the unsupervised approach is a new alternative that makes sense when no manually labeled images are available to train supervised techniques. AIA methods are typically evaluated using supervised learning performance measures; however, applying this kind of measure to unsupervised methods is difficult and unfair. The main restriction is that unsupervised methods use an unrestricted annotation vocabulary while supervised methods use a restricted one. To alleviate this unfair evaluation, in this paper we propose a flexible evaluation framework that allows us to compare the coverage and relevance of the words assigned by unsupervised automatic image annotation (UAIA) methods. We show the robustness of our framework through a set of experiments where we evaluated the output of both unsupervised and supervised methods.

Luis Pellegrin, Hugo Jair Escalante, Manuel Montes-y-Gómez, Mauricio Villegas, Fabio A. González
Unsupervised Local Regressive Attributes for Pedestrian Re-identification

Attribute discovery is a challenging task in computer vision due to uncertainty about the attributes, caused mainly by the lack of semantic meaning in image parts. A usual scheme for attribute discovery is to divide the feature space using binary variables. Alternatively, we can assume the attributes are known and, using expert information, assign attribute degrees beyond only two values. Nonetheless, a binary variable may not be very informative, and we may not have access to expert information. In this work, we propose to discover linear regressive codes using image regions guided by a supervised criterion, where the obtained codes show better generalization properties. We found that the discovered regressive codes can be successfully re-used on other visual datasets. As future work, we plan to explore richer codification structures than linear mappings, considering efficient computation.

Billy Peralta, Luis Caro, Alvaro Soto
A Deep Boltzmann Machine-Based Approach for Robust Image Denoising

A Deep Boltzmann Machine (DBM) is composed of a stack of learners called Restricted Boltzmann Machines (RBMs), which are a specific kind of stochastic energy-based network. In this work, a DBM is applied to robust image denoising by minimizing the contribution of some of its top nodes, called “noise nodes”, which often get excited when noisy pixels are present in the given images. After training the DBM with noisy and clean images, detecting and deactivating the noise nodes allows reconstructing images with high quality, eliminating most of their noise. Results obtained on important public image datasets show the validity of the proposed approach.

Rafael G. Pires, Daniel S. Santos, Gustavo B. Souza, Aparecido N. Marana, Alexandre L. M. Levada, João Paulo Papa
Discovering Bitcoin Mixing Using Anomaly Detection

Bitcoin is a peer-to-peer electronic currency system that has grown in popularity in recent years, reaching a market capitalization of billions of dollars. Due to the alleged anonymity of the Bitcoin ecosystem, it has attracted the attention of criminals. Mixing services are intended to provide further anonymity to the Bitcoin network, making it impossible to link the sender of some money with the receiver. These services can be used for money laundering or to finance terrorist groups without being detected. We propose to model the Bitcoin network as a social network and to use community anomaly detection to discover mixing accounts. We present the first technique for detecting Bitcoin accounts associated with money mixing, and demonstrate our proposal's effectiveness on real data using known mixing accounts.
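The abstract does not detail the anomaly score used. Purely as an illustrative sketch of community-based anomaly detection (not the authors' method), the snippet below scores each account by how many distinct communities its neighbors span, on the intuition that mixing accounts bridge many communities; all names and the scoring rule are assumptions.

```python
def community_spread(adj, communities):
    """Toy anomaly score per node: the number of distinct communities
    among its neighbours (bridging nodes get high scores)."""
    return {v: len({communities[u] for u in nbrs})
            for v, nbrs in adj.items()}
```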

Mario Alfonso Prado-Romero, Christian Doerr, Andrés Gago-Alonso
A Scale-Space Approach for Multiscale Shape Analysis

Currently, given the widespread use of computers throughout society, the task of recognizing visual patterns is becoming increasingly automated, in particular to treat the large and growing amount of digital images available. Two well-referenced shape descriptors are BAS (Beam Angle Statistics) and MFD (Multiscale Fractal Dimension). Results obtained with these shape descriptors on public image databases have shown high accuracy levels, better than many other traditional shape descriptors proposed in the literature. As scale is a key parameter in Computer Vision and approaches based on this concept can be quite successful, in this paper we explore the possibilities of a scale-space representation of BAS and MFD and propose two new shape descriptors: SBAS (Scale-Space BAS) and SMFD (Scale-Space MFD). Both new scale-space descriptors were evaluated on two public shape databases and their performance was compared with the main shape descriptors found in the literature, showing better accuracy in most of the comparisons.

Lucas Alexandre Ramos, Aparecido Nilceu Marana, Luis Antônio de Souza Junior
An Unsupervised Approach for Eye Sclera Segmentation

We present an unsupervised sclera segmentation method for eye color images. The proposed approach operates on a visible-spectrum RGB eye image and does not require any prior knowledge such as eyelid or iris center detection. The eye color input image is enhanced by an adaptive histogram normalization to produce a gray-level image in which the sclera is highlighted. A feature extraction process is involved both in the image binarization and in the computation of scores assigned to each connected component of the foreground. The binarization process is based on clustering and adaptive thresholding. Finally, the foreground components identifying the sclera are selected by analyzing the computed scores and the relative positions of the foreground components. The proposed method ranked 2nd in the Sclera Segmentation and Eye Recognition Benchmarking Competition (SSRBC 2017), providing satisfactory performance in terms of precision.

Daniel Riccio, Nadia Brancati, Maria Frucci, Diego Gragnaniello
An Approach to Clustering Using the Expectation-Maximization and Selection of Attributes ReliefF Applied to Water Treatment Plants process

The water treatment process involves several physico-chemical parameters relevant to decision making and to the identification of water quality scenarios. Some scenarios are evident and can be observed without applying mathematical or statistical techniques; however, others are difficult to distinguish, and computational intelligence techniques are necessary to handle them. In this context, this paper shows the application of the expectation-maximization (EM) algorithm for clustering data from the coagulation process, and of the ReliefF algorithm for determining the importance of the physico-chemical parameters, using the WEKA tool to analyze a historical dataset of a water treatment plant. The results were favorable for identifying the scenarios and for determining the relevance of the parameters related to the process.
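As a minimal sketch of the EM clustering step (the paper itself uses WEKA's implementation on multi-parameter data), the following illustrates EM for a one-dimensional two-component Gaussian mixture; the deterministic initialization and all parameter values are assumptions for the example.

```python
import math

def em_gmm_1d(data, k=2, iters=50):
    """Minimal EM for a 1-D Gaussian mixture with k components."""
    lo, hi = min(data), max(data)
    mu = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]  # spread initial means
    var = [1.0] * k
    pi = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            dens = [pi[j] / math.sqrt(2 * math.pi * var[j])
                    * math.exp(-(x - mu[j]) ** 2 / (2 * var[j])) for j in range(k)]
            s = sum(dens)
            resp.append([d / s for d in dens])
        # M-step: re-estimate mixture weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            pi[j] = nj / len(data)
            mu[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, data)) / nj, 1e-6)
    return mu, var, pi
```

On well-separated data, the estimated means converge to the cluster centers within a few iterations.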

Fábio Cosme Rodrigues dos Santos, André Felipe Henriques Librantz, Renato José Sassi
A Two-Step Neural Network Approach to Passage Retrieval for Open Domain Question Answering

Passage retrieval is an important subtask of question answering. Given a question and a set of candidate passages, the goal is to rank them according to their relevance to the question. This work presents a two-stage approach for solving this problem. Both stages are based on a convolutional neural network architecture with a reduced set of parameters. In the first stage, the network is used to identify the degree of similarity between the question and the candidate answers; in the second stage, the result of the first stage is used to re-rank the answers according to their similarity with the initial best-ranked answer, in such a way that the most similar candidate answers are moved up. This approach is analogous to a pseudo-relevance feedback strategy. The experimental results suggest that the proposed method is competitive with state-of-the-art methods, achieving remarkable performance on three evaluation datasets.
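A minimal sketch of the second-stage idea above: boosting candidates similar to the initially best-ranked answer. In the paper both similarities come from a convolutional network; here they are supplied as inputs, and the linear combination weight `alpha` is an illustrative assumption.

```python
def rerank(candidates, sim_to_question, sim_between, alpha=0.5):
    """Second-stage re-ranking (pseudo-relevance-feedback style).

    candidates      -- list of answer ids
    sim_to_question -- dict id -> first-stage similarity score
    sim_between     -- function (id_a, id_b) -> answer-answer similarity
    alpha           -- weight of the first-stage score (assumed value)
    """
    best = max(candidates, key=lambda c: sim_to_question[c])
    def score(c):
        return alpha * sim_to_question[c] + (1 - alpha) * sim_between(c, best)
    return sorted(candidates, key=score, reverse=True)
```

Candidates similar to the top answer move up even when their first-stage score was mediocre.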

Andrés Rosso-Mateus, Fabio A. González, Manuel Montes-y-Gómez
Ensembles of Multiobjective-Based Classifiers for Detection of Epileptic Seizures

This paper proposes multiobjective-based classifiers to detect epileptic seizures using ensemble approaches, transfer-learning methods, and three alternative feature extraction techniques. Two aspects of the problem were investigated: (1) the relative merit of distinct proposals for synthesizing an ensemble of classifiers, considering all three feature extraction techniques; (2) the potential of an ensemble composed of transfer-learned classifiers. The best-performing blended approaches detected all test seizures, with a high proportion of correctly detected samples inside the seizure interval and a high proportion of time intervals correctly classified as non-seizures.

Fernando S. Beserra, Marcos M. Raimundo, Fernando J. Von Zuben
Class Confusability Reduction in Audio-Visual Speech Recognition Using Random Forests

This paper presents an audio-visual speech classification system based on Random Forest classifiers, aiming to reduce intra-class misclassification problems, which are very common, especially in speech recognition tasks. A novel training procedure is proposed, introducing the concept of Complementary Random Forests (CRF) classifiers. Experimental results over three audio-visual databases show that the proposed system achieves good performance for the different types of input information considered, viz., audio-only, video-only and fused audio-video information. In addition, these results indicate that the proposed method performs satisfactorily over the three databases using the same configuration parameters.

Gonzalo D. Sad, Lucas D. Terissi, Juan C. Gómez
Exploring Image Bit Planes for Video Shot Boundary Detection

The wide availability of digital content and the advances in multimedia technology have leveraged the development of efficient mechanisms for storing, indexing, transmitting, retrieving and visualizing video data. A challenging task is to automatically construct a compact representation of video sequences to help users comprehend the most relevant information present in their content. In this work, we develop and evaluate a novel method for detecting abrupt transitions based on bit planes extracted from the video frames. Experiments are conducted on two public datasets to demonstrate the effectiveness of the proposed method. Results are compared against other approaches in the literature.
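The abstract does not specify how the bit planes are compared; one plausible minimal sketch, assuming grayscale frames (flat lists of 0-255 intensities) and a Hamming-style dissimilarity over the most significant planes, with an illustrative threshold:

```python
def bit_plane(frame, k):
    """Extract the k-th bit plane from a grayscale frame (flat list of 0-255 ints)."""
    return [(p >> k) & 1 for p in frame]

def plane_distance(f1, f2, planes=(7, 6, 5)):
    """Fraction of differing bits over the most significant bit planes."""
    diff = sum(b1 != b2
               for k in planes
               for b1, b2 in zip(bit_plane(f1, k), bit_plane(f2, k)))
    return diff / (len(planes) * len(f1))

def abrupt_transitions(frames, threshold=0.4):
    """Indices i where the dissimilarity between frames i-1 and i exceeds the threshold."""
    return [i for i in range(1, len(frames))
            if plane_distance(frames[i - 1], frames[i]) > threshold]
```

Restricting the comparison to the high-order planes makes the measure less sensitive to small intensity fluctuations than a raw pixel difference.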

Anderson Carlos Sousa e Santos, Helio Pedrini
Convolutional Network for EEG-Based Biometric

The global expansion of biometric systems promotes the emergence of new and more robust biometric modalities. In that context, interest in electroencephalogram (EEG)-based biometrics has been growing in recent years. In this study, a novel approach for EEG representation, based on deep learning, is proposed. The method was evaluated on a database containing 109 subjects, and all 64 EEG channels were used as input to a Deep Convolutional Neural Network. Data augmentation techniques are explored to train the deep network, and the results show that the method is a promising path for representing brain signals, overcoming baseline methods published in the literature.
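The specific augmentation techniques are not detailed in the abstract. A common choice for EEG, shown here purely as an assumed example, is overlapping sliding-window cropping, which multiplies the number of training samples per recording:

```python
def window_crops(signal, win, step):
    """Overlapping fixed-length crops of a multichannel recording
    (list of channels, each a list of samples) -- a common EEG
    data augmentation for training deep networks."""
    n = len(signal[0])
    return [[ch[s:s + win] for ch in signal]
            for s in range(0, n - win + 1, step)]
```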

Thiago Schons, Gladston J. P. Moreira, Pedro H. L. Silva, Vitor N. Coelho, Eduardo J. S. Luz
Skeleton Pruning Based on Elongation and Size of Object’s Limbs and Boundary’s Convexities

We present a new pruning method able to remove peripheral branches of the skeleton of a 2D object without altering more significant branches. The pruning criteria take into account the elongation and size of the object's parts associated with skeleton branches. Only peripheral branches associated with scarcely significant limbs and boundary convexities of the object are removed, so that the object can be recovered satisfactorily from the pruned skeleton. Since removing peripheral branches can create new peripheral branches, pruning is iterated until the skeleton structure becomes stable. The algorithm does not require fine tuning of its parameters and the obtained results are satisfactory.

Luca Serino, Gabriella Sanniti di Baja
Volume Rendering by Stochastic Neighbor Embedding-Based 2D Transfer Function Building

Multidimensional transfer functions (MDTF) allow studying volumetric data in a space built from features of interest. Thus, a transfer function (TF) can be defined as a region in a feature space that assigns optical properties to each voxel, supporting volume rendering. Since voxels belonging to different objects can share feature similarities, segmentation of individual volume structures is not a straightforward task. We present a TF building approach based on a 2D low-dimensional space obtained through dimensionality reduction (DR); namely, we carried out Stochastic Neighbor Embedding (SNE)-based DR from MDTF domains. The outcomes show that our proposal, termed SNETF, outperforms state-of-the-art approaches that use DR techniques in TF domains. The experiments were performed on a synthetic volume and on a standard volumetric tomography. Our method achieved higher separability among objects in the new 2D space while preserving the original distances between voxel samples. Thus, it was possible to obtain a 3D representation of an object of interest within a given volume, an important step toward automating the generation of TFs for volume rendering.

Walter Serna-Serna, Andres M. Álvarez-Meza, Álvaro-Ángel Orozco-Gutierrez
Text Localization in Born-Digital Images of Advertisements

Localizing text in images is an important step in a number of applications and fundamental for optical character recognition. While born-digital text localization might look similar to other complex tasks in this field, it has certain distinct characteristics. Our novel approach combines the individual strengths of two commonly used methods, the stroke width transform and extremal regions, with a method based on edge-based morphological growing. We present a parameter-free method with high flexibility with respect to varying text sizes and colorful image elements. We evaluate our method on a novel image database of different retail prospectuses containing textual product information. Our results show a higher f-score than competing methods on that particular task.

Dirk Siegmund, Aidmar Wainakh, Tina Ebert, Andreas Braun, Arjan Kuijper
Deep Learning Techniques Applied to the Cattle Brand Recognition

The automatic recognition of cattle brands is a need of the government organizations responsible for controlling this activity. To help this process, this work presents a method that uses Deep Learning techniques for extracting features from images of cattle brands and Support Vector Machines for their classification. The method consists of six stages: (a) selection of a database of images; (b) selection of a pre-trained CNN; (c) pre-processing of the images and application of the CNN; (d) extraction of features from the images; (e) training and classification of the images (SVM); (f) evaluation of the results obtained in the classification phase. The accuracy of the method was tested on the database of a City Hall, where it achieved satisfactory results, comparable to other methods reported in the literature, with an Overall Accuracy of 91.94% and a processing time of 26.766 s.

Carlos Silva, Daniel Welfer, Juliano Weber
Efficient Transfer Learning for Robust Face Spoofing Detection

Biometric systems are synonymous with security. However, nowadays, criminals violate them by presenting forged traits, such as facial photographs, to fool their capture sensors (spoofing attacks). In order to detect such frauds, handcrafted methods have been proposed. However, by working with raw data, most of them present low accuracy in challenging scenarios. To overcome such problems, deep neural networks have been proposed and have presented great results in many tasks. Despite being able to work with more robust and high-level features, an issue with such deep approaches is the lack of data for training, given their huge number of parameters. Transfer Learning emerged as an alternative to deal with this problem. In this work, we propose an accurate and efficient approach for face spoofing detection based on Transfer Learning, i.e., using the very deep VGG-Face network, previously trained on large face recognition datasets, to extract robust features of facial images from the Replay-Attack spoofing database. An SVM is trained on the feature vectors extracted by VGG-Face from the training images of the Replay-Attack database in order to detect spoofing. This allowed us to work with such a 16-layer network, obtaining great results without overfitting, while saving time and processing.
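The classification stage of such a pipeline can be sketched as follows. This is only an illustration, not the authors' setup: the toy 2-D vectors stand in for VGG-Face descriptors, and the SVM is a tiny hinge-loss subgradient trainer rather than the solver used in the paper.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1, seed=0):
    """Tiny linear SVM (hinge loss + L2) trained by subgradient descent.
    X: list of feature vectors, y: labels in {-1, +1}. All hyperparameters
    are illustrative assumptions."""
    rng = random.Random(seed)
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin < 1:   # hinge-loss subgradient step
                w = [wj + lr * (y[i] * xj - lam * wj) for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:            # only the regularization shrinkage
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

In the paper's setting, `X` would hold the deep features extracted by the frozen network, which is the essence of the transfer-learning approach: only this light classifier is trained on the target data.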

Gustavo B. Souza, Daniel F. S. Santos, Rafael G. Pires, Aparecido N. Marana, João P. Papa
A Kernel-Based Optimum-Path Forest Classifier

Modeling real-world problems as graphs, together with the problem of non-linear distributions, motivates the idea of applying kernel functions in feature spaces. Roughly speaking, the idea is to seek well-behaved samples in higher-dimensional spaces, where the assumption of linearly separable samples is more likely to hold. In this context, this paper proposes a kernel-based Optimum-Path Forest (OPF) classifier by incorporating kernel functions in both the training and classification steps. The proposed technique was evaluated over a benchmark comprising 11 datasets, and its results outperformed the well-known Support Vector Machines and the standard OPF classifier in some situations.
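One concrete ingredient of such kernelization can be shown in isolation: squared distances in the kernel-induced feature space can be computed without the explicit mapping. A minimal sketch, with the RBF kernel chosen only as an example:

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF (Gaussian) kernel between two vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_distance_sq(x, y, k=rbf):
    """Squared distance between two samples in the feature space induced
    by kernel k, computed via the kernel trick: ||phi(x) - phi(y)||^2 =
    k(x, x) - 2 k(x, y) + k(y, y)."""
    return k(x, x) - 2.0 * k(x, y) + k(y, y)
```

A distance-based classifier such as OPF can then operate on these kernel distances in place of Euclidean ones.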

Luis C. S. Afonso, Danillo R. Pereira, João P. Papa
A Possibilistic c-means Clustering Model with Cluster Size Estimation

Most c-means clustering models have serious difficulties when facing clusters of different sizes and severe outliers. The possibilistic c-means (PCM) algorithm can handle both problems to some extent. However, its recommended initialization, using a terminal partition produced by the probabilistic fuzzy c-means, does not work when severe outliers are present. This paper proposes a possibilistic c-means clustering model that uses only two parameters, independently of the number of clusters, and is able to correctly handle the above-mentioned obstacles. Numerical evaluation involving synthetic and standard test data sets demonstrates the advantages of the proposed clustering model.
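For context, the standard PCM typicality function and prototype update can be sketched as follows (this is the classical formulation, not the paper's specific two-parameter model; the 1-D prototype update is a simplification for brevity):

```python
def pcm_typicality(dist_sq, eta, m=2.0):
    """Possibilistic membership (typicality) of a point to a cluster:
    t = 1 / (1 + (d^2 / eta)^(1/(m-1))).
    dist_sq -- squared distance to the cluster prototype
    eta     -- cluster bandwidth parameter
    m       -- fuzzifier (m > 1)"""
    return 1.0 / (1.0 + (dist_sq / eta) ** (1.0 / (m - 1.0)))

def pcm_update_prototype(points, typicalities, m=2.0):
    """Prototype update: typicality-weighted mean (1-D case)."""
    num = sum((t ** m) * x for t, x in zip(typicalities, points))
    den = sum(t ** m for t in typicalities)
    return num / den
```

Because typicalities decay with distance regardless of other clusters, far outliers receive near-zero weight, which is what gives PCM its robustness.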

László Szilágyi, Sándor M. Szilágyi
Randomized Neural Network Based Signature for Classification of Titanium Alloy Microstructures

This paper presents the application of the randomized neural network based signature, an innovative and powerful texture analysis algorithm, to a relevant problem of metallography, which consists of classifying zones of the titanium alloy Ti-6Al-4V into two categories: “alpha and beta” and “alpha + beta”. The obtained results are very promising, with an accuracy of 98.84% using LDA, and, using SVM, an accuracy of 98.64%, a precision of 99.11% for “alpha and beta”, and a precision of 98.09% for “alpha + beta”. This performance suggests that this texture analysis method is a valuable tool that can be applied to many other problems of metallography.

Jarbas Joaci de Mesquita Sá Junior, André R. Backes, Odemir Martinez Bruno
Pap-smear Image Classification Using Randomized Neural Network Based Signature

This paper presents a state-of-the-art texture analysis method called “randomized neural network based signature” applied to the classification of pap-smear cell images for the Papanicolaou test. For this purpose, we used a well-known benchmark dataset composed of 917 images and compared the aforementioned image signature with other texture analysis methods. The obtained results were promising, presenting an accuracy of 87.57% and an AUC of 0.8983 using LDA and SVM, respectively. These performance values confirm that the randomized neural network based signature can be applied successfully to this important medical problem.

Jarbas Joaci de Mesquita Sá Junior, André R. Backes, Odemir Martinez Bruno
Estimation of Pedestrian Height Using Uncalibrated Cameras

The height of a person can be used as a soft biometrics feature in surveillance scenarios. The automatic estimation of pedestrian height has been addressed mostly with calibrated cameras. We propose a new method for real height estimation in videos from uncalibrated cameras. Our proposal computes the horizon line within the scene, and then the relative height of each person is obtained. We employ the real height distribution of a population to provide the final height value. The process includes an evaluation of silhouette quality in order to improve the results. Experiments were conducted in uncontrolled scenarios, showing the good performance of our method.

Alejandro Valdés-Camejo, Guillermo Aguirre-Carrazana, Raúl Alonso-Baryolo, Annette Morales-González, Francisco J. Silva-Mata, Edel García-Reyes
Semi-supervised Online Kernel Semantic Embedding for Multi-label Annotation

This paper presents a multi-label annotation method that uses a semantic embedding strategy based on kernel matrix factorization. The proposed method, called Semi-supervised Online Kernel Semantic Embedding (SS-OKSE), learns to predict the labels of a document by building a semantic representation of the document features that takes the labels into account, when available. A remarkable characteristic of the algorithm is that it is based on a kernel formulation that allows modeling non-linear relationships. The SS-OKSE method was evaluated under a semi-supervised learning setup for a multi-label annotation task over two text document datasets, and was compared against several supervised and semi-supervised methods. Experimental results show that SS-OKSE exhibits a significant improvement, showing that better modeling can be achieved with an adequate selection/construction of the kernel input representation.

Jorge A. Vanegas, Hugo Jair Escalante, Fabio A. González
Ultrasonic Backscatter Signal Processing Technique for the Characterization of Animal Lymph Node

In recent years, quantitative ultrasonic characterization of biological soft tissues has become an essential tool in the non-invasive, non-destructive assessment of the physical properties of tissue microstructure, due to its potential for estimating acoustic parameters associated with the density characteristics, distribution and heterogeneity of histological samples, as well as for constructing improved quantitative images that support clinical diagnosis. This paper presents the implementation of computational methods based on spectral analysis techniques for the construction of parametric ultrasonic images of the animal suprascapular lymph node, an important tissue for the analysis of animal health risk. The computational algorithms were implemented based on the estimation of the frequency-dependent acoustic attenuation coefficient and the integrated backscatter coefficient (IBC). These procedures automatically processed 400 ultrasonic echoes acquired in a region of interest of 4 cm2 for each lymph node sample, which was exposed to an incident ultrasonic field of 2.25 MHz with a bandwidth of 1 MHz @ −3 dB. The results allowed the parametric identification of nodule structures such as germinal nodules, which are hardly identifiable in conventional qualitative ultrasound images. Finally, the ultrasonic parametric characterization of the biological samples provides potential quantitative indicators that are highly accurate in the estimation of histonormality.

Julian A. Villamarín, Daniela A. Montilla, Olga M. Potosi, Luis F. Londoño, Fabian G. Muñoz, Edgar W. Gutierrez
A Distributed Shared Nearest Neighbors Clustering Algorithm

Current data processing tasks require efficient approaches capable of dealing with large databases. A promising strategy consists in distributing the data over several computers, each of which partially solves the undertaken problem; these partial answers are then integrated to obtain a final solution. We introduce the Distributed Shared Nearest Neighbor clustering algorithm (D-SNN), which works with disjoint partitions of data and produces a global clustering solution that achieves competitive performance with respect to centralized approaches. Our algorithm is suited for large-scale problems (e.g., text clustering) where the data cannot be handled by a single machine due to memory constraints. Experimental results over five data sets show that our proposal is competitive in terms of standard clustering quality measures.
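The shared-nearest-neighbor similarity at the core of SNN-style clustering can be sketched on a single machine as follows (brute-force k-NN; the distributed partitioning and integration steps of D-SNN are not shown, and the mutual-neighbor condition is one common variant):

```python
def knn_lists(points, k):
    """k-nearest-neighbour index sets under Euclidean distance (brute force)."""
    nn = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: sum((a - b) ** 2 for a, b in zip(p, points[j])))
        nn.append(set(order[:k]))
    return nn

def snn_similarity(nn, i, j):
    """Shared-nearest-neighbour similarity: size of the overlap of the two
    k-NN sets, counted only when i and j appear in each other's sets."""
    if j not in nn[i] or i not in nn[j]:
        return 0
    return len(nn[i] & nn[j])
```

Points in the same dense region share many neighbors and thus get high similarity, while points in different regions get zero.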

Juan Zamora, Héctor Allende-Cid, Marcelo Mendoza
Automatic Classification of Optical Defects of Mirrors from Ronchigram Images Using Bag of Visual Words and Support Vector Machines

The Ronchi test is a procedure that generates visual patterns called ronchigrams. These patterns can be used to determine optical characteristics of the surface of mirrors, particularly to quantify and qualify optical aberrations and deformations. This paper presents an automatic method to detect these optical errors of mirrors under the Ronchi test by classifying ronchigram images, using bag of visual words (BoVW) for image representation and support vector machines (SVM) for classification. The ronchigram image data set was obtained from the optical manufacturing laboratory of lenses and mirrors at Universidad de los Llanos. The BoVW approach used Scale-Invariant Feature Transform (SIFT) descriptors as visual words, and a linear SVM was trained for the automatic classification of ronchigrams into optical defects. The classification performance achieved was 0.69 in terms of the accuracy measure. These results show that our proposed approach can be used to detect optical defects of mirrors with high precision in a real scenario of ronchigrams obtained during the manufacturing process of an optical laboratory.
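The BoVW representation step can be sketched as follows, assuming a precomputed codebook (the paper uses SIFT descriptors and a learned vocabulary; the toy 2-D descriptors below are only illustrative):

```python
def quantize(descriptor, codebook):
    """Index of the nearest codeword (visual word) for a local descriptor."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(descriptor, codebook[i])))

def bovw_histogram(descriptors, codebook):
    """Normalized bag-of-visual-words histogram for one image: each local
    descriptor votes for its nearest codeword."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[quantize(d, codebook)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

The resulting fixed-length histogram is what the linear SVM classifies, regardless of how many local descriptors each image produces.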

Daniel Zapata, Angel Cruz-Roa, Andrés Jiménez
Backmatter
Metadata
Title
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
Edited by
Marcelo Mendoza
Sergio Velastín
Copyright Year
2018
Electronic ISBN
978-3-319-75193-1
Print ISBN
978-3-319-75192-4
DOI
https://doi.org/10.1007/978-3-319-75193-1