2014 | Book

Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

19th Iberoamerican Congress, CIARP 2014, Puerto Vallarta, Mexico, November 2-5, 2014. Proceedings

Editors: Eduardo Bayro-Corrochano, Edwin Hancock

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science

About this book

This book constitutes the refereed proceedings of the 19th Iberoamerican Congress on Pattern Recognition, CIARP 2014, held in Puerto Vallarta, Jalisco, Mexico, in November 2014. The 115 papers presented were carefully reviewed and selected from 160 submissions. The papers are organized in topical sections on image coding, processing and analysis; segmentation, analysis of shape and texture; analysis of signal, speech and language; document processing and recognition; feature extraction, clustering and classification; pattern recognition and machine learning; neural networks for pattern recognition; computer vision and robot vision; video segmentation and tracking.

Table of Contents

Frontmatter

Keynote I

An Introduction to Evolutionary Multi-objective Optimization with Some Applications in Pattern Recognition

In this paper, we provide a general introduction to the so-called multi-objective evolutionary algorithms, which are metaheuristic search techniques inspired by natural evolution that are able to deal with highly complex optimization problems having two or more objectives. In the first part of the paper, we provide some basic concepts necessary to make the paper self-contained, as well as a short review of the most representative multi-objective evolutionary algorithms currently available in the specialized literature. After that, a short review of applications of these algorithms in pattern recognition is provided. The final part of the paper presents some possible future research paths in this area, as well as our conclusions.

Carlos A. Coello-Coello
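The comparison at the heart of all such algorithms is Pareto dominance. The sketch below is a minimal illustration of that concept only (hypothetical code, not taken from the keynote; minimization of all objectives is assumed):

```python
import numpy as np

def dominates(a, b):
    """Return True if solution a Pareto-dominates solution b.

    Assumes minimization: a dominates b when it is no worse in every
    objective and strictly better in at least one.
    """
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a <= b) and np.any(a < b))

# Example with two objectives, e.g. error rate and model size:
print(dominates([0.10, 5], [0.12, 7]))  # True: better in both objectives
print(dominates([0.10, 9], [0.12, 7]))  # False: a trade-off, neither dominates
```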

Image Coding, Processing and Analysis

Rotation-Invariant Nonlinear Filters Design

Geometrical distortions are a major problem in image recognition. Composite correlation filters can be used for distortion-invariant image recognition by incorporating rotated versions of the target object. Traditionally, composite filters are designed with linear techniques; however, these filters are sensitive to non-Gaussian noise. For the same purpose, composite nonlinear filters have also been proposed. These filters have a good discrimination capability and are robust to non-Gaussian noise and illumination changes; however, the performance of the filter can degrade as the number of incorporated training images increases. In this paper, we propose a method for designing rotation-invariant composite nonlinear filters. The method tries to maximize the number of objects incorporated into the filter while preserving its performance.

Saúl Martínez-Díaz, Saúl Martínez-Chavelas
Topology-Preserving General Operators in Arbitrary Binary Pictures

A general operator may transform a binary picture by changing both black and white points. Sequential operators traverse the points of a picture, and consider a single point for possible alteration, while parallel operators can alter a set of points simultaneously. An order-independent transition function yields the same sequential operator for arbitrary visiting orders. Two operators are called equivalent if they produce the same result for each input picture. A transition function is said to be equivalent if it specifies a pair of equivalent parallel and sequential operators. This paper establishes a necessary and sufficient condition for order-independent transition functions, a sufficient criterion for equivalent transition functions, and a sufficient condition for topology-preserving parallel general operators in arbitrary binary pictures.

Kálmán Palágyi
Evaluating Robustness of Template Matching Algorithms as a Multi-objective Optimisation Problem

Template matching has multiple applications in different computer vision problems. Image distortions remain the main challenge that template matching algorithms have to overcome, so measuring the robustness of algorithms against distortion conditions is an important task. Moreover, a comparison among template matching algorithms is difficult to achieve due to the lack of a standard evaluation methodology. In this paper, a measurement for quantifying the robustness of template matching algorithms against a single distortion is introduced. In addition, a procedure for comparing template matching algorithms is presented, aiming to become an evaluation standard. The comparison of template matching algorithms is formulated as a multi-objective optimisation problem. Experimental evaluation of the proposed procedure, using the robustness coefficient, is conducted by comparing algorithms based on full search and different similarity measurements.

Jose Bernal, Maria Trujillo, Ivan Cabezas
Estimation of Bubble Size Distribution Based on Power Spectrum

A bubble size distribution gives relevant insight into mixing processes where gas-liquid phases are present. The distribution estimation is challenging since accurate bubble detection in images captured from industrial processes is a complicated task due to varying lighting conditions, which change the appearance of bubbles considerably. In this paper, we propose a new method for estimating the bubble size distribution based on the image power spectrum. The method works by calculating the power spectrum for a number of frequency bins and learning the linear relationship between the power spectrum and the bubble size distribution. Since the detection of individual bubbles is not needed, the proposed method is remarkably faster than traditional approaches. The method was compared to a geometry-based bubble detection method on both synthetic and industrial image data. The proposed method outperformed the detection-based approach, especially in cases where the bubbles were small and their number was high.

Jarmo Ilonen, Tuomas Eerola, Heikki Mutikainen, Lasse Lensu, Jari Käyhkö, Heikki Kälviäinen
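As a rough illustration of the idea, the sketch below computes a radially averaged power spectrum per image and fits a linear map from spectrum bins to known bubble-size histograms by least squares. This is hypothetical code, not the authors' implementation; the function names, bin counts and the plain least-squares choice are all assumptions:

```python
import numpy as np

def radial_power_spectrum(image, n_bins=32):
    """Radially averaged power spectrum of a grayscale image (sketch)."""
    f = np.fft.fftshift(np.fft.fft2(image))
    power = np.abs(f) ** 2
    h, w = image.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2, x - w / 2)
    bins = np.linspace(0, r.max(), n_bins + 1)
    idx = np.digitize(r.ravel(), bins) - 1
    return np.bincount(idx, weights=power.ravel(), minlength=n_bins)[:n_bins]

def fit_linear_model(X, Y):
    """Least-squares map from spectra to bubble-size histograms.

    X: (n_images, n_bins) spectra; Y: (n_images, n_sizes) known
    size distributions. Predict for a new image with spectrum @ W.
    """
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W
```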
Edge-Based Coding Tree Unit Partitioning Strategy in Inter Prediction

The High Efficiency Video Coding (HEVC) standard improves the data compression ratio by 30-40% compared to H.264/MPEG-4 AVC, but it requires a larger number of operations. HEVC uses a quadtree coding structure. The quadtree partitioning process is a high-complexity operation since it employs an exhaustive process, called rate-distortion optimisation, which involves all possible combinations of quadtree partitions. In this paper, a frame partitioning strategy based on motion features is addressed. Motion features are edges extracted using Gaussian smoothing, the Sobel operators, and Otsu's method. The strategy produces CU partitions according to the amount of motion in the content, and reduces the number of operations in the inter prediction mode.

Maria Santamaria, Maria Trujillo
A Robust Tracking Algorithm Based on HOGs Descriptor

A novel tracking algorithm based on matching of filtered histograms of oriented gradients (HOGs) computed in circular sliding windows is proposed. The algorithm is robust to geometrical distortions of a target as well as invariant to illumination changes in scene frames. The proposed algorithm is composed of the following steps: first, a fragment of interest is extracted from the current frame around the predicted coordinates of the target location; second, the fragment is preprocessed to correct illumination changes; third, a geometric structure consisting of disks is constructed to describe the target; finally, filtered histograms of oriented gradients computed over the geometric structures of the fragment and the template are matched. The performance of the proposed algorithm is compared with that of similar state-of-the-art techniques for target tracking in terms of objective metrics.

Daniel Miramontes-Jaramillo, Vitaly Kober, Víctor Hugo Díaz-Ramírez
Fast Heuristics for Eliminating Switching Components in Binary Matrices by 0-1 Flips

Switching components are special patterns in binary matrices that play an essential role in many image processing and pattern analysis tasks. Finding the minimal number of 0s that must be switched to 1s in order to eliminate all switching components is an NP-complete problem. We present two heuristics of a novel type for this problem and show via experiments that they outperform the formerly proposed ones, both in optimality and in running time.

Norbert Hantos, Péter Balázs
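For context, a switching component couples two rows and two columns whose four crossing entries form the pattern [[1,0],[0,1]] or its mirror. The brute-force detector below is only a hypothetical illustration of the objects being eliminated, not the heuristics proposed in the paper:

```python
import numpy as np
from itertools import combinations

def switching_components(m):
    """Enumerate switching components of a binary matrix by brute force.

    A switching component is a pair of rows and a pair of columns whose
    four crossing positions hold [[1,0],[0,1]] or [[0,1],[1,0]].
    O(rows^2 * cols^2) pairs; for illustration only.
    """
    m = np.asarray(m)
    comps = []
    for i1, i2 in combinations(range(m.shape[0]), 2):
        for j1, j2 in combinations(range(m.shape[1]), 2):
            sub = m[np.ix_([i1, i2], [j1, j2])]
            if sub[0, 0] == sub[1, 1] != sub[0, 1] == sub[1, 0]:
                comps.append(((i1, j1), (i2, j2)))
    return comps

print(switching_components([[1, 0], [0, 1]]))  # one switching component
```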
Practical Genericity: Writing Image Processing Algorithms Both Reusable and Efficient

An important topic for the image processing and pattern recognition community is the construction of open-source and efficient libraries. An increasing number of software frameworks are said to be generic: they allow users to write reusable algorithms compatible with many input image types. However, this design choice is often made at the expense of performance. We present an approach to preserving efficiency in a generic image processing framework by leveraging the features of data types. Variants of generic algorithms taking advantage of the properties of image types can be defined, offering an adjustable trade-off between genericity and efficiency. Our experiments show that these generic optimizations can match dedicated code in terms of execution times, and sometimes even outperform routines optimized by hand.

Roland Levillain, Thierry Géraud, Laurent Najman, Edwin Carlinet
Real Time Hardware Accelerator for Image Filtering

Image processing is a field in constant development, and many image filtering algorithms are tested every day; however, the main hurdles to overcome are the difficulty of implementation and the response time on general-purpose processors. When the amount of data is too large, a specific hardware accelerator is required because a software implementation on a generic processor is not fast enough to respond in real time. In this paper, an optimal hardware implementation is proposed for edge extraction and noise reduction of an image in real time. Furthermore, the hardware configuration is flexible, with the ability to select between power and area optimization or speed and performance. The results of the algorithm implementations are reported.

Susana Ortega-Cisneros, Miguel A. Carrazco-Díaz, Adrian Pedroza de-la-Crúz, Juan J. Raygoza-Panduro, Federico Sandoval-Ibarra, Jorge Rivera-Domínguez
FPGA Implementation of a NARX Network for Modeling Nonlinear Systems

This paper presents the FPGA implementation of a NARX neural network for modeling nonlinear systems. The complete neural architecture was implemented in the Verilog language with the Xilinx ISE tool on the Virtex-6 FPGA ML605 Evaluation Kit. All operations, such as data processing, weight connections, multipliers, adders and the activation function, were performed in floating-point format, because it allows high precision in operations of high complexity. Some Xilinx resources, such as multipliers and CORE blocks, were used, and the hyperbolic tangent activation function is realized based on a Taylor series. To validate the implementation results, the NARX network was used to model the inverse characteristics of a power amplifier. The results obtained in simulation and in the FPGA implementation showed a high correspondence.

J. A. Rentería-Cedano, L. M. Aguilar-Lobo, S. Ortega-Cisneros, J. R. Loo-Yau, Juan J. Raygoza-Panduro

Segmentation, Analysis of Shape and Texture

Self-Adaptive Skin Segmentation in Color Images

In this paper, we present a new method for skin detection and segmentation, relying on spatial analysis of skin-tone pixels. Our contribution lies in introducing self-adaptive seeds, from which the skin probability is propagated using the distance transform. The seeds are determined from a local skin color model that is learned on-line from the presented image, without requiring any additional information. This is in contrast to the existing methods, which need a skin sample for the adaptation, e.g., acquired using a face detector. In our experimental study, we obtained an F-score of over 0.85 for the ECU benchmark, which is highly competitive compared with several state-of-the-art methods.

Michal Kawulok, Jolanta Kawulok, Jakub Nalepa, Bogdan Smolka
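A minimal sketch of the propagation idea, with assumed parameter values and a simple exponential decay standing in for the authors' exact propagation rule:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def propagate_skin_probability(skin_prob, seed_thresh=0.9, tau=25.0):
    """Illustrative seed-based propagation (assumed parameters).

    Pixels whose skin-color probability exceeds `seed_thresh` act as
    seeds; the output probability decays with the Euclidean distance
    from the nearest seed, computed via a distance transform.
    """
    seeds = skin_prob >= seed_thresh
    if not seeds.any():
        return np.zeros_like(skin_prob)
    dist = distance_transform_edt(~seeds)  # distance to the nearest seed
    return np.exp(-dist / tau)             # decays away from the seeds
```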
Spoken Emotion Recognition Using Deep Learning

Spoken emotion recognition is a multidisciplinary research area that has received increasing attention over the last few years. In this paper, restricted Boltzmann machines and deep belief networks are used to classify emotions in speech. The motivation lies in the recent success reported using these alternative techniques in speech processing and speech recognition. This classifier is compared with a multilayer perceptron classifier, using spectral and prosodic characteristics. A well-known German emotional database is used in the experiments and two methodologies of cross-validation are proposed. Our experimental results show that the deep method achieves an improvement of 8.67% over the baseline in a speaker independent scheme.

E. M. Albornoz, M. Sánchez-Gutiérrez, F. Martinez-Licona, H. L. Rufiner, J. Goddard
A Comparison of Fused Segmentation Algorithms for Iris Verification

Recent studies show that fusion at the segmentation level is useful for obtaining more robust iris recognition rates compared with simple segmentation. In this paper, we perform sum-rule interpolation on the normalized segmented iris images obtained with the well-known Daugman's algorithm. The normalization process is essentially composed of two parts: iris segmentation, in which the pupillary and limbic polar curves are detected, and iris normalization, in which a normalized representation of the iris texture is created using angular and pupil-to-limbic radial coordinates. For evaluation, we propose an experimental fusion scheme using three automatic segmentation algorithms that have reported good results and are not computationally expensive. The experiments were performed on the CASIA-V3 Interval, CASIA-V4 Thousand and UBIRIS V1 datasets, showing increased recognition accuracy for representative feature extraction algorithms.

Yasiel Sanchez-Gonzalez, Yasser Chacon-Cabrera, Eduardo Garea-Llano
A Linear Time Implementation of k-Means for Multilevel Thresholding of Grayscale Images

In this paper we present a method based on the k-means algorithm for multilevel thresholding of grayscale images. The clustering is computed over the histogram rather than over the full list of pixel intensities. Our implementation runs in time per iteration proportional to the number of bins of the histogram, independent of the size of the image and of the number of clusters/levels, unlike a traditional implementation. Therefore, a large speedup is possible when the number of bins of the histogram is significantly smaller than the number of pixels. In order to achieve that running time, two restrictions were exploited in our implementation: (i) we target only grayscale images and (ii) thresholding does not use spatial information.

Pablo Fonseca, Jacques Wainer
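The core trick can be sketched as follows (illustrative code, not the authors' implementation): cluster the 256 histogram bin values, weighting each bin by its pixel count, so an iteration touches bins rather than pixels. The naive assignment below is O(bins × k); the paper's implementation additionally removes the dependence on k:

```python
import numpy as np

def kmeans_histogram(image, k, n_iter=50):
    """Multilevel thresholding via k-means on the grayscale histogram.

    Assumes an 8-bit grayscale image. Each iteration works on the 256
    histogram bins instead of the pixels; bins are weighted by their
    pixel counts when cluster centers are recomputed.
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    levels = np.arange(256, dtype=float)
    centers = np.linspace(0, 255, k)  # simple initialization
    for _ in range(n_iter):
        # assign every bin value to its nearest center
        labels = np.argmin(np.abs(levels[:, None] - centers[None, :]), axis=1)
        for c in range(k):
            mask = labels == c
            if hist[mask].sum() > 0:
                # weighted mean of the bin values in cluster c
                centers[c] = np.average(levels[mask], weights=hist[mask])
    # thresholds lie halfway between consecutive sorted centers
    centers.sort()
    return (centers[:-1] + centers[1:]) / 2
```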
Angle Estimation Using Hahn Moments for Image Analysis

We present a new approach for angle estimation in binary images based on Hahn moments, which provides an approximate estimate with short computational times. The proposed method retrieves the angle formed between a reference point and another point through multiple linear regression over a set of Hahn moments obtained from a training database. Finally, we discuss the performance of our approach under noise conditions and scale changes.

C. Camacho-Bello, J. J. Báez-Rojas
A Fast Gabor Filter Approach for Multi-Channel Texture Feature Discrimination

Texture is a very important concept for many image understanding and pattern classification applications. The analysis of texture can be performed within the multi-channel filtering theory, a classical theory of texture perception based on the early stages of the human visual system. This approach decomposes an image into a set of responses given by a bank of Gabor filters that covers the spatial-frequency domain in a nearly uniform manner. The approach depends on the image dimensions, and the number of kernels in a bank of Gabor filters varies according to the number of combinations of frequencies and orientations. In many practical applications, this large number of combinations quickly makes the computation of the whole bank of filters unfeasible. To ease this problem, in this paper we propose a multi-channel filtering approach where the Gabor bank for texture discrimination is computed in parallel on a graphics processing unit (GPU). Experimental results show a speedup of 8.78 times for feature extraction when compared against the corresponding CPU-based approach.

Antonio Carlos Sobieranski, Rodrigo T. F. Linhares, Eros Comunello, Aldo von Wangenheim
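For reference, a single Gabor kernel and the combinatorial growth of a bank can be sketched as below. This is an illustrative parameterization (the sigma-to-wavelength ratio is an assumption); the paper's contribution is the GPU evaluation of such banks, not this CPU code:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a 2-D Gabor kernel (illustrative parameterization)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

# The bank grows as |frequencies| x |orientations|, which is what makes
# exhaustive filtering costly and motivates the GPU parallelization.
bank = [gabor_kernel(31, w, t, sigma=0.56 * w)
        for w in (4, 8, 16, 32)
        for t in np.linspace(0, np.pi, 6, endpoint=False)]
print(len(bank))  # 24 kernels
```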
Color Skin Segmentation Based on Non-linear Distance Metrics

In this paper we present a semiautomatic method for skin identification in video sequences. The user trains the system by selecting, in a frame, some typical positive skin pixels, which are used as a reference for the construction of a nonlinear distance metric. In this learning process the global optimum is obtained by induction, employing higher polynomial terms of the Mahalanobis distance to extract nonlinear features of the skin pattern distributions. These nonlinear features are then used to classify the frames captured from the camera, identifying all skin and non-skin regions in the scene. We adopt a strategy that enables this method to run in real time after some iterations. We also compare our classification method against the vector norm (L2) and the Mahalanobis distance, showing a better classification of the skin patterns.

Antonio Carlos Sobieranski, Vito F. Chiarella, Eduardo Barreto-Alexandre, Rodrigo T. F. Linhares, Eros Comunello, Aldo von Wangenheim
Partial Shape Matching and Retrieval under Occlusion and Noise

Shape information has proven to be useful in many computer vision applications. In this work, a self-contained shape descriptor for open and closed contours is proposed, along with a partial shape matching method robust to partial occlusion and noise in the contour. Both the shape descriptor and the matching method are invariant to rotation and translation. Experiments were carried out on the Shapes99 and Shapes216 datasets, where contour segments of different lengths were removed to obtain partial occlusion as high as 70%. For the highest occlusion levels the proposed method outperformed other popular shape description methods, with up to 50% higher bull's eye score.

Leonardo Chang, Miguel Arias-Estrada, José Hernández-Palancar, L. Enrique Sucar

Keynote II

Searching for Patterns in Imbalanced Data
Methods and Alternatives with Case Studies in Life Sciences

The prime motivation for pattern discovery and machine learning research has been the collection and warehousing of large amounts of data in many domains, such as life sciences and industrial processes. Examples of the unique problems that have arisen are situations where the data is imbalanced. The class imbalance problem corresponds to situations where the majority of cases belong to one class and a small minority belongs to the other, which in many cases is equally or even more important. To deal with this problem, a number of approaches have been studied in the past. In this talk we provide an overview of some existing methods and present novel applications that are based on identifying the inherent characteristics of one class vs. the other. We present the results of a number of studies focusing on real data from life science applications.

A. Fazel Famili

Analysis of Signal, Speech and Language

The Place Theory as an Alternative Solution in Automatic Speech Recognition Tasks

Recently, parametric representations based on cochlear behavior have been used in several studies related to Automatic Speech Recognition (ASR). This paper shows how the Lesser and Berkeley cochlea model, an alternative solution reported in the state of the art, can be applied to ASR tasks. We propose an approach that considers a new way to construct the filter bank used in the parametric representation from which MFCCs are extracted, and then use this filter bank distribution to obtain a new frequency-domain representation of speech. It is important to note that standard MFCC parameters use the Mel scale to create the filter bank; here, the central frequencies of the filter bank were derived from the cochlear model, substituting the Mel scale function. A performance of 98.5% was reached on a task of isolated digits pronounced by 5 different speakers in Spanish, and on the SUSAS corpus with neutral speech records, with some advantages in comparison with MFCC.

José Luis Oropeza-Rodríguez, Sergio Suárez-Guerra, Mario Jiménez-Hernández
On the Use of Locality Sensitive Hashing for Audio Following

Audio Following (AF) is the process of mapping a musician's performance, usually in real time, to a base performance that is used as a reference. Such a base performance is considered the "correct performance", and thus the live performance must be aligned to it. The objective of AF is to track the musician's position throughout the performance. We present a novel approach to AF that uses a locality sensitive hashing (LSH) based index to perform this task. First, we obtain the Audio Fingerprint (AFP) of the base performance. Then, the obtained AFP is indexed using LSH; this AFP is used as a reference to align any other performance of the same music. Next, we obtain half-a-second sub-AFPs of the performance being followed, and their corresponding positions in the reference AFP are searched for by querying the index. The system was tested on a set of 22 pianists playing music by Chopin, with very good results when comparing the obtained alignment with the ideal alignment.

Luis F. Guzmán, Antonio Camarena-Ibarrola
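A toy sign-random-projection LSH index over fingerprint frames conveys the retrieval step. This is a generic sketch under an assumed data layout (one feature vector per reference frame), not the paper's exact hashing scheme:

```python
import numpy as np
from collections import defaultdict

class RandomProjectionLSH:
    """Toy sign-random-projection LSH index (not the paper's scheme)."""

    def __init__(self, dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))
        self.table = defaultdict(list)

    def _key(self, v):
        # one bit per hyperplane: which side of the plane v falls on
        return tuple((self.planes @ v > 0).astype(int))

    def index(self, vectors):
        # vectors[i]: feature vector of the reference frame at position i
        for pos, v in enumerate(vectors):
            self.table[self._key(v)].append(pos)

    def query(self, v):
        # candidate reference positions for a live sub-fingerprint frame
        return self.table.get(self._key(v), [])
```

In a following loop, each live sub-fingerprint would be hashed and the returned reference positions used as hypotheses for the musician's current position.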
Stationary Signal Separation Using Multichannel Local Segmentation

In this work, we study the influence of locally stationary segments as a preprocessing stage for separating stationary and non-stationary segments. To this end, we compare three different segmentation approaches, namely (i) cumulative-variance-based segmentation, (ii) PCA-based segmentation, and (iii) HMM-based segmentation. Results are measured as the true and false detection probabilities, and also as the ratio between the real and estimated number of segments. Finally, to achieve the separation, we use Analytic Stationary Subspace Analysis (ASSA), and results are measured as the correlation between the true and the estimated stationary sources. In this case, we also compare against the best possible ASSA solution. Results show that the inclusion of locally stationary segments can enhance, or at least achieve, optimal estimation of stationary sources.

C. Castro-Hoyos, F. M. Grisales-Franco, J. D. Martínez-Vargas, Carlos D. Acosta-Medina, Germán Castellanos-Domínguez
On the Use of Evolutionary Programming for Combinational Logic Circuits Design

In this paper, we propose a new methodology to solve the combinational logic circuit design problem by using evolutionary programming. We have implemented a set of evolutionary operators to evolve combinational circuits. Our method uses a set of examples described by their truth table, and it automatically generates a combinational logic function that represents this set. We test our system using some examples from the literature, and we also present a set of examples addressing more complex problems to validate the performance of our method.

Marco A. Contreras-Cruz, Victor Ayala-Ramirez, Paola B. Alvarado-Velazco
From Speech Quality Measures to Speaker Recognition Performance

In this work, a study of the relationship between speech Quality Measures (QM) and speaker recognition performance is presented. To accomplish this, four objective non-intrusive QM were selected (KLPC, KCEP, HD, P563), based on the relationship between their features and the speaker recognition process. Experiments were performed on a male speech corpus under several additive noise conditions. A statistical analysis of the data distributions related to quality, signal-to-noise ratio and speaker recognition scores was carried out. Interesting conclusions were obtained regarding the relationship between QM and speaker recognition scores for the 20 noisy environments evaluated.

Claudia Bello, Dayana Ribas, José R. Calvo, Carlos A. Ferrer
Temporal Information in a Binary Framework for Speaker Recognition

In recent years, a simple representation of a speech excerpt as a binary matrix has been proposed, allowing easy access to the speaker discriminant information. In addition to the time-related abilities of this representation, it also allows the system to work with a temporal information representation based on sequential changes present in the binary representation. A new temporal representation is proposed to be added to speaker recognition systems. A new specificity selection approach, using a mask in the cumulative vector space, is also proposed. This aims to increase effectiveness in the speaker binary key paradigm. The experimental validation, done on the NIST-SRE framework, demonstrates the efficiency of the proposed solutions, which show an EER improvement of 7%. The combination of the i-vector and binary approaches, using the proposed methods, showed the complementarity of the discriminatory information exploited by each of them.

Gabriel Hernández-Sierra, José R. Calvo, Jean-François Bonastre
Artificial Neural Networks for Acoustic Lung Signals Classification

The classification of acoustic lung signals for the respiratory disease diagnosis problem is studied in the present work. Models based on artificial neural networks, trained using Mel Frequency Cepstral Coefficients, are employed in this task. Results show that neural networks are comparable with, and in some cases better than, other classification techniques such as Gaussian Mixture Models working on the same database.

Alvaro D. Orjuela-Cañón, Diego F. Gómez-Cajas, Robinson Jiménez-Moreno
RASCNA: Radio Astronomy Signal Classification through Neighborhood Assemblies

Computation is applicable to any field in order to improve processing times and results; this article demonstrates an automatic process applied to the area of astronomy. The classification of electromagnetic spectra by pattern recognition is based on an ensemble composed of neighborhood-based classification methods. The electromagnetic spectra to classify are obtained from the SDSS III (Sloan Digital Sky Survey). The classification process consists of a preprocessing step to obtain a specific region of the spectrum, followed by filtering of relevant features by means of digital signal processing and the Haar wavelet transform.

Mildred Morales-Xicohtencatl, Leticia Flores-Pulido, Carolina Rocío Sánchez-Pérez, Juan José Córdova-Zamorano
Spectral Correlation Measure for Selecting Intrinsic Mode Functions

Time series analysis implies extracting relevant features from real-world applications to improve pattern recognition tasks. In that sense, representation methods based on time series decomposition and similarity measures are combined to select representative features with physical interpretability. In this work, we introduce two similarity measures based on the cross-power spectral density to select the representative intrinsic mode functions (IMFs) that characterize a time series. The IMFs are obtained by Ensemble Empirical Mode Decomposition because it deals with the non-stationary dynamics present in time series. The proposed similarity measures are an extension of the correlation coefficient and are validated using vibration signals acquired in a test rig under three different machine states (undamaged, unbalance and misalignment). Results show that the proposed measures improve interpretability in terms of the association between an IMF and a fault state, while preserving a high classification rate.

Edgar F. Sierra-Alonso, Oscar Cardona-Morales, Carlos D. Acosta-Medina, German Castellanos-Dominguez
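A hedged stand-in for such a measure: magnitude-squared coherence is likewise a frequency-domain extension of the correlation coefficient, bounded in [0, 1], and can rank IMFs against the raw signal. This sketch is not the paper's exact measure; the segment length is an assumption:

```python
import numpy as np
from scipy.signal import coherence

def imf_relevance(signal, imfs, fs=1.0):
    """Rank IMFs by mean magnitude-squared coherence with the raw signal.

    Returns IMF indices sorted from most to least representative.
    """
    scores = []
    for imf in imfs:
        f, cxy = coherence(signal, imf, fs=fs, nperseg=256)
        scores.append(cxy.mean())
    return np.argsort(scores)[::-1]
```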
Feature Analysis for Audio Classification

In this work we analyze and implement several audio features. We focus our analysis on the ZCR feature and propose a modification that makes it more robust when signals are near zero. All features are used to discriminate the following audio classes: music, speech, and environmental sound. An SVM classifier, which has proven to be efficient for audio classification, is used as the classification tool. By means of a selection heuristic we draw conclusions about how the features may be combined for fast classification.

Gaston Bengolea, Daniel Acevedo, Martín Rais, Marta Mejail
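The fragility of the plain zero-crossing rate near zero, and one simple dead-band remedy in the spirit of the modification discussed above (the threshold value is an assumption, not the authors' exact scheme), can be sketched as:

```python
import numpy as np

def robust_zcr(frame, eps=1e-3):
    """Zero-crossing rate that ignores samples inside a small dead band.

    Plain ZCR counts spurious crossings when the signal hovers near zero;
    discarding samples with |x| < eps makes the count more stable.
    """
    frame = np.asarray(frame, dtype=float)
    x = frame[np.abs(frame) >= eps]  # drop near-zero samples
    if x.size < 2:
        return 0.0
    return float(np.mean(np.signbit(x[:-1]) != np.signbit(x[1:])))
```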
Neural Decoding Using Kernel-Based Functional Representation of ECoG Recordings

A Kernel-based Feature Representation (KFR) approach is proposed to extract patterns from multi-channel time series of measured brain activity. To search for Inter-Channel Similarity (ICS), we introduce a kernel function to embed input data through a sliding window. We use the ICS-based data representation to obtain relevant channel dependencies along time. Hence, the introduced KFR, which seeks spatio-temporal relationships among channels, facilitates brain activity analysis in neural decoding tasks. We test the KFR on two neural decoding collections of macaque electrocorticographic signals. The obtained results show that the proposed KFR improves both visual interpretability of the data and stimulus prediction.

S. García-Vega, A. M. Álvarez-Meza, Germán Castellanos-Domínguez
Comparison of Methods to Assess Similarity between Phrases

We study the problem of similarity between phrases by considering three similarity methods. The first one considers the commonalities and differences of the two phrases. The second one is a word-oriented extension of the well-known Levenshtein-Damerau distance. The third one considers the sequentiality of the phrases and is resistant to phrases with repeated words. Finally, we show an experimental evaluation of our methods on both English and Spanish corpora.

Renzo Angles, Valeria Araya, Jesus Concha, Rodrigo Paredes
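The second method can be illustrated with a word-level variant of the Damerau-Levenshtein (optimal string alignment) distance. The sketch below is a generic implementation, not necessarily the authors' exact cost scheme:

```python
def word_edit_distance(p1, p2):
    """Levenshtein-Damerau distance over word tokens instead of characters.

    Insertions, deletions, substitutions and adjacent transpositions of
    whole words each cost 1.
    """
    a, b = p1.split(), p2.split()
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # delete a word
                          d[i][j - 1] + 1,         # insert a word
                          d[i - 1][j - 1] + cost)  # substitute a word
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transpose
    return d[n][m]

print(word_edit_distance("the cat sat", "the sat cat"))  # 1 (transposition)
```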

Document Processing and Recognition

Efficient Overlapping Document Clustering Using GPUs and Multi-core Systems

Overlapping clustering algorithms have been successfully applied in several contexts. Among the reported overlapping clustering algorithms, OClustR is the one showing the best trade-off between cluster quality and efficiency in the task of document clustering; however, it has quadratic computational complexity, so it can be less useful in applications dealing with a very large number of documents. In this paper, we propose two parallel versions of the OClustR algorithm, specifically tailored for GPUs and multi-core CPUs, which enhance the efficiency of OClustR in problems dealing with a very large number of documents. The experimental evaluation over standard document collections showed the correctness and good performance of our proposals.

Lázaro J. González Soler, Airel Pérez-Suárez, Leonardo Chang
Assessing Cross-Cut Shredded Document Assembly

In this paper we address the problem of quantitative evaluation of cross-cut shredded document reconstruction. We propose quantitative metrics using graph theory and classic information retrieval concepts to compare the neighborhood connectivity graph of a reassembled document shredded by a cross-cut machine against the neighborhood graph of the ground-truth. These metrics focus entirely on the proper relative positioning of the shredded pieces. To do so, we have shredded 12 documents containing diverse content, such as handwriting, printed text, images and photographs. We then scanned, extracted the pieces, and reassembled them into the ground-truth. This dataset is available to the readers, with the original documents, the digital representation of the shreds, and the scripts that provide the quantitative evaluation of the user’s reconstructions.

Priscila Saboia, Siome Goldenstein

Keynote III

Person Reidentification and Recognition in Video

Person recognition has been a challenging research problem for computer vision researchers for many years. A variation of this generic problem is that of identifying the reappearance of the same person in different segments, e.g., to tag people in a family video. Often we are asked to answer seemingly simple queries such as 'how many different people are in this video?' or 'find all instances of this person in these videos'. The complexity of the task grows quickly if the video in question includes segments taken at different times, places, lighting conditions, camera settings and distances, since these could include substantial variations in resolution, pose, appearance, illumination, background, occlusions, etc. In some scenarios (airports, shopping centers, and city streets) we may have video feeds from multiple cameras with partially overlapping views operating under widely varying lighting and visibility conditions. Computer vision systems are challenged to find and track a person of interest, as data from such systems have become ubiquitous and security in public spaces has become a growing concern. While this is still an unsolved challenge, much progress has been made in recent years in developing the computer vision algorithms that are the building blocks for person detection, tracking and recognition. In this paper, we consider several video capture scenarios, discuss the challenges they present for person re-identification and recognition as the complexity of the scene changes, and present pointers to recent research work in relevant computer vision areas.

Rangachar Kasturi, Rajmadhan Ekambaram

Feature Extraction, Clustering and Classification

Are Reducts and Typical Testors the Same?

This paper deals with the relation between rough set reducts and typical testors from the logical combinatorial approach to pattern recognition. The main objective is to clarify once and for all that, although in many cases the two concepts coincide, strictly speaking they are not the same. Definitions, comments and observations are formally introduced and supported by illustrative examples. Furthermore, some theorems expressing theoretical relations between reducts and typical testors are enunciated and proved.

Manuel S. Lazo-Cortés, José Fco. Martínez-Trinidad, Jesús Ariel Carrasco-Ochoa, Guillermo Sanchez-Diaz
A Multiscale and Multi-Perturbation Blind Forensic Technique for Median Detecting

This paper aims at detecting traces of median filtering in digital images, a problem of paramount importance in forensics given that filtering can be used to conceal traces of image tampering such as resampling and inconsistent light direction in photomontages. To accomplish this objective, we present a novel approach based on multiple and multiscale progressive perturbations of images, able to capture different median filtering traces through image quality metrics. Such measures are then used to build a discriminative feature space suitable for classifying whether or not a given image contains signs of filtering. Experiments in a real-world scenario with compressed and uncompressed images show the effectiveness of the proposed method.

Anselmo Ferreira, Anderson Rocha
Automatic Classification of Coating Epithelial Tissue

Histology images may be used in e-learning systems to teach how the morphological features and function of each organ contribute to its identification. Automatic classification of coating epithelial cells is an open problem in image processing. This problem has been addressed using morphological gradients, region-based and shape-based methods, among others. In this paper, coating epithelial cells are recognised and classified into three types: flat, cubic and cylindrical. Epithelial cells are classified based on sphericity and projection: information about sphericity is used to identify cubic cells, and a measure based on projecting the cell nucleus onto the light region is used to distinguish flat from cylindrical cells. Experimental validations are conducted according to expert criteria, along with manually annotated images as a ground truth. Experimental results revealed that the proposed approach recognises coating epithelial cells and classifies tissues in a way similar to how experts perform these classifications.

Claudia Mazo, Maria Trujillo, Liliana Salazar
Human Action Classification Using N-Grams Visual Vocabulary

Human action classification is an important task in computer vision. The Bag-of-Words model is a representation method widely used in action classification techniques. In this work we propose an approach based on a mid-level feature representation for human action description. First, an optimal vocabulary is created without a preliminary number of visual words, which is a known problem of the K-means method. We introduce a graph-based video representation using the relationships between interest points, in order to take into account the spatial and temporal layout. Finally, a second visual vocabulary based on n-grams is used for classification. This combines the representational power of graphs with the efficiency of the bag-of-words representation. The representation method was tested on the KTH dataset using STIP and MoSIFT descriptors and a multi-class SVM with a chi-square kernel. The experimental results show that our approach using the STIP descriptor outperforms the best results of the state of the art, while results using the MoSIFT descriptor are comparable to them.

Ruber Hernández-García, Edel García-Reyes, Julián Ramos-Cózar, Nicolás Guil
Spectral Clustering Using Compactly Supported Graph Building

Clustering techniques demand suitable models of data structure to infer the main sample patterns. Nonetheless, detection of data structure becomes a difficult task when dealing with nonlinear data relationships and complex distributions. Here, to support clustering tasks, we introduce a new graph building strategy based on a compactly supported kernel technique. Our approach highlights relevant pair-wise sample relationships by finding a sparse kernel matrix that codes the main sample connections. Clustering performance is assessed on synthetic and real-world data sets. The obtained results show that the proposed method enhances data interpretability and separability by revealing relevant data relationships in a graph-based representation.

A. M. Álvarez-Meza, A. E. Castro-Ospina, Germán Castellanos-Dominguez
Unsupervised Kernel Function Building Using Maximization of Information Potential Variability

We propose a kernel function estimation strategy to support machine learning tasks by analyzing the input samples using Renyi's information metrics. Specifically, we aim to identify a Reproducing Kernel Hilbert Space (RKHS) that spans the information force among data points as widely as possible, by maximizing the information potential variability of a Parzen-based pdf estimation. Thus, a Gaussian kernel bandwidth updating rule is obtained as a function of the forces induced by a given dataset. Our proposal is tested on synthetic and real-world datasets related to clustering and classification tasks. The obtained results show that the presented approach allows computing RKHSs that favor data group separability, attaining suitable learning performance in comparison with state-of-the-art algorithms.

A. M. Álvarez-Meza, D. Cárdenas-Peña, Germán Castellanos-Dominguez
Kernel-Based Image Representation for Brain MRI Discrimination

Discriminating Magnetic Resonance Images (MRI) supports the analysis of physiological and pathological processes; however, finding MRI relationships poses a challenge when analyzing voxel-based high-dimensional spaces. We introduce a kernel-based representation approach to support MRI discrimination. In this sense, an inherent inter-slice kernel relationship is employed to highlight brain structure distributions. Then, a generalized Euclidean metric is estimated by using a kernel-based centered alignment algorithm to code the correlation between MRI dependencies and prior demographic patient information. The proposed approach is tested on MRI data classification considering patient gender and age categories. The attained results show that the proposed methodology improves data interpretability and separability in comparison with state-of-the-art algorithms based on MRI voxel-wise features. Therefore, the introduced kernel-based representation can be useful to support MRI clustering and similarity inference tasks required in template-based image segmentation and atlas construction.

D. Cárdenas-Peña, A. M. Álvarez-Meza, Germán Castellanos-Domínguez
Estimation of Cyclostationary Codebooks for Kernel Adaptive Filtering

A methodology based on kernel adaptive filtering, termed DCKAF, to support prediction tasks over one-dimensional time series is proposed. DCKAF uses a linear combination of multiple codebooks to obtain the estimation from an input-output nonlinear mapping. This methodology employs a vector quantization based on statistical measures to check whether it is necessary to create a new codebook; then the nearest codebook to the current input sample is found. After that, the codebooks are used to obtain the signal prediction at every instant, and it is evaluated whether the current sample should be added as a codeword, as in the traditional quantized kernel least mean square (QKLMS). Hence, DCKAF takes advantage of information learned in previous iterations to improve the system accuracy. The proposed methodology is tested on two one-dimensional time series and compared against QKLMS in terms of prediction accuracy. The obtained results show that DCKAF provides an effective way to predict time series.

S. García-Vega, A. M. Álvarez-Meza, Germán Castellanos-Domínguez
Person Re-Identification Based on Weighted Indexing Structures

Surveillance cameras are present almost everywhere, indicating an increasing interest in people's safety. The automation of surveillance systems is important to allow real-time analysis of critical events, crime investigation and prevention. A crucial step in surveillance systems is person re-identification, which aims at maintaining the identity of agents that pass through the monitored environment, despite the occurrence of significant gaps in time and space. Many approaches have been proposed for person re-identification. However, there are still problems to be solved, such as illumination changes, pose variation, occlusions, appearance modeling and the management of the large number of people being monitored. This work addresses the last problem by employing multiple indexing structures associated with a weighting strategy to maintain scalability and improve accuracy. Experimental results demonstrate that the proposed approach improves on results based on a single indexing structure.

Cristianne R. S. Dutra, Matheus Castro Rocha, William Robson Schwartz
Using Reference Point as Feature for Fingerprint Indexing

This work introduces a new feature based on relative minutia positions with respect to a reference point. The introduction of this feature allows the elimination of false matches generated by minutiae. Moreover, a novel algorithm for detecting the reference point in fingerprints is introduced. This approach was tested on a manually edited dataset and proved to be highly tolerant to distorted impressions. Furthermore, the new feature was integrated into a recent fingerprint indexing algorithm in an efficient way. Well-known fingerprint datasets were employed to show the improvement in accuracy and the superiority of the presented method over other proposals.

Alfredo Muñoz-Briseño, Andrés Gago-Alonso, José Hernández-Palancar
Scalable Feature Extraction for Visual Surveillance

The availability of surveillance cameras placed in public locations has increased vastly in the last years, providing a safe environment for people at the cost of a huge amount of collected visual data. Such data are mostly processed manually, a task which is labor-intensive and prone to errors. Therefore, automatic approaches must be employed to enable the processing of the data, so that human operators only need to reason about selected portions. Computer vision techniques have been applied successfully for several years to problems in the domain of visual surveillance; however, they are rarely tackled in a scalable manner. With that in mind, in this paper we tackle the feature extraction problem, one of the most expensive and necessary tasks in computer vision, by proposing a scheme for scalable feature extraction that uses the full power of multi-core systems.

Antonio C. Nazare Jr., Renato Ferreira, William Robson Schwartz
The Evaluation of Ordered Features for SMS Spam Filtering

In this work we propose a method to capture the writing style of spam and non-spam messages by preserving the sequentiality of the text in the feature space. To be more specific, we propose to build the feature vector considering the order in which features appear in the text. We extract features from messages by applying three techniques: extrinsic information, sequential labeling extraction and term clustering. In doing so, the method yields a low-dimensional feature space that shows competitive classification accuracy for the tested classifiers.

José M. Bande Serrano, José Hernández Palancar, René Cumplido
Graph Clustering via Inexact Patterns

Graph pattern mining is an important task in data mining, and several algorithms have been proposed to solve this problem. Most of them require that a pattern and its occurrences be identical; thus, they rely on solving the graph isomorphism problem. In recent years, however, some algorithms have focused on the case in which label and edge structure differences between a pattern and its occurrences are allowed, while maintaining a bijection among vertices, using inexact matching during the mining process. Recently, an algorithm that also allows structural differences in vertices was proposed. This feature allows it to find patterns missed by other algorithms; but do these extra patterns actually contain useful information? We explore the answer to this question by performing an experiment in the context of unsupervised mining tasks. Our results suggest that by allowing structural differences in both vertices and edges, it is possible to obtain new useful information.

Marisol Flores-Garrido, Jesús Ariel Carrasco-Ochoa, José Fco. Martínez-Trinidad
Large-Scale Micro-Blog Authorship Attribution: Beyond Simple Feature Engineering

With the ever-growing use of social media, authorship attribution plays an important role in avoiding cybercrime and in helping the analysis of online trails left behind by cyber pranksters, stalkers, bullies, identity thieves and the like. In this paper, we propose a method for authorship attribution in micro-blogs that is one hundred to a thousand times faster than state-of-the-art counterparts. The method relies on a powerful and scalable feature representation approach taking advantage of user patterns in micro-blog messages, and also on a custom-tailored pattern classifier adapted to deal with big, high-dimensional data. Finally, we discuss search-space reduction when analyzing hundreds of online suspects and millions of online micro messages, which makes this approach invaluable for digital forensics and law enforcement.

Thiago Cavalcante, Anderson Rocha, Ariadne Carvalho
Quadratic Problem Formulation with Linear Constraints for Normalized Cut Clustering

This work describes a novel quadratic formulation for solving the normalized-cut clustering problem as an alternative to spectral clustering approaches. The formulation is obtained by establishing simple and suitable constraints, which are then relaxed in order to write a quadratic functional with linear constraints. As a meaningful result of this work, we accomplish a deterministic solution instead of using a heuristic search. Our method reaches performance comparable to conventional spectral methods while spending significantly less processing time.

D. H. Peluffo-Ordóñez, C. Castro-Hoyos, Carlos D. Acosta-Medina, Germán Castellanos-Domínguez
YYC: A Fast Performance Incremental Algorithm for Finding Typical Testors

The last few years have seen an important increase in research publications dealing with external typical testor-finding algorithms, while internal ones have been almost forgotten, or modified to behave as external ones on the basis of their alleged poor performance. In this research we present a new internal typical testor-finding algorithm, called YYC, that incrementally calculates typical testors for the currently analyzed set of basic matrix rows by searching for compatible sets. The experimentally measured performance of this algorithm stands out favorably in problems where external algorithms show very low performance. Also, a comparative analysis of its efficiency is done against some external typical testor-finding algorithms published during the last few years.

Eduardo Alba-Cabrera, Julio Ibarra-Fiallo, Salvador Godoy-Calderon, Fernando Cervantes-Alonso
Evolutionary Multi-Objective Approach for Prototype Generation and Feature Selection

This paper introduces EMOPG+FS, a novel approach to prototype generation and feature selection that explicitly minimizes the classification error rate, the number of prototypes, and the number of features. Under EMOPG+FS, prototypes are initialized from a subset of training instances, whose positions are adjusted through a multi-objective evolutionary algorithm. The optimization process aims to find a set of suitable solutions that represent the best possible trade-offs among the considered criteria. Besides this, we also propose a strategy for selecting a single solution from the several that are generated during the multi-objective optimization process. We assess the performance of the proposed EMOPG+FS using a suite of benchmark data sets and compare its results with those obtained by other evolutionary and non-evolutionary techniques. Our experimental results indicate that the proposed approach achieves highly competitive results.

Alejandro Rosales-Pérez, Jesus A. Gonzalez, Carlos A. Coello-Coello, Carlos A. Reyes-Garcia, Hugo Jair Escalante
Estimations of Clustering Quality via Evaluation of Its Stability

Currently, there are many clustering algorithms for the cases of a known or unknown number of clusters. Typically, a clustering is the result of optimising some quality criterion or of an iterative process. How can we estimate the quality of a clustering obtained by some method? Does the clustering result correspond to objective reality, or is it merely the partition obtained when the method's stopping criterion is met? In this paper, a practical approach and general criteria based on estimating the stability of a clustering are proposed. For well-known clustering methods, efficient algorithms for computing the introduced stability criteria on the training set are obtained. We give illustrative examples.

Vladimir Ryazanov
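One practical instance of the general idea, not the paper's specific criteria: re-run a clustering method under random restarts and measure the pairwise agreement of the resulting partitions with the adjusted Rand index. Stable structure scores close to 1; arbitrary partitions score near 0:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def clustering_stability(X, k, n_trials=20):
    """Mean pairwise adjusted Rand index over random k-means restarts."""
    labelings = [KMeans(n_clusters=k, n_init=1, random_state=t).fit(X).labels_
                 for t in range(n_trials)]
    scores = [adjusted_rand_score(labelings[i], labelings[j])
              for i in range(n_trials) for j in range(i + 1, n_trials)]
    return float(np.mean(scores))
```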
A Comparison between Time-Frequency and Cepstral Feature Representations for the Classification of Seismic-Volcanic Signals

The analysis and classification of seismic patterns, which are typically registered as digital signals, can be used to monitor and understand the underlying geophysical phenomena beneath the volcanoes. In recent years, there has been an increasing interest in the development of automated systems for labeling those signals according to a number of pre-defined volcanic, tectonic and environmental classes. The first and crucial stage in the design of such systems is the definition or adoption of an appropriate representation of the raw seismic signals, in such a way that the subsequent stage —classification— is made easier or more accurate. This paper describes and discusses the most common representations that have been applied in the literature on classification of seismic-volcanic signals; namely, time-frequency features and cepstral coefficients. A comparative study of them is performed in terms of two criteria: (i) the leave-one-out nearest neighbor error, which provides a parameterless measure of the discriminative representational power and (ii) a visual examination of the representational quality via a scatter plot of the best three selected features.

Paola Alexandra Castro-Cabrera, Mauricio Orozco-Alzate, Andrea Adami, Manuele Bicego, John Makario Londoño-Bonilla, Germán Castellanos-Domínguez
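Criterion (i) above is straightforward to compute; a small sketch, assuming Euclidean distance over the chosen feature representation (and O(n^2) memory, fine for moderate data sizes):

```python
import numpy as np

def loo_1nn_error(X, y):
    """Leave-one-out nearest-neighbor error: a parameterless measure of
    the discriminative power of a feature representation."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # exclude each sample itself
    nearest = d.argmin(axis=1)
    return float(np.mean(y[nearest] != y))
```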
A Method to Build Classification and Regression Trees

An adaptable structure to build classification trees is presented. From such a structure, different existing classification trees can be obtained, but new ones can also be built, and the results of different trees can be compared (classification error, tree size, number of levels or other defined criteria). We use the adaptable scheme to emulate the ID3, C4.5 and M5 trees, but also to create a new tree (called the general tree). The results show that we can reproduce the results of the original trees and that, for the general tree, the results are very close to those of the best classifier tree among the three studied.

Emilio Unda-Trillas, Jorge Rivera-Rovelo
An Adaptive Vehicle License Plate Detection at Higher Matching Degree

In this paper, a novel approach to vehicle license plate detection that improves in both efficiency and quality over the common multiscale search method is proposed. Detection efficiency is improved by employing the result of a single-scale sliding window search as a promising guess of the license plate location. Quality is assured by locally refining the initial detection at multiple scales. The main benefit of our method is that we reach a more precise detection while analyzing 20 times fewer detection windows, with high reliability (96% recall and 70% precision). We also compare our method with an edge-based hybrid approach.

Raphael C. Prates, Guillermo Cámara-Chávez, William Robson Schwartz, David Menotti

Pattern Recognition and Machine Learning

On the Influence of Markovian Models for Contextual-Based Optimum-Path Forest Classification

Contextual classification considers the information in a sample's neighborhood to improve standard pixel-based classification approaches. In this work, we evaluated four different Markovian models for Optimum-Path Forest contextual classification, considering land use recognition in remote sensing data. Some insights about the situations in which each of them should be applied are stated, and the idea behind each model is explained.

D. Osaku, A. L. M. Levada, J. P. Papa
SPaMi-FTS: An Efficient Algorithm for Mining Frequent Sequential Patterns

In this paper, we propose a novel algorithm for mining frequent sequences, called SPaMi-FTS (Sequential Pattern Mining based on Frequent Two-Sequences). SPaMi-FTS introduces a new data structure to store the frequent sequences which, together with a new pruning strategy to reduce the number of candidate sequences and a new heuristic to generate them, increases the efficiency of frequent sequence mining. The experimental results show that the SPaMi-FTS algorithm performs better than the main algorithms reported for discovering frequent sequences.

José Kadir Febrer-Hernández, José Hernández-Palancar, Raudel Hernández-León, Claudia Feregrino-Uribe
Local Binary Pattern Matching for Fast Retina Map Relocalization Using the Slit-Lamp

Computer assistance has the potential for increasing safety and accuracy during retinal laser treatment using the slit-lamp. In this context, intra-operative retinal mapping is a fundamental requirement to overlay relevant pre-operative information for surgeons. Retinal mapping using the slit-lamp is a challenging task, due to disturbances such as lens distortions, occlusions and glare. Such disturbances have a negative impact on the duration of the mapping procedure, consequently affecting its acceptance in clinical practice. To cope with these visual tracking interruptions, we propose a fast retina map relocalization strategy based on template-matching, using local binary patterns, which are suitable for the retina’s texture. We perform extensive experiments to show the superior accuracy and computational efficiency of the proposed approach in comparison with feature-based methods.

Rodrigo T. F. Linhares, Rogério Richa, Ricardo Moraes, Eros Comunello, Aldo von Wangenheim
Thermal Face Recognition Using Local Patterns

The aim of this article is to compare the performance of well-known visible-spectrum recognition methods when applied in the thermal spectrum. Specifically, the work considers two local-matching-based methods commonly used for face recognition in the visible spectrum: Local Binary Pattern (LBP) and Local Derivative Pattern (LDP). The methods are evaluated and compared using the UCHThermalFace database, whose evaluation methodology considers real-world conditions. The comparative study shows that, contrary to what happens in the visible spectrum, the LBP method obtains the best results for thermal face recognition. The LDP results, on the other hand, show that it is not an appropriate descriptor for face recognition systems in the thermal spectrum.

Gabriel Hermosilla, Gonzalo Farias, Hector Vargas, Francisco Gallardo, Cesar San-Martin
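
For readers unfamiliar with the operator being compared, the sketch below is a minimal numpy implementation of the basic 8-neighbour, radius-1 LBP code map; the histogram of these codes is the usual LBP face descriptor. The exact LBP configuration used in the paper is not given in the abstract, so treat this as an illustration only.

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour, radius-1 LBP code for every interior pixel;
    the histogram of these codes is the usual LBP face descriptor."""
    img = np.asarray(img, dtype=np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Fixed clockwise order of the 8 neighbours.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neigh >= center).astype(np.int32) << bit
    return codes
```
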
Comments on Randomly Sampled Non Local Means Image Filter

In this work we comment on the results presented in [1] regarding a random sampling approach to the Non Local Means (NLM) image denoising filter, with respect to computational cost and denoising performance. We show that although the approach is novel and mathematically revealing, its computational cost is higher, and its PSNR lower, than those of the classical version. Furthermore, we present a probabilistic model to evaluate the performance of different versions of NLM and to tune its parameters.

Alvaro Pardo
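
As background for this discussion, the classical NLM estimate of Buades et al. has the form below; under the random sampling approach commented on here, the sums over all pixels are restricted to a random subset (the notation is ours, not necessarily the paper's).

```latex
% Classical NLM: each pixel is a weighted average of all pixels,
% with weights driven by patch similarity.
NL[v](i) = \sum_{j \in I} w(i,j)\, v(j), \qquad
w(i,j) = \frac{1}{Z(i)}
         \exp\!\left(-\frac{\lVert v(\mathcal{N}_i) - v(\mathcal{N}_j)\rVert_{2,a}^{2}}{h^{2}}\right)
```

Here Z(i) is the normalizing sum of the exponentials, N_i is the patch around pixel i, and h controls the filtering strength.
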
Learning Graph-Matching Substitution Costs Based on the Optimality of the Oracle’s Correspondence

Over the last 30 years, several methods have been presented to perform error-tolerant graph matching. All of these methods assume some given weights that gauge the importance of each attribute on nodes or edges. These weights are usually validated manually, and little research has been done on automatically learning the combination of weights for which the resulting graph matching best reproduces the solution that an expert (human or artificial) would provide. We present an optimisation function (loss function and regularisation term) to automatically find these weights. Our practical evaluation reveals that our method properly learns these weights: when the learned weights are applied, the automatically obtained labelling between nodes is closer to the oracle's labelling than when non-learned weights are used.

Xavier Cortés, Carlos Francisco Moreno-García, Francesc Serratosa
How Fashion Talks: Clothing-Region-Based Gender Recognition

In this paper, we investigate the problem of recognizing the gender of people in photos from clothing information rather than faces, for cases where the face is insufficiently visible. Similar to the human intuition of telling a person's gender from his/her dress, we formulate this problem as binary classification based on features extracted from semantic regions of clothing. Given a query image, we first apply category-level clothing parsing to divide the clothes into several semantic regions, such as blazers, shirts and jeans. From each region, we obtain a local estimate of gender by classifying color, texture and shape features as middle-level attributes. We then leverage an offline-learned Mahalanobis distance metric on the middle-level attributes to yield a final gender prediction. Finally, we evaluate our method on a proposed novel dataset and compare it with state-of-the-art methods based on face information.

Shengnan Cai, Jingdong Wang, Long Quan
Discriminating Fingerprint Images of Other Images

In this paper we propose several features that describe a fingerprint image, based on orientation field and ridge characteristics. We introduce an algorithm to decide whether an image is a fingerprint or not. This algorithm is not a fake-fingerprint detector; it is aimed at discriminating between fingerprint images and other images. The classification accuracy was tested using a set of databases of fingerprint and non-fingerprint images (such as common objects, face images and iris images). For 2000 fingerprint images our approach shows its robustness, achieving a True Positive Rate (TPR) of 99.65%. In an experiment with 3525 non-fingerprint images, our approach achieved a rejection rate of 97.47%.

Armando Rodríguez-Fonte, José Hernández-Palancar
Pattern Analysis in DNA Microarray Data through PCA-Based Gene Selection

DNA microarrays are a technology that can be used to diagnose cancer and other diseases. To automate the analysis of such data, pattern recognition and machine learning algorithms can be applied. However, the curse of dimensionality is unavoidable: very few samples to train on, and many attributes in each sample. As the predictive accuracy of supervised classifiers decays with irrelevant and redundant features, a dimensionality reduction process is essential. In this paper, we propose a new methodology based on the application of Principal Component Analysis and other statistical tools to gain insight into the identification of relevant genes. We run the approaches on two benchmark datasets: Leukemia and Lymphoma. The results show that it is possible to reduce the number of genes considerably while increasing the performance of well-known classifiers.

Ricardo Ocampo, Marco A. de Luna, Roberto Vega, Gildardo Sanchez-Ante, Luis E. Falcon-Morales, Humberto Sossa
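
The abstract does not spell out how the principal components are turned into a gene ranking; one plausible reading, sketched below with hypothetical data shapes, is to rank genes by the magnitude of their loadings on the leading components and keep the top-ranked ones as the reduced feature set.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical microarray matrix: rows = samples, columns = gene expressions.
X = np.random.rand(60, 5000)

pca = PCA(n_components=5)
pca.fit(X)

# Rank genes by the magnitude of their loadings on the leading components;
# the top-ranked genes are kept as the reduced feature set.
loadings = np.abs(pca.components_).sum(axis=0)
top_genes = np.argsort(loadings)[::-1][:50]
```
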
Recognizing Visual Categories with Symbol-Relational Grammars and Bayesian Networks

A novel proposal for a compositional model for object recognition is presented. The proposed method is based on visual grammars and Bayesian networks. An object is modeled as a hierarchy of features and spatial relationships. The grammar is learned automatically from examples. This representation is automatically transformed into a Bayesian network. Thus, recognition is based on probabilistic inference in the Bayesian network representation. Preliminary results in recognition of natural objects are presented. The main contribution of this work is a general methodology for building object recognition systems which combines the expressivity of a grammar with the robustness of probabilistic inference.

Elías Ruiz, L. Enrique Sucar
Geometric Indexing for Recognition of Places

The Place Recognition (PR) problem is fundamental for real-time applications such as mobile robots (e.g. to detect loop closures) and guidance systems for the visually impaired. The Bag of Words (BoW) is a conventional approach that calculates a histogram of frequencies. One disadvantage of the BoW representation is that it loses information about the spatial location of features in the image. In this paper we study an approximate index based on the classic q-gram paradigm to retrieve images. Similar to BoW, our approach detects interest points and assigns labels. Each image is represented by a set of q-grams obtained from triangles of a Delaunay decomposition. This representation allows us to create an index and to retrieve images efficiently. The proposed approach is path independent and was tested on a publicly available dataset, showing a high recall rate and reduced time complexity.

Carlos Lara–Alvarez, Alfonso Rojas, Eduardo Bayro–Corrochano
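
A minimal sketch of the indexing idea, assuming interest points have already been quantized into labels. Representing each Delaunay triangle by the sorted label triple of its vertices is our guess at the q-gram encoding (triangles naturally give 3-grams under this reading); the paper's exact scheme may differ.

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical inputs: interest-point coordinates and their quantized labels.
points = np.random.rand(200, 2)
labels = np.random.randint(0, 50, size=len(points))

tri = Delaunay(points)

# One q-gram per triangle: the sorted label triple of its vertices.
qgrams = {tuple(sorted(int(labels[v]) for v in simplex))
          for simplex in tri.simplices}
# An inverted index from q-grams to image ids would then support retrieval.
```
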
Rolled-Plain Fingerprint Images Classification

Fingerprints are the most widely used biometric characteristic. According to the acquisition mode, fingerprint impressions can be classified into three classes: rolled, plain and latent. Latent fingerprint matching against rolled/plain fingerprint databases is a topic of great importance to law enforcement and forensics, which is why maintaining consistency in rolled fingerprint databases matters. Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM) are two extensively used techniques in data classification. In this article, a method for classifying fingerprints into rolled and plain using an SVM classifier is proposed. Three features are proposed to form the feature vector, owing to their distinctive and discriminative characteristics. Our proposal achieved a classification accuracy of 99.1% using SVM, while with LDA the accuracy reached 96.46%.

Katy Castillo-Rosado, José Hernández-Palancar
Evaluation of Keypoint Descriptors for Gender Recognition

Gender recognition is a relevant problem due to the number and importance of its possible application areas. The challenge is to achieve high recognition rates in the shortest possible time. Most studies are based on Local Binary Patterns (LBP) and its variants to estimate gender. In this paper, we propose the use of Binary Robust Independent Elementary Features (BRIEF), Oriented FAST and Rotated BRIEF (ORB) and Binary Robust Invariant Scalable Keypoints (BRISK) for gender recognition, due to their good performance and speed. The aim is to show that ORB and BRISK are faster than LBP while achieving similar recognition rates, which makes them suitable for real-time systems. To the best of our knowledge, this has not been studied in the literature.

Florencia Soledad Iglesias, María Elena Buemi, Daniel Acevedo, Julio Jacobo-Berlles
Automated Banknote Identification Method for the Visually Impaired

A novel method for automated identification of banknotes’ denominations based on image processing is presented. The method is part of a wearable aiding system for the visually impaired, and it uses a standard video camera as the image collecting device. The method first extracts points of interest from the denomination region of a banknote and then performs an analysis of the geometrical patterns so defined, which allows the identification of the banknote denomination. Experiments were performed with a test-subject in order to simulate real-world operating conditions. A high average identification rate was achieved.

A. Rojas Domínguez, Carlos Lara-Alvarez, Eduardo Bayro
Using Complex Networks for Offline Handwritten Signature Characterization

This paper develops a novel approach to offline handwritten signature characterization based on complex networks, for application to signature verification and identification. Complex networks lie at the intersection of graph theory and statistical mechanics, and are suitable for shape recognition due to properties such as invariance to rotation, scale, thickness and noise. Offline signature images were pre-processed to obtain a skeletonized version, which is represented as an adjacency matrix on which degree descriptors and the dynamic evolution property of complex networks are applied to generate the feature vector of an offline signature. We used a database composed of 960 groups of offline signatures; every group corresponds to one person, with 24 genuine and 30 forged signatures. We obtained a true rate of 85.12% for identification and 76.11% for verification. Our proposal demonstrates that complex networks provide a promising methodology for the identification and verification of offline handwritten signatures, and it could be used in applications such as document validation.

César Armando Beltrán Castañón, Ronald Juárez Chambi
Automatic Camera-Screen Localization

Knowing the location of the TV screen with respect to a camera is important for many applications. This work addresses the problem in a configuration where people are looking at the TV and an RGB-D camera, located near the TV screen, faces them. We propose a method to automatically estimate the screen location and camera rotation using only people's head poses, obtained from face tracking analysis on the RGB-D video. We validated these algorithms on a dataset with ground truth and obtained very promising results.

Francisco Gómez-Fernández, Zicheng Liu, Alvaro Pardo, Marta Mejail
Learning Similarities by Accumulating Evidence in a Probabilistic Way

Clustering ensembles take advantage of the diversity produced by multiple clustering algorithms to produce a consensual partition. Evidence accumulation clustering (EAC) combines the output of a clustering ensemble into a co-association similarity matrix, which contains the co-occurrences between pairs of objects in a cluster. A consensus partition is then obtained by applying a clustering technique over this matrix. We propose a new combination matrix, where the co-occurrences between objects are modeled in a probabilistic way. We evaluate the proposed methodology using the dissimilarity increments distribution model. This distribution is based on a high-order dissimilarity measure, which uses triplets of nearest neighbors to identify sparse and odd shaped clusters. Experimental results show that the new proposed algorithm produces better and more robust results than EAC in both synthetic and real datasets.

Helena Aidos, Ana Fred
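
For context, the sketch below shows the standard EAC co-association step that this paper generalizes: entry (i, j) is the fraction of ensemble partitions in which objects i and j share a cluster. The paper's contribution replaces this simple count with a probabilistic model of the co-occurrences.

```python
import numpy as np

def coassociation(partitions, n_objects):
    """Standard EAC step: C[i, j] = fraction of partitions in which
    objects i and j end up in the same cluster."""
    C = np.zeros((n_objects, n_objects))
    for labels in partitions:
        labels = np.asarray(labels)
        C += (labels[:, None] == labels[None, :]).astype(float)
    return C / len(partitions)

# Example: three clusterings of five objects from a clustering ensemble.
ensemble = [[0, 0, 1, 1, 2], [0, 0, 0, 1, 1], [0, 1, 1, 2, 2]]
C = coassociation(ensemble, 5)  # a consensus clustering is then run on C
```
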
Unsupervised Manifold Learning for Video Genre Retrieval

This paper investigates the perspective of exploiting pairwise similarities to improve the performance of visual features for video genre retrieval. We employ manifold learning based on reciprocal neighborhoods and on the authority of ranked lists to improve the retrieval of videos according to their genre. A comparative analysis of different visual features is conducted and discussed. We show experimentally, on a dataset of 14,838 videos from the MediaEval benchmark, that considerable improvements in results can be achieved. In addition, we also evaluate how the late fusion of different visual features using the same manifold learning scheme can improve the retrieval results.

Jurandy Almeida, Daniel C. G. Pedronette, Otávio A. B. Penatti
A Tipicity Concept for Data Analysis and Its Application to Cleft Lip and Palate

The paper presents a model for analyzing data structured in classes, in order to determine their representativity and classification. The model includes an algorithm integrating three parameters: Informational-Weight, Differential-Weight and Tipicity-Contrast. As an application, we analyze clinical data on 160 patients with lip and palate malformations. The model makes it possible to assess how representative the sample is, using variables of the cleft, lip and nose along with some expertly determined comparison criteria. Moreover, using the Tipicity-Contrast parameter, a supervised classification was achieved that correctly classifies, on average, 93% of the patients. This model can thus provide helpful auxiliary criteria in medical decision-making.

Leticia Vega-Alvarado, Martha R. Ortíz-Posadas
A Machine Learning Method for High-Frequency Data Forecasting

In recent years several models for financial high-frequency data have been proposed. One of the best-known models for this type of application is the ACM-ACD model, which focuses on modelling the underlying joint distribution of both duration and price changes between consecutive transactions. However, this model imposes distributional assumptions, and its number of parameters increases rapidly (producing a complex and slow adjustment process). We therefore propose using two machine learning models, working sequentially, based on the ACM-ACD model. The results show comparable performance, with better performance in some cases, and the proposal converges significantly faster. The proposal is validated on a well-known financial data set.

Erick López, Héctor Allende, Héctor Allende-Cid
Spatial Pyramid Matching for Finger Spelling Recognition in Intensity Images

Sign language is a complex means of communication used mostly by deaf people, in which hands, limbs, head and facial expressions are used to communicate. Finger spelling is a system where each letter of the alphabet is represented by a unique and discrete movement of the hand. In this paper, we study the properties of the spatial pyramid matching descriptor for finger spelling recognition. This method is a simple extension of an orderless bag-of-features image representation, where local features are mapped to multi-resolution histograms and a weighted histogram intersection is computed. The performance of the approach is evaluated on a dataset of real images of American Sign Language (ASL) finger spelling. We conduct experiments under three evaluation protocols. The first uses 10% of the data for training and the remainder for testing, achieving an accuracy of 92.50%. The second protocol uses 50% of the data for training, with an accuracy of about 97.1%. Finally, in the third protocol, we perform 5-fold cross-validation and achieve an accuracy of 97.9%. Our method achieves the best results in all three protocols when compared to state-of-the-art approaches. In all experiments, we also evaluate the influence of the weights of the multi-resolution histograms; they do not have a significant influence on the results.

Samira Silva, William Robson Schwartz, Guillermo Cámara-Chávez
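
A minimal sketch of the matching step described above: per-level histograms are compared by histogram intersection and combined with level weights. The standard SPM weighting (halving toward coarser levels) is shown as an example; the paper additionally studies how these weights affect results.

```python
import numpy as np

def spm_kernel(hx_levels, hy_levels, weights):
    """Weighted histogram intersection across pyramid levels.

    hx_levels / hy_levels: per-level feature histograms for two images
    (level 0 = whole image, finer levels = concatenated cell histograms).
    """
    return sum(w * np.minimum(hx, hy).sum()
               for w, hx, hy in zip(weights, hx_levels, hy_levels))

# Standard SPM weights for a 3-level pyramid (coarse to fine): 1/4, 1/4, 1/2.
weights = [0.25, 0.25, 0.5]
```
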
Perspective Based Model for Constructing Diverse Ensemble Members in Multi-classifier Systems for Multi-spectral Image Classification

This paper presents a perspective-based model for creating diverse ensemble members in a multi-classifier system. With this technique, different input feature sets are constructed using standard digital image processing and analysis techniques, viz. Haralick texture features, Gabor texture features, normalized difference vegetation index, standard deviation, spectral signatures, and the CIELAB and HSV color spaces. These features are used as descriptors. As many input feature sets are created as there are ensemble members. The input feature sets are disjoint, since no feature is shared between any two of them, and each set is used to train one particular ensemble member only. Each ensemble member thus identifies the classes independently and with a completely different set of features. An empirical study on multi-spectral images shows that diverse and independent ensemble members can be constructed through the proposed technique. Results also show that the proposed technique outperforms bagging in terms of individual member diversity and classification accuracy.

Laxmi Narayana Eeti, Krishna Mohan Buddhiraju

Neural Networks for Pattern Recognition

Neuro-Fuzzy Data Mining Mexico’s Economic Data

Given the increase in data being collected, there is a need to explore tools that automate the recognition and extraction of patterns within targeted data. The present work explores the use of a neuro-fuzzy classifier for multi-factor productivity in the manufacturing sector of the Mexican economy. The chosen data set contains the time series for the variables Sale Value of Products, Wages, Work Force, Days Worked, and Hours Worked, taken from the Banco de Información Económica at the Instituto Nacional de Estadística y Geografía. The neuro-fuzzy system is implemented on top of the Neuroph library, extending the ideas behind the Neuro-Fuzzy Reasoner. A sample run tends to assign the same values as those given by visual inspection.

Gustavo Becerra-Gaviño, Liliana Ibeth Barbosa-Santillán
Handwritten Digit Recognition Based on Pooling SVM-Classifiers Using Orientation and Concavity Based Features

To increase performance in handwritten digit recognition, researchers commonly combine a variety of features to represent a pattern, an approach that has proved very effective in practice. The classical way to combine features is to concatenate the underlying feature vectors; a drawback is that this can generate high-dimensional descriptors, which increases the complexity of the training process. Instead, we propose a pooling-based classifier that allows us to obtain not only a faster training process but also better results. For evaluation, we used two state-of-the-art handwritten digit datasets: CVL and MNIST. In addition, we show that the simple rectangular spatial division that characterizes our descriptors yields competitive results at a smaller computational cost than other, more complex zoning techniques.

Jose M. Saavedra
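
The abstract leaves the pooling mechanism unspecified; one plausible reading, sketched below with scikit-learn, is to train one classifier per feature type and pool their per-class decision scores instead of concatenating the descriptors.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_pooled(feature_sets, y):
    """One linear SVM per feature type (e.g. orientation / concavity)."""
    return [LinearSVC().fit(X, y) for X in feature_sets]

def predict_pooled(models, feature_sets):
    """Pool per-classifier decision scores by summing, then pick the class;
    each descriptor stays low-dimensional, unlike concatenation."""
    scores = sum(m.decision_function(X) for m, X in zip(models, feature_sets))
    return scores.argmax(axis=1)
```
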
Enhanced Fuzzy-Relational Neural Network with Alternative Relational Products

This paper describes an extension of fuzzy relational neural networks (FRNNs) that aims at improving their classification performance. We consider Pedrycz's FRNN, one of the most effective and popular models, which has traditionally used a single relational product (Circlet). The extension consists in allowing other relational products to be applied in the training phase of the basic FRNN, with the aim of increasing its predictive capabilities. The relational products considered are the so-called BK-products SubTriangle, SupTriangle and Square; in addition, we propose the use of more general operators (t-norms and s-norms) in their definitions, which are also applied to the Circlet relational product. We explore the effectiveness of this extension in classification problems through experiments on benchmark data sets with and without noise. Experimental results reveal that the proposed extension improves the classification performance of the basic FRNN, particularly on noisy data sets.

Efraín Mendoza-Castañeda, Carlos A. Reyes-García, Hugo Jair Escalante, Wilfrido Moreno, Alejandro Rosales-Pérez
Weighted Convolutional Neural Network Ensemble

We introduce a new method to combine the output probabilities of convolutional neural networks, which we call the Weighted Convolutional Neural Network Ensemble. Each network has an associated weight, so that networks with better performance have a greater influence on classification than networks that performed worse. This approach produces better results than the common method of simply averaging the output probabilities to make predictions. We show the validity of our proposal by improving the classification rate on a common image classification benchmark.

Xavier Frazão, Luís A. Alexandre
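
A minimal sketch of the combination rule described above. How the per-network weights are obtained is not specified in the abstract, so deriving them from validation accuracy is only an assumption here.

```python
import numpy as np

def weighted_ensemble_predict(prob_list, weights):
    """Combine per-network softmax outputs with per-network weights.

    prob_list: list of (n_samples, n_classes) probability matrices.
    weights:   one scalar per network; larger = more influence.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the combination stays a distribution
    combined = sum(wi * p for wi, p in zip(w, prob_list))
    return combined.argmax(axis=1)

# Hypothetical choice: weights proportional to validation accuracy, e.g.
# preds = weighted_ensemble_predict([p1, p2, p3], [0.91, 0.88, 0.93])
```
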
Bio-inspired Aging Model Particle Swarm Optimization Neural Network Training for Solar Radiation Forecasting

This paper presents a novel training algorithm for a neural network architecture applied to solar radiation time series prediction. The proposed training algorithm is based on a novel bio-inspired aging model particle swarm optimization (BAM-PSO), which is employed to update the synaptic weights of the neural network. The size of the regression vector is determined by means of the Cao methodology. The proposed structure efficiently captures the complex nature of solar radiation time series, and the model is trained and tested using real solar radiation data.

Eduardo Rangel, Alma Y. Alanís, Luis J. Ricalde, Nancy Arana-Daniel, Carlos López-Franco
Dairy Cattle Sub-clinical Uterine Disease Diagnosis Using Pattern Recognition and Image Processing Techniques

This work presents a framework for diagnosing sub-clinical endometritis, a common uterine disease in dairy cattle, based on the analysis of ultrasound images of the uterine horn. The main contribution is the proposed feature extraction, based on the characteristics the expert takes into account when diagnosing, such as statistical measures, image textures, shape, custom thickness measures and histograms, among others. Given the segmentation of the different regions of the uterine horn, a fully automatic supervised classification is performed using a C-SVM-based model. Two different datasets of ultrasound images, acquired and tagged by an expert, were used. The proposed framework shows promising results, supporting the development of a fully automatic procedure to measure morphological features of the uterine horn that may contribute to diagnosing the pathology.

Matías Tailanián, Federico Lecumberry, Alicia Fernández, Giovanni Gnemmi, Ana Meikle, Isabel Pereira, Gregory Randall
Semisupervised Approach to Non Technical Losses Detection

Non-technical electrical loss detection is a complex task with high economic impact. Due to the diversity and large number of consumption records, it is very important to find an efficient automatic method that detects the largest number of frauds with the least amount of experts' hours spent on preprocessing and inspections. This article analyzes the performance of a strategy based on a semisupervised method that, starting from a set of labeled data, extends these labels to unlabeled data and then detects new frauds in consumption records. Results show that the proposed framework improves performance, in terms of the F-measure, over manual methods performed by experts and over previous supervised methods, avoiding hours of expert labeling and inspection.

Juan Tacón, Damián Melgarejo, Fernanda Rodríguez, Federico Lecumberry, Alicia Fernández
New Radial Basis Function Neural Network Architecture for Pattern Classification: First Results

This paper presents initial results concerning a new Radial Basis Function Artificial Neural Network (RBFNN) architecture for pattern classification. The performance of the new architecture is demonstrated on different data sets. Its efficiency is also compared with different classification methods reported in the literature: Multilayer Perceptron, standard Radial Basis Function networks, KNN and Minimum Distance classifiers, showing a much better performance. Results are only given for problems using two features.

Humberto Sossa, Griselda Cortés, Elizabeth Guevara
Predicting Very Early Stage Mild Cognitive Impairment Based on a Voxel-wise Arterial Spin Labeling Analysis

In recent years, medical images have been increasingly used as an objective method for the diagnosis of neurodegenerative diseases. Most previous studies have been based on structural or functional magnetic resonance imaging, but the results are not yet sufficient to identify early stages of dementia. In this paper, we present an image processing and pattern recognition strategy that predicts short-term conversion to Mild Cognitive Impairment (MCI) based on the analysis of Arterial Spin Labeling images. Healthy subjects, categorized as individuals at risk of dementia, were assessed annually in order to identify those who converted to MCI. After 1-2 years, 20 subjects were classified as non-converters and 15 as converters according to the Mini-Mental State Examination test and other neuropsychiatric scales. The proposed approach was able to distinguish converters from non-converters with an accuracy of 0.88 using leave-one-out cross-validation.

Gloria Díaz, Pablo García-Polo, Virginia Mato, Eva Alfayate, Juan Antonio Hernández-Tamames, Norberto Malpica
Quaternion Support Vector Classifier

This paper presents the Quaternion Support Vector Machine for classification as a generalization of real- and complex-valued Support Vector Machines. In this framework we handle the design of kernels involving the Clifford or quaternion product. The QSVM allows the metric involved in the quaternion product to be changed. The application section shows experiments in pattern recognition and colour image processing.

G. López-González, Nancy Arana-Daniel, Eduardo Bayro-Corrochano

Keynote IV

Lattice Based Dendritic Computing: A Biomimetic Approach to ANNs

This paper presents an overview of the current status of lattice based dendritic computing. Roughly speaking, lattice based dendritic computing refers to a biomimetic approach to artificial neural networks whose computational aspects are based on lattice group operations. We begin our presentation by discussing some important processes of biological neurons followed by a biomimetic model which implements these processes. We discuss the reasons and rationale behind this approach and illustrate the methodology with some examples. Global activities in this field as well as some potential research issues are also part of this discussion.

Gerhard X. Ritter, Gonzalo Urcid

Computer Vision and Robot Vision

Partial to Full Image Registration Based on Candidate Positions and Multiple Correspondences

In some image-registration-based applications, it is more usual to detect a low-quality, tiny partial image than a full sample (forensic palmprint recognition, satellite images, object detection in outdoor scenes …). In these cases, the usual registration methods fail due to the large number of outliers that must be handled while comparing a tiny image (the object to be registered) to a full image (the object in the database). In this paper, we present an image registration method that explicitly accounts for a large number of outliers. In a first step, the method selects some candidate points as centres of the partial image. In a second step, these candidates are refined until one is selected through a multiple correspondence method. Experimental validation shows that the algorithm can outperform state-of-the-art identification methods when the image to be identified is a tiny, partial sample.

Carlos Francisco Moreno-García, Xavier Cortés, Francesc Serratosa
Automatic Corner Matching in Highly Distorted Images of Zhang’s Calibration Pattern

Zhang’s method is a widely used technique for camera calibration from different views of a planar calibration pattern. This pattern contains a set of squares arranged in a certain configuration. In order to calibrate the camera, the corners of the squares in the images must be matched with those in the reference model. When the images show a strong lens distortion, the usual methods to compute the corner matching fail because the corners are shifted from their expected positions. We propose a new method which automatically estimates such corner matching taking into account the lens distortion. The method is based on an automatic algorithm for lens distortion correction which allows estimating the distorted lines passing through the edges of the squares. We present some experiments to illustrate the performance of the proposed method, as well as a comparison with the usual technique proposed in a Matlab toolbox.

Miguel Alemán-Flores, Luis Alvarez, Luis Gomez, Daniel Santana-Cedrés
Bio-inspired Aging Model-Particle Swarm Optimization and Geometric Algebra for Structure from Motion

In the computer vision field, Structure from Motion (SfM) algorithms offer good advantages for numerous applications (augmented reality, autonomous navigation, motion capture, remote sensing, object recognition, image-based 3D modeling, among others); nevertheless, these algorithms show some weaknesses. In the present paper we propose the use of the Bio-inspired Aging Model-PSO (BAM-PSO) to improve the accuracy of SfM algorithms. The BAM-PSO algorithm is used over a Geometric Algebra (GA) framework to compute the rigid motion in images, which yields a numerically stable algorithm.

Nancy Arana-Daniel, Carlos Villaseñor, Carlos López-Franco, Alma Y. Alanís
Detecting Subtle Human-Object Interactions Using Kinect

We present a method to identify human-object interactions involved in complex, fine-grained activities. Our approach benefits from recent improvements in range sensor technology and body trackers to detect and classify important events in a depth video. Combining global motion information with local video analysis, our method is able to recognize the time instants of a video at which a person picks up or puts down an object. We introduce three novel datasets for evaluation and perform extensive experiments with promising results.

Sebastian Ubalde, Zicheng Liu, Marta Mejail
A New Saliency Detection Method for Stereoscopic Images Using Contrast and Prior Knowledge

In this paper, we propose a new visual saliency detection method that is effective even with unreliable disparity information, using contrast and prior knowledge. Our method consists of two phases. In the first phase, we use region-based contrast information to compute the saliency of an input image, considering not only global but also local contrast in color and disparity to efficiently extract salient regions in a stereoscopic image; in addition, we introduce a confidence measure to handle unreliable disparity information. In the second phase, we use region-based prior knowledge present in a stereoscopic image, constructed from low-level features such as color, frequency, location and disparity. Finally, we integrate contrast-based and prior-knowledge-based saliency to accurately detect saliency in the input stereoscopic image. Experimental results show that our method efficiently detects salient content in stereoscopic images.

Sang-Hyun Cho, Hang-Bong Kang
Krawtchouk Moments for Gait Phase Detection

We present a novel method for gait phase detection based on Krawtchouk moments, which can be used in gait analysis. The low computational cost and high descriptive capacity of the Krawtchouk moments make it easy to detect the parameters of the gait cycle, such as the swing phase, stance phase and double support. In addition, we present gait phase detection results obtained with the proposed method on 10 test subjects and compare them with standard values.

C. Camacho-Bello, J. J. Báez-Rojas
3D Face Reconstruction from a Single Shaded Image Using Subspace Crossing Engine

This paper presents a technique to reconstruct a 3D human face from a single shaded face image. The core idea is to find a relationship between an image subspace and a 3D subspace. The subspaces are created by Principal Component Analysis to capture only the major factors for creating a face in each subspace, and are then adjusted to lie in a comparable range. Given a novel 2D shaded face image, a subspace crossing engine developed for this work estimates the major factors of the image's creation and uses them to compose a 3D shape automatically. The evaluation reveals that the proposed approach reconstructs 3D faces faster and closer to the 3D ground-truth faces than a traditional approach in a controlled environment.

Tanasai Sucontphunt
Conformal Geometric Method for Voting

In this paper, we introduce a novel geometric voting scheme that extends previous algorithms, such as the Hough transform and tensor voting, to tackle perceptual organization problems. Our approach is grounded in three methodologies: representation of information using Conformal Geometric Algebra; a local voting process, which introduces global perceptual considerations at a low level; and a global voting process, which clusters salient geometric entities in the whole image. Since geometric algebra is the mathematical framework of our approach, our algorithm infers high-level geometric representations from tokens that are perceptually salient in an image.

Gerardo E. Altamirano-Gómez, Eduardo Bayro-Corrochano
Hybrid Parallel Cascade Classifier Training for Object Detection

A drawback of the Viola-Jones framework for object detection in digital images is the large amount of time needed to train the underlying cascade classifiers. In this paper, we propose a novel hybrid approach for parallelizing that framework, employing message passing among computers and multi-threading in the processor cores, hence its hybrid nature. In contrast to related works, which dealt with only parts of the original framework, we consider the complete framework. Moreover, the set of weak classifiers obtained by our parallel approach is identical to that of a serial version. An experimental evaluation on face detection, focused on speedup and scalability measures, shows the improvements of the proposed approach over a serial implementation of the original framework.

Eanes Torres-Pereira, Herman Martins-Gomes, Andrey Elísio Monteiro-Brito, João Marques de Carvalho
3D Face Recognition by Functional Data Analysis

This work proposes the use of functional data analysis to represent 3D faces for recognition tasks. This approach allows exploiting and studying characteristics of the continuous nature of this type of data. The basic idea of our proposal is to approximate the 3D face surface through an expansion over a set of basis functions. These functions are used for a global representation of the entire face, and for a local representation, where pre-selected face regions are used to construct multiple local representations. In both cases, the functions are fitted to the 3D data by least squares. Univariate attribute selection is finally applied to reduce the dimensionality of the new representation. The experiments prove the validity of the proposed approach, showing competitive results with respect to state-of-the-art solutions. Moreover, the dimensionality of the data is considerably reduced with respect to the original size, which is one of the goals of using this approach.

Dania Porro-Muñoz, Francisco José Silva-Mata, Anier Revilla-Eng, Isneri Talavera-Bustamante, Stefano Berretti
Object Classification and Detection with Context Kernel Descriptors

Context information is important in object representation. By embedding the context cues of image attributes into kernel descriptors, we propose a set of novel kernel descriptors called Context Kernel Descriptors (CKD) for object classification and detection. The motivation of CKD is to use the spatial consistency of image attributes or features defined within a neighboring region to improve the robustness of descriptor matching in kernel space. For feature selection, Kernel Entropy Component Analysis (KECA) is exploited to learn a subset of discriminative CKD. Unlike Kernel Principal Component Analysis (KPCA), which only keeps features contributing most to image reconstruction, KECA selects the CKD that contribute most to the Rényi entropy of the image. These CKD are discriminative, as they relate to the density distribution of the histogram of image attributes. We report superior performance of CKD for object classification on the CIFAR-10 dataset, and for detection on a challenging chicken feet dataset.

Hong Pan, Søren Ingvor Olsen, Yaping Zhu
Gait-Based Carried Object Detection Using Persistent Homology

There are surveillance scenarios where it is important to raise an alarm when a person carrying an object is detected. In order to detect when a person is carrying an object, we build models of naturally-walking and object-carrying persons using topological features. First, a stack of human silhouettes, extracted by background subtraction and thresholding, are glued through their gravity centers, forming a 3D digital image I. Second, different filters (i.e. orderings of the cells) are applied on ∂K(I) (K(I) being the cubical complex obtained from I), which capture relations among the parts of the human body when walking. Finally, a topological signature is extracted from the persistence diagrams according to each filter. We build clusters of persons walking naturally without carrying an object and clusters of persons carrying bags, and obtain vector prototypes for each cluster. Simple distances to the means are calculated to detect the presence of a carried object, and the cosine measure is used to give a similarity value between topological signatures. The accuracies obtained are 95.7% and 95.9% for naturally-walking and object-carrying, respectively.

Javier Lamar-Leon, Raul Alonso Baryolo, Edel Garcia-Reyes, Rocio Gonzalez-Diaz
A Fast Pavement Location Approach for Autonomous Car Navigation

This paper describes a fast image segmentation approach designed for pavement detection from a moving camera. The method is based on a graph-oriented segmentation approach in which gradient information is used temporally as a system of discontinuities to control merging between adjacent regions. The method assumes that the navigable path is usually located at specific positions in the scene, and a predefined set of seed points is used to locate the region of interest. The results show that the proposed approach is able to detect the navigable path accurately and at low computational cost, even in non-optimal scenarios such as badly painted or unpaved dirt roads. Validation was conducted using a dataset with 701 samples of navigable paths, showing very high precision for real-time applications.

Thiago Rateke, Karla A. Justen, Vito F. Chiarella, Rodrigo T. F. Linhares, Antonio Carlos Sobieranski, Eros Comunello, Aldo von Wangenheim
Plane Detection Using Particle Swarm Optimization and Conformal Geometric Algebra

Plane detection is a useful capability for safe navigation, a crucial task for robots working in indoor environments. In this paper, the authors present an approach for plane detection based on Conformal Geometric Algebra and Particle Swarm Optimization, with data obtained from an RGB-D sensor. The simulation and experimental results show the effectiveness of the proposed approach.

Carlos López-Franco, Jesús Hernández-Barragán, Michel López-Franco, Nancy Arana-Daniel, Alma Y. Alanís
Face Detection in Video Using Local Spatio-temporal Representations

Face detection is an important step in many video applications. Several algorithms have been proposed for this task, but most of them do not consider spatio-temporal information. In this paper, two recently introduced spatio-temporal descriptors are analyzed and evaluated in the context of face detection in videos. We designed and tested two full face detectors on the challenging YouTube Faces database. The obtained results are compared with those of a frame-by-frame approach using a spatial descriptor, showing that spatio-temporal descriptors can boost detection performance.

Yoanna Martínez-Díaz, Noslen Hernández, Heydi Méndez-Vázquez
Fast Edge Detection in RGB-D Images

The understanding of scenes is a key aspect of computer vision. Edge detection helps us understand more about scene structure, since edges mark the transition from one region with similar properties to another. When the edges are obtained from changes in orientation, we can use them to find key planes and describe the scene. This paper describes a method for fast edge detection in RGB-D images. The edge detection algorithm for depth images is based on the use of smoothness constraints on orientation. We show experimental results that demonstrate the potential of the proposed approach for edge detection.

Heriberto Casarrubias-Vargas, Alberto Petrilli-Barceló, Eduardo Bayro-Corrochano
MUGEN RANSAC - MUltiple GENerator Applied to Motion Estimation

The related literature reports the strong and weak points of motion hypothesis generation algorithms that aim at fitting the hypothesis generation to a specific set of motions during an application's execution. Each algorithm fits certain motion sequences, but no single one excels on arbitrary motions. This problem is especially difficult to solve in the presence of time constraints, because of the restricted number of hypotheses. We present a modified generation schema that employs a wider set of hypothesis generators in order to achieve more accurate estimations. The experiments revealed that, in both Preemptive RANSAC and GPR time-restricted scenarios, the accuracy of our approach mostly matched, and sometimes exceeded, that of the best hypothesis generator in the experiments.

Severino P. Gomes-Neto, Bruno M. de Carvalho
Multimodal Background Modeling Using RGB-Depth Features

This paper presents a background subtraction method that uses multimodal information, specifically depth and appearance cues, to robustly separate the foreground in dynamic indoor scenes. To this end, RGB-Depth data from a Microsoft Kinect sensor are exploited. We propose an extension of one of the most effective techniques for real-time background modeling: Kernel Density Estimation with the Fast Gauss Transform. Experimental results show that our proposal deals well with gradual/sudden illumination changes, shadows and dynamic backgrounds.

Rim Trabelsi, Fethi Smach, Issam Jabri, Ammar Bouallegue
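
A minimal per-pixel Gaussian-KDE sketch of the background model being extended: it evaluates the density of the current multimodal observation (e.g., RGB plus depth channels) under recent background samples, without the Fast Gauss Transform acceleration used by the actual technique.

```python
import numpy as np

def kde_foreground(samples, frame, bandwidth, threshold):
    """Per-pixel background likelihood via a Gaussian kernel density estimate.

    samples: (N, H, W, C) recent background observations per pixel,
             where the C channels mix appearance and depth (e.g. R, G, B, D).
    frame:   (H, W, C) current observation.
    Returns a boolean (H, W) foreground mask.
    """
    diff = (samples - frame[None, ...]) / bandwidth
    density = np.exp(-0.5 * (diff ** 2).sum(axis=-1)).mean(axis=0)
    return density < threshold
```
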

Video Segmentation and Tracking

Static Video Summarization through Optimum-Path Forest Clustering

This paper introduces the Optimum-Path Forest (OPF) classifier for static video summarization, with results comparable to those obtained by some state-of-the-art video summarization techniques. The experimental section uses several image descriptors on two public datasets, followed by an analysis of OPF's robustness with respect to one ad-hoc parameter. Future work will aim to improve OPF's effectiveness on each distinct video category.

G. B. Martins, L. C. S. Afonso, D. Osaku, Jurandy Almeida, J. P. Papa
Summarization of Videos by Image Quality Assessment

Video summarization plays a key role in manipulating large amounts of digital videos, making it faster to analyze their contents and aiding in the tasks of browsing, indexing and retrieval. A straightforward method for producing the summaries is by means of extraction of color features from the video frames. However, in order to automatically generate summaries as human beings would do, the way that humans perceive images must be considered, which can be done by image quality assessment (IQA) metrics. This work presents VSQUAL, a method for summarization of videos based on objective IQA metrics, which is also used for other purposes such as shot boundary detection and keyframe extraction. Results of the proposed method are compared against other approaches of the literature with a specific evaluation metric.

Marcos Vinicius Mussel Cirne, Helio Pedrini
Modeling Video Activity with Dynamic Phrases and Its Application to Action Recognition in Tennis Videos

We present a novel approach to action recognition in tennis shot sequences. The underlying model regards the per-frame motion as a word (within an alphabet of possible motions), and the sequence of frames as a phrase whose meaning is determined by the words given in a specific order. This feature extraction mechanism allows a semantic treatment of the classification stage using Conditional Random Fields. The system was applied to the RGB videos of the THETIS dataset, achieving an accuracy of over 86% in recognizing 12 different tennis shots among several takes produced by 55 different amateur and professional players.

Jonathan Vainstein, José F. Manera, Pablo Negri, Claudio Delrieux, Ana Maguitman
Combining Wavelet Saliency, Color and DCT Coefficients for Content-Based Image Retrieval

This paper presents an approach for content-based image retrieval that extracts salient points and regions from images and aggregates color and DCT values into a signature descriptor for recognition. Salient points and regions are extracted from each image by a wavelet decomposition over the color channels, where the highest coefficients at the coarsest levels are the centers of salient regions at the finest resolution. These local regions provide the support for extracting color histograms and a set of DCT magnitudes, from which a signature for the image is derived. A feature vector combining the histograms of the color channels and the DCT values is proposed and tested as the image signature. The public COIL, Caltech, and ZuBuD image datasets are used for testing. Results comparing variations of the wavelet-saliency-based descriptor are given on all of these datasets, supporting the potential of the proposed method.

Alberto Rios Júnior, Díbio Leandro Borges
Robust Face Tracking with Locally-Adaptive Correlation Filtering

A face tracking algorithm based on locally-adaptive correlation filtering is proposed. The algorithm is capable of tracking a face with invariance to pose, gesticulations, occlusions and clutter. The target face is chosen at the start of the algorithm; afterwards, a composite filter is designed to recognize the face in subsequent frames. The filter is adapted online using information from current and past scene frames, and is constructed by combining several optimal templates designed for distortion-invariant target recognition. Results obtained with the proposed algorithm on real-life scenes are presented and compared with a recent state-of-the-art tracking method in terms of detection efficiency, tracking accuracy, and processing speed.

Leopoldo N. Gaxiola, Víctor Hugo Díaz-Ramírez, Juan J. Tapia, Arnoldo Diaz-Ramirez, Vitaly Kober
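
For context, a generic FFT-based correlation step is sketched below: the filter template is correlated with each frame and the correlation peak gives the face location. The paper's locally-adaptive composite filter design is not reproduced here.

```python
import numpy as np

def correlation_peak(scene, template):
    """Cross-correlate a frame with a filter template via the FFT and
    return the location of the correlation peak (the detected target)."""
    S = np.fft.fft2(scene)
    T = np.fft.fft2(template, s=scene.shape)  # zero-pad to frame size
    corr = np.real(np.fft.ifft2(S * np.conj(T)))
    return np.unravel_index(np.argmax(corr), corr.shape)
```
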
A Deformable Model to Search Characteristic Facial Points

A methodology is proposed to determine the positions of characteristic facial points. The method is based on the Point Distribution Model, which estimates the most likely positions of the points. The search is refined with a discriminator acting locally around each point of the model; this discriminant model was obtained by training a Support Vector Machine on images vectorized as histograms of oriented gradients (HoG). We then approximate the points by Thin Plate Splines, using lambda parameters whose values were adjusted according to the local search errors. The models achieved performances reaching 90% in cross-validation. Likewise, the strategy shows an accumulated error of less than 10% at some characteristic points. The algorithm was evaluated with the MUCT and BioID databases. This strategy would enable digital morphological and anthropometric assessments of the human face.

Reimer-A. Romero-H., Francisco-J. Renero-C.
Fast Automatic Detection of Wildlife in Images from Trap Cameras

Photo-identification of naturally marked animals is a non-intrusive technique for obtaining valuable information regarding the population size and behavior of endangered species in the wild. In this paper we present a method for detecting and cropping wild felines in pictures taken with trap cameras installed in the forest and triggered by infrared sensors. Detecting these felines serves the purpose of collecting information useful in studies of population size or migration phenomena. We propose computing the difference of images taken by the same trap camera within a short period of time. According to our experiments, our method is fast, reliable and robust, and it can be used for other species with different pelage patterns.

Karina Figueroa, Antonio Camarena-Ibarrola, Jonathan García, Héctor Tejeda Villela
Detection of Groups of People in Surveillance Videos Based on Spatio-Temporal Clues

Video surveillance has been widely employed in our society in recent years. In this context, humans are the major players, since they are responsible for changing the state of the scene through actions and activities. Therefore, the design of automatic methods to understand human behavior and recognize activities is important to determine which subjects are involved in an activity of interest. The computer vision research area has contributed vastly to the development of methods for the detection, tracking and recognition of humans. However, there is still a lack of methods able to recognize higher-level activities (e.g., interaction among people that might be involved in an illegal activity). The first step towards this goal is to detect and locate groups of people in the scene, which is essential for making inferences regarding interactions among persons. In this direction, this paper presents a group detection approach that combines motion and spatial information with low-level descriptors to be robust to situations such as partial occlusions. Experimental results obtained using the PETS 2009 and BEHAVE datasets demonstrate that the proposed combination indeed achieves higher accuracies, indicating a promising direction for future research.

Rensso V. H. Mora-Colque, Guillermo Cámara-Chávez, William Robson Schwartz
GPUs and Multicore CPUs Implementations of a Static Video Summarization

The fast evolution of digital media, especially digital video, has created an exponential growth of data, increasing storage and transmission costs and the complexity of retrieving information from video content. Video summarization has been proposed to circumvent some of these issues and also serves as a pre-processing step in many video applications. In this paper, a static video summarization algorithm is studied and, in order to reduce its high execution time, parallelizations using Graphics Processing Units (GPUs) and multicore CPUs are proposed. We also explore a hybrid approach combining both kinds of hardware to maximize performance. The experiments were performed using 120 videos of varying frame resolution and length, and the results show that the hybrid and multicore CPU versions reached the best execution times, achieving a 4× speedup on average.

Suellen S. Almeida, Edward Cayllahua-Cahuina, Arnaldo de A. Araújo, Guillermo Cámara-Chávez, David Menotti

CASI’2014 Workshop Computational Advances in Intelligent Processing of Multimode Remote Sensing Imagery

Descriptive Experiment Design Restructured MVDR Beamforming Technique for Enhanced Imaging with Unfocused SAR Systems

We address a new technique for feature-enhanced radar imaging with unfocused/fractional SAR sensor systems that unifies the minimum-risk-inspired descriptive experiment design regularization (DEDR) framework with the robust minimum variance distortionless response (MVDR) beamforming approach, for recovery of the spatial spectrum pattern (SSP) of the wavefield scattered from the remotely sensed scene, referred to as its reconstructed radar image. The new framework incorporates into the DEDR strategy for feature-enhanced SSP estimation a convergence-guaranteed composite projector onto the convex solution set, and solves the overall reconstructive imaging inverse problem employing the developed DEDR-restructured robust MVDR method.

Yuriy Shkvarko, Joel Amao Oliva
Super Resolution Imaging via Sparse Interpolation in Wavelet Domain with Implementation in DSP and GPU

This paper focuses on a novel image resolution enhancement method employing wavelet-domain techniques, together with a hardware implementation of the designed framework. In the novel resolution enhancement approach, for better preservation of edge features, an additional edge extraction step is used, employing the high-frequency (HF) sub-band images - low-high (LH), high-low (HL), and high-high (HH) - obtained via the Discrete Wavelet Transform (DWT). In the designed procedure, the low-resolution (LR) image is used in a sparse interpolation for resolution enhancement, obtaining the low-low (LL) sub-band. An efficiency analysis of the designed filter and other state-of-the-art filters has been performed on the DSP TMS320DM648 by Texas Instruments, through MATLAB's Simulink module, and on a video card (NVIDIA Quadro K2000), demonstrating that the novel SR procedure can be used in real-time processing applications. Experimental results confirm that the implemented framework outperforms existing SR algorithms in terms of objective criteria as well as subjective visual perception, yielding better image resolution.

H. Chavez, V. Gonzalez, A. Hernandez, V. Ponomaryov
2D to 3D Conversion Based on Disparity Map Estimation

Different hardware implementations of a designed automatic 2D-to-3D video color conversion employing a 2D video sequence are presented. The analyzed frameworks jointly process neighboring frames using the following blocks: CIELa*b* color space conversion; wavelet transform (WT) with edge detection using the HF wavelet sub-bands (HL, LH and HH) or a pyramidal scheme; color segmentation via k-means on the a*b* color plane; up-sampling in the wavelet case; disparity map (DM) estimation; adaptive post-filtering; and, finally, anaglyph 3D scene generation. The SSIM and QBP criteria are applied in order to compare the performance of the proposed frameworks against other 3D computation techniques. The designed techniques have been implemented on the DSP TMS320DM648, in MATLAB's Simulink module on a PC with Windows 7, and using a graphics card (NVIDIA Quadro K2000), demonstrating that the proposed approach can be applied in real-time processing mode.

V. Gonzalez-Huitron, E. Ramos-Diaz, V. Kravchenko, V. Ponomaryov
Lifting Filters Adjustment for Lossless Image Compression Applications

A method for adjusting lifting scheme wavelet filters to achieve higher lossless image compression is presented. The proposed method analyzes the image spectral characteristics and outputs suboptimal coefficients that yield a higher compression ratio in comparison to the standard lifting filters. The analysis proceeds by spectral pattern recognition with a 1-NN classifier. Spectral patterns have a small fixed length for the entire image, thus permitting the optimization of the filter coefficients for different image sizes. The proposed method was applied to a set of test images, obtaining better compression results in comparison to the standard wavelet lifting filters.

Oleksiy Pogrebnyak, Ignacio Hernández-Bautista
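
For readers unfamiliar with the lifting scheme, the structure being adjusted can be sketched as one predict/update step with tunable coefficients, kept integer-to-integer (and hence exactly invertible, as lossless coding requires) by flooring. The LeGall-5/3-like defaults below are illustrative; a per-image adjustment method would tune alpha and beta.

```python
import numpy as np

def lifting_forward(x, alpha=-0.5, beta=0.25):
    """One integer-to-integer lifting step (even-length 1-D signal,
    periodic boundaries). Returns the smooth (s) and detail (d) bands."""
    even, odd = x[0::2].astype(int), x[1::2].astype(int)
    # Predict: remove from each odd sample a prediction from its even neighbours
    d = odd + np.floor(alpha * (even + np.roll(even, -1))).astype(int)
    # Update: smooth the even samples using the details
    s = even + np.floor(beta * (d + np.roll(d, 1))).astype(int)
    return s, d

def lifting_inverse(s, d, alpha=-0.5, beta=0.25):
    """Exactly undoes lifting_forward, whatever alpha and beta are."""
    even = s - np.floor(beta * (d + np.roll(d, 1))).astype(int)
    odd = d - np.floor(alpha * (even + np.roll(even, -1))).astype(int)
    x = np.empty(even.size + odd.size, dtype=int)
    x[0::2], x[1::2] = even, odd
    return x
```
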
Texture Analysis of Mean Shift Segmented Low-Resolution Speckle-Corrupted Fractional SAR Imagery through Neural Network Classification

The novel proposal of this work is the application of the nonparametric mean shift technique to the segmentation of low-resolution (LR) speckle-corrupted imagery acquired with conventional low-cost fractional synthetic aperture radar (Fr-SAR) systems, with the aim of analyzing the resulting textures, related to the remotely sensed (RS) scenes, via neural network (NN) classification. The LR speckle-corrupted recovery of the spatial reflectivity maps provided by Fr-SAR systems is due to the fractional synthesis mode and the various model-level and system-level operational uncertainties peculiar to such systems operating in harsh remote sensing scenarios. The mean shift segmentation method delineates arbitrarily shaped regions in the treated LR image by locating the modes of the density distribution and grouping all pixels associated with the same mode. The textures extracted from the segmented image are then classified through NN computing for subsequent analysis and interpretation.

Gustavo D. Martín del Campo-Becerra, Juan I. Yañez-Vargas, Josué A. López-Ruíz
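
The mode-seeking core of mean shift is easy to state: each feature vector is repeatedly moved to the mean of its neighbours within a bandwidth, and vectors that settle on the same mode form one region. A deliberately naive O(n²) sketch with a flat kernel follows; the bandwidth and iteration count are assumptions.

```python
import numpy as np

def mean_shift_modes(points, bandwidth=2.0, n_iter=20):
    """Move every feature vector (one row of `points`, e.g. x, y and
    intensity per pixel) toward the nearest density mode."""
    modes = points.astype(float).copy()
    for _ in range(n_iter):
        for i in range(len(modes)):
            dist = np.linalg.norm(points - modes[i], axis=1)
            neighbours = points[dist < bandwidth]
            if len(neighbours):
                modes[i] = neighbours.mean(axis=0)
    return modes  # pixels whose modes (nearly) coincide form one segment
```
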
Morphological Change of a Scene Employing Synthetic Multispectral and Panchromatic Images

Climate change has produced transformations in the coastal zone of Tamaulipas State, including modifications to the coastline and changes in the texture-relief and texture of the zone. In this work, high-resolution panchromatic SPOT images are employed to quantify such modifications, and a synthetic multispectral image is used to validate the results. To quantify texture-relief and texture, the multispectral image is modeled as a vector field with as many dimensions as the image has bands, to which the vector operators divergence and Laplacian are applied. Results are presented for the Tampico-Altamira area; details of the methodology are shown and the results are discussed.

Jorge Lira, Erick Marín
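
As a sketch of the two operators on a raster image via finite differences (restricted here to a two-band field for the divergence; the paper's n-band vector-field formulation is more general):

```python
import numpy as np

def divergence(field):
    """Divergence of a two-band image viewed as a 2-D vector field;
    `field` has shape (2, H, W)."""
    dudx = np.gradient(field[0], axis=1)
    dvdy = np.gradient(field[1], axis=0)
    return dudx + dvdy

def laplacian(band):
    """Laplacian of a single band via second central differences."""
    gy, gx = np.gradient(band)
    return np.gradient(gy, axis=0) + np.gradient(gx, axis=1)
```
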
Comparative Evaluation of Edge Linking Methods Using Markov Chain and Regression Applied Heuristic

Many studies have addressed accurate edge detection algorithms for various applications, especially the processing of geographic images and maps such as coastline images. In this paper, we present edge linking algorithms for coastline images based on a heuristic approach using regression analysis and on a Markov chain technique. The heuristic approach uses costs based on the distance and direction of the edge terminators, while the Markov chain technique investigates possible line-drawing options for reconnecting broken edges: a Markov transition matrix is generated to find suitable edge terminators that can be reconnected. Both techniques are developed here, and their outcomes are evaluated comparatively with respect to the accuracy of reconnecting broken edges.

Haklin Kimm, Neda Abolhassani, Eun-Joo Lee
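
The distance-plus-direction cost of the heuristic approach can be sketched directly. The weights and the greedy pairing below are illustrative assumptions, not the paper's tuned procedure.

```python
import numpy as np

def link_cost(p, q, dir_p, dir_q, w_dist=1.0, w_dir=0.5):
    """Cost of joining edge terminators p and q (pixel coords): Euclidean
    distance plus each terminator's angular deviation from the joining
    segment (dir_q points away from q, hence the pi flip)."""
    v = np.asarray(q, float) - np.asarray(p, float)
    ang = np.arctan2(v[1], v[0])

    def dev(a):  # absolute angular difference wrapped to [0, pi]
        delta = abs(a - ang) % (2.0 * np.pi)
        return min(delta, 2.0 * np.pi - delta)

    return w_dist * np.linalg.norm(v) + w_dir * (dev(dir_p) + dev(dir_q + np.pi))

def link_terminators(terms, dirs):
    """Greedily pair each terminator with its cheapest unused partner."""
    pairs, used = [], set()
    for i in range(len(terms)):
        if i in used:
            continue
        cands = [(link_cost(terms[i], terms[j], dirs[i], dirs[j]), j)
                 for j in range(len(terms)) if j != i and j not in used]
        if cands:
            _, j = min(cands)
            pairs.append((i, j))
            used.update((i, j))
    return pairs
```
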
Data Fusion Approach for Employing Multiple Classifiers to Improve Lake Shoreline Analysis

Remote sensing images are widely employed to analyze bodies of water and have become essential to studying their dynamics. While indices based on threshold segmentation are the preferred approach, the search for methods that define the water edge contour continues. The segmentation algorithm introduced in this study is based on the Mean-Shift and Watershed methods. We propose a fusion classifier strategy that yields results consistent with the segmentation process; since using two or more segmentation processes has been shown to improve pattern recognition, a good data integration scheme is important. Preliminary results suggest that the approach reported herein can improve the definition of lake shorelines.

Alejandra A. López-Caloca
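
A minimal reading of the fusion idea, assuming each segmenter has been reduced to a binary water mask (the paper's fusion classifier is more elaborate than a vote):

```python
import numpy as np

def fuse_masks(masks, min_votes=2):
    """Majority-vote fusion of binary water masks from independent
    segmenters (e.g. mean-shift- and watershed-based): a pixel is
    labelled water when at least `min_votes` masks agree."""
    votes = np.sum([m.astype(int) for m in masks], axis=0)
    return votes >= min_votes
```
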
An Efficient GPU-Based Implementation of the R-MSF-Algorithm for Remote Sensing Imagery

This paper presents an efficient real-time implementation of the regularized matched spatial filter algorithm (R-MSF algorithm) for remote sensing (RS) imagery, employing the robust descriptive experiment design (DED) approach on a graphics processing unit (GPU) as the parallel architecture. The achieved performance is significantly greater than the initial requirement of two images per second. Performance results are reported in terms of metrics such as number of operations, memory requirements, execution time, and speedup, showing the improvements achieved by the parallel version over the sequential version of the algorithm.

David Castro-Palazuelos, Daniel Robles-Valdez, Deni Torres-Roman
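
One common Fourier-domain reading of a regularized matched spatial filter is sketched below; it parallelizes naturally (swapping numpy for cupy runs the same code on a GPU). The Tikhonov-style normalization and the regularization constant are assumptions, not the paper's DED-derived operator.

```python
import numpy as np

def r_msf(y, psf, lam=1e-2):
    """Correlate the data y with the system signature (PSF) and normalize
    by |A|^2 + lambda: a regularized matched-filter image estimate."""
    Y = np.fft.fft2(y)
    A = np.fft.fft2(psf, s=y.shape)
    return np.real(np.fft.ifft2(np.conj(A) * Y / (np.abs(A) ** 2 + lam)))
```
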
Hybrid FPGA/ARM Co-design for Near Real Time of Remote Sensing Imagery

A novel hybrid co-design for implementing high-resolution reconstruction algorithms for near-real-time processing of remote sensing (RS) imagery is addressed in this paper. In the proposed co-design scheme, the inverse square root and the matrix operations of the robust adaptive space filter algorithm are implemented as accelerator units in a Field Programmable Gate Array (FPGA) using piecewise polynomial approximations and systolic array (SA) techniques. The FPGA-based accelerator is then integrated with an ARM processor in a HW/SW co-design that meets the requirements of (near) real-time imaging systems, unlike conventional software implementations. Finally, we report and discuss the results of the hybrid FPGA/ARM co-design on a Xilinx Virtex-5 XC5VFX70TFFG1136 for the reconstruction of real-world RS images.

C. Góngora-Martín, A. Castillo-Atoche, J. Estrada-López, J. Vázquez-Castillo, J. Ortegón-Aguilar, R. Carrasco-Álvarez
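
A software model of the piecewise inverse-square-root unit helps to see what the FPGA accelerates; the segment count, linear (degree-1) segments, and the Newton refinement are assumptions about a typical design, not the paper's exact architecture.

```python
import numpy as np

def rsqrt_piecewise(x, segments=32):
    """Approximate 1/sqrt(x) for x > 0: range-reduce to m in [1, 4),
    interpolate a small endpoint table, refine with one Newton step."""
    x = np.asarray(x, dtype=float)
    e = np.floor(np.log2(x) / 2.0).astype(int)     # x = m * 4**e
    m = x / 4.0 ** e
    edges = np.linspace(1.0, 4.0, segments + 1)
    idx = np.minimum(((m - 1.0) / 3.0 * segments).astype(int), segments - 1)
    x0, x1 = edges[idx], edges[idx + 1]
    y0, y1 = 1.0 / np.sqrt(x0), 1.0 / np.sqrt(x1)  # endpoint table (ROM in HW)
    y = y0 + (m - x0) * (y1 - y0) / (x1 - x0)      # piecewise-linear segment
    y = y * (1.5 - 0.5 * m * y * y)                # one Newton refinement
    return y / 2.0 ** e                            # undo the range reduction
```
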
Backmatter
Metadata
Title: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
Editors: Eduardo Bayro-Corrochano, Edwin Hancock
Copyright Year: 2014
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-12568-8
Print ISBN: 978-3-319-12567-1
DOI: https://doi.org/10.1007/978-3-319-12568-8
