
About this Book

This book constitutes the refereed proceedings of the 17th Iberoamerican Congress on Pattern Recognition, CIARP 2012, held in Buenos Aires, Argentina, in September 2012. The 109 papers presented, among them two tutorials and four keynotes, were carefully reviewed and selected from various submissions. The papers are organized in topical sections on face and iris: detection and recognition; clustering; fuzzy methods; human actions and gestures; graphs; image processing and analysis; shape and texture; learning, mining and neural networks; medical images; robotics, stereo vision and real time; remote sensing; signal processing; speech and handwriting analysis; statistical pattern recognition; theoretical pattern recognition; and video analysis.



Invited Talks

Optimal “Anti-Bayesian” Parametric Pattern Classification Using Order Statistics Criteria

The gold standard for a classifier is the condition of optimality attained by the Bayesian classifier. Within a Bayesian paradigm, if we are allowed to compare the testing sample with only a single point in the feature space from each class, the Bayesian strategy would be to achieve this based on the (Mahalanobis) distance from the corresponding means. The reader should observe that, in this context, the mean, in one sense, is the most central point in the respective distribution. In this paper, we shall show that we can obtain optimal results by operating in a diametrically opposite way, i.e., in a so-called “anti-Bayesian” manner. Indeed, we shall show the completely counter-intuitive result that by working with a very few (sometimes as small as two) points distant from the mean, one can obtain remarkable classification accuracies. Further, if these points are determined by the Order Statistics of the distributions, the accuracy of our method, referred to as Classification by Moments of Order Statistics (CMOS), attains the optimal Bayes’ bound! This claim, which is totally counter-intuitive, has been proven for many uni-dimensional, and some multi-dimensional, distributions within the exponential family, and the theoretical results have been verified by rigorous experimental testing. Apart from the fact that these results are quite fascinating and pioneering in their own right, they also give a theoretical foundation for the families of Border Identification (BI) algorithms reported in the literature.

A. Thomas, B. John Oommen
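As a toy illustration of the CMOS idea, the sketch below classifies 1-D points by their distance to two symmetric quantiles of each class instead of the class means. The data, the 1/3–2/3 quantile choice, and the nearest-point rule are illustrative stand-ins, not the paper's exact estimators:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two overlapping 1-D classes (toy data; parameters are illustrative).
a = rng.normal(0.0, 1.0, 2000)
b = rng.normal(2.0, 1.0, 2000)

# "Anti-Bayesian" reference points: instead of the class means, use two
# symmetric quantiles (stand-ins for the expected order statistics).
qa = np.quantile(a, [1/3, 2/3])
qb = np.quantile(b, [1/3, 2/3])

def cmos_classify(x):
    """Assign x to the class whose nearest order-statistic point is closest."""
    da = np.min(np.abs(qa - x))
    db = np.min(np.abs(qb - x))
    return 'A' if da < db else 'B'

# Bayes-style rule based on distance to the class means, for comparison.
def mean_classify(x):
    return 'A' if abs(a.mean() - x) < abs(b.mean() - x) else 'B'
```

On symmetric distributions like these, both rules place the decision boundary near the midpoint between the classes, which is the intuition behind CMOS matching the Bayes bound.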

An Introduction to Restricted Boltzmann Machines

Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. The increase in computational power and the development of faster learning algorithms have made them applicable to relevant machine learning problems. They have recently attracted much attention after being proposed as building blocks of multi-layer learning systems called deep belief networks. This tutorial introduces RBMs as undirected graphical models. The basic concepts of graphical models are introduced first; however, basic knowledge of statistics is presumed. Different learning algorithms for RBMs are discussed. As most of them are based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and the required MCMC techniques is provided.

Asja Fischer, Christian Igel
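A minimal RBM trained with one-step contrastive divergence (CD-1), one of the MCMC-based learning algorithms such a tutorial covers, can be sketched as follows. The layer sizes, learning rate, and toy data are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Tiny binary RBM: 6 visible units, 3 hidden units (sizes are arbitrary).
n_v, n_h = 6, 3
W = 0.01 * rng.standard_normal((n_v, n_h))
b_v, b_h = np.zeros(n_v), np.zeros(n_h)

# Toy training data: two repeated binary patterns.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 50, dtype=float)

lr = 0.1
for _ in range(500):
    v0 = data
    # Positive phase: hidden probabilities given the data.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one step of Gibbs sampling (this is the "1" in CD-1).
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    # Gradient step on the CD-1 approximation of the log-likelihood gradient.
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / len(data)
    b_v += lr * (v0 - v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)

# After training, the mean-field reconstruction of a training pattern
# should be close to the pattern itself.
v = data[:1]
recon = sigmoid(sigmoid(v @ W + b_h) @ W.T + b_v)
```

Longer Gibbs chains (CD-k) or persistent chains trade computation for a better gradient approximation; this sketch uses the cheapest variant.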

Human Activity Recognition with 2D and 3D Cameras

This presentation will cover human activity recognition by using conventional 2D video cameras as well as the recently developed 3D depth cameras. I’ll first give an overview of the interest-point-based approach, which has become a popular research direction in the past few years for 2D-based activity recognition. In addition to the conventional classification problem, I’ll discuss the problem of detection (space-time localization) as well as example-based search, where the amount of labelled data is extremely small. The second part of the talk will focus on activity recognition with 3D depth cameras. I’ll describe some of the recently developed visual representations and machine learning frameworks for 3D data analysis.

Zicheng Liu

Smooth Signed Distance Surface Reconstruction and Applications

We describe a new and simple variational formulation to reconstruct the surface geometry, topology, and color map of a 3D scene from a finite set of colored oriented points. Point clouds are nowadays obtained using a variety of techniques, including structured lighting systems, passive multi-view stereo algorithms, and 3D laser scanning. In our formulation the implicit function is forced to be a smooth approximation of the signed distance function to the surface. The formulation allows for a number of different efficient discretizations, reduces to a finite dimensional least squares problem for all linearly parameterized families of functions, does not require the specification of boundary conditions, and is particularly good at extrapolating missing and/or irregularly sampled data. The resulting algorithms are significantly simpler and easier to implement than alternative methods. In particular, our implementation based on a primal-graph octree-based hybrid finite element-finite difference discretization and the Dual Marching Cubes isosurface extraction algorithm is very efficient, and produces high quality crack-free adaptive manifold polygon meshes. After the geometry and topology are reconstructed, the color information from the points is smoothly extrapolated to the surface by solving a second variational problem which also reduces to a finite dimensional least squares problem. The resulting method produces high quality polygon meshes with smooth color maps, which accurately approximate the source colored oriented points. An open source implementation of this method is available for download. We describe applications to digital archaeology, 3D forensics, and 3D broadcasting.

Gabriel Taubin
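The least-squares core of such a formulation can be illustrated in miniature: below, a quadratic implicit function is fitted to oriented points on a circle by asking f(p) ≈ 0 and ∇f(p) ≈ n in the least-squares sense. This is a drastically simplified 2-D stand-in for the paper's octree-based 3-D discretization, with a hand-picked polynomial basis:

```python
import numpy as np

# Sample oriented points on the unit circle (the "scan"): positions p and
# outward unit normals n.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
p = np.stack([np.cos(t), np.sin(t)], axis=1)
n = p.copy()  # for a circle the outward normal equals the position

# Quadratic basis phi(x, y) = [1, x, y, x^2, x*y, y^2] and its gradient.
def basis(x, y):
    return np.array([1, x, y, x**2, x*y, y**2])

def grad_basis(x, y):
    return np.array([[0, 1, 0, 2*x, y, 0],     # d/dx of each basis function
                     [0, 0, 1, 0, x, 2*y]])    # d/dy of each basis function

# Stack the two least-squares conditions: f(p) = 0 and grad f(p) = n.
rows, rhs = [], []
for (x, y), (nx, ny) in zip(p, n):
    rows.append(basis(x, y)); rhs.append(0.0)
    g = grad_basis(x, y)
    rows.append(g[0]);        rhs.append(nx)
    rows.append(g[1]);        rhs.append(ny)

c, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)

def f(x, y):
    return basis(x, y) @ c

# f is negative inside, ~0 on, and positive outside the circle, i.e. a
# smooth approximation of the signed distance function.
```

The gradient constraint is what pushes f toward a signed-distance-like function rather than the trivial solution f ≡ 0; in the paper the same idea is applied with far richer function families.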

Robot, Pass Me the Scissors! How Robots Can Assist Us in the Operating Room

The inclusion of robotics and automation to support and augment surgical performance offers the promise of shorter operating times, higher accuracy and fewer risks compared with traditional, human-only surgery. This paper discusses current research in the area of surgical robotics and human-robot collaboration. A multimodal robotic scrub nurse (Gestonurse) for the operating room (OR) is presented as a case study. Gestonurse assists the main surgeon by passing surgical instruments, thereby releasing the surgical technician to perform other tasks. Such a robotic system has the potential to reduce miscommunication and compensate for understaffing. The implications of the introduction of surgical robots, as assistants rather than autonomous agents, are discussed in terms of the societal and technological requirements. Quantitative and qualitative findings are presented as evidence to support the guidelines discussed.

Juan P. Wachs

Pattern Recognition in Transportation

The research, development and design of Intelligent Transportation Systems worldwide relies on technologies that are able to enhance security and safety, increase efficiency, reduce congestion and promote environmental sustainability. In addition, transportation systems are becoming increasingly complex as they are required to deliver mobility to large, diverse and densely populated areas across multiple modes of transportation, e.g. cars, public transport, bicycles, electric cars, etc. Transportation systems able to cope with these challenges and scale will necessarily rely on smart sensors that monitor and act upon stimuli from the environment. Of all sensor options, visual sensors will continue to be a preferred choice since they provide data that humans can easily process and verify, e.g. it is estimated that every vehicle built after 2014 will come equipped with a rear-mounted camera.

As pattern recognition techniques mature, the demand for applications in the transportation domain will only grow. These applications range from automated vehicle detection and access control to safety systems for red-light, lane or speed management, through traffic condition monitoring, incident detection, autonomous vehicles, etc. We will first take a look at the state of the art of the different solutions, with emphasis on those that present open research challenges. We will also take a look at the main trends in transportation in order to understand what research is likely to be of high relevance in future transportation systems.

José Antonio Rodriguez-Serrano

Face and Iris: Detection and Recognition

The Intrinsic Dimensionality of Attractiveness: A Study in Face Profiles

The study of human attractiveness with pattern analysis techniques is an emerging research field. One still largely unresolved problem is which facial features are relevant to attractiveness, how they combine together, and the number of independent parameters required for describing and identifying harmonious faces. In this paper, we present a first study of this problem, applied to face profiles. First, according to several empirical results, we hypothesize the existence of two well separated manifolds of attractive and unattractive face profiles. Then, we analyze their intrinsic dimensionality with manifold learning techniques. Finally, we show that the profile data can be reduced, with various techniques, to the intrinsic dimensions, largely without losing their ability to discriminate between attractive and unattractive faces.

Andrea Bottino, Aldo Laurentini

A Hybrid of Principal Component Analysis and Partial Least Squares for Face Recognition across Pose

In this paper, we propose a simple and efficient hybrid approach based on the combination of principal component analysis and partial least squares. Principal component analysis is used to reduce the dimensionality of the image in the first step, and the partial least squares method is used to carry out pose-invariant face recognition in the second step. The performance of the proposed method is compared with another popular method based on global linear regression on hybrid-eigenfaces (HGLR) in terms of classification accuracy and computation time. Experimental results on two well known publicly available face databases demonstrate the effectiveness of the proposed approach.

Ajay Jaiswal, Nitin Kumar, R. K. Agrawal
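A rough sketch of such a two-step pipeline is shown below. The toy data, the SVD-based PCA, and the single-component PLS direction (PLS1's first component, used here as a lightweight stand-in for a full PLS step) are our simplifications, not the authors' exact method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "images": two classes of 50-dimensional vectors (stand-ins for face
# images under different poses); labels coded as +1 / -1.
X = np.vstack([rng.normal(0, 1, (40, 50)) + 0.8,
               rng.normal(0, 1, (40, 50)) - 0.8])
y = np.array([1.0] * 40 + [-1.0] * 40)

# Step 1: PCA by SVD; keep the top-k principal components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 10
Z = Xc @ Vt[:k].T          # reduced-dimension representation

# Step 2: first PLS direction = covariance direction between Z and y
# (a one-component stand-in for the full PLS step of the paper).
w = Z.T @ (y - y.mean())
w /= np.linalg.norm(w)
scores = Z @ w

# Classify by the sign of the PLS score, with the threshold at zero.
pred = np.where(scores > 0, 1.0, -1.0)
acc = (pred == y).mean()
```

The appeal of the two-step design is computational: PLS is fitted in the k-dimensional PCA space rather than on raw pixels.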

Gender Classification in Large Databases

In this paper, we address the challenge of gender classification using large databases of images with two goals. The first objective is to evaluate whether the error rate decreases compared to smaller databases. The second goal is to determine if the classifier that provides the best classification rate for one database, improves the classification results for other databases, that is, the cross-database performance.

Enrique Ramón-Balmaseda, Javier Lorenzo-Navarro, Modesto Castrillón-Santana

Combining Face and Facial Feature Detectors for Face Detection Performance Improvement

In this paper, we experimentally study the combination of face and facial feature detectors to improve face detection performance. The face detection problem, as suggested by recent face detection challenges, is still not solved. Face detectors traditionally fail in large-scale problems and/or when the face is occluded or different head rotations are present. The combination of face and facial feature detectors is evaluated with a public database. The obtained results evidence an improvement in the positive detection rate while reducing the false detection rate. Additionally, we prove that the integration of facial feature detectors provides useful information for pose estimation and face alignment.

Modesto Castrillón-Santana, Daniel Hernández-Sosa, Javier Lorenzo-Navarro

Randomized Face Recognition on Partially Occluded Images

In this work we propose a new method for face recognition that successfully handles occluded faces. We propose an innovative improvement that makes it possible to detect and discard occluded zones of the face, thus making recognition more robust in the presence of occlusion. We provide experimental results that show that the proposed method performs well in practice.

Ariel Morelli Andres, Sebastian Padovani, Mariano Tepper, Marta Mejail, Julio Jacobo

Face Recognition: Would Going Back to Functional Nature Be a Good Idea?

Traditional appearance-based methods for face recognition represent raw face images of size h × w as vectors in an (h × w)-dimensional space. However, in practice this space can be too large to perform classification. For that reason, dimensionality reduction techniques are usually employed. Most of those traditional approaches do not take advantage of the spatial correlation of pixels in the image, considering them as independent. In this paper, we propose a new representation of face images that takes into account the smoothness and continuity of the face image and at the same time deals with the dimensionality of the problem. This representation is based on Functional Data Analysis: each face image is represented by a function, and a recognition algorithm for functional spaces is formulated. The experiments on the AT&T and Yale B facial databases show the effectiveness of the proposed method.

Noslen Hernández, Yoanna Martínez-Díaz, Dania Porro-Muñoz, Heydi Méndez-Vázquez

Dissimilarity Representations Based on Multi-Block LBP for Face Detection

Face representation is one of the open problems in face detection. The recently proposed Multi-Block Local Binary Patterns (MB-LBP) representation has shown good results for this purpose. Although dissimilarity representation has proved to be effective in a variety of pattern recognition problems, to the best of our knowledge it has never been used for face detection. In this paper, we propose new dissimilarity representations based on MB-LBP features for this task. Different experiments conducted on a public database showed that the proposed representations are more discriminative than the original MB-LBP representation when classifying faces. Using the dissimilarity representations, good classification accuracy is achieved even when less training data is available.

Yoanna Martínez-Díaz, Heydi Méndez-Vázquez, Yenisel Plasencia-Calaña, Edel B. García-Reyes

On the Vulnerability of Iris-Based Systems to a Software Attack Based on a Genetic Algorithm

The vulnerabilities of a standard iris verification system to a novel indirect attack based on a binary genetic algorithm are studied. The experiments are carried out on the iris subcorpus of the publicly available BioSecure DB. The attack has shown a remarkable performance, thus proving the lack of robustness of the tested system to this type of threat. Furthermore, the consistency of the bits of the iris code is analysed, and a second working scenario discarding the fragile bits is then tested as a possible countermeasure against the proposed attack.

Marta Gomez-Barrero, Javier Galbally, Pedro Tome, Julian Fierrez
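The flavour of such an indirect attack can be sketched with a toy binary genetic algorithm run against a hypothetical Hamming-similarity matcher. The template, matcher, acceptance threshold, and all GA parameters below are illustrative stand-ins, not BioSecure or iris-code components:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the attacked system: a fixed enrolled binary
# template and a matcher that returns the fraction of agreeing bits. A real
# iris matcher (and iris codes) would be far more complex.
N = 256
target = rng.integers(0, 2, N)

def matcher_score(candidate):
    return float(np.mean(candidate == target))

THRESHOLD = 0.95   # illustrative acceptance threshold

# Binary genetic algorithm: elitist selection, one-point crossover, mutation.
pop = rng.integers(0, 2, (30, N))
for _ in range(600):
    scores = np.array([matcher_score(ind) for ind in pop])
    if scores.max() >= THRESHOLD:
        break                      # the matcher accepts: attack succeeded
    order = np.argsort(scores)[::-1]
    parents = pop[order[:15]]      # keep the better half (elitism)
    children = []
    for _ in range(15):
        i, j = rng.integers(0, 15, 2)
        cut = int(rng.integers(1, N))
        child = np.concatenate([parents[i][:cut], parents[j][cut:]])
        flip = rng.random(N) < 0.01            # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, np.array(children)])

best = max(pop, key=matcher_score)
```

The attack only needs the scalar scores returned by the matcher, which is exactly why score-based hill climbing is a realistic threat model for verification systems.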


Clustering

On the Robustness of Kernel-Based Clustering

This paper evaluates the robustness of two types of unsupervised learning methods that work in feature spaces induced by a kernel function: kernel k-means and kernel symmetric non-negative matrix factorization. The main hypothesis is that the use of non-linear kernels makes these clustering algorithms more robust to noise and outliers. The hypothesis is corroborated by applying kernel and non-kernel versions of the algorithms to data with different degrees of contamination with noisy data. The results show that the kernel versions of the clustering algorithms are indeed more robust, i.e., producing estimates with lower bias in the presence of noise.

Fabio A. González, David Bermeo, Laura Ramos, Olfa Nasraoui
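Kernel k-means itself only ever needs the kernel matrix, since the squared distance from a point to a cluster centroid in feature space expands into kernel evaluations. The sketch below runs it on toy ring data; the hand-picked radius-based RBF kernel (which makes the rings tight clusters in feature space) and all parameters are our illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two concentric rings: not linearly separable in the input space.
def ring(r, m):
    t = rng.uniform(0, 2 * np.pi, m)
    pts = np.stack([r * np.cos(t), r * np.sin(t)], axis=1)
    return pts + 0.05 * rng.standard_normal((m, 2))

X = np.vstack([ring(1.0, 100), ring(4.0, 100)])
truth = np.array([0] * 100 + [1] * 100)

# Hand-crafted kernel for this toy: an RBF on the point radii. The algorithm
# below only touches the kernel matrix K, never explicit feature vectors.
r = np.linalg.norm(X, axis=1)
K = np.exp(-(r[:, None] - r[None, :]) ** 2)

def kernel_kmeans(K, k, labels, iters=20):
    n = len(K)
    for _ in range(iters):
        D = np.full((n, k), np.inf)
        for c in range(k):
            m = labels == c
            if m.sum() == 0:
                continue
            # ||phi(x) - mu_c||^2 expressed through kernel values only:
            # K(x,x) - 2 * mean_j K(x,j) + mean_{j,j'} K(j,j')
            D[:, c] = np.diag(K) - 2 * K[:, m].mean(1) + K[np.ix_(m, m)].mean()
        new = D.argmin(1)
        if np.array_equal(new, labels):
            break
        labels = new
    obj = D[np.arange(n), labels].sum()
    return labels, obj

# A few random restarts; keep the partition with the lowest objective.
best = min((kernel_kmeans(K, 2, rng.integers(0, 2, len(X)))
            for _ in range(5)), key=lambda t: t[1])[0]

acc = max((best == truth).mean(), (best != truth).mean())
```

Swapping K for a linear kernel reduces this to ordinary k-means, which is the comparison the paper's robustness experiments build on.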

An Improved Multi-Class Spectral Clustering Based on Normalized Cuts

In this work, we present an improved multi-class spectral clustering (MCSC) method that represents an alternative to standard k-way normalized clustering, avoiding the use of an iterative algorithm for tuning the orthogonal rotation matrix. The performance of the proposed method is compared with conventional MCSC and k-means in terms of different clustering quality indicators. Results are reported on commonly used toy data sets with hardly separable classes, as well as on an image segmentation database. In addition, a novel unsupervised measure is introduced as a clustering indicator to quantify the performance of the proposed method. The proposed method requires less processing time than conventional spectral clustering approaches.

Diego Hernán Peluffo-Ordóñez, Carlos Daniel Acosta-Medina, César Germán Castellanos-Domínguez

A Simple Hybrid Method for Semi-Supervised Learning

We introduce and describe the Hybrid Semi-Supervised Method (HSSM) for learning. This is the first hybrid method aimed at solving problems with both labeled and unlabeled data. The new method uses an unsupervised stage in order to decompose the full problem into a set of simpler subproblems. HSSM applies simple stopping criteria during the unsupervised stage, which allows the method to concentrate on the difficult portions of the original problem. The new algorithm also makes use of a simple strategy to select, at each subproblem, a small subset of unlabeled samples that are relevant to modify the decision surface. To this end, HSSM trains a linear SVM on the available labeled samples, and selects the unlabeled samples that lie within the margin of the trained SVM. We evaluated the new method using a previously introduced setup, which includes datasets with very different properties. Overall, the error levels produced by the new HSSM are similar to other SSL methods, but HSSM is shown to be more efficient than all previous methods, using only a small fraction of the available unlabeled data.

Hernán C. Ahumada, Pablo M. Granitto
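The margin-based selection step can be sketched as follows. A least-squares linear separator stands in for the linear SVM of the paper, and the Gaussian toy data and band width are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Labeled samples from two Gaussian classes, plus a large unlabeled pool.
Xl = np.vstack([rng.normal(-2, 1, (10, 2)), rng.normal(2, 1, (10, 2))])
yl = np.array([-1.0] * 10 + [1.0] * 10)
Xu = rng.normal(0, 2, (500, 2))   # unlabeled pool

# Stand-in for the paper's linear SVM: a least-squares linear separator
# f(x) = w . x + b fitted to the +/-1 labels.
A = np.hstack([Xl, np.ones((len(Xl), 1))])
coef, *_ = np.linalg.lstsq(A, yl, rcond=None)
w, b = coef[:2], coef[2]

# HSSM-style selection: keep only the unlabeled points inside the margin
# band |f(x)| < 1, i.e. the ones that can actually move the decision surface.
f = Xu @ w + b
selected = Xu[np.abs(f) < 1.0]

frac = len(selected) / len(Xu)    # only a fraction of the pool is kept
```

Points far from the boundary are unlikely to change the separator, so discarding them is what buys the efficiency the abstract reports.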

Clustering of Incomplete Data and Evaluation of Clustering Quality

Two approaches to solving the problem of clustering with gaps for a specified number of clusters are considered. The first approach is based on restoring the values of unknown attributes and solving the clustering problem on the resulting complete data. The second approach is based on solving a finite set of clustering tasks over complete sample descriptions corresponding to the incomplete data, and on constructing a collective decision. For both approaches, clustering quality criteria are proposed as functions of the incomplete descriptions. Results of practical experiments are presented.

Vladimir V. Ryazanov

A New Classifier Combination Scheme Using Clustering Ensemble

Combination of multiple classifiers has been shown to increase classification accuracy in many application domains. Besides, the use of cluster analysis techniques in supervised classification tasks has shown that they can enhance the quality of the classification results. This is based on the fact that clusters can provide supplementary constraints that may improve the generalization capability of the classifiers. In this paper we introduce a new classifier combination scheme which is based on the Decision Templates Combiner. The proposed scheme uses the same concept of representing the classifiers' decisions as a vector in an intermediate feature space, and builds more representative decision templates by using clustering ensembles. An experimental evaluation was carried out on several synthetic and real datasets. The results show that the proposed scheme increases the classification accuracy over the Decision Templates Combiner and other classical classifier combination methods.

Miguel A. Duval-Poo, Joan Sosa-García, Alejandro Guerra-Gandón, Sandro Vega-Pons, José Ruiz-Shulcloper

Nested Dichotomies Based on Clustering

Multiclass problems, i.e., classification problems involving more than two classes, are a common scenario in supervised classification. An important approach to solving this type of problem consists in repeatedly using binary classifiers; within this category we find nested dichotomies. However, most of the methods for building nested dichotomies use a random strategy, which does not guarantee finding a good one. In this work, we propose new non-random methods for building nested dichotomies, using the idea of reducing misclassification errors by separating in the higher levels those classes that are easier to separate, and in the lower levels those classes that are more difficult to separate. In order to evaluate the performance of the proposed methods, we compare them against methods that randomly build nested dichotomies, using some datasets (with mixed data) taken from the UCI repository.

Miriam Mónica Duarte-Villaseñor, Jesús Ariel Carrasco-Ochoa, José Francisco Martínez-Trinidad, Marisol Flores-Garrido

Combining Re-Ranking and Rank Aggregation Methods

Content-Based Image Retrieval (CBIR) aims at retrieving the most similar images in a collection by taking into account image visual properties. In this scenario, accurately ranking collection images is of great relevance. Aiming at improving the effectiveness of CBIR systems, re-ranking and rank aggregation algorithms have been proposed. However, different re-ranking and rank aggregation approaches produce different image rankings. These rankings are complementary and, therefore, can be further combined aiming at obtaining more effective results. This paper presents novel approaches for combining re-ranking and rank aggregation methods aiming at improving the effectiveness of CBIR systems. Several experiments were conducted involving shape, color, and texture descriptors. Experimental results demonstrate that our approaches can improve the effectiveness of CBIR systems.

Daniel Carlos Guimarães Pedronette, Ricardo da S. Torres

Extracting Understandable 3D Object Groups with Multiple Similarity Metrics

Some of the main difficulties involved in the clustering problem are the interpretation of the clusters and the choice of the number of clusters. The imposition of a complete clustering, in which all the objects must be classified, might lead to incoherent and unconvincing groups. In this paper we present an approach which alleviates this problem by proposing incomplete but reliable clustering strategies. The method is based on two pillars: using a set of different metrics which are evaluated through a clustering confidence measure, and achieving a hard/soft clustering consensus. This method is particularly addressed to 3D shape grouping, in which the objects are represented through geometric features defined over mesh models. Our approach has been tested using eight metrics defined on geometrical descriptors in a collection of free-shape objects. The results show that in all cases the algorithm yields coherent and meaningful groups for several numbers of clusters. The clustering strategy proposed here might be useful for future developments in the unsupervised grouping field.

Antonio Adán, Miguel Adán

Semantic Representation of Geospatial Objects Using Multiple Knowledge Domains

Geographical data are obtained through abstractions made from objects in the real world. Generally, each of these abstractions is obtained by taking into account only one point of view on the object being analyzed. When different abstractions are made of the same object, different data sources related to it are produced. These data sources are generally heterogeneous. Thus, the semantic processing of these objects becomes a challenge, since different data sources must be combined to obtain good results in tasks such as information retrieval and analysis for decision-making. This paper presents an approach based on ontologies to enrich the semantic representation of geospatial objects, taking into account the different abstractions made of them. The experimental results show the usefulness of this approach and how it is possible to build a multidimensional semantic representation automatically using classification algorithms and search techniques on trees.

Rainer Larin Fonseca, Eduardo Garea Llano

Fuzzy Methods

Feature Extraction and Classification for Insect Footprint Recognition

We propose a method to extract and classify insect footprints for the purpose of recognition. Our four-level procedural feature extraction model is defined as follows: First, images produce new data via the trace transform. Second, for reducing the dimensionality of the produced data, we apply some mathematical conversions. Third, dimensionality-reduced data are converted into frequency components. Finally, characteristic signals with significant components of representative values are created by excluding insignificant factors such as those related to noise. For classification, based on uncertain features, we propose a decision method defined by fuzzy weights and a fuzzy weighted mean. The proposed fuzzy weight decision method estimates weights according to degrees of contribution. Weights are assigned by ranking the degree of a feature’s contribution. We present experimental results of classification by using the proposed method on scanned insect footprints. Experiments show that the proposed method is suitable for noisy footprints with irregular directions, or symmetrical patterns in the extracted segments.

Bok-Suk Shin, James Russell, Reinhard Klette

Vector Transition Classes Generation from Fuzzy Overlapping Classes

We present in this paper a way to create transition classes and to represent them with vector structures. These classes are obtained using a supervised classification algorithm based on fuzzy decision trees. The method is useful for classifying data whose spatial evolution follows a gradient, such as forests, where transitions are spread over hundreds of meters, or other natural phenomena. The vector representation is well adapted for integration in Geographical Information Systems because it is a more flexible structure than the raster representation. The method takes into account local environmental conditions and leads to non-regular gradient and fuzzy structures. It allows adding classes, called transition classes, when transition areas are too spread out, instead of fixing an arbitrary border between classes.

Enguerran Grandchamp, Sébastien Régis, Alain Rousteau

Infant Cry Classification Using Genetic Selection of a Fuzzy Model

In recent years, infant cry recognition has been of particular interest because the cry contains useful information to determine whether the infant is hungry, is in pain, or has a particular disease. Several studies have been performed in order to differentiate between these kinds of cries. In this work, we propose to use Genetic Selection of a Fuzzy Model (GSFM) for the classification of infant cries. GSFM selects a combination of feature selection methods, type of fuzzy processing, learning algorithm, and its associated parameters that best fits the data. The experiments demonstrate the feasibility of this technique in the classification task. Our experimental results reach up to 99.42% accuracy.

Alejandro Rosales-Pérez, Carlos A. Reyes-García, Jesus A. Gonzalez, Emilio Arch-Tirado

Human Actions and Gestures

Intention, Context and Gesture Recognition for Sterile MRI Navigation in the Operating Room

Human-Computer Interaction (HCI) devices such as the keyboard and the mouse are among the most contaminated regions in an operating room (OR). This paper proposes a sterile, intuitive HCI to navigate MRI images using freehand gestures. The system incorporates contextual cues and intent of the user to strengthen the gesture recognition process. Experimental results showed that while performing an image navigation task, mean intent recognition accuracy was 98.7% and that the false positive rate of gesture recognition dropped from 20.76% to 2.33% with context integration at similar recognition rates.

Mithun Jacob, Christopher Cange, Rebecca Packer, Juan P. Wachs

Facilitated Gesture Recognition Based Interfaces for People with Upper Extremity Physical Impairments

A gesture recognition based interface was developed to offer people with upper extremity physical impairments an alternative way to perform laboratory experiments that require ‘physical’ manipulation of components. A color, depth and spatial information based particle filter framework was constructed with unique descriptive features for face and hands representation. The same feature encoding policy was subsequently used to detect, track and recognize users’ hands. Motion models were created employing the dynamic time warping (DTW) method for better observation encoding. Finally, the hand trajectories were classified into different classes (commands) by applying the CONDENSATION method and, in turn, an interface was designed for robot control, with a recognition accuracy of 97.5%. To assess the gesture recognition and control policies, a validation experiment consisting of controlling a mobile service robot and a robotic arm in a laboratory environment was conducted.

Hairong Jiang, Juan P. Wachs, Bradley S. Duerstock

A Performance Evaluation of HMM and DTW for Gesture Recognition

It is unclear whether Hidden Markov Models (HMMs) or Dynamic Time Warping (DTW) techniques are more appropriate for gesture recognition. In this paper, we compare both methods using different criteria, with the objective of determining the one with better performance. For this purpose we have created a set of recorded gestures. The dataset used includes many samples of ten different gestures, with their corresponding ground truth obtained with a Kinect. The dataset is made public for benchmarking purposes.

The results show that DTW gives higher performance than HMMs, and strongly support the use of DTW.

Josep Maria Carmona, Joan Climent
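A minimal DTW implementation, of the kind such a comparison relies on, can be written directly from the standard recurrence. The 1-D "gestures" below are illustrative toy signals:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # best of insertion, deletion, and match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A gesture trajectory and a time-stretched copy of it: DTW aligns them,
# while a plain point-by-point distance could not (different lengths).
t = np.linspace(0, 2 * np.pi, 40)
gesture = np.sin(t)
stretched = np.sin(np.linspace(0, 2 * np.pi, 60))
other = np.cos(np.linspace(0, 2 * np.pi, 60))

d_same = dtw_distance(gesture, stretched)
d_diff = dtw_distance(gesture, other)
# d_same is much smaller than d_diff, so a nearest-neighbour classifier
# over DTW distances can recognize the gesture despite the time warp.
```

An HMM-based recognizer instead fits a generative model per gesture class and scores test sequences by likelihood, which is the trade-off the paper evaluates.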

Human Gait Identification Using Persistent Homology

This paper shows an image/video application using topological invariants for human gait recognition. Using a background subtraction approach, a stack of silhouettes is extracted from a subsequence and glued through their gravity centers, forming a 3D digital image I. From this 3D representation, the border simplicial complex ∂K(I) is obtained. We order the triangles of ∂K(I), obtaining a sequence of subcomplexes of ∂K(I). The corresponding filtration captures relations among the parts of the human body when walking. Finally, a topological gait signature is extracted from the persistence barcode of this filtration. In this work we obtain 98.5% correct classification rates on the CASIA-B database.

Javier Lamar-León, Edel B. García-Reyes, Rocío Gonzalez-Diaz

STOP: Space-Time Occupancy Patterns for 3D Action Recognition from Depth Map Sequences

This paper presents Space-Time Occupancy Patterns (STOP), a new visual representation for 3D action recognition from sequences of depth maps. In this new representation, space and time axes are divided into multiple segments to define a 4D grid for each depth map sequence. The advantage of STOP is that it preserves spatial and temporal contextual information between space-time cells while being flexible enough to accommodate intra-action variations. Our visual representation is validated with experiments on a public 3D human action dataset. For the challenging cross-subject test, we significantly improved the recognition accuracy from the previously reported 74.7% to 84.8%. Furthermore, we present an automatic segmentation and time alignment method for online recognition of depth sequences.

Antonio W. Vieira, Erickson R. Nascimento, Gabriel L. Oliveira, Zicheng Liu, Mario F. M. Campos

Human Activity Recognition by Class Label LLE

Human motion analysis has emerged as an important area of research for different fields and applications. However, analyzing image and video sequences to perform tasks such as action recognition becomes a challenge due to the high dimensionality of this type of data, not to mention the restrictions in the recording conditions (lighting, angle, distances, etc.). In that sense, we propose a framework for human action recognition which involves a preprocessing stage that decreases the influence of the recording conditions on the analysis. Further, our proposal is based on a new supervised feature extraction technique that includes class label information in the mapping process, to enhance both the unfolding of the underlying data structure and the margin of separability among classes. The proposed methodology is tested on a benchmark dataset. Results show that our approach attains suitable performance using straightforward classifiers.

Juliana Valencia-Aguirre, Andrés M. Álvarez-Meza, Genaro Daza-Santacoloma, Carlos Daniel Acosta-Medina, Germa Castellanos-Domínguez

Fast Non-parametric Action Recognition

In this work we propose a method for action recognition which needs no intensive learning stage, and achieves state-of-the-art classification performance. Our work is based on a method presented in the context of image classification. Unlike that method, our approach is well-suited for working with large real-world problems, thanks to an efficient organization of the training data. We show results on the KTH and IXMAS datasets. On the challenging IXMAS dataset, the average running time is reduced by 50% when using our method.

Sebastián Ubalde, Norberto Adrián Goussies

A Human-Computer Interface Using Facial Gestures for the Game of Truco

In this work we present a method to detect and recognize the signs of the card game of Truco, which are a subset of facial gestures. The method uses temporal templates to represent motion and later extract features. The proposed method works in real time, which allows using it as a human-computer interface, for example, in the context of the card game of Truco. To the best of our knowledge, this is the first work that uses detection of facial gestures in the context of a game.

Gonzalo Castillo, Santiago Avendaño, Norberto Adrián Goussies


Using Word Graphs as Intermediate Representation of Uttered Sentences

We present an algorithm for building graphs of words as an intermediate representation of uttered sentences. No language model is used. The input data for the algorithm are the pronunciation lexicon organized as a tree and the sequence of acoustic frames. The transitions between consecutive units are considered as additional units.

Nodes represent discrete instants of time, arcs are labelled with words, and a confidence measure is assigned to each detected word, which is computed by using the phonetic probabilities of the subsequence of acoustic frames used for completing the word.

We evaluated the obtained word graphs by searching for the path that best matches the correct sentence and then measuring the word accuracy, i.e. the oracle word accuracy.

Jon Ander Gómez, Emilio Sanchis

Image Classification Using Frequent Approximate Subgraphs

Frequent approximate subgraph (FAS) mining is used in applications where it is important to tolerate slight distortions in the data. Following this consideration, some FAS miners have been developed and applied in several domains of science. However, there are few works related to the application of these types of graph miners in classification tasks. In this paper, we propose a new framework for image classification, which uses FAS patterns as features. We also propose to compute automatically the substitution matrices needed in the process, instead of relying on expert knowledge. Our approach is tested on two real image collections, showing that it obtains good results, comparable to other non-miner solutions reported, and that FAS mining is better than the exact approach for this task.

Niusvel Acosta-Mendoza, Annette Morales-González, Andrés Gago-Alonso, Edel B. García-Reyes, José E. Medina-Pagola

Using Rough Sets and Maximum Similarity Graphs for Nearest Prototype Classification

The nearest neighbor rule (NN) is one of the most powerful yet simple non-parametric classification techniques. However, it is time consuming and very sensitive to noisy and outlier objects. To address these deficiencies, several prototype selection methods have been proposed by the scientific community. In this paper, we propose a new editing and condensing method. Our method combines Rough Set theory and Compact Sets structuralizations to obtain a reduced prototype set. Numerical experiments on repository databases show the high-quality performance of our method in terms of classifier accuracy.

Yenny Villuendas-Rey, Yailé Caballero-Mota, María Matilde García-Lorenzo
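
As background for the condensing step discussed above, the classic baseline is Hart's condensed nearest neighbour, sketched below; the paper's rough-set/compact-set method differs, so this is only an illustration of what "condensing" means.

```python
import numpy as np

def condense(X, y):
    """Hart's condensed nearest neighbour: keep a prototype subset that
    still classifies every training sample correctly with 1-NN."""
    keep = [0]                      # start with an arbitrary prototype
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            d = np.linalg.norm(X[keep] - X[i], axis=1)
            if y[keep[np.argmin(d)]] != y[i]:
                keep.append(i)      # misclassified -> add as prototype
                changed = True
    return sorted(set(keep))
```

On two well-separated clusters this keeps one prototype per class, which is the reduction effect prototype selection aims for.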

Hierarchical Elastic Graph Matching for Hand Gesture Recognition

This paper proposes a hierarchical scheme for hand posture recognition by elastic graph matching. The hierarchy is expressed in terms of weights assigned to visual features scattered over an elastic graph. The weights in the graph’s nodes are adapted according to their relative ability to enhance recognition, and are determined using adaptive boosting. A dictionary representing the variability of each gesture class is proposed, in the form of a collection of graphs (a bunch graph). Positions of nodes in the bunch graph are created using three techniques: manually, semi-automatically, and automatically. The recognition results show that the hierarchical weighting of features has significant discriminative power compared to the classic method (uniform weighting). Experimental results also show that the semi-automatic annotation method provides efficient and accurate performance in terms of two performance measures: cost function and accuracy.

Yu-Ting Li, Juan P. Wachs

On Speeding up Frequent Approximate Subgraph Mining

Frequent approximate subgraph (FAS) mining has become an interesting task with wide applications in several domains of science. Most previous studies have focused on reducing the search space or the number of canonical form (CF) tests. CF-tests are commonly used for duplicate detection; however, they affect the efficiency of the mining process because of their high computational complexity. In this paper, two prunes are proposed that decrease the label space, the number of candidates, and the number of CF-tests. The proposed prunes are applied and validated in two reported FAS miners, speeding up their mining processes on artificial graph collections.

Niusvel Acosta-Mendoza, Andrés Gago-Alonso, José E. Medina-Pagola

Image Processing and Analysis

Segmentation of Building Facade Domes

Domes are architectural structural elements typical of ecclesiastical and secular grand buildings, like churches, mosques, palaces, capitols and city halls. The current paper targets the problem of dome segmentation within the framework of architectural style classification of building facades. We perform segmentation of building facade domes by combining bilateral symmetry detection, graph-based segmentation approaches, and image analysis and processing techniques into a single method. Our algorithm achieves good segmentation results on buildings belonging to a variety of architectural styles, such as Renaissance, Neo-Renaissance, Baroque, Neo-Baroque, Neoclassical and Islamic.

Gayane Shalunts, Yll Haxhimusa, Robert Sablatnig

Human Relative Position Detection Based on Mutual Occlusion

In this paper, we propose, within the field of automatic social context analysis, a novel method to identify the mutual position between two persons in images. Based on the idea that mutual information of head position, body visibility and bodies’ contour shapes may lead to a good estimation of the mutual position between people, a predictor is constructed to classify the relative position between both subjects. We advocate the use of superpixels as the basic unit of the human analysis framework. We construct a Support Vector Machine classifier on the feature vector for each image. The results show that this combination of features provides a significantly low error rate with low variance on our database of 366 images.

Víctor Borjas, Michal Drozdzal, Petia Radeva, Jordi Vitrià

Online Matrix Factorization for Multimodal Image Retrieval

In this paper, we propose a method to build an index for image search using multimodal information, that is, using visual features and text data simultaneously. The method combines both data sources and generates one multimodal representation using latent factor analysis and matrix factorization. One remarkable characteristic of this multimodal representation is that it connects textual and visual content, which allows solving queries that have only visual content by implicitly completing the missing textual content. Another important characteristic of the method is that the multimodal representation is learned online using an efficient stochastic gradient descent formulation. Experiments were conducted on a dataset of 5,000 images to evaluate the convergence speed and search performance. Experimental results show that the proposed algorithm requires only one pass through the data set to achieve high-quality retrieval performance.

Juan C. Caicedo, Fabio A. González
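
The online SGD idea can be sketched generically as below; this is an assumed rank-k factorization X ≈ W·H updated one row at a time, and does not reproduce the paper's multimodal formulation that stacks text and visual blocks.

```python
import numpy as np

def online_mf(X, k=2, lr=0.01, epochs=1, seed=0):
    """One-pass-style stochastic gradient matrix factorization X ~ W @ H,
    streaming one sample (row) at a time."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = 0.1 * rng.standard_normal((n, k))
    H = 0.1 * rng.standard_normal((k, m))
    for _ in range(epochs):
        for i in rng.permutation(n):       # stream rows in random order
            err = X[i] - W[i] @ H          # residual for this sample
            W[i] = W[i] + lr * err @ H.T   # gradient step on the row factor
            H += lr * np.outer(W[i], err)  # gradient step on the dictionary
    return W, H
```

The per-sample updates are what makes the representation learnable online, without holding the full collection in memory.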

Improved HSI Color Space for Color Image Segmentation

We present an interactive, semiautomatic image segmentation method that processes the color information of each pixel as a unit, thus avoiding color information scattering. The color information of every pixel is integrated into the segmented image by an adaptive color similarity function designed for direct color comparisons. The border between the achromatic and chromatic zones of the HSI color model has been transformed in order to improve the quality of pixel segmentation when colors are very dark or very bright. The color integration technique is direct, simple and computationally inexpensive, and it also performs well on low-chromaticity and low-contrast images. It is shown that segmentation accuracy is above 95% on average and that the method is fast. These results are significant when compared to other solutions found in the current literature.

Rodolfo Alvarado-Cervantes, Edgardo M. Felipe-Riveron
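
For reference, the classic RGB-to-HSI conversion underlying the space discussed above can be written as follows (the paper additionally reshapes the achromatic/chromatic border of this space, which is not reproduced here):

```python
import numpy as np

def rgb_to_hsi(r, g, b):
    """Classic RGB -> HSI conversion for components in [0, 1].
    Returns hue in degrees, saturation and intensity in [0, 1]."""
    i = (r + g + b) / 3.0
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
    h = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    if b > g:
        h = 360.0 - h       # hue lives on a full circle
    return h, s, i
```

Near the achromatic axis (r ≈ g ≈ b) the hue becomes numerically unstable, which is exactly why the achromatic/chromatic border matters for segmentation.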

Wavelet-FFT Filter Applied to Non Uniformity Correction in Infrared Imaging System

In this paper, we use the recently presented wavelet-FFT filter [1] to reduce the nonuniformity noise that affects almost all infrared imaging systems. The wavelet-FFT filter was originally developed to compensate the one-dimensional noise known as striping noise. We extend this methodology in order to compensate the two-dimensional noise that degrades infrared imagery. The principal hypothesis of this work is that the two-dimensional focal-plane array can be considered as the composition of vertical and horizontal one-dimensional array sensors. Under this assumption, we use a specific design of the wavelet filter to synthesize a replica of the two-dimensional noise and then recover the real incident radiation. The method is evaluated using real mid- and long-wave infrared data from two cameras. The results show the promising performance of the wavelet-FFT filter when applied to infrared imaging systems affected by artifacts such as the self-heating effect.

Cesar San Martin, Carlos Deocares, S. Godoy, P. Meza, Daniela Bonilla

Concealing Damaged Coded Images Using Improved FSE with Critical Support Area

The transmission of block-based coded images over error-prone networks may result in the loss of several image blocks, drastically degrading the visual quality of the images. Consequently, if retransmission is not feasible, error concealment techniques are required to reduce this degradation, caused mainly by the missing information. This paper proposes an adaptive and effective method to select the required support area, using suited basis functions and optimal expansion coefficients, in order to conceal the damaged blocks in critical error situations. This method outperforms the concealment achieved by the conventional frequency selective extrapolation approach. It also performs well in situations where significant loss of information is present and the data of past reference images are not available. The proposed method and the reviewed algorithms were implemented, tested and compared. Experimental results show that the proposed approach outperforms existing methods by up to 7.2 dB.

Alejandro Alvaro Ramírez-Acosta, Mireya S. García-Vázquez, Sunil Kumar

Sketchable Histograms of Oriented Gradients for Object Detection

In this paper we investigate a new representation approach for visual object recognition. The new representation, called sketchable-HoG, extends the classical histogram of oriented gradients (HoG) feature by adding two different aspects: the […] of the majority orientation and the […] of gradient orientations. In this way, the sketchable-HoG locally characterizes the complexity of an object model and introduces global structure information while still keeping simplicity, compactness and robustness. We evaluated the proposed image descriptor on the public Caltech 101 dataset. The obtained results outperform the classical HoG descriptor as well as other descriptors reported in the literature.

Ekaterina Zaytseva, Santi Seguí, Jordi Vitrià
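
The classical HoG building block the descriptor above extends is the per-cell orientation histogram, sketched here in a minimal form (gradient magnitudes voted into unsigned-orientation bins; the sketchable-HoG extensions themselves are not reproduced):

```python
import numpy as np

def hog_cell(patch, bins=9):
    """Orientation histogram for one HOG cell: each pixel votes its
    gradient magnitude into an unsigned-orientation bin (0-180 degrees)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    hist = np.zeros(bins)
    for m, a in zip(mag.ravel(), ang.ravel()):
        hist[int(a / (180.0 / bins)) % bins] += m
    return hist
```

A horizontal intensity ramp, for example, puts all its mass into the 0-degree bin.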

Supervised Biometric System Using Multimodal Compression Scheme

This work is part of a project supported by STIC Am-Sud whose main objective is to design an intelligent vision system to protect children from some critical information accessible from the Internet, from some videos or from some video games that are related to violence, wars, pornography, etc. Considered definitively not appropriate for their age, such multimedia contents can significantly offend young people. More specifically, in this paper we discuss a general concept of a supervised biometric system that is controlled by specific tags embedded in video frames through multimodal compression. Using a spiral insertion scheme, specific frequencies (TAGs) are compressed jointly with the video frames in the region of insertion and then extracted for supervision purposes. Multimodal compression is considered here because it allows high-level robustness with respect to bitrates and down-sampling.

Wafa Chaabane, Régis Fournier, Amine Naït-ali, Julio Jacobo, Marta Mejail, Marcelo Mottalli, Heitor Ramos, Alejandro C. Frery, Leonardo Viana

A New Morphological Measure of Histogram Bimodality

The presence of multiple modes in a histogram gives important information about the data distribution in a great number of different applications. The dip test has been the most common statistical measure used for this purpose.

Histograms of oriented gradients (HOGs) with high bimodality have been shown to be very useful for detecting highly robust keypoints. However, the dip test presents serious disadvantages when dealing with such histograms. In this paper we describe the drawbacks of the dip test for determining HOG bimodality, and present a new bimodality test, based on mathematical morphology, that overcomes them.

Miguel Angel Cataño, Joan Climent

Street Detection with Asymmetric Haar Features

We present a system for object detection applied to street detection in satellite images. Our system is based on asymmetric Haar features. Asymmetric Haar features provide a rich feature space, which allows building classifiers that are accurate and much simpler than those obtained with other features. The extremely large parameter space of potential features is explored using a genetic algorithm. Our system uses specialized detectors for different street orientations that are built using AdaBoost and the C4.5 rule induction algorithm. Experimental results show that asymmetric Haar features are better than basic Haar features for street detection.

Geovany A. Ramirez, Olac Fuentes
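
Haar features are computed in constant time from an integral image; the sketch below shows a two-rectangle feature (here with equal halves for simplicity — the asymmetric variants above relax exactly that constraint):

```python
import numpy as np

def haar_two_rect(ii, y, x, h, w):
    """Two-rectangle Haar feature from a zero-padded integral image ii:
    sum of the left half-rectangle minus sum of the right half-rectangle."""
    def rect_sum(y0, x0, hh, ww):
        return (ii[y0 + hh, x0 + ww] - ii[y0, x0 + ww]
                - ii[y0 + hh, x0] + ii[y0, x0])
    half = w // 2
    return rect_sum(y, x, h, half) - rect_sum(y, x + half, h, half)

img = np.ones((6, 6)); img[:, 3:] = 0          # bright left, dark right
ii = np.pad(img.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
```

The feature responds strongly to the vertical intensity edge in `img`, which is the cue a street/edge detector exploits.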

Automatic Camera Pose Recognition in Planar View Scenarios

The main goal of this paper is to automatically recognize the camera pose from a single image of a planar view scenario. We apply this technique to sport event scenarios, using as information the white lines/circles dividing the different parts of the sport court. Using these court primitives, we define a loss function that we minimize to obtain the best perspective transformation (homography) matching the actual sport court with its projection in the image. From such a homography we recover the camera pose (position and orientation in 3D space). We present numerical experiments in simulated and real sport scenarios.

Luis Alvarez, Luis Gomez, Pedro Henriquez, Luis Mazorra
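
The standard tool for fitting a plane-to-image perspective transformation like the one above is the Direct Linear Transform (DLT), sketched here; it is a generic building block, not necessarily the authors' exact optimization.

```python
import numpy as np

def dlt_homography(src, dst):
    """Direct Linear Transform: estimate the 3x3 homography H mapping
    src points to dst points (dst ~ H @ src, homogeneous) from >= 4
    correspondences, via the SVD null space of the stacked constraints."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]          # fix the arbitrary scale
```

Given the court-model-to-image homography, the camera pose follows by decomposing H with the camera intrinsics.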

Shape and Texture

Texture Image Retrieval Based on Log-Gabor Features

Since Daugman found out that the properties of Gabor filters match the early psychophysical features of simple receptive fields of the Human Visual System (HVS), they have been widely used to extract texture information from images for the retrieval of image data. However, Gabor filters do not have zero mean, which produces a non-uniform coverage of the Fourier domain. This distortion causes fairly poor pattern retrieval accuracy. To address this issue, we propose a simple yet efficient image retrieval approach based on a novel log-Gabor filter scheme. We place emphasis on the filter design to preserve the relationship with receptive fields and take advantage of their strong orientation selectivity. We provide an experimental evaluation of both Gabor and log-Gabor features using two metrics: the Kullback-Leibler (KL) and the Jensen-Shannon (JS) divergences. The experiments with the USC-SIPI database confirm that our proposal shows better retrieval performance than the classic Gabor features.

Rodrigo Nava, Boris Escalante-Ramírez, Gabriel Cristóbal
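
The zero-mean property that motivates the paper is visible in the standard radial log-Gabor transfer function, G(f) = exp(−(log(f/f₀))² / (2·(log σ)²)), sketched below (a generic construction; the paper's full filter bank also has an angular component, which is omitted here):

```python
import numpy as np

def log_gabor_radial(shape, f0=0.1, sigma_ratio=0.65):
    """Radial part of a log-Gabor filter, built directly in the Fourier
    domain. Unlike a Gabor filter, its response at DC is exactly zero."""
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    f = np.sqrt(fx ** 2 + fy ** 2)
    f[0, 0] = 1.0                      # avoid log(0); DC handled below
    g = np.exp(-(np.log(f / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
    g[0, 0] = 0.0                      # zero mean by construction
    return g
```

The filter peaks near the centre frequency `f0` and vanishes at DC, giving the uniform Fourier-domain coverage Gabor filters lack.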

Image Retrieval Using Low Level Features of Object Regions with Application to Partially Occluded Images

This paper proposes an image retrieval system using the local colour and texture features of object regions and the global colour features of the image. The object regions are roughly identified by segmenting the image into fixed partitions and finding the edge density in each partition using edge thresholding and morphological dilation. The colour and texture features of the identified regions are computed from the histograms of the quantized HSV colour space and the Gray Level Co-occurrence Matrix (GLCM), respectively. A combined colour and texture feature vector is computed for each region, and the Euclidean distance measure is used for computing the distance between the features of the query and target images. Preliminary experimental results show that the proposed method provides better retrieval results than some of the existing methods. Promising results are also obtained for 50% and 75% occluded query images.

E. R. Vimina, K. Poulose Jacob
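
The GLCM texture features mentioned above are read off a co-occurrence table; a minimal sketch for one displacement follows (it assumes the image is already quantized to `levels` gray values):

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Gray Level Co-occurrence Matrix for one displacement (dx, dy),
    normalized to a joint probability table. Statistics such as contrast
    or energy are then computed from this matrix."""
    img = np.asarray(img)
    M = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            M[img[y, x], img[y + dy, x + dx]] += 1
    return M / M.sum()
```

Entry M[i, j] is the probability of gray level i occurring at displacement (dx, dy) from gray level j's neighbour.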

Continuous Multi-way Shape Measure for Dissimilarity Representation

For many applications, a straightforward representation of objects is by multi-dimensional arrays, e.g. signals. However, there are only a few classification tools that make proper use of this complex structure to obtain better discrimination between classes. Moreover, they do not take into account context information, which can also be very beneficial in the classification process. Such is the case for multi-dimensional continuous data, where there is connectivity between the points in all directions and a particular (differentiating) shape in the surface of each class of objects. The dissimilarity representation has recently been proposed as a tool for the classification of multi-way data, such that the multi-dimensional structure of objects can be considered in their dissimilarities. In this paper, we introduce a dissimilarity measure for continuous multi-way data and a new kernel for gradient computation. It takes the connectivity between the measurement points into account, using information on how the shape of the surface varies in all directions. Experiments show the suitability of this measure for classifying continuous multi-way data.

Diana Porro-Muñoz, Robert P. W. Duin, Mauricio Orozco-Alzate, Isneri Talavera Bustamante

Improving Spider Recognition Based on Biometric Web Analysis

This work presents an improvement of an automatic and supervised spider identification approach based on biometric spider web analysis. As feature extractor, we have used Joint Approximate Diagonalization of Eigen-matrices Independent Component Analysis, applied to a binary image of reduced size (20×20 pixels) obtained from the original colour image. Finally, we have applied a least squares support vector machine as classifier, reaching over 98.15% accuracy in our hold-50%-out validation. This system makes biologists’ tasks in this field easier, because they can have a second opinion or a tool for this work.

Carlos M. Travieso Gonzalez, Jaime Roberto Ticay-Rivas, Marcos del Pozo-Baños, William G. Eberhard, Jesús B. Alonso-Hernández

Legume Identification by Leaf Vein Images Classification

In this paper we propose an automatic algorithm able to classify legume leaf images considering only the leaf venation patterns (leaf shape, color and texture are excluded). This method processes leaf images captured with a standard scanner and segments the veins using the Unconstrained Hit-or-Miss Transform (UHMT) and adaptive thresholding. We measure several morphological features on the veins and classify them using Random forests. We applied the process to recognize several legumes (soybean, white bean and red bean). We analyze the importance of the features and select a small set which is relevant for the recognition task. Our automatic procedure outperforms the expert manual classification.

Mónica G. Larese, Roque M. Craviotto, Miriam R. Arango, Carina Gallo, Pablo M. Granitto

Learning, Mining and Neural Networks

CAR-NF+: An Improved Version of CAR-NF Classifier

In this paper, we propose two improvements to the CAR-NF classifier, which is a classifier based on Class Association Rules (CARs). The first is a theoretical proof that allows selecting the minimum Netconf threshold, independently of the dataset, that avoids ambiguity at the classification stage. The second is a new coverage criterion, which aims to reduce the number of non-covered unseen transactions during the classification stage. Experiments over several datasets show that the improved classifier, called CAR-NF+, beats the best reported classifiers based on CARs, including the original CAR-NF classifier.

Raudel Hernández-León, José Hernández-Palancar, Jesús Ariel Carrasco-Ochoa, José Francisco Martínez-Trinidad

Recognition of Patterns of Health Problems and Falls in the Elderly Using Data Mining

We present a generalized data mining approach to the detection of health problems and falls in the elderly for the purpose of prolonging their autonomous living. The input for the data mining algorithm is the output of the motion-capture system. The approach is general since it uses a k-nearest-neighbor algorithm and dynamic time warping with the time series of all the measurable joint angles for the attributes instead of a more specific approach with medically defined attributes. Even though the presented approach is more general and can be used to differentiate other types of activities or health problems, it achieves very high classification accuracies, similar to the more specific approaches described in the literature.

Bogdan Pogorelc, Matjaž Gams
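
The k-nearest-neighbor-with-DTW combination described above rests on the standard dynamic time warping distance, sketched here for one-dimensional series such as a single joint angle over time:

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two 1-D series, computed by
    filling the classic cumulative-cost table with |a_i - b_j| local costs."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because the warping path may stretch either series, two time series that differ only in speed can still have zero distance, which is what makes DTW suitable for comparing movements performed at different paces.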

SVMTOCP: A Binary Tree Base SVM Approach through Optimal Multi-class Binarization

The tree architecture has been employed to solve multi-class problems based on SVM. It is an alternative to the well-known OVO/OVA strategies. Most tree-based SVM classifiers try to split the multi-class space, mostly by clustering-like algorithms, into several binary partitions. One of the main drawbacks of this is that the natural class structure is not taken into account. Also, the same SVM parameterization is used for all classifiers. Here, preliminary and promising results of a multi-class space partition method that accounts for the class structure of the data and allows node-specific parameter solutions are presented. In each node, the space is split into candidate two-class problems and the best SVM solution is found. Preliminary results show that accuracy is improved, less information is required, each node reaches specific cost values, and hard-to-separate classes can easily be identified.

Diego Arab Cohen, Elmer Andrés Fernández

On the Comparison of Structured Data

This paper introduces a theoretical framework to characterize measures on structured data. We first describe the lattice of structured data. Then, four basic and intuitive properties that any measure on structured data must fulfill are formally introduced. Metrics and kernel functions are studied as particular cases of (dis)similarity measures. In the case of metrics, we prove that the well-known edit distances meet all the desirable properties. We also give sufficient conditions for a kernel function to satisfy those properties. Some examples are given for particular kinds of structured data.

Jyrko Correa-Morris, Noslen Hernández

A Modification of the Lernmatrix for Real Valued Data Processing

An associative memory is a binary relationship between inputs and outputs, which is stored in a matrix M. In this paper, we propose a modification of the Steinbuch Lernmatrix model in order to process real-valued patterns, avoiding binarization processes and reducing the computational burden. The proposed model is used in experiments with noisy environments, where the performance and efficiency of the memory are proven. A comparison between the proposed and the original model shows the good response and efficiency of the new Lernmatrix in the classification process.

José Juan Carbajal-Hernández, Luis P. Sánchez-Fernández, Luis A. Sánchez-Pérez, Jesús Ariel Carrasco-Ochoa, José Francisco Martínez-Trinidad

Automatic Design of Binary W-Operators Using Artificial Feed-Forward Neural Networks Based on the Weighted Mean Square Error Cost Function

One of the main issues concerning the automatic design of W-operators is that of generalization. Considering the design of W-operators as a particular case of designing a pattern recognition system, in this work we propose a new approach for the automatic design of binary W-operators. This approach consists of a functional representation of the conditional probabilities for the whole set of patterns viewed by a given window, instead of the values of the characteristic function. The estimation of its parameters is achieved by means of a nonlinear regression performed by an artificial feed-forward neural network based on a weighted mean square error cost function. Experimental results show that, for the applications presented in this work, the proposed approach leads to better results than one of the best existing generalization methods within the family of W-operators, pyramidal multiresolution.

Marco Benalcázar, Marcel Brun, Virginia Ballarin, Isabel Passoni, Gustavo Meschino, Lucía Dai Pra

On Using Asymmetry Information for Classification in Extended Dissimilarity Spaces

When asymmetric dissimilarity measures arise, asymmetry correction methods such as averaging are used to make the matrix symmetric. This is usually needed for the application of pattern recognition procedures, but in this way the asymmetry information is lost. In this paper we present a new approach to make use of the asymmetry information in dissimilarity spaces. We show that taking the asymmetry information into account improves classification accuracy when a small number of prototypes is used to create an extended asymmetric dissimilarity space. If the degree of asymmetry is higher, the improvements in classification accuracy are also higher. Symmetrization by averaging also works well in general, but decreases performance for highly asymmetric data.

Yenisel Plasencia-Calaña, Edel B. García-Reyes, Robert P. W. Duin, Mauricio Orozco-Alzate

Improving Convergence of Restricted Boltzmann Machines via a Learning Adaptive Step Size

Restricted Boltzmann Machines (RBMs) have recently received much attention due to their potential to integrate more complex and deeper architectures. Despite their success, in many applications training an RBM remains a tricky task. In this paper we present a learning adaptive step size method that accelerates convergence. The results for the MNIST database demonstrate that the proposed method can drastically reduce the time necessary to achieve a good RBM reconstruction error. Moreover, the technique outperforms fixed learning rate configurations, regardless of the momentum term used.

Noel Lopes, Bernardete Ribeiro
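
One common family of adaptive-step rules grows the rate per weight while the gradient sign is stable and shrinks it on sign flips (Rprop-style); the sketch below illustrates that generic idea only, and the paper's exact rule may differ.

```python
import numpy as np

def adaptive_step(grad, prev_grad, lr, up=1.2, down=0.5,
                  lr_min=1e-6, lr_max=1.0):
    """Per-weight adaptive step size: multiply the rate by `up` where the
    current and previous gradients agree in sign, by `down` where they
    disagree, clipped to [lr_min, lr_max]."""
    same_sign = grad * prev_grad > 0
    lr = np.where(same_sign, lr * up, lr * down)
    return np.clip(lr, lr_min, lr_max)
```

Applied to the contrastive-divergence gradient of an RBM, such a rule lets well-behaved weights take larger steps while damping oscillating ones.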

Robust Asymmetric Adaboost

In real-world pattern recognition problems, such as computer-assisted medical diagnosis, events of a given phenomenon are usually found in the minority class, making it necessary to build algorithms that emphasize the effect of one of the classes at training time. In this paper we propose a variation of the well-known Adaboost algorithm that is able to improve its performance by using an asymmetric and robust cost function. We assess the performance of the proposed method on two medical datasets and on synthetic datasets with different levels of imbalance, and compare our results against three state-of-the-art ensemble learning approaches, achieving better or comparable results.

Pablo Ormeño, Felipe Ramírez, Carlos Valle, Héctor Allende-Cid, Héctor Allende

Enhancing the Performance of AdaBoost Algorithms by Introducing a Frequency Counting Factor for Weight Distribution Updating

This work presents a modified boosting algorithm capable of avoiding overfitting of the training samples during training. The proposed algorithm updates the weight distribution according to the number of misclassified samples at each training iteration. Experimental tests reveal that our approach has several advantages over many classical AdaBoost algorithms in terms of generalization error, overfitting avoidance, and classification performance.

Diego Alonso Fernández Merjildo, Lee Luan Ling
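
Both boosting variants above modify the same core step of discrete AdaBoost, the per-round weight update, sketched here in its classic form for labels in {-1, +1}:

```python
import numpy as np

def adaboost_round(w, y, pred):
    """One round of the classic (discrete) AdaBoost weight update:
    up-weight misclassified samples, down-weight correct ones, renormalize."""
    miss = (pred != y)
    eps = np.sum(w * miss)                  # weighted error of weak learner
    alpha = 0.5 * np.log((1 - eps) / eps)   # weak learner's vote weight
    w = w * np.exp(alpha * np.where(miss, 1.0, -1.0))
    return w / w.sum(), alpha
```

Asymmetric variants rescale the exponent differently per class, and frequency-counting variants additionally scale the update by the number of misclassified samples.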

Significative Learning Using Alpha-Beta Associative Memories

The main goal in pattern recognition is to be able to recognize patterns of interest, even when these patterns have been altered in some way. Associative memories are a branch of AI that obtains one generalization per class from the initial data set. The main problem is that much information is lost when this generalization is performed, mainly due to the presence of outliers and the distribution of the patterns in space. It is believed that one generalization is not sufficient to keep the information necessary to achieve good performance in the recall phase. This paper shows a way to prevent information loss and achieve more significant learning, allowing better recall results.

Catalán-Salgado Edgar Armando, Yáñez-Márquez Cornelio, Figueroa-Nazuno Jesus

Early Visual Processing for Pattern Recognition in Natural Environments

In this article, the performance of biologically inspired hierarchical architectures for computer vision is analyzed. In particular, we analyze the role of the lateral geniculate nucleus (LGN) in the robustness of the recognition of ring-shaped patterns under gradual changes in illumination. Changes in illumination are common in everyday environments and are thus a crucial factor to be considered for object recognition in such settings.

Rosana Matuk Herrera

Medical Images

Motor Intention Recognition in EEG: In Pursuit of a Relevant Feature Set

Brain-computer interfaces (BCIs) based on electroencephalograms (EEG) are a noninvasive and cheap alternative for establishing a communication channel between the brain and computers. Some of the main issues with EEG signals are their high dimensionality, high inter-user variance, and non-stationarity. In this work we present different approaches to deal with the high dimensionality of the data, finding relevant descriptors in EEG signals for motor intention recognition: first, a classical dimensionality reduction method using Diffusion Distance; second, a technique based on spectral analysis of the EEG channels associated with the frontal and prefrontal cortex; and third, a projection over average signals. A performance analysis for different sets of features is carried out, showing that some of them are more robust to user variability.

Pablo A. Iturralde, Martín Patrone, Federico Lecumberry, Alicia Fernández

Bag of Features for Automatic Classification of Alzheimer’s Disease in Magnetic Resonance Images

The goal of this paper is to evaluate the suitability of a bag-of-features representation for automatic classification of Alzheimer's disease brain magnetic resonance (MR) images. The evaluated method uses a bag-of-features (BOF) to represent the MR images, which are then fed to a support vector machine trained to distinguish between normal controls and Alzheimer's disease. The method was applied to a set of images from the OASIS data set. An exhaustive exploration of the different BOF parameters was performed, i.e. feature extraction, dictionary construction and classification model. The experimental results show that the evaluated method reaches competitive performance in terms of accuracy, sensitivity and specificity. In particular, the method based on a BOF representation outperforms the best published result on this data set, improving the equal error classification rate by about 10% (0.80 to 0.95 for Group 1 and 0.71 to 0.81 for Group 2).

Andrea Rueda, John Arevalo, Angel Cruz, Eduardo Romero, Fabio A. González
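The bag-of-features pipeline described above can be sketched at a high level: extract local descriptors, quantize them against a learned visual dictionary, and represent each image as a normalized histogram of visual-word counts. A minimal numpy-only illustration follows; the random 8-dimensional patch descriptors, the plain k-means dictionary, and the dictionary size of five words are toy stand-ins, not the feature extractors or SVM classifier evaluated by the authors.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_dictionary(patches, k, iters=20):
    # Plain k-means (Lloyd's algorithm) on patch descriptors.
    centers = patches[rng.choice(len(patches), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(patches[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = patches[labels == j].mean(axis=0)
    return centers

def bof_histogram(patches, centers):
    # Assign each patch to its nearest visual word and count occurrences.
    d = np.linalg.norm(patches[:, None] - centers[None], axis=2)
    h = np.bincount(d.argmin(axis=1), minlength=len(centers)).astype(float)
    return h / h.sum()

# Toy "image": 100 random 8-dimensional patch descriptors.
patches = rng.normal(size=(100, 8))
centers = build_dictionary(patches, k=5)
h = bof_histogram(patches, centers)
```

The resulting histogram `h` would then be the fixed-length vector handed to a discriminative classifier such as an SVM.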

An Automatic Segmentation Approach of Epithelial Cells Nuclei

Histology images are used to identify biological structures present in living organisms: cells, tissues, organs, and parts of organs. E-Learning systems can use such images to aid in teaching how morphological features relate to function and in understanding which features are most diagnostic of organs. The structure of cells varies according to the type and function of the cell, and automatic cell segmentation is one of the challenging tasks in histology image processing. This problem has been addressed using morphological gradients, region-based methods and shape-based approaches, among others. In this paper, automatic segmentation of the nuclei of epithelial cells is addressed by including morphological information. Image segmentation is commonly evaluated in isolation, either by observing results, via manual segmentation, or via some other goodness measure that does not rely on ground-truth images. Here, expert criteria along with manually segmented images are used to validate the automatic segmentation results. Experimental results show that the proposed approach segments epithelial cells closely matching expert manual segmentations. An average sensitivity of 76% and an average specificity of 77% were obtained on a selected set of images.

Claudia Mazo, Maria Trujillo, Liliana Salazar

Evaluation and Selection of Morphological Procedures for Automatic Detection of Micro-calcifications in Mammography Images

In this paper, we present an evaluation of four different algorithms, based on Mathematical Morphology, to detect the occurrence of micro-calcifications in digital mammogram images from the mini-MIAS database. Results provided by TMVA produced the ranking of features that allowed discrimination between real micro-calcifications and normal tissue. The area under the ROC curve measures the performance of the automatic classification; the highest value, 0.976, was obtained with a Gaussian kernel, followed by 0.972 with a polynomial kernel. An additional parameter, called Signal Efficiency*Purity (SE*P), is proposed as a measure of the number of micro-calcifications detected with the lowest amount of noise.

Claudia C. Diaz-Huerta, Edgardo M. Felipe-Riverón, Luis M. Montaño-Zetina

Detection of Chickenpox Vesicles in Digital Images of Skin Lesions

Chickenpox is a viral disease characterized by itchy skin vesicles that can have severe complications in adults. A tool for automatic detection of these lesions in patients' photographs is highly desirable to help the physician in the diagnosis. In this work we design a method for the detection of chickenpox skin lesions in images. It combines image processing techniques (color transform, equalization, edge detection, circular Hough transform) with statistical tests. We obtain highly satisfactory results in the detection of chickenpox vesicles, in the elimination of false detections using the Kullback-Leibler divergence, and in preliminary tests for discrimination between chickenpox and herpes zoster.

Julián Oyola, Virginia Arroyo, Ana Ruedin, Daniel Acevedo
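The circular Hough transform at the core of the vesicle detector above works by letting every edge point vote for all circle centers lying at a fixed radius from it; true circles of that radius produce a sharp peak in the accumulator. A minimal numpy sketch follows, with a synthetic circle standing in for detected vesicle edges; the radius, grid size, and angular sampling are illustrative choices, not the paper's parameters.

```python
import numpy as np

def circular_hough(edge_points, radius, shape):
    # Each edge point votes for all candidate centers at the given radius.
    acc = np.zeros(shape, dtype=int)
    thetas = np.linspace(0, 2 * np.pi, 64, endpoint=False)
    for (y, x) in edge_points:
        cy = np.round(y - radius * np.sin(thetas)).astype(int)
        cx = np.round(x - radius * np.cos(thetas)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1)
    return acc

# Synthetic vesicle contour: a circle of radius 10 centered at (30, 30).
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
pts = [(30 + 10 * np.sin(a), 30 + 10 * np.cos(a)) for a in t]
acc = circular_hough(pts, radius=10, shape=(64, 64))
peak = np.unravel_index(acc.argmax(), acc.shape)
```

The accumulator peak recovers the circle center; in practice one would scan a range of radii and keep peaks above a vote threshold.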

Robotics, Stereo Vision and Real Time

Real-Time On-Board Image Processing Using an Embedded GPU for Monocular Vision-Based Navigation

In this work we present a new image-based navigation method for guiding a mobile robot equipped only with a monocular camera through a naturally delimited path. The method is based on segmenting the image and classifying each super-pixel to infer a contour of navigable space. While image segmentation is a costly computation, here we use a low-power embedded GPU to obtain the framerate necessary for reactive control of the robot. Starting from an existing GPU implementation of the quick-shift segmentation algorithm, we introduce some simple optimizations that yield a speedup making real-time processing on board a mobile robot possible. Experiments using both a dataset of images and an online on-board execution of the system in an outdoor environment demonstrate the validity of this approach.

Matías Alejandro Nitsche, Pablo De Cristóforis

Hardware/Software Co-design for Real Time Embedded Image Processing: A Case Study

Many image processing applications need real-time performance while facing restrictions on size, weight and power consumption. These include a wide range of embedded systems, from remote sensing applications to mobile phones. FPGA-based solutions are common for these applications, their main drawback being long development time. In this work, a co-design methodology for processor-centric embedded systems with hardware acceleration using FPGAs is applied to an image processing method for the localization of multiple robots. The goal of the methodology is to achieve a real-time embedded solution using hardware acceleration, but with development time similar to that of software projects. The final embedded co-designed solution processes 1600×1200 pixel images at a rate of 25 fps, achieving a 12.6× acceleration over the original software solution. This solution runs at a speed comparable to up-to-date PC-based systems, while being smaller, cheaper, and less power-demanding.

Sol Pedre, Tomáš Krajník, Elías Todorovich, Patricia Borensztejn

Dynamic Textures Segmentation with GPU

This work addresses the problem of motion segmentation in video sequences using dynamic textures. Motion can be globally modeled as a statistical visual process known as a dynamic texture. Specifically, we use the mixture-of-dynamic-textures model, which can simultaneously handle different visual processes. Nowadays, GPUs are becoming increasingly popular in computer vision applications because of their cost-benefit ratio. However, GPU programming is not a trivial task and not all algorithms can be easily ported to the GPU. In this paper, we present two implementations of a known motion segmentation algorithm based on mixtures of dynamic textures: one using the CPU and the other ported to the GPU. The performance analysis shows the scenarios for which a full GPU implementation of the motion segmentation process is worthwhile.

Juan Manuel Rodríguez, Francisco Gómez Fernández, María Elena Buemi, Julio Jacobo-Berlles

Fast Tracking Algorithm with Borders 1-D Histogram Correlation

This paper presents a fast algorithm for object tracking in an image sequence. The method models the borders of the image as one-dimensional histograms, which are then used instead of templates in the matching procedure. The algorithm models both the tracked item and the background in its vicinity, so as to suppress the latter. It uses cross-correlation to find the best match and a weighted average to update the model.

María Curetti, Santiago Garcia Bravo, Gabriela Soledad Arri, Ladislao Mathé
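The idea of matching 1-D border histograms instead of 2-D templates can be illustrated with a small numpy sketch: summarize a patch by the row and column profiles of its gradient magnitude, then score candidate positions by normalized cross-correlation of those profiles. The gradient-based profiles, the 16×16 patch, and the exhaustive search window are illustrative assumptions, not the authors' exact model or background-suppression step.

```python
import numpy as np

def edge_histograms(patch):
    # Summarize a patch by 1-D row/column sums of its gradient magnitude.
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    return mag.sum(axis=1), mag.sum(axis=0)   # row profile, column profile

def match_score(h_model, h_cand):
    # Normalized cross-correlation between two 1-D profiles.
    a = h_model - h_model.mean()
    b = h_cand - h_cand.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

rng = np.random.default_rng(1)
frame = rng.random((60, 60))
model = frame[20:36, 20:36]                   # 16x16 patch to track
hr, hc = edge_histograms(model)

# Search a small window around the previous position for the best match.
best = max(((match_score(hr, edge_histograms(frame[y:y+16, x:x+16])[0]) +
             match_score(hc, edge_histograms(frame[y:y+16, x:x+16])[1]), (y, x))
            for y in range(15, 26) for x in range(15, 26)))
```

Because only two short 1-D vectors are correlated per candidate position, this is far cheaper than full 2-D template matching, which is the source of the speedup the paper targets.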

Disparity Confidence Measures on Engineered and Outdoor Data

Confidence measures for stereo analysis have not yet been the subject of detailed comparative evaluations. There have been some studies, but they are still insufficient for estimating the performance of these measures. We comparatively discuss confidence measures whose performance appeared 'promising' to us, by evaluating them on commonly used stereo test data. These data are either engineered, coming with accurate ground truth (for disparities), or recorded outdoors, coming with approximate ground truth. The performance of confidence measures varies widely between these two types of data. We propose modifications of confidence measures which can improve their performance on outdoor data.

Ralf Haeusler, Reinhard Klette

Remote Sensing

Speckle Reduction Using Stochastic Distances

This paper presents a new approach to filter design based on stochastic distances and tests between distributions. A window is defined around each pixel, samples are compared, and only those which pass a goodness-of-fit test are used to compute the filtered value. The technique is applied to intensity Synthetic Aperture Radar (SAR) data, using the Gamma model with a varying number of looks, thus allowing changes in heterogeneity. Modified Nagao-Matsuyama windows are used to define the samples. The proposal is compared with Lee's filter, which is considered a standard, using a protocol based on simulation. Among the criteria used to quantify the quality of the filters, we employ the equivalent number of looks (related to the signal-to-noise ratio), line contrast, and edge preservation. Moreover, we also assess the filters by the Universal Image Quality Index and Pearson's correlation between edges.

Leonardo Torres, Tamer Cavalcante, Alejandro C. Frery
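The select-then-average structure of the filter above can be sketched in a few lines: gather a neighborhood around each pixel, keep only the samples judged statistically compatible with the local population, and average those. In this toy numpy version a crude z-score cut stands in for the paper's stochastic-distance goodness-of-fit test, and a plain square window stands in for the Nagao-Matsuyama windows; both are simplifying assumptions.

```python
import numpy as np

def selection_filter(img, win=2, alpha=2.0):
    # For each interior pixel, average only the neighbours whose value is
    # consistent with the local sample (z-score cut as a stand-in for a
    # stochastic-distance goodness-of-fit test).
    out = img.astype(float).copy()
    h, w = img.shape
    for y in range(win, h - win):
        for x in range(win, w - win):
            nb = img[y-win:y+win+1, x-win:x+win+1].astype(float).ravel()
            mu, sd = nb.mean(), nb.std() + 1e-9
            keep = nb[np.abs(nb - mu) < alpha * sd]
            out[y, x] = keep.mean()
    return out

rng = np.random.default_rng(2)
# One-look intensity speckle: exponential multiplicative noise on a flat scene.
scene = 10.0 * rng.exponential(size=(32, 32))
filtered = selection_filter(scene)
```

On the flat speckled scene the filter sharply reduces variance; the point of the selection step is that near edges it would reject samples from the other side, preserving the edge rather than blurring it.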

Automatic Classification of Volcanic Earthquakes in HMM-Induced Vector Spaces

Even though hidden Markov models (HMMs) have been used for the automatic classification of volcanic earthquakes, their usage has so far been limited to the Bayesian scheme. Recently proposed alternatives, proven in other application scenarios, consist in building HMM-induced vector spaces where discriminative classification techniques can be applied. In this paper, a simple vector space is induced by taking the per-class log-likelihoods of the HMMs as dimensions. Experimental results show that discriminative classification in such an induced space leads to better performance than the standard Bayesian scheme.

Riccardo Avesani, Alessio Azzoni, Manuele Bicego, Mauricio Orozco-Alzate
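The induced-space construction above is simple to state: each trained class model defines one coordinate, namely the log-likelihood it assigns to the input sequence. A minimal numpy sketch follows, in which an i.i.d. Gaussian scorer stands in for a trained HMM (an assumption made purely to keep the example self-contained); the resulting vectors would then be fed to any discriminative classifier.

```python
import numpy as np

def loglik(seq, mu, sigma):
    # Stand-in for an HMM log-likelihood: i.i.d. Gaussian score of a sequence.
    return float(-0.5 * np.sum(((seq - mu) / sigma) ** 2 +
                               np.log(2 * np.pi * sigma ** 2)))

def induce_vector(seq, models):
    # One dimension per class model, as in the HMM-induced vector space.
    return np.array([loglik(seq, *m) for m in models])

models = [(0.0, 1.0), (3.0, 1.0)]       # two "class models" (mean, std)
rng = np.random.default_rng(3)
seq = rng.normal(3.0, 1.0, size=50)     # sequence drawn from the second class
v = induce_vector(seq, models)
```

A Bayesian scheme would just take `v.argmax()`; the paper's point is that training a discriminative classifier on the vectors `v` can exploit the geometry of this space beyond that simple rule.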

Building Change Detection from Uniform Regions

This paper deals with building change detection by supervised classification of image regions into 'built' and 'non-built' areas. Regions are the connected components of low gradient values in a multi-spectral aerial image. Classes are learnt from spectral (colour, vegetation index) and elevation cues relative to the building polygons and non-building areas defined in the existing database. Possible candidate building regions are then filtered by geometrical features. Inconsistencies between the database and the recent image are automatically detected. Tests in cooperation with the Belgian National Geographical Institute on an area with sufficient building and landscape variety have shown that the system allows for the effective verification of unchanged buildings, and the detection of demolitions and new candidate buildings.

Charles Beumier, Mahamadou Idrissa

Generalized Statistical Complexity of SAR Imagery

A new generalized Statistical Complexity Measure (SCM) was proposed by Rosso et al. in 2010. It is a functional that captures the notions of order/disorder and of distance to an equilibrium distribution. The former is computed by a measure of entropy, while the latter depends on the definition of a stochastic divergence. When the scene is illuminated by coherent radiation, image data is corrupted by speckle noise, as is the case of ultrasound-B, sonar, laser and Synthetic Aperture Radar (SAR) sensors. In the amplitude and intensity formats, this noise is multiplicative and non-Gaussian, thus requiring specialized techniques for image processing and understanding. One of the most successful families of models for describing these images is the Multiplicative Model, which leads, among other probability distributions, to the $\mathcal G^0$ law. This distribution has been validated in the literature as an expressive and tractable model, deserving the “universal” denomination for its ability to describe most types of targets. In order to compute the statistical complexity of a site in an image corrupted by speckle noise, we assume that the equilibrium distribution is that of fully developed speckle, namely the Gamma law in intensity format, which appears in areas with little or no texture. We use the Shannon entropy along with the Hellinger distance to measure the statistical complexity of intensity SAR images, and we show that it is an expressive feature capable of identifying many types of targets.

Eliana S. de Almeida, Antonio Carlos de Medeiros, Osvaldo A. Rosso, Alejandro C. Frery
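The complexity measure described above is the product of two ingredients: a (normalized) entropy of the observed distribution and its stochastic distance to an equilibrium law. A minimal numpy sketch on discrete histograms follows; the flat histogram used as the equilibrium is a stand-in for the Gamma law of fully developed speckle, and the five-bin distributions are purely illustrative.

```python
import numpy as np

def shannon_entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def hellinger(p, q):
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def statistical_complexity(p, q_equilibrium):
    # SCM = normalized entropy times distance to the equilibrium distribution.
    h = shannon_entropy(p) / np.log(len(p))
    return h * hellinger(p, q_equilibrium)

# Histogram of observed intensities vs an assumed equilibrium histogram.
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
q = np.full(5, 0.2)   # flat stand-in for the fully-developed-speckle law
c = statistical_complexity(p, q)
```

By construction the measure vanishes when the site already follows the equilibrium law, so textured targets (far from fully developed speckle) score higher than homogeneous areas.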

New Metrics to Evaluate Pattern Recognition in Remote Sensing Images

The continuous development of pattern recognition approaches increases the need for evaluation tools to quantify algorithm performance and establish precise inter-algorithm comparison. So far, few metrics for evaluating the performance of pattern recognition algorithms are known in the literature, especially for remote sensing images. In this paper, four metrics are proposed for this purpose. The advantages and drawbacks of these metrics are first described, and then some experimental results are presented in order to validate our contribution.

Manel Kallel, Mohamed Naouai, Yosr Slama

Polarimetric SAR Image Smoothing with Stochastic Distances

Polarimetric Synthetic Aperture Radar (PolSAR) images are becoming established as an important source of information in remote sensing applications. The most complete format this type of imaging produces consists of complex-valued Hermitian matrices at every image coordinate and, as such, their visualization is challenging. They also suffer from speckle noise, which reduces the signal-to-noise ratio. Smoothing techniques have been proposed in the literature aiming at preserving different features and, analogously, projections from the cone of Hermitian positive matrices to different color representation spaces are used for enhancing certain characteristics. In this work we propose the use of stochastic distances between models that describe this type of data in a Nagao-Matsuyama-type smoothing technique. The resulting images are shown to present good visualization properties (noise reduction with preservation of fine details) in all the considered visualization spaces.

Leonardo Torres, Antonio C. Medeiros, Alejandro C. Frery

Signal Processing

Recognition and Real-Time Detection of Blinking Eyes on Electroencephalographic Signals Using Wavelet Transform

In this paper we study the real-time detection of a specific pattern associated with eye blinks in single-channel electroencephalogram (EEG) signals. The paper covers the theoretical and practical principles enabling the design and implementation of a system that detects the time location of blinks in real time, regardless of scale and with multiple occurrences; the wavelet transform allows us to fulfill this objective. Multiple detection and real-time operation are achieved by working with a window giving the projection of an ongoing analysis of the signal sampled by the EEG.

Renato Salinas, Enzo Schachter, Michael Miranda
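The wavelet-based detection idea above can be illustrated with the simplest possible case: a single-level Haar decomposition, whose detail coefficients spike at transient events such as blink artifacts. This numpy sketch uses a synthetic EEG trace with one injected blink-like transient; the Haar basis, single decomposition level, and fixed threshold are simplifying assumptions rather than the authors' wavelet configuration.

```python
import numpy as np

def haar_detail(signal):
    # Single-level Haar wavelet detail coefficients (scaled pairwise diffs).
    s = signal[:len(signal) // 2 * 2]
    return (s[0::2] - s[1::2]) / np.sqrt(2)

def detect_blinks(eeg, thresh):
    # A blink shows up as large-magnitude detail coefficients; report the
    # sample locations where they exceed the threshold.
    d = haar_detail(eeg)
    return np.flatnonzero(np.abs(d) > thresh) * 2

rng = np.random.default_rng(4)
eeg = rng.normal(0, 1, 512)
eeg[200:210] += np.hanning(10) * 40   # synthetic blink transient
hits = detect_blinks(eeg, thresh=6.0)
```

A multi-level transform would add the scale invariance the paper mentions, since a slow blink that is invisible at one level produces large coefficients at a coarser one.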

Finite Rank Series Modeling for Discrimination of Non-stationary Signals

The analysis of time-variant biosignals for classification tasks usually requires a model that can handle their different dynamics and non-stationary components, although determining the proper stationary data length and the model parameters remains an open issue. In this work, time-variant signal decomposition through Finite Rank Series Modeling is carried out, aiming to find the model parameters. Three schemes are tested for OSA detection based on HRV recordings: SSA and DLM as linear decompositions, and EDS as a non-linear decomposition. Results show that the EDS decomposition presents the best performance, followed by SSA. As a conclusion, it can be inferred that as complexity is added to the linear model, its performance approaches that of a simple non-linear model.

Lina Maria Sepulveda-Cano, Carlos Daniel Acosta-Medina, Germán Castellanos-Dominguez

Quaternionic Analytic Signal Using Atomic Functions

Atomic functions are widely used in different applications in image processing, pattern recognition, computational physics, and the digital interpretation of signal measurements. For 1D signals, it is usual to compute the phase and the magnitude of a signal using the analytic signal (the signal and its Hilbert transform, combined using complex numbers). For higher-dimensional signals, however, the monogenic signal (the signal and its Riesz transform) has been used to obtain the local phase and orientation with good results. The main aim of this work is to present a new way to compute the Hilbert transform using atomic functions. The computation of the Hilbert transform becomes relevant whenever the phase computation is required.

E. Ulises Moya-Sánchez, Eduardo Bayro-Corrochano

Separation and Classification of Harmonic Sounds for Singing Voice Detection

This paper presents a novel method for the automatic detection of singing voice in polyphonic music recordings, which involves the extraction of harmonic sounds from the audio mixture and their classification. After being separated, sounds can be better characterized by computing features that are otherwise obscured in the mixture. A set of descriptors of pitch fluctuations typical of the singing voice is proposed and combined with classical spectral timbre features. The evaluation conducted shows the usefulness of the proposed pitch features and indicates that the approach is a promising alternative for tackling the problem, in particular for sparse polyphonies where the singing voice can be correctly tracked. As an outcome of this work, an automatic singing voice separation system is obtained, with encouraging results.

Martín Rocamora, Alvaro Pardo

Online Signature Verification Based on Legendre Series Representation. Consistency Analysis of Different Feature Combinations

In this paper, orthogonal polynomial series are used to approximate the time functions associated with the signatures, and the coefficients of these series are used as features to model them. A novel consistency factor is proposed to quantify the discriminative power of different combinations of time functions related to the signing process. Pen coordinates, incremental variations of the pen coordinates, and pen pressure are analyzed for two different signature styles, namely Western and Chinese signatures, from a publicly available signature database. Two state-of-the-art classifiers, namely Support Vector Machines and Random Forests, are used in the verification experiments. The obtained error rates are comparable to results reported over the same signature datasets in a recent signature verification competition.

Marianela Parodi, Juan Carlos Gómez
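The feature extraction step described above amounts to a least-squares fit of each time function (e.g. a pen coordinate trace) in an orthogonal polynomial basis, keeping the coefficients as a fixed-length descriptor. A sketch using numpy's Legendre module follows; the toy pen trace, the degree 8, and the mapping of time to [-1, 1] (where Legendre polynomials are orthogonal) are illustrative choices, not the paper's settings.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_features(t, signal, degree=8):
    # Least-squares Legendre coefficients of a time function, after mapping
    # the time axis to [-1, 1].
    u = 2 * (t - t.min()) / (t.max() - t.min()) - 1
    return legendre.legfit(u, signal, degree)

# Toy pen-coordinate trace sampled over a signature's duration.
t = np.linspace(0, 1, 200)
x_pen = np.sin(4 * t) + 0.1 * t
coeffs = legendre_features(t, x_pen)

# The coefficients summarize the trace well enough to reconstruct it.
recon = legendre.legval(2 * t - 1, coeffs)
```

Because every signature yields the same number of coefficients regardless of its duration, the vectors can be fed directly to classifiers such as SVMs or Random Forests.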

Speech and Handwriting Analysis

Gaussian Selection for Speaker Recognition Using Cumulative Vectors

Speaker recognition systems frequently use the GMM-MAP method for modeling speakers, representing each speaker by a Gaussian mixture. However, not all the Gaussian components in this mixture are truly representative of the speaker. In order to remove model redundancy, this work proposes a Gaussian selection method to build a new GMM containing only the most representative Gaussian components. Speaker verification experiments applying the proposal show performance similar to the baseline, with speaker models that are 80% smaller than those used by the baseline. This Gaussian selection method could be very useful for reducing the computational and memory cost of real or embedded speaker verification systems.

Flavio J. Reyes Díaz, José Ramón Calvo de Lara, Gabriel Hernández Sierra

Speaker Recognition Using a Binary Representation and Specificities Models

State-of-the-art speaker recognition methods are mainly based on the GMM/UBM supervector paradigm. Recently, a simple representation of speech based on local binary decisions taken on each acoustic frame has been proposed, allowing a speech excerpt to be represented as a binary matrix. This article builds on a similar approach. A new temporal block representation of the binary-transformed data is proposed, together with three simple algorithms to obtain an efficient similarity measure. The experimental results show better robustness of the proposed approach and similar or better overall performance compared to classical approaches.

Gabriel Hernández-Sierra, Jean-François Bonastre, José Ramón Calvo de Lara

Analysis of the Multifractal Nature of Speech Signals

Frame duration is an essential parameter to ensure the correct application of multifractal signal processing. This paper aims to identify the multifractal nature of speech signals through theoretical study and experimental verification. One important part of this pursuit is to select adequate ranges of frame duration that effectively display evidence of multifractal nature. An overview of multifractal theory is given, including definitions and methods for analyzing and estimating multifractal characteristics and behavior. Based on these methods, we evaluate utterances from two different Portuguese speech databases by studying their singularity curves. We conclude that frame durations between 50 and 100 ms are the most suitable and useful for multifractal speech signal processing in terms of speaker recognition performance [11].

Diana Cristina González, Lee Luan Ling, Fábio Violaro

Beam-Search Formant Tracking Algorithm Based on Trajectory Functions for Continuous Speech

This paper presents a formant frequency tracking algorithm for continuous speech processing. First, it uses spectral information to generate frequency candidates; for this purpose, the roots of the Linear Predictive Coding (LPC) polynomial and peak picking on the Chirp Group Delay Function (CGD) were tested. The second stage is a beam-search algorithm that tries to find the best sequence of formants given the frequency candidates, applying a cost function based on local and global evidence. The main advantage of this beam-search algorithm over previous dynamic programming approaches is that a trajectory function taking several frames into account can be optimally incorporated into the cost function. The performance was evaluated using a labeled formant database and the Wavesurfer formant tracker, achieving promising results.

José Enrique García Laínez, Dayana Ribas González, Antonio Miguel Artiaga, Eduardo Lleida Solano, José Ramón Calvo de Lara

Multi-level Modeling of Manuscripts for Authorship Identification with Collective Decision Systems

In the context of forensic and criminalistic studies, the problem of identifying the author of a manuscript is generally expressed as a supervised classification problem. In this paper, a new approach for modeling a manuscript at the word and text-line levels is presented. This approach introduces an eclectic paradigm between texture-related and structure-related modeling approaches. Compared to previously published works, the proposed method significantly reduces the number and complexity of the text features to be extracted. Extensive experimentation with the proposed model shows it to be faster and easier to implement than other models, making it well suited for extensive use in forensic and criminalistic studies.

Salvador Godoy-Calderón, Edgardo M. Felipe-Riverón, Edith C. Herrera-Luna

Statistical Pattern Recognition

Extraction of Stationary Spectral Components Using Stochastic Variability

Biosignal recordings are widely used in the medical environment to support the evaluation and diagnosis of pathologies. Nevertheless, the main difficulty lies in the non-stationary behavior of biosignals, which makes it hard to obtain patterns characterizing changes in physiological or pathological states. Thus, obtaining the stationary and non-stationary components of a biosignal remains an open issue. This work proposes a methodology to detect time-homogeneities based on time-frequency analysis, aiming to extract the non-stationary behavior of the biosignal. Two homogeneity constraints are introduced as measures of the stochastic variability of the considered dataset. The first is the relevance value, which gives information about the contribution of each spectral component to the process. The second is based on the first and second moments of the stochastic variability map, and relates to the uniformity along time of each spectral component. Results show an increase in the stationarity of the reconstructions from the enhanced time-frequency representations. Moreover, the inter-class distance for the reconstructed time series shows more discrimination on the stationary component than on the non-stationary one. The extracted components tend to meet the requirements of most algorithms proposed for other tasks, such as biosignal classification, leading to a performance increase of those methodologies.

David Cárdenas-Peña, Juan David Martínez-Vargas, Germán Castellanos-Domínguez

Finding Edges by a Contrario Detection of Periodic Subsequences

A new method to detect salient pieces of boundaries in an image is presented. After detecting perceptually meaningful level lines, periodic binary sequences are built by labeling each point on the closed curves as salient or non-salient. We propose a general and automatic method to detect meaningful subsequences within these binary sequences. Experimental results show its good performance.

Mariano Tepper, Pablo Musé, Andrés Almansa, Marta Mejail

A Non Bayesian Predictive Approach for Functional Calibration

A non-Bayesian predictive approach for statistical calibration with functional data is introduced, based on extending to the functional calibration setting the definition of the non-Bayesian predictive probability density proposed by Harris (1989). The new method is elaborated in detail for Gaussian functional linear models. It is shown through numerical simulations that the introduced non-Bayesian predictive estimator of the unknown parameter of interest in calibration (commonly, a substance concentration) has negligible bias and compares favorably with the classical estimator, particularly in extrapolation problems. A further advantage of the new approach, which is also briefly illustrated, is that it provides not only point estimates but also a predictive likelihood function that allows the researcher to explore the plausibility of any possible parameter value.

Noslen Hernández, Rolando J. Biscay, Isneri Talavera

Classifier Combination Using Random Walks on the Space of Concepts

We propose a novel approach for the combination of classifiers based on two commonly adopted strategies in multiclass classification: one-vs-all and one-vs-one. The method relies on establishing the relevance of nodes in a graph defined in the space of concepts. Following an approach similar to the ranking of websites, the relative strength of the nodes is given by the stationary distribution of a Markov chain defined on that graph. The proposed approach does not require the base classifiers to provide calibrated probabilities. Experiments on the challenging problem of multiclass image classification show the potential of our approach.

Jorge Sánchez, Javier Redolfi
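The ranking idea above can be sketched concretely: build a matrix of pairwise vote strengths between classes, normalize it into a column-stochastic transition matrix, and take the stationary distribution of the induced Markov chain (computed here by power iteration, as in web-page ranking). The 3×3 vote matrix below is hypothetical, and uncalibrated scores are deliberately used, since the method does not require calibrated probabilities.

```python
import numpy as np

def stationary_distribution(W, iters=200):
    # Normalize votes into a column-stochastic matrix and power-iterate.
    P = W / W.sum(axis=0, keepdims=True)
    pi = np.full(W.shape[1], 1.0 / W.shape[1])
    for _ in range(iters):
        pi = P @ pi
    return pi / pi.sum()

# W[i, j]: hypothetical strength with which the pairwise classifier of
# classes (i, j) votes for class i over class j.
W = np.array([[1.0, 3.0, 2.0],
              [1.0, 1.0, 4.0],
              [2.0, 0.5, 1.0]])
pi = stationary_distribution(W)
winner = int(pi.argmax())
```

The class with the largest stationary probability is the combined decision; unlike simple vote counting, the random walk lets strong indirect evidence (beating a class that beats everyone) raise a class's rank.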

Stochastic Approaches of Minimum Distance Method for Region Based Classification

Normally, remote sensing image classification is performed pixelwise, which produces a noisy classification. One way of improving such results is to divide the classification process into two steps. First, regions that are uniform by some criterion are detected; afterwards, each unlabeled region is assigned to the "nearest" class using a so-called stochastic distance. The statistics are estimated by taking into account all the reference pixels. Three variations are investigated. The first is to assign to the unlabeled region the class with the minimum average distance between this region and each of the reference samples of that class. The second is to assign the class of the closest reference sample. The third is to assign the most frequent class among the closest reference regions. A simulation study is done to assess the performances. The simulations suggest that the most robust and simple approach is the second variation.

Rogério G. Negri, Luciano V. Dutra, Sidnei J. S. Sant’Anna
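The second variation above (assign the class of the single closest reference sample) can be sketched in a few lines. In this toy numpy version a plain Euclidean distance between the region's mean feature vector and each reference sample stands in for the stochastic distances between distributions used in the paper, and the two-class reference dictionary is purely illustrative.

```python
import numpy as np

def classify_region(region_mean, refs):
    # Second variation: assign the class of the closest reference sample.
    # Euclidean distance is a stand-in for a stochastic distance.
    best_class, best_d = None, np.inf
    for cls, samples in refs.items():
        d = np.linalg.norm(samples - region_mean, axis=1).min()
        if d < best_d:
            best_class, best_d = cls, d
    return best_class

# Hypothetical reference samples (mean feature vectors per class).
refs = {
    "water":  np.array([[0.10, 0.20], [0.15, 0.25]]),
    "forest": np.array([[0.80, 0.70], [0.75, 0.65]]),
}
label = classify_region(np.array([0.78, 0.66]), refs)
```

The first variation would replace `.min()` with `.mean()` over each class's samples, and the third would take a majority vote among the closest reference regions.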

Skills Assessment of Users in Medical Training Based on Virtual Reality Using Bayesian Networks

Virtual reality allows the development of digital environments that engage users' senses to provide realistic and immersive experiences. When used for training purposes, interaction data can be used to verify users' skills. To do so, intelligent methodologies must be coupled to the simulations to classify users' skills into N a-priori-defined classes of expertise; models based on these methodologies are composed from data provided by experts. However, an online Single User's Assessment System (SUAS) for training must use low-complexity algorithms so as not to compromise the performance of the simulator. Several approaches have been proposed. In this paper, we analyze the performance of a SUAS based on a Bayesian network and compare it with another methodology based on the classical Bayes rule.

Ronei M. Moraes, Liliane S. Machado, Leandro C. Souza

Theoretical Pattern Recognition

New Strategies for Evaluating the Performance of Typical Testor Algorithms

Typical testors have been used in feature selection and supervised classification problems in logical-combinatorial pattern recognition. Several algorithms, based on different heuristics, have been proposed to find the set of all typical testors of a basic matrix. There is no doubt that these algorithms find the set of all typical testors; however, the time spent by their search strategies differs among them, and due to the size of this set, the search time is a critical factor. There is no standard procedure to evaluate the time performance of typical testor algorithms. In this paper we introduce a strategy to solve this problem through a new set of test matrices, which have the property that the cardinality of the set of all typical testors is known in advance.

Eduardo Alba, Diego Guilcapi, Julio Ibarra

Hierarchies and Climbing Energies

A new approach is proposed for finding the “best cut” in a hierarchy of partitions by energy minimization. Said energy must be “climbing”, i.e., it must be hierarchically and scale increasing. It encompasses separable energies [5], [9] and those composed under supremum [14], [12]. It opens the door to multivariate data processing by providing laws of combination by extrema and by products of composition.

Jean Serra, Bangalore Ravi Kiran, Jean Cousty

A Method for Reducing the Cardinality of the Pareto Front

Multi-objective problems are characterised by the presence of a set of optimal trade-off solutions (a Pareto front), from which a solution has to be selected by a decision maker. However, selecting a solution from a Pareto front is hampered by, among other factors, the large number of solutions to choose from and the dimensional complexity due to the many objectives involved. Commonly, the selection is based on preferences specified by the decision maker; nevertheless, a decision maker may have no preferences at all. Thus, an informed decision-making process has to be carried out, which is difficult to achieve. In this paper, selecting a solution from a Pareto front is itself addressed as a multi-objective problem, using two utility functions and operating in the objective space. A quantitative comparison of the performance of stereo correspondence algorithms is used as the application domain.

Ivan Cabezas, Maria Trujillo
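The flavour of utility-based selection in objective space can be shown with a minimal sketch. This is not the paper's pair of utility functions; it assumes an invented two-objective minimisation front and a single, common baseline utility: the Euclidean distance to the ideal point.

```python
import math

# Invented Pareto front for a two-objective minimisation problem.
front = [(1.0, 9.0), (3.0, 4.0), (5.0, 3.0), (9.0, 1.0)]

def select(front):
    # Ideal point: componentwise minimum over the front.
    ideal = tuple(min(f[i] for f in front) for i in range(len(front[0])))
    # Utility: negative distance to the ideal; pick the closest solution.
    return min(front, key=lambda f: math.dist(f, ideal))

print(select(front))
```

The extreme solutions (1, 9) and (9, 1) are far from the ideal (1, 1), so a balanced trade-off is returned, which is the typical effect of operating in the objective space without explicit decision-maker preferences.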

Video Analysis

Feature Selection by Relevance Analysis for Abandoned Object Classification

A methodology to classify abandoned objects in video surveillance environments is proposed. Our aim is to determine a set of relevant features that properly describes the main patterns of the objects. Assuming that the abandoned object was previously detected by a visual surveillance framework, a preprocessing stage to segment the region of interest from a given detected object is also presented. Then, geometric and Hu-moment features are estimated. Moreover, a relevance analysis is employed to identify which features reveal the major variability of the input space to discriminate among different objects. Results attained over a real-world video surveillance dataset show that our approach is able to select a subset of features achieving stable classification performance. Our approach seems to be a good alternative to support the development of automated video surveillance systems.

Johanna Carvajal-González, Andrés M. Álvarez-Meza, German Castellanos-Domínguez
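One generic way to rank features by how much input-space variability they capture is to weight absolute PCA loadings by the explained variance of each component. The sketch below uses that baseline on synthetic data; it is an illustration of variability-based relevance analysis in general, not the authors' exact procedure.

```python
import numpy as np

# Synthetic data: feature 0 carries most of the variance, feature 2 some.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 0] *= 5.0
X[:, 2] *= 2.0

def relevance(X):
    # Eigen-decompose the covariance; weight absolute loadings by the
    # eigenvalues (explained variance) and sum per original feature.
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return np.abs(vecs) @ vals

order = np.argsort(relevance(X))[::-1]
print(order)   # features sorted by decreasing relevance
```

Selecting a prefix of this ranking yields the reduced feature subset; on the synthetic data above, the high-variance features 0 and 2 head the ranking as expected.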

Fusion of Local and Global Descriptors for Content-Based Image and Video Retrieval

Recently, fusion of descriptors has become a trend for improving the performance of image and video retrieval tasks. Descriptors can be global or local, depending on how they analyze visual content. Most existing works have focused on the fusion of a single type of descriptor. In contrast, this paper aims to analyze the impact of combining global and local descriptors. Here, we perform a comparative study of different types of descriptors and all of their possible combinations. Extensive experiments under a rigorous experimental design show that global and local descriptors complement each other, such that, when combined, they outperform other combinations and single descriptors.

Felipe S. P. Andrade, Jurandy Almeida, Hélio Pedrini, Ricardo da S. Torres
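A common baseline for combining descriptors of different kinds is late fusion of their distance scores: normalise each score list to a common range, then average. The sketch assumes invented query-to-database distances for one global and one local descriptor; it is a generic fusion baseline, not the paper's specific scheme.

```python
# Invented distances from one query to three database items, under a
# global descriptor and a local descriptor (lower = more similar).
global_d = {"img1": 0.2, "img2": 0.8, "img3": 0.5}
local_d  = {"img1": 0.9, "img2": 0.1, "img3": 0.3}

def normalise(scores):
    # Min-max normalisation so scales of different descriptors are comparable.
    lo, hi = min(scores.values()), max(scores.values())
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def fuse(*score_maps):
    norms = [normalise(s) for s in score_maps]
    return {k: sum(n[k] for n in norms) / len(norms) for k in score_maps[0]}

fused = fuse(global_d, local_d)
print(min(fused, key=fused.get))   # best-ranked item after fusion
```

Note the complementarity the paper reports: the global descriptor alone would rank img1 first and the local one img2, but the fused score prefers img3, which both descriptors rate reasonably well.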

Speed Estimation Thanks to Two Images from One Stationary Camera

This paper presents speed estimation of a moving object from two images captured within a known time interval by one stationary, uncalibrated camera. The development is currently applicable to rigid objects undergoing pure translation and requires the localization of corresponding points in both images and the specification of one real dimension. A solution based on an equivalent stereo problem is proposed: the object is considered stationary, and we search for the virtual camera motion that would produce the same images. The mathematical formulation is simple, using 3D vectors and the camera parameters: focal length, CCD size and pixel size. The developed software package was tested for second assessments of vehicle speeds captured by the LIDAR system LMS-06 distributed by secuRoad SA.

Charles Beumier
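The role of the known real dimension is easy to see in the degenerate planar case: it fixes a metres-per-pixel scale, from which pixel displacement over a known time interval gives speed. This sketch covers only that simplified case with invented numbers, not the paper's equivalent-stereo formulation.

```python
def speed_kmh(known_length_m, known_length_px, displacement_px, dt_s):
    # The one specified real dimension fixes the image scale.
    scale = known_length_m / known_length_px      # metres per pixel
    return displacement_px * scale / dt_s * 3.6   # m/s -> km/h

# Hypothetical example: a 4.5 m car spans 150 px in the image and its
# corresponding points move 400 px between frames taken 0.4 s apart.
print(speed_kmh(4.5, 150, 400, 0.4))
```

The full method instead recovers a virtual camera translation in 3D, which handles motion that is not parallel to the image plane.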

An Algorithm for Highlights Identification and Summarization of Broadcast Soccer Videos

This paper presents an algorithm for automatic summarization of broadcast soccer videos. The summarization consists of identifying and keeping only the highlights of the match, where a highlight is defined as a situation with high convergence of players towards a spot. Our approach extracts as basic features the velocities and positions of the players, and an inferred movement of the TV camera. The movement of the TV camera is approximated using the movement of all the players in the image, and a motion field is computed over the image in order to analyze aspects of the match. The algorithm was tested on real data from a soccer match, and the results are promising considering that the approach uses only broadcast videos as input, with no a priori knowledge of camera positions or other fixed parameters.

Waldez Azevedo Gomes Junior, Díbio Leandro Borges
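The convergence cue can be sketched with a toy score: average, over players, the cosine between each player's velocity and the direction to a common spot (here, the player centroid). This is only in the spirit of the paper's highlight definition; the actual method works on a motion field computed over the image, and all positions and velocities below are invented.

```python
import math

def convergence(players):
    # players: list of ((x, y), (vx, vy)) tuples.
    cx = sum(p[0][0] for p in players) / len(players)
    cy = sum(p[0][1] for p in players) / len(players)
    score = 0.0
    for (x, y), (vx, vy) in players:
        dx, dy = cx - x, cy - y
        norm = math.hypot(dx, dy) * math.hypot(vx, vy)
        if norm > 0:
            # Cosine of the angle between velocity and direction to centroid.
            score += (vx * dx + vy * dy) / norm
    return score / len(players)

# All three players run towards the centre: a highlight-like frame.
converging = [((0, 0), (1, 1)), ((10, 0), (-1, 1)), ((5, 10), (0, -1))]
print(convergence(converging))
```

Frames whose score exceeds a threshold would be kept in the summary; a frame where players disperse would score near or below zero.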

Bubble Identification Based on High Speed Videometry Data: Algorithm and Validation

The simultaneous flow of gas and liquid in a pipe is commonly found in several industrial activities, such as crude oil extraction and processing. To analyze this (two-phase) flow, many measurement techniques have been proposed, including X-ray, ultrasound, impedance and optical measurement. In this context, although the use of high-speed videometry is limited in practical cases, it is an important tool for validating other methods in experimental tests. Approaches based on image segmentation have already been considered for analyzing gas-liquid flows along vertical pipes, but only a few have focused on horizontal experiments, which are also widely found in many applications. This article describes a new technique developed to automatically measure bubble volume through video analysis. Tests carried out on horizontal air-water flow images yielded results with good correlation to known volume data, showing that the measurements are accurate enough to be considered for the validation of other technologies.

Carlos E. F. do Amaral, Rafael F. Alves, Marco J. da Silva, Lúcia V. R. Arruda, Leyza B. Dorini, Rigoberto E. M. Morales
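Once a bubble has been segmented in a frame, a volume estimate still requires a geometric model. A common simplification, shown here with invented axis lengths, is to approximate the bubble as an ellipsoid of revolution about the major axis seen in the image; the paper's actual pipeline and model may differ.

```python
import math

def bubble_volume_mm3(major_mm, minor_mm):
    # Semi-axes of the fitted ellipse; revolve the minor axis about the
    # major one: V = 4/3 * pi * a * b^2.
    a, b = major_mm / 2.0, minor_mm / 2.0
    return 4.0 / 3.0 * math.pi * a * b * b

# Hypothetical segmented bubble: 6 mm long, 4 mm high.
print(round(bubble_volume_mm3(6.0, 4.0), 2))
```

Summing such per-bubble estimates over a video sequence is what can then be correlated against the known injected gas volume.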

Object and Gesture Recognition to Assist Children with Autism during the Discrimination Training

Teachers prompt children with autism to redirect their attention to the object discrimination training and reduce the time they spend “off task”. In this paper, we describe MOBIS, a mobile augmented reality application enabling multi-modal interaction to provide guidance to students with autism during the object discrimination training. The system uses a vision-based object recognition algorithm to associate visual and verbal prompts to the object being discriminated (i.e., “object of interest”). The results of a performance evaluation of the system show that the object recognition component achieves an accuracy of 90%, processing an image every 0.5 seconds. Accelerometers placed on objects of interest are used to detect interaction gestures with an accuracy of 87%. The performance of both algorithms is sufficient to support the object discrimination training in real-time.

Eduardo Quintana, Catalina Ibarra, Lizbeth Escobedo, Monica Tentori, Jesus Favela
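A minimal version of the accelerometer-based cue can be sketched as a threshold on how far the acceleration magnitude deviates from gravity. This is only a toy stand-in for MOBIS's gesture detector; the samples and threshold are invented.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def is_gesture(sample, threshold=2.0):
    # sample: (ax, ay, az) from the accelerometer on an object of interest.
    ax, ay, az = sample
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    # At rest the magnitude stays near g; a large deviation suggests the
    # object is being picked up or moved.
    return abs(magnitude - G) > threshold

resting = (0.1, 0.2, 9.8)
shaken = (6.0, 7.0, 12.0)
print(is_gesture(resting), is_gesture(shaken))
```

A real detector would add smoothing and per-gesture classification on top of such a trigger to reach the reported 87% accuracy.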

Pedestrian Detection Using a Feature Space Based on Colored Level Lines

This work gives the guidelines to develop a pedestrian detection system using a feature space based on colored level lines, called the Movement Feature Space (MFS). Besides detecting movement in the scene, this feature space defines the descriptors used by the classifiers to identify pedestrians. The multi-channel level-lines approach has been tested on the HSV color space, improving on the one-channel (gray-scale) level-lines calculation. Pedestrian location hypotheses are generated by a cascade of boosted classifiers, and the validation of these regions of interest is carried out by a Support Vector Machine classifier. The system achieves more than 78.5% correct detections on urban video sequences.

Pablo Negri, Pablo Lotito
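The two-stage structure of such a detector, a cheap cascade to propose hypotheses followed by an SVM to validate them, can be sketched as below. Both classifiers are replaced by trivial stand-ins on made-up three-component descriptors; the real system uses boosted classifiers and a trained SVM over MFS descriptors.

```python
def cascade_stage(descriptor):
    # Stand-in for the boosted cascade: cheap early rejection of windows
    # with too little moving level-line support.
    return sum(descriptor) > 1.0

def svm_validate(descriptor, w=(0.8, -0.2, 0.5), b=-0.6):
    # Stand-in for the SVM validator: a linear decision function
    # sign(w . x + b) with invented weights.
    return sum(wi * xi for wi, xi in zip(w, descriptor)) + b > 0

def detect(windows):
    hypotheses = [d for d in windows if cascade_stage(d)]
    return [d for d in hypotheses if svm_validate(d)]

windows = [(0.1, 0.2, 0.1), (1.5, 0.3, 0.9), (0.9, 1.8, 0.2)]
print(detect(windows))
```

The design rationale is the usual one: the cascade discards most windows at negligible cost, so the more expensive SVM only runs on a few surviving regions of interest.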

