2010 | Book

Advances in Pattern Recognition

Second Mexican Conference on Pattern Recognition, MCPR 2010, Puebla, Mexico, September 27-29, 2010. Proceedings

Edited by: José Francisco Martínez-Trinidad, Jesús Ariel Carrasco-Ochoa, Josef Kittler

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science

Table of Contents

Frontmatter

Computer Vision and Robotics

A Hierarchical Recursive Partial Active Basis Model

Recognition of occluded objects in computer vision is a very hard problem. In this work we propose an algorithm to construct a structure of a model using learned active basis models, then use it to do inference over the most probable detected parts of an object, to allow partial recognition using the standard sum-max-maps algorithm used for active basis. We tested our method and present some improvements on occluded face detection using our algorithm; we also present some experiments with other partially occluded objects.

Pavel Herrera-Domínguez, Leopoldo Altamirano-Robles
Facial Feature Model for Emotion Recognition Using Fuzzy Reasoning

In this paper we present a fuzzy reasoning system that can measure and recognize the intensity of basic or non-prototypical facial expressions. The system inputs are the encoded facial deformations described either in terms of Ekman's Action Units (AUs) or Facial Animation Parameters (FAPs) of the MPEG-4 standard. The proposed fuzzy system uses a knowledge base implemented in the knowledge acquisition and ontology editor Protégé. It allows the modeling of facial features obtained from geometric parameters coded by AUs - FAPs and also the definition of rules required for classification of measured expressions. This paper also presents the designed framework for fuzzification of input variables for the fuzzy classifier, based on statistical analysis of emotions expressed in video records of the standard Cohn-Kanade and Pantic's MMI face databases. The proposed system has been tested in order to evaluate its capability for detecting, classifying, and interpreting facial expressions.

Renan Contreras, Oleg Starostenko, Vicente Alarcon-Aquino, Leticia Flores-Pulido
Face Recognition Using Simplicial Complexes

The paper presents a novel method for 3D facial shape recognition. Our inputs are 3D facial shapes which are reconstructed from point clouds, and then filtered using PCA. The resulting data are represented by simplicial complexes. This representation can capture topological and geometric information at a specified resolution with a small number of control points. We calculate the Gromov-Hausdorff distance between simplicial complexes, and this measures how far each pair of faces is from being isometric. Finally, we demonstrate our method in an application to point clouds collected from a laser range scanner.

Chengming Zou, Edwin R. Hancock
Skeleton Simplification by Key Points Identification

The current skeletonisation algorithms, based on thinning, extract the morphological features of an object in an image but the skeletonized objects are coarsely presented. This paper proposes an algorithm which goes beyond that approach by changing the coarse line segments into perfect “straight” line segments, obtaining points, angles, line segment size and proportions. Our technique is applied in the post-processing phase of the skeleton, which improves it no matter which skeletonisation technique is used, as long as the structure is made with one-pixel width continuous line segments. This proposal is a first step towards human activity recognition through the analysis of human poses represented by their skeletons.

Gabriel Rojas-Albarracín, Carlos A. Carbajal, Antonio Fernández-Caballero, María T. López
Learning and Fast Object Recognition in Robot Skill Acquisition: A New Method

Invariant object recognition aims at recognising an object independently of its position, scale and orientation. This is important in robot skill acquisition during grasping operations, especially when working in unstructured environments. In this paper we present an approach to aid the learning of manipulative skills on-line. We introduce an approach based on an ANN for object learning and recognition using a descriptive vector built on recurrent patterns. Experimental learning results using a fast camera are presented. Some simple parts (i.e. circular, squared and radiused-square) were used for comparing different connectionist models (Backpropagation, Perceptron and FuzzyARTMAP) and to select the appropriate model. Later during experiments, complex figures were learned using the chosen FuzzyARTMAP algorithm, showing a 93.8% overall efficiency and a 100% recognition rate with less complex parts. Recognition times were lower than 1 ms, which clearly indicates the suitability of the approach to be implemented in robotic real-world operations.

I. Lopez-Juarez, R. Rios-Cabrera, M. Peña-Cabrera, R. Osorio-Comparan
Light Source Intensity Adjustment for Enhanced Feature Extraction

We explore the automatic adjustment of an artificial light source intensity for the purposes of image-based feature extraction and recognition. Two histogram-based criteria are proposed to achieve this adjustment: a two-class separation measure for 2D features and a Gaussian distribution measure for 2.5D features. To this end, the light source intensity is varied within a fixed interval as a camera captures one image for each intensity variation. The image that best satisfies the criteria for feature extraction is tested on a neural-network based recognition system. The network considers information related to both 2D (contour) and 2.5D shape (local surface curvature) of different objects. Experimental tests performed during different times of the day confirm that the proposed adjustment delivers improved feature extraction, extending the recognition capabilities of the system and adding robustness against changes in ambient light.

Francisco J. Castro-Martínez, Mario Castelán, Ismael López-Juárez
Fringe-Pattern Demodulation Using a Parametric Method Based on Differential Evolution

A parametric method to carry out fringe pattern demodulation by means of Differential Evolution is presented. The phase is approximated by the parametric estimation of an nth-degree polynomial so that no further unwrapping is required. On the other hand, a different parametric function can be chosen according to the prior knowledge of the phase behavior. A differential evolution is encoded with the parameters of the function that estimates the phase. The differential evolution evolves until a fitness average threshold is obtained. The method can demodulate noisy fringe patterns and even a one-image closed-fringe pattern successfully.

J. F. Jimenez, F. J. Cuevas, J. H. Sossa, L. E. Gomez
ANIMA: Non-conventional Brain-Computer Interfaces in Robot Control through Electroencephalography and Electrooculography, ARP Module

ANIMA has as a primary objective to compare three non-conventional human computer interfaces that comply with the industrial robot ST Robotics R-17 instructions. This module, Alpha Waves Related Potentials (ARP), explains how brain waves are obtained, processed, analyzed and identified depending on their frequency. This module makes use of the Open EEG Project's open hardware monitor for brain wave activity, called the modular EEG. The brain waves are obtained through an electrode cap complying with the international 10-20 system for electrode positioning. The brain waves are processed with a fast Fourier transform using a micro-controller and analyzed in software identifying the alpha wave's contribution. A program identifies the amount of time that alpha wave generation was maintained through concentration, and instructions are sent to the robotic arm, executing one of four pre-defined routines. Thirty percent of the users attained control over the robotic arm with the human computer interface.

Luis F. Reina, Gerardo Martínez, Mario Valdeavellano, Marie Destarac, Carlos Esquit
Efficient Roughness Recognition for Velocity Updating by Wheeled-Robots Navigation

This paper shows that Appearance-Based modeling is the best pattern recognition method for supporting velocity updating in wheeled-robot navigation. Although Appearance-Based recognition algorithms have lower accuracy than those for detailed pattern recognition, they successfully classify terrain textures by regarding the average of the appearance. In fact, the detailed recognition algorithms succeed in recognizing patterns depicted with lines, dots or borders, but they fail at recognizing patterns where the average appearance is required. As human driving experience shows, the assessment of the average appearance is needed for velocity updating during navigation on outdoor terrains. Human drivers adjust velocity based on an estimation of the terrain's average appearance. Hence, as the experimental results illustrate, the algorithms for average appearance recognition are the best option for training wheeled robots for velocity updating while navigating over outdoor terrains.

Farid García, Matías Alvarado

Image Processing

Fingerprint Verification with Non-linear Composite Correlation Filters

Fingerprint recognition has been used for many years for identification of persons. However, conventional fingerprint recognition systems might fail with poor quality, noisy or rotated images. Recently, novel non-linear composite filters for correlation-based pattern recognition have been introduced. The filters are designed with information from distorted versions of the reference object to achieve distortion-invariant recognition. Besides, a non-linear correlation operation is applied between the filter and the test image. These kinds of filters are robust to non-Gaussian noise. In this paper we apply non-linear composite filters for fingerprint verification. Computer simulations show the performance of the proposed filters with distorted fingerprints. In addition, in order to illustrate robustness to noise, the filters were tested with noisy images.

Saúl Martínez-Díaz, Javier A. Carmona-Troyo
Automatic Ultrasound Image Analysis in Hashimoto’s Disease

The paper presents diagnostics of parenchyma echogenicity and organ dimensions in thyroid examinations in the case of Hashimoto’s disease using image processing methods. In the event of discovering focal changes within the thyroid, a method for their pathology evaluation was suggested. The detector proposed operates fully automatically; using the information on the image texture it detects an artery in the image, which fulfils the role of reference point, and based on it - detects the area of interest.

Robert Koprowski, Zygmunt Wrobel, Witold Zieleznik
Estimating Quality Bounds of JPEG 2000 Compressed Leukocytes Images

Several pathologies are detected by counting different types of leukocytes in digital microscopic images. However, manipulation of these images, i.e. storage and/or transmission, can be complicated by the large sizes of the files containing them. In order to tackle this particular situation, lossy compression codecs such as JPEG 2000 have been employed while preserving the overall perceived image quality. In this paper a strategy based on objective quality metrics and performance of segmentation algorithms is proposed for the estimation of the maximal allowable compression rate (CR) where deterioration introduced in the images by the JPEG 2000 codec does not affect identification of white blood cells. Results indicate that the estimated value lies around CR = 142:1 as measured by the metrics employed.

Alexander Falcón-Ruiz, Juan Paz-Viera, Hichem Sahli
Surface Material Segmentation Using Polarisation

This paper describes the use of polarisation information for surface segmentation based on material characteristics. We work with both polarised and unpolarised light, and hence domains where the polarisation is either specular or diffuse. We commence by using moments to estimate the components of the polarisation image (mean-intensity, polarisation degree and phase) from images obtained through multiple polariser orientations. From the Fresnel theory, the phase of light remitted from a surface is equal to the azimuth angle of the remitted direction, and for materials with restricted ranges of refractive index the polarisation degree determines the zenith angle. Based on this observation, we parameterise the angular distribution of the mean intensity for remitted light using spherical harmonics. We explore how vectors of spherical harmonics can be used to characterise varying surface reflectance distributions, and segment a scene into different material patches using Mahalanobis distances and normalized graph cuts.

Nitya Subramaniam, Edwin Hancock
Measurement of Defocus Level in Iris Images Using Different Convolution Kernel Methods

During the video and fixed image acquisition procedure of an automatic iris recognition system, it is essential to acquire focused iris images. If defocused iris images are acquired, the performance of the iris recognition is degraded, because the iris images do not have enough feature information. Therefore it is important to apply an image quality evaluation method before the image processing. In this paper, four representative quality assessment methods are analyzed and compared on the MBGC iris database. These methods can quickly grade the images and pick out the high quality iris images from the video sequence captured by a real-time iris recognition camera. The experimental results of the four methods according to the receiver operating characteristic (ROC) curve are shown. Then the optimal method of quality evaluation, the one that allows better performance in an automatic iris recognition system, is found. This paper also presents an analysis in terms of computation speed of the four methods.

J. Miguel Colores-Vargas, Mireya S. García-Vázquez, Alejandro A. Ramírez-Acosta
Radon Transform Algorithm for Fingerprint Core Point Detection

This article presents an innovative technique for solving the problem of finding the core within a fingerprint. The Radon transform and a tree clustering algorithm were key to locating the coordinates of the core. Binarization and high-pass filtering processes to improve the contrast in fingerprints are proposed. The core of a fingerprint is located in the geometric cross section of maxima and minima in the Radon transforms at 0° and 90°. The technique is very stable, since it only presents difficulties when the fingerprint core is located on the edges of the image or is nonexistent.

Miguel Mora-González, Julio C. Martínez-Romo, Jesús Muñoz-Maciel, Guillermo Sánchez-Díaz, Javier Salinas-Luna, H. I. Piza-Dávila, Francisco J. Luna-Rosas, Carlos A. de Luna-Ortega
Genetic Algorithms and Tabu Search for Correcting Lanes in DNA Images

This paper describes an experience that combines Genetic Algorithms and Tabu Search as a mechanism for correcting lanes in DNA images obtained through the Random Amplified Polymorphism DNA (RAPD) technique. RAPD images are affected by various factors, among them noise and distortion, which impact the quality of the images and, subsequently, the accuracy in interpreting the data. This work proposes a hybrid method that uses genetic algorithms, for dealing with the highly combinatorial feature of this problem, and tabu search, for dealing with local optima. The results obtained by using them in this particular problem show an improvement in both the fitness of individuals and the execution time.

M. J. Angélica Pinninghoff, Q. Daniel Venegas, A. Ricardo Contreras
Adaptive Algorithm-Based Fused Bayesian Maximum Entropy-Variational Analysis Methods for Enhanced Radar Imaging

In this paper we address an adaptive computational algorithm to improve the Bayesian maximum entropy–variational analysis (BMEVA) performance for high resolution radar imaging and denoising. Furthermore, the variational analysis (VA) approach is aggregated by imposing the metrics structures in the corresponding signal spaces. Then, the formalism for combining the Bayesian maximum entropy strategy with the VA paradigm is presented. Finally, the image enhancement and denoising benefits produced by the proposed Adaptive Bayesian maximum entropy–variational analysis (ABMEVA) method are shown via simulations with real-world radar scenes.

R. F. Vázquez-Bautista, L. J. Morales-Mendoza, R. Ortega-Almanza, A. Blanco-Ortega
Iris Segmentation Using a Statistical Approach

Eyelashes and reflections occluding the iris region are noise factors that degrade the performance of iris recognition. If these factors are not eliminated in the iris segmentation phase, they are incorrectly considered as part of the iris region. This produces false iris pattern information, which decreases the recognition rate. In this paper a statistical approach is used to improve the iris segmentation phase by eliminating this noise from non-constrained images. The approach is composed of three parts: finding the pupil and limbus boundary, reflection detection, and eyelash detection. First an edge map is calculated using the Canny filter, then the Circular Hough Transform is used to improve circle parameter finding. An intensity variation analysis is used to recognize a strong reflection. Eyelashes are classified in two categories, separable and multiple. Intensity variances are used to detect multiple eyelashes and an edge detector to localize separable eyelashes. The results show that statistics are useful to decide when it is necessary to apply the eyelash detector.

Luis M. Zamudio-Fuentes, Mireya S. García-Vázquez, Alejandro A. Ramírez-Acosta

Neural Networks and Signal Processing

Adaboost Classifier by Artificial Immune System Model

An algorithm combining Artificial Immune System and AdaBoost, called Imaboost, is proposed to improve the feature selection and classification performance. AdaBoost is a machine learning technique which generates a strong classifier as a combination of simple classifiers. In Imaboost, the search for the best simple classifiers that AdaBoost performs through learning is replaced by the clonal selection algorithm. Haar features extracted from a face database are chosen as a case study. A comparison between AdaBoost and Imaboost is provided.

Hind Taud, Juan Carlos Herrera-Lozada, Jesús Álvarez-Cedillo
Cost-Sensitive Neural Networks and Editing Techniques for Imbalance Problems

The multi-class imbalance problem in supervised pattern recognition methods is receiving growing attention. In an imbalanced dataset, some classes are represented by a large number of samples while the other classes contain only a few. In real-world applications, imbalanced training sets may produce an important deterioration of the classifier performance on the less represented classes when neural networks are applied. In this paper we propose training cost-sensitive neural networks with editing techniques for handling the class imbalance problem on multi-class datasets. The aim is to remove majority samples while compensating the class imbalance during the training process. Experiments with real data sets demonstrate the effectiveness of the proposed strategy.

R. Alejo, J. M. Sotoca, V. García, R. M. Valdovinos
Designing RBFNNs Using Prototype Selection

Performance and accuracy of a neural network are strongly related to its design. Designing a neural network involves topology (number of neurons, number of layers, number of synapses between layers, etc.), training synapse weights, and parameter selection. Radial basis function neural networks (RBFNNs) could additionally require some other parameters, for example, the means and standard deviations if the activation function of neurons in the hidden layer is a Gaussian function. Commonly, Genetic Algorithms and Evolution Strategies have been used for automatically designing RBFNNs. In this work, the use of prototype selection methods for designing an RBFNN is proposed and studied. Experimental results show the viability of designing RBFNNs using prototype selection.

Ana Cecilia Tenorio-González, José Fco. Martínez-Trinidad, Jesús Ariel Carrasco-Ochoa
A Learning Social Network with Recognition of Learning Styles Using Neural Networks

The implementation of an adaptive learning social network to be used as an authoring tool is presented in this paper. With this tool, adaptive courses, intelligent tutoring systems and lessons can be created, displayed and shared in collaborative and mobile environments by communities of instructors and learners. The Felder-Silverman model is followed to tailor courses to the student's learning style. Self Organizing Maps (SOM) are applied to identify the student's learning style. The introduction of a social learning network to create, view and manage adaptive intelligent tutoring systems, and a novel method to identify the student's learning style, are the contributions of this paper.

Ramón Zatarain-Cabada, M. L. Barrón-Estrada, Viridiana Ponce Angulo, Adán José García, Carlos A. Reyes García
On-line Signature Verification Based on Modified Dynamic Time Warping and Wavelet Sub-band Coding

This paper presents an on-line signature biometric system based on a modified Dynamic Time Warping (DTW) algorithm applied to the signature wavelet coefficients. The modification on DTW relies on the use of direct matching points information (DMP) to dynamically adapt the similarity measure during the matching process, which is shown to increase the verification success rate. The wavelet analysis is done using a sub-band coding algorithm at global and local level. The use of wavelet coefficients showed a considerable reduction in processing time and an improvement in the equal error recognition rate (EER). The system was tested using a locally constructed database. A comparison of the ROC curves obtained in each case is presented.

Juan Carlos Sánchez-Diaz, Juan Manuel Ramírez-Cortes, Rogerio Enriquez-Caldera, Pilar Gomez-Gil
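
For orientation, the classic, unmodified DTW recurrence that the abstract above builds on can be sketched as follows. This is a textbook version with an absolute-difference local cost on 1-D sequences; it is not the authors' DMP-modified algorithm and does not involve wavelet coefficients. The function name `dtw_distance` is chosen here purely for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences.

    Assumption: local cost is |a[i] - b[j]|; the paper's modified
    similarity measure is NOT reproduced here.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, match
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

Two sequences that differ only by a repeated sample, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is the time-warping invariance that makes DTW attractive for signatures written at varying speed.
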
New Dissimilarity Measures for Ultraviolet Spectra Identification

Ultraviolet Spectra (UVS) analysis is a frequent tool in tasks like disease diagnosis, drug detection and hyperspectral remote sensing. A key point in these applications is the UVS comparison function. Although there are several UVS comparison functions, creating good dissimilarity functions is still a challenge because there are different substances with very similar spectra and the same substance may produce different spectra. In this paper, we introduce a new spectral dissimilarity measure for substance identification, based on the way experts visually match the spectra shapes. We also combine the new measure with the Spectral Correlation Measure. A set of experiments conducted with a database of real substances reveals superior results of the combined dissimilarity, with respect to state-of-the-art measures. We use Receiver Operating Characteristic curve analysis to show that our proposal gets the best tradeoff between false positive rates and true positive rates.

Andrés Eduardo Gutiérrez-Rodríguez, Miguel Angel Medina-Pérez, José Fco. Martínez-Trinidad, Jesús Ariel Carrasco-Ochoa, Milton García-Borroto
Third Degree Volterra Kernel for Newborn Cry Estimation

Newborn cry analysis is a difficult task due to its nonstationary nature, combined with the presence of nonlinear behavior. Therefore, an adaptive hereditary optimization algorithm is implemented in order to avoid the use of windowing or overlapping to capture the transient signal behavior. Identification of the linear part of this particular time series is carried out by employing an Autoregressive Moving Average (ARMA) structure; then, the resultant estimation error is approximated by a Nonlinear Autoregressive Moving Average (NARMA) model, which realizes a Volterra cubic kernel by means of a bilinear homogeneous structure in order to capture burst behavior. Normal, deaf, asphyxia, pain, and uncommon newborn cries are inspected for differentiation.

Gibran Etcheverry, Efraín López-Damian, Carlos A. Reyes-García

Pattern Recognition and Data Mining

Cascading an Emerging Pattern Based Classifier

Emerging Pattern classifiers are accurate and easy to understand classifiers. However, they have two characteristics that can degrade their accuracy: global discretization of numerical attributes and high sensitivity to the support threshold value. In this paper, we introduce a novel algorithm to find emerging patterns without global discretization. Additionally, we propose a new method for building cascades of emerging pattern classifiers, which combines the higher accuracy of classifying with higher support thresholds with the lower levels of abstention of classifying with lower thresholds. Experimental results show that our cascade attains higher accuracy than other state-of-the-art classifiers, including one of the most accurate emerging pattern based classifiers.

Milton García-Borroto, José Fco. Martínez-Trinidad, Jesús Ariel Carrasco-Ochoa
A New Combined Filter-Wrapper Framework for Gene Subset Selection with Specialized Genetic Operators

This paper introduces a new combined filter-wrapper gene subset selection approach where a Genetic Algorithm (GA) is combined with Linear Discriminant Analysis (LDA). This LDA-based GA algorithm has the major characteristic that the GA uses not only an LDA classifier in its fitness function, but also LDA's discriminant coefficients in its dedicated crossover and mutation operators. This paper studies the effect of these informed operators on the evolutionary process. The proposed algorithm is assessed on several well-known datasets from the literature and compared with recent state-of-the-art algorithms. The results obtained show that our filter-wrapper approach obtains globally high classification accuracies with a very small number of genes compared to those obtained by other methods.

Edmundo Bonilla Huerta, J. Crispín Hernández Hernández, L. Alberto Hernández Montiel
Hybrid Feature Selection Method for Supervised Classification Based on Laplacian Score Ranking

In this paper, we introduce a new hybrid filter-wrapper method for supervised feature selection, based on the Laplacian Score ranking combined with a wrapper strategy. We propose to rank features with the Laplacian Score to reduce the search space, and then we use this order to find the best feature subset. We compare our method against other ranking-based feature selection methods, namely, Information Gain Attribute Ranking, Relief, and Correlation-based Feature Selection, and additionally we include in our comparison a Wrapper Subset Evaluation method. Empirical results over ten real-world datasets from the UCI repository show that our hybrid method is competitive and, in most cases, outperforms the other feature selection methods used in our experiments.

Saúl Solorio-Fernández, J. Ariel Carrasco-Ochoa, José Fco. Martínez-Trinidad
Navigating K-Nearest Neighbor Graphs to Solve Nearest Neighbor Searches

Nearest neighbor queries can be satisfied, in principle, with a greedy algorithm under a proximity graph. Each object in the database is represented by a node, and proximal nodes in this graph will share an edge. To find the nearest neighbor the idea is quite simple: we start at a random node and get iteratively closer to the nearest neighbor following only adjacent edges in the proximity graph. Every node reachable from the current vertex is reviewed, and only the node closest to the query is expanded in the next round. The algorithm stops when none of the neighbors of the current node is closer to the query. The number of reviewed objects will be proportional to the diameter of the graph times the average degree of the nodes. Unfortunately the degree of a proximity graph is unbounded for a general metric space [1], and hence the number of inspected objects can be linear in the size of the database, which is the same as no indexing at all.

In this paper we introduce a quasi-proximity graph induced by the all-k-nearest neighbor graph. The degree of the above graph is bounded, but we will face local minima when running the above greedy algorithm, which boils down to having false positives in the queries.

We show experimental results for high dimensional spaces. We report a recall greater than 90% for most configurations, which is very good for many proximity searching applications, reviewing just a tiny portion of the database.

The space requirement for the index is linear in the database size, and the construction time is quadratic in the worst case. Relaxations of our method are sketched to obtain practical subquadratic implementations.

Edgar Chávez, Eric Sadit Tellez
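
The greedy walk described in this abstract can be illustrated with a small sketch. It assumes Euclidean points and a brute-force (quadratic) construction of the k-nearest-neighbor graph; the names `build_knn_graph` and `greedy_search` are hypothetical, and none of the relaxations the authors mention are implemented.

```python
import math


def build_knn_graph(points, k):
    """Brute-force all-k-nearest-neighbor graph: node i -> its k closest nodes.

    Assumption: Euclidean distance; construction is quadratic in len(points).
    """
    graph = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        graph.append(others[:k])
    return graph


def greedy_search(points, graph, query, start):
    """Move to the neighbor closest to the query until no neighbor improves.

    Returns (node, distance, number of distance evaluations). The result
    may be a local minimum rather than the true nearest neighbor.
    """
    current = start
    current_dist = math.dist(points[current], query)
    inspected = 1
    while True:
        best, best_dist = current, current_dist
        for nb in graph[current]:
            inspected += 1
            d = math.dist(points[nb], query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:  # no neighbor is closer: stop
            return current, current_dist, inspected
        current, current_dist = best, best_dist
```

Because every node has exactly `k` outgoing edges, the work per step is bounded, but the walk can stop at a node that is only locally closest. Restarting from several start nodes is one common way to raise recall; whether the paper does this is not stated in the abstract.
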
On the Design of a Hardware-Software Architecture for Acceleration of SVM’s Training Phase

Support Vector Machines (SVM) are a new family of Machine Learning techniques that have been used in many areas showing remarkable results. Since SVM training scales quadratically (or worse) with data size, it is worth exploring novel implementation approaches to speed up the execution of this type of algorithm. In this paper, a hardware-software architecture to accelerate the SVM training phase is proposed. The algorithm selected to implement the architecture is the Sequential Minimal Optimization (SMO) algorithm, which was partitioned so that a General Purpose Processor (GPP) executes operations and control flow while the coprocessor executes tasks that can be performed in parallel. Experiments demonstrate that the proposed architecture can speed up the SVM training phase 178.7 times compared against a software-only implementation of this algorithm.

Lázaro Bustio-Martínez, René Cumplido, José Hernández-Palancar, Claudia Feregrino-Uribe
A Highly Parallel Algorithm for Frequent Itemset Mining

Mining frequent itemsets in large databases is a widely used technique in Data Mining. Several sequential and parallel algorithms have been developed, although, when dealing with high data volumes, the execution of those algorithms takes more time and resources than expected. Because of this, finding alternatives to speed up the execution time of those algorithms is an active topic of research. Previous attempts at acceleration using custom architectures have been limited because of the nature of the algorithms, which were conceived sequentially and do not exploit the intrinsic parallelism that the hardware provides. The innovation in this paper is a highly parallel algorithm that utilizes a vertical bit vector (VBV) data layout and its suitability for support counting. Our results show that for dense databases a custom architecture for this algorithm can perform faster than the fastest architecture reported in previous works by one order of magnitude.

Alejandro Mesa, Claudia Feregrino-Uribe, René Cumplido, José Hernández-Palancar
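
To make the vertical bit vector idea concrete, here is a minimal software sketch, assuming each item is stored as one Python integer whose bit `t` is set when transaction `t` contains the item. The function names are hypothetical, and this says nothing about the authors' hardware architecture; it only shows why support counting reduces to bitwise AND plus a population count.

```python
def build_vertical_bitvectors(transactions):
    """Map each item to an int whose bit t is 1 iff transaction t has the item."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset, vectors, n_transactions):
    """Number of transactions containing every item of `itemset`."""
    acc = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= vectors.get(item, 0)  # unseen items match no transaction
    return bin(acc).count("1")       # population count
```

For example, with transactions `[{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]` the vectors for `a` and `b` are `0b0111` and `0b1101`, and their AND `0b0101` has two set bits, so the support of `{a, b}` is 2. Each AND touches all transactions at once, which is the kind of bit-level parallelism a custom architecture can exploit.
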
A Hybrid Methodology for Pattern Recognition in Signaling Cervical Cancer Pathways

Cervical Cancer (CC) is the result of infection by high-risk Human Papilloma Viruses. mRNA microarray expression data provides biologists with evidence of cellular compensatory gene expression mechanisms in the CC progression. Pattern recognition of signaling pathways through expression data can reveal interesting insights for the understanding of CC. Consequently, gene expression data should be submitted to different pre-processing tasks. In this paper we propose a methodology based on the integration of expression data and signaling pathways as a needed phase for the pattern recognition within signaling CC pathways. Our results provide a top-down interpretation approach where biologists interact with the recognized patterns inside signaling pathways.

David Escarcega, Fernando Ramos, Ana Espinosa, Jaime Berumen
Graph Indexing and Retrieval Based on Median Graphs

M-trees are used to organize and define fast queries on large databases of Attributed Graphs. In classical schemes based on metric trees, the routing information stored in a routing tree node is a selected Attributed Graph from the sub-cluster the node represents. Depending on the sub-cluster and the application, it is difficult to select a good representative of the sub-cluster. To that aim, we propose to use Generalized Median Graphs as the main information kept in the routing nodes of the m-tree. Experimental validation shows that in database queries, the decrease in the number of nodes explored in the m-tree when using a Generalized Median Graph is about 20% with respect to using a selected Attributed Graph.

Francesc Serratosa, Albert Solé-Ribalta, Enric Vidiella
A Novel Method for Counting Models on Grid Boolean Formulas

We study the possible orientation patterns of a grid graph G and propose a method for counting certain combinatorial structures over the class of orientations of G. For example, our method can be applied to counting the sink-free orientations of G, as well as to solving the #2SAT problem for grid Boolean formulas.

Our proposal extends the classical transfer matrix method used for counting the number of independent sets in a grid.

Carlos Guillén, Guillermo De Ita, Aurelio López-López
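
For readers unfamiliar with the classical transfer matrix method mentioned in the abstract, the sketch below counts the independent sets of a grid column by column. It only illustrates the classical baseline that the paper extends, not the authors' method for orientations or #2SAT; the function name and the bitmask encoding of columns are assumptions made for the example.

```python
def count_independent_sets(rows, cols):
    """Count independent sets of a rows x cols grid graph, one column at a time."""
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be positive")
    # A column state is a bitmask of chosen vertices with no two vertically adjacent.
    states = [s for s in range(1 << rows) if s & (s >> 1) == 0]
    # counts[s] = number of valid fillings of the columns so far that end in state s.
    counts = {s: 1 for s in states}
    for _ in range(cols - 1):
        # Two consecutive columns are compatible when they share no row (s & t == 0).
        counts = {
            t: sum(c for s, c in counts.items() if s & t == 0)
            for t in states
        }
    return sum(counts.values())


print(count_independent_sets(2, 2))
print(count_independent_sets(3, 3))
```

The compatibility test `s & t == 0` plays the role of the transfer matrix entry: applying it once per additional column is equivalent to multiplying by that matrix.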

Natural Language and Document Processing

Sentence to Document Level Emotion Tagging – A Coarse-Grained Study on Bengali Blogs

This paper presents the identification of document-level emotions from sentential emotions obtained at word-level granularity. Each Bengali blog document consists of a topic and the corresponding user comments. A sense-weight-based average scoring technique assigns sentential emotion tags after word-level emotion tagging with a Support Vector Machine (SVM) approach. The cumulative sum of sentential emotion scores is assigned to each document, considering combinations of some heuristic features. An average F-Score of 59.32% with respect to all emotion classes is achieved on the 95 documents of the development set by taking the best feature combination into account. Instead of a single emotion tag, each document is assigned the best two emotion tags according to the ordered emotion scores obtained. The best two system-assigned emotion tags of each document are compared against the best two human-annotated emotion tags. Evaluation on 110 test documents yields an average F-Score of 59.50% with respect to all emotion classes.

Dipankar Das, Sivaji Bandyopadhyay
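
The aggregation step described in the abstract (summing sentential emotion scores and keeping the best two tags per document) can be sketched as follows. This is only a minimal illustration under stated assumptions: the emotion labels, the scores, and the function name are invented for the example, and the heuristic features used in the paper are not modeled.

```python
from collections import defaultdict


def top_two_emotions(sentence_scores):
    """Sum per-sentence emotion scores and return the two highest-scoring tags."""
    totals = defaultdict(float)
    for scores in sentence_scores:
        for emotion, value in scores.items():
            totals[emotion] += value  # cumulative summation over sentences
    # Sort by descending total; break ties alphabetically so the result is deterministic.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [emotion for emotion, _ in ranked[:2]]


# Hypothetical sentence-level scores for one document.
document = [
    {"happy": 0.6, "sad": 0.1},
    {"happy": 0.2, "anger": 0.5},
    {"sad": 0.3, "anger": 0.1},
]
print(top_two_emotions(document))
```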
Clustering Weblogs on the Basis of a Topic Detection Method

In recent years we have seen a vast increase in the volume of information published on weblog sites, as well as the creation of new web technologies where people discuss current events. The need for automatic tools to organize this massive amount of information is clear, but particular characteristics of weblogs, such as shortness and overlapping vocabulary, make this task difficult. In this work, we present a novel methodology to cluster weblog posts according to the topics discussed therein. This methodology is based on a generative probabilistic model in conjunction with a Self-Term Expansion methodology. Our results demonstrate a considerable improvement over the baseline.

Fernando Perez-Tellez, David Pinto, John Cardiff, Paolo Rosso
A Naïve Bayes Approach to Cross-Lingual Word Sense Disambiguation and Lexical Substitution

Word Sense Disambiguation (WSD) is considered one of the most important problems in Natural Language Processing [1]. It is claimed that WSD is essential for applications that require language comprehension modules, such as search engines, machine translation systems, automatic answering machines, second life agents, etc. Moreover, given the huge amount of information on the Internet and the fact that this information is continuously growing in different languages, we are encouraged to deal with cross-lingual scenarios where WSD systems are also needed. On the other hand, Lexical Substitution (LS) refers to the process of finding a substitute word for a source word in a given sentence. The LS task requires first disambiguating the source word; therefore, these two tasks (WSD and LS) are related. In this paper, we present a naïve approach to tackle the problem of cross-lingual WSD and cross-lingual lexical substitution. We use a bilingual statistical dictionary, calculated with Giza++ from the EUROPARL parallel corpus, to estimate the probability that a source word is translated to a target word (which is assumed to be the correct sense of the source word, but in a different language). Two versions of the probabilistic model are tested: unweighted and weighted. The results were compared with those of an international competition and showed good performance.

David Pinto, Darnes Vilariño, Carlos Balderas, Mireya Tovar, Beatriz Beltrán
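
To make the idea of choosing a target word from a bilingual statistical dictionary more concrete, the sketch below scores each candidate translation by the product of per-word translation probabilities over the sentence, computed in log space. It is a simplified, unweighted illustration and not the model evaluated in the paper: the probability table, the floor value for unseen pairs, and the function name are all invented for the example (a real table would be estimated with Giza++ from a parallel corpus).

```python
import math

# Hypothetical values of P(target | source word); invented for illustration only.
TRANSLATION_PROB = {
    ("bank", "banco"): 0.70, ("bank", "orilla"): 0.30,
    ("money", "banco"): 0.60, ("money", "orilla"): 0.01,
    ("deposit", "banco"): 0.30, ("deposit", "orilla"): 0.05,
    ("river", "banco"): 0.02, ("river", "orilla"): 0.50,
}
FLOOR = 1e-6  # assumed probability for word pairs missing from the table


def best_translation(sentence_words, candidates, table=TRANSLATION_PROB):
    """Pick the candidate target word with the highest naive (independent) score."""
    def log_score(target):
        # Independence assumption: multiply per-word probabilities (sum of logs).
        return sum(math.log(table.get((w, target), FLOOR)) for w in sentence_words)
    return max(candidates, key=log_score)


print(best_translation(["bank", "money", "deposit"], ["banco", "orilla"]))
print(best_translation(["river", "bank"], ["banco", "orilla"]))
```

Working in log space avoids numerical underflow when sentences are long, and the floor keeps a single unseen pair from driving a candidate's score to minus infinity.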
Supervised Learning for Semantic Classification of Spanish Collocations

The meaning of word combinations such as give a book or lend money can be obtained by mechanically combining the meanings of the two constituent words: to give is to hand over, a book is a pack of pages, so to give a book is to hand over a pack of pages. However, the meaning of word combinations such as give a lecture or lend support is not obtained in this way: to give a lecture is not to hand it over. Such word pairs are called collocations. While their meaning cannot be derived automatically from the meaning of their constituents, we show how to predict the meaning of a previously unseen word combination using semantic regularities observed in a training set of collocations whose meaning has been specified manually.

Alexander Gelbukh, Olga Kolesnikova
Recognizing Textual Entailment with Statistical Methods

In this paper we propose a new non-symmetric cause-effect measure applied to the task of Recognizing Textual Entailment. First, we searched a large corpus for sentences containing the discourse marker “because” and collected cause-effect pairs. Entailment recognition is based on measuring the cause-effect relation between the text and the hypothesis using the relative frequencies of words from the cause-effect pairs. Our measure outperformed the baseline method on the three test sets of the PASCAL Recognizing Textual Entailment (RTE) Challenges. The measure proves to be good at discriminating the “true” class. We therefore developed a meta-classifier that uses a symmetric measure and a non-symmetric measure as base classifiers; this meta-classifier achieves competitive performance.

Miguel Angel Ríos Gaona, Alexander Gelbukh, Sivaji Bandyopadhyay
Backmatter
Metadata
Title
Advances in Pattern Recognition
Edited by
José Francisco Martínez-Trinidad
Jesús Ariel Carrasco-Ochoa
Josef Kittler
Copyright year
2010
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-15992-3
Print ISBN
978-3-642-15991-6
DOI
https://doi.org/10.1007/978-3-642-15992-3