2020 | Book

Pattern Recognition

12th Mexican Conference, MCPR 2020, Morelia, Mexico, June 24–27, 2020, Proceedings

Editors: Karina Mariela Figueroa Mora, Juan Anzurez Marín, Jaime Cerda, Dr. Jesús Ariel Carrasco-Ochoa, José Francisco Martínez-Trinidad, José Arturo Olvera-López

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science

About this book

This book constitutes the proceedings of the 12th Mexican Conference on Pattern Recognition, MCPR 2020, which was due to be held in Morelia, Mexico, in June 2020. The conference was held virtually due to the COVID-19 pandemic.
The 31 papers presented in this volume were carefully reviewed and selected from 67 submissions. They were organized in the following topical sections: pattern recognition techniques; image processing and analysis; computer vision; industrial and medical applications of pattern recognition; natural language processing and recognition; artificial intelligence techniques and recognition.

Table of Contents

Frontmatter

Pattern Recognition Techniques

Frontmatter
Fruit Classification for Retail Stores Using Deep Learning

Payment for fruits or vegetables in retail stores normally requires them to be manually identified. This paper presents an image classification method, based on lightweight Convolutional Neural Networks (CNN), with the goal of speeding up the checkout process in stores. A new dataset of images is introduced that considers three classes of fruits, with or without plastic bags. In order to increase the classification accuracy, different input features are added to the CNN architecture. These inputs are a single RGB color, the RGB histogram, and the RGB centroid obtained from K-means clustering. The results show an overall 95% classification accuracy for fruits with no plastic bag, and 93% for fruits in a plastic bag.

Jose Luis Rojas-Aranda, Jose Ignacio Nunez-Varela, J. C. Cuevas-Tello, Gabriela Rangel-Ramirez
Towards Dendrite Spherical Neurons for Pattern Classification

This paper introduces the Dendrite Spherical Neuron (DSN) as an alternative to the Dendrite Ellipsoidal Neuron (DEN), in which hyperspheres, instead of hyperellipses, group the patterns from different classes. The reasoning behind DSN is to simplify the computation of the DEN architecture, where a centroid and a covariance matrix are the two dendritic parameters, whereas in DSN the covariance matrix is replaced by a radius. This modification is useful to avoid singular covariance matrices, since DEN requires measuring the Mahalanobis distance to classify patterns. The DSN training consists of determining the centroids of the dendrites with the k-means algorithm, followed by calculating the radius of each dendrite as the mean distance to the two nearest centroids, and finally determining the weights of a softmax function at the output of the neuron with Stochastic Gradient Descent. In addition, Simulated Annealing automatically determines the number of dendrites that maximizes the classification accuracy. The DSN is applied to synthetic and real-world datasets. The experimental results reveal that DSN is competitive with Multilayer Perceptron (MLP) networks while using less complex architectures. Also, DSN tends to outperform the Dendrite Morphological Neuron (DMN), which uses hyperboxes. These findings suggest that the DSN is a potential alternative to MLP and DMN for pattern classification tasks.

Wilfrido Gómez-Flores, Juan Humberto Sossa-Azuela
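
The radius step mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the dendrite centroids are already available (for example, from k-means) and reads "mean distance to the two nearest centroids" as the average Euclidean distance from each centroid to its two closest neighbouring centroids. The function name `dendrite_radii` is hypothetical and is not taken from the paper.

```python
import math


def dendrite_radii(centroids):
    """Return one radius per centroid.

    Assumption: the radius of a dendrite is the mean Euclidean distance
    from its centroid to its two nearest neighbouring centroids.
    Requires at least two centroids.
    """
    radii = []
    for i, centre in enumerate(centroids):
        # Distances to every other centroid, smallest first.
        distances = sorted(
            math.dist(centre, other)
            for j, other in enumerate(centroids)
            if j != i
        )
        nearest = distances[:2]  # only one neighbour exists if there are two centroids
        radii.append(sum(nearest) / len(nearest))
    return radii


if __name__ == "__main__":
    print(dendrite_radii([(0, 0), (3, 4), (6, 8)]))
```

Replacing the covariance matrix with a single scalar per dendrite is what removes the need for a matrix inverse, which is the motivation stated in the abstract.
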
Comparison of Recurrent Neural Networks for Wind Power Forecasting

Integrating wind power into the electrical grid is complicated by the stochastic nature of the wind, which makes its prediction a challenging task. It is therefore important to devise forecasting tools to support this task. For example, a network that integrates an Echo State Network architecture and Long Short-Term Memory blocks as hidden units (ESN+LSTM) has been proposed, showing good performance against a physical model. This paper compares this network with the Echo State Network (ESN) and Long Short-Term Memory (LSTM) for forecasting wind power from 1 to 24 h ahead. Results show that the ESN+LSTM model outperforms ESN and LSTM in terms of MSE, MAE, and the metrics used in the Taylor diagram. In addition, we observe that the advantage of this network is statistically significant during the first moments of the forecast horizon, according to the t-test and the Wilcoxon test.

Erick López, Carlos Valle, Héctor Allende-Cid, Héctor Allende
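
For readers unfamiliar with the two error metrics named in the abstract above, the following Python sketch computes the mean squared error (MSE) and mean absolute error (MAE) between observed and forecast values. The sample numbers are invented for illustration only and are not results from the paper.

```python
def mse(observed, forecast):
    """Mean squared error between two equally long sequences."""
    return sum((o - f) ** 2 for o, f in zip(observed, forecast)) / len(observed)


def mae(observed, forecast):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


if __name__ == "__main__":
    # Hypothetical wind power values (arbitrary units), not data from the paper.
    observed = [1.0, 2.0, 3.0]
    forecast = [1.0, 2.5, 2.0]
    print("MSE:", mse(observed, forecast))
    print("MAE:", mae(observed, forecast))
```

MSE penalizes large deviations more heavily than MAE, which is one reason forecasting studies often report both.
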
Classification of Cattleya Trianae and Its Varieties by Using Colorimetry

Orchids in general, like Cattleya trianae, have been characterized mostly by taxonomic and visual studies. However, colour is not used for classification. Here, a new method for identifying and classifying orchids of different varieties of Cattleya trianae is introduced. This method is not subjective and uses the colour information obtained from the central axis of the Cattleya trianae lip. To this end, a new acquisition protocol was established, which uses a new device for image acquisition of the labellum’s central axis from the hippocampus to the epicentre. The colour patterns found between samples of the same variety were adjusted, and it was verified using correlation that they can be employed to identify each variety. Finally, a support vector machine was used to classify and identify four Cattleya trianae varieties, finding that a linear kernel was enough to classify them with an accuracy of 100%.

Manuel G. Forero, Carlos E. Beltrán, Armando Troncoso, Christian González-Santos
Analysis of Repair Costs of Scholar Buildings Affected by Earthquakes Using Data Mining. Case Study: Earthquakes of 2017 in Mexico

Earthquakes are events that cannot be predicted. When they occur, however, they have devastating economic, social and structural consequences, among others. In this paper, association rule mining is carried out in order to estimate the repair cost required by schools affected during the earthquakes of September 7 and 19, 2017, in Mexico. For that, we use the public data collected by the Mexican FONDEN.

Graciela García-Rueda, Rosa M. Valdovinos, Jesús Valdés-González, Roberto Alejo, J. Leonardo González-Ruiz, José R. Marcial-Romero
Basic Pattern Graphs for the Efficient Computation of Its Number of Independent Sets

The problem of counting the number of independent sets of a graph G (denoted as i(G)) is a classic #P-complete problem. We present some patterns on graphs that allow the computation of i(G) in polynomial time. For example, we show that for a graph G whose set of cycles can be arranged as embedded cycles, i(G) can be computed in polynomial time. In particular, our proposal counts independent sets on outerplanar graphs.

Guillermo De Ita, Miguel Rodríguez, Pedro Bello, Meliza Contreras
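
To make the quantity i(G) from the abstract above concrete, the Python sketch below counts independent sets for the two simplest graph patterns, paths and cycles, in linear time, and checks the result against exhaustive enumeration. It is a textbook illustration only, not the algorithm proposed in the paper, and all function names are hypothetical.

```python
def independent_sets_path(n):
    """i(P_n) for a path on n vertices, using a two-state recurrence."""
    excluded, included = 1, 0  # empty path: only the empty set
    for _ in range(n):
        # The new end vertex is either left out (any previous state)
        # or taken (its predecessor must have been left out).
        excluded, included = excluded + included, excluded
    return excluded + included


def independent_sets_cycle(n):
    """i(C_n) for a cycle on n >= 3 vertices.

    Fix one vertex: if it is left out, a path on n-1 vertices remains;
    if it is taken, both neighbours are forbidden and a path on n-3 remains.
    """
    return independent_sets_path(n - 1) + independent_sets_path(n - 3)


def independent_sets_brute_force(n, edges):
    """Exponential-time reference count over all vertex subsets."""
    count = 0
    for mask in range(1 << n):
        if all(not ((mask >> u) & 1 and (mask >> v) & 1) for u, v in edges):
            count += 1
    return count


if __name__ == "__main__":
    n = 6
    cycle_edges = [(k, (k + 1) % n) for k in range(n)]
    print(independent_sets_cycle(n), independent_sets_brute_force(n, cycle_edges))
```

The brute-force routine examines all 2^n subsets, whereas the recurrences need only n steps, which is the kind of gap that structural patterns on graphs are meant to exploit.
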
Towards Selecting Reducts for Building Decision Rules for Rule-Based Classifiers

In rule-based classifiers, calculating all possible rules of a learning sample consumes many resources due to its exponential complexity. Therefore, finding ways to reduce the number and length of the rules without affecting the efficacy of a classifier remains an interesting problem. Reducts from rough set theory have been used to build rule-based classifiers because of their conciseness and understandability. However, the accuracy of the classifiers based on these rules depends on the selected rule subset. In this work, we focus on analyzing three different options for using reducts to build decision rules for rule-based classifiers.

Manuel S. Lazo-Cortés, José Fco. Martínez-Trinidad, Jesús A. Carrasco-Ochoa, Nelva N. Almanza-Ortega
Towards Inpainting and Denoising Latent Fingerprints: A Study on the Impact in Latent Fingerprint Identification

In this paper, we provide a study of the impact of the most prominent inpainting and denoising solutions on latent fingerprint identification. From an in-depth analysis, we show how some of the analyzed inpainting and denoising solutions can improve the fingerprint identification rates by up to 63% for Rank-1 and 26% for Rank-20 when state-of-the-art minutiae extractors are used. Nevertheless, it is necessary to create new denoising and inpainting solutions that are specifically built to deal with latent fingerprints and their associated issues.

Ernesto Ramírez-Sáyago, Octavio Loyola-González, Miguel Angel Medina-Pérez

Image Processing and Analysis

Frontmatter
New Method for Extreme Color Detection in Images

In image processing and computer vision, it is common to find applications in which it is necessary to detect reference points characterized by extreme color, i.e., a primary RGB color or a complementary CMY color with very high saturation. Thus, there are cases in which a certain class of objects can be distinguished according to their characteristic extreme color, which can be used as landmarks or to identify objects. Therefore, there is an interest in identifying landmarks characterized by extreme colors. In this paper, a new method for detecting objects with an extreme color is introduced and compared with other approaches found in the literature. The methods are analyzed and compared using a color palette in which a transition between R, G, B, C, M and Y colors is generated. The results obtained show that the methods studied allow the specific colors to be adequately discriminated, while the proposed method is the only one that allows the full range of extreme colors R, G, B, C, M and Y to be detected, being more selective than the others by taking practically the areas corresponding to each color separately.

Manuel G. Forero, Julián Ávila-Navarro, Sergio Herrera-Rivera
A Simple Methodology for 2D Reconstruction Using a CNN Model

In recent years, Deep Learning research has demonstrated its effectiveness in digital image processing, mainly in areas with heavy computational load. Such is the case of aerial photogrammetry, where the principal objective is to generate a 2D map or a 3D model of a specific terrain. In these topics, high efficiency in visual information processing is demanded. In this work we present a simple methodology to build an orthomosaic. Our proposal focuses on replacing traditional digital image processing with a Convolutional Neural Network (CNN) model. The dataset of aerial images is generated from drone photographs of our university campus. The method described in this article uses a CNN model to detect matching points and the RANSAC algorithm to correct the feature correlation. Experimental results show that the feature maps and matching points obtained between pairs of images through a CNN are comparable with those obtained with traditional artificial vision algorithms.

Armando Levid Rodríguez-Santiago, José Anibal Arias-Aguilar, Alberto Elías Petrilli-Barceló, Rosebet Miranda-Luna
Improvement of the Turajlić Method for the Estimation of Gaussian Noise in Images

Gaussian noise estimation is an important step in some of the more recently developed noise removal methods. This is a difficult task and although several estimation techniques have been proposed recently, they generally do not produce good results. In a previous comparative study, among several noise estimation techniques, a method proposed in 2017 by Turajlić was found to give the best results. Although acceptable, they are still far from ideal. Therefore, several changes to this method are introduced in this paper to improve the estimation. Tests on monochromatic images contaminated with different levels of Gaussian noise showed that the modified method produces a significant improvement in the estimation of Gaussian noise, over 35%, at a slightly higher computational cost.

Manuel G. Forero, Sergio L. Miranda, Carlos Jacanamejoy-Jamioy
Spatial $$\alpha $$-Trimmed Fuzzy C-Means Algorithm to Image Segmentation

An important aspect to take into account when an image is segmented is the presence of atypical information. In this investigation, an algorithm is proposed that is noise tolerant in the segmentation process. The image segmentation method, called Spatial $$\alpha $$-Trimmed Fuzzy C-Means, combines the Fuzzy C-Means (FCM) algorithm and the Trimmed Means filter, using local information to achieve better segmentation. The FCM is very sensitive to noise, and the Trimmed Means filter is used to eliminate outliers at a lower computational cost. Compared to some state-of-the-art algorithms, the proposed algorithm is faster and noise tolerant, demonstrating better performance in the metrics considered.

Virna V. Vela-Rincón, Dante Mújica-Vargas, Manuel Mejía Lavalle, Andrea Magadán Salazar
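
The outlier-rejecting filter named in the abstract above can be sketched in a few lines of Python. The sketch shows a generic alpha-trimmed mean over a square neighbourhood, not the authors' combined FCM algorithm. It assumes that `alpha` is the fraction of samples discarded at each end of the sorted window and that borders are handled by replicating edge pixels; both choices, like the function names, are illustrative assumptions.

```python
def alpha_trimmed_mean(values, alpha):
    """Mean of the values left after dropping the lowest and highest
    int(alpha * n) samples. Assumes 0 <= alpha < 0.5."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    ordered = sorted(values)
    k = int(alpha * len(ordered))
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def alpha_trimmed_filter(image, alpha, radius=1):
    """Apply the alpha-trimmed mean over a (2*radius+1)^2 window.

    `image` is a list of rows of numbers; edge pixels are replicated
    at the borders (an assumption made for this sketch).
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            output[y][x] = alpha_trimmed_mean(window, alpha)
    return output


if __name__ == "__main__":
    noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
    print(alpha_trimmed_filter(noisy, alpha=0.2))
```

With `alpha = 0` the filter reduces to a plain mean, and as `alpha` approaches 0.5 it approaches a median, so the parameter trades smoothing against robustness to impulsive outliers.
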
Restoration of Range Images by the Gaussian Pyramid Method, Testing Different Interpolation Techniques to Select the Best Performance

The inpainting method implemented in this work estimates the missing information in a range image from a single image, without depending on the RGB image of the scene or on multiple range images to perform the restoration. The proposal is based on improving the results of restoring range images with the Gaussian Pyramid method by finding the best interpolation technique for estimating the missing information. Different interpolation techniques were computed and applied in order to determine the best option to implement. This was carried out considering the amount of information that can be estimated, the processing time, and the total amount of information missing in the image to be restored. The method was tested with five different databases, one of which was created specifically for this work. These databases include different interior scenarios with several objects. A qualitative and quantitative comparative analysis of the obtained results was performed.

Enrique Chavira Calderón, Alejandra Cruz-Bernal
A Novel Set of Moment Invariants for Pattern Recognition Applications Based on Jacobi Polynomials

A novel set of moment invariants for pattern recognition applications, based on Jacobi polynomials, is presented. These moment invariants are constructed for digital images by means of a combination with geometric moments, and are invariant under affine geometric transformations such as rotation, translation and scaling on the image plane. This invariance is tested on a sample of the MPEG-7 CE-Shape-1 dataset. The results presented show that the low-order moment invariants indeed possess low variance between images that are affected by the mentioned geometric transformations.

Rafael Augusto Rocha Angulo, Juan Martín Carpio, Alfonso Rojas-Domínguez, Manuel Ornelas-Rodríguez, Héctor Puga
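
The geometric moments mentioned in the abstract above are the standard building block for such invariants. The Python sketch below computes raw and central geometric moments of a small grey-level image and shows that central moments do not change when the shape is translated. It illustrates only this classical translation invariance, not the Jacobi-based invariants of the paper, and the helper names are hypothetical.

```python
def raw_moment(image, p, q):
    """Raw geometric moment m_pq = sum over pixels of x^p * y^q * f(x, y)."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def central_moment(image, p, q):
    """Central moment mu_pq, taken about the intensity centroid."""
    m00 = raw_moment(image, 0, 0)
    x_c = raw_moment(image, 1, 0) / m00
    y_c = raw_moment(image, 0, 1) / m00
    return sum(
        ((x - x_c) ** p) * ((y - y_c) ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def draw_shape(pixels, offset, size=6):
    """Hypothetical helper: binary image with `pixels` shifted by `offset`."""
    image = [[0] * size for _ in range(size)]
    for x, y in pixels:
        image[y + offset[1]][x + offset[0]] = 1
    return image


if __name__ == "__main__":
    shape = [(0, 0), (1, 0), (2, 0), (0, 1)]
    original = draw_shape(shape, (0, 0))
    shifted = draw_shape(shape, (2, 3))
    print(central_moment(original, 2, 0), central_moment(shifted, 2, 0))
```

Invariance to scaling and rotation requires further normalization and combination of these moments, which is where constructions such as the one described in the abstract come in.
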
Orthogonal Local Image Descriptors with Convolutional Autoencoders

This work proposes the use of deep learning architectures, and in particular Convolutional Autoencoders (CAEs), to incorporate an explicit component of orthogonality into the computation of local image descriptors. For this purpose we present a methodology based on the computation of dot products among the hidden outputs of the center-most layer of a convolutional autoencoder, that is, the dot product between the responses of the different kernels of the central layer (sections of a latent representation). We compare this dot product against an indicator of orthogonality, which, in the presence of non-orthogonal hidden representations, back-propagates a gradient through the network, adjusting its parameters to produce new representations that will be closer to being orthogonal to one another in future iterations. Our results show that the proposed methodology is suitable for the estimation of local image descriptors that are orthogonal to one another, which is often a desirable feature in many pattern recognition tasks.

Edgar Roman-Rangel, Stephane Marchand-Maillet
Digital Assisted Image Correlation for Metal Sheet Strain Measurement

Current methods of correlation and point matching between stereoscopic images produce large errors or are completely inefficient when the surface has a repetitive, non-isotropic, low-contrast pattern. In this article, a new method of Digital Assisted Image Correlation (DAIC) is presented to match specific points in order to estimate the deformation of the surface of the metal sheets used in the automotive industry. To achieve this, the surface to be measured is stamped with a regular pattern of points, and digital image processing is then applied to obtain the labels of the circles of the pattern. After this, a semi-automatic search is made in the labels of both images to correlate all of them and perform the triangulation. DIC is used to corroborate the correspondence between points and verify the accuracy and efficiency of the developed method. This allows the 3D reconstruction of the sheet with a minimum of information and provides more efficiency and a great benefit in computational cost. Deformation is calculated by two methods, whose results are similar to the values obtained with a digital microscope. It is assumed that the quality of the stamped marks, the lighting, and the initial conditions also contribute to trustworthy results.

García-Alcalá Carlos-Eduardo, Padilla-Medina José-Alfredo, Barranco-Gutiérrez Alejandro-Israel
Structured Pointcloud Segmentation for Individual Mangrove Tree Modeling

Tree structure parameters of mangrove forests are hard to measure in the field, and therefore inventories of this type of forest are impossible to keep up to date. In this article, we tested a structured pointcloud segmentation method for extracting individual mangrove trees. Structure parameters of individual trees were estimated from the segmented pointcloud, and their 3D geometry was generated using revolution surfaces. Estimated parameters were then assessed at both plot and tree levels using field data. It was observed that the number of segments in each test plot agreed well with the number of trees observed in the field. Nonetheless, the estimated parameters exhibited mixed accuracy, with top height being the most accurate.

José L. Silván-Cárdenas, José A. Gallardo-Cruz, Laura M. Hernández-Huerta

Computer Vision

Frontmatter
Vision-Based Blind Spot Warning System by Deep Neural Networks

Traffic accidents represent one of the most serious problems around the world. Many efforts have been concentrated on implementing Advanced Driver Assistance Systems (ADAS) to increase safety by reducing critical tasks faced by the driver. In this paper, a Blind Spot Warning (BSW) system capable of virtualizing cars around the driver’s vehicle is presented. The system is based on deep neural models for car detection and depth estimation using images captured with a camera located on top of the main vehicle; transformations are then applied to the image to generate the appropriate information format. Finally, the cars in the environment are represented in a 3D graphical interface. We present a comparison between car detectors and another one between depth estimators, from which we choose the best-performing ones to be implemented in the BSW system. In particular, our system offers a more intuitive assistance interface for the driver, allowing a better and quicker understanding of the environment from monocular cameras.

Víctor R. Virgilio G., Humberto Sossa, Erik Zamora
Onboard CNN-Based Processing for Target Detection and Autonomous Landing for MAVs

In this work, we address the problem of target detection involved in an autonomous landing task for a Micro Aerial Vehicle (MAV). The challenge is to detect a flag located somewhere in the environment. The flag is mounted on a pole, and to its right, a landing platform is located. Thus, the MAV has to detect the flag, fly towards it and, once it is close enough, locate the landing platform nearby, aiming at centring over it to perform landing; all of this has to be carried out autonomously. In this context, the main problem is the detection of both the flag and the landing platform, whose shapes are known in advance. Traditional computer vision algorithms could be used; however, the main challenges in this task are the changes in illumination, rotation and scale, and the fact that the flight controller uses the detection to perform the autonomous flight; hence the detection has to be stable and continuous on every camera frame. Motivated by this, we propose to use a Convolutional Neural Network optimised to be run on a small computer with a limited processing budget. The MAV carries this computer, and it is used to process everything on board. To validate our system, we tested it with rotated images, changes in scale and the presence of low illumination. Our method is compared against two conventional computer vision methods, namely, template and feature matching. In addition, we tested our system's performance in a wide corridor, executing everything on board the MAV. We achieved a successful detection of the flag with a confidence metric of 0.9386, and of 0.9826 for the landing platform. In total, all the onboard computations ran at an average of 13.01 fps.

A. A. Cabrera-Ponce, J. Martinez-Carranza
COUPLED: Calibration of a LiDAR and Camera Rig Using Automatic Plane Detection

LiDARs and cameras are two widely used sensors in robotics and computer vision, particularly for navigation and recognition in 3D scenarios. Systems combining both may benefit from the precise depth of the former and the high-density information of the latter, but a calibration process is necessary to relate them spatially. In this paper, we introduce COUPLED, a method that finds the extrinsic parameters to relate information between them. The method implies the use of a setup consisting of three planes with charuco patterns to find the planes in both systems. We obtain corresponding points in both systems through geometric relations between the planes. Afterward, we use these points and the Kabsch algorithm to compute the transformation that merges the planes between both systems. Compared to recent single plane algorithms, we obtain more accurate parameters, and only one pose is required. In the process, we develop a method to automatically find the calibration target using a plane detector instead of manually selecting the target in the LiDAR frame.

Omar Montoya, Octavio Icasio, Joaquín Salas
ModuleNet: A Convolutional Neural Network for Stereo Vision

Convolutional Neural Networks (CNNs) have gained much attention for the solution of numerous vision problems, including disparity calculation in stereo vision systems. In this paper, we present a CNN-based solution for disparity estimation that builds upon a basic module (BM) with a limited range of disparities, which can be extended by using several BMs in parallel. Our BM can be understood as a segmentation by disparity and produces an output channel with the memberships for each disparity candidate. Additionally, the BM computes a channel with the out-of-range disparity regions. This extra channel allows us to parallelize several BMs and deal with their respective responsibilities. We train our model with the MPI Sintel dataset. The results show that ModuleNet, our modular CNN model, outperforms the baseline algorithm Efficient Large-scale Stereo Matching (ELAS) and FlowNetC, achieving an improvement of about 80%.

O. I. Renteria-Vidales, J. C. Cuevas-Tello, A. Reyes-Figueroa, M. Rivera

Industrial and Medical Applications of Pattern Recognition

Frontmatter
A Method for Estimating Driving Factors of Illicit Trade Using Node Embeddings and Clustering

The trade in illegal goods and services, also known as illicit trade, is expected to drain 4.2 trillion dollars from the world economy and put 5.4 million jobs at risk by 2022. These estimates reflect the importance of combating illicit trade, as it poses a danger to individuals and undermines governments. To do so, however, we have to first understand the factors that influence this type of trade. Therefore, we present in this article a method that uses node embeddings and clustering to compare a country-based illicit supply network to other networks that represent other types of country relationships (e.g., free trade agreements, language). The results offer initial clues on the factors that might be driving the illicit trade between countries.

Jorge Ángel González Ordiano, Lisa Finn, Anthony Winterlich, Gary Moloney, Steven Simske
Accurate Identification of Tomograms of Lung Nodules Using CNN: Influence of the Optimizer, Preprocessing and Segmentation

The diagnosis of pulmonary nodules plays an important role in the treatment of lung cancer; thus, improving the diagnosis is the primary concern. This article compares the results obtained in the identification of computed tomography scans with pulmonary nodules when using different optimizers (Adam and Nadam); the effect of pre-processing and segmentation techniques when using CNNs is also thoroughly explored. The dataset employed was Lung TIME, which is publicly available. When no preprocessing or segmentation was applied, training accuracy above 90.24% and test accuracy above 86.8% were obtained. In contrast, when segmentation was applied without preprocessing, a training accuracy above 97.19% and test accuracy above 95.07% were reached. On the other hand, when preprocessing and segmentation were applied, a training accuracy above 96.41% and test accuracy above 94.71% were achieved. On average, the Adam optimizer scored a training accuracy of 96.17% and a test accuracy of 95.23%, whereas the Nadam optimizer obtained 96.25% and 95.2%, respectively. It is concluded that the CNN performs well even when working with noisy images. The performance of the network when working with preprocessing and segmentation was similar to that obtained when using only segmentation. Also, it can be inferred that the application of preprocessing and segmentation is an excellent option when it is required to improve accuracy in CNNs.

Cecilia Irene Loeza Mejía, R. R. Biswal, Eduardo Rodriguez-Tello, Gilberto Ochoa-Ruiz
What the Appearance Channel from Two-Stream Architectures for Activity Recognition Is Learning?

The automatic recognition of human activities from video data is being led by spatio-temporal Convolutional Neural Networks (3D CNNs), in particular two-stream architectures such as I3D, which reports the best accuracy so far. Despite the high accuracy of this kind of architecture, very little is known about what these models are really learning from data, which results in a lack of robustness and explainability. In this work we select the appearance channel from the I3D architecture and create a set of experiments aimed at explaining what this model is learning. Throughout the proposed experiments we provide evidence that this particular model is learning the texture of the largest area (which can be the activity or the background, depending on the distance from the camera to the action performed). In addition, we state several considerations to take into account when selecting the training data to achieve a better generalization of the model for human activity recognition.

Reinier Oves García, L. Enrique Sucar
Automatic Estrus Cycle Identification System on Female Dogs Based on Deep Learning

Vaginal cytology is an economical and simple complementary method for determining which stage of the estrous cycle a dog is in, in order to achieve a higher fertility rate. This method is based on determining the type and quantity of cells at the different stages of the estrous cycle, since the hormonal changes that the vaginal mucosa undergoes during the estrous cycle are reflected in the morphology of its epithelial cells. During her reproductive life, the canine female goes through different phases of activity and hormonal rest that are repeated cyclically. This is called the estrous cycle and consists of 4 stages: proestrus, estrus, diestrus and anestrus. The interpretation of vaginal cytologies is a process that demands a considerable amount of time for observation under the microscope, and the interpretation by the doctor can become subjective and be poorly performed, causing economic losses for the owners. Therefore, this work proposes an automatic system, based on a Faster R-CNN, that identifies six types of cells and their quantity on the glass slide in order to determine which stage of the estrous cycle the dog is in. Our results show an accuracy of 91.6%. The proposed system will improve the efficiency and speed of cytologies, reducing the time required from approximately 1 h to just a few seconds.

Gustavo Calderón, Cesar Carrillo, Mariko Nakano, Jeanine Acevedo, José Ernesto Hernández

Natural Language Processing and Recognition

Frontmatter
Machine Learning Techniques for Identity Document Verification in Uncontrolled Environments: A Case Study

Distributed (i.e. mobile) enrollment to services such as banking is gaining popularity. In such processes, users are often asked to provide proof of identity by taking a picture of an ID. For this to work securely, it is critical to automatically check basic document features, perform text recognition, among others. Furthermore, challenging contexts might arise, such as various backgrounds, diverse light quality, angles, perspectives, etc. In this paper we present a machine-learning based pipeline to process pictures of documents in such scenarios, that relies on various analysis modules and visual features for verification of document type and legitimacy. We evaluate our approach using identity documents from the Republic of Colombia. As a result, our machine learning background detection method achieved an accuracy of 98.4%, and our authenticity classifier an accuracy of 97.7% and an F1-score of 0.974.

Alejandra Castelblanco, Jesus Solano, Christian Lopez, Esteban Rivera, Lizzy Tengana, Martín Ochoa
Not All Swear Words Are Used Equal: Attention over Word n-grams for Abusive Language Identification

The increasing propagation of abusive language in social media is a major concern for supplier companies and governments because of its negative social impact. A large number of methods have been developed for its automatic identification, ranging from dictionary-based methods to sophisticated deep learning approaches. A common problem in all these methods is distinguishing the offensive use of swear words from their everyday and humorous usage. To tackle this particular issue we propose an attention-based neural network architecture that captures the importance of word n-grams according to their context. The results obtained on four standard collections from Twitter and Facebook are encouraging: they outperform the $$F_1$$ scores of state-of-the-art methods and allow identifying a set of inherently offensive swear words, as well as others whose interpretation depends on context.

Horacio Jesús Jarquín-Vásquez, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda
Gender Identification in Social Media Using Transfer Learning

Social networks have modified the way we communicate. It is now possible to talk to a large number of people we have never met. Knowing the traits of a person from what he/she writes has become a new area of computational linguistics called Author Profiling. In this paper, we introduce a method for applying transfer learning to address the gender identification problem, which is a subtask of Author Profiling. Systems that use transfer learning are trained in a large number of tasks and then tested in their ability to learn new tasks. An example is to classify a new image into different possible classes, giving an example of each class. This differs from the traditional approach of standard machine learning techniques, which are trained in a single task and are evaluated in new examples of that task. The aim is to train a gender identification model on Twitter users using only their text samples in Spanish. The difference with other related works consists in the evaluation of different preprocessing techniques so that the transfer learning-based fine-tuning is more efficient.

Aquilino Francisco Sotelo, Helena Gómez-Adorno, Oscar Esquivel-Flores, Gemma Bel-Enguix

Artificial Intelligence Techniques and Recognition

Frontmatter
Experimental Study on Transfer Learning in Denoising Autoencoders for Speech Enhancement

The quality of speech signals is affected by a combination of background noise, reverberation, and other distortions in real-life environments. The processing of such signals presents important challenges for tasks such as voice or speaker recognition. To enhance signals in such challenging conditions, several deep learning-based methods have been proposed. These new methods have proven to be effective in comparison to classical algorithms based on statistical analysis and signal processing. In particular, recurrent neural networks, especially those with long short-term memory (LSTM and BLSTM), have presented surprising results in tasks related to enhancing speech. One of the most challenging aspects of artificial neural networks is reducing the high computational cost of the training procedure. In this work, we present a comparative study on transfer learning to accelerate and improve traditional training based on random initialization of the internal weights of the networks. The results show the advantage of the proposal in terms of shorter training time and better results for the task of denoising speech signals at several signal-to-noise ratio levels of white noise.

Marvin Coto-Jiménez
A Preliminary Study on Score-Based Hyper-heuristics for Solving the Bin Packing Problem

The bin packing problem is a widespread combinatorial problem. It aims at packing a set of items by using as few bins as possible. Among the many available solving methods, approximate ones such as heuristics have become popular due to their reduced cost and generally acceptable solutions. A further step in this regard is given by hyper-heuristics, which the literature usually defines as “high-level heuristics to choose heuristics”. Hyper-heuristics choose one suitable heuristic from a set of available ones to solve a particular portion of an instance. As the search progresses, heuristics can be exchanged, adapting the solution process to the current problem state under exploration. In this work, we describe how to generate and use hyper-heuristics that keep a record, in the form of rules, of the scores achieved by individual heuristics on previously solved bin packing problem instances. Then, hyper-heuristics manage those scores to estimate the performance of such heuristics on unseen instances. In this way, the previous actions of the hyper-heuristics determine which heuristic to use on future unseen cases. The experiments conducted under different scenarios yield promising results, where some of the hyper-heuristics produced outperform isolated heuristics.

A. Silva-Gálvez, E. Lara-Cárdenas, I. Amaya, J. M. Cruz-Duarte, J. C. Ortiz-Bayliss
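
The score-based selection described in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two low-level heuristics (first fit and best fit), the single state feature (mean remaining item size relative to bin capacity), the hand-written `RULES` table, and the nearest-rule lookup.

```python
from typing import Callable, Dict, List, Tuple

Bins = List[List[int]]
Heuristic = Callable[[Bins, int, int], None]


def first_fit(bins: Bins, item: int, capacity: int) -> None:
    """Place the item in the first bin with enough free space."""
    for b in bins:
        if sum(b) + item <= capacity:
            b.append(item)
            return
    bins.append([item])


def best_fit(bins: Bins, item: int, capacity: int) -> None:
    """Place the item in the feasible bin that is already fullest."""
    feasible = [b for b in bins if sum(b) + item <= capacity]
    if feasible:
        max(feasible, key=sum).append(item)
    else:
        bins.append([item])


HEURISTICS: Dict[str, Heuristic] = {"first_fit": first_fit, "best_fit": best_fit}

# Hypothetical rules: (feature value, score of each heuristic near that value).
# In a real system these scores would come from previously solved instances.
RULES: List[Tuple[float, Dict[str, float]]] = [
    (0.2, {"first_fit": 0.9, "best_fit": 0.6}),
    (0.6, {"first_fit": 0.4, "best_fit": 0.8}),
]


def choose_heuristic(feature: float) -> str:
    """Pick the best-scoring heuristic of the rule closest to the feature."""
    _, scores = min(RULES, key=lambda rule: abs(rule[0] - feature))
    return max(scores, key=scores.get)


def solve(items: List[int], capacity: int) -> Bins:
    """Pack items one by one, re-selecting the heuristic at every step.

    Assumes every item is positive and no larger than the capacity.
    """
    bins: Bins = []
    for i, item in enumerate(items):
        remaining = items[i:]
        # Assumed state feature: mean remaining item size over capacity.
        feature = sum(remaining) / (len(remaining) * capacity)
        HEURISTICS[choose_heuristic(feature)](bins, item, capacity)
    return bins
```

Calling `solve([5, 5, 5, 5], 10)` re-evaluates the feature before each placement, so the heuristic in use can change as the remaining items change, which is the adaptive behavior the abstract attributes to hyper-heuristics.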
Learning Clasiffier Systems with Hebbian Learning for Autonomus Behaviors

One of the main characteristics of multi-agent systems is the ability to solve problems by achieving objectives. This is possible because of the learning mechanisms that are embedded in the systems, which range from neural networks to support vector machines. Agent-based systems stand out for their autonomy and adaptation to the dynamic conditions of the environment. This article presents the Hebbian theory, which is one of the learning methods from the neuroscience field. A particularity of the Hebbian theory from the computer science perspective is its primary mechanism of synaptic plasticity, in which the value of a synaptic connection increases if the neurons on both sides of the synapse are repeatedly activated simultaneously. This mechanism is integrated into Learning Classifier Systems (LCS) to validate its effectiveness in the solution task, and it can be used in multi-agent systems.

Marco Ramos, Vianney Muñoz-Jiménez, Félix F. Ramos
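
The synaptic plasticity mechanism mentioned in the abstract above corresponds to the classical Hebbian rule, which can be sketched as follows. This is only the textbook rule, a weight change of learning rate times presynaptic activity times postsynaptic activity; the function name, the learning rate, and the binary activations are illustrative assumptions, and the sketch does not show how the authors integrate the rule into Learning Classifier Systems.

```python
def hebbian_update(weight: float, pre: float, post: float,
                   learning_rate: float = 0.1) -> float:
    """Classical Hebbian rule: strengthen the connection only when the
    presynaptic and postsynaptic neurons are active at the same time."""
    return weight + learning_rate * pre * post


# Assumed toy activity trace: (presynaptic, postsynaptic) activations.
trace = [(1, 1), (1, 0), (0, 1), (1, 1)]

weight = 0.0
for pre, post in trace:
    weight = hebbian_update(weight, pre, post)
```

Only the first and last steps of the trace have both neurons active, so only those two steps increase the weight; steps where a single neuron fires leave it unchanged.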
Backmatter
Metadata
Title
Pattern Recognition
Editors
Karina Mariela Figueroa Mora
Juan Anzurez Marín
Jaime Cerda
Dr. Jesús Ariel Carrasco-Ochoa
José Francisco Martínez-Trinidad
José Arturo Olvera-López
Copyright Year
2020
Electronic ISBN
978-3-030-49076-8
Print ISBN
978-3-030-49075-1
DOI
https://doi.org/10.1007/978-3-030-49076-8
