2021 | Book

Applied Informatics

Fourth International Conference, ICAI 2021, Buenos Aires, Argentina, October 28–30, 2021, Proceedings

About this book

This book constitutes the thoroughly refereed proceedings of the 4th International Conference on Applied Informatics, ICAI 2021, held in Buenos Aires, Argentina, in October 2021. The 35 full papers were carefully reviewed and selected from 89 submissions. The papers are organized in topical sections on artificial intelligence; data analysis; decision systems; health care information systems; image processing; security services; simulation and emulation; smart cities; software and systems modeling; software design engineering.

Table of Contents

Frontmatter

Artificial Intelligence

Frontmatter
A Chatterbot Based on Genetic Algorithm: Preliminary Results

Chatterbots are programs that simulate an intelligent conversation with people. They are commonly used in customer service, product suggestions, e-commerce, travel and vacations, queries, and complaints. Although some works have presented valuable studies using several technologies, including evolutionary computing, artificial intelligence, machine learning, and natural language processing, creating chatterbots with a low rate of grammatical errors and good user satisfaction is still a challenging task. Therefore, this work introduces a preliminary study for the development of a GA-based chatterbot that generates intelligent dialogues with a low rate of grammatical errors and a strong sense of responsiveness, thereby boosting the satisfaction of the individuals who interact with it. Preliminary results show that the proposed GA-based chatterbot yields "Good" responses in 69% of typical conversations regarding orders and receipts in a cafeteria.
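
The abstract does not detail the genetic operators, so the following is only a minimal sketch, under stated assumptions, of the kind of evolutionary loop such a chatterbot could use: candidate responses encoded as genes over phrase slots, truncation selection, one-point crossover, and mutation, with a rating function standing in for user feedback. All names and the toy rating heuristic are illustrative, not the authors' implementation.

```python
import random

# Hypothetical response templates: each gene picks one phrase per slot.
SLOTS = [
    ["Hello!", "Hi there!", "Good day!"],
    ["Your order", "The receipt", "Your coffee"],
    ["is ready.", "will be ready soon.", "has been registered."],
]

def random_individual():
    return [random.randrange(len(options)) for options in SLOTS]

def render(ind):
    return " ".join(SLOTS[i][g] for i, g in enumerate(ind))

def crossover(a, b):
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(ind, rate=0.2):
    return [random.randrange(len(SLOTS[i])) if random.random() < rate else g
            for i, g in enumerate(ind)]

def evolve(rate_response, pop_size=20, generations=30):
    """rate_response: callable scoring a rendered reply; in an interactive GA
    this score would come from user feedback."""
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=lambda ind: rate_response(render(ind)),
                        reverse=True)
        parents = scored[: pop_size // 2]          # truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return render(max(population, key=lambda ind: rate_response(render(ind))))

# Toy rating heuristic: favour short replies that mention readiness.
print(evolve(lambda text: -len(text) + 10 * text.count("ready")))
```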

Cristian Orellana, Martín Tobar, Jeremy Yazán, D. Peluffo-Ordóñez, Lorena Guachi-Guachi
A Supervised Approach to Credit Card Fraud Detection Using an Artificial Neural Network

The wide acceptability and usage of credit card-based transactions can be attributed to improved technological availability and increased demand due to ease of use. As a result of the increased adoption levels, this domain has become profitable and one of the most popular targets for fraudsters, who use it to conduct regular exploitations or assaults. Merchants and financial processing providers that issue credit cards suffer substantial financial damage as a result of credit card theft. Because of the possibility of large losses, it is one of the most serious risks to these organizations and individuals. Credit card fraud detection can be viewed as a binary classification task in which a supervised machine learning technique is used to analyze and classify a credit card transaction dataset into genuine or fraudulent cases. Therefore, this study explored the use of an Artificial Neural Network (ANN) for credit card fraud detection. The ULB Machine Learning Group dataset, which contains 284,315 legitimate and 492 fraudulent transactions, was used to validate the proposed model. Performance evaluation results revealed that the model achieved 100% and 99.95% classification accuracy during training and testing, respectively. This affirms that an ANN model can be efficiently used to predict credit card fraudulent transactions.
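
As a rough illustration of the described workflow (not the authors' exact network), the sketch below assumes the public ULB credit-card CSV with a binary `Class` column and uses scikit-learn's `MLPClassifier` as a generic feed-forward ANN:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report

# Assumed file layout: the public ULB credit-card dataset with a binary 'Class' label.
df = pd.read_csv("creditcard.csv")
X, y = df.drop(columns=["Class"]), df["Class"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

scaler = StandardScaler().fit(X_train)
ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=200, random_state=42)
ann.fit(scaler.transform(X_train), y_train)

# Accuracy alone is misleading with 492 frauds vs. 284,315 legitimate rows,
# so per-class precision/recall are reported as well.
print(classification_report(y_test, ann.predict(scaler.transform(X_test)), digits=4))
```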

Oluwatobi Noah Akande, Sanjay Misra, Hakeem Babalola Akande, Jonathan Oluranti, Robertas Damasevicius
Machine Learning Classification Based Techniques for Fraud Discovery in Credit Card Datasets

The frequency of credit card-based online payment fraud has increased rapidly in recent years, forcing banks and e-commerce companies to create automated fraud detection systems that mine massive transaction logs. Machine learning appears to be one of the most promising techniques for detecting illegal transactions, since it uses supervised binary classification algorithms, appropriately trained on pre-screened sample datasets, to differentiate between fraudulent and non-fraudulent cases. This study concentrates on machine learning (ML) methods and proposes a credit card fraud discovery scheme to detect fraud. The ML techniques employed are the Decision Tree (DT) and K-Nearest Neighbor (KNN) classification techniques. The performance outcomes of the two ML classification techniques are evaluated in terms of accuracy, precision, specificity, recall, F1-score, and false-positive rate (FPR). The receiver operating characteristic (ROC) curve and its area under the curve (AUC) were likewise derived from the confusion matrix for both classifiers. The two classification techniques were evaluated and compared using the performance metrics mentioned earlier, and the KNN technique outperformed the DT, with an AUC of 91% for KNN versus 86% for DT. It was concluded that KNN is the better ML classification technique of the two and can be employed to discover credit card fraudulent activities.
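
A minimal sketch of the comparison the abstract describes, using a synthetic imbalanced dataset as a stand-in for the credit-card data and reporting AUC and confusion matrices for KNN and DT; the hyperparameters are assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix

# Synthetic imbalanced stand-in for a credit-card transaction dataset.
X, y = make_classification(n_samples=20000, n_features=20, weights=[0.99, 0.01],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("DT", DecisionTreeClassifier(max_depth=8, random_state=0))]:
    clf.fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]       # estimated fraud probability
    print(name, "AUC =", round(roc_auc_score(y_te, scores), 3))
    print(confusion_matrix(y_te, clf.predict(X_te)))
```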

Roseline Oluwaseun Ogundokun, Sanjay Misra, Opeyemi Eyitayo Ogundokun, Jonathan Oluranti, Rytis Maskeliunas
Object Detection Based Software System for Automatic Evaluation of Cursogramas Images

The aim of this work is to describe the tasks performed to develop a software system capable of detecting and recognizing the symbols of Cursogramas in images by using a Deep Learning model trained from scratch. In this way, we seek to assist teachers of an undergraduate subject in automatically evaluating diagrams made as part of their students' practical exercises. For this purpose, in addition to carrying out a process of understanding the problem and identifying the available data, tasks of technology selection and construction of each of the components that are part of the system were also carried out. Therefore, although the problem domain belongs to the field of university education, this work is more related to the engineering and technological aspects of applying Artificial Intelligence to solve complex problems.

Pablo Pytel, Matías Almad, Rocío Leguizamón, Cinthia Vegega, Ma Florencia Pollo-Cattaneo
Sign Language Recognition Using Leap Motion Based on Time-Frequency Characterization and Conventional Machine Learning Techniques

Sign language is the form of communication between the deaf and hearing populations, which uses the gesture-spatial configuration of the hands as a communication channel with the social environment. This work proposes the development of a gesture recognition method associated with sign language, based on the processing of time series of the spatial positions of hand reference points provided by a Leap Motion optical sensor. The methodology, applied to a validated American Sign Language (ASL) dataset, involves the following stages: (i) preprocessing to filter null frames, (ii) segmentation of relevant information, and (iii) time-frequency characterization using the Discrete Wavelet Transform (DWT). Subsequently, (iv) classification is carried out with Machine Learning algorithms. The proposed methodology achieves a 97.96% classification performance with the Fast Tree algorithm.
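
The sketch below illustrates stage (iii), DWT-based time-frequency characterization, followed by a conventional classifier. FastTree is an ML.NET learner, so scikit-learn's gradient boosting is used here as a comparable stand-in; the simulated signals merely stand in for the Leap Motion coordinate series:

```python
import numpy as np
import pywt  # PyWavelets
from sklearn.ensemble import GradientBoostingClassifier

def dwt_features(series, wavelet="db4", level=3):
    """Time-frequency features from one coordinate time series of a hand landmark."""
    coeffs = pywt.wavedec(series, wavelet, level=level)
    feats = []
    for c in coeffs:  # one approximation plus `level` detail sub-bands
        feats += [c.mean(), c.std(), np.sum(c ** 2)]  # location, spread, energy
    return np.array(feats)

# Toy example: two simulated "gesture" classes (real input would be the ASL dataset).
rng = np.random.default_rng(0)
freqs = rng.choice([8, 20], size=200)
X = np.array([dwt_features(np.sin(np.linspace(0, f, 128)) + 0.1 * rng.standard_normal(128))
              for f in freqs])
y = (freqs == 20).astype(int)

# Stand-in for the Fast Tree boosted-tree learner.
clf = GradientBoostingClassifier().fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```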

D. López-Albán, A. López-Barrera, D. Mayorca-Torres, D. Peluffo-Ordóñez
Solar Radiation Prediction Using Machine Learning Techniques

A solar radiation estimation model using Machine Learning is proposed, processing meteorological data measured by satellite together with data measured on the ground. The model comprises two solutions, an artificial neural network and robust linear regression; the climatic variables used as input to the model are solar radiation, temperature, and clearness index, all obtained from satellite data. The main aim of this work is to propose a model that uses the satellite data to obtain an estimate of the behavior of the solar resource on the ground, reducing the error between the satellite data and the data measured on the ground. The results obtained by training an artificial neural network with hidden layers are presented, including the normal distributions of the data reported by the satellite and of the data produced by the proposed model. In addition, the daily averages obtained by the model are compared with the daily average values measured on the ground. The work concludes by proposing a second estimation model using robust linear regression; the proposed model satisfies the assumptions made during the regression process and yields results comparable to those obtained from the satellite and reported in other works.
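
A minimal sketch of the two proposed solutions, a small neural network and robust linear regression, trained on synthetic satellite-style inputs; the variable names, units, and synthetic relation are assumptions, not the paper's data:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import HuberRegressor
from sklearn.metrics import mean_squared_error

# Synthetic stand-in: satellite radiation, temperature and clearness index as inputs,
# ground-measured radiation as the target (the paper uses real station data).
rng = np.random.default_rng(1)
n = 1000
sat_rad = rng.uniform(100, 900, n)        # W/m^2
temp = rng.uniform(10, 35, n)             # deg C
clearness = rng.uniform(0.2, 0.8, n)      # dimensionless
ground_rad = 0.85 * sat_rad + 500 * clearness + rng.normal(0, 30, n)

X = np.column_stack([sat_rad, temp, clearness])
train, test = slice(0, 800), slice(800, None)

ann = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=1)
robust = HuberRegressor()                 # robust linear regression
for name, model in [("ANN", ann), ("Huber", robust)]:
    model.fit(X[train], ground_rad[train])
    rmse = mean_squared_error(ground_rad[test], model.predict(X[test])) ** 0.5
    print(f"{name}: RMSE = {rmse:.1f} W/m^2")
```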

Luis Alejandro Caycedo Villalobos, Richard Alexander Cortázar Forero, Pedro Miguel Cano Perdomo, José John Fredy González Veloza
Towards More Robust GNN Training with Graph Normalization for GraphSAINT

The field of Graph Neural Networks (GNNs) is developing dramatically owing to the strong ability of GNNs to represent data in non-Euclidean spaces, such as graphs. However, with larger graph datasets and the trend toward more complex algorithms, stability problems appear during model training. For example, the GraphSAINT algorithm fails to converge during training with a probability ranging from 0.1 to 0.4. In order to solve this problem, this paper proposes an improved GraphSAINT method. Firstly, a proper graph normalization strategy is introduced into the model as a neural network layer. Secondly, the structure of the model is modified based on the normalization strategy to normalize the original input data and the input data of the middle layers. Thirdly, the training process and the inference process of the model are adjusted to fit this normalization strategy. The improved GraphSAINT method successfully eliminates the instability and improves robustness during training. Besides, it accelerates the convergence of the training procedure of the GraphSAINT algorithm and reduces the training time by about a quarter. Furthermore, it also achieves an improvement in prediction accuracy. The effectiveness of the improved method is verified using the citation dataset of the Open Graph Benchmark (OGB).
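
The abstract does not specify the exact normalization layer, so the following PyTorch sketch shows one plausible per-subgraph feature normalization (in the spirit of GraphNorm) that could be inserted before the input and hidden layers of a GraphSAINT model; it is an assumption, not the authors' implementation:

```python
import torch
import torch.nn as nn

class SimpleGraphNorm(nn.Module):
    """Normalize node features over all nodes of a sampled subgraph, then rescale.

    Simplified illustrative layer; the paper's exact strategy for GraphSAINT
    mini-batches may differ.
    """
    def __init__(self, num_features: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, num_features] for one sampled subgraph
        mean = x.mean(dim=0, keepdim=True)
        var = x.var(dim=0, unbiased=False, keepdim=True)
        return self.gamma * (x - mean) / torch.sqrt(var + self.eps) + self.beta

# Usage inside a GNN block: normalize the raw inputs and each hidden layer's inputs.
h = torch.randn(1024, 64)                 # node embeddings of a sampled subgraph
h = SimpleGraphNorm(64)(h)
```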

Yuying Wang, Qinfen Hao

Data Analysis

Frontmatter
A Complementary Approach in the Analysis of the Human Gut Microbiome Applying Self-organizing Maps and Random Forest

The human gastrointestinal tract is colonized by millions of microorganisms that make up the so-called gut microbiota, which plays a vital role in well-being and health maintenance as well as in the appearance of several diseases in the human host. A data mining analysis approach was applied to a set of gut microbiota data from healthy individuals. We used two machine learning methods to identify biomedically relevant relationships between demographic and biomedical variables of the subjects and patterns of bacterial abundance. The study focused on the two most abundant human gut microbiota groups, Bacteroidetes and Firmicutes. Both subsets of bacterial abundances, together with the metadata variables, were subjected to an exploratory analysis using self-organizing maps that integrate multivariate information through different component planes. Finally, to evaluate the relevance of the variables for the biological diversity of the microbial communities, an ensemble-based method, random forest, was used. Results showed that age and body mass index were among the most important features for explaining bacterial diversity. Interestingly, several bacterial species known to be associated with diet and obesity were identified as relevant features as well. In the topological analysis of the self-organizing maps, we identified certain groups of nodes with similarities in subject metadata and gut bacteria. We conclude that our results represent a preliminary approach that could be considered, in future studies, as a potential complement in health reports, helping health professionals personalize patient treatment or support decision making.

Valeria Burgos, Tamara Piñero, María Laura Fernández, Marcelo Risk
An Evaluation of Intrinsic Mode Function Characteristic of Non-Gaussian Autorregresive Processes

Empirical mode decomposition (EMD) is a suitable transformation for analysing non-linear time series. This work presents an empirical study of the intrinsic mode functions (IMFs) provided by the empirical mode decomposition. We simulate several non-Gaussian autoregressive processes to characterize this decomposition. Firstly, we study the probability density distribution, the Fourier spectrum, and the cumulative relative energy of each IMF as part of the study of the empirical mode decomposition. Then, we analyze the capacity of EMD to characterize both the autocorrelation dynamics and the marginal distribution of each simulated stochastic process. Results show that EMD seems to discriminate not only the autocorrelation but also the marginal distribution of the simulated processes. Results also show that entropy-based EMD is a promising estimator, as it is capable of distinguishing between correlation and probability distribution. However, the EMD entropy does not reach its maximum value in stochastic processes with a uniform probability distribution.
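
For intuition, a minimal simulation of a non-Gaussian autoregressive process of the kind studied here, together with its sample autocorrelation, is sketched below (numpy only); the IMFs themselves would then be obtained with an EMD implementation such as the PyEMD package:

```python
import numpy as np

def ar1_non_gaussian(n, phi=0.8, dist="exponential", seed=0):
    """Simulate an AR(1) process x_t = phi * x_{t-1} + e_t with non-Gaussian innovations."""
    rng = np.random.default_rng(seed)
    draw = {"exponential": lambda: rng.exponential(1.0) - 1.0,   # centred exponential
            "uniform": lambda: rng.uniform(-1.0, 1.0)}[dist]
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + draw()
    return x

x = ar1_non_gaussian(5000, phi=0.8, dist="exponential")
# Sample autocorrelation at a few lags, to verify the imposed correlation structure.
acf = [np.corrcoef(x[:-k], x[k:])[0, 1] for k in (1, 2, 5)]
print("ACF at lags 1, 2, 5:", np.round(acf, 3))
```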

Fernando Pose, Javier Zelechower, Marcelo Risk, Francisco Redelico
Analysing Documents About Colombian Indigenous Peoples Through Text Mining

The indigenous peoples of Colombia have a considerable social, political and cultural wealth. However, issues such as the decades-long armed conflict and drug trafficking have posed a significant threat to their survival. In this work, publicly available documents on the Internet with information about two indigenous communities, the Awá and Inga people from the Cauca region in southern Colombia, are analysed using text analytics approaches. A corpus is constructed comprising general characterisation documents, media articles and rulings from the Constitutional Court. Topic analysis is carried out to identify the relevant themes in the corpus to characterise each community. Sentiment analysis carried out on the media articles indicates that the articles about the Inga tend to be more positive and objective than those about the Awá. This may be attributed to the significant impact that the armed conflict has had on the Awá in recent years, and to the productive projects of the Inga. Furthermore, an approach for summarising long, complex documents by means of timelines is proposed, and illustrated using a ruling issued by the Constitutional Court. It is concluded that such an approach has significant potential to facilitate the understanding of documents of this nature.
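
A minimal sketch of the topic-analysis step using scikit-learn's LDA on a toy corpus; the documents, number of topics, and vectorizer settings are illustrative assumptions (the sentiment step could be added with any off-the-shelf sentiment library):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus standing in for the characterisation documents, media articles
# and Constitutional Court rulings about the Awá and Inga communities.
docs = [
    "armed conflict displacement territory protection awa community",
    "inga productive projects medicinal plants cultural heritage",
    "constitutional court ruling territorial rights indigenous peoples",
    "drug trafficking threat leaders awa territory",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"topic {k}: {', '.join(top)}")
```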

Jonathan Stivel Piñeros-Enciso, Ixent Galpin
Application of a Continuous Improvement Method for Graphic User Interfaces Through Intelligent Systems for a Language Services Company

The Graphical User Interface represents the most common mechanism in human-computer interaction. Thus, its correct development is a critically important task, to the point that many software projects fail due to low user acceptance. The techniques, methodologies, tools, and principles to improve Graphical User Interfaces have been the object of interest of researchers and organizations over time. Due to the low success rate, a recent trend consists in experimenting with a greater number of interface alternatives, lowering their development costs through automation and Artificial Intelligence. Therefore, this paper presents a method that, by means of Interactive Genetic Algorithms, establishes a semi-automatic continuous improvement process for landing pages. The proposed method is applied in a case study consisting of a landing page that belongs to a translation services company. The application of the method provides positive results, enabling the landing page to reach the goals for which it was designed.

Osvaldo Germán Fernandez, Pablo Pytel, Ma Florencia Pollo-Cattaneo
Descriptive Analysis Model to Improve the Pension Contributions Collection Process for Colpensiones

The objective of this work is to show, using Big Data techniques, the results of a descriptive analysis of the collection of pension contributions within the COLPENSIONES income management process; the sustainability of the average premium regime must be guaranteed by employer collections in order for the pension system to be maintained. Payments are made through the Integrated Contribution Settlement Form (PILA, from the Spanish Planilla Integrada de Liquidación de Aportes), after registering with an information operator, which can be completed on the information operators' website or through one of their advisors. The information operator is the company in charge of facilitating the creation, modification, validation, correction and sending of the PILA. Additionally, it directs information, records and payments to the different social security and parafiscal administrators in a safe and timely manner. This article therefore seeks to make known the importance of this process within the company, based on the information found in the contributors' payment tables for the years 2013 to 2017, provided by COLPENSIONES. A descriptive analysis methodology is applied to the current situation of the collection process, together with the construction of a model based on the different variables that can influence collection management, also validating the existence of methods for predicting collection or non-payment within the company. The CRISP-DM methodology is followed, applying each of its stages within the development of the process.

William M. Bustos C., Cesar O. Diaz
User Response-Based Fake News Detection on Social Media

Social media has been a major information sharing and communication platform for individuals and organizations on a mass scale. Its ability to engage users to react to information posted on this media in the form of likes, shares, and comments has made it a preferred information sharing platform for many. But the contents posted on social media are not filtered, fact-checked or judged by an editorial body as on a traditional news platform. Therefore, individuals, institutions and communities who consume news from social media are vulnerable to misinformation by malicious authors. In this work, we propose an approach that detects fake news by investigating the reaction of users to a post composed by malicious authors. Using features extracted with a bag-of-words model and TF-IDF from text-based replies (comments), and visual emotion responses in the form of categorical data, we built models that predict news as fake or real. We designed and conducted a series of experiments to evaluate the performance of our approach. The results show the proposed approach outperforms the baseline in all six models. In particular, our models from the random forest, logistic regression, and XGBoost algorithms produce a precision of 0.97, a recall of 0.99 and an F1 of 0.98.
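
A minimal sketch of the reply-based classification pipeline (TF-IDF features plus one of the reported classifiers); the toy replies and labels are placeholders for the collected user responses:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support

# Toy replies (comments) to posts; real input would be the user responses
# collected per news item, optionally combined with emoji-reaction counts.
replies = ["this is clearly fake, the photo is edited",
           "source confirmed by several outlets",
           "total hoax, do not share",
           "official statement linked in the article"]
labels = [1, 0, 1, 0]                       # 1 = fake, 0 = real

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(replies, labels)

pred = model.predict(replies)
print(precision_recall_fscore_support(labels, pred, average="binary"))
```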

Hailay Kidu, Haile Misgna, Tong Li, Zhen Yang

Decision Systems

Frontmatter
A Multi-objective Mathematical Optimization Model and a Mobile Application for Finding Best Pedestrian Routes Considering SARS-CoV-2 Contagions

Given the spread and pandemic generated by the SARS-CoV-2 coronavirus and the requirements of the Universidad de Los Andes in terms of guaranteeing the health and safety of students considering possible contagions on the university campus, we propose and design a mobile application to obtain the best route for a student between two places on the campus. The path obtained by our proposal reduces the distance traveled by the student as well as the possibility of coronavirus contagion during the journey through the university campus. In this sense, two types of costs were modeled: one cost represents the distance cost for a student to travel across the campus, and the second one represents the contagion susceptibility of a student passing through the campus. In summary, a solution algorithm that minimizes these two types of costs was developed and validated. The results of our algorithm are compared against a multi-objective mathematical optimization solution, and interesting findings were obtained. Finally, a mobile application was created and designed in order to obtain optimal routes to travel between two points on the university campus.
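
A simple way to reproduce the bi-objective idea is to scalarize the two edge costs and run a standard shortest-path algorithm; the sketch below (networkx, toy campus graph, assumed weights) is an illustration, not the paper's multi-objective model or its solution algorithm:

```python
import networkx as nx

# Tiny campus graph: each edge carries a walking distance (m) and a contagion
# susceptibility score (illustrative values).
G = nx.Graph()
edges = [("gate", "library", 120, 0.8), ("gate", "lab", 200, 0.2),
         ("library", "cafeteria", 80, 0.9), ("lab", "cafeteria", 90, 0.3)]
for u, v, dist, risk in edges:
    G.add_edge(u, v, dist=dist, risk=risk)

def best_route(src, dst, w_dist=0.5, w_risk=0.5):
    # Weighted-sum scalarization of the two objectives; sweeping the weights
    # traces an approximation of the Pareto front a multi-objective model returns.
    def cost(u, v, data):
        return w_dist * data["dist"] / 200 + w_risk * data["risk"]
    return nx.shortest_path(G, src, dst, weight=cost)

print(best_route("gate", "cafeteria", w_dist=0.9, w_risk=0.1))  # shortest
print(best_route("gate", "cafeteria", w_dist=0.1, w_risk=0.9))  # safest
```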

Juan E. Cantor, Carlos Lozano-Garzón, Germán A. Montoya
An Improved Course Recommendation System Based on Historical Grade Data Using Logistic Regression

Elective course selection is very important to undergraduate students, as the right courses can provide a boost to a student's Cumulative Grade Point Average (CGPA) while the wrong courses can cause a drop in CGPA. As a result, institutions of higher learning usually have paid advisers and counsellors to guide students in their choice of courses, but this method is limited due to factors such as a high number of students and insufficient time on the part of advisers/counsellors. Another limiting factor is that, no matter how hard we try, there are patterns in data that are simply impossible to detect by human knowledge alone. While many different methods have been used in an attempt to solve the problem of elective course recommendation, these methods generally ignore student performance in previous courses when recommending courses. Therefore, this paper proposes an effective course recommendation system for undergraduate students, implemented in the Python programming language, to solve this problem based on grade data from past students. A logistic regression model alongside a wide-and-deep recommender was used to classify students based on whether a particular course would be good for them or not and to recommend possible electives to them. The data used for this study were obtained solely from records of the Department of Computer Science, University of Ilorin, and the courses to be predicted were electives in the department. These models proved to be effective, with accuracy scores of 0.84 and 0.76 and a mean squared error of 0.48.

Idowu Dauda Oladipo, Joseph Bamidele Awotunde, Muyideen AbdulRaheem, Oluwasegun Osemudiame Ige, Ghaniyyat Bolanle Balogun, Adekola Rasheed Tomori, Fatimoh Abidemi Taofeek-Ibrahim
Predictive Academic Performance Model to Support, Prevent and Decrease the University Dropout Rate

One of the biggest problems in higher education is student dropout. Prior to the pandemic, one of the biggest problems for university institutions was the dropout of many of their students. Today, the situation has become even more critical, as the pandemic has forced many people to drop out of school for a variety of reasons, whether financial or personal. Investigating the causes of dropout, with appropriate means to reduce it, contributes to decision making within academic management. The objective of this work is to develop a machine learning model that generates early warnings about course loss, based on historical student data. The model is based on historical data from an undergraduate program that includes student grades at various points in time, the percentage of course loss in previous semesters, the percentage of student loss in previous semesters, the subjects passed at the time of evaluating the data, and the student and course averages. This would facilitate the identification and internal management of alarms for the early detection of potential dropouts, as well as efficiently display the results found with the execution of these models.

Diego Bustamante, Olmer Garcia-Bedoya
Proposed Algorithm to Digitize the Root-Cause Analysis and 8Ds of the Automotive Industry

As we adapt to our new post-pandemic reality, many activities are being and have been replaced or modified. Companies need to adapt to digitalization as a new way of interacting with staff and customers. The challenge concerns the management of information, since it must be aligned with the trends of Engineering 4.0 to solve everyday problems. This article analyzes the quality tool QRCI (Quick Response Continuous Improvement), a solution-based method used by the automotive company Faurecia for solving production problems related to the 5M (Machine, Raw Material, Labor, Environment and Method). The investigation describes the relationship between QRCI and PPAP (Production Part Approval Process), a standard with technical information to ensure product quality. During our investigation we identified a communication disruption between QRCI and PPAP, and we propose to include both tools in the same platform and to replace the troubleshooting in QRCI done by humans with artificial intelligence analysis, in order to reduce misperception, improve the analysis time and communicate the results to PPAP in real time, reducing operating costs and increasing productivity and quality, since the improvement is continuous. Therefore, the fulfilment of the certifications is ensured through an integrated prevention model. A robust system inspires confidence in suppliers and customers based on the effectiveness and efficiency of the production processes.

Brenda-Alejandra Mendez-Torres, Patricia Cano-Olivos

Health Care Information Systems

Frontmatter
Blood Pressure Morphology as a Fingerprint of Cardiovascular Health: A Machine Learning Based Approach

Daily exercise and a healthful diet allow the cardiovascular (CV) system to age more slowly. However, unhealthy life habits and diseases impact this condition negatively. In consequence, an individual's "chronological age" (CHA) may differ from the "arterial age" (AA). There is proven evidence of correlation between arterial stiffness (AS), highly associated with AA, and the central blood pressure (BP) waveform, which, combined with other CV properties, defines a group of characteristic parameters to assess CV Health (CVH). Such assessment, obtained by means of noninvasive and simple arterial measurements, is extremely useful for the prevention of CV diseases. Machine Learning (ML) techniques allow the development of predictive models, but usually a large amount of data is required, often unavailable and/or not fully complete. One-dimensional (1D) CV models become an interesting alternative to overcome this issue, since CV system simulations can be performed. OBJECTIVE: A ML model was trained/validated with data from a healthy and heterogeneous population (n = 4374), using parameters derived from arterial pressure waveforms (APW) in order to assess CVH. METHODS: Sixteen features were extracted from carotid and radial APW. These were used for the prediction of cardiac output (CO), systemic vascular resistance (SVR), systolic BP (SBPc), diastolic BP (DBPc), carotid-femoral pulse wave velocity (PWVcf), aortic augmentation index (AIa) and CHA as a surrogate of AA. RESULTS: The normalized RMSEs for CO, SVR, SBPc, DBPc, PWVcf, AIa and CHA using the best-performing model were 5.4%, 2.2%, 0.1%, 0.06%, 0.6%, 1.2% and 5.5%, respectively. CONCLUSION: The results showed that the ML models obtained from measurements at the assessed locations provide acceptable performance in estimating CVH parameters. Further in-vivo studies are required to validate the application of the outcome.

Eugenia Ipar, Nicolás A. Aguirre, Leandro J. Cymberknop, Ricardo L. Armentano
Characterizing Musculoskeletal Disorders. A Case Study Involving Kindergarten Employees

This article presents the results of applying the Nordic Kuorinka questionnaire in a group of educators to infer musculoskeletal healthiness. The Kuorinka instrument queries workers about postural pains and muscular afflictions that might have appeared due to systematic effort in the workplace. The sample, n = 42, included administrative workers, teachers, general services members, and kitchen staff at a kindergarten. These people execute different activities with diverse workloads. The results show that individuals with general discomfort in the last 12 months pointed at the neck as the source of the problem. In 19% of the workers, the discomfort lasted around 30 days, but the employees decided not to report the health-related event to the human resources office. 25% of the participants indicated that each episode of pain or discomfort lasted from one to four weeks. Other results indicate that among the workers incapacitated for 1 to 7 days, only 17% received medical treatment. According to Liberty Mutual, the largest workers' compensation insurance provider in the United States, musculoskeletal diseases cost employers 13.4 billion every year in the United States. Nevertheless, and perhaps more disturbing, workers do not use the health systems after work-related discomforts. Also, corrective actions are few or non-existent in Latin American countries, perpetuating work-related diseases and increasing the burden of lost health and productivity.

Gonzalo Yepes Calderon, Julio Perea Sandoval, Julietha Oviedo-Correa, Fredy Linero-Moreno, Fernando Yepes-Calderon
Comparison of Heart Rate Variability Analysis with Empirical Mode Decomposition and Fourier Transform

Heart rate variability (HRV) analysis allows the study of the regulation mechanisms of the cardiovascular system in both normal and pathological conditions, and power spectral density analysis of short-term HRV has been adopted as a tool for the evaluation of autonomic function. The Ensemble Empirical Mode Decomposition (EEMD) is an adaptive method generally used to analyze non-stationary signals from non-linear systems. In this work, the performance of the EEMD in decomposing the HRV signal into its main spectral components is studied, first on a synthesized series to calibrate the method and achieve confidence, and then on a real HRV database. In conclusion, the results of this work propose the EEMD as a useful method for analyzing HRV data. The ability to decompose the main spectral bands and the capability to deal with non-linear and non-stationary behaviors make the EEMD a powerful method for tracking frequency changes and amplitude modulations in HRV signals generated by autonomic regulation.
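
For reference, the Fourier-side analysis against which EEMD band energies are compared can be sketched with a Welch periodogram of an evenly resampled RR series; the synthetic series and band limits below are illustrative:

```python
import numpy as np
from scipy.signal import welch

# Synthetic evenly resampled RR series (seconds) with LF- and HF-like oscillations;
# a real analysis would use the RR series extracted from the ECG.
fs = 4.0                                    # resampling frequency (Hz)
t = np.arange(0, 300, 1 / fs)
rr = 0.8 + 0.02 * np.sin(2 * np.pi * 0.1 * t) + 0.03 * np.sin(2 * np.pi * 0.25 * t)

f, pxx = welch(rr - rr.mean(), fs=fs, nperseg=256)
lf = pxx[(f >= 0.04) & (f < 0.15)].sum()    # proportional to LF band power
hf = pxx[(f >= 0.15) & (f < 0.40)].sum()    # proportional to HF band power
print(f"LF/HF ratio: {lf / hf:.2f}")
```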

Javier Zelechower, Fernando Pose, Francisco Redelico, Marcelo Risk
Evalu@ + Sports. Creatine Phosphokinase and Urea in High-Performance Athletes During Competition. a Framework for Predicting Injuries Caused by Fatigue

Elite athletes follow a strict regime of physical training that forces muscle deterioration-reconstruction cycles and specific energy generation patterns. Metabolic functions can be monitored in blood for further training planning and optimization. Creatine phosphokinase (CPK) and urea appear in the blood serum with higher average values in elite athletes in comparison with sedentary subjects. In this manuscript, CPK and urea recorded in professional soccer players are studied along a full season to create a framework in which training sessions can be customized. Preliminary results set the foundation for building a platform capable of anticipating fatigue-induced injuries and providing a detailed recovery follow-up of lesions.

Juan Fernando Yepes Zuluaga, Alvin David Gregory Tatis, Daniel Santiago Forero Arévalo, Fernando Yepes-Calderon
Heart Rate Variability: Validity of Autonomic Balance Indicators in Ultra-Short Recordings

This work aims to find the minimum recording times of ultra-short heart rate variability (HRV) that enable the analysis of autonomic activity indexes. Samples covering 5 min are employed to extract SS and S/PS from the Poincaré diagram for a group of 23 subjects. The RR series, extracted from the electrocardiogram signal, were recorded for 300 s at rest (used as the gold standard) and at intervals of 60, 90, 120, 180 and 240 s to perform the concordance analysis with the gold-standard derived indexes. We used different concordance techniques: Spearman correlation, Bland-Altman analysis, and Cliff's Delta. The SS values within records of 120 s were equivalent to those of short-term HRV, as were the S/PS values within records of 90 s. Also, ultra-short HRV indexes were similar to those obtained for the short-term HRV analysis. Such a reduction in measurement times will allow expanding the use of HRV to monitor the state of health and well-being and help physical trainers achieve better performance in the registration and processing of the information obtained. The results motivate new studies to analyze the behavior of these indicators in different populations, using different pre-processing methods for the RR series.
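
A sketch of how the Poincaré-plot indexes can be computed from an RR series is shown below; SD1/SD2 follow the standard definitions, while the SS and S/PS formulas are assumptions based on common autonomic-balance usage and may differ from the authors' exact definitions:

```python
import numpy as np

def poincare_indices(rr_ms):
    """SD1/SD2 of the Poincaré plot of an RR series (milliseconds), plus
    assumed autonomic-balance indexes (SS = 1000/SD2, S/PS = SS/SD1)."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)
    sd1 = np.sqrt(0.5) * diff.std(ddof=1)          # short-term variability
    sd2 = np.sqrt(2 * rr.std(ddof=1) ** 2 - 0.5 * diff.std(ddof=1) ** 2)
    ss = 1000.0 / sd2                              # assumed definition
    s_ps = ss / sd1                                # assumed definition
    return sd1, sd2, ss, s_ps

rr = 800 + 40 * np.random.default_rng(0).standard_normal(300)  # toy 300-beat series
print([round(v, 2) for v in poincare_indices(rr)])
```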

Jose Gallardo, Giannina Bellone, Marcelo Risk

Image Processing

Frontmatter
An Improved Machine Learnings Diagnosis Technique for COVID-19 Pandemic Using Chest X-ray Images

The pandemic produced by the coronavirus SARS-CoV-2 (COVID-19) has confined the world, and avoiding close human contact is still suggested to combat the outbreak despite the vaccination campaigns. It is expected that emerging technologies have prominent roles to play during this pandemic, and the use of Artificial Intelligence (AI) has proved useful in this direction. The use of AI by researchers in developing novel models for the diagnosis, classification, and prediction of COVID-19 has helped reduce the spread of the outbreak. Therefore, this paper proposes a machine learning diagnostic system to combat the spread of COVID-19. Machine learning algorithms including Random Forest (RF), XGBoost, and the Light Gradient Boosting Machine (LGBM) were used for quick and better identification of potential COVID-19 cases. The dataset used contains COVID-19 symptoms, from which the relevant symptoms for the diagnosis of a suspected individual are selected. The experiments showed the LGBM leading with an accuracy of 0.97, recall of 0.96, precision of 0.97, F1-score of 0.96, and ROC value of 0.97. As revealed by the results, real-time data capture would allow COVID-19 patients to be diagnosed and monitored effectively.
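
A minimal sketch of training the LGBM classifier on a symptom-style binary feature matrix, using the lightgbm scikit-learn API; the synthetic data and hyperparameters are placeholders, not the paper's dataset:

```python
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

# Toy binary symptom matrix (fever, cough, fatigue, anosmia, ...) standing in
# for the symptom dataset described in the paper.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2000, 8))
y = ((X[:, 0] & X[:, 3]) | (X.sum(axis=1) > 5)).astype(int)   # synthetic label rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = LGBMClassifier(n_estimators=200).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred), "F1:", f1_score(y_te, pred))
```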

Joseph Bamidele Awotunde, Sunday Adeola Ajagbe, Matthew A. Oladipupo, Jimmisayo A. Awokola, Olakunle S. Afolabi, Timothy O. Mathew, Yetunde J. Oguns
Evaluation of Local Thresholding Algorithms for Segmentation of White Matter Hyperintensities in Magnetic Resonance Images of the Brain

White matter hyperintensities are distinguished in magnetic resonance images as areas of abnormal signal intensity. In clinical research, determining the region and position of these hyperintensities in brain MRIs is critical; it is believed this will find applications in clinical practice and will support the diagnosis, prognosis, and therapy monitoring of neurodegenerative diseases. The properties of hyperintensities vary greatly, thus segmenting them is a challenging task. A substantial amount of time and effort has gone into developing satisfactory automatic segmentation systems. In this work, a wide range of local thresholding algorithms has been evaluated for the segmentation of white matter hyperintensities. Nine local thresholding approaches implemented in the ImageJ software are considered: Bernsen, Contrast, Mean, Median, MidGrey, Niblack, Otsu, Phansalkar, and Sauvola. Additionally, the use of other local algorithms (Local Normalization and the Statistical Dominance Algorithm) with global thresholding was evaluated. The segmentation accuracy results for all algorithms, and the parameter spaces of the best algorithms, are presented.
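
Two of the evaluated local methods are also available in scikit-image; the sketch below applies them to a sample image as a stand-in for a FLAIR slice (the paper itself uses the ImageJ implementations and explores the window-size and k parameter space):

```python
from skimage import data, filters

# Example image as a stand-in for a FLAIR slice containing hyperintensities.
image = data.camera().astype(float)

# Niblack and Sauvola local thresholds (assumed window size and k).
t_niblack = filters.threshold_niblack(image, window_size=25, k=0.2)
t_sauvola = filters.threshold_sauvola(image, window_size=25)

mask_niblack = image > t_niblack
mask_sauvola = image > t_sauvola
print("Niblack foreground fraction:", round(float(mask_niblack.mean()), 3))
print("Sauvola foreground fraction:", round(float(mask_sauvola.mean()), 3))
```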

Adam Piórkowski, Julia Lasek
Super-Resolution Algorithm Applied in the Zoning of Aerial Images

Nowadays, multiple applications based on images and unmanned aerial vehicles (UAVs), such as autonomous flying, precision agriculture, and zoning for territorial planning, are possible thanks to the growing development of machine learning and the evolution of convolutional and adversarial networks. Nevertheless, this type of application implies a significant challenge because, even though the images taken by a high-end drone are very accurate, it is not enough, since the level of detail required for most precision agriculture and zoning applications is very high. So, it is necessary to further improve the images by implementing different techniques to recognize small details. Hence, an alternative to follow is the super-resolution method, which allows constructing an image with the information from multiple images. An efficient tool can be obtained by combining drones' advantages with different image processing techniques. This article proposes a method to improve the quality of images taken on board a drone by increasing the information obtained from multiple images that present noise, vibration-induced displacements, and illumination changes. These higher-resolution images, called super-resolution images, allow supervised training processes to perform different zoning methods better. In this study, GAN-type networks show the best results in automatically recognizing visually differentiated zones in an aerial image. The quality of the super-resolution image obtained by different methods was measured using sharpness and entropy metrics, and a semantic confusion matrix measures the accuracy of the subsequent semantic segmentation network. Finally, the results show that the implementation of the super-resolution algorithm and the automatic segmentation provide an acceptable accuracy according to the defined metrics.
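
The quality measures mentioned (sharpness and entropy) can be sketched as the variance of the Laplacian and the Shannon entropy of the image; the exact formulations used in the paper may differ:

```python
from scipy.ndimage import laplace
from skimage import data
from skimage.measure import shannon_entropy

def sharpness(img):
    """Variance of the Laplacian: a common no-reference sharpness/focus measure."""
    return laplace(img.astype(float)).var()

low_res = data.camera()[::2, ::2]          # crude stand-in for an original frame
high_res = data.camera()                   # stand-in for the super-resolved output

for name, img in [("low-res", low_res), ("super-res", high_res)]:
    print(f"{name}: sharpness={sharpness(img):.1f}, entropy={shannon_entropy(img):.2f}")
```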

J. A. Baldion, E. Cascavita, C. H. Rodriguez-Garavito

Security Services

Frontmatter
An Enhanced Lightweight Speck System for Cloud-Based Smart Healthcare

In the realm of information and communication sciences, the Internet of Things (IoT) is a new technology that brings sensors into the healthcare sector. Sensors are critical IoT devices that receive and send crucial bodily characteristics like blood pressure, temperature, heart rate, and breathing rate to and from cloud repositories for healthcare specialists. As a result of technical advancements, the usage of these devices, referred to as smart sensors, is becoming acceptable in smart healthcare for illness diagnosis and treatment. The data generated by these devices is huge and intrinsically tied to every sphere of daily life, including the healthcare domain. This information must be safeguarded and processed in a safe location. The term "cloud computing" refers to the type of innovation that is employed to safeguard such a tremendous volume of information. As a result, it has become critical to protect healthcare data from hackers in order to maintain its protection, privacy, confidentiality, integrity, and processing mode. This research proposes a new lightweight Speck cryptographic algorithm to improve high-performance computing security for healthcare data. In contrast to the cryptographic methods commonly employed in cloud computing, the experimental results of the proposed methodology showed a high level of security and an evident improvement in terms of the time it takes to encrypt data and the security obtainable.
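
For context, a simplified sketch of the Speck round function that the cipher family is built on is shown below (Speck64 word size, rotation constants 8 and 3); the key schedule is omitted and the paper's lightweight enhancements are not reproduced:

```python
MASK = 2 ** 32 - 1          # Speck64: two 32-bit words per block

def ror(x, r): return ((x >> r) | (x << (32 - r))) & MASK
def rol(x, r): return ((x << r) | (x >> (32 - r))) & MASK

def speck_round(x, y, k):
    """One Speck64 round: add-rotate-xor with rotation constants 8 and 3."""
    x = (ror(x, 8) + y) & MASK
    x ^= k
    y = rol(y, 3) ^ x
    return x, y

def encrypt_block(x, y, round_keys):
    # The key schedule (deriving round_keys from the master key) is omitted here
    # for brevity; see the Speck specification.
    for k in round_keys:
        x, y = speck_round(x, y, k)
    return x, y

# Demo with arbitrary round keys (not an official test vector).
print([hex(v) for v in encrypt_block(0x3b726574, 0x7475432d, list(range(27)))])
```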

Muyideen AbdulRaheem, Ghaniyyat Bolanle Balogun, Moses Kazeem Abiodun, Fatimoh Abidemi Taofeek-Ibrahim, Adekola Rasheed Tomori, Idowu Dauda Oladipo, Joseph Bamidele Awotunde
Hybrid Algorithm for Symmetric Based Fully Homomorphic Encryption

Fully Homomorphic Encryption (FHE) supports realistic computations on encrypted data and hence is widely proposed for use in cloud computing to protect the integrity and privacy of data stored in the cloud. The existing symmetric-based FHE schemes suffer from insecurity against known plaintext/ciphertext attacks and generate a large ciphertext size that requires a large bandwidth to transfer the ciphertext over the network. The aim of this paper is to ameliorate these weaknesses. A hybrid algorithm for symmetric-based Fully Homomorphic Encryption is proposed that combines the N-prime model and Matrix Operation for Randomization and Encryption with a Secret Information Moduli Set (MORESIMS). The results show that the proposed hybrid symmetric-based FHE framework satisfies the homomorphism properties with robust inbuilt security that resists known-plaintext, ciphertext and statistical attacks. The ciphertext size produced by the proposed hybrid framework is less than 4 times its equivalent plaintext size, with a considerable encryption execution time and a fast decryption time. Thus, it is guaranteed to provide optimum performance and reliable solutions for securing the integrity and privacy of users' data in the cloud.

Kamaldeen Jimoh Muhammed, Rafiu Mope Isiaka, Ayisat Wuraola Asaju-Gbolagade, Kayode Sakariyah Adewole, Kazeem Alagbe Gbolagade
Information Encryption and Decryption Analysis, Vulnerabilities and Reliability Implementing the RSA Algorithm in Python

The processing and transmission of information have become more effective in recent decades. Mathematical models guarantee the security and integrity of the data. In spite of that, signal interception, attacks and information theft can happen during the transmission process. This paper presents an analysis of the RSA algorithm, using 4-, 8- and 10-bit prime numbers with short messages. The encryption and decryption processes, implemented in Python, allowed the use of computational resources to be assessed. Processing time and data security are evaluated on a typical computational infrastructure required for its operation, in order to identify vulnerabilities and the reliability level of the algorithm when ideal conditions are available to perform cryptanalysis.
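
A toy version of the analysed setting, small-prime RSA key generation with encryption and decryption of a short message, can be written in a few lines of Python (educational only, and requires Python 3.8+ for the modular inverse):

```python
from math import gcd

def make_keys(p, q, e=17):
    """Toy RSA key generation with small primes (for analysis only, not security)."""
    n, phi = p * q, (p - 1) * (q - 1)
    assert gcd(e, phi) == 1
    d = pow(e, -1, phi)                      # modular inverse (Python 3.8+)
    return (e, n), (d, n)

def encrypt(m, pub):
    e, n = pub
    return pow(m, e, n)

def decrypt(c, priv):
    d, n = priv
    return pow(c, d, n)

# 8-bit primes, similar to one of the paper's scenarios.
public, private = make_keys(251, 241)
cipher = [encrypt(ord(ch), public) for ch in "HI"]   # encrypt a short message
print(cipher, "".join(chr(decrypt(c, private)) for c in cipher))
```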

Rocío Rodriguez G., Gerardo Castang M., Carlos A. Vanegas

Simulation and Emulation

Frontmatter
Optimization of Multi-level HEED Protocol in Wireless Sensor Networks

Increasing the lifetime of a wireless sensor network after deployment is a challenge, since one-hop routing protocols associated with clustering techniques do not scale to large wireless sensor networks. In this paper, we try to remedy these problems by proposing a new approach based on a hierarchical (multi-hop) routing technique; this approach presents a solution to maximize the lifetime of large wireless sensor networks. Our approach consists of the design of a new ML-HEED (Multi-level Hybrid, Energy-Efficient, Distributed) protocol. ML-HEED is a multi-hop routing protocol based on the HEED protocol, which organizes the wireless sensor network into clusters. A cluster is formed by a set of sensor nodes and an elected sensor node. Multi-hop communication takes place between the elected nodes and the base station. The simulation results show the effectiveness of this protocol in maximizing the lifetime of wireless sensor networks. In this approach, we have noticed that the further away the base station is, the more energy the elected nodes consume to route their data to it. Thus, in this paper we propose an optimization of the Multi-Level HEED protocol in terms of energy consumption. This optimization method takes into account two criteria for the selection of the higher-level elected nodes: the distance between the elected nodes and the base station, and the residual energy of each higher-level elected node. This method becomes effective when the elected nodes are far from the base station. The approach has been evaluated by simulation according to existing metrics and newly proposed metrics, and it has proven its performance.
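
The abstract names the two selection criteria but not the exact rule, so the sketch below shows one plausible weighted scoring of residual energy and distance to the base station for choosing the next-hop elected node; the weights and data layout are assumptions:

```python
import math

def select_next_hop(candidates, base_station, w_energy=0.6, w_dist=0.4):
    """Pick the higher-level elected node that balances residual energy against
    distance to the base station (the two criteria of the optimized ML-HEED).

    candidates: list of dicts {'id', 'pos': (x, y), 'energy': joules}
    """
    def score(node):
        d = math.dist(node["pos"], base_station)
        return w_energy * node["energy"] - w_dist * d   # illustrative weighting
    return max(candidates, key=score)

cluster_heads = [{"id": 1, "pos": (10, 40), "energy": 1.8},
                 {"id": 2, "pos": (60, 80), "energy": 2.4},
                 {"id": 3, "pos": (90, 20), "energy": 0.9}]
print(select_next_hop(cluster_heads, base_station=(100, 100))["id"])
```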

Walid Boudhiafi, Tahar Ezzedine

Smart Cities

Frontmatter
LiDAR and Camera Data for Smart Urban Traffic Monitoring: Challenges of Automated Data Capturing and Synchronization

The availability and range of sensors and other types of smart hardware are growing. Implementation of such solutions is becoming a crucial part of development strategies for towns and cities. Usually, hardware and software are proposed by various vendors, and proper implementation raises various tasks. One of these tasks is to establish communication between sensors and to synchronize, for example, data capturing. The aim of this article is to analyze available solutions for data capturing and synchronization and to propose solutions for real-world applications. Available solutions are analyzed and a general process model is proposed involving synchronization between cameras and LiDAR sensors. The proposal is installed and tested to analyze traffic flow on a 4-lane street in Latvia and to capture vehicles that are not allowed to use a particular road, such as heavy machinery and trucks. The synchronization proposal adopts the timestamps of the capturing devices as the most feasible solution for real-world applications.
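
A minimal sketch of the timestamp-based pairing such a proposal relies on: each camera frame is matched to the nearest LiDAR sweep within a tolerance, assuming both devices share a common clock. The frame rates and tolerance below are illustrative:

```python
import bisect

def match_frames(camera_ts, lidar_ts, tolerance=0.05):
    """Pair each camera frame with the nearest LiDAR sweep by timestamp (seconds).

    Returns (camera_time, lidar_time) pairs within `tolerance`; unmatched frames
    are dropped. Timestamps are assumed to share a common clock (e.g. NTP/PTP).
    """
    lidar_ts = sorted(lidar_ts)
    pairs = []
    for t in camera_ts:
        i = bisect.bisect_left(lidar_ts, t)
        best = min(lidar_ts[max(i - 1, 0):i + 1], key=lambda lt: abs(lt - t))
        if abs(best - t) <= tolerance:
            pairs.append((t, best))
    return pairs

cam = [0.00, 0.04, 0.08, 0.12]          # 25 fps camera
lid = [0.00, 0.10, 0.20]                # 10 Hz LiDAR
print(match_frames(cam, lid))
```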

Gatis Vitols, Nikolajs Bumanis, Irina Arhipova, Inga Meirane
Mobility in Smart Cities: Spatiality of the Travel Time Indicator According to Uses and Modes of Transportation

The objective of this document is to present an analysis of the level of vehicular congestion occurring in the sectors surrounding the downtown neighborhood, specifically on Alfonso López Avenue, in order to analyze and propose the best solution, thus improving the mobility of the road network and minimizing congestion in the present and the future. Speeds were measured for the different modes of transportation (private vehicles, public taxis and buses, and the license-plate method) at peak and off-peak hours, in the morning and in the afternoon, for 15 continuous days, through cycles that incorporate series of fluid, congested, or combined traffic, taking into account the relationships between flow, velocity, density, interval, and spacing. Finally, the results were spatialized to expand the cross-analysis of variables. This analysis seeks to contribute to the discussion about the impact of the different modes of transportation on traffic congestion and about how to diagnose and propose solutions to improve mobility.

Carlos Alberto Diaz Riveros, Karen Astrid Beltran Rodriguez, Cesar O. Diaz, Alejandra Juliette Baena Vasquez
Preliminary Studies of the Security of the Cyber-Physical Smart Grids

The electric power network is a cyber-physical system (CPS) that is important for modern society. Smart Grids (SG) can improve the electric power system's profitability and reliability by integrating renewable energy and advanced communication technologies. The communication network that connects numerous generators, devices, and controllers distributed remotely plays a vital role in controlling electrical grids, and current trends favor the implementation of IoT devices. However, it is vulnerable to cyber-attacks. This paper presents our current studies of potential malware that can attack these power systems and investigates future strategies to improve security and risk management. As a first step, the paper describes a Delphi study with experts from top security companies, carried out to understand the trends and malware beyond an extensive literature survey. The paper also studies the propagation of the selected malware using System Dynamics. Finally, conclusions are provided, allowing future work to be accomplished using multiple resolution modeling (MRM), i.e., a form of distributed simulation.
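
The propagation study is described only at a high level; a minimal stock-and-flow sketch in the System Dynamics spirit (an SIR-style model with assumed parameters) looks like this:

```python
import numpy as np

def simulate_sir(n_devices=10000, beta=0.3, gamma=0.1, days=60, infected0=10):
    """Stock-and-flow (System Dynamics style) SIR model of malware spreading
    through grid-connected IoT devices; all parameters are illustrative only."""
    s, i, r = n_devices - infected0, infected0, 0
    history = []
    for _ in range(days):
        new_infections = beta * s * i / n_devices    # contact-driven inflow
        new_recoveries = gamma * i                   # patching / cleanup outflow
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return np.array(history)

peak_day = simulate_sir()[:, 1].argmax()
print("peak number of infected devices reached on day", int(peak_day))
```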

Luis Rabelo, Andrés Ballestas, Bibi Ibrahim, Javier Valdez

Software and Systems Modeling

Frontmatter
Enterprise Modeling: A Multi-perspective Tool-Supported Approach

Enterprises are inherently complex systems, which involve a multitude of components working dynamically and in coordination to obtain a desired end goal. The intricacy of an enterprise and how its different components can be modeled is the research area of Enterprise Modeling (EM). When holistically modeling an enterprise, it is necessary to use more than one modeling language, including general and domain-specific modeling languages (DSML), to boost the comprehension of different enterprise areas. However, the use of multiple languages can lead to a lack of detail and understanding of the relationships between enterprise components that are modeled in different domains. Analyzing the relationships between various enterprise domains can generate valuable insight into understanding the uncertainty, complexity, and operations of an enterprise. In this paper, we present an approach to enable EM from multiple perspectives, allowing enterprises to use different languages to describe various domains. This approach supports the creation of relationships and dependencies between domains, to analyze desired enterprise perspectives from different depths and viewpoints. Considering the requirements for multi-perspective modeling, we developed a workbench that allows the composition and merging of different EM languages and creates ad-hoc EM tools based on the needs of the enterprise.

Paola Lara Machado, Mario Sánchez, Jorge Villalobos

Software Design Engineering

Frontmatter
Architectural Approach for Google Services Integration

Every day, new software applications appear and, commonly, these applications need to implement an authorization system. Usually, some applications are focused on specific communities such as universities. In this case, managing the credentials of every application can be tedious for the members of the community. In addition, applications have different hardware requirements for the proper performance of some features, which might be very expensive in some cases. Fortunately, big companies such as Google and Microsoft offer suites to some organizations like universities; thus, universities can leverage these suites to improve their business processes. In this paper, we present an approach for building applications using a microservices architecture and taking advantage of Google Workspace.
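
One building block of such an architecture is letting each microservice validate Google-issued ID tokens instead of managing its own credentials; a hedged sketch using the google-auth library is shown below, where the client ID and Workspace domain are placeholders, not values from the paper:

```python
# Requires the google-auth package. CLIENT_ID and the university domain are
# placeholders for values issued in the Google Cloud console.
from google.oauth2 import id_token
from google.auth.transport import requests as google_requests

CLIENT_ID = "your-app.apps.googleusercontent.com"     # hypothetical
UNIVERSITY_DOMAIN = "university.edu"                  # hypothetical Workspace domain

def authenticate(token: str) -> dict:
    """Validate a Google ID token sent by the front end and check that the
    account belongs to the institution's Google Workspace domain."""
    info = id_token.verify_oauth2_token(token, google_requests.Request(), CLIENT_ID)
    if info.get("hd") != UNIVERSITY_DOMAIN:
        raise PermissionError("Account is not part of the university workspace")
    return {"user_id": info["sub"], "email": info["email"], "name": info.get("name")}
```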

Hamilton Ricaurte, Hector Florez
Proposal to Improve Software Testing in Small and Medium Enterprises

Faced with the massive demand for software products, their quality has been increasingly questioned and, consequently, so has the quality of the processes by which they are produced. Different institutions have developed models and standards for the continuous improvement of these development processes, mainly focused on the testing stage. Even amid this proliferation of improvement models, none are applicable to software-developing SMEs in our region, companies that have many opportunities but face challenges such as a lack of resources, skills and experience in their pursuit of creating quality software and surviving in the market. This article presents a management proposal that allows the testing process to be improved in order to obtain higher-quality software products in SMEs of the region, based on a set of methods that have proved successful for software testing in large companies.

Melisa Argüello, Carlos Antonio Casanova Pietroboni, Karina Elizabeth Cedaro
Backmatter
Metadata
Title
Applied Informatics
Editors
Hector Florez
Ma Florencia Pollo-Cattaneo
Copyright Year
2021
Electronic ISBN
978-3-030-89654-6
Print ISBN
978-3-030-89653-9
DOI
https://doi.org/10.1007/978-3-030-89654-6
