2020 | Book

Cloud Computing, Big Data & Emerging Topics

8th Conference, JCC-BD&ET 2020, La Plata, Argentina, September 8-10, 2020, Proceedings

About this book

This book constitutes the revised selected papers of the 8th International Conference on Cloud Computing, Big Data & Emerging Topics, JCC-BD&ET 2020, held in La Plata, Argentina*, in September 2020.

The 11 full papers presented were carefully reviewed and selected from a total of 36 submissions. The papers are organized in topical sections on cloud computing and HPC; Big Data; and machine and deep learning.

*The conference was held virtually due to the COVID-19 pandemic.

Table of Contents

Frontmatter
Correction to: Cloud Computing, Big Data & Emerging Topics
Enzo Rucci, Marcelo Naiouf, Franco Chichizola, Laura De Giusti

Cloud, Edge and High-Performance Computing

Frontmatter
Cloud Robotics for Industry 4.0 - A Literature Review
Abstract
Robots have been used in industry for decades, long before the so-called Fourth Industrial Revolution. They have been incorporated into industrial processes in various ways, for example as mechanical arms and in assembly, welding, and painting processes, among others. Industrial robots are located in restricted-access sites, and their space is delimited by physical barriers and security measures. In recent years, Industry 4.0 has proposed the use of robots able to collaborate with people, known as collaborative robots or “cobots”. Cobots are characterized by cooperating with human work, sharing the same workspace, and being able to respond to simple human-machine interactions. In addition, given the benefits of applying cloud computing in Industry 4.0, research has been conducted on applying such technologies to robots. This approach is known as “cloud robotics” and appears as an emerging topic. The objective of this work is to carry out a systematic literature review of cloud robotics for Industry 4.0, in an attempt to present the state of the art in this field and identify opportunities for future research. From the analysis of the results, we observe an emerging interest in this area, and we identify the main technologies applied, research themes, and application areas, as well as a special interest in security and safety aspects.
Nancy Velásquez Villagrán, Patricia Pesado, Elsa Estevez
An Edge Focused Distributed Shared Memory
Abstract
Edge computing proposes access to largely unused computational resources without the added cost of the latency between the user and the Cloud. To take advantage of it we designed and implemented an abstraction layer compatible with standard JavaScript that builds a distributed shared memory on top of any existing web browser, like the ones present in smartphones or tablets, and a cloud server, enabling developers to use existing application code and enhance it by enabling collaboration between those devices. The synchronization mechanism supports mixed consistency, preferring eventual consistency but providing a stronger serializability when required, allowing the developers to tune it to their specific needs.
Matías Teragni, Ricardo Moran, Gonzalo Zabala
Towards a Malleable Tensorflow Implementation
Abstract
The TensorFlow framework was designed from its inception to provide multi-threading capabilities, extended with hardware accelerator support to leverage the potential of modern architectures. The amount of parallelism in current versions of the framework can be selected at multiple levels (intra- and inter-parallelism) on demand. However, this selection is fixed and cannot vary during the execution of training/inference sessions. This heavily restricts the flexibility and elasticity of the framework, especially in scenarios in which multiple TensorFlow instances co-exist in a parallel architecture. In this work, we propose the necessary modifications within TensorFlow to support dynamic selection of threads, in order to provide transparent malleability to the infrastructure. Experimental results show that this approach is effective in varying the degree of parallelism, and paves the way towards future co-scheduling techniques for multi-TensorFlow scenarios.
Leandro Ariel Libutti, Francisco D. Igual, Luis Piñuel, Laura De Giusti, Marcelo Naiouf
Viral Diseases Propagation Analysis in Short Time
Abstract
Studying infectious agents that are potentially harmful to a population, and trying to explain and predict how the disease evolves over time, is difficult because many factors interact. One solution is to analyse real systems by means of simulation models. Cellular automata have been used successfully in these cases, since they can recreate a virtual world that takes into account the main features of the problem and their correlations. We developed an efficient and portable cellular automata model on Graphics Processing Units to simulate the propagation of viral diseases. The achieved efficiency allows us to estimate in a short time the behaviour of a viral disease, whether or not it is known, as well as its associated uncertainty. Besides, it is suitable for testing the effects of different measures aimed at stopping the spread. We describe the solution and evaluate it for two viral diseases: Seasonal Influenza and COVID-19.
Maximiliano Lucero, Natalia Miranda, Fabiana Piccoli
Architectural Design Criteria for Evolvable Data-Intensive Machine Learning Platforms
Abstract
Recent advances in Artificial Intelligence (AI) have fostered a widespread adoption of Machine Learning (ML) capabilities within many products and services. However, most organizations are not well suited to fully exploit the strategic advantages of AI. Implementing ML solutions is still a complex endeavor due to the fast-paced evolution and the intrinsically exploratory nature of state-of-the-art ML techniques. In many respects, the evolution of data platforms through highly parallel or high-performance technologies has focused on the capacity to massively process the elements consumed by these ML models. This separate consideration renders reference architectures suited either for analytics consumption or for raw storage. There is no joint consideration of the complete cycle of data management, model development, and serving with feedback and human-in-the-loop requirements. This paper introduces design criteria conceived to help organizations architect and implement data platforms that effectively exploit their ML capabilities. The main objective of this work is to expedite the development of data platforms for ML by avoiding common implementation mistakes. The proposed guideline constitutes the methodical articulation of the empirical knowledge acquired over recent years of designing, developing, evolving and maintaining a broad spectrum of relevant industry-oriented Data and AI solutions. We have focused on evaluating our proposal by assessing the functionality and usability of the architectures and implementations originating from our design criteria.
Gonzalo Zarza, Juan José López Murphy

Big Data

Frontmatter
Harmonizing Big Data with a Knowledge Graph: OceanGraph KG Uses Case
Abstract
In this paper we introduce recent efforts carried out by the OceanGraph KG project to integrate semi-structured or unstructured content. We present some of the practical applications of OceanGraph through use cases, and finally summarize the lessons learned during the development process.
Marcos Zárate, Carlos Buckle, Renato Mazzanti, Mirtha Lewis, Pablo Fillottrani, Claudio Delrieux
Data Management Optimization in a Real-Time Big Data Analysis System for Intensive Care
Abstract
Vital signs monitors in intensive and intermediate care units generate large amounts of data, most of which are neither recorded nor taken advantage of. We propose a computer system that allows the automatic and early detection of the deterioration of critical patients, through the real-time processing and analysis of digital health data, including physiological waveform data generated by the medical monitors. Our system tries to emulate the behavior of an expert intensivist physician, giving recommendations for clinical decision making to reduce the uncertainty on diagnosis, treatment options and prognosis. In our previous works, we presented a real-time Big Data infrastructure built using free software technologies. In this paper we improve its data management. We present and evaluate three different data representation models in Apache Kafka. One of these models outperforms the other two in storage space use and delivery time of both real-time and historical data. Our results show that Kafka can be used for historical data storage. This in turn allows us to eliminate the NoSQL database of our previous system. Unlike other works, ours attempts to reduce the number of components to lower system overhead and administration complexity.
Rodrigo Cañibano, Claudia Rozas, Cristina Orlandi, Javier Balladini

Machine and Deep Learning

Frontmatter
Reddening-Free Q Indices to Identify Be Star Candidates
Abstract
Astronomical databases currently provide high-volume spectroscopic and photometric data. While spectroscopic data is better suited to the analysis of many astronomical objects, photometric data is relatively easier to obtain due to shorter telescope usage time. Therefore, there is a growing need to use photometric information to automatically identify objects for further detailed studies, especially H\(\alpha\) emission line stars such as Be stars. Photometric color-color diagrams (CCDs) are commonly used to identify this kind of object. However, their identification in CCDs is further complicated by the reddening effect caused by both the circumstellar and interstellar gas. This effect prevents the generalization of candidate identification systems. Therefore, in this work we evaluate the use of neural networks to identify Be star candidates from a set of OB-type stars. The networks are trained using a labeled subset of the VPHAS+ and 2MASS databases, with filters \(u\), \(g\), \(r\), H\(\alpha\), \(i\), \(J\), \(H\), and \(K\). In order to avoid the reddening effect, we propose and evaluate the use of reddening-free Q indices to enhance the generalization of the model to other databases and objects. To test the validity of the approach, we manually labeled a subset of the database and used it to evaluate candidate identification models. We also labeled an independent dataset for cross-dataset evaluation. We evaluate the recall of the models at a 99% precision level on both test sets. Our results show that the proposed features provide a significant improvement over the original filter magnitudes.
Yael Aidelman, Carlos Escudero, Franco Ronchetti, Facundo Quiroga, Laura Lanzarini
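For readers unfamiliar with the term in the abstract above, a reddening-free Q index is conventionally built from three magnitudes \(m_1\), \(m_2\), \(m_3\) and the ratio of the corresponding color excesses \(E(\cdot)\). The expression below is the general textbook form, given here as an assumption; the abstract does not state which filter combinations or excess ratios the authors use.

```latex
\[
Q = (m_1 - m_2) - \frac{E(m_1 - m_2)}{E(m_2 - m_3)}\,(m_2 - m_3)
\]
```

Because reddening shifts both colors along a fixed direction whose slope is the excess ratio, the shifts cancel in \(Q\), which is what makes such an index attractive as an input feature.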
A Web System Based on Spotify for the Automatic Generation of Affective Playlists
Abstract
Online music streaming providers offer powerful personalization tools for recommending songs to their registered users. These tools are usually based on users’ listening histories and tastes, but ignore other contextual variables that affect users while listening to music, for example, the user’s mood. In this paper, a Web-based system for generating affective playlists that regulate the user’s mood is presented. The system has been implemented by integrating resources and data offered by Spotify through its service platform, and the generated playlists are published directly in the user’s Spotify account. Internally, emotions play a relevant role in the processes of cataloguing songs and making personalized music recommendations. Novel affective computing solutions are combined with traditional information retrieval and artificial intelligence techniques in order to solve these complex engineering problems. Besides, these solutions consider users’ collaboration as a first-class element in an attempt to improve affective recommendations.
Pedro Álvarez, Jorge García de Quirós, Sandra Baldassarri
Classification of Summer Crops Using Active Learning Techniques on Landsat Images in the Northwest of the Province of Buenos Aires
Abstract
The present work aims to obtain a classifier for summer crops in the northwest of Buenos Aires province from Landsat satellite images. Active Learning (AL) was used as the classification technique since it obtains satisfactory results using a small set of labeled samples to train the algorithm. The training set is built iteratively by means of a heuristic that selects the unlabeled samples to be classified by an expert. The following heuristics were used for comparison: Breaking Ties, Multiclass Level Uncertainty, Margin Sampling, and Random Sampling. The algorithm was also compared with the supervised technique Support Vector Machine (SVM). The experiments were run on three Landsat 8 images from different dates using 6 bands per image and various vegetation indices. The results obtained using AL in combination with the different heuristics do not differ substantially from SVM.
Lucas Benjamin Cicerchia, María José Abasolo, Claudia Cecilia Russo
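As an illustration of one of the selection heuristics named in the abstract above, the following minimal sketch shows Breaking Ties in its usual formulation: the samples sent to the expert are those whose two highest class probabilities are closest. The function name, the batch size, and the probability values are hypothetical; the abstract does not describe the authors' implementation.

```python
def breaking_ties(probabilities, batch_size):
    """Return the indices of the batch_size samples whose two highest
    class probabilities are closest (the most ambiguous samples)."""
    gaps = []
    for index, probs in enumerate(probabilities):
        top_two = sorted(probs, reverse=True)[:2]
        gaps.append((top_two[0] - top_two[1], index))
    gaps.sort()  # smallest gap first
    return [index for _, index in gaps[:batch_size]]


# Hypothetical class probabilities for four unlabeled pixels and three crops.
unlabeled_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.49, 0.01],
    [0.60, 0.30, 0.10],
]
to_label = breaking_ties(unlabeled_probs, 2)
```

In an active learning loop, the selected samples would be labeled by the expert, added to the training set, and the classifier retrained before the next selection round.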
Trainable Windowing Coefficients in DNN for Raw Audio Classification
Abstract
An artificial neural network for audio classification is proposed. This includes the windowing operation of raw audio and the calculation of the power spectrogram. A windowing layer is initialized with a Hann window and its weights are adapted during training. The non-trainable weights of the spectrogram calculation are initialized with the discrete Fourier transform coefficients. The tests are performed on the Speech Commands dataset. Results show that adapting the windowing coefficients produces a moderate accuracy improvement. It is concluded that the gradient of the error function can be propagated through the neural calculation of the power spectrum. It is also concluded that the training of the windowing layer improves the model’s ability to generalize.
Mario Alejandro García, Eduardo Atilio Destéfanis, Ana Lorena Rosset
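The forward computation described in the abstract above (a window applied to a raw audio frame, followed by a power spectrum built from fixed DFT coefficients) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses the symmetric form of the Hann window, the frame length and test signal are hypothetical, and no training is shown, so the window stays at its initial value.

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(frame, window):
    """Power spectrum |DFT(window * frame)|^2 for bins 0..N/2."""
    n = len(frame)
    tapered = [x * w for x, w in zip(frame, window)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Fixed DFT coefficients: these play the role of the
        # non-trainable weights mentioned in the abstract.
        coeff = sum(
            tapered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


# Hypothetical frame: a cosine lying exactly on DFT bin 3 of a 16-sample frame.
N = 16
frame = [math.cos(2 * math.pi * 3 * t / N) for t in range(N)]
window = hann_window(N)  # initial value; a trainable layer would update it
spec = power_spectrum(frame, window)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
```

Since every step is a multiplication, a sum, or a squared magnitude, the whole chain is differentiable with respect to the window values, which is consistent with the abstract's conclusion that gradients can be propagated through the power spectrum calculation.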
Backmatter
Metadata
Title
Cloud Computing, Big Data & Emerging Topics
Editors
Dr. Enzo Rucci
Marcelo Naiouf
Franco Chichizola
Laura De Giusti
Copyright Year
2020
Electronic ISBN
978-3-030-61218-4
Print ISBN
978-3-030-61217-7
DOI
https://doi.org/10.1007/978-3-030-61218-4
