2019 | Book

Supercomputing

9th International Conference, ISUM 2018, Mérida, Mexico, March 5–9, 2018, Revised Selected Papers


About this book

This book constitutes the refereed proceedings of the 9th International Conference on Supercomputing, ISUM 2018, held in Mérida, Mexico, in March 2018.

The 19 revised full papers presented were carefully reviewed and selected from 64 submissions. The papers are organized in topical sections on scheduling, architecture, and programming; parallel computing; applications and HPC.

Table of Contents

Frontmatter

Scheduling, Architecture, and Programming

Frontmatter
A Scheduling Algorithm for a Platform in Real Time
Abstract
We propose a scheduling algorithm, designed and implemented to assign tasks with missed deadlines among the nodes of a mobile distributed environment, taking delay quality into account so that the data of a mobile device can be transferred and located within the network. The method is intended to provide a real-time scheduler that yields good results without loss of information. We also propose a mechanism to construct and maintain such a scheduler from scratch.
M. Larios-Gómez, J. Migliolo Carrera, M. Anzures-García, A. Aldama-Díaz, G. Trinidad-García
Automatic Code Generator for a Customized High Performance Microprocessor Simulator
Abstract
This paper presents software that generates the code of a microprocessor simulator based on features defined by the user. The software receives a microprocessor architecture description that includes the number of cores, the operations to be executed in the ALU, cache memory details, and the number of registers, among other parameters. After configuration, the software generates Java code implementing the described microprocessor simulator, and it can produce more than forty different simulators depending on the chosen configuration. Each simulator follows a standard four-stage pipeline: fetch, decode, execute, and store. The code generator has been used as a learning tool in an undergraduate course, with interesting effects on the students' learning process. Preliminary results show that students understand better how a microprocessor works and feel ready to propose new microprocessor architectures.
Alfredo Cristóbal-Salas, Juan D. Santiago-Domínguez, Bardo Santiago-Vicente, Marco Antonio Ramírez-Salinas, Luis Alfonso Villa-Vargas, Neiel Israel Layva-Santes, Cesar Alejandro Hernández-Calderón, Carlos Rojas-Morales
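As a rough, hypothetical illustration of the configuration-driven approach the abstract describes, the following Python sketch turns a small architecture description into the skeleton of a generated simulator class. All names, fields, and the template are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a configuration-driven simulator code generator.
# The architecture description and generated skeleton are illustrative only.

ARCH = {
    "name": "DemoCPU",
    "cores": 2,
    "registers": 16,
    "alu_ops": ["ADD", "SUB", "AND", "OR"],
    "cache_kb": 32,
}

TEMPLATE = """\
// Auto-generated simulator skeleton for {name}
public class {name}Simulator {{
    static final int CORES = {cores};
    static final int REGISTERS = {registers};
    static final int CACHE_KB = {cache_kb};

    // Standard four-stage pipeline: fetch, decode, execute, store.
    void step() {{ fetch(); decode(); execute(); store(); }}

    void fetch() {{}}
    void decode() {{}}   // supported ALU ops: {alu_ops}
    void execute() {{}}
    void store() {{}}
}}
"""

def generate(arch):
    """Render a Java source skeleton from the architecture description."""
    return TEMPLATE.format(**{**arch, "alu_ops": ", ".join(arch["alu_ops"])})

if __name__ == "__main__":
    print(generate(ARCH))
```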
Use of Containers for High-Performance Computing
Abstract
Over the past decade, virtual machines emerged to solve many infrastructure problems and to make practical use of computing resources. The limitation of this type of technology is resource overhead: each virtual machine carries a complete copy of an operating system plus the different libraries needed to run an application. Container technology reduces this load by eliminating the hypervisor and the virtual machine; each application runs with only the most elementary parts of a server plus a shared instance of the host operating system. Container technology is already an essential part of the IT industry, as it is a simpler and more efficient way to virtualize microservices, with support for workflow creation in development and operations (DevOps). Unlike virtual machines, this solution generates much less overhead between the host kernel and the application, improving performance. In high-performance computing (HPC) there is a willingness to adopt this solution for scientific computing purposes. The most important and standard technology in the industry is Docker; however, adopting this standard for the requirements of scientific computing in an HPC environment is neither trivial nor direct. In the present study, we review research works focused on the use of containers for HPC, with the objective of familiarizing HPC users and system administrators with this technology and showing how scientific research projects can benefit from it in terms of compute mobility and reproducibility of workflows.
F. Medrano-Jaimes, J. E. Lozano-Rizk, S. Castañeda-Avila, R. Rivera-Rodriguez

Parallel Computing

Frontmatter
Generic Methodology for the Design of Parallel Algorithms Based on Pattern Languages
Abstract
A parallel system for solving complex computational problems involves multiple simultaneous instruction flows, communication structures, synchronisation and competition conditions between processes, as well as mapping and balancing of the workload on each processing unit. The algorithm design and the capabilities of the processing units affect the cost-performance ratio of any algorithm. We propose a generic methodology that captures the main characteristics of parallel algorithm design methodologies and adds the experience of expert programmers through pattern languages. A robust design that considers the relations between architectures and programs is crucial to implementing high-quality parallel algorithms. We aim for a methodology that exploits algorithmic concurrency and establishes optimal process allocation onto processing units while exploring the lowest implementation details. Some basic examples, such as the k-means algorithm, are described to illustrate the methodology and show its effectiveness. Our proposal identifies essential design patterns to find models of data mining algorithms with strong self-adaptive mechanisms for homogeneous and heterogeneous parallel architectures.
A. Alejandra Serrano-Rubio, Amilcar Meneses-Viveros, Guillermo B. Morales-Luna, Mireya Paredes-López
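The methodology is described at the level of design patterns; as a minimal sketch of the kind of algorithmic concurrency it targets, the following Python fragment parallelizes the assignment step of k-means with a process pool. The chunking scheme and data layout are assumptions, not the paper's pattern language.

```python
# Minimal sketch: parallel k-means assignment step using a process pool.
# The chunking and data layout are illustrative assumptions.
import numpy as np
from multiprocessing import Pool

def assign_chunk(args):
    points, centroids = args
    # For each point, index of the nearest centroid (Euclidean distance).
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def parallel_assign(points, centroids, workers=4):
    chunks = np.array_split(points, workers)
    with Pool(workers) as pool:
        labels = pool.map(assign_chunk, [(c, centroids) for c in chunks])
    return np.concatenate(labels)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((10_000, 2))
    ctr = rng.random((5, 2))
    print(parallel_assign(pts, ctr)[:10])
```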
A Parallelized Iterative Closest Point Algorithm for 3D View Fusion
Abstract
The Iterative Closest Point (ICP) algorithm is a widely used method in computer science and robotics for minimizing a distance metric between two sets of points. Common applications of ICP are object localization and position estimation. In this work, we introduce a parallel version of ICP that significantly reduces the computational time by performing fewer operations while remaining a simple and highly parallelizable algorithm. Our proposal is based on the naive computation of closest pairs of points in two different sets: instead of comparing all possible pairs, we approximate the closest pairs by searching within a plausible subset. The experiments are performed on a sample from the Stanford 3D Scanning Repository, used for 3D point cloud registration. For these case studies, the error, as well as the solution, is exactly the same as with the exact algorithm.
S. Ivvan Valdez, Felipe Trujillo-Romero
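The paper's contribution is the subset-based approximation of closest pairs, which is not reproduced here. As a hedged baseline sketch, one ICP iteration using an exact nearest-neighbour query via a k-d tree looks roughly as follows.

```python
# Baseline sketch of one ICP iteration (exact nearest neighbours via k-d tree).
# The authors' subset-based approximation is not reproduced here.
import numpy as np
from scipy.spatial import cKDTree

def icp_step(source, target):
    """Move `source` one rigid-alignment step closer to `target` (Nx3 arrays)."""
    # 1. Closest point in target for every source point.
    _, idx = cKDTree(target).query(source)
    matched = target[idx]
    # 2. Best rigid transform for these pairs (Kabsch algorithm).
    src_c, tgt_c = source.mean(0), matched.mean(0)
    H = (source - src_c).T @ (matched - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = tgt_c - R @ src_c
    # 3. Apply the transform to the source cloud.
    return source @ R.T + t
```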
Evaluation of OrangeFS as a Tool to Achieve a High-Performance Storage and Massively Parallel Processing in HPC Cluster
Abstract
Nowadays, modern software demands greater computing power; numerous scientific and engineering applications require more data storage capacity, the ability to exchange information at high speed, faster data processing, and better memory management. Personal computers interconnected to form a cluster, together with distributed/parallel file systems, are a highly suitable alternative for solving complex problems that require these resources as their needs grow. The present work evaluates OrangeFS as a tool to achieve high-performance storage and massively parallel processing. It takes advantage of the capacity of the hard drives included in each node of the cluster, through the virtual file system and the network bandwidth, instead of adding a more expensive type of storage. Tests carried out on a cluster running CentOS show that striping a large file into small objects distributed in parallel across the I/O servers makes subsequent read/write operations run faster. In addition, the use of the Message Passing Interface in the development and execution of applications increases data parallelism in terms of processing, thanks to the multicore processor in each of the clients.
Hugo Eduardo Camacho Cruz, Jesús Humberto Foullon Peña, Julio Cesar González Mariño, Ma. de Lourdes Cantú Gallegos
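As an illustration of the kind of MPI-based parallel I/O the evaluation relies on, the following mpi4py sketch has every rank write its own stripe of a shared file. The stripe size and file name are assumptions, and any OrangeFS-specific tuning is outside the scope of the sketch.

```python
# Sketch: each MPI rank writes its own contiguous stripe of a shared file.
# Stripe size and file name are illustrative; run with: mpiexec -n 4 python write.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

STRIPE = 1 << 20                       # 1 MiB per rank (assumption)
buf = np.full(STRIPE, rank, dtype=np.uint8)

fh = MPI.File.Open(comm, "shared.dat", MPI.MODE_CREATE | MPI.MODE_WRONLY)
fh.Write_at_all(rank * STRIPE, buf)    # collective write at a per-rank offset
fh.Close()
```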
Many-Core Parallel Algorithm to Correct the Gaussian Noise of an Image
Abstract
Digitized information abounds in the many areas that rely on digital image processing, whose primary objective is to improve image quality for correct human interpretation or to facilitate the search for information patterns in less time, with fewer computing resources, smaller size, and low energy consumption. This research focuses on validating a possible implementation on a limited embedded system, which requires the specified processing speed and algorithms that redistribute the computational cost. The strategy is based on parallel processing to distribute tasks and data to the Epiphany III, combined so as to reduce the factors that introduce noise into the image and improve its quality. The most common types of noise are Gaussian, impulsive, uniform, and speckle noise. In this paper we analyze the effect of Gaussian noise that arises at the moment of image acquisition, which blurs some pixels of the image and produces a haze (blur) effect. The implementation was developed using many-core technology in 2 × 2 and 4 × 4 arrays with 4, 8, and 16 cores; the performance of the Epiphany system was also characterized for FFT2D, FFT setup, BITREV, FFT1D, corner turn, and LPF, and the response times in machine cycles of each algorithm are reported. The power of parallel processing with this technology is demonstrated, and the low power consumption is related to the number of cores used. The contribution of this research is shown qualitatively, with only a slight variation perceptible to the human eye in the other images tested; the method is a useful tool for applications with limited resources.
Teodoro Alvarez-Sanchez, Jesus A. Alvarez-Cedillo, Jacobo Sandoval-Gutierrez
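The abstract lists FFT2D and LPF kernels among the routines characterized on the Epiphany; as a hedged single-core reference of what such a pipeline computes, the NumPy sketch below attenuates Gaussian noise with a circular low-pass mask in the frequency domain. The cutoff radius and mask shape are assumptions, not the authors' kernel.

```python
# Reference sketch: frequency-domain low-pass filtering of a noisy image.
# Cutoff radius and mask shape are illustrative assumptions.
import numpy as np

def lowpass_denoise(image, cutoff=0.1):
    """Attenuate high-frequency content, where Gaussian noise dominates."""
    F = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    y, x = np.ogrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    mask = (x / w) ** 2 + (y / h) ** 2 <= cutoff ** 2   # circular low-pass mask
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    clean = np.zeros((256, 256)); clean[96:160, 96:160] = 1.0
    noisy = clean + rng.normal(0, 0.2, clean.shape)      # additive Gaussian noise
    print(lowpass_denoise(noisy).shape)
```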
Amdahl’s Law Extension for Parallel Program Performance Analysis on Intel Turbo-Boost Multicore Processors
Abstract
In recent years the use of multicore processors has increased. This tendency to develop processors with several cores stems from the search for better performance in parallel programs with lower energy consumption. Currently, the analysis of speedup and energy consumption plays a key role for applications executed on multicore systems. For this reason, it is important to analyze performance based on new characteristics of modern processors, such as Intel's Turbo Boost technology, which increases the frequency of Intel multicore processors. In this work, we present an extension of Amdahl's law to analyze the performance of parallel programs running on multicore processors with Intel Turbo Boost technology. We conclude that when the sequential portion of a program is small, it is possible to exceed the upper limit of the traditional Amdahl's law. Furthermore, we show that parallel programs running with Turbo Boost perform better than programs running on processors without this technology.
Amilcar Meneses-Viveros, Mireya Paredes-López, Isidoro Gitler
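For reference, the classical law the paper extends is reproduced below, together with a hedged sketch (an assumption for illustration, not the paper's formulation) of how a boosted core frequency can enter the expression and allow the classical 1/s bound to be exceeded.

```latex
% Classical Amdahl's law: speedup on n cores when a fraction s of the work
% is inherently sequential.
S(n) = \frac{1}{\,s + \dfrac{1 - s}{n}\,}, \qquad
\lim_{n \to \infty} S(n) = \frac{1}{s}.

% Hedged sketch of a frequency-aware variant (an assumption, not the paper's
% formulation): if speedup is measured against execution at the nominal
% frequency f_0 while k active cores run at the boosted frequency f(k) >= f_0,
% each term is scaled by the corresponding boost ratio,
S_{\mathrm{tb}}(n) = \frac{1}{\,s\,\dfrac{f_0}{f(1)} + \dfrac{1 - s}{n}\,\dfrac{f_0}{f(n)}\,},
% so that for a small sequential fraction s the classical limit 1/s can be
% exceeded whenever f(1) > f_0.
```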

Applications and HPC

Frontmatter
Traffic Sign Distance Estimation Based on Stereo Vision and GPUs
Abstract
Recognition, detection, and distance determination of traffic signs have become essential tasks for the development of Intelligent Transport Systems (ITS). Processing time is very important for these tasks, since they not only need an accurate answer but also require a real-time response. The distance determination of traffic signs (TS) consumes the largest share of computational resources, due to the disparity map calculations of the stereo vision method. In this paper, we propose accelerating the disparity map calculation with our parallel algorithm, called Accelerated Estimation for Traffic Sign Distance (AETSD), implemented on the Graphics Processing Unit (GPU), which uses data storage strategies based on frequency of use. Furthermore, it carries out an optimized search for the traffic sign in the stereoscopic pair of images. The algorithm splits the problem into parts that are solved concurrently by the massively parallel streaming multiprocessors (SMs). Our results show that the proposed algorithm accelerates the response time 141 times for an image resolution of 1024 × 680 pixels, with an execution time of 0.04 s for the parallel AETSD version versus 5.67 s for the sequential version.
Luis Barbosa-Santillan, Edgar Villavicencio-Arcadia, Liliana Barbosa-Santillan
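The geometric relation behind the disparity-to-distance step is standard stereo triangulation; the short Python sketch below applies it to a disparity map. The focal length, baseline, and disparity values are assumptions, and the AETSD GPU kernel itself is not reproduced.

```python
# Sketch: distance of a detected traffic sign from a stereo disparity map.
# Focal length, baseline and disparity values are illustrative assumptions.
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard stereo triangulation: Z = f * B / d (metres)."""
    d = np.asarray(disparity_px, dtype=float)
    return np.where(d > 0, focal_px * baseline_m / np.maximum(d, 1e-6), np.inf)

if __name__ == "__main__":
    disparity = np.array([[42.0, 40.5], [0.0, 38.2]])   # pixels, 0 = no match
    print(depth_from_disparity(disparity, focal_px=1200.0, baseline_m=0.3))
```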
The Use of HPC on Volcanic Tephra Dispersion Operational Forecast System
Abstract
High-performance computing (HPC) was used to estimate tephra dispersion forecasts in an operational mode, using the Popocatepetl volcano as a base case. Currently it is not possible to forecast a volcanic eruption, which can occur at any time. In order to reduce the human intervention needed to obtain immediate ash-deposition information, HPC was used to compute a wide spectrum of possible eruption and dispersion scenarios; information obtained from previous eruptions and from meteorological forecasts was used to generate the possible scenarios. Results from the scenarios are displayed on a web page for consultation and decision-making when a real eruption occurs. This work shows the methodological approach used to forecast the tephra dispersion of a possible eruption, the computing strategy to reduce the processing time, and a description of the products displayed.
Agustín García-Reynoso, Jorge Zavala-Hidalgo, Hugo Delgado-Granados, Dulce R. Herrera-Moro
High Performance Open Source Lagrangian Oil Spill Model
Abstract
An oil spill particle dispersion model implemented in Julia, a high-performance programming language, and Matlab is described. The model is based on a Lagrangian particle tracking algorithm with a second-order Runge-Kutta scheme. It uses ocean currents from the Hybrid Coordinate Ocean Model (HYCOM) and winds from the Weather Research and Forecasting Model (WRF). The model can consider multiple oil components according to their density and different types of oil decay: evaporation, burning, gathering, and exponential degradation. Furthermore, it allows simultaneous modeling of oil spills at multiple locations. The computing performance of the model is tested in both languages using an analogous implementation. A case study in the Gulf of Mexico is described.
Andrea Anguiano-García, Olmo Zavala-Romero, Jorge Zavala-Hidalgo, Julio Antonio Lara-Hernández, Rosario Romero-Centeno
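The abstract states that the particle tracking uses a second-order Runge-Kutta scheme; the Python sketch below shows one such advection step for a set of particles in a given velocity field. The analytic velocity field and time step are placeholders, not the model's HYCOM/WRF coupling.

```python
# Sketch: one second-order Runge-Kutta (midpoint) advection step for particles.
# The analytic velocity field is a placeholder for interpolated HYCOM/WRF data.
import numpy as np

def velocity(positions, t):
    """Placeholder velocity field (m/s): a simple rotating current."""
    x, y = positions[:, 0], positions[:, 1]
    return np.column_stack((-y, x)) * 1e-5

def rk2_step(positions, t, dt):
    """Advance particle positions by dt using the midpoint rule."""
    k1 = velocity(positions, t)
    k2 = velocity(positions + 0.5 * dt * k1, t + 0.5 * dt)
    return positions + dt * k2

if __name__ == "__main__":
    p = np.array([[10_000.0, 0.0], [0.0, 5_000.0]])   # metres from spill site
    for step in range(24):                             # 24 hourly steps
        p = rk2_step(p, t=step * 3600.0, dt=3600.0)
    print(p)
```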
Fast Random Cactus Graph Generation
Abstract
In this article, we propose a fast algorithm for generating a cactus graph. Cacti have important applications in diverse areas such as biotechnology, telecommunication systems, and sensor networks, among others; generating good random cactus graphs is therefore important for simulating diverse algorithms and protocols. We present an efficient parallel algorithm to generate random cacti. To the best of our knowledge, this is the first parallel algorithm for generating random cactus graphs.
Joel Antonio Trejo-Sánchez, Andrés Vela-Navarro, Alejandro Flores-Lamas, José Luis López-Martínez, Carlos Bermejo-Sabbagh, Nora Leticia Cuevas-Cuevas, Homero Toral-Cruz
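The parallel generator is the paper's contribution and is not reproduced here. As a hedged sequential baseline, a cactus can be grown by repeatedly attaching either a pendant edge or a whole cycle to an already-placed vertex, which guarantees that every edge lies on at most one cycle.

```python
# Hedged sequential baseline: grow a random cactus by attaching pendant edges
# or whole cycles to existing vertices. Not the authors' parallel algorithm.
import random

def random_cactus(n_blocks=10, max_cycle=5, seed=0):
    rng = random.Random(seed)
    edges, next_v = [], 1          # vertex 0 is the root
    vertices = [0]
    for _ in range(n_blocks):
        anchor = rng.choice(vertices)
        if rng.random() < 0.5:     # pendant edge (a bridge)
            edges.append((anchor, next_v))
            vertices.append(next_v)
            next_v += 1
        else:                      # simple cycle sharing only the anchor vertex
            k = rng.randint(3, max_cycle)
            cycle = [anchor] + list(range(next_v, next_v + k - 1))
            edges += list(zip(cycle, cycle[1:] + [anchor]))
            vertices += cycle[1:]
            next_v += k - 1
    return edges

if __name__ == "__main__":
    print(random_cactus())
```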
Theoretical Calculation of Photoluminescence Spectrum Using DFT for Double-Wall Carbon Nanotubes
Abstract
Using density functional theory (DFT), we calculated the photoluminescence (PL) spectra of double-walled carbon nanotubes (DWCNTs). Using the LNS supercomputer, the photoluminescence spectra of eight double-walled nanotubes were calculated with the Gaussian09 software; the DWCNTs built are of the armchair/armchair type: (3,3)/(2,2), (6,6)/(3,3), (8,8)/(4,4), (10,10)/(5,5), (12,12)/(6,6), (14,14)/(7,7), (16,16)/(8,8) and (18,18)/(9,9). The calculations take into account different DWCNT lengths, ranging from 4.9 Å to 23.4 Å as the chirality (n, m) of the nanotubes changes, as well as the increase in their inter-radial distance over the range 0.18 ≤ dR ≤ 0.62 nm. This work focuses on investigating the PL of DWCNTs for different atomic arrangements. The calculation was performed at the DFT level using the Generalized Gradient Approximation (GGA) to establish the molecular geometries and the ground-state energies. To obtain the PL spectra, the DWCNTs were optimized in their ground state with the hybrid functional CAM-B3LYP, a mixed exchange-correlation functional, using the 6-31G basis set.
A. P. Rodríguez Victoria, Javier Martínez Juárez, J. A. David Hernández de la Luz, Néstor David Espinosa-Torres, M. J. Robles-Águila
Computational Study of Aqueous Solvation of Vanadium (V) Complexes
Abstract
Vanadium complexes are of great biological interest due to their antidiabetic and anticancer properties. Analyses of aqueous solvation effects using explicit models on the octahydrated complexes of vanadium (V) bound to the tridentate ONO Schiff base (VL·(H2O)8) are performed. Here, L is the Schiff base 1-(((5-chloro-2-oxidophenyl)imine)methyl)naphthalen-2-olate. The complexes VL1, VL2, VL3 and VL4 include the functional groups –NH(CH3CH2)3, –CH2CH2CH3, and –CH2CH2CH2CH3, respectively. The explicit model is used to examine the effects of water molecules in the first solvation shell that surrounds the bis-peroxo-oxovanadate ion (two molecules per oxygen atom in [VO(O2)2·(H2O)]). Computational calculations are performed using density functional theory (DFT) with the M06-2X functional. A complete basis set (CBS) extrapolation using correlation-consistent Dunning basis sets from double-ζ to quadruple-ζ is employed. The solvation energies are analyzed in order to propose the complex structures most likely to be the relevant species in biological-like systems. The results indicate that, when explicit water molecules are included in the first solvation shell, a particular stabilization trend in the octahydrated complexes (VL1–VL4)·(H2O)8 is observed, with VL1·(H2O)8 < VL3·(H2O)8 < VL4·(H2O)8 < VL2·(H2O)8. Our results indicate that the complex VL3·(H2O)8, substituted with –CH2CH2CH3, presents the most stable ΔGsolv and hence might represent the most likely species in biological-like environments.
Francisco J. Melendez, María Eugenia Castro, Jose Manuel Perez-Aguilar, Norma A. Caballero, Lisset Noriega, Enrique González Vergara
3D Image Reconstruction System for Cancerous Tumors Analysis Based on Diffuse Optical Tomography with Blender
Abstract
Different studies allow detecting the presence of cancer cells in a patient; however, the time it takes to obtain a correct diagnosis is critical in these cases. This work presents the design and construction of a prototype, a first approach to a microtomograph based on Diffuse Optical Tomography (DOT). DOT is a technique that uses light from the near-infrared region (NIR) to measure a tissue's optical properties; it has become relevant in the medical field because it is non-invasive. The main goal of this device is to detect and analyze cancerous tumors by measuring diffuse photon densities in turbid media. As a first phase of the development, this project integrates an image acquisition mechanism and an image-processing interface developed with Blender, which exploits a GPU architecture to optimize the execution time.
Marco Antonio Ramírez-Salinas, Luis Alfonso Villa-Vargas, Neiel Israel Leyva-Santes, César Alejandro Hernández-Calderón, Sael González-Romero, Miguel Angel Aleman-Arce, Eduardo San Martín-Martínez
Sea-Surface Temperature Spatiotemporal Analysis for the Gulf of California, 1998–2015: Regime Change Simulation
Abstract
Aiming to gain insight into a probable climate-change regime in the Gulf of California, a spatiotemporal simulation of sea surface temperature (SST) for the years 1998–2015, based on monthly satellite images with a spatial resolution of 4 km, was undertaken. In addition to the SST time series, the El Niño Southern Oscillation (ENSO) Multivariate Index (MEI) and monthly standardized SST anomalies (SSTA) for the study area were taken into consideration. Arrival dates for the summer transition, SST ≥ 25 ℃, showed up 15.5 days earlier for the 2007–2015 period with respect to the 1998–2006 period. In contrast, the winter transition, SST < 25 ℃, turned up 3.9 days later, which adds up to 20 extra days of summer over this time series. Furthermore, for the latter period, the spatial distribution of surface waters with SST > 26 ℃ covered an extra 10% of the Gulf's area. Additionally, SST variability showed a positive annual trend of 0.04 ℃ (0.72 ℃ in total for the series), according to Theil-Sen trend estimation.
María del Carmen Heras-Sánchez, José Eduardo Valdez-Holguín, Raúl Gilberto Hazas-Izquierdo
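The reported annual trend comes from Theil-Sen estimation; the short sketch below shows how such a slope is obtained from a monthly SST series with SciPy. The synthetic series is an illustration, not the study's satellite data.

```python
# Sketch: Theil-Sen trend of a monthly SST series (synthetic data, not the
# study's satellite record).
import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(42)
months = np.arange(12 * 18)                       # 1998-2015: 18 years of months
seasonal = 4.0 * np.sin(2 * np.pi * months / 12)  # annual cycle
sst = 24.0 + seasonal + 0.04 * months / 12 + rng.normal(0, 0.3, months.size)

slope, intercept, lo, hi = theilslopes(sst, months)
print(f"trend: {slope * 12:.3f} degC/year (95% CI {lo * 12:.3f}..{hi * 12:.3f})")
```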
Use of High Performance Computing to Simulate Cosmic-Ray Showers Initiated by High-Energy Gamma Rays
Abstract
We use the supercomputer of the Laboratorio Nacional de Supercómputo del Sureste de México (LNS) to simulate secondary cosmic-ray showers initiated by gamma rays with energies between 100 GeV and 100 TeV. These simulations play an important role in the search for gamma-ray bursts (GRBs) by ground observatories such as the High Altitude Water Cherenkov (HAWC) observatory located in Sierra Negra, Mexico. GRBs are the most energetic explosions observed so far in our Universe and have been observed only by satellite detectors such as Fermi/GBM, Swift/BAT and INTEGRAL. Their observation by ground observatories would constitute an important scientific breakthrough in the field of astroparticle physics. We use MPI to run the simulation code in parallel on hundreds of CPU cores of the LNS. In particular, we use the CORSIKA Monte Carlo shower generator with zenith angles of the primary gamma rays between 0° and 45° and azimuth angles between 0° and 360°, with a spectral index of −2. We report benchmark results on the speed and scalability of our code as a function of the number of CPU cores. The authors are members of the HAWC Collaboration and use high-performance computing to analyze the data collected with the HAWC Observatory.
Cederik de León, Humberto Salazar, Luis Villaseñor
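The simulations are embarrassingly parallel over independent showers; a minimal mpi4py sketch of distributing such runs across ranks is shown below. The executable name, steering files, and directory layout are assumptions, not the authors' actual CORSIKA invocation.

```python
# Minimal sketch: distribute independent shower simulations over MPI ranks.
# The command line and file names are assumptions, not the authors' setup.
from mpi4py import MPI
import subprocess

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N_RUNS = 1000                                  # independent showers (assumption)
for run_id in range(rank, N_RUNS, size):       # round-robin assignment of runs
    card = f"inputs/run_{run_id:04d}.card"     # hypothetical steering file
    with open(f"logs/run_{run_id:04d}.log", "w") as log:
        subprocess.run(["./shower_sim", card], stdout=log, check=False)

comm.Barrier()                                 # wait for all ranks to finish
if rank == 0:
    print("all simulations finished")
```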
Data Augmentation for Deep Learning of Non-mydriatic Screening Retinal Fundus Images
Abstract
Fundus imaging is an effective and low-cost tool to screen for common retinal diseases. At the same time, Deep Learning (DL) algorithms have been shown to achieve similar or even better accuracy than physicians in certain image classification tasks. One of the key ways to improve the performance of DL models is to use data augmentation techniques. Data augmentation reduces the impact of overfitting and improves the generalization capacity of the models; however, the most appropriate data augmentation methodology is highly dependent on the nature of the problem. In this work, we propose a data augmentation and image enhancement algorithm for the task of classifying non-mydriatic fundus images of pigmented abnormalities in the macula. For training, fine-tuning, and data augmentation, we used the Barcelona Supercomputing Centre clusters CTE IBM Power8+ and Marenostrum IV. The parallelization and optimization of the algorithms were performed using Numba and Python multiprocessing, made compatible with the underlying DL framework used for training the model. We propose and train a specific DL model from scratch. Our main results are a substantial increase in the number of input images and a report on the quality of the resulting images. Overall, our data augmentation approach yields an increase of up to 9% in classification accuracy.
E. Ulises Moya-Sánchez, Abraham Sánchez, Miguel Zapata, Jonathan Moreno, D. Garcia-Gasulla, Ferran Parrés, Eduard Ayguadé, Jesús Labarta, Ulises Cortés
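The specific enhancement pipeline is the paper's own; as a generic, hedged sketch of multiplying a fundus-image dataset with simple label-preserving transforms in parallel, the following Python fragment uses a process pool. The transforms and the augmentation factor are assumptions, not the authors' method.

```python
# Generic sketch: label-preserving augmentations applied in parallel.
# The transforms and the augmentation factor are illustrative assumptions.
import numpy as np
from multiprocessing import Pool

def augment(image):
    """Return a few simple variants of one fundus image (HxWx3 array in [0,1])."""
    variants = [image,
                np.fliplr(image),                 # horizontal flip
                np.flipud(image),                 # vertical flip
                np.rot90(image)]                  # 90-degree rotation
    return [np.clip(v * 1.1, 0, 1) for v in variants]  # mild brightness change

def augment_dataset(images, workers=4):
    with Pool(workers) as pool:
        batches = pool.map(augment, images)
    return [img for batch in batches for img in batch]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dataset = [rng.random((64, 64, 3)) for _ in range(8)]
    print(len(augment_dataset(dataset)))           # 8 images -> 32 images
```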
Decision Support System for Urban Flood Management in the Cazones River Basin
Abstract
Every year, the state of Veracruz suffers constant torrential rains that cause rivers and streams to overflow and flood urban areas. Due to the orography of the state, there are only a few hours available to evacuate people from risk areas. This paper therefore presents software that helps the civil protection authorities take preventive actions in emergencies caused by extreme weather events. The software considers information such as the current amount of water in the region, the rain forecast, soil porosity, and the orography of the region. With this information, the system generates a simulation of floods in the area. The system is implemented by parallelizing the problem domain and using multi-threaded techniques and distributed computation to reduce the execution time of the simulation.
The case of the region near the Cazones River basin is presented as an example of the utility of the system: it can simulate 72 h of rain falling over an area of 50 km × 20 km and show the areas that could be affected during the flood.
A. Cristóbal Salas, J. M. Cuenca Lerma, B. Santiago Vicente, H. L. Monroy Carranza
Backmatter
Metadata
Title
Supercomputing
Editors
Dr. Moises Torres
Jaime Klapp
Isidoro Gitler
Prof. Dr. Andrei Tchernykh
Copyright Year
2019
Electronic ISBN
978-3-030-10448-1
Print ISBN
978-3-030-10447-4
DOI
https://doi.org/10.1007/978-3-030-10448-1
