2023 | Book

ITNG 2023 20th International Conference on Information Technology-New Generations

About this book

This volume represents the 20th International Conference on Information Technology - New Generations (ITNG), 2023. ITNG is an annual event focusing on state-of-the-art technologies pertaining to digital information and communications. The applications of advanced information technology to such domains as astronomy, biology, education, geosciences, security, and health care are among the topics of relevance to ITNG. Visionary ideas, theoretical and experimental results, as well as prototypes, designs, and tools that help information flow readily to the user are of special interest. Machine Learning, Robotics, High Performance Computing, and Innovative Methods of Computing are examples of related topics. The conference features keynote speakers, a best student award, a poster award, a service award, a technical open panel, and workshops/exhibits from industry, government, and academia. This publication is unique in that it captures modern trends in IT with a balance of theoretical and experimental work. Most other works focus on either theory or experiment, but not both. Accordingly, we know of no competing literature.

Table of Contents

Frontmatter

Machine Learning

Frontmatter
Chapter 1. Loop Closure Detection in Visual SLAM Based on Convolutional Neural Network

In robotics, autonomous navigation has received much attention in recent years due to its potential applications in areas such as industry, commerce, health care, and entertainment. The capacity to navigate, whether for autonomous vehicles or service robots, is tied to the problem of Simultaneous Localization And Mapping (SLAM). Loop closure, in the context of Visual SLAM, uses information from images to identify previously visited environments, which allows the map and the robot’s localization to be corrected and updated. This paper presents a system that identifies loop closures using a Convolutional Neural Network (CNN) trained in a Gazebo-simulated environment. Based on the concept of transfer learning, a CNN with the VGG-16 architecture is retrained with images from a Gazebo scenario to enhance the accuracy of feature extraction. This approach also allows the dimension of the descriptors to be reduced. The image features are captured in real time by the robot’s camera, and the robot is controlled through the Robot Operating System (ROS). Furthermore, loop closure is addressed by preprocessing each image and dividing it into right and left regions to generate the descriptors. Distance thresholds and sequence constraints are defined to enhance performance during image-to-image matching. A virtual office designed in Gazebo was used to evaluate the proposed system. In this scenario, loop closures were identified while the robot navigated through the environment. The results showed good accuracy, with few false negative cases.

Fabiana Naomi Iegawa, Wagner Tanaka Botelho, Tamires dos Santos, Edson Pinheiro Pimentel, Flavio Shigeo Yamamoto
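
A minimal sketch of the descriptor-matching idea summarized above, assuming TensorFlow/Keras with stock ImageNet weights (the chapter retrains VGG-16 on Gazebo images instead); the 0.15 cosine-distance threshold and 224x224 input are illustrative assumptions:

```python
# Hedged sketch: global VGG-16 descriptors plus a cosine-distance threshold
# for loop-closure candidates. Weights, threshold, and input size are
# assumptions; the paper retrains the network on Gazebo imagery.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# Global average pooling turns the conv stack into one 512-d descriptor.
model = VGG16(weights="imagenet", include_top=False, pooling="avg")

def descriptor(frame_rgb):
    # frame_rgb: 224x224x3 uint8 camera frame (hypothetical input shape).
    x = preprocess_input(np.expand_dims(frame_rgb.astype("float32"), 0))
    return model.predict(x, verbose=0)[0]

def find_loop_closure(desc, past_descriptors, threshold=0.15):
    # A small cosine distance to a stored frame suggests a revisited place.
    for i, d in enumerate(past_descriptors):
        cos = np.dot(desc, d) / (np.linalg.norm(desc) * np.linalg.norm(d) + 1e-9)
        if 1.0 - cos < threshold:
            return i          # index of the matching past frame
    return None               # no loop closure detected
```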
Chapter 2. Getting Local and Personal: Toward Building a Predictive Model for COVID in Three United States Cities

The COVID-19 pandemic was lived in real-time on social media. In the current project, we use machine learning to explore the relationship between COVID-19 cases and social media activity on Twitter. We were particularly interested in determining if Twitter activity can be used to predict COVID-19 surges. We also were interested in exploring features of social media, such as replies, to determine their promise for understanding the views of individual users. With the prevalence of mis/disinformation on social media, it is critical to develop a deeper and richer understanding of the relationship between social media and real-world events in order to detect and prevent future influence operations. In the current work, we explore the relationship between COVID-19 cases and social media activity (on Twitter) in three major United States cities with different geographical and political landscapes. We find that Twitter activity resulted in statistically significant correlations using the Granger causality test, with a lag of one week in all three cities. Similarly, the use of replies, which appear more likely to be generated by individual users, not bots or public relations operations, was also strongly correlated with the number of COVID-19 cases using the Granger causality test. Furthermore, we were able to build promising predictive models for the number of future COVID-19 cases using correlation data to select features for input to our models. In contrast, significant correlations were not identified when comparing the number of COVID-19 cases with mainstream media sources or with a sample of all US COVID-related tweets. We conclude that, even for an international event such as COVID-19, social media tracks closely with local conditions. We also suggest that replies can be a valuable feature within a machine learning task that is attempting to gauge the reactions of individual users.

April Edwards, Leigh Metcalf, William A. Casey, Shirshendu Chatterjee, Heeralal Janwa, Ernest Battifarano
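
A minimal sketch of the kind of lag test reported above, assuming weekly-aggregated series and using statsmodels' Granger causality routine on placeholder data:

```python
# Hedged sketch: does weekly tweet volume help predict weekly case counts?
# The series here are random placeholders, not the paper's data.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cases":  rng.poisson(200, 52),    # placeholder weekly COVID-19 cases
    "tweets": rng.poisson(1000, 52),   # placeholder weekly tweet volume
})

# Null hypothesis per lag: "tweets does NOT Granger-cause cases"; the paper
# reports significance at a one-week lag.
results = grangercausalitytests(df[["cases", "tweets"]], maxlag=2)
for lag, (tests, _) in results.items():
    f_stat, p_value, _, _ = tests["ssr_ftest"]
    print(f"lag={lag}: F={f_stat:.2f}, p={p_value:.3f}")
```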
Chapter 3. Integrating LSTM and EEMD Methods to Improve Significant Wave Height Prediction

Wave energy is one of the most reliable renewable energy sources, with the highest energy density among them. Significant Wave Height (SWH) plays a major role in wave energy, and hence this study aims to predict wave height using time series of wave characteristics as input to various machine learning approaches, analyzing these approaches under several scenarios. Two different machine learning algorithms are implemented to forecast SWH. In the first approach, the SWH is forecast directly using a Long Short-Term Memory (LSTM) network; in the second, a combination of an LSTM and an Ensemble Empirical Mode Decomposition (EEMD) method is proposed for SWH prediction. For this purpose, the elements of wave height are first decomposed and used for training an LSTM network to calculate the time series of SWH. The calibration and verification of the modeled wave characteristics are performed using real data acquired from buoys. The results imply that the EEMD approach provides more accurate results: calculating the wave height through the decomposition and prediction of its main wave components delivers more accurate outcomes across various error indices. It can also be inferred from the results that the accuracy of the predictions decreases as the forecasting time horizon increases.

Ashkan Reisi-Dehkordi, Alireza Tavakkoli, Frederick C. Harris Jr
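
A compact sketch of the decompose-then-forecast idea under stated assumptions: PyEMD supplies the EEMD, a small Keras LSTM is trained per intrinsic mode function, and the series, window length, and epochs are placeholders for the buoy data:

```python
# Hedged sketch: EEMD decomposes the SWH series into IMFs; one LSTM per IMF
# forecasts the next value; summing the per-IMF forecasts reconstructs SWH.
import numpy as np
from PyEMD import EEMD
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

LAG = 24  # illustrative window length (e.g., 24 hourly samples)

def windows(series):
    X = np.stack([series[i:i + LAG] for i in range(len(series) - LAG)])
    return X[..., None], series[LAG:]      # (samples, timesteps, 1), targets

def fit_lstm(series, epochs=10):
    X, y = windows(series)
    model = Sequential([LSTM(32, input_shape=(LAG, 1)), Dense(1)])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=epochs, verbose=0)
    return model

swh = np.sin(np.linspace(0, 60, 600)) + 0.1 * np.random.randn(600)  # stand-in buoy record
imfs = EEMD().eemd(swh)                     # intrinsic mode functions
models = [fit_lstm(imf) for imf in imfs]    # one forecaster per component
next_step = sum(m.predict(imf[-LAG:][None, :, None], verbose=0)[0, 0]
                for m, imf in zip(models, imfs))
print("one-step SWH forecast:", next_step)
```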
Chapter 4. A Deep Learning Approach for Sentiment and Emotional Analysis of Lebanese Arabizi Twitter Data

Arabizi is an Arabic dialect representation written in Latin transliteration and commonly used on social media and in other informal settings. This work addresses the problem of Arabizi text identification and emotional analysis for the Lebanese dialect. The work starts with the extraction and construction of a dataset and uses two machine learning models. The first is based on fastText for learning the embeddings, while the second uses a combination of recurrent and dense deep learning models. The proposed approaches were evaluated on the Arabizi dataset that we extracted and curated from Twitter, and compared against six classical machine learning approaches using separate sentiment and emotion analysis. We achieved the highest result in the literature for binary sentiment analysis, with an F1 score of 81%. We also present baseline results for 3-class sentiment classification of Arabizi tweets, with an F1 score of 64%, and for emotion classification of Arabizi tweets, with an F1 score of 61%.

Maria Raïdy, Haidar Harmanani
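
A minimal sketch of a fastText supervised baseline of the kind named above, assuming a hypothetical training file of labelled Arabizi tweets in fastText's `__label__` format; the file name and hyperparameters are illustrative:

```python
# Hedged sketch: fastText learns subword-aware embeddings and a linear
# classifier jointly from lines such as:
#   __label__positive ktir helo hayda
# The file name and hyperparameters below are assumptions.
import fasttext

model = fasttext.train_supervised(input="arabizi_train.txt",
                                  epoch=25, wordNgrams=2, dim=100)
labels, probs = model.predict("ktir helo hayda")
print(labels[0], float(probs[0]))
```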
Chapter 5. A Two-Step Approach to Boost Neural Network Generalizability in Predicting Defective Software

With society’s digitalization, the ever-growing dependence on software has increased the negative impact of poor software quality, estimated at $2.41 trillion for the US economy in 2022. In the search for better tools to support quality assurance efforts such as software testing, many studies have demonstrated the use of Machine Learning (ML) classifiers to predict defective software modules. Such classifiers could be used to focus test effort on the potentially defective modules, enhancing the results achieved with limited resources. However, the practical applicability of many of those studies is arguable because of (1) the misuse of their training datasets; (2) improper metrics used to measure the classifiers’ performance; (3) the use of data from only a single system or project; and (4) the use of data from only a single programming language. When those factors are not considered, experimental results are biased toward very high accuracy, leading to improper conclusions about the generalizability of classifiers to practical use. This study sheds light on those issues and points out promising results by proposing and testing the cross-project and cross-language generalizability of a novel two-step approach for artificial neural networks (ANNs), using a large dataset of 17,147 software modules from 12 projects with distinct programming languages (C, C++, and Java). The results demonstrate that the proposed approach can deal with an imbalanced dataset and outperform a similar ANN trained with the conventional approach. Moreover, the proposed approach increased by 277% the number of defective modules found with the same software test effort.

Alexandre Nascimento, Vinicius Veloso de Melo, Marcio Basgalupp, Luis Alberto Viera Dias
Chapter 6. A Principal Component Analysis-Based Scoring Mechanism to Quantify Crime Hot Spots in a City

Hot spots policing is a tactic that judiciously distributes police resources in accordance with regional historical data on criminal occurrences and local crime patterns. Unquestionably, the key to this method is identifying crime hot spots, and a growing number of studies are looking into how to pinpoint them with greater accuracy. Nevertheless, the majority of them treat the task merely as a binary classification problem. Our research proposes the notion of a Crime Hot Spot Score, a Principal Component Analysis (PCA)-based linear scoring mechanism for assessing regional crime severity, which equips data users with a more flexible way of utilizing crime hot spot analysis results. We conducted our study on a three-year crime dataset from the Boston Police Department. All our preliminary results are encouraging: we not only provide a new perspective on hot spot detection, but also reveal the correlation between a crime hot spot and its adjacent area.

Yu Wu, Natarajan Meghanathan
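
An illustrative reading of a PCA-based linear score of the kind proposed above, assuming a regions-by-offence-counts matrix; the features, scaling, and 0-100 range are assumptions, not the chapter's specification:

```python
# Hedged sketch: standardize per-region crime features, project onto the
# first principal component, and rescale the projection into a 0-100 score.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler, MinMaxScaler

rng = np.random.default_rng(0)
X = rng.random((500, 8))               # placeholder: 500 cells x 8 offence types

Z = StandardScaler().fit_transform(X)  # keep one offence type from dominating
pc1 = PCA(n_components=1).fit_transform(Z)

hot_spot_score = MinMaxScaler((0, 100)).fit_transform(pc1).ravel()
print(hot_spot_score[:5])              # higher score = more severe cell
```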
Chapter 7. Tuning Neural Networks for Superior Accuracy on Resource-Constrained Edge Microcontrollers

Approaches for tuning artificial neural networks (ANNs) to run on edge devices, such as weight quantization, knowledge distillation, weight low-rank approximation, and network pruning, usually reduce their accuracy (gap 1). Moreover, they usually require at least 32-bit microcontrollers, leaving out widely used and much cheaper platforms based mostly on 8-bit microcontrollers (e.g., ATMega328p and ATMega2560), such as Arduino (gap 2). Those microcontrollers can cost between $0.01 and $0.10 at scale and can make it viable to extend IoT applications to a wider range of cheap everyday objects, such as bottles, cans, and cups. In this context, the present study addresses those two gaps by proposing and evaluating a technique for tuning ANNs to run on 8-bit microcontrollers. 16,000 ANNs with distinct configurations were trained and tuned with four widely used datasets and evaluated on two 8-bit microcontrollers. Using less than 3.5 KB, the embedded ANNs’ average accuracies outperformed their benchmarks on a 64-bit computer.

Alexandre M. Nascimento, Vinícius V. de Melo, Márcio P. Basgalupp
Chapter 8. A Deep Learning Approach for the Intersection Congestion Prediction Problem

Traffic prediction at intersections is an important problem, as it serves an essential role in minimizing wait times in large cities while reducing emissions. The problem is challenging, especially given the spatial and temporal dependencies between intersections in a large metropolitan city. In this paper, we use a deep learning model to predict traffic congestion based on day, time, and weather data. We evaluate our model using datasets from large cities, including Atlanta, Philadelphia, Boston, and Chicago.

Marie Claire Melhem, Haidar Harmanani
Chapter 9. A Detection Method for Stained Asbestos Based on Dyadic Wavelet Packet Transform and a Locally Adaptive Method of Edge Extraction

Recently, the development of two dye staining methods has made it easier to visually recognize asbestos. We propose a method for detecting stained asbestos-specific fiber shapes in microscopic images by extracting edge information using the two-dimensional dyadic wavelet packet transform (2D-DYWPT), which can extract detailed edge information, together with eigenvalue analysis of the Hessian matrix, which captures differences in pixel values in a locally adaptive manner. First, we convert the original image from RGB to YIQ color space and apply the 2D-DYWPT to the Y and Q components. We extract 36 features based on statistics obtained from the 2D-DYWPT and the eigenvalues of Hessian matrices, and classify microscopic images with a support vector machine. Experimental results include a comparison with a fine-tuned ResNet and the results of applying the detection system to actual microscopic images. We confirmed that, overall, the performance of our method is superior to that of ResNet.

Hikaru Tomita, Teruya Minamoto
Chapter 10. Machine Learning: Fake Product Prediction System

Product reviews play a vital role in shopping, especially for online customers. Some base their buying decisions on reviews; hence, fake reviews are a major concern (Jadhav and Gore, Int J Comput Sci Inform Technol 5(2):1447–1450, 2014). Competition appears to encourage malicious agents to promote fake products, a major challenge in the e-commerce industry. This paper intends to use machine learning to explore a system that predicts whether products are fake or genuine by feeding product data into the model.

Okey Igbonagwam

Cybersecurity and Blockchain

Frontmatter
Chapter 11. Ontology of Vulnerabilities and Attacks on VLAN

Proposing defense strategies for critical computing infrastructures, such as Virtual Local Area Networks (VLANs), is a hard task. We present a conceptual model aimed at the protection of VLANs. We identify, formalize, and relate important concepts, and map vulnerabilities and attacks, in addition to proposing protection strategies. The main contributions of the paper are: (i) a domain ontology (in OWL format) that models vulnerabilities and attacks on VLANs; and (ii) a set of attack prevention strategies for protecting VLANs. This work is intended to be used by researchers seeking to develop systematic methods and techniques for protecting critical infrastructures.

Marcio Silva Cruz, Ferrucio de Franco Rosa, Mario Jino
Chapter 12. Verifying X.509 Certificate Extensions

Covert channels are used to hide the presence of information in another medium. Attackers have used covert channels to hide the transfer of malicious files, command-and-control traffic, and more. Previous research has shown that X.509 certificate extensions can be used as a covert channel. This quasi-experiment utilizes Suricata, an open-source intrusion detection system, to verify specific X.509 certificate extensions that have been used as a covert channel. Several Suricata rules were generated and tested to determine their effectiveness in detecting the presence of a covert channel. All of the generated rules had a 100% true-positive rate, though some had significant impacts on processor utilization on the IDS. It is possible to detect X.509 covert channels with a high success rate, though detailed verification of the entire X.509 certificate with Lua scripting can be extremely resource intensive and unrealistic for high-bandwidth environments.

Cody Welu, Michael Ham, Kyle Cronin
Chapter 13. Detecting Malicious Browser Extensions by Combining Machine Learning and Feature Engineering

As the popularity of the internet continues to grow, along with the use of web browsers and browser extensions, the threat of malicious browser extensions has increased, demanding an effective way to detect and in turn prevent the installation of these malicious extensions. Such extensions compromise private user information (including usernames and passwords) and can also compromise the user’s computer in the form of Trojans and other malicious software. This paper presents a method that combines machine learning and feature engineering to detect malicious browser extensions. By analyzing the static code of browser extensions and extracting features from it, the method predicts whether a browser extension is malicious or benign with a machine learning algorithm. Four machine learning algorithms (SVM, RF, KNN, and XGBoost) were tested with a dataset we collected for this study. Their detection performance in terms of different metrics is discussed.

Jacob Rydecki, Jizhou Tong, Jun Zheng
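
A hedged sketch of the static-code feature-engineering idea: the risky API tokens, the tiny corpus, and the choice of Random Forest below are illustrative assumptions, not the chapter's feature set:

```python
# Hedged sketch: count occurrences of permission strings and risky API calls
# in an extension's source, then classify the count vector.
import re
from sklearn.ensemble import RandomForestClassifier

RISKY_TOKENS = ["chrome.cookies", "chrome.webRequest", "chrome.tabs",
                "document.cookie", "eval(", "atob(", "XMLHttpRequest"]

def features(source: str):
    return [len(re.findall(re.escape(tok), source)) for tok in RISKY_TOKENS]

# Placeholder corpus: (extension source, label) with 1 = malicious.
corpus = [
    ("chrome.cookies.getAll({}, send); eval(atob(payload));", 1),
    ("console.log('theme loaded'); document.title = 'hi';",   0),
]
X = [features(src) for src, _ in corpus]
y = [label for _, label in corpus]

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(clf.predict([features("eval(atob(x)); document.cookie;")]))
```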
Chapter 14. A Lightweight Mutual Authentication and Key Generation Scheme in IoV Ecosystem

In the Internet of Vehicles (IoV) system, the need for vehicles to authenticate themselves dynamically makes them vulnerable to physical, side-channel, and cloning attacks. This article presents a lightweight, two-factor mutual authentication scheme and key agreement protocol suited to the IoV system to address these issues. The proposed scheme uses Physical Unclonable Functions (PUFs) to achieve the desired security requirements. The scheme is organized in a four-layer model with four main communicating entities: the Vehicle (V), Road Side Unit (RSU), Central RSU (CRSU), and a Trusted Authority (TA). In the proposed scheme, a vehicle needs to authenticate itself only once, when it enters the coverage area of a CRSU, and can then hand off seamlessly between the RSUs under that CRSU’s coverage. Security and performance analyses of the proposed scheme are conducted to validate the proposed solution. The results demonstrate that the scheme is robust against various attacks with less computational complexity than current solutions in the literature.

Samah Mansour, Adrian Lauf, Mostafa El-Said
Chapter 15. To Reject or Not Reject: That Is the Question. The Case of BIKE Post Quantum KEM

NIST’s post-quantum cryptography standardization project has just entered its final Round 4, where three KEMs are evaluated for standardization as alternatives. BIKE is one of them. This paper deals with several considerations around building an isochronous and constant-time implementation of the errors-vector generation (EVG) used by BIKE. The starting point is the Round 3 BIKE (Ver. 4.2), where a recently published timing attack motivated some changes toward the Round 4 submission. The easiest mitigation simply redefines the EVG to be isochronous. This approach was readily available (already in June 2022) in [1]. It requires only minor changes to the Round 3 specification and reference code, with no changes to the KATs. The BIKE team chose a different, newly proposed EVG method (with new KATs). It was integrated into the definition and reference code of the first Round 4 submission (Ver. 5.0) but turned out to be erroneous. We alerted NIST and the BIKE team about the problems and proposed solutions. This responsible disclosure allowed the BIKE team to very quickly revisit the design decision per one of our solutions, modify the specification document and the associated proof, and submit a revised Round 4 submission (Ver. 5.1). NIST graciously accepted the fixed specification as the submission (fortunately, before it was posted on the official NIST website). In this paper, we explore the problems, review and compare some engineering aspects of the different approaches, present more alternatives, and conclude with our critique and recommendations.

Nir Drucker, Shay Gueron, Dusan Kostic
Chapter 16. IoT Forensics: Machine to Machine Embedded with SIM Card

The Embedded Universal Integrated Circuit Card (eUICC), also known as the Embedded SIM (eSIM), is gaining popularity when merged with other IoT devices, and most business industries have recently adopted this technology. Existing research discusses how to implement this technology, as well as the ability to hack an eUICC and gain remote control of other Machine-to-Machine (M2M) devices. However, this technology has not been examined in practical studies from a digital forensics point of view. The suggested solution is a flowchart that guides digital forensics investigators through the initial steps of investigating eSIM cards with two forensic methods, chip-off extraction and eSIM profile analysis, according to the scenario investigators face during an incident.

Istabraq Mohammed Alshenaifi, Emad Ul Haq Qazi, Abdulrazaq Almorjan
Chapter 17. Streaming Platforms Based on Blockchain Technology: A Business Model Impact Analysis

Streaming platforms have established themselves in the global economy as a powerful source of revenue generation, owing to the advancement of mobile and broadband networks, as well as the popularity of smartphones. Blockchain technology has been used in the streaming-platform ecosystem to develop solutions that provide transparency and traceability for the services provided. This paper investigates how the characteristics of decentralized storage, immutability, and traceability have influenced the business model of streaming platforms. To achieve this goal, we surveyed state-of-the-art studies that addressed the development and validation of computational solutions with blockchain technology, observing the following criteria: (i) type of network, (ii) data storage description, (iii) payment system with blockchain technology, and (iv) transparency in audience metrics using smart contracts. Based on the results found, there is a lack of studies that assess the financial and governance feasibility of maintaining all of a streaming platform’s functionality using blockchain technology.

Rendrikson Soares, André Araújo, Gabriel Rodrigues, Charles Alencar
Chapter 18. Digital Forensic Investigation Framework for Dashcam

Dashboard cameras (dashcams) are becoming the most reliable eyewitnesses on the streets. These small devices installed inside vehicles record everything in front of the car during its journey. They are becoming popular and often record incidents or frightening scenes along the way. Such scenes become digital evidence that needs to be handled by law enforcement agencies. When dashcam owners have digital evidence on their devices, they post it online on their social media accounts, sometimes mentioning the traffic police handle. This means sharing the digital evidence with the public, which affects it on many levels. There are a few standards from the National Institute of Standards and Technology (NIST), INTERPOL, and others on collecting digital evidence from incident scenes by digital forensics first responders. In contrast, there are no frameworks or guidelines for delivering digital evidence online. In this paper, we improve the first phase of the NIST standard so that digital investigators can receive dashcam evidence online from drivers, through a website designed as a proof of concept. In addition, the new framework considers the integrity and availability of the data and its metadata throughout the digital evidence lifecycle.

Saad Alboqami, Huthifah Alkurdi, Nawar Hinnawi, Emad Ul Haq Qazi, Abdulrazaq Almorjan

Software Engineering

Frontmatter
Chapter 19. Conflicts Between UX Designers, Front-End and Back-End Software Developers: Good or Bad for Productivity?

Modern software development processes pair traditional development activities with user experience design, i.e. the process of designing software systems that support and improve user interactions through the fulfilment of quality properties such as usability, usefulness, and desirability. Commonly, developers and user experience designers have rather different backgrounds. Several studies have identified this difference as pivotal to the rise of miscommunication and conflicts, which may in theory undermine the productivity of the software development process. In this research, we investigated the kinds of conflict that are most likely to arise between developers and user experience designers. Besides, we analysed whether socio-cultural factors and the geographic distribution of team members affect the rise of conflicts. Finally, we related the presence of conflicts to the success of a software development project. We conducted this research as a questionnaire-based survey involving 56 professional software developers and user experience designers from various countries in Europe. The collected data showed that the most common type of conflict is task conflict. Age, gender, and the geographical distribution of team members do not affect the rise of conflicts. Conflicts are, surprisingly, perceived as beneficial for productivity in software development processes in several cases, although they led to the failure of projects in 10.9% of the cases.

Tea Pavicevic, Dejana Tomasevic, Alessio Bucaioni, Federico Ciccozzi
Chapter 20. Generalized EEG Data Acquisition and Processing System

Data acquisition is an integral part of any intelligent system, ensuring that the captured data can be processed into meaningful deductions. It is common for researchers to use third-party hardware to collect raw data, but integrating the processes into a research workflow is always a challenge. This is especially so for individuals working with sensors, such as electroencephalogram (EEG) headsets, as the consumer support these devices receive from their vendors does not cover the rigors of human-centered research. Researchers are forced to utilize services and functions offered by vendors that may not be tailored to their specific needs. In this paper, we present a proposed methodology, supported by a prototype, to show the feasibility of consolidating the processes involved in EEG-based user studies, as well as the data analysis that follows. The system utilizes a web application to facilitate experimental data collection, record timings, and execute device calibrations. This interface is tied to an institutional service-based pipeline that is capable not only of capturing EEG data but also of producing data products for later analysis. We envisage that such an approach can be the first step in automating EEG data acquisition and its subsequent analytics.

Vinh D. Le, Chase D. Carthen, Norhaslinda Kamaruddin, Alireza Tavakkoli, Sergiu M. Dascalu, Frederick C. Harris Jr.
Chapter 21. Supporting Technical Adaptation and Implementation of Digital Twins in Manufacturing

In manufacturing, digital twins are emerging technologies that integrate several advances, such as the industrial internet of things and cyber-physical systems, to create software replicas that monitor and control manufacturing units or processes. Despite their great potential for innovation, digital twins are challenging to implement, and there is a lack of practical guidelines for their technical adaptation and implementation. This discourages enterprises from planning and adopting full-fledged digital twin-based solutions due to the low return on investment. In this paper, we address this lack of guidelines by providing a catalogue of technologies used for realising digital twin-based systems in manufacturing. We align the catalogue with the International Organization for Standardization (ISO) 23247 standard for digital twins in manufacturing. We elicit the catalogue by systematically reviewing 14 state-of-the-art digital twin implementations drawn from a pool of 140 peer-reviewed studies. To the best of our knowledge, this is the first work that identifies a catalogue of technologies supporting the realisation of digital twin-based systems and maps it onto the ISO 23247 standard.

Enxhi Ferko, Alessio Bucaioni, Moris Behnam
Chapter 22. Towards Specifying and Evaluating the Trustworthiness of an AI-enabled System

We propose a method to specify and evaluate the trustworthiness of AI-based systems using scenarios and design tactics. Using our trustworthiness scenarios and design tactics, we can analyze the architectural design of AI-enabled systems to ensure trustworthiness has been properly achieved. Trustworthiness scenarios allow for the specification of trustworthiness, and design tactics are used to achieve the desired level of trustworthiness in the system. We illustrate the validity of our proposal through the design of the software architecture of a pollination robot. We find that our method opens discussions on ways to achieve trustworthiness and leads to the discovery of any weaknesses in the design concerning the trustworthiness of the AI system. Furthermore, our method allows for designing an AI system with trustworthiness in mind and therefore leads to greater analysis and identification of the sub-attributes that affect the trustworthiness of an AI system.

Mark Arinaitwe, Hassan Reza
Chapter 23. Description and Consistency Checking of Distributed Algorithms in UML Models Using Composite Structure and State Machine Diagrams

In this study, we target the leader-finding algorithm, a classic distributed algorithm, and describe it using unified modeling language (UML) diagrams mapped to PRocess MEta LAnguage (PROMELA), the description language of the model checking tool SPIN. As the proposed method, the communication channel description of PROMELA was adapted so that composite structure diagrams model the structural parts of the model, such as instances and channels, and state machine diagrams model the behavior of processes. Furthermore, a consistency checker that analyzes the multiple UML diagrams created with the astah* API, a set of Java interfaces, along with a function to display the results on the diagrams, was implemented as an astah* plug-in.

Yu Manta, Katsumi Wasaki
Chapter 24. Simulation and Comparison of Different Scenarios of a Workflow Net Using Process Mining

Many processes produce a large amount of data during their execution, and process mining is a great tool for extracting useful information from this data. In this paper, the “Handle Complaint Process” Workflow net is compared across scenarios with different types of resources and time restrictions, using process mining and simulation. Some of these scenarios try to simulate the uncertainty inherent in human behavior. To achieve this goal, event logs were generated from simulations using CPN Tools. ProM was used to apply process mining and generate the model from the logs. As a result, a comparison between the simulation results and the process mining results is presented. With this, it is possible to verify which scenario best represents human behavior.

Felipe Nedopetalski, Franciny Medeiros Barreto, Joslaine Cristina Jeske de Freitas, Stéphane Julia
Chapter 25. Making Sense of Failure Logs in an Industrial DevOps Environment

Processing and reviewing nightly test execution failure logs for large industrial systems is a tedious activity. Furthermore, multiple failures might share one root/common cause during test execution sessions, and the review might therefore require redundant effort. This paper presents the LogGrouper approach for automated grouping of failure logs to aid root/common cause analysis and to enable the processing of each log group as a batch. LogGrouper uses state-of-the-art natural language processing and clustering approaches to achieve meaningful log grouping. The approach is evaluated in an industrial setting in both a qualitative and a quantitative manner. Results show that LogGrouper produces good-quality groupings in terms of our two evaluation metrics (Silhouette Coefficient and Calinski-Harabasz Index) for clustering quality. The qualitative evaluation shows that experts perceive the groups as useful, and the groups are seen as an initial pointer for root cause analysis and failure assignment.

Muhammad Abbas, Ali Hamayouni, Mahshid H. Moghadam, Mehrdad Saadatmand, Per E. Strandberg
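
A minimal sketch of log grouping in the spirit described above, with TF-IDF vectors and k-means as stand-ins for LogGrouper's richer NLP pipeline; the log messages are placeholders, and the two metrics are the ones the chapter reports:

```python
# Hedged sketch: vectorize failure messages, cluster them, and score the
# grouping with the Silhouette Coefficient and Calinski-Harabasz Index.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, calinski_harabasz_score

logs = [
    "connection timeout to db-node-3",
    "connection timeout to db-node-7",
    "assertion failed in parser.cpp line 42",
    "assertion failed in parser.cpp line 118",
]
X = TfidfVectorizer().fit_transform(logs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print("groups:", list(labels))
print("silhouette:", silhouette_score(X, labels))
print("calinski-harabasz:", calinski_harabasz_score(X.toarray(), labels))
```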

Data Science

Frontmatter
Chapter 26. Analysis of News Articles from Various Countries on a Specific Event Using Semantic Network Analysis

People today obtain news and information from many sources. This has been made possible by the exponential growth in the number of news outlets, due to developments in the scope and speed of the Internet. As a result, people can access news articles in various ways, not only through domestic media but also through media from other countries. There are limited ways to check news reports from overseas on events and their differences from the news content of one’s own country. By collecting and analyzing news related to the Russia-Ukraine war, we conduct experiments and analyze the content reported in countries related to this war and the focus of their reports. Our study analyzes whether all the news related to this war reported the same content and what it focused on. The rationale for this study is that the more access people have to different kinds of news, the more opportunities they have to think differently. It takes a large amount of effort to collect, translate, and analyze news reports in various languages across various countries: the news must be collected from the media of various countries, and translations must be carried out when the content is not in English. The main goal is to obtain meaningful results through various methods of analysis rather than a single method. To address these problems, we propose a system for translating, analyzing, and comparing news collected from various countries. We collected news from the United States, Germany, Russia, and China; we then applied LDA to extract the topics for comparison and analysis. We use semantic network analysis to obtain betweenness centrality scores and calculate sentiment scores.

Hyun Park, Jinie Pak, Yanggon Kim
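
A small sketch combining the two techniques named above on placeholder documents: scikit-learn's LDA for topic extraction, and NetworkX betweenness centrality on a word co-occurrence graph (corpus and parameters are assumptions):

```python
# Hedged sketch: extract topics with LDA, then score words by betweenness
# centrality in a document co-occurrence network.
import networkx as nx
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["sanctions economy energy prices", "military aid weapons frontline",
        "peace talks negotiations ceasefire", "energy prices sanctions gas"]

vec = CountVectorizer()
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    print(f"topic {k}:", [terms[i] for i in weights.argsort()[-3:]])

# Semantic network: words are nodes; an edge links words sharing a document.
G = nx.Graph()
for doc in docs:
    words = doc.split()
    G.add_edges_from((a, b) for i, a in enumerate(words) for b in words[i + 1:])
print(nx.betweenness_centrality(G))
```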
Chapter 27. An Approach to Assist Ophthalmologists in Glaucoma Detection Using Deep Learning

Artificial neural networks are computational models inspired by the central nervous system, capable of machine learning and pattern recognition. Building on the technological advancement of these networks and the possibility of applying new artificial intelligence and image processing technologies to problems in the field of medicine, this project applies deep learning with different architectures, such as ResNet50, DenseNet, and VGGNet-16, along with digital image processing, in which different filters were applied to a set of 512 images, to develop a system to aid in the identification of glaucomatous optic neuropathy, using retinal images as input for training a network capable of detecting glaucomatous patterns. In this study, a combination of different architectures and activation functions, such as Softmax, ReLU, and Sigmoid, was used for image classification. It is expected that the system will be able to help specialist physicians detect the disease during eye examinations, considering that three architectures obtained satisfactory results above 80% accuracy, among them a model using an image filter called malachite that was specially developed for this study.

Tatiane Martins Bistulfi, Marcelli Marques Monteiro, Juliano de Almeida Monte-Mor
Chapter 28. Multtestlib: A Parallel Approach to Unit Testing in Python

In this article, we propose a solution to improve the speed of unit tests in the Python language by creating a package, also in Python, that runs unit tests using parallel processing. It was necessary to side-step Python’s GIL (Global Interpreter Lock), which prevents threads from executing Python code in parallel. By reusing code from the multiprocessing package, it was possible to execute code in parallel, using subprocesses instead of the threads constrained by the GIL. Performance tests were carried out comparing the developed solution with unittest from the standard library; such tests demonstrated that it is possible to carry out unit tests in Python using multiple processors to reduce running time. The authors believe that, based on the results presented, this is a promising technique and that the developed package may eventually be included in a Python library.

Ricardo Ribeiro de Alvarenga, Luiz Alberto Vieira Dias, Adilson Marques da Cunha, Lineu Fernando Stege Mialaret
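
A minimal sketch of the core idea under stated assumptions (plain test functions rather than unittest cases): process workers from the standard multiprocessing module run tests in parallel, side-stepping the GIL that serializes threads:

```python
# Hedged sketch: each test runs in a worker process from a Pool, so the GIL
# of the main interpreter does not serialize the test executions.
import multiprocessing as mp

def test_addition():
    assert 1 + 1 == 2

def test_upper():
    assert "abc".upper() == "ABC"

def run_test(test):
    try:
        test()
        return (test.__name__, "ok")
    except AssertionError as exc:
        return (test.__name__, f"FAIL: {exc}")

if __name__ == "__main__":
    tests = [test_addition, test_upper]
    with mp.Pool(processes=mp.cpu_count()) as pool:
        for name, outcome in pool.map(run_test, tests):
            print(name, outcome)
```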
Chapter 29. DEFD: Adapted Decision Tree Ensemble for Financial Fraud Detection

Financial fraud activity is increasing, and financial institutions must improve their systems, which are often judged insufficiently effective. Our work provides a solution to analyse transactions and explain those blocked by the system. It is crucial to inform the experts who review blocked transactions, to help them make the best decision. Our approach is based on an ensemble of decision trees that computes a fraud score; any transaction above a certain threshold is considered fraudulent. We chose the decision tree algorithm for its interpretability, which helps the investigation by the experts assigned to review fraudulent transactions. Experiments were conducted with 1M transactions, and our method obtains an F1 score of 0.89.

Chergui Hamza, Abrouk Lylia, Cullot Nadine, Cabioch Nicolas
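
An illustrative version of the scoring rule summarized above, assuming bootstrapped decision trees whose averaged fraud probabilities are compared with a threshold; the synthetic data, tree depth, and 0.5 cut-off are assumptions:

```python
# Hedged sketch: average the fraud probabilities of an ensemble of shallow
# (hence interpretable) decision trees and flag transactions above a threshold.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, weights=[0.97], random_state=0)

rng = np.random.default_rng(0)
trees = []
for seed in range(25):                       # bootstrap the ensemble
    idx = rng.integers(0, len(X), len(X))
    trees.append(DecisionTreeClassifier(max_depth=4, random_state=seed)
                 .fit(X[idx], y[idx]))

score = np.mean([t.predict_proba(X)[:, 1] for t in trees], axis=0)
flagged = score > 0.5                        # sent to expert review
print(f"{flagged.sum()} transactions flagged")
```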
Chapter 30. Prediction of Bike Sharing Activities Using Machine Learning and Data Analytics

We study bike sharing systems using machine learning techniques and statistical data analytics. Using data from the state of Minnesota, we investigate how the number of trips from each station and the trip duration are affected by weather conditions and by weekday versus weekend. Correlations among the parameters are obtained and discussed. Machine learning algorithms, including neural networks and gradient tree boosting, are applied to predict the number of start trips using as input data the station ID, number of docks, date, and weather variables. Our results show the efficiency of the learning algorithms in this applied area.

Shushuang Man, Ryan Zhou, Ben Kam, Wenying Feng
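
A small sketch of the regression setup described above on synthetic data; the feature names mirror the inputs listed in the abstract, while the data-generating process and model defaults are assumptions:

```python
# Hedged sketch: tabular station/calendar/weather features feeding a gradient
# tree boosting regressor for the number of start trips.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

n = 2000
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "station_id": rng.integers(0, 50, n),
    "num_docks":  rng.integers(10, 40, n),
    "weekday":    rng.integers(0, 7, n),
    "temp_c":     rng.normal(15, 8, n),
    "precip_mm":  rng.exponential(1.0, n),
})
trips = (5 + 0.3 * df.temp_c - 2.0 * df.precip_mm
         + 3.0 * (df.weekday < 5) + rng.normal(0, 2, n)).clip(0)

Xtr, Xte, ytr, yte = train_test_split(df, trips, random_state=0)
model = GradientBoostingRegressor().fit(Xtr, ytr)
print("R^2 on held-out data:", model.score(Xte, yte))
```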

E-Learning

Frontmatter
Chapter 31. ICT: Attendance and Contact Tracing During a Pandemic

This paper proposes the use of a web tool to collect attendance for on-campus and online students. This allows for a single interface for tracking participation of students using either modality while also capturing position information of any on-campus students for use in contact tracing. While the need for contact tracing has been reduced in the years following 2020, the tool still quickly captures generic attendance information across modalities.

Shawn Zwach, Michael Ham
Chapter 32. Towards Cloud Teaching and Learning: A COVID-19 Era in South Africa

The COVID-19 pandemic forced many higher learning institutions around the globe to move their conventional teaching to the cloud, using cloud computing to support online teaching and learning. However, the transition to the cloud by some institutions brought uncertainty to both students and teachers. This paper investigates the experiences of students with online teaching and learning during the COVID-19 pandemic at the Central University of Technology in Free State, South Africa. The author used observations and qualitative questionnaires to describe students’ perceptions of online teaching and learning, with Activity Theory guiding the data analysis. The perception is that most students hardly embraced Blackboard, which made the transition to the cloud discouraging, since some did not know how to use the technology. Furthermore, the findings include challenges related to reduced interaction between the lecturer and students, social isolation, technical problems, data costs, and homes that were not conducive to learning. Despite these challenges, some students perceived online learning as being as effective as face-to-face learning, citing enjoyment, the ability to learn at their own pace, and easy access to online material. Online teaching and learning is an effective technique; however, integrating it successfully into the curriculum necessitates a well-thought-out framework and a more proactive approach. Also, the digital divide must be overcome so that all students, anywhere in the world, can benefit. This research contributes to the body of knowledge on technology in education.

Dina Moloja
Chapter 33. Learning Object as a Mediator in the User/Learner’s Zone of Proximal Development

The Zone of Proximal Development (ZPD), a term coined by the Russian psychologist Vygotsky in his theory of learning and development, can be described as the learner’s ability to solve problems successfully with the assistance of more capable people. In his writings, Vygotsky highlighted the importance of interaction in facilitating learning. The gamified environment “Logic Live” was developed to assist in the teaching of Propositional and Predicate Logic. It consists of modules for teaching truth tables, formalization of arguments, refutation trees, and propositional calculus. The information in this environment is structured into texts, examples, and leveled exercises for each theme. In their initial version, these exercises provided no form of learning mediation for users who were unable to complete an activity or completed it incorrectly. Thus, the goal of this work was to incorporate into “Logic Live” the mediation needed to aid the user/learner in solving the exercises, through the creation of a learning object, LO-ZPD.

Parcilene Fernandes de Brito, Douglas Aquino Moreno, Giovanna Biagi Filipakis de Souza, José Henrique Coelho Brandão
Chapter 34. Quality Assessment of Open Educational Resources Based on Data Provenance

Open Educational Resources (OER) expand the possibility of creating educational materials better suited to a specific audience and context. Given these educational benefits, it is mandatory to provide means of guaranteeing the quality of OER, so that users can have confidence when using them. In this sense, it is important to consider that a new OER can be created by reviewing (adapting) and/or remixing (combining) two or more OER. Thus, data provenance becomes relevant, as it can be used to assess the quality of OER created through review and/or remix activities. In addition, data provenance can also be used to evaluate the source OER used as the basis for the creation of a new OER. In the literature, there is no previous work that considers data provenance to assess the quality of OER. On the other hand, there are examples of digital repositories that store provenance information, but this information is not used for quality assessment. In this paper, we propose an approach called QualiProvOER to perform a semi-automatic assessment of OER quality based on data provenance. To this end, we defined a Provenance Model for OER, called the ProvOER Model, composed of a minimal set of metadata describing an OER’s history. We also present the criteria and mathematical formulas used to assess the quality of OER. We observed that the review and remix criteria strongly influence the provenance of the OER and, therefore, must be measured for quality purposes.

Renata Ribeiro dos Santos, Marilde Terezinha Prado Santos, Ricardo Rodrigues Ciferri
Chapter 35. Quality Assessment of Open Educational Resources: A Systematic Review

Open Educational Resources (OER) are an important feature for generating and sharing educational knowledge. They are made available through an open license that allows their use, adaptation, and combination for different purposes. Thus, providing means to ensure OER’s quality is mandatory. Quality is a contextual, subjective, and multidimensional concept that strongly impacts the evaluation process. This paper presents the results of a systematic review of the literature focused on studies that contribute to evaluating the quality of OER. In the literature, there are examples of literature reviews that present dimensions for assessing the quality of OER. In this paper, we expand and complement these contributions by presenting (i) strategies that do not consider dimensions for the assessment of the quality of OER, (ii) a complementary and updated set of quality dimensions, (iii) a list of indicators for the evaluation of the quality dimensions, (iv) quality evaluators and (v) semi-automatic and automatic strategies for assessing the quality of OER. We conducted a comparative analysis between the studies to identify points of convergence, particularities, and possibilities for future contributions to assessing the quality of OER. The results point out that there is no consensus on the best strategy to evaluate the quality of OER since different points of view, usage scenarios, and characteristics can be considered.

Renata Ribeiro dos Santos, Marilde Terezinha Prado Santos, Ricardo Rodrigues Ciferri

Health

Frontmatter
Chapter 36. Predicting COVID-19 Occurrences from MDL-based Segmented Comorbidities and Logistic Regression

In this work, a method based on the Minimum Description Length (MDL) principle is proposed to preprocess Electronic Health Record (EHR) data and predict COVID-19 occurrences based on the comorbidities of patients with symptoms of the disease. In the proposed method, openEHR archetypes are used to standardize the database records, and MDL is applied to segment the text that indicates the comorbidities. From the resulting database, instances are extracted and used with Logistic Regression to predict the influence of comorbidities on the COVID-19 test result. The method was validated using data from COVID-19 patients from the State of Pernambuco, Brazil. The MDL-based prediction obtained 100% accuracy and facilitated the verification of the influence of comorbidities on the patient’s COVID-19 diagnosis. The non-segmented model, besides making the results harder to interpret, obtained an accuracy of only 67.4%.

Ana Patrícia de Sousa, Valéria Cesário Times, André Araújo
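
A hedged sketch of the final modelling step only, assuming the MDL segmentation has already produced one-hot comorbidity flags; the comorbidity names and outcomes below are placeholders, not the chapter's data:

```python
# Hedged sketch: binary comorbidity flags feeding a logistic regression whose
# coefficients expose each comorbidity's influence on the test result.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.integers(0, 2, (300, 3)),
                 columns=["diabetes", "hypertension", "asthma"])
y = rng.integers(0, 2, 300)            # placeholder COVID-19 test outcomes

clf = LogisticRegression().fit(X, y)
for name, coef in zip(X.columns, clf.coef_[0]):
    print(f"{name}: odds ratio ~ {np.exp(coef):.2f}")
```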
Chapter 37. Internet of Things Applications for Cold Chain Vaccine Tracking: A Systematic Literature Review

The COVID-19 pandemic has led to an immense effort by laboratories, governments, and non-governmental organizations to develop vaccines in record time. According to the World Health Organization (WHO), one of the major challenges for global immunization is maintaining the required temperature conditions for these supplies in remote areas, especially in underdeveloped countries. Monitoring the cold chain, the supply chain specialized in refrigerated goods such as food and health items, thus becomes critical to their preservation. Technologies inherent to the Internet of Things, like RFID and temperature sensors, are commonly used to identify and detect physical changes in the environment in which they are embedded. This survey presents a Systematic Literature Review (SLR) conducted to identify the most relevant studies on this topic in digital libraries. Through this SLR, 23 primary studies were selected, from which important data were extracted and analyzed. The results presented in this study explain which key technologies are used, and how, to control the delivery of vaccines in the cold chain, focusing on remote regions.

Alex Fabiano Garcia, Wanderley Lopes de Souza
Chapter 38. GDPR and FAIR Compliant Decision Support System Design for Triage and Disease Detection

In this study, a novel decision support system design is proposed that addresses triage and disease detection and automatically makes predictions on structured and semi-structured clinical data. The proposed system has a hybrid design that uses ontology-driven and machine learning-based methods together when performing disease prediction and triage. PUBMED citation records and well-known biomedical ontologies were used as sources of information to effectively determine disease-symptom relationships. Since patient data are sensitive and demand responsible handling, the solution must comply with certain criteria and principles. To achieve this, data are obtained and stored in accordance with the General Data Protection Regulation (GDPR) and the FAIR Data Principles.

Alper Karamanlioglu, Elif Tansu Sunar, Cihan Cetin, Gulsum Akca, Hakan Merdanoglu, Osman Tufan Dogan, Ferda Nur Alpaslan

Potpourri I

Frontmatter
Chapter 39. Truckfier: A Multiclass Vehicle Detection and Counting Tool for Real-World Highway Scenarios

This paper presents Truckfier, a prototype tool that uses a deep learning approach for detecting, tracking, and counting multiple classes of vehicles, particularly trucks, in highway scenarios, to assist in traffic analysis. We focus specifically on Brazilian routes because there are many classes of trucks in that country, some very similar to each other. To ensure the quality of our dataset, given our very specific classes, we built it with the assistance of domain specialists who labeled each vehicle. Truckfier is based on YOLOv4 and DeepSort and obtained good results in metrics such as Precision and Recall in two real-life scenarios. In our setup, we use horizontal images captured with the camera parallel to the road, which is more prone to problems such as occlusion and, in some cases, trucks not fitting in a single frame. Truckfier also provides an interactive interface, validated by a group of experts, to help the user work with the results of the detection pipeline and support their traffic analysis.

Murilo S. Regio, Gabriel V. Souza, Roberto Rosa, Soraia R. Musse, Isabel H. Manssour, Rafael H. Bordini
Chapter 40. Explaining Multimodal Image Retrieval Using A Vision and Language Task Model

Humans can provide arguments, cite proof, and express confidence in their predictions. They may also say, “I don’t know,” if there isn’t enough information. However, few of the multimodal image retrieval techniques now in use have these characteristics, making the models extremely opaque and unreliable. Additionally, vision-language tasks should pay more attention to multimodal aspects. The vision-language concept has recently become very popular because of all its potential uses, and it is crucial to understand the output of vision-and-language models in various use cases, such as visual captioning, visual question answering, and image retrieval. In this article, we examine the image retrieval task, where the input query consists of an image and a text description of that image. With the help of the Transformer model, which attends to both modalities and merges embedded features using a feature-fusion technique, the system launches a query prompted by the input image and text. We applied two different techniques based on the features extracted from the fusion layer. First, we trained the K Nearest Neighbor (KNN) algorithm to find similar images in the training data. Second, we computed the nearest-neighbor points and cluster centers using a clustering algorithm and a support vector machine to retrieve comparable images. In our work, we choose the VisualBERT model for multimodal image retrieval and apply the SHAP algorithm to the NLVR2 JSON dataset to explain language features based on feature importance. The algorithm visualizes both positive and negative Shapley values for the input text.

Md Imran Sarker, Mariofanna Milanova, John R. Talburt
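
A minimal sketch of the KNN retrieval step described above, assuming fused image+text embeddings have already been produced (e.g., by a VisualBERT-style encoder); the dimensions and data are placeholders:

```python
# Hedged sketch: nearest-neighbor search over fused multimodal embeddings.
import numpy as np
from sklearn.neighbors import NearestNeighbors

gallery = np.random.rand(1000, 768)   # placeholder fused embeddings of a corpus
query = np.random.rand(1, 768)        # fused embedding of an (image, text) query

knn = NearestNeighbors(n_neighbors=5, metric="cosine").fit(gallery)
distances, indices = knn.kneighbors(query)
print(indices[0])                     # indices of the five most similar images
```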
Chapter 41. Machine Vision Inspection of Steel Surface Using Combined Global and Local Features

This paper introduces a new framework for defect classification of hot-rolled steel surfaces using global and local features. The framework begins by preprocessing steel surface images in the image module. Afterwards, multiple global and local features are extracted from those images, and the optimal feature vectors for classification are selected. The DST-GLCM method is used to extract global features, while GLCM, ULBP, and SURF are used to extract local features. The discrimination capabilities of the features are evaluated in three settings: (i) each of the global and local features evaluated separately, (ii) several local features evaluated individually, and (iii) global and local features evaluated in combination. In the classification module, two supervised learning algorithms are trained, namely a multi-class SVM (with varying kernels) and the k-Nearest Neighbor classifier. We also compare the best overall setting (feature- and classifier-wise) with several existing methods. On the NEU dataset, which consists of 1800 grayscale images of hot-rolled steel strips, the best setting in this work showed promising results for the classification of six different types of defects. The main challenge of this dataset is that intra-class defects have large differences in appearance, while inter-class defects have similar aspects.

Mohammed W. Ashour, M. M. Abdulrazzaq, Mohammed Siddique
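
A hedged sketch of one local-feature path named above (GLCM statistics fed to a multi-class SVM), assuming scikit-image's graycomatrix/graycoprops; the synthetic patches and parameter choices are placeholders for the NEU data:

```python
# Hedged sketch: grey-level co-occurrence matrix statistics per image patch,
# classified with an RBF-kernel SVM into six defect classes.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(img):
    # img: 2-D uint8 grayscale patch
    glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return np.hstack([graycoprops(glcm, p).ravel()
                      for p in ("contrast", "homogeneity", "energy", "correlation")])

rng = np.random.default_rng(0)
imgs = rng.integers(0, 256, (60, 64, 64), dtype=np.uint8)  # placeholder patches
labels = rng.integers(0, 6, 60)                            # six defect classes

X = np.stack([glcm_features(im) for im in imgs])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:5]))
```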
Chapter 42. A Process to Support Heuristic Evaluation and Tree Testing from a UX Integrated Perspective

This paper presents a proposal for an integral user experience evaluation process, the methodology followed to develop it, and its validation. In previous works, two processes were proposed: the first for performing Heuristic Evaluations and the second for performing Tree Testing evaluations, both aligned with user experience improvement. Heuristic Evaluation is a method that allows UX and HCI specialists to evaluate a software product against a set of heuristics to find usability problems. Tree Testing, or Reverse Card Sorting, is a method in which end users evaluate the information architecture of a software product, so that the hypotheses behind the proposed design can be validated early and design improvements can be made. This paper presents how, through a collaborative and continuous-improvement approach, an integrated process was developed to perform the aforementioned methods, in both face-to-face and remote schemes. In addition, this proposal lays the groundwork for adding other UX evaluation methods and techniques in the future. To carry this out, the AS-IS process was analyzed, a remote collaborative workshop was held with the stakeholders involved, and, with the identified improvement points, the integral process was elaborated and captured in BPMN diagrams. Finally, it was verified that the proposal covered all the improvement points identified in the workshop, and the proposal was validated through expert judgment. The collaborative experience and the feedback obtained from both workshop participants and experts led us to conclude the importance of involving users throughout the improvement experience, with a user-centered focus. This proposal is important because the process can be implemented in any organization that performs UX evaluations. Likewise, a software platform can be implemented to achieve the automation and standardization of this type of process.

Freddy Paz, Adrian Lecaros, Fiorella Falconi, Alejandro Tapia, Joel Aguirre, Arturo Moquillaza
Chapter 43. Description and Verification of Systolic Array Parallel Computation Model in Synchronous Circuit Using LOTOS

A systolic array is a parallel computing model: a system in which multiple cells that perform simple operations, such as addition, are arranged in one or two dimensions so that the computation is performed by the system as a whole. The Design W1 systolic array is an architecture in which the cells are arranged in one dimension and a convolution is computed by the entire system. In this study, we present a LOTOS description of the Design W1 systolic array. The cells are described at the register-transfer level, and a synchronous circuit model with a global clock is created by connecting multiple cells. Describing the systolic array in the LOTOS language makes it possible to verify its operation, and we verify the described circuit model using the LOTOS model checker.

Yuya Chiba, Katsumi Wasaki
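
Since LOTOS itself is less widely known, the computation model being specified can be illustrated with a small cycle-accurate Python simulation of a one-dimensional systolic convolution. The broadcast-input cell design below is a simplified stand-in for Design W1, not the authors' LOTOS description.

```python
# Cycle-accurate simulation of a 1-D systolic array computing
# y[i] = sum_j w[j] * x[i+j]. Weights are stationary; each input is broadcast
# on a global clock tick; partial sums shift one cell per tick.
def systolic_convolution(w, x):
    k = len(w)
    regs = [0] * k                      # each cell's clocked output register
    y = []
    for t, xt in enumerate(x):          # one global clock tick per input
        # all cells fire together: cell j reads its left neighbour's
        # *previous* register value, adds its own product, and latches
        regs = [(regs[j - 1] if j > 0 else 0) + w[j] * xt for j in range(k)]
        if t >= k - 1:                  # pipeline full: cell k-1 emits y[t-k+1]
            y.append(regs[k - 1])
    return y

print(systolic_convolution([1, 2], [3, 4, 5]))   # -> [11, 14]
```
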
Chapter 44. A Virtual Reality Mining Training Simulator for Proximity Detection

Many applications in industrial mining rely on large, manually operated trucks to transport materials around the mine. These trucks are often enormous, with very limited visibility for the driver. The combination of limited visibility and substantial vehicle weight is a recipe for accidents resulting in severe property destruction or even loss of human life. To address these issues, we implement a simulation of a LIDAR-based proximity collision detection system that notifies drivers of an imminent collision with something (or someone) outside their field of view. We have developed a virtual reality training simulation to help train miners to use this new system. Our simulation focuses on two main participant types: truck drivers and ground workers. This separation allows both parties to gain experience operating around each other in a low-risk virtual environment. Our final result aggregates useful features from a variety of past works and enhances them with more immersive input and output devices.

Erik Marsh, Joshua Dahl, Alireza Kamran Pishhesari, Javad Sattarvand, Frederick C. Harris Jr.
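
The proximity logic being simulated reduces, at its core, to thresholding distances over LIDAR returns. The sketch below illustrates that idea; the frame convention, names, and thresholds are illustrative assumptions rather than the authors' implementation.

```python
# Minimal proximity check over a LIDAR point cloud in the truck's own frame.
# warn_m and danger_m are made-up radii for illustration only.
import numpy as np

def proximity_alert(points, warn_m=10.0, danger_m=4.0):
    """points: (N, 3) array of LIDAR returns (metres, truck-centred frame)."""
    dists = np.linalg.norm(points[:, :2], axis=1)   # ground-plane distance
    nearest = dists.min() if len(dists) else np.inf
    if nearest < danger_m:
        return "DANGER", nearest
    if nearest < warn_m:
        return "WARNING", nearest
    return "CLEAR", nearest

cloud = np.array([[12.0, 1.0, 0.2], [3.5, -0.5, 0.1]])
print(proximity_alert(cloud))   # -> ('DANGER', 3.5355...)
```
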
Chapter 45. A Performance Analysis of Different MongoDB Consistency Levels

MongoDB is a NoSQL database management system known for its good performance on clusters of computers. In addition, MongoDB offers distinct consistency levels that can be chosen at the moment a database operation is performed. The work described in this paper investigates MongoDB's performance under different consistency levels. More specifically, the throughput and runtime of MongoDB operations are evaluated on workloads provided by the YCSB+T benchmark, an extension of the YCSB benchmark. The performance tests covered two of the consistency settings offered by MongoDB: (i) w:1, j:true with readConcern (rc):local, and (ii) w:majority, j:true with rc:local. The conclusion reached is that better performance is achieved with the w:1, j:true, readConcern:local configuration. As a result, if higher performance is needed and strong data consistency is not an essential requirement, a write concern that acknowledges writes on the primary node only is a good solution.

Caio Lazarini Morceli, Valeria Cesario Times, Ricardo Rodrigues Ciferri
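
The two consistency settings compared in the chapter can be expressed directly in PyMongo, as the sketch below shows; the connection string and collection names are placeholders.

```python
# The two write/read concern configurations under test, expressed in PyMongo.
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern
from pymongo.read_concern import ReadConcern

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["ycsb"]

# Setting 1: acknowledge on the primary only, journaled, local reads.
fast = db.get_collection(
    "usertable",
    write_concern=WriteConcern(w=1, j=True),
    read_concern=ReadConcern("local"))

# Setting 2: acknowledge on a majority of nodes, journaled, local reads.
safe = db.get_collection(
    "usertable",
    write_concern=WriteConcern(w="majority", j=True),
    read_concern=ReadConcern("local"))

fast.insert_one({"_id": "user1", "field0": "value"})
safe.insert_one({"_id": "user2", "field0": "value"})
```
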

Potpourri II

Frontmatter
Chapter 46. Information Extraction and Ontology Population Using Car Insurance Reports

In the automotive field, vehicle transportation can cause different types of damage to the bodywork while vehicles are in transit. Assessing these damages is one of the most difficult tasks, because many assessment criteria apply, depending on the type and severity of the damage and on the part that was damaged. The reports describing the damage are not used to perform analyses or to build predictive models, and often are not even stored in a database. This data goes unused because there is no ontology for modeling insurance reports and no system for extracting useful information from them. In this paper, we define an ontology for modeling car damage based on the knowledge of insurance experts and their description reports. We then populate it with information extracted by an information extraction (IE) system based on named entity recognition (NER). The experiments were performed on a real dataset of insurance reports, and we compare several NER models (BERT, BiLSTM-CRF, and CRF). The reported results show that CRF outperforms the other NER algorithms, achieving the highest scores on all metrics in most cases.

Hamid Ahaggach, Lylia Abrouk, Eric Lebon
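
A minimal sketch of a CRF-based NER tagger, the model family the chapter finds strongest, is shown below using sklearn-crfsuite. The one-sentence training pair and the damage/part/severity label set are invented placeholders, not the authors' data.

```python
# A toy CRF NER tagger over car-damage text. Features, labels, and the single
# training sentence are illustrative assumptions.
import sklearn_crfsuite

def word_features(sent, i):
    w = sent[i]
    return {"lower": w.lower(), "is_digit": w.isdigit(),
            "prev": sent[i - 1].lower() if i > 0 else "<s>",
            "next": sent[i + 1].lower() if i < len(sent) - 1 else "</s>"}

sents = [["deep", "scratch", "on", "rear", "bumper"]]
tags = [["B-SEV", "B-DMG", "O", "B-PART", "I-PART"]]

X = [[word_features(s, i) for i in range(len(s))] for s in sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, tags)
print(crf.predict(X))
```
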
Chapter 47. Description of Restricted Object Reservation System Using Specification and Description Language VDM++

When designing a system from a specification written in natural language, gaps in understanding and ambiguities inherent to natural language arise. Formal methods address these problems. The objective of this study is to describe a system using VDM++, a formal specification and description language and one of the formal methods. The target system is a “Restricted object reservation system,” where a restricted object is an object that carries a deadline or other constraint. Adding restricted objects to a conventional reservation system complicates the description. Therefore, in this study, UML diagrams are created to clarify the interface of each role in the reservation system. In addition, since the reservation system lives in cyberspace while the restricted objects exist in the physical world, the connection between the two spaces is considered based on the idea of cyber-physical systems (CPS). Based on these two ideas, we develop the VDM++ description of a “Restricted object reservation system.”

Aoto Makita, Katsumi Wasaki
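
The “restricted object” idea can be mirrored in a short executable analogue. The chapter's actual artefact is a VDM++ specification, so the Python sketch below only illustrates the deadline invariant; all names are chosen for illustration.

```python
# A Python analogue of a restricted object's deadline invariant (the real
# specification in the chapter is written in VDM++, not Python).
from dataclasses import dataclass
from datetime import datetime

@dataclass
class RestrictedObject:
    name: str
    usable_until: datetime          # the object's deadline constraint

@dataclass
class Reservation:
    obj: RestrictedObject
    start: datetime
    end: datetime

    def __post_init__(self):
        # invariant: a reservation may not extend past the object's deadline
        if self.end > self.obj.usable_until:
            raise ValueError("reservation violates the object's deadline")

room = RestrictedObject("meeting-room", datetime(2023, 3, 31, 18, 0))
Reservation(room, datetime(2023, 3, 31, 9, 0),
            datetime(2023, 3, 31, 17, 0))   # ok; ending at 19:00 would raise
```
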
Chapter 48. Applying Scrum in Interdisciplinary Case Study Projects for Literacy in Fluency Analysis

During the 2021 calendar year, an interdisciplinary problem-based learning case study was performed at the Aeronautics Institute of Technology (ITA, Brazil). It was divided into two projects, FAL-BD (33 graduate and undergraduate students, dealing with database aspects) and FAL-RT (13 graduate and undergraduate students, dealing with real-time issues and the development of a hardware prototype), carried out in the first and second semesters of the 2021 academic year, respectively. The case study comprised 46 students from seven Computer Science disciplines: three in the first semester and four in the second. In the first semester, the students conceptualized, modeled, and developed part of a distributed database system; in the second, they added real-time capabilities to it and also developed a Personal Digital Assistant (PDA) prototype for collecting audio. To achieve this goal, the following technologies were employed: data systems, artificial intelligence, Internet of Things, machine learning, and real-time embedded systems. The final system was based on a similar project under development by the Brazilian Ministry of Education, which intends to automatically analyze the reading fluency of elementary school children. The two projects described in this paper deal with the creation of a database and the evaluation of its real-time aspects. The audio recordings, collected via the locally developed PDA (possibly in offline mode), must be stored on a server for later automatic analysis using machine learning and artificial intelligence. Each project was completed in 12 weeks during its academic semester, and the Scrum framework was applied to manage both. This paper's major contribution is the use of an agile method (Scrum) for testing, managing, and developing the case study, which resulted in a literacy-in-fluency-analysis system, including a working hardware prototype for audio collection.

Matheus Silva Martins Mota, Breslei Max Reis da Fonseca, Gildarcio Sousa Goncalves, Jean Claudio de Souza, Vitor Eduardo Sabadine da Cruz, Odair Oliveira de Sá, Adilson Marques da Cunha, Luiz Alberto Vieira Dias, Lineu Fernando Stege Mialaret, Johnny Cardoso Marques
Chapter 49. A Demographic Model to Predict Arrests by Race: An Exploratory Approach

Racial discrimination remains a critical issue in the United States. This study therefore employs a logistic regression model to predict the race of arrested individuals from demographic variables. Using a secondary open dataset from the City of Albany's official website covering November 2020 to September 27, 2021, the results revealed that the variables with the greatest relative influence on the race of an arrested individual are age, neighborhood, and sex, at 80%, 12%, and 8%, respectively. The model results showed that a white male between 18 and 25 years of age is significantly less likely to be arrested in a predominantly white neighborhood than a black male. Limitations of the study include the small sample size (10 months of historical data) and underspecification of the logistic regression model due to the exclusion of one or more relevant independent variables.

Alice Nneka Ottah, Yvonne Appiah Dadson, Kevin Matthe Caramancion
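
The modelling step, logistic regression over demographic predictors, looks roughly as follows in scikit-learn. The tiny synthetic frame is a placeholder, not the City of Albany dataset, and the feature encoding is an assumption.

```python
# Logistic regression over categorical demographics plus age; synthetic rows
# stand in for the real arrest records.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "age": [19, 24, 33, 41, 22, 57],
    "sex": ["M", "M", "F", "M", "F", "M"],
    "neighborhood": ["A", "B", "A", "C", "B", "C"],
    "race": ["white", "black", "white", "black", "white", "black"],
})

pre = ColumnTransformer([("cat", OneHotEncoder(), ["sex", "neighborhood"])],
                        remainder="passthrough")   # age passes through as-is
model = Pipeline([("pre", pre), ("lr", LogisticRegression(max_iter=1000))])
model.fit(df[["age", "sex", "neighborhood"]], df["race"])
print(model.predict_proba(df[["age", "sex", "neighborhood"]])[:2])
```
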
Chapter 50. An Efficient Approach to Wireless Firmware Update Based on Erasure Correction Coding

Updating the firmware of the deployed nodes in a large sensor network can be time-consuming and challenging. One way to perform these updates is through an over-the-air (OTA) protocol that relies on repeated rounds of firmware broadcasting. In this work, we propose to use erasure correction coding to improve the efficiency of the existing protocol. With the new method, receivers recover lost packets from received ones instead of waiting for repeated transmissions. We implemented the proposed protocol and compared it with the existing method. Both theoretical analysis and experimental results demonstrate the advantages of the new approach. In our experiment, when the packet loss ratio was 27%, the new method achieved a 99% success rate on firmware transmissions, while the existing approach failed.

Berk Kivilcim, Daniel Zhou, Zhijie Shi, Kaleel Mahmood
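
The principle behind the protocol can be shown with a toy single-parity erasure code: split the firmware into k packets, append one XOR parity packet, and rebuild any single lost packet from the survivors. The paper's coding scheme can tolerate more losses; this sketch only shows the idea.

```python
# Toy erasure code: k data packets plus one XOR parity packet. Any single
# lost packet equals the XOR of all surviving packets.
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(firmware: bytes, k: int):
    size = -(-len(firmware) // k)                     # ceiling division
    pkts = [firmware[i * size:(i + 1) * size].ljust(size, b"\0")
            for i in range(k)]
    parity = pkts[0]
    for p in pkts[1:]:
        parity = xor_bytes(parity, p)
    return pkts + [parity]                            # k data + 1 parity

def recover(pkts, lost_index):
    """Rebuild the packet at lost_index by XOR-ing all surviving packets."""
    survivors = [p for i, p in enumerate(pkts) if i != lost_index]
    out = survivors[0]
    for p in survivors[1:]:
        out = xor_bytes(out, p)
    return out

packets = encode(b"firmware-image-bytes", k=4)
assert recover(packets, lost_index=2) == packets[2]   # lost packet restored
```
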
Chapter 51. Complex Network Analysis of the US Marine Highway Network

We model the US marine highway network (MHN) as an undirected graph of nodes (marine highways) and edges (two marine highways have an edge between them if one or more intermodal ports lie on both highways). We focus on the 16 marine highways spanning the Gulf Coast, East Coast, and Midwest regions of the US. We present a comprehensive evaluation of the MHN graph with respect to the four major node centrality metrics (degree, eigenvector, betweenness, and closeness), a node centrality tuple (the neighborhood-based bridge node centrality tuple), an edge centrality metric (betweenness), community detection, and k-core decomposition. We also present a quantitative comparison of the MHN with road, rail, and air transportation networks with respect to a suite of network-level metrics such as the assortativity index, bipartivity index, algebraic connectivity, randomness index, and spectral radius ratio for node degree. To the best of our knowledge, ours is the first work in the Network Science literature to model and evaluate the US MHN using algorithms and metrics for complex network analysis.

Natarajan Meghanathan
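
An analysis of this kind is straightforward to reproduce with NetworkX, as the sketch below shows; the five-edge graph and its highway labels are placeholders, not the actual 16-highway MHN.

```python
# Centrality and structural metrics of the kind used in the chapter,
# computed with NetworkX on a placeholder marine-highway graph.
import networkx as nx

G = nx.Graph([("M-10", "M-65"), ("M-10", "M-49"), ("M-65", "M-55"),
              ("M-55", "M-35"), ("M-49", "M-55")])

print("degree       :", nx.degree_centrality(G))
print("eigenvector  :", nx.eigenvector_centrality(G))
print("betweenness  :", nx.betweenness_centrality(G))
print("closeness    :", nx.closeness_centrality(G))
print("assortativity:", nx.degree_assortativity_coefficient(G))
print("k-cores      :", nx.core_number(G))
```
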
Chapter 52. Directed Acyclic Networks and Turn Constraint Paths

We address algorithmic problems in constructing turn-constrained paths in a two-dimensional Euclidean network. It is known that a shortest path tree (SPT) suffices to capture shortest paths from the source vertex to all other vertices. Under a turn-limit requirement, however, an SPT cannot capture shortest paths from the source vertex to all other vertices. We show that a directed acyclic graph (DAG) can contain such shortest paths, and we describe how to modify the standard Dijkstra shortest path algorithm so that turn-constrained shortest paths can be constructed efficiently.

Laxmi Gewali, Sabrina Wallace
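
The modification the chapter describes amounts to running Dijkstra over states that carry the incoming direction and a turn count, pruning states that exceed the turn budget. The sketch below is one such formulation under simplified geometric assumptions, not the authors' algorithm verbatim.

```python
# Dijkstra over (vertex, incoming heading, turns-used) states with a turn
# budget. The "different heading = one turn" test is a simplification.
import heapq, itertools, math

def turn_constrained_dijkstra(coords, edges, src, max_turns):
    adj = {}
    for u, v in edges:                      # undirected network
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    def heading(u, v):
        (x1, y1), (x2, y2) = coords[u], coords[v]
        return round(math.degrees(math.atan2(y2 - y1, x2 - x1)))

    def length(u, v):
        (x1, y1), (x2, y2) = coords[u], coords[v]
        return math.hypot(x2 - x1, y2 - y1)

    tie = itertools.count()                 # heap tie-breaker
    best = {}                               # (node, heading, turns) -> dist
    dist = {src: 0.0}
    pq = [(0.0, next(tie), src, None, 0)]
    while pq:
        d, _, u, h_in, turns = heapq.heappop(pq)
        for v in adj.get(u, []):
            h_out = heading(u, v)
            t = turns + (h_in is not None and h_out != h_in)
            if t > max_turns:
                continue                    # prune: over the turn budget
            state, nd = (v, h_out, t), d + length(u, v)
            if nd < best.get(state, math.inf):
                best[state] = nd
                dist[v] = min(dist.get(v, math.inf), nd)
                heapq.heappush(pq, (nd, next(tie), v, h_out, t))
    return dist

coords = {"s": (0, 0), "a": (1, 0), "b": (1, 1), "t": (2, 1)}
print(turn_constrained_dijkstra(
    coords, [("s", "a"), ("a", "b"), ("b", "t")], "s", max_turns=2))
```
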
Backmatter
Metadata
Title
ITNG 2023 20th International Conference on Information Technology-New Generations
Editor
Shahram Latifi
Copyright Year
2023
Electronic ISBN
978-3-031-28332-1
Print ISBN
978-3-031-28331-4
DOI
https://doi.org/10.1007/978-3-031-28332-1
