
2024 | Book

ITNG 2024: 21st International Conference on Information Technology-New Generations


About this book

This volume represents the 21st International Conference on Information Technology - New Generations (ITNG), 2024. ITNG is an annual event focusing on state-of-the-art technologies pertaining to digital information and communications. The applications of advanced information technology to such domains as astronomy, biology, education, geosciences, security, and health care are among the topics of relevance to ITNG. Visionary ideas, theoretical and experimental results, as well as prototypes, designs, and tools that help information flow readily to the user are of special interest. Machine Learning, Robotics, High Performance Computing, and Innovative Methods of Computing are examples of related topics. The conference features keynote speakers, a best student award, a poster award, a service award, a technical open panel, and workshops/exhibits from industry, government, and academia. This publication is unique in that it captures modern trends in IT with a balance of theoretical and experimental work. Most other works focus on either theory or experiment, but not both. Accordingly, we do not know of any competing literature.

Table of Contents

Frontmatter

AI and Robotics

Frontmatter
Projecting Elliott Patterns in Different Degrees of Waves for Analyzing Financial Market Behavior

In the financial market, investors rely on technical and fundamental indicators to estimate the price behavior of an asset and reduce investment risks. Indicators from time series of historical prices are widely used, since past behavior is expected to have a high probability of reflecting itself in future behavior. In this sense, Elliott waves can be used for this purpose, since they can describe the patterns and relationships in such historical data. The rules to identify Elliott waves are well-defined; the challenge remains in projecting patterns with different time frames. Some studies consider Elliott waves for pattern prediction but do not consider how the pattern will be formed. This paper presents a way to project different patterns of lower degrees onto waves of a higher-degree pattern. The solution is a modular model that uses Fibonacci proportions from wavelengths inside the patterns and thus chooses the pattern that is most likely to happen again, considering the type of pattern desired. The results show that the model is accurate when patterns of different degrees are projected.
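
As a brief, hypothetical illustration of the Fibonacci-proportion idea (not the authors' code), the sketch below scores a candidate wave pattern by how closely its sub-wave length ratios match common Fibonacci proportions; the pivot prices and ratio set are assumptions.

```python
# Illustrative only: checks how closely the sub-wave lengths of a candidate
# pattern match common Fibonacci proportions (a score of this kind could be
# used to rank patterns before projecting them onto higher-degree waves).

FIB_RATIOS = (0.382, 0.5, 0.618, 1.0, 1.618, 2.618)

def wave_lengths(pivots):
    """Absolute price lengths of each wave between consecutive pivots."""
    return [abs(b - a) for a, b in zip(pivots, pivots[1:])]

def fibonacci_score(pivots):
    """Mean distance of each wave-to-first-wave ratio from its nearest
    Fibonacci ratio; lower means a better-structured pattern."""
    lengths = wave_lengths(pivots)
    base = lengths[0]
    ratios = [length / base for length in lengths[1:]]
    return sum(min(abs(r - f) for f in FIB_RATIOS) for r in ratios) / len(ratios)

# Hypothetical pivot prices of a five-wave impulse:
print(fibonacci_score([100.0, 110.0, 106.2, 122.4, 116.2, 126.2]))
```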

Rafael Ribeiro dos Santos, Vanderlei Bonato, Geraldo Nunes Silva
An Integration of Blockchain Web3 and Robotic Process Automation for Property Ownership Traceability

The concept of traceability in property ownership pertains to the establishment of a comprehensive information trail that encompasses transaction history and the associated property ownership data. The challenges inherent in managing property ownership information stem from the complexity of real estate transactions. The properties not only involve ownership status but also encompass critical records, including land reports, records of prior disputes, risk assessments, and the processes related to the property registry. Consequently, ensuring the traceability of property records, mitigating the risks of document forgery, and addressing the potential for erroneous data input and assessment become intricate tasks. Developing a traceable property ownership management system and enhancing the veracity of property ownership records assume paramount importance. Such measures are pivotal for mitigating moral hazards, reducing the occurrence of faulty input, elevating transparency, managing the risks associated with erroneous assessments, and streamlining business processes. In recent years, various traceability mechanisms for ownership have been proposed. Despite the disruptive impact of blockchain technology on enhancing these systems, automated traceability systems, particularly tailored to property ownership management, remain conspicuously absent in the current academic literature. This paper employs a design science research approach to illustrate how an integration framework uniting robotic process automation and blockchain technology can effectively address the issues concerning available ownership traceability systems and related technologies. To this end, we develop and implement a system framework that connects robotic process automation with blockchain Web3 technology. While the proposed solution is rooted in the specific context of property ownership, we explore its potential for generalization to a broader application domain.

Mina Cu, Johnny Chan, Gabrielle Peko, David Sundaram
Early Identification of Conflicts in the Chilean Fisheries and Aquaculture Sector Via Text Mining and Machine Learning Techniques

This project tackles challenges in Chile’s fishing and aquaculture sector, vital for both economic and social reasons. Aquaculture entrepreneurs and fishermen, spanning industrial to artisanal, depend on Chile’s rich hydrobiological resources. They face issues due to regulatory limits, designed to preserve species and ecosystem balance. The objective is to assess machine-learning algorithms with text mining to create an AI model. This model will help the Undersecretariat of Fisheries and Aquaculture anticipate conflicts via early alerts. The study employed the CRISP-DM methodology, focusing on a Neural Networks-based model, specifically the Multilayer Perceptron. This research surpassed its initial hypothesis, which aimed for 70% accuracy in conflict classification. The model processed natural language from electronic media and Twitter, achieving 81.50% precision in conflict prediction. As a result, managers can now make more informed decisions to preemptively address conflicts with greater confidence.

Mauricio Figueroa Colarte
Enhanced AES for Securing Hand Written Signature Using Residue Number System

The vulnerability of the symmetric Advanced Encryption Standard (AES) algorithm, which uses a shared secret key, to threats such as timing attacks has called for modifications that retain its potency and effectiveness. The Residue Number System (RNS) has good potential in applications where speed and power consumption are very significant, thanks to its carry-free propagation. To this effect, this paper focuses on the security of digital images based on handwritten offline digital signature samples from the CEDAR signature dataset by enhancing the AES encryption algorithm with an RNS of three moduli, which yields three lightweight image shares termed residues. The study presents a hybrid method that converts the encrypted images from the AES algorithm into residues using a forward conversion technique with the moduli set {2^n − 2, 2^n − 1, 2^n + 1}. The authentication stage converts the residues back to the encrypted signature using residue-to-binary (R/B) conversion via the Chinese Remainder Theorem (CRT) by performing modular operations. Furthermore, the decryption stage applies the inverse of the encryption procedure. The AES + RNS model secures the digital signature images more strongly and further optimizes computation time, memory, and throughput on both the encryption and decryption sides, with no constraint on image width or size, together with the dimensionality reduction and low power consumption advantages of RNS.
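
As a rough sketch of the RNS machinery described above (assuming the stated moduli set with n chosen even so the moduli are pairwise coprime; the values are illustrative, not the paper's implementation):

```python
# A minimal sketch of the RNS steps: forward conversion of an encrypted byte
# into residues for the moduli set {2^n - 2, 2^n - 1, 2^n + 1} (n even, so the
# moduli are pairwise coprime), and residue-to-binary reconstruction via CRT.
from math import prod

n = 4
MODULI = (2**n - 2, 2**n - 1, 2**n + 1)   # (14, 15, 17), dynamic range 3570

def to_residues(x, moduli=MODULI):
    """Forward conversion: one lightweight 'share' per modulus."""
    return tuple(x % m for m in moduli)

def from_residues(residues, moduli=MODULI):
    """Residue-to-binary conversion via the Chinese Remainder Theorem."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)      # modular inverse of Mi mod m
    return x % M

cipher_byte = 0xA7                         # e.g., one byte of AES output
shares = to_residues(cipher_byte)
assert from_residues(shares) == cipher_byte
print(shares)
```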

Ifedotun Roseline Idowu, Bamidele Samson Alobalorun, Abubakar Abdulsalam
Using GPT-4 to Tutor Technical Subjects in Non-English Languages in Africa

According to the Organisation for Economic Co-operation and Development, education leads to economic growth and enhanced employment opportunities. In the developing world, however, there are a number of barriers to education, including high student-teacher ratios, long distances to schools, and students being forced to study in a second or even a third language (non-home-language education). This paper looks at how GPT-4 (the API, or application programming interface, behind the popular ChatGPT website) could be used to tutor technical subjects such as mathematics and Java programming in non-English and non-European languages in Africa. Tutoring bots were developed which could be accessed using WhatsApp on students’ cell phones and by using normal web browsers. This paper describes a number of such projects tutoring mathematics and Java programming in Afrikaans, Amharic, Arabic, Kiswahili, and isiZulu.
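
A hedged sketch of the kind of GPT-4 API call such a tutoring bot could make; the system prompt, model name, and language choice are illustrative assumptions, not the author's exact configuration. It requires the openai package and an API key.

```python
# Sketch only: a tutoring helper that asks GPT-4 to answer in the student's
# home language. A WhatsApp or web front end would forward messages here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def tutor_reply(student_message: str, language: str = "Kiswahili") -> str:
    """Ask GPT-4 to tutor Java programming in the given language."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": f"You are a patient Java programming tutor. "
                        f"Always answer in {language}."},
            {"role": "user", "content": student_message},
        ],
    )
    return response.choices[0].message.content
```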

Laurie Butgereit

Cybersecurity

Frontmatter
A Practical Survey of Data Carving from Non-Functional Android Phones Using Chip-Off Technique

Mobile devices have become ubiquitous in modern life and are increasingly used for communication, entertainment, and personal data storage. Due to their prevalence, they have become essential sources of evidence in criminal investigations. Android-based mobile phones account for almost two-thirds of the market share, making data extraction from these devices necessary in criminal investigations. However, data extraction methods become limited when the phones are non-functional, which is often the case when phones are seized during investigations. This paper presents a survey of forensic data carving techniques from non-functional Android mobile devices. The study aims to evaluate a chip-off data carving technique for extracting and analyzing data from non-functional devices using a four-step methodology, including disassembly, chip-off, image acquisition, and data carving. The methodology highlights the critical steps and precautions that must be followed to ensure proper data carving. Additionally, the study compares various tools used for data carving. The study highlights the challenges involved in the forensic analysis of mobile devices, such as device encryption, and provides recommendations for future research in this area. Overall, this study contributes to the growing body of knowledge on digital forensics and provides a valuable resource for investigators working with Android mobile devices in criminal investigations.
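
As a toy illustration of the final data-carving step (signature-based scanning over a raw chip image; not tied to any specific tool from the survey, and real carvers also handle fragmentation and many more file types):

```python
# Scan a raw chip-off image for JPEG start/end markers and save each
# recoverable file. Illustrative only; the input file name is hypothetical.
SOI, EOI = b"\xff\xd8\xff", b"\xff\xd9"   # JPEG start/end-of-image markers

def carve_jpegs(image_path: str) -> int:
    data = open(image_path, "rb").read()
    count, pos = 0, 0
    while (start := data.find(SOI, pos)) != -1:
        end = data.find(EOI, start)
        if end == -1:
            break
        with open(f"carved_{count}.jpg", "wb") as out:
            out.write(data[start:end + len(EOI)])
        count, pos = count + 1, end + len(EOI)
    return count

# print(carve_jpegs("chip_dump.bin"))     # hypothetical chip-off image file
```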

Sarfraz Shaikh, Lin Deng, Weifeng Xu
Automated Semantic Role Mining Using Intelligent Role Based Access Control in Globally Distributed Banking Environment

Globally distributed banking environments need proper access control to ensure secure information sharing and collaboration among employees. Although Role Based Access Control (RBAC) is widely used, it falls short in adapting to the dynamic nature of global IT systems and complex real-world business roles. This work is an implementation of our previously proposed Intelligent Role Based Access Control (I-RBAC) model, which utilizes intelligent software agents to automate the semantic role mining process in a globally distributed collaborative banking environment, where employees are expected to play multiple roles at multiple times based upon their profiles and assigned tasks, under the limitations of the provided banking policies. The information about roles, permissions, policies, and constraints is encapsulated in ontologies, and intelligent agents automatically extract semantic knowledge from employees’ profiles and banking policies that are shared as text files. Subsequently, automated semantic role mining is achieved through agent-driven reasoning. Experimental results have demonstrated promising accuracy in the mined roles.

Rubina Ghazal, Nauman Qadeer, Hasnain Raza, Ahmad Kamran Malik
Software Bill of Materials (SBOM) Approach to IoT Security Vulnerability Assessment

This paper presents a study of the security vulnerabilities surrounding Internet of Things (IoT) devices, and how these vulnerabilities can be detected and analyzed utilizing the Software Bill of Materials (SBOM). This methodology allows a user to gain more information about a device than what was available before using tools such as an automated vulnerability scanner. Compared to the information available from current popular security vulnerability scanners, the information gathered from the SBOM approach allows a user to have far more insight into a device’s vulnerabilities and composition. This study emphasizes the importance of the SBOM and how it can be used to assess such security vulnerabilities on a deeper level than automated scanners. In this study, we compare the security vulnerability assessment capabilities of three different methods: NetRise, Tenable OT Security, and the free National Vulnerability Database (NVD) provided by the National Institute of Standards and Technology. NetRise is the method that will be used to demonstrate the capabilities of SBOM security. Tenable OT Security is a traditional vulnerability scanner. The last method used is referencing the NVD. This is the U.S. government repository of vulnerability management data. Limitations and deficiencies of the SBOM approach to security analysis are also addressed throughout the study.
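
As a hedged sketch of the NVD-referencing method (not the NetRise or Tenable workflows): read component names from a CycloneDX SBOM and query NIST's public CVE API for each. The endpoint and JSON fields follow the NVD API 2.0 and CycloneDX formats; the file name is hypothetical, and rate limits apply without an API key.

```python
# Cross-reference SBOM components against the National Vulnerability Database.
import json
import requests

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def vulnerabilities_for_sbom(sbom_path: str):
    with open(sbom_path) as f:
        sbom = json.load(f)
    report = {}
    for component in sbom.get("components", []):
        name, version = component.get("name"), component.get("version", "")
        resp = requests.get(NVD_URL,
                            params={"keywordSearch": f"{name} {version}"})
        resp.raise_for_status()
        cves = [v["cve"]["id"] for v in resp.json().get("vulnerabilities", [])]
        report[f"{name} {version}"] = cves
    return report

# print(vulnerabilities_for_sbom("firmware.cdx.json"))  # hypothetical SBOM
```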

James Bonacci, Reese Martin
AI-Assisted Pentesting Using ChatGPT-4

Artificial Intelligence (AI) technologies have been experiencing rapid developments and applications in various fields including Cybersecurity to improve efficiency, productivity, and accuracy. Penetration testing (pentesting) is a critical step in cyber defense to utilize authorized offensive tools and simulated attacks to uncover security vulnerabilities to be used for cybersecurity risk assessment and mitigation. Pentesting steps often include reconnaissance, scanning, knowledge discovery, data analysis, and queries of large amounts of information to detect meaningful threats and vulnerabilities, which could use the help of interactive AI tools, such as ChatGPT. However, AI tools like ChatGPT are still evolving with limitations and challenges for applications. This study conducts simulation tests based on a limited AI-Assisted pentesting model for security knowledge discovery using interactive ChatGPT-4 powered by Large Language Models (LLMs). The purpose of this research is to discover and demonstrate the role and value of AI in planning and conducting pentesting. This study utilizes a VMWare-based network of virtual machines for simulated network attacks and ChatGPT-4 for training and answering prompts on pentesting questions of interest. This research will also discuss limitations of using AI technologies in pentesting and suggestions for the future.

Ping Wang, Hubert D’Cruze
Combining Cyber Security and Data Science: A Cutting-Edge Approach for Public Health Education Masters

This abstract summarizes the need to integrate data science and cyber security into Master of Public Health (MPH) degree programs. This approach attempts to provide MPH students with a broad skill set that includes both data-driven decision-making and strong cyber security measures, acknowledging the dynamic nature of public health concerns. By merging these two crucial fields, the curriculum aims to enable upcoming public health professionals to manage the challenges of protecting private health information while utilizing modern analytics to make well-informed decisions. This innovative approach highlights the program’s dedication to staying at the forefront of technological innovation in the pursuit of excellence in public health, while also meeting the rising need for professionals with interdisciplinary skills.

Maurice Dawson, Elizabeth Omotoye
Integrating Intelligence Paradigms into Cyber Security Curriculum for Advanced Threat Mitigation

As the cyber domain continues to develop, it becomes increasingly crucial to provide cybersecurity workers with state-of-the-art competencies. This scholarly endeavor investigates the incorporation of intelligence paradigms into the curriculum of cyber security, with the objective of augmenting the skills and expertise of professionals in the field of advanced threat mitigation. The endeavor entails a collective task in which participants work together to conceive, construct, and execute a bespoke open-source intelligent application utilizing a prominent high-level programming language, specifically Python. The application is equipped with a comprehensive Graphical User Interface (GUI) and has the capacity to retrieve and analyze data from at least four different sources or Application Programming Interfaces (APIs). The project entails the development of a thorough project strategy, risk management plan, earned value sheet, and project management plan. Moreover, the participants actively partake in thorough data analysis of the acquired material, subsequently showcasing their findings to illustrate the tangible implementation of intelligence-based insights within the realm of cybersecurity. The complete codebase is effectively managed on the GitHub platform, thereby ensuring meticulous version control and promoting a conducive environment for collaborative development. This initiative not only improves technical skills but also fosters a comprehensive comprehension of intelligence integration in the field of cyber protection.

Maurice Dawson
Simulation Tests in Anti-phishing Training

Phishing is a form of social engineering attack that targets and exploits human vulnerabilities using such tactics as reciprocation, social proof, authority, and scarcity for psychological and behavioral manipulation. Phishing has been a primary factor and starting point that leads to the majority of cyber-attacks. The volume of phishing attacks, along with the resulting financial losses, has been growing fast. As the human factor is the weakest link in phishing attacks, effective anti-phishing training of human users is a significant research topic and critical to the security of organizational and individual data and other digital assets. Behavioral research shows that simulations are an acceptable and effective way of education and training for cognitive behavioral therapy (CBT) and human awareness and response improvement. This research presents an adapted behavioral training framework that employs simulation tests to address the psychological factors of phishing for phishing awareness and response improvement. This research also contributes empirical data and performance results on employee anti-phishing awareness and response training using simulated phishing tests from a case study of an organization.

Peyton Lutchkus, Ping Wang, Jim Mahony
Where Is Our Data and How Is It Protected?

As humans, we commit to using our information with enthusiasm and often based on an expectation of return; categorized as the “Reciprocity Standard” of the “social exchange theory.” Additionally, we infrequently honor the request to use information with alacrity, acutely aware of the associated risks—the risk to our privacy, often discussed as the “Privacy Paradox.” This privacy risk has increased as we become more connected through technology and has been the center of the daily news with events like the recent Equifax data breach. These increasing events resulted in a huge public outcry with little discussion of the root cause. This paper will attempt to address the root cause of these frequent and increasing breaches by examining literature showing an anthology of human actions authorizing the use of information throughout our very existence. The paper will shift the focus of attention from bewilderment with the constant occurrence of data breaches to a sanitizing view of contributors, such as the actions of humans. Because this subject is vast and has a worldwide impact across cultures, its scope will be limited to an examination of the individual actions related to the data breach and a focus on our actions to minimize future occurrences. The paper will begin with a review of information sharing throughout our civilization as a background and progress with the mechanism used to support this human desire into the current technological era. The paper will discuss the human conscious role in the process of data sharing and our expectation of return along with the protection of privacy. The paper will show how the expectation of privacy evolved to the current level of uncontrollability, suggesting that the path and releasability of data is both unpredictable and uncontrollable—answering the question of whether we know where our data is stored and if we can ever retain ownership.

Kenneth L. Williams
SIV-MAC: An Efficient MAC Scheme

SIV-MAC is a deterministic Message Authentication Code (MAC) built over the efficient universal family of hash functions POLYVAL. Unlike the standardized GMAC, which also uses universal hashing (with GHASH), SIV-MAC does not require a nonce. SIV-MAC is the special case of the nonce-misuse-resistant AEAD named AES-GCM-SIV, instantiated with a 256-bit main key and a fixed 96-bit zero nonce. The authentication tag of a string X is the output of AES-GCM-SIV invoked with an empty message and with X as the Additional Authenticated Data (AAD). This means that SIV-MAC is readily available in libraries that support AES-GCM-SIV, such as BoringSSL and OpenSSL (The OpenSSL Project, OpenSSL: The Open Source Toolkit for SSL/TLS, www.openssl.org, 2003). However, performance can be further improved. We show here how tagging messages can reach an asymptotic performance of 0.3 cycles per byte. Finally, we explain why a key can be used for safely processing $2^{50}$ bytes before it needs to be rotated.
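
A minimal sketch of the construction as described above: tag a message X by running AES-GCM-SIV with a 256-bit key, a fixed all-zero 96-bit nonce, an empty plaintext, and X as the AAD. This assumes the `cryptography` package (AESGCMSIV is available in recent releases) and that the library accepts an empty plaintext.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCMSIV

ZERO_NONCE = bytes(12)                    # fixed 96-bit zero nonce

def siv_mac(key: bytes, message: bytes) -> bytes:
    """Deterministic 16-byte MAC tag over `message`."""
    assert len(key) == 32                 # 256-bit main key
    # With an empty plaintext, the AEAD output is exactly the auth tag.
    return AESGCMSIV(key).encrypt(ZERO_NONCE, b"", message)

key = os.urandom(32)
tag = siv_mac(key, b"some string X to authenticate")
assert tag == siv_mac(key, b"some string X to authenticate")  # deterministic
print(tag.hex())
```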

Shay Gueron
Speeding Up RSA Signature Verification

In some scenarios that emerge in data centers, one signed certificate is verified a very large number of times. This makes the performance of signature verification a significant target for optimization. With this motivation, we show how to speed up RSA verification in the globally deployed cryptographic library OpenSSL. RSA verification time is dominated by the time to compute $X^e \pmod N$, where $e$ is the public exponent. The default choice is $e = 65537 = 2^{16} + 1$, but obviously, signing with $e = 3$ leads to a faster verification. In both cases $e$ has the form $e = 2^k + 1$ for some $k$, namely $k = 1$ or $k = 16$. We speed up the computation of $X^e \pmod N$ by replacing OpenSSL's call to its modular exponentiation function with a dedicated sequence of Montgomery Multiplication (MM) calls. We also show an algorithm that uses only $k + 2$ MM calls instead of $k + 3$, i.e., 3 instead of 4 for $e = 3$ (and 18 instead of 19 for $e = 65537$). Integrating our method into OpenSSL (version 3.0) and measuring RSA2048 on a "Skylake" processor shows speedups of $1.56\times$ for $e = 3$ and $1.21\times$ for $e = 65537$.
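
An illustrative pure-Python version of the exponentiation being optimized: for $e = 2^k + 1$, compute $X^e \pmod N$ with $k$ squarings and one multiplication. The OpenSSL-level speedups come from doing this with Montgomery multiplication and saving one call, as the abstract describes; this sketch only shows the exponent structure.

```python
def verify_exp(X: int, k: int, N: int) -> int:
    """Compute X^(2^k + 1) mod N, e.g. k=1 for e=3, k=16 for e=65537."""
    y = X % N
    for _ in range(k):
        y = y * y % N          # k modular squarings
    return y * X % N           # one final multiplication by X

N, X = 3233, 1234              # toy modulus; real N is 2048+ bits
assert verify_exp(X, 16, N) == pow(X, 65537, N)
assert verify_exp(X, 1, N) == pow(X, 3, N)
```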

Isaac Elbaz, Shay Gueron
A Comparative Analysis on Ensemble Learning and Deep Learning Based Intrusion Detection Systems over the NCC2 Dataset

Intrusion detection systems (IDS) are effective countermeasures to ensure Internet of Things (IoT) network security. IDS are employed to guarantee data privacy and assess network integrity through the detection of malicious activities at an early stage. An IDS aims to accurately characterize normal traffic patterns and identify behavioral deviations as malicious. This paper presents a comparative study on advanced machine learning (ML) and deep learning (DL) techniques for attack traffic detection and type identification over simultaneous parallel sensors. The performances of four classifiers, including Extreme Gradient Boosting (XGBoost), CNN-LSTM, Autoencoders (AE), and the Multilayer Perceptron (MLP), are compared on the multiclass classification problem of accurately identifying distributed cyber-attack types over the NCC2 dataset. Experimental results reveal that the XGBoost classifier over the RF-based feature set produces accuracy and precision of up to 1, outperforming the other classifiers. The comparative analysis indicates the importance of feature extraction techniques and their ability to affect classifier performance.
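
A hedged sketch of the winning pipeline described above (a random-forest-based feature set feeding an XGBoost multiclass classifier); synthetic data stands in for the NCC2 dataset, which is not bundled with common libraries.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=40, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# RF-based feature selection: keep features above mean importance.
selector = SelectFromModel(RandomForestClassifier(n_estimators=100,
                                                  random_state=0)).fit(X_tr, y_tr)
X_tr_sel, X_te_sel = selector.transform(X_tr), selector.transform(X_te)

# Multiclass attack-type identification with XGBoost.
clf = XGBClassifier(objective="multi:softprob").fit(X_tr_sel, y_tr)
print("accuracy:", clf.score(X_te_sel, y_te))
```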

Soundes Belkacem
Innovative Lightweight Key Agreement Protocol Based on Hyperelliptic Curve for IoT Mutual Authentication

Fog nodes are established as intermediaries between IoT device nodes and the cloud data center to alleviate the latency of data transmission. Fog computing enables low-cost IoT systems to communicate with high-end cloud servers at the network’s edge. As a result, fog systems can conduct data summarization and analysis, resulting in a substantial reduction in request delay for latency-sensitive applications. They can also perform data aggregation, reducing the bandwidth required, which is critical for wireless connection with cloud servers. Due to the huge amount of data exchanged among the three nodes, namely the IoT device, fog, and cloud nodes, many security issues arise, such as how to protect these connected devices from unauthorized access. The challenge is to design a secure mutual authentication protocol that is lightweight for resource-constrained devices. Therefore, IoT-enabled devices demand a lightweight secure cryptography scheme. It is necessary to secure end-to-end communication in IoT applications to avoid information leakage. This study proposes a lightweight authenticated key agreement protocol based on a hyperelliptic curve, which uses a small base field compared with traditional cryptography methods. The proposed protocol saves communication bandwidth and reduces computational complexity. In addition, the proposed protocol can establish a common session key among the three communicating nodes. We analysed and proved the security properties of the proposed protocol informally against known attacks and found that it can resist them. A comparative analysis of our proposed protocol with related works in the contexts of IoT security and performance showed that our work outperformed the protocols of related works by an average improvement of 54%, with lower communication and computation overhead costs.

Mohamad Al-Samhouri, Maher Abur-rous, Nuria Novas
Security Analysis of Drone Communication Methods

The use of drones in various applications has increased significantly in recent years, including agriculture, transportation, and military operations. Drones are equipped with various communication technologies that enable them to transmit and receive data to and from their operators or other drones. However, the use of these communication technologies can also pose significant security risks, as they can be vulnerable to cyberattacks that compromise the confidentiality, integrity, and availability of the data transmitted. This research paper presents a comprehensive security analysis of drone communication methods. The paper explores the different communication methods used in drones, such as wireless communication protocols, including Wi-Fi, Bluetooth, and cellular networks. The analysis aims to identify security risks associated with these communication methods and assess the security of the drone’s software and hardware components. The research paper reviews the current state of drone communication security, identifies the most significant security risks, and discusses the impact of cyberattacks on drones and their operators.

Anteneh Girma, Kymani Brown
AI and Democracy: A Human Problem

While fear of the potential for generative Artificial Intelligence (AI) to create and spread misinformation for the purpose of disrupting modern representative democracy is understandable, such apprehension may be exaggerated or even unwarranted. There already exist structural issues corrupting the relationship between citizens and representatives, and concerns that the populace may be easily manipulated by AI-generated misinformation demonstrate a lack of confidence in both the democratic process and citizens’ intellect. As with any tool, generative AI can be misused or abused, and citizens, rather than the AI developer, are ultimately responsible for conducting their own research and educating themselves on political and social matters.

Zachary Zuck
Comparative Study of the Clarity of Privacy Policies in Social Media

This paper takes a look into the privacy policies set by major social networking websites. Policies are intentionally unclear and lengthy to trick users into accepting something they are not aware of. The paper contains a deep dive into what information is gathered and the uses of users’ data, comparing how user-friendly each policy is relative to the others.

Steven Zalenski, Jing Hua, Darren Gray
ClipSecure: Addressing Privacy and Security Concerns in Android Clipboard

This paper investigates the privacy and security concerns surrounding the clipboard feature in Android devices. Since its inception, the Android clipboard has demonstrated a history of exploitation through a lack of user knowledge and awareness, coupled with poor design choices in attempts to mitigate privacy issues. This paper first presents the results of a comprehensive study on Android users, revealing specific instances of clipboard exploitations and conducting a comparative analysis with other mobile Operating Systems. The study uncovers critical insights into the nature and extent of these exploits, laying the foundation for ClipSecure, an Android application designed to address these privacy concerns effectively. ClipSecure acts as a privacy-preserving clipboard manager for the Android ecosystem and is compatible with over 90% of Android devices. Additional recommendations on preserving privacy within the Android system are also subsequently explored.

Eric Lavoie, Avishek Mukherjee, Scott James
Security Vulnerabilities in Facebook Data Breach

This research explores the intricate landscape of data breaches within Facebook (now Meta) from 2018 to 2021. The goal of this case study is to discover valuable lessons learned for online security improvements by analyzing the security vulnerabilities and causes leading to the recent data breaches at Facebook. Over 2.5 billion users were affected across 11 significant data breaches, predominantly involving Personally Identifiable Information (PII). The breaches emerged due to flawed data-sharing practices, API issues, and privacy design flaws. Remarkably, most breaches went unnoticed during data security assessments but were discovered during product testing or by third parties. Facebook’s recurrent mishaps, particularly the storage of user passwords in plaintext, underscore its inadequate privacy design implementation. This study proposes an emphasis shift towards privacy-centric product design, more rigorous privacy change management, and stringent data access controls in order to safeguard user data and ensure regulatory compliance for social media platforms. Facebook’s mishandling of data security, which resulted in regulatory intervention and substantial fines, serves as a cautionary case example for organizations worldwide.

Jing Hua, Ping Wang
Intelligent Intrusion Detection Model with MapReduce and Deep Learning Model

Cybersecurity has become crucial for defending networks from various cyberattacks. A conventional Intrusion Detection System (IDS) is crucial to contemporary security; however, there are limits to how intelligently it can analyse massive amounts of data in order to spot an abnormality. The MapReduce-Based Improved Deep Learning Model for Intrusion Detection (MR-IDLM) is a technique that may be used to intelligently automate intrusion detection, and it is closely related to deep learning (DL). The DL method is employed for detecting intrusions accurately. In this study, MR-IDLM is proposed to identify network intrusions across several data categorization jobs. The proposed MR-IDLM efficiently uses commodity technology to analyse large data volumes. According to the proposed methodology, MR-IDLM can identify intrusions by making educated guesses about hypothetical test cases and then saving that information to a database in order to prevent duplicate entries. The proposed model outperforms previously reported techniques with a detection accuracy of 100%. The model achieves these results because MapReduce and other preprocessing stages are added. In the future, efforts will be made to extend this work to real-time settings.
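
A toy illustration of the MapReduce stage only (not the authors' model): map raw connection records to (source_ip, 1) pairs, then reduce to per-source counts that a downstream deep-learning model could consume. Real deployments shard the map/reduce phases across commodity nodes (e.g., with Hadoop or Spark); the log format here is hypothetical.

```python
from collections import defaultdict

records = [
    "10.0.0.5 10.0.0.9 tcp 445",     # hypothetical "src dst proto port" logs
    "10.0.0.5 10.0.0.9 tcp 445",
    "10.0.0.7 10.0.0.9 udp 53",
]

def map_phase(record):
    """Emit a (source_ip, 1) pair for each connection record."""
    src = record.split()[0]
    yield src, 1

def reduce_phase(pairs):
    """Aggregate mapped pairs into per-source-IP counts."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

pairs = [pair for record in records for pair in map_phase(record)]
print(reduce_phase(pairs))               # {'10.0.0.5': 2, '10.0.0.7': 1}
```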

Nawaf A. Almolhis
Ethical Considerations in the Development and Use of Artificial Intelligence in Weapon Systems

Over the course of the past two to three decades, Artificial Intelligence (AI) has been growing rather rapidly in all fields. A field that has seen a significant increase in the use of AI is the weapon systems industry. This paper focuses on the ethics of the development and use of AI in weapon systems. Some of the issues that are addressed are autonomy, accountability, and the possible harm of there being no regulation of this industry. Research is also presented that supports these points and provides clarity into what everyday citizens’ thoughts are on the matter.

Reed Greco
Cyber Attack Intensity Prediction Using Feature Selection and Machine Learning Models

Cybercrimes are becoming increasingly sophisticated and dangerous as we rely more on technology in all aspects of our lives. Crimes such as data breaches, cyber extortion, and identity theft are more common than ever; they are estimated to cost the world billions of dollars, and no country is immune. This paper aims to investigate the possibility of using various machine learning techniques, such as stochastic gradient descent and random forest, in order to forecast potential cyberattacks. This is done by training the chosen machine learning models on the UNSW-NB15 dataset. This dataset contains nine types of network-based cyberattacks along with normal network activities. Information Gain Attribute Evaluation (IGAE) is used for feature selection with a rank cutoff of 0.15. For the cross-validation task, 10-fold cross-validation is used. Results show that applying feature selection marginally increased the accuracy of all models used. The accuracy of the models ranged between 92.4% and 99.9%. The highest accuracy is obtained when using the random forest algorithm and a combination of random forest and logistic regression.
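
A hedged sketch of the evaluation recipe: information-gain-style feature ranking (approximated here with mutual information) using the 0.15 cutoff, followed by 10-fold cross-validation of a random forest. Synthetic data stands in for UNSW-NB15.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=3000, n_features=30, n_informative=8,
                           random_state=0)

# Rank features by mutual information and apply the paper's 0.15 cutoff.
scores = mutual_info_classif(X, y, random_state=0)
selected = scores >= 0.15
if not selected.any():                   # guard for this synthetic data
    selected = scores > np.median(scores)
X_sel = X[:, selected]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
acc = cross_val_score(clf, X_sel, y, cv=10, scoring="accuracy")
print(f"selected {selected.sum()} features, accuracy {acc.mean():.3f}")
```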

Mustafa Hammad, Khalid Altarawneh, Abdulla Almahmood
Analysis of IoT Vulnerabilities and Proposed Solution Approach for Secure Real-Time IoT Communication

The growing reliance on real-time communication in Internet of Things (IoT) devices raises concerns about security vulnerabilities. Our research identifies vulnerabilities in prevalent IoT protocols that could enable attackers to disrupt communication, compromise data, or even seize control of devices. By analyzing various threats in a test environment, we found that insecure device configurations and weak authentication mechanisms were the primary culprits. We demonstrate the effectiveness of implementing robust authentication, end-to-end data encryption, and network segmentation in mitigating these vulnerabilities. Our findings emphasize the need for proactive security measures in developing and deploying real-time IoT communication systems.

Anteneh Girma, Nurus Safa, Antione Searcy
Blockchain Based Identity Management for Secure Data Sharing

Identity management products are commonly used in real-life applications and are designed to make managing digital identities, as well as activities like authentication, easier. In recent years, there have been attempts to develop BlockChain (BC)-based Identity Management (IdM) systems that allow users to regain control over their identities. This article discusses the BC-based IdM publications and patents published in 20–2023. Based on this literature analysis, we identify prospective opportunities and research gaps that would hopefully help inform future national policies. A blockchain is an electronic ledger of transactions that is replicated and disseminated across the complete network of computer systems on the BC. Each time a new transaction occurs on the BC, a record of that transaction is added to the ledger of every relevant participant, and each block on the chain contains several transactions. However, there is still a gap in the present literature regarding a thorough examination of the elements of IdM, in addition to user privacy and data security methods and architecture for IdM. In this paper, we give a unified picture of the essential notions of Self-Sovereign Identity (SSI), including authentication techniques for diverse SSI systems and components of identity proofing.

Salahaldeen Duraibi

Data Science

Frontmatter
Case Study of an Interdisciplinary Academic Project for Reading Fluency Analysis

During the first semester of 2023, at the Aeronautics Institute of Technology (ITA, Brazil), an Interdisciplinary Problem-Based Learning (IPBL) case study was conducted. This case study involved 14 undergraduate and graduate students, distributed across three courses of the Electronic and Computer Engineering graduate program at ITA. The project aimed to conceptualize, model, and develop a portion of a distributed database system. During this period, it was possible to develop a prototype for audio collection using the following technologies: Database Systems, Artificial Intelligence, Machine Learning, Blockchain, and Kubernetes. The final system was based on a similar project developed for the Brazilian Ministry of Education, which aims to automatically analyze the reading fluency of elementary school children. The project described in this article focuses on creating a computational infrastructure for real-time audio collection. The audio, collected through the locally developed interface, is stored on a server for subsequent automatic analysis using a Machine Learning and Artificial Intelligence model. The project was completed in 16 weeks, during an academic semester, and the SCRUM framework was applied for project management. The primary contribution of this work was the joint utilization of the Agile Method (with SCRUM) and the other mentioned technologies to test, manage, and develop the case study, resulting in a Literacy Fluency Analysis system, including a functional software prototype for audio acquisition.

Matheus Silva Martins Mota, Luis Felipe Silva Rezende Soares, Victor Araujo Paula Cavichioli, Stephanie Souza Russo, Julio Roncal, Gildarcio Sousa Goncalves, Adilson Marques da Cunha, Luiz Alberto Vieira Dias, Lineu Fernando Stege Mialaret, Johnny Cardoso Marques
Strategic Software Modernization: Business-IT Convergence with Large Language Models

Existing research on legacy system modernization has primarily focused on technical challenges. Can a system be modernized while concurrently enhancing business processes? This paper introduces a strategic, four-step framework designed to guide software modernization in large organizations. This systematic approach provides a well-structured pathway towards modernization, targeting both cost reduction and efficiency enhancement. Strategically, the framework aligns with business goals and objectives to strengthen the modernization process and employs a Large Language Model to validate the approach.

Wilson Cristoni Neto, Luiz Alberto Vieira Dias
Feature Fusion Approach for Emotion Classification in EEG Signals

In this study, a novel approach is introduced for investigating the impact of diverse feature interactions on the categorization of emotional states within the context of human-computer interaction (HCI) using lateral electroencephalography (EEG). The approach presented in the study offers a unique method for emotion classification through the integration of numerous components. Feature extraction from preprocessed EEG recordings is executed judiciously using Differential Entropy (DE), the Mean Square Teager-Kaiser energy operator (MST), and Sample Entropy (SampEn). Subsequently, the determination of the emotional state is conducted employing a Support Vector Machine (SVM) classification model. Experiments were carried out using the SEED-IV dataset, and the findings indicate that DE demonstrates the highest performance as a singular feature, with an average classification accuracy of 77.86%. Furthermore, a substantial improvement is achieved through the integration of non-linear Sample Entropy (SampEn) features with MST attributes, resulting in an average classification accuracy of 84.58%. The incorporation of various characteristics and sophisticated EEG analysis methods in the field of HCI presents a possible avenue for improving interactions between humans and computers.
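
A brief sketch of the strongest single feature reported above: differential entropy per channel under the usual Gaussian assumption, DE = 0.5 ln(2πe·var), fed to an SVM. Random signals stand in for preprocessed SEED-IV epochs, so the accuracy printed is only chance level.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
epochs = rng.normal(size=(200, 62, 800))     # (epochs, channels, samples)
labels = rng.integers(0, 4, size=200)        # 4 emotion classes in SEED-IV

def differential_entropy(epoch):
    """DE per channel for a zero-mean Gaussian signal."""
    var = epoch.var(axis=-1)
    return 0.5 * np.log(2 * np.pi * np.e * var)

features = np.array([differential_entropy(e) for e in epochs])
print(cross_val_score(SVC(), features, labels, cv=5).mean())  # chance ~0.25
```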

Yahya M. Alqahtani
Household Discovery with Group Membership Graphs

Entity Resolution (ER) determines whether two entity references refer to the same or different objects. Most work in this area has focused on pairwise matching. However, identity attributes such as address and age change over time in population data. In many cases, these changes can cause pairwise matching to fail. This paper describes how some references can be correctly linked in data with inferred household memberships. The technique comprises a household discovery process followed by household blocking and matching. In testing with synthetic data, the references are clustered using standard pairwise matching combined with knowledge graphs.
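
A minimal sketch of the group-membership idea (not the authors' pipeline): treat pairwise matches and shared-household evidence as graph edges, then take connected components as inferred households. The record pairs are hypothetical.

```python
import networkx as nx

# Edges from pairwise matching plus household evidence (same address/surname).
evidence = [
    ("rec1", "rec2"),   # same person, address changed over time
    ("rec2", "rec3"),   # spouse at the shared address
    ("rec4", "rec5"),   # unrelated household
]

G = nx.Graph(evidence)
households = [sorted(component) for component in nx.connected_components(G)]
print(households)       # [['rec1', 'rec2', 'rec3'], ['rec4', 'rec5']]
```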

Onais Khan Mohammed, John R. Talburt, Khizer Syed, Abdus Salam Siddiqui, Altaf Mohammed, Adeeba Tarannum, Faraz Mohammed
Initial Design and Implementation of an Edge-to-Edge LoRaWAN Data Collection System

Democratizing established research infrastructure yields profound beneficial effects on local regions, especially with sensor networks deployed through environmental research. The active deployment of sensor networks and IoT workflows that stem from these research projects, if made accessible to the public, allows for a level of longevity of research infrastructure and the re-purposing of the research products for public gain. In this paper, we present a data collection system designed for an NSF-sponsored CSSI project which both collects environmental data from sensor networks deployed around the Lake Tahoe Basin and makes the research data/infrastructure available to entities around those deployment sites. Due to the massive coverage of the research area and the nature of the data, the system described in this paper makes use of emerging LPWAN technologies and affordable LoRaWAN gateways to handle the edge-to-edge communication. The data is streamed over a long distance to the University of Nevada, Reno, and managed with Apache NiFi. Additional hardware/software configurations outside of the project are also made available, due to a deployment of The Things Stack. It is envisaged that the approach described in this paper can be the first step in establishing a foothold to create transformative real-time ‘crowd-participating’ data services.
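
A hedged sketch of subscribing to LoRaWAN uplinks from a Things Stack deployment over MQTT; the host, credentials, and application ID are placeholders, the topic and payload fields follow The Things Stack v3 conventions, and the paho-mqtt 1.x callback API is shown.

```python
import json
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    uplink = json.loads(msg.payload)
    device = uplink.get("end_device_ids", {}).get("device_id")
    payload = uplink.get("uplink_message", {}).get("decoded_payload")
    print(device, payload)    # hand off to Apache NiFi or local storage

client = mqtt.Client()
client.username_pw_set("app-id@tenant", "NNSXS.EXAMPLE-API-KEY")  # placeholder
client.on_message = on_message
client.connect("eu1.cloud.thethings.network", 1883)
client.subscribe("v3/app-id@tenant/devices/+/up")
client.loop_forever()
```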

Chase Carthen, Zach Estreito, Vinh Le, Jehren Boehm, Scotty Strachan, Alireza Tavakkoli, Frederick C. Harris Jr., Sergiu M. Dascalu
A Study on Data Quality and Analysis in Business Intelligence

Data quality is an increasingly important concern for organizations as their dependence on data to make decisions is growing. Business Intelligence methods and techniques are used by companies and institutions to collect, treat, organize, analyze, and view critical information in a structured and objective way. We present a literature review to identify approaches aimed at improving data quality to support decision-making in the context of Business Intelligence. Approaches involving data transformation, integration, and maintenance were identified, considering the specific decision-making requirements. Our review offers a comprehensive view of the latest strategies, presenting works that address issues regarding data quality and analysis.

Robson Carlos Bosse, Mario Jino, Ferrucio de Franco Rosa
Exploring the Impact of Augmented Reality Applications on Student Engagement in Higher Education in South Africa: A Review

This comprehensive review investigates the ramifications of integrating augmented reality applications into higher education, specifically within the unique context of Higher Education in South Africa. As the digital landscape evolves, the efficacy of Augmented Reality in enhancing student engagement and improving learning outcomes becomes a focal point for educational institutions. This review aims to elucidate the multifaceted relationship between AR applications, student engagement, and academic achievement in the South African higher education setting by synthesizing a diverse range of literature, including empirical studies, institutional case analyses, and theoretical frameworks. Special emphasis is placed on contextual factors that influence the adoption and impact of AR technologies, considering the sociocultural and educational landscape of South Africa using Cultural Historical Activity Theory. Various academic databases, including but not limited to PubMed, IEEE Xplore, ERIC, and Google Scholar were used. The search strategy involved combining keywords such as “augmented reality,” “higher education,” “student engagement,” “learning outcomes,” and “South Africa.” Out of 189 papers, 36 papers were selected based on their abstracts. Both quantitative and qualitative studies were considered. The synthesis of findings not only provided a comprehensive overview of the current state of research but also identified gaps in understanding, paving the way for tailored strategies and future investigations. The outcomes of this review offer valuable insights for educators, administrators, and policymakers seeking to leverage AR technologies to enhance the educational experience in South African higher education.

Dina Moloja

Human-Computer Interaction

Frontmatter
TickTrax: A Mobile and Web-Based Application for Tick Monitoring and Analysis

Due to the impact of climate change on environmental factors, tick populations are expanding in various regions, posing a growing public health concern due to the tick-borne pathogens (TBP) they transmit. To address this issue and gather valuable insights for preventing tick-borne diseases (TBD), the TickTrax software initiative introduces a website and mobile application that utilizes citizen science to monitor and track tick distribution. By adopting a citizen science-based approach, TickTrax significantly enhances the size and diversity of data collection. The TickTrax software platform offers an intuitive interface, enabling users to contribute valuable tick-related data through both the website and mobile application. The data is aggregated into a live database, which visualizes tick locations across different regions. The project’s key components encompass user input, data visualizations, and the ability to export reports for further analysis. Through these features, TickTrax aims to empower the public to actively participate in tick monitoring efforts.

Denielle Oliva, Ryan Dahan, Rohman Sultan, Joanna Lopez, Monika Gulia-Nuss, Andrew B. Nuss, Mike B. Teglas, David Feil-Seifer, Frederick C. Harris Jr.
Developing Students’ Reflective Skills to Improve the Learning of Human-Computer Interaction

The field of Human-computer interaction (HCI) has emerged as a multidisciplinary field. Students are required to learn theoretical design concepts and implement them in practice. This creates difficulties in learning HCI because students are not able to apply what they have learned. Reflective practice has proven to be advantageous in helping students critically reflect on their actions. The reflective practice concept can provide students with a chance to examine what was done and recommend improvements to their actions. This study was conducted as a case study of information technology students at the South African University of Technology. Data were collected from 66 students who underwent HCI as part of their curriculum. After the completion of an assignment, students completed reflective sheets, and content analysis was performed on the qualitative data. The findings indicate that it is essential to understand the question to apply relevant theoretical concepts to practical situations. This paper presents students’ perspectives on how to improve HCI learning by developing their reflective skills. The active involvement of students in enhancing their HCI learning experiences and the benefits of reflective practice in this regard are highlighted in our presentation, and we illustrate that such an approach leads to improved HCI learning outcomes.

Moretlo Tlale-Mkhize, Janet Liebenberg
Options Matter: Exploring VR Input Fatigue Reduction

Virtual and Augmented Reality are technologies that continue to touch the lives of consumers day in and day out. The very promise of immersion is what drives its consistent innovations, even years after the initial peak of interest. And much like the development of the mouse, input methods have to constantly be challenged and studied within all applicable domains to refine that immersion. In this paper, we present a VR user study that immerses participants in a more non-typical domain for VR, Art Exhibits. While the current VR community tends to lean towards controls that are typical for video gaming, this study showcases a set of alternative VR inputs that 16 participants used to roam and interface with a curated exhibit. By leveraging the newer technologies provided in the Meta Quest Pro, participant metadata was autonomously recorded, processed, and stored without any need for external systems. Based on the results from this user study, it was found that task completion times were slower for hand-tracked inputs compared to controller inputs. Additionally, there was no significant correlation found regarding the input accuracy between the three input methods.

Michael Wilson, Levi Scully, Vinh Le, Frederick Harris Jr., Pengbo Chu, Sergiu Dascalu
Introductory Pathway to Virtual Reality by Developing Immersive Games of Skill

As the Virtual Reality (VR) market grows with more innovations in the field, the demand for Virtual Reality developers has increased. With the market potentially hitting a half-a-trillion-dollar market share by 2030, it is a great time to become proficient in VR development. This paper proposes an introductory pathway, Carnival VR, to learn the foundational skills required to become a proficient VR developer. VR user studies and development techniques were reviewed for potential use in developing a lesson plan that incorporates setting up the tooling, technical knowledge, and collaborative process for VR development. Carnival VR lays out the core learning objectives and discusses the introductory pathway’s success in facilitating a culture of collaboration and creativity in a group of computer science researchers. Furthermore, this paper presents the results of Carnival VR through pre- and post-interviews with researchers in the Software Systems Lab at the University of Nevada, Reno.

Levi Scully, Araam Zaremehrjardi, Rojin Manouchehri, Jonathan Chi, Pengbo Chu, Sergiu M. Dascalu
A User Study of Two Downstream Single-Cell Data Analysis Methods: Clustering and Trajectory Inference

Recent advancements in deep learning have significantly improved the analysis of single-cell data, including clustering and trajectory inference. Multiple methods have been proposed for these downstream tasks. However, researchers often rely on only a few metrics to assess these methods, disregarding whether users can effectively derive useful information from their outputs. To address this gap, we conducted a user study comparing various downstream single-cell analysis methods, including a post-questionnaire to indicate user preferences, user-friendliness, and customization capabilities. We also provided a brief analysis of the methods examined in this study. We conclude that user preferences vary across different aspects.

Yifan Zhang, Sergiu Dascalu, Frederick C. Harris Jr., Rui Wu
Fostering Joint Innovation: A Global Online Platform for Ideas Sharing and Collaboration

In today’s world, where moving forward hinges on innovation and working together, this article introduces a new global online platform that’s all about sparking teamwork to come up with new ideas. This platform goes beyond borders and barriers between different fields, creating an exciting space where people from all over the world can swap ideas, get helpful feedback, and team up on exciting projects. What sets our platform apart is its ability to tap into the combined brainpower of a diverse bunch of users, giving people the power to come up with game-changing ideas that tackle big global problems. By making it easy for people to share ideas and promoting a culture of working together, our platform is like a buddy for innovation, boosting creativity and problem-solving on a global level. This article spills the details on what the platform aims to do, how it works, and what makes it special, emphasizing how it can kickstart creativity, ramp up problem-solving skills, and get different fields collaborating. It’s not just a tool—it’s a whole new way of teaming up to make daily life better and build a global community of problem-solving pals.

Hossein Jamali, Sergiu M. Dascalu, Frederick C. Harris Jr.
Keep Sailing: An Investigation of Effective Navigation Controls and Subconscious Learning in Simulated Maritime Environment

Simulation engines are gradually being used by researchers for creating maritime datasets for various analysis, detection, and surveillance purposes. However, ship behaviour modeling only through scripts has limited scope. An interactive interface for ship behavior modeling can be a creative and durable solution towards generating multiple divergent scenarios from the same setup. Based on this idea, we have investigated the effectiveness of Mouse, Keyboard, and Gamepad navigation controls in terms of task completion rate, error rate, and task completion time in a gamified user interface. The same interface can be used to generate simulated maritime datasets with little effort. Upon examining 15 participants, we found that the Mouse is the superior control in terms of task completion rate, with a mean of 86% and a p value of 0.0337. Additionally, none of the three controls showed statistical significance in terms of error rate or time taken for task completion. Furthermore, we explored the field of subconscious learning of Navy ship properties from our interactive tasks in a simulated environment.

Mayamin Hamid Raha, Md. Abu Sayed, Sergiu Dascalu, Monica Nicolescu, Mircea Nicolescu

Machine Learning: Theory & Applications

Frontmatter
Offense Severity Prediction Under Partial Knowledge: Trigger Factor Detection Using Machine Learning and Network Science Methods

Predictive policing in the new era requires more accurate prediction of potential crime events. When data scientists use crime data to conduct analyses or prediction tasks, the incompleteness of crime data, owing to its particular nature, has always been a big challenge. The purpose of this work is to find the most sensitive presumptive feature when only limited or delayed offense information is available. In this study, the authors create a framework that employs both machine learning (decision tree) and network science (eigenvector centrality) methods to detect the trigger feature that has the greatest influence on the prediction of crime severity under only partial knowledge. The outcome of this work reveals the trigger features that can best improve prediction results under different a priori knowledge contexts and provides a new evaluation indicator of crime hotspots for predictive policing.
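
An illustrative combination of the two method families named above (not the paper's exact framework): rank features by decision-tree importance and by eigenvector centrality in a feature-correlation graph, then inspect the top "trigger" candidate. The data is synthetic; the crime features are not reproduced.

```python
import networkx as nx
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=8, n_informative=4,
                           random_state=0)

# Machine learning view: per-feature importance from a decision tree.
tree_rank = DecisionTreeClassifier(random_state=0).fit(X, y).feature_importances_

# Network science view: eigenvector centrality on a correlation-weighted graph.
corr = np.abs(np.corrcoef(X, rowvar=False))
G = nx.Graph()
for i in range(8):
    for j in range(i + 1, 8):
        G.add_edge(i, j, weight=corr[i, j])
centrality = nx.eigenvector_centrality(G, weight="weight", max_iter=1000)

combined = {f: tree_rank[f] * centrality[f] for f in range(8)}
print("trigger candidate:", max(combined, key=combined.get))
```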

Yu Wu, Natarajan Meghanathan
Exploring Deep Learning Techniques in the Prediction of Cancer Relapse Using an Open Brazilian Tabular Database

The early prediction of the risk of cancer relapse can bring various benefits for healthcare, and open databases can help to address this challenge. Although the literature presents several studies to predict cancer relapse based on images and genomics data, these are not always available. To deal with this issue, we investigated the use of deep learning techniques with tabular data of an open Brazilian database. This database aggregates hospital tabular records from over 70 hospitals located in Brazil. We analyzed models based on Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) and FT-Transformers. CNN results for the 202 types of cancer show an average Area Under the ROC Curve (AUC) of 0.62, while the best AUC of 0.85 is for cancer type ICD-O C718 (Malignant neoplasm of overlapping sites of brain). The best F1 score and average accuracy results were obtained using FT-Transformer-based model, reaching 0.92 and 0.86, respectively, while the best individual F1 and accuracy were 0.99 and 0.98 for the cancer type ICD-O C258. In addition to the model performance, we also bring explainability to the models by identifying the most relevant features through the calculation of SHAP values. The corresponding feature analysis indicates that the most relevant explanatory variables are TNM classification codes and the time interval to begin treatment after the diagnosis. This article also discusses the limitations, challenges, and future works.
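
A hedged sketch of the explainability step only: SHAP values for a gradient-boosted tree model on tabular data, with mean absolute SHAP value per feature as a global relevance score. Synthetic features stand in for the hospital registry variables.

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)        # (n_samples, n_features) log-odds
relevance = np.abs(shap_values).mean(axis=0)  # global relevance per feature
print("most relevant feature index:", int(relevance.argmax()))
```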

Rodrigo Bonacin, Sérgio Modesto Vechi, Mariangela Dametto, Guilherme Cesar Soares Ruppert
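
A minimal sketch of SHAP-based feature attribution on a tabular classifier. The dataset and gradient-boosting model below are stand-ins (the paper's hospital records are not public), and the model-agnostic KernelExplainer stands in for whichever explainer matches the authors' FT-Transformer:

    import numpy as np
    import shap
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import GradientBoostingClassifier

    # Stand-in tabular classification task.
    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    model = GradientBoostingClassifier().fit(X, y)

    # Model-agnostic SHAP values; a background sample keeps it tractable.
    background = shap.sample(X, 50)
    explainer = shap.KernelExplainer(
        lambda d: model.predict_proba(d)[:, 1], background)
    shap_values = explainer.shap_values(X.iloc[:10])

    # Mean |SHAP| per feature approximates global feature relevance.
    relevance = np.abs(shap_values).mean(axis=0)
    print(sorted(zip(X.columns, relevance), key=lambda t: -t[1])[:5])
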
A Review on the Use of Machine Learning for Pharmaceutical Formulations

The development of pharmaceutical formulations is a costly and uncertain process that involves testing different combinations. We present a literature review on the use of Machine Learning (ML) techniques in the realm of pharmaceutical formulation. A systematic search was carried out on the following scientific databases: PubMed, Springer Link, and IEEE Xplore. From an evaluation of 18 selected articles, this article presents and discusses the applications (e.g., drug delivery and protein development), the ML techniques employed, the main contributions, and the research challenges to be addressed. The results show a very promising scenario for ML as the volume of scientific data grows; they also reveal that there is no dominant or previously established solution for all cases. Such solutions must be used carefully and as a complement to laboratory experimentation. We also identified that, despite the good results, advances are still needed in the processes for creating and using complex, integrated databases.

Helder Pestana, Rodrigo Bonacin, Ferrucio de Franco Rosa, Mariangela Dametto
A Method for Improving the Recognition Accuracy of Pattern Classification

We propose a simple and intuitive method for improving recognition accuracy in pattern classification. The method improves average recognition accuracy by adjusting the number of classes according to the characteristics of the patterns, independently of the underlying algorithm. In this respect, the proposed method can be readily applied in various applications. We verified that the overall average recognition accuracy can be improved by retraining after increasing the number of classes. Specifically, we retrained the data by assigning a new, separate class to samples that are incorrectly recognized, using handwritten digit recognition and text classification models. Our method shows improvements in test accuracy by adjusting a single class. For text classification, we trained a 4-class model based on the proposed idea with the IMDb text corpus and performed detailed analyses. Our model recorded an average test accuracy of 99.70%, outperforming the state-of-the-art model, which had achieved 96.21%.

Damheo Lee, Seungmok Ha, Bowon Suh, Yongjin Kwak, Mun-Sung Han
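
A minimal sketch of the core idea as we read it: relabel misclassified training samples of a chosen class as a new class, retrain, and fold the extra class back into the original label at prediction time. The classifier, target class, and dataset are illustrative assumptions:

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    base = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)

    # Give misclassified training samples of one chosen class a new label.
    target_class = 8                       # illustrative choice
    new_label = y_tr.max() + 1
    wrong = (y_tr == target_class) & (base.predict(X_tr) != y_tr)
    y_split = y_tr.copy()
    y_split[wrong] = new_label

    retrained = LogisticRegression(max_iter=2000).fit(X_tr, y_split)

    # At test time, map the synthetic class back to its original label.
    pred = retrained.predict(X_te)
    pred[pred == new_label] = target_class
    print("accuracy:", (pred == y_te).mean())
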
An Application of Support Vector Machine, Random Forest, and Related Machine Learning Algorithms on California Wildfire Data

Computerized weather predictions have allowed the National Weather Service to identify meteorological conditions that give rise to thunderstorms and tornadoes, and therefore to issue advance warnings to the population. This suggests that wildfires could similarly benefit from mathematical simulation and data analysis. In this study, we extract the subset of the U.S. wildfire data from 1992 to 2018 related to California, with the aim of gaining insights into the causes of wildfires and their sizes (acres burned). We perform this analysis using support vector machines (SVM), random forests (RF), and related machine learning algorithms. In addition to predicting a fire's size from its ignition point, our study also sought to predict fire duration using ensemble regression methods. Of the multi-output methods considered, random forest regression significantly outperformed multi-output support vector regression (MSVR). Looking into the causes of fires, we noted that human activities such as arson, debris burning, smoking, and accidental in-home ignitions were responsible for numerous fire incidents and accounted for significant burned acreage. Machine learning methodologies that build on mathematical modeling and computational approaches can improve knowledge of fire locations, sizes, and durations, which in turn can inform fire-fighting strategies (both the allocation of equipment and of human resources) as well as guide future re-evaluation of decision and policy making. Informed public policies help preparedness for the yearly fires that affect large swaths of the U.S., notably California, Texas, and Florida. While our study explores California wildfire data, the methodologies readily extend to other regions where similar datasets are available.

Joshua Ologbonyo, Roger B. Sidje
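
A hedged sketch contrasting the two multi-output regressors named in the abstract, on synthetic stand-in data with fire size and duration as the two targets (the real wildfire features and targets are not reproduced here):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.multioutput import MultiOutputRegressor
    from sklearn.svm import SVR

    # Synthetic stand-in: features -> (fire size, fire duration).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 6))
    Y = np.c_[X[:, 0] * 2 + rng.normal(size=500),
              X[:, 1] - X[:, 2] + rng.normal(size=500)]
    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

    # Random forests handle multiple outputs natively.
    rf = RandomForestRegressor(n_estimators=200).fit(X_tr, Y_tr)

    # SVR is single-output, so MSVR is emulated with one SVR per target.
    msvr = MultiOutputRegressor(SVR()).fit(X_tr, Y_tr)

    print("RF   R^2:", rf.score(X_te, Y_te))
    print("MSVR R^2:", msvr.score(X_te, Y_te))
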
GPU-Accelerated Neural Networks and Computational Strategies to Predict Wave Heights

Significant Wave Height is an ocean wave characteristic that plays a major role in deriving and predicting wave energy. In this study, we predict significant wave height using wind data as input to a multi-layer perceptron (MLP) neural network and analyze the network under several scenarios. The network was tested with different learning rates and numbers of layers, and the computational times and results for all iterations and training periods were recorded. The predictions of the better-performing models were also compared with real data acquired from buoys. The results indicate that all of the applied MLP networks could learn the relationship between wind and wave height and predict the latter. These MLP networks are composed of many operations that are either element-wise or expressed as matrix multiplications, making them good candidates for hardware (GPU) acceleration. This work illustrates the efficacy of wave height forecasting using MLP networks with GPU acceleration.

Ashkan Reisi-Dehkordi, Steven I. Reeves, Frederick C. Harris Jr.
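
A minimal PyTorch sketch of an MLP whose matrix-heavy training steps run on a GPU when one is available. The layer sizes, learning rate, and random stand-in data are illustrative assumptions, not the authors' configuration:

    import torch
    import torch.nn as nn

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Wind features in, significant wave height out; sizes are illustrative.
    model = nn.Sequential(
        nn.Linear(4, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 1),
    ).to(device)

    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    x = torch.randn(256, 4, device=device)   # stand-in wind data
    y = torch.randn(256, 1, device=device)   # stand-in wave heights

    for _ in range(100):                      # training loop on the GPU
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    print("final MSE:", loss.item())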

Software Engineering

Frontmatter
A Method Based on Behavior Driven Development (BDD) and System-Theoretic Process Analysis (STPA) for Verifying Security Requirements in Critical Software Systems

Security failures in critical software systems can lead to severe economic, environmental, and human consequences. To ensure the security of these systems, it is necessary to identify and document security requirements as part of the software development process. Although the System-Theoretic Process Analysis (STPA) technique can be used to identify security requirements, it is challenging to verify their accuracy, completeness, and consistency. We propose a method based on STPA and Behavior Driven Development (BDD) for verifying software security requirements. BDD establishes a common language between business analysts and software developers. We evaluate the method through examples related to preserving the Confidentiality, Integrity, and Availability (CIA) of information. The application of the method to the examples produces automated test cases written using Gherkin syntax, which are used to verify the requirements in the examples. The method proposed in this work has the potential to generate automated test cases that can be used to verify whether the software solution built meets the security requirements identified through an STPA analysis.

Vitor Rubatino, Alice Batista Nogueira, Fellipe Guilherme Rey de Souza, Rodrigo Martins Pagliares
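
A hedged pytest-bdd sketch of how a Gherkin scenario might drive an automated check of a confidentiality requirement. The feature text, step names, and system responses are entirely hypothetical, not the paper's STPA-derived requirements:

    # security.feature (hypothetical Gherkin scenario, stored separately):
    #   Feature: Confidentiality of patient records
    #     Scenario: Unauthenticated access is rejected
    #       Given an unauthenticated client
    #       When it requests a patient record
    #       Then the request is denied

    from pytest_bdd import scenario, given, when, then

    @scenario("security.feature", "Unauthenticated access is rejected")
    def test_confidentiality():
        pass

    @given("an unauthenticated client", target_fixture="client")
    def client():
        return {"token": None}            # stand-in for a real client

    @when("it requests a patient record", target_fixture="response")
    def request_record(client):
        # Stand-in for calling the system under test.
        return {"status": 401 if client["token"] is None else 200}

    @then("the request is denied")
    def denied(response):
        assert response["status"] == 401
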
JSON and XML Schemas for WebSTAMP

WebSTAMP is a software tool that assists safety and security analysts in the application of the STPA (Systems-Theoretic Process Analysis) technique. WebSTAMP does not have a formalized schema, which prevents analysts from using an analysis created in WebSTAMP in other software tools and from using analyses exported from other tools in WebSTAMP. This work aims to (i) define an XML (Extensible Markup Language) schema and (ii) define a JSON (JavaScript Object Notation) schema for WebSTAMP, allowing hazard analysis portability among software tools that support the schemas. We define the schemas using XSD (XML Schema Definition) and JSON Schema, respectively. These schemas were used in WebSTAMP and in a second tool to validate the import and export of hazard analyses in XML and JSON, using examples from the literature. The schemas created for WebSTAMP make hazard analyses portable, so that an STPA analysis created in WebSTAMP can be used in other applications that support the schemas, and vice versa. The results allow us to conclude that the XML and JSON schemas aid safety and security analysts, making the task of conducting hazard and vulnerability analyses more flexible, since analysts can use the tools of their choice, including more than one tool, leveraging the strengths of each.

Rodrigo Martins Pagliares, Gustavo Henrique Santiago da Silva, Gabriel Piva Pereira
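
A minimal sketch of schema-backed validation with the jsonschema package. The schema fields below (name, losses, hazards) are our illustrative guesses at what an STPA exchange format could contain, not the actual WebSTAMP schema defined in the paper:

    from jsonschema import validate

    # Illustrative fragment of an STPA-analysis schema.
    schema = {
        "type": "object",
        "required": ["name", "losses", "hazards"],
        "properties": {
            "name":   {"type": "string"},
            "losses": {"type": "array", "items": {"type": "string"}},
            "hazards": {
                "type": "array",
                "items": {
                    "type": "object",
                    "required": ["id", "description"],
                    "properties": {
                        "id": {"type": "string"},
                        "description": {"type": "string"},
                    },
                },
            },
        },
    }

    analysis = {
        "name": "Insulin pump",
        "losses": ["L-1: Patient injury"],
        "hazards": [{"id": "H-1", "description": "Overdose delivered"}],
    }
    validate(instance=analysis, schema=schema)   # raises on violation
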
Element-Based Test Suite Reduction for SARSA-Generated Test Suites

The widespread use of Android apps motivates the exploration of efficient testing approaches that improve quality while respecting real-world time constraints and testing budgets. Automated test generation with Reinforcement Learning (RL) algorithms has shown promise, but there is room for improvement, as these algorithms often produce test suites with redundant coverage. Fine-tuning RL algorithms is one possible solution, but it is time-consuming due to the complex characteristics of the software under test. In this study, we employ a hybrid methodology that addresses the problem at a more general level: it takes test suites generated by reinforcement learning as input and applies test suite reduction to remove redundancy. The proposed algorithm uses a greedy, element-based approach to quickly reduce test suites generated by SARSA. Outcomes show a significant reduction, ranging from 25.61% to 65.78%, while maintaining a high level of code coverage, with a loss of at most 0.69%.

Abdullah Alenzi, Waleed Alhumud, Renée Bryce
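
A minimal sketch of greedy, element-based test suite reduction in the set-cover style the abstract describes; representing each test as the set of UI elements it covers is our illustrative assumption:

    def reduce_suite(suite):
        """Greedy element-based reduction: repeatedly keep the test that
        covers the most not-yet-covered elements. `suite` maps test id ->
        set of covered elements."""
        uncovered = set().union(*suite.values())
        kept = []
        while uncovered:
            best = max(suite, key=lambda t: len(suite[t] & uncovered))
            if not suite[best] & uncovered:
                break
            kept.append(best)
            uncovered -= suite[best]
        return kept

    suite = {
        "t1": {"btnLogin", "txtUser", "txtPass"},
        "t2": {"btnLogin", "menuHome"},
        "t3": {"menuHome", "btnLogout"},
        "t4": {"txtUser"},                 # fully redundant given t1
    }
    print(reduce_suite(suite))             # ['t1', 't3']
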
Towards Adopting a Digital Twin Framework (ISO 23247) for Battery Systems

Increasing demands for sustainability and electrification are driving development for customer and regulatory satisfaction, and battery systems are on the rise in several domains, such as railway, automotive, and aerospace [1]. With the growing interest in battery technology, advanced applications are continuously evolving to meet traditional performance challenges, for example in the Electric Vehicle (EV) domain. More recently, technical advances have improved the run-time configuration of battery systems, making them adaptable and applicable across a wider range of use cases [2, 3]. As a result, several paradigms have emerged, such as reconfigurable battery systems [4], software-defined batteries [5], and Heterogeneous Battery Systems (HBS) [6]. These solutions mainly improve the flexibility of battery systems, reducing over-engineering and widening the range of applicable use cases for a given system, since run-time configuration can cover a larger range of functional and non-functional requirements that must be met simultaneously.

Johan Cederbladh, Enxhi Ferko, Emil Lundin
Challenges and Success Factors in Large Scale Agile Transformation—A Systematic Literature Review

The extraordinary success of agile software development in small software teams has pushed organizations to find ways to scale the practices to large projects and multiple teams. However, agile works differently in the large than in the small. This paper consolidates organizational challenges and success factors for large-scale agile transformations. We further explore these factors by business domain and by the transformation framework implemented. We conducted a systematic literature review of 15 case studies and found that organizations may share common challenges and success factors. However, agile's challenges and success factors are very case-specific, and what works for one organization may not work for another, even when both are in the same business domain or follow the same framework.

Suddhasvatta Das, Kevin Gary
Integrating AIaaS into Existing Systems: The Gokind Experience

In this research paper, we present the results of our collaborative study with Gokind AB on the integration of artificial intelligence as a service into an existing system. Initially, we conducted a comprehensive review of existing research to understand current architectural integration techniques and ensure alignment with best practices. Building on these insights, we designed an integration architecture that emphasizes functional suitability and interoperability. The practical implications of our research are demonstrated through the implementation of this architecture within the Gokind platform. This implementation resulted in notable improvements, including accelerated processing times, enhanced scalability, and efficient resource allocation. Our study also highlights areas for potential enhancement, advocating for ongoing refinements such as variance analysis across artificial intelligence as a service classes, comparative assessments among providers, strengthened security measures, and a comprehensive exploration of the impact of architectural attributes.

Benedicte Boneza Musabimana, Alessio Bucaioni
AI-Powered Smartphone Context and High-Utility Itemset Mining for Enhanced App Testing and Personalization

Internet of Things (IoT) devices have become increasingly essential, and their user experience and functionality rely heavily on context events such as screen orientation, wireless connectivity, media events, and battery events. The volatility and unpredictability of these events pose challenges to app development and testing. To address this, we present a novel application of SPMF library algorithms, including the Top-K Quantitative High Utility Itemset Miner (TKQ), the Fast High Utility Quantitative Itemset Miner (FHUQI-Miner), and the Fast Correlated High-Utility Itemset Miner (FCHM) with two different correlation metrics. We train machine learning models (decision tree regression (DTR) and bagging) on rules generated from the mined itemsets, using sequences of context events comprising 78 unique context event streams from real-world users over periods of 15, 30, and 60 days. We identify high-utility itemsets (HUIs) of context events, which developers can incorporate into testing. Results show that the HUI mining algorithms yield predictive accuracy greater than 90% on testing data, which makes them suitable for improving context-aware software testing processes. Indeed, HUIs help predict user behavior, personalize the user experience, detect anomalies, and realize efficient and reliable mobile apps through context-aware testing.

Pooja Goyal, Abbie Seale, Katie Charubin, Quinn Bennett, Renée Bryce
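
SPMF is a Java library, so as a hedged stand-in here is a brute-force Python sketch of the high-utility itemset definition those miners optimize: the utility of an itemset is quantity times unit utility, summed over the transactions that contain it. The toy transactions, unit utilities, and threshold are illustrative:

    from itertools import combinations

    # Toy transactions: {item: quantity}; unit utilities per item.
    transactions = [
        {"wifi_on": 2, "screen_rotate": 1},
        {"wifi_on": 1, "battery_low": 3},
        {"screen_rotate": 2, "battery_low": 1, "wifi_on": 1},
    ]
    unit_utility = {"wifi_on": 3, "screen_rotate": 5, "battery_low": 2}

    def utility(itemset):
        total = 0
        for t in transactions:
            if all(i in t for i in itemset):   # transaction supports itemset
                total += sum(t[i] * unit_utility[i] for i in itemset)
        return total

    items = sorted(unit_utility)
    min_util = 15                              # illustrative threshold
    huis = [
        (set(s), utility(s))
        for k in range(1, len(items) + 1)
        for s in combinations(items, k)
        if utility(s) >= min_util
    ]
    print(huis)
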
Towards Systematic and Precise Compilation of Domain-Specific Modelling Languages

Software is pervasive and often critical in our everyday life. Its production is complicated and expensive, especially for complex systems such as Cyber-Physical Systems. These systems are often safety-critical and rely on heterogeneous processors (e.g., CPUs, GPUs, FPGAs, DSPs), hence their engineering requires reliable and flexible methods. Domain-Specific Modelling Languages and model-based techniques have proven very suitable for that. Currently, executables are generated from these modelling languages by first translating a model to a program in a high-level programming language (e.g., C++) via code generators and then compiling it. Code generators are language-specific, inflexible, not always reliable, and difficult and expensive to certify, customise, and maintain. In this paper, we outline ORPHEUS, a novel approach for model compilation based on state-of-the-art methods and tools. This kind of approach aims at maximising the benefit of model-based techniques, thus producing high-quality and safe software more efficiently, and accelerating research by providing a unified common ground for researchers and practitioners.

Federico Ciccozzi

Potpourri

Frontmatter
A Detection Method for Circumferential Alignment of Diminutive Lesions Using Wavelet Transform Modulus Maxima and Higher-Order Local Autocorrelation

The number of patients with Crohn’s disease is increasing annually. Since the cause of Crohn’s disease remains unclear, early detection and appropriate treatment are crucial. Diagnostic criteria for Crohn’s disease include features such as erosions, ulcers, and the circumferential alignment of diminutive lesions. Medical professionals employ capsule endoscopy for diagnosis. While existing research focuses on erosions and ulcers, studies on the circumferential alignment of diminutive lesions are limited. Therefore, this paper presents a classification of images showing the circumferential alignment of diminutive lesions and normal images obtained through capsule endoscopy. We propose a classification method that utilizes the Wavelet Transform Modulus Maxima (WTMM) to extract contours of circumferential alignment, obtains features using Higher-order Local Autocorrelation (HLAC), and classifies them using a Support Vector Machine (SVM). Our method accurately classified circumferential alignment of diminutive lesions and normal images with an accuracy of 98.2%.

Tomoki Suka, Hajime Omura, Teruya Minamoto
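
A heavily simplified 1D sketch of the pipeline's flavor: a continuous wavelet transform, modulus maxima counted per scale as crude contour evidence, then an SVM. The real method operates on 2D capsule-endoscopy images with HLAC features, which this toy signal version does not reproduce; all data and parameters below are assumptions:

    import numpy as np
    import pywt
    from scipy.signal import argrelextrema
    from sklearn.svm import SVC

    def wtmm_features(signal, scales=np.arange(1, 17)):
        # Continuous wavelet transform; modulus maxima per scale.
        coeffs, _ = pywt.cwt(signal, scales, "mexh")
        modulus = np.abs(coeffs)
        # Count local maxima at each scale as a crude contour descriptor.
        return np.array([len(argrelextrema(row, np.greater)[0])
                         for row in modulus])

    rng = np.random.default_rng(0)
    # Toy stand-ins: "lesion" signals carry a periodic bump pattern.
    normal = [rng.normal(size=128) for _ in range(30)]
    lesion = [np.sin(np.linspace(0, 20, 128))
              + rng.normal(scale=0.3, size=128) for _ in range(30)]
    X = np.array([wtmm_features(s) for s in normal + lesion])
    y = np.array([0] * 30 + [1] * 30)

    clf = SVC().fit(X, y)
    print("training accuracy:", clf.score(X, y))
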
Impedance Analysis of Adaptive Distance Relays Using Machine Learning

This paper introduces a novel approach to enhancing the reliability of power system protection using dynamic distance relay settings. The system first predicts Load Encroachment situations through machine learning (ML) based load prediction, or through existing accurate predictions such as those of PJM (a regional transmission organization, www.pjm.com , that coordinates the movement of wholesale electricity in all or parts of 13 states and the District of Columbia). The system then recalculates the distance relay settings based on the new impedance (Z = R + jX = V/I). When the system returns to normal operation, the settings revert to those used prior to the Load Encroachment situation. These calculations are performed both in the cloud and on-premise simultaneously, which has several advantages: (1) it predicts future impedance values; (2) it monitors the line even without a distance relay, preventing false tripping during rapid voltage changes; and (3) because the proposed system recognizes Load Encroachment scenarios and employs real impedance data, erroneous relay trips are significantly reduced, enhancing overall grid stability.

Kamran Hassanpouri Baesmat
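
The relay logic rests on the complex impedance Z = V/I = R + jX. A minimal sketch of an adaptive reach check, where the measured phasors, zone-1 reach, and the 0.8 tightening factor are all illustrative assumptions:

    # Complex impedance seen by the relay: Z = V / I = R + jX.
    V = complex(66_000, 0)          # measured phase voltage (illustrative)
    I = complex(400, -150)          # measured current (illustrative)

    Z = V / I
    R, X = Z.real, Z.imag
    print(f"Z = {R:.1f} + j{X:.1f} ohms, |Z| = {abs(Z):.1f}")

    # Adaptive setting: tighten the zone-1 reach when a Load Encroachment
    # condition has been predicted, to avoid false trips.
    zone1_reach = 120.0
    load_encroachment_predicted = True
    if load_encroachment_predicted:
        zone1_reach *= 0.8
    print("trip" if abs(Z) < zone1_reach else "no trip")
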
Employees’ Experiences of Using a Mobile Health Application: A Qualitative Study Based on Digital Intervention

Companies can promote employee health by offering various health-specific programs, activities, and measures, and by forming a well-functioning work organization that benefits productivity, well-being, and health. The aim of this paper is to investigate employees' experiences of using a gamification-inspired mobile health application, GOOZO, to promote work-related health. An exploratory case study with an inductive approach is used. The results show that using a digital application to promote work-related health can be a good way to draw attention to the importance of health. Gamification is also an incentive to increase social coherence at the workplace, as competition-focused activities can engage employees.

Cecilia Johansson, Ann Svensson
MicroSTAMP: Microservices for Steps 1 and 2 of the System-Theoretic Process Analysis (STPA) Technique

STPA (System-Theoretic Process Analysis) is a technique that aids safety analysts in developing hazard analyses for socio-technical systems. Although safety analysts are usually interested in the complete hazard analysis obtained by performing all four steps of STPA, the artifacts produced in each step can be independently valuable to other stakeholders, such as domain experts, systems engineers, software developers, and system designers. Some software solutions support STPA, but none of them is designed so that the implementation of each of the four STPA steps can be reused, in isolation, by different tools. This work has two objectives: (i) to create MicroSTAMP, a set of two microservices that partially supports the STPA technique (steps 1 and 2), and (ii) to demonstrate that the microservices can be reused by integrating MicroSTAMP with other tools to fully support STPA (steps 1-4). We present the two MicroSTAMP microservices and demonstrate their integration with a second tool (WebSTAMP) to provide full support for the four steps of STPA; we name the result of this integration PowerSTAMP. We conclude that MicroSTAMP benefits several stakeholders, from those interested in the results of specific STPA steps (steps 1 and 2) to those who need a complete hazard analysis obtained by applying all four steps of the technique.

João Hugo Marinho Maimone, Thiago Franco de Carvalho Dias, Fellipe Guilherme Rey de Souza, Rodrigo Martins Pagliares
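
A minimal sketch of what consuming a step-1 microservice over REST could look like from a second tool. The base URL, endpoints, and payload fields are entirely hypothetical; they are not MicroSTAMP's actual API:

    import requests

    BASE = "http://localhost:8080"   # hypothetical MicroSTAMP step-1 service

    # Register a loss and a hazard with the step-1 microservice, so a
    # separate tool (e.g., WebSTAMP) can consume them for later steps.
    loss = {"id": "L-1", "description": "Loss of life or injury"}
    hazard = {"id": "H-1",
              "description": "Vehicle violates minimum separation",
              "lossIds": ["L-1"]}

    requests.post(f"{BASE}/losses", json=loss).raise_for_status()
    requests.post(f"{BASE}/hazards", json=hazard).raise_for_status()

    print(requests.get(f"{BASE}/hazards").json())
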
The Disparate Impact of Distinct Background Music on Gameplay Experience: An Empirical Analysis

In modern video games, music has become an unavoidable component. Previous research has shown that music is essential to forging the connection between games and players. While it is widely accepted that music has the power to affect people's minds, the differential impact of distinct categories of tunes on players' gameplay experience remains to be discovered. In this paper, we conduct a controlled experiment to investigate the impact of different in-game background music on the gameplay experience. We hypothesized that different background music would differentially impact players' gameplay experiences. Aligned with our hypothesis, the experimental outcome demonstrates that our three original pieces of music, Fruit Drop, Iced, and Soar, have distinct impacts on an individual's gameplay experience.

Rifat Ara Tasnim, Farjana Z. Eishita, Jon Armstrong, Eddie Ludema, Jeremy Russell
Triangulation Guided High Clearance Paths

We explore the development of efficient algorithms for constructing collision-free paths with high clearance in the presence of polygonal obstacles. We review existing algorithms for planning collision-free paths in a two-dimensional environment containing two-dimensional obstacles. We propose an algorithm that constructs collision-free paths by following the adjacency relationships of a triangulation of the free space. A preliminary investigation of the proposed algorithm shows that the generated paths have high clearance from the obstacles.

Laxmi Gewali, Sandeep Maharjan
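
A minimal sketch of the underlying idea: triangulate the space, walk the triangle adjacency relation with BFS, and thread a coarse route through triangle centroids. Obstacle handling and clearance optimization are omitted; the points and start/goal triangles are illustrative:

    import numpy as np
    from collections import deque
    from scipy.spatial import Delaunay

    rng = np.random.default_rng(1)
    pts = rng.random((30, 2))
    tri = Delaunay(pts)

    def triangle_path(start_tri, goal_tri):
        """BFS over triangle adjacency (tri.neighbors; -1 = no neighbor)."""
        prev = {start_tri: None}
        q = deque([start_tri])
        while q:
            t = q.popleft()
            if t == goal_tri:
                path = []
                while t is not None:
                    path.append(t)
                    t = prev[t]
                return path[::-1]
            for n in tri.neighbors[t]:
                if n != -1 and n not in prev:
                    prev[n] = t
                    q.append(n)
        return None

    path = triangle_path(0, len(tri.simplices) - 1)
    # Thread the path through triangle centroids as a coarse route.
    centroids = pts[tri.simplices[path]].mean(axis=1)
    print(centroids)
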
Towards Visualizing the Status of Bug Reports

This paper presents preliminary results towards comprehensive and useful visualizations of the status and history of bug reports stored in bug repositories. The goal is to help maintainers gain a better and quicker comprehension of archived bug reports. Three visualizations are proposed to model different information about bug reports via different views. The visualizations are simple and easy to understand and analyze, and the proposed views show information either about a group of bug reports or about a specific one. A preliminary web-based tool that supports user interaction has been implemented to automatically analyze bug reports and generate the proposed visualizations. The tool has been applied to real datasets from Bugzilla to illustrate the proposed visualizations and show their usefulness.

Maen Hammad, Sabah Alsofriya, Ahmed Fawzi Otoom
Quarantine Centrality: Principal Component Analysis of SIS Model Simulation Results to Quantify the Vulnerability of Nodes to Stay Infected in Complex Networks

We propose a novel simulation-based centrality metric to proactively identify and quarantine topologically vulnerable nodes, i.e., those more likely to stay infected for a longer time (due to repeated infections from their neighbors) during an epidemic spread per the SIS (Susceptible-Infected-Susceptible) model, in which a node does not acquire immunity after recovering from an infection and becomes susceptible again. Referred to as Quarantine Centrality, it is the first such simulation-based centrality metric proposed in the literature. It is computed as follows: first, conduct multiple in-situ simulation runs of the SIS model on the complex real-world network to build a dataset recording the number of rounds each node stays infected in each run; then, run PCA (principal component analysis) on this dataset to quantify the Quarantine Centrality metric and rank the nodes by the extent to which they could stay infected during an SIS-style epidemic spread.

Natarajan Meghanathan
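
A minimal sketch of the two stages as described: repeated SIS runs that record how many rounds each node spends infected, then PCA over the resulting node-by-run matrix. The graph, infection rate, recovery rate, and run counts below are illustrative assumptions:

    import numpy as np
    import networkx as nx
    from sklearn.decomposition import PCA

    G = nx.karate_club_graph()
    n, runs, rounds = G.number_of_nodes(), 50, 100
    beta, gamma = 0.1, 0.2            # illustrative infection/recovery rates
    rng = np.random.default_rng(0)

    data = np.zeros((n, runs))        # rounds each node spends infected
    for r in range(runs):
        infected = {int(rng.integers(n))}   # random initial seed
        for _ in range(rounds):
            nxt = set()
            for u in range(n):
                if u in infected:
                    data[u, r] += 1
                    if rng.random() > gamma:          # fails to recover
                        nxt.add(u)
                else:
                    k = sum(v in infected for v in G[u])
                    if rng.random() < 1 - (1 - beta) ** k:
                        nxt.add(u)
            infected = nxt

    # First principal component scores the nodes' tendency to stay infected.
    scores = PCA(n_components=1).fit_transform(data).ravel()
    print(sorted(range(n), key=lambda u: -scores[u])[:5])
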
A Holistic Approach for Single-Cell Data Trajectory Inference Using Chromosome Physical Location and Ensemble Random Walk

Single-cell RNA sequencing technology enables the analysis of complex, heterogeneous cell samples. However, errors in data processing, dimension reduction, and clustering can negatively impact subsequent calculations, particularly when inferring cell trajectories using graph methods. We propose a novel method for single-cell data Trajectory Inference using Chromosome physical location and ensemble Random Walk (scCRW). It utilizes entire chromosomes and their gene identifiers to enhance factor analysis, providing a more comprehensive view of biological processes. For trajectory inference, scCRW employs a random walk, which we evaluated against other state-of-the-art methods on real single-cell RNA-seq datasets. These datasets include both linear and nonlinear data, showcasing scCRW's capabilities in pseudotime and trajectory inference tasks. The results demonstrate that scCRW consistently achieves top or near-top correlation scores and excels in nonlinear metrics such as F1 branches and milestones. The approach provides accurate trajectory inference that closely aligns with ground truth, highlighting the utility of chromosomes in factor analysis and of random walk techniques for more precise data analysis.

Jovany Cardoza-Aguilar, Caleb Milbourn, Yifan Zhang, Lei Yang, Sergiu M. Dascalu, Frederick C. Harris Jr.
Graph Partitioning Algorithms: A Comparative Study

One of the classic problems related to graphs is partitioning their vertices into subsets, that is, forming groups with high internal connectivity. The graph partitioning problem is of interest because the amount of data generated today is enormous, and determining groups is essential for making strategic decisions in several areas. This paper compares the main graph partitioning methods found in the literature, considering the minimum-cut criterion and load-balancing factors on different types of graphs.

Rafael M. S. Siqueira, Alexandre D. Alves, Otávio A. O. Carpinteiro, Edmilson M. Moreira
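
A minimal sketch of one classic heuristic typically included in such comparisons, Kernighan-Lin bisection as shipped with networkx, evaluated against the two criteria the abstract names (cut size and balance); the example graph is illustrative:

    import networkx as nx
    from networkx.algorithms.community import kernighan_lin_bisection

    G = nx.karate_club_graph()
    A, B = kernighan_lin_bisection(G, seed=42)

    # Minimum-cut criterion: edges crossing the partition.
    cut = nx.cut_size(G, A, B)
    # Load-balancing criterion: how even the two sides are.
    balance = min(len(A), len(B)) / max(len(A), len(B))
    print(f"cut = {cut}, balance = {balance:.2f}")
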
3D Video Game Simulation Based on Colored Petri Net and SIMIO

Creating video games is not a trivial task, especially due to the combination of technical and artistic processes that are compiled into one piece of entertainment. Creating diagrams and charts to specify the requirements of a video game is still a valuable approach; however, not everyone on a development team knows how to read these diagrams properly. Therefore, this work presents a 3D simulation model based on colored Petri nets using the software SIMIO. With this tool, it is possible to track every decision made by the player graphically, and a timed simulation comparison between the Petri net model and the 3D model is also possible. The formal colored Petri net model is used for formal verification and quantitative analysis, and the 3D SIMIO simulation model is used to better understand the gameplay requirements. The video game Silent Hill II illustrates the modeling approach.

Felipe Nedopetalski, Estela Ferreira Silva, Franciny Medeiros Barreto, Stéphane Julia
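
A minimal sketch of colored Petri net semantics in plain Python: places hold colored tokens (tokens carrying data), and a transition fires when an input token satisfies its guard. The places, token colors, and guard below are illustrative, not the Silent Hill II model:

    # Places hold colored tokens; a transition is (source, target, guard).
    marking = {"menu": [("player", "has_key")], "room": []}

    def fire(transition, marking):
        src, dst, guard = transition
        for tok in list(marking[src]):
            if guard(tok):                    # enabled binding found
                marking[src].remove(tok)
                marking[dst].append(tok)
                return True
        return False                          # transition not enabled

    enter_room = ("menu", "room", lambda tok: tok[1] == "has_key")

    print(fire(enter_room, marking))          # True: token moves to "room"
    print(marking)
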
Backmatter
Metadata
Title
ITNG 2024: 21st International Conference on Information Technology-New Generations
Editor
Shahram Latifi
Copyright Year
2024
Electronic ISBN
978-3-031-56599-1
Print ISBN
978-3-031-56598-4
DOI
https://doi.org/10.1007/978-3-031-56599-1
