2022 | Book

Intelligent Human Computer Interaction

13th International Conference, IHCI 2021, Kent, OH, USA, December 20–22, 2021, Revised Selected Papers

Edited by: Jong-Hoon Kim, Madhusudan Singh, Javed Khan, Prof. Uma Shanker Tiwary, Prof. Marigankar Sur, Dhananjay Singh

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This volume constitutes the refereed proceedings of the 13th International Conference on Intelligent Human Computer Interaction, IHCI 2021, which took place in Kent, OH, USA, in December 2021.

The 59 full and 9 short papers included in these proceedings were carefully reviewed and selected from a total of 142 submissions. The papers are organized in topical sections named Human Centered AI and Intelligent Interaction and Cognitive Computing.

Table of Contents

Frontmatter

Human Centered AI

Frontmatter

Machine Learning Techniques for Grading of PowerPoint Slides

This paper describes the design and implementation of automated techniques for grading students’ PowerPoint slides. Preparing PowerPoint slides for seminars, workshops, and conferences is one of the crucial activities of graduate and undergraduate students. Educational institutes use rubrics to assess the quality of PowerPoint slides on different grounds, such as the use of diagrams, text highlighting techniques, and animations. The paper presents a method and dataset designed to automate the task of grading students’ PowerPoint slides. The system aims to evaluate students’ knowledge of the various functionalities provided by presentation software. Multiple machine learning techniques are used to grade presentations. Decision Tree classifiers give 100% accuracy when predicting the grade of a PowerPoint presentation.

Jyoti G. Borade, Laxman D. Netak

Textual Description Generation for Visual Content Using Neural Networks

Various machine learning methods are used to generate descriptive text for images and video frames and to process them. This area has attracted immense interest from researchers in past years. For text generation, various models combine CNN and RNN approaches. An RNN works well in language modeling, but it struggles to retain information over long sequences. An LSTM language model can overcome this drawback because it handles long-term dependencies. Here, the proposed methodology is an Encoder-Decoder approach in which a VGG19 Convolutional Neural Network works as the Encoder and an LSTM language model works as the Decoder to generate the sentence. The model is trained and tested on the Flickr8K dataset and can generate textual descriptions on the larger Flickr30K dataset with only slight modifications. The results are evaluated using BLEU (Bilingual Evaluation Understudy) scores. A GUI tool is developed to help in the field of child education. This tool generates audio for the generated textual description of images and helps to search for similar content on the internet.

Komal Garg, Varsha Singh, Uma Shanker Tiwary
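
For readers unfamiliar with the BLEU metric named in the abstract above, the following is a minimal sketch of a simplified sentence-level BLEU with a single reference: clipped n-gram precision combined with a brevity penalty. It is not the authors' evaluation code. The function names, the choice of `max_n=2`, and the example sentences are illustrative assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as often as it appears in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of the 1..max_n precisions
    multiplied by the brevity penalty (single reference, no smoothing)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical generated caption and reference caption.
generated = "a dog runs on the grass".split()
reference = "a dog runs across the grass".split()
score = bleu(generated, reference)
```

Production evaluations usually use multiple references, n-grams up to order 4, and smoothing, so scores from this sketch are not comparable with published BLEU values.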

Using LSTM Models on Accelerometer Data to Improve Accuracy of Tap Strap 2 Wearable Keyboard

This paper proposes the implementation of three different long short-term memory (LSTM) recurrent neural network (RNN) models to improve the accuracy of the input readings of the Tap Strap 2, a Bluetooth wearable keyboard device. A previous study found that the Tap Strap 2 had an undesirably low level of accuracy when outputting the correct characters upon interpreting data from the built-in sensors. In response, raw accelerometer data was obtained from the device and used to train an LSTM model. This model is used not only to determine which features correspond to which characters, but also to use contextual information from past characters to determine which character was most likely pressed whenever the input seems ambiguous. This paper first provides a description of the LSTM RNN used in this experiment. It then evaluates the effectiveness of the models in reducing the accuracy problems of the Tap Strap 2.

Kristian Mrazek, Tauheed Khan Mohd

Electronic Dictionary and Translator of Bilingual Turkish Languages

The article presents the IDEF0 model, functional module for the implementation of electronic translation in Uzbek and Karakalpak languages, which belong to the family of Turkic languages, as well as software for electronic translation based on these models and modules.

E. Sh. Nazirova, Sh. B. Abidova, Sh. Sh. Yuldasheva

Facial Recognition Technologies: A Survey and Comparison of Systems and Practical Applications

In recent years, facial recognition technology has become increasingly relevant across a wide range of applications. There are numerous approaches to facial recognition technology, each best suited for different types of practices. This study presents a survey that compares the infrared, thermal, and deep learning methods. Each method is evaluated based on its speed, accuracy, and efficiency and is given an overall percentage of reliability. Further, we examine the advantages and disadvantages of each method and assess what the common usage of each method would be in a practical setting. We find a point of commonality between the method types where accuracy and efficiency must strike a balance, further compounded by the practical applications of each method. Our findings show that while there is an ideal method of facial recognition for each individual application, there is no ideal method that applies to every application.

Nicholas Muskopf-Stone, Georgia Votta, Joshua Van Essen, Aira Peregrino, Tauheed Khan Mohd

Technological Evaluation of Virtual and Augmented Reality to Impart Social Skills

The global spread of COVID-19 has disrupted education in recent times. Imparting social skills necessary to survive in professional life is becoming increasingly challenging in the post-COVID-19 new normal. The existing online learning platforms which have been primarily developed as a technology for content delivery need to be augmented with additional features required to impart social skills and provide better learning experiences. In this paper, we provide a technical evaluation of two emerging technologies called augmented and virtual reality from the point of view of imparting social skills such as working as a team member, group-based learning, participating in real-life physical events (e.g., seminars and conferences, group discussion), and carrying out physical experiments in the laboratories. This paper describes the approaches to impart social skills through AR and VR based platforms.

Vinayak V. Mukkawar, Laxman D. Netak

Deep Convolutional Neural Network Approach for Classification of Poems

In this paper, we propose an automatic convolutional neural network (CNN)-based method to classify poems written in Marathi, one of the popular Indian languages. Using this classification, a person unfamiliar with the Marathi language can learn what kind of emotion a given poem conveys. To the best of our knowledge, this is the first attempt to apply a deep learning strategy to Marathi poem classification. We conducted experiments with different CNN models, considering different batch sizes, filter sizes, and regularization methods such as dropout and early stopping. Experimental results show that our proposed approach is superior in both effectiveness and efficiency. Our proposed CNN architecture for the classification of poems achieves an accuracy of 73%.

Rushali Deshmukh, Arvind W. Kiwelekar

Three Decision Points Vetting a More Ideal Online Technical Platform for Monitoring and Treating Mental Health Problems like Depression and Schizophrenia

To improve Information Communication Technology for Development (ICT4D) in e-health applications means getting the maximum advantage at the minimum cost to purveyors and consumers. This is a conceptual and empirical paper arguing three informed decision points, taken in order, should happen whenever maximizing advantages of health diagnostics and/or treatment facilitation via online platforms. Three kinds of data are important to know beforehand to create informed decisions to maximize beneficial uses of online technology to aid mental health: (1) how do you define the etiology of mental health problems; (2) how do you define who should be helped as a priority, and (3) how do you make decisions technically to fit the former two points? Since serving patients is more important than technical profiteering, these three critical decision points mean primary medical choices of diagnosis and secondary social demographic research should guide a more conditioned tertiary technical use, instead of vice versa that regularly leads to cutting patients to fit a pre-determined and thus misaligned technological investment. Due to limitations of space, only the third vetting point is analyzed in detail. The current state of the art for how to reach patients best via Internet-connected technologies for monitoring and treating depression and schizophrenia is analyzed. Policy advice on platform design is given from this vetting procedure that might later be scaled worldwide.

Mark D. Whitaker, Nara Hwang, Durdonakhon Usmonova, Kangrim Cho, Nara Park

Touching Minds: Deep Generative Models Composing the Digital Contents to Practice Mindfulness

Interest in the proper treatment of mental health has been growing rapidly amid steep changes in society, family structure, and lifestyle. In addition, the COVID-19 pandemic has drastically accelerated this need worldwide, bringing about a huge demand for digital therapeutics for this purpose. One of the key ingredients in this effort is appropriately designed practice content for the prevention and treatment of mental illness. In this paper, we present novel deep generative models to construct mental training content based on the mindfulness approach, with a particular focus on providing Acceptance and Commitment Therapy (ACT) through self-talk techniques. To this end, we first introduce an ACT script generator for mindfulness meditation. With over one thousand sentences collected from various sources for ACT training practices, we develop a text generative model by fine-tuning a variant of GPT-2. Next, we introduce a voice generator to implement the self-talk technique, a text-to-speech application using the ACT training script generated above. Computational and human evaluation results demonstrate the high quality of the generated training scripts and self-talk content. To the best of our knowledge, this is the first approach to generate meditation content using artificial intelligence techniques, which is able to deeply touch the human mind to care for and cure the mental health of individuals. Applications would include main treatment content for digital therapeutics and meditation curriculum design.

So Hyeon Kim, Ji Hun Kim, Jung Ahn Yang, Jung Yeon Lee, Jee Hang Lee

Remote Monitoring of Disability: A Case Study of Mobility Aid in Rohingya Camp

Examining disability within the context of displacement is a vital area of study. Additionally, further study of assistive technology devices for refugees with disabilities and those in low-resource settings presents the opportunity to dramatically improve the safety and medical welfare of people with disabilities. Mobility Aid project is a pilot study in a Rohingya refugee camp with refugees who have suffered debilitating injuries and need to use crutches to walk. The goal of the project is to improve remote monitoring of disability in the context of displacement. It could be extended to many other environments where people walk with crutches on uneven and muddy terrain.

Suzana Brown, Faheem Hussain, Achilles Vairis, Emily Hacker, Maurice Bess

AI Based Convenient Evaluation Software for Rehabilitation Therapy for Finger Tapping Test

Among the clinical features of Parkinson’s disease, it is important to evaluate bradykinesia. To evaluate bradykinesia, the Finger Tapping Test, included as a kinematic test item in the Unified Parkinson’s Disease Rating Scale, is employed. For accurate evaluation, there is a need for a tool that can perform a Finger Tapping Test based on quantitative data. In this study, a novel AI-based approach to evaluate a human motion function quantitatively was proposed and demonstrated for use in rehabilitation therapy using MediaPipe. In a preliminary experiment, the finger tapping test was employed to evaluate its clinical utility. The developed software showed results that were very consistent with the expert’s evaluation. The AI-based software showed high potential for clinical use as a quantitative evaluation tool that is cost-effective and easy to use.

Seung-min Hwang, Sunha Park, Na-yeon Seo, Hae-Yean Park, Young-Jin Jung

A Smart Wearable Fall Detection System for Firefighters Using V-RNN

Falling is one of the leading causes of death among firefighters in China. Fall detection systems (FDS) have yet to be deployed in firefighting applications in China, negatively impacting the safety of firefighters on the fireground. Despite many studies exploring FDSs, few have explored the application of multiple sensors, or applications outside of geriatric healthcare. This study proposes a smart wearable FDS for detecting firefighter falls by incorporating motion sensors in nine different positions on the firefighter’s personal protective clothing (PPC). The firefighter’s fall activities are detected by a proposed RNN model combined with a Boyer-Moore Voting (BMV) Algorithm (V-RNN), and a fall alert can be issued at an early phase. The results indicated that the proposed FDS with optimized parameters achieves 97.86% sensitivity and 98.20% accuracy.

Xiaoqing Chai, Boon-Giin Lee, Matthew Pike, Renjie Wu, Wan-Young Chung
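
The Boyer-Moore voting step named in the abstract above can be illustrated with a short sketch. The abstract does not say how the votes are formed, so applying the vote to a sliding window of per-frame classifier labels is an assumption made here for illustration, as are the function names, the window length, and the example labels.

```python
def boyer_moore_majority(labels):
    """Return the strict-majority label in `labels`, or None if no label
    occupies more than half of the positions (Boyer-Moore majority vote)."""
    candidate, count = None, 0
    for label in labels:
        if count == 0:
            candidate, count = label, 1
        elif label == candidate:
            count += 1
        else:
            count -= 1
    # Second pass: the first pass only yields a candidate, so verify it.
    if candidate is not None and labels.count(candidate) * 2 > len(labels):
        return candidate
    return None


def smooth_predictions(frame_labels, window=5):
    """Assumed usage: replace each window of raw per-frame predictions
    with its majority label to suppress isolated misclassifications."""
    return [boyer_moore_majority(frame_labels[end - window:end])
            for end in range(window, len(frame_labels) + 1)]


# Hypothetical per-frame outputs of a classifier.
raw = ["walk", "walk", "fall", "walk", "walk", "fall", "fall", "fall"]
smoothed = smooth_predictions(raw, window=3)
```

In this example the isolated early "fall" label is voted away, while the sustained run of "fall" labels at the end survives the vote.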

CycleGAN Based Motion Artifact Cancellation for Photoplethysmography Wearable Device

Motion artifacts (MA) in photoplethysmography (PPG) signals remain a challenging problem in signal processing, although various methods have been researched and developed. Deep learning techniques have recently demonstrated the ability to overcome many limitations of traditional ones. In this study, we develop a protocol to build the PPG dataset and a CycleGAN-based model that can be used to remove MA from PPG signals at the radial artery. We verified that the assumption that noisy PPG signals are a linear combination of clean PPG and accelerometer (ACC) signals is not strong enough. Our evaluation showed that using the CycleGAN model, which consists of two opposite phases, to reconstruct PPG signals at the radial artery is feasible, but the quality of the signals needs further research.

Nguyen Mai Hoang Long, Jong Jin Kim, Boon Giin Lee, Wan Young Chung

Effect of Time Window Size for Converting Frequency Domain in Real-Time Remote Photoplethysmography Extraction

Remote photoplethysmography (rPPG) is an attractive technology that can measure vital signs at a distance without contact. Previous rPPG studies focused mainly on eliminating artifacts such as motion, but finding the optimal setup or hyperparameters is also an important factor influencing performance. One such hyperparameter is the window size, the length of the signal used to calculate the vital signs once in a spectral method, which has not been analyzed in detail in previous works. In general, a long window size increases the reliability of the estimations, but it cannot reflect the continuously changing physiological responses of humans. Conversely, too short a window size increases uncertainty. In this paper, we compare and analyze the pulse rate estimation results for window sizes from short to long using CHROM, one of the popular rPPG algorithms. Results on the PURE dataset showed that the longer the window size, the higher the SNR and the lower the RMSE. At a window size of about 4 s (120 frames), the SNR switched from negative to positive and an acceptable error rate (RMSE < 5) was observed.

Yu Jin Shin, Woo Jung Han, Kun Ha Suh, Eui Chul Lee
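
The window-size trade-off described in the abstract above can be made concrete with a short sketch. A frame rate of 30 fps is inferred from "4 s (120 frames)". The pass band of 0.7 to 4 Hz, the function names, and the synthetic test signal are assumptions for illustration. The sketch covers only the spectral peak-picking step, not the CHROM algorithm itself.

```python
import cmath
import math


def spectral_resolution_bpm(window_frames, fps=30.0):
    """Spacing between adjacent DFT bins, in beats per minute,
    for a window of `window_frames` samples (no zero padding)."""
    return 60.0 * fps / window_frames


def estimate_pulse_bpm(signal, fps=30.0, low_hz=0.7, high_hz=4.0):
    """Pick the strongest DFT bin inside an assumed heart-rate band
    and convert its frequency to beats per minute."""
    n = len(signal)
    best_freq, best_mag = None, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n
        if not (low_hz <= freq <= high_hz):
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_freq, best_mag = freq, abs(coeff)
    return None if best_freq is None else 60.0 * best_freq


# Synthetic 75 bpm (1.25 Hz) pulse observed for 4 s at 30 fps.
fps = 30.0
pulse = [math.sin(2 * math.pi * 1.25 * i / fps) for i in range(120)]
resolution = spectral_resolution_bpm(120, fps)
estimate = estimate_pulse_bpm(pulse, fps)
```

With 120 frames the bins are 15 bpm apart, which is one way to see why short windows give uncertain estimates. A 30 s window (900 frames) narrows the spacing to 2 bpm at the cost of slower response to physiological changes.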

Development of Application Usability Evaluation Scale for Seniors

Purpose: Applications for seniors are emerging because of the increasing number of seniors in an aging society and the popularization of smartphones in the 4th industrial revolution. The purpose of this paper is to establish a usability evaluation scale to develop more convenient and useful applications for seniors. Research design, data and methodology: First, the necessity of developing an application usability evaluation scale for seniors is discussed by examining the definition and context of seniors. Second, the primary usability evaluation factors were derived by collecting usability evaluation factors related to applications and seniors and by conducting in-depth interviews with experts. Third, the secondary usability evaluation factors were derived through a survey and statistical analysis based on the primary factors. Results: The usability evaluation scale was established with five factors (Cognition, Navigation, Feedback, Error, Aesthetic) and 20 items. Conclusions: This study can be the basis for guidelines for the development and design of applications for seniors. As a result, not only the young generation but also seniors can experience convenience in their daily lives by using applications customized for seniors.

Changhee Seo, Jin Suk Lee, Jieun Kwon

Analysis of the Utilization of Telehealth Worldwide and the Requisites for Its Effective Adoption

The popularization of medical devices in households and video calls during the COVID-19 pandemic have set grounds for the necessary conditions for telehealth to thrive. This allowed for patients with chronic diseases to be treated without the need for physical contact with a clinician and to access professional medics regardless of time. However, due to concerns for the safety of sensitive data of patients and both the quality and accuracy of medical treatment provided, telehealth was and is set to become strictly regulated. Finding the necessary technologies to facilitate mobile medical treatment as a viable option for those who struggle with attaining physical presence in hospitals is, therefore, necessary to maintain telehealth. This research aims to conduct user research on how to improve telehealth services to better serve the elderly population.

Yeun Soo Choi, Mingeon Kim, Ahhyun Ryu, Seowon Park, Jason Kim, Seongbin Lee

Intelligent Interaction and Cognitive Computing

Frontmatter

Cognition Prediction Model for MOOCs Learners Based on ANN

Massive Open Online Courses (MOOCs) have become increasingly popular in recent years. In a virtual world, examining the cognition processes of students is difficult. The primary goal of this study is to examine the influence of MOOCs on learning. This study presents a Cognitive Model based on brain signals for predicting the most effective MOOCs video lecture. In this work, students’ brain signals collected using an Electroencephalogram (EEG) device while watching MOOCs videos are used to classify their level of confusion using a publicly available dataset. The video that causes the least amount of confusion in the majority of students is chosen as the best. This paper proposes and analyses the Cognitive Model for MOOCs Learning. A Deep Learning-based Artificial Neural Network Model has been created to predict student confusion levels. The methodology has been built using 10K fold cross-validation and shown to be 97% accurate in predicting students’ misunderstandings while watching MOOCs videos. The proposed Cognitive Model will aid in the evaluation of MOOCs course performance.

Varsha T. Lokare, Laxman D. Netak, N. S. Jadhav

A Novel Methodology for Assessing and Modeling Manufacturing Processes: A Case Study for the Metallurgical Industry

Historically, researchers and practitioners have often failed to consider all the areas, factors, and implications of a process within an integrated manufacturing model. Thus, the aim of this research was to propose a holistic approach to manufacturing processes to assess their status and performance. Moreover, using the conceptual model, manufacturing systems can be modelled, considering all areas, flows, and factors in the respective areas of production, maintenance, and quality. As a result, the model serves as the basis for the integral management and control of manufacturing systems in digital twin models for the regulation of process stability and quality with maintenance strategies. Thus, a system dynamics simulation model based on the conceptual model is developed for a metallurgical process. The results show how the monitoring of all flows together with the optimal strategies in the quality and maintenance areas enable companies to increase their profitability and customer service level. In conclusion, the conceptual approach and the applied simulation case study allow better decision making, ensuring continuous optimization along the manufacturing asset lifecycle, and providing a unique selling proposition for equipment producers and service engineering suppliers as well as industrial companies.

Reschke Jan, Gallego-García Diego, Gallego-García Sergio, García-García Manuel

Research of the Deep Learning Model for Denoising of ECG Signal and Classification of Arrhythmias

In this paper, we propose a DL model that can remove noise signals and classify arrhythmias for effective ECG analysis. The proposed DL model removes noise included in the signal by inputting the ECG signal divided by a specific time into the DAE, and inputs the noise-removed signal to the CNN model to distinguish the normal ECG signal from the arrhythmic ECG signal. The model was trained using the MIT-BIH Arrhythmia DB and the MIT-BIH Noise Stress Test DB, and the performance of the DL model was evaluated with two experiments by constructing a separate evaluation data set that is not used for training. The first experiment compared the noise removal performance of the implemented DAE model and Pan-Tompkins’ QRS detection algorithm, and the second experiment performed the classification performance evaluation of the implemented CNN model. As a result of performance evaluation of the proposed DAE model, SNR_imp was 9.8310, RMSE was 0.0446, and PRD was 20.8869. In addition, as a result of classification performance evaluation, the accuracy was 98.50%, the recall rate was 98.0%, the precision was 98.99%, and F1 was 98.50%.

Ji-Yun Seo, Yun-Hong Noh, Do-Un Jeong
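
The denoising and classification metrics quoted in the abstract above (SNR_imp, RMSE, PRD, F1) are commonly defined as in the sketch below. The abstract does not state the exact formulas used, so these textbook definitions are an assumption, and the signal values are made up for illustration.

```python
import math


def rmse(clean, denoised):
    """Root-mean-square error between clean and denoised signals."""
    return math.sqrt(sum((c - d) ** 2 for c, d in zip(clean, denoised))
                     / len(clean))


def prd(clean, denoised):
    """Percentage root-mean-square difference relative to the clean signal."""
    num = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    den = sum(c ** 2 for c in clean)
    return 100.0 * math.sqrt(num / den)


def snr_improvement(clean, noisy, denoised):
    """Output SNR minus input SNR in dB, which reduces to the ratio of
    residual noise energy before and after denoising."""
    noise_in = sum((n - c) ** 2 for n, c in zip(noisy, clean))
    noise_out = sum((d - c) ** 2 for d, c in zip(denoised, clean))
    return 10.0 * math.log10(noise_in / noise_out)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Toy signals (illustrative values only, not ECG data).
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.5, -1.5, 1.5, -1.5]
denoised = [1.05, -1.05, 1.05, -1.05]

toy_rmse = rmse(clean, denoised)
toy_prd = prd(clean, denoised)
toy_snr_imp = snr_improvement(clean, noisy, denoised)
reported_f1 = f1_score(0.9899, 0.98)
```

Plugging the reported precision (98.99%) and recall (98.0%) into the F1 formula gives about 98.49%, which agrees with the reported 98.50% up to rounding of the inputs.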

XRTI: eXtended Reality Based Telepresence Interface for Multiple Robot Supervision

The XRTI system is an immersive augmented reality (AR) interface proposed to support the supervision of a robot team by providing a highly optimized and accessible method of robot system monitoring. Its design takes advantage of the augmented reality space to enhance the capacity of a single human supervisor to simultaneously monitor multiple telepresence robots. Its 3D data visualization capabilities use extended reality space to display environmental, sensory, and maintenance data provided by the robot over a wrap-around view to prevent data negligence due to human error. By increasing the accessibility of crucial data, the XRTI system aims to reduce cognitive overload of an on-site supervisor. It provides an interactive interface to assist the user’s data comprehension and is customizable to a user’s preference to prioritize certain data categories. This research can be further optimized to support NASA’s exploration extravehicular mobility units (xEMU) in the AARON (Assistive Augmented Reality Operational and Navigation) System. This paper provides proof of concept for the XRTI system.

Naomi Wang, Jong Hoon Kim

Validating Pre-requisite Dependencies Through Student Response Analysis

During examinations, students are often unable to solve problems requiring complex skills because they have forgotten the primary or prerequisite skills or use these skills infrequently. Students also have problems with advanced courses due to weak prerequisite skills. One of the challenges for an instructor is to find the reason why students are not able to solve complex-skill problems. The reason may be a lack of prerequisite skills, a failure to master the complex skill, or a combination of both. This paper presents an analysis of the dependencies between prerequisite skills and post-requisite skills. Three assessments are conducted. The assessments are based on the skills required for a computer programming course. Basic assessments are prerequisites to the intermediate skill assessments. Both of these assessments are prerequisites to the complex skills assessments. The analysis is carried out on the students’ responses. Because these variables are categorical, the Chi-Squared test is applied. A statistically significant dependency is observed between prerequisite skills and post-requisite skills.

Manjushree D. Laddha, Swanand Navandar, Laxman D. Netak
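
As an illustration of the Chi-Squared test of independence mentioned in the abstract above, the sketch below computes the test statistic for a contingency table of categorical outcomes. The table values are hypothetical and are not taken from the paper. The function name and the 2x2 layout (prerequisite item correct or incorrect against complex item correct or incorrect) are likewise assumptions.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a
    contingency table given as a list of equal-length rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the independence hypothesis.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = prerequisite item (correct, incorrect),
# columns = complex item (correct, incorrect).
responses = [[30, 10],
             [10, 30]]
statistic, dof = chi_squared_statistic(responses)

# Critical value of the chi-squared distribution for 1 degree of
# freedom at the 5% significance level.
CRITICAL_VALUE_DOF1_P05 = 3.841
dependent = statistic > CRITICAL_VALUE_DOF1_P05
```

For these made-up counts the statistic far exceeds the critical value, so the independence hypothesis would be rejected. With real response data a library routine that also returns the p-value would normally be preferred.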

Pneumonia Classification Using Few-Shot Learning with Visual Explanations

Deep learning models have demonstrated state-of-the-art performance in varied domains; however, there is still room for improvement when it comes to learning new concepts from little data. Learning relevant features from a few training samples remains a challenge in machine learning applications. In this study, we propose an automated approach for the classification of viral, bacterial, and fungal pneumonia using chest X-rays from a publicly available dataset. We employ distance-learning-based Siamese Networks with visual explanations for pneumonia detection. Our results demonstrate remarkable improvement in performance over conventional deep convolutional models with just a few training samples. We exhibit the powerful generalization capability of our model, which, once trained, effectively predicts new unseen data in the test set. Furthermore, we illustrate the effectiveness of our model by classifying diseases from the same genus, such as COVID-19 and SARS.

Shipra Madan, Anirudra Diwakar, Santanu Chaudhury, Tapan Gandhi

User Experience Design for Defense Systems with AI

As artificial intelligence (AI) is applied at an increasing frequency in various fields, the number of studies on the user experience (UX) design of human-AI interaction is also increasing. However, the results of these studies on AI UX design principles are insufficient for actual AI systems. In light of this fact, the purpose of this study was to upgrade the UX design of a defense system that uses AI technology to detect land changes and targets. In order to upgrade the UX design of this AI system, a three-step procedure was executed. First, AI UX principles were derived by analyzing literature related to human-AI interaction. Second, ideation was performed to improve the interface. Finally, the results of the ideation were utilized to construct the UX prototype of the AI system with Adobe XD. The results of this study are expected to be used as fundamental data for future research that will develop UX principles and advanced methods for AI systems.

Sunyoung Park, Hyun K. Kim, Yuryeon Lee, Gyuwon Park, Danbi Lee

Multilayer Tag Extraction for Music Recommendation Systems

With hundreds and thousands of songs being added to online music streaming platforms every day, it is a challenge to recommend songs that users want to hear at any given time. Classification of songs plays a vital role in any recommendation system, and when it comes to Indian music, there are a lot of parameters to be taken into consideration. This paper takes up this task: drawing on recent advances in data processing and signal processing, we apply classification processes to Indian music based on various parameters. These parameters include metadata of music, sentimental values, and technical features. India, being a diverse country with multiple cultural values, is home to a variety of local music. In various instances, these classification parameters play significant roles, especially when local music is involved in the process of recommendation. Classifying Indian music based on such parameters will lead to better results and help improve recommendation systems for Indian music.

Sameeksha Gurrala, M. Naagamani, Sayan Das, Pralhad Kolambkar, R. P. Yashasvi, Jagjeet Suryawanshi, Pranay Raj, Amarjeet Kumar, Shankhanil Ghosh, Sri Harsha Navundru, Rajesh Thalla

Interactive Visualization and Capture of Geo-Coded Multimedia Data on Mobile Devices

In digital community applications, geo-coded multimedia data including spatial videos, speech, and geo-narratives are collected and utilized by community users and researchers from multiple fields. It is often preferred that these data can be captured, visualized, and explored directly on mobile phones and tablets interactively. In this paper, we present a Geo-Video Mobile Application (GVM App) that collects geo-coded multimedia data for experts to process and analyze over an interactive visual exploration. This mobile App integrates user interactivity, AI-based semantic image segmentation, and audio transcription for effective data extraction and utilization. Then visualization functions are designed to quickly present geographical, semantic, and street view visual information for knowledge discovery. The users of this tool can include community workers, teachers, and tourists, and also span across multiple social disciplines in digital humanity studies.

Deepshikha Bhati, Md Amiruzzaman, Suphanut Jamonnak, Ye Zhao

Application of VR to Educational System of Korea

Since the outbreak of COVID-19, the number of online classes has skyrocketed as in-person classes in school have been discouraged. This led to a sudden shift toward the adoption of online classes, blurring the line between traditional in-person classes and modern technology. In the Republic of Korea, a nation known for its Internet speed and the high proportion of digital natives among its citizens, there have been numerous attempts to incorporate virtual reality (VR) into the existing curriculum, but these have not produced satisfactory results. Through interviews and research, this paper tries to assess the current position of VR in the marketplace and suggest possible solutions that can support the expansion of the system into schools. The main purpose of this paper is to analyze whether Google Arts and Culture, one of the most accessible VR-based educational platforms, can be blended into Korean education programs.

Yeun Soo Choi, Mingeon Kim, Ahhyun Ryu, Seowon Park, Sunwoo Kim

Architectural View of Non-face-to-face Experiential Learning Through the Immersive Technologies

The significant disruption caused by the COVID-19 pandemic has prevented face-to-face teaching and learning. The pandemic forced students to use online, i.e., non-face-to-face, modes of education via different platforms available on the internet. The internet allows students to use e-learning resources to learn from anywhere and at any time. This ease of use makes these systems more favorable among learners. The traditional face-to-face mode of teaching and learning, despite its high learning rate, is no longer viable in the pandemic situation. Many teaching institutes use online aids for content delivery. These platforms are effective at providing knowledge and educating students, but they do not focus on the active participation of students in an online class in the way a teacher observes students’ concentration in a physical classroom. Monitoring the engagement of students in the online mode of education seems to be difficult, but it can be achieved through the use of immersive technology. This paper provides an architectural view of the online teaching-learning system with the use of Virtual Reality to achieve better engagement of students in the virtual classroom through immersion.

Vinayak V. Mukkawar, Laxman D. Netak, Valmik B. Nikam

The Role of Artificial Intelligence (AI) in Assisting Applied Natya Therapy for Relapse Prevention in De-addiction

This paper explores the role of Artificial Intelligence in therapy for drug addiction and relapse. It also investigates the results of timely interventions through sensing changes in the body resulting from thoughts, learnt behaviour, and emotional responses during trauma and stress-inducing situations. One of the possible solutions we propose is to explore the principles of Applied Natya Therapy (ANT) and integrate them with Artificial Intelligence (AI) to assist with therapy and rehabilitation for patients with drug and substance abuse, as well as drug relapse. For a long time, scientists have been observing every physical aspect of the body’s movements and have successfully replicated and applied them in robots. However, translating human emotion into machine learning is the next frontier of scientific research, as it will open up significant opportunities in overall health and emotional well-being. The key element in any treatment is the prevention of relapse, especially for patients with drug and substance abuse. Relapse is the most dreaded condition for the patient as well as for the family and caretakers of patients with drug addiction. Addictions are difficult to treat because we need to focus on the time window before the relapse and not after. Recent findings corroborate the characterization of delay discounting as a candidate behavioral marker of addiction and may help identify subgroups that require special treatment or unique interventions to overcome their addiction. In a study published in the January 5 issue of Molecular Psychiatry, researchers found that machine learning could not only help predict the characteristics of a person’s depression, but also do this more effectively, and with less information, than traditional approaches [1]. The authors concluded that machine learning could be a clinically useful way to stratify depression [4]. This exploration will assist in capturing, in real time, data on the emotions and expressions generated during Applied Natya Therapy techniques, to be used in devising aids for relapse prevention.

Dimple Kaur Malhotra

Immersive and Tangible Interface

Frontmatter

Design of a Smart Puppet Theatre System for Computational Thinking Education

Many efforts have failed to achieve tangible models of a robotic theatre, as opposed to virtual or simulated theatres, despite many attempts to merge the progress of robotics with the growth of theatre and the performing arts. Many of the initiatives that have achieved significant progress in these domains are on a considerably larger scale, with the primary goal of entertaining rather than demonstrating the interdisciplinary nature of Robotics and Engineering. The purpose of this paper is to correctly unite the principles of Science, Technology, Engineering, Arts, and Mathematics in a small size robotic theatre that will allow for a more portable and changeable exhibition. The Tortoise and Hare play will be performed in the theatre, which is made up of both stage and puppet elements. A pan and tilt lighting system, audio integration via an external device, automated curtains with stepper motors, props, and a grid stage are among the stage’s components. A camera tracking module in the light system detects the location of a robot and communicates with the light fixtures to angle the spotlight. A transportable module that interacts wirelessly with its environment, as well as simple-moving, decorative puppet cutouts protruding from the module, make up the smart puppets. The mBlock IDE is used to edit the story in the theatre software, providing for a simple technique of programming the scene. The Smart Mini Theatre’s production of the Tortoise and Hare play intends to encourage performing arts students to experiment with robots and programming to create their own shows, in the hopes of inspiring them to pursue Robotics and Engineering as a potential career choice.

Raghav Kasibhatla, Saifuddin Mahmud, Redwanul Haque Sourave, Marcus Arnett, Jong-Hoon Kim

Smart Trashcan Brothers: Early Childhood Environmental Education Through Green Robotics

One of the main concerns of modern life is the potential risk of irreversible ecological damage. However, due to the lack of focus on the subject within education, this risk will only get worse if left unattended. This is where the robotics system known as the “Smart Trashcan Brothers” can foster better environmental consciousness in the current younger generation attending primary school. This paper goes over the concepts that make up the Smart Trashcan Brothers system, as well as a functional evaluation to verify that the described parts of the robotics system function as intended. A discussion of future work on Child Human Interaction follows.

Marcus Arnett, Saifuddin Mahmud, Redwanul Haque Sourave, Jong-Hoon Kim

Effects of Computer-Based (Scratch) and Robotic (Cozmo) Coding Instruction on Seventh Grade Students’ Computational Thinking, Competency Beliefs, and Engagement

The purpose of this pre-/posttest quasi-experimental study was to examine the effects of coding activities supported by the emotional educational robot Cozmo on seventh grade students’ computational thinking, competency beliefs, and engagement compared to the computer-based program of Scratch. Two versions of the coding curriculum were developed that shared the same content and instructional features but differed in the code blocks used in each program. Two intact classes at a public middle school in the Midwestern United States participated in the study during the regularly scheduled Technology course. One class received the Scratch coding curriculum (n = 21), and the other class received the robotics coding curriculum (n = 22). Results revealed non-significant posttest differences in computational thinking and competency beliefs between the Scratch and Cozmo interventions. However, students found Cozmo to be significantly more engaging than Scratch. Both interventions significantly improved students’ computational thinking and competency beliefs from pre- to posttest. This study contributes to the emerging literature on coding education in a public school setting. The positive gains in the cognitive and affective domains of learning can serve as a point of reference for researchers, designers, and educators with the desire to introduce students to coding.

Shannon Smith, Elena Novak, Jason Schenker, Chia-Ling Kuo

Design of a VR-Based Campus Tour Platform with a User-Friendly Scene Asset Management System

Virtual reality tours have become desirable for many educational institutions due to the potential difficulties for students to attend in person, especially during the COVID-19 pandemic. Giving a student the ability to explore the buildings and locations on a university campus is a crucial part of convincing them to enroll in classes. Many institutions have already installed tour applications that they have either designed themselves or contracted to a third party. However, these lack a convenient way for non-maintainers, such as faculty, to manage and personalize their classrooms and offices. In this paper, we propose a platform that not only provides a full virtual experience of the campus but also features a user-friendly content management system designed for staff and faculty to customize their assigned scenes. The tour uses the Unity3D engine, which communicates with a university server hosting a custom .NET API and SQL database to obtain information about the virtual rooms, with a role-based access system for faculty and staff. We believe this system for managing tour scenes will save both time and expense for the tour development team and allow them to focus on implementing other features, rather than having to fulfill requests for editing locations in the tour. We expect this framework to function as a tour platform for other universities, as well as small businesses and communities. We seek to demonstrate the feasibility of this platform through our developed prototype application. Based on small sample testing, we have received overall positive responses and constructive critique that has played a role in improving the application moving forward.

Stefan Hendricks, Alfred Shaker, Jong-Hoon Kim

Usability Analysis for Blockchain-Based Applications

In this digitization phase, many new applications are built by developers and used by end-users to meet their needs. However, very few applications remain popular among users. Popularity depends on functional aspects of an application, such as its mobility, user-friendliness, and pop-up advertisements, and on non-functional requirements, such as the memory space required and the security aspects fulfilled by the application. Blockchain technology is an emerging trend in the market, and many developers are developing numerous applications in various domains. Human-Computer Interaction (HCI) will play a significant role in designing graphical user interfaces. In this paper, we discuss the opportunities and challenges faced by developers while working on different projects.

Sanil S. Gandhi, Yogesh N. Patil, Laxman D. Netak, Harsha R. Gaikwad

Permission-Educator: App for Educating Users About Android Permissions

Cyberattacks and malware infestation are issues that surround most operating systems (OS) these days. Among smartphones, Android OS is more susceptible to malware infection. Although Android has introduced several mechanisms to avoid cyberattacks, including Google Play Protect, dynamic permissions, and sign-in control notifications, cyberattacks on Android-based phones are prevalent and continuously increasing. Most malware apps use critical permissions to access resources and data to compromise smartphone security. One of the key reasons behind this is users’ lack of knowledge about how permissions are used. In this paper, we introduce Permission-Educator, a cloud-based service to educate users about the permissions associated with the installed apps on an Android-based smartphone. We developed an Android app as a client that allows users to categorize the installed apps on their smartphones as system or store apps. The user can learn about permissions for a specific app and identify the app as benign or malware through the interaction of the client app with the cloud service. We integrated the service with a web server that lets users upload any Android application package file, i.e. apk, to extract information regarding the Android app and display it to the user.

Akshay Mathur, Ethan Ewoldt, Quamar Niyaz, Ahmad Javaid, Xiaoli Yang

An Intelligent System to Support Social Storytelling for People with ASD

The number of diagnoses of ASD (Autism Spectrum Disorder) is growing every day. Children with ASD lack social behaviors, which strongly impacts the inclusion of the child and his/her family in the community. Much scientific evidence shows that social storytelling is a valuable tool for developing pro-social behaviors in children with ASD and for articulating their emotional language, empathy, and expressive and verbal communication. To be effective, social stories should be customized for the target user, since there exist different levels of ASD. To tackle this issue, this paper proposes an application that supports the semiautomatic creation of social stories. The intelligent system learns the needs of the target user over time and adapts by creating social stories tailored to the user’s level of ASD. The editor lets non-expert caregivers select the appropriate elements and content representations to be included in the social stories.

Rita Francese, Angela Guercio, Veronica Rossano

Evaluating Accuracy of the Tobii Eye Tracker 5

Eye-tracking sensors are a relatively new technology and are currently used as an accessibility method to allow those with disabilities to use technology with greater independence. This study evaluates the general accuracy and precision of Tobii eye-tracking software and hardware, along with the efficacy of training a neural network to improve both aspects of the eye-tracker itself. With three human testers observing a grid of data points, the measured and known point locations are recorded and analyzed using over 250 data points. The study was conducted over two days, with each participant performing four trials. In this study, we use basic statistics and a k-means clustering algorithm to examine the data in depth and give insights into the performance of the Tobii-5 eye-tracker. In addition to evaluating performance, this study also attempts to improve the accuracy of the Tobii-5 eye-tracker by using a Multi-Layer Perceptron Regressor to reassign gaze locations to better line up with the expected gaze location. Potential future developments are also discussed.

Andrew Housholder, Jonathan Reaban, Aira Peregrino, Georgia Votta, Tauheed Khan Mohd

Evaluation of Accuracy of Leap Motion Controller Device

As Human-Computer Interaction has continued to advance with technology, Augmented Virtuality (AV) systems have become increasingly useful and have improved our interaction with technology. This article addresses the effectiveness of the Leap Motion Controller at capturing this interaction and presents new ways to improve the experience. First, we present and discuss data from test trials with human input to show how accurately the Leap Motion Controller represents the user’s hand motion. This provides the background information needed to understand our proposal for potential changes and modifications to the device that would improve the Human-Computer Interaction. It also provides insight into how implementing deep learning could improve the accuracy of this device. Improving the accuracy of the Leap Motion Controller could lead to increased usage of the device in games, and it could also potentially be used for educational purposes. While it has a long way to go, the Leap Motion Controller could potentially become an incredible resource for virtual environments in academia, the world of rehabilitation, and recreational use.

Anas Akkar, Sam Cregan, Yafet Zeleke, Chase Fahy, Parajwal Sarkar, Tauheed Khan Mohd

A Comparison of Input Devices for Gaming: Are Gamepads Still Useful in PC Environment?

As the PC has emerged as a new video game platform owing to its rapid performance progress, cross-platform compatibility has grown in prominence for developers and designers. In addition, the number of users who want to use their gamepads, the conventional input devices of video game consoles, has been gradually increasing. This study was designed to compare the performance and usability of two input devices, the gamepad and the keyboard-mouse setup, the traditional setup used by most PC users, by task type and difficulty level. The goal of this study is to explore the interface design guidelines for each device. In this study, we measured reaction time for performance and perceived workload for usability. The results indicated that the keyboard-mouse setup showed higher performance than the gamepad in general, and that the difference by difficulty level was intensified in the physical task group. However, this tendency was not found in the cognitive task group. The keyboard-mouse setup also showed lower mental demand and physical demand than the gamepad. This result suggests that designers should not require too many simultaneous inputs when designing interfaces for gamepads, and that users should understand the characteristics of each device in order to choose the one appropriate for their intended goals.

Yunsun A. Hong, Soojin O. Peck, Ilgang M. Lee

The Value of eCoaching in the COVID-19 Pandemic to Promote Adherence to Self-isolation and Quarantine

A Digital electronic Coach (eCoach) app was built and evaluated during the Covid-19 pandemic in The Netherlands. Its aim was to provide support for individuals that had to either quarantine or self-isolate after a positive corona test or an indication of a heightened risk of infection. The coach (“IsolationCoach”), its value and uses were evaluated in 29 semi-structured interviews with individuals who had quarantined or isolated themselves or were part of the general Dutch public. Three main findings emerge. First, participants found value in a digital coach that would help them comply with quarantine or isolation instructions and provided information on the practical challenges of organizing their quarantine or isolation. Second, the usage of the app, which gradually and conditionally provides relevant information as opposed to conventional paper pamphlets/email, was greatly appreciated. Third, participants experienced a need for mental support during their period of isolation or quarantine, and this could at least partially be filled by the eCoach, which provided emotional support through a Socratic method styled form of self-reflection. It was beneficial that the app was implemented rapidly within weeks using a ready-to-use platform and that its content was assessed by experts from various health-related disciplines prior to rollout. Yet, for large-scale implementation, an integrated vision and digital strategy is needed to align forms of support by the health authorities.

Jan Willem Jaap Roderick van ’t Klooster, Joris Elmar van Gend, Maud Annemarie Schreijer, Elles Riek de Witte, Lisette van Gemert-Pijnen

Signal Processing and Complex System

Frontmatter

Methods of Constructing Equations for Objects of Fractal Geometry and R-Function Method

The paper discusses methods for constructing equations for objects of fractal geometry and the method of R-functions. Basic concepts of the theory of fractals, their areas of application, and their types are presented. The basic methods of constructing fractals are considered: the L-system method, iterated function systems, the set theory method, and the R-function method. Equations of complex structures of fractal geometry have been developed based on the R-function method. Using the equation of a straight line, the equation of a circle, and the constructive means of the R-function system R0 (R-conjunctions and R-disjunctions), various kinds of fractals are constructed, including equations of fractals consisting of intersections of lines and tangencies of circles. Based on these equations, various prefractals were generated by specifying the number of iterations n and the angle of inclination. Equations are constructed for fractal antennas based on the “Cayley tree”, fractal ring monopoles, and the Sierpinski curve, which are used in antenna design. These fractals are visually appealing and can be used in the creation of computer landscapes, in various illustrations, in telecommunications, in the textile industry, in drawing patterns on ceramic and porcelain products, as well as in the development of patterns for the modern design of Uzbek national carpets, fabrics, costumes, etc.

Sh. A. Anarova, Z. E. Ibrohimova
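
As a rough illustration of the R0 conjunction and disjunction mentioned in the abstract above, the following minimal sketch combines the implicit equations of a circle and a straight line into single equations for composite regions. It is not the authors' construction: the helper names (`r_and`, `r_or`, `circle`, `half_plane`) and the sign convention (a function is positive inside its region) are assumptions made for this example.

```python
import math

def r_and(a, b):
    """R0-conjunction: positive only where both a and b are positive."""
    return a + b - math.sqrt(a * a + b * b)

def r_or(a, b):
    """R0-disjunction: positive where at least one of a, b is positive."""
    return a + b + math.sqrt(a * a + b * b)

def circle(x, y, cx, cy, r):
    """Implicit disc: positive inside the circle of radius r centred at (cx, cy)."""
    return r * r - (x - cx) ** 2 - (y - cy) ** 2

def half_plane(x, y):
    """Implicit half-plane y >= 0, bounded by the straight line y = 0."""
    return y

def upper_half_disc(x, y):
    """Intersection of the unit disc with the upper half-plane."""
    return r_and(circle(x, y, 0.0, 0.0, 1.0), half_plane(x, y))

def two_discs(x, y):
    """Union of two unit discs that touch at the origin."""
    return r_or(circle(x, y, -1.0, 0.0, 1.0), circle(x, y, 1.0, 0.0, 1.0))
```

A point belongs to a composite region when the combined function is positive there, so iterating such combinations is one way a whole prefractal could be written as a single equation.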

Determination of Dimensions of Complex Geometric Objects with Fractal Structure

This article is devoted to determining the dimensions of complex geometric objects with fractal structures. A detailed description of the different mathematical methods for determining the dimensions of complex geometric objects with a fractal structure and an investigation of the errors in determining the fractional dimension of such objects are presented. The article presents the concept of fractal dimension, its properties, topological dimension, dimensions of designs and scenes in nature, differences between the Hausdorff-Besicovitch dimension and the Mandelbrot-Richardson dimension, and fractal measurements. Dimensions of complex geometric objects with several fractal structures have also been determined. In particular, the Mandelbrot-Richardson scale was used to calculate the fractal dimensions of four-sided star fractals, eight-sided star fractals, the Cox curve, and the Given (cap) curves. Hausdorff-Besicovitch and Mandelbrot-Richardson dimensions were used to determine the fractal scale. Most articles describe the study of the properties of complex objects in graphical form. In this article, the dimensional properties of complex objects are studied on the basis of mathematical equations, and special methods are used to compare and calculate the fractional dimensions of fractal structures, together with the results of a number of experiments at each iteration, which are presented in formulas and charts. In addition, different methods of measuring images of fractal structures are presented, as well as information on their practical application.

Kh. N. Zaynidinov, Sh. A. Anarova, J. S. Jabbarov
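
To make the idea of a fractional dimension in the abstract above concrete, here is a small sketch of two textbook estimates. It does not reproduce the paper's calculations: the Koch curve is chosen only as a familiar example, and the function names are hypothetical. The similarity dimension is D = log N / log(1/r) for N copies scaled by r, and a Richardson-style estimate fits the measured length L(eps) ~ eps^(1 - D) on a log-log scale.

```python
import math

def similarity_dimension(n_copies, scale):
    """D = log(N) / log(1/r) for a self-similar set of N copies scaled by r."""
    return math.log(n_copies) / math.log(1.0 / scale)

def richardson_dimension(steps, lengths):
    """Estimate D from measured lengths L(eps) ~ eps**(1 - D).

    Fits a least-squares line to (log eps, log L); its slope is 1 - D.
    """
    xs = [math.log(e) for e in steps]
    ys = [math.log(length) for length in lengths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1.0 - slope

# Assumed example data: at iteration k the Koch curve has segments of
# length 3**-k and total length (4/3)**k.
koch_steps = [3.0 ** -k for k in range(1, 7)]
koch_lengths = [(4.0 / 3.0) ** k for k in range(1, 7)]

koch_similarity = similarity_dimension(4, 1.0 / 3.0)
koch_richardson = richardson_dimension(koch_steps, koch_lengths)
```

For exactly self-similar data like this, both estimates agree with log 4 / log 3; for measured curves the fitted slope only approximates the dimension.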

Performance Analysis of the IBM Cloud Quantum Computing Lab Against MacBook Pro 2019

Quantum Computing is the conjunction of Quantum Physics, Computer Science, Mathematics and Nanotechnology. While this technology is extremely complex and unexplored, this paper addresses and explains the basic functioning of these devices. Additionally, it covers its most tangible applications today, as well as the short-term implementation and development of these applications. Our research reflects the experimental performance of IBM’s Quantum Computer Cloud Lab. This is an environment designed to interact with IBM’s Quantum Computer by using the Jupyter Notebook interface, the Conda package and environment manager, and Python. The results of different computations were mirrored on a 2019 MacBook Pro. The outcomes of these experiments were unexpected due to the low performance of this tool.

Alvaro Martin Grande, Rodrigo Ayala, Izan Khan, Prajwal Sarkar, Tauheed Khan Mohd

Algorithms and Service for Digital Processing of Two-Dimensional Geophysical Fields Using Octave Method

This paper covers new algorithms for digital signal processing of two-dimensional geophysical fields using the octave method, which predicts the mineral value of the field in terms of the signal energy value. In addition, it presents a functional scheme of the platform service based on cloud technologies. The essence of the work is that geophysical data are two-dimensional, so their volume is very large. This requires the use of fast algorithms for digital processing of large amounts of data. Therefore, if the octave method is used effectively, the required result is obtained by calculating the value of the signal energy and comparing the finite difference of these values with its previous value.

H. N. Zaynidinov, Dhananjay Singh, I. Yusupov, S. U. Makhmudjanov

Methods for Determining the Optimal Sampling Step of Signals in the Process of Device and Computer Integration

In this paper, digital signal processing methods and their solutions for HCI are described. Most problems are connected with acquiring signals or data from real-time devices. Data is often serial, streamed, etc. Device and computer integration in HCI focuses on digital signal processing. Today, the use of interpolation methods in the digital processing of biomedical signals is important, as it allows the detection and diagnosis of diseases. This paper discusses the construction of a signal model using the spline-wavelet interpolation formula for equal intervals in the digital processing of biomedical signals.

Hakimjon Zaynidinov, Dhananjay Singh, Sarvar Makhmudjanov, Ibrohimbek Yusupov

Integrated Analogical Signs Generator for Testing Mixed Integrated Circuits

This paper presents the design of a functional block for testing analog and mixed-signal integrated circuits. The objective is that this functional block is embedded into an integrated circuit, IC, to generate the stimuli of the analog functional blocks. The result is a simple block with the ability to generate analog stimuli, as evidenced in the simulations carried out.

José L. Simancas-García, Farid A. Meléndez-Pertuz, Harold Combita-Niño, Ramón E. R. González, Carlos Collazos-Morales

Modeling and Metric of Intelligent Systems

Frontmatter

Comparison of Various Deep CNN Models for Land Use and Land Cover Classification

The activities of identifying kinds of physical objects on land from images captured by satellite and labeling them according to their usage are referred to as Land Use and Land Cover Classification (LULC). Researchers have developed various machine learning techniques for this purpose. The effectiveness of these techniques has been individually evaluated. However, their performance needs to be compared against each other, particularly when they are used for LULC. This paper compares the performance of five commonly used machine learning techniques, namely Random Forest, two variants of Residual Networks, and two variants of Visual Geometry Group models. The performance of these techniques is compared in terms of accuracy, recall and precision using the EuroSAT dataset. The performance profiling described in this paper could help researchers to select a given model over other related techniques.

Geetanjali S. Mahamunkar, Laxman D. Netak

Mathematical Modeling of the Nostational Filteration Process of Oil in the System of Oil Deposits Related to Slow Conductor Layers

This paper discusses the mathematical model of the filtration process in the horizontal section of the motion of fluids in a three-layer oil system in a non-homogeneous reciprocal dynamic interaction in a porous medium, their interaction dynamics and the interaction in the layers. The mathematical model of the problem consists of three interconnected differential equations of one-dimensional parabolic type. An efficient computational algorithm for solving the boundary value problem built on the mathematical model has been developed to determine the main parameters of the filtration process. For the system of finite differences, the formula for finding the solution based on the driving method was determined, and an algorithm for finding the driving method coefficients was developed. The formula for finding the values of the pressure function and the driving coefficient at the boundary is defined. Based on the developed algorithm, software was created, computational experiments were conducted, and the results were presented graphically for different situations. Computational experiments were performed on the main parameters of the filtration process of oil in porous media associated with three-layer slurry, and the filtration process was analyzed and studied on the basis of the obtained results.

Elmira Nazirova, Abdug’ani Nematov, Rustamjon Sadikov, Inomjon Nabiyev

A Novel Metric of Continuous Situational Awareness Monitoring (CSAM) for Multi-telepresence Coordination System

Humans will remain in the loop of robotic systems as they switch from semi-autonomous to autonomous decision making. Situational awareness is a key factor in how efficient a human is in a human-robot system. This study examines the role that visual presentation mediums have on the situational awareness of remote robot operators. Traditional display monitors and virtual reality headsets are compared for their ability to provide a user with situational awareness of a remote environment. Additionally, a novel metric, Continuous Situational Awareness Monitoring (CSAM), is introduced to capture a participant’s environmental awareness. Participants are asked to monitor either one or multiple robots as they navigate through a simulated environment. Results indicate that virtual reality as a medium is more efficient in keeping an operator situationally aware of a remote environment.

Nathan Kanyok, Alfred Shaker, Jong-Hoon Kim

AI-Based Syntactic Complexity Metrics and Sight Interpreting Performance

Complex syntax may lead to increased cognitive effort during translation. However, it is unclear what kinds of syntactic complexity have a stronger impact on translation performance. In this paper, we employ several syntactic metrics which enable us to explore the impact of syntactic complexity on quality in English-to-Chinese sight interpreting. We have operationalized syntactic complexity by six metrics, namely, Incomplete Dependency Theory metric (IDT), Dependency Locality Theory metric (DLT), Combined IDT and DLT metric (IDT+DLT), Left Embeddedness metric (LE), Nested Nouns Distance metric (NND), and Bilingual Complexity Ratio metric (BRC). Three professional translators have manually annotated translation errors using MQM-derived error taxonomies, which include accuracy, fluency, and style errors, each classified as critical or minor. We assessed inter-rater agreement by adopting weighted Fleiss’ Kappa scores. We found that there are strong correlations between the IDT and IDT+DLT metrics and sight interpreting errors. We also found that language-specific syntactic differences between English and Chinese, such as directions of branching and noun modifiers, can have a strong influence on accuracy and critical errors.

Longhui Zou, Michael Carl, Mehdi Mirzapour, Hélène Jacquenet, Lucas Nunes Vieira

Gender Detection Using Voice Through Deep Learning

In an online or digital environment in particular, it is sometimes important to detect gender by means other than visual or facial recognition. This article therefore addresses detecting the gender of a person from their voice. Gender Detection Using Voice makes it easier to implement security protocols that require gender detection with better accuracy, without having people remove pieces of clothing, masks, or accessories for facial recognition. It can also be embedded in medical appliances, as it can help detect some vocal pathologies, such as coughing and breathing differently, which also depend on gender. In addition, it can help detect criminals’ gender through video surveillance and, in businesses, support customized advertisement. The model measures the voices of males and females for optimal accuracy. Our model achieved an accuracy of 90.95% by using feature extraction on a dataset of 500 hours of voice recordings.

Vanessa Garza Enriquez, Madhusudan Singh

Night Vision, Day & Night Prediction with Object Recognition (NVDANOR) Model

Night vision has been one of the key developments in computer vision systems, as it addresses an area where human ability is weakest. Object detection is a reliable and efficient tool for recognizing objects in scenarios such as daytime images, where the illumination is good. However, night pictures tend to be challenging for human beings to recognize and usually carry less data than images taken during the day, because poor contrast against the background interferes with clearly recognizing and labeling objects. Different models have been proposed for night vision image processing that use denoising, deblurring, and enhancement techniques; however, other methods can be used to enhance such pictures and make them as usable and understandable as possible. In addition, different prediction methods and models have been developed to achieve different degrees of object recognition in such images, yet their accuracy can still be improved. In this paper, we propose a model that predicts the time of day shown in a picture by calculating the average brightness of images from different time periods in HSV. The model includes ResNet-50 and VGG-16 classifiers that can also recognize the objects and buildings in the image with good accuracy. Implementing deep learning algorithms and image brightness enhancement tools helped us achieve improved accuracy and better prediction. The model achieved 94% in day and night prediction and 93.75% in object detection on night images.

Akobir Ismatov, Madhusudan Singh
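
The brightness-based day/night step described in the NVDANOR abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pixel representation (a flat list of RGB tuples), and the 0.4 threshold are assumptions chosen for illustration, and the ResNet-50 and VGG-16 classifiers are not covered.

```python
import colorsys


def mean_brightness(pixels):
    """Average HSV value (V channel) over an iterable of (r, g, b) tuples in 0..255."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        # colorsys expects channel values in the range 0..1
        _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total += v
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return total / count


def classify_time_of_day(pixels, threshold=0.4):
    """Label an image 'day' or 'night'; the threshold is a hypothetical value."""
    return "day" if mean_brightness(pixels) >= threshold else "night"
```

In practice such a threshold would be fitted on labeled images from different time periods rather than fixed by hand.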

A Built-in Concentration Level Prediction Device for Neuro Training System Based on EEG Signal

This study aims to develop a built-in concentration level feedback system using EEG signals. The system includes an embedded device for electroencephalography (EEG) acquisition from two electrodes mounted at designated positions on the frontal scalp for concentration level prediction. The EEG-based feature selected for this study is the relative power spectral density (PSD) extracted from five EEG bands (Delta, Theta, Alpha, Beta, Gamma) using the Fast Fourier Transform (FFT). Two standard machine learning models, a support vector machine (SVM) and a multilayer perceptron (MLP), are then trained on a personal computer (PC) with the relative PSD as input for concentration level prediction. Based on the performance evaluation, the MLP is deployed on the device for real-time concentration level prediction. The results demonstrate the feasibility of our EEG-based built-in concentration level prediction device in real-life applications.

Ha-Trung Nguyen, Ngoc-Dau Mai, Jong-Jin Kim, Wan-Young Chung
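
The relative PSD feature named in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the band edges in `BANDS` are commonly used values that the abstract does not specify, and a naive discrete Fourier transform replaces the FFT to keep the example dependency-free (it yields the same spectrum, only more slowly).

```python
import cmath
import math

# Assumed band edges in Hz; the abstract does not give them.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def power_spectrum(signal, fs):
    """One-sided power spectrum via a naive DFT (O(n^2), fine for short windows)."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def relative_band_power(signal, fs, bands=BANDS):
    """Power in each band divided by the total power across all listed bands."""
    freqs, power = power_spectrum(signal, fs)
    absolute = {
        name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
        for name, (lo, hi) in bands.items()
    }
    total = sum(absolute.values())
    if total == 0:
        raise ValueError("signal has no power in the selected bands")
    return {name: value / total for name, value in absolute.items()}
```

The five resulting ratios form a feature vector of the kind that could be fed to an SVM or MLP classifier.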

Towards Man/Machine Co-authoring of Advanced Analytics Reports Around Big Data Repositories

This paper explores the problem of generating advanced analytical reports with machine assistance in order to gain sophisticated insight from massive databases. The study presents a model that takes a country-specific scientometric analysis report on scientific research as a template and draws on a curated source database to generate a similarly insightful report for other countries. The overall process consists of three key phases. The first phase processes the template report to identify the generalizable data elements. The second phase extracts these elements for the selected country from a scholarly database. The third phase reassembles the high-level report for the new case. A case study on big data analysis is presented for scientific research publications from Saudi Arabia. The generated co-authored report was evaluated by 10 human reviewers against several criteria and received a satisfactory evaluation.

Amal Babour, Javed Khan

WTM to Enhances Predictive Assessment of Systems Development Practices: A Case Study of Petroleum Drilling Project

Software engineering has devised several project management metrics to optimize implementation and obtain a highly efficient product in less time and at lower cost. The inventors of these metrics have faced many challenges in finding measurements that can anticipate accurate results and help avoid errors and risks in the advanced stages of a project. The systems most affected by errors during the development process are safety-critical systems (SCS), as 60% of failures during operation are due to errors during development. This paper proposes a metric, called the Weighted Test Metric (WTM), that uses weights and milestones to predict implementation in the advanced stages of a project. WTM enhances reliability assessment and reduces failures during project development by predicting Standards Achievement (SA) in the next test. WTM results showed that faults during the development of a petroleum drilling project can be reduced to 0.67% and the overall reliability enhanced to 99.16%, compared with the actual result of 98.30%. This paper focuses on the question “How to enhance reliability assessment and reduce failures during project development activities?” and addresses it by applying WTM in the development stages of the petroleum drilling project.

Abdulaziz Ahmed Thawaba, Azizul Azhar Ramli, Mohd. Farhan Md. Fudzee

Face and Face Mask Detection Using Convolutional Neural Network

The COVID-19 outbreak has posed a severe healthcare concern in Malaysia. Wearing a mask is the most effective way to prevent infections. However, some Malaysians refuse to wear a face mask for a variety of reasons. This work proposes a real-time face and face mask detection method using image processing techniques to promote the wearing of face masks. Haar Cascade is used for face detection to extract the features of human faces. Face mask detection, in turn, uses a convolutional neural network (CNN) based on the MobileNetV2 model, trained using Python, Keras, and TensorFlow. The OpenCV package was used as the interface connecting the algorithms to a web camera. Based on the detection rate analysis of the experimental results, face detection achieves 90% true and 10% false detections, which is a very good detection rate. Furthermore, the training accuracy and validation accuracy of the face mask detector are close to 1.0 and remain steady over time. Training loss and validation loss are close to zero and decrease over time, indicating that the algorithm performs accurately and efficiently on a dataset of 4000 images.

Muhammad Mustaqim Zainal, Radzi Ambar, Mohd Helmy Abd Wahab, Hazwaj Mhd Poad, Muhammad Mahadi Abd Jamil, Chew Chang Choon

Evaluating the Efficiency of Several Machine Learning Algorithms for Fall Detection

Falls among the elderly are a growing phenomenon worldwide. According to the World Health Organization (WHO), falls are the second leading cause of unintentional or accidental deaths among the elderly. Thus, research on the development of fall detection systems is imperative. Researchers have used various approaches to develop fall detection systems, a significant number of which employ machine learning (ML) algorithms. A robust deep neural network for fall detection (FD-DNN), which detects falls using a self-built low-power sensor, is identified as the current state of the art. By evaluating the efficiency of six machine learning algorithms on a publicly available joint fall detection dataset, we increased the fall detection accuracy from 99.17% to 99.88% using the K-nearest neighbor algorithm. This indicates that common machine learning algorithms can achieve identical or higher accuracy, rendering complex and expensive deep neural network-based fall detection systems inefficient.

Parimala Banda, Masoud Mohammadian, Gudur Raghavendra Reddy

Fuzzy Logic Based Explainable AI Approach for the Easy Calibration of AI Models in IoT Environments

The Internet of Things (IoT) will soon permeate all aspects of human existence, making it possible to construct a smart world. For this to happen, however, the problem of extracting meaningful information from raw sensory input in noisy and complicated settings must be addressed. For example, bandwidth, processing power, and power consumption must be taken into account when building a feasible IoT system. Due to the current epidemic, the need for contactless solutions has risen. Possible solutions include a gesture-based control system that protects user privacy and can operate several different appliances simultaneously. Such gesture-based control systems are typically implemented with opaque-box artificial intelligence (AI) models. These opaque-box models have shown good performance metrics on in-distribution data when tested in a lab. However, their complexity and opaqueness make them prone to failure when exposed to real-world out-of-distribution input. Explainable AI models based on fuzzy logic (EAI-FL), in contrast, demonstrate comparable performance on lab data distributions. Moreover, the type-2 fuzzy models are readily calibrated and modified so that their real-world performance matches that attained on the lab in-distribution data.

Mohammed Alshehri

AI-Inspired Solutions

Frontmatter

Using Mask-RCNN to Identify Defective Parts of Fruits and Vegetables

Fruits and vegetables are a major source of food for humans after cereals. Since the evolution of civilizations, they have been gathered, cultivated, and modified according to our needs. During the process of modification and harvesting, there might be diseased variants that are unfit for consumption. Manual removal or segregation of diseased fruits and vegetables is a time-consuming process on a large scale, which could be automated in the near future with the help of artificial intelligence. This could be done by training a machine, using machine learning algorithms and an annotated dataset, to recognize which fruits and vegetables are fit for consumption and which are not. The goal of this study is to introduce a dataset containing 11 classes of fruits and vegetables annotated for instance segmentation tasks and to show the effectiveness of the dataset in simplifying quality testing and analysis. This paper begins by explaining the usage of the Mask-RCNN [5] algorithm, then explains the properties of the dataset, and finally discusses the areas of application where the dataset can be used.

Sai Raghunandan Suddapalli, Perugu Shyam

Attitude Control for Fixed-Wing Aircraft Using Q-Learning

In recent years, there have been many advances in the field of Reinforcement Learning (RL). RL algorithms have achieved master-level human abilities in games such as Go, chess, and Atari games. The capabilities of RL algorithms have also been tested in the automated transportation field for self-driving cars and aerial vehicles, where they are used to aid drivers and pilots in various situations. In this paper, we apply Reinforcement Learning models to simulated airplane flight. In particular, we develop and test a Reinforcement Learning based methodology for airplane stabilization. Using reward functions and Q-Learning based modeling, we analyze and evaluate how an agent can learn to control a simulated Cessna 172 so that it stabilizes itself in flight. Our results show that, after training, the agent learns to achieve a stable attitude for the airplane. We perform the experiments using QPlane, which incorporates two independently developed, realistic flight simulators, X-Plane 11 and JSBSim. The agent is trained in JSBSim and tested in both simulators. Results of the analysis are presented and discussed.

David J. Richter, Lance Natonski, Xiaxin Shen, Ricardo A. Calix
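
For readers unfamiliar with Q-Learning, the core update rule behind the approach in the abstract above can be sketched in tabular form. This is a generic textbook sketch, not QPlane's code: the state and action labels, the learning rate `alpha`, the discount `gamma`, and the exploration rate `epsilon` are illustrative assumptions, and the paper's state discretization and reward functions are not reproduced here.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)); return new value."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Explore with probability epsilon, otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical usage: unseen state-action pairs start at a value of 0.0.
q_table = defaultdict(float)
controls = ["pitch_up", "pitch_down"]
q_update(q_table, "nose_low", "pitch_up", 1.0, "level", controls)
```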

Exploiting Federated Learning Technique to Recognize Human Activities in Resource-Constrained Environment

Conventional machine learning (ML) and deep learning (DL) methods use large amounts of data to construct prediction models for recognizing human activities in a central fusion center. However, such model training incurs high communication costs and leads to privacy infringement. To address the issues of high communication overhead and privacy leakage, we employ a widely used distributed ML technique called Federated Learning (FL), which generates a global model for predicting human activities by combining the local knowledge of participating agents. State-of-the-art FL models fail to maintain acceptable accuracy when there is a large number of unreliable agents that can inject false models, or resource-constrained agents that fail to perform an assigned computational task within a given time window. We developed an FL model for predicting human activities that monitors each agent's contribution towards model convergence and excludes unreliable and resource-constrained agents from training. We assign a score to each client when it joins the network and update the score based on the agent's activities during training. We consider three mobile robots as FL clients that are heterogeneous in terms of resources such as processing capability, memory, bandwidth, battery life, and data volume, in order to understand the effects of a real-world FL setting in the presence of resource-constrained agents. We consider an agent unreliable if it repeatedly responds slowly or injects incorrect models during training. By disregarding the unreliable and weak agents, we carry out the local training of the FL process on selected agents only. If a weak agent is nevertheless selected and starts showing straggler issues, we leverage an asynchronous FL mechanism that aggregates the local models whenever it receives a model update from an agent. Asynchronous FL eliminates the issue of waiting a long time for model updates from weak agents. Finally, we simulate how the behavior of the agents can be tracked through a reward-punishment scheme and present the influence of unreliable and resource-constrained agents on the FL process. We found that FL performs slightly worse than centralized models when there are no unreliable or resource-constrained agents. However, as the number of malicious and straggler clients increases, our proposed model recognizes human activities more effectively than state-of-the-art FL and ML approaches by identifying and avoiding those agents.

Ahmed Imteaj, Raghad Alabagi, M. Hadi Amini
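
The reward-punishment scoring and asynchronous aggregation described in the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the reward and penalty sizes, the selection threshold, the mixing weight, and the representation of a model as a flat list of floats are hypothetical, and the paper's actual scheme may differ.

```python
def update_score(score, responded_in_time, update_accepted, reward=1.0, penalty=2.0):
    """Reward a timely, valid update; punish a slow response or a rejected model."""
    if responded_in_time and update_accepted:
        return score + reward
    return score - penalty


def select_agents(scores, min_score=0.0):
    """Keep only agents whose current score is at or above the threshold."""
    return sorted(agent for agent, s in scores.items() if s >= min_score)


def async_aggregate(global_weights, local_weights, mixing=0.5):
    """Blend one agent's update into the global model as soon as it arrives."""
    return [(1 - mixing) * g + mixing * l
            for g, l in zip(global_weights, local_weights)]
```

Because `async_aggregate` needs only a single local update, the server in this sketch never waits for stragglers, which is the property the abstract attributes to asynchronous FL.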

Modeling Human Decision-Making Delays and Their Impacts on Supply Chain System Performance: A Case Study

The interaction between computer systems and humans is largely driven by the decision-making process of human beings. In this interaction process, delays are frequently unseen and therefore unforeseen, and their impacts are not considered as losses or as potential areas for improvement. The goal of this paper is therefore to design a conceptual model for calculating the impacts of delays in decision-making interactions between humans and computerized systems. The model considers the sum of delays that occur for various reasons, including human delays, interface delays, and computer delays. The conceptual model is then applied to a supply chain system in which different human decisions interact with the digital planning model of an automotive manufacturer. The purpose of the simulation model is to quantify the loss and improvement potentials of capacity measures as a function of decision process delays across strategic, tactical, and operational planning horizons, based on a defined target system. The results show that delays significantly affect supply chain performance. Finally, a methodological approach is presented for assessing the impacts of the delays in a sensitivity analysis.

Diqian Ren, Diego Gallego-García, Salvador Pérez-García, Sergio Gallego-García, Manuel García-García

reSenseNet: Ensemble Early Fusion Deep Learning Architecture for Multimodal Sentiment Analysis

Multimodal sentiment analysis is an actively emerging field of research in deep learning that deals with understanding human sentiments based on more than one sensory input. In this paper, we propose reSenseNet, an ensemble early fusion architecture that combines a deep convolutional neural network (CNN) and Long Short-Term Memory (LSTM) for multimodal sentiment analysis of audio, visual, and text data. reSenseNet consists of feature extraction, feature fusion, and fully connected layers stacked together as a three-layer architecture. Instances of the generalized reSenseNet architecture were evaluated on several combinations of modalities, forming different variations of the test data. These combinations predicted average arousal and valence with F1 scores of up to 50.91% and 35.74%, respectively.

Shankhanil Ghosh, Chhanda Saha, Nagamani Molakathala, Souvik Ghosh, Dhananjay Singh

Analysis of User Interaction to Mental Health Application Using Topic Modeling Approach

Mental health-related illnesses like depression and anxiety have become a major concern for society. Due to social stigma and lack of awareness, many such patients lack proper medical consultation, diagnosis, and treatment. Many of these problems arise from the urban lifestyle and from social and family disconnection. Nonetheless, various health-related mobile applications offer ways to minimize the risks by providing options for remote doctor consultations, connecting with family and friends, or sharing thoughts. To investigate the effectiveness and discover the key issues of such apps, we analyze users' interactions, discussions, and responses to mental health apps through topic modeling approaches. The experimental results show that users found many applications helpful and complained mostly about technical issues and business practices. We also evaluated topics and keywords from user feedback to further improve the application interface and functionalities.

Ajit Kumar, Ankit Kumar Singh, Bong Jun Choi

Alzheimer’s Dementia Recognition Using Multimodal Fusion of Speech and Text Embeddings

Alzheimer’s disease related Dementia (ADRD) compromises memory, thinking, and speech. This neurodegenerative disease severely impacts the cognitive capability and motor skills of affected individuals. ADRD is accompanied by progressive degeneration of the brain tissue, which leads to impairments in memory formation and loss of verbal fluency, among other adverse physiological manifestations. Cognitive impairment becomes particularly evident over time since it overtly alters communication through repetitive utterances, confusion, filler words, and an inability to speak at a normal pace. Most prevailing methodologies for ADRD recognition focus on mental health scores given by clinicians through in-person interviews, which can potentially be influenced by the subjective bias of the evaluation. Accordingly, the use of alterations in speech as a robust, quantitative indicator of Alzheimer’s progression presents an exciting non-invasive prognostic framework. Recent studies utilizing statistical and deep learning approaches have shown that assessment of ADRD can be effectively done algorithmically, which can serve as an objective companion diagnostic to the clinical assessment. However, the sensitivity and specificity of extant approaches are suboptimal. To this end, we present a multimodal fusion-based framework to leverage discriminative information from both speech and text transcripts. Text transcripts are extracted from speech utterances using the wav2vec2 model. Two fusion approaches, score-level fusion and late feature fusion, are evaluated for classifying subjects into AD/Non-AD categories. Experimental appraisal of the fusion approaches on the Interspeech 2021 ADDreSSo Challenge dataset yields promising recognition performance with the added advantage of a simpler architecture and reduced compute load and complexity.

Sandeep Kumar Pandey, Hanumant Singh Shekhawat, Shalendar Bhasin, Ravi Jasuja, S. R. M. Prasanna
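
Score-level fusion, one of the two approaches named in the abstract above, is commonly realized as a weighted combination of per-modality classifier scores. The sketch below shows that common form only; the equal weighting, the 0.5 decision threshold, and the function names are assumptions, and the abstract does not state which fusion rule the authors used.

```python
def score_level_fusion(speech_prob, text_prob, speech_weight=0.5):
    """Weighted average of two per-modality probabilities for the AD class."""
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")
    return speech_weight * speech_prob + (1.0 - speech_weight) * text_prob


def classify_subject(speech_prob, text_prob, speech_weight=0.5, threshold=0.5):
    """Return 'AD' when the fused score reaches the (assumed) threshold."""
    fused = score_level_fusion(speech_prob, text_prob, speech_weight)
    return "AD" if fused >= threshold else "Non-AD"
```

Late feature fusion differs in that embeddings from both modalities are concatenated before a single classifier, rather than combining the outputs of two classifiers.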

Exploring Multimodal Features and Fusion for Time-Continuous Prediction of Emotional Valence and Arousal

Advances in machine learning and deep learning make it possible to detect and analyse emotion and sentiment using textual and audio-visual information at increasing levels of effectiveness. Recently, an interest has emerged to also apply these techniques for the assessment of mental health, including the detection of stress and depression. In this paper, we introduce an approach that predicts stress (emotional valence and arousal) in a time-continuous manner from audio-visual recordings, testing the effectiveness of different deep learning techniques and various features. Specifically, apart from adopting popular features (e.g., BERT, BPM, ECG, and VGGFace), we explore the use of new features, both engineered and learned, along different modalities to improve the effectiveness of time-continuous stress prediction: for video, we study the use of ResNet-50 features and the use of body and pose features through OpenPose, whereas for audio, we primarily investigate the use of Integrated Linear Prediction Residual (ILPR) features. The best result we achieved was a combined CCC value of 0.7595 and 0.3379 for the development set and the test set of MuSe-Stress 2021, respectively.

Ajit Kumar, Bong Jun Choi, Sandeep Kumar Pandey, Sanghyeon Park, SeongIk Choi, Hanumant Singh Shekhawat, Wesley De Neve, Mukesh Saini, S. R. M. Prasanna, Dhananjay Singh
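
The CCC reported in the abstract above is the concordance correlation coefficient, which rewards predictions that both correlate with and match the scale and offset of the reference signal: CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). A minimal sketch, with the function name chosen here for illustration:

```python
def concordance_cc(pred, gold):
    """Concordance correlation coefficient using population (1/n) statistics."""
    n = len(pred)
    if n == 0 or n != len(gold):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant signals")
    return 2 * cov / denom
```

Unlike Pearson correlation, a prediction that is perfectly correlated but shifted by a constant scores below 1, because the squared mean difference enters the denominator.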

Simulation Model of a Spare Parts Distribution Network in the Airline Industry for Reducing Delays and Improving Service Levels: A Design of Experiments Study

Currently, delays are the most common cause of airline disputes. One of the factors leading to these situations is the distribution of spare parts. Efficient management of spare parts distribution can reduce the volume of delays and the number of problems encountered, and therefore maximize consumer satisfaction. Moreover, airlines are under pressure due to tight competition, a problem that is expected to grow worse due to the COVID-19 pandemic. By carrying out efficient maintenance and distribution management along the supply chain, authorities, airlines, aircraft manufacturers, and consumers can obtain various benefits. Thus, the aim of this research is to perform a design of experiments study on a spare parts distribution network simulation model for the aviation industry. Based on this model, the effect of the input parameters and their interactions can be derived. Moreover, the findings are converted into a combined methodology based on simulation and design of experiments for the design and optimization of distribution networks. This research study thereby provides an approach to identify significant factors that could lead to better system performance. In conclusion, the proposed approach enables aircraft maintenance systems to improve their service by minimizing delays and claims, reducing processing costs, and reducing the impact of maintenance on customer dissatisfaction.

Javier Gejo-García, Diego Gallego-García, Sergio Gallego-García, Manuel García-García

KeyNet: Enhancing Cybersecurity with Deep Learning-Based LSTM on Keystroke Dynamics for Authentication

Currently, everyone accumulates, stores, and processes their sensitive data on computers, which makes it essential to protect computers from intrusion. Several approaches employ biometric data such as voice, retinal scans, and fingerprints to enhance user authentication. These biometric approaches carry the added overhead of the sensors needed to implement them. Improved, strong password authentication would instead be cost-effective and straightforward. Keystroke dynamics is the analysis of temporal typing patterns to validate user authenticity. It is a behavioral biometric that makes use of the typing style of an individual and can be used to enhance current authentication security procedures efficiently and economically. Such a behavioral biometric system is fairly unexplored compared to other behavioral verification models. In this study, we focus on applying and training a deep learning approach based on the Long Short-Term Memory (LSTM) algorithm in an optimized way to validate the temporal keystroke patterns of users for improved password authentication. Our research shows an enhanced authentication rate on the keystroke dynamics benchmark dataset.

Jayesh Soni, Nagarajan Prabakar
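
The temporal patterns that keystroke dynamics relies on, as described in the abstract above, are typically hold times (how long a key stays pressed) and flight times (the gap between releasing one key and pressing the next). The sketch below extracts these two commonly used features; the event format and function name are assumptions, and the abstract does not list the exact features fed to the LSTM.

```python
def keystroke_features(events):
    """Compute hold and flight times from (key, press_time, release_time) tuples.

    Events must be in typing order, with times in seconds.
    """
    hold_times = [release - press for _, press, release in events]
    # Flight time can be negative when the next key is pressed before release.
    flight_times = [
        events[i + 1][1] - events[i][2] for i in range(len(events) - 1)
    ]
    return hold_times, flight_times


# Hypothetical timing for a user typing "pas".
sample = [("p", 0.00, 0.10), ("a", 0.15, 0.22), ("s", 0.30, 0.41)]
holds, flights = keystroke_features(sample)
```

A sequence model would consume these timings in typing order, one step per keystroke.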

Emotion Recognition from Brain Signals While Subjected to Music Videos

Emotions are simple, yet complex windows to the brain. Music and emotions are closely associated. There are few things that stimulate the brain the way music does, and it can be used as a powerful tool to regulate one’s emotions. In recent years, emotion detection using brain waves has become an active topic of research. Researchers are applying different feature extraction techniques and machine learning models to classify emotions from electroencephalography (EEG) signals, and many are working on improving the accuracy of this task. In our study, we aimed to achieve good scores in predicting the four emotional quadrants of the two-dimensional Valence-Arousal plane. We evaluated various feature extraction and modeling approaches and tried to combine best practices in our approach. We used the publicly available DEAP dataset for this study. Features from multiple domains were extracted from the EEG data, and various statistical metrics and measures were extracted per channel. In our proposed approach, a one-dimensional convolutional neural network and a two-dimensional convolutional neural network were combined and fed through a neural network to classify the four quadrants of emotions. We conducted extensive and systematic experiments with the proposed approach on the benchmark dataset. This study presents research findings that may be of significant interest for user adaptation and personalization.

Puneeth Yashasvi Kashyap Apparasu, S. R. Sreeja
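
The four-quadrant target in the abstract above can be illustrated by mapping a valence rating and an arousal rating to one of four labels. This sketch assumes a 1 to 9 rating scale split at a midpoint of 5 and uses hypothetical label names; the abstract does not state the threshold or labels the authors used.

```python
def emotion_quadrant(valence, arousal, midpoint=5.0):
    """Map ratings to a Valence-Arousal quadrant (H = high, L = low)."""
    high_valence = valence >= midpoint
    high_arousal = arousal >= midpoint
    if high_valence and high_arousal:
        return "HVHA"  # e.g. excited, happy
    if high_valence:
        return "HVLA"  # e.g. calm, content
    if high_arousal:
        return "LVHA"  # e.g. angry, afraid
    return "LVLA"      # e.g. sad, bored
```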

Backmatter

Title: Intelligent Human Computer Interaction
Edited by: Jong-Hoon Kim
Madhusudan Singh
Javed Khan
Prof. Uma Shanker Tiwary
Prof. Marigankar Sur
Dhananjay Singh
Publisher: Springer International Publishing
Electronic ISBN: 978-3-030-98404-5
Print ISBN: 978-3-030-98403-8
DOI: https://doi.org/10.1007/978-3-030-98404-5
