2025 | Book

International Conference on Biomedical and Health Informatics 2024

Proceedings of ICBHI 2024, October 30 – November 2, 2024, Tainan, Taiwan

Edited by: Kang-Ping Lin, Ratko Magjarević, Paulo de Carvalho

Publisher: Springer Nature Switzerland

Book series: IFMBE Proceedings

About this book

This book covers current advances and challenges in biomedical and health informatics. It reports on the latest technologies, as well as strategies and concepts for implementing them in medicine, health, and education. The contributions cover a range of topics, including artificial intelligence and precision medicine, e-health and education, medical devices and wearables, and medical imaging. Gathering the proceedings of the sixth International Conference on Biomedical and Health Informatics (ICBHI 2024), held from October 30 to November 2, 2024, in Tainan, Taiwan, under the theme "AI Trends for Healthcare - Biomedical Data, Information Technology and Cybersecurity", this book offers researchers and professionals an up-to-date snapshot of the digital transformation in medicine.

Table of contents

Frontmatter

Medical Devices, Medical Imaging, and Bio-signal Processing

Frontmatter
Using Active Noise Cancellation Technology to Enhance Streaming Audio Listening Benefits for Hearing-Impaired Individuals in Noisy Environments

Although the application of hearing aids in streaming audio is increasingly gaining attention, several challenges to the application remain, such as the need to (1) accurately compensate for the characteristics of hearing loss in streaming audio and (2) reduce the impact of noise leakage on audio quality in a noisy environment. Therefore, this study proposes a system that combines true wireless stereo (TWS) earbuds, active noise cancellation (ANC), and a deep learning gain control model to enhance the quality and comprehension of streaming audio for users in noisy environments. We tested 16 common types of hearing loss and adjusted for the appropriate compensation gain using the proposed model. Subsequently, we processed the speech using the KEMAR artificial ear recording device and evaluated the benefits using the Hearing Aid Speech Quality Index (HASQI) and Hearing Aid Speech Perception Index (HASPI). In addition, two flagship commercial hearing aids were used for comparison to determine the benefits of the proposed system. The experimental results showed that the proposed method achieved average HASQI and HASPI scores of [0.47, 0.99], which were significantly higher than those of the two commercial hearing aids [0.31, 0.94] and [0.27, 0.88]. Furthermore, we found that the results were significantly better with the ANC function enabled than with it disabled [0.32, 0.81]. Hence, the proposed method has the potential to significantly improve the quality of streaming audio for hearing-impaired individuals in noisy environments, thereby enhancing the listening benefits for users.

Hsin-Jui Lee, Chin-Yu Chou, Ying-Hui Lai
Immersive Virtual Reality Cruise Garden Relaxation Train: A Novel BCI Platform for Depression

The staggering global prevalence of depression extends to Taiwan, where approximately 8.9% of the population exhibits symptoms of depression, affecting around 2 million individuals, with 1.25 million classified as severe cases. Depression exerts profound impacts on both socioeconomic and health aspects, ranking as one of the leading causes of disability and closely linked with dementia. Over 70% of suicides involve individuals with depression, and those with depression face significantly higher suicide risks compared to the general population. Of particular concern is the higher-than-expected prevalence of depression, which, if left untreated, may escalate suicide risks. Research indicates that individuals with depression display unstable and imbalanced brainwave patterns, characterized by a lack of normal baseline brainwaves and elevated slow waves. While the commonly used repetitive transcranial magnetic stimulation (rTMS) therapy can modulate brainwaves, the treatment is costly and often associated with high relapse rates, imposing a significant burden on patients. Recent studies reveal that people with potential depression commonly experience brainwave dysregulation and neuro-immune disruption, emphasizing the need for personalized emotional, brainwave, and autonomic nervous system regulation, along with community support. This study aims to integrate elements such as “Brainwave Entrainment with Schumann Resonance and Autonomic Nervous System Feedback Device”, “Light Therapy Device”, and “Garden Train-Style Anxiety-Relief Environment” to provide simulated natural environment brainwave and autonomic nervous system regulation, as well as light therapy. Additionally, it involves designing a safe “Garden Train” virtual reality environment to offer a secure and comforting anxiety-relief space for individuals at risk of or suffering from depression, as well as caregivers.

Shu-Hui Tsai, Chih-Chiang Chao, Yue-Der Lin, Chin-Jung Kuo, Shu-Ping Dong, Chun-Yuan Liu, Hsin-Yi Ko, Jenn-Hann Wang, Ching Hsiu Liu
Spinal Curvature Estimation Using Low-Cost Portable Sensors

This study explores the application of the Intel RealSense D455 stereoscopic camera for assessing spinal curvature using non-invasive techniques. The research focuses on comparing the accuracy of spine determination through RGB images and corresponding point cloud data generated by the camera. Pilot measurements were conducted on three healthy participants. The study observed consistent deviations across the entire length of the spine, with minor noise in 3D space likely due to slight inaccuracies in distance measurement. These discrepancies were found to be significantly lower than the specified error margins of the camera, likely owing to optimal lighting conditions during the recordings. The findings suggest that the method is accurate enough for research and clinical applications, providing a non-invasive alternative for evaluating spinal curvature.

Jan Hejda, Petr Volf, Marek Sokol, Lýdie Leová, Patrik Kutílek
RGB-D Motion Capture System for Monitoring Health Conditions of Employees in Physically Demanding Professions

For the purpose of monitoring health conditions of employees in physically demanding professions, particularly those in emergency services and the military, a system utilizing two stereoscopic RGB-D cameras, Intel® RealSense™ D455, was designed. A pilot measurement was conducted using four exercises: Squats, Cardiopulmonary Resuscitation, Crawling on All Fours, and McGill’s Anterior Trunk Flexor Test. MediaPipe from Google LLC was used for anatomical landmark detection, supplemented by pixel-to-point deprojection into the depth map. The results suggest that the system is capable of assessing even complex exercises and movements. Measurement errors mainly arise from incorrect pixel-to-point deprojection of the RGB image into the depth map, affecting the positions of individual landmarks. The results primarily show the visibility values of individual landmarks for the given exercises and camera positions in frontal and three-quarters views. This information could potentially be used to correct pixel-to-point deprojection errors.

Petr Volf, Jan Hejda, Marek Sokol, Lýdie Leová, Yi-Jia Lin, Tommy Sugiarto, Patrik Kutílek
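
The pixel-to-point deprojection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an ideal pinhole camera without lens distortion, which is a simplification and not necessarily the model used by the authors or by the camera SDK. The function name and the intrinsic values (`fx`, `fy`, `cx`, `cy`) are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def deproject_pixel(u: float, v: float, depth_m: float,
                    fx: float, fy: float,
                    cx: float, cy: float) -> Tuple[float, float, float]:
    """Map a pixel (u, v) with a depth value to a 3D point in the camera frame.

    Assumes an ideal pinhole model with no lens distortion (illustrative only).
    fx, fy are focal lengths in pixels; cx, cy is the principal point in pixels.
    """
    x = (u - cx) / fx * depth_m
    y = (v - cy) / fy * depth_m
    return (x, y, depth_m)


# Hypothetical intrinsics for a 1280 x 720 image; not taken from the paper.
FX, FY, CX, CY = 640.0, 640.0, 640.0, 360.0

# A landmark detected 320 pixels to the right of the principal point, 2 m away.
landmark_3d = deproject_pixel(960.0, 360.0, 2.0, FX, FY, CX, CY)
print(landmark_3d)
```

Because the 3D position scales with the depth value read at the landmark pixel, a landmark that falls on the wrong depth pixel (for example, on the background just beside a limb) is shifted in all three coordinates, which matches the kind of error the abstract describes.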
Alternative Labeling of ECG Datasets on Deep Learning Model for ECG Fiducial Point Detection

Electrocardiography (ECG) is a commonly used method for detecting heart conditions, and its fiducial points can be used to diagnose various cardiovascular diseases. Many studies utilize deep learning models to assist with annotation; however, due to individual differences and noise interference, generating labeled data for model training is time-consuming and labor-intensive. This study discusses the training of deep learning models based on data. It explores the impact of using model-automated annotated data as training data on the results of ECG annotation. This study selected the Residual U-Net (R-Unet) as the experiment's model architecture. After training with manually labeled data, we use model-labeled data to increase the data volume and perform progressive learning. The results showed that the accuracy of the model trained with partially non-manually labeled data was comparable to that of the model trained with manually labeled data. Finally, the study demonstrates that through progressive learning, it is possible to maintain the model's accuracy while reducing the workload of generating training data by utilizing unlabeled ECG data.

Yu-Chen Lin, Cheng Cheng, Shi-Yi Wu, Kang-Ping Lin
Evaluation of Cuffless Blood Pressure Estimation Accuracy at Each Heartbeat During Exercise

Continuous blood pressure monitoring is crucial because blood pressure is dynamic and influenced by factors such as physical activity and mental stress. Although cuffless blood pressure estimation is a promising technique for continuous monitoring, its accuracy during drastic fluctuations, especially during exercise, remains insufficiently validated. In this study, we evaluated a simple Pulse Arrival Time based blood pressure estimation method and several improved methods for sharp fluctuations induced by exercise. Sixteen subjects participated in the experiment, with data from 13 valid for analysis. The proposed method achieved an average correlation of 0.527 in systolic blood pressure, suggesting that cuffless blood pressure estimation could be applied to estimate continuous blood pressure variability during exercise.

Kouki Ono, Yuka Maeda, Takumi Yoshimura, Ming Huang, Toshiyo Tamura
Deep Learning-Based Early Detection of Diabetic Foot Wounds Using Resnet101 and TCPO2 Standards

Diabetic Foot is a common complication among patients with Diabetes Mellitus, often associated with PAD or chronic non-healing wounds. These conditions frequently lead to recurrent hospital visits and admissions, ultimately resulting in amputations or disability. This study collects various annotated acute and chronic wound images for machine learning model training. Using TCPO2 as an evaluation standard, a deep CNN is employed for wound classification. Future integration with Mask-RCNN will enable automatic wound localization and the construction of models capable of identifying wounds with hypoxia and impaired healing potential. Lower extremity wound images from 1,000 patients were collected from Mackay Memorial Hospital. Physicians defined and annotated these images based on electronic medical records, integrating lower extremity PAD and TCPO2 data. De-identified data were grouped and used to train Resnet101. Results indicate that using TCPO2 as a classification standard is feasible. Among various tested criteria, classification using TCPO2 values greater than or less than 30 showed the best performance, with an accuracy of 86.0 ± 0.8. These findings confirm that the study’s outcomes can assist healthcare professionals and patients in early detection and diagnosis of complications, facilitating timely treatment.

Yu-Chang Chu, Ming-Feng Tsai, Hung-Wen Chiu
Broadening of Focal Brain Cooling Using a Thermally Conductive Sheet: Evaluation of Neuroprotective Effects in Severe Cerebral Infarction Model in Cats

Brain cooling is an effective method for suppressing abnormal brain activity and reducing brain inflammation, particularly in conditions such as epileptic seizures and cerebral infarction. Traditional focal brain cooling techniques that use devices like Peltier elements are effective but can cover only a limited area. To overcome this limitation, we introduced a thin, flexible thermally conductive sheet that can extend the cooling range without an increase in the size of the cooling source. This study employed a severe cerebral infarction model in cats to evaluate the efficacy of this approach, and aimed to prevent ischemic damage and explore potential therapeutic applications. The middle cerebral artery (MCA) was occluded for 3 h, followed by reperfusion. Cooling was then initiated using the thermally conductive sheet, which was placed over the perfusion area with the cooling source positioned on the ectosylvian gyrus. Cooling was continued for 24 h, during which electrocorticographic (ECoG) activity and brain temperature were simultaneously monitored to assess both immediate and long-term effects. Results demonstrated that the thermally conductive sheet effectively reduced brain temperatures over a broad area to below 20 ℃. This significantly limited the expansion of the infarct size compared to focal cooling. However, significant cerebral edema persisted despite these benefits, indicating that broad cooling alone may not be sufficient to mitigate brain swelling. These findings indicate that although extensive cooling can effectively reduce infarct size, additional strategies, such as anti-inflammatory therapies, are needed to optimize outcomes in patients with severe cerebral infarction.

Sayuki Takara, Takao Inoue, Fumiaki Oka, Toshitaka Yamakawa, Takuma Nishimoto, Sadahiro Nomura, Michiyasu Suzuki
Non-contact Respiratory Rate Detection Using 24 GHz Continuous-Wave Radar

Respiratory rate is an important vital sign that reflects health status. In hospitals, traditional methods of measuring respiratory rate include manual counting and using a respiratory belt, both of which may pose a risk of contact infection. This study, conducted as a clinical trial in a hospital medical ward, presents a non-contact method based on a 24 GHz continuous-wave radar that detects the displacement of the chest wall during breathing. By processing the radar signal, we were able to obtain the respiratory rate. However, during clinical validation, uncertainty in the timing of the daily vital-sign measurements, which serve as the gold standard, can lead to comparison errors. To overcome this issue, this study proposes a validation method that calculates the average radar-detected respiratory rate over a specific time interval and uses the standard deviation over that interval as an indicator of the stability of the radar-detected respiratory rate. Statistical results show that the radar-detected respiratory rate derived in this study is consistent with the hospital-measured respiratory rate (p = 0.17) and exhibits significant differences across different subjects (p < .05). The standard deviation of the radar-detected respiratory rate varies over time. These findings confirm the feasibility of replacing the hospital-measured respiratory rate with the radar-detected respiratory rate. Using this non-contact method to detect respiratory rate can reduce the risk of contact infection. Moreover, the stability indicator of the radar-detected respiratory rate proposed in this study can provide healthcare personnel with a reference for assessing the stability of a patient’s respiratory status over a specific period of time.

Yun-Chien Chen, Ching-Fen Jiang
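
The validation idea in the abstract above, averaging the radar-detected respiratory rate over a time interval and using the standard deviation as a stability indicator, might be sketched as follows. The function name, the interval bounds, and the sample readings are hypothetical; the abstract does not state the interval length or the data format used in the study.

```python
from statistics import mean, stdev
from typing import List, Tuple


def summarize_interval(samples: List[Tuple[float, float]],
                       start_s: float, end_s: float) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of respiratory-rate readings.

    samples holds (timestamp in seconds, breaths per minute) pairs.
    Only readings with start_s <= timestamp < end_s are used.
    """
    values = [rate for t, rate in samples if start_s <= t < end_s]
    if len(values) < 2:
        raise ValueError("need at least two readings in the interval")
    return mean(values), stdev(values)


# Hypothetical radar readings, one per minute (illustrative values only).
readings = [(0.0, 16.0), (60.0, 18.0), (120.0, 17.0), (180.0, 19.0), (240.0, 30.0)]

# Summarize the first four minutes around an assumed manual measurement time.
avg_rate, stability = summarize_interval(readings, 0.0, 240.0)
print(f"mean = {avg_rate:.1f} breaths/min, SD = {stability:.2f}")
```

A small standard deviation suggests the rate was stable during the interval, so the interval mean is a reasonable value to compare against a manual measurement whose exact time is uncertain.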
Continuous Blood Pressure Estimation Using PPG Signals Without Cuff

Cardiovascular diseases (CVD) are one of the major global health threats, causing millions of deaths each year. According to data from the World Health Organization, over 500 million people are affected by cardiovascular diseases, resulting in over 20 million deaths in 2021 alone, accounting for about one-third of the total global deaths. Invasive blood pressure measurement methods not only cause discomfort to patients but also pose a risk of infection. Meanwhile, cuff-based methods cannot provide continuous monitoring. Therefore, the purpose of this study is to develop a cuffless continuous blood pressure measurement system based on photoplethysmography (PPG). By using waveform measurement technology, we aim to provide a more comfortable and safer method for blood pressure monitoring, thereby enhancing the management of cardiovascular diseases.

Wen-Kai Hung, Ming-Hsuan Sun, Chi-Jen Lu, Yu-Xian Chen, Yue-Der Lin
Cuffless Blood Pressure Estimation Based on PPG Features

Cardiovascular diseases (CVD) are a leading cause of death, encompassing conditions like heart disease, hypertension, and diabetes, which increase the risk of myocardial infarction and stroke. Continuous monitoring of blood pressure is crucial due to its significant correlation with these conditions. However, traditional methods are often invasive, causing discomfort and potential infections. This study aims to develop a continuous, cuffless blood pressure detection system using data from approximately 9,000 patients in the MIMIC II database, totaling over 2 million records. The system utilizes photoplethysmography (PPG) to extract temporal and waveform features, with regression analysis for feature reduction. A Transformer-based deep learning model is employed to establish the relationship between PPG features and blood pressure. The mean error (ME) and standard deviation (SD) for systolic (SBP) and diastolic blood pressure (DBP) are −0.14 ± 9.05 and −1.15 ± 5.45, respectively. According to AAMI and BHS standards, diastolic pressure meets AAMI criteria, with systolic and diastolic pressures achieving Grade B and Grade A levels, respectively.

Yu-Xian Chen, Ming-Hsuan Sun, Chi-Jen Lu, Wen-Kai Hung, Yue-Der Lin
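
As a rough check of the AAMI statement in the abstract above, the sketch below applies the commonly cited AAMI accuracy limits (mean error within ±5 mmHg and standard deviation of at most 8 mmHg) to the reported values. The function name is hypothetical, the abstract does not state units (mmHg is assumed here), and the full standard includes further requirements, such as a minimum number of subjects, that this sketch ignores.

```python
def meets_aami(mean_error: float, sd: float,
               max_abs_mean: float = 5.0, max_sd: float = 8.0) -> bool:
    """Check the commonly cited AAMI limits: |ME| <= 5 and SD <= 8 (mmHg assumed)."""
    return abs(mean_error) <= max_abs_mean and sd <= max_sd


# ME and SD as reported in the abstract.
sbp_ok = meets_aami(-0.14, 9.05)  # systolic: SD exceeds the 8 mmHg limit
dbp_ok = meets_aami(-1.15, 5.45)  # diastolic: both values within the limits

print("SBP meets AAMI limits:", sbp_ok)
print("DBP meets AAMI limits:", dbp_ok)
```

Under these assumptions only the diastolic result passes, which is consistent with the abstract's statement that diastolic pressure meets the AAMI criteria.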
Enhancing Dermatological Diagnostics: A GAN-Based Approach for Synthetic Skin Lesion Image Generation

This paper presents a Generative Adversarial Network (GAN)-based approach for enhancing dermatological diagnostics by generating high-quality synthetic images of skin lesions. Utilizing a Deep Convolutional GAN (DCGAN) architecture, the model will be trained on the HAM10000 dataset, which contains over 10,000 dermatoscopic images of various pigmented skin lesions. The proposed model aims to address the challenge of limited data in training deep learning models by augmenting the dataset with synthetic images. Key features include the use of Spectral Normalization and Label Smoothing. The model’s performance will be evaluated using both objective metrics, such as Frechet Inception Distance (FID), Mean Deviation Similarity Index (MDSI), Structural Similarity (SSIM), and Haar Perceptual Similarity Index (HaarPSI), and subjective assessments by dermatologists. The goal is to integrate the generated synthetic images into real-world datasets to improve the accuracy of automated skin lesion classification, thereby enhancing early detection and diagnosis of skin cancers while reducing misdiagnosis.

Alessio Luschi, Ali Salman, Alessandra Cartocci, Linda Tognetti, Gabriele Cevenini, Pietro Rubegni, Ernesto Iadanza
New Design of an Oxygenation Cannula with a Comprehensive Set of Blood Pump

To manage the O2 supply in patients who need improved O2 saturation, extracorporeal membrane oxygenation (ECMO) is applied as an alternative. In general, a serial connection of a blood pump with an artificial lung is used to supply oxygen to the vessel. Recently, extracorporeal carbon dioxide removal (ECCO2R) has shown the potential to play a role in the recovery of patients with acute respiratory distress. This study aimed to establish a new cannula design for effectively managing oxygenation and carbon dioxide removal with a simple structure. We designed a membrane oxygenation system equipped with a double luminal cannula and an axial rotary blood pump actuator. The cannula is designed for intravenous cannulation to achieve assistive oxygenation in the vena cava via a single-port approach. Once the cannula is inserted transcutaneously via the femoral or jugular vein towards the right atrium via the inferior or superior vena cava, respectively, an intra-cannula blood pump delivers blood from the distal portion into the proximal vein via the membrane oxygenation layer of the cannula. We made a prototype of the cannula with a small axial blood pump and examined its design feasibility as a proof of concept.

Hirotaka Sato, Kotaro Nakano, Francis Chikweto, Aoi Fukaya, Tomoyuki Yambe, Yasuyuki Shiraishi
Hybrid Deep Learning Approach for Accurate Cardiovascular Disease Detection Using Single-Lead ECG Signals

This study compared several deep learning models for classifying cardiovascular disease using a single-lead ECG signal. The models included a 1D Convolutional Neural Network (CNN), a 2D-CNN, a Vision Transformer (ViT), and a hybrid model combining ResNet-34 with ViT. ECG signals were obtained from the MIT-BIH Arrhythmia database and preprocessed using SMOTE, wavelet denoising, and continuous wavelet transform. The 1D-CNN model used the denoised ECG signal as direct input. The 2D-CNN architecture used VGG-16 and ResNet-34. The proposed hybrid model outperformed the other models, achieving an accuracy of 99.2%. The study highlights the importance of accurate diagnosis in improving patient health.

Novie Theresia Br. Pasaribu, Frederick Halim, Kezia Sarahsophia Immanuel Ryadi, Che-Wei Lin
Developing an IoT-Based Surface Electromyography Signal Detection Device for Sarcopenia

Sarcopenia is characterized by a persistent reduction in skeletal muscle mass and function, which, if undetected and untreated, can lead to serious consequences such as increased fall risk and long-term immobility. Although sarcopenia has gained attention in recent years, there are currently no effective pharmacological treatments. Muscle loss can only be mitigated in the early stages through resistance training and nutritional interventions. However, precise detection methods used in hospitals are often costly and inaccessible to the general population. Traditional screening methods, like the Sarcopenia Self-Screening Questionnaire (SARC-F), proposed by the Asian Working Group for Sarcopenia, rely on subjective patient reports and lack objectivity. This study aims to address this gap by developing a rapid and convenient sarcopenia screening device that can be used at home or in community settings. Leveraging Internet of Things (IoT) technology, we developed a surface electromyography (sEMG) system to measure the muscle activity of the gastrocnemius, tibialis anterior, and quadriceps during walking and standing. The muscle signals were wirelessly transmitted using the MQTT protocol for backend analysis, enabling real-time gait and muscle strength monitoring. Twenty-eight healthy elderly individuals from the community participated in this study. To validate the effectiveness of the device, traditional sarcopenia-related physical measurements were obtained using standard methods. We first classified participants into four sarcopenia risk levels using these measurements. We also used deep image gait analysis software (GaitBEST) to classify the participants into four risk levels. Preliminary results show a slight discrepancy between these two classifications. However, the sEMG signals obtained using the developed device reflect the risk level determined by gait pattern, indicating the necessity of further analysis to explore the relation between gait and skeletal muscle functions.

Po-Xiang Wang, Ching-Fen Jiang
An 8-Channel Neuronal Signal Acquisition and Conditioning System with Optrode LED Driving Capability

This work presents an 8-channel neuronal signal acquisition and conditioning system, built from off-the-shelf discrete components, that can also drive the LED on the optrode. The system is divided into three modules: a DC offset generating module, an LED driving module, and a two-stage signal amplifying and conditioning module. Texas Instruments LMP770x series amplifiers were selected as the amplifiers of this system. The system is designed to be powered with batteries to minimize electromagnetic interference. A PCB was fabricated to verify the system. The gain of the system is 41.3 dB and its bandwidth is 80 Hz to 14 kHz. To verify the system's performance in a laboratory environment, an in vivo recording of the inferior colliculus of laboratory mice was conducted.

Hong Liang Loo, Xiangxin Ma, Yu Xia, Yujing Huang, Hung Chun Li, Zhen Yuan, Peng Un Mak, Mang I. Vai, Sio Hang Pun
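
For readers who prefer linear units, the 41.3 dB gain quoted in the abstract above can be converted with a short sketch. It assumes the figure is a voltage gain, so the 20·log10 convention applies; the abstract does not state this explicitly, and the function name and the 50 µV input are hypothetical.

```python
def db_to_voltage_ratio(gain_db: float) -> float:
    """Convert a gain in dB to a linear voltage ratio (assumes 20*log10 convention)."""
    return 10.0 ** (gain_db / 20.0)


# 41.3 dB as reported in the abstract; treated here as a voltage gain.
linear_gain = db_to_voltage_ratio(41.3)

# Example: a hypothetical 50 microvolt extracellular spike after amplification.
output_mv = 50e-6 * linear_gain * 1e3
print(f"linear gain ~ {linear_gain:.0f}x, 50 uV input -> {output_mv:.2f} mV")
```

Under that assumption the gain corresponds to an amplification of roughly 116 times.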
Molecular Imaging Application on SCA17 Mice by [18F]-HDAC6i and [18F]-Florbetaben

Spinocerebellar ataxia 17 (SCA17) is a fatal, incurable polyglutamine (polyQ) disease, highlighting the need for effective diagnostic and therapeutic approaches. Histone deacetylases (HDACs) are enzymes that catalyze the removal of acetyl groups from lysine residues of histone and non-histone proteins, which epigenetically affects protein synthesis. Recently, growing evidence indicates that HDAC6 inhibitors inhibit aggregation in Alzheimer’s disease. However, HDAC6 inhibitors have not yet been applied to spinocerebellar ataxia. The HDAC6 inhibitor [18F]-NC110 (i.e., [18F] 4-((5-(4-fluorobenzoyl)-1H-indol-1-yl)methyl)-N-hydroxylbenzamide) was synthesized via microwave-assisted radiofluorination from a nitro precursor, followed by the formation of a hydroxamic acid. PET/CT imaging of [18F]-NC110 and the amyloid imaging agent [18F]-Florbetaben ([18F]-FBB) in SCA17 transgenic (TG) mice demonstrated the potential of both agents for cerebellar imaging. Notably, [18F]-FBB exhibited significantly increased radioactivity in the brains of TG mice compared to wild-type (WT) mice, with higher signal accumulation in the cerebellum, indicating specificity for regions affected by SCA. These findings suggest that the expression of HDAC6 may be upregulated and highlight the potential of [18F]-FBB as a diagnostic tool for cerebellar lesions in SCAs. This study represents the first report of an HDAC6 imaging agent and the application of [18F]-FBB in SCA17 transgenic mice.

Chun-hung Yang, Chen-Hsin Chan, Hung-Wen Yu, Kun-Liang Lin, Mei-Hui Wang
Exploring the Ability to Ride a Motorcycle in Patients with Alzheimer’s Disease. A Motorcycle Simulation Study with Functional NIRs

Alzheimer’s disease (AD) is the most common form of dementia, a growing global issue for the foreseeable future. Few studies have focused on the ability of patients with AD to ride a motorcycle. We built a dynamic motorcycle simulator with a scenario of a possible traffic accident. Functional near-infrared spectroscopy was used to monitor the activation of the bilateral prefrontal cortex. Fifty-five participants were recruited (healthy control (HC)/mild cognitive impairment (MCI)/AD = 24/20/11). A cognitive battery was administered. During simulated riding, the time between participants seeing the incident and squeezing the brake was recorded as the reaction time of incident (RTI). The results show that the side of the brain contralateral to the event had higher activation in patients with AD and MCI. AD patients had a significantly longer RTI than HC (p = 0.046), but RTI was not correlated with items in the cognitive battery. In this study, we showed that AD patients had a worse ability to ride a motorcycle. We also demonstrated the viability of the design. Future studies with more scenarios and participants are warranted.

Chen-Wen Fang, Po-Tsen Lin, Wei-Ru Chen, Yang-Kun Ou, Jia-Jin J. Chen
How Much Do Activity Patterns Change With Age?

This paper presents a comparison of features derived from various biomedical signals from both younger and elderly subjects. The dataset consists of accelerometer, gyroscope and altimeter signals acquired from wearable sensors, describing three different activities of daily living: walking, running and sitting on a chair. The features of all three signal types were processed and then analysed to determine whether significant differences are present between signals recorded from younger subjects and those recorded from elderly subjects. Results of this comparison could prove valuable in the field of signal processing for various applications (activity recognition, fall detection, elderly care etc.) in terms of rehabilitation.

Krunoslav Jurčić, Andrea Juliana Ruiz Gómez, Martha Zequera-Diaz, Ratko Magjarević

Biomedical Informatics and Technology

Frontmatter
Predicting Surgical Outcomes in Obstructive Sleep Apnea Patients Using Machine Learning Approach

Obstructive Sleep Apnea (OSA) is a prevalent respiratory sleep disorder, with its pathogenesis potentially associated with airway obstruction influenced by physical factors such as BMI, neck circumference, age, and sex. Surgical intervention is often undertaken by some patients to mitigate the severity of this breathing disorder. However, not all OSA patients benefit uniformly from surgical intervention. To address this variability, we propose a method aimed at predicting the efficacy of surgery, focusing on identifying prognostic indicators that effectively predict the reduction in the apnea-hypopnea index (AHI) post-surgery. We investigated and compared two machine learning models, k-Nearest Neighbors (k-NN) and Logistic Regression (LR), to predict whether there would be a positive response in AHI following surgical intervention in OSA patients. Data were collected from 33 participants before (AHI range: 11.20–120.60) and after OSA surgery at National Cheng Kung Hospital. The dataset includes polysomnography (PSG) indices such as AHI, Apnea Index, and Hypopnea Index, and demographic variables including BMI, age, sex, and neck circumference. The results were validated through stratified 5-fold cross-validation. The k-NN model, with Manhattan distance and distance weighting, demonstrated an accuracy of 90%, sensitivity of 80%, specificity of 100%, and Sn + Sp – 1 of 0.80, outperforming the Logistic Regression model, which achieved an accuracy of 80%, sensitivity of 60%, specificity of 100%, and Sn + Sp – 1 of 0.60. Additionally, feature importance analysis from the Logistic Regression model identified HI, BMI, and AHI as the most significant predictors, providing insights into the key features influencing surgical outcomes. These findings suggest that the k-NN model is a robust tool for predicting surgical outcomes in OSA patients, with potential for clinical application in personalized treatment planning.

Ying-Siou Liao, Cheng-Yu Lin, Sheng-Fu Liang
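
The classifier configuration named in the abstract above (k-NN with Manhattan distance and distance weighting) and the Sn + Sp – 1 figure can be illustrated with a minimal pure-Python sketch. The toy feature vectors, labels, and the value of `k` are hypothetical and are not taken from the study. The sketch omits feature scaling and the stratified 5-fold cross-validation the authors used.

```python
from collections import defaultdict
from typing import Dict, Sequence


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x: Sequence[Sequence[float]], train_y: Sequence[int],
                query: Sequence[float], k: int = 3) -> int:
    """Distance-weighted k-NN vote: each neighbor contributes 1 / distance."""
    pairs = sorted(((manhattan(x, query), y) for x, y in zip(train_x, train_y)),
                   key=lambda pair: pair[0])
    votes: Dict[int, float] = defaultdict(float)
    for dist, label in pairs[:k]:
        if dist == 0.0:
            return label  # an exact match decides the prediction
        votes[label] += 1.0 / dist
    return max(votes, key=votes.get)


def youden_index(sensitivity: float, specificity: float) -> float:
    """Sn + Sp - 1, the summary statistic reported in the abstract."""
    return sensitivity + specificity - 1.0


# Hypothetical, already-scaled two-feature data; 1 = positive response to surgery.
train_x = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [0, 0, 1, 1]

print(knn_predict(train_x, train_y, (5.0, 4.0)))
print(round(youden_index(0.80, 1.00), 2))  # k-NN values from the abstract
print(round(youden_index(0.60, 1.00), 2))  # Logistic Regression values
```

Distance weighting lets close neighbors outvote distant ones, which can matter in a small dataset such as the 33 participants described here.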
Feature Selection for Military Basic Throwing Training Strategy with Machine Learning

Basic military training (BMT) is an important initial period for soldiers, and basic throwing performance in particular is one of the key military tactical motor skills. In recent years, machine learning (ML) techniques have been increasingly adopted to analyze extensive datasets and derive meaningful insights. To evaluate the effectiveness of fitness training strategies for military throwing, this study analyzed seven common input features: pull-ups, push-ups, squat concentric/eccentric peak power, squat maximum strength, leg tuck, and a 3 km run. Four data mining models (Random Forest, Multilayer Perceptron, AdaboostM1, and Bagging classifier) were tested to identify the most effective method for predicting performance. The Random Forest model performed best, achieving the highest accuracy, precision, recall, and F1 score. These seven attributes were thus identified as key predictors of military throwing performance. However, further research is necessary to establish a definitive ranking and to fully understand the importance of each feature in refining training strategies.

Hsieh-Ping Han, Tong-Hsien Chow, Wen-Hsien Ho
Enhancing Clinical Accuracy in Middle Ear Disease Diagnosis with a Cloud-Based AI System Integrating CNNs and LLMs

Middle ear diseases, such as otitis media and middle ear effusion, pose significant diagnostic challenges, especially in primary care settings, where misdiagnoses or delayed treatments can lead to severe complications and unnecessary antibiotic use. To address these issues, we developed a cloud-based AI system that integrates convolutional neural networks (CNNs) and large language models (LLMs) to assist clinicians in diagnosing middle ear diseases with high accuracy. This study involved a retrospective analysis of 2,820 de-identified otoendoscopic images collected from Taipei Veterans General Hospital between January 2011 and December 2019. The CNN models, including InceptionV3, were optimized through transfer learning and validated against independent test datasets. The AI system, deployed on cloud servers, enables real-time analysis via a smartphone application, with the LLM providing contextual medical advice based on CNN outputs. The system achieved a diagnostic accuracy of 97.6%, comparable to otolaryngology specialists (98.2%) and significantly higher than general practitioners (84.3%), with a precision of 96.8%, recall of 95.4%, and F1-score of 96.1%. The area under the ROC curve (AUC) was 0.98, indicating excellent model performance. Additionally, the integration of the LLM improved clinical utility by offering relevant medical advice, which was highly rated in usability tests. This cloud-based AI system demonstrates high diagnostic accuracy and practical utility, particularly in resource-limited settings, making it a valuable tool for enhancing patient safety and clinical outcomes.

Yuan-Chia Chu, Kuan-Hsun Lin, Yen-Chi Chen, Chien-Yeh Hsu, Chen-Tsung Kuo, Yen-Fu Cheng, Wen-Huei Liao
Base on Voice Signal Multimodal Deep Learning to Establish an Auto-Pulmonary Function Assessment Model for Post-Thoracic Surgery Patient

Lung cancer remains the number one cancer in Taiwan, and thoracoscopic lung surgery is the main treatment method. However, this surgery severely impairs respiratory function, resulting in a high incidence of pulmonary complications (19%–59%). Preventive measures include deep breathing exercises, coughing, and close monitoring of respiratory function, which are critical to reducing complications and ensuring a smooth recovery. The study, conducted in collaboration with Hsinchu MacKay Memorial Hospital, involved 70 participants and collected 223 sound samples using an iPhone microphone to assess lung function before and after training. Among the combinations of short-time Fourier transform (STFT) and mel-frequency cepstral coefficient (MFCC) features with CNN and VGG19 models, the STFT-VGG19 model achieved the highest accuracy (96.3%) in identifying postoperative lung function status. The goal of the study is to develop an automated assessment system to increase the effectiveness of postoperative pulmonary rehabilitation and improve patient outcomes.

Mei-Ju Su, Liang-Ching Hsu, Deng-Jei Siang, Yu-Huei Su, Pai-Hsi Chen
Intelligent Health Promotion Service System: A Project for Remote Rehabilitation

This paper presents the ongoing development of the Intelligent Health Promotion Service System, a joint Taiwanese-Czech project focused on remote rehabilitation solutions. The system combines a multi-sensor smart orthosis with serious game integration for patient motivation. Our primary targets are aging populations with sarcopenia and potentially dementia, undergoing long-term home rehabilitation. Methodology: We focus on three key areas. Camera-based motion capture and analysis: We utilize a single 2D camera for movement capture, enabling a versatile and cost-effective solution. Machine learning models trained by physiotherapist evaluations are used for automated assessment, comparable to human raters. Reconditioning for bedridden patients: This solution provides passive reconditioning for ICU patients with limited mobility. It allows for rehabilitation in existing hospital beds, minimizing disruption to treatment routines. Smart somatosensory wearable assistive device (SSWAD): This device incorporates wireless surface electromyography (sEMG) with exergames to enhance user engagement and track rehabilitation progress. Usability studies show high user acceptance (average SUS score: 77.70) for SSWAD, particularly among female participants with prior rehabilitation experience. Conclusion: The Intelligent Health Promotion Service System offers a promising approach for remote rehabilitation, promoting patient motivation and improving accessibility, particularly for vulnerable populations.

Lenka Lhotska, Jaromir Dolezal, Jindrich Adolf, Josef Černohorsky, Ales Richter, Zdenek Pliva, Pavel Sedlak, Yang-Cheng Lin, Wei-Chih Lien, Poh Thong Tan, Bo Liu
Evaluation of Different Control Systems of Wheelchair-Mounted Assistive Robotic Arm in Performing Activities of Daily Living

Approximately 6.8 million people in the US use assistive technologies for mobility, with 25% relying on wheelchairs. Many require assistance with activities of daily living such as dressing, grooming, and eating. The Multifunctional robotic assistive arm (mR2A), developed by the BioRobotics Lab at the University of Wisconsin-Milwaukee, offers a high-tech solution for powered wheelchair users, supporting various control interfaces: finger joystick, chin-controlled joystick, eye-gaze, and AssistKey keypad. This study evaluated these control systems with ten healthy participants performing pick-and-place tasks. Results showed a 100% task completion rate. The finger joystick was precise and fast, suitable for users with finger dexterity, while the chin joystick provided an alternative for those with limited upper limb movement but was slower. The eye-gaze interface enabled control for individuals with severe impairments, offering high accuracy but requiring intact eye movement. The AssistKey keypad allowed efficient mode switching but needed ergonomic improvements for broader usability. The findings highlight the need for customizable assistive technologies to enhance independence for individuals with disabilities, with each control system offering distinct strengths and limitations based on users’ motor abilities.

Md Ishrak Islam Zarif, Md Samiul Haque Sunny, Nayan Banik, Emily Hope Robann Longwell-Grice, Sheikh Iqbal Ahamed, Inga Wang, Mohammad H. Rahman
Changes of Mechanical Energy Flow Pattern During the Sit to Stand Task in Sarcopenia and Dynapenia Elderly After Eight-Week Elastic Band Training

Background: Dynapenia and sarcopenia are age-related conditions characterized by reduced muscle mass, impacting daily activities such as sit-to-stand movements. Elastic bands, which offer progressive resistance, are commonly used in strength training to improve muscle mass and strength. Mechanical energy flow analysis is essential for understanding the dynamics of movements. This study investigates the mechanical energy flow pattern in elderly individuals with dynapenia and sarcopenia following high-load resistance band training. Methods: Fifty participants, including twenty-seven with dynapenia and twenty-three with sarcopenia, underwent an 8-week high-load elastic resistance band training program (three 60-min sessions per week). Mechanical energy flow during sit-to-stand tasks was assessed before and after the intervention, with a focus on five key events: (1) start of seat unloading, (2) seat-off, (3) end of momentum transfer, (4) peak knee power, and (5) start of stabilization. Results: In the dynapenia group, the pelvis transferred energy to the left hip joint after training (0.04 J/kg) at the start of seat unloading. The sarcopenia group showed a significant shift in energy flow to the pelvis after the training (0.03 J/kg), accompanied by reduced trunk energy transmission. Both groups primarily generated energy at the knee joint, particularly during the seat-off phase. Conclusion: Resistance band training improved pelvic energy utilization during the sit-to-stand task in elderly individuals with dynapenia and sarcopenia. Future interventions should integrate multi-joint training to enhance mechanical energy efficiency and movement patterns, and thereby functional capacity.

Long-Huei Lin, Qian-Yu Chen, Chih-Hsiu Cheng
An Automated Process for Personal Cardiovascular Model Using CTA Imaging and 3D Printing

The interpretation of three-dimensional (3D) medical images largely depends on the expertise of healthcare professionals, making the training and interpretation processes challenging. 3D printing is increasingly used in medicine, particularly in orthopedics, oncology, and cardiovascular disease. This study focuses on the cardiovascular system, developing a model-building process to provide accurate 3D models tailored to individual patient conditions. We utilized computed tomography angiography (CTA) images to construct the models. Initially, median filtering was employed to eliminate noise while preserving edge details, thus enhancing image quality. This was followed by threshold segmentation, which extracted relevant anatomical structures by isolating pixel values within a specific range. Subsequent erosion and dilation operations were applied to remove minor vessels and irrelevant regions. The largest connected component analysis was then conducted to ensure retention of only primary vascular structures. The processed data were converted into Stereolithography (STL) format for 3D printing. We reconstructed digital image slices to align with the 3D printed model’s angles and perspectives, allowing for direct comparison within software. Healthcare professionals can use measurement tools in the software to verify the consistency between the 3D model and the original images. The resulting 3D models facilitate a better understanding of complex anatomical structures and pathological features, significantly benefiting surgical planning, personalized treatment, and medical education. These models serve as valuable tools not only for clinical decision-making but also for enhancing healthcare professionals’ training.
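
Two of the segmentation steps named in the abstract above, threshold segmentation and largest connected component analysis, can be sketched on a toy 2D slice as follows. This is a minimal illustration, not the authors' implementation: the intensity window (250 to 400), the 4-neighbour connectivity, the function names, and the toy data are all assumptions, and the median filtering and erosion/dilation steps are omitted.

```python
from collections import deque


def threshold(image, low, high):
    """Binary mask: 1 where low <= pixel <= high, else 0."""
    return [[1 if low <= v <= high else 0 for v in row] for row in image]


def largest_component(mask):
    """Keep only the largest 4-connected region of 1-pixels in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search over one connected region.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out


# Toy slice (made-up intensities): a bright 5-pixel "vessel" and one isolated bright pixel.
toy_slice = [
    [10, 300, 320, 20, 15],
    [12, 310, 305, 18, 290],
    [8, 14, 315, 11, 16],
    [9, 13, 17, 12, 10],
]

mask = threshold(toy_slice, 250, 400)
vessel = largest_component(mask)
for row in vessel:
    print(row)
```

In the toy slice, the isolated bright pixel in the second row passes the threshold but is discarded by the connected-component step, which mirrors the stated goal of retaining only the primary vascular structure. A real pipeline would operate on 3D volumes, typically with an image-processing library.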

Jui-Fen Hsu, Yung-Han Lin, Tsao-Min Lin, Chun-Yi Zac Lo
Effect of Normobaric Hypoxic Training on the Physiological Parameters of Individuals Visiting High-Altitude Areas

Background: Hypoxic conditioning can help hikers adapt to low-oxygen conditions, and hypoxic-loaded training can improve the physical fitness of hikers. Purpose: To study the effect of hypoxic conditioning on physiological parameters during acute high-altitude hiking. Methods: Twenty-four healthy people with no high-altitude hiking experience and regular exercise habits (<2 times/week) participated in a hike on the north peak of Hehuan Mountain (elevation 3422 meters). Participants were randomly assigned to either the hypoxia conditioning group (HC group, n=12, age 21.70 ± 2.42 years, height 165.75 ± 7.60 cm, weight 60.83 ± 10.24 kg) or the control group (CON group, n=12, age 21.39 ± 2.42 years, height 167.08 ± 9.27 cm, weight 62.16 ± 11.91 kg). All participants completed three treadmill workouts per week for four weeks, each consisting of 30 minutes of uphill walking at 15% grade, 8 kg load, and 3.6 km/h speed. The HC group additionally inhaled 13.5% oxygen during training sessions. Physiological parameters (heart rate, HR; rating of perceived exertion, RPE; lactate, La; oxygen saturation, SpO2; acute mountain sickness, AMS; hematocrit, Hct; and creatine kinase, CK) were collected at different altitude points during the hike, and CK was also measured for five consecutive days after the hike. Nonparametric statistics were used to compare within-group changes in each variable across time points. Kruskal-Wallis tests were performed for post hoc comparisons. Between-group differences were analyzed using Mann-Whitney U tests, with significance at p < .05. Results: Hypoxic conditioning enhanced altitude threshold height to a greater extent than regular training (p < .05). No significant differences (p > .05) were observed between the HC and CON groups in terms of physiological parameters during the hike. These parameters did change as expected with uphill and downhill hiking, and oxygen saturation changed with altitude.
Conclusions: This study explored how pre-hiking hypoxic conditioning (HC) affects hikers with no prior high-altitude experience. Compared to regular training, HC significantly improved altitude tolerance for these hikers. Both groups exerted themselves similarly during the hike. However, HC offered protection against altitude sickness despite similar exertion levels. This finding suggests HC as a valuable tool for hikers preparing for high-altitude ascents.
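
The between-group comparison in the abstract above relies on the Mann-Whitney U test. The sketch below shows how the U statistic itself is obtained from pooled ranks, with tied values sharing their average rank. It is only an illustration: the function names are ours, the two samples are made-up numbers rather than study data, and no p-value is computed (in practice a statistics package would be used for that).

```python
def rank_with_ties(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) computed from the rank sum of sample_a in the pooled data."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = rank_with_ties(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical symptom scores for two groups of four hikers (not study data).
group_hc = [1, 2, 2, 4]
group_con = [3, 5, 6, 7]

u_hc, u_con = mann_whitney_u(group_hc, group_con)
print(u_hc, u_con)
```

Here U for the first group equals the number of cross-group pairs in which its value exceeds the other group's value (ties counting one half), so a small U indicates that the first group's values tend to be lower.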

Yu-Ling Lee, Tong-Hsien Chow
Synthesis and Application of Hydroquinone with Heptadecatrienyl Side Chain for Inhibition of SARS-CoV-2

Background: The hydroquinone with a heptadecatrienyl side chain, referred to as HQ17(3), can inhibit topoisomerase 2, which is required for the proliferation of tumor cells and viruses. HQ17(3) is mainly extracted from Rhus succedanea, a scarce resource, and its low extraction yield has limited further development in the pharmaceutical industry. Aim: In this study, we aim to develop a new chemical synthesis method for HQ17(3) to explore its potential in inhibiting SARS-CoV-2. Material and Method: HQ17(3) was synthesized starting from hydroquinone to create oxynonyl-acetoxyl benzene. This intermediate was then subjected to a Wittig reaction with heptadien-triphenylphosphine, producing heptadecatrienyl alkyl acetoxybenzene. The final pure, solid heptadecatrienyl hydroquinone was obtained by deacetylation with sodium methoxide in methanol, replacing the previously used LiAlH4 deacetylation method. The anti-SARS-CoV-2 activity of HQ17(3) was evaluated using a viral plaque reduction assay. Results: The synthesized HQ17(3) presents as a non-oily, solid pure substance. The plaque reduction assay results showed that HQ17(3) can inhibit 60% of SARS-CoV-2 replication in VeroE6 cells at approximately 10 µM. Conclusion and Discussion: A high-purity, solid HQ17(3) product was successfully synthesized through deacetylation with sodium methoxide, demonstrating effective inhibition of SARS-CoV-2 proliferation. This non-oily, chemically synthesized solid substance overcomes the limitations of natural product scarcity and fulfills pharmaceutical requirements.

Mei-Hui Wang, Kun-Liang Lin, Hung-Wen Yu, Ching-Liang Kuo, Wan-Chi Li, Sui-Yuan Chang, Chung-Yi Hu, Shwu-Bin Lin
Innovative Syringe Volume Detection and Calculation Using YOLOv8 and Embedded AI on Jetson Orin Nano

This paper presents a system that leverages image processing and deep learning for syringe detection and liquid volume measurement. The main objectives are to develop a robust model for detecting various syringes under different angle orientations and to accurately measure liquid volume within them. The paper details the experimental setup, dataset specifications, and methodologies, including the use of YOLOv8 for object detection, as well as systems for image orientation correction, syringe detection, rubber detection, line detection, and volume calculation. It also introduces a methodology for optimizing the PyTorch model to TensorRT for GPU-embedded devices like the Jetson Orin Nano 8GB, using Quantization Aware Training. This optimization boosts detection speed by 2–3 times with only a 1% accuracy drop, making it suitable for real-time video processing. The system demonstrates 96.89% accuracy in syringe detection and volume measurement across various conditions. The paper concludes with discussions on the integration of image processing and deep learning, performance analysis, and limitations, and includes appendices with confusion matrices for different syringe types.

Aunuun Jeffry Mahbuubi, Che-Wei Lin
Enhancing Multi-label Wound Image Classification with Class-Specific Augmentation Using Vision Transformer Models

Wound classification is a critical task in medical diagnostics, essential for determining appropriate treatment strategies. Despite advancements, current methods often suffer from low accuracy, particularly in multi-label wound classification. This study aims to enhance the accuracy and robustness of wound image classification using the AZH Wound Image Dataset, which includes 730 Region of Interest (ROI) images and 538 whole wound images of venous, diabetic, pressure, and surgical wounds. Our methodology involved comprehensive pre-processing techniques such as resizing, rotation, flipping, affine transforms, color jitter, and normalization. Additionally, we implemented class-specific augmentation to address the unique challenges of each wound type. We employed a Vision Transformer (ViT) model with ViT_B_16_Weights pre-trained on ImageNet and conducted 5-fold cross-validation to ensure robust evaluation. The results demonstrated significant improvements with class-specific augmentation, achieving 88.47% average accuracy for whole image classification and 88.36% average accuracy for ROI classification, compared to 78.22% and 77.69%, respectively, without it. These findings underscore the effectiveness of class-specific augmentation in enhancing wound classification performance. Moving forward, we aim to develop lightweight models suitable for embedding into web applications, thus enabling real-time and accessible wound assessment for broader clinical use.
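
The 5-fold cross-validation mentioned in the abstract above can be illustrated with a plain index split. This is a sketch under assumptions: the helper `k_fold_indices` and the seed are hypothetical, the split is a simple shuffled partition (the abstract does not say whether folds were stratified by wound type), and 538 is the number of whole wound images reported in the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin into k folds so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


splits = list(k_fold_indices(538, k=5, seed=0))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(fold_number, len(train), len(validation))
```

Each image appears in exactly one validation fold, and the reported average accuracy would be the mean of the five per-fold accuracies.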

Juan Anthonio Salas, Che-Wei Lin
Fostering Preservice Primary Schoolteachers’ Competencies in Science Popularization Knowledge of Plantar Medical Science via Hands-On Learning with AR and AI Applications

It is essential for undergraduates as preservice primary schoolteachers to be well-prepared for facing the socio-scientific issues on foot health and self-care practices arising commonly in the local schools and ageing communities in South Taiwan. Purpose: To explore to what extent and how learning with integration of new technologies could affect preservice schoolteachers’ understanding of science popularization regarding plantar medical science, this study focused on: 1) What was the impact of learning with augmented reality (AR) and artificial intelligence (AI) applications on the preservice schoolteachers’ understanding of plantar medical science; 2) How did learning with AR and AI applications facilitate the preservice schoolteachers’ knowledge transparency of science popularization concerning plantar medical science. Methods: This study leveraged the hands-on learning approach with AR and AI applications to promote the future schoolteachers’ competencies in science popularization concerning plantar medical science. The participants were forty-four undergraduates with neither technological nor medical science learning background attending a teacher education course at a university in South Taiwan. The participants were required to employ AR and AI techniques to co-design learning materials with a particular learning focus on plantar medical science and self-care education for schoolchildren. Questionnaires and interviews served as the main approaches for data collection. Results: Findings showed that as technological novices and non-medical science students, most preservice schoolteachers experienced frustration when absorbing and practicing new skills for developing digital learning content. Yet, it was anticipated that the participants were ultimately satisfied with the substantial progress in designing with AR and AI, and that the learning approach promoted their real-world knowledge of science popularization.
Conclusions: The results revealed that hands-on learning with new technologies not only offered preservice schoolteachers a feasible entry point to develop conceptual understanding of medical science, but also promoted their abilities to design digital game-based learning materials centring on plantar pressures and healthcare. This was expected to further provide the local schoolchildren with more interactive and interesting learning content for promoting science popularization.

Yih-Shyuan Chen, Tong-Hsien Chow
Good and Regular Sleep is Associated with Better Academic Performance in College Students: An Actiwatch Study

Attending university is when many students first experience living away from home and learning the skills to manage their daily routines. Some students stay up late or shorten their sleep to have more time to study or for other life experiences. Nevertheless, irregular sleep patterns can harm the brain’s ability to learn and remember, resulting in lower academic performance among college students. Several studies have reported that poor sleep is associated with poor academic performance. However, most of these studies rely on subjective parameters, such as the Pittsburgh Sleep Quality Index (PSQI), to estimate sleep quality. These data may be influenced by subjective bias and are unable to quantify day-by-day variation in sleep. In this study, 33 undergraduate students who attended the Sleep Physiology class at National Cheng Kung University, Taiwan were invited to undergo 14 consecutive days of sleep measurement using the actiwatch to estimate the objective sleep efficiency (SE). To quantify the irregularity of students’ sleep, the mean absolute deviation (MAD) values of the SE were considered. The experimental results show that SE has a significant positive correlation with the final exam scores. In addition, the MAD of SE has a significant negative correlation with the final exam scores. This study provides objective and quantitative evidence that good and regular sleep is associated with better academic performance in college students.
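
The two quantities used in the abstract above, the mean absolute deviation (MAD) of nightly sleep efficiency and its correlation with exam scores, can be sketched as follows. This is an illustration only: it assumes deviations are taken around the mean and that the correlation is Pearson's r, neither of which the abstract specifies, and the nightly values are invented rather than study data.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean of the series."""
    centre = mean(values)
    return mean([abs(v - centre) for v in values])


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical nightly sleep efficiency (%) for one student over five nights.
nightly_se = [90, 85, 95, 80, 100]
print(mean(nightly_se), mean_absolute_deviation(nightly_se))
```

In the study design described above, each student would contribute one mean SE and one MAD value over 14 nights, and `pearson_r` would then be applied across students against their final exam scores.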

Chen-Ta Lin, Fu-Zen Shaw, Sheng-Fu Liang

Young Investigator Competition

Frontmatter
Skeletal Muscle Oxygen Consumption Dynamic Features Base on Near-Infrared Spectroscopy

This study utilized Near-Infrared Spectroscopy (NIRS) to assess the oxygen metabolism characteristics of lower limb skeletal muscles during ankle pumping exercises, simulating short-duration, high-intensity dynamic exercise. Four NIRS probes were used to simultaneously measure the bilateral medial gastrocnemius muscles. Data were collected from 20 healthy subjects over two sessions of ankle pumping exercises, including measurements of oxygenated, deoxygenated, and total hemoglobin. The results showed that oxygen metabolism during the second exercise session (2.24 ± 0.88 µM) was consistently higher than during the first (26.56 ± 14.36 µM), likely due to a post-exercise effect. During the second session, the blood flow response in the medial gastrocnemius was classified into three distinct metabolic patterns: an increase, stabilization, or decrease in total hemoglobin levels. Among these, an increase in total hemoglobin during exercise was the most common, observed in 64% of cases, possibly resulting from endothelial-related vasodilation induced by exercise. The variations in NIRS measurements within the same muscle are hypothesized to be due to the differing compositions of arteries, veins, and microvessels through which the light passes. Therefore, when using NIRS to evaluate dynamic tissue oxygen metabolism and hemodynamic physiological responses, careful consideration should be given to factors such as the measurement location within the muscle, the initial state of the muscle tissue, the warm-up duration and intensity before exercise, and the baseline metabolic characteristics prior to the exercise session.

Shao-Hung Lu, Tieh-Cheng Fu, Cheng-Lun Tsai, Kang-Pin Lin
The Relationship Between Kinematic and Kinetic Characteristics of Countermovement Jump and Change of Direction in Elite Female Basketball Players

The countermovement jump (CMJ) is a dynamic strength test used to assess neuromuscular performance in athletes, particularly in sports requiring rapid changes of direction (COD), such as basketball. Methods: Twelve professional female basketball players with over 1 year of resistance training and 6 years of specialized training participated. They performed unloaded CMJ (ULCMJ), loaded CMJ (LCMJ), and modified 505 (Mod-505) COD tests sequentially. Pearson’s correlation was used to analyze associations between CMJ and Mod-505 variables. Participants were categorized into fast and slow Mod-505 groups, with differences assessed using t-tests (p < .05). Results: The fast Mod-505 group showed shorter total times and better performance in entry velocities (EnV1 and EnV2), with no difference in ground contact time. They had higher ULCMJ jump heights, eccentric peak force, mean power, and concentric duration. Negative correlations were found between Mod-505 total time and both ULCMJ and LCMJ jump heights, while positive correlations existed between LCMJ eccentric phase peak power and mean velocity (r = .58–.83, p = .05). Conclusion: The CMJ is a standardized method for assessing lower limb muscle strength and evaluating professional female basketball players’ COD performance. Enhancing lower body dynamic strength may improve jumping and COD abilities in this athletic population.

Chien-Chun Chang, Chieh-ying Chiang, Chuang-Yuan Chiu, Tong-Hsien Chow
Brain Network Changes in the Harmaline Induced Tremor Rat Model

Patients with essential tremor suffer from involuntary and rhythmic movements of the head, hands, and legs. In addition to motor-related symptoms, non-motor features such as mild cognitive deficits have been discovered, implying alterations in both cerebellar circuits and cerebral functions. As a disorder with a wide spectrum of clinical features, the underlying mechanisms of essential tremor and its effects on the brain are not well understood. Investigation of animal models of essential tremor can provide more insights into its brain circuitry. This study used harmaline injection in a rat model to mimic tremor features and characterized its brain networks by using resting-state functional MRI. The tremor movements were monitored by a customized platform combined with an accelerometer. The resting-state networks and the functional connectivity among several brain hubs were identified. Aside from regular movement of the rat, excessive movement frequency within 13–15 Hz was detected in the harmaline-injected rats. The fMRI results reveal the alteration of the default mode network in the harmaline-induced tremor rat, particularly the lower connectivity in the hippocampus and higher connectivity in the primary somatosensory cortex. Furthermore, compared with control rats, the harmaline-induced rats exhibited significantly higher functional connectivity among several region-pairs, including the connectivity between left hippocampus and left striatum, left hippocampus and right motor, left hippocampus and left somatosensory, and bilateral somatosensory cortex. Although the rat models were scanned under anesthesia, the consequences of the harmaline injection can be probed by using resting-state fMRI, showing that the alteration is not only in the olivocerebellar circuit but also in other cerebral hubs. In addition, a low-cost detection module for tremor measurement was developed by incorporating the accelerometer.
The concept has the potential to be implemented in wearable devices for longitudinal recording in the future. In summary, the alteration in the resting-state brain networks of the harmaline-induced tremor rat model has been identified in this study, highlighting specific regions that are associated with harmaline-induced tremor. The brain networks in this rat model can be further examined after various therapeutic approaches in the future.

Sheng-Min Huang, Li-Wei Kuo
Neurodegenerative Changes in Structure-Function Coupling of White Matter in Alzheimer’s Disease Using Multimodel MRI

Alzheimer’s disease (AD) is an irreversible, progressive neurodegenerative disorder characterized by cognitive and memory impairments, accompanied by brain atrophy. With no effective treatment available, early diagnosis and slowing disease progression are crucial clinical priorities. Research indicates that brain structural and functional connectivity play vital roles in AD pathology, closely related to cognitive abilities and clinical manifestations. This study proposes a multimodal structural-functional coupling model to explore the mechanisms between brain structural connections and neural activity, providing a comprehensive analysis of brain structure-function integration. By analyzing and validating data based on specific white matter network functions and neural connectivity changes, we aim to develop quantitative diagnostic indicators for AD, predict disease severity and cognitive function changes, and achieve disease diagnosis and treatment evaluation. This approach will enhance our understanding of AD progression and potentially improve patient outcomes through earlier intervention and more targeted therapies.

Chun-Yi Zac Lo, Huei-Min Liu

Student Paper Competition

Frontmatter
Pilot Study of Retrieval-Augmented Generation Model in Recommending Traditional Chinese Medicine Formulations

Retrieval-Augmented Generation (RAG) is a method used to optimize the output of large language models (LLMs). This study investigates the feasibility of using an LLM within a RAG framework to generate recommendations for Traditional Chinese Medicine (TCM) formulations. The study employs the mixtral-8x7b model as the LLM within the RAG architecture, utilizing clinical records from outpatient TCM visits as external data sources for generating TCM formulation recommendations. The recommendations from the RAG-based LLM are compared with those generated by the ChatGPT 3.5 model, evaluating their consistency with actual clinical prescriptions. Results indicate that the RAG-based LLM achieved an average score of 74, demonstrating a high level of alignment with clinical prescriptions across the cases studied. In contrast, the ChatGPT 3.5 model only achieved an average score of 25, primarily due to inconsistencies in the generated recommendations, which rendered them clinically unusable. The study concludes that while the RAG-based LLM shows potential in generating TCM formulation recommendations, there remains a need for improvement in the model’s accuracy.
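
To make the retrieval step of a RAG pipeline such as the one in the abstract above concrete, the sketch below ranks stored records by word overlap with a query and places the best matches into a prompt. Everything here is a simplified assumption: the study's actual retriever, prompt format, and data are not described in the abstract, the records are placeholders rather than clinical data, and the call to the LLM (mixtral-8x7b in the study) is deliberately left out.

```python
def tokenize(text):
    """Lower-case whitespace tokenization; a stand-in for a real embedding model."""
    return set(text.lower().split())


def retrieve(query, records, top_k=2):
    """Return the top_k records sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(records, key=lambda r: len(query_tokens & tokenize(r)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, records, top_k=2):
    """Assemble retrieved records and the question into a single prompt string."""
    context = "\n".join(f"- {record}" for record in retrieve(query, records, top_k))
    return f"Context records:\n{context}\n\nQuestion: {query}\n"


# Placeholder records (hypothetical, not real prescriptions or patient data).
records = [
    "symptoms: cough fatigue | formulation: PLACEHOLDER-A",
    "symptoms: insomnia headache | formulation: PLACEHOLDER-B",
    "symptoms: cough headache | formulation: PLACEHOLDER-C",
]

prompt = build_prompt("patient reports cough and fatigue", records)
print(prompt)
```

The resulting prompt would then be passed to the language model, whose recommendation can be compared against the actual clinical prescription, as the study does when scoring consistency.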

Ya-Chuan Chan, Po-Yu Huang, Zhi-Liang Chen, Chih-Nung Wang, Wen-Chen Lin, Jung-Peng Chiu, Yi-Chun Chiu, Yang-Hsien Lin, Eddie T. C. Huang, Simon See, Kang-Ping Lin
The Performance of the Total Arterial Compliance Index via the Oscillometric Constant Pressure Method in the Cold Pressor Test

Total arterial compliance (TAC) is a critical cardiovascular parameter that significantly influences central blood pressure and cardiac load. It plays a crucial role in evaluating cardiovascular disease risk and monitoring the progress of treatments. In this study, we employ a specialized blood pressure measurement technique that integrates a constant pressure procedure within the oscillometric method. The blood pressure waveform is then analyzed using the Windkessel model. We conduct a cold pressor test to create conditions that alter vascular stiffness and observe the resulting changes in the diastolic phase of the blood pressure curve. This test involves immersing a participant’s hand or foot in ice water, which induces vasoconstriction and modifies arterial compliance. The study estimates arterial compliance parameters based on these observations. The results reveal that, following the cold pressor test, the arterial compliance parameter decreased by 34% compared to baseline, demonstrating a significant reduction and highlighting the impact of cold-induced changes in vascular stiffness on arterial compliance.
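
The link between the diastolic pressure decay and compliance in the abstract above can be illustrated with the two-element Windkessel model, in which diastolic pressure falls as P(t) = P0 · exp(−t / (R · C)), so the time constant τ = R · C yields C = τ / R. The abstract does not state which Windkessel variant or fitting procedure the authors used, so the model choice, the function names, the synthetic pressure samples, and the resistance value below are all assumptions.

```python
import math


def fit_decay_time_constant(times, pressures):
    """Least-squares fit of ln(P) = ln(P0) - t / tau; returns tau in the units of `times`."""
    log_pressures = [math.log(p) for p in pressures]
    n = len(times)
    mean_t = sum(times) / n
    mean_log = sum(log_pressures) / n
    slope = (
        sum((t - mean_t) * (lp - mean_log) for t, lp in zip(times, log_pressures))
        / sum((t - mean_t) ** 2 for t in times)
    )
    return -1.0 / slope


def compliance_from_tau(tau, resistance):
    """Two-element Windkessel: tau = R * C, hence C = tau / R."""
    return tau / resistance


# Synthetic diastolic samples generated with a known time constant of 1.5 s.
true_tau = 1.5
times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]                      # seconds
pressures = [100.0 * math.exp(-t / true_tau) for t in times]  # mmHg

tau = fit_decay_time_constant(times, pressures)
assumed_resistance = 1.0  # mmHg*s/mL, hypothetical value for illustration
print(round(tau, 3), round(compliance_from_tau(tau, assumed_resistance), 3))
```

Under this model, a faster diastolic decay at unchanged resistance corresponds to lower compliance, which is the direction of change the study reports after the cold pressor test.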

Shi-Yi Wu, Shao-Hung Lu, Liang-Yu Hsu, Cheng-Lun Tsai, Kang-Ping Lin
Event-Based Detection of Obstructive Sleep Apnea Using YOLOv8 and Electrocardiogram Spectrograms

This research explores the feasibility of utilizing YOLOv8 to detect obstructive sleep apnea (OSA) based on real-time events derived from single-lead electrocardiogram (ECG) data, moving beyond the traditional epoch-based detection methods. The study utilized the National Cheng Kung University Hospital sleep database. To create a balanced dataset, data segmentation was performed at different ratios to balance hypopnea and apnea events. Preprocessing involved generating features using continuous wavelet transform (CWT) with the Morlet wavelet on ECG data, producing spectrogram images with various heat maps for YOLOv8 training. Our preliminary study involved twenty subjects, with plans to increase the sample size in future work. Using the Inferno heatmap, we achieved an F1 score of 74.63% for normal events and 74.37% for apnea and hypopnea events, with a mean average precision (mAP) of 73.8%. These results suggest that our approach is a promising first step towards developing diagnostic tools for OSA detection based on real-time event occurrences using ECG. Unlike existing methods that offer classification or detection based on fixed time windows, our approach allows YOLOv8 to identify hypopnea and apnea events of various durations, demonstrating the model’s capability to differentiate between apnea and normal events more effectively. This study underscores the potential of training YOLOv8 with ECG spectrograms for accurate event-based detection of OSA.

Ponpatcharee Nipattanon, Che-Wei Lin
Virtual Reality Multisensory Cooperative Task-Oriented Mirror Therapy for Stroke Rehabilitation

Each year, millions suffer from hemiplegia due to strokes, and while conventional mirror therapy (MT) and its virtual reality (VR) adaptations are common in rehabilitation, they often lack the ability to simulate real-life bimanual cooperative tasks, which are crucial for daily activities. Virtual Reality Multisensory Cooperative Task-Oriented Mirror Therapy (VRMCTOMT) is introduced in the study to test a novel approach that enables the coordinated use of both the affected and unaffected hands in immersive, functional tasks. By combining hand tracking and VR technology, VRMCTOMT overcomes limitations of conventional MT by providing synchronized multisensory feedback (visual, auditory, and tactile) to enhance brain plasticity. Preliminary evaluations using EEG with 15 healthy subjects demonstrated that VRMCTOMT significantly increased bilateral brain activation and Mu wave synchrony compared to standard VR Mirror Therapy (VRMT), suggesting potential benefits for motor function recovery. Future research will expand task scenarios and include broader clinical trials to further validate its effectiveness.

Tung-Shen Lin, Chi-Chun Chiang, Hsiu-Yun Hsu, Che-Wei Lin
Development of a Virtual Reality Mirror Therapy System Combining Rhythm Music and Vibration Stimulation for Upper Limb Rehabilitation in Stroke Hemiplegia Patients

This study aims to develop a virtual reality mirror therapy system that integrates rhythmic music and vibration stimuli, and to investigate its effects on hand function and brain activation in young, healthy subjects. The system comprises the Meta Quest 2 and VRMT-RT software, which is inspired by the rhythm game “Taiko no Tatsujin.” Users are required to hit a drum in sync with the music rhythm while their hand movements are mirrored, incorporating the concept of mirror therapy. Thirty young participants were recruited for the study, and their hand function and brain activation were assessed using hand function scales and EEG under three different modes of music and vibration stimuli. The results indicate that under the influence of music and vibration stimuli, participants demonstrated significant improvements in hand function and increased activation of the motor cortex. These findings suggest that this system can positively enhance hand function and augment brain activation.

Yan-Wen Chen, Hsiu-Yun Hsu, Che-Wei Lin
Comparing the Hardware Efficiency and Accuracy of 1D and 2D Convolutional Neural Networks for Ectopic Beat Detection on ARM Embedded Systems

This study examines the hardware efficiency and model accuracy of three model architectures for classifying ectopic beats on an STM32H743VIT6 microcontroller. The models include two pre-trained CNNs, ResNet V1 and MobileNet V2, and a lightweight CNN named LMUEBCNet. The research evaluates hardware metrics such as memory usage, energy consumption, and computation time, together with accuracy, to find an optimal balance. It also discusses redesigning the STM32 chip into an AI module for commercial ECG devices, achieving 99.7% accuracy. The study concludes that while 2D models generally had higher accuracy, the 1D models, particularly MobileNet V2, offered a significant reduction in memory usage and computation time, making them more suitable for resource-limited systems.

Cheng-Yang Lee, Chang-Wei Chen, Che-Wei Lin
A Lightweight Metric-Based Few-Shot Learning Approach for Ectopic Heartbeat Classification in Real-World and Balanced Data Distribution

This study successfully combined lightweight models and a metric-based few-shot learning framework; the proposed models trained on limited data performed comparably to those trained on the extensive dataset. They also outperformed the existing few-shot learning model. To overcome label scarcity, a longstanding issue in ECG analysis, this study refined the two-dimensional lightweight models SEembedNet and LMUEBCNet, which have the best results with the most efficient parameter size, using a metric-based few-shot learning framework. The framework adapts them into one-dimensional convolutional neural networks that serve as feature extractors and computes the feature dissimilarity between input pairs for classification. The experiments use the MIT-BIH dataset with 10-fold cross-validation and compare the results with the conventional deep learning training method, in which the feature extraction CNN is trained directly for the classification task. The results showed that the metric-based few-shot learning approach for SEembedNet yielded higher average macro F1 scores in ectopic heartbeat classification than conventional deep learning methods, with improvements ranging from 11% to 38% for real-world data and 3% to 11% for balanced data. In the case of LMUEBCNet, the improvements ranged from 2.5% to 29.9% for real-world data and 5% to 14% for balanced data. The proposed models performed comparably to existing few-shot learning models under identical test conditions but with significantly reduced parameter sizes. Specifically, the parameter counts of SEembedNet and LMUEBCNet are smaller by factors of 955 and 821, respectively. This study underlines the potential of lightweight metric-based few-shot learning models for automated ectopic heartbeat classification.
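The idea of classifying by feature dissimilarity between input pairs can be sketched as follows. This is only an illustration under stated assumptions: `embed` is a hypothetical hand-crafted stand-in for the CNN feature extractor, Euclidean distance is an assumed dissimilarity measure, and the toy beats are invented.

```python
import math

def embed(beat):
    """Hypothetical stand-in for the CNN feature extractor: mean, range, energy."""
    mean = sum(beat) / len(beat)
    return [mean, max(beat) - min(beat), sum(v * v for v in beat) / len(beat)]

def dissimilarity(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.dist(a, b)

def classify(query, support):
    """Assign the label whose support examples are, on average, closest to the query.

    support: dict mapping label -> list of example beats (the few 'shots').
    """
    q = embed(query)
    scores = {}
    for label, examples in support.items():
        distances = [dissimilarity(q, embed(e)) for e in examples]
        scores[label] = sum(distances) / len(distances)
    return min(scores, key=scores.get)

# Toy support set (invented values): two labelled beats per class.
support = {
    "normal": [[0.0, 0.1, 1.0, 0.1, 0.0], [0.0, 0.2, 0.9, 0.1, 0.0]],
    "ectopic": [[0.0, -0.8, 1.6, -0.6, 0.0], [0.1, -0.7, 1.5, -0.5, 0.0]],
}
query = [0.0, -0.75, 1.55, -0.6, 0.05]
predicted = classify(query, support)
```

Because the decision depends only on distances to a handful of labelled examples, new classes can in principle be added by extending `support` without retraining the extractor.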

Jyun-Siyan Chu, You-Liang Xie, Che-Wei Lin
The Effect of Visual Cue on Anticipatory Postural Adjustments During Gait Initiation in Individuals with Parkinson’s Disease

Introduction: Gait initiation (GI) difficulties in Parkinson's disease (PD) are often attributed to impaired anticipatory postural adjustments (APAs), a key pathophysiological factor. The aim of the study is to explore the effects of visual cues on APA alterations during GI in individuals with and without PD. Methods: Eighteen patients with PD (PD group) and 18 age-matched healthy controls (HC group) performed self-generated gait initiation under two experimental GI conditions: with a transverse visual cue and with no visual cue. Characteristics of APAs were derived from monitoring the trajectory of the center of pressure (COP) during GI. The outcome measures included lateral/posterior APA size, APA duration, APA velocity, and the frequency of trials with either missing or multiple APAs. Results: Visual cues significantly increased (P < .0001) both the lateral APA size and APA velocity in the PD group only. The frequency of missing APAs was higher in the PD group, whereas the incidence of multiple APAs was similar between the PD and HC groups. Conclusion: The findings suggest that the visual cue significantly enhances the preparation for gait initiation in individuals with Parkinson’s disease. Further research and long-term studies are needed to fully explore the potential of visual cues as a therapeutic tool in managing postural instability in people with Parkinson’s.

Phuc Thi Mai, Rou-Shayn Chen, Guo-Sheng Li, Chin-Song Lu, Hsiao-Lung Chan, Ya-Ju Chang

Scientific Challenge Competition

Frontmatter
Advancing Emotional State Recognition Using Multi-task Learning with Dual CNN-LSTM

Emotions are innate human responses that significantly impact cognition, behavior, and mental health. Accurate emotion recognition is essential for understanding how emotions affect individuals and for developing new diagnostic and treatment methods. This study uses the data available from the ICBHI Scientific Challenge to predict emotional responses to videos across three classes and nine levels. A dual CNN-LSTM model was proposed, incorporating multi-task learning with two parallel pathways to effectively integrate multi-task information. The experiments lowered the external Phase I error to approximately 0.4 and achieved an external Phase II error of 0.2944, demonstrating the potential of the dual CNN-LSTM model for advancing emotion recognition and its applications in mental health.

Ching-Ping Wang, Hsiang-Chin Chien, Xin-Yu Chen, Hong-Kun Lin, Chi-Sheng Chang, Chia-Yen Lee, Jung-Chih Chen
Multimodal Emotion Recognition Through a Hybrid CNN-GCN Model Integrating Neuroimaging and Physiological Data

Emotion recognition plays an important role in understanding human affective states and holds significant potential in diagnostic and therapeutic applications. This study presents a novel hybrid model that integrates convolutional neural networks and graph convolutional networks to improve multimodal emotion recognition by leveraging neuroimaging and physiological data. Specifically, functional magnetic resonance imaging, photoplethysmography, and respiratory signals were used to predict emotional valence. The proposed model processes fMRI data to capture spatial and temporal brain activity, while physiological signals are transformed into Gramian Angular Fields to extract temporal features. The model’s performance was evaluated using a Leave-One-Subject-Out cross-validation strategy on the dataset provided by the ICBHI 2024 Scientific Challenge, achieving a competition score of 0.3314. The experimental results demonstrated an average accuracy of 70.42% ± 0.06% in the valence class prediction task and an average mean absolute error of 1.74 ± 0.46 in the valence rating prediction. Our hybrid model performance showed an accuracy of 83.33% in the classification task.
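For readers unfamiliar with Gramian Angular Fields, the transform can be sketched in a few lines. The abstract does not say which variant was used; the summation form (GASF) and min-max rescaling to [-1, 1] below are assumptions, and the function name is illustrative.

```python
import math

def gramian_angular_field(series):
    """Summation-type Gramian Angular Field of a 1D series (assumed variant)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [2 * (v - lo) / (hi - lo) - 1 for v in series]
    # Clamp against rounding error, then map each sample to a polar angle.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    # Entry (i, j) encodes the pairwise temporal relation cos(phi_i + phi_j).
    return [[math.cos(a + b) for b in phi] for a in phi]

gaf = gramian_angular_field([0.0, 1.0, 2.0])
```

A series of length n becomes a symmetric n × n matrix, so a 1D PPG or respiration window can be treated as an image by a convolutional network.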

Marek Sokol, Jan Hejda, Petr Volf, Patrik Kutílek
Predicting Emotions Using Channel Attention Mechanism on fMRI Signal Data

fMRI applications in emotion recognition can assist doctors in better assessing patients’ emotions, thereby aiding in the development of personalized treatment plans. This study incorporates an SE block into the first layer of the CNN model to automatically focus on brain regions related to emotion classification among 246 fMRI signals (SE method). The results for the Class classification task achieved a Precision, Sensitivity, and F1-score of 0.73, 0.73, and 0.73, respectively, and for the Level task, they were 0.23, 0.32, and 0.26, respectively. By comparison, directly using brain regions previously identified as emotion-related as model input resulted in Precision, Sensitivity, and F1-scores of 0.49, 0.48, and 0.47, respectively, for the Class task and 0.04, 0.07, and 0.05 for the Level task. The results indicate that employing the SE method to focus the model on brain regions associated with emotions enhances performance in emotion recognition tasks, achieving an error rate of 0.4213 in the second phase of the ICBHI challenge.
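A squeeze-and-excitation (SE) block of the kind referred to here can be sketched as below, treating each brain-region signal as one channel. The weights, the reduction to a single hidden unit, the omission of bias terms, and the toy input are all assumptions; the paper's actual layer sizes are not given in the abstract.

```python
import math

def se_block(x, w1, w2):
    """Channel attention over x (channels x time).

    w1: reduced x channels weight matrix, w2: channels x reduced weight matrix.
    Bias terms are omitted for brevity (assumption).
    """
    # Squeeze: summarise each channel by its temporal mean.
    squeeze = [sum(ch) / len(ch) for ch in x]
    # Excitation, step 1: bottleneck layer with ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeeze))) for row in w1]
    # Excitation, step 2: expand back to one sigmoid gate per channel.
    gates = [1 / (1 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: reweight every channel by its gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, x)]
    return scaled, gates

# Toy example (invented values): 3 channels, 2 time points, 1 hidden unit.
x = [[1.0, 3.0], [0.0, 0.0], [-2.0, -4.0]]
w1 = [[0.5, 0.1, -0.5]]
w2 = [[1.0], [0.0], [-1.0]]
scaled, gates = se_block(x, w1, w2)
```

After training, channels with gates near 1 are the ones the network has learned to emphasise, which is what lets such a block act as a soft selector of regions.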

Hong-Kun Lin, Chia-Yen Lee
Enhancing Emotion Prediction via Self-supervised Learning Using Neural Data

Predicting mental states from neural data holds potential for various real-world applications, such as brain–computer interfaces. However, the high costs of data acquisition generally limit the size of these data, which hinders the application of advanced deep neural networks (DNNs) to the prediction of mental states. To address this issue, we proposed a self-supervised pretraining framework for DNNs using large-scale neural data. Specifically, we pretrained a DNN using functional magnetic resonance imaging (fMRI) signals from over 800 individuals in the Human Connectome Project (HCP) database. The pretrained DNN was then applied in the ICBHI 2024 Scientific Challenge (ICBHI-SC), in which the cross-participant prediction of emotional classes and ratings is performed using a small dataset—only 30 trials per participant—collected during video watching. After the DNN was fine-tuned, emotion prediction was performed using features extracted by the DNN. Finally, we confirmed that the prediction performance surpassed that of baseline models without the use of self-supervised pretraining. Thus, our framework potentially provides a versatile tool for mental state prediction from limited neural data, and expands the applicability of such predictions in real-world situations.

Chiyu Maeda, Takeru Abe, Satoshi Nishida
Predicting Emotional Valence from fMRI and Physiological Data: An Ablation Study and Model Comparison

This study aimed to classify the emotions experienced using physiological signals and brain activity as participants viewed emotion-inducing videos. We conducted an ablation study of the physiological and brain activity data using Fast Fourier Transforms and concluded that brain activity yields the most discernible signals. We further applied ablation on the brain signals to identify the most relevant features of the dataset. The Voxel-Wise analysis showed that 63% of the brain signals are relevant for classification. After pinpointing the key datasets and features, we tested five different neural network models with hyperparameter tuning, as well as an algorithmic-based approach. The neural networks achieved 30% accuracy, while the algorithm achieved 76.6% accuracy in classifying emotions, demonstrating the ability to discern distinct emotions using physiological data. This shows that the simple classifier was able to outperform the more complicated neural network models.

TingRay Chung, Srihaan Seelam
Deep Learning Approaches for Emotion Recognition with Missing PPG and RESP Signals Using Multi-channel Data

The 2024 ICBHI Challenge requires the design of an effective method to analyze three different emotions and their intensity levels. The data is recorded using three different instruments to capture the participants’ emotions while watching videos: functional magnetic resonance imaging (fMRI), photoplethysmography (PPG), and respiratory (RESP) signals. However, there are missing PPG and RESP signals, resulting in multiple emotion records that only have fMRI signals. Most methods may use only fMRI as training data to build models, but this would waste the advantage of using PPG and RESP signal features to assist the model in achieving more efficient discrimination. Therefore, this study proposes effective methods for compensating for missing PPG and RESP data, correcting offset values, and effectively augmenting fMRI data. These methods address the aforementioned issues to construct a multi-channel deep learning model based on the three types of data. The model proposed in this study achieved an overall error rate of only 0.2844 on the official evaluation metrics, indicating that the proposed method can achieve high-performance emotion and emotion intensity discrimination.

Xin-Yu Chen, Hsiang-Chin Chien, Ching-Ping Wang, Hao-Jen Wang, Chia-Yen Lee
Feature Extraction of fMRI Signals Based on Autoencoder for Constructing a Two-Stage Multi-class Emotion Classification Model

This study utilizes the pre-processed fMRI data provided by the 2024 ICBHI challenge and excludes two other signal data types with missing information to construct a CNN model that can distinguish three emotion classes and their corresponding levels. However, due to the high variability and noise in the pre-processed fMRI data, a simple CNN model alone cannot achieve good classification performance. This study addresses the issue of noisy data by proposing a two-stage deep learning model training framework. In the first stage, an autoencoder method is adopted, leveraging its ability to effectively encode and decode data to extract useful signal features from the noisy data for use in the subsequent second stage. In the second stage, the effective features obtained by the encoder are transferred, and the weights of the encoding layers are combined with a fully connected layer for model retraining. This study also analyzes different methods of transferring the weights of the encoding layers. The best model for this study achieved an error rate of only 0.3383 on the official evaluation metric.

Chi-Sheng Chang, Shao-Pu Lu, Chia-Yen Lee
Application of Wavelet Scattering Transform on BOLD Signals for Emotion Recognition
ICBHI 2024 Scientific Challenge

Emotion recognition is the process of identifying human emotions. It allows for the understanding and interpretation of emotional states. Furthermore, emotion recognition has the potential to improve human-machine interaction, making technology more intuitive and responsive to human emotions. The task in the ICBHI 2024 Scientific Challenge was to recognize emotions by predicting both the valence class and the valence level. The overall performance of emotion recognition is determined by a weighted sum of the errors in predicting the valence class and valence level. In this study, the wavelet scattering transform was examined and applied to BOLD data, converted from fMRI data, for emotion recognition. The wavelet scattering coefficients obtained from all optimized scattering paths of the wavelet scattering network were used as quantitative features of the BOLD data, and a feedforward, fully connected neural network was used as a classifier. The effects of different formations of wavelet scattering feature vectors and structures of neural network models on emotion recognition were investigated. Computational results demonstrate that both the valence class and valence level can be predicted remarkably well using the wavelet scattering features of BOLD data. The best emotion recognition performance achieved in the ICBHI 2024 Scientific Challenge was a score of 0.3624.

Suparerk Janjarasjitt
Predicting Emotional States from Multi-modal Physiological Data Using Multi-expert Ensemble Systems and Graph Neural Networks

This study focused on identifying emotional states by analyzing brain and physiological data, aiming to improve diagnostic and intervention strategies. Sixteen patients watched videos that elicited positive, negative, or neutral emotions, and they rated these on a 9-level valence scale. The study collected data from three sources: BOLD fMRI signals, PPG (photoplethysmography), and respiratory data across 30 trials of 25 s each. A multi-expert ensemble system was used to predict valence ratings from these physiological signals, employing advanced AI models tailored to each data type. For fMRI data, graph representations were created to capture neural activity, which were then processed using a Graph Attention Network (GAT). PPG and respiratory signals were analyzed using a sliding window approach, focusing on changes from a baseline to identify significant features. These features were processed using Fully Connected Networks (FCN). The models were trained using nested Leave-One-Subject-Out cross-validation, and their predictions were combined using a meta-learning approach, which involved training an additional AI model (either Random Forest or FCN) on the outputs of the initial models. The study demonstrated the feasibility of a multi-expert meta-learning approach for emotion detection from multi-modal physiological data, showing promise for enhancing accuracy in emotional state analysis. Further optimization of single-modality models is expected to improve performance.
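Leave-One-Subject-Out splitting, which this abstract and several others in the section rely on, can be sketched as a small generator. Only a single (non-nested) level is shown; a nested scheme would apply the same function again to the training subjects of each fold. The function name and the toy subject labels are illustrative assumptions.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids: one entry per trial, naming the subject that trial belongs to.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Toy example (invented): five trials from three subjects.
trial_subjects = ["s1", "s1", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(trial_subjects))
```

Keeping all trials of one subject together in the test fold prevents a model from scoring well merely by recognising an individual's signal characteristics.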

Paolo Giaccone, Daniele Sasso, Margherita A. G. Matarrese, Mario Merone, Leandro Pecchia
Machine Learning Models for Predicting Emotional Valence from Brain Activity and Physiological Responses

In this study, we developed machine learning models to predict the valence class and valence rating of emotions experienced by participants based on their brain activity and physiological responses. The ICBHI 2024 Scientific Challenge involves using a rich dataset comprising pre-processed functional Magnetic Resonance Imaging (fMRI), photoplethysmography (PPG), and respiratory data from 20 participants. Each participant watched emotion-provoking video clips categorized into three valence classes (positive, negative, neutral) and rated them on a nine-level scale. Our approach integrates Convolutional Neural Networks (CNNs) for analyzing fMRI data and CNNs combined with Long Short-Term Memory (LSTM) networks for handling PPG and respiratory data. The models were trained to classify the valence class and predict the valence level, using categorical cross-entropy as the loss function. Initial results show promising trends, indicating the model’s potential for accurate emotion prediction. The fMRI model's training and validation accuracies are 0.99 and 0.98, respectively. The accuracies of the PPG and respiratory models are 0.86 and 0.66 on training and 0.80 and 0.56 on validation. However, further fine-tuning and architectural adjustments are necessary to enhance performance. This work aims to contribute to understanding how brain activity and physiological responses can be used to decode emotional states, with potential applications in psychological assessment and therapeutic interventions.

Ali Salman, Alessio Luschi, Ernesto Iadanza
Personalized Emotion Prediction Models Based on Individual Neural and Physiological Signals

Mental health disorders affect approximately 970 million people globally, representing 1 in every 8 individuals, with anxiety and depressive disorders being the most common. Poor mental health can interfere with daily activities, work performance, and physical health, lowering overall quality of life. The growing prevalence of mental health issues has led to a significant demand for accurate emotion prediction tools that can enhance mental health diagnostics and provide personalized user experiences. Human emotion prediction largely relies on facial expression analysis, which has limitations because expressions can be hidden or misinterpreted. Previous studies addressed these challenges by focusing on physiological signals, which offer a more direct and reliable way to assess emotional states based on actual reactions rather than outward expressions. Combining complex data types, such as neural signals from the brain and physiological signals like breathing, can significantly improve prediction accuracy, but it presents a considerable challenge due to the diverse input. To address these challenges, we propose a novel approach that uses dimensionality normalization techniques to standardize multimodal inputs, combining neural data from fMRI with physiological signals such as PPG and respiration. Our methodology integrates four machine learning models (Random Forest, Support Vector Machine, Gradient Boosting, and K-Nearest Neighbors) to optimize emotion prediction accuracy. We validated our model on a dataset of 600 trials from 20 participants, achieving a classification accuracy of 85% for valence classes (positive, negative, and neutral) and strong performance in predicting valence ratings (scale -4 to 4). The results underscore the effectiveness of our approach in advancing emotion prediction, potentially leading to significant improvements in mental health care, personalized technology, and user experience design.

Nguyen-Ngan-Ha Lam, Asydicky Firmanto, Natalie Tanjaya, Yi-Chun Du
Innovative Cascade and Machine Learning Models Based on fMRI Data for Accurate Emotion Prediction

Decoding emotional states from brain activity has been a long-standing challenge in neuroscience and psychology, with significant implications for understanding human behavior and mental health. In this study, we used functional magnetic resonance imaging (fMRI) to decode emotional responses to visual stimuli. The data are from the ICBHI scientific challenge and comprise fMRI scans from multiple participants who performed emotional tasks with film clips, alongside their subsequent classification into three primary emotional classes and nine subjective emotional levels. We developed a multi-step analysis pipeline for predicting the emotional class, involving initial data filtering, feature selection, and feature extraction through Lasso regression and Kernel Principal Component Analysis (KPCA). Finally, we applied a Random Forest Classifier to predict the three types of emotional states, achieving a current model accuracy of 65%. Further analysis of the subdivided emotional levels was conducted using a Support Vector Machine (SVM) and a Random Forest Classifier. To address data imbalance, we applied Adaptive Synthetic Sampling (ADASYN), resulting in a model accuracy of 33%. The score calculated by the ICBHI scientific challenge is 0.32. This detailed analytical approach demonstrates the utility of these methods in extracting meaningful patterns from complex brain activity data to predict various emotional states. The proposed algorithm has demonstrated robust performance on the challenging and diverse dataset provided by the ICBHI, offering novel insights and a new perspective for the prediction of emotional states.

Zi-Xiang Tsai, Li-Yun Tseng, Tzu-Yao Lo, Pin-Han Chen, Mi-Hsuan Lin, Rong-Huan Huang, Chun-Yi Zac Lo
Advanced AI Models for Emotion Recognition by Functional MRI: Fully Connected Neural Network

Functional magnetic resonance imaging (fMRI) allows for the temporal observation of human brain activity, providing insights into the activation patterns of various brain regions under different emotional states. Given the complexity of brain data, which is influenced by numerous factors, accurately predicting emotions from fMRI data presents a significant challenge. This study aims to develop an advanced artificial intelligence model for emotion recognition based on a comprehensive set of fMRI datasets. The dataset includes fMRI data from 20 participants engaged in emotion-inducing tasks, such as watching video clips, with emotional responses categorized into three classes and nine levels. The fMRI signals were segmented into 25-s intervals, capturing neural activity across 246 brain regions. To enhance signal quality and reduce noise, we employed a nonlinear least squares method, where each 25-s signal segment was modeled using a polynomial function. The polynomial parameters were optimized to minimize the sum of squared differences between observed and predicted values. Initially, a fully connected neural network (FCNN) model was trained to identify the brain regions whose activity patterns are most strongly correlated with emotional states. These identified regions were then selected for further training. Subsequently, the signals from the selected brain regions were concatenated for training multiple FCNN models. An ensemble voting system was then employed to predict the unknown dataset, with the most frequently occurring prediction being used as the final output. Our emotion recognition system achieved an accuracy of approximately 52–58% for class predictions and around 30% for level predictions. The model’s performance is further evidenced by a submission score of 0.5704. These findings underscore the strong association between brain fMRI data and emotions, demonstrating the potential of FCNNs in decoding emotional states from brain fMRI data. Despite challenges such as data heterogeneity and individual variability, the model’s evaluation on the ICBHI dataset indicates significant predictive capabilities, contributing to our understanding of the neural mechanisms of emotions.
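The ensemble voting step described in this abstract, where the most frequent prediction becomes the final output, can be sketched as follows. The function name and toy labels are assumptions, and the tie-breaking rule (the label seen first among the models wins) is a choice made here, not something the abstract specifies.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model predictions into one label per sample.

    predictions: list of label lists, one list per model, all of equal length.
    Ties are broken in favour of the label that appears first among the models.
    """
    voted = []
    for sample_votes in zip(*predictions):
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted

# Toy example (invented): three models, three samples.
model_outputs = [
    ["pos", "neg", "neu"],
    ["pos", "neu", "neu"],
    ["neg", "neg", "neu"],
]
final = majority_vote(model_outputs)
```

Voting of this kind tends to help when the individual models make different mistakes, since isolated errors are outvoted.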

Pin-Han Chen, Cheng-Hao Chang, Ta-Chun Lin, Chun-Yi Zac Lo
Enhanced Emotion Recognition from fMRI Using a Multi-model Neural Network Framework

Functional magnetic resonance imaging (fMRI) can reflect brain activity, with brain regions such as the amygdala, prefrontal cortex, and hippocampus exhibiting distinct emotional activity patterns. These neural activities interact in complex ways, making interpretation and identification challenging. The objective of this study is to identify participants’ emotions using fMRI data from the ICBHI scientific challenge, involving 16 participants in the training set and 4 in the test set. Each participant underwent 30 trials and reported their emotion classes and levels. The averaged fMRI signals were extracted from 246 regions defined by the Brainnetome atlas, and the regions of interest were selected based on correlations, including 66 emotion-related regions. We applied a band-pass filter (sigma = 2) for data preprocessing. Two models were developed, one for class prediction and one for level prediction. For class prediction, to enhance signal quality and reduce noise in the fMRI signals, we employed a nonlinear least squares approach. Specifically, each 25-s signal segment was modeled using a polynomial, and the parameters of this polynomial were adjusted to minimize the sum of squared differences between observed and predicted values. To test which brain regions’ activity patterns are highly correlated with emotions, we trained multiple fully connected neural networks (FCNNs) to select from the previously mentioned 66 brain regions. Ultimately, the signals from brain regions highly correlated with emotions were selected as inputs for a subsequent integrated convolutional neural network (CNN) model to distinguish between types of emotions. For level prediction, we used long short-term memory (LSTM) models to predict participants’ self-reported emotion levels. Finally, the proposed algorithm achieved a classification accuracy of 56% for identifying emotion classes and 20% for identifying emotional intensity levels. The mean score calculated by ICBHI was 0.6251. The difference in performance between class and level prediction may be related to the subjective nature of the ratings, as the rating of emotion is highly dependent on the subject’s self-perception and background.

Mi-Hsuan Lin, Pin-Han Chen, Rong-Huan Huang, Tzu-Yao Lo, Li-Yun Tseng, Zi-Xiang Tsai, Chun-Yi Zac Lo
fMRI Activity Slop Feature with Poincare Plot for Emotion Classification

In this study, we combined fMRI (functional Magnetic Resonance Imaging) data and Poincare plots derived from PPG (Photoplethysmography) to investigate the relationship between brain activity captured by fMRI, PPG signals, and mood classification. The study focused on extracting features from the fMRI data and using these features along with the heart rate variability (HRV) index derived from the Poincare plot to categorize mood. The study used the XGBoost model for classification, and the accuracy varied depending on the model and mood classification. The results of this study suggest that the combination of fMRI and HRV features can be effective for mood classification.
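As an illustration of how a Poincare plot yields heart rate variability indices, the sketch below computes the two standard descriptors, SD1 and SD2, from a sequence of beat-to-beat intervals. The abstract does not state which index the authors used, so SD1 and SD2 are an assumption, as are the function name and the toy interval values.

```python
import math
import statistics

def poincare_sd(intervals):
    """Return (SD1, SD2) of the Poincare plot of successive intervals.

    Each point of the plot is (interval[n], interval[n + 1]). SD1 is the spread
    perpendicular to the identity line (short-term variability); SD2 is the
    spread along it (longer-term variability).
    """
    x, y = intervals[:-1], intervals[1:]
    # Rotate the point cloud by 45 degrees and take the spread on each axis.
    across = [(b - a) / math.sqrt(2) for a, b in zip(x, y)]
    along = [(b + a) / math.sqrt(2) for a, b in zip(x, y)]
    return statistics.stdev(across), statistics.stdev(along)

# Toy pulse-to-pulse intervals in milliseconds (invented values).
sd1, sd2 = poincare_sd([800, 810, 790, 805, 795, 815])
```

In practice the intervals would come from detected PPG pulse peaks, and SD1, SD2, or their ratio could then be passed to a classifier alongside the fMRI features.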

Po-Han Huang, Yu-Chen Lin, Shao-Huang Lu, Shi-Yi Wu, Cheng-Lun Tsai, Kang-Ping Lin
Backmatter
Metadata
Title
International Conference on Biomedical and Health Informatics 2024
Edited by
Kang-Ping Lin
Ratko Magjarević
Paulo de Carvalho
Copyright Year
2025
Electronic ISBN
978-3-031-86323-3
Print ISBN
978-3-031-86322-6
DOI
https://doi.org/10.1007/978-3-031-86323-3