
Machine Learning for Cyber Security

5th International Conference, ML4CS 2023, Yanuca Island, Fiji, December 4–6, 2023, Proceedings

  • 2024
  • Book

About this book

This book constitutes the proceedings of the 5th International Conference on Machine Learning for Cyber Security, ML4CS 2023, held on Yanuca Island, Fiji, on December 4–6, 2023. The 11 full papers presented in this book were carefully reviewed and selected from 35 submissions. They cover a variety of topics, including cyber security, AI security, machine learning security, encryption, authentication, data security and privacy, cyber security forensic analysis, vulnerability analysis, malware analysis, and anomaly and intrusion detection.

Table of Contents

Frontmatter
Keystroke Transcription from Acoustic Emanations Using Continuous Wavelet Transform
Abstract
Acoustic emanations are a notable pathway through which information typed on a keyboard can leak. Various methods have been proposed for this type of attack, which processes keystroke sounds to capture data. However, the application of continuous wavelet transforms for this purpose remains largely unexplored. The continuous wavelet transform provides better resolution in both time and frequency for impulse-like signals, which makes it more effective for analyzing keystroke sounds than conventional transform methods. We propose a machine-learning-based method to analyze the features. The method takes wave files of a keyboard's acoustic emanations as input and outputs the recovered keys, achieving an accuracy of up to 80.3%.
Abdullah Ozkan, Banu Gunel Kilic, Cengiz Acarturk
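To illustrate the abstract's point that a continuous wavelet transform localizes impulse-like sounds in both time and frequency, here is a minimal sketch in Python using only NumPy. It is not the authors' pipeline: the Morlet wavelet, the parameter w0, the synthetic 2 kHz "click", and the function names morlet and cwt are all assumptions chosen for illustration.

```python
import numpy as np

def morlet(t, scale, w0=6.0):
    """Complex Morlet wavelet at times t (seconds), dilated by `scale` (assumed wavelet)."""
    x = t / scale
    return np.exp(1j * w0 * x) * np.exp(-0.5 * x**2) / np.sqrt(scale)

def cwt(signal, fs, freqs, w0=6.0):
    """Return the scalogram |CWT| with shape (len(freqs), len(signal))."""
    out = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        scale = w0 / (2 * np.pi * f)           # scale whose centre frequency is f
        half = int(np.ceil(4 * scale * fs))    # truncate the wavelet at about 4 scale widths
        t = np.arange(-half, half + 1) / fs
        # Correlating with the wavelet equals convolving with its reversed conjugate.
        kernel = np.conj(morlet(t, scale, w0))[::-1]
        out[i] = np.abs(np.convolve(signal, kernel, mode="same")) / fs
    return out

# Synthetic stand-in for a keystroke: a decaying 2 kHz burst starting at 50 ms.
fs = 8000
t = np.arange(800) / fs
onset = 0.05
click = np.where(
    t >= onset,
    np.exp(-400.0 * (t - onset)) * np.sin(2 * np.pi * 2000.0 * (t - onset)),
    0.0,
)

freqs = np.array([500.0, 1000.0, 2000.0, 3000.0])
scalogram = cwt(click, fs, freqs)
peak_f, peak_t = np.unravel_index(np.argmax(scalogram), scalogram.shape)
print("strongest response:", freqs[peak_f], "Hz at", t[peak_t], "s")
```

On this synthetic input the strongest response should fall in the 2 kHz row shortly after the 50 ms onset. A scalogram like this could then serve as the feature input to a classifier, although the abstract does not specify which features or model the authors use.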
Strengthening Cyber Security Education: Designing Robust Assessments for ChatGPT-Generated Answers
Abstract
Cyber security education has become a hot topic in Australia and many OECD countries due to increasing job demands for cyber security professionals. Designing authentic cyber security assessment tasks is an ongoing challenge, especially in the context of ChatGPT and similar AI-generated content (AIGC) tools. Some early studies suggest that the risks of using ChatGPT tools can be mitigated, but these studies overlooked cyber security education. This paper addresses this gap in the literature, focusing on assessment design in cyber security education in the presence of ChatGPT. While existing research has examined the transition from in-person to online education and ChatGPT’s capabilities, our study emphasizes the assessment structure and pedagogical approaches related to cyber security education. We conducted a systematic analysis by creating questions with four distinct prompts, feeding them to ChatGPT, and analyzing the answers with statistical tools. Our findings highlight the significance of question types and fact-checking in ChatGPT’s responses. We propose practical recommendations to enhance cyber security assessment design when incorporating ChatGPT. Our recommendations include incorporating recent academic references, using long essay questions, and thorough fact-checking to ensure the integrity of assessments.
Andrew Plapp, Jianzhang Wu, Lei Pan, Chao Chen, Caslon Chua, Jun Zhang
PassFile: Graphical Password Authentication Based on File Browsing Records
Abstract
With the rapid growth of smartphone shipments, it has become necessary to safeguard mobile devices from unauthorized access (e.g., when a device is stolen or lost). Graphical passwords, which require users to create their credentials based on one or more images, are believed to be a promising complement to traditional textual passwords. However, many complicated graphical password schemes may increase the memory load of a user. In practice, a usable graphical password scheme is often designed based on users’ own knowledge. In this work, we introduce PassFile, a graphical password authentication scheme based on file browsing records on mobile devices. It requires a user to select the most frequently used applications on the mobile device as the authentication token. In our user study, PassFile achieved a high login success rate (over 96%).
Ho Chun Fu, Wenjuan Li, Yu Wang
On the Role of Similarity in Detecting Masquerading Files
Abstract
Similarity has been applied in a wide range of security applications, typically within machine learning models. We examine the problem posed by masquerading samples, that is, samples crafted by bad actors to be similar or nearly identical to legitimate samples. We find that these samples potentially create significant problems for machine learning solutions. The primary problem is that bad actors can circumvent machine learning solutions by using masquerading samples.
We then examine the interplay between digital signatures and machine learning solutions. In particular, we focus on executable files and code signing. We offer a taxonomy for masquerading files. We use a combination of similarity and clustering to find masquerading files. We use the insights gathered in this process to offer improvements to similarity based and machine learning security solutions.
Jonathan Oliver, Jue Mo, Susmit Yenkar, Raghav Batta, Sekhar Josyoula
A Password-Based Mutual Authenticated Key Exchange Scheme by Blockchain for WBAN
Abstract
With the continuous development of Internet of Things technologies, Wireless Body Area Networks (WBAN) have shown great application potential in the healthcare industry. However, adversaries may masquerade as legitimate users, and sensitive medical data may be intercepted during transmission. Therefore, proper authentication and secure communications are required in WBAN. Password-based authenticated key exchange (PAKE) is an attractive solution for this problem due to its simplicity and low cost, i.e., the user and the server can use their shared password to perform authentication and to establish a session key for secure information exchange. However, many existing PAKE protocols suffer from some limitations. First, some schemes only consider one-way authentication, so masquerading is still possible on the server side. Second, some schemes are vulnerable to the offline dictionary attack, with the consequence that the user’s password, which has limited entropy, can be leaked. Third, some schemes need to employ secure channels, making them less practical in real-world applications. In this paper, we propose a password-based mutual authenticated key exchange scheme by Blockchain for WBAN, in which all these issues are addressed. In particular, mutual authentication is realized, and the adversary cannot launch the offline dictionary attack. Moreover, these features are achieved without employing a secure channel. Therefore, it achieves a good balance between usability and security. Security and performance analyses demonstrate that it satisfies the desirable security requirements and is very efficient for practical applications.
Pei Huang, Yaorui He, Ting Liang, Zhe Xia
Traffic Signal Timing Optimization Based on Intersection Importance in Vehicle-Road Collaboration
Abstract
Vehicle-road collaboration positively promotes vehicle development and intelligent transportation. The construction of intelligent transportation cannot be separated from optimizing traffic signal timing, which is crucial for improving traffic efficiency, reducing congestion, and minimizing accident risks. Reinforcement learning (RL) has emerged as an effective method for traffic signal timing optimization. However, many current RL-based approaches ignore the variations among different intersections, which reduces traffic efficiency. In this paper, we propose a novel method for traffic signal timing optimization, which models the problem as an importance-oriented decision-making problem. To achieve this, we first construct a directed adjacency graph based on the real road network. Then, a graph attention neural network (GAT) is utilized to estimate the importance of each intersection. Finally, we introduce the node importance into the reward function to find the optimal traffic light timing scheme. Extensive experiments demonstrate that our proposed method achieves higher traffic efficiency than existing RL-based traffic signal timing optimization methods that ignore intersection importance. Moreover, our method fits well with different RL algorithms, including Q-learning, DQN, Sarsa, DDPG and A3C.
Pengna Liu, Ziyan Qiao, Yalun Wu, Kang Chen, Jiasong Hou, Yingqi Cai, Liqun Chu, Endong Tong, Wenjia Niu, Jiqiang Liu
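The idea of introducing intersection importance into the reward function can be sketched as follows. The abstract does not give the actual reward, so this is only an assumed form: a queue-length penalty weighted by normalized importance scores. The function name, the intersection ids, and the hand-picked importance values (which the paper obtains from a GAT) are hypothetical.

```python
def importance_weighted_reward(queue_lengths, importance):
    """Negative importance-weighted queue length over all intersections.

    queue_lengths: dict of intersection id -> number of waiting vehicles
                   (hypothetical state representation)
    importance:    dict of intersection id -> non-negative importance score
                   (given by hand here; the paper estimates it with a GAT)
    """
    total = sum(importance.values())
    # Normalize the scores so the weights sum to 1, then penalize weighted queues.
    return -sum(importance[i] / total * q for i, q in queue_lengths.items())

importance = {"A": 3.0, "B": 1.0}   # intersection A is assumed to matter more
congested_major = importance_weighted_reward({"A": 10, "B": 2}, importance)
congested_minor = importance_weighted_reward({"A": 2, "B": 10}, importance)
print(congested_major, congested_minor)
```

Both states contain 12 waiting vehicles, but congestion at the more important intersection is penalized more heavily (-8.0 versus -4.0), so an RL agent trained on such a reward would be pushed to clear important intersections first.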
A Client-Side Watermarking with Private-Class in Federated Learning
Abstract
Federated learning is a privacy-focused distributed learning framework that involves sharing model updates among participants. However, this sharing of the global model increases the risk of model leakage when unreliable participants are involved. To protect model copyright and improve watermark robustness, we propose a client-side watermarking method. This method introduces an additional watermark class to the client’s model, extending it to an \(N+1\) classification model. The client’s model is trained using both the watermark dataset and the local dataset. Before sending updates to the server, the watermark class parameters are removed and stored locally. Participants enhance model personalization during aggregation by uploading amplified parameters. After server aggregation, the global model is distributed for local training. The saved watermark parameters continuously update through iterations until model convergence. Extensive experiments demonstrate minimal impact on neural network performance and strong robustness during model modifications, such as fine-tuning and pruning.
Weitong Chen, Wei Zhang, Jiale Zhang, Xiaobing Sun, Xiang Cheng, Chengcheng Zhu
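The step in which the watermark class parameters are removed before upload and restored afterwards can be sketched with NumPy. This is an assumption-laden illustration, not the authors' code: it supposes the watermark class is the last output of a single linear classification head, and the names split_head and merge_head are hypothetical. The amplification of uploaded parameters mentioned in the abstract is not modeled.

```python
import numpy as np

def split_head(weight, bias):
    """Split an (N+1)-class head into the N shared rows and the private watermark row.

    Assumes the watermark class is the last row of the weight matrix.
    """
    shared = {"weight": weight[:-1].copy(), "bias": bias[:-1].copy()}
    private = {"weight": weight[-1:].copy(), "bias": bias[-1:].copy()}
    return shared, private

def merge_head(shared, private):
    """Re-attach the locally stored watermark row to an N-class head."""
    weight = np.vstack([shared["weight"], private["weight"]])
    bias = np.concatenate([shared["bias"], private["bias"]])
    return weight, bias

# Example: N = 3 ordinary classes plus 1 watermark class, 5 input features.
rng = np.random.default_rng(0)
local_weight = rng.normal(size=(4, 5))
local_bias = rng.normal(size=4)

upload, kept_locally = split_head(local_weight, local_bias)   # only `upload` leaves the client
restored_weight, restored_bias = merge_head(upload, kept_locally)
print(upload["weight"].shape, restored_weight.shape)
```

In a real round, `upload` would be replaced by the aggregated global head before merging, so the server and other participants never see the watermark row.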
Research on Evasion and Detection of Malicious JavaScript Code
Abstract
This thesis analyzes the nature of malicious JavaScript and how its malicious functions are implemented. It then combines the result with taint analysis, a technique from the field of software vulnerability analysis, and proposes a new malicious JavaScript detection method based on taint analysis. This method defines taint sources and taint sinks according to how malicious code functions are implemented, and then performs taint propagation on the abstract syntax tree of the code to obtain the characteristics of the code. After forming a feature vector through this process, the method uses machine learning models to complete detection. Experimental results show that the method performs well at binary classification of malicious and benign samples, and that its detection of obfuscated samples is significantly better than that of mainstream online anti-malware engines. Code obfuscation has little effect on the detection results of this method.
Yujie Ma, Haokai Wu, Yu-an Tan, Yuanzhang Li
Tackling Non-IID for Federated Learning with Components Alignment
Abstract
Federated Learning (FL) is a privacy-preserving framework used to perform machine learning tasks with distributed data. One of the key challenges is heterogeneous data distributions among clients, which results in client-drift, leading to an oscillatory and low-accuracy global model. Although many approaches have been proposed to mitigate client-drift, we find there are drawbacks associated with the two common methods: feature alignment and classifier tuning. For the former, the large bias in classifiers persists in local models and degrades global model performance. For the latter, it is hard to obtain suitable global features to introduce external knowledge to local models. To address these drawbacks, in this paper, we propose a privacy-preserving and effective method, named FCA, to tackle client-drift issues in Non-IID federated learning by aligning models’ components. Specifically, FCA enhances similarity among the local models’ components, i.e., feature extractors and classifiers, by utilizing the estimated global feature representations. Experimental results demonstrate that FCA achieves better performance with fewer rounds. Compared with the vanilla baseline, our method improves performance by 0.4% to 7.5% on three popular datasets with four different Non-IID scenarios.
Baolu Xue, Jiale Zhang, Bing Chen, Wenjuan Li
Security on Top of Security: Detecting Malicious Firewall Policy Changes via K-Means Clustering
Abstract
As a company grows, so does its infrastructure, especially its information technology (IT) infrastructure. Maintaining a transparent and manageable firewall policy during this period of rapid upscaling is nigh impossible. The situation is further complicated when multiple people, or even multiple teams, deploy and maintain these firewall policies. Different people often tackle a problem differently, developing different solutions, which, in turn, lead to different firewall policies. Inconsistencies in firewall policies are particularly problematic when it comes to updating, patching, and testing firewalls. Motivated by these issues, in this work, we collaborate with a telecommunications company and construct a web application that leverages machine learning to detect anomalies in firewall policies. The machine learning models can use firewall logs from internal firewalls and can therefore learn the intricacies of traffic on a given network. The models can then predict the expected output from the network logs; anomalies can be identified if the observed values differ from the predicted values. In our evaluation, we collect data from the participating telecommunications company, implement our solution using the k-means clustering algorithm, and evaluate its performance against the collected data.
Mads Solberg Collingwood Pyke, Weizhi Meng, Brooke Lampe
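As a rough illustration of k-means-based anomaly detection on log-derived data, the sketch below clusters "normal" feature vectors and flags new entries that lie unusually far from every centroid. The abstract does not describe the features, the number of clusters, or the decision rule, so the two-dimensional synthetic features, k = 2, the 99th-percentile threshold, and the function names are all assumptions.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids

def distance_to_nearest(X, centroids):
    """Distance from each row of X to its closest centroid."""
    return np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).min(axis=1)

# Synthetic stand-in for numeric features extracted from firewall log entries.
rng = np.random.default_rng(0)
normal_traffic = np.vstack([
    rng.normal(0.0, 0.5, size=(100, 2)),
    rng.normal(10.0, 0.5, size=(100, 2)),
])

centroids = kmeans(normal_traffic, k=2)
# Assumed decision rule: anything farther than 99% of the training data is anomalous.
threshold = np.percentile(distance_to_nearest(normal_traffic, centroids), 99)

new_entries = np.array([[0.1, -0.2],     # resembles normal traffic
                        [50.0, -50.0]])  # far from both clusters
flags = distance_to_nearest(new_entries, centroids) > threshold
print(flags)
```

Here the first new entry should be accepted and the second flagged. In practice the threshold trades false alarms against missed changes and would need tuning on real logs.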
Penetrating Machine Learning Servers via Exploiting BMC Vulnerability
Abstract
With the recent significant advancements in machine learning, there has been an increasing focus on the data security and availability of servers, which serve as critical hardware infrastructure supporting AI computations. However, most existing security research has primarily focused on upper layers, attempting to defend against attacks from applications and operating systems, thereby neglecting research in firmware and lower-level management modules. Nevertheless, these fields are crucial in constructing a comprehensive security chain. To analyze the security of lower-level management modules, this paper introduces a method for privilege escalation through vulnerabilities in the Baseboard Management Controller (BMC) of the server. The BMC is a critical component responsible for managing and monitoring the hardware of the server. This method allows for bypassing the Kernel Address Space Layout Randomization (KASLR) protection of the Linux kernel and implanting a backdoor into the host operating system, thereby gaining root access to the host. Through this method, we can access server memory data or execute malicious programs arbitrarily without physical contact, and reinstalling the system cannot overwrite the modifications made in the BMC. This poses a significant security threat to servers.
Yashi Liu, Kefan Qiu, Lu Liu, Quanxin Zhang
Backmatter
Title
Machine Learning for Cyber Security
Edited by
Dan Dongseong Kim
Chao Chen
Copyright year
2024
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-9724-58-1
Print ISBN
978-981-9724-57-4
DOI
https://doi.org/10.1007/978-981-97-2458-1

