
Open Access 2025 | Original Paper | Book Chapter

Towards Large Language Model Guided Kernel Direct Fuzzing

Authors: Xie Li, Zhaoyue Yuan, Zhenduo Zhang, Youcheng Sun, Lijun Zhang

Published in: Fundamental Approaches to Software Engineering

Publisher: Springer Nature Switzerland


Abstract

This chapter examines the integration of large language models (LLMs) with direct kernel fuzzing to improve the detection of software vulnerabilities in operating systems. The authors present a novel framework, SyzAgent, which extends the capabilities of the existing Syzkaller tool by incorporating real-time feedback from LLMs. This dynamic approach allows the fuzzing process to adapt to changes in the kernel, improving efficiency and coverage. The chapter details the architecture of SyzAgent and describes its components and their interactions with Syzkaller. It also provides preliminary experimental results that show the effectiveness of the LLM-assisted method in breaking coverage plateaus and outperforming traditional fuzzing tools. The authors discuss the challenges and insights gained from integrating LLMs with kernel fuzzers and highlight the potential of this approach for future advances in software security. The chapter concludes with a discussion of future directions and improvements for SyzAgent, emphasizing the need for further research to fully exploit the capabilities of LLMs in kernel fuzzing.

1 Introduction

Operating systems (OS) are crucial in modern computing infrastructures, making the correctness and reliability of an OS kernel vital. Fuzzing is a common method for identifying software vulnerabilities and has been notably applied in kernel testing with tools like Syzkaller [3], which has identified many bugs. Despite progress, the complexity of modern OSes can impede fuzzers from reaching deeper code paths. To improve fuzzing efficiency and coverage, researchers have explored ways to better discover and utilize the dependency relations between system calls and tasks [8, 9]. Other works have employed reinforcement learning techniques [11] and static analysis methods [6, 14] to target previously unreached code during fuzzing.
With the rapid advancement of generative AI [5], the use of large language models (LLMs) in system fuzzing is increasingly recognized [12]. The KernelGPT method [13] has been proposed to utilize LLMs to generate Syzlang, a domain-specific language for system calls, facilitating improved seed generation and test case creation in Syzkaller [4].
Instead of general kernel fuzzing like Syzkaller, this work emphasizes direct kernel fuzzing, which targets specific, often critical areas within the OS kernel to manage the challenges posed by frequent updates and rapid iterations. The Syzdirect approach [10] extends Syzkaller by leveraging the call graph and resource model to provide structured guidance for test case generation, enabling more effective direct kernel fuzzing.
In this work, we integrate LLMs with direct fuzzing of the OS kernel. The source code and fuzzing intermediate results are fed to the LLM dynamically to retrieve guidance for test case generation. Unlike KernelGPT, which focuses on generating Syzlang specifications, and Syzdirect, which utilizes pre-built guidance from the call graph and resource model, our approach employs real-time feedback from the LLM to adapt to changes in the kernel. We implemented our framework, SyzAgent, to achieve this integration and provide preliminary experimental results demonstrating the effectiveness of the approach. Without loss of generality, GPT-4o [1] is used for the experiments in this paper. In addition, we share insights into the challenges and experiences encountered while integrating LLMs with kernel fuzzers.

2 Motivating Example

Consider a commit changing the function __anon_inode_getfd in the Linux kernel, referred to as the target function. Our objective is to test the newly introduced code in this commit using guidance from an LLM.
By compiling and analyzing the Linux kernel, we generate a set of call paths, which represent the routes from system calls to the target function in the kernel’s call graph. Below are two example call paths, where func1 \(\rightarrow \) func2 indicates that func2 is called within the body of func1:
$$\begin{aligned} &\texttt {inotify\_init} \rightarrow \texttt {do\_inotify\_init} \rightarrow \texttt {anon\_inode\_getfd} \rightarrow \texttt {\_\_anon\_inode\_getfd}\\ &\texttt {mock\_drm\_getfile} \rightarrow \texttt {anon\_inode\_getfile} \rightarrow \texttt {\_\_anon\_inode\_getfd}\\ \end{aligned}$$
The first path illustrates a direct call to the target function from a system call (inotify_init), while the second involves an indirect call through other functions within two steps. We collect these paths to inform the LLM about potential triggers for the target function. Once identified, the source code from these paths, referred to as calling code, is extracted and used to formulate the initial prompt, as shown in Prompt 1.1.
Upon receiving the initial prompt, the LLM identifies the following system calls that may potentially interact with the target function.
[inotify_init, inotify_init1, fsopen, fspick, perf_event_open,
timerfd_create, epoll_create, epoll_create1, eventfd, eventfd2,
signalfd, signalfd4]
These system calls represent possible entry points to the target function. The reason for using the LLM in this analysis is its potential to identify additional system calls: because it has been trained on extensive open-source project data, an LLM may provide more diverse results.
Subsequently, a kernel fuzzer like Syzkaller is launched, generating test cases in which these system calls are chosen with increased probability so that the target is more likely to be reached. During the fuzzing process, whenever 500 test cases have been executed, those that cover functions within 2 steps of the target function are collected, and the covered source code is recorded to create a feedback prompt, as shown in Prompt 1.2.
After receiving the feedback prompt, the LLM provides an updated list of system calls. With real test cases available, the LLM is more likely to introduce related system calls. In this example, the LLM adds two more system calls, drm_syncobj_handle_to_fd_ioctl and mmap, to the initial list. This improvement in system call generation allows the fuzzing process to cover the target function more frequently in subsequent runs.

3 Approach

The architecture of our proposed approach is depicted in Figure 1, comprising two parts: 1) the original kernel fuzzer Syzkaller, and 2) its LLM extension, SyzAgent. Below, we introduce each part and explain their interactions.
Fig. 1. SyzAgent extends the existing Syzkaller by applying an LLM to kernel fuzzing.

3.1 Syzkaller

Syzkaller fuzzes the OS kernel by executing finite sequences of system calls with their arguments, where each system call comes from a set of system calls S. It creates three task types in the work queue (as shown in Figure 1):
Generation Initial seed programs are generated from manually tuned templates to ensure deeper test cases.
Mutation Mutation is applied to programs selected from a corpus (i.e., previously executed programs that triggered new coverage). During this phase, system calls and their arguments are modified, including adding, removing, or changing system calls. This process is guided by the fuzzing state, which includes a choice table (ct): a two-dimensional array where \(ct[c][c']\) represents the probability of generating system call \(c'\) after \(c\). System call insertion is either random (5% of the time) or based on probabilities from the choice table. Arguments are generated by considering available resources at the insertion point.
Triage Test cases that triggered new coverage will be verified and minimized by removing redundant system calls, with successful cases added to the corpus as new seeds. Triage tasks take priority in the fuzzing process, followed by generation and mutation if no triage tasks are available.
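The insertion rule described under Mutation can be sketched as follows. This is an illustration in Python only: Syzkaller itself is written in Go, and the names `pick_next_call` and `random_rate`, as well as the nested-dictionary layout of the choice table, are assumptions made for this sketch rather than Syzkaller's actual code.

```python
import random


def pick_next_call(ct, prev, calls, rng, random_rate=0.05):
    """Pick the system call to insert after `prev`.

    ct          -- hypothetical choice table: ct[prev][nxt] is a non-negative weight
    calls       -- list of all candidate system call names
    rng         -- a random.Random instance (passed in so runs are reproducible)
    random_rate -- fraction of insertions that ignore the choice table (5% in the text)
    """
    if rng.random() < random_rate:
        # Uniformly random insertion.
        return rng.choice(calls)
    weights = [ct[prev][c] for c in calls]
    if sum(weights) == 0:
        # No information for this row: fall back to a uniform pick (an assumption).
        return rng.choice(calls)
    # Weighted pick according to the row of the choice table.
    return rng.choices(calls, weights=weights, k=1)[0]


calls = ["open", "read", "close"]
ct = {"open": {"open": 0.0, "read": 1.0, "close": 0.0}}
rng = random.Random(0)
picked = pick_next_call(ct, "open", calls, rng, random_rate=0.0)
```

With `random_rate=0.0` and a row whose only non-zero weight belongs to `read`, the sketch always selects `read`; raising the weight of a system call in a row is therefore the lever that SyzAgent can use to steer generation.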

3.2 SyzAgent

We propose SyzAgent to extend Syzkaller, as shown in Figure 1. It integrates an LLM into Syzkaller’s generation and modification of the choice table. The LLM influences the fuzzing process in three key procedures: 1. It constructs the initial choice table based on static analysis and LLM analysis results. 2. During fuzzing, it collects some running test cases with coverage information and formulates a feedback prompt to obtain fuzzing guidance from the LLM. 3. Finally, it updates the choice table using the guidance provided by the LLM. The extension corresponds to four new components: the pre-processor, static analyzer, address extractor, and LLM interface, as shown in Figure 1.
Pre-Processor The pre-processor compiles the OS kernel’s source code into a binary image for testing and generates intermediate representation (IR) files using the LLVM framework [7]. These IR files are used for static analysis and are generated without any optimization so that they reflect the calling relations in as much detail as possible. Additionally, the pre-processor gathers information on all C functions present in the Linux kernel.
Static Analyzer The static analyzer parses and analyzes the IR files generated by the pre-processor, resulting in the call graph of the OS kernel. A call graph of the Linux kernel is a graph \( G = (C \cup F, E) \), where: \( C \) is a finite set of system calls, \( F \) is a finite set of other functions, and \( E \subseteq (C \cup F) \times (C \cup F) \) represents the set of directed edges in the call graph, showing the calling relationships between functions. Given a target function \(f_t\), the static analyzer performs the following tasks:
  • Job 1: Find all paths from some \( s \in C \) to \( f_t \). This corresponds to the first type of call paths in the motivating example.
  • Job 2: Find all paths from any function \( f \) to \( f_t \) with length \( l \), where \( l < \textbf{k} \) and \( \textbf{k} \in \mathbb {N} \) is a constant. This corresponds to the second type of call paths in the motivating example.
  • Job 3: Identify all close functions \( f_c \) within a specific close range constant \( \textbf{d} \in \mathbb {N} \), where \( f_c \) is an \( n \)-step predecessor in the call graph and \( n \le \textbf{d} \). A predecessor refers to a function that directly calls or influences another function in a call path. The set of these functions is referred to as the close area.
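The three jobs can be sketched on a toy call graph built from the two call paths of the motivating example. This is a minimal illustration under stated assumptions, not the analyzer used by SyzAgent: the function names `reverse_graph`, `paths_to_target`, `syscall_paths`, and `close_area` are hypothetical, the graph is a plain dictionary instead of LLVM IR, Job 1 is bounded by `max_edges` for simplicity, and the target itself is not counted as part of the close area.

```python
from collections import deque


def reverse_graph(calls):
    """Map each callee to the set of functions that call it."""
    rev = {}
    for caller, callees in calls.items():
        for callee in callees:
            rev.setdefault(callee, set()).add(caller)
    return rev


def paths_to_target(rev, target, max_edges):
    """All simple call paths ending at `target` with 1..max_edges edges (Job 2)."""
    found = []
    stack = [(target,)]  # each path is a tuple whose last element is the target
    while stack:
        path = stack.pop()
        if len(path) > 1:
            found.append(path)
        if len(path) - 1 == max_edges:
            continue  # length bound reached, do not extend further
        for caller in rev.get(path[0], ()):
            if caller not in path:  # keep paths simple (no repeated function)
                stack.append((caller,) + path)
    return found


def syscall_paths(rev, syscalls, target, max_edges):
    """Paths that start at a system call and end at the target (Job 1)."""
    return [p for p in paths_to_target(rev, target, max_edges) if p[0] in syscalls]


def close_area(rev, target, d):
    """n-step predecessors of `target` with 1 <= n <= d, with their distance (Job 3)."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        f = queue.popleft()
        if dist[f] == d:
            continue
        for caller in rev.get(f, ()):
            if caller not in dist:
                dist[caller] = dist[f] + 1
                queue.append(caller)
    del dist[target]
    return dist


# Toy call graph taken from the two example call paths.
calls = {
    "inotify_init": ["do_inotify_init"],
    "do_inotify_init": ["anon_inode_getfd"],
    "anon_inode_getfd": ["__anon_inode_getfd"],
    "mock_drm_getfile": ["anon_inode_getfile"],
    "anon_inode_getfile": ["__anon_inode_getfd"],
}
target = "__anon_inode_getfd"
rev = reverse_graph(calls)
job1 = syscall_paths(rev, {"inotify_init"}, target, max_edges=5)
job2 = paths_to_target(rev, target, max_edges=2)  # l < k with k = 3
job3 = close_area(rev, target, d=2)
```

Working on the reversed graph keeps all three jobs local to the neighbourhood of the target, which matters for a call graph as large as the kernel's; the bound of five predecessors per function mentioned in the experiments serves the same purpose of limiting the number of paths.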
Address Extractor The address extractor matches program counter (PC) points in the compiled Linux kernel binary to their actual locations. Syzkaller uses KCOV [2] for coverage feedback, tracking the PC points reached by test cases. To improve efficiency, PC points in the close area are extracted in advance for quicker coverage checks.
LLM-Interface The LLM interface communicates with the LLM by sending the initial and feedback prompts. It then extracts a set of system calls, \( S_{inc} \), from the LLM feedback to update the choice table.
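How \( S_{inc} \) might be extracted from an LLM reply can be sketched as below. The reply format is an assumption: the sketch expects one bracketed, comma-separated list like the one in the motivating example, and it drops any name that is not a known system call. The function name `extract_syscalls` and the sample reply are hypothetical.

```python
import re


def extract_syscalls(reply, known_syscalls):
    """Return the set of known system calls named in the first [...] list of `reply`."""
    match = re.search(r"\[([^\]]*)\]", reply)  # the negated class also spans newlines
    if match is None:
        return set()
    names = (name.strip() for name in match.group(1).split(","))
    # Filtering against the known set guards against names the LLM made up.
    return {name for name in names if name in known_syscalls}


reply = "Relevant system calls:\n[inotify_init, inotify_init1,\n fsopen, not_a_syscall]"
known = {"inotify_init", "inotify_init1", "fsopen", "mmap"}
s_inc = extract_syscalls(reply, known)
```

Filtering against the fuzzer's own list of system calls is one simple way to address the validation gap noted in the conclusion, since unknown names cannot be placed in the choice table anyway.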
The choice table is modified as follows: for any system calls \( c_1 \) and \( c_2 \), let \( ct_0[c_1][c_2] \) represent the original choice table value in Syzkaller. The LLM-updated value, \( ct_1[c_1][c_2] \), is set to \( ct_0[c_1][c_2] + 1 \) if either \( c_1 \in S_{inc} \) or \( c_2 \in S_{inc} \); otherwise, \( ct_1[c_1][c_2] = ct_0[c_1][c_2] \). The final choice table is computed by normalizing \( ct_1 \) for each row: \( ct[c_1][c_2] = \dfrac{ct_1[c_1][c_2]}{\sum _i ct_1[c_1][c_i]}. \)
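The update rule above can be written out directly. The sketch below assumes the choice table is stored as a nested dictionary; the function name `update_choice_table` and the three placeholder system calls `a`, `b`, `c` are hypothetical and serve only to make the arithmetic visible.

```python
def update_choice_table(ct0, s_inc):
    """Add 1 to ct0[c1][c2] when c1 or c2 is in s_inc, then normalize each row."""
    ct = {}
    for c1, row in ct0.items():
        bumped = {
            c2: (value + 1 if (c1 in s_inc or c2 in s_inc) else value)
            for c2, value in row.items()
        }
        total = sum(bumped.values())
        # Row-wise normalization; an all-zero row stays all zero.
        ct[c1] = {c2: (value / total if total else 0.0) for c2, value in bumped.items()}
    return ct


ct0 = {
    "a": {"a": 1, "b": 1, "c": 2},
    "b": {"a": 0, "b": 3, "c": 1},
    "c": {"a": 2, "b": 2, "c": 0},
}
ct = update_choice_table(ct0, s_inc={"b"})
```

In this example the row for `a` becomes (1, 2, 2) before normalization and (0.2, 0.4, 0.4) afterwards, while the whole row for `b` is incremented to (1, 4, 2) because `b` itself is in \( S_{inc} \). The rule therefore raises both the chance of generating a suggested call and the chance of continuing after one.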
Apart from the components above, it is worth mentioning that LLM analysis runs slower than fuzzing, so we sample test cases from all executed test cases and run the LLM analysis in parallel. For every 500 cases sampled, some of the cases that covered the close area are selected at random for feedback prompting via the LLM interface.
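The sampling scheme can be sketched as a small windowed collector. The class name `FeedbackSampler` and the parameter `picks` are assumptions: the text only says that "some" cases are selected per 500 sampled, so the number chosen here is arbitrary, and the parallel LLM call itself is left out.

```python
import random


class FeedbackSampler:
    """Collect test cases and hand out a random subset of close-area hits per window."""

    def __init__(self, window=500, picks=3, seed=None):
        self.window = window      # number of sampled cases per feedback round
        self.picks = picks        # hypothetical: how many hits go into one prompt
        self.rng = random.Random(seed)
        self.seen = 0
        self.hits = []

    def add(self, test_case, hit_close_area):
        """Record one sampled case; return the chosen hits when the window is full."""
        self.seen += 1
        if hit_close_area:
            self.hits.append(test_case)
        if self.seen < self.window:
            return None
        chosen = self.rng.sample(self.hits, min(self.picks, len(self.hits)))
        self.seen, self.hits = 0, []  # start the next window
        return chosen


sampler = FeedbackSampler(window=5, picks=2, seed=0)
results = [sampler.add(f"prog{i}", hit_close_area=(i in {1, 3, 4})) for i in range(5)]
```

In a setup like the one described, the list returned at the end of a window would be handed to a separate worker that builds the feedback prompt, so the fuzzing loop does not wait for the LLM.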

4 Preliminary Experimental Results

We conducted experiments on fuzzing the Linux kernel to demonstrate that our LLM-driven SyzAgent method: 1) effectively adapts the existing vanilla Syzkaller tool, even breaking its coverage plateau, and 2) offers advantages over the specialized direct kernel fuzzing tool, SyzDirect.
Our experimental setup consisted of a PC equipped with a 13th Gen Intel Core i7-13700 processor and 128GB of memory. The virtual machine under test was configured on QEMU, running a Linux system on an AMD architecture with 4 CPUs and 4GB of memory. Given the 12.8k token limit of the LLM-interface in GPT-4o [1], we selected target functions based on the principle that no function in their call paths should have more than five predecessors, which prevents an explosion in the number of call paths to the target function. From this set, we selected a total of 27 target functions that our tool can currently process as our benchmark.
SyzAgent vs Syzkaller In this experiment, each target function was fuzzed using both SyzAgent and Syzkaller, with each tool tested three times per function, and each run limited to two hours. The fuzzing results from SyzAgent and Syzkaller are summarized in Table 1. Out of the 27 cases, SyzAgent achieved a hit rate (the ratio of the number of test cases that hit the close area to the number of all test cases) that surpassed Syzkaller's by more than 10 percentage points (Avg. Diff \(> 10\)) in 8 cases, while it underperformed compared to Syzkaller in only 5 cases. We also compute the improvement relative to the baseline hit rate, represented as \(\omega = \frac{\text {Avg. Diff}}{\text {Avg. Syzkaller Hit Rate}}\); 18 of the 27 cases (67%) have \(\omega \ge 10\%\).
Table 1. Experimental Data Comparison between Two Methods (“Dist.” denotes the minimum length of call path from some system call to target function. “Hit %” represents the ratio of the test cases that covered close area in the sampled test cases in percentage. “Avg. Diff” denotes the average difference of the hit rate of SyzAgent minus the hit rate of Syzkaller across all runs.)

| ID | Target Function | Dist. | SyzAgent Hit % Run 1 | SyzAgent Hit % Run 2 | SyzAgent Hit % Run 3 | Syzkaller Hit % Run 1 | Syzkaller Hit % Run 2 | Syzkaller Hit % Run 3 | Avg. Diff |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ksys_semctl | 1 | 28.27 | 28.89 | 31.8 | 3.8 | 5.15 | 1.11 | 26.3 |
| 2 | __sys_setfsgid | 1 | 19.1 | 13.89 | 9.45 | 0.0 | 0.03 | 0.0 | 14.14 |
| 3 | do_sched_yield | 1 | 25.15 | 26.32 | 41.7 | 20.57 | 30.14 | 26.86 | 5.2 |
| 4 | vm_acct_memory | 2 | 32.4 | 28.82 | 32.84 | 22.12 | 17.83 | 15.27 | 12.95 |
| 5 | __shmem_file_setup | 2 | 8.82 | 7.32 | 7.81 | 3.79 | 6.49 | 4.93 | 2.91 |
| 6 | io_register_iowq_m... | 2 | 22.05 | 15.63 | 18.26 | 1.02 | 3.28 | 2.59 | 16.35 |
| 7 | __anon_inode_getfile | 2 | 30.47 | 30.38 | 30.65 | 9.97 | 11.89 | 11.68 | 19.32 |
| 8 | copy_fsxattr_from... | 3 | 56.8 | 56.99 | 54.75 | 51.0 | 49.34 | 50.1 | 6.03 |
| 9 | __io_uring_add_... | 3 | 36.02 | 34.97 | 28.0 | 8.9 | 2.65 | 11.76 | 25.23 |
| 10 | keyring_ptr_to_key | 3 | 30.26 | 21.64 | 23.95 | 6.58 | 2.48 | 6.47 | 20.11 |
| 11 | mnt_get_writers | 3 | 77.1 | 73.33 | 75.44 | 67.6 | 73.84 | 80.1 | 1.44 |
| 12 | futex_requeue_pi_... | 3 | 0.92 | 0.0 | 0.0 | 2.94 | 0.31 | 2.0 | -1.44 |
| 13 | wait_for_device_probe | 4 | 0.33 | 0.31 | 0.12 | 0.35 | 0.14 | 0.13 | 0.05 |
| 14 | memcpy_to_page | 4 | 24.07 | 29.73 | 34.0 | 7.1 | 9.02 | 0.0 | 23.89 |
| 15 | kimage_is_dest... | 5 | 1.74 | 7.69 | 8.05 | 0.0 | 0.06 | 0.66 | 5.59 |
| 16 | find_lock_entries | 5 | 40.68 | 37.72 | 33.88 | 38.28 | 34.67 | 39.23 | 0.03 |
| 17 | fsnotify_data_sb | 5 | 58.2 | 57.33 | 61.22 | 55.73 | 55.72 | 60.94 | 1.45 |
| 18 | security_inode_set... | 5 | 12.02 | 10.74 | 12.86 | 3.39 | 4.78 | 3.61 | 7.94 |
| 19 | free_partitions | 6 | 13.48 | 21.94 | 14.57 | 28.17 | 24.73 | 25.97 | -9.63 |
| 20 | bpf_prog_free | 6 | 0.56 | 5.12 | 3.68 | 1.25 | 1.5 | 3.37 | 1.08 |
| 21 | locks_delete_glob... | 6 | 0.59 | 0.58 | 0.0 | 0.73 | 0.04 | 0.56 | -0.05 |
| 22 | pmd_none_or_clear_bad | 7 | 12.92 | 11.3 | 16.47 | 14.72 | 19.68 | 18.11 | -3.94 |
| 23 | __submit_bio_noac... | 7 | 31.89 | 21.5 | 19.88 | 20.09 | 28.69 | 27.06 | -0.86 |
| 24 | srcu_read_lock_nm... | 7 | 19.89 | 45.15 | 26.51 | 23.62 | 25.22 | 20.31 | 7.47 |
| 25 | trace_wbc_writepage | 8 | 1.82 | 0.81 | 3.03 | 0.79 | 1.71 | 0.6 | 0.85 |
| 26 | sk_set_bit | 8 | 8.21 | 10.61 | 6.48 | 3.09 | 3.23 | 6.61 | 4.12 |
| 27 | sidtab_search_core | 8 | 76.48 | 77.85 | 75.68 | 73.14 | 76.26 | 73.34 | 2.42 |
These results confirm that the LLM integration in SyzAgent effectively improves Syzkaller’s performance in direct fuzzing, as the majority of cases achieved a higher hit rate when using SyzAgent.
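To make the two summary metrics concrete, the sketch below recomputes Avg. Diff and \(\omega\) for two rows of Table 1. The helper names `mean`, `avg_diff`, and `omega` are hypothetical, and returning `None` for a zero baseline is an assumption, since the text does not say how that case is handled.

```python
def mean(values):
    return sum(values) / len(values)


def avg_diff(agent_hits, baseline_hits):
    """Average SyzAgent hit rate minus average Syzkaller hit rate (percentage points)."""
    return mean(agent_hits) - mean(baseline_hits)


def omega(agent_hits, baseline_hits):
    """Avg. Diff relative to the average Syzkaller hit rate; None if that rate is 0."""
    baseline = mean(baseline_hits)
    if baseline == 0:
        return None
    return avg_diff(agent_hits, baseline_hits) / baseline


# Hit % per run, copied from Table 1 (IDs 1 and 11).
ksys_semctl = ([28.27, 28.89, 31.8], [3.8, 5.15, 1.11])
mnt_get_writers = ([77.1, 73.33, 75.44], [67.6, 73.84, 80.1])

diff_1, omega_1 = avg_diff(*ksys_semctl), omega(*ksys_semctl)
diff_11, omega_11 = avg_diff(*mnt_get_writers), omega(*mnt_get_writers)
```

For ksys_semctl the difference is about 26.3 percentage points on a baseline of roughly 3.35%, so \(\omega\) is far above the 10% threshold. For mnt_get_writers the difference of about 1.44 points sits on a baseline near 74%, so \(\omega\) stays below 10%. The relative measure thus favours targets that the baseline rarely reaches.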
Fig. 2. Coverage-execution graph for target function sk_set_bit within 2 h (one line for Syzkaller and one line for SyzAgent)
While this paper primarily focuses on kernel direct fuzzing, during our experiments, we observed that SyzAgent successfully breaks the Syzkaller coverage plateau. In the 27 direct fuzzing cases, we found that 5 cases achieved higher coverage within a fixed number of test cases, with IDs 4, 19, 25, 26, 27. Figure 2 illustrates the coverage progression for case 27, where deeper target functions were tested, partially validating our hypothesis that the main cause of the plateau is that the fuzzer lacks a seed that can reach deeper code. However, we also noted that in 6 cases, the coverage performance of SyzAgent was inferior to that of Syzkaller, while the remaining cases showed similar performance between the two tools. We identify this as another promising direction emerging from this work, and it will be valuable to investigate this hypothesis further, exploring how to harness the LLM’s capabilities to systematically improve kernel fuzzing coverage.
Table 2. Exemplar system call entry analysis results from SyzAgent and SyzDirect. The results reveal that SyzAgent has advantages over SyzDirect in identifying system call relationships (marked with one highlight color in the table). Conversely, SyzDirect excels in detecting argument types (marked with a second highlight color), a feature not currently supported by SyzAgent.
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90900-9_2/MediaObjects/648501_1_En_2_Tab2_HTML.png
SyzAgent vs SyzDirect An end-to-end comparison with the SyzDirect tool was not feasible due to multiple issues encountered during its installation, configuration, and manual instrumentation requirements.
Nevertheless, we managed to run SyzDirect’s stages for system call entry analysis and conducted a comparison with the LLM-generated results from SyzAgent. Table 2 presents the results for three target functions in the Linux kernel (see footnote 2). In the table, two highlight colors mark system calls: one indicates cases where SyzAgent outperforms SyzDirect, while the other represents cases where SyzDirect performs better.
In cases with IDs 2 and 3, SyzAgent identified three additional system call entries compared to SyzDirect. After manually verifying these cases, we found that SyzDirect’s call graph analysis was less precise than that of SyzAgent. For example, in the first case, io_uring_enter did not appear to be beneficial for reaching the target function. However, SyzDirect outperformed SyzAgent in providing specific variants of system calls, likely due to its more detailed call graph model that incorporates resource-producing and consuming relationships, which are currently not included in SyzAgent analysis. This results in a finer-grained analysis by SyzDirect compared to that of SyzAgent.

5 Conclusion and Discussion

In this work, we explored the integration of LLM capabilities with OS kernel fuzzers in real time. Based on our preliminary experimental results, this approach appears effective for direct fuzzing and warrants further investigation. However, our work is still in its early stage, as several advanced techniques, such as the relational graph approach from [9] and more sophisticated static analysis methods like those in [6, 14], have not yet been incorporated. Our work also lacks validation of whether the system calls are correctly generated.
At the implementation level, there are several ways SyzAgent could be enhanced: 1) Splitting the calling code into smaller segments to facilitate deeper exploration of target functions; 2) Integrating more closely with Syzkaller to enable LLMs to contribute to argument mutation processes; and 3) Using the distance to the target function of cases that cover nearby areas to select the most promising test cases for generating feedback prompts.
We regard LLMs as a viable solution to the complexities inherent in OS kernel fuzzing, thanks to the vast amount of data on which they are trained and optimized. The combination of LLM capabilities with our real-time feedback framework offers a flexible way to automatically adjust the fuzzing strategy. In the future, we believe it will be important to continue researching how LLMs can boost fuzzing coverage by utilizing information from intermediate results of static analysis and kernel documentation.

Acknowledgments

We gratefully thank Pierre Olivier for providing insights into the Linux kernel for this study. This work is partly supported by CAS Project for Young Scientists in Basic Research, Grant No. YSBR-040, ISCAS New Cultivation Project ISCAS-PYFX-202201, ISCAS Basic Research ISCAS-JCZD-202302 and the Ministry of Education, Singapore under its Academic Research Fund Tier 3 (Award ID: MOET32020-0004).
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Footnotes
2. Commit 304040fb4909f7771caf6f8e8c61dbe51c93505a
References
5. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
6. Corina, J., Machiry, A., Salls, C., Shoshitaishvili, Y., Hao, S., Kruegel, C., Vigna, G.: DIFUZE: interface aware fuzzing for kernel drivers. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017. pp. 2123–2138 (2017). https://doi.org/10.1145/3133956.3134069
7. Lattner, C., Adve, V.S.: LLVM: A compilation framework for lifelong program analysis & transformation. In: 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 20-24 March 2004, San Jose, CA, USA. pp. 75–88 (2004). https://doi.org/10.1109/CGO.2004.1281665
9. Sun, H., Shen, Y., Wang, C., Liu, J., Jiang, Y., Chen, T., Cui, A.: HEALER: relation learning guided kernel fuzzing. In: SOSP ’21: ACM SIGOPS 28th Symposium on Operating Systems Principles, Virtual Event / Koblenz, Germany, October 26-29, 2021. pp. 344–358 (2021). https://doi.org/10.1145/3477132.3483547
10. Tan, X., Zhang, Y., Lu, J., Xiong, X., Liu, Z., Yang, M.: Syzdirect: Directed greybox fuzzing for linux kernel. In: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, CCS 2023, Copenhagen, Denmark, November 26-30, 2023. pp. 1630–1644 (2023). https://doi.org/10.1145/3576915.3623146
11. Wang, D., Zhang, Z., Zhang, H., Qian, Z., Krishnamurthy, S.V., Abu-Ghazaleh, N.B.: Syzvegas: Beating kernel fuzzing odds with reinforcement learning. In: 30th USENIX Security Symposium, USENIX Security 2021, August 11-13, 2021. pp. 2741–2758 (2021)
12. Xia, C.S., Paltenghi, M., Tian, J.L., Pradel, M., Zhang, L.: Fuzz4all: Universal fuzzing with large language models. In: Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, ICSE 2024, Lisbon, Portugal, April 14-20, 2024. pp. 126:1–126:13 (2024). https://doi.org/10.1145/3597503.3639121
14. Zhao, B., Li, Z., Qin, S., Ma, Z., Yuan, M., Zhu, W., Tian, Z., Zhang, C.: Statefuzz: System call-based state-aware linux driver fuzzing. In: 31st USENIX Security Symposium, USENIX Security 2022, Boston, MA, USA, August 10-12, 2022. pp. 3273–3289 (2022)
Metadata
Title: Towards Large Language Model Guided Kernel Direct Fuzzing
Authors: Xie Li, Zhaoyue Yuan, Zhenduo Zhang, Youcheng Sun, Lijun Zhang
Copyright year: 2025
DOI: https://doi.org/10.1007/978-3-031-90900-9_2