RacerF ist ein statischer Analysator, der sich auf die schnelle und skalierbare Erkennung von Datenrennen in Multithreadprogrammen konzentriert. Es unterscheidet Threads anhand ihrer Einstiegspunktfunktionen und analysiert sie als sequenzielle Programme, wobei eine Kombination aus Unter- und Übernäherungsstrategien verwendet wird, um die Ergebnisse für ein Multithread-Setting zu verallgemeinern. Der Analysator verwendet Frama-C's Werteanalyse-Plugin EVA, das mehrere abstrakte Domänen kombiniert, um ganze Zahlen und Adressen zu charakterisieren, was ein hohes Maß an Präzision bietet. Die Analyse-Pipeline von RacerF umfasst Lockset-Analysen, Active-Threads-Analysen, Speicherzugriffsanalysen und die Erkennung von Datenrennen, die jeweils zu ihrer robusten Leistung beitragen. Das Kapitel hebt die Stärken von RacerF hervor, wie die Geschwindigkeit und die korrekten Ergebnisse für Linux-Kernel-Benchmarks, sowie seine Schwächen, einschließlich des Mangels an formalen Garantien und Pfadempfindlichkeit. Die Leistung des Analysators im Wettbewerb SV-COMP '25 wird diskutiert, wobei besonderes Augenmerk auf seine Fähigkeit gelegt wird, komplexe Benchmarks effizient zu handhaben. Darüber hinaus enthält das Kapitel Details zum Softwareprojekt und zur Werkzeugeinrichtung, einschließlich der Vorverarbeitungsschritte, der Kombination von Approximationsstrategien und der Zeugennachbearbeitung. Die Implementierung von RacerF in OCaml als Plugin für die Frama-C-Plattform wird ebenso beschrieben wie deren Verfügbarkeit und Gebrauchsanweisung.
KI-Generiert
Diese Zusammenfassung des Fachinhalts wurde mit Hilfe von KI generiert.
Abstract
RacerF is a static analyser for detection of data races in multithreaded C programs implemented as a plugin of the Frama-C platform. The approach behind RacerF is mostly heuristic and relies on analysis of the sequential behaviour of particular threads whose results are generalised using a combination of under- and over-approximating techniques to allow analysis of the multithreading behaviour. In particular, in SV-COMP’25, RacerF relies on the Frama-C’s abstract interpreter EVA to perform the analysis of the sequential behaviour. Although RacerF does not provide any formal guarantees, it ranked second in the NoDataRace-Main sub-category, providing the largest number of correct results (when excluding metaverifiers) and just 4 false positives.
Hinweise
T. Dacík—Jury member.
1 Verification Approach
RacerF is a static analyser whose primary goal is to provide fast and scalable data race detection, without sacrificing too much precision. It relies on using a backend static analyser that summarises (in a way specific for the chosen analyser) the sequential behaviour of particular threads – or, more precisely, classes of threads since RacerF distinguishes threads according to their entry point function only. The existing threads are discovered incrementally since a newly discovered thread may create further threads. The results of the analysis of the sequential behaviour are generalised for the multi-threaded setting using a combination of an under-approximating and an over-approximating strategy.
RacerF comes with several analysis backends differing in their scalability and precision. In SV-COMP’25, we use the most precise of them – Frama-C’s value analysis plugin EVA based on abstract interpretation combining several abstract domains to characterize sets of integers and addresses that may get assigned to the program variables [3]. Over the backend’s results, RacerF runs several analyses to perform data race detection as a final step. We briefly describe these analyses in the rest of this section. More details can be found in [7].
Anzeige
1.1 Analysing Multithreaded Programs Using Sequential Behaviour
The main idea behind RacerF is to discover all program threads, distinguished statically according to their entry functions, and analyse them as sequential programs, starting with initial states that are computed using a set of equations over currently discovered threads. We have two strategies to construct those equations, based on under-approximating or over-approximating thread behaviours. The approximations computed by the different analysis backends are then available to subsequent analysis phases performed by RacerF, which may but need not respect the guaranteed under-/over-approximation.
Under-approximating strategy. The idea behind this strategy is to ignore all interleavings. Each thread is analysed with the initial state corresponding to the join of states discovered as reachable at its create statements, and its writes to global variables are not propagated to other threads. When there is no cyclic dependency in thread creation, it is enough to analyse each thread just once.
Over-approximating strategy. This strategy over-approximates considered interleavings (including also those ruled out by some synchronization method). In this setting, each thread is analysed with the initial state given as the join of all states encountered by analyses of all discovered threads. This strategy may require several re-analyses of individual threads, but we aggressively use widening to reduce this number. This strategy needs to be accompanied by a program transformation that ensures that the computed over-approximation of the initial state is not lost during the sequential analysis of threads. Intuitively, the exchange of information between thread analyses happens at the initial states only, and so we need to preserve the information obtained from analyses of other threads throughout the re-analysis of a given thread and not destroy it by some local assignments before it gets propagated to a code location where it actually matters (see [7] for more details). This transformation was implemented after our submission to SV-COMP’25, but its absence does not manifest in the results (it would manifest only on programs that our memory access analysis cannot handle precisely at that time.)
Combined strategy. For SV-COMP’25, we provide a wrapper that combines both strategies in a natural way. The under-approximation is only allowed to report races, and the over-approximation is only allowed to claim a program race-free. We start with the over-approximation, and if it is inconclusive, we run the under-approximation. If both strategies are inconclusive, we report unknown. The situation in which running the combination is beneficial within SV-COMP is, however, quite rare as the result of using purely under- or over-approximation would lead to results different from using their combination in 11 out of 1029 data race benchmarks only.
Anzeige
1.2 Analysis Pipeline
We call an analysis context a triple consisting of a thread, a sequence of a fixed number of the latest calls (each of them formed by a call site and the function called, with the number being a parameter of the analysis – for SV-COMP’25, set to two), and a program statement. We run the following pipeline of analyses to detect data races.
Lockset analysis. This analysis computes the set of sets of locks held for each program context (using an algorithm inspired by [8]) and provides must- and may-queries to check whether two contexts are guarded by a common lock. Trylocks and read-write locks are supported.
Active-threads analysis. The goal of this analysis is to provide information which threads may- and must-run in parallel in particular contexts. This information is then used to not report data races on memory accesses where one happens before the thread of the second is created, or after it is surely joined.
Memory access analysis. EVA represents memory addresses as pairs consisting of a base and an offset. Each variable (global, local, or formal) has its associated base. Dynamic allocations have so-called dynamic bases, either strong or weak (representing multiple allocations). The memory access analysis tracks accesses to memory bases and distinguishes their states in a way inspired by [11]. Those states are used to select bases for which data race detection should be performed.
Data race detection. Finally, over the discovered memory accesses, a may and a must data race detection is performed. Intuitively, the may date race detection checks whether at least one of the accesses is a write access, the accesses may come from distinct threads and may happen in parallel, the concerned locksets may have empty intersection, and the concerned memory blocks may overlap (with the must data race detection modified accordingly and accompanied by several more checks to eliminate common sources of false positives).
Reducing Incorrect Verdicts. As already mentioned, to reduce the number of incorrect verdicts, we can leverage the distinction of the detected races to may and must races, essentially reporting a data race only when a must data race is detected, and reporting no data race only if no may data race is found. Otherwise, an unknown result is produced.
Further, since the SV-COMP’25 benchmark also contains programs implementing lock-free algorithms, which we cannot analyse, we have implemented a simple detection of the code pattern of active waiting (essentially detecting empty control loops), which is the common denominator of such programs, and report unknown if we encounter such active waiting.
Table 1.
Results of RacerF in NoDataRace-Main sub-category.
Dummy
2 Strengths and Weaknesses
Despite its heuristic nature, RacerF can provide very precise results as Table 1 shows [2]. RacerF does not provide a true/false verdict for 253 programs, of which 17 cannot be parsed by Frama-C, 60 contain unsupported features (3 cases of semaphores, 57 cases of custom atomic functions), and in 176 cases, our analyser returns unknown (in 20 cases, active waiting is detected; and in 156 cases, our may/must data race classification is inconclusive).
Strengths.RacerF is quite fast. In just 8 cases, it needs more than 5 seconds to compute a result, and it also never runs out of time or memory. This is in stark contrast with model checkers. The only competitive SV-COMP’25 participant from this point of view is Goblint [10, 12], which, however, also timed out in a few cases. What we are especially proud of is that RacerF is the only SV-COMP’25 participant (excluding tools with a negative score) which provides correct results for all 5 programs derived from the Linux kernel in the ldv-linux-3.14-races benchmark. Besides it, just Goblint is able to provide the correct result for one of those tasks. All the other tools either run out of resources, fail, or return an unknown verdict. RacerF needs just 2 minutes for one of them and about 30 seconds for the others.
Weaknesses.RacerF does not provide any formal guarantees for its results. There are 4 cases in which our heuristic approach reports a false alarm. All of them rule out a data race by some intricate condition that is out of the scope of our analyses. Two of them (pthread-ext/09_fmaxsym-zero.c and pthread-ext/11_fmaxsymopt-zero.c) are variants of the same program that relies on the fact that an array is zero-initialized and a certain condition cannot hold more than once throughout the run of the program. The other two programs goblint-regression/13-privatized_40-traces-ex-6_true.c and pthread-atomic/time_var_mutex.c use synchronisation patterns that are beyond the scope of our lockset analysis.
Another weakness of our analyser is a lack of path-sensitivity. In some cases, especially on the ldv-races benchmark, RacerF returns unknown because it considers some trivially infeasible path that prevents it from classifying the encountered issue as a must data race. In future versions, this can be improved by adding preprocessing to eliminate trivial branching conditions and dead code.
Finally, RacerF can only generate trivial witnesses and thus almost one third of its false answers is unconfirmed. In fact, RacerF provides traces for detected races, but we have not yet implemented their conversion to witnesses.
3 Software Project and Tool Setup
RacerF is implemented in OCaml as a plugin of the Frama-C platform [1, 5]. It relies on the Frama-C’s fork of CIL [9] as its frontend and uses the abstract interpretation plugin EVA [3] to perform sequential analyses of individual threads. The library OCamlgraph [4] is used to represent several graph structures and solve systems of equations created during thread analysis. In SV-COMP’25, RacerF is run via a Python script racerf-sv.py which performs the following:
Preprocessing. Since Frama-C discards the \(\mathtt {\_Atomic}\) attributes during parsing, we encode them as type aliases, e.g., the declaration _Atomic int x is transformed to atomic_intx. We also further preprocess several aspects of input that are problematic for Frama-C (e.g. we replace variable-length arrays in non-final fields of structures with arrays of a constant size). The preprocessing is implemented in the script preprocess.py.
Combination of under- and over-approximation. The script runs two configurations of RacerF and reports the verdict as discussed in Section 1.
Witness postprocessing. During the competition, it turned out that RacerF produces syntactically incorrect witnesses. The script fix_witness.py fixes this by filling in the correct metadata. The witness graph itself is not modified.
Usage. A binary of RacerF is available at Zenodo [6] (the only runtime dependencies are GCC and Python version 3.10 or newer). The wrapper script can be run as (see the attached file README.md for more information):
Software project.RacerF is available under the MIT license as a part of the authors’ project of concurrency analysers for C, maintained by Tomáš Dacík.
Participation.RacerF participated in the NoDataRace-Main sub-category only.
Acknowledgements
We appreciate discussions with Julien Signoles from the Frama-C team at CEA, France and with Adam Rogalewicz from FIT BUT, Czechia. The research was supported by the Czech Science Foundation project 23-06506S and the FIT BUT internal project FIT-S-23-8151. Tomáš Dacík was supported by the Brno PhD Talent Scholarship of the Brno City Municipality. The collaboration with Julien Signoles was supported by the project VASSAL: “Verification and Analysis for Safety and Security of Applications in Life” funded by the European Union under Horizon Europe WIDERA Coordination and Support Action/Grant Agreement No. 101160022.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.