
Improvements in Software Verification and Witness Validation: SV-COMP 2025

Authors: Dirk Beyer, Jan Strejček

Published in: Tools and Algorithms for the Construction and Analysis of Systems

Publisher: Springer Nature Switzerland


Abstract

The 14th Competition on Software Verification (SV-COMP 2025) marks a significant milestone in the field of software verification and showcases the latest advances and challenges in automated software verification and witness validation. This chapter elaborates on the objectives of the competition, which include providing an overview of the state of the art in software-verification technology, establishing a repository of software-verification tasks, and accelerating the transfer of new verification technology to industrial practice. The competition comprises a diverse set of verification tasks, including new categories and properties, and uses a rigorous scoring schema to assess the performance of the participating tools. One of the key highlights of SV-COMP 2025 is the introduction of new verification tasks and the extension of existing categories, reflecting the evolving landscape of software verification. The competition also places great emphasis on reproducibility: all major components are maintained in public version-control repositories and made publicly available for download. The results of SV-COMP 2025 show the progress in both verification and witness validation, as well as the areas that require further research and development. The chapter also covers the computing resources used in the competition and provides a comprehensive overview of the overall computational effort. Altogether, this chapter offers a detailed and insightful look into the world of software verification, making it essential reading for anyone interested in the latest developments in this critical area.
Notes
This report extends previous reports on SV-COMP [17–25, 27, 28, 30].
Reproduction packages are available on Zenodo (see Table 5).

1 Introduction

This report presents the objectives, processes, rules, participants, and results of SV-COMP 2025. It extends the series of competition reports (see footnote). This year, we focus on a more precise description of the current category structure, competition workflow, scoring schema, and changes in the competition made in 2025. The 14th Competition on Software Verification (https://sv-comp.sosy-lab.org/2025) is again the largest comparative evaluation ever in this area. The objectives of the competition were discussed earlier (objectives 1–4 [23]) and extended over the years (objectives 5–6 [24] and objective 7 [30]):
1.
provide an overview of the state of the art in software-verification technology and increase visibility of the most recent software verifiers,
 
2.
establish a repository of software-verification tasks that is publicly available for free use as a standard benchmark suite for evaluating verification software,
 
3.
establish standards that make it possible to compare different verification tools, including a property language and formats for the results,
 
4.
accelerate the transfer of new verification technology to industrial practice by identifying the strengths of the various verifiers on a diverse set of tasks,
 
5.
educate PhD students and others on performing reproducible benchmarking, packaging tools, and running robust and accurate research experiments,
 
6.
provide research teams that do not have sufficient computing resources with the opportunity to obtain experimental results on large benchmark sets, and
 
7.
conserve tools for formal methods for later reuse by using a standardized format to announce archives (via DOIs), default options, contacts, competition participation, and other meta data in a central repository.
 
The SV-COMP 2020 report [24] discusses the achievements of the SV-COMP competition so far with respect to these objectives.
Related Competitions. SV-COMP is one of many competitions that measure progress of research in the area of formal methods [16]. Competitions can lead to fair and accurate comparative evaluations because of the involvement of the developing teams. The competitions most related to SV-COMP are RERS [84], VerifyThis [70], Test-Comp [29], and TermCOMP [77]. The SV-COMP 2020 report [24] provides a more detailed discussion.
Quick Summary of Changes. We aim to keep the setup of the competition stable. Still, there are always improvements and developments. For the 2025 edition, the following changes were made:
  • The organization committee was established.
  • The number of considered verification tasks again increased in both C and Java languages: the number of C tasks increased from 30 300 in 2024 to 33 353 and the number of Java tasks increased from 587 in 2024 to 674 (not counting the demo category mentioned below).
  • We newly considered the property saying that a Java program does not cause a runtime exception. We used 673 tasks with this property as a demo category.
  • In the validation track, we newly used 103 handcrafted witness validation tasks as a complement to the validation tasks generated by verifiers.
  • The category structure was extended by three new base categories with C programs added to meta category SoftwareSystems, namely SoftwareSystems-Intel-TDX-Module-ReachSafety, SoftwareSystems-uthash-MemCleanup, and SoftwareSystems-DeviceDriversLinux64-Termination, one new base category RuntimeException-Java with Java programs (ran as a demo category due to late announcement) added to JavaOverall, and one new meta category ValidationCrafted of handcrafted validation tasks divided into two base categories CorrectnessWitnesses-Loops and ViolationWitnesses-ControlFlow.
  • The definition of the memory safety subproperty expressing that all allocated memory is tracked was reformulated more precisely.
  • The validation of violation witnesses in version 2.0 turned into a regular category due to a higher number of participating tools.
  • The normalization formula computing the scores in meta categories was modified to ignore void tasks and empty categories.
  • We now officially recognize two types of hors-concours participants: meta verifiers and inactive participants, which are tools without active team support.
  • The medals can be assigned only for positive scores.

2 Organization and Processes

Organization. The competition was established by Dirk Beyer, who ran it for the first 13 years. The SV-COMP community supported the competition mostly by maintaining the collection of verification tasks. Executing the competition in its current form would not be possible without the benchmarking framework BenchExec [42] and the system for distributed benchmark execution BenchCloud [35], which are developed and maintained by his research group. Since the beginning, the numbers of benchmarks and participants have grown substantially. Moreover, completely new layers of complexity were added with witness validation, the whole validation track, and a second format of witnesses. To distribute the effort necessary to run the competition smoothly, an organization committee was established before SV-COMP 2025. The competition now has two chairs (Dirk Beyer and Jan Strejček) and members in charge of benchmark quality (Zsófia Ádám, Raphaël Monat, Simmo Saan, and Frank Schüssele), category structure (Thomas Lemberger), infrastructure development (Philipp Wendler, Po-Chun Chien, Marek Jankola, Henrik Wachowitz, Matthias Kettl, and Marian Lingsch-Rosenfeld), and reproducibility (Levente Bajczi).
Procedure. SV-COMP is an open competition (also known as a comparative evaluation). Verification tasks and handcrafted validation tasks are publicly available in a repository (Table 4) to which anyone can contribute. The competition has two main phases: training and evaluation. During the training phase, participating teams can repeatedly submit new versions of their tool. The organizers run the tool on the relevant tasks and provide the results to the whole community. The participants can inspect the results, fix bugs in their tools and submit a new version, or report an issue with some tasks. The set of verification tasks and handcrafted validation tasks is frozen approximately two weeks before the end of the training phase. In the evaluation phase, the tools are again executed on all relevant tasks, and the participants are asked to inspect the results; they can challenge the validity of some tasks. After removing the invalid tasks (they are marked as void), the results are announced on the competition web site.
Competition Jury. The competition jury reviews the competition contribution papers and helps the organizers resolve any disputes that might occur (cf. the competition report of SV-COMP 2013 [18]). The tasks of the jury were described in more detail in the report of SV-COMP 2022 [27]. The jury consists of the SV-COMP chairs and one representative of each actively participating tool. The representatives of the participating tools change every year after the tool-submission deadline. The current representatives are listed in Tables 6 and 7. The jury includes one additional member, P. Darke (TCS, India), representing a tool that ultimately did not actively participate due to license issues. The current jury is also listed on the web site (https://sv-comp.sosy-lab.org/2025/committee.php).

3 Tasks, Workflow, and Scoring

Verification and Validation Tasks. A verification task consists of a program, a property to be verified, and an expected verification result. A validation task is a program, a property, and a witness of the program correctness or property violation to be validated. SV-COMP 2025 supports witnesses in two versions of the witness format, namely 1.0 [37] and 2.0 [7]. Some validation tasks (e.g., the handcrafted ones) also contain the expected validation result. SV-COMP 2025 used the task-definition format in version 2.1 to denote the verification and validation tasks.
Properties. In connection with C programs, we consider six different properties: unreachability of a given function (referenced as unreach-call in the competition); memory safety (valid-memsafety), composed of three subproperties stating that all pointer dereferences are valid (valid-deref), all memory deallocations are valid (valid-free), and all allocated memory is tracked (valid-memtrack); memory cleanup (valid-memcleanup), stating that all allocated memory is deallocated before the program terminates; no overflow (no-overflow), stating that operations on signed integers never overflow; no data race (no-data-race), stating that the program does not contain any data race; and termination (termination), stating that the program always terminates. Note that a program is considered memory safe only if it satisfies all three subproperties. The subproperties are used in particular to report what is violated if a program is not memory safe. For Java programs, we consider the property assertion validity (assert_java) and newly also no runtime exception (runtime-exception). We refer to the competition web page for the precise definitions of these properties (https://sv-comp.sosy-lab.org/2025/rules.php). We note that the precise definition of tracked memory was modified for SV-COMP 2025 because the previous definition was ambiguous.
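The following sketch is purely illustrative (the identifiers are the property names from the text; the dictionary and helper function are ours, not part of the competition infrastructure). It summarizes the C properties and the rule that memory safety holds only if all three subproperties hold.

```python
# Illustrative sketch: SV-COMP property identifiers for C programs as named in the
# text, and the conjunction rule for memory safety. Not taken from competition code.
C_PROPERTIES = {
    "unreach-call": "a given function is never called",
    "valid-memsafety": "memory safety (conjunction of the three subproperties below)",
    "valid-deref": "all pointer dereferences are valid",
    "valid-free": "all memory deallocations are valid",
    "valid-memtrack": "all allocated memory is tracked",
    "valid-memcleanup": "all allocated memory is deallocated before termination",
    "no-overflow": "operations on signed integers never overflow",
    "no-data-race": "the program contains no data race",
    "termination": "the program always terminates",
}

MEMSAFETY_SUBPROPERTIES = ("valid-deref", "valid-free", "valid-memtrack")

def is_memory_safe(subproperty_verdicts: dict[str, bool]) -> bool:
    """A program is memory safe only if all three subproperties hold."""
    return all(subproperty_verdicts[p] for p in MEMSAFETY_SUBPROPERTIES)
```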
Fig. 1.
Category structure for SV-COMP 2025; https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figh_HTML.gif marks the demo category, new categories are green
Categories. The verification tasks are divided into base categories loosely reflecting the programming language, program features, the considered property, and the source of the benchmarks. Base categories are accumulated into meta categories. For the C language, there are two levels of meta categories. The top level contains the category Overall of all C verification tasks and the rather specific category FalsificationOverall. The category FalsificationOverall was originally introduced to support bug-finding tools. It has basically the same structure as Overall, but it contains only the tasks with safety properties, which are all properties except termination. In other words, FalsificationOverall contains neither the meta category Termination nor the base category SoftwareSystems-DeviceDriversLinux64-Termination. It also uses a specific scoring schema, as explained below. For the Java language, there is only a single meta category called JavaOverall. The category structure was significantly updated for SV-COMP 2024. For SV-COMP 2025, we added three new base categories to the category SoftwareSystems. Further, the category JavaOverall was extended with the base category RuntimeException-Java of verification tasks with the property no runtime exception. This base category ran only as a demo category because it was announced shortly before the competition deadlines. As a consequence, it is not reflected in the score computation of JavaOverall. The current category structure, including the new categories, is shown in Fig. 1. The meta categories and the numbers of verification tasks they contain are shown in Tables 10, 11, and 12, which present the results of the verification track. Finally, the categories are described in detail on the competition web site (https://sv-comp.sosy-lab.org/2025/benchmarks.php).
All validation tasks are divided into four groups: the witnesses of program correctness are separated from the witnesses of property violation, and each of these two groups is further divided according to the witness format used (version 1.0 or 2.0). In each of the four groups, the validation tasks are organized in basically the same category structure: each validation task generated by a verifier is in the base category determined by the corresponding verification task. The handcrafted validation tasks are stored in two new base categories united in a new meta category called ValidationCrafted, as shown in Fig. 1. The meta category FalsificationOverall is not considered in the validation track. As the witness formats and current validators have certain limitations, SV-COMP 2025 supports witnesses in the individual formats only for selected properties and categories, as shown in Table 1. Some categories in some groups are empty, for example because SV-COMP does not support the particular witness format for the category. All empty categories are removed from the category structures of the groups.
Table 1.
Support of witnesses in individual formats based on property and category; ‘–’ indicates that the property is not relevant for the witness type in the column
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab1_HTML.png
Competition Workflow. Roughly speaking, the inputs of the competition are (a) the verification tasks and, for each verifier and witness validator, the participating teams provide (b) a benchmark definition listing the base categories the tool is supposed to be applied to (i.e., the tool opts out of the categories that are not listed there), (c) a tool-info module that specifies the interface for running the tool and interpreting its results, and (d) a tool archive on Zenodo from which the tool is downloaded. The inputs provided by the participating teams are described more precisely in the report for Test-Comp 2021 [26] (SV-COMP and Test-Comp use a similar workflow and components).
The verification track proceeds as follows. The competition scripts run each verifier on all relevant verification tasks given by its benchmark definition. If the verifier solves a task, it returns either True, meaning that the program satisfies the given property, or False, meaning that the program violates the property. The output True is accompanied by a correctness witness and the output False by a violation witness in one of the formats supported by SV-COMP for the corresponding property and category (see Table 1). When SV-COMP supports both formats, the verifier can actually produce two witnesses, one in each format. If no format is supported, the witness is not required. This is the case for correctness witnesses in the categories *-Arrays, *-Floats, *-Heap, *MemSafety*, ConcurrencySafety-*, *NoDataRace*, *Termination-*, and *-Java. The generated witnesses are turned into validation tasks. If the program of the validation task is written in C, we run WitnessLint to check that the witness adheres to the format. If the answer is negative, the witness is syntactically invalid and we never consider such a witness as confirmed. We do not apply this step to validation tasks with programs in Java, as there is no witness linter for these witnesses. The competition scripts then run all suitable validators on each validation task, where suitability is determined by the benchmark definitions of the validators. Each validator can either confirm the task, refute it, or fail to solve it. A validator confirms a correctness witness by returning True and refutes it by returning False. A violation witness is confirmed by False and refuted by True.
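A minimal sketch of the confirmation rule just described (the function name and string encoding are ours, not taken from the competition scripts): a correctness witness is confirmed by the validator answer True and refuted by False, while for a violation witness it is the other way around.

```python
# Sketch of the witness-confirmation rule described above (illustrative only).
def validator_outcome(witness_kind: str, validator_answer: str) -> str:
    """witness_kind: 'correctness' or 'violation';
    validator_answer: 'True', 'False', or anything else (failure)."""
    if validator_answer not in ("True", "False"):
        return "unsolved"  # the validator failed to solve the task
    confirming_answer = "True" if witness_kind == "correctness" else "False"
    return "confirmed" if validator_answer == confirming_answer else "refuted"
```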
The validation track uses the same inputs, only the verification tasks are replaced by validation tasks. These are the handcrafted validation tasks and all validation tasks generated by the verifiers in the verification track and successfully checked by WitnessLint. A validation task contains an expected result if it is handcrafted or if it was generated with an incorrect verification result (i.e., it contains a correctness witness for a verification task with property violation or a violation witness for a verification task with a correct program). In the latter case, the expected validation result corresponds to refutation. Each validator is executed on all validation tasks specified by the corresponding benchmark definition.
Computing Resources. The computing resources for each tool execution were the same as in SV-COMP 2024. Each verifier run was limited to 15 GB of memory and 15 min of cumulative CPU time on 4 processing units. Each run of a validator was limited to 2 processing units, 7 GB of memory, and 1.5 min of CPU time for violation witnesses or 15 min of CPU time for correctness witnesses. The machines for running the experiments are part of a computer cluster at the SoSy-Lab at LMU, which consists of 168 machines, where each machine has one Intel Xeon E3-1230 v5 CPU with 8 processing units, a frequency of 3.4 GHz, 33 GB of RAM, and a GNU/Linux operating system (x86_64-linux, Ubuntu 24.04 with Linux kernel 6.8). We used BenchExec [42] to measure and control computing resources (CPU time, memory) and BenchCloud [35] to distribute, install, run, and clean up verification runs, and to collect the results.
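For reference, the per-run limits stated above, written out as constants (an illustrative sketch of ours; in the competition the limits are enforced by BenchExec and BenchCloud, not by this code):

```python
# Per-run resource limits of SV-COMP 2025 as stated in the text (illustrative only;
# enforcement is done by BenchExec/BenchCloud in the actual competition).
GB = 10**9
VERIFIER_LIMITS = {"cpu_time_s": 15 * 60, "memory_bytes": 15 * GB, "cpu_cores": 4}
VALIDATOR_LIMITS = {
    "violation":   {"cpu_time_s": 90,      "memory_bytes": 7 * GB, "cpu_cores": 2},
    "correctness": {"cpu_time_s": 15 * 60, "memory_bytes": 7 * GB, "cpu_cores": 2},
}
```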
Table 2.
Scores per individual verification results used since SV-COMP 2021
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab2_HTML.png
Scoring Schema. For the verification track, the scoring schema of SV-COMP 2025 in base categories was the same as for SV-COMP 2021. Table 2 lists all the cases in which verification results are awarded non-zero points, where a verification result is
  • incorrect if it does not agree with the expected result of the verification task,
correct if it agrees with the expected result and the produced witness (or at least one of them if there are two) is confirmed by some validator. In the categories where correctness witnesses are not required, a verification result is correct whenever it agrees with the expected result.
When a verification result agrees with the expected result but no validator confirms the witness (although it is required), we call the result correct-unconfirmed. The score of a verifier in a base category is simply the sum of the points for the individual verification tasks. The score for a meta category is computed from the scores of all contained (meta or base) categories on the next level and the numbers of tasks in these categories. Formally, if a meta category contains k categories of the next level and the i-th contained category has the score \(s_i\) and consists of \(n_i\) verification tasks, then the meta category gets the score \(\left(\sum_{i=1}^{k} s_i/n_i\right)\cdot \left(\sum_{i=1}^{k} n_i\right)/k\), i.e., the sum of the scores of the contained categories, each normalized by the number of tasks in the category, multiplied by the average number of tasks in the contained categories. Note that in previous years, the numbers \(n_i\) also included void tasks, which are technically in the categories but are not used in the competition (a verification task is typically marked as void because it was changed for some serious reason after the task-freezing deadline). Since SV-COMP 2025, these void tasks are not included in \(n_i\).
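The normalization formula can be written out as a short helper (our own sketch mirroring the formula above; not the competition scripts). Empty categories are dropped before the computation, in line with the change introduced for SV-COMP 2025.

```python
# Sketch of the meta-category score normalization described above (illustrative only).
def meta_category_score(subcategories: list[tuple[float, int]]) -> float:
    """subcategories: list of (score s_i, number of non-void tasks n_i) pairs."""
    valid = [(s, n) for s, n in subcategories if n > 0]  # ignore empty categories
    if not valid:
        return 0.0
    k = len(valid)
    normalized_sum = sum(s / n for s, n in valid)        # sum of s_i / n_i
    average_tasks = sum(n for _, n in valid) / k         # average number of tasks
    return normalized_sum * average_tasks
```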
The scoring in FalsificationOverall and its subcategories is slightly different. As this meta category was motivated by tools that can only find a bug, the scores in its base categories are sums of the points for results False (both correct and incorrect), while the results True are ignored. The computation of scores for meta categories remains unchanged.
For the validation track, the scoring schema of SV-COMP 2025 in base categories was the same as for SV-COMP 2024. The biggest difference from the verification track comes from the fact that many validation tasks do not contain an expected validation result. In fact, it is contained only in the handcrafted validation tasks and in the tasks generated by verifiers that solved some verification task incorrectly (and whose witness should thus be refuted). For the remaining tasks, the expected results are determined by voting. For each such task, we collect the results from all validators that solved (i.e., confirmed or refuted) the task. If we have at least two such results and at least 75 % of them agree, then the expected result is set by the majority vote. In all other cases, the expected result is not determined and the validation task has no influence on the validation-track results, as it is considered void. Tool developers can inspect the results and convince the community that a voted expected result is in fact wrong. In such a case, the corresponding validation task is removed from the competition.
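A minimal sketch of this voting rule (helper name and return convention are ours): the expected result is set only if at least two validators solved the task and at least 75 % of them agree; otherwise the task is considered void.

```python
# Sketch of the expected-result voting described above (illustrative only).
from collections import Counter

def voted_expected_result(solver_results: list[str]) -> str | None:
    """solver_results: the 'True'/'False' answers of all validators that solved the task."""
    if len(solver_results) < 2:
        return None                      # too few results; the task is considered void
    answer, votes = Counter(solver_results).most_common(1)[0]
    if votes / len(solver_results) >= 0.75:
        return answer                    # expected result set by the majority vote
    return None                          # no sufficient agreement; the task is void
```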
Table 3.
Scores per individual validation results used since SV-COMP 2024
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab3_HTML.png
Table 3 lists all the cases in which validation results are awarded non-zero points. A validation result True or False is considered correct if it agrees with the expected result and incorrect otherwise. To compute the scores in base categories, we virtually divide each base category C into two subcategories \(C_c,C_w\), where
  • \(C_c\) contains the witnesses that are expected to be correct (i.e., correctness witnesses with the expected result True and violation witnesses with the expected result False), and
  • \(C_w\) contains the witnesses that are expected to be wrong (i.e., correctness witnesses with the expected result False and violation witnesses with the expected result True).
The score of a validator in each subcategory \(C_c,C_w\) is the sum of the points for the individual validation tasks in the subcategory. If one of the subcategories \(C_c,C_w\) is empty, the score for C is simply the score of the non-empty subcategory. If both subcategories \(C_c,C_w\) are non-empty and have respective scores \(s_c,s_w\), then the score for C is computed as for a meta category in the verification track, i.e., \((\frac{s_c}{|C_c|}+\frac{s_w}{|C_w|})\cdot \frac{|C_c|+|C_w|}{2}\). The score of a validator in a meta category is then computed by the same process as in the verification track. Note that all the scores presented in this paper and on the competition web site are rounded to integers for presentation, but they are computed with higher precision.
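The base-category score in the validation track can likewise be written as a small helper (our own sketch of the formula above, treating \(C_c\) and \(C_w\) as a two-element meta category):

```python
# Sketch of the validation-track base-category score described above (illustrative only).
def validation_base_category_score(s_c: float, n_c: int, s_w: float, n_w: int) -> float:
    """s_c, n_c: score and size of C_c; s_w, n_w: score and size of C_w."""
    if n_c == 0 and n_w == 0:
        return 0.0                      # empty category
    if n_c == 0:
        return s_w                      # only expected-wrong witnesses present
    if n_w == 0:
        return s_c                      # only expected-correct witnesses present
    return (s_c / n_c + s_w / n_w) * (n_c + n_w) / 2
```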
Ranking. As before, the rank of a verifier in each category was decided based on the achieved score. In case of a tie, we used the success run time as the secondary criterion, which is the total CPU time of the verifier over all tasks in a given category for which the verifier reported a correct verification result. Ranking in the validation track works in the same way. Recall that a tool participates only in the categories specified in its benchmark definition. In contrast to previous years, we assign medals only to tools with a positive score.
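The ranking rule can be summarized in a few lines (an illustrative sketch of ours, ignoring the exclusion of hors-concours participants): sort by score, break ties by success run time, and award medals only for positive scores.

```python
# Sketch of the ranking and medal rule described above (illustrative only).
def rank_and_medals(results: dict[str, tuple[float, float]]) -> tuple[list[str], list[str]]:
    """results: tool name -> (score, success CPU time in seconds)."""
    # Higher score first; on ties, lower success run time first.
    ranking = sorted(results, key=lambda t: (-results[t][0], results[t][1]))
    # Medals only for tools with a positive score (at most three).
    medals = [t for t in ranking if results[t][0] > 0][:3]
    return ranking, medals
```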
Table 4.
Publicly available components for reproducing SV-COMP 2025
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab4_HTML.png
Table 5.
Artifacts published for SV-COMP 2025
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab5_HTML.png
Reproducibility. SV-COMP results must be reproducible. Hence, all major components are maintained in public version-control repositories. Table 4 lists these components with the links to the repositories and the versions used in SV-COMP 2025. Most of these components are described in more detail in the SV-COMP 2016 report [21]. Later, BenchCloud was introduced to distribute the benchmarking jobs in an elastic cloud and collect the results. Moreover, CoVeriTeam is used to continuously check (via a GitLab CI pipeline) whether tools can be executed in the competition environment. The processing scripts to execute the experiments and post-process the data into tables, scores, and rankings are also publicly released. The competition artifacts are published at Zenodo (see Table 5) with the relevant tools and data to guarantee their long-term availability and immutability.
For reproducibility reasons, SV-COMP has required since 2018 that verifiers and validators be publicly available for download and have a license that
  • allows reproduction and evaluation by anybody (incl. results publication),
  • does not restrict the usage of the verifier output (log files, witnesses), and
  • allows (re-)distribution of the unmodified verifier archive via SV-COMP repositories and archives.
The verification and handcrafted validation tasks used in the competition are also accompanied by a license. In this case, the stated license must allow anyone to
  • view, understand, investigate, and reverse engineer the algorithm or system,
change the program (in particular, pre-process and adapt the programs to be useful for a verification task),
  • (re-)distribute the (original and changed) program under the same terms (in particular, in replication packages for research projects or as regression tests),
  • compile and execute the program (in particular, for the purpose of verifying that a specification violation exists),
  • and commercially take advantage of the program (in particular, to not exclude developers of commercial verifiers).
Table 6.
Participating validators with tool references, representing jury members, their affiliations, and indications of supported witnesses depending on format version and type; https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Fign_HTML.gif for inactive, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figo_HTML.gif for first-time participants, Cor. for correctness witnesses, Vio. for violation witnesses, and https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figp_HTML.gif for newly added support
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab6_HTML.png
Table 7.
Participating verifiers with tool references, representing jury members and their affiliations; https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figr_HTML.gif for inactive, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figs_HTML.gif for meta verifiers, and https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figt_HTML.gif for first-time participants
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab7a_HTML.png
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab7b_HTML.png

4 Participating Verifiers and Validators

In total, SV-COMP 2025 evaluated 62 verification tools and 18 validation tools. Besides the 35 verifiers and 13 validators registered for the competition and supported by their development teams, we also evaluated some tools that participated in previous years but were not actively registered and supported this year. These tools are called inactive, they are clearly marked with https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figv_HTML.gif in all tables, and they do not appear in the rankings. Further, we clearly distinguish meta verifiers according to the following characterization, approved by the SV-COMP 2023 community.
A meta verifier is a combination of at least two existing verification components such that each result produced by the combination can be computed by some of its components alone. A verifier is the result of research and engineering in verification algorithms and approaches, while a typical meta verifier selects a verification component to run, sets up its parameters, and potentially post-processes its output.
Meta verifiers are annotated with https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figw_HTML.gif in all tables and also excluded from rankings. Note that before SV-COMP 2025, both inactive tools and meta verifiers were marked as hors-concours participants and not properly distinguished.
Table 8.
Algorithms and techniques used by the participating tools; https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figx_HTML.gif for inactive, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figy_HTML.gif for meta verifiers, and https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figz_HTML.gif for first-time participants
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab8a_HTML.png
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab8b_HTML.png
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab8c_HTML.png
Tables 6 and 7 list all validation and verification tools evaluated in SV-COMP 2025, respectively. The tables contain the tool name (with hyperlink), references to papers that describe the tool, the representing jury member, and their affiliation. The listings are also available on the competition web site at https://sv-comp.sosy-lab.org/2025/systems.php. Table 6 additionally indicates the witness formats and witness kinds supported by the individual validators and whether they were supported in SV-COMP 2025 for the first time ( https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figac_HTML.gif ) or not ( https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figad_HTML.gif ).
Table 9.
Solver libraries and frameworks used as components by at least three participating tools; tools that did not declare any used library or framework are omitted, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figae_HTML.gif for inactive, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figaf_HTML.gif for meta verifiers, and https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figag_HTML.gif for first-time participants
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab9a_HTML.png
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab9b_HTML.png
Table 8 lists the algorithms and techniques used by the verification and validation tools (some techniques were omitted due to limited space). Further, Table 9 gives an overview of common solver libraries and frameworks used by these tools. Note that both tables are based on information provided by the teams that registered the individual tools for SV-COMP 2025 or, in the case of currently inactive tools, for some previous edition. The web site https://fm-tools.sosy-lab.org provides more information about all tools evaluated in SV-COMP and Test-Comp since 2023 in a uniform way.
Figure 2 shows the evolution of the number of verifiers and validators participating in individual editions of SV-COMP. It seems that the number of actively participating tools, in particular verifiers, is now stable.
Fig. 2.
Number of evaluated verifiers and validators in each year of SV-COMP; three new hors concours tools of 2022 counted only as new, not as hors concours

5 Results

The results of the competition represent the state of the art of what can be achieved with fully automatic software-verification tools on the given benchmark set. We report the effectiveness (the number of verification tasks that can be solved and the correctness of the results, as accumulated in the score) and the efficiency (resource consumption in terms of CPU time). The results are presented in the same way as in previous years, so that improvements compared to previous years are easy to identify. The results presented in this report were provided to the participants in advance, and any objections have been settled.
Consumed Resources. Before we present the competition results, we report some statistics to give an impression of the overall computational work: one complete execution of all verifiers in the competition consisted of 942 284 verification runs (each verifier runs on each verification task of the categories listed in the verifier's benchmark definition), consuming 2 312 days of CPU time (without validation). Witness validation required 21.8 million validation runs (each validator runs on each validation task of the categories listed in the validator's benchmark definition), consuming 2 573 days of CPU time. Moreover, each tool was executed several times, in order to make sure that no installation issues occurred during the execution.
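As a back-of-the-envelope check of these totals (our own arithmetic, not reported figures), the average CPU time per run follows directly from the numbers above:

```python
# Average CPU time per run, derived from the totals stated in the text (sketch only).
verification_runs, verification_cpu_days = 942_284, 2_312
validation_runs,   validation_cpu_days   = 21_800_000, 2_573   # "21.8 million" runs

per_verification_run_s = verification_cpu_days * 86_400 / verification_runs  # ~212 s
per_validation_run_s   = validation_cpu_days   * 86_400 / validation_runs    # ~10 s
```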
Table 10.
Overview of the results of all actively participating verifiers; empty cells indicate opt-outs, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figai_HTML.gif for meta verifiers, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figaj_HTML.gif for first-time participants
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab10_HTML.png
Table 11.
Overview of the results of all inactively participating verifiers; empty cells indicate opt-outs, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figal_HTML.gif for inactive, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figam_HTML.gif for meta verifiers
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab11_HTML.png
Table 12.
Verification: Overview of the medalists in each meta category; values for CPU time in hours and rounded to two significant digits
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab12_HTML.png
Verification Track. Tables 10 and 11 present the quantitative overview of all tools and all meta categories except the meta categories included in FalsificationOverall. We split the presentation into two tables, one for the verifiers that participate with active team support (Table 10) and one for the inactive verifiers (Table 11). The head row lists the meta categories with the number of valid tasks and the maximal score for each category. The tools are listed in alphabetical order; every table row lists the scores of one verifier. An empty table cell means that the verifier did not participate in the corresponding meta category. In Table 10, we indicate the top three candidates in each category by formatting their scores in bold face and in a larger font size. We recall that inactive tools and meta verifiers are excluded from the rankings. More information (including interactive tables, quantile plots for every category, and also the raw data in XML format) is available on the competition web site (https://sv-comp.sosy-lab.org/2025/results) and in the results artifact (see Table 5). Note that the results for the subcategories of FalsificationOverall are not explicitly presented in this report or on the web due to their marginal significance. They can be obtained from the detailed results of the corresponding categories in Overall presented on the web.
Table 12 shows the medalists for each meta category. The column ‘Solved Tasks’ shows the number of tasks that the respective verifier solved correctly and produced a witness that was confirmed (or not required). The cumulative run time of the verifier on these tasks is presented in the column ‘CPU Time’. The column ‘Unconf. Tasks’ indicates the number of tasks for which the verifier returned a correct answer, but the corresponding witness was not confirmed by any validator. The columns ‘False Alarms’ and ‘Wrong Proofs’ provide the number of verification tasks for which the verifier reported wrong results, i.e., reporting a property violation when the property holds (false alarm) and claiming that the program satisfies the property although it actually violates it (wrong proof), respectively.
Fig. 3.
Quantile functions for category Overall. Each quantile function illustrates the quantile (x-coordinate) of the scores obtained by correct verification runs below a certain run time (y-coordinate), minus the overall penalty for incorrect results. More details were given previously [18]. A logarithmic scale is used for the time range from 1 s to 1000 s, and a linear scale is used for the time range between 0 s and 1 s.
Score-Based Quantile Functions. We use score-based quantile functions [18, 42] because these visualizations make it easier to understand the results of the comparative evaluation. The results archive (see Table 5) and the web site (https://sv-comp.sosy-lab.org/2025/results) include such a plot for each category. As an example, we show the plot for category Overall (all verification tasks in C) in Fig. 3. A total of 16 verifiers participated in category Overall, for which the quantile plot shows the overall performance over all categories (scores for meta categories are normalized [18]). A more detailed discussion of score-based quantile plots, including examples of what insights one can obtain from the plots, is provided in previous competition reports [18, 21]. Starting this year, we use different line styles to distinguish regular active participants (solid line), active meta verifiers (dashed line), and inactive verifiers (dotted line).
The graph shows that the Overall winner is UAutomizer, as its graph extends furthest to the right. The graph for UAutomizer also starts left of \(x=0\) because the verifier produced 8 wrong proofs and therefore received some negative points. Other verifiers whose graphs start at a negative cumulative score also produced wrong results.
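Such a quantile function can be derived from raw per-task results roughly as follows (a sketch of ours following the description in the caption of Fig. 3 and in [18], not the competition's plotting scripts): correct runs are sorted by CPU time, their scores are accumulated on the x-axis, and the whole curve is shifted left by the penalty accumulated from incorrect results.

```python
# Sketch of computing a score-based quantile function from raw results (illustrative only).
def score_based_quantile(correct_runs: list[tuple[float, float]],
                         incorrect_penalty: float) -> list[tuple[float, float]]:
    """correct_runs: (score, cpu_time_s) per correct run; returns (x, y) plot points."""
    points, x = [], -incorrect_penalty          # curve starts left of 0 if there were wrong results
    for score, cpu_time in sorted(correct_runs, key=lambda r: r[1]):
        x += score                              # accumulate scores along the x-axis
        points.append((x, cpu_time))            # y-axis is the run time of that run
    return points
```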
Validation Track. Validation of verification witnesses was pioneered by SV-COMP in 2015 (by the tools CPAchecker and https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figap_HTML.gif , cf. the early attempts: https://sv-comp.sosy-lab.org/2015/witnesses/). Shortly after that, verification witnesses became more and more important for various reasons: they not only justify and help to understand and interpret verification results, but they also serve as exchange objects for intermediate results and allow the use of imprecise verification techniques (e.g., via machine learning). However, a case study on the quality of the results of witness validators [46], published in 2022, revealed a great potential for improvements. To stimulate further advances in verification witnesses and their validators, the study suggested that witness validators should also undergo a periodic comparative evaluation and proposed a scoring schema for witness-validation results. This materializes in the validation track, run by SV-COMP since 2023.
Table 13.
Validation of correctness witnesses (version 1.0): Overview of the medalists in each meta category; values for CPU time in hours and rounded to two significant digits, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Figaq_HTML.gif for first-time participants
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab13_HTML.png
Table 14.
Validation of violation witnesses (version 1.0): Overview of the medalists in each meta category; values for CPU time in hours and rounded to two significant digits
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab14_HTML.png
Table 15.
Validation of correctness witnesses (version 2.0): Overview of the medalists in each meta category; values for CPU time in hours and rounded to two significant digits
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab15_HTML.png
Table 16.
Validation of violation witnesses (version 2.0): Overview of the medalists in each meta category; values for CPU time in hours and rounded to two significant digits
https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-90660-2_9/MediaObjects/652620_1_En_9_Tab16_HTML.png
Thanks to the development and adoption of version 2.0 of the witness format [7], the validation track now has four regular subtracks:
1.
validation of correctness witnesses in format version 1.0 (see Table 13),
 
2.
validation of violation witnesses in format version 1.0 (see Table 14),
 
3.
validation of correctness witnesses in format version 2.0 (see Table 15), and
 
4.
validation of violation witnesses in format version 2.0 (see Table 16).
 
The subtrack with violation witnesses in format 2.0 ran for the first time as a regular part of the competition, as it was only a demonstration subtrack in SV-COMP 2024. The tables present only the medalists for all non-empty meta categories of each subtrack. If a category has fewer than three medalists, it means that fewer than three validators reached a positive score. The column ‘Wrong Confirm.’ gives the number of cases in which the respective validator confirmed a witness that was incorrect. The column ‘Wrong Refut.’ shows the number of cases in which the respective validator refuted a correct witness. We recall that the correctness or incorrectness of many witnesses used in the validation track is determined by voting, and thus the numbers of wrong confirmations and refutations may not be completely objective.
The complete results of all validators in all relevant categories of all subtracks are available in the results artifact (see Table 5) and on the SV-COMP web site (https://​sv-comp.​sosy-lab.​org/​2025/​results/​results-validated/​).
The limited support of some properties and program features by the witness formats (see Table 1), the missing medalists in some categories, and the negative scores in the detailed results on the web site all show the need for further research and development in the area of software-verification witnesses and their validation.

6 Conclusion

The 14th edition of the Competition on Software Verification (SV-COMP 2025) compared 62 automatic tools for software verification (including 6 new ones and 27 tools without active team support) and 18 automatic tools for validation of verification witnesses (including 2 new ones and 5 tools without active team support). The overall numbers of evaluated verifiers and validators were the highest in the history of the competition (see Fig. 2). The total number of verification tasks in SV-COMP 2025 for both C and Java programs was significantly increased, to precisely 34 700 (including one demo category).
The results of the competition show progress in both the verification and the witness-validation areas, especially in the adoption of witness format 2.0. However, even the best verification and validation tools still produce some incorrect results. This motivates further improvements of verifiers, extensions of the witness format to support more properties and program features, and further development of validators.
Data-Availability Statement. The verification tasks and results of the competition are published at Zenodo, as described in Table 5. All components and data that are necessary for reproducing the competition are available in public version repositories, as specified in Table 4. For easy access, the results are also presented online on the competition web site https://sv-comp.sosy-lab.org/2025. The main results of SV-COMP 2025 were reproduced in an independent reproduction report [14].
Funding Statement. SV-COMP 2025 was supported by Huawei – Dresden Research Center, Germany. Some participants of this competition were funded in part by the Deutsche Forschungsgemeinschaft (DFG) — 378803395 (ConVeY). Jan Strejček has been supported by the Czech Science Foundation grant GA23-06506S.

Acknowledgments

We thank the organization committee for their help in running the competition, the jury for their work on improving the quality of the verification tasks and for their advice in refining and applying the competition rules, and the verification community for contributing their tools to the evaluation.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
References
4. Andrianov, P., Friedberger, K., Mandrykin, M.U., Mutilin, V.S., Volkov, A.: CPA-BAM-BnB: Block-abstraction memoization and region-based memory models for predicate abstractions (competition contribution). In: Proc. TACAS. pp. 355–359. LNCS 10206, Springer (2017). https://doi.org/10.1007/978-3-662-54580-5_22
10. Baier, D., Beyer, D., Chien, P.C., Jakobs, M.C., Jankola, M., Kettl, M., Lee, N.Z., Lemberger, T., Lingsch-Rosenfeld, M., Wachowitz, H., Wendler, P.: Software verification with CPAchecker 3.0: Tutorial and user guide. In: Proc. FM. pp. 543–570. LNCS 14934, Springer (2024). https://doi.org/10.1007/978-3-031-71177-0_30
11. Baier, D., Beyer, D., Chien, P.C., Jankola, M., Kettl, M., Lee, N.Z., Lemberger, T., Lingsch-Rosenfeld, M., Spiessl, M., Wachowitz, H., Wendler, P.: CPAchecker 2.3 with strategy selection (competition contribution). In: Proc. TACAS (3). pp. 359–364. LNCS 14572, Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_21
12. Bajczi, L., Szekeres, D., Mondok, M., Ádám, Z., Somorjai, M., Telbisz, C., Dobos-Kovács, M., Molnár, V.: EmergenTheta: Verification beyond abstraction refinement (competition contribution). In: Proc. TACAS (3). pp. 371–375. LNCS 14572, Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_23
14. Bajczi, L., Ádám, Z., Micskei, Z.: SV-COMP'25 reproduction report (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
16. Bartocci, E., Beyer, D., Black, P.E., Fedyukovich, G., Garavel, H., Hartmanns, A., Huisman, M., Kordon, F., Nagele, J., Sighireanu, M., Steffen, B., Suda, M., Sutcliffe, G., Weber, T., Yamada, A.: TOOLympics 2019: An overview of competitions in formal methods. In: Proc. TACAS (3). pp. 3–24. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_1
41. Beyer, D., Lingsch-Rosenfeld, M.: CPAchecker Validator 4.0 (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
53. Chalupa, M., Richter, C.: Bubaak: Dynamic cooperative verification (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
56.
65. Dacík, T., Vojnar, T.: RacerF: Data race detection with Frama-C (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
67. Darke, P., Chimdyalwar, B., Agrawal, S., Venkatesh, R., Chakraborty, S., Kumar, S.: VeriAbsL: Scalable verification by abstraction and strategy prediction (competition contribution). In: Proc. TACAS (2). pp. 588–593. LNCS 13994, Springer (2023). https://doi.org/10.1007/978-3-031-30820-8_41
81. Heizmann, M., Bentele, M., Dietsch, D., Jiang, X., Klumpp, D., Schüssele, F., Podelski, A.: Ultimate Automizer and the abstraction of bitwise operations (competition contribution). In: Proc. TACAS (3). pp. 418–423. LNCS 14572, Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_31
88. Inverso, O., Tomasco, E., Fischer, B., Torre, S.L., Parlato, G.: Bounded verification of multi-threaded programs via lazy sequentialization. ACM Trans. Program. Lang. Syst. 44(1), 1:1–1:50 (2022). https://doi.org/10.1145/3478536
90. Jonáš, M., Kumor, K., Novák, J., Sedláček, J., Trtík, M., Zaoral, L., Ayaziová, P., Strejček, J.: Symbiotic 10: Lazy memory initialization and compact symbolic execution (competition contribution). In: Proc. TACAS (3). pp. 406–411. LNCS 14572, Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_29
91. Journault, M., Miné, A., Monat, R., Ouadjaout, A.: Combinations of reusable abstract domains for a multilingual static analyzer. In: Proc. VSTTE. pp. 1–18. LNCS 12031, Springer (2019)
94.
98. Lemberger, T., Wachowitz, H.: Nacpa: Native checking with parallel-portfolio analyses (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
99. Lin, Y., Chen, Z., Wang, J.: AISE v2.0: Combining loop transformations (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
100. Lommen, N., Giesl, J.: AProVE (KoAT + LoAT) (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
101.
104. McGowan, C., Richards, M., Sui, Y.: SVF-SVC: Software verification using SVF (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
106. Monat, R., Milanese, M., Parolini, F., Boillot, J., Ouadjaout, A., Miné, A.: Mopsa-C: Improved verification for C programs, simple validation of correctness witnesses (competition contribution). In: Proc. TACAS (3). pp. 387–392. LNCS 14572, Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_26
107. Monat, R., Ouadjaout, A., Miné, A.: Mopsa-C with trace partitioning and autosuggestions (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
108. Mondok, M., Bajczi, L., Szekeres, D., Molnár, V.: EmergenTheta: Variations on symbolic transition systems (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
111. Mukhopadhyay, D., Metta, R., Karmarkar, H., Madhukar, K.: PROTON 2.1: Synthesizing ranking functions via fine-tuned locally hosted LLM (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
118. Păsăreanu, C.S., Visser, W., Bushnell, D.H., Geldenhuys, J., Mehlitz, P.C., Rungta, N.: Symbolic PathFinder: Integrating symbolic execution with model checking for Java bytecode analysis. Autom. Software Eng. 20(3), 391–425 (2013). https://doi.org/10.1007/s10515-013-0122-2
121. Saan, S., Erhard, J., Schwarz, M., Bozhilov, S., Holter, K., Tilscher, S., Vojdani, V., Seidl, H.: Goblint: Abstract interpretation for memory safety and termination (competition contribution). In: Proc. TACAS (3). pp. 381–386. LNCS 14572, Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_25
122. Saan, S., Erhard, J., Schwarz, M., Bozhilov, S., Holter, K., Tilscher, S., Vojdani, V., Seidl, H.: Goblint Validator: Correctness witness validation by abstract interpretation (competition contribution). In: Proc. TACAS (3). pp. 335–340. LNCS 14572, Springer (2024). https://doi.org/10.1007/978-3-031-57256-2_17
127. Telbisz, C., Bajczi, L., Szekeres, D., Vörös, A.: Theta: Various approaches for concurrent program verification (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
134. Wu, T., Li, X., Manino, E., Menezes, R., Gadelha, M., Xiong, S., Tihanyi, N., Petoumenos, P., Cordeiro, L.: ESBMC v7.7: Efficient concurrent software verification with scheduling, incremental SMT and partial order reduction (competition contribution). In: Proc. TACAS (3). LNCS 15698, Springer (2025)
DOI: https://doi.org/10.1007/978-3-031-90660-2_9